

# Object2Vec Hyperparameters
<a name="object2vec-hyperparameters"></a>

In the `CreateTrainingJob` request, you specify the training algorithm. You can also specify algorithm-specific hyperparameters as string-to-string maps. The following table lists the hyperparameters for the Object2Vec training algorithm.


| Parameter Name | Description | 
| --- | --- | 
| enc0\_max\_seq\_len |  The maximum sequence length for the enc0 encoder. **Required** Valid values: 1 ≤ integer ≤ 5000  | 
| enc0\_vocab\_size |  The vocabulary size of enc0 tokens. **Required** Valid values: 2 ≤ integer ≤ 3000000  | 
| bucket\_width |  The allowed difference between data sequence length when bucketing is enabled. To enable bucketing, specify a non-zero value for this parameter. **Optional** Valid values: 0 ≤ integer ≤ 100 Default value: 0 (no bucketing)  | 
| comparator\_list |  A list used to customize the way in which two embeddings are compared. The Object2Vec comparator operator layer takes the encodings from both encoders as inputs and outputs a single vector. This vector is a concatenation of subvectors. The string values passed to the `comparator_list` and the order in which they are passed determine how these subvectors are assembled. For example, if `comparator_list="hadamard, concat"`, then the comparator operator constructs the vector by concatenating the Hadamard product of two encodings and the concatenation of two encodings. If, on the other hand, `comparator_list="hadamard"`, then the comparator operator constructs the vector as the Hadamard product of the two encodings only.  **Optional** Valid values: A string that contains any combination of the names of the three binary operators: `hadamard`, `concat`, or `abs_diff`. The Object2Vec algorithm currently requires that the two vector encodings have the same dimension. These operators produce the subvectors as follows: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `"hadamard, concat, abs_diff"`  | 
| dropout |  The dropout probability for network layers. *Dropout* is a form of regularization used in neural networks that reduces overfitting by trimming codependent neurons. **Optional** Valid values: 0.0 ≤ float ≤ 1.0 Default value: 0.0  | 
| early\_stopping\_patience |  The number of consecutive epochs without improvement allowed before early stopping is applied. Improvement is defined by the `early_stopping_tolerance` hyperparameter. **Optional** Valid values: 1 ≤ integer ≤ 5 Default value: 3  | 
| early\_stopping\_tolerance |  The reduction in the loss function that an algorithm must achieve between consecutive epochs to avoid early stopping after the number of consecutive epochs specified in the `early_stopping_patience` hyperparameter concludes. **Optional** Valid values: 0.000001 ≤ float ≤ 0.1 Default value: 0.01  | 
| enc\_dim |  The dimension of the output of the embedding layer. **Optional** Valid values: 4 ≤ integer ≤ 10000 Default value: 4096  | 
| enc0\_network |  The network model for the enc0 encoder. **Optional** Valid values: `hcnn`, `bilstm`, or `pooled_embedding` [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `hcnn`  | 
| enc0\_cnn\_filter\_width |  The filter width of the convolutional neural network (CNN) enc0 encoder. **Conditional** Valid values: 1 ≤ integer ≤ 9 Default value: 3  | 
| enc0\_freeze\_pretrained\_embedding |  Whether to freeze enc0 pretrained embedding weights. **Conditional** Valid values: `True` or `False` Default value: `True`  | 
| enc0\_layers  |  The number of layers in the enc0 encoder. **Conditional** Valid values: `auto` or 1 ≤ integer ≤ 4 [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `auto`  | 
| enc0\_pretrained\_embedding\_file |  The filename of the pretrained enc0 token embedding file in the auxiliary data channel. **Conditional** Valid values: String with alphanumeric characters, underscore, or period. `[A-Za-z0-9._]`  Default value: "" (empty string)  | 
| enc0\_token\_embedding\_dim |  The output dimension of the enc0 token embedding layer. **Conditional** Valid values: 2 ≤ integer ≤ 1000 Default value: 300  | 
| enc0\_vocab\_file |  The vocabulary file for mapping pretrained enc0 token embedding vectors to numerical vocabulary IDs. **Conditional** Valid values: String with alphanumeric characters, underscore, or period. `[A-Za-z0-9._]`  Default value: "" (empty string)  | 
| enc1\_network |  The network model for the enc1 encoder. If you want the enc1 encoder to use the same network model as enc0, including the hyperparameter values, set the value to `enc0`.   Even when the enc0 and enc1 encoder networks have symmetric architectures, you can't share parameter values for these networks.  **Optional** Valid values: `enc0`, `hcnn`, `bilstm`, or `pooled_embedding` [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `enc0`  | 
| enc1\_cnn\_filter\_width |  The filter width of the CNN enc1 encoder. **Conditional** Valid values: 1 ≤ integer ≤ 9 Default value: 3  | 
| enc1\_freeze\_pretrained\_embedding |  Whether to freeze enc1 pretrained embedding weights. **Conditional** Valid values: `True` or `False` Default value: `True`  | 
| enc1\_layers  |  The number of layers in the enc1 encoder. **Conditional** Valid values: `auto` or 1 ≤ integer ≤ 4 [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `auto`  | 
| enc1\_max\_seq\_len |  The maximum sequence length for the enc1 encoder. **Conditional** Valid values: 1 ≤ integer ≤ 5000  | 
| enc1\_pretrained\_embedding\_file |  The name of the enc1 pretrained token embedding file in the auxiliary data channel. **Conditional** Valid values: String with alphanumeric characters, underscore, or period. `[A-Za-z0-9._]`  Default value: "" (empty string)  | 
| enc1\_token\_embedding\_dim |  The output dimension of the enc1 token embedding layer. **Conditional** Valid values: 2 ≤ integer ≤ 1000 Default value: 300  | 
| enc1\_vocab\_file |  The vocabulary file for mapping pretrained enc1 token embeddings to vocabulary IDs. **Conditional** Valid values: String with alphanumeric characters, underscore, or period. `[A-Za-z0-9._]`  Default value: "" (empty string)  | 
| enc1\_vocab\_size |  The vocabulary size of enc1 tokens. **Conditional** Valid values: 2 ≤ integer ≤ 3000000  | 
| epochs |  The number of epochs to run for training.  **Optional** Valid values: 1 ≤ integer ≤ 100 Default value: 30  | 
| learning\_rate |  The learning rate for training. **Optional** Valid values: 1.0E-6 ≤ float ≤ 1.0 Default value: 0.0004  | 
| mini\_batch\_size |  The batch size that the dataset is split into for an `optimizer` during training. **Optional** Valid values: 1 ≤ integer ≤ 10000 Default value: 32  | 
| mlp\_activation |  The type of activation function for the multilayer perceptron (MLP) layer. **Optional** Valid values: `tanh`, `relu`, or `linear` [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `linear`  | 
| mlp\_dim |  The dimension of the output from MLP layers. **Optional** Valid values: 2 ≤ integer ≤ 10000 Default value: 512  | 
| mlp\_layers |  The number of MLP layers in the network. **Optional** Valid values: 0 ≤ integer ≤ 10 Default value: 2  | 
| negative\_sampling\_rate |  The ratio of negative samples, generated to assist in training the algorithm, to positive samples that are provided by users. Negative samples represent data that is unlikely to occur in reality and are labelled negatively for training. They facilitate training a model to discriminate between the positive samples observed and the negative samples that are not. To specify the ratio of negative samples to positive samples used for training, set the value to a positive integer. For example, if you train the algorithm on input data in which all of the samples are positive and set `negative_sampling_rate` to 2, the Object2Vec algorithm internally generates two negative samples per positive sample. If you don't want to generate or use negative samples during training, set the value to 0.  **Optional** Valid values: 0 ≤ integer Default value: 0 (off)  | 
| num\_classes |  The number of classes for classification training. Amazon SageMaker AI ignores this hyperparameter for regression problems. **Optional** Valid values: 2 ≤ integer ≤ 30 Default value: 2  | 
| optimizer |  The optimizer type. **Optional** Valid values: `adadelta`, `adagrad`, `adam`, `sgd`, or `rmsprop`. [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `adam`  | 
| output\_layer |  The type of output layer where you specify that the task is regression or classification. **Optional** Valid values: `softmax` or `mean_squared_error` [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) Default value: `softmax`  | 
| tied\_token\_embedding\_weight |  Whether to use a shared embedding layer for both encoders. If the inputs to both encoders use the same token-level units, use a shared token embedding layer. For example, for a collection of documents, if one encoder encodes sentences and another encodes whole documents, you can use a shared token embedding layer. That's because both sentences and documents are composed of word tokens from the same vocabulary. **Optional** Valid values: `True` or `False` Default value: `False`  | 
| token\_embedding\_storage\_type |  The mode of gradient update used during training. When `dense` mode is used, the optimizer calculates the full gradient matrix for the token embedding layer even if most rows of the gradient are zero-valued. When sparse mode (`row_sparse`) is used, the optimizer stores only the rows of the gradient that are actually used in the mini-batch. If you want the algorithm to perform lazy gradient updates, which calculate the gradients only in the non-zero rows and which speed up training, specify `row_sparse`. Setting the value to `row_sparse` constrains the values available for other hyperparameters, as follows:  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/sagemaker/latest/dg/object2vec-hyperparameters.html) **Optional** Valid values: `dense` or `row_sparse` Default value: `dense`  | 
| weight\_decay |  The weight decay parameter used for optimization. **Optional** Valid values: 0 ≤ float ≤ 10000 Default value: 0 (no decay)  | 
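
Because hyperparameters are passed as a string-to-string map, numeric and Boolean values have to be converted to strings before they go into the request. The following is a minimal sketch of one way to do that, assuming plain Python with no AWS SDK calls. The helper name `build_object2vec_hyperparameters` and the `REQUIRED` and `INT_RANGES` constants are hypothetical and are not part of any SageMaker API. The ranges are copied from the table above, and only a subset of the hyperparameters is checked.

```python
# Hypothetical helper, not part of any AWS SDK: builds the string-to-string
# hyperparameter map for an Object2Vec training job.

# Hyperparameters marked **Required** in the table.
REQUIRED = ("enc0_max_seq_len", "enc0_vocab_size")

# Inclusive (min, max) integer ranges copied from the table.
# Only a subset is listed here for illustration.
INT_RANGES = {
    "enc0_max_seq_len": (1, 5000),
    "enc0_vocab_size": (2, 3000000),
    "enc_dim": (4, 10000),
    "early_stopping_patience": (1, 5),
    "epochs": (1, 100),
    "mini_batch_size": (1, 10000),
}


def build_object2vec_hyperparameters(**params):
    """Validate a few values and return every value converted to a string."""
    missing = [name for name in REQUIRED if name not in params]
    if missing:
        raise ValueError("missing required hyperparameters: " + ", ".join(missing))

    for name, (low, high) in INT_RANGES.items():
        if name not in params:
            continue
        value = params[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"{name} must be an integer, got {value!r}")
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")

    # Every value in the hyperparameter map must be a string.
    return {name: str(value) for name, value in params.items()}


hyperparameters = build_object2vec_hyperparameters(
    enc0_max_seq_len=100,
    enc0_vocab_size=50000,
    enc0_network="hcnn",
    enc_dim=1024,
    comparator_list="hadamard, concat, abs_diff",
    epochs=20,
    learning_rate=0.0004,
    tied_token_embedding_weight=True,
)
```

The resulting dictionary is the kind of map you would supply as the `HyperParameters` field of a `CreateTrainingJob` request. The specific values shown (sequence length, vocabulary size, and so on) are illustrative only and are not recommendations.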