BlazingText Hyperparameters
When you start a training job with a CreateTrainingJob request, you
specify a training algorithm. You can also specify algorithm-specific hyperparameters as
string-to-string maps. The hyperparameters for the BlazingText algorithm depend on which
mode you use: Word2Vec (unsupervised) or Text Classification (supervised).
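Because hyperparameters are passed as string-to-string maps, numeric values have to be converted to strings before they are placed in the request. The following is a minimal sketch of that conversion. The helper name `to_hyperparameter_map` is hypothetical, and the `mode` value shown is an assumed example for the Text Classification mode; the remaining values are the defaults from the tables in this topic.

```python
def to_hyperparameter_map(params):
    """Convert a dict of Python values to a string-to-string map.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    return {name: str(value) for name, value in params.items()}


hyperparameters = to_hyperparameter_map({
    "mode": "supervised",   # assumed value for the Text Classification mode
    "epochs": 5,
    "learning_rate": 0.05,
    "vector_dim": 100,
})

# Every key and every value is now a string.
print(hyperparameters)
```

The resulting dictionary is in the shape that a string-to-string hyperparameter map requires; how it is attached to the training request depends on the client library that you use.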
Word2Vec Hyperparameters
The following table lists the hyperparameters for the BlazingText Word2Vec training algorithm provided by Amazon SageMaker.
Parameter Name | Description |
---|---|
mode | The Word2vec architecture used for training. Required. Valid values: batch_skipgram, skipgram, or cbow. |
batch_size | The size of each batch when mode is set to batch_skipgram. Optional. Valid values: Positive integer. Default value: 11. |
buckets | The number of hash buckets to use for subwords. Optional. Valid values: Positive integer. Default value: 2000000. |
epochs | The number of complete passes through the training data. Optional. Valid values: Positive integer. Default value: 5. |
evaluation | Whether the trained model is evaluated using the WordSimilarity-353 Test. Optional. Valid values: (Boolean) True or False. Default value: True. |
learning_rate | The step size used for parameter updates. Optional. Valid values: Positive float. Default value: 0.05. |
min_char | The minimum number of characters to use for subwords/character n-grams. Optional. Valid values: Positive integer. Default value: 3. |
min_count | Words that appear fewer than min_count times are discarded. Optional. Valid values: Non-negative integer. Default value: 5. |
max_char | The maximum number of characters to use for subwords/character n-grams. Optional. Valid values: Positive integer. Default value: 6. |
negative_samples | The number of negative samples for the negative sample sharing strategy. Optional. Valid values: Positive integer. Default value: 5. |
sampling_threshold | The threshold for the occurrence of words. Words that appear with higher frequency in the training data are randomly down-sampled. Optional. Valid values: Positive fraction. The recommended range is (0, 1e-3]. Default value: 0.0001. |
subwords | Whether to learn subword embeddings or not. Optional. Valid values: (Boolean) True or False. Default value: False. |
vector_dim | The dimension of the word vectors that the algorithm learns. Optional. Valid values: Positive integer. Default value: 100. |
window_size | The size of the context window. The context window is the number of words surrounding the target word used for training. Optional. Valid values: Positive integer. Default value: 5. |
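The numeric defaults and valid ranges in the preceding table can be checked before a training job is submitted. The following sketch merges user overrides with the documented numeric defaults, validates them, and returns a string-to-string map. The function name `build_word2vec_hyperparameters` is hypothetical, the Boolean parameters (evaluation and subwords) are omitted for brevity, and the check that min_char does not exceed max_char is an assumed sanity check rather than a documented rule.

```python
# Numeric defaults copied from the Word2Vec hyperparameter table.
WORD2VEC_DEFAULTS = {
    "batch_size": 11,
    "buckets": 2000000,
    "epochs": 5,
    "learning_rate": 0.05,
    "min_char": 3,
    "min_count": 5,
    "max_char": 6,
    "negative_samples": 5,
    "sampling_threshold": 0.0001,
    "vector_dim": 100,
    "window_size": 5,
}


def build_word2vec_hyperparameters(mode, **overrides):
    """Merge overrides with the defaults, validate, and stringify.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    unknown = set(overrides) - set(WORD2VEC_DEFAULTS)
    if unknown:
        raise ValueError(f"Unknown hyperparameters: {sorted(unknown)}")

    merged = {**WORD2VEC_DEFAULTS, **overrides}
    for name, value in merged.items():
        # min_count accepts zero; every other numeric value must be positive.
        valid = value >= 0 if name == "min_count" else value > 0
        if not valid:
            raise ValueError(f"{name} is out of range: {value}")

    # Assumed sanity check: the n-gram length range must not be empty.
    if merged["min_char"] > merged["max_char"]:
        raise ValueError("min_char must not exceed max_char")

    result = {"mode": mode}  # mode is required and has no default
    result.update({name: str(value) for name, value in merged.items()})
    return result


params = build_word2vec_hyperparameters("skipgram", vector_dim=300, window_size=10)
print(params["vector_dim"], params["window_size"], params["epochs"])
```

Only the parameters that are passed as keyword arguments differ from the defaults, so the call site shows exactly which values were tuned.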
Text Classification Hyperparameters
The following table lists the hyperparameters for the Text Classification training algorithm provided by Amazon SageMaker.
Note
Although some of the parameters are common between the Text Classification and Word2Vec modes, they might have different meanings depending on the context.
Parameter Name | Description |
---|---|
mode | The training mode. Required. Valid values: supervised. |
buckets | The number of hash buckets to use for word n-grams. Optional. Valid values: Positive integer. Default value: 2000000. |
early_stopping | Whether to stop training if validation accuracy doesn't improve after patience number of epochs. Optional. Valid values: (Boolean) True or False. Default value: False. |
epochs | The maximum number of complete passes through the training data. Optional. Valid values: Positive integer. Default value: 5. |
learning_rate | The step size used for parameter updates. Optional. Valid values: Positive float. Default value: 0.05. |
min_count | Words that appear fewer than min_count times are discarded. Optional. Valid values: Non-negative integer. Default value: 5. |
min_epochs | The minimum number of epochs to train before early stopping logic is invoked. Optional. Valid values: Positive integer. Default value: 5. |
patience | The number of epochs to wait before applying early stopping when no progress is made on the validation set. Used only when early_stopping is True. Optional. Valid values: Positive integer. Default value: 4. |
vector_dim | The dimension of the embedding layer. Optional. Valid values: Positive integer. Default value: 100. |
word_ngrams | The number of word n-gram features to use. Optional. Valid values: Positive integer. Default value: 2. |
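The way epochs, min_epochs, and patience interact with early stopping can be sketched with a short simulation. This is one plausible reading of the descriptions in the preceding table, not the algorithm's actual implementation: it assumes that epochs without improvement are counted from the start of training and that the stopping check is applied only after min_epochs epochs have run. The function name `epochs_run` and the accuracy values are hypothetical.

```python
def epochs_run(val_accuracies, epochs=5, min_epochs=5, patience=4,
               early_stopping=True):
    """Return how many epochs run under an assumed early stopping rule.

    val_accuracies holds one validation accuracy per epoch (made-up data).
    """
    best = float("-inf")
    stale = 0  # consecutive epochs without a validation improvement
    for epoch, accuracy in enumerate(val_accuracies[:epochs], start=1):
        if accuracy > best:
            best = accuracy
            stale = 0
        else:
            stale += 1
        # Assumed rule: stop only after min_epochs, once patience is used up.
        if early_stopping and epoch >= min_epochs and stale >= patience:
            return epoch
    return min(epochs, len(val_accuracies))


# Hypothetical validation accuracies; the best value is first reached in epoch 2.
history = [0.70, 0.72, 0.72, 0.71, 0.72, 0.71, 0.72, 0.70]

stopped_at = epochs_run(history, epochs=20, min_epochs=5, patience=4)
print(stopped_at)
```

In this sketch, epochs acts as an upper bound, min_epochs as a lower bound on when stopping can occur, and patience as the tolerance for epochs without improvement.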