NTM Hyperparameters

The following table lists the hyperparameters that you can set for the Amazon SageMaker AI Neural Topic Model (NTM) algorithm.

Parameter Name	Description
`feature_dim`	The vocabulary size of the dataset. Required. Valid values: Positive integer (min: 1, max: 1,000,000).
`num_topics`	The number of topics to learn. Required. Valid values: Positive integer (min: 2, max: 1000).
`batch_norm`	Whether to use batch normalization during training. Optional. Valid values: true or false. Default value: false.
`clip_gradient`	The maximum magnitude for each gradient component. Optional. Valid values: Float (min: 1e-3). Default value: Infinity.
`encoder_layers`	The number of layers in the encoder and the output size of each layer. When set to `auto`, the algorithm uses two layers of sizes 3 x `num_topics` and 2 x `num_topics`, respectively. Optional. Valid values: Comma-separated list of positive integers or `auto`. Default value: `auto`.
`encoder_layers_activation`	The activation function to use in the encoder layers. Optional. Valid values: `sigmoid` (sigmoid function), `tanh` (hyperbolic tangent), `relu` (rectified linear unit). Default value: `sigmoid`.
`epochs`	The maximum number of passes over the training data. Optional. Valid values: Positive integer (min: 1). Default value: 50.
`learning_rate`	The learning rate for the optimizer. Optional. Valid values: Float (min: 1e-6, max: 1.0). Default value: 0.001.
`mini_batch_size`	The number of examples in each mini batch. Optional. Valid values: Positive integer (min: 1, max: 10000). Default value: 256.
`num_patience_epochs`	The number of successive epochs over which the early stopping criterion is evaluated. Early stopping is triggered when the change in the loss function drops below the specified `tolerance` within the last `num_patience_epochs` epochs. To disable early stopping, set `num_patience_epochs` to a value larger than `epochs`. Optional. Valid values: Positive integer (min: 1). Default value: 3.
`optimizer`	The optimizer to use for training. Optional. Valid values: `sgd` (stochastic gradient descent), `adam` (adaptive moment estimation), `adagrad` (adaptive gradient algorithm), `adadelta` (an adaptive learning rate algorithm), `rmsprop` (root mean square propagation). Default value: `adadelta`.
`rescale_gradient`	The rescale factor for the gradient. Optional. Valid values: Float (min: 1e-3, max: 1.0). Default value: 1.0.
`sub_sample`	The fraction of the training data to sample for training per epoch. Optional. Valid values: Float (min: 0.0, max: 1.0). Default value: 1.0.
`tolerance`	The maximum relative change in the loss function. Early stopping is triggered when the change in the loss function drops below this value within the last `num_patience_epochs` epochs. Optional. Valid values: Float (min: 1e-6, max: 0.1). Default value: 0.001.
`weight_decay`	The weight decay coefficient. Adds L2 regularization. Optional. Valid values: Float (min: 0.0, max: 1.0). Default value: 0.0.
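
The ranges in the table can be checked locally before a training job is submitted. The following is a minimal sketch, not part of the algorithm or of any SDK: the names `NTM_NUMERIC_RANGES`, `NTM_CHOICES`, and `validate_ntm_hyperparameters` are hypothetical helpers introduced here, and the limits are copied from the table above.

```python
# Hypothetical local pre-check; limits are copied from the hyperparameter table.
# Each entry is (expected type, minimum, maximum); None means no upper bound is listed.
NTM_NUMERIC_RANGES = {
    "feature_dim": (int, 1, 1_000_000),
    "num_topics": (int, 2, 1000),
    "clip_gradient": (float, 1e-3, None),
    "epochs": (int, 1, None),
    "learning_rate": (float, 1e-6, 1.0),
    "mini_batch_size": (int, 1, 10_000),
    "num_patience_epochs": (int, 1, None),
    "rescale_gradient": (float, 1e-3, 1.0),
    "sub_sample": (float, 0.0, 1.0),
    "tolerance": (float, 1e-6, 0.1),
    "weight_decay": (float, 0.0, 1.0),
}
NTM_CHOICES = {
    "batch_norm": {"true", "false"},
    "encoder_layers_activation": {"sigmoid", "tanh", "relu"},
    "optimizer": {"sgd", "adam", "adagrad", "adadelta", "rmsprop"},
}
REQUIRED = ("feature_dim", "num_topics")


def _is_number(value, kind):
    """Accept ints for integer parameters and ints or floats for float parameters."""
    if isinstance(value, bool):
        return False
    if kind is int:
        return isinstance(value, int)
    return isinstance(value, (int, float))


def validate_ntm_hyperparameters(params):
    """Return a list of problems; an empty list means every value fits the table."""
    problems = [f"{name}: required parameter is missing"
                for name in REQUIRED if name not in params]
    for name, value in params.items():
        if name in NTM_NUMERIC_RANGES:
            kind, low, high = NTM_NUMERIC_RANGES[name]
            if not _is_number(value, kind):
                problems.append(f"{name}: expected {kind.__name__}, got {value!r}")
            elif value < low or (high is not None and value > high):
                problems.append(f"{name}: {value!r} is outside the listed range")
        elif name in NTM_CHOICES:
            # Booleans are normalized to the lowercase strings used in the table.
            if str(value).lower() not in NTM_CHOICES[name]:
                problems.append(f"{name}: {value!r} is not a listed value")
        elif name == "encoder_layers":
            parts = str(value).split(",")
            is_list = all(p.strip().isdigit() and int(p) > 0 for p in parts)
            if value != "auto" and not is_list:
                problems.append(f"{name}: expected 'auto' or positive integers")
        else:
            problems.append(f"{name}: not a parameter in the table")
    return problems


hyperparameters = {"feature_dim": 5000, "num_topics": 20, "optimizer": "adam"}
print(validate_ntm_hyperparameters(hyperparameters))
```

A dictionary that passes this check could then be supplied to a training job, for example through the `hyperparameters` argument of the SageMaker Python SDK `Estimator`.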
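
To make the `auto` setting of `encoder_layers` concrete, the short sketch below computes the layer sizes described in the table (3 x `num_topics` and 2 x `num_topics`). The function name `resolve_encoder_layers` is a hypothetical helper used only for illustration; it is not an API of the algorithm.

```python
def resolve_encoder_layers(encoder_layers, num_topics):
    """Return the encoder layer sizes as a list of integers.

    'auto' follows the rule stated in the table: two layers of sizes
    3 x num_topics and 2 x num_topics. Any other value is treated as a
    comma-separated list of positive integers.
    """
    if encoder_layers == "auto":
        return [3 * num_topics, 2 * num_topics]
    return [int(size) for size in encoder_layers.split(",")]


print(resolve_encoder_layers("auto", 20))    # [60, 40]
print(resolve_encoder_layers("100,50", 20))  # [100, 50]
```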
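
The interaction between `num_patience_epochs` and `tolerance` can be sketched as follows. This is one plausible reading of the descriptions in the table, not the algorithm's actual implementation: it assumes that training stops once the relative change in the loss between consecutive epochs has stayed below `tolerance` for the last `num_patience_epochs` epochs. The function `should_stop_early` is hypothetical.

```python
def should_stop_early(losses, num_patience_epochs=3, tolerance=0.001):
    """Decide whether to stop, given per-epoch losses ordered oldest first.

    Assumption: stop only if every one of the last num_patience_epochs
    epoch-to-epoch relative changes in the loss is below tolerance.
    """
    # num_patience_epochs changes require num_patience_epochs + 1 loss values.
    if len(losses) < num_patience_epochs + 1:
        return False
    recent = losses[-(num_patience_epochs + 1):]
    for previous, current in zip(recent, recent[1:]):
        # Guard against division by zero when the previous loss is 0.
        relative_change = abs(previous - current) / max(abs(previous), 1e-12)
        if relative_change >= tolerance:
            return False
    return True


# The loss has plateaued over the last three epochs, so this returns True.
print(should_stop_early([10.0, 5.0, 4.0, 3.999, 3.998, 3.997]))

# With the default of 50 epochs, a patience of 51 can never be satisfied,
# which matches the table's advice for disabling early stopping.
print(should_stop_early([1.0] * 50, num_patience_epochs=51))
```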
