DeepAR Hyperparameters
The following table lists the hyperparameters that you can set when training with the Amazon SageMaker AI DeepAR forecasting algorithm.
Parameter Name | Description |
---|---|
context_length |
The number of time points that the model sees before making
the prediction. The value for this parameter should be about the
same as the prediction_length. Required. Valid values: positive integer |
epochs |
The maximum number of passes over the training data. The optimal
value depends on your data size and learning rate. See also
early_stopping_patience. Required. Valid values: positive integer |
prediction_length |
The number of time steps that the model is trained to predict,
also called the forecast horizon. The trained model always generates
forecasts with this length. It can't generate longer forecasts.
Required. Valid values: positive integer |
time_freq |
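The granularity of the time series in the dataset. Use
time_freq to select appropriate date features and lags.
Required. Valid values: an integer followed by M (monthly),
W (weekly), D (daily), H (hourly),
or min (every minute). For example, 5min specifies a frequency of 5 minutes. |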
cardinality |
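When using the categorical features (cat), cardinality is an array specifying the number of categories (groups) per categorical feature. Set this to auto to infer the cardinality from the data. Set cardinality to ignore to force DeepAR to not use categorical features, even if they are present in the data. To perform additional data validation, you can explicitly set this parameter to the actual value. For example, if two categorical features are provided, where the first has 2 possible values and the other has 3, set this to [2, 3]. For more information on how to use categorical features, see the data section on the main documentation page of DeepAR. Optional. Valid values: auto, ignore, or an array of positive integers. Default value: auto |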
dropout_rate |
The dropout rate to use during training. The model uses zoneout regularization. For each iteration, a random subset of hidden neurons is not updated. Typical values are less than 0.2. Optional. Valid values: float. Default value: 0.1 |
early_stopping_patience |
If this parameter is set, training stops when no progress is made
within the specified number of epochs. Optional. Valid values: integer |
embedding_dimension |
The size of the embedding vector learned per categorical feature (the same value is used for all categorical features). The DeepAR model can learn group-level time series patterns when a
categorical grouping feature is provided. To do this, the model
learns an embedding vector of size embedding_dimension for each group. Optional. Valid values: positive integer. Default value: 10 |
learning_rate |
The learning rate used in training. Typical values range from 1e-4 to 1e-1. Optional. Valid values: float. Default value: 1e-3 |
likelihood |
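The model generates a probabilistic forecast, and can provide quantiles of the distribution and return samples. Depending on your data, select an appropriate likelihood (noise model) to use for uncertainty estimates. The following likelihoods can be selected: gaussian (for real-valued data), beta (for real-valued targets between 0 and 1, inclusive), negative-binomial (for count data, that is, non-negative integers), student-T (an alternative for real-valued data that works well for bursty data), and deterministic-L1 (a loss function that does not estimate uncertainty and learns only a point forecast).
Optional. Valid values: one of gaussian, beta, negative-binomial, student-T, or deterministic-L1. Default value: student-T |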
mini_batch_size |
The size of mini-batches used during training. Typical values range from 32 to 512. Optional. Valid values: positive integer. Default value: 128 |
num_cells |
The number of cells to use in each hidden layer of the RNN. Typical values range from 30 to 100. Optional. Valid values: positive integer. Default value: 40 |
num_dynamic_feat |
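The number of dynamic_feat provided in the data. Set this to auto to infer the number of dynamic features from the data. To force DeepAR to not use dynamic features, even if they are
present in the data, set num_dynamic_feat to ignore. To perform additional data validation, you can explicitly set this parameter to the actual integer value. For example, if two dynamic features are provided, set this to 2. Optional. Valid values: auto, ignore, or a positive integer. Default value: auto |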
num_eval_samples |
The number of samples that are used per time series when calculating test accuracy metrics. This parameter has no influence on the training or the final model. In particular, the model can be queried with a different number of samples. This parameter affects only the reported accuracy scores on the test channel after training. Smaller values result in faster evaluation, but the evaluation scores are then typically worse and more uncertain. When evaluating with higher quantiles, for example 0.95, it may be important to increase the number of evaluation samples. Optional. Valid values: integer. Default value: 100 |
num_layers |
The number of hidden layers in the RNN. Typical values range from 1 to 4. Optional. Valid values: positive integer. Default value: 2 |
test_quantiles |
Quantiles for which to calculate quantile loss on the test channel. Optional. Valid values: array of floats. Default value: [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9] |
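
As a rough illustration of how the table above might be applied, the following Python sketch checks a hyperparameter dictionary against a few of the listed constraints before a training job is started. The function `validate_deepar_hyperparameters`, its helper, and the example values are hypothetical and are not part of SageMaker. The values are written as strings on the assumption that hyperparameters are supplied that way, and the pattern treats the integer prefix of time_freq as optional, which goes slightly beyond the wording in the table.

```python
import re

# Hyperparameters marked "Required" in the table.
REQUIRED = ("context_length", "epochs", "prediction_length", "time_freq")

# Hyperparameters whose valid values are listed as positive integers.
POSITIVE_INT = (
    "context_length", "epochs", "prediction_length",
    "embedding_dimension", "mini_batch_size", "num_cells", "num_layers",
)

# Likelihood names as listed in the table.
LIKELIHOODS = {"gaussian", "beta", "negative-binomial", "student-T", "deterministic-L1"}

# Assumption: the leading integer is optional, so "H" and "5min" both match.
TIME_FREQ = re.compile(r"\d*(M|W|D|H|min)")


def _is_positive_int(value):
    """Return True if value (a string or number) parses as an integer > 0."""
    try:
        return int(str(value)) > 0
    except ValueError:
        return False


def validate_deepar_hyperparameters(hp):
    """Return a list of problems found in hp (an empty list if none)."""
    problems = []
    for name in REQUIRED:
        if name not in hp:
            problems.append(f"missing required hyperparameter: {name}")
    for name in POSITIVE_INT:
        if name in hp and not _is_positive_int(hp[name]):
            problems.append(f"{name} must be a positive integer")
    if "time_freq" in hp and not TIME_FREQ.fullmatch(str(hp["time_freq"])):
        problems.append("time_freq must look like 5min, 2H, or D")
    if "likelihood" in hp and hp["likelihood"] not in LIKELIHOODS:
        problems.append("likelihood is not one of the listed values")
    return problems


# Illustrative values only; they are not tuned recommendations.
example = {
    "time_freq": "5min",
    "context_length": "12",
    "prediction_length": "12",
    "epochs": "100",
    "likelihood": "gaussian",
}

print(validate_deepar_hyperparameters(example))  # prints []
```

The check is deliberately partial: it does not cover cardinality, num_dynamic_feat, or the float-valued parameters, and the training job itself remains the authority on what is accepted.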