Tune a linear learner model
Automatic model tuning, also known as hyperparameter tuning, finds the best version of a model by running many jobs that test a range of hyperparameters on your dataset. You choose the tunable hyperparameters, a range of values for each, and an objective metric from the metrics that the algorithm computes. Automatic model tuning then searches the chosen hyperparameter ranges to find the combination of values that results in the model that optimizes the objective metric.
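As a rough mental model of the search described above, the following sketch runs a plain random search over two hyperparameter ranges and keeps the combination with the lowest objective value. It is only an illustration. `toy_validation_loss` is a made-up stand-in for a training job, the ranges are arbitrary, and random search is used for simplicity, not because it is the strategy SageMaker applies.

```python
import random


def toy_validation_loss(wd, learning_rate):
    """Made-up stand-in for a training job; returns a fake validation loss."""
    return (wd - 0.01) ** 2 + (learning_rate - 0.1) ** 2


def random_search(objective, ranges, num_jobs, seed=0):
    """Sample each hyperparameter uniformly and keep the lowest objective value."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(num_jobs):
        # One "job": draw a value for every tunable hyperparameter.
        params = {name: rng.uniform(low, high) for name, (low, high) in ranges.items()}
        value = objective(**params)
        if value < best_value:  # the objective is minimized in this sketch
            best_params, best_value = params, value
    return best_params, best_value


# Arbitrary illustrative ranges, not recommended values.
ranges = {"wd": (0.0, 1.0), "learning_rate": (0.0, 1.0)}
best_params, best_value = random_search(toy_validation_loss, ranges, num_jobs=50)
print(best_params, best_value)
```

Running more jobs with the same seed can only keep or improve the best value found, which is the basic trade-off between tuning cost and model quality.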
The linear learner algorithm also has an internal mechanism for tuning hyperparameters that is separate from the automatic model tuning feature described here. By default, the linear learner algorithm tunes hyperparameters by training multiple models in parallel. When you use automatic model tuning, the linear learner internal tuning mechanism is turned off automatically. This sets the number of parallel models, num_models, to 1. The algorithm ignores any value that you set for num_models.
For more information about model tuning, see Automatic model tuning with SageMaker.
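The effect on num_models described above can be pictured with a small sketch. The helper name `effective_hyperparameters` is hypothetical and is not part of any SageMaker API. It only mirrors the stated behavior, and the example values are arbitrary.

```python
def effective_hyperparameters(user_hyperparameters, automatic_tuning):
    """Return the hyperparameters as the algorithm would treat them (illustrative only)."""
    effective = dict(user_hyperparameters)  # copy so the caller's dict is untouched
    if automatic_tuning:
        # Internal tuning is turned off: num_models becomes 1,
        # and any value the user supplied is ignored.
        effective["num_models"] = "1"
    return effective


requested = {"predictor_type": "regressor", "num_models": "32"}
print(effective_hyperparameters(requested, automatic_tuning=True))
print(effective_hyperparameters(requested, automatic_tuning=False))
```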
Metrics computed by the linear learner algorithm
The linear learner algorithm reports the metrics in the following table, which are computed during training. Choose one of them as the objective metric. To avoid overfitting, we recommend tuning the model against a validation metric instead of a training metric.
Metric Name | Description | Optimization Direction |
---|---|---|
test:absolute_loss | The absolute loss of the final model on the test dataset. This objective metric is only valid for regression. | Minimize |
test:binary_classification_accuracy | The accuracy of the final model on the test dataset. This objective metric is only valid for binary classification. | Maximize |
test:binary_f_beta | The F-beta score of the final model on the test dataset. By default, it is the F1 score, which is the harmonic mean of precision and recall. This objective metric is only valid for binary classification. | Maximize |
test:dcg | The discounted cumulative gain of the final model on the test dataset. This objective metric is only valid for multiclass classification. | Maximize |
test:macro_f_beta | The F-beta score of the final model on the test dataset. This objective metric is only valid for multiclass classification. | Maximize |
test:macro_precision | The precision score of the final model on the test dataset. This objective metric is only valid for multiclass classification. | Maximize |
test:macro_recall | The recall score of the final model on the test dataset. This objective metric is only valid for multiclass classification. | Maximize |
test:mse | The mean square error of the final model on the test dataset. This objective metric is only valid for regression. | Minimize |
test:multiclass_accuracy | The accuracy of the final model on the test dataset. This objective metric is only valid for multiclass classification. | Maximize |
test:multiclass_top_k_accuracy | The accuracy among the top k labels predicted on the test dataset. If you choose this metric as the objective, we recommend setting the value of k using the | Maximize |
test:objective_loss | The mean value of the objective loss function on the test dataset after the model is trained. By default, the loss is logistic loss for binary classification and squared loss for regression. To set the loss to other types, use the | Minimize |
test:precision | The precision of the final model on the test dataset. If you choose this metric as the objective, we recommend setting a target recall by setting the | Maximize |
test:recall | The recall of the final model on the test dataset. If you choose this metric as the objective, we recommend setting a target precision by setting the | Maximize |
test:roc_auc_score | The area under the receiver operating characteristic (ROC) curve of the final model on the test dataset. This objective metric is only valid for binary classification. | Maximize |
validation:absolute_loss | The absolute loss of the final model on the validation dataset. This objective metric is only valid for regression. | Minimize |
validation:binary_classification_accuracy | The accuracy of the final model on the validation dataset. This objective metric is only valid for binary classification. | Maximize |
validation:binary_f_beta | The F-beta score of the final model on the validation dataset. By default, the F-beta score is the F1 score, which is the harmonic mean of the | Maximize |
validation:dcg | The discounted cumulative gain of the final model on the validation dataset. This objective metric is only valid for multiclass classification. | Maximize |
validation:macro_f_beta | The F-beta score of the final model on the validation dataset. This objective metric is only valid for multiclass classification. | Maximize |
validation:macro_precision | The precision score of the final model on the validation dataset. This objective metric is only valid for multiclass classification. | Maximize |
validation:macro_recall | The recall score of the final model on the validation dataset. This objective metric is only valid for multiclass classification. | Maximize |
validation:mse | The mean square error of the final model on the validation dataset. This objective metric is only valid for regression. | Minimize |
validation:multiclass_accuracy | The accuracy of the final model on the validation dataset. This objective metric is only valid for multiclass classification. | Maximize |
validation:multiclass_top_k_accuracy | The accuracy among the top k labels predicted on the validation dataset. If you choose this metric as the objective, we recommend setting the value of k using the | Maximize |
validation:objective_loss | The mean value of the objective loss function on the validation dataset every epoch. By default, the loss is logistic loss for binary classification and squared loss for regression. To set the loss to other types, use the | Minimize |
validation:precision | The precision of the final model on the validation dataset. If you choose this metric as the objective, we recommend setting a target recall by setting the | Maximize |
validation:recall | The recall of the final model on the validation dataset. If you choose this metric as the objective, we recommend setting a target precision by setting the | Maximize |
validation:rmse | The root mean square error of the final model on the validation dataset. This objective metric is only valid for regression. | Minimize |
validation:roc_auc_score | The area under the receiver operating characteristic (ROC) curve of the final model on the validation dataset. This objective metric is only valid for binary classification. | Maximize |
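To make the optimization directions in the table above easier to apply, the following sketch encodes them in a lookup and builds an objective definition from a metric name. The helper `tuning_objective` is hypothetical. The `Type` and `MetricName` keys are assumed to match the objective structure of the SageMaker `CreateHyperParameterTuningJob` request, so check them against the API reference before relying on them.

```python
# Optimization directions copied from the metrics table above.
_MINIMIZE = ("absolute_loss", "mse", "objective_loss")
_MAXIMIZE = (
    "binary_classification_accuracy", "binary_f_beta", "dcg",
    "macro_f_beta", "macro_precision", "macro_recall",
    "multiclass_accuracy", "multiclass_top_k_accuracy",
    "precision", "recall", "roc_auc_score",
)

DIRECTIONS = {}
for channel in ("test", "validation"):
    for base in _MINIMIZE:
        DIRECTIONS[f"{channel}:{base}"] = "Minimize"
    for base in _MAXIMIZE:
        DIRECTIONS[f"{channel}:{base}"] = "Maximize"
# The table lists rmse for the validation dataset only.
DIRECTIONS["validation:rmse"] = "Minimize"


def tuning_objective(metric_name):
    """Build an objective definition for a metric listed in the table."""
    if metric_name not in DIRECTIONS:
        raise ValueError(f"Metric not listed in the table: {metric_name}")
    return {"Type": DIRECTIONS[metric_name], "MetricName": metric_name}


print(tuning_objective("validation:mse"))
```

Choosing a `validation:` metric here follows the recommendation above to tune against a validation metric.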
Tuning linear learner hyperparameters
You can tune a linear learner model with the following hyperparameters.
Parameter Name | Parameter Type | Recommended Ranges |
---|---|---|
wd | | |
l1 | | |
learning_rate | | |
mini_batch_size | | |
use_bias | | |
positive_example_weight_mult | | |
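A minimal sketch of how ranges for the hyperparameters listed above might be written down follows. The dictionary layout is assumed to follow the `ParameterRanges` structure of the SageMaker `CreateHyperParameterTuningJob` request. The helper functions are hypothetical. The parameter types and every bound shown are placeholders chosen for illustration, not recommended ranges.

```python
def continuous(name, low, high, scaling="Auto"):
    """Describe a continuous range; bounds are passed as strings."""
    if not low < high:
        raise ValueError(f"{name}: low must be smaller than high")
    return {"Name": name, "MinValue": str(low), "MaxValue": str(high),
            "ScalingType": scaling}


def integer(name, low, high, scaling="Auto"):
    """Describe an integer range; bounds are passed as strings."""
    if not low < high:
        raise ValueError(f"{name}: low must be smaller than high")
    return {"Name": name, "MinValue": str(int(low)), "MaxValue": str(int(high)),
            "ScalingType": scaling}


def categorical(name, values):
    """Describe a categorical choice; values are passed as strings."""
    return {"Name": name, "Values": [str(v) for v in values]}


# Placeholder bounds and assumed types, for illustration only.
parameter_ranges = {
    "ContinuousParameterRanges": [
        continuous("wd", 1e-7, 1.0, "Logarithmic"),
        continuous("l1", 1e-7, 1.0, "Logarithmic"),
        continuous("learning_rate", 1e-5, 1.0, "Logarithmic"),
        continuous("positive_example_weight_mult", 1e-5, 1e5, "Logarithmic"),
    ],
    "IntegerParameterRanges": [integer("mini_batch_size", 100, 5000)],
    "CategoricalParameterRanges": [categorical("use_bias", [True, False])],
}

tuned_names = sorted(
    entry["Name"] for group in parameter_ranges.values() for entry in group
)
print(tuned_names)
```

Logarithmic scaling is used for the placeholder ranges that span several orders of magnitude, so that small values are sampled as often as large ones.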