Example: Viewing a Training and Validation Curve
Typically, you split the data on which you train your model into training and validation datasets. You use the training set to fit the model parameters. Then you test how well the model generalizes by calculating predictions for the validation set, which the model does not see during training. To analyze the performance of a training job, you commonly plot a training curve against a validation curve.
Viewing a graph that shows the accuracy for both the training and validation sets over time can help you to improve the performance of your model. For example, if training accuracy continues to increase over time, but, at some point, validation accuracy starts to decrease, you are likely overfitting your model. To address this, you can make adjustments to your model, such as increasing regularization.
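The overfitting pattern described above can also be checked programmatically once you have the per-epoch accuracy values. The following is a minimal sketch, not part of the SageMaker AI tooling: the function name `find_overfitting_epoch`, the `patience` parameter, and the sample accuracy values are all hypothetical and chosen only for illustration.

```python
from typing import Optional, Sequence


def find_overfitting_epoch(
    train_acc: Sequence[float],
    val_acc: Sequence[float],
    patience: int = 2,
) -> Optional[int]:
    """Return the index of the first epoch after which validation accuracy
    falls for `patience` consecutive epochs while training accuracy keeps
    rising. Return None if no such epoch exists.

    Hypothetical helper for illustration only.
    """
    if len(train_acc) != len(val_acc):
        raise ValueError("train_acc and val_acc must have the same length")

    for peak in range(len(val_acc) - patience):
        window = range(peak + 1, peak + patience + 1)
        val_falling = all(val_acc[i] < val_acc[i - 1] for i in window)
        train_rising = all(train_acc[i] > train_acc[i - 1] for i in window)
        if val_falling and train_rising:
            return peak
    return None


# Made-up accuracy values: training keeps improving, validation peaks at index 2.
train_curve = [0.60, 0.70, 0.78, 0.84, 0.89, 0.93]
val_curve = [0.58, 0.66, 0.71, 0.70, 0.68, 0.65]
overfit_epoch = find_overfitting_epoch(train_curve, val_curve)
```

A larger `patience` makes the check less sensitive to noise in the validation curve, at the cost of flagging overfitting later.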
For this example, you can use the Image-classification-full-training example in the Example notebooks section of your SageMaker AI notebook instance. If you don't have a SageMaker AI notebook instance, create one by following the instructions at Create an Amazon SageMaker Notebook Instance for the tutorial. If you prefer, you can follow along with the End-to-End Multiclass Image Classification Example.
To view training and validation accuracy curves
- Open the SageMaker AI console at https://console.aws.amazon.com/sagemaker.
- Choose Notebooks, and then choose Notebook instances.
- Choose the notebook instance that you want to use, and then choose Open.
- On the dashboard for your notebook instance, choose SageMaker AI Examples.
- Expand the Introduction to Amazon Algorithms section, and then choose Use next to Image-classification-fulltraining.ipynb.
- Choose Create copy. SageMaker AI creates an editable copy of the Image-classification-fulltraining.ipynb notebook in your notebook instance.
- Run all of the cells in the notebook up to the Inference section. You don't need to deploy an endpoint or get inference for this example.
- After the training job starts, open the CloudWatch console at https://console.aws.amazon.com/cloudwatch.
- Choose Metrics, and then choose /aws/sagemaker/TrainingJobs.
- Choose TrainingJobName.
- On the All metrics tab, choose the train:accuracy and validation:accuracy metrics for the training job that you created in the notebook.
- On the graph, select an area around the metric's values to zoom in. You should see something like the following example.
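If you would rather retrieve the same two metrics from code than from the console, the selection made in the steps above can be sketched as a CloudWatch `get_metric_data` query. This is a minimal sketch under stated assumptions: the helper name `build_metric_queries`, the 60-second period, the `Average` statistic, and the job name `my-training-job` are illustrative choices, not values taken from the notebook. The namespace, dimension name, and metric names are the ones used in the console steps.

```python
from typing import Any, Dict, List

# Values taken from the console steps above.
NAMESPACE = "/aws/sagemaker/TrainingJobs"
DIMENSION_NAME = "TrainingJobName"
METRIC_NAMES = ["train:accuracy", "validation:accuracy"]


def build_metric_queries(
    training_job_name: str, period_seconds: int = 60
) -> List[Dict[str, Any]]:
    """Build the MetricDataQueries list for CloudWatch get_metric_data.

    Hypothetical helper; the period and statistic are assumptions.
    """
    queries = []
    for index, metric_name in enumerate(METRIC_NAMES):
        queries.append(
            {
                # Query IDs must start with a lowercase letter.
                "Id": f"m{index}",
                "Label": metric_name,
                "MetricStat": {
                    "Metric": {
                        "Namespace": NAMESPACE,
                        "MetricName": metric_name,
                        "Dimensions": [
                            {"Name": DIMENSION_NAME, "Value": training_job_name}
                        ],
                    },
                    "Period": period_seconds,
                    "Stat": "Average",
                },
            }
        )
    return queries


# "my-training-job" is a placeholder; use the job name created by the notebook.
queries = build_metric_queries("my-training-job")

# With AWS credentials configured, the queries could be sent like this:
#
#   import boto3
#   from datetime import datetime, timedelta, timezone
#
#   end = datetime.now(timezone.utc)
#   response = boto3.client("cloudwatch").get_metric_data(
#       MetricDataQueries=queries,
#       StartTime=end - timedelta(hours=3),
#       EndTime=end,
#   )
```

Each entry in the response's `MetricDataResults` list carries `Timestamps` and `Values` that you can plot with the library of your choice.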