Model performance report
An Amazon SageMaker AI model quality report (also referred to as performance report) provides insights and quality information for the best model candidate generated by an AutoML job. This includes information about the job details, model problem type, objective function, and various metrics. This section details the content of a performance report for text classification problems and explains how to access the metrics as raw data in a JSON file.
You can find the Amazon S3 prefix to the model quality report artifacts generated for the best candidate in the response to DescribeAutoMLJobV2, at BestCandidate.CandidateProperties.CandidateArtifactLocations.ModelInsights.
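For example, a minimal sketch using the AWS SDK for Python (Boto3) to look up this prefix and list the report artifacts might look like the following. The job name my-text-classification-job is a placeholder, and the snippet assumes the job has completed and that your credentials allow access to the output bucket.

```python
import boto3
from urllib.parse import urlparse

sm = boto3.client("sagemaker")
s3 = boto3.client("s3")

# Placeholder job name; replace it with your AutoML V2 job name.
response = sm.describe_auto_ml_job_v2(AutoMLJobName="my-text-classification-job")

# S3 prefix of the model quality report artifacts for the best candidate.
insights_prefix = (
    response["BestCandidate"]["CandidateProperties"]
    ["CandidateArtifactLocations"]["ModelInsights"]
)
print(insights_prefix)

# List the report artifacts (for example, the raw JSON metrics file) under that prefix.
parsed = urlparse(insights_prefix)
objects = s3.list_objects_v2(Bucket=parsed.netloc, Prefix=parsed.path.lstrip("/"))
for obj in objects.get("Contents", []):
    print(obj["Key"])
```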
The performance report contains two sections:
- The first section contains details about the Autopilot job that produced the model.
- The second section contains a model quality report with various performance metrics.
Autopilot job details
This first section of the report gives some general information about the Autopilot job that produced the model. These details include the following information:
- Autopilot candidate name: The name of the best model candidate.
- Autopilot job name: The name of the job.
- Problem type: The problem type. In our case, text classification.
- Objective metric: The objective metric used to optimize the performance of the model. In our case, Accuracy.
- Optimization direction: Indicates whether the objective metric is minimized or maximized.
Model quality report
Model quality information is generated by Autopilot model insights. The content of the report depends on the problem type that the model addressed. The report specifies the number of rows included in the evaluation dataset and the time at which the evaluation occurred.
Metrics tables
The first part of the model quality report contains metrics tables. These are appropriate for the type of problem that the model addressed.
The following image is an example of a metrics table generated by Autopilot for an image or text classification problem. It shows the metric name, value, and standard deviation.
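If you prefer to work with these metrics as raw data, a minimal sketch along the following lines can read the JSON file downloaded from the ModelInsights prefix. The file name model_insights.json and the field names used here are assumptions about the report layout; inspect the actual artifact for the exact schema.

```python
import json

# Assumed local file name; download it from the ModelInsights prefix first.
with open("model_insights.json") as f:
    report = json.load(f)

# The field names "metrics", "name", "value", and "standard_deviation" are
# assumptions about the layout; check the actual file for the exact schema.
for metric in report.get("metrics", []):
    name = metric.get("name")
    value = metric.get("value")
    std_dev = metric.get("standard_deviation")
    print(f"{name}: {value} (standard deviation: {std_dev})")
```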
Graphical model performance information
The second part of the model quality report contains graphical information to help you evaluate model performance. The contents of this section depend on the selected problem type.
Confusion matrix
A confusion matrix provides a way to visualize the accuracy of the predictions made by a model for binary and multiclass classification problems.
The prediction outcomes summarized by the matrix are defined as follows; the sketch after this list illustrates how these counts are derived from predictions.
- Correct predictions
  - True positive (TP): The predicted value is 1, and the true value is 1.
  - True negative (TN): The predicted value is 0, and the true value is 0.
- Erroneous predictions
  - False positive (FP): The predicted value is 1, but the true value is 0.
  - False negative (FN): The predicted value is 0, but the true value is 1.
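The following minimal sketch illustrates these definitions for binary labels (0 and 1). The label vectors are made up for illustration, not output from an actual Autopilot job.

```python
# Illustrative true labels and model predictions (not from a real job).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

# Count each prediction outcome according to the definitions above.
tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if p == 0 and t == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
fn = sum(1 for t, p in zip(y_true, y_pred) if p == 0 and t == 1)

print(f"TP={tp} TN={tn} FP={fp} FN={fn}")  # TP=3 TN=3 FP=1 FN=1
```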
The confusion matrix in the model quality report contains the following.
- The number and percentage of correct and incorrect predictions for the actual labels
- The number and percentage of accurate predictions on the diagonal from the upper-left to the lower-right corner
- The number and percentage of inaccurate predictions on the diagonal from the upper-right to the lower-left corner
The incorrect predictions on a confusion matrix are the confusion values.
The following diagram is an example of a confusion matrix for a multiclass classification problem. It contains the following elements.
- The vertical axis is divided into three rows containing three different actual labels.
- The horizontal axis is divided into three columns containing the labels that were predicted by the model.
- The color bar assigns a darker tone to a larger number of samples, to visually indicate the number of values that were classified in each category.
In the example below, the model correctly predicted 354 actual values for label f, 1094 for label i, and 852 for label m. The difference in tone indicates that the dataset is not balanced, because there are many more labels for the value i than for f or m.
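As a rough illustration of how such a matrix relates to the underlying predictions, the following sketch computes a small multiclass confusion matrix with scikit-learn and row-normalizes it into percentages. The labels and predictions here are made up for illustration; Autopilot computes the report's matrix from your validation data.

```python
from sklearn.metrics import confusion_matrix

# Illustrative actual and predicted labels for three classes.
labels = ["f", "i", "m"]
y_true = ["f", "i", "m", "i", "f", "m", "i", "i"]
y_pred = ["f", "i", "m", "i", "m", "m", "i", "f"]

# Rows are actual labels, columns are predicted labels; the diagonal from the
# upper-left to the lower-right holds the correct predictions.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)

# Row-normalized percentages, similar to the per-cell percentages in the report.
percentages = cm / cm.sum(axis=1, keepdims=True) * 100
print(percentages.round(1))
```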
The confusion matrix provided in the model quality report can accommodate a maximum of 15 labels for multiclass classification problem types. If the row corresponding to a label shows a NaN value, the validation dataset used to check model predictions does not contain data with that label.