Configure inference output in generated containers
Autopilot generates an ordered `ContainerDefinition` list that you can use to build a model for deployment in a machine learning pipeline. You can use this model for online hosting and inference.
You can list the inference container definitions with the ListCandidatesForAutoMLJob API. The list of inference container definitions that represents the best candidate is also available in the DescribeAutoMLJob response.
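For example, a minimal Boto3 sketch (the job name and Region are placeholders) that retrieves the candidates and the best candidate's inference container definitions might look like the following.

```python
import boto3

# Minimal sketch; assumes an AutoML job has already completed.
sm_client = boto3.client('sagemaker', region_name='<Region>')

# List all candidates for the job.
candidates = sm_client.list_candidates_for_auto_ml_job(
    AutoMLJobName='<AutoML Job Name>')['Candidates']

# Or take the best candidate directly from the job description.
best_candidate = sm_client.describe_auto_ml_job(
    AutoMLJobName='<AutoML Job Name>')['BestCandidate']

# The ordered ContainerDefinition list used to build the model.
for container in best_candidate['InferenceContainers']:
    print(container['Image'], container['ModelDataUrl'])
```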
Inference container definitions for regression and classification problem types
Autopilot generates inference containers specific to the training mode and the problem type of the job.
Container definitions for hyperparameter optimization (HPO) mode
- Regression: HPO generates two containers:
  - A feature engineering container that transforms the original features into features that the regression algorithms can train on.
  - An algorithm container that transforms the features and generates a regression score for the dataset.
- Classification: HPO generates three containers:
  - A feature engineering container that transforms the original features into features that the classification algorithms can train on.
  - An algorithm container that generates the `predicted_label` with the highest probability. This container can also produce the various probabilities associated with the classification outcomes in the inference response.
  - A feature engineering container that performs post-processing of the algorithm prediction. For example, it can perform an inverse transform on the predicted label and change it to the original label.
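As an illustrative sketch, reusing `sm_client` from the earlier example, you can confirm this ordering by counting the containers of the best candidate: two for an HPO regression job, three for an HPO classification job.

```python
# Illustrative only: the container count depends on training mode and problem type.
best_candidate = sm_client.describe_auto_ml_job(
    AutoMLJobName='<AutoML Job Name>')['BestCandidate']
containers = best_candidate['InferenceContainers']

print(len(containers))  # 2 for HPO regression, 3 for HPO classification
for index, container in enumerate(containers):
    print(index, container['Image'])
```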
Container definitions for ensembling mode
In ensembling mode, both regression and classification problem types have only one inference container. This inference container transforms the features and generates the predictions based on the problem type.
Inference responses per problem type
Inference responses for classification models
For classification inference containers, you can select the content of the inference response by using four predefined keys:
- `predicted_label`: The label with the highest probability of predicting the correct label, as determined by Autopilot.
- `probability`:
  - HPO models: The probability of the `True` class for binary classification. The probability of the `predicted_label` for multiclass classification.
  - Ensemble models: The probability of the `predicted_label` for binary and multiclass classification.
- `probabilities`: The list of probabilities for all corresponding classes.
- `labels`: The list of all labels.
For example, for a binary classification problem, if you pass the inference response keys `['predicted_label', 'probability', 'probabilities', 'labels']` and the output response appears as `[1, 0.1, "[0.9, 0.1]", "['1', '0']"]`, you should interpret it as follows:
- `predicted_label` equals `1` because label "1" has a higher probability (`0.9` in this case).
- For HPO models, `probability` equals `0.1`, which is the probability of the `positive_class` ("0" in this case) selected by Autopilot. For Ensemble models, `probability` equals `0.9`, which is the probability of the `predicted_label`.
- `probabilities` lists the `probability` of each label in `labels`.
- `labels` are the unique labels in the dataset, where the second label ("0" in this case) is the `positive_class` selected by Autopilot.
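To make the mapping concrete, the following sketch parses a hypothetical CSV response line for the four keys. The raw string below is illustrative, not actual container output.

```python
import ast
import csv
import io

# Hypothetical raw CSV line for the keys
# ['predicted_label', 'probability', 'probabilities', 'labels']
raw = '1,0.1,"[0.9, 0.1]","[\'1\', \'0\']"'

predicted_label, probability, probabilities, labels = next(csv.reader(io.StringIO(raw)))
probabilities = ast.literal_eval(probabilities)  # [0.9, 0.1]
labels = ast.literal_eval(labels)                # ['1', '0']
print(predicted_label, float(probability), probabilities, labels)
```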
By default, inference containers are configured to generate only the `predicted_label`. To select additional inference content, you can update the `inference_response_keys` parameter to include up to these three environment variables:
- `SAGEMAKER_INFERENCE_SUPPORTED`: This is set to provide hints about what content each container supports.
- `SAGEMAKER_INFERENCE_INPUT`: This should be set to the keys that the container expects in its input payload.
- `SAGEMAKER_INFERENCE_OUTPUT`: This should be populated with the set of keys that the container outputs.
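Before choosing output keys, you can check what each container advertises. This sketch reuses `best_candidate` from the earlier Boto3 example.

```python
# Sketch: print the hint each container provides about the keys it supports.
best_candidate_containers = best_candidate['InferenceContainers']
for container in best_candidate_containers:
    supported = container['Environment'].get('SAGEMAKER_INFERENCE_SUPPORTED', '')
    print(container['Image'], '->', supported)
```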
Inference responses for classification models in HPO mode
This section shows how to configure the inference response from classification models using hyperparameter optimization (HPO) mode.
To choose the inference response content in HPO mode, add the `SAGEMAKER_INFERENCE_INPUT` and `SAGEMAKER_INFERENCE_OUTPUT` variables to the second and third containers that are generated in HPO mode for classification problems.

The keys supported by the second (algorithm) container are `predicted_label`, `probability`, and `probabilities`. Note that `labels` is deliberately not added to `SAGEMAKER_INFERENCE_SUPPORTED`.
The keys supported by the third classification model container are `predicted_label`, `labels`, `probability`, and `probabilities`. Therefore, the `SAGEMAKER_INFERENCE_SUPPORTED` environment variable includes the names of these keys.
To update the definition of the inference containers to receive `predicted_label` and `probability`, use the following code example.
```python
containers[1]['Environment'].update({'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})
containers[2]['Environment'].update({'SAGEMAKER_INFERENCE_INPUT': 'predicted_label, probability'})
containers[2]['Environment'].update({'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})
```
The following code example updates the definition of the inference containers to receive `predicted_label`, `probabilities`, and `labels`. Do not pass `labels` to the second container (the algorithm container), because the third container generates it independently.
```python
containers[1]['Environment'].update({'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label,probabilities'})
containers[2]['Environment'].update({'SAGEMAKER_INFERENCE_INPUT': 'predicted_label,probabilities'})
containers[2]['Environment'].update({'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label,probabilities,labels'})
```
The following code examples show how to select the content of the inference responses in HPO mode, first with the AWS SDK for Python (Boto3) and then with the SageMaker SDK for Python.
```python
import boto3

sm_client = boto3.client('sagemaker', region_name='<Region>')

role = '<IAM role>'
input_data = '<S3 input uri>'
output_path = '<S3 output uri>'

best_candidate = sm_client.describe_auto_ml_job(
    AutoMLJobName='<AutoML Job Name>')['BestCandidate']
best_candidate_containers = best_candidate['InferenceContainers']
best_candidate_name = best_candidate['CandidateName']

best_candidate_containers[1]['Environment'].update(
    {'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})
best_candidate_containers[2]['Environment'].update(
    {'SAGEMAKER_INFERENCE_INPUT': 'predicted_label, probability'})
best_candidate_containers[2]['Environment'].update(
    {'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})

# Create the model
response = sm_client.create_model(
    ModelName='<Model Name>',
    ExecutionRoleArn=role,
    Containers=best_candidate_containers
)

# Launch the transform job
response = sm_client.create_transform_job(
    TransformJobName='<Transform Job Name>',
    ModelName='<Model Name>',
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': input_data
            }
        },
        'ContentType': 'text/csv',
        'SplitType': 'Line'
    },
    TransformOutput={
        'S3OutputPath': output_path,
        'AssembleWith': 'Line',
    },
    TransformResources={
        'InstanceType': 'ml.m4.xlarge',
        'InstanceCount': 1,
    },
)
```
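After starting the transform job, you might poll for completion. A minimal sketch:

```python
import time

# Sketch: wait for the batch transform job to finish.
while True:
    description = sm_client.describe_transform_job(
        TransformJobName='<Transform Job Name>')
    status = description['TransformJobStatus']
    if status in ('Completed', 'Failed', 'Stopped'):
        print(status, description.get('FailureReason', ''))
        break
    time.sleep(30)
```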
```python
from sagemaker import AutoML

aml = AutoML.attach(auto_ml_job_name='<AutoML Job Name>')
aml_best_model = aml.create_model(name='<Model Name>',
                                  candidate=None,
                                  inference_response_keys=['probabilities', 'labels'])

aml_transformer = aml_best_model.transformer(accept='text/csv',
                                             assemble_with='Line',
                                             instance_type='ml.m5.xlarge',
                                             instance_count=1)

aml_transformer.transform('<S3 input uri>',
                          content_type='text/csv',
                          split_type='Line',
                          job_name='<Transform Job Name>',
                          wait=True)
```
Inference responses for classification models in ensembling mode
This section shows how to configure the inference response from classification models using ensembling mode.
In ensembling mode, to choose the content of the inference response, update the `SAGEMAKER_INFERENCE_OUTPUT` environment variable.
The keys supported by the classification model container are `predicted_label`, `labels`, `probability`, and `probabilities`. These keys are included in the `SAGEMAKER_INFERENCE_SUPPORTED` environment variable.
To update the inference container definition to receive `predicted_label` and `probability`, refer to the following code example.
```python
containers[0]['Environment'].update({'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})
```
The following code example shows how to select the content of the inference responses in ensembling mode. The example uses the AWS SDK for Python (Boto3).
```python
import boto3

sm_client = boto3.client('sagemaker', region_name='<Region>')

role = '<IAM role>'
input_data = '<S3 input uri>'
output_path = '<S3 output uri>'

best_candidate = sm_client.describe_auto_ml_job(
    AutoMLJobName='<AutoML Job Name>')['BestCandidate']
best_candidate_containers = best_candidate['InferenceContainers']
best_candidate_name = best_candidate['CandidateName']

best_candidate_containers[0]['Environment'].update(
    {'SAGEMAKER_INFERENCE_OUTPUT': 'predicted_label, probability'})

# Create the model
response = sm_client.create_model(
    ModelName='<Model Name>',
    ExecutionRoleArn=role,
    Containers=best_candidate_containers
)

# Launch the transform job
response = sm_client.create_transform_job(
    TransformJobName='<Transform Job Name>',
    ModelName='<Model Name>',
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': input_data
            }
        },
        'ContentType': 'text/csv',
        'SplitType': 'Line'
    },
    TransformOutput={
        'S3OutputPath': output_path,
        'AssembleWith': 'Line',
    },
    TransformResources={
        'InstanceType': 'ml.m4.xlarge',
        'InstanceCount': 1,
    },
)
```
The following SageMaker SDK for Python code example is identical to the one shown for HPO mode. It is included for your convenience.
```python
from sagemaker import AutoML

aml = AutoML.attach(auto_ml_job_name='<AutoML Job Name>')
aml_best_model = aml.create_model(name='<Model Name>',
                                  candidate=None,
                                  inference_response_keys=['probabilities', 'labels'])

aml_transformer = aml_best_model.transformer(accept='text/csv',
                                             assemble_with='Line',
                                             instance_type='ml.m5.xlarge',
                                             instance_count=1)

aml_transformer.transform('<S3 input uri>',
                          content_type='text/csv',
                          split_type='Line',
                          job_name='<Transform Job Name>',
                          wait=True)
```
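When the transform job completes, batch transform writes one object per input file under the S3 output path, with a `.out` suffix appended to the input file name. A minimal sketch for reading the output back (the bucket and key are placeholders):

```python
import boto3

s3 = boto3.client('s3')
# Placeholder bucket and key; batch transform appends '.out' to each input file name.
response = s3.get_object(Bucket='<Bucket>', Key='<S3 output prefix>/<input file name>.out')
print(response['Body'].read().decode('utf-8'))
```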