Deploy Autopilot models for real-time inference
After you train your Amazon SageMaker Autopilot models, you can set up an endpoint and obtain predictions interactively. The following section describes the steps for deploying your model to a SageMaker AI real-time inference endpoint and getting predictions from it.
Real-time inferencing
Real-time inference is ideal for workloads with interactive, low-latency requirements. This section shows how you can use real-time inferencing to obtain predictions interactively from your model.
You can use SageMaker APIs to manually deploy the model that produced the best validation metric in an Autopilot experiment, as described in the following steps. Alternatively, you can choose the automatic deployment option when creating your Autopilot experiment; this creates an endpoint automatically. For information on setting up the automatic deployment of models, see ModelDeployConfig in the request parameters of CreateAutoMLJobV2.
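For reference, the following is a minimal sketch of enabling automatic deployment when creating a job with the AWS CLI. The ModelDeployConfig value shown is one valid setting; all other values are placeholders in the same angle-bracket convention as the examples below, and the exact set of parameters you need depends on your problem type and data configuration.
aws sagemaker create-auto-ml-job-v2 --auto-ml-job-name '<job-name>' \
    --auto-ml-job-input-data-config '<input-data-config>' \
    --output-data-config '<output-data-config>' \
    --auto-ml-problem-type-config '<problem-type-config>' \
    --role-arn '<execution-role-arn>' \
    --model-deploy-config '{"AutoGenerateEndpointName": true}' \
    --region '<region>'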
Note
To avoid incurring unnecessary charges, you can delete unneeded endpoints and resources created from model deployment. For information about pricing of instances by Region, see Amazon SageMaker AI Pricing.
- Obtain the candidate container definitions
Obtain the candidate container definitions from InferenceContainers. A container definition for inference refers to the containerized environment designed for deploying and running your trained SageMaker AI model to make predictions.
The following AWS CLI command example uses the DescribeAutoMLJobV2 API to obtain candidate definitions for the best model candidate.
aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name <job-name> --region <region>
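To pull out just the container definitions for the best candidate, you can filter the response with the --query option. A minimal sketch, where the job name and Region are hypothetical values:
aws sagemaker describe-auto-ml-job-v2 --auto-ml-job-name my-autopilot-job \
    --region us-west-2 --query 'BestCandidate.InferenceContainers'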
- List candidates
The following AWS CLI command example uses the ListCandidatesForAutoMLJob API to list all model candidates.
aws sagemaker list-candidates-for-auto-ml-job --auto-ml-job-name <job-name> --region <region>
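To rank candidates by their final objective metric, the command also accepts sorting parameters. The following sketch again assumes a hypothetical job name and Region:
aws sagemaker list-candidates-for-auto-ml-job --auto-ml-job-name my-autopilot-job \
    --region us-west-2 --sort-by FinalObjectiveMetricValue --sort-order Descending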
- Create a SageMaker AI model
Use the container definitions from the previous steps and a candidate of your choice to create a SageMaker AI model by using the CreateModel API. See the following AWS CLI command as an example.
aws sagemaker create-model --model-name '<your-candidate-name>' \
    --containers '[<container-definition1>, <container-definition2>, <container-definition3>]' \
    --execution-role-arn '<execution-role-arn>' --region '<region>'
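Because the container definitions form a JSON array, it is often easier to save the InferenceContainers output from the first step to a local file and reference it with the file:// prefix. In the following sketch, the model name, file name, account ID, role name, and Region are hypothetical:
aws sagemaker create-model --model-name my-autopilot-model \
    --containers file://containers.json \
    --execution-role-arn arn:aws:iam::111122223333:role/MySageMakerExecutionRole \
    --region us-west-2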
- Create an endpoint configuration
The following AWS CLI command example uses the CreateEndpointConfig API to create an endpoint configuration.
aws sagemaker create-endpoint-config --endpoint-config-name '<your-endpoint-config-name>' \
    --production-variants '<list-of-production-variants>' \
    --region '<region>'
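The production variants parameter is a JSON list that names the model from the previous step and the instance resources to host it on. A minimal sketch with hypothetical names, instance type, and Region:
aws sagemaker create-endpoint-config --endpoint-config-name my-autopilot-endpoint-config \
    --production-variants '[{"VariantName": "AllTraffic", "ModelName": "my-autopilot-model", "InstanceType": "ml.m5.xlarge", "InitialInstanceCount": 1}]' \
    --region us-west-2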
- Create the endpoint
The following AWS CLI example uses the CreateEndpoint API to create the endpoint.
aws sagemaker create-endpoint --endpoint-name '<your-endpoint-name>' \
    --endpoint-config-name '<endpoint-config-name-you-just-created>' \
    --region '<region>'

Check the progress of your endpoint deployment by using the DescribeEndpoint API. See the following AWS CLI command as an example.
aws sagemaker describe-endpoint --endpoint-name '<endpoint-name>' --region '<region>'
After the EndpointStatus changes to InService, the endpoint is ready to use for real-time inference.
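Alternatively, instead of polling DescribeEndpoint repeatedly, you can block until the endpoint is in service by using the AWS CLI waiter. A minimal sketch, using the same placeholder convention:
aws sagemaker wait endpoint-in-service --endpoint-name '<endpoint-name>' --region '<region>'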
- Invoke the endpoint
The following command structure invokes the endpoint for real-time inferencing.
aws sagemaker-runtime invoke-endpoint --endpoint-name '<endpoint-name>' \
    --region '<region>' --body '<your-data>' \
    [--content-type '<content-type>'] <outfile>
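As a concrete illustration, the following hypothetical invocation sends a CSV payload from a local file and writes the prediction to a local output file. The endpoint name, Region, file names, and content type are all assumptions for a typical Autopilot model trained on tabular data:
aws sagemaker-runtime invoke-endpoint --endpoint-name my-autopilot-endpoint \
    --region us-west-2 --content-type text/csv \
    --body fileb://input.csv prediction-output.json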