Select your cookie preferences

We use essential cookies and similar tools that are necessary to provide our site and services. We use performance cookies to collect anonymous statistics, so we can understand how customers use our site and make improvements. Essential cookies cannot be deactivated, but you can choose “Customize” or “Decline” to decline performance cookies.

If you agree, AWS and approved third parties will also use cookies to provide useful site features, remember your preferences, and display relevant content, including relevant advertising. To accept or decline all non-essential cookies, choose “Accept” or “Decline.” To make more detailed choices, choose “Customize.”

Autopilot model deployment and prediction

Focus mode
Autopilot model deployment and prediction - Amazon SageMaker AI

This Amazon SageMaker Autopilot guide includes steps for model deployment, setting up real-time inference, and running inference with batch jobs.

After you train your Autopilot models, you can deploy them to get predictions in one of two ways:

  1. Use Deploy models for real-time inference to set up an endpoint and obtain predictions interactively. Real-time inference is ideal for inference workloads where you have real-time, interactive, low latency requirements.

  2. Use Run batch inference jobs to make predictions in parallel on batches of observations on an entire dataset. Batch inference is a good option for large datasets or if you don't need an immediate response to a model prediction request.

Note

To avoid incurring unnecessary charges: After the endpoints and resources that were created from model deployment are no longer needed, you can delete them. For information about pricing of instances by Region, see Amazon SageMaker Pricing.

PrivacySite termsCookie preferences
© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.