Deploy a Model

To deploy an Amazon SageMaker Neo-compiled model to an HTTPS endpoint, you must configure and create the endpoint for the model using Amazon SageMaker AI hosting services. Currently, developers can use Amazon SageMaker APIs to deploy models onto ml.c5, ml.c4, ml.m5, ml.m4, ml.p3, ml.p2, and ml.inf1 instances.

For Inferentia and Trainium instances, models need to be compiled specifically for those instances. Models compiled for other instance types are not guaranteed to work with Inferentia or Trainium instances.
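The Inferentia-specific compilation requirement is expressed through the compilation job's target device. The following is a minimal sketch of a `CreateCompilationJob` request shape that targets ml.inf1; all names, ARNs, and S3 URIs are placeholders, and the actual call (shown in a comment) requires AWS credentials:

```python
# Sketch: a Neo compilation job request targeting ml.inf1 (Inferentia).
# All names, the role ARN, and the S3 paths are placeholders.
# The real call would be:
#   boto3.client("sagemaker").create_compilation_job(**compilation_job)
compilation_job = {
    "CompilationJobName": "my-model-inf1",                  # placeholder name
    "RoleArn": "arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder role
    "InputConfig": {
        "S3Uri": "s3://my-bucket/model.tar.gz",             # placeholder model artifact
        "DataInputConfig": '{"input0": [1, 3, 224, 224]}',  # framework-specific input shape
        "Framework": "PYTORCH",
    },
    "OutputConfig": {
        "S3OutputLocation": "s3://my-bucket/compiled/",
        # The target device must match the instance family you deploy to:
        # an ml.inf1 endpoint requires a model compiled for ml_inf1.
        "TargetDevice": "ml_inf1",
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 900},
}
```

A model compiled with a different `TargetDevice` (for example, `ml_c5`) would not be guaranteed to load on an ml.inf1 endpoint.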

When you deploy a compiled model, the endpoint must use the same instance type as the target you specified for compilation. Deploying the model creates a SageMaker AI endpoint that you can use to perform inference. You can deploy a Neo-compiled model using any of the following: the SageMaker AI SDK for Python, the AWS SDK for Python (Boto3), the AWS Command Line Interface, or the SageMaker AI console.
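The hosting workflow is the usual three-step sequence: create a model, create an endpoint configuration, then create the endpoint. Below is a minimal sketch of the Boto3-style request shapes; every name, ARN, and URI is a placeholder (the inference image URI would come from the Neo Inference Container Images list for your framework and Region), and the actual API calls appear only as comments:

```python
# Sketch: deploying a Neo-compiled model via the low-level request shapes
# (CreateModel -> CreateEndpointConfig -> CreateEndpoint).
# All names, the role ARN, the image URI, and the S3 URL are placeholders.

create_model_request = {
    "ModelName": "my-neo-model",
    "PrimaryContainer": {
        # Placeholder: select the real URI from Neo Inference Container Images.
        "Image": "<neo-inference-image-uri>",
        "ModelDataUrl": "s3://my-bucket/compiled/model-ml_c5.tar.gz",
    },
    "ExecutionRoleArn": "arn:aws:iam::111122223333:role/SageMakerRole",
}

# The endpoint instance type must match the compilation target (ml.c5 here).
create_endpoint_config_request = {
    "EndpointConfigName": "my-neo-endpoint-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-neo-model",
            "InstanceType": "ml.c5.xlarge",
            "InitialInstanceCount": 1,
        }
    ],
}

create_endpoint_request = {
    "EndpointName": "my-neo-endpoint",
    "EndpointConfigName": "my-neo-endpoint-config",
}

# With Boto3, the deployment itself would be:
#   sm = boto3.client("sagemaker")
#   sm.create_model(**create_model_request)
#   sm.create_endpoint_config(**create_endpoint_config_request)
#   sm.create_endpoint(**create_endpoint_request)
```

The key constraint to notice is that `InstanceType` in the production variant belongs to the same instance family (ml.c5) as the compiled artifact referenced by `ModelDataUrl`.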

Note

If you deploy your model using the AWS CLI, the console, or Boto3, see Neo Inference Container Images to select the inference image URI for your primary container.