Deploy a Model
To deploy an Amazon SageMaker Neo-compiled model to an HTTPS endpoint, you must configure and create the endpoint for the model using Amazon SageMaker hosting services. Currently, developers can use Amazon SageMaker APIs to deploy models onto ml.c5, ml.c4, ml.m5, ml.m4, ml.p3, ml.p2, and ml.inf1 instances.
When you deploy a compiled model, you must use the same instance type as the target that you used for compilation; this applies to Inferentia (ml.inf1) instances as well. Deployment creates a SageMaker endpoint that you can use to perform inferences. You can deploy a Neo-compiled model using any of the following: the Amazon SageMaker SDK for Python, the SDK for Python (Boto3), the AWS CLI, or the Amazon SageMaker console.
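As a rough illustration of the compile-target and hosting-instance matching requirement above, here is a small sketch. The helper names and the family-extraction logic are illustrative, not part of any AWS SDK; the only assumed fact is that Neo compilation targets use the underscore form of the instance family (for example, TargetDevice "ml_c5" corresponds to ml.c5 hosting instances).

```python
def instance_family(instance_type: str) -> str:
    # "ml.c5.xlarge" -> "ml_c5"; Neo compilation jobs name targets in the
    # underscore form (e.g. TargetDevice="ml_c5"). Illustrative helper only.
    prefix, family = instance_type.split(".")[:2]
    return f"{prefix}_{family}"

def check_deploy_matches_target(compile_target: str, instance_type: str) -> None:
    # Fail early, before creating the endpoint, if the hosting instance does
    # not belong to the family the model was compiled for.
    if instance_family(instance_type) != compile_target:
        raise ValueError(
            f"Model compiled for {compile_target!r} cannot be hosted on "
            f"{instance_type!r}; choose an instance from the matching family."
        )

check_deploy_matches_target("ml_c5", "ml.c5.xlarge")  # matching family: no error
```

A check like this is useful in deployment scripts because a family mismatch otherwise surfaces only at endpoint creation or first inference.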
Note
To deploy a model using the AWS CLI, the console, or Boto3, see Neo Inference Container Images to select the inference image URI for your primary container.
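For orientation, the note above can be sketched as the three Boto3 request payloads involved (create_model, create_endpoint_config, create_endpoint). The names, role ARN, S3 path, and the image URI placeholder below are all illustrative; you must substitute the actual Neo inference image URI for your framework and Region from the Neo Inference Container Images page.

```python
# Illustrative request payloads; all names, ARNs, and paths are placeholders.
create_model_request = {
    "ModelName": "my-neo-model",
    "ExecutionRoleArn": "arn:aws:iam::111122223333:role/SageMakerRole",
    "PrimaryContainer": {
        # Look up the real URI in Neo Inference Container Images.
        "Image": "<neo-inference-image-uri>",
        # The compiled model artifact produced by the Neo compilation job.
        "ModelDataUrl": "s3://my-bucket/output/model-ml_c5.tar.gz",
    },
}

create_endpoint_config_request = {
    "EndpointConfigName": "my-neo-endpoint-config",
    "ProductionVariants": [
        {
            "VariantName": "AllTraffic",
            "ModelName": create_model_request["ModelName"],
            # Must belong to the same family as the compilation target.
            "InstanceType": "ml.c5.xlarge",
            "InitialInstanceCount": 1,
        }
    ],
}

create_endpoint_request = {
    "EndpointName": "my-neo-endpoint",
    "EndpointConfigName": create_endpoint_config_request["EndpointConfigName"],
}

# With Boto3, each payload is passed as keyword arguments, e.g.:
#   sm = boto3.client("sagemaker")
#   sm.create_model(**create_model_request)
#   sm.create_endpoint_config(**create_endpoint_config_request)
#   sm.create_endpoint(**create_endpoint_request)
```

Keeping the payloads as plain dictionaries like this makes it easy to validate them (for example, the instance-family check) before any API call is made.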