How to create an Asynchronous Inference Endpoint - Amazon SageMaker AI

Create an asynchronous endpoint the same way you would create an endpoint using SageMaker AI hosting services:

  • Create a model in SageMaker AI with CreateModel.

  • Create an endpoint configuration with CreateEndpointConfig.

  • Create an HTTPS endpoint with CreateEndpoint.
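The three steps above can be sketched with boto3, the AWS SDK for Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper names (build_requests, create_async_endpoint), the reuse of one name for all three resources, the variant name, and the default instance type are illustrative choices. The sketch also assumes that the endpoint configuration carries an AsyncInferenceConfig block naming the S3 location for results, which is what marks the endpoint as asynchronous.

```python
def build_requests(name, image_uri, model_data_url, role_arn, output_s3_uri,
                   instance_type="ml.m5.xlarge"):
    """Build the payloads for CreateModel, CreateEndpointConfig, CreateEndpoint.

    All argument values are supplied by the caller; nothing here is a default
    prescribed by SageMaker AI (the instance type is only an example).
    """
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        # The container image and the model artifact location
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
    }
    endpoint_config = {
        "EndpointConfigName": name,
        "ProductionVariants": [{
            "VariantName": "AllTraffic",       # illustrative variant name
            "ModelName": name,                 # must match the model created above
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
        # Asynchronous endpoints write their results to this S3 location
        "AsyncInferenceConfig": {"OutputConfig": {"S3OutputPath": output_s3_uri}},
    }
    endpoint = {"EndpointName": name, "EndpointConfigName": name}
    return model, endpoint_config, endpoint


def create_async_endpoint(sm_client, **settings):
    """Call the three APIs in order on a boto3 'sagemaker' client."""
    model, endpoint_config, endpoint = build_requests(**settings)
    sm_client.create_model(**model)
    sm_client.create_endpoint_config(**endpoint_config)
    return sm_client.create_endpoint(**endpoint)


# Usage (needs AWS credentials and real resources; all values are placeholders):
#   import boto3
#   create_async_endpoint(
#       boto3.client("sagemaker"),
#       name="my-async-endpoint",
#       image_uri="<account>.dkr.ecr.<region>.amazonaws.com/<repo>:<tag>",
#       model_data_url="s3://<bucket>/model.tar.gz",
#       role_arn="arn:aws:iam::<account>:role/<role>",
#       output_s3_uri="s3://<bucket>/async-output/",
#   )
```

Separating payload construction from the API calls keeps the request shapes easy to inspect before anything is provisioned.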

To create an endpoint, first create a model with CreateModel, pointing to the model artifact and a Docker registry path (Image). Then create an endpoint configuration with CreateEndpointConfig, specifying one or more models created with the CreateModel API and the resources that you want SageMaker AI to provision. Create your endpoint with CreateEndpoint, passing the name of that endpoint configuration in the request. You can update an asynchronous endpoint with the UpdateEndpoint API. Send inference requests to, and receive results from, the model hosted at the endpoint with InvokeEndpointAsync. You can delete your endpoints with the DeleteEndpoint API.
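Once an endpoint exists, the remaining lifecycle calls could look like the following boto3 sketch. The helper names are hypothetical. The sketch assumes the request payload has already been uploaded to Amazon S3, because InvokeEndpointAsync takes an S3 input location rather than an inline body and returns the S3 location where the result will be written.

```python
def submit_async_request(runtime_client, endpoint_name, input_s3_uri,
                         content_type="application/json"):
    """Queue one inference request on a boto3 'sagemaker-runtime' client.

    Returns the S3 URI where the result will appear once processing finishes.
    """
    response = runtime_client.invoke_endpoint_async(
        EndpointName=endpoint_name,
        InputLocation=input_s3_uri,   # payload must already be in S3
        ContentType=content_type,
    )
    # The call returns without waiting for the inference to complete
    return response["OutputLocation"]


def switch_endpoint_config(sm_client, endpoint_name, new_config_name):
    """Point an existing endpoint at a different endpoint configuration."""
    return sm_client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_config_name,
    )


def delete_async_endpoint(sm_client, endpoint_name):
    """Delete the endpoint on a boto3 'sagemaker' client."""
    return sm_client.delete_endpoint(EndpointName=endpoint_name)


# Usage (needs AWS credentials; names and URIs are placeholders):
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   output_uri = submit_async_request(
#       runtime, "my-async-endpoint", "s3://<bucket>/input/request.json")
#   delete_async_endpoint(boto3.client("sagemaker"), "my-async-endpoint")
```

Note that invocation uses the runtime client, while update and delete go through the same client used to create the endpoint.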

For a full list of the available SageMaker AI Images, see Available Deep Learning Containers Images. See Containers with custom inference code for information on how to create your Docker image.