Create a multi-container endpoint (Boto 3)

Create a multi-container endpoint by calling the CreateModel, CreateEndpointConfig, and CreateEndpoint APIs, as you would to create any other endpoint. You can run these containers sequentially as an inference pipeline, or run each individual container by using direct invocation. Multi-container endpoints have the following requirements when you call create_model:

  • Use the Containers parameter instead of PrimaryContainer, and include more than one container in the Containers parameter.

  • The ContainerHostname parameter is required for each container in a multi-container endpoint with direct invocation.

  • Set the Mode parameter of the InferenceExecutionConfig field to Direct for direct invocation of each container, or Serial to use the containers as an inference pipeline. The default mode is Serial. Both settings are shown in the sketch after this list.
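
The following is a minimal sketch of the two Mode settings. The dictionary is passed to create_model as the InferenceExecutionConfig argument, as in the example later on this page.

    # Direct invocation: each container can be invoked individually.
    inferenceExecutionConfig = {'Mode': 'Direct'}

    # Or, a serial inference pipeline (the default):
    # inferenceExecutionConfig = {'Mode': 'Serial'}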

Note

Currently, a multi-container endpoint supports up to 15 containers.

The following example creates a multi-container model for direct invocation.

  1. Create the container elements and an InferenceExecutionConfig that specifies direct invocation.

    container1 = {
        'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage1:mytag',
        'ContainerHostname': 'firstContainer'
    }

    container2 = {
        'Image': '123456789012.dkr.ecr.us-east-1.amazonaws.com/myimage2:mytag',
        'ContainerHostname': 'secondContainer'
    }

    inferenceExecutionConfig = {'Mode': 'Direct'}
  2. Create the model with the container elements and set the InferenceExecutionConfig field.

    import boto3

    sm_client = boto3.Session().client('sagemaker')

    # role is the ARN of the IAM execution role that SageMaker AI assumes
    # to pull the container images and access model artifacts.
    response = sm_client.create_model(
        ModelName = 'my-direct-mode-model-name',
        InferenceExecutionConfig = inferenceExecutionConfig,
        ExecutionRoleArn = role,
        Containers = [container1, container2]
    )

To create an endpoint, you would then call create_endpoint_config and create_endpoint as you would for any other endpoint.
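
The following is a minimal sketch of those two calls, assuming the model created in the previous example; the endpoint configuration name, endpoint name, variant name, instance type, and instance count are illustrative values, and sm_client is the SageMaker AI client created in step 2.

    # Point a production variant at the multi-container model.
    sm_client.create_endpoint_config(
        EndpointConfigName = 'my-direct-mode-endpoint-config',
        ProductionVariants = [
            {
                'VariantName': 'AllTraffic',
                'ModelName': 'my-direct-mode-model-name',
                'InstanceType': 'ml.m5.xlarge',
                'InitialInstanceCount': 1
            }
        ]
    )

    # Create the endpoint from the endpoint configuration.
    sm_client.create_endpoint(
        EndpointName = 'my-direct-mode-endpoint',
        EndpointConfigName = 'my-direct-mode-endpoint-config'
    )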
