Create an inference recommendation programmatically using the AWS SDK for Python (Boto3) or
the AWS CLI, or interactively using Studio Classic or the SageMaker AI console. Specify a job
name for your inference recommendation, an AWS IAM role ARN, an input
configuration, and either a model package ARN (if you registered your model
with the model registry) or your model name and a ContainerConfig
dictionary (from when you created your model in the
Prerequisites section).
Use the CreateInferenceRecommendationsJob API to start an inference recommendation job. Set the JobType field to 'Default' for inference recommendation jobs. In addition, provide the following:
- The Amazon Resource Name (ARN) of an IAM role that enables Inference Recommender to perform tasks on your behalf. Define this for the RoleArn field.
- A model package ARN or model name. Inference Recommender supports either one model package ARN or a model name as input. Specify one of the following:
  - The ARN of the versioned model package you created when you registered your model with SageMaker AI model registry. Define this for ModelPackageVersionArn in the InputConfig field.
  - The name of the model you created. Define this for ModelName in the InputConfig field. Also, provide the ContainerConfig dictionary, which includes the required fields that need to be provided with the model name. Define this for ContainerConfig in the InputConfig field. In the ContainerConfig, you can also optionally specify the SupportedEndpointType field as either RealTime or Serverless. If you specify this field, Inference Recommender returns recommendations for only that endpoint type. If you don't specify this field, Inference Recommender returns recommendations for both endpoint types.
- A name for your Inference Recommender recommendation job for the JobName field. The Inference Recommender job name must be unique within the AWS Region and within your AWS account.
Import the AWS SDK for Python (Boto3) package and create a SageMaker AI client object using the client class. If you followed the steps in the Prerequisites section, specify only one of the following:
- Option 1: If you would like to create an inference recommendations job with a model package ARN, then store the ARN of the versioned model package in a variable named model_package_arn.
- Option 2: If you would like to create an inference recommendations job with a model name and ContainerConfig, store the model name in a variable named model_name and the ContainerConfig dictionary in a variable named container_config.
# Create a low-level SageMaker service client.
import boto3

aws_region = '<INSERT>'
sagemaker_client = boto3.client('sagemaker', region_name=aws_region)

# Provide only one of model package ARN or model name, not both.
# Provide your model package ARN that was created when you registered your
# model with Model Registry
model_package_arn = '<INSERT>'

## Uncomment if you would like to create an inference recommendations job with a
## model name instead of a model package ARN, and comment out model_package_arn above
## Provide your model name
# model_name = '<INSERT>'
## Provide your container config
# container_config = '<INSERT>'

# Provide a unique job name for SageMaker Inference Recommender job
job_name = '<INSERT>'

# Inference Recommender job type. Set to Default to get an initial recommendation
job_type = 'Default'

# Provide an IAM Role that gives SageMaker Inference Recommender permission to
# access AWS services
role_arn = 'arn:aws:iam::<account>:role/*'

sagemaker_client.create_inference_recommendations_job(
    JobName=job_name,
    JobType=job_type,
    RoleArn=role_arn,
    # Provide only one of model package ARN or model name, not both.
    # If you would like to create an inference recommendations job with a model name,
    # uncomment ModelName and ContainerConfig, and comment out ModelPackageVersionArn.
    InputConfig={
        'ModelPackageVersionArn': model_package_arn,
        # 'ModelName': model_name,
        # 'ContainerConfig': container_config
    }
)
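Because the job name must be unique within the Region and account, rerunning the sample with the same name fails. One way to handle this is sketched below; it is an illustration only. `unique_job_name` and `start_default_job` are hypothetical helper names, the random suffix is just one possible naming scheme, and the code assumes the response contains a JobArn key, which you should confirm in the API reference. The client is passed in as an argument so the helpers do not depend on how it was created.

```python
import uuid
from typing import Any, Dict


def unique_job_name(prefix: str) -> str:
    """Append a short random suffix so repeated runs do not reuse a job name.

    Hypothetical naming scheme; keep the result within the length and
    character limits that the API reference gives for JobName.
    """
    return f"{prefix}-{uuid.uuid4().hex[:8]}"


def start_default_job(
    client: Any,
    job_name: str,
    role_arn: str,
    input_config: Dict[str, Any],
) -> str:
    """Start a 'Default' inference recommendations job and return its ARN.

    `client` is expected to be a SageMaker client such as the
    sagemaker_client object created in the sample above.
    """
    response = client.create_inference_recommendations_job(
        JobName=job_name,
        JobType="Default",
        RoleArn=role_arn,
        InputConfig=input_config,
    )
    # Assumption: the response carries the job ARN under the 'JobArn' key.
    return response["JobArn"]
```

With the variables from the sample above, a call might look like `start_default_job(sagemaker_client, unique_job_name('my-recommendation'), role_arn, {'ModelPackageVersionArn': model_package_arn})`, where 'my-recommendation' is a placeholder prefix.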
See the Amazon SageMaker API Reference Guide for a full list of optional and required arguments you can pass to CreateInferenceRecommendationsJob.