Complete the prerequisites
This topic describes the prerequisites that you must complete before creating a serverless endpoint. These prerequisites include storing your model artifacts in Amazon S3, configuring an AWS IAM role with the correct permissions, and selecting a container image.
To complete the prerequisites
-
Set up an AWS account. You first need an AWS account and an AWS Identity and Access Management (IAM) administrator user. For instructions on how to set up an AWS account, see How do I create and activate a new AWS account? For instructions on how to secure your account with an IAM administrator user, see Creating your first IAM admin user and user group in the IAM User Guide.
-
Create an Amazon S3 bucket. You use an Amazon S3 bucket to store your model artifacts. To learn how to create a bucket, see Create your first S3 bucket in the Amazon S3 User Guide.
-
Upload your model artifacts to your S3 bucket. For instructions on how to upload your model to your bucket, see Upload an object to your bucket in the Amazon S3 User Guide.
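The upload step can also be scripted. The following is a minimal sketch, assuming the bucket already exists and that `s3_client` behaves like `boto3.client("s3")`. The helper names `model_s3_uri` and `upload_model_artifact`, the file name `model.tar.gz`, and the bucket name in the usage comment are illustrative placeholders, not part of any AWS SDK.

```python
from pathlib import Path


def model_s3_uri(bucket: str, key: str) -> str:
    """Return the s3:// URI for an object key in a bucket."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def upload_model_artifact(s3_client, local_path: str, bucket: str, prefix: str = "") -> str:
    """Upload one artifact file and return its S3 URI.

    `s3_client` is assumed to expose upload_file(Filename, Bucket, Key),
    as the boto3 S3 client does.
    """
    filename = Path(local_path).name
    clean_prefix = prefix.strip("/")
    # Place the file under the optional folder prefix, keeping its file name.
    key = f"{clean_prefix}/{filename}" if clean_prefix else filename
    s3_client.upload_file(local_path, bucket, key)
    return model_s3_uri(bucket, key)


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   uri = upload_model_artifact(
#       boto3.client("s3"), "model.tar.gz", "amzn-s3-demo-bucket", "models/demo"
#   )
```

Passing the client in as an argument keeps the helper easy to test without AWS credentials. The returned URI is the value you would later supply as the model data location.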
-
Create an IAM role for Amazon SageMaker AI. Amazon SageMaker AI needs access to the S3 bucket that stores your model. Create an IAM role with a policy that gives SageMaker AI read access to your bucket. The following procedure shows how to create a role in the console, but you can also use the CreateRole API from the IAM User Guide. For information on giving your role more granular permissions based on your use case, see How to use SageMaker AI execution roles.
-
Sign in to the IAM console.
-
In the navigation tab, choose Roles.
-
Choose Create role.
-
For Select type of trusted entity, choose AWS service and then choose SageMaker AI.
-
Choose Next: Permissions and then choose Next: Tags.
-
(Optional) Add tags as key-value pairs if you want to have metadata for the role.
-
Choose Next: Review.
-
For Role name, enter a name for the new role that is unique within your AWS account. You cannot edit the role name after creating the role.
-
(Optional) For Role description, enter a description for the new role.
-
Choose Create role.
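As an alternative to the console steps above, the role can be created through the CreateRole API. The following is a minimal sketch, assuming `iam_client` behaves like `boto3.client("iam")`. The trust policy lets the SageMaker AI service principal (`sagemaker.amazonaws.com`) assume the role. The helper names and the role name in the usage comment are illustrative.

```python
import json

SAGEMAKER_PRINCIPAL = "sagemaker.amazonaws.com"


def sagemaker_trust_policy() -> dict:
    """Build a trust policy that allows SageMaker AI to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": SAGEMAKER_PRINCIPAL},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_sagemaker_role(iam_client, role_name: str, description: str = "") -> str:
    """Create the role and return its ARN.

    `iam_client` is assumed to expose create_role(...) like the boto3 IAM client.
    """
    kwargs = {
        "RoleName": role_name,
        # The API expects the trust policy as a JSON string.
        "AssumeRolePolicyDocument": json.dumps(sagemaker_trust_policy()),
    }
    if description:
        kwargs["Description"] = description
    response = iam_client.create_role(**kwargs)
    return response["Role"]["Arn"]


# Example usage (requires boto3 and AWS credentials; the role name is a placeholder):
#   import boto3
#   arn = create_sagemaker_role(boto3.client("iam"), "MyServerlessInferenceRole")
```

This sketch only creates the role and its trust relationship. The role still needs the S3 permissions described in the next step.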
-
Attach S3 bucket permissions to your SageMaker AI role. After creating an IAM role, attach a policy that gives SageMaker AI permission to access the S3 bucket containing your model artifacts.
-
In the IAM console navigation tab, choose Roles.
-
From the list of roles, search for the role you created in the previous step by name.
-
Choose your role, and then choose Attach policies.
-
For Attach permissions, choose Create policy.
-
In the Create policy view, select the JSON tab.
-
Add the following policy statement into the JSON editor. Make sure to replace <your-bucket-name> with the name of the S3 bucket that stores your model artifacts. If you want to restrict the access to a specific folder or file in your bucket, you can also specify the Amazon S3 folder path, for example, <your-bucket-name>/<model-folder>.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::<your-bucket-name>/*"
        }
    ]
}
-
Choose Next: Tags.
-
(Optional) Add tags in key-value pairs to the policy.
-
Choose Next: Review.
-
For Name, enter a name for the new policy.
-
(Optional) Add a Description for the policy.
-
Choose Create policy.
-
After creating the policy, return to Roles in the IAM console and select your SageMaker AI role.
-
Choose Attach policies.
-
For Attach permissions, search for the policy you created by name. Select it and choose Attach policy.
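The policy creation and attachment steps above can be sketched programmatically as follows, assuming `iam_client` behaves like `boto3.client("iam")`. The policy document mirrors the JSON statement shown in the procedure, with an optional folder prefix. The helper names `s3_read_policy` and `create_and_attach_policy`, and the names in the usage comment, are illustrative.

```python
import json


def s3_read_policy(bucket: str, prefix: str = "") -> dict:
    """Build a policy that allows s3:GetObject on a bucket or one folder in it."""
    clean_prefix = prefix.strip("/")
    if clean_prefix:
        resource = f"arn:aws:s3:::{bucket}/{clean_prefix}/*"
    else:
        resource = f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": resource,
            }
        ],
    }


def create_and_attach_policy(iam_client, role_name: str, policy_name: str,
                             bucket: str, prefix: str = "") -> str:
    """Create a managed policy, attach it to the role, and return the policy ARN.

    `iam_client` is assumed to expose create_policy(...) and
    attach_role_policy(...) like the boto3 IAM client.
    """
    response = iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(s3_read_policy(bucket, prefix)),
    )
    policy_arn = response["Policy"]["Arn"]
    iam_client.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    return policy_arn


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   create_and_attach_policy(
#       boto3.client("iam"), "MyServerlessInferenceRole",
#       "ModelArtifactReadAccess", "amzn-s3-demo-bucket", "models/demo",
#   )
```

Scoping the resource to a folder prefix grants access only to the objects under that path, which matches the restriction option described in the procedure.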
-
Select a prebuilt Docker container image or bring your own. The container you choose serves inference on your endpoint. SageMaker AI provides containers for built-in algorithms and prebuilt Docker images for some of the most common machine learning frameworks, such as Apache MXNet, TensorFlow, PyTorch, and Chainer. For a full list of the available SageMaker AI images, see Available Deep Learning Containers Images. If none of the existing SageMaker AI containers meet your needs, you may need to create your own Docker container. For information about how to create your Docker image and make it compatible with SageMaker AI, see Containers with custom inference code. To use your own container with a serverless endpoint, the container image must reside in an Amazon ECR repository within the same AWS account that creates the endpoint.
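If you bring your own container, a quick check of the same-account requirement can be sketched as follows. This is a minimal sketch, assuming the image URI follows the private Amazon ECR form `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>`. The helper names `ecr_account_id` and `image_in_account` are illustrative.

```python
import re

# Private ECR image URIs start with the 12-digit ID of the owning account.
_ECR_URI = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)"
    r"\.amazonaws\.com(?:\.cn)?/(?P<repo>[^:@]+)"
)


def ecr_account_id(image_uri: str) -> str:
    """Extract the owning account ID from a private ECR image URI."""
    match = _ECR_URI.match(image_uri)
    if match is None:
        raise ValueError(f"Not a private Amazon ECR image URI: {image_uri!r}")
    return match.group("account")


def image_in_account(image_uri: str, account_id: str) -> bool:
    """Return True if the image lives in an ECR repository of `account_id`."""
    return ecr_account_id(image_uri) == account_id


# Example usage (requires boto3 and AWS credentials to look up the caller's account):
#   import boto3
#   account = boto3.client("sts").get_caller_identity()["Account"]
#   image_in_account("<your-image-uri>", account)
```

The check only inspects the URI string. It does not confirm that the repository or image tag actually exists.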
-
(Optional) Register your model with Model Registry. SageMaker Model Registry helps you catalog and manage versions of your models for use in ML pipelines. For more information about registering a version of your model, see Create a Model Group and Register a Model Version. For an example of a Model Registry and Serverless Inference workflow, see the example notebook.
-
(Optional) Bring an AWS KMS key. When setting up a serverless endpoint, you have the option to specify a KMS key that SageMaker AI uses to encrypt your Amazon ECR image. Note that the key policy for the KMS key must grant access to the IAM role you specify when setting up your endpoint. To learn more about KMS keys, see the AWS Key Management Service Developer Guide.