This section describes restrictions and quotas on your machine learning (ML) products in AWS Marketplace.
Topics
- Network isolation
- Image size
- Storage size
- Instance size
- Payload size for inference
- Processing time for inference
- Service quotas
- Asynchronous inference
- Serverless inference
- Managed spot training
- Docker images and AWS accounts
- Publishing model packages from built-in algorithms or AWS Marketplace
- Supported AWS Regions for publishing
Network isolation
For security purposes, when a buyer subscribes to your containerized product, the Docker containers are run in an isolated environment without network access. When you create your containers, don't rely on making outgoing calls over the internet because they will fail. Calls to AWS services will also fail.
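You can reproduce this isolated environment in your own account to catch such failures before you list. The following is a minimal sketch using boto3; the model name, image URI, and role ARN are placeholders for your own values.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Registering the model with network isolation enabled reproduces the
# environment buyers get: the container runs with no outbound internet
# access and no access to AWS services.
sagemaker.create_model(
    ModelName="my-marketplace-model-test",  # placeholder name
    ExecutionRoleArn="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder
    PrimaryContainer={
        "Image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",  # placeholder
    },
    EnableNetworkIsolation=True,
)
```

If your container passes your tests with network isolation enabled, it won't be surprised by the restriction after publication.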
Image size
Your Docker image size is governed by the Amazon Elastic Container Registry (Amazon ECR) service quotas. The Docker image size affects the startup time during training jobs, batch-transform jobs, and endpoint creation. For better performance, keep your Docker image as small as practical.
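As a rough pre-flight check, you can read the compressed size of an image you have already pushed to Amazon ECR. A minimal sketch, assuming a hypothetical repository named my-inference-image:

```python
import boto3

ecr = boto3.client("ecr")

# describe_images reports the compressed image size in bytes.
response = ecr.describe_images(
    repositoryName="my-inference-image",  # placeholder repository
    imageIds=[{"imageTag": "latest"}],
)
size_gib = response["imageDetails"][0]["imageSizeInBytes"] / (1024 ** 3)
print(f"Compressed image size: {size_gib:.2f} GiB")
```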
Storage size
When you create an endpoint, Amazon SageMaker AI attaches an Amazon Elastic Block Store (Amazon EBS) storage volume to each ML compute instance that hosts the endpoint. (An endpoint is also known as real-time inference or Amazon SageMaker AI hosting service.) The size of the storage volume depends on the instance type. For more information, see Host Instance Storage Volumes in the Amazon SageMaker AI Developer Guide.
For batch transform, see Storage in Batch Transform in the Amazon SageMaker AI Developer Guide.
Instance size
SageMaker AI provides a selection of instance types that are optimized to fit different ML use cases. Instance types comprise varying combinations of CPU, GPU, memory, and networking capacity, and give you the flexibility to choose the appropriate mix of resources for building, training, and deploying your ML models. For more information, see Amazon SageMaker AI ML Instance Types.
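You specify the instance type when you create resources such as an endpoint configuration. The sketch below is illustrative only; the endpoint configuration name, model name, and instance type are assumptions, not recommendations.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# The instance type in each production variant determines the CPU, GPU,
# memory, and networking capacity available to the hosted model, and it
# determines the size of the attached EBS storage volume.
sagemaker.create_endpoint_config(
    EndpointConfigName="my-endpoint-config",  # placeholder
    ProductionVariants=[
        {
            "VariantName": "AllTraffic",
            "ModelName": "my-marketplace-model-test",  # placeholder
            "InstanceType": "ml.m5.xlarge",  # example instance type
            "InitialInstanceCount": 1,
        }
    ],
)
```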
Payload size for inference
For an endpoint, the maximum size of the input data per invocation is 6 MB. This value can't be adjusted.
For batch transform, the maximum size of the input data per invocation is 100 MB. This value can't be adjusted.
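Because the 6 MB real-time limit can't be raised, a client-side guard can route oversized inputs to batch transform instead. A minimal sketch, assuming a hypothetical endpoint named my-endpoint:

```python
import boto3

runtime = boto3.client("sagemaker-runtime")

MAX_ENDPOINT_PAYLOAD = 6 * 1024 * 1024  # 6 MB real-time invocation limit

def invoke(payload: bytes):
    # Oversized payloads are rejected by the service, so check first and
    # fall back to batch transform (up to 100 MB per invocation) if needed.
    if len(payload) > MAX_ENDPOINT_PAYLOAD:
        raise ValueError("Payload exceeds 6 MB; use a batch transform job.")
    return runtime.invoke_endpoint(
        EndpointName="my-endpoint",  # placeholder endpoint name
        ContentType="application/json",
        Body=payload,
    )
```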
Processing time for inference
For an endpoint, the maximum processing time per invocation is 60 seconds. This value can't be adjusted.
For batch transform, the maximum processing time per invocation is 60 minutes. This value can't be adjusted.
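For batch transform, you can configure how long SageMaker AI waits on each invocation, up to that 60-minute ceiling. A sketch with hypothetical job, model, and bucket names:

```python
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_transform_job(
    TransformJobName="my-transform-job",  # placeholder
    ModelName="my-marketplace-model-test",  # placeholder
    # InvocationsTimeoutInSeconds can be set up to 3600, the 60-minute cap.
    ModelClientConfig={"InvocationsTimeoutInSeconds": 3600},
    TransformInput={
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": "s3://amzn-s3-demo-bucket/input/",  # placeholder
            }
        }
    },
    TransformOutput={"S3OutputPath": "s3://amzn-s3-demo-bucket/output/"},  # placeholder
    TransformResources={"InstanceType": "ml.m5.xlarge", "InstanceCount": 1},
)
```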
Service quotas
For more information about quotas related to training and inference, see Amazon SageMaker AI Service Quotas.
Asynchronous inference
Model packages and algorithms published in AWS Marketplace can't be deployed to endpoints configured for Amazon SageMaker AI Asynchronous Inference. Endpoints configured for asynchronous inference require models to have network connectivity, but all AWS Marketplace models operate in network isolation. For more information, see Network isolation.
Serverless inference
Model packages and algorithms published in AWS Marketplace can't be deployed to endpoints configured for Amazon SageMaker AI Serverless Inference. Endpoints configured for serverless inference require models to have network connectivity, but all AWS Marketplace models operate in network isolation. For more information, see Network isolation.
Managed spot training
For all algorithms from AWS Marketplace, the value of MaxWaitTimeInSeconds is set to 3,600 seconds (60 minutes), even if checkpointing for managed spot training is implemented. This value can't be adjusted.
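In practice, this means the StoppingCondition in a training job request for a Marketplace algorithm can't specify a wait time beyond 3,600 seconds. A sketch with placeholder names:

```python
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_training_job(
    TrainingJobName="my-spot-training-job",  # placeholder
    AlgorithmSpecification={
        "AlgorithmName": "arn:aws:sagemaker:us-east-1:111122223333:algorithm/my-algo",  # placeholder
        "TrainingInputMode": "File",
    },
    RoleArn="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder
    OutputDataConfig={"S3OutputPath": "s3://amzn-s3-demo-bucket/output/"},  # placeholder
    ResourceConfig={
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 30,
    },
    EnableManagedSpotTraining=True,
    StoppingCondition={
        "MaxRuntimeInSeconds": 3600,
        # For Marketplace algorithms, MaxWaitTimeInSeconds is capped at 3600.
        "MaxWaitTimeInSeconds": 3600,
    },
)
```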
Docker images and AWS accounts
For publishing, images must be stored in Amazon ECR repositories owned by the AWS account of the seller. It isn't possible to publish images that are stored in a repository owned by another AWS account.
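One way to confirm a repository is owned by your seller account before you list is to compare its registry ID with your caller identity. A minimal sketch, assuming a hypothetical repository named my-inference-image:

```python
import boto3

account_id = boto3.client("sts").get_caller_identity()["Account"]
repo = boto3.client("ecr").describe_repositories(
    repositoryNames=["my-inference-image"]  # placeholder repository
)["repositories"][0]

# registryId is the AWS account that owns the repository; it must match
# the seller account that publishes the product.
assert repo["registryId"] == account_id, "Repository is owned by another account"
```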
Publishing model packages from built-in algorithms or AWS Marketplace
Model packages created from training jobs using an Amazon SageMaker AI built-in algorithm or an algorithm from an AWS Marketplace subscription can't be published.
You can still use the model artifacts from the training job, but to publish a model package you must provide your own inference image.
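For example, when you register the model package, the inference container image must be one you built and pushed yourself, while the training job's model artifacts can still supply ModelDataUrl. A sketch with placeholder values:

```python
import boto3

sagemaker = boto3.client("sagemaker")

sagemaker.create_model_package(
    ModelPackageName="my-model-package",  # placeholder
    InferenceSpecification={
        "Containers": [
            {
                # Must be your own inference image, not a built-in algorithm's
                # image or one from a Marketplace subscription.
                "Image": "111122223333.dkr.ecr.us-east-1.amazonaws.com/my-inference-image:latest",
                # Model artifacts produced by the training job can be reused.
                "ModelDataUrl": "s3://amzn-s3-demo-bucket/model/model.tar.gz",  # placeholder
            }
        ],
        "SupportedContentTypes": ["application/json"],
        "SupportedResponseMIMETypes": ["application/json"],
        "SupportedRealtimeInferenceInstanceTypes": ["ml.m5.xlarge"],
        "SupportedTransformInstanceTypes": ["ml.m5.xlarge"],
    },
)
```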
Supported AWS Regions for publishing
AWS Marketplace supports publishing model package and algorithm resources from AWS Regions where both of the following are true:
- The Region is supported by Amazon SageMaker AI.
- The Region is available and opted in by default (for example, describe-regions returns "OptInStatus": "opt-in-not-required", as shown in the sketch after this list).
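The following minimal sketch makes the equivalent check with boto3; the filter value mirrors the describe-regions example above.

```python
import boto3

ec2 = boto3.client("ec2")

# Regions with "opt-in-not-required" are available and opted in by default.
response = ec2.describe_regions(
    Filters=[{"Name": "opt-in-status", "Values": ["opt-in-not-required"]}]
)
for region in response["Regions"]:
    print(region["RegionName"])
```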
All assets required for publishing a model package or algorithm product must be stored in the same Region that you choose to publish from. This includes the following:
- Model package and algorithm resources that are created in Amazon SageMaker AI
- Inference and training images that are uploaded to Amazon ECR repositories
- Model artifacts (if any) that are stored in Amazon Simple Storage Service (Amazon S3) and dynamically loaded during model deployment for model package resources
- Test data for inference and training validation that are stored in Amazon S3
You can develop and train your product in any Region that SageMaker AI supports. However, before you can publish, you must copy all assets to a Region that AWS Marketplace supports publishing from and re-create your resources there.
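For example, model artifacts in S3 can be copied into a bucket in the publishing Region before you re-create the model package there. A minimal sketch, with hypothetical bucket names:

```python
import boto3

# A client in the destination (publishing) Region performs the cross-Region copy.
s3 = boto3.client("s3", region_name="us-east-1")

s3.copy(
    CopySource={"Bucket": "amzn-s3-demo-source-bucket", "Key": "model/model.tar.gz"},
    Bucket="amzn-s3-demo-destination-bucket",  # bucket in the publishing Region
    Key="model/model.tar.gz",
)
```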
During the listing process, regardless of the AWS Region that you publish from, you can choose the Regions that you want to publish to and make your product available in.