Check the status of a scaling activity by describing scaling activities

You can check the status of a scaling activity for your auto-scaled endpoint by describing its scaling activities. Application Auto Scaling provides descriptive information about the scaling activities in the specified namespace from the previous six weeks. For more information, see Scaling activities for Application Auto Scaling in the Application Auto Scaling User Guide.

To check the status of a scaling activity, use the describe-scaling-activities command. You can't check the status of a scaling activity using the console.

Describe scaling activities (AWS CLI)

To describe scaling activities for all SageMaker AI resources that are registered with Application Auto Scaling, use the describe-scaling-activities command, specifying sagemaker for the --service-namespace option.

aws application-autoscaling describe-scaling-activities \
    --service-namespace sagemaker

To describe scaling activities for a specific resource, include the --resource-id option.

aws application-autoscaling describe-scaling-activities \
    --service-namespace sagemaker \
    --resource-id endpoint/my-endpoint/variant/my-variant
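The same request can be made from Python. The following is a minimal sketch, not an official sample. It assumes `client` is a boto3 Application Auto Scaling client, such as one created with `boto3.client("application-autoscaling")`, and it uses the `describe_scaling_activities` paginator to follow result pages. The function name `iter_scaling_activities` is hypothetical.

```python
from typing import Any, Dict, Iterator, Optional


def iter_scaling_activities(
    client: Any, resource_id: Optional[str] = None
) -> Iterator[Dict[str, Any]]:
    """Yield scaling activities in the sagemaker namespace, one at a time.

    `client` is assumed to be a boto3 Application Auto Scaling client.
    If `resource_id` is omitted, activities for all registered SageMaker AI
    resources are returned.
    """
    kwargs: Dict[str, Any] = {"ServiceNamespace": "sagemaker"}
    if resource_id is not None:
        # Same value as the --resource-id CLI option.
        kwargs["ResourceId"] = resource_id

    # The paginator requests further pages until the service has no more.
    paginator = client.get_paginator("describe_scaling_activities")
    for page in paginator.paginate(**kwargs):
        yield from page.get("ScalingActivities", [])
```

For example, `list(iter_scaling_activities(client, "endpoint/my-endpoint/variant/my-variant"))` would collect the activities for a single variant. Taking the client as a parameter keeps the function free of credential and Region handling.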

The following example shows the format of each scaling activity in the output. The command returns these entries in a list under the ScalingActivities key.

{
    "ActivityId": "activity-id",
    "ServiceNamespace": "sagemaker",
    "ResourceId": "endpoint/my-endpoint/variant/my-variant",
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "Description": "string",
    "Cause": "string",
    "StartTime": timestamp,
    "EndTime": timestamp,
    "StatusCode": "string",
    "StatusMessage": "string"
}
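When many activities are returned, a tally by status can be easier to read than the raw JSON. The following sketch assumes that the CLI output was captured as a JSON string and that entries like the one above appear in a top-level `ScalingActivities` list, as they do in the DescribeScalingActivities API response. The function name `count_by_status` is hypothetical.

```python
import json
from collections import Counter


def count_by_status(cli_output: str) -> Counter:
    """Count scaling activities per StatusCode value.

    `cli_output` is the JSON text printed by describe-scaling-activities.
    Assumes entries are listed under a top-level "ScalingActivities" key.
    """
    response = json.loads(cli_output)
    activities = response.get("ScalingActivities", [])
    # Entries without a StatusCode are grouped under "Unknown".
    return Counter(a.get("StatusCode", "Unknown") for a in activities)
```

One way to use it is to redirect the command output to a file, such as a hypothetical `activities.json`, read the file contents into a string, and pass that string to `count_by_status`.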

Identify blocked scaling activities from instance quotas (AWS CLI)

When you scale out (add more instances), you might reach your account-level instance quota. You can use the describe-scaling-activities command to check whether you have reached your instance quota. When you exceed your quota, auto scaling is blocked.

To check if you have reached your instance quota, use the describe-scaling-activities command and specify the resource ID for the --resource-id option.

aws application-autoscaling describe-scaling-activities \
    --service-namespace sagemaker \
    --resource-id endpoint/my-endpoint/variant/my-variant

In the output, check the values of the StatusCode and StatusMessage keys. When a scaling activity is blocked by your instance quota, StatusCode is Failed and StatusMessage contains a message indicating that the account-level service quota was reached. The following example shows what that output might look like:

{
    "ActivityId": "activity-id",
    "ServiceNamespace": "sagemaker",
    "ResourceId": "endpoint/my-endpoint/variant/my-variant",
    "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
    "Description": "string",
    "Cause": "minimum capacity was set to 110",
    "StartTime": timestamp,
    "EndTime": timestamp,
    "StatusCode": "Failed",
    "StatusMessage": "Failed to set desired instance count to 110. Reason: The account-level service limit 'ml.xx.xxxxxx for endpoint usage' is 1000 Instances, with current utilization of 997 Instances and a request delta of 20 Instances. Please contact AWS support to request an increase for this limit. (Service: AmazonSageMaker; Status Code: 400; Error Code: ResourceLimitExceeded; Request ID: request-id)."
}
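This check can also be scripted. The sketch below filters a list of already-parsed scaling activities down to those that appear to be blocked by an instance quota. It assumes that such activities have a StatusCode of Failed and a StatusMessage that contains the error code `ResourceLimitExceeded`, as in the example above. The exact message wording is not guaranteed, so treat the match as a heuristic. The names `QUOTA_ERROR_CODE` and `find_quota_blocked` are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Error code taken from the example StatusMessage; assumed to be present
# whenever the account-level instance quota blocks a scale-out.
QUOTA_ERROR_CODE = "ResourceLimitExceeded"


def find_quota_blocked(
    activities: Iterable[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return the failed activities whose message mentions the quota error."""
    blocked = []
    for activity in activities:
        if activity.get("StatusCode") != "Failed":
            continue
        # StatusMessage may be absent or empty on some activities.
        message = activity.get("StatusMessage") or ""
        if QUOTA_ERROR_CODE in message:
            blocked.append(activity)
    return blocked
```

If the returned list is not empty, the StatusMessage of each entry names the limit that was reached and the requested instance delta.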