Amazon Comprehend and Application Auto Scaling - Application Auto Scaling

Amazon Comprehend and Application Auto Scaling

You can scale Amazon Comprehend document classification and entity recognizer endpoints using target tracking scaling policies and scheduled scaling.

Use the following information to help you integrate Amazon Comprehend with Application Auto Scaling.

Service-linked role created for Amazon Comprehend

The following service-linked role is automatically created in your AWS account when registering Amazon Comprehend resources as scalable targets with Application Auto Scaling. This role allows Application Auto Scaling to perform supported operations within your account. For more information, see Service-linked roles for Application Auto Scaling.

  • AWSServiceRoleForApplicationAutoScaling_ComprehendEndpoint

Service principal used by the service-linked role

The service-linked role in the previous section can be assumed only by the service principal authorized by the trust relationships defined for the role. The service-linked role used by Application Auto Scaling grants access to the following service principal:

  • comprehend.application-autoscaling.amazonaws.com

Registering Amazon Comprehend resources as scalable targets with Application Auto Scaling

Application Auto Scaling requires a scalable target before you can create scaling policies or scheduled actions for an Amazon Comprehend document classification or entity recognizer endpoint. A scalable target is a resource that Application Auto Scaling can scale out and scale in. Scalable targets are uniquely identified by the combination of resource ID, scalable dimension, and namespace.

To configure auto scaling using the AWS CLI or one of the AWS SDKs, you can use the following options:

  • AWS CLI:

    Call the register-scalable-target command for a document classification endpoint. The following example registers the desired number of inference units to be used by the model for a document classifier endpoint using the endpoint's ARN, with a minimum capacity of one inference unit and a maximum capacity of three inference units.

    aws application-autoscaling register-scalable-target \ --service-namespace comprehend \ --scalable-dimension comprehend:document-classifier-endpoint:DesiredInferenceUnits \ --resource-id arn:aws:comprehend:us-west-2:123456789012:document-classifier-endpoint/EXAMPLE \ --min-capacity 1 \ --max-capacity 3

    If successful, this command returns the ARN of the scalable target.

    { "ScalableTargetARN": "arn:aws:application-autoscaling:region:account-id:scalable-target/1234abcd56ab78cd901ef1234567890ab123" }

    Call the register-scalable-target command for an entity recognizer endpoint. The following example registers the desired number of inference units to be used by the model for an entity recognizer using the endpoint's ARN, with a minimum capacity of one inference unit and a maximum capacity of three inference units.

    aws application-autoscaling register-scalable-target \ --service-namespace comprehend \ --scalable-dimension comprehend:entity-recognizer-endpoint:DesiredInferenceUnits \ --resource-id arn:aws:comprehend:us-west-2:123456789012:entity-recognizer-endpoint/EXAMPLE \ --min-capacity 1 \ --max-capacity 3

    If successful, this command returns the ARN of the scalable target.

    { "ScalableTargetARN": "arn:aws:application-autoscaling:region:account-id:scalable-target/1234abcd56ab78cd901ef1234567890ab123" }
  • AWS SDK:

    Call the RegisterScalableTarget operation and provide ResourceId, ScalableDimension, ServiceNamespace, MinCapacity, and MaxCapacity as parameters.

If you are just getting started with Application Auto Scaling, you can find additional useful information about scaling your Amazon Comprehend resources in the following documentation:

Auto scaling with endpoints in the Amazon Comprehend Developer Guide