Best practices for using Amazon Connect APIs
This topic provides guidance for using the Amazon Connect Describe and List APIs so that you avoid unexpected 4xx error responses. It also explains how to configure your client for Read APIs.
Types of errors
The Amazon Connect APIs provide an HTTP interface. HTTP defines ranges of status codes for different types of error responses.
- Client errors are indicated by the 4xx class of HTTP status codes.
- Service errors are indicated by the 5xx class of HTTP status codes.
In this reference guide, the documentation for each API has an Errors section that includes a brief discussion about HTTP status codes. We recommend looking there as part of your investigation when you get an error.
For information about the common errors returned by Amazon Connect public APIs, see Common Errors.
Throttling in Amazon Connect APIs
Throttling errors in Amazon Connect public APIs are indicated by HTTP status code 429 (Too Many Requests). Clients can retry these requests based on their requirements.
Important
The throttling limits are defined for each API separately at the AWS account level, not for the individual Amazon Connect instance.
To use any API for Amazon Connect resources (such as users, queues, and routing profiles), you need the ID/ARN for the Amazon Connect instance.
By default, Amazon Connect limits the steady-state requests per second (RPS) across all APIs within an AWS account, per Region. It also limits the burst (that is, the maximum bucket size) across all APIs within an AWS account, per Region.
In Amazon Connect the burst limit represents the target maximum number of concurrent request submissions that APIs will fulfill before returning 429 Too Many Requests error responses.
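Conceptually, the steady-state rate and burst limit behave like a token bucket: the bucket holds up to the burst size, refills at the steady-state rate, and a request is rejected with 429 when no token is available. The following is a minimal sketch of that model, not the service's actual implementation; the rate and burst values used in the usage note are illustrative placeholders, not your account's quotas.

```java
// Minimal token-bucket sketch of steady-state RPS plus burst.
// Illustrative model only; not Amazon Connect's actual throttling code.
public class ThrottleBucket {
    private final double ratePerSecond;  // steady-state refill rate
    private final double burst;          // maximum bucket size
    private double tokens;
    private double lastRefillSeconds;

    public ThrottleBucket(double ratePerSecond, double burst) {
        this.ratePerSecond = ratePerSecond;
        this.burst = burst;
        this.tokens = burst;        // bucket starts full
        this.lastRefillSeconds = 0; // caller supplies a monotonic clock
    }

    // Returns true if the request is admitted, false if it would be throttled (429).
    public boolean tryAcquire(double nowSeconds) {
        tokens = Math.min(burst, tokens + (nowSeconds - lastRefillSeconds) * ratePerSecond);
        lastRefillSeconds = nowSeconds;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

For example, with a rate of 2 and a burst of 5, five back-to-back requests succeed, the sixth is throttled, and capacity then returns at 2 requests per second.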
For more information about throttling quotas, see Amazon Connect throttling quotas.
How to configure your client Read API(s)
Your client configuration will vary based on the number of resources that your API calls describe or list per second.
In the following Java example, the number of retries is set to 3. This means that after your Amazon Connect client implementation experiences throttling, it retries a maximum of 3 times. Instead of retrying immediately and aggressively, the snippet waits a specified amount of time between tries (from 0 up to the maximum of 5 seconds defined by the maxBackoffTime parameter) and uses EqualJitterBackoffStrategy:
final class ClientBuilder {
    private static final int NUMBER_OF_RETRIES = 3;

    private static final RetryPolicy RETRY_POLICY = RetryPolicy.builder()
            .numRetries(NUMBER_OF_RETRIES)
            .retryCondition(RetryCondition.defaultRetryCondition())
            .backoffStrategy(EqualJitterBackoffStrategy.builder()
                    .baseDelay(Duration.ofSeconds(1))
                    .maxBackoffTime(Duration.ofSeconds(5))
                    .build())
            .build();

    public static ConnectClient getClient() {
        return ConnectClient.builder()
                .httpClient(LambdaWrapper.HTTP_CLIENT)
                .overrideConfiguration(ClientOverrideConfiguration.builder()
                        .retryPolicy(RETRY_POLICY)
                        .build())
                .build();
    }
}
When failures are caused by overload or contention, backing off often doesn't help as much as it seems like it should. This is because there's a correlation between failures and backing off/contention:
- If all the failed calls back off to the same time, they cause contention or overload again when they are retried.
To address this, we recommend adding jitter. Jitter adds some amount of randomness to the backoff, which spreads the retries out in time. For more information about how much jitter to add and the best ways to add it, see the AWS blog post Exponential Backoff and Jitter.
For information about types of backoff strategies, see Interface BackoffStrategy
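To illustrate how an equal-jitter strategy spreads retries, the following sketch computes a delay of half the capped exponential backoff plus a random amount up to the other half. This is an assumption about the general shape of the strategy, not the SDK's exact implementation.

```java
import java.time.Duration;
import java.util.Random;

// Illustrative equal-jitter backoff sketch; the AWS SDK's internal
// implementation may differ in detail.
public class EqualJitterSketch {
    static final Duration BASE_DELAY = Duration.ofSeconds(1);
    static final Duration MAX_BACKOFF = Duration.ofSeconds(5);
    static final Random RANDOM = new Random();

    // attempt is 0-based: 0 for the first retry, 1 for the second, ...
    public static Duration delayBeforeRetry(int attempt) {
        // Exponential growth from the base delay, capped at the max backoff.
        long exponentialMillis = Math.min(
                BASE_DELAY.toMillis() * (1L << attempt),
                MAX_BACKOFF.toMillis());
        // Equal jitter: half of the delay is fixed, half is random.
        long half = exponentialMillis / 2;
        return Duration.ofMillis(half + RANDOM.nextInt((int) half + 1));
    }

    public static void main(String[] args) {
        for (int attempt = 0; attempt < 3; attempt++) {
            System.out.println("retry " + attempt + ": " + delayBeforeRetry(attempt));
        }
    }
}
```

With a 1-second base delay, the first retry waits between 0.5 and 1 second; once the cap is reached, every retry waits between 2.5 and 5 seconds, so simultaneous clients do not all retry at the same instant.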
How to make 2 TPS work for List APIs when you have a large number of resources
There are two options: use List APIs with maxResults = 1,000, or use Search APIs as an alternative to List/Describe round trips. Both options are discussed here.
The List API of a particular Amazon Connect resource supports a maxResults parameter as part of the request body. List APIs support a maximum of 1,000 results in a single API call unless specified otherwise in the documentation.
The following example shows maxResults set to 1,000 in a call to the ListUsers API.
String nextToken = null;
do {
    ListUsersRequest listUsersRequest = ListUsersRequest.builder()
            .instanceId("your Amazon Connect instanceId")
            .maxResults(1000)
            .nextToken(nextToken)
            .build();
    ListUsersResponse response = client.listUsers(listUsersRequest);
    System.out.println(response.sdkHttpResponse().statusCode());
    nextToken = response.nextToken(); // null when there are no more pages
} while (nextToken != null);
If nextToken is returned, more results are available. The value of nextToken is a unique pagination token for each page. Make the call again using the returned token to retrieve the next page, keeping all other arguments unchanged. Each pagination token expires after 24 hours; using an expired pagination token returns an HTTP 400 InvalidToken error.
When to use Search APIs instead of List APIs
We recommend that you assess the speed of pulling details for 100 records at a time (the Search API limit) instead of pulling 1,000 IDs and then making Describe round trips. In general, it's better to use Search APIs than a combination of List and Describe APIs for a specific resource.
Suppose you list specific resources in your Amazon Connect instance and then call a Describe API on each individual resource. Instead, we recommend the Search API for the corresponding resource. Search APIs support several filters that can reduce the response set to what you need.
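To see why this matters, count the API calls for N resources: List plus Describe needs one List page per 1,000 IDs plus one Describe call per resource, while Search returns full records at up to 100 per page. The following sketch works through that arithmetic; the page sizes are the limits stated above, and the 2 TPS figure is the default quota mentioned in this topic.

```java
// Compares API call counts for fetching full details of n resources.
public class CallCountComparison {
    public static long ceilDiv(long n, long d) {
        return (n + d - 1) / d;
    }

    // Calls needed via List (1,000 IDs per page) plus one Describe per resource.
    public static long listPlusDescribeCalls(long n) {
        return ceilDiv(n, 1000) + n;
    }

    // Calls needed via Search, at up to 100 full records per page.
    public static long searchCalls(long n) {
        return ceilDiv(n, 100);
    }

    public static void main(String[] args) {
        long n = 5000;
        double tps = 2.0; // default throttling quota
        System.out.printf("List+Describe: %d calls (~%.0f s at %.0f TPS)%n",
                listPlusDescribeCalls(n), listPlusDescribeCalls(n) / tps, tps);
        System.out.printf("Search:        %d calls (~%.0f s at %.0f TPS)%n",
                searchCalls(n), searchCalls(n) / tps, tps);
    }
}
```

For 5,000 resources this works out to 5,005 calls (List plus Describe) versus 50 calls (Search), so at 2 TPS the Search approach finishes in under a minute while the round-trip approach takes over 40 minutes.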
How to make 2 TPS work for Create/Update APIs when you have a large number of resources
There is a performance impact to creating or updating resources at the default 2 TPS. For example, 100 resources can be created or updated at 2 TPS within 50 seconds, while 1,000 resources at this rate need just over 8 minutes. Based on your use case, if the operation is impacting performance, contact AWS Support and provide a business justification for your request to increase your throttling quota. See How to request an increase to an API throttling quota.
It is your responsibility to always implement the following best practices:
- Check your logic and implement best practices to make your requests as efficient as possible. See the AWS Well-Architected Tool (AWS WA Tool) for processes that help you measure your architecture against AWS best practices.
- Test your requests and any custom processes before adding them to production operations.
Hitting a resource quota? Delete unused or stale resources
If you keep hitting the quota limit for a specific resource, we recommend deleting any unused or stale resources. You can find the Delete API for a resource on the resource-specific Action pages. These pages list all the APIs for a given resource.
How to request an increase to an API throttling quota
Important
- We analyze all requests for quota increases and provide guidance for all queries.
- We rarely approve requests that apply to situations other than those listed below.
- We can often approve smaller increase requests within hours. Larger increase requests take time to review, process, approve, and deploy. Depending on your specific implementation, your resource, and the size of the quota that you want, a request can take up to 3 weeks. An extra-large worldwide increase can potentially take months. If you're increasing your quotas as part of a larger project, keep this information in mind and plan accordingly.
In the Service Quotas console, open an AWS Support case and provide the following information:
- Have you implemented the best practices explained in the Retry behavior topic of the AWS SDKs and Tools Reference Guide?
- What is the performance impact without the requested limit increase? Provide some calculations.
- What is the expected number of resources that you are trying to create, update, or describe every second with the APIs?
- What is the new quota that you want for the API?
Indicate in your case whether any of the following situations apply:
- It is a migration request and you need high TPS for a specific time range to configure your instance(s).
- There are performance- or business-impacting use cases, such as handling huge call volumes for peak season.
- You have thousands of resources with many concurrent agents working at the same time, which might increase the overall traffic from your contact center.