Optimize service use - Amazon Athena

Optimize service use

Service level considerations include the number of workloads you run per account, service quotas not only for Athena, but across services, and thinking about how to reduce 'out of resource' errors.

Run one workload per account to avoid service quota limits

Athena enforces quotas for metrics like query running time, the number of concurrent queries in an account, and API request rates. For more information about these quotas, see Service Quotas. Exceeding these quotas causes a query to fail — either when it is submitted, or during query execution.

Many of the performance optimization tips on this page can help reduce the running time of queries. Optimization frees up capacity so that you can run more queries within the concurrency quota and keeps queries from being cancelled for running too long.

Quotas on the number of concurrent queries and API requests are per AWS account and AWS Region. We recommend running one workload per AWS account (or using separate provisioned capacity reservations) to keep workloads from competing for the same quota.

If you run two workloads in the same account, one of the workloads can run a burst of queries. This can cause the remaining workload to be throttled or blocked from running queries. To avoid this, you can move the workloads to separate accounts to give each workload its own concurrency quota. Creating a provisioned capacity reservation for one or both of the workloads accomplishes the same goal.

Consider quotas in other services

When Athena runs a query, it can call other services that enforce quotas. During query execution, Athena can make API calls to the AWS Glue Data Catalog, Amazon S3, and other AWS services like IAM and AWS KMS. If you use federated queries, Athena also calls AWS Lambda. All of these services have their own limits and quotas that can be exceeded. When a query execution encounters errors from these services, it fails and includes the error from the source service. Recoverable errors are retried, but queries can still fail if the issue does not resolve itself in time. Make sure to read error messages thoroughly to determine if they come from Athena or from another service. Some of the relevant errors are covered in this performance tuning section.

For more information about working around errors caused by Amazon S3 service quotas, see Avoid having too many files later in this document. For more information about Amazon S3 performance optimization, see Best practices design patterns: optimizing Amazon S3 performance in the Amazon S3 User Guide.

Reduce 'out of resource' errors

Athena runs queries in a distributed query engine. When you submit a query, the Athena engine query planner estimates the compute capacity required to run the query and prepares a cluster of compute nodes accordingly. Some queries like DDL queries run on only one node. Complex queries over large data sets run on much bigger clusters. The nodes are uniform, with the same memory, CPU, and disk configurations. Athena scales out, not up, to process more demanding queries.

Sometimes the demands of a query exceed the resources available to the cluster running the query. When this happens, the query fails with the error Query exhausted resources at this scale factor.

The resource most commonly exhausted is memory, but in rare cases it can also be disk space. Memory errors commonly occur when the engine performs a join or a window function, but they can also occur in distinct counts and aggregations.

Even if a query fails with an 'out of resource' error once, it might succeed when you run it again. Query execution is not deterministic. Factors such as how long it takes to load data and how intermediate datasets are distributed over the nodes can result in different resource usage. For example, imagine a query that joins two tables and has a heavy skew in the distribution of the values for the join condition. Such a query can succeed most of the time but occasionally fail when the most common values end up being processed by the same node.

To prevent your queries from exceeding available resources, use the performance tuning tips mentioned in this document. In particular, for tips on how to optimize queries that exhaust the resources available, see Optimize joins, Reduce the scope of window functions, or remove them, and Optimize queries by using approximations.