Managing performance and scaling for Aurora DB clusters
You can use the following options to manage performance and scaling for Aurora DB clusters and DB instances:
Storage scaling
Aurora storage automatically scales with the data in your cluster volume. As your data grows, your cluster volume storage expands up to a maximum of 128 tebibytes (TiB) or 64 TiB. The maximum size depends on the DB engine version. To learn what kinds of data are included in the cluster volume, see Amazon Aurora storage. For details about the maximum size for a specific version, see Amazon Aurora size limits.
The size of your cluster volume is evaluated on an hourly basis to determine your
storage costs. For pricing information, see the
Aurora pricing page
Even though an Aurora cluster volume can scale up in size to many tebibytes, you are only charged for the space that you use in the volume. The mechanism for determining billed storage space depends on the version of your Aurora cluster.
-
When Aurora data is removed from the cluster volume, the overall billed space decreases by a comparable amount. This dynamic resizing behavior happens when underlying tablespaces are dropped or reorganized to require less space. Thus, you can reduce storage charges by dropping tables and databases that you no longer need. Dynamic resizing applies to certain Aurora versions. The following are the Aurora versions where the cluster volume dynamically resizes as you remove data:
Database engine Versions with dynamic resizing Aurora MySQL -
Version 3 (compatible with MySQL 8.0): all supported versions
-
Version 2 (compatible with MySQL 5.7): 2.11 and higher
Aurora PostgreSQL All supported versions Aurora Serverless v2 All supported versions Aurora Serverless v1 Not available -
-
In Aurora versions lower than those in the preceding list, the cluster volume can reuse space that's freed up when you remove data, but the volume itself never decreases in size.
Dynamic resizing applies to operations that physically remove or resize tablespaces within the cluster volume. Thus, it
applies to SQL statements such as DROP TABLE
, DROP DATABASE
, TRUNCATE TABLE
, and
ALTER TABLE ... DROP PARTITION
. It doesn't apply to deleting rows using the DELETE
statement.
If you delete a large number of rows from a table, you can run the Aurora MySQL OPTIMIZE TABLE
statement or use the
Aurora PostgreSQL pg_repack
extension afterward to reorganize the table and dynamically resize the cluster
volume.
For Aurora MySQL, the following considerations apply:
-
After you upgrade your DB cluster to a DB engine version that supports dynamic resizing, and when the feature is enabled in that specific AWS Region, any space that's later freed by certain SQL statements, such as
DROP TABLE
, is reclaimable.If the feature is explicitly disabled in a particular AWS Region, the space might only be reusable—and not reclaimable—even on versions that support dynamic resizing.
The feature was enabled for specific DB engine versions (1.23.0–1.23.4, 2.09.0–2.09.3, and 2.10.0) between November 2020 and March 2022, and is enabled by default on any subsequent versions.
-
A table is stored internally in one or more contiguous fragments of varying sizes. While running
TRUNCATE TABLE
operations, the space corresponding to the first fragment is reusable and not reclaimable. Other fragments are reclaimable. DuringDROP TABLE
operations, space corresponding to the entire tablespace is reclaimable. -
The
innodb_file_per_table
parameter affects how table storage is organized. When tables are part of the system tablespace, dropping the table doesn't reduce the size of the system tablespace. Thus, make sure to setinnodb_file_per_table
to 1 for Aurora MySQL DB clusters to take full advantage of dynamic resizing. -
In version 2.11 and higher, the InnoDB temporary tablespace is dropped and re-created on restart. This releases the space occupied by the temporary tablespace to the system, and then the cluster volume resizes. To take full advantage of the dynamic resizing feature, we recommend that you upgrade your DB cluster to version 2.11 or higher.
Note
The dynamic resizing feature doesn't reclaim space immediately when tables in tablespaces are dropped, but gradually at a rate of approximately 10 TB per day. Space in the system tablespace isn't reclaimed, because the system tablespace is never removed. Unreclaimed free space in a tablespace is reused when an operation needs space in that tablespace. The dynamic resizing feature can reclaim storage space only when the cluster is in an available state.
You can check how much storage space a cluster is using by monitoring the VolumeBytesUsed
metric in CloudWatch. For
more information on storage billing, see How Aurora data storage is billed.
-
In the AWS Management Console, you can see this figure in a chart by viewing the
Monitoring
tab on the details page for the cluster. -
With the AWS CLI, you can run a command similar to the following Linux example. Substitute your own values for the start and end times and the name of the cluster.
aws cloudwatch get-metric-statistics --metric-name "VolumeBytesUsed" \ --start-time "$(date -d '6 hours ago')" --end-time "$(date -d 'now')" --period 60 \ --namespace "AWS/RDS" \ --statistics Average Maximum Minimum \ --dimensions Name=DBClusterIdentifier,Value=
my_cluster_identifier
That command produces output similar to the following.
{ "Label": "VolumeBytesUsed", "Datapoints": [ { "Timestamp": "2020-08-04T21:25:00+00:00", "Average": 182871982080.0, "Minimum": 182871982080.0, "Maximum": 182871982080.0, "Unit": "Bytes" } ] }
The following examples show how you might track storage usage for an Aurora cluster over time using AWS CLI commands on a
Linux system. The --start-time
and --end-time
parameters define the overall time interval as one day.
The --period
parameter requests the measurements at one hour intervals. It doesn't make sense to choose a
--period
value that's small, because the metrics are collected at intervals, not continuously. Also, Aurora
storage operations sometimes continue for some time in the background after the relevant SQL statement finishes.
The first example returns output in the default JSON format. The data points are returned in arbitrary order, not sorted by timestamp. You might import this JSON data into a charting tool to do sorting and visualization.
$
aws cloudwatch get-metric-statistics --metric-name "VolumeBytesUsed" \
--start-time "$(date -d '1 day ago')" --end-time "$(date -d 'now')" --period 3600
--namespace "AWS/RDS" --statistics Maximum --dimensions Name=DBClusterIdentifier,Value=my_cluster_id
{
"Label": "VolumeBytesUsed",
"Datapoints": [
{
"Timestamp": "2020-08-04T19:40:00+00:00",
"Maximum": 182872522752.0,
"Unit": "Bytes"
},
{
"Timestamp": "2020-08-05T00:40:00+00:00",
"Maximum": 198573719552.0,
"Unit": "Bytes"
},
{
"Timestamp": "2020-08-05T05:40:00+00:00",
"Maximum": 206827454464.0,
"Unit": "Bytes"
},
{
"Timestamp": "2020-08-04T17:40:00+00:00",
"Maximum": 182872522752.0,
"Unit": "Bytes"
},
... output omitted ...
This example returns the same data as the previous one. The --output
parameter represents the data in compact
plain text format. The aws cloudwatch
command pipes its output to the sort
command. The
-k
parameter of the sort
command sorts the output by the third field, which is the timestamp in
Universal Coordinated Time (UTC) format.
$
aws cloudwatch get-metric-statistics --metric-name "VolumeBytesUsed" \ --start-time "$(date -d '1 day ago')" --end-time "$(date -d 'now')" --period 3600 \ --namespace "AWS/RDS" --statistics Maximum --dimensions Name=DBClusterIdentifier,Value=my_cluster_id
\ --output text | sort -k 3VolumeBytesUsed DATAPOINTS 182872522752.0 2020-08-04T17:41:00+00:00 Bytes DATAPOINTS 182872522752.0 2020-08-04T18:41:00+00:00 Bytes DATAPOINTS 182872522752.0 2020-08-04T19:41:00+00:00 Bytes DATAPOINTS 182872522752.0 2020-08-04T20:41:00+00:00 Bytes DATAPOINTS 187667791872.0 2020-08-04T21:41:00+00:00 Bytes DATAPOINTS 190981029888.0 2020-08-04T22:41:00+00:00 Bytes DATAPOINTS 195587244032.0 2020-08-04T23:41:00+00:00 Bytes DATAPOINTS 201048915968.0 2020-08-05T00:41:00+00:00 Bytes DATAPOINTS 205368492032.0 2020-08-05T01:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T02:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T03:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T04:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T05:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T06:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T07:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T08:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T09:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T10:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T11:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T12:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T13:41:00+00:00 Bytes DATAPOINTS 206827454464.0 2020-08-05T14:41:00+00:00 Bytes DATAPOINTS 206833664000.0 2020-08-05T15:41:00+00:00 Bytes DATAPOINTS 206833664000.0 2020-08-05T16:41:00+00:00 Bytes
The sorted output shows how much storage was used at the start and end of the monitoring period. You can also find the
points during that period when Aurora allocated more storage for the cluster. The following example uses Linux commands to
reformat the starting and ending VolumeBytesUsed
values as gigabytes (GB) and as gibibytes (GiB). Gigabytes
represent units measured in powers of 10 and are commonly used in discussions of storage for rotational hard drives. Gibibytes
represent units measured in powers of 2. Aurora storage measurements and limits are typically stated in the power-of-2 units,
such as gibibytes and tebibytes.
$
GiB=$((1024*1024*1024))$
GB=$((1000*1000*1000))$
echo "Start: $((182872522752/$GiB)) GiB, End: $((206833664000/$GiB)) GiB"Start: 170 GiB, End: 192 GiB
$
echo "Start: $((182872522752/$GB)) GB, End: $((206833664000/$GB)) GB"Start: 182 GB, End: 206 GB
The VolumeBytesUsed
metric tells you how much storage in the cluster is incurring charges. Thus, it's
best to minimize this number when practical. However, this metric doesn't include some storage that Aurora uses internally
in the cluster and doesn't charge for. If your cluster is approaching the storage limit and might run out of space,
it's more helpful to monitor the AuroraVolumeBytesLeftTotal
metric and try to maximize that number. The
following example runs a similar calculation as the previous one, but for AuroraVolumeBytesLeftTotal
instead of
VolumeBytesUsed
.
$
aws cloudwatch get-metric-statistics --metric-name "AuroraVolumeBytesLeftTotal" \ --start-time "$(date -d '1 hour ago')" --end-time "$(date -d 'now')" --period 3600 \ --namespace "AWS/RDS" --statistics Maximum --dimensions Name=DBClusterIdentifier,Value=my_old_cluster_id
\ --output text | sort -k 3AuroraVolumeBytesLeftTotal DATAPOINTS 140530528288768.0 2023-02-23T19:25:00+00:00 Count
$
TiB=$((1024*1024*1024*1024))$
TB=$((1000*1000*1000*1000))$
echo "$((69797067915264 / $TB)) TB remaining for this cluster"69 TB remaining for this cluster
$
echo "$((69797067915264 / $TiB)) TiB remaining for this cluster"63 TiB remaining for this cluster
For a cluster running Aurora MySQL version 2.09 or higher, or Aurora PostgreSQL, the free size reported by
VolumeBytesUsed
increases when data is added and decreases when data is removed. The following example shows
how. This report shows the maximum and minimum storage size for a cluster at 15-minute intervals as tables with temporary data
are created and dropped. The report lists the maximum value before the minimum value. Thus, to understand how storage usage
changed within the 15-minute interval, interpret the numbers from right to left.
$
aws cloudwatch get-metric-statistics --metric-name "VolumeBytesUsed" \ --start-time "$(date -d '4 hours ago')" --end-time "$(date -d 'now')" --period 1800 \ --namespace "AWS/RDS" --statistics Maximum Minimum --dimensions Name=DBClusterIdentifier,Value=my_new_cluster_id
--output text | sort -k 4VolumeBytesUsed DATAPOINTS 14545305600.0 14545305600.0 2020-08-05T20:49:00+00:00 Bytes DATAPOINTS 14545305600.0 14545305600.0 2020-08-05T21:19:00+00:00 Bytes DATAPOINTS 22022176768.0 14545305600.0 2020-08-05T21:49:00+00:00 Bytes DATAPOINTS 22022176768.0 22022176768.0 2020-08-05T22:19:00+00:00 Bytes DATAPOINTS 22022176768.0 22022176768.0 2020-08-05T22:49:00+00:00 Bytes DATAPOINTS 22022176768.0 15614263296.0 2020-08-05T23:19:00+00:00 Bytes DATAPOINTS 15614263296.0 15614263296.0 2020-08-05T23:49:00+00:00 Bytes DATAPOINTS 15614263296.0 15614263296.0 2020-08-06T00:19:00+00:00 Bytes
The following example shows how with a cluster running Aurora MySQL version 2.09 or higher, or Aurora PostgreSQL, the free size
reported by AuroraVolumeBytesLeftTotal
reflects the 128-TiB size limit.
$
aws cloudwatch get-metric-statistics --region us-east-1 --metric-name "AuroraVolumeBytesLeftTotal" \ --start-time "$(date -d '4 hours ago')" --end-time "$(date -d 'now')" --period 1800 \ --namespace "AWS/RDS" --statistics Minimum --dimensions Name=DBClusterIdentifier,Value=pq-57 \ --output text | sort -k 3AuroraVolumeBytesLeftTotal DATAPOINTS 140515818864640.0 2020-08-05T20:56:00+00:00 Count DATAPOINTS 140515818864640.0 2020-08-05T21:26:00+00:00 Count DATAPOINTS 140515818864640.0 2020-08-05T21:56:00+00:00 Count DATAPOINTS 140514866757632.0 2020-08-05T22:26:00+00:00 Count DATAPOINTS 140511020580864.0 2020-08-05T22:56:00+00:00 Count DATAPOINTS 140503168843776.0 2020-08-05T23:26:00+00:00 Count DATAPOINTS 140503168843776.0 2020-08-05T23:56:00+00:00 Count DATAPOINTS 140515818864640.0 2020-08-06T00:26:00+00:00 Count
$
TiB=$((1024*1024*1024*1024))$
TB=$((1000*1000*1000*1000))$
echo "$((140515818864640 / $TB)) TB remaining for this cluster"140 TB remaining for this cluster
$
echo "$((140515818864640 / $TiB)) TiB remaining for this cluster"127 TiB remaining for this cluster
Instance scaling
You can scale your Aurora DB cluster as needed by modifying the DB instance class for each DB instance in the DB cluster. Aurora supports several DB instance classes optimized for Aurora, depending on database engine compatibility.
Database engine | Instance scaling |
---|---|
Amazon Aurora MySQL |
|
Amazon Aurora PostgreSQL |
Read scaling
You can achieve read scaling for your Aurora DB cluster by creating up to 15 Aurora Replicas in a DB cluster. Each Aurora Replica returns the same data from the cluster volume with minimal replica lag—usually considerably less than 100 milliseconds after the primary instance has written an update. As your read traffic increases, you can create additional Aurora Replicas and connect to them directly to distribute the read load for your DB cluster. Aurora Replicas don't have to be of the same DB instance class as the primary instance.
For information about adding Aurora Replicas to a DB cluster, see Adding Aurora Replicas to a DB cluster.
Managing connections
The maximum number of connections allowed to an Aurora DB instance is determined by
the max_connections
parameter in the instance-level parameter group for
the DB instance. The default value of that parameter varies depends on the DB
instance class used for the DB instance and database engine compatibility.
Database engine | max_connections default value |
---|---|
Amazon Aurora MySQL |
|
Amazon Aurora PostgreSQL |
Tip
If your applications frequently open and close connections, or keep a large number of long-lived connections open, we recommend that you use Amazon RDS Proxy. RDS Proxy is a fully managed, highly available database proxy that uses connection pooling to share database connections securely and efficiently. To learn more about RDS Proxy, see Amazon RDS Proxy for Aurora.
Managing query execution plans
If you use query plan management for Aurora PostgreSQL, you gain control over which plans the optimizer runs. For more information, see Managing query execution plans for Aurora PostgreSQL.