Fast recovery after failover with cluster cache management for Aurora PostgreSQL
For fast recovery of the writer DB instance in your Aurora PostgreSQL clusters if there's a failover, use cluster cache management for Amazon Aurora PostgreSQL. Cluster cache management ensures that application performance is maintained if there's a failover.
In a typical failover situation, you might see a temporary but large performance degradation after failover. This degradation occurs because when the failover DB instance starts, the buffer cache is empty. An empty cache is also known as a cold cache. A cold cache degrades performance because the DB instance has to read from the slower disk, instead of taking advantage of values stored in the buffer cache.
With cluster cache management, you set a specific reader DB instance as the failover target. Cluster cache management ensures that the data in the designated reader's cache is kept synchronized with the data in the writer DB instance's cache. The designated reader's cache with prefilled values is known as a warm cache. If a failover occurs, the designated reader uses values in its warm cache immediately when it's promoted to the new writer DB instance. This approach provides your application much better recovery performance.
Cluster cache management requires that the designated reader instance have the same instance
class type and size (db.r5.2xlarge
or db.r5.xlarge
, for example)
as the writer. Keep this in mind when you create your Aurora PostgreSQL DB clusters so that your
cluster can recover during a failover. For a listing of instance class types and sizes, see
Hardware specifications for DB instance classes for Aurora.
Note
Cluster cache management is not supported for Aurora PostgreSQL DB clusters that are part of Aurora global databases. It is recommended that no workload should run on the designated tier-0 reader.
Contents
Configuring cluster cache management
To configure cluster cache management, do the following processes in order.
Topics
Note
Allow at least 1 minute after completing these steps for cluster cache management to be fully operational.
Enabling cluster cache management
To enable cluster cache management, take the steps described following.
To enable cluster cache management
-
Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/
. -
In the navigation pane, choose Parameter groups.
-
In the list, choose the parameter group for your Aurora PostgreSQL DB cluster.
The DB cluster must use a parameter group other than the default, because you can't change values in a default parameter group.
-
For Parameter group actions, choose Edit.
-
Set the value of the
apg_ccm_enabled
cluster parameter to 1. -
Choose Save changes.
To enable cluster cache management for an Aurora PostgreSQL DB cluster, use
the AWS CLI modify-db-cluster-parameter-group
command with the
following required parameters:
-
--db-cluster-parameter-group-name
-
--parameters
Example
For Linux, macOS, or Unix:
aws rds modify-db-cluster-parameter-group \ --db-cluster-parameter-group-name
my-db-cluster-parameter-group
\ --parameters "ParameterName=apg_ccm_enabled,ParameterValue=1,ApplyMethod=immediate"
For Windows:
aws rds modify-db-cluster-parameter-group ^ --db-cluster-parameter-group-name
my-db-cluster-parameter-group
^ --parameters "ParameterName=apg_ccm_enabled,ParameterValue=1,ApplyMethod=immediate"
Setting the promotion tier priority for the writer DB instance
For cluster cache management, make sure that the promotion priority is tier-0 for the writer DB instance of the Aurora PostgreSQL DB cluster. The promotion tier priority is a value that specifies the order in which an Aurora reader is promoted to the writer DB instance after a failure. Valid values are 0–15, where 0 is the first priority and 15 is the last priority. For more information about the promotion tier, see Fault tolerance for an Aurora DB cluster.
To set the promotion priority for the writer DB instance to tier-0
Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/
. -
In the navigation pane, choose Databases.
-
Choose the Writer DB instance of the Aurora PostgreSQL DB cluster.
-
Choose Modify. The Modify DB Instance page appears.
-
On the Additional configuration panel, choose tier-0 for the Failover priority.
-
Choose Continue and check the summary of modifications.
-
To apply the changes immediately after you save them, choose Apply immediately.
-
Choose Modify DB Instance to save your changes.
To set the promotion tier priority to 0 for the writer DB instance using the AWS CLI, call the modify-db-instance command with the following required parameters:
-
--db-instance-identifier
-
--promotion-tier
-
--apply-immediately
Example
For Linux, macOS, or Unix:
aws rds modify-db-instance \ --db-instance-identifier
writer-db-instance
\ --promotion-tier 0 \ --apply-immediately
For Windows:
aws rds modify-db-instance ^ --db-instance-identifier
writer-db-instance
^ ---promotion-tier 0 ^ --apply-immediately
Setting the promotion tier priority for a reader DB instance
You must set only one reader DB instance for cluster cache management. To do so, choose a
reader from the Aurora PostgreSQL cluster that is the same instance class and size as the writer
DB instance. For example, if the writer uses db.r5.xlarge
, choose a reader that
uses this same instance class type and size. Then set its promotion tier priority to 0.
The promotion tier priority is a value that specifies the order in which an Aurora reader is promoted to the writer DB instance after a failure. Valid values are 0–15, where 0 is the first priority and 15 is the last priority.
To set the promotion priority of the reader DB instance to tier-0
Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/
. -
In the navigation pane, choose Databases.
-
Choose a Reader DB instance of the Aurora PostgreSQL DB cluster that is the same instance class as the writer DB instance.
-
Choose Modify. The Modify DB Instance page appears.
-
On the Additional configuration panel, choose tier-0 for the Failover priority.
-
Choose Continue and check the summary of modifications.
-
To apply the changes immediately after you save them, choose Apply immediately.
-
Choose Modify DB Instance to save your changes.
To set the promotion tier priority to 0 for the reader DB instance using the AWS CLI, call the modify-db-instance command with the following required parameters:
-
--db-instance-identifier
-
--promotion-tier
-
--apply-immediately
Example
For Linux, macOS, or Unix:
aws rds modify-db-instance \ --db-instance-identifier
reader-db-instance
\ --promotion-tier 0 \ --apply-immediately
For Windows:
aws rds modify-db-instance ^ --db-instance-identifier
reader-db-instance
^ ---promotion-tier 0 ^ --apply-immediately
Monitoring the buffer cache
After setting up cluster cache management, you can monitor the state of
synchronization between the writer DB instance's buffer cache and the designated
reader's warm buffer cache. To examine the buffer cache contents on both the writer DB
instance and the designated reader DB instance, use the PostgreSQL
pg_buffercache
module. For more information, see the PostgreSQL
pg_buffercache
documentation
Using the aurora_ccm_status
Function
Cluster cache management also provides the aurora_ccm_status
function. Use the aurora_ccm_status
function on the writer DB instance to
get the following information about the progress of cache warming on the designated
reader:
-
buffers_sent_last_minute
– How many buffers have been sent to the designated reader in the last minute. buffers_found_last_minute
– The number of frequently accessed buffers identified during the past minute.-
buffers_sent_last_scan
– How many buffers have been sent to the designated reader during the last complete scan of the buffer cache. -
buffers_found_last_scan
– How many buffers have been identified as frequently accessed and needed to be sent during the last complete scan of the buffer cache. Buffers already cached on the designated reader aren't sent. -
buffers_sent_current_scan
– How many buffers have been sent so far during the current scan. -
buffers_found_current_scan
– How many buffers have been identified as frequently accessed in the current scan. -
current_scan_progress
– How many buffers have been visited so far during the current scan.
The following example shows how to use the aurora_ccm_status
function to
convert some of its output into a warm rate and warm percentage.
SELECT buffers_sent_last_minute*8/60 AS warm_rate_kbps, 100*(1.0-buffers_sent_last_scan::float/buffers_found_last_scan) AS warm_percent FROM aurora_ccm_status();
Troubleshooting CCM configuration
When you enable apg_ccm_enabled
cluster parameter, cluster cache management is automatically turned on at the
instance level on the writer DB instance and one reader DB instance on the Aurora PostgreSQL DB cluster. The writer and reader instance
should use the same instance class type and size. Their promotion tier priority is set to 0
. Other reader instances in the DB cluster
should have a non-zero promotion tier and cluster cache management is disabled for those instances.
The following reasons may lead to issues in the configuration and disable cluster cache management:
When there is no single reader DB instance set to promotion tier 0.
When the writer DB instance is not set to promotion tier 0.
When more than one reader DB instances are set to promotion tier 0.
When the writer and one reader DB instances with promotion tier 0 doesn't have the same instance size.