Using pgactive to support active-active replication
The pgactive
extension uses active-active replication to support and coordinate write operations on multiple RDS for PostgreSQL databases.
Amazon RDS for PostgreSQL supports the pgactive
extension on the following versions:
-
RDS for PostgreSQL 16.1 and higher 16 versions
-
RDS for PostgreSQL 15.4-R2 and higher 15 versions
-
RDS for PostgreSQL 14.10 and higher 14 versions
-
RDS for PostgreSQL 13.13 and higher 13 versions
-
RDS for PostgreSQL 12.17 and higher 12 versions
-
RDS for PostgreSQL 11.22
Note
When there are write operations on more than one database in a replication configuration, conflicts are possible. For more information, see Handling conflicts in active-active replication
Initializing the pgactive extension capability
To initialize the pgactive
extension capability on your RDS for PostgreSQL DB instance, set the value of the rds.enable_pgactive
parameter to 1
and then create the extension in the database. Doing so automatically turns on the parameters
rds.logical_replication
and track_commit_timestamp
and sets the value of wal_level
to logical
.
You must have permissions as the rds_superuser
role to perform these tasks.
You can use the AWS Management Console or the AWS CLI to create the required RDS for PostgreSQL DB instances. The steps following assume that your RDS for PostgreSQL DB instance is associated with a custom DB parameter group. For information about creating a custom DB parameter group, see Parameter groups for Amazon RDS.
To initialize the pgactive extension capability
Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/
. -
In the navigation pane, choose your RDS for PostgreSQL DB instance.
-
Open the Configuration tab for your RDS for PostgreSQL DB instance. In the instance details, find the DB instance parameter group link.
-
Choose the link to open the custom parameters associated with your RDS for PostgreSQL DB instance.
Find the
rds.enable_pgactive
parameter, and set it to1
to initialize thepgactive
capability.Choose Save changes.
In the navigation pane of the Amazon RDS console, choose Databases.
Select your RDS for PostgreSQL DB instance, and then choose Reboot from the Actions menu.
Confirm the DB instance reboot so that your changes take effect.
When the DB instance is available, you can use
psql
or any other PostgreSQL client to connect to the RDS for PostgreSQL DB instance.The following example assumes that your RDS for PostgreSQL DB instance has a default database named
postgres
.psql --host=
mydb.111122223333
.aws-region
.rds.amazonaws.com --port=5432 --username=master username
--password --dbname=postgres
To verify that pgactive is initialized, run the following command.
postgres=>
SELECT setting ~ 'pgactive' FROM pg_catalog.pg_settings WHERE name = 'shared_preload_libraries';
If
pgactive
is inshared_preload_libraries
, the preceding command will return the following:?column? ---------- t
Create the extension, as follows.
postgres=>
CREATE EXTENSION pgactive;
To initialize the pgactive extension capability
To initialize the pgactive
using the AWS CLI, call the
modify-db-parameter-group operation
to modify certain parameters in your
custom parameter group as shown in the following procedure.
Use the following AWS CLI command to set
rds.enable_pgactive
to1
to initialize thepgactive
capability for the RDS for PostgreSQL DB instance.postgres=>
aws rds modify-db-parameter-group \ --db-parameter-group-namecustom-param-group-name
\ --parameters "ParameterName=rds.enable_pgactive,ParameterValue=1,ApplyMethod=pending-reboot" \ --regionaws-region
-
Use the following AWS CLI command to reboot the RDS for PostgreSQL DB instance so that the
pgactive
library is initialized.aws rds reboot-db-instance \ --db-instance-identifier
your-instance
\ --regionaws-region
When the instance is available, use
psql
to connect to the RDS for PostgreSQL DB instance.psql --host=
mydb.111122223333
.aws-region
.rds.amazonaws.com --port=5432 --username=master user
--password --dbname=postgres
Create the extension, as follows.
postgres=>
CREATE EXTENSION pgactive;
Setting up active-active replication for RDS for PostgreSQL DB instances
The following procedure shows you how to start active-active replication between two RDS for PostgreSQL DB instances running PostgreSQL 15.4 or higher in the same region. To run the multi-region high availability example, you need to deploy Amazon RDS for PostgreSQL instances in two different regions and set up VPC Peering. For more information, see VPC peering.
Note
Sending traffic between multiple regions may incur additional costs.
These steps assume that the RDS for PostgreSQL DB instance has been setup with the pgactive
extension. For more information, see
Initializing the pgactive extension capability.
To configure the first RDS for PostgreSQL DB instance with the pgactive
extension
The following example illustrates how the pgactive
group is created, along with other steps required to create the pgactive
extension on the RDS for PostgreSQL DB instance.
Use
psql
or another client tool to connect to your first RDS for PostgreSQL DB instance.psql --host=
firstinstance.111122223333
.aws-region
.rds.amazonaws.com --port=5432 --username=master username
--password --dbname=postgres
Create a database on the RDS for PostgreSQL instance using the following command:
postgres=>
CREATE DATABASEapp
;Switch connection to the new database using the following command:
\c
app
To check if the
shared_preload_libraries
parameter containspgactive
, run the following command:app=>
SELECT setting ~ 'pgactive' FROM pg_catalog.pg_settings WHERE name = 'shared_preload_libraries';?column? ---------- t
-
Create and populate a sample table using the following SQL statements:
Create an example table using the following SQL statement.
app=>
CREATE SCHEMA inventory; CREATE TABLE inventory.products ( id int PRIMARY KEY, product_name text NOT NULL, created_at timestamptz NOT NULL DEFAULT CURRENT_TIMESTAMP);Populate the table with some sample data by using the following SQL statement.
app=>
INSERT INTO inventory.products (id, product_name) VALUES (1, 'soap'), (2, 'shampoo'), (3, 'conditioner');Verify that data exists in the table by using the following SQL statement.
app=>
SELECT count(*) FROM inventory.products;count ------- 3
Create
pgactive
extension on the existing database.app=>
CREATE EXTENSION pgactive;Create and initialize the pgactive group using the following commands:
app=>
SELECT pgactive.pgactive_create_group( node_name :='node1-app'
, node_dsn := 'dbname=app
host=firstinstance.111122223333
.aws-region
.rds.amazonaws.com user=master username
password=PASSWORD
');node1-app is the name that you assign to uniquely identify a node in the
pgactive
group.Note
To perform this step successfully on a DB instance that is publicly accessible, you must turn on the
rds.custom_dns_resolution
parameter by setting it to1
.To check if the DB instance is ready, use the following command:
app=>
SELECT pgactive.pgactive_wait_for_node_ready();If the command succeeds, you can see the following output:
pgactive_wait_for_node_ready ------------------------------ (1 row)
To configure the second RDS for PostgreSQL instance and join it to the pgactive
group
The following example illustrates how you can join an RDS for PostgreSQL DB instance to the pgactive
group, along with other steps that are required to create the pgactive
extension on the DB instance.
These steps assume that another
RDS for PostgreSQL DB instances has been set up with the
pgactive
extension. For more information, see Initializing the pgactive extension capability.
Use
psql
to connect to the instance that you want to receive updates from the publisher.psql --host=
secondinstance.111122223333
.aws-region
.rds.amazonaws.com --port=5432 --username=master username
--password --dbname=postgres
Create a database on the second RDS for PostgreSQL DB instance using the following command:
postgres=>
CREATE DATABASEapp
;Switch connection to the new database using the following command:
\c
app
Create the
pgactive
extension on the existing database.app=>
CREATE EXTENSION pgactive;Join the RDS for PostgreSQL second DB instance to the
pgactive
group as follows.app=>
SELECT pgactive.pgactive_join_group( node_name :='node2-app'
, node_dsn := 'dbname=app
host=secondinstance.111122223333
.aws-region
.rds.amazonaws.com user=master username
password=PASSWORD
', join_using_dsn := 'dbname=app
host=firstinstance.111122223333
.aws-region
.rds.amazonaws.com user=postgres
password=PASSWORD
');node2-app is the name that you assign to uniquely identify a node in the
pgactive
group.To check if the DB instance is ready, use the following command:
app=>
SELECT pgactive.pgactive_wait_for_node_ready();If the command succeeds, you can see the following output:
pgactive_wait_for_node_ready ------------------------------ (1 row)
If the first RDS for PostgreSQL database is relatively large, you can see
pgactive.pgactive_wait_for_node_ready()
emitting the progress report of the restore operation. The output looks similar to the following:NOTICE: restoring database 'app', 6% of 7483 MB complete NOTICE: restoring database 'app', 42% of 7483 MB complete NOTICE: restoring database 'app', 77% of 7483 MB complete NOTICE: restoring database 'app', 98% of 7483 MB complete NOTICE: successfully restored database 'app' from node node1-app in 00:04:12.274956 pgactive_wait_for_node_ready ------------------------------ (1 row)
From this point forward,
pgactive
synchronizes the data between the two DB instances.You can use the following command to verify if the database of the second DB instance has the data:
app=>
SELECT count(*) FROM inventory.products;If the data is successfully synchronized, you’ll see the following output:
count ------- 3
Run the following command to insert new values:
app=>
INSERT INTO inventory.products (id, product_name) VALUES ('lotion');Connect to the database of the first DB instance and run the following query:
app=>
SELECT count(*) FROM inventory.products;If the active-active replication is initialized, the output is similar to the following:
count ------- 4
To detach and remove a DB instance from the pgactive
group
You can detach and remove a DB instance from the pgactive
group using these steps:
You can detach the second DB instance from the first DB instance using the following command:
app=>
SELECT * FROM pgactive.pgactive_detach_nodes(ARRAY[‘node2-app
']);Remove the
pgactive
extension from the second DB instance using the following command:app=>
SELECT * FROM pgactive.pgactive_remove();To forcefully remove the extension:
app=>
SELECT * FROM pgactive.pgactive_remove(true);Drop the extension using the following command:
app=>
DROP EXTENSION pgactive;
Handling conflicts in active-active replication
The pgactive
extension works per database and not per cluster. Each DB instance that uses pgactive
is an independent instance and can accept
data changes from any source. When a change is sent to a DB instance, PostgreSQL commits it locally and then uses pgactive
to replicate the
change asynchronously to other DB instances. When two PostgreSQL DB instances update the same record at nearly the same time, a conflict can occur.
The pgactive
extension provides mechanisms for conflict detection and automatic resolution. It tracks the time stamp when the transaction
was committed on both the DB instances and automatically applies the change with the latest time stamp. The pgactive
extension also logs when a conflict occurs in the
pgactive.pgactive_conflict_history
table.
The pgactive.pgactive_conflict_history
will keep growing. You may want to define a purging policy. This can be done by deleting some records on a regular basis or defining
a partitioning scheme for this relation (and later detach, drop, truncate partitions of interest). To implement
the purging policy on a regular basis, one option is to use the pg_cron
extension. See the following information
of an example for the pg_cron
history table, Scheduling maintenance with the PostgreSQL pg_cron extension.
Handling sequences in active-active replication
An RDS for PostgreSQL DB instance with the pgactive
extension uses two different sequence mechanisms to generate unique values.
Global Sequences
To use a global sequence, create a local sequence with the CREATE SEQUENCE
statement. Use pgactive.pgactive_snowflake_id_nextval(seqname)
instead of usingnextval(seqname)
to get the next unique value of the sequence.
The following example creates a global sequence:
postgres=>
CREATE TABLE gstest ( id bigint primary key, parrot text );
postgres=>
CREATE SEQUENCE gstest_id_seq OWNED BY gstest.id;
postgres=>
ALTER TABLE gstest \ ALTER COLUMN id SET DEFAULT \ pgactive.pgactive_snowflake_id_nextval('gstest_id_seq');
Partitioned sequences
In split-step or partitioned sequences, a normal PostgreSQL sequence is used on each node. Each sequence increments by the same amount and starts at different offsets. For example, with step 100, the node 1 generates sequence as 101, 201, 301, and so on and the node 2 generates sequence as 102, 202, 302, and so on. This scheme works well even if the nodes can't communicate for extended periods, but requires that the designer specify a maximum number of nodes when establishing the schema and requires per-node configuration. Mistakes can easily lead to overlapping sequences.
It is relatively simple to configure this approach with pgactive
by creating the desired sequence on a node as follows:
CREATE TABLE some_table (generated_value bigint primary key);
postgres=>
CREATE SEQUENCE some_seq INCREMENT 100 OWNED BY some_table.generated_value;
postgres=>
ALTER TABLE some_table ALTER COLUMN generated_value SET DEFAULT nextval('some_seq');
Then call setval
on each node to give a different offset starting value as follows.
postgres=>
-- On node 1 SELECT setval('some_seq', 1); -- On node 2 SELECT setval('some_seq', 2);
Parameter reference for the pgactive extension
You can use the following query to view all the parameters associated with pgactive
extension.
postgres=>
SELECT * FROM pg_settings WHERE name LIKE 'pgactive.%';
Measuring replication lag among pgactive members
You can use the following query to view the replication lag among the pgactive
members. Run this query
on every pgactive
node to get the full picture.
postgres=# SELECT *, (last_applied_xact_at - last_applied_xact_committs) AS lag FROM pgactive.pgactive_node_slots;
-{ RECORD 1 ]----------------+-----------------------------------------------------------------
node_name | node2-app
slot_name | pgactive_5_7332551165694385385_0_5__
slot_restart_lsn | 0/1A898A8
slot_confirmed_lsn | 0/1A898E0
walsender_active | t
walsender_pid | 69022
sent_lsn | 0/1A898E0
write_lsn | 0/1A898E0
flush_lsn | 0/1A898E0
replay_lsn | 0/1A898E0
last_sent_xact_id | 746
last_sent_xact_committs | 2024-02-06 18:04:22.430376+00
last_sent_xact_at | 2024-02-06 18:04:22.431359+00
last_applied_xact_id | 746
last_applied_xact_committs | 2024-02-06 18:04:22.430376+00
last_applied_xact_at | 2024-02-06 18:04:52.452465+00
lag | 00:00:30.022089
Limitations for the pgactive extension
All tables require a Primary Key, otherwise Update's and Delete's aren't allowed. The values in the Primary Key column shouldn't be updated.
Sequences may have gaps and sometimes might not follow an order. Sequences are not replicated. For more information, see Handling sequences in active-active replication.
DDL and large objects are not replicated.
Secondary unique indexes can cause data divergence.
Collation needs to be identical on all node in the group.
Load balancing across nodes is an anti-pattern.
Large transactions can cause replication lag.