Working with PostgreSQL autovacuum on Amazon RDS for PostgreSQL
We strongly recommend that you use the autovacuum feature to maintain the health of your PostgreSQL DB instance. Autovacuum automates the start of the VACUUM and the ANALYZE commands. It checks for tables with a large number of inserted, updated, or deleted tuples. After this check, it reclaims storage by removing obsolete data or tuples from the PostgreSQL database.
By default, autovacuum is turned on for the Amazon RDS for PostgreSQL DB instances that you create
using any of the default PostgreSQL DB parameter groups. These include
default.postgres10
, default.postgres11
, and so on. All default
PostgreSQL DB parameter groups have an rds.adaptive_autovacuum
parameter
that's set to 1
, thus activating the feature. Other configuration parameters
associated with the autovacuum feature are also set by default. Because these defaults are
somewhat generic, you can benefit from tuning some of the parameters associated with the
autovacuum feature for your specific workload.
Following, you can find more information about the autovacuum and how to tune some of its parameters on your RDS for PostgreSQL DB instance. For high-level information, see Best practices for working with PostgreSQL.
Topics
- Allocating memory for autovacuum
- Reducing the likelihood of transaction ID wraparound
- Determining if the tables in your database need vacuuming
- Determining which tables are currently eligible for autovacuum
- Determining if autovacuum is currently running and for how long
- Performing a manual vacuum freeze
- Reindexing a table when autovacuum is running
- Managing autovacuum with large indexes
- Other parameters that affect autovacuum
- Setting table-level autovacuum parameters
- Logging autovacuum and vacuum activities
- Understanding the behavior of autovacuum with invalid databases
- Identify and resolve aggressive vacuum blockers in RDS for PostgreSQL
Allocating memory for autovacuum
One of the most important parameters influencing autovacuum performance is the autovacuum_work_mem
autovacuum_work_mem
parameter is set to -1, indicating that the
setting of maintenance_work_mem
is used instead. For all other versions,
autovacuum_work_mem
is determined by GREATEST({DBInstanceClassMemory/32768},
65536).
Manual vacuum operations always use the maintenance_work_mem
setting, with a
default setting of GREATEST({DBInstanceClassMemory/63963136*1024}, 65536), and it can also be
adjusted at the session level using the SET
command for more targeted manual
VACUUM
operations.
The autovacuum_work_mem
determines memory for autovacuum to hold identifiers
of dead tuples (pg_stat_all_tables.n_dead_tup
) for vacuuming indexes.
When doing calculations to determine the autovacuum_work_mem
parameter's
value, be aware of the following:
-
If you set the parameter too low, the vacuum process might have to scan the table multiple times to complete its work. Such multiple scans can have a negative impact on performance. For larger instances, setting
maintenance_work_mem
orautovacuum_work_mem
to at least 1 GB can improve the performance of vacuuming tables with a high number of dead tuples. However, in PostgreSQL versions 16 and prior, vacuum’s memory usage is capped at 1 GB, which is sufficient to process approximately 179 million dead tuples in a single pass. If a table has more dead tuples than this, vacuum will need to make multiple passes through the table's indexes, significantly increasing the time required. Starting with PostgreSQL version 17, there isn't a limit of 1 GB, and autovacuum can process more than 179 million tuples by using radix trees.A tuple identifier is 6 bytes in size. To estimate the memory needed for vacuuming an index of a table, query
pg_stat_all_tables.n_dead_tup
to find the number of dead tuples, then multiply this number by 6 to determine the memory required for vacuuming the index in a single pass. You may use the following query:SELECT relname AS table_name, n_dead_tup, pg_size_pretty(n_dead_tup * 6) AS estimated_memory FROM pg_stat_all_tables WHERE relname = '
name_of_the_table
'; -
The
autovacuum_work_mem
parameter works in conjunction with theautovacuum_max_workers
parameter. Each worker amongautovacuum_max_workers
can use the memory that you allocate. If you have many small tables, allocate moreautovacuum_max_workers
and lessautovacuum_work_mem
. If you have large tables (larger than 100 GB), allocate more memory and fewer worker processes. You need to have enough memory allocated to succeed on your biggest table. Thus, make sure that the combination of worker processes and memory equals the total memory that you want to allocate.
Reducing the likelihood of transaction ID wraparound
In some cases, parameter group settings related to autovacuum might not be aggressive
enough to prevent transaction ID wraparound. To address this, RDS for PostgreSQL provides a
mechanism that adapts the autovacuum parameter values automatically. Adaptive
autovacuum parameter tuning is a feature for RDS for PostgreSQL. A detailed
explanation of TransactionID wraparound
Adaptive autovacuum parameter tuning is turned on by default for RDS for PostgreSQL instances
with the dynamic parameter rds.adaptive_autovacuum
set to ON. We strongly
recommend that you keep this turned on. However, to turn off adaptive autovacuum parameter
tuning, set the rds.adaptive_autovacuum
parameter to 0 or OFF.
Transaction ID wraparound is still possible even when Amazon RDS tunes the autovacuum
parameters. We encourage you to implement an Amazon CloudWatch alarm for transaction ID wraparound. For
more information, see the post Implement an early warning system for transaction ID wraparound in RDS for PostgreSQL
With adaptive autovacuum parameter tuning turned on, Amazon RDS begins adjusting autovacuum
parameters when the CloudWatch metric MaximumUsedTransactionIDs
reaches the value of
the autovacuum_freeze_max_age
parameter or 500,000,000, whichever is greater.
Amazon RDS continues to adjust parameters for autovacuum if a table continues to trend toward transaction ID wraparound. Each of these adjustments dedicates more resources to autovacuum to avoid wraparound. Amazon RDS updates the following autovacuum-related parameters:
RDS modifies these parameters only if the new value makes autovacuum more aggressive. The
parameters are modified in memory on the DB instance. The values in the parameter group
aren't changed. To view the current in-memory settings, use the PostgreSQL SHOW
When Amazon RDS modifies any of these autovacuum parameters, it generates an event for the
affected DB instance. This event is visible on the AWS Management Console and through the Amazon RDS API. After
the MaximumUsedTransactionIDs
CloudWatch metric returns below the threshold, Amazon RDS
resets the autovacuum-related parameters in memory back to the values specified in the
parameter group. It then generates another event corresponding to this change.