Removing bloat with pg_repack
You can use the pg_repack extension to remove table and index bloat with minimal database locking. You can create this extension in the database instance and run the pg_repack client (where the client version matches the extension version) from Amazon Elastic Compute Cloud (Amazon EC2) or from a computer that can connect to your database. Unlike VACUUM FULL, pg_repack doesn't require downtime or a maintenance window, and won't block other sessions.
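Because the client version must match the extension version, it can help to compare the two before a run. The following is a minimal sketch, not an official procedure. The function names `client_version` and `versions_match` are hypothetical, as are the `HOST` and `DB` variables, and it assumes that `pg_repack --version` prints the version number as the last word of its output.

```shell
# Extract the version number from text such as "pg_repack 1.4.8".
# Argument: the text printed by `pg_repack --version` (assumed format).
client_version() {
  printf '%s\n' "$1" | awk '{ print $NF }'
}

# Succeed only when the two version strings are identical.
versions_match() {
  [ "$1" = "$2" ]
}

# Example wiring against a real database (HOST and DB are placeholders):
#   psql -h "$HOST" -d "$DB" -c 'CREATE EXTENSION IF NOT EXISTS pg_repack;'
#   ext=$(psql -h "$HOST" -d "$DB" -At -c \
#         "SELECT extversion FROM pg_extension WHERE extname = 'pg_repack';")
#   cli=$(client_version "$(pg_repack --version)")
#   versions_match "$cli" "$ext" || echo "client $cli != extension $ext" >&2
```

The commented lines show one way to feed the helpers: `extversion` in the `pg_extension` catalog holds the installed extension version, and the comparison is a plain string equality check.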
pg_repack is helpful in situations where VACUUM FULL, CLUSTER, or REINDEX might not work. It creates a new table that contains the data of the bloated table, tracks the changes from the original table, and then replaces the original table with the new one. It doesn't lock the original table for read or write operations while it's building the new table.
You can use pg_repack for a full table or for an index. To see a list of tasks, see the pg_repack documentation.
Limitations:
- To run pg_repack, your table must have a primary key or a unique index.
- pg_repack won't work with temporary tables.
- pg_repack won't work on tables that have global indexes.
- When pg_repack is in progress, you can't perform DDL operations on tables.
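To check the first limitation before a run, you can list the tables that have neither a primary key nor a unique index. This is a minimal sketch under stated assumptions: the variable `NO_KEY_SQL` and the function `list_tables_without_key` are hypothetical names, and the query follows the wording above, so a table whose only unique index covers nullable columns might pass this check and still be rejected by pg_repack.

```shell
# SQL that lists ordinary user tables with no primary key and no unique index.
NO_KEY_SQL="
SELECT c.oid::regclass AS table_name
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
  AND NOT EXISTS (
        SELECT 1
        FROM pg_index i
        WHERE i.indrelid = c.oid
          AND (i.indisprimary OR i.indisunique)
      )
ORDER BY 1;
"

# Hypothetical wrapper. Arguments: HOST, DBNAME.
# Prints one table name per line; no output means every table qualifies.
list_tables_without_key() {
  psql -h "$1" -d "$2" -At -c "$NO_KEY_SQL"
}
```

For example, `list_tables_without_key <host> <dbname>` prints the tables that would need a key added before pg_repack can process them.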
The following table describes the differences between pg_repack and VACUUM FULL.
| VACUUM FULL | pg_repack |
| --- | --- |
| Built-in command | An extension that you run from Amazon EC2 or your local computer |
| Requires an ACCESS EXCLUSIVE lock on the table while it runs | Requires an ACCESS EXCLUSIVE lock only for short periods |
| Works with all tables | Works on tables that have primary and unique keys only |
| Requires double the storage that's consumed by the table and indexes | Requires double the storage that's consumed by the table and indexes |
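The storage row of this comparison can be turned into a simple pre-flight check. The following is a minimal sketch, assuming that you obtain the free storage in bytes yourself (for example, from your instance monitoring). The function names `table_size_bytes` and `has_headroom` are hypothetical, and the factor of 2 is a cautious reading of "double the storage", not a documented threshold. `pg_total_relation_size` is a built-in PostgreSQL function that returns the size of a table including its indexes and TOAST data.

```shell
# Hypothetical helper: total on-disk bytes of a table, indexes and TOAST included.
# Arguments: HOST, DBNAME, TABLE (schema-qualified name from a trusted source).
table_size_bytes() {
  psql -h "$1" -d "$2" -At -c "SELECT pg_total_relation_size('$3'::regclass);"
}

# Succeed when FREE_BYTES is at least twice TABLE_BYTES.
# The factor of 2 is an assumption based on the "double the storage" row.
has_headroom() {
  free_bytes=$1
  table_bytes=$2
  [ "$free_bytes" -ge $(( table_bytes * 2 )) ]
}

# Example wiring (FREE is a placeholder for the free storage in bytes):
#   size=$(table_size_bytes <host> <dbname> <tablename>)
#   has_headroom "$FREE" "$size" || echo "not enough free storage" >&2
```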
To run pg_repack on a table, use the command:
pg_repack -h <host> -d <dbname> --table <tablename> -k
To run pg_repack on an index, use the command:
pg_repack -h <host> -d <dbname> --index <index name>
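The two commands differ only in the target option, so a small wrapper can build either one and optionally preview it first. This is a sketch, not part of pg_repack: `build_repack_cmd` is a hypothetical function, and the host, database, and object names in the usage lines are made up. It adds `-k` (`--no-superuser-check`) to both forms on the assumption that the connecting user isn't a superuser, and it can append `--dry-run`, which makes pg_repack report what it would repack without doing the work.

```shell
# Print (do not run) a pg_repack command line for a table or an index.
# Arguments: MODE (table|index), HOST, DBNAME, TARGET, optional "dry".
build_repack_cmd() {
  mode=$1; host=$2; db=$3; target=$4; dry=${5:-}
  case "$mode" in
    table) opt="--table" ;;
    index) opt="--index" ;;
    *) echo "mode must be 'table' or 'index'" >&2; return 1 ;;
  esac
  # -k skips the superuser check (assumption: the user is not a superuser).
  cmd="pg_repack -h $host -d $db $opt $target -k"
  if [ "$dry" = "dry" ]; then
    cmd="$cmd --dry-run"
  fi
  printf '%s\n' "$cmd"
}

# Usage with made-up names: preview a table repack, then build an index repack.
build_repack_cmd table db.example.internal appdb public.orders dry
build_repack_cmd index db.example.internal appdb public.orders_pkey
```

Printing the command instead of running it keeps the sketch safe to try; review the output, then paste it into a shell when you are ready.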
For more information, see the AWS blog post Remove bloat from Amazon Aurora and RDS for PostgreSQL with pg_repack.