IO:WALWrite
Supported engine versions
This wait event information is supported for all versions of RDS for PostgreSQL 10 and higher.
Context
Activity in the database that's generating write-ahead log data fills up the
WAL buffers first and then writes to disk, asynchronously. The wait event IO:WALWrite
is generated
when the SQL session is waiting for the WAL data to complete writing to disk so that it can
release the transaction's COMMIT call.
Likely causes of increased waits
If this wait event occurs often, you should review your workload and the type of updates that your workload performs and their frequency. In particular, look for the following types of activity.
- Heavy DML activity
-
Changing data in database tables doesn't happen instantaneously. An insert to one table might need to wait for an insert or an update to the same table from another client. The data manipulation language (DML) statements for changing data values (INSERT, UPDATE, DELETE, COMMIT, ROLLBACK TRANSACTION) can result in contention that causes the write-ahead logfile to be waiting for the buffers to be flushed. This situation is captured in the following Amazon RDS Performance Insights metrics that indicate heavy DML activity.
tup_inserted
tup_updated
tup_deleted
xact_rollback
xact_commit
For more information about these metrics, see Performance Insights counters for Amazon RDS for PostgreSQL.
- Frequent checkpoint activity
-
Frequent checkpoints contribute to a higher number of WAL files. In RDS for PostgreSQL, full page writes are always "on." Full page writes help protect against data loss. However, when checkpointing occurs too frequently, the system can suffer overall performance issues. This is especially true on systems with heavy DML activity. In some cases, you might find error messages in your
postgresql.log
stating that “checkpoints are occurring too frequently."We recommend that when tuning checkpoints, you carefully balance performance against expected time need to recover in the event of an abnormal shutdown.
Actions
We recommend the following actions to reduce the numbers of this wait event.
Reduce the number of commits
To reduce the number of commits, you can combine statements
into transaction blocks. Use Amazon RDS Performance Insights to examine
the type of queries being run. You can also move
large maintenance operations to off-peak hours. For example,
create indexes or use pg_repack
operations during non-production hours.
Monitor your checkpoints
There are two parameters that you can monitor to see how frequently your RDS for PostgreSQL DB instance is writing to the WAL file for checkpoints.
log_checkpoints
– This parameter is set to "on" by default. It causes a message to get sent to the PostgreSQL log for each checkpoint. These log messages include the number of buffers written, the time spent writing them, and the number of WAL files added, removed, or recycled for the given checkpoint.For more information about this parameter, see Error Reporting and Logging
in the PostgreSQL documentation. checkpoint_warning
– This parameter sets a threshold value (in seconds) for checkpoint frequency above which a warning is generated. By default, this parameter isn't set in RDS for PostgreSQL. You can set the value of this parameter to get a warning when the database changes in your RDS for PostgreSQL DB instance are written at a rate for which the WAL files are not sized to handle. For example, say you set this parameter to 30. If your RDS for PostgreSQL instance needs to write changes more often than every 30 seconds, the warning that "checkpoints are occurring too frequently" is sent to the PostgreSQL log. This can indicate that yourmax_wal_size
value should be increased.For more information, see Write Ahead Log
in the PostgreSQL documentation.
Scale up IO
This type of input/output (IO) wait event can remediated by scaling the input/output operations per second (IOPs) to provide faster IO. Scaling IO is preferable to scaling CPU, because scaling CPU can result in even more IO contention because the increased CPU can handle more work and thus make the IO bottleneck even worse. In general, we recommend that you consider tuning your workload before performing scaling operations.
Dedicated log volume (DLV)
You can use a dedicated log volume (DLV) for a DB instance that uses Provisioned IOPS (PIOPS) storage by using the Amazon RDS console, AWS CLI, or Amazon RDS API. A DLV moves PostgreSQL database transaction logs to a storage volume that's separate from the volume containing the database tables. For more information, see Dedicated log volume (DLV).