

# Performance and scaling for Amazon Aurora PostgreSQL
<a name="AuroraPostgreSQL.Managing"></a>

The following sections discuss managing performance and scaling for an Amazon Aurora PostgreSQL DB cluster. They also include information about other maintenance tasks.

**Topics**
+ [Scaling Aurora PostgreSQL DB instances](#AuroraPostgreSQL.Managing.Performance.InstanceScaling)
+ [Maximum connections to an Aurora PostgreSQL DB instance](#AuroraPostgreSQL.Managing.MaxConnections)
+ [Temporary storage limits for Aurora PostgreSQL](#AuroraPostgreSQL.Managing.TempStorage)
+ [Huge pages for Aurora PostgreSQL](#AuroraPostgreSQL.Managing.HugePages)
+ [Testing Amazon Aurora PostgreSQL by using fault injection queries](AuroraPostgreSQL.Managing.FaultInjectionQueries.md)
+ [Displaying volume status for an Aurora PostgreSQL DB cluster](AuroraPostgreSQL.Managing.VolumeStatus.md)
+ [Specifying the RAM disk for the stats\_temp\_directory](AuroraPostgreSQL.Managing.RamDisk.md)
+ [Managing temporary files with PostgreSQL](PostgreSQL.ManagingTempFiles.md)

## Scaling Aurora PostgreSQL DB instances
<a name="AuroraPostgreSQL.Managing.Performance.InstanceScaling"></a>

You can scale Aurora PostgreSQL DB instances in two ways: instance scaling and read scaling. For more information about read scaling, see [Read scaling](Aurora.Managing.Performance.md#Aurora.Managing.Performance.ReadScaling).

You can scale your Aurora PostgreSQL DB cluster by modifying the DB instance class for each DB instance in the DB cluster. Aurora PostgreSQL supports several DB instance classes optimized for Aurora. Don't use db.t2 or db.t3 instance classes for Aurora clusters larger than 40 terabytes (TB).

**Note**  
We recommend using the T DB instance classes only for development and test servers, or other non-production servers. For more details on the T instance classes, see [DB instance class types](Concepts.DBInstanceClass.Types.md).

Scaling isn't instantaneous. It can take 15 minutes or more to complete the change to a different DB instance class. If you use this approach to modify the DB instance class, apply the change during the next scheduled maintenance window (rather than immediately) to avoid affecting your users.

As an alternative to modifying the DB instance class directly, you can minimize downtime by using the high availability features of Amazon Aurora. First, add an Aurora Replica to your cluster. When creating the replica, choose the DB instance class size that you want to use for your cluster. When the Aurora Replica is synchronized with the cluster, fail over to the newly added replica. To learn more, see [Aurora Replicas](Aurora.Replication.md#Aurora.Replication.Replicas) and [Fast failover with Amazon Aurora PostgreSQL](AuroraPostgreSQL.BestPractices.FastFailover.md).

For detailed specifications of the DB instance classes supported by Aurora PostgreSQL, see [Supported DB engines for DB instance classes](Concepts.DBInstanceClass.SupportAurora.md).

## Maximum connections to an Aurora PostgreSQL DB instance
<a name="AuroraPostgreSQL.Managing.MaxConnections"></a>

An Aurora PostgreSQL DB cluster allocates resources based on the DB instance class and its available memory. Each connection to the DB cluster consumes incremental amounts of these resources, such as memory and CPU. Memory consumed per connection varies based on query type, count, and whether temporary tables are used. Even an idle connection consumes memory and CPU. That's because when queries run on a connection, more memory is allocated for each query and it's not released completely, even when processing stops. Thus, we recommend that you make sure your applications aren't holding on to idle connections: each one of these wastes resources and affects performance negatively. For more information, see [Resources consumed by idle PostgreSQL connections](https://aws.amazon.com/blogs/database/resources-consumed-by-idle-postgresql-connections/). 

The maximum number of connections allowed by an Aurora PostgreSQL DB instance is determined by the `max_connections` parameter value specified in the parameter group for that DB instance. The ideal setting for the `max_connections` parameter is one that supports all the client connections your application needs, without an excess of unused connections, plus at least 3 more connections to support AWS automation. Before modifying the `max_connections` parameter setting, we recommend that you consider the following:
+ If the `max_connections` value is too low, the Aurora PostgreSQL DB instance might not have sufficient connections available when clients attempt to connect. If this happens, attempts to connect using `psql` raise error messages such as the following: 

  ```
  psql: FATAL: remaining connection slots are reserved for non-replication superuser connections
  ```
+ If the `max_connections` value exceeds the number of connections that are actually needed, the unused connections can cause performance to degrade.

The default value of `max_connections` is derived from the following Aurora PostgreSQL `LEAST` function:

`LEAST({DBInstanceClassMemory/9531392},5000)`.
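
You can estimate the default with a short script. The following is a rough sketch, not an AWS-provided tool, and note that `DBInstanceClassMemory` is somewhat less than the instance class's nominal RAM, so actual defaults are slightly lower than this estimate suggests.

```python
# Estimate the default max_connections produced by the LEAST formula above.
# Hypothetical helper for illustration only.
def default_max_connections(db_instance_class_memory_bytes: int) -> int:
    return min(db_instance_class_memory_bytes // 9531392, 5000)

# A db.r5.large nominally has 16 GiB of RAM.
print(default_max_connections(16 * 1024**3))   # 1802
# Large instance classes hit the 5000 cap.
print(default_max_connections(512 * 1024**3))  # 5000
```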

If you want to change the value for `max_connections`, you need to create a custom DB cluster parameter group and change its value there. After applying your custom DB parameter group to your cluster, be sure to reboot the primary instance so the new value takes effect. For more information, see [Amazon Aurora PostgreSQL parameters](AuroraPostgreSQL.Reference.ParameterGroups.md) and [Creating a DB cluster parameter group in Amazon Aurora](USER_WorkingWithParamGroups.CreatingCluster.md).

**Tip**  
If your applications frequently open and close connections, or keep a large number of long-lived connections open, we recommend that you use Amazon RDS Proxy. RDS Proxy is a fully managed, highly available database proxy that uses connection pooling to share database connections securely and efficiently. To learn more about RDS Proxy, see [Amazon RDS Proxy for Aurora](rds-proxy.md).

 For details about how Aurora Serverless v2 instances handle this parameter, see [Maximum connections for Aurora Serverless v2](aurora-serverless-v2.setting-capacity.md#aurora-serverless-v2.max-connections). 

## Temporary storage limits for Aurora PostgreSQL
<a name="AuroraPostgreSQL.Managing.TempStorage"></a>

Aurora PostgreSQL stores tables and indexes in the Aurora storage subsystem. Aurora PostgreSQL uses separate temporary storage for non-persistent temporary files. This includes files that are used for such purposes as sorting large data sets during query processing or for index build operations. For more information, see the article [How can I troubleshoot local storage issues in Aurora PostgreSQL-Compatible instances?](https://repost.aws/knowledge-center/postgresql-aurora-storage-issue).

These local storage volumes are backed by Amazon Elastic Block Store and can be extended by using a larger DB instance class. For more information about storage, see [Amazon Aurora storage](Aurora.Overview.StorageReliability.md). You can also increase the local storage available for temporary objects by using an NVMe-enabled instance type with Aurora Optimized Reads. For more information, see [Improving query performance for Aurora PostgreSQL with Aurora Optimized Reads](AuroraPostgreSQL.optimized.reads.md).

**Note**  
You might see `storage-optimization` events when scaling DB instances, for example, from db.r5.2xlarge to db.r5.4xlarge. 

The following table shows the maximum amount of temporary storage available for each Aurora PostgreSQL DB instance class. For more information on DB instance class support for Aurora, see [Amazon Aurora DB instance classes](Concepts.DBInstanceClass.md).


| DB instance class | Maximum temporary storage available (GiB) | 
| --- | --- | 
| db.x2g.16xlarge | 1829 | 
| db.x2g.12xlarge | 1606 | 
| db.x2g.8xlarge | 1071 | 
| db.x2g.4xlarge | 535 | 
| db.x2g.2xlarge | 268 | 
| db.x2g.xlarge | 134 | 
| db.x2g.large | 67 | 
| db.r8g.48xlarge | 3072 | 
| db.r8g.24xlarge | 1536 | 
| db.r8g.16xlarge | 998 | 
| db.r8g.12xlarge | 749 | 
| db.r8g.8xlarge | 499 | 
| db.r8g.4xlarge | 250 | 
| db.r8g.2xlarge | 125 | 
| db.r8g.xlarge | 63 | 
| db.r8g.large | 31 | 
| db.r7g.16xlarge | 1008 | 
| db.r7g.12xlarge | 756 | 
| db.r7g.8xlarge | 504 | 
| db.r7g.4xlarge | 252 | 
| db.r7g.2xlarge | 126 | 
| db.r7g.xlarge | 63 | 
| db.r7g.large | 32 | 
| db.r7i.48xlarge | 3072 | 
| db.r7i.24xlarge | 1500 | 
| db.r7i.16xlarge | 1008 | 
| db.r7i.12xlarge | 748 | 
| db.r7i.8xlarge | 504 | 
| db.r7i.4xlarge | 249 | 
| db.r7i.2xlarge | 124 | 
| db.r7i.xlarge | 62 | 
| db.r7i.large | 31 | 
| db.r6g.16xlarge | 1008 | 
| db.r6g.12xlarge | 756 | 
| db.r6g.8xlarge | 504 | 
| db.r6g.4xlarge | 252 | 
| db.r6g.2xlarge | 126 | 
| db.r6g.xlarge | 63 | 
| db.r6g.large | 32 | 
| db.r6i.32xlarge | 1829 | 
| db.r6i.24xlarge | 1500 | 
| db.r6i.16xlarge | 1008 | 
| db.r6i.12xlarge | 748 | 
| db.r6i.8xlarge | 504 | 
| db.r6i.4xlarge | 249 | 
| db.r6i.2xlarge | 124 | 
| db.r6i.xlarge | 62 | 
| db.r6i.large | 31 | 
| db.r5.24xlarge | 1500 | 
| db.r5.16xlarge | 1008 | 
| db.r5.12xlarge | 748 | 
| db.r5.8xlarge | 504 | 
| db.r5.4xlarge | 249 | 
| db.r5.2xlarge | 124 | 
| db.r5.xlarge | 62 | 
| db.r5.large | 31 | 
| db.r4.16xlarge | 960 | 
| db.r4.8xlarge | 480 | 
| db.r4.4xlarge | 240 | 
| db.r4.2xlarge | 120 | 
| db.r4.xlarge | 60 | 
| db.r4.large | 30 | 
| db.t4g.large | 16.5 | 
| db.t4g.medium | 8.13 | 
| db.t3.large | 16 | 
| db.t3.medium | 7.5 | 

**Note**  
NVMe enabled instance types can increase the temporary space available by up to the total NVMe size. For more information, see [Improving query performance for Aurora PostgreSQL with Aurora Optimized Reads](AuroraPostgreSQL.optimized.reads.md).

You can monitor the temporary storage available for a DB instance with the `FreeLocalStorage` CloudWatch metric, described in [Amazon CloudWatch metrics for Amazon Aurora](Aurora.AuroraMonitoring.Metrics.md). (This doesn't apply to Aurora Serverless v2.)

For some workloads, you can reduce the amount of temporary storage by allocating more memory to the processes that are performing the operation. To increase the memory available to an operation, increase the values of the [work\_mem](https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM) or [maintenance\_work\_mem](https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-MAINTENANCE-WORK-MEM) PostgreSQL parameters.

## Huge pages for Aurora PostgreSQL
<a name="AuroraPostgreSQL.Managing.HugePages"></a>

*Huge pages* are a memory management feature that reduces overhead when a DB instance is working with large contiguous chunks of memory, such as that used by shared buffers. This PostgreSQL feature is supported by all currently available Aurora PostgreSQL versions.

The `huge_pages` parameter is turned on by default for all DB instance classes other than the db.t3.medium, db.t3.large, db.t4g.medium, and db.t4g.large instance classes. You can't change the `huge_pages` parameter value or turn off this feature in the supported instance classes of Aurora PostgreSQL.

On Aurora PostgreSQL DB instances that don't support the huge pages memory feature, specific process memory usage might increase without corresponding workload changes.

The system allocates shared memory segments like the buffer cache during server startup. When huge memory pages aren't available, the system doesn't charge these allocations to the postmaster process. Instead, it includes the memory in the process that first accessed each 4 KB page in the shared memory segment.
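
To get a feel for the scale involved, the following sketch estimates how many huge pages a shared buffer pool needs, assuming the common Linux huge page size of 2 MiB (an assumption; actual huge page sizes vary by platform):

```python
import math

HUGE_PAGE_BYTES = 2 * 1024**2  # assumed 2 MiB Linux huge page size

def huge_pages_needed(shared_buffers_bytes: int) -> int:
    """Number of huge pages required to back a shared memory area."""
    return math.ceil(shared_buffers_bytes / HUGE_PAGE_BYTES)

# An 8 GiB buffer cache maps to 4096 huge pages instead of roughly 2 million 4 KB pages.
print(huge_pages_needed(8 * 1024**3))  # 4096
```

Fewer, larger pages mean far fewer page-table entries per process, which is where the overhead reduction comes from.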

**Note**  
Active connections share allocated memory as needed, regardless of how shared memory usage is tracked across processes.

# Testing Amazon Aurora PostgreSQL by using fault injection queries
<a name="AuroraPostgreSQL.Managing.FaultInjectionQueries"></a>

You can test the fault tolerance of your Aurora PostgreSQL DB cluster by using fault injection queries. Fault injection queries are issued as SQL commands to an Amazon Aurora instance. Fault injection queries let you crash the instance so that you can test failover and recovery. You can also simulate Aurora Replica failure, disk failure, and disk congestion. Fault injection queries are supported by all available Aurora PostgreSQL versions, as follows. 
+ Aurora PostgreSQL versions 12, 13, 14, and higher
+ Aurora PostgreSQL version 11.7 and higher
+ Aurora PostgreSQL version 10.11 and higher

**Topics**
+ [Testing an instance crash](#AuroraPostgreSQL.Managing.FaultInjectionQueries.Crash)
+ [Testing an Aurora Replica failure](#AuroraPostgreSQL.Managing.FaultInjectionQueries.ReplicaFailure)
+ [Testing a disk failure](#AuroraPostgreSQL.Managing.FaultInjectionQueries.DiskFailure)
+ [Testing disk congestion](#AuroraPostgreSQL.Managing.FaultInjectionQueries.DiskCongestion)

When a fault injection query specifies a crash, it forces a crash of the Aurora PostgreSQL DB instance. The other fault injection queries result in simulations of failure events, but don't cause the event to occur. When you submit a fault injection query, you also specify an amount of time for the failure event simulation to occur.

You can submit a fault injection query to one of your Aurora Replica instances by connecting to the endpoint for the Aurora Replica. For more information, see [Amazon Aurora endpoint connections](Aurora.Overview.Endpoints.md).

## Testing an instance crash
<a name="AuroraPostgreSQL.Managing.FaultInjectionQueries.Crash"></a>

You can force a crash of an Aurora PostgreSQL instance by using the fault injection query function `aurora_inject_crash()`.

For this fault injection query, a failover does not occur. If you want to test a failover, then you can choose the **Failover** instance action for your DB cluster in the RDS console, or use the [failover-db-cluster](https://docs.aws.amazon.com/cli/latest/reference/rds/failover-db-cluster.html) AWS CLI command or the [FailoverDBCluster](https://docs.aws.amazon.com/AmazonRDS/latest/APIReference/API_FailoverDBCluster.html) RDS API operation. 

**Syntax**

```
SELECT aurora_inject_crash ('instance' | 'dispatcher' | 'node');
```

**Options**

This fault injection query takes one of the following crash types. The crash type is not case sensitive:

*'instance'*  
A crash of the PostgreSQL-compatible database for the Amazon Aurora instance is simulated.

*'dispatcher'*  
A crash of the dispatcher on the primary instance for the Aurora DB cluster is simulated. The *dispatcher* writes updates to the cluster volume for an Amazon Aurora DB cluster.

*'node'*  
A crash of both the PostgreSQL-compatible database and the dispatcher for the Amazon Aurora instance is simulated.

## Testing an Aurora Replica failure
<a name="AuroraPostgreSQL.Managing.FaultInjectionQueries.ReplicaFailure"></a>

You can simulate the failure of an Aurora Replica by using the fault injection query function `aurora_inject_replica_failure()`.

An Aurora Replica failure blocks replication to the Aurora Replica or all Aurora Replicas in the DB cluster by the specified percentage for the specified time interval. When the time interval completes, the affected Aurora Replicas are automatically synchronized with the primary instance.

**Syntax**

```
SELECT aurora_inject_replica_failure(
   percentage_of_failure, 
   time_interval, 
   'replica_name'
);
```

**Options**

This fault injection query takes the following parameters:

*percentage\_of\_failure*  
The percentage of replication to block during the failure event. This value can be a double between 0 and 100. If you specify 0, then no replication is blocked. If you specify 100, then all replication is blocked.

*time\_interval*  
The amount of time to simulate the Aurora Replica failure. The interval is in seconds. For example, if the value is 20, the simulation runs for 20 seconds.  
Take care when specifying the time interval for your Aurora Replica failure event. If you specify too long an interval, and your writer instance writes a large amount of data during the failure event, then your Aurora DB cluster might assume that your Aurora Replica has crashed and replace it.

*replica\_name*  
The Aurora Replica in which to inject the failure simulation. Specify the name of an Aurora Replica to simulate a failure of the single Aurora Replica. Specify an empty string to simulate failures for all Aurora Replicas in the DB cluster.   
To identify replica names, see the `server_id` column from the `aurora_replica_status()` function. For example:  

```
postgres=> SELECT server_id FROM aurora_replica_status();
```

## Testing a disk failure
<a name="AuroraPostgreSQL.Managing.FaultInjectionQueries.DiskFailure"></a>

You can simulate a disk failure for an Aurora PostgreSQL DB cluster by using the fault injection query function `aurora_inject_disk_failure()`.

During a disk failure simulation, the Aurora PostgreSQL DB cluster randomly marks disk segments as faulting. Requests to those segments are blocked for the duration of the simulation.

**Syntax**

```
SELECT aurora_inject_disk_failure(
   percentage_of_failure, 
   index, 
   is_disk, 
   time_interval
);
```

**Options**

This fault injection query takes the following parameters:

*percentage\_of\_failure*  
The percentage of the disk to mark as faulting during the failure event. This value can be a double between 0 and 100. If you specify 0, then none of the disk is marked as faulting. If you specify 100, then the entire disk is marked as faulting.

*index*  
A specific logical block of data in which to simulate the failure event. If you exceed the range of available logical blocks or storage nodes data, you receive an error that tells you the maximum index value that you can specify. To avoid this error, see [Displaying volume status for an Aurora PostgreSQL DB cluster](AuroraPostgreSQL.Managing.VolumeStatus.md).

*is\_disk*  
Indicates whether the injection failure is to a logical block or a storage node. Specifying true means injection failures are to a logical block. Specifying false means injection failures are to a storage node.

*time\_interval*  
The amount of time to simulate the disk failure. The interval is in seconds. For example, if the value is 20, the simulation runs for 20 seconds.

## Testing disk congestion
<a name="AuroraPostgreSQL.Managing.FaultInjectionQueries.DiskCongestion"></a>

You can simulate a disk congestion for an Aurora PostgreSQL DB cluster by using the fault injection query function `aurora_inject_disk_congestion()`.

During a disk congestion simulation, the Aurora PostgreSQL DB cluster randomly marks disk segments as congested. Requests to those segments are delayed between the specified minimum and maximum delay time for the duration of the simulation.

**Syntax**

```
SELECT aurora_inject_disk_congestion(
   percentage_of_failure, 
   index, 
   is_disk, 
   time_interval, 
   minimum, 
   maximum
);
```

**Options**

This fault injection query takes the following parameters:

*percentage\_of\_failure*  
The percentage of the disk to mark as congested during the failure event. This is a double value between 0 and 100. If you specify 0, then none of the disk is marked as congested. If you specify 100, then the entire disk is marked as congested.

*index*  
A specific logical block of data or storage node to use to simulate the failure event.  
If you exceed the range of available logical blocks or storage nodes of data, you receive an error that tells you the maximum index value that you can specify. To avoid this error, see [Displaying volume status for an Aurora PostgreSQL DB cluster](AuroraPostgreSQL.Managing.VolumeStatus.md).

*is\_disk*  
Indicates whether the injection failure is to a logical block or a storage node. Specifying true means injection failures are to a logical block. Specifying false means injection failures are to a storage node.

*time\_interval*  
The amount of time to simulate the disk congestion. The interval is in seconds. For example, if the value is 20, the simulation runs for 20 seconds.

*minimum, maximum*  
The minimum and maximum amount of congestion delay, in milliseconds. Valid values range from 0.0 to 100.0 milliseconds. Disk segments marked as congested are delayed for a random amount of time within the minimum and maximum range for the duration of the simulation. The maximum value must be greater than the minimum value.

# Displaying volume status for an Aurora PostgreSQL DB cluster
<a name="AuroraPostgreSQL.Managing.VolumeStatus"></a>

In Amazon Aurora, a DB cluster volume consists of a collection of logical blocks. Each of these represents 10 gigabytes of allocated storage. These blocks are called *protection groups*. 

The data in each protection group is replicated across six physical storage devices, called *storage nodes*. These storage nodes are allocated across three Availability Zones (AZs) in the region where the DB cluster resides. In turn, each storage node contains one or more logical blocks of data for the DB cluster volume. For more information about protection groups and storage nodes, see [ Introducing the Aurora storage engine](https://aws.amazon.com/blogs/database/introducing-the-aurora-storage-engine/) on the AWS Database Blog. To learn more about Aurora cluster volumes in general, see [Amazon Aurora storage](Aurora.Overview.StorageReliability.md). 

Use the `aurora_show_volume_status()` function to return the following server status variables:
+ `Disks` — The total number of logical blocks of data for the DB cluster volume.
+ `Nodes` — The total number of storage nodes for the DB cluster volume.

You can use the `aurora_show_volume_status()` function to help avoid an error when using the `aurora_inject_disk_failure()` fault injection function. The `aurora_inject_disk_failure()` fault injection function simulates the failure of an entire storage node, or a single logical block of data within a storage node. In the function, you specify the index value of a specific logical block of data or storage node. However, the statement returns an error if you specify an index value greater than the number of logical blocks of data or storage nodes used by the DB cluster volume. For more information about fault injection queries, see [Testing Amazon Aurora PostgreSQL by using fault injection queries](AuroraPostgreSQL.Managing.FaultInjectionQueries.md).

**Note**  
The `aurora_show_volume_status()` function is available for Aurora PostgreSQL version 10.11 and higher versions. For more information about Aurora PostgreSQL versions, see [Amazon Aurora PostgreSQL releases and engine versions](AuroraPostgreSQL.Updates.20180305.md).

**Syntax**

```
SELECT * FROM aurora_show_volume_status();
```

**Example**

```
customer_database=> SELECT * FROM aurora_show_volume_status();
 disks | nodes 
-------+-------
    96 |    45
```
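
Because each logical block (protection group) represents 10 gigabytes of allocated storage, you can estimate the allocated volume size from the `disks` value. The following is a minimal sketch using the sample output above:

```python
PROTECTION_GROUP_GB = 10  # each logical block represents 10 GB of allocated storage

def allocated_storage_gb(disks: int) -> int:
    """Approximate allocated storage from the Disks value of aurora_show_volume_status()."""
    return disks * PROTECTION_GROUP_GB

print(allocated_storage_gb(96))  # 960
```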

# Specifying the RAM disk for the stats\_temp\_directory
<a name="AuroraPostgreSQL.Managing.RamDisk"></a>

You can use the Aurora PostgreSQL parameter `rds.pg_stat_ramdisk_size` to specify the system memory allocated to a RAM disk for storing the PostgreSQL `stats_temp_directory`. The RAM disk parameter is available only in Aurora PostgreSQL version 14 and lower versions.

Under certain workloads, setting this parameter can improve performance and decrease I/O requirements. For more information about `stats_temp_directory`, see [Run-time Statistics](https://www.postgresql.org/docs/current/static/runtime-config-statistics.html#GUC-STATS-TEMP-DIRECTORY) in the PostgreSQL documentation. Starting with PostgreSQL version 15, the PostgreSQL community switched to using dynamic shared memory, so there is no need to set `stats_temp_directory`.

To enable a RAM disk for your `stats_temp_directory`, set the `rds.pg_stat_ramdisk_size` parameter to a non-zero value in the DB cluster parameter group used by your DB cluster. This parameter value is specified in MB, so you must use an integer value. Expressions, formulas, and functions aren't valid for the `rds.pg_stat_ramdisk_size` parameter. Be sure to restart the DB cluster so that the change takes effect. For information about setting parameters, see [Parameter groups for Amazon Aurora](USER_WorkingWithParamGroups.md). For more information about restarting the DB cluster, see [Rebooting an Amazon Aurora DB cluster or Amazon Aurora DB instance](USER_RebootCluster.md).

As an example, the following AWS CLI command sets the RAM disk parameter to 256 MB.

```
aws rds modify-db-cluster-parameter-group \
    --db-cluster-parameter-group-name db-cl-pg-ramdisk-testing \
    --parameters "ParameterName=rds.pg_stat_ramdisk_size, ParameterValue=256, ApplyMethod=pending-reboot"
```

After you restart the DB cluster, run the following command to see the status of the `stats_temp_directory`:

```
postgres=> SHOW stats_temp_directory;
```

The command should return the following: 

```
stats_temp_directory
---------------------------
/rdsdbramdisk/pg_stat_tmp
(1 row)
```

# Managing temporary files with PostgreSQL
<a name="PostgreSQL.ManagingTempFiles"></a>

In PostgreSQL, a complex query might perform several sort or hash operations at the same time, with each operation using instance memory to store results up to the value specified in the [work\_mem](https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM) parameter. When the instance memory is not sufficient, temporary files are created to store the results. These files are written to disk to complete the query execution and are automatically removed after the query completes. In Aurora PostgreSQL, these files share local storage with other log files. You can monitor your Aurora PostgreSQL DB cluster's local storage space by watching the Amazon CloudWatch metric `FreeLocalStorage`. For more information, see [Troubleshoot local storage issues](https://aws.amazon.com/premiumsupport/knowledge-center/postgresql-aurora-storage-issue/).

We recommend using Aurora Optimized Read clusters for workloads involving multiple concurrent queries that increase the usage of temporary files. These clusters use local Non-Volatile Memory Express (NVMe) based solid state drive (SSD) block-level storage to place the temporary files. For more information, see [Improving query performance for Aurora PostgreSQL with Aurora Optimized Reads](AuroraPostgreSQL.optimized.reads.md).

You can use the following parameters and functions to manage the temporary files in your instance.
+ **[temp\_file\_limit](https://www.postgresql.org/docs/current/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-DISK)** – This parameter cancels any query whose temporary file usage exceeds the limit, in KB. The limit prevents any query from running endlessly and consuming disk space with temporary files. You can estimate an appropriate value using the results from the `log_temp_files` parameter. As a best practice, examine the workload behavior and set the limit according to that estimate. The following example shows how a query is canceled when it exceeds the limit.

  ```
  postgres=>select * from pgbench_accounts, pg_class, big_table;
  ```

  ```
  ERROR: temporary file size exceeds temp_file_limit (64kB)
  ```
+ **[log\_temp\_files](https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES)** – This parameter sends messages to the postgresql.log file when the temporary files of a session are removed. It produces log entries after a query successfully completes. Therefore, it might not help in troubleshooting active, long-running queries. 

  The following example shows that when the query successfully completes, the entries are logged in the postgresql.log file while the temporary files are cleaned up.

  ```
  2023-02-06 23:48:35 UTC:205.251.233.182(12456):adminuser@postgres:[31236]:LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp31236.5", size 140353536
  2023-02-06 23:48:35 UTC:205.251.233.182(12456):adminuser@postgres:[31236]:STATEMENT:  select a.aid from pgbench_accounts a, pgbench_accounts b where a.bid=b.bid order by a.bid limit 10;
  2023-02-06 23:48:35 UTC:205.251.233.182(12456):adminuser@postgres:[31236]:LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp31236.4", size 180428800
  2023-02-06 23:48:35 UTC:205.251.233.182(12456):adminuser@postgres:[31236]:STATEMENT:  select a.aid from pgbench_accounts a, pgbench_accounts b where a.bid=b.bid order by a.bid limit 10;
  ```
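
  Log entries like these can be totaled offline. The following is a hypothetical sketch (not an AWS tool) that parses the `size` values out of such postgresql.log lines and sums the temporary file bytes:

  ```python
  import re

  # Matches the 'temporary file: path ..., size N' entries shown above.
  PATTERN = re.compile(r'temporary file: path "[^"]+", size (\d+)')

  def total_temp_bytes(log_lines):
      """Sum the sizes of all temporary files reported in the given log lines."""
      return sum(int(m.group(1)) for line in log_lines
                 if (m := PATTERN.search(line)))

  lines = [
      'LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp31236.5", size 140353536',
      'LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp31236.4", size 180428800',
  ]
  print(total_temp_bytes(lines))  # 320782336
  ```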
+ **[pg\_ls\_tmpdir](https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-ADMIN-GENFILE)** – This function, available in RDS for PostgreSQL 13 and higher versions, provides visibility into the current temporary file usage. Completed queries don't appear in the results of the function. The following example shows the results of this function.

  ```
  postgres=>select * from pg_ls_tmpdir();
  ```

  ```
        name       |    size    |      modification
  -----------------+------------+------------------------
   pgsql_tmp8355.1 | 1072250880 | 2023-02-06 22:54:56+00
   pgsql_tmp8351.0 | 1072250880 | 2023-02-06 22:54:43+00
   pgsql_tmp8327.0 | 1072250880 | 2023-02-06 22:54:56+00
   pgsql_tmp8351.1 |  703168512 | 2023-02-06 22:54:56+00
   pgsql_tmp8355.0 | 1072250880 | 2023-02-06 22:54:00+00
   pgsql_tmp8328.1 |  835031040 | 2023-02-06 22:54:56+00
   pgsql_tmp8328.0 | 1072250880 | 2023-02-06 22:54:40+00
  (7 rows)
  ```

  ```
  postgres=>select query from pg_stat_activity where pid = 8355;
                  
  query
  ----------------------------------------------------------------------------------------
  select a.aid from pgbench_accounts a, pgbench_accounts b where a.bid=b.bid order by a.bid
  (1 row)
  ```

  The file name includes the process ID (PID) of the session that generated the temporary file. A more advanced query, such as the following example, sums the temporary file sizes for each PID.

  ```
  postgres=>select replace(left(name, strpos(name, '.')-1),'pgsql_tmp','') as pid, count(*), sum(size) from pg_ls_tmpdir() group by pid;
  ```

  ```
   pid  | count |   sum
  ------+-------+------------
   8355 |     2 | 2144501760
   8351 |     2 | 2090770432
   8327 |     1 | 1072250880
   8328 |     2 | 2144501760
  (4 rows)
  ```
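
  The same aggregation can be done client-side. The following is a hypothetical sketch that groups `(name, size)` rows from `pg_ls_tmpdir()` by the PID embedded in each file name:

  ```python
  from collections import defaultdict

  def sum_temp_by_pid(files):
      """files: iterable of (name, size) rows from pg_ls_tmpdir()."""
      totals = defaultdict(lambda: [0, 0])  # pid -> [file count, total bytes]
      for name, size in files:
          # File names look like pgsql_tmp<PID>.<n>.
          pid = name[len("pgsql_tmp"):].split(".")[0]
          totals[pid][0] += 1
          totals[pid][1] += size
      return dict(totals)

  sample = [("pgsql_tmp8355.1", 1072250880),
            ("pgsql_tmp8355.0", 1072250880),
            ("pgsql_tmp8327.0", 1072250880)]
  print(sum_temp_by_pid(sample))  # {'8355': [2, 2144501760], '8327': [1, 1072250880]}
  ```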
+ **[pg\_stat\_statements](https://www.postgresql.org/docs/current/pgstatstatements.html)** – If you activate `pg_stat_statements`, then you can view the average temporary file usage per call. You can identify the `queryid` of the query and use it to examine the temporary file usage as shown in the following example.

  ```
  postgres=>select queryid from pg_stat_statements where query like 'select a.aid from pgbench%';
  ```

  ```
         queryid
  ----------------------
   -7170349228837045701
  (1 row)
  ```

  ```
  postgres=>select queryid, substr(query,1,25), calls, temp_blks_read/calls temp_blks_read_per_call, temp_blks_written/calls temp_blks_written_per_call from pg_stat_statements where queryid = -7170349228837045701;
  ```

  ```
         queryid        |          substr           | calls | temp_blks_read_per_call | temp_blks_written_per_call
  ----------------------+---------------------------+-------+-------------------------+----------------------------
   -7170349228837045701 | select a.aid from pgbench |    50 |                  239226 |                     388678
  (1 row)
  ```
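
  To translate those per-call block counts into bytes, multiply by the PostgreSQL block size, which is 8 KB by default (an assumption; a custom build could use a different block size):

  ```python
  BLOCK_SIZE = 8192  # default PostgreSQL block size in bytes

  def temp_blks_to_bytes(blks: int) -> int:
      """Convert a temp_blks_read/temp_blks_written count to bytes."""
      return blks * BLOCK_SIZE

  # temp_blks_written_per_call from the example above.
  print(temp_blks_to_bytes(388678))  # 3184050176 bytes (about 3 GiB) per call
  ```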
+ **[Performance Insights](https://aws.amazon.com/rds/performance-insights/)** – In the Performance Insights dashboard, you can view temporary file usage by turning on the **temp\_bytes** and **temp\_files** metrics. Then, you can see the averages of both metrics and how they correspond to the query workload. The Performance Insights view doesn't specifically show the queries that generate the temporary files. However, when you combine Performance Insights with the query shown for `pg_ls_tmpdir`, you can troubleshoot, analyze, and determine the changes in your query workload. 

  For more information about how to analyze metrics and queries with Performance Insights, see [Analyzing metrics with the Performance Insights dashboard](USER_PerfInsights.UsingDashboard.md).

  For an example of viewing temporary file usage with Performance Insights, see [Viewing temporary file usage with Performance Insights](PostgreSQL.ManagingTempFiles.Example.md).
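
The `temp_blks_read_per_call` and `temp_blks_written_per_call` values reported by `pg_stat_statements` are counts of blocks, not bytes. To estimate bytes, you can multiply by the server's block size (8 KB by default). The following query is a sketch that reuses the `queryid` from the earlier example; substitute your own `queryid` value.

```
postgres=>select queryid,
       temp_blks_written / calls * current_setting('block_size')::bigint
         as temp_bytes_written_per_call
  from pg_stat_statements
 where queryid = -7170349228837045701;
```

Because `current_setting('block_size')` reads the server's actual block size, this estimate stays correct even on builds compiled with a nondefault block size.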

# Viewing temporary file usage with Performance Insights
<a name="PostgreSQL.ManagingTempFiles.Example"></a>

You can use Performance Insights to view temporary file usage by turning on the **temp\_bytes** and **temp\_files** metrics. The view in Performance Insights doesn't show the specific queries that generate temporary files. However, when you combine Performance Insights with the query shown for `pg_ls_tmpdir`, you can troubleshoot, analyze, and determine the changes in your query workload.

1. In the Performance Insights dashboard, choose **Manage Metrics**.

1. Choose **Database metrics**, and select the **temp\_bytes** and **temp\_files** metrics as shown in the following image.  
![\[Metrics displayed in the graph.\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/rpg_mantempfiles_metrics.png)

1. In the **Top SQL** tab, choose the **Preferences** icon.

1. In the **Preferences** window, turn on the following statistics to appear in the **Top SQL** tab, and then choose **Continue**.
   + Temp writes/sec
   + Temp reads/sec
   + Tmp blk write/call
   + Tmp blk read/call

1. The temporary file usage is broken out when combined with the query shown for `pg_ls_tmpdir`, as shown in the following example.  
![\[Query that displays the temporary file usage.\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/rpg_mantempfiles_query.png)

The `IO:BufFileRead` and `IO:BufFileWrite` events occur when the top queries in your workload often create temporary files. You can use Performance Insights to identify top queries waiting on `IO:BufFileRead` and `IO:BufFileWrite` by reviewing the Average Active Sessions (AAS) in the Database Load and Top SQL sections. 

![\[IO:BufFileRead and IO:BufFileWrite in the graph.\]](http://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/images/perfinsights_IOBufFile.png)


For more information on how to analyze top queries and load by wait event with Performance Insights, see [Overview of the Top SQL tab](USER_PerfInsights.UsingDashboard.AnalyzeDBLoad.AdditionalMetrics.md#USER_PerfInsights.UsingDashboard.Components.AvgActiveSessions.TopLoadItemsTable.TopSQL). You should identify and tune the queries that cause an increase in temporary file usage and related wait events. For more information on these wait events and remediation, see [IO:BufFileRead and IO:BufFileWrite](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/apg-waits.iobuffile.html).
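
To check whether any sessions are currently waiting on these events, you can also query `pg_stat_activity` directly. The following sketch filters on the `BufFileRead` and `BufFileWrite` wait events (reported with a `wait_event_type` of `IO`) and truncates each query text for readability.

```
postgres=>select pid, wait_event_type, wait_event, left(query, 40) as query
  from pg_stat_activity
 where wait_event in ('BufFileRead', 'BufFileWrite');
```

An empty result means no session was reading or writing temporary buffer files at the moment you ran the query, so sample it repeatedly during periods of high temporary file usage.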

**Note**  
The [work\_mem](https://www.postgresql.org/docs/current/runtime-config-resource.html#GUC-WORK-MEM) parameter specifies the amount of memory that a sort operation can use before results are written to temporary files. We recommend that you don't set this parameter higher than the default value, because doing so permits every database session to consume more memory. Also, a single session that performs complex joins and sorts can run parallel operations in which each operation consumes memory.   
As a best practice, when you have a large report with multiple joins and sorts, set this parameter at the session level by using the `SET work_mem` command. Then the change applies only to the current session and doesn't change the value globally.
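
For example, you might raise `work_mem` for a single reporting session as follows. The `256MB` value is illustrative only; choose a value based on your instance's available memory and the number of concurrent operations.

```
postgres=>set work_mem = '256MB';  -- applies to the current session only
postgres=>show work_mem;
 work_mem
----------
 256MB
(1 row)

postgres=>reset work_mem;          -- return to the parameter group default
```

You can also use `SET LOCAL work_mem` inside a transaction so that the setting reverts automatically at commit or rollback.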