Exporting data from an RDS for PostgreSQL DB instance to Amazon S3 - Amazon Relational Database Service

Exporting data from an RDS for PostgreSQL DB instance to Amazon S3

You can query data from an RDS for PostgreSQL DB instance and export it directly into files stored in an Amazon S3 bucket. To do this, you first install the RDS for PostgreSQL aws_s3 extension. This extension provides you with the functions that you use to export the results of queries to Amazon S3. Following, you can find out how to install the extension and how to export data to Amazon S3.

Note

Cross-account export to Amazon S3 isn't supported.

All currently available versions of RDS for PostgreSQL support exporting data to Amazon Simple Storage Service. For detailed version information, see Amazon RDS for PostgreSQL updates in the Amazon RDS for PostgreSQL Release Notes.

If you don't have a bucket set up for your export, see the following topics the Amazon Simple Storage Service User Guide.

By default, the data exported from RDS for PostgreSQL to Amazon S3 uses server-side encryption with an AWS managed key. If you are using bucket encryption, the Amazon S3 bucket must be encrypted with an AWS Key Management Service (AWS KMS) key (SSE-KMS). Currently, buckets encrypted with Amazon S3 managed keys (SSE-S3) are not supported.

Note

You can save DB snapshot data to Amazon S3 using the AWS Management Console, AWS CLI, or Amazon RDS API. For more information, see Exporting DB snapshot data to Amazon S3 for Amazon RDS.

Installing the aws_s3 extension

Before you can use Amazon Simple Storage Service with your RDS for PostgreSQL DB instance, you need to install the aws_s3 extension. This extension provides functions for exporting data from an RDS for PostgreSQL DB instance to an Amazon S3 bucket. It also provides functions for importing data from an Amazon S3. For more information, see Importing data from Amazon S3 into an RDS for PostgreSQL DB instance. The aws_s3 extension depends on some of the helper functions in the aws_commons extension, which is installed automatically when needed.

To install the aws_s3 extension
  1. Use psql (or pgAdmin) to connect to the RDS for PostgreSQL DB instance as a user that has rds_superuser privileges. If you kept the default name during the setup process, you connect as postgres.

    psql --host=111122223333.aws-region.rds.amazonaws.com --port=5432 --username=postgres --password
  2. To install the extension, run the following command.

    postgres=> CREATE EXTENSION aws_s3 CASCADE; NOTICE: installing required extension "aws_commons" CREATE EXTENSION
  3. To verify that the extension is installed, you can use the psql \dx metacommand.

    postgres=> \dx List of installed extensions Name | Version | Schema | Description -------------+---------+------------+--------------------------------------------- aws_commons | 1.2 | public | Common data types across AWS services aws_s3 | 1.1 | public | AWS S3 extension for importing data from S3 plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (3 rows)

The functions for importing data from Amazon S3 and exporting data to Amazon S3 are now available to use.

Verify that your RDS for PostgreSQL version supports exports to Amazon S3

You can verify that your RDS for PostgreSQL version supports export to Amazon S3 by using the describe-db-engine-versions command. The following example verifies support for version 10.14.

aws rds describe-db-engine-versions --region us-east-1 --engine postgres --engine-version 10.14 | grep s3Export

If the output includes the string "s3Export", then the engine supports Amazon S3 exports. Otherwise, the engine doesn't support them.

Overview of exporting data to Amazon S3

To export data stored in an RDS for PostgreSQL database to an Amazon S3 bucket, use the following procedure.

To export RDS for PostgreSQL data to S3
  1. Identify an Amazon S3 file path to use for exporting data. For details about this process, see Specifying the Amazon S3 file path to export to.

  2. Provide permission to access the Amazon S3 bucket.

    To export data to an Amazon S3 file, give the RDS for PostgreSQL DB instance permission to access the Amazon S3 bucket that the export will use for storage. Doing this includes the following steps:

    1. Create an IAM policy that provides access to an Amazon S3 bucket that you want to export to.

    2. Create an IAM role.

    3. Attach the policy you created to the role you created.

    4. Add this IAM role to your DB instance.

    For details about this process, see Setting up access to an Amazon S3 bucket.

  3. Identify a database query to get the data. Export the query data by calling the aws_s3.query_export_to_s3 function.

    After you complete the preceding preparation tasks, use the aws_s3.query_export_to_s3 function to export query results to Amazon S3. For details about this process, see Exporting query data using the aws_s3.query_export_to_s3 function.

Specifying the Amazon S3 file path to export to

Specify the following information to identify the location in Amazon S3 where you want to export data to:

  • Bucket name – A bucket is a container for Amazon S3 objects or files.

    For more information on storing data with Amazon S3, see Create a bucket and View an object in the Amazon Simple Storage Service User Guide.

  • File path – The file path identifies where the export is stored in the Amazon S3 bucket. The file path consists of the following:

    • An optional path prefix that identifies a virtual folder path.

    • A file prefix that identifies one or more files to be stored. Larger exports are stored in multiple files, each with a maximum size of approximately 6 GB. The additional file names have the same file prefix but with _partXX appended. The XX represents 2, then 3, and so on.

    For example, a file path with an exports folder and a query-1-export file prefix is /exports/query-1-export.

  • AWS Region (optional) – The AWS Region where the Amazon S3 bucket is located. If you don't specify an AWS Region value, then Amazon RDS saves your files into Amazon S3 in the same AWS Region as the exporting DB instance.

    Note

    Currently, the AWS Region must be the same as the region of the exporting DB instance.

    For a listing of AWS Region names and associated values, see Regions, Availability Zones, and Local Zones.

To hold the Amazon S3 file information about where the export is to be stored, you can use the aws_commons.create_s3_uri function to create an aws_commons._s3_uri_1 composite structure as follows.

psql=> SELECT aws_commons.create_s3_uri( 'amzn-s3-demo-bucket', 'sample-filepath', 'us-west-2' ) AS s3_uri_1 \gset

You later provide this s3_uri_1 value as a parameter in the call to the aws_s3.query_export_to_s3 function. For examples, see Exporting query data using the aws_s3.query_export_to_s3 function.