

# Data conversion when exporting to an Amazon S3 bucket for Amazon RDS
<a name="USER_ExportSnapshot.data-types"></a>

When you export a DB snapshot to an Amazon S3 bucket, Amazon RDS converts data to, exports data in, and stores data in the Parquet format. For more information about Parquet, see the [Apache Parquet](https://parquet.apache.org/docs/) website.

Parquet stores all data as one of the following primitive types:
+ BOOLEAN
+ INT32
+ INT64
+ INT96
+ FLOAT
+ DOUBLE
+ BYTE\$1ARRAY – A variable-length byte array, also known as binary
+ FIXED\$1LEN\$1BYTE\$1ARRAY – A fixed-length byte array used when the values have a constant size

The Parquet data types are few to reduce the complexity of reading and writing the format. Parquet provides logical types for extending primitive types. A *logical type* is implemented as an annotation with the data in a `LogicalType` metadata field. The logical type annotation explains how to interpret the primitive type. 

When the `STRING` logical type annotates a `BYTE_ARRAY` type, it indicates that the byte array should be interpreted as a UTF-8 encoded character string. After an export task completes, Amazon RDS notifies you if any string conversion occurred. The underlying data exported is always the same as the data from the source. However, due to the encoding difference in UTF-8, some characters might appear different from the source when read in tools such as Athena.

For more information, see [Parquet logical type definitions](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md) in the Parquet documentation.

**Topics**
+ [MySQL and MariaDB data type mapping to Parquet](#USER_ExportSnapshot.data-types.MySQL)
+ [PostgreSQL data type mapping to Parquet](#USER_ExportSnapshot.data-types.PostgreSQL)

## MySQL and MariaDB data type mapping to Parquet
<a name="USER_ExportSnapshot.data-types.MySQL"></a>

The following table shows the mapping from MySQL and MariaDB data types to Parquet data types when data is converted and exported to Amazon S3.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ExportSnapshot.data-types.html)

## PostgreSQL data type mapping to Parquet
<a name="USER_ExportSnapshot.data-types.PostgreSQL"></a>

The following table shows the mapping from PostgreSQL data types to Parquet data types when data is converted and exported to Amazon S3.


| PostgreSQL data type | Parquet primitive type | Logical type annotation | Mapping notes | 
| --- | --- | --- | --- | 
| Numeric data types | 
| BIGINT | INT64 |  |   | 
| BIGSERIAL | INT64 |  |   | 
| DECIMAL | BYTE\$1ARRAY | STRING | A DECIMAL type is converted to a string in a BYTE\$1ARRAY type and encoded as UTF8.This conversion is to avoid complications due to data precision and data values that are not a number (NaN). | 
| DOUBLE PRECISION | DOUBLE |  |   | 
| INTEGER | INT32 |  |   | 
| MONEY | BYTE\$1ARRAY | STRING |   | 
| REAL | FLOAT |  |   | 
| SERIAL | INT32 |  |   | 
| SMALLINT | INT32 | INT(16, true) |   | 
| SMALLSERIAL | INT32 | INT(16, true) |   | 
| String and related data types | 
| ARRAY | BYTE\$1ARRAY | STRING |  An array is converted to a string and encoded as BINARY (UTF8). This conversion is to avoid complications due to data precision, data values that are not a number (NaN), and time data values.  | 
| BIT | BYTE\$1ARRAY | STRING |   | 
| BIT VARYING | BYTE\$1ARRAY | STRING |   | 
| BYTEA | BINARY |  |   | 
| CHAR | BYTE\$1ARRAY | STRING |   | 
| CHAR(N) | BYTE\$1ARRAY | STRING |   | 
| ENUM | BYTE\$1ARRAY | STRING |   | 
| NAME | BYTE\$1ARRAY | STRING |   | 
| TEXT | BYTE\$1ARRAY | STRING |   | 
| TEXT SEARCH | BYTE\$1ARRAY | STRING |   | 
| VARCHAR(N) | BYTE\$1ARRAY | STRING |   | 
| XML | BYTE\$1ARRAY | STRING |   | 
| Date and time data types | 
| DATE | BYTE\$1ARRAY | STRING |   | 
| INTERVAL | BYTE\$1ARRAY | STRING |   | 
| TIME | BYTE\$1ARRAY | STRING |  | 
| TIME WITH TIME ZONE | BYTE\$1ARRAY | STRING |  | 
| TIMESTAMP | BYTE\$1ARRAY | STRING |  | 
| TIMESTAMP WITH TIME ZONE | BYTE\$1ARRAY | STRING |  | 
| Geometric data types | 
| BOX | BYTE\$1ARRAY | STRING |   | 
| CIRCLE | BYTE\$1ARRAY | STRING |   | 
| LINE | BYTE\$1ARRAY | STRING |   | 
| LINESEGMENT | BYTE\$1ARRAY | STRING |   | 
| PATH | BYTE\$1ARRAY | STRING |   | 
| POINT | BYTE\$1ARRAY | STRING |   | 
| POLYGON | BYTE\$1ARRAY | STRING |   | 
| JSON data types | 
| JSON | BYTE\$1ARRAY | STRING |   | 
| JSONB | BYTE\$1ARRAY | STRING |   | 
| Other data types | 
| BOOLEAN | BOOLEAN |  |   | 
| CIDR | BYTE\$1ARRAY | STRING |  Network data type | 
| COMPOSITE | BYTE\$1ARRAY | STRING |   | 
| DOMAIN | BYTE\$1ARRAY | STRING |   | 
| INET | BYTE\$1ARRAY | STRING |  Network data type | 
| MACADDR | BYTE\$1ARRAY | STRING |   | 
| OBJECT IDENTIFIER | N/A |  |  | 
| PG\$1LSN | BYTE\$1ARRAY | STRING |   | 
| RANGE | BYTE\$1ARRAY | STRING |   | 
| UUID | BYTE\$1ARRAY | STRING |   | 