DynamoDB table export output format
A DynamoDB table export includes manifest files in addition to the files containing your table data. These files are all saved in the Amazon S3 bucket that you specify in your export request. The following sections describe the format and contents of each output object.
Manifest files
DynamoDB creates manifest files, along with their checksum files, in the specified S3 bucket for each export request.
export-prefix/AWSDynamoDB/ExportId/manifest-summary.json
export-prefix/AWSDynamoDB/ExportId/manifest-summary.checksum
export-prefix/AWSDynamoDB/ExportId/manifest-files.json
export-prefix/AWSDynamoDB/ExportId/manifest-files.checksum
You choose an export-prefix when you request a table export. This helps you keep files in the destination S3 bucket organized. The ExportId is a unique token generated by the service to ensure that multiple exports to the same S3 bucket and export-prefix don't overwrite each other.
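The key layout above can be sketched as a small helper. This is only an illustration, not part of any AWS SDK: the function name `manifest_keys` is hypothetical, and the handling of an empty prefix is an assumption.

```python
from typing import Dict

# The four manifest objects listed above.
MANIFEST_NAMES = (
    "manifest-summary.json",
    "manifest-summary.checksum",
    "manifest-files.json",
    "manifest-files.checksum",
)


def manifest_keys(export_prefix: str, export_id: str) -> Dict[str, str]:
    """Build the S3 keys of the manifest objects for one export.

    Hypothetical helper that mirrors the
    export-prefix/AWSDynamoDB/ExportId/ layout described above.
    """
    prefix = export_prefix.strip("/")
    # Assumption: with no prefix, keys start directly at AWSDynamoDB/.
    base = f"{prefix}/AWSDynamoDB/{export_id}" if prefix else f"AWSDynamoDB/{export_id}"
    return {name: f"{base}/{name}" for name in MANIFEST_NAMES}


keys = manifest_keys("20230919-prefix", "01693685934212-ac809da5")
print(keys["manifest-summary.json"])
```

Because the ExportId is part of every key, two exports that share a bucket and prefix produce disjoint sets of manifest keys.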
The export creates at least 1 file per partition. For partitions that are empty, your export request will create an empty file. All of the items in each file are from that particular partition's hashed keyspace.
Note
DynamoDB also creates an empty file named _started in the same directory as the manifest files. This file verifies that the destination bucket is writable and that the export has begun. It can safely be deleted.
The summary manifest
The manifest-summary.json file contains summary information about the export job. This allows you to know which data files in the shared data folder are associated with this export. Its format is as follows:
{
  "version": "2020-06-30",
  "exportArn": "arn:aws:dynamodb:us-east-1:123456789012:table/ProductCatalog/export/01234567890123-a1b2c3d4",
  "startTime": "2020-11-04T07:28:34.028Z",
  "endTime": "2020-11-04T07:33:43.897Z",
  "tableArn": "arn:aws:dynamodb:us-east-1:123456789012:table/ProductCatalog",
  "tableId": "12345a12-abcd-123a-ab12-1234abc12345",
  "exportTime": "2020-11-04T07:28:34.028Z",
  "s3Bucket": "ddb-productcatalog-export",
  "s3Prefix": "2020-Nov",
  "s3SseAlgorithm": "AES256",
  "s3SseKmsKeyId": null,
  "manifestFilesS3Key": "AWSDynamoDB/01693685827463-2d8752fd/manifest-files.json",
  "billedSizeBytes": 0,
  "itemCount": 8,
  "outputFormat": "DYNAMODB_JSON",
  "exportType": "FULL_EXPORT"
}
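As a rough illustration of reading these fields, the following sketch parses an abbreviated copy of the summary manifest, computes how long the export job ran, and picks out the key of the files manifest. The helper names `parse_timestamp` and `export_duration` are hypothetical; only the field names come from the example above.

```python
import json
from datetime import datetime, timedelta


def parse_timestamp(value: str) -> datetime:
    """Parse a manifest timestamp such as 2020-11-04T07:28:34.028Z."""
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def export_duration(summary: dict) -> timedelta:
    """Return endTime minus startTime for a parsed summary manifest."""
    return parse_timestamp(summary["endTime"]) - parse_timestamp(summary["startTime"])


# Abbreviated copy of the summary manifest shown above.
summary = json.loads("""
{
  "startTime": "2020-11-04T07:28:34.028Z",
  "endTime": "2020-11-04T07:33:43.897Z",
  "manifestFilesS3Key": "AWSDynamoDB/01693685827463-2d8752fd/manifest-files.json",
  "itemCount": 8,
  "exportType": "FULL_EXPORT"
}
""")

print(export_duration(summary))        # how long the export job ran
print(summary["manifestFilesS3Key"])   # where to find the files manifest
```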
The files manifest
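The manifest-files.json file contains information about the files that contain your exported table data. The file is in JSON Lines format, so newlines are used as item delimiters. The following example shows the entry for one data file: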
{
  "itemCount": 8,
  "md5Checksum": "sQMSpEILNgoQmarvDFonGQ==",
  "etag": "af83d6f217c19b8b0fff8023d8ca4716-1",
  "dataFileS3Key": "AWSDynamoDB/01693685827463-2d8752fd/data/asdl123dasas.json.gz"
}
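A files manifest can be walked line by line with the standard library. The sketch below is a minimal example under two assumptions that the text above does not state: that md5Checksum is the base64-encoded MD5 digest of the data file, and that it covers the stored (compressed) object bytes. The helper names are hypothetical.

```python
import base64
import hashlib
import json
from typing import Iterator


def iter_manifest_entries(manifest_text: str) -> Iterator[dict]:
    """Yield one dict per non-empty line of a manifest-files.json body."""
    for line in manifest_text.splitlines():
        if line.strip():
            yield json.loads(line)


def md5_base64(data: bytes) -> str:
    """Return the base64-encoded MD5 digest of data."""
    return base64.b64encode(hashlib.md5(data).digest()).decode("ascii")


def matches_manifest(entry: dict, data_file_bytes: bytes) -> bool:
    """Compare a downloaded data file against its manifest entry.

    Assumption: md5Checksum is a base64 MD5 of the object as stored in S3.
    """
    return md5_base64(data_file_bytes) == entry["md5Checksum"]


def total_items(manifest_text: str) -> int:
    """Sum itemCount over all data files listed in the manifest."""
    return sum(entry["itemCount"] for entry in iter_manifest_entries(manifest_text))
```

Comparing `total_items` against the itemCount in the summary manifest is one inexpensive consistency check before processing the data files.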
Data files
DynamoDB can export your table data in two formats: DynamoDB JSON and Amazon Ion. Regardless of the format you choose, your data will be written to multiple compressed files named by the keys. These files are also listed in the manifest-files.json file.
After a full export, all of your manifest files and data files are stored under the export ID folder in your Amazon S3 bucket:
amzn-s3-demo-bucket/DestinationPrefix
.
└── AWSDynamoDB
    ├── 01693685827463-2d8752fd     // the single full export
    │   ├── manifest-files.json     // manifest points to files under 'data' subfolder
    │   ├── manifest-files.checksum
    │   ├── manifest-summary.json   // stores metadata about request
    │   ├── manifest-summary.md5
    │   ├── data                    // The data exported by full export
    │   │   ├── asdl123dasas.json.gz
    │   │   ...
    │   └── _started                // empty file for permission check
DynamoDB JSON
A table export in DynamoDB JSON format consists of multiple Item objects. Each individual object is in DynamoDB's standard marshalled JSON format.
When creating custom parsers for DynamoDB JSON export data, note that the format is JSON Lines: newlines are used as item delimiters.
In the following example, a single item from a DynamoDB JSON export has been formatted on multiple lines for the sake of readability.
{
  "Item":{
    "Authors":{
      "SS":[
        "Author1",
        "Author2"
      ]
    },
    "Dimensions":{
      "S":"8.5 x 11.0 x 1.5"
    },
    "ISBN":{
      "S":"333-3333333333"
    },
    "Id":{
      "N":"103"
    },
    "InPublication":{
      "BOOL":false
    },
    "PageCount":{
      "N":"600"
    },
    "Price":{
      "N":"2000"
    },
    "ProductCategory":{
      "S":"Book"
    },
    "Title":{
      "S":"Book 103 Title"
    }
  }
}
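For a custom parser, each line can be decoded with a JSON parser and the type tags unwrapped recursively. The following is a minimal sketch using only the standard library, not the converter of any AWS SDK; the function names are hypothetical, and binary values (B, BS) are deliberately left as the text found in the file.

```python
import json
from decimal import Decimal
from typing import Any


def unmarshal(attr: dict) -> Any:
    """Convert one DynamoDB JSON attribute value into a plain Python value."""
    # Each attribute value is a single-entry object: {type tag: value}.
    type_tag, value = next(iter(attr.items()))
    if type_tag == "N":
        return Decimal(value)          # numbers are exported as strings
    if type_tag == "NULL":
        return None
    if type_tag == "SS":
        return set(value)
    if type_tag == "NS":
        return {Decimal(v) for v in value}
    if type_tag == "L":
        return [unmarshal(v) for v in value]
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    # S and BOOL are already native; B and BS are left untouched in this sketch.
    return value


def parse_export_line(line: str) -> dict:
    """Parse one line of a full-export data file into a plain dict."""
    item = json.loads(line)["Item"]
    return {name: unmarshal(attr) for name, attr in item.items()}


line = '{"Item":{"Id":{"N":"103"},"InPublication":{"BOOL":false},"Title":{"S":"Book 103 Title"}}}'
print(parse_export_line(line))
```

Using `Decimal` rather than `float` avoids losing precision on large or fractional number attributes.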
Amazon Ion
When you export a table to Ion format, the DynamoDB data types used in the table are mapped to Ion data types.
The following table lists the mapping of DynamoDB data types to Ion data types:
DynamoDB data type | Ion representation |
---|---|
String (S) | string |
Boolean (BOOL) | bool |
Number (N) | decimal |
Binary (B) | blob |
Set (SS, NS, BS) | list (with type annotation $dynamodb_SS, $dynamodb_NS, or $dynamodb_BS) |
List | list |
Map | struct |
Items in an Ion export are delimited by newlines. Each line begins with an Ion version marker, followed by an item in Ion format. In the following example, an item from an Ion export has been formatted on multiple lines for the sake of readability.
$ion_1_0 {
  Item:{
    Authors:$dynamodb_SS::["Author1","Author2"],
    Dimensions:"8.5 x 11.0 x 1.5",
    ISBN:"333-3333333333",
    Id:103.,
    InPublication:false,
    PageCount:6d2,
    Price:2d3,
    ProductCategory:"Book",
    Title:"Book 103 Title"
  }
}
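Ion's text format writes decimals with a `d` exponent, so `6d2` in the example above is 600, `2d3` is 2000, and `103.` is 103. A full Ion reader is the better choice for real parsing; the sketch below only illustrates the notation for plain decimal literals, and the helper name is hypothetical.

```python
from decimal import Decimal


def ion_decimal_to_decimal(text: str) -> Decimal:
    """Convert a plain Ion text decimal literal to a Python Decimal.

    Sketch only: it handles simple literals such as 6d2 or 103. and does
    not handle annotations, digit separators, or other Ion types.
    """
    # Ion marks the decimal exponent with "d" or "D";
    # Python's Decimal constructor expects "e" instead.
    return Decimal(text.replace("d", "e").replace("D", "e"))


for literal in ("103.", "6d2", "2d3"):
    print(literal, "->", int(ion_decimal_to_decimal(literal)))
```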
Incremental exports
An incremental export creates the same manifest files, checksum files, and empty _started file as described above for a full export, under the same export-prefix/AWSDynamoDB/ExportId/ path.
The summary manifest
The manifest-summary.json file of an incremental export contains exportFromTime and exportToTime in place of exportTime, and adds an outputView field. Its format is as follows:
{
  "version": "2023-08-01",
  "exportArn": "arn:aws:dynamodb:us-east-1:599882009758:table/export-test/export/01695097218000-d6299cbd",
  "startTime": "2023-09-19T04:20:18.000Z",
  "endTime": "2023-09-19T04:40:24.780Z",
  "tableArn": "arn:aws:dynamodb:us-east-1:599882009758:table/export-test",
  "tableId": "b116b490-6460-4d4a-9a6b-5d360abf4fb3",
  "exportFromTime": "2023-09-18T17:00:00.000Z",
  "exportToTime": "2023-09-19T04:00:00.000Z",
  "s3Bucket": "jason-exports",
  "s3Prefix": "20230919-prefix",
  "s3SseAlgorithm": "AES256",
  "s3SseKmsKeyId": null,
  "manifestFilesS3Key": "20230919-prefix/AWSDynamoDB/01693685934212-ac809da5/manifest-files.json",
  "billedSizeBytes": 20901239349,
  "itemCount": 169928274,
  "outputFormat": "DYNAMODB_JSON",
  "outputView": "NEW_AND_OLD_IMAGES",
  "exportType": "INCREMENTAL_EXPORT"
}
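The export window of an incremental export can be read from exportFromTime and exportToTime. The following sketch is one way to do that, with a hypothetical helper name; it rejects summaries whose exportType is not INCREMENTAL_EXPORT, since those fields appear only in the incremental example above.

```python
from datetime import datetime, timedelta


def export_window(summary: dict) -> timedelta:
    """Return exportToTime minus exportFromTime for an incremental export."""
    if summary.get("exportType") != "INCREMENTAL_EXPORT":
        raise ValueError("summary does not describe an incremental export")
    # Swap the trailing "Z" for an explicit offset so that
    # datetime.fromisoformat also accepts it before Python 3.11.
    start = datetime.fromisoformat(summary["exportFromTime"].replace("Z", "+00:00"))
    end = datetime.fromisoformat(summary["exportToTime"].replace("Z", "+00:00"))
    return end - start


# Fields copied from the summary manifest shown above.
summary = {
    "exportFromTime": "2023-09-18T17:00:00.000Z",
    "exportToTime": "2023-09-19T04:00:00.000Z",
    "exportType": "INCREMENTAL_EXPORT",
}
print(export_window(summary))
```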
The files manifest
The manifest-files.json file is in JSON Lines format, with one entry per data file. For an incremental export, each entry points to a data file in the shared data folder:
{
  "itemCount": 8,
  "md5Checksum": "sQMSpEILNgoQmarvDFonGQ==",
  "etag": "af83d6f217c19b8b0fff8023d8ca4716-1",
  "dataFileS3Key": "AWSDynamoDB/data/sgad6417s6vss4p7owp0471bcq.json.gz"
}
Data files
As with full exports, the data is written to multiple compressed files in either DynamoDB JSON or Amazon Ion format, and these files are listed in the manifest-files.json file.
The data files for incremental exports are all contained in a common data folder in your S3 bucket. Your manifest files are under your export ID folder.
amzn-s3-demo-bucket/DestinationPrefix
.
└── AWSDynamoDB
    ├── 01693685934212-ac809da5     // an incremental export ID
    │   ├── manifest-files.json     // manifest points to files under 'data' folder
    │   ├── manifest-files.checksum
    │   ├── manifest-summary.json   // stores metadata about request
    │   ├── manifest-summary.md5
    │   └── _started                // empty file for permission check
    ├── 01693686034521-ac809da5
    │   ├── manifest-files.json
    │   ├── manifest-files.checksum
    │   ├── manifest-summary.json
    │   ├── manifest-summary.md5
    │   └── _started
    ├── data                        // stores all the data files for incremental exports
    │   ├── sgad6417s6vss4p7owp0471bcq.json.gz
    │   ...
In your export files, each item's output includes a timestamp that represents when that item was updated in your table and a data structure that indicates if it was an insert, update, or delete operation. The timestamp is based on an internal system clock and can vary from your application clock. For incremental exports, you can choose between two export view types for your output structure: new and old images or new images only.
- New image provides the latest state of the item.
- Old image provides the state of the item right before the specified start date and time.
View types can be helpful if you want to see how the item was changed within the export period. They can also be useful for efficiently updating your downstream systems, especially if those downstream systems have a partition key that is not the same as your DynamoDB partition key.
You can infer whether an item in your incremental export output was an insert, update, or delete by looking at the structure of the output. The incremental export structure and its corresponding operations are summarized in the table below for both export view types.
Operation | New images only | New and old images |
---|---|---|
Insert | Keys + new image | Keys + new image |
Update | Keys + new image | Keys + new image + old image |
Delete | Keys | Keys + old image |
Insert + delete | No output | No output |
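The table above translates directly into a structural check on each record. The sketch below assumes records with the top-level fields Keys, NewImage, and OldImage; the function name and the boolean parameter are hypothetical. Note that with new images only, inserts and updates have the same shape and cannot be told apart.

```python
def classify(record: dict, old_images_included: bool = True) -> str:
    """Infer the operation behind one incremental export record.

    old_images_included is True for the "new and old images" view and
    False for the "new images only" view.
    """
    has_new = "NewImage" in record
    has_old = "OldImage" in record
    if old_images_included:
        if has_new and has_old:
            return "update"
        if has_new:
            return "insert"
        if has_old:
            return "delete"
        raise ValueError("record has neither NewImage nor OldImage")
    # New images only: keys + new image covers both inserts and updates,
    # and a delete carries the keys alone.
    return "insert_or_update" if has_new else "delete"


record = {"Keys": {"PK": {"S": "CUST#300"}}, "OldImage": {"PK": {"S": "CUST#300"}}}
print(classify(record))
```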
DynamoDB JSON
Each record of an incremental export in DynamoDB JSON format consists of a metadata timestamp that indicates the write time of the item, followed by the keys of the item and the values. The following example shows DynamoDB JSON output for the new and old images export view type.
// Ex 1: Insert
// An insert means the item did not exist before the incremental export window
// and was added during the incremental export window
{
  "Metadata": {
    "WriteTimestampMicros": "1680109764000000"
  },
  "Keys": {
    "PK": { "S": "CUST#100" }
  },
  "NewImage": {
    "PK": { "S": "CUST#100" },
    "FirstName": { "S": "John" },
    "LastName": { "S": "Don" }
  }
}

// Ex 2: Update
// An update means the item existed before the incremental export window
// and was updated during the incremental export window.
// The OldImage would not be present if choosing "New images only".
{
  "Metadata": {
    "WriteTimestampMicros": "1680109764000000"
  },
  "Keys": {
    "PK": { "S": "CUST#200" }
  },
  "OldImage": {
    "PK": { "S": "CUST#200" },
    "FirstName": { "S": "Mary" },
    "LastName": { "S": "Grace" }
  },
  "NewImage": {
    "PK": { "S": "CUST#200" },
    "FirstName": { "S": "Mary" },
    "LastName": { "S": "Smith" }
  }
}

// Ex 3: Delete
// A delete means the item existed before the incremental export window
// and was deleted during the incremental export window
// The OldImage would not be present if choosing "New images only".
{
  "Metadata": {
    "WriteTimestampMicros": "1680109764000000"
  },
  "Keys": {
    "PK": { "S": "CUST#300" }
  },
  "OldImage": {
    "PK": { "S": "CUST#300" },
    "FirstName": { "S": "Jose" },
    "LastName": { "S": "Hernandez" }
  }
}

// Ex 4: Insert + Delete
// Nothing is exported if an item is inserted and deleted within the
// incremental export window.
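To show how such records might feed a downstream system, the following sketch converts WriteTimestampMicros to a UTC datetime and applies records to an in-memory dictionary that stands in for the target store. All names are hypothetical, and it assumes each record has already been parsed from one line of a data file (the `//` comments in the example above are explanatory only and would not parse as JSON).

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def write_time(record: dict) -> datetime:
    """Convert Metadata.WriteTimestampMicros to an aware UTC datetime."""
    micros = int(record["Metadata"]["WriteTimestampMicros"])
    # Integer microsecond arithmetic avoids float rounding.
    return EPOCH + timedelta(microseconds=micros)


def apply_record(store: dict, record: dict) -> None:
    """Upsert or delete one item in a dict standing in for a downstream store."""
    # Serialize the key attributes into a stable, hashable dictionary key.
    key = json.dumps(record["Keys"], sort_keys=True)
    if "NewImage" in record:
        store[key] = record["NewImage"]   # insert or update
    else:
        store.pop(key, None)              # delete


store = {}
insert = {
    "Metadata": {"WriteTimestampMicros": "1680109764000000"},
    "Keys": {"PK": {"S": "CUST#100"}},
    "NewImage": {"PK": {"S": "CUST#100"}, "FirstName": {"S": "John"}},
}
apply_record(store, insert)
print(write_time(insert).isoformat(), len(store))
```

Because a record without a NewImage is treated as a delete in both view types, the same function works whether or not old images were exported.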
Amazon Ion
When you export a table to Ion format, the DynamoDB data types used in the table are mapped to Ion data types, using the same mapping as the table of DynamoDB data types and Ion representations listed earlier for full exports.
Items in an Ion export are delimited by newlines. Each line begins with an Ion version marker, followed by an item in Ion format. In the following example, an item from an Ion export has been formatted on multiple lines for the sake of readability.
$ion_1_0 {
  Record:{
    Keys:{
      ISBN:"333-3333333333"
    },
    Metadata:{
      WriteTimestampMicros:1684374845117899.
    },
    OldImage:{
      Authors:$dynamodb_SS::["Author1","Author2"],
      ISBN:"333-3333333333",
      Id:103.,
      InPublication:false,
      ProductCategory:"Book",
      Title:"Book 103 Title"
    },
    NewImage:{
      Authors:$dynamodb_SS::["Author1","Author2"],
      Dimensions:"8.5 x 11.0 x 1.5",
      ISBN:"333-3333333333",
      Id:103.,
      InPublication:true,
      PageCount:6d2,
      Price:2d3,
      ProductCategory:"Book",
      Title:"Book 103 Title"
    }
  }
}