Deliver DynamoDB records to Amazon S3 using Kinesis Data Streams and Firehose with AWS CDK
Created by Shashank Shrivastava (AWS) and Daniel Matuki da Cunha (AWS)
Summary
This pattern provides sample code and an application for delivering records from Amazon DynamoDB to Amazon Simple Storage Service (Amazon S3) by using Amazon Kinesis Data Streams and Amazon Data Firehose. The pattern’s approach uses AWS Cloud Development Kit (AWS CDK) L3 constructs and includes an example of how to perform data transformation with AWS Lambda before data is delivered to the target S3 bucket on the Amazon Web Services (AWS) Cloud.
Kinesis Data Streams records item-level modifications in DynamoDB tables and replicates them to the required Kinesis data stream. Your applications can access the Kinesis data stream and view the item-level changes in near-real time. Kinesis Data Streams also provides access to other Amazon Kinesis services, such as Firehose and Amazon Managed Service for Apache Flink. This means that you can build applications that provide real-time dashboards, generate alerts, implement dynamic pricing and advertising, and perform sophisticated data analysis.
You can use this pattern for your data integration use cases. For example, transportation vehicles or industrial equipment can send high volumes of data to a DynamoDB table. This data can then be transformed and stored in a data lake hosted in Amazon S3. You can then query and process the data and predict any potential defects by using serverless services such as Amazon Athena, Amazon Redshift Spectrum, Amazon Rekognition, and AWS Glue.
Prerequisites and limitations
Prerequisites
An active AWS account.
AWS Command Line Interface (AWS CLI), installed and configured. For more information, see Getting started with the AWS CLI in the AWS CLI documentation.
Node.js (18.x+) and npm, installed and configured. For more information, see Downloading and installing Node.js and npm in the npm documentation.
aws-cdk (2.x+), installed and configured. For more information, see Getting started with the AWS CDK in the AWS CDK documentation.
The GitHub aws-dynamodb-kinesisfirehose-s3-ingestion repository, cloned and configured on your local machine.
Existing sample data for the DynamoDB table. The data must use the following format:
{"SourceDataId": {"S": "123"},"MessageData":{"S": "Hello World"}}
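The sample format above is DynamoDB JSON, in which each attribute value is wrapped in a type descriptor (`S` for string). As a minimal sketch, the following TypeScript builds and checks an item in that shape. The names `SampleItem`, `toSampleItem`, and `isSampleItem` are hypothetical helpers introduced for illustration and are not part of the pattern's repository.

```typescript
// Shape of a sample item in DynamoDB JSON, as required by this pattern.
interface SampleItem {
  SourceDataId: { S: string };
  MessageData: { S: string };
}

// Hypothetical helper: wraps plain strings in the {"S": ...} string type descriptor.
function toSampleItem(sourceDataId: string, messageData: string): SampleItem {
  return { SourceDataId: { S: sourceDataId }, MessageData: { S: messageData } };
}

// Hypothetical type guard: checks that an unknown value matches the required format.
function isSampleItem(value: unknown): value is SampleItem {
  if (typeof value !== "object" || value === null) return false;
  const item = value as Record<string, unknown>;
  const isStringAttr = (attr: unknown): boolean =>
    typeof attr === "object" &&
    attr !== null &&
    typeof (attr as { S?: unknown }).S === "string";
  return isStringAttr(item.SourceDataId) && isStringAttr(item.MessageData);
}

const sampleItem = toSampleItem("123", "Hello World");
console.log(JSON.stringify(sampleItem));
// prints {"SourceDataId":{"S":"123"},"MessageData":{"S":"Hello World"}}
```

A check like `isSampleItem` can be useful before loading data, because an item that omits the type descriptors (for example, `{"SourceDataId": "123"}`) does not match the required format.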
Architecture
The following diagram shows an example workflow for delivering records from DynamoDB to Amazon S3 by using Kinesis Data Streams and Firehose.

The diagram shows the following workflow:
Data is ingested using Amazon API Gateway as a proxy for DynamoDB. You can also use any other source to ingest data into DynamoDB.
Item-level changes are generated in near-real time in Kinesis Data Streams for delivery to Amazon S3.
Kinesis Data Streams sends the records to Firehose for transformation and delivery.
A Lambda function converts the records from a DynamoDB record format to JSON format, which contains only the record item attribute names and values.
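The transformation step in the workflow above can be sketched as follows. This is an illustrative sketch, not the function from the pattern's repository. It assumes that each Firehose record is a base64-encoded change record whose new item state is under `dynamodb.NewImage`, and that the function returns the `recordId`, `result`, and `data` fields that Firehose expects from a transformation function. The names `transformRecords`, `unmarshalItem`, and `unmarshalValue` are hypothetical, and only a subset of DynamoDB attribute types is handled.

```typescript
// Subset of DynamoDB attribute-value types handled by this sketch (assumption:
// sets and binary values are not needed for the sample data).
type AttributeValue =
  | { S: string }
  | { N: string }
  | { BOOL: boolean }
  | { NULL: true }
  | { L: AttributeValue[] }
  | { M: Record<string, AttributeValue> };

// Request and response record shapes for a Firehose transformation function.
interface FirehoseRecord {
  recordId: string;
  data: string; // base64-encoded payload
}
interface FirehoseResultRecord {
  recordId: string;
  result: "Ok" | "Dropped" | "ProcessingFailed";
  data: string; // base64-encoded payload
}

// Convert one typed attribute value into a plain JavaScript value.
// Note: Number() can lose precision for very large DynamoDB numbers.
function unmarshalValue(value: AttributeValue): unknown {
  if ("S" in value) return value.S;
  if ("N" in value) return Number(value.N);
  if ("BOOL" in value) return value.BOOL;
  if ("NULL" in value) return null;
  if ("L" in value) return value.L.map(unmarshalValue);
  return unmarshalItem(value.M);
}

// Convert a whole item so that only attribute names and plain values remain.
function unmarshalItem(item: Record<string, AttributeValue>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const name of Object.keys(item)) out[name] = unmarshalValue(item[name]);
  return out;
}

// In a Lambda deployment, a function like this would be exported as the handler.
function transformRecords(event: { records: FirehoseRecord[] }): { records: FirehoseResultRecord[] } {
  const records = event.records.map((record): FirehoseResultRecord => {
    try {
      const change = JSON.parse(Buffer.from(record.data, "base64").toString("utf8"));
      const image = change?.dynamodb?.NewImage;
      if (!image) {
        // No new item state (for example, a delete): drop the record in this sketch.
        return { recordId: record.recordId, result: "Dropped", data: record.data };
      }
      // Newline-delimited JSON keeps objects separable once batched into one S3 object.
      const json = JSON.stringify(unmarshalItem(image)) + "\n";
      return {
        recordId: record.recordId,
        result: "Ok",
        data: Buffer.from(json, "utf8").toString("base64"),
      };
    } catch {
      // Payload could not be parsed or converted: report the failure to Firehose.
      return { recordId: record.recordId, result: "ProcessingFailed", data: record.data };
    }
  });
  return { records };
}
```

With the sample item from the prerequisites as `NewImage`, this sketch would emit `{"SourceDataId":"123","MessageData":"Hello World"}` followed by a newline. Dropping records that have no `NewImage` is a design choice of the sketch. A data lake that needs to track deletions would handle those records differently.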
Tools
AWS services
AWS Cloud Development Kit (AWS CDK) is a software development framework that helps you define and provision AWS Cloud infrastructure in code.
AWS CDK Toolkit is a command line cloud development kit that helps you interact with your AWS CDK app.
AWS Command Line Interface (AWS CLI) is an open-source tool that helps you interact with AWS services through commands in your command-line shell.
AWS CloudFormation helps you set up AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle across AWS accounts and AWS Regions.
Code repository
The code for this pattern is available in the GitHub aws-dynamodb-kinesisfirehose-s3-ingestion repository.
Epics
Task | Description | Skills required |
---|---|---|
Install the dependencies. | On your local machine, install the dependencies from the | App developer, General AWS |
Generate the CloudFormation template. | | App developer, General AWS, AWS DevOps |
Task | Description | Skills required |
---|---|---|
Check and deploy the resources. | | App developer, General AWS, AWS DevOps |
Task | Description | Skills required |
---|---|---|
Ingest your sample data into the DynamoDB table. | Send a request to your DynamoDB table by running the following command in AWS CLI:
example:
By default, the
Note: You can use different approaches to add data into a DynamoDB table. For more information, see Load data into tables in the DynamoDB documentation. | App developer |
Verify that a new object is created in the S3 bucket. | Sign in to the AWS Management Console and monitor the S3 bucket to verify that a new object was created with the data that you sent. For more information, see GetObject in the Amazon S3 documentation. | App developer, General AWS |
Task | Description | Skills required |
---|---|---|
Clean up resources. | Run the | App developer, General AWS |
Related resources
s3-static-site-stack.ts (GitHub repository)
aws-apigateway-dynamodb module (GitHub repository)
aws-kinesisstreams-kinesisfirehose-s3 module (GitHub repository)
Change data capture for DynamoDB Streams (DynamoDB documentation)
Using Kinesis Data Streams to capture changes to DynamoDB (DynamoDB documentation)