Process events asynchronously with Amazon API Gateway and Amazon DynamoDB Streams
Created by Andrea Meroni (AWS), Alessandro Trisolini (AWS), Nadim Majed (AWS), Mariem Kthiri (AWS), and Michael Wallner (AWS)
Code repository: Asynchronous Processing with API Gateway and DynamoDB Streams | Environment: PoC or pilot | Technologies: Serverless |
AWS services: Amazon API Gateway; Amazon DynamoDB; Amazon DynamoDB Streams; AWS Lambda; Amazon SNS |
Summary
Amazon API Gateway is a fully managed service that developers can use to create, publish, maintain, monitor, and secure APIs at any scale. It handles the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls.
An important service quota of API Gateway is the integration timeout. The timeout is the maximum time in which a backend service must return a response before the REST API returns an error. The hard limit of 29 seconds is generally acceptable for synchronous workloads. However, that limit represents a challenge for those developers who want to use API Gateway with asynchronous workloads.
This pattern shows an example architecture for processing events asynchronously using API Gateway, Amazon DynamoDB Streams, and AWS Lambda. The architecture supports running parallel processing jobs with the same input parameters, and it uses a basic REST API as the interface. In this example, using Lambda as the backend limits the duration of jobs to 15 minutes. You can avoid this limit by using an alternative service to process incoming events (for example, AWS Fargate).
Prerequisites and limitations
Prerequisites
An active AWS account
The following tools installed on your workstation:
AWS Cloud Development Kit (AWS CDK) Toolkit version 2.85.0 or later
Docker version 20.10.21 or later
Node.js version 18 or later
Projen version 0.71.111 or later
Python version 3.9.16 or later
Limitations
To avoid throttling, no more than two readers should consume from the same DynamoDB Streams shard.
The maximum runtime of a job is limited by the maximum runtime for Lambda functions (15 minutes).
The maximum number of concurrent job requests is limited by the reserved concurrency of the Lambda functions.
Architecture
The following diagram shows the interaction of the jobs API with DynamoDB Streams and the event-processing and error-handling Lambda functions, with events stored in an Amazon EventBridge event archive.
A typical workflow includes the following steps:
1. You authenticate against AWS Identity and Access Management (IAM) and obtain security credentials.
2. You send an HTTP POST request to the /jobs jobs API endpoint, specifying the job parameters in the request body.
3. The jobs API returns to you an HTTP response that contains the job identifier.
4. The jobs API puts the job parameters in the jobs_table Amazon DynamoDB table.
5. The DynamoDB stream of the jobs_table DynamoDB table invokes the event-processing Lambda functions.
6. The event-processing Lambda functions process the event and then put the job results in the jobs_table DynamoDB table. To help ensure consistent results, the event-processing functions implement an optimistic locking mechanism.
7. You send an HTTP GET request to the /jobs/{jobId} jobs API endpoint, with the job identifier from step 3 as {jobId}.
8. The jobs API queries the jobs_table DynamoDB table to retrieve the job results.
9. The jobs API returns an HTTP response that contains the job results.
10. If the event processing fails, the event-processing function's source mapping sends the event to the error-handling Amazon Simple Notification Service (Amazon SNS) topic.
11. The error-handling SNS topic asynchronously pushes the event to the error-handling function.
12. The error-handling function puts the job parameters in the jobs_table DynamoDB table. You can retrieve the job parameters by sending an HTTP GET request to the /jobs/{jobId} jobs API endpoint.
13. If the error handling fails, the error-handling function sends the event to an Amazon EventBridge archive. You can replay the archived events by using EventBridge.
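The request flow described above can be approximated with a small in-memory simulation. This is only a sketch for illustration, not the code from the repository: the dictionary stands in for the jobs_table DynamoDB table, and the function names, the `status`, `result`, and `error` attributes, and the job logic are all hypothetical. The error path is also simplified to a status flag instead of an SNS topic and an EventBridge archive.

```python
import uuid

# In-memory stand-in for the jobs_table DynamoDB table (illustration only).
jobs_table = {}


def process_event(parameters):
    """Hypothetical job logic; a real job could run for several minutes."""
    return {"sum": sum(parameters["numbers"])}


def on_stream_record(job_id):
    """Plays the role of the event-processing Lambda function."""
    item = jobs_table[job_id]
    try:
        item["result"] = process_event(item["parameters"])
        item["status"] = "COMPLETED"
    except Exception as error:
        # Simplified stand-in for the error-handling path.
        item["status"] = "FAILED"
        item["error"] = str(error)


def post_job(parameters):
    """POST /jobs: store the parameters and return the job identifier at once."""
    job_id = str(uuid.uuid4())
    jobs_table[job_id] = {"parameters": parameters, "status": "PENDING"}
    return {"jobId": job_id}


def get_job(job_id):
    """GET /jobs/{jobId}: return whatever is currently stored for the job."""
    return jobs_table[job_id]


response = post_job({"numbers": [1, 2, 3]})
# In AWS, the DynamoDB stream triggers processing asynchronously.
on_stream_record(response["jobId"])
print(get_job(response["jobId"])["status"])  # prints COMPLETED
```

The point of the design is that `post_job` returns before any processing happens, so the API call stays well within the integration timeout regardless of how long the job runs.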
Tools
AWS services
AWS Cloud Development Kit (AWS CDK) is a software development framework that helps you define and provision AWS Cloud infrastructure in code.
Amazon DynamoDB is a fully managed NoSQL database service that provides fast, predictable, and scalable performance.
Amazon EventBridge is a serverless event bus service that helps you connect your applications with real-time data from a variety of sources, such as AWS Lambda functions, HTTP invocation endpoints using API destinations, or event buses in other AWS accounts.
AWS Lambda is a compute service that helps you run code without needing to provision or manage servers. It runs your code only when needed and scales automatically, so you pay only for the compute time that you use.
Amazon Simple Notification Service (Amazon SNS) helps you coordinate and manage the exchange of messages between publishers and clients, including web servers and email addresses.
Other tools
autopep8 automatically formats Python code based on the Python Enhancement Proposal (PEP) 8 style guide.
Bandit scans Python code to find common security issues.
Commitizen is a Git commit checker and CHANGELOG generator.
cfn-lint is an AWS CloudFormation linter.
Checkov is a static code-analysis tool that checks infrastructure as code (IaC) for security and compliance misconfigurations.
jq is a command-line tool for parsing JSON.
Postman is an API platform.
pre-commit is a Git hooks manager.
Projen is a project generator.
pytest is a Python framework for writing small, readable tests.
Code repository
The code for this example architecture is available in the GitHub Asynchronous Processing with API Gateway and DynamoDB Streams repository.
Best practices
This example architecture doesn't include monitoring of the deployed infrastructure. If your use case requires monitoring, evaluate adding CDK Monitoring Constructs or another monitoring solution.
This example architecture uses IAM permissions to control the access to the jobs API. Anyone authorized to assume the JobsAPIInvokeRole will be able to invoke the jobs API. As such, the access control mechanism is binary. If your use case requires a more complex authorization model, evaluate using a different access control mechanism.
When a user sends an HTTP POST request to the /jobs jobs API endpoint, the input data is validated at two different levels:
API Gateway is in charge of the first request validation.
The event-processing function performs the second request validation.
No validation is performed when the user does an HTTP GET request to the /jobs/{jobId} jobs API endpoint. If your use case requires additional input validation and an increased level of security, evaluate using AWS WAF to protect your API.
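As an illustration of the second validation level, the following sketch checks a single DynamoDB stream record before processing it. It is a minimal example under stated assumptions, not the repository's implementation: the attribute names `id` and `seconds` and the accepted range are hypothetical. The `eventName` field, the `dynamodb.NewImage` field, and the typed attribute values such as `{"S": ...}` and `{"N": ...}` follow the standard DynamoDB stream record layout.

```python
def validate_job_record(record):
    """Return the job parameters from a stream record, or raise ValueError."""
    if record.get("eventName") != "INSERT":
        raise ValueError("only newly inserted jobs are processed")

    # Stream images carry typed values, for example {"S": "text"} or {"N": "42"}.
    image = record.get("dynamodb", {}).get("NewImage", {})
    job_id = image.get("id", {}).get("S")        # hypothetical attribute name
    raw_seconds = image.get("seconds", {}).get("N")  # hypothetical attribute name

    if not job_id:
        raise ValueError("missing job identifier")
    try:
        seconds = int(raw_seconds)
    except (TypeError, ValueError):
        raise ValueError("seconds must be an integer") from None
    # Hypothetical bound: 900 seconds matches the 15-minute Lambda limit.
    if not 1 <= seconds <= 900:
        raise ValueError("seconds must be between 1 and 900")
    return {"id": job_id, "seconds": seconds}
```

Validating again inside the function protects against items that reach the table through a path other than the API, which the API Gateway request validation cannot see.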
To avoid throttling, the DynamoDB Streams documentation discourages users from reading with more than two consumers from the same stream’s shard. To scale out the number of consumers, we recommend using Amazon Kinesis Data Streams.
Optimistic locking has been used in this example to ensure consistent updates of items in the jobs_table DynamoDB table. Depending on the use-case requirement, you might need to implement more reliable locking mechanisms, such as pessimistic locking.
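To make the idea of optimistic locking concrete, here is a minimal in-memory sketch. It assumes each item carries a hypothetical `version` attribute. All names (`VersionConflict`, `write_result`, `save_with_retry`) are made up for illustration and are not taken from the repository. With DynamoDB, the version comparison would typically be expressed as a condition on the write so that the service checks it atomically. The plain dictionary used here is not safe for real concurrent writers.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the writer read."""


def write_result(table, job_id, result, expected_version):
    """Write only if nobody else updated the item since it was read."""
    item = table[job_id]
    if item["version"] != expected_version:
        raise VersionConflict(job_id)
    item["result"] = result
    item["version"] = expected_version + 1


def save_with_retry(table, job_id, compute, attempts=3):
    """Read, compute, and write; re-read and retry if another writer got in first."""
    for _ in range(attempts):
        version = table[job_id]["version"]  # remember the version that was read
        result = compute(table[job_id])
        try:
            write_result(table, job_id, result, version)
            return result
        except VersionConflict:
            continue  # the item changed in the meantime, so start over
    raise VersionConflict(job_id)


table = {"job-1": {"version": 0, "parameters": {"numbers": [1, 2, 3]}}}
save_with_retry(table, "job-1", lambda item: sum(item["parameters"]["numbers"]))
print(table["job-1"]["version"])  # prints 1
```

No lock is held while the job computes; a conflict is only detected at write time, which is why the approach suits workloads where concurrent updates to the same item are rare.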
Epics
Task | Description | Skills required |
---|---|---|
Clone the repository. | To clone the repository locally, run the following command: | DevOps engineer |
Set up the project. | Change the directory to the repository root, and set up the Python virtual environment and all the tools by using Projen | DevOps engineer |
Install pre-commit hooks. | To install pre-commit hooks, do the following: | DevOps engineer |
Task | Description | Skills required |
---|---|---|
Bootstrap AWS CDK. | To bootstrap AWS CDK | AWS DevOps |
Deploy the example architecture. | To deploy the example architecture in your AWS account, run the following command: | AWS DevOps |
Task | Description | Skills required |
---|---|---|
Install test prerequisites. | Install on your workstation the AWS Command Line Interface (AWS CLI), Postman Using Postman | DevOps engineer |
Assume the | Assume | AWS DevOps |
Configure Postman. | | AWS DevOps |
Test the example architecture. | To test the example architecture, send requests to the jobs API. For more information, see the Postman documentation. | DevOps engineer |
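If you script the test rather than sending requests by hand, the client has to poll the /jobs/{jobId} endpoint until the job finishes. The sketch below shows one way to structure that loop. It is an assumption-laden example: `wait_for_job`, the `status` field, and the `PENDING` value are hypothetical. The `fetch_job` callable is a placeholder for code that sends the HTTP GET request. Because the jobs API is protected by IAM permissions, that request would need to be signed with AWS Signature Version 4.

```python
import time


def wait_for_job(fetch_job, job_id, interval=2.0, timeout=60.0, sleep=time.sleep):
    """Poll until the job leaves the hypothetical PENDING state or time runs out.

    fetch_job is any callable that takes a job identifier and returns the
    decoded response body of GET /jobs/{jobId} as a dictionary.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_job(job_id)
        if job.get("status") != "PENDING":
            return job
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job {job_id} is still pending after {timeout} seconds")
        sleep(interval)
```

Passing the HTTP call and the sleep function as parameters keeps the loop independent of any particular HTTP library and makes it easy to exercise in a unit test with a fake.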
Troubleshooting
Issue | Solution |
---|---|
Destruction and subsequent redeployment of the example architecture fails because the Amazon CloudWatch Logs log group | |