

# Best practices for designing and architecting with DynamoDB
<a name="best-practices"></a>

Use this section to quickly find recommendations for maximizing performance and minimizing throughput costs when working with DynamoDB.

**Topics**
+ [NoSQL design for DynamoDB](bp-general-nosql-design.md)
+ [Using the DynamoDB Well-Architected Lens to optimize your DynamoDB workload](bp-wal.md)
+ [Best practices for designing and using partition keys effectively in DynamoDB](bp-partition-key-design.md)
+ [Best practices for using sort keys to organize data in DynamoDB](bp-sort-keys.md)
+ [Best practices for using secondary indexes in DynamoDB](bp-indexes.md)
+ [Best practices for storing large items and attributes in DynamoDB](bp-use-s3-too.md)
+ [Best practices for handling time series data in DynamoDB](bp-time-series.md)
+ [Best practices for managing many-to-many relationships in DynamoDB tables](bp-adjacency-graphs.md)
+ [Best practices for querying and scanning data in DynamoDB](bp-query-scan.md)
+ [Best practices for DynamoDB table design](bp-table-design.md)
+ [Using DynamoDB global tables](bp-global-table-design.md)
+ [Best practices for managing the control plane in DynamoDB](bp-control-plane.md)
+ [Best practices for using bulk data operations in DynamoDB](BestPractices_BulkDataOperations.md)
+ [Best practices for handling concurrent updates in DynamoDB](BestPractices_ImplementingVersionControl.md)
+ [Best practices for understanding your AWS billing and usage reports in DynamoDB](bp-understanding-billing.md)
+ [Migrating a DynamoDB table from one account to another](bp-migrating-table-between-accounts.md)
+ [Prescriptive guidance to integrate DAX with DynamoDB applications](dax-prescriptive-guidance.md)
+ [Considerations when using AWS PrivateLink for Amazon DynamoDB](privatelink-interface-endpoints.md#privatelink-considerations) 

# NoSQL design for DynamoDB
<a name="bp-general-nosql-design"></a>

NoSQL database systems like Amazon DynamoDB use alternative models for data management, such as key-value pairs or document storage. When you switch from a relational database management system to a NoSQL database system like DynamoDB, it's important to understand the key differences and specific design approaches.

**Topics**
+ [Differences between relational data design and NoSQL](#bp-general-nosql-design-vs-relational)
+ [Two key concepts for NoSQL design](#bp-general-nosql-design-concepts)
+ [Approaching NoSQL design](#bp-general-nosql-design-approach)
+ [NoSQL Workbench for DynamoDB](#bp-general-nosql-workbench)

## Differences between relational data design and NoSQL
<a name="bp-general-nosql-design-vs-relational"></a>

Relational database systems (RDBMS) and NoSQL databases have different strengths and weaknesses:
+ In RDBMS, data can be queried flexibly, but queries are relatively expensive and don't scale well in high-traffic situations (see [First steps for modeling relational data in DynamoDB](bp-modeling-nosql.md)).
+ In a NoSQL database such as DynamoDB, data can be queried efficiently in a limited number of ways, outside of which queries can be expensive and slow.

These differences make database design different between the two systems:
+ In RDBMS, you design for flexibility without worrying about implementation details or performance. Query optimization generally doesn't affect schema design, but normalization is important.
+ In DynamoDB, you design your schema specifically to make the most common and important queries as fast and as inexpensive as possible. Your data structures are tailored to the specific requirements of your business use cases.

## Two key concepts for NoSQL design
<a name="bp-general-nosql-design-concepts"></a>

NoSQL design requires a different mindset than RDBMS design. For an RDBMS, you can go ahead and create a normalized data model without thinking about access patterns. You can then extend it later when new questions and query requirements arise. You can organize each type of data into its own table.

**How NoSQL design is different**
+ By contrast, you shouldn't start designing your schema for DynamoDB until you know the questions it will need to answer. Understanding the business problems and the application use cases up front is essential.
+ You should maintain as few tables as possible in a DynamoDB application. Having fewer tables keeps things more scalable, requires less permissions management, and reduces overhead for your DynamoDB application. It can also help keep backup costs lower overall.

## Approaching NoSQL design
<a name="bp-general-nosql-design-approach"></a>

*The first step in designing your DynamoDB application is to identify the specific query patterns that the system must satisfy.*

In particular, it is important to understand three fundamental properties of your application's access patterns before you begin:
+ **Data size**: Knowing how much data will be stored and requested at one time will help determine the most effective way to partition the data.
+ **Data shape**: Instead of reshaping data when a query is processed (as an RDBMS system does), a NoSQL database organizes data so that its shape in the database corresponds with what will be queried. This is a key factor in increasing speed and scalability.
+ **Data velocity**: DynamoDB scales by increasing the number of physical partitions that are available to process queries, and by efficiently distributing data across those partitions. Knowing in advance what the peak query loads will be might help determine how to partition data to best use I/O capacity.

After you identify specific query requirements, you can organize data according to general principles that govern performance:
+ **Keep related data together.**   Research has shown that "locality of reference", keeping related data together in one place, is a key factor in improving performance and response times in NoSQL systems, just as it was found to be important for optimizing routing tables many years ago.

  As a general rule, you should maintain as few tables as possible in a DynamoDB application.

  Exceptions are cases where high-volume time series data are involved, or datasets that have very different access patterns. A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.
+ **Use sort order.**   Related items can be grouped together and queried efficiently if their key design causes them to sort together. This is an important NoSQL design strategy.
+ **Distribute queries.**   It's also important that a high volume of queries not be focused on one part of the database, where they can exceed I/O capacity. Instead, you should design data keys to distribute traffic evenly across partitions as much as possible, avoiding hot spots.
+ **Use global secondary indexes.**   By creating specific global secondary indexes, you can enable different queries than your main table can support, and that are still fast and relatively inexpensive.

These general principles translate into some common design patterns that you can use to model data efficiently in DynamoDB.
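
For example, with a hypothetical single-table design in which a customer's orders share the customer's partition key and use an `ORDER#` sort-key prefix, a single `Query` call retrieves the whole item collection in sort order. The table and key names below are illustrative, not from this guide:

```
aws dynamodb query \
    --table-name MyApplicationTable \
    --key-condition-expression "PK = :pk AND begins_with(SK, :prefix)" \
    --expression-attribute-values \
        '{":pk": {"S": "CUSTOMER#123"}, ":prefix": {"S": "ORDER#"}}'
```

Because the related items sort together under one partition key, this read is efficient and does not require a join or a scan.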

## NoSQL Workbench for DynamoDB
<a name="bp-general-nosql-workbench"></a>

 [NoSQL Workbench for DynamoDB](workbench.md) is a cross-platform, client-side GUI application that you can use for modern database development and operations. It's available for Windows, macOS, and Linux. NoSQL Workbench is a visual development tool that provides data modeling, data visualization, sample data generation, and query development features to help you design, create, query, and manage DynamoDB tables. With NoSQL Workbench for DynamoDB, you can build new data models from, or design models based on, existing data models that satisfy your application's data access patterns. You can also import and export the designed data model at the end of the process. For more information, see [Building data models with NoSQL Workbench](workbench.Modeler.md). 

# Using the DynamoDB Well-Architected Lens to optimize your DynamoDB workload
<a name="bp-wal"></a>

This section describes the Amazon DynamoDB Well-Architected Lens, a collection of design principles and guidance for designing well-architected DynamoDB workloads.

# Optimizing costs on DynamoDB tables
<a name="bp-cost-optimization"></a>

This section covers best practices on how to optimize costs for your existing DynamoDB tables. You should look at the following strategies to see which cost optimization strategy best suits your needs and approach them iteratively. Each strategy will provide an overview of what might be impacting your costs, what signs to look for, and prescriptive guidance on how to reduce them.

**Topics**
+ [Evaluate your costs at the table level](CostOptimization_TableLevelCostAnalysis.md)
+ [Evaluate your DynamoDB table's capacity mode](CostOptimization_TableCapacityMode.md)
+ [Evaluate your DynamoDB table's auto scaling settings](CostOptimization_AutoScalingSettings.md)
+ [Evaluate your DynamoDB table class selection](CostOptimization_TableClass.md)
+ [Identify your unused resources in DynamoDB](CostOptimization_UnusedResources.md)
+ [Evaluate your DynamoDB table usage patterns](CostOptimization_TableUsagePatterns.md)
+ [Evaluate your DynamoDB streams usage](CostOptimization_StreamsUsage.md)
+ [Evaluate your provisioned capacity for right-sized provisioning in your DynamoDB table](CostOptimization_RightSizedProvisioning.md)

# Evaluate your costs at the table level
<a name="CostOptimization_TableLevelCostAnalysis"></a>

The Cost Explorer tool in the AWS Management Console allows you to see costs broken down by type, such as read, write, storage, and backup charges. You can also see these costs summarized by period, such as month or day.

One challenge administrators can face is when the costs of only one particular table need to be reviewed. Some of this data is available via the DynamoDB console or via calls to the `DescribeTable` API; however, Cost Explorer does not, by default, allow you to filter or group costs by a specific table. This section shows you how to use tagging to perform individual table cost analysis in Cost Explorer.

**Topics**
+ [How to view the costs of a single DynamoDB table](#CostOptimization_TableLevelCostAnalysis_ViewInfo)
+ [Cost Explorer's default view](#CostOptimization_TableLevelCostAnalysis_CostExplorer)
+ [How to use and apply table tags in Cost Explorer](#CostOptimization_TableLevelCostAnalysis_Tagging)

## How to view the costs of a single DynamoDB table
<a name="CostOptimization_TableLevelCostAnalysis_ViewInfo"></a>

Both the Amazon DynamoDB console and the `DescribeTable` API will show you information about a single table, including the primary key schema, any indexes on the table, and the size and item count of the table and any indexes. The size of the table, plus the size of the indexes, can be used to calculate the monthly storage cost for your table. For example, \$0.25 per GB in the us-east-1 region.
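
The storage math is straightforward. The sketch below assumes a 5 GB table (`TableSizeBytes` from `DescribeTable`) and a \$0.25 per GB rate; both numbers are illustrative, so substitute your own values and the current rate from the DynamoDB pricing page:

```
# Illustrative only: estimate monthly storage cost from a table's size in bytes.
# In practice, TABLE_BYTES would come from:
#   aws dynamodb describe-table --table-name MyTable \
#       --query 'Table.TableSizeBytes' --output text
TABLE_BYTES=5368709120   # example value: exactly 5 GB
PRICE_PER_GB=0.25        # assumed us-east-1 standard-class rate; verify current pricing
MONTHLY_COST=$(awk -v b="$TABLE_BYTES" -v p="$PRICE_PER_GB" \
    'BEGIN { printf "%.2f", (b / (1024 * 1024 * 1024)) * p }')
echo "estimated monthly storage cost: \$$MONTHLY_COST"
```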

If the table is in provisioned capacity mode, the current RCU and WCU settings are returned as well. These could be used to calculate the current read and write costs for the table, but these costs could change, especially if the table has been configured with Auto Scaling.

**Note**  
If the table is in on-demand capacity mode, then `DescribeTable` will not help estimate throughput costs, as these are billed based on actual, not provisioned, usage in any given period.

## Cost Explorer's default view
<a name="CostOptimization_TableLevelCostAnalysis_CostExplorer"></a>

Cost Explorer's default view provides charts showing the cost of consumed resources such as throughput and storage. You can choose to group costs by period, such as totals by month or by day. The costs of storage, reads, writes, and other features can be broken out and compared as well.

![\[Cost Explorer's default view showing the cost of consumed resources grouped by usage type.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/CostExplorerView.png)


## How to use and apply table tags in Cost Explorer
<a name="CostOptimization_TableLevelCostAnalysis_Tagging"></a>

By default, Cost Explorer does not provide a summary of the costs for any one specific table, as it will combine the costs of multiple tables into a total. However, you can use [AWS resource tagging](https://docs.aws.amazon.com/general/latest/gr/aws_tagging.html) to identify each table by a metadata tag. Tags are key-value pairs you can use for a variety of purposes, such as to identify all resources belonging to a project or department. For this example, we'll assume you have a table named **MyTable**.

1. Set a tag with the key of **table\_name** and the value of **MyTable**.

1. [Activate the tag within Cost Explorer](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/activating-tags.html) and then filter on the tag value to gain more visibility into each table's costs.

**Note**  
It may take one or two days for the tag to start appearing in Cost Explorer.

You can set metadata tags yourself in the console, or via automation such as the AWS CLI or AWS SDK. Consider requiring a **table\_name** tag to be set as part of your organization’s new table creation process. For existing tables, there is a Python utility available that will find and apply these tags to all existing tables in a given region in your account. See [Eponymous Table Tagger on GitHub](https://github.com/awslabs/amazon-dynamodb-tools#eponymous-table-tagger-tool) for more details.
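
The tagging step can be scripted. A minimal sketch with the AWS CLI, using a `table_name` tag key; the account ID and Region in the ARN are placeholders:

```
# Tag the table so Cost Explorer can filter and group by it (once activated)
aws dynamodb tag-resource \
    --resource-arn arn:aws:dynamodb:us-east-1:123456789012:table/MyTable \
    --tags Key=table_name,Value=MyTable

# Verify the tag was applied
aws dynamodb list-tags-of-resource \
    --resource-arn arn:aws:dynamodb:us-east-1:123456789012:table/MyTable
```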

# Evaluate your DynamoDB table's capacity mode
<a name="CostOptimization_TableCapacityMode"></a>

This section provides an overview of how to select the appropriate capacity mode for your DynamoDB table. Each mode is tuned to meet the needs of a different workload in terms of responsiveness to change in throughput, as well as how that usage is billed. You must balance these factors when making your decision.

**Topics**
+ [What table capacity modes are available](#CostOptimization_TableCapacityMode_Overview)
+ [When to select on-demand capacity mode](#CostOptimization_TableCapacityMode_OnDemand)
+ [When to select provisioned capacity mode](#CostOptimization_TableCapacityMode_Provisioned)
+ [Additional factors to consider when choosing a table capacity mode](#CostOptimization_TableCapacityMode_AdditionalFactors)

## What table capacity modes are available
<a name="CostOptimization_TableCapacityMode_Overview"></a>

When you create a DynamoDB table, you must select either on-demand or provisioned capacity mode. 

You can switch tables from provisioned capacity mode to on-demand mode up to four times in a 24-hour rolling window. You can switch tables from on-demand mode to provisioned capacity mode at any time. 
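
A mode switch is a single `UpdateTable` call. As a sketch (the table name and throughput values are placeholders):

```
# Switch a table to on-demand capacity mode
aws dynamodb update-table --table-name MyTable \
    --billing-mode PAY_PER_REQUEST

# Switch back to provisioned capacity mode (initial throughput values are
# required; the numbers here are illustrative)
aws dynamodb update-table --table-name MyTable \
    --billing-mode PROVISIONED \
    --provisioned-throughput ReadCapacityUnits=100,WriteCapacityUnits=100
```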

**On-demand capacity mode**  
The [on-demand capacity mode](on-demand-capacity-mode.md) is designed to eliminate the need to plan or provision the capacity of your DynamoDB table. In this mode, your table will instantly accommodate requests to your table without the need to scale any resources up or down (up to twice the previous peak throughput of the table).

DynamoDB on-demand offers pay-per-request pricing for read and write requests so that you only pay for what you use.

**Provisioned capacity mode**  
The [provisioned capacity](provisioned-capacity-mode.md) mode is a more traditional model where you must define how much capacity the table has available for requests, either directly or with the assistance of auto scaling. Because a specific capacity is provisioned for the table at any given time, billing is based on the total capacity provisioned rather than the number of requests consumed. Going over the allocated capacity can also cause the table to reject requests and degrade the experience of your application's users.

Provisioned capacity mode requires constant monitoring to strike a balance between over-provisioning and under-provisioning the table, keeping both throttling low and costs tuned.

## When to select on-demand capacity mode
<a name="CostOptimization_TableCapacityMode_OnDemand"></a>

When optimizing for cost, on-demand mode is your best choice when you have a workload similar to the following graphs.

The following factors contribute to this type of workload:
+ Traffic pattern that evolves over time 
+ Variable volume of requests (resulting from batch workloads)
+ Unpredictable request timing (resulting in traffic spikes)
+ Drops to zero or below 30% of the peak for a given hour 

![\[Graphs for unpredictable, variable workload with spikes and periods of low activity.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/choose-on-demand-1.png)![\[Graphs for unpredictable, variable workload with spikes and periods of low activity.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/choose-on-demand-2.png)


For workloads with the above factors, using auto scaling to maintain enough capacity on the table to absorb spikes in traffic will likely leave the table either overprovisioned, costing more than necessary, or underprovisioned, with requests unnecessarily throttled. On-demand capacity mode is the better choice because it can handle fluctuating traffic without requiring you to predict or adjust capacity.

With on-demand mode’s pay-per-request pricing model, you don’t have to worry about idle capacity because you only pay for the throughput you actually use. You are billed per read or write request consumed, so your costs directly reflect your actual usage, making it easy to balance costs and performance. Optionally, you can also configure maximum read or write (or both) throughput per second for individual on-demand tables and global secondary indexes to help keep costs and usage bounded. For more information, see [Maximum throughput for on-demand tables](on-demand-capacity-mode-max-throughput.md).
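
As a sketch, these optional caps can be applied to an existing on-demand table with `UpdateTable`. This assumes a recent AWS CLI version that supports the `--on-demand-throughput` option; the table name and limits are placeholders:

```
# Cap an on-demand table at 12,000 read and 4,000 write request units per second
aws dynamodb update-table --table-name MyTable \
    --on-demand-throughput MaxReadRequestUnits=12000,MaxWriteRequestUnits=4000
```

Requests beyond the configured maximums are throttled, so set these limits comfortably above your expected peaks.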

## When to select provisioned capacity mode
<a name="CostOptimization_TableCapacityMode_Provisioned"></a>

An ideal workload for provisioned capacity mode is one with a more steady and predictable usage pattern like the graph below.

**Note**  
We recommend reviewing metrics at a fine-grained period, such as 14 days, before taking action on your provisioned capacity.

The following factors contribute to this type of workload:
+ Steady, predictable and cyclical traffic for a given hour or day
+ Limited short-term bursts of traffic

![\[Graph depicting a predictable, cyclical workload with limited spikes in traffic.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/choose-provisioned-1.png)


Since the traffic volumes within a given hour or day are more stable, you can set the provisioned capacity of the table relatively close to the actual consumed capacity of the table. Cost optimizing a provisioned capacity table is ultimately an exercise in getting the provisioned capacity (blue line) as close to the consumed capacity (orange line) as possible without increasing `ThrottledRequests` on the table. The space between the two lines is both wasted capacity as well as insurance against a bad user experience due to throttling. If you can predict your application’s throughput requirements and you prefer the cost predictability of controlling read and write capacity, then you may want to continue using provisioned tables.

DynamoDB provides auto scaling for provisioned capacity tables, which automatically balances this on your behalf. Auto scaling tracks your consumed capacity throughout the day and sets the provisioned capacity of the table based on a handful of variables. When using auto scaling, your table will still be somewhat over-provisioned, and you will need to fine-tune the ratio between the number of throttles and the over-provisioned capacity units to match your workload's needs.

![\[DynamoDB console. Provisioned capacity and auto scaling are enabled. Target utilization is set to 70.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/TableCapacityModeAutoScaling.png)


**Minimum capacity units**  
You can set the minimum capacity of a table to limit throttling, but it will not reduce the cost of the table. If your table has periods of low usage followed by sudden bursts of high usage, setting a minimum can prevent auto scaling from setting the table capacity too low.

**Maximum capacity units**  
You can set the maximum capacity of a table to prevent it from scaling higher than intended. Consider applying a maximum for Dev or Test tables where large-scale load testing is not desired. You can set a maximum for any table, but be sure to regularly evaluate the setting against the table's baseline when using it in Production, to prevent accidental throttling.

**Target utilization**  
Setting the target utilization of the table is the primary means of cost optimization for a provisioned capacity table. Setting a lower percent value here will increase how much the table is overprovisioned, increasing cost, but reducing the risk of throttling. Setting a higher percent value will decrease how much the table is overprovisioned, but increase the risk of throttling.
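
The relationship between the target and the resulting cost is simple division: auto scaling aims to keep provisioned capacity at roughly consumed capacity divided by the target fraction. A quick sketch with illustrative numbers:

```
# Illustrative only: what auto scaling aims to provision for a given target.
CONSUMED_WCU=700         # steady consumed write capacity (from CloudWatch)
TARGET_UTILIZATION=0.70  # 70% target utilization
PROVISIONED_WCU=$(awk -v c="$CONSUMED_WCU" -v t="$TARGET_UTILIZATION" \
    'BEGIN { printf "%.0f", c / t }')
echo "auto scaling aims for roughly $PROVISIONED_WCU provisioned WCU"
# Lowering the target to 0.50 would push this to 1400 WCU: less throttling
# risk, but 40% more provisioned (and billed) capacity.
```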

## Additional factors to consider when choosing a table capacity mode
<a name="CostOptimization_TableCapacityMode_AdditionalFactors"></a>

When deciding between the two modes, there are some additional factors worth considering.

**Provisioned capacity utilization**  
To determine when on-demand mode would cost less than provisioned capacity, it's helpful to look at your provisioned capacity utilization, which refers to how efficiently the allocated (or "provisioned") resources are being used. On-demand mode costs less for workloads with average provisioned capacity utilization below 35%. In many cases, even for workloads with utilization higher than 35%, on-demand mode can be more cost-effective, especially if the workload has periods of low activity mixed with occasional peaks.
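
As a quick sanity check, average utilization is just consumed capacity divided by provisioned capacity over the period. A sketch with illustrative numbers:

```
# Illustrative only: a table provisioned at 1000 WCU consuming 300 WCU on average.
CONSUMED_AVG_WCU=300
PROVISIONED_WCU=1000
UTILIZATION=$(awk -v c="$CONSUMED_AVG_WCU" -v p="$PROVISIONED_WCU" \
    'BEGIN { printf "%.0f", 100 * c / p }')
echo "average provisioned capacity utilization: $UTILIZATION%"
# 30% is below the 35% guideline, so on-demand mode would likely cost less here.
```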

**Reserved capacity**  
For provisioned capacity tables, DynamoDB offers the ability to purchase reserved capacity for your read and write capacity (replicated write capacity units (rWCU) and Standard-IA tables are currently not eligible). Reserved capacity offers significant discounts over standard provisioned capacity pricing.

When deciding between the two table modes, consider how much this additional discount will affect the cost of the table. In some cases, a relatively unpredictable workload can be cheaper to run on an overprovisioned provisioned capacity table with reserved capacity.

**Improving predictability of your workload**  
In some situations, a workload may seemingly have both a predictable and an unpredictable pattern. While this can be easily supported with an on-demand table, costs will likely be lower if the unpredictable patterns in the workload can be smoothed out.

One of the most common causes of these patterns is batch imports. This type of traffic can often exceed the baseline capacity of the table to such a degree that throttling would occur if it were to run. To keep a workload like this running on a provisioned capacity table, consider the following options:
+ If the batch occurs at scheduled times, you can schedule an increase to your auto scaling capacity before it runs
+ If the batch occurs randomly, consider trying to extend the time it runs rather than executing as fast as possible
+ Add a ramp up period to the import where the velocity of the import starts small but is slowly increased over a few minutes until auto scaling has had the opportunity to start adjusting table capacity

# Evaluate your DynamoDB table's auto scaling settings
<a name="CostOptimization_AutoScalingSettings"></a>

This section provides an overview of how to evaluate the auto scaling settings on your DynamoDB tables. [Amazon DynamoDB auto scaling](AutoScaling.md) is a feature that manages table and global secondary index (GSI) throughput based on your application traffic and your target utilization metric. This ensures your tables or GSIs have the capacity required for your application patterns.

The AWS auto scaling service will monitor your current table utilization and compare it to the target utilization value, `TargetValue`, and adjust the allocated capacity up or down as needed.

**Topics**
+ [Understanding your auto scaling settings](#CostOptimization_AutoScalingSettings_UnderProvisionedTables)
+ [How to identify tables with low target utilization (<=50%)](#CostOptimization_AutoScalingSettings_IdentifyLowUtilization)
+ [How to address workloads with seasonal variance](#CostOptimization_AutoScalingSettings_SeasonalVariance)
+ [How to address spiky workloads with unknown patterns](#CostOptimization_AutoScalingSettings_UnknownPatterns)
+ [How to address workloads with linked applications](#CostOptimization_AutoScalingSettings_BetweenRanges)

## Understanding your auto scaling settings
<a name="CostOptimization_AutoScalingSettings_UnderProvisionedTables"></a>

Defining the correct value for the target utilization, initial step, and final values is an activity that requires involvement from your operations team. This allows you to properly define the values based on historical application usage, which will be used to trigger the AWS auto scaling policies. The utilization target is the percentage of your total capacity that needs to be hit during a period of time before the auto scaling rules apply.

When you set a **high utilization target (a target around 90%)**, your traffic needs to stay above 90% of the provisioned capacity for a period of time before auto scaling kicks in. You should not use a high utilization target unless your application traffic is very constant and doesn’t receive spikes.

When you set a very **low utilization target (a target less than 50%)**, your application only needs to reach 50% of the provisioned capacity before an auto scaling policy is triggered. Unless your application traffic grows at a very aggressive rate, this usually translates into unused capacity and wasted resources.

## How to identify tables with low target utilization (<=50%)
<a name="CostOptimization_AutoScalingSettings_IdentifyLowUtilization"></a>

You can use either the AWS CLI or AWS Management Console to monitor and identify the `TargetValues` for your auto scaling policies in your DynamoDB resources:

------
#### [ AWS CLI ]

1. Return the entire list of resources by running the following command:

   ```
   aws application-autoscaling describe-scaling-policies --service-namespace dynamodb
   ```

   This command will return the entire list of auto scaling policies that are issued to any DynamoDB resource. If you only want to retrieve the resources from a particular table, you can add the `--resource-id` parameter. For example:

   ```
   aws application-autoscaling describe-scaling-policies --service-namespace dynamodb --resource-id "table/<table-name>"
   ```

1. Return only the auto scaling policies for a particular GSI by running the following command:

   ```
   aws application-autoscaling describe-scaling-policies --service-namespace dynamodb --resource-id "table/<table-name>/index/<gsi-name>"
   ```

   The values we're interested in for the auto scaling policies are the `TargetValue` settings shown below. To avoid over-provisioning, ensure that the target value is greater than 50%. You should obtain a result similar to the following:

   ```
   {
       "ScalingPolicies": [
           {
               "PolicyARN": "arn:aws:autoscaling:<region>:<account-id>:scalingPolicy:<uuid>:resource/dynamodb/table/<table-name>/index/<index-name>:policyName/$<full-gsi-name>-scaling-policy",
            "PolicyName": "$<full-gsi-name>",
               "ServiceNamespace": "dynamodb",
               "ResourceId": "table/<table-name>/index/<index-name>",
               "ScalableDimension": "dynamodb:index:WriteCapacityUnits",
               "PolicyType": "TargetTrackingScaling",
               "TargetTrackingScalingPolicyConfiguration": {
                   "TargetValue": 70.0,
                   "PredefinedMetricSpecification": {
                       "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
                   }
               },
               "Alarms": [
                   ...
               ],
               "CreationTime": "2022-03-04T16:23:48.641000+10:00"
           },
           {
               "PolicyARN": "arn:aws:autoscaling:<region>:<account-id>:scalingPolicy:<uuid>:resource/dynamodb/table/<table-name>/index/<index-name>:policyName/$<full-gsi-name>-scaling-policy",
            "PolicyName": "$<full-gsi-name>",
               "ServiceNamespace": "dynamodb",
               "ResourceId": "table/<table-name>/index/<index-name>",
               "ScalableDimension": "dynamodb:index:ReadCapacityUnits",
               "PolicyType": "TargetTrackingScaling",
               "TargetTrackingScalingPolicyConfiguration": {
                   "TargetValue": 70.0,
                   "PredefinedMetricSpecification": {
                       "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
                   }
               },
               "Alarms": [
                   ...
               ],
               "CreationTime": "2022-03-04T16:23:47.820000+10:00"
           }
       ]
   }
   ```

------
#### [ AWS Management Console ]

1. Sign in to the AWS Management Console and open the DynamoDB console at [https://console.aws.amazon.com/dynamodb/](https://console.aws.amazon.com/dynamodb/).

   Select an appropriate AWS Region if necessary.

1. On the left navigation bar, select **Tables**. On the **Tables** page, select the table's **Name**.

1. On the *Table details* page, choose **Additional settings**, and then review your table's auto scaling settings.  
![\[DynamoDB table details page with auto scaling settings. Review the provisioned capacity utilization and adjust as needed.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/AutoScalingSettings1.png)

   For indexes, expand the **Index capacity** section to review the index's auto scaling settings.  
![\[DynamoDB console's Index capacity section. Review and manage auto scaling settings for indexes.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/AutoScalingSettings2.png)

------

If your target utilization values are less than or equal to 50%, you should explore your table utilization metrics to see if they are [under-provisioned or over-provisioned](CostOptimization_RightSizedProvisioning.md). 
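
If you manage many tables, reviewing each policy individually doesn't scale. As a sketch (assuming the `jq` tool is installed), you could filter the CLI output down to just the resources whose policies have a target at or below 50%:

```
aws application-autoscaling describe-scaling-policies --service-namespace dynamodb \
    | jq -r '.ScalingPolicies[]
        | select(.TargetTrackingScalingPolicyConfiguration.TargetValue <= 50)
        | "\(.ResourceId) \(.ScalableDimension) \(.TargetTrackingScalingPolicyConfiguration.TargetValue)"'
```

Each output line names a table or index, the scaled dimension (read or write), and its low target value, giving you a candidate list for right-sizing.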

## How to address workloads with seasonal variance
<a name="CostOptimization_AutoScalingSettings_SeasonalVariance"></a>

Consider the following scenario: your application operates at a minimum average level most of the time, but you keep the utilization target low so the application can react quickly to events that happen at certain hours of the day, maintaining enough capacity to avoid throttling. This scenario is common for an application that is very busy during normal office hours (9 AM to 5 PM) but works at a base level after hours. Because some users start to connect before 9 AM, the application uses this low threshold to ramp up quickly to the *required* capacity during peak hours.

This scenario could look like this: 
+ Between 5 PM and 9 AM, the `ConsumedWriteCapacityUnits` stay between 90 and 100
+ Users start to connect to the application before 9 AM, and the consumed capacity units increase considerably (the maximum value you’ve seen is 1,500 WCU)
+ On average, your application usage varies between 800 and 1,200 WCU during working hours

If the previous scenario applies to you, consider using [scheduled auto scaling](https://docs.aws.amazon.com/autoscaling/application/userguide/examples-scheduled-actions.html), where your table could still have an application auto scaling rule configured, but with a less aggressive target utilization that only provisions the extra capacity at the specific intervals you require.

You can use the AWS CLI to run the following steps, which create a scheduled auto scaling rule that executes based on the time of day and the day of the week.

1. Register your DynamoDB table or GSI as a scalable target with Application Auto Scaling. A scalable target is a resource that Application Auto Scaling can scale out or in.

   ```
   aws application-autoscaling register-scalable-target \
       --service-namespace dynamodb \
       --scalable-dimension dynamodb:table:WriteCapacityUnits \
       --resource-id table/<table-name> \
       --min-capacity 90 \
       --max-capacity 1500
   ```

1. Set up scheduled actions according to your requirements.

   You'll need two scheduled actions to cover this scenario: one to scale up and another to scale down. First, create the scale-up action:

   ```
   aws application-autoscaling put-scheduled-action \
       --service-namespace dynamodb \
       --scalable-dimension dynamodb:table:WriteCapacityUnits \
       --resource-id table/<table-name> \
       --scheduled-action-name my-8-5-scheduled-action \
       --scalable-target-action MinCapacity=800,MaxCapacity=1500 \
       --schedule "cron(45 8 ? * MON-FRI *)" \
       --timezone "Australia/Brisbane"
   ```

   Then, create the scale-down action:

   ```
   aws application-autoscaling put-scheduled-action \
       --service-namespace dynamodb \
       --scalable-dimension dynamodb:table:WriteCapacityUnits \
       --resource-id table/<table-name> \
       --scheduled-action-name my-5-8-scheduled-down-action \
       --scalable-target-action MinCapacity=90,MaxCapacity=1500 \
       --schedule "cron(15 17 ? * MON-FRI *)" \
       --timezone "Australia/Brisbane"
   ```

1. Run the following command to confirm that both scheduled actions are in place:

   ```
   aws application-autoscaling describe-scheduled-actions --service-namespace dynamodb
   ```

   You should get a result like this:

   ```
   {
       "ScheduledActions": [
           {
               "ScheduledActionName": "my-5-8-scheduled-down-action",
               "ScheduledActionARN": "arn:aws:autoscaling:<region>:<account>:scheduledAction:<uuid>:resource/dynamodb/table/<table-name>:scheduledActionName/my-5-8-scheduled-down-action",
               "ServiceNamespace": "dynamodb",
               "Schedule": "cron(15 17 ? * MON-FRI *)",
               "Timezone": "Australia/Brisbane",
               "ResourceId": "table/<table-name>",
               "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
               "ScalableTargetAction": {
                   "MinCapacity": 90,
                   "MaxCapacity": 1500
               },
               "CreationTime": "2022-03-15T17:30:25.100000+10:00"
           },
           {
               "ScheduledActionName": "my-8-5-scheduled-action",
               "ScheduledActionARN": "arn:aws:autoscaling:<region>:<account>:scheduledAction:<uuid>:resource/dynamodb/table/<table-name>:scheduledActionName/my-8-5-scheduled-action",
               "ServiceNamespace": "dynamodb",
               "Schedule": "cron(45 8 ? * MON-FRI *)",
               "Timezone": "Australia/Brisbane",
               "ResourceId": "table/<table-name>",
               "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
               "ScalableTargetAction": {
                   "MinCapacity": 800,
                   "MaxCapacity": 1500
               },
               "CreationTime": "2022-03-15T17:28:57.816000+10:00"
           }
       ]
   }
   ```

The following picture shows a sample workload that always keeps the 70% target utilization. Notice how the auto scaling rules still apply and the throughput is not reduced.

![\[A table's throughput at 70% target utilization, even as auto scaling rules adjust capacity.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/AutoScalingSettings3.png)


Zooming in, we can see there was a spike in the application that crossed the 70% auto scaling threshold, causing auto scaling to kick in and provide the extra capacity the table required. The scheduled auto scaling action only affects the minimum and maximum values; setting them appropriately is your responsibility.

![\[Spike in a DynamoDB table throughput that initiates auto scaling to provide required extra capacity.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/AutoScalingSettings4.png)


![\[DynamoDB table's auto scaling configuration: Target utilization and minimum and maximum capacity values.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/AutoScalingSettings5.png)


## How to address spiky workloads with unknown patterns
<a name="CostOptimization_AutoScalingSettings_UnknownPatterns"></a>

In this scenario, the application uses a very low utilization target because you don’t know the application patterns yet, and you want to ensure your workload is not throttled.

Consider using [on-demand capacity mode](capacity-mode.md#capacity-mode-on-demand) instead. On-demand tables are perfect for spiky workloads where you don’t know the traffic patterns. With on-demand capacity mode, you pay per request for the data reads and writes your application performs on your tables. You do not need to specify how much read and write throughput you expect your application to perform, as DynamoDB instantly accommodates your workloads as they ramp up or down.
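As a rough illustration of the trade-off, you can estimate a break-even point between the two capacity modes. The per-unit prices below are hypothetical placeholders (actual rates vary by Region and are listed on the DynamoDB pricing page); only the arithmetic is the point of this sketch.

```python
# Hypothetical per-unit prices for illustration only; check the DynamoDB
# pricing page for your Region's actual rates.
ON_DEMAND_PRICE_PER_MILLION_WRITES = 1.25   # USD per 1M write request units (assumed)
PROVISIONED_PRICE_PER_WCU_HOUR = 0.00065    # USD per WCU-hour (assumed)

def on_demand_monthly_cost(writes_per_month):
    """Cost of paying per request: you are billed only for what you use."""
    return writes_per_month / 1_000_000 * ON_DEMAND_PRICE_PER_MILLION_WRITES

def provisioned_monthly_cost(provisioned_wcu, hours=730):
    """Cost of provisioned capacity, billed whether or not it is consumed."""
    return provisioned_wcu * hours * PROVISIONED_PRICE_PER_WCU_HOUR

# A spiky table doing 50M writes in bursts vs. 100 WCU provisioned all month:
spiky = on_demand_monthly_cost(50_000_000)   # 62.5 under the assumed price
steady = provisioned_monthly_cost(100)       # 47.45 under the assumed price
```

Under these assumed prices, a table that is idle most of the month can be cheaper on demand even when its burst traffic is high, because you never pay for idle provisioned capacity.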

## How to address workloads with linked applications
<a name="CostOptimization_AutoScalingSettings_BetweenRanges"></a>

In this scenario, the application depends on other systems, such as batch processing workloads where big traffic spikes occur according to events in the application logic.

Consider developing custom auto scaling logic that reacts to those events where you can increase table capacity and `TargetValues` depending on your specific needs. You could benefit from Amazon EventBridge and use a combination of AWS services like Lambda and Step Functions to react to your specific application needs.

# Evaluate your DynamoDB table class selection
<a name="CostOptimization_TableClass"></a>

This section provides an overview of how to select the appropriate table class for your DynamoDB table. With the launch of the Standard-Infrequent Access (Standard-IA) table class, you now have the ability to optimize a table for either lower storage cost or lower throughput cost.

**Topics**
+ [What table classes are available](#CostOptimization_TableClass_Overview)
+ [When to select the DynamoDB Standard table class](#CostOptimization_TableClass_Standard)
+ [When to select DynamoDB Standard-IA table class](#CostOptimization_TableClass_StandardIA)
+ [Additional factors to consider when choosing a table class](#CostOptimization_TableClass_AdditionalFactors)

## What table classes are available
<a name="CostOptimization_TableClass_Overview"></a>

When you create a DynamoDB table, you must select either DynamoDB Standard or DynamoDB Standard-IA for the table class. The table class can be changed up to twice in a 30-day period, so you can always change it in the future. Selecting either table class has no effect on table performance, availability, reliability, or durability.

![\[DynamoDB table class options. In this image, the DynamoDB Standard-IA table class is selected.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/TableClassOptions.png)


**Standard table class**  
The Standard table class is the default option for new tables. This option maintains DynamoDB's original billing model, which offers a balance of throughput and storage costs for tables with frequently accessed data.

**Standard-IA table class**  
The Standard-IA table class offers a lower storage cost (about 60% lower) for workloads that require long-term storage of data with infrequent updates or reads. Because the class is optimized for infrequent access, reads and writes are billed at a slightly higher cost (about 25% higher) than the Standard table class.

## When to select the DynamoDB Standard table class
<a name="CostOptimization_TableClass_Standard"></a>

DynamoDB Standard table class is best suited for tables whose storage cost is approximately 50% or less of the overall monthly cost of the table. This cost balance is indicative of a workload that regularly accesses or updates items already stored within DynamoDB.

## When to select DynamoDB Standard-IA table class
<a name="CostOptimization_TableClass_StandardIA"></a>

DynamoDB Standard-IA table class is best suited for tables whose storage cost is approximately 50% or more of the overall monthly cost of the table. This cost balance is indicative of a workload that creates or reads fewer items per month than it keeps in storage.

A common use for the Standard-IA table class is moving less frequently accessed data to individual Standard-IA tables. For further information, see [Optimizing the storage costs of your workloads with Amazon DynamoDB Standard-IA table class](https://aws.amazon.com/blogs/database/optimize-the-storage-costs-of-your-workloads-with-amazon-dynamodb-standard-ia-table-class/).
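The 50% rule of thumb from the sections above can be sketched as a simple comparison. The discount and surcharge factors follow the approximate rates quoted for the table classes (roughly 60% lower storage, 25% higher throughput); treat them as assumptions, not quoted prices.

```python
def standard_ia_is_cheaper(storage_cost, throughput_cost,
                           storage_discount=0.60, throughput_surcharge=0.25):
    """Compare a table's monthly bill under each table class.

    storage_cost / throughput_cost: the table's current monthly costs
    under the Standard class, in any consistent currency unit.
    The discount/surcharge factors are approximations, not quoted prices.
    """
    standard = storage_cost + throughput_cost
    standard_ia = (storage_cost * (1 - storage_discount)
                   + throughput_cost * (1 + throughput_surcharge))
    return standard_ia < standard

# Storage-dominated table (80% of the bill is storage): Standard-IA wins.
assert standard_ia_is_cheaper(storage_cost=80, throughput_cost=20)
# Throughput-dominated table: stay on Standard.
assert not standard_ia_is_cheaper(storage_cost=10, throughput_cost=90)
```

This matches the guidance above: once storage reaches roughly half of the overall monthly cost, the storage discount outweighs the throughput surcharge.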

## Additional factors to consider when choosing a table class
<a name="CostOptimization_TableClass_AdditionalFactors"></a>

When deciding between the two table classes, there are some additional factors worth considering as part of your decision.

**Reserved capacity**  
Purchasing reserved capacity for tables using the Standard-IA table class is currently not supported. When transitioning from a Standard table with reserved capacity to a Standard-IA table without reserved capacity, you may not see a cost benefit.

# Identify your unused resources in DynamoDB
<a name="CostOptimization_UnusedResources"></a>

This section provides an overview of how to evaluate your unused resources regularly. As your application requirements evolve, you should ensure no resources are unused and contributing to unnecessary Amazon DynamoDB costs. The procedures described below use Amazon CloudWatch metrics to identify unused resources and help you take action on those resources to reduce costs.

You can monitor DynamoDB using CloudWatch, which collects and processes raw data from DynamoDB into readable, near real-time metrics. These statistics are retained for a period of time, so that you can access historical information to better understand your utilization. By default, DynamoDB metric data is sent to CloudWatch automatically. For more information, see [What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) and [Metrics retention](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#metrics-retention) in the *Amazon CloudWatch User Guide*. 

**Topics**
+ [How to identify unused resources](#CostOptimization_UnusedResources_Identifying)
+ [Identifying unused table resources](#CostOptimization_UnusedResources_Tables)
+ [Cleaning up unused table resources](#CostOptimization_UnusedResources_Tables_Cleanup)
+ [Identifying unused GSI resources](#CostOptimization_UnusedResources_GSI)
+ [Cleaning up unused GSI resources](#CostOptimization_UnusedResources_GSI_Cleanup)
+ [Cleaning up unused global tables](#CostOptimization_UnusedResources_GlobalTables)
+ [Cleaning up unused backups or point-in-time recovery (PITR)](#CostOptimization_UnusedResources_Backups)

## How to identify unused resources
<a name="CostOptimization_UnusedResources_Identifying"></a>

To identify unused tables or indexes, we'll look at the following CloudWatch metrics over a period of 30 days to understand if there are any active reads or writes on the table or any reads on the global secondary indexes (GSIs):

**[ConsumedReadCapacityUnits](metrics-dimensions.md#ConsumedReadCapacityUnits)**  
The number of read capacity units consumed over the specified time period, so you can track how much consumed capacity you have used. You can retrieve the total consumed read capacity for a table and all of its global secondary indexes, or for a particular global secondary index.

**[ConsumedWriteCapacityUnits](metrics-dimensions.md#ConsumedWriteCapacityUnits)**  
The number of write capacity units consumed over the specified time period, so you can track how much consumed capacity you have used. You can retrieve the total consumed write capacity for a table and all of its global secondary indexes, or for a particular global secondary index.

## Identifying unused table resources
<a name="CostOptimization_UnusedResources_Tables"></a>

Amazon CloudWatch is a monitoring and observability service that provides the DynamoDB table metrics you’ll use to identify unused resources. You can view CloudWatch metrics through the AWS Management Console as well as through the AWS Command Line Interface.

------
#### [ AWS Command Line Interface ]

To view your table's metrics through the AWS Command Line Interface, you can use the following commands.

1. First, evaluate your table's reads:

   ```
   aws cloudwatch get-metric-statistics --metric-name ConsumedReadCapacityUnits \
       --start-time <start-time> --end-time <end-time> --period <period> \
       --namespace AWS/DynamoDB --statistics Sum \
       --dimensions Name=TableName,Value=<table-name>
   ```

   To avoid falsely identifying a table as unused, evaluate metrics over a longer period. Choose an appropriate start-time and end-time range, such as **30 days**, and an appropriate period, such as **86400**.

   In the returned data, any **Sum** above **0** indicates that the table you are evaluating received read traffic during that period.

   The following result shows a table receiving read traffic in the evaluated period:

   ```
           {
               "Timestamp": "2022-08-25T19:40:00Z",
               "Sum": 36023355.0,
               "Unit": "Count"
           },
           {
               "Timestamp": "2022-08-12T19:40:00Z",
               "Sum": 38025777.5,
               "Unit": "Count"
           },
   ```

   The following result shows a table not receiving read traffic in the evaluated period:

   ```
           {
               "Timestamp": "2022-08-01T19:50:00Z",
               "Sum": 0.0,
               "Unit": "Count"
           },
           {
               "Timestamp": "2022-08-20T19:50:00Z",
               "Sum": 0.0,
               "Unit": "Count"
           },
   ```

1. Next, evaluate your table’s writes:

   ```
   aws cloudwatch get-metric-statistics --metric-name ConsumedWriteCapacityUnits \
       --start-time <start-time> --end-time <end-time> --period <period> \
       --namespace AWS/DynamoDB --statistics Sum \
       --dimensions Name=TableName,Value=<table-name>
   ```

   To avoid falsely identifying a table as unused, you will want to evaluate metrics over a longer period. Choose an appropriate start-time and end-time range, such as **30 days**, and an appropriate period, such as **86400**.

   In the returned data, any **Sum** above **0** indicates that the table you are evaluating received write traffic during that period.

   The following result shows a table receiving write traffic in the evaluated period:

   ```
           {
               "Timestamp": "2022-08-19T20:15:00Z",
               "Sum": 41014457.0,
               "Unit": "Count"
           },
           {
               "Timestamp": "2022-08-18T20:15:00Z",
               "Sum": 40048531.0,
               "Unit": "Count"
           },
   ```

   The following result shows a table not receiving write traffic in the evaluated period:

   ```
           {
               "Timestamp": "2022-07-31T20:15:00Z",
               "Sum": 0.0,
               "Unit": "Count"
           },
           {
               "Timestamp": "2022-08-19T20:15:00Z",
               "Sum": 0.0,
               "Unit": "Count"
           },
   ```
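Once you have the datapoints returned by `get-metric-statistics`, deciding whether a table looks unused comes down to checking that every `Sum` is zero. A minimal sketch of that check in Python (the datapoint shape mirrors the CLI output shown above; the function name is just an example):

```python
def looks_unused(datapoints):
    """Return True if no datapoint shows any consumed capacity.

    datapoints: list of dicts shaped like the get-metric-statistics
    output, e.g. {"Timestamp": "...", "Sum": 0.0, "Unit": "Count"}.
    An empty list also counts as unused (no activity was reported).
    """
    return all(dp["Sum"] == 0.0 for dp in datapoints)

# Datapoints shaped like the CLI output above:
active = [{"Timestamp": "2022-08-25T19:40:00Z", "Sum": 36023355.0, "Unit": "Count"}]
idle = [{"Timestamp": "2022-08-01T19:50:00Z", "Sum": 0.0, "Unit": "Count"}]
```

Run the check against both the `ConsumedReadCapacityUnits` and `ConsumedWriteCapacityUnits` datapoints before concluding that a table is unused.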

------
#### [ AWS Management Console ]

The following steps will allow you to evaluate your resource utilization through the AWS Management Console.

1. Sign in to the AWS Management Console and navigate to the CloudWatch service page at [https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/). Select the appropriate AWS Region in the top right of the console, if necessary.

1. On the left navigation bar, locate the Metrics section and select **All metrics**.

1. This opens a dashboard with two panels: the top panel shows the currently graphed metrics, and the bottom panel is where you select the metrics to graph. Select **DynamoDB** in the bottom panel.

1. In the DynamoDB metrics selection panel, select the **Table Metrics** category to show the metrics for your tables in the current Region.

1. Find your table name by scrolling through the menu, then select the `ConsumedReadCapacityUnits` and `ConsumedWriteCapacityUnits` metrics for your table.

1. Select the **Graphed metrics (2)** tab and adjust the **Statistic** column to **Sum**.  
![\[Graphed metrics tab. Statistic is set to Sum to view resource usage data in the console.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/GraphedMetricsTab.png)

1. To avoid falsely identifying a table as unused, you'll want to evaluate metrics over a longer period. At the top of the graph panel choose an appropriate time frame, such as 1 month, to evaluate your table. Select **Custom**, select **1 Months** in the dropdowns, and choose **Apply**.  
![\[CloudWatch console. Custom time frame of 1 month is selected to evaluate metrics.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/OneMonthTimeFrame.png)

1. Evaluate the graphed metrics for your table to determine if it is being used. Metrics that have gone above **0** indicate that a table has been used during the evaluated time period. A flat graph at **0** for both read and write indicates a table that is unused.

   The following image shows a table with read traffic:  
![\[Graph showing the ConsumedReadCapacityUnits for a DynamoDB table, suggesting the table is in use.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/TableWithReadTraffic.png)

   The following image shows a table without read traffic:  
![\[Graph showing no read activity for a DynamoDB table, suggesting the table isn't in use.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/TableWithoutReadTraffic.png)

------

## Cleaning up unused table resources
<a name="CostOptimization_UnusedResources_Tables_Cleanup"></a>

If you have identified unused table resources, you can reduce their ongoing costs in the following ways.

**Note**  
If you have identified an unused table but would still like to keep it available in case it needs to be accessed in the future, consider switching it to on-demand mode. Otherwise, you can consider backing up and deleting the table entirely.

**Capacity modes**  
DynamoDB charges for reading, writing, and storing data in your DynamoDB tables.

DynamoDB has [two capacity modes](capacity-mode.md), which come with specific billing options for processing reads and writes on your tables: on-demand and provisioned. The read/write capacity mode controls how you are charged for read and write throughput and how you manage capacity.

For on-demand mode tables, you don't need to specify how much read and write throughput you expect your application to perform. DynamoDB charges you for the reads and writes that your application performs on your tables in terms of read request units and write request units. If there is no activity on your table or index, you do not pay for throughput, but you’ll still incur a storage charge.

**Table class**  
DynamoDB also offers [two table classes](HowItWorks.TableClasses.md) designed to help you optimize for cost. The DynamoDB Standard table class is the default and is recommended for most workloads. The DynamoDB Standard-Infrequent Access (DynamoDB Standard-IA) table class is optimized for tables where storage is the dominant cost.

If there is no activity on your table or index, storage is likely to be the dominant cost, and changing the table class can offer significant savings.

**Deleting tables**  
If you’ve discovered an unused table and would like to delete it, you may wish to make a backup or export of the data first.

Backups taken through AWS Backup can leverage cold storage tiering, further reducing costs. Refer to the [Using AWS Backup with DynamoDB](backuprestore_HowItWorksAWS.md) documentation for information on how to enable backups through AWS Backup, as well as the [Managing backup plans](https://docs.aws.amazon.com/aws-backup/latest/devguide/about-backup-plans) documentation for information on how to use lifecycle rules to move your backups to cold storage.

Alternatively, you may choose to export your table’s data to S3. To do so, refer to the [Export to Amazon S3](S3DataExport.HowItWorks.md) documentation. Once your data is exported, if you wish to leverage S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, or S3 Glacier Deep Archive to further reduce costs, see [Managing your storage lifecycle](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt).

After your table has been backed up, you may choose to delete it either through the AWS Management Console or through the AWS Command Line Interface.

## Identifying unused GSI resources
<a name="CostOptimization_UnusedResources_GSI"></a>

The steps for identifying an unused global secondary index (GSI) are similar to those for identifying an unused table. Because DynamoDB replicates items written to your base table into a GSI whenever they contain the attribute used as the GSI’s partition key, an unused GSI is still likely to show `ConsumedWriteCapacityUnits` above 0 if its base table is in use. As a result, you’ll evaluate only the `ConsumedReadCapacityUnits` metric to determine whether your GSI is unused.

To view your GSI metrics through the AWS CLI, you can use the following command to evaluate your GSI's reads:

```
aws cloudwatch get-metric-statistics --metric-name ConsumedReadCapacityUnits \
    --start-time <start-time> --end-time <end-time> --period <period> \
    --namespace AWS/DynamoDB --statistics Sum \
    --dimensions Name=TableName,Value=<table-name> Name=GlobalSecondaryIndexName,Value=<index-name>
```

To avoid falsely identifying a GSI as unused, evaluate metrics over a longer period. Choose an appropriate start-time and end-time range, such as 30 days, and an appropriate period, such as 86400.

In the returned data, any Sum above 0 indicates that the GSI you are evaluating received read traffic during that period.

The following result shows a GSI receiving read traffic in the evaluated period:

```
        {
          "Timestamp": "2022-08-17T21:20:00Z",
          "Sum": 36319167.0,
          "Unit": "Count"
        },
        {
          "Timestamp": "2022-08-11T21:20:00Z",
          "Sum": 1869136.0,
          "Unit": "Count"
        },
```

The following result shows a GSI receiving minimal read traffic in the evaluated period:

```
        {
          "Timestamp": "2022-08-28T21:20:00Z",
          "Sum": 0.0,
          "Unit": "Count"
        },
        {
          "Timestamp": "2022-08-15T21:20:00Z",
          "Sum": 2.0,
          "Unit": "Count"
        },
```

The following result shows a GSI receiving no read traffic in the evaluated period:

```
        {
          "Timestamp": "2022-08-17T21:20:00Z",
          "Sum": 0.0,
          "Unit": "Count"
        },
        {
          "Timestamp": "2022-08-11T21:20:00Z",
          "Sum": 0.0,
          "Unit": "Count"
        },
```

## Cleaning up unused GSI resources
<a name="CostOptimization_UnusedResources_GSI_Cleanup"></a>

If you've identified an unused GSI, you can choose to delete it. Since all data present in a GSI is also present in the base table, additional backup is not necessary before deleting a GSI. If in the future the GSI is once again needed, it may be added back to the table.

If you have identified an infrequently used GSI, you should consider design changes in your application that would allow you to delete it or reduce its cost. For example, while DynamoDB scans should be used sparingly because they can consume large amounts of system resources, they may be more cost effective than maintaining a GSI whose access pattern is used very infrequently.

Additionally, if a GSI is required to support an infrequent access pattern consider projecting a more limited set of attributes. While this may require subsequent queries against the base table to support your infrequent access patterns, it can potentially offer a significant reduction in storage and write costs.

## Cleaning up unused global tables
<a name="CostOptimization_UnusedResources_GlobalTables"></a>

Amazon DynamoDB global tables provide a fully managed solution for deploying a multi-Region, multi-active database, without having to build and maintain your own replication solution.

Global tables are ideal for providing low-latency access to data close to users, as well as a secondary Region for disaster recovery.

If the global tables option is enabled on a resource to provide low-latency access to data, but the table is not part of your disaster recovery strategy, validate that both replicas are actively serving read traffic by evaluating their CloudWatch metrics. If one replica does not serve read traffic, it may be an unused resource.

If global tables are part of your disaster recovery strategy, one replica not receiving read traffic may be expected under an active/standby pattern.

## Cleaning up unused backups or point-in-time recovery (PITR)
<a name="CostOptimization_UnusedResources_Backups"></a>

DynamoDB offers two styles of backup. Point-in-time recovery provides continuous backups for up to 35 days to help you protect against accidental writes or deletes; you can set the recovery period to any value between 1 and 35 days. On-demand backup allows you to create snapshots that can be retained long term. Both types of backups have costs associated with them.

Refer to the documentation for [Backup and restore for DynamoDB](Backup-and-Restore.md) and [Point-in-time backups for DynamoDB](Point-in-time-recovery.md) to determine if your tables have backups enabled that may no longer be needed.

# Evaluate your DynamoDB table usage patterns
<a name="CostOptimization_TableUsagePatterns"></a>

This section provides an overview of how to evaluate whether you are using your DynamoDB tables efficiently. Certain usage patterns are not optimal for DynamoDB and leave room for optimization from both a performance and a cost perspective.

**Topics**
+ [Perform fewer strongly-consistent read operations](#CostOptimization_TableUsagePatterns_StronglyConsistentReads)
+ [Perform fewer transactions for read operations](#CostOptimization_TableUsagePatterns_Transactions)
+ [Perform fewer scans](#CostOptimization_TableUsagePatterns_Scans)
+ [Shorten attribute names](#CostOptimization_TableUsagePatterns_AttributeNames)
+ [Enable Time to Live (TTL)](#CostOptimization_TableUsagePatterns_TTL)
+ [Replace global tables with cross-Region backups](#CostOptimization_TableUsagePatterns_GlobalTables)

## Perform fewer strongly-consistent read operations
<a name="CostOptimization_TableUsagePatterns_StronglyConsistentReads"></a>

DynamoDB allows you to configure [read consistency](HowItWorks.ReadConsistency.md) on a per-request basis. Read requests are eventually consistent by default. Eventually consistent reads are charged at 0.5 RCU for up to 4 KB of data.

Most parts of distributed workloads are flexible and can tolerate eventual consistency. However, some access patterns require strongly consistent reads. Strongly consistent reads are charged at 1 RCU for up to 4 KB of data, essentially doubling your read costs. DynamoDB provides you with the flexibility to use both consistency models on the same table. 

Evaluate your workload and application code to confirm that strongly consistent reads are used only where required.
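The cost difference can be made concrete. Reads are metered in 4 KB increments; an eventually consistent read bills half an RCU per increment, while a strongly consistent read bills a full RCU. A small sketch of that metering (function name is illustrative):

```python
import math

def read_capacity_units(item_size_bytes, strongly_consistent=False):
    """RCUs consumed by a single read, metered in 4 KB increments.

    Eventually consistent reads cost 0.5 RCU per increment;
    strongly consistent reads cost 1 RCU per increment.
    """
    increments = math.ceil(item_size_bytes / 4096)
    return increments if strongly_consistent else increments * 0.5

# An 11 KB item spans three 4 KB increments:
eventual = read_capacity_units(11 * 1024)                          # 1.5 RCU
strong = read_capacity_units(11 * 1024, strongly_consistent=True)  # 3 RCU
```

Switching a hot access path from strongly consistent to eventually consistent reads halves its read-capacity consumption for the same data volume.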

## Perform fewer transactions for read operations
<a name="CostOptimization_TableUsagePatterns_Transactions"></a>

DynamoDB allows you to group certain actions in an all-or-nothing manner, which means you can execute ACID transactions with DynamoDB. However, as with relational databases, it is not necessary to follow this approach for every action.

A [transactional read operation](transaction-apis.md#transaction-capacity-handling.title) of up to 4 KB consumes 2 RCUs, as opposed to the 0.5 RCUs consumed by an eventually consistent read of the same amount of data. Costs are likewise doubled for write operations: a transactional write of up to 1 KB consumes 2 WCUs, as opposed to 1 WCU for a standard write.

To determine whether all operations on a table are transactions, filter the table's CloudWatch metrics down to the transaction APIs. If transaction APIs are the only graphs available under the `SuccessfulRequestLatency` metric for the table, every operation on the table is a transaction. Alternatively, if the overall capacity utilization trend matches the transaction API trend, consider revisiting the application design, as it appears to be dominated by transactional APIs.
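Following the same metering rules, the multipliers described above can be sketched as follows (transactional reads at four times the eventually consistent rate, transactional writes at twice the standard rate; function names are illustrative):

```python
import math

def transactional_read_units(item_size_bytes):
    """2 RCUs per 4 KB increment (vs. 0.5 RCU eventually consistent)."""
    return math.ceil(item_size_bytes / 4096) * 2

def transactional_write_units(item_size_bytes):
    """2 WCUs per 1 KB increment (vs. 1 WCU for a standard write)."""
    return math.ceil(item_size_bytes / 1024) * 2

# Reading and writing a 3 KB item transactionally:
assert transactional_read_units(3 * 1024) == 2   # one 4 KB increment
assert transactional_write_units(3 * 1024) == 6  # three 1 KB increments
```

If only a fraction of these operations genuinely need all-or-nothing semantics, moving the rest to standard reads and writes cuts their capacity consumption accordingly.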

## Perform fewer scans
<a name="CostOptimization_TableUsagePatterns_Scans"></a>

Extensive use of `Scan` operations generally stems from the need to run analytical queries on DynamoDB data. Running frequent `Scan` operations on a large table can be inefficient and expensive.

A better alternative is to use the [Export to S3](S3DataExport.HowItWorks.md#S3DataExport.HowItWorks.title) feature and choose a point in time to export the table state to S3. AWS services such as Athena can then run analytical queries on the exported data without consuming any capacity from the table.

The frequency of `Scan` operations can be determined using the `SampleCount` statistic under the `SuccessfulRequestLatency` metric for `Scan`. If `Scan` operations are indeed very frequent, re-evaluate your access patterns and data model.

## Shorten attribute names
<a name="CostOptimization_TableUsagePatterns_AttributeNames"></a>

The total size of an item in DynamoDB is the sum of its attribute name lengths and attribute value lengths. Long attribute names not only contribute to storage costs, but can also lead to higher RCU/WCU consumption. We recommend choosing shorter attribute names rather than long ones. Shorter attribute names can help keep the item size below the next 4 KB (read) or 1 KB (write) boundary, beyond which you consume additional RCUs or WCUs to access the data.
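The boundary effect can be checked locally with a rough size estimate. The sketch below counts only UTF-8 name and string-value lengths (DynamoDB's full sizing rules for numbers, sets, and nested types are more involved); the attribute names are hypothetical examples:

```python
def approx_item_size(item):
    """Rough item size: UTF-8 bytes of attribute names plus string values.

    This is a simplification; DynamoDB's exact sizing rules for numbers,
    binary data, sets, and nested documents differ.
    """
    return sum(len(k.encode("utf-8")) + len(str(v).encode("utf-8"))
               for k, v in item.items())

# Hypothetical attribute names; the value payload is identical in both items.
verbose = {"customerLoyaltyProgramIdentifier": "A" * 4070}  # 32-byte name
short = {"clpId": "A" * 4070}                               # 5-byte name

# The long name pushes the item over the 4 KB read boundary;
# the short name keeps it under, saving an RCU per strongly consistent read.
assert approx_item_size(verbose) > 4096
assert approx_item_size(short) <= 4096
```

At scale, the same payload with shorter names can therefore cost measurably less in both storage and throughput.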

## Enable Time to Live (TTL)
<a name="CostOptimization_TableUsagePatterns_TTL"></a>

[Time to Live (TTL)](TTL.md#TTL.title) can identify items older than the expiry time that you have set on an item and remove them from the table. If your data grows over time and older data becomes irrelevant, enabling TTL on the table can help trim your data down and save on storage costs.

Another useful aspect of TTL is that expired items appear on your DynamoDB stream, so rather than simply discarding the data, you can consume those items from the stream and archive them to a lower-cost storage tier. Additionally, deleting items via TTL comes at no additional cost: it does not consume capacity, and there is no overhead of designing a cleanup application.
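TTL works by reading a numeric attribute containing an expiry time in Unix epoch seconds. A sketch of computing that attribute value when writing an item (the attribute name `expireAt` is just an example; you choose the name when you enable TTL on the table):

```python
import time

def ttl_epoch(days_from_now, now=None):
    """Epoch-seconds value to store in the item's TTL attribute."""
    if now is None:
        now = time.time()
    return int(now + days_from_now * 24 * 60 * 60)

# Expire an item 90 days after a fixed reference time:
assert ttl_epoch(90, now=1_700_000_000) == 1_700_000_000 + 90 * 86400

# Hypothetical item shape; "expireAt" is whatever attribute name you
# configured when enabling TTL on the table.
item = {"pk": "user#123", "expireAt": ttl_epoch(90)}
```

Items whose TTL attribute is in the past become eligible for background deletion; DynamoDB typically removes them within a few days of expiry.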

## Replace global tables with cross-Region backups
<a name="CostOptimization_TableUsagePatterns_GlobalTables"></a>

[Global tables](GlobalTables.md#GlobalTables.title) allow you to maintain multiple active replica tables in different Regions; all of them can accept write operations and replicate data to each other. Replicas are easy to set up, and synchronization is managed for you. Replicas converge to a consistent state using a last-writer-wins strategy.

If you are using global tables purely as part of a failover or disaster recovery (DR) strategy, you can replace them with cross-Region backup copies when your recovery point objective (RPO) and recovery time objective (RTO) requirements are relatively lenient. If you do not require fast local access or five-nines availability, maintaining a global table replica might not be the best approach for failover.

As an alternative, consider using AWS Backup to manage DynamoDB backups. You can schedule regular backups and copy them across Regions to meet DR requirements in a more cost-effective way than using global tables.

# Evaluate your DynamoDB streams usage
<a name="CostOptimization_StreamsUsage"></a>

This section provides an overview of how to evaluate your DynamoDB Streams usage. Certain usage patterns are not optimal for DynamoDB and leave room for optimization from both a performance and a cost perspective.

You have two native integrations for streaming and event-driven use cases:
+ [Amazon DynamoDB Streams](Streams.md) 
+ [Amazon Kinesis Data Streams](Streams.KCLAdapter.md) 

This page focuses on cost optimization strategies for these options. To learn how to choose between the two, see [Streaming options for change data capture](streamsmain.md#streamsmain.choose).

**Topics**
+ [Optimizing costs for DynamoDB Streams](#CostOptimization_StreamsUsage_Options_DDBStreams)
+ [Optimizing costs for Kinesis Data Streams](#CostOptimization_StreamsUsage_Options_KDS)
+ [Cost optimization strategies for both types of Streams usage](#CostOptimization_StreamsUsage_GuidanceForBoth)

## Optimizing costs for DynamoDB Streams
<a name="CostOptimization_StreamsUsage_Options_DDBStreams"></a>

As mentioned on the [pricing page](https://aws.amazon.com/dynamodb/pricing/on-demand/) for DynamoDB Streams, regardless of the table's throughput capacity mode, DynamoDB charges based on the number of read requests made against the table's DynamoDB stream. Read requests made against a DynamoDB stream are distinct from read requests made against the DynamoDB table.

Each read request against the stream takes the form of a `GetRecords` API call that can return up to 1,000 records or 1 MB of records in the response, whichever limit is reached first. None of the [other DynamoDB Streams APIs](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Operations_Amazon_DynamoDB_Streams.html) are charged, and DynamoDB Streams is not charged for being idle. In other words, if no read requests are made against a DynamoDB stream, no charges are incurred for having the stream enabled on a table.

Here are a few consumer applications for DynamoDB Streams:
+ AWS Lambda function(s)
+ Amazon Kinesis Data Streams-based applications
+ Custom consumer applications built using an AWS SDK

Read requests made by AWS Lambda-based consumers of DynamoDB Streams are free, whereas calls made by any other kind of consumer are charged. In addition, the first 2,500,000 read requests made by non-Lambda consumers each month are free. This free tier applies across all read requests made to any DynamoDB stream in an AWS account, per AWS Region.
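The free-tier arithmetic above can be sketched as follows. This covers only the request counting, not the per-request price, which varies by Region:

```python
# Monthly free tier for non-Lambda consumers, per account per Region,
# as described above; Lambda-based reads are free regardless of volume.
FREE_TIER_READ_REQUESTS = 2_500_000

def billable_stream_reads(non_lambda_requests: int) -> int:
    """GetRecords requests that fall outside the monthly free tier."""
    return max(0, non_lambda_requests - FREE_TIER_READ_REQUESTS)

print(billable_stream_reads(2_000_000))  # fully within the free tier
print(billable_stream_reads(3_100_000))  # 600,000 billable requests
```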

**Monitoring your DynamoDB Streams usage**  
DynamoDB Streams charges on the billing console are grouped together for all DynamoDB streams in an AWS Region for an AWS account. Tagging DynamoDB streams is currently not supported, so cost allocation tags cannot be used to identify granular costs per stream. However, you can obtain the volume of `GetRecords` calls at the stream level to compute the charges per stream: it is represented by the `SampleCount` statistic of the stream's `SuccessfulRequestLatency` CloudWatch metric. Note that this metric also includes `GetRecords` calls made by global tables for ongoing replication as well as calls made by AWS Lambda consumers, neither of which is charged. For information on other CloudWatch metrics published by DynamoDB Streams, see [DynamoDB Metrics and dimensions](metrics-dimensions.md).

**Using AWS Lambda as the consumer**  
Evaluate whether using AWS Lambda functions as the consumers of your DynamoDB streams is feasible, because doing so eliminates the costs associated with reading from the stream. In contrast, DynamoDB Streams Kinesis Adapter or SDK-based consumer applications are charged for the `GetRecords` calls they make against the stream.

Lambda function invocations are charged based on standard Lambda pricing; however, no charges are incurred by DynamoDB Streams. Lambda polls the shards in your DynamoDB stream for records at a base rate of four times per second. When records are available, Lambda invokes your function and waits for the result. If processing succeeds, Lambda resumes polling until it receives more records.

**Tuning DynamoDB Streams Kinesis Adapter-based consumer applications**  
Because read requests made by non-Lambda consumers of DynamoDB Streams are charged, it is important to find a balance between your near-real-time requirements and the number of times the consumer application polls the stream.

The frequency at which a DynamoDB Streams Kinesis Adapter-based application polls a DynamoDB stream is determined by the configured `idleTimeBetweenReadsInMillis` value. This parameter sets the amount of time, in milliseconds, that the consumer waits before processing a shard again if the previous `GetRecords` call made to that shard returned no records. By default, this parameter is set to 1,000 ms. If near-real-time processing is not required, you can increase this value so that the consumer application makes fewer `GetRecords` calls and reduces DynamoDB Streams costs.
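To see how this parameter bounds call volume, the sketch below estimates the `GetRecords` calls a single idle shard would receive per day. This is an upper-bound approximation: the actual rate also depends on record availability, processing time, and shard count:

```python
MS_PER_DAY = 24 * 60 * 60 * 1000  # 86,400,000 ms

def daily_getrecords_per_idle_shard(idle_time_between_reads_ms: int) -> int:
    """Approximate polls per day for one shard that keeps returning no records."""
    return MS_PER_DAY // idle_time_between_reads_ms

print(daily_getrecords_per_idle_shard(1000))  # default 1,000 ms
print(daily_getrecords_per_idle_shard(5000))  # relaxed polling interval
```

Raising the idle time from 1,000 ms to 5,000 ms cuts the idle-shard polling volume by a factor of five, at the cost of up to 5 seconds of added latency.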

## Optimizing costs for Kinesis Data Streams
<a name="CostOptimization_StreamsUsage_Options_KDS"></a>

When a Kinesis data stream is set as the destination for change data capture events from a DynamoDB table, the Kinesis data stream may need separate sizing management, which affects the overall costs. DynamoDB charges in change data capture units (CDUs), where each unit covers up to 1 KB of DynamoDB item size per write attempted by the DynamoDB service to the destination Kinesis data stream.
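Based on that description, the CDU math can be sketched as rounding each change record up to the next 1 KB of item size. This is an illustration of the rounding, not the service's exact metering logic:

```python
import math

def change_data_capture_units(item_size_bytes: int) -> int:
    """CDUs for one change record: item size rounded up to the next 1 KB."""
    return max(1, math.ceil(item_size_bytes / 1024))

print(change_data_capture_units(600))    # small item -> 1 CDU
print(change_data_capture_units(2500))   # ~2.5 KB item -> 3 CDUs
```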

In addition to charges by the DynamoDB service, standard Kinesis Data Streams charges are incurred. As mentioned on the [pricing page](https://aws.amazon.com/kinesis/data-streams/pricing/), pricing differs based on the capacity mode (provisioned or on-demand), which is distinct from the DynamoDB table capacity modes and is user-defined. At a high level, Kinesis Data Streams charges an hourly rate based on the capacity mode, as well as for the data ingested into the stream by the DynamoDB service. Depending on the stream configuration, there may be additional charges such as data retrieval (for on-demand mode), extended data retention (beyond the default 24 hours), and enhanced fan-out consumer retrievals.

**Monitoring your Kinesis Data Streams usage**  
Kinesis Data Streams for DynamoDB publishes metrics from DynamoDB in addition to the standard Kinesis Data Streams CloudWatch metrics. A `Put` attempt by the DynamoDB service can be throttled by the Kinesis service because of insufficient Kinesis data stream capacity, or by dependent components such as the AWS KMS service if it is configured to encrypt the stream data at rest.

To learn more about the CloudWatch metrics published by the DynamoDB service for the Kinesis data stream, see [Monitoring change data capture with Kinesis Data Streams](kds_using-shards-and-metrics.md#kds_using-shards-and-metrics.monitoring). To avoid the additional cost of service retries caused by throttling, it is important to right-size the Kinesis data stream when using provisioned mode.

**Choosing the right capacity mode for Kinesis Data Streams**  
Kinesis Data Streams supports two capacity modes: provisioned mode and on-demand mode.
+ If the workload involving Kinesis Data Stream has predictable application traffic, traffic that is consistent or ramps gradually, or traffic that can be forecasted accurately, then Kinesis Data Streams’ **provisioned mode** is suitable and will be more cost efficient
+ If the workload is new, has unpredictable application traffic, or you prefer not to manage capacity, then Kinesis Data Streams’ **on-demand mode** is suitable and will be more cost efficient

A cost optimization best practice is to evaluate whether the DynamoDB table associated with the Kinesis data stream has a predictable traffic pattern that can leverage the provisioned mode of Kinesis Data Streams. If the workload is new, you could use on-demand mode for the first few weeks, review the CloudWatch metrics to understand the traffic patterns, and then switch the same stream to provisioned mode based on the nature of the workload. In the case of provisioned mode, you can estimate the number of shards by following the shard management considerations for Kinesis Data Streams.

**Evaluate your consumer applications using Kinesis Data Streams for DynamoDB**  
Because Kinesis Data Streams does not charge per `GetRecords` call the way DynamoDB Streams does, consumer applications can make as many calls as needed, provided the frequency stays under the `GetRecords` throttling limits. With on-demand mode, data reads are charged on a per-GB basis. With provisioned mode, reads are not charged as long as the data is less than 7 days old. When Lambda functions are the Kinesis Data Streams consumers, Lambda polls each shard in your Kinesis stream for records at a base rate of once per second.

## Cost optimization strategies for both types of Streams usage
<a name="CostOptimization_StreamsUsage_GuidanceForBoth"></a>

**Event filtering for AWS Lambda consumers**  
Lambda event filtering lets you specify filter criteria so that only matching stream records are included in the batch for a Lambda function invocation. This optimizes Lambda costs by avoiding invocations that only process or discard unwanted stream records within the consumer function logic. To learn more about configuring event filtering and writing your filter criteria, see [Lambda event filtering](https://docs.aws.amazon.com/lambda/latest/dg/invocation-eventfiltering.html).

**Tuning AWS Lambda consumers**  
Costs can be further optimized by tuning Lambda configuration parameters: increasing `BatchSize` to process more records per invocation, enabling `BisectBatchOnFunctionError` to isolate failing records instead of reprocessing whole batches (which incurs additional costs), and setting `MaximumRetryAttempts` to avoid excessive retries. By default, failed consumer Lambda invocations are retried until the record expires from the stream, which is around 24 hours for DynamoDB Streams and configurable from 24 hours up to 1 year for Kinesis Data Streams. The additional Lambda configuration options for stream consumers, including the ones mentioned above, are described in the [AWS Lambda developer guide](https://docs.aws.amazon.com/lambda/latest/dg/with-ddb.html#services-ddb-params).

# Evaluate your provisioned capacity for right-sized provisioning in your DynamoDB table
<a name="CostOptimization_RightSizedProvisioning"></a>

This section provides an overview of how to evaluate whether you have right-sized provisioning on your DynamoDB tables. As your workload evolves, you should modify your operational procedures appropriately, especially when your DynamoDB table is configured in provisioned mode and you risk over-provisioning or under-provisioning your tables.

The procedures described below require statistical information captured from the DynamoDB tables that support your production application. To understand your application behavior, define a period of time that is long enough to capture your application's data seasonality. For example, if your application shows weekly patterns, a three-week period gives you enough room to analyze application throughput needs.

If you don’t know where to start, use at least one month’s worth of data usage for the calculations below.

When evaluating capacity, keep in mind that DynamoDB tables configure **Read Capacity Units (RCUs)** and **Write Capacity Units (WCUs)** independently. If your tables have any global secondary indexes (GSIs) configured, you must specify the throughput they will consume, which is also independent of the RCUs and WCUs of the base table.

**Note**  
Local Secondary Indexes (LSI) consume capacity from the base table.

**Topics**
+ [How to retrieve consumption metrics on your DynamoDB tables](#CostOptimization_RightSizedProvisioning_ConsumptionMetrics)
+ [How to identify under-provisioned DynamoDB tables](#CostOptimization_RightSizedProvisioning_UnderProvisionedTables)
+ [How to identify over-provisioned DynamoDB tables](#CostOptimization_RightSizedProvisioning_OverProvisionedTables)

## How to retrieve consumption metrics on your DynamoDB tables
<a name="CostOptimization_RightSizedProvisioning_ConsumptionMetrics"></a>

To evaluate the table and GSI capacity, monitor the following CloudWatch metrics and select the appropriate dimension to retrieve either table or GSI information:


| Read Capacity Units | Write Capacity Units | 
| --- | --- | 
|  `ConsumedReadCapacityUnits`  |  `ConsumedWriteCapacityUnits`  | 
|  `ProvisionedReadCapacityUnits`  |  `ProvisionedWriteCapacityUnits`  | 
|  `ReadThrottleEvents`  |  `WriteThrottleEvents`  | 

You can do this either through the AWS CLI or the AWS Management Console.

------
#### [ AWS CLI ]

To retrieve the table consumption metrics, start by capturing some historical data points using the CloudWatch API.

Start by creating two files: `write-calc.json` and `read-calc.json`. These files will represent the calculations for a table or GSI. You'll need to update some of the fields, as indicated in the table below, to match your environment.


| Field Name | Definition | Example | 
| --- | --- | --- | 
| <table-name> | The name of the table that you will be analyzing | SampleTable | 
| <period> | The period of time that you will use to evaluate the utilization target, specified in seconds | For a 1-hour period, specify: 3600 | 
| <start-time> | The beginning of your evaluation interval, specified in ISO8601 format | 2022-02-21T23:00:00 | 
| <end-time> | The end of your evaluation interval, specified in ISO8601 format | 2022-02-22T06:00:00 | 

The write calculations file retrieves the number of WCUs provisioned and consumed during the time period for the date range specified. It also generates a utilization percentage that is used for analysis. The full content of the `write-calc.json` file should look like this:

```
{
  "MetricDataQueries": [
    {
      "Id": "provisionedWCU",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/DynamoDB",
          "MetricName": "ProvisionedWriteCapacityUnits",
          "Dimensions": [
            {
              "Name": "TableName",
              "Value": "<table-name>"
            }
          ]
        },
        "Period": <period>,
        "Stat": "Average"
      },
      "Label": "Provisioned",
      "ReturnData": false
    },
    {
      "Id": "consumedWCU",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/DynamoDB",
          "MetricName": "ConsumedWriteCapacityUnits",
          "Dimensions": [
            {
              "Name": "TableName",
              "Value": "<table-name>"
            }
          ]
        },
        "Period": <period>,
        "Stat": "Sum"
      },
      "Label": "",
      "ReturnData": false
    },
    {
      "Id": "m1",
      "Expression": "consumedWCU/PERIOD(consumedWCU)",
      "Label": "Consumed WCUs",
      "ReturnData": false
    },
    {
      "Id": "utilizationPercentage",
      "Expression": "100*(m1/provisionedWCU)",
      "Label": "Utilization Percentage",
      "ReturnData": true
    }
  ],
  "StartTime": "<start-time>",
  "EndTime": "<end-time>",
  "ScanBy": "TimestampDescending",
  "MaxDatapoints": 24
}
```

The read calculations file uses a similar structure. It retrieves how many RCUs were provisioned and consumed during the time period for the date range specified. The contents of the `read-calc.json` file should look like this:

```
{
  "MetricDataQueries": [
    {
      "Id": "provisionedRCU",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/DynamoDB",
          "MetricName": "ProvisionedReadCapacityUnits",
          "Dimensions": [
            {
              "Name": "TableName",
              "Value": "<table-name>"
            }
          ]
        },
        "Period": <period>,
        "Stat": "Average"
      },
      "Label": "Provisioned",
      "ReturnData": false
    },
    {
      "Id": "consumedRCU",
      "MetricStat": {
        "Metric": {
          "Namespace": "AWS/DynamoDB",
          "MetricName": "ConsumedReadCapacityUnits",
          "Dimensions": [
            {
              "Name": "TableName",
              "Value": "<table-name>"
            }
          ]
        },
        "Period": <period>,
        "Stat": "Sum"
      },
      "Label": "",
      "ReturnData": false
    },
    {
      "Id": "m1",
      "Expression": "consumedRCU/PERIOD(consumedRCU)",
      "Label": "Consumed RCUs",
      "ReturnData": false
    },
    {
      "Id": "utilizationPercentage",
      "Expression": "100*(m1/provisionedRCU)",
      "Label": "Utilization Percentage",
      "ReturnData": true
    }
  ],
  "StartTime": "<start-time>",
  "EndTime": "<end-time>",
  "ScanBy": "TimestampDescending",
  "MaxDatapoints": 24
}
```

Once you've created the files, you can start retrieving utilization data.

1. To retrieve the write utilization data, issue the following command:

   ```
   aws cloudwatch get-metric-data --cli-input-json file://write-calc.json
   ```

1. To retrieve the read utilization data, issue the following command:

   ```
   aws cloudwatch get-metric-data --cli-input-json file://read-calc.json
   ```

The result of both queries is a series of data points in JSON format that you can use for analysis. Your results will depend on the number of data points specified, the period, and your own workload data. They could look something like this:

```
{
    "MetricDataResults": [
        {
            "Id": "utilizationPercentage",
            "Label": "Utilization Percentage",
            "Timestamps": [
                "2022-02-22T05:00:00+00:00",
                "2022-02-22T04:00:00+00:00",
                "2022-02-22T03:00:00+00:00",
                "2022-02-22T02:00:00+00:00",
                "2022-02-22T01:00:00+00:00",
                "2022-02-22T00:00:00+00:00",
                "2022-02-21T23:00:00+00:00"
            ],
            "Values": [
                91.55364583333333,
                55.066631944444445,
                2.6114930555555556,
                24.9496875,
                40.94725694444445,
                25.61819444444444,
                0.0
            ],
            "StatusCode": "Complete"
        }
    ],
    "Messages": []
}
```

**Note**  
If you specify a short period over a long time range, you might need to increase `MaxDatapoints`, which is set to 24 in the script by default. This represents one data point per hour, 24 per day.
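The metric math in these files can be reproduced locally. The `m1` expression converts the summed consumption into a per-second rate, and `utilizationPercentage` compares that rate to the provisioned average:

```python
def utilization_percentage(consumed_sum: float,
                           provisioned_avg: float,
                           period_seconds: int) -> float:
    """Mirrors the m1 and utilizationPercentage expressions:
    per-second consumption = Sum / period, then 100 * rate / provisioned."""
    consumed_per_second = consumed_sum / period_seconds  # m1
    return 100 * consumed_per_second / provisioned_avg

# Example: 3,600 WCUs consumed over a 1-hour period against 10 provisioned WCUs
# averages 1 WCU/second, i.e. 10% utilization.
print(utilization_percentage(3600, 10, 3600))
```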

------
#### [ AWS Management Console ]

1. Sign in to the AWS Management Console and navigate to the CloudWatch service page. Select an appropriate AWS Region, if necessary.

1. Locate the **Metrics** section on the left navigation bar and select **All metrics**.

1. This opens a dashboard with two panels. The top panel shows the graph, and the bottom panel shows the metrics you want to graph. Choose **DynamoDB**.

1. Choose **Table Metrics**. This will show you the tables in your current Region.

1. Use the Search box to search for your table name and choose the write operation metrics: `ConsumedWriteCapacityUnits` and `ProvisionedWriteCapacityUnits`.
**Note**  
This example talks about write operation metrics, but you can also use these steps to graph the read operation metrics.

1. Choose the **Graphed metrics (2)** tab to modify the formulas. By default, CloudWatch selects the statistical function **Average** for the graphs.  
![\[The selected graphed metrics and Average as the default statistical function.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning1.png)

1. With both graphed metrics selected (the checkboxes on the left), choose the **Add math** menu, then **Common**, and then select the **Percentage** function. Repeat this procedure twice.

   First time selecting the **Percentage** function:  
![\[CloudWatch console. The Percentage function is selected for the graphed metrics.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning2.png)

   Second time selecting the **Percentage** function:  
![\[CloudWatch console. The Percentage function is selected a second time for the graphed metrics.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning3.png)

1. At this point you should have four metrics in the bottom menu. Let's work on the `ConsumedWriteCapacityUnits` calculation. To be consistent, we need to match the names to the ones we used in the AWS CLI section. Choose the **m1 ID** and change this value to **consumedWCU**.   
![\[CloudWatch console. The graphed metric with m1 ID is renamed to consumedWCU.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning4.png)

   Rename the **ConsumedWriteCapacityUnit** label as **consumedWCU**.  
![\[The graphed metric with ConsumedWriteCapacityUnit label is renamed to consumedWCU.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning5.png)

1. Change the statistic from **Average** to **Sum**. This action automatically creates another metric called **ANOMALY\_DETECTION\_BAND**. For the scope of this procedure, ignore it by clearing the checkbox on the newly generated **ad1** metric.  
![\[CloudWatch console. The statistic SUM is selected in the dropdown list for the graphed metrics.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning6.png)  
![\[CloudWatch console. The ANOMALY_DETECTION_BAND metric is removed from the list of graphed metrics.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning7.png)

1. Repeat step 8 to rename the **m2 ID** to **provisionedWCU**. Leave the statistic set to **Average**.  
![\[CloudWatch console. The graphed metric with m2 ID is renamed to provisionedWCU.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning8.png)

1. Select the **Expression1** label and update the value to **m1** and the label to **Consumed WCUs**.
**Note**  
Make sure you have only **m1** and **provisionedWCU** selected (the checkboxes on the left) to properly visualize the data. Update the formula by choosing **Details** and changing the formula to **consumedWCU/PERIOD(consumedWCU)**. This step might also generate another **ANOMALY\_DETECTION\_BAND** metric, but for the scope of this procedure you can ignore it.  

![\[m1 and provisionedWCU are selected. Details for m1 is updated as consumedWCU/PERIOD(consumedWCU).\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning10.png)


1. You should now have two graphed series: one that indicates the provisioned WCUs on the table and another that indicates the consumed WCUs. The shape of your graph might differ from the one below, but you can use it as a reference:  
![\[Graph with the provisioned WCUs and consumed WCUs for the table plotted.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning11.png)

1. Update the percentage formula by selecting the Expression2 graphic (**e2**). Rename the label and ID to **utilizationPercentage**. Update the formula to match **100\*(m1/provisionedWCU)**.  
![\[CloudWatch console. Labels and IDs for Expression2 are renamed to utilizationPercentage.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning12.png)  
![\[CloudWatch console. Percentage formula for Expression2 is updated to 100*(m1/provisionedWCU).\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning13.png)

1. Remove the checkbox from all the metrics but **utilizationPercentage** to visualize your utilization patterns. The default interval is set to 1 minute, but feel free to modify it as you need.  
![\[Graph of the utilizationPercentage metric for the selected time interval.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning14.png)

Here is a view over a longer time range with a larger period of 1 hour. You can see there are some intervals where the utilization was higher than 100%, but this particular workload also has longer intervals with zero utilization.

![\[Utilization pattern for an extended period. It highlights periods of utilization over 100% and zero.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/CostOptimization/RightSizedProvisioning15.png)


At this point, your results might differ from the pictures in this example; it all depends on the data from your workload. Intervals with more than 100% utilization are prone to throttling events. DynamoDB offers [burst capacity](burst-adaptive-capacity.md#burst-capacity), but as soon as burst capacity is exhausted, anything above 100% is throttled.

------

## How to identify under-provisioned DynamoDB tables
<a name="CostOptimization_RightSizedProvisioning_UnderProvisionedTables"></a>

For most workloads, a table is considered under-provisioned when it constantly consumes more than 80% of its provisioned capacity.

[Burst capacity](burst-adaptive-capacity.md#burst-capacity) is a DynamoDB feature that allows you to temporarily consume more RCUs/WCUs than provisioned (more than the per-second provisioned throughput defined on the table). Burst capacity exists to absorb sudden increases in traffic due to special events or usage spikes, but it doesn't last forever. As soon as the unused RCUs and WCUs are depleted, you will be throttled if you try to consume more capacity than provisioned. When your application traffic gets close to the 80% utilization rate, your risk of throttling is significantly higher.

How the 80% utilization rule applies varies with the seasonality of your data and your traffic growth. Consider the following scenarios: 
+ If your traffic has been **stable** at a \~90% utilization rate for the last 12 months, your table has just the right capacity
+ If your application traffic is **growing** at a rate of 8% monthly, you will reach 100% in less than 3 months
+ If your application traffic is **growing** at a rate of 5% monthly, you will still reach 100% in a little more than 4 months
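These growth scenarios follow from compound-growth arithmetic, assuming a starting utilization of 80% and monthly compounding:

```python
import math

def months_to_full_utilization(start_pct: float, monthly_growth_pct: float) -> float:
    """Months of compound growth before utilization reaches 100%."""
    return math.log(100 / start_pct) / math.log(1 + monthly_growth_pct / 100)

print(round(months_to_full_utilization(80, 8), 1))  # 8%/month: under 3 months
print(round(months_to_full_utilization(80, 5), 1))  # 5%/month: a bit over 4 months
```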

The results from the queries above provide a picture of your utilization rate. Use them as a guide to further evaluate other metrics that can help you choose to increase your table capacity as required (for example: a monthly or weekly growth rate). Work with your operations team to define what is a good percentage for your workload and your tables.

There are special scenarios where the data is skewed when you analyze it on a daily or weekly basis. For example, with seasonal applications that have spikes in usage during working hours (but then drop to almost zero outside of working hours), you could benefit from [scheduling auto scaling](https://docs.aws.amazon.com/autoscaling/application/userguide/examples-scheduled-actions.html), where you specify the hours of the day (and the days of the week) to increase the provisioned capacity and when to reduce it. If your seasonality is less pronounced, instead of provisioning higher capacity to cover the busy hours, you can also benefit from [DynamoDB table auto scaling](AutoScaling.md) configurations.

**Note**  
When you create a DynamoDB auto scaling configuration for your base table, remember to include another configuration for any GSI that is associated with the table.

## How to identify over-provisioned DynamoDB tables
<a name="CostOptimization_RightSizedProvisioning_OverProvisionedTables"></a>

The query results obtained from the scripts above provide the data points required for some initial analysis. If your data set shows values lower than 20% utilization for several intervals, your table might be over-provisioned. To further determine whether you need to reduce the number of WCUs and RCUs, you should revisit the other readings in those intervals.

When your tables contain several low usage intervals, you can really benefit from using auto scaling policies, either by scheduling auto scaling or just configuring the default auto scaling policies for the table that are based on utilization.
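The 80% and 20% guidance from these two sections can be combined into a rough classifier over your utilization data points. The thresholds and the "at least half the intervals" rule here are illustrative choices, not an official formula:

```python
def classify_utilization(samples: list) -> str:
    """Rough classification using the 80%/20% guidance above."""
    if samples and all(u > 80 for u in samples):
        return "under-provisioned"
    if samples and sum(1 for u in samples if u < 20) >= len(samples) / 2:
        return "possibly over-provisioned"
    return "right-sized"

print(classify_utilization([85, 92, 88, 96]))   # consistently above 80%
print(classify_utilization([5, 12, 3, 55, 8]))  # mostly below 20%
print(classify_utilization([40, 60, 75, 55]))   # healthy middle range
```

You would feed in the `utilizationPercentage` values returned by the CLI queries above, one sample per interval.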

If you have a workload with low utilization but a high throttle ratio (**Max(ThrottleEvents)/Min(ThrottleEvents)** in the interval), you might have a very spiky workload where traffic increases greatly during some days or hours but is consistently low in general. In these scenarios it might be beneficial to use [scheduled auto scaling](https://docs.aws.amazon.com/autoscaling/application/userguide/examples-scheduled-actions.html).

The AWS [Well-Architected Framework](https://aws.amazon.com/architecture/well-architected/) helps cloud architects build secure, high-performing, resilient, and efficient infrastructure for a variety of applications and workloads. Built around six pillars (operational excellence, security, reliability, performance efficiency, cost optimization, and sustainability), AWS Well-Architected provides a consistent approach for customers and partners to evaluate architectures and implement scalable designs.

The AWS [Well-Architected Lenses](https://docs.aws.amazon.com/wellarchitected/latest/userguide/lenses.html) extend the guidance offered by AWS Well-Architected to specific industry and technology domains. The Amazon DynamoDB Well-Architected Lens focuses on DynamoDB workloads. It provides best practices, design principles, and questions to assess and review a DynamoDB workload. Completing an Amazon DynamoDB Well-Architected Lens review provides you with education and guidance around recommended design principles as they relate to each of the AWS Well-Architected pillars. This guidance is based on our experience working with customers across various industries, segments, sizes, and geographies.

As a direct outcome of the Well-Architected Lens review, you will receive a summary of actionable recommendations to optimize and improve your DynamoDB workload.

## Conducting the Amazon DynamoDB Well-Architected Lens review
<a name="bp-wal-conducting"></a>

The DynamoDB Well-Architected Lens review is usually performed by an AWS Solutions Architect together with the customer, but can also be performed by the customer as a self-service. While we recommend reviewing all six of the Well-Architected Pillars as part of the Amazon DynamoDB Well-Architected Lens, you can also decide to prioritize your focus on one or more pillars first.

Additional information and instructions for conducting an Amazon DynamoDB Well-Architected Lens review are available in [this video ](https://youtu.be/mLAUvJYvBjA) and the [DynamoDB Well-Architected Lens GitHub page ](https://github.com/aws-samples/custom-lens-wa-hub/tree/main/DynamoDB).

## The pillars of the Amazon DynamoDB Well-Architected Lens
<a name="bp-wal-pillars"></a>

The Amazon DynamoDB Well-Architected Lens is built around six pillars:

**Performance efficiency pillar**

The performance efficiency pillar includes the ability to use computing resources efficiently to meet system requirements, and to maintain that efficiency as demand changes and technologies evolve.

The primary DynamoDB design principles for this pillar revolve around [modeling the data ](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-relational-modeling.html), [choosing partition keys ](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html#HowItWorks.Partitions.SimpleKey) and [sort keys ](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/HowItWorks.Partitions.html#HowItWorks.Partitions.CompositeKey), and [defining secondary indexes ](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-indexes.html) based on the application access patterns. Additional considerations include choosing the optimal throughput mode for the workload, AWS SDK tuning and, when appropriate, using an optimal caching strategy. To learn more about these design principles, watch this [deep dive video ](https://youtu.be/PuCIy5Weyi8) about the performance efficiency pillar of the DynamoDB Well-Architected Lens.

**Cost optimization pillar**

The cost optimization pillar focuses on avoiding unnecessary costs. 

Key topics include understanding and controlling where money is being spent, selecting the most appropriate and right number of resource types, analyzing spend over time, designing your data models to optimize the cost for application-specific access patterns, and scaling to meet business needs without overspending.

The key cost optimization design principles for DynamoDB revolve around choosing the most appropriate capacity mode and table class for your tables and avoiding over-provisioning capacity by either using the on-demand capacity mode, or provisioned capacity mode with auto scaling. Additional considerations include efficient data modeling and querying to reduce the amount of consumed capacity, reserving portions of the consumed capacity at a discounted price, minimizing item size, identifying and removing unused resources, and using [TTL](TTL.md) to automatically delete aged-out data at no cost. To learn more about these design principles, watch this [deep dive video](https://youtu.be/iuI0HUuw6Jg) about the cost optimization pillar of the DynamoDB Well-Architected Lens.

See [Cost optimization](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/bp-cost-optimization.html) for additional information on cost optimization best practices for DynamoDB.

**Operational excellence pillar**

The operational excellence pillar focuses on running and monitoring systems to deliver business value, and continually improving processes and procedures. Key topics include automating changes, responding to events, and defining standards to manage daily operations.

The main operational excellence design principles for DynamoDB include monitoring DynamoDB metrics through Amazon CloudWatch and AWS Config, and automatically alerting and remediating when predefined thresholds are breached or noncompliant rules are detected. Additional considerations are defining DynamoDB resources through infrastructure as code and using tags for better organization, identification, and cost accounting of your DynamoDB resources. To learn more about these design principles, watch this [deep dive video ](https://youtu.be/41HUSL9tJa8) about the operational excellence pillar of the DynamoDB Well-Architected Lens.

**Reliability pillar**

The reliability pillar focuses on ensuring a workload performs its intended function correctly and consistently when it’s expected to. A resilient workload quickly recovers from failures to meet business and customer demand. Key topics include distributed system design, recovery planning, and how to handle change.

The essential reliability design principles for DynamoDB revolve around choosing the backup strategy and retention based on your RPO and RTO requirements, using DynamoDB global tables for multi-regional workloads, or cross-region disaster recovery scenarios with low RTO, implementing retry logic with exponential backoff in the application by configuring and using these capabilities in the AWS SDK, and monitoring DynamoDB metrics through Amazon CloudWatch and automatically alerting and remediating when predefined thresholds are breached. To learn more about these design principles, watch this [deep dive video ](https://youtu.be/8AoPBxVQYM8) about the reliability pillar of the DynamoDB Well-Architected Lens.

**Security pillar**

The security pillar focuses on protecting information and systems. Key topics include confidentiality and integrity of data, identifying and managing who can do what with privilege management, protecting systems, and establishing controls to detect security events.

The main security design principles for DynamoDB are encrypting data in transit with HTTPS, choosing the type of keys for encryption at rest, and defining the IAM roles and policies to authenticate, authorize, and provide fine-grained access to DynamoDB resources. Additional considerations include auditing DynamoDB control plane and data plane operations through AWS CloudTrail. To learn more about these design principles, watch this [deep dive video](https://youtu.be/95prjv2EEXA?si=xvNci2MM856siejv) about the security pillar of the DynamoDB Well-Architected Lens.

See [Security](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/security.html) for additional information on security for DynamoDB.

**Sustainability pillar**

The sustainability pillar focuses on minimizing the environmental impacts of running cloud workloads. Key topics include a shared responsibility model for sustainability, understanding impact, and maximizing utilization to minimize required resources and reduce downstream impacts.

The main sustainability design principles for DynamoDB include identifying and removing unused DynamoDB resources, avoiding over-provisioning through the use of on-demand capacity mode or provisioned capacity mode with auto scaling, querying efficiently to reduce the amount of capacity consumed, and reducing the storage footprint by compressing data and by deleting aged-out data through the use of TTL. To learn more about these design principles, watch this [deep dive video ](https://youtu.be/fAfYms7u3EE) about the sustainability pillar of the DynamoDB Well-Architected Lens.

# Best practices for designing and using partition keys effectively in DynamoDB
<a name="bp-partition-key-design"></a>

The primary key that uniquely identifies each item in an Amazon DynamoDB table can be simple (a partition key only) or composite (a partition key combined with a sort key). 

You should design your application for uniform activity across all partition keys in the table and its secondary indexes. Determine the access patterns that your application requires, and the read and write units that each table and secondary index requires.

**Note**  
Adaptive capacity applies to both on-demand and provisioned capacity modes.

Every partition in a DynamoDB table is designed to deliver a maximum capacity of 3,000 read units per second and 1,000 write units per second. One read unit represents one strongly consistent read operation per second, or two eventually consistent read operations per second, for an item up to 4 KB in size. One write unit represents one write operation per second for an item up to 1 KB in size.

You must factor in the item size when evaluating the partition throughput limits for your table. For example, if the table has an item size of 20 KB, a single consistent read operation will consume 5 read units. This means you can concurrently drive 600 consistent read operations per second on that single item before reaching the partition limits. The total throughput across all partitions in the table can be constrained by the provisioned throughput in provisioned mode, or by the table level throughput limit in on-demand mode. See [Service Quotas](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ServiceQuotas.html) for more information.
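The capacity arithmetic above can be sketched as a small helper. The 4 KB read and 1 KB write unit sizes and the 3,000 read/1,000 write per-partition limits come from this section; the function names are illustrative:

```python
import math

READ_UNIT_KB = 4          # one read unit covers up to 4 KB (strongly consistent)
PARTITION_MAX_RCU = 3000  # per-partition read limit
PARTITION_MAX_WCU = 1000  # per-partition write limit

def read_units(item_kb: float, consistent: bool = True) -> float:
    """Read units consumed by a single read of an item of the given size."""
    units = math.ceil(item_kb / READ_UNIT_KB)
    return units if consistent else units / 2  # eventually consistent reads cost half

def max_reads_per_second(item_kb: float, consistent: bool = True) -> int:
    """How many reads per second a single partition can sustain at this item size."""
    return int(PARTITION_MAX_RCU / read_units(item_kb, consistent))

# The 20 KB example from the text: 5 read units per strongly consistent read,
# so a single partition sustains at most 600 such reads per second.
print(read_units(20))            # 5
print(max_reads_per_second(20))  # 600
```

The same rounding applies per request, so many small items are cheaper to read individually than one large item of the same total size.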

**Topics**
+ [Designing partition keys to distribute your workload in DynamoDB](bp-partition-key-uniform-load.md)
+ [Using write sharding to distribute workloads evenly in your DynamoDB table](bp-partition-key-sharding.md)
+ [Distributing write activity efficiently during data upload in DynamoDB](bp-partition-key-data-upload.md)

# Designing partition keys to distribute your workload in DynamoDB
<a name="bp-partition-key-uniform-load"></a>

The partition key portion of a table's primary key determines the logical partitions in which a table's data is stored. This in turn affects the underlying physical partitions. A partition key design that doesn't distribute I/O requests effectively can create "hot" partitions that result in throttling and use your provisioned I/O capacity inefficiently.

The optimal usage of a table's provisioned throughput depends not only on the workload patterns of individual items, but also on the partition key design. This doesn't mean that you must access all partition key values to achieve an efficient throughput level, or even that the percentage of accessed partition key values must be high. It does mean that the more distinct partition key values that your workload accesses, the more those requests will be spread across the partitioned space. In general, you'll use your provisioned throughput more efficiently as the ratio of partition key values accessed to the total number of partition key values increases.

The following is a comparison of the provisioned throughput efficiency of some common partition key schemas.



| Partition key value | Uniformity | 
| --- | --- | 
| User ID, where the application has many users. | Good | 
| Status code, where there are only a few possible status codes. | Bad | 
| Item creation date, rounded to the nearest time period (for example, day, hour, or minute). | Bad | 
| Device ID, where each device accesses data at relatively similar intervals. | Good | 
| Device ID, where even if there are many devices being tracked, one is by far more popular than all the others. | Bad | 

If a single table has only a small number of partition key values, consider distributing your write operations across more distinct partition key values. In other words, structure the primary key elements to avoid one "hot" (heavily requested) partition key value that slows overall performance.

For example, consider a table with a composite primary key. The partition key represents the item's creation date, rounded to the nearest day. The sort key is an item identifier. On a given day, say `2014-07-09`, **all** of the new items are written to that single partition key value (and corresponding physical partition). 

If the table fits entirely into a single partition (considering growth of your data over time), and if your application's read and write throughput requirements don't exceed the read and write capabilities of a single partition, your application won't encounter any unexpected throttling as a result of partitioning.

To use NoSQL Workbench for DynamoDB to help visualize your partition key design, see [Building data models with NoSQL Workbench](workbench.Modeler.md). 

# Using write sharding to distribute workloads evenly in your DynamoDB table
<a name="bp-partition-key-sharding"></a>

One way to better distribute writes across a partition key space in Amazon DynamoDB is to expand the space. You can do this in several different ways. You can add a random number to the partition key values to distribute the items among partitions. Or you can use a number that is calculated based on something that you're querying on.

## Sharding using random suffixes
<a name="bp-partition-key-sharding-random"></a>

One strategy for distributing loads more evenly across a partition key space is to add a random number to the end of the partition key values. Then you randomize the writes across the larger space.

For example, for a partition key that represents today's date, you might choose a random number between `1` and `200` and concatenate it as a suffix to the date. This yields partition key values like `2014-07-09.1`, `2014-07-09.2`, and so on, through `2014-07-09.200`. Because you are randomizing the partition key, the writes to the table on each day are spread evenly across multiple partitions. This results in better parallelism and higher overall throughput.

However, to read all the items for a given day, you would have to query the items for all the suffixes and then merge the results. For example, you would first issue a `Query` request for the partition key value `2014-07-09.1`. Then issue another `Query` for `2014-07-09.2`, and so on, through `2014-07-09.200`. Finally, your application would have to merge the results from all those `Query` requests.

## Sharding using calculated suffixes
<a name="bp-partition-key-sharding-calculated"></a>

A randomizing strategy can greatly improve write throughput. But it's difficult to read a specific item because you don't know which suffix value was used when writing the item. To make it easier to read individual items, you can use a different strategy. Instead of using a random number to distribute the items among partitions, use a number that you can calculate based upon something that you want to query on.

Consider the previous example, in which a table uses today's date in the partition key. Now suppose that each item has an accessible `OrderId` attribute, and that you most often need to find items by order ID in addition to date. Before your application writes the item to the table, it could calculate a hash suffix based on the order ID and append it to the partition key date. The calculation might generate a number between 1 and 200 that is fairly evenly distributed, similar to what the random strategy produces.

A simple calculation would likely suffice, such as the product of the UTF-8 code point values for the characters in the order ID, modulo 200, + 1. The partition key value would then be the date concatenated with the calculation result.

With this strategy, the writes are spread evenly across the partition key values, and thus across the physical partitions. You can easily perform a `GetItem` operation for a particular item and date because you can calculate the partition key value for a specific `OrderId` value.
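A minimal sketch of the calculation described above (the exact hash is illustrative; any function that spreads order IDs evenly over 1–200 works):

```python
def order_suffix(order_id: str) -> int:
    """Product of the code point values of the order ID, modulo 200, plus 1."""
    product = 1
    for ch in order_id:
        product *= ord(ch)
    return product % 200 + 1

def partition_key(date: str, order_id: str) -> str:
    """Deterministic partition key: the same order always maps to the same shard."""
    return f"{date}.{order_suffix(order_id)}"

# Because the suffix is derived from the order ID, the application can compute
# the exact partition key for a GetItem on a known order:
print(partition_key("2014-07-09", "ORDER-42"))
```

Unlike the random-suffix strategy, the suffix here is reproducible, so single-item reads don't require searching all 200 shards.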

To read all the items for a given day, you still must `Query` each of the `2014-07-09.N` keys (where `N` is 1–200), and your application then has to merge all the results. The benefit is that you avoid having a single "hot" partition key value taking all of the workload.

**Note**  
For a more efficient strategy specifically designed to handle high-volume time series data, see [Time series data](bp-time-series.md).

# Distributing write activity efficiently during data upload in DynamoDB
<a name="bp-partition-key-data-upload"></a>

Typically, when you load data from other data sources, Amazon DynamoDB partitions your table data across multiple servers. You get better performance if you upload data to all the allocated servers simultaneously.

For example, suppose that you want to upload user messages to a DynamoDB table that uses a composite primary key with `UserID` as the partition key and `MessageID` as the sort key.

When you upload the data, one approach you can take is to upload all message items for each user, one user after another:



| UserID | MessageID | 
| --- | --- | 
| U1 | 1 | 
| U1 | 2 | 
| U1 | ... | 
| U1 | ... up to 100 | 
| U2 | 1 | 
| U2 | 2 | 
| U2 | ... | 
| U2 | ... up to 200 | 

The problem in this case is that you are not distributing your write requests to DynamoDB across your partition key values. You are taking one partition key value at a time and uploading all of its items before going to the next partition key value and doing the same.

Behind the scenes, DynamoDB is partitioning the data in your table across multiple servers. To fully use all the throughput capacity that is provisioned for the table, you must distribute your workload across your partition key values. By directing an uneven amount of upload work toward items that all have the same partition key value, you are not fully using all the resources that DynamoDB has provisioned for your table.

You can distribute your upload work by using the sort key to load one item from each partition key value, then another item from each partition key value, and so on: 



| UserID | MessageID | 
| --- | --- | 
| U1 | 1 | 
| U2 | 1 | 
| U3 | 1 | 
| ... | ... | 
| U1 | 2 | 
| U2 | 2 | 
| U3 | 2 | 
| ... | ... | 

Every upload in this sequence uses a different partition key value, keeping more DynamoDB servers busy simultaneously and improving your throughput performance.
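The interleaved ordering shown above can be produced with a round-robin over per-user message lists. This is a sketch; the user message counts are taken from the earlier example:

```python
from itertools import chain, zip_longest

def interleave(batches):
    """Round-robin across several iterables, skipping exhausted ones."""
    sentinel = object()
    mixed = chain.from_iterable(zip_longest(*batches, fillvalue=sentinel))
    return [item for item in mixed if item is not sentinel]

u1 = [("U1", m) for m in range(1, 101)]  # U1 has 100 messages
u2 = [("U2", m) for m in range(1, 201)]  # U2 has 200 messages

upload_order = interleave([u1, u2])
print(upload_order[:4])  # [('U1', 1), ('U2', 1), ('U1', 2), ('U2', 2)]
```

Each consecutive write now targets a different partition key value, which is exactly the access pattern the table above illustrates.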

# Best practices for using sort keys to organize data in DynamoDB
<a name="bp-sort-keys"></a>

In an Amazon DynamoDB table, the primary key that uniquely identifies each item in the table can be composed of a partition key and a sort key.

Well-designed sort keys have two key benefits:
+ They gather related information together in one place where it can be queried efficiently. Careful design of the sort key lets you retrieve commonly needed groups of related items using range queries with operators such as `begins_with`, `between`, `>`, `<`, and so on.
+ Composite sort keys let you define hierarchical (one-to-many) relationships in your data that you can query at any level of the hierarchy.

  For example, in a table listing geographical locations, you might structure the sort key as follows.

  ```
  [country]#[region]#[state]#[county]#[city]#[neighborhood]
  ```

  This would let you make efficient range queries for a list of locations at any one of these levels of aggregation, from `country`, to a `neighborhood`, and everything in between.
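A sketch of how such a composite sort key could be built and queried by prefix. The helper names are illustrative; in a real `Query`, the prefix match would be expressed as a `begins_with` key condition:

```python
def location_sort_key(country, region="", state="", county="", city="", neighborhood=""):
    """Build the hierarchical sort key, dropping trailing empty levels."""
    parts = [country, region, state, county, city, neighborhood]
    while parts and parts[-1] == "":
        parts.pop()
    return "#".join(parts)

def begins_with(sort_key: str, prefix: str) -> bool:
    """Mirrors the begins_with key condition used in a range query."""
    return sort_key.startswith(prefix)

key = location_sort_key("US", "PNW", "WA", "King", "Seattle", "Fremont")
print(key)                            # US#PNW#WA#King#Seattle#Fremont
print(begins_with(key, "US#PNW#WA"))  # True — matches every location in the state
```

Because each level is a prefix of the levels below it, one index supports queries at every level of the hierarchy.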

## Using sort keys for version control
<a name="bp-sort-keys-version-control"></a>

Many applications need to maintain a history of item-level revisions for audit or compliance purposes and to be able to retrieve the most recent version easily. There is an effective design pattern that accomplishes this using sort key prefixes:
+ For each new item, create two copies of the item: One copy should have a version-number prefix of zero (such as `v0_`) at the beginning of the sort key, and one should have a version-number prefix of one (such as `v1_`).
+ Every time the item is updated, use the next higher version-prefix in the sort key of the updated version, and copy the updated contents into the item with version-prefix zero. This means that the latest version of any item can be located easily using the zero prefix.

For example, a parts manufacturer might use a schema like the one illustrated below.

![\[Version control example showing a table with primary key and data-item attributes.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/VersionControl.png)


The `Equipment_1` item goes through a sequence of audits by various auditors. The results of each new audit are captured in a new item in the table, starting with version number one, and then incrementing the number for each successive revision.

When each new revision is added, the application layer replaces the contents of the zero-version item (having sort key equal to `v0_Audit`) with the contents of the new revision.

Whenever the application needs to retrieve the most recent audit status, it can query for the sort key prefix of `v0_`.

If the application needs to retrieve the entire revision history, it can query all the items under the item's partition key and filter out the `v0_` item.

This design also works for audits across multiple parts of a piece of equipment, if you include the individual part-IDs in the sort key after the sort key prefix.
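The pattern above can be sketched with an in-memory dict standing in for the item collection (names are illustrative; in DynamoDB, the `vN_` write and the `v0_` overwrite should happen in a transaction so readers never see them out of sync):

```python
table = {}  # simulates the item collection under one partition key

def add_revision(pk: str, contents: dict) -> str:
    """Write the next vN_ item and copy its contents into the v0_ item."""
    versions = [int(sk[1:].split("_")[0])
                for sk in table.get(pk, {}) if sk != "v0_Audit"]
    next_version = max(versions, default=0) + 1
    sk = f"v{next_version}_Audit"
    table.setdefault(pk, {})[sk] = dict(contents)
    table[pk]["v0_Audit"] = dict(contents)  # latest version always under v0_
    return sk

add_revision("Equipment_1", {"status": "FAIL"})
add_revision("Equipment_1", {"status": "PASS"})
print(table["Equipment_1"]["v0_Audit"])  # {'status': 'PASS'} — the latest revision
print(sorted(table["Equipment_1"]))      # ['v0_Audit', 'v1_Audit', 'v2_Audit']
```

Reading the latest status is a single `GetItem` on the `v0_` key; the full history is a `Query` on the partition key with the `v0_` item filtered out.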

# Best practices for using secondary indexes in DynamoDB
<a name="bp-indexes"></a>

Secondary indexes are often essential to support the query patterns that your application requires. At the same time, overusing secondary indexes or using them inefficiently can add cost and reduce performance unnecessarily.

**Contents**
+ [General guidelines for secondary indexes in DynamoDB](bp-indexes-general.md)
  + [Use indexes efficiently](bp-indexes-general.md#bp-indexes-general-efficiency)
  + [Choose projections carefully](bp-indexes-general.md#bp-indexes-general-projections)
  + [Optimize frequent queries to avoid fetches](bp-indexes-general.md#bp-indexes-general-fetches)
  + [Be aware of item-collection size limits when creating local secondary indexes](bp-indexes-general.md#bp-indexes-general-expanding-collections)
+ [Take advantage of sparse indexes](bp-indexes-general-sparse-indexes.md)
  + [Examples of sparse indexes in DynamoDB](bp-indexes-general-sparse-indexes.md#bp-indexes-sparse-examples)
+ [Using Global Secondary Indexes for materialized aggregation queries in DynamoDB](bp-gsi-aggregation.md)
  + [Example scenario and access patterns](bp-gsi-aggregation.md#bp-gsi-aggregation-scenario)
  + [Why pre-compute aggregations](bp-gsi-aggregation.md#bp-gsi-aggregation-why)
  + [Table design](bp-gsi-aggregation.md#bp-gsi-aggregation-table-design)
  + [Aggregation pipeline with Streams and AWS Lambda](bp-gsi-aggregation.md#bp-gsi-aggregation-pipeline)
  + [Sparse GSI design](bp-gsi-aggregation.md#bp-gsi-aggregation-sparse-gsi)
  + [Querying the GSI](bp-gsi-aggregation.md#bp-gsi-aggregation-querying)
  + [Considerations](bp-gsi-aggregation.md#bp-gsi-aggregation-considerations)
+ [Overloading Global Secondary Indexes in DynamoDB](bp-gsi-overloading.md)
+ [Using Global Secondary Index write sharding for selective table queries in DynamoDB](bp-indexes-gsi-sharding.md)
  + [Pattern design](bp-indexes-gsi-sharding.md#bp-indexes-gsi-sharding-pattern-design)
  + [Sharding strategy](bp-indexes-gsi-sharding.md#bp-indexes-gsi-sharding-strategy)
  + [Querying the sharded GSI](bp-indexes-gsi-sharding.md#bp-indexes-gsi-querying-the-sharded-GSI)
  + [Parallel query execution considerations](bp-indexes-gsi-sharding.md#bp-indexes-gsi-parallel-query-execution-considerations)
  + [Code example](bp-indexes-gsi-sharding.md#bp-indexes-gsi-code-example)
+ [Using Global Secondary Indexes to create an eventually consistent replica in DynamoDB](bp-indexes-gsi-replica.md)

# General guidelines for secondary indexes in DynamoDB
<a name="bp-indexes-general"></a>

Amazon DynamoDB supports two types of secondary indexes:
+ **Global secondary index (GSI)** — An index with a partition key and a sort key that can be different from those on the base table. A global secondary index is considered "global" because queries on the index can span all of the data in the base table, across all partitions. A global secondary index has no size limitations and has its own provisioned throughput settings for read and write activity that are separate from those of the table.
+ **Local secondary index (LSI)** — An index that has the same partition key as the base table, but a different sort key. A local secondary index is "local" in the sense that every partition of a local secondary index is scoped to a base table partition that has the same partition key value. As a result, the total size of indexed items for any one partition key value can't exceed 10 GB. Also, a local secondary index shares provisioned throughput settings for read and write activity with the table it is indexing.

Each table in DynamoDB can have up to 20 global secondary indexes (default quota) and 5 local secondary indexes. 

Global secondary indexes are often more useful than local secondary indexes. Determining which type of index to use will also depend on your application's requirements. For a comparison of global secondary indexes and local secondary indexes, and more information on how to choose between them, see [Improving data access with secondary indexes in DynamoDB](SecondaryIndexes.md). 

The following are some general principles and design patterns to keep in mind when creating indexes in DynamoDB:

**Topics**
+ [Use indexes efficiently](#bp-indexes-general-efficiency)
+ [Choose projections carefully](#bp-indexes-general-projections)
+ [Optimize frequent queries to avoid fetches](#bp-indexes-general-fetches)
+ [Be aware of item-collection size limits when creating local secondary indexes](#bp-indexes-general-expanding-collections)

## Use indexes efficiently
<a name="bp-indexes-general-efficiency"></a>

**Keep the number of indexes to a minimum.** Don't create secondary indexes on attributes that you don't query often. Indexes that are seldom used contribute to increased storage and I/O costs without improving application performance. 

## Choose projections carefully
<a name="bp-indexes-general-projections"></a>

Because secondary indexes consume storage and provisioned throughput, you should keep the size of the index as small as possible. Also, the smaller the index, the greater the performance advantage compared to querying the full table. If your queries usually return only a small subset of attributes, and the total size of those attributes is much smaller than the whole item, project only the attributes that you regularly request.

If you expect a lot of write activity on a table compared to reads, follow these best practices:
+ Consider projecting fewer attributes to minimize the size of items written to the index. However, this only applies if the size of projected attributes would otherwise be larger than a single write capacity unit (1 KB). For example, if the size of an index entry is only 200 bytes, DynamoDB rounds this up to 1 KB. In other words, as long as the index items are small, you can project more attributes at no extra cost.
+ Avoid projecting attributes that you know will rarely be needed in queries. Every time you update an attribute that is projected in an index, you incur the extra cost of updating the index as well. You can still retrieve non-projected attributes in a `Query` at a higher provisioned throughput cost, but the query cost may be significantly lower than the cost of updating the index frequently.
+ Specify `ALL` only if you want your queries to return the entire table item sorted by a different sort key. Projecting all attributes eliminates the need for table fetches, but in most cases, it doubles your costs for storage and write activity.

Balance the need to keep your indexes as small as possible against the need to keep fetches to a minimum, as explained in the next section.
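The 1 KB rounding described above can be made concrete with a small sketch (the helper name is illustrative; the sizes assume standard write units):

```python
import math

def index_write_units(projected_bytes: int) -> int:
    """Write units consumed when an index entry is written (rounded up to 1 KB)."""
    return max(1, math.ceil(projected_bytes / 1024))

# A 200-byte entry and a 1,000-byte entry cost the same single write unit,
# so projecting extra attributes is free until the entry crosses 1 KB:
print(index_write_units(200))   # 1
print(index_write_units(1000))  # 1
print(index_write_units(1500))  # 2
```

Estimating the projected entry size this way before choosing a projection helps you see whether trimming attributes would actually reduce write costs.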

## Optimize frequent queries to avoid fetches
<a name="bp-indexes-general-fetches"></a>

To get the fastest queries with the lowest possible latency, project all the attributes that you expect those queries to return. In particular, if you query a local secondary index for attributes that are not projected, DynamoDB automatically fetches those attributes from the table, which requires reading the entire item from the table. This introduces latency and additional I/O operations that you can avoid.

Keep in mind that "occasional" queries can often turn into "essential" queries. If there are attributes that you don't intend to project because you anticipate querying them only occasionally, consider whether circumstances might change and you might regret not projecting those attributes after all.

For more information about table fetches, see [Provisioned throughput considerations for Local Secondary Indexes](LSI.md#LSI.ThroughputConsiderations).

## Be aware of item-collection size limits when creating local secondary indexes
<a name="bp-indexes-general-expanding-collections"></a>

An *item collection* is all the items in a table and its local secondary indexes that have the same partition key. No item collection can exceed 10 GB, so it's possible to run out of space for a particular partition key value.

When you add or update a table item, DynamoDB updates all local secondary indexes that are affected. If the indexed attributes are defined in the table, the local secondary indexes grow too.

When you create a local secondary index, think about how much data will be written to it, and how many of those data items will have the same partition key value. If you expect that the sum of table and index items for a particular partition key value might exceed 10 GB, consider whether you should avoid creating the index.

If you can't avoid creating the local secondary index, you must anticipate the item collection size limit and take action before you exceed it. As a best practice, use the [ReturnItemCollectionMetrics](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/dynamodbv2/model/ReturnItemCollectionMetrics.html) parameter when writing items, and monitor and alert on item collection sizes that approach the 10 GB limit. Exceeding the maximum item collection size results in failed write attempts. By monitoring and alerting on item collection sizes, you can mitigate these issues before they impact your application.

**Note**  
Once created, you cannot delete a local secondary index.

For strategies on working within the limit and taking corrective action, see [Item collection size limit](LSI.md#LSI.ItemCollections.SizeLimit).

# Take advantage of sparse indexes
<a name="bp-indexes-general-sparse-indexes"></a>

For any item in a table, DynamoDB writes a corresponding index entry **only if the index key attributes are present in the item**. For a global secondary index, this means the index partition key must be defined on the item, and if the index also has a sort key, that attribute must be present too. If either key attribute is missing from an item, that item does not appear in the index. An index where only a subset of items from the base table appear is called a *sparse* index.

Sparse indexes are useful for queries over a small subsection of a table. For example, suppose that you have a table where you store all your customer orders, with the following key attributes:
+ Partition key: `CustomerId`
+ Sort key: `OrderId`

To track open orders, you can insert an attribute named `isOpen` in order items that have not already shipped. Then when the order ships, you can delete the attribute. If you then create an index on `CustomerId` (partition key) and `isOpen` (sort key), only those orders with `isOpen` defined appear in it. When you have thousands of orders of which only a small number are open, it's faster and less expensive to query that index for open orders than to scan the entire table.

Instead of using an attribute like `isOpen`, you could use an attribute whose value results in a useful sort order in the index. For example, you could use an `OrderOpenDate` attribute set to the date on which each order was placed, and then delete it after the order is fulfilled. That way, when you query the sparse index, the items are returned sorted by the date on which each order was placed.
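To illustrate which items land in the sparse index, here is a minimal simulation. A list of dicts stands in for the table; in DynamoDB the filtering happens automatically because items missing the index key attributes are simply not written to the index:

```python
orders = [
    {"CustomerId": "C1", "OrderId": "O1", "OrderOpenDate": "2024-05-01"},
    {"CustomerId": "C1", "OrderId": "O2"},  # shipped: attribute was deleted
    {"CustomerId": "C1", "OrderId": "O3", "OrderOpenDate": "2024-04-15"},
]

def sparse_index(items, sort_key_attr):
    """Only items that define the index sort key attribute appear, sorted by it."""
    indexed = [item for item in items if sort_key_attr in item]
    return sorted(indexed, key=lambda item: item[sort_key_attr])

open_orders = sparse_index(orders, "OrderOpenDate")
print([o["OrderId"] for o in open_orders])  # ['O3', 'O1'] — open orders, oldest first
```

Querying this small index for open orders touches only two items, while a table scan would read every order ever placed.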

## Examples of sparse indexes in DynamoDB
<a name="bp-indexes-sparse-examples"></a>

Global secondary indexes are sparse by default. When you create a global secondary index, you specify a partition key and optionally a sort key. Only items in the base table that contain the required key attributes appear in the index. If an item is missing the index partition key—or the sort key, when one is defined—that item is excluded from the index.

By designing a global secondary index to be sparse, you can provision it with lower write throughput than that of the base table, while still achieving excellent performance.

For example, a gaming application might track all scores of every user, but generally only needs to query a few high scores. The following design handles this scenario efficiently:

![\[Sparse GSI example.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/SparseIndex_A.png)


Here, Rick has played three games and achieved `Champ` status in one of them. Padma has played four games and achieved `Champ` status in two of them. Notice that the `Award` attribute is present only in items where the user achieved an award. The associated global secondary index looks like the following:

![\[Sparse GSI example.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/SparseIndex_B.png)


The global secondary index contains only the high scores that are frequently queried, which are a small subset of the items in the base table.

# Using Global Secondary Indexes for materialized aggregation queries in DynamoDB
<a name="bp-gsi-aggregation"></a>

Maintaining near real-time aggregations and key metrics on top of rapidly changing data is becoming increasingly valuable to businesses for making rapid decisions. For example, a music library might want to showcase its most downloaded songs in near-real time, or an e-commerce platform might need to display trending products by category.

Because DynamoDB doesn't natively support aggregation operations like `SUM` or `COUNT` across items, computing these values at read time would require scanning large numbers of items—which may be slow and expensive. Instead, you can *pre-compute* aggregations as data changes and store the results as regular items in your table. This pattern is called *materialized aggregation*.

**Topics**
+ [Example scenario and access patterns](#bp-gsi-aggregation-scenario)
+ [Why pre-compute aggregations](#bp-gsi-aggregation-why)
+ [Table design](#bp-gsi-aggregation-table-design)
+ [Aggregation pipeline with Streams and AWS Lambda](#bp-gsi-aggregation-pipeline)
+ [Sparse GSI design](#bp-gsi-aggregation-sparse-gsi)
+ [Querying the GSI](#bp-gsi-aggregation-querying)
+ [Considerations](#bp-gsi-aggregation-considerations)

## Example scenario and access patterns
<a name="bp-gsi-aggregation-scenario"></a>

Consider a music library application with the following requirements:
+ The application records individual song downloads at high volume (thousands per second).
+ Users need to see the most downloaded songs for a given month with single-digit millisecond latency.
+ The application also needs to support queries like "top 10 songs this month" and "all songs downloaded in a given month."

Computing download counts at read time by scanning all download records may be expensive at this scale. Instead, you can maintain a running count that updates as each download occurs, and store it in a way that supports efficient querying.

## Why pre-compute aggregations
<a name="bp-gsi-aggregation-why"></a>

There are several approaches to computing aggregations. The following table compares common alternatives and explains why materialized aggregation in DynamoDB is often the best fit for this type of use case.


| Approach | Tradeoffs | When to use | 
| --- | --- | --- | 
| Scan and count at read time | Requires reading all download records for every query. Latency grows with data volume and consumes significant read capacity. | Only suitable for very small datasets where latency isn't a concern. | 
| External aggregation store (for example, Amazon ElastiCache) | Adds operational complexity with a separate service to manage. Requires synchronization logic between DynamoDB and the cache. | When you need sub-millisecond reads or complex aggregation logic that goes beyond simple counts. | 
| Application-level aggregation on write | Couples the aggregation logic to the write path. If the application fails after recording the download but before updating the count, the aggregation becomes inconsistent. | When you need synchronous, strongly consistent aggregation and can tolerate added write latency. | 
| Materialized aggregation with Streams and Lambda | Decouples aggregation from the write path. Aggregation is eventually consistent (typically seconds behind). Adds Lambda invocation costs. | When you need near real-time aggregations with low read latency and can tolerate eventual consistency. This is the approach described on this page. | 

The materialized aggregation approach keeps the write path simple (just record the download), offloads the aggregation to an asynchronous process, and stores the result in DynamoDB where it can be queried with single-digit millisecond latency.

## Table design
<a name="bp-gsi-aggregation-table-design"></a>

This design uses a single table with two item types that share the same partition key (`songID`) but use different sort key patterns to distinguish between them:
+ **Download records** – Individual download events. The sort key is the `DownloadID` (a unique identifier for each download).
+ **Monthly aggregation items** – Pre-computed download counts per song per month. The sort key is the month in `YYYY-MM` format (for example, `2018-01`). These items also contain a `DownloadCount` attribute with the running total.

Only the monthly aggregation items contain the `Month` attribute. This distinction is important for the sparse GSI design described later.

The following diagram shows the table layout with both item types:

![\[Music library table layout showing download records and monthly aggregation items sharing the same partition key (songID).\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/AggregationQueries.png)



| Item type | Partition key (songID) | Sort key | Additional attributes | 
| --- | --- | --- | --- | 
| Download record | song1 | download-abc123 | UserID, Timestamp | 
| Monthly aggregation | song1 | 2018-01 | Month=2018-01, DownloadCount=1,746,992 | 

## Aggregation pipeline with Streams and AWS Lambda
<a name="bp-gsi-aggregation-pipeline"></a>

The aggregation pipeline works as follows:

1. When a song is downloaded, the application writes a new item to the table with `Partition-Key=songID` and `Sort-Key=DownloadID`.

1. DynamoDB Streams captures this write as a stream record.

1. A Lambda function, attached to the stream, processes the new record. It identifies the `songID` and the current month, then updates the corresponding monthly aggregation item by incrementing the `DownloadCount` attribute.

1. The updated aggregation item is then available for querying through the sparse GSI.

The Lambda function uses an `UpdateItem` call with an `ADD` expression to atomically increment the download count. This avoids read-modify-write race conditions:

```
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('MusicLibrary')

def handler(event, context):
    for record in event['Records']:
        if record['eventName'] != 'INSERT':
            continue
        new_image = record['dynamodb']['NewImage']
        # Skip stream records produced by the aggregation items themselves;
        # only download records carry a Timestamp attribute.
        if 'Timestamp' not in new_image:
            continue
        song_id = new_image['songID']['S']
        # Derive the month (YYYY-MM) from the download timestamp
        month = new_image['Timestamp']['S'][:7]

        table.update_item(
            Key={
                'songID': song_id,
                'SK': month  # the table's sort key attribute
            },
            UpdateExpression='ADD DownloadCount :inc SET #m = :month',
            ExpressionAttributeNames={
                '#m': 'Month'
            },
            ExpressionAttributeValues={
                ':inc': 1,
                ':month': month
            }
        )
```

**Note**  
If a Lambda execution fails after writing the updated aggregation value, the stream record may be retried. Because the `ADD` operation increments the count each time it runs, a retry would increment the count more than once for the same download, leaving you with an *approximate* value. For most analytics and leaderboard use cases, this small margin of error is acceptable. If you need exact counts, consider adding idempotency logic—for example, by using a condition expression that checks whether the specific `DownloadID` has already been processed.

## Sparse GSI design
<a name="bp-gsi-aggregation-sparse-gsi"></a>

To efficiently query the aggregated results, create a global secondary index with the following key schema:
+ **GSI partition key:** `Month` (String)
+ **GSI sort key:** `DownloadCount` (Number)

This GSI is *sparse* because only the monthly aggregation items contain the `Month` attribute. The individual download records don't have this attribute, so they are automatically excluded from the index. This means the GSI contains only the pre-computed aggregation items—a small fraction of the total items in the table.

A sparse GSI provides two key benefits:
+ **Lower cost** – Because only aggregation items are replicated to the index, you consume far less write capacity and storage compared to an index that includes every item in the table.
+ **Faster queries** – The index contains only the data you need to query, so reads are efficient and return results with single-digit millisecond latency.

For more information about how sparse indexes work, see [Take advantage of sparse indexes](bp-indexes-general-sparse-indexes.md).

## Querying the GSI
<a name="bp-gsi-aggregation-querying"></a>

With the sparse GSI in place, you can efficiently answer several types of queries:

**Get the most downloaded song for a given month:**

```
aws dynamodb query \
    --table-name "MusicLibrary" \
    --index-name "MonthDownloadsIndex" \
    --key-condition-expression "#m = :month" \
    --expression-attribute-names '{"#m": "Month"}' \
    --expression-attribute-values '{":month": {"S": "2018-01"}}' \
    --scan-index-forward false \
    --limit 1
```

Setting `ScanIndexForward` to `false` sorts results by `DownloadCount` in descending order, and `Limit=1` returns only the top song.

**Get the top 10 songs for a given month:**

```
aws dynamodb query \
    --table-name "MusicLibrary" \
    --index-name "MonthDownloadsIndex" \
    --key-condition-expression "#m = :month" \
    --expression-attribute-names '{"#m": "Month"}' \
    --expression-attribute-values '{":month": {"S": "2018-01"}}' \
    --scan-index-forward false \
    --limit 10
```

**Get all songs downloaded in a given month** (sorted by download count):

```
aws dynamodb query \
    --table-name "MusicLibrary" \
    --index-name "MonthDownloadsIndex" \
    --key-condition-expression "#m = :month" \
    --expression-attribute-names '{"#m": "Month"}' \
    --expression-attribute-values '{":month": {"S": "2018-01"}}' \
    --scan-index-forward false
```

## Considerations
<a name="bp-gsi-aggregation-considerations"></a>

Keep the following in mind when implementing this pattern:
+ **Eventual consistency** – The aggregation values are updated asynchronously through DynamoDB Streams and Lambda. There is typically a delay of a few seconds between a download being recorded and the aggregation being updated. This means the GSI reflects near real-time data, not real-time data.
+ **Lambda concurrency** – If your table has a high write volume, multiple Lambda invocations may attempt to update the same aggregation item concurrently. The atomic `ADD` operation handles this safely, but you should monitor Lambda concurrency and throttling metrics to ensure your function can keep up with the stream.
+ **GSI write capacity** – Because the sparse GSI only contains aggregation items, it requires significantly less write capacity than the base table. However, you should still provision enough capacity (or use on-demand mode) to handle the rate of aggregation updates.
+ **Approximate counts** – As noted earlier, Lambda retries can cause counts to be slightly over-counted. For use cases that require exact counts, implement idempotency checks in the Lambda function.

# Overloading Global Secondary Indexes in DynamoDB
<a name="bp-gsi-overloading"></a>

Although Amazon DynamoDB has a default quota of 20 global secondary indexes per table, in practice, you can index across far more than 20 data fields. As opposed to a table in a relational database management system (RDBMS), in which the schema is uniform, a table in DynamoDB can hold many different kinds of data items at one time. In addition, the same attribute in different items can contain entirely different kinds of information.

Consider the following example of a DynamoDB table layout that saves a variety of different kinds of data.

![\[Table schema for GSI Overloading.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/OverloadGSIexample.png)


The `Data` attribute, which is common to all the items, has different content depending on its parent item. If you create a global secondary index for the table that uses the table's sort key as its partition key and the `Data` attribute as its sort key, you can make a variety of different queries using that single global secondary index. These queries might include the following:
+ Look up an employee by name in the global secondary index, using `Employee_Name` as the partition key value and the employee's name (for example `Murphy, John`) as the sort key value.
+ Use the global secondary index to find all employees working in a particular warehouse by searching on a warehouse ID (such as `Warehouse_01`).
+ Get a list of recent hires, querying the global secondary index on `HR_confidential` as a partition key value and using a range of dates in the sort key value.

# Using Global Secondary Index write sharding for selective table queries in DynamoDB
<a name="bp-indexes-gsi-sharding"></a>

When you need to query recent data within a specific time window, DynamoDB's requirement of providing a partition key for most read operations can present a challenge. To address this scenario, you can implement an effective query pattern using a combination of write sharding and a Global Secondary Index (GSI).

This approach allows you to efficiently retrieve and analyze time-sensitive data without performing full table scans, which can be resource-intensive and costly. By strategically designing your table structure and indexing, you can create a flexible solution that supports time-based data retrieval while maintaining optimal performance.

**Topics**
+ [Pattern design](#bp-indexes-gsi-sharding-pattern-design)
+ [Sharding strategy](#bp-indexes-gsi-sharding-strategy)
+ [Querying the sharded GSI](#bp-indexes-gsi-querying-the-sharded-GSI)
+ [Parallel query execution considerations](#bp-indexes-gsi-parallel-query-execution-considerations)
+ [Code example](#bp-indexes-gsi-code-example)

## Pattern design
<a name="bp-indexes-gsi-sharding-pattern-design"></a>

When working with DynamoDB, you can overcome time-based data retrieval challenges by implementing a sophisticated pattern that combines write sharding and Global Secondary Indexes to enable flexible, efficient querying across recent data windows.

**Structure of the table**
+ Partition Key (PK): "Username"

**Structure of the GSI**
+ GSI Partition Key (PK_GSI): "ShardNumber#\<n>" (e.g., "ShardNumber#3")
+ GSI Sort Key (SK_GSI): ISO 8601 timestamp (e.g., "2030-04-01T12:00:00Z")

![\[Pattern designs for time-series data.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/BestPractices-44-TimeBoundedTable-2.png)


## Sharding strategy
<a name="bp-indexes-gsi-sharding-strategy"></a>

Assuming you decide to use 10 shards, your shard numbers could range from 0 to 9. When logging an activity, you would calculate the shard number (for example, by using a hash function on the user ID and then taking the modulus of the number of shards) and prepend it to the GSI partition key. This method distributes the entries across different shards, mitigating the risk of hot partitions.
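The shard calculation can be sketched in Python as follows. The use of SHA-256 is an illustrative choice; any stable, uniformly distributed hash works:

```
import hashlib

NUM_SHARDS = 10  # matches the example above

def shard_key(user_id, num_shards=NUM_SHARDS):
    """Derive a stable GSI partition key of the form 'ShardNumber#<n>'.

    A cryptographic hash keeps the assignment uniform and consistent
    across processes (Python's built-in hash() is randomized per run).
    """
    digest = hashlib.sha256(user_id.encode('utf-8')).hexdigest()
    shard = int(digest, 16) % num_shards
    return f'ShardNumber#{shard}'
```

At write time, you would set the GSI partition key attribute of each activity item to `shard_key(user_id)` so that the same user always lands in the same shard.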

## Querying the sharded GSI
<a name="bp-indexes-gsi-querying-the-sharded-GSI"></a>

Querying across all shards for items within a particular time range in a DynamoDB table, where data is sharded across multiple partition keys, requires a different approach than querying a single partition. Since DynamoDB queries are limited to a single partition key at a time, you can't directly query across multiple shards with a single query operation. However, you can achieve the desired result through application-level logic by performing multiple queries, each targeting a specific shard, and then aggregating the results. The procedure below explains how to do this. 

**To query and aggregate shards**

1. Identify the range of shard numbers used in your sharding strategy. For instance, if you have 10 shards, your shard numbers would range from 0-9.

1. For each shard, construct and execute a query to fetch items within the desired time range. These queries can be executed in parallel to improve efficiency. Use the partition key with the shard number and the sort key with your time range for these queries. Here's an example query for a single shard:

   ```
   aws dynamodb query \
       --table-name "YourTableName" \
       --index-name "YourIndexName" \
       --key-condition-expression "PK_GSI = :pk_val AND SK_GSI BETWEEN :start_date AND :end_date" \
       --expression-attribute-values '{
           ":pk_val": {"S": "ShardNumber#0"},
           ":start_date": {"S": "2024-04-01"},
           ":end_date": {"S": "2024-04-30"}
       }'
   ```  
![\[Query for single shard example.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/BestPractices-44-single-shard-example.png)

   You would replicate this query for each shard, adjusting the partition key accordingly (e.g., "ShardNumber#1", "ShardNumber#2", ..., "ShardNumber#9").

1. Aggregate the results from each query after all queries are complete. Perform this aggregation in your application code, combining the results into a single dataset that represents the items from all shards within your specified time range.

## Parallel query execution considerations
<a name="bp-indexes-gsi-parallel-query-execution-considerations"></a>

Each query consumes read capacity from your table or index. If you're using provisioned throughput, ensure that your table is provisioned with enough capacity to handle the burst of parallel queries. If you're using on-demand capacity, be mindful of the potential cost implications.

## Code example
<a name="bp-indexes-gsi-code-example"></a>

To execute parallel queries across shards in DynamoDB using Python, you can use the boto3 library, which is the Amazon Web Services SDK for Python. This example assumes you have boto3 installed and configured with appropriate AWS credentials.

The following Python code demonstrates how to perform parallel queries across multiple shards for a given time range. It uses `concurrent.futures` to execute the queries in parallel, reducing the overall execution time compared to sequential execution.

```
import boto3
from concurrent.futures import ThreadPoolExecutor, as_completed

# Initialize a DynamoDB client
dynamodb = boto3.client('dynamodb')

# Define your table name and the total number of shards
table_name = 'YourTableName'
total_shards = 10  # Example: 10 shards numbered 0 to 9
time_start = "2030-03-15T09:00:00Z"
time_end = "2030-03-15T10:00:00Z"

def query_shard(shard_number):
    """
    Query items in a specific shard for the given time range.
    """
    response = dynamodb.query(
        TableName=table_name,
        IndexName='YourGSIName',  # Replace with your GSI name
        KeyConditionExpression="PK_GSI = :pk_val AND SK_GSI BETWEEN :date_start AND :date_end",
        ExpressionAttributeValues={
            ":pk_val": {"S": f"ShardNumber#{shard_number}"},
            ":date_start": {"S": time_start},
            ":date_end": {"S": time_end},
        }
    )
    return response['Items']

# Use ThreadPoolExecutor to query across shards in parallel
with ThreadPoolExecutor(max_workers=total_shards) as executor:
    # Submit a future for each shard query
    futures = {executor.submit(query_shard, shard_number): shard_number for shard_number in range(total_shards)}
    
    # Collect and aggregate results from all shards
    all_items = []
    for future in as_completed(futures):
        shard_number = futures[future]
        try:
            shard_items = future.result()
            all_items.extend(shard_items)
            print(f"Shard {shard_number} returned {len(shard_items)} items")
        except Exception as exc:
            print(f"Shard {shard_number} generated an exception: {exc}")

# Process the aggregated results (e.g., sorting, filtering) as needed
# For example, simply printing the count of all retrieved items
print(f"Total items retrieved from all shards: {len(all_items)}")
```

Before running this code, make sure to replace `YourTableName` and `YourGSIName` with the actual table and GSI names from your DynamoDB setup. Also, adjust `total_shards`, `time_start`, and `time_end` variables according to your specific requirements.

This script queries each shard for items within the specified time range and aggregates the results.

# Using Global Secondary Indexes to create an eventually consistent replica in DynamoDB
<a name="bp-indexes-gsi-replica"></a>

You can use a global secondary index to create an eventually consistent replica of a table. Creating a replica can allow you to do the following:
+ **Set different provisioned read capacity for different readers.** For example, suppose that you have two applications: One application handles high-priority queries and needs the highest levels of read performance, whereas the other handles low-priority queries that can tolerate throttling of read activity.

  If both of these applications read from the same table, a heavy read load from the low-priority application could consume all the available read capacity for the table. This would throttle the high-priority application's read activity.

  Instead, you can create a replica through a global secondary index, setting its read capacity separately from that of the table itself. You can then have your low-priority app query the replica instead of the table.
+ **Eliminate reads from a table entirely.** For example, you might have an application that captures a high volume of clickstream activity from a website, and you don't want to risk having reads interfere with that. You can isolate this table and prevent reads by other applications (see [Using IAM policy conditions for fine-grained access control](specifying-conditions.md)), while letting other applications read a replica created using a global secondary index.

To create a replica, set up a global secondary index that has the same key schema as the parent table, with some or all of the non-key attributes projected into it. In applications, you can direct some or all read activity to this global secondary index rather than to the parent table. You can then adjust the provisioned read capacity of the global secondary index to handle those reads without changing the parent table's provisioned read capacity.

There is always a short propagation delay between a write to the parent table and the time when the written data appears in the index. In other words, your applications should take into account that the global secondary index replica is only *eventually consistent* with the parent table.

You can create multiple global secondary index replicas to support different read patterns. When you create the replicas, project only the attributes that each read pattern actually requires. An application can then consume less provisioned read capacity to obtain only the data it needs rather than having to read the item from the parent table. This optimization can result in significant cost savings over time.

# Best practices for storing large items and attributes in DynamoDB
<a name="bp-use-s3-too"></a>

Amazon DynamoDB limits the size of each item that you store in a table to 400 KB (see [Item size](Constraints.md#limits-items-size)). If your application needs to store more data in an item than the DynamoDB size limit permits, you can try compressing one or more large attributes or breaking the item into multiple items (efficiently indexed by sort keys). You can also store the item as an object in Amazon Simple Storage Service (Amazon S3) and store the Amazon S3 object identifier in your DynamoDB item.

As a best practice, you should use the [ReturnConsumedCapacity](https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/dynamodbv2/model/ReturnConsumedCapacity.html) parameter when writing items, so that you can monitor and alert on item sizes that approach the 400 KB maximum item size. Exceeding the maximum item size causes the write attempt to fail, and DynamoDB returns a [ValidationException error](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.Errors.html). Monitoring and alerting on item sizes enables you to mitigate item size issues before they impact your application.

## Compressing large attribute values
<a name="bp-use-s3-too-or-compress"></a>

Compressing large attribute values can let them fit within item limits in DynamoDB and reduce your storage costs. Compression algorithms such as GZIP or LZO produce binary output that you can then store in a `Binary` attribute type within the item.

As an example, consider a table that stores messages written by forum users. Such messages often contain long strings of text, which are candidates for compression. While compression can reduce item sizes, the downside is that the compressed attribute values are not useful for filtering.

For sample code that demonstrates how to compress such messages in DynamoDB, see the following:
+ [Example: Handling binary type attributes using the AWS SDK for Java document API](JavaDocumentAPIBinaryTypeExample.md)
+ [Example: Handling binary type attributes using the AWS SDK for .NET low-level API](LowLevelDotNetBinaryTypeExample.md)

## Vertical partitioning
<a name="bp-use-s3-too-vertical-partitioning"></a>

An alternative solution for dealing with large items is to break them into smaller chunks of data and associate the related items through a shared partition key value. You can then use a sort key string to identify the associated information stored alongside it. Grouping multiple items under the same partition key value in this way creates an [*item collection*](WorkingWithItemCollections.md).

For more information on this approach, see:
+ [Use vertical partitioning to scale data efficiently in Amazon DynamoDB](https://aws.amazon.com/blogs/database/use-vertical-partitioning-to-scale-data-efficiently-in-amazon-dynamodb/) 
+ [Implement vertical partitioning in Amazon DynamoDB using AWS Glue](https://aws.amazon.com/blogs/database/implement-vertical-partitioning-in-amazon-dynamodb-using-aws-glue/) 

## Storing large attribute values in Amazon S3
<a name="bp-use-s3-too-large-values"></a>

As mentioned previously, you can also use Amazon S3 to store large attribute values that cannot fit in a DynamoDB item. You can store them as an object in Amazon S3 and then store the object identifier in your DynamoDB item.

You can also use the object metadata support in Amazon S3 to provide a link back to the parent item in DynamoDB. Store the primary key value of the item as Amazon S3 metadata of the object in Amazon S3. Doing this often helps with maintenance of the Amazon S3 objects.

For example, consider the `ProductCatalog` table. Items in this table store information about item price, description, book authors, and dimensions for other products. If you wanted to store an image of each product that was too large to fit in an item, you could store the images in Amazon S3 instead of in DynamoDB.

When implementing this strategy, keep the following in mind:
+ DynamoDB doesn't support transactions that cross Amazon S3 and DynamoDB. Therefore, your application must deal with any failures, which could include cleaning up orphaned Amazon S3 objects.
+ Amazon S3 limits the length of object identifiers. So you must organize your data in a way that doesn't generate excessively long object identifiers or violate other Amazon S3 constraints.

For more information about how to use Amazon S3, see the [Amazon Simple Storage Service User Guide](https://docs.aws.amazon.com/AmazonS3/latest/userguide/).

# Best practices for handling time series data in DynamoDB
<a name="bp-time-series"></a>

General design principles in Amazon DynamoDB recommend that you keep the number of tables you use to a minimum. For most applications, a single table is all you need. However, time series data is often best handled with one table per application per period.

## Design pattern for time series data
<a name="bp-time-series-pattern"></a>

Consider a typical time series scenario, where you want to track a high volume of events. Your write access pattern is that all the events being recorded have today's date. Your read access pattern might be to read today's events most frequently, yesterday's events much less frequently, and then older events very little at all. One way to handle this is by building the current date and time into the primary key.

The following design pattern often handles this kind of scenario effectively:
+ Create one table per period, provisioned with the required read and write capacity and the required indexes.
+ Before the end of each period, prebuild the table for the next period. Just as the current period ends, direct event traffic to the new table. You can assign names to these tables that specify the periods they have recorded.
+ As soon as a table is no longer being written to, reduce its provisioned write capacity to a lower value (for example, 1 WCU), and provision whatever read capacity is appropriate. Reduce the provisioned read capacity of earlier tables as they age. You might choose to archive or delete the tables whose contents are rarely or never needed.

The idea is to allocate the required resources for the current period, which will experience the highest volume of traffic, and to scale down provisioning for older tables that are not used actively, thereby saving costs. Depending on your business needs, you might consider write sharding to distribute traffic evenly across logical partition keys. For more information, see [Using write sharding to distribute workloads evenly in your DynamoDB table](bp-partition-key-sharding.md).

## Time series table examples
<a name="bp-time-series-examples"></a>

The following is a time series data example in which the current table is provisioned at a higher read/write capacity and the older tables are scaled down because they are accessed infrequently.

![\[Table schema for high-volume time-series data.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/TimeSeries.png)


# Best practices for managing many-to-many relationships in DynamoDB tables
<a name="bp-adjacency-graphs"></a>

Adjacency lists are a design pattern that is useful for modeling many-to-many relationships in Amazon DynamoDB. More generally, they provide a way to represent graph data (nodes and edges) in DynamoDB.

## Adjacency list design pattern
<a name="bp-adjacency-lists"></a>

When different entities of an application have a many-to-many relationship between them, the relationship can be modeled as an adjacency list. In this pattern, all top-level entities (synonymous to nodes in the graph model) are represented using the partition key. Any relationships with other entities (edges in a graph) are represented as an item within the partition by setting the value of the sort key to the target entity ID (target node).

The advantages of this pattern include minimal data duplication and simplified query patterns to find all entities (nodes) related to a target entity (having an edge to a target node).

A real-world example where this pattern has been useful is an invoicing system where invoices contain multiple bills. One bill can belong to multiple invoices. The partition key in this example is either an `InvoiceID` or a `BillID`. `BillID` partitions have all attributes specific to bills. `InvoiceID` partitions have an item storing invoice-specific attributes, and an item for each `BillID` that rolls up to the invoice.

The schema looks like the following.

![\[Table schema for billing adjacency-list example.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/AdjacencyLists_01.png)


Using the preceding schema, you can see that all bills for an invoice can be queried using the primary key on the table. To look up all invoices that contain a particular bill, create a global secondary index on the table's sort key. 
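
Both access patterns can be modeled in memory to see how the keys work. The sketch below simulates a base-table `Query` by partition key and a GSI `Query` keyed on the sort key; the item attributes (`PK`, `SK`, `Amount`, `Detail`) and ID values are illustrative, not taken from the schema diagram.

```python
# In-memory model of the invoice/bill adjacency list.
items = [
    {"PK": "Invoice-92551", "SK": "Invoice-92551", "DueDate": "2025-06-18"},
    {"PK": "Invoice-92551", "SK": "Bill-4224663", "Amount": "$900"},
    {"PK": "Invoice-92551", "SK": "Bill-4224687", "Amount": "$300"},
    {"PK": "Bill-4224663", "SK": "Bill-4224663", "Detail": "Steel shipment"},
]

def query_partition(pk: str) -> list:
    # Like Query on the base table: every item sharing a partition key,
    # i.e., an invoice plus all of its rolled-up bills.
    return [i for i in items if i["PK"] == pk]

def query_gsi(sk: str) -> list:
    # Like Query on a GSI whose partition key is the table's sort key:
    # every partition (invoice or bill) that references this bill.
    return [i for i in items if i["SK"] == sk]
```

Querying the GSI for a `BillID` returns both the bill's own partition and every invoice partition that includes it.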

The projections for the global secondary index look like the following.

![\[GSI projection for billing adjacency-list example.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/AdjacencyLists_02.png)


## Materialized graph pattern
<a name="bp-graph-pattern"></a>

Many applications are built around understanding rankings across peers, common relationships between entities, neighbor entity state, and other types of graph style workflows. For these types of applications, consider the following schema design pattern.

![\[Graph example number 1.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/1513869910203-418.png)


![\[Graph example number 2.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/1513852802235-256.png)


![\[Graph example number 3.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/1513852905360-671.png)


The preceding schema shows a graph data structure that is defined by a set of data partitions containing the items that define the edges and nodes of the graph. Edge items contain a `Target` and a `Type` attribute. These attributes are used as part of a composite key named `TypeTarget` to identify the item in a partition in the primary table or in a second global secondary index.

The first global secondary index is built on the `Data` attribute. This attribute uses global secondary index-overloading as described earlier to index several different attribute types, namely `Dates`, `Names`, `Places`, and `Skills`. Here, one global secondary index is effectively indexing four different attributes.

As you insert items into the table, you can use an intelligent sharding strategy to distribute item sets with large aggregations (birthdate, skill) across as many logical partitions on the global secondary indexes as are needed to avoid hot read/write problems.

The result of this combination of design patterns is a solid datastore for highly efficient real-time graph workflows. These workflows can provide high-performance neighbor entity state and edge aggregation queries for recommendation engines, social-networking applications, node rankings, subtree aggregations, and other common graph use cases.

If your use case isn't sensitive to real-time data consistency, you can use a scheduled Amazon EMR process to populate edges with relevant graph summary aggregations for your workflows. If your application doesn't need to know immediately when an edge is added to the graph, you can use a scheduled process to aggregate results.

To maintain some level of consistency, the design could include Amazon DynamoDB Streams and AWS Lambda to process edge updates. It could also use an Amazon EMR job to validate results on a regular interval. This approach is illustrated by the following diagram. It is commonly used in social networking applications, where the cost of a real-time query is high and the need to immediately know individual user updates is low.

![\[Diagram illustrating graph workflow.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/1513856345673-336.png)


IT service-management (ITSM) and security applications generally need to respond in real time to entity state changes composed of complex edge aggregations. Such applications need a system that can support real-time multiple node aggregations of second- and third-level relationships, or complex edge traversals. If your use case requires these types of real-time graph query workflows, we recommend that you consider using [Amazon Neptune](https://docs.aws.amazon.com/neptune/latest/userguide/) to manage these workflows.

**Note**  
If you need to query highly connected datasets or execute queries that need to traverse multiple nodes (also known as multi-hop queries) with millisecond latency, you should consider using [Amazon Neptune](https://docs.aws.amazon.com/neptune/latest/userguide/). Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with millisecond latency.

# Best practices for querying and scanning data in DynamoDB
<a name="bp-query-scan"></a>

This section covers some best practices for using `Query` and `Scan` operations in Amazon DynamoDB.

## Performance considerations for scans
<a name="bp-query-scan-performance"></a>

In general, `Scan` operations are less efficient than other operations in DynamoDB. A `Scan` operation always scans the entire table or secondary index. It then filters out values to provide the result you want, essentially adding the extra step of removing data from the result set.

If possible, you should avoid using a `Scan` operation on a large table or index with a filter that removes many results. Also, as a table or index grows, the `Scan` operation slows. The `Scan` operation examines every item for the requested values and can use up the provisioned throughput for a large table or index in a single operation. For faster response times, design your tables and indexes so that your applications can use `Query` instead of `Scan`. (For tables, you can also consider using the `GetItem` and `BatchGetItem` APIs.)

Alternatively, you can design your application to use `Scan` operations in a way that minimizes the impact on your request rate. This can include modeling when it might be more efficient to use a global secondary index instead of a `Scan` operation. Further information on this process is in the following video. 

[![AWS Videos](http://img.youtube.com/vi/LM84N-E_b_M/0.jpg)](http://www.youtube.com/watch?v=LM84N-E_b_M)


## Avoiding sudden spikes in read activity
<a name="bp-query-scan-spikes"></a>

When you create a table, you set its read and write capacity unit requirements. For reads, the capacity units are expressed as the number of strongly consistent 4 KB data read requests per second. For eventually consistent reads, a read capacity unit is two 4 KB read requests per second. A `Scan` operation performs eventually consistent reads by default, and it can return up to 1 MB (one page) of data. Therefore, a single `Scan` request can consume (1 MB page size / 4 KB item size) / 2 (eventually consistent reads) = 128 read operations. If you request strongly consistent reads instead, the `Scan` operation would consume twice as much provisioned throughput—256 read operations.
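
The arithmetic in the preceding paragraph can be captured in a small helper that estimates the read capacity a single page consumes. This is a simplified sketch of the pricing rule stated above (one RCU per strongly consistent 4 KB read; half that for eventually consistent reads), not an official SDK calculation.

```python
import math

def scan_page_rcus(page_kb: int = 1024, strongly_consistent: bool = False) -> int:
    # Each read unit covers up to 4 KB. An eventually consistent read
    # costs half as much as a strongly consistent one.
    units = math.ceil(page_kb / 4)
    return units if strongly_consistent else units // 2
```

For a full 1 MB page, this yields 128 eventually consistent or 256 strongly consistent read operations, matching the calculation above. It also reproduces the smaller-page-size example later in this section: a 160 KB page (40 items of 4 KB each) costs 40 strongly consistent or 20 eventually consistent read operations.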

This represents a sudden spike in usage, compared to the configured read capacity for the table. This usage of capacity units by a scan prevents other potentially more important requests for the same table from using the available capacity units. As a result, you likely get a `ProvisionedThroughputExceeded` exception for those requests.

The problem is not just the sudden increase in capacity units that the `Scan` uses. The scan is also likely to consume all of its capacity units from the same partition because the scan requests read items that are next to each other on the partition. This means that the request is hitting the same partition, causing all of its capacity units to be consumed, and throttling other requests to that partition. If the request to read data is spread across multiple partitions, the operation would not throttle a specific partition. 

The following diagram illustrates the impact of a sudden spike of capacity unit usage by `Query` and `Scan` operations, and its impact on your other requests against the same table.

![\[4 different scenarios showing provisioned throughput intervals, requests, and good and bad results on a table.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/ThroughputIntervals.png)


As illustrated here, the usage spike can impact the table's provisioned throughput in several ways:

1. Good: Even distribution of requests and size

1. Not as good: Frequent requests in bursts

1. Bad: A few random large requests

1. Bad: Large scan operations

Instead of using a large `Scan` operation, you can use the following techniques to minimize the impact of a scan on a table's provisioned throughput.
+ **Reduce page size**

  Because a `Scan` operation reads an entire page (by default, 1 MB), you can reduce the impact of the scan operation by setting a smaller page size. The `Scan` operation provides a *Limit* parameter that you can use to set the page size for your request. Each `Query` or `Scan` request that has a smaller page size uses fewer read operations and creates a "pause" between each request. For example, suppose that each item is 4 KB and you set the page size to 40 items. A `Query` request would then consume only 20 eventually consistent read operations or 40 strongly consistent read operations. A larger number of smaller `Query` or `Scan` operations would allow your other critical requests to succeed without throttling. 
+ **Isolate scan operations**

  DynamoDB is designed for easy scalability. As a result, an application can create tables for distinct purposes, possibly even duplicating content across several tables. You want to perform scans on a table that is not taking "mission-critical" traffic. Some applications handle this load by rotating traffic hourly between two tables—one for critical traffic, and one for bookkeeping. Other applications can do this by performing every write on two tables: a "mission-critical" table, and a "shadow" table. 

Configure your application to retry any request that receives a response code that indicates you have exceeded your provisioned throughput. Or, increase the provisioned throughput for your table using the `UpdateTable` operation. If you have temporary spikes in your workload that cause your throughput to occasionally exceed the provisioned level, retry the request with exponential backoff. For more information about implementing exponential backoff, see [Error retries and exponential backoff](Programming.Errors.md#Programming.Errors.RetryAndBackoff).
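
A minimal sketch of such a retry loop is shown below. The `ThrottledError` class stands in for the throttling error your SDK surfaces (for example, `ProvisionedThroughputExceededException`); the delay values are illustrative and the "full jitter" variant is one common choice, similar to what the AWS SDKs implement internally.

```python
import random
import time

class ThrottledError(Exception):
    # Stand-in for an SDK throttling error such as
    # ProvisionedThroughputExceededException.
    pass

def call_with_backoff(operation, max_attempts=5, base_delay=0.05, cap=2.0):
    # Retry throttled requests with jittered exponential backoff.
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; let the caller handle it
            delay = min(cap, base_delay * (2 ** attempt))
            time.sleep(random.uniform(0, delay))  # full jitter
```

Wrapping each `Query` or `Scan` call in `call_with_backoff` spreads retries out over time instead of hammering an already-throttled partition.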

## Taking advantage of parallel scans
<a name="bp-query-scan-parallel"></a>

Many applications can benefit from using parallel `Scan` operations rather than sequential scans. For example, an application that processes a large table of historical data can perform a parallel scan much faster than a sequential one. Multiple worker threads in a background "sweeper" process could scan a table at a low priority without affecting production traffic. In each of these examples, a parallel `Scan` is used in such a way that it does not starve other applications of provisioned throughput resources.

Although parallel scans can be beneficial, they can place a heavy demand on provisioned throughput. With a parallel scan, your application has multiple workers that are all running `Scan` operations concurrently. This can quickly consume all of your table's provisioned read capacity. In that case, other applications that need to access the table might be throttled.

A parallel scan can be the right choice if the following conditions are met:
+ The table size is 20 GB or larger.
+ The table's provisioned read throughput is not being fully used.
+ Sequential `Scan` operations are too slow.
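
The worker model described above can be sketched without touching a live table. In this simulation, `scan_segment` stands in for a `Scan` call with `Segment` and `TotalSegments` parameters; hashing the key to pick a segment is only an approximation of how DynamoDB partitions segments, but it preserves the key property that each item belongs to exactly one segment.

```python
from concurrent.futures import ThreadPoolExecutor

def scan_segment(all_items, segment, total_segments):
    # Stand-in for Scan(Segment=segment, TotalSegments=total_segments).
    # Each item is assigned to exactly one segment.
    return [i for i in all_items if hash(i["pk"]) % total_segments == segment]

def parallel_scan(all_items, total_segments=4):
    # One worker per segment, all scanning concurrently.
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [pool.submit(scan_segment, all_items, s, total_segments)
                   for s in range(total_segments)]
        results = []
        for f in futures:
            results.extend(f.result())
    return results
```

Because segments are disjoint and cover the whole table, the union of all workers' results equals a full sequential scan.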

### Choosing TotalSegments
<a name="bp-query-scan-parallel-total-segments"></a>

The best setting for `TotalSegments` depends on your specific data, the table's provisioned throughput settings, and your performance requirements. You might need to experiment to get it right. We recommend that you begin with a simple ratio, such as one segment per 2 GB of data. For example, for a 30 GB table, you could set `TotalSegments` to 15 (30 GB / 2 GB). Your application would then use 15 workers, with each worker scanning a different segment.
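
The starting-point ratio can be written as a one-line heuristic. The 2 GB-per-segment figure is just the rule of thumb from this section; treat the result as a first guess to tune experimentally.

```python
import math

def total_segments_for(table_size_gb: float, gb_per_segment: float = 2.0,
                       max_segments: int = 1_000_000) -> int:
    # One segment per ~2 GB of data, clamped to DynamoDB's valid
    # range of 1 to 1,000,000 segments.
    return max(1, min(max_segments, math.ceil(table_size_gb / gb_per_segment)))
```

For the 30 GB example above, this returns 15 segments, one per worker.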

You can also choose a value for `TotalSegments` that is based on client resources. You can set `TotalSegments` to any number from 1 to 1000000, and DynamoDB lets you scan that number of segments. For example, if your client limits the number of threads that can run concurrently, you can gradually increase `TotalSegments` until you get the best `Scan` performance with your application.

Monitor your parallel scans to optimize your provisioned throughput use, while also making sure that your other applications aren't starved of resources. Increase the value for `TotalSegments` if you don't consume all of your provisioned throughput but still experience throttling in your `Scan` requests. Reduce the value for `TotalSegments` if the `Scan` requests consume more provisioned throughput than you want to use. 

# Best practices for DynamoDB table design
<a name="bp-table-design"></a>

General design principles in Amazon DynamoDB recommend that you keep the number of tables you use to a minimum. In the majority of cases, we recommend that you consider using a single table. However, if a single table or a small number of tables is not viable, the following guidelines may be of use.
+ The per-account limit cannot be increased above 10,000 tables. If your application requires more tables, plan to distribute the tables across multiple accounts. For more information, see [Service, account, and table quotas in Amazon DynamoDB](ServiceQuotas.html#limits-tables). 
+ Consider control plane limits for concurrent control plane operations that might impact your table management.
+ Work with AWS solution architects to validate your design patterns for multi-tenant designs.

# Using DynamoDB global tables
<a name="bp-global-table-design"></a>

Global tables build on Amazon DynamoDB’s global footprint to provide you with a fully managed, multi-Region, and multi-active database that can deliver fast and local, read and write performance for massively scaled, global applications. Global tables replicate your DynamoDB tables automatically across your choice of AWS Regions. No application changes are required because global tables use existing DynamoDB APIs. There are no upfront costs or commitments for using global tables, and you pay only for the resources you use.

This guide explains how to use DynamoDB global tables effectively. It provides key facts about global tables, explains the feature’s primary use cases, describes the two consistency modes, introduces a taxonomy of three different write models you should consider, walks through the four main request routing choices you might implement, discusses ways to evacuate a Region that’s live or a Region that’s offline, explains how to think about throughput capacity planning, and provides a checklist of things to consider when you deploy global tables.

This guide fits into a larger context of AWS multi-Region deployments, as covered in the [AWS Multi-Region Fundamentals](https://docs.aws.amazon.com/prescriptive-guidance/latest/aws-multi-region-fundamentals/introduction.html) whitepaper and the [Data resiliency design patterns with AWS](https://www.youtube.com/watch?v=7IA48SOX20c) video.

**Topics**
+ [Key facts about DynamoDB global table design](#bp-global-table-design.prescriptive-guidance.facts)
+ [Key facts about MREC](#bp-global-table-design-MREC-facts)
+ [Key facts about MRSC](#bp-global-table-design-MRSC-facts)
+ [MREC DynamoDB global table use cases](#bp-global-table-design.prescriptive-guidance.usecases)
+ [Write modes with DynamoDB global tables](bp-global-table-design.prescriptive-guidance.writemodes.md)
+ [Routing strategies in DynamoDB](bp-global-table-design.prescriptive-guidance.request-routing.md)
+ [Evacuation processes](bp-global-table-design.prescriptive-guidance.evacuation.md)
+ [Throughput capacity planning for DynamoDB global tables](bp-global-table-design.prescriptive-guidance.throughput.md)
+ [Preparation checklist for DynamoDB global tables](bp-global-table-design.prescriptive-guidance.checklist-and-faq.md)
+ [Conclusion and resources](#bp-global-table-design.prescriptive-guidance-resources-conclusion)

## Key facts about DynamoDB global table design
<a name="bp-global-table-design.prescriptive-guidance.facts"></a>
+ There are two versions of global tables: the current version [Global Tables version 2019.11.21 (Current)](GlobalTables.md) (sometimes called "V2"), and [Global tables version 2017.11.29 (Legacy)](globaltables.V1.md) (sometimes called "V1"). This guide focuses exclusively on the current version.
+ DynamoDB (without global tables) is a Regional service, which means that it is highly available and intrinsically resilient to failures of infrastructure, including the failure of an entire Availability Zone. A single-Region DynamoDB table is designed for 99.99% availability. For more information, see the [DynamoDB service-level agreement](https://aws.amazon.com/dynamodb/sla/) (SLA).
+ A DynamoDB global table replicates its data between two or more Regions. A multi-Region DynamoDB table is designed for 99.999% availability. With proper planning, global tables can help create an architecture that is resilient to Regional failures.
+ DynamoDB doesn’t have a global endpoint. All requests are made to a Regional endpoint that accesses the global table instance that’s local to that Region.
+ Calls to DynamoDB should not go across Regions. The best practice is for an application that is homed to one Region to directly access only the local DynamoDB endpoint for its Region. If problems are detected within a Region (in the DynamoDB layer or in the surrounding stack), end user traffic should be routed to a different application endpoint that’s hosted in a different Region. Global tables ensure that the application homed in every Region has access to the same data.

### Consistency modes
<a name="bp-global-table-design-prescriptive-guidance-consistency"></a>

When you create a global table, you configure its consistency mode. Global tables support two consistency modes: multi-Region eventual consistency (MREC) and multi-Region strong consistency (MRSC), which was introduced in June 2025.

If you don't specify a consistency mode when you create a global table, the global table defaults to MREC. A global table can't contain replicas that are configured with different consistency modes. You can't change a global table's consistency mode after its creation.

## Key facts about MREC
<a name="bp-global-table-design-MREC-facts"></a>
+ Global tables that use MREC employ an active-active replication model. From the perspective of DynamoDB, the table in each Region has equal standing to accept read and write requests. After receiving a write request, the local replica table replicates the write operation to other participating remote Regions in the background.
+ Items are replicated individually. Items that are updated within a single transaction might not be replicated together.
+ Each table partition in the source Region replicates its write operations in parallel with every other partition. The sequence of write operations within a remote Region might not match the sequence of write operations that happened within the source Region. For more information about table partitions, see the blog post [Scaling DynamoDB: How partitions, hot keys, and split for heat impact performance](https://aws.amazon.com/blogs/database/part-3-scaling-dynamodb-how-partitions-hot-keys-and-split-for-heat-impact-performance/).
+ A newly written item is usually propagated to all replica tables within a second. Nearby Regions tend to propagate faster.
+ Amazon CloudWatch provides a `ReplicationLatency` metric for each Region pair. It is calculated by looking at arriving items, comparing their arrival time with their initial write time, and computing an average. Timings are stored within CloudWatch in the source Region. Viewing the average and maximum timings can be useful for determining the average and worst-case replication lag. There is no SLA on this latency.
+ If an individual item is updated at about the same time (within this `ReplicationLatency` window) in two different Regions, and the second write operation happens before the first write operation was replicated, there’s a potential for write conflicts. Global tables that use MREC resolve such conflicts by using a last writer wins mechanism, based on the timestamp of the write operations. The first operation “loses” to the second operation. These conflicts aren’t recorded in CloudWatch or AWS CloudTrail.
+ Each item has a last write timestamp held as a private system property. The last writer wins approach is implemented by using a conditional write operation that requires the incoming item’s timestamp to be greater than the existing item’s timestamp.
+ A global table replicates all items to all participating Regions. If you want to have different replication scopes, you can create multiple global tables and assign each table different participating Regions.
+ The local Region accepts write operations even if the replica Region is offline or `ReplicationLatency` grows. The local table continues to attempt replicating items to the remote table until each item succeeds.
+ In the unlikely event that a Region goes fully offline, when it comes back online later, all pending outbound and inbound replications will be retried. No special action is required to bring the tables back in sync. The *last writer wins* mechanism ensures that the data eventually becomes consistent.
+ You can add a new Region to a DynamoDB MREC table at any time. DynamoDB handles the initial sync and ongoing replication. You can also remove a Region (even the original Region), and this will delete the local table in that Region.
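
The last-writer-wins rule described above can be sketched as a conditional apply on a replica. The in-memory `replica` dictionary and the `ts` field are illustrative stand-ins for a replica table and DynamoDB's private system timestamp; the real mechanism is a conditional write inside the service, not application code.

```python
def replicate(replica: dict, key: str, incoming: dict) -> bool:
    # Apply the incoming item only if its write timestamp is newer than
    # what the replica already holds (last writer wins). Returns whether
    # the incoming write "won" the conflict.
    current = replica.get(key)
    if current is None or incoming["ts"] > current["ts"]:
        replica[key] = incoming
        return True
    return False
```

Replaying writes in any order through this rule converges every replica on the item with the greatest timestamp, which is why a recovered Region needs no special resynchronization step.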

## Key facts about MRSC
<a name="bp-global-table-design-MRSC-facts"></a>
+ Global tables that use MRSC also employ an active-active replication model. From the perspective of DynamoDB, the table in each Region has equal standing to accept read and write requests. Item changes in an MRSC global table replica are **synchronously** replicated to at least one other Region before the write operation returns a successful response.
+ Strongly consistent read operations on any MRSC replica always return the latest version of an item. Conditional write operations always evaluate the condition expression against the latest version of an item. Updates always operate against the latest version of an item.
+ Eventually consistent read operations on an MRSC replica might not include changes that recently occurred in another Region, and might not even include changes that very recently occurred in the same Region.
+ A write operation fails with a `ReplicatedWriteConflictException` exception when it attempts to modify an item that is already being modified in another Region. Write operations that fail with the `ReplicatedWriteConflictException` exception can be retried and will succeed if the item is no longer being modified in another Region.
+ With MRSC, latencies are higher for write operations and for strongly consistent read operations. These operations require cross-Region communication. This communication can add latency that increases based on the round-trip latency between the Region being accessed and the nearest Region participating in the global table. For more information, see the AWS re:Invent 2024 presentation, [Multi-Region strong consistency with DynamoDB global tables](https://www.youtube.com/watch?v=R-nTs8ZD8mA). Eventually consistent read operations experience no extra latency. There is an open source [tester tool](https://github.com/awslabs/amazon-dynamodb-tools/tree/main/tester) that lets you experimentally calculate these latencies with your Regions.
+ Items are replicated individually. Global tables using MRSC do not support the transaction APIs.
+ An MRSC global table must be deployed in exactly three Regions. You can configure an MRSC global table with three replicas, or with two replicas and one witness. A witness is a component of an MRSC global table that contains recent data written to global table replicas. A witness provides an optional alternative to a full replica while supporting MRSC's availability architecture. You can't perform read or write operations on a witness. A witness doesn't incur storage or write costs. A witness is located within a different Region from the two replicas.
+ To create an MRSC global table, you add one replica and a witness, or add two replicas to an existing DynamoDB table that contains no data. You cannot add additional replicas to an existing MRSC global table. You can't delete a single replica or a witness from an MRSC global table. You can delete two replicas, or delete one replica and a witness, from an MRSC global table. Either scenario converts the remaining replica to a single-Region DynamoDB table.
+ You can determine whether an MRSC global table has a witness configured, and the Region in which it's configured, from the output of the `DescribeTable` API. The witness is owned and managed by DynamoDB and doesn't appear in your AWS account in the Region where it's configured.
+ MRSC global tables are available in the following Region sets:
  + US Region set: US East (N. Virginia), US East (Ohio), US West (Oregon)
  + EU Region set: Europe (Ireland), Europe (London), Europe (Paris), Europe (Frankfurt)
  + AP Region set: Asia Pacific (Tokyo), Asia Pacific (Seoul), and Asia Pacific (Osaka)
+ MRSC global tables can't span Region sets. For example, an MRSC global table can't contain replicas from both US and EU Region sets.
+ Time to Live (TTL) isn't supported for MRSC global tables.
+ Local secondary indexes (LSIs) aren't supported for MRSC global tables.
+ CloudWatch Contributor Insights information is only reported for the Region in which an operation occurred.
+ The local Region accepts all read and write operations as long as there is a second Region that hosts a replica or witness to establish quorum. If a second Region isn't available, the local Region can only service eventually consistent reads.
+ In the unlikely event that a Region goes fully offline, when it comes back online later, it will automatically catch up. Until it's caught up, write operations and strongly consistent read operations *only* to the catching up Region will return errors while requests to other Regions will continue to perform normally. Eventually consistent read operations to the catching up Region will return the data that has so far been propagated into the Region, with usual local consistency behavior between the leader node and local replicas. No special action is required to bring the tables back in sync.
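
A write that hits a cross-Region conflict can simply be retried, as described above. In this sketch, `ReplicatedWriteConflict` is a stand-in for the `ReplicatedWriteConflictException` your SDK would surface; the retry counts and delays are illustrative.

```python
import time

class ReplicatedWriteConflict(Exception):
    # Stand-in for ReplicatedWriteConflictException, raised when the item
    # is concurrently being modified in another Region.
    pass

def write_with_conflict_retry(put_item, max_attempts=4, delay=0.05):
    # Retry the write; it succeeds once the item is no longer being
    # modified in another Region.
    for attempt in range(max_attempts):
        try:
            return put_item()
        except ReplicatedWriteConflict:
            if attempt == max_attempts - 1:
                raise
            time.sleep(delay * (attempt + 1))
```

Because MRSC conditional writes always evaluate against the latest version of the item, a retried write that succeeds reflects the other Region's update rather than silently overwriting it.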

## MREC DynamoDB global table use cases
<a name="bp-global-table-design.prescriptive-guidance.usecases"></a>

MREC global tables provide these benefits:
+  **Lower-latency read operations.** Place a copy of the data closer to the end user to reduce network latency during read operations. The data is kept as fresh as the `ReplicationLatency` value.
+  **Lower-latency write operations.** You can write to a nearby Region to reduce network latency and the time taken to complete the write. The write traffic must be carefully routed to ensure no conflicts. Techniques for routing are discussed in more detail in [Routing strategies in DynamoDB](bp-global-table-design.prescriptive-guidance.request-routing.md).
+ **Seamless Region migration.** You can add a new Region and delete the old Region to migrate a deployment from one Region to another without downtime at the data layer.

MREC and MRSC global tables both provide this benefit:
+  **Increased resiliency and disaster recovery.** If a Region has degraded performance or a full outage, you can evacuate it. To evacuate means moving away some or all requests going to that Region. Using global tables increases the [DynamoDB SLA](https://aws.amazon.com/dynamodb/sla/) for monthly uptime percentage from 99.99% to 99.999%. Using MREC supports a recovery point objective (RPO) and recovery time objective (RTO) measured in seconds. Using MRSC supports an RPO of zero.

  For example, Fidelity Investments presented at re:Invent 2022 on how they use DynamoDB global tables for their order management system. Their goal was to achieve reliably low latency processing at a scale they couldn't achieve with on-premises processing while also maintaining resilience to Availability Zone and Regional failures.

If your goal is resiliency and disaster recovery, MRSC tables have higher write latencies and higher strongly consistent read latencies, but support an RPO of zero. MREC global tables support an RPO equal to the replication delay between replicas, usually a few seconds depending on the replica Regions.

# Write modes with DynamoDB global tables
<a name="bp-global-table-design.prescriptive-guidance.writemodes"></a>

Global tables are always active-active at the table level. However, especially for MREC tables, you might want to treat them as active-passive by controlling how you route write requests. For example, you might decide to route write requests to a single Region to avoid potential write conflicts that can happen with MREC tables.

There are three main managed write patterns, as explained in the next three sections. You should consider which write pattern fits your use case. This choice affects how you route requests, evacuate a Region, and handle disaster recovery. The guidance in later sections depends on your application’s write mode.

## Write to any Region mode (no primary)
<a name="bp-global-table-design.prescriptive-guidance.writemodes.no-primary"></a>

The *write to any Region* mode, illustrated in the following diagram, is fully active-active and doesn’t impose restrictions on where a write may occur. Any Region may accept a write at any time. This is the simplest mode, but it can only be used with some types of applications. This mode is suitable for all MRSC tables. It’s also suitable for MREC tables when all writers are idempotent, and therefore safely repeatable, so that concurrent or repeated write operations across Regions do not conflict (for example, when a user updates their contact data). This mode also works well for a special case of idempotency: an append-only dataset where all writes are unique inserts under a deterministic primary key. Lastly, this mode is suitable for MREC tables when the risk of conflicting writes is acceptable.

![\[Diagram of how client writes to any region works.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-client-read-write-to-any-region2.png)


The *write to any Region* mode is the most straightforward architecture to implement. Routing is easier because any Region can be the write target at any time. Failover is easier, because with MRSC tables, the items are always synchronized, and with MREC tables, any recent writes can be replayed any number of times to any secondary Region. Where possible, you should design for this write mode.

For example, several video streaming services use global tables for tracking bookmarks, reviews, watch status flags, and so on. These deployments use MREC tables because they need replicas scattered around the world, with each replica providing low-latency read and write operations. These deployments can use the *write to any Region* mode as long as they ensure that every write operation is idempotent. This will be the case if every update―for example, setting a new latest time code, assigning a new review, or setting a new watch status―assigns the user’s new state directly, and the next correct value for an item doesn’t depend on its current value. If, by chance, the user’s write requests are routed to different Regions, the last write operation will persist and the global state will settle according to the last assignment. Read operations in this mode will eventually become consistent, delayed by the latest `ReplicationLatency` value. 

In another example, a financial services firm uses global tables as part of a system to maintain a running tally of debit card purchases for each customer, to calculate that customer’s cash-back rewards. They want to keep a `RunningBalance` item per customer. This write mode is not naturally idempotent because as transactions stream in, they modify the balance by using an `ADD` expression where the new correct value depends on the current value. By using MRSC tables they can still *write to any Region*, because every `ADD` call always operates against the very latest value of the item.
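
The non-idempotent `ADD` update described above might look like the following sketch; the table and attribute names (`Rewards`, `customerId`, `RunningBalance`) are illustrative assumptions.

```python
# Hypothetical sketch of the running-balance update.
def build_balance_update(customer_id: str, amount_cents: int) -> dict:
    """ADD mutates the current value, so the result depends on prior state.

    This is safe under write-to-any-Region only when every write operates
    against the latest item, which MRSC tables guarantee."""
    return {
        "TableName": "Rewards",
        "Key": {"customerId": {"S": customer_id}},
        "UpdateExpression": "ADD RunningBalance :d",
        "ExpressionAttributeValues": {":d": {"N": str(amount_cents)}},
    }
```

On an MREC table, two concurrent `ADD` writes in different Regions could conflict; on an MRSC table they serialize correctly.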

A third example involves a company that provides online ad placement services. This company decided that a low risk of data loss would be acceptable to achieve the design simplifications of the *write to any Region* mode. When they serve ads, they have just a few milliseconds to retrieve enough metadata to determine which ad to show, and then to record the ad impression so they don’t repeat the same ad soon. They use global tables to get both low-latency read operations for end users across the world and low-latency write operations. They record all ad impressions for a user within a single item, which is represented as a growing list. They use one item instead of appending to an item collection, so they can remove older ad impressions as part of each write operation without paying for a delete operation. This write operation is not idempotent; if the same end user sees ads served out of multiple Regions at approximately the same time, there’s a chance that one write operation for an ad impression could overwrite another. The risk is that a user might see an ad repeated once in a while. They decided that this is acceptable.

## Write to one Region (single primary)
<a name="bp-global-table-design.prescriptive-guidance.writemodes.single-primary"></a>

The *write to one Region* mode, illustrated in the following diagram, is active-passive and routes all table writes to a single active Region. Note that DynamoDB doesn’t have a notion of a single active Region; application routing outside DynamoDB manages this. The *write to one Region* mode works well for MREC tables that need to avoid write conflicts by ensuring that write operations flow to only one Region at a time. This write mode helps when you want to use conditional expressions and can't use MRSC for some reason, or when you need to perform transactions. These expressions aren’t possible unless you know that you’re acting against the latest data, so they require sending all write requests to a single Region that has the latest data.

When you use an MRSC table, you might still choose to write to one Region for convenience; for example, this can help minimize your infrastructure build-out beyond DynamoDB. The write mode would nevertheless be *write to any Region*, because with MRSC you can safely write to any Region at any time, without the conflict-resolution concerns that lead MREC tables to choose *write to one Region*.

Eventually consistent reads can go to any replica Regions to achieve lower latencies. Strongly consistent reads must go to the single primary Region.

![\[Diagram of how writing to one Region works.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-client-writes-one-region2.png)


It’s sometimes necessary to change the active Region in response to a Regional failure. Some users change the currently active Region on a regular schedule, such as in a follow-the-sun deployment. This places the active Region near the geography that has the most activity (usually where it’s daytime, hence the name), which results in the lowest-latency read and write operations. It also has the side benefit of exercising the Region-changing code daily, making sure that it’s well tested before any disaster recovery.

The passive Region(s) may keep a downscaled set of infrastructure surrounding DynamoDB that gets built up only if it becomes the active Region. This guide doesn’t cover pilot light and warm standby designs. For more information, see [Disaster Recovery (DR) Architecture on AWS, Part III: Pilot Light and Warm Standby](https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iii-pilot-light-and-warm-standby/).

Using the *write to one Region* mode works well when you use global tables for low-latency globally distributed read operations. An example is a large social media company that needs to have the same reference data available in every Region around the world. They don’t update the data often, but when they do, they write to only one Region to avoid any potential write conflicts. Read operations are always allowed from any Region.

As another example, consider the financial services company discussed earlier that implemented the daily cash-back calculation. They used the *write to any Region* mode to calculate the balance but the *write to one Region* mode to track payments. This work requires transactions, which aren't supported in MRSC tables, so it works better with a separate MREC table and the *write to one Region* mode.

## Write to your Region (mixed primary)
<a name="bp-global-table-design.prescriptive-guidance.writemodes.mixed-primary"></a>

The *write to your Region* write mode, illustrated in the following diagram, works with MREC tables. It assigns different data subsets to different home Regions and allows write operations to an item only through its home Region. This mode is active-passive but assigns the active Region based on the item. Every Region is primary for its own non-overlapping dataset, and write operations must be guarded to ensure proper locality.

This mode is similar to *write to one Region* except that it enables lower-latency write operations, because the data associated with each user can be placed in closer network proximity to that user. It also spreads the surrounding infrastructure more evenly between Regions and requires less work to build out infrastructure during a failover scenario, because all Regions have a portion of their infrastructure already active.

![\[Diagram of how client writes to each item in a single Region works.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/get-client-writes-each-item-single-region2.png)


You can determine the home Region for items in several ways:
+  **Intrinsic:** Some aspect of the data, such as a special attribute or a value embedded within its partition key, makes its home Region clear. This technique is described in the blog post [Use Region pinning to set a home Region for items in an Amazon DynamoDB global table](https://aws.amazon.com/blogs/database/use-region-pinning-to-set-a-home-region-for-items-in-an-amazon-dynamodb-global-table/).
+  **Negotiated:** The home Region of each dataset is negotiated in some external manner, such as with a separate global service that maintains assignments. The assignment may have a finite duration after which it’s subject to renegotiation. 
+  **Table-oriented:** Instead of creating a single replicating global table, you create the same number of global tables as replicating Regions. Each table’s name indicates its home Region. In standard operations, all data is written to the home Region while other Regions keep a read-only copy. During a failover, another Region temporarily adopts write duties for that table.
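
As a sketch of the intrinsic approach, the home Region could be derived from a zone prefix embedded in the partition key. The prefix scheme and the Region map below are assumptions for illustration only.

```python
# Assumed zone-prefix scheme: partition keys look like "EU#user-123".
HOME_REGIONS = {
    "NA": "us-east-1",
    "EU": "eu-west-1",
    "APAC": "ap-northeast-1",
}

def home_region(partition_key: str) -> str:
    """Return the Region whose replica should accept writes for this item."""
    zone = partition_key.split("#", 1)[0]
    return HOME_REGIONS[zone]
```

Write paths would call this guard before issuing the write, rejecting or forwarding requests whose home Region isn’t the local one.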

For example, imagine that you’re working for a gaming company. You need low-latency read and write operations for all gamers around the world. You assign each gamer to the Region that’s closest to them. That Region takes all their read and write operations, ensuring strong read-after-write consistency. However, when a gamer travels or if their home Region suffers an outage, a complete copy of their data is available in alternative Regions, and the gamer can be assigned to a different home Region.

As another example, imagine that you’re working at a video conferencing company. Each conference call’s metadata is assigned to a particular Region. Callers can use the Region that’s closest to them for lowest latency. If there’s a Region outage, using global tables allows quick recovery because the system can move the processing of the call to a different Region where a replicated copy of the data already exists.

**To summarize**
+ Write to any Region mode is suitable for MRSC tables and idempotent calls to MREC tables.
+ Write to one Region mode is suitable for non-idempotent calls to MREC tables.
+ Write to your Region mode is suitable for non-idempotent calls to MREC tables, where it's important to have clients write to a Region that’s close to them.

# Routing strategies in DynamoDB
<a name="bp-global-table-design.prescriptive-guidance.request-routing"></a>

Perhaps the most complex piece of a global table deployment is managing request routing. Requests must first go from an end user to a Region that’s chosen and routed in some manner. The request encounters some stack of services in that Region, including a compute layer that perhaps consists of a load balancer backed by an AWS Lambda function, container, or Amazon Elastic Compute Cloud (Amazon EC2) node, and possibly other services, including another database. That compute layer communicates with DynamoDB; it should do so by using the local endpoint for that Region. The data in the global table replicates to all other participating Regions, and each Region has a similar stack of services around its DynamoDB table.

 The global table provides each stack in the various Regions with a local copy of the same data. You might consider designing for a single stack in a single Region and anticipate making remote calls to a secondary Region’s DynamoDB endpoint if there’s an issue with the local DynamoDB table. This is not best practice. If there’s an issue in one Region that’s caused by DynamoDB (or, more likely, caused by something else in the stack or by another service that depends on DynamoDB), it’s best to route the end user to another Region for processing and use that other Region’s compute layer, which will talk to its local DynamoDB endpoint. This approach routes around the problematic Region entirely. To ensure resiliency, you need replication across multiple Regions: replication of the compute layer as well as the data layer.

 There are numerous alternative techniques to route an end user request to a Region for processing. The optimum choice depends on your write mode and your failover considerations. This section discusses four options: client-driven, compute-layer, Route 53, and Global Accelerator.

## Client-driven request routing
<a name="bp-global-table-design.prescriptive-guidance.request-routing.client-driven"></a>

With client-driven request routing, illustrated in the following diagram, the end user client (an application, a web page with JavaScript, or another client) keeps track of the valid application endpoints (for example, an Amazon API Gateway endpoint rather than a literal DynamoDB endpoint) and uses its own embedded logic to choose the Region to communicate with. It might choose based on random selection, lowest observed latencies, highest observed bandwidth measurements, or locally performed health checks.

![\[Diagram of how writing to a client's chosen target works.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-routing-is-clients-choice2_v2.png)


As an advantage, client-driven request routing can adapt to things such as real-world public internet traffic conditions to switch Regions if it notices any degraded performance. The client must be aware of all potential endpoints, but launching a new Regional endpoint is not a frequent occurrence.

With *write to any Region* mode, a client can unilaterally select its preferred endpoint. If its access to one Region becomes impaired, the client can route to another endpoint.

With the *write to one Region* mode, the client needs a mechanism to route its writes to the currently active Region. This could be as basic as empirically testing which Region is presently accepting writes (noting any write rejections and falling back to an alternate) or as complex as calling a global coordinator to query for the current application state (perhaps built on Amazon Application Recovery Controller (ARC) routing control, which provides a five-Region, quorum-driven system to maintain global state for needs such as this). The client can decide whether reads can go to any Region for eventual consistency or must be routed to the active Region for strong consistency. For more information, see [How Route 53 works](https://docs.aws.amazon.com/r53recovery/latest/dg/introduction-how-it-works.html).
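
A minimal sketch of the empirical fallback described above, with the endpoint list and health probe supplied by the caller; both names are assumptions, not part of any AWS SDK.

```python
def choose_endpoint(preferred_endpoints, is_healthy):
    """Return the first application endpoint that passes the caller's probe.

    `preferred_endpoints` is an ordered preference list (for example,
    API Gateway URLs); `is_healthy` might issue a lightweight test write
    and note any rejections before falling back to an alternate."""
    for endpoint in preferred_endpoints:
        if is_healthy(endpoint):
            return endpoint
    raise RuntimeError("no healthy endpoint available")
```

The client would re-run the probe periodically so that routing recovers automatically when the preferred Region becomes healthy again.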

With the *write to your Region* mode, the client needs to determine the home Region for the dataset it’s working against. For example, if the client corresponds to a user account and each user account is homed to a Region, the client can request the appropriate endpoint from a global login system.

For example, a financial services company that helps users manage their business finances through the web could use global tables with the *write to your Region* mode. Each user must log in to a central service. That service returns credentials and the endpoint for the Region where those credentials will work. The credentials are valid for a short time. After that, the webpage automatically negotiates a new login, which provides an opportunity to redirect the user’s activity to a new Region.

## Compute-layer request routing
<a name="bp-global-table-design.prescriptive-guidance.request-routing.compute"></a>

With compute-layer request routing, illustrated in the following diagram, the code that runs in the compute layer determines whether to process the request locally or pass it to a copy of itself that’s running in another Region. When you use the *write to one Region* mode, the compute layer might detect that it’s not the active Region and allow local read operations while forwarding all write operations to another Region. This compute layer code must be aware of the data topology and routing rules, and enforce them reliably, based on the latest settings that specify which Regions are active for which data. The outer software stack within the Region doesn’t have to be aware of how read and write requests are routed by the microservice. In a robust design, the receiving Region validates whether it is the current primary for the write operation. If it isn’t, it generates an error that indicates that the global state needs to be corrected. The receiving Region might also buffer the write operation for a while if the primary Region is in the process of changing. In all cases, the compute stack in a Region writes only to its local DynamoDB endpoint, but the compute stacks might communicate with one another.
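
The local-versus-forward decision can be sketched as follows; `do_local_write` and `forward_to` are placeholders for your own persistence call and inter-Region forwarding mechanism.

```python
def route_write(local_region, primary_region, do_local_write, forward_to):
    """Write locally only when this Region is the current primary.

    In a robust design the receiving Region re-validates that it also
    considers itself primary and raises an error otherwise (not shown),
    which surfaces propagation delays in the global state."""
    if local_region == primary_region:
        return do_local_write()
    return forward_to(primary_region)
```

Reads skip this check entirely and always go to the local replica for low latency.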

![\[Diagram of compute layer request routing.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-compute-layer-routing2.png)


The Vanguard Group uses a system called Global Orchestration and Status Tool (GOaST) and a library called Global Multi-Region library (GMRlib) for this routing process, as presented at [re:Invent 2022](https://www.youtube.com/watch?v=ilgpzlE7Hds&t=1882s). They use a follow-the-sun single primary model. GOaST maintains the global state, similar to the ARC routing control discussed in the previous section. It uses a global table to track which Region is the primary Region and when the next primary switch is scheduled. All read and write operations go through GMRlib, which coordinates with GOaST. GMRlib allows read operations to be performed locally, at low latency. For write operations, GMRlib checks if the local Region is the current primary Region. If so, the write operation completes directly. If not, GMRlib forwards the write task to the GMRlib in the primary Region. That receiving library confirms that it also considers itself the primary Region and raises an error if it isn’t, which indicates a propagation delay with the global state. This approach provides a validation benefit by not writing directly to a remote DynamoDB endpoint.

## Route 53 request routing
<a name="bp-global-table-design.prescriptive-guidance.request-routing.r53"></a>

Amazon Route 53 is a Domain Name System (DNS) service. With Route 53, the client requests its endpoint by looking up a well-known DNS domain name, and Route 53 returns the IP address corresponding to the Regional endpoint(s) it considers most appropriate. This is illustrated in the following diagram. Route 53 has a long list of routing policies that it uses to determine the appropriate Region. It can also do failover routing to route traffic away from Regions that fail health checks.

![\[Diagram of Route 53 request routing.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-rt-53-anycast2_v2.png)


With the *write to any Region* mode, or when combined with compute-layer request routing on the backend, Route 53 can be given full latitude to choose the Region based on any of its routing policies, such as closest network proximity, closest geographic proximity, or any other rule.

With *write to one Region* mode, you can configure Route 53 to return the currently active Region (using Route 53 ARC). If the client wants to connect to a passive Region (for example, for read operations), it could look up a different DNS name.
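
One hedged way to express active/passive routing is a pair of Route 53 failover records. The sketch below builds the change batch for one record; the domain name, set identifier, and DNS targets are placeholders.

```python
def failover_record(name, set_id, target_dns, role, health_check_id=None):
    """Build a Route 53 UPSERT for a failover-policy CNAME record.

    role is "PRIMARY" or "SECONDARY". A short TTL (60s is typical for
    failover use) keeps the recovery time objective low for clients."""
    record = {
        "Name": name,
        "Type": "CNAME",
        "TTL": 60,
        "SetIdentifier": set_id,
        "Failover": role,
        "ResourceRecords": [{"Value": target_dns}],
    }
    if health_check_id:
        record["HealthCheckId"] = health_check_id
    return {"Changes": [{"Action": "UPSERT", "ResourceRecordSet": record}]}

# Applied with the Route 53 API, for example:
# route53.change_resource_record_sets(HostedZoneId=zone_id,
#     ChangeBatch=failover_record("api.example.com.", "use1",
#                                 "use1-api.example.com", "PRIMARY", hc_id))
```

A matching `SECONDARY` record pointing at the passive Region completes the pair.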

**Note**  
Clients cache the IP addresses in the response from Route 53 for a time indicated by the time to live (TTL) setting on the domain name. A longer TTL extends the recovery time objective (RTO) for all clients to recognize the new endpoint. A value of 60 seconds is typical for failover use. Not all software perfectly adheres to DNS TTL expiration, and there might be multiple levels of DNS caching, such as at the operating system, virtual machine, and application.

With *write to your Region* mode, it’s best to avoid Route 53 unless you're also using compute-layer request routing.

## Global Accelerator request routing
<a name="bp-global-table-design.prescriptive-guidance.request-routing.gax"></a>

With [AWS Global Accelerator](https://aws.amazon.com/global-accelerator/), illustrated in the following diagram, a client looks up the well-known domain name in Route 53. However, instead of getting back an IP address that corresponds to a Regional endpoint, the client receives an anycast static IP address which routes to the nearest AWS edge location. Starting from that edge location, all traffic gets routed on the private AWS network and to some endpoint (such as a load balancer or API Gateway) in a Region chosen by routing rules that are maintained within Global Accelerator. Compared with routing based on Route 53 rules, Global Accelerator request routing has lower latencies because it reduces the amount of traffic on the public internet. In addition, because Global Accelerator doesn’t depend on DNS TTL expiration to change routing rules, it can adjust routing more quickly.

![\[Diagram of how client writing with Global Accelerator can work.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/gt-routing-gax-excerpt2_v2.png)


With the *write to any Region* mode, or when combined with compute-layer request routing on the backend, Global Accelerator works seamlessly. The client connects to the nearest edge location and need not be concerned with which Region receives the request.

With the *write to one Region* mode, Global Accelerator routing rules must send requests to the currently active Region. You can use health checks that artificially report a failure on any Region that your global system doesn’t consider the active Region. As with DNS, you can use an alternative DNS domain name for routing read requests if those requests can go to any Region.
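
Another way to steer Global Accelerator toward a single active Region is to set the traffic dial of every other endpoint group to zero. This sketch builds the per-group update arguments; the Region-to-ARN map is an assumption.

```python
def traffic_dial_updates(endpoint_group_arns, active_region):
    """Build update_endpoint_group kwargs: 100% of traffic to the active
    Region's endpoint group, 0% to all others."""
    return [
        {
            "EndpointGroupArn": arn,
            "TrafficDialPercentage": 100.0 if region == active_region else 0.0,
        }
        for region, arn in endpoint_group_arns.items()
    ]

# Applied with the Global Accelerator API, for example:
# for kwargs in traffic_dial_updates(arns_by_region, "eu-west-1"):
#     globalaccelerator.update_endpoint_group(**kwargs)
```

Because Global Accelerator doesn’t wait on DNS TTL expiry, such a dial change takes effect quickly.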

 With *write to your Region* mode, it’s best to avoid Global Accelerator unless you're also using compute-layer request routing.

# Evacuation processes
<a name="bp-global-table-design.prescriptive-guidance.evacuation"></a>

Evacuating a Region is the process of migrating activity away from that Region, usually both read and write activity, or read activity alone.

## Evacuating a live Region
<a name="bp-global-table-design.prescriptive-guidance.evacuation.live"></a>

You might decide to evacuate a live Region for a number of reasons: as part of usual business activity (for example, if you’re using a follow-the-sun, write to one Region mode), due to a business decision to change the currently active Region, in response to failures in the software stack outside DynamoDB, or because you’re encountering general issues such as higher than usual latencies within the Region.

With *write to any Region* mode, evacuating a live Region is straightforward. You can route traffic to alternative Regions by using any routing system and let the write operations in the evacuated Region replicate over as usual.

The write to one Region and write to your Region modes are usually used with MREC tables. Therefore, you must make sure that all write operations to the active Region have been fully recorded, stream processed, and globally propagated before starting write operations in the new active Region, to ensure that future write operations are processed against the latest version of the data.

Let’s say that Region A is active and Region B is passive (either for the full table or for items that are homed in Region A). The typical mechanism to perform an evacuation is to pause write operations to A, wait long enough for those operations to have fully propagated to B, update the architecture stack to recognize B as active, and then resume write operations to B. There is no metric to indicate with absolute certainty that Region A has fully replicated its data to Region B. If Region A is healthy, pausing write operations to Region A and waiting 10 times the recent maximum value of the `ReplicationLatency` metric would typically be sufficient to determine that replication is complete. If Region A is unhealthy and shows other areas of increased latencies, you would choose a larger multiple for the wait time.
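
The wait heuristic above can be sketched as a small helper. The CloudWatch query for the recent peak of `ReplicationLatency` is shown only as a comment, because the exact dimensions must match your table.

```python
def drain_wait_seconds(peak_replication_latency_ms: float, multiple: int = 10) -> float:
    """Heuristic wait after pausing writes to Region A: roughly `multiple`
    times the recent peak ReplicationLatency before treating Region B as
    caught up. Choose a larger multiple if Region A is unhealthy."""
    return (peak_replication_latency_ms / 1000.0) * multiple

# The peak latency could come from CloudWatch, for example (assumed dimensions):
# cloudwatch.get_metric_statistics(
#     Namespace="AWS/DynamoDB", MetricName="ReplicationLatency",
#     Dimensions=[{"Name": "TableName", "Value": table},
#                 {"Name": "ReceivingRegion", "Value": "eu-west-1"}],
#     StartTime=start, EndTime=end, Period=60, Statistics=["Maximum"])
```

Remember this is a heuristic: no metric proves with absolute certainty that replication is complete.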

## Evacuating an offline Region
<a name="bp-global-table-design.prescriptive-guidance.evacuation.offline"></a>

There’s a special case to consider: What if Region A goes fully offline without notice? This is extremely unlikely but should be considered nevertheless.

Evacuating an offline MRSC table  
If this happens with an MRSC table, there is nothing special you need to do. MRSC tables support a recovery point objective (RPO) of zero. All successful write operations made to the MRSC table in the offline Region will already be available in the replica tables in the other Regions, so there's no potential gap in data even if the Region goes fully offline without notice. Business can continue using the replicas located in the other Regions.

Evacuating an offline MREC table  
If this happens with an MREC table, any write operations in Region A that were not yet propagated are held and propagated after Region A comes back online. The write operations aren’t lost, but their propagation is indefinitely delayed.  
How to proceed in this event is the application’s decision. For business continuity, write operations might need to proceed to the new primary Region B. However, if an item in Region B receives an update while there is a pending propagation of a write operation for that item from Region A, the propagation is suppressed under the *last writer wins* model. Any update in Region B might suppress an incoming write request.  
With the *write to any Region* mode, read and write operations can continue in Region B, trusting that the items in Region A will propagate to Region B eventually and recognizing the potential for missing items until Region A comes back online. When possible, such as with idempotent write operations, you should consider replaying recent write traffic (for example, by using an upstream event source) to fill in the gap of any potentially missing write operations and let the last writer wins conflict resolution suppress the eventual propagation of the incoming write operation.  
With the other write modes, you have to consider the degree to which work can continue with a slightly out-of-date view of the world. Some small duration of write operations, as tracked by `ReplicationLatency`, will be missing until Region A comes back online. Can business move forward? In some use cases it can, but in others it might not without additional mitigation mechanisms.  
For example, imagine that you have to maintain an available credit balance without interruption even after a full outage of a Region. You could split the balance into two different items, one homed in Region A and one in Region B, and start each with half the available balance. This would use the *write to your Region* mode. Transactional updates processed in each Region would write against the local copy of the balance. If Region A goes fully offline, work could still proceed with transaction processing in Region B, and write operations would be limited to the balance portion held in Region B. Splitting the balance like this introduces complexities when the balance gets low or the credit has to be rebalanced, but it does provide one example of safe business recovery even with uncertain pending write operations.  
As another example, imagine that you’re capturing web form data. You can use [optimistic concurrency control (OCC)](DynamoDBMapper.OptimisticLocking.md) to assign versions to data items and embed the latest version into the web form as a hidden field. On each submit, the write operation succeeds only if the version in the database still matches the version that the form was built against. If the versions don’t match, the web form can be refreshed (or carefully merged) based on the current version in the database, and the user can proceed again. The OCC model usually protects against another client overwriting and producing a new version of the data, but it can also help during failover where a client might encounter older versions of data. Let’s imagine that you’re using the timestamp as the version. The form was first built against Region A at 12:00 but (after failover) tries to write to Region B and notices that the latest version in the database is 11:59. In this scenario, the client can either wait for the 12:00 version to propagate to Region B and then write on top of that version, or build on 11:59 and create a new 12:01 version (which, after writing, would suppress the incoming version after Region A recovers).  
As a third example, a financial services company holds data about customer accounts and their financial transactions in a DynamoDB database. In the event of a complete Region A outage, they want to make sure that any write activity related to their accounts is either fully available in Region B, or they want to quarantine their accounts as known partial until Region A comes back online. Instead of pausing all business, they decided to pause business only to the tiny fraction of accounts that they determined had unpropagated transactions. To achieve this, they used a third Region, which we will call Region C. Before they processed any write operations in Region A, they placed a succinct summary of those pending operations (for example, a new transaction count for an account) in Region C. This summary was sufficient for Region B to determine if its view was fully up to date. This action effectively locked the account from the time of writing in Region C until Region A accepted the write operations and Region B received them. The data in Region C wasn’t used except as part of a failover process, after which Region B could cross-check its data with Region C to check if any of its accounts were out of date. Those accounts would be marked as quarantined until the Region A recovery propagated the partial data to Region B. If Region C were to fail, a new Region D could be spun up for use instead. The data in Region C was very transient, and after a few minutes Region D would have a sufficiently up-to-date record of the in-flight write operations to be fully useful. If Region B were to fail, Region A could continue accepting write requests in cooperation with Region C. This company was willing to accept higher latency writes (to two Regions: C and then A) and was fortunate to have a data model where the state of an account could be succinctly summarized.
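
The OCC pattern from the web-form example can be sketched as a conditional update that succeeds only when the stored version still matches the one the form was built against. The table and attribute names here are illustrative assumptions.

```python
# Hypothetical sketch of an OCC write. "version" is a DynamoDB reserved
# word, so it's aliased through ExpressionAttributeNames.
def build_versioned_update(form_id: str, new_state: str, expected_version: int) -> dict:
    """The condition fails with ConditionalCheckFailedException if another
    write (or an older replica view after failover) changed the version."""
    return {
        "TableName": "Forms",
        "Key": {"formId": {"S": form_id}},
        "UpdateExpression": "SET formState = :s, #v = :next",
        "ConditionExpression": "#v = :expected",
        "ExpressionAttributeNames": {"#v": "version"},
        "ExpressionAttributeValues": {
            ":s": {"S": new_state},
            ":next": {"N": str(expected_version + 1)},
            ":expected": {"N": str(expected_version)},
        },
    }
```

On a condition failure, the application refreshes (or merges) against the current item and retries with the new version.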

# Throughput capacity planning for DynamoDB global tables
<a name="bp-global-table-design.prescriptive-guidance.throughput"></a>

Migrating traffic from one Region to another requires careful consideration of DynamoDB table settings regarding capacity. 

Here are some considerations for managing write capacity:
+ A global table must be in on-demand mode or provisioned with auto scaling enabled.
+ If provisioned with auto scaling, the write settings (minimum, maximum, and target utilization) are replicated across Regions. Although the auto scaling settings are synchronized, the actual provisioned write capacity can float independently between Regions.
+ One reason you could see different provisioned write capacity is due to the TTL feature. When you enable TTL in DynamoDB, you can specify an attribute name whose value indicates the time of expiration for the item, in Unix epoch time format in seconds. After that time, DynamoDB can delete the item without incurring write costs. With global tables, you can configure TTL in any Region, and the setting is automatically replicated to other Regions that are associated with the global table. When an item is eligible for deletion through a TTL rule, that work can be done in any Region. The delete operation is performed without consuming write units on the source table, but the replica tables will get a replicated write of that delete operation and will incur replicated write unit costs. TTL isn't supported in MRSC tables.
+ If you’re using auto scaling, make sure that the maximum provisioned write capacity setting is sufficiently high to handle all write operations as well as all potential TTL delete operations. Auto scaling adjusts each Region according to its write consumption. On-demand tables have no maximum provisioned write capacity setting, but the *table-level maximum write throughput limit* specifies the maximum sustained write capacity the on-demand table will allow. The default limit is 40,000, but it is adjustable. We recommend that you set it high enough to handle all write operations (including TTL write operations) that the on-demand table might need. This value must be the same across all participating Regions when you set up global tables.
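
TTL is enabled per table with a specification like the following; the attribute name is an assumption, and on a global table the setting replicates to the other Regions automatically.

```python
def ttl_spec(attribute_name: str) -> dict:
    """TimeToLiveSpecification for update_time_to_live. The named attribute
    must hold each item's expiry time in Unix epoch seconds."""
    return {"Enabled": True, "AttributeName": attribute_name}

# Applied against the local Region's endpoint, for example:
# boto3.client("dynamodb", region_name=local_region).update_time_to_live(
#     TableName="Events", TimeToLiveSpecification=ttl_spec("expireAt"))
```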

Here are some considerations for managing read capacity:
+ Read capacity management settings are allowed to differ between Regions because it’s assumed that different Regions might have independent read patterns. When you first add a global replica to a table, the capacity of the source Region is propagated. After creation, you can adjust the read capacity settings, which aren’t propagated to the other Regions.
+ When you use DynamoDB auto scaling, make sure that the maximum provisioned read capacity settings are sufficiently high to handle all read operations across all Regions. During standard operations the read capacity will perhaps be spread across Regions, but during failover the table should be able to automatically adapt to the increased read workload. On-demand tables have no maximum provisioned read capacity setting, but the *table-level maximum read throughput limit* specifies the maximum sustained read capacity the on-demand table will allow. The default limit is 40,000, but it is adjustable. We recommend that you set it high enough to handle all read operations that the table might need if all read operations were to route to this single Region.
+ If a table in one Region doesn’t usually receive read traffic but might have to absorb a large amount of read traffic after a failover, you can pre-warm the capacity of that table to accept a higher level of read traffic.
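
When sizing those maximum read settings, the standard DynamoDB capacity arithmetic applies: one read capacity unit covers one strongly consistent read per second of up to 4 KB, or two eventually consistent reads. A small sketch of that calculation (the numbers are illustrative):

```
import math

def required_rcu(reads_per_sec, item_size_kb, eventually_consistent=True):
    """Estimate the read capacity units needed to absorb a given read rate,
    for example when all traffic fails over to a single Region."""
    units_per_read = math.ceil(item_size_kb / 4)  # one read unit covers 4 KB
    total = reads_per_sec * units_per_read
    if eventually_consistent:
        total = math.ceil(total / 2)  # eventually consistent reads cost half
    return total

# 1,000 eventually consistent reads/sec of 4 KB items -> 500 RCUs
# The same traffic with strongly consistent reads    -> 1,000 RCUs
```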

ARC has [readiness checks](https://docs.aws.amazon.com/r53recovery/latest/dg/recovery-readiness.rules-resources.html) that can be useful for confirming that DynamoDB Regions have similar table settings and account quotas, whether or not you use Route 53 to route requests. These readiness checks can also help in adjusting account-level quotas to make sure they match.

# Preparation checklist for DynamoDB global tables
<a name="bp-global-table-design.prescriptive-guidance.checklist-and-faq"></a>

Use the following checklist for decisions and tasks when you deploy global tables.
+ Determine if your use case benefits more from an MRSC or MREC consistency mode. Do you need strong consistency, even with the higher latency and other tradeoffs?
+ Determine how many and which Regions should participate in the global table. If you plan to use MRSC, decide if you want the third Region to be a replica or a witness.
+ Determine your application’s write mode. This is not the same as the consistency mode. For more information, see [Write modes with DynamoDB global tables](bp-global-table-design.prescriptive-guidance.writemodes.md).
+ Plan your routing strategy, based on your write mode. For more information, see [Routing strategies in DynamoDB](bp-global-table-design.prescriptive-guidance.request-routing.md).
+ Define your [Evacuation processes](bp-global-table-design.prescriptive-guidance.evacuation.md), based on your consistency mode, write mode, and routing strategy. Evacuating a Region is the process of migrating activity, usually read and write activity or read activity only, away from that Region.

  **Evacuating a live Region**

  You might decide to evacuate a live Region for a number of reasons: as part of usual business activity (for example, if you’re using a follow-the-sun, write to one Region mode), due to a business decision to change the currently active Region, in response to failures in the software stack outside DynamoDB, or because you’re encountering general issues such as higher than usual latencies within the Region.

  With *write to any Region* mode, evacuating a live Region is straightforward. You can route traffic to alternative Regions by using any routing system and let the write operations in the evacuated Region replicate over as usual.

  The *write to one Region* and *write to your Region* modes are usually used with MREC tables. Therefore, you must make sure that all write operations to the active Region have been fully recorded, stream processed, and globally propagated before starting write operations in the new active Region, so that future write operations are processed against the latest version of the data. Let’s say that Region A is active and Region B is passive (either for the full table or for items that are homed in Region A). The typical mechanism to perform an evacuation is to pause write operations to A, wait long enough for those operations to fully propagate to B, update the architecture stack to recognize B as active, and then resume write operations to B. There is no metric that indicates with absolute certainty that Region A has fully replicated its data to Region B. If Region A is healthy, pausing write operations to Region A and waiting 10 times the recent maximum value of the `ReplicationLatency` metric would typically be sufficient to determine that replication is complete. If Region A is unhealthy and shows other signs of increased latencies, choose a larger multiple for the wait time.

  **Evacuating an offline Region**

  There’s a special case to consider: What if Region A goes fully offline without notice? This is extremely unlikely, but it should be considered nevertheless.

  *MRSC tables* – If this happens with an MRSC table, there is nothing special you need to do. MRSC tables support a recovery point objective (RPO) of zero. All successful write operations made to the MRSC table in the offline Region will be available in all other Regions, so there’s no potential gap in the data even if the Region goes fully offline without notice. Business can continue using the replicas located in the other Regions.

  *MREC tables* – If this happens with an MREC table, any write operations in Region A that were not yet propagated are held and propagated after Region A comes back online. The write operations aren’t lost, but their propagation is indefinitely delayed. How to proceed in this event is the application’s decision. For business continuity, write operations might need to proceed to the new primary Region B. However, if an item in Region B receives an update while there is a pending propagation of a write operation for that item from Region A, the propagation is suppressed under the *last writer wins* model. Any update in Region B might suppress an incoming write request.

  With the *write to any Region* mode, read and write operations can continue in Region B, trusting that the items in Region A will propagate to Region B eventually and recognizing the potential for missing items until Region A comes back online. When possible, such as with idempotent write operations, you should consider replaying recent write traffic (for example, by using an upstream event source) to fill in the gap of any potentially missing write operations and let the last writer wins conflict resolution suppress the eventual propagation of the incoming write operation.

  With the other write modes, you have to consider the degree to which work can continue with a slightly out-of-date view of the world. Some small duration of write operations, as tracked by `ReplicationLatency`, will be missing until Region A comes back online. Can business move forward? In some use cases it can, but in others it might not without additional mitigation mechanisms.

  For example, imagine that you have to maintain an available credit balance without interruption even after a full outage of a Region. You could split the balance into two different items, one homed in Region A and one in Region B, and start each with half the available balance. This would use the *write to your Region* mode. Transactional updates processed in each Region would write against the local copy of the balance. If Region A goes fully offline, work could still proceed with transaction processing in Region B, and write operations would be limited to the balance portion held in Region B. Splitting the balance like this introduces complexities when the balance gets low or the credit has to be rebalanced, but it does provide one example of safe business recovery even with uncertain pending write operations.

  As another example, imagine that you’re capturing web form data. You can use [optimistic concurrency control (OCC)](DynamoDBMapper.OptimisticLocking.md) to assign versions to data items and embed the latest version into the web form as a hidden field. On each submit, the write operation succeeds only if the version in the database still matches the version that the form was built against. If the versions don’t match, the web form can be refreshed (or carefully merged) based on the current version in the database, and the user can proceed again. The OCC model usually protects against another client overwriting and producing a new version of the data, but it can also help during failover, where a client might encounter older versions of data. Let’s imagine that you’re using the timestamp as the version. The form was first built against Region A at 12:00 but (after failover) tries to write to Region B and notices that the latest version in the database is 11:59. In this scenario, the client can either wait for the 12:00 version to propagate to Region B and then write on top of that version, or build on 11:59 and create a new 12:01 version (which, after writing, would suppress the incoming version after Region A recovers).

  As a third example, a financial services company holds data about customer accounts and their financial transactions in a DynamoDB database. In the event of a complete Region A outage, they want to make sure that any write activity related to their accounts is either fully available in Region B, or they want to quarantine their accounts as known partial until Region A comes back online. Instead of pausing all business, they decided to pause business only to the tiny fraction of accounts that they determined had unpropagated transactions. To achieve this, they used a third Region, which we will call Region C. Before they processed any write operations in Region A, they placed a succinct summary of those pending operations (for example, a new transaction count for an account) in Region C. This summary was sufficient for Region B to determine if its view was fully up to date. This action effectively locked the account from the time of writing in Region C until Region A accepted the write operations and Region B received them. The data in Region C wasn’t used except as part of a failover process, after which Region B could cross-check its data with Region C to check if any of its accounts were out of date. Those accounts would be marked as quarantined until the Region A recovery propagated the partial data to Region B. If Region C were to fail, a new Region D could be spun up for use instead. The data in Region C was very transient, and after a few minutes Region D would have a sufficiently up-to-date record of the in-flight write operations to be fully useful. If Region B were to fail, Region A could continue accepting write requests in cooperation with Region C. This company was willing to accept higher latency writes (to two Regions: C and then A) and was fortunate to have a data model where the state of an account could be succinctly summarized.
+ Capture metrics on the health, latency, and errors across each Region. For a list of DynamoDB metrics to observe, see the AWS blog post [Monitoring Amazon DynamoDB for Operational Awareness](https://aws.amazon.com/blogs/database/monitoring-amazon-dynamodb-for-operational-awareness/). You should also use [synthetic canaries](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Synthetics_Canaries.html) (artificial requests designed to detect failures, named after the canary in the coal mine), as well as live observation of customer traffic. Not all issues will appear in the DynamoDB metrics.
+ If you're using MREC, set alarms for any sustained increase in `ReplicationLatency`. An increase might indicate an accidental misconfiguration in which the global table has different write settings in different Regions, which leads to failed replicated requests and increased latencies. It could also indicate that there is a Regional disruption. A [good example](https://aws.amazon.com/blogs/database/monitoring-amazon-dynamodb-for-operational-awareness/) is to generate an alert if the recent average exceeds 180,000 milliseconds. You might also watch for `ReplicationLatency` dropping to 0, which indicates stalled replication.
+ Assign sufficient maximum read and write settings for each global table.
+ Identify the reasons for evacuating a Region in advance. If the decision involves human judgment, document all considerations. This work should be done carefully in advance, not under stress.
+ Maintain a runbook for every action that must take place when you evacuate a Region. Usually very little work is involved for the global tables, but moving the rest of the stack might be complex. 
**Note**  
With failover procedures, it's best practice to rely only on data plane operations and not on control plane operations, because some control plane operations could be degraded during Region failures.

   For more information, see the AWS blog post [Build resilient applications with Amazon DynamoDB global tables: Part 4](https://aws.amazon.com/blogs/database/part-4-build-resilient-applications-with-amazon-dynamodb-global-tables/).
+ Test all aspects of the runbook periodically, including Region evacuations. An untested runbook is an unreliable runbook.
+ Consider using [AWS Resilience Hub](https://docs.aws.amazon.com/resilience-hub/latest/userguide/what-is.html) to evaluate the resilience of your entire application (including global tables). It provides a comprehensive view of your overall application portfolio resilience status through its dashboard.
+ Consider using ARC readiness checks to evaluate the current configuration of your application and track any deviations from best practices.
+ When you write health checks for use with Route 53 or Global Accelerator, make a set of calls that cover the full database flow. If you limit your check to confirm only that the DynamoDB endpoint is up, you won’t be able to cover many failure modes such as AWS Identity and Access Management (IAM) configuration errors, code deployment problems, failure in the stack outside DynamoDB, higher than average read or write latencies, and so on.
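
As one concrete way to implement the `ReplicationLatency` alarm suggested above, you could create a CloudWatch alarm per receiving Region. This is a sketch with hypothetical table and alarm names; the 180,000 ms threshold follows the guidance above:

```
# Parameters for a CloudWatch alarm on MREC replication latency.
alarm = dict(
    AlarmName='inventory-replication-latency-to-us-west-2',  # hypothetical
    Namespace='AWS/DynamoDB',
    MetricName='ReplicationLatency',
    Dimensions=[{'Name': 'TableName', 'Value': 'inventory'},
                {'Name': 'ReceivingRegion', 'Value': 'us-west-2'}],
    Statistic='Average',
    Period=60,               # seconds per datapoint
    EvaluationPeriods=15,    # alarm on a sustained increase, not a brief spike
    Threshold=180000.0,      # milliseconds
    ComparisonOperator='GreaterThanThreshold',
)

# import boto3
# boto3.client('cloudwatch').put_metric_alarm(**alarm)
```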

## Frequently Asked Questions (FAQ) for deploying global tables
<a name="bp-global-table-design.prescriptive-guidance.faq"></a>

**What is the pricing for global tables?**
+ A write operation in a traditional DynamoDB table is priced in write capacity units (WCUs, for provisioned tables) or write request units (WRUs, for on-demand tables). If you write a 5 KB item, it incurs a charge of 5 units. A write to a global table is priced in replicated write capacity units (rWCUs, for provisioned tables) or replicated write request units (rWRUs, for on-demand tables). rWCUs and rWRUs are priced the same as WCUs and WRUs.
+ rWCU and rWRU charges are incurred in every Region where the item is written directly or written through replication. Cross-Region data transfer fees apply.
+ Writing to a global secondary index (GSI) is considered a local write operation and uses regular write units.
+ There is no reserved capacity available for rWCUs or rWRUs at this time. Purchasing reserved capacity for WCUs can be beneficial for tables where GSIs consume write units.
+ When you add a new Region to a global table, DynamoDB bootstraps the new Region automatically and charges you as if it were a table restore, based on the GB size of the table. It also charges cross-Region data transfer fees.
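
A quick worked example of the replicated-write pricing described above (the item size and Region count are illustrative):

```
import math

def replicated_write_units(item_size_kb, num_regions):
    """Write units for one item write to a global table: one unit per 1 KB
    (rounded up), incurred in every Region that stores the item."""
    return math.ceil(item_size_kb) * num_regions

# A 5 KB item written to a 3-Region global table consumes 5 units in each
# Region, or 15 replicated write units in total.
```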

**Which Regions do global tables support?**

[Global Tables version 2019.11.21 (Current)](GlobalTables.md) supports all AWS Regions for MREC tables and the following Region sets for MRSC tables:
+ US Region set: US East (N. Virginia), US East (Ohio), US West (Oregon)
+ EU Region set: Europe (Ireland), Europe (London), Europe (Paris), Europe (Frankfurt)
+ AP Region set: Asia Pacific (Tokyo), Asia Pacific (Seoul), and Asia Pacific (Osaka)

**How are GSIs handled with global tables?**

In [Global Tables version 2019.11.21 (Current)](GlobalTables.md), when you create a GSI in one Region it’s automatically created in other participating Regions and automatically backfilled. 

**How do I stop replication of a global table?** 
+ You can delete a replica table the same way you would delete any other table. Deleting a replica stops replication to that Region and deletes the copy of the table kept in that Region. However, you can't stop replication while keeping copies of the table as independent entities, nor can you pause replication.
+ An MRSC table must be deployed in exactly three Regions. You can't remove a single replica; you must delete all the replicas and the witness so that the MRSC table becomes a single-Region table.

**How do DynamoDB Streams interact with global tables?**
+ Each global table produces an independent stream based on all its write operations, wherever they started from. You can choose to consume the DynamoDB stream in one Region or in all Regions (independently). If you want to process local but not replicated write operations, you can add your own Region attribute to each item to identify the writing Region. You can then use a Lambda event filter to call the Lambda function only for write operations in the local Region. This helps with insert and update operations, but not delete operations.
+ Global tables that are configured for multi-Region eventual consistency (MREC tables) replicate changes by reading those changes from a DynamoDB stream on a replica table and applying that change to all other replica tables. Therefore, DynamoDB Streams is enabled by default on all replicas in an MREC global table and cannot be disabled on those replicas. The MREC replication process can combine multiple changes in a short period of time into a single replicated write operation. As a result, each replica's stream might contain slightly different records. DynamoDB Streams records on MREC replicas are always ordered on a per-item basis, but ordering between items might differ between replicas.
+ Global tables that are configured for multi-Region strong consistency (MRSC tables) don’t use DynamoDB Streams for replication, so this feature isn’t enabled by default on MRSC replicas. You can enable DynamoDB Streams on an MRSC replica. DynamoDB Streams records on MRSC replicas are identical for every replica and are always ordered on a per-item basis, but ordering between items might differ between replicas.
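
The per-Region stream filtering described above can be sketched as a Lambda event filter. Here the item attribute `writeRegion` is a hypothetical attribute that your application writes with each item; the filter pattern follows the standard Lambda event-filtering syntax for DynamoDB stream records:

```
import json

# Invoke the Lambda function only for writes that originated locally
# (items whose application-maintained 'writeRegion' attribute is us-east-1).
local_region_filter = {
    'Filters': [
        {'Pattern': json.dumps({
            'dynamodb': {
                'NewImage': {
                    'writeRegion': {'S': ['us-east-1']}
                }
            }
        })}
    ]
}

# import boto3
# boto3.client('lambda').create_event_source_mapping(
#     EventSourceArn='arn:aws:dynamodb:...',  # the replica's stream ARN
#     FunctionName='process-local-writes',    # hypothetical function name
#     StartingPosition='LATEST',
#     FilterCriteria=local_region_filter)
```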

**How do global tables handle transactions?** 
+ Transactional operations on MRSC tables will generate errors.
+ Transactional operations on MREC tables provide atomicity, consistency, isolation, and durability (ACID) guarantees only within the Region where the write operation originally occurred. Transactions are not supported across Regions in global tables. For example, if you have an MREC global table with replicas in the US East (Ohio) and US West (Oregon) Regions and perform a `TransactWriteItems` operation in the US East (Ohio) Region, you might observe partially completed transactions in the US West (Oregon) Region as changes are replicated. Changes are replicated to other Regions only after they have been committed in the source Region.

**How do global tables interact with the DynamoDB Accelerator (DAX) cache?**

Global tables bypass DAX by updating DynamoDB directly, so DAX isn’t aware that it’s holding stale data. The DAX cache is refreshed only when the cache’s TTL expires.

**Do tags on tables propagate?**

No, tags do not automatically propagate.

**Should I back up tables in all Regions or just one?**

The answer depends on the purpose of the backup.
+ If your goal is data durability, no backup is needed: DynamoDB itself ensures the durability of your data.
+ If you want to keep a snapshot for historical records (for example, to meet regulatory requirements), backing up in one Region should suffice. You can copy the backup to additional Regions by using AWS Backup.
+ If you want to recover erroneously deleted or modified data, use [DynamoDB point-in-time recovery (PITR)](PointInTimeRecovery_Howitworks.md) in one Region.

**How do I deploy global tables using CloudFormation?**
+ CloudFormation represents a DynamoDB table and a global table as two separate resources: `AWS::DynamoDB::Table` and `AWS::DynamoDB::GlobalTable`. One approach is to create all tables that can potentially be global by using the `GlobalTable` construct, keeping them as standalone tables initially, and adding Regions later if necessary.
+ In CloudFormation, each global table is controlled by a single stack, in a single Region, regardless of the number of replicas. When you deploy your template, CloudFormation creates and updates all replicas as part of a single stack operation. You should not deploy the same [AWS::DynamoDB::GlobalTable](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-dynamodb-globaltable.html) resource in multiple Regions. This will result in errors and is unsupported. If you deploy your application template in multiple Regions, you can use conditions to create the `AWS::DynamoDB::GlobalTable` resource in a single Region. Alternatively, you can choose to define your `AWS::DynamoDB::GlobalTable` resources in a stack that’s separate from your application stack, and make sure that it’s deployed to a single Region. 
+ If you have a regular table and want to convert it to a global table while keeping it managed by CloudFormation, set the deletion policy to `Retain`, remove the table from the stack, convert the table to a global table in the console, and then import the global table into the stack as a new resource. For more information, see the [AWS GitHub repository](https://github.com/aws-samples/amazon-dynamodb-table-to-global-table-cdk).
+ Cross-account replication is not supported at this time.
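
A minimal sketch of the single-stack, single-Region approach described above (the table name, Regions, and condition name are hypothetical; check the `AWS::DynamoDB::GlobalTable` resource reference for the full property list):

```
Conditions:
  IsPrimaryRegion: !Equals [!Ref "AWS::Region", us-east-1]

Resources:
  MyGlobalTable:
    Type: AWS::DynamoDB::GlobalTable
    Condition: IsPrimaryRegion   # create the resource in one Region only
    Properties:
      TableName: my-global-table
      BillingMode: PAY_PER_REQUEST
      AttributeDefinitions:
        - AttributeName: pk
          AttributeType: S
      KeySchema:
        - AttributeName: pk
          KeyType: HASH
      StreamSpecification:
        StreamViewType: NEW_AND_OLD_IMAGES
      Replicas:
        - Region: us-east-1
        - Region: us-west-2
```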

## Conclusion and resources
<a name="bp-global-table-design.prescriptive-guidance-resources-conclusion"></a>

DynamoDB global tables have very few controls but still require careful consideration. You must determine your write mode, routing model, and evacuation processes. You must instrument your application across every Region and be ready to adjust your routing or perform an evacuation to maintain global health. The reward is having a globally distributed dataset with low-latency read and write operations that is designed for 99.999% availability.

For more information about DynamoDB global tables, see the following resources:
+ [DynamoDB documentation](https://docs.aws.amazon.com/dynamodb/)
+ [Amazon Application Recovery Controller](https://aws.amazon.com/application-recovery-controller/)
+ [Readiness check in ARC](https://docs.aws.amazon.com/r53recovery/latest/dg/recovery-readiness.html) (AWS documentation)
+ [Route 53 routing policies](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/routing-policy.html)
+ [AWS Global Accelerator](https://aws.amazon.com/global-accelerator/)
+ [DynamoDB service-level agreement](https://aws.amazon.com/dynamodb/sla/)
+ [AWS Multi-Region Fundamentals](https://docs.aws.amazon.com/prescriptive-guidance/latest/aws-multi-region-fundamentals/introduction.html) (AWS whitepaper)
+ [Data resiliency design patterns with AWS](https://www.youtube.com/watch?v=7IA48SOX20c) (AWS re:Invent 2022 presentation)
+ [How Fidelity Investments and Reltio modernized with Amazon DynamoDB](https://www.youtube.com/watch?v=QUpV5MDu4Ys&t=706s) (AWS re:Invent 2022 presentation)
+ [Multi-Region design patterns and best practices](https://www.youtube.com/watch?v=ilgpzlE7Hds&t=1882s) (AWS re:Invent 2022 presentation)
+ [Disaster Recovery (DR) Architecture on AWS, Part III: Pilot Light and Warm Standby](https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-iii-pilot-light-and-warm-standby/) (AWS blog post)
+ [Use Region pinning to set a home Region for items in an Amazon DynamoDB global table](https://aws.amazon.com/blogs/database/use-region-pinning-to-set-a-home-region-for-items-in-an-amazon-dynamodb-global-table/) (AWS blog post)
+ [Monitoring Amazon DynamoDB for operational awareness](https://aws.amazon.com/blogs/database/monitoring-amazon-dynamodb-for-operational-awareness/) (AWS blog post)
+ [Scaling DynamoDB: How partitions, hot keys, and split for heat impact performance](https://aws.amazon.com/blogs/database/part-3-scaling-dynamodb-how-partitions-hot-keys-and-split-for-heat-impact-performance/) (AWS blog post)
+ [Multi-Region strong consistency with DynamoDB global tables ](https://www.youtube.com/watch?v=R-nTs8ZD8mA)(AWS re:Invent 2024 presentation)

# Best practices for managing the control plane in DynamoDB
<a name="bp-control-plane"></a>

**Note**  
DynamoDB is introducing a control plane throttle limit of 2,500 requests per second, with the option to retry throttled requests. See the best practices below for additional details.

DynamoDB control plane operations let you manage DynamoDB tables as well as objects that are dependent on tables such as indexes. For more information about these operations, see [Control plane](HowItWorks.API.md#HowItWorks.API.ControlPlane). 

In some circumstances, you may need to take actions and use data returned by control plane calls as part of your business logic. For example, you might need to know the value of `ProvisionedThroughput` returned by `DescribeTable`. In these circumstances, follow these best practices:
+ Do not excessively query the DynamoDB control plane.
+ Do not mix control plane calls and data plane calls within the same code.
+ Handle throttles on control plane requests and retry with a backoff.
+ Invoke and track changes to a particular resource from a single client.
+ Instead of retrieving data for the same table multiple times at short intervals, cache the data for processing.
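
The caching and single-client guidance above can be sketched as a small wrapper around `DescribeTable` (the class name and TTL value are arbitrary):

```
import time

class TableDescriptionCache:
    """Serve DescribeTable results from a local cache so that repeated
    lookups don't hit the DynamoDB control plane."""

    def __init__(self, describe_fn, ttl_seconds=300):
        self._describe = describe_fn  # e.g. a boto3 client's describe_table
        self._ttl = ttl_seconds
        self._cache = {}              # table name -> (fetched_at, result)

    def get(self, table_name):
        entry = self._cache.get(table_name)
        if entry and time.monotonic() - entry[0] < self._ttl:
            return entry[1]           # fresh enough: no control plane call
        result = self._describe(TableName=table_name)
        self._cache[table_name] = (time.monotonic(), result)
        return result

# Usage sketch:
# cache = TableDescriptionCache(boto3.client('dynamodb').describe_table)
# throughput = cache.get('inventory')['Table']['ProvisionedThroughput']
```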

# Best practices for using bulk data operations in DynamoDB
<a name="BestPractices_BulkDataOperations"></a>

DynamoDB supports batch operations such as `BatchWriteItem`, which lets you perform up to 25 `PutItem` and `DeleteItem` requests together. However, `BatchWriteItem` doesn't support `UpdateItem` operations. For bulk updates, the right approach depends on your requirements and the nature of the update. You can use other DynamoDB APIs such as `TransactWriteItems` for batches of up to 100 items. When more items are involved, you can use services such as AWS Glue, Amazon EMR, or AWS Step Functions, or use custom scripts and tools like DynamoDB-shell for bulk updates.

**Topics**
+ [Conditional batch update](BestPractices_ConditionalBatchUpdate.md)
+ [Efficient bulk operations](BestPractices_EfficientBulkOperations.md)

# Conditional batch update
<a name="BestPractices_ConditionalBatchUpdate"></a>

DynamoDB supports batch operations such as `BatchWriteItem`, which lets you perform up to 25 `PutItem` and `DeleteItem` requests in a single batch. However, `BatchWriteItem` doesn't support `UpdateItem` operations or condition expressions. As a workaround, you can use other DynamoDB APIs such as `TransactWriteItems` for batches of up to 100 items.

When more items are involved, and a major chunk of data needs to be changed, you can use services such as AWS Glue, Amazon EMR, AWS Step Functions or use custom scripts and tools like DynamoDB-shell for efficient bulk updates.

**When to use this pattern**
+ DynamoDB-shell is not supported for production use cases.
+ `TransactWriteItems` – up to 100 individual updates, with or without conditions, executed as an all-or-nothing ACID bundle. `TransactWriteItems` calls can also be supplied with a `ClientRequestToken` if your application requires idempotency, meaning that multiple identical calls have the same effect as a single call. This ensures that you don't execute the same transaction multiple times and end up with an incorrect state of data.

  Trade-off – Additional throughput is consumed: 2 WCUs per 1 KB write instead of the standard 1 WCU per 1 KB write.
+ PartiQL `BatchExecuteStatement` – up to 25 updates with or without conditions. `BatchExecuteStatement` always returns a success response to the overall request, and also returns a list of individual operation responses that preserves order.

  Trade-off – For larger batches, additional client-side logic is required to distribute requests in batches of 25. Individual error responses need to be considered to determine retry strategy.
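
The client-side batching logic mentioned in the trade-off above, splitting a large set of statements into groups of 25 and inspecting the per-statement responses, can be sketched as follows (the retry handling shown is only illustrative):

```
def chunked(statements, size=25):
    """Yield successive batches that fit BatchExecuteStatement's
    25-statement limit."""
    for i in range(0, len(statements), size):
        yield statements[i:i + size]

# Usage sketch:
# for batch in chunked(all_statements):
#     response = client.batch_execute_statement(Statements=batch)
#     for statement, result in zip(batch, response['Responses']):
#         if 'Error' in result:
#             pass  # decide per-error whether to retry or surface it
```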

## Code examples
<a name="bp-conditional-code-examples"></a>

These code examples use the boto3 library, which is the AWS SDK for Python. The examples assume you have boto3 installed and configured with appropriate AWS credentials.

Assume an inventory database for an electrical appliance vendor who has multiple warehouses across European cities. Because it's the end of summer, the vendor would like to clear out desk fans to make room for other stock. The vendor wants to provide a price discount for all desk fans supplied out of warehouses in Italy, but only for warehouses with a reserve stock of more than 20 desk fans. The DynamoDB table is called **inventory**. Its key schema consists of the partition key **sku**, a unique identifier for each product, and the sort key **warehouse**, an identifier for a warehouse.

The following Python code demonstrates how to perform this conditional batch update by using the `BatchExecuteStatement` API.

```
import boto3

client = boto3.client("dynamodb")

# Read the current state of SKU F123 across all Italian (WIT*) warehouses.
before_image = client.query(
    TableName='inventory',
    KeyConditionExpression='sku = :pk_val AND begins_with(warehouse, :sk_val)',
    ExpressionAttributeValues={':pk_val': {'S': 'F123'}, ':sk_val': {'S': 'WIT'}},
    ProjectionExpression='sku,warehouse,quantity,price')
print("Before update: ", before_image['Items'])

# Apply a 5-euro discount in each warehouse, but only where the condition
# quantity > 20 holds for that item.
warehouses = ['WITTUR1', 'WITROM1', 'WITROM2', 'WITROM5',
              'WITVEN1', 'WITVEN2', 'WITVEN3']
response = client.batch_execute_statement(
    Statements=[
        {'Statement': 'UPDATE inventory SET price=price-5 '
                      'WHERE sku=? AND warehouse=? AND quantity > 20',
         'Parameters': [{'S': 'F123'}, {'S': warehouse}],
         'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}
        for warehouse in warehouses
    ],
    ReturnConsumedCapacity='TOTAL')

# Read the state again to confirm which items were discounted.
after_image = client.query(
    TableName='inventory',
    KeyConditionExpression='sku = :pk_val AND begins_with(warehouse, :sk_val)',
    ExpressionAttributeValues={':pk_val': {'S': 'F123'}, ':sk_val': {'S': 'WIT'}},
    ProjectionExpression='sku,warehouse,quantity,price')
print("After update: ", after_image['Items'])
```

Running this code against sample data produces the following output:

```
Before update:  [{'quantity': {'N': '20'}, 'warehouse': {'S': 'WITROM1'}, 'price': {'N': '40'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '25'}, 'warehouse': {'S': 'WITROM2'}, 'price': {'N': '40'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '28'}, 'warehouse': {'S': 'WITROM5'}, 'price': {'N': '38'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '26'}, 'warehouse': {'S': 'WITTUR1'}, 'price': {'N': '40'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '10'}, 'warehouse': {'S': 'WITVEN1'}, 'price': {'N': '38'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '20'}, 'warehouse': {'S': 'WITVEN2'}, 'price': {'N': '38'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '50'}, 'warehouse': {'S': 'WITVEN3'}, 'price': {'N': '35'}, 'sku': {'S': 'F123'}}]
After update:  [{'quantity': {'N': '20'}, 'warehouse': {'S': 'WITROM1'}, 'price': {'N': '40'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '25'}, 'warehouse': {'S': 'WITROM2'}, 'price': {'N': '35'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '28'}, 'warehouse': {'S': 'WITROM5'}, 'price': {'N': '33'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '26'}, 'warehouse': {'S': 'WITTUR1'}, 'price': {'N': '35'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '10'}, 'warehouse': {'S': 'WITVEN1'}, 'price': {'N': '38'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '20'}, 'warehouse': {'S': 'WITVEN2'}, 'price': {'N': '38'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '50'}, 'warehouse': {'S': 'WITVEN3'}, 'price': {'N': '30'}, 'sku': {'S': 'F123'}}]
```

Because this is a bounded operation for an internal system, idempotency requirements haven't been considered. To make the updates more robust, you can add guardrails, for example, allowing the price update to go through only if the current price is greater than 35 and less than 40.
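
As a sketch of such a guardrail (the price band and the warehouse value are illustrative), the extra predicates go into the PartiQL `WHERE` clause, where non-key conditions behave like a condition expression:

```
# Request for ExecuteStatement; pass it to a boto3 DynamoDB client as
# client.execute_statement(**request). The price band below is illustrative.
request = {
    "Statement": (
        "UPDATE inventory SET price=price-5 "
        "WHERE sku=? AND warehouse=? AND price > 35 AND price < 40"
    ),
    "Parameters": [{"S": "F123"}, {"S": "WITVEN2"}],
}
```

If the current price falls outside the band, the statement fails its condition check instead of applying the discount a second time.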

Alternatively, you can perform the same batch update using `TransactWriteItems` when you have stricter idempotency and ACID requirements. However, remember that either all of the operations in the transaction succeed or the entire transaction fails.

Let’s assume there is a heatwave in Italy and the demand for desk fans has increased sharply. The vendor wants to increase the price of desk fans shipping from every warehouse in Italy by 20 Euros, but the regulatory body allows the increase only if the current price is less than 70 Euros across the entire inventory. It's essential that the price is updated throughout the inventory exactly once, and only if the price is less than 70 Euros in each warehouse.

The following Python code demonstrates how to perform this batch update using the `TransactWriteItems` API.

```
import boto3

client=boto3.client("dynamodb")

before_image=client.query(TableName='inventory', KeyConditionExpression='sku=:pk_val AND begins_with(warehouse, :sk_val)', ExpressionAttributeValues={':pk_val':{'S':'F123'},':sk_val':{'S':'WIT'}}, ProjectionExpression='sku,warehouse,quantity,price')
print("Before update: ", before_image['Items'])

response=client.transact_write_items(
        ClientRequestToken='UUIDAWS124',
        TransactItems=[
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITTUR1'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITROM1'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITROM2'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITROM5'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITVEN1'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITVEN2'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
            {'Update': { 'Key': {'sku': {'S':'F123'}, 'warehouse': {'S':'WITVEN3'}}, 'UpdateExpression': 'SET price = price + :inc', 'ConditionExpression': 'price < :cap', 'ExpressionAttributeValues': { ':inc': {'N': '20'}, ':cap': {'N': '70'}}, 'TableName': 'inventory', 'ReturnValuesOnConditionCheckFailure': 'ALL_OLD'}},
        ],
        ReturnConsumedCapacity='TOTAL'
    )

after_image=client.query(TableName='inventory', KeyConditionExpression='sku=:pk_val AND begins_with(warehouse, :sk_val)', ExpressionAttributeValues={':pk_val':{'S':'F123'},':sk_val':{'S':'WIT'}}, ProjectionExpression='sku,warehouse,quantity,price')
print("After update: ", after_image['Items'])
```

Running this code against the sample data produces the following output:

```
Before update:  [{'quantity': {'N': '20'}, 'warehouse': {'S': 'WITROM1'}, 'price': {'N': '60'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '25'}, 'warehouse': {'S': 'WITROM2'}, 'price': {'N': '55'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '28'}, 'warehouse': {'S': 'WITROM5'}, 'price': {'N': '53'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '26'}, 'warehouse': {'S': 'WITTUR1'}, 'price': {'N': '55'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '10'}, 'warehouse': {'S': 'WITVEN1'}, 'price': {'N': '58'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '20'}, 'warehouse': {'S': 'WITVEN2'}, 'price': {'N': '58'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '50'}, 'warehouse': {'S': 'WITVEN3'}, 'price': {'N': '50'}, 'sku': {'S': 'F123'}}]
After update:  [{'quantity': {'N': '20'}, 'warehouse': {'S': 'WITROM1'}, 'price': {'N': '80'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '25'}, 'warehouse': {'S': 'WITROM2'}, 'price': {'N': '75'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '28'}, 'warehouse': {'S': 'WITROM5'}, 'price': {'N': '73'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '26'}, 'warehouse': {'S': 'WITTUR1'}, 'price': {'N': '75'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '10'}, 'warehouse': {'S': 'WITVEN1'}, 'price': {'N': '78'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '20'}, 'warehouse': {'S': 'WITVEN2'}, 'price': {'N': '78'}, 'sku': {'S': 'F123'}}, {'quantity': {'N': '50'}, 'warehouse': {'S': 'WITVEN3'}, 'price': {'N': '70'}, 'sku': {'S': 'F123'}}]
```

There are multiple approaches to performing batch updates in DynamoDB. The right approach depends on factors such as ACID or idempotency requirements, the number of items to update, and your familiarity with the APIs.

# Efficient bulk operations
<a name="BestPractices_EfficientBulkOperations"></a>

**When to use this pattern**

These patterns are useful for efficiently performing bulk updates on DynamoDB items. Note that DynamoDB Shell, covered last in this list, is not supported for production use cases.
+ `TransactWriteItems` – up to 100 individual updates, with or without conditions, executed as an all-or-nothing ACID bundle 

  Trade-off – Additional throughput is consumed: 2 WCUs per 1 KB written, compared with 1 WCU for a standard write.
+ PartiQL `BatchExecuteStatement` – up to 25 updates with or without conditions

  Trade-off – Additional logic is required to distribute requests in batches of 25.
+ AWS Step Functions – rate-limited bulk operations for developers familiar with AWS Lambda.

  Trade-off – Execution time is inversely proportional to the rate limit and is bounded by the maximum Lambda function timeout. Data changes that occur between the read and the write may be overwritten. For more information, see [Backfilling an Amazon DynamoDB Time to Live attribute using Amazon EMR: Part 2](https://aws.amazon.com/blogs/database/part-2-backfilling-an-amazon-dynamodb-time-to-live-attribute-using-amazon-emr/).
+ AWS Glue and Amazon EMR – rate-limited bulk operations with managed parallelism. For applications or updates that are not time-sensitive, these options can run in the background, consuming only a small percentage of throughput. Both services use the emr-dynamodb-connector to perform DynamoDB operations: a large read followed by a large write of updated items, with an option to rate-limit.

  Trade-off – Execution time is inversely proportional to the rate limit. Data changes that occur between the read and the write can be overwritten. You can't read from global secondary indexes (GSIs). For more information, see [Backfilling an Amazon DynamoDB Time to Live attribute using Amazon EMR: Part 2](https://aws.amazon.com/blogs/database/part-2-backfilling-an-amazon-dynamodb-time-to-live-attribute-using-amazon-emr/).
+ DynamoDB Shell – rate-limited bulk operations using SQL-like queries. You can read from GSIs for better efficiency.

  Trade-off – Execution time is inversely proportional to rate-limit. See [Rate limited bulk operations in DynamoDB Shell](https://aws.amazon.com/blogs/database/rate-limited-bulk-operations-in-dynamodb-shell/).
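
The rate-limiting idea shared by the Step Functions, EMR, and DynamoDB Shell patterns can be sketched in plain Python: pace the writes so that consumed capacity stays under a budget. The item source and write function below are placeholders, not a real DynamoDB client:

```
import time

def rate_limited_updates(items, write_fn, wcu_limit_per_sec, wcu_per_item=1):
    """Apply write_fn to each item, pacing writes to stay under the WCU budget."""
    interval = wcu_per_item / wcu_limit_per_sec  # seconds of budget each write costs
    next_slot = time.monotonic()
    for item in items:
        now = time.monotonic()
        if now < next_slot:
            time.sleep(next_slot - now)          # wait until the budget refills
        write_fn(item)
        next_slot = max(now, next_slot) + interval

# Example: 5 placeholder writes paced against a 100 WCU/s budget.
written = []
rate_limited_updates(range(5), written.append, wcu_limit_per_sec=100)
```

Lowering `wcu_limit_per_sec` stretches the same job over a longer wall-clock time, which is exactly the cost versus time trade-off discussed below.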

## Using the pattern
<a name="BestPractices_EfficientBulkOperations_UsingThePattern"></a>

Bulk updates can have significant cost implications, especially if you use the on-demand throughput mode. If you use the provisioned throughput mode, there's a trade-off between speed and cost: setting the rate-limit parameter too strictly can lead to very long processing times. You can roughly estimate the speed of the update from the average item size and the rate limit.
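
As a rough sizing sketch (all numbers are illustrative): a standard write costs 1 WCU per 1 KB rounded up, so a rate limit in WCU/s translates directly into items per second:

```
import math

def estimate_duration_seconds(item_count, avg_item_size_kb, wcu_rate_limit):
    """Rough runtime estimate for a rate-limited bulk update job.

    A standard write costs 1 WCU per 1 KB (rounded up), so the WCU rate
    limit caps how many items per second the job can update.
    """
    wcu_per_item = math.ceil(avg_item_size_kb)
    items_per_second = wcu_rate_limit / wcu_per_item
    return item_count / items_per_second

# Example: 10 million items of 1.5 KB each at a 500 WCU/s budget.
hours = estimate_duration_seconds(10_000_000, 1.5, 500) / 3600
```

Inverting the same arithmetic gives the throughput needed to finish within a target duration.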

Alternatively, you can determine the amount of throughput needed based on the expected duration of the update process and the average item size. The blog posts referenced with each pattern provide details on the strategy, implementation, and limitations of the pattern. For more information, see [Cost-effective bulk processing with Amazon DynamoDB](https://aws.amazon.com/blogs/database/cost-effective-bulk-processing-with-amazon-dynamodb/).

There are multiple approaches to performing bulk updates against a live DynamoDB table. The right approach depends on factors such as ACID or idempotency requirements, the number of items to update, and your familiarity with the APIs. It's important to consider the cost versus time trade-off; most of the approaches discussed above provide an option to rate-limit the throughput used by the bulk update job.

# Best practices for handling concurrent updates in DynamoDB
<a name="BestPractices_ImplementingVersionControl"></a>

In distributed systems, multiple processes or users may attempt to modify the same data at the same time. Without concurrency control, these concurrent writes can lead to lost updates, inconsistent data, or race conditions. DynamoDB provides several mechanisms to help you manage concurrent access and maintain data integrity.

**Note**  
Individual write operations such as `UpdateItem` are atomic and always operate on the most recent version of the item, regardless of concurrency. Locking strategies are needed when your application must read an item and then write it back based on the read value (a read-modify-write cycle), because another process could modify the item between the read and the write.
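
For example, a quantity decrement needs no locking, because DynamoDB evaluates the update expression atomically on the server; a lock only becomes necessary when the new value is computed client-side from a previously read value. A minimal sketch of the lock-free request (table and attribute names are illustrative):

```
# UpdateItem request for an atomic, lock-free decrement; pass it to a boto3
# DynamoDB client as client.update_item(**request). No prior read is needed,
# so there is no read-modify-write window for another writer to slip into.
request = {
    "TableName": "Inventory",
    "Key": {"ItemID": {"S": "Bananas"}},
    "UpdateExpression": "SET QuantityLeft = QuantityLeft - :qty",
    "ConditionExpression": "QuantityLeft >= :qty",  # never go negative
    "ExpressionAttributeValues": {":qty": {"N": "2"}},
}
```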

There are two primary strategies for handling concurrent updates:
+ **Optimistic locking** – Assumes conflicts are rare. It allows concurrent access and detects conflicts at write time using conditional writes. If a conflict is detected, the write fails and the application can retry.
+ **Pessimistic locking** – Assumes conflicts are likely. It prevents concurrent access by acquiring exclusive access to a resource before modifying it. Other processes must wait until the lock is released.

The following table summarizes the approaches available in DynamoDB:


| Approach | Mechanism | Best for | 
| --- | --- | --- | 
| Optimistic locking | Version attribute and conditional writes | Low contention, inexpensive retries | 
| Pessimistic locking (transactions) | TransactWriteItems | Multi-item atomicity, moderate contention | 
| Pessimistic locking (lock client) | Dedicated lock table with lease and heartbeat | Long-running workflows, distributed coordination | 

# Optimistic locking with version number
<a name="BestPractices_OptimisticLocking"></a>

Optimistic locking is a strategy that detects conflicts at write time rather than preventing them. Each item includes a version attribute that increments with every update. When updating an item, you include a [condition expression](Expressions.ConditionExpressions.md) that checks whether the version number matches the value your application last read. If another process modified the item in the meantime, the condition fails and DynamoDB returns a `ConditionalCheckFailedException`.

## When to use optimistic locking
<a name="BestPractices_OptimisticLocking_WhenToUse"></a>

Optimistic locking is a good fit when:
+ Multiple users or processes may update the same item, but conflicts are infrequent.
+ Retrying a failed write is inexpensive for your application.
+ You want to avoid the overhead and complexity of managing distributed locks.

Common examples include e-commerce inventory updates, collaborative editing platforms, and financial transaction records.

## Tradeoffs
<a name="BestPractices_OptimisticLocking_Tradeoffs"></a>

**Retry overhead in high contention**  
In high-concurrency environments, the likelihood of conflicts increases, potentially causing higher retries and write costs.

**Implementation complexity**  
Adding version control to items and handling conditional checks adds complexity to the application logic. The AWS SDK for Java v2 Enhanced Client provides built-in support through the [`@DynamoDbVersionAttribute`](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/ddb-en-client-extensions.html#ddb-en-client-extensions-VRE) annotation, which automatically manages version numbers for you.

## Pattern design
<a name="BestPractices_OptimisticLocking_PatternDesign"></a>

Include a version attribute in each item. Here is a simple schema design:
+ Partition key – A unique identifier for each item (for example, `ItemId`).
+ Attributes:
  + `ItemId` – The unique identifier for the item.
  + `Version` – An integer that represents the version number of the item.
  + `QuantityLeft` – The remaining inventory of the item.

When an item is first created, the `Version` attribute is set to 1. With each update, the version number increments by 1.


| ItemID (partition key) | Version | QuantityLeft | 
| --- | --- | --- | 
| Bananas | 1 | 10 | 
| Apples | 1 | 5 | 
| Oranges | 1 | 7 | 
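
The initial create can itself be guarded with a condition expression so that two concurrent creates cannot both succeed. A minimal sketch (the `Inventory` table name is an assumption):

```
# PutItem request that creates the item at Version 1; pass it to a boto3
# DynamoDB client as client.put_item(**request). The condition makes the
# create fail with ConditionalCheckFailedException if the item already exists.
request = {
    "TableName": "Inventory",
    "Item": {
        "ItemID": {"S": "Bananas"},
        "Version": {"N": "1"},
        "QuantityLeft": {"N": "10"},
    },
    "ConditionExpression": "attribute_not_exists(ItemID)",
}
```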

## Implementation
<a name="BestPractices_OptimisticLocking_Implementation"></a>

To implement optimistic locking, follow these steps:

1. Read the current version of the item.

   ```
   import boto3
   from botocore.exceptions import ClientError

   # Table name assumed to be 'Inventory', matching the examples below
   table = boto3.resource('dynamodb').Table('Inventory')

   def get_item(item_id):
       response = table.get_item(Key={'ItemID': item_id})
       return response['Item']

   item = get_item('Bananas')
   current_version = item['Version']
   ```

1. Update the item using a condition expression that checks the version number.

   ```
   def update_item(item_id, qty_bought, current_version):
       try:
           response = table.update_item(
               Key={'ItemID': item_id},
               UpdateExpression="SET QuantityLeft = QuantityLeft - :qty, Version = :new_v",
               ConditionExpression="Version = :expected_v",
               ExpressionAttributeValues={
                   ':qty': qty_bought,
                   ':new_v': current_version + 1,
                   ':expected_v': current_version
               },
               ReturnValues="UPDATED_NEW"
           )
           return response
       except ClientError as e:
           if e.response['Error']['Code'] == 'ConditionalCheckFailedException':
               print("Version conflict: another process updated this item.")
           raise
   ```

1. Handle conflicts by retrying with a fresh read.

   Each retry requires an additional read, so limit the total number of retries.

   ```
   def update_with_retry(item_id, qty_bought, max_retries=3):
       for attempt in range(max_retries):
           item = get_item(item_id)
           try:
               return update_item(item_id, qty_bought, item['Version'])
           except ClientError as e:
               if e.response['Error']['Code'] != 'ConditionalCheckFailedException':
                   raise
               print(f"Retry {attempt + 1}/{max_retries}")
       raise Exception("Update failed after maximum retries.")
   ```

For Java applications, the AWS SDK for Java v2 Enhanced Client provides built-in optimistic locking support through the [`@DynamoDbVersionAttribute`](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/ddb-en-client-extensions.html#ddb-en-client-extensions-VRE) annotation, which automatically manages version numbers for you.

For more information about condition expressions, see [DynamoDB condition expression CLI example](Expressions.ConditionExpressions.md).

# Pessimistic locking with DynamoDB transactions
<a name="BestPractices_PessimisticLocking"></a>

DynamoDB [transactions](transactions.md) provide an all-or-nothing approach to grouped operations. When you use `TransactWriteItems`, DynamoDB monitors all items in the transaction. If any item is modified by another operation during the transaction, the entire transaction is canceled and DynamoDB returns a `TransactionCanceledException`. This behavior provides a form of pessimistic concurrency control because conflicting concurrent modifications are prevented rather than detected after the fact.

## When to use transactions for locking
<a name="BestPractices_PessimisticLocking_WhenToUse"></a>

Transactions are a good fit when:
+ You need to update multiple items atomically, either within the same table or across tables.
+ Your business logic requires all-or-nothing semantics – either all changes succeed or none are applied.

Common examples include transferring funds between accounts, placing orders that update both inventory and order tables, and exchanging items between players in a game.

## Tradeoffs
<a name="BestPractices_PessimisticLocking_Tradeoffs"></a>

**Higher write cost**  
For items up to 1 KB, transactions consume 2 WCUs per item (one to prepare, one to commit), compared to 1 WCU for a standard write.

**Item limit**  
A single transaction can include up to 100 actions across one or more tables.

**Conflict sensitivity**  
If any item in the transaction is modified by another operation, the entire transaction fails. In high-contention scenarios, this can lead to frequent cancellations.

## Implementation
<a name="BestPractices_PessimisticLocking_Implementation"></a>

The following example uses `TransactWriteItems` to transfer inventory between two items atomically. If another process modifies either item during the transaction, the entire operation is rolled back.

```
import boto3

client = boto3.client('dynamodb')

def transfer_inventory(source_id, target_id, quantity):
    try:
        client.transact_write_items(
            TransactItems=[
                {
                    'Update': {
                        'TableName': 'Inventory',
                        'Key': {'ItemID': {'S': source_id}},
                        'UpdateExpression': 'SET QuantityLeft = QuantityLeft - :qty',
                        'ConditionExpression': 'QuantityLeft >= :qty',
                        'ExpressionAttributeValues': {
                            ':qty': {'N': str(quantity)}
                        }
                    }
                },
                {
                    'Update': {
                        'TableName': 'Inventory',
                        'Key': {'ItemID': {'S': target_id}},
                        'UpdateExpression': 'SET QuantityLeft = QuantityLeft + :qty',
                        'ExpressionAttributeValues': {
                            ':qty': {'N': str(quantity)}
                        }
                    }
                }
            ]
        )
        return True
    except client.exceptions.TransactionCanceledException as e:
        print(f"Transaction canceled: {e}")
        return False
```

In this example, the condition expression checks that sufficient inventory exists, but no version attribute is needed. DynamoDB automatically cancels the transaction if any item in the transaction is modified by another operation between the prepare and commit phases. This is what provides the pessimistic concurrency control – conflicting concurrent modifications are prevented by the transaction itself.

**Note**  
You can combine transactions with optimistic locking by adding version checks as additional condition expressions. This provides an extra layer of protection but is not required for the transaction to detect conflicts.
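
As a sketch of that combination, one `Update` entry in the `TransactItems` list can carry both the business condition and a version check (values are illustrative):

```
# One entry for TransactWriteItems that layers a version check on top of the
# business condition; include it in the TransactItems list passed to a
# transact_write_items call.
update_entry = {
    "Update": {
        "TableName": "Inventory",
        "Key": {"ItemID": {"S": "Bananas"}},
        "UpdateExpression": "SET QuantityLeft = QuantityLeft - :qty, Version = :new_v",
        "ConditionExpression": "QuantityLeft >= :qty AND Version = :expected_v",
        "ExpressionAttributeValues": {
            ":qty": {"N": "2"},
            ":new_v": {"N": "3"},
            ":expected_v": {"N": "2"},
        },
    }
}
```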

For more information, see [Managing complex workflows with DynamoDB transactions](transactions.md).

# Distributed locking with the DynamoDB Lock Client
<a name="BestPractices_DistributedLocking"></a>

For applications that require traditional lock-acquire-release semantics, the DynamoDB Lock Client is an open-source library that implements distributed locking using a DynamoDB table as the lock store. This approach is useful when you need to coordinate access to an external resource (such as an S3 object or a shared configuration) across multiple application instances.

The lock client is available as an open-source [Java library](https://github.com/awslabs/amazon-dynamodb-lock-client).

## How it works
<a name="BestPractices_DistributedLocking_HowItWorks"></a>

The lock client uses a dedicated DynamoDB table to track locks. Each lock is represented as an item with the following key attributes:
+ A partition key that identifies the resource being locked.
+ A lease duration that specifies how long the lock is valid. If the lock holder crashes or becomes unresponsive, the lock automatically expires after the lease duration.
+ A heartbeat that the lock holder sends periodically to extend the lease. This prevents the lock from expiring while the holder is still actively processing.

The lock client uses conditional writes to ensure that only one process can acquire a lock at a time. If a lock is already held, the caller can choose to wait and retry or fail immediately.

## When to use the lock client
<a name="BestPractices_DistributedLocking_WhenToUse"></a>

The lock client is a good fit when:
+ You need to coordinate access to a shared resource across multiple application instances or microservices.
+ The critical section is long-running (seconds to minutes) and retrying the entire operation on conflict would be expensive.
+ You need automatic lock expiry to handle process failures gracefully.

Common examples include orchestrating distributed workflows, coordinating cron jobs across multiple instances, and managing access to shared external resources.

## Tradeoffs
<a name="BestPractices_DistributedLocking_Tradeoffs"></a>

**Additional infrastructure**  
Requires a dedicated DynamoDB table for lock management, with additional read and write capacity for lock operations and heartbeats.

**Clock dependency**  
Lock expiry relies on timestamps. Significant clock skew between clients can cause unexpected behavior, particularly for short lease durations.

**Deadlock risk**  
If your application acquires locks on multiple resources, you must acquire them in a consistent order to avoid deadlocks. The lease duration provides a safety net by automatically releasing locks from unresponsive holders.

## Implementation
<a name="BestPractices_DistributedLocking_Implementation"></a>

The following example shows how to use the DynamoDB Lock Client to acquire and release a lock:

```
import java.io.IOException;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import com.amazonaws.services.dynamodbv2.AcquireLockOptions;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBLockClient;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBLockClientOptions;
import com.amazonaws.services.dynamodbv2.LockItem;
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.dynamodb.DynamoDbClient;

final DynamoDbClient dynamoDB = DynamoDbClient.builder()
    .region(Region.US_WEST_2)
    .build();

final AmazonDynamoDBLockClient lockClient = new AmazonDynamoDBLockClient(
    AmazonDynamoDBLockClientOptions.builder(dynamoDB, "Locks")
        .withTimeUnit(TimeUnit.SECONDS)
        .withLeaseDuration(10L)
        .withHeartbeatPeriod(3L)
        .withCreateHeartbeatBackgroundThread(true)
        .build());

// Try to acquire a lock on a resource
final Optional<LockItem> lock =
    lockClient.tryAcquireLock(AcquireLockOptions.builder("my-shared-resource").build());

if (lock.isPresent()) {
    try {
        // Perform operations that require exclusive access
        processSharedResource();
    } finally {
        // Always release the lock when done
        lockClient.releaseLock(lock.get());
    }
} else {
    System.out.println("Failed to acquire lock.");
}

lockClient.close();
```

**Important**  
Always release locks in a `finally` block to ensure locks are released even if your processing logic throws an exception. Unreleased locks block other processes until the lease expires.

You can also implement a simple locking mechanism without the lock client library by using conditional writes directly. The following example uses `UpdateItem` with a condition expression to acquire a lock, and `DeleteItem` to release it:

```
from datetime import datetime, timedelta
from boto3.dynamodb.conditions import Attr

def acquire_lock(table, resource_name, owner_id, ttl_seconds):
    """Attempt to acquire a lock. Returns True if successful."""
    expiry = (datetime.now() + timedelta(seconds=ttl_seconds)).isoformat()
    now = datetime.now().isoformat()
    try:
        table.update_item(
            Key={'LockID': resource_name},
            UpdateExpression='SET #owner = :owner, #expiry = :expiry',
            ConditionExpression=Attr('LockID').not_exists() | Attr('ExpiresAt').lt(now),
            ExpressionAttributeNames={'#owner': 'OwnerID', '#expiry': 'ExpiresAt'},
            ExpressionAttributeValues={':owner': owner_id, ':expiry': expiry}
        )
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False

def release_lock(table, resource_name, owner_id):
    """Release a lock. Only succeeds if the caller is the lock owner."""
    try:
        table.delete_item(
            Key={'LockID': resource_name},
            ConditionExpression=Attr('OwnerID').eq(owner_id)
        )
        return True
    except table.meta.client.exceptions.ConditionalCheckFailedException:
        return False
```

This approach uses a condition expression to ensure that a lock can only be acquired if it doesn't exist or has expired, and can only be released by the process that acquired it. Consider enabling [Time to Live (TTL)](TTL.md) on the lock table to automatically clean up expired lock items.
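
Putting the two helpers together, a caller should hold the lock only for the critical section and release it in a `finally` block, mirroring the Java example above. A minimal sketch, with `acquire` and `release` as callables (for example, partials over `acquire_lock` and `release_lock` bound to the table, resource name, and owner ID):

```
import uuid

owner_id = str(uuid.uuid4())  # a unique ID per process or worker

def with_lock(acquire, release, critical_section):
    """Run critical_section only if acquire() succeeds; always call release()."""
    if not acquire():
        return False  # another owner holds the lock; the caller can retry later
    try:
        critical_section()
        return True
    finally:
        release()
```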

## Choosing a concurrency control strategy
<a name="BestPractices_ChoosingLockingStrategy"></a>

Use the following guidelines to choose the right approach for your workload:

**Use optimistic locking** when:  
+ Conflicts are infrequent.
+ Retrying a failed write is inexpensive.
+ You are updating a single item at a time.

**Use transactions** when:  
+ You need to update multiple items atomically.
+ You require all-or-nothing semantics across items or tables.
+ You need to combine condition checks with writes in a single operation.

**Use the lock client** when:  
+ You need to coordinate access to external resources across distributed processes.
+ The critical section is long-running and retrying on conflict is expensive.
+ You need automatic lock expiry to handle process failures.

**Note**  
If you use [DynamoDB global tables](GlobalTables.md), be aware that global tables use a "last writer wins" reconciliation strategy for concurrent updates. Optimistic locking with version numbers does not work as expected across Regions because a write in one Region may overwrite a concurrent write in another Region without a version check. Design your application to handle conflicts at the application level when using global tables.

# Best practices for understanding your AWS billing and usage reports in DynamoDB
<a name="bp-understanding-billing"></a>

This document explains the `UsageType` billing codes for charges related to DynamoDB.

AWS provides cost and usage reports (CUR) that contain data for the services you use. You can use AWS Cost and Usage Report to publish billing reports to Amazon S3 in CSV format. When setting up the CUR, you can choose to break time periods down by hour, day, or month, and whether to break out usage by resource ID. For more details on generating the CUR, see [Creating Cost and Usage Reports](https://docs.aws.amazon.com/cur/latest/userguide/creating-cur.html).

Within the CSV export, you will find the relevant attributes listed for each line item. The following are examples of attributes that may be included:
+ **lineitem/UsageStartDate** – The start date and time for the line item in UTC, inclusive.
+ **lineitem/UsageEndDate** – The end date and time for the line item in UTC, exclusive.
+ **lineitem/ProductCode** – For DynamoDB, this is "AmazonDynamoDB".
+ **lineitem/UsageType** – A specific description code for the type of usage, as enumerated in this document.
+ **lineitem/Operation** – A name that provides context for the charge, such as the operation that incurred it (optional).
+ **lineitem/ResourceId** – The identifier for the resource that incurred the usage. Available if the CUR includes a breakdown by resource ID.
+ **lineitem/UsageAmount** – The amount of usage incurred during the specified time period.
+ **lineitem/UnblendedCost** – The cost of this usage.
+ **lineitem/LineItemDescription** – A textual description of the line item.

For more information about the CUR data dictionary, see [Cost and Usage Report (CUR) 2.0](https://docs.aws.amazon.com/cur/latest/userguide/table-dictionary-cur2.html). Note that the exact names vary depending on context. 

A `UsageType` is a string with a value such as `ReadCapacityUnit-Hrs`, `USW2-ReadRequestUnits`, `EU-WriteCapacityUnit-Hrs`, or `USE1-TimedPITRStorage-ByteHrs`. Each usage type begins with an optional Region prefix; if the prefix is absent, the usage occurred in the us-east-1 Region. The following table maps each short billing Region code to the conventional Region code and name.

For example, the usage named `USW2-ReadRequestUnits` indicates read request units consumed in us-west-2. 


****  

| Billing Region Code | Region Code | Region Name | 
| --- | --- | --- | 
| AFS1 | af-south-1 | Africa (Cape Town) | 
| APE1 | ap-east-1 | Asia Pacific (Hong Kong) | 
| APN1 | ap-northeast-1 | Asia Pacific (Tokyo) | 
| APN2 | ap-northeast-2 | Asia Pacific (Seoul) | 
| APN3 | ap-northeast-3 | Asia Pacific (Osaka) | 
| APS1 | ap-southeast-1 | Asia Pacific (Singapore) | 
| APS2 | ap-southeast-2 | Asia Pacific (Sydney) | 
| APS3 | ap-south-1 | Asia Pacific (Mumbai) | 
| APS4 | ap-southeast-3 | Asia Pacific (Jakarta) | 
| APS5 | ap-south-2 | Asia Pacific (Hyderabad) | 
| APS6 | ap-southeast-4 | Asia Pacific (Melbourne) | 
| CAN1 | ca-central-1 | Canada (Central) | 
| EU | eu-west-1 | Europe (Ireland) | 
| EUC1 | eu-central-1 | Europe (Frankfurt) | 
| EUC2 | eu-central-2 | Europe (Zurich) | 
| EUN1 | eu-north-1 | Europe (Stockholm) | 
| EUS1 | eu-south-1 | Europe (Milan) | 
| EUS2 | eu-south-2 | Europe (Spain) | 
| EUW1 | eu-west-1 | Europe (Ireland) | 
| EUW2 | eu-west-2 | Europe (London) | 
| EUW3 | eu-west-3 | Europe (Paris) | 
| ILC1 | il-central-1 | Israel (Tel Aviv) | 
| MEC1 | me-central-1 | Middle East (UAE) | 
| MES1 | me-south-1 | Middle East (Bahrain) | 
| SAE1 | sa-east-1 | South America (São Paulo) | 
| USE1 (default) | us-east-1 | US East (N. Virginia) | 
| USE2 | us-east-2 | US East (Ohio) | 
| UGE1 | us-gov-east-1 | US Government East | 
| UGW1 | us-gov-west-1 | US Government West | 
| USW1 | us-west-1 | US West (N. California) | 
| USW2 | us-west-2 | US West (Oregon) | 
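
As a sketch, splitting the optional prefix off a `UsageType` string takes only a few lines (the mapping shown is a partial excerpt of the table above):

```
# Partial excerpt of the billing Region table above; extend as needed.
BILLING_REGION = {"USW2": "us-west-2", "EU": "eu-west-1", "USE1": "us-east-1"}

def parse_usage_type(usage_type):
    """Split the optional billing Region prefix off a UsageType string.

    When no recognized prefix is present, the usage occurred in us-east-1.
    """
    prefix, sep, rest = usage_type.partition("-")
    if sep and prefix in BILLING_REGION:
        return BILLING_REGION[prefix], rest
    return "us-east-1", usage_type
```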

In the following sections, we use the `REG-UsageType` pattern when going through the charges for DynamoDB, where REG specifies the Region where the usage occurred and UsageType is the code for the type of charge. For example, a line item of `USW1-ReadCapacityUnit-Hrs` in your CSV file means that provisioned read capacity was consumed in us-west-1; in this document, that usage type is listed as `REG-ReadCapacityUnit-Hrs`.

**Topics**
+ [Throughput Capacity](#bp-understanding-billing.throughput)
+ [Streams](#bp-understanding-billing.streams)
+ [Storage](#bp-understanding-billing.storage)
+ [Backup and Restore](#bp-understanding-billing.backup)
+ [Data Transfer](#bp-understanding-billing.datatransfer)
+ [CloudWatch Contributor Insights](#bp-understanding-billing.cw)
+ [DynamoDB Accelerator (DAX)](#bp-understanding-billing.dax)

## Throughput Capacity
<a name="bp-understanding-billing.throughput"></a>

**Provisioned Capacity Reads and Writes**

When you create a DynamoDB table in provisioned capacity mode, you specify the read and write capacity that you expect your application to require. The usage type depends on your table class (Standard or Standard-Infrequent Access). You provision reads and writes as a consumption rate per second, but charges accrue per hour based on the capacity provisioned.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-ReadCapacityUnit-Hrs | RCU-hours | Hour | Charges for reads in provisioned capacity mode using the Standard table class. | 
| REG-IA-ReadCapacityUnit-Hrs  | RCU-hours | Hour | Charges for reads in provisioned capacity mode using the Standard-IA table class. | 
| REG-WriteCapacityUnit-Hrs | WCU-hours | Hour | Charges for writes in provisioned capacity mode using the Standard table class. | 
| REG-IA-WriteCapacityUnit-Hrs  | WCU-hours | Hour | Charges for writes in provisioned capacity mode using the Standard-IA table class. | 
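Because provisioned capacity is billed per hour regardless of consumption, a rough monthly estimate is provisioned units times hours times a per-unit-hour rate. The rates in this sketch are placeholders, not actual DynamoDB pricing:

```python
# Sketch: estimate a monthly provisioned-capacity charge.
# HYPOTHETICAL per-unit-hour rates -- check the DynamoDB pricing page.
RCU_HOUR_RATE = 0.00013
WCU_HOUR_RATE = 0.00065

def monthly_provisioned_cost(rcu: int, wcu: int, hours: int = 730) -> float:
    """Provisioned units accrue charges every hour, consumed or not."""
    return hours * (rcu * RCU_HOUR_RATE + wcu * WCU_HOUR_RATE)
```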

**Reserved Capacity Reads and Writes**

With reserved capacity, you pay a one-time upfront fee and commit to a minimum provisioned usage level over a period of time. Reserved capacity is billed at a discounted hourly rate. Any capacity that you provision in excess of your reserved capacity is billed at standard provisioned capacity rates. Reserved capacity is available for single-region, provisioned read and write capacity units (RCU and WCU) on DynamoDB tables that use the standard table class. Both 1-year and 3-year reserved capacity are billed using the same SKUs.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-HeavyUsage:dynamodb.read | RCU-hours | Up-front then monthly | Charges for reserved capacity reads: a one-time up-front charge and a monthly charge at the start of each month covering all the discounted committed RCU-hours during the month. Will have matching zero-cost REG-ReadCapacityUnit-Hrs line items. | 
| REG-HeavyUsage:dynamodb.write | WCU-hours | Up-front then monthly | Charges for reserved capacity writes: a one-time up-front charge and a monthly charge at the start of each month covering all the discounted committed WCU-hours during the month. Will have matching zero-cost REG-WriteCapacityUnit-Hrs line items. | 

**On-Demand Capacity Reads and Writes**

When you create a DynamoDB table in on-demand capacity mode, you pay only for the reads and writes your application performs. The prices for read and write requests depend on your table class.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-ReadRequestUnits | RRUs | Unit | Charges for reads in on-demand capacity mode with Standard table class. | 
| REG-IA-ReadRequestUnits | RRUs | Unit | Charges for reads in on-demand capacity mode with Standard-IA table class. | 
| REG-WriteRequestUnits | WRUs | Unit | Charges for writes in on-demand capacity mode with Standard table class. | 
| REG-IA-WriteRequestUnits | WRUs | Unit | Charges for writes in on-demand capacity mode with Standard-IA table class. | 
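On-demand request units are metered per request based on item size: reads are measured in 4 KB increments (with eventually consistent reads costing half a unit) and writes in 1 KB increments. A minimal sketch of that arithmetic:

```python
import math

# Sketch: request units consumed per on-demand request.
# Reads are metered in 4 KB increments; writes in 1 KB increments.
def read_request_units(item_size_bytes: int, consistent: bool = False) -> float:
    units = max(1, math.ceil(item_size_bytes / 4096))
    return units if consistent else units / 2  # eventually consistent costs half

def write_request_units(item_size_bytes: int) -> int:
    return max(1, math.ceil(item_size_bytes / 1024))
```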

**Global Tables Reads and Writes**

DynamoDB charges for global tables usage based on the resources used on each replica table. For provisioned global tables, write requests are measured in replicated WCUs (rWCUs) instead of standard WCUs. For on-demand global tables, write requests are measured in replicated WRUs (rWRUs) instead of standard WRUs. The number of rWCUs or rWRUs consumed for replication depends on the version of global tables you are using, and pricing depends on your table class.

Writes to global secondary indexes (GSIs) on global tables are billed using standard write units (WCUs and WRUs). Read requests and data storage are billed identically to single-Region tables.

If you add a table replica to create or extend a global table in new Regions, DynamoDB charges for a table restore in the added Regions per gigabyte of data restored. Restored data is charged as `REG-RestoreDataSize-Bytes`. For details, see [Backup and restore for DynamoDB](Backup-and-Restore.md). Cross-Region replication and adding replicas to tables that contain data also incur charges for data transfer out.

When you select on-demand capacity mode for your DynamoDB global tables, you pay only for the resources your application uses on each replica table.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-ReplWriteCapacityUnit-Hrs | rWCU-hours | Hour | Global table, provisioned, Standard table class. | 
| REG-IA-ReplWriteCapacityUnit-Hrs | rWCU-hours | Hour | Global table, provisioned, Standard-IA table class. | 
| REG-ReplWriteRequestUnits  | rWRU | Unit | Global table, on-demand, Standard table class. | 
| REG-IA-ReplWriteRequestUnits | rWRU | Unit | Global table, on-demand, Standard-IA table class. | 

## Streams
<a name="bp-understanding-billing.streams"></a>

DynamoDB offers two streaming technologies, DynamoDB Streams and Amazon Kinesis Data Streams, each with separate pricing.

DynamoDB Streams charges for reading data in read request units. Each `GetRecords` API call is billed as a streams read request. You are not charged for `GetRecords` API calls invoked by AWS Lambda as part of DynamoDB triggers or by DynamoDB global tables as part of replication.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-Streams-RequestsCount | Count | Unit | Read request units for DynamoDB Streams. | 

Amazon Kinesis Data Streams charges in change data capture units. DynamoDB charges one change data capture unit for each write (up to 1 KB). For items larger than 1 KB, additional change data capture units are required. You pay only for the writes your application performs without having to manage throughput capacity on the table.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-ChangeDataCaptureUnits-Kinesis | CDC Units | Unit | Change data capture units for Kinesis Data Streams. | 
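The per-write metering above reduces to one change data capture unit per write up to 1 KB, with additional units for larger items. A minimal sketch:

```python
import math

# Sketch: change data capture units consumed by one write to a table
# streaming to Kinesis Data Streams (1 CDC unit per 1 KB, rounded up).
def cdc_units(item_size_bytes: int) -> int:
    return max(1, math.ceil(item_size_bytes / 1024))
```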

## Storage
<a name="bp-understanding-billing.storage"></a>

DynamoDB measures the size of your billable data by adding the raw byte size of your data plus a per-item storage overhead that depends on the features you have enabled. 

**Note**  
Storage usage values in the CUR will be higher compared with the storage values when using `DescribeTable`, because `DescribeTable` does not include the per-item storage overhead.

Storage is measured hourly but billed monthly, calculated as the average of the hourly measurements.

Although the storage `UsageType` uses `ByteHrs` as a suffix, storage usage in the CUR is measured in GB and priced by GB-month.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-TimedStorage-ByteHrs | GB | Month | Amount of storage used by your DynamoDB tables and indexes, for tables with the Standard table class. | 
| REG-IA-TimedStorage-ByteHrs | GB | Month | Amount of storage used by your DynamoDB tables and indexes, for tables with the Standard-IA table class. | 
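Since storage is measured hourly but billed as a monthly GB-month figure, the billed amount is effectively the average of the hourly samples. A minimal sketch:

```python
# Sketch: derive the GB-month figure as the average of hourly storage samples.
def gb_month(hourly_gb_samples: list[float]) -> float:
    return sum(hourly_gb_samples) / len(hourly_gb_samples)
```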

## Backup and Restore
<a name="bp-understanding-billing.backup"></a>

DynamoDB offers two types of backups: point-in-time recovery (PITR) backups and on-demand backups. You can also restore from those backups into DynamoDB tables. The charges below refer to both backups and restores.

Backup storage charges are incurred on the first of the month, with adjustments made throughout the month as backups are added or removed. See the [Understanding Amazon DynamoDB On-demand Backups and Billing](https://repost.aws/articles/AR74LYumctRa-t7Z87uwKrlw) article for more information.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-TimedBackupStorage-ByteHrs | GB | Month | The storage consumed by on-demand backups of your DynamoDB tables and Local Secondary Indexes. | 
| TimedPITRStorage-ByteHrs | GB | Month | The storage used by point-in-time recovery (PITR) backups. DynamoDB monitors the size of your PITR-enabled tables continuously throughout the month to determine your backup charges and bills for storage as long as PITR is enabled. | 
| REG-RestoreDataSize-Bytes | GB | Size | The total size of data restored (including table data, local secondary indexes and global secondary indexes) measured in GB from DynamoDB backups. | 

### AWS Backup
<a name="bp-understanding-billing.aws-backup"></a>

AWS Backup is a fully managed backup service that makes it easy to centralize and automate the backup of data across AWS services in the cloud as well as on premises. AWS Backup is charged for storage (warm or cold storage), restoration activities, and cross-Region data transfer. The following `UsageType` charges appear under the “AWSBackup” ProductCode rather than “AmazonDynamoDB”.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-WarmStorage-ByteHrs-DynamoDB | GB | Month | The storage used by DynamoDB backups managed by AWS Backup throughout the month, measured in GB-Month. | 
| REG-CrossRegion-WarmBytes-DynamoDB | GB | Size | The data transferred to a different AWS Region, either within the same account or to a different AWS account. Cross-Region transfer charges occur when copying backups from one Region to another. The charge is always billed to the account that the data is transferred from. | 
| REG-Restore-WarmBytes-DynamoDB | GB | Size | The total size of the data restored from warm storage, measured in GB. | 
| REG-ColdStorage-ByteHrs-DynamoDB | GB | Month | The cold storage used by DynamoDB backups managed by AWS Backup throughout the month, measured in GB-Month. | 
| REG-Restore-ColdBytes-DynamoDB | GB | Month | The total size of the data restored from cold storage, measured in GB. | 

### Export and Import
<a name="bp-understanding-billing.export-import"></a>

 You can export data from DynamoDB to Amazon S3 or import data from Amazon S3 to a new DynamoDB table.

Although the `UsageType` uses `Bytes` as a suffix, export and import usage in the CUR is measured and priced in GB.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-ExportDataSize-Bytes | GB | Size | The charge for exporting data to S3. DynamoDB charges for data you export based on the size of the DynamoDB base table (table data and local secondary indexes) at the specified point in time when the export was created. | 
| REG-ImportDataSize-Bytes | GB | Size | The charge for importing data from S3. The size is calculated based on the uncompressed object size of the data within Amazon S3. There are no extra charges for importing to tables with GSIs. | 
| REG-IncrementalExportDataSize-Bytes | GB | Size | The charge for the data processed from the continuous backup to produce incremental exports. | 

## Data Transfer
<a name="bp-understanding-billing.datatransfer"></a>

Data transfer activity may appear associated with the DynamoDB service. DynamoDB does not charge for inbound data transfer, and it does not charge for data transferred between DynamoDB and other AWS services within the same AWS Region (in other words, \$0.00 per GB). Data transferred across AWS Regions (such as between DynamoDB in the US East (N. Virginia) Region and Amazon EC2 in the Europe (Ireland) Region) is charged on both sides of the transfer.


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-DataTransfer-In-Bytes | GB | Unit | Data transferred in to DynamoDB from the internet. | 
| REG-DataTransfer-Out-Bytes | GB | Unit | Data transferred out from DynamoDB to the internet. | 
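The billing rules above (free within a Region, charged across Regions) reduce to a simple check; the per-GB rate in this sketch is a placeholder, not actual AWS pricing:

```python
# Sketch: data transfer charge per the rules above.
CROSS_REGION_RATE_PER_GB = 0.02  # HYPOTHETICAL rate

def transfer_cost(gb: float, source_region: str, dest_region: str) -> float:
    """Same-Region transfer between DynamoDB and other AWS services is free;
    cross-Region transfer is charged (and appears on both sides)."""
    if source_region == dest_region:
        return 0.0
    return gb * CROSS_REGION_RATE_PER_GB
```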

## CloudWatch Contributor Insights
<a name="bp-understanding-billing.cw"></a>

CloudWatch Contributor Insights for DynamoDB is a diagnostic tool for identifying the most frequently accessed and throttled keys in your DynamoDB table. The following `UsageType` charges appear under the “AmazonCloudWatch” ProductCode rather than “AmazonDynamoDB”. 


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-CW:ContributorEventsManaged | Events processed | Unit | The number of DynamoDB events processed. For example, for a table with CloudWatch Contributor Insights enabled, any time an item is read or written, it counts as one event. If the table has a sort key, each read or write counts as two events. | 
| REG-CW:ContributorRulesManaged | Rule count | Month | DynamoDB creates rules to identify the most accessed items and most throttled keys when you enable CloudWatch Contributor Insights. This charge is incurred for the rules added for each entity (tables and GSIs) configured for CloudWatch Contributor Insights. | 

## DynamoDB Accelerator (DAX)
<a name="bp-understanding-billing.dax"></a>

DynamoDB Accelerator (DAX) is billed by the hour based on the instance type selected for the service. The charges below refer to the provisioned DAX instances. The following `UsageType` charges appear under the “AmazonDAX” ProductCode rather than “AmazonDynamoDB”. 


****  

| UsageType | Units | Granularity | Description | 
| --- | --- | --- | --- | 
| REG-NodeUsage:dax-<INSTANCETYPE> | Node-hour | Hour | The hourly usage of a particular instance type. Pricing is per node-hour consumed, from the time a node is launched until it is terminated. Each partial node-hour consumed will be billed as a full hour. DAX charges for each node in a DAX cluster. If you have a cluster with multiple nodes, you would see multiple line items in your billing report. | 

The instance type will be one of the values from the following list. For details about node types, see [Nodes](DAX.concepts.cluster.md#DAX.concepts.nodes).
+ r3: xlarge, 2xlarge, 4xlarge, 8xlarge
+ r4: large, xlarge, 2xlarge, 4xlarge, 8xlarge, 16xlarge
+ r5: large, xlarge, 2xlarge, 4xlarge, 8xlarge, 12xlarge, 16xlarge, 24xlarge
+ t2: small, medium
+ t3: small, medium

# Migrating a DynamoDB table from one account to another
<a name="bp-migrating-table-between-accounts"></a>

You can migrate an Amazon DynamoDB table from one account to another to implement a multi-account strategy or a backup strategy. You can also do it for testing, debugging, or compliance reasons. A common use case is copying DynamoDB tables across production, staging, test, and development environments where each environment utilizes a different AWS account.

DynamoDB offers two options for migrating tables from one AWS account to another:
+ **AWS Backup for Cross-Account Backup and Restore:** AWS Backup is a fully managed backup service that enables you to centrally manage backups across multiple AWS services. With its cross-account backup and restore functionality, you can back up a DynamoDB table in one account and restore the backup to another account in the same AWS Organization.
+ **DynamoDB Export and Import to Amazon S3:** Using the DynamoDB Export and Import to Amazon S3 features allows you to do a full export to an Amazon S3 bucket and then import that data into a new table in another AWS account. This approach is suitable when you need to migrate between accounts that are not part of the same AWS Organization or if you do not want to use AWS Backup.

**Note**  
Import from Amazon S3 does not support tables with Local Secondary Indexes (LSIs), but it does support Global Secondary Indexes (GSIs). For more information on LSIs and GSIs, see [Improving data access with secondary indexes in DynamoDB](SecondaryIndexes.md).

**Topics**
+ [Migrate a table using AWS Backup for cross-account backup and restore](bp-migrating-table-between-accounts-backup.md)
+ [Migrate a table using export to S3 and import from S3](bp-migrating-table-between-accounts-s3.md)

# Migrate a table using AWS Backup for cross-account backup and restore
<a name="bp-migrating-table-between-accounts-backup"></a>

**Prerequisites**
+ Source and target AWS accounts must belong to the same organization in the AWS Organizations service
+ Valid AWS Identity and Access Management (IAM) permissions to create and use AWS Backup vaults

For more information about setting up cross-account backups, see [Creating backup copies across AWS accounts](https://docs.aws.amazon.com/aws-backup/latest/devguide/create-cross-account-backup.html).

**Pricing information**

AWS charges for the backup (based on the table size), for any data copying between AWS Regions (based on the amount of data), for the restore (based on the amount of data), and for ongoing backup storage. To avoid ongoing charges, you can delete the backup if you don't need it after the restore.

For more information about pricing, see [AWS Backup pricing](https://aws.amazon.com/backup/pricing/) .

## Step 1: Enable advanced features for DynamoDB and cross-account backup
<a name="bp-migrating-table-between-accounts-backup-enable-advanced-features"></a>

1. In both the source and target AWS account, access the AWS Management Console and open the AWS Backup console.

1. Choose the **Settings** option.

1. Under **Advanced features for Amazon DynamoDB backups**, confirm that **Advanced features** is enabled. If it isn't, choose **Enable**.

1. Under **Cross-account management**, for **Cross-account backup**, choose **Turn On**.

## Step 2: Create a backup vault in the source account and target account
<a name="bp-migrating-table-between-accounts-backup-create-backup-vault"></a>

1. In the source AWS account, open the AWS Backup console.

1. Choose **Backup vaults**.

1. Choose **Create Backup vault**.

1. Copy and save the **Amazon Resource Name (ARN)** of the created backup vault.

1. Repeat the preceding steps in the target AWS account. You'll need the ARNs of both the source and target backup vaults when copying the DynamoDB table backup between accounts.

## Step 3: Create a DynamoDB table backup in the source account
<a name="bp-migrating-table-between-accounts-backup-create-table-backup"></a>

1. On the **AWS Backup Dashboard page**, choose **Create on-demand backup**.

1. In the **Settings** section, select **DynamoDB** as the **Resource type**, and then select the table name. 

1. In the **Backup vault** dropdown list, select the backup vault you created in the source account.

1. Select the desired **Retention period**.

1. Choose **Create on-demand backup**.

1. Monitor the status of the backup job on the **Backup Jobs** tab of the **AWS Backup Jobs** page. 

## Step 4: Copy the DynamoDB table backup from the source account to the target account
<a name="bp-migrating-table-between-accounts-backup-copy-table-backup"></a>

1. After the backup job completes, open the AWS Backup console in the source account and choose **Backup vaults**. 

1. Under **Backups**, choose the DynamoDB table backup. Choose **Actions** and then **Copy**.

1. Enter the AWS Region of the target account.

1. For **External vault ARN**, enter the ARN of the backup vault you created in the target account.

1.  In the target account backup vault, enable access from a source account to allow copying backups.

## Step 5: Restore the DynamoDB table backup in the target account
<a name="bp-migrating-table-between-accounts-restore-table-backup"></a>

1. In the target AWS account, open the AWS Backup console and choose **Backup vaults**.

1. Under **Backups**, select the backup you copied from the source account. Choose **Actions**, then **Restore**.

1. Enter a name for the new DynamoDB table, choose the encryption settings and the key you want the restored table to be encrypted with, and set any other options.

1. When the restore is completed, the table status will show as **Active**.

# Migrate a table using export to S3 and import from S3
<a name="bp-migrating-table-between-accounts-s3"></a>

**Prerequisites**
+ You must enable Point-in-Time Recovery (PITR) for your table in order to perform the export to S3. For more information, see [Enable point-in-time recovery in DynamoDB](PointInTimeRecovery_Howitworks.md).
+ Valid IAM permissions to perform the export. For more information, see [Requesting a table export in DynamoDB](S3DataExport_Requesting.md).
+ Valid IAM permissions sufficient to perform the import. For more information, see [Requesting a table import in DynamoDB](S3DataImport.Requesting.md).

**Pricing information**

AWS charges for PITR (based on the size of the table and how long PITR is enabled for). If you don't need PITR except for the export, you can turn it off after the export concludes. AWS also charges for requests made against S3, for storing the exported data in S3 and for importing (based on the uncompressed size of the imported data).

For more information about DynamoDB pricing, see [DynamoDB pricing](https://aws.amazon.com/dynamodb/pricing/).

**Note**  
 There are limits on the size and number of objects when importing from S3 to DynamoDB. For more information, see [Import quotas](S3DataImport.Validation.md#S3DataImport.Validation.limits).

## Requesting a table export to Amazon S3
<a name="bp-migrating-table-between-accounts-s3-table-export"></a>

1. Sign in to the AWS Management Console and open the DynamoDB console.

1. In the navigation pane on the left side of the console, choose **Exports to S3**.

1. Choose a source table and destination S3 bucket. Enter the URL of the destination account bucket using the `s3://bucketname/prefix` format. `/prefix` is an optional folder to help keep your destination bucket organized.

1. Choose **Full export**. A full export outputs the full table snapshot of your table, at the point in time you specify.

   1. Select **Current time** to export the latest full table snapshot.

   1. For **Exported file format**, choose between DynamoDB JSON and Amazon Ion. The default option is DynamoDB JSON.

1. Click the **Export** button to begin the export.

1. Small table exports should complete in a few minutes, but tables in the terabyte range can take more than an hour.

## Requesting a table import from Amazon S3
<a name="bp-migrating-table-between-accounts-s3-table-import"></a>

1. Sign in to the AWS Management Console and open the DynamoDB console.

1. In the navigation pane on the left side of the console, choose **Import from S3**.

1. On the page that appears, select **Import from S3**.

1. Enter the Amazon S3 source URL. You can also find it by using the **Browse S3** button. The expected path is of the format `s3://bucket/prefix/AWSDynamoDB/<XXXXXXXX-XXXXXX>/data/`.

1. Specify if you are the S3 bucket owner.

1. Under **Import file compression**, select **GZIP** to match the export.

1. Under **Import file format**, select **DynamoDB JSON** to match the export.

1. Select **Next**. For **Specify table details**, choose the options for the new table that will be created to store your data.

1. Select **Next**. For **Configure table settings**, customize any additional table settings if applicable.

1. Select **Next** again to review your import options, then click **Import** to begin the import task. You'll see your new table listed under **Imports from S3** with the status **Importing**. You cannot access your table during this time. Small imports should complete in a few minutes, but tables in the terabyte range can take more than an hour.

1. After the import completes, the status shows as **Active**, and you can start using the table.
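Before starting an import, it can help to sanity-check that the source URL follows the `s3://bucket/prefix/AWSDynamoDB/<XXXXXXXX-XXXXXX>/data/` shape from step 4. A sketch using a loose pattern (the export-ID segment is not validated beyond being a single path component):

```python
import re

# Sketch: check that an S3 source URL matches the expected
# s3://bucket[/prefix]/AWSDynamoDB/<export-id>/data/ layout.
_IMPORT_PATH = re.compile(r"^s3://[^/]+(?:/.+)?/AWSDynamoDB/[^/]+/data/$")

def looks_like_import_path(url: str) -> bool:
    return _IMPORT_PATH.match(url) is not None
```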

## Keeping tables in sync during migration
<a name="bp-migrating-table-between-accounts-s3-table-sync"></a>

If you can pause write operations on the source table for the duration of the migration, the source and target tables should match exactly after the migration. If you can't pause writes, the target table will typically lag slightly behind the source after the migration. To bring the target table up to date, you can use streaming (DynamoDB Streams or Kinesis Data Streams for DynamoDB) to replay the writes that happened on the source table since the backup or export. 

You should start reading the stream records from slightly before the timestamp when you exported the source table to S3. For example, if the export to S3 occurred at 2:00 PM and the import to the target table concluded at 11:00 PM, you should initiate the DynamoDB stream reading at 1:58 PM. The *Streaming options for change data capture* table summarizes the features of each streaming model.

Using DynamoDB Streams with Lambda offers a streamlined approach for synchronizing data between the source and target DynamoDB tables. You can use a Lambda function to replay each write in the target table.

**Note**  
Items are kept in the DynamoDB Streams for 24 hours, so you should plan to complete your backup and restore or export and import within that window.
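A Lambda function replaying stream records typically maps each record's `eventName` to a write against the target table. This sketch shows only that mapping step (record shapes follow the DynamoDB Streams event format; the actual DynamoDB client calls are omitted):

```python
# Sketch: translate one DynamoDB Streams record into a replay action
# for the target table. INSERT/MODIFY become puts of the new image;
# REMOVE becomes a delete by key.
def replay_action(record: dict) -> tuple[str, dict]:
    event = record["eventName"]
    if event in ("INSERT", "MODIFY"):
        return ("put", record["dynamodb"]["NewImage"])
    if event == "REMOVE":
        return ("delete", record["dynamodb"]["Keys"])
    raise ValueError(f"unexpected eventName: {event}")
```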

# Prescriptive guidance to integrate DAX with DynamoDB applications
<a name="dax-prescriptive-guidance"></a>

[DynamoDB Accelerator](DAX.md) (DAX) is a DynamoDB-compatible caching service that provides fast in-memory performance for demanding, read-heavy applications. With DAX, you can achieve microsecond response times when accessing frequently requested data. This prescriptive guide provides comprehensive insights and best practices for integrating DAX with your DynamoDB applications.

This guide provides foundational knowledge for those who are new to DAX or want to optimize their existing configurations. This guide covers various topics, for example, when to use DAX and creating a [DAX cluster](DAX.concepts.cluster.md#DAX.concepts.clusters). It also includes practical examples and detailed explanations to help you effectively implement DAX in your projects. Finally, this guide offers advanced strategies that you need to implement to maximize DAX caching capabilities for ensuring fast and scalable applications.

**Topics**
+ [Evaluating the suitability of DAX for your use cases](evaluate-dax-suitability.md)
+ [Configuring your DAX client](dax-config-dax-client.md)
+ [Configuring your DAX cluster](dax-config-considerations.md)
+ [Sizing your DAX cluster](dax-cluster-sizing.md)
+ [Deploying a cluster](dax-deploy-cluster.md)
+ [Managing cluster operations](dax-cluster-operations.md)
+ [Monitoring DAX](pres-guide-monitor-dax.md)

# Evaluating the suitability of DAX for your use cases
<a name="evaluate-dax-suitability"></a>

This section explains when and why to use DAX. Using this guidance helps you to determine if integrating DAX with DynamoDB is advantageous for your application's workload patterns, performance requirements, and data consistency needs. It also covers scenarios where DAX might not be suitable, for example, write-heavy workloads and infrequently accessed data.

**Topics**
+ [When and why to choose DAX](#choose-dax)
+ [When not to use DAX](#dax-unsuitable-scenarios)

## When and why to choose DAX
<a name="choose-dax"></a>

You can consider adding DAX to your application stack in several scenarios. For example, use DAX to reduce the overall latency of read requests against DynamoDB or to minimize repeated reads of the same data from a table. The following list presents examples of scenarios in which you can take advantage of integrating DAX with DynamoDB:
+ **High-performance requirement**
  + **Low latency reads** – You should consider using DAX if your application requires response times in microseconds for eventually-consistent reads. DAX can also drastically reduce the response time for accessing frequently read data.
+ **Read-intensive workloads**
  + **Read-heavy applications** – For applications with a high read-to-write ratio, for example, 10:1 or more, DAX results in more cache hits and less stale data. This reduces reads against a table. To avoid reading stale data from the cache if your application is write-heavy, make sure to set a lower [time to live (TTL)](TTL.md) for the cache.
  + **Caching common queries** – If your application frequently reads the same data, for example, popular products on an e-commerce platform, DAX can serve these requests directly from its cache.
+ **Bursty traffic patterns**
  + **Smoother table scaling** – DAX helps smooth out impacts of sudden traffic spikes. DAX provides a buffer to scale up DynamoDB table capacity gracefully, which reduces the risk of read throttling.
  + **Higher read throughput for each item** – A single item is stored in one partition, and a partition starts throttling reads of an item when it exceeds 3,000 [read capacity units](provisioned-capacity-mode.md#read-write-capacity-units) (RCUs). DAX lets you scale reads of a single item beyond 3,000 RCUs. 
+ **Cost optimization**
  + **Reducing DynamoDB costs** – Reading from DAX can reduce reads sent to a DynamoDB table, which can then directly impact cost. With a high cache hit rate, the reduced table read cost can exceed a DAX cluster cost, which results in a net cost reduction.
+ **Data consistency requirements**
  + **Eventual consistency** – DAX supports eventually consistent reads. This makes DAX suitable for use cases where immediate consistency isn't critical.
  + **Write-through caching** – Writes that you make against DAX are [write-through](DAX.consistency.md). Once DAX confirms that it's written an item to DynamoDB, it persists that item version in the item cache. This write-through mechanism helps maintain tighter data consistency between cache and database, but uses additional DAX cluster resources.
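The cost-optimization point above can be checked with a back-of-the-envelope calculation: DAX produces a net saving once the table read cost avoided by cache hits exceeds the cluster cost. All inputs in this sketch are placeholders, not actual pricing:

```python
# Sketch: net monthly saving from fronting a DynamoDB table with DAX.
# All rates are HYPOTHETICAL -- substitute real pricing for your Region.
def dax_net_saving(monthly_reads: int, hit_rate: float,
                   cost_per_read: float, cluster_cost: float) -> float:
    """Positive result: cache-hit savings exceed the DAX cluster cost."""
    avoided_read_cost = monthly_reads * hit_rate * cost_per_read
    return avoided_read_cost - cluster_cost
```

For example, a billion reads a month at an 80% hit rate avoids 800 million table reads; whether that beats the cluster cost depends entirely on your rates.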

## When not to use DAX
<a name="dax-unsuitable-scenarios"></a>

While DAX is powerful, it's not suitable for all scenarios. The following list presents examples of scenarios in which integrating DAX with DynamoDB is unsuitable:
+ **Write-heavy workloads** – The primary advantage of DAX is speeding up reads, but writes use more DAX resources than reads. If your application is predominantly write-heavy, DAX benefits might be limited.
+ **Infrequently read data** – If your application accesses data infrequently or a wide range of rarely reused data (cold data), you'll likely experience a low [cache hit ratio](pres-guide-monitor-dax.md#cachehitratio). In this case, the overhead of maintaining the cache might not justify the performance gains.
+ **Bulk reads or writes** – If your application performs more bulk writes than individual writes, you should write around DAX. In addition, for bulk reads, you should run full table scans directly against a DynamoDB table.
+ **Strong consistency or transaction requirements** – DAX passes strongly consistent reads and [TransactGetItems](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_TransactGetItems.html) calls to a DynamoDB table. You should make these reads around the DAX cluster to avoid using cluster resources. Items read this way won't be cached; therefore, routing such items through DAX serves no purpose.
+ **Simple applications with modest performance requirements** – For applications with modest performance requirements and tolerance for direct DynamoDB latency, the complexity and cost of adding DAX might not be necessary. On its own, DynamoDB handles high throughput and provides single-digit millisecond performance.
+ **Complex querying needs beyond key-value access** – DAX is optimized for key-value access patterns. If your application requires complex querying capabilities with complex filtering, such as [Query](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html) and [Scan](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Scan.html) operations, DAX caching benefits might be limited.

  In these situations, use [Amazon ElastiCache (Redis OSS)](https://docs.aws.amazon.com/AmazonElastiCache/latest/red-ug/WhatIs.html) as an alternative. ElastiCache (Redis OSS) supports advanced data structures such as lists, sets, and hashes. It also offers features such as pub/sub, geospatial indexes, and scripting.
+ **Compliance requirements** – DAX doesn't currently offer the same compliance accreditations as DynamoDB. For example, DAX hasn't obtained the SOC accreditation yet.

# Configuring your DAX client
<a name="dax-config-dax-client"></a>

The DAX cluster is an instance-based cluster that you can access using various DAX SDKs. Each SDK provides developers with configurable options, such as request timeouts and connection counts, to meet specific application requirements.

When configuring your DAX client, a crucial consideration is your client application's scale: specifically, the ratio of client instances to DAX server instances (a cluster has a maximum of 11 nodes). Large client instance fleets can generate numerous connections to DAX server instances, potentially overwhelming them. This guide outlines best practices for DAX client configuration.

## Best practices
<a name="dax-guidance-configuring-dax-client-best-practices"></a>

1. **Client instances** – Implement singleton client instances to ensure instance reuse across requests. For implementation details, see [Step 4: Run a sample application](DAX.client.run-application.md).

1. **Request timeouts** – While applications often require low request timeouts to ensure minimal latency for upstream systems, setting timeouts too low can cause problems. Low timeouts may trigger frequent reconnection to server instances when DAX servers experience temporary latency spikes. When a timeout occurs, the DAX client terminates the existing server node connection and establishes a new one. Since connection establishment is resource-intensive, numerous consecutive connections can overload DAX servers. We recommend the following:
   + Maintaining default request timeout settings.
   + If lower timeouts are necessary, implement separate application threads with lower timeout values and include retry mechanisms with exponential back-off.

1. **Connection timeout** – For most applications, we recommend maintaining the default connection timeout settings.

1. **Concurrent connections** – Certain SDKs, such as the Java v2 SDK, let you adjust the number of concurrent connections to the DAX server. Key considerations:
   + DAX server instances can handle up to 40,000 concurrent connections.
   + Default settings are suitable for most use cases.
   + Large client instances combined with high concurrent connections may overload servers.
   + Lower concurrent connection values reduce server overload risk.
   + Performance calculation example:
     + Assuming 1ms request latency, each connection can theoretically handle 1,000 requests/second.
     + For a 3-node cluster, a single client instance connecting to all nodes can process 3,000 requests/second.
     + With 10 connections, the client can handle approximately 30,000 requests/second.

       Recommendation – Begin with lower concurrent connection settings and validate through performance testing with expected production workload patterns.
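As a rough illustration of the arithmetic above, the theoretical client throughput can be estimated from request latency, connection count, and node count. The function and parameter names below are illustrative, not DAX SDK settings:

```python
def estimated_client_rps(nodes, connections_per_node, latency_ms=1.0):
    """Theoretical upper bound on client throughput: each connection can
    complete one request per latency_ms milliseconds, and the client opens
    connections to every node in the cluster."""
    per_connection_rps = 1000.0 / latency_ms
    return nodes * connections_per_node * per_connection_rps

# 3-node cluster, 1 connection per node  -> 3,000 requests/second
# 3-node cluster, 10 connections per node -> 30,000 requests/second
```

Real throughput is lower because of network variance and server-side load, which is why performance testing against production-like traffic remains necessary.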

# Configuring your DAX cluster
<a name="dax-config-considerations"></a>

The DAX cluster is a managed cluster, but you can adjust its configurations to fit your application requirements. Because of its close integration with DynamoDB API operations, you should consider the following aspects when integrating your application with DAX.

**Topics**
+ [DAX pricing](#dax-pricing)
+ [Item cache and query cache](#item-vs-query-cache)
+ [Selecting TTL setting for the caches](#select-ttl-duration-caches)
+ [Caching multiple tables with a DAX cluster](#cache-multi-tables-dax-cluster)
+ [Data replication in DAX and DynamoDB global tables](#data-replication-dax-ddb-gt)
+ [DAX Region availability](#dax-region-availability)
+ [DAX caching behavior](#dax-caching-behavior)

## DAX pricing
<a name="dax-pricing"></a>

The cost of a cluster depends on the number and size of [nodes](DAX.concepts.cluster.md#DAX.concepts.nodes) it has provisioned. Every node is billed for each hour it runs in the cluster. For more information, see [Amazon DynamoDB pricing](https://aws.amazon.com/dynamodb/pricing/).

Cache hits don't incur DynamoDB cost, but impact DAX cluster resources. Cache misses incur DynamoDB read costs and require DAX resources. Writes incur DynamoDB write costs and impact DAX cluster resources to proxy the write.

## Item cache and query cache
<a name="item-vs-query-cache"></a>

DAX maintains an [item cache](DAX.concepts.md#DAX.concepts.item-cache) and a [query cache](DAX.concepts.md#DAX.concepts.query-cache). Understanding the differences between these caches can help you determine the performance and consistency characteristics they offer to your application.


| Cache characteristic | Item cache | Query cache | 
| --- | --- | --- | 
|  Purpose  |  Stores the results of [GetItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_GetItem.html) and [BatchGetItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchGetItem.html) API operations.  |  Stores the results of [Query](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Query.html) and [Scan](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Scan.html) API operations. These operations can return multiple items based on query conditions instead of specific item keys.  | 
|  Access Type  |  Uses key-based access. When an application requests data using `GetItem` or `BatchGetItem`, DAX first checks the item cache using the primary key of the requested items. If the item is cached and unexpired, DAX returns it immediately without accessing the DynamoDB table. |  Uses parameter-based access. DAX caches the result set of `Query` and `Scan` API operations and serves subsequent requests that use the same parameters (query conditions, table, and index) from the cache. This significantly reduces response times and DynamoDB read throughput consumption. | 
|  Cache Invalidation  |  DAX automatically replicates updated items into the item cache of the nodes in the DAX cluster in the following scenarios: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/dax-config-considerations.html)  |  The query cache is more challenging to invalidate than the item cache. Item updates might not directly map to cached queries or scans. You must carefully tune the query cache TTL to maintain data consistency. Writes through DAX or to the base table aren't reflected in the query cache until the TTL expires the previously cached response and DAX performs a new query against DynamoDB.  | 
|  Global secondary index  | Because the `GetItem` API operation isn't supported on local secondary indexes or global secondary indexes, the item cache only caches reads from the base table. | The query cache caches queries against both tables and indexes. | 

## Selecting TTL setting for the caches
<a name="select-ttl-duration-caches"></a>

TTL determines the period for which data is stored in the cache before it becomes stale. After this period, the data is automatically refreshed on the next request. Choosing the right TTL setting for your DAX caches involves balancing application performance against data consistency. Because no single TTL setting works for all applications, the optimal value depends on your application's specific characteristics and requirements. We recommend that you start with a conservative TTL setting using this prescriptive guidance, and then iteratively adjust it based on your application's performance data and insights.

DAX maintains a least recently used (LRU) list for the item cache. The LRU list tracks when items are first written to or last read from the cache. When the DAX node memory is full, DAX evicts older items even if they haven't expired yet to make room for new items. The LRU algorithm is always enabled and not user-configurable.

To set a TTL duration that works for your applications, consider the following points:

### Understand your data access patterns
<a name="ttl-data-access-patterns"></a>
+ **Read-heavy workloads** – For applications with read-heavy workloads and infrequent data updates, set a longer TTL duration to reduce the number of cache misses. A longer TTL duration also reduces the need to access the underlying DynamoDB table.
+ **Write-heavy workloads** – For applications with frequent updates that aren't written through DAX, set a shorter TTL duration to ensure the cache stays consistent with the database. A shorter TTL duration also reduces the risk of serving stale data.

### Evaluate your application's performance requirements
<a name="ttl-evaluate-app-performance-reqs"></a>
+ **Latency sensitivity** – If your application requires low latency over data freshness, use a longer TTL duration. A longer TTL duration maximizes cache hits, which reduces average read latency.
+ **Throughput and scalability** – A longer TTL duration reduces load on DynamoDB tables and improves throughput and scalability. However, you should balance this with the need for up-to-date data.

### Analyze cache eviction and memory usage
<a name="ttl-analyze-cache-evict-mem-use"></a>
+ **Cache memory limits** – Monitor your DAX cluster's memory usage. A longer TTL duration can store more data in the cache, which might reach memory limits and lead to LRU-based evictions.

### Use metrics and monitoring to adjust TTL
<a name="ttl-adjust-use-metrics"></a>

Regularly review [metrics](dax-metrics-dimensions-dax.md#dax-metrics-dimensions), for example, cache hits and misses, and CPU and memory utilization. Adjust your TTL setting based on these metrics to find an optimal balance between performance and data freshness. If cache misses are high and memory utilization is low, increase the TTL duration to increase the cache hit rate.
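One way to turn this guidance into a concrete starting point is a small decision helper. The thresholds below (an 80% hit-ratio target and a 70% memory ceiling) are assumed example values for illustration, not DAX defaults:

```python
def suggest_ttl_change(cache_hit_ratio, memory_utilization,
                       hit_target=0.80, memory_ceiling=0.70):
    """Illustrative heuristic for iterating on the cache TTL setting."""
    if memory_utilization >= memory_ceiling:
        # Memory pressure risks LRU evictions; a shorter TTL frees space
        return "decrease"
    if cache_hit_ratio < hit_target:
        # Misses are high and memory is available; keep items cached longer
        return "increase"
    return "keep"
```

For example, `suggest_ttl_change(0.55, 0.40)` returns `"increase"`, matching the guidance that high misses with low memory utilization call for a longer TTL duration.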

### Consider business requirements and compliance
<a name="ttl-business-reqs"></a>

Data retention policies might dictate the maximum TTL duration you can set for sensitive or personal information.

### Cache behavior if you set TTL to zero
<a name="ttl-cache-behavior-zero-value"></a>

If you set TTL to 0, the item cache and query cache present the following behaviors:
+ **Item cache** – Items in the cache are refreshed only when an LRU eviction or a write-through operation occurs.
+ **Query cache** – Query responses aren't cached.

## Caching multiple tables with a DAX cluster
<a name="cache-multi-tables-dax-cluster"></a>

For workloads with multiple small DynamoDB tables that don't need individual caches, a single DAX cluster can cache requests for all of those tables. This approach provides more flexible and efficient use of DAX, particularly for applications that access multiple tables and require high-performance reads.

Similar to the DynamoDB [data plane](HowItWorks.API.md#HowItWorks.API.DataPlane) APIs, DAX requests require a table name. If you use multiple tables in the same DAX cluster, you don't need any specific configuration. However, you must ensure that the cluster's security permissions allow access to all cached tables.

### Considerations for using DAX with multiple tables
<a name="multi-table-dax-considerations"></a>

When you use DAX with multiple DynamoDB tables, you should consider the following points:
+ **Memory management** – When you use DAX with multiple tables, you should consider the total size of your working data set. All the tables in your data set will share the same memory space of the node type you selected.
+ **Resource allocation** – The DAX cluster's resources are shared among all the cached tables. However, a high-traffic table can cause eviction of data from the neighboring smaller tables.
+ **Economies of scale** – Grouping smaller workloads into one larger DAX cluster averages out traffic into a steadier pattern. For a given total amount of read capacity, it's also more economical to run three or more nodes, which additionally increases the availability of all the cached tables in the cluster.

## Data replication in DAX and DynamoDB global tables
<a name="data-replication-dax-ddb-gt"></a>

DAX is a Region-based service, so a cluster is only aware of the traffic within its AWS Region. Global tables write around the cache when they replicate data from another Region.

A longer TTL duration can cause stale data to remain in your secondary Region for longer than in the primary Region. This can result in cache misses in the local cache of the secondary Region.

The following diagram shows data replication occurring at the global table level in the source Region A. The DAX cluster in Region B isn't immediately aware of the newly replicated data from the source Region A.

![\[A global table replicates Item v2 from Region A to Region B. Region B DAX cluster B is unaware of Item v2.\]](http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/images/dax-ddb-gt-data-replication.png)


## DAX Region availability
<a name="dax-region-availability"></a>

Not all Regions that support DynamoDB tables support deploying DAX clusters. If your application requires low read latency through DAX, first review the list of [Regions that support DAX](https://docs.aws.amazon.com/general/latest/gr/ddb.html#ddb_region). Then, select a Region for your DynamoDB table.

## DAX caching behavior
<a name="dax-caching-behavior"></a>

DAX performs metadata and negative caching. Understanding these caching behaviors will help you effectively manage attribute metadata of cached items and negative cache entries.
+ **Metadata caching** – DAX clusters indefinitely maintain metadata about the attribute names of cached items. This metadata persists even after the item expires or is evicted from the cache.

  Over time, applications that use an unbounded number of attribute names can cause memory exhaustion in the DAX cluster. This limitation applies only to top-level attribute names, not to nested attribute names. Examples of unbounded attribute names include timestamps, UUIDs, and session IDs. Although you can use timestamps and session IDs as attribute values, we recommend using shorter and more predictable attribute names.
+ **Negative caching** – If a cache miss occurs and the read from a DynamoDB table yields no matching items, DAX adds a negative cache entry in the respective item or query cache. This entry remains until the cache TTL duration expires or a write-through occurs. DAX continues to return this negative cache entry for future requests.

  If the negative caching behavior doesn't fit your application pattern, read the DynamoDB table directly when DAX returns an empty result. We also recommend that you set a lower TTL cache duration to avoid long-lasting empty results in the cache and improve consistency with the table.

# Sizing your DAX cluster
<a name="dax-cluster-sizing"></a>

A DAX cluster's total capacity and availability depends on node type and count. More nodes in the cluster increase its read capacity, but not the write capacity. Larger node types (up to r5.8xlarge) can handle more writes, but too few nodes can impact availability when a node failure occurs. For more information about sizing your DAX cluster, see the [DAX cluster sizing guide](DAX.sizing-guide.md).

The following sections discuss the different sizing aspects that you should consider to balance node type and count for creating a scalable and cost-efficient cluster.

**Topics**
+ [Planning availability](#dax-sizing-availability)
+ [Planning write throughput](#dax-sizing-write-throughput)
+ [Planning read throughput](#dax-sizing-read-throughput)
+ [Planning dataset size](#dax-sizing-dataset-size)
+ [Calculating approximate cluster capacity requirements](#dax-sizing-cluster-capacity)
+ [Approximating cluster throughput capacity by node type](#dax-sizing-cluster-throughput-capacity)
+ [Scaling write capacity in DAX clusters](#dax-sizing-scaling-write-capacity)

## Planning availability
<a name="dax-sizing-availability"></a>

When sizing a DAX cluster, you should first focus on its targeted availability. Availability of a clustered service, such as DAX, is a function of the total number of nodes in the cluster. Because a single-node cluster has no tolerance for failure, its availability is equal to that of the one node. In a 10-node cluster, the loss of a single node has a minimal impact on the cluster's overall capacity. This loss doesn't have a direct impact on availability because the remaining nodes can still fulfill read requests. To resume writes, DAX quickly nominates a new primary node.

DAX is VPC-based. It uses a subnet group to determine which [Availability Zones](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/) it can run nodes in and which IP addresses to use from the subnets. For production workloads, we highly recommend that you use DAX with at least three nodes in different Availability Zones. This ensures that the cluster has more than one node left to handle requests even if a single node or Availability Zone fails. A cluster can have up to 11 nodes, where one is a primary node and 10 are read replicas.

## Planning write throughput
<a name="dax-sizing-write-throughput"></a>

All DAX clusters have a primary node for write-through requests. The size of the node type for the cluster determines its write capacity. Adding additional read replicas doesn't increase the cluster's write capacity. Therefore, you should consider the write capacity during cluster creation because you can't change the node type later.

If your application needs to write through DAX to update the item cache, account for the increased use of cluster resources that writes require. Writes against DAX consume about 25 times more resources than cache-hit reads. This might require a larger node type than read-only clusters need.

For additional guidance about determining whether write-through or write-around will work best for your application, see [Strategies for writes](DAX.consistency.md#DAX.consistency.strategies-for-writes).

## Planning read throughput
<a name="dax-sizing-read-throughput"></a>

A DAX cluster's read capacity depends on the cache hit ratio of your workload. Because DAX reads data from DynamoDB when a cache miss occurs, a cache miss consumes approximately 10 times more cluster resources than a cache hit. To increase cache hits, increase the [TTL](dax-config-considerations.md#select-ttl-duration-caches) setting of the cache, which defines the period for which an item is stored in the cache. A higher TTL duration, however, increases the chance of reading older item versions unless updates are written through DAX.

To make sure that the cluster has sufficient read capacity, scale the cluster horizontally as mentioned in [Scaling a cluster horizontally](dax-cluster-operations.md#dax-cluster-horizontal-scaling). Adding more nodes adds read replicas to the pool of resources, while removing nodes reduces read capacity. When you select the number of nodes and their sizes for a cluster, consider both the minimum and maximum amount of read capacity needed. If you can't horizontally scale a cluster with smaller node types to meet your read requirements, use a larger node type.

## Planning dataset size
<a name="dax-sizing-dataset-size"></a>

Each available node type has a set memory size for DAX to cache data. If a node type is too small, the working set of data that an application requests won't fit in memory, which results in cache misses. Because larger nodes support larger caches, use a node type larger than the estimated data set that you need to cache. A larger cache also improves the cache hit ratio.

You might get diminishing returns for caching items with few repeated reads. Calculate the memory size for frequently accessed items and make sure the cache is large enough to store that data set.
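A back-of-the-envelope check like the following can help compare a frequently accessed item set against a node's cache memory. The 25% overhead reserve is an assumed safety margin for metadata and runtime, not a published DAX figure:

```python
def working_set_fits(item_count, avg_item_kb, node_memory_gb, overhead=0.25):
    """Return True if the frequently read items fit in a node's cache,
    reserving the `overhead` fraction of memory for metadata and runtime."""
    working_set_gb = item_count * avg_item_kb / (1024 * 1024)
    usable_gb = node_memory_gb * (1 - overhead)
    return working_set_gb <= usable_gb

# 5 million items at 2 KB each (~9.5 GB) fit in a 16 GB dax.r5.large node
```

If the check fails for your chosen node type, either move up to a larger node type or narrow the estimate to only the hot subset of items.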

## Calculating approximate cluster capacity requirements
<a name="dax-sizing-cluster-capacity"></a>

You can estimate your workload's total capacity needs to help you select the appropriate size and quantity of cluster nodes. To do this estimation, calculate the variable *normalized requests per second* (Normalized RPS). This variable represents the total units of work your application requires the DAX cluster to support, including cache hits, cache misses, and writes. To calculate the Normalized RPS, use the following inputs:
+ `ReadRPS_CacheHit` – Specifies the number of reads per second that result in a cache hit.
+ `ReadRPS_CacheMiss` – Specifies the number of reads per second that result in a cache miss.
+ `WriteRPS` – Specifies the number of writes per second that will go through DAX.
+ `DaxNodeCount` – Specifies the number of nodes in the DAX cluster.
+ `Size` – Specifies the size of the item being written or read in KB rounded up to the nearest KB.
+ `10x_ReadMissFactor` – Represents a value of 10. When a cache miss occurs, DAX uses approximately 10 times more resources than cache hits.
+ `25x_WriteFactor` – Represents a value of 25 because a DAX write-through uses approximately 25 times more resources than cache hits.

Using the following formula, you can calculate the Normalized RPS.

```
Normalized RPS = (ReadRPS_CacheHit * Size) + (ReadRPS_CacheMiss * Size * 10x_ReadMissFactor) + (WriteRPS * 25x_WriteFactor * Size * DaxNodeCount)
```

For example, consider the following input values:
+ `ReadRPS_CacheHit` = 50,000
+ `ReadRPS_CacheMiss` = 1,000
+ `10x_ReadMissFactor` = 10
+ `Size` = 2 KB
+ `WriteRPS` = 100
+ `25x_WriteFactor` = 25
+ `DaxNodeCount` = 3

By substituting these values in the formula, you can calculate the Normalized RPS as follows.

```
Normalized RPS = (50,000 Cache Hits/Sec * 2KB) + (1,000 Cache Misses/Sec * 2KB * 10) + (100 Writes/Sec * 25 * 2KB * 3)
```
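The substitution above can be verified with a short helper. This is a sketch of the formula; the function and parameter names are illustrative:

```python
def normalized_rps(read_hit_rps, read_miss_rps, write_rps, size_kb,
                   node_count, miss_factor=10, write_factor=25):
    """Normalized RPS: cache hits count once, cache misses ten times, and
    write-throughs 25 times on every node in the cluster."""
    return (read_hit_rps * size_kb
            + read_miss_rps * size_kb * miss_factor
            + write_rps * write_factor * size_kb * node_count)

print(normalized_rps(50_000, 1_000, 100, 2, 3))  # prints 135000
```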

In this example, the calculated value of Normalized RPS is 135,000. However, this Normalized RPS value doesn't account for keeping cluster utilization below 100% or node loss. We recommend that you factor in additional capacity. To do this, determine the greater of two multiplying factors: target utilization or node loss tolerance. Then, multiply the Normalized RPS by the greater factor to obtain a *target request per second* (Target RPS).
+ **Target utilization**

  Because cache misses and latency increase as cluster utilization rises, we don't recommend running the DAX cluster at 100% utilization. Ideally, you should keep cluster utilization at or below 70%. To achieve this, multiply the Normalized RPS by 1.43.
+ **Node loss tolerance**

  If a node fails, your application must be able to distribute its requests among the remaining nodes. To make sure nodes stay below 100% utilization, choose a node type large enough to absorb extra traffic until the failed node comes back online. For a cluster with fewer nodes, each node must tolerate larger traffic increases when one node fails.

  If a primary node fails, DAX automatically fails over to a read replica and designates it as the new primary. If a replica node fails, other nodes in the DAX cluster can still serve requests until the failed node is recovered.

  For example, a 3-node DAX cluster with a node failure requires an additional 50% capacity on the remaining two nodes. This requires a multiplying factor of 1.5. Conversely, an 11-node cluster with a failed node requires an additional 10% capacity on the remaining nodes or a multiplying factor of 1.1.

Using the following formula, you can calculate the Target RPS.

```
Target RPS = Normalized RPS * MAX(TargetUtilization, NodeLossTolerance)
```

For example, to calculate Target RPS, consider the following values:
+ `Normalized RPS` = 135,000
+ `TargetUtilization` = 1.43

  Because we're aiming for a maximum cluster utilization of 70%, we're setting `TargetUtilization` to 1.43.
+ `NodeLossTolerance` = 1.5

  Because we're using a 3-node cluster, we set `NodeLossTolerance` to 1.5 (50% additional capacity).

By substituting these values in the formula, you can calculate the Target RPS as follows.

```
Target RPS = 135,000 * MAX(1.43, 1.5)
```

In this example, because the value of `NodeLossTolerance` is greater than `TargetUtilization`, we calculate the value of Target RPS with `NodeLossTolerance`. This gives us a Target RPS of 202,500, which is the total amount of capacity the DAX cluster must support. To determine the number of nodes you'll need in a cluster, map the Target RPS to an appropriate column in the [following table](#dax-sizing-cluster-throughput-capacity). For this example of a Target RPS of 202,500, you need the dax.r5.large node type with three nodes.
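The "greater of two factors" rule can be sketched as follows. Deriving the factors directly from the utilization target (1/0.70, about 1.43) and the node count (n/(n-1)) is an illustrative convenience:

```python
def target_rps(normalized_rps, node_count, target_utilization=0.70):
    """Scale Normalized RPS by the larger of the utilization factor and the
    node-loss tolerance factor (surviving nodes absorb a failed node's load)."""
    utilization_factor = 1 / target_utilization       # 0.70 -> ~1.43
    node_loss_factor = node_count / (node_count - 1)  # 3 nodes -> 1.5
    return normalized_rps * max(utilization_factor, node_loss_factor)

print(target_rps(135_000, node_count=3))  # prints 202500.0
```

For larger clusters the node-loss factor shrinks (11 nodes gives 1.1), so the utilization factor becomes the binding constraint.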

## Approximating cluster throughput capacity by node type
<a name="dax-sizing-cluster-throughput-capacity"></a>

Using the [Target RPS formula](#Target-RPS-formula), you can estimate cluster capacity for different node types. The following table shows approximate capacities for clusters with 1, 3, 5, and 11 nodes of each node type. These capacities don't replace the need to perform load testing of DAX with your own data and request patterns. Additionally, these capacities don't include [t-type](DAX.Burstable.md) instances because they lack fixed CPU capacity. The unit for all values in the following table is Normalized RPS.


| Node type (memory) | 1 node | 3 nodes | 5 nodes | 11 nodes | 
| --- | --- | --- | --- | --- | 
| dax.r5.24xlarge (768GB) | 1M | 3M | 5M | 11M | 
| dax.r5.16xlarge (512GB) | 1M | 3M | 5M | 11M | 
| dax.r5.12xlarge (384GB) | 1M | 3M | 5M | 11M | 
| dax.r5.8xlarge (256GB) | 1M | 3M | 5M | 11M | 
| dax.r5.4xlarge (128GB) | 600k | 1.8M | 3M | 6.6M | 
| dax.r5.2xlarge (64GB) | 300k | 900k | 1.5M | 3.3M | 
| dax.r5.xlarge (32GB) | 150k | 450k | 750k | 1.65M | 
| dax.r5.large (16GB) | 75k | 225k | 375k | 825k | 

Because of the maximum limit of 1 million normalized RPS for each node, node types of dax.r5.8xlarge or larger don't contribute additional throughput capacity to the cluster. However, such node types can be helpful for storing a larger working data set in memory.

## Scaling write capacity in DAX clusters
<a name="dax-sizing-scaling-write-capacity"></a>

Each write to DAX consumes 25 normalized requests on every node. Because there's a 1 million RPS limit for each node, a DAX cluster is limited to 40,000 writes per second, not accounting for read usage.

If your use case requires more than 40,000 writes per second in the cache, you must use separate DAX clusters and shard the writes among them. Similar to DynamoDB, you can hash the partition key for the data you're writing to the cache. Then, use modulus to determine which shard to write the data to.

The following example calculates the hash of an input string. It then calculates the modulus of the hash value with 10.

```
import hashlib

def hash_modulo(input_string):
    # Use a stable hash: Python's built-in hash() is randomized per process,
    # so it would route the same key to different shards across restarts
    digest = hashlib.sha256(input_string.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")

    # Compute the modulus of the hash value with 10
    bucket_number = hash_value % 10

    return bucket_number

# Example usage
if __name__ == "__main__":
    input_string = input("Enter a string: ")
    result = hash_modulo(input_string)
    print(f"The hash modulo 10 of '{input_string}' is: {result}.")
```

# Deploying a cluster
<a name="dax-deploy-cluster"></a>

Creating a new DAX cluster requires configurations beyond those needed for DynamoDB. These configurations are particularly for networking because DAX is based on [Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/what-is-amazon-vpc.html). This gives you complete control over your virtual networking environment, including resource placement, connectivity, and security. This section presents the best practices for the settings needed during cluster creation.

For information about choosing cluster nodes, see [Sizing your DAX cluster](dax-cluster-sizing.md).

**Topics**
+ [Configure networks](#dax-cluster-config-network)
+ [Configure security](#dax-cluster-config-security)
+ [Parameter group](#dax-cluster-parameter-group)
+ [Maintenance window](#dax-cluster-maintenance-window)

## Configure networks
<a name="dax-cluster-config-network"></a>

DAX uses a [subnet group](DAX.concepts.cluster.md#DAX.concepts.cluster.security) to determine which Availability Zones it can run nodes in and which IP addresses to use from the subnets. To minimize latency between your application and DAX, the subnets and Availability Zones for your application servers and the DAX cluster should be the same.

We recommend that you spread the DAX nodes across multiple Availability Zones. The default option of Automatic allocation does this for you.

For best practices about setting up your VPC, see [Get started with Amazon VPC](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-getting-started.html) in the *Amazon VPC User Guide*.

## Configure security
<a name="dax-cluster-config-security"></a>

This section discusses the security measures that you should implement for your applications that use DAX. This section also briefly discusses the support that DAX includes for data encryption.

**IAM**  
DAX and DynamoDB have separate [access control](DAX.access-control.md) mechanisms. DAX requires an IAM role to access your DynamoDB tables. This role should follow the principle of least privilege and grant access only to specific tables and DynamoDB operations, such as [GetItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_GetItem.html) and [PutItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html). For more information about the access control mechanisms provided by DAX, see [DAX access control](DAX.access-control.md).

**Encryption**  
You configure encryption at rest and encryption in transit while creating a DAX cluster. These are enabled by default. We recommend that you keep the default encryption settings unless business requirements prevent it. For more information, see [DAX encryption at rest](DAXEncryptionAtRest.md) and [DAX encryption in transit](DAXEncryptionInTransit.md).

## Parameter group
<a name="dax-cluster-parameter-group"></a>

DAX applies a set of configurations, called a [parameter group](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_dax_ParameterGroup.html), to every node in a cluster. You can change this configuration after creating the cluster.

The DAX parameter group holds the TTL settings for the item cache and query cache. By default, the TTL duration is 5 minutes. You can override the TTL duration with any integer value of 1 millisecond or greater.

You can't modify a parameter group while a running DAX cluster is using it. You can change parameter group values during DAX cluster downtime.

## Maintenance window
<a name="dax-cluster-maintenance-window"></a>

To allow for occasional software upgrades and patches to your nodes, a weekly [maintenance window](DAX.concepts.cluster.md#DAX.concepts.maintenance-window) is configured for the DAX cluster. During this window, DAX performs rolling updates to the nodes. Clusters with more than one node remain available during these updates, but run with reduced capacity until each updated node returns. If your organization has a predictable time of low usage, consider setting the maintenance window to that time.

# Managing cluster operations
<a name="dax-cluster-operations"></a>

DAX handles the cluster’s maintenance and health for you. However, you need to provide operational input to scale the cluster horizontally or vertically to match your usage patterns. This section describes the recommended process to scale your DAX clusters.

**Topics**
+ [Scaling a cluster horizontally](#dax-cluster-horizontal-scaling)
+ [Scaling a cluster vertically](#dax-cluster-vertical-scaling)

## Scaling a cluster horizontally
<a name="dax-cluster-horizontal-scaling"></a>

Scaling a DAX cluster involves adjusting its capacity to meet throughput demands. This adjustment is done by increasing or decreasing the number of nodes (replicas) in the cluster while it's running. This process, known as [horizontal scaling](DAX.cluster-management.md#DAX.cluster-management.scaling.read-scaling), helps distribute the workload across more nodes or consolidate to fewer nodes when demand is low.

You can horizontally scale your DAX cluster in and out using the `decrease-replication-factor` or `increase-replication-factor` commands in the AWS CLI.

**Increase replication factor (scale out)**  
Increasing the replication factor of a DAX cluster adds more nodes to the cluster. The following example shows the usage of the `increase-replication-factor` command.

```
aws dax increase-replication-factor \
    --cluster-name yourClusterName  \
    --new-replication-factor desiredReplicationFactor
```
+ In this command, the `cluster-name` argument specifies the name of your cluster. For example, *yourClusterName*.
+ The `new-replication-factor` argument specifies the total number of nodes in the cluster after scaling, including the primary node and replica nodes. For example, if your cluster currently has 3 nodes and you want to add 2 more, set the value of `new-replication-factor` to 5.

**Decrease replication factor (scale in)**  
Decreasing the replication factor of a DAX cluster removes nodes from the cluster. Removing nodes can help reduce cost during periods of low demand. The following example shows the usage of the `decrease-replication-factor` command.

```
aws dax decrease-replication-factor \
    --cluster-name yourClusterName  \
    --new-replication-factor desiredReplicationFactor
```
+ In this command, the `cluster-name` argument specifies the name of your cluster. For example, *yourClusterName*.
+ The `new-replication-factor` argument specifies the reduced number of nodes in your cluster after scaling. This number must be lower than the current replication factor and must include the primary node. For instance, if your cluster has 5 nodes and you want to remove 2 nodes, set the value of `new-replication-factor` to 3.
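The arithmetic behind the `new-replication-factor` value in both commands can be captured in a small helper (a sketch; the node counts are illustrative):

```python
# Sketch: compute the --new-replication-factor value for a scaling operation.
# The replication factor is the TOTAL node count (primary + replicas), so
# scaling out by N means current + N, and scaling in by N means current - N.
def new_replication_factor(current_nodes, delta):
    target = current_nodes + delta
    if target < 1:
        raise ValueError("a cluster must keep at least its primary node")
    return target

# Scale out: 3-node cluster, add 2 nodes -> pass 5 to increase-replication-factor.
print(new_replication_factor(3, +2))   # 5
# Scale in: 5-node cluster, remove 2 nodes -> pass 3 to decrease-replication-factor.
print(new_replication_factor(5, -2))   # 3
```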

### Horizontal scaling considerations
<a name="dax-horizontal-scaling-considerations"></a>

Consider the following when you plan horizontal scaling:
+ **Primary node** – The DAX cluster includes a primary node. The replication factor includes this primary node. For example, a replication factor of 3 means one primary node and two replica nodes.
+ **Availability** – Adding or removing DAX nodes changes the cluster's availability and fault tolerance. More nodes can improve availability, but they also increase costs.
+ **Data migration** – When you increase the replication factor, DAX automatically handles data distribution across the new set of nodes. When a new node begins serving traffic, its cache is already warmed. However, during this process, there might be a temporary impact on performance during data migration.

Make sure you monitor your DAX clusters closely during and after the scaling process to ensure they're performing as expected and make further adjustments as necessary.

## Scaling a cluster vertically
<a name="dax-cluster-vertical-scaling"></a>

To vertically scale the node size of an existing cluster, you need to create a new cluster and migrate the application traffic to the new cluster. Migrating to a new cluster with different nodes involves several steps to ensure a smooth transition with minimal impact on your application's performance and availability.

To create a new cluster for scaling your node size vertically, consider the following points:
+ **Assess your current setup** – Review the metrics of your current DAX cluster to determine the node size and quantity you need. Use this information to define your new cluster's size. For more information, see [Sizing your DAX cluster](dax-cluster-sizing.md).
+ **Set up a new DAX cluster** – Create a new DAX cluster with the node type and quantity you determined. You can use the existing configuration settings from your [parameter group](dax-deploy-cluster.md#dax-cluster-parameter-group), unless you need to make adjustments.
+ **Synchronize data** – Because DAX is a caching layer for DynamoDB, you don't need to migrate data directly. However, the new DAX cluster won't have any of your working dataset in memory until you send traffic to it.
+ **Update application configuration** – Update your application's configuration to point to the new [DAX cluster's endpoint](DAX.concepts.cluster.md#DAX.concepts.cluster-endpoint). You might need to change code or update environment variables, depending on your application's configuration.

  To reduce impact when you switch to a new cluster, send canary traffic to the new cluster from a small portion of your application fleet. You can do this by slowly rolling out application updates or by using a weight-based routing DNS entry in front of your DAX endpoint.
+ **Monitor and optimize** – After you switch to the new DAX cluster, closely monitor its performance [metrics and logs](DAX.Monitoring.md) for any issues. Be ready to adjust the number of nodes based on updated workload patterns.

  Until the new cluster caches your working dataset properly, you'll see higher cache miss rates and latencies.
+ **Decommission old cluster** – When you're sure that the new cluster is performing as expected, safely decommission the old DAX cluster to avoid unnecessary costs.
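The canary rollout mentioned in the steps above can be sketched as weight-based endpoint selection on the client side. This is only an illustration: the endpoint strings and the 5% weight are assumptions, and a production rollout would more likely use DNS weighting or staged deployments:

```python
import random

# Sketch: route a small fraction of requests to the new DAX cluster's endpoint
# while the rest continue to use the old cluster. Endpoints are placeholders.
OLD_ENDPOINT = "old-cluster.xxxx.dax-clusters.us-east-1.amazonaws.com:8111"
NEW_ENDPOINT = "new-cluster.xxxx.dax-clusters.us-east-1.amazonaws.com:8111"

def pick_endpoint(canary_weight, rng=random.random):
    """Return the new endpoint with probability canary_weight, else the old one."""
    return NEW_ENDPOINT if rng() < canary_weight else OLD_ENDPOINT

# Start with 5% canary traffic, then raise the weight as metrics stay healthy.
endpoint = pick_endpoint(0.05)
```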

# Monitoring DAX
<a name="pres-guide-monitor-dax"></a>

You can monitor key [metrics](dax-metrics-dimensions-dax.md#dax-metrics-dimensions), for example cache hit ratio, to ensure optimal DAX cluster performance, diagnose issues, and determine when you need to scale the cluster. Regularly checking key metrics helps you maintain performance, stability, and cost-efficiency by scaling the cluster to match your workload requirements. For more information about monitoring DAX, see [Production monitoring](dax-production-monitoring.md).

The following list presents some of the key metrics you should monitor:
+ **Cache hit ratio** – Shows how effectively DAX serves cached data, reducing the need to access the underlying DynamoDB tables. A low cache-miss rate for the cluster indicates good caching efficiency, while a low cache-hit rate suggests that you might need to revisit the caching TTL settings or that the workload isn't a good fit for caching.

  Use Amazon CloudWatch to calculate your DAX cluster's cache hit ratio. Compare the `ItemCacheHits`, `ItemCacheMisses`, `QueryCacheHits`, and `QueryCacheMisses` metrics to get this ratio: divide your cache hits by the sum of your cache hits and misses, as the following formula shows.

  ```
  Cache hit ratio = Cache hits / (Cache hits + Cache misses)
  ```

  The cache hit ratio is a number between 0 and 1, often expressed as a percentage. A higher percentage indicates better overall cache utilization.
+ **ErrorRequestCount** – Count of requests that resulted in user errors reported by the node or cluster. `ErrorRequestCount` includes requests that were throttled by the node or cluster. Monitoring user errors can help you identify scaling misconfigurations or hot-item and hot-partition patterns in your application.
+ **Operation latencies** – Monitoring the latency of read and write operations to and from the DAX cluster can help you identify performance bottlenecks. Increasing latencies might indicate issues with your DAX cluster configuration or network, or the need to scale.
+ **Network consumption** – Keep an eye on the `NetworkBytesIn` and `NetworkBytesOut` metrics to monitor your DAX cluster's network traffic. An unexpected increase in network throughput could mean more client requests or inefficient query patterns that are causing more data to be transferred.

  Monitoring network consumption helps you manage costs for your DAX cluster. It also ensures the network doesn't become a bottleneck for cluster performance.
+ **Eviction rate** – Shows how often items are removed from your cache to make room for new items. If the eviction rate increases over time, your cache might be too small or your caching strategy isn't effective.

  Monitor the `EvictedSize` metric in CloudWatch to determine if your cache size is adequate for your workload. If the total evicted size keeps growing, you might need to scale up your DAX cluster to accommodate a larger cache.
+ **CPU utilization** – Refers to the percentage of CPU utilization of the node or cluster. This is a critical metric to monitor for any database or caching system. High CPU utilization could mean your DAX cluster might be overloaded and needs scaling to handle the increased demand.

  Monitor the `CPUUtilization` metric for your DAX cluster. If your CPU utilization consistently approaches or exceeds 70-80%, consider [scaling up your DAX cluster](#dax-cluster-scale-monitoring-data) as described in the following section.

  If the number of requests sent to DAX exceeds a node's capacity, DAX limits the rate at which it accepts additional requests by returning a `ThrottlingException`. DAX continuously evaluates your cluster's CPU utilization to determine the request volume it can process while maintaining a healthy cluster state.

  You can monitor the `ThrottledRequestCount` metric that DAX publishes to CloudWatch. If you see these exceptions regularly, you should consider scaling up your cluster.
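Applying the cache hit ratio formula from the list above to the four CloudWatch metric sums can be sketched as follows (the sample values are illustrative, not real cluster output):

```python
# Sketch: compute the overall cache hit ratio from CloudWatch metric sums
# (ItemCacheHits, ItemCacheMisses, QueryCacheHits, QueryCacheMisses).
def cache_hit_ratio(item_hits, item_misses, query_hits, query_misses):
    hits = item_hits + query_hits
    total = hits + item_misses + query_misses
    return hits / total if total else 0.0

# Illustrative sample: 11,300 hits out of 12,000 requests.
ratio = cache_hit_ratio(item_hits=9500, item_misses=500,
                        query_hits=1800, query_misses=200)
print(f"cache hit ratio: {ratio:.1%}")  # 94.2%
```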

## Scaling your DAX cluster using monitoring data
<a name="dax-cluster-scale-monitoring-data"></a>

You can determine if you need to scale up or down your DAX cluster by monitoring its performance metrics.
+ **Scale up or out** – If your DAX cluster has high CPU utilization, low cache hits (after optimizing the caching strategy), or high operation latencies, you should scale up your cluster. Adding more nodes, also called scaling out, can help distribute the load more evenly. For workloads with increasing writes per second, you might need to choose more powerful nodes (scaling up).
+ **Scale down** – If you consistently see low CPU utilization and operation latencies below your thresholds, you might have over-provisioned resources. In that case, remove nodes to reduce costs. You can reduce the cluster to a single node during low-utilization periods, but you can't shut it down entirely.
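The scale-up and scale-down guidance above can be sketched as a simple decision rule. The thresholds here are illustrative assumptions for a hypothetical workload, not DAX recommendations; tune them against your own metrics and SLOs:

```python
# Sketch: recommend a scaling action from recent cluster metrics.
# All thresholds are illustrative; tune them to your workload and SLOs.
def scaling_action(avg_cpu_pct, p99_latency_ms, latency_slo_ms):
    if avg_cpu_pct >= 70 or p99_latency_ms > latency_slo_ms:
        return "scale up or out"   # add nodes, or move to larger node types
    if avg_cpu_pct < 20 and p99_latency_ms < latency_slo_ms / 2:
        return "scale down"        # reduce node count (minimum of 1 node)
    return "no change"

print(scaling_action(avg_cpu_pct=85, p99_latency_ms=4.0, latency_slo_ms=5.0))  # scale up or out
print(scaling_action(avg_cpu_pct=10, p99_latency_ms=1.0, latency_slo_ms=5.0))  # scale down
```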