

# Estimate capacity consumption of read and write throughput in Amazon Keyspaces
<a name="capacity-examples"></a>

When you read or write data in Amazon Keyspaces, the amount of read/write request units (RRUs/WRUs) or read/write capacity units (RCUs/WCUs) your query consumes depends on the total amount of data Amazon Keyspaces has to process to run the query. In some cases, the data returned to the client can be a subset of the data that Amazon Keyspaces had to read to process the query. For conditional writes, Amazon Keyspaces consumes write capacity even if the conditional check fails.

To estimate the total amount of data being processed for a request, you have to consider the encoded size of a row and the total number of rows. This topic covers some examples of common scenarios and access patterns to show how Amazon Keyspaces processes queries and how that affects capacity consumption. You can follow the examples to estimate the capacity requirements of your tables and use Amazon CloudWatch to observe the read and write capacity consumption for these use cases.

For information on how to calculate the encoded size of rows in Amazon Keyspaces, see [Estimate row size in Amazon Keyspaces](calculating-row-size.md).

**Topics**
+ [Estimate the capacity consumption of range queries in Amazon Keyspaces](range_queries.md)
+ [Estimate the read capacity consumption of limit queries](limit_queries.md)
+ [Estimate the read capacity consumption of table scans](table_scans.md)
+ [Estimate capacity consumption of lightweight transactions in Amazon Keyspaces](lightweight_transactions.md)
+ [Estimate capacity consumption for static columns in Amazon Keyspaces](static-columns.md)
+ [Estimate and provision capacity for a multi-Region table in Amazon Keyspaces](tables-multi-region-capacity.md)
+ [Estimate read and write capacity consumption with Amazon CloudWatch in Amazon Keyspaces](estimate_consumption_cw.md)

# Estimate the capacity consumption of range queries in Amazon Keyspaces
<a name="range_queries"></a>

To look at the read capacity consumption of a range query, we use the following example table in on-demand capacity mode.

```
 pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
 a   | b   |   1 | a   | b   |  50 | <any value that results in a row size larger than 4KB>
 a   | b   |   1 | a   | b   |  60 | value_1
 a   | b   |   1 | a   | b   |  70 | <any value that results in a row size larger than 4KB>
```

Now run the following query on this table.

```
SELECT * FROM amazon_keyspaces.example_table_1 WHERE pk1='a' AND pk2='b' AND pk3=1 AND ck1='a' AND ck2='b' AND ck3 > 50 AND ck3 < 70;
```

You receive the following result set from the query and the read operation performed by Amazon Keyspaces consumes 2 RRUs in `LOCAL_QUORUM` consistency mode.

```
 pk1 | pk2 | pk3 | ck1 | ck2 | ck3 | value
-----+-----+-----+-----+-----+-----+-------
 a   | b   |   1 | a   | b   |  60 | value_1
```

Amazon Keyspaces consumes 2 RRUs to evaluate the rows with the values `ck3=60` and `ck3=70` to process the query. However, Amazon Keyspaces only returns the row where the `WHERE` condition specified in the query is true, which is the row with value `ck3=60`. To evaluate the range specified in the query, Amazon Keyspaces reads the row matching the upper bound of the range, in this case `ck3 = 70`, but doesn’t return that row in the result. The read capacity consumption is based on the data read when processing the query, not on the data returned.
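The arithmetic above can be sketched as a small estimator. This is a rough model, not a metering specification: it assumes one RRU covers up to 4 KB of data processed at `LOCAL_QUORUM`, that `LOCAL_ONE` reads consume half as many units, and the example row sizes (about 1 KB for `ck3=60` and about 5 KB for `ck3=70`) are illustrative guesses.

```python
import math

def estimate_rrus(row_sizes_kb, consistency="LOCAL_QUORUM"):
    """Rough RRU estimate for a read: 1 RRU per 4 KB of data processed
    at LOCAL_QUORUM (assumption); LOCAL_ONE consumes half as many units."""
    total_kb = sum(row_sizes_kb)
    rrus = math.ceil(total_kb / 4)
    if consistency == "LOCAL_ONE":
        rrus = math.ceil(rrus / 2)
    return max(1, rrus)

# The range query evaluates two rows, even though only one is returned:
# ck3=60 (small, ~1 KB assumed) and ck3=70 (larger than 4 KB, ~5 KB assumed).
print(estimate_rrus([1, 5]))  # 2 RRUs at LOCAL_QUORUM
```

The key point the sketch captures is that both evaluated rows count toward the total, not just the row in the result set.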

# Estimate the read capacity consumption of limit queries
<a name="limit_queries"></a>

When processing a query that uses the `LIMIT` clause, Amazon Keyspaces reads rows up to the maximum page size while trying to match the condition specified in the query. If Amazon Keyspaces can't find enough matching data on the first page to meet the `LIMIT` value, one or more paginated calls could be needed. To continue reads on the next page, you can use a pagination token. The default page size is 1MB. To consume less read capacity when using `LIMIT` clauses, you can reduce the page size. For more information about pagination, see [Paginate results in Amazon Keyspaces](paginating-results.md).

For an example, let's look at the following query.

```
SELECT * FROM my_table WHERE partition_key=1234 LIMIT 1;
```

If you don’t set the page size, Amazon Keyspaces reads 1MB of data even though it returns only 1 row to you. To have Amazon Keyspaces read only one row, you can set the page size to 1 for this query. In this case, Amazon Keyspaces reads only one row, provided that no rows have expired based on Time to Live (TTL) settings or client-side timestamps.

The `PAGE SIZE` parameter determines how many rows Amazon Keyspaces scans from disk for each request, not how many rows it returns to the client. Amazon Keyspaces applies the filters you provide, for example an inequality on non-key columns or a `LIMIT` clause, after it scans the data on disk. If you don’t explicitly set the `PAGE SIZE`, Amazon Keyspaces reads up to 1MB of data before applying filters. For example, if you use `LIMIT 1` without specifying the `PAGE SIZE`, Amazon Keyspaces could read thousands of rows from disk before applying the `LIMIT` clause and returning only a single row.

To avoid over-reading, reduce the `PAGE SIZE`, which reduces the number of rows Amazon Keyspaces scans on each fetch. For example, if you define `LIMIT 5` in your query, set the `PAGE SIZE` to a value between 5 and 10 so that Amazon Keyspaces scans only 5 to 10 rows on each paginated call. You can adjust this number to reduce the number of fetches. For limits that are larger than the page size, Amazon Keyspaces maintains the total result count in the pagination state. For a `LIMIT` of 10,000 rows, Amazon Keyspaces can fetch the results in two pages of 5,000 rows each. The 1MB limit is the upper bound for any page size you set.
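The relationship between `LIMIT` and `PAGE SIZE` can be sketched as follows. This is a best-case model that assumes every scanned row matches the query's filters; with filtering, more pages may be needed.

```python
import math

def paginated_calls(limit, page_size):
    """Best-case number of fetches needed to return `limit` matching rows,
    assuming every row scanned on a page matches the query's filters."""
    return math.ceil(limit / page_size)

print(paginated_calls(10_000, 5_000))  # 2 pages for LIMIT 10,000
print(paginated_calls(5, 10))          # 1 page when PAGE SIZE >= LIMIT
```

Setting the page size close to the limit keeps the number of scanned rows, and therefore the read capacity consumed, close to the number of rows you actually want back.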

# Estimate the read capacity consumption of table scans
<a name="table_scans"></a>

Queries that result in full table scans, for example queries using the `ALLOW FILTERING` option, are another example of queries that process more data than they return as results. Again, the read capacity consumption is based on the data read, not the data returned.

For the table scan example we use the following example table in on-demand capacity mode.

```
 pk | ck | value
----+----+---------
 pk | 10 | <any value that results in a row size larger than 4KB>
 pk | 20 | value_1
 pk | 30 | <any value that results in a row size larger than 4KB>
```

Amazon Keyspaces creates a table in on-demand capacity mode with four partitions by default. In this example table, all the data is stored in one partition and the remaining three partitions are empty.

Now run the following query on the table.

```
SELECT * from amazon_keyspaces.example_table_2;
```

This query results in a table scan operation where Amazon Keyspaces scans all four partitions of the table and consumes 6 RRUs in `LOCAL_QUORUM` consistency mode. First, Amazon Keyspaces consumes 3 RRUs for reading the three rows with `pk='pk'`. Then, Amazon Keyspaces consumes the additional 3 RRUs for scanning the three empty partitions of the table. Because this query results in a table scan, Amazon Keyspaces scans all the partitions in the table, including partitions without data.
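Following the example above, the scan cost can be sketched as the RRUs consumed reading the populated rows plus one RRU per empty partition visited. The per-row and per-empty-partition costs here are taken from this example, not a general metering formula.

```python
def estimate_scan_rrus(row_rrus, empty_partitions):
    """Table-scan RRU estimate per the example above: the RRUs spent
    reading rows, plus one RRU for each empty partition the scan visits."""
    return row_rrus + empty_partitions

# Three rows (3 RRUs in this example) plus three empty partitions:
print(estimate_scan_rrus(row_rrus=3, empty_partitions=3))  # 6 RRUs
```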

# Estimate capacity consumption of lightweight transactions in Amazon Keyspaces
<a name="lightweight_transactions"></a>

Lightweight transactions (LWT) allow you to perform conditional write operations against your table data. Conditional update operations are useful when inserting, updating, or deleting records based on conditions that evaluate the current state.

In Amazon Keyspaces, all write operations require `LOCAL_QUORUM` consistency, and there is no additional charge for using LWTs. The difference for LWTs is that when an LWT condition check results in `FALSE`, Amazon Keyspaces still consumes write capacity units (WCUs) or write request units (WRUs). The number of WCUs/WRUs consumed depends on the size of the row.

For example, if the row size is 2 KB, the failed conditional write consumes two WCUs/WRUs. If the row doesn’t currently exist in the table, the operation consumes one WCU/WRU.

To determine the number of requests that resulted in condition check failures, you can monitor the `ConditionalCheckFailed` metric in CloudWatch.
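The failed-condition cost described above can be sketched as follows. This assumes the standard 1 KB-per-WCU/WRU write rate and rounds the existing row size up to whole units; treat it as an estimate, not billing logic.

```python
import math

def failed_lwt_write_units(existing_row_kb=None):
    """WCUs/WRUs consumed by a conditional write whose check fails:
    based on the existing row's size (1 KB per unit, rounded up),
    or a single unit if the row doesn't exist."""
    if existing_row_kb is None:
        return 1
    return max(1, math.ceil(existing_row_kb))

print(failed_lwt_write_units(2))     # 2 units for an existing 2 KB row
print(failed_lwt_write_units(None))  # 1 unit when the row doesn't exist
```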

## Estimate LWT costs for tables with Time to Live (TTL)
<a name="lightweight_transactions_ttl"></a>

LWTs can require additional read capacity units (RCUs) or read request units (RRUs) for tables configured with TTL that don't use client-side timestamps. When a condition check using the `IF EXISTS` or `IF NOT EXISTS` keywords results in `FALSE`, the following capacity units are consumed:
+ RCUs/RRUs – If the row exists, the RCUs/RRUs consumed are based on the size of the existing row.
+ RCUs/RRUs – If the row doesn't exist, a single RCU/RRU is consumed.

If the evaluated condition results in a successful write operation, WCUs/WRUs are consumed based on the size of the new row.
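The extra read cost for TTL tables can be sketched in the same way. The 4 KB-per-RCU/RRU rate here is an assumption based on standard read metering; the doc only states that the units depend on the existing row's size.

```python
import math

def failed_lwt_ttl_read_units(existing_row_kb=None):
    """Additional RCUs/RRUs for a failed IF EXISTS / IF NOT EXISTS check
    on a TTL table without client-side timestamps: based on the existing
    row's size (4 KB per unit assumed), or one unit if the row is absent."""
    if existing_row_kb is None:
        return 1
    return max(1, math.ceil(existing_row_kb / 4))

print(failed_lwt_ttl_read_units(6))     # 2 units for an existing 6 KB row
print(failed_lwt_ttl_read_units(None))  # 1 unit when the row doesn't exist
```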

# Estimate capacity consumption for static columns in Amazon Keyspaces
<a name="static-columns"></a>

In an Amazon Keyspaces table with clustering columns, you can use the `STATIC` keyword to create a static column. The value stored in a static column is shared between all rows in a logical partition. When you update the value of this column, Amazon Keyspaces applies the change automatically to all rows in the partition. 

This section describes how to calculate the encoded size of data when you're writing to static columns. This process is handled separately from the process that writes data to the nonstatic columns of a row. In addition to size quotas for static data, read and write operations on static columns also affect metering and throughput capacity for tables independently. For functional differences with Apache Cassandra when using static columns and paginated range read results, see [Pagination](functional-differences.md#functional-differences.paging).

**Topics**
+ [Calculate the static column size per logical partition in Amazon Keyspaces](static-columns-estimate.md)
+ [Estimate capacity throughput requirements for read/write operations on static data in Amazon Keyspaces](static-columns-metering.md)

# Calculate the static column size per logical partition in Amazon Keyspaces
<a name="static-columns-estimate"></a>

This section provides details about how to estimate the encoded size of static columns in Amazon Keyspaces. The encoded size is used when you're calculating your bill and quota use. You should also use the encoded size when you calculate provisioned throughput capacity requirements for tables. To calculate the encoded size of static columns in Amazon Keyspaces, you can use the following guidelines.
+ Partition keys can contain up to 2048 bytes of data. Each key column in the partition key requires up to 3 bytes of metadata. These metadata bytes count towards your static data size quota of 1 MB per partition. When calculating the size of your static data, you should assume that each partition key column uses the full 3 bytes of metadata.
+ Use the raw size of the static column data values based on the data type. For more information about data types, see [Data types](cql.elements.md#cql.data-types).
+ Add 104 bytes to the size of the static data for metadata.
+ Clustering columns and regular, nonprimary key columns do not count towards the size of static data. To learn how to estimate the size of nonstatic data within rows, see [Estimate row size in Amazon Keyspaces](calculating-row-size.md).

The total encoded size of a static column is based on the following formula:

```
partition key columns + static columns + metadata = total encoded size of static data
```

Consider the following example of a table where all columns are of type integer. The table has two partition key columns, two clustering columns, one regular column, and one static column.

```
CREATE TABLE mykeyspace.mytable (
    pk_col1 int,
    pk_col2 int,
    ck_col1 int,
    ck_col2 int,
    reg_col1 int,
    static_col1 int static,
    PRIMARY KEY ((pk_col1, pk_col2), ck_col1, ck_col2)
);
```

In this example, we calculate the size of static data of the following statement:

```
INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, static_col1) values(1,2,6);
```

To estimate the total bytes required by this write operation, you can use the following steps.

1. Calculate the size of a partition key column by adding the bytes for the data type stored in the column and the metadata bytes. Repeat this for all partition key columns.

   1. Calculate the size of the first column of the partition key (pk_col1):

      ```
      4 bytes for the integer data type + 3 bytes for partition key metadata = 7 bytes
      ```

   1. Calculate the size of the second column of the partition key (pk_col2):

      ```
      4 bytes for the integer data type + 3 bytes for partition key metadata = 7 bytes
      ```

   1. Add both columns to get the total estimated size of the partition key columns: 

      ```
      7 bytes + 7 bytes = 14 bytes for the partition key columns
      ```

1. Add the size of the static columns. In this example, we only have one static column that stores an integer (which requires 4 bytes).

1. Finally, to get the total encoded size of the static column data, add up the bytes for the primary key columns and static columns, and add the additional 104 bytes for metadata:

   ```
   14 bytes for the partition key columns + 4 bytes for the static column + 104 bytes for metadata = 122 bytes.
   ```
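The steps above can be collected into one helper. It applies the guidelines from this section: 3 bytes of metadata per partition key column, the raw size of each static value, and 104 bytes of static metadata.

```python
def static_data_size(pk_column_bytes, static_column_bytes):
    """Encoded size of static data per the formula above:
    partition key columns (+3 metadata bytes each) + static columns
    + 104 bytes of static metadata."""
    pk = sum(size + 3 for size in pk_column_bytes)
    return pk + sum(static_column_bytes) + 104

# Two int partition key columns (4 bytes each) and one int static column:
print(static_data_size([4, 4], [4]))  # 122 bytes
```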

You can also update static and nonstatic data with the same statement. To estimate the total size of the write operation, you must first calculate the size of the nonstatic data update. Then calculate the size of the row update as shown in the example at [Estimate row size in Amazon Keyspaces](calculating-row-size.md), and add the results. 

In this case, you can write a total of 2 MB: 1 MB is the maximum row size quota, and 1 MB is the quota for the maximum static data size per logical partition.

To calculate the total size of an update of static and nonstatic data in the same statement, you can use the following formula:

```
(partition key columns + static columns + metadata = total encoded size of static data) + (partition key columns + clustering columns + regular columns + row metadata = total encoded size of row)
= total encoded size of data written
```

Consider the following example of a table where all columns are of type integer. The table has two partition key columns, two clustering columns, one regular column, and one static column.

```
CREATE TABLE mykeyspace.mytable (
    pk_col1 int,
    pk_col2 int,
    ck_col1 int,
    ck_col2 int,
    reg_col1 int,
    static_col1 int static,
    PRIMARY KEY ((pk_col1, pk_col2), ck_col1, ck_col2)
);
```

In this example, we calculate the size of data when we write a row to the table, as shown in the following statement:

```
INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, ck_col1, ck_col2, reg_col1, static_col1) values(2,3,4,5,6,7);
```

To estimate the total bytes required by this write operation, you can use the following steps.

1. Calculate the total encoded size of static data as shown earlier. In this example, it's 122 bytes.

1. Add the size of the total encoded size of the row based on the update of nonstatic data, following the steps at [Estimate row size in Amazon Keyspaces](calculating-row-size.md). In this example, the total size of the row update is 134 bytes.

   ```
   122 bytes for static data + 134 bytes for nonstatic data = 256 bytes.
   ```

# Estimate capacity throughput requirements for read/write operations on static data in Amazon Keyspaces
<a name="static-columns-metering"></a>

Static data is associated with logical partitions in Cassandra, not individual rows. Logical partitions in Amazon Keyspaces can be virtually unbounded in size by spanning multiple physical storage partitions. As a result, Amazon Keyspaces meters write operations on static and nonstatic data separately. Furthermore, writes that include both static and nonstatic data require additional underlying operations to provide data consistency.

If you perform a mixed write operation of both static and nonstatic data, this results in two separate write operations—one for nonstatic and one for static data. This applies to both on-demand and provisioned read/write capacity modes.

The following example provides details about how to estimate the required read capacity units (RCUs) and write capacity units (WCUs) when you're calculating provisioned throughput capacity requirements for tables in Amazon Keyspaces that have static columns. You can estimate how much capacity your table needs to process writes that include both static and nonstatic data by using the following formula:

```
2 x WCUs required for nonstatic data + 2 x WCUs required for static data
```

For example, if your application writes 27 KB of data per second and each write includes 25.5 KB of nonstatic data and 1.5 KB of static data, then your table requires 56 WCUs (2 x 26 WCUs + 2 x 2 WCUs).
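The formula and example above can be sketched as a helper that assumes the standard 1 KB-per-WCU write rate, rounding each component up to whole units before doubling.

```python
import math

def mixed_write_wcus(nonstatic_kb, static_kb):
    """WCUs for a per-second write workload that touches both static and
    nonstatic data: two separate write operations, each doubled per the
    formula above (1 KB per WCU assumed, rounded up)."""
    return 2 * math.ceil(nonstatic_kb) + 2 * math.ceil(static_kb)

# 25.5 KB nonstatic + 1.5 KB static per second:
print(mixed_write_wcus(25.5, 1.5))  # 56 WCUs
```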

Amazon Keyspaces meters the reads of static and nonstatic data the same as reads of multiple rows. As a result, the price of reading static and nonstatic data in the same operation is based on the aggregate size of the data processed to perform the read.

To learn how to monitor serverless resources with Amazon CloudWatch, see [Monitoring Amazon Keyspaces with Amazon CloudWatch](monitoring-cloudwatch.md).

# Estimate and provision capacity for a multi-Region table in Amazon Keyspaces
<a name="tables-multi-region-capacity"></a>

You can configure the throughput capacity of a multi-Region table in one of two ways:
+ On-demand capacity mode, measured in write request units (WRUs)
+ Provisioned capacity mode with auto scaling, measured in write capacity units (WCUs)

You can use provisioned capacity mode with auto scaling or on-demand capacity mode to help ensure that a multi-Region table has sufficient capacity to perform replicated writes to all AWS Regions.

**Note**  
Changing the capacity mode of the table in one of the Regions changes the capacity mode for all replicas.

By default, Amazon Keyspaces uses on-demand mode for multi-Region tables. With on-demand mode, you don't need to specify how much read and write throughput you expect your application to perform. Amazon Keyspaces instantly accommodates your workloads as they ramp up or down to any previously reached traffic level. If a workload’s traffic level hits a new peak, Amazon Keyspaces adapts rapidly to accommodate the workload.

If you choose provisioned capacity mode for a table, you have to configure the number of read capacity units (RCUs) and write capacity units (WCUs) per second that your application requires. 

To plan a multi-Region table's throughput capacity needs, you should first estimate the number of WCUs per second needed for each Region. Then you add the writes from all Regions that your table is replicated in, and use the sum to provision capacity for each Region. This is required because every write that is performed in one Region must also be repeated in each replica Region. 

If the table doesn't have enough capacity to handle the writes from all Regions, capacity exceptions occur. In addition, inter-Region replication latency rises.

For example, if you have a multi-Region table where you expect 5 writes per second in US East (N. Virginia), 10 writes per second in US East (Ohio), and 5 writes per second in Europe (Ireland), you should expect the table to consume 20 WCUs in each Region: US East (N. Virginia), US East (Ohio), and Europe (Ireland). That means that in this example, you need to provision 20 WCUs for each of the table's replicas. You can monitor your table's capacity consumption using Amazon CloudWatch. For more information, see [Monitoring Amazon Keyspaces with Amazon CloudWatch](monitoring-cloudwatch.md). 

Each write is billed as 1 WCU, so you would see a total of 60 WCUs billed in this example. For more information about pricing, see [Amazon Keyspaces (for Apache Cassandra) pricing](https://aws.amazon.com/keyspaces/pricing). 
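The provisioning rule and billing arithmetic above can be sketched as follows. The Region names are illustrative; the point is that each replica must be provisioned for the sum of all Regions' writes.

```python
def multi_region_wcus(writes_per_second_by_region):
    """Each Region must absorb its own writes plus replicated writes from
    every other Region, so provision the sum in each replica. The billed
    total is that sum multiplied by the number of replicas."""
    per_region = sum(writes_per_second_by_region.values())
    total_billed = per_region * len(writes_per_second_by_region)
    return per_region, total_billed

per_region, total = multi_region_wcus(
    {"us-east-1": 5, "us-east-2": 10, "eu-west-1": 5}
)
print(per_region, total)  # 20 WCUs per Region, 60 WCUs billed in total
```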

For more information about provisioned capacity with Amazon Keyspaces auto scaling, see [Manage throughput capacity automatically with Amazon Keyspaces auto scaling](autoscaling.md). 

**Note**  
If a table is running in provisioned capacity mode with auto scaling, the provisioned write capacity is allowed to float within those auto scaling settings for each Region. 

# Estimate read and write capacity consumption with Amazon CloudWatch in Amazon Keyspaces
<a name="estimate_consumption_cw"></a>

To estimate and monitor read and write capacity consumption, you can use a CloudWatch dashboard. For more information about available metrics for Amazon Keyspaces, see [Amazon Keyspaces metrics and dimensions](metrics-dimensions.md). 

To monitor read and write capacity units consumed by a specific statement with CloudWatch, you can follow these steps.

1. Create a new table with sample data.

1. Configure an Amazon Keyspaces CloudWatch dashboard for the table. To get started, you can use a dashboard template available on [GitHub](https://github.com/aws-samples/amazon-keyspaces-cloudwatch-cloudformation-templates).

1. Run a CQL statement, for example one using the `ALLOW FILTERING` option, and check in the dashboard the read capacity units consumed by the full table scan.