

# Monitor Kinesis Data Streams
<a name="monitoring"></a>

You can monitor your data streams in Amazon Kinesis Data Streams using the following features:
+ [CloudWatch metrics](monitoring-with-cloudwatch.md) — Kinesis Data Streams sends Amazon CloudWatch custom metrics with detailed monitoring for each stream.
+ [Kinesis Agent](agent-health.md) — The Kinesis Agent publishes custom CloudWatch metrics to help assess whether the agent is working as expected.
+ [API logging](logging-using-cloudtrail.md) — Kinesis Data Streams uses AWS CloudTrail to log API calls and store the data in an Amazon S3 bucket.
+ [Kinesis Client Library](monitoring-with-kcl.md) — The Kinesis Client Library (KCL) provides metrics per shard, worker, and KCL application.
+ [Kinesis Producer Library](monitoring-with-kpl.md) — The Amazon Kinesis Producer Library (KPL) provides metrics per shard, worker, and KPL application.

For more information about common monitoring issues, questions, and troubleshooting, see the following:
+ [Which metrics should I use to monitor and troubleshoot Kinesis Data Streams issues?](https://aws.amazon.com/premiumsupport/knowledge-center/kinesis-data-streams-troubleshoot/)
+ [Why does the IteratorAgeMilliseconds value in Kinesis Data Streams keep increasing?](https://aws.amazon.com/premiumsupport/knowledge-center/kinesis-data-streams-iteratorage-metric/)

# Monitor the Amazon Kinesis Data Streams service with Amazon CloudWatch
<a name="monitoring-with-cloudwatch"></a>

Amazon Kinesis Data Streams and Amazon CloudWatch are integrated so that you can collect, view, and analyze CloudWatch metrics for your Kinesis data streams. For example, to track shard usage, you can monitor the `IncomingBytes` and `OutgoingBytes` metrics and compare them to the number of shards in the stream.
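The shard-usage comparison above can be sketched in a few lines. This is a minimal illustration, not an official utility: it takes a one-minute `IncomingBytes` Sum and the stream's shard count, and compares the per-shard ingest rate against the 1 MB per second per-shard write limit.

```python
def shard_write_utilization(incoming_bytes_per_min_sum, shard_count):
    """Approximate write utilization for a stream: average per-shard ingest
    rate as a fraction of the 1 MB/s per-shard write limit."""
    bytes_per_sec_per_shard = incoming_bytes_per_min_sum / 60 / shard_count
    return bytes_per_sec_per_shard / 1_000_000  # fraction of the 1 MB/s limit

# 30 MB ingested in one minute across 2 shards -> 25% of write capacity
util = shard_write_utilization(30_000_000, 2)
```

A sustained utilization approaching 1.0 suggests the stream needs more shards (or that the partition key distribution should be reviewed, since this average hides hot shards).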

Stream metrics and shard-level metrics that you configure are automatically collected and pushed to CloudWatch every minute. Metrics are archived for two weeks; after that period, the data is discarded.

The following table describes basic stream-level and enhanced shard-level monitoring for Kinesis data streams.


| Type | Description | 
| --- | --- | 
|  Basic (stream-level)  |  Stream-level data is sent automatically every minute at no charge.  | 
|  Enhanced (shard-level)  |  Shard-level data is sent every minute for an additional cost. To get this level of data, you must specifically enable it for the stream using the [EnableEnhancedMonitoring](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_EnableEnhancedMonitoring.html) operation. For information about pricing, see the [Amazon CloudWatch product page](https://aws.amazon.com/cloudwatch).  | 

## Amazon Kinesis Data Streams dimensions and metrics
<a name="kinesis-metrics"></a>

Kinesis Data Streams sends metrics to CloudWatch at two levels: the stream level and, optionally, the shard level. Stream-level metrics cover the most common monitoring use cases in normal conditions. Shard-level metrics are for specific monitoring tasks, usually related to troubleshooting, and are enabled using the [EnableEnhancedMonitoring](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_EnableEnhancedMonitoring.html) operation.
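As a sketch of enabling shard-level metrics, the helper below validates the requested metric names against the shard-level metric set (plus the special value `ALL`) and builds the request for the `EnableEnhancedMonitoring` operation. The stream name is a placeholder, and making the live call requires boto3 and AWS credentials.

```python
# Shard-level metric names accepted by EnableEnhancedMonitoring;
# "ALL" enables every one of them.
SHARD_LEVEL_METRICS = {
    "IncomingBytes", "IncomingRecords", "IteratorAgeMilliseconds",
    "OutgoingBytes", "OutgoingRecords",
    "ReadProvisionedThroughputExceeded",
    "WriteProvisionedThroughputExceeded", "ALL",
}

def build_enhanced_monitoring_request(stream_name, metrics):
    """Validate metric names and build the EnableEnhancedMonitoring request."""
    unknown = set(metrics) - SHARD_LEVEL_METRICS
    if unknown:
        raise ValueError(f"Unknown shard-level metrics: {sorted(unknown)}")
    return {"StreamName": stream_name, "ShardLevelMetrics": list(metrics)}

request = build_enhanced_monitoring_request(
    "my-stream",
    ["IteratorAgeMilliseconds", "WriteProvisionedThroughputExceeded"],
)
# To apply it (requires AWS credentials):
# import boto3
# boto3.client("kinesis").enable_enhanced_monitoring(**request)
```

Enabling only the metrics you need keeps the enhanced-monitoring cost proportional to what you actually watch.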

For an explanation of the statistics gathered from CloudWatch metrics, see [CloudWatch Statistics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/cloudwatch_concepts.html#Statistic) in the *Amazon CloudWatch User Guide*.

**Topics**
+ [Basic stream-level metrics](#kinesis-metrics-stream)
+ [Enhanced shard-level metrics](#kinesis-metrics-shard)
+ [Dimensions for Amazon Kinesis Data Streams metrics](#kinesis-metricdimensions)
+ [Recommended Amazon Kinesis Data Streams metrics](#kinesis-metric-use)

### Basic stream-level metrics
<a name="kinesis-metrics-stream"></a>

The `AWS/Kinesis` namespace includes the following stream-level metrics.

Kinesis Data Streams sends these stream-level metrics to CloudWatch every minute. These metrics are always available.


| Metric | Description | 
| --- | --- | 
| GetRecords.Bytes |  The number of bytes retrieved from the Kinesis stream, measured over the specified time period. Minimum, Maximum, and Average statistics represent the bytes in a single `GetRecords` operation for the stream in the specified time period. Shard-level metric name: `OutgoingBytes` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| GetRecords.IteratorAge |  This metric is no longer used. Use `GetRecords.IteratorAgeMilliseconds`.  | 
| GetRecords.IteratorAgeMilliseconds |  The age of the last record in all `GetRecords` calls made against a Kinesis stream, measured over the specified time period. Age is the difference between the current time and when the last record of the `GetRecords` call was written to the stream. The Minimum and Maximum statistics can be used to track the progress of Kinesis consumer applications. A value of zero indicates that the records being read are completely caught up with the stream. Shard-level metric name: `IteratorAgeMilliseconds` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Samples Units: Milliseconds  | 
| GetRecords.Latency |  The time taken per `GetRecords` operation, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average Units: Milliseconds  | 
| GetRecords.Records |  The number of records retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the records in a single `GetRecords` operation for the stream in the specified time period. Shard-level metric name: `OutgoingRecords` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| GetRecords.Success |  The number of successful `GetRecords` operations per stream, measured over the specified time period. Dimensions: StreamName Statistics: Average, Sum, Samples Units: Count  | 
| IncomingBytes |  The number of bytes successfully put to the Kinesis stream over the specified time period. This metric includes bytes from `PutRecord` and `PutRecords` operations. Minimum, Maximum, and Average statistics represent the bytes in a single put operation for the stream in the specified time period. Shard-level metric name: `IncomingBytes` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| IncomingRecords |  The number of records successfully put to the Kinesis stream over the specified time period. This metric includes record counts from `PutRecord` and `PutRecords` operations. Minimum, Maximum, and Average statistics represent the records in a single put operation for the stream in the specified time period. Shard-level metric name: `IncomingRecords` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| PutRecord.Bytes |  The number of bytes put to the Kinesis stream using the `PutRecord` operation over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| PutRecord.Latency |  The time taken per `PutRecord` operation, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average Units: Milliseconds  | 
| PutRecord.Success |  The number of successful `PutRecord` operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream. Dimensions: StreamName Statistics: Average, Sum, Samples Units: Count  | 
| PutRecords.Bytes |  The number of bytes put to the Kinesis stream using the `PutRecords` operation over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| PutRecords.Latency |  The time taken per `PutRecords` operation, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average Units: Milliseconds  | 
| PutRecords.Records |  This metric is deprecated. Use `PutRecords.SuccessfulRecords`. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| PutRecords.Success |  The number of `PutRecords` operations where at least one record succeeded, per Kinesis stream, measured over the specified time period. Dimensions: StreamName Statistics: Average, Sum, Samples Units: Count  | 
| PutRecords.TotalRecords |  The total number of records sent in a `PutRecords` operation per Kinesis data stream, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| PutRecords.SuccessfulRecords |  The number of successful records in a `PutRecords` operation per Kinesis data stream, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| PutRecords.FailedRecords |  The number of records rejected due to internal failures in a `PutRecords` operation per Kinesis data stream, measured over the specified time period. Occasional internal failures are to be expected and should be retried. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| PutRecords.ThrottledRecords |  The number of records rejected due to throttling in a `PutRecords` operation per Kinesis data stream, measured over the specified time period. Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| ReadProvisionedThroughputExceeded |  The number of `GetRecords` calls throttled for the stream over the specified time period. The most commonly used statistic for this metric is Average. When the Minimum statistic has a value of 1, all records were throttled for the stream during the specified time period.  When the Maximum statistic has a value of 0 (zero), no records were throttled for the stream during the specified time period. Shard-level metric name: `ReadProvisionedThroughputExceeded` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| SubscribeToShard.RateExceeded |  This metric is emitted when a new subscription attempt fails because there is already an active subscription by the same consumer, or if you exceed the number of calls per second allowed for this operation. Dimensions: StreamName, ConsumerName  | 
| SubscribeToShard.Success |  This metric records whether the `SubscribeToShard` subscription was successfully established. A subscription lives for at most 5 minutes, so this metric is emitted at least once every 5 minutes. Dimensions: StreamName, ConsumerName  | 
| SubscribeToShardEvent.Bytes |  The number of bytes received from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the bytes published in a single event for the specified time period. Shard-level metric name: `OutgoingBytes` Dimensions: StreamName, ConsumerName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| SubscribeToShardEvent.MillisBehindLatest |  The number of milliseconds the read records are from the tip of the stream, indicating how far behind current time the consumer is. Dimensions: StreamName, ConsumerName Statistics: Minimum, Maximum, Average, Samples Units: Milliseconds  | 
| SubscribeToShardEvent.Records |  The number of records received from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the records in a single event for the specified time period. Shard-level metric name: `OutgoingRecords` Dimensions: StreamName, ConsumerName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| SubscribeToShardEvent.Success |  This metric is emitted every time an event is published successfully. It is emitted only when there's an active subscription. Dimensions: StreamName, ConsumerName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| WriteProvisionedThroughputExceeded |  The number of records rejected due to throttling for the stream over the specified time period. This metric includes throttling from `PutRecord` and `PutRecords` operations. The most commonly used statistic for this metric is Average. When the Minimum statistic has a non-zero value, records were being throttled for the stream during the specified time period.  When the Maximum statistic has a value of 0 (zero), no records were being throttled for the stream during the specified time period. Shard-level metric name: `WriteProvisionedThroughputExceeded` Dimensions: StreamName Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
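As a sketch of retrieving one of these metrics, the helper below builds `GetMetricStatistics` parameters for `GetRecords.IteratorAgeMilliseconds`. The stream name and time window are placeholders; the live call via boto3 is shown commented out.

```python
from datetime import datetime, timedelta, timezone

def iterator_age_stats_params(stream_name, minutes=60):
    """Build GetMetricStatistics parameters for the stream-level
    GetRecords.IteratorAgeMilliseconds metric."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "GetRecords.IteratorAgeMilliseconds",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,               # metrics arrive every minute
        "Statistics": ["Maximum"],  # Maximum tracks the slowest consumer
        "Unit": "Milliseconds",
    }

params = iterator_age_stats_params("my-stream")
# To fetch the datapoints (requires AWS credentials):
# import boto3
# boto3.client("cloudwatch").get_metric_statistics(**params)
```

Requesting the Maximum statistic is deliberate here: the Average can look healthy while one consumer or shard falls far behind.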

### Enhanced shard-level metrics
<a name="kinesis-metrics-shard"></a>

The `AWS/Kinesis` namespace includes the following shard-level metrics.

Kinesis sends the following shard-level metrics to CloudWatch every minute. Each metric dimension creates one CloudWatch metric and makes approximately 43,200 `PutMetricData` API calls per month. These metrics are not enabled by default, and there is a charge for enhanced metrics emitted from Kinesis. For more information, see [Amazon CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/) under the heading *Amazon CloudWatch Custom Metrics*. The charges are given per shard per metric per month.


| Metric | Description | 
| --- | --- | 
| IncomingBytes |  The number of bytes successfully put to the shard over the specified time period. This metric includes bytes from `PutRecord` and `PutRecords` operations. Minimum, Maximum, and Average statistics represent the bytes in a single put operation for the shard in the specified time period. Stream-level metric name: `IncomingBytes` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| IncomingRecords |  The number of records successfully put to the shard over the specified time period. This metric includes record counts from `PutRecord` and `PutRecords` operations. Minimum, Maximum, and Average statistics represent the records in a single put operation for the shard in the specified time period. Stream-level metric name: `IncomingRecords` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| IteratorAgeMilliseconds |  The age of the last record in all `GetRecords` calls made against a shard, measured over the specified time period. Age is the difference between the current time and when the last record of the `GetRecords` call was written to the stream. The Minimum and Maximum statistics can be used to track the progress of Kinesis consumer applications. A value of 0 (zero) indicates that the records being read are completely caught up with the stream. Stream-level metric name: `GetRecords.IteratorAgeMilliseconds` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Samples Units: Milliseconds  | 
| OutgoingBytes |  The number of bytes retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the bytes returned in a single `GetRecords` operation or published in a single `SubscribeToShard` event for the shard in the specified time period. Stream-level metric name: `GetRecords.Bytes` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Bytes  | 
| OutgoingRecords |  The number of records retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the records returned in a single `GetRecords` operation or published in a single `SubscribeToShard` event for the shard in the specified time period. Stream-level metric name: `GetRecords.Records` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| ReadProvisionedThroughputExceeded |  The number of `GetRecords` calls throttled for the shard over the specified time period. This exception count covers all dimensions of the following limits: 5 reads per shard per second or 2 MB per second per shard. The most commonly used statistic for this metric is Average. When the Minimum statistic has a value of 1, all records were throttled for the shard during the specified time period.  When the Maximum statistic has a value of 0 (zero), no records were throttled for the shard during the specified time period. Stream-level metric name: `ReadProvisionedThroughputExceeded` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
| WriteProvisionedThroughputExceeded |  The number of records rejected due to throttling for the shard over the specified time period. This metric includes throttling from `PutRecord` and `PutRecords` operations and covers all dimensions of the following limits: 1,000 records per second per shard or 1 MB per second per shard. The most commonly used statistic for this metric is Average. When the Minimum statistic has a non-zero value, records were being throttled for the shard during the specified time period.  When the Maximum statistic has a value of 0 (zero), no records were being throttled for the shard during the specified time period. Stream-level metric name: `WriteProvisionedThroughputExceeded` Dimensions: StreamName, ShardId Statistics: Minimum, Maximum, Average, Sum, Samples Units: Count  | 
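The call-count figure quoted earlier follows directly from the one-minute cadence, and the number of billable custom metrics scales with shard count. A small sketch of the arithmetic (the 10-shard example is illustrative):

```python
# One data point per minute -> data points per metric per 30-day month.
MINUTES_PER_MONTH = 60 * 24 * 30          # 43,200

# The table above lists seven shard-level metrics.
SHARD_LEVEL_METRIC_COUNT = 7

def enhanced_metric_count(shard_count):
    """Custom metrics created when all shard-level metrics are enabled."""
    return shard_count * SHARD_LEVEL_METRIC_COUNT

count = enhanced_metric_count(10)  # 70 custom metrics for a 10-shard stream
```

Because charges are per shard per metric per month, the cost of enhanced monitoring grows linearly as you reshard a stream upward.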

### Dimensions for Amazon Kinesis Data Streams metrics
<a name="kinesis-metricdimensions"></a>


|  Dimension  |  Description  | 
| --- | --- | 
|  StreamName  |  The name of the Kinesis stream. All available statistics are filtered by `StreamName`.   | 

### Recommended Amazon Kinesis Data Streams metrics
<a name="kinesis-metric-use"></a>

Several Amazon Kinesis Data Streams metrics might be of particular interest to Kinesis Data Streams customers. The following list provides recommended metrics and their uses.


| Metric | Usage Notes | 
| --- | --- | 
|  `GetRecords.IteratorAgeMilliseconds`  |  Tracks the read position across all shards and consumers in the stream. If an iterator's age passes 50% of the retention period (by default, 24 hours, configurable up to 7 days), there is a risk of data loss due to record expiration. We recommend that you use CloudWatch alarms on the Maximum statistic to alert you before this loss becomes a risk. For an example scenario that uses this metric, see [Consumer record processing is falling behind](troubleshooting-consumers.md#record-processing-falls-behind).  | 
|  `ReadProvisionedThroughputExceeded`  |  When your consumer-side record processing is falling behind, it is sometimes difficult to know where the bottleneck is. Use this metric to determine if your reads are being throttled due to exceeding your read throughput limits. The most commonly used statistic for this metric is Average.  | 
|  `WriteProvisionedThroughputExceeded`  |  This metric serves the same purpose as `ReadProvisionedThroughputExceeded`, but for the producer (put) side of the stream. The most commonly used statistic for this metric is Average.  | 
|  `PutRecord.Success`, `PutRecords.Success`  |  We recommend using CloudWatch alarms on the Average statistic to indicate when records are failing to be put to the stream. Choose one or both put types depending on what your producer uses. If using the Kinesis Producer Library (KPL), use `PutRecords.Success`.  | 
|  `GetRecords.Success`  |  We recommend using CloudWatch alarms on the Average statistic to indicate when records are failing to be retrieved from the stream.  | 
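The 50% guidance for iterator age translates directly into an alarm threshold. This is a minimal sketch; the retention values are examples, and the default retention is 24 hours.

```python
def iterator_age_alarm_threshold_ms(retention_hours=24, fraction=0.5):
    """Alarm threshold for GetRecords.IteratorAgeMilliseconds: alert when
    the Maximum statistic passes a fraction of the retention period."""
    return int(retention_hours * 3600 * 1000 * fraction)

# Default 24-hour retention -> alarm at 12 hours of iterator age.
threshold = iterator_age_alarm_threshold_ms()
```

You would then use this value as the alarm threshold on the metric's Maximum statistic, for example in a CloudWatch `PutMetricAlarm` call or a CloudFormation alarm resource.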

## Access Amazon CloudWatch metrics for Kinesis Data Streams
<a name="cloudwatch-metrics"></a>

You can monitor metrics for Kinesis Data Streams using the CloudWatch console, the command line, or the CloudWatch API. The following procedures show you how to access metrics using these different methods. 

**To access metrics using the CloudWatch console**

1. Open the CloudWatch console at [https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/).

1. On the navigation bar, choose a Region.

1. In the navigation pane, choose **Metrics**.

1. In the **CloudWatch Metrics by Category** pane, choose **Kinesis Metrics**.

1. Choose the relevant row to view the statistics for the specified **MetricName** and **StreamName**.

   **Note:** Most console statistic names match the corresponding CloudWatch metric names listed previously, except for **Read Throughput** and **Write Throughput**. These statistics are calculated over 5-minute intervals: **Write Throughput** monitors the `IncomingBytes` CloudWatch metric, and **Read Throughput** monitors `GetRecords.Bytes`.

1. (Optional) In the graph pane, select a statistic and a time period, and then create a CloudWatch alarm using these settings.

**To access metrics using the AWS CLI**  
Use the [list-metrics](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/list-metrics.html) and [get-metric-statistics](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/get-metric-statistics.html) commands.

**To access metrics using the CloudWatch CLI**  
Use the [mon-list-metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/cli-mon-list-metrics.html) and [mon-get-stats](https://docs.aws.amazon.com/AmazonCloudWatch/latest/cli/cli-mon-get-stats.html) commands.

**To access metrics using the CloudWatch API**  
Use the [ListMetrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_ListMetrics.html) and [GetMetricStatistics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricStatistics.html) operations.

# Monitor Kinesis Data Streams Agent health with Amazon CloudWatch
<a name="agent-health"></a>

The agent publishes custom CloudWatch metrics with a namespace of **AWSKinesisAgent**. These metrics help you assess whether the agent is submitting data into Kinesis Data Streams as specified, and whether it is healthy and consuming the appropriate amount of CPU and memory resources on the data producer. Metrics such as the number of records and bytes sent are useful to understand the rate at which the agent is submitting data to the stream. When these metrics fall below expected thresholds by some percentage or drop to zero, it could indicate configuration issues, network errors, or agent health issues. Metrics such as on-host CPU and memory consumption and agent error counters indicate data producer resource usage, and provide insights into potential configuration or host errors. Finally, the agent also logs service exceptions to help investigate agent issues. These metrics are reported in the Region specified in the agent configuration setting `cloudwatch.endpoint`. CloudWatch metrics published from multiple Kinesis agents are aggregated or combined. For more information about agent configuration, see [Specify the agent configuration settings](writing-with-agents.md#agent-config-settings).

## Monitor with CloudWatch
<a name="agent-metrics"></a>

The Kinesis Data Streams agent sends the following metrics to CloudWatch.


| Metric | Description | 
| --- | --- | 
| BytesSent |  The number of bytes sent to Kinesis Data Streams over the specified time period. Units: Bytes  | 
| RecordSendAttempts |  The number of records attempted (either first time, or as a retry) in a call to `PutRecords` over the specified time period. Units: Count  | 
| RecordSendErrors |  The number of records that returned failure status in a call to `PutRecords`, including retries, over the specified time period. Units: Count  | 
| ServiceErrors |  The number of calls to `PutRecords` that resulted in a service error (other than a throttling error) over the specified time period.  Units: Count  | 
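A simple derived health signal for the agent, assuming you retrieve the Sum of `RecordSendAttempts` and `RecordSendErrors` over the same period (the sample numbers are illustrative):

```python
def agent_send_error_rate(attempts, errors):
    """Fraction of record send attempts (RecordSendAttempts) that returned
    failure status (RecordSendErrors); 0 when nothing was attempted."""
    return errors / attempts if attempts else 0.0

# e.g. 10,000 attempts with 250 failures over the period -> 2.5% error rate
rate = agent_send_error_rate(10_000, 250)
```

A sustained non-zero rate, or `BytesSent` dropping to zero while the producer host stays busy, is the kind of condition worth alarming on.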

# Log Amazon Kinesis Data Streams API calls with AWS CloudTrail
<a name="logging-using-cloudtrail"></a>

Amazon Kinesis Data Streams is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in Kinesis Data Streams. CloudTrail captures all API calls for Kinesis Data Streams as events. The calls captured include calls from the Kinesis Data Streams console and code calls to the Kinesis Data Streams API operations. If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket, including events for Kinesis Data Streams. If you don't configure a trail, you can still view the most recent events in the CloudTrail console in **Event history**. Using the information collected by CloudTrail, you can determine the request that was made to Kinesis Data Streams, the IP address from which the request was made, who made the request, when it was made, and additional details. 

To learn more about CloudTrail, including how to configure and enable it, see the [AWS CloudTrail User Guide](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/).

## Kinesis Data Streams information in CloudTrail
<a name="service-name-info-in-cloudtrail"></a>

CloudTrail is enabled on your AWS account when you create the account. When supported event activity occurs in Kinesis Data Streams, that activity is recorded in a CloudTrail event along with other AWS service events in **Event history**. You can view, search, and download recent events in your AWS account. For more information, see [Viewing Events with CloudTrail Event History](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/view-cloudtrail-events.html).

For an ongoing record of events in your AWS account, including events for Kinesis Data Streams, create a trail. A *trail* enables CloudTrail to deliver log files to an Amazon S3 bucket. By default, when you create a trail in the console, the trail applies to all AWS Regions. The trail logs events from all Regions in the AWS partition and delivers the log files to the Amazon S3 bucket that you specify. Additionally, you can configure other AWS services to further analyze and act upon the event data collected in CloudTrail logs. For more information, see the following: 
+ [Overview for Creating a Trail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-create-and-update-a-trail.html)
+ [CloudTrail Supported Services and Integrations](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-aws-service-specific-topics.html#cloudtrail-aws-service-specific-topics-integrations)
+ [Configuring Amazon SNS Notifications for CloudTrail](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/getting_notifications_top_level.html)
+ [Receiving CloudTrail Log Files from Multiple Regions](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/receive-cloudtrail-log-files-from-multiple-regions.html) and [Receiving CloudTrail Log Files from Multiple Accounts](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-receive-logs-from-multiple-accounts.html)

Kinesis Data Streams supports logging the following actions as events in CloudTrail log files:
+ [AddTagsToStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_AddTagsToStream.html)
+ [CreateStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_CreateStream.html)
+ [DecreaseStreamRetentionPeriod](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DecreaseStreamRetentionPeriod.html)
+ [DeleteStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DeleteStream.html)
+ [DeregisterStreamConsumer](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DeregisterStreamConsumer.html)
+ [DescribeStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStream.html)
+ [DescribeStreamConsumer](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DescribeStreamConsumer.html)
+ [DisableEnhancedMonitoring](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_DisableEnhancedMonitoring.html)
+ [EnableEnhancedMonitoring](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_EnableEnhancedMonitoring.html)
+ [GetRecords](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html)
+ [GetShardIterator](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetShardIterator.html)
+ [IncreaseStreamRetentionPeriod](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_IncreaseStreamRetentionPeriod.html)
+ [ListStreamConsumers](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListStreamConsumers.html)
+ [ListStreams](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListStreams.html)
+ [ListTagsForStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_ListTagsForStream.html)
+ [MergeShards](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_MergeShards.html)
+ [PutRecord](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecord.html)
+ [PutRecords](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_PutRecords.html)
+ [RegisterStreamConsumer](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_RegisterStreamConsumer.html)
+ [RemoveTagsFromStream](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_RemoveTagsFromStream.html)
+ [SplitShard](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SplitShard.html)
+ [StartStreamEncryption](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_StartStreamEncryption.html)
+ [StopStreamEncryption](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_StopStreamEncryption.html)
+ [SubscribeToShard](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_SubscribeToShard.html)
+ [UpdateShardCount](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_UpdateShardCount.html)
+ [UpdateStreamMode](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_UpdateStreamMode.html)

Every event or log entry contains information about who generated the request. The identity information helps you determine the following: 
+ Whether the request was made with root or AWS Identity and Access Management (IAM) user credentials.
+ Whether the request was made with temporary security credentials for a role or federated user.
+ Whether the request was made by another AWS service.

For more information, see the [CloudTrail userIdentity Element](https://docs.aws.amazon.com/awscloudtrail/latest/userguide/cloudtrail-event-reference-user-identity.html).
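Once log files are delivered, Kinesis events can be picked out of a CloudTrail log file with a few lines of Python. This is a hypothetical helper, not part of any AWS SDK; the field names follow the CloudTrail log entry format shown in the example below.

```python
import json

def kinesis_events(cloudtrail_log_json):
    """Return (eventName, userName) pairs for Kinesis Data Streams events
    found in a CloudTrail log file's "Records" array."""
    records = json.loads(cloudtrail_log_json)["Records"]
    return [
        (r["eventName"], r["userIdentity"].get("userName"))
        for r in records
        if r.get("eventSource", "").startswith("kinesis.")
    ]
```

Matching on the `eventSource` prefix rather than the full hostname keeps the filter working across partitions and endpoint variants.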

## Example: Kinesis Data Streams log file entries
<a name="understanding-service-name-entries"></a>

A trail is a configuration that enables delivery of events as log files to an Amazon S3 bucket that you specify. CloudTrail log files contain one or more log entries. An event represents a single request from any source and includes information about the requested action, the date and time of the action, request parameters, and so on. CloudTrail log files aren't an ordered stack trace of the public API calls, so they don't appear in any specific order.

The following example shows a CloudTrail log entry that demonstrates the `CreateStream`, `DescribeStream`, `ListStreams`, `DeleteStream`, `SplitShard`, and `MergeShards` actions.

```
{
    "Records": [
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:16:31Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "CreateStream",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "shardCount": 1,
                "streamName": "GoodStream"
            },
            "responseElements": null,
            "requestID": "db6c59f8-c757-11e3-bc3b-57923b443c1c",
            "eventID": "b7acfcd0-6ca9-4ee1-a3d7-c4e8d420d99b"
        },
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:17:06Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "DescribeStream",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "streamName": "GoodStream"
            },
            "responseElements": null,
            "requestID": "f0944d86-c757-11e3-b4ae-25654b1d3136",
            "eventID": "0b2f1396-88af-4561-b16f-398f8eaea596"
        },
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:15:02Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "ListStreams",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "limit": 10
            },
            "responseElements": null,
            "requestID": "a68541ca-c757-11e3-901b-cbcfe5b3677a",
            "eventID": "22a5fb8f-4e61-4bee-a8ad-3b72046b4c4d"
        },
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:17:07Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "DeleteStream",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "streamName": "GoodStream"
            },
            "responseElements": null,
            "requestID": "f10cd97c-c757-11e3-901b-cbcfe5b3677a",
            "eventID": "607e7217-311a-4a08-a904-ec02944596dd"
        },
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:15:03Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "SplitShard",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "shardToSplit": "shardId-000000000000",
                "streamName": "GoodStream",
                "newStartingHashKey": "11111111"
            },
            "responseElements": null,
            "requestID": "a6e6e9cd-c757-11e3-901b-cbcfe5b3677a",
            "eventID": "dcd2126f-c8d2-4186-b32a-192dd48d7e33"
        },
        {
            "eventVersion": "1.01",
            "userIdentity": {
                "type": "IAMUser",
                "principalId": "EX_PRINCIPAL_ID",
                "arn": "arn:aws:iam::012345678910:user/Alice",
                "accountId": "012345678910",
                "accessKeyId": "EXAMPLE_KEY_ID",
                "userName": "Alice"
            },
            "eventTime": "2014-04-19T00:16:56Z",
            "eventSource": "kinesis.amazonaws.com",
            "eventName": "MergeShards",
            "awsRegion": "us-east-1",
            "sourceIPAddress": "127.0.0.1",
            "userAgent": "aws-sdk-java/unknown-version Linux/x.xx",
            "requestParameters": {
                "streamName": "GoodStream",
                "adjacentShardToMerge": "shardId-000000000002",
                "shardToMerge": "shardId-000000000001"
            },
            "responseElements": null,
            "requestID": "e9f9c8eb-c757-11e3-bf1d-6948db3cd570",
            "eventID": "77cf0d06-ce90-42da-9576-71986fec411f"
        }
    ]
}
```
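Because each record carries the same core fields (`eventTime`, `eventName`, `userIdentity`), log files like the one above are easy to audit programmatically. The following is a minimal sketch in Python using only the standard library; the record is abbreviated to the fields the helper reads:

```python
import json

# One record in the shape shown above, abbreviated to the fields used here.
log_file = """
{
    "Records": [
        {
            "eventTime": "2014-04-19T00:16:31Z",
            "eventName": "CreateStream",
            "userIdentity": {"type": "IAMUser", "userName": "Alice"},
            "requestParameters": {"shardCount": 1, "streamName": "GoodStream"}
        }
    ]
}
"""

def summarize(records):
    """Return one 'time user action' line per CloudTrail record."""
    lines = []
    for rec in records:
        user = rec.get("userIdentity", {}).get("userName", "unknown")
        lines.append(f"{rec['eventTime']} {user} {rec['eventName']}")
    return lines

records = json.loads(log_file)["Records"]
# -> ['2014-04-19T00:16:31Z Alice CreateStream']
print(summarize(records))
```

The same approach extends to filtering by `eventName` (for example, flagging every `DeleteStream` call) once the log files are retrieved from the S3 bucket that the trail delivers to.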

# Monitor the Kinesis Client Library with Amazon CloudWatch
<a name="monitoring-with-kcl"></a>

The [Kinesis Client Library](https://docs.aws.amazon.com/kinesis/latest/dev/developing-consumers-with-kcl.html) (KCL) for Amazon Kinesis Data Streams publishes custom Amazon CloudWatch metrics on your behalf, using the name of your KCL application as the namespace. You can view these metrics by navigating to the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/) and choosing **Custom Metrics**. For more information about custom metrics, see [Publish Custom Metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html) in the *Amazon CloudWatch User Guide*.

There is a nominal charge for the metrics uploaded to CloudWatch by the KCL; specifically, *Amazon CloudWatch Custom Metrics* and *Amazon CloudWatch API Requests* charges apply. For more information, see [Amazon CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/).

**Topics**
+ [Metrics and namespace](#metrics-namespace)
+ [Metric levels and dimensions](#metric-levels)
+ [Metric configuration](#metrics-config)
+ [List of metrics](#kcl-metrics-list)

## Metrics and namespace
<a name="metrics-namespace"></a>

The namespace that is used to upload metrics is the application name that you specify when you launch the KCL.

## Metric levels and dimensions
<a name="metric-levels"></a>

There are two options to control which metrics are uploaded to CloudWatch:

metric levels  
Every metric is assigned an individual level. When you set a metrics reporting level, metrics with an individual level below the reporting level are not sent to CloudWatch. The levels are: `NONE`, `SUMMARY`, and `DETAILED`. The default setting is `DETAILED`; that is, all metrics are sent to CloudWatch. A reporting level of `NONE` means that no metrics are sent at all. For information about which levels are assigned to what metrics, see [List of metrics](#kcl-metrics-list).
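The filtering rule can be modeled as a simple ordering, `NONE` &lt; `SUMMARY` &lt; `DETAILED`: a metric is sent only if its own level does not exceed the configured reporting level. The following is an illustrative sketch, not the KCL's actual implementation:

```python
# Reporting levels in ascending order of verbosity; individual metrics
# are only ever assigned SUMMARY or DETAILED.
LEVELS = {"NONE": 0, "SUMMARY": 1, "DETAILED": 2}

def is_reported(metric_level, reporting_level):
    """True if a metric at metric_level is sent under reporting_level."""
    return LEVELS[metric_level] <= LEVELS[reporting_level]

# The default reporting level of DETAILED sends everything:
assert is_reported("SUMMARY", "DETAILED") and is_reported("DETAILED", "DETAILED")
# At SUMMARY, detailed metrics are dropped:
assert not is_reported("DETAILED", "SUMMARY")
# A reporting level of NONE sends nothing:
assert not is_reported("SUMMARY", "NONE")
```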

enabled dimensions  
Every KCL metric has associated dimensions that are also sent to CloudWatch. In KCL 2.x, if the KCL is configured to process a single data stream, all metric dimensions (`Operation`, `ShardId`, and `WorkerIdentifier`) are enabled by default, and the `Operation` dimension cannot be disabled. If the KCL is configured to process multiple data streams, all metric dimensions (`Operation`, `ShardId`, `StreamId`, and `WorkerIdentifier`) are enabled by default, and the `Operation` and `StreamId` dimensions cannot be disabled. The `StreamId` dimension is available only for per-shard metrics.  
In KCL 1.x, only the `Operation` and `ShardId` dimensions are enabled by default; the `WorkerIdentifier` dimension is disabled, and the `Operation` dimension cannot be disabled.  
For more information about CloudWatch metric dimensions, see the [Dimensions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/cloudwatch_concepts.html#Dimension) section in the Amazon CloudWatch Concepts topic, in the *Amazon CloudWatch User Guide*.  
When the `WorkerIdentifier` dimension is enabled, if a different value is used for the worker ID property every time a particular KCL worker restarts, new sets of metrics with new `WorkerIdentifier` dimension values are sent to CloudWatch. If you need the `WorkerIdentifier` dimension value to be the same across specific KCL worker restarts, you must explicitly specify the same worker ID value during initialization for each worker. Note that the worker ID value for each active KCL worker must be unique across all KCL workers.
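One hypothetical way to keep the worker ID stable across restarts, while keeping it unique per worker, is to derive it from host-specific information rather than generating a random UUID at startup. A sketch (the configuration property you assign the ID to depends on your KCL version):

```python
import socket

def stable_worker_id(instance_label):
    """Build a worker ID that is identical on every restart of this
    process on this host, but unique across hosts and instances.

    instance_label is a hypothetical per-process label for the case
    where several KCL workers run on the same host.
    """
    return f"{socket.gethostname()}-{instance_label}"

# Restarting the worker reproduces the same ID, so no new
# WorkerIdentifier dimension values accumulate in CloudWatch.
assert stable_worker_id("consumer-1") == stable_worker_id("consumer-1")
```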

## Metric configuration
<a name="metrics-config"></a>

Metric levels and enabled dimensions can be configured using the `KinesisClientLibConfiguration` instance, which is passed to `Worker` when launching the KCL application. In the MultiLangDaemon case, the `metricsLevel` and `metricsEnabledDimensions` properties can be specified in the `.properties` file used to launch the MultiLangDaemon KCL application.

Metric levels can be assigned one of three values: `NONE`, `SUMMARY`, or `DETAILED`. The enabled dimensions value must be a comma-separated string listing the dimensions that are allowed for the CloudWatch metrics. The dimensions used by the KCL application are `Operation`, `ShardId`, and `WorkerIdentifier`.
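For the MultiLangDaemon case, the two properties described above might look like this in the `.properties` file (the values shown are illustrative, not defaults):

```
metricsLevel = SUMMARY
metricsEnabledDimensions = Operation,ShardId
```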

## List of metrics
<a name="kcl-metrics-list"></a>

The following tables list the KCL metrics, grouped by scope and operation.

**Topics**
+ [Per-KCL-application metrics](#kcl-metrics-per-app)
+ [Per-worker metrics](#kcl-metrics-per-worker)
+ [Per-shard metrics](#kcl-metrics-per-shard)

### Per-KCL-application metrics
<a name="kcl-metrics-per-app"></a>

These metrics are aggregated across all KCL workers within the scope of the application, as defined by the Amazon CloudWatch namespace.

**Topics**
+ [LeaseAssignmentManager](#lease-assignment-manager)
+ [InitializeTask](#init-task)
+ [ShutdownTask](#shutdown-task)
+ [ShardSyncTask](#shard-sync-task)
+ [BlockOnParentTask](#block-parent-task)
+ [PeriodicShardSyncManager](#periodic-task)
+ [MultistreamTracker](#multi-task)

#### LeaseAssignmentManager
<a name="lease-assignment-manager"></a>

The `LeaseAssignmentManager` operation is responsible for assigning leases to workers and rebalancing leases among workers to achieve even utilization of worker resources. The logic for this operation includes reading the lease related metadata from the lease table and metrics from the worker metrics table, and performing lease assignments.


| Metric | Description | 
| --- | --- | 
|  LeaseAndWorkerMetricsLoad.Time  |  Time taken to load all leases and worker metrics entries in the lease assignment manager (LAM), the new lease assignment and load balancing algorithm introduced in KCL 3.x. Metric level: Detailed Units: Milliseconds  | 
| TotalLeases |  Total number of leases for the current KCL application. Metric level: Summary Units: Count  | 
| NumWorkers |  Total number of workers in the current KCL application. Metric level: Summary Units: Count  | 
|  AssignExpiredOrUnassignedLeases.Time  |  Time to perform in-memory assignment of expired leases. Metric level: Detailed Units: Milliseconds  | 
| LeaseSpillover |  Number of leases that were not assigned due to hitting the limit on the maximum number of leases or maximum throughput per worker. Metric level: Summary Units: Count  | 
|  BalanceWorkerVariance.Time  |  Time to perform in-memory balancing of leases between workers. Metric level: Detailed Units: Milliseconds  | 
|  NumOfLeasesReassignment  |  Total number of lease reassignments made in the current reassignment iteration. Metric level: Summary Units: Count  | 
|  FailedAssignmentCount  |  Number of failures in AssignLease calls to the DynamoDB lease table.  Metric level: Detailed Units: Count  | 
|  ParallelyAssignLeases.Time  |  Time to flush new assignments to the DynamoDB lease table. Metric level: Detailed Units: Milliseconds  | 
|  ParallelyAssignLeases.Success  |  Number of successful flushes of new assignments to the DynamoDB lease table. Metric level: Detailed Units: Count  | 
|  TotalStaleWorkerMetricsEntry  |  Total number of worker metrics entries that must be cleaned up. Metric level: Detailed Units: Count  | 
| StaleWorkerMetricsCleanup.Time |  Time to perform worker metrics entry deletion from the DynamoDB worker metrics table. Metric level: Detailed Units: Milliseconds  | 
| Time |  Time taken by the `LeaseAssignmentManager` operation. Metric level: Summary Units: Milliseconds  | 
| Success |  Number of times the `LeaseAssignmentManager` operation successfully completed. Metric level: Summary Units: Count  | 
| ForceLeaderRelease |  Indicates that the lease assignment manager has failed three times consecutively and the leader worker is releasing leadership. Metric level: Summary Units: Count  | 
|  NumWorkersWithInvalidEntry  |  Number of worker metrics entries that are considered invalid.  Metric level: Summary Units: Count  | 
|  NumWorkersWithFailingWorkerMetric  |  Number of worker metrics entries that report -1 (indicating that the worker metric value is not available) for one of their worker metrics. Metric level: Summary Units: Count  | 
|  LeaseDeserializationFailureCount  |  Number of lease entries from the lease table that failed to deserialize. Metric level: Summary Units: Count  | 

#### InitializeTask
<a name="init-task"></a>

The `InitializeTask` operation is responsible for initializing the record processor for the KCL application. The logic for this operation includes getting a shard iterator from Kinesis Data Streams and initializing the record processor.


| Metric | Description | 
| --- | --- | 
| KinesisDataFetcher.getIterator.Success |  Number of successful `GetShardIterator` operations per KCL application.  Metric level: Detailed Units: Count  | 
| KinesisDataFetcher.getIterator.Time |  Time taken per `GetShardIterator` operation for the given KCL application. Metric level: Detailed Units: Milliseconds  | 
| RecordProcessor.initialize.Time |  Time taken by the record processor’s initialize method. Metric level: Summary Units: Milliseconds  | 
| Success |  Number of successful record processor initializations.  Metric level: Summary Units: Count  | 
| Time |  Time taken by the KCL worker for the record processor initialization. Metric level: Summary Units: Milliseconds  | 

#### ShutdownTask
<a name="shutdown-task"></a>

The `ShutdownTask` operation initiates the shutdown sequence for shard processing. This can occur because a shard is split or merged, or when the shard lease is lost from the worker. In both cases, the record processor `shutdown()` function is invoked. New shards are also discovered in the case where a shard was split or merged, resulting in the creation of one or two new shards.


| Metric | Description | 
| --- | --- | 
| CreateLease.Success |  Number of times that new child shards are successfully added into the KCL application DynamoDB table following parent shard shutdown. Metric level: Detailed Units: Count  | 
| CreateLease.Time |  Time taken for adding new child shard information in the KCL application DynamoDB table. Metric level: Detailed Units: Milliseconds  | 
| UpdateLease.Success |  Number of successful final checkpoints during the record processor shutdown. Metric level: Detailed Units: Count  | 
| UpdateLease.Time |  Time taken by the checkpoint operation during the record processor shutdown. Metric level: Detailed Units: Milliseconds  | 
| RecordProcessor.shutdown.Time |  Time taken by the record processor’s shutdown method. Metric level: Summary Units: Milliseconds  | 
| Success |  Number of successful shutdown tasks. Metric level: Summary Units: Count  | 
| Time |  Time taken by the KCL worker for the shutdown task. Metric level: Summary Units: Milliseconds  | 

#### ShardSyncTask
<a name="shard-sync-task"></a>

The `ShardSyncTask` operation discovers changes to shard information for the Kinesis data stream, so new shards can be processed by the KCL application.


| Metric | Description | 
| --- | --- | 
| CreateLease.Success |  Number of successful attempts to add new shard information into the KCL application DynamoDB table. Metric level: Detailed Units: Count  | 
| CreateLease.Time |  Time taken for adding new shard information in the KCL application DynamoDB table. Metric level: Detailed Units: Milliseconds  | 
| Success |  Number of successful shard sync operations. Metric level: Summary Units: Count  | 
| Time |  Time taken for the shard sync operation. Metric level: Summary Units: Milliseconds  | 

#### BlockOnParentTask
<a name="block-parent-task"></a>

If the shard is split or merged with other shards, then new child shards are created. The `BlockOnParentTask` operation ensures that record processing for the new shards does not start until the parent shards are completely processed by the KCL.


| Metric | Description | 
| --- | --- | 
| Success |  Number of successful checks for parent shard completion. Metric level: Summary Units: Count  | 
| Time |  Time taken for parent shards completion. Metric level: Summary Unit: Milliseconds  | 

#### PeriodicShardSyncManager
<a name="periodic-task"></a>

The `PeriodicShardSyncManager` is responsible for examining the data streams that are being processed by the KCL consumer application, identifying data streams with partial leases and handing them off for synchronization.

The following metrics are available whether the KCL is configured to process a single data stream (in which case `NumStreamsToSync` and `NumStreamsWithPartialLeases` are set to 1) or multiple data streams.


| Metric | Description | 
| --- | --- | 
| NumStreamsToSync |  The number of data streams (per AWS account) that the consumer application is processing that contain partial leases and must be handed off for synchronization.  Metric level: Summary Units: Count  | 
| NumStreamsWithPartialLeases |  The number of data streams (per AWS account) that the consumer application is processing that contain partial leases.  Metric level: Summary Units: Count  | 
| Success |  The number of times `PeriodicShardSyncManager` was able to successfully identify partial leases in the data streams that the consumer application is processing.  Metric level: Summary Units: Count  | 
| Time |  The amount of the time (in milliseconds) that the `PeriodicShardSyncManager` takes to examine the data streams that the consumer application is processing, in order to determine which data streams require shard synchronization.  Metric level: Summary Units: Milliseconds  | 

#### MultistreamTracker
<a name="multi-task"></a>

The `MultistreamTracker` interface enables you to build KCL consumer applications that can process multiple data streams at the same time.


| Metric | Description | 
| --- | --- | 
| DeletedStreams.Count |  The number of data streams deleted in this time period. Metric level: Summary Units: Count  | 
| ActiveStreams.Count |  The number of active data streams being processed. Metric level: Summary Units: Count  | 
| StreamsPendingDeletion.Count |  The number of data streams that are pending deletion based on `FormerStreamsLeasesDeletionStrategy`.  Metric level: Summary Units: Count  | 

### Per-worker metrics
<a name="kcl-metrics-per-worker"></a>

These metrics are aggregated across all record processors on a single worker, such as an Amazon EC2 instance.

**Topics**
+ [WorkerMetricStatsReporter](#worker-metrics-stats)
+ [LeaseDiscovery](#lease-discovery)
+ [RenewAllLeases](#renew-leases)
+ [TakeLeases](#take-leases)

#### WorkerMetricStatsReporter
<a name="worker-metrics-stats"></a>

The `WorkerMetricStatsReporter` operation is responsible for periodically publishing the current worker's metrics to the worker metrics table. These metrics are used by the `LeaseAssignmentManager` operation to perform lease assignments.


| Metric | Description | 
| --- | --- | 
|  InMemoryMetricStatsReporterFailure  |  Number of failures to capture the in-memory worker metric value, due to failure of some worker metrics. Metric level: Summary Units: Count  | 
|  WorkerMetricStatsReporter.Time  |  Time taken by the `WorkerMetricsStats` operation. Metric level: Summary Units: Milliseconds  | 
|  WorkerMetricStatsReporter.Success  |  Number of times the `WorkerMetricsStats` operation successfully completed. Metric level: Summary Units: Count  | 

#### LeaseDiscovery
<a name="lease-discovery"></a>

The `LeaseDiscovery` operation is responsible for identifying the new leases assigned to the current worker by the `LeaseAssignmentManager` operation. The logic for this operation involves identifying leases assigned to the current worker by reading the global secondary index of the lease table.


| Metric | Description | 
| --- | --- | 
|  ListLeaseKeysForWorker.Time  |  Time to call the global secondary index on the lease table and get lease keys assigned to the current worker. Metric level: Detailed Units: Milliseconds  | 
|  FetchNewLeases.Time  |  Time to fetch all new leases from the lease table.  Metric level: Detailed Units: Milliseconds  | 
|  NewLeasesDiscovered  |  Total number of new leases assigned to the current worker. Metric level: Detailed Units: Count  | 
|  Time  |  Time taken by the `LeaseDiscovery` operation. Metric level: Summary Units: Milliseconds  | 
|  Success  |  Number of times the `LeaseDiscovery` operation successfully completed. Metric level: Summary Units: Count  | 
|  OwnerMismatch  |  Number of owner mismatches from GSI response and lease table consistent read. Metric level: Detailed Units: Count  | 

#### RenewAllLeases
<a name="renew-leases"></a>

The `RenewAllLeases` operation periodically renews shard leases owned by a particular worker instance. 


| Metric | Description | 
| --- | --- | 
| RenewLease.Success |  Number of successful lease renewals by the worker. Metric level: Detailed Units: Count  | 
| RenewLease.Time |  Time taken by the lease renewal operation. Metric level: Detailed Units: Milliseconds  | 
| CurrentLeases |  Number of shard leases owned by the worker after all leases are renewed. Metric level: Summary Units: Count  | 
| LostLeases |  Number of shard leases that were lost following an attempt to renew all leases owned by the worker. Metric level: Summary Units: Count  | 
| Success |  Number of times the lease renewal operation was successful for the worker. Metric level: Summary Units: Count  | 
| Time |  Time taken for renewing all leases for the worker. Metric level: Summary Units: Milliseconds  | 

#### TakeLeases
<a name="take-leases"></a>

The `TakeLeases` operation balances record processing between all KCL workers. If the current KCL worker has fewer shard leases than required, it takes shard leases from another worker that is overloaded.


| Metric | Description | 
| --- | --- | 
| ListLeases.Success |  Number of times all shard leases were successfully retrieved from the KCL application DynamoDB table. Metric level: Detailed Units: Count  | 
| ListLeases.Time |  Time taken to retrieve all shard leases from the KCL application DynamoDB table. Metric level: Detailed Units: Milliseconds  | 
| TakeLease.Success |  Number of times the worker successfully took shard leases from other KCL workers. Metric level: Detailed Units: Count  | 
| TakeLease.Time |  Time taken to update the lease table with leases taken by the worker. Metric level: Detailed Units: Milliseconds  | 
| NumWorkers |  Total number of workers, as identified by a specific worker. Metric level: Summary Units: Count  | 
| NeededLeases |  Number of shard leases that the current worker needs for a balanced shard-processing load. Metric level: Detailed Units: Count  | 
| LeasesToTake |  Number of leases that the worker will attempt to take. Metric level: Detailed Units: Count  | 
| TakenLeases |  Number of leases taken successfully by the worker. Metric level: Summary Units: Count   | 
| TotalLeases |  Total number of shards that the KCL application is processing. Metric level: Detailed Units: Count  | 
| ExpiredLeases |  Total number of shards that are not being processed by any worker, as identified by the specific worker. Metric level: Summary Units: Count  | 
| Success |  Number of times the `TakeLeases` operation successfully completed. Metric level: Summary Units: Count  | 
| Time |  Time taken by the `TakeLeases` operation for a worker. Metric level: Summary Units: Milliseconds  | 

### Per-shard metrics
<a name="kcl-metrics-per-shard"></a>

These metrics are aggregated across a single record processor.

#### ProcessTask
<a name="process-task"></a>

The `ProcessTask` operation calls [GetRecords](https://docs.aws.amazon.com/kinesis/latest/APIReference/API_GetRecords.html) with the current iterator position to retrieve records from the stream and invokes the record processor `processRecords` function.


| Metric | Description | 
| --- | --- | 
| KinesisDataFetcher.getRecords.Success |  Number of successful `GetRecords` operations per Kinesis data stream shard.  Metric level: Detailed Units: Count  | 
| KinesisDataFetcher.getRecords.Time |  Time taken per `GetRecords` operation for the Kinesis data stream shard. Metric level: Detailed Units: Milliseconds  | 
| UpdateLease.Success |  Number of successful checkpoints made by the record processor for the given shard. Metric level: Detailed Units: Count  | 
| UpdateLease.Time |  Time taken for each checkpoint operation for the given shard. Metric level: Detailed Units: Milliseconds  | 
| DataBytesProcessed |  Total size of records processed in bytes on each `ProcessTask` invocation. Metric level: Summary Units: Byte  | 
| RecordsProcessed |  Number of records processed on each `ProcessTask` invocation. Metric level: Summary Units: Count  | 
| ExpiredIterator |  Number of ExpiredIteratorException received when calling `GetRecords`. Metric level: Summary Units: Count  | 
| MillisBehindLatest |  Time that the current iterator is behind the latest record (tip) in the shard. This value is less than or equal to the difference in time between the latest record in a response and the current time. This is a more accurate reflection of how far a shard is from the tip than comparing time stamps in the last response record. This value applies to the latest batch of records, not an average of all time stamps in each record. Metric level: Summary Units: Milliseconds  | 
| RecordProcessor.processRecords.Time |  Time taken by the record processor’s `processRecords` method. Metric level: Summary Units: Milliseconds  | 
| Success |  Number of successful process task operations. Metric level: Summary Units: Count  | 
| Time |  Time taken for the process task operation. Metric level: Summary Units: Milliseconds  | 

# Monitor the Kinesis Producer Library with Amazon CloudWatch
<a name="monitoring-with-kpl"></a>

The [Amazon Kinesis Producer Library](https://docs.aws.amazon.com/kinesis/latest/dev/developing-producers-with-kpl.html) (KPL) for Amazon Kinesis Data Streams publishes custom Amazon CloudWatch metrics on your behalf. You can view these metrics by navigating to the [CloudWatch console](https://console.aws.amazon.com/cloudwatch/) and choosing **Custom Metrics**. For more information about custom metrics, see [Publish Custom Metrics](https://docs.aws.amazon.com/AmazonCloudWatch/latest/DeveloperGuide/publishingMetrics.html) in the *Amazon CloudWatch User Guide*.

There is a nominal charge for the metrics uploaded to CloudWatch by the KPL; specifically, Amazon CloudWatch Custom Metrics and Amazon CloudWatch API Requests charges apply. For more information, see [Amazon CloudWatch Pricing](https://aws.amazon.com/cloudwatch/pricing/). Local metrics gathering does not incur CloudWatch charges.

**Topics**
+ [Metrics, dimensions, and namespaces](#kpl-metrics)
+ [Metric level and granularity](#kpl-metrics-granularity)
+ [Local access and Amazon CloudWatch upload](#kpl-metrics-local-upload)
+ [List of metrics](#kpl-metrics-list)

## Metrics, dimensions, and namespaces
<a name="kpl-metrics"></a>

You can specify an application name when launching the KPL, which is then used as part of the namespace when uploading metrics. This is optional; the KPL provides a default value if an application name is not set.

You can also configure the KPL to add arbitrary additional dimensions to the metrics. This is useful if you want finer-grained data in your CloudWatch metrics. For example, you can add the hostname as a dimension, which then allows you to identify uneven load distributions across your fleet. All KPL configuration settings are immutable, so you can't change these additional dimensions after the KPL instance is initialized.

## Metric level and granularity
<a name="kpl-metrics-granularity"></a>

There are two options to control the number of metrics uploaded to CloudWatch:

*metric level*  
This is a rough gauge of how important a metric is. Every metric is assigned a level. When you set a level, metrics with levels below that are not sent to CloudWatch. The levels are `NONE`, `SUMMARY`, and `DETAILED`. The default setting is `DETAILED`; that is, all metrics. `NONE` means no metrics at all, so no metrics are actually assigned to that level.

*granularity*  
This controls whether the same metric is emitted at additional levels of granularity. The levels are `GLOBAL`, `STREAM`, and `SHARD`. The default setting is `SHARD`, which contains the most granular metrics.  
When `SHARD` is chosen, metrics are emitted with the stream name and shard ID as dimensions. In addition, the same metric is emitted with only the stream name dimension, and again with no stream or shard dimensions. This means that, for a particular metric, two streams with two shards each produce seven CloudWatch metrics: one for each shard, one for each stream, and one overall, all describing the same statistics at different levels of granularity. For an illustration, see the following diagram.  
The different granularity levels form a hierarchy, and all the metrics in the system form trees, rooted at the metric names:  

```
MetricName (GLOBAL):           Metric X                    Metric Y
                                  |                           |
                           -----------------             ------------
                           |               |             |          |
StreamName (STREAM):    Stream A        Stream B      Stream A   Stream B
                           |               |
                        --------        ---------
                        |      |        |       |
ShardID (SHARD):     Shard 0 Shard 1  Shard 0 Shard 1
```
Not all metrics are available at the shard level; some are stream level or global by nature. These are not produced at the shard level, even if you have enabled shard-level metrics (for example, `Metric Y` in the preceding diagram).  
When you specify an additional dimension, you must provide a `<DimensionName, DimensionValue, Granularity>` tuple. The granularity value determines where the custom dimension is inserted in the hierarchy: `GLOBAL` means that the additional dimension is inserted after the metric name, `STREAM` means it's inserted after the stream name, and `SHARD` means it's inserted after the shard ID. If you give multiple additional dimensions per granularity level, they are inserted in the order given.
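
As an illustration of these emission rules, the following Python sketch (hypothetical code, not part of the KPL) enumerates the dimension sets produced for one metric at each granularity. With two streams of two shards each, `SHARD` granularity yields the seven series described previously:

```python
# Hypothetical sketch (not KPL code): enumerate the CloudWatch metric series
# that each granularity setting produces for one metric.

def emitted_series(metric_name, streams, granularity="SHARD"):
    """Return the dimension sets emitted for one metric.

    streams: dict mapping stream name -> list of shard IDs.
    """
    series = [{"MetricName": metric_name}]  # GLOBAL: no dimensions
    if granularity in ("STREAM", "SHARD"):
        # One series per stream, with only the stream name as a dimension.
        for stream in streams:
            series.append({"MetricName": metric_name, "StreamName": stream})
    if granularity == "SHARD":
        # One series per shard, with stream name and shard ID as dimensions.
        for stream, shards in streams.items():
            for shard in shards:
                series.append({"MetricName": metric_name,
                               "StreamName": stream,
                               "ShardId": shard})
    return series

streams = {"StreamA": ["shard-0", "shard-1"],
           "StreamB": ["shard-0", "shard-1"]}
print(len(emitted_series("UserRecordsPut", streams)))  # 7: 1 global + 2 stream + 4 shard
```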

## Local access and Amazon CloudWatch upload
<a name="kpl-metrics-local-upload"></a>

Metrics for the current KPL instance are available locally in real time; you can query the KPL at any time to get them. The KPL locally computes the sum, average, minimum, maximum, and count of every metric, as in CloudWatch.

You can get statistics that are cumulative from the start of the program to the present point in time, or computed over a rolling window of the past *N* seconds, where *N* is an integer between 1 and 60.
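
These local statistics can be sketched as follows; this is an illustrative Python accumulator (not the KPL API) that supports both cumulative and rolling-window queries:

```python
# Illustrative sketch (not the KPL API): compute the five statistics the KPL
# keeps locally, either cumulatively or over a rolling window of N seconds.
import time

class RollingMetric:
    def __init__(self):
        self._samples = []  # list of (timestamp, value)

    def put(self, value, now=None):
        self._samples.append((time.time() if now is None else now, value))

    def stats(self, window_seconds=None, now=None):
        """Sum, average, minimum, maximum, and count; cumulative by default,
        or restricted to the past `window_seconds` seconds."""
        now = time.time() if now is None else now
        values = [v for t, v in self._samples
                  if window_seconds is None or now - t <= window_seconds]
        if not values:
            return None
        return {"sum": sum(values), "average": sum(values) / len(values),
                "min": min(values), "max": max(values), "count": len(values)}

m = RollingMetric()
m.put(5, now=0); m.put(15, now=50); m.put(10, now=55)
print(m.stats(now=60))                    # cumulative: all three samples
print(m.stats(window_seconds=5, now=60))  # rolling: only the sample at t=55
```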

All metrics are available for upload to CloudWatch. This is especially useful for aggregating data across multiple hosts, monitoring, and alarming. This functionality is not available locally.

As described previously, you can select which metrics to upload with the *metric level* and *granularity* settings. Metrics that are not uploaded are available locally.
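
For example, when configuring the KPL from a properties file, these two settings might look like the following (the property names mirror the `KinesisProducerConfiguration` setters; confirm them against your KPL version):

```
# Upload only summary-level metrics, aggregated per stream.
MetricsLevel = summary
MetricsGranularity = stream
```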

Uploading data points individually is untenable because it could produce millions of uploads per second if traffic is high. For this reason, the KPL aggregates metrics locally into 1-minute buckets and uploads a statistics object to CloudWatch one time per minute, per enabled metric.
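
A minimal sketch of this aggregation, using the statistic-set shape that CloudWatch accepts (`SampleCount`, `Sum`, `Minimum`, `Maximum`); this is illustrative Python, not the KPL's internal implementation:

```python
# Illustrative sketch (not KPL internals): instead of uploading every data
# point, fold one minute of samples into a single CloudWatch statistic set.
def to_statistic_set(samples):
    """Collapse any number of raw data points into the one statistics
    object that is uploaded per metric, per minute."""
    return {"SampleCount": len(samples), "Sum": sum(samples),
            "Minimum": min(samples), "Maximum": max(samples)}

# A million data points in one minute still produce a single upload.
bucket = [1] * 1_000_000
print(to_statistic_set(bucket))
```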



## List of metrics
<a name="kpl-metrics-list"></a>


| Metric | Description | 
| --- | --- | 
| UserRecordsReceived |  Count of how many logical user records were received by the KPL core for put operations. Not available at shard level. Metric level: Detailed  Unit: Count   | 
| UserRecordsPending |  Periodic sample of how many user records are currently pending. A record is pending if it is either currently buffered and waiting to be sent, or sent and in-flight to the backend service. Not available at shard level.  The KPL provides a dedicated method to retrieve this metric at the global level for customers to manage their put rate. Metric level: Detailed  Unit: Count   | 
| UserRecordsPut |  Count of how many logical user records were put successfully. The KPL outputs a zero for failed records. This allows the average to give the success rate, the count to give the total attempts, and the difference between the count and sum to give the failure count. Metric level: Summary Unit: Count  | 
| UserRecordsDataPut |  Bytes in the logical user records successfully put. Metric level: Detailed  Unit: Bytes   | 
| KinesisRecordsPut |  Count of how many Kinesis Data Streams records were put successfully (each Kinesis Data Streams record can contain multiple user records).  The KPL outputs a zero for failed records. This allows the average to give the success rate, the count to give the total attempts, and the difference between the count and sum to give the failure count. Metric level: Summary  Unit: Count   | 
| KinesisRecordsDataPut |  Bytes in the Kinesis Data Streams records.  Metric level: Detailed  Unit: Bytes   | 
| ErrorsByCode |  Count of each type of error code. This introduces an additional dimension of `ErrorCode`, in addition to the normal dimensions such as `StreamName` and `ShardId`. Not every error can be traced to a shard. The errors that cannot be traced are only emitted at stream or global levels. This metric captures information about such things as throttling, shard map changes, internal failures, service unavailable, timeouts, and so on.  Kinesis Data Streams API errors are counted one time per Kinesis Data Streams record. Multiple user records within a Kinesis Data Streams record do not generate multiple counts. Metric level: Summary  Unit: Count   | 
| AllErrors |  This is triggered by the same errors as `ErrorsByCode`, but does not distinguish between types. This is useful as a general monitor of the error rate without requiring a manual sum of the counts from all the different types of errors. Metric level: Summary  Unit: Count   | 
| RetriesPerRecord |  Number of retries performed per user record. Zero is emitted for records that succeed in one try. Data is emitted at the moment a user record finishes (when it either succeeds or can no longer be retried). If record time-to-live is a large value, this metric may be significantly delayed. Metric level: Detailed  Unit: Count   | 
| BufferingTime |  The time between a user record arriving at the KPL and leaving for the backend. This information is transmitted back to the user on a per-record basis, but is also available as an aggregated statistic. Metric level: Summary  Unit: Milliseconds   | 
| Request Time |  The time it takes to perform a `PutRecords` request. Metric level: Detailed  Unit: Milliseconds   | 
| User Records per Kinesis Record |  The number of logical user records aggregated into a single Kinesis Data Streams record. Metric level: Detailed  Unit: Count   | 
| Amazon Kinesis Records per PutRecordsRequest |  The number of Kinesis Data Streams records aggregated into a single `PutRecordsRequest`. Not available at shard level. Metric level: Detailed  Unit: Count   | 
| User Records per PutRecordsRequest |  The total number of user records contained within a `PutRecordsRequest`. This is roughly equivalent to the product of the previous two metrics. Not available at shard level. Metric level: Detailed  Unit: Count   | 
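
Because metrics such as `UserRecordsPut` and `KinesisRecordsPut` record a 1 for each success and a 0 for each failure, the failure count falls out of the uploaded statistics arithmetically. A short illustrative sketch (hypothetical helper, not KPL code):

```python
# Each put attempt contributes a sample whose value is 1 (success) or
# 0 (failure), so the statistic set yields all the quantities of interest.
def put_statistics(samples):
    attempts = len(samples)   # SampleCount: total attempts
    successes = sum(samples)  # Sum: successful puts
    return {
        "attempts": attempts,
        "failures": attempts - successes,      # count minus sum
        "success_rate": successes / attempts,  # the average
    }

print(put_statistics([1, 1, 0, 1, 0]))  # 5 attempts, 2 failures, 0.6 success rate
```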