

# Amazon Managed Service for Prometheus service quotas
<a name="AMP_quotas"></a>

The following sections describe the quotas and limits associated with Amazon Managed Service for Prometheus.

## Service quotas
<a name="AMP-series-label-limits"></a>

Amazon Managed Service for Prometheus has the following quotas. Amazon Managed Service for Prometheus vends [CloudWatch usage metrics](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-CW-usage-metrics.html) to monitor Prometheus resource usage. Using the Amazon CloudWatch usage metrics alarm feature, you can monitor Prometheus resources and usage to prevent limit errors.

As your projects and workspaces grow, the quotas that you most commonly need to monitor or request an increase for are **Active series per workspace** and **Ingestion rate per workspace**.

For all adjustable quotas, you can request a quota increase by choosing the link in the **Adjustable** column, or by [requesting a quota increase](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase).

The *Active series per workspace* limit is dynamically applied. For more information, see [Active series default quotas](#AMP-dynamic-series). The *Ingestion rate per workspace* quota determines how quickly you can ingest data into your workspace. For more information, see [Ingestion throttling](#AMP-request-throttling).

**Note**  
Unless otherwise noted, these quotas are per workspace. The maximum value for active series per workspace is one billion.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/prometheus/latest/userguide/AMP_quotas.html)

## Active series default quotas
<a name="AMP-dynamic-series"></a>

Amazon Managed Service for Prometheus workspaces automatically adapt to your ingestion usage. As your usage increases, the service automatically increases your time series capacity up to the default quota.

Your Amazon Managed Service for Prometheus workspace scales automatically, based on your usage, in two ways:

1. When your 30-minute average usage is below 5 million series, the capacity doubles (for example, a workspace with 3.5M usage gets 7M capacity).

1. When usage exceeds 5 million series, the workspace adds a 10 million buffer (for example, a workspace with 25M usage gets 35M capacity).

Amazon Managed Service for Prometheus automatically allocates more capacity as your ingestion increases, up to your quota. This helps ensure your workload does not experience sustained throttling. However, throttling can occur if your usage doubles, or grows by more than 10 million series, relative to your previous baseline computed over the last 30 minutes. To avoid throttling, increase ingestion gradually when growing beyond your previous baseline.
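The two scaling rules can be sketched as a small function. This is only an illustration of the arithmetic described in this section, not the service's implementation. The function name, the `quota` parameter, and the handling of a usage of exactly 5 million series (treated here as the buffer case) are assumptions.

```python
MIN_CAPACITY = 2_000_000         # minimum active series capacity stated in this section
DOUBLING_THRESHOLD = 5_000_000   # below this usage, capacity is doubled
BUFFER = 10_000_000              # at or above the threshold, a fixed buffer is added


def estimated_capacity(avg_usage: int, quota: int) -> int:
    """Estimate capacity from the 30-minute average usage (hypothetical helper).

    `quota` is your Active series per workspace quota; capacity never exceeds it.
    """
    if avg_usage < DOUBLING_THRESHOLD:
        capacity = 2 * avg_usage
    else:
        # Assumption: exactly 5 million is handled like the "exceeds" case.
        capacity = avg_usage + BUFFER
    capacity = max(capacity, MIN_CAPACITY)
    return min(capacity, quota)


# The quota value below is a placeholder chosen for illustration only.
print(estimated_capacity(3_500_000, quota=50_000_000))   # 7000000
print(estimated_capacity(25_000_000, quota=50_000_000))  # 35000000
```

The two printed values match the examples in the list above (3.5M usage gives 7M capacity, 25M usage gives 35M capacity).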

**Note**  
The minimum capacity for active time series is 2 million, and there is no throttling when you have less than 2 million series.  
To go beyond your default quota, you can request a [quota increase](https://console.aws.amazon.com/support/home#/case/create?issueType=service-limit-increase).

## Scaling above the default quota
<a name="AMP-above-default-quota"></a>

When you request a quota increase above the default active series quota, Amazon Managed Service for Prometheus adjusts your workspace capacity accordingly. If you don't fully utilize the increased capacity, the service will reclaim the unused portion over time. As your usage grows, the workspace will scale up again automatically.

However, throttling can occur if your usage more than doubles, or grows by more than 50 million active time series, relative to your previous baseline computed over the last 2 hours. For example:
+ If your quota is 100 million and your baseline is 30 million, you can scale up to 60 million within 2 hours without throttling.
+ If your quota is 100 million and your baseline is 50 million, you can scale up to the full 100 million within 2 hours without throttling.
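As a rough sketch, the two examples above follow from taking the smallest of three values: the quota, double the baseline, and the baseline plus 50 million. The function name is hypothetical, and the formula is an interpretation of the rule stated in this section rather than a documented service formula.

```python
MAX_GROWTH = 50_000_000  # largest growth over the 2-hour baseline without throttling


def max_unthrottled_series(baseline: int, quota: int) -> int:
    """Upper bound on active series reachable within 2 hours without throttling.

    Hypothetical helper: limited by the quota, by doubling the baseline,
    and by a growth of at most 50 million series over the baseline.
    """
    return min(quota, 2 * baseline, baseline + MAX_GROWTH)


print(max_unthrottled_series(30_000_000, quota=100_000_000))  # 60000000
print(max_unthrottled_series(50_000_000, quota=100_000_000))  # 100000000
```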

## Ingestion throttling
<a name="AMP-request-throttling"></a>

Amazon Managed Service for Prometheus throttles ingestion for each workspace, based on your current limits. This helps maintain the performance of the workspace. If you exceed the limit, you will see `DiscardedSamples` in CloudWatch metrics (with the `rate_limited` reason). You can use CloudWatch to monitor your ingestion, and to create an alarm to warn you when you are close to reaching the throttling limits. For more information, see [Use CloudWatch metrics to monitor Amazon Managed Service for Prometheus resources](AMP-CW-usage-metrics.md).

Amazon Managed Service for Prometheus uses the [token bucket algorithm](https://en.wikipedia.org/wiki/Token_bucket) to implement ingestion throttling. With this algorithm, your account has a *bucket* that holds a specific number of *tokens*. The number of tokens in the bucket represents your ingestion limit at any given second.

Each data sample ingested removes one token from the bucket. If your bucket size (*Ingestion burst size per workspace*) is *1,000,000*, your workspace can ingest one million data samples in one second. If the workspace tries to ingest more than one million samples in that second, it is throttled, and the additional data samples are discarded.

The bucket automatically refills at a set rate. If the bucket is below its maximum capacity, a set number of tokens is added back to it every second until it reaches its maximum capacity. If the bucket is full when the refill tokens arrive, they are discarded. The bucket can't hold more than its maximum number of tokens. The refill rate for sample ingestion is set by the *Ingestion rate per workspace* limit. If your *Ingestion rate per workspace* is set to 170,000, then the refill rate for the bucket is 170,000 tokens per second.

If your workspace ingests 1,000,000 data samples in a second, your bucket is immediately reduced to zero tokens. The bucket is then refilled by 170,000 tokens every second, until it reaches its maximum capacity of 1,000,000 tokens. If there is no more ingestion, the previously empty bucket returns to its maximum capacity in about 6 seconds.

**Note**  
Ingestion happens in batched requests. If you have 100 tokens available, and send a request with 101 samples, the entire request is rejected. Amazon Managed Service for Prometheus does not partially accept requests. If you are writing a collector, you can manage retries (with smaller batches or after some time has passed).

You do not need to wait for the bucket to be full before your workspace can ingest more data samples. You can use tokens as they are added to the bucket. If you immediately use the refill tokens, the bucket does not reach its maximum capacity. For example, if you deplete the bucket, you can continue to ingest 170,000 data samples per second. The bucket can refill to maximum capacity only if you ingest fewer than 170,000 data samples per second.
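The behavior described in this section can be simulated with a minimal token bucket. This is a sketch for building intuition, using the bucket size of 1,000,000 and the refill rate of 170,000 from the examples above. The class and method names are hypothetical, and refills are modeled in whole one-second steps, which is a simplification.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    capacity: int      # maximum number of tokens the bucket can hold
    refill_rate: int   # tokens added per second
    tokens: int = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = self.capacity  # the bucket starts full

    def refill(self) -> None:
        """Add one second of tokens; tokens beyond capacity are discarded."""
        self.tokens = min(self.capacity, self.tokens + self.refill_rate)

    def try_ingest(self, samples: int) -> bool:
        """Accept the whole request or reject it; never accept part of it."""
        if samples > self.tokens:
            return False
        self.tokens -= samples
        return True


def seconds_until_full(bucket: TokenBucket) -> int:
    """Count one-second refills until the bucket is full (mutates the bucket)."""
    seconds = 0
    while bucket.tokens < bucket.capacity:
        bucket.refill()
        seconds += 1
    return seconds


bucket = TokenBucket(capacity=1_000_000, refill_rate=170_000)
bucket.try_ingest(1_000_000)       # uses every token in the bucket
print(bucket.try_ingest(1))        # False: no tokens left, request rejected
print(seconds_until_full(bucket))  # 6
```

After the bucket is depleted, calling `refill()` followed by `try_ingest(170_000)` each second keeps succeeding, which corresponds to the sustained rate of 170,000 samples per second described above.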

## Additional limits on ingested data
<a name="AMP-ingest-limits"></a>

Amazon Managed Service for Prometheus also has the following additional requirements for data ingested into the workspace. These are not adjustable.
+ Metric samples older than 1 hour are rejected.
+ Every sample and every metadata entry must have a metric name.
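If you are writing a collector, you might check samples against these two requirements before sending them. The following is a minimal client-side sketch, not the service's validation logic. The function name and the returned messages are hypothetical, and it assumes the metric name is carried in the `__name__` label, as is conventional in Prometheus, with timestamps expressed in seconds.

```python
from typing import Dict, Optional

MAX_SAMPLE_AGE_SECONDS = 3600  # samples older than 1 hour are rejected


def rejection_reason(labels: Dict[str, str], timestamp: float, now: float) -> Optional[str]:
    """Return why a sample would likely be rejected, or None if it passes both checks."""
    if not labels.get("__name__"):
        return "missing metric name"
    if now - timestamp > MAX_SAMPLE_AGE_SECONDS:
        return "sample older than 1 hour"
    return None


now = 10_000.0  # fixed reference time so the example is reproducible
print(rejection_reason({"__name__": "up"}, 9_000.0, now))        # None
print(rejection_reason({"job": "node"}, 9_000.0, now))           # missing metric name
print(rejection_reason({"__name__": "up"}, now - 3_601.0, now))  # sample older than 1 hour
```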