Target tracking scaling policies for Amazon EC2 Auto Scaling
A target tracking scaling policy automatically scales the capacity of your Auto Scaling group based on a target metric value. This allows your application to maintain optimal performance and cost efficiency without manual intervention.
With target tracking, you select a metric and a target value to represent the ideal average utilization or throughput level for your application. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that invoke scaling events when the metric deviates from the target. This is similar to how a thermostat maintains a target temperature.
For example, let's say that you currently have an application that runs on two instances, and you want the CPU utilization of the Auto Scaling group to stay at around 50 percent when the load on the application changes. This gives you extra capacity to handle traffic spikes without maintaining an excessive number of idle resources.
You can meet this need by creating a target tracking scaling policy that targets an average CPU utilization of 50 percent. Then, your Auto Scaling group will scale out, or increase capacity, when CPU exceeds 50 percent to handle increased load. It will scale in, or decrease capacity, when CPU drops below 50 percent to optimize costs during periods of low utilization.
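As a minimal sketch, the 50 percent CPU policy described above could be expressed as the parameters of a put_scaling_policy request. The group name my-asg, the policy name cpu50-target-tracking, and the helper function are placeholders chosen for illustration. The block only builds the parameters, so it runs without AWS credentials.

```python
def cpu_target_tracking_params(group_name, policy_name, target_percent=50.0):
    """Build keyword arguments for a target tracking policy on average CPU.

    The helper itself is hypothetical; the dictionary keys are the request
    fields of the Auto Scaling PutScalingPolicy operation.
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be greater than 0 and at most 100")
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_percent),
        },
    }


# Placeholder names; replace them with your own group and policy names.
params = cpu_target_tracking_params("my-asg", "cpu50-target-tracking")
print(params["TargetTrackingConfiguration"])
```

With credentials configured, the same dictionary could be passed to an SDK call such as boto3.client("autoscaling").put_scaling_policy(**params).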
Multiple target tracking scaling policies
To help optimize scaling performance, you can use multiple target tracking scaling policies together, provided that each of them uses a different metric. For example, utilization and throughput can influence each other. Whenever one of these metrics changes, it usually implies that other metrics will also be impacted. The use of multiple metrics therefore provides additional information about the load that your Auto Scaling group is under. This can help Amazon EC2 Auto Scaling make more informed decisions when determining how much capacity to add to your group.
The intention of Amazon EC2 Auto Scaling is to always prioritize availability. It will scale out the Auto Scaling group if any of the target tracking policies are ready to scale out. It will scale in only if all of the target tracking policies (with the scale-in portion enabled) are ready to scale in.
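The rule above can be sketched as a small decision function. This is a simplified model for illustration, not the service's implementation. The PolicyState class, its field names, and the "out" / "in" / "none" labels are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PolicyState:
    """Hypothetical snapshot of what one target tracking policy wants to do."""
    name: str
    ready: str                      # "out", "in", or "none"
    scale_in_enabled: bool = True   # False if the scale-in portion is disabled


def group_action(policies):
    """Scale out if any policy is ready to; scale in only if all eligible ones are."""
    if any(p.ready == "out" for p in policies):
        return "scale_out"
    # Only policies with the scale-in portion enabled take part in the scale-in check.
    voters = [p for p in policies if p.scale_in_enabled]
    if voters and all(p.ready == "in" for p in voters):
        return "scale_in"
    return "none"


print(group_action([PolicyState("cpu", "out"), PolicyState("requests", "in")]))   # scale_out
print(group_action([PolicyState("cpu", "in"), PolicyState("requests", "none")]))  # none
print(group_action([PolicyState("cpu", "in"), PolicyState("requests", "in")]))    # scale_in
```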
Choose metrics
You can create target tracking scaling policies with either predefined metrics or custom metrics.
When you create a target tracking scaling policy with a predefined metric type, you choose one metric from the following list of predefined metrics:
- ASGAverageCPUUtilization: Average CPU utilization of the Auto Scaling group.
- ASGAverageNetworkIn: Average number of bytes received by a single instance on all network interfaces.
- ASGAverageNetworkOut: Average number of bytes sent out from a single instance on all network interfaces.
- ALBRequestCountPerTarget: Average Application Load Balancer request count per target.
Important
For more information about the CPU utilization and network I/O metrics, see the List the available CloudWatch metrics for your instances topic in the Amazon EC2 User Guide. For more information about the Application Load Balancer request count per target metric, see the CloudWatch metrics for your Application Load Balancer topic in the User Guide for Application Load Balancers.
You can choose other available CloudWatch metrics or your own metrics in CloudWatch by specifying a custom metric. You must use the AWS CLI or an SDK to create a target tracking policy with a customized metric specification. For an example that specifies a customized metric specification for a target tracking scaling policy using the AWS CLI, see Example scaling policies for the AWS CLI.
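As a rough sketch, a customized metric specification is a description of one CloudWatch metric plus a target value. The namespace, metric name, and dimension in the usage lines below are hypothetical placeholders, and the helper function is illustrative only. The block builds the configuration without calling AWS.

```python
VALID_STATISTICS = {"Average", "Minimum", "Maximum", "SampleCount", "Sum"}


def custom_metric_config(namespace, metric_name, dimensions, target_value,
                         statistic="Average"):
    """Build a TargetTrackingConfiguration that tracks a single custom metric.

    dimensions is a plain dict such as {"Name": "Value"}; it is converted to
    the list-of-pairs form that CloudWatch dimensions use.
    """
    if statistic not in VALID_STATISTICS:
        raise ValueError(f"unsupported statistic: {statistic}")
    return {
        "CustomizedMetricSpecification": {
            "Namespace": namespace,
            "MetricName": metric_name,
            "Dimensions": [{"Name": k, "Value": v} for k, v in dimensions.items()],
            "Statistic": statistic,
        },
        "TargetValue": float(target_value),
    }


# Hypothetical metric published by your own application.
config = custom_metric_config(
    namespace="MyNamespace",
    metric_name="MyUtilizationMetric",
    dimensions={"MyOptionalMetricDimensionName": "MyOptionalMetricDimensionValue"},
    target_value=50,
)
print(config["CustomizedMetricSpecification"]["Dimensions"])
```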
Keep the following in mind when choosing a metric:
- We recommend that you only use metrics that are available at one-minute intervals to help you scale faster in response to utilization changes. Target tracking evaluates metrics aggregated at a one-minute granularity for all predefined metrics and custom metrics, but the underlying metric might publish data less frequently. For example, all Amazon EC2 metrics are sent in five-minute intervals by default, but they are configurable to one minute (known as detailed monitoring). This choice is up to the individual services. Most try to use the smallest interval possible. For information about enabling detailed monitoring, see Configure monitoring for Auto Scaling instances.
- Not all custom metrics work for target tracking. The metric must be a valid utilization metric and describe how busy an instance is. The metric value must increase or decrease proportionally to the number of instances in the Auto Scaling group, so that the metric data can be used to proportionally scale the number of instances out or in. For example, the CPU utilization of an Auto Scaling group works (that is, the Amazon EC2 metric CPUUtilization with the metric dimension AutoScalingGroupName), if the load on the Auto Scaling group is distributed across the instances.
- The following metrics do not work for target tracking:
  - The number of requests received by the load balancer fronting the Auto Scaling group (that is, the Elastic Load Balancing metric RequestCount). The number of requests received by the load balancer doesn't change based on the utilization of the Auto Scaling group.
  - Load balancer request latency (that is, the Elastic Load Balancing metric Latency). Request latency can increase based on increasing utilization, but doesn't necessarily change proportionally.
  - The CloudWatch Amazon SQS queue metric ApproximateNumberOfMessagesVisible. The number of messages in a queue might not change proportionally to the size of the Auto Scaling group that processes messages from the queue. However, a custom metric that measures the number of messages in the queue per EC2 instance in the Auto Scaling group can work. For more information, see Scaling policy based on Amazon SQS.
- To use the ALBRequestCountPerTarget metric, you must specify the ResourceLabel parameter to identify the load balancer target group that is associated with the metric. For an example that specifies the ResourceLabel parameter for a target tracking scaling policy using the AWS CLI, see Example scaling policies for the AWS CLI.
- When a metric emits real 0 values to CloudWatch (for example, ALBRequestCountPerTarget), an Auto Scaling group can scale in to 0 when there is no traffic to your application for a sustained period of time. To have your Auto Scaling group scale in to 0 when no requests are routed to it, the group's minimum capacity must be set to 0.
- Instead of publishing new metrics to use in your scaling policy, you can use metric math to combine existing metrics. For more information, see Create a target tracking scaling policy using metric math.
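Two of the points above, the per-instance queue backlog and metric math, can be combined. The sketch below divides the visible message count by the number of in-service instances. It assumes that group metrics collection is enabled so that GroupInServiceInstances is published. The queue name, group name, target of 100 messages per instance, and helper functions are hypothetical. The field layout reflects our understanding of the metric math form of a customized metric specification, so check it against the API reference before relying on it.

```python
def metric_query(query_id, namespace, metric_name, dim_name, dim_value, stat):
    """One raw-metric entry for the Metrics list; it feeds the expression only."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": namespace,
                "MetricName": metric_name,
                "Dimensions": [{"Name": dim_name, "Value": dim_value}],
            },
            "Stat": stat,
        },
        "ReturnData": False,
    }


def backlog_per_instance_config(queue_name, group_name, target_backlog):
    """Track (visible messages) / (in-service instances) against a target."""
    return {
        "CustomizedMetricSpecification": {
            "Metrics": [
                metric_query("m1", "AWS/SQS", "ApproximateNumberOfMessagesVisible",
                             "QueueName", queue_name, "Sum"),
                metric_query("m2", "AWS/AutoScaling", "GroupInServiceInstances",
                             "AutoScalingGroupName", group_name, "Average"),
                # Only the expression result is returned and used for scaling.
                {"Id": "e1", "Expression": "m1 / m2",
                 "Label": "Backlog per instance", "ReturnData": True},
            ]
        },
        "TargetValue": float(target_backlog),
    }


config = backlog_per_instance_config("my-queue", "my-asg", target_backlog=100)
metrics = config["CustomizedMetricSpecification"]["Metrics"]
print([q["Id"] for q in metrics if q["ReturnData"]])  # ['e1']
```

Unlike the raw queue length, this ratio falls as instances are added, which is the proportional behavior that target tracking relies on.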
Define target value
When you create a target tracking scaling policy, you must specify a target value. The target value represents the optimal average utilization or throughput for the Auto Scaling group. To use resources cost efficiently, set the target value as high as possible with a reasonable buffer for unexpected traffic increases. When your application is optimally scaled out for a normal traffic flow, the actual metric value should be at or just below the target value.
When a scaling policy is based on throughput, such as the request count per target for an Application Load Balancer, network I/O, or other count metrics, the target value represents the optimal average throughput from a single instance, for a one-minute period.
Define instance warmup time
You can optionally specify the number of seconds that it takes for a newly launched instance to warm up. Until its specified warmup time has expired, an instance is not counted toward the aggregated EC2 instance metrics of the Auto Scaling group.
While instances are in the warmup period, your scaling policies only scale out if the metric value from instances that are not warming up is greater than the policy's target utilization.
If the group scales out again, the instances that are still warming up are counted as part of the desired capacity for the next scale-out activity. The intention is to continuously (but not excessively) scale out.
While the scale-out activity is in progress, all scale-in activities initiated by scaling policies are blocked until the instances finish warming up. When the instances finish warming up, if a scale-in event occurs, any instances currently in the process of terminating are counted toward the current capacity of the group when calculating the new desired capacity. Therefore, we don't remove more instances from the Auto Scaling group than necessary.
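To illustrate why warming-up instances are excluded from the aggregated metric, here is a simplified model. It is not the service's actual algorithm. The Instance class, its fields, and the sample numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical view of one instance: its CPU percent and time since launch."""
    cpu: float
    age_seconds: int


def should_scale_out(instances, target, warmup_seconds):
    """Compare the target with the average over instances that finished warming up."""
    warmed = [i for i in instances if i.age_seconds >= warmup_seconds]
    if not warmed:
        return False  # no usable data points in this simplified model
    average = sum(i.cpu for i in warmed) / len(warmed)
    return average > target


fleet = [Instance(cpu=70, age_seconds=600),
         Instance(cpu=60, age_seconds=600),
         Instance(cpu=5, age_seconds=30)]   # just launched, still warming up

# The warmed-up instances average 65 percent, so the policy may still scale out.
# Including the new instance would give an average of 45 percent and hide the load.
print(should_scale_out(fleet, target=50, warmup_seconds=300))  # True
```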
Default value
If no value is set, then the scaling policy will use the default value, which is the value for the default instance warmup defined for the group. If the default instance warmup is null, then it falls back to the value of the default cooldown. We recommend using the default instance warmup to make it easier to update all scaling policies when the warmup time changes.
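The fallback order can be sketched as follows. The function and parameter names are illustrative, and None stands in for a value that has not been set.

```python
def effective_warmup(policy_warmup, default_instance_warmup, default_cooldown):
    """Return the warmup, in seconds, that a scaling policy ends up using.

    Order: the policy's own value, then the group's default instance warmup,
    then the group's default cooldown. None means "not set"; 0 is a real value.
    """
    if policy_warmup is not None:
        return policy_warmup
    if default_instance_warmup is not None:
        return default_instance_warmup
    return default_cooldown


print(effective_warmup(None, None, 300))  # 300
print(effective_warmup(None, 120, 300))   # 120
print(effective_warmup(60, 120, 300))     # 60
```

Leaving the policy value unset and setting only the group default means that one change updates every policy that relies on it.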
Considerations
The following considerations apply when working with target tracking scaling policies:
- Do not create, edit, or delete the CloudWatch alarms that are used with a target tracking scaling policy. Amazon EC2 Auto Scaling creates and manages the CloudWatch alarms that are associated with your target tracking scaling policies and deletes them when no longer needed.
- A target tracking scaling policy prioritizes availability during periods of fluctuating traffic levels by scaling in more gradually when traffic is decreasing. If you want your Auto Scaling group to scale in immediately when a workload finishes, you can disable the scale-in portion of the policy. This provides you with the flexibility to use the scale-in method that best meets your needs when utilization is low. To ensure that scale in happens as quickly as possible, we recommend not using a simple scaling policy to prevent a cooldown period from being added.
- If the metric is missing data points, this causes the CloudWatch alarm state to change to INSUFFICIENT_DATA. When this happens, Amazon EC2 Auto Scaling cannot scale your group until new data points are found.
- If the metric is sparsely reported by design, metric math can be helpful. For example, to use the most recent values, use the FILL(m1,REPEAT) function, where m1 is the metric.
- You might see gaps between the target value and the actual metric data points. This is because we act conservatively by rounding up or down when determining how many instances to add or remove. This prevents us from adding an insufficient number of instances or removing too many instances. However, for smaller Auto Scaling groups with fewer instances, the utilization of the group might seem far from the target value. For example, let's say that you set a target value of 50 percent for CPU utilization and your Auto Scaling group then exceeds the target. We might determine that adding 1.5 instances will decrease the CPU utilization to close to 50 percent. Because it is not possible to add 1.5 instances, we round up and add two instances. This might decrease the CPU utilization to a value below 50 percent, but it ensures that your application has enough resources to support it. Similarly, if we determine that removing 1.5 instances increases your CPU utilization to above 50 percent, we remove just one instance.
  For larger Auto Scaling groups with more instances, the utilization is spread over a larger number of instances, in which case adding or removing instances causes less of a gap between the target value and the actual metric data points.
- A target tracking scaling policy assumes that it should scale out your Auto Scaling group when the specified metric is above the target value. You can't use a target tracking scaling policy to scale out your Auto Scaling group when the specified metric is below the target value.
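The rounding behavior described in the considerations above can be worked through numerically. The sketch assumes a purely proportional model, in which total load stays constant and spreads evenly across instances. This is a simplification and not the service's published algorithm. The function name and the sample values are illustrative.

```python
import math


def capacity_change(current_instances, metric_value, target_value):
    """Instances to add (positive) or remove (negative) under a proportional model.

    The capacity needed to reach the target exactly is usually fractional.
    Rounding it up adds enough instances on scale out and avoids removing
    too many on scale in.
    """
    needed = current_instances * metric_value / target_value
    return math.ceil(needed) - current_instances


# Scale out: 3 instances at 75 percent need 4.5 instances, so 1.5 rounds up to 2.
print(capacity_change(3, 75, 50))    # 2
# Scale in: 6 instances at 37.5 percent need 4.5 instances, so only 1 is removed.
print(capacity_change(6, 37.5, 50))  # -1

# Either way the group ends at 5 instances, and the same load gives 45 percent.
print(3 * 75 / 5)  # 45.0
```

In both cases the group settles at 45 percent rather than 50 percent, which is the kind of gap that is more visible in small groups.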