How step scaling for Application Auto Scaling works
This topic describes how step scaling works and introduces the key elements of a step scaling policy.
How it works
To use step scaling, you create a CloudWatch alarm that monitors a metric for your scalable target. Define the metric, threshold value, and number of evaluation periods that determine an alarm breach. You also create a step scaling policy that defines how to scale capacity when the alarm threshold is breached and associate it with your scalable target.
Add the step adjustments in the policy. You can define different step adjustments based on the breach size of the alarm. For example:
- Scale out by 10 capacity units if the alarm metric reaches 60 percent
- Scale out by 30 capacity units if the alarm metric reaches 75 percent
- Scale out by 40 capacity units if the alarm metric reaches 85 percent
When the alarm threshold is breached for the specified number of evaluation periods, Application Auto Scaling applies the step adjustments defined in the policy. The adjustments can continue for additional alarm breaches until the alarm state returns to OK.
Scaling activities are performed with cooldown periods between them to prevent rapid fluctuations in capacity. You can optionally configure the cooldown periods for your scaling policy.
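As a minimal sketch of the setup described above, the following Python builds the request parameters for a step scaling policy that matches the 60/75/85 percent example. It assumes the CloudWatch alarm threshold is 60, so the step bounds are offsets from 60. The policy name, the ECS resource ID, and the helper function `build_step_scaling_policy` are hypothetical and used only for illustration.

```python
def build_step_scaling_policy(policy_name, service_namespace, resource_id,
                              scalable_dimension, steps, cooldown=300):
    """Return keyword arguments shaped for the put_scaling_policy API call.

    steps is a list of (lower_bound, upper_bound, adjustment) tuples whose
    bounds are relative to the alarm threshold; None means unbounded.
    """
    step_adjustments = []
    for lower, upper, adjustment in steps:
        step = {"ScalingAdjustment": adjustment}
        if lower is not None:
            step["MetricIntervalLowerBound"] = lower
        if upper is not None:
            step["MetricIntervalUpperBound"] = upper
        step_adjustments.append(step)
    return {
        "PolicyName": policy_name,
        "ServiceNamespace": service_namespace,
        "ResourceId": resource_id,
        "ScalableDimension": scalable_dimension,
        "PolicyType": "StepScaling",
        "StepScalingPolicyConfiguration": {
            "AdjustmentType": "ChangeInCapacity",
            "StepAdjustments": step_adjustments,
            "MetricAggregationType": "Average",
            "Cooldown": cooldown,
        },
    }


# Assumed alarm threshold: 60 percent. Offsets 0, 15, and 25 correspond to
# metric values of 60, 75, and 85 percent.
policy_kwargs = build_step_scaling_policy(
    policy_name="my-step-scale-out-policy",       # hypothetical name
    service_namespace="ecs",
    resource_id="service/my-cluster/my-service",  # hypothetical resource
    scalable_dimension="ecs:service:DesiredCount",
    steps=[(0, 15, 10), (15, 25, 30), (25, None, 40)],
)

# With AWS credentials configured, the parameters could be sent like this:
#   import boto3
#   client = boto3.client("application-autoscaling")
#   response = client.put_scaling_policy(**policy_kwargs)
# The returned response["PolicyARN"] is what you would attach to the
# CloudWatch alarm as an alarm action.
```

The builder only assembles a dictionary, so you can inspect or unit test the step bounds before sending anything to AWS.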
Step adjustments
When you create a step scaling policy, you specify one or more step adjustments that automatically scale the capacity of the target dynamically based on the size of the alarm breach. Each step adjustment specifies the following:
- A lower bound for the metric value
- An upper bound for the metric value
- The amount by which to scale, based on the scaling adjustment type
CloudWatch aggregates metric data points based on the statistic for the metric associated with your CloudWatch alarm. When the alarm is breached, the appropriate scaling policy is invoked. Application Auto Scaling applies your specified aggregation type to the most recent metric data points from CloudWatch (as opposed to the raw metric data). It compares this aggregated metric value against the upper and lower bounds defined by the step adjustments to determine which step adjustment to perform.
You specify the upper and lower bounds relative to the breach threshold. For example, suppose that you create a CloudWatch alarm and a scale-out policy for when the metric is above 50 percent. You then create a second alarm and a scale-in policy for when the metric is below 50 percent. For each policy, you define a set of step adjustments with an adjustment type of PercentChangeInCapacity:
Scale-out policy:

Lower bound | Upper bound | Adjustment |
---|---|---|
0 | 10 | 0 |
10 | 20 | 10 |
20 | null | 30 |

Scale-in policy:

Lower bound | Upper bound | Adjustment |
---|---|---|
-10 | 0 | 0 |
-20 | -10 | -10 |
null | -20 | -30 |

This creates the following scaling configuration.
Metric value

    -infinity          30%    40%          60%     70%             infinity
    -----------------------------------------------------------------------
              -30%      | -10% | Unchanged  | +10%  |       +30%
    -----------------------------------------------------------------------

Now, let's say that you use this scaling configuration on a scalable target that has a capacity of 10. The following points summarize the behavior of the scaling configuration in relation to the capacity of the scalable target:
- The original capacity is maintained while the aggregated metric value is greater than 40 and less than 60.
- If the metric value gets to 60, Application Auto Scaling increases the capacity of the scalable target by 1, to 11, based on the second step adjustment of the scale-out policy (add 10 percent of 10). If the metric value rises to 70 even after this increase in capacity, Application Auto Scaling increases the capacity by 3, to 14, based on the third step adjustment of the scale-out policy (add 30 percent of 11, which is 3.3, rounded down to 3).
- If the metric value gets to 40, Application Auto Scaling decreases the capacity of the scalable target by 1, to 13, based on the second step adjustment of the scale-in policy (remove 10 percent of 14, which is 1.4, rounded down to 1). If the metric value falls to 30 even after this decrease in capacity, Application Auto Scaling decreases the capacity by 3, to 10, based on the third step adjustment of the scale-in policy (remove 30 percent of 13, which is 3.9, rounded down to 3).
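The walkthrough above can be reproduced with a short Python sketch. It is an illustration of the documented behavior, not the service's implementation: the function names are hypothetical, the threshold of 50 and the two step tables come from the example, and a metric value exactly equal to the threshold is treated as "above" by assumption.

```python
import math

# (lower_bound, upper_bound, percent_adjustment), relative to the threshold.
SCALE_OUT_STEPS = [(0, 10, 0), (10, 20, 10), (20, None, 30)]
SCALE_IN_STEPS = [(-10, 0, 0), (-20, -10, -10), (None, -20, -30)]
THRESHOLD = 50


def select_adjustment(metric_value, threshold, steps):
    """Return the adjustment of the step whose interval contains the breach."""
    delta = metric_value - threshold
    for lower, upper, adjustment in steps:
        lo = float("-inf") if lower is None else lower
        hi = float("inf") if upper is None else upper
        if delta >= 0:
            # Above the threshold: lower bound inclusive, upper bound exclusive.
            if lo <= delta < hi:
                return adjustment
        elif lo < delta <= hi:
            # Below the threshold: lower bound exclusive, upper bound inclusive.
            return adjustment
    return 0  # No matching step: leave the capacity unchanged.


def apply_percent(capacity, percent):
    """Apply a PercentChangeInCapacity adjustment with the documented rounding."""
    change = capacity * percent / 100
    if 0 < abs(change) < 1:
        change = math.copysign(1, change)  # Fractions below 1 scale by 1.
    return capacity + math.trunc(change)   # Otherwise round toward zero.


capacity = 10
history = []
for metric, steps in [(60, SCALE_OUT_STEPS), (70, SCALE_OUT_STEPS),
                      (40, SCALE_IN_STEPS), (30, SCALE_IN_STEPS)]:
    capacity = apply_percent(capacity, select_adjustment(metric, THRESHOLD, steps))
    history.append(capacity)

print(history)  # [11, 14, 13, 10]
```

The sketch ignores cooldown periods and the minimum and maximum capacity of the scalable target, both of which affect real scaling activities.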
When you specify the step adjustments for your scaling policy, note the following:
- The ranges of your step adjustments can't overlap or have a gap.
- Only one step adjustment can have a null lower bound (negative infinity). If one step adjustment has a negative lower bound, then there must be a step adjustment with a null lower bound.
- Only one step adjustment can have a null upper bound (positive infinity). If one step adjustment has a positive upper bound, then there must be a step adjustment with a null upper bound.
- The upper and lower bound can't be null in the same step adjustment.
- If the metric value is above the breach threshold, the lower bound is inclusive and the upper bound is exclusive. If the metric value is below the breach threshold, the lower bound is exclusive and the upper bound is inclusive.
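The structural rules in this list can be checked before you submit a policy. The following Python sketch is one possible local check, assuming step adjustments are written as `(lower_bound, upper_bound, adjustment)` tuples with `None` standing for null. The function `validate_steps` is hypothetical, and the service performs its own validation that may differ in detail.

```python
def validate_steps(steps):
    """Return a list of rule violations for (lower, upper, adjustment) tuples."""
    errors = []
    if any(lo is None and hi is None for lo, hi, _ in steps):
        errors.append("a step adjustment has both bounds set to null")

    null_lower = sum(1 for lo, _, _ in steps if lo is None)
    null_upper = sum(1 for _, hi, _ in steps if hi is None)
    if null_lower > 1:
        errors.append("more than one step adjustment has a null lower bound")
    if null_upper > 1:
        errors.append("more than one step adjustment has a null upper bound")
    if null_lower == 0 and any(lo is not None and lo < 0 for lo, _, _ in steps):
        errors.append("negative lower bound without a null lower bound")
    if null_upper == 0 and any(hi is not None and hi > 0 for _, hi, _ in steps):
        errors.append("positive upper bound without a null upper bound")

    # Sort by lower bound (null first) and compare each pair of neighbors.
    ordered = sorted(steps, key=lambda s: float("-inf") if s[0] is None else s[0])
    for (_, prev_hi, _), (next_lo, _, _) in zip(ordered, ordered[1:]):
        if prev_hi is None or next_lo is None or prev_hi > next_lo:
            errors.append("step adjustment ranges overlap")
        elif prev_hi < next_lo:
            errors.append(f"gap between {prev_hi} and {next_lo}")
    return errors


print(validate_steps([(0, 10, 0), (10, 20, 10), (20, None, 30)]))  # []
print(validate_steps([(0, 10, 0), (15, None, 10)]))  # ['gap between 10 and 15']
```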
Scaling adjustment types
You can define a scaling policy that performs the optimal scaling action, based on the scaling adjustment type that you choose. You can specify the adjustment type as a percentage of the current capacity of your scalable target or in absolute numbers.
Application Auto Scaling supports the following adjustment types for step scaling policies:
- ChangeInCapacity: Increase or decrease the current capacity of the scalable target by the specified value. A positive value increases the capacity and a negative value decreases the capacity. For example: If the current capacity is 3 and the adjustment is 5, then Application Auto Scaling adds 5 to the capacity for a total of 8.
- ExactCapacity: Change the current capacity of the scalable target to the specified value. Specify a non-negative value with this adjustment type. For example: If the current capacity is 3 and the adjustment is 5, then Application Auto Scaling changes the capacity to 5.
- PercentChangeInCapacity: Increase or decrease the current capacity of the scalable target by the specified percentage. A positive value increases the capacity and a negative value decreases the capacity. For example: If the current capacity is 10 and the adjustment is 10 percent, then Application Auto Scaling adds 1 to the capacity for a total of 11.

  If the resulting value is not an integer, Application Auto Scaling rounds it as follows:

  - Values greater than 1 are rounded down. For example, 12.7 is rounded to 12.
  - Values between 0 and 1 are rounded to 1. For example, .67 is rounded to 1.
  - Values between 0 and -1 are rounded to -1. For example, -.58 is rounded to -1.
  - Values less than -1 are rounded up. For example, -6.67 is rounded to -6.

  With PercentChangeInCapacity, you can also specify the minimum amount to scale using the MinAdjustmentMagnitude parameter. For example, suppose that you create a policy that adds 25 percent and you specify a minimum amount of 2. If the scalable target has a capacity of 4 and the scaling policy is performed, 25 percent of 4 is 1. However, because you specified a minimum increment of 2, Application Auto Scaling adds 2.
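The three adjustment types and the rounding rules can be summarized in a small Python sketch. The function names are hypothetical, and the sketch leaves out the minimum and maximum capacity limits of the scalable target. It also assumes that MinAdjustmentMagnitude applies in both the scale-out and scale-in directions.

```python
import math


def round_capacity_change(value):
    """Round a fractional capacity change using the rules listed above."""
    if 0 < value < 1:
        return 1
    if -1 < value < 0:
        return -1
    return math.trunc(value)  # Toward zero: 12.7 -> 12, -6.67 -> -6.


def apply_adjustment(capacity, adjustment_type, adjustment,
                     min_adjustment_magnitude=None):
    """Return the new capacity after applying one scaling adjustment."""
    if adjustment_type == "ChangeInCapacity":
        return capacity + adjustment
    if adjustment_type == "ExactCapacity":
        return adjustment
    if adjustment_type == "PercentChangeInCapacity":
        change = round_capacity_change(capacity * adjustment / 100)
        # Assumption: the minimum magnitude applies in both directions.
        if (min_adjustment_magnitude is not None
                and 0 < abs(change) < min_adjustment_magnitude):
            change = (min_adjustment_magnitude if change > 0
                      else -min_adjustment_magnitude)
        return capacity + change
    raise ValueError(f"unknown adjustment type: {adjustment_type}")


print(apply_adjustment(3, "ChangeInCapacity", 5))           # 8
print(apply_adjustment(3, "ExactCapacity", 5))              # 5
print(apply_adjustment(10, "PercentChangeInCapacity", 10))  # 11
print(apply_adjustment(4, "PercentChangeInCapacity", 25, min_adjustment_magnitude=2))  # 6
```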
Cooldown period
You can optionally define a cooldown period in your step scaling policy.
A cooldown period specifies the amount of time the scaling policy waits for a previous scaling activity to take effect.
There are two ways to plan for the use of cooldown periods for a step scaling configuration:
- With the cooldown period for scale-out policies, the intention is to continuously (but not excessively) scale out. After Application Auto Scaling successfully scales out using a scaling policy, it starts to calculate the cooldown time. A scaling policy won't increase the desired capacity again unless either a larger scale out is triggered or the cooldown period ends. While the scale-out cooldown period is in effect, the capacity added by the initiating scale-out activity is calculated as part of the desired capacity for the next scale-out activity.
- With the cooldown period for scale-in policies, the intention is to scale in conservatively to protect your application's availability, so scale-in activities are blocked until the scale-in cooldown period has expired. However, if another alarm triggers a scale-out activity during the scale-in cooldown period, Application Auto Scaling scales out the target immediately. In this case, the scale-in cooldown period stops and doesn't complete.
For example, when a traffic peak occurs, an alarm is triggered and Application Auto Scaling automatically adds capacity to help handle the increased load. If you set a cooldown period for your scale-out policy, when the alarm triggers the policy to increase the capacity by 2, the scaling activity completes successfully, and the scale-out cooldown period starts. If an alarm triggers again during the cooldown period but at a more aggressive step adjustment of 3, the previous increase of 2 is considered part of the current capacity. Therefore, only 1 is added to the capacity. This allows faster scaling than waiting for the cooldown to expire but without adding more capacity than you need.
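The scale-out arithmetic in this example (an increase of 2, then a step of 3 during the cooldown that adds only 1) can be modeled with the following Python sketch. The class `ScaleOutCooldown` is hypothetical and simplified: times are plain seconds, and the sketch assumes that a larger scale-out during the cooldown restarts the cooldown timer, which the text above does not state.

```python
class ScaleOutCooldown:
    """Toy model of how a scale-out cooldown limits added capacity."""

    def __init__(self, cooldown_seconds=300):
        self.cooldown_seconds = cooldown_seconds
        self.last_scale_out_time = None
        self.last_adjustment = 0

    def capacity_to_add(self, now, requested):
        """Return how much capacity a scale-out request adds at time `now`."""
        in_cooldown = (
            self.last_scale_out_time is not None
            and now - self.last_scale_out_time < self.cooldown_seconds
        )
        if in_cooldown:
            if requested <= self.last_adjustment:
                return 0  # Not a larger scale-out: wait for the cooldown.
            # The earlier increase counts toward the larger step.
            added = requested - self.last_adjustment
        else:
            added = requested
        # Assumption: every successful scale-out restarts the cooldown.
        self.last_scale_out_time = now
        self.last_adjustment = requested
        return added


policy = ScaleOutCooldown(cooldown_seconds=300)
print(policy.capacity_to_add(now=0, requested=2))     # 2
print(policy.capacity_to_add(now=60, requested=3))    # 1
print(policy.capacity_to_add(now=120, requested=3))   # 0
print(policy.capacity_to_add(now=1000, requested=2))  # 2
```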
The cooldown period is measured in seconds and applies only to scaling policy-related scaling activities. During a cooldown period, when a scheduled action starts at the scheduled time, it can trigger a scaling activity immediately without waiting for the cooldown period to expire.
If you don't specify a value, the default cooldown period is 300 seconds.
Commonly used commands for scaling policy creation, management, and deletion
The commonly used commands for working with scaling policies include:
- register-scalable-target to register AWS or custom resources as scalable targets (a resource that Application Auto Scaling can scale), and to suspend and resume scaling.
- put-scaling-policy to add or modify scaling policies for an existing scalable target.
- describe-scaling-activities to return information about scaling activities in an AWS Region.
- describe-scaling-policies to return information about scaling policies in an AWS Region.
- delete-scaling-policy to delete a scaling policy.
Considerations
The following considerations apply when working with step scaling policies:
- Consider whether you can predict the step adjustments on the application accurately enough to use step scaling. If your scaling metric increases or decreases proportionally to the capacity of the scalable target, we recommend that you use a target tracking scaling policy instead. You still have the option to use step scaling as an additional policy for a more advanced configuration. For example, you can configure a more aggressive response when utilization reaches a certain level.
- Make sure to choose an adequate margin between the scale-out and scale-in thresholds to prevent flapping. Flapping is an infinite loop of scaling in and scaling out. That is, if a scaling action is taken, the metric value would change and start another scaling action in the reverse direction.
Related resources
For information about creating step scaling policies for Auto Scaling groups, see Step and simple scaling policies for Amazon EC2 Auto Scaling in the Amazon EC2 Auto Scaling User Guide.
Console access
Console access to view, add, update, or remove step scaling policies on scalable resources depends on the resource that you use. For more information, see AWS services that you can use with Application Auto Scaling.