Best practices for scaling plans - AWS Auto Scaling

Best practices for scaling plans

The following best practices can help you make the most of scaling plans:

  • When you create a launch template or launch configuration, enable detailed monitoring so that CloudWatch metric data for EC2 instances is available at one-minute intervals, which allows a faster response to load changes. By default, EC2 instances use basic monitoring, which makes instance metric data available at five-minute intervals. Scaling on five-minute metrics can result in a slower response time and in scaling on stale metric data. Detailed monitoring incurs an additional charge. For more information, see Configure monitoring for Auto Scaling instances in the Amazon EC2 Auto Scaling User Guide.

  • We also recommend that you enable Auto Scaling group metrics. Otherwise, actual capacity data is not shown in the capacity forecast graphs that are available on completion of the Create Scaling Plan wizard. For more information, see Monitoring CloudWatch metrics for your Auto Scaling groups and instances in the Amazon EC2 Auto Scaling User Guide.

  • Check which instance type your Auto Scaling group uses and be wary of using a burstable performance instance type. Amazon EC2 instances with burstable performance, such as T3 and T2 instances, are designed to provide a baseline level of CPU performance with the ability to burst to a higher level when required by your workload. Depending on the target utilization specified by the scaling plan, you could run the risk of exceeding the baseline and then running out of CPU credits, which limits performance. For more information, see CPU credits and baseline performance for burstable performance instances. To configure these instances as unlimited, see Using an Auto Scaling group to launch a burstable performance instance as Unlimited in the Amazon EC2 User Guide.
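As a rough illustration of the recommendations above, the following Python sketch builds launch template data with detailed monitoring turned on and, for burstable instance types, the unlimited credit option. The helper names (`launch_template_data`, `is_burstable`, `apply_settings`) and the prefix list are assumptions made for this example. The `apply_settings` function assumes that boto3 is installed and that credentials are configured, and it is not called here.

```python
# Instance-type prefixes treated as burstable in this sketch (illustrative list).
BURSTABLE_PREFIXES = ("t2.", "t3.", "t3a.", "t4g.")


def is_burstable(instance_type):
    """Return True if the instance type looks like a burstable performance type."""
    return instance_type.startswith(BURSTABLE_PREFIXES)


def launch_template_data(image_id, instance_type, unlimited_credits=False):
    """Build LaunchTemplateData with detailed (one-minute) monitoring enabled."""
    data = {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "Monitoring": {"Enabled": True},  # detailed monitoring
    }
    if unlimited_credits:
        # Lets burstable instances exceed the baseline instead of being throttled.
        data["CreditSpecification"] = {"CpuCredits": "unlimited"}
    return data


def apply_settings(template_name, group_name, data):
    """Create the launch template and enable Auto Scaling group metrics.

    Hypothetical wrapper: requires boto3 and AWS credentials.
    """
    import boto3

    boto3.client("ec2").create_launch_template(
        LaunchTemplateName=template_name, LaunchTemplateData=data
    )
    boto3.client("autoscaling").enable_metrics_collection(
        AutoScalingGroupName=group_name, Granularity="1Minute"
    )


# Example with placeholder values.
example = launch_template_data(
    "ami-12345678", "t3.micro", unlimited_credits=is_burstable("t3.micro")
)
```

Keeping the dictionary construction separate from the API calls makes it possible to review the settings before anything is created.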

Other considerations

Note

A more recent version of predictive scaling was released in May 2021. Some features introduced in that version are not available in scaling plans. To use them, you must set a predictive scaling policy directly on the Auto Scaling group. For more information, see Predictive scaling for Amazon EC2 Auto Scaling in the Amazon EC2 Auto Scaling User Guide.

Keep the following additional considerations in mind:

  • Predictive scaling uses load forecasts to schedule capacity in the future. The quality of the forecasts varies based on how cyclical the load is and the applicability of the trained forecasting model. Predictive scaling can be run in forecast only mode to assess the quality of the forecasts and the scaling actions created by the forecasts. You can set the predictive scaling mode to Forecast only when you create the scaling plan and then change it to Forecast and scale when you're finished assessing the forecast quality. For more information, see Predictive scaling settings and Monitoring and evaluating forecasts.

  • If you choose to specify different metrics for predictive scaling, you must ensure that the scaling metric and load metric are strongly correlated. The metric value must increase and decrease proportionally to the number of instances in the Auto Scaling group. This ensures that the metric data can be used to proportionally scale out or in the number of instances. For example, suppose that the load metric is total request count and the scaling metric is average CPU utilization. If the total request count increases by 50 percent, the average CPU utilization should also increase by 50 percent, provided that capacity remains unchanged.

  • Before creating your scaling plan, you should delete any previously scheduled scaling actions that you no longer need by accessing the consoles they were created from. AWS Auto Scaling does not create a predictive scaling action that overlaps an existing scheduled scaling action.

  • Your customized settings for minimum and maximum capacity, along with other settings used for dynamic scaling, show up in other consoles. However, we recommend that after you create a scaling plan, you do not modify these settings from other consoles because your scaling plan does not receive the updates from other consoles.

  • Your scaling plan can contain resources from multiple services, but each resource can be in only one scaling plan at a time.
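One way to check the correlation requirement described above is to compute the Pearson correlation between historical samples of the load metric and the scaling metric. The sketch below is a minimal example and not part of AWS Auto Scaling. The function names and the sample data are made up, and the samples are assumed to be taken at the same timestamps while capacity stays constant.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


def expected_scaling_metric(current_value, load_before, load_after):
    """Scaling metric expected after a load change if it tracks load proportionally."""
    return current_value * load_after / load_before


# Hypothetical samples: total request count and average CPU utilization (percent).
requests = [1000, 1500, 2000, 1200]
cpu = [20.0, 30.0, 40.0, 24.0]

r = pearson(requests, cpu)
# A 50 percent increase in requests (1000 to 1500) should raise 40% CPU to 60%.
projected = expected_scaling_metric(40.0, 1000, 1500)
```

A coefficient close to 1 suggests that the pair of metrics is a reasonable candidate. A low or negative value suggests choosing a different scaling metric or load metric.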

Avoiding the ActiveWithProblems error

An "ActiveWithProblems" error can occur when a scaling plan is created, or resources are added to a scaling plan. The error occurs when the scaling plan is active, but the scaling configuration for one or more resources could not be applied.

Usually, this happens because a resource already has a scaling policy or an Auto Scaling group does not meet the minimum requirements for predictive scaling.
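To find which resources are affected, you can inspect the per-resource status that the scaling plan reports. The sketch below filters a response shaped like the output of the DescribeScalingPlanResources API. The field names (`ScalingPlanResources`, `ScalingStatusCode`, `ScalingStatusMessage`) reflect our understanding of that API and should be checked against the API reference. The sample response is invented.

```python
def resources_with_problems(describe_response):
    """Return (resource ID, status code, message) for resources that are not fully active."""
    problems = []
    for resource in describe_response.get("ScalingPlanResources", []):
        if resource.get("ScalingStatusCode") != "Active":
            problems.append(
                (
                    resource.get("ResourceId"),
                    resource.get("ScalingStatusCode"),
                    resource.get("ScalingStatusMessage", ""),
                )
            )
    return problems


# Invented sample response for illustration only.
sample_response = {
    "ScalingPlanResources": [
        {"ResourceId": "autoScalingGroup/web", "ScalingStatusCode": "Active"},
        {
            "ResourceId": "autoScalingGroup/batch",
            "ScalingStatusCode": "PartiallyActive",
            "ScalingStatusMessage": "Predictive scaling could not be configured.",
        },
    ]
}

flagged = resources_with_problems(sample_response)
```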

If any of your resources already have scaling policies from various service consoles, AWS Auto Scaling does not overwrite these other scaling policies or create new ones by default. You can optionally delete the existing scaling policies and replace them with target tracking scaling policies created from the AWS Auto Scaling console. You do this by enabling the Replace external scaling policies setting for each resource that has scaling policies to overwrite.
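If you create scaling plans through the API or an SDK rather than the console, the Replace external scaling policies setting and the Forecast only mode described earlier map, as far as we can tell, to the `ScalingPolicyUpdateBehavior` and `PredictiveScalingMode` fields of a scaling instruction. The following sketch builds one scaling instruction for an Auto Scaling group. The function name, its parameters, and the choice of CPU-based predefined metrics are assumptions for this example. Verify the field names against the AWS Auto Scaling API reference before relying on them.

```python
def asg_scaling_instruction(
    group_name,
    min_capacity,
    max_capacity,
    target_cpu,
    forecast_only=True,
    replace_external_policies=False,
):
    """Build one scaling instruction for an Auto Scaling group (illustrative)."""
    return {
        "ServiceNamespace": "autoscaling",
        "ResourceId": f"autoScalingGroup/{group_name}",
        "ScalableDimension": "autoscaling:autoScalingGroup:DesiredCapacity",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
        # Dynamic scaling: target tracking on average CPU utilization.
        "TargetTrackingConfigurations": [
            {
                "PredefinedScalingMetricSpecification": {
                    "PredefinedScalingMetricType": "ASGAverageCPUUtilization"
                },
                "TargetValue": target_cpu,
            }
        ],
        # Predictive scaling: load metric used for the forecast.
        "PredefinedLoadMetricSpecification": {
            "PredefinedLoadMetricType": "ASGTotalCPUUtilization"
        },
        # Start in forecast-only mode, then switch after assessing the forecast.
        "PredictiveScalingMode": "ForecastOnly" if forecast_only else "ForecastAndScale",
        # Keep existing policies unless replacement is explicitly requested.
        "ScalingPolicyUpdateBehavior": (
            "ReplaceExternalPolicies"
            if replace_external_policies
            else "KeepExternalPolicies"
        ),
    }


# Example with placeholder values.
instruction = asg_scaling_instruction("web", 2, 10, 50.0)
```

The resulting dictionary is intended as one element of the `ScalingInstructions` list passed to the `create_scaling_plan` call of the boto3 `autoscaling-plans` client. Defaulting to forecast-only mode and to keeping external policies mirrors the cautious path described in this section.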

We recommend that you wait 24 hours after creating a new Auto Scaling group before you configure predictive scaling. At minimum, there must be 24 hours of historical data to generate the initial forecast. If the group has less than 24 hours of historical data when predictive scaling is enabled, the scaling plan can't generate a forecast until the next forecast period, after the group has collected the required amount of data. Alternatively, you can edit and save the scaling plan to restart the forecast process as soon as the 24 hours of data is available.
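The 24-hour rule can be expressed as a small check before predictive scaling is enabled. The sketch below uses the group's creation time as a stand-in for the start of its metric history, which is an assumption: a group that ran without instances for a while may have less usable data. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Minimum history needed for the initial forecast, per the guidance above.
REQUIRED_HISTORY = timedelta(hours=24)


def time_until_forecast_ready(group_created_at, now=None):
    """Return how much longer to wait before 24 hours of history exist.

    Both datetimes must be timezone-aware. Returns a zero timedelta once
    the group is at least 24 hours old.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    remaining = REQUIRED_HISTORY - (now - group_created_at)
    return max(remaining, timedelta(0))


# Example with fixed timestamps: a group created 18 hours ago.
created = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
wait = time_until_forecast_ready(
    created, now=datetime(2024, 1, 1, 18, 0, tzinfo=timezone.utc)
)
```

In this example, `wait` is six hours. The creation time of a real group is available as `CreatedTime` in the output of the Amazon EC2 Auto Scaling `describe_auto_scaling_groups` call.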