Amazon ECS service throttle logic - Amazon Elastic Container Service

Amazon ECS service throttle logic

The Amazon ECS service scheduler includes logic that throttles how often service tasks are launched if they repeatedly fail to start.

If tasks for a service repeatedly fail to enter the RUNNING state (progressing directly from a PENDING to a STOPPED status), then the time between subsequent restart attempts is incrementally increased up to a maximum of 27 minutes. This maximum period is subject to change in the future. This behavior reduces the effect that failing tasks have on your Amazon ECS cluster resources or Fargate infrastructure costs. If your service initiates the throttle logic, you receive the following service event message:

(service service-name) is unable to consistently start tasks successfully.

Amazon ECS doesn't ever stop a failing service from retrying. It also doesn't attempt to modify it in any way other than increasing the time between restarts. The service throttle logic doesn't provide any user-tunable parameters.

If you update your service to use a new task definition, your service returns to a normal, non-throttled state immediately. For more information, see Updating an Amazon ECS service using the console.

The following are some common causes that initiate this logic. We recommend that you take manual action to address the issue:

Important

Tasks that are stopped after they reach the RUNNING state don't start the throttle logic or the associated service event message. For example, assume that failed Elastic Load Balancing health checks for a service cause a task to be flagged as unhealthy, and Amazon ECS deregisters it and stops the task. At this point, the tasks aren't throttled. Even if a task's container command immediately exits with a non-zero exit code, the task already moved to the RUNNING state. Tasks that fail immediately because command errors don't cause the throttle or the service event message.