

# Control instance retention with instance lifecycle policies
<a name="instance-lifecycle-policy"></a>

 Instance lifecycle policies protect instances from being terminated by Amazon EC2 Auto Scaling when a termination lifecycle action is abandoned. With lifecycle hooks alone, an instance is still terminated after its hook is abandoned. With an instance lifecycle policy, the instance instead moves to a retained state when graceful shutdown procedures don't complete successfully. 

## When to use instance lifecycle policies
<a name="when-to-use-instance-lifecycle-policies"></a>

 Use instance lifecycle policies when graceful shutdown of your application is mandatory and a failed shutdown requires manual intervention. Common use cases include: 
+  Stateful applications that must complete data persistence before termination. 
+  Applications requiring extended draining periods that may exceed the maximum lifecycle hook timeout of 48 hours. 
+  Workloads handling sensitive data where failed or incomplete cleanup could result in data loss or corruption. 
+  Mission-critical services where an abrupt shutdown would affect availability. 

 For more information on how to gracefully handle instance termination, see [Design your applications to gracefully handle instance termination](gracefully-handle-instance-termination.md). 

## How instance lifecycle policies work with termination lifecycle hooks
<a name="how-instance-lifecycle-policies-work"></a>

 Instance lifecycle policies work in combination with termination lifecycle hooks, not as a replacement. The process follows several stages: 

1.  **Termination lifecycle actions execute.** When Amazon EC2 Auto Scaling selects an instance for termination, your termination lifecycle hooks are invoked and the instance enters the `Terminating:Wait` state to begin executing the termination lifecycle actions. 

1.  **Graceful shutdown attempt begins.** Your application, either running on the instance or via a control plane, receives the termination lifecycle action notification and begins graceful shutdown procedures such as draining connections, completing in-progress work, or transferring data. 

1.  **Termination lifecycle actions complete.** A termination lifecycle action can complete with a `CONTINUE` or `ABANDON` result. 

1.  **The instance lifecycle policy evaluates the situation.** Without an instance lifecycle policy configured, the instance proceeds to termination immediately, even if the termination lifecycle action completed with the `ABANDON` result. With an instance lifecycle policy configured to retain instances on `TerminateHookAbandon`, the instance moves to a retained state if the termination lifecycle action completed with the `ABANDON` result. 

1.  **Retained instances await manual action.** Instances in retained states continue to incur standard Amazon EC2 charges. These instances don't count toward your Auto Scaling group's desired capacity, so Auto Scaling launches replacement instances to maintain the desired size. Auto Scaling features such as instance refresh and max instance lifetime will also ignore retained instances. This allows you to complete cleanup procedures manually, recover data, or investigate why automated shutdown failed before manually terminating the instance. 

1.  **Manual termination occurs.** After you complete the necessary actions on the retained instance, you need to call the `TerminateInstanceInAutoScalingGroup` API to terminate the instance. 
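
The hand-off between the graceful shutdown attempt and the lifecycle action result can be sketched as a small shell function. This is only a sketch under stated assumptions: the drain command is supplied by the caller and is hypothetical, and the group, hook, and instance values are placeholders. The function reports `CONTINUE` when the drain command succeeds and `ABANDON` when it fails, using the `complete-lifecycle-action` CLI command.

```shell
#!/usr/bin/env bash
# Sketch: report the outcome of a graceful shutdown attempt to Auto Scaling.
# Assumptions (not from the original text): the drain command passed in by the
# caller is hypothetical, and the group and hook names are placeholders.

# usage: complete_termination_hook <asg-name> <hook-name> <instance-id> <drain-cmd> [args...]
complete_termination_hook() {
  local asg_name="$1" hook_name="$2" instance_id="$3"
  shift 3

  local result
  if "$@"; then
    result="CONTINUE"   # graceful shutdown finished; termination may proceed
  else
    result="ABANDON"    # shutdown failed; a retain policy keeps the instance
  fi

  aws autoscaling complete-lifecycle-action \
    --auto-scaling-group-name "$asg_name" \
    --lifecycle-hook-name "$hook_name" \
    --instance-id "$instance_id" \
    --lifecycle-action-result "$result"
}
```

For example, `complete_termination_hook my-asg my-termination-hook i-1234567890abcdef0 ./drain.sh` would run a hypothetical `./drain.sh` script and complete the lifecycle action based on its exit status.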

# Configure instance retention
<a name="configure-instance-retention"></a>

Set up your Amazon EC2 Auto Scaling group to retain instances when termination lifecycle actions fail.

 To use instance lifecycle policies in your Auto Scaling group, you must also configure a termination lifecycle hook. If you configure an instance lifecycle policy but don't have any termination lifecycle hooks, the policy has no effect. Instance lifecycle policies apply only when termination lifecycle actions are abandoned, not when they complete successfully with the `CONTINUE` result. 

 Instance lifecycle policies use retention triggers to determine when to retain an instance. The `TerminateHookAbandon` trigger causes retention in several scenarios: 
+  When you explicitly call the [ CompleteLifecycleAction ](https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_CompleteLifecycleAction.html) API with the `ABANDON` result. 
+  When a termination lifecycle action with default result `ABANDON` times out because the heartbeat timeout is reached without receiving a heartbeat. 
+  When a termination lifecycle action with default result `ABANDON` reaches the global timeout, which is 48 hours or 100 times the heartbeat timeout, whichever is smaller. 
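
As a worked example of the global timeout rule, the following sketch computes the effective limit in seconds for a given heartbeat timeout. The function name is illustrative and is not part of any AWS tool; it only applies the "48 hours or 100 times the heartbeat timeout, whichever is smaller" rule stated above.

```shell
# Sketch: effective global timeout for a lifecycle action, in seconds.
# Rule from the text: the smaller of 48 hours and 100 x the heartbeat timeout.
# The function name is illustrative, not part of any AWS tool.
global_timeout_seconds() {
  local heartbeat_timeout="$1"
  local cap=$((48 * 60 * 60))               # 48 hours = 172800 seconds
  local scaled=$((heartbeat_timeout * 100))
  if [ "$scaled" -lt "$cap" ]; then
    echo "$scaled"
  else
    echo "$cap"
  fi
}

global_timeout_seconds 300    # prints 30000 (about 8.3 hours)
global_timeout_seconds 3600   # prints 172800 (capped at 48 hours)
```

With a 300-second heartbeat timeout, an action that keeps receiving heartbeats would therefore reach the global timeout after about 8.3 hours rather than 48 hours.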

------
#### [ Console ]

**To configure instance retention**

1. Open the Amazon EC2 Auto Scaling console.

1. Create your Auto Scaling group. The instance lifecycle policy defaults to **Terminate**.

1. Go to your Auto Scaling group details page and choose the **Instance Management** tab.

1. In **Instance lifecycle policy for lifecycle hooks**, choose **Retain**.

1. Create your termination lifecycle hooks with:
   + Lifecycle transition set to **Instance terminate**.
   + Default result set to **Abandon**.

------
#### [ AWS CLI ]

**To configure instance retention**  
 Use the [create-auto-scaling-group](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/autoscaling/create-auto-scaling-group.html) command with an instance lifecycle policy: 

```
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name my-asg \
--launch-template LaunchTemplateName=my-template,Version='$Latest' \
--min-size 1 \
--max-size 3 \
--desired-capacity 2 \
--vpc-zone-identifier subnet-12345678 \
--instance-lifecycle-policy file://lifecycle-policy.json
```

Contents of `lifecycle-policy.json`:

```
{
    "RetentionTriggers": {
        "TerminateHookAbandon": "retain"
    }
}
```

**To add a termination lifecycle hook**  
Use the [put-lifecycle-hook](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/autoscaling/put-lifecycle-hook.html) command:

```
aws autoscaling put-lifecycle-hook \
--lifecycle-hook-name my-termination-hook \
--auto-scaling-group-name my-asg \
--lifecycle-transition autoscaling:EC2_INSTANCE_TERMINATING \
--default-result ABANDON \
--heartbeat-timeout 300
```

------

# Manage retained instances
<a name="manage-retained-instances"></a>

 Monitor and control Amazon EC2 instances that have been moved to a retained state. Use CloudWatch metrics to track retained instances, then manually terminate retained instances after completing your custom actions. 

 Retained instances do not count toward your Amazon EC2 Auto Scaling group's desired capacity. When an instance enters a retained state, Auto Scaling launches a replacement instance to maintain the desired capacity. For example, suppose your Auto Scaling group has a desired capacity of 10 and one instance enters the `Terminating:Retained` state. Auto Scaling launches a replacement, so you now have 11 running instances in total: 10 in your active group plus 1 retained instance. Standard Amazon EC2 charges apply to all 11 instances until you manually terminate the retained instance. 

## Instance lifecycle states of retained instances
<a name="instance-lifecyle-states-of-retained-instances"></a>

 Understand how instances transition through lifecycle states when instance lifecycle policies are used. Instances follow a specific path from normal termination through retention to final termination. 

*When retention is triggered, instances transition through these states:*

1. `Terminating` - Normal termination begins

1. `Terminating:Wait` - Lifecycle hook executes

1. `Terminating:Proceed` - Lifecycle actions wrap up (whether they succeeded or failed)

1. `Terminating:Retained` - Hook fails, instance retained for manual intervention

Warm pool instances take different lifecycle state paths depending on the scenario:

*Instances scaling back into the warm pool:*

1. `Warmed:Pending` - Normal warm pool transition begins

1. `Warmed:Pending:Wait` - Lifecycle hook executes

1. `Warmed:Pending:Proceed` - Lifecycle actions wrap up (whether they succeeded or failed)

1. `Warmed:Pending:Retained` - Hook fails, instance retained for manual intervention

*Instances being terminated from the warm pool:*

1. `Warmed:Terminating` - Normal termination begins

1. `Warmed:Terminating:Wait` - Lifecycle hook executes

1. `Warmed:Terminating:Proceed` - Lifecycle actions wrap up (whether they succeeded or failed)

1. `Warmed:Terminating:Retained` - Hook fails, instance retained for manual intervention
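
To find instances that are currently in one of the retained states listed above, a query along the following lines may help. This is a sketch that assumes the retained states are reported in the `LifecycleState` field exactly as named above (for example, `Terminating:Retained`); the function name is illustrative, and the query has not been verified against warm pool instances.

```shell
# Sketch: list instances in a group whose lifecycle state ends in ":Retained".
# Assumption: retained states are reported in the LifecycleState field exactly
# as named in the lists above (for example, Terminating:Retained).
list_retained_instances() {
  local asg_name="$1"
  aws autoscaling describe-auto-scaling-instances \
    --query "AutoScalingInstances[?AutoScalingGroupName=='${asg_name}' && ends_with(LifecycleState, ':Retained')].[InstanceId,LifecycleState]" \
    --output text
}
```

Calling `list_retained_instances my-asg` would print one line per matching instance with its instance ID and lifecycle state.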

## Monitor retained instances
<a name="monitor-retained-instances"></a>

 Because retained Amazon EC2 instances incur costs and require manual intervention, monitoring them is essential. Amazon EC2 Auto Scaling provides several CloudWatch metrics to track retained instances. 

Enable group metrics to track retained instances:

```
# --granularity is required; 1Minute is the only valid value
aws autoscaling enable-metrics-collection \
--auto-scaling-group-name my-asg \
--metrics GroupTerminatingRetainedInstances \
--granularity "1Minute"
```

The available metrics are:
+  `GroupTerminatingRetainedInstances` shows the number of instances in the `Terminating:Retained` state. 
+  `GroupTerminatingRetainedCapacity` shows the capacity units represented by instances in the `Terminating:Retained` state. 
+  `WarmPoolTerminatingRetainedCapacity` tracks retained instances terminating from the warm pool. 
+  `WarmPoolPendingRetainedCapacity` tracks retained instances returning to the warm pool. 

 You can also check your Amazon EC2 Auto Scaling group's scaling activities to understand why instances were retained. Look for termination activities with `StatusCode: Cancelled` and status reason messages indicating lifecycle hook failures: 

```
aws autoscaling describe-scaling-activities \
--auto-scaling-group-name my-asg
```

 We recommend creating CloudWatch alarms on these metrics to alert you when instances enter a retained state. This helps you track cost implications and ensures you don't forget to clean up instances that require manual intervention. 
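
As one possible way to set up such an alarm, the following sketch creates a CloudWatch alarm that fires whenever at least one instance is in the `Terminating:Retained` state. The alarm name and the SNS topic ARN argument are placeholders chosen for illustration, and the sketch assumes that group metrics collection is already enabled for `GroupTerminatingRetainedInstances`.

```shell
# Sketch: alarm when any instance is in the Terminating:Retained state.
# Assumptions: the alarm name and the SNS topic ARN argument are placeholders,
# and group metrics collection is already enabled for the group.
create_retained_instances_alarm() {
  local asg_name="$1" sns_topic_arn="$2"
  aws cloudwatch put-metric-alarm \
    --alarm-name "${asg_name}-retained-instances" \
    --namespace AWS/AutoScaling \
    --metric-name GroupTerminatingRetainedInstances \
    --dimensions "Name=AutoScalingGroupName,Value=${asg_name}" \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 0 \
    --comparison-operator GreaterThanThreshold \
    --treat-missing-data notBreaching \
    --alarm-actions "$sns_topic_arn"
}
```

Treating missing data as not breaching is a design choice in this sketch, so the alarm stays in the OK state when no data points are reported.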

## Terminate retained instances
<a name="terminate-retained-instances"></a>

After completing your custom actions, terminate your retained instances by calling the [ TerminateInstanceInAutoScalingGroup ](https://docs.aws.amazon.com/autoscaling/ec2/APIReference/API_TerminateInstanceInAutoScalingGroup.html) API: 

```
aws autoscaling terminate-instance-in-auto-scaling-group \
--instance-id i-1234567890abcdef0 \
--no-should-decrement-desired-capacity
```