

# Amazon ECS blue/green deployments
<a name="deployment-type-blue-green"></a>

A blue/green deployment is a release methodology that reduces downtime and risk by running two identical production environments called blue and green. With Amazon ECS blue/green deployments, you can validate new service revisions before directing production traffic to them. This approach provides a safer way to deploy changes with the ability to quickly roll back if needed.

## Benefits
<a name="blue-green-deployment-benefits"></a>

The following are benefits of using blue/green deployments:
+ Reduces risk through testing before switching production traffic. You can validate the new deployment with test traffic before directing production traffic to it.
+ Zero downtime deployments. The production environment remains available throughout the deployment process, ensuring continuous service availability.
+ Easy rollback if issues are detected. If problems arise with the green deployment, you can quickly revert to the blue deployment without extended service disruption.
+ Controlled testing environment. The green environment provides an isolated space to test new features with real traffic patterns before full deployment.
+ Predictable deployment process. The structured approach with defined lifecycle stages makes deployments more consistent and reliable.
+ Automated validation through lifecycle hooks. You can implement automated tests at various stages of the deployment to verify functionality.

## Terminology
<a name="blue-green-deployment-terms"></a>

The following are Amazon ECS blue/green deployment terms:
+ Bake time - The duration when both blue and green service revisions are running simultaneously after the production traffic has shifted.
+ Blue deployment - The current production service revision that you want to replace.
+ Green deployment - The new service revision that you want to deploy.
+ Lifecycle stages - A series of events in the deployment operation, such as "after production traffic shift".
+ Lifecycle hook - A Lambda function that verifies the deployment at a specific lifecycle stage.
+ Listener - An Elastic Load Balancing resource that checks for connection requests using the protocol and port that you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets.
+ Rule - An Elastic Load Balancing resource associated with a listener. A rule defines how requests are routed and consists of an action, condition, and priority.
+ Target group - An Elastic Load Balancing resource used to route requests to one or more registered targets (for example, EC2 instances). When you create a listener, you specify a target group for its default action. Traffic is forwarded to the target group specified in the listener rule.
+ Traffic shift - The process Amazon ECS uses to shift traffic from the blue deployment to the green deployment. For Amazon ECS blue/green deployments, all traffic is shifted from the blue service to the green service at once.

## Considerations
<a name="blue-green-deployment-considerations"></a>

Consider the following when choosing a deployment type:
+ Resource usage: Blue/green deployments temporarily run both the blue and green service revisions simultaneously, which may double your resource usage during deployments.
+ Deployment monitoring: Blue/green deployments provide more detailed deployment status information, allowing you to monitor each stage of the deployment process.
+ Rollback: Blue/green deployments make it easier to roll back to the previous version if issues are detected, as the blue revision is kept running until the bake time expires.
+ Network Load Balancer lifecycle hooks: If you use a Network Load Balancer for blue/green deployments, Amazon ECS adds 10 minutes to the TEST\_TRAFFIC\_SHIFT and PRODUCTION\_TRAFFIC\_SHIFT lifecycle stages to make sure that it is safe to shift traffic.

# Amazon ECS blue/green service deployments workflow
<a name="blue-green-deployment-how-it-works"></a>

The Amazon ECS blue/green deployment process follows a structured approach with six distinct phases that ensure safe and reliable application updates. Each phase serves a specific purpose in validating and transitioning your application from the current version (blue) to the new version (green).

1. **Preparation Phase**: Create the green environment alongside the existing blue environment. This includes provisioning new service revisions and preparing target groups.

1. **Deployment Phase**: Deploy the new service revision to the green environment. Amazon ECS launches new tasks using the updated service revision while the blue environment continues serving production traffic.

1. **Testing Phase**: Validate the green environment using test traffic routing. The Application Load Balancer directs test requests to the green environment while production traffic remains on blue.

1. **Traffic Shifting Phase**: Shift production traffic from blue to green based on your configured deployment strategy. This phase includes monitoring and validation checkpoints.

1. **Monitoring Phase**: Monitor application health, performance metrics, and alarm states during the bake time period. A rollback operation is initiated when issues are detected.

1. **Completion Phase**: Finalize the deployment by terminating the blue environment or maintaining it for potential rollback scenarios, depending on your configuration.

## Workflow
<a name="blue-green-deployment-workflow"></a>

The following diagram illustrates the blue/green deployment workflow, showing the interaction between Amazon ECS and the Application Load Balancer:

![\[Comprehensive diagram showing the blue/green deployment process in Amazon ECS with detailed component interactions, traffic shifting phases, and monitoring checkpoints\]](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/images/blue-green.png)


The enhanced deployment workflow includes the following detailed steps:

1. **Initial State**: The blue service (current production) handles 100% of production traffic. The Application Load Balancer has a single listener with rules that route all requests to the blue target group containing healthy blue tasks.

1. **Green Environment Provisioning**: Amazon ECS creates new tasks using the updated task definition. These tasks are registered with a new green target group but receive no traffic initially.

1. **Health Check Validation**: The Application Load Balancer performs health checks on green tasks. Only when green tasks pass health checks does the deployment proceed to the next phase.

1. **Test Traffic Routing**: If configured, the Application Load Balancer's listener rules route specific traffic patterns (such as requests with test headers) to the green environment for validation while production traffic remains on blue. This is controlled by the same listener that handles production traffic, using different rules based on request attributes.

1. **Production Traffic Shift**: Based on the deployment configuration, traffic shifts from blue to green. In ECS blue/green deployments, this is an immediate (all-at-once) shift where 100% of the traffic is moved from the blue to the green environment. The Application Load Balancer uses a single listener with listener rules that control traffic distribution between the blue and green target groups based on weights.

1. **Monitoring and Validation**: Throughout the traffic shift, Amazon ECS monitors CloudWatch metrics, alarm states, and deployment health. Automatic rollback triggers activate if issues are detected.

1. **Bake Time Period**: The duration when both blue and green service revisions are running simultaneously after the production traffic has shifted.

1. **Blue Environment Termination**: After successful traffic shift and validation, the blue environment is terminated to free up cluster resources, or maintained for rapid rollback capability.

1. **Final State**: The green environment becomes the new production environment, handling 100% of traffic. The deployment is marked as successful.

## Deployment lifecycle stages
<a name="blue-green-deployment-stages"></a>

The blue/green deployment process progresses through distinct lifecycle stages (a series of events in the deployment operation, such as "after production traffic shift"), each with specific responsibilities and validation checkpoints. Understanding these stages helps you monitor deployment progress and troubleshoot issues effectively.

Each lifecycle stage can last up to 24 hours. We recommend keeping each stage below the 24-hour mark, because asynchronous processes need time to trigger the hooks. When a stage reaches 24 hours, the system times out, fails the deployment, and then initiates a rollback. CloudFormation deployments have an additional timeout restriction: while the 24-hour stage limit remains in effect, CloudFormation enforces a 36-hour limit on the entire deployment. If the process doesn't complete within 36 hours, CloudFormation fails the deployment and then initiates a rollback.
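The two limits above can be checked against a planned schedule before you start a deployment. The following is a minimal sketch, not an AWS API: the function name `find_timeout_risks` and the idea of a per-stage "planned hours" dictionary are assumptions made for illustration. Only the 24-hour and 36-hour values come from the text above.

```python
STAGE_LIMIT_HOURS = 24           # per-stage limit stated above
CLOUDFORMATION_LIMIT_HOURS = 36  # whole-deployment limit when CloudFormation is used


def find_timeout_risks(planned_hours, via_cloudformation=False):
    """Return warnings for a plan shaped like {stage_name: planned_hours}.

    Hypothetical helper: it only compares your own estimates with the
    documented limits and does not call any AWS service.
    """
    warnings = []
    for stage, hours in planned_hours.items():
        if hours >= STAGE_LIMIT_HOURS:
            warnings.append(
                f"{stage}: {hours}h reaches the {STAGE_LIMIT_HOURS}h stage limit"
            )
    total = sum(planned_hours.values())
    if via_cloudformation and total >= CLOUDFORMATION_LIMIT_HOURS:
        warnings.append(
            f"total: {total}h reaches the {CLOUDFORMATION_LIMIT_HOURS}h CloudFormation limit"
        )
    return warnings


# Example plan: each stage is under 24 hours, but the sum is 38 hours.
plan = {"POST_TEST_TRAFFIC_SHIFT": 20, "POST_PRODUCTION_TRAFFIC_SHIFT": 18}
print(find_timeout_risks(plan))
print(find_timeout_risks(plan, via_cloudformation=True))
```

In this example the plan passes the per-stage check but is flagged once the CloudFormation limit is taken into account.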


| Lifecycle stages | Description | Use this stage for lifecycle hook? | 
| --- | --- | --- | 
| RECONCILE\_SERVICE | This stage only happens when you start a new service deployment with more than 1 service revision in an ACTIVE state. | Yes | 
| PRE\_SCALE\_UP | The green service revision has not started. The blue service revision is handling 100% of the production traffic. There is no test traffic. | Yes | 
| SCALE\_UP | The time when the green service revision scales up to 100% and launches new tasks. The green service revision is not serving any traffic at this point. | No | 
| POST\_SCALE\_UP | The green service revision has started. The blue service revision is handling 100% of the production traffic. There is no test traffic. | Yes | 
| TEST\_TRAFFIC\_SHIFT | The blue and green service revisions are running. The blue service revision handles 100% of the production traffic. The green service revision is migrating from 0% to 100% of test traffic. | Yes | 
| POST\_TEST\_TRAFFIC\_SHIFT | The test traffic shift is complete. The green service revision handles 100% of the test traffic. | Yes | 
| PRODUCTION\_TRAFFIC\_SHIFT | Production traffic is shifting to the green service revision. The green service revision is migrating from 0% to 100% of production traffic. | Yes | 
| POST\_PRODUCTION\_TRAFFIC\_SHIFT | The production traffic shift is complete. | Yes | 
| BAKE\_TIME | The duration when both blue and green service revisions are running simultaneously. | No | 
| CLEAN\_UP | The blue service revision has completely scaled down to 0 running tasks. The green service revision is now the production service revision after this stage. | No | 

Each lifecycle stage includes built-in validation checkpoints that must pass before proceeding to the next stage. If any validation fails, the deployment can be automatically rolled back to maintain service availability and reliability.
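The table above can be encoded as data so that a hook configuration is checked before it is submitted. This is an illustrative sketch only: the helper name `check_hook_stages` is hypothetical, and the stage names and hook eligibility are copied from the table.

```python
# (stage name, can a lifecycle hook be attached?) in table order
LIFECYCLE_STAGES = [
    ("RECONCILE_SERVICE", True),
    ("PRE_SCALE_UP", True),
    ("SCALE_UP", False),
    ("POST_SCALE_UP", True),
    ("TEST_TRAFFIC_SHIFT", True),
    ("POST_TEST_TRAFFIC_SHIFT", True),
    ("PRODUCTION_TRAFFIC_SHIFT", True),
    ("POST_PRODUCTION_TRAFFIC_SHIFT", True),
    ("BAKE_TIME", False),
    ("CLEAN_UP", False),
]

KNOWN_STAGES = {name for name, _ in LIFECYCLE_STAGES}
HOOKABLE_STAGES = {name for name, hookable in LIFECYCLE_STAGES if hookable}


def check_hook_stages(requested):
    """Split requested stage names into (accepted, rejected).

    `rejected` maps each refused stage name to a short reason.
    """
    accepted, rejected = [], {}
    for stage in requested:
        if stage not in KNOWN_STAGES:
            rejected[stage] = "unknown stage"
        elif stage not in HOOKABLE_STAGES:
            rejected[stage] = "stage does not support lifecycle hooks"
        else:
            accepted.append(stage)
    return accepted, rejected


accepted, rejected = check_hook_stages(["POST_TEST_TRAFFIC_SHIFT", "BAKE_TIME"])
print(accepted)
print(rejected)
```

Keeping the stages in one ordered list also makes it easy to report how far a deployment has progressed.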

When you use a Lambda function as a lifecycle hook, the function must complete the work or return IN\_PROGRESS within 15 minutes. You can use `callBackDelaySeconds` to delay the call to Lambda. For more information, see [app.py function](https://github.com/aws-samples/sample-amazon-ecs-blue-green-deployment-patterns/blob/main/ecs-bluegreen-lifecycle-hooks/src/approvalFunction/app.py#L20-L25) in the sample-amazon-ecs-blue-green-deployment-patterns on GitHub.
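A lifecycle hook function that follows this pattern might look like the sketch below. Treat the details as assumptions rather than the documented contract: the `hookStatus` response key, the `SUCCEEDED` and `FAILED` values, the 60-second delay, and the `approved` event field read by the hypothetical `green_is_ready` helper are all illustrative. Only `IN_PROGRESS` and `callBackDelaySeconds` come from the text above; check the linked sample for the exact response shape.

```python
def green_is_ready(event):
    """Hypothetical validation step.

    Replace with real checks against the green service revision.
    The "approved" field is an assumption for this sketch, not part
    of the event that Amazon ECS sends.
    """
    return bool(event.get("approved", False))


def lambda_handler(event, context=None):
    """Answer quickly: finish the check, or ask to be called again later."""
    try:
        if green_is_ready(event):
            # Assumed success value; lets the deployment proceed.
            return {"hookStatus": "SUCCEEDED"}
    except Exception:
        # Assumed failure value; a failed hook fails the deployment.
        return {"hookStatus": "FAILED"}
    # Not ready yet: report IN_PROGRESS and request a delayed callback
    # instead of blocking until the 15-minute limit.
    return {"hookStatus": "IN_PROGRESS", "callBackDelaySeconds": 60}


print(lambda_handler({"approved": True}))
print(lambda_handler({}))
```

Returning early with a callback delay keeps each invocation short, which avoids running into the 15-minute limit while a long validation is still pending.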

# Required resources for Amazon ECS blue/green deployments
<a name="blue-green-deployment-implementation"></a>

To use a blue/green deployment with managed traffic shifting, your service must use one of the following features:
+ Elastic Load Balancing
+ Service Connect

Services that don't use Service Discovery, Service Connect, VPC Lattice or Elastic Load Balancing can also use blue/green deployments, but don't get any of the managed traffic shifting benefits.

The following list provides a high-level overview of what you need to configure for Amazon ECS blue/green deployments:
+ Your service uses an Application Load Balancer, Network Load Balancer, or Service Connect. Configure the appropriate resources.
  + Application Load Balancer - For more information, see [Application Load Balancer resources for blue/green, linear, and canary deployments](alb-resources-for-blue-green.md).
  + Network Load Balancer - For more information, see [Network Load Balancer resources for Amazon ECS blue/green, linear and canary deployments](nlb-resources-for-blue-green.md).
  + Service Connect - For more information, see [Service Connect resources for Amazon ECS blue/green, linear, and canary deployments](service-connect-blue-green.md).
+ Set the service deployment controller to `ECS`.
+ Configure the deployment strategy as blue/green (`BLUE_GREEN`) in your service definition.
+ Optionally, configure additional parameters such as:
  + Bake time for the new deployment
  + CloudWatch alarms for automatic rollback
  + Deployment lifecycle hooks for testing (these are Lambda functions that run at specified deployment stages)
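
The settings in the list above can be assembled into a service definition fragment. The following sketch is hedged: the helper `build_service_fragment` is hypothetical, and the field names `alarms`, `alarmNames`, `enable`, `rollback`, `lifecycleHooks`, `hookTargetArn`, and `lifecycleStages` are assumptions about the Amazon ECS API shape that you should confirm against the API reference. The ARNs are placeholders.

```python
import json


def build_service_fragment(bake_minutes=None, alarm_names=None, hooks=None):
    """Build the deployment-related part of a service definition.

    `hooks` is an iterable of (lambda_arn, role_arn, stages) tuples.
    Optional settings are omitted when they are not provided.
    """
    config = {"strategy": "BLUE_GREEN"}
    if bake_minutes is not None:
        config["bakeTimeInMinutes"] = bake_minutes
    if alarm_names:
        # Assumed shape for CloudWatch alarm based rollback.
        config["alarms"] = {
            "alarmNames": list(alarm_names),
            "enable": True,
            "rollback": True,
        }
    if hooks:
        # Assumed shape for lifecycle hooks.
        config["lifecycleHooks"] = [
            {"hookTargetArn": arn, "roleArn": role, "lifecycleStages": list(stages)}
            for arn, role, stages in hooks
        ]
    return {
        "deploymentController": {"type": "ECS"},
        "deploymentConfiguration": config,
    }


fragment = build_service_fragment(
    bake_minutes=5,
    alarm_names=["my-service-5xx-alarm"],  # placeholder alarm name
    hooks=[(
        "arn:aws:lambda:region:123456789012:function:my-hook",  # placeholder
        "arn:aws:iam::123456789012:role/my-hook-role",          # placeholder
        ["POST_TEST_TRAFFIC_SHIFT"],
    )],
)
print(json.dumps(fragment, indent=4))
```

The printed JSON can be merged with the load balancer or Service Connect settings for your service.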

## Best practices
<a name="blue-green-deployment-best-practices"></a>

Follow these best practices for successful Amazon ECS blue/green deployments:
+ Configure appropriate health checks that accurately reflect your application's health.
+ Set a bake time that allows sufficient testing of the green deployment.
+ Implement CloudWatch alarms to automatically detect issues and trigger rollbacks.
+ Use lifecycle hooks to perform automated testing at each deployment stage.
+ Ensure your application can handle both blue and green service revisions running simultaneously.
+ Plan for sufficient cluster capacity to handle both service revisions during deployment.
+ Test your rollback procedures before implementing them in production.

# Application Load Balancer resources for blue/green, linear, and canary deployments
<a name="alb-resources-for-blue-green"></a>

To use Application Load Balancers with Amazon ECS blue/green deployments, you need to configure specific resources that allow traffic routing between the blue and green service revisions. 

## Target groups
<a name="alb-target-groups"></a>

For blue/green deployments with Elastic Load Balancing, you need to create two target groups:
+ A primary target group for the blue service revision (current production traffic)
+ An alternate target group for the green service revision (new version)

Both target groups should be configured with the following settings:
+ Target type: `ip` (for Fargate or EC2 with `awsvpc` network mode)
+ Protocol: `HTTP` (or the protocol your application uses)
+ Port: The port your application listens on (typically `80` for HTTP)
+ VPC: The same VPC as your Amazon ECS tasks
+ Health check settings: Configured to properly check your application's health

During a blue/green deployment, Amazon ECS automatically registers tasks with the appropriate target group based on the deployment stage.

**Example Creating target groups for an Application Load Balancer**  
The following CLI commands create two target groups for use with an Application Load Balancer in a blue/green deployment:  

```
aws elbv2 create-target-group \
    --name blue-target-group \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-abcd1234 \
    --target-type ip \
    --health-check-path / \
    --health-check-protocol HTTP \
    --health-check-interval-seconds 30 \
    --health-check-timeout-seconds 5 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 2

aws elbv2 create-target-group \
    --name green-target-group \
    --protocol HTTP \
    --port 80 \
    --vpc-id vpc-abcd1234 \
    --target-type ip \
    --health-check-path / \
    --health-check-protocol HTTP \
    --health-check-interval-seconds 30 \
    --health-check-timeout-seconds 5 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 2
```

## Application Load Balancer
<a name="alb-load-balancer"></a>

You need to create an Application Load Balancer with the following configuration:
+ Scheme: Internet-facing or internal, depending on your requirements
+ IP address type: IPv4
+ VPC: The same VPC as your Amazon ECS tasks
+ Subnets: At least two subnets in different Availability Zones
+ Security groups: A security group that allows traffic on the listener ports

The security group attached to the Application Load Balancer must have an outbound rule that allows traffic to the security group attached to your Amazon ECS tasks.

**Example Creating an Application Load Balancer**  
The following CLI command creates an Application Load Balancer for use in a blue/green deployment:  

```
aws elbv2 create-load-balancer \
    --name my-application-load-balancer \
    --type application \
    --security-groups sg-abcd1234 \
    --subnets subnet-12345678 subnet-87654321
```

## Listeners and rules
<a name="alb-listeners"></a>

For blue/green deployments, you need to configure listeners on your Application Load Balancer:
+ Production listener: Handles production traffic (typically on port 80 or 443)
  + Initially forwards traffic to the primary target group (blue service revision)
  + After deployment, forwards traffic to the alternate target group (green service revision)
+ Test listener (optional): Handles test traffic to validate the green service revision before shifting production traffic
  + Can be configured on a different port (for example, 8080 or 8443)
  + Forwards traffic to the alternate target group (green service revision) during testing

During a blue/green deployment, Amazon ECS automatically updates the listener rules to route traffic to the appropriate target group based on the deployment stage.

**Example Creating a production listener**  
The following CLI command creates a production listener on port 80 that forwards traffic to the primary (blue) target group:  

```
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:region:123456789012:loadbalancer/app/my-application-load-balancer/abcdef123456 \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/blue-target-group/abcdef123456
```

**Example Creating a test listener**  
The following CLI command creates a test listener on port 8080 that forwards traffic to the alternate (green) target group:  

```
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:region:123456789012:loadbalancer/app/my-application-load-balancer/abcdef123456 \
    --protocol HTTP \
    --port 8080 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/ghijkl789012
```

**Example Creating a listener rule for path-based routing**  
The following CLI command creates a rule that forwards traffic for a specific path to the green target group for testing:  

```
aws elbv2 create-rule \
    --listener-arn arn:aws:elasticloadbalancing:region:123456789012:listener/app/my-application-load-balancer/abcdef123456/ghijkl789012 \
    --priority 10 \
    --conditions Field=path-pattern,Values='/test/*' \
    --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/ghijkl789012
```

**Example Creating a listener rule for header-based routing**  
The following CLI command creates a rule that forwards traffic with a specific header to the green target group for testing:  

```
aws elbv2 create-rule \
    --listener-arn arn:aws:elasticloadbalancing:region:123456789012:listener/app/my-application-load-balancer/abcdef123456/ghijkl789012 \
    --priority 20 \
    --conditions 'Field=http-header,HttpHeaderConfig={HttpHeaderName=X-Environment,Values=[test]}' \
    --actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/ghijkl789012
```
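
If you create the header-based rule from code instead of the CLI, the same request can be expressed as keyword arguments. This is a sketch under assumptions: the helper `header_rule_kwargs` is hypothetical, and the dictionary is shaped like the Elastic Load Balancing `CreateRule` request so that it could be passed to the boto3 `elbv2` client's `create_rule` method. The block itself makes no AWS call.

```python
def header_rule_kwargs(listener_arn, target_group_arn, header_name,
                       header_values, priority):
    """Build arguments for a rule that forwards requests carrying a header.

    Hypothetical helper; it only builds a dictionary.
    """
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,
        "Conditions": [
            {
                "Field": "http-header",
                "HttpHeaderConfig": {
                    "HttpHeaderName": header_name,
                    "Values": list(header_values),
                },
            }
        ],
        "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


kwargs = header_rule_kwargs(
    listener_arn="arn:aws:elasticloadbalancing:region:123456789012:listener/app/my-application-load-balancer/abcdef123456/ghijkl789012",
    target_group_arn="arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/ghijkl789012",
    header_name="X-Environment",
    header_values=["test"],
    priority=20,
)
print(kwargs["Conditions"])
# With boto3 installed and credentials configured, one possible use is:
#   boto3.client("elbv2").create_rule(**kwargs)
```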

## Service configuration
<a name="alb-service-configuration"></a>

You must have permissions to allow Amazon ECS to manage load balancer resources in your clusters on your behalf. For more information, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md). 

When creating or updating an Amazon ECS service for blue/green deployments with Elastic Load Balancing, you need to specify the following configuration.

Replace the *user-input* with your values.

The key components in this configuration are:
+ `targetGroupArn`: The ARN of the primary target group (blue service revision).
+ `alternateTargetGroupArn`: The ARN of the alternate target group (green service revision).
+ `productionListenerRule`: The ARN of the listener rule for production traffic.
+ `roleArn`: The ARN of the role that allows Amazon ECS to manage Elastic Load Balancing resources.
+ `strategy`: Set to `BLUE_GREEN` to enable blue/green deployments.
+ `bakeTimeInMinutes`: The duration when both blue and green service revisions are running simultaneously after the production traffic has shifted.
+ `testListenerRule`: The ARN of the listener rule for test traffic. This is an optional parameter.

```
{
    "loadBalancers": [
        {
            "targetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/primary-target-group/abcdef123456",
            "containerName": "container-name",
            "containerPort": 80,
            "advancedConfiguration": {
                "alternateTargetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/alternate-target-group/ghijkl789012",
                "productionListenerRule": "arn:aws:elasticloadbalancing:region:123456789012:listener-rule/app/my-application-load-balancer/abcdef123456/ghijkl789012/mnopqr345678",
                "roleArn": "arn:aws:iam::123456789012:role/ecs-elb-role"
            }
        }
    ],
    "deploymentConfiguration": {
        "strategy": "BLUE_GREEN",
        "maximumPercent": 200,
        "minimumHealthyPercent": 100,
        "bakeTimeInMinutes": 5
    }
}
```
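
Before you submit a configuration like the preceding one, a quick local check can catch missing fields. The sketch below is an assumption-laden convenience, not an AWS validation: the helper `blue_green_problems` is hypothetical, and it only checks for the key components listed in this section.

```python
REQUIRED_ADVANCED_KEYS = (
    "alternateTargetGroupArn",
    "productionListenerRule",
    "roleArn",
)


def blue_green_problems(service_definition):
    """Return a list of problems found in a service definition dictionary.

    An empty list means the fields named in this section are present.
    It does not prove that Amazon ECS will accept the configuration.
    """
    problems = []
    strategy = service_definition.get("deploymentConfiguration", {}).get("strategy")
    if strategy != "BLUE_GREEN":
        problems.append("deploymentConfiguration.strategy must be BLUE_GREEN")
    load_balancers = service_definition.get("loadBalancers", [])
    if not load_balancers:
        problems.append("loadBalancers is empty")
    for index, entry in enumerate(load_balancers):
        if "targetGroupArn" not in entry:
            problems.append(f"loadBalancers[{index}].targetGroupArn is missing")
        advanced = entry.get("advancedConfiguration", {})
        for key in REQUIRED_ADVANCED_KEYS:
            if key not in advanced:
                problems.append(
                    f"loadBalancers[{index}].advancedConfiguration.{key} is missing"
                )
    return problems


example = {
    "loadBalancers": [{
        "targetGroupArn": "primary-target-group-arn",  # placeholder
        "advancedConfiguration": {
            "alternateTargetGroupArn": "alternate-target-group-arn",  # placeholder
            "roleArn": "role-arn",  # placeholder
        },
    }],
    "deploymentConfiguration": {"strategy": "BLUE_GREEN"},
}
print(blue_green_problems(example))
```

In this example the check reports that `productionListenerRule` is missing.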

## Traffic flow during deployment
<a name="alb-traffic-flow"></a>

During a blue/green deployment with Elastic Load Balancing, traffic flows through the system as follows:

1. *Initial state*: All production traffic is routed to the primary target group (blue service revision).

1. *Green service revision deployment*: Amazon ECS deploys the new tasks and registers them with the alternate target group.

1. *Test traffic*: If a test listener is configured, test traffic is routed to the alternate target group to validate the green service revision.

1. *Production traffic shift*: Amazon ECS updates the production listener rule to route traffic to the alternate target group (green service revision).

1. *Bake time*: The duration when both blue and green service revisions are running simultaneously after the production traffic has shifted.

1. *Completion*: After a successful deployment, the blue service revision is terminated.

If issues are detected during the deployment, Amazon ECS can automatically roll back by routing traffic back to the primary target group (blue service revision).
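
The traffic flow above can be summarized as a small lookup. This is a simplified model for reasoning about the steps, not how Amazon ECS is implemented: the step names and the `describe_step` function are hypothetical labels for the six numbered steps, and a rollback is modeled only as production traffic returning to blue.

```python
STEPS = [
    "initial",
    "green_deployment",
    "test_traffic",
    "production_shift",
    "bake_time",
    "completion",
]


def describe_step(step, rolled_back=False):
    """Return which revision serves production traffic and which are running."""
    index = STEPS.index(step)  # raises ValueError for an unknown step
    shifted = index >= STEPS.index("production_shift") and not rolled_back
    if step == "initial":
        running = ["blue"]
    elif step == "completion" and not rolled_back:
        running = ["green"]  # the blue service revision has been terminated
    else:
        running = ["blue", "green"]
    return {"production": "green" if shifted else "blue", "running": running}


for step in STEPS:
    print(step, describe_step(step))
print("bake_time after rollback", describe_step("bake_time", rolled_back=True))
```

The model makes the key property visible: blue keeps running until completion, which is what makes routing traffic back to it possible.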

# Network Load Balancer resources for Amazon ECS blue/green, linear and canary deployments
<a name="nlb-resources-for-blue-green"></a>

To use a Network Load Balancer with Amazon ECS blue/green deployments, you need to configure specific resources that enable traffic routing between the blue and green service revisions. This section explains the required components and their configuration.

When your configuration includes a Network Load Balancer, Amazon ECS adds a 10-minute delay to the following lifecycle stages:
+ TEST\_TRAFFIC\_SHIFT
+ PRODUCTION\_TRAFFIC\_SHIFT

This delay accounts for Network Load Balancer timing issues that can cause a mismatch between the configured traffic weights and the actual traffic routing in the data plane. 

## Target groups
<a name="nlb-target-groups"></a>

For blue/green deployments with a Network Load Balancer, you need to create two target groups:
+ A primary target group for the blue service revision (current production traffic)
+ An alternate target group for the green service revision (new service revision)

Both target groups should be configured with the following settings:
+ Target type: `ip` (for Fargate or EC2 with `awsvpc` network mode)
+ Protocol: `TCP` (or the protocol your application uses)
+ Port: The port your application listens on (typically `80` for HTTP)
+ VPC: The same VPC as your Amazon ECS tasks
+ Health check settings: Configured to properly check your application's health

  For TCP health checks, the Network Load Balancer establishes a TCP connection with the target. If the connection is successful, the target is considered healthy.

  For HTTP/HTTPS health checks, the Network Load Balancer sends an HTTP/HTTPS request to the target and verifies the response.

During a blue/green deployment, Amazon ECS automatically registers tasks with the appropriate target group based on the deployment stage.

**Example Creating target groups for a Network Load Balancer**  
The following AWS CLI commands create two target groups for use with a Network Load Balancer in a blue/green deployment:  

```
aws elbv2 create-target-group \
    --name blue-target-group \
    --protocol TCP \
    --port 80 \
    --vpc-id vpc-abcd1234 \
    --target-type ip \
    --health-check-protocol TCP

aws elbv2 create-target-group \
    --name green-target-group \
    --protocol TCP \
    --port 80 \
    --vpc-id vpc-abcd1234 \
    --target-type ip \
    --health-check-protocol TCP
```

## Network Load Balancer
<a name="nlb-load-balancer"></a>

You need to create a Network Load Balancer with the following configuration:
+ Scheme: Internet-facing or internal, depending on your requirements
+ IP address type: IPv4
+ VPC: The same VPC as your Amazon ECS tasks
+ Subnets: At least two subnets in different Availability Zones

Unlike Application Load Balancers, Network Load Balancers operate at the transport layer (Layer 4), and associating security groups with the load balancer itself is optional. In either case, you need to ensure that the security groups associated with your Amazon ECS tasks allow traffic from the Network Load Balancer on the listener ports.

**Example Creating a Network Load Balancer**  
The following AWS CLI command creates a Network Load Balancer for use in a blue/green deployment:  

```
aws elbv2 create-load-balancer \
    --name my-network-load-balancer \
    --type network \
    --subnets subnet-12345678 subnet-87654321
```

## Considerations for using NLB with blue/green deployments
<a name="nlb-considerations"></a>

When using a Network Load Balancer for blue/green deployments, consider the following:
+ **Layer 4 operation**: Network Load Balancers operate at the transport layer (Layer 4) and do not inspect application layer (Layer 7) content. This means you cannot use HTTP headers or paths for routing decisions.
+ **Health checks**: Network Load Balancer health checks are limited to TCP, HTTP, or HTTPS protocols. For TCP health checks, the Network Load Balancer only verifies that the connection can be established.
+ **Connection preservation**: Network Load Balancers preserve the source IP address of the client, which can be useful for security and logging purposes.
+ **Static IP addresses**: Network Load Balancers provide static IP addresses for each subnet, which can be useful for allowlisting or when clients need to connect to a fixed IP address.
+ **Test traffic**: Since Network Load Balancers do not support content-based routing, test traffic must be sent to a different port than production traffic.

## Listeners and rules
<a name="nlb-listeners"></a>

For blue/green deployments with a Network Load Balancer, you need to configure listeners:
+ Production listener: Handles production traffic (typically on port 80 or 443)
  + Initially forwards traffic to the primary target group (blue service revision)
  + After deployment, forwards traffic to the alternate target group (green service revision)
+ Test listener (optional): Handles test traffic to validate the green service revision before shifting production traffic
  + Can be configured on a different port (for example, 8080 or 8443)
  + Forwards traffic to the alternate target group (green service revision) during testing

Unlike Application Load Balancers, Network Load Balancers do not support content-based routing rules. Instead, traffic is routed based on the listener port and protocol.

The following AWS CLI commands create production and test listeners for a Network Load Balancer:

Replace the *user-input* with your values.

```
aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:region:123456789012:loadbalancer/net/my-network-lb/1234567890123456 \
    --protocol TCP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/blue-target-group/1234567890123456

aws elbv2 create-listener \
    --load-balancer-arn arn:aws:elasticloadbalancing:region:123456789012:loadbalancer/net/my-network-lb/1234567890123456 \
    --protocol TCP \
    --port 8080 \
    --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/1234567890123456
```

## Service configuration
<a name="nlb-service-configuration"></a>

You must have permissions to allow Amazon ECS to manage load balancer resources in your clusters on your behalf. For more information, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md). 

When creating or updating an Amazon ECS service for blue/green deployments with a Network Load Balancer, you need to specify the following configuration:

Replace the *user-input* with your values.

The key components in this configuration are:
+ `targetGroupArn`: The ARN of the primary target group (blue service revision)
+ `alternateTargetGroupArn`: The ARN of the alternate target group (green service revision)
+ `productionListenerRule`: The ARN of the listener for production traffic
+ `testListenerRule`: (Optional) The ARN of the listener for test traffic
+ `roleArn`: The ARN of the role that allows Amazon ECS to manage Network Load Balancer resources
+ `strategy`: Set to `BLUE_GREEN` to enable blue/green deployments
+ `bakeTimeInMinutes`: The duration when both blue and green service revisions are running simultaneously after the production traffic has shifted

```
{
    "loadBalancers": [
        {
            "targetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/blue-target-group/1234567890123456",
            "containerName": "container-name",
            "containerPort": 80,
            "advancedConfiguration": {
                "alternateTargetGroupArn": "arn:aws:elasticloadbalancing:region:123456789012:targetgroup/green-target-group/1234567890123456",
                "productionListenerRule": "arn:aws:elasticloadbalancing:region:123456789012:listener/net/my-network-lb/1234567890123456/1234567890123456",
                "testListenerRule": "arn:aws:elasticloadbalancing:region:123456789012:listener/net/my-network-lb/1234567890123456/2345678901234567",
                "roleArn": "arn:aws:iam::123456789012:role/ecs-nlb-role"
            }
        }
    ],
    "deploymentConfiguration": {
        "strategy": "BLUE_GREEN",
        "maximumPercent": 200,
        "minimumHealthyPercent": 100,
        "bakeTimeInMinutes": 5
    }
}
```

## Traffic flow during deployment
<a name="nlb-traffic-flow"></a>

During a blue/green deployment with a Network Load Balancer, traffic flows through the system as follows:

1. *Initial state*: All production traffic is routed to the primary target group (blue service revision).

1. *Green service revision deployment*: Amazon ECS deploys the new tasks and registers them with the alternate target group.

1. *Test traffic*: If a test listener is configured, test traffic is routed to the alternate target group to validate the green service revision.

1. *Production traffic shift*: Amazon ECS updates the production listener to route traffic to the alternate target group (green service revision).

1. *Bake time*: Both the blue and green service revisions run simultaneously for the configured bake time after production traffic has shifted.

1. *Completion*: After a successful deployment, the blue service revision is terminated.

If issues are detected during the deployment, Amazon ECS can automatically roll back by routing traffic back to the primary target group (blue service revision).
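
The traffic flow above can be summarized as a small model. The following Python sketch is only an illustration of the sequence described in this section. The step names and the `production_target` function are hypothetical, and the model ignores timing and health checks.

```python
# Ordered steps from the traffic flow described above (names are illustrative).
STEPS = [
    "initial_state",
    "green_deployment",
    "test_traffic",
    "production_traffic_shift",
    "bake_time",
    "completion",
]


def production_target(step: str, rolled_back: bool = False) -> str:
    """Return which target group serves production traffic at a given step."""
    if step not in STEPS:
        raise ValueError(f"unknown step: {step}")
    # Production traffic moves to the alternate target group at the shift step.
    shifted = STEPS.index(step) >= STEPS.index("production_traffic_shift")
    # A rollback routes production traffic back to the primary target group.
    if shifted and not rolled_back:
        return "alternate (green)"
    return "primary (blue)"


if __name__ == "__main__":
    for name in STEPS:
        print(f"{name}: {production_target(name)}")
```

Test traffic is not modeled here. When a test listener is configured, it reaches the alternate target group before the production shift, while production traffic stays on the primary target group.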

# Service Connect resources for Amazon ECS blue/green, linear, and canary deployments
<a name="service-connect-blue-green"></a>

When using Service Connect with blue/green deployments, you need to configure specific components to enable proper traffic routing between the blue and green service revisions. This section explains the required components and their configuration.

## Architecture overview
<a name="service-connect-blue-green-architecture"></a>

Service Connect builds both service discovery and service mesh capabilities through a managed sidecar proxy that's automatically injected into your Amazon ECS tasks. These proxies handle routing decisions, retries, and metrics collection, while AWS Cloud Map provides the service registry backend. When you deploy a service with Service Connect enabled, the service registers itself in AWS Cloud Map, and client services discover it through the namespace.

In a standard Service Connect implementation, client services connect to logical service names, and the sidecar proxy handles routing to the actual service instances. With blue/green deployments, this model is extended to include test traffic routing through the `testTrafficRules` configuration.

During a blue/green deployment, the following key components work together:
+ **Service Connect Proxy**: All traffic between services passes through the Service Connect proxy, which makes routing decisions based on the configuration.
+ **AWS Cloud Map Registration**: Both blue and green deployments register with AWS Cloud Map, but the green deployment initially registers as a "test" endpoint.
+ **Test Traffic Routing**: The `testTrafficRules` in the Service Connect configuration determine how to identify and route test traffic to the green deployment. This is accomplished through **header-based routing**, where specific HTTP headers in the requests direct traffic to the test revision. By default, Service Connect recognizes the `x-amzn-ecs-blue-green-test` header for HTTP-based protocols when no custom rules are specified.
+ **Client Configuration**: All clients in the namespace automatically receive both production and test routes, but only requests matching test rules will go to the green deployment.

What makes this approach powerful is that it handles the complexity of service discovery during transitions. As traffic shifts from the blue to green deployment, all connectivity and discovery mechanisms update automatically. There's no need to update DNS records, reconfigure load balancers, or deploy service discovery changes separately since the service mesh handles it all.

## Traffic routing and testing
<a name="service-connect-blue-green-traffic-routing"></a>

Service Connect provides advanced traffic routing capabilities for blue/green deployments, including header-based routing and client alias configuration for testing scenarios.

### Test traffic header rules
<a name="service-connect-test-traffic-header-rules"></a>

During blue/green deployments, you can configure test traffic header rules to route specific requests to the green (new) service revision for testing purposes. This allows you to validate the new version with controlled traffic before completing the deployment.

Service Connect uses **header-based routing** to identify test traffic. By default, Service Connect recognizes the `x-amzn-ecs-blue-green-test` header for HTTP-based protocols when no custom rules are specified. When this header is present in a request, the Service Connect proxy automatically routes the request to the green deployment for testing.

Test traffic header rules enable you to:
+ Route requests with specific headers to the green service revision
+ Test new functionality with a subset of traffic
+ Validate service behavior before full traffic cutover
+ Implement canary testing strategies
+ Perform integration testing in a production-like environment

The header-based routing mechanism works seamlessly with your existing application architecture. Client services don't need to be aware of the blue/green deployment process - they simply include the appropriate headers when sending test requests, and the Service Connect proxy handles the routing logic automatically.

For more information about configuring test traffic header rules, see [ServiceConnectTestTrafficHeaderRules](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ServiceConnectTestTrafficHeaderRules.html) in the *Amazon Elastic Container Service API Reference*.
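
To reason about which requests reach the green revision, it can help to model the routing decision. The following Python sketch is a conceptual model only, not the Service Connect proxy's implementation. The function `choose_revision` and its parameters are hypothetical. The sketch assumes that a rule without a value matches on header presence, and it compares header names case-insensitively because HTTP header names are case-insensitive.

```python
from typing import Mapping, Optional

# Default test header named in this section.
DEFAULT_TEST_HEADER = "x-amzn-ecs-blue-green-test"


def choose_revision(
    headers: Mapping[str, str],
    header_name: str = DEFAULT_TEST_HEADER,
    exact_value: Optional[str] = None,
) -> str:
    """Return "green" for requests that match the test rule, otherwise "blue"."""
    # Normalize header names so that the lookup is case-insensitive.
    normalized = {name.lower(): value for name, value in headers.items()}
    value = normalized.get(header_name.lower())
    if value is None:
        return "blue"  # No test header, so this is production traffic.
    if exact_value is not None and value != exact_value:
        return "blue"  # Header is present but the value doesn't match the rule.
    return "green"
```

For example, `choose_revision({"X-Amzn-Ecs-Blue-Green-Test": "1"})` selects the green revision in this model, and a request without the header stays on the blue revision.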

### Header matching rules
<a name="service-connect-header-match-rules"></a>

Header matching rules define the criteria for routing test traffic during blue/green deployments. You can configure multiple matching conditions to precisely control which requests are routed to the green service revision.

Header matching supports:
+ Exact header value matching
+ Header presence checking
+ Pattern-based matching
+ Multiple header combinations

Example use cases include routing requests with specific user agent strings, API versions, or feature flags to the green service for testing.

For more information about header matching configuration, see [ServiceConnectTestTrafficHeaderMatchRules](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ServiceConnectTestTrafficHeaderMatchRules.html) in the *Amazon Elastic Container Service API Reference*.

### Client aliases for blue/green deployments
<a name="service-connect-client-alias-blue-green"></a>

Client aliases provide stable DNS endpoints for services during blue/green deployments. They enable seamless traffic routing between blue and green service revisions without requiring client applications to change their connection endpoints.

During a blue/green deployment, client aliases:
+ Maintain consistent DNS names for client connections
+ Enable automatic traffic switching between service revisions
+ Support gradual traffic migration strategies
+ Provide rollback capabilities by redirecting traffic to the blue revision

You can configure multiple client aliases for different ports or protocols, allowing complex service architectures to maintain connectivity during deployments.

For more information about client alias configuration, see [ServiceConnectClientAlias](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_ServiceConnectClientAlias.html) in the *Amazon Elastic Container Service API Reference*.

### Best practices for traffic routing
<a name="service-connect-blue-green-best-practices"></a>

When implementing traffic routing for blue/green deployments with Service Connect, consider the following best practices:
+ **Start with header-based testing**: Use test traffic header rules to validate the green service with controlled traffic before switching all traffic.
+ **Configure health checks**: Ensure both blue and green services have appropriate health checks configured to prevent routing traffic to unhealthy instances.
+ **Monitor service metrics**: Track key performance indicators for both service revisions during the deployment to identify issues early.
+ **Plan rollback strategy**: Configure client aliases and routing rules to enable quick rollback to the blue service if issues are detected.
+ **Test header matching logic**: Validate your header matching rules in a non-production environment before applying them to production deployments.

## Service Connect blue/green deployment workflow
<a name="service-connect-blue-green-workflow"></a>

Understanding how Service Connect manages the blue/green deployment process helps you implement and troubleshoot your deployments effectively. The following workflow shows how the different components interact during each phase of the deployment.

### Deployment phases
<a name="service-connect-deployment-phases"></a>

A Service Connect blue/green deployment progresses through several distinct phases:

1. **Initial State**: The blue service handles 100% of production traffic. All client services in the namespace connect to the blue service through the logical service name configured in Service Connect.

1. **Green Service Registration**: When the green deployment starts, it registers with AWS Cloud Map as a "test" endpoint. The Service Connect proxy in client services automatically receives both production and test route configurations.

1. **Test Traffic Routing**: Requests containing the test traffic headers (such as `x-amzn-ecs-blue-green-test`) are automatically routed to the green service by the Service Connect proxy. Production traffic continues to flow to the blue service.

1. **Traffic Shift Preparation**: After successful testing, the deployment process prepares for production traffic shift. Both blue and green services remain registered and healthy.

1. **Production Traffic Shift**: The Service Connect configuration updates to route production traffic to the green service. This happens automatically without requiring client service updates or DNS changes.

1. **Bake Time Period**: Both the blue and green service revisions run simultaneously for the configured bake time after production traffic has shifted.

1. **Blue Service Deregistration**: After successful traffic shift and validation, the blue service is deregistered from AWS Cloud Map and terminated, completing the deployment.

### Service Connect proxy behavior
<a name="service-connect-proxy-behavior"></a>

The Service Connect proxy plays a crucial role in managing traffic during blue/green deployments. Understanding its behavior helps you design effective testing and deployment strategies.

Key proxy behaviors during blue/green deployments:
+ **Automatic Route Discovery**: The proxy automatically discovers both production and test routes from AWS Cloud Map without requiring application restarts or configuration changes.
+ **Header-Based Routing**: The proxy examines incoming request headers and routes traffic to the appropriate service revision based on the configured test traffic rules.
+ **Health Check Integration**: The proxy only routes traffic to healthy service instances, automatically excluding unhealthy tasks from the routing pool.
+ **Retry and Circuit Breaking**: The proxy provides built-in retry logic and circuit breaking capabilities, improving resilience during deployments.
+ **Metrics Collection**: The proxy collects detailed metrics for both blue and green services, enabling comprehensive monitoring during deployments.

### Service discovery updates
<a name="service-connect-service-discovery-updates"></a>

One of the key advantages of using Service Connect for blue/green deployments is the automatic handling of service discovery updates. Traditional blue/green deployments often require complex DNS updates or load balancer reconfiguration, but Service Connect manages these changes transparently.

During a deployment, Service Connect handles:
+ **Namespace Updates**: The Service Connect namespace automatically includes both blue and green service endpoints, with appropriate routing rules.
+ **Client Configuration**: All client services in the namespace automatically receive updated routing information without requiring restarts or redeployment.
+ **Gradual Transition**: Service discovery updates happen gradually and safely, ensuring no disruption to ongoing requests.
+ **Rollback Support**: If a rollback is needed, Service Connect can quickly revert service discovery configurations to route traffic back to the blue service.

# Creating an Amazon ECS blue/green deployment
<a name="deploy-blue-green-service"></a>

By using Amazon ECS blue/green deployments, you can make and test service changes before implementing them in a production environment.

## Prerequisites
<a name="deploy-blue-green-service-prerequisites"></a>

Perform the following operations before you start a blue/green deployment. 

1. Configure the appropriate permissions.
   + For information about Elastic Load Balancing permissions, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md).
   + For information about Lambda permissions, see [Permissions required for Lambda functions in Amazon ECS blue/green deployments](blue-green-permissions.md).

1. Amazon ECS blue/green deployments require that your service use one of the following features. Configure the appropriate resources for the feature that you use.
   + Application Load Balancer - For more information, see [Application Load Balancer resources for blue/green, linear, and canary deployments](alb-resources-for-blue-green.md).
   + Network Load Balancer - For more information, see [Network Load Balancer resources for Amazon ECS blue/green, linear and canary deployments](nlb-resources-for-blue-green.md).
   + Service Connect - For more information, see [Service Connect resources for Amazon ECS blue/green, linear, and canary deployments](service-connect-blue-green.md).

1. Decide whether to run Lambda functions at any of the following lifecycle stages:
   + PRE_SCALE_UP
   + POST_SCALE_UP
   + TEST_TRAFFIC_SHIFT
   + POST_TEST_TRAFFIC_SHIFT
   + PRODUCTION_TRAFFIC_SHIFT
   + POST_PRODUCTION_TRAFFIC_SHIFT

   For more information, see [Create a Lambda function with the console](https://docs.aws.amazon.com/lambda/latest/dg/getting-started.html#getting-started-create-function) in the *AWS Lambda Developer Guide*.

## Procedure
<a name="deploy-blue-green-service-procedure"></a>

You can use the console or the AWS CLI to create an Amazon ECS blue/green service.

------
#### [ Console ]

1. Open the console at [https://console.aws.amazon.com/ecs/v2](https://console.aws.amazon.com/ecs/v2).

1. Determine the resource from where you launch the service.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/deploy-blue-green-service.html)

   The **Create service** page displays.

1. Under **Service details**, do the following:

   1. For **Task definition family**, choose the task definition to use. Then, for **Task definition revision**, enter the revision to use.

   1. For **Service name**, enter a name for your service.

1. To run the service in an existing cluster, for **Existing cluster**, choose the cluster. To run the service in a new cluster, choose **Create cluster**.

1. Choose how your tasks are distributed across your cluster infrastructure. Under **Compute configuration**, choose your option.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/deploy-blue-green-service.html)

1. Under **Deployment configuration**, do the following:

   1. For **Service type**, choose **Replica**.

   1. For **Desired tasks**, enter the number of tasks to launch and maintain in the service.

   1. To have Amazon ECS monitor the distribution of tasks across Availability Zones, and redistribute them when there is an imbalance, under **Availability Zone service rebalancing**, select **Availability Zone service rebalancing**.

   1. For **Health check grace period**, enter the amount of time (in seconds) that the service scheduler ignores unhealthy Elastic Load Balancing, VPC Lattice, and container health checks after a task has first started. If you do not specify a health check grace period value, the default value of 0 is used.

1. Configure the blue/green deployment settings:

   1. For **Bake time**, enter the number of minutes that both the blue and green service revisions will run simultaneously before the blue revision is terminated. This allows time for verification and testing.

   1. (Optional) Configure Lambda functions to run at specific stages of the deployment. Under **Deployment lifecycle hooks**, select the stages to run the lifecycle hooks.

      To add a lifecycle hook:

      1. Choose **Add**.

      1. For **Lambda function**, enter the function name or ARN.

      1. For **Role**, select the IAM role that has permission to invoke the Lambda function.

      1. For **Lifecycle stages**, select the stages when the Lambda function should run.

1. To configure how Amazon ECS detects and handles deployment failures, expand **Deployment failure detection**, and then choose your options. 

   1. To stop a deployment when the tasks cannot start, select **Use the Amazon ECS deployment circuit breaker**.

      To have the software automatically roll back the deployment to the last completed deployment state when the deployment circuit breaker sets the deployment to a failed state, select **Rollback on failures**.

   1. To stop a deployment based on application metrics, select **Use CloudWatch alarm(s)**. Then, from **CloudWatch alarm name**, choose the alarms. To create a new alarm, go to the CloudWatch console.

      To have the software automatically roll back the deployment to the last completed deployment state when a CloudWatch alarm sets the deployment to a failed state, select **Rollback on failures**.

1. (Optional) To interconnect your service using Service Connect, expand **Service Connect**, and then specify the following:

   1. Select **Turn on Service Connect**.

   1. Under **Service Connect configuration**, specify the client mode.
      + If your service runs a network client application that only needs to connect to other services in the namespace, choose **Client side only**.
      + If your service runs a network or web service application and needs to provide endpoints for this service, and connects to other services in the namespace, choose **Client and server**.

   1. To use a namespace that is not the default cluster namespace, for **Namespace**, choose the service namespace. This can be a namespace created separately in the same AWS Region in your AWS account or a namespace in the same Region that is shared with your account using AWS Resource Access Manager (AWS RAM). For more information about shared AWS Cloud Map namespaces, see [Cross-account AWS Cloud Map namespace sharing](https://docs.aws.amazon.com/cloud-map/latest/dg/sharing-namespaces.html) in the *AWS Cloud Map Developer Guide*.

   1. (Optional) Configure test traffic header rules for blue/green deployments. Under **Test traffic routing**, specify the following:

      1. Select **Enable test traffic header rules** to route specific requests to the green service revision during testing.

      1. For **Header matching rules**, configure the criteria for routing test traffic:
         + **Header name**: Enter the name of the HTTP header to match (for example, `X-Test-Version` or `User-Agent`).
         + **Match type**: Choose the matching criteria:
           + **Exact match**: Route requests where the header value exactly matches the specified value
           + **Header present**: Route requests that contain the specified header, regardless of value
           + **Pattern match**: Route requests where the header value matches a specified pattern
         + **Header value** (if using exact match or pattern match): Enter the value or pattern to match against.

         You can add multiple header matching rules to create complex routing logic. Requests matching any of the configured rules will be routed to the green service revision for testing.

      1. Choose **Add header rule** to configure additional header matching conditions.
**Note**  
Test traffic header rules enable you to validate new functionality with controlled traffic before completing the full deployment. This allows you to test the green service revision with specific requests (such as those from internal testing tools or beta users) while maintaining normal traffic flow to the blue service revision.

   1. (Optional) Specify a log configuration. Select **Use log collection**. The default option sends container logs to CloudWatch Logs. The other log driver options are configured using AWS FireLens. For more information, see [Send Amazon ECS logs to an AWS service or AWS Partner](using_firelens.md).

      The following describes each container log destination in more detail.
      + **Amazon CloudWatch** – Configure the task to send container logs to CloudWatch Logs. The default log driver options are provided, which create a CloudWatch log group on your behalf. To specify a different log group name, change the driver option values.
      + **Amazon Data Firehose** – Configure the task to send container logs to Firehose. The default log driver options are provided, which send logs to a Firehose delivery stream. To specify a different delivery stream name, change the driver option values.
      + **Amazon Kinesis Data Streams** – Configure the task to send container logs to Kinesis Data Streams. The default log driver options are provided, which send logs to a Kinesis Data Streams stream. To specify a different stream name, change the driver option values.
      + **Amazon OpenSearch Service** – Configure the task to send container logs to an OpenSearch Service domain. The log driver options must be provided. 
      + **Amazon S3** – Configure the task to send container logs to an Amazon S3 bucket. The default log driver options are provided, but you must specify a valid Amazon S3 bucket name.

1. (Optional) Configure **Load balancing** for blue/green deployment.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/AmazonECS/latest/developerguide/deploy-blue-green-service.html)

1. (Optional) To help identify your service and tasks, expand the **Tags** section, and then configure your tags.

   To have Amazon ECS automatically tag all newly launched tasks with the cluster name and the task definition tags, select **Turn on Amazon ECS managed tags**, and then for **Propagate tags from**, choose **Task definitions**.

   To have Amazon ECS automatically tag all newly launched tasks with the cluster name and the service tags, select **Turn on Amazon ECS managed tags**, and then for **Propagate tags from**, choose **Service**.

   Add or remove a tag.
   + [Add a tag] Choose **Add tag**, and then do the following:
     + For **Key**, enter the key name.
     + For **Value**, enter the key value.
   + [Remove a tag] Next to the tag, choose **Remove tag**.

1. Choose **Create**.

------
#### [ AWS CLI ]

1. Create a file named `service-definition.json` with the following content.

   Replace the *user-input* with your values.

   ```
   {
     "serviceName": "myBlueGreenService",
     "cluster": "arn:aws:ecs:us-west-2:123456789012:cluster/sample-fargate-cluster",
     "taskDefinition": "sample-fargate:1",
     "desiredCount": 5,
     "launchType": "FARGATE",
     "networkConfiguration": {
       "awsvpcConfiguration": {
         "subnets": [
           "subnet-09ce6e74c116a2299",
           "subnet-00bb3bd7a73526788",
           "subnet-0048a611aaec65477"
         ],
         "securityGroups": [
           "sg-09d45005497daa123"
         ],
         "assignPublicIp": "ENABLED"
       }
     },
     "deploymentController": {
       "type": "ECS"
     },
     "deploymentConfiguration": {
       "strategy": "BLUE_GREEN",
       "maximumPercent": 200,
       "minimumHealthyPercent": 100,
       "bakeTimeInMinutes": 2,
       "alarms": {
         "alarmNames": [
           "myAlarm"
         ],
         "rollback": true,
         "enable": true
       },
       "lifecycleHooks": [
         {
           "hookTargetArn": "arn:aws:lambda:us-west-2:123456789012:function:checkExample",
           "roleArn": "arn:aws:iam::123456789012:role/ECSLifecycleHookInvoke",
           "lifecycleStages": [
             "PRE_SCALE_UP"
           ]
         }
       ]
     },
     "loadBalancers": [
       {
         "targetGroupArn": "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/blue-target-group/54402ff563af1197",
         "containerName": "fargate-app",
         "containerPort": 80,
         "advancedConfiguration": {
           "alternateTargetGroupArn": "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/green-target-group/cad10a56f5843199",
           "productionListenerRule": "arn:aws:elasticloadbalancing:us-west-2:123456789012:listener-rule/app/my-blue-green-demo/32e0e4f946c3c05b/9cfa8c482e204f7d/831dbaf72edb911",
           "roleArn": "arn:aws:iam::123456789012:role/LoadBalancerManagementforECS"
         }
       }
     ]
   }
   ```

1. Run `create-service`.

   Replace the *user-input* with your values.

   ```
   aws ecs create-service --cli-input-json file://service-definition.json
   ```

   Alternatively, you can use the following example which creates a blue/green deployment service with a load balancer configuration:

   ```
   aws ecs create-service \
      --cluster "arn:aws:ecs:us-west-2:123456789012:cluster/MyCluster" \
      --service-name "blue-green-example-service" \
      --task-definition "nginxServer:1" \
      --launch-type "FARGATE" \
      --network-configuration "awsvpcConfiguration={subnets=[subnet-12345,subnet-67890,subnet-abcdef,subnet-fedcba],securityGroups=[sg-12345],assignPublicIp=ENABLED}" \
      --desired-count 3 \
      --deployment-controller "type=ECS" \
      --deployment-configuration "strategy=BLUE_GREEN,maximumPercent=200,minimumHealthyPercent=100,bakeTimeInMinutes=0" \
      --load-balancers "targetGroupArn=arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/MyBGtg1/abcdef1234567890,containerName=nginx,containerPort=80,advancedConfiguration={alternateTargetGroupArn=arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/MyBGtg2/0987654321fedcba,productionListenerRule=arn:aws:elasticloadbalancing:us-west-2:123456789012:listener-rule/app/MyLB/1234567890abcdef/1234567890abcdef,roleArn=arn:aws:iam::123456789012:role/ELBManagementRole}"
   ```

------
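
Before you run `create-service`, you might want to catch simple mistakes in `service-definition.json` locally. The following Python sketch shows one way to do that. The function names are hypothetical, and the checks are deliberately limited to things that this topic states or that are true of ARNs in general: the strategy is `BLUE_GREEN`, the blue and green target groups differ, and every ARN that contains an account ID uses a 12-digit ID. It isn't a substitute for the validation that Amazon ECS performs.

```python
import re

# Captures the account ID field of an ARN (arn:partition:service:region:account-id:resource).
ARN_PATTERN = re.compile(r"^arn:[^:]+:[^:]+:[^:]*:([^:]*):")


def iter_strings(value):
    """Yield every string value nested inside dictionaries and lists."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_strings(item)


def find_problems(service_definition: dict) -> list:
    """Return a list of human-readable problems found in the definition."""
    problems = []
    strategy = service_definition.get("deploymentConfiguration", {}).get("strategy")
    if strategy != "BLUE_GREEN":
        problems.append(f"strategy is {strategy!r}, expected 'BLUE_GREEN'")
    for lb in service_definition.get("loadBalancers", []):
        alternate = lb.get("advancedConfiguration", {}).get("alternateTargetGroupArn")
        if alternate is not None and alternate == lb.get("targetGroupArn"):
            problems.append("targetGroupArn and alternateTargetGroupArn are identical")
    for text in iter_strings(service_definition):
        match = ARN_PATTERN.match(text)
        # Some ARNs have an empty account field, so only check non-empty values.
        if match and match.group(1) and not re.fullmatch(r"\d{12}", match.group(1)):
            problems.append(f"account ID is not 12 digits in ARN: {text}")
    return problems
```

To use the sketch, load the file with `json.load` and pass the resulting dictionary to `find_problems`. An empty list means that none of these basic checks failed.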

## Next steps
<a name="deploy-blue-green-service-next-steps"></a>
+ Update the service to start the deployment. For more information, see [Updating an Amazon ECS service](update-service-console-v2.md).
+ Monitor the deployment process to ensure it follows the blue/green pattern:
  + The green service revision is created and scaled up
  + Test traffic is routed to the green revision (if configured)
  + Production traffic is shifted to the green revision
  + After the bake time, the blue revision is terminated
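
If you monitor deployments from a script, you can summarize the output of `DescribeServiceDeployments`. The following Python sketch works on a response that has already been parsed into a dictionary, so it doesn't call AWS itself. The `summarize_deployments` function is hypothetical, and the set of terminal status values is an assumption based on the `ServiceDeployment` data type. Confirm the values in the *Amazon Elastic Container Service API Reference*.

```python
# Assumed terminal statuses for a service deployment; verify against the API reference.
TERMINAL_STATUSES = {"SUCCESSFUL", "STOPPED", "ROLLBACK_SUCCESSFUL", "ROLLBACK_FAILED"}


def summarize_deployments(response: dict) -> list:
    """Return one summary line per service deployment in the response."""
    lines = []
    for deployment in response.get("serviceDeployments", []):
        status = deployment.get("status", "UNKNOWN")
        state = "finished" if status in TERMINAL_STATUSES else "in progress"
        line = f"{status} ({state})"
        # statusReason explains failures and rollbacks when it is present.
        reason = deployment.get("statusReason")
        if reason:
            line += f": {reason}"
        lines.append(line)
    return lines
```

A deployment that rolled back reports the cause in `statusReason`, which is the field to check first when a deployment does not complete.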

# Troubleshooting Amazon ECS blue/green deployments
<a name="troubleshooting-blue-green"></a>

The following provides solutions for common issues that you might encounter when using blue/green deployments with Amazon ECS. Blue/green deployment errors can occur in the following paths:
+ *Synchronous path*: Errors that appear immediately in response to `CreateService` or `UpdateService` API calls.
+ *Asynchronous path*: Errors that appear in the `statusReason` field of `DescribeServiceDeployments` and cause a deployment rollback.

**Tip**  
You can use the [Amazon ECS MCP server](ecs-mcp-introduction.md) with AI assistants to monitor deployments and troubleshoot deployment issues using natural language.

## Load balancer configuration issues
<a name="troubleshooting-blue-green-load-balancer"></a>

Load balancer configuration is a critical component of blue/green deployments in Amazon ECS. Proper configuration of listener rules, target groups, and load balancer types is essential for successful deployments. This section covers common load balancer configuration issues that can cause blue/green deployments to fail.

When troubleshooting load balancer issues, it's important to understand the relationship between listener rules and target groups. In a blue/green deployment:
+ The production listener rule directs traffic to the currently active (blue) service revision
+ The test listener rule can be used to validate the new (green) service revision before shifting production traffic
+ Target groups are used to register the container instances from each service revision
+ During deployment, traffic is gradually shifted from the blue service revision to the green service revision by adjusting the weights of the target groups in the listener rules

### Listener rule configuration errors
<a name="troubleshooting-blue-green-listener-rules"></a>

The following issues relate to incorrect listener rule configuration for blue/green deployments.

Using an Application Load Balancer listener ARN instead of a listener rule ARN  
*Error message*: `productionListenerRule has an invalid ARN format. Must be RuleArn for ALB or ListenerArn for NLB. Got: arn:aws:elasticloadbalancing:us-west-2:123456789012:listener/app/my-alb/abc123/def456`  
*Solution*: When using an Application Load Balancer, you must specify a listener rule ARN for `productionListenerRule` and `testListenerRule`, not a listener ARN. For Network Load Balancers, you must use the listener ARN.  
For information about how to find the listener ARN, see [Listeners for your Application Load Balancers](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-https-listener.html) in the *Application Load Balancer User Guide*. The ARN for a rule has the format `arn:aws:elasticloadbalancing:region:account-id:listener-rule/app/...`.

Using the same rule for both production and test listeners  
*Error message*: `The following rules cannot be used as both production and test listener rules: arn:aws:elasticloadbalancing:us-west-2:123456789012:listener-rule/app/my-alb/abc123/def456/ghi789`  
*Solution*: You must use different listener rules for production and test traffic. Create a separate listener rule for test traffic that routes to your test target group.

Target group not associated with listener rules  
*Error message*: `Service deployment rolled back because of invalid networking configuration: Target group arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/myAlternateTG/abc123 is not associated with either productionListenerRule or testListenerRule.`  
*Solution*: Both the primary target group and alternate target group must be associated with either the production listener rule or the test listener rule. Update your load balancer configuration to ensure both target groups are properly associated with your listener rules.

Missing test listener rule with an Application Load Balancer  
*Error message*: `For Application LoadBalancer, testListenerRule is required when productionListenerRule is not associated with both targetGroup and alternateTargetGroup`  
*Solution*: When you use an Application Load Balancer, if both target groups are not associated with the production listener rule, you must specify a test listener rule. Add a `testListenerRule` to your configuration and ensure both target groups are associated with either the production or test listener rule. For more information, see [Listeners for your Application Load Balancers](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/create-https-listener.html) in the *Application Load Balancer User Guide*.
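
You can check the ARN type before you submit a service definition. The following Python sketch applies the rule described in this section: an Application Load Balancer needs a listener rule ARN and a Network Load Balancer needs a listener ARN. The function names are hypothetical, and the sketch inspects only the resource portion of the ARN string.

```python
def classify_elb_arn(arn: str) -> tuple:
    """Return (resource_kind, load_balancer_type), for example ("listener-rule", "app")."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "elasticloadbalancing":
        raise ValueError(f"not an Elastic Load Balancing ARN: {arn}")
    segments = parts[5].split("/")
    if len(segments) < 2:
        raise ValueError(f"unexpected resource format: {parts[5]}")
    # segments[0] is "listener" or "listener-rule"; segments[1] is "app" or "net".
    return segments[0], segments[1]


def is_valid_listener_rule_value(arn: str) -> bool:
    """Check whether the ARN suits productionListenerRule or testListenerRule."""
    kind, lb_type = classify_elb_arn(arn)
    expected_kind = {"app": "listener-rule", "net": "listener"}
    return expected_kind.get(lb_type) == kind
```

With the ARN from the error message above, `is_valid_listener_rule_value` returns `False` because the ARN identifies an Application Load Balancer listener rather than a listener rule.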

### Target group configuration errors
<a name="troubleshooting-blue-green-target-groups"></a>

The following issues relate to incorrect target group configuration for blue/green deployments.

Multiple target groups with traffic in listener rule  
*Error message*: `Service deployment rolled back because of invalid networking configuration. productionListenerRule arn:aws:elasticloadbalancing:us-west-2:123456789012:listener-rule/app/my-alb/abc123/def456/ghi789 should have exactly one target group serving traffic but found 2 target groups which are serving traffic`  
*Solution*: Before starting a blue/green deployment, ensure that only one target group is receiving traffic (has a non-zero weight) in your listener rule. Update your listener rule configuration to set the weight to zero for any target group that should not be receiving traffic.
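
The check described above can be run locally before you start a deployment. The following is a minimal sketch, not the validation that Amazon ECS performs. It assumes the rule has the shape of one entry in the `Rules` list returned by the Elastic Load Balancing `DescribeRules` API. The function name and the example ARNs are hypothetical.

```python
def target_groups_serving_traffic(rule):
    """Return ARNs of target groups that a listener rule forwards traffic to.

    `rule` is assumed to have the shape of one entry in the `Rules` list
    returned by the Elastic Load Balancing DescribeRules API.
    """
    serving = []
    for action in rule.get("Actions", []):
        if action.get("Type") != "forward":
            continue
        groups = action.get("ForwardConfig", {}).get("TargetGroups", [])
        if not groups and "TargetGroupArn" in action:
            # Simple forward action with a single target group.
            groups = [{"TargetGroupArn": action["TargetGroupArn"]}]
        for group in groups:
            # Assumption of this sketch: a missing weight counts as non-zero.
            if group.get("Weight", 1) != 0:
                serving.append(group["TargetGroupArn"])
    return serving


# Hypothetical rule: the blue target group carries all of the traffic.
BLUE_TG = "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/blue-tg/abc123"
GREEN_TG = "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/green-tg/def456"
example_rule = {
    "Actions": [
        {
            "Type": "forward",
            "ForwardConfig": {
                "TargetGroups": [
                    {"TargetGroupArn": BLUE_TG, "Weight": 100},
                    {"TargetGroupArn": GREEN_TG, "Weight": 0},
                ]
            },
        }
    ]
}

serving = target_groups_serving_traffic(example_rule)
ready_for_deployment = len(serving) == 1
```

If `ready_for_deployment` is false, set the weight to zero for every target group except the one that should serve production traffic.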

Duplicate target groups across load balancer entries  
*Error message*: `Duplicate targetGroupArn found: arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/myecs-targetgroup/abc123`  
*Solution*: Each target group ARN must be unique across all load balancer entries in your service definition. Review your configuration and ensure you're using different target groups for each load balancer entry.
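
One way to find the duplicated ARN before you submit the service definition is sketched below. It assumes `load_balancers` is the list of load balancer entries from your service definition, each with an optional `targetGroupArn` key. The function name and example values are hypothetical.

```python
from collections import Counter


def duplicate_target_group_arns(load_balancers):
    """Return target group ARNs that appear in more than one entry."""
    counts = Counter(
        entry["targetGroupArn"]
        for entry in load_balancers
        if "targetGroupArn" in entry
    )
    return sorted(arn for arn, count in counts.items() if count > 1)


# Hypothetical service definition fragment with a repeated target group.
TG = "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/myecs-targetgroup/abc123"
load_balancers = [
    {"targetGroupArn": TG, "containerName": "web", "containerPort": 80},
    {"targetGroupArn": TG, "containerName": "web", "containerPort": 8080},
]

duplicates = duplicate_target_group_arns(load_balancers)
```

An empty result means that every entry uses a distinct target group.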

Unexpected target group in production listener rule  
*Error message*: `Service deployment rolled back because of invalid networking configuration. Production listener rule is forwarding traffic to unexpected target group arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/random-nlb-tg/abc123. Expected traffic to be forwarded to either targetGroupArn: arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/nlb-targetgroup/def456 or alternateTargetGroupArn: arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/nlb-tg-alternate/ghi789`  
*Solution*: The production listener rule is forwarding traffic to a target group that is not specified in your service definition. Ensure that the listener rule is configured to forward traffic only to the target groups specified in your service definition.   
For more information, see [forward actions](https://docs.aws.amazon.com/elasticloadbalancing/latest/application/load-balancer-listeners.html#forward-actions) in the *Application Load Balancer User Guide*.

### Load balancer type configuration errors
<a name="troubleshooting-blue-green-load-balancer-types"></a>

The following issues relate to incorrect load balancer type configuration for blue/green deployments.

Mixing Classic Load Balancer and Application Load Balancer or Network Load Balancer configurations  
*Error message*: `All loadBalancers must be strictly either ELBv1 (defining loadBalancerName) or ELBv2 (defining targetGroupArn)`  
*Solution*: Use either all Classic Load Balancers or all Application Load Balancers and Network Load Balancers. For Application Load Balancers and Network Load Balancers, specify only the `targetGroupArn` field.  
Classic Load Balancers are the previous generation of load balancers from Elastic Load Balancing. We recommend that you migrate to a current generation load balancer. For more information, see [Migrate your Classic Load Balancer](https://docs.aws.amazon.com/elasticloadbalancing/latest/userguide/migrate-classic-load-balancer.html).

Using advanced configuration with a Classic Load Balancer  
*Error message*: `advancedConfiguration field is not allowed with ELBv1 loadBalancers`  
*Solution*: Advanced configuration for blue/green deployments is only supported with Application Load Balancers and Network Load Balancers. If you use a Classic Load Balancer (specified with `loadBalancerName`), you cannot use the `advancedConfiguration` field. Either switch to an Application Load Balancer or Network Load Balancer, or remove the `advancedConfiguration` field.

Inconsistent advanced configuration across load balancers  
*Error message*: `Either all or none of the provided loadBalancers must have advancedConfiguration defined`  
*Solution*: If you're using multiple load balancers, you must either define `advancedConfiguration` for all of them or for none of them. Update your configuration to ensure consistency across all load balancer entries.

Missing advanced configuration with blue/green deployment  
*Error message*: `advancedConfiguration field is required for all loadBalancers when using a non-ROLLING deployment strategy`  
*Solution*: When using a blue/green deployment strategy with Application Load Balancers, you must specify the `advancedConfiguration` field for all load balancer entries. Add the required `advancedConfiguration` to your load balancer configuration.
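
The four rules in this section can be checked together before you call the service. The following is a minimal sketch under stated assumptions, not the validation logic that Amazon ECS runs. The function name and the returned codes are hypothetical. `ROLLING` comes from the error message above, and any other strategy value (such as the assumed `BLUE_GREEN`) is treated as non-rolling.

```python
def validate_load_balancers(load_balancers, strategy="BLUE_GREEN"):
    """Return a list of hypothetical codes for the rules described above."""
    if not load_balancers:
        return []

    uses_name = ["loadBalancerName" in lb for lb in load_balancers]
    uses_target_group = ["targetGroupArn" in lb for lb in load_balancers]
    has_advanced = ["advancedConfiguration" in lb for lb in load_balancers]

    errors = []
    # Classic (loadBalancerName) and ALB/NLB (targetGroupArn) must not be mixed.
    if any(uses_name) and any(uses_target_group):
        errors.append("mixed-load-balancer-types")
    # advancedConfiguration is not allowed with a Classic Load Balancer.
    if any(uses_name) and any(has_advanced):
        errors.append("advanced-configuration-with-classic")
    # advancedConfiguration must be set on all entries or on none.
    if any(has_advanced) and not all(has_advanced):
        errors.append("inconsistent-advanced-configuration")
    # A non-rolling strategy requires advancedConfiguration on every entry.
    elif strategy != "ROLLING" and not any(has_advanced):
        errors.append("missing-advanced-configuration")
    return errors


# Hypothetical entries: only the first one defines advancedConfiguration.
entries = [
    {"targetGroupArn": "arn:example:tg-one", "advancedConfiguration": {}},
    {"targetGroupArn": "arn:example:tg-two"},
]
problems = validate_load_balancers(entries)
```

Returning codes instead of raising on the first problem lets you fix every inconsistency in one pass.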

## Permission issues
<a name="troubleshooting-blue-green-permissions"></a>

The following issues relate to insufficient permissions for blue/green deployments.

Missing trust policy on infrastructure role  
*Error message*: `Service deployment rolled back because of invalid networking configuration. ECS was unable to manage the ELB resources due to missing permissions on ECS Infrastructure Role 'arn:aws:iam::123456789012:role/Admin'.`  
*Solution*: The IAM role specified for managing load balancer resources does not have the correct trust policy. Update the role's trust policy to allow Amazon ECS (`ecs.amazonaws.com`) to assume the role. The trust policy must include:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```
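
To confirm that a role's trust policy contains a statement like the one above, you can inspect the policy document locally. The following is a minimal sketch. It assumes the policy is already parsed into a dictionary, and it ignores `Deny` statements and `Condition` blocks. The function names are hypothetical.

```python
import json


def as_list(value):
    """IAM policy fields may hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def allows_service_to_assume(trust_policy, service="ecs.amazonaws.com"):
    """Return True if an Allow statement lets `service` call sts:AssumeRole."""
    for statement in as_list(trust_policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            continue  # for example, a "*" principal; not handled in this sketch
        services = as_list(principal.get("Service", []))
        actions = as_list(statement.get("Action", []))
        if service in services and "sts:AssumeRole" in actions:
            return True
    return False


trust_policy = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Service": "ecs.amazonaws.com"},
      "Action": "sts:AssumeRole"
    }
  ]
}
""")

trusted = allows_service_to_assume(trust_policy)
```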

Missing read permissions on load balancer role  
*Error message*: `service myService failed to describe target health on target-group myTargetGroup with (error User: arn:aws:sts::123456789012:assumed-role/myELBRole/ecs-service-scheduler is not authorized to perform: elasticloadbalancing:DescribeTargetHealth because no identity-based policy allows the elasticloadbalancing:DescribeTargetHealth action)`  
*Solution*: The IAM role used for managing load balancer resources does not have permission to read target health information. Add the `elasticloadbalancing:DescribeTargetHealth` permission to the role's policy. For information about Elastic Load Balancing permissions, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md).

Missing write permissions on load balancer role  
*Error message*: `service myService failed to register targets in target-group myTargetGroup with (error User: arn:aws:sts::123456789012:assumed-role/myELBRole/ecs-service-scheduler is not authorized to perform: elasticloadbalancing:RegisterTargets on resource: arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/myTargetGroup/abc123 because no identity-based policy allows the elasticloadbalancing:RegisterTargets action)`  
*Solution*: The IAM role used for managing load balancer resources does not have permission to register targets. Add the `elasticloadbalancing:RegisterTargets` permission to the role's policy. For information about Elastic Load Balancing permissions, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md).

Missing permission to modify listener rules  
*Error message*: `Service deployment rolled back because TEST_TRAFFIC_SHIFT lifecycle hook(s) failed. User: arn:aws:sts::123456789012:assumed-role/myELBRole/ECSNetworkingWithELB is not authorized to perform: elasticloadbalancing:ModifyListener on resource: arn:aws:elasticloadbalancing:us-west-2:123456789012:listener/app/my-alb/abc123/def456 because no identity-based policy allows the elasticloadbalancing:ModifyListener action`  
*Solution*: The IAM role used for managing load balancer resources does not have permission to modify listeners. Add the `elasticloadbalancing:ModifyListener` permission to the role's policy. For information about Elastic Load Balancing permissions, see [Amazon ECS infrastructure IAM role for load balancers](AmazonECSInfrastructureRolePolicyForLoadBalancers.md).

For blue/green deployments, we recommend attaching the `AmazonECSInfrastructureRolePolicyForLoadBalancers` managed policy to your infrastructure role. This policy includes the permissions needed to manage load balancer resources.
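
If you maintain a custom policy instead, you can check it for the actions named in the error messages above. The following is a minimal sketch. The three actions are only those that appear in this section, not the full set the role needs. The check ignores `Deny` statements, `Resource` restrictions, and conditions, and approximates IAM wildcards with `fnmatch`. The function and constant names are hypothetical.

```python
from fnmatch import fnmatchcase

# Only the actions named in the error messages in this section.
ACTIONS_FROM_ERRORS = [
    "elasticloadbalancing:DescribeTargetHealth",
    "elasticloadbalancing:RegisterTargets",
    "elasticloadbalancing:ModifyListener",
]


def as_list(value):
    """IAM policy fields may hold a single value or a list of values."""
    return value if isinstance(value, list) else [value]


def missing_actions(policy, required=ACTIONS_FROM_ERRORS):
    """Return the required actions that no Allow statement matches."""
    allowed_patterns = []
    for statement in as_list(policy.get("Statement", [])):
        if statement.get("Effect") == "Allow":
            allowed_patterns.extend(as_list(statement.get("Action", [])))
    return [
        action
        for action in required
        # IAM action names are case-insensitive, so compare in lowercase.
        if not any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in allowed_patterns
        )
    ]


# Hypothetical custom policy that only grants read access.
custom_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "elasticloadbalancing:Describe*",
            "Resource": "*",
        }
    ],
}

missing = missing_actions(custom_policy)
```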

## Lifecycle hook issues
<a name="troubleshooting-blue-green-lifecycle-hooks"></a>

The following issues relate to lifecycle hooks in blue/green deployments.

Incorrect trust policy on Lambda hook role  
*Error message*: `Service deployment rolled back because TEST_TRAFFIC_SHIFT lifecycle hook(s) failed. ECS was unable to assume role arn:aws:iam::123456789012:role/Admin`  
*Solution*: The IAM role specified for the Lambda lifecycle hook does not have the correct trust policy. Update the role's trust policy to allow Amazon ECS (`ecs.amazonaws.com`) to assume the role. The trust policy must include:

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ecs.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
```

Lambda hook returns FAILED status  
*Error message*: `Service deployment rolled back because TEST_TRAFFIC_SHIFT lifecycle hook(s) failed. Lifecycle hook target arn:aws:lambda:us-west-2:123456789012:function:myHook returned FAILED status.`  
*Solution*: The Lambda function specified as a lifecycle hook returned a `FAILED` status. Check the function's logs in Amazon CloudWatch Logs to determine the failure reason, and update the function to handle the deployment event correctly.

Missing permission to invoke Lambda function  
*Error message*: `Service deployment rolled back because TEST_TRAFFIC_SHIFT lifecycle hook(s) failed. ECS was unable to invoke hook target arn:aws:lambda:us-west-2:123456789012:function:myHook due to User: arn:aws:sts::123456789012:assumed-role/myLambdaRole/ECS-Lambda-Execution is not authorized to perform: lambda:InvokeFunction on resource: arn:aws:lambda:us-west-2:123456789012:function:myHook because no identity-based policy allows the lambda:InvokeFunction action`  
*Solution*: The IAM role used for the Lambda lifecycle hook does not have permission to invoke the Lambda function. Add the `lambda:InvokeFunction` permission to the role's policy for the specific Lambda function ARN. For information about Lambda permissions, see [Permissions required for Lambda functions in Amazon ECS blue/green deployments](blue-green-permissions.md).

Lambda function timeout or invalid response  
*Error message*: `Service deployment rolled back because TEST_TRAFFIC_SHIFT lifecycle hook(s) failed. ECS was unable to parse the response from arn:aws:lambda:us-west-2:123456789012:function:myHook due to HookStatus must not be null`  
*Solution*: The Lambda function either timed out or returned an invalid response. Ensure that your Lambda function returns a valid response with a `hookStatus` field set to either `SUCCEEDED` or `FAILED`. Also, check that the Lambda function timeout is set appropriately for your validation logic. For more information, see [Lifecycle hooks for Amazon ECS service deployments](deployment-lifecycle-hooks.md).  
Example of a valid Lambda response:  

```
{
  "hookStatus": "SUCCEEDED",
  "reason": "Validation passed"
}
```