Enable Amazon EKS Auto Mode across EKS clusters by using GitHub Actions
Urbija Goswami and Anugrah Lakra, Amazon Web Services
Summary
Amazon Elastic Kubernetes Service (EKS) clusters traditionally require manual management of compute resources through node groups. This creates operational overhead for:
Capacity planning and scaling decisions
Node provisioning and lifecycle management
Cost optimization across different workload types
Infrastructure maintenance and updates
Amazon EKS Auto Mode automates compute resource management by dynamically provisioning and scaling nodes based on workload demands, eliminating the need for manual node group management.
However, many organizations struggle to consistently enable and manage Amazon EKS Auto Mode across their existing and new clusters. Common challenges include:
Complex migration processes from existing node groups
Risk of service disruption during transition
Need for careful capacity planning and testing
Requirement for specific AWS Identity and Access Management (IAM) permissions and configurations
Coordination across multiple teams and environments
This pattern implements a GitHub Actions workflow that automates enabling Amazon EKS Auto Mode across EKS clusters.
After enabling Auto Mode, the workflow drains and deletes old node groups, updates cluster role permissions, and cleans up previous scaling components such as Karpenter and Cluster Autoscaler. The workflow can be integrated with existing continuous integration and continuous delivery/deployment (CI/CD) pipelines.
Prerequisites and limitations
Prerequisites
1. Required
A GitHub account and your own GitHub repository to run the workflow
An active AWS account with administrative permissions
2. Local tools installation
Terraform version 1.13.0 or later
GitHub CLI (gh), configured with appropriate credentials
3. EKS Cluster Requirements
Kubernetes version 1.29 or later
Endpoint access configuration, one of the following:
Public and private endpoint access
Private endpoint access with a NAT gateway in the private subnets
EKS API and ConfigMap cluster access enabled (required to allow EKS to dynamically manage Auto Mode nodes and update the aws-auth ConfigMap for proper cluster authentication during migration)
4. IAM OIDC Configuration Requirements
IAM role and identity provider for GitHub that includes:
Trust policy for GitHub OIDC
Permissions for:
EKS Cluster management
S3 bucket access
IAM role management
EC2 network management
See the iam.tf code for a simple setup using Terraform. The IAM role (GitHubActionsEKSRole) is created when the Terraform code is applied.
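To make the "Trust policy for GitHub OIDC" requirement concrete, the following is a minimal sketch of what such a trust policy can look like. It is not taken from the repository's iam.tf file. The function name and its arguments (account ID, repository, branch) are hypothetical, and the sketch assumes the standard GitHub OIDC provider (token.actions.githubusercontent.com) is already registered in IAM.

```shell
# Print an IAM trust policy that lets workflows from one GitHub repository
# branch assume a role through OIDC.
# Arguments (hypothetical example inputs): AWS account ID, "owner/repo", branch.
github_oidc_trust_policy() {
  account_id=$1
  repo=$2
  branch=$3
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${account_id}:oidc-provider/token.actions.githubusercontent.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
          "token.actions.githubusercontent.com:sub": "repo:${repo}:ref:refs/heads/${branch}"
        }
      }
    }
  ]
}
EOF
}

# Example (placeholder values):
#   github_oidc_trust_policy 123456789012 my-org/my-repo main > trust-policy.json
```

Pinning the `sub` claim to a single repository and branch keeps workflows in other repositories from assuming the role.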
Limitations
Only supports EKS clusters with Kubernetes version 1.29 and above
Only supports Karpenter version 1.1.0 and above
Region-specific implementation. Some AWS services aren't available in all AWS Regions. For Region availability, see AWS services by Region.
Requires cluster endpoint accessibility
Limited to AWS-managed node groups
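The Kubernetes version limit can be checked before a migration is attempted. The sketch below is illustrative and is not part of the pattern's workflow. The function names are hypothetical, and it assumes the AWS CLI is configured with credentials and a default Region.

```shell
# Return success if version $1 (for example, "1.31") is at least version $2.
# Only the major and minor components are compared.
version_at_least() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
  if [ "$have_major" -gt "$want_major" ]; then return 0; fi
  if [ "$have_major" -lt "$want_major" ]; then return 1; fi
  [ "$have_minor" -ge "$want_minor" ]
}

# Return success if the named cluster runs Kubernetes 1.29 or later.
cluster_meets_minimum() {
  cluster_version=$(aws eks describe-cluster --name "$1" \
    --query 'cluster.version' --output text) || return 1
  version_at_least "$cluster_version" "1.29"
}

# Example (placeholder cluster name):
#   cluster_meets_minimum my-cluster || echo "Upgrade the cluster before enabling Auto Mode"
```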
Architecture
Target technology stack
Target architecture

The user triggers the GitHub Actions workflow from the GitHub repository.
The workflow assumes an IAM role through OIDC to make the necessary changes in the AWS account. It also checks whether the EKS Auto Mode node role exists in the account. If the role doesn't exist, the workflow creates it and attaches the necessary policies.
The workflow uploads a backup of the current state of each EKS cluster that needs Auto Mode enabled to an S3 bucket.
The workflow retrieves the cluster role of each such cluster and attaches any of the following policies that EKS Auto Mode requires and that are missing: AmazonEKSComputePolicy, AmazonEKSBlockStoragePolicy, AmazonEKSLoadBalancingPolicy, AmazonEKSNetworkingPolicy, and AmazonEKSClusterPolicy. As a pre-migration step, it also updates the cluster subnets with the tags required for EKS Auto Mode.
The workflow enables EKS Auto Mode on the EKS cluster.
The workflow identifies and deletes the old node groups. This step is skipped if the user hasn't granted the IAM role the permissions described in the optional setup steps below.
The workflow removes the previous scaling components (Karpenter and Cluster Autoscaler) if they are present.
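The cluster role update in the steps above can be sketched with the AWS CLI as follows. This is an illustration under stated assumptions, not the workflow's actual script. The function names are hypothetical, and the caller is assumed to have iam:AttachRolePolicy and eks:DescribeCluster permissions.

```shell
# Print the ARNs of the AWS managed policies that the pattern adds to the
# cluster role for EKS Auto Mode.
auto_mode_policy_arns() {
  for policy in AmazonEKSComputePolicy AmazonEKSBlockStoragePolicy \
    AmazonEKSLoadBalancingPolicy AmazonEKSNetworkingPolicy AmazonEKSClusterPolicy; do
    echo "arn:aws:iam::aws:policy/${policy}"
  done
}

# Look up the cluster role and print its name (the last segment of the role ARN).
cluster_role_name() {
  role_arn=$(aws eks describe-cluster --name "$1" \
    --query 'cluster.roleArn' --output text) || return 1
  echo "${role_arn##*/}"
}

# Attach every Auto Mode policy to the given role, stopping at the first failure.
attach_auto_mode_policies() {
  role_name=$1
  auto_mode_policy_arns | while read -r arn; do
    aws iam attach-role-policy --role-name "$role_name" --policy-arn "$arn" || exit 1
  done
}

# Example (placeholder cluster name):
#   attach_auto_mode_policies "$(cluster_role_name my-cluster)"
```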
The GitHub Actions workflow consists of three main jobs:
check-clusters: Identifies clusters without Auto Mode enabled and updates IAM policies and subnet tags.
backup-and-check: Backs up cluster state before migration.
gradual-migration: Enables Auto Mode while gradually draining existing node groups and cleaning up old scaling components. It also performs a final verification of the clusters' states after migration.
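For orientation, the enablement and verification at the core of the migration can be approximated with two AWS CLI calls. The sketch below is not the repository's code. The function names are hypothetical, the node role ARN is supplied by the caller, and the use of the built-in general-purpose and system node pools is an assumption rather than something the pattern states.

```shell
# Enable EKS Auto Mode (compute, load balancing, and block storage) on a cluster.
# Arguments: cluster name, ARN of the node role that Auto Mode nodes use.
enable_auto_mode() {
  cluster_name=$1
  node_role_arn=$2
  aws eks update-cluster-config \
    --name "$cluster_name" \
    --compute-config "{\"enabled\":true,\"nodeRoleArn\":\"${node_role_arn}\",\"nodePools\":[\"general-purpose\",\"system\"]}" \
    --kubernetes-network-config '{"elasticLoadBalancing":{"enabled":true}}' \
    --storage-config '{"blockStorage":{"enabled":true}}'
}

# Return success if the cluster reports Auto Mode compute as enabled.
auto_mode_enabled() {
  state=$(aws eks describe-cluster --name "$1" \
    --query 'cluster.computeConfig.enabled' --output text) || return 1
  case "$state" in
    True|true) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (placeholder values):
#   enable_auto_mode my-cluster arn:aws:iam::123456789012:role/my-auto-node-role
#   aws eks wait cluster-active --name my-cluster
#   auto_mode_enabled my-cluster && echo "Auto Mode is enabled"
```

The cluster update is asynchronous, so waiting for the cluster to return to the ACTIVE state before verifying avoids a false negative.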
Note
If you need node configuration backups or plan to delete nodes or node groups during migration to EKS Auto Mode, add the IAM role created by the Terraform code to the aws-auth ConfigMap. Without this mapping, you can still view node group configurations.
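One common way to add an IAM role to the aws-auth ConfigMap is the eksctl identity-mapping command. The sketch below is an assumption about how that could be done, not the pattern's documented procedure. The pattern doesn't list eksctl as a prerequisite, the function name and the github-actions username are hypothetical, and the Kubernetes group is left to the caller because the source doesn't say which group the role needs.

```shell
# Map an IAM role to a Kubernetes user and group in the aws-auth ConfigMap.
# Arguments: cluster name, AWS Region, IAM role ARN, Kubernetes group.
# The username "github-actions" is a hypothetical label chosen for this sketch.
map_role_to_cluster() {
  eksctl create iamidentitymapping \
    --cluster "$1" \
    --region "$2" \
    --arn "$3" \
    --username github-actions \
    --group "$4"
}

# Example (placeholder values; choose the least-privileged group that works):
#   map_role_to_cluster my-cluster us-east-1 \
#     arn:aws:iam::123456789012:role/GitHubActionsEKSRole my-rbac-group
```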
Tools
AWS CLI:
AWS Command Line Interface (AWS CLI) is an open source tool that helps you interact with AWS services through commands in your command-line shell. In this pattern, the AWS CLI is used to update EKS cluster configurations, update IAM roles, and query cluster status throughout the automation process.
Amazon EKS:
Amazon Elastic Kubernetes Service (Amazon EKS) helps you run Kubernetes on AWS without needing to install or maintain your own Kubernetes control plane or nodes. In this pattern, Amazon EKS is the target service where Auto Mode is enabled to automate compute provisioning and node scaling across clusters in a specific Region.
IAM:
AWS Identity and Access Management (IAM) helps you securely manage access to your AWS resources by controlling who is authenticated and authorized to use them. In this pattern, IAM manages the permissions that GitHub Actions needs to modify EKS cluster configurations through OIDC federation. The workflow also modifies the cluster role permissions and includes a job that creates the EKS node role, so that EKS Auto Mode can schedule pending pods on the new nodes that it launches as part of its node pools.
Amazon S3:
Amazon Simple Storage Service (Amazon S3) is a cloud-based object storage service that helps you store, protect, and retrieve any amount of data. In this pattern, an S3 bucket stores timestamped backups of each cluster before EKS Auto Mode is enabled, which supports disaster recovery.
Other tools:
GitHub Actions
HashiCorp Terraform
Code repository
The code for this pattern is available in the GitHub EKS Auto Mode Enablement via GitHub Actions repository.
Best practices
Security:
Follow the principle of least privilege and grant the minimum permissions required to perform a task. For more information, see Grant least privilege and Security best practices in the IAM documentation. See the iam.tf file in the repository for the minimum required configuration. Scope the IAM role trust policy to your specific GitHub repository and branch to prevent unauthorized workflow runs from assuming the role.
Enable EKS control plane logging (API server, audit, authenticator) before starting the migration so you can diagnose scheduling or authentication issues after Auto Mode is enabled.
Add --sse AES256 to all aws s3 cp commands in the backup script to enforce server-side encryption on cluster state backups.
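As an illustration of the encryption recommendation, a backup upload helper might look like the following sketch. The function name and the S3 key layout (cluster name, then a UTC timestamp) are hypothetical and aren't taken from the repository's backup script.

```shell
# Upload one backup file to S3 with SSE-S3 (AES256) server-side encryption.
# Arguments: local file path, bucket name, cluster name.
# The key layout <cluster>/<UTC timestamp>/<file> is an assumption for this sketch.
upload_backup() {
  local_file=$1
  bucket=$2
  cluster_name=$3
  timestamp=$(date -u +%Y%m%dT%H%M%SZ)
  aws s3 cp "$local_file" \
    "s3://${bucket}/${cluster_name}/${timestamp}/$(basename "$local_file")" \
    --sse AES256
}

# Example (placeholder values):
#   upload_backup ./cluster-config.json my-backup-bucket my-cluster
```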
Reliability:
Test the workflow against a non-production cluster first. Verify that workloads reschedule correctly on Auto Mode nodes before migrating production clusters.
Verify that S3 backups completed successfully and contain valid cluster config, node group, Helm release, and custom resource data before proceeding with Auto Mode enablement.
After enabling Auto Mode, monitor pod scheduling events and node provisioning latency using Amazon CloudWatch Container Insights to detect issues early.
Performance:
Review Auto Mode node pool scaling patterns periodically and adjust workload resource requests and limits to avoid over-provisioning or scheduling delays.
Cost:
Tag EKS clusters and associated resources (IAM roles, S3 backup buckets, subnets) with environment and ownership metadata to support cost tracking and operational visibility. For more information, see the Tagging AWS resources documentation. You can edit the workflow file to add custom tags during the migration process.
Set up AWS Cost Explorer alerts to monitor changes in compute costs after enabling Auto Mode, because Auto Mode can change instance types and scaling behavior. For more information, see the Analyzing your costs with AWS Cost Explorer documentation.
Operations:
Keep the workflow file and Terraform configurations in version control and document any environment-specific overrides such as region, role ARN, or S3 bucket name.
Epics
| Task | Description | Skills required |
|---|---|---|
| Configure the GitHub repository. | | AWS DevOps, Cloud architect |
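The workflow expects the repository secrets AWS_REGION and AWS_ROLE_ARN. One way to set them is with the GitHub CLI, as in the sketch below. The function name is hypothetical, and the sketch assumes gh is already authenticated with permission to manage secrets on the repository.

```shell
# Store the two repository secrets that the workflow reads.
# Arguments: "owner/repo", AWS Region, ARN of the IAM role that the workflow assumes.
configure_repo_secrets() {
  repo=$1
  region=$2
  role_arn=$3
  gh secret set AWS_REGION --repo "$repo" --body "$region" &&
    gh secret set AWS_ROLE_ARN --repo "$repo" --body "$role_arn"
}

# Example (placeholder values):
#   configure_repo_secrets my-org/my-repo us-east-1 \
#     arn:aws:iam::123456789012:role/GitHubActionsEKSRole
```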
| Task | Description | Skills required |
|---|---|---|
| Set up IAM for backup and node group deletion. | Replace $CLUSTER_NAME and $ACCOUNT_ID with the appropriate values. Replace $AWS_REGION and $ROLE_ARN with the specific Region and the ARN of the IAM role created above, respectively. | AWS DevOps, Cloud architect |
| Task | Description | Skills required |
|---|---|---|
| Trigger the GitHub Actions workflow. | The workflow is triggered automatically when changes are pushed to the feature, main, or dev branches. To trigger it manually in the GitHub UI: 1. Go to the repository on GitHub. 2. Choose the Actions tab. 3. Select the workflow (auto-mode-pipeline). 4. Choose Run workflow. 5. Choose the branch, and then choose Run workflow. The workflow handles verification. | AWS DevOps, Cloud architect |
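The manual trigger can also be issued from a terminal with the GitHub CLI. This is a minimal sketch with hypothetical function names. It assumes the workflow file is .github/workflows/auto-mode-pipeline.yml, that manual runs are allowed for it (as the UI steps imply), and that gh is run from a clone of the repository.

```shell
# Start the Auto Mode workflow on a branch (defaults to main).
trigger_auto_mode_workflow() {
  branch=${1:-main}
  gh workflow run auto-mode-pipeline.yml --ref "$branch"
}

# Show the most recent run of the workflow so that its status can be checked.
latest_auto_mode_run() {
  gh run list --workflow auto-mode-pipeline.yml --limit 1
}

# Example:
#   trigger_auto_mode_workflow dev
#   latest_auto_mode_run
```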
| Task | Description | Skills required |
|---|---|---|
| Implement multi-environment deployment. | | |
| Task | Description | Skills required |
|---|---|---|
| Clean up resources. | | General AWS, Cloud architect |
Troubleshooting
| Issue | Solution |
|---|---|
| Authentication issues | • Verify that the GitHub OIDC provider is configured correctly in IAM. • Check that the IAM role ARN in the GitHub secrets matches the role created with Terraform (GitHubActionsEKSRole). • Ensure that the GitHub repository has the required secrets configured: AWS_REGION and AWS_ROLE_ARN. • Validate that the AWS Region settings match your cluster locations. |
| Permission problems | • Test the IAM role permissions locally by running `aws sts assume-role --role-arn <role-arn> --role-session-name test-session` and then `aws eks list-clusters`. • Ensure that the role has the eks:UpdateClusterConfig and eks:DescribeCluster permissions. |
| Cluster compatibility | • Confirm that the EKS clusters run Kubernetes 1.29 or later: `aws eks describe-cluster --name <cluster-name> --query 'cluster.version'` • Verify that the clusters are in the ACTIVE state before enabling Auto Mode. |
| Workflow failures | • Check the GitHub Actions logs for specific error messages. • Verify the workflow file syntax in .github/workflows/auto-mode-pipeline.yml. • Ensure that environment variables are set properly in the workflow. |