


# Amazon EKS Hybrid Nodes overview
<a name="hybrid-nodes-overview"></a>

With *Amazon EKS Hybrid Nodes*, you can use your on-premises and edge infrastructure as nodes in Amazon EKS clusters. AWS manages the AWS-hosted Kubernetes control plane of the Amazon EKS cluster, and you manage the hybrid nodes that run in your on-premises or edge environments. This unifies Kubernetes management across your environments and offloads Kubernetes control plane management to AWS for your on-premises and edge applications.

Amazon EKS Hybrid Nodes works with any on-premises hardware or virtual machines, bringing the efficiency, scalability, and availability of Amazon EKS to wherever your applications need to run. You can use a wide range of Amazon EKS features with Amazon EKS Hybrid Nodes including Amazon EKS add-ons, Amazon EKS Pod Identity, cluster access entries, cluster insights, and extended Kubernetes version support. Amazon EKS Hybrid Nodes natively integrates with AWS services including AWS Systems Manager, AWS IAM Roles Anywhere, Amazon Managed Service for Prometheus, and Amazon CloudWatch for centralized monitoring, logging, and identity management.

With Amazon EKS Hybrid Nodes, there are no upfront commitments or minimum fees, and you are charged per hour for the vCPU resources of your hybrid nodes when they are attached to your Amazon EKS clusters. For more pricing information, see [Amazon EKS Pricing](https://aws.amazon.com/eks/pricing/).

[![AWS Videos](http://img.youtube.com/vi/tFn9IdlddBw/0.jpg)](http://www.youtube.com/watch?v=tFn9IdlddBw)


## Features
<a name="hybrid-nodes-features"></a>

EKS Hybrid Nodes has the following high-level features:
+  **Managed Kubernetes control plane**: AWS manages the AWS-hosted Kubernetes control plane of the EKS cluster, and you manage the hybrid nodes that run in your on-premises or edge environments. This unifies Kubernetes management across your environments and offloads Kubernetes control plane management to AWS for your on-premises and edge applications. By moving the Kubernetes control plane to AWS, you can conserve on-premises capacity for your applications and trust that the Kubernetes control plane scales with your workloads.
+  **Consistent EKS experience**: Most EKS features are supported with EKS Hybrid Nodes for a consistent EKS experience across your on-premises and cloud environments including EKS add-ons, EKS Pod Identity, cluster access entries, cluster insights, extended Kubernetes version support, and more. See [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md) for more information on the EKS add-ons supported with EKS Hybrid Nodes.
+  **Centralized observability and identity management**: EKS Hybrid Nodes natively integrates with AWS services including AWS Systems Manager, AWS IAM Roles Anywhere, Amazon Managed Service for Prometheus, and Amazon CloudWatch for centralized monitoring, logging, and identity management.
+  **Burst-to-cloud or add on-premises capacity**: A single EKS cluster can be used to run hybrid nodes and nodes in AWS Regions, AWS Local Zones, or AWS Outposts to burst-to-cloud or add on-premises capacity to your EKS clusters. See [Considerations for mixed mode clusters](hybrid-nodes-webhooks.md#hybrid-nodes-considerations-mixed-mode) for more information.
+  **Flexible infrastructure**: EKS Hybrid Nodes follows a *bring your own infrastructure* approach and is agnostic to the infrastructure you use for hybrid nodes. You can run hybrid nodes on physical or virtual machines, and x86 and ARM architectures, making it possible to migrate on-premises workloads running on hybrid nodes across different infrastructure types.
+  **Flexible networking**: With EKS Hybrid Nodes, communication between the EKS control plane and hybrid nodes is routed through the VPC and subnets you pass during cluster creation, which builds on the [existing mechanism](https://docs.aws.amazon.com/eks/latest/best-practices/subnets.html) in EKS for control plane to node networking. This is flexible to your preferred method of connecting your on-premises networks to a VPC in AWS. There are several [documented options](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/network-to-amazon-vpc-connectivity-options.html) available including AWS Site-to-Site VPN, AWS Direct Connect, or your own VPN solution, and you can choose the method that best fits your use case.

## Limits
<a name="hybrid-node-limits"></a>
+ Up to 15 CIDRs for Remote Node Networks and 15 CIDRs for Remote Pod Networks per cluster are supported.

## Considerations
<a name="hybrid-nodes-general"></a>
+ EKS Hybrid Nodes can be used with new or existing EKS clusters.
+ EKS Hybrid Nodes is available in all AWS Regions, except the AWS GovCloud (US) Regions and the AWS China Regions.
+ EKS Hybrid Nodes must have a reliable connection between your on-premises environment and AWS. EKS Hybrid Nodes is not a fit for disconnected, disrupted, intermittent or limited (DDIL) environments. If you are running in a DDIL environment, consider [Amazon EKS Anywhere](https://aws.amazon.com/eks/eks-anywhere/). Reference the [Best Practices for EKS Hybrid Nodes](https://docs.aws.amazon.com/eks/latest/best-practices/hybrid-nodes-network-disconnections.html) for information on how hybrid nodes behave during network disconnection scenarios.
+ Running EKS Hybrid Nodes on cloud infrastructure, including AWS Regions, AWS Local Zones, AWS Outposts, or in other clouds, is not supported. You will be charged the hybrid nodes fee if you run hybrid nodes on Amazon EC2 instances.
+ Billing for hybrid nodes starts when the nodes join the EKS cluster and stops when the nodes are removed from the cluster. Be sure to remove your hybrid nodes from your EKS cluster if you are not using them.

## Additional resources
<a name="hybrid-nodes-resources"></a>
+  [https://www.eksworkshop.com/docs/networking/eks-hybrid-nodes/](https://www.eksworkshop.com/docs/networking/eks-hybrid-nodes/): Step-by-step instructions for deploying EKS Hybrid Nodes in a demo environment.
+  [https://www.youtube.com/watch?v=ZxC7SkemxvU](https://www.youtube.com/watch?v=ZxC7SkemxvU): AWS re:Invent session introducing the EKS Hybrid Nodes launch with a customer showing how they are using EKS Hybrid Nodes in their environment.
+  [https://repost.aws/articles/ARL44xuau6TG2t-JoJ3mJ5Mw/unpacking-the-cluster-networking-for-amazon-eks-hybrid-nodes](https://repost.aws/articles/ARL44xuau6TG2t-JoJ3mJ5Mw/unpacking-the-cluster-networking-for-amazon-eks-hybrid-nodes): Article explaining various methods for setting up networking for EKS Hybrid Nodes.
+  [https://aws.amazon.com/blogs/containers/run-genai-inference-across-environments-with-amazon-eks-hybrid-nodes/](https://aws.amazon.com/blogs/containers/run-genai-inference-across-environments-with-amazon-eks-hybrid-nodes/): Blog post showing how to run GenAI inference across environments with EKS Hybrid Nodes.

# Prerequisite setup for hybrid nodes
<a name="hybrid-nodes-prereqs"></a>

To use Amazon EKS Hybrid Nodes, you must have private connectivity from your on-premises environment to/from AWS, bare metal servers or virtual machines with a supported operating system, and AWS IAM Roles Anywhere or AWS Systems Manager (SSM) hybrid activations configured. You are responsible for managing these prerequisites throughout the hybrid nodes lifecycle.
+ Hybrid network connectivity from your on-premises environment to/from AWS 
+ Infrastructure in the form of physical or virtual machines
+ Operating system that is compatible with hybrid nodes
+ On-premises IAM credentials provider configured

![\[Hybrid node network connectivity.\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-prereq-diagram.png)


## Hybrid network connectivity
<a name="hybrid-nodes-prereqs-connect"></a>

The communication between the Amazon EKS control plane and hybrid nodes is routed through the VPC and subnets you pass during cluster creation, which builds on the [existing mechanism](https://aws.github.io/aws-eks-best-practices/networking/subnets/) in Amazon EKS for control plane to node networking. There are several [documented options](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/network-to-amazon-vpc-connectivity-options.html) available for you to connect your on-premises environment with your VPC including AWS Site-to-Site VPN, AWS Direct Connect, or your own VPN connection. Reference the [AWS Site-to-Site VPN](https://docs.aws.amazon.com/vpn/latest/s2svpn/VPC_VPN.html) and [AWS Direct Connect](https://docs.aws.amazon.com/directconnect/latest/UserGuide/Welcome.html) user guides for more information on how to use those solutions for your hybrid network connection.

For an optimal experience, we recommend that you have reliable network connectivity of at least 100 Mbps and a maximum of 200ms round trip latency for the hybrid nodes connection to the AWS Region. This is general guidance that accommodates most use cases but is not a strict requirement. The bandwidth and latency requirements can vary depending on the number of hybrid nodes and your workload characteristics, such as application image size, application elasticity, monitoring and logging configurations, and application dependencies on accessing data stored in other AWS services. We recommend that you test with your own applications and environments before deploying to production to validate that your networking setup meets the requirements for your workloads.

## On-premises network configuration
<a name="hybrid-nodes-prereqs-onprem"></a>

You must enable inbound network access from the Amazon EKS control plane to your on-premises environment to allow the Amazon EKS control plane to communicate with the `kubelet` running on hybrid nodes and optionally with webhooks running on your hybrid nodes. Additionally, you must enable outbound network access for your hybrid nodes and components running on them to communicate with the Amazon EKS control plane. You can configure this communication to stay fully private to your AWS Direct Connect, AWS Site-to-Site VPN, or your own VPN connection.

The Classless Inter-Domain Routing (CIDR) ranges you use for your on-premises node and pod networks must use IPv4 RFC-1918 or CGNAT address ranges. Your on-premises router must be configured with routes to your on-premises nodes and optionally pods. See [On-premises networking configuration](hybrid-nodes-networking.md#hybrid-nodes-networking-on-prem) for more information on the on-premises network requirements, including the full list of required ports and protocols that must be enabled in your firewall and on-premises environment.

## EKS cluster configuration
<a name="hybrid-nodes-prereqs-cluster"></a>

To minimize latency, we recommend that you create your Amazon EKS cluster in the AWS Region closest to your on-premises or edge environment. You pass your on-premises node and pod CIDRs during Amazon EKS cluster creation via two API fields: `RemoteNodeNetwork` and `RemotePodNetwork`. You may need to discuss with your on-premises network team to identify your on-premises node and pod CIDRs. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from the Container Network Interface (CNI) you use if you are using an overlay network for your CNI. Cilium and Calico use overlay networks by default.

The on-premises node and pod CIDRs you configure via the `RemoteNodeNetwork` and `RemotePodNetwork` fields are used to configure the Amazon EKS control plane to route traffic through your VPC to the `kubelet` and the pods running on your hybrid nodes. Your on-premises node and pod CIDRs cannot overlap with each other, the VPC CIDR you pass during cluster creation, or the service IPv4 configuration for your Amazon EKS cluster. Also, Pod CIDRs must be unique to each EKS cluster so that your on-premises router can route traffic.
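These non-overlap constraints can be checked before cluster creation. The following bash sketch tests whether two IPv4 CIDR blocks overlap by comparing them under the shorter prefix; the example CIDRs are hypothetical.

```
# Sketch: detect overlap between two IPv4 CIDR blocks (example CIDRs are hypothetical).
ip_to_int() {
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

cidr_overlaps() {  # usage: cidr_overlaps 10.0.0.0/16 10.0.1.0/24
  local len1=${1#*/} len2=${2#*/} len mask
  len=$(( len1 < len2 ? len1 : len2 ))               # compare under the shorter prefix
  mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "${1%/*}") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

# The VPC CIDR, service CIDR, and remote node/pod CIDRs must be mutually disjoint:
cidr_overlaps 10.226.0.0/16 10.80.0.0/16 && echo "overlap" || echo "ok"   # prints "ok"
```

Run the check pairwise across your VPC CIDR, Kubernetes service CIDR, and remote node and pod CIDRs; any pair that reports overlap must be re-planned.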

We recommend that you use either public or private endpoint access for the Amazon EKS Kubernetes API server endpoint. If you choose “Public and Private”, the Amazon EKS Kubernetes API server endpoint will always resolve to the public IPs for hybrid nodes running outside of your VPC, which can prevent your hybrid nodes from joining the cluster. When you use public endpoint access, the Kubernetes API server endpoint is resolved to public IPs and the communication from hybrid nodes to the Amazon EKS control plane will be routed over the internet. When you choose private endpoint access, the Kubernetes API server endpoint is resolved to private IPs and the communication from hybrid nodes to the Amazon EKS control plane will be routed over your private connectivity link, in most cases AWS Direct Connect or AWS Site-to-Site VPN.
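For example, an existing cluster can be switched to private-only endpoint access with the AWS CLI; the cluster name `my-cluster` below is a placeholder.

```
aws eks update-cluster-config \
    --name my-cluster \
    --resources-vpc-config endpointPublicAccess=false,endpointPrivateAccess=true
```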

## VPC configuration
<a name="hybrid-nodes-prereqs-vpc"></a>

You must configure the VPC you pass during Amazon EKS cluster creation with routes in its routing table for your on-premises node and optionally pod networks with your virtual private gateway (VGW) or transit gateway (TGW) as the target. An example is shown below. Replace `REMOTE_NODE_CIDR` and `REMOTE_POD_CIDR` with the values for your on-premises network.


| Destination | Target | Description | 
| --- | --- | --- | 
|  10.226.0.0/16  |  local  |  Traffic local to the VPC routes within the VPC  | 
|  `REMOTE_NODE_CIDR`  |  tgw-abcdef123456  |  On-prem node CIDR, route traffic to the TGW  | 
|  `REMOTE_POD_CIDR`  |  tgw-abcdef123456  |  On-prem pod CIDR, route traffic to the TGW  | 
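With a transit gateway as the target, routes like those in the table can be added with the AWS CLI. The route table and transit gateway IDs below are placeholders; replace them, along with `REMOTE_NODE_CIDR` and `REMOTE_POD_CIDR`, with your own values.

```
aws ec2 create-route \
    --route-table-id rtb-abcdef123456 \
    --destination-cidr-block REMOTE_NODE_CIDR \
    --transit-gateway-id tgw-abcdef123456

aws ec2 create-route \
    --route-table-id rtb-abcdef123456 \
    --destination-cidr-block REMOTE_POD_CIDR \
    --transit-gateway-id tgw-abcdef123456
```

If you use a virtual private gateway instead of a transit gateway, pass `--gateway-id` with your VGW ID rather than `--transit-gateway-id`.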

## Security group configuration
<a name="hybrid-nodes-prereqs-sg"></a>

When you create a cluster, Amazon EKS creates a security group that’s named `eks-cluster-sg-<cluster-name>-<uniqueID>`. You cannot alter the inbound rules of this Cluster Security Group but you can restrict the outbound rules. You must add an additional security group to your cluster to enable the kubelet and optionally webhooks running on your hybrid nodes to contact the Amazon EKS control plane. The required inbound rules for this additional security group are shown below. Replace `REMOTE_NODE_CIDR` and `REMOTE_POD_CIDR` with the values for your on-premises network.


| Name | Security group rule ID | IP version | Type | Protocol | Port range | Source | 
| --- | --- | --- | --- | --- | --- | --- | 
|  On-prem node inbound  |  sgr-abcdef123456  |  IPv4  |  HTTPS  |  TCP  |  443  |  `REMOTE_NODE_CIDR`  | 
|  On-prem pod inbound  |  sgr-abcdef654321  |  IPv4  |  HTTPS  |  TCP  |  443  |  `REMOTE_POD_CIDR`  | 
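The inbound rules in the table can be added to your additional security group with the AWS CLI. The security group ID below is a placeholder; replace it, along with `REMOTE_NODE_CIDR` and `REMOTE_POD_CIDR`, with your own values.

```
aws ec2 authorize-security-group-ingress \
    --group-id sg-abcdef123456 \
    --protocol tcp \
    --port 443 \
    --cidr REMOTE_NODE_CIDR

aws ec2 authorize-security-group-ingress \
    --group-id sg-abcdef123456 \
    --protocol tcp \
    --port 443 \
    --cidr REMOTE_POD_CIDR
```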

## Infrastructure
<a name="hybrid-nodes-prereqs-infra"></a>

You must have bare metal servers or virtual machines available to use as hybrid nodes. Hybrid nodes are agnostic to the underlying infrastructure and support x86 and ARM architectures. Amazon EKS Hybrid Nodes follows a “bring your own infrastructure” approach, where you are responsible for provisioning and managing the bare metal servers or virtual machines that you use for hybrid nodes. While there is not a strict minimum resource requirement, we recommend that you use hosts with at least 1 vCPU and 1GiB RAM for hybrid nodes.

## Operating system
<a name="hybrid-nodes-prereqs-os"></a>

Bottlerocket, Amazon Linux 2023 (AL2023), Ubuntu, and RHEL are validated on an ongoing basis for use as the node operating system for hybrid nodes. Bottlerocket is supported by AWS in VMware vSphere environments only. AL2023 is not covered by AWS Support Plans when run outside of Amazon EC2 and can only be used in on-premises virtualized environments; see the [Amazon Linux 2023 User Guide](https://docs.aws.amazon.com/linux/al2023/ug/outside-ec2.html) for more information. AWS supports the hybrid nodes integration with Ubuntu and RHEL operating systems but does not provide support for the operating systems themselves.

You are responsible for operating system provisioning and management. When you are testing hybrid nodes for the first time, it is easiest to run the Amazon EKS Hybrid Nodes CLI (`nodeadm`) on an already provisioned host. For production deployments, we recommend that you include `nodeadm` in your golden operating system images with it configured to run as a systemd service to automatically join hosts to Amazon EKS clusters at host startup.
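As an illustration of the systemd approach, a unit file along the following lines could run `nodeadm` at host startup. The file path, binary location, flags, and config location shown here are assumptions to adapt to your own image build; consult the `nodeadm` reference for the authoritative invocation.

```
# Hypothetical unit file, e.g. /etc/systemd/system/nodeadm-init.service
[Unit]
Description=Join this host to an Amazon EKS cluster with nodeadm
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
# Assumed binary path and config location; adjust for your golden image.
ExecStart=/usr/local/bin/nodeadm init --config-source file:///etc/nodeadm/nodeConfig.yaml
RemainAfterExit=true

[Install]
WantedBy=multi-user.target
```

Enable the unit in your image build (`systemctl enable nodeadm-init.service`) so hosts join the cluster automatically on boot.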

## On-premises IAM credentials provider
<a name="hybrid-nodes-prereqs-iam"></a>

Amazon EKS Hybrid Nodes use temporary IAM credentials provisioned by AWS SSM hybrid activations or AWS IAM Roles Anywhere to authenticate with the Amazon EKS cluster. You must use either AWS SSM hybrid activations or AWS IAM Roles Anywhere with the Amazon EKS Hybrid Nodes CLI (`nodeadm`). We recommend that you use AWS SSM hybrid activations if you do not have existing Public Key Infrastructure (PKI) with a Certificate Authority (CA) and certificates for your on-premises environments. If you do have existing PKI and certificates on-premises, use AWS IAM Roles Anywhere.

Similar to the [Amazon EKS node IAM role](create-node-role.md) for nodes running on Amazon EC2, you will create a Hybrid Nodes IAM Role with the required permissions to join hybrid nodes to Amazon EKS clusters. If you are using AWS IAM Roles Anywhere, configure a trust policy that allows AWS IAM Roles Anywhere to assume the Hybrid Nodes IAM Role and configure your AWS IAM Roles Anywhere profile with the Hybrid Nodes IAM Role as an assumable role. If you are using AWS SSM, configure a trust policy that allows AWS SSM to assume the Hybrid Nodes IAM Role and create the hybrid activation with the Hybrid Nodes IAM Role. See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for how to create the Hybrid Nodes IAM Role with the required permissions.
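For the AWS IAM Roles Anywhere case, the trust policy on the Hybrid Nodes IAM Role looks broadly like the sketch below. This is illustrative only; production policies typically add condition keys that scope trust to a specific trust anchor, and [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) has the authoritative policies. For AWS SSM, the `Principal` service is `ssm.amazonaws.com` instead.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "rolesanywhere.amazonaws.com" },
      "Action": [
        "sts:AssumeRole",
        "sts:TagSession",
        "sts:SetSourceIdentity"
      ]
    }
  ]
}
```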

# Prepare networking for hybrid nodes
<a name="hybrid-nodes-networking"></a>

This topic provides an overview of the networking setup you must have configured before creating your Amazon EKS cluster and attaching hybrid nodes. This guide assumes you have met the prerequisite requirements for hybrid network connectivity using [AWS Site-to-Site VPN](https://docs.aws.amazon.com/vpn/latest/s2svpn/SetUpVPNConnections.html), [AWS Direct Connect](https://docs.aws.amazon.com/directconnect/latest/UserGuide/Welcome.html), or your own VPN solution.

![\[Hybrid node network connectivity.\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-prereq-diagram.png)


## On-premises networking configuration
<a name="hybrid-nodes-networking-on-prem"></a>

### Minimum network requirements
<a name="hybrid-nodes-networking-min-reqs"></a>

For an optimal experience, we recommend that you have reliable network connectivity of at least 100 Mbps and a maximum of 200ms round trip latency for the hybrid nodes connection to the AWS Region. This is general guidance that accommodates most use cases but is not a strict requirement. The bandwidth and latency requirements can vary depending on the number of hybrid nodes and your workload characteristics, such as application image size, application elasticity, monitoring and logging configurations, and application dependencies on accessing data stored in other AWS services. We recommend that you test with your own applications and environments before deploying to production to validate that your networking setup meets the requirements for your workloads.

### On-premises node and pod CIDRs
<a name="hybrid-nodes-networking-on-prem-cidrs"></a>

Identify the node and pod CIDRs you will use for your hybrid nodes and the workloads running on them. The node CIDR is allocated from your on-premises network and the pod CIDR is allocated from your Container Network Interface (CNI) if you are using an overlay network for your CNI. You pass your on-premises node CIDRs and pod CIDRs as inputs when you create your EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` fields. Your on-premises node CIDRs must be routable on your on-premises network. See the following section for information on the on-premises pod CIDR routability.

The on-premises node and pod CIDR blocks must meet the following requirements:

1. Be within one of the following `IPv4` RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

1. Not overlap with each other, the VPC CIDR for your EKS cluster, or your Kubernetes service `IPv4` CIDR.
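A quick way to sanity-check a candidate CIDR against the allowed ranges is a small bash sketch using prefix arithmetic (IPv4 only; the example CIDR is hypothetical):

```
# Sketch: check that a CIDR falls inside the RFC-1918 or RFC 6598 (CGNAT) ranges.
ip_to_int() {
  IFS=. read -r a b c d <<< "$1"
  echo $(( (a << 24) + (b << 16) + (c << 8) + d ))
}

in_range() {  # in_range CIDR RANGE -> succeeds if CIDR is contained in RANGE
  local len=${1#*/} rlen=${2#*/} mask
  [ "$len" -ge "$rlen" ] || return 1
  mask=$(( (0xFFFFFFFF << (32 - rlen)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "${1%/*}") & mask )) -eq $(( $(ip_to_int "${2%/*}") & mask )) ]
}

allowed_cidr() {
  for r in 10.0.0.0/8 172.16.0.0/12 192.168.0.0/16 100.64.0.0/10; do
    in_range "$1" "$r" && return 0
  done
  return 1
}

allowed_cidr 100.64.12.0/24 && echo "allowed" || echo "not allowed"   # prints "allowed"
```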

### On-premises pod network routing
<a name="hybrid-nodes-networking-on-prem-pod-routing"></a>

When using EKS Hybrid Nodes, we generally recommend that you make your on-premises pod CIDRs routable on your on-premises network to enable full cluster communication and functionality between cloud and on-premises environments.

 **Routable pod networks** 

If you are able to make your pod network routable on your on-premises network, follow the guidance below.

1. Configure the `RemotePodNetwork` field for your EKS cluster with your on-premises pod CIDR, your VPC route tables with your on-premises pod CIDR, and your EKS cluster security group with your on-premises pod CIDR.

1. There are several techniques you can use to make your on-premises pod CIDR routable on your on-premises network including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution as it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. AWS supports the BGP capabilities of Cilium and Calico for advertising pod CIDRs, see [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) and [Routable remote Pod CIDRs](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-pod-cidrs) for more information.

1. Webhooks can run on hybrid nodes as the EKS control plane is able to communicate with the Pod IP addresses assigned to the webhooks.

1. Workloads running on cloud nodes are able to communicate directly with workloads running on hybrid nodes in the same EKS cluster.

1. Other AWS services, such as AWS Application Load Balancers and Amazon Managed Service for Prometheus, are able to communicate with workloads running on hybrid nodes to balance network traffic and scrape pod metrics.

 **Unroutable pod networks** 

If you are *not* able to make your pod networks routable on your on-premises network, follow the guidance below.

1. Webhooks cannot run on hybrid nodes because webhooks require connectivity from the EKS control plane to the Pod IP addresses assigned to the webhooks. In this case, we recommend that you run webhooks on cloud nodes in the same EKS cluster as your hybrid nodes, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) for more information.

1. Workloads running on cloud nodes are not able to communicate directly with workloads running on hybrid nodes when using the VPC CNI for cloud nodes and Cilium or Calico for hybrid nodes.

1. Use Service Traffic Distribution to keep traffic local to the zone it is originating from. For more information on Service Traffic Distribution, see [Configure Service Traffic Distribution](hybrid-nodes-webhooks.md#hybrid-nodes-mixed-service-traffic-distribution).

1. Configure your CNI to use egress masquerade or network address translation (NAT) for pod traffic as it leaves your on-premises hosts. This is enabled by default in Cilium. Calico requires `natOutgoing` to be set to `true`.

1. Other AWS services, such as AWS Application Load Balancers and Amazon Managed Service for Prometheus, are not able to communicate with workloads running on hybrid nodes.
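For the traffic-distribution guidance above, recent Kubernetes versions expose a standard `spec.trafficDistribution` field on Services that prefers endpoints close to the client. The Service below is a hypothetical example, not taken from this guide:

```
apiVersion: v1
kind: Service
metadata:
  name: my-app            # hypothetical Service name
spec:
  selector:
    app: my-app
  ports:
    - port: 80
      targetPort: 8080
  # Prefer endpoints in the same zone as the client, falling back when none are ready.
  trafficDistribution: PreferClose
```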

### Access required during hybrid node installation and upgrade
<a name="hybrid-nodes-networking-access-reqs"></a>

You must have access to the following domains during the installation process where you install the hybrid nodes dependencies on your hosts. This process can be done once when you are building your operating system images or it can be done on each host at runtime. This includes initial installation and when you upgrade the Kubernetes version of your hybrid nodes.

Some packages are installed using the OS’s default package manager. For AL2023 and RHEL, the `yum` command is used to install `containerd`, `ca-certificates`, `iptables` and `amazon-ssm-agent`. For Ubuntu, `apt` is used to install `containerd`, `ca-certificates`, and `iptables`, and `snap` is used to install `amazon-ssm-agent`.


| Component | URL | Protocol | Port | 
| --- | --- | --- | --- | 
|  EKS node artifacts (S3)  |  https://hybrid-assets.eks.amazonaws.com  |  HTTPS  |  443  | 
|   [EKS service endpoints](https://docs.aws.amazon.com/general/latest/gr/eks.html)   |  https://eks.*region*.amazonaws.com  |  HTTPS  |  443  | 
|   [ECR service endpoints](https://docs.aws.amazon.com/general/latest/gr/ecr.html)   |  https://api.ecr.*region*.amazonaws.com  |  HTTPS  |  443  | 
|  EKS ECR endpoints  |  See [View Amazon container image registries for Amazon EKS add-ons](add-ons-images.md) for regional endpoints.  |  HTTPS  |  443  | 
|  SSM binary endpoint 1   |  https://amazon-ssm-*region*.s3.*region*.amazonaws.com  |  HTTPS  |  443  | 
|   [SSM service endpoint](https://docs.aws.amazon.com/general/latest/gr/ssm.html) 1   |  https://ssm.*region*.amazonaws.com  |  HTTPS  |  443  | 
|  IAM Anywhere binary endpoint 2   |  https://rolesanywhere.amazonaws.com  |  HTTPS  |  443  | 
|   [IAM Anywhere service endpoint](https://docs.aws.amazon.com/general/latest/gr/rolesanywhere.html) 2   |  https://rolesanywhere.*region*.amazonaws.com  |  HTTPS  |  443  | 
|  Operating System package manager endpoints  |  Package repository endpoints are OS-specific and might vary by geographic region.  |  HTTPS  |  443  | 

**Note**  
 1 Access to the AWS SSM endpoints are only required if you are using AWS SSM hybrid activations for your on-premises IAM credential provider.  
 2 Access to the AWS IAM endpoints are only required if you are using AWS IAM Roles Anywhere for your on-premises IAM credential provider.

### Access required for ongoing cluster operations
<a name="hybrid-nodes-networking-access-reqs-ongoing"></a>

The following network access for your on-premises firewall is required for ongoing cluster operations.

**Important**  
Depending on your choice of CNI, you need to configure additional network access rules for the CNI ports. See the [Cilium documentation](https://docs.cilium.io/en/stable/operations/system_requirements/#firewall-rules) and the [Calico documentation](https://docs.tigera.io/calico/latest/getting-started/kubernetes/requirements#network-requirements) for details.


| Type | Protocol | Direction | Port | Source | Destination | Usage | 
| --- | --- | --- | --- | --- | --- | --- | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Node CIDR(s)  |  EKS cluster IPs 1   |  kubelet to Kubernetes API server  | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Pod CIDR(s)  |  EKS cluster IPs 1   |  Pod to Kubernetes API server  | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Node CIDR(s)  |   [SSM service endpoint](https://docs.aws.amazon.com/general/latest/gr/ssm.html)   |  SSM hybrid activations credential refresh and SSM heartbeats every 5 minutes  | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Node CIDR(s)  |   [IAM Anywhere service endpoint](https://docs.aws.amazon.com/general/latest/gr/rolesanywhere.html)   |  IAM Roles Anywhere credential refresh  | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Pod CIDR(s)  |   [STS Regional Endpoint](https://docs.aws.amazon.com/general/latest/gr/sts.html)   |  Pod to STS endpoint, only required for IRSA  | 
|  HTTPS  |  TCP  |  Outbound  |  443  |  Remote Node CIDR(s)  |   [Amazon EKS Auth service endpoint](https://docs.aws.amazon.com/general/latest/gr/eks.html)   |  Node to Amazon EKS Auth endpoint, only required for Amazon EKS Pod Identity  | 
|  HTTPS  |  TCP  |  Inbound  |  10250  |  EKS cluster IPs 1   |  Remote Node CIDR(s)  |  Kubernetes API server to kubelet  | 
|  HTTPS  |  TCP  |  Inbound  |  Webhook ports  |  EKS cluster IPs 1   |  Remote Pod CIDR(s)  |  Kubernetes API server to webhooks  | 
|  HTTPS  |  TCP,UDP  |  Inbound,Outbound  |  53  |  Remote Pod CIDR(s)  |  Remote Pod CIDR(s)  |  Pod to CoreDNS. If you run at least 1 replica of CoreDNS in the cloud, you must allow DNS traffic to the VPC where CoreDNS is running.  | 
|  User-defined  |  User-defined  |  Inbound,Outbound  |  App ports  |  Remote Pod CIDR(s)  |  Remote Pod CIDR(s)  |  Pod to Pod  | 

**Note**  
 1 The IPs of the EKS cluster. See the following section on Amazon EKS elastic network interfaces.

### Amazon EKS network interfaces
<a name="hybrid-nodes-networking-eks-network-interfaces"></a>

Amazon EKS attaches network interfaces to the subnets in the VPC you pass during cluster creation to enable the communication between the EKS control plane and your VPC. The network interfaces that Amazon EKS creates can be found after cluster creation in the Amazon EC2 console or with the AWS CLI. The original network interfaces are deleted and new network interfaces are created when changes are applied on your EKS cluster, such as Kubernetes version upgrades. You can restrict the IP range for the Amazon EKS network interfaces by using constrained subnet sizes for the subnets you pass during cluster creation, which makes it easier to configure your on-premises firewall to allow inbound/outbound connectivity to this known, constrained set of IPs. To control which subnets network interfaces are created in, you can limit the number of subnets you specify when you create a cluster or you can update the subnets after creating the cluster.

The network interfaces provisioned by Amazon EKS have a description in the format `Amazon EKS your-cluster-name`. See the example below for an AWS CLI command you can use to find the IP addresses of the network interfaces that Amazon EKS provisions. Replace `VPC_ID` with the ID of the VPC you pass during cluster creation.

```
aws ec2 describe-network-interfaces \
    --query "NetworkInterfaces[?VpcId == 'VPC_ID' && contains(Description, 'Amazon EKS')].PrivateIpAddress"
```

## AWS VPC and subnet setup
<a name="hybrid-nodes-networking-vpc"></a>

The existing [VPC and subnet requirements](network-reqs.md) for Amazon EKS apply to clusters with hybrid nodes. Additionally, your VPC CIDR can’t overlap with your on-premises node and pod CIDRs. You must configure routes in your VPC routing table for your on-premises node and optionally pod CIDRs. These routes must be setup to route traffic to the gateway you are using for your hybrid network connectivity, which is commonly a virtual private gateway (VGW) or transit gateway (TGW). If you are using TGW or VGW to connect your VPC with your on-premises environment, you must create a TGW or VGW attachment for your VPC. Your VPC must have DNS hostname and DNS resolution support.

The following steps use the AWS CLI. You can also create these resources in the AWS Management Console or with other interfaces such as AWS CloudFormation, AWS CDK, or Terraform.

### Step 1: Create VPC
<a name="_step_1_create_vpc"></a>

1. Run the following command to create a VPC. Replace `VPC_CIDR` with an IPv4 CIDR range that is RFC-1918 (private), CGNAT (RFC 6598), or public (non-RFC-1918, non-CGNAT), for example `10.0.0.0/16`. Note: DNS resolution, which is an EKS requirement, is enabled for the VPC by default.

   ```
   aws ec2 create-vpc --cidr-block VPC_CIDR
   ```

1. Enable DNS hostnames for your VPC. Note that DNS resolution is enabled for the VPC by default. Replace `VPC_ID` with the ID of the VPC you created in the previous step.

   ```
   aws ec2 modify-vpc-attribute --vpc-id VPC_ID --enable-dns-hostnames
   ```

### Step 2: Create subnets
<a name="_step_2_create_subnets"></a>

Create at least two subnets. Amazon EKS uses these subnets for the cluster network interfaces. For more information, see [Subnet requirements and considerations](network-reqs.md#network-requirements-subnets).

1. You can find the availability zones for an AWS Region with the following command. Replace `us-west-2` with your region.

   ```
   aws ec2 describe-availability-zones \
        --query 'AvailabilityZones[?RegionName == `us-west-2`].ZoneName'
   ```

1. Create a subnet. Replace `VPC_ID` with the ID of the VPC. Replace `SUBNET_CIDR` with the CIDR block for your subnet (for example, 10.0.1.0/24). Replace `AZ` with the availability zone where the subnet will be created (for example, us-west-2a). The subnets you create must be in at least two different availability zones.

   ```
   aws ec2 create-subnet \
       --vpc-id VPC_ID \
       --cidr-block SUBNET_CIDR \
       --availability-zone AZ
   ```

### (Optional) Step 3: Attach VPC to Amazon VPC Transit Gateway (TGW) or AWS Direct Connect virtual private gateway (VGW)
<a name="optional_step_3_attach_vpc_with_amazon_vpc_transit_gateway_tgw_or_shared_aws_direct_connect_virtual_private_gateway_vgw"></a>

If you are using a TGW or VGW, attach your VPC to the TGW or VGW. For more information, see [Amazon VPC attachments in Amazon VPC Transit Gateways](https://docs.aws.amazon.com/vpc/latest/tgw/tgw-vpc-attachments.html) or [AWS Direct Connect virtual private gateway associations](https://docs.aws.amazon.com/vpn/latest/s2svpn/how_it_works.html#VPNGateway).

 **Transit Gateway** 

Run the following command to attach a Transit Gateway. Replace `VPC_ID` with the ID of the VPC. Replace `SUBNET_ID1` and `SUBNET_ID2` with the IDs of the subnets you created in the previous step. Replace `TGW_ID` with the ID of your TGW.

```
aws ec2 create-transit-gateway-vpc-attachment \
    --vpc-id VPC_ID \
    --subnet-ids SUBNET_ID1 SUBNET_ID2 \
    --transit-gateway-id TGW_ID
```

 **Virtual Private Gateway** 

Run the following command to attach a virtual private gateway. Replace `VPN_ID` with the ID of your VGW. Replace `VPC_ID` with the ID of the VPC.

```
aws ec2 attach-vpn-gateway \
    --vpn-gateway-id VPN_ID \
    --vpc-id VPC_ID
```

### (Optional) Step 4: Create route table
<a name="_optional_step_4_create_route_table"></a>

You can modify the main route table for the VPC or you can create a custom route table. The following steps create a custom route table with the routes to on-premises node and pod CIDRs. For more information, see [Subnet route tables](https://docs.aws.amazon.com/vpc/latest/userguide/subnet-route-tables.html). Replace `VPC_ID` with the ID of the VPC.

```
aws ec2 create-route-table --vpc-id VPC_ID
```
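If you are scripting these steps, the `create-route-table` command returns the new route table's ID in its output, which you can capture directly into a shell variable. This is a convenience sketch; the `RT_ID` variable matches the placeholder used in the following steps.

```
RT_ID=$(aws ec2 create-route-table --vpc-id VPC_ID \
    --query 'RouteTable.RouteTableId' --output text)
echo "${RT_ID}"
```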

### Step 5: Create routes for on-premises nodes and pods
<a name="_step_5_create_routes_for_on_premises_nodes_and_pods"></a>

Create routes in the route table for each of your on-premises remote nodes. You can modify the main route table for the VPC or use the custom route table you created in the previous step.

The examples below show how to create routes for your on-premises node and pod CIDRs. In the examples, a transit gateway (TGW) is used to connect the VPC with the on-premises environment. If you have multiple on-premises node and pod CIDRs, repeat the steps for each CIDR.
+ If you are using an internet gateway or a virtual private gateway (VGW), replace `--transit-gateway-id` with `--gateway-id`.
+ Replace `RT_ID` with the ID of the route table you created in the previous step.
+ Replace `REMOTE_NODE_CIDR` with the CIDR range you will use for your hybrid nodes.
+ Replace `REMOTE_POD_CIDR` with the CIDR range you will use for the pods running on hybrid nodes. The pod CIDR range corresponds to the Container Network Interface (CNI) configuration, which most commonly uses an overlay network on-premises. For more information, see [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).
+ Replace `TGW_ID` with the ID of your TGW.

 **Remote node network** 

```
aws ec2 create-route \
    --route-table-id RT_ID \
    --destination-cidr-block REMOTE_NODE_CIDR \
    --transit-gateway-id TGW_ID
```

 **Remote Pod network** 

```
aws ec2 create-route \
    --route-table-id RT_ID \
    --destination-cidr-block REMOTE_POD_CIDR \
    --transit-gateway-id TGW_ID
```

### (Optional) Step 6: Associate subnets with route table
<a name="_optional_step_6_associate_subnets_with_route_table"></a>

If you created a custom route table in Step 4, associate each of the subnets you created in Step 2 with your custom route table. If you are modifying the VPC main route table, the subnets are automatically associated with the main route table of the VPC and you can skip this step.

Run the following command for each of the subnets you created in the previous steps. Replace `RT_ID` with the route table you created in the previous step. Replace `SUBNET_ID` with the ID of a subnet.

```
aws ec2 associate-route-table --route-table-id RT_ID --subnet-id SUBNET_ID
```

## Cluster security group configuration
<a name="hybrid-nodes-networking-cluster-sg"></a>

The following access for your EKS cluster security group is required for ongoing cluster operations. Amazon EKS automatically creates the required **inbound** security group rules for hybrid nodes when you create or update your cluster with remote node and pod networks configured. Because security groups allow all **outbound** traffic by default, Amazon EKS doesn’t automatically modify the **outbound** rules of the cluster security group for hybrid nodes. If you want to customize the cluster security group, you can limit traffic to the rules in the following table.


| Type | Protocol | Direction | Port | Source | Destination | Usage | 
| --- | --- | --- | --- | --- | --- | --- | 
|  HTTPS  |  TCP  |  Inbound  |  443  |  Remote Node CIDR(s)  |  N/A  |  Kubelet to Kubernetes API server  | 
|  HTTPS  |  TCP  |  Inbound  |  443  |  Remote Pod CIDR(s)  |  N/A  |  Pods requiring access to K8s API server when the CNI is not using NAT for the pod traffic.  | 
|  HTTPS  |  TCP  |  Outbound  |  10250  |  N/A  |  Remote Node CIDR(s)  |  Kubernetes API server to Kubelet  | 
|  HTTPS  |  TCP  |  Outbound  |  Webhook ports  |  N/A  |  Remote Pod CIDR(s)  |  Kubernetes API server to webhook (if running webhooks on hybrid nodes)  | 

**Important**  
 **Security group rule limits**: Amazon EC2 security groups have a maximum of 60 inbound rules by default. Amazon EKS might not be able to add the inbound rules for hybrid nodes if your cluster security group is approaching this limit. In this case, you might need to add the missing inbound rules manually.  
 **CIDR cleanup responsibility**: If you remove remote node or pod networks from EKS clusters, EKS does not automatically remove the corresponding security group rules. You are responsible for manually removing unused remote node or pod networks from your security group rules.

For more information about the cluster security group that Amazon EKS creates, see [View Amazon EKS security group requirements for clusters](sec-group-reqs.md).
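Because you are responsible for removing stale rules, it can help to audit the cluster security group periodically. The following AWS CLI commands are one way to do this; `CLUSTER_SG_ID` and `SG_RULE_ID` are placeholders for your cluster security group ID and the ID of a rule you want to remove.

```
# List the inbound rules of the cluster security group so you can spot
# stale remote node or pod CIDRs.
aws ec2 describe-security-group-rules \
    --filters Name=group-id,Values=CLUSTER_SG_ID \
    --query 'SecurityGroupRules[?IsEgress==`false`].[SecurityGroupRuleId,CidrIpv4,FromPort,ToPort]' \
    --output table

# Remove a stale inbound rule by its rule ID.
aws ec2 revoke-security-group-ingress \
    --group-id CLUSTER_SG_ID \
    --security-group-rule-ids SG_RULE_ID
```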

### (Optional) Manual security group configuration
<a name="_optional_manual_security_group_configuration"></a>

If you need to create additional security groups or modify the automatically created rules, you can use the following commands as reference. By default, the command below creates a security group that allows all outbound access. You can restrict outbound access to include only the rules above. If you’re considering limiting the outbound rules, we recommend that you thoroughly test all of your applications and pod connectivity before you apply your changed rules to a production cluster.
+ In the first command, replace `SG_NAME` with a name for your security group
+ In the first command, replace `VPC_ID` with the ID of the VPC you created in the previous step
+ In the second command, replace `SG_ID` with the ID of the security group you create in the first command
+ In the second command, replace `REMOTE_NODE_CIDR` and `REMOTE_POD_CIDR` with the values for your hybrid nodes and on-premises network.

```
aws ec2 create-security-group \
    --group-name SG_NAME \
    --description "security group for hybrid nodes" \
    --vpc-id VPC_ID
```

```
aws ec2 authorize-security-group-ingress \
    --group-id SG_ID \
    --ip-permissions '[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443, "IpRanges": [{"CidrIp": "REMOTE_NODE_CIDR"}, {"CidrIp": "REMOTE_POD_CIDR"}]}]'
```

# Prepare operating system for hybrid nodes
<a name="hybrid-nodes-os"></a>

Bottlerocket, Amazon Linux 2023 (AL2023), Ubuntu, and RHEL are validated on an ongoing basis for use as the node operating system for hybrid nodes. Bottlerocket is supported by AWS in VMware vSphere environments only. AL2023 is not covered by AWS Support Plans when run outside of Amazon EC2 and can only be used in on-premises virtualized environments; see the [Amazon Linux 2023 User Guide](https://docs.aws.amazon.com/linux/al2023/ug/outside-ec2.html) for more information. AWS supports the hybrid nodes integration with Ubuntu and RHEL operating systems but does not provide support for the operating systems themselves.

You are responsible for operating system provisioning and management. When you are testing hybrid nodes for the first time, it is easiest to run the Amazon EKS Hybrid Nodes CLI (`nodeadm`) on an already provisioned host. For production deployments, we recommend that you include `nodeadm` in your operating system images with it configured to run as a systemd service to automatically join hosts to Amazon EKS clusters at host startup. If you are using Bottlerocket as your node operating system on vSphere, you do not need to use `nodeadm` as Bottlerocket already contains the dependencies required for hybrid nodes and will automatically connect to the cluster you configure upon host startup.
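For the systemd approach, a unit along the following lines can run `nodeadm init` at host startup. This is a minimal sketch rather than an official unit file; the unit name, the `nodeadm` binary location, and the config path `/etc/nodeadm/nodeConfig.yaml` are assumptions you should adapt to your image build.

```
[Unit]
Description=Join host to an Amazon EKS cluster with nodeadm
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
# The nodeadm path and config location below are placeholders; match your image layout.
ExecStart=/usr/local/bin/nodeadm init -c file:///etc/nodeadm/nodeConfig.yaml
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Enable the unit in your image build (for example, `systemctl enable` during the Packer provisioning step) so the host joins the cluster on first boot.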

## Version compatibility
<a name="_version_compatibility"></a>

The table below represents the operating system versions that are compatible and validated to use as the node operating system for hybrid nodes. If you are using other operating system variants or versions that are not included in this table, then the compatibility of hybrid nodes with your operating system variant or version is not covered by AWS Support. Hybrid nodes are agnostic to the underlying infrastructure and support x86 and ARM architectures.


| Operating System | Versions | 
| --- | --- | 
|  Amazon Linux  |  Amazon Linux 2023 (AL2023)  | 
|  Bottlerocket  |  v1.37.0 and above (VMware variants running Kubernetes v1.28 and above)  | 
|  Ubuntu  |  Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04  | 
|  Red Hat Enterprise Linux  |  RHEL 8, RHEL 9  | 

## Operating system considerations
<a name="_operating_system_considerations"></a>

### General
<a name="_general"></a>
+ The Amazon EKS Hybrid Nodes CLI (`nodeadm`) can be used to simplify the installation and configuration of the hybrid nodes components and dependencies. You can run the `nodeadm install` process during your operating system image build pipelines or at runtime on each on-premises host. For more information on the components that `nodeadm` installs, see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).
+ If you are using a proxy in your on-premises environment to reach the internet, there is additional operating system configuration required for the install and upgrade processes to configure your package manager to use the proxy. See [Configure proxy for hybrid nodes](hybrid-nodes-proxy.md) for instructions.

### Bottlerocket
<a name="_bottlerocket"></a>
+ The steps and tools to connect a Bottlerocket node are different than the steps for other operating systems and are covered separately in [Connect hybrid nodes with Bottlerocket](hybrid-nodes-bottlerocket.md), instead of the steps in [Connect hybrid nodes](hybrid-nodes-join.md).
+ The steps for Bottlerocket don’t use the hybrid nodes CLI tool, `nodeadm`.
+ Only VMware variants of Bottlerocket version v1.37.0 and above are supported with EKS Hybrid Nodes. VMware variants of Bottlerocket are available for Kubernetes versions v1.28 and above. [Other Bottlerocket variants](https://bottlerocket.dev/en/os/1.36.x/concepts/variants) are not supported as the hybrid nodes operating system. NOTE: VMware variants of Bottlerocket are only available for the x86_64 architecture.

### Containerd
<a name="_containerd"></a>
+ Containerd is the standard Kubernetes container runtime and is a dependency for hybrid nodes, as it is for all Amazon EKS node compute types. The Amazon EKS Hybrid Nodes CLI (`nodeadm`) attempts to install containerd during the `nodeadm install` process. You can configure the containerd installation at `nodeadm install` runtime with the `--containerd-source` command line option. Valid options are `none`, `distro`, and `docker`. If you are using RHEL, `distro` is not a valid option and you can either configure `nodeadm` to install the containerd build from Docker’s repos or you can manually install containerd. When using AL2023 or Ubuntu, `nodeadm` defaults to installing containerd from the operating system distribution. If you do not want `nodeadm` to install containerd, use the `--containerd-source none` option.
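For example, the following `nodeadm install` invocations show how the containerd source is selected. This is a sketch; the Kubernetes version and credential provider shown are placeholders for your own values.

```
# Install hybrid nodes dependencies with containerd from the OS distribution
# (the default on AL2023 and Ubuntu).
nodeadm install 1.31 --credential-provider ssm --containerd-source distro

# Skip the containerd installation if you manage containerd yourself
# (for example, a manual install on RHEL).
nodeadm install 1.31 --credential-provider ssm --containerd-source none
```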

### Ubuntu
<a name="_ubuntu"></a>
+ If you are using Ubuntu 24.04, you may need to update your version of containerd or change your AppArmor configuration to adopt a fix that allows pods to properly terminate, see [Ubuntu bug #2065423](https://bugs.launchpad.net/ubuntu/+source/containerd-app/+bug/2065423). A reboot is required to apply changes to the AppArmor profile. The latest version of Ubuntu 24.04 has an updated containerd version in its package manager with the fix (containerd version 1.7.19).

### ARM
<a name="_arm"></a>
+ If you are using ARM hardware, an ARMv8.2 compliant processor with the Cryptography Extension (ARMv8.2+crypto) is required to run version 1.31 and above of the EKS kube-proxy add-on. All Raspberry Pi systems prior to the Raspberry Pi 5, as well as Cortex-A72 based processors, do not meet this requirement. As a workaround, you can continue to use version 1.30 of the EKS kube-proxy add-on until it reaches end of extended support in July 2026 (see the [Kubernetes release calendar](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html)), or use a custom kube-proxy image from upstream.
+ The following error message in the kube-proxy log indicates this incompatibility:

```
Fatal glibc error: This version of Amazon Linux requires a newer ARM64 processor compliant with at least ARM architecture 8.2-a with Cryptographic extensions. On EC2 this is Graviton 2 or later.
```

## Building operating system images
<a name="_building_operating_system_images"></a>

Amazon EKS provides [example Packer templates](https://github.com/aws/eks-hybrid/tree/main/example/packer) you can use to create operating system images that include `nodeadm` and configure it to run at host-startup. This process is recommended to avoid pulling the hybrid nodes dependencies individually on each host and to automate the hybrid nodes bootstrap process. You can use the example Packer templates with an Ubuntu 22.04, Ubuntu 24.04, RHEL 8 or RHEL 9 ISO image and can output images with these formats: OVA, Qcow2, or raw.

### Prerequisites
<a name="_prerequisites"></a>

Before using the example Packer templates, you must have the following installed on the machine from where you are running Packer.
+ Packer version 1.11.0 or higher. For instructions on installing Packer, see [Install Packer](https://developer.hashicorp.com/packer/tutorials/docker-get-started/get-started-install-cli) in the Packer documentation.
+ If building OVAs, VMware vSphere plugin 1.4.0 or higher
+ If building `Qcow2` or raw images, QEMU plugin version 1.x
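If the builder plugins are not already present, recent Packer versions can install them from the plugin registry. This is a sketch; the plugin source paths follow HashiCorp's registry naming.

```
packer plugins install github.com/hashicorp/vsphere
packer plugins install github.com/hashicorp/qemu
```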

### Set Environment Variables
<a name="_set_environment_variables"></a>

Before running the Packer build, set the following environment variables on the machine from where you are running Packer.

 **General** 

The following environment variables must be set for building images with all operating systems and output formats.


| Environment Variable | Type | Description | 
| --- | --- | --- | 
|  PKR_SSH_PASSWORD  |  String  |  Packer uses the `ssh_username` and `ssh_password` variables to SSH into the created machine when provisioning. This needs to match the password used to create the initial user within the respective OS’s kickstart or user-data files. The default is set as "builder" or "ubuntu" depending on the OS. When setting your password, make sure to change it within the corresponding `ks.cfg` or `user-data` file to match.  | 
|  ISO_URL  |  String  |  URL of the ISO to use. Can be a web link to download from a server, or an absolute path to a local file.  | 
|  ISO_CHECKSUM  |  String  |  Associated checksum for the supplied ISO.  | 
|  CREDENTIAL_PROVIDER  |  String  |  Credential provider for hybrid nodes. Valid values are `ssm` (default) for SSM hybrid activations and `iam` for IAM Roles Anywhere.  | 
|  K8S_VERSION  |  String  |  Kubernetes version for hybrid nodes (for example, `1.31`). For supported Kubernetes versions, see [Amazon EKS supported versions](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html).  | 
|  NODEADM_ARCH  |  String  |  Architecture for `nodeadm install`. Select `amd` or `arm`.  | 
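For example, a shell session for an Ubuntu build using SSM hybrid activations might set these variables as follows. The ISO URL and checksum shown are placeholders for your own values.

```
export PKR_SSH_PASSWORD='ubuntu'   # must match the password in your user-data file
export ISO_URL='https://example.com/ubuntu-22.04.iso'         # placeholder URL
export ISO_CHECKSUM='sha256:REPLACE_WITH_ISO_CHECKSUM'        # placeholder checksum
export CREDENTIAL_PROVIDER='ssm'
export K8S_VERSION='1.31'
export NODEADM_ARCH='amd'
```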

 **RHEL** 

If you are using RHEL, the following environment variables must be set.


| Environment Variable | Type | Description | 
| --- | --- | --- | 
|  RH_USERNAME  |  String  |  RHEL subscription manager username  | 
|  RH_PASSWORD  |  String  |  RHEL subscription manager password  | 
|  RHEL_VERSION  |  String  |  RHEL ISO version being used. Valid values are `8` or `9`.  | 

 **Ubuntu** 

There are no Ubuntu-specific environment variables required.

 **vSphere** 

If you are building a VMware vSphere OVA, the following environment variables must be set.


| Environment Variable | Type | Description | 
| --- | --- | --- | 
|  VSPHERE_SERVER  |  String  |  vSphere server address  | 
|  VSPHERE_USER  |  String  |  vSphere username  | 
|  VSPHERE_PASSWORD  |  String  |  vSphere password  | 
|  VSPHERE_DATACENTER  |  String  |  vSphere datacenter name  | 
|  VSPHERE_CLUSTER  |  String  |  vSphere cluster name  | 
|  VSPHERE_DATASTORE  |  String  |  vSphere datastore name  | 
|  VSPHERE_NETWORK  |  String  |  vSphere network name  | 
|  VSPHERE_OUTPUT_FOLDER  |  String  |  vSphere output folder for the templates  | 

 **QEMU** 


| Environment Variable | Type | Description | 
| --- | --- | --- | 
|  PACKER_OUTPUT_FORMAT  |  String  |  Output format for the QEMU builder. Valid values are `qcow2` and `raw`.  | 

 **Validate template** 

Before running your build, validate your template with the following command after setting your environment variables. Replace `template.pkr.hcl` if you are using a different name for your template.

```
packer validate template.pkr.hcl
```

### Build images
<a name="_build_images"></a>

Build your images with the following commands and use the `-only` flag to specify the target and operating system for your images. Replace `template.pkr.hcl` if you are using a different name for your template.

 **vSphere OVAs** 

**Note**  
If you are using RHEL with vSphere you need to convert the kickstart files to an OEMDRV image and pass it as an ISO to boot from. For more information, see the [Packer Readme](https://github.com/aws/eks-hybrid/tree/main/example/packer#utilizing-rhel-with-vsphere) in the EKS Hybrid Nodes GitHub Repository.

 **Ubuntu 22.04 OVA** 

```
packer build -only=general-build.vsphere-iso.ubuntu22 template.pkr.hcl
```

 **Ubuntu 24.04 OVA** 

```
packer build -only=general-build.vsphere-iso.ubuntu24 template.pkr.hcl
```

 **RHEL 8 OVA** 

```
packer build -only=general-build.vsphere-iso.rhel8 template.pkr.hcl
```

 **RHEL 9 OVA** 

```
packer build -only=general-build.vsphere-iso.rhel9 template.pkr.hcl
```

 **QEMU** 

**Note**  
If you are building an image for a specific host CPU that does not match your builder host, see the [QEMU](https://www.qemu.org/docs/master/system/qemu-cpu-models.html) documentation for the name that matches your host CPU and use the `-cpu` flag with the name of the host CPU when you run the following commands.

 **Ubuntu 22.04 Qcow2 / Raw** 

```
packer build -only=general-build.qemu.ubuntu22 template.pkr.hcl
```

 **Ubuntu 24.04 Qcow2 / Raw** 

```
packer build -only=general-build.qemu.ubuntu24 template.pkr.hcl
```

 **RHEL 8 Qcow2 / Raw** 

```
packer build -only=general-build.qemu.rhel8 template.pkr.hcl
```

 **RHEL 9 Qcow2 / Raw** 

```
packer build -only=general-build.qemu.rhel9 template.pkr.hcl
```

### Pass nodeadm configuration through user-data
<a name="_pass_nodeadm_configuration_through_user_data"></a>

You can pass configuration for `nodeadm` in your user-data through cloud-init to configure and automatically connect hybrid nodes to your EKS cluster at host startup. Below is an example for how to accomplish this when using VMware vSphere as the infrastructure for your hybrid nodes.

1. Install the `govc` CLI following the instructions in the [govc readme](https://github.com/vmware/govmomi/blob/main/govc/README.md) on GitHub.

1. After running the Packer build in the previous section and provisioning your template, you can clone your template to create multiple nodes with the following command. You must clone the template for each new VM you are creating that will be used for hybrid nodes. Replace the variables in the command below with the values for your environment. The `VM_NAME` in the command below is used as your `NODE_NAME` when you inject the names for your VMs via your `metadata.yaml` file.

   ```
   govc vm.clone -vm "/PATH/TO/TEMPLATE" -ds="YOUR_DATASTORE" \
       -on=false -template=false -folder=/FOLDER/TO/SAVE/VM "VM_NAME"
   ```

1. After cloning the template for each of your new VMs, create a `userdata.yaml` and `metadata.yaml` for your VMs. Your VMs can share the same `userdata.yaml` and `metadata.yaml` and you will populate these on a per VM basis in the steps below. The `nodeadm` configuration is created and defined in the `write_files` section of your `userdata.yaml`. The example below uses AWS SSM hybrid activations as the on-premises credential provider for hybrid nodes. For more information on `nodeadm` configuration, see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

    **userdata.yaml:** 

   ```
   #cloud-config
   users:
     - name: # username for login. Use 'builder' for RHEL or 'ubuntu' for Ubuntu.
       passwd: # password to login. Default is 'builder' for RHEL.
       groups: [adm, cdrom, dip, plugdev, lxd, sudo]
       lock-passwd: false
       sudo: ALL=(ALL) NOPASSWD:ALL
       shell: /bin/bash
   
   write_files:
     - path: /usr/local/bin/nodeConfig.yaml
       permissions: '0644'
       content: |
         apiVersion: node.eks.aws/v1alpha1
         kind: NodeConfig
         spec:
             cluster:
                 name: # Cluster Name
                 region: # AWS region
             hybrid:
                 ssm:
                     activationCode: # Your ssm activation code
                     activationId: # Your ssm activation id
   
   runcmd:
     - /usr/local/bin/nodeadm init -c file:///usr/local/bin/nodeConfig.yaml >> /var/log/nodeadm-init.log 2>&1
   ```

    **metadata.yaml:** 

   Create a `metadata.yaml` for your environment. Keep the `"$NODE_NAME"` variable format in the file as this will be populated with values in a subsequent step.

   ```
   instance-id: "$NODE_NAME"
   local-hostname: "$NODE_NAME"
   network:
     version: 2
     ethernets:
       nics:
         match:
           name: ens*
         dhcp4: yes
   ```

1. Add the `userdata.yaml` and `metadata.yaml` files as `gzip+base64` strings with the following commands. The following commands should be run for each of the VMs you are creating. Replace `VM_NAME` with the name of the VM you are updating and `YOUR_DATACENTER` with the name of your vSphere datacenter (the `-dc` flag of `govc` takes a datacenter, not a datastore).

   ```
   export NODE_NAME="VM_NAME"
   export USER_DATA=$(gzip -c9 <userdata.yaml | base64)
   
   govc vm.change -dc="YOUR_DATACENTER" -vm "$NODE_NAME" -e guestinfo.userdata="${USER_DATA}"
   govc vm.change -dc="YOUR_DATACENTER" -vm "$NODE_NAME" -e guestinfo.userdata.encoding=gzip+base64
   
   envsubst '$NODE_NAME' < metadata.yaml > metadata.yaml.tmp
   export METADATA=$(gzip -c9 <metadata.yaml.tmp | base64)
   
   govc vm.change -dc="YOUR_DATACENTER" -vm "$NODE_NAME" -e guestinfo.metadata="${METADATA}"
   govc vm.change -dc="YOUR_DATACENTER" -vm "$NODE_NAME" -e guestinfo.metadata.encoding=gzip+base64
   ```

1. Power on your new VMs, which should automatically connect to the EKS cluster you configured.

   ```
   govc vm.power -on "${NODE_NAME}"
   ```

# Prepare credentials for hybrid nodes
<a name="hybrid-nodes-creds"></a>

Amazon EKS Hybrid Nodes use temporary IAM credentials provisioned by AWS SSM hybrid activations or AWS IAM Roles Anywhere to authenticate with the Amazon EKS cluster. You must use either AWS SSM hybrid activations or AWS IAM Roles Anywhere with the Amazon EKS Hybrid Nodes CLI (`nodeadm`). You should not use both AWS SSM hybrid activations and AWS IAM Roles Anywhere. We recommend that you use AWS SSM hybrid activations if you do not have existing Public Key Infrastructure (PKI) with a Certificate Authority (CA) and certificates for your on-premises environments. If you do have existing PKI and certificates on-premises, use AWS IAM Roles Anywhere.

## Hybrid Nodes IAM Role
<a name="hybrid-nodes-role"></a>

Before you can connect hybrid nodes to your Amazon EKS cluster, you must create an IAM role that will be used with AWS SSM hybrid activations or AWS IAM Roles Anywhere for your hybrid nodes credentials. After cluster creation, you will use this role with an Amazon EKS access entry or `aws-auth` ConfigMap entry to map the IAM role to Kubernetes Role-Based Access Control (RBAC). For more information on associating the Hybrid Nodes IAM role with Kubernetes RBAC, see [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).

The Hybrid Nodes IAM role must have the following permissions.
+ Permissions for `nodeadm` to use the `eks:DescribeCluster` action to gather information about the cluster to which you want to connect hybrid nodes. If you do not grant the `eks:DescribeCluster` action, then you must pass your Kubernetes API endpoint, cluster CA bundle, and service IPv4 CIDR in the node configuration you pass to the `nodeadm init` command.
+ Permissions for `nodeadm` to use the `eks:ListAccessEntries` action to list the access entries on the cluster to which you want to connect hybrid nodes. If you do not grant the `eks:ListAccessEntries` action, then you must pass the `--skip cluster-access-validation` flag when you run the `nodeadm init` command.
+ Permissions for the kubelet to use container images from Amazon Elastic Container Registry (Amazon ECR) as defined in the [AmazonEC2ContainerRegistryPullOnly](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonEC2ContainerRegistryPullOnly.html) policy.
+ If using AWS SSM, permissions for `nodeadm init` to use AWS SSM hybrid activations as defined in the [https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSSMManagedInstanceCore.html](https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AmazonSSMManagedInstanceCore.html) policy.
+ If using AWS SSM, permissions to use the `ssm:DeregisterManagedInstance` action and `ssm:DescribeInstanceInformation` action for `nodeadm uninstall` to deregister instances.
+ (Optional) Permissions for the Amazon EKS Pod Identity Agent to use the `eks-auth:AssumeRoleForPodIdentity` action to retrieve credentials for pods.
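As a starting point, the EKS, AWS SSM, and Pod Identity permissions above might look like the following inline policy. This is a sketch, not a complete role definition: attach the `AmazonSSMManagedInstanceCore` and `AmazonEC2ContainerRegistryPullOnly` managed policies separately, scope the `Resource` values down per your requirements, and replace the placeholder ARN values.

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DescribeClusterAndAccessEntries",
      "Effect": "Allow",
      "Action": [
        "eks:DescribeCluster",
        "eks:ListAccessEntries"
      ],
      "Resource": "arn:aws:eks:AWS_REGION:AWS_ACCOUNT_ID:cluster/CLUSTER_NAME"
    },
    {
      "Sid": "SSMDeregistration",
      "Effect": "Allow",
      "Action": [
        "ssm:DeregisterManagedInstance",
        "ssm:DescribeInstanceInformation"
      ],
      "Resource": "*"
    },
    {
      "Sid": "PodIdentity",
      "Effect": "Allow",
      "Action": "eks-auth:AssumeRoleForPodIdentity",
      "Resource": "*"
    }
  ]
}
```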

## Setup AWS SSM hybrid activations
<a name="hybrid-nodes-ssm"></a>

Before setting up AWS SSM hybrid activations, you must have a Hybrid Nodes IAM role created and configured. For more information, see [Create the Hybrid Nodes IAM role](#hybrid-nodes-create-role). Follow the instructions at [Create a hybrid activation to register nodes with Systems Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/hybrid-activation-managed-nodes.html) in the AWS Systems Manager User Guide to create an AWS SSM hybrid activation for your hybrid nodes. The Activation Code and ID you receive are used with `nodeadm` when you register your hosts as hybrid nodes with your Amazon EKS cluster. You can come back to this step at a later point after you have created and prepared your Amazon EKS clusters for hybrid nodes.

**Important**  
Systems Manager immediately returns the Activation Code and ID to the console or the command window, depending on how you created the activation. Copy this information and store it in a safe place. If you navigate away from the console or close the command window, you might lose this information. If you lose it, you must create a new activation.

By default, AWS SSM hybrid activations are active for 24 hours. You can alternatively specify an `--expiration-date` in timestamp format, such as `2024-08-01T00:00:00`, when you create your hybrid activation. When you use AWS SSM as your credential provider, the node name for your hybrid nodes is not configurable and is auto-generated by AWS SSM. You can view and manage the AWS SSM managed instances in the AWS Systems Manager console under Fleet Manager. You can register up to 1,000 standard [hybrid-activated nodes](https://docs.aws.amazon.com/systems-manager/latest/userguide/activations.html) per account per AWS Region at no additional cost. However, registering more than 1,000 hybrid nodes requires that you activate the advanced-instances tier. There is a charge to use the advanced-instances tier that is not included in the [Amazon EKS Hybrid Nodes pricing](https://aws.amazon.com/eks/pricing/). For more information, see [AWS Systems Manager Pricing](https://aws.amazon.com/systems-manager/pricing/).

See the example below for how to create an AWS SSM hybrid activation with your Hybrid Nodes IAM role. When you use AWS SSM hybrid activations for your hybrid nodes credentials, the names of your hybrid nodes will have the format `mi-012345678abcdefgh` and the temporary credentials provisioned by AWS SSM are valid for 1 hour. You cannot alter the node name or credential duration when using AWS SSM as your credential provider. The temporary credentials are automatically rotated by AWS SSM and the rotation does not impact the status of your nodes or applications.

We recommend that you use one AWS SSM hybrid activation per EKS cluster to scope the AWS SSM `ssm:DeregisterManagedInstance` permission of the Hybrid Nodes IAM role to only be able to deregister instances that are associated with your AWS SSM hybrid activation. In the example on this page, a tag with the EKS cluster ARN is used, which maps your AWS SSM hybrid activation to the EKS cluster. You can alternatively use your preferred tag and method of scoping the AWS SSM permissions based on your permission boundaries and requirements. The `REGISTRATION_LIMIT` option in the command below is an integer used to limit the number of machines that can use the AWS SSM hybrid activation (for example, `10`).

```
aws ssm create-activation \
     --region AWS_REGION \
     --default-instance-name eks-hybrid-nodes \
     --description "Activation for EKS hybrid nodes" \
     --iam-role AmazonEKSHybridNodesRole \
     --tags Key=EKSClusterARN,Value=arn:aws:eks:AWS_REGION:AWS_ACCOUNT_ID:cluster/CLUSTER_NAME \
     --registration-limit REGISTRATION_LIMIT
```
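If you use the `EKSClusterARN` tag as in the example above, the tag value is simply your cluster ARN assembled from its parts (values below are illustrative):

```shell
# Build the EKSClusterARN tag value from its parts (illustrative values).
AWS_REGION=us-west-2
AWS_ACCOUNT_ID=111122223333
CLUSTER_NAME=my-cluster
CLUSTER_ARN="arn:aws:eks:${AWS_REGION}:${AWS_ACCOUNT_ID}:cluster/${CLUSTER_NAME}"
echo "$CLUSTER_ARN"   # arn:aws:eks:us-west-2:111122223333:cluster/my-cluster
```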

Review the instructions on [Create a hybrid activation to register nodes with Systems Manager](https://docs.aws.amazon.com/systems-manager/latest/userguide/hybrid-activation-managed-nodes.html) for more information about the available configuration settings for AWS SSM hybrid activations.
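The `ActivationId` and `ActivationCode` returned by `create-activation` are what you later provide to `nodeadm` when registering the node. As a sketch (values are placeholders; see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md) for the authoritative schema), they land in the `nodeadm` configuration like this:

```yaml
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: CLUSTER_NAME        # your EKS cluster name
    region: AWS_REGION        # AWS Region of your cluster and activation
  hybrid:
    ssm:
      activationId: ACTIVATION_ID      # ActivationId from create-activation
      activationCode: ACTIVATION_CODE  # ActivationCode from create-activation
```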

## Setup AWS IAM Roles Anywhere
<a name="hybrid-nodes-iam-roles-anywhere"></a>

Follow the instructions at [Getting started with IAM Roles Anywhere](https://docs.aws.amazon.com/rolesanywhere/latest/userguide/getting-started.html) in the IAM Roles Anywhere User Guide to set up the trust anchor and profile you will use for temporary IAM credentials for your Hybrid Nodes IAM role. When you create your profile, you can create it without adding any roles. You can create this profile, return to these steps to create your Hybrid Nodes IAM role, and then add your role to your profile after it is created. You can alternatively use the AWS CloudFormation steps later on this page to complete your IAM Roles Anywhere setup for hybrid nodes.

When you add the Hybrid Nodes IAM role to your profile, select **Accept custom role session name** in the **Custom role session name** panel at the bottom of the **Edit profile** page in the AWS IAM Roles Anywhere console. This corresponds to the [acceptRoleSessionName](https://docs.aws.amazon.com/rolesanywhere/latest/APIReference/API_CreateProfile.html#rolesanywhere-CreateProfile-request-acceptRoleSessionName) field of the `CreateProfile` API. This allows you to supply a custom node name for your hybrid nodes in the configuration you pass to `nodeadm` during the bootstrap process. Passing a custom node name during the `nodeadm init` process is required. You can update your profile to accept a custom role session name after creating your profile.

You can configure the credential validity duration with AWS IAM Roles Anywhere through the [durationSeconds](https://docs.aws.amazon.com/rolesanywhere/latest/userguide/authentication-create-session.html#credentials-object) field of your AWS IAM Roles Anywhere profile. The default duration is 1 hour with a maximum of 12 hours. The `MaxSessionDuration` setting on your Hybrid Nodes IAM role must be greater than the `durationSeconds` setting on your AWS IAM Roles Anywhere profile. For more information on `MaxSessionDuration`, see the [UpdateRole API documentation](https://docs.aws.amazon.com/IAM/latest/APIReference/API_UpdateRole.html).
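For example, to allow up to 4-hour sessions, you might set `MaxSessionDuration` to `14400` on the role (for instance with `aws iam update-role --role-name AmazonEKSHybridNodesRole --max-session-duration 14400`) while leaving the profile's `durationSeconds` at or below that value. A quick sketch of the constraint, with illustrative numbers:

```shell
# The role's MaxSessionDuration (seconds) must exceed the profile's
# durationSeconds (seconds); values here are illustrative.
MAX_SESSION_DURATION=14400   # on the Hybrid Nodes IAM role
DURATION_SECONDS=3600        # on the IAM Roles Anywhere profile
if [ "$MAX_SESSION_DURATION" -gt "$DURATION_SECONDS" ]; then
  echo "duration settings are consistent"
fi
```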

The per-machine certificates and keys you generate from your certificate authority (CA) must be placed in the `/etc/iam/pki` directory on each hybrid node with the file names `server.pem` for the certificate and `server.key` for the key.
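As a sketch of that layout (the self-signed pair generated here and `PKI_DIR` are stand-ins for illustration; on a real node, use the certificate and key issued by your CA and write to `/etc/iam/pki` as root):

```shell
# Generate a stand-in certificate/key pair, then stage them under the file
# names nodeadm expects. On a real node, the files come from your CA and
# the target directory is /etc/iam/pki (requires root).
PKI_DIR=./pki-demo   # use /etc/iam/pki on an actual hybrid node
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
    -subj "/CN=my-node-01" -keyout node01.key -out node01.pem 2>/dev/null
mkdir -p "$PKI_DIR"
install -m 0644 node01.pem "$PKI_DIR/server.pem"
install -m 0600 node01.key "$PKI_DIR/server.key"
ls -l "$PKI_DIR"
```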

## Create the Hybrid Nodes IAM role
<a name="hybrid-nodes-create-role"></a>

To run the steps in this section, the IAM principal using the AWS console or AWS CLI must have the following permissions.
+  `iam:CreatePolicy` 
+  `iam:CreateRole` 
+  `iam:AttachRolePolicy` 
+ If using AWS IAM Roles Anywhere
  +  `rolesanywhere:CreateTrustAnchor` 
  +  `rolesanywhere:CreateProfile` 
  +  `iam:PassRole` 

### AWS CloudFormation
<a name="hybrid-nodes-creds-cloudformation"></a>

Install and configure the AWS CLI, if you haven’t already. See [Installing or updating to the latest version of the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).

 **Steps for AWS SSM hybrid activations** 

The CloudFormation stack creates the Hybrid Nodes IAM Role with the permissions outlined above. The CloudFormation template does not create the AWS SSM hybrid activation.

1. Download the AWS SSM CloudFormation template for hybrid nodes:

   ```
   curl -OL 'https://raw.githubusercontent.com/aws/eks-hybrid/refs/heads/main/example/hybrid-ssm-cfn.yaml'
   ```

1. Create a `cfn-ssm-parameters.json` with the following options:

   1. Replace `ROLE_NAME` with the name for your Hybrid Nodes IAM role. By default, the CloudFormation template uses `AmazonEKSHybridNodesRole` as the name of the role it creates if you do not specify a name.

   1. Replace `TAG_KEY` with the AWS SSM resource tag key you used when creating your AWS SSM hybrid activation. The combination of the tag key and tag value is used in the condition for the `ssm:DeregisterManagedInstance` to only allow the Hybrid Nodes IAM role to deregister the AWS SSM managed instances that are associated with your AWS SSM hybrid activation. In the CloudFormation template, `TAG_KEY` defaults to `EKSClusterARN`.

   1. Replace `TAG_VALUE` with the AWS SSM resource tag value you used when creating your AWS SSM hybrid activation. The combination of the tag key and tag value is used in the condition for the `ssm:DeregisterManagedInstance` to only allow the Hybrid Nodes IAM role to deregister the AWS SSM managed instances that are associated with your AWS SSM hybrid activation. If you are using the default `TAG_KEY` of `EKSClusterARN`, then pass your EKS cluster ARN as the `TAG_VALUE`. EKS cluster ARNs have the format `arn:aws:eks:AWS_REGION:AWS_ACCOUNT_ID:cluster/CLUSTER_NAME`.

      ```
      {
        "Parameters": {
          "RoleName": "ROLE_NAME",
          "SSMDeregisterConditionTagKey": "TAG_KEY",
          "SSMDeregisterConditionTagValue": "TAG_VALUE"
        }
      }
      ```

1. Deploy the CloudFormation stack. Replace `STACK_NAME` with your name for the CloudFormation stack.

   ```
   aws cloudformation deploy \
       --stack-name STACK_NAME \
       --template-file hybrid-ssm-cfn.yaml \
       --parameter-overrides file://cfn-ssm-parameters.json \
       --capabilities CAPABILITY_NAMED_IAM
   ```

 **Steps for AWS IAM Roles Anywhere** 

The CloudFormation stack creates the AWS IAM Roles Anywhere trust anchor with the certificate authority (CA) you configure, creates the AWS IAM Roles Anywhere profile, and creates the Hybrid Nodes IAM role with the permissions outlined previously.

1. To set up a certificate authority (CA):

   1. To use an AWS Private CA resource, open the [AWS Private Certificate Authority console](https://console.aws.amazon.com/acm-pca/home). Follow the instructions in the [AWS Private CA User Guide](https://docs.aws.amazon.com/privateca/latest/userguide/PcaWelcome.html).

   1. To use an external CA, follow the instructions provided by the CA. You provide the certificate body in a later step.

   1. Certificates issued from public CAs cannot be used as trust anchors.

1. Download the AWS IAM Roles Anywhere CloudFormation template for hybrid nodes:

   ```
   curl -OL 'https://raw.githubusercontent.com/aws/eks-hybrid/refs/heads/main/example/hybrid-ira-cfn.yaml'
   ```

1. Create a `cfn-iamra-parameters.json` with the following options:

   1. Replace `ROLE_NAME` with the name for your Hybrid Nodes IAM role. By default, the CloudFormation template uses `AmazonEKSHybridNodesRole` as the name of the role it creates if you do not specify a name.

   1. Replace `CERT_ATTRIBUTE` with the per-machine certificate attribute that uniquely identifies your host. The certificate attribute you use must match the `nodeName` you use for the `nodeadm` configuration when you connect hybrid nodes to your cluster. For more information, see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md). By default, the CloudFormation template uses `${aws:PrincipalTag/x509Subject/CN}` as the `CERT_ATTRIBUTE`, which corresponds to the CN field of your per-machine certificates. You can alternatively pass `${aws:PrincipalTag/x509SAN/Name/CN}` as your `CERT_ATTRIBUTE`.

   1. Replace `CA_CERT_BODY` with the certificate body of your CA without line breaks. The `CA_CERT_BODY` must be in Privacy Enhanced Mail (PEM) format. If you have a CA certificate in PEM format, remove the line breaks and the `BEGIN CERTIFICATE` and `END CERTIFICATE` lines before placing the CA certificate body in your `cfn-iamra-parameters.json` file.

      ```
      {
        "Parameters": {
          "RoleName": "ROLE_NAME",
          "CertAttributeTrustPolicy": "CERT_ATTRIBUTE",
          "CABundleCert": "CA_CERT_BODY"
        }
      }
      ```
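      The header/footer stripping can be scripted. A sketch with a stand-in file (`ca-cert.pem` and its contents are placeholders for your real CA certificate):

      ```shell
      # Write a stand-in PEM file for illustration; in practice this is the
      # certificate exported from your CA.
      printf '%s\n' \
        '-----BEGIN CERTIFICATE-----' \
        'MIIBsampleLINEone' \
        'MIIBsampleLINEtwo' \
        '-----END CERTIFICATE-----' > ca-cert.pem

      # Drop the BEGIN/END lines and join the rest into the single-line body.
      CA_CERT_BODY=$(grep -v 'CERTIFICATE' ca-cert.pem | tr -d '\n')
      echo "$CA_CERT_BODY"   # MIIBsampleLINEoneMIIBsampleLINEtwo
      ```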

1. Deploy the CloudFormation template. Replace `STACK_NAME` with your name for the CloudFormation stack.

   ```
   aws cloudformation deploy \
       --stack-name STACK_NAME \
       --template-file hybrid-ira-cfn.yaml \
       --parameter-overrides file://cfn-iamra-parameters.json \
       --capabilities CAPABILITY_NAMED_IAM
   ```

### AWS CLI
<a name="hybrid-nodes-creds-awscli"></a>

Install and configure the AWS CLI, if you haven’t already. See [Installing or updating to the latest version of the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html).

 **Create EKS Describe Cluster Policy** 

1. Create a file named `eks-describe-cluster-policy.json` with the following contents:

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": [
                   "eks:DescribeCluster"
               ],
               "Resource": "*"
           }
       ]
   }
   ```

1. Create the policy with the following command:

   ```
   aws iam create-policy \
       --policy-name EKSDescribeClusterPolicy \
       --policy-document file://eks-describe-cluster-policy.json
   ```

 **Steps for AWS SSM hybrid activations** 

1. Create a file named `eks-hybrid-ssm-policy.json` with the following contents. The policy grants permission for two actions: `ssm:DescribeInstanceInformation` and `ssm:DeregisterManagedInstance`. The policy restricts the `ssm:DeregisterManagedInstance` permission to AWS SSM managed instances associated with your AWS SSM hybrid activation, based on the resource tag you specify in the policy condition.

   1. Replace `AWS_REGION` with the AWS Region for your AWS SSM hybrid activation.

   1. Replace `AWS_ACCOUNT_ID` with your AWS account ID.

   1. Replace `TAG_KEY` with the AWS SSM resource tag key you used when creating your AWS SSM hybrid activation. The combination of the tag key and tag value is used in the condition for the `ssm:DeregisterManagedInstance` to only allow the Hybrid Nodes IAM role to deregister the AWS SSM managed instances that are associated with your AWS SSM hybrid activation. The examples on this page use `EKSClusterARN` as the `TAG_KEY`.

   1. Replace `TAG_VALUE` with the AWS SSM resource tag value you used when creating your AWS SSM hybrid activation. The combination of the tag key and tag value is used in the condition for the `ssm:DeregisterManagedInstance` to only allow the Hybrid Nodes IAM role to deregister the AWS SSM managed instances that are associated with your AWS SSM hybrid activation. If you are using the default `TAG_KEY` of `EKSClusterARN`, then pass your EKS cluster ARN as the `TAG_VALUE`. EKS cluster ARNs have the format `arn:aws:eks:AWS_REGION:AWS_ACCOUNT_ID:cluster/CLUSTER_NAME`.

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Action": "ssm:DescribeInstanceInformation",
                  "Resource": "*"
              },
              {
                  "Effect": "Allow",
                  "Action": "ssm:DeregisterManagedInstance",
                  "Resource": "arn:aws:ssm:AWS_REGION:AWS_ACCOUNT_ID:managed-instance/*",
                  "Condition": {
                      "StringEquals": {
                          "ssm:resourceTag/TAG_KEY": "TAG_VALUE"
                      }
                  }
              }
          ]
      }
      ```

1. Create the policy with the following command:

   ```
   aws iam create-policy \
       --policy-name EKSHybridSSMPolicy \
       --policy-document file://eks-hybrid-ssm-policy.json
   ```

1. Create a file named `eks-hybrid-ssm-trust.json`. Replace `AWS_REGION` with the AWS Region of your AWS SSM hybrid activation and `AWS_ACCOUNT_ID` with your AWS account ID.

   ```
   {
      "Version": "2012-10-17",
      "Statement":[
         {
            "Sid":"",
            "Effect":"Allow",
            "Principal":{
               "Service":"ssm.amazonaws.com"
            },
            "Action":"sts:AssumeRole",
            "Condition":{
               "StringEquals":{
                  "aws:SourceAccount":"AWS_ACCOUNT_ID"
               },
               "ArnEquals":{
                  "aws:SourceArn":"arn:aws:ssm:AWS_REGION:AWS_ACCOUNT_ID:*"
               }
            }
         }
      ]
   }
   ```

1. Create the role with the following command.

   ```
   aws iam create-role \
       --role-name AmazonEKSHybridNodesRole \
       --assume-role-policy-document file://eks-hybrid-ssm-trust.json
   ```

1. Attach the `EKSDescribeClusterPolicy` and the `EKSHybridSSMPolicy` you created in the previous steps. Replace `AWS_ACCOUNT_ID` with your AWS account ID.

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::AWS_ACCOUNT_ID:policy/EKSDescribeClusterPolicy
   ```

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::AWS_ACCOUNT_ID:policy/EKSHybridSSMPolicy
   ```

1. Attach the `AmazonEC2ContainerRegistryPullOnly` and `AmazonSSMManagedInstanceCore` AWS managed policies.

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly
   ```

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
   ```

 **Steps for AWS IAM Roles Anywhere** 

To use AWS IAM Roles Anywhere, you must set up your AWS IAM Roles Anywhere trust anchor before creating the Hybrid Nodes IAM Role. See [Setup AWS IAM Roles Anywhere](#hybrid-nodes-iam-roles-anywhere) for instructions.

1. Create a file named `eks-hybrid-iamra-trust.json`. Replace `TRUST_ANCHOR_ARN` with the ARN of the trust anchor you created in the [Setup AWS IAM Roles Anywhere](#hybrid-nodes-iam-roles-anywhere) steps. The condition in this trust policy restricts the ability of AWS IAM Roles Anywhere to assume the Hybrid Nodes IAM role to exchange temporary IAM credentials only when the role session name matches the CN in the x509 certificate installed on your hybrid nodes. You can alternatively use other certificate attributes to uniquely identify your node. The certificate attribute that you use in the trust policy must correspond to the `nodeName` you set in your `nodeadm` configuration. For more information, see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Principal": {
                   "Service": "rolesanywhere.amazonaws.com"
               },
               "Action": [
                   "sts:TagSession",
                   "sts:SetSourceIdentity"
               ],
               "Condition": {
                   "StringEquals": {
                       "aws:PrincipalTag/x509Subject/CN": "${aws:PrincipalTag/x509Subject/CN}"
                   },
                   "ArnEquals": {
                       "aws:SourceArn": "TRUST_ANCHOR_ARN"
                   }
               }
           },
           {
               "Effect": "Allow",
               "Principal": {
                   "Service": "rolesanywhere.amazonaws.com"
               },
               "Action": "sts:AssumeRole",
               "Condition": {
                   "StringEquals": {
                       "sts:RoleSessionName": "${aws:PrincipalTag/x509Subject/CN}",
                       "aws:PrincipalTag/x509Subject/CN": "${aws:PrincipalTag/x509Subject/CN}"
                   },
                   "ArnEquals": {
                       "aws:SourceArn": "TRUST_ANCHOR_ARN"
                   }
               }
           }
       ]
   }
   ```

1. Create the role with the following command.

   ```
   aws iam create-role \
       --role-name AmazonEKSHybridNodesRole \
       --assume-role-policy-document file://eks-hybrid-iamra-trust.json
   ```

1. Attach the `EKSDescribeClusterPolicy` you created in the previous steps. Replace `AWS_ACCOUNT_ID` with your AWS account ID.

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::AWS_ACCOUNT_ID:policy/EKSDescribeClusterPolicy
   ```

1. Attach the `AmazonEC2ContainerRegistryPullOnly` AWS managed policy.

   ```
   aws iam attach-role-policy \
       --role-name AmazonEKSHybridNodesRole \
       --policy-arn arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryPullOnly
   ```

### AWS Management Console
<a name="hybrid-nodes-creds-console"></a>

 **Create EKS Describe Cluster Policy** 

1. Open the [Amazon IAM console](https://console.aws.amazon.com/iam/home).

1. In the left navigation pane, choose **Policies**.

1. On the **Policies** page, choose **Create policy**.

1. On the **Specify permissions** page, in the **Select a service** panel, choose **EKS**.

   1. Filter actions for **DescribeCluster** and select the **DescribeCluster** Read action.

   1. Choose **Next**.

1. On the **Review and create** page

   1. Enter a **Policy name** for your policy such as `EKSDescribeClusterPolicy`.

   1. Choose **Create policy**.

 **Steps for AWS SSM hybrid activations** 

1. Open the [Amazon IAM console](https://console.aws.amazon.com/iam/home).

1. In the left navigation pane, choose **Policies**.

1. On the **Policies** page, choose **Create policy**.

1. On the **Specify permissions** page, in the **Policy editor** top right navigation, choose **JSON**. Paste the following snippet. Replace `AWS_REGION` with the AWS Region of your AWS SSM hybrid activation and replace `AWS_ACCOUNT_ID` with your AWS account ID. Replace `TAG_KEY` and `TAG_VALUE` with the AWS SSM resource tag key and value you used when creating your AWS SSM hybrid activation.

   ```
   {
       "Version": "2012-10-17",
       "Statement": [
           {
               "Effect": "Allow",
               "Action": "ssm:DescribeInstanceInformation",
               "Resource": "*"
           },
           {
               "Effect": "Allow",
               "Action": "ssm:DeregisterManagedInstance",
               "Resource": "arn:aws:ssm:AWS_REGION:AWS_ACCOUNT_ID:managed-instance/*",
               "Condition": {
                   "StringEquals": {
                       "ssm:resourceTag/TAG_KEY": "TAG_VALUE"
                   }
               }
           }
       ]
   }
   ```

   1. Choose **Next**.

1. On the **Review and create** page, do the following:

   1. Enter a **Policy name** for your policy, such as `EKSHybridSSMPolicy`.

   1. Choose **Create policy**.

1. In the left navigation pane, choose **Roles**.

1. On the **Roles** page, choose **Create role**.

1. On the **Select trusted entity** page, do the following:

   1. In the **Trusted entity type** section, choose **Custom trust policy**. Paste the following into the Custom trust policy editor. Replace `AWS_REGION` with the AWS Region of your AWS SSM hybrid activation and `AWS_ACCOUNT_ID` with your AWS account ID.

      ```
      {
         "Version": "2012-10-17",
         "Statement":[
            {
               "Sid":"",
               "Effect":"Allow",
               "Principal":{
                  "Service":"ssm.amazonaws.com"
               },
               "Action":"sts:AssumeRole",
               "Condition":{
                  "StringEquals":{
                     "aws:SourceAccount":"AWS_ACCOUNT_ID"
                  },
                  "ArnEquals":{
                     "aws:SourceArn":"arn:aws:ssm:AWS_REGION:AWS_ACCOUNT_ID:*"
                  }
               }
            }
         ]
      }
      ```

   1. Choose **Next**.

1. On the **Add permissions** page, attach a custom policy or do the following:

   1. In the **Filter policies** box, enter `EKSDescribeClusterPolicy`, or the name of the policy you created above. Select the check box to the left of your policy name in the search results.

   1. In the **Filter policies** box, enter `EKSHybridSSMPolicy`, or the name of the policy you created above. Select the check box to the left of your policy name in the search results.

   1. In the **Filter policies** box, enter `AmazonEC2ContainerRegistryPullOnly`. Select the check box to the left of `AmazonEC2ContainerRegistryPullOnly` in the search results.

   1. In the **Filter policies** box, enter `AmazonSSMManagedInstanceCore`. Select the check box to the left of `AmazonSSMManagedInstanceCore` in the search results.

   1. Choose **Next**.

1. On the **Name, review, and create** page, do the following:

   1. For **Role name**, enter a unique name for your role, such as `AmazonEKSHybridNodesRole`.

   1. For **Description**, replace the current text with descriptive text such as `Amazon EKS - Hybrid Nodes role`.

   1. Choose **Create role**.

 **Steps for AWS IAM Roles Anywhere** 

To use AWS IAM Roles Anywhere, you must set up your AWS IAM Roles Anywhere trust anchor before creating the Hybrid Nodes IAM Role. See [Setup AWS IAM Roles Anywhere](#hybrid-nodes-iam-roles-anywhere) for instructions.

1. Open the [Amazon IAM console](https://console.aws.amazon.com/iam/home).

1. In the left navigation pane, choose **Roles**.

1. On the **Roles** page, choose **Create role**.

1. On the **Select trusted entity** page, do the following:

   1. In the **Trusted entity type** section, choose **Custom trust policy**. Paste the following into the Custom trust policy editor. Replace `TRUST_ANCHOR_ARN` with the ARN of the trust anchor you created in the [Setup AWS IAM Roles Anywhere](#hybrid-nodes-iam-roles-anywhere) steps. The condition in this trust policy restricts the ability of AWS IAM Roles Anywhere to assume the Hybrid Nodes IAM role to exchange temporary IAM credentials only when the role session name matches the CN in the x509 certificate installed on your hybrid nodes. You can alternatively use other certificate attributes to uniquely identify your node. The certificate attribute that you use in the trust policy must correspond to the `nodeName` you set in your `nodeadm` configuration. For more information, see the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Allow",
                  "Principal": {
                      "Service": "rolesanywhere.amazonaws.com"
                  },
                  "Action": [
                      "sts:TagSession",
                      "sts:SetSourceIdentity"
                  ],
                  "Condition": {
                      "StringEquals": {
                          "aws:PrincipalTag/x509Subject/CN": "${aws:PrincipalTag/x509Subject/CN}"
                      },
                      "ArnEquals": {
                          "aws:SourceArn": "TRUST_ANCHOR_ARN"
                      }
                  }
              },
              {
                  "Effect": "Allow",
                  "Principal": {
                      "Service": "rolesanywhere.amazonaws.com"
                  },
                  "Action": "sts:AssumeRole",
                  "Condition": {
                      "StringEquals": {
                          "sts:RoleSessionName": "${aws:PrincipalTag/x509Subject/CN}",
                          "aws:PrincipalTag/x509Subject/CN": "${aws:PrincipalTag/x509Subject/CN}"
                      },
                      "ArnEquals": {
                          "aws:SourceArn": "TRUST_ANCHOR_ARN"
                      }
                  }
              }
          ]
      }
      ```

   1. Choose **Next**.

1. On the **Add permissions** page, attach a custom policy or do the following:

   1. In the **Filter policies** box, enter `EKSDescribeClusterPolicy`, or the name of the policy you created above. Select the check box to the left of your policy name in the search results.

   1. In the **Filter policies** box, enter `AmazonEC2ContainerRegistryPullOnly`. Select the check box to the left of `AmazonEC2ContainerRegistryPullOnly` in the search results.

   1. Choose **Next**.

1. On the **Name, review, and create** page, do the following:

   1. For **Role name**, enter a unique name for your role, such as `AmazonEKSHybridNodesRole`.

   1. For **Description**, replace the current text with descriptive text such as `Amazon EKS - Hybrid Nodes role`.

   1. Choose **Create role**.

# Create an Amazon EKS cluster with hybrid nodes
<a name="hybrid-nodes-cluster-create"></a>

This topic provides an overview of the available options and describes what to consider when you create a hybrid nodes-enabled Amazon EKS cluster. EKS Hybrid Nodes have the same [Kubernetes version support](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html) as Amazon EKS clusters with cloud nodes, including standard and extended support.

If you are not planning to use EKS Hybrid Nodes, see the primary Amazon EKS create cluster documentation at [Create an Amazon EKS cluster](create-cluster.md).

## Prerequisites
<a name="hybrid-nodes-cluster-create-prep"></a>
+ The [Prerequisite setup for hybrid nodes](hybrid-nodes-prereqs.md) completed. Before you create your hybrid nodes-enabled cluster, you must have your on-premises node and (optionally) pod CIDRs identified, your VPC and subnets created according to the EKS and hybrid nodes requirements, and your security group configured with inbound rules for your on-premises node and (optionally) pod CIDRs. For more information on these prerequisites, see [Prepare networking for hybrid nodes](hybrid-nodes-networking.md).
+ The latest version of the AWS Command Line Interface (AWS CLI) installed and configured on your device. To check your current version, use `aws --version`. Package managers such as yum, apt-get, or Homebrew for macOS are often several versions behind the latest version of the AWS CLI. To install the latest version, see [Installing or updating to the latest version of the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [Configuring settings for the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-config) in the AWS Command Line Interface User Guide.
+ An [IAM principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles#iam-term-principal) with permissions to create IAM roles and attach policies, and to create and describe EKS clusters.

## Considerations
<a name="hybrid-nodes-cluster-create-consider"></a>
+ Your cluster must use either `API` or `API_AND_CONFIG_MAP` for the cluster authentication mode.
+ Your cluster must use the IPv4 address family.
+ Your cluster must use either Public or Private cluster endpoint connectivity. Your cluster cannot use “Public and Private” cluster endpoint connectivity, because the Amazon EKS Kubernetes API server endpoint will resolve to the public IPs for hybrid nodes running outside of your VPC.
+ OIDC authentication is supported for EKS clusters with hybrid nodes.
+ You can add, change, or remove the hybrid nodes configuration of an existing cluster. For more information, see [Enable hybrid nodes on an existing Amazon EKS cluster or modify configuration](hybrid-nodes-cluster-update.md).

## Step 1: Create cluster IAM role
<a name="hybrid-nodes-cluster-create-iam"></a>

If you already have a cluster IAM role, or you’re going to create your cluster with `eksctl` or AWS CloudFormation, then you can skip this step. By default, `eksctl` and the AWS CloudFormation template create the cluster IAM role for you.

1. Run the following command to create an IAM trust policy JSON file.

   ```
   cat >eks-cluster-role-trust-policy.json <<EOF
   {
     "Version": "2012-10-17",
     "Statement": [
       {
         "Effect": "Allow",
         "Principal": {
           "Service": "eks.amazonaws.com"
         },
         "Action": "sts:AssumeRole"
       }
     ]
   }
   EOF
   ```

1. Create the Amazon EKS cluster IAM role. If necessary, preface `eks-cluster-role-trust-policy.json` with the path on your computer that you wrote the file to in the previous step. The command associates the trust policy that you created in the previous step to the role. To create an IAM role, the [IAM principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles#iam-term-principal) that is creating the role must be assigned the `iam:CreateRole` action (permission).

   ```
   aws iam create-role \
       --role-name myAmazonEKSClusterRole \
       --assume-role-policy-document file://"eks-cluster-role-trust-policy.json"
   ```

1. You can assign either the Amazon EKS managed policy or create your own custom policy. For the minimum permissions that you must use in your custom policy, see [Amazon EKS node IAM role](create-node-role.md). Attach the Amazon EKS managed policy named `AmazonEKSClusterPolicy` to the role. To attach an IAM policy to an [IAM principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles#iam-term-principal), the principal that is attaching the policy must be assigned one of the following IAM actions (permissions): `iam:AttachUserPolicy` or `iam:AttachRolePolicy`.

   ```
   aws iam attach-role-policy \
       --policy-arn arn:aws:iam::aws:policy/AmazonEKSClusterPolicy \
       --role-name myAmazonEKSClusterRole
   ```

## Step 2: Create hybrid nodes-enabled cluster
<a name="hybrid-nodes-cluster-create-cluster"></a>

You can create a cluster by using:
+  [eksctl](#hybrid-nodes-cluster-create-eksctl) 
+  [AWS CloudFormation](#hybrid-nodes-cluster-create-cfn) 
+  [AWS CLI](#hybrid-nodes-cluster-create-cli) 
+  [AWS Management Console](#hybrid-nodes-cluster-create-console) 

### Create hybrid nodes-enabled cluster - eksctl
<a name="hybrid-nodes-cluster-create-eksctl"></a>

You need to install the latest version of the `eksctl` command line tool. To install or update `eksctl`, see [Installation](https://eksctl.io/installation) in the `eksctl` documentation.

1. Create `cluster-config.yaml` to define a hybrid nodes-enabled Amazon EKS IPv4 cluster. Make the following replacements in your `cluster-config.yaml`. For a full list of settings, see the [eksctl documentation](https://eksctl.io/getting-started/).

   1. Replace `CLUSTER_NAME` with a name for your cluster. The name can contain only alphanumeric characters (case-sensitive) and hyphens. It must start with an alphanumeric character and can’t be longer than 100 characters. The name must be unique within the AWS Region and AWS account that you’re creating the cluster in.

   1. Replace `AWS_REGION` with the AWS Region that you want to create your cluster in.

   1. Replace `K8S_VERSION` with any [Amazon EKS supported version](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html).

   1. Replace `CREDS_PROVIDER` with `ssm` or `ira` based on the credential provider you configured in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

   1. Replace `CA_BUNDLE_CERT` if your credential provider is set to `ira`, which uses AWS IAM Roles Anywhere as the credential provider. The `CA_BUNDLE_CERT` value is the certificate authority (CA) certificate body and depends on your choice of CA. The certificate must be in Privacy Enhanced Mail (PEM) format.

   1. Replace `GATEWAY_ID` with the ID of your virtual private gateway or transit gateway to be attached to your VPC.

   1. Replace `REMOTE_NODE_CIDRS` with the on-premises node CIDR for your hybrid nodes.

   1. Replace `REMOTE_POD_CIDRS` with the on-premises pod CIDR for workloads running on hybrid nodes, or remove this line from your configuration if you are not running webhooks on hybrid nodes. You must configure `REMOTE_POD_CIDRS` if your CNI does not use Network Address Translation (NAT) or masquerading for pod IP addresses when pod traffic leaves your on-premises hosts. You must also configure `REMOTE_POD_CIDRS` if you are running webhooks on hybrid nodes. See [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) for more information.

   1. Your on-premises node and pod CIDR blocks must meet the following requirements:

      1. Be within one of the IPv4 RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

      1. Not overlap with each other, the `VPC CIDR` for your cluster, or your Kubernetes service IPv4 CIDR.

         ```
         apiVersion: eksctl.io/v1alpha5
         kind: ClusterConfig
         
         metadata:
           name: CLUSTER_NAME
           region: AWS_REGION
           version: "K8S_VERSION"
         
         remoteNetworkConfig:
           iam:
             provider: CREDS_PROVIDER # default SSM, can also be set to IRA
             # caBundleCert: CA_BUNDLE_CERT
           vpcGatewayID: GATEWAY_ID
           remoteNodeNetworks:
           - cidrs: ["REMOTE_NODE_CIDRS"]
           remotePodNetworks:
           - cidrs: ["REMOTE_POD_CIDRS"]
         ```

1. Run the following command:

   ```
   eksctl create cluster -f cluster-config.yaml
   ```

   Cluster provisioning takes several minutes. While the cluster is being created, several lines of output appear. The last line of output is similar to the following example line.

   ```
   [✓]  EKS cluster "CLUSTER_NAME" in "REGION" region is ready
   ```

1. Continue with [Step 3: Update kubeconfig](#hybrid-nodes-cluster-create-kubeconfig).
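
Before creating the cluster, you can sanity-check that your chosen CIDRs satisfy the requirements above. The following sketch uses Python's standard `ipaddress` module; the CIDR values shown are placeholders for your own node, pod, VPC, and Kubernetes service ranges:

```python
import ipaddress

# Allowed ranges: the RFC 1918 private ranges plus the RFC 6598 CGNAT range
ALLOWED = [ipaddress.ip_network(c) for c in
           ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "100.64.0.0/10")]

def validate_cidrs(cidrs):
    """Check every CIDR is within an allowed range and that none overlap."""
    nets = [ipaddress.ip_network(c) for c in cidrs]
    for net in nets:
        if not any(net.subnet_of(allowed) for allowed in ALLOWED):
            raise ValueError(f"{net} is not in an RFC 1918 or RFC 6598 range")
    for i, a in enumerate(nets):
        for b in nets[i + 1:]:
            if a.overlaps(b):
                raise ValueError(f"{a} overlaps {b}")
    return True

# Node CIDR, pod CIDR, VPC CIDR, and service CIDR (placeholder values)
print(validate_cidrs(["10.80.0.0/16", "10.85.0.0/16", "10.0.0.0/16", "172.16.0.0/16"]))
```

A `ValueError` identifies the offending range before you spend minutes waiting on a cluster create that is doomed to misbehave.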

### Create hybrid nodes-enabled cluster - AWS CloudFormation
<a name="hybrid-nodes-cluster-create-cfn"></a>

The CloudFormation stack creates the EKS cluster IAM role and an EKS cluster with the `RemoteNodeNetwork` and `RemotePodNetwork` you specify. Modify the CloudFormation template if you need to customize settings for your EKS cluster that are not exposed in the template.

1. Download the CloudFormation template.

   ```
   curl -OL 'https://raw.githubusercontent.com/aws/eks-hybrid/refs/heads/main/example/hybrid-eks-cfn.yaml'
   ```

1. Create a `cfn-eks-parameters.json` and specify your configuration for each value.

   1.  `CLUSTER_NAME`: name of the EKS cluster to be created

   1.  `CLUSTER_ROLE_NAME`: name of the EKS cluster IAM role to be created. The default in the template is “EKSClusterRole”.

   1.  `SUBNET1_ID`: the ID of the first subnet you created in the prerequisite steps

   1.  `SUBNET2_ID`: the ID of the second subnet you created in the prerequisite steps

   1.  `SG_ID`: the security group ID you created in the prerequisite steps

   1.  `REMOTE_NODE_CIDRS`: the on-premises node CIDR for your hybrid nodes

   1.  `REMOTE_POD_CIDRS`: the on-premises pod CIDR for workloads running on hybrid nodes. You must configure `REMOTE_POD_CIDRS` if your CNI does not use Network Address Translation (NAT) or masquerading for pod IP addresses when pod traffic leaves your on-premises hosts. You must also configure `REMOTE_POD_CIDRS` if you are running webhooks on hybrid nodes. See [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) for more information.

   1. Your on-premises node and pod CIDR blocks must meet the following requirements:

      1. Be within one of the IPv4 RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

      1. Not overlap with each other, the `VPC CIDR` for your cluster, or your Kubernetes service IPv4 CIDR.

   1.  `CLUSTER_AUTH`: the cluster authentication mode for your cluster. Valid values are `API` and `API_AND_CONFIG_MAP`. The default in the template is `API_AND_CONFIG_MAP`.

   1.  `CLUSTER_ENDPOINT`: the cluster endpoint connectivity for your cluster. Valid values are “Public” and “Private”. The default in the template is Private, which means you will only be able to connect to the Kubernetes API endpoint from within your VPC.

   1.  `K8S_VERSION`: the Kubernetes version to use for your cluster. See [Amazon EKS supported versions](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html).

      ```
      {
        "Parameters": {
          "ClusterName": "CLUSTER_NAME",
          "ClusterRoleName": "CLUSTER_ROLE_NAME",
          "SubnetId1": "SUBNET1_ID",
          "SubnetId2": "SUBNET2_ID",
          "SecurityGroupId": "SG_ID",
          "RemoteNodeCIDR": "REMOTE_NODE_CIDRS",
          "RemotePodCIDR": "REMOTE_POD_CIDRS",
          "ClusterAuthMode": "CLUSTER_AUTH",
          "ClusterEndpointConnectivity": "CLUSTER_ENDPOINT",
          "K8sVersion": "K8S_VERSION"
        }
      }
      ```

1. Deploy the CloudFormation stack. Replace `STACK_NAME` with your name for the CloudFormation stack and `AWS_REGION` with the AWS Region where the cluster will be created.

   ```
   aws cloudformation deploy \
       --stack-name STACK_NAME \
       --region AWS_REGION \
       --template-file hybrid-eks-cfn.yaml \
       --parameter-overrides file://cfn-eks-parameters.json \
       --capabilities CAPABILITY_NAMED_IAM
   ```

   Cluster provisioning takes several minutes. You can check the status of your stack with the following command. Replace `STACK_NAME` with your name for the CloudFormation stack and `AWS_REGION` with the AWS Region where the cluster will be created.

   ```
   aws cloudformation describe-stacks \
       --stack-name STACK_NAME \
       --region AWS_REGION \
       --query 'Stacks[].StackStatus'
   ```

1. Continue with [Step 3: Update kubeconfig](#hybrid-nodes-cluster-create-kubeconfig).
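
Because a stray character in `cfn-eks-parameters.json` fails the deployment, you may prefer to generate the file programmatically. A minimal sketch using Python's `json` module; every parameter value shown is a placeholder for your own configuration:

```python
import json

# Placeholder values -- replace each with your own configuration
params = {
    "Parameters": {
        "ClusterName": "CLUSTER_NAME",
        "ClusterRoleName": "EKSClusterRole",
        "SubnetId1": "SUBNET1_ID",
        "SubnetId2": "SUBNET2_ID",
        "SecurityGroupId": "SG_ID",
        "RemoteNodeCIDR": "REMOTE_NODE_CIDRS",
        "RemotePodCIDR": "REMOTE_POD_CIDRS",
        "ClusterAuthMode": "API_AND_CONFIG_MAP",
        "ClusterEndpointConnectivity": "Private",
        "K8sVersion": "K8S_VERSION",
    }
}

# json.dump guarantees syntactically valid JSON for the CloudFormation CLI
with open("cfn-eks-parameters.json", "w") as f:
    json.dump(params, f, indent=2)
```

The resulting file can be passed unchanged to `--parameter-overrides file://cfn-eks-parameters.json`.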

### Create hybrid nodes-enabled cluster - AWS CLI
<a name="hybrid-nodes-cluster-create-cli"></a>

1. Run the following command to create a hybrid nodes-enabled EKS cluster. Before running the command, replace the following with your settings. For a full list of settings, see the [Create an Amazon EKS cluster](create-cluster.md) documentation.

   1.  `CLUSTER_NAME`: name of the EKS cluster to be created

   1.  `AWS_REGION`: AWS Region where the cluster will be created.

   1.  `K8S_VERSION`: the Kubernetes version to use for your cluster. See [Amazon EKS supported versions](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html).

   1.  `ROLE_ARN`: the ARN of the Amazon EKS cluster IAM role that you configured for your cluster. See Amazon EKS cluster IAM role for more information.

   1.  `SUBNET1_ID`: the ID of the first subnet you created in the prerequisite steps

   1.  `SUBNET2_ID`: the ID of the second subnet you created in the prerequisite steps

   1.  `SG_ID`: the security group ID you created in the prerequisite steps

   1. You can use `API` and `API_AND_CONFIG_MAP` for your cluster access authentication mode. In the command below, the cluster access authentication mode is set to `API_AND_CONFIG_MAP`.

   1. You can use the `endpointPublicAccess` and `endpointPrivateAccess` parameters to enable or disable public and private access to your cluster’s Kubernetes API server endpoint. In the command below, `endpointPublicAccess` is set to false and `endpointPrivateAccess` is set to true.

   1.  `REMOTE_NODE_CIDRS`: the on-premises node CIDR for your hybrid nodes.

   1.  `REMOTE_POD_CIDRS` (optional): the on-premises pod CIDR for workloads running on hybrid nodes.

   1. Your on-premises node and pod CIDR blocks must meet the following requirements:

      1. Be within one of the IPv4 RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

      1. Not overlap with each other, the `VPC CIDR` for your Amazon EKS cluster, or your Kubernetes service IPv4 CIDR.

         ```
         aws eks create-cluster \
             --name CLUSTER_NAME \
             --region AWS_REGION \
             --kubernetes-version K8S_VERSION \
             --role-arn ROLE_ARN \
             --resources-vpc-config subnetIds=SUBNET1_ID,SUBNET2_ID,securityGroupIds=SG_ID,endpointPrivateAccess=true,endpointPublicAccess=false \
             --access-config authenticationMode=API_AND_CONFIG_MAP \
             --remote-network-config '{"remoteNodeNetworks":[{"cidrs":["REMOTE_NODE_CIDRS"]}],"remotePodNetworks":[{"cidrs":["REMOTE_POD_CIDRS"]}]}'
         ```

1. It takes several minutes to provision the cluster. You can query the status of your cluster with the following command. Replace `CLUSTER_NAME` with the name of the cluster you are creating and `AWS_REGION` with the AWS Region where the cluster is being created. Don’t proceed to the next step until the output returned is `ACTIVE`.

   ```
   aws eks describe-cluster \
       --name CLUSTER_NAME \
       --region AWS_REGION \
       --query "cluster.status"
   ```

1. Continue with [Step 3: Update kubeconfig](#hybrid-nodes-cluster-create-kubeconfig).
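
The `--remote-network-config` argument is a JSON document, which is easy to mistype inline. A sketch that builds the string with Python's `json` module (the CIDR values are placeholders) so it can be pasted into, or passed to, the AWS CLI command above:

```python
import json

def remote_network_config(node_cidrs, pod_cidrs=None):
    """Build the JSON value for the --remote-network-config flag."""
    config = {"remoteNodeNetworks": [{"cidrs": list(node_cidrs)}]}
    if pod_cidrs:  # remotePodNetworks is optional
        config["remotePodNetworks"] = [{"cidrs": list(pod_cidrs)}]
    # Compact separators match the single-quoted style used in the docs
    return json.dumps(config, separators=(",", ":"))

# Placeholder CIDRs for hybrid nodes and the pods that run on them
print(remote_network_config(["10.80.0.0/16"], ["10.85.0.0/16"]))
```

Omitting the second argument drops `remotePodNetworks` from the document entirely, mirroring the optional `REMOTE_POD_CIDRS` setting.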

### Create hybrid nodes-enabled cluster - AWS Management Console
<a name="hybrid-nodes-cluster-create-console"></a>

1. Open the Amazon EKS console at [Amazon EKS console](https://console.aws.amazon.com/eks/home#/clusters).

1. Choose **Add cluster** and then choose **Create**.

1. On the **Configure cluster** page, enter values for the following fields:

   1.  **Name** – A name for your cluster. The name can contain only alphanumeric characters (case-sensitive), hyphens, and underscores. It must start with an alphanumeric character and can’t be longer than 100 characters. The name must be unique within the AWS Region and AWS account that you’re creating the cluster in.

   1.  **Cluster IAM role** – Choose the Amazon EKS cluster IAM role that you created to allow the Kubernetes control plane to manage AWS resources on your behalf.

   1.  **Kubernetes version** – The version of Kubernetes to use for your cluster. We recommend selecting the latest version, unless you need an earlier version.

   1.  **Upgrade policy** - Choose either Extended or Standard.

      1.  **Extended:** This option supports the Kubernetes version for 26 months after the release date. The extended support period has an additional hourly cost that begins after the standard support period ends. When extended support ends, your cluster is automatically upgraded to the next version.

      1.  **Standard:** This option supports the Kubernetes version for 14 months after the release date. There is no additional cost. When standard support ends, your cluster is automatically upgraded to the next version.

   1.  **Cluster access** - Choose to allow or disallow cluster administrator access and select an authentication mode. The following authentication modes are supported for hybrid nodes-enabled clusters.

      1.  **EKS API**: The cluster will source authenticated IAM principals only from EKS access entry APIs.

      1.  **EKS API and ConfigMap**: The cluster will source authenticated IAM principals from both EKS access entry APIs and the `aws-auth` ConfigMap.

   1.  **Secrets encryption** – (Optional) Choose to enable secrets encryption of Kubernetes secrets using a KMS key. You can also enable this after you create your cluster. Before you enable this capability, make sure that you’re familiar with the information in [Encrypt Kubernetes secrets with KMS on existing clusters](enable-kms.md).

   1.  **ARC zonal shift** - If enabled, EKS registers your cluster with ARC zonal shift so that you can use zonal shift to move application traffic away from an impaired Availability Zone (AZ).

   1.  **Tags** – (Optional) Add any tags to your cluster. For more information, see [Organize Amazon EKS resources with tags](eks-using-tags.md).

   1. When you’re done with this page, choose **Next**.

1. On the **Specify networking** page, select values for the following fields:

   1.  **VPC** – Choose an existing VPC that meets [View Amazon EKS networking requirements for VPC and subnets](network-reqs.md) and [Amazon EKS Hybrid Nodes requirements](hybrid-nodes-prereqs.md). Before choosing a VPC, make sure that you’re familiar with all of the requirements and considerations in those topics. You can’t change which VPC you use after cluster creation. If no VPCs are listed, create one first. For more information, see [Create an Amazon VPC for your Amazon EKS cluster](creating-a-vpc.md) and the [Amazon EKS Hybrid Nodes networking requirements](hybrid-nodes-prereqs.md).

   1.  **Subnets** – By default, all available subnets in the VPC specified in the previous field are preselected. You must select at least two.

   1.  **Security groups** – (Optional) Specify one or more security groups that you want Amazon EKS to associate to the network interfaces that it creates. At least one of the security groups you specify must have inbound rules for your on-premises node and optionally pod CIDRs. See the [Amazon EKS Hybrid Nodes networking requirements](hybrid-nodes-networking.md) for more information. Whether you choose any security groups or not, Amazon EKS creates a security group that enables communication between your cluster and your VPC. Amazon EKS associates this security group, and any that you choose, to the network interfaces that it creates. For more information about the cluster security group that Amazon EKS creates, see [View Amazon EKS security group requirements for clusters](sec-group-reqs.md). You can modify the rules in the cluster security group that Amazon EKS creates.

   1.  **Choose cluster IP address family** – You must choose IPv4 for hybrid nodes-enabled clusters.

   1. (Optional) Choose **Configure Kubernetes Service IP address range** and specify a **Service IPv4 range**.

   1. Choose **Configure remote networks to enable hybrid nodes** and specify your on-premises node and pod CIDRs for hybrid nodes.

   1. You must configure your remote pod CIDR if your CNI does not use Network Address Translation (NAT) or masquerading for pod IP addresses when pod traffic leaves your on-premises hosts. You must configure the remote pod CIDR if you are running webhooks on hybrid nodes.

   1. Your on-premises node and pod CIDR blocks must meet the following requirements:

      1. Be within one of the IPv4 RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

      1. Not overlap with each other, the `VPC CIDR` for your cluster, or your Kubernetes service IPv4 CIDR.

   1. For **Cluster endpoint access**, select an option. After your cluster is created, you can change this option. For hybrid nodes-enabled clusters, you must choose either Public or Private. Before selecting a non-default option, make sure to familiarize yourself with the options and their implications. For more information, see [Cluster API server endpoint](cluster-endpoint.md).

   1. When you’re done with this page, choose **Next**.

1. (Optional) On the **Configure observability** page, choose which **Metrics** and **Control plane logging** options to turn on. By default, each log type is turned off.

   1. For more information about the Prometheus metrics option, see [Monitor your cluster metrics with Prometheus](prometheus.md).

   1. For more information about the EKS control logging options, see [Send control plane logs to CloudWatch Logs](control-plane-logs.md).

   1. When you’re done with this page, choose **Next**.

1. On the **Select add-ons** page, choose the add-ons that you want to add to your cluster.

   1. You can choose as many **Amazon EKS add-ons** and **AWS Marketplace add-ons** as you require. Amazon EKS add-ons that are not compatible with hybrid nodes are marked with “Not compatible with Hybrid Nodes”, and these add-ons have an anti-affinity rule to prevent them from running on hybrid nodes. See [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md) for more information. If an **AWS Marketplace** add-on that you want to install isn’t listed, you can search for available **AWS Marketplace add-ons** by entering text in the search box. You can also search by **category**, **vendor**, or **pricing model** and then choose the add-ons from the search results.

   1. Some add-ons, such as CoreDNS and kube-proxy, are installed by default. If you disable any of the default add-ons, this may affect your ability to run Kubernetes applications.

   1. When you’re done with this page, choose **Next**.

1. On the **Configure selected add-ons settings** page, select the version that you want to install.

   1. You can always update to a later version after cluster creation. You can update the configuration of each add-on after cluster creation. For more information about configuring add-ons, see [Update an Amazon EKS add-on](updating-an-add-on.md). For the add-ons versions that are compatible with hybrid nodes, see [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md).

   1. When you’re done with this page, choose **Next**.

1. On the **Review and create** page, review the information that you entered or selected on the previous pages. If you need to make changes, choose **Edit**. When you’re satisfied, choose **Create**. The **Status** field shows **CREATING** while the cluster is provisioned. Cluster provisioning takes several minutes.

1. Continue with [Step 3: Update kubeconfig](#hybrid-nodes-cluster-create-kubeconfig).

## Step 3: Update kubeconfig
<a name="hybrid-nodes-cluster-create-kubeconfig"></a>

If you created your cluster using `eksctl`, you can skip this step because `eksctl` already updated your kubeconfig for you. Otherwise, enable `kubectl` to communicate with your cluster by adding a new context to the `kubectl` config file. For more information about how to create and update the file, see [Connect kubectl to an EKS cluster by creating a kubeconfig file](create-kubeconfig.md). Replace `CLUSTER_NAME` with the name of your cluster and `AWS_REGION` with the AWS Region where your cluster is running.

```
aws eks update-kubeconfig --name CLUSTER_NAME --region AWS_REGION
```

An example output is as follows.

```
Added new context arn:aws:eks:AWS_REGION:111122223333:cluster/CLUSTER_NAME to /home/username/.kube/config
```

Confirm communication with your cluster by running the following command.

```
kubectl get svc
```

An example output is as follows.

```
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.100.0.1   <none>        443/TCP   28h
```

## Step 4: Cluster setup
<a name="_step_4_cluster_setup"></a>

As a next step, see [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md) to enable access for your hybrid nodes to join your cluster.

# Enable hybrid nodes on an existing Amazon EKS cluster or modify configuration
<a name="hybrid-nodes-cluster-update"></a>

This topic provides an overview of the available options and describes what to consider when you add, change, or remove the hybrid nodes configuration for an Amazon EKS cluster.

To enable an Amazon EKS cluster to use hybrid nodes, add the IP address CIDR ranges of your on-premises node and optionally pod network in the `RemoteNetworkConfig` configuration. EKS uses this list of CIDRs to enable connectivity between the cluster and your on-premises networks. For a full list of options when updating your cluster configuration, see the [UpdateClusterConfig](https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html) in the *Amazon EKS API Reference*.

You can do any of the following actions to the EKS Hybrid Nodes networking configuration in a cluster:
+  [Add remote network configuration to enable EKS Hybrid Nodes in an existing cluster.](#hybrid-nodes-cluster-enable-existing) 
+  [Add, change, or remove the remote node networks or the remote pod networks in an existing cluster.](#hybrid-nodes-cluster-update-config) 
+  [Remove all remote node network CIDR ranges to disable EKS Hybrid Nodes in an existing cluster.](#hybrid-nodes-cluster-disable) 

## Prerequisites
<a name="hybrid-nodes-cluster-enable-prep"></a>
+ Before enabling your Amazon EKS cluster for hybrid nodes, ensure your environment meets the requirements outlined at [Prerequisite setup for hybrid nodes](hybrid-nodes-prereqs.md), and detailed at [Prepare networking for hybrid nodes](hybrid-nodes-networking.md), [Prepare operating system for hybrid nodes](hybrid-nodes-os.md), and [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).
+ Your cluster must use the IPv4 address family.
+ Your cluster must use either `API` or `API_AND_CONFIG_MAP` for the cluster authentication mode. The process for modifying the cluster authentication mode is described at [Change authentication mode to use access entries](setting-up-access-entries.md).
+ We recommend that you use either public or private endpoint access for the Amazon EKS Kubernetes API server endpoint, but not both. If you choose “Public and Private”, the Amazon EKS Kubernetes API server endpoint will always resolve to the public IPs for hybrid nodes running outside of your VPC, which can prevent your hybrid nodes from joining the cluster. The process for modifying network access to your cluster is described at [Cluster API server endpoint](cluster-endpoint.md).
+ The latest version of the AWS Command Line Interface (AWS CLI) installed and configured on your device. To check your current version, use `aws --version`. Package managers such as yum, apt-get, or Homebrew for macOS are often several versions behind the latest version of the AWS CLI. To install the latest version, see [Installing or updating to the latest version of the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) and [Configuring settings for the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-config) in the AWS Command Line Interface User Guide.
+ An [IAM principal](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles#iam-term-principal) with permission to call [UpdateClusterConfig](https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html) on your Amazon EKS cluster.
+ Update add-ons to versions that are compatible with hybrid nodes. For the add-ons versions that are compatible with hybrid nodes, see [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md).
+ If you are running add-ons that are not compatible with hybrid nodes, ensure that the add-on [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) or [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) has the following affinity rule to prevent it from being scheduled on hybrid nodes. Add the rule if it is not already present.

  ```
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
  ```
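
  One way to add this rule to an existing Deployment is with a `kubectl patch` strategic merge patch. The sketch below builds the patch document with Python's `json` module; the Deployment name, namespace, and script file name in the usage comment are illustrative only:

  ```python
  import json

  # Affinity rule that keeps a workload off hybrid nodes (same as the YAML above)
  patch = {
      "spec": {
          "template": {
              "spec": {
                  "affinity": {
                      "nodeAffinity": {
                          "requiredDuringSchedulingIgnoredDuringExecution": {
                              "nodeSelectorTerms": [{
                                  "matchExpressions": [{
                                      "key": "eks.amazonaws.com/compute-type",
                                      "operator": "NotIn",
                                      "values": ["hybrid"],
                                  }]
                              }]
                          }
                      }
                  }
              }
          }
      }
  }

  # Example (hypothetical names): save this script as build_patch.py, then run
  #   kubectl patch deployment my-addon -n kube-system --patch "$(python3 build_patch.py)"
  print(json.dumps(patch))
  ```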

## Considerations
<a name="hybrid-nodes-cluster-enable-consider"></a>

The `remoteNetworkConfig` JSON object has the following behavior during an update:
+ Any existing part of the configuration that you don’t specify is unchanged. For example, if you don’t specify `remoteNodeNetworks` or `remotePodNetworks`, that part remains the same.
+ If you are modifying either the `remoteNodeNetworks` or `remotePodNetworks` lists of CIDRs, you must specify the complete list of CIDRs that you want in your final configuration. When you specify a change to either the `remoteNodeNetworks` or `remotePodNetworks` CIDR list, EKS replaces the original list during the update.
+ Your on-premises node and pod CIDR blocks must meet the following requirements:

  1. Be within one of the IPv4 RFC-1918 ranges: `10.0.0.0/8`, `172.16.0.0/12`, or `192.168.0.0/16`, or within the CGNAT range defined by RFC 6598: `100.64.0.0/10`.

  1. Not overlap with each other, all CIDRs of the VPC for your Amazon EKS cluster, or your Kubernetes service IPv4 CIDR.

## Enable hybrid nodes on an existing cluster
<a name="hybrid-nodes-cluster-enable-existing"></a>

You can enable EKS Hybrid Nodes in an existing cluster by using:
+  [AWS CloudFormation](#hybrid-nodes-cluster-enable-cfn) 
+  [AWS CLI](#hybrid-nodes-cluster-enable-cli) 
+  [AWS Management Console](#hybrid-nodes-cluster-enable-console) 

### Enable EKS Hybrid Nodes in an existing cluster - AWS CloudFormation
<a name="hybrid-nodes-cluster-enable-cfn"></a>

1. To enable EKS Hybrid Nodes in your cluster, add the `RemoteNodeNetwork` and (optional) `RemotePodNetwork` to your CloudFormation template and update the stack. Note that `RemoteNodeNetworks` is a list that accepts at most one item, and that item’s `Cidrs` field is a list of one or more IP CIDR ranges.

   ```
   RemoteNetworkConfig:
     RemoteNodeNetworks:
       - Cidrs: [RemoteNodeCIDR]
     RemotePodNetworks:
       - Cidrs: [RemotePodCIDR]
   ```

1. Continue to [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).

### Enable EKS Hybrid Nodes in an existing cluster - AWS CLI
<a name="hybrid-nodes-cluster-enable-cli"></a>

1. Run the following command to enable `RemoteNetworkConfig` for EKS Hybrid Nodes for your EKS cluster. Before running the command, replace the following with your settings. For a full list of settings, see the [UpdateClusterConfig](https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html) in the *Amazon EKS API Reference*.

   1.  `CLUSTER_NAME`: name of the EKS cluster to update.

   1.  `AWS_REGION`: AWS Region where the EKS cluster is running.

   1.  `REMOTE_NODE_CIDRS`: the on-premises node CIDR for your hybrid nodes.

   1.  `REMOTE_POD_CIDRS` (optional): the on-premises pod CIDR for workloads running on hybrid nodes.

      ```
      aws eks update-cluster-config \
          --name CLUSTER_NAME \
          --region AWS_REGION \
          --remote-network-config '{"remoteNodeNetworks":[{"cidrs":["REMOTE_NODE_CIDRS"]}],"remotePodNetworks":[{"cidrs":["REMOTE_POD_CIDRS"]}]}'
      ```

1. It takes several minutes to update the cluster. You can query the status of your cluster with the following command. Replace `CLUSTER_NAME` with the name of the cluster you are modifying and `AWS_REGION` with the AWS Region where the cluster is running. Don’t proceed to the next step until the output returned is `ACTIVE`.

   ```
   aws eks describe-cluster \
       --name CLUSTER_NAME \
       --region AWS_REGION \
       --query "cluster.status"
   ```

1. Continue to [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).
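
The “wait until `ACTIVE`” steps in these procedures can be scripted as a simple polling loop. In this sketch, `get_status` is a stand-in for calling `aws eks describe-cluster --query "cluster.status"`; in practice you would swap in a real call (for example via `subprocess` or boto3):

```python
import time

def wait_for_active(get_status, timeout=1800, interval=30):
    """Poll get_status() until it returns ACTIVE, or raise on failure/timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == "ACTIVE":
            return status
        if status in ("FAILED", "DELETING"):
            raise RuntimeError(f"cluster entered unexpected state: {status}")
        time.sleep(interval)
    raise TimeoutError("cluster did not become ACTIVE in time")

# Example with a fake status source that becomes ACTIVE on the third poll
statuses = iter(["UPDATING", "UPDATING", "ACTIVE"])
print(wait_for_active(lambda: next(statuses), interval=0))
```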

### Enable EKS Hybrid Nodes in an existing cluster - AWS Management Console
<a name="hybrid-nodes-cluster-enable-console"></a>

1. Open the Amazon EKS console at [Amazon EKS console](https://console.aws.amazon.com/eks/home#/clusters).

1. Choose the name of the cluster to display your cluster information.

1. Choose the **Networking** tab and choose **Manage**.

1. In the dropdown, choose **Remote networks**.

1.  **Choose Configure remote networks to enable hybrid nodes** and specify your on-premises node and pod CIDRs for hybrid nodes.

1. Choose **Save changes** to finish. Wait for the cluster status to return to **Active**.

1. Continue to [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).

## Update hybrid nodes configuration in an existing cluster
<a name="hybrid-nodes-cluster-update-config"></a>

You can modify `remoteNetworkConfig` in an existing hybrid cluster by using any of the following:
+  [AWS CloudFormation](#hybrid-nodes-cluster-update-cfn) 
+  [AWS CLI](#hybrid-nodes-cluster-update-cli) 
+  [AWS Management Console](#hybrid-nodes-cluster-update-console) 

### Update hybrid configuration in an existing cluster - AWS CloudFormation
<a name="hybrid-nodes-cluster-update-cfn"></a>

1. Update your CloudFormation template with the new network CIDR values.

   ```
   RemoteNetworkConfig:
     RemoteNodeNetworks:
       - Cidrs: [NEW_REMOTE_NODE_CIDRS]
     RemotePodNetworks:
       - Cidrs: [NEW_REMOTE_POD_CIDRS]
   ```
**Note**  
When updating `RemoteNodeNetworks` or `RemotePodNetworks` CIDR lists, include all CIDRs (new and existing). EKS replaces the entire list during updates. Omitting these fields from the update request retains their existing configurations.

1. Update your CloudFormation stack with the modified template and wait for the stack update to complete.

### Update hybrid configuration in an existing cluster - AWS CLI
<a name="hybrid-nodes-cluster-update-cli"></a>

1. To modify the remote network CIDRs, run the following command. Replace the values with your settings:

   ```
   aws eks update-cluster-config \
       --name CLUSTER_NAME \
       --region AWS_REGION \
       --remote-network-config '{"remoteNodeNetworks":[{"cidrs":["NEW_REMOTE_NODE_CIDRS"]}],"remotePodNetworks":[{"cidrs":["NEW_REMOTE_POD_CIDRS"]}]}'
   ```
**Note**  
When updating `remoteNodeNetworks` or `remotePodNetworks` CIDR lists, include all CIDRs (new and existing). EKS replaces the entire list during updates. Omitting these fields from the update request retains their existing configurations.

1. Wait for the cluster status to return to `ACTIVE` before proceeding.
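
Because the update replaces the entire CIDR list, a safe pattern is to read the current lists first and send the union of existing and new CIDRs. A sketch of the merge step (in practice the existing list would come from `aws eks describe-cluster`):

```python
def merged_cidrs(existing, new):
    """Combine existing and new CIDRs, preserving order and dropping duplicates."""
    seen, result = set(), []
    for cidr in list(existing) + list(new):
        if cidr not in seen:
            seen.add(cidr)
            result.append(cidr)
    return result

# Existing cluster CIDRs plus a newly added on-premises range (placeholder values)
print(merged_cidrs(["10.80.0.0/16"], ["10.90.0.0/16", "10.80.0.0/16"]))
```

The merged list is what you place in the `cidrs` arrays of the `--remote-network-config` JSON, so nothing already configured is dropped.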

### Update hybrid configuration in an existing cluster - AWS Management Console
<a name="hybrid-nodes-cluster-update-console"></a>

1. Open the Amazon EKS console at [Amazon EKS console](https://console.aws.amazon.com/eks/home#/clusters).

1. Choose the name of the cluster to display your cluster information.

1. Choose the **Networking** tab and choose **Manage**.

1. In the dropdown, choose **Remote networks**.

1. Update the CIDRs under `Remote node networks` and `Remote pod networks - Optional` as needed.

1. Choose **Save changes** and wait for the cluster status to return to **Active**.

## Disable Hybrid nodes in an existing cluster
<a name="hybrid-nodes-cluster-disable"></a>

You can disable EKS Hybrid Nodes in an existing cluster by using:
+  [AWS CloudFormation](#hybrid-nodes-cluster-disable-cfn) 
+  [AWS CLI](#hybrid-nodes-cluster-disable-cli) 
+  [AWS Management Console](#hybrid-nodes-cluster-disable-console) 

### Disable EKS Hybrid Nodes in an existing cluster - AWS CloudFormation
<a name="hybrid-nodes-cluster-disable-cfn"></a>

1. To disable EKS Hybrid Nodes in your cluster, set `RemoteNodeNetworks` and `RemotePodNetworks` to empty arrays in your CloudFormation template and update the stack.

   ```
   RemoteNetworkConfig:
     RemoteNodeNetworks: []
     RemotePodNetworks: []
   ```

### Disable EKS Hybrid Nodes in an existing cluster - AWS CLI
<a name="hybrid-nodes-cluster-disable-cli"></a>

1. Run the following command to remove `RemoteNetworkConfig` from your EKS cluster. Before running the command, replace the following with your settings. For a full list of settings, see the [UpdateClusterConfig](https://docs.aws.amazon.com/eks/latest/APIReference/API_UpdateClusterConfig.html) in the *Amazon EKS API Reference*.

   1.  `CLUSTER_NAME`: name of the EKS cluster to update.

   1.  `AWS_REGION`: AWS Region where the EKS cluster is running.

      ```
      aws eks update-cluster-config \
          --name CLUSTER_NAME \
          --region AWS_REGION \
          --remote-network-config '{"remoteNodeNetworks":[],"remotePodNetworks":[]}'
      ```

1. It takes several minutes to update the cluster. You can query the status of your cluster with the following command. Replace `CLUSTER_NAME` with the name of the cluster you are modifying and `AWS_REGION` with the AWS Region where the cluster is running. Don’t proceed to the next step until the output returned is `ACTIVE`.

   ```
   aws eks describe-cluster \
       --name CLUSTER_NAME \
       --region AWS_REGION \
       --query "cluster.status"
   ```
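The status check above can be wrapped in a simple polling loop so that a script blocks until the cluster returns to `ACTIVE`. The following is a sketch only; `wait_for_cluster_active` is a hypothetical helper name, and it assumes the AWS CLI is installed and configured.

```
# Sketch: poll the cluster status until it returns ACTIVE.
# wait_for_cluster_active is a hypothetical helper; assumes the AWS CLI is configured.
wait_for_cluster_active() {
  cluster_name="$1"
  region="$2"
  while true; do
    status=$(aws eks describe-cluster \
        --name "$cluster_name" \
        --region "$region" \
        --query "cluster.status" \
        --output text)
    if [ "$status" = "ACTIVE" ]; then
      break
    fi
    echo "Cluster status is $status; waiting..."
    sleep 30
  done
}

# Example: wait_for_cluster_active CLUSTER_NAME AWS_REGION
```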

### Disable EKS Hybrid Nodes in an existing cluster - AWS Management Console
<a name="hybrid-nodes-cluster-disable-console"></a>

1. Open the Amazon EKS console at [Amazon EKS console](https://console.aws.amazon.com/eks/home#/clusters).

1. Choose the name of the cluster to display your cluster information.

1. Choose the **Networking** tab and choose **Manage**.

1. In the dropdown, choose **Remote networks**.

1. Choose **Configure remote networks to enable hybrid nodes** and remove all the CIDRs under `Remote node networks` and `Remote pod networks - Optional`.

1. Choose **Save changes** to finish. Wait for the cluster status to return to **Active**.

# Prepare cluster access for hybrid nodes
<a name="hybrid-nodes-cluster-prep"></a>

Before connecting hybrid nodes to your Amazon EKS cluster, you must enable your Hybrid Nodes IAM Role with Kubernetes permissions to join the cluster. See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for information on how to create the Hybrid Nodes IAM role. Amazon EKS supports two ways to associate IAM principals with Kubernetes Role-Based Access Control (RBAC), Amazon EKS access entries and the `aws-auth` ConfigMap. For more information on Amazon EKS access management, see [Grant IAM users and roles access to Kubernetes APIs](grant-k8s-access.md).

Use the procedures below to associate your Hybrid Nodes IAM role with Kubernetes permissions. To use Amazon EKS access entries, your cluster must have been created with the `API` or `API_AND_CONFIG_MAP` authentication modes. To use the `aws-auth` ConfigMap, your cluster must have been created with the `API_AND_CONFIG_MAP` authentication mode. The `CONFIG_MAP`-only authentication mode is not supported for hybrid nodes-enabled Amazon EKS clusters.

## Using Amazon EKS access entries for Hybrid Nodes IAM role
<a name="_using_amazon_eks_access_entries_for_hybrid_nodes_iam_role"></a>

There is an Amazon EKS access entry type for hybrid nodes named `HYBRID_LINUX` that can be used with an IAM role. With this access entry type, the username is automatically set to `system:node:{{SessionName}}`. For more information on creating access entries, see [Create access entries](creating-access-entries.md).

### AWS CLI
<a name="shared_aws_cli"></a>

1. You must have the latest version of the AWS CLI installed and configured on your device. To check your current version, use `aws --version`. Package managers such as yum, apt-get, or Homebrew for macOS are often several versions behind the latest version of the AWS CLI. To install the latest version, see *Installing* and *Quick configuration with aws configure* in the *AWS Command Line Interface User Guide*.

1. Create your access entry with the following command. Replace `CLUSTER_NAME` with the name of your cluster and `HYBRID_NODES_ROLE_ARN` with the ARN of the role you created in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

   ```
   aws eks create-access-entry --cluster-name CLUSTER_NAME \
       --principal-arn HYBRID_NODES_ROLE_ARN \
       --type HYBRID_LINUX
   ```
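After creating the access entry, you can optionally confirm that it exists and has the expected type with `aws eks describe-access-entry`. This is a sketch; `check_access_entry` is a hypothetical helper name, and the placeholders follow the command above.

```
# Sketch: confirm the access entry exists with type HYBRID_LINUX.
# check_access_entry is a hypothetical helper; assumes the AWS CLI is configured.
check_access_entry() {
  aws eks describe-access-entry \
      --cluster-name "$1" \
      --principal-arn "$2" \
      --query "accessEntry.type" \
      --output text
}

# Example: check_access_entry CLUSTER_NAME HYBRID_NODES_ROLE_ARN
```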

### AWS Management Console
<a name="hybrid-nodes-cluster-prep-console"></a>

1. Open the Amazon EKS console at [Amazon EKS console](https://console.aws.amazon.com/eks/home#/clusters).

1. Choose the name of your hybrid nodes-enabled cluster.

1. Choose the **Access** tab.

1. Choose **Create access entry**.

1. For **IAM principal**, select the Hybrid Nodes IAM role you created in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

1. For **Type**, select **Hybrid Linux**.

1. (Optional) For **Tags**, assign labels to the access entry. For example, to make it easier to find all resources with the same tag.

1. Choose **Skip to review and create**. You cannot add policies to the Hybrid Linux access entry or change its access scope.

1. Review the configuration for your access entry. If anything looks incorrect, choose **Previous** to go back through the steps and correct the error. If the configuration is correct, choose **Create**.

## Using aws-auth ConfigMap for Hybrid Nodes IAM role
<a name="_using_aws_auth_configmap_for_hybrid_nodes_iam_role"></a>

In the following steps, you will create or update the `aws-auth` ConfigMap with the ARN of the Hybrid Nodes IAM Role you created in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

1. Check to see if you have an existing `aws-auth` ConfigMap for your cluster. Note that if you are using a specific `kubeconfig` file, use the `--kubeconfig` flag.

   ```
   kubectl describe configmap -n kube-system aws-auth
   ```

1. If you are shown an `aws-auth` ConfigMap, then update it as needed.

   1. Open the ConfigMap for editing.

      ```
      kubectl edit -n kube-system configmap/aws-auth
      ```

   1. Add a new `mapRoles` entry as needed. Replace `HYBRID_NODES_ROLE_ARN` with the ARN of your Hybrid Nodes IAM role. Note, `{{SessionName}}` is the correct template format to save in the ConfigMap. Do not replace it with other values.

      ```
      data:
        mapRoles: |
          - groups:
            - system:bootstrappers
            - system:nodes
            rolearn: HYBRID_NODES_ROLE_ARN
            username: system:node:{{SessionName}}
      ```

   1. Save the file and exit your text editor.

1. If there is not an existing `aws-auth` ConfigMap for your cluster, create it with the following command. Replace `HYBRID_NODES_ROLE_ARN` with the ARN of your Hybrid Nodes IAM role. Note that `{{SessionName}}` is the correct template format to save in the ConfigMap. Do not replace it with other values.

   ```
   kubectl apply -f=/dev/stdin <<-EOF
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: aws-auth
     namespace: kube-system
   data:
     mapRoles: |
       - groups:
         - system:bootstrappers
         - system:nodes
         rolearn: HYBRID_NODES_ROLE_ARN
         username: system:node:{{SessionName}}
   EOF
   ```
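To confirm the role mapping was saved, you can check that the Hybrid Nodes role ARN appears in the `aws-auth` ConfigMap. This is a sketch; `verify_aws_auth_mapping` is a hypothetical helper name.

```
# Sketch: check that the Hybrid Nodes role ARN is present in aws-auth.
# verify_aws_auth_mapping is a hypothetical helper; assumes kubectl is configured.
verify_aws_auth_mapping() {
  kubectl get configmap aws-auth -n kube-system -o yaml | grep -q "$1"
}

# Example: verify_aws_auth_mapping HYBRID_NODES_ROLE_ARN && echo "mapping present"
```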

# Run on-premises workloads on hybrid nodes
<a name="hybrid-nodes-tutorial"></a>

In an EKS cluster with hybrid nodes enabled, you can run on-premises and edge applications on your own infrastructure with the same Amazon EKS clusters, features, and tools that you use in AWS Cloud.

The following sections contain step-by-step instructions for using hybrid nodes.

**Topics**
+ [Connect hybrid nodes](hybrid-nodes-join.md)
+ [Connect hybrid nodes with Bottlerocket](hybrid-nodes-bottlerocket.md)
+ [Upgrade hybrid nodes](hybrid-nodes-upgrade.md)
+ [Patch hybrid nodes](hybrid-nodes-security.md)
+ [Delete hybrid nodes](hybrid-nodes-remove.md)

# Connect hybrid nodes
<a name="hybrid-nodes-join"></a>

**Note**  
The following steps apply to hybrid nodes running compatible operating systems except Bottlerocket. For steps to connect a hybrid node that runs Bottlerocket, see [Connect hybrid nodes with Bottlerocket](hybrid-nodes-bottlerocket.md).

This topic describes how to connect hybrid nodes to an Amazon EKS cluster. After your hybrid nodes join the cluster, they will appear with status `Not Ready` in the Amazon EKS console and in Kubernetes-compatible tooling such as `kubectl`. After completing the steps on this page, proceed to [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) to make your hybrid nodes ready to run applications.

## Prerequisites
<a name="_prerequisites"></a>

Before connecting hybrid nodes to your Amazon EKS cluster, make sure you have completed the prerequisite steps.
+ You have network connectivity from your on-premises environment to the AWS Region hosting your Amazon EKS cluster. See [Prepare networking for hybrid nodes](hybrid-nodes-networking.md) for more information.
+ You have a compatible operating system for hybrid nodes installed on your on-premises hosts. See [Prepare operating system for hybrid nodes](hybrid-nodes-os.md) for more information.
+ You have created your Hybrid Nodes IAM role and set up your on-premises credential provider (AWS Systems Manager hybrid activations or AWS IAM Roles Anywhere). See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for more information.
+ You have created your hybrid nodes-enabled Amazon EKS cluster. See [Create an Amazon EKS cluster with hybrid nodes](hybrid-nodes-cluster-create.md) for more information.
+ You have associated your Hybrid Nodes IAM role with Kubernetes Role-Based Access Control (RBAC) permissions. See [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md) for more information.

## Step 1: Install the hybrid nodes CLI (`nodeadm`) on each on-premises host
<a name="_step_1_install_the_hybrid_nodes_cli_nodeadm_on_each_on_premises_host"></a>

If you are including the Amazon EKS Hybrid Nodes CLI (`nodeadm`) in your pre-built operating system images, you can skip this step. For more information on the hybrid nodes version of `nodeadm`, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

The hybrid nodes version of `nodeadm` is hosted in Amazon S3 and fronted by Amazon CloudFront. To install `nodeadm`, run the appropriate command below on each on-premises host.

 **For x86_64 hosts:** 

```
curl -OL 'https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/amd64/nodeadm'
```

 **For ARM hosts:** 

```
curl -OL 'https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/arm64/nodeadm'
```

Add executable file permission to the downloaded binary on each host.

```
chmod +x nodeadm
```
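If you provision a mix of x86_64 and ARM hosts, the two download commands above can be collapsed into a single architecture-aware sketch. The URL pattern is taken from the commands above; `NODEADM_ARCH` and `NODEADM_URL` are hypothetical variable names used for illustration.

```
# Sketch: select the nodeadm download URL matching the host architecture.
case "$(uname -m)" in
  x86_64)        NODEADM_ARCH=amd64 ;;
  aarch64|arm64) NODEADM_ARCH=arm64 ;;
  *) echo "unsupported architecture: $(uname -m)" >&2; exit 1 ;;
esac
NODEADM_URL="https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/${NODEADM_ARCH}/nodeadm"
echo "$NODEADM_URL"

# Then download and mark executable:
# curl -OL "$NODEADM_URL" && chmod +x nodeadm
```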

## Step 2: Install the hybrid nodes dependencies with `nodeadm`
<a name="_step_2_install_the_hybrid_nodes_dependencies_with_nodeadm"></a>

If you are installing the hybrid nodes dependencies in pre-built operating system images, you can skip this step. The `nodeadm install` command installs all dependencies required for hybrid nodes, including containerd, kubelet, kubectl, and the AWS SSM or AWS IAM Roles Anywhere components. See [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md) for more information on the components and file locations installed by `nodeadm install`. See [Prepare networking for hybrid nodes](hybrid-nodes-networking.md) for more information on the domains that must be allowed in your on-premises firewall for the `nodeadm install` process.

Run the following command to install the hybrid nodes dependencies on your on-premises host.

**Important**  
The hybrid nodes CLI (`nodeadm`) must be run with a user that has sudo/root access on your host.
+ Replace `K8S_VERSION` with the Kubernetes minor version of your Amazon EKS cluster, for example `1.31`. See [Amazon EKS supported versions](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html) for a list of the supported Kubernetes versions.
+ Replace `CREDS_PROVIDER` with the on-premises credential provider you are using. Valid values are `ssm` for AWS SSM and `iam-ra` for AWS IAM Roles Anywhere.

```
nodeadm install K8S_VERSION --credential-provider CREDS_PROVIDER
```

## Step 3: Connect hybrid nodes to your cluster
<a name="_step_3_connect_hybrid_nodes_to_your_cluster"></a>

Before connecting your hybrid nodes to your cluster, make sure you have allowed the required access in your on-premises firewall and in the security group for your cluster for communication between the Amazon EKS control plane and your hybrid nodes. Most issues at this step are related to the firewall configuration, security group configuration, or Hybrid Nodes IAM role configuration.

**Important**  
The hybrid nodes CLI (`nodeadm`) must be run with a user that has sudo/root access on your host.

1. Create a `nodeConfig.yaml` file on each host with the values for your deployment. For a full description of the available configuration settings, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md). If your Hybrid Nodes IAM role does not have permission for the `eks:DescribeCluster` action, you must pass your Kubernetes API endpoint, cluster CA bundle, and Kubernetes service IPv4 CIDR in the cluster section of your `nodeConfig.yaml`.

   1. Use the `nodeConfig.yaml` example below if you are using AWS SSM hybrid activations for your on-premises credentials provider.

      1. Replace `CLUSTER_NAME` with the name of your cluster.

      1. Replace `AWS_REGION` with the AWS Region hosting your cluster. For example, `us-west-2`.

      1. Replace `ACTIVATION_CODE` with the activation code you received when creating your AWS SSM hybrid activation. See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for more information.

      1. Replace `ACTIVATION_ID` with the activation ID you received when creating your AWS SSM hybrid activation. You can retrieve this information from the AWS Systems Manager console or from the AWS CLI `aws ssm describe-activations` command.

         ```
         apiVersion: node.eks.aws/v1alpha1
         kind: NodeConfig
         spec:
           cluster:
             name: CLUSTER_NAME
             region: AWS_REGION
           hybrid:
             ssm:
               activationCode: ACTIVATION_CODE
               activationId: ACTIVATION_ID
         ```

   1. Use the `nodeConfig.yaml` example below if you are using AWS IAM Roles Anywhere for your on-premises credentials provider.

      1. Replace `CLUSTER_NAME` with the name of your cluster.

      1. Replace `AWS_REGION` with the AWS Region hosting your cluster. For example, `us-west-2`.

      1. Replace `NODE_NAME` with the name of your node. The node name must match the CN of the certificate on the host if you configured the trust policy of your Hybrid Nodes IAM role with the `"sts:RoleSessionName": "${aws:PrincipalTag/x509Subject/CN}"` resource condition. The `nodeName` you use must not be longer than 64 characters.

      1. Replace `TRUST_ANCHOR_ARN` with the ARN of the trust anchor you configured in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

      1. Replace `PROFILE_ARN` with the ARN of the profile you configured in the steps for [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

      1. Replace `ROLE_ARN` with the ARN of your Hybrid Nodes IAM role.

      1. Replace `CERTIFICATE_PATH` with the path on disk to your node certificate. If you don’t specify it, the default is `/etc/iam/pki/server.pem`.

      1. Replace `KEY_PATH` with the path on disk to your certificate private key. If you don’t specify it, the default is `/etc/iam/pki/server.key`.

         ```
         apiVersion: node.eks.aws/v1alpha1
         kind: NodeConfig
         spec:
           cluster:
             name: CLUSTER_NAME
             region: AWS_REGION
           hybrid:
             iamRolesAnywhere:
               nodeName: NODE_NAME
               trustAnchorArn: TRUST_ANCHOR_ARN
               profileArn: PROFILE_ARN
               roleArn: ROLE_ARN
               certificatePath: CERTIFICATE_PATH
               privateKeyPath: KEY_PATH
         ```

1. Run the `nodeadm init` command with your `nodeConfig.yaml` to connect your hybrid nodes to your Amazon EKS cluster.

   ```
   nodeadm init -c file://nodeConfig.yaml
   ```

If the above command completes successfully, your hybrid node has joined your Amazon EKS cluster. You can verify this in the Amazon EKS console by navigating to the Compute tab for your cluster ([ensure IAM principal has permissions to view](view-kubernetes-resources.md#view-kubernetes-resources-permissions)) or with `kubectl get nodes`.

**Important**  
Your nodes will have status `Not Ready`, which is expected and is due to the lack of a CNI running on your hybrid nodes. If your nodes did not join the cluster, see [Troubleshooting hybrid nodes](hybrid-nodes-troubleshooting.md).

## Step 4: Configure a CNI for hybrid nodes
<a name="_step_4_configure_a_cni_for_hybrid_nodes"></a>

To make your hybrid nodes ready to run applications, continue with the steps on [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

# Connect hybrid nodes with Bottlerocket
<a name="hybrid-nodes-bottlerocket"></a>

This topic describes how to connect hybrid nodes running Bottlerocket to an Amazon EKS cluster. [Bottlerocket](https://aws.amazon.com/bottlerocket/) is an open source Linux distribution that is sponsored and supported by AWS. Bottlerocket is purpose-built for hosting container workloads. With Bottlerocket, you can improve the availability of containerized deployments and reduce operational costs by automating updates to your container infrastructure. Bottlerocket includes only the essential software to run containers, which improves resource usage, reduces security threats, and lowers management overhead.

Only VMware variants of Bottlerocket version v1.37.0 and above are supported with EKS Hybrid Nodes. VMware variants of Bottlerocket are available for Kubernetes versions v1.28 and above. The OS images for these variants include the kubelet, containerd, aws-iam-authenticator and other software prerequisites for EKS Hybrid Nodes. You can configure these components using a Bottlerocket [settings](https://github.com/bottlerocket-os/bottlerocket#settings) file that includes base64 encoded user-data for the Bottlerocket bootstrap and admin containers. Configuring these settings enables Bottlerocket to use your hybrid nodes credentials provider to authenticate hybrid nodes to your cluster. After your hybrid nodes join the cluster, they will appear with status `Not Ready` in the Amazon EKS console and in Kubernetes-compatible tooling such as `kubectl`. After completing the steps on this page, proceed to [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) to make your hybrid nodes ready to run applications.

## Prerequisites
<a name="_prerequisites"></a>

Before connecting hybrid nodes to your Amazon EKS cluster, make sure you have completed the prerequisite steps.
+ You have network connectivity from your on-premises environment to the AWS Region hosting your Amazon EKS cluster. See [Prepare networking for hybrid nodes](hybrid-nodes-networking.md) for more information.
+ You have created your Hybrid Nodes IAM role and set up your on-premises credential provider (AWS Systems Manager hybrid activations or AWS IAM Roles Anywhere). See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for more information.
+ You have created your hybrid nodes-enabled Amazon EKS cluster. See [Create an Amazon EKS cluster with hybrid nodes](hybrid-nodes-cluster-create.md) for more information.
+ You have associated your Hybrid Nodes IAM role with Kubernetes Role-Based Access Control (RBAC) permissions. See [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md) for more information.

## Step 1: Create the Bottlerocket settings TOML file
<a name="_step_1_create_the_bottlerocket_settings_toml_file"></a>

To configure Bottlerocket for hybrid nodes, you need to create a `settings.toml` file with the necessary configuration. The contents of the TOML file will differ based on the credential provider you are using (SSM or IAM Roles Anywhere). This file will be passed as user data when provisioning the Bottlerocket instance.

**Note**  
The TOML files provided below represent only the minimum required settings for initializing a Bottlerocket VMware machine as a node on an EKS cluster. Bottlerocket provides a wide range of settings for many different use cases. For configuration options beyond hybrid node initialization, refer to the [Bottlerocket documentation](https://bottlerocket.dev/en) for the comprehensive list of documented settings for the Bottlerocket version you are using (for example, [here](https://bottlerocket.dev/en/os/1.51.x/api/settings-index) are all the settings available for Bottlerocket 1.51.x).

### SSM
<a name="_ssm"></a>

If you are using AWS Systems Manager as your credential provider, create a `settings.toml` file with the following content:

```
[settings.kubernetes]
cluster-name = "<cluster-name>"
api-server = "<api-server-endpoint>"
cluster-certificate = "<cluster-certificate-authority>"
hostname-override = "<hostname>"
provider-id = "eks-hybrid:///<region>/<cluster-name>/<hostname>"
authentication-mode = "aws"
cloud-provider = ""
server-tls-bootstrap = true

[settings.network]
hostname = "<hostname>"

[settings.aws]
region = "<region>"

[settings.kubernetes.credential-providers.ecr-credential-provider]
enabled = true
cache-duration = "12h"
image-patterns = [
    "*.dkr.ecr.*.amazonaws.com",
    "*.dkr.ecr.*.amazonaws.com.rproxy.goskope.com.cn",
    "*.dkr.ecr.*.amazonaws.eu",
    "*.dkr.ecr-fips.*.amazonaws.com",
    "*.dkr.ecr-fips.*.amazonaws.eu",
    "public.ecr.aws"
]

[settings.kubernetes.node-labels]
"eks.amazonaws.com/compute-type" = "hybrid"
"eks.amazonaws.com/hybrid-credential-provider" = "ssm"

[settings.host-containers.admin]
enabled = true
user-data = "<base64-encoded-admin-container-userdata>"

[settings.bootstrap-containers.eks-hybrid-setup]
mode = "always"
user-data = "<base64-encoded-bootstrap-container-userdata>"

[settings.host-containers.control]
enabled = true
```

Replace the placeholders with the following values:
+  `<cluster-name>`: The name of your Amazon EKS cluster.
+  `<api-server-endpoint>`: The API server endpoint of your cluster.
+  `<cluster-certificate-authority>`: The base64-encoded CA bundle of your cluster.
+  `<region>`: The AWS Region hosting your cluster, for example "us-east-1".
+  `<hostname>`: The hostname of the Bottlerocket instance, which will also be configured as the node name. This can be any unique value of your choice, but must follow the [Kubernetes Object naming conventions](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). In addition, the hostname you use cannot be longer than 64 characters. NOTE: When using SSM provider, this hostname and node name will be replaced by the managed instance ID (for example, `mi-*` ID) after the instance has been registered with SSM.
+  `<base64-encoded-admin-container-userdata>`: The base64-encoded contents of the Bottlerocket admin container configuration. Enabling the admin container allows you to connect to your Bottlerocket instance with SSH for system exploration and debugging. While this is not a required setting, we recommend enabling it for ease of troubleshooting. Refer to the [Bottlerocket admin container documentation](https://github.com/bottlerocket-os/bottlerocket-admin-container#authenticating-with-the-admin-container) for more information on authenticating with the admin container. The admin container takes SSH user and key input in JSON format, for example,

```
{
  "user": "<ssh-user>",
  "ssh": {
    "authorized-keys": [
      "<ssh-authorized-key>"
    ]
  }
}
```
+  `<base64-encoded-bootstrap-container-userdata>`: The base64-encoded contents of the Bottlerocket bootstrap container configuration. Refer to the [Bottlerocket bootstrap container documentation](https://github.com/bottlerocket-os/bottlerocket-bootstrap-container) for more information on its configuration. The bootstrap container is responsible for registering the instance as an AWS SSM Managed Instance and joining it as a Kubernetes node on your Amazon EKS Cluster. The user data passed into the bootstrap container takes the form of a command invocation which accepts as input the SSM hybrid activation code and ID you previously created:

```
eks-hybrid-ssm-setup --activation-id=<activation-id> --activation-code=<activation-code> --region=<region>
```

### IAM Roles Anywhere
<a name="_iam_roles_anywhere"></a>

If you are using AWS IAM Roles Anywhere as your credential provider, create a `settings.toml` file with the following content:

```
[settings.kubernetes]
cluster-name = "<cluster-name>"
api-server = "<api-server-endpoint>"
cluster-certificate = "<cluster-certificate-authority>"
hostname-override = "<hostname>"
provider-id = "eks-hybrid:///<region>/<cluster-name>/<hostname>"
authentication-mode = "aws"
cloud-provider = ""
server-tls-bootstrap = true

[settings.network]
hostname = "<hostname>"

[settings.aws]
region = "<region>"
config = "<base64-encoded-aws-config-file>"

[settings.kubernetes.credential-providers.ecr-credential-provider]
enabled = true
cache-duration = "12h"
image-patterns = [
    "*.dkr.ecr.*.amazonaws.com",
    "*.dkr.ecr.*.amazonaws.com.rproxy.goskope.com.cn",
    "*.dkr.ecr.*.amazonaws.eu",
    "*.dkr.ecr-fips.*.amazonaws.com",
    "*.dkr.ecr-fips.*.amazonaws.eu",
    "public.ecr.aws"
]

[settings.kubernetes.node-labels]
"eks.amazonaws.com/compute-type" = "hybrid"
"eks.amazonaws.com/hybrid-credential-provider" = "iam-ra"

[settings.host-containers.admin]
enabled = true
user-data = "<base64-encoded-admin-container-userdata>"

[settings.bootstrap-containers.eks-hybrid-setup]
mode = "always"
user-data = "<base64-encoded-bootstrap-container-userdata>"
```

Replace the placeholders with the following values:
+  `<cluster-name>`: The name of your Amazon EKS cluster.
+  `<api-server-endpoint>`: The API server endpoint of your cluster.
+  `<cluster-certificate-authority>`: The base64-encoded CA bundle of your cluster.
+  `<region>`: The AWS Region hosting your cluster, for example, "us-east-1".
+  `<hostname>`: The hostname of the Bottlerocket instance, which will also be configured as the node name. This can be any unique value of your choice, but must follow the [Kubernetes Object naming conventions](https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names). In addition, the hostname you use cannot be longer than 64 characters. NOTE: When using IAM-RA provider, the node name must match the CN of the certificate on the host if you configured the trust policy of your Hybrid Nodes IAM role with the `"sts:RoleSessionName": "${aws:PrincipalTag/x509Subject/CN}"` resource condition.
+  `<base64-encoded-aws-config-file>`: The base64-encoded contents of your AWS config file. The contents of the file should be as follows:

```
[default]
credential_process = aws_signing_helper credential-process --certificate /root/.aws/node.crt --private-key /root/.aws/node.key --profile-arn <profile-arn> --role-arn <role-arn> --trust-anchor-arn <trust-anchor-arn> --role-session-name <role-session-name>
```
+  `<base64-encoded-admin-container-userdata>`: The base64-encoded contents of the Bottlerocket admin container configuration. Enabling the admin container allows you to connect to your Bottlerocket instance with SSH for system exploration and debugging. While this is not a required setting, we recommend enabling it for ease of troubleshooting. Refer to the [Bottlerocket admin container documentation](https://github.com/bottlerocket-os/bottlerocket-admin-container#authenticating-with-the-admin-container) for more information on authenticating with the admin container. The admin container takes SSH user and key input in JSON format, for example,

```
{
  "user": "<ssh-user>",
  "ssh": {
    "authorized-keys": [
      "<ssh-authorized-key>"
    ]
  }
}
```
+  `<base64-encoded-bootstrap-container-userdata>`: The base64-encoded contents of the Bottlerocket bootstrap container configuration. Refer to the [Bottlerocket bootstrap container documentation](https://github.com/bottlerocket-os/bottlerocket-bootstrap-container) for more information on its configuration. The bootstrap container is responsible for creating the IAM Roles Anywhere host certificate and certificate private key files on the instance. These will then be consumed by the `aws_signing_helper` to obtain temporary credentials for authenticating with your Amazon EKS cluster. The user data passed into the bootstrap container takes the form of a command invocation which accepts as input the contents of the certificate and private key you previously created:

```
eks-hybrid-iam-ra-setup --certificate=<certificate> --key=<private-key>
```

## Step 2: Provision the Bottlerocket vSphere VM with user data
<a name="_step_2_provision_the_bottlerocket_vsphere_vm_with_user_data"></a>

Once you have constructed the TOML file, pass it as user data during vSphere VM creation. The user data must be configured before the VM is powered on for the first time, so supply it when creating the VM, or, if you create the VM ahead of time, keep the VM powered off until you have configured its user data. For example, if using the `govc` CLI:

### Creating VM for the first time
<a name="_creating_vm_for_the_first_time"></a>

```
govc vm.create \
  -on=true \
  -c=2 \
  -m=4096 \
  -net.adapter=<network-adapter> \
  -net=<network-name> \
  -e guestinfo.userdata.encoding="base64" \
  -e guestinfo.userdata="$(base64 -w0 settings.toml)" \
  -template=<template-name> \
  <vm-name>
```

### Updating user data for an existing VM
<a name="_updating_user_data_for_an_existing_vm"></a>

```
govc vm.create \
    -on=false \
    -c=2 \
    -m=4096 \
    -net.adapter=<network-adapter> \
    -net=<network-name> \
    -template=<template-name> \
    <vm-name>

govc vm.change \
    -vm <vm-name> \
    -e guestinfo.userdata="$(base64 -w0 settings.toml)" \
    -e guestinfo.userdata.encoding="base64"

govc vm.power -on <vm-name>
```

In the above sections, the `-e guestinfo.userdata.encoding="base64"` option specifies that the user data is base64-encoded. The `-e guestinfo.userdata` option passes the base64-encoded contents of the `settings.toml` file as user data to the Bottlerocket instance. Replace the placeholders with your specific values, such as the Bottlerocket OVA template and networking details.

## Step 3: Verify the hybrid node connection
<a name="_step_3_verify_the_hybrid_node_connection"></a>

After the Bottlerocket instance starts, it will attempt to join your Amazon EKS cluster. You can verify the connection in the Amazon EKS console by navigating to the Compute tab for your cluster or by running the following command:

```
kubectl get nodes
```

**Important**  
Your nodes will have status `Not Ready`, which is expected and is due to the lack of a CNI running on your hybrid nodes. If your nodes did not join the cluster, see [Troubleshooting hybrid nodes](hybrid-nodes-troubleshooting.md).

## Step 4: Configure a CNI for hybrid nodes
<a name="_step_4_configure_a_cni_for_hybrid_nodes"></a>

To make your hybrid nodes ready to run applications, continue with the steps on [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

# Upgrade hybrid nodes for your cluster
<a name="hybrid-nodes-upgrade"></a>

The guidance for upgrading hybrid nodes is similar to self-managed Amazon EKS nodes that run in Amazon EC2. We recommend that you create new hybrid nodes on your target Kubernetes version, gracefully migrate your existing applications to the hybrid nodes on the new Kubernetes version, and remove the hybrid nodes on the old Kubernetes version from your cluster. Be sure to review the [Amazon EKS Best Practices](https://docs.aws.amazon.com/eks/latest/best-practices/cluster-upgrades.html) for upgrades before initiating an upgrade. Amazon EKS Hybrid Nodes have the same [Kubernetes version support](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html) for Amazon EKS clusters with cloud nodes, including standard and extended support.
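The graceful migration step above can be sketched per node as follows. This is a sketch only; `drain_and_remove` is a hypothetical helper name, it assumes `kubectl` access to your cluster, and the drain flags are common defaults that you may need to adjust for your workloads.

```
# Sketch: gracefully migrate workloads off an old node, then remove it.
# drain_and_remove is a hypothetical helper; assumes kubectl access to the cluster.
drain_and_remove() {
  node="$1"
  kubectl cordon "$node"    # stop new pods from scheduling on the old node
  kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
  kubectl delete node "$node"    # remove the node object from the cluster
}

# Example: drain_and_remove OLD_NODE_NAME
```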

Amazon EKS Hybrid Nodes follow the same [version skew policy](https://kubernetes.io/releases/version-skew-policy/#supported-version-skew) for nodes as upstream Kubernetes. Amazon EKS Hybrid Nodes cannot be on a newer version than the Amazon EKS control plane, and hybrid nodes may be up to three Kubernetes minor versions older than the Amazon EKS control plane minor version.
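
As an illustration of the skew arithmetic, the following sketch lists the node minor versions permitted for a given control plane version. The `supported_node_minors` helper is hypothetical and for illustration only; it is not part of any AWS tooling and assumes GNU `seq` and `paste`.

```shell
#!/bin/bash
# Node minor versions allowed by the upstream skew policy for a given control
# plane version: nodes may trail the control plane by up to three minor versions.
# Hypothetical illustration only; assumes GNU seq and paste.
supported_node_minors() {
  local major=${1%%.*} minor=${1#*.}
  seq -f "$major.%g" "$(( minor - 3 ))" "$minor" | paste -sd' ' -
}

supported_node_minors 1.31   # prints: 1.28 1.29 1.30 1.31
```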

If you do not have spare capacity to create new hybrid nodes on your target Kubernetes version for a cutover migration upgrade strategy, you can alternatively use the Amazon EKS Hybrid Nodes CLI (`nodeadm`) to upgrade the Kubernetes version of your hybrid nodes in-place.

**Important**  
If you upgrade your hybrid nodes in-place with `nodeadm`, the node incurs downtime during the process while the components for the older Kubernetes version are shut down and the components for the new Kubernetes version are installed and started.

## Prerequisites
<a name="_prerequisites"></a>

Before upgrading, make sure you have completed the following prerequisites.
+ The target Kubernetes version for your hybrid nodes upgrade must be equal to or less than the Amazon EKS control plane version.
+ If you are following a cutover migration upgrade strategy, the new hybrid nodes you are installing on your target Kubernetes version must meet the [Prerequisite setup for hybrid nodes](hybrid-nodes-prereqs.md) requirements. This includes having IP addresses within the Remote Node Network CIDR you passed during Amazon EKS cluster creation.
+ For both cutover migration and in-place upgrades, the hybrid nodes must have access to the [required domains](hybrid-nodes-networking.md#hybrid-nodes-networking-on-prem) to pull the new versions of the hybrid nodes dependencies.
+ You must have kubectl installed on your local machine or instance you are using to interact with your Amazon EKS Kubernetes API endpoint.
+ The version of your CNI must support the Kubernetes version you are upgrading to. If it does not, upgrade your CNI version before upgrading your hybrid nodes. See [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) for more information.

## Cutover migration (blue-green) upgrades
<a name="hybrid-nodes-upgrade-cutover"></a>

 *Cutover migration upgrades* refer to the process of creating new hybrid nodes on new hosts with your target Kubernetes version, gracefully migrating your existing applications to the new hybrid nodes on your target Kubernetes version, and removing the hybrid nodes on the old Kubernetes version from your cluster. This strategy is also called a blue-green migration.

1. Connect your new hosts as hybrid nodes following the [Connect hybrid nodes](hybrid-nodes-join.md) steps. When running the `nodeadm install` command, use your target Kubernetes version.

1. Enable communication between the new hybrid nodes on the target Kubernetes version and your hybrid nodes on the old Kubernetes version. This configuration allows pods to communicate with each other while you are migrating your workload to the hybrid nodes on the target Kubernetes version.

1. Confirm your hybrid nodes on your target Kubernetes version successfully joined your cluster and have status Ready.

1. Use the following command to mark each of the nodes that you want to remove as unschedulable. This is so that new pods aren’t scheduled or rescheduled on the nodes that you are replacing. For more information, see [kubectl cordon](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_cordon/) in the Kubernetes documentation. Replace `NODE_NAME` with the name of the hybrid nodes on the old Kubernetes version.

   ```
   kubectl cordon NODE_NAME
   ```

   You can identify and cordon all of the nodes of a particular Kubernetes version (in this case, `1.28`) with the following code snippet.

   ```
   K8S_VERSION=1.28
   for node in $(kubectl get nodes -o json | jq --arg K8S_VERSION "$K8S_VERSION" -r '.items[] | select(.status.nodeInfo.kubeletVersion | match("\($K8S_VERSION)")).metadata.name')
   do
       echo "Cordoning $node"
       kubectl cordon $node
   done
   ```

1. If your current deployment is running fewer than two CoreDNS replicas on your hybrid nodes, scale out the deployment to at least two replicas. We recommend that you run at least two CoreDNS replicas on hybrid nodes for resiliency during normal operations.

   ```
   kubectl scale deployments/coredns --replicas=2 -n kube-system
   ```

1. Drain each of the hybrid nodes on the old Kubernetes version that you want to remove from your cluster with the following command. For more information on draining nodes, see [Safely Drain a Node](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) in the Kubernetes documentation. Replace `NODE_NAME` with the name of the hybrid nodes on the old Kubernetes version.

   ```
   kubectl drain NODE_NAME --ignore-daemonsets --delete-emptydir-data
   ```

   You can identify and drain all of the nodes of a particular Kubernetes version (in this case, `1.28`) with the following code snippet.

   ```
   K8S_VERSION=1.28
   for node in $(kubectl get nodes -o json | jq --arg K8S_VERSION "$K8S_VERSION" -r '.items[] | select(.status.nodeInfo.kubeletVersion | match("\($K8S_VERSION)")).metadata.name')
   do
       echo "Draining $node"
       kubectl drain $node --ignore-daemonsets --delete-emptydir-data
   done
   ```

1. You can use `nodeadm` to stop and remove the hybrid nodes artifacts from the host. You must run `nodeadm` with a user that has root/sudo privileges. By default, `nodeadm uninstall` will not proceed if there are pods remaining on the node. For more information, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

   ```
   nodeadm uninstall
   ```

1. With the hybrid nodes artifacts stopped and uninstalled, remove the node resource from your cluster. Replace `NODE_NAME` with the name of the hybrid node on the old Kubernetes version.

   ```
   kubectl delete node NODE_NAME
   ```

   You can identify and delete all of the nodes of a particular Kubernetes version (in this case, `1.28`) with the following code snippet.

   ```
   K8S_VERSION=1.28
   for node in $(kubectl get nodes -o json | jq --arg K8S_VERSION "$K8S_VERSION" -r '.items[] | select(.status.nodeInfo.kubeletVersion | match("\($K8S_VERSION)")).metadata.name')
   do
       echo "Deleting $node"
       kubectl delete node $node
   done
   ```

1. Depending on your choice of CNI, there may be artifacts remaining on your hybrid nodes after running the above steps. See [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) for more information.

## In-place upgrades
<a name="hybrid-nodes-upgrade-inplace"></a>

The in-place upgrade process refers to using `nodeadm upgrade` to upgrade the Kubernetes version of hybrid nodes without using new physical or virtual hosts and a cutover migration strategy. The `nodeadm upgrade` process shuts down and uninstalls the existing older Kubernetes components running on the hybrid node, then installs and starts the new target Kubernetes components. We strongly recommend upgrading one node at a time to minimize impact to applications running on the hybrid nodes. The duration of this process depends on your network bandwidth and latency.

1. Use the following command to mark the node you are upgrading as unschedulable. This is so that new pods aren’t scheduled or rescheduled on the node that you are upgrading. For more information, see [kubectl cordon](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_cordon/) in the Kubernetes documentation. Replace `NODE_NAME` with the name of the hybrid node you are upgrading.

   ```
   kubectl cordon NODE_NAME
   ```

1. Drain the node you are upgrading with the following command. For more information on draining nodes, see [Safely Drain a Node](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) in the Kubernetes documentation. Replace `NODE_NAME` with the name of the hybrid node you are upgrading.

   ```
   kubectl drain NODE_NAME --ignore-daemonsets --delete-emptydir-data
   ```

1. Run `nodeadm upgrade` on the hybrid node you are upgrading. You must run `nodeadm` with a user that has root/sudo privileges. The name of the node is preserved through the upgrade for both the AWS SSM and AWS IAM Roles Anywhere credential providers. You cannot change credential providers during the upgrade process. See [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md) for configuration values for `nodeConfig.yaml`. Replace `K8S_VERSION` with the target Kubernetes version you are upgrading to.

   ```
   nodeadm upgrade K8S_VERSION -c file://nodeConfig.yaml
   ```

1. To allow pods to be scheduled on the node after you have upgraded, type the following. Replace `NODE_NAME` with the name of the node.

   ```
   kubectl uncordon NODE_NAME
   ```

1. Watch the status of your hybrid nodes and wait for them to shut down and restart on the new Kubernetes version with the `Ready` status.

   ```
   kubectl get nodes -o wide -w
   ```

# Patch security updates for hybrid nodes
<a name="hybrid-nodes-security"></a>

This topic describes the procedure for performing in-place patching of security updates for specific packages and dependencies running on your hybrid nodes. As a best practice, we recommend that you regularly update your hybrid nodes to receive CVE fixes and security patches.

For steps to upgrade the Kubernetes version, see [Upgrade hybrid nodes for your cluster](hybrid-nodes-upgrade.md).

One example of software that might need security patching is `containerd`.

## `Containerd`
<a name="_containerd"></a>

 `containerd` is the standard Kubernetes container runtime and a core dependency of EKS Hybrid Nodes, used to manage the container lifecycle, including pulling images and managing container execution. On a hybrid node, you can install `containerd` through the [nodeadm CLI](https://docs.aws.amazon.com/eks/latest/userguide/hybrid-nodes-nodeadm.html) or manually. Depending on the operating system of your node, `nodeadm` installs `containerd` from the OS-distributed package or the Docker package.

When a CVE in `containerd` is published, you have the following options to upgrade to the patched version of `containerd` on your hybrid nodes.

## Step 1: Check if the patch is published to package managers
<a name="_step_1_check_if_the_patch_published_to_package_managers"></a>

You can check whether the `containerd` CVE patch has been published to each respective OS package manager by referring to the corresponding security bulletins:
+  [Amazon Linux 2023](https://alas.aws.amazon.com/alas2023.html) 
+  [RHEL](https://access.redhat.com/security/security-updates/security-advisories) 
+  [Ubuntu 20.04](https://ubuntu.com/security/notices?order=newest&release=focal) 
+  [Ubuntu 22.04](https://ubuntu.com/security/notices?order=newest&release=jammy) 
+  [Ubuntu 24.04](https://ubuntu.com/security/notices?order=newest&release=noble) 

If you use the Docker repo as the source of `containerd`, you can check the [Docker security announcements](https://docs.docker.com/security/security-announcements/) to identify the availability of the patched version in the Docker repo.

## Step 2: Choose the method to install the patch
<a name="_step_2_choose_the_method_to_install_the_patch"></a>

There are three methods to patch and install security updates in-place on nodes. The method you can use depends on whether the patch is available from the operating system’s package manager:

1. Install patches with `nodeadm upgrade` that are published to package managers, see [Step 2 a](#hybrid-nodes-security-nodeadm).

1. Install patches with the package managers directly, see [Step 2 b](#hybrid-nodes-security-package).

1. Install custom patches that aren’t published in package managers. Note that there are special considerations for custom patches for `containerd`, [Step 2 c](#hybrid-nodes-security-manual).

## Step 2 a: Patching with `nodeadm upgrade`
<a name="hybrid-nodes-security-nodeadm"></a>

After you confirm that the `containerd` CVE patch has been published to the OS or Docker repos (either Apt or RPM), you can use the `nodeadm upgrade` command to upgrade to the latest version of `containerd`. Because this isn’t a Kubernetes version upgrade, you must pass your current Kubernetes version to the `nodeadm upgrade` command.

```
nodeadm upgrade K8S_VERSION --config-source file:///root/nodeConfig.yaml
```

## Step 2 b: Patching with operating system package managers
<a name="hybrid-nodes-security-package"></a>

Alternatively, you can use the respective operating system package manager directly to upgrade the `containerd` package, as follows.

 **Amazon Linux 2023** 

```
sudo yum update -y
sudo yum install -y containerd
```

 **RHEL** 

```
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/rhel/docker-ce.repo
sudo yum update -y
sudo yum install -y containerd
```

 **Ubuntu** 

```
sudo mkdir -p /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update -y
sudo apt install -y --only-upgrade containerd.io
```

## Step 2 c: `Containerd` CVE patch not published in package managers
<a name="hybrid-nodes-security-manual"></a>

If the patched `containerd` version is available only outside the package managers, for example in GitHub releases, you can install `containerd` from the official GitHub releases.

1. If the machine has already joined the cluster as a hybrid node, run the `nodeadm uninstall` command first.

1. Install the official `containerd` binaries by following the [official installation steps](https://github.com/containerd/containerd/blob/main/docs/getting-started.md#option-1-from-the-official-binaries) on GitHub.

1. Run the `nodeadm install` command with the `--containerd-source` argument set to `none`, which skips `containerd` installation through `nodeadm`. You can use the value `none` for the `containerd` source on any operating system that the node is running.

   ```
   nodeadm install K8S_VERSION --credential-provider CREDS_PROVIDER --containerd-source none
   ```

# Remove hybrid nodes
<a name="hybrid-nodes-remove"></a>

This topic describes how to delete hybrid nodes from your Amazon EKS cluster. You must delete your hybrid nodes with your choice of Kubernetes-compatible tooling such as [kubectl](https://kubernetes.io/docs/reference/kubectl/). Charges for hybrid nodes stop when the node object is removed from the Amazon EKS cluster. For more information on hybrid nodes pricing, see [Amazon EKS Pricing](https://aws.amazon.com/eks/pricing/).

**Important**  
Removing nodes is disruptive to workloads running on the node. Before deleting hybrid nodes, we recommend that you first drain the node to move pods to another active node. For more information on draining nodes, see [Safely Drain a Node](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) in the Kubernetes documentation.

Run the kubectl steps below from your local machine or instance that you use to interact with the Amazon EKS cluster’s Kubernetes API endpoint. If you are using a specific `kubeconfig` file, use the `--kubeconfig` flag.

## Step 1: List your nodes
<a name="_step_1_list_your_nodes"></a>

```
kubectl get nodes
```

## Step 2: Drain your node
<a name="_step_2_drain_your_node"></a>

See [kubectl drain](https://kubernetes.io/docs/reference/kubectl/generated/kubectl_drain/) in the Kubernetes documentation for more information on the `kubectl drain` command.

```
kubectl drain --ignore-daemonsets <node-name>
```

## Step 3: Stop and uninstall hybrid nodes artifacts
<a name="_step_3_stop_and_uninstall_hybrid_nodes_artifacts"></a>

You can use the Amazon EKS Hybrid Nodes CLI (`nodeadm`) to stop and remove the hybrid nodes artifacts from the host. You must run `nodeadm` with a user that has root/sudo privileges. By default, `nodeadm uninstall` will not proceed if there are pods remaining on the node. If you are using AWS Systems Manager (SSM) as your credentials provider, the `nodeadm uninstall` command deregisters the host as an AWS SSM managed instance. For more information, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

```
nodeadm uninstall
```

## Step 4: Delete your node from the cluster
<a name="_step_4_delete_your_node_from_the_cluster"></a>

With the hybrid nodes artifacts stopped and uninstalled, remove the node resource from your cluster.

```
kubectl delete node <node-name>
```

## Step 5: Check for remaining artifacts
<a name="_step_5_check_for_remaining_artifacts"></a>

Depending on your choice of CNI, there may be artifacts remaining on your hybrid nodes after running the above steps. See [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) for more information.

# Configure application networking, add-ons, and webhooks for hybrid nodes
<a name="hybrid-nodes-configure"></a>

After you create an EKS cluster for hybrid nodes, configure additional capabilities for application networking (CNI, BGP, Ingress, Load Balancing, Network Policies), add-ons, webhooks, and proxy settings. For the complete list of the EKS and community add-ons that are compatible with hybrid nodes, see [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md).

 **EKS cluster insights** EKS includes insight checks for misconfigurations in your hybrid nodes setup that could impair functionality of your cluster or workloads. For more information on cluster insights, see [Prepare for Kubernetes version upgrades and troubleshoot misconfigurations with cluster insights](cluster-insights.md).

The following lists the common capabilities and add-ons that you can use with hybrid nodes:
+  **Container Networking Interface (CNI)**: AWS supports [Cilium](https://docs.cilium.io/en/stable/index.html) as the CNI for hybrid nodes. For more information, see [Configure CNI for hybrid nodes](hybrid-nodes-cni.md). Note that the AWS VPC CNI can’t be used with hybrid nodes.
+  **CoreDNS and `kube-proxy`**: CoreDNS and `kube-proxy` are installed automatically when hybrid nodes join the EKS cluster. These add-ons can be managed as EKS add-ons after cluster creation.
+  **Ingress and Load Balancing**: You can use the AWS Load Balancer Controller and Application Load Balancer (ALB) or Network Load Balancer (NLB) with the target type `ip` for workloads running on hybrid nodes. AWS supports Cilium’s built-in Ingress, Gateway, and Kubernetes Service load balancing features for workloads running on hybrid nodes. For more information, see [Configure Kubernetes Ingress for hybrid nodes](hybrid-nodes-ingress.md) and [Configure Services of type LoadBalancer for hybrid nodes](hybrid-nodes-load-balancing.md).
+  **Metrics**: You can use Amazon Managed Service for Prometheus (AMP) agent-less scrapers, AWS Distro for Open Telemetry (ADOT), and the Amazon CloudWatch Observability Agent with hybrid nodes. To use AMP agent-less scrapers for pod metrics on hybrid nodes, your pods must be accessible from the VPC that you use for the EKS cluster.
+  **Logs**: You can enable EKS control plane logging for hybrid nodes-enabled clusters. You can use the ADOT EKS add-on and the Amazon CloudWatch Observability Agent EKS add-on for hybrid node and pod logging.
+  **Pod Identities and IRSA**: You can use EKS Pod Identities and IAM Roles for Service Accounts (IRSA) with applications running on hybrid nodes to enable granular access for your pods running on hybrid nodes with other AWS services.
+  **Webhooks**: If you are running webhooks, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) for considerations and steps to optionally run webhooks on cloud nodes if you cannot make your on-premises pod networks routable.
+  **Proxy**: If you are using a proxy server in your on-premises environment for traffic leaving your data center or edge environment, you can configure your hybrid nodes and cluster to use your proxy server. For more information, see [Configure proxy for hybrid nodes](hybrid-nodes-proxy.md).

**Topics**
+ [Configure CNI for hybrid nodes](hybrid-nodes-cni.md)
+ [Configure add-ons for hybrid nodes](hybrid-nodes-add-ons.md)
+ [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md)
+ [Configure proxy for hybrid nodes](hybrid-nodes-proxy.md)
+ [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md)
+ [Configure Kubernetes Ingress for hybrid nodes](hybrid-nodes-ingress.md)
+ [Configure Services of type LoadBalancer for hybrid nodes](hybrid-nodes-load-balancing.md)
+ [Configure Kubernetes Network Policies for hybrid nodes](hybrid-nodes-network-policies.md)

# Configure CNI for hybrid nodes
<a name="hybrid-nodes-cni"></a>

Cilium is the AWS-supported Container Networking Interface (CNI) for Amazon EKS Hybrid Nodes. You must install a CNI for hybrid nodes to become ready to serve workloads. Hybrid nodes appear with status `Not Ready` until a CNI is running. You can manage the CNI with your choice of tools such as Helm. The instructions on this page cover Cilium lifecycle management (install, upgrade, delete). See [Cilium Ingress and Cilium Gateway Overview](hybrid-nodes-ingress.md#hybrid-nodes-ingress-cilium), [Service type LoadBalancer](hybrid-nodes-ingress.md#hybrid-nodes-ingress-cilium-loadbalancer), and [Configure Kubernetes Network Policies for hybrid nodes](hybrid-nodes-network-policies.md) for how to configure Cilium for ingress, load balancing, and network policies.

Cilium is not supported by AWS when running on nodes in the AWS Cloud. The Amazon VPC CNI is not compatible with hybrid nodes, and the VPC CNI is configured with anti-affinity for the `eks.amazonaws.com/compute-type: hybrid` label.

The Calico documentation previously on this page has been moved to the [EKS Hybrid Examples Repository](https://github.com/aws-samples/eks-hybrid-examples).

## Version compatibility
<a name="hybrid-nodes-cilium-version-compatibility"></a>

Cilium versions `v1.17.x` and `v1.18.x` are supported for EKS Hybrid Nodes for every Kubernetes version supported in Amazon EKS.

**Note**  
 **Cilium v1.18.3 kernel requirement**: Because Cilium v1.18.3 requires Linux kernel 5.10 or later, it is not supported on:
+ Ubuntu 20.04
+ Red Hat Enterprise Linux (RHEL) 8

For system requirements, see [Cilium system requirements](https://docs.cilium.io/en/stable/operations/system_requirements/).
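
To check whether a host meets this kernel floor, you can compare versions with `sort -V`. This is a hedged sketch; `kernel_at_least` is a hypothetical helper, not AWS or Cilium tooling, and it assumes GNU coreutils.

```shell
#!/bin/bash
# Succeed if the given kernel version is at least the required minimum.
# Hypothetical helper for illustration; relies on GNU coreutils sort -V.
kernel_at_least() {
  local min=$1 have=$2
  [ "$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n1)" = "$min" ]
}

if kernel_at_least 5.10 "$(uname -r)"; then
  echo "kernel meets the Cilium v1.18.3 requirement"
else
  echo "kernel is older than 5.10"
fi
```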

See [Kubernetes version support](https://docs.aws.amazon.com/eks/latest/userguide/kubernetes-versions.html) for the Kubernetes versions supported by Amazon EKS. EKS Hybrid Nodes have the same Kubernetes version support as Amazon EKS clusters with cloud nodes.

## Supported capabilities
<a name="hybrid-nodes-cilium-support"></a>

 AWS maintains builds of Cilium for EKS Hybrid Nodes that are based on the open source [Cilium project](https://github.com/cilium/cilium). To receive support from AWS for Cilium, you must be using the AWS-maintained Cilium builds and supported Cilium versions.

 AWS provides technical support for the default configurations of the following capabilities of Cilium for use with EKS Hybrid Nodes. If you plan to use functionality outside the scope of AWS support, it is recommended to obtain alternative commercial support for Cilium or have the in-house expertise to troubleshoot and contribute fixes to the Cilium project.


| Cilium Feature | Supported by AWS  | 
| --- | --- | 
|  Kubernetes network conformance  |  Yes  | 
|  Core cluster connectivity  |  Yes  | 
|  IP family  |  IPv4  | 
|  Lifecycle Management  |  Helm  | 
|  Networking Mode  |  VXLAN encapsulation  | 
|  IP Address Management (IPAM)  |  Cilium IPAM Cluster Scope  | 
|  Network Policy  |  Kubernetes Network Policy  | 
|  Border Gateway Protocol (BGP)  |  Cilium BGP Control Plane  | 
|  Kubernetes Ingress  |  Cilium Ingress, Cilium Gateway  | 
|  Service LoadBalancer IP Allocation  |  Cilium Load Balancer IPAM  | 
|  Service LoadBalancer IP Address Advertisement  |  Cilium BGP Control Plane  | 
|  kube-proxy replacement  |  Yes  | 

## Cilium considerations
<a name="hybrid-nodes-cilium-considerations"></a>
+  **Helm repository** - AWS hosts the Cilium Helm chart in the Amazon Elastic Container Registry Public (Amazon ECR Public) at [Amazon EKS Cilium/Cilium](https://gallery.ecr.aws/eks/cilium/cilium). The available versions include:
  + Cilium v1.17.9: `oci://public.ecr.aws/eks/cilium/cilium:1.17.9-0` 
  + Cilium v1.18.3: `oci://public.ecr.aws/eks/cilium/cilium:1.18.3-0` 

    The commands in this topic use this repository. Note that certain `helm repo` commands aren’t valid for Helm repositories in Amazon ECR Public, so you can’t refer to this repository by a local Helm repo name. Instead, use the full URI in most commands.
+ By default, Cilium is configured to run in overlay / tunnel mode with VXLAN as the [encapsulation method](https://docs.cilium.io/en/stable/network/concepts/routing/#encapsulation). This mode has the fewest requirements on the underlying physical network.
+ By default, Cilium [masquerades](https://docs.cilium.io/en/stable/network/concepts/masquerading/) the source IP address of all pod traffic leaving the cluster to the IP address of the node. If you disable masquerading, then your pod CIDRs must be routable on your on-premises network.
+ If you are running webhooks on hybrid nodes, your pod CIDRs must be routable on your on-premises network. If your pod CIDRs are not routable on your on-premises network, then it is recommended to run webhooks on cloud nodes in the same cluster. See [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) and [Prepare networking for hybrid nodes](hybrid-nodes-networking.md) for more information.
+  AWS recommends using Cilium’s built-in BGP functionality to make your pod CIDRs routable on your on-premises network. For more information on how to configure Cilium BGP with hybrid nodes, see [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md).
+ The default IP Address Management (IPAM) in Cilium is called [Cluster Scope](https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/), where the Cilium operator allocates IP addresses for each node based on user-configured pod CIDRs.
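
Because the cluster-pool IPAM mode hands each node a fixed-size slice of the pod CIDR, the pods-per-node capacity follows directly from the mask size as 2^(32 - mask). The following sketch illustrates that arithmetic; `pod_ips_per_node` is a hypothetical helper for illustration only.

```shell
#!/bin/bash
# Number of pod IP addresses each node receives for a given
# clusterPoolIPv4MaskSize: 2^(32 - mask). Illustrative helper only.
pod_ips_per_node() {
  echo $(( 2 ** (32 - $1) ))
}

pod_ips_per_node 25   # a /25 slice yields 128 addresses per node
pod_ips_per_node 24   # a /24 slice yields 256 addresses per node
```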

## Install Cilium on hybrid nodes
<a name="hybrid-nodes-cilium-install"></a>

### Procedure
<a name="_procedure"></a>

1. Create a YAML file called `cilium-values.yaml`. The following example configures Cilium to run only on hybrid nodes by setting affinity for the `eks.amazonaws.com/compute-type: hybrid` label for the Cilium agent and operator.
   + Configure `clusterPoolIpv4PodCIDRList` with the same pod CIDRs you configured for your EKS cluster’s *remote pod networks*. For example, `10.100.0.0/24`. The Cilium operator allocates IP address slices from within the configured `clusterPoolIpv4PodCIDRList` IP space. Your pod CIDR must not overlap with your on-premises node CIDR, your VPC CIDR, or your Kubernetes service CIDR.
   + Configure `clusterPoolIpv4MaskSize` based on your required pods per node. For example, `25` for a /25 segment size of 128 pods per node.
   + Do not change `clusterPoolIpv4PodCIDRList` or `clusterPoolIpv4MaskSize` after deploying Cilium on your cluster, see [Expanding the cluster pool](https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool) for more information.
   + If you are running Cilium in kube-proxy replacement mode, set `kubeProxyReplacement: "true"` in your Helm values and ensure you do not have an existing kube-proxy deployment running on the same nodes as Cilium.
   + The example below disables the Envoy Layer 7 (L7) proxy that Cilium uses for L7 network policies and ingress. For more information, see [Configure Kubernetes Network Policies for hybrid nodes](hybrid-nodes-network-policies.md) and [Cilium Ingress and Cilium Gateway Overview](hybrid-nodes-ingress.md#hybrid-nodes-ingress-cilium).
   + The example below sets `loadBalancer.serviceTopology: true` so that Service Traffic Distribution functions correctly if you configure it for your services. For more information, see [Configure Service Traffic Distribution](hybrid-nodes-webhooks.md#hybrid-nodes-mixed-service-traffic-distribution).
   + For a full list of Helm values for Cilium, see the [Helm reference](https://docs.cilium.io/en/stable/helm-reference/) in the Cilium documentation.

     ```
     affinity:
       nodeAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
           nodeSelectorTerms:
           - matchExpressions:
             - key: eks.amazonaws.com/compute-type
               operator: In
               values:
               - hybrid
     ipam:
       mode: cluster-pool
       operator:
         clusterPoolIPv4MaskSize: 25
         clusterPoolIPv4PodCIDRList:
         - POD_CIDR
     loadBalancer:
       serviceTopology: true
     operator:
       affinity:
         nodeAffinity:
           requiredDuringSchedulingIgnoredDuringExecution:
             nodeSelectorTerms:
             - matchExpressions:
               - key: eks.amazonaws.com/compute-type
                 operator: In
                 values:
                   - hybrid
       unmanagedPodWatcher:
         restart: false
     envoy:
       enabled: false
     kubeProxyReplacement: "false"
     ```

1. Install Cilium on your cluster.
   + Replace `CILIUM_VERSION` with a Cilium version (for example `1.17.9-0` or `1.18.3-0`). It is recommended to use the latest patch version for the Cilium minor version.
   + Ensure your nodes meet the kernel requirements for the version you choose. Cilium v1.18.3 requires Linux kernel >= 5.10.
   + If you are using a specific kubeconfig file, use the `--kubeconfig` flag with the Helm install command.

     ```
     helm install cilium oci://public.ecr.aws/eks/cilium/cilium \
         --version CILIUM_VERSION \
         --namespace kube-system \
         --values cilium-values.yaml
     ```

1. Confirm your Cilium installation was successful with the following commands. You should see the `cilium-operator` deployment and the `cilium-agent` running on each of your hybrid nodes. Additionally, your hybrid nodes should now have status `Ready`. For information on how to configure Cilium BGP to advertise your pod CIDRs to your on-premises network, proceed to [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md).

   ```
   kubectl get pods -n kube-system
   ```

   ```
   NAME                              READY   STATUS    RESTARTS   AGE
   cilium-jjjn8                      1/1     Running   0          11m
   cilium-operator-d4f4d7fcb-sc5xn   1/1     Running   0          11m
   ```

   ```
   kubectl get nodes
   ```

   ```
   NAME                   STATUS   ROLES    AGE   VERSION
   mi-04a2cf999b7112233   Ready    <none>   19m   v1.31.0-eks-a737599
   ```

## Upgrade Cilium on hybrid nodes
<a name="hybrid-nodes-cilium-upgrade"></a>

Before upgrading your Cilium deployment, carefully review the [Cilium upgrade documentation](https://docs.cilium.io/en/v1.17/operations/upgrade/) and the upgrade notes to understand the changes in the target Cilium version.

1. Ensure that you have installed the `helm` CLI on your command-line environment. See the [Helm documentation](https://helm.sh/docs/intro/quickstart/) for installation instructions.

1. Run the Cilium upgrade pre-flight check. Replace `CILIUM_VERSION` with your target Cilium version. We recommend that you run the latest patch version for your Cilium minor version. You can find the latest patch release for a given minor Cilium release in the [Stable Releases section](https://github.com/cilium/cilium#stable-releases) of the Cilium documentation.

   ```
   helm install cilium-preflight oci://public.ecr.aws/eks/cilium/cilium --version CILIUM_VERSION \
     --namespace=kube-system \
     --set preflight.enabled=true \
     --set agent=false \
     --set operator.enabled=false
   ```

1. After installing the pre-flight chart, ensure that the number of `READY` `cilium-pre-flight-check` pods matches the number of running Cilium pods.

   ```
   kubectl get ds -n kube-system | sed -n '1p;/cilium/p'
   ```

   ```
   NAME                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
   cilium                    2         2         2       2            2           <none>          1h20m
   cilium-pre-flight-check   2         2         2       2            2           <none>          7m15s
   ```

1. Once the number of `READY` pods is equal, make sure the Cilium pre-flight deployment is also marked as `READY` `1/1`. If it shows `READY` `0/1`, consult the [CNP Validation](https://docs.cilium.io/en/v1.17/operations/upgrade/#cnp-validation) section and resolve issues with the deployment before continuing with the upgrade.

   ```
   kubectl get deployment -n kube-system cilium-pre-flight-check -w
   ```

   ```
   NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
   cilium-pre-flight-check   1/1     1            0           12s
   ```

1. Delete the pre-flight release.

   ```
   helm uninstall cilium-preflight --namespace kube-system
   ```

1. Before running the `helm upgrade` command, preserve the values for your deployment in an `existing-cilium-values.yaml` file, or use `--set` command-line options for your settings when you run the upgrade command. The upgrade operation overwrites the Cilium ConfigMap, so it is critical that your configuration values are passed when you upgrade.

   ```
   helm get values cilium --namespace kube-system -o yaml > existing-cilium-values.yaml
   ```
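
   Depending on how you originally installed Cilium, the saved values file reflects your installation. As an illustrative sketch only (the affinity rule, IPAM mode, and pod CIDR below are typical for a hybrid nodes installation, not required values), it might look like the following:

   ```
   affinity:
     nodeAffinity:
       requiredDuringSchedulingIgnoredDuringExecution:
         nodeSelectorTerms:
         - matchExpressions:
           - key: eks.amazonaws.com/compute-type
             operator: In
             values:
             - hybrid
   ipam:
     mode: cluster-pool
     operator:
       clusterPoolIPv4PodCIDRList:
       - 10.100.0.0/16
   operator:
     unmanagedPodWatcher:
       restart: false
   ```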

1. During normal cluster operations, all Cilium components should run the same version. The following steps describe how to upgrade all of the components from one stable release to a later stable release. When upgrading from one minor release to another minor release, it is recommended to upgrade to the latest patch release for the existing Cilium minor version first. To minimize disruption, set the `upgradeCompatibility` option to the initial Cilium version that you installed in this cluster.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium --version CILIUM_VERSION \
     --namespace kube-system \
     --set upgradeCompatibility=1.X \
     -f existing-cilium-values.yaml
   ```

1. (Optional) If you need to roll back your upgrade due to issues, run the following commands.

   ```
   helm history cilium --namespace kube-system
   helm rollback cilium [REVISION] --namespace kube-system
   ```

## Delete Cilium from hybrid nodes
<a name="hybrid-nodes-cilium-delete"></a>

1. Run the following command to uninstall all Cilium components from your cluster. Note that uninstalling the CNI might impact the health of nodes and pods, so it shouldn’t be performed on production clusters.

   ```
   helm uninstall cilium --namespace kube-system
   ```

   The interfaces and routes configured by Cilium are not removed by default when the CNI is removed from the cluster. See this [GitHub issue](https://github.com/cilium/cilium/issues/34289) for more information.

1. To clean up the on-disk configuration files and resources, if you are using the standard configuration directories, you can remove the files as shown by the [`cni-uninstall.sh` script](https://github.com/cilium/cilium/blob/main/plugins/cilium-cni/cni-uninstall.sh) in the Cilium repository on GitHub.

1. To remove the Cilium Custom Resource Definitions (CRDs) from your cluster, you can run the following commands.

   ```
   kubectl get crds -oname | grep "cilium" | xargs kubectl delete
   ```

# Configure add-ons for hybrid nodes
<a name="hybrid-nodes-add-ons"></a>

This page describes considerations for running AWS add-ons and community add-ons on Amazon EKS Hybrid Nodes. To learn more about Amazon EKS add-ons and the processes for creating, upgrading, and removing add-ons from your cluster, see [Amazon EKS add-ons](eks-add-ons.md). Unless otherwise noted on this page, the processes for creating, upgrading, and removing Amazon EKS add-ons are the same for Amazon EKS clusters with hybrid nodes as they are for Amazon EKS clusters with nodes running in AWS Cloud. Only the add-ons included on this page have been validated for compatibility with Amazon EKS Hybrid Nodes.

The following AWS add-ons are compatible with Amazon EKS Hybrid Nodes.


|  AWS add-on | Compatible add-on versions | 
| --- | --- | 
|  kube-proxy  |  v1.25.14-eksbuild.2 and above  | 
|  CoreDNS  |  v1.9.3-eksbuild.7 and above  | 
|   AWS Distro for OpenTelemetry (ADOT)  |  v0.102.1-eksbuild.2 and above  | 
|  CloudWatch Observability agent  |  v2.2.1-eksbuild.1 and above  | 
|  EKS Pod Identity Agent  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/eks/latest/userguide/hybrid-nodes-add-ons.html)  | 
|  Node monitoring agent  |  v1.2.0-eksbuild.1 and above  | 
|  CSI snapshot controller  |  v8.1.0-eksbuild.1 and above  | 
|   AWS Private CA Connector for Kubernetes  |  v1.6.0-eksbuild.1 and above  | 
|  Amazon FSx CSI driver  |  v1.7.0-eksbuild.1 and above  | 
|   AWS Secrets Store CSI Driver provider  |  v2.1.1-eksbuild.1 and above  | 

The following community add-ons are compatible with Amazon EKS Hybrid Nodes. To learn more about community add-ons, see [Community add-ons](community-addons.md).


| Community add-on | Compatible add-on versions | 
| --- | --- | 
|  Kubernetes Metrics Server  |  v0.7.2-eksbuild.1 and above  | 
|  cert-manager  |  v1.17.2-eksbuild.1 and above  | 
|  Prometheus Node Exporter  |  v1.9.1-eksbuild.2 and above  | 
|  kube-state-metrics  |  v2.15.0-eksbuild.4 and above  | 
|  External DNS  |  v0.19.0-eksbuild.1 and above  | 

In addition to the Amazon EKS add-ons in the tables above, the [Amazon Managed Service for Prometheus Collector](prometheus.md) and the [AWS Load Balancer Controller](aws-load-balancer-controller.md) for [application ingress](alb-ingress.md) (HTTP) and [load balancing](network-load-balancing.md) (TCP/UDP) are compatible with hybrid nodes.

There are AWS add-ons and community add-ons that aren’t compatible with Amazon EKS Hybrid Nodes. The latest versions of these add-ons have an anti-affinity rule for the default `eks.amazonaws.com/compute-type: hybrid` label applied to hybrid nodes. This prevents them from running on hybrid nodes when deployed in your clusters. If you have clusters with both hybrid nodes and nodes running in AWS Cloud, you can deploy these add-ons in your cluster to nodes running in AWS Cloud. The Amazon VPC CNI is not compatible with hybrid nodes; Cilium and Calico are the supported Container Networking Interfaces (CNIs) for Amazon EKS Hybrid Nodes. See [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) for more information.

## AWS add-ons
<a name="hybrid-nodes-add-ons-aws-add-ons"></a>

The sections that follow describe differences between running compatible AWS add-ons on hybrid nodes compared to other Amazon EKS compute types.

## kube-proxy and CoreDNS
<a name="hybrid-nodes-add-ons-core"></a>

EKS installs kube-proxy and CoreDNS as self-managed add-ons by default when you create an EKS cluster with the AWS API and AWS SDKs, including from the AWS CLI. You can overwrite these add-ons with Amazon EKS add-ons after cluster creation. Reference the EKS documentation for details on [Manage `kube-proxy` in Amazon EKS clusters](managing-kube-proxy.md) and [Manage CoreDNS for DNS in Amazon EKS clusters](managing-coredns.md). If you are running a mixed mode cluster with both hybrid nodes and nodes in AWS Cloud, AWS recommends having at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in AWS Cloud. See [Configure CoreDNS replicas](hybrid-nodes-webhooks.md#hybrid-nodes-mixed-coredns) for configuration steps.

## CloudWatch Observability agent
<a name="hybrid-nodes-add-ons-cw"></a>

The CloudWatch Observability agent operator uses [webhooks](https://kubernetes.io/docs/reference/access-authn-authz/webhook/). If you run the operator on hybrid nodes, your on-premises pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).

Node-level metrics are not available for hybrid nodes because [CloudWatch Container Insights](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/ContainerInsights.html) depends on the availability of [Instance Metadata Service](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html) (IMDS) for node-level metrics. Cluster, workload, pod, and container-level metrics are available for hybrid nodes.

After installing the add-on by following the steps described in [Install the CloudWatch agent with the Amazon CloudWatch Observability EKS add-on](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/install-CloudWatch-Observability-EKS-addon.html), the add-on manifest must be updated before the agent can run successfully on hybrid nodes. Edit the `amazoncloudwatchagents` resource on the cluster to add the `RUN_WITH_IRSA` environment variable as shown below.

```
kubectl edit amazoncloudwatchagents -n amazon-cloudwatch cloudwatch-agent
```

```
apiVersion: v1
items:
- apiVersion: cloudwatch.aws.amazon.com/v1alpha1
  kind: AmazonCloudWatchAgent
  metadata:
    ...
    name: cloudwatch-agent
    namespace: amazon-cloudwatch
    ...
  spec:
    ...
    env:
    - name: RUN_WITH_IRSA # <-- Add this
      value: "True" # <-- Add this
    - name: K8S_NODE_NAME
      valueFrom:
        fieldRef:
          fieldPath: spec.nodeName
          ...
```

## Amazon Managed Prometheus managed collector for hybrid nodes
<a name="hybrid-nodes-add-ons-amp"></a>

An Amazon Managed Service for Prometheus (AMP) managed collector consists of a scraper that discovers and collects metrics from the resources in an Amazon EKS cluster. AMP manages the scraper for you, removing the need to manage any instances, agents, or scrapers yourself.

You can use AMP managed collectors without any additional configuration specific to hybrid nodes. However, the metric endpoints for your applications on hybrid nodes must be reachable from the VPC, which requires routes from the VPC to your remote pod network CIDRs and the relevant ports open in your on-premises firewall. Additionally, your cluster must have [private cluster endpoint access](cluster-endpoint.md).

Follow the steps in [Using an AWS managed collector](https://docs.aws.amazon.com/prometheus/latest/userguide/AMP-collector-how-to.html) in the Amazon Managed Service for Prometheus User Guide.

## AWS Distro for OpenTelemetry (ADOT)
<a name="hybrid-nodes-add-ons-adot"></a>

You can use the AWS Distro for OpenTelemetry (ADOT) add-on to collect metrics, logs, and tracing data from your applications running on hybrid nodes. ADOT uses admission [webhooks](https://kubernetes.io/docs/reference/access-authn-authz/webhook/) to mutate and validate the Collector Custom Resource requests. If you run the ADOT operator on hybrid nodes, your on-premises pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).

Follow the steps in [Getting Started with AWS Distro for OpenTelemetry using EKS Add-Ons](https://aws-otel.github.io/docs/getting-started/adot-eks-add-on) in the *AWS Distro for OpenTelemetry* documentation.

## AWS Load Balancer Controller
<a name="hybrid-nodes-add-ons-lbc"></a>

You can use the [AWS Load Balancer Controller](aws-load-balancer-controller.md) and Application Load Balancer (ALB) or Network Load Balancer (NLB) with the target type `ip` for workloads on hybrid nodes. The IP targets used with the ALB or NLB must be routable from AWS. The AWS Load Balancer Controller also uses [webhooks](https://kubernetes.io/docs/reference/access-authn-authz/webhook/). If you run the AWS Load Balancer Controller on hybrid nodes, your on-premises pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).

To install the AWS Load Balancer Controller, follow the steps at [AWS Application Load Balancer](hybrid-nodes-ingress.md#hybrid-nodes-ingress-alb) or [AWS Network Load Balancer](hybrid-nodes-load-balancing.md#hybrid-nodes-service-lb-nlb).

For ingress with ALB, you must specify the annotations below. See [Route application and HTTP traffic with Application Load Balancers](alb-ingress.md) for more information.

```
alb.ingress.kubernetes.io/target-type: ip
```
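
For example, a minimal Ingress for a hypothetical `my-app` Service might look like the following. The name, port, and scheme are placeholders for your own workload.

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app                                          # placeholder name
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing   # or "internal"
    alb.ingress.kubernetes.io/target-type: ip           # required for hybrid nodes
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app
            port:
              number: 80
```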

For load balancing with NLB, you must specify the annotations below. See [Route TCP and UDP traffic with Network Load Balancers](network-load-balancing.md) for more information.

```
service.beta.kubernetes.io/aws-load-balancer-type: "external"
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
```
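
For example, a Service of type `LoadBalancer` for a hypothetical `my-app` workload might look like the following. The name, selector, and ports are placeholders.

```
apiVersion: v1
kind: Service
metadata:
  name: my-app   # placeholder name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
spec:
  type: LoadBalancer
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```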

## EKS Pod Identity Agent
<a name="hybrid-nodes-add-ons-pod-id"></a>

**Note**  
To successfully deploy the EKS Pod Identity Agent add-on on hybrid nodes running Bottlerocket, ensure your Bottlerocket version is at least v1.39.0. The Pod Identity Agent is not supported on earlier Bottlerocket versions in hybrid node environments.

The original Amazon EKS Pod Identity Agent DaemonSet relies on the availability of EC2 IMDS on the node to obtain the required AWS credentials. As IMDS isn’t available on hybrid nodes, starting with version 1.3.3-eksbuild.1, the Pod Identity Agent add-on optionally deploys a DaemonSet that mounts the required credentials. Hybrid nodes running Bottlerocket require a different method to mount the credentials, and starting in version 1.3.7-eksbuild.2, the Pod Identity Agent add-on optionally deploys a DaemonSet that specifically targets Bottlerocket hybrid nodes. The following sections describe the process for enabling the optional DaemonSets.

### Ubuntu/RHEL/AL2023
<a name="_ubunturhelal2023"></a>

1. To use the Pod Identity agent on Ubuntu/RHEL/AL2023 hybrid nodes, set `enableCredentialsFile: true` in the `hybrid` section of your `nodeadm` config as shown below:

   ```
   apiVersion: node.eks.aws/v1alpha1
   kind: NodeConfig
   spec:
       hybrid:
           enableCredentialsFile: true # <-- Add this
   ```

   This configures `nodeadm` to create a credentials file on the node at `/eks-hybrid/.aws/credentials`, which is used by the `eks-pod-identity-agent` pods. The credentials file contains temporary AWS credentials that are refreshed periodically.

1. After you update the `nodeadm` config on *each* node, run the following `nodeadm init` command with your `nodeConfig.yaml` to join your hybrid nodes to your Amazon EKS cluster. If your nodes previously joined the cluster, run the `nodeadm init` command again anyway.

   ```
   nodeadm init -c file://nodeConfig.yaml
   ```

1. Install `eks-pod-identity-agent` with support for hybrid nodes enabled by using either the AWS CLI or the AWS Management Console.

   1.  AWS CLI: From the machine that you’re using to administer the cluster, run the following command to install `eks-pod-identity-agent` with support for hybrid nodes enabled. Replace `my-cluster` with the name of your cluster.

      ```
      aws eks create-addon \
          --cluster-name my-cluster \
          --addon-name eks-pod-identity-agent \
          --configuration-values '{"daemonsets":{"hybrid":{"create": true}}}'
      ```

   1.  AWS Management Console: If you are installing the Pod Identity Agent add-on through the AWS console, add the following to the optional configuration to deploy the DaemonSet that targets hybrid nodes.

      ```
      {"daemonsets":{"hybrid":{"create": true}}}
      ```

### Bottlerocket
<a name="_bottlerocket"></a>

1. To use the Pod Identity agent on Bottlerocket hybrid nodes, add the `--enable-credentials-file=true` flag to the command used for the Bottlerocket bootstrap container user data, as described in [Connect hybrid nodes with Bottlerocket](hybrid-nodes-bottlerocket.md).

   1. If you are using the SSM credential provider, your command should look like this:

      ```
      eks-hybrid-ssm-setup --activation-id=<activation-id> --activation-code=<activation-code> --region=<region> --enable-credentials-file=true
      ```

   1. If you are using the IAM Roles Anywhere credential provider, your command should look like this:

      ```
      eks-hybrid-iam-ra-setup --certificate=<certificate> --key=<private-key> --enable-credentials-file=true
      ```

      This configures the bootstrap script to create a credentials file on the node at `/var/eks-hybrid/.aws/credentials`, which is used by the `eks-pod-identity-agent` pods. The credentials file contains temporary AWS credentials that are refreshed periodically.

1. Install `eks-pod-identity-agent` with support for Bottlerocket hybrid nodes enabled by using either the AWS CLI or the AWS Management Console.

   1.  AWS CLI: From the machine that you’re using to administer the cluster, run the following command to install `eks-pod-identity-agent` with support for Bottlerocket hybrid nodes enabled. Replace `my-cluster` with the name of your cluster.

      ```
      aws eks create-addon \
          --cluster-name my-cluster \
          --addon-name eks-pod-identity-agent \
          --configuration-values '{"daemonsets":{"hybrid-bottlerocket":{"create": true}}}'
      ```

   1.  AWS Management Console: If you are installing the Pod Identity Agent add-on through the AWS console, add the following to the optional configuration to deploy the DaemonSet that targets Bottlerocket hybrid nodes.

      ```
      {"daemonsets":{"hybrid-bottlerocket":{"create": true}}}
      ```

## CSI snapshot controller
<a name="hybrid-nodes-add-ons-csi-snapshotter"></a>

Starting with version `v8.1.0-eksbuild.2`, the [CSI snapshot controller add-on](csi-snapshot-controller.md) applies a soft anti-affinity rule for hybrid nodes, preferring the controller `deployment` to run on EC2 in the same AWS Region as the Amazon EKS control plane. Co-locating the `deployment` in the same AWS Region as the Amazon EKS control plane improves latency.

## Community add-ons
<a name="hybrid-nodes-add-ons-community"></a>

The sections that follow describe differences between running compatible community add-ons on hybrid nodes compared to other Amazon EKS compute types.

## Kubernetes Metrics Server
<a name="hybrid-nodes-add-ons-metrics-server"></a>

The control plane needs to reach Metrics Server’s pod IP (or node IP if hostNetwork is enabled). Therefore, unless you run Metrics Server in hostNetwork mode, you must configure a remote pod network when creating your Amazon EKS cluster, and you must make your pod IP addresses routable. Implementing Border Gateway Protocol (BGP) with the CNI is one common way to make your pod IP addresses routable.
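
If you choose to run Metrics Server with host networking instead, the upstream metrics-server Helm chart exposes this as a single value. The snippet below is a sketch against that chart; verify the setting name against the chart or add-on version you use.

```
# metrics-server Helm chart values (sketch) -- runs the pods with hostNetwork
# so the control plane reaches the node IP instead of an unroutable pod IP.
hostNetwork:
  enabled: true
```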

## cert-manager
<a name="hybrid-nodes-add-ons-cert-manager"></a>

 `cert-manager` uses [webhooks](https://kubernetes.io/docs/reference/access-authn-authz/webhook/). If you run `cert-manager` on hybrid nodes, your on-premises pod CIDR must be routable on your on-premises network and you must configure your EKS cluster with your remote pod network. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).

# Configure webhooks for hybrid nodes
<a name="hybrid-nodes-webhooks"></a>

This page details considerations for running webhooks with hybrid nodes. Webhooks are used in Kubernetes applications and open source projects, such as the AWS Load Balancer Controller and CloudWatch Observability Agent, to perform mutating and validation capabilities at runtime.

 **Routable pod networks** 

If you are able to make your on-premises pod CIDR routable on your on-premises network, you can run webhooks on hybrid nodes. There are several techniques you can use to make your on-premises pod CIDR routable on your on-premises network, including Border Gateway Protocol (BGP), static routes, or other custom routing solutions. BGP is the recommended solution because it is more scalable and easier to manage than alternative solutions that require custom or manual route configuration. AWS supports the BGP capabilities of Cilium and Calico for advertising pod CIDRs. See [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) and [Routable remote Pod CIDRs](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-pod-cidrs) for more information.

 **Unroutable pod networks** 

If you *cannot* make your on-premises pod CIDR routable on your on-premises network and need to run webhooks, we recommend that you run all webhooks on cloud nodes in the same EKS cluster as your hybrid nodes.

## Considerations for mixed mode clusters
<a name="hybrid-nodes-considerations-mixed-mode"></a>

 *Mixed mode clusters* are defined as EKS clusters that have both hybrid nodes and nodes running in AWS Cloud. When running a mixed mode cluster, consider the following recommendations:
+ Run the VPC CNI on nodes in AWS Cloud and either Cilium or Calico on hybrid nodes. Cilium and Calico are not supported by AWS when running on nodes in AWS Cloud.
+ Configure webhooks to run on nodes in AWS Cloud. See [Configure webhooks for add-ons](#hybrid-nodes-webhooks-add-ons) for how to configure the webhooks for AWS and community add-ons.
+ If your applications require pods running on nodes in AWS Cloud to directly communicate with pods running on hybrid nodes ("east-west communication"), and you are using the VPC CNI on nodes in AWS Cloud, and Cilium or Calico on hybrid nodes, then your on-premises pod CIDR must be routable on your on-premises network.
+ Run at least one replica of CoreDNS on nodes in AWS Cloud and at least one replica of CoreDNS on hybrid nodes.
+ Configure Service Traffic Distribution to keep Service traffic local to the zone it is originating from. For more information on Service Traffic Distribution, see [Configure Service Traffic Distribution](#hybrid-nodes-mixed-service-traffic-distribution).
+ If you are using AWS Application Load Balancers (ALB) or Network Load Balancers (NLB) for workload traffic running on hybrid nodes, then the IP target(s) used with the ALB or NLB must be routable from AWS.
+ The Metrics Server add-on requires connectivity from the EKS control plane to the Metrics Server pod IP address. If you are running the Metrics Server add-on on hybrid nodes, then your on-premises pod CIDR must be routable on your on-premises network.
+ To collect metrics for hybrid nodes using Amazon Managed Service for Prometheus (AMP) managed collectors, your on-premises pod CIDR must be routable on your on-premises network. Or, you can use the AMP managed collector for EKS control plane metrics and resources running in AWS Cloud, and the AWS Distro for OpenTelemetry (ADOT) add-on to collect metrics for hybrid nodes.

## Configure mixed mode clusters
<a name="hybrid-nodes-mixed-mode"></a>

To view the mutating and validating webhooks running on your cluster, you can view the **Extensions** resource type in the **Resources** panel of the EKS console for your cluster, or you can use the following commands. EKS also reports webhook metrics in the cluster observability dashboard, see [Monitor your cluster with the observability dashboard](observability-dashboard.md) for more information.

```
kubectl get mutatingwebhookconfigurations
```

```
kubectl get validatingwebhookconfigurations
```

### Configure Service Traffic Distribution
<a name="hybrid-nodes-mixed-service-traffic-distribution"></a>

When running mixed mode clusters, we recommend that you use [Service Traffic Distribution](https://kubernetes.io/docs/reference/networking/virtual-ips/#traffic-distribution) to keep Service traffic local to the zone it originates from. Service Traffic Distribution (available for Kubernetes versions 1.31 and later in EKS) is recommended over [Topology Aware Routing](https://kubernetes.io/docs/concepts/services-networking/topology-aware-routing/) because it is more predictable. With Service Traffic Distribution, healthy endpoints in a zone receive all of the traffic for that zone. With Topology Aware Routing, each Service must meet several conditions in that zone for the custom routing to apply; otherwise, traffic is routed evenly to all endpoints.

If you are using Cilium as your CNI, you must run it with the `enable-service-topology` option set to `true` to enable Service Traffic Distribution. You can pass this configuration with the Helm install flag `--set loadBalancer.serviceTopology=true`, or you can update an existing installation with the Cilium CLI command `cilium config set enable-service-topology true`. For an existing installation, the Cilium agent running on each node must be restarted after you update the configuration.

An example of how to configure Service Traffic Distribution for the CoreDNS Service is shown in the following section, and we recommend that you enable the same for all Services in your cluster to avoid unintended cross-environment traffic.
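
The same field applies to any Service on Kubernetes 1.31 and later. For a hypothetical `my-app` Service, the setting looks like the following; the name, selector, and ports are placeholders.

```
apiVersion: v1
kind: Service
metadata:
  name: my-app   # placeholder name
spec:
  trafficDistribution: PreferClose
  selector:
    app: my-app
  ports:
  - port: 80
    targetPort: 8080
```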

### Configure CoreDNS replicas
<a name="hybrid-nodes-mixed-coredns"></a>

If you are running a mixed mode cluster with both hybrid nodes and nodes in AWS Cloud, we recommend that you have at least one CoreDNS replica on hybrid nodes and at least one CoreDNS replica on your nodes in AWS Cloud. To prevent latency and network issues in a mixed mode cluster setup, you can configure the CoreDNS Service to prefer the closest CoreDNS replica with [Service Traffic Distribution](https://kubernetes.io/docs/reference/networking/virtual-ips/#traffic-distribution).

 *Service Traffic Distribution* (available for Kubernetes versions 1.31 and later in EKS) is recommended over [Topology Aware Routing](https://kubernetes.io/docs/concepts/services-networking/topology-aware-routing/) because it is more predictable; see [Configure Service Traffic Distribution](#hybrid-nodes-mixed-service-traffic-distribution) for details, including the Cilium `enable-service-topology` requirement. The following steps configure Service Traffic Distribution for CoreDNS.

1. Add a topology zone label to each of your hybrid nodes, for example `topology.kubernetes.io/zone: onprem`. Or, you can set the label at the `nodeadm init` phase by specifying it in your `nodeadm` configuration; see [Node Config for customizing kubelet (Optional)](hybrid-nodes-nodeadm.md#hybrid-nodes-nodeadm-kubelet). Note that nodes running in AWS Cloud automatically get a topology zone label that corresponds to the availability zone (AZ) of the node.

   ```
   kubectl label node <hybrid-node-name> topology.kubernetes.io/zone=<zone>
   ```
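
   Alternatively, you can apply the label at registration time through your `nodeadm` configuration. This sketch assumes the `kubelet.flags` field of the `NodeConfig` spec; verify the exact schema against your `nodeadm` version.

   ```
   apiVersion: node.eks.aws/v1alpha1
   kind: NodeConfig
   spec:
     kubelet:
       flags:
       - --node-labels=topology.kubernetes.io/zone=onprem   # example zone label
   ```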

1. Add `podAntiAffinity` to the CoreDNS deployment with the topology zone key. Or, you can configure the CoreDNS deployment during installation with EKS add-ons.

   ```
   kubectl edit deployment coredns -n kube-system
   ```

   ```
   spec:
     template:
       spec:
         affinity:
          ...
           podAntiAffinity:
             preferredDuringSchedulingIgnoredDuringExecution:
             - podAffinityTerm:
                 labelSelector:
                   matchExpressions:
                   - key: k8s-app
                     operator: In
                     values:
                     - kube-dns
                 topologyKey: kubernetes.io/hostname
               weight: 100
             - podAffinityTerm:
                 labelSelector:
                   matchExpressions:
                   - key: k8s-app
                     operator: In
                     values:
                     - kube-dns
                 topologyKey: topology.kubernetes.io/zone
               weight: 50
         ...
   ```

1. Add the setting `trafficDistribution: PreferClose` to the `kube-dns` Service configuration to enable Service Traffic Distribution.

   ```
   kubectl patch svc kube-dns -n kube-system --type=merge -p '{
     "spec": {
       "trafficDistribution": "PreferClose"
     }
   }'
   ```

1. You can confirm that Service Traffic Distribution is enabled by viewing the endpoint slices for the `kube-dns` Service. Your endpoint slices must show the `hints` for your topology zone labels, which confirms that Service Traffic Distribution is enabled. If you do not see the `hints` for each endpoint address, then Service Traffic Distribution is not enabled.

   ```
   kubectl get endpointslice -A | grep "kube-dns"
   ```

   ```
   kubectl get endpointslice kube-dns-<id> -n kube-system -o yaml
   ```

   ```
   addressType: IPv4
   apiVersion: discovery.k8s.io/v1
   endpoints:
   - addresses:
     - <your-hybrid-node-pod-ip>
     hints:
       forZones:
       - name: onprem
     nodeName: <your-hybrid-node-name>
     zone: onprem
   - addresses:
     - <your-cloud-node-pod-ip>
     hints:
       forZones:
       - name: us-west-2a
     nodeName: <your-cloud-node-name>
     zone: us-west-2a
   ```

### Configure webhooks for add-ons
<a name="hybrid-nodes-webhooks-add-ons"></a>

The following add-ons use webhooks and are supported for use with hybrid nodes.
+  AWS Load Balancer Controller
+ CloudWatch Observability Agent
+  AWS Distro for OpenTelemetry (ADOT)
+  `cert-manager` 

See the following sections for configuring the webhooks used by these add-ons to run on nodes in AWS Cloud.

#### AWS Load Balancer Controller
<a name="hybrid-nodes-mixed-lbc"></a>

To use the AWS Load Balancer Controller in a mixed mode cluster setup, you must run the controller on nodes in AWS Cloud. To do so, add the following to your Helm values configuration or specify the values by using EKS add-on configuration.

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: NotIn
          values:
          - hybrid
```

#### CloudWatch Observability Agent
<a name="hybrid-nodes-mixed-cwagent"></a>

The CloudWatch Observability Agent add-on has a Kubernetes operator that uses webhooks. To run the operator on nodes in AWS Cloud in a mixed mode cluster setup, edit the CloudWatch Observability Agent operator configuration. You can’t configure the operator affinity during installation with Helm and EKS add-ons (see [containers-roadmap issue #2431](https://github.com/aws/containers-roadmap/issues/2431)).

```
kubectl edit -n amazon-cloudwatch deployment amazon-cloudwatch-observability-controller-manager
```

```
spec:
  ...
  template:
    ...
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: eks.amazonaws.com/compute-type
                operator: NotIn
                values:
                - hybrid
```

#### AWS Distro for OpenTelemetry (ADOT)
<a name="hybrid-nodes-mixed-adot"></a>

The AWS Distro for OpenTelemetry (ADOT) add-on has a Kubernetes Operator that uses webhooks. To run the operator on nodes in AWS Cloud in a mixed mode cluster setup, add the following to your Helm values configuration or specify the values by using EKS add-on configuration.

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: NotIn
          values:
          - hybrid
```

If your pod CIDR is not routable on your on-premises network, the ADOT collector must run on hybrid nodes to scrape metrics from your hybrid nodes and the workloads running on them. To do so, edit the ADOT collector custom resource.

```
kubectl -n opentelemetry-operator-system edit opentelemetrycollectors.opentelemetry.io adot-col-prom-metrics
```

```
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: In
            values:
            - hybrid
```

You can configure the ADOT collector to only scrape metrics from hybrid nodes and the resources running on hybrid nodes by adding the following `relabel_configs` to each `scrape_configs` in the ADOT collector CRD configuration.

```
relabel_configs:
  - action: keep
    regex: hybrid
    source_labels:
    - __meta_kubernetes_node_label_eks_amazonaws_com_compute_type
```

The ADOT add-on has a prerequisite requirement to install `cert-manager` for the TLS certificates used by the ADOT operator webhook. `cert-manager` also runs webhooks and you can configure it to run on nodes in AWS Cloud with the following Helm values configuration.

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: NotIn
          values:
          - hybrid
webhook:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
cainjector:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
startupapicheck:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
```

#### `cert-manager`
<a name="hybrid-nodes-mixed-cert-manager"></a>

The `cert-manager` add-on runs webhooks and you can configure it to run on nodes in AWS Cloud with the following Helm values configuration.

```
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: NotIn
          values:
          - hybrid
webhook:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
cainjector:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
startupapicheck:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: eks.amazonaws.com/compute-type
            operator: NotIn
            values:
            - hybrid
```

# Configure proxy for hybrid nodes
<a name="hybrid-nodes-proxy"></a>

If you are using a proxy server in your on-premises environment for traffic leaving your data center or edge environment, you need to separately configure your nodes and your cluster to use your proxy server.

Cluster  
On your cluster, you need to configure `kube-proxy` to use your proxy server. You must configure `kube-proxy` after creating your Amazon EKS cluster.

Nodes  
On your nodes, you must configure the operating system, `containerd`, `kubelet`, and the Amazon SSM agent to use your proxy server. You can make these changes during the build process for your operating system images or before you run `nodeadm init` on each hybrid node.

## Node-level configuration
<a name="_node_level_configuration"></a>

You must apply the following configurations either in your operating system images or before running `nodeadm init` on each hybrid node.

### `containerd` proxy configuration
<a name="_containerd_proxy_configuration"></a>

 `containerd` is the default container management runtime for Kubernetes. If you are using a proxy for internet access, you must configure `containerd` so it can pull the container images required by Kubernetes and Amazon EKS.

Create a file on each hybrid node called `http-proxy.conf` in the `/etc/systemd/system/containerd.service.d` directory with the following contents. Replace `proxy-domain` and `port` with the values for your environment.

```
[Service]
Environment="HTTP_PROXY=http://proxy-domain:port"
Environment="HTTPS_PROXY=http://proxy-domain:port"
Environment="NO_PROXY=localhost"
```
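
With `NO_PROXY=localhost`, traffic to every other address is sent through the proxy. If in-cluster and loopback traffic should bypass the proxy, you can extend `NO_PROXY`. The CIDRs and domain suffix below are illustrative placeholders, not values from this guide; replace them with the node, pod, and service CIDRs for your environment.

```
[Service]
# Illustrative values - replace with your on-premises node, pod, and service CIDRs
Environment="NO_PROXY=localhost,127.0.0.1,10.80.0.0/16,10.85.0.0/16,172.16.0.0/16,.cluster.local"
```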

#### `containerd` configuration from user data
<a name="_containerd_configuration_from_user_data"></a>

The `containerd.service.d` directory will need to be created for this file. You will need to reload systemd to pick up the configuration file without a reboot. In AL2023, the service will likely already be running when your script executes, so you will also need to restart it.

```
mkdir -p /etc/systemd/system/containerd.service.d
echo '[Service]' > /etc/systemd/system/containerd.service.d/http-proxy.conf
echo 'Environment="HTTP_PROXY=http://proxy-domain:port"' >> /etc/systemd/system/containerd.service.d/http-proxy.conf
echo 'Environment="HTTPS_PROXY=http://proxy-domain:port"' >> /etc/systemd/system/containerd.service.d/http-proxy.conf
echo 'Environment="NO_PROXY=localhost"' >> /etc/systemd/system/containerd.service.d/http-proxy.conf
systemctl daemon-reload
systemctl restart containerd
```

### `kubelet` proxy configuration
<a name="_kubelet_proxy_configuration"></a>

 `kubelet` is the Kubernetes node agent that runs on each Kubernetes node and is responsible for managing the node and pods running on it. If you are using a proxy in your on-premises environment, you must configure the `kubelet` so it can communicate with your Amazon EKS cluster’s public or private endpoints.

Create a file on each hybrid node called `http-proxy.conf` in the `/etc/systemd/system/kubelet.service.d/` directory with the following content. Replace `proxy-domain` and `port` with the values for your environment.

```
[Service]
Environment="HTTP_PROXY=http://proxy-domain:port"
Environment="HTTPS_PROXY=http://proxy-domain:port"
Environment="NO_PROXY=localhost"
```

#### `kubelet` configuration from user data
<a name="_kubelet_configuration_from_user_data"></a>

The `kubelet.service.d` directory must be created for this file. You will need to reload systemd to pick up the configuration file without a reboot. In AL2023, the service will likely already be running when your script executes, so you will also need to restart it.

```
mkdir -p /etc/systemd/system/kubelet.service.d
echo '[Service]' > /etc/systemd/system/kubelet.service.d/http-proxy.conf
echo 'Environment="HTTP_PROXY=http://proxy-domain:port"' >> /etc/systemd/system/kubelet.service.d/http-proxy.conf
echo 'Environment="HTTPS_PROXY=http://proxy-domain:port"' >> /etc/systemd/system/kubelet.service.d/http-proxy.conf
echo 'Environment="NO_PROXY=localhost"' >> /etc/systemd/system/kubelet.service.d/http-proxy.conf
systemctl daemon-reload
systemctl restart kubelet
```

### `ssm` proxy configuration
<a name="_ssm_proxy_configuration"></a>

 `ssm` is one of the credential providers that can be used to initialize a hybrid node. `ssm` is responsible for authenticating with AWS and generating the temporary credentials used by `kubelet`. If you are using a proxy in your on-premises environment and `ssm` as your credential provider on the node, you must configure the `ssm` agent so it can communicate with Amazon SSM service endpoints.

Create a file on each hybrid node called `http-proxy.conf` in the path below, depending on the operating system:
+ Ubuntu - `/etc/systemd/system/snap.amazon-ssm-agent.amazon-ssm-agent.service.d/http-proxy.conf` 
+ Amazon Linux 2023 and Red Hat Enterprise Linux - `/etc/systemd/system/amazon-ssm-agent.service.d/http-proxy.conf` 

Populate the file with the following contents. Replace `proxy-domain` and `port` with the values for your environment.

```
[Service]
Environment="HTTP_PROXY=http://proxy-domain:port"
Environment="HTTPS_PROXY=http://proxy-domain:port"
Environment="NO_PROXY=localhost"
```

#### `ssm` configuration from user data
<a name="_ssm_configuration_from_user_data"></a>

The `ssm` systemd service file directory must be created for this file. The directory path depends on the operating system used on the node.
+ Ubuntu - `/etc/systemd/system/snap.amazon-ssm-agent.amazon-ssm-agent.service.d` 
+ Amazon Linux 2023 and Red Hat Enterprise Linux - `/etc/systemd/system/amazon-ssm-agent.service.d` 

Replace the systemd service name in the restart command below depending on the operating system used on the node:
+ Ubuntu - `snap.amazon-ssm-agent.amazon-ssm-agent` 
+ Amazon Linux 2023 and Red Hat Enterprise Linux - `amazon-ssm-agent` 

```
mkdir -p systemd-service-file-directory
echo '[Service]' > systemd-service-file-directory/http-proxy.conf
echo 'Environment="HTTP_PROXY=http://proxy-domain:port"' >> systemd-service-file-directory/http-proxy.conf
echo 'Environment="HTTPS_PROXY=http://proxy-domain:port"' >> systemd-service-file-directory/http-proxy.conf
echo 'Environment="NO_PROXY=localhost"' >> systemd-service-file-directory/http-proxy.conf
systemctl daemon-reload
systemctl restart systemd-service-name
```

### Operating system proxy configuration
<a name="_operating_system_proxy_configuration"></a>

If you are using a proxy for internet access, you must configure your operating system so it can pull the hybrid nodes dependencies from your operating system's package manager.
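
Optionally, you can also set the proxy variables system-wide in `/etc/environment` so that shells and other tooling on the node pick them up. This is a sketch of a common convention, not a requirement from this guide, and it does not replace the per-service systemd drop-ins described in the previous sections. Replace `proxy-domain` and `port` with the values for your environment.

```
HTTP_PROXY=http://proxy-domain:port
HTTPS_PROXY=http://proxy-domain:port
NO_PROXY=localhost
```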

 **Ubuntu** 

1. Configure `snap` to use your proxy with the following commands:

   ```
   sudo snap set system proxy.https=http://proxy-domain:port
   sudo snap set system proxy.http=http://proxy-domain:port
   ```

1. To enable proxy for `apt`, create a file called `apt.conf` in the `/etc/apt/` directory. Replace proxy-domain and port with the values for your environment.

   ```
   Acquire::http::Proxy "http://proxy-domain:port";
   Acquire::https::Proxy "http://proxy-domain:port";
   ```

 **Amazon Linux 2023** 

1. Configure `dnf` to use your proxy. Create a file `/etc/dnf/dnf.conf` with the proxy-domain and port values for your environment.

   ```
   proxy=http://proxy-domain:port
   ```

 **Red Hat Enterprise Linux** 

1. Configure `yum` to use your proxy. Create a file `/etc/yum.conf` with the proxy-domain and port values for your environment.

   ```
   proxy=http://proxy-domain:port
   ```

### IAM Roles Anywhere proxy configuration
<a name="_iam_roles_anywhere_proxy_configuration"></a>

The IAM Roles Anywhere credential provider service is responsible for refreshing credentials when using IAM Roles Anywhere with the `enableCredentialsFile` flag (see [EKS Pod Identity Agent](hybrid-nodes-add-ons.md#hybrid-nodes-add-ons-pod-id)). If you are using a proxy in your on-premises environment, you must configure the service so it can communicate with IAM Roles Anywhere endpoints.

Create a file called `http-proxy.conf` in the `/etc/systemd/system/aws_signing_helper_update.service.d/` directory with the following content. Replace `proxy-domain` and `port` with the values for your environment.

```
[Service]
Environment="HTTP_PROXY=http://proxy-domain:port"
Environment="HTTPS_PROXY=http://proxy-domain:port"
Environment="NO_PROXY=localhost"
```

## Cluster wide configuration
<a name="_cluster_wide_configuration"></a>

The configurations in this section must be applied after you create your Amazon EKS cluster and before running `nodeadm init` on each hybrid node.

### kube-proxy proxy configuration
<a name="_kube_proxy_proxy_configuration"></a>

Amazon EKS automatically installs `kube-proxy` on each hybrid node as a DaemonSet when your hybrid nodes join the cluster. `kube-proxy` enables routing across services that are backed by pods on Amazon EKS clusters. To configure each host, `kube-proxy` requires DNS resolution for your Amazon EKS cluster endpoint.

1. Edit the `kube-proxy` DaemonSet with the following command.

   ```
   kubectl -n kube-system edit ds kube-proxy
   ```

   This will open the `kube-proxy` DaemonSet definition in your configured editor.

1. Add the environment variables for `HTTP_PROXY` and `HTTPS_PROXY`. Note the `NODE_NAME` environment variable should already exist in your configuration. Replace `proxy-domain` and `port` with values for your environment.

   ```
   containers:
     - command:
       - kube-proxy
       - --v=2
       - --config=/var/lib/kube-proxy-config/config
       - --hostname-override=$(NODE_NAME)
       env:
       - name: HTTP_PROXY
         value: http://proxy-domain:port
       - name: HTTPS_PROXY
         value: http://proxy-domain:port
       - name: NODE_NAME
         valueFrom:
           fieldRef:
             apiVersion: v1
             fieldPath: spec.nodeName
   ```

# Configure Cilium BGP for hybrid nodes
<a name="hybrid-nodes-cilium-bgp"></a>

This topic describes how to configure Cilium Border Gateway Protocol (BGP) for Amazon EKS Hybrid Nodes. Cilium’s BGP functionality is called [Cilium BGP Control Plane](https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane/) and can be used to advertise pod and service addresses to your on-premises network. For alternative methods to make pod CIDRs routable on your on-premises network, see [Routable remote Pod CIDRs](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-pod-cidrs).

## Configure Cilium BGP
<a name="hybrid-nodes-cilium-bgp-configure"></a>

### Prerequisites
<a name="_prerequisites"></a>
+ Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

### Procedure
<a name="_procedure"></a>

1. To use BGP with Cilium to advertise pod or service addresses with your on-premises network, Cilium must be installed with `bgpControlPlane.enabled: true`. If you are enabling BGP for an existing Cilium deployment, you must restart the Cilium operator to apply the BGP configuration if BGP was not previously enabled. You can set `operator.rollOutPods` to `true` in your Helm values to restart the Cilium operator as part of the Helm install/upgrade process.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium \
     --namespace kube-system \
     --reuse-values \
     --set operator.rollOutPods=true \
     --set bgpControlPlane.enabled=true
   ```

1. Confirm that the Cilium operator and agents were restarted and are running.

   ```
   kubectl -n kube-system get pods --selector=app.kubernetes.io/part-of=cilium
   ```

   ```
   NAME                               READY   STATUS    RESTARTS   AGE
   cilium-grwlc                       1/1     Running   0          4m12s
   cilium-operator-68f7766967-5nnbl   1/1     Running   0          4m20s
   cilium-operator-68f7766967-7spfz   1/1     Running   0          4m20s
   cilium-pnxcv                       1/1     Running   0          6m29s
   cilium-r7qkj                       1/1     Running   0          4m12s
   cilium-wxhfn                       1/1     Running   0          4m1s
   cilium-z7hlb                       1/1     Running   0          6m30s
   ```

1. Create a file called `cilium-bgp-cluster.yaml` with a `CiliumBGPClusterConfig` definition. You may need to obtain the following information from your network administrator.
   + Configure `localASN` with the ASN for the nodes running Cilium.
   + Configure `peerASN` with the ASN for your on-premises router.
   + Configure the `peerAddress` with the on-premises router IP that each node running Cilium will peer with.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumBGPClusterConfig
     metadata:
       name: cilium-bgp
     spec:
       nodeSelector:
         matchExpressions:
         - key: eks.amazonaws.com/compute-type
           operator: In
           values:
           - hybrid
       bgpInstances:
       - name: "rack0"
         localASN: NODES_ASN
         peers:
         - name: "onprem-router"
           peerASN: ONPREM_ROUTER_ASN
           peerAddress: ONPREM_ROUTER_IP
           peerConfigRef:
             name: "cilium-peer"
     ```

1. Apply the Cilium BGP cluster configuration to your cluster.

   ```
   kubectl apply -f cilium-bgp-cluster.yaml
   ```

1. Create a file named `cilium-bgp-peer.yaml` with the `CiliumBGPPeerConfig` resource that defines a BGP peer configuration. Multiple peers can share the same configuration and provide reference to the common `CiliumBGPPeerConfig` resource. See the [BGP Peer configuration](https://docs.cilium.io/en/latest/network/bgp-control-plane/bgp-control-plane-v2/#bgp-peer-configuration) in the Cilium documentation for a full list of configuration options.

   The values for the following Cilium peer settings must match those of the on-premises router you are peering with.
   + Configure `holdTimeSeconds` which determines how long a BGP peer waits for a keepalive or update message before declaring the session down. The default is 90 seconds.
   + Configure `keepAliveTimeSeconds`, which sets the interval between keepalive messages used to determine whether a BGP peer is still reachable and the session is active. The default is 30 seconds.
   + Configure `restartTimeSeconds`, which sets the time within which Cilium’s BGP control plane is expected to re-establish the BGP session after a restart. The default is 120 seconds.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumBGPPeerConfig
     metadata:
       name: cilium-peer
     spec:
       timers:
         holdTimeSeconds: 90
         keepAliveTimeSeconds: 30
       gracefulRestart:
         enabled: true
         restartTimeSeconds: 120
       families:
         - afi: ipv4
           safi: unicast
           advertisements:
             matchLabels:
               advertise: "bgp"
     ```

1. Apply the Cilium BGP peer configuration to your cluster.

   ```
   kubectl apply -f cilium-bgp-peer.yaml
   ```

1. Create a file named `cilium-bgp-advertisement-pods.yaml` with a `CiliumBGPAdvertisement` resource to advertise the pod CIDRs to your on-premises network.
   + The `CiliumBGPAdvertisement` resource is used to define advertisement types and attributes associated with them. The example below configures Cilium to advertise only pod CIDRs. See the examples in [Service type LoadBalancer](hybrid-nodes-ingress.md#hybrid-nodes-ingress-cilium-loadbalancer) and [Cilium in-cluster load balancing](hybrid-nodes-load-balancing.md#hybrid-nodes-service-lb-cilium) for more information on configuring Cilium to advertise service addresses.
   + Each hybrid node running the Cilium agent peers with the upstream BGP-enabled router. Each node advertises the pod CIDR range that it owns when Cilium’s `advertisementType` is set to `PodCIDR` like in the example below. See the [BGP Advertisements configuration](https://docs.cilium.io/en/stable/network/bgp-control-plane/bgp-control-plane-v2/#bgp-advertisements) in the Cilium documentation for more information.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumBGPAdvertisement
     metadata:
       name: bgp-advertisement-pods
       labels:
         advertise: bgp
     spec:
       advertisements:
         - advertisementType: "PodCIDR"
     ```

1. Apply the Cilium BGP Advertisement configuration to your cluster.

   ```
   kubectl apply -f cilium-bgp-advertisement-pods.yaml
   ```

1. You can confirm the BGP peering worked with the [Cilium CLI](https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli) by using the `cilium bgp peers` command. You should see the correct values in the output for your environment and the Session State as `established`. See the [Troubleshooting and Operations Guide](https://docs.cilium.io/en/latest/network/bgp-control-plane/bgp-control-plane/#troubleshooting-and-operation-guide) in the Cilium documentation for more information on troubleshooting.

   In the examples below, there are five hybrid nodes running the Cilium agent and each node is advertising the Pod CIDR range that it owns.

   ```
   cilium bgp peers
   ```

   ```
   Node                   Local AS    Peer AS             Peer Address       Session State   Uptime     Family         Received   Advertised
   mi-026d6a261e355fba7   NODES_ASN   ONPREM_ROUTER_ASN   ONPREM_ROUTER_IP   established     1h18m58s   ipv4/unicast   1          2
   mi-082f73826a163626e   NODES_ASN   ONPREM_ROUTER_ASN   ONPREM_ROUTER_IP   established     1h19m12s   ipv4/unicast   1          2
   mi-09183e8a3d755abf6   NODES_ASN   ONPREM_ROUTER_ASN   ONPREM_ROUTER_IP   established     1h18m47s   ipv4/unicast   1          2
   mi-0d78d815980ed202d   NODES_ASN   ONPREM_ROUTER_ASN   ONPREM_ROUTER_IP   established     1h19m12s   ipv4/unicast   1          2
   mi-0daa253999fe92daa   NODES_ASN   ONPREM_ROUTER_ASN   ONPREM_ROUTER_IP   established     1h18m58s   ipv4/unicast   1          2
   ```

   ```
   cilium bgp routes
   ```

   ```
   Node                   VRouter       Prefix           NextHop   Age         Attrs
   mi-026d6a261e355fba7   NODES_ASN     10.86.2.0/26     0.0.0.0   1h16m46s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-082f73826a163626e   NODES_ASN     10.86.2.192/26   0.0.0.0   1h16m46s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-09183e8a3d755abf6   NODES_ASN     10.86.2.64/26    0.0.0.0   1h16m46s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-0d78d815980ed202d   NODES_ASN     10.86.2.128/26   0.0.0.0   1h16m46s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-0daa253999fe92daa   NODES_ASN     10.86.3.0/26     0.0.0.0   1h16m46s   [{Origin: i} {Nexthop: 0.0.0.0}]
   ```

# Configure Kubernetes Ingress for hybrid nodes
<a name="hybrid-nodes-ingress"></a>

This topic describes how to configure Kubernetes Ingress for workloads running on Amazon EKS Hybrid Nodes. [Kubernetes Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/) exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. To make use of Ingress resources, a Kubernetes Ingress controller is required to set up the networking infrastructure and components that serve the network traffic.

 AWS supports AWS Application Load Balancer (ALB) and Cilium for Kubernetes Ingress for workloads running on EKS Hybrid Nodes. The decision to use ALB or Cilium for Ingress is based on the source of application traffic. If application traffic originates from an AWS Region, AWS recommends using AWS ALB and the AWS Load Balancer Controller. If application traffic originates from the local on-premises or edge environment, AWS recommends using Cilium’s built-in Ingress capabilities, which can be used with or without load balancer infrastructure in your environment.

![\[EKS Hybrid Nodes Ingress\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-ingress.png)


## AWS Application Load Balancer
<a name="hybrid-nodes-ingress-alb"></a>

You can use the [AWS Load Balancer Controller](aws-load-balancer-controller.md) and Application Load Balancer (ALB) with the target type `ip` for workloads running on hybrid nodes. When using target type `ip`, ALB forwards traffic directly to the pods, bypassing the Service layer network path. For ALB to reach the pod IP targets on hybrid nodes, your on-premises pod CIDR must be routable on your on-premises network. Additionally, the AWS Load Balancer Controller uses webhooks and requires direct communication from the EKS control plane. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).

### Considerations
<a name="_considerations"></a>
+ See [Route application and HTTP traffic with Application Load Balancers](alb-ingress.md) and [Install AWS Load Balancer Controller with Helm](lbc-helm.md) for more information on AWS Application Load Balancer and AWS Load Balancer Controller.
+ See [Best Practices for Load Balancing](https://docs.aws.amazon.com/eks/latest/best-practices/load-balancing.html) for information on how to choose between AWS Application Load Balancer and AWS Network Load Balancer.
+ See [AWS Load Balancer Controller Ingress annotations](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/ingress/annotations/) for the list of annotations that can be configured for Ingress resources with AWS Application Load Balancer.

### Prerequisites
<a name="_prerequisites"></a>
+ Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).
+ Cilium BGP Control Plane enabled following the instructions in [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md). If you do not want to use BGP, you must use an alternative method to make your on-premises pod CIDRs routable on your on-premises network. If you do not make your on-premises pod CIDRs routable, ALB will not be able to register or contact your pod IP targets.
+ Helm installed in your command-line environment, see the [Setup Helm instructions](helm.md) for more information.
+ eksctl installed in your command-line environment, see the [eksctl install instructions](install-kubectl.md#eksctl-install-update) for more information.

### Procedure
<a name="_procedure"></a>

1. Download an IAM policy for the AWS Load Balancer Controller that allows it to make calls to AWS APIs on your behalf.

   ```
   curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/refs/heads/main/docs/install/iam_policy.json
   ```

1. Create an IAM policy using the policy downloaded in the previous step.

   ```
   aws iam create-policy \
       --policy-name AWSLoadBalancerControllerIAMPolicy \
       --policy-document file://iam_policy.json
   ```

1. Replace the value for cluster name (`CLUSTER_NAME`), AWS Region (`AWS_REGION`), and AWS account ID (`AWS_ACCOUNT_ID`) with your settings and run the following command.

   ```
   eksctl create iamserviceaccount \
       --cluster=CLUSTER_NAME \
       --namespace=kube-system \
       --name=aws-load-balancer-controller \
       --attach-policy-arn=arn:aws:iam::AWS_ACCOUNT_ID:policy/AWSLoadBalancerControllerIAMPolicy \
       --override-existing-serviceaccounts \
       --region AWS_REGION \
       --approve
   ```

1. Add the eks-charts Helm chart repository and update your local Helm repository to make sure that you have the most recent charts.

   ```
   helm repo add eks https://aws.github.io/eks-charts
   ```

   ```
   helm repo update eks
   ```

1. Install the AWS Load Balancer Controller. Replace the value for cluster name (`CLUSTER_NAME`), AWS Region (`AWS_REGION`), VPC ID (`VPC_ID`), and AWS Load Balancer Controller Helm chart version (`AWS_LBC_HELM_VERSION`) with your settings and run the following command. If you are running a mixed mode cluster with both hybrid nodes and nodes in AWS Cloud, you can run the AWS Load Balancer Controller on cloud nodes following the instructions at [AWS Load Balancer Controller](hybrid-nodes-webhooks.md#hybrid-nodes-mixed-lbc).
   + You can find the latest version of the Helm chart by running `helm search repo eks/aws-load-balancer-controller --versions`.

     ```
     helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
       -n kube-system \
       --version AWS_LBC_HELM_VERSION \
       --set clusterName=CLUSTER_NAME \
       --set region=AWS_REGION \
       --set vpcId=VPC_ID \
       --set serviceAccount.create=false \
       --set serviceAccount.name=aws-load-balancer-controller
     ```

1. Verify the AWS Load Balancer Controller was installed successfully.

   ```
   kubectl get -n kube-system deployment aws-load-balancer-controller
   ```

   ```
   NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
   aws-load-balancer-controller   2/2     2            2           84s
   ```

1. Create a sample application. The example below uses the [Istio Bookinfo](https://istio.io/latest/docs/examples/bookinfo/) sample microservices application.

   ```
   kubectl apply -f https://raw.githubusercontent.com/istio/istio/refs/heads/master/samples/bookinfo/platform/kube/bookinfo.yaml
   ```

1. Create a file named `my-ingress-alb.yaml` with the following contents.

   ```
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: my-ingress
     namespace: default
     annotations:
       alb.ingress.kubernetes.io/load-balancer-name: "my-ingress-alb"
       alb.ingress.kubernetes.io/target-type: "ip"
       alb.ingress.kubernetes.io/scheme: "internet-facing"
       alb.ingress.kubernetes.io/healthcheck-path: "/details/1"
   spec:
     ingressClassName: alb
     rules:
     - http:
         paths:
         - backend:
             service:
               name: details
               port:
                 number: 9080
           path: /details
           pathType: Prefix
   ```

1. Apply the Ingress configuration to your cluster.

   ```
   kubectl apply -f my-ingress-alb.yaml
   ```

1. Provisioning the ALB for your Ingress resource may take a few minutes. Once the ALB is provisioned, your Ingress resource will have an address assigned to it that corresponds to the DNS name of the ALB deployment. The address will have the format `<alb-name>-<random-string>.<region>.elb.amazonaws.com`.

   ```
   kubectl get ingress my-ingress
   ```

   ```
   NAME         CLASS   HOSTS   ADDRESS                                                     PORTS   AGE
   my-ingress   alb     *       my-ingress-alb-<random-string>.<region>.elb.amazonaws.com   80      23m
   ```

1. Access the Service using the address of the ALB.

   ```
   curl -s http://my-ingress-alb-<random-string>.<region>.elb.amazonaws.com:80/details/1 | jq
   ```

   ```
   {
     "id": 1,
     "author": "William Shakespeare",
     "year": 1595,
     "type": "paperback",
     "pages": 200,
     "publisher": "PublisherA",
     "language": "English",
     "ISBN-10": "1234567890",
     "ISBN-13": "123-1234567890",
     "details": "This is the details page"
   }
   ```

## Cilium Ingress and Cilium Gateway Overview
<a name="hybrid-nodes-ingress-cilium"></a>

Cilium’s Ingress capabilities are built into Cilium’s architecture and can be managed with the Kubernetes Ingress API or Gateway API. If you don’t have existing Ingress resources, AWS recommends starting with the Gateway API, as it is a more expressive and flexible way to define and manage Kubernetes networking resources. The [Kubernetes Gateway API](https://gateway-api.sigs.k8s.io/) aims to standardize how networking resources for Ingress, Load Balancing, and Service Mesh are defined and managed in Kubernetes clusters.

When you enable Cilium’s Ingress or Gateway features, the Cilium operator reconciles Ingress / Gateway objects in the cluster and Envoy proxies on each node process the Layer 7 (L7) network traffic. Cilium does not directly provision Ingress / Gateway infrastructure such as load balancers. If you plan to use Cilium Ingress / Gateway with a load balancer, you must use the load balancer’s tooling, commonly an Ingress or Gateway controller, to deploy and manage the load balancer’s infrastructure.

For Ingress / Gateway traffic, Cilium handles the core network traffic and L3/L4 policy enforcement, and integrated Envoy proxies process the L7 network traffic. With Cilium Ingress / Gateway, Envoy is responsible for applying L7 routing rules, policies, and request manipulation, advanced traffic management such as traffic splitting and mirroring, and TLS termination and origination. Cilium’s Envoy proxies are deployed as a separate DaemonSet (`cilium-envoy`) by default, which enables Envoy and the Cilium agent to be separately updated, scaled, and managed.

For more information on how Cilium Ingress and Cilium Gateway work, see the [Cilium Ingress](https://docs.cilium.io/en/stable/network/servicemesh/ingress/) and [Cilium Gateway](https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/) pages in the Cilium documentation.

## Cilium Ingress and Gateway Comparison
<a name="hybrid-nodes-ingress-cilium-comparison"></a>

The table below summarizes the Cilium Ingress and Cilium Gateway features as of **Cilium version 1.17.x**.


| Feature | Ingress | Gateway | 
| --- | --- | --- | 
|  Service type LoadBalancer  |  Yes  |  Yes  | 
|  Service type NodePort  |  Yes  |  No¹  | 
|  Host network  |  Yes  |  Yes  | 
|  Shared load balancer  |  Yes  |  Yes  | 
|  Dedicated load balancer  |  Yes  |  No²  | 
|  Network policies  |  Yes  |  Yes  | 
|  Protocols  |  Layer 7 (HTTP(S), gRPC)  |  Layer 7 (HTTP(S), gRPC)³  | 
|  TLS Passthrough  |  Yes  |  Yes  | 
|  Traffic Management  |  Path and Host routing  |  Path and Host routing, URL redirect and rewrite, traffic splitting, header modification  | 

 ¹ Cilium Gateway support for NodePort services is planned for Cilium version 1.18.x ([#27273](https://github.com/cilium/cilium/pull/27273))

 ² Cilium Gateway support for dedicated load balancers is tracked in [#25567](https://github.com/cilium/cilium/issues/25567)

 ³ Cilium Gateway support for TCP/UDP is tracked in [#21929](https://github.com/cilium/cilium/issues/21929)

## Install Cilium Gateway
<a name="hybrid-nodes-ingress-cilium-gateway-install"></a>

### Considerations
<a name="_considerations_2"></a>
+ Cilium must be configured with `nodePort.enabled` set to `true` as shown in the examples below. If you are using Cilium’s kube-proxy replacement feature, you do not need to set `nodePort.enabled` to `true`.
+ Cilium must be configured with `envoy.enabled` set to `true` as shown in the examples below.
+ Cilium Gateway can be deployed in load balancer (default) or host network mode.
+ When using Cilium Gateway in load balancer mode, the `service.beta.kubernetes.io/aws-load-balancer-type: "external"` annotation must be set on the Gateway resource to prevent the legacy AWS cloud provider from creating a Classic Load Balancer for the Service of type LoadBalancer that Cilium creates for the Gateway resource.
+ When using Cilium Gateway in host network mode, the load balancer mode is disabled. Host network mode is useful for environments that do not have load balancer infrastructure; see [Host network](#hybrid-nodes-ingress-cilium-host-network) for more information.

### Prerequisites
<a name="_prerequisites_2"></a>

1. Helm installed in your command-line environment, see [Setup Helm instructions](helm.md).

1. Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

### Procedure
<a name="_procedure_2"></a>

1. Install the Kubernetes Gateway API Custom Resource Definitions (CRDs).

   ```
   kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.2.1/config/crd/standard/gateway.networking.k8s.io_gatewayclasses.yaml
   kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.2.1/config/crd/standard/gateway.networking.k8s.io_gateways.yaml
   kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.2.1/config/crd/standard/gateway.networking.k8s.io_httproutes.yaml
   kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.2.1/config/crd/standard/gateway.networking.k8s.io_referencegrants.yaml
   kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/gateway-api/v1.2.1/config/crd/standard/gateway.networking.k8s.io_grpcroutes.yaml
   ```

1. Create a file called `cilium-gateway-values.yaml` with the following contents. The example below configures Cilium Gateway to use the default load balancer mode and to use a separate `cilium-envoy` DaemonSet for Envoy proxies configured to run only on hybrid nodes.

   ```
   gatewayAPI:
     enabled: true
     # uncomment to use host network mode
     # hostNetwork:
     #   enabled: true
   nodePort:
     enabled: true
   envoy:
     enabled: true
     affinity:
       nodeAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
           nodeSelectorTerms:
           - matchExpressions:
             - key: eks.amazonaws.com/compute-type
               operator: In
               values:
               - hybrid
   ```

1. Apply the Helm values file to your cluster.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium \
     --namespace kube-system \
     --reuse-values \
     --set operator.rollOutPods=true \
     --values cilium-gateway-values.yaml
   ```

1. Confirm the Cilium operator, agent, and Envoy pods are running.

   ```
   kubectl -n kube-system get pods --selector=app.kubernetes.io/part-of=cilium
   ```

   ```
   NAME                               READY   STATUS    RESTARTS   AGE
   cilium-envoy-5pgnd                 1/1     Running   0          6m31s
   cilium-envoy-6fhg4                 1/1     Running   0          6m30s
   cilium-envoy-jskrk                 1/1     Running   0          6m30s
   cilium-envoy-k2xtb                 1/1     Running   0          6m31s
   cilium-envoy-w5s9j                 1/1     Running   0          6m31s
   cilium-grwlc                       1/1     Running   0          4m12s
   cilium-operator-68f7766967-5nnbl   1/1     Running   0          4m20s
   cilium-operator-68f7766967-7spfz   1/1     Running   0          4m20s
   cilium-pnxcv                       1/1     Running   0          6m29s
   cilium-r7qkj                       1/1     Running   0          4m12s
   cilium-wxhfn                       1/1     Running   0          4m1s
   cilium-z7hlb                       1/1     Running   0          6m30s
   ```

## Configure Cilium Gateway
<a name="hybrid-nodes-ingress-cilium-gateway-configure"></a>

Cilium Gateway is enabled on Gateway objects by setting the `gatewayClassName` to `cilium`. The Service that Cilium creates for Gateway resources can be configured with fields on the Gateway object. Common annotations used by Gateway controllers to configure the load balancer infrastructure can be configured with the Gateway object’s `infrastructure` field. When using Cilium’s LoadBalancer IPAM (see example in [Service type LoadBalancer](#hybrid-nodes-ingress-cilium-loadbalancer)), the IP address to use for the Service of type LoadBalancer can be configured on the Gateway object’s `addresses` field. For more information on Gateway configuration, see the [Kubernetes Gateway API specification](https://gateway-api.sigs.k8s.io/reference/spec/#gateway).

```
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: my-gateway
spec:
  gatewayClassName: cilium
  infrastructure:
    annotations:
      service.beta.kubernetes.io/...
      service.kubernetes.io/...
  addresses:
  - type: IPAddress
    value: <LoadBalancer IP address>
  listeners:
  ...
```

Cilium and the Kubernetes Gateway specification support the GatewayClass, Gateway, HTTPRoute, GRPCRoute, and ReferenceGrant resources.
+ See the [HTTPRoute](https://gateway-api.sigs.k8s.io/api-types/httproute/) and [GRPCRoute](https://gateway-api.sigs.k8s.io/api-types/grpcroute/) specifications for the list of available fields.
+ See the examples in the [Deploy Cilium Gateway](#hybrid-nodes-ingress-cilium-gateway-deploy) section below and the examples in the [Cilium documentation](https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/#examples) for how to use and configure these resources.

## Deploy Cilium Gateway
<a name="hybrid-nodes-ingress-cilium-gateway-deploy"></a>

1. Create a sample application. The example below uses the [Istio Bookinfo](https://istio.io/latest/docs/examples/bookinfo/) sample microservices application.

   ```
   kubectl apply -f https://raw.githubusercontent.com/istio/istio/refs/heads/master/samples/bookinfo/platform/kube/bookinfo.yaml
   ```

1. Confirm the application is running successfully.

   ```
   kubectl get pods
   ```

   ```
   NAME                              READY   STATUS    RESTARTS   AGE
   details-v1-766844796b-9965p       1/1     Running   0          81s
   productpage-v1-54bb874995-jmc8j   1/1     Running   0          80s
   ratings-v1-5dc79b6bcd-smzxz       1/1     Running   0          80s
   reviews-v1-598b896c9d-vj7gb       1/1     Running   0          80s
   reviews-v2-556d6457d-xbt8v        1/1     Running   0          80s
   reviews-v3-564544b4d6-cpmvq       1/1     Running   0          80s
   ```

1. Create a file named `my-gateway.yaml` with the following contents. The example below uses the `service.beta.kubernetes.io/aws-load-balancer-type: "external"` annotation to prevent the legacy AWS cloud provider from creating a Classic Load Balancer for the Service of type LoadBalancer that Cilium creates for the Gateway resource.

   ```
   ---
   apiVersion: gateway.networking.k8s.io/v1
   kind: Gateway
   metadata:
     name: my-gateway
   spec:
     gatewayClassName: cilium
     infrastructure:
       annotations:
         service.beta.kubernetes.io/aws-load-balancer-type: "external"
     listeners:
     - protocol: HTTP
       port: 80
       name: web-gw
       allowedRoutes:
         namespaces:
           from: Same
   ---
   apiVersion: gateway.networking.k8s.io/v1
   kind: HTTPRoute
   metadata:
     name: http-app-1
   spec:
     parentRefs:
     - name: my-gateway
       namespace: default
     rules:
     - matches:
       - path:
           type: PathPrefix
           value: /details
       backendRefs:
       - name: details
         port: 9080
   ```

1. Apply the Gateway resource to your cluster.

   ```
   kubectl apply -f my-gateway.yaml
   ```

1. Confirm the Gateway resource and corresponding Service were created. At this stage, it is expected that the `ADDRESS` field of the Gateway resource is not populated with an IP address or hostname, and that the Service of type LoadBalancer for the Gateway resource similarly does not have an IP address or hostname assigned.

   ```
   kubectl get gateway my-gateway
   ```

   ```
   NAME         CLASS    ADDRESS   PROGRAMMED   AGE
   my-gateway   cilium             True         10s
   ```

   ```
   kubectl get svc cilium-gateway-my-gateway
   ```

   ```
   NAME                        TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
   cilium-gateway-my-gateway   LoadBalancer   172.16.227.247   <pending>     80:30912/TCP   24s
   ```

1. Proceed to [Service type LoadBalancer](#hybrid-nodes-ingress-cilium-loadbalancer) to configure the Gateway resource to use an IP address allocated by Cilium Load Balancer IPAM, and [Service type NodePort](#hybrid-nodes-ingress-cilium-nodeport) or [Host network](#hybrid-nodes-ingress-cilium-host-network) to configure the Gateway resource to use NodePort or host network addresses.

## Install Cilium Ingress
<a name="hybrid-nodes-ingress-cilium-ingress-install"></a>

### Considerations
<a name="_considerations_3"></a>
+ Cilium must be configured with `nodePort.enabled` set to `true` as shown in the examples below. If you are using Cilium’s kube-proxy replacement feature, you do not need to set `nodePort.enabled` to `true`.
+ Cilium must be configured with `envoy.enabled` set to `true` as shown in the examples below.
+ With `ingressController.loadbalancerMode` set to `dedicated`, Cilium creates dedicated Services for each Ingress resource. With `ingressController.loadbalancerMode` set to `shared`, Cilium creates a shared Service of type LoadBalancer for all Ingress resources in the cluster. When using the `shared` load balancer mode, the settings for the shared Service such as `labels`, `annotations`, `type`, and `loadBalancerIP` are configured in the `ingressController.service` section of the Helm values. See the [Cilium Helm values reference](https://github.com/cilium/cilium/blob/v1.17.6/install/kubernetes/cilium/values.yaml#L887) for more information.
+ With `ingressController.default` set to `true`, Cilium is configured as the default Ingress controller for the cluster and will create Ingress entries even when the `ingressClassName` is not specified on Ingress resources.
+ Cilium Ingress can be deployed in load balancer (default), node port, or host network mode. When Cilium is installed in host network mode, the Service of type LoadBalancer and Service of type NodePort modes are disabled. See [Host network](#hybrid-nodes-ingress-cilium-host-network) for more information.
+ Always set `ingressController.service.annotations` to `service.beta.kubernetes.io/aws-load-balancer-type: "external"` in the Helm values to prevent the legacy AWS cloud provider from creating a Classic Load Balancer for the default `cilium-ingress` Service created by the [Cilium Helm chart](https://github.com/cilium/cilium/blob/main/install/kubernetes/cilium/templates/cilium-ingress-service.yaml).
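For the `shared` load balancer mode described above, the Helm values might look like the following. This is a sketch based on the Cilium Helm values referenced above; the annotation mirrors the one required throughout these procedures, and other settings under `ingressController.service` are optional.

```
ingressController:
  enabled: true
  loadbalancerMode: shared
  # settings for the single shared Service used by all Ingress resources
  service:
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-type: "external"
nodePort:
  enabled: true
envoy:
  enabled: true
```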

### Prerequisites
<a name="_prerequisites_3"></a>

1. Helm installed in your command-line environment, see [Setup Helm instructions](helm.md).

1. Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

### Procedure
<a name="_procedure_3"></a>

1. Create a file called `cilium-ingress-values.yaml` with the following contents. The example below configures Cilium Ingress to use the default load balancer `dedicated` mode and to use a separate `cilium-envoy` DaemonSet for Envoy proxies configured to run only on hybrid nodes.

   ```
   ingressController:
     enabled: true
     loadbalancerMode: dedicated
     service:
       annotations:
         service.beta.kubernetes.io/aws-load-balancer-type: "external"
   nodePort:
     enabled: true
   envoy:
     enabled: true
     affinity:
       nodeAffinity:
         requiredDuringSchedulingIgnoredDuringExecution:
           nodeSelectorTerms:
           - matchExpressions:
             - key: eks.amazonaws.com/compute-type
               operator: In
               values:
               - hybrid
   ```

1. Apply the Helm values file to your cluster.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium \
     --namespace kube-system \
     --reuse-values \
     --set operator.rollOutPods=true \
     --values cilium-ingress-values.yaml
   ```

1. Confirm the Cilium operator, agent, and Envoy pods are running.

   ```
   kubectl -n kube-system get pods --selector=app.kubernetes.io/part-of=cilium
   ```

   ```
   NAME                               READY   STATUS    RESTARTS   AGE
   cilium-envoy-5pgnd                 1/1     Running   0          6m31s
   cilium-envoy-6fhg4                 1/1     Running   0          6m30s
   cilium-envoy-jskrk                 1/1     Running   0          6m30s
   cilium-envoy-k2xtb                 1/1     Running   0          6m31s
   cilium-envoy-w5s9j                 1/1     Running   0          6m31s
   cilium-grwlc                       1/1     Running   0          4m12s
   cilium-operator-68f7766967-5nnbl   1/1     Running   0          4m20s
   cilium-operator-68f7766967-7spfz   1/1     Running   0          4m20s
   cilium-pnxcv                       1/1     Running   0          6m29s
   cilium-r7qkj                       1/1     Running   0          4m12s
   cilium-wxhfn                       1/1     Running   0          4m1s
   cilium-z7hlb                       1/1     Running   0          6m30s
   ```

## Configure Cilium Ingress
<a name="hybrid-nodes-ingress-cilium-ingress-configure"></a>

Cilium Ingress is enabled on Ingress objects by setting the `ingressClassName` to `cilium`. The Service(s) that Cilium creates for Ingress resources can be configured with annotations on the Ingress objects when using the `dedicated` load balancer mode and in the Cilium / Helm configuration when using the `shared` load balancer mode. These annotations are commonly used by Ingress controllers to configure the load balancer infrastructure, or other attributes of the Service such as the service type, load balancer mode, ports, and TLS passthrough. Key annotations are described below. For a full list of supported annotations, see the [Cilium Ingress annotations](https://docs.cilium.io/en/stable/network/servicemesh/ingress/#supported-ingress-annotations) in the Cilium documentation.


| Annotation | Description | 
| --- | --- | 
|   `ingress.cilium.io/loadbalancer-mode`   |   `dedicated`: Dedicated Service of type LoadBalancer for each Ingress resource (default).  `shared`: Single Service of type LoadBalancer for all Ingress resources.  | 
|   `ingress.cilium.io/service-type`   |   `LoadBalancer`: The Service will be of type LoadBalancer (default).  `NodePort`: The Service will be of type NodePort.  | 
|   `service.beta.kubernetes.io/aws-load-balancer-type`   |   `"external"`: Prevent legacy AWS cloud provider from provisioning Classic Load Balancer for Services of type LoadBalancer.  | 
|   `lbipam.cilium.io/ips`   |  List of IP addresses to allocate from Cilium LoadBalancer IPAM  | 

Cilium and the Kubernetes Ingress specification support Exact, Prefix, and Implementation-specific matching rules for Ingress paths. Cilium supports regex as its implementation-specific matching rule. For more information, see [Ingress path types and precedence](https://docs.cilium.io/en/stable/network/servicemesh/ingress/#ingress-path-types-and-precedence) and [Path types examples](https://docs.cilium.io/en/stable/network/servicemesh/path-types/) in the Cilium documentation, and the examples in the [Deploy Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-deploy) section of this page.
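As an illustrative sketch of these path types (the `health` Service and the regex path below are hypothetical and not part of the Bookinfo examples on this page), an Ingress might combine them as follows:

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: path-types-example
spec:
  ingressClassName: cilium
  rules:
  - http:
      paths:
      # Exact: matches only /health, no subpaths
      - path: /health
        pathType: Exact
        backend:
          service:
            name: health
            port:
              number: 8080
      # Prefix: matches /details and subpaths such as /details/1
      - path: /details
        pathType: Prefix
        backend:
          service:
            name: details
            port:
              number: 9080
      # ImplementationSpecific: Cilium interprets this as a regex
      - path: /reviews/[0-9]+
        pathType: ImplementationSpecific
        backend:
          service:
            name: reviews
            port:
              number: 9080
```

Per the Cilium precedence rules linked above, exact matches take priority over prefix matches, and longer paths take priority over shorter ones.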

An example Cilium Ingress object is shown below.

```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-ingress
  annotations:
    service.beta.kubernetes.io/...
    service.kubernetes.io/...
spec:
  ingressClassName: cilium
  rules:
  ...
```

## Deploy Cilium Ingress
<a name="hybrid-nodes-ingress-cilium-ingress-deploy"></a>

1. Create a sample application. The example below uses the [Istio Bookinfo](https://istio.io/latest/docs/examples/bookinfo/) sample microservices application.

   ```
   kubectl apply -f https://raw.githubusercontent.com/istio/istio/refs/heads/master/samples/bookinfo/platform/kube/bookinfo.yaml
   ```

1. Confirm the application is running successfully.

   ```
   kubectl get pods
   ```

   ```
   NAME                              READY   STATUS    RESTARTS   AGE
   details-v1-766844796b-9965p       1/1     Running   0          81s
   productpage-v1-54bb874995-jmc8j   1/1     Running   0          80s
   ratings-v1-5dc79b6bcd-smzxz       1/1     Running   0          80s
   reviews-v1-598b896c9d-vj7gb       1/1     Running   0          80s
   reviews-v2-556d6457d-xbt8v        1/1     Running   0          80s
   reviews-v3-564544b4d6-cpmvq       1/1     Running   0          80s
   ```

1. Create a file named `my-ingress.yaml` with the following contents. The example below uses the `service.beta.kubernetes.io/aws-load-balancer-type: "external"` annotation to prevent the legacy AWS cloud provider from creating a Classic Load Balancer for the Service of type LoadBalancer that Cilium creates for the Ingress resource.

   ```
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: my-ingress
     namespace: default
     annotations:
       service.beta.kubernetes.io/aws-load-balancer-type: "external"
   spec:
     ingressClassName: cilium
     rules:
     - http:
         paths:
         - backend:
             service:
               name: details
               port:
                 number: 9080
           path: /details
           pathType: Prefix
   ```

1. Apply the Ingress resource to your cluster.

   ```
   kubectl apply -f my-ingress.yaml
   ```

1. Confirm the Ingress resource and corresponding Service were created. At this stage, it is expected that the `ADDRESS` field of the Ingress resource is not populated with an IP address or hostname, and that the shared or dedicated Service of type LoadBalancer for the Ingress resource similarly does not have an IP address or hostname assigned.

   ```
   kubectl get ingress my-ingress
   ```

   ```
   NAME         CLASS    HOSTS   ADDRESS   PORTS   AGE
   my-ingress   cilium   *                 80      8s
   ```

   For load balancer mode `shared`:

   ```
   kubectl -n kube-system get svc cilium-ingress
   ```

   ```
   NAME             TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
   cilium-ingress   LoadBalancer   172.16.217.48   <pending>     80:32359/TCP,443:31090/TCP   10m
   ```

   For load balancer mode `dedicated`:

   ```
   kubectl -n default get svc cilium-ingress-my-ingress
   ```

   ```
   NAME                        TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
   cilium-ingress-my-ingress   LoadBalancer   172.16.193.15   <pending>     80:32088/TCP,443:30332/TCP   25s
   ```

1. Proceed to [Service type LoadBalancer](#hybrid-nodes-ingress-cilium-loadbalancer) to configure the Ingress resource to use an IP address allocated by Cilium Load Balancer IPAM, and [Service type NodePort](#hybrid-nodes-ingress-cilium-nodeport) or [Host network](#hybrid-nodes-ingress-cilium-host-network) to configure the Ingress resource to use NodePort or host network addresses.

## Service type LoadBalancer
<a name="hybrid-nodes-ingress-cilium-loadbalancer"></a>

### Existing load balancer infrastructure
<a name="_existing_load_balancer_infrastructure"></a>

By default, for both Cilium Ingress and Cilium Gateway, Cilium creates Kubernetes Service(s) of type LoadBalancer for the Ingress / Gateway resources. The attributes of the Service(s) that Cilium creates can be configured through the Ingress and Gateway resources. When you create Ingress or Gateway resources, the externally exposed IP address or hostnames for the Ingress or Gateway are allocated from the load balancer infrastructure, which is typically provisioned by an Ingress or Gateway controller.

Many Ingress and Gateway controllers use annotations to detect and configure the load balancer infrastructure. The annotations for these Ingress and Gateway controllers are configured on the Ingress or Gateway resources as shown in the previous examples above. Reference your Ingress or Gateway controller’s documentation for the annotations it supports and see the [Kubernetes Ingress documentation](https://kubernetes.io/docs/concepts/services-networking/ingress-controllers/) and [Kubernetes Gateway documentation](https://gateway-api.sigs.k8s.io/implementations/) for a list of popular controllers.

**Important**  
Cilium Ingress and Gateway cannot be used with the AWS Load Balancer Controller and AWS Network Load Balancers (NLBs) with EKS Hybrid Nodes. Attempting to use these together results in unregistered targets, because the NLB attempts to connect directly to the Pod IPs that back the Service of type LoadBalancer when the NLB’s `target-type` is set to `ip` (a requirement for using NLBs with workloads running on EKS Hybrid Nodes).

### No load balancer infrastructure
<a name="_no_load_balancer_infrastructure"></a>

If you do not have load balancer infrastructure and corresponding Ingress / Gateway controller in your environment, Ingress / Gateway resources and corresponding Services of type LoadBalancer can be configured to use IP addresses allocated by Cilium’s [Load Balancer IP address management](https://docs.cilium.io/en/stable/network/lb-ipam/) (LB IPAM). Cilium LB IPAM can be configured with known IP address ranges from your on-premises environment, and can use Cilium’s built-in Border Gateway Protocol (BGP) support or L2 announcements to advertise the LoadBalancer IP addresses to your on-premises network.

The example below shows how to configure Cilium’s LB IPAM with an IP address to use for your Ingress / Gateway resources, and how to configure Cilium BGP Control Plane to advertise the LoadBalancer IP address with the on-premises network. Cilium’s LB IPAM feature is enabled by default, but is not activated until a `CiliumLoadBalancerIPPool` resource is created.

#### Prerequisites
<a name="_prerequisites_4"></a>
+ Cilium Ingress or Gateway installed following the instructions in [Install Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-install) or [Install Cilium Gateway](#hybrid-nodes-ingress-cilium-gateway-install).
+ Cilium Ingress or Gateway resources with sample application deployed following the instructions in [Deploy Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-deploy) or [Deploy Cilium Gateway](#hybrid-nodes-ingress-cilium-gateway-deploy).
+ Cilium BGP Control Plane enabled following the instructions in [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md). If you do not want to use BGP, you can skip this prerequisite, but you will not be able to access your Ingress or Gateway resource until the LoadBalancer IP address allocated by Cilium LB IPAM is routable on your on-premises network.

#### Procedure
<a name="_procedure_4"></a>

1. Optionally patch the Ingress or Gateway resource to request a specific IP address to use for the Service of type LoadBalancer. If you do not request a specific IP address, Cilium will allocate an IP address from the IP address range configured in the `CiliumLoadBalancerIPPool` resource in the subsequent step. In the commands below, replace `LB_IP_ADDRESS` with the IP address to request for the Service of type LoadBalancer.

    **Gateway** 

   ```
   kubectl patch gateway -n default my-gateway --type=merge -p '{
     "spec": {
       "addresses": [{"type": "IPAddress", "value": "LB_IP_ADDRESS"}]
     }
   }'
   ```

    **Ingress** 

   ```
   kubectl patch ingress my-ingress --type=merge -p '{
     "metadata": {"annotations": {"lbipam.cilium.io/ips": "LB_IP_ADDRESS"}}
   }'
   ```

1. Create a file named `cilium-lbip-pool-ingress.yaml` with a `CiliumLoadBalancerIPPool` resource to configure the Load Balancer IP address range for your Ingress / Gateway resources.
   + If you are using Cilium Ingress, Cilium automatically applies the `cilium.io/ingress: "true"` label to the Services it creates for Ingress resources. You can use this label in the `serviceSelector` field of the `CiliumLoadBalancerIPPool` resource definition to select the Services eligible for LB IPAM.
   + If you are using Cilium Gateway, you can use the `gateway.networking.k8s.io/gateway-name` label in the `serviceSelector` fields of the `CiliumLoadBalancerIPPool` resource definition to select the Gateway resources eligible for LB IPAM.
   + Replace `LB_IP_CIDR` with the IP address range to use for the Load Balancer IP addresses. To select a single IP address, use a `/32` CIDR. For more information, see [LoadBalancer IP Address Management](https://docs.cilium.io/en/stable/network/lb-ipam/) in the Cilium documentation.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumLoadBalancerIPPool
     metadata:
       name: bookinfo-pool
     spec:
       blocks:
       - cidr: "LB_IP_CIDR"
       serviceSelector:
         # if using Cilium Gateway
         matchExpressions:
           - { key: gateway.networking.k8s.io/gateway-name, operator: In, values: [ my-gateway ] }
         # if using Cilium Ingress
         matchLabels:
           cilium.io/ingress: "true"
     ```

1. Apply the `CiliumLoadBalancerIPPool` resource to your cluster.

   ```
   kubectl apply -f cilium-lbip-pool-ingress.yaml
   ```

1. Confirm an IP address was allocated from Cilium LB IPAM for the Ingress / Gateway resource.

    **Gateway** 

   ```
   kubectl get gateway my-gateway
   ```

   ```
   NAME         CLASS    ADDRESS        PROGRAMMED   AGE
   my-gateway   cilium   LB_IP_ADDRESS    True         6m41s
   ```

    **Ingress** 

   ```
   kubectl get ingress my-ingress
   ```

   ```
   NAME         CLASS    HOSTS   ADDRESS        PORTS   AGE
   my-ingress   cilium   *       LB_IP_ADDRESS   80      10m
   ```

1. Create a file named `cilium-bgp-advertisement-ingress.yaml` with a `CiliumBGPAdvertisement` resource to advertise the LoadBalancer IP address for the Ingress / Gateway resources. If you are not using Cilium BGP, you can skip this step. The LoadBalancer IP address used for your Ingress / Gateway resource must be routable on your on-premises network for you to be able to query the service in the next step.

   ```
   apiVersion: cilium.io/v2alpha1
   kind: CiliumBGPAdvertisement
   metadata:
     name: bgp-advertisement-lb-ip
     labels:
       advertise: bgp
   spec:
     advertisements:
       - advertisementType: "Service"
         service:
           addresses:
             - LoadBalancerIP
         selector:
           # if using Cilium Gateway
           matchExpressions:
             - { key: gateway.networking.k8s.io/gateway-name, operator: In, values: [ my-gateway ] }
           # if using Cilium Ingress
           matchLabels:
             cilium.io/ingress: "true"
   ```

1. Apply the `CiliumBGPAdvertisement` resource to your cluster.

   ```
   kubectl apply -f cilium-bgp-advertisement-ingress.yaml
   ```

1. Access the service using the IP address allocated from Cilium LB IPAM.

   ```
   curl -s http://LB_IP_ADDRESS:80/details/1 | jq
   ```

   ```
   {
     "id": 1,
     "author": "William Shakespeare",
     "year": 1595,
     "type": "paperback",
     "pages": 200,
     "publisher": "PublisherA",
     "language": "English",
     "ISBN-10": "1234567890",
     "ISBN-13": "123-1234567890"
   }
   ```

## Service type NodePort
<a name="hybrid-nodes-ingress-cilium-nodeport"></a>

If you do not have load balancer infrastructure and a corresponding Ingress controller in your environment, or if you are self-managing your load balancer infrastructure or using DNS-based load balancing, you can configure Cilium Ingress to create Services of type NodePort for the Ingress resources. When using NodePort with Cilium Ingress, the Service of type NodePort is exposed on a port on each node in the port range 30000-32767. In this mode, when traffic reaches any node in the cluster on the NodePort, it is forwarded to a Pod that backs the Service, which may be on the same node or a different node.
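Once the NodePort is assigned, the Ingress can be reached through any node in the cluster. As a sketch, replace `NODE_IP` with the internal IP address of one of your nodes and `NODE_PORT` with the HTTP port of the NodePort Service (both are obtained in the procedure below):

```
curl -s http://NODE_IP:NODE_PORT/details/1 | jq
```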

**Note**  
Cilium Gateway support for NodePort services is planned for Cilium version 1.18.x ([#27273](https://github.com/cilium/cilium/pull/27273))

### Prerequisites
<a name="_prerequisites_5"></a>
+ Cilium Ingress installed following the instructions in [Install Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-install).
+ Cilium Ingress resources with sample application deployed following the instructions in [Deploy Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-deploy).

### Procedure
<a name="_procedure_5"></a>

1. Patch the existing Ingress resource `my-ingress` to change it from Service type LoadBalancer to NodePort.

   ```
   kubectl patch ingress my-ingress --type=merge -p '{
       "metadata": {"annotations": {"ingress.cilium.io/service-type": "NodePort"}}
   }'
   ```

   If you have not created the Ingress resource, you can create it by applying the following Ingress definition to your cluster. Note, the Ingress definition below uses the Istio Bookinfo sample application described in [Deploy Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-deploy).

   ```
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: my-ingress
     namespace: default
     annotations:
       ingress.cilium.io/service-type: "NodePort"
   spec:
     ingressClassName: cilium
     rules:
     - http:
         paths:
         - backend:
             service:
               name: details
               port:
                 number: 9080
           path: /details
           pathType: Prefix
   ```

1. Confirm the Service for the Ingress resource was updated to use Service type NodePort. Note the port for the HTTP protocol in the output. In the example below, this HTTP port is `32353`, which is used in a subsequent step to query the Service. The benefit of using Cilium Ingress with a Service of type NodePort is that you can apply path-based and host-based routing, as well as network policies, to the ingress traffic, which is not possible with a standard Service of type NodePort without Ingress.

   ```
   kubectl -n default get svc cilium-ingress-my-ingress
   ```

   ```
   NAME                        TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
   cilium-ingress-my-ingress   NodePort   172.16.47.153   <none>        80:32353/TCP,443:30253/TCP   27m
   ```
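   Rather than copying the NodePort from the table output, you can capture it programmatically. A sketch, assuming the Service name above; the jsonpath filter selects the NodePort for the Service port `80`:

   ```
   HTTP_NODEPORT=$(kubectl -n default get svc cilium-ingress-my-ingress \
     -o jsonpath='{.spec.ports[?(@.port==80)].nodePort}')
   echo "$HTTP_NODEPORT"
   ```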

1. Get the IP addresses of your nodes in your cluster.

   ```
   kubectl get nodes -o wide
   ```

   ```
   NAME                   STATUS   ROLES    AGE   VERSION               INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
   mi-026d6a261e355fba7   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.150   <none>        Ubuntu 22.04.5 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-082f73826a163626e   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.32    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-09183e8a3d755abf6   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.33    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0d78d815980ed202d   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.97    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0daa253999fe92daa   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.100   <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   ```

1. Access the Service of type NodePort using the IP addresses of your nodes and the NodePort captured above. In the example below, the node IP address is `10.80.146.32` and the NodePort is `32353`. Replace these with the values for your environment.

   ```
   curl -s http://10.80.146.32:32353/details/1 | jq
   ```

   ```
   {
     "id": 1,
     "author": "William Shakespeare",
     "year": 1595,
     "type": "paperback",
     "pages": 200,
     "publisher": "PublisherA",
     "language": "English",
     "ISBN-10": "1234567890",
     "ISBN-13": "123-1234567890"
   }
   ```
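   To illustrate that any node in the cluster forwards NodePort traffic to a backing pod, you can repeat the request against each node's IP address. A sketch using the example node addresses and NodePort above; substitute the values for your environment:

   ```
   # Each request should return the same book details regardless of which node receives it
   for node_ip in 10.80.146.150 10.80.146.32 10.80.146.33; do
     curl -s "http://${node_ip}:32353/details/1" | jq .id
   done
   ```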

## Host network
<a name="hybrid-nodes-ingress-cilium-host-network"></a>

Similar to Service of type NodePort, if you do not have load balancer infrastructure and an Ingress or Gateway controller, or if you are self-managing your load balancing with an external load balancer, you can configure Cilium Ingress and Cilium Gateway to expose Ingress and Gateway resources directly on the host network. When host network mode is enabled for an Ingress or Gateway resource, the Service of type LoadBalancer and NodePort modes are automatically disabled; host network mode is mutually exclusive with these modes for each Ingress or Gateway resource. Compared to the Service of type NodePort mode, host network mode offers additional flexibility: the listener is not restricted to the 30000-32767 NodePort range, and you can configure a subset of nodes where the Envoy proxies run on the host network.

### Prerequisites
<a name="_prerequisites_6"></a>
+ Cilium Ingress or Gateway installed following the instructions in [Install Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-install) or [Install Cilium Gateway](#hybrid-nodes-ingress-cilium-gateway-install).

### Procedure
<a name="_procedure_6"></a>

#### Gateway
<a name="_gateway"></a>

1. Create a file named `cilium-gateway-host-network.yaml` with the following content.

   ```
   gatewayAPI:
     enabled: true
     hostNetwork:
       enabled: true
       # uncomment to restrict nodes where Envoy proxies run on the host network
       # nodes:
       #   matchLabels:
       #     role: gateway
   ```

1. Apply the host network Cilium Gateway configuration to your cluster.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium \
     --namespace kube-system \
     --reuse-values \
     --set operator.rollOutPods=true \
     -f cilium-gateway-host-network.yaml
   ```

   If you have not created the Gateway resource, you can create it by applying the following Gateway definition to your cluster. The Gateway definition below uses the Istio Bookinfo sample application described in [Deploy Cilium Gateway](#hybrid-nodes-ingress-cilium-gateway-deploy). In the example below, the Gateway resource is configured to use the `8111` port for the HTTP listener, which is the shared listener port for the Envoy proxies running on the host network. If you are using a privileged port (lower than 1023) for the Gateway resource, reference the [Cilium documentation](https://docs.cilium.io/en/stable/network/servicemesh/gateway-api/gateway-api/#bind-to-privileged-port) for instructions.

   ```
   ---
   apiVersion: gateway.networking.k8s.io/v1
   kind: Gateway
   metadata:
     name: my-gateway
   spec:
     gatewayClassName: cilium
     listeners:
     - protocol: HTTP
       port: 8111
       name: web-gw
       allowedRoutes:
         namespaces:
           from: Same
   ---
   apiVersion: gateway.networking.k8s.io/v1
   kind: HTTPRoute
   metadata:
     name: http-app-1
   spec:
     parentRefs:
     - name: my-gateway
       namespace: default
     rules:
     - matches:
       - path:
           type: PathPrefix
           value: /details
       backendRefs:
       - name: details
         port: 9080
   ```

   You can observe the applied Cilium Envoy Configuration with the following command.

   ```
   kubectl get cec cilium-gateway-my-gateway -o yaml
   ```

   You can get the Envoy listener port for the `cilium-gateway-my-gateway` Service with the following command. In this example, the shared listener port is `8111`.

   ```
   kubectl get cec cilium-gateway-my-gateway -o jsonpath='{.spec.services[0].ports[0]}'
   ```
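   If you want to confirm that an Envoy proxy is bound to the shared listener port on the host network, you can check the listening sockets directly on a node. A sketch, assuming the `8111` listener port from this example:

   ```
   # Run on a hybrid node; root privileges are needed to see the owning process
   sudo ss -tlnp | grep -w 8111
   ```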

1. Get the IP addresses of your nodes in your cluster.

   ```
   kubectl get nodes -o wide
   ```

   ```
   NAME                   STATUS   ROLES    AGE   VERSION               INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
   mi-026d6a261e355fba7   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.150   <none>        Ubuntu 22.04.5 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-082f73826a163626e   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.32    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-09183e8a3d755abf6   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.33    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0d78d815980ed202d   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.97    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0daa253999fe92daa   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.100   <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   ```

1. Access the Service using the IP addresses of your nodes and the listener port for the `cilium-gateway-my-gateway` resource. In the example below, the node IP address is `10.80.146.32` and the listener port is `8111`. Replace these with the values for your environment.

   ```
   curl -s http://10.80.146.32:8111/details/1 | jq
   ```

   ```
   {
     "id": 1,
     "author": "William Shakespeare",
     "year": 1595,
     "type": "paperback",
     "pages": 200,
     "publisher": "PublisherA",
     "language": "English",
     "ISBN-10": "1234567890",
     "ISBN-13": "123-1234567890"
   }
   ```

#### Ingress
<a name="_ingress"></a>

Due to an upstream Cilium issue ([#34028](https://github.com/cilium/cilium/issues/34028)), Cilium Ingress in host network mode requires using `loadbalancerMode: shared`, which creates a single Service of type ClusterIP for all Ingress resources in the cluster. If you are using a privileged port (lower than 1023) for the Ingress resource, reference the [Cilium documentation](https://docs.cilium.io/en/stable/network/servicemesh/ingress/#bind-to-privileged-port) for instructions.

1. Create a file named `cilium-ingress-host-network.yaml` with the following content.

   ```
   ingressController:
     enabled: true
     loadbalancerMode: shared
     # This is a workaround for the upstream Cilium issue
     service:
       externalTrafficPolicy: null
       type: ClusterIP
     hostNetwork:
       enabled: true
       # ensure the port does not conflict with other services on the node
       sharedListenerPort: 8111
       # uncomment to restrict nodes where Envoy proxies run on the host network
       # nodes:
       #   matchLabels:
       #     role: ingress
   ```

1. Apply the host network Cilium Ingress configuration to your cluster.

   ```
   helm upgrade cilium oci://public.ecr.aws/eks/cilium/cilium \
     --namespace kube-system \
     --reuse-values \
     --set operator.rollOutPods=true \
     -f cilium-ingress-host-network.yaml
   ```

   If you have not created the Ingress resource, you can create it by applying the following Ingress definition to your cluster. The Ingress definition below uses the Istio Bookinfo sample application described in [Deploy Cilium Ingress](#hybrid-nodes-ingress-cilium-ingress-deploy).

   ```
   apiVersion: networking.k8s.io/v1
   kind: Ingress
   metadata:
     name: my-ingress
     namespace: default
   spec:
     ingressClassName: cilium
     rules:
     - http:
         paths:
         - backend:
             service:
               name: details
               port:
                 number: 9080
           path: /details
           pathType: Prefix
   ```

   You can observe the applied Cilium Envoy Configuration with the following command.

   ```
   kubectl get cec -n kube-system cilium-ingress -o yaml
   ```

   You can get the Envoy listener port for the `cilium-ingress` Service with the following command. In this example, the shared listener port is `8111`.

   ```
   kubectl get cec -n kube-system cilium-ingress -o jsonpath='{.spec.services[0].ports[0]}'
   ```

1. Get the IP addresses of your nodes in your cluster.

   ```
   kubectl get nodes -o wide
   ```

   ```
   NAME                   STATUS   ROLES    AGE   VERSION               INTERNAL-IP     EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
   mi-026d6a261e355fba7   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.150   <none>        Ubuntu 22.04.5 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-082f73826a163626e   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.32    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-09183e8a3d755abf6   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.33    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0d78d815980ed202d   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.97    <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   mi-0daa253999fe92daa   Ready    <none>   23h   v1.32.3-eks-473151a   10.80.146.100   <none>        Ubuntu 22.04.4 LTS   5.15.0-142-generic   containerd://1.7.27
   ```

1. Access the Service using the IP addresses of your nodes and the `sharedListenerPort` for the `cilium-ingress` resource. In the example below, the node IP address is `10.80.146.32` and the listener port is `8111`. Replace these with the values for your environment.

   ```
   curl -s http://10.80.146.32:8111/details/1 | jq
   ```

   ```
   {
     "id": 1,
     "author": "William Shakespeare",
     "year": 1595,
     "type": "paperback",
     "pages": 200,
     "publisher": "PublisherA",
     "language": "English",
     "ISBN-10": "1234567890",
     "ISBN-13": "123-1234567890"
   }
   ```

# Configure Services of type LoadBalancer for hybrid nodes
<a name="hybrid-nodes-load-balancing"></a>

This topic describes how to configure Layer 4 (L4) load balancing for applications running on Amazon EKS Hybrid Nodes. Kubernetes Services of type LoadBalancer are used to expose Kubernetes applications external to the cluster. Services of type LoadBalancer are commonly used with physical load balancer infrastructure in the cloud or on-premises environment to serve the workload’s traffic. This load balancer infrastructure is commonly provisioned with an environment-specific controller.

 AWS supports AWS Network Load Balancer (NLB) and Cilium for Services of type LoadBalancer running on EKS Hybrid Nodes. The decision to use NLB or Cilium is based on the source of application traffic. If application traffic originates from an AWS Region, AWS recommends using AWS NLB and the AWS Load Balancer Controller. If application traffic originates from the local on-premises or edge environment, AWS recommends using Cilium’s built-in load balancing capabilities, which can be used with or without load balancer infrastructure in your environment.

For Layer 7 (L7) application traffic load balancing, see [Configure Kubernetes Ingress for hybrid nodes](hybrid-nodes-ingress.md). For general information on Load Balancing with EKS, see [Best Practices for Load Balancing](https://docs.aws.amazon.com/eks/latest/best-practices/load-balancing.html).

## AWS Network Load Balancer
<a name="hybrid-nodes-service-lb-nlb"></a>

You can use the [AWS Load Balancer Controller](aws-load-balancer-controller.md) and NLB with the target type `ip` for workloads running on hybrid nodes. When using target type `ip`, NLB forwards traffic directly to the pods, bypassing the Service layer network path. For NLB to reach the pod IP targets on hybrid nodes, your on-premises pod CIDRs must be routable on your on-premises network. Additionally, the AWS Load Balancer Controller uses webhooks and requires direct communication from the EKS control plane. For more information, see [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md).
+ See [Route TCP and UDP traffic with Network Load Balancers](network-load-balancing.md) for subnet configuration requirements, and [Install AWS Load Balancer Controller with Helm](lbc-helm.md) and [Best Practices for Load Balancing](https://docs.aws.amazon.com/eks/latest/best-practices/load-balancing.html) for additional information about AWS Network Load Balancer and AWS Load Balancer Controller.
+ See [AWS Load Balancer Controller NLB configurations](https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/guide/service/nlb/) for configurations that can be applied to Services of type `LoadBalancer` with AWS Network Load Balancer.

### Prerequisites
<a name="_prerequisites"></a>
+ Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).
+ Cilium BGP Control Plane enabled following the instructions in [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md). If you do not want to use BGP, you must use an alternative method to make your on-premises pod CIDRs routable on your on-premises network, see [Routable remote Pod CIDRs](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-pod-cidrs) for more information.
+ Helm installed in your command-line environment, see [Setup Helm instructions](helm.md).
+ eksctl installed in your command-line environment, see [Setup eksctl instructions](install-kubectl.md#eksctl-install-update).

### Procedure
<a name="_procedure"></a>

1. Download an IAM policy for the AWS Load Balancer Controller that allows it to make calls to AWS APIs on your behalf.

   ```
   curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/refs/heads/main/docs/install/iam_policy.json
   ```

1. Create an IAM policy using the policy downloaded in the previous step.

   ```
   aws iam create-policy \
       --policy-name AWSLoadBalancerControllerIAMPolicy \
       --policy-document file://iam_policy.json
   ```

1. Replace the values for cluster name (`CLUSTER_NAME`), AWS Region (`AWS_REGION`), and AWS account ID (`AWS_ACCOUNT_ID`) with your settings and run the following command.

   ```
   eksctl create iamserviceaccount \
       --cluster=CLUSTER_NAME \
       --namespace=kube-system \
       --name=aws-load-balancer-controller \
       --attach-policy-arn=arn:aws:iam::AWS_ACCOUNT_ID:policy/AWSLoadBalancerControllerIAMPolicy \
       --override-existing-serviceaccounts \
       --region AWS_REGION \
       --approve
   ```

1. Add the eks-charts Helm chart repository. AWS maintains this repository on GitHub.

   ```
   helm repo add eks https://aws.github.io/eks-charts
   ```

1. Update your local Helm repository to make sure that you have the most recent charts.

   ```
   helm repo update eks
   ```

1. Install the AWS Load Balancer Controller. Replace the values for cluster name (`CLUSTER_NAME`), AWS Region (`AWS_REGION`), VPC ID (`VPC_ID`), and AWS Load Balancer Controller Helm chart version (`AWS_LBC_HELM_VERSION`) with your settings. You can find the latest version of the Helm chart by running `helm search repo eks/aws-load-balancer-controller --versions`. If you are running a mixed mode cluster with both hybrid nodes and nodes in AWS Cloud, you can run the AWS Load Balancer Controller on cloud nodes following the instructions at [AWS Load Balancer Controller](hybrid-nodes-webhooks.md#hybrid-nodes-mixed-lbc).

   ```
   helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
     -n kube-system \
     --version AWS_LBC_HELM_VERSION \
     --set clusterName=CLUSTER_NAME \
     --set region=AWS_REGION \
     --set vpcId=VPC_ID \
     --set serviceAccount.create=false \
     --set serviceAccount.name=aws-load-balancer-controller
   ```

1. Verify the AWS Load Balancer Controller was installed successfully.

   ```
   kubectl get -n kube-system deployment aws-load-balancer-controller
   ```

   ```
   NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
   aws-load-balancer-controller   2/2     2            2           84s
   ```

1. Define a sample application in a file named `tcp-sample-app.yaml`. The example below uses a simple NGINX deployment with a TCP port.

   ```
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: tcp-sample-app
     namespace: default
   spec:
     replicas: 3
     selector:
       matchLabels:
         app: nginx
     template:
       metadata:
         labels:
           app: nginx
       spec:
         containers:
           - name: nginx
             image: public.ecr.aws/nginx/nginx:1.23
             ports:
               - name: tcp
                 containerPort: 80
   ```

1. Apply the deployment to your cluster.

   ```
   kubectl apply -f tcp-sample-app.yaml
   ```

1. Define a Service of type LoadBalancer for the deployment in a file named `tcp-sample-service.yaml`.

   ```
   apiVersion: v1
   kind: Service
   metadata:
     name: tcp-sample-service
     namespace: default
     annotations:
       service.beta.kubernetes.io/aws-load-balancer-type: external
       service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
       service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
   spec:
     ports:
       - port: 80
         targetPort: 80
         protocol: TCP
     type: LoadBalancer
     selector:
       app: nginx
   ```

1. Apply the Service configuration to your cluster.

   ```
   kubectl apply -f tcp-sample-service.yaml
   ```

1. Provisioning the NLB for the Service can take a few minutes. Once the NLB is provisioned, the Service has an address assigned to it that corresponds to the DNS name of the NLB.

   ```
   kubectl get svc tcp-sample-service
   ```

   ```
   NAME                 TYPE           CLUSTER-IP       EXTERNAL-IP                                                                    PORT(S)        AGE
   tcp-sample-service   LoadBalancer   172.16.115.212   k8s-default-tcpsampl-xxxxxxxxxx-xxxxxxxxxxxxxxxx.elb.<region>.amazonaws.com   80:30396/TCP   8s
   ```
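   Rather than checking manually, you can wait for the NLB DNS name in a loop. A sketch, assuming the Service name above; the DNS name appears under `status.loadBalancer.ingress[0].hostname` once provisioning completes:

   ```
   # Poll until the NLB hostname is assigned (may take a few minutes)
   while true; do
     LB_DNS=$(kubectl get svc tcp-sample-service \
       -o jsonpath='{.status.loadBalancer.ingress[0].hostname}' 2>/dev/null)
     [ -n "$LB_DNS" ] && break
     echo "Waiting for NLB to be provisioned..."
     sleep 10
   done
   echo "$LB_DNS"
   ```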

1. Access the Service using the address of the NLB.

   ```
   curl k8s-default-tcpsampl-xxxxxxxxxx-xxxxxxxxxxxxxxxx.elb.<region>.amazonaws.com
   ```

   An example output is below.

   ```
   <!DOCTYPE html>
   <html>
   <head>
   <title>Welcome to nginx!</title>
   [...]
   ```

1. Clean up the resources you created.

   ```
   kubectl delete -f tcp-sample-service.yaml
   kubectl delete -f tcp-sample-app.yaml
   ```

## Cilium in-cluster load balancing
<a name="hybrid-nodes-service-lb-cilium"></a>

Cilium can be used as an in-cluster load balancer for workloads running on EKS Hybrid Nodes, which can be useful for environments that do not have load balancer infrastructure. Cilium’s load balancing capabilities are built on a combination of Cilium features including kube-proxy replacement, Load Balancer IP Address Management (IPAM), and BGP Control Plane. The responsibilities of these features are detailed below:
+  **Cilium kube-proxy replacement**: Handles routing Service traffic to backend pods.
+  **Cilium Load Balancer IPAM**: Manages IP addresses that can be assigned to Services of type `LoadBalancer`.
+  **Cilium BGP Control Plane**: Advertises IP addresses allocated by Load Balancer IPAM to the on-premises network.

If you are not using Cilium’s kube-proxy replacement, you can still use Cilium Load Balancer IPAM and BGP Control Plane to allocate and advertise IP addresses for Services of type LoadBalancer. In that case, load balancing from Services to backend pods is handled by kube-proxy and iptables rules, which is the EKS default.
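You can confirm whether kube-proxy replacement is active in your installation by inspecting the Cilium agent status. A sketch, assuming the Cilium DaemonSet runs in `kube-system` as in the examples in this topic:

```
kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement
```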

### Prerequisites
<a name="_prerequisites_2"></a>
+ Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md) with or without kube-proxy replacement enabled. Cilium’s kube-proxy replacement requires running an operating system with a Linux kernel at least as recent as v4.19.57, v5.1.16, or v5.2.0. All recent versions of the operating systems supported for use with hybrid nodes meet this criteria, with the exception of Red Hat Enterprise Linux (RHEL) 8.x.
+ Cilium BGP Control Plane enabled following the instructions in [Configure Cilium BGP for hybrid nodes](hybrid-nodes-cilium-bgp.md). If you do not want to use BGP, you must use an alternative method to make your on-premises pod CIDRs routable on your on-premises network, see [Routable remote Pod CIDRs](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-pod-cidrs) for more information.
+ Helm installed in your command-line environment, see [Setup Helm instructions](helm.md).

### Procedure
<a name="_procedure_2"></a>

1. Create a file named `cilium-lbip-pool-loadbalancer.yaml` with a `CiliumLoadBalancerIPPool` resource to configure the Load Balancer IP address range for your Services of type LoadBalancer.
   + Replace `LB_IP_CIDR` with the IP address range to use for the Load Balancer IP addresses. To select a single IP address, use a `/32` CIDR. For more information, see [LoadBalancer IP Address Management](https://docs.cilium.io/en/stable/network/lb-ipam/) in the Cilium documentation.
   + The `serviceSelector` field is configured to match against the name of the Service you will create in a subsequent step. With this configuration, IPs from this pool will only be allocated to Services with the name `tcp-sample-service`.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumLoadBalancerIPPool
     metadata:
       name: tcp-service-pool
     spec:
       blocks:
       - cidr: "LB_IP_CIDR"
       serviceSelector:
         matchLabels:
           io.kubernetes.service.name: tcp-sample-service
     ```

1. Apply the `CiliumLoadBalancerIPPool` resource to your cluster.

   ```
   kubectl apply -f cilium-lbip-pool-loadbalancer.yaml
   ```

1. Confirm there is at least one IP address available in the pool.

   ```
   kubectl get ciliumloadbalancerippools.cilium.io
   ```

   ```
   NAME               DISABLED   CONFLICTING   IPS AVAILABLE   AGE
   tcp-service-pool   false      False         1               24m
   ```

1. Create a file named `cilium-bgp-advertisement-loadbalancer.yaml` with a `CiliumBGPAdvertisement` resource to advertise the load balancer IP address for the Service you will create in the next step. If you are not using Cilium BGP, you can skip this step. The load balancer IP address used for your Service must be routable on your on-premises network for you to be able to query the service in the final step.
   + The `advertisementType` field is set to `Service` and `service.addresses` is set to `LoadBalancerIP` to only advertise the `LoadBalancerIP` for Services of type `LoadBalancer`.
   + The `selector` field is configured to match against the name of the Service you will create in a subsequent step. With this configuration, only `LoadBalancerIP` for Services with the name `tcp-sample-service` will be advertised.

     ```
     apiVersion: cilium.io/v2alpha1
     kind: CiliumBGPAdvertisement
     metadata:
       name: bgp-advertisement-tcp-service
       labels:
         advertise: bgp
     spec:
       advertisements:
         - advertisementType: "Service"
           service:
             addresses:
               - LoadBalancerIP
           selector:
             matchLabels:
               io.kubernetes.service.name: tcp-sample-service
     ```

1. Apply the `CiliumBGPAdvertisement` resource to your cluster. If you are not using Cilium BGP, you can skip this step.

   ```
   kubectl apply -f cilium-bgp-advertisement-loadbalancer.yaml
   ```

1. Define a sample application in a file named `tcp-sample-app.yaml`. The example below uses a simple NGINX deployment with a TCP port.

   ```
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: tcp-sample-app
     namespace: default
   spec:
     replicas: 3
     selector:
       matchLabels:
         app: nginx
     template:
       metadata:
         labels:
           app: nginx
       spec:
         containers:
           - name: nginx
             image: public.ecr.aws/nginx/nginx:1.23
             ports:
               - name: tcp
                 containerPort: 80
   ```

1. Apply the deployment to your cluster.

   ```
   kubectl apply -f tcp-sample-app.yaml
   ```

1. Define a Service of type LoadBalancer for the deployment in a file named `tcp-sample-service.yaml`.
   + You can request a specific IP address from the load balancer IP pool with the `lbipam.cilium.io/ips` annotation on the Service object. You can remove this annotation if you do not want to request a specific IP address for the Service.
   + The `loadBalancerClass` spec field is required to prevent the legacy AWS Cloud Provider from creating a Classic Load Balancer for the Service. In the example below this is configured to `io.cilium/bgp-control-plane` to use Cilium’s BGP Control Plane as the load balancer class. This field can alternatively be configured to `io.cilium/l2-announcer` to use Cilium’s [L2 Announcements feature](https://docs.cilium.io/en/latest/network/l2-announcements/) (currently in beta and not officially supported by AWS).

     ```
     apiVersion: v1
     kind: Service
     metadata:
       name: tcp-sample-service
       namespace: default
       annotations:
         lbipam.cilium.io/ips: "LB_IP_ADDRESS"
     spec:
       loadBalancerClass: io.cilium/bgp-control-plane
       ports:
         - port: 80
           targetPort: 80
           protocol: TCP
       type: LoadBalancer
       selector:
         app: nginx
     ```

1. Apply the Service to your cluster. The Service will be created with an external IP address that you can use to access the application.

   ```
   kubectl apply -f tcp-sample-service.yaml
   ```

1. Verify the Service was created successfully and has an IP assigned to it from the `CiliumLoadBalancerIPPool` created in the previous step.

   ```
   kubectl get svc tcp-sample-service
   ```

   ```
   NAME                 TYPE           CLUSTER-IP      EXTERNAL-IP     PORT(S)        AGE
   tcp-sample-service   LoadBalancer   172.16.117.76   LB_IP_ADDRESS   80:31129/TCP   14m
   ```
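   You can also capture the assigned address programmatically rather than copying it from the table. A sketch, assuming the Service name above; for addresses allocated by Cilium LB IPAM, the value appears under `status.loadBalancer.ingress[0].ip`:

   ```
   LB_IP=$(kubectl get svc tcp-sample-service \
     -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
   echo "$LB_IP"
   ```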

1. If you are using Cilium in kube-proxy replacement mode, you can confirm Cilium is handling the load balancing for the Service by running the following command. In the output below, the `10.86.2.x` addresses are the pod IP addresses of the backend pods for the Service.

   ```
   kubectl -n kube-system exec ds/cilium -- cilium-dbg service list
   ```

   ```
   ID   Frontend               Service Type   Backend
   ...
   41   LB_IP_ADDRESS:80/TCP   LoadBalancer   1 => 10.86.2.76:80/TCP (active)
                                              2 => 10.86.2.130:80/TCP (active)
                                              3 => 10.86.2.141:80/TCP (active)
   ```

1. Confirm Cilium is advertising the IP address to the on-premises network via BGP, for example with the Cilium CLI command `cilium bgp routes advertised ipv4 unicast` (the exact command depends on your Cilium CLI version). In the example below, there are five hybrid nodes, each advertising the `LB_IP_ADDRESS` for the `tcp-sample-service` Service to the on-premises network.

   ```
   Node                   VRouter      Prefix             NextHop   Age     Attrs
   mi-026d6a261e355fba7   NODES_ASN
                     LB_IP_ADDRESS/32   0.0.0.0   12m3s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-082f73826a163626e   NODES_ASN
                     LB_IP_ADDRESS/32   0.0.0.0   12m3s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-09183e8a3d755abf6   NODES_ASN
                     LB_IP_ADDRESS/32   0.0.0.0   12m3s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-0d78d815980ed202d   NODES_ASN
                     LB_IP_ADDRESS/32   0.0.0.0   12m3s   [{Origin: i} {Nexthop: 0.0.0.0}]
   mi-0daa253999fe92daa   NODES_ASN
                     LB_IP_ADDRESS/32   0.0.0.0   12m3s   [{Origin: i} {Nexthop: 0.0.0.0}]
   ```

1. Access the Service using the assigned load balancer IP address.

   ```
   curl LB_IP_ADDRESS
   ```

   An example output is below.

   ```
   <!DOCTYPE html>
   <html>
   <head>
   <title>Welcome to nginx!</title>
   [...]
   ```

1. Clean up the resources you created.

   ```
   kubectl delete -f tcp-sample-service.yaml
   kubectl delete -f tcp-sample-app.yaml
   kubectl delete -f cilium-lbip-pool-loadbalancer.yaml
   kubectl delete -f cilium-bgp-advertisement-loadbalancer.yaml
   ```

# Configure Kubernetes Network Policies for hybrid nodes
<a name="hybrid-nodes-network-policies"></a>

 AWS supports Kubernetes Network Policies (Layer 3 / Layer 4) for pod ingress and egress traffic when using Cilium as the CNI with EKS Hybrid Nodes. If you are running EKS clusters with nodes in AWS Cloud, AWS supports the [Amazon VPC CNI for Kubernetes Network Policies](cni-network-policy.md).

This topic covers how to configure Cilium and Kubernetes Network Policies with EKS Hybrid Nodes. For detailed information on Kubernetes Network Policies, see [Kubernetes Network Policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/) in the Kubernetes documentation.

## Configure network policies
<a name="hybrid-nodes-configure-network-policies"></a>

### Considerations
<a name="_considerations"></a>
+  AWS supports the upstream Kubernetes Network Policy API and specification for pod ingress and egress. AWS currently does not support `CiliumNetworkPolicy` or `CiliumClusterwideNetworkPolicy`.
+ The `policyEnforcementMode` Helm value can be used to control the default Cilium policy enforcement behavior. The default behavior allows all egress and ingress traffic. When an endpoint is selected by a network policy, it transitions to a default-deny state, where only explicitly allowed traffic is allowed. See the Cilium documentation for more information on the [default policy mode](https://docs.cilium.io/en/stable/security/policy/intro/#policy-mode-default) and [policy enforcement modes](https://docs.cilium.io/en/stable/security/policy/intro/#policy-enforcement-modes).
+ If you are changing `policyEnforcementMode` for an existing Cilium installation, you must restart the Cilium agent DaemonSet to apply the new policy enforcement mode.
+ Use `namespaceSelector` and `podSelector` to allow or deny traffic to/from namespaces and pods with matching labels. The `namespaceSelector` and `podSelector` can be used with `matchLabels` or `matchExpressions` to select namespaces and pods based on their labels.
+ Use `ingress.ports` and `egress.ports` to allow or deny traffic to/from ports and protocols.
+ The `ipBlock` field cannot be used to selectively allow or deny traffic to/from pod IP addresses ([cilium/cilium#9209](https://github.com/cilium/cilium/issues/9209)). Using `ipBlock` selectors for node IPs is a beta feature in Cilium and is not supported by AWS.
+ See the [NetworkPolicy resource](https://kubernetes.io/docs/concepts/services-networking/network-policies/#networkpolicy-resource) in the Kubernetes documentation for information on the available fields for Kubernetes Network Policies.
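
If you change `policyEnforcementMode` on a running cluster, one way to restart the Cilium agent is shown below. This sketch assumes Cilium was installed into the `kube-system` namespace with the default DaemonSet name of `cilium`:

```
kubectl -n kube-system rollout restart daemonset cilium
```

You can then watch the restart complete with `kubectl -n kube-system rollout status daemonset cilium`.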

### Prerequisites
<a name="_prerequisites"></a>
+ Cilium installed following the instructions in [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).
+ Helm installed in your command-line environment. See the [Setup Helm instructions](helm.md).

### Procedure
<a name="_procedure"></a>

The following procedure sets up network policies for a sample microservices application so that components can only talk to other components that are required for the application to function. The procedure uses the [Istio Bookinfo](https://istio.io/latest/docs/examples/bookinfo/) sample microservices application.

The Bookinfo application consists of four separate microservices with the following relationships:
+  **productpage**. The productpage microservice calls the details and reviews microservices to populate the page.
+  **details**. The details microservice contains book information.
+  **reviews**. The reviews microservice contains book reviews. It also calls the ratings microservice.
+  **ratings**. The ratings microservice contains book ranking information that accompanies a book review.

  1. Create the sample application.

     ```
     kubectl apply -f https://raw.githubusercontent.com/istio/istio/refs/heads/master/samples/bookinfo/platform/kube/bookinfo.yaml
     ```

  1. Confirm the application is running successfully and note the pod IP address for the productpage microservice. You will use this pod IP address to query each microservice in the subsequent steps.

     ```
     kubectl get pods -o wide
     ```

     ```
     NAME                              READY   STATUS    RESTARTS   AGE   IP            NODE
     details-v1-766844796b-9wff2       1/1     Running   0          7s    10.86.3.7     mi-0daa253999fe92daa
     productpage-v1-54bb874995-lwfgg   1/1     Running   0          7s    10.86.2.193   mi-082f73826a163626e
     ratings-v1-5dc79b6bcd-59njm       1/1     Running   0          7s    10.86.2.232   mi-082f73826a163626e
     reviews-v1-598b896c9d-p2289       1/1     Running   0          7s    10.86.2.47    mi-026d6a261e355fba7
     reviews-v2-556d6457d-djktc        1/1     Running   0          7s    10.86.3.58    mi-0daa253999fe92daa
     reviews-v3-564544b4d6-g8hh4       1/1     Running   0          7s    10.86.2.69    mi-09183e8a3d755abf6
     ```

  1. Create a pod that will be used throughout this procedure to test the network policies. Note the pod is created in the `default` namespace with the label `access: true`.

     ```
     kubectl run curl-pod --image=curlimages/curl -i --tty --labels=access=true --namespace=default --overrides='{"spec": { "nodeSelector": {"eks.amazonaws.com/compute-type": "hybrid"}}}' -- /bin/sh
     ```

  1. Test access to the productpage microservice. In the example below, we use the pod IP address of the productpage pod (`10.86.2.193`) to query the microservice. Replace this with the pod IP address of the productpage pod in your environment.

     ```
     curl -s http://10.86.2.193:9080/productpage | grep -o "<title>.*</title>"
     ```

     ```
     <title>Simple Bookstore App</title>
     ```

  1. You can exit the test curl pod by typing `exit` and can reattach to the pod by running the following command.

     ```
     kubectl attach curl-pod -c curl-pod -i -t
     ```

  1. To demonstrate the effects of the network policies in the following steps, we first create a network policy that denies all traffic for the Bookinfo microservices. Create a file called `network-policy-deny-bookinfo.yaml` that defines the deny network policy.

     ```
     apiVersion: networking.k8s.io/v1
     kind: NetworkPolicy
     metadata:
       name: deny-bookinfo
       namespace: default
     spec:
       podSelector:
         matchExpressions:
         - key: app
           operator: In
           values: ["productpage", "details", "reviews", "ratings"]
       policyTypes:
       - Ingress
       - Egress
     ```

  1. Apply the deny network policy to your cluster.

     ```
     kubectl apply -f network-policy-deny-bookinfo.yaml
     ```

  1. Test access to the Bookinfo application. In the example below, we use the pod IP address of the productpage pod (`10.86.2.193`) to query the microservice. Replace this with the pod IP address of the productpage pod in your environment.

     ```
     curl http://10.86.2.193:9080/productpage --max-time 10
     ```

     ```
     curl: (28) Connection timed out after 10001 milliseconds
     ```

  1. Create a file called `network-policy-productpage.yaml` that defines the productpage network policy. The policy has the following rules:
     + allows ingress traffic from pods with the label `access: true` (the curl pod created earlier)
     + allows egress TCP traffic on port `9080` for the details, reviews, and ratings microservices
     + allows egress TCP/UDP traffic on port `53` for CoreDNS which runs in the `kube-system` namespace

       ```
       apiVersion: networking.k8s.io/v1
       kind: NetworkPolicy
       metadata:
         name: productpage-policy
         namespace: default
       spec:
         podSelector:
           matchLabels:
             app: productpage
         policyTypes:
         - Ingress
         - Egress
         ingress:
         - from:
           - podSelector:
               matchLabels:
                 access: "true"
         egress:
         - to:
           - podSelector:
               matchExpressions:
               - key: app
                 operator: In
                 values: ["details", "reviews", "ratings"]
           ports:
           - port: 9080
             protocol: TCP
         - to:
           - namespaceSelector:
               matchLabels:
                 kubernetes.io/metadata.name: kube-system
             podSelector:
               matchLabels:
                 k8s-app: kube-dns
           ports:
           - port: 53
             protocol: UDP
           - port: 53
             protocol: TCP
       ```

  1. Apply the productpage network policy to your cluster.

     ```
     kubectl apply -f network-policy-productpage.yaml
     ```

  1. Connect to the curl pod and test access to the Bookinfo application. Access to the productpage microservice is now allowed, but the other microservices are still denied because they are still subject to the deny network policy. In the examples below, we use the pod IP address of the productpage pod (`10.86.2.193`) to query the microservice. Replace this with the pod IP address of the productpage pod in your environment.

     ```
     kubectl attach curl-pod -c curl-pod -i -t
     ```

     ```
     curl -s http://10.86.2.193:9080/productpage | grep -o "<title>.*</title>"
     <title>Simple Bookstore App</title>
     ```

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1
     {"error": "Sorry, product details are currently unavailable for this book."}
     ```

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1/reviews
     {"error": "Sorry, product reviews are currently unavailable for this book."}
     ```

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1/ratings
     {"error": "Sorry, product ratings are currently unavailable for this book."}
     ```

  1. Create a file called `network-policy-details.yaml` that defines the details network policy. The policy allows only ingress traffic from the productpage microservice.

     ```
     apiVersion: networking.k8s.io/v1
     kind: NetworkPolicy
     metadata:
       name: details-policy
       namespace: default
     spec:
       podSelector:
         matchLabels:
           app: details
       policyTypes:
       - Ingress
       ingress:
       - from:
         - podSelector:
             matchLabels:
               app: productpage
     ```

  1. Create a file called `network-policy-reviews.yaml` that defines the reviews network policy. The policy allows only ingress traffic from the productpage microservice and only egress traffic to the ratings microservice and CoreDNS.

     ```
     apiVersion: networking.k8s.io/v1
     kind: NetworkPolicy
     metadata:
       name: reviews-policy
       namespace: default
     spec:
       podSelector:
         matchLabels:
           app: reviews
       policyTypes:
       - Ingress
       - Egress
       ingress:
       - from:
         - podSelector:
             matchLabels:
               app: productpage
       egress:
       - to:
         - podSelector:
             matchLabels:
               app: ratings
       - to:
         - namespaceSelector:
             matchLabels:
               kubernetes.io/metadata.name: kube-system
           podSelector:
             matchLabels:
               k8s-app: kube-dns
         ports:
         - port: 53
           protocol: UDP
         - port: 53
           protocol: TCP
     ```

  1. Create a file called `network-policy-ratings.yaml` that defines the ratings network policy. The policy allows only ingress traffic from the productpage and reviews microservices.

     ```
     apiVersion: networking.k8s.io/v1
     kind: NetworkPolicy
     metadata:
       name: ratings-policy
       namespace: default
     spec:
       podSelector:
         matchLabels:
           app: ratings
       policyTypes:
       - Ingress
       ingress:
       - from:
         - podSelector:
             matchExpressions:
             - key: app
               operator: In
               values: ["productpage", "reviews"]
     ```

  1. Apply the details, reviews, and ratings network policies to your cluster.

     ```
     kubectl apply -f network-policy-details.yaml
     kubectl apply -f network-policy-reviews.yaml
     kubectl apply -f network-policy-ratings.yaml
     ```

  1. Connect to the curl pod and test access to the Bookinfo application. In the examples below, we use the pod IP address of the productpage pod (`10.86.2.193`) to query the microservice. Replace this with the pod IP address of the productpage pod in your environment.

     ```
     kubectl attach curl-pod -c curl-pod -i -t
     ```

     Test the details microservice.

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1
     ```

     ```
     {"id": 1, "author": "William Shakespeare", "year": 1595, "type": "paperback", "pages": 200, "publisher": "PublisherA", "language": "English", "ISBN-10": "1234567890", "ISBN-13": "123-1234567890"}
     ```

     Test the reviews microservice.

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1/reviews
     ```

     ```
     {"id": "1", "podname": "reviews-v1-598b896c9d-p2289", "clustername": "null", "reviews": [{"reviewer": "Reviewer1", "text": "An extremely entertaining play by Shakespeare. The slapstick humour is refreshing!"}, {"reviewer": "Reviewer2", "text": "Absolutely fun and entertaining. The play lacks thematic depth when compared to other plays by Shakespeare."}]}
     ```

     Test the ratings microservice.

     ```
     curl -s http://10.86.2.193:9080/api/v1/products/1/ratings
     ```

     ```
     {"id": 1, "ratings": {"Reviewer1": 5, "Reviewer2": 4}}
     ```

  1. Clean up the resources you created in this procedure.

     ```
     kubectl delete -f network-policy-deny-bookinfo.yaml
     kubectl delete -f network-policy-productpage.yaml
     kubectl delete -f network-policy-details.yaml
     kubectl delete -f network-policy-reviews.yaml
     kubectl delete -f network-policy-ratings.yaml
     kubectl delete -f https://raw.githubusercontent.com/istio/istio/refs/heads/master/samples/bookinfo/platform/kube/bookinfo.yaml
     kubectl delete pod curl-pod
     ```

# Concepts for hybrid nodes
<a name="hybrid-nodes-concepts"></a>

With *Amazon EKS Hybrid Nodes*, you join physical or virtual machines running in on-premises or edge environments to Amazon EKS clusters running in the AWS Cloud. This approach brings many benefits, but also introduces new networking concepts and architectures for those familiar with running Kubernetes clusters in a single network environment.

The following sections dive deep into the Kubernetes and networking concepts for EKS Hybrid Nodes and details how traffic flows through the hybrid architecture. These sections require that you are familiar with basic Kubernetes networking knowledge, such as the concepts of pods, nodes, services, Kubernetes control plane, kubelet and kube-proxy.

We recommend reading these pages in order, starting with the [Networking concepts for hybrid nodes](hybrid-nodes-concepts-networking.md), then the [Kubernetes concepts for hybrid nodes](hybrid-nodes-concepts-kubernetes.md), and finally the [Network traffic flows for hybrid nodes](hybrid-nodes-concepts-traffic-flows.md).

**Topics**
+ [Networking concepts for hybrid nodes](hybrid-nodes-concepts-networking.md)
+ [Kubernetes concepts for hybrid nodes](hybrid-nodes-concepts-kubernetes.md)
+ [Network traffic flows for hybrid nodes](hybrid-nodes-concepts-traffic-flows.md)

# Networking concepts for hybrid nodes
<a name="hybrid-nodes-concepts-networking"></a>

This section details the core networking concepts and the constraints you must consider when designing your network topology for EKS Hybrid Nodes.

## Networking concepts for EKS Hybrid Nodes
<a name="_networking_concepts_for_eks_hybrid_nodes"></a>

![\[High level hybrid nodes network diagram\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-highlevel-network.png)


 **VPC as the network hub** 

All traffic that crosses the cloud boundary routes through your VPC. This includes traffic between the EKS control plane or pods running in AWS to hybrid nodes or pods running on them. You can think of your cluster’s VPC as the network hub between your hybrid nodes and the rest of the cluster. This architecture gives you full control of the traffic and its routing but also makes it your responsibility to correctly configure routes, security groups, and firewalls for the VPC.

 **EKS control plane to the VPC** 

The EKS control plane attaches **Elastic Network Interfaces (ENIs)** to your VPC. These ENIs handle traffic to and from the EKS API server. You control the placement of the EKS control plane ENIs when you configure your cluster, as EKS attaches ENIs to the subnets you pass during cluster creation.

EKS associates Security Groups to the ENIs that EKS attaches to your subnets. These security groups allow traffic to and from the EKS control plane through the ENIs. This is important for EKS Hybrid Nodes because you must allow traffic from the hybrid nodes and the pods running on them to the EKS control plane ENIs.

 **Remote Node Networks** 

The remote node networks, specifically the remote node CIDRs, are the ranges of IPs assigned to the machines you use as hybrid nodes. When you provision hybrid nodes, they reside in your on-premises data center or edge location, which is a different network domain than the EKS control plane and VPC. Each hybrid node has an IP address, or addresses, from a remote node CIDR that is distinct from the subnets in your VPC.

You configure the EKS cluster with these remote node CIDRs so EKS knows to route all traffic destined for the hybrid nodes IPs through your cluster VPC, such as requests to the kubelet API. The connections to the `kubelet` API are used in the `kubectl attach`, `kubectl cp`, `kubectl exec`, `kubectl logs`, and `kubectl port-forward` commands.

 **Remote Pod Networks** 

The remote pod networks are the ranges of IPs assigned to the pods running on the hybrid nodes. Generally, you configure your CNI with these ranges and the IP Address Management (IPAM) functionality of the CNI takes care of assigning a slice of these ranges to each hybrid node. When you create a pod, the CNI assigns an IP to the pod from the slice allocated to the node where the pod has been scheduled.

You configure the EKS cluster with these remote pod CIDRs so the EKS control plane knows to route all traffic destined for the pods running on the hybrid nodes through your cluster’s VPC, such as communication with webhooks.

![\[Remote Pod Networks\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-remote-pod-cidrs.png)


 **On-premises to the VPC** 

The on-premises network you use for hybrid nodes must route to the VPC you use for your EKS cluster. There are several [Network-to-Amazon VPC connectivity options](https://docs.aws.amazon.com/whitepapers/latest/aws-vpc-connectivity-options/network-to-amazon-vpc-connectivity-options.html) available to connect your on-premises network to a VPC. You can also use your own VPN solution.

It is important that you configure the routing correctly on the AWS Cloud side in the VPC and in your on-premises network, so that each network routes the right traffic through the connection between them.

In the VPC, all traffic going to the remote node and remote pod networks must route through the connection to your on-premises network (referred to here as the "gateway"). If some of your subnets have different route tables, you must configure each route table with the routes for your hybrid nodes and the pods running on them. This applies to the subnets that the EKS control plane ENIs are attached to, as well as to subnets that contain EC2 nodes or pods that must communicate with hybrid nodes.
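
As an illustration, you could add a static route for a remote node or pod CIDR to a VPC route table with the AWS CLI as follows. The CIDR, route table ID, and gateway ID here are placeholders for your environment, and the target could instead be a transit gateway or another connection type:

```
aws ec2 create-route \
    --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 10.80.0.0/16 \
    --gateway-id vgw-0123456789abcdef0
```

Repeat the route for each remote node and remote pod CIDR, in every route table that needs it.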

In your on-premises network, you must configure your network to allow traffic to and from your EKS cluster’s VPC and the other AWS services required for hybrid nodes. The traffic for the EKS cluster traverses the gateway in both directions.

## Networking constraints
<a name="_networking_constraints"></a>

 **Fully routed network** 

The main constraint is that the EKS control plane and all nodes, whether cloud or hybrid, must form a **fully routed** network. This means that all nodes must be able to reach each other at layer 3, by IP address.

The EKS control plane and cloud nodes are already reachable from each other because they are in a flat network (the VPC). The hybrid nodes, however, are in a different network domain. This is why you need to configure additional routing in the VPC and on your on-premises network to route traffic between the hybrid nodes and the rest of the cluster. If the hybrid nodes are reachable from each other and from the VPC, your hybrid nodes can be in one single flat network or in multiple segmented networks.

 **Routable remote pod CIDRs** 

For the EKS control plane to communicate with pods running on hybrid nodes (for example, webhooks or the Metrics Server) or for pods running on cloud nodes to communicate with pods running on hybrid nodes (workload east-west communication), your remote pod CIDR must be routable from the VPC. This means that the VPC must be able to route traffic to the pod CIDRs through the gateway to your on-premises network and that your on-premises network must be able to route the traffic for a pod to the right node.

It’s important to note the distinction between the pod routing requirements in the VPC and on-premises. The VPC only needs to know that any traffic going to a remote pod should go through the gateway. If you only have one remote pod CIDR, you only need one route.

This requirement is true for all hops in your on-premises network up to the local router in the same subnet as your hybrid nodes. This is the only router that needs to be aware of the pod CIDR slice assigned to each node, making sure that traffic for a particular pod gets delivered to the node where the pod has been scheduled.
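
As an illustration, if the CNI allocated the slice `10.86.2.0/26` to the node at `10.80.0.11` and `10.86.2.64/26` to the node at `10.80.0.12`, the local router would need per-node routes similar to the following (all addresses here are hypothetical):

```
ip route add 10.86.2.0/26 via 10.80.0.11
ip route add 10.86.2.64/26 via 10.80.0.12
```

Routing protocols such as BGP can maintain these routes automatically as pod CIDR slices are assigned to nodes.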

You can choose to propagate the routes for the on-premises pod CIDRs from your local on-premises router to the VPC route tables, but it isn’t necessary. If your on-premises pod CIDRs change frequently, propagating them dynamically saves you from updating the VPC route tables by hand, but this situation is uncommon.

Note that making your on-premises pod CIDRs routable is optional. If you don’t need to run webhooks on your hybrid nodes or have pods on cloud nodes talk to pods on hybrid nodes, you don’t need to configure routing for the pod CIDRs on your on-premises network.

 *Why do the on-premises pod CIDRs need to be routable with hybrid nodes?* 

When using EKS with the VPC CNI for your cloud nodes, the VPC CNI assigns IPs directly from the VPC to the pods. This means there is no need for any special routing, as both cloud pods and the EKS control plane can reach the Pod IPs directly.

When running on-premises (and with other CNIs in the cloud), the pods typically run in an isolated overlay network and the CNI takes care of delivering traffic between pods. This is commonly done through encapsulation: the CNI converts pod-to-pod traffic into node-to-node traffic, taking care of encapsulating and de-encapsulating on both ends. This way, there is no need for extra configuration on the nodes and on the routers.

The networking with hybrid nodes is unique because it combines both topologies: the EKS control plane and cloud nodes (with the VPC CNI) expect a flat network including nodes and pods, while the pods running on hybrid nodes are in an overlay network (by default, Cilium uses VXLAN for encapsulation). Pods running on hybrid nodes can reach the EKS control plane and pods running on cloud nodes, assuming the on-premises network can route to the VPC. However, without routing for the pod CIDRs on the on-premises network, traffic returning to an on-premises pod IP is eventually dropped, because the network doesn’t know how to reach the overlay network and deliver to the correct nodes.

# Kubernetes concepts for hybrid nodes
<a name="hybrid-nodes-concepts-kubernetes"></a>

This page details the key Kubernetes concepts that underpin the EKS Hybrid Nodes system architecture.

## EKS control plane in the VPC
<a name="hybrid-nodes-concepts-k8s-api"></a>

The IPs of the EKS control plane ENIs are stored in the `kubernetes` `Endpoints` object in the `default` namespace. When EKS creates new ENIs or removes older ones, EKS updates this object so the list of IPs is always up-to-date.

You can use these endpoints through the `kubernetes` Service, also in the `default` namespace. This service, of `ClusterIP` type, always gets assigned the first IP of the cluster’s service CIDR. For example, for the service CIDR `172.16.0.0/16`, the service IP will be `172.16.0.1`.
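
As a quick sketch of that rule, the shell below derives the first IP from a service CIDR. This simple arithmetic assumes the first usable IP differs from the network address only in its last octet, which holds for typical service CIDRs such as `/16` or `/24`:

```
# Derive the ClusterIP of the `kubernetes` Service: the first usable IP
# of the cluster's service CIDR (the network address plus one).
service_cidr="172.16.0.0/16"
base="${service_cidr%/*}"      # strip the prefix length: 172.16.0.0
old_ifs=$IFS; IFS=.
set -- $base                   # split the address into octets
IFS=$old_ifs
first_ip="$1.$2.$3.$(($4 + 1))"
echo "$first_ip"               # prints 172.16.0.1
```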

Generally, this is how pods (regardless if running in the cloud or hybrid nodes) access the EKS Kubernetes API server. Pods use the service IP as the destination IP, which gets translated to the actual IPs of one of the EKS control plane ENIs. The primary exception is `kube-proxy`, because it sets up the translation.

## EKS API server endpoint
<a name="hybrid-nodes-concepts-k8s-eks-api"></a>

The `kubernetes` service IP isn’t the only way to access the EKS API server. EKS also creates a Route 53 DNS name when you create your cluster. This is the `endpoint` field of your EKS cluster when calling the EKS `DescribeCluster` API action.

```
{
    "cluster": {
        "endpoint": "https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.gr7.us-west-2.eks.amazonaws.com",
        "name": "my-cluster",
        "status": "ACTIVE"
    }
}
```

In a public endpoint access or public and private endpoint access cluster, your hybrid nodes will resolve this DNS name to a public IP by default, routable through the internet. In a private endpoint access cluster, the DNS name resolves to the private IPs of the EKS control plane ENIs.

This is how the `kubelet` and `kube-proxy` access the Kubernetes API server. If you want all your Kubernetes cluster traffic to flow through the VPC, you either need to configure your cluster in private access mode or modify your on-premises DNS server to resolve the EKS cluster endpoint to the private IPs of the EKS control plane ENIs.

## `kubelet` endpoint
<a name="hybrid-nodes-concepts-k8s-kubelet-api"></a>

The `kubelet` exposes several REST endpoints, allowing other parts of the system to interact with and gather information from each node. In most clusters, the majority of traffic to the `kubelet` server comes from the control plane, but certain monitoring agents might also interact with it.

Through this interface, the `kubelet` handles various requests: fetching logs (`kubectl logs`), executing commands inside containers (`kubectl exec`), and port-forwarding traffic (`kubectl port-forward`). Each of these requests interacts with the underlying container runtime through the `kubelet`, appearing seamless to cluster administrators and developers.

The most common consumer of this API is the Kubernetes API server. When you use any of the `kubectl` commands mentioned previously, `kubectl` makes an API request to the API server, which then calls the `kubelet` API of the node where the pod is running. This is the main reason why the node IP needs to be reachable from the EKS control plane and why, even if your pods are running, you won’t be able to access their logs or `exec` if the node route is misconfigured.

 **Node IPs** 

When the EKS control plane communicates with a node, it uses one of the addresses reported in the `Node` object status (`status.addresses`).

With EKS cloud nodes, it’s common for the kubelet to report the private IP of the EC2 instance as an `InternalIP` during node registration. The Cloud Controller Manager (CCM) then validates that this IP belongs to the EC2 instance. In addition, the CCM typically adds the public IPs (as `ExternalIP`) and DNS names (`InternalDNS` and `ExternalDNS`) of the instance to the node status.

However, there is no CCM for hybrid nodes. When you register a hybrid node with the EKS Hybrid Nodes CLI (`nodeadm`), it configures the kubelet to report your machine’s IP directly in the node’s status, without the CCM.

```
apiVersion: v1
kind: Node
metadata:
  name: my-node-1
spec:
  providerID: eks-hybrid:///us-west-2/my-cluster/my-node-1
status:
  addresses:
  - address: 10.1.1.236
    type: InternalIP
  - address: my-node-1
    type: Hostname
```

If your machine has multiple IPs, the kubelet selects one of them following its own logic. You can control the selected IP with the `--node-ip` flag, which you can pass in the `nodeadm` config in `spec.kubelet.flags`. Only the IP reported in the `Node` object needs a route from the VPC. Your machines can have other IPs that aren’t reachable from the cloud.
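
As a sketch, a `nodeadm` configuration that pins the reported node IP might look like the following. This assumes the `node.eks.aws/v1alpha1` schema, and omits the other required fields (such as your hybrid credentials configuration) for brevity:

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: my-cluster
    region: us-west-2
  kubelet:
    flags:
      - --node-ip=10.1.1.236
```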

## `kube-proxy`
<a name="hybrid-nodes-concepts-k8s-kube-proxy"></a>

 `kube-proxy` is responsible for implementing the Service abstraction at the networking layer of each node. It acts as a network proxy and load balancer for traffic destined to Kubernetes Services. By continuously watching the Kubernetes API server for changes related to Services and Endpoints, `kube-proxy` dynamically updates the underlying host’s networking rules to ensure traffic is properly directed.

In `iptables` mode, `kube-proxy` programs several `netfilter` chains to handle service traffic. The rules form the following hierarchy:

1.  **KUBE-SERVICES chain**: The entry point for all service traffic. It has rules matching each service’s `ClusterIP` and port.

1.  **KUBE-SVC-XXX chains**: Service-specific chains that contain the load balancing rules for each service.

1.  **KUBE-SEP-XXX chains**: Endpoint-specific chains that contain the actual `DNAT` rules.

Let’s examine what happens for a service `test-server` in the `default` namespace:
+ Service ClusterIP: `172.16.31.14`
+ Service port: `80`
+ Backing pods: `10.2.0.110`, `10.2.1.39`, and `10.2.2.254`

When we inspect the `iptables` rules (using `iptables-save | grep -A10 KUBE-SERVICES`):

1. In the **KUBE-SERVICES** chain, we find a rule matching the service:

   ```
   -A KUBE-SERVICES -d 172.16.31.14/32 -p tcp -m comment --comment "default/test-server cluster IP" -m tcp --dport 80 -j KUBE-SVC-XYZABC123456
   ```
   + This rule matches packets destined for 172.16.31.14:80
   + The comment indicates what this rule is for: `default/test-server cluster IP` 
   + Matching packets jump to the `KUBE-SVC-XYZABC123456` chain

1. The **KUBE-SVC-XYZABC123456** chain has probability-based load balancing rules:

   ```
   -A KUBE-SVC-XYZABC123456 -m statistic --mode random --probability 0.33333333349 -j KUBE-SEP-POD1XYZABC
   -A KUBE-SVC-XYZABC123456 -m statistic --mode random --probability 0.50000000000 -j KUBE-SEP-POD2XYZABC
   -A KUBE-SVC-XYZABC123456 -j KUBE-SEP-POD3XYZABC
   ```
   + First rule: 33.3% chance to jump to `KUBE-SEP-POD1XYZABC` 
   + Second rule: 50% chance of the remaining traffic (33.3% of total) to jump to `KUBE-SEP-POD2XYZABC` 
   + Last rule: All remaining traffic (33.3% of total) jumps to `KUBE-SEP-POD3XYZABC` 

1. The individual **KUBE-SEP-XXX** chains perform the DNAT (Destination NAT):

   ```
   -A KUBE-SEP-POD1XYZABC -p tcp -m tcp -j DNAT --to-destination 10.2.0.110:80
   -A KUBE-SEP-POD2XYZABC -p tcp -m tcp -j DNAT --to-destination 10.2.1.39:80
   -A KUBE-SEP-POD3XYZABC -p tcp -m tcp -j DNAT --to-destination 10.2.2.254:80
   ```
   + These DNAT rules rewrite the destination IP and port to direct traffic to specific pods.
   + Each rule handles about 33.3% of the traffic, providing even load balancing between `10.2.0.110`, `10.2.1.39` and `10.2.2.254`.

This multi-level chain structure enables `kube-proxy` to efficiently implement service load balancing and redirection through kernel-level packet manipulation, without requiring a proxy process in the data path.
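
The cascading probabilities above can be counterintuitive, so the following self-contained sketch (not kube-proxy code) applies the same three rules to 30,000 simulated connections and shows that the split comes out roughly even:

```
# Simulate kube-proxy's cascading probability rules for three endpoints:
# rule 1 matches ~33.3% of connections, rule 2 matches 50% of the
# remainder, and rule 3 catches everything left -- an even three-way split.
counts=$(awk 'BEGIN {
  srand()
  for (i = 0; i < 30000; i++) {
    if (rand() < 1.0 / 3)    c1++   # jump to KUBE-SEP-POD1
    else if (rand() < 0.5)   c2++   # jump to KUBE-SEP-POD2
    else                     c3++   # fall through to KUBE-SEP-POD3
  }
  printf "%d %d %d", c1, c2, c3
}')
set -- $counts
echo "pod1=$1 pod2=$2 pod3=$3"   # each count lands near 10000
```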

### Impact on Kubernetes operations
<a name="hybrid-nodes-concepts-k8s-operations"></a>

A broken `kube-proxy` on a node prevents that node from routing Service traffic properly, causing timeouts or failed connections for pods that rely on cluster Services. This can be especially disruptive when a node is first registered. The CNI needs to talk to the Kubernetes API server to get information, such as the node’s pod CIDR, before it can configure any pod networking. To do that, it uses the `kubernetes` Service IP. However, if `kube-proxy` hasn’t been able to start or has failed to set the right `iptables` rules, the requests going to the `kubernetes` service IP aren’t translated to the actual IPs of the EKS control plane ENIs. As a consequence, the CNI will enter a crash loop and none of the pods will be able to run properly.

We know pods use the `kubernetes` service IP to communicate with the Kubernetes API server, but `kube-proxy` needs to first set `iptables` rules to make that work.

How does `kube-proxy` communicate with the API server?

The `kube-proxy` must be configured with the actual IP addresses of the Kubernetes API server, or a DNS name that resolves to them. For EKS, the default `kube-proxy` is configured to point to the Route 53 DNS name that EKS creates when you create the cluster. You can see this value in the `kube-proxy` ConfigMap in the `kube-system` namespace. The content of this ConfigMap is a `kubeconfig` that gets injected into the `kube-proxy` pod; look for the `clusters[0].cluster.server` field. This value matches the `endpoint` field of your EKS cluster (returned by the EKS `DescribeCluster` API).

```
apiVersion: v1
data:
  kubeconfig: |-
    kind: Config
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        server: https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.gr7.us-west-2.eks.amazonaws.com
      name: default
    contexts:
    - context:
        cluster: default
        namespace: default
        user: default
      name: default
    current-context: default
    users:
    - name: default
      user:
        tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
kind: ConfigMap
metadata:
  name: kube-proxy
  namespace: kube-system
```
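
As a quick check on which endpoint `kube-proxy` is pointed at, you can pull the `server` value out of that ConfigMap (fetched in practice with `kubectl get configmap kube-proxy -n kube-system -o yaml`). A minimal standard-library sketch, using an abbreviated sample of the kubeconfig above:

```python
import re

def extract_api_server(kubeconfig_text):
    """Return the first 'server:' value found in a kubeconfig document."""
    match = re.search(r"^\s*server:\s*(\S+)", kubeconfig_text, re.MULTILINE)
    return match.group(1) if match else None

# Abbreviated sample of the ConfigMap's embedded kubeconfig
sample = """
clusters:
- cluster:
    certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    server: https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.gr7.us-west-2.eks.amazonaws.com
  name: default
"""

api_server = extract_api_server(sample)
```

Compare the extracted value against the `endpoint` field from `DescribeCluster`; a mismatch means `kube-proxy` is pointed at the wrong cluster endpoint.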

## Routable remote Pod CIDRs
<a name="hybrid-nodes-concepts-k8s-pod-cidrs"></a>

The [Networking concepts for hybrid nodes](hybrid-nodes-concepts-networking.md) page details the requirements to run webhooks on hybrid nodes or to have pods running on cloud nodes communicate with pods running on hybrid nodes. The key requirement is that the on-premises router needs to know which node is responsible for a particular pod IP. There are several ways to achieve this, including Border Gateway Protocol (BGP), static routes, and Address Resolution Protocol (ARP) proxying. These are covered in the following sections.

 **Border Gateway Protocol (BGP)** 

If your CNI supports it (as Cilium and Calico do), you can use the CNI's BGP mode to propagate routes for your per-node pod CIDRs from your nodes to your local router. When using the CNI's BGP mode, your CNI acts as a virtual router, so your local router treats each pod CIDR as a separate subnet with your node as the gateway to that subnet.

![\[Hybrid nodes BGP routing\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-bgp.png)


 **Static routes** 

Alternatively, you can configure static routes in your local router. This is the simplest way to route the on-premises pod CIDRs to your VPC, but it is also the most error-prone and difficult to maintain. You must make sure that the routes always stay up to date with the existing nodes and their assigned pod CIDRs. If your number of nodes is small and your infrastructure is static, this is a viable option and removes the need for BGP support in your router. If you opt for this, we recommend configuring your CNI with the pod CIDR slice that you want to assign to each node instead of letting its IPAM decide.

![\[Hybrid nodes static routing\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-static-routes.png)
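
To make "keeping the routes up to date" concrete, here is a small helper that emits one `ip route` command per node-to-pod-CIDR assignment. The addresses are hypothetical, and your router's CLI syntax may differ:

```python
def static_route_commands(assignments):
    """Produce one 'ip route add' command per (node IP, pod CIDR) pair.

    assignments maps a hybrid node's IP to the pod CIDR slice it owns.
    """
    return [
        f"ip route add {cidr} via {node_ip}"
        for node_ip, cidr in sorted(assignments.items())
    ]

# Example: two hybrid nodes, each owning a /24 slice of 10.85.0.0/16
routes = static_route_commands({
    "10.80.0.2": "10.85.1.0/24",
    "10.80.0.3": "10.85.2.0/24",
})
```

Any time a node is added, removed, or reassigned a different pod CIDR, the corresponding route must be updated on the router, which is exactly the maintenance burden that BGP automates away.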


 **Address Resolution Protocol (ARP) proxying** 

ARP proxying is another approach to make on-premises pod IPs routable, particularly useful when your hybrid nodes are on the same Layer 2 network as your local router. With ARP proxying enabled, a node responds to ARP requests for pod IPs it hosts, even though those IPs belong to a different subnet.

When a device on your local network tries to reach a pod IP, it first sends an ARP request asking "Who has this IP?". The hybrid node hosting that pod will respond with its own MAC address, saying "I can handle traffic for that IP." This creates a direct path between devices on your local network and the pods without requiring router configuration.

For this to work, your CNI must support proxy ARP functionality. Cilium has built-in support for proxy ARP that you can enable through configuration. The key consideration is that the pod CIDR must not overlap with any other network in your environment, as this could cause routing conflicts.

This approach has several advantages:
+ No need to configure your router with BGP or maintain static routes
+ Works well in environments where you don’t have control over your router configuration

![\[Hybrid nodes ARP proxying\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-arp-proxy.png)
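
At the Linux level, proxy ARP is a per-interface kernel setting. The following sysctl line shows the underlying mechanism for a hypothetical interface `eth0`; with a CNI such as Cilium, you enable the equivalent behavior through the CNI's own configuration rather than setting this by hand:

```
# /etc/sysctl.d/99-proxy-arp.conf (illustrative only)
# Answer ARP requests on eth0 for IPs this node knows how to route
net.ipv4.conf.eth0.proxy_arp = 1
```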


## Pod-to-Pod encapsulation
<a name="hybrid-nodes-concepts-k8s-pod-encapsulation"></a>

In on-premises environments, CNIs typically use encapsulation protocols to create overlay networks that operate on top of the physical network without requiring you to reconfigure it. This section explains how this encapsulation works. Note that some details might vary depending on the CNI you are using.

Encapsulation wraps original pod network packets inside another network packet that can be routed through the underlying physical network. This allows pods to communicate across nodes running the same CNI without requiring the physical network to know how to route those pod CIDRs.

The most common encapsulation protocol used with Kubernetes is Virtual Extensible LAN (VXLAN), though others (such as Geneve) are also available depending on your CNI.

### VXLAN encapsulation
<a name="_vxlan_encapsulation"></a>

VXLAN encapsulates Layer 2 Ethernet frames within UDP packets. When a pod sends traffic to another pod on a different node, the CNI performs the following:

1. The CNI intercepts packets from Pod A

1. The CNI wraps the original packet in a VXLAN header

1. This wrapped packet is then sent through the node’s regular networking stack to the destination node

1. The CNI on the destination node unwraps the packet and delivers it to Pod B

Here’s what happens to the packet structure during VXLAN encapsulation:

Original Pod-to-Pod Packet:

```
+-----------------+---------------+-------------+-----------------+
| Ethernet Header | IP Header     | TCP/UDP     | Payload         |
| Src: Pod A MAC  | Src: Pod A IP | Src Port    |                 |
| Dst: Pod B MAC  | Dst: Pod B IP | Dst Port    |                 |
+-----------------+---------------+-------------+-----------------+
```

After VXLAN Encapsulation:

```
+-----------------+-------------+--------------+------------+---------------------------+
| Outer Ethernet  | Outer IP    | Outer UDP    | VXLAN      | Original Pod-to-Pod       |
| Src: Node A MAC | Src: Node A | Src: Random  | VNI: xx    | Packet (unchanged         |
| Dst: Node B MAC | Dst: Node B | Dst: 4789    |            | from above)               |
+-----------------+-------------+--------------+------------+---------------------------+
```

The VXLAN Network Identifier (VNI) distinguishes between different overlay networks.
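
One practical consequence of the extra headers is a smaller effective MTU for pod traffic. The arithmetic below uses the standard header sizes for IPv4 without options; the exact numbers depend on your CNI and on whether the outer packet is IPv6:

```python
# Bytes added around the original pod frame by VXLAN encapsulation
OUTER_IPV4 = 20      # outer IP header (no options)
OUTER_UDP = 8        # outer UDP header (destination port 4789)
VXLAN_HEADER = 8     # VXLAN header carrying the VNI
INNER_ETHERNET = 14  # the encapsulated pod frame keeps its Ethernet header

overhead = OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER + INNER_ETHERNET  # 50 bytes

# On a physical network with a 1500-byte MTU, the inner (pod) MTU
# must shrink by the overhead so encapsulated packets still fit:
pod_mtu = 1500 - overhead  # 1450
```

This is why CNIs commonly configure pod interfaces with an MTU around 1450 when VXLAN is in use on a standard 1500-MTU network.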

### Pod communication scenarios
<a name="_pod_communication_scenarios"></a>

 **Pods on the same hybrid node** 

When pods on the same hybrid node communicate, no encapsulation is typically needed. The CNI sets up local routes that direct traffic between pods through the node’s internal virtual interfaces:

```
Pod A -> veth0 -> node's bridge/routing table -> veth1 -> Pod B
```

The packet never leaves the node and doesn’t require encapsulation.

 **Pods on different hybrid nodes** 

Communication between pods on different hybrid nodes requires encapsulation:

```
Pod A -> CNI -> [VXLAN encapsulation] -> Node A network -> router or gateway -> Node B network -> [VXLAN decapsulation] -> CNI -> Pod B
```

This allows the pod traffic to traverse the physical network infrastructure without requiring the physical network to understand pod IP routing.

# Network traffic flows for hybrid nodes
<a name="hybrid-nodes-concepts-traffic-flows"></a>

This page details the network traffic flows for EKS Hybrid Nodes with diagrams showing the end-to-end network paths for the different traffic types.

The following traffic flows are covered:
+  [Hybrid node `kubelet` to EKS control plane](#hybrid-nodes-concepts-traffic-flows-kubelet-to-cp) 
+  [EKS control plane to hybrid node (`kubelet` server)](#hybrid-nodes-concepts-traffic-flows-cp-to-kubelet) 
+  [Pods running on hybrid nodes to EKS control plane](#hybrid-nodes-concepts-traffic-flows-pods-to-cp) 
+  [EKS control plane to pods running on a hybrid node (webhooks)](#hybrid-nodes-concepts-traffic-flows-cp-to-pod) 
+  [Pod-to-Pod running on hybrid nodes](#hybrid-nodes-concepts-traffic-flows-pod-to-pod) 
+  [Pods on cloud nodes to pods on hybrid nodes (east-west traffic)](#hybrid-nodes-concepts-traffic-flows-east-west) 

## Hybrid node `kubelet` to EKS control plane
<a name="hybrid-nodes-concepts-traffic-flows-kubelet-to-cp"></a>

![\[Hybrid node kubelet to EKS control plane\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-kubelet-to-cp-public.png)


### Request
<a name="_request"></a>

 **1. `kubelet` Initiates Request** 

When the `kubelet` on a hybrid node needs to communicate with the EKS control plane (for example, to report node status or get pod specs), it uses the `kubeconfig` file provided during node registration. This `kubeconfig` has the API server endpoint URL (the Route53 DNS name) rather than direct IP addresses.

The `kubelet` performs a DNS lookup for the endpoint (for example, `https://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.gr7.us-west-2.eks.amazonaws.com`). In a public access cluster, this resolves to a public IP address (say `54.239.118.52`) that belongs to the EKS service running in AWS. The `kubelet` then creates a secure HTTPS request to this endpoint. The initial packet looks like this:

```
+--------------------+---------------------+-----------------+
| IP Header          | TCP Header          | Payload         |
| Src: 10.80.0.2     | Src: 52390 (random) |                 |
| Dst: 54.239.118.52 | Dst: 443            |                 |
+--------------------+---------------------+-----------------+
```

 **2. Local Router Routing** 

Since the destination IP is a public IP address and not part of the local network, the `kubelet` sends this packet to its default gateway (the local on-premises router). The router examines the destination IP and determines it’s a public IP address.

For public traffic, the router typically forwards the packet to an internet gateway or border router that handles outbound traffic to the internet. This is omitted in the diagram and depends on how your on-premises network is set up. The packet traverses your on-premises network infrastructure and eventually reaches your internet service provider’s network.

 **3. Delivery to the EKS control plane** 

The packet travels across the public internet and transit networks until it reaches AWS's network. AWS's network routes the packet to the EKS service endpoint in the appropriate region. When the packet reaches the EKS service, it’s forwarded to the actual EKS control plane for your cluster.

This routing through the public internet is different from the private VPC-routed path that we’ll see in other traffic flows. The key difference is that when using public access mode, traffic from on-premises `kubelet` (although not from pods) to the EKS control plane does not go through your VPC - it uses the global internet infrastructure instead.

### Response
<a name="_response"></a>

After the EKS control plane processes the `kubelet` request, it sends a response back:

 **3. EKS control plane sends response** 

The EKS control plane creates a response packet. This packet has the public IP as the source and the hybrid node’s IP as the destination:

```
+--------------------+---------------------+-----------------+
| IP Header          | TCP Header          | Payload         |
| Src: 54.239.118.52 | Src: 443            |                 |
| Dst: 10.80.0.2     | Dst: 52390          |                 |
+--------------------+---------------------+-----------------+
```

 **2. Internet Routing** 

The response packet travels back through the internet, following the routing path determined by internet service providers, until it reaches your on-premises network edge router.

 **1. Local Delivery** 

Your on-premises router receives the packet and recognizes the destination IP (`10.80.0.2`) as belonging to your local network. It forwards the packet through your local network infrastructure until it reaches the target hybrid node, where the `kubelet` receives and processes the response.

## Hybrid node `kube-proxy` to EKS control plane
<a name="_hybrid_node_kube_proxy_to_eks_control_plane"></a>

If you enable public endpoint access for the cluster, traffic from the `kube-proxy` on the hybrid node to the EKS control plane uses the public internet, following the same path as the traffic from the `kubelet` to the EKS control plane.

## EKS control plane to hybrid node (`kubelet` server)
<a name="hybrid-nodes-concepts-traffic-flows-cp-to-kubelet"></a>

![\[EKS control plane to hybrid node\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-cp-to-kubelet.png)


### Request
<a name="_request_2"></a>

 **1. EKS Kubernetes API server initiates request** 

The EKS Kubernetes API server retrieves the node’s IP address (`10.80.0.2`) from the node object’s status. It then routes this request through its ENI in the VPC, as the destination IP belongs to the configured remote node CIDR (`10.80.0.0/16`). The initial packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.0.0.132 | Src: 67493 (random) |                 |
| Dst: 10.80.0.2  | Dst: 10250          |                 |
+-----------------+---------------------+-----------------+
```

 **2. VPC network processing** 

The packet leaves the ENI and enters the VPC networking layer, where it’s directed to the subnet’s gateway for further routing.

 **3. VPC route table lookup** 

The VPC route table for the subnet containing the EKS control plane ENI has a specific route (the second one in the diagram) for the remote node CIDR. Based on this routing rule, the packet is directed to the VPC-to-onprem gateway.

 **4. Cross-boundary transit** 

The gateway transfers the packet across the cloud boundary through your established connection (such as Direct Connect or VPN) to your on-premises network.

 **5. On-premises network reception** 

The packet arrives at your local on-premises router that handles traffic for the subnet where your hybrid nodes are located.

 **6. Final delivery** 

The local router identifies that the destination IP (`10.80.0.2`) address belongs to its directly connected network and forwards the packet directly to the target hybrid node, where the `kubelet` receives and processes the request.

### Response
<a name="_response_2"></a>

After the hybrid node’s `kubelet` processes the request, it sends back a response following the same path in reverse:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.80.0.2  | Src: 10250          |                 |
| Dst: 10.0.0.132 | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

 **6. `kubelet` Sends Response** 

The `kubelet` on the hybrid node (`10.80.0.2`) creates a response packet with the original source IP as the destination. The destination doesn’t belong to the local network, so it’s sent to the host’s default gateway, which is the local router.

 **5. Local Router Routing** 

The router determines that the destination IP (`10.0.0.132`) belongs to `10.0.0.0/16`, which has a route pointing to the gateway connecting to AWS.

 **4. Cross-Boundary Return** 

The packet travels back through the same on-premises to VPC connection (such as Direct Connect or VPN), crossing the cloud boundary in the reverse direction.

 **3. VPC Routing** 

When the packet arrives in the VPC, the route tables identify that the destination IP belongs to a VPC CIDR. The packet routes within the VPC.

 **2. VPC Network Delivery** 

The VPC networking layer forwards the packet to the subnet with the EKS control plane ENI (`10.0.0.132`).

 **1. ENI Reception** 

The packet reaches the EKS control plane ENI attached to the Kubernetes API server, completing the round trip.

## Pods running on hybrid nodes to EKS control plane
<a name="hybrid-nodes-concepts-traffic-flows-pods-to-cp"></a>

![\[Pods running on hybrid nodes to EKS control plane\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-pod-to-cp.png)


### Without CNI NAT
<a name="_without_cni_nat"></a>

### Request
<a name="_request_3"></a>

Pods generally talk to the Kubernetes API server through the `kubernetes` service. The service IP is the first IP of the cluster’s service CIDR. This convention allows pods that need to run before CoreDNS is available (for example, the CNI) to reach the API server. Requests leave the pod with the service IP as the destination. For example, if the service CIDR is `172.16.0.0/16`, the service IP is `172.16.0.1`.
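
The "first IP of the service CIDR" convention is easy to compute with the standard-library `ipaddress` module:

```python
import ipaddress

def kubernetes_service_ip(service_cidr):
    """The ClusterIP of the kubernetes Service is the first usable
    address in the cluster's service CIDR."""
    network = ipaddress.ip_network(service_cidr)
    return str(network[1])  # network[0] is the network address itself

service_ip = kubernetes_service_ip("172.16.0.0/16")  # "172.16.0.1"
```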

 **1. Pod Initiates Request** 

The pod sends a request to the `kubernetes` service IP (`172.16.0.1`) on the API server port (443) from a random source port. The packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.85.1.56 | Src: 67493 (random) |                 |
| Dst: 172.16.0.1 | Dst: 443            |                 |
+-----------------+---------------------+-----------------+
```

 **2. CNI Processing** 

The CNI detects that the destination IP doesn’t belong to any pod CIDR it manages. Since **outgoing NAT is disabled**, the CNI passes the packet to the host network stack without modifying it.

 **3. Node Network Processing** 

The packet enters the node’s network stack, where `netfilter` hooks trigger the `iptables` rules set by `kube-proxy`. Several rules apply in the following order:

1. The packet first hits the `KUBE-SERVICES` chain, which contains rules matching each service’s ClusterIP and port.

1. The matching rule jumps to the `KUBE-SVC-XXX` chain for the `kubernetes` service (packets destined for `172.16.0.1:443`), which contains load balancing rules.

1. The load balancing rule randomly selects one of the `KUBE-SEP-XXX` chains for the control plane ENI IPs (`10.0.0.132` or `10.0.1.23`).

1. The selected `KUBE-SEP-XXX` chain has the actual rule that changes the destination IP from the service IP to the selected IP. This is called Destination Network Address Translation (DNAT).

After these rules are applied, assuming that the selected EKS control plane ENI’s IP is `10.0.0.132`, the packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.85.1.56 | Src: 67493 (random) |                 |
| Dst: 10.0.0.132 | Dst: 443            |                 |
+-----------------+---------------------+-----------------+
```

The node forwards the packet to its default gateway because the destination IP is not in the local network.
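
The DNAT applied in step 3 and its reversal on the response path can be sketched as a toy connection-tracking table. This is illustrative only and bears no resemblance to the kernel's actual conntrack implementation; the IPs and ports are the example values from the diagrams:

```python
import random

SERVICE_IP = "172.16.0.1"
CONTROL_PLANE_ENIS = ["10.0.0.132", "10.0.1.23"]
conntrack = {}  # (pod IP, pod port) -> ENI IP chosen by the DNAT

def dnat_outgoing(packet):
    """KUBE-SEP-style DNAT: rewrite the service IP to a concrete ENI IP
    and remember the choice so the reply can be reversed."""
    if packet["dst_ip"] == SERVICE_IP:
        chosen = random.choice(CONTROL_PLANE_ENIS)
        conntrack[(packet["src_ip"], packet["src_port"])] = chosen
        packet = dict(packet, dst_ip=chosen)
    return packet

def reverse_dnat_incoming(packet):
    """conntrack on the reply path: restore the service IP as the source
    so the pod sees a reply from the address it originally dialed."""
    key = (packet["dst_ip"], packet["dst_port"])
    if conntrack.get(key) == packet["src_ip"]:
        packet = dict(packet, src_ip=SERVICE_IP)
    return packet

out = dnat_outgoing({"src_ip": "10.85.1.56", "src_port": 67493,
                     "dst_ip": SERVICE_IP, "dst_port": 443})
reply = reverse_dnat_incoming({"src_ip": out["dst_ip"], "src_port": 443,
                               "dst_ip": "10.85.1.56", "dst_port": 67493})
# reply["src_ip"] is back to the service IP, so the pod never sees
# the control plane ENI's real address
```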

 **4. Local Router Routing** 

The local router determines that the destination IP (`10.0.0.132`) belongs to the VPC CIDR (`10.0.0.0/16`) and forwards it to the gateway connecting to AWS.

 **5. Cross-Boundary Transit** 

The packet travels through your established connection (such as Direct Connect or VPN) across the cloud boundary to the VPC.

 **6. VPC Network Delivery** 

The VPC networking layer routes the packet to the correct subnet where the EKS control plane ENI (`10.0.0.132`) is located.

 **7. ENI Reception** 

The packet reaches the EKS control plane ENI attached to the Kubernetes API server.

### Response
<a name="_response_3"></a>

After the EKS control plane processes the request, it sends a response back to the pod:

 **7. API Server Sends Response** 

The EKS Kubernetes API server creates a response packet with the original source IP as the destination. The packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.0.0.132 | Src: 443            |                 |
| Dst: 10.85.1.56 | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

Because the destination IP belongs to the configured remote pod CIDR (`10.85.0.0/16`), the API server sends the packet through its ENI in the VPC with the subnet’s router as the next hop.

 **6. VPC Routing** 

The VPC route table contains an entry for the remote pod CIDR (`10.85.0.0/16`), directing this traffic to the VPC-to-onprem gateway.

 **5. Cross-Boundary Transit** 

The gateway transfers the packet across the cloud boundary through your established connection (such as Direct Connect or VPN) to your on-premises network.

 **4. On-Premises Network Reception** 

The packet arrives at your local on-premises router.

 **3. Delivery to node** 

The router’s route table has an entry for `10.85.1.0/24` with `10.80.0.2` as the next hop, delivering the packet to the node.

 **2. Node Network Processing** 

As the packet is processed by the node’s network stack, `conntrack` (part of `netfilter`) matches the packet with the connection the pod initially established. Because DNAT was originally applied, `conntrack` reverses it by rewriting the source IP from the EKS control plane ENI’s IP to the `kubernetes` service IP:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 172.16.0.1 | Src: 443            |                 |
| Dst: 10.85.1.56 | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

 **1. CNI Processing** 

The CNI identifies that the destination IP belongs to a pod in its network and delivers the packet to the correct pod network namespace.

This flow showcases why Remote Pod CIDRs must be properly routable from the VPC all the way to the specific node hosting each pod - the entire return path depends on proper routing of pod IPs across both cloud and on-premises networks.

### With CNI NAT
<a name="_with_cni_nat"></a>

This flow is very similar to the one *without CNI NAT*, but with one key difference: the CNI applies source NAT (SNAT) to the packet before sending it to the node’s network stack. This changes the source IP of the packet to the node’s IP, allowing the packet to be routed back to the node without requiring additional routing configuration.

### Request
<a name="_request_4"></a>

 **1. Pod Initiates Request** 

The pod sends a request to the `kubernetes` service IP (`172.16.0.1`) on the EKS Kubernetes API server port (443) from a random source port. The packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.85.1.56 | Src: 67493 (random) |                 |
| Dst: 172.16.0.1 | Dst: 443            |                 |
+-----------------+---------------------+-----------------+
```

 **2. CNI Processing** 

The CNI detects that the destination IP doesn’t belong to any pod CIDR it manages. Since **outgoing NAT is enabled**, the CNI applies SNAT to the packet, changing the source IP to the node’s IP before passing it to the node’s network stack:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.80.0.2  | Src: 67493 (random) |                 |
| Dst: 172.16.0.1 | Dst: 443            |                 |
+-----------------+---------------------+-----------------+
```

Note: CNI and `iptables` are shown in the example as separate blocks for clarity, but in practice, it’s possible that some CNIs use `iptables` to apply NAT.

 **3. Node Network Processing** 

Here the `iptables` rules set by `kube-proxy` behave the same as in the previous example, load balancing the packet to one of the EKS control plane ENIs. The packet now looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.80.0.2  | Src: 67493 (random) |                 |
| Dst: 10.0.0.132 | Dst: 443            |                 |
+-----------------+---------------------+-----------------+
```

The node forwards the packet to its default gateway because the destination IP is not in the local network.

 **4. Local Router Routing** 

The local router determines that the destination IP (`10.0.0.132`) belongs to the VPC CIDR (`10.0.0.0/16`) and forwards it to the gateway connecting to AWS.

 **5. Cross-Boundary Transit** 

The packet travels through your established connection (such as Direct Connect or VPN) across the cloud boundary to the VPC.

 **6. VPC Network Delivery** 

The VPC networking layer routes the packet to the correct subnet where the EKS control plane ENI (`10.0.0.132`) is located.

 **7. ENI Reception** 

The packet reaches the EKS control plane ENI attached to the Kubernetes API server.

### Response
<a name="_response_4"></a>

After the EKS control plane processes the request, it sends a response back to the pod:

 **7. API Server Sends Response** 

The EKS Kubernetes API server creates a response packet with the original source IP as the destination. The packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.0.0.132 | Src: 443            |                 |
| Dst: 10.80.0.2  | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

Because the destination IP belongs to the configured remote node CIDR (`10.80.0.0/16`), the API server sends the packet through its ENI in the VPC with the subnet’s router as the next hop.

 **6. VPC Routing** 

The VPC route table contains an entry for the remote node CIDR (`10.80.0.0/16`), directing this traffic to the VPC-to-onprem gateway.

 **5. Cross-Boundary Transit** 

The gateway transfers the packet across the cloud boundary through your established connection (such as Direct Connect or VPN) to your on-premises network.

 **4. On-Premises Network Reception** 

The packet arrives at your local on-premises router.

 **3. Delivery to node** 

The local router identifies that the destination IP (`10.80.0.2`) address belongs to its directly connected network and forwards the packet directly to the target hybrid node.

 **2. Node Network Processing** 

As the packet is processed by the node’s network stack, `conntrack` (part of `netfilter`) matches the packet with the connection the pod initially established. Because DNAT was originally applied, `conntrack` reverses it by rewriting the source IP from the EKS control plane ENI’s IP to the `kubernetes` service IP:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 172.16.0.1 | Src: 443            |                 |
| Dst: 10.80.0.2  | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

 **1. CNI Processing** 

The CNI identifies this packet belongs to a connection where it has previously applied SNAT. It reverses the SNAT, changing the destination IP back to the pod’s IP:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 172.16.0.1 | Src: 443            |                 |
| Dst: 10.85.1.56 | Dst: 67493          |                 |
+-----------------+---------------------+-----------------+
```

The CNI detects the destination IP belongs to a pod in its network and delivers the packet to the correct pod network namespace.

This flow showcases how CNI NAT-ing can simplify configuration by allowing packets to be routed back to the node without requiring additional routing for the pod CIDRs.

## EKS control plane to pods running on a hybrid node (webhooks)
<a name="hybrid-nodes-concepts-traffic-flows-cp-to-pod"></a>

![\[EKS control plane to pods running on a hybrid node\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-cp-to-pod.png)


This traffic pattern is most commonly seen with webhooks, where the EKS control plane needs to directly initiate connections to webhook servers running in pods on hybrid nodes. Examples include validating and mutating admission webhooks, which are called by the API server during resource validation or mutation processes.

### Request
<a name="_request_5"></a>

 **1. EKS Kubernetes API server initiates request** 

When a webhook is configured in the cluster and a relevant API operation triggers it, the EKS Kubernetes API server needs to make a direct connection to the webhook server pod. The API server first looks up the pod’s IP address from the Service or Endpoint resource associated with the webhook.

Assuming the webhook pod is running on a hybrid node with IP `10.85.1.23`, the EKS Kubernetes API server creates an HTTPS request to the webhook endpoint. The initial packet is sent through the EKS control plane ENI in your VPC because the destination IP `10.85.1.23` belongs to the configured remote pod CIDR (`10.85.0.0/16`). The packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.0.0.132 | Src: 41892 (random) |                 |
| Dst: 10.85.1.23 | Dst: 8443           |                 |
+-----------------+---------------------+-----------------+
```

 **2. VPC Network Processing** 

The packet leaves the EKS control plane ENI and enters the VPC networking layer with the subnet’s router as the next hop.

 **3. VPC Route Table Lookup** 

The VPC route table for the subnet containing the EKS control plane ENI contains a specific route for the remote pod CIDR (`10.85.0.0/16`). This routing rule directs the packet to the VPC-to-onprem gateway (for example, a Virtual Private Gateway for Direct Connect or VPN connections):

```
Destination     Target
10.0.0.0/16     local
10.85.0.0/16    vgw-id (VPC-to-onprem gateway)
```

 **4. Cross-Boundary Transit** 

The gateway transfers the packet across the cloud boundary through your established connection (such as Direct Connect or VPN) to your on-premises network. The packet maintains its original source and destination IP addresses as it traverses this connection.

 **5. On-Premises Network Reception** 

The packet arrives at your local on-premises router. The router consults its routing table to determine how to reach the 10.85.1.23 address. For this to work, your on-premises network must have routes for the pod CIDRs that direct traffic to the appropriate hybrid node.

In this case, the router’s route table contains an entry indicating that the `10.85.1.0/24` subnet is reachable through the hybrid node with IP `10.80.0.2`:

```
Destination     Next Hop
10.85.1.0/24    10.80.0.2
```
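
The router's decision in steps 5 and 6 is a longest-prefix-match lookup, which can be modeled with the standard-library `ipaddress` module (the second node entry is hypothetical, added to show why the most specific prefix wins):

```python
import ipaddress

ROUTE_TABLE = {
    "10.85.1.0/24": "10.80.0.2",   # pod CIDR slice -> hybrid node
    "10.85.2.0/24": "10.80.0.3",   # hypothetical second hybrid node
    "10.0.0.0/16": "vpc-gateway",  # VPC CIDR -> gateway back to AWS
}

def next_hop(dst_ip):
    """Return the next hop for the most specific matching prefix,
    or None when no route covers the destination."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(ipaddress.ip_network(cidr), hop)
               for cidr, hop in ROUTE_TABLE.items()
               if addr in ipaddress.ip_network(cidr)]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

# The webhook packet for 10.85.1.23 is forwarded to node 10.80.0.2
```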

 **6. Delivery to node** 

Based on the routing table entry, the router forwards the packet to the hybrid node (`10.80.0.2`). When the packet arrives at the node, it looks the same as when the EKS Kubernetes API server sent it, with the destination IP still being the pod’s IP.

 **7. CNI Processing** 

The node’s network stack receives the packet and, seeing that the destination IP is not the node’s own IP, passes it to the CNI for processing. The CNI identifies that the destination IP belongs to a pod running locally on this node and forwards the packet to the correct pod through the appropriate virtual interfaces:

```
Original packet -> node routing -> CNI -> Pod's network namespace
```

The webhook server in the pod receives the request and processes it.

### Response
<a name="_response_5"></a>

After the webhook pod processes the request, it sends back a response following the same path in reverse:

 **7. Pod Sends Response** 

The webhook pod creates a response packet with its own IP as the source and the original requester (the EKS control plane ENI) as the destination:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.85.1.23 | Src: 8443           |                 |
| Dst: 10.0.0.132 | Dst: 41892          |                 |
+-----------------+---------------------+-----------------+
```

The CNI identifies that this packet goes to an external network (not a local pod) and passes the packet to the node’s network stack with the original source IP preserved.

 **6. Node Network Processing** 

The node determines that the destination IP (`10.0.0.132`) is not in the local network and forwards the packet to its default gateway (the local router).

 **5. Local Router Routing** 

The local router consults its routing table and determines that the destination IP (`10.0.0.132`) belongs to the VPC CIDR (`10.0.0.0/16`). It forwards the packet to the gateway connecting to AWS.

 **4. Cross-Boundary Transit** 

The packet travels back through the same on-premises to VPC connection, crossing the cloud boundary in the reverse direction.

 **3. VPC Routing** 

When the packet arrives in the VPC, the route tables identify that the destination IP belongs to a subnet within the VPC. The packet is routed accordingly within the VPC.

 **2. and 1. EKS control plane ENI Reception** 

The packet reaches the ENI attached to the EKS Kubernetes API server, completing the round trip. The API server receives the webhook response and continues processing the original API request based on this response.

This traffic flow demonstrates why remote pod CIDRs must be properly configured and routed:
+ The VPC must have routes for the remote pod CIDRs pointing to the on-premises gateway
+ Your on-premises network must have routes for pod CIDRs that direct traffic to the specific nodes hosting those pods
+ Without this routing configuration, webhooks and other similar services running in pods on hybrid nodes would not be reachable from the EKS control plane.
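
Both halves of this configuration can be spot-checked before relying on webhooks. The commands below are illustrative; the route table ID is a placeholder, and the route output varies by environment:

```
# From a machine with AWS credentials: confirm the VPC route table has a
# route for the remote pod CIDR (rtb-0123456789abcdef0 is a placeholder).
aws ec2 describe-route-tables \
  --route-table-ids rtb-0123456789abcdef0 \
  --query 'RouteTables[].Routes[?DestinationCidrBlock==`10.85.0.0/16`]'

# From a host on the on-premises network: show which next hop the kernel
# would use to reach a pod IP.
ip route get 10.85.1.23
```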

## Pod-to-Pod running on hybrid nodes
<a name="hybrid-nodes-concepts-traffic-flows-pod-to-pod"></a>

![\[Pod-to Pod running on hybrid nodes\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-pod-to-pod.png)


This section explains how pods running on different hybrid nodes communicate with each other. This example assumes your CNI uses VXLAN for encapsulation, which is common for CNIs such as Cilium or Calico. The overall process is similar for other encapsulation protocols such as Geneve or IP-in-IP.

### Request
<a name="_request_6"></a>

 **1. Pod A Initiates Communication** 

Pod A (`10.85.1.56`) on Node 1 wants to send traffic to Pod B (`10.85.2.67`) on Node 2. The initial packet looks like this:

```
+------------------+-----------------+-------------+-----------------+
| Ethernet Header  | IP Header       | TCP/UDP     | Payload         |
| Src: Pod A MAC   | Src: 10.85.1.56 | Src: 43721  |                 |
| Dst: Gateway MAC | Dst: 10.85.2.67 | Dst: 8080   |                 |
+------------------+-----------------+-------------+-----------------+
```

 **2. CNI Intercepts and Processes the Packet** 

When Pod A’s packet leaves its network namespace, the CNI intercepts it. The CNI consults its routing table and determines:
+ The destination IP (`10.85.2.67`) belongs to the pod CIDR
+ This IP is not on the local node but belongs to Node 2 (`10.80.0.3`)
+ The packet needs to be encapsulated with VXLAN

The decision to encapsulate is critical because the underlying physical network doesn’t know how to route pod CIDRs directly; it only knows how to route traffic between node IPs.

The CNI encapsulates the entire original packet inside a VXLAN frame. This effectively creates a "packet within a packet" with new headers:

```
+-----------------+----------------+--------------+------------+---------------------------+
| Outer Ethernet  | Outer IP       | Outer UDP    | VXLAN      | Original Pod-to-Pod       |
| Src: Node1 MAC  | Src: 10.80.0.2 | Src: Random  | VNI: 42    | Packet (unchanged         |
| Dst: Router MAC | Dst: 10.80.0.3 | Dst: 8472    |            | from above)               |
+-----------------+----------------+--------------+------------+---------------------------+
```

Key points about this encapsulation:
+ The outer packet is addressed from Node 1 (`10.80.0.2`) to Node 2 (`10.80.0.3`)
+ UDP port `8472` is the VXLAN port Cilium uses by default
+ The VXLAN Network Identifier (VNI) identifies which overlay network this packet belongs to
+ The entire original packet (with Pod A’s IP as source and Pod B’s IP as destination) is preserved intact inside
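
This encapsulation can be observed on the wire. Assuming Cilium's default VXLAN port of `8472` and a physical interface named `eth0` (both assumptions that may differ in your environment; Calico's VXLAN default is `4789`), a capture on Node 1 shows node-to-node UDP traffic rather than the pod IPs:

```
# Show VXLAN-encapsulated pod traffic headed to Node 2.
sudo tcpdump -ni eth0 udp port 8472 and host 10.80.0.3
```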

The encapsulated packet now enters the regular networking stack of Node 1 and is processed in the same way as any other packet:

1.  **Node Network Processing**: Node 1’s network stack routes the packet based on its destination (`10.80.0.3`)

1.  **Local Network Delivery**:
   + If both nodes are on the same Layer 2 network, the packet is sent directly to Node 2
   + If they’re on different subnets, the packet is forwarded to the local router first

1.  **Router Handling**: The router forwards the packet based on its routing table, delivering it to Node 2

 **3. Receiving Node Processing** 

When the encapsulated packet arrives at Node 2 (`10.80.0.3`):

1. The node’s network stack receives it and identifies it as a VXLAN packet (UDP port `8472`)

1. The packet is passed to the CNI’s VXLAN interface for processing

 **4. VXLAN Decapsulation** 

The CNI on Node 2 processes the VXLAN packet:

1. It strips away the outer headers (Ethernet, IP, UDP, and VXLAN)

1. It extracts the original inner packet

1. The packet is now back to its original form:

```
+------------------+-----------------+-------------+-----------------+
| Ethernet Header  | IP Header       | TCP/UDP     | Payload         |
| Src: Pod A MAC   | Src: 10.85.1.56 | Src: 43721  |                 |
| Dst: Gateway MAC | Dst: 10.85.2.67 | Dst: 8080   |                 |
+------------------+-----------------+-------------+-----------------+
```

The CNI on Node 2 examines the destination IP (`10.85.2.67`) and:

1. Identifies that this IP belongs to a local pod

1. Routes the packet through the appropriate virtual interfaces

1. Delivers the packet to Pod B’s network namespace

### Response
<a name="_response_6"></a>

When Pod B responds to Pod A, the entire process happens in reverse:

1. Pod B sends a packet to Pod A (`10.85.1.56`)

1. Node 2’s CNI encapsulates it with VXLAN, setting the destination to Node 1 (`10.80.0.2`)

1. The encapsulated packet is delivered to Node 1

1. Node 1’s CNI decapsulates it and delivers the original response to Pod A

## Pods on cloud nodes to pods on hybrid nodes (east-west traffic)
<a name="hybrid-nodes-concepts-traffic-flows-east-west"></a>

![\[Pods on cloud nodes to pods on hybrid nodes\]](http://docs.aws.amazon.com/eks/latest/userguide/images/hybrid-nodes-east-west.png)


### Request
<a name="_request_7"></a>

 **1. Pod A Initiates Communication** 

Pod A (`10.0.0.56`) on the EC2 Node wants to send traffic to Pod B (`10.85.1.56`) on the Hybrid Node. The initial packet looks like this:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.0.0.56  | Src: 52390 (random) |                 |
| Dst: 10.85.1.56 | Dst: 8080           |                 |
+-----------------+---------------------+-----------------+
```

With the VPC CNI, Pod A has an IP from the VPC CIDR and is directly attached to an ENI on the EC2 instance. The pod’s network namespace is connected to the VPC network, so the packet enters the VPC routing infrastructure directly.

 **2. VPC Routing** 

The VPC route table contains a specific route for the Remote Pod CIDR (`10.85.0.0/16`), directing this traffic to the VPC-to-onprem gateway:

```
Destination     Target
10.0.0.0/16     local
10.85.0.0/16    vgw-id (VPC-to-onprem gateway)
```

Based on this routing rule, the packet is directed toward the gateway connecting to your on-premises network.

 **3. Cross-Boundary Transit** 

The gateway transfers the packet across the cloud boundary through your established connection (such as Direct Connect or VPN) to your on-premises network. The packet maintains its original source and destination IP addresses throughout this transit.

 **4. On-Premises Network Reception** 

The packet arrives at your local on-premises router. The router consults its routing table to determine the next hop for reaching the `10.85.1.56` address. Your on-premises router must have routes for the pod CIDRs that direct traffic to the appropriate hybrid node.

The router’s table has an entry indicating that the `10.85.1.0/24` subnet is reachable through the hybrid node with IP `10.80.0.2`:

```
Destination     Next Hop
10.85.1.0/24    10.80.0.2
```

 **5. Node Network Processing** 

The router forwards the packet to the hybrid node (`10.80.0.2`). When the packet arrives at the node, it still has Pod A’s IP as the source and Pod B’s IP as the destination.

 **6. CNI Processing** 

The node’s network stack receives the packet and, seeing that the destination IP is not its own, passes it to the CNI for processing. The CNI identifies that the destination IP belongs to a pod running locally on this node and forwards the packet to the correct pod through the appropriate virtual interfaces:

```
Original packet -> node routing -> CNI -> Pod B's network namespace
```

Pod B receives the packet and processes it as needed.

### Response
<a name="_response_7"></a>

 **6. Pod B Sends Response** 

Pod B creates a response packet with its own IP as the source and Pod A’s IP as the destination:

```
+-----------------+---------------------+-----------------+
| IP Header       | TCP Header          | Payload         |
| Src: 10.85.1.56 | Src: 8080           |                 |
| Dst: 10.0.0.56  | Dst: 52390          |                 |
+-----------------+---------------------+-----------------+
```

The CNI identifies that this packet is destined for an external network and passes it to the node’s network stack.

 **5. Node Network Processing** 

The node determines that the destination IP (`10.0.0.56`) does not belong to the local network and forwards the packet to its default gateway (the local router).

 **4. Local Router Routing** 

The local router consults its routing table and determines that the destination IP (`10.0.0.56`) belongs to the VPC CIDR (`10.0.0.0/16`). It forwards the packet to the gateway connecting to AWS.

 **3. Cross-Boundary Transit** 

The packet travels back through the same on-premises to VPC connection, crossing the cloud boundary in the reverse direction.

 **2. VPC Routing** 

When the packet arrives in the VPC, the routing system identifies that the destination IP belongs to a subnet within the VPC. The packet is routed through the VPC network toward the EC2 instance hosting Pod A.

 **1. Pod A Receives Response** 

The packet arrives at the EC2 instance and is delivered directly to Pod A through its attached ENI. Since the VPC CNI doesn’t use overlay networking for pods in the VPC, no additional decapsulation is needed; the packet arrives with its original headers intact.

This east-west traffic flow demonstrates why remote pod CIDRs must be properly configured and routable from both directions:
+ The VPC must have routes for the remote pod CIDRs pointing to the on-premises gateway
+ Your on-premises network must have routes for pod CIDRs that direct traffic to the specific nodes hosting those pods.

# Hybrid nodes `nodeadm` reference
<a name="hybrid-nodes-nodeadm"></a>

The Amazon EKS Hybrid Nodes CLI (`nodeadm`) simplifies the installation, configuration, registration, and uninstallation of the hybrid nodes components. You can include `nodeadm` in your operating system images to automate hybrid node bootstrap; see [Prepare operating system for hybrid nodes](hybrid-nodes-os.md) for more information.

The `nodeadm` version for hybrid nodes differs from the `nodeadm` version used for bootstrapping Amazon EC2 instances as nodes in Amazon EKS clusters. Follow the documentation and references for the appropriate `nodeadm` version. This documentation page is for the hybrid nodes `nodeadm` version.

The source code for the hybrid nodes `nodeadm` is published in the https://github.com/aws/eks-hybrid GitHub repository.

**Important**  
You must run `nodeadm` with a user that has root/sudo privileges.

## Download `nodeadm`
<a name="_download_nodeadm"></a>

The hybrid nodes version of `nodeadm` is hosted in Amazon S3, fronted by Amazon CloudFront. To install `nodeadm`, run the following command on each of your on-premises hosts.

 **For x86_64 hosts** 

```
curl -OL 'https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/amd64/nodeadm'
```

 **For ARM hosts** 

```
curl -OL 'https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/arm64/nodeadm'
```

Add executable file permission to the downloaded binary on each host.

```
chmod +x nodeadm
```
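
The two download commands differ only in the architecture path segment. As a sketch, the correct segment can be derived from `uname -m` so a single script works on both host types; the resolved URL is printed and can be passed to `curl -OL` as above:

```
#!/bin/sh
# Map the kernel architecture name to the amd64/arm64 segment
# used in the nodeadm release URLs.
nodeadm_arch() {
  case "$1" in
    x86_64)  echo amd64 ;;
    aarch64) echo arm64 ;;
    *)       echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

ARCH=$(nodeadm_arch "$(uname -m)") || exit 1
echo "https://hybrid-assets.eks.amazonaws.com/releases/latest/bin/linux/${ARCH}/nodeadm"
```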

## `nodeadm install`
<a name="_nodeadm_install"></a>

The `nodeadm install` command is used to install the artifacts and dependencies required to run and join hybrid nodes to an Amazon EKS cluster. The `nodeadm install` command can be run individually on each hybrid node or can be run during image build pipelines to preinstall the hybrid nodes dependencies in operating system images.

 **Usage** 

```
nodeadm install [KUBERNETES_VERSION] [flags]
```

 **Positional Arguments** 

(Required) `KUBERNETES_VERSION` The major.minor version of EKS Kubernetes to install, for example `1.32` 

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-p`,  `--credential-provider`   |  TRUE  |  Credential provider to install. Supported values are `iam-ra` and `ssm`. See [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) for more information.  | 
|   `-s`,  `--containerd-source`   |  FALSE  |  Source for `containerd`. `nodeadm` supports installing `containerd` from the OS distro, Docker packages, and skipping `containerd` install.  **Values**   `distro` - This is the default value. `nodeadm` will install the latest `containerd` package distributed by the node OS that is compatible with the EKS Kubernetes version. `distro` is not a supported value for Red Hat Enterprise Linux (RHEL) operating systems.  `docker` - `nodeadm` will install the latest `containerd` package built and distributed by Docker that is compatible with the EKS Kubernetes version. `docker` is not a supported value for Amazon Linux 2023.  `none` - `nodeadm` will not install `containerd` package. You must manually install `containerd` before running `nodeadm init`.  | 
|   `-r`,  `--region`   |  FALSE  |  Specifies the AWS Region for downloading artifacts such as the SSM Agent. Defaults to `us-west-2`.  | 
|   `-t`,  `--timeout`   |  FALSE  |  Maximum install command duration. The input follows duration format. For example `1h23m`. Default download timeout for install command is set to 20 minutes.  | 
|   `-h`, `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 

 **Examples** 

Install Kubernetes version `1.32` with AWS Systems Manager (SSM) as the credential provider

```
nodeadm install 1.32 --credential-provider ssm
```

Install Kubernetes version `1.32` with AWS Systems Manager (SSM) as the credential provider, Docker as the containerd source, with a download timeout of 20 minutes.

```
nodeadm install 1.32 --credential-provider ssm --containerd-source docker --timeout 20m
```

Install Kubernetes version `1.32` with AWS IAM Roles Anywhere as the credential provider

```
nodeadm install 1.32 --credential-provider iam-ra
```

## `nodeadm config check`
<a name="_nodeadm_config_check"></a>

The `nodeadm config check` command checks the provided node configuration for errors. This command can be used to verify and validate the correctness of a hybrid node configuration file.

 **Usage** 

```
nodeadm config check [flags]
```

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-c`,  `--config-source`   |  TRUE  |  Source of `nodeadm` configuration. For hybrid nodes the input should follow a URI with file scheme.  | 
|   `-h`, `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 

 **Examples** 

```
nodeadm config check -c file://nodeConfig.yaml
```

## `nodeadm init`
<a name="_nodeadm_init"></a>

The `nodeadm init` command starts and connects the hybrid node with the configured Amazon EKS cluster. See [Node Config for SSM hybrid activations](#hybrid-nodes-node-config-ssm) or [Node Config for IAM Roles Anywhere](#hybrid-nodes-node-config-iamra) for details of how to configure the `nodeConfig.yaml` file.

 **Usage** 

```
nodeadm init [flags]
```

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-c`,  `--config-source`   |  TRUE  |  Source of `nodeadm` configuration. For hybrid nodes the input should follow a URI with file scheme.  | 
|   `-s`,  `--skip`   |  FALSE  |  Phases of `init` to be skipped. It is not recommended to skip any of the phases unless it helps to fix an issue.  **Values**   `install-validation` skips checking if the preceding install command ran successfully.  `cni-validation` skips checking whether the Cilium or Calico CNI VXLAN ports are open when a firewall is enabled on the node.  `node-ip-validation` skips checking if the node IP falls within a CIDR in the remote node networks.  | 
|   `-h`, `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 

 **Examples** 

```
nodeadm init -c file://nodeConfig.yaml
```

## `nodeadm upgrade`
<a name="_nodeadm_upgrade"></a>

The `nodeadm upgrade` command upgrades all the installed artifacts to the latest version and bootstraps the node to configure the upgraded artifacts and join the EKS cluster on AWS. Upgrade is disruptive to the workloads running on the node. Move your workloads to another node before running upgrade.

 **Usage** 

```
nodeadm upgrade [KUBERNETES_VERSION] [flags]
```

 **Positional Arguments** 

(Required) `KUBERNETES_VERSION` The major.minor version of EKS Kubernetes to install, for example `1.32` 

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-c`,  `--config-source`   |  TRUE  |  Source of `nodeadm` configuration. For hybrid nodes the input should follow a URI with file scheme.  | 
|   `-t`,  `--timeout`   |  FALSE  |  Timeout for downloading artifacts. The input follows duration format. For example `1h23m`. Default download timeout for upgrade command is set to 10 minutes.  | 
|   `-s`,  `--skip`   |  FALSE  |  Phases of upgrade to be skipped. It is not recommended to skip any of the phases unless it helps to fix an issue.  **Values**   `pod-validation` skips checking that no pods are running on the node, except daemon sets and static pods.  `node-validation` skips checking if the node has been cordoned.  `init-validation` skips checking if the node has been initialized successfully before running upgrade.  `containerd-major-version-upgrade` prevents containerd major version upgrades during node upgrade.  | 
|   `-h`, `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 

 **Examples** 

```
nodeadm upgrade 1.32 -c file://nodeConfig.yaml
```

```
nodeadm upgrade 1.32 -c file://nodeConfig.yaml --timeout 20m
```

## `nodeadm uninstall`
<a name="_nodeadm_uninstall"></a>

The `nodeadm uninstall` command stops and removes the artifacts `nodeadm` installs during `nodeadm install`, including the kubelet and containerd. Note that the uninstall command does not drain or delete your hybrid nodes from your cluster. You must run the drain and delete operations separately; see [Remove hybrid nodes](hybrid-nodes-remove.md) for more information. By default, `nodeadm uninstall` will not proceed if there are pods remaining on the node. Similarly, `nodeadm uninstall` does not remove CNI dependencies or dependencies of other Kubernetes add-ons you run on your cluster. To fully remove the CNI installation from your host, see the instructions at [Configure CNI for hybrid nodes](hybrid-nodes-cni.md). If you are using AWS SSM hybrid activations as your on-premises credentials provider, the `nodeadm uninstall` command deregisters your hosts as AWS SSM managed instances.

 **Usage** 

```
nodeadm uninstall [flags]
```

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-s`,  `--skip`   |  FALSE  |  Phases of uninstall to be skipped. It is not recommended to skip any of the phases unless it helps to fix an issue.  **Values**   `pod-validation` skips checking that no pods are running on the node, except daemon sets and static pods.  `node-validation` skips checking if the node has been cordoned.  `init-validation` skips checking if the node has been initialized successfully before running uninstall.  | 
|   `-h`,  `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 
|   `-f`,  `--force`   |  FALSE  |  Force delete additional directories that might contain remaining files from Kubernetes and CNI components.  **WARNING**  This will delete all contents in default Kubernetes and CNI directories (`/var/lib/cni`, `/etc/cni/net.d`, etc). Do not use this flag if you store your own data in these locations. Starting from nodeadm `v1.0.9`, the `./nodeadm uninstall --skip node-validation,pod-validation --force` command no longer deletes the `/var/lib/kubelet` directory. This is because it may contain Pod volumes and volume-subpath directories that sometimes include the mounted node filesystem.  **Safe handling tips**  - Deleting mounted paths can lead to accidental deletion of the actual mounted node filesystem. Before manually deleting the `/var/lib/kubelet` directory, carefully inspect all active mounts and unmount volumes safely to avoid data loss.  | 

 **Examples** 

```
nodeadm uninstall
```

```
nodeadm uninstall --skip node-validation,pod-validation
```
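
Since `nodeadm uninstall` neither drains the node nor removes it from the cluster, a complete removal typically combines the documented drain and delete steps with uninstall. In the sketch below, `my-node` is a placeholder; the `kubectl` commands run from a machine with cluster access, while `nodeadm` runs on the node itself:

```
# 1. Drain workloads off the node.
kubectl drain my-node --ignore-daemonsets --delete-emptydir-data

# 2. On the node: stop and remove the components nodeadm installed.
sudo ./nodeadm uninstall

# 3. Remove the node object from the cluster.
kubectl delete node my-node
```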

## `nodeadm debug`
<a name="_nodeadm_debug"></a>

The `nodeadm debug` command can be used to troubleshoot unhealthy or misconfigured hybrid nodes. It validates that the following requirements are in place:
+ The node has network access to the required AWS APIs for obtaining credentials
+ The node is able to get AWS credentials for the configured Hybrid Nodes IAM role
+ The node has network access to the EKS Kubernetes API endpoint, and the EKS Kubernetes API endpoint certificate is valid
+ The node is able to authenticate with the EKS cluster, its identity in the cluster is valid, and the node has access to the EKS cluster through the VPC configured for the EKS cluster

If errors are found, the command’s output suggests troubleshooting steps. Certain validation steps run child processes; if these fail, their output is shown in a stderr section under the validation error.

 **Usage** 

```
nodeadm debug [flags]
```

 **Flags** 


| Name | Required | Description | 
| --- | --- | --- | 
|   `-c`, `--config-source`   |  TRUE  |  Source of `nodeadm` configuration. For hybrid nodes the input should follow a URI with file scheme.  | 
|   `--no-color`   |  FALSE  |  Disables color output. Useful for automation.  | 
|   `-h`, `--help`   |  FALSE  |  Displays help message with available flag, subcommand and positional value parameters.  | 

 **Examples** 

```
nodeadm debug -c file://nodeConfig.yaml
```

## Nodeadm file locations
<a name="_nodeadm_file_locations"></a>

### nodeadm install
<a name="_nodeadm_install_2"></a>

When running `nodeadm install`, the following files and file locations are configured.


| Artifact | Path | 
| --- | --- | 
|  IAM Roles Anywhere CLI  |  /usr/local/bin/aws_signing_helper  | 
|  Kubelet binary  |  /usr/bin/kubelet  | 
|  Kubectl binary  |  /usr/local/bin/kubectl  | 
|  ECR Credentials Provider  |  /etc/eks/image-credential-provider/ecr-credential-provider  | 
|   AWS IAM Authenticator  |  /usr/local/bin/aws-iam-authenticator  | 
|  SSM Setup CLI  |  /opt/ssm/ssm-setup-cli  | 
|  SSM Agent  |  On Ubuntu - /snap/amazon-ssm-agent/current/amazon-ssm-agent On RHEL & AL2023 - /usr/bin/amazon-ssm-agent  | 
|  Containerd  |  On Ubuntu & AL2023 - /usr/bin/containerd On RHEL - /bin/containerd  | 
|  Iptables  |  On Ubuntu & AL2023 - /usr/sbin/iptables On RHEL - /sbin/iptables  | 
|  CNI plugins  |  /opt/cni/bin  | 
|  installed artifacts tracker  |  /opt/nodeadm/tracker  | 

### nodeadm init
<a name="_nodeadm_init_2"></a>

When running `nodeadm init`, the following files and file locations are configured.


| Name | Path | 
| --- | --- | 
|  Kubelet kubeconfig  |  /var/lib/kubelet/kubeconfig  | 
|  Kubelet config  |  /etc/kubernetes/kubelet/config.json  | 
|  Kubelet systemd unit  |  /etc/systemd/system/kubelet.service  | 
|  Image credentials provider config  |  /etc/eks/image-credential-provider/config.json  | 
|  Kubelet env file  |  /etc/eks/kubelet/environment  | 
|  Kubelet Certs  |  /etc/kubernetes/pki/ca.crt  | 
|  Containerd config  |  /etc/containerd/config.toml  | 
|  Containerd kernel modules config  |  /etc/modules-load.d/containerd.conf  | 
|   AWS config file  |  /etc/aws/hybrid/config  | 
|   AWS credentials file (if credentials file is enabled)  |  /eks-hybrid/.aws/credentials  | 
|   AWS signing helper systemd unit  |  /etc/systemd/system/aws_signing_helper_update.service  | 
|  Sysctl conf file  |  /etc/sysctl.d/99-nodeadm.conf  | 
|  Ca-certificates  |  /etc/ssl/certs/ca-certificates.crt  | 
|  Gpg key file  |  /etc/apt/keyrings/docker.asc  | 
|  Docker repo source file  |  /etc/apt/sources.list.d/docker.list  | 

## Node Config for SSM hybrid activations
<a name="hybrid-nodes-node-config-ssm"></a>

The following is a sample `nodeConfig.yaml` when using AWS SSM hybrid activations for hybrid nodes credentials.

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name:             # Name of the EKS cluster
    region:           # AWS Region where the EKS cluster resides
  hybrid:
    ssm:
      activationCode: # SSM hybrid activation code
      activationId:   # SSM hybrid activation id
```

## Node Config for IAM Roles Anywhere
<a name="hybrid-nodes-node-config-iamra"></a>

The following is a sample `nodeConfig.yaml` for AWS IAM Roles Anywhere for hybrid nodes credentials.

When using AWS IAM Roles Anywhere as your on-premises credentials provider, the `nodeName` you use in your `nodeadm` configuration must align with the permissions you scoped for your Hybrid Nodes IAM role. For example, if your permissions for the Hybrid Nodes IAM role only allow AWS IAM Roles Anywhere to assume the role when the role session name is equal to the CN of the host certificate, then the `nodeName` in your `nodeadm` configuration must be the same as the CN of your certificates. The `nodeName` that you use can’t be longer than 64 characters. For more information, see [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name:              # Name of the EKS cluster
    region:            # AWS Region where the EKS cluster resides
  hybrid:
    iamRolesAnywhere:
      nodeName:        # Name of the node
      trustAnchorArn:  # ARN of the IAM Roles Anywhere trust anchor
      profileArn:      # ARN of the IAM Roles Anywhere profile
      roleArn:         # ARN of the Hybrid Nodes IAM role
      certificatePath: # Path to the certificate file to authenticate with the IAM Roles Anywhere trust anchor
      privateKeyPath:  # Path to the private key file for the certificate
```

## Node Config for customizing kubelet (Optional)
<a name="hybrid-nodes-nodeadm-kubelet"></a>

You can pass kubelet configuration and flags in your `nodeadm` configuration. See the example below for how to add an additional node label `abc.company.com/test-label` and config for setting `shutdownGracePeriod` to 30 seconds.

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name:             # Name of the EKS cluster
    region:           # AWS Region where the EKS cluster resides
  kubelet:
    config:           # Map of kubelet config and values
       shutdownGracePeriod: 30s
    flags:            # List of kubelet flags
       - --node-labels=abc.company.com/test-label=true
  hybrid:
    ssm:
      activationCode: # SSM hybrid activation code
      activationId:   # SSM hybrid activation id
```

## Node Config for customizing containerd (Optional)
<a name="_node_config_for_customizing_containerd_optional"></a>

You can pass custom containerd configuration in your `nodeadm` configuration. The containerd configuration for `nodeadm` accepts in-line TOML. See the example below for how to configure containerd to disable deletion of unpacked image layers in the containerd content store.

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name:             # Name of the EKS cluster
    region:           # AWS Region where the EKS cluster resides
  containerd:
    config: |         # Inline TOML containerd additional configuration
       [plugins."io.containerd.grpc.v1.cri".containerd]
       discard_unpacked_layers = false
  hybrid:
    ssm:
      activationCode: # SSM hybrid activation code
      activationId:   # SSM hybrid activation id
```

**Note**  
Containerd versions 1.x and 2.x use different configuration formats. Containerd 1.x uses config version 2, while containerd 2.x uses config version 3. Although containerd 2.x remains backward compatible with config version 2, config version 3 is recommended for optimal performance. Check your containerd version with `containerd --version` or review `nodeadm` install logs. For more details on config versioning, see https://containerd.io/releases/

You can also use the containerd configuration to enable SELinux support. With SELinux enabled on containerd, ensure that pods scheduled on the node have the proper `securityContext` with `seLinuxOptions` set. For more information on configuring a security context, see the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/).

**Note**  
Red Hat Enterprise Linux (RHEL) 8 and RHEL 9 have SELinux enabled by default and set to enforcing mode on the host. Amazon Linux 2023 has SELinux enabled by default and set to permissive mode. When SELinux is set to permissive mode on the host, enabling it on containerd will not block requests but will log them according to the SELinux configuration on the host.

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name:             # Name of the EKS cluster
    region:           # AWS Region where the EKS cluster resides
  containerd:
    config: |         # Inline TOML containerd additional configuration
       [plugins."io.containerd.grpc.v1.cri"]
       enable_selinux = true
  hybrid:
    ssm:
      activationCode: # SSM hybrid activation code
      activationId:   # SSM hybrid activation id
```
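
As a pod-side sketch of the `securityContext` mentioned above (the pod name, image, and SELinux level are hypothetical examples; the level must match the policy on your hosts):

```
apiVersion: v1
kind: Pod
metadata:
  name: selinux-example     # hypothetical name
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456" # example MCS level; set according to your policy
  containers:
    - name: app
      image: public.ecr.aws/docker/library/nginx:latest # placeholder image
```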

# Troubleshooting hybrid nodes
<a name="hybrid-nodes-troubleshooting"></a>

This topic covers some common errors that you might see while using Amazon EKS Hybrid Nodes and how to fix them. For other troubleshooting information, see [Troubleshoot problems with Amazon EKS clusters and nodes](troubleshooting.md) and [Knowledge Center tag for Amazon EKS](https://repost.aws/tags/knowledge-center/TA4IvCeWI1TE66q4jEj4Z9zg/amazon-elastic-kubernetes-service) on * AWS re:Post*. If you cannot resolve the issue, contact AWS Support.

 **Node troubleshooting with `nodeadm debug`** 

You can run the `nodeadm debug` command from your hybrid nodes to validate that networking and credential requirements are met. For more information on the `nodeadm debug` command, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

 **Detect issues with your hybrid nodes with cluster insights** 

Amazon EKS cluster insights includes *insight checks* that detect common issues with the configuration of EKS Hybrid Nodes in your cluster. You can view the results of all insight checks from the AWS Management Console, AWS CLI, and the AWS SDKs. For more information about cluster insights, see [Prepare for Kubernetes version upgrades and troubleshoot misconfigurations with cluster insights](cluster-insights.md).

## Installing hybrid nodes troubleshooting
<a name="hybrid-nodes-troubleshooting-install"></a>

The following troubleshooting topics are related to installing the hybrid nodes dependencies on hosts with the `nodeadm install` command.

 **`nodeadm` command failed `must run as root`** 

The `nodeadm install` command must be run with a user that has root or `sudo` privileges on your host. If you run `nodeadm install` with a user that does not have root or `sudo` privileges, you will see the following error in the `nodeadm` output.

```
"msg":"Command failed","error":"must run as root"
```
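
To remediate, re-run the command as a user with root privileges, for example:

```
sudo nodeadm install K8S_VERSION --credential-provider CREDS_PROVIDER
```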

 **Unable to connect to dependencies** 

The `nodeadm install` command installs the dependencies required for hybrid nodes. The hybrid nodes dependencies include `containerd`, `kubelet`, `kubectl`, and AWS SSM or AWS IAM Roles Anywhere components. You must have access from where you are running `nodeadm install` to download these dependencies. For more information on the list of locations that you must be able to access, see [Prepare networking for hybrid nodes](hybrid-nodes-networking.md). If you do not have access, you will see errors similar to the following in the `nodeadm install` output.

```
"msg":"Command failed","error":"failed reading file from url: ...: max retries achieved for http request"
```
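
A quick reachability probe can help narrow down which dependency endpoint is blocked. The sketch below uses `https://example.com` as a hypothetical placeholder; substitute the real endpoints listed in the networking requirements page.

```shell
# Hypothetical reachability probe; substitute the real dependency endpoints
# from the hybrid nodes networking requirements.
url="https://example.com"
if curl -sS --max-time 10 -o /dev/null "$url" 2>/dev/null; then
  result="reachable: $url"
else
  result="unreachable: $url"
fi
echo "$result"
```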

 **Failed to update package manager** 

The `nodeadm install` command runs `apt update`, `yum update`, or `dnf update` before installing the hybrid nodes dependencies. If this step does not succeed, you might see errors similar to the following. To remediate, you can run `apt update`, `yum update`, or `dnf update` before running `nodeadm install`, or you can attempt to re-run `nodeadm install`.

```
failed to run update using package manager
```

 **Timeout or context deadline exceeded** 

When running `nodeadm install`, if you see issues at various stages of the install process with errors that indicate a timeout or that the context deadline was exceeded, you might have a slow connection that is preventing the hybrid nodes dependencies from being installed before the timeouts are reached. To work around these issues, you can use the `--timeout` flag in `nodeadm` to extend the duration of the timeouts for downloading the dependencies.

```
nodeadm install K8S_VERSION --credential-provider CREDS_PROVIDER --timeout 20m0s
```

## Connecting hybrid nodes troubleshooting
<a name="hybrid-nodes-troubleshooting-connect"></a>

The troubleshooting topics in this section are related to the process of connecting hybrid nodes to EKS clusters with the `nodeadm init` command.

 **Operation errors or unsupported scheme** 

When running `nodeadm init`, if you see errors related to `operation error` or `unsupported scheme`, check your `nodeConfig.yaml` to make sure it is properly formatted and passed to `nodeadm`. For more information on the format and options for `nodeConfig.yaml`, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

```
"msg":"Command failed","error":"operation error ec2imds: GetRegion, request canceled, context deadline exceeded"
```

 **Hybrid Nodes IAM role missing permissions for the `eks:DescribeCluster` action** 

When running `nodeadm init`, `nodeadm` attempts to gather information about your EKS cluster by calling the EKS `DescribeCluster` action. If your Hybrid Nodes IAM role does not have permission for the `eks:DescribeCluster` action, then you must pass your Kubernetes API endpoint, cluster CA bundle, and service IPv4 CIDR in the node configuration you pass to `nodeadm` when you run `nodeadm init`. For more information on the required permissions for the Hybrid Nodes IAM role, see [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

```
"msg":"Command failed","error":"operation error EKS: DescribeCluster, https response error StatusCode: 403 ... AccessDeniedException"
```
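
If you need to supply the cluster details yourself, they can be set directly in the node configuration. The endpoint, CA bundle, and CIDR values below are hypothetical placeholders; confirm the field names against the [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  cluster:
    name: CLUSTER_NAME
    region: REGION
    apiServerEndpoint: https://EXAMPLE.gr7.us-west-2.eks.amazonaws.com
    certificateAuthority: BASE64_ENCODED_CA_BUNDLE
    cidr: 172.16.0.0/16
```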

 **Hybrid Nodes IAM role missing permissions for the `eks:ListAccessEntries` action** 

When running `nodeadm init`, `nodeadm` attempts to validate whether your EKS cluster has an access entry of type `HYBRID_LINUX` associated with the Hybrid Nodes IAM role by calling the EKS `ListAccessEntries` action. If your Hybrid Nodes IAM role does not have permission for the `eks:ListAccessEntries` action, then you must pass the `--skip cluster-access-validation` flag when you run the `nodeadm init` command. For more information on the required permissions for the Hybrid Nodes IAM role, see [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md).

```
"msg":"Command failed","error":"operation error EKS: ListAccessEntries, https response error StatusCode: 403 ... AccessDeniedException"
```
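
As a sketch, the skip flag is passed alongside your node configuration when you initialize the node; see the `nodeadm` reference for the exact flag syntax:

```
nodeadm init --config-source file://nodeConfig.yaml --skip cluster-access-validation
```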

 **Node IP not in remote node network CIDR** 

When running `nodeadm init`, you might encounter an error if the node’s IP address is not within the specified remote node network CIDRs. The error will look similar to the following example:

```
node IP 10.18.0.1 is not in any of the remote network CIDR blocks [10.0.0.0/16 192.168.0.0/16]
```

This example shows a node with IP 10.18.0.1 attempting to join a cluster with remote network CIDRs 10.0.0.0/16 and 192.168.0.0/16. The error occurs because 10.18.0.1 isn’t within either of the ranges.

Confirm that you’ve properly configured your `RemoteNodeNetworks` to include all node IP addresses. For more information on networking configuration, see [Prepare networking for hybrid nodes](hybrid-nodes-networking.md).
+ Run the following command in the AWS Region where your cluster is located to check your `RemoteNodeNetwork` configurations. Verify that the CIDR blocks listed in the output include the IP range of your node and match the CIDR blocks listed in the error message. If they do not match, confirm that the cluster name and Region in your `nodeConfig.yaml` match your intended cluster.

```
aws eks describe-cluster --name CLUSTER_NAME --region REGION_NAME --query cluster.remoteNetworkConfig.remoteNodeNetworks
```
+ Verify you’re working with the intended node:
  + Confirm you’re on the correct node by checking its hostname and IP address match the one you intend to register with the cluster.
  + Confirm this node is in the correct on-premises network (the one whose CIDR range was registered as `RemoteNodeNetwork` during cluster setup).

If your node IP is still not what you expected, check the following:
+ If you are using IAM Roles Anywhere, `kubelet` performs a DNS lookup on the IAM Roles Anywhere `nodeName` and uses an IP associated with the node name if available. If you maintain DNS entries for your nodes, confirm that these entries point to IPs within your remote node network CIDRs.
+ If your node has multiple network interfaces, `kubelet` might select an interface with an IP address outside your remote node network CIDRs by default. To use a different interface, specify its IP address using the `--node-ip` `kubelet` flag in your `nodeConfig.yaml`. For more information, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md). You can view your node’s network interfaces and their IP addresses by running the following command on your node:

```
ip addr show
```
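
For example, a node configuration that pins `kubelet` to a specific interface might look like the following sketch, where `10.0.0.10` is a hypothetical address within your remote node network CIDRs:

```
apiVersion: node.eks.aws/v1alpha1
kind: NodeConfig
spec:
  kubelet:
    flags:
      - --node-ip=10.0.0.10
```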

 **Hybrid nodes are not appearing in EKS cluster** 

If you ran `nodeadm init` and it completed but your hybrid nodes do not appear in your cluster, there might be issues with the network connection between your hybrid nodes and the EKS control plane, you might not have the required security group permissions configured, or you might not have the required mapping of your Hybrid Nodes IAM role to Kubernetes Role-Based Access Control (RBAC). You can start the debugging process by checking the status of `kubelet` and the `kubelet` logs with the following commands. Run the following commands from the hybrid nodes that failed to join your cluster.

```
systemctl status kubelet
```

```
journalctl -u kubelet -f
```

 **Unable to communicate with cluster** 

If your hybrid node was unable to communicate with the cluster control plane, you might see logs similar to the following.

```
"Failed to ensure lease exists, will retry" err="Get ..."
```

```
"Unable to register node with API server" err="Post ..."
```

```
Failed to contact API server when waiting for CSINode publishing ... dial tcp <ip address>: i/o timeout
```

If you see these messages, check the following to ensure your environment meets the hybrid nodes requirements detailed in [Prepare networking for hybrid nodes](hybrid-nodes-networking.md).
+ Confirm the VPC passed to the EKS cluster has routes to your Transit Gateway (TGW) or Virtual Private Gateway (VGW) for your on-premises node and, optionally, pod CIDRs.
+ Confirm the additional security group for your EKS cluster has inbound rules for your on-premises node CIDRs and, optionally, pod CIDRs.
+ Confirm your on-premises router is configured to allow traffic to and from the EKS control plane.
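
One way to spot-check the first two items is with the AWS CLI, where `VPC_ID` and `SECURITY_GROUP_ID` are placeholders for your cluster's VPC and additional security group:

```
aws ec2 describe-route-tables --filters Name=vpc-id,Values=VPC_ID --region REGION
aws ec2 describe-security-group-rules --filters Name=group-id,Values=SECURITY_GROUP_ID --region REGION
```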

 **Unauthorized** 

If your hybrid node was able to communicate with the EKS control plane but was not able to register, you might see logs similar to the following. Note the key difference in the log messages below is the `Unauthorized` error. This signals that the node was not able to perform its tasks because it does not have the required permissions.

```
"Failed to ensure lease exists, will retry" err="Unauthorized"
```

```
"Unable to register node with API server" err="Unauthorized"
```

```
Failed to contact API server when waiting for CSINode publishing: Unauthorized
```

If you see these messages, check the following to ensure your environment meets the hybrid nodes requirements detailed in [Prepare credentials for hybrid nodes](hybrid-nodes-creds.md) and [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).
+ Confirm the identity of the hybrid nodes matches your expected Hybrid Nodes IAM role. This can be done by running `sudo aws sts get-caller-identity` from your hybrid nodes.
+ Confirm your Hybrid Nodes IAM role has the required permissions.
+ Confirm that in your cluster you have an EKS access entry for your Hybrid Nodes IAM role or confirm that your `aws-auth` ConfigMap has an entry for your Hybrid Nodes IAM role. If you are using EKS access entries, confirm your access entry for your Hybrid Nodes IAM role has the `HYBRID_LINUX` access type. If you are using the `aws-auth` ConfigMap, confirm your entry for the Hybrid Nodes IAM role meets the requirements and formatting detailed in [Prepare cluster access for hybrid nodes](hybrid-nodes-cluster-prep.md).
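
If the access entry is missing, you can list and create access entries with the AWS CLI; the placeholder values are your cluster name, Region, and Hybrid Nodes IAM role ARN:

```
aws eks list-access-entries --cluster-name CLUSTER_NAME --region REGION
aws eks create-access-entry --cluster-name CLUSTER_NAME --region REGION \
  --principal-arn HYBRID_NODES_ROLE_ARN --type HYBRID_LINUX
```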

### Hybrid nodes registered with EKS cluster but show status `Not Ready`
<a name="hybrid-nodes-troubleshooting-not-ready"></a>

If your hybrid nodes successfully registered with your EKS cluster, but the hybrid nodes show status `Not Ready`, the first thing to check is your Container Network Interface (CNI) status. If you have not installed a CNI, then it is expected that your hybrid nodes have status `Not Ready`. Once a CNI is installed and running successfully, nodes are updated to the status `Ready`. If you attempted to install a CNI but it is not running successfully, see [Hybrid nodes CNI troubleshooting](#hybrid-nodes-troubleshooting-cni) on this page.

 **Certificate Signing Requests (CSRs) are stuck Pending** 

After connecting hybrid nodes to your EKS cluster, if you see pending CSRs for your hybrid nodes, your hybrid nodes are not meeting the requirements for automatic approval. CSRs for hybrid nodes are automatically approved and issued if they were created by a node with the `eks.amazonaws.com/compute-type: hybrid` label, and the CSR has the following Subject Alternative Names (SANs): at least one DNS SAN equal to the node name, and IP SANs that belong to the remote node network CIDRs.
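
You can inspect a pending CSR to see which requirement is not met, and approve it manually if appropriate; `CSR_NAME` is a placeholder for the CSR name from the first command:

```
kubectl get csr
kubectl get csr CSR_NAME -o jsonpath='{.spec.request}' | base64 --decode | openssl req -noout -text
kubectl certificate approve CSR_NAME
```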

 **Hybrid profile already exists** 

If you changed your `nodeadm` configuration and attempt to reregister the node with the new configuration, you might see an error that states that the hybrid profile already exists but its contents have changed. Instead of running `nodeadm init` in between configuration changes, run `nodeadm uninstall` followed by `nodeadm install` and `nodeadm init`. This ensures a proper cleanup of the previous configuration.

```
"msg":"Command failed","error":"hybrid profile already exists at /etc/aws/hybrid/config but its contents do not align with the expected configuration"
```

 **Hybrid node failed to resolve Private API** 

After running `nodeadm init`, if you see an error in the `kubelet` logs that shows failures to contact the EKS Kubernetes API server because there is `no such host`, you might have to change your DNS entry for the EKS Kubernetes API endpoint in your on-premises network or at the host level. See [Forwarding inbound DNS queries to your VPC](https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/resolver-forwarding-inbound-queries.html) in the *Amazon Route 53 Developer Guide*.

```
Failed to contact API server when waiting for CSINode publishing: Get ... no such host
```

 **Can’t view hybrid nodes in the EKS console** 

If you have registered your hybrid nodes but are unable to view them in the EKS console, check the permissions of the IAM principal you are using to view the console. The IAM principal you’re using must have specific minimum IAM and Kubernetes permissions to view resources in the console. For more information, see [View Kubernetes resources in the AWS Management Console](view-kubernetes-resources.md).

## Running hybrid nodes troubleshooting
<a name="_running_hybrid_nodes_troubleshooting"></a>

If your hybrid nodes registered with your EKS cluster, had status `Ready`, and then transitioned to status `Not Ready`, a wide range of issues might have contributed to the unhealthy status, such as the node lacking sufficient CPU, memory, or available disk space, or the node being disconnected from the EKS control plane. You can use the steps below to troubleshoot your nodes, and if you cannot resolve the issue, contact AWS Support.

Run `nodeadm debug` from your hybrid nodes to validate networking and credential requirements are met. For more information on the `nodeadm debug` command, see [Hybrid nodes `nodeadm` reference](hybrid-nodes-nodeadm.md).

 **Get node status** 

```
kubectl get nodes -o wide
```

 **Check node conditions and events** 

```
kubectl describe node NODE_NAME
```

 **Get pod status** 

```
kubectl get pods -A -o wide
```

 **Check pod conditions and events** 

```
kubectl describe pod POD_NAME
```

 **Check pod logs** 

```
kubectl logs POD_NAME
```

 **Check `kubelet` status and logs** 

```
systemctl status kubelet
```

```
journalctl -u kubelet -f
```

 **Pod liveness probes failing or webhooks are not working** 

If applications, add-ons, or webhooks running on your hybrid nodes are not starting properly, you might have networking issues that block the communication to the pods. For the EKS control plane to contact webhooks running on hybrid nodes, you must configure your EKS cluster with a remote pod network and have routes for your on-premises pod CIDR in your VPC routing table with the target as your Transit Gateway (TGW), virtual private gateway (VGW), or other gateway you are using to connect your VPC with your on-premises network. For more information on the networking requirements for hybrid nodes, see [Prepare networking for hybrid nodes](hybrid-nodes-networking.md). You additionally must allow this traffic in your on-premises firewall and ensure your router can properly route to your pods. See [Configure webhooks for hybrid nodes](hybrid-nodes-webhooks.md) for more information on the requirements for running webhooks on hybrid nodes.

A common pod log message for this scenario is shown below, where `ip-address` is the cluster IP for the Kubernetes service.

```
dial tcp <ip-address>:443: connect: no route to host
```

 **`kubectl logs` or `kubectl exec` commands not working (`kubelet` API commands)** 

If `kubectl attach`, `kubectl cp`, `kubectl exec`, `kubectl logs`, and `kubectl port-forward` commands time out while other `kubectl` commands succeed, the issue is likely related to remote network configuration. These commands connect through the cluster to the `kubelet` endpoint on the node. For more information see [`kubelet` endpoint](hybrid-nodes-concepts-kubernetes.md#hybrid-nodes-concepts-k8s-kubelet-api).

Verify that your node IPs and pod IPs fall within the remote node network and remote pod network CIDRs configured for your cluster. Use the commands below to examine IP assignments.

```
kubectl get nodes -o wide
```

```
kubectl get pods -A -o wide
```

Compare these IPs with your configured remote network CIDRs to ensure proper routing. For network configuration requirements, see [Prepare networking for hybrid nodes](hybrid-nodes-networking.md).

## Hybrid nodes CNI troubleshooting
<a name="hybrid-nodes-troubleshooting-cni"></a>

If you run into issues with initially starting Cilium or Calico on hybrid nodes, it is most often due to networking issues between the hybrid nodes (or the CNI pods running on them) and the EKS control plane. Make sure your environment meets the requirements in [Prepare networking for hybrid nodes](hybrid-nodes-networking.md). It’s useful to break down the problem into parts.

**EKS cluster configuration**  
Are the `RemoteNodeNetwork` and `RemotePodNetwork` configurations correct?

**VPC configuration**  
Are there routes for the `RemoteNodeNetwork` and `RemotePodNetwork` in the VPC routing table with the target of the Transit Gateway or Virtual Private Gateway?

**Security group configuration**  
Are there inbound and outbound rules for the `RemoteNodeNetwork` and `RemotePodNetwork`?

**On-premises network**  
Are there routes and access to and from the EKS control plane, and to and from the hybrid nodes and the pods running on hybrid nodes?

**CNI configuration**  
If you are using an overlay network, does the IP pool configuration for the CNI match the `RemotePodNetwork` configured for the EKS cluster? This is required if you are using webhooks.

 **Hybrid node has status `Ready` without a CNI installed** 

If your hybrid nodes are showing status `Ready`, but you have not installed a CNI on your cluster, it is possible that there are old CNI artifacts on your hybrid nodes. By default, when you uninstall Cilium and Calico with tools such as Helm, the on-disk resources are not removed from your physical or virtual machines. Additionally, the Custom Resource Definitions (CRDs) for these CNIs might still be present on your cluster from an old installation. For more information, see the Delete Cilium and Delete Calico sections of [Configure CNI for hybrid nodes](hybrid-nodes-cni.md).

 **Cilium troubleshooting** 

If you are having issues running Cilium on hybrid nodes, see [the troubleshooting steps](https://docs.cilium.io/en/stable/operations/troubleshooting/) in the Cilium documentation. The sections below cover issues that might be unique to deploying Cilium on hybrid nodes.

 **Cilium isn’t starting** 

If the Cilium agents that run on each hybrid node are not starting, check the logs of the Cilium agent pods for errors. The Cilium agent requires connectivity to the EKS Kubernetes API endpoint to start. Cilium agent startup will fail if this connectivity is not correctly configured. In this case, you will see log messages similar to the following in the Cilium agent pod logs.

```
msg="Unable to contact k8s api-server"
level=fatal msg="failed to start: Get \"https://<k8s-cluster-ip>:443/api/v1/namespaces/kube-system\": dial tcp <k8s-cluster-ip>:443: i/o timeout"
```

The Cilium agent runs on the host network. Your EKS cluster must be configured with `RemoteNodeNetwork` for the Cilium connectivity. Confirm you have an additional security group for your EKS cluster with an inbound rule for your `RemoteNodeNetwork`, that you have routes in your VPC for your `RemoteNodeNetwork`, and that your on-premises network is configured correctly to allow connectivity to the EKS control plane.

If the Cilium operator is running and some of your Cilium agents are running but not all, confirm that you have available pod IPs to allocate for all nodes in your cluster. You configure the size of your allocatable Pod CIDRs when using cluster pool IPAM with `clusterPoolIPv4PodCIDRList` in your Cilium configuration. The per-node CIDR size is configured with the `clusterPoolIPv4MaskSize` setting in your Cilium configuration. See [Expanding the cluster pool](https://docs.cilium.io/en/stable/network/concepts/ipam/cluster-pool/#expanding-the-cluster-pool) in the Cilium documentation for more information.
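
For reference, these settings live under the Cilium Helm chart's IPAM values. The CIDR and mask size below are hypothetical examples; they should match the remote pod network configured for your cluster:

```
ipam:
  mode: cluster-pool
  operator:
    clusterPoolIPv4PodCIDRList:
      - 10.86.0.0/16
    clusterPoolIPv4MaskSize: 25
```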

 **Cilium BGP is not working** 

If you are using Cilium BGP Control Plane to advertise your pod or service addresses to your on-premises network, you can use the following Cilium CLI commands to check if BGP is advertising the routes to your resources. For steps to install the Cilium CLI, see [Install the Cilium CLI](https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-the-cilium-cli) in the Cilium documentation.

If BGP is working correctly, you should see your hybrid nodes with Session State `established` in the output. You might need to work with your networking team to identify the correct values for your environment’s Local AS, Peer AS, and Peer Address.

```
cilium bgp peers
```

```
cilium bgp routes
```

If you are using Cilium BGP to advertise the IPs of Services with type `LoadBalancer`, you must have the same label on both your `CiliumLoadBalancerIPPool` and Service, and that label should be used in the selector of your `CiliumBGPAdvertisement`. An example is shown below. Note that if you are using Cilium BGP to advertise the IPs of Services with type `LoadBalancer`, the BGP routes might be disrupted during a Cilium agent restart. For more information, see [Failure Scenarios](https://docs.cilium.io/en/latest/network/bgp-control-plane/bgp-control-plane-operation/#failure-scenarios) in the Cilium documentation.

 **Service** 

```
kind: Service
apiVersion: v1
metadata:
  name: guestbook
  labels:
    app: guestbook
spec:
  ports:
  - port: 3000
    targetPort: http-server
  selector:
    app: guestbook
  type: LoadBalancer
```

 **CiliumLoadBalancerIPPool** 

```
apiVersion: cilium.io/v2alpha1
kind: CiliumLoadBalancerIPPool
metadata:
  name: guestbook-pool
  labels:
    app: guestbook
spec:
  blocks:
  - cidr: <CIDR to advertise>
  serviceSelector:
    matchExpressions:
      - { key: app, operator: In, values: [ guestbook ] }
```

 **CiliumBGPAdvertisement** 

```
apiVersion: cilium.io/v2alpha1
kind: CiliumBGPAdvertisement
metadata:
  name: bgp-advertisements-guestbook
  labels:
    advertise: bgp
spec:
  advertisements:
    - advertisementType: "Service"
      service:
        addresses:
          - ExternalIP
          - LoadBalancerIP
      selector:
        matchExpressions:
          - { key: app, operator: In, values: [ guestbook ] }
```

 **Calico troubleshooting** 

If you are having issues running Calico on hybrid nodes, see [the troubleshooting steps](https://docs.tigera.io/calico/latest/operations/troubleshoot/) in the Calico documentation. The sections below cover issues that might be unique to deploying Calico on hybrid nodes.

The table below summarizes the Calico components and whether they run on the node or pod network by default. If you configured Calico to use NAT for outgoing pod traffic, your on-premises network must be configured to route traffic to your on-premises node CIDR and your VPC routing tables must be configured with a route for your on-premises node CIDR with your transit gateway (TGW) or virtual private gateway (VGW) as the target. If you are not configuring Calico to use NAT for outgoing pod traffic, your on-premises network must be configured to route traffic to your on-premises pod CIDR and your VPC routing tables must be configured with a route for your on-premises pod CIDR with your transit gateway (TGW) or virtual private gateway (VGW) as the target.


| Component | Network | 
| --- | --- | 
|  Calico API server  |  Node  | 
|  Calico Controllers for Kubernetes  |  Pod  | 
|  Calico node agent  |  Node  | 
|  Calico `typha`   |  Node  | 
|  Calico CSI node driver  |  Pod  | 
|  Calico operator  |  Node  | 
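
The NAT behavior described above is controlled by the `natOutgoing` field of the Calico `IPPool` resource. A minimal sketch, assuming a hypothetical on-premises pod CIDR of `10.86.0.0/16`:

```
apiVersion: projectcalico.org/v3
kind: IPPool
metadata:
  name: onprem-pod-pool
spec:
  cidr: 10.86.0.0/16
  natOutgoing: true
```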

 **Calico resources are scheduled or running on cordoned nodes** 

The Calico resources that don’t run as a DaemonSet have permissive tolerations by default that allow them to be scheduled on cordoned nodes and on nodes that are not ready for scheduling or running pods. You can tighten the tolerations for the non-DaemonSet Calico resources by changing your operator installation to include the following.

```
installation:
  ...
  controlPlaneTolerations:
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  calicoKubeControllersDeployment:
    spec:
      template:
        spec:
          tolerations:
          - effect: NoExecute
            key: node.kubernetes.io/unreachable
            operator: Exists
            tolerationSeconds: 300
          - effect: NoExecute
            key: node.kubernetes.io/not-ready
            operator: Exists
            tolerationSeconds: 300
  typhaDeployment:
    spec:
      template:
        spec:
          tolerations:
          - effect: NoExecute
            key: node.kubernetes.io/unreachable
            operator: Exists
            tolerationSeconds: 300
          - effect: NoExecute
            key: node.kubernetes.io/not-ready
            operator: Exists
            tolerationSeconds: 300
```

## Credentials troubleshooting
<a name="hybrid-nodes-troubleshooting-creds"></a>

For both AWS SSM hybrid activations and AWS IAM Roles Anywhere, you can validate that credentials for the Hybrid Nodes IAM role are correctly configured on your hybrid nodes by running the following command from your hybrid nodes. Confirm the node name and Hybrid Nodes IAM Role name are what you expect.

```
sudo aws sts get-caller-identity
```

```
{
    "UserId": "ABCDEFGHIJKLM12345678910:<node-name>",
    "Account": "<aws-account-id>",
    "Arn": "arn:aws:sts::<aws-account-id>:assumed-role/<hybrid-nodes-iam-role/<node-name>"
}
```

 **AWS Systems Manager (SSM) troubleshooting** 

If you are using AWS SSM hybrid activations for your hybrid nodes credentials, be aware of the following SSM directories and artifacts that are installed on your hybrid nodes by `nodeadm`. For more information on the SSM agent, see [Working with the SSM agent](https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-agent.html) in the *AWS Systems Manager User Guide*.


| Description | Location | 
| --- | --- | 
|  SSM agent  |  Ubuntu - `/snap/amazon-ssm-agent/current/amazon-ssm-agent` RHEL & AL2023 - `/usr/bin/amazon-ssm-agent`   | 
|  SSM agent logs  |   `/var/log/amazon/ssm`   | 
|   AWS credentials  |   `/root/.aws/credentials`   | 
|  SSM Setup CLI  |   `/opt/ssm/ssm-setup-cli`   | 

 **Restarting the SSM agent** 

Some issues can be resolved by restarting the SSM agent. You can use the commands below to restart it.

 **AL2023 and other operating systems** 

```
systemctl restart amazon-ssm-agent
```

 **Ubuntu** 

```
systemctl restart snap.amazon-ssm-agent.amazon-ssm-agent
```

 **Check connectivity to SSM endpoints** 

Confirm you can connect to the SSM endpoints from your hybrid nodes. For a list of the SSM endpoints, see [AWS Systems Manager endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/ssm.html). Replace `us-west-2` in the command below with the AWS Region for your AWS SSM hybrid activation.

```
ping ssm.us-west-2.amazonaws.com
```

 **View connection status of registered SSM instances** 

You can check the connection status of the instances that are registered with SSM hybrid activations with the following AWS CLI command. Replace the machine ID with the machine ID of your instance.

```
aws ssm get-connection-status --target mi-012345678abcdefgh
```
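
If you don’t know your machine ID, the SSM agent typically records it at registration time in a file on the node; the path below is the common default and might vary by agent version:

```
sudo cat /var/lib/amazon/ssm/registration
```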

 **SSM Setup CLI checksum mismatch** 

When running `nodeadm install`, if you see an issue with the `ssm-setup-cli` checksum mismatch, confirm there are no older existing SSM installations on your host. If there are older SSM installations on your host, remove them and re-run `nodeadm install` to resolve the issue.

```
Failed to perform agent-installation/on-prem registration: error while verifying installed ssm-setup-cli checksum: checksum mismatch with latest ssm-setup-cli.
```

 **SSM `InvalidActivation`** 

If you see an error registering your instance with AWS SSM, confirm the `region`, `activationCode`, and `activationId` in your `nodeConfig.yaml` are correct. The AWS Region for your EKS cluster must match the region of your SSM hybrid activation. If these values are misconfigured, you might see an error similar to the following.

```
ERROR Registration failed due to error registering the instance with AWS SSM. InvalidActivation
```

 **SSM `ExpiredTokenException`: The security token included in the request is expired** 

If the SSM agent is not able to refresh credentials, you might see an `ExpiredTokenException`. In this scenario, if you are able to connect to the SSM endpoints from your hybrid nodes, you might need to restart the SSM agent to force a credential refresh.

```
"msg":"Command failed","error":"operation error SSM: DescribeInstanceInformation, https response error StatusCode: 400, RequestID: eee03a9e-f7cc-470a-9647-73d47e4cf0be, api error ExpiredTokenException: The security token included in the request is expired"
```

 **SSM error in running register machine command** 

If you see an error registering the machine with SSM, you might need to re-run `nodeadm install` to make sure all of the SSM dependencies are properly installed.

```
"error":"running register machine command: , error: fork/exec /opt/aws/ssm-setup-cli: no such file or directory"
```

 **SSM `ActivationExpired`** 

When running `nodeadm init`, if you see an error registering the instance with SSM due to an expired activation, you need to create a new SSM hybrid activation, update your `nodeConfig.yaml` with the `activationCode` and `activationId` of your new SSM hybrid activation, and re-run `nodeadm init`.

```
"msg":"Command failed","error":"SSM activation expired. Please use a valid activation"
```

```
ERROR Registration failed due to error registering the instance with AWS SSM. ActivationExpired
```

 **SSM failed to refresh cached credentials** 

If you see a failure to refresh cached credentials, the `/root/.aws/credentials` file might have been deleted on your host. First check your SSM hybrid activation and ensure it is active and your hybrid nodes are configured correctly to use the activation. Check the SSM agent logs at `/var/log/amazon/ssm` and re-run the `nodeadm init` command once you have resolved the issue on the SSM side.

```
"Command failed","error":"operation error SSM: DescribeInstanceInformation, get identity: get credentials: failed to refresh cached credentials"
```

 **Clean up SSM** 

To remove the SSM agent from your host, you can run the following commands.

```
sudo dnf remove -y amazon-ssm-agent
sudo apt remove --purge amazon-ssm-agent
sudo snap remove amazon-ssm-agent
sudo rm -rf /var/lib/amazon/ssm/Vault/Store/RegistrationKey
```

 **AWS IAM Roles Anywhere troubleshooting** 

If you are using AWS IAM Roles Anywhere for your hybrid nodes credentials, be aware of the following directories and artifacts that are installed on your hybrid nodes by `nodeadm`. For more information on troubleshooting IAM Roles Anywhere, see [Troubleshooting AWS IAM Roles Anywhere identity and access](https://docs.aws.amazon.com/rolesanywhere/latest/userguide/security_iam_troubleshoot.html) in the *AWS IAM Roles Anywhere User Guide*.


| Description | Location | 
| --- | --- | 
|  IAM Roles Anywhere CLI  |   `/usr/local/bin/aws_signing_helper`   | 
|  Default certificate location and name  |   `/etc/iam/pki/server.pem`   | 
|  Default key location and name  |   `/etc/iam/pki/server.key`   | 

 **IAM Roles Anywhere failed to refresh cached credentials** 

If you see a failure to refresh cached credentials, review the contents of `/etc/aws/hybrid/config` and confirm that IAM Roles Anywhere was configured correctly in your `nodeadm` configuration. Confirm that `/etc/iam/pki` exists. Each node must have a unique certificate and key. By default, when using IAM Roles Anywhere as the credential provider, `nodeadm` uses `/etc/iam/pki/server.pem` for the certificate location and name, and `/etc/iam/pki/server.key` for the private key. You might need to create the directories before placing the certificates and keys in the directories with `sudo mkdir -p /etc/iam/pki`. You can verify the content of your certificate with the command below.

```
openssl x509 -text -noout -in /etc/iam/pki/server.pem
```

```
open /etc/iam/pki/server.pem: no such file or directory
could not parse PEM data
Command failed {"error": "... get identity: get credentials: failed to refresh cached credentials, process provider error: error in credential_process: exit status 1"}
```

 **IAM Roles Anywhere not authorized to perform `sts:AssumeRole`** 

In the `kubelet` logs, if you see an access denied error for the `sts:AssumeRole` operation when using IAM Roles Anywhere, check the trust policy of your Hybrid Nodes IAM role to confirm that the IAM Roles Anywhere service principal is allowed to assume the role. Additionally, confirm that the trust anchor ARN is configured properly in the trust policy and that your Hybrid Nodes IAM role is added to your IAM Roles Anywhere profile.

```
could not get token: AccessDenied: User: ... is not authorized to perform: sts:AssumeRole on resource: ...
```
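You can inspect the role's trust policy with the AWS CLI to confirm it allows the `rolesanywhere.amazonaws.com` service principal and references the correct trust anchor ARN; a sketch, assuming `AmazonEKSHybridNodesRole` is your role name:

```
# Print the trust policy of the Hybrid Nodes IAM role
aws iam get-role \
    --role-name AmazonEKSHybridNodesRole \
    --query 'Role.AssumeRolePolicyDocument'
```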

 **IAM Roles Anywhere not authorized to set `roleSessionName`** 

In the `kubelet` logs, if you see an access denied issue for setting the `roleSessionName`, confirm you have set `acceptRoleSessionName` to true for your IAM Roles Anywhere profile.

```
AccessDeniedException: Not authorized to set roleSessionName
```
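The profile setting can be updated with the AWS CLI; a sketch, assuming your profile ID is in `$PROFILE_ID` and your AWS CLI version supports the `--accept-role-session-name` flag:

```
# Allow sessions created through this profile to set a custom role session name
aws rolesanywhere update-profile \
    --profile-id "$PROFILE_ID" \
    --accept-role-session-name
```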

## Operating system troubleshooting
<a name="hybrid-nodes-troubleshooting-os"></a>

### RHEL
<a name="_rhel"></a>

 **Entitlement or subscription manager registration failures** 

If you are running `nodeadm install` and encounter a failure to install the hybrid nodes dependencies due to entitlement registration issues, ensure you have properly set your Red Hat username and password on your host.

```
This system is not registered with an entitlement server
```
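You can register the host with Red Hat Subscription Management using `subscription-manager`; a sketch with placeholder credentials (replace with your own Red Hat account values):

```
# Register the host with your Red Hat account so entitled repositories are available
sudo subscription-manager register \
    --username my-rhel-username \
    --password 'my-rhel-password'

# Verify the registration status
sudo subscription-manager status
```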

### Ubuntu
<a name="_ubuntu"></a>

 **GLIBC not found** 

If you are using Ubuntu for your operating system and IAM Roles Anywhere for your credential provider with hybrid nodes and see an issue with GLIBC not found, you can install that dependency manually to resolve the issue.

```
GLIBC_2.32 not found (required by /usr/local/bin/aws_signing_helper)
```

Run the following commands to install the dependency:

```
ldd --version
sudo apt update && sudo apt install -y libc6
sudo apt install -y glibc-source
```

### Bottlerocket
<a name="_bottlerocket"></a>

If you have the Bottlerocket admin container enabled, you can access it with SSH for advanced debugging and troubleshooting with elevated privileges. The commands in the following sections need to be run in the context of the Bottlerocket host. From the admin container, you can run `sheltie` to get a full root shell on the Bottlerocket host.

```
sheltie
```

You can also run the commands in the following sections from the admin container shell by prefixing each command with `sudo chroot /.bottlerocket/rootfs`.

```
sudo chroot /.bottlerocket/rootfs <command>
```

 **Using logdog for log collection** 

Bottlerocket provides the `logdog` utility to efficiently collect logs and system information for troubleshooting purposes.

```
logdog
```

The `logdog` utility gathers logs from various locations on a Bottlerocket host and combines them into a tarball. By default, the tarball is created at `/var/log/support/bottlerocket-logs.tar.gz` and is accessible from host containers at `/.bottlerocket/support/bottlerocket-logs.tar.gz`.
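To copy the support bundle off the node for analysis, you can use `scp` from your workstation; a sketch assuming SSH access to the admin container as `ec2-user` (adjust the user and address for your environment):

```
# Copy the logdog tarball from the Bottlerocket admin container to your workstation
scp ec2-user@node-ip-address:/.bottlerocket/support/bottlerocket-logs.tar.gz .
```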

 **Accessing system logs with journalctl** 

You can check the status of system services such as `kubelet` and `containerd`, and view their logs, with the following commands. The `-f` flag follows the logs in real time.

For checking `kubelet` service status and retrieving `kubelet` logs, you can run:

```
systemctl status kubelet
journalctl -u kubelet -f
```

For checking `containerd` service status and retrieving the logs for the orchestrated `containerd` instance, you can run:

```
systemctl status containerd
journalctl -u containerd -f
```

For checking `host-containerd` service status and retrieving the logs for the host `containerd` instance, you can run:

```
systemctl status host-containerd
journalctl -u host-containerd -f
```

For retrieving the logs for the bootstrap containers and host containers, you can run:

```
journalctl _COMM=host-ctr -f
```