Configure a CNI for hybrid nodes
Cilium and Calico are supported as the Container Network Interfaces (CNIs) for Amazon EKS Hybrid Nodes. You must install a CNI for hybrid nodes to become ready to serve workloads. Hybrid nodes appear with status `Not Ready` until a CNI is running. You can manage these CNIs with your choice of tooling, such as Helm. The Amazon VPC CNI is not compatible with hybrid nodes, and the VPC CNI is configured with anti-affinity for the `eks.amazonaws.com/compute-type: hybrid` label.
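For example, you can list your hybrid nodes by the label above and watch their status change from `Not Ready` to `Ready` once a CNI is running (a minimal check, assuming your kubeconfig points at the cluster):

```bash
# Hybrid nodes carry the compute-type label; they report NotReady until a CNI is installed.
kubectl get nodes -l eks.amazonaws.com/compute-type=hybrid
```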
Version compatibility
The table below lists the Cilium and Calico versions that are compatible and validated for each Kubernetes version supported in Amazon EKS.
Kubernetes version | Cilium version | Calico version |
---|---|---|
1.31 | 1.16.x | 3.29.x |
1.30 | 1.16.x | 3.29.x |
1.29 | 1.16.x | 3.29.x |
1.28 | 1.16.x | 3.29.x |
1.27 | 1.16.x | 3.29.x |
1.26 | 1.16.x | 3.29.x |
1.25 | 1.16.x | 3.29.x |
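To see which chart versions are available before you pick one from the table, you can query the Helm repositories used later in this topic (this assumes the repositories have been added as shown in the install steps below):

```bash
# List available Cilium chart versions
helm search repo cilium/cilium --versions | head

# List available Calico (tigera-operator) chart versions
helm search repo projectcalico/tigera-operator --versions | head
```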
Supported capabilities
AWS supports the following capabilities of Cilium and Calico for use with hybrid nodes. If you plan to use functionality outside the scope of AWS support, we recommend that you obtain commercial support for the plugin or have the in-house expertise to troubleshoot and contribute fixes to the CNI plugin project.
Feature | Cilium | Calico |
---|---|---|
Kubernetes network conformance | Yes | Yes |
Control plane to node connectivity | Yes | Yes |
Control plane to pod connectivity | Yes | Yes |
Lifecycle Management | Install, Upgrade, Delete | Install, Upgrade, Delete |
Networking Mode | VXLAN | VXLAN |
IP Address Management (IPAM) | Cluster Scope (Cilium IPAM) | Calico IPAM |
IP family | IPv4 | IPv4 |
BGP | Yes (Cilium Control Plane) | Yes |
Install Cilium on hybrid nodes
- Ensure that you have installed the Helm CLI in your command-line environment. See the Helm documentation for installation instructions.
- Add the Cilium Helm repository.

  ```bash
  helm repo add cilium https://helm.cilium.io/
  ```
- Create a YAML file called `cilium-values.yaml`. If you configured at least one remote pod network, configure the same pod CIDRs for your `clusterPoolIPv4PodCIDRList`. You shouldn't change your `clusterPoolIPv4PodCIDRList` after deploying Cilium on your cluster. You can configure `clusterPoolIPv4MaskSize` based on your required pods per node; see Expanding the cluster pool in the Cilium documentation. For a full list of Helm values for Cilium, see the Helm reference in the Cilium documentation. The following example configures all of the Cilium components to run only on the hybrid nodes, since they have the `eks.amazonaws.com/compute-type: hybrid` label.

  By default, Cilium masquerades the source IP address of all pod traffic leaving the cluster to the IP address of the node. This makes it possible for Cilium to run with Amazon EKS clusters that have remote pod networks configured and with clusters that don't have remote pod networks configured. If you disable masquerading for your Cilium deployment, you must configure your Amazon EKS cluster with your remote pod networks and you must advertise your pod addresses with your on-premises network. If you are running webhooks on your hybrid nodes, you must configure your cluster with your remote pod networks and you must advertise your pod addresses with your on-premises network.

  A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Cilium, you must set `bgpControlPlane.enabled: true`. For more information on Cilium's BGP support, see Cilium BGP Control Plane in the Cilium documentation.

  ```yaml
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: eks.amazonaws.com/compute-type
                operator: In
                values:
                  - hybrid
  ipam:
    mode: cluster-pool
    operator:
      clusterPoolIPv4MaskSize: 25
      clusterPoolIPv4PodCIDRList:
        - POD_CIDR
  operator:
    unmanagedPodWatcher:
      restart: false
  ```
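  Optionally, you can dry-render the chart with your values file before installing to catch indentation or schema mistakes (a sketch; `CILIUM_VERSION` is the version you choose in the next step):

  ```bash
  # Render the manifests locally without installing anything; errors surface
  # problems in cilium-values.yaml.
  helm template cilium cilium/cilium \
      --version CILIUM_VERSION \
      --namespace kube-system \
      --values cilium-values.yaml > /dev/null
  ```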
- Install Cilium on your cluster. Replace `CILIUM_VERSION` with your desired Cilium version. It is recommended to run the latest patch version for your Cilium minor version. You can find the latest patch release for a given minor Cilium release in the Stable Releases section of the Cilium documentation. If you are enabling BGP for your deployment, add the `--set bgpControlPlane.enabled=true` flag in the command below. If you are using a specific kubeconfig file, use the `--kubeconfig` flag with the Helm install command.

  ```bash
  helm install cilium cilium/cilium \
      --version CILIUM_VERSION \
      --namespace kube-system \
      --values cilium-values.yaml
  ```
- You can confirm your Cilium installation was successful with the following commands. You should see the `cilium-operator` deployment and the `cilium-agent` running on each of your hybrid nodes. Additionally, your hybrid nodes should now have status `Ready`. For information on how to configure BGP for Cilium, proceed to the next step.

  ```bash
  kubectl get pods -n kube-system
  ```

  ```
  NAME                              READY   STATUS    RESTARTS   AGE
  cilium-jjjn8                      1/1     Running   0          11m
  cilium-operator-d4f4d7fcb-sc5xn   1/1     Running   0          11m
  ```

  ```bash
  kubectl get nodes
  ```

  ```
  NAME                   STATUS   ROLES    AGE   VERSION
  mi-04a2cf999b7112233   Ready    <none>   19m   v1.31.0-eks-a737599
  ```
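  If you have the Cilium CLI installed (it is also used for the BGP check later in this topic), `cilium status` gives a consolidated health summary. This is optional and assumes the CLI is installed separately from the steps above.

  ```bash
  # Optional: summarize the health of the Cilium agents and operator.
  cilium status
  ```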
- To use BGP with Cilium to advertise your pod addresses with your on-premises network, you must have installed Cilium with `bgpControlPlane.enabled: true`. To configure BGP in Cilium, first create a file called `cilium-bgp-cluster.yaml` with a `CiliumBGPClusterConfig` that has `peerAddress` set to the IP of the on-premises router you are peering with. Configure the `localASN` and `peerASN` based on your on-premises router configuration.

  ```yaml
  apiVersion: cilium.io/v2alpha1
  kind: CiliumBGPClusterConfig
  metadata:
    name: cilium-bgp
  spec:
    nodeSelector:
      matchExpressions:
        - key: eks.amazonaws.com/compute-type
          operator: In
          values:
            - hybrid
    bgpInstances:
      - name: "rack0"
        localASN: ONPREM_ROUTER_ASN
        peers:
          - name: "onprem-router"
            peerASN: PEER_ASN
            peerAddress: ONPREM_ROUTER_IP
            peerConfigRef:
              name: "cilium-peer"
  ```
- Apply the Cilium BGP cluster configuration to your cluster.

  ```bash
  kubectl apply -f cilium-bgp-cluster.yaml
  ```
- The `CiliumBGPPeerConfig` resource is used to define a BGP peer configuration. Multiple peers can share the same configuration and reference the common `CiliumBGPPeerConfig` resource. Create a file named `cilium-bgp-peer.yaml` to configure the peer configuration for your on-premises network. See BGP Peer Configuration in the Cilium documentation for a full list of configuration options.

  ```yaml
  apiVersion: cilium.io/v2alpha1
  kind: CiliumBGPPeerConfig
  metadata:
    name: cilium-peer
  spec:
    timers:
      holdTimeSeconds: 30
      keepAliveTimeSeconds: 10
    gracefulRestart:
      enabled: true
      restartTimeSeconds: 120
    families:
      - afi: ipv4
        safi: unicast
        advertisements:
          matchLabels:
            advertise: "bgp"
  ```
- Apply the Cilium BGP peer configuration to your cluster.

  ```bash
  kubectl apply -f cilium-bgp-peer.yaml
  ```
- The `CiliumBGPAdvertisement` resource is used to define various advertisement types and the attributes associated with them. Create a file named `cilium-bgp-advertisement.yaml` and configure the `CiliumBGPAdvertisement` resource with your desired settings.

  ```yaml
  apiVersion: cilium.io/v2alpha1
  kind: CiliumBGPAdvertisement
  metadata:
    name: bgp-advertisements
    labels:
      advertise: bgp
  spec:
    advertisements:
      - advertisementType: "PodCIDR"
      - advertisementType: "Service"
        service:
          addresses:
            - ClusterIP
            - ExternalIP
            - LoadBalancerIP
  ```
- Apply the Cilium BGP advertisement configuration to your cluster.

  ```bash
  kubectl apply -f cilium-bgp-advertisement.yaml
  ```
  You can confirm that the BGP peering worked with the Cilium CLI by using the `cilium bgp peers` command. You should see the correct values for your environment in the output and the Session State as `established`. See the Troubleshooting and Operations Guide in the Cilium documentation for more information on troubleshooting.
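  For example, once the peering is configured, the check looks like this (this assumes the Cilium CLI is installed and your kubeconfig points at the cluster):

  ```bash
  # Lists each node's BGP peers and the current session state;
  # a healthy peering shows Session State "established".
  cilium bgp peers
  ```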
Upgrade Cilium on hybrid nodes
Before upgrading your Cilium deployment, carefully review the Cilium upgrade documentation.
- Ensure that you have installed the `helm` CLI in your command-line environment. See the Helm documentation for installation instructions.
- Add the Cilium Helm repository.

  ```bash
  helm repo add cilium https://helm.cilium.io/
  ```
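  If the Cilium repository was already added during the original install, refresh your local chart index so the target version is available (a small optional step):

  ```bash
  # Refresh the locally cached chart indexes for all configured Helm repositories.
  helm repo update
  ```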
- Run the Cilium upgrade pre-flight check. Replace `CILIUM_VERSION` with your target Cilium version. It is recommended to run the latest patch version for your Cilium minor version. You can find the latest patch release for a given minor Cilium release in the Stable Releases section of the Cilium documentation.

  ```bash
  helm install cilium-preflight cilium/cilium --version CILIUM_VERSION \
      --namespace=kube-system \
      --set preflight.enabled=true \
      --set agent=false \
      --set operator.enabled=false
  ```
- After deploying the pre-flight check, ensure that the number of READY pods is the same as the number of Cilium pods running.

  ```bash
  kubectl get ds -n kube-system | sed -n '1p;/cilium/p'
  ```

  ```
  NAME                      DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
  cilium                    2         2         2       2            2           <none>          1h20m
  cilium-pre-flight-check   2         2         2       2            2           <none>          7m15s
  ```
- Once the number of READY pods is equal, make sure the Cilium pre-flight deployment is also marked as READY 1/1. If it shows READY 0/1, consult the CNP Validation section and resolve issues with the deployment before continuing with the upgrade.

  ```bash
  kubectl get deployment -n kube-system cilium-pre-flight-check -w
  ```

  ```
  NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
  cilium-pre-flight-check   1/1     1            0           12s
  ```
- Delete the pre-flight check.

  ```bash
  helm uninstall cilium-preflight --namespace kube-system
  ```
- During normal cluster operations, all Cilium components should run the same version. The following steps describe how to upgrade all of the components from one stable release to a later stable release. When upgrading from one minor release to another minor release, it is recommended to first upgrade to the latest patch release for the existing Cilium minor version. To minimize disruption, set the `upgradeCompatibility` option to the initial Cilium version that was installed in this cluster.

  Before running the `helm upgrade` command, preserve the values for your deployment in a `cilium-values.yaml` file or use `--set` command-line options for your settings. The upgrade operation overwrites the Cilium ConfigMap, so it is critical that your configuration values are passed when you upgrade. If you are using BGP, it is recommended to use the `--set bgpControlPlane.enabled=true` command-line option instead of supplying this information in your values file.

  ```bash
  helm upgrade cilium cilium/cilium --version CILIUM_VERSION \
      --namespace kube-system \
      --set upgradeCompatibility=1.X \
      -f cilium-values.yaml
  ```
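  If you no longer have the original `cilium-values.yaml`, one way to reconstruct it is to export the user-supplied values from the running release (a sketch; this assumes the release is named `cilium` in `kube-system`, as in the install steps above):

  ```bash
  # Write the user-supplied values of the current release to a file that can
  # be passed back with -f during the upgrade.
  helm get values cilium --namespace kube-system -o yaml > cilium-values.yaml
  ```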
- (Optional) If you need to roll back your upgrade due to issues, run the following commands.

  ```bash
  helm history cilium --namespace kube-system
  helm rollback cilium [REVISION] --namespace kube-system
  ```
Delete Cilium from hybrid nodes
- Run the following command to uninstall all Cilium components from your cluster. Note that uninstalling the CNI may impact the health of nodes and pods and shouldn't be performed on production clusters.

  ```bash
  helm uninstall cilium --namespace kube-system
  ```

  The interfaces and routes configured by Cilium are not removed by default when the CNI is removed from the cluster; see the GitHub issue for more information.
To clean up the on-disk configuration files and resources, if you are using the standard configuration directories, you can remove the files as shown by the
cni-uninstall.sh
scriptin the Cilium repository on GitHub. -
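  As a rough sketch of that cleanup, the files typically live in the default CNI directories shown below; the exact names can vary by Cilium version and configuration, so verify them on your nodes (or use the script above) before deleting anything:

  ```bash
  # Run on each hybrid node. Paths and file names are assumptions based on the
  # default CNI directories; confirm them before removing.
  sudo rm -f /etc/cni/net.d/05-cilium.conflist
  sudo rm -f /opt/cni/bin/cilium-cni
  ```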
- To remove the Cilium Custom Resource Definitions (CRDs) from your cluster, you can run the following command.

  ```bash
  kubectl get crds -oname | grep "cilium" | xargs kubectl delete
  ```
Install Calico on hybrid nodes
- Ensure that you have installed the Helm CLI in your command-line environment. See the Helm documentation for installation instructions.
- Add the Calico Helm repository.

  ```bash
  helm repo add projectcalico https://docs.tigera.io/calico/charts
  ```
- Create a YAML file called `calico-values.yaml` that configures Calico with affinity to run on hybrid nodes. For more information on the different Calico networking modes, see Determining the best networking option in the Calico documentation.
- Replace `POD_CIDR` with the CIDR ranges for your pods. If you configured your Amazon EKS cluster with remote pod networks, the `POD_CIDR` that you specify for Calico should be the same as the remote pod networks. For example, `10.100.0.0/24`.
- Replace `CIDR_SIZE` with the size of the CIDR segment you wish to allocate to each node. For example, `25` for a /25 segment size. For more information on the CIDR `blockSize` and changing the `blockSize`, see Change IP pool block size in the Calico documentation.
- In the example below, `natOutgoing` is enabled and `bgp` is disabled. In this configuration, Calico can run on Amazon EKS clusters that have remote pod networks configured and on clusters that do not. If you have `natOutgoing` set to disabled, you must configure your cluster with your remote pod networks and your on-premises network must be able to properly route traffic destined for your pod CIDRs. A common way to advertise pod addresses with your on-premises network is by using BGP. To use BGP with Calico, you must enable `bgp`. The example below configures all of the Calico components to run only on the hybrid nodes, since they have the `eks.amazonaws.com/compute-type: hybrid` label. If you are running webhooks on your hybrid nodes, you must configure your cluster with your remote pod networks and you must advertise your pod addresses with your on-premises network. The example below configures `controlPlaneReplicas: 1`; increase the value if you have multiple hybrid nodes and want to run the Calico control plane components in a highly available fashion.

  ```yaml
  installation:
    enabled: true
    cni:
      type: Calico
      ipam:
        type: Calico
    calicoNetwork:
      bgp: Disabled
      ipPools:
        - cidr: POD_CIDR
          blockSize: CIDR_SIZE
          encapsulation: VXLAN
          natOutgoing: Enabled
          nodeSelector: eks.amazonaws.com/compute-type == "hybrid"
    controlPlaneReplicas: 1
    controlPlaneNodeSelector:
      eks.amazonaws.com/compute-type: hybrid
    calicoNodeDaemonSet:
      spec:
        template:
          spec:
            nodeSelector:
              eks.amazonaws.com/compute-type: hybrid
    csiNodeDriverDaemonSet:
      spec:
        template:
          spec:
            nodeSelector:
              eks.amazonaws.com/compute-type: hybrid
    calicoKubeControllersDeployment:
      spec:
        template:
          spec:
            nodeSelector:
              eks.amazonaws.com/compute-type: hybrid
    typhaDeployment:
      spec:
        template:
          spec:
            nodeSelector:
              eks.amazonaws.com/compute-type: hybrid
  ```
- Install Calico on your cluster. Replace `CALICO_VERSION` with your desired Calico version (for example, 3.29.0); see the Calico releases to find the latest patch release for your Calico minor version. It is recommended to run the latest patch version for the Calico minor version. If you are using a specific kubeconfig file, use the `--kubeconfig` flag.

  ```bash
  helm install calico projectcalico/tigera-operator \
      --version CALICO_VERSION \
      --namespace kube-system \
      -f calico-values.yaml
  ```
- You can confirm your Calico installation was successful with the following commands. You should see the `tigera-operator` deployment, the `calico-node` agent running on each of your hybrid nodes, as well as the `calico-apiserver`, `csi-node-driver`, and `calico-kube-controllers` deployed. Additionally, your hybrid nodes should now have status `Ready`. If you are using `natOutgoing: Disabled`, all of the Calico components will not be able to start successfully until you advertise your pod addresses with your on-premises network. For information on how to configure BGP for Calico, proceed to the next step.

  ```bash
  kubectl get pods -A
  ```

  ```
  NAMESPACE          NAME                                       READY   STATUS    RESTARTS   AGE
  calico-apiserver   calico-apiserver-6c77bb6d46-2n8mq          1/1     Running   0          69s
  calico-system      calico-kube-controllers-7c5f8556b5-7h267   1/1     Running   0          68s
  calico-system      calico-node-s5nnk                          1/1     Running   0          68s
  calico-system      calico-typha-6487cc9d8c-wc5jm              1/1     Running   0          69s
  calico-system      csi-node-driver-cv42d                      2/2     Running   0          68s
  kube-system        coredns-7bb495d866-2lc9v                   1/1     Running   0          6m27s
  kube-system        coredns-7bb495d866-2t8ln                   1/1     Running   0          157m
  kube-system        kube-proxy-lxzxh                           1/1     Running   0          18m
  kube-system        tigera-operator-f8bc97d4c-28b4d            1/1     Running   0          90s
  ```

  ```bash
  kubectl get nodes
  ```

  ```
  NAME                   STATUS   ROLES    AGE     VERSION
  mi-0c6ec2f6f79176565   Ready    <none>   5h13m   v1.31.0-eks-a737599
  ```
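  You can also ask the Tigera operator for a component-level summary (an optional check; the `tigerastatus` resources are created by the operator installed above):

  ```bash
  # Shows whether the apiserver, calico, and other components report Available.
  kubectl get tigerastatus
  ```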
- If you installed Calico without BGP, skip this step. To configure BGP, create a file called `calico-bgp.yaml` with a `BGPPeer` configuration and a `BGPConfiguration`. It is important to distinguish `BGPPeer` and `BGPConfiguration`. The `BGPPeer` is the BGP-enabled router or remote resource with which the nodes in a Calico cluster will peer. The `asNumber` in the `BGPPeer` configuration is similar to the Cilium setting `peerASN`. The `BGPConfiguration` is applied to each Calico node, and the `asNumber` for the `BGPConfiguration` is equivalent to the Cilium setting `localASN`. Replace `ONPREM_ROUTER_IP`, `ONPREM_ROUTER_ASN`, and `LOCAL_ASN` in the example below with the values for your on-premises environment. The `keepOriginalNextHop: true` setting is used to ensure each node advertises only the pod network CIDR that it owns.

  ```yaml
  apiVersion: projectcalico.org/v3
  kind: BGPPeer
  metadata:
    name: calico-hybrid-nodes
  spec:
    peerIP: ONPREM_ROUTER_IP
    asNumber: ONPREM_ROUTER_ASN
    keepOriginalNextHop: true
  ---
  apiVersion: projectcalico.org/v3
  kind: BGPConfiguration
  metadata:
    name: default
  spec:
    nodeToNodeMeshEnabled: false
    asNumber: LOCAL_ASN
  ```
- Apply the file to your cluster.

  ```bash
  kubectl apply -f calico-bgp.yaml
  ```
- Confirm the Calico pods are running with the following command.

  ```bash
  kubectl get pods -n calico-system -w
  ```

  ```
  NAMESPACE          NAME                                       READY   STATUS    RESTARTS       AGE
  calico-apiserver   calico-apiserver-598bf99b6c-2vltk          1/1     Running   0              3h24m
  calico-system      calico-kube-controllers-75f84bbfd6-zwmnx   1/1     Running   31 (59m ago)   3h20m
  calico-system      calico-node-9b2pg                          1/1     Running   0              5h17m
  calico-system      calico-typha-7d55c76584-kxtnq              1/1     Running   0              5h18m
  calico-system      csi-node-driver-dmnmm                      2/2     Running   0              5h18m
  kube-system        coredns-7bb495d866-dtn4z                   1/1     Running   0              6h23m
  kube-system        coredns-7bb495d866-mk7j4                   1/1     Running   0              6h19m
  kube-system        kube-proxy-vms28                           1/1     Running   0              6h12m
  kube-system        tigera-operator-55f9d9d565-jj9bg           1/1     Running   0              73m
  ```
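  To verify the BGP sessions from a node, you can use `calicoctl`, which is not installed by the steps above (this assumes you install it separately on the node):

  ```bash
  # Run on a hybrid node; prints the BGP peers and whether each session is Established.
  sudo calicoctl node status
  ```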
If you encounter issues during these steps, see the troubleshooting guidance.
Upgrade Calico on hybrid nodes
Before upgrading your Calico deployment, carefully review the Calico upgrade documentation.
- Apply the operator manifest for the version of Calico you are upgrading to. Replace `CALICO_VERSION` with the version you are upgrading to, for example `v3.29.0`. Make sure to prepend the `v` to the major.minor.patch version.

  ```bash
  kubectl apply --server-side --force-conflicts \
      -f https://raw.githubusercontent.com/projectcalico/calico/CALICO_VERSION/manifests/operator-crds.yaml
  ```
- Run `helm upgrade` to upgrade your Calico deployment. Replace `CALICO_VERSION` with the version you are upgrading to, for example `v3.29.0`. Create the `calico-values.yaml` file from the configuration values that you used to install Calico.

  ```bash
  helm upgrade calico projectcalico/tigera-operator \
      --version CALICO_VERSION \
      --namespace kube-system \
      -f calico-values.yaml
  ```
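  If you didn't keep the original file, you can export the user-supplied values from the running release (a sketch; this assumes the release is named `calico` in `kube-system`, as in the install steps above):

  ```bash
  # Write the user-supplied values of the existing Calico release to a file
  # that can be passed back with -f during the upgrade.
  helm get values calico --namespace kube-system -o yaml > calico-values.yaml
  ```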
Delete Calico from hybrid nodes
- Run the following command to uninstall Calico components from your cluster. Note that uninstalling the CNI may impact the health of nodes and pods and should not be performed on production clusters. If you installed Calico in a namespace other than `kube-system`, change the namespace in the command below.

  ```bash
  helm uninstall calico --namespace kube-system
  ```

  Note that the interfaces and routes configured by Calico are not removed by default when the CNI is removed from the cluster.
- To clean up the on-disk configuration files and resources, remove the Calico files from the `/opt/cni` and `/etc/cni` directories.
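  A minimal sketch of that cleanup, assuming the default file names written by Calico (verify them on your nodes before deleting):

  ```bash
  # Run on each hybrid node. File names can vary by Calico version.
  sudo rm -f /etc/cni/net.d/10-calico.conflist /etc/cni/net.d/calico-kubeconfig
  sudo rm -f /opt/cni/bin/calico /opt/cni/bin/calico-ipam
  ```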
- To remove the Calico CRDs from your cluster, run the following commands.

  ```bash
  kubectl get crds -oname | grep "calico" | xargs kubectl delete
  kubectl get crds -oname | grep "tigera" | xargs kubectl delete
  ```