

# Getting started with AWS Batch on Amazon EKS
<a name="getting-started-eks"></a>

AWS Batch on Amazon EKS is a managed service for scheduling and scaling batch workloads into existing Amazon EKS clusters. AWS Batch doesn't create, administer, or perform lifecycle operations of your Amazon EKS clusters on your behalf. AWS Batch orchestration scales the nodes managed by AWS Batch up and down and runs pods on those nodes.

AWS Batch doesn't touch nodes, auto scaling node groups, or pod lifecycles that aren't associated with AWS Batch compute environments within your Amazon EKS cluster. For AWS Batch to operate effectively, its [service-linked role](using-service-linked-roles.md) needs Kubernetes role-based access control (RBAC) permissions in your existing Amazon EKS cluster. For more information, see [Using RBAC Authorization](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) in the *Kubernetes documentation*.

AWS Batch requires a Kubernetes namespace into which it can scope the pods that it runs as AWS Batch jobs. We recommend a dedicated namespace to isolate the AWS Batch pods from your other cluster workloads.

After AWS Batch has been given RBAC access and a namespace has been established, you can associate that Amazon EKS cluster with an AWS Batch compute environment using the [CreateComputeEnvironment](https://docs.aws.amazon.com/batch/latest/APIReference/API_CreateComputeEnvironment.html) API operation. A job queue can be associated with this new Amazon EKS compute environment. AWS Batch jobs are submitted to the job queue based on an Amazon EKS job definition using the [SubmitJob](https://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html) API operation. AWS Batch then launches AWS Batch managed nodes and places jobs from the job queue as Kubernetes pods into the Amazon EKS cluster associated with the AWS Batch compute environment.

The following sections cover how to get set up for AWS Batch on Amazon EKS.

**Contents**
+ [Overview](#getting-started-eks-context)
+ [Prerequisites](#getting-started-eks-prerequisites)
+ [Step 1: Create your Amazon EKS cluster for AWS Batch](#getting-started-eks-step-0)
+ [Step 2: Prepare your Amazon EKS cluster for AWS Batch](#getting-started-eks-step-1)
+ [Step 3: Create an Amazon EKS compute environment](#getting-started-eks-step-2)
+ [Step 4: Create a job queue and attach the compute environment](#getting-started-eks-step-3)
+ [Step 5: Create a job definition](#getting-started-eks-step-4)
+ [Step 6: Submit a job](#getting-started-eks-step-5)
+ [Step 7: View the Job's output](#getting-started-eks-step-7)
+ [Step 8: (Optional) Submit a job with overrides](#getting-started-eks-step-6)
+ [Step 9: Clean up your tutorial resources](#getting-started-eks-step-8)
+ [Additional resources](#getting-started-eks-additional-resources)

## Overview
<a name="getting-started-eks-context"></a>

This tutorial demonstrates how to set up AWS Batch with Amazon EKS using the AWS CLI, `kubectl`, and `eksctl`.

**Intended Audience**  
This tutorial is designed for system administrators and developers responsible for setting up, testing, and deploying AWS Batch.

**Features Used**  
This tutorial shows you how to use the AWS CLI to:  
+ Create and configure an Amazon EKS compute environment
+ Create a job queue
+ Create a job definition
+ Create and submit a job to run
+ Submit a job with overrides

**Time Required**  
It should take about 30–40 minutes to complete this tutorial.

**Regional Restrictions**  
There are no country or regional restrictions associated with using this solution.

**Resource Usage Costs**  
There's no charge for creating an AWS account. However, by implementing this solution, you might incur some or all of the costs that are listed in the following table.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/batch/latest/userguide/getting-started-eks.html)

## Prerequisites
<a name="getting-started-eks-prerequisites"></a>

Before starting this tutorial, you must install and configure the following tools and resources that you need to create and manage both AWS Batch and Amazon EKS resources.
+ **AWS CLI** – A command line tool for working with AWS services, including Amazon EKS. This guide requires that you use version 2.8.6 or later or 1.26.0 or later. For more information, see [Installing, updating, and uninstalling the AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html) in the *AWS Command Line Interface User Guide*. After installing the AWS CLI, we recommend that you also configure it. For more information, see [Quick configuration with `aws configure`](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-quickstart.html#cli-configure-quickstart-config) in the *AWS Command Line Interface User Guide*.
+ **`kubectl`** – A command line tool for working with Kubernetes clusters. This guide requires that you use version `1.23` or later. For more information, see [Installing or updating `kubectl`](https://docs.aws.amazon.com/eks/latest/userguide/install-kubectl.html) in the *Amazon EKS User Guide*.
+ **`eksctl`** – A command line tool for working with Amazon EKS clusters that automates many individual tasks. This guide requires that you use version `0.115.0` or later. For more information, see [Installing or updating `eksctl`](https://docs.aws.amazon.com/eks/latest/userguide/eksctl.html) in the *Amazon EKS User Guide*.
+ **Required IAM permissions** – The IAM security principal that you're using must have permissions to work with Amazon EKS IAM roles and service-linked roles, CloudFormation, and a VPC and related resources. For more information, see [Actions, resources, and condition keys for Amazon Elastic Kubernetes Service](https://docs.aws.amazon.com/service-authorization/latest/reference/list_amazonelastickubernetesservice.html) and [Using service-linked roles](https://docs.aws.amazon.com/IAM/latest/UserGuide/using-service-linked-roles.html) in the *IAM User Guide*. You must complete all steps in this guide as the same user.
+ **Permissions** – Users calling the [CreateComputeEnvironment](https://docs.aws.amazon.com/batch/latest/APIReference/API_CreateComputeEnvironment.html) API operation to create a compute environment that uses Amazon EKS resources require permissions to the `eks:DescribeCluster` API operation. 
+ **AWS account number** – You need to know your AWS account ID. Follow the directions in [Viewing your AWS account ID](https://docs.aws.amazon.com/IAM/latest/UserGuide/console-account-id.html).
+ **(Optional) CloudWatch** – To examine the details of [(Optional) Submit a job with overrides](#getting-started-eks-step-6), logging must be configured. For more information, see [Use CloudWatch Logs to monitor AWS Batch on Amazon EKS jobs](batch-eks-cloudwatch-logs.md).
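
The minimum versions listed above can be compared against the versions that your tools report. The following is a minimal sketch and not part of any AWS tool: the function name `version_at_least` is hypothetical, and it assumes a `sort` command that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper: succeeds when ACTUAL is the same as or newer than REQUIRED.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_at_least() {
  required="$1"
  actual="$2"
  # After a version sort, the first line is the older of the two versions.
  oldest=$(printf '%s\n%s\n' "$required" "$actual" | sort -V | head -n 1)
  [ "$oldest" = "$required" ]
}

# Example with literal version strings; substitute the versions your tools report.
if version_at_least "0.115.0" "0.120.0"; then
  echo "eksctl version is new enough"
else
  echo "eksctl version is too old"
fi
```

The same function can be reused for the AWS CLI and `kubectl` minimums by passing the required version first and the installed version second.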

## Step 1: Create your Amazon EKS cluster for AWS Batch
<a name="getting-started-eks-step-0"></a>

**Important**  
To get started as simply and quickly as possible, this tutorial includes steps with default settings. Before creating resources for production use, we recommend that you familiarize yourself with all settings and deploy with the settings that meet your requirements.

After you install the prerequisites, create your cluster using `eksctl`. Creating the cluster can take 10–15 minutes.

```
$  eksctl create cluster --name my-cluster-name --region region-code
```

In the preceding command, replace the following:
+ *my-cluster-name* with the name you want to use for your cluster.
+ *region-code* with the AWS Region to create the cluster in, for example `us-west-2`.

You need the cluster name and Region later in this tutorial.
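
Because the cluster name and Region are reused in later steps, it can help to keep them in shell variables and to confirm that the cluster is active before continuing. The following is a minimal sketch: the variable names and the `cluster_status` function are hypothetical, and it assumes that the AWS CLI is configured with credentials for the account that owns the cluster.

```shell
# Hypothetical variable names; replace the values with your own.
CLUSTER_NAME="my-cluster-name"
REGION_CODE="us-west-2"

# Print the cluster status that Amazon EKS reports (for example, ACTIVE).
cluster_status() {
  aws eks describe-cluster \
    --name "$1" \
    --region "$2" \
    --query cluster.status \
    --output text
}

# Usage (requires AWS credentials, so it is left commented out here):
# cluster_status "$CLUSTER_NAME" "$REGION_CODE"
```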

## Step 2: Prepare your Amazon EKS cluster for AWS Batch
<a name="getting-started-eks-step-1"></a>

All steps are required.

1. **Create a dedicated namespace for AWS Batch jobs**

   Use `kubectl` to create a new namespace.

   ```
   $ namespace=my-aws-batch-namespace
   ```

   ```
   $ cat - <<EOF | kubectl create -f -
   {
     "apiVersion": "v1",
     "kind": "Namespace",
     "metadata": {
       "name": "${namespace}",
       "labels": {
         "name": "${namespace}"
       }
     }
   }
   EOF
   ```

   Output:

   ```
   namespace/my-aws-batch-namespace created
   ```

1. **Enable access via role-based access control (RBAC)**

   Use `kubectl` to create a Kubernetes cluster role that allows AWS Batch to watch nodes and pods, and bind the role. You must do this once for each Amazon EKS cluster.

   ```
   $ cat - <<EOF | kubectl apply -f -
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRole
   metadata:
     name: aws-batch-cluster-role
   rules:
     - apiGroups: [""]
       resources: ["namespaces"]
       verbs: ["get"]
     - apiGroups: [""]
       resources: ["nodes"]
       verbs: ["get", "list", "watch"]
     - apiGroups: [""]
       resources: ["pods"]
       verbs: ["get", "list", "watch"]
     - apiGroups: [""]
       resources: ["events"]
       verbs: ["list"]
     - apiGroups: [""]
       resources: ["configmaps"]
       verbs: ["get", "list", "watch"]
     - apiGroups: ["apps"]
       resources: ["daemonsets", "deployments", "statefulsets", "replicasets"]
       verbs: ["get", "list", "watch"]
     - apiGroups: ["rbac.authorization.k8s.io"]
       resources: ["clusterroles", "clusterrolebindings"]
       verbs: ["get", "list"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: ClusterRoleBinding
   metadata:
     name: aws-batch-cluster-role-binding
   subjects:
   - kind: User
     name: aws-batch
     apiGroup: rbac.authorization.k8s.io
   roleRef:
     kind: ClusterRole
     name: aws-batch-cluster-role
     apiGroup: rbac.authorization.k8s.io
   EOF
   ```

   Output:

   ```
   clusterrole.rbac.authorization.k8s.io/aws-batch-cluster-role created
   clusterrolebinding.rbac.authorization.k8s.io/aws-batch-cluster-role-binding created
   ```

1. Create a namespace-scoped Kubernetes role that allows AWS Batch to manage the lifecycle of pods, and bind it. You must do this once for each unique namespace.

   ```
   $ namespace=my-aws-batch-namespace
   ```

   ```
   $ cat - <<EOF | kubectl apply -f - --namespace "${namespace}"
   apiVersion: rbac.authorization.k8s.io/v1
   kind: Role
   metadata:
     name: aws-batch-compute-environment-role
     namespace: ${namespace}
   rules:
     - apiGroups: [""]
       resources: ["pods"]
       verbs: ["create", "get", "list", "watch", "delete", "patch"]
     - apiGroups: [""]
       resources: ["serviceaccounts"]
       verbs: ["get", "list"]
     - apiGroups: ["rbac.authorization.k8s.io"]
       resources: ["roles", "rolebindings"]
       verbs: ["get", "list"]
   ---
   apiVersion: rbac.authorization.k8s.io/v1
   kind: RoleBinding
   metadata:
     name: aws-batch-compute-environment-role-binding
     namespace: ${namespace}
   subjects:
   - kind: User
     name: aws-batch
     apiGroup: rbac.authorization.k8s.io
   roleRef:
     kind: Role
     name: aws-batch-compute-environment-role
     apiGroup: rbac.authorization.k8s.io
   EOF
   ```

   Output:

   ```
   role.rbac.authorization.k8s.io/aws-batch-compute-environment-role created
   rolebinding.rbac.authorization.k8s.io/aws-batch-compute-environment-role-binding created
   ```

1. Update the Kubernetes `aws-auth` configuration map to map the preceding RBAC permissions to the AWS Batch service-linked role.

   In the following command, replace *<your-account-number>* with your AWS account number.

   ```
   $ eksctl create iamidentitymapping \
       --cluster my-cluster-name \
       --arn "arn:aws:iam::<your-account-number>:role/AWSServiceRoleForBatch" \
       --username aws-batch
   ```

   Output:

   ```
   2022-10-25 20:19:57 [ℹ]  adding identity "arn:aws:iam::<your-account-number>:role/AWSServiceRoleForBatch" to auth ConfigMap
   ```
**Note**  
The path `aws-service-role/batch.amazonaws.com/` has been removed from the ARN of the service-linked role. This is because of an issue with the `aws-auth` configuration map. For more information, see [Roles with paths don't work when the path is included in their ARN in the aws-auth configmap](https://github.com/kubernetes-sigs/aws-iam-authenticator/issues/268).
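
The path removal described in the note can be expressed as a small string operation. The following is a minimal sketch: the function name `strip_role_path` is hypothetical, and `111122223333` is a placeholder account ID.

```shell
# Hypothetical helper: remove the path from an IAM role ARN so that only
# "role/<role-name>" remains, which is the form used in the identity mapping.
strip_role_path() {
  arn="$1"
  prefix="${arn%%:role/*}"   # everything before ":role/"
  role_name="${arn##*/}"     # everything after the last "/"
  printf '%s:role/%s\n' "$prefix" "$role_name"
}

# 111122223333 is a placeholder account ID.
strip_role_path "arn:aws:iam::111122223333:role/aws-service-role/batch.amazonaws.com/AWSServiceRoleForBatch"
```

To review the resulting mapping afterward, `eksctl get iamidentitymapping --cluster my-cluster-name` lists the entries in the `aws-auth` configuration map.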

## Step 3: Create an Amazon EKS compute environment
<a name="getting-started-eks-step-2"></a>

AWS Batch compute environments define compute resource parameters to meet your batch workload needs. In a managed compute environment, AWS Batch helps you to manage the capacity and instance types of the compute resources (Kubernetes nodes) within your Amazon EKS cluster. This is based on the compute resource specification that you define when you create the compute environment. You can use EC2 On-Demand Instances or EC2 Spot Instances.

Now that the **AWSServiceRoleForBatch** service-linked role has access to your Amazon EKS cluster, you can create AWS Batch resources. First, create a compute environment that points to your Amazon EKS cluster.
+ For `subnets`, run `eksctl get cluster my-cluster-name` to get the subnets used by the cluster.
+ For the `securityGroupIds` parameter, you can use the same security group as the Amazon EKS cluster. The following command retrieves the security group ID for the cluster.

  ```
  $ aws eks describe-cluster \
      --name my-cluster-name \
      --query cluster.resourcesVpcConfig.clusterSecurityGroupId
  ```
+ The `instanceRole` is created when you create the cluster. To find the `instanceRole`, list all entities that use the `AmazonEKSWorkerNodePolicy` policy:

  ```
  $  aws iam list-entities-for-policy --policy-arn arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
  ```

  The name of the policy role contains the name of the cluster that you created, for example `eksctl-my-cluster-name-nodegroup-example`.

  To find the `instanceRole` ARN, run the following command:

  ```
  $ aws iam list-instance-profiles-for-role --role-name eksctl-my-cluster-name-nodegroup-example
  ```

  Output:

  ```
  INSTANCEPROFILES        arn:aws:iam::<your-account-number>:instance-profile/eks-04cb2200-94b9-c297-8dbe-87f12example
  ```

  For more information, see [Creating the Amazon EKS node IAM role](https://docs.aws.amazon.com/eks/latest/userguide/create-node-role.html#create-worker-node-role) and [Enabling IAM principal access to your cluster](https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html) in the *Amazon EKS User Guide*. If you're using pod networking, see [Configuring the Amazon VPC CNI plugin for Kubernetes to use IAM roles for service accounts](https://docs.aws.amazon.com/eks/latest/userguide/cni-iam-role.html) in the *Amazon EKS User Guide*.
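
The network lookups in the preceding list can be wrapped in functions so that their results are easy to capture in variables. The following is a minimal sketch: the function names are hypothetical, and both functions assume a configured AWS CLI and take the cluster name as their only argument.

```shell
# Hypothetical helpers that wrap the lookups described above.

# Print the cluster security group ID.
cluster_security_group() {
  aws eks describe-cluster \
    --name "$1" \
    --query cluster.resourcesVpcConfig.clusterSecurityGroupId \
    --output text
}

# Print the subnet IDs that the cluster uses, separated by whitespace.
cluster_subnets() {
  aws eks describe-cluster \
    --name "$1" \
    --query cluster.resourcesVpcConfig.subnetIds \
    --output text
}

# Usage (requires AWS credentials, so it is left commented out here):
# sg_id=$(cluster_security_group my-cluster-name)
# subnet_ids=$(cluster_subnets my-cluster-name)
```

Not every subnet that is returned is necessarily suitable. As the placeholder in the compute environment file indicates, the subnets that you choose need internet access so that nodes can pull images.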

```
$ cat <<EOF > ./batch-eks-compute-environment.json
{
  "computeEnvironmentName": "My-Eks-CE1",
  "type": "MANAGED",
  "state": "ENABLED",
  "eksConfiguration": {
    "eksClusterArn": "arn:aws:eks:region-code:your-account-number:cluster/my-cluster-name",
    "kubernetesNamespace": "my-aws-batch-namespace"
  },
  "computeResources": {
    "type": "EC2",
    "allocationStrategy": "BEST_FIT_PROGRESSIVE",
    "minvCpus": 0,
    "maxvCpus": 128,
    "instanceTypes": [
        "m5"
    ],
    "subnets": [
        "<eks-cluster-subnets-with-access-to-internet-for-image-pull>"
    ],
    "securityGroupIds": [
        "<eks-cluster-sg>"
    ],
    "instanceRole": "<eks-instance-profile>"
  }
}
EOF
```

```
$ aws batch create-compute-environment --cli-input-json file://./batch-eks-compute-environment.json
```

**Notes**
+ Maintenance of an Amazon EKS compute environment is a shared responsibility. For more information, see [Shared responsibility of the Kubernetes nodes](eks-ce-shared-responsibility.md).
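
The `eksClusterArn` value in the compute environment file follows a fixed pattern, so it can be assembled from the Region, account ID, and cluster name. The following is a minimal sketch: the function name `eks_cluster_arn` is hypothetical, `111122223333` is a placeholder account ID, and the `aws` partition is assumed.

```shell
# Hypothetical helper: assemble the Amazon EKS cluster ARN in the format used
# by the eksClusterArn field. Assumes the standard "aws" partition.
eks_cluster_arn() {
  region="$1"
  account_id="$2"
  cluster_name="$3"
  printf 'arn:aws:eks:%s:%s:cluster/%s\n' "$region" "$account_id" "$cluster_name"
}

# 111122223333 is a placeholder account ID.
eks_cluster_arn "us-west-2" "111122223333" "my-cluster-name"
```

If you don't have the account ID at hand, `aws sts get-caller-identity --query Account --output text` prints it for the credentials in use.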

## Step 4: Create a job queue and attach the compute environment
<a name="getting-started-eks-step-3"></a>

**Important**  
It's important to confirm that the compute environment is healthy before proceeding. The [DescribeComputeEnvironments](https://docs.aws.amazon.com/batch/latest/APIReference/API_DescribeComputeEnvironments.html) API operation can be used to do this.  

```
$ aws batch describe-compute-environments --compute-environments My-Eks-CE1
```
Confirm that the `status` parameter is not `INVALID`. If it is, look at the `statusReason` parameter for the cause. For more information, see [Troubleshooting AWS Batch](troubleshooting.md).
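
The status check can be repeated until the compute environment settles. The following is a minimal sketch of such a polling loop: the function name `wait_for_compute_environment` and its default attempt count and delay are assumptions, not AWS recommendations.

```shell
# Hypothetical helper: poll a compute environment until its status is VALID.
# Arguments: name, maximum attempts (default 30), seconds between attempts (default 10).
wait_for_compute_environment() {
  ce_name="$1"
  attempts="${2:-30}"
  delay="${3:-10}"
  ce_status=""
  i=0
  while [ "$i" -lt "$attempts" ]; do
    ce_status=$(aws batch describe-compute-environments \
      --compute-environments "$ce_name" \
      --query 'computeEnvironments[0].status' \
      --output text)
    case "$ce_status" in
      VALID) echo "VALID"; return 0 ;;
      INVALID) echo "INVALID" >&2; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out; last status: $ce_status" >&2
  return 1
}

# Usage (requires AWS credentials, so it is left commented out here):
# wait_for_compute_environment My-Eks-CE1
```

The function returns a nonzero exit code for `INVALID` or a timeout, so it can gate the job queue creation in a script.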

Jobs submitted to this new job queue are run as pods on AWS Batch managed nodes that joined the Amazon EKS cluster that's associated with your compute environment.

```
$ cat <<EOF > ./batch-eks-job-queue.json
{
  "jobQueueName": "My-Eks-JQ1",
  "priority": 10,
  "computeEnvironmentOrder": [
    {
      "order": 1,
      "computeEnvironment": "My-Eks-CE1"
    }
  ]
}
EOF
```

```
$ aws batch create-job-queue --cli-input-json file://./batch-eks-job-queue.json
```

## Step 5: Create a job definition
<a name="getting-started-eks-step-4"></a>

The following job definition instructs the pod to sleep for 60 seconds.

```
$ cat <<EOF > ./batch-eks-job-definition.json
{
  "jobDefinitionName": "MyJobOnEks_Sleep",
  "type": "container",
  "eksProperties": {
    "podProperties": {
      "hostNetwork": true,
      "containers": [
        {
          "image": "public.ecr.aws/amazonlinux/amazonlinux:2",
          "command": [
            "sleep",
            "60"
          ],
          "resources": {
            "limits": {
              "cpu": "1",
              "memory": "1024Mi"
            }
          }
        }
      ],
      "metadata": {
        "labels": {
          "environment": "test"
        }
      }
    }
  }
}
EOF
```

```
$ aws batch register-job-definition --cli-input-json file://./batch-eks-job-definition.json
```

**Notes**
+ There are considerations for the `cpu` and `memory` parameters. For more information, see [Memory and vCPU considerations for AWS Batch on Amazon EKS](memory-cpu-batch-eks.md).

## Step 6: Submit a job
<a name="getting-started-eks-step-5"></a>

Run the following AWS CLI command to submit a new job.

```
$ aws batch submit-job --job-queue My-Eks-JQ1 \
    --job-definition MyJobOnEks_Sleep --job-name My-Eks-Job1
```

To check the status of a job:

```
$ aws batch describe-jobs --jobs <jobId-from-submit-response>
```

**Notes**
+ For more information about running jobs on Amazon EKS resources, see [Amazon EKS jobs](eks-jobs.md).
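
The submit and status commands can be combined so that a script waits for the job to finish. The following is a minimal sketch: the function name `wait_for_job` and its default attempt count and delay are assumptions, and it treats only `SUCCEEDED` and `FAILED` as final states.

```shell
# Hypothetical helper: poll a job until it reaches SUCCEEDED or FAILED.
# Arguments: job ID, maximum attempts (default 60), seconds between attempts (default 10).
wait_for_job() {
  job_id="$1"
  attempts="${2:-60}"
  delay="${3:-10}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    job_status=$(aws batch describe-jobs \
      --jobs "$job_id" \
      --query 'jobs[0].status' \
      --output text)
    echo "status: $job_status"
    case "$job_status" in
      SUCCEEDED) return 0 ;;
      FAILED) return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (requires AWS credentials, so it is left commented out here):
# job_id=$(aws batch submit-job --job-queue My-Eks-JQ1 \
#   --job-definition MyJobOnEks_Sleep --job-name My-Eks-Job1 \
#   --query jobId --output text)
# wait_for_job "$job_id"
```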

## Step 7: View the Job's output
<a name="getting-started-eks-step-7"></a>

To view the Job's output, do the following:

1. Open the AWS Batch console at [https://console.aws.amazon.com/batch/](https://console.aws.amazon.com/batch/).

1. In the navigation pane choose **Jobs**. 

1. In the **Job queue** drop down choose the Job queue you created for the tutorial.

1. The **Jobs** table lists all of your jobs and their current status. After the job's **Status** is **Succeeded**, choose the **Name** of the job, *My-Eks-Job1*, to view the job's details.

1. In the **Details** pane the **Started at** and **Stopped at** times should be one minute apart. 
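
The same check can be made from the command line, because the `DescribeJobs` response reports `startedAt` and `stoppedAt` as Unix timestamps in milliseconds. The following is a minimal sketch: the function name `elapsed_seconds` is hypothetical and the timestamps in the example are made up.

```shell
# Hypothetical helper: seconds between two Unix epoch timestamps in milliseconds.
elapsed_seconds() {
  started_ms="$1"
  stopped_ms="$2"
  echo $(( (stopped_ms - started_ms) / 1000 ))
}

# Example with made-up timestamps that are 60 seconds apart.
elapsed_seconds 1666729200000 1666729260000
```

For a real job, the two values can be read with `aws batch describe-jobs --jobs <jobId-from-submit-response> --query 'jobs[0].[startedAt,stoppedAt]' --output text`.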

## Step 8: (Optional) Submit a job with overrides
<a name="getting-started-eks-step-6"></a>

This job overrides the command passed to the container. AWS Batch aggressively cleans up the pods after the jobs complete to reduce the load on Kubernetes. To examine the details of a job, logging must be configured. For more information, see [Use CloudWatch Logs to monitor AWS Batch on Amazon EKS jobs](batch-eks-cloudwatch-logs.md).

```
$ cat <<EOF > ./submit-job-override.json
{
  "jobName": "EksWithOverrides",
  "jobQueue": "My-Eks-JQ1",
  "jobDefinition": "MyJobOnEks_Sleep",
  "eksPropertiesOverride": {
    "podProperties": {
      "containers": [
        {
          "command": [
            "/bin/sh"
          ],
          "args": [
            "-c",
            "echo hello world"
          ]
        }
      ]
    }
  }
}
EOF
```

```
$ aws batch submit-job --cli-input-json file://./submit-job-override.json
```

**Notes**
+ For improved visibility into the details of the operations, enable Amazon EKS control plane logging. For more information, see [Amazon EKS control plane logging](https://docs.aws.amazon.com/eks/latest/userguide/control-plane-logs.html) in the *Amazon EKS User Guide*.
+ DaemonSet and kubelet overhead affects available vCPU and memory resources, specifically scaling and job placement. For more information, see [Memory and vCPU considerations for AWS Batch on Amazon EKS](memory-cpu-batch-eks.md).
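
To try the override with a message of your own, the override document can be generated from a parameter. The following is a minimal sketch: the function name `override_with_message` is hypothetical, and it does no JSON escaping, so the message must not contain double quotes or backslashes.

```shell
# Hypothetical helper: print an override document whose container echoes the
# given message. No JSON escaping is done, so avoid double quotes and
# backslashes in the message.
override_with_message() {
  cat <<EOF
{
  "podProperties": {
    "containers": [
      {
        "command": ["/bin/sh"],
        "args": ["-c", "echo $1"]
      }
    ]
  }
}
EOF
}

override_with_message "hello from overrides"
```

The printed document has the same shape as the `eksPropertiesOverride` value in the preceding file, so it can be passed to `aws batch submit-job` through the `--eks-properties-override` option together with `--job-name`, `--job-queue`, and `--job-definition`.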

To view the Job's output, do the following:

1. Open the AWS Batch console at [https://console.aws.amazon.com/batch/](https://console.aws.amazon.com/batch/).

1. In the navigation pane choose **Jobs**. 

1. In the **Job queue** drop down choose the Job queue you created for the tutorial.

1. The **Jobs** table lists all of your Jobs and what their current status is. Once the Job's **Status** is **Succeeded** choose the **Name** of the Job to view the Job's details. 

1. In the **Details** pane choose **Log stream name**. The CloudWatch console for the Job will open and there should be one event with the **Message** of `hello world` or your custom message.

## Step 9: Clean up your tutorial resources
<a name="getting-started-eks-step-8"></a>

You are charged for the Amazon EC2 instance while it is enabled. You can delete the instance to stop incurring charges.

To delete the resources you created, do the following:

1. Open the AWS Batch console at [https://console.aws.amazon.com/batch/](https://console.aws.amazon.com/batch/).

1. In the navigation pane choose **Job queue**. 

1. In the **Job queue** table choose the Job queue you created for the tutorial.

1. Choose **Disable**. Once the Job queue **State** is Disabled you can choose **Delete**.

1. Once the Job queue is deleted, in the navigation pane choose **Compute environments**.

1. Choose the compute environment you created for this tutorial and then choose **Disable**. It may take 1–2 minutes for the compute environment to be disabled.

1. Once the compute environment’s **State** is Disabled, choose **Delete**. It may take 1–2 minutes for the compute environment to be deleted.
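
The console steps above have AWS CLI equivalents. The following is a minimal sketch: the function name `cleanup_batch_tutorial` and the fixed pause between calls are assumptions. The pause is a simplification, because a resource that is still updating can cause the next call to fail, in which case the function stops and can be run again.

```shell
# Hypothetical helper: disable and delete the tutorial's job queue and compute
# environment, in that order. Stops at the first command that fails.
# Arguments: job queue name, compute environment name, pause in seconds (default 60).
cleanup_batch_tutorial() {
  queue="$1"
  compute_env="$2"
  pause="${3:-60}"
  aws batch update-job-queue --job-queue "$queue" --state DISABLED &&
    sleep "$pause" &&
    aws batch delete-job-queue --job-queue "$queue" &&
    sleep "$pause" &&
    aws batch update-compute-environment --compute-environment "$compute_env" --state DISABLED &&
    sleep "$pause" &&
    aws batch delete-compute-environment --compute-environment "$compute_env"
}

# Usage (requires AWS credentials, so it is left commented out here):
# cleanup_batch_tutorial My-Eks-JQ1 My-Eks-CE1
```

These steps don't remove the Amazon EKS cluster from Step 1. If you created it only for this tutorial, `eksctl delete cluster --name my-cluster-name --region region-code` deletes it.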

## Additional resources
<a name="getting-started-eks-additional-resources"></a>

After you complete the tutorial, you might want to explore the following topics:
+ Learn more about the [Best practices](best-practices.md).
+ Explore the AWS Batch core components. For more information, see [Components of AWS Batch](batch_components.md).
+ Learn more about the different [Compute Environments](compute_environments.md) available in AWS Batch.
+ Learn more about [Job queues](job_queues.md) and their different scheduling options.
+ Learn more about [Job definitions](job_definitions.md) and the different configuration options.
+ Learn more about the different types of [Jobs](jobs.md).