Setting up spark-submit for Amazon EMR on EKS
Complete the following tasks before you run an application with spark-submit on Amazon EMR on EKS. If you've already signed up for Amazon Web Services (AWS) and have used Amazon EKS, you are almost ready to use Amazon EMR on EKS. If you've already completed a prerequisite, skip it and move on to the next one.
- Install or update to the latest version of the AWS CLI – If you've already installed the AWS CLI, confirm that you have the latest version.
- Set up kubectl and eksctl – kubectl is the command line tool for working with Kubernetes clusters, and eksctl is a command line tool for creating and managing clusters on Amazon EKS.
- Get started with Amazon EKS – eksctl – Follow the steps to create a new Kubernetes cluster with nodes in Amazon EKS.
- Select an Amazon EMR base image URI (release 6.10.0 or higher) – The spark-submit command is supported with Amazon EMR releases 6.10.0 and higher.
- Confirm that the driver service account has appropriate permissions to create and watch executor pods. For more information, see Verify Spark driver service account security requirements for spark-submit.
- Set up your local AWS credentials profile.
- From the Amazon EKS console, choose your EKS cluster, then find the cluster endpoint under Overview, Details, API server endpoint.
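
Two of the prerequisites above can be checked with a short script: the release requirement for the base image, and the driver service account's ability to create and watch pods. The following is a minimal sketch, not an official tool. It assumes that the release label appears in the image URI in the form emr-X.Y.Z, and it only builds the `kubectl auth can-i` commands rather than running them. The account ID, namespace, and service account names are hypothetical placeholders.

```python
import re

# spark-submit is supported with Amazon EMR releases 6.10.0 and higher.
MIN_RELEASE = (6, 10, 0)


def release_from_image_uri(image_uri):
    """Return the (major, minor, patch) release found in the URI, or None.

    Assumes the release label is embedded as "emr-X.Y.Z".
    """
    match = re.search(r"emr-(\d+)\.(\d+)\.(\d+)", image_uri)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def supports_spark_submit(image_uri):
    """Return True if the release in the URI is 6.10.0 or higher.

    Releases are compared as integer tuples, so 6.10.0 ranks above 6.9.0.
    """
    release = release_from_image_uri(image_uri)
    return release is not None and release >= MIN_RELEASE


def can_i_commands(namespace, service_account):
    """Build kubectl checks for creating and watching pods.

    Each command impersonates the driver service account with --as.
    """
    subject = f"system:serviceaccount:{namespace}:{service_account}"
    return [
        ["kubectl", "auth", "can-i", verb, "pods",
         "--namespace", namespace, "--as", subject]
        for verb in ("create", "watch")
    ]


if __name__ == "__main__":
    # Hypothetical values; replace them with your own.
    uri = "123456789012.dkr.ecr.us-west-2.amazonaws.com/spark/emr-6.10.0:latest"
    print("Supports spark-submit:", supports_spark_submit(uri))
    for command in can_i_commands("spark-ns", "spark-driver-sa"):
        print(" ".join(command))
```

Run the printed commands in a shell that is configured for your cluster. Each one answers yes or no. Impersonating a service account with `--as` requires that your own identity is allowed to impersonate it, so a failure here may reflect your permissions rather than those of the driver.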