Links to Amazon EMR on EKS best practices guides on GitHub
We've built the Amazon EMR on EKS Best Practices Guide using open source community collaboration so
that we can iterate quickly and provide recommendations for creating and running virtual clusters. We
recommend that you use the Amazon EMR on EKS Best Practices Guide for the topics listed in the
following sections. Choose the links in each section to go to the GitHub
site.
Security
Encryption best practices: how to use encryption for data at rest and in
transit.
Managing network security: how to configure security groups for
pods for Amazon EMR on EKS when you connect to data sources that are hosted in AWS services
like Amazon RDS and Amazon Redshift.
Using AWS Secrets Manager to store secrets.
PySpark job submission
PySpark job submission: describes the different ways to package PySpark
applications, using formats like zip, egg, wheel, and pex.
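As a rough illustration of the zip packaging format mentioned above, the following sketch bundles the Python modules under a source directory into a single archive using only the standard library. The function name `build_py_files_zip` and the directory layout are hypothetical and are not taken from the guide; treat this as one possible approach under those assumptions.

```python
import zipfile
from pathlib import Path
from typing import List


def build_py_files_zip(src_dir: str, zip_path: str) -> List[str]:
    """Zip every .py file under src_dir, keeping package-relative paths.

    Hypothetical helper for illustration; returns the archive member names.
    """
    src = Path(src_dir)
    added = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # Sort for a deterministic archive layout.
        for path in sorted(src.rglob("*.py")):
            # Store paths relative to src_dir so imports resolve as packages.
            arcname = path.relative_to(src).as_posix()
            zf.write(path, arcname)
            added.append(arcname)
    return added
```

An archive built this way is the kind of file that Spark's `--py-files` option accepts. Where you upload it and how you reference it in a job run depends on your setup, and the linked guide covers the other formats (egg, wheel, pex).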
Storage
Using EBS volumes: how to use static and dynamic provisioning for jobs
that need EBS volumes.
Using Amazon FSx for Lustre volumes: how to use static and dynamic provisioning
for jobs that need Amazon FSx for Lustre volumes.
Using instance store volumes: how to use instance store volumes for job
processing.
Using Hive metastore: offers different ways to use Hive metastore.
Using AWS Glue: offers different ways to configure the AWS Glue Data Catalog.
Debugging
Using Spark debugging: how to change the log level.
Connecting to Spark UI on the driver pod.
How to use a self-hosted Spark History Server with Amazon EMR on EKS.
Troubleshooting Amazon EMR on EKS issues
Troubleshooting.
Node placement
Using Kubernetes node selectors for single-AZ and other use cases.
Using Fargate node placement.
Using Dynamic Resource Allocation (DRA).
EKS best practices for the Amazon VPC Container Network Interface (CNI) plugin,
Cluster Autoscaler, and CoreDNS.
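To make the node selector and dynamic allocation topics above more concrete, here is a minimal sketch that assembles the relevant Spark properties into a `--conf` string. It relies on Spark's `spark.kubernetes.node.selector.[labelKey]` property, the well-known Kubernetes label `topology.kubernetes.io/zone`, and the standard `spark.dynamicAllocation.*` settings. The helper names, the zone value, and the executor counts are illustrative assumptions, not values recommended by the guide.

```python
from typing import Dict


def node_placement_conf(zone: str, min_executors: int, max_executors: int) -> Dict[str, str]:
    """Build Spark properties for single-AZ placement with dynamic allocation.

    Hypothetical helper; the property names are standard Spark settings.
    """
    if min_executors > max_executors:
        raise ValueError("min_executors must not exceed max_executors")
    return {
        # Schedule driver and executor pods only on nodes in the given zone.
        "spark.kubernetes.node.selector.topology.kubernetes.io/zone": zone,
        # Let Spark scale executors up and down with the workload.
        "spark.dynamicAllocation.enabled": "true",
        # Track shuffle data on executors, since there is no external shuffle service here.
        "spark.dynamicAllocation.shuffleTracking.enabled": "true",
        "spark.dynamicAllocation.minExecutors": str(min_executors),
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
    }


def to_submit_parameters(conf: Dict[str, str]) -> str:
    """Render properties as a space-separated list of --conf flags."""
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


# Example with an assumed zone name and executor bounds.
params = to_submit_parameters(node_placement_conf("us-east-1a", 1, 10))
```

A string like `params` could be supplied as the Spark submit parameters of a job run. Whether pinning to a single zone is appropriate depends on your availability and data transfer trade-offs, which the linked guide discusses.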
Cost optimization
Using Spot Instances: Amazon EC2 Spot Instance best practices and how to use the
Spark node decommission feature.
Using AWS Outposts
Running Amazon EMR on EKS using AWS Outposts