Amazon EMR on EKS 6.10.0 releases

The following Amazon EMR 6.10.0 releases are available for Amazon EMR on EKS. Select a specific emr-6.10.0-XXXX release to view more details such as the related container image tag.

Release notes for Amazon EMR 6.10.0

  • Supported applications ‐ AWS SDK for Java 1.12.397, Spark 3.3.1-amzn-0, Hudi 0.12.2-amzn-0, Iceberg 1.1.0-amzn-0, Delta 2.2.0.

  • Supported components ‐ aws-sagemaker-spark-sdk, emr-ddb, emr-goodies, emr-s3-select, emrfs, hadoop-client, hudi, hudi-spark, iceberg, spark-kubernetes.

  • Supported configuration classifications:

    For use with StartJobRun and CreateManagedEndpoint APIs:

    Classification     Description
    core-site          Change values in Hadoop’s core-site.xml file.
    emrfs-site         Change EMRFS settings.
    spark-metrics      Change values in Spark's metrics.properties file.
    spark-defaults     Change values in Spark's spark-defaults.conf file.
    spark-env          Change values in the Spark environment.
    spark-hive-site    Change values in Spark's hive-site.xml file.
    spark-log4j        Change values in Spark's log4j.properties file.

    For use specifically with CreateManagedEndpoint APIs:

    Classification              Description
    jeg-config                  Change values in the Jupyter Enterprise Gateway jupyter_enterprise_gateway_config.py file.
    jupyter-kernel-overrides    Change the value for the kernel image in the Jupyter kernel spec file.

    Configuration classifications allow you to customize applications. These often correspond to a configuration file for the application, such as hive-site.xml for the spark-hive-site classification. For more information, see Configure Applications.
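
As a minimal sketch of how a classification is passed to StartJobRun, the following Python builds a request body with a spark-defaults override. The helper name build_start_job_run_request is hypothetical, and the job name, virtual cluster ID, role ARN, S3 path, emr-6.10.0-latest release label, and Spark property are placeholders or assumptions, not values from this page. The code only constructs and prints the request; it does not call AWS.

```python
import json

# Classifications listed above for StartJobRun in emr-6.10.0.
START_JOB_RUN_CLASSIFICATIONS = {
    "core-site", "emrfs-site", "spark-metrics", "spark-defaults",
    "spark-env", "spark-hive-site", "spark-log4j",
}


def build_start_job_run_request(overrides):
    """Hypothetical helper: build a StartJobRun request body.

    `overrides` maps a classification name to a dict of property values.
    """
    unknown = set(overrides) - START_JOB_RUN_CLASSIFICATIONS
    if unknown:
        raise ValueError(f"Unsupported classification(s): {sorted(unknown)}")
    return {
        # All identifiers below are placeholders (assumptions).
        "name": "example-job",
        "virtualClusterId": "<virtual-cluster-id>",
        "executionRoleArn": "<execution-role-arn>",
        "releaseLabel": "emr-6.10.0-latest",
        "jobDriver": {
            "sparkSubmitJobDriver": {"entryPoint": "s3://<bucket>/<script>.py"}
        },
        "configurationOverrides": {
            "applicationConfiguration": [
                {"classification": name, "properties": dict(props)}
                for name, props in sorted(overrides.items())
            ]
        },
    }


request = build_start_job_run_request(
    {"spark-defaults": {"spark.executor.memory": "2G"}}
)
print(json.dumps(request, indent=2))
```

If boto3 is installed and credentials are configured, a request like this could be submitted with boto3.client("emr-containers").start_job_run(**request) once the placeholders are replaced with real values.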

Notable features

  • Spark operator - With Amazon EMR on EKS 6.10.0 and higher, you can use the Kubernetes operator for Apache Spark, or the Spark operator, to deploy and manage Spark applications with the Amazon EMR release runtime on your own Amazon EKS clusters. For more information, see Running Spark jobs with the Spark operator.

  • Java 11 - With Amazon EMR on EKS 6.10.0 and higher, you can launch Spark with the Java 11 runtime. To do this, pass emr-6.10.0-java11-latest as the release label. We recommend that you validate and run performance tests before you move your production workloads from the Java 8 image to the Java 11 image.

  • For the Amazon Redshift integration for Apache Spark, Amazon EMR on EKS 6.10.0 removes the dependency on minimal-json.jar, and automatically adds the required spark-redshift related jars to the executor class path for Spark: spark-redshift.jar, spark-avro.jar, and RedshiftJDBC.jar.
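
To illustrate the Java 11 option above, here is a small sketch that picks the release label for a job run. The function name release_label is hypothetical, and emr-6.10.0-latest is assumed here to be the label for the default (Java 8) image; only emr-6.10.0-java11-latest is taken from this page.

```python
def release_label(use_java11: bool = False) -> str:
    """Hypothetical helper: choose the EMR on EKS 6.10.0 release label."""
    # "emr-6.10.0-java11-latest" comes from the release notes above.
    # "emr-6.10.0-latest" is an assumed label for the default Java 8 image.
    return "emr-6.10.0-java11-latest" if use_java11 else "emr-6.10.0-latest"


# Run the same workload against both labels when validating a migration.
labels_to_test = [release_label(use_java11=False), release_label(use_java11=True)]
print(labels_to_test)
```

Running the same job against both labels, as in labels_to_test, is one way to follow the recommendation to validate and compare performance before moving production workloads.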

Changes

  • The EMRFS S3-optimized committer is now enabled by default for Parquet, ORC, and text-based formats (including CSV and JSON).