Troubleshooting Amazon EMR on EKS Spark operator - Amazon EMR

Troubleshooting Amazon EMR on EKS Spark operator

Refer to the following sections if you encounter problems with the Amazon EMR on EKS Spark operator. For more information including steps to complete the installation, see Running Spark jobs with the Spark operator.

Error on Helm chart installation

If you followed the steps in Install the Spark operator and it returned a INSTALLATION FAILED error like the one below when you tried to install or verify the Helm chart, you might not have obtained the authentication tokens to the Amazon ECR repository for the operator.

To resolve this issue, repeat the step in Install the Spark operator to authenticate your Helm client to the Amazon ECR registry. Then, try the installation step again.

Error: INSTALLATION FAILED: Kubernetes cluster unreachable: the server has asked for the client to provide credentials

UnsupportedFileSystemException: No FileSystem for scheme "s3"

You might encounter the following exception in thread "main":

org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"

If this occurs, add the following exceptions to the SparkApplication spec:

hadoopConf: # EMRFS filesystem fs.s3.customAWSCredentialsProvider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider fs.s3.impl: com.amazon.ws.emr.hadoop.fs.EmrFileSystem fs.AbstractFileSystem.s3.impl: org.apache.hadoop.fs.s3.EMRFSDelegate fs.s3.buffer.dir: /mnt/s3 fs.s3.getObject.initialSocketTimeoutMilliseconds: "2000" mapreduce.fileoutputcommitter.algorithm.version.emr_internal_use_only.EmrFileSystem: "2" mapreduce.fileoutputcommitter.cleanup-failures.ignored.emr_internal_use_only.EmrFileSystem: "true" sparkConf: # Required for EMR Runtime spark.driver.extraClassPath: /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/* spark.driver.extraLibraryPath: /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native spark.executor.extraClassPath: /usr/lib/hadoop-lzo/lib/*:/usr/lib/hadoop/hadoop-aws.jar:/usr/share/aws/aws-java-sdk/*:/usr/share/aws/emr/emrfs/conf:/usr/share/aws/emr/emrfs/lib/*:/usr/share/aws/emr/emrfs/auxlib/*:/usr/share/aws/emr/security/conf:/usr/share/aws/emr/security/lib/*:/usr/share/aws/hmclient/lib/aws-glue-datacatalog-spark-client.jar:/usr/share/java/Hive-JSON-Serde/hive-openx-serde.jar:/usr/share/aws/sagemaker-spark-sdk/lib/sagemaker-spark-sdk.jar:/home/hadoop/extrajars/* spark.executor.extraLibraryPath: /usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/docker/usr/lib/hadoop/lib/native:/docker/usr/lib/hadoop-lzo/lib/native