Considerations for Amazon EMR with Lake Formation

Consider the following when using Amazon EMR with AWS Lake Formation.

  • Table-level access control is available on clusters with Amazon EMR releases 6.13 and higher.

  • Fine-grained access control at row, column, and cell level is available on clusters with Amazon EMR releases 6.15 and higher.

  • Users with access to a table can access all the properties of that table. If you have Lake Formation based access control on a table, review the table to make sure that the properties don't contain any sensitive data or information.

  • Amazon EMR clusters with Lake Formation don't support Spark's fallback to HDFS when Spark collects table statistics. Outside of Lake Formation, this fallback ordinarily helps optimize query performance.

  • Operations that support access controls based on Lake Formation with non-governed Apache Spark tables include INSERT INTO and INSERT OVERWRITE.

  • Operations that support access controls based on Lake Formation with Apache Spark and Apache Hive include SELECT, DESCRIBE, SHOW DATABASE, SHOW TABLE, SHOW COLUMN, and SHOW PARTITION.

  • Amazon EMR doesn't support Lake Formation based access control for the following operations:

    • Writes to governed tables.

    • CREATE TABLE statements. ALTER TABLE is supported on Amazon EMR releases 6.10.0 and higher.

    • DML statements other than INSERT commands.

  • The same query can perform differently with and without Lake Formation based access control.

  • You can use Amazon EMR with Lake Formation only for Spark jobs.

  • Trusted identity propagation is not supported with a multi-catalog hierarchy in the AWS Glue Data Catalog. For more information, see Working with a multi-catalog hierarchy in AWS Glue Data Catalog.
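
The release thresholds and operation lists above can be turned into a small pre-flight check. The following Python sketch is illustrative only and is not part of any AWS SDK: the helper names (parse_release, access_control_level, is_listed_operation) are hypothetical, the release label format (for example, emr-6.15.0) is assumed, and the statement check is a simple prefix match rather than a SQL parser. It doesn't cover ALTER TABLE, which the list above treats separately.

```python
import re
from typing import Tuple

# Minimum Amazon EMR releases (major, minor) taken from the list above.
TABLE_LEVEL_MIN = (6, 13)    # table-level access control
FINE_GRAINED_MIN = (6, 15)   # row, column, and cell level access control

# Statement prefixes listed above as supporting Lake Formation based access
# control. The INSERT forms apply to non-governed Apache Spark tables only.
LISTED_PREFIXES = (
    "SELECT", "DESCRIBE",
    "SHOW DATABASE", "SHOW TABLE", "SHOW COLUMN", "SHOW PARTITION",
    "INSERT INTO", "INSERT OVERWRITE",
)


def parse_release(label: str) -> Tuple[int, int]:
    """Parse a release label such as 'emr-6.15.0' into (6, 15)."""
    match = re.fullmatch(r"(?:emr-)?(\d+)\.(\d+)(?:\.\d+)?", label.strip())
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    return int(match.group(1)), int(match.group(2))


def access_control_level(label: str) -> str:
    """Return the finest access control level the list above names for a release."""
    release = parse_release(label)
    if release >= FINE_GRAINED_MIN:
        return "fine-grained"
    if release >= TABLE_LEVEL_MIN:
        return "table"
    # Assumption: releases below 6.13 are simply not covered by the list above.
    return "unlisted"


def is_listed_operation(statement: str) -> bool:
    """Check whether a statement starts with one of the listed operations."""
    normalized = " ".join(statement.upper().split())
    return normalized.startswith(LISTED_PREFIXES)


if __name__ == "__main__":
    for label in ("emr-6.12.0", "emr-6.13.0", "emr-6.15.0"):
        print(label, access_control_level(label))
    for sql in ("SELECT * FROM sales", "DELETE FROM sales", "CREATE TABLE t (a INT)"):
        print(repr(sql), is_listed_operation(sql))
```

A check like this only mirrors the documented lists. Lake Formation and Amazon EMR remain the source of truth for what a given cluster actually permits.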

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.