Using AWS Lake Formation with Amazon EMR
Amazon EMR is a flexible AWS managed cluster platform on which you can run custom code on supported big data frameworks such as Hadoop MapReduce, Spark, Hive, and Presto. Organizations also use Amazon EMR to run batch and stream data processing applications across a highly distributed cluster. Using Apache Spark on Amazon EMR, you can run your data transformations and custom code on databases and tables whose permissions are managed by Lake Formation.
There are three options for deploying Amazon EMR:
- EMR on EC2
- EMR Serverless
- Amazon EMR on EKS
For more information, see Integrate Amazon EMR with Lake Formation or Using EMR Serverless with AWS Lake Formation for fine-grained access control.
Support for transactional table formats
Amazon EMR releases 6.15.0 and higher include support for Lake Formation table, row, column, and cell-level access control permissions on Apache Hudi, Apache Iceberg, and Delta Lake tables.
For limitations, see Considerations for Amazon EMR with Lake Formation.
Table format | Description and allowed operations | Lake Formation permissions supported in Amazon EMR |
---|---|---|
Apache Hudi | An open table format used to simplify incremental data processing and data pipeline development. For a list of supported operations, see Apache Hudi and Lake Formation. | Amazon EMR supports table, row, column, and cell-level access control with Apache Hudi. |
Apache Iceberg | An open table format that manages large collections of files as tables. For a list of supported operations, see Apache Iceberg and Lake Formation. | Amazon EMR supports table, row, column, and cell-level access control with Apache Iceberg. |
Linux Foundation Delta Lake | An open-source project that helps implement modern data lake architectures commonly built on Amazon S3 or Hadoop Distributed File System (HDFS). For a list of supported operations, see Delta Lake and Lake Formation. | Amazon EMR supports table, row, column, and cell-level access control with Delta Lake tables. |
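To make the table, column, and cell-level permissions above more concrete, the following is a minimal sketch of how such permissions might be expressed as requests to the Lake Formation API through boto3. It is an illustration under stated assumptions, not the documented setup procedure for Amazon EMR. The role ARN, account ID, database, table, column, and filter names are all hypothetical placeholders, and the helper functions (`build_column_grant`, `build_cell_filter`, `apply_requests`) are invented for this example. The helpers only build request dictionaries; the actual calls to `grant_permissions` and `create_data_cells_filter` are kept in a separate function so the request shapes can be inspected without AWS credentials.

```python
from typing import Any, Dict, List


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": columns,  # only these columns become readable
            }
        },
        "Permissions": ["SELECT"],
    }


def build_cell_filter(catalog_id: str, database: str, table: str,
                      filter_name: str, row_expression: str,
                      columns: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for a data cells filter (rows plus columns)."""
    return {
        "TableData": {
            "TableCatalogId": catalog_id,
            "DatabaseName": database,
            "TableName": table,
            "Name": filter_name,
            "RowFilter": {"FilterExpression": row_expression},
            "ColumnNames": columns,
        }
    }


def apply_requests(grant: Dict[str, Any], cell_filter: Dict[str, Any],
                   region: str) -> None:
    """Send both requests to Lake Formation (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builders work without boto3

    client = boto3.client("lakeformation", region_name=region)
    client.create_data_cells_filter(**cell_filter)
    client.grant_permissions(**grant)


if __name__ == "__main__":
    import json

    # Hypothetical values, for illustration only.
    grant = build_column_grant(
        "arn:aws:iam::111122223333:role/example-emr-runtime-role",
        "sales_db", "orders", ["order_id", "amount"],
    )
    cell_filter = build_cell_filter(
        "111122223333", "sales_db", "orders",
        "eu_orders_only", "region = 'EU'", ["order_id", "amount"],
    )
    print(json.dumps({"grant": grant, "filter": cell_filter}, indent=2))
```

Running the script prints the two request bodies without contacting AWS. Creating a data cells filter does not by itself give anyone access: a principal would still need a separate grant on the filter, which this sketch leaves out.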