You can integrate AWS Lake Formation with an EMR cluster that has AWS IAM Identity Center enabled.
First, be sure you have an Identity Center instance set up in the same Region as your cluster. For more information, see Create an Identity Center instance. You can find the instance ARN in the IAM Identity Center console when you view the instance details, or use the following command to view details for all your instances from the CLI:
aws sso-admin list-instances
Then use the ARN and your AWS account ID with the following command to configure Lake Formation to be compatible with IAM Identity Center:
aws lakeformation create-lake-formation-identity-center-configuration --cli-input-json file://create-lake-formation-idc-config.json
JSON input:
{
  "CatalogId": "account-id/org-account-id",
  "InstanceArn": "identity-center-instance-arn"
}
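As a minimal sketch, the JSON input above could be generated from the output of the list-instances command rather than written by hand. The helper name build_idc_config, the sample instance ARN, and the account ID below are hypothetical placeholders. The sketch assumes the command output has a top-level "Instances" list whose entries carry an "InstanceArn" field.

```python
import json


def build_idc_config(list_instances_output: str, catalog_id: str) -> dict:
    """Build the JSON input for create-lake-formation-identity-center-configuration.

    list_instances_output is the raw JSON text printed by
    `aws sso-admin list-instances`; catalog_id is your account ID.
    """
    instances = json.loads(list_instances_output).get("Instances", [])
    if len(instances) != 1:
        # Refuse to guess when there is no instance or more than one.
        raise ValueError(f"expected exactly one instance, found {len(instances)}")
    return {"CatalogId": catalog_id, "InstanceArn": instances[0]["InstanceArn"]}


# Made-up sample output for illustration only; real ARNs and IDs will differ.
sample_output = json.dumps({
    "Instances": [
        {
            "InstanceArn": "arn:aws:sso:::instance/ssoins-EXAMPLE",
            "IdentityStoreId": "d-EXAMPLE",
        }
    ]
})

config = build_idc_config(sample_output, "111122223333")
print(json.dumps(config, indent=2))
```

Redirect the printed JSON into the file that you pass to --cli-input-json. Raising an error when more than one instance is returned is a deliberate choice here, so that the configuration never points at an instance picked arbitrarily.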
Now, call put-data-lake-settings and enable AllowFullTableExternalDataAccess with Lake Formation:
aws lakeformation put-data-lake-settings --cli-input-json file://put-data-lake-settings.json
JSON input:
{
  "DataLakeSettings": {
    "DataLakeAdmins": [
      {
        "DataLakePrincipalIdentifier": "admin-ARN"
      }
    ],
    "CreateDatabaseDefaultPermissions": [...],
    "CreateTableDefaultPermissions": [...],
    "AllowExternalDataFiltering": true,
    "AllowFullTableExternalDataAccess": true
  }
}
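The input above carries the full settings object, including administrators and default permissions, so it may be safer to start from your current settings and change only the two external-access flags. The following is a sketch under that assumption, not a definitive procedure. It assumes that you have saved the output of `aws lakeformation get-data-lake-settings` and that the output has a top-level "DataLakeSettings" key. The helper name enable_external_access and the sample values are hypothetical.

```python
import copy
import json


def enable_external_access(current: dict) -> dict:
    """Return put-data-lake-settings input with both external-access flags enabled.

    All other fields (admins, default permissions) are carried over unchanged.
    """
    # Deep-copy so the caller's dictionary is not modified.
    settings = copy.deepcopy(current.get("DataLakeSettings", {}))
    settings["AllowExternalDataFiltering"] = True
    settings["AllowFullTableExternalDataAccess"] = True
    return {"DataLakeSettings": settings}


# Made-up stand-in for the parsed output of get-data-lake-settings.
current_settings = {
    "DataLakeSettings": {
        "DataLakeAdmins": [{"DataLakePrincipalIdentifier": "admin-ARN"}],
        "CreateDatabaseDefaultPermissions": [],
        "CreateTableDefaultPermissions": [],
    }
}

updated = enable_external_access(current_settings)
print(json.dumps(updated, indent=2))
```

Review the printed JSON before you pass it to put-data-lake-settings, in particular the list of data lake administrators.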
Finally, grant full table permissions to the identity ARN for the user that accesses the EMR cluster. The ARN contains the user ID from Identity Center. Navigate to Identity Center in the console, select Users, and then select the user to view their General information settings.
Copy the User ID and paste it into the following ARN in place of user-id:
arn:aws:identitystore:::user/user-id
Note
Queries on the EMR cluster only work if the IAM Identity Center identity has full table access on the Lake Formation protected table. If the identity doesn't have full table access, then the query will fail.
Use the following command to grant the user full table access:
aws lakeformation grant-permissions --cli-input-json file://grantpermissions.json
JSON input:
{
  "Principal": {
    "DataLakePrincipalIdentifier": "arn:aws:identitystore:::user/user-id"
  },
  "Resource": {
    "Table": {
      "DatabaseName": "tip_db",
      "Name": "tip_table"
    }
  },
  "Permissions": [
    "ALL"
  ],
  "PermissionsWithGrantOption": [
    "ALL"
  ]
}
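If you grant access to several users or tables, you could generate the grant-permissions input instead of editing it by hand. The following minimal sketch builds the identity ARN from an Identity Center User ID and then assembles the same JSON structure shown above. The function names and the sample User ID are hypothetical. The database and table names are the example values from this topic.

```python
import json


def identity_store_user_arn(user_id: str) -> str:
    """Build the identity ARN from an Identity Center User ID."""
    user_id = user_id.strip()
    if not user_id:
        raise ValueError("user_id must not be empty")
    return f"arn:aws:identitystore:::user/{user_id}"


def build_grant_request(user_id: str, database: str, table: str) -> dict:
    """Assemble the JSON input for grant-permissions with full table access."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": identity_store_user_arn(user_id)},
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
        "Permissions": ["ALL"],
        "PermissionsWithGrantOption": ["ALL"],
    }


# "1234567890-EXAMPLE" is a made-up User ID; copy the real one from the console.
request = build_grant_request("1234567890-EXAMPLE", "tip_db", "tip_table")
print(json.dumps(request, indent=2))
```

Save the printed JSON to the file that you pass to --cli-input-json in the grant-permissions command.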