Creating an Amazon S3 Tables catalog in the AWS Glue Data Catalog
This feature is in preview release and is subject to change. For more
information, see the Betas and Previews section in the AWS Service Terms.
Amazon S3 Tables provide S3 storage that's specifically optimized for analytics workloads, improving query performance while reducing costs. S3 tables have built-in support for the Apache Iceberg standard, which allows you to query tabular data in Amazon S3 table buckets using popular query engines like Apache Spark.
You can now publish and catalog S3 tables as AWS Glue Data Catalog objects and register the catalog as a Lake Formation data location, either from the Lake Formation console or by using service APIs. For more information, see Using Amazon S3 Tables with AWS analytics services in the Amazon Simple Storage Service User Guide.
Prerequisites
-
A data lake administrator or an IAM principal with CREATE_CATALOG permission can complete the one-click integration from the Lake Formation console.
-
Create an IAM role for Lake Formation data access to your S3 table buckets. The IAM role used when registering the table bucket with Lake Formation requires the following permissions:
{
    "Action": [
        "s3tables:CreateTableBucket",
        "s3tables:GetTableBucket",
        "s3tables:ListTableBuckets",
        "s3tables:DeleteTableBucket",
        "s3tables:CreateNamespace",
        "s3tables:GetNamespace",
        "s3tables:ListNamespaces",
        "s3tables:DeleteNamespace",
        "s3tables:GetTableBucketPolicy",
        "s3tables:DeleteTableBucketPolicy",
        "s3tables:CreateTable",
        "s3tables:GetTable",
        "s3tables:RenameTable",
        "s3tables:DeleteTable",
        "s3tables:ListTables",
        "s3tables:PutTablePolicy",
        "s3tables:GetTablePolicy",
        "s3tables:DeleteTablePolicy"
    ],
    "Resource": "arn:aws:s3tables:us-east-1:123456789012:bucket/*",
    "Effect": "Allow"
}
For more information, see Requirements for roles used to register locations.
-
Add the following trust policy to the IAM role to allow the Lake Formation service to assume the role and vend temporary credentials to the integrated analytical engines. Include the sts:SetContext action in the trust relationship when you use IAM Identity Center principals with Lake Formation.
{
    "Effect": "Allow",
    "Principal": {
        "Service": "lakeformation.amazonaws.com"
    },
    "Action": [
        "sts:AssumeRole",
        "sts:SetContext"
    ]
}
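The two snippets above are policy statements rather than complete policy documents. The following is a minimal sketch of how they might be wrapped into full IAM policy documents, assuming the standard IAM wrapper (a "Version" of "2012-10-17" plus a "Statement" list). The function names and the use_identity_center flag are hypothetical, introduced only for illustration; the account ID and Region are the example values used on this page.

```python
import json

# Actions copied from the permissions statement shown above.
S3_TABLES_ACTIONS = [
    "s3tables:CreateTableBucket", "s3tables:GetTableBucket",
    "s3tables:ListTableBuckets", "s3tables:DeleteTableBucket",
    "s3tables:CreateNamespace", "s3tables:GetNamespace",
    "s3tables:ListNamespaces", "s3tables:DeleteNamespace",
    "s3tables:GetTableBucketPolicy", "s3tables:DeleteTableBucketPolicy",
    "s3tables:CreateTable", "s3tables:GetTable", "s3tables:RenameTable",
    "s3tables:DeleteTable", "s3tables:ListTables",
    "s3tables:PutTablePolicy", "s3tables:GetTablePolicy",
    "s3tables:DeleteTablePolicy",
]


def permissions_policy(region: str, account_id: str) -> dict:
    """Wrap the s3tables statement in a complete IAM policy document."""
    return {
        "Version": "2012-10-17",  # assumption: standard IAM policy version
        "Statement": [
            {
                "Effect": "Allow",
                "Action": S3_TABLES_ACTIONS,
                "Resource": f"arn:aws:s3tables:{region}:{account_id}:bucket/*",
            }
        ],
    }


def trust_policy(use_identity_center: bool = False) -> dict:
    """Build the trust policy that lets Lake Formation assume the role.

    sts:SetContext is added only when IAM Identity Center principals
    are used with Lake Formation (hypothetical flag).
    """
    actions = ["sts:AssumeRole"]
    if use_identity_center:
        actions.append("sts:SetContext")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "lakeformation.amazonaws.com"},
                "Action": actions,
            }
        ],
    }


if __name__ == "__main__":
    # Print both documents; redirect the output to files if needed.
    print(json.dumps(permissions_policy("us-east-1", "123456789012"), indent=2))
    print(json.dumps(trust_policy(use_identity_center=True), indent=2))
```

The printed documents can be saved to files and supplied wherever IAM expects a policy document. Adjust the Region, account ID, and resource scope to match your own table buckets.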
To integrate Amazon S3 Tables with AWS Glue Data Catalog (console)
-
Open the Amazon S3 console at https://console.aws.amazon.com/s3/.
-
Create an Amazon S3 table bucket using the Amazon S3 console, and integrate it with AWS analytics services. For more information, see Using Amazon S3 Tables with AWS analytics services.
-
Open the Lake Formation console at https://console.aws.amazon.com/lakeformation/. In the navigation pane, choose Catalogs under Data Catalog.
Choose S3 Table integration on the Catalogs page.
-
Choose an IAM role for Lake Formation to assume so that it can vend credentials to the analytical query engines.
Choose Enable. The new catalog for S3 Tables is added to the catalog list.
-
Choose the catalog to view catalog objects and grant permissions to other principals.
To create an S3 Tables catalog (CLI)
-
Create a catalog.
aws glue create-catalog --cli-input-json file://input.json
The input.json file contains the following:
{
    "Name": "s3tablescatalog",
    "CatalogInput": {
        "FederatedCatalog": {
            "Identifier": "arn:aws:s3tables:us-east-1:123456789012:bucket/*",
            "ConnectionName": "aws:s3tables"
        },
        "CreateDatabaseDefaultPermissions": [],
        "CreateTableDefaultPermissions": []
    }
}
-
Register the S3 Tables catalog as a Lake Formation data location.
aws lakeformation register-resource \
    --resource-arn 'arn:aws:s3tables:us-east-1:123456789012:bucket/*' \
    --role-arn 'arn:aws:iam::123456789012:role/LakeFormationDataAccessRole' \
    --with-federation
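The two CLI steps above share the same account ID, Region, and table bucket ARN, so it can help to generate them from one set of inputs. The following is a rough sketch that only assembles the JSON payload and the command lines and does not call AWS. The helper names are hypothetical; the catalog name, role name, account ID, and Region are the example values from this page. It passes the payload inline to --cli-input-json, which accepts a JSON string as well as a file:// path.

```python
import json
import shlex
from typing import List


def table_bucket_arn(region: str, account_id: str) -> str:
    """ARN pattern covering all S3 table buckets in the account and Region."""
    return f"arn:aws:s3tables:{region}:{account_id}:bucket/*"


def catalog_input(name: str, region: str, account_id: str) -> dict:
    """Build the create-catalog payload shown in step 1."""
    return {
        "Name": name,
        "CatalogInput": {
            "FederatedCatalog": {
                "Identifier": table_bucket_arn(region, account_id),
                "ConnectionName": "aws:s3tables",
            },
            "CreateDatabaseDefaultPermissions": [],
            "CreateTableDefaultPermissions": [],
        },
    }


def create_catalog_command(payload: dict) -> List[str]:
    """Argument list for step 1, with the JSON payload passed inline."""
    return ["aws", "glue", "create-catalog",
            "--cli-input-json", json.dumps(payload)]


def register_resource_command(region: str, account_id: str,
                              role_name: str) -> List[str]:
    """Argument list for step 2, registering the data location."""
    return ["aws", "lakeformation", "register-resource",
            "--resource-arn", table_bucket_arn(region, account_id),
            "--role-arn", f"arn:aws:iam::{account_id}:role/{role_name}",
            "--with-federation"]


if __name__ == "__main__":
    payload = catalog_input("s3tablescatalog", "us-east-1", "123456789012")
    commands = [
        create_catalog_command(payload),
        register_resource_command("us-east-1", "123456789012",
                                  "LakeFormationDataAccessRole"),
    ]
    # Print shell-quoted commands for review instead of executing them.
    for cmd in commands:
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try without credentials. To execute them, the argument lists could be passed to subprocess.run in an environment where the AWS CLI is configured.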