Once your S3 table buckets are integrated with the AWS Glue Data Catalog, you can use the AWS Glue Iceberg REST endpoint to connect to your S3 tables from Apache Iceberg-compatible clients, such as PyIceberg or Spark. The AWS Glue Iceberg REST endpoint implements the Iceberg REST Catalog Open API specification.
For an end-to-end walkthrough using PyIceberg, see Access data in Amazon S3 Tables using PyIceberg through the AWS Glue Iceberg REST endpoint.
Prerequisites
Create an IAM role for your client
To access tables through AWS Glue endpoints, you need to create an IAM role with permissions to AWS Glue and Lake Formation actions. This procedure explains how to create this role and configure its permissions.
Open the IAM console at https://console.aws.amazon.com/iam/.
In the left navigation pane, choose Policies.
Choose Create policy, and choose JSON in the policy editor.
Add the following inline policy that grants permissions to access AWS Glue and Lake Formation actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "glue:GetCatalog",
        "glue:GetDatabase",
        "glue:GetDatabases",
        "glue:GetTable",
        "glue:GetTables",
        "glue:CreateTable",
        "glue:UpdateTable"
      ],
      "Resource": [
        "arn:aws:glue:<region>:<account-id>:catalog",
        "arn:aws:glue:<region>:<account-id>:catalog/s3tablescatalog",
        "arn:aws:glue:<region>:<account-id>:catalog/s3tablescatalog/<s3_table_bucket_name>",
        "arn:aws:glue:<region>:<account-id>:table/s3tablescatalog/<s3_table_bucket_name>/<namespace>/*",
        "arn:aws:glue:<region>:<account-id>:database/s3tablescatalog/<s3_table_bucket_name>/<namespace>"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "lakeformation:GetDataAccess"
      ],
      "Resource": "*"
    }
  ]
}
Define access in Lake Formation
Lake Formation provides fine-grained access control for your data lake tables. When you integrated your S3 table bucket with the AWS Glue Data Catalog, your tables were automatically registered as resources in Lake Formation. To access these tables, you must grant specific Lake Formation permissions to your IAM identity, in addition to its IAM policy permissions.
The following steps explain how to apply Lake Formation access controls to allow your Iceberg client to connect to your tables. You must sign in as a data lake administrator to apply these permissions.
Allow external engines to access table data
In Lake Formation, you must enable full table access for external engines to access data. This allows third-party applications to get temporary credentials from Lake Formation when using an IAM role that has full permissions on the requested table.
Open the Lake Formation console at https://console.aws.amazon.com/lakeformation/, and sign in as a data lake administrator.
In the navigation pane under Administration, choose Application integration settings.
Select Allow external engines to access data in Amazon S3 locations with full table access. Then choose Save.
Grant Lake Formation permissions on your table resources
Next, grant Lake Formation permissions to the IAM role you created for your Iceberg-compatible client. These permissions allow the role to create and manage tables in your namespace. You need to provide both database-level and table-level permissions:
To grant database permissions
Open the AWS Lake Formation console at https://console.aws.amazon.com/lakeformation/, and sign in as a data lake administrator.
In the navigation pane, choose Data permissions and then choose Grant.
On the Grant Permissions page, under Principals, choose IAM users and roles and select the IAM role you created for AWS Glue Iceberg REST endpoint access.
Under LF-Tags or catalog resources, choose Named Data Catalog resources.
For Catalogs, choose the AWS Glue data catalog that was created for your table bucket. For example, <accountID>:s3tablescatalog/<table-bucket-name>.
For Databases, choose the S3 table bucket namespace that you created, for example mynamespace.
For Database permissions, choose Create table and Describe.
Choose Grant.
To grant table permissions
Open the AWS Lake Formation console at https://console.aws.amazon.com/lakeformation/, and sign in as a data lake administrator.
In the navigation pane, choose Data permissions and then choose Grant.
On the Grant Permissions page, under Principals, choose IAM users and roles and select the IAM role you created for AWS Glue Iceberg REST endpoint access.
Under LF-Tags or catalog resources, choose Named Data Catalog resources.
For Catalogs, choose the AWS Glue data catalog that was created for your table bucket. For example, <accountID>:s3tablescatalog/<table-bucket-name>.
For Databases, choose the S3 table bucket namespace that you created.
For Tables, choose ALL_TABLES.
For Table permissions, choose Super.
Choose Grant.
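The two console grants can also be expressed as request payloads for the Lake Formation GrantPermissions API. The sketch below only builds the payloads as Python dictionaries and does not call AWS. It assumes that the catalog ID uses the <accountID>:s3tablescatalog/<table-bucket-name> form, that the console's Super permission corresponds to ALL in the API, and that ALL_TABLES corresponds to a table wildcard. The helper names and sample values are hypothetical; check the field names against the Lake Formation API reference before relying on them.

```python
def database_grant(role_arn, account_id, table_bucket, namespace):
    """Payload for the database-level grant (Create table and Describe)."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "Database": {
                # Assumed catalog ID format for an S3 table bucket catalog.
                "CatalogId": f"{account_id}:s3tablescatalog/{table_bucket}",
                "Name": namespace,
            }
        },
        "Permissions": ["CREATE_TABLE", "DESCRIBE"],
    }


def all_tables_grant(role_arn, account_id, table_bucket, namespace):
    """Payload for the table-level grant (Super on ALL_TABLES)."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": role_arn},
        "Resource": {
            "Table": {
                "CatalogId": f"{account_id}:s3tablescatalog/{table_bucket}",
                "DatabaseName": namespace,
                # An empty wildcard is assumed to select every table.
                "TableWildcard": {},
            }
        },
        "Permissions": ["ALL"],
    }


# Placeholder values for illustration only.
role = "arn:aws:iam::111122223333:role/glue-irc-role"
for payload in (
    database_grant(role, "111122223333", "amzn-s3-demo-bucket", "mynamespace"),
    all_tables_grant(role, "111122223333", "amzn-s3-demo-bucket", "mynamespace"),
):
    print(payload["Resource"], payload["Permissions"])
```

With boto3 installed and data lake administrator credentials available, each dictionary could be passed as keyword arguments to the Lake Formation client's grant_permissions call.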
Set up your environment to use the endpoint
After you have set up the IAM role with the permissions required for table access, you can use it to run Iceberg clients from your local machine by configuring the AWS CLI with your role, using the following command:
aws sts assume-role --role-arn "arn:aws:iam::<accountid>:role/<glue-irc-role>" --role-session-name <glue-irc-role>
To access tables through the AWS Glue REST endpoint, you need to initialize a catalog in your Iceberg-compatible client. This initialization requires specifying custom properties, including SigV4 properties, the endpoint URI, and the warehouse location. Specify these properties as follows:
- SigV4 properties - SigV4 must be enabled, and the signing name is glue.
- Warehouse location - This is your table bucket, specified in this format: <accountid>:s3tablescatalog/<table-bucket-name>
- Endpoint URI - Refer to the AWS Glue service endpoints reference guide for the region-specific endpoint.
The following example shows how to initialize a PyIceberg catalog.
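from pyiceberg.catalog import load_catalog

# AWS Region of your table bucket; also used for request signing.
region = "<region>"

rest_catalog = load_catalog(
    "s3tablescatalog",  # catalog name
    **{
        "type": "rest",
        "warehouse": "<accountid>:s3tablescatalog/<table-bucket-name>",
        "uri": f"https://glue.{region}.amazonaws.com/iceberg",
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "glue",
        "rest.signing-region": region,
    },
)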
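Because the warehouse, URI, and signing Region all derive from the same three inputs, it can be convenient to build the property dictionary in one place. The following is a minimal sketch under that assumption; the helper name glue_rest_catalog_properties and the sample account ID, bucket name, and Region are hypothetical, and the URI pattern is the one used in the example above.

```python
def glue_rest_catalog_properties(account_id, table_bucket, region):
    """Build catalog properties for the AWS Glue Iceberg REST endpoint."""
    return {
        "type": "rest",
        # Warehouse format: <accountid>:s3tablescatalog/<table-bucket-name>
        "warehouse": f"{account_id}:s3tablescatalog/{table_bucket}",
        "uri": f"https://glue.{region}.amazonaws.com/iceberg",
        # SigV4 must be enabled, with "glue" as the signing name.
        "rest.sigv4-enabled": "true",
        "rest.signing-name": "glue",
        "rest.signing-region": region,
    }


# Placeholder values for illustration only.
props = glue_rest_catalog_properties(
    "111122223333", "amzn-s3-demo-bucket", "us-east-1"
)
for key, value in props.items():
    print(f"{key} = {value}")
```

With PyIceberg installed, the dictionary could be passed as load_catalog("s3tablescatalog", **props). Calling list_namespaces() on the returned catalog is then a quick way to check that the role and Lake Formation grants are in effect.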
For additional information about the AWS Glue Iceberg REST endpoint implementation, see Connecting to the Data Catalog using AWS Glue Iceberg REST endpoint in the AWS Glue User Guide.