Creating an Amazon S3 table
An S3 table is a subresource of a table bucket. This resource stores tables in the Apache Iceberg format so that you can work with them using query engines and other applications that support Apache Iceberg. Amazon S3 continuously optimizes your tables to help reduce storage costs and improve analytics query performance.
When you create a table, Amazon S3 automatically generates a warehouse location for the table. This is a unique S3 location where you can read and write objects associated with the table. The following example shows the format of a warehouse location:
s3://63a8e430-6e0b-46f5-k833abtwr6s8tmtsycedn8s4yc3xhuse1b--table-s3
Tables have the following Amazon Resource Name (ARN) format:
arn:aws:s3tables:Region:OwnerAccountID:bucket/bucket-name/table/tableID
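The ARN format above can be assembled and taken apart with plain string handling. The following is a minimal Python sketch, not part of any AWS SDK: the names `TableArn`, `build_table_arn`, and `parse_table_arn` are hypothetical helpers, and the sketch assumes the `aws` partition shown in the format above.

```python
from typing import NamedTuple


class TableArn(NamedTuple):
    """Components of a table ARN (hypothetical helper type)."""
    region: str
    account_id: str
    bucket_name: str
    table_id: str


def build_table_arn(region: str, account_id: str, bucket_name: str, table_id: str) -> str:
    """Assemble a table ARN in the documented format."""
    return f"arn:aws:s3tables:{region}:{account_id}:bucket/{bucket_name}/table/{table_id}"


def parse_table_arn(arn: str) -> TableArn:
    """Split a table ARN into its components, raising ValueError on other shapes."""
    parts = arn.split(":", 5)
    # Assumption: only the "aws" partition is handled here.
    if len(parts) != 6 or parts[:3] != ["arn", "aws", "s3tables"]:
        raise ValueError(f"not an s3tables ARN: {arn!r}")
    region, account_id, resource = parts[3], parts[4], parts[5]
    segments = resource.split("/")
    if len(segments) != 4 or segments[0] != "bucket" or segments[2] != "table":
        raise ValueError(f"not a table ARN: {arn!r}")
    return TableArn(region, account_id, segments[1], segments[3])
```

For example, `parse_table_arn(build_table_arn("us-east-1", "111122223333", "amzn-s3-demo-table-bucket", "example-table-id"))` returns the four components it was built from. The table ID value here is a placeholder.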
By default, you can create up to 10,000 tables in a table bucket. To request a quota increase for table buckets or tables, contact AWS Support.
You can create a table by using the Amazon S3 REST API, AWS SDK, AWS CLI, or integrated query engines.
For information on valid table names, see Naming rules for tables and namespaces.
Prerequisites for creating tables
To create a table, you must first do the following:
Create a namespace in your table bucket.
Make sure that you have IAM permissions for s3tables:CreateTable and s3tables:PutTableData.
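As an illustration of the permissions prerequisite, the following Python sketch builds an identity-based policy document that allows the two actions named above. The helper name `build_create_table_policy` is hypothetical, and the `Resource` scoping (the table bucket ARN plus a `/table/*` pattern derived from the table ARN format) is an assumption; verify it against the S3 Tables access management documentation before you use it.

```python
import json


def build_create_table_policy(table_bucket_arn: str) -> dict:
    """Return an IAM policy document allowing table creation (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The two actions listed in the prerequisites.
                "Action": ["s3tables:CreateTable", "s3tables:PutTableData"],
                # Assumption: scope to the table bucket and the tables in it.
                "Resource": [table_bucket_arn, f"{table_bucket_arn}/table/*"],
            }
        ],
    }


policy = build_create_table_policy(
    "arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket"
)
print(json.dumps(policy, indent=2))
```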
This example shows you how to create a table with a schema by using the AWS CLI and specifying table metadata with JSON. To use this example, replace the user input placeholders with your own information.
aws s3tables create-table --cli-input-json file://mytabledefinition.json
mytabledefinition.json:
{
    "tableBucketARN": "arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket",
    "namespace": "your_namespace",
    "name": "example_table",
    "format": "ICEBERG",
    "metadata": {
        "iceberg": {
            "schema": {
                "fields": [
                    {"name": "id", "type": "int", "required": true},
                    {"name": "name", "type": "string"},
                    {"name": "value", "type": "int"}
                ]
            }
        }
    }
}
You can create a table in a supported query engine connected to your Amazon S3 table buckets, such as in an Apache Spark session on Amazon EMR.
This example shows you how to create a table with Spark by using CREATE statements, and how to add table data by using INSERT or by reading data from an existing file. To use this example, replace the user input placeholders with your own information.
spark.sql(
""" CREATE TABLE IF NOT EXISTS s3tablesbucket.example_namespace.`example_table` (
    id INT,
    name STRING,
    value INT
)
USING iceberg """
)
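When the namespace, table name, or columns come from variables, the CREATE statement above can be composed before it is passed to `spark.sql`. The following Python sketch shows one way to do that for a PySpark session. The helpers `quote_identifier` and `build_create_table_sql` are hypothetical, and the sketch assumes that the catalog is registered under the name `s3tablesbucket`, as in the example above. It quotes every identifier with backticks, which Spark SQL accepts for any identifier.

```python
def quote_identifier(identifier: str) -> str:
    """Backtick-quote a Spark SQL identifier, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def build_create_table_sql(catalog, namespace, table, columns):
    """Compose a CREATE TABLE IF NOT EXISTS ... USING iceberg statement.

    columns is a sequence of (column_name, spark_sql_type) pairs.
    """
    if not columns:
        raise ValueError("at least one column is required")
    qualified_name = ".".join(quote_identifier(part) for part in (catalog, namespace, table))
    column_sql = ", ".join(f"{quote_identifier(name)} {sql_type}" for name, sql_type in columns)
    return f"CREATE TABLE IF NOT EXISTS {qualified_name} ({column_sql}) USING iceberg"


statement = build_create_table_sql(
    "s3tablesbucket",
    "example_namespace",
    "example_table",
    [("id", "INT"), ("name", "STRING"), ("value", "INT")],
)
print(statement)
```

In a PySpark session that is already connected to the table bucket, the result would be run with `spark.sql(statement)`.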
After you create the table, you can load data into it. Choose from the following methods:
Add data into the table using the INSERT command.
spark.sql(
""" INSERT INTO s3tablesbucket.my_namespace.my_table
    VALUES (1, 'ABC', 100), (2, 'XYZ', 200) """)
Load an existing data file.
Read the data into Spark.
val data_file_location = "Path such as S3 URI to data file"
val data_file = spark.read.parquet(data_file_location)
Write the data into an Iceberg table.
data_file.writeTo("s3tablesbucket.my_namespace.my_table").using("Iceberg").tableProperty("format-version", "2").createOrReplace()