Creating an Amazon S3 table - Amazon Simple Storage Service

Creating an Amazon S3 table

An S3 table is a subresource of a table bucket. This resource stores tables in the Apache Iceberg format so you can work with them using query engines and other applications that support Apache Iceberg. Amazon S3 continuously optimizes your tables to help reduce storage costs and improve analytics query performance.

When you create a table, Amazon S3 automatically generates a warehouse location for the table. This is a unique S3 location where you can read and write objects associated with the table. The following example shows the format of a warehouse location:

s3://63a8e430-6e0b-46f5-k833abtwr6s8tmtsycedn8s4yc3xhuse1b--table-s3

Tables have the following Amazon Resource Name (ARN) format:

arn:aws:s3tables:Region:OwnerAccountID:bucket/bucket-name/table/tableID
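As a rough illustration of the ARN layout above, the following Python sketch splits a table ARN into its parts and rebuilds it. The helper names (`TableArn`, `parse_table_arn`, `format_table_arn`) and the sample table ID are hypothetical, and the pattern assumes the standard `aws` partition only.

```python
import re
from typing import NamedTuple


class TableArn(NamedTuple):
    """Components of a table ARN (hypothetical helper type)."""
    region: str
    account_id: str
    bucket_name: str
    table_id: str


# Mirrors the documented layout:
# arn:aws:s3tables:Region:OwnerAccountID:bucket/bucket-name/table/tableID
# Assumption: only the "aws" partition is handled here.
_TABLE_ARN = re.compile(
    r"^arn:aws:s3tables:(?P<region>[^:]+):(?P<account_id>[^:]+):"
    r"bucket/(?P<bucket_name>[^/]+)/table/(?P<table_id>[^/]+)$"
)


def parse_table_arn(arn: str) -> TableArn:
    """Split a table ARN into its components, or raise ValueError."""
    match = _TABLE_ARN.match(arn)
    if match is None:
        raise ValueError(f"not a table ARN: {arn!r}")
    return TableArn(**match.groupdict())


def format_table_arn(parts: TableArn) -> str:
    """Rebuild the ARN string from its components."""
    return (
        f"arn:aws:s3tables:{parts.region}:{parts.account_id}:"
        f"bucket/{parts.bucket_name}/table/{parts.table_id}"
    )


# "example-table-id" is a placeholder, not a real table ID.
example = parse_table_arn(
    "arn:aws:s3tables:us-east-1:111122223333:"
    "bucket/amzn-s3-demo-table-bucket/table/example-table-id"
)
print(example.bucket_name, example.table_id)
```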

By default, you can create up to 10,000 tables in a table bucket. To request a quota increase for table buckets or tables, contact AWS Support.

You can create a table by using the Amazon S3 REST API, AWS SDK, AWS CLI, or integrated query engines.

For information on valid table names, see Naming rules for tables and namespaces.

Prerequisites for creating tables

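To create a table, you must first have, at a minimum, a table bucket and a namespace in that table bucket. The examples in this section reference both.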

This example shows you how to create a table with a schema by using the AWS CLI and specifying table metadata with JSON. To use this example, replace the user input placeholders with your own information.

aws s3tables create-table --cli-input-json file://mytabledefinition.json

mytabledefinition.json:

{
  "tableBucketARN": "arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket",
  "namespace": "your_namespace",
  "name": "example_table",
  "format": "ICEBERG",
  "metadata": {
    "iceberg": {
      "schema": {
        "fields": [
          {"name": "id", "type": "int", "required": true},
          {"name": "name", "type": "string"},
          {"name": "value", "type": "int"}
        ]
      }
    }
  }
}
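If you prefer to generate the definition file rather than write it by hand, a minimal Python sketch might look like the following. The function name `build_table_definition` is hypothetical; the keys and values are copied from the mytabledefinition.json example, and only the standard `json` module is used.

```python
import json


def build_table_definition(table_bucket_arn, namespace, name, fields):
    """Return a dict shaped like the mytabledefinition.json example."""
    return {
        "tableBucketARN": table_bucket_arn,
        "namespace": namespace,
        "name": name,
        "format": "ICEBERG",
        "metadata": {"iceberg": {"schema": {"fields": list(fields)}}},
    }


definition = build_table_definition(
    "arn:aws:s3tables:us-east-1:111122223333:bucket/amzn-s3-demo-table-bucket",
    "your_namespace",
    "example_table",
    [
        {"name": "id", "type": "int", "required": True},
        {"name": "name", "type": "string"},
        {"name": "value", "type": "int"},
    ],
)

# Serialize to JSON text; save this as mytabledefinition.json before
# running the create-table command with --cli-input-json.
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

Building the document from a function keeps the schema fields in one place, which can help when several tables share the same bucket and namespace.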

You can create a table in a supported query engine connected to your Amazon S3 table buckets, such as in an Apache Spark session on Amazon EMR.

This example shows you how to create a table with Spark by using CREATE statements, and add table data using INSERT or by reading data from an existing file. To use this example, replace the user input placeholders with your own information.

spark.sql(
  """
  CREATE TABLE IF NOT EXISTS s3tablesbucket.example_namespace.`example_table` (
    id INT,
    name STRING,
    value INT
  )
  USING iceberg
  """
)

After you create the table, you can load data into it. Choose from the following methods:

  • Add data into the table using the INSERT command.

    spark.sql( """ INSERT INTO s3tablesbucket.my_namespace.my_table VALUES (1, 'ABC', 100), (2, 'XYZ', 200) """)
  • Load an existing data file.

    1. Read the data into Spark.

      val data_file_location = "Path such as S3 URI to data file"
      val data_file = spark.read.parquet(data_file_location)
    2. Write the data into an Iceberg table.

      data_file.writeTo("s3tablesbucket.my_namespace.my_table").using("iceberg").tableProperty("format-version", "2").createOrReplace()
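The load steps above use Scala syntax. For PySpark sessions, an equivalent might be sketched as follows. The function name `load_parquet_into_table` is hypothetical, and the sketch assumes a Spark session that is already configured with the `s3tablesbucket` catalog used in the earlier examples.

```python
def load_parquet_into_table(spark, data_file_location, table_identifier):
    """Read a Parquet file and write it to an Iceberg table.

    `spark` is an existing SparkSession; the table is created or
    replaced, matching the createOrReplace() call in the Scala steps.
    """
    # Step 1: read the data into Spark.
    data_file = spark.read.parquet(data_file_location)

    # Step 2: write the data into an Iceberg table (format version 2).
    (
        data_file.writeTo(table_identifier)
        .using("iceberg")
        .tableProperty("format-version", "2")
        .createOrReplace()
    )
    return data_file


# Example call (requires a live, configured session):
# load_parquet_into_table(
#     spark,
#     "Path such as S3 URI to data file",
#     "s3tablesbucket.my_namespace.my_table",
# )
```

Note that `createOrReplace()` replaces the table if it already exists. To add rows to an existing table instead, the DataFrameWriterV2 `append()` method is an alternative to consider.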