Preparing data tables in Amazon S3
You can query data tables that have been cataloged in AWS Glue and stored in Amazon S3. If your data tables are already cataloged in AWS Glue, skip to Creating a configured table in AWS Clean Rooms.
Preparing your data tables in Amazon S3 involves the following steps:
Step 1: Complete the prerequisites
To prepare your data tables for use with AWS Clean Rooms, you must complete the following prerequisites:
-
Your data tables are saved as one of the supported data formats for AWS Clean Rooms.
-
Your data tables are cataloged in AWS Glue and use the supported data types for AWS Clean Rooms.
-
All of your data tables are stored in Amazon Simple Storage Service (Amazon S3) in the same AWS Region in which the collaboration was created.
-
The AWS Glue Data Catalog is in the same Region in which the collaboration was created.
-
The AWS Glue Data Catalog is in the same AWS account as the membership.
-
The Amazon S3 bucket isn't registered with AWS Lake Formation.
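The Region, account, and Lake Formation prerequisites above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of AWS Clean Rooms: the TableLocation record, the check_prerequisites function, and the sample account ID are hypothetical names introduced for illustration, and the values are assumed to have been collected by you beforehand. It does not check data formats or data types.

```python
from dataclasses import dataclass


@dataclass
class TableLocation:
    """Hypothetical record of where a data table and its catalog entry live."""
    bucket_region: str
    catalog_region: str
    catalog_account_id: str
    bucket_in_lake_formation: bool


def check_prerequisites(table, collaboration_region, membership_account_id):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if table.bucket_region != collaboration_region:
        problems.append("S3 bucket Region differs from the collaboration Region")
    if table.catalog_region != collaboration_region:
        problems.append("Glue Data Catalog Region differs from the collaboration Region")
    if table.catalog_account_id != membership_account_id:
        problems.append("Glue Data Catalog account differs from the membership account")
    if table.bucket_in_lake_formation:
        problems.append("S3 bucket is registered with AWS Lake Formation")
    return problems


if __name__ == "__main__":
    # Placeholder values; replace them with your own Region and account ID.
    table = TableLocation("us-east-1", "us-east-1", "111122223333", False)
    for problem in check_prerequisites(table, "us-east-1", "111122223333"):
        print(problem)
```

An empty result only means that these four conditions hold; the supported data formats and data types still need to be confirmed separately.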
Step 2: (Optional) Prepare your data for cryptographic computing
If you're using cryptographic computing and your data table contains sensitive information that you want to encrypt, you must encrypt the data table using the C3R encryption client.
To prepare your data for cryptographic computing, follow the procedures in Preparing encrypted data tables with Cryptographic Computing for Clean Rooms.
Step 3: Upload your data table to Amazon S3
Note
If you intend to use encrypted data tables in the collaboration, you must first encrypt the data for cryptographic computing before you upload your data table to Amazon S3. For more information, see Preparing encrypted data tables with Cryptographic Computing for Clean Rooms.
To upload your data table to Amazon S3
-
Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/.
-
Choose Buckets, and then choose a bucket where you want to store your data table.
-
Choose Upload, and then follow the prompts.
-
Choose the Objects tab to view the prefix where your data is stored. Make a note of the name of the folder.
You can select the folder to view the data.
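As an alternative to the console procedure, the upload can be scripted. The following is a minimal sketch that assumes the AWS SDK for Python (boto3) is installed and that credentials are already configured. The helper names build_object_key and upload_table, the bucket name, the prefix, and the file name are hypothetical placeholders; only upload_file is an S3 client method.

```python
import os
import posixpath


def build_object_key(prefix, file_name):
    """Join a folder prefix and a file name into an S3 object key."""
    return posixpath.join(prefix.strip("/"), file_name)


def upload_table(s3_client, local_path, bucket, prefix):
    """Upload one local file under the given prefix and return its object key."""
    key = build_object_key(prefix, os.path.basename(local_path))
    s3_client.upload_file(Filename=local_path, Bucket=bucket, Key=key)
    return key


if __name__ == "__main__":
    import boto3  # assumption: boto3 is installed and credentials are configured

    s3 = boto3.client("s3")
    # Placeholder file, bucket, and prefix; replace them with your own values.
    key = upload_table(s3, "sales.parquet", "amzn-s3-demo-bucket", "clean-rooms/sales")
    print(key)
```

The returned key contains the prefix (folder) that the procedure asks you to note. Passing the client in as an argument keeps the helper easy to exercise with a stand-in object before running it against a real bucket.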
Step 4: Create an AWS Glue table
If you already have an AWS Glue data table, you can skip this step.
In this step, you set up a crawler in AWS Glue that crawls all the files in your S3 bucket and creates an AWS Glue table. For more information, see Defining crawlers in AWS Glue in the AWS Glue User Guide.
For more information about supported AWS Glue Data Catalog data types, see Supported data types.
Note
AWS Clean Rooms doesn't currently support S3 buckets registered with AWS Lake Formation.
The following procedure describes how to create an AWS Glue table. If you want to use an encrypted AWS Glue Data Catalog object with an AWS Key Management Service (AWS KMS) key, you need to configure the KMS key permissions policy to allow access to that encrypted table. For more information, see Setting up encryption in AWS Glue in the AWS Glue Developer Guide.
To create an AWS Glue table
-
Follow the Working with crawlers on the AWS Glue console procedure in the AWS Glue User Guide.
-
Make a note of the AWS Glue database name and AWS Glue table name.
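The crawler setup can also be scripted instead of using the console. The following is a minimal sketch that assumes boto3 is installed, credentials are configured, and an IAM role that AWS Glue can assume already exists. The function names, crawler name, role ARN, database name, and S3 path are hypothetical placeholders; create_crawler, start_crawler, and get_tables are AWS Glue client operations.

```python
def create_and_start_crawler(glue_client, crawler_name, role_arn, database_name, s3_path):
    """Create a crawler over one S3 path and start it."""
    glue_client.create_crawler(
        Name=crawler_name,
        Role=role_arn,
        DatabaseName=database_name,
        Targets={"S3Targets": [{"Path": s3_path}]},
    )
    glue_client.start_crawler(Name=crawler_name)


def list_table_names(glue_client, database_name):
    """Return the table names in a database (first page of results only)."""
    response = glue_client.get_tables(DatabaseName=database_name)
    return [table["Name"] for table in response["TableList"]]


if __name__ == "__main__":
    import boto3  # assumption: boto3 is installed and credentials are configured

    glue = boto3.client("glue")
    # Placeholder names; replace them with your own role, database, and path.
    create_and_start_crawler(
        glue,
        crawler_name="clean-rooms-sales-crawler",
        role_arn="arn:aws:iam::111122223333:role/ExampleGlueCrawlerRole",
        database_name="clean_rooms_db",
        s3_path="s3://amzn-s3-demo-bucket/clean-rooms/sales/",
    )
    # The crawler runs asynchronously; list the tables after the run finishes.
    print(list_table_names(glue, "clean_rooms_db"))
```

Because the crawler runs asynchronously, the table list may be empty if it is read immediately after the crawler starts. The database name passed in and the table names returned are the two values to note for later steps.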
Step 5: Next steps
Now that you have prepared your data tables in Amazon S3, the tables can be queried after:
-
The collaboration creator has set up a collaboration in AWS Clean Rooms. For more information, see Creating a collaboration.
-
The collaboration creator has sent the collaboration ID to you as a participant in the collaboration.