Optimizing metadata table query performance - Amazon Simple Storage Service

Optimizing metadata table query performance

Note

The S3 Metadata feature is in preview release for Amazon S3 and is subject to change.

Because S3 Metadata is based on the Apache Iceberg table format, you can improve the performance and reduce the cost of your metadata table queries by restricting them to specific time ranges.

For example, the following SQL query provides the sensitivity level of new objects in an S3 general purpose bucket:

SELECT key, object_tags['SensitivityLevel']
FROM aws_s3_metadata.my_metadata_table
WHERE record_type = 'CREATE'
GROUP BY key, object_tags['SensitivityLevel']

This query scans the entire metadata table, so it might take a long time to run. To improve performance, filter on the record_timestamp column to restrict the query to a specific time range. The following version of the previous query returns only new objects from the past month:

SELECT key, object_tags['SensitivityLevel']
FROM aws_s3_metadata.my_metadata_table
WHERE record_type = 'CREATE'
AND record_timestamp > (CURRENT_TIMESTAMP - interval '1' month)
GROUP BY key, object_tags['SensitivityLevel']
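
The same approach can be applied to a fixed window by bounding record_timestamp on both sides. The following is a minimal sketch, not a definitive query: the dates are placeholder values, the sensitivity_level alias is a hypothetical name added for readability, and the TIMESTAMP literal syntax assumes a Trino-compatible query engine such as Amazon Athena.

```sql
-- Sketch: new objects recorded during January 2025.
-- The window boundaries below are placeholder values; adjust them as needed.
SELECT key, object_tags['SensitivityLevel'] AS sensitivity_level
FROM aws_s3_metadata.my_metadata_table
WHERE record_type = 'CREATE'
  -- Half-open interval: includes the start, excludes the end.
  AND record_timestamp >= TIMESTAMP '2025-01-01 00:00:00'
  AND record_timestamp <  TIMESTAMP '2025-02-01 00:00:00'
```

Using a half-open interval (inclusive start, exclusive end) lets adjacent windows be queried back to back without counting a record twice.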