Customizing crawler behavior

When you configure an AWS Glue crawler, you have several options for defining the behavior of your crawler.

  • Incremental crawls – You can configure a crawler to run incremental crawls to add only new partitions to the table schema.

  • Partition indexes – A crawler creates partition indexes for Amazon S3 and Delta Lake targets by default to provide efficient lookup for specific partitions.

  • Accelerate crawl time by using Amazon S3 events – You can configure a crawler to use Amazon S3 events to identify the changes between two crawls. The crawler then lists only the files in the subfolder that triggered the event instead of listing the full Amazon S3 or Data Catalog target.

  • Handling schema changes – You can prevent a crawler from making any changes to the existing schema. You can use the AWS Management Console or the AWS Glue API to configure how your crawler processes certain types of changes.

  • A single schema for multiple Amazon S3 paths – You can configure a crawler to create a single schema for each S3 path if the data is compatible.

  • Table location and partitioning levels – The table-level crawler option lets you tell the crawler where the tables are located and how you want partitions created.

  • Table threshold – You can specify the maximum number of tables the crawler is allowed to create by specifying a table threshold.

  • AWS Lake Formation credentials – You can configure a crawler to use Lake Formation credentials to access an Amazon S3 data store or a Data Catalog table with an underlying Amazon S3 location within the same AWS account or another AWS account.
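
As a rough illustration of how several of the options above could be expressed through the AWS Glue API, the following Python sketch assembles the relevant arguments as plain data. The helper name `build_crawler_settings` and its defaults are hypothetical and chosen for this example. The sketch assumes the key names used by the `CreateCrawler` and `UpdateCrawler` API operations (`Configuration`, `RecrawlPolicy`, `SchemaChangePolicy`), and it makes no AWS calls.

```python
import json


def build_crawler_settings(incremental: bool = True, table_level: int = 3) -> dict:
    """Assemble crawler settings as keyword arguments (hypothetical helper).

    The returned dict is intended to be merged into a CreateCrawler or
    UpdateCrawler request; nothing here contacts AWS.
    """
    # The Configuration argument is a JSON string, so build a dict first.
    configuration = {
        "Version": 1.0,
        "Grouping": {
            # A single schema per S3 path when the data is compatible.
            "TableGroupingPolicy": "CombineCompatibleSchemas",
            # Table location: absolute folder level at which tables sit
            # (the bucket counts as level 1).
            "TableLevelConfiguration": table_level,
        },
        # Partition indexes (created by default for S3 and Delta Lake targets).
        "CreatePartitionIndex": True,
    }

    recrawl = "CRAWL_NEW_FOLDERS_ONLY" if incremental else "CRAWL_EVERYTHING"

    return {
        "Configuration": json.dumps(configuration),
        # Incremental crawls: only folders added since the last crawl.
        "RecrawlPolicy": {"RecrawlBehavior": recrawl},
        # Handling schema changes: log them instead of modifying the
        # existing schema. Incremental crawls expect LOG for both values.
        "SchemaChangePolicy": {"UpdateBehavior": "LOG", "DeleteBehavior": "LOG"},
    }


settings = build_crawler_settings()
# With boto3 installed and credentials configured, these settings could be
# passed along with the required arguments, for example:
#   boto3.client("glue").create_crawler(
#       Name=..., Role=..., DatabaseName=..., Targets=..., **settings)
```

Keeping the settings in one plain dictionary makes it easy to review or unit test the intended behavior before any crawler is created. The table threshold and Lake Formation options are left out of the sketch.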

For more information about using the AWS Glue console to add a crawler, see Configuring a crawler.

© 2025, Amazon Web Services, Inc. or its affiliates. All rights reserved.