CfnCrawlerProps

class aws_cdk.aws_glue.CfnCrawlerProps(*, role, targets, classifiers=None, configuration=None, crawler_security_configuration=None, database_name=None, description=None, lake_formation_configuration=None, name=None, recrawl_policy=None, schedule=None, schema_change_policy=None, table_prefix=None, tags=None)

Bases: object

Properties for defining a CfnCrawler.

Parameters:

role (str) – The Amazon Resource Name (ARN) of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
targets (Union[IResolvable, TargetsProperty, Dict[str, Any]]) – A collection of targets to crawl.
classifiers (Optional[Sequence[str]]) – A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
configuration (Optional[str]) – Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler’s behavior. For more information, see Configuring a Crawler.
crawler_security_configuration (Optional[str]) – The name of the SecurityConfiguration structure to be used by this crawler.
database_name (Optional[str]) – The name of the database in which the crawler’s output is stored.
description (Optional[str]) – A description of the crawler.
lake_formation_configuration (Union[IResolvable, LakeFormationConfigurationProperty, Dict[str, Any], None]) – Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.
name (Optional[str]) – The name of the crawler.
recrawl_policy (Union[IResolvable, RecrawlPolicyProperty, Dict[str, Any], None]) – A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
schedule (Union[IResolvable, ScheduleProperty, Dict[str, Any], None]) – For scheduled crawlers, the schedule when the crawler runs.
schema_change_policy (Union[IResolvable, SchemaChangePolicyProperty, Dict[str, Any], None]) – The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer’s database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
table_prefix (Optional[str]) – The prefix added to the names of tables that are created.
tags (Any) – The tags to use with this crawler.

See:

http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html

Example:

# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_glue as glue

# Placeholder tags; replace with your own key-value pairs.
tags = {"key": "value"}

cfn_crawler_props = glue.CfnCrawlerProps(
    role="role",
    targets=glue.CfnCrawler.TargetsProperty(
        catalog_targets=[glue.CfnCrawler.CatalogTargetProperty(
            connection_name="connectionName",
            database_name="databaseName",
            dlq_event_queue_arn="dlqEventQueueArn",
            event_queue_arn="eventQueueArn",
            tables=["tables"]
        )],
        delta_targets=[glue.CfnCrawler.DeltaTargetProperty(
            connection_name="connectionName",
            create_native_delta_table=False,
            delta_tables=["deltaTables"],
            write_manifest=False
        )],
        dynamo_db_targets=[glue.CfnCrawler.DynamoDBTargetProperty(
            path="path"
        )],
        hudi_targets=[glue.CfnCrawler.HudiTargetProperty(
            connection_name="connectionName",
            exclusions=["exclusions"],
            maximum_traversal_depth=123,
            paths=["paths"]
        )],
        iceberg_targets=[glue.CfnCrawler.IcebergTargetProperty(
            connection_name="connectionName",
            exclusions=["exclusions"],
            maximum_traversal_depth=123,
            paths=["paths"]
        )],
        jdbc_targets=[glue.CfnCrawler.JdbcTargetProperty(
            connection_name="connectionName",
            enable_additional_metadata=["enableAdditionalMetadata"],
            exclusions=["exclusions"],
            path="path"
        )],
        mongo_db_targets=[glue.CfnCrawler.MongoDBTargetProperty(
            connection_name="connectionName",
            path="path"
        )],
        s3_targets=[glue.CfnCrawler.S3TargetProperty(
            connection_name="connectionName",
            dlq_event_queue_arn="dlqEventQueueArn",
            event_queue_arn="eventQueueArn",
            exclusions=["exclusions"],
            path="path",
            sample_size=123
        )]
    ),

    # the properties below are optional
    classifiers=["classifiers"],
    configuration="configuration",
    crawler_security_configuration="crawlerSecurityConfiguration",
    database_name="databaseName",
    description="description",
    lake_formation_configuration=glue.CfnCrawler.LakeFormationConfigurationProperty(
        account_id="accountId",
        use_lake_formation_credentials=False
    ),
    name="name",
    recrawl_policy=glue.CfnCrawler.RecrawlPolicyProperty(
        recrawl_behavior="recrawlBehavior"
    ),
    schedule=glue.CfnCrawler.ScheduleProperty(
        schedule_expression="scheduleExpression"
    ),
    schema_change_policy=glue.CfnCrawler.SchemaChangePolicyProperty(
        delete_behavior="deleteBehavior",
        update_behavior="updateBehavior"
    ),
    table_prefix="tablePrefix",
    tags=tags
)

Attributes

classifiers

A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-classifiers

configuration

Crawler configuration information.

This versioned JSON string allows users to specify aspects of a crawler’s behavior. For more information, see Configuring a Crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-configuration
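Because configuration is a JSON string rather than a structured object, it can help to build it from a dictionary and serialize it. The following is a minimal sketch that uses only the standard library. The specific settings (a Version of 1.0 and partitions inheriting from the table) are assumptions chosen for illustration, not defaults of this class.

```python
import json

# Assumed example settings; adjust them to the behavior your crawler needs.
crawler_configuration = {
    "Version": 1.0,
    "CrawlerOutput": {
        "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"},
    },
}

# The configuration property expects a string, so serialize the dictionary.
configuration_json = json.dumps(crawler_configuration)

# The result would then be passed as:
#   glue.CfnCrawlerProps(..., configuration=configuration_json)
print(configuration_json)
```

Serializing with json.dumps avoids hand-written quoting mistakes in the string.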

crawler_security_configuration

The name of the SecurityConfiguration structure to be used by this crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-crawlersecurityconfiguration

database_name

The name of the database in which the crawler’s output is stored.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-databasename

description

A description of the crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-description

lake_formation_configuration

Specifies whether the crawler should use AWS Lake Formation credentials instead of the IAM role credentials.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-lakeformationconfiguration

name

The name of the crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-name

recrawl_policy

A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-recrawlpolicy

role

The Amazon Resource Name (ARN) of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-role

schedule

For scheduled crawlers, the schedule when the crawler runs.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-schedule
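The schedule is carried by a schedule_expression string, which AWS Glue reads as a cron expression with six fields (minutes, hours, day of month, month, day of week, year), evaluated in UTC. The sketch below builds such a string for a daily run. The helper name daily_cron is hypothetical and is not part of aws_cdk.

```python
def daily_cron(hour: int, minute: int) -> str:
    """Return a cron expression for a daily run at the given UTC time.

    Hypothetical helper for illustration; not part of aws_cdk.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    # Field order: minutes hours day-of-month month day-of-week year
    return f"cron({minute} {hour} * * ? *)"


schedule_expression = daily_cron(hour=12, minute=15)

# The result would then be passed as:
#   glue.CfnCrawler.ScheduleProperty(schedule_expression=schedule_expression)
print(schedule_expression)
```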

schema_change_policy

The policy that specifies update and delete behaviors for the crawler.

The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer’s database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler.

The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-schemachangepolicy
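Both components are plain strings, so a typo is not caught until deployment. As a rough sketch, the values can be checked before they are handed to SchemaChangePolicyProperty. The helper name and the sets of allowed values are assumptions based on the AWS Glue API; verify them against the linked CloudFormation page.

```python
# Assumed allowed values, taken from the AWS Glue API; verify before relying on them.
UPDATE_BEHAVIORS = {"LOG", "UPDATE_IN_DATABASE"}
DELETE_BEHAVIORS = {"LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"}


def schema_change_policy_kwargs(update_behavior: str, delete_behavior: str) -> dict:
    """Validate both behaviors and return keyword arguments for the policy.

    Hypothetical helper for illustration; not part of aws_cdk.
    """
    if update_behavior not in UPDATE_BEHAVIORS:
        raise ValueError(f"unknown update_behavior: {update_behavior!r}")
    if delete_behavior not in DELETE_BEHAVIORS:
        raise ValueError(f"unknown delete_behavior: {delete_behavior!r}")
    return {"update_behavior": update_behavior, "delete_behavior": delete_behavior}


policy_kwargs = schema_change_policy_kwargs("UPDATE_IN_DATABASE", "DEPRECATE_IN_DATABASE")

# The result would then be passed as:
#   glue.CfnCrawler.SchemaChangePolicyProperty(**policy_kwargs)
print(policy_kwargs)
```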

table_prefix

The prefix added to the names of tables that are created.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-tableprefix

tags

The tags to use with this crawler.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-tags

targets

A collection of targets to crawl.

See: http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html#cfn-glue-crawler-targets