CfnCrawlerProps
- class aws_cdk.aws_glue.CfnCrawlerProps(*, role, targets, classifiers=None, configuration=None, crawler_security_configuration=None, database_name=None, description=None, lake_formation_configuration=None, name=None, recrawl_policy=None, schedule=None, schema_change_policy=None, table_prefix=None, tags=None)
Bases: object
Properties for defining a CfnCrawler.
- Parameters:
role (str) – The Amazon Resource Name (ARN) of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
targets (Union[IResolvable, TargetsProperty, Dict[str, Any]]) – A collection of targets to crawl.
classifiers (Optional[Sequence[str]]) – A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
configuration (Optional[str]) – Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler’s behavior. For more information, see Configuring a Crawler.
crawler_security_configuration (Optional[str]) – The name of the SecurityConfiguration structure to be used by this crawler.
database_name (Optional[str]) – The name of the database in which the crawler’s output is stored.
description (Optional[str]) – A description of the crawler.
lake_formation_configuration (Union[IResolvable, LakeFormationConfigurationProperty, Dict[str, Any], None]) – Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
name (Optional[str]) – The name of the crawler.
recrawl_policy (Union[IResolvable, RecrawlPolicyProperty, Dict[str, Any], None]) – A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
schedule (Union[IResolvable, ScheduleProperty, Dict[str, Any], None]) – For scheduled crawlers, the schedule when the crawler runs.
schema_change_policy (Union[IResolvable, SchemaChangePolicyProperty, Dict[str, Any], None]) – The policy that specifies update and delete behaviors for the crawler. The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer’s database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
table_prefix (Optional[str]) – The prefix added to the names of tables that are created.
tags (Any) – The tags to use with this crawler.
- See:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-glue-crawler.html
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_glue as glue

# tags: Any

cfn_crawler_props = glue.CfnCrawlerProps(
    role="role",
    targets=glue.CfnCrawler.TargetsProperty(
        catalog_targets=[glue.CfnCrawler.CatalogTargetProperty(
            connection_name="connectionName",
            database_name="databaseName",
            dlq_event_queue_arn="dlqEventQueueArn",
            event_queue_arn="eventQueueArn",
            tables=["tables"]
        )],
        delta_targets=[glue.CfnCrawler.DeltaTargetProperty(
            connection_name="connectionName",
            create_native_delta_table=False,
            delta_tables=["deltaTables"],
            write_manifest=False
        )],
        dynamo_db_targets=[glue.CfnCrawler.DynamoDBTargetProperty(
            path="path"
        )],
        hudi_targets=[glue.CfnCrawler.HudiTargetProperty(
            connection_name="connectionName",
            exclusions=["exclusions"],
            maximum_traversal_depth=123,
            paths=["paths"]
        )],
        iceberg_targets=[glue.CfnCrawler.IcebergTargetProperty(
            connection_name="connectionName",
            exclusions=["exclusions"],
            maximum_traversal_depth=123,
            paths=["paths"]
        )],
        jdbc_targets=[glue.CfnCrawler.JdbcTargetProperty(
            connection_name="connectionName",
            enable_additional_metadata=["enableAdditionalMetadata"],
            exclusions=["exclusions"],
            path="path"
        )],
        mongo_db_targets=[glue.CfnCrawler.MongoDBTargetProperty(
            connection_name="connectionName",
            path="path"
        )],
        s3_targets=[glue.CfnCrawler.S3TargetProperty(
            connection_name="connectionName",
            dlq_event_queue_arn="dlqEventQueueArn",
            event_queue_arn="eventQueueArn",
            exclusions=["exclusions"],
            path="path",
            sample_size=123
        )]
    ),

    # the properties below are optional
    classifiers=["classifiers"],
    configuration="configuration",
    crawler_security_configuration="crawlerSecurityConfiguration",
    database_name="databaseName",
    description="description",
    lake_formation_configuration=glue.CfnCrawler.LakeFormationConfigurationProperty(
        account_id="accountId",
        use_lake_formation_credentials=False
    ),
    name="name",
    recrawl_policy=glue.CfnCrawler.RecrawlPolicyProperty(
        recrawl_behavior="recrawlBehavior"
    ),
    schedule=glue.CfnCrawler.ScheduleProperty(
        schedule_expression="scheduleExpression"
    ),
    schema_change_policy=glue.CfnCrawler.SchemaChangePolicyProperty(
        delete_behavior="deleteBehavior",
        update_behavior="updateBehavior"
    ),
    table_prefix="tablePrefix",
    tags=tags
)
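The configuration placeholder in the example above stands for a versioned JSON string. A minimal sketch of building such a string with the standard library might look like the following. The helper name build_crawler_configuration is hypothetical and not part of the CDK, and the key names (Version, CrawlerOutput, Partitions, AddOrUpdateBehavior) are assumptions based on the AWS Glue crawler configuration guide, so check them against the current documentation before relying on them.

```python
import json


def build_crawler_configuration(inherit_partition_schema: bool = True) -> str:
    """Return a versioned JSON string for the `configuration` property.

    Hypothetical helper. The key names below are assumptions taken from
    the AWS Glue crawler configuration guide, not from this page.
    """
    config = {"Version": 1.0}
    if inherit_partition_schema:
        # Assumed option: partitions inherit their schema from the table.
        config["CrawlerOutput"] = {
            "Partitions": {"AddOrUpdateBehavior": "InheritFromTable"}
        }
    # The property expects a string, so serialize the dict to JSON text.
    return json.dumps(config)


configuration = build_crawler_configuration()
```

The resulting string would then be passed as configuration=configuration when constructing CfnCrawlerProps.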
Attributes
- classifiers
A list of UTF-8 strings that specify the names of custom classifiers that are associated with the crawler.
- configuration
Crawler configuration information.
This versioned JSON string allows users to specify aspects of a crawler’s behavior. For more information, see Configuring a Crawler.
- crawler_security_configuration
The name of the SecurityConfiguration structure to be used by this crawler.
- database_name
The name of the database in which the crawler’s output is stored.
- description
A description of the crawler.
- lake_formation_configuration
Specifies whether the crawler should use AWS Lake Formation credentials for the crawler instead of the IAM role credentials.
- name
The name of the crawler.
- recrawl_policy
A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
- role
The Amazon Resource Name (ARN) of an IAM role that’s used to access customer resources, such as Amazon Simple Storage Service (Amazon S3) data.
- schedule
For scheduled crawlers, the schedule when the crawler runs.
- schema_change_policy
The policy that specifies update and delete behaviors for the crawler.
The policy tells the crawler what to do in the event that it detects a change in a table that already exists in the customer’s database at the time of the crawl. The SchemaChangePolicy does not affect whether or how new tables and partitions are added. New tables and partitions are always created regardless of the SchemaChangePolicy on a crawler. The SchemaChangePolicy consists of two components, UpdateBehavior and DeleteBehavior.
- table_prefix
The prefix added to the names of tables that are created.
- tags
The tags to use with this crawler.
- targets
A collection of targets to crawl.
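The schema_change_policy and recrawl_policy attributes take string-valued behaviors, and a typo in one of them typically surfaces only at deployment time. The sketch below shows one way to validate those strings before they are handed to the property classes. The helper names are hypothetical, and the sets of allowed values are assumptions drawn from the AWS Glue API reference rather than from this page.

```python
# Allowed values are assumptions taken from the AWS Glue API reference;
# verify them against the CloudFormation documentation linked above.
UPDATE_BEHAVIORS = {"LOG", "UPDATE_IN_DATABASE"}
DELETE_BEHAVIORS = {"LOG", "DELETE_FROM_DATABASE", "DEPRECATE_IN_DATABASE"}
RECRAWL_BEHAVIORS = {
    "CRAWL_EVERYTHING",
    "CRAWL_NEW_FOLDERS_ONLY",
    "CRAWL_EVENT_MODE",
}


def _check(value: str, allowed: set, label: str) -> str:
    """Return value unchanged, or raise ValueError if it is not allowed."""
    if value not in allowed:
        raise ValueError(f"{label} must be one of {sorted(allowed)}, got {value!r}")
    return value


def schema_change_policy_kwargs(update_behavior: str, delete_behavior: str) -> dict:
    """Hypothetical helper: validated keyword arguments for SchemaChangePolicyProperty."""
    return {
        "update_behavior": _check(update_behavior, UPDATE_BEHAVIORS, "update_behavior"),
        "delete_behavior": _check(delete_behavior, DELETE_BEHAVIORS, "delete_behavior"),
    }


def recrawl_policy_kwargs(recrawl_behavior: str) -> dict:
    """Hypothetical helper: validated keyword arguments for RecrawlPolicyProperty."""
    return {
        "recrawl_behavior": _check(recrawl_behavior, RECRAWL_BEHAVIORS, "recrawl_behavior"),
    }


policy = schema_change_policy_kwargs("UPDATE_IN_DATABASE", "DEPRECATE_IN_DATABASE")
recrawl = recrawl_policy_kwargs("CRAWL_NEW_FOLDERS_ONLY")
```

Because the dictionary keys match the keyword names used in the example on this page, the results could be unpacked as glue.CfnCrawler.SchemaChangePolicyProperty(**policy) and glue.CfnCrawler.RecrawlPolicyProperty(**recrawl).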