AWS::Glue::Crawler S3Target

Specifies a data store in Amazon Simple Storage Service (Amazon S3).

Syntax

To declare this entity in your AWS CloudFormation template, use the following syntax:

JSON


{
  "ConnectionName" : String,
  "DlqEventQueueArn" : String,
  "EventQueueArn" : String,
  "Exclusions" : [ String, ... ],
  "Path" : String,
  "SampleSize" : Integer
}

YAML


  ConnectionName: String
  DlqEventQueueArn: String
  EventQueueArn: String
  Exclusions: 
    - String
  Path: String
  SampleSize: Integer

Properties

ConnectionName

The name of a connection that allows a job or crawler to access data in Amazon S3 within an Amazon Virtual Private Cloud (Amazon VPC) environment.

Required: No

Type: String

Update requires: No interruption

DlqEventQueueArn

The ARN of a valid Amazon SQS dead-letter queue. For example, arn:aws:sqs:region:account:deadLetterQueue.

Required: No

Type: String

Update requires: No interruption

EventQueueArn

The ARN of a valid Amazon SQS queue. For example, arn:aws:sqs:region:account:sqs.

Required: No

Type: String

Update requires: No interruption
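
As an illustration of how EventQueueArn and DlqEventQueueArn might appear together, the following is a minimal sketch of a single S3 target entry. The bucket name, Region, account ID, and queue names are placeholders, not values from this reference, and the sketch assumes the crawler is otherwise configured to consume events from the queue.

```yaml
# Hypothetical S3 target entry; every name and ARN below is a placeholder.
- Path: s3://amzn-s3-demo-bucket/events/
  # Queue that the crawler reads change events from (placeholder ARN).
  EventQueueArn: arn:aws:sqs:us-east-1:111122223333:crawler-events
  # Dead-letter queue for events that could not be processed (placeholder ARN).
  DlqEventQueueArn: arn:aws:sqs:us-east-1:111122223333:crawler-events-dlq
```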

Exclusions

A list of glob patterns that identify the objects to exclude from the crawl. For more information, see Catalog Tables with a Crawler.

Required: No

Type: Array of String

Update requires: No interruption
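
A short sketch of an Exclusions list is shown here for illustration. The bucket name and the patterns are assumptions chosen for the example; check the glob syntax described in Catalog Tables with a Crawler before relying on a particular pattern.

```yaml
# Hypothetical S3 target entry; the bucket and patterns are placeholders.
- Path: s3://amzn-s3-demo-bucket/data/
  Exclusions:
    - "archive/**"   # intended to skip everything under the archive/ folder
    - "**.tmp"       # intended to skip objects ending in .tmp at any depth
```

The patterns are quoted so that the YAML parser treats the leading asterisks as plain string content.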

Path

The path to the Amazon S3 target.

Required: No

Type: String

Update requires: No interruption

SampleSize

Sets the number of files in each leaf folder to be crawled when crawling sample files in a dataset. If not set, all the files are crawled. A valid value is an integer between 1 and 249.

Required: No

Type: Integer

Update requires: No interruption
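
Examples

The following is a minimal sketch of a crawler that declares one S3 target with Path and SampleSize. The logical ID, role ARN, database name, and bucket name are hypothetical placeholders; replace them with values from your own account.

```yaml
Resources:
  SampleCrawler:                 # hypothetical logical ID
    Type: AWS::Glue::Crawler
    Properties:
      # Placeholder IAM role that the crawler assumes.
      Role: arn:aws:iam::111122223333:role/ExampleGlueCrawlerRole
      # Placeholder Data Catalog database for the crawler output.
      DatabaseName: example_db
      Targets:
        S3Targets:
          - Path: s3://amzn-s3-demo-bucket/data/
            # Crawl at most 10 files in each leaf folder (valid range is 1 to 249).
            SampleSize: 10
```

Omitting SampleSize in this sketch would cause all files under the path to be crawled, as described in the SampleSize property.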
