Navigation Guide
You are on a Command (operation) page with structural examples. Use the navigation breadcrumb if you would like to return to the Client landing page.

CreateCrawlerCommand

Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.

Example Syntax

Use a bare-bones client and the command you need to make an API call.

import { GlueClient, CreateCrawlerCommand } from "@aws-sdk/client-glue"; // ES Modules import
// const { GlueClient, CreateCrawlerCommand } = require("@aws-sdk/client-glue"); // CommonJS import
const client = new GlueClient(config);
const input = { // CreateCrawlerRequest
  Name: "STRING_VALUE", // required
  Role: "STRING_VALUE", // required
  DatabaseName: "STRING_VALUE",
  Description: "STRING_VALUE",
  Targets: { // CrawlerTargets
    S3Targets: [ // S3TargetList
      { // S3Target
        Path: "STRING_VALUE",
        Exclusions: [ // PathList
          "STRING_VALUE",
        ],
        ConnectionName: "STRING_VALUE",
        SampleSize: Number("int"),
        EventQueueArn: "STRING_VALUE",
        DlqEventQueueArn: "STRING_VALUE",
      },
    ],
    JdbcTargets: [ // JdbcTargetList
      { // JdbcTarget
        ConnectionName: "STRING_VALUE",
        Path: "STRING_VALUE",
        Exclusions: [
          "STRING_VALUE",
        ],
        EnableAdditionalMetadata: [ // EnableAdditionalMetadata
          "COMMENTS" || "RAWTYPES",
        ],
      },
    ],
    MongoDBTargets: [ // MongoDBTargetList
      { // MongoDBTarget
        ConnectionName: "STRING_VALUE",
        Path: "STRING_VALUE",
        ScanAll: true || false,
      },
    ],
    DynamoDBTargets: [ // DynamoDBTargetList
      { // DynamoDBTarget
        Path: "STRING_VALUE",
        scanAll: true || false,
        scanRate: Number("double"),
      },
    ],
    CatalogTargets: [ // CatalogTargetList
      { // CatalogTarget
        DatabaseName: "STRING_VALUE", // required
        Tables: [ // CatalogTablesList // required
          "STRING_VALUE",
        ],
        ConnectionName: "STRING_VALUE",
        EventQueueArn: "STRING_VALUE",
        DlqEventQueueArn: "STRING_VALUE",
      },
    ],
    DeltaTargets: [ // DeltaTargetList
      { // DeltaTarget
        DeltaTables: [
          "STRING_VALUE",
        ],
        ConnectionName: "STRING_VALUE",
        WriteManifest: true || false,
        CreateNativeDeltaTable: true || false,
      },
    ],
    IcebergTargets: [ // IcebergTargetList
      { // IcebergTarget
        Paths: [
          "STRING_VALUE",
        ],
        ConnectionName: "STRING_VALUE",
        Exclusions: [
          "STRING_VALUE",
        ],
        MaximumTraversalDepth: Number("int"),
      },
    ],
    HudiTargets: [ // HudiTargetList
      { // HudiTarget
        Paths: "<PathList>",
        ConnectionName: "STRING_VALUE",
        Exclusions: "<PathList>",
        MaximumTraversalDepth: Number("int"),
      },
    ],
  },
  Schedule: "STRING_VALUE",
  Classifiers: [ // ClassifierNameList
    "STRING_VALUE",
  ],
  TablePrefix: "STRING_VALUE",
  SchemaChangePolicy: { // SchemaChangePolicy
    UpdateBehavior: "LOG" || "UPDATE_IN_DATABASE",
    DeleteBehavior: "LOG" || "DELETE_FROM_DATABASE" || "DEPRECATE_IN_DATABASE",
  },
  RecrawlPolicy: { // RecrawlPolicy
    RecrawlBehavior: "CRAWL_EVERYTHING" || "CRAWL_NEW_FOLDERS_ONLY" || "CRAWL_EVENT_MODE",
  },
  LineageConfiguration: { // LineageConfiguration
    CrawlerLineageSettings: "ENABLE" || "DISABLE",
  },
  LakeFormationConfiguration: { // LakeFormationConfiguration
    UseLakeFormationCredentials: true || false,
    AccountId: "STRING_VALUE",
  },
  Configuration: "STRING_VALUE",
  CrawlerSecurityConfiguration: "STRING_VALUE",
  Tags: { // TagsMap
    "<keys>": "STRING_VALUE",
  },
};
const command = new CreateCrawlerCommand(input);
const response = await client.send(command);
// {};

CreateCrawlerCommand Input

See CreateCrawlerCommandInput for more details

Parameter	Type	Description
`Name` Required	`string \| undefined`	Name of the new crawler.
`Role` Required	`string \| undefined`	The IAM role or Amazon Resource Name (ARN) of an IAM role used by the new crawler to access customer resources.
`Targets` Required	`CrawlerTargets \| undefined`	A list of collection of targets to crawl.
`Classifiers`	`string[] \| undefined`	A list of custom classifiers that the user has registered. By default, all built-in classifiers are included in a crawl, but these custom classifiers always override the default classifiers for a given classification.
`Configuration`	`string \| undefined`	Crawler configuration information. This versioned JSON string allows users to specify aspects of a crawler's behavior. For more information, see Setting crawler configuration options .
`CrawlerSecurityConfiguration`	`string \| undefined`	The name of the `SecurityConfiguration` structure to be used by this crawler.
`DatabaseName`	`string \| undefined`	The Glue database where results are written, such as: `arn:aws:daylight:us-east-1::database/sometable/*`.
`Description`	`string \| undefined`	A description of the new crawler.
`LakeFormationConfiguration`	`LakeFormationConfiguration \| undefined`	Specifies Lake Formation configuration settings for the crawler.
`LineageConfiguration`	`LineageConfiguration \| undefined`	Specifies data lineage configuration settings for the crawler.
`RecrawlPolicy`	`RecrawlPolicy \| undefined`	A policy that specifies whether to crawl the entire dataset again, or to crawl only folders that were added since the last crawler run.
`Schedule`	`string \| undefined`	A `cron` expression used to specify the schedule (see Time-Based Schedules for Jobs and Crawlers . For example, to run something every day at 12:15 UTC, you would specify: `cron(15 12 * * ? *)`.
`SchemaChangePolicy`	`SchemaChangePolicy \| undefined`	The policy for the crawler's update and deletion behavior.
`TablePrefix`	`string \| undefined`	The table prefix used for catalog tables that are created.
`Tags`	`Record<string, string> \| undefined`	The tags to use with this crawler request. You may use tags to limit access to the crawler. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide.

CreateCrawlerCommand Output

See CreateCrawlerCommandOutput for details

Parameter	Type	Description
`$metadata` Required	`ResponseMetadata`	Metadata pertaining to this request.

Throws

Name	Fault	Details
AlreadyExistsException	client	A resource to be created or added already exists.
InvalidInputException	client	The input provided was not valid.
OperationTimeoutException	client	The operation timed out.
ResourceNumberLimitExceededException	client	A resource numerical limit was exceeded.
GlueServiceException		Base exception class for all service exceptions from Glue service.

Getting Started

References

CreateCrawlerCommand

Example Syntax

CreateCrawlerCommand Input

CreateCrawlerCommand Output

Throws

Table of Contents

Search