GlueClient

Glue

Defines the public endpoint for the Glue service.

Installation

NPM
npm install @aws-sdk/client-glue
Yarn
yarn add @aws-sdk/client-glue
pnpm
pnpm add @aws-sdk/client-glue

GlueClient Operations

Command
Summary
BatchCreatePartitionCommand

Creates one or more partitions in a batch operation.

BatchDeleteConnectionCommand

Deletes a list of connection definitions from the Data Catalog.

BatchDeletePartitionCommand

Deletes one or more partitions in a batch operation.

BatchDeleteTableCommand

Deletes multiple tables at once.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling BatchDeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.

BatchDeleteTableVersionCommand

Deletes a specified batch of versions of a table.

BatchGetBlueprintsCommand

Retrieves information about a list of blueprints.

BatchGetCrawlersCommand

Returns a list of resource metadata for a given list of crawler names. After calling the ListCrawlers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

BatchGetCustomEntityTypesCommand

Retrieves the details for the custom patterns specified by a list of names.

BatchGetDataQualityResultCommand

Retrieves a list of data quality results for the specified result IDs.

BatchGetDevEndpointsCommand

Returns a list of resource metadata for a given list of development endpoint names. After calling the ListDevEndpoints operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

BatchGetJobsCommand

Returns a list of resource metadata for a given list of job names. After calling the ListJobs operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

BatchGetPartitionCommand

Retrieves partitions in a batch request.

BatchGetTableOptimizerCommand

Returns the configuration for the specified table optimizers.

BatchGetTriggersCommand

Returns a list of resource metadata for a given list of trigger names. After calling the ListTriggers operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

BatchGetWorkflowsCommand

Returns a list of resource metadata for a given list of workflow names. After calling the ListWorkflows operation, you can call this operation to access the data to which you have been granted permissions. This operation supports all IAM permissions, including permission conditions that use tags.

BatchPutDataQualityStatisticAnnotationCommand

Annotate datapoints over time for a specific data quality statistic.

BatchStopJobRunCommand

Stops one or more job runs for a specified job definition.

BatchUpdatePartitionCommand

Updates one or more partitions in a batch operation.

CancelDataQualityRuleRecommendationRunCommand

Cancels the specified recommendation run that was being used to generate rules.

CancelDataQualityRulesetEvaluationRunCommand

Cancels a run where a ruleset is being evaluated against a data source.

CancelMLTaskRunCommand

Cancels (stops) a task run. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can cancel a machine learning task run at any time by calling CancelMLTaskRun with a task run's parent transform's TransformID and the task run's TaskRunId.

CancelStatementCommand

Cancels the statement.

CheckSchemaVersionValidityCommand

Validates the supplied schema. This call has no side effects; it simply validates the supplied schema using DataFormat as the format. Because it does not take a schema set name, no compatibility checks are performed.

CreateBlueprintCommand

Registers a blueprint with Glue.

CreateCatalogCommand

Creates a new catalog in the Glue Data Catalog.

CreateClassifierCommand

Creates a classifier in the user's account. This can be a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field of the request is present.
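To illustrate "depending on which field of the request is present", the sketch below inspects a request-shaped object and reports which classifier kind it would create. The `ClassifierRequest` type and the `classifierKind` helper are hypothetical and only mirror the four field names listed above; the real input type comes from `@aws-sdk/client-glue`. Rejecting requests that set more than one field is a local sanity check in this sketch, not a documented service rule.

```typescript
// Hypothetical, simplified request shape: only the four fields named in the summary.
type ClassifierRequest = {
  GrokClassifier?: Record<string, unknown>;
  XMLClassifier?: Record<string, unknown>;
  JsonClassifier?: Record<string, unknown>;
  CsvClassifier?: Record<string, unknown>;
};

const CLASSIFIER_FIELDS = [
  "GrokClassifier",
  "XMLClassifier",
  "JsonClassifier",
  "CsvClassifier",
] as const;

type ClassifierField = (typeof CLASSIFIER_FIELDS)[number];

// Returns the single classifier field that is set on the request.
// Throws when none or more than one is set (a local check for this sketch).
function classifierKind(request: ClassifierRequest): ClassifierField {
  const present = CLASSIFIER_FIELDS.filter((field) => request[field] !== undefined);
  const [only] = present;
  if (present.length !== 1 || only === undefined) {
    throw new Error(`expected exactly one classifier field, got ${present.length}`);
  }
  return only;
}
```

A caller could run this check before passing the same object to CreateClassifierCommand.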

CreateColumnStatisticsTaskSettingsCommand

Creates settings for a column statistics task.

CreateConnectionCommand

Creates a connection definition in the Data Catalog.

Connections used for creating federated resources require the IAM glue:PassConnection permission.

CreateCrawlerCommand

Creates a new crawler with specified targets, role, configuration, and optional schedule. At least one crawl target must be specified, in the s3Targets field, the jdbcTargets field, or the DynamoDBTargets field.
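The "at least one crawl target" requirement can be checked locally before the request is sent. The following is a minimal sketch, assuming a simplified `CrawlTargets` shape that covers only the three target lists named above; the type, the PascalCase field spelling, and the `hasCrawlTarget` helper are assumptions made for illustration.

```typescript
// Hypothetical, simplified view of the crawl targets named in the summary.
type CrawlTargets = {
  S3Targets?: unknown[];
  JdbcTargets?: unknown[];
  DynamoDBTargets?: unknown[];
};

// True when at least one of the three target lists is present and non-empty.
function hasCrawlTarget(targets: CrawlTargets): boolean {
  return [targets.S3Targets, targets.JdbcTargets, targets.DynamoDBTargets].some(
    (list) => list !== undefined && list.length > 0,
  );
}
```

An empty list is treated as "no target" here, which is a choice made for this sketch.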

CreateCustomEntityTypeCommand

Creates a custom pattern that is used to detect sensitive data across the columns and rows of your structured data.

Each custom pattern you create specifies a regular expression and an optional list of context words. If no context words are passed, only the regular expression is checked.

CreateDataQualityRulesetCommand

Creates a data quality ruleset with DQDL rules applied to a specified Glue table.

You create the ruleset using the Data Quality Definition Language (DQDL). For more information, see the Glue developer guide.

CreateDatabaseCommand

Creates a new database in a Data Catalog.

CreateDevEndpointCommand

Creates a new development endpoint.

CreateIntegrationCommand

Creates a Zero-ETL integration in the caller's account between two resources with Amazon Resource Names (ARNs): the SourceArn and TargetArn.

CreateIntegrationResourcePropertyCommand

This API can be used for setting up the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target). These properties can include the role to access the connection or database. To set both source and target properties the same API needs to be invoked with the Glue connection ARN as ResourceArn with SourceProcessingProperties and the Glue database ARN as ResourceArn with TargetProcessingProperties respectively.

CreateIntegrationTablePropertiesCommand

This API is used to provide optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for the source and target tables. To set both source and target properties, the same API needs to be invoked with the Glue connection ARN as ResourceArn with SourceTableConfig, and the Glue database ARN as ResourceArn with TargetTableConfig respectively.

CreateJobCommand

Creates a new job definition.

CreateMLTransformCommand

Creates a Glue machine learning transform. This operation creates the transform and all the necessary parameters to train it.

Call this operation as the first step in the process of using a machine learning transform (such as the FindMatches transform) for deduplicating data. You can provide an optional Description, in addition to the parameters that you want to use for your algorithm.

You must also specify certain parameters for the tasks that Glue runs on your behalf as part of learning from your data and creating a high-quality machine learning transform. These parameters include Role, and optionally, AllocatedCapacity, Timeout, and MaxRetries. For more information, see Jobs.

CreatePartitionCommand

Creates a new partition.

CreatePartitionIndexCommand

Creates a specified partition index in an existing table.

CreateRegistryCommand

Creates a new registry which may be used to hold a collection of schemas.

CreateSchemaCommand

Creates a new schema set and registers the schema definition. Returns an error if the schema set already exists without actually registering the version.

When the schema set is created, a version checkpoint will be set to the first version. Compatibility mode "DISABLED" restricts any additional schema versions from being added after the first schema version. For all other compatibility modes, validation of compatibility settings will be applied only from the second version onwards when the RegisterSchemaVersion API is used.

When this API is called without a RegistryId, this will create an entry for a "default-registry" in the registry database tables, if it is not already present.
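The version rules described above can be summarized as a small decision function. This is a sketch under stated assumptions: the `Compatibility` union lists mode names that are not given in the text (only "DISABLED" is), and `decideVersion` and its return labels are hypothetical names used for illustration.

```typescript
// Only "DISABLED" is named in the text above; the other values are an
// assumption about the available compatibility modes.
type Compatibility =
  | "NONE" | "DISABLED" | "BACKWARD" | "BACKWARD_ALL"
  | "FORWARD" | "FORWARD_ALL" | "FULL" | "FULL_ALL";

type VersionDecision = "allowed-unchecked" | "allowed-checked" | "rejected";

// versionNumber is the 1-based number the new schema version would receive.
function decideVersion(mode: Compatibility, versionNumber: number): VersionDecision {
  if (versionNumber < 1) throw new RangeError("versionNumber starts at 1");
  // The first version is registered without compatibility validation.
  if (versionNumber === 1) return "allowed-unchecked";
  // DISABLED restricts any versions after the first.
  if (mode === "DISABLED") return "rejected";
  // All other modes: validation applies from the second version onwards.
  return "allowed-checked";
}
```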

CreateScriptCommand

Transforms a directed acyclic graph (DAG) into code.

CreateSecurityConfigurationCommand

Creates a new security configuration. A security configuration is a set of security properties that can be used by Glue. You can use a security configuration to encrypt data at rest. For information about using security configurations in Glue, see Encrypting Data Written by Crawlers, Jobs, and Development Endpoints.

CreateSessionCommand

Creates a new session.

CreateTableCommand

Creates a new table definition in the Data Catalog.

CreateTableOptimizerCommand

Creates a new table optimizer for a specific function.

CreateTriggerCommand

Creates a new trigger.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Amazon Web Services Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

CreateUsageProfileCommand

Creates a Glue usage profile.

CreateUserDefinedFunctionCommand

Creates a new function definition in the Data Catalog.

CreateWorkflowCommand

Creates a new workflow.

DeleteBlueprintCommand

Deletes an existing blueprint.

DeleteCatalogCommand

Removes the specified catalog from the Glue Data Catalog.

After completing this operation, you no longer have access to the databases, tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted catalog. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources before calling the DeleteCatalog operation, use DeleteTableVersion (or BatchDeleteTableVersion), DeletePartition (or BatchDeletePartition), DeleteTable (or BatchDeleteTable), DeleteUserDefinedFunction and DeleteDatabase to delete any resources that belong to the catalog.
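One way to keep the cleanup sequence straight is to encode it as data. The sketch below uses the operation names from the paragraph above, in the order it lists them (which also deletes child resources before their parents). The `orderCleanup` helper is hypothetical, and treating that listing order as the execution order is an assumption.

```typescript
// Operation names from the paragraph above, in the order it lists them,
// ending with the DeleteCatalog call itself.
const CATALOG_CLEANUP_ORDER = [
  "DeleteTableVersion",
  "DeletePartition",
  "DeleteTable",
  "DeleteUserDefinedFunction",
  "DeleteDatabase",
  "DeleteCatalog",
] as const;

type CleanupStep = (typeof CATALOG_CLEANUP_ORDER)[number];

// Hypothetical helper: returns a copy of the pending steps sorted into that order.
function orderCleanup(steps: CleanupStep[]): CleanupStep[] {
  const rank = (step: CleanupStep): number => CATALOG_CLEANUP_ORDER.indexOf(step);
  return [...steps].sort((a, b) => rank(a) - rank(b));
}
```

The batch variants (BatchDeleteTableVersion, BatchDeletePartition, BatchDeleteTable) would occupy the same positions as their single-item counterparts.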

DeleteClassifierCommand

Removes a classifier from the Data Catalog.

DeleteColumnStatisticsForPartitionCommand

Deletes the partition column statistics of a column.

The Identity and Access Management (IAM) permission required for this operation is DeletePartition.

DeleteColumnStatisticsForTableCommand

Deletes table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is DeleteTable.

DeleteColumnStatisticsTaskSettingsCommand

Deletes settings for a column statistics task.

DeleteConnectionCommand

Deletes a connection from the Data Catalog.

DeleteCrawlerCommand

Removes a specified crawler from the Glue Data Catalog, unless the crawler state is RUNNING.

DeleteCustomEntityTypeCommand

Deletes a custom pattern by specifying its name.

DeleteDataQualityRulesetCommand

Deletes a data quality ruleset.

DeleteDatabaseCommand

Removes a specified database from a Data Catalog.

After completing this operation, you no longer have access to the tables (and all table versions and partitions that might belong to the tables) and the user-defined functions in the deleted database. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteDatabase, use DeleteTableVersion or BatchDeleteTableVersion, DeletePartition or BatchDeletePartition, DeleteUserDefinedFunction, and DeleteTable or BatchDeleteTable, to delete any resources that belong to the database.

DeleteDevEndpointCommand

Deletes a specified development endpoint.

DeleteIntegrationCommand

Deletes the specified Zero-ETL integration.

DeleteIntegrationTablePropertiesCommand

Deletes the table properties that have been created for the tables that need to be replicated.

DeleteJobCommand

Deletes a specified job definition. If the job definition is not found, no exception is thrown.

DeleteMLTransformCommand

Deletes a Glue machine learning transform. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. If you no longer need a transform, you can delete it by calling DeleteMLTransform. However, any Glue jobs that still reference the deleted transform will no longer succeed.

DeletePartitionCommand

Deletes a specified partition.

DeletePartitionIndexCommand

Deletes a specified partition index from an existing table.

DeleteRegistryCommand

Delete the entire registry including schema and all of its versions. To get the status of the delete operation, you can call the GetRegistry API after the asynchronous call. Deleting a registry will deactivate all online operations for the registry such as the UpdateRegistry, CreateSchema, UpdateSchema, and RegisterSchemaVersion APIs.

DeleteResourcePolicyCommand

Deletes a specified policy.

DeleteSchemaCommand

Deletes the entire schema set, including the schema set and all of its versions. To get the status of the delete operation, you can call the GetSchema API after the asynchronous call. Deleting a schema will deactivate all online operations for the schema, such as the GetSchemaByDefinition and RegisterSchemaVersion APIs.

DeleteSchemaVersionsCommand

Remove versions from the specified schema. A version number or range may be supplied. If the compatibility mode forbids deleting of a version that is necessary, such as BACKWARDS_FULL, an error is returned. Calling the GetSchemaVersions API after this call will list the status of the deleted versions.

When the range of version numbers contains a checkpointed version, the API will return a 409 conflict and will not proceed with the deletion. You have to remove the checkpoint first using the DeleteSchemaCheckpoint API before using this API.

You cannot use the DeleteSchemaVersions API to delete the first schema version in the schema set. The first schema version can only be deleted by the DeleteSchema API. This operation will also delete the attached SchemaVersionMetadata under the schema versions. Hard deletes will be enforced on the database.
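The restrictions above lend themselves to a local pre-flight check. The following is a minimal sketch, assuming the version selector is either a single number ("5") or an inclusive range ("5-8"); that string format, the `checkDeletableRange` helper, and its messages are assumptions made for illustration, and the service remains the final authority.

```typescript
// Hypothetical pre-flight check mirroring the rules described above.
// Returns a reason string when the request would likely be refused, else null.
function checkDeletableRange(range: string, checkpointVersion: number): string | null {
  // Accept "N" or "N-M" with optional surrounding whitespace (assumed format).
  const match = /^\s*(\d+)\s*(?:-\s*(\d+)\s*)?$/.exec(range);
  if (!match) return "unrecognized version range";
  const first = Number(match[1]);
  const last = match[2] !== undefined ? Number(match[2]) : first;
  if (first > last) return "range start is greater than range end";
  if (first <= 1) return "the first schema version can only be deleted with DeleteSchema";
  if (checkpointVersion >= first && checkpointVersion <= last) {
    return "range contains the checkpointed version; remove the checkpoint first";
  }
  return null; // no local objection; the service may still reject the call
}
```

For example, with the checkpoint on version 6, `checkDeletableRange("5-8", 6)` reports the checkpoint conflict, while `checkDeletableRange("7-8", 6)` returns `null`.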

DeleteSecurityConfigurationCommand

Deletes a specified security configuration.

DeleteSessionCommand

Deletes the session.

DeleteTableCommand

Removes a table definition from the Data Catalog.

After completing this operation, you no longer have access to the table versions and partitions that belong to the deleted table. Glue deletes these "orphaned" resources asynchronously in a timely manner, at the discretion of the service.

To ensure the immediate deletion of all related resources, before calling DeleteTable, use DeleteTableVersion or BatchDeleteTableVersion, and DeletePartition or BatchDeletePartition, to delete any resources that belong to the table.

DeleteTableOptimizerCommand

Deletes an optimizer and all associated metadata for a table. The optimization will no longer be performed on the table.

DeleteTableVersionCommand

Deletes a specified version of a table.

DeleteTriggerCommand

Deletes a specified trigger. If the trigger is not found, no exception is thrown.

DeleteUsageProfileCommand

Deletes the specified Glue usage profile.

DeleteUserDefinedFunctionCommand

Deletes an existing function definition from the Data Catalog.

DeleteWorkflowCommand

Deletes a workflow.

DescribeConnectionTypeCommand

The DescribeConnectionType API provides full details of the supported options for a given connection type in Glue.

DescribeEntityCommand

Provides details regarding the entity used with the connection type, with a description of the data model for each field in the selected entity.

The response includes all the fields which make up the entity.

DescribeInboundIntegrationsCommand

Returns a list of inbound integrations for the specified integration.

DescribeIntegrationsCommand

The API is used to retrieve a list of integrations.

GetBlueprintCommand

Retrieves the details of a blueprint.

GetBlueprintRunCommand

Retrieves the details of a blueprint run.

GetBlueprintRunsCommand

Retrieves the details of blueprint runs for a specified blueprint.

GetCatalogCommand

Retrieves the specified catalog. The catalog name should be all lowercase.

GetCatalogImportStatusCommand

Retrieves the status of a migration operation.

GetCatalogsCommand

Retrieves all catalogs defined in a catalog in the Glue Data Catalog. For a Redshift-federated catalog use case, this operation returns the list of catalogs mapped to Redshift databases in the Redshift namespace catalog.

GetClassifierCommand

Retrieve a classifier by name.

GetClassifiersCommand

Lists all classifier objects in the Data Catalog.

GetColumnStatisticsForPartitionCommand

Retrieves partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetPartition.

GetColumnStatisticsForTableCommand

Retrieves table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is GetTable.

GetColumnStatisticsTaskRunCommand

Get the associated metadata/information for a task run, given a task run ID.

GetColumnStatisticsTaskRunsCommand

Retrieves information about all runs associated with the specified table.

GetColumnStatisticsTaskSettingsCommand

Gets settings for a column statistics task.

GetConnectionCommand

Retrieves a connection definition from the Data Catalog.

GetConnectionsCommand

Retrieves a list of connection definitions from the Data Catalog.

GetCrawlerCommand

Retrieves metadata for a specified crawler.

GetCrawlerMetricsCommand

Retrieves metrics about specified crawlers.

GetCrawlersCommand

Retrieves metadata for all crawlers defined in the customer account.

GetCustomEntityTypeCommand

Retrieves the details of a custom pattern by specifying its name.

GetDataCatalogEncryptionSettingsCommand

Retrieves the security configuration for a specified catalog.

GetDataQualityModelCommand

Retrieve the training status of the model along with more information (CompletedOn, StartedOn, FailureReason).

GetDataQualityModelResultCommand

Retrieve a statistic's predictions for a given Profile ID.

GetDataQualityResultCommand

Retrieves the result of a data quality rule evaluation.

GetDataQualityRuleRecommendationRunCommand

Gets the specified recommendation run that was used to generate rules.

GetDataQualityRulesetCommand

Returns an existing ruleset by identifier or name.

GetDataQualityRulesetEvaluationRunCommand

Retrieves a specific run where a ruleset is evaluated against a data source.

GetDatabaseCommand

Retrieves the definition of a specified database.

GetDatabasesCommand

Retrieves all databases defined in a given Data Catalog.

GetDataflowGraphCommand

Transforms a Python script into a directed acyclic graph (DAG).

GetDevEndpointCommand

Retrieves information about a specified development endpoint.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address, and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.
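Because only one of the two addresses is populated, client code usually has to pick whichever is present. The sketch below shows one way to do that; the `EndpointAddresses` type, its field names, and the `reachableAddress` helper are assumptions modelled on the private and public address fields described above.

```typescript
// Hypothetical, simplified endpoint shape with the two address fields.
type EndpointAddresses = { PrivateAddress?: string; PublicAddress?: string };

type Reachable = { kind: "private" | "public"; address: string };

// Returns the populated address: private for VPC endpoints, public otherwise.
// Returns undefined when neither field is set.
function reachableAddress(endpoint: EndpointAddresses): Reachable | undefined {
  if (endpoint.PrivateAddress) return { kind: "private", address: endpoint.PrivateAddress };
  if (endpoint.PublicAddress) return { kind: "public", address: endpoint.PublicAddress };
  return undefined;
}
```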

GetDevEndpointsCommand

Retrieves all the development endpoints in this Amazon Web Services account.

When you create a development endpoint in a virtual private cloud (VPC), Glue returns only a private IP address and the public IP address field is not populated. When you create a non-VPC development endpoint, Glue returns only a public IP address.

GetEntityRecordsCommand

This API is used to query preview data from a given connection type or from a native Amazon S3 based Glue Data Catalog.

Returns records as an array of JSON blobs. Each record is formatted using Jackson JsonNode based on the field type defined by the DescribeEntity API.

Spark connectors generate schemas according to the same data type mapping as in the DescribeEntity API. Spark connectors convert data to the appropriate data types matching the schema when returning rows.

GetIntegrationResourcePropertyCommand

This API is used for fetching the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target).

GetIntegrationTablePropertiesCommand

This API is used to retrieve optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for source and target tables.

GetJobBookmarkCommand

Returns information on a job bookmark entry.

For more information about enabling and using job bookmarks, see:

GetJobCommand

Retrieves an existing job definition.

GetJobRunCommand

Retrieves the metadata for a given job run. Job run history is accessible for 365 days for your workflow and job run.

GetJobRunsCommand

Retrieves metadata for all runs of a given job definition.

GetJobRuns returns the job runs in chronological order, with the newest jobs returned first.

GetJobsCommand

Retrieves all current job definitions.

GetMLTaskRunCommand

Gets details for a specific task run on a machine learning transform. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can check the stats of any task run by calling GetMLTaskRun with the TaskRunID and its parent transform's TransformID.

GetMLTaskRunsCommand

Gets a list of runs for a machine learning transform. Machine learning task runs are asynchronous tasks that Glue runs on your behalf as part of various machine learning workflows. You can get a sortable, filterable list of machine learning task runs by calling GetMLTaskRuns with their parent transform's TransformID and other optional parameters as documented in this section.

This operation returns a list of historic runs and must be paginated.
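A paginated operation is typically drained by passing each response's continuation token into the next request. The sketch below shows that loop in a generic form so it runs without the SDK; `Page`, `collectAllPages`, and the `maxPages` cap are hypothetical names and choices made for illustration. With the real client, `fetchPage` would wrap a `client.send(new GetMLTaskRunsCommand(...))` call and map the response's run list and token into a `Page`.

```typescript
// One page of results plus an optional continuation token.
type Page<T> = { items: T[]; nextToken?: string };

// Calls fetchPage repeatedly, feeding back the previous token,
// until a page arrives without a token.
async function collectAllPages<T>(
  fetchPage: (nextToken?: string) => Promise<Page<T>>,
  maxPages = 1000, // safety cap chosen for this sketch
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(token);
    all.push(...page.items);
    token = page.nextToken;
    if (!token) return all; // no token means this was the last page
  }
  throw new Error(`gave up after ${maxPages} pages`);
}
```

The cap guards against a service or stub that keeps returning tokens; raise or remove it as appropriate.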

GetMLTransformCommand

Gets a Glue machine learning transform artifact and all its corresponding metadata. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue. You can retrieve their metadata by calling GetMLTransform.

GetMLTransformsCommand

Gets a sortable, filterable list of existing Glue machine learning transforms. Machine learning transforms are a special type of transform that use machine learning to learn the details of the transformation to be performed by learning from examples provided by humans. These transformations are then saved by Glue, and you can retrieve their metadata by calling GetMLTransforms.

GetMappingCommand

Creates mappings.

GetPartitionCommand

Retrieves information about a specified partition.

GetPartitionIndexesCommand

Retrieves the partition indexes associated with a table.

GetPartitionsCommand

Retrieves information about the partitions in a table.

GetPlanCommand

Gets code to perform a specified mapping.

GetRegistryCommand

Describes the specified registry in detail.

GetResourcePoliciesCommand

Retrieves the resource policies set on individual resources by Resource Access Manager during cross-account permission grants. Also retrieves the Data Catalog resource policy.

If you enabled metadata encryption in Data Catalog settings, and you do not have permission on the KMS key, the operation can't return the Data Catalog resource policy.

GetResourcePolicyCommand

Retrieves a specified resource policy.

GetSchemaByDefinitionCommand

Retrieves a schema by the SchemaDefinition. The schema definition is sent to the Schema Registry, canonicalized, and hashed. If the hash is matched within the scope of the SchemaName or ARN (or the default registry, if none is supplied), that schema’s metadata is returned. Otherwise, a 404 or NotFound error is returned. Schema versions in Deleted statuses will not be included in the results.

GetSchemaCommand

Describes the specified schema in detail.

GetSchemaVersionCommand

Get the specified schema by its unique ID assigned when a version of the schema is created or registered. Schema versions in Deleted status will not be included in the results.

GetSchemaVersionsDiffCommand

Fetches the schema version difference in the specified difference type between two stored schema versions in the Schema Registry.

This API allows you to compare two schema versions between two schema definitions under the same schema.

GetSecurityConfigurationCommand

Retrieves a specified security configuration.

GetSecurityConfigurationsCommand

Retrieves a list of all security configurations.

GetSessionCommand

Retrieves the session.

GetStatementCommand

Retrieves the statement.

GetTableCommand

Retrieves the Table definition in a Data Catalog for a specified table.

GetTableOptimizerCommand

Returns the configuration of all optimizers associated with a specified table.

GetTableVersionCommand

Retrieves a specified version of a table.

GetTableVersionsCommand

Retrieves a list of strings that identify available versions of a specified table.

GetTablesCommand

Retrieves the definitions of some or all of the tables in a given Database.

GetTagsCommand

Retrieves a list of tags associated with a resource.

GetTriggerCommand

Retrieves the definition of a trigger.

GetTriggersCommand

Gets all the triggers associated with a job.

GetUnfilteredPartitionMetadataCommand

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

For IAM authorization, the public IAM action associated with this API is glue:GetPartition.

GetUnfilteredPartitionsMetadataCommand

Retrieves partition metadata from the Data Catalog that contains unfiltered metadata.

For IAM authorization, the public IAM action associated with this API is glue:GetPartitions.

GetUnfilteredTableMetadataCommand

Allows a third-party analytical engine to retrieve unfiltered table metadata from the Data Catalog.

For IAM authorization, the public IAM action associated with this API is glue:GetTable.

GetUsageProfileCommand

Retrieves information about the specified Glue usage profile.

GetUserDefinedFunctionCommand

Retrieves a specified function definition from the Data Catalog.

GetUserDefinedFunctionsCommand

Retrieves multiple function definitions from the Data Catalog.

GetWorkflowCommand

Retrieves resource metadata for a workflow.

GetWorkflowRunCommand

Retrieves the metadata for a given workflow run. Job run history is accessible for 90 days for your workflow and job run.

GetWorkflowRunPropertiesCommand

Retrieves the workflow run properties which were set during the run.

GetWorkflowRunsCommand

Retrieves metadata for all runs of a given workflow.

ImportCatalogToGlueCommand

Imports an existing Amazon Athena Data Catalog to Glue.

ListBlueprintsCommand

Lists all the blueprint names in an account.

ListColumnStatisticsTaskRunsCommand

List all task runs for a particular account.

ListConnectionTypesCommand

The ListConnectionTypes API provides a discovery mechanism to learn available connection types in Glue. The response contains a list of connection types with high-level details of what is supported for each connection type. The connection types listed are the set of supported options for the ConnectionType value in the CreateConnection API.

ListCrawlersCommand

Retrieves the names of all crawler resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.

ListCrawlsCommand

Returns all the crawls of a specified crawler. Returns only the crawls that have occurred since the launch date of the crawler history feature, and only retains up to 12 months of crawls. Older crawls will not be returned.

You may use this API to:

  • Retrieve all the crawls of a specified crawler.

  • Retrieve all the crawls of a specified crawler within a limited count.

  • Retrieve all the crawls of a specified crawler in a specific time range.

  • Retrieve all the crawls of a specified crawler with a particular state, crawl ID, or DPU hour value.

ListCustomEntityTypesCommand

Lists all the custom patterns that have been created.

ListDataQualityResultsCommand

Returns all data quality execution results for your account.

ListDataQualityRuleRecommendationRunsCommand

Lists the recommendation runs meeting the filter criteria.

ListDataQualityRulesetEvaluationRunsCommand

Lists all the runs meeting the filter criteria, where a ruleset is evaluated against a data source.

ListDataQualityRulesetsCommand

Returns a paginated list of rulesets for the specified list of Glue tables.

ListDataQualityStatisticAnnotationsCommand

Retrieve annotations for a data quality statistic.

ListDataQualityStatisticsCommand

Retrieves a list of data quality statistics.

ListDevEndpointsCommand

Retrieves the names of all DevEndpoint resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.

ListEntitiesCommand

Returns the available entities supported by the connection type.

ListJobsCommand

Retrieves the names of all job resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.

ListMLTransformsCommand

Retrieves a sortable, filterable list of existing Glue machine learning transforms in this Amazon Web Services account, or the resources with the specified tag. This operation takes the optional Tags field, which you can use as a filter of the responses so that tagged resources can be retrieved as a group. If you choose to use tag filtering, only resources with the tags are retrieved.

ListRegistriesCommand

Returns a list of registries that you have created, with minimal registry information. Registries in the Deleting status will not be included in the results. Empty results will be returned if there are no registries available.

ListSchemaVersionsCommand

Returns a list of schema versions that you have created, with minimal information. Schema versions in Deleted status will not be included in the results. Empty results will be returned if there are no schema versions available.

ListSchemasCommand

Returns a list of schemas with minimal details. Schemas in Deleting status will not be included in the results. Empty results will be returned if there are no schemas available.

When the RegistryId is not provided, all the schemas across registries will be part of the API response.

ListSessionsCommand

Retrieve a list of sessions.

ListStatementsCommand

Lists statements for the session.

ListTableOptimizerRunsCommand

Lists the history of previous optimizer runs for a specific table.

ListTriggersCommand

Retrieves the names of all trigger resources in this Amazon Web Services account, or the resources with the specified tag. This operation allows you to see which resources are available in your account, and their names.

This operation takes the optional Tags field, which you can use as a filter on the response so that tagged resources can be retrieved as a group. If you choose to use tags filtering, only resources with the tag are retrieved.

ListUsageProfilesCommand

List all the Glue usage profiles.

ListWorkflowsCommand

Lists names of workflows created in the account.

ModifyIntegrationCommand

Modifies a Zero-ETL integration in the caller's account.

PutDataCatalogEncryptionSettingsCommand

Sets the security configuration for a specified catalog. After the configuration has been set, the specified encryption is applied to every catalog write thereafter.

PutDataQualityProfileAnnotationCommand

Annotate all datapoints for a Profile.

PutResourcePolicyCommand

Sets the Data Catalog resource policy for access control.

PutSchemaVersionMetadataCommand

Puts the metadata key-value pair for a specified schema version ID. A maximum of 10 key-value pairs is allowed per schema version. They can be added over one or more calls.
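As a rough sketch of the limit described above, the helper below turns a map of metadata into one request input per pair and refuses to exceed ten pairs per schema version. `buildMetadataInputs` and its `alreadyStored` parameter are hypothetical; the input shape shown is a subset assumed from the operation's request, and the service remains the authority on the actual limit.

```typescript
// Subset of the PutSchemaVersionMetadata request used by this sketch.
interface PutSchemaVersionMetadataInput {
  SchemaVersionId: string;
  MetadataKeyValue: { MetadataKey: string; MetadataValue: string };
}

// Limit stated in the operation summary.
const MAX_PAIRS_PER_SCHEMA_VERSION = 10;

// Builds one input per key-value pair; each call to the service carries one pair.
// alreadyStored is the caller's own count of pairs previously added (assumption).
function buildMetadataInputs(
  schemaVersionId: string,
  pairs: Record<string, string>,
  alreadyStored = 0
): PutSchemaVersionMetadataInput[] {
  const keys = Object.keys(pairs);
  if (alreadyStored + keys.length > MAX_PAIRS_PER_SCHEMA_VERSION) {
    throw new RangeError(
      `at most ${MAX_PAIRS_PER_SCHEMA_VERSION} metadata pairs are allowed per schema version`
    );
  }
  return keys.map((key) => ({
    SchemaVersionId: schemaVersionId,
    MetadataKeyValue: { MetadataKey: key, MetadataValue: pairs[key] },
  }));
}

// Each element can then be sent with: client.send(new PutSchemaVersionMetadataCommand(input))
```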

PutWorkflowRunPropertiesCommand

Puts the specified workflow run properties for the given workflow run. If a property already exists for the specified run, its value is overridden; otherwise, the property is added to the existing properties.
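The override-or-add behavior described above can be modeled locally as a plain object merge. This is only an illustration of the documented semantics, not a service call; `mergeRunProperties` is a hypothetical helper name.

```typescript
// Run properties are string-to-string pairs attached to a workflow run.
type RunProperties = Record<string, string>;

// Local model: incoming values win for existing keys, new keys are added,
// and keys that are not mentioned keep their current value.
function mergeRunProperties(
  existing: RunProperties,
  incoming: RunProperties
): RunProperties {
  return { ...existing, ...incoming };
}

// Example: "stage" is overridden, "batch" is added, "owner" is untouched.
const afterPut = mergeRunProperties(
  { stage: "raw", owner: "etl" },
  { stage: "curated", batch: "42" }
);
```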

QuerySchemaVersionMetadataCommand

Queries for the schema version metadata information.

RegisterSchemaVersionCommand

Adds a new version to the existing schema. Returns an error if the new version of the schema does not meet the compatibility requirements of the schema set. This API will not create a new schema set and will return a 404 error if the schema set is not already present in the Schema Registry.

If this is the first schema definition to be registered in the Schema Registry, this API will store the schema version and return immediately. Otherwise, this call has the potential to run longer than other operations due to compatibility modes. You can call the GetSchemaVersion API with the SchemaVersionId to check compatibility modes.

If the same schema definition is already stored in Schema Registry as a version, the schema ID of the existing schema is returned to the caller.

RemoveSchemaVersionMetadataCommand

Removes a key-value pair from the schema version metadata for the specified schema version ID.

ResetJobBookmarkCommand

Resets a bookmark entry.

For more information about enabling and using job bookmarks, see the Glue documentation on job bookmarks.

ResumeWorkflowRunCommand

Restarts selected nodes of a previous partially completed workflow run and resumes the workflow run. The selected nodes and all nodes that are downstream from the selected nodes are run.

RunStatementCommand

Executes the statement.

SearchTablesCommand

Searches a set of tables based on properties in the table metadata as well as on the parent database. You can search against text or filter conditions.

You can only get tables that you have access to based on the security policies defined in Lake Formation. You need at least read-only access to a table for it to be returned. If you do not have access to all the columns in the table, these columns will not be searched against when returning the list of tables back to you. If you have access to the columns but not the data in the columns, those columns and the associated metadata for those columns will be included in the search.

StartBlueprintRunCommand

Starts a new run of the specified blueprint.

StartColumnStatisticsTaskRunCommand

Starts a column statistics task run, for a specified table and columns.

StartColumnStatisticsTaskRunScheduleCommand

Starts a column statistics task run schedule.

StartCrawlerCommand

Starts a crawl using the specified crawler, regardless of what is scheduled. If the crawler is already running, returns a CrawlerRunningException.
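A small sketch of how a caller might treat the already-running case as a non-fatal outcome, assuming the thrown error carries the exception name in its `name` property. `StartCrawlerFn` and `startCrawlerIfIdle` are hypothetical names; the call itself is injected so the logic can be tried with a stub.

```typescript
// Hypothetical injection point: any function that performs one StartCrawler call.
type StartCrawlerFn = (input: { Name: string }) => Promise<unknown>;

// Starts the crawler and reports "already-running" instead of failing when
// the service rejects the call with CrawlerRunningException.
async function startCrawlerIfIdle(
  startCrawler: StartCrawlerFn,
  name: string
): Promise<"started" | "already-running"> {
  try {
    await startCrawler({ Name: name });
    return "started";
  } catch (err) {
    if (err instanceof Error && err.name === "CrawlerRunningException") {
      return "already-running";
    }
    throw err; // anything else is a real failure
  }
}

// Wiring with the real client (needs @aws-sdk/client-glue and credentials):
//   const outcome = await startCrawlerIfIdle(
//     (input) => client.send(new StartCrawlerCommand(input)),
//     "my-crawler" // hypothetical crawler name
//   );
```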

StartCrawlerScheduleCommand

Changes the schedule state of the specified crawler to SCHEDULED, unless the crawler is already running or the schedule state is already SCHEDULED.

StartDataQualityRuleRecommendationRunCommand

Starts a recommendation run that is used to generate rules when you don't know what rules to write. Glue Data Quality analyzes the data and comes up with recommendations for a potential ruleset. You can then triage the ruleset and modify the generated ruleset to your liking.

Recommendation runs are automatically deleted after 90 days.

StartDataQualityRulesetEvaluationRunCommand

Once you have a ruleset definition (either recommended or your own), you call this operation to evaluate the ruleset against a data source (Glue table). The evaluation computes results which you can retrieve with the GetDataQualityResult API.

StartExportLabelsTaskRunCommand

Begins an asynchronous task to export all labeled data for a particular transform. This task is the only label-related API call that is not part of the typical active learning workflow. You typically use StartExportLabelsTaskRun when you want to work with all of your existing labels at the same time, such as when you want to remove or change labels that were previously submitted as truth. This API operation accepts the TransformId whose labels you want to export and an Amazon Simple Storage Service (Amazon S3) path to export the labels to. The operation returns a TaskRunId. You can check on the status of your task run by calling the GetMLTaskRun API.

StartImportLabelsTaskRunCommand

Enables you to provide additional labels (examples of truth) to be used to teach the machine learning transform and improve its quality. This API operation is generally used as part of the active learning workflow that starts with the StartMLLabelingSetGenerationTaskRun call and that ultimately results in improving the quality of your machine learning transform.

After the StartMLLabelingSetGenerationTaskRun finishes, Glue machine learning will have generated a series of questions for humans to answer. (Answering these questions is often called 'labeling' in the machine learning workflows). In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?” After the labeling process is finished, users upload their answers/labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform use the new and improved labels and perform a higher-quality transformation.

By default, StartMLLabelingSetGenerationTaskRun continually learns from and combines all labels that you upload unless you set Replace to true. If you set Replace to true, StartImportLabelsTaskRun deletes and forgets all previously uploaded labels and learns only from the exact set that you upload. Replacing labels can be helpful if you realize that you previously uploaded incorrect labels, and you believe that they are having a negative effect on your transform quality.

You can check on the status of your task run by calling the GetMLTaskRun operation.
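The status check mentioned above can be sketched as a polling loop. This assumes the GetMLTaskRun response exposes a `Status` field and that STOPPED, SUCCEEDED, FAILED, and TIMEOUT are terminal; `waitForTaskRun`, its delay, and its poll cap are hypothetical choices rather than SDK features.

```typescript
type TaskStatus =
  | "STARTING" | "RUNNING" | "STOPPING"
  | "STOPPED" | "SUCCEEDED" | "FAILED" | "TIMEOUT";

// Subset of the GetMLTaskRun request/response fields used by this sketch.
interface GetMLTaskRunInput { TransformId: string; TaskRunId: string }
interface GetMLTaskRunOutput { Status?: TaskStatus }
type GetMLTaskRunFn = (input: GetMLTaskRunInput) => Promise<GetMLTaskRunOutput>;

// Statuses treated as final by this sketch (assumption).
const TERMINAL_STATUSES: TaskStatus[] = ["STOPPED", "SUCCEEDED", "FAILED", "TIMEOUT"];

// Polls until the task run reaches a terminal status or the poll cap is hit.
async function waitForTaskRun(
  getTaskRun: GetMLTaskRunFn,
  transformId: string,
  taskRunId: string,
  delayMs = 5000,
  maxPolls = 60
): Promise<TaskStatus> {
  for (let poll = 0; poll < maxPolls; poll++) {
    const { Status } = await getTaskRun({ TransformId: transformId, TaskRunId: taskRunId });
    if (Status !== undefined && TERMINAL_STATUSES.indexOf(Status) !== -1) {
      return Status;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`task run ${taskRunId} did not finish after ${maxPolls} polls`);
}

// Wiring with the real client (needs @aws-sdk/client-glue and credentials):
//   const status = await waitForTaskRun(
//     (input) => client.send(new GetMLTaskRunCommand(input)),
//     transformId, taskRunId
//   );
```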

StartJobRunCommand

Starts a job run using a job definition.

StartMLEvaluationTaskRunCommand

Starts a task to estimate the quality of the transform.

When you provide label sets as examples of truth, Glue machine learning uses some of those examples to learn from them. The rest of the labels are used as a test to estimate quality.

Returns a unique identifier for the run. You can call GetMLTaskRun to get more information about the stats of the EvaluationTaskRun.

StartMLLabelingSetGenerationTaskRunCommand

Starts the active learning workflow for your machine learning transform to improve the transform's quality by generating label sets and adding labels.

When the StartMLLabelingSetGenerationTaskRun finishes, Glue will have generated a "labeling set" or a set of questions for humans to answer.

In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?”

After the labeling process is finished, you can upload your labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.

StartTriggerCommand

Starts an existing trigger. See Triggering Jobs for information about how different types of trigger are started.

StartWorkflowRunCommand

Starts a new run of the specified workflow.

StopColumnStatisticsTaskRunCommand

Stops a task run for the specified table.

StopColumnStatisticsTaskRunScheduleCommand

Stops a column statistics task run schedule.

StopCrawlerCommand

If the specified crawler is running, stops the crawl.

StopCrawlerScheduleCommand

Sets the schedule state of the specified crawler to NOT_SCHEDULED, but does not stop the crawler if it is already running.

StopSessionCommand

Stops the session.

StopTriggerCommand

Stops a specified trigger.

StopWorkflowRunCommand

Stops the execution of the specified workflow run.

TagResourceCommand

Adds tags to a resource. A tag is a label you can assign to an Amazon Web Services resource. In Glue, you can tag only certain resources. For information about what resources you can tag, see Amazon Web Services Tags in Glue.

TestConnectionCommand

Tests a connection to a service to validate the service credentials that you provide.

You can either provide an existing connection name or a TestConnectionInput for testing a non-existing connection input. Providing both at the same time will cause an error.

If the action is successful, the service sends back an HTTP 200 response.
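Because supplying both a connection name and a TestConnectionInput is an error, a caller may want to check the request before sending it. The sketch below is one way to do that; `buildTestConnectionRequest` is a hypothetical helper, the TestConnectionInput payload is typed loosely as a placeholder, and rejecting the case where neither is given is an assumption of this sketch rather than documented behavior.

```typescript
// Subset of the TestConnection request; the real TestConnectionInput type
// is more structured than this placeholder.
interface TestConnectionRequest {
  ConnectionName?: string;
  TestConnectionInput?: Record<string, unknown>;
}

// Returns a request carrying exactly one of the two alternatives.
function buildTestConnectionRequest(target: {
  connectionName?: string;
  testInput?: Record<string, unknown>;
}): TestConnectionRequest {
  const hasName = target.connectionName !== undefined;
  const hasInput = target.testInput !== undefined;
  if (hasName && hasInput) {
    throw new Error("provide either connectionName or testInput, not both");
  }
  if (!hasName && !hasInput) {
    throw new Error("provide one of connectionName or testInput");
  }
  return hasName
    ? { ConnectionName: target.connectionName }
    : { TestConnectionInput: target.testInput };
}

// The result can then be sent with: client.send(new TestConnectionCommand(request))
```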

UntagResourceCommand

Removes tags from a resource.

UpdateBlueprintCommand

Updates a registered blueprint.

UpdateCatalogCommand

Updates an existing catalog's properties in the Glue Data Catalog.

UpdateClassifierCommand

Modifies an existing classifier (a GrokClassifier, an XMLClassifier, a JsonClassifier, or a CsvClassifier, depending on which field is present).

UpdateColumnStatisticsForPartitionCommand

Creates or updates partition statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is UpdatePartition.

UpdateColumnStatisticsForTableCommand

Creates or updates table statistics of columns.

The Identity and Access Management (IAM) permission required for this operation is UpdateTable.

UpdateColumnStatisticsTaskSettingsCommand

Updates settings for a column statistics task.

UpdateConnectionCommand

Updates a connection definition in the Data Catalog.

UpdateCrawlerCommand

Updates a crawler. If a crawler is running, you must stop it using StopCrawler before updating it.

UpdateCrawlerScheduleCommand

Updates the schedule of a crawler using a cron expression.
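As an illustration of the cron expression this operation expects, the sketch below builds a daily schedule in the six-field `cron(Minutes Hours Day-of-month Month Day-of-week Year)` form used by Glue schedules, with times in UTC. `dailyCrawlerSchedule` is a hypothetical helper and the input shape is a subset assumed from the operation's request.

```typescript
// Subset of the UpdateCrawlerSchedule request used by this sketch.
interface UpdateCrawlerScheduleInput {
  CrawlerName: string;
  Schedule?: string;
}

// Builds a request that runs the crawler once a day at the given UTC time.
function dailyCrawlerSchedule(
  crawlerName: string,
  hourUtc: number,
  minute: number
): UpdateCrawlerScheduleInput {
  const isWhole = (n: number) => Math.floor(n) === n;
  if (!isWhole(hourUtc) || hourUtc < 0 || hourUtc > 23) {
    throw new RangeError("hourUtc must be an integer from 0 to 23");
  }
  if (!isWhole(minute) || minute < 0 || minute > 59) {
    throw new RangeError("minute must be an integer from 0 to 59");
  }
  return {
    CrawlerName: crawlerName,
    // Fields: minutes, hours, day-of-month, month, day-of-week, year.
    Schedule: `cron(${minute} ${hourUtc} * * ? *)`,
  };
}

// The result can then be sent with: client.send(new UpdateCrawlerScheduleCommand(input))
```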

UpdateDataQualityRulesetCommand

Updates the specified data quality ruleset.

UpdateDatabaseCommand

Updates an existing database definition in a Data Catalog.

UpdateDevEndpointCommand

Updates a specified development endpoint.

UpdateIntegrationResourcePropertyCommand

This API can be used for updating the ResourceProperty of the Glue connection (for the source) or Glue database ARN (for the target). These properties can include the role to access the connection or database. Since the same resource can be used across multiple integrations, updating resource properties will impact all the integrations using it.

UpdateIntegrationTablePropertiesCommand

This API is used to provide optional override properties for the tables that need to be replicated. These properties can include properties for filtering and partitioning for the source and target tables. To set both source and target properties, the same API needs to be invoked twice: once with the Glue connection ARN as ResourceArn with SourceTableConfig, and once with the Glue database ARN as ResourceArn with TargetTableConfig.

The override will be reflected across all the integrations using the same ResourceArn and source table.

UpdateJobCommand

Updates an existing job definition. The previous job definition is completely overwritten by this information.

UpdateJobFromSourceControlCommand

Synchronizes a job from the source control repository. This operation takes the job artifacts that are located in the remote repository and updates the Glue internal stores with these artifacts.

This API supports optional parameters which take in the repository information.

UpdateMLTransformCommand

Updates an existing machine learning transform. Call this operation to tune the algorithm parameters to achieve better results.

After calling this operation, you can call the StartMLEvaluationTaskRun operation to assess how well your new parameters achieved your goals (such as improving the quality of your machine learning transform, or making it more cost-effective).

UpdatePartitionCommand

Updates a partition.

UpdateRegistryCommand

Updates an existing registry which is used to hold a collection of schemas. The updated properties relate to the registry, and do not modify any of the schemas within the registry.

UpdateSchemaCommand

Updates the description, compatibility setting, or version checkpoint for a schema set.

For updating the compatibility setting, the call will not validate compatibility for the entire set of schema versions with the new compatibility setting. If the value for Compatibility is provided, the VersionNumber (a checkpoint) is also required. The API will validate the checkpoint version number for consistency.

If the value for the VersionNumber (checkpoint) is provided, Compatibility is optional and this can be used to set/reset a checkpoint for the schema.

This update will happen only if the schema is in the AVAILABLE state.
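The rule that a Compatibility change must come with a checkpoint version can be checked before the request is sent. The sketch below assumes the checkpoint is carried as `SchemaVersionNumber.VersionNumber` in the request; `buildUpdateSchemaInput` is a hypothetical helper, and it does not cover marking the latest version as the checkpoint.

```typescript
type Compatibility =
  | "NONE" | "DISABLED" | "BACKWARD" | "BACKWARD_ALL"
  | "FORWARD" | "FORWARD_ALL" | "FULL" | "FULL_ALL";

// Subset of the UpdateSchema request used by this sketch.
interface SchemaId { SchemaArn?: string; SchemaName?: string; RegistryName?: string }
interface UpdateSchemaInput {
  SchemaId: SchemaId;
  SchemaVersionNumber?: { LatestVersion?: boolean; VersionNumber?: number };
  Compatibility?: Compatibility;
  Description?: string;
}

// Builds the request, enforcing: Compatibility requires a checkpoint version,
// while a checkpoint version on its own is allowed.
function buildUpdateSchemaInput(
  schemaId: SchemaId,
  change: { compatibility?: Compatibility; checkpointVersion?: number; description?: string }
): UpdateSchemaInput {
  if (change.compatibility !== undefined && change.checkpointVersion === undefined) {
    throw new Error("a checkpoint VersionNumber is required when Compatibility is provided");
  }
  const input: UpdateSchemaInput = { SchemaId: schemaId };
  if (change.checkpointVersion !== undefined) {
    input.SchemaVersionNumber = { VersionNumber: change.checkpointVersion };
  }
  if (change.compatibility !== undefined) {
    input.Compatibility = change.compatibility;
  }
  if (change.description !== undefined) {
    input.Description = change.description;
  }
  return input;
}

// The result can then be sent with: client.send(new UpdateSchemaCommand(input))
```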

UpdateSourceControlFromJobCommand

Synchronizes a job to the source control repository. This operation takes the job artifacts from the Glue internal stores and makes a commit to the remote repository that is configured on the job.

This API supports optional parameters which take in the repository information.

UpdateTableCommand

Updates a metadata table in the Data Catalog.

UpdateTableOptimizerCommand

Updates the configuration for an existing table optimizer.

UpdateTriggerCommand

Updates a trigger definition.

Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Amazon Web Services Secrets Manager or other secret management mechanism if you intend to keep them within the Job.

UpdateUsageProfileCommand

Updates a Glue usage profile.

UpdateUserDefinedFunctionCommand

Updates an existing function definition in the Data Catalog.

UpdateWorkflowCommand

Updates an existing workflow.

GlueClient Configuration

Parameter
Type
Description
defaultsMode
Optional
DefaultsMode | Provider<DefaultsMode>
The @smithy/smithy-client#DefaultsMode that will be used to determine how certain default configuration options are resolved in the SDK.
disableHostPrefix
Optional
boolean
Disable dynamically changing the endpoint of the client based on the hostPrefix trait of an operation.
extensions
Optional
RuntimeExtension[]
Optional extensions
logger
Optional
Logger
Optional logger for logging debug/info/warn/error.
maxAttempts
Optional
number | Provider<number>
Value for how many times a request will be made at most in case of retry.
profile
Optional
string
Setting a client profile is similar to setting a value for the AWS_PROFILE environment variable. Setting a profile on a client in code only affects the single client instance, unlike AWS_PROFILE. When set, and only for environments where an AWS configuration file exists, fields configurable by this file will be retrieved from the specified profile within that file. Conflicting code configuration and environment variables will still have higher priority. For client credential resolution that involves checking the AWS configuration file, the client's profile (this value) will be used unless a different profile is set in the credential provider options.
region
Optional
string | Provider<string>
The AWS region to which this client will send requests.
requestHandler
Optional
__HttpHandlerUserInput
The HTTP handler to use or its constructor options. Uses Fetch in the browser and Https in Node.js.
retryMode
Optional
string | Provider<string>
Specifies which retry algorithm to use.
useDualstackEndpoint
Optional
boolean | Provider<boolean>
Enables IPv6/IPv4 dualstack endpoint.
useFipsEndpoint
Optional
boolean | Provider<boolean>
Enables FIPS compatible endpoints.
Additional config fields are described in the full configuration type: GlueClientConfig
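A brief sketch of how several of the parameters above might be combined into one configuration object. The `GlueClientOptions` interface is a local stand-in covering only a few fields from the table, `withOverrides` is a hypothetical helper, and the values shown are illustrative rather than recommended settings.

```typescript
// Local stand-in for a few fields of the client configuration listed above.
interface GlueClientOptions {
  region?: string;
  maxAttempts?: number;
  retryMode?: string;
  useDualstackEndpoint?: boolean;
  useFipsEndpoint?: boolean;
  profile?: string;
}

// Shared defaults for every client an application creates (illustrative values).
const baseOptions: GlueClientOptions = {
  region: "us-east-1",
  maxAttempts: 5,        // upper bound on attempts per request, retries included
  retryMode: "adaptive", // retry algorithm; "standard" is another accepted value
};

// Layers per-client overrides on top of the shared defaults.
function withOverrides(overrides: GlueClientOptions): GlueClientOptions {
  return { ...baseOptions, ...overrides };
}

const fipsOptions = withOverrides({ useFipsEndpoint: true });

// Wiring with the real client (needs @aws-sdk/client-glue):
//   const client = new GlueClient(fipsOptions);
```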