CreateJobCommand
Creates a new job definition.
Example Syntax
Use a bare-bones client and the command you need to make an API call.
import { GlueClient, CreateJobCommand } from "@aws-sdk/client-glue"; // ES Modules import
// const { GlueClient, CreateJobCommand } = require("@aws-sdk/client-glue"); // CommonJS import
const client = new GlueClient(config); // config: your client configuration object, e.g. { region: "us-east-1" }
const input = { // CreateJobRequest
Name: "STRING_VALUE", // required
JobMode: "SCRIPT" || "VISUAL" || "NOTEBOOK",
JobRunQueuingEnabled: true || false,
Description: "STRING_VALUE",
LogUri: "STRING_VALUE",
Role: "STRING_VALUE", // required
ExecutionProperty: { // ExecutionProperty
MaxConcurrentRuns: Number("int"),
},
Command: { // JobCommand
Name: "STRING_VALUE",
ScriptLocation: "STRING_VALUE",
PythonVersion: "STRING_VALUE",
Runtime: "STRING_VALUE",
},
DefaultArguments: { // GenericMap
"<keys>": "STRING_VALUE",
},
NonOverridableArguments: {
"<keys>": "STRING_VALUE",
},
Connections: { // ConnectionsList
Connections: [ // OrchestrationStringList
"STRING_VALUE",
],
},
MaxRetries: Number("int"),
AllocatedCapacity: Number("int"),
Timeout: Number("int"),
MaxCapacity: Number("double"),
SecurityConfiguration: "STRING_VALUE",
Tags: { // TagsMap
"<keys>": "STRING_VALUE",
},
NotificationProperty: { // NotificationProperty
NotifyDelayAfter: Number("int"),
},
GlueVersion: "STRING_VALUE",
NumberOfWorkers: Number("int"),
WorkerType: "Standard" || "G.1X" || "G.2X" || "G.025X" || "G.4X" || "G.8X" || "Z.2X",
CodeGenConfigurationNodes: { // CodeGenConfigurationNodes
"<keys>": { // CodeGenConfigurationNode
AthenaConnectorSource: { // AthenaConnectorSource
Name: "STRING_VALUE", // required
ConnectionName: "STRING_VALUE", // required
ConnectorName: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
ConnectionTable: "STRING_VALUE",
SchemaName: "STRING_VALUE", // required
OutputSchemas: [ // GlueSchemas
{ // GlueSchema
Columns: [ // GlueStudioSchemaColumnList
{ // GlueStudioSchemaColumn
Name: "STRING_VALUE", // required
Type: "STRING_VALUE",
},
],
},
],
},
JDBCConnectorSource: { // JDBCConnectorSource
Name: "STRING_VALUE", // required
ConnectionName: "STRING_VALUE", // required
ConnectorName: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
AdditionalOptions: { // JDBCConnectorOptions
FilterPredicate: "STRING_VALUE",
PartitionColumn: "STRING_VALUE",
LowerBound: Number("long"),
UpperBound: Number("long"),
NumPartitions: Number("long"),
JobBookmarkKeys: [ // EnclosedInStringProperties
"STRING_VALUE",
],
JobBookmarkKeysSortOrder: "STRING_VALUE",
DataTypeMapping: { // JDBCDataTypeMapping
"<keys>": "DATE" || "STRING" || "TIMESTAMP" || "INT" || "FLOAT" || "LONG" || "BIGDECIMAL" || "BYTE" || "SHORT" || "DOUBLE",
},
},
ConnectionTable: "STRING_VALUE",
Query: "STRING_VALUE",
OutputSchemas: [
{
Columns: [
{
Name: "STRING_VALUE", // required
Type: "STRING_VALUE",
},
],
},
],
},
SparkConnectorSource: { // SparkConnectorSource
Name: "STRING_VALUE", // required
ConnectionName: "STRING_VALUE", // required
ConnectorName: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
AdditionalOptions: { // AdditionalOptions
"<keys>": "STRING_VALUE",
},
OutputSchemas: [
{
Columns: [
{
Name: "STRING_VALUE", // required
Type: "STRING_VALUE",
},
],
},
],
},
CatalogSource: { // CatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
RedshiftSource: { // RedshiftSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
RedshiftTmpDir: "STRING_VALUE",
TmpDirIAMRole: "STRING_VALUE",
},
S3CatalogSource: { // S3CatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
PartitionPredicate: "STRING_VALUE",
AdditionalOptions: { // S3SourceAdditionalOptions
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
},
},
S3CsvSource: { // S3CsvSource
Name: "STRING_VALUE", // required
Paths: [ // required
"STRING_VALUE",
],
CompressionType: "gzip" || "bzip2",
Exclusions: [
"STRING_VALUE",
],
GroupSize: "STRING_VALUE",
GroupFiles: "STRING_VALUE",
Recurse: true || false,
MaxBand: Number("int"),
MaxFilesInBand: Number("int"),
AdditionalOptions: { // S3DirectSourceAdditionalOptions
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
EnableSamplePath: true || false,
SamplePath: "STRING_VALUE",
},
Separator: "comma" || "ctrla" || "pipe" || "semicolon" || "tab", // required
Escaper: "STRING_VALUE",
QuoteChar: "quote" || "quillemet" || "single_quote" || "disabled", // required
Multiline: true || false,
WithHeader: true || false,
WriteHeader: true || false,
SkipFirst: true || false,
OptimizePerformance: true || false,
OutputSchemas: [
{
Columns: [
{
Name: "STRING_VALUE", // required
Type: "STRING_VALUE",
},
],
},
],
},
S3JsonSource: { // S3JsonSource
Name: "STRING_VALUE", // required
Paths: [ // required
"STRING_VALUE",
],
CompressionType: "gzip" || "bzip2",
Exclusions: [
"STRING_VALUE",
],
GroupSize: "STRING_VALUE",
GroupFiles: "STRING_VALUE",
Recurse: true || false,
MaxBand: Number("int"),
MaxFilesInBand: Number("int"),
AdditionalOptions: {
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
EnableSamplePath: true || false,
SamplePath: "STRING_VALUE",
},
JsonPath: "STRING_VALUE",
Multiline: true || false,
OutputSchemas: [
{
Columns: [
{
Name: "STRING_VALUE", // required
Type: "STRING_VALUE",
},
],
},
],
},
S3ParquetSource: { // S3ParquetSource
Name: "STRING_VALUE", // required
Paths: "<EnclosedInStringProperties>", // required
CompressionType: "snappy" || "lzo" || "gzip" || "uncompressed" || "none",
Exclusions: "<EnclosedInStringProperties>",
GroupSize: "STRING_VALUE",
GroupFiles: "STRING_VALUE",
Recurse: true || false,
MaxBand: Number("int"),
MaxFilesInBand: Number("int"),
AdditionalOptions: {
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
EnableSamplePath: true || false,
SamplePath: "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
RelationalCatalogSource: { // RelationalCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
DynamoDBCatalogSource: { // DynamoDBCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
JDBCConnectorTarget: { // JDBCConnectorTarget
Name: "STRING_VALUE", // required
Inputs: [ // OneInput // required
"STRING_VALUE",
],
ConnectionName: "STRING_VALUE", // required
ConnectionTable: "STRING_VALUE", // required
ConnectorName: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
AdditionalOptions: {
"<keys>": "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
SparkConnectorTarget: { // SparkConnectorTarget
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
ConnectionName: "STRING_VALUE", // required
ConnectorName: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
AdditionalOptions: {
"<keys>": "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
CatalogTarget: { // BasicCatalogTarget
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
PartitionKeys: [ // GlueStudioPathList
"<EnclosedInStringProperties>",
],
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
RedshiftTarget: { // RedshiftTarget
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
RedshiftTmpDir: "STRING_VALUE",
TmpDirIAMRole: "STRING_VALUE",
UpsertRedshiftOptions: { // UpsertRedshiftTargetOptions
TableLocation: "STRING_VALUE",
ConnectionName: "STRING_VALUE",
UpsertKeys: [ // EnclosedInStringPropertiesMinOne
"STRING_VALUE",
],
},
},
S3CatalogTarget: { // S3CatalogTarget
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
PartitionKeys: [
"<EnclosedInStringProperties>",
],
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
SchemaChangePolicy: { // CatalogSchemaChangePolicy
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
},
},
S3GlueParquetTarget: { // S3GlueParquetTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: [
"<EnclosedInStringProperties>",
],
Path: "STRING_VALUE", // required
Compression: "snappy" || "lzo" || "gzip" || "uncompressed" || "none",
SchemaChangePolicy: { // DirectSchemaChangePolicy
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
},
},
S3DirectTarget: { // S3DirectTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: [
"<EnclosedInStringProperties>",
],
Path: "STRING_VALUE", // required
Compression: "STRING_VALUE",
Format: "json" || "csv" || "avro" || "orc" || "parquet" || "hudi" || "delta", // required
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
},
},
ApplyMapping: { // ApplyMapping
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Mapping: [ // Mappings // required
{ // Mapping
ToKey: "STRING_VALUE",
FromPath: "<EnclosedInStringProperties>",
FromType: "STRING_VALUE",
ToType: "STRING_VALUE",
Dropped: true || false,
Children: [
{
ToKey: "STRING_VALUE",
FromPath: "<EnclosedInStringProperties>",
FromType: "STRING_VALUE",
ToType: "STRING_VALUE",
Dropped: true || false,
Children: "<Mappings>",
},
],
},
],
},
SelectFields: { // SelectFields
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Paths: [ // required
"<EnclosedInStringProperties>",
],
},
DropFields: { // DropFields
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Paths: "<GlueStudioPathList>", // required
},
RenameField: { // RenameField
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
SourcePath: "<EnclosedInStringProperties>", // required
TargetPath: "<EnclosedInStringProperties>", // required
},
Spigot: { // Spigot
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Path: "STRING_VALUE", // required
Topk: Number("int"),
Prob: Number("double"),
},
Join: { // Join
Name: "STRING_VALUE", // required
Inputs: [ // TwoInputs // required
"STRING_VALUE",
],
JoinType: "equijoin" || "left" || "right" || "outer" || "leftsemi" || "leftanti", // required
Columns: [ // JoinColumns // required
{ // JoinColumn
From: "STRING_VALUE", // required
Keys: "<GlueStudioPathList>", // required
},
],
},
SplitFields: { // SplitFields
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Paths: "<GlueStudioPathList>", // required
},
SelectFromCollection: { // SelectFromCollection
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Index: Number("int"), // required
},
FillMissingValues: { // FillMissingValues
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
ImputedPath: "STRING_VALUE", // required
FilledPath: "STRING_VALUE",
},
Filter: { // Filter
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
LogicalOperator: "AND" || "OR", // required
Filters: [ // FilterExpressions // required
{ // FilterExpression
Operation: "EQ" || "LT" || "GT" || "LTE" || "GTE" || "REGEX" || "ISNULL", // required
Negated: true || false,
Values: [ // FilterValues // required
{ // FilterValue
Type: "COLUMNEXTRACTED" || "CONSTANT", // required
Value: "<EnclosedInStringProperties>", // required
},
],
},
],
},
CustomCode: { // CustomCode
Name: "STRING_VALUE", // required
Inputs: [ // ManyInputs // required
"STRING_VALUE",
],
Code: "STRING_VALUE", // required
ClassName: "STRING_VALUE", // required
OutputSchemas: "<GlueSchemas>",
},
SparkSQL: { // SparkSQL
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
SqlQuery: "STRING_VALUE", // required
SqlAliases: [ // SqlAliases // required
{ // SqlAlias
From: "STRING_VALUE", // required
Alias: "STRING_VALUE", // required
},
],
OutputSchemas: "<GlueSchemas>",
},
DirectKinesisSource: { // DirectKinesisSource
Name: "STRING_VALUE", // required
WindowSize: Number("int"),
DetectSchema: true || false,
StreamingOptions: { // KinesisStreamingSourceOptions
EndpointUrl: "STRING_VALUE",
StreamName: "STRING_VALUE",
Classification: "STRING_VALUE",
Delimiter: "STRING_VALUE",
StartingPosition: "latest" || "trim_horizon" || "earliest" || "timestamp",
MaxFetchTimeInMs: Number("long"),
MaxFetchRecordsPerShard: Number("long"),
MaxRecordPerRead: Number("long"),
AddIdleTimeBetweenReads: true || false,
IdleTimeBetweenReadsInMs: Number("long"),
DescribeShardInterval: Number("long"),
NumRetries: Number("int"),
RetryIntervalMs: Number("long"),
MaxRetryIntervalMs: Number("long"),
AvoidEmptyBatches: true || false,
StreamArn: "STRING_VALUE",
RoleArn: "STRING_VALUE",
RoleSessionName: "STRING_VALUE",
AddRecordTimestamp: "STRING_VALUE",
EmitConsumerLagMetrics: "STRING_VALUE",
StartingTimestamp: new Date("TIMESTAMP"),
},
DataPreviewOptions: { // StreamingDataPreviewOptions
PollingTime: Number("long"),
RecordPollingLimit: Number("long"),
},
},
DirectKafkaSource: { // DirectKafkaSource
Name: "STRING_VALUE", // required
StreamingOptions: { // KafkaStreamingSourceOptions
BootstrapServers: "STRING_VALUE",
SecurityProtocol: "STRING_VALUE",
ConnectionName: "STRING_VALUE",
TopicName: "STRING_VALUE",
Assign: "STRING_VALUE",
SubscribePattern: "STRING_VALUE",
Classification: "STRING_VALUE",
Delimiter: "STRING_VALUE",
StartingOffsets: "STRING_VALUE",
EndingOffsets: "STRING_VALUE",
PollTimeoutMs: Number("long"),
NumRetries: Number("int"),
RetryIntervalMs: Number("long"),
MaxOffsetsPerTrigger: Number("long"),
MinPartitions: Number("int"),
IncludeHeaders: true || false,
AddRecordTimestamp: "STRING_VALUE",
EmitConsumerLagMetrics: "STRING_VALUE",
StartingTimestamp: new Date("TIMESTAMP"),
},
WindowSize: Number("int"),
DetectSchema: true || false,
DataPreviewOptions: {
PollingTime: Number("long"),
RecordPollingLimit: Number("long"),
},
},
CatalogKinesisSource: { // CatalogKinesisSource
Name: "STRING_VALUE", // required
WindowSize: Number("int"),
DetectSchema: true || false,
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
StreamingOptions: {
EndpointUrl: "STRING_VALUE",
StreamName: "STRING_VALUE",
Classification: "STRING_VALUE",
Delimiter: "STRING_VALUE",
StartingPosition: "latest" || "trim_horizon" || "earliest" || "timestamp",
MaxFetchTimeInMs: Number("long"),
MaxFetchRecordsPerShard: Number("long"),
MaxRecordPerRead: Number("long"),
AddIdleTimeBetweenReads: true || false,
IdleTimeBetweenReadsInMs: Number("long"),
DescribeShardInterval: Number("long"),
NumRetries: Number("int"),
RetryIntervalMs: Number("long"),
MaxRetryIntervalMs: Number("long"),
AvoidEmptyBatches: true || false,
StreamArn: "STRING_VALUE",
RoleArn: "STRING_VALUE",
RoleSessionName: "STRING_VALUE",
AddRecordTimestamp: "STRING_VALUE",
EmitConsumerLagMetrics: "STRING_VALUE",
StartingTimestamp: new Date("TIMESTAMP"),
},
DataPreviewOptions: {
PollingTime: Number("long"),
RecordPollingLimit: Number("long"),
},
},
CatalogKafkaSource: { // CatalogKafkaSource
Name: "STRING_VALUE", // required
WindowSize: Number("int"),
DetectSchema: true || false,
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
StreamingOptions: {
BootstrapServers: "STRING_VALUE",
SecurityProtocol: "STRING_VALUE",
ConnectionName: "STRING_VALUE",
TopicName: "STRING_VALUE",
Assign: "STRING_VALUE",
SubscribePattern: "STRING_VALUE",
Classification: "STRING_VALUE",
Delimiter: "STRING_VALUE",
StartingOffsets: "STRING_VALUE",
EndingOffsets: "STRING_VALUE",
PollTimeoutMs: Number("long"),
NumRetries: Number("int"),
RetryIntervalMs: Number("long"),
MaxOffsetsPerTrigger: Number("long"),
MinPartitions: Number("int"),
IncludeHeaders: true || false,
AddRecordTimestamp: "STRING_VALUE",
EmitConsumerLagMetrics: "STRING_VALUE",
StartingTimestamp: new Date("TIMESTAMP"),
},
DataPreviewOptions: {
PollingTime: Number("long"),
RecordPollingLimit: Number("long"),
},
},
DropNullFields: { // DropNullFields
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
NullCheckBoxList: { // NullCheckBoxList
IsEmpty: true || false,
IsNullString: true || false,
IsNegOne: true || false,
},
NullTextList: [ // NullValueFields
{ // NullValueField
Value: "STRING_VALUE", // required
Datatype: { // Datatype
Id: "STRING_VALUE", // required
Label: "STRING_VALUE", // required
},
},
],
},
Merge: { // Merge
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
Source: "STRING_VALUE", // required
PrimaryKeys: "<GlueStudioPathList>", // required
},
Union: { // Union
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
UnionType: "ALL" || "DISTINCT", // required
},
PIIDetection: { // PIIDetection
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PiiType: "RowAudit" || "RowMasking" || "ColumnAudit" || "ColumnMasking", // required
EntityTypesToDetect: "<EnclosedInStringProperties>", // required
OutputColumnName: "STRING_VALUE",
SampleFraction: Number("double"),
ThresholdFraction: Number("double"),
MaskValue: "STRING_VALUE",
},
Aggregate: { // Aggregate
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Groups: "<GlueStudioPathList>", // required
Aggs: [ // AggregateOperations // required
{ // AggregateOperation
Column: "<EnclosedInStringProperties>", // required
AggFunc: "avg" || "countDistinct" || "count" || "first" || "last" || "kurtosis" || "max" || "min" || "skewness" || "stddev_samp" || "stddev_pop" || "sum" || "sumDistinct" || "var_samp" || "var_pop", // required
},
],
},
DropDuplicates: { // DropDuplicates
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Columns: [ // LimitedPathList
[ // LimitedStringList
"STRING_VALUE",
],
],
},
GovernedCatalogTarget: { // GovernedCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: "<GlueStudioPathList>",
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
},
},
GovernedCatalogSource: { // GovernedCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
PartitionPredicate: "STRING_VALUE",
AdditionalOptions: {
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
},
},
MicrosoftSQLServerCatalogSource: { // MicrosoftSQLServerCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
MySQLCatalogSource: { // MySQLCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
OracleSQLCatalogSource: { // OracleSQLCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
PostgreSQLCatalogSource: { // PostgreSQLCatalogSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
MicrosoftSQLServerCatalogTarget: { // MicrosoftSQLServerCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
MySQLCatalogTarget: { // MySQLCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
OracleSQLCatalogTarget: { // OracleSQLCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
PostgreSQLCatalogTarget: { // PostgreSQLCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
},
DynamicTransform: { // DynamicTransform
Name: "STRING_VALUE", // required
TransformName: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Parameters: [ // TransformConfigParameterList
{ // TransformConfigParameter
Name: "STRING_VALUE", // required
Type: "str" || "int" || "float" || "complex" || "bool" || "list" || "null", // required
ValidationRule: "STRING_VALUE",
ValidationMessage: "STRING_VALUE",
Value: "<EnclosedInStringProperties>",
ListType: "str" || "int" || "float" || "complex" || "bool" || "list" || "null",
IsOptional: true || false,
},
],
FunctionName: "STRING_VALUE", // required
Path: "STRING_VALUE", // required
Version: "STRING_VALUE",
OutputSchemas: "<GlueSchemas>",
},
EvaluateDataQuality: { // EvaluateDataQuality
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Ruleset: "STRING_VALUE", // required
Output: "PrimaryInput" || "EvaluationResults",
PublishingOptions: { // DQResultsPublishingOptions
EvaluationContext: "STRING_VALUE",
ResultsS3Prefix: "STRING_VALUE",
CloudWatchMetricsEnabled: true || false,
ResultsPublishingEnabled: true || false,
},
StopJobOnFailureOptions: { // DQStopJobOnFailureOptions
StopJobOnFailureTiming: "Immediate" || "AfterDataLoad",
},
},
S3CatalogHudiSource: { // S3CatalogHudiSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
AdditionalHudiOptions: {
"<keys>": "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
CatalogHudiSource: { // CatalogHudiSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
AdditionalHudiOptions: {
"<keys>": "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
S3HudiSource: { // S3HudiSource
Name: "STRING_VALUE", // required
Paths: "<EnclosedInStringProperties>", // required
AdditionalHudiOptions: "<AdditionalOptions>",
AdditionalOptions: {
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
EnableSamplePath: true || false,
SamplePath: "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
S3HudiCatalogTarget: { // S3HudiCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: "<GlueStudioPathList>",
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
AdditionalOptions: "<AdditionalOptions>", // required
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
},
},
S3HudiDirectTarget: { // S3HudiDirectTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
Path: "STRING_VALUE", // required
Compression: "gzip" || "lzo" || "uncompressed" || "snappy", // required
PartitionKeys: "<GlueStudioPathList>",
Format: "json" || "csv" || "avro" || "orc" || "parquet" || "hudi" || "delta", // required
AdditionalOptions: "<AdditionalOptions>", // required
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
},
},
DirectJDBCSource: { // DirectJDBCSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
ConnectionName: "STRING_VALUE", // required
ConnectionType: "sqlserver" || "mysql" || "oracle" || "postgresql" || "redshift", // required
RedshiftTmpDir: "STRING_VALUE",
},
S3CatalogDeltaSource: { // S3CatalogDeltaSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
AdditionalDeltaOptions: "<AdditionalOptions>",
OutputSchemas: "<GlueSchemas>",
},
CatalogDeltaSource: { // CatalogDeltaSource
Name: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
Table: "STRING_VALUE", // required
AdditionalDeltaOptions: "<AdditionalOptions>",
OutputSchemas: "<GlueSchemas>",
},
S3DeltaSource: { // S3DeltaSource
Name: "STRING_VALUE", // required
Paths: "<EnclosedInStringProperties>", // required
AdditionalDeltaOptions: "<AdditionalOptions>",
AdditionalOptions: {
BoundedSize: Number("long"),
BoundedFiles: Number("long"),
EnableSamplePath: true || false,
SamplePath: "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
S3DeltaCatalogTarget: { // S3DeltaCatalogTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: "<GlueStudioPathList>",
Table: "STRING_VALUE", // required
Database: "STRING_VALUE", // required
AdditionalOptions: "<AdditionalOptions>",
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
},
},
S3DeltaDirectTarget: { // S3DeltaDirectTarget
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
PartitionKeys: "<GlueStudioPathList>",
Path: "STRING_VALUE", // required
Compression: "uncompressed" || "snappy", // required
Format: "json" || "csv" || "avro" || "orc" || "parquet" || "hudi" || "delta", // required
AdditionalOptions: "<AdditionalOptions>",
SchemaChangePolicy: {
EnableUpdateCatalog: true || false,
UpdateBehavior: "UPDATE_IN_DATABASE" || "LOG",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
},
},
AmazonRedshiftSource: { // AmazonRedshiftSource
Name: "STRING_VALUE",
Data: { // AmazonRedshiftNodeData
AccessType: "STRING_VALUE",
SourceType: "STRING_VALUE",
Connection: { // Option
Value: "STRING_VALUE",
Label: "STRING_VALUE",
Description: "STRING_VALUE",
},
Schema: {
Value: "STRING_VALUE",
Label: "STRING_VALUE",
Description: "STRING_VALUE",
},
Table: {
Value: "STRING_VALUE",
Label: "STRING_VALUE",
Description: "STRING_VALUE",
},
CatalogDatabase: {
Value: "STRING_VALUE",
Label: "STRING_VALUE",
Description: "STRING_VALUE",
},
CatalogTable: {
Value: "STRING_VALUE",
Label: "STRING_VALUE",
Description: "STRING_VALUE",
},
CatalogRedshiftSchema: "STRING_VALUE",
CatalogRedshiftTable: "STRING_VALUE",
TempDir: "STRING_VALUE",
IamRole: "<Option>",
AdvancedOptions: [ // AmazonRedshiftAdvancedOptions
{ // AmazonRedshiftAdvancedOption
Key: "STRING_VALUE",
Value: "STRING_VALUE",
},
],
SampleQuery: "STRING_VALUE",
PreAction: "STRING_VALUE",
PostAction: "STRING_VALUE",
Action: "STRING_VALUE",
TablePrefix: "STRING_VALUE",
Upsert: true || false,
MergeAction: "STRING_VALUE",
MergeWhenMatched: "STRING_VALUE",
MergeWhenNotMatched: "STRING_VALUE",
MergeClause: "STRING_VALUE",
CrawlerConnection: "STRING_VALUE",
TableSchema: [ // OptionList
"<Option>",
],
StagingTable: "STRING_VALUE",
SelectedColumns: [
"<Option>",
],
},
},
AmazonRedshiftTarget: { // AmazonRedshiftTarget
Name: "STRING_VALUE",
Data: {
AccessType: "STRING_VALUE",
SourceType: "STRING_VALUE",
Connection: "<Option>",
Schema: "<Option>",
Table: "<Option>",
CatalogDatabase: "<Option>",
CatalogTable: "<Option>",
CatalogRedshiftSchema: "STRING_VALUE",
CatalogRedshiftTable: "STRING_VALUE",
TempDir: "STRING_VALUE",
IamRole: "<Option>",
AdvancedOptions: [
{
Key: "STRING_VALUE",
Value: "STRING_VALUE",
},
],
SampleQuery: "STRING_VALUE",
PreAction: "STRING_VALUE",
PostAction: "STRING_VALUE",
Action: "STRING_VALUE",
TablePrefix: "STRING_VALUE",
Upsert: true || false,
MergeAction: "STRING_VALUE",
MergeWhenMatched: "STRING_VALUE",
MergeWhenNotMatched: "STRING_VALUE",
MergeClause: "STRING_VALUE",
CrawlerConnection: "STRING_VALUE",
TableSchema: [
"<Option>",
],
StagingTable: "STRING_VALUE",
SelectedColumns: [
"<Option>",
],
},
Inputs: "<OneInput>",
},
EvaluateDataQualityMultiFrame: { // EvaluateDataQualityMultiFrame
Name: "STRING_VALUE", // required
Inputs: [ // required
"STRING_VALUE",
],
AdditionalDataSources: { // DQDLAliases
"<keys>": "STRING_VALUE",
},
Ruleset: "STRING_VALUE", // required
PublishingOptions: {
EvaluationContext: "STRING_VALUE",
ResultsS3Prefix: "STRING_VALUE",
CloudWatchMetricsEnabled: true || false,
ResultsPublishingEnabled: true || false,
},
AdditionalOptions: { // DQAdditionalOptions
"<keys>": "STRING_VALUE",
},
StopJobOnFailureOptions: {
StopJobOnFailureTiming: "Immediate" || "AfterDataLoad",
},
},
Recipe: { // Recipe
Name: "STRING_VALUE", // required
Inputs: "<OneInput>", // required
RecipeReference: { // RecipeReference
RecipeArn: "STRING_VALUE", // required
RecipeVersion: "STRING_VALUE", // required
},
RecipeSteps: [ // RecipeSteps
{ // RecipeStep
Action: { // RecipeAction
Operation: "STRING_VALUE", // required
Parameters: { // ParameterMap
"<keys>": "STRING_VALUE",
},
},
ConditionExpressions: [ // ConditionExpressionList
{ // ConditionExpression
Condition: "STRING_VALUE", // required
Value: "STRING_VALUE",
TargetColumn: "STRING_VALUE", // required
},
],
},
],
},
SnowflakeSource: { // SnowflakeSource
Name: "STRING_VALUE", // required
Data: { // SnowflakeNodeData
SourceType: "STRING_VALUE",
Connection: "<Option>",
Schema: "STRING_VALUE",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
TempDir: "STRING_VALUE",
IamRole: "<Option>",
AdditionalOptions: "<AdditionalOptions>",
SampleQuery: "STRING_VALUE",
PreAction: "STRING_VALUE",
PostAction: "STRING_VALUE",
Action: "STRING_VALUE",
Upsert: true || false,
MergeAction: "STRING_VALUE",
MergeWhenMatched: "STRING_VALUE",
MergeWhenNotMatched: "STRING_VALUE",
MergeClause: "STRING_VALUE",
StagingTable: "STRING_VALUE",
SelectedColumns: [
"<Option>",
],
AutoPushdown: true || false,
TableSchema: "<OptionList>",
},
OutputSchemas: "<GlueSchemas>",
},
SnowflakeTarget: { // SnowflakeTarget
Name: "STRING_VALUE", // required
Data: {
SourceType: "STRING_VALUE",
Connection: "<Option>",
Schema: "STRING_VALUE",
Table: "STRING_VALUE",
Database: "STRING_VALUE",
TempDir: "STRING_VALUE",
IamRole: "<Option>",
AdditionalOptions: "<AdditionalOptions>",
SampleQuery: "STRING_VALUE",
PreAction: "STRING_VALUE",
PostAction: "STRING_VALUE",
Action: "STRING_VALUE",
Upsert: true || false,
MergeAction: "STRING_VALUE",
MergeWhenMatched: "STRING_VALUE",
MergeWhenNotMatched: "STRING_VALUE",
MergeClause: "STRING_VALUE",
StagingTable: "STRING_VALUE",
SelectedColumns: "<OptionList>",
AutoPushdown: true || false,
TableSchema: "<OptionList>",
},
Inputs: "<OneInput>",
},
ConnectorDataSource: { // ConnectorDataSource
Name: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
Data: { // ConnectorOptions // required
"<keys>": "STRING_VALUE",
},
OutputSchemas: "<GlueSchemas>",
},
ConnectorDataTarget: { // ConnectorDataTarget
Name: "STRING_VALUE", // required
ConnectionType: "STRING_VALUE", // required
Data: { // required
"<keys>": "STRING_VALUE",
},
Inputs: "<OneInput>",
},
},
},
ExecutionClass: "FLEX" || "STANDARD",
SourceControlDetails: { // SourceControlDetails
Provider: "GITHUB" || "GITLAB" || "BITBUCKET" || "AWS_CODE_COMMIT",
Repository: "STRING_VALUE",
Owner: "STRING_VALUE",
Branch: "STRING_VALUE",
Folder: "STRING_VALUE",
LastCommitId: "STRING_VALUE",
AuthStrategy: "PERSONAL_ACCESS_TOKEN" || "AWS_SECRETS_MANAGER",
AuthToken: "STRING_VALUE",
},
MaintenanceWindow: "STRING_VALUE",
};
const command = new CreateJobCommand(input);
const response = await client.send(command);
// { // CreateJobResponse
// Name: "STRING_VALUE",
// };
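In the template above, each entry of CodeGenConfigurationNodes is keyed by a node ID, and transform and target nodes name their upstream nodes in Inputs. The following is a minimal sketch, assuming that Inputs entries refer to keys of the same map: a three-node graph (S3 CSV source, ApplyMapping, S3 target) plus a helper that reports Inputs entries with no matching key. The node IDs, bucket paths, and the findDanglingInputs helper are hypothetical and not part of the SDK; the service performs its own validation.

```javascript
// Hypothetical three-node graph: source -> mapping -> target.
const nodes = {
  "node-1": {
    S3CsvSource: {
      Name: "orders-csv",
      Paths: ["s3://example-bucket/input/"], // placeholder path
      Separator: "comma",
      QuoteChar: "quote",
      WithHeader: true,
    },
  },
  "node-2": {
    ApplyMapping: {
      Name: "rename-columns",
      Inputs: ["node-1"], // upstream node ID
      Mapping: [
        { ToKey: "order_id", FromPath: ["id"], FromType: "string", ToType: "string" },
      ],
    },
  },
  "node-3": {
    S3DirectTarget: {
      Name: "orders-parquet",
      Inputs: ["node-2"],
      Path: "s3://example-bucket/output/", // placeholder path
      Format: "parquet",
    },
  },
};

// Returns [nodeId, missingInput] pairs for Inputs that name no key in the map.
function findDanglingInputs(nodeMap) {
  const dangling = [];
  for (const [nodeId, node] of Object.entries(nodeMap)) {
    // A node object typically sets a single member, e.g. { ApplyMapping: {...} }.
    for (const definition of Object.values(node)) {
      for (const input of definition.Inputs ?? []) {
        if (!Object.prototype.hasOwnProperty.call(nodeMap, input)) {
          dangling.push([nodeId, input]);
        }
      }
    }
  }
  return dangling;
}

console.log(findDanglingInputs(nodes)); // prints []
```

The resulting map could then be passed as CodeGenConfigurationNodes in the input object, typically together with JobMode set to "VISUAL".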
CreateJobCommand Input
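The table below marks Command, Name, and Role as required. As a minimal sketch, the following builds an input for a Spark script job with only a handful of fields and checks those three before sending. The helper names, the default region, and the choice of GlueVersion, WorkerType, and NumberOfWorkers values are illustrative assumptions; "glueetl" is the command name Glue uses for Spark ETL jobs. The SDK is loaded lazily inside createJob so that the builder and the check can be used on their own.

```javascript
// Parameters marked "Required" in the input table.
const REQUIRED = ["Name", "Role", "Command"];

// Builds a small CreateJob input for a Spark script job (illustrative defaults).
function buildScriptJobInput({ name, role, scriptLocation }) {
  return {
    Name: name,
    Role: role,
    Command: { Name: "glueetl", ScriptLocation: scriptLocation, PythonVersion: "3" },
    GlueVersion: "4.0",
    WorkerType: "G.1X",
    NumberOfWorkers: 2,
  };
}

// Returns the names of required parameters that are absent or empty.
function missingRequired(input) {
  return REQUIRED.filter((key) => input[key] === undefined || input[key] === "");
}

// Sends the request; the region default is a placeholder.
async function createJob(input, region = "us-east-1") {
  const missing = missingRequired(input);
  if (missing.length > 0) {
    throw new Error(`Missing required parameters: ${missing.join(", ")}`);
  }
  const { GlueClient, CreateJobCommand } = await import("@aws-sdk/client-glue");
  const client = new GlueClient({ region });
  const response = await client.send(new CreateJobCommand(input));
  return response.Name; // name of the created job definition
}
```

A call might look like `createJob(buildScriptJobInput({ name: "nightly-orders", role: "arn:aws:iam::123456789012:role/ExampleGlueRole", scriptLocation: "s3://example-bucket/scripts/job.py" }))`, where the job name, role ARN, and script path are placeholders.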
Parameter | Type | Description |
---|---|---|
Command Required | JobCommand | undefined | The JobCommand that runs this job. |
Name Required | string | undefined | The name you assign to this job definition. It must be unique in your account. |
Role Required | string | undefined | The name or Amazon Resource Name (ARN) of the IAM role associated with this job. |
AllocatedCapacity | number | undefined | |
CodeGenConfigurationNodes | Record<string, CodeGenConfigurationNode> | undefined | The representation of a directed acyclic graph on which both the Glue Studio visual component and Glue Studio code generation are based. |
Connections | ConnectionsList | undefined | The connections used for this job. |
DefaultArguments | Record<string, string> | undefined | The default arguments for every run of this job, specified as name-value pairs. You can specify arguments here that your own job-execution script consumes, as well as arguments that Glue itself consumes. Job arguments may be logged. Do not pass plaintext secrets as arguments. Retrieve secrets from a Glue Connection, Secrets Manager or other secret management mechanism if you intend to keep them within the Job. For information about how to specify and consume your own Job arguments, see the Calling Glue APIs in Python topic in the developer guide. For information about the arguments you can provide to this field when configuring Spark jobs, see the Special Parameters Used by Glue topic in the developer guide. For information about the arguments you can provide to this field when configuring Ray jobs, see Using job parameters in Ray jobs in the developer guide. |
Description | string | undefined | Description of the job being defined. |
ExecutionClass | ExecutionClass | undefined | Indicates whether the job is run with a standard or flexible execution class. The standard execution class is ideal for time-sensitive workloads that require fast job startup and dedicated resources. The flexible execution class is appropriate for time-insensitive jobs whose start and completion times may vary. Only jobs with Glue version 3.0 and above and command type glueetl will be allowed to set ExecutionClass to FLEX. The flexible execution class is available for Spark jobs. |
ExecutionProperty | ExecutionProperty | undefined | An ExecutionProperty specifying the maximum number of concurrent runs allowed for this job. |
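GlueVersion | string | undefined | In Spark jobs, GlueVersion determines the versions of Apache Spark and Python that Glue makes available in a job. The Python version indicates the version supported for jobs of type Spark. Ray jobs should set GlueVersion to 4.0 or greater. However, the versions of Ray, Python, and additional libraries available in your Ray job are determined by the Runtime parameter of the job command. For more information about the available Glue versions and corresponding Spark and Python versions, see Glue version in the developer guide. Jobs that are created without specifying a Glue version default to Glue 0.9. |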
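JobMode | JobMode | undefined | A mode that describes how a job was created. Valid values are: SCRIPT (the job was created using the Glue Studio script editor), VISUAL (the job was created using the Glue Studio visual editor), and NOTEBOOK (the job was created using an interactive sessions notebook). When the JobMode field is missing or null, SCRIPT is assigned as the default value. |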
JobRunQueuingEnabled | boolean | undefined | Specifies whether job run queuing is enabled for the job runs for this job. A value of true means job run queuing is enabled for the job runs. If false or not populated, the job runs will not be considered for queuing. If this field does not match the value set in the job run, then the value from the job run field will be used. |
LogUri | string | undefined | This field is reserved for future use. |
MaintenanceWindow | string | undefined | This field specifies a day of the week and hour for a maintenance window for streaming jobs. Glue periodically performs maintenance activities. During these maintenance windows, Glue will need to restart your streaming jobs. Glue will restart the job within 3 hours of the specified maintenance window. For instance, if you set up the maintenance window for Monday at 10:00AM GMT, your jobs will be restarted between 10:00AM GMT and 1:00PM GMT. |
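MaxCapacity | number | undefined | For Glue version 1.0 or earlier jobs, using the standard worker type, the number of Glue data processing units (DPUs) that can be allocated when this job runs. A DPU is a relative measure of processing power that consists of 4 vCPUs of compute capacity and 16 GB of memory. For more information, see the Glue pricing page. For Glue version 2.0+ jobs, you cannot specify a maximum capacity. Instead, you should specify a worker type and the number of workers. Do not set MaxCapacity if using WorkerType and NumberOfWorkers. The value that can be allocated for MaxCapacity depends on whether you are running a Python shell job or an Apache Spark ETL job. When you specify a Python shell job (JobCommand.Name="pythonshell"), you can allocate either 0.0625 or 1 DPU; the default is 0.0625 DPU. When you specify an Apache Spark ETL job (JobCommand.Name="glueetl") or an Apache Spark streaming ETL job (JobCommand.Name="gluestreaming"), you can allocate from 2 to 100 DPUs; the default is 10 DPUs, and this job type cannot have a fractional DPU allocation. |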
MaxRetries | number | undefined | The maximum number of times to retry this job if it fails. |
NonOverridableArguments | Record<string, string> | undefined | Arguments for this job that are not overridden when providing job arguments in a job run, specified as name-value pairs. |
NotificationProperty | NotificationProperty | undefined | Specifies configuration properties of a job notification. |
NumberOfWorkers | number | undefined | The number of workers of a defined WorkerType that are allocated when a job runs. |
SecurityConfiguration | string | undefined | The name of the SecurityConfiguration structure to be used with this job. |
SourceControlDetails | SourceControlDetails | undefined | The details for a source control configuration for a job, allowing synchronization of job artifacts to or from a remote repository. |
Tags | Record<string, string> | undefined | The tags to use with this job. You may use tags to limit access to the job. For more information about tags in Glue, see Amazon Web Services Tags in Glue in the developer guide. |
Timeout | number | undefined | The job timeout in minutes. This is the maximum time that a job run can consume resources before it is terminated and enters TIMEOUT status. Jobs must have timeout values less than 7 days or 10080 minutes. Otherwise, the jobs will throw an exception. When the value is left blank, the timeout defaults to 2880 minutes. Any existing Glue jobs that had a timeout value greater than 7 days will be defaulted to 7 days. For instance, if you have specified a timeout of 20 days for a batch job, it will be stopped on the 7th day. For streaming jobs, if you have set up a maintenance window, the job will be restarted during the maintenance window after 7 days. |
WorkerType | WorkerType | undefined | The type of predefined worker that is allocated when a job runs. Accepts a value of G.1X, G.2X, G.4X, G.8X or G.025X for Spark jobs. Accepts the value Z.2X for Ray jobs. |
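Several of the constraints in the table above can be checked before a request is sent. The following is a minimal client-side sketch, not the service's own validation: the helper name `checkCreateJobInput` is hypothetical, and treating a `Timeout` of exactly 10080 minutes as acceptable is an assumption, since the table's wording on that boundary is ambiguous.

```javascript
// Hypothetical pre-flight check based on the parameter descriptions above.
// The Glue service performs its own validation; this only catches obvious mistakes early.
const MAX_TIMEOUT_MINUTES = 10080; // 7 days, per the Timeout description

function checkCreateJobInput(input) {
  const problems = [];

  // Name, Role, and Command are the three parameters marked Required.
  for (const field of ["Name", "Role", "Command"]) {
    if (input[field] === undefined || input[field] === null || input[field] === "") {
      problems.push(`${field} is required`);
    }
  }

  // MaxCapacity must not be combined with WorkerType / NumberOfWorkers.
  const usesWorkers =
    input.WorkerType !== undefined || input.NumberOfWorkers !== undefined;
  if (input.MaxCapacity !== undefined && usesWorkers) {
    problems.push("Do not set MaxCapacity together with WorkerType or NumberOfWorkers");
  }

  // Assumption: values above 10080 minutes are rejected, 10080 itself is allowed.
  if (input.Timeout !== undefined && input.Timeout > MAX_TIMEOUT_MINUTES) {
    problems.push(`Timeout must not exceed ${MAX_TIMEOUT_MINUTES} minutes`);
  }

  return problems;
}
```

An empty array means none of these particular checks failed; it does not guarantee that the service will accept the request.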
CreateJobCommand Output
Parameter | Type | Description |
---|---|---|
$metadata Required | ResponseMetadata | Metadata pertaining to this request. |
Name | string | undefined | The unique name that was provided for this job definition. |
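Because `Name` in the output is typed `string | undefined`, code that reads the response should not assume it is present. A small sketch follows; the helper name `describeCreateJobResponse` and the fallback strings are hypothetical, while `requestId` is one of the optional fields of the SDK's `ResponseMetadata`.

```javascript
// Hypothetical helper: turns a CreateJobResponse into a log-friendly string,
// tolerating a missing Name or missing request metadata.
function describeCreateJobResponse(response) {
  const name = response.Name ?? "(name not returned)";
  const requestId = response.$metadata?.requestId ?? "unknown";
  return `Created job ${name} (request ${requestId})`;
}
```

The request ID is the value to quote when correlating a call with service-side logs or a support case.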
Throws
Name | Fault | Details |
---|---|---|
AlreadyExistsException | client | A resource to be created or added already exists. |
ConcurrentModificationException | client | Two processes are trying to modify a resource simultaneously. |
IdempotentParameterMismatchException | client | The same unique identifier was associated with two different records. |
InternalServiceException | server | An internal service error occurred. |
InvalidInputException | client | The input provided was not valid. |
OperationTimeoutException | client | The operation timed out. |
ResourceNumberLimitExceededException | client | A resource numerical limit was exceeded. |
GlueServiceException | | Base exception class for all service exceptions from the Glue service. |
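Errors thrown by `client.send` carry the exception name from the table above in their `name` property, so a caller can branch on it. The sketch below shows one way to group them; the helper name `classifyCreateJobError` and the three-way grouping (which exceptions are worth retrying, which indicate a request that needs fixing) are assumptions for illustration, not guidance from the Glue documentation.

```javascript
// Hypothetical helper: maps a thrown error to a coarse handling category
// using the exception names listed in the Throws table.
function classifyCreateJobError(err) {
  switch (err?.name) {
    case "AlreadyExistsException":
      return "exists"; // a job with this name is already defined
    case "ConcurrentModificationException":
    case "OperationTimeoutException":
    case "InternalServiceException":
      return "retry"; // assumed transient; retry with backoff
    case "InvalidInputException":
    case "IdempotentParameterMismatchException":
    case "ResourceNumberLimitExceededException":
      return "fix-request"; // assumed to need a change before retrying
    default:
      return "unknown"; // not a documented CreateJob exception
  }
}

// Typical use around the call shown earlier on this page:
// try {
//   await client.send(command);
// } catch (err) {
//   if (classifyCreateJobError(err) === "exists") { /* treat as already created */ }
//   else throw err;
// }
```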