CfnJobProps
- class aws_cdk.aws_databrew.CfnJobProps(*, name, role_arn, type, database_outputs=None, data_catalog_outputs=None, dataset_name=None, encryption_key_arn=None, encryption_mode=None, job_sample=None, log_subscription=None, max_capacity=None, max_retries=None, output_location=None, outputs=None, profile_configuration=None, project_name=None, recipe=None, tags=None, timeout=None, validation_configurations=None)
Bases: object
Properties for defining a CfnJob.
- Parameters:
name (str) – The unique name of the job.
role_arn (str) – The Amazon Resource Name (ARN) of the role to be assumed for this job.
type (str) – The job type of the job, which must be one of the following: PROFILE – a job to analyze a dataset, to determine its size, data types, data distribution, and more; RECIPE – a job to apply one or more transformations to a dataset.
database_outputs (Union[IResolvable, Sequence[Union[IResolvable, DatabaseOutputProperty, Dict[str, Any]]], None]) – Represents a list of JDBC database output objects that define the output destination for a DataBrew recipe job to write into.
data_catalog_outputs (Union[IResolvable, Sequence[Union[IResolvable, DataCatalogOutputProperty, Dict[str, Any]]], None]) – One or more artifacts that represent the AWS Glue Data Catalog output from running the job.
dataset_name (Optional[str]) – A dataset that the job is to process.
encryption_key_arn (Optional[str]) – The Amazon Resource Name (ARN) of an encryption key that is used to protect the job output. For more information, see Encrypting data written by DataBrew jobs.
encryption_mode (Optional[str]) – The encryption mode for the job, which can be one of the following: SSE-KMS – server-side encryption with keys managed by AWS KMS; SSE-S3 – server-side encryption with keys managed by Amazon S3.
job_sample (Union[IResolvable, JobSampleProperty, Dict[str, Any], None]) – A sample configuration for profile jobs only, which determines the number of rows on which the profile job is run. If a JobSample value isn’t provided, the default value is used. The default value is CUSTOM_ROWS for the mode parameter and 20,000 for the size parameter.
log_subscription (Optional[str]) – The current status of Amazon CloudWatch logging for the job.
max_capacity (Union[int, float, None]) – The maximum number of nodes that can be consumed when the job processes data.
max_retries (Union[int, float, None]) – The maximum number of times to retry the job after a job run fails.
output_location (Union[IResolvable, OutputLocationProperty, Dict[str, Any], None]) – AWS::DataBrew::Job.OutputLocation.
outputs (Union[IResolvable, Sequence[Union[IResolvable, OutputProperty, Dict[str, Any]]], None]) – One or more artifacts that represent output from running the job.
profile_configuration (Union[IResolvable, ProfileConfigurationProperty, Dict[str, Any], None]) – Configuration for profile jobs. Configuration can be used to select columns, do evaluations, and override default parameters of evaluations. When configuration is undefined, the profile job will apply default settings to all supported columns.
project_name (Optional[str]) – The name of the project that the job is associated with.
recipe (Union[IResolvable, RecipeProperty, Dict[str, Any], None]) – A series of data transformation steps that the job runs.
tags (Optional[Sequence[Union[CfnTag, Dict[str, Any]]]]) – Metadata tags that have been applied to the job.
timeout (Union[int, float, None]) – The job’s timeout in minutes. A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
validation_configurations (Union[IResolvable, Sequence[Union[IResolvable, ValidationConfigurationProperty, Dict[str, Any]]], None]) – List of validation configurations that are applied to the profile job.
- Link:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-job.html
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk.core import CfnTag
import aws_cdk.aws_databrew as databrew

cfn_job_props = databrew.CfnJobProps(
    name="name",
    role_arn="roleArn",
    type="type",

    # the properties below are optional
    database_outputs=[databrew.CfnJob.DatabaseOutputProperty(
        database_options=databrew.CfnJob.DatabaseTableOutputOptionsProperty(
            table_name="tableName",

            # the properties below are optional
            temp_directory=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        ),
        glue_connection_name="glueConnectionName",

        # the properties below are optional
        database_output_mode="databaseOutputMode"
    )],
    data_catalog_outputs=[databrew.CfnJob.DataCatalogOutputProperty(
        database_name="databaseName",
        table_name="tableName",

        # the properties below are optional
        catalog_id="catalogId",
        database_options=databrew.CfnJob.DatabaseTableOutputOptionsProperty(
            table_name="tableName",

            # the properties below are optional
            temp_directory=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        ),
        overwrite=False,
        s3_options=databrew.CfnJob.S3TableOutputOptionsProperty(
            location=databrew.CfnJob.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                bucket_owner="bucketOwner",
                key="key"
            )
        )
    )],
    dataset_name="datasetName",
    encryption_key_arn="encryptionKeyArn",
    encryption_mode="encryptionMode",
    job_sample=databrew.CfnJob.JobSampleProperty(
        mode="mode",
        size=123
    ),
    log_subscription="logSubscription",
    max_capacity=123,
    max_retries=123,
    output_location=databrew.CfnJob.OutputLocationProperty(
        bucket="bucket",

        # the properties below are optional
        bucket_owner="bucketOwner",
        key="key"
    ),
    outputs=[databrew.CfnJob.OutputProperty(
        location=databrew.CfnJob.S3LocationProperty(
            bucket="bucket",

            # the properties below are optional
            bucket_owner="bucketOwner",
            key="key"
        ),

        # the properties below are optional
        compression_format="compressionFormat",
        format="format",
        format_options=databrew.CfnJob.OutputFormatOptionsProperty(
            csv=databrew.CfnJob.CsvOutputOptionsProperty(
                delimiter="delimiter"
            )
        ),
        max_output_files=123,
        overwrite=False,
        partition_columns=["partitionColumns"]
    )],
    profile_configuration=databrew.CfnJob.ProfileConfigurationProperty(
        column_statistics_configurations=[databrew.CfnJob.ColumnStatisticsConfigurationProperty(
            statistics=databrew.CfnJob.StatisticsConfigurationProperty(
                included_statistics=["includedStatistics"],
                overrides=[databrew.CfnJob.StatisticOverrideProperty(
                    parameters={
                        "parameters_key": "parameters"
                    },
                    statistic="statistic"
                )]
            ),

            # the properties below are optional
            selectors=[databrew.CfnJob.ColumnSelectorProperty(
                name="name",
                regex="regex"
            )]
        )],
        dataset_statistics_configuration=databrew.CfnJob.StatisticsConfigurationProperty(
            included_statistics=["includedStatistics"],
            overrides=[databrew.CfnJob.StatisticOverrideProperty(
                parameters={
                    "parameters_key": "parameters"
                },
                statistic="statistic"
            )]
        ),
        entity_detector_configuration=databrew.CfnJob.EntityDetectorConfigurationProperty(
            entity_types=["entityTypes"],

            # the properties below are optional
            allowed_statistics=databrew.CfnJob.AllowedStatisticsProperty(
                statistics=["statistics"]
            )
        ),
        profile_columns=[databrew.CfnJob.ColumnSelectorProperty(
            name="name",
            regex="regex"
        )]
    ),
    project_name="projectName",
    recipe=databrew.CfnJob.RecipeProperty(
        name="name",

        # the properties below are optional
        version="version"
    ),
    tags=[CfnTag(
        key="key",
        value="value"
    )],
    timeout=123,
    validation_configurations=[databrew.CfnJob.ValidationConfigurationProperty(
        ruleset_arn="rulesetArn",

        # the properties below are optional
        validation_mode="validationMode"
    )]
)
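The generated example exercises every property, but only name, role_arn, and type are required. As a minimal sketch, the required subset can be mirrored as a plain dict, so it runs without aws_cdk installed; all values below are placeholders, and the role ARN is purely illustrative:

```python
# Minimal required properties of CfnJobProps, mirrored as a plain dict.
# All values are placeholders, not real resources.
required_props = {
    "name": "my-databrew-job",
    "role_arn": "arn:aws:iam::111122223333:role/service-role/DataBrewRole",  # placeholder ARN
    "type": "PROFILE",  # must be "PROFILE" or "RECIPE"
}

# Every other CfnJobProps field defaults to None when omitted.
print(sorted(required_props))  # ['name', 'role_arn', 'type']
```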
Attributes
- data_catalog_outputs
One or more artifacts that represent the AWS Glue Data Catalog output from running the job.
- database_outputs
Represents a list of JDBC database output objects that define the output destination for a DataBrew recipe job to write into.
- dataset_name
A dataset that the job is to process.
- encryption_key_arn
The Amazon Resource Name (ARN) of an encryption key that is used to protect the job output.
For more information, see Encrypting data written by DataBrew jobs.
- encryption_mode
The encryption mode for the job, which can be one of the following:
SSE-KMS – Server-side encryption with keys managed by AWS KMS.
SSE-S3 – Server-side encryption with keys managed by Amazon S3.
- job_sample
A sample configuration for profile jobs only, which determines the number of rows on which the profile job is run.
If a JobSample value isn’t provided, the default value is used. The default value is CUSTOM_ROWS for the mode parameter and 20,000 for the size parameter.
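The default behavior described here can be sketched as a small pure-Python helper; the function name and dict shape are illustrative stand-ins for JobSampleProperty, not part of the DataBrew API:

```python
# Illustrative helper for the documented JobSample defaults: when no sample
# configuration is provided, DataBrew behaves as if mode="CUSTOM_ROWS" and
# size=20000 had been given. Plain dicts stand in for JobSampleProperty.
def effective_job_sample(job_sample=None):
    defaults = {"mode": "CUSTOM_ROWS", "size": 20000}
    if job_sample is None:
        return defaults
    # Explicitly supplied keys win; anything missing falls back to the defaults.
    return {**defaults, **job_sample}

print(effective_job_sample())  # {'mode': 'CUSTOM_ROWS', 'size': 20000}
```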
- log_subscription
The current status of Amazon CloudWatch logging for the job.
- max_capacity
The maximum number of nodes that can be consumed when the job processes data.
- max_retries
The maximum number of times to retry the job after a job run fails.
- name
The unique name of the job.
- output_location
AWS::DataBrew::Job.OutputLocation.
- outputs
One or more artifacts that represent output from running the job.
- profile_configuration
Configuration for profile jobs.
Configuration can be used to select columns, do evaluations, and override default parameters of evaluations. When configuration is undefined, the profile job will apply default settings to all supported columns.
- project_name
The name of the project that the job is associated with.
- recipe
A series of data transformation steps that the job runs.
- role_arn
The Amazon Resource Name (ARN) of the role to be assumed for this job.
- tags
Metadata tags that have been applied to the job.
- timeout
The job’s timeout in minutes.
A job that attempts to run longer than this timeout period ends with a status of TIMEOUT.
- type
The job type of the job, which must be one of the following:
PROFILE – A job to analyze a dataset, to determine its size, data types, data distribution, and more.
RECIPE – A job to apply one or more transformations to a dataset.
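Because type must be exactly one of these two strings, a small guard can catch typos before synthesis. The helper below is hypothetical, not part of the DataBrew or CDK API:

```python
# Hypothetical pre-synthesis check for the two documented job types.
ALLOWED_JOB_TYPES = {"PROFILE", "RECIPE"}

def check_job_type(job_type):
    """Return job_type unchanged, or raise if it is not a documented value."""
    if job_type not in ALLOWED_JOB_TYPES:
        raise ValueError(
            f"type must be one of {sorted(ALLOWED_JOB_TYPES)}, got {job_type!r}"
        )
    return job_type

print(check_job_type("RECIPE"))  # RECIPE
```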
- validation_configurations
List of validation configurations that are applied to the profile job.