CfnRecipe
- class aws_cdk.aws_databrew.CfnRecipe(scope, id, *, name, steps, description=None, tags=None)
Bases:
CfnResource
Specifies a new AWS Glue DataBrew transformation recipe.
- See:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-databrew-recipe.html
- CloudformationResource:
AWS::DataBrew::Recipe
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

cfn_recipe = databrew.CfnRecipe(self, "MyCfnRecipe",
    name="name",
    steps=[databrew.CfnRecipe.RecipeStepProperty(
        action=databrew.CfnRecipe.ActionProperty(
            operation="operation",

            # the properties below are optional
            parameters={
                "parameters_key": "parameters"
            }
        ),

        # the properties below are optional
        condition_expressions=[databrew.CfnRecipe.ConditionExpressionProperty(
            condition="condition",
            target_column="targetColumn",

            # the properties below are optional
            value="value"
        )]
    )],

    # the properties below are optional
    description="description",
    tags=[CfnTag(
        key="key",
        value="value"
    )]
)
- Parameters:
scope (Construct) – Scope in which this resource is defined.
id (str) – Construct identifier for this resource (unique in its scope).
name (str) – The unique name for the recipe.
steps (Union[IResolvable, Sequence[Union[IResolvable, RecipeStepProperty, Dict[str, Any]]]]) – A list of steps that are defined by the recipe.
description (Optional[str]) – The description of the recipe.
tags (Optional[Sequence[Union[CfnTag, Dict[str, Any]]]]) – Metadata tags that have been applied to the recipe.
Methods
- add_deletion_override(path)
Syntactic sugar for addOverride(path, undefined).
- Parameters:
path (str) – The path of the value to delete.
- Return type:
None
- add_dependency(target)
Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
This can be used for resources across stacks (or nested stack) boundaries and the dependency will automatically be transferred to the relevant scope.
- Parameters:
target (CfnResource) –
- Return type:
None
- add_depends_on(target)
(deprecated) Indicates that this resource depends on another resource and cannot be provisioned unless the other resource has been successfully provisioned.
- Parameters:
target (CfnResource) –
- Deprecated:
use addDependency
- Stability:
deprecated
- Return type:
None
- add_metadata(key, value)
Add a value to the CloudFormation Resource Metadata.
- Parameters:
key (str) –
value (Any) –
- See:
- Return type:
None
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- add_override(path, value)
Adds an override to the synthesized CloudFormation resource.
To add a property override, either use addPropertyOverride or prefix path with “Properties.” (i.e. Properties.TopicName).
If the override is nested, separate each nested level using a dot (.) in the path parameter. If there is an array as part of the nesting, specify the index in the path.
To include a literal . in the property name, prefix with a \. In most programming languages you will need to write this as "\\." because the \ itself will need to be escaped.
For example:
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.0.Projection.NonKeyAttributes", ["myattribute"])
cfn_resource.add_override("Properties.GlobalSecondaryIndexes.1.ProjectionType", "INCLUDE")
would add the overrides. Example:
"Properties": {
  "GlobalSecondaryIndexes": [
    {
      "Projection": {
        "NonKeyAttributes": [ "myattribute" ]
        ...
      }
      ...
    },
    {
      "ProjectionType": "INCLUDE"
      ...
    },
  ]
  ...
}
The value argument to addOverride will not be processed or translated in any way. Pass raw JSON values in here with the correct capitalization for CloudFormation. If you pass CDK classes or structs, they will be rendered with lowercased key names, and CloudFormation will reject the template.
- Parameters:
path (str) – The path of the property. You can use dot notation to override values in complex types. Any intermediate keys will be created as needed.
value (Any) – The value. Could be primitive or complex.
- Return type:
None
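The dot-path semantics described above can be sketched in plain Python. This is only an illustration of how a path such as Properties.GlobalSecondaryIndexes.1.ProjectionType walks the synthesized template, not the CDK implementation; the helper name apply_override is hypothetical, and escaping of literal dots is omitted.

```python
# Illustrative sketch of add_override's dot-path semantics (not the real
# CDK code). Numeric path segments index into lists; intermediate dict
# keys are created as needed. Literal-dot escaping ("\\.") is not handled.
def apply_override(template: dict, path: str, value) -> None:
    parts = path.split(".")
    node = template
    for part in parts[:-1]:
        if isinstance(node, list):
            node = node[int(part)]          # array index in the path
        else:
            node = node.setdefault(part, {})  # intermediate keys created as needed
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value

resource = {"Properties": {"GlobalSecondaryIndexes": [{}, {}]}}
apply_override(resource, "Properties.GlobalSecondaryIndexes.1.ProjectionType", "INCLUDE")
print(resource["Properties"]["GlobalSecondaryIndexes"][1])
```

The raw value is placed verbatim at the target location, which mirrors the note above: nothing is translated or re-cased on the way in.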
- add_property_deletion_override(property_path)
Adds an override that deletes the value of a property from the resource definition.
- Parameters:
property_path (str) – The path to the property.
- Return type:
None
- add_property_override(property_path, value)
Adds an override to a resource property.
Syntactic sugar for addOverride("Properties.<...>", value).
- Parameters:
property_path (str) – The path of the property.
value (Any) – The value.
- Return type:
None
- apply_removal_policy(policy=None, *, apply_to_update_replace_policy=None, default=None)
Sets the deletion policy of the resource based on the removal policy specified.
The Removal Policy controls what happens to this resource when it stops being managed by CloudFormation, either because you’ve removed it from the CDK application or because you’ve made a change that requires the resource to be replaced.
The resource can be deleted (RemovalPolicy.DESTROY), or left in your AWS account for data recovery and cleanup later (RemovalPolicy.RETAIN). In some cases, a snapshot can be taken of the resource prior to deletion (RemovalPolicy.SNAPSHOT). A list of resources that support this policy can be found in the following link:
- Parameters:
policy (Optional[RemovalPolicy]) –
apply_to_update_replace_policy (Optional[bool]) – Apply the same deletion policy to the resource’s “UpdateReplacePolicy”. Default: true
default (Optional[RemovalPolicy]) – The default policy to apply in case the removal policy is not defined. Default: - Default value is resource specific. To determine the default value for a resource, please consult that specific resource’s documentation.
- See:
- Return type:
None
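As a rough sketch (plain dicts, not output captured from a real synthesis), a resource retained with RemovalPolicy.RETAIN renders with both template-level policies set, because apply_to_update_replace_policy defaults to true:

```python
# Illustrative template fragment: the effect of
# cfn_recipe.apply_removal_policy(RemovalPolicy.RETAIN) on the synthesized
# resource. Property values are placeholders.
retained_resource = {
    "Type": "AWS::DataBrew::Recipe",
    "Properties": {"Name": "my-recipe", "Steps": []},
    "DeletionPolicy": "Retain",        # resource is kept on stack deletion
    "UpdateReplacePolicy": "Retain",   # old resource is kept on replacement
}
print(retained_resource["DeletionPolicy"])
```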
- get_att(attribute_name, type_hint=None)
Returns a token for a runtime attribute of this resource.
Ideally, use generated attribute accessors (e.g. resource.arn), but this can be used for future compatibility in case there is no generated attribute.
- Parameters:
attribute_name (str) – The name of the attribute.
type_hint (Optional[ResolutionTypeHint]) –
- Return type:
- get_metadata(key)
Retrieve a value from the CloudFormation Resource Metadata.
- Parameters:
key (str) –
- See:
- Return type:
Any
Note that this is a different set of metadata from CDK node metadata; this metadata ends up in the stack template under the resource, whereas CDK node metadata ends up in the Cloud Assembly.
- inspect(inspector)
Examines the CloudFormation resource and discloses attributes.
- Parameters:
inspector (TreeInspector) – tree inspector to collect and process attributes.
- Return type:
None
- obtain_dependencies()
Retrieves an array of resources this resource depends on.
This assembles dependencies on resources across stacks (including nested stacks) automatically.
- Return type:
List[Union[Stack, CfnResource]]
- obtain_resource_dependencies()
Get a shallow copy of dependencies between this resource and other resources in the same stack.
- Return type:
List[CfnResource]
- override_logical_id(new_logical_id)
Overrides the auto-generated logical ID with a specific ID.
- Parameters:
new_logical_id (str) – The new logical ID to use for this stack element.
- Return type:
None
- remove_dependency(target)
Indicates that this resource no longer depends on another resource.
This can be used for resources across stacks (including nested stacks) and the dependency will automatically be removed from the relevant scope.
- Parameters:
target (CfnResource) –
- Return type:
None
- replace_dependency(target, new_target)
Replaces one dependency with another.
- Parameters:
target (CfnResource) – The dependency to replace.
new_target (CfnResource) – The new dependency to add.
- Return type:
None
- to_string()
Returns a string representation of this construct.
- Return type:
str
- Returns:
a string representation of this resource
Attributes
- CFN_RESOURCE_TYPE_NAME = 'AWS::DataBrew::Recipe'
- cfn_options
Options for this resource, such as condition, update policy etc.
- cfn_resource_type
AWS resource type.
- creation_stack
- Returns:
the stack trace of the point where this Resource was created from, sourced from the +metadata+ entry typed +aws:cdk:logicalId+, and with the bottom-most node +internal+ entries filtered.
- description
The description of the recipe.
- logical_id
The logical ID for this CloudFormation stack element.
The logical ID of the element is calculated from the path of the resource node in the construct tree.
To override this value, use overrideLogicalId(newLogicalId).
- Returns:
the logical ID as a stringified token. This value will only get resolved during synthesis.
- name
The unique name for the recipe.
- node
The tree node.
- ref
Return a string that will be resolved to a CloudFormation { Ref } for this element.
If, by any chance, the intrinsic reference of a resource is not a string, you could coerce it to an IResolvable through Lazy.any({ produce: resource.ref }).
- stack
The stack in which this element is defined.
CfnElements must be defined within a stack scope (directly or indirectly).
- steps
A list of steps that are defined by the recipe.
- tags
Tag Manager which manages the tags for this resource.
- tags_raw
Metadata tags that have been applied to the recipe.
Static Methods
- classmethod is_cfn_element(x)
Returns true if a construct is a stack element (i.e. part of the synthesized CloudFormation template).
Uses duck-typing instead of instanceof to allow stack elements from different versions of this library to be included in the same stack.
- Parameters:
x (Any) –
- Return type:
bool
- Returns:
The construct as a stack element or undefined if it is not a stack element.
- classmethod is_cfn_resource(x)
Check whether the given object is a CfnResource.
- Parameters:
x (Any) –
- Return type:
bool
- classmethod is_construct(x)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid using instanceof, and to use this type-testing method instead.
- Parameters:
x (Any) – Any object.
- Return type:
bool
- Returns:
true if x is an object created from a class which extends Construct.
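The duplicated-library problem described above can be reproduced in plain Python: two separately defined copies of “the same” class are distinct class objects, so an isinstance check fails where a duck-typing marker check succeeds. The marker attribute and helper below are illustrative only, not the actual names used by the constructs library.

```python
# Two independently loaded copies of "the same" class, as happens when a
# library is duplicated on disk. isinstance() compares class objects, so it
# fails across copies; a duck-typing marker check (the style of check
# is_construct performs) does not care which copy produced the instance.
class ConstructCopyA:
    _is_construct = True  # hypothetical marker attribute

class ConstructCopyB:
    _is_construct = True  # same marker, different class object

def looks_like_construct(x) -> bool:
    """Duck-typing check: trust the marker, not the class identity."""
    return getattr(x, "_is_construct", False) is True

obj = ConstructCopyB()
print(isinstance(obj, ConstructCopyA))  # different class objects: False
print(looks_like_construct(obj))        # marker present: True
```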
ActionProperty
- class CfnRecipe.ActionProperty(*, operation, parameters=None)
Bases:
object
Represents a transformation and associated parameters that are used to apply a change to an AWS Glue DataBrew dataset.
- Parameters:
operation (str) – The name of a valid DataBrew transformation to be performed on the data.
parameters (Union[IResolvable, Mapping[str, str], None]) – Contextual parameters for the transformation.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

action_property = databrew.CfnRecipe.ActionProperty(
    operation="operation",

    # the properties below are optional
    parameters={
        "parameters_key": "parameters"
    }
)
Attributes
- operation
The name of a valid DataBrew transformation to be performed on the data.
- parameters
Contextual parameters for the transformation.
ConditionExpressionProperty
- class CfnRecipe.ConditionExpressionProperty(*, condition, target_column, value=None)
Bases:
object
Represents an individual condition that evaluates to true or false.
Conditions are used with recipe actions. The action is only performed for column values where the condition evaluates to true.
If a recipe requires more than one condition, then the recipe must specify multiple ConditionExpression elements. Each condition is applied to the rows in a dataset first, before the recipe action is performed.
- Parameters:
condition (str) – A specific condition to apply to a recipe action. For more information, see Recipe structure in the AWS Glue DataBrew Developer Guide.
target_column (str) – A column to apply this condition to.
value (Optional[str]) – A value that the condition must evaluate to for the condition to succeed.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

condition_expression_property = databrew.CfnRecipe.ConditionExpressionProperty(
    condition="condition",
    target_column="targetColumn",

    # the properties below are optional
    value="value"
)
Attributes
- condition
A specific condition to apply to a recipe action.
For more information, see Recipe structure in the AWS Glue DataBrew Developer Guide.
- target_column
A column to apply this condition to.
- value
A value that the condition must evaluate to for the condition to succeed.
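As an illustration of how a conditioned step lands in the synthesized template, the fragment below sketches one Steps entry in plain Python dicts. The operation and condition names are placeholders chosen for the example, not a statement of DataBrew's valid values.

```python
# Illustrative shape of one synthesized recipe step that pairs an action
# with a per-row condition. "REMOVE_VALUES" and "IS_MISSING" are placeholder
# names; consult the DataBrew Developer Guide for the valid set.
step = {
    "Action": {
        "Operation": "REMOVE_VALUES",              # placeholder operation
        "Parameters": {"sourceColumn": "status"},  # placeholder parameters
    },
    "ConditionExpressions": [
        # Evaluated against each row before the action is applied.
        {"Condition": "IS_MISSING", "TargetColumn": "status"}
    ],
}
print(sorted(step))
```

A step needing several conditions would simply carry several entries in the ConditionExpressions list, as the text above notes.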
DataCatalogInputDefinitionProperty
- class CfnRecipe.DataCatalogInputDefinitionProperty(*, catalog_id=None, database_name=None, table_name=None, temp_directory=None)
Bases:
object
Represents how metadata stored in the AWS Glue Data Catalog is defined in a DataBrew dataset.
- Parameters:
catalog_id (Optional[str]) – The unique identifier of the AWS account that holds the Data Catalog that stores the data.
database_name (Optional[str]) – The name of a database in the Data Catalog.
table_name (Optional[str]) – The name of a database table in the Data Catalog. This table corresponds to a DataBrew dataset.
temp_directory (Union[IResolvable, S3LocationProperty, Dict[str, Any], None]) – Represents an Amazon location where DataBrew can store intermediate results.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

data_catalog_input_definition_property = databrew.CfnRecipe.DataCatalogInputDefinitionProperty(
    catalog_id="catalogId",
    database_name="databaseName",
    table_name="tableName",
    temp_directory=databrew.CfnRecipe.S3LocationProperty(
        bucket="bucket",

        # the properties below are optional
        key="key"
    )
)
Attributes
- catalog_id
The unique identifier of the AWS account that holds the Data Catalog that stores the data.
- database_name
The name of a database in the Data Catalog.
- table_name
The name of a database table in the Data Catalog.
This table corresponds to a DataBrew dataset.
- temp_directory
Represents an Amazon location where DataBrew can store intermediate results.
InputProperty
- class CfnRecipe.InputProperty(*, data_catalog_input_definition=None, s3_input_definition=None)
Bases:
object
Represents information on how DataBrew can find data, in either the AWS Glue Data Catalog or Amazon S3.
- Parameters:
data_catalog_input_definition (Union[IResolvable, DataCatalogInputDefinitionProperty, Dict[str, Any], None]) – The AWS Glue Data Catalog parameters for the data.
s3_input_definition (Union[IResolvable, S3LocationProperty, Dict[str, Any], None]) – The Amazon S3 location where the data is stored.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

input_property = databrew.CfnRecipe.InputProperty(
    data_catalog_input_definition=databrew.CfnRecipe.DataCatalogInputDefinitionProperty(
        catalog_id="catalogId",
        database_name="databaseName",
        table_name="tableName",
        temp_directory=databrew.CfnRecipe.S3LocationProperty(
            bucket="bucket",

            # the properties below are optional
            key="key"
        )
    ),
    s3_input_definition=databrew.CfnRecipe.S3LocationProperty(
        bucket="bucket",

        # the properties below are optional
        key="key"
    )
)
Attributes
- data_catalog_input_definition
The AWS Glue Data Catalog parameters for the data.
- s3_input_definition
The Amazon S3 location where the data is stored.
RecipeParametersProperty
- class CfnRecipe.RecipeParametersProperty(*, aggregate_function=None, base=None, case_statement=None, category_map=None, chars_to_remove=None, collapse_consecutive_whitespace=None, column_data_type=None, column_range=None, count=None, custom_characters=None, custom_stop_words=None, custom_value=None, datasets_columns=None, date_add_value=None, date_time_format=None, date_time_parameters=None, delete_other_rows=None, delimiter=None, end_pattern=None, end_position=None, end_value=None, expand_contractions=None, exponent=None, false_string=None, group_by_agg_function_options=None, group_by_columns=None, hidden_columns=None, ignore_case=None, include_in_split=None, input=None, interval=None, is_text=None, join_keys=None, join_type=None, left_columns=None, limit=None, lower_bound=None, map_type=None, mode_type=None, multi_line=None, num_rows=None, num_rows_after=None, num_rows_before=None, order_by_column=None, order_by_columns=None, other=None, pattern=None, pattern_option1=None, pattern_option2=None, pattern_options=None, period=None, position=None, remove_all_punctuation=None, remove_all_quotes=None, remove_all_whitespace=None, remove_custom_characters=None, remove_custom_value=None, remove_leading_and_trailing_punctuation=None, remove_leading_and_trailing_quotes=None, remove_leading_and_trailing_whitespace=None, remove_letters=None, remove_numbers=None, remove_source_column=None, remove_special_characters=None, right_columns=None, sample_size=None, sample_type=None, secondary_inputs=None, second_input=None, sheet_indexes=None, sheet_names=None, source_column=None, source_column1=None, source_column2=None, source_columns=None, start_column_index=None, start_pattern=None, start_position=None, start_value=None, stemming_mode=None, step_count=None, step_index=None, stop_words_mode=None, strategy=None, target_column=None, target_column_names=None, target_date_format=None, target_index=None, time_zone=None, tokenizer_pattern=None, true_string=None, udf_lang=None, 
units=None, unpivot_column=None, upper_bound=None, use_new_data_frame=None, value=None, value1=None, value2=None, value_column=None, view_frame=None)
Bases:
object
Parameters that are used as inputs for various recipe actions.
The parameters are specific to the context in which they’re used.
- Parameters:
aggregate_function (
Optional
[str
]) – The name of an aggregation function to apply.base (
Optional
[str
]) – The number of digits used in a counting system.case_statement (
Optional
[str
]) – A case statement associated with a recipe.category_map (
Optional
[str
]) – A category map used for one-hot encoding.chars_to_remove (
Optional
[str
]) – Characters to remove from a step that applies one-hot encoding or tokenization.collapse_consecutive_whitespace (
Optional
[str
]) – Remove any non-word non-punctuation character.column_data_type (
Optional
[str
]) – The data type of the column.column_range (
Optional
[str
]) – A range of columns to which a step is applied.count (
Optional
[str
]) – The number of times a string needs to be repeated.custom_characters (
Optional
[str
]) – One or more characters that can be substituted or removed, depending on the context.custom_stop_words (
Optional
[str
]) – A list of words to ignore in a step that applies word tokenization.custom_value (
Optional
[str
]) – A list of custom values to use in a step that requires that you provide a value to finish the operation.datasets_columns (
Optional
[str
]) – A list of the dataset columns included in a project.date_add_value (
Optional
[str
]) – A value that specifies how many units of time to add or subtract for a date math operation.date_time_format (
Optional
[str
]) – A date format to apply to a date.date_time_parameters (
Optional
[str
]) – A set of parameters associated with a datetime.delete_other_rows (
Optional
[str
]) – Determines whether unmapped rows in a categorical mapping should be deleted.delimiter (
Optional
[str
]) – The delimiter to use when parsing separated values in a text file.end_pattern (
Optional
[str
]) – The end pattern to locate.end_position (
Optional
[str
]) – The end position to locate.end_value (
Optional
[str
]) – The end value to locate.expand_contractions (
Optional
[str
]) – A list of word contractions and what they expand to. For example: can’t ; cannot ; can not .exponent (
Optional
[str
]) – The exponent to apply in an exponential operation.false_string (
Optional
[str
]) – A value that representsFALSE
.group_by_agg_function_options (
Optional
[str
]) – Specifies options to apply to theGROUP BY
used in an aggregation.group_by_columns (
Optional
[str
]) – The columns to use in theGROUP BY
clause.hidden_columns (
Optional
[str
]) – A list of columns to hide.ignore_case (
Optional
[str
]) – Indicates that lower and upper case letters are treated equally.include_in_split (
Optional
[str
]) – Indicates if this column is participating in a split transform.input (
Any
) – The input location to load the dataset from - Amazon S3 or AWS Glue Data Catalog .interval (
Optional
[str
]) – The number of characters to split by.is_text (
Optional
[str
]) – Indicates if the content is text.join_keys (
Optional
[str
]) – The keys or columns involved in a join.join_type (
Optional
[str
]) – The type of join to use, for example,INNER JOIN
,OUTER JOIN
, and so on.left_columns (
Optional
[str
]) – The columns on the left side of the join.limit (
Optional
[str
]) – The number of times to performsplit
orreplaceBy
in a string.lower_bound (
Optional
[str
]) – The lower boundary for a value.map_type (
Optional
[str
]) – The type of mappings to apply to construct a new dynamic frame.mode_type (
Optional
[str
]) – Determines the manner in which mode value is calculated, in case there is more than one mode value. Valid values:NONE
|AVERAGE
|MINIMUM
|MAXIMUM
multi_line (
Union
[bool
,IResolvable
,None
]) – Specifies whether JSON input contains embedded new line characters.num_rows (
Optional
[str
]) – The number of rows to consider in a window.num_rows_after (
Optional
[str
]) – The number of rows to consider after the current row in a window.num_rows_before (
Optional
[str
]) – The number of rows to consider before the current row in a window.order_by_column (
Optional
[str
]) – A column to sort the results by.order_by_columns (
Optional
[str
]) – The columns to sort the results by.other (
Optional
[str
]) – The value to assign to unmapped cells, in categorical mapping.pattern (
Optional
[str
]) – The pattern to locate.pattern_option1 (
Optional
[str
]) – The starting pattern to split between.pattern_option2 (
Optional
[str
]) – The ending pattern to split between.pattern_options (
Optional
[str
]) – For splitting by multiple delimiters: A JSON-encoded string that lists the patterns in the format. For example:[{\"pattern\":\"1\",\"includeInSplit\":true}]
period (
Optional
[str
]) – The size of the rolling window.position (
Optional
[str
]) – The character index within a string.remove_all_punctuation (
Optional
[str
]) – Iftrue
, removes all of the following characters:.
.!
.,
.?
.remove_all_quotes (
Optional
[str
]) – Iftrue
, removes all single quotes and double quotes.remove_all_whitespace (
Optional
[str
]) – Iftrue
, removes all whitespaces from the value.remove_custom_characters (
Optional
[str
]) – Iftrue
, removes all characters specified byCustomCharacters
.remove_custom_value (
Optional
[str
]) – Iftrue
, removes all characters specified byCustomValue
.remove_leading_and_trailing_punctuation (
Optional
[str
]) – Iftrue
, removes the following characters if they occur at the start or end of the value:.
!
,
?
.remove_leading_and_trailing_quotes (
Optional
[str
]) – Iftrue
, removes single quotes and double quotes from the beginning and end of the value.remove_leading_and_trailing_whitespace (
Optional
[str
]) – Iftrue
, removes all whitespaces from the beginning and end of the value.remove_letters (
Optional
[str
]) – Iftrue
, removes all uppercase and lowercase alphabetic characters (A through Z; a through z).remove_numbers (
Optional
[str
]) – Iftrue
, removes all numeric characters (0 through 9).remove_source_column (
Optional
[str
]) – Iftrue
, the source column will be removed after un-nesting that column. (Used with nested column types, such as Map, Struct, or Array.)remove_special_characters (
Optional
[str
]) – Iftrue
, removes all of the following characters: ! “ # $ % & ‘ ( ) * + , - . / : ; < = > ? @ [ ] ^ _ ` { | } ~``right_columns (
Optional
[str
]) – The columns on the right side of a join.sample_size (
Optional
[str
]) – The number of rows in the sample.sample_type (
Optional
[str
]) – The sampling type to apply to the dataset. Valid values:FIRST_N
|LAST_N
|RANDOM
secondary_inputs (
Union
[IResolvable
,Sequence
[Union
[IResolvable
,SecondaryInputProperty
,Dict
[str
,Any
]]],None
]) – A list of secondary inputs in a UNION transform.second_input (
Optional
[str
]) – An object value to indicate the second dataset used in a join.sheet_indexes (
Union
[IResolvable
,Sequence
[Union
[int
,float
]],None
]) – One or more sheet numbers in the Excel file, which will be included in a dataset.sheet_names (
Optional
[Sequence
[str
]]) – One or more named sheets in the Excel file, which will be included in a dataset.source_column (
Optional
[str
]) – A source column needed for an operation, step, or transform.source_column1 (
Optional
[str
]) – A source column needed for an operation, step, or transform.source_column2 (
Optional
[str
]) – A source column needed for an operation, step, or transform.source_columns (
Optional
[str
]) – A list of source columns needed for an operation, step, or transform.start_column_index (
Optional
[str
]) – The index number of the first column used by an operation, step, or transform.start_pattern (
Optional
[str
]) – The starting pattern to locate.start_position (
Optional
[str
]) – The starting position to locate.start_value (
Optional
[str
]) – The starting value to locate.stemming_mode (
Optional
[str
]) – Indicates this operation uses stems and lemmas (base words) for word tokenization.step_count (
Optional
[str
]) – The total number of transforms in this recipe.step_index (
Optional
[str
]) – The index ID of a step.stop_words_mode (
Optional
[str
]) – Indicates this operation uses stop words as part of word tokenization.strategy (
Optional
[str
]) – The resolution strategy to apply in resolving ambiguities.target_column (
Optional
[str
]) – The column targeted by this operation.target_column_names (
Optional
[str
]) – The names to give columns altered by this operation.target_date_format (
Optional
[str
]) – The date format to convert to.target_index (
Optional
[str
]) – The index number of an object that is targeted by this operation.time_zone (
Optional
[str
]) – The current timezone that you want to use for dates.tokenizer_pattern (
Optional
[str
]) – A regex expression to use when splitting text into terms, also called words or tokens.true_string (
Optional
[str
]) – A value to use to representTRUE
.udf_lang (
Optional
[str
]) – The language that’s used in the user-defined function.units (
Optional
[str
]) – Specifies a unit of time. For example:MINUTES
;SECONDS
;HOURS
; etc.unpivot_column (
Optional
[str
]) – Cast columns as rows, so that each value is a different row in a single column.upper_bound (
Optional
[str
]) – The upper boundary for a value.use_new_data_frame (
Optional
[str
]) – Create a new container to hold a dataset.value (
Optional
[str
]) – A static value that can be used in a comparison, a substitution, or in another context-specific way. AValue
can be a number, string, or other datatype, depending on the recipe action in which it’s used.value1 (
Optional
[str
]) – A value that’s used by this operation.value2 (
Optional
[str
]) – A value that’s used by this operation.value_column (
Optional
[str
]) – The column that is provided as a value that’s used by this operation.view_frame (
Optional
[str
]) – The subset of rows currently available for viewing.
- See:
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

# input: Any

recipe_parameters_property = databrew.CfnRecipe.RecipeParametersProperty(
    aggregate_function="aggregateFunction",
    base="base",
    case_statement="caseStatement",
    category_map="categoryMap",
    chars_to_remove="charsToRemove",
    collapse_consecutive_whitespace="collapseConsecutiveWhitespace",
    column_data_type="columnDataType",
    column_range="columnRange",
    count="count",
    custom_characters="customCharacters",
    custom_stop_words="customStopWords",
    custom_value="customValue",
    datasets_columns="datasetsColumns",
    date_add_value="dateAddValue",
    date_time_format="dateTimeFormat",
    date_time_parameters="dateTimeParameters",
    delete_other_rows="deleteOtherRows",
    delimiter="delimiter",
    end_pattern="endPattern",
    end_position="endPosition",
    end_value="endValue",
    expand_contractions="expandContractions",
    exponent="exponent",
    false_string="falseString",
    group_by_agg_function_options="groupByAggFunctionOptions",
    group_by_columns="groupByColumns",
    hidden_columns="hiddenColumns",
    ignore_case="ignoreCase",
    include_in_split="includeInSplit",
    input=input,
    interval="interval",
    is_text="isText",
    join_keys="joinKeys",
    join_type="joinType",
    left_columns="leftColumns",
    limit="limit",
    lower_bound="lowerBound",
    map_type="mapType",
    mode_type="modeType",
    multi_line=False,
    num_rows="numRows",
    num_rows_after="numRowsAfter",
    num_rows_before="numRowsBefore",
    order_by_column="orderByColumn",
    order_by_columns="orderByColumns",
    other="other",
    pattern="pattern",
    pattern_option1="patternOption1",
    pattern_option2="patternOption2",
    pattern_options="patternOptions",
    period="period",
    position="position",
    remove_all_punctuation="removeAllPunctuation",
    remove_all_quotes="removeAllQuotes",
    remove_all_whitespace="removeAllWhitespace",
    remove_custom_characters="removeCustomCharacters",
    remove_custom_value="removeCustomValue",
    remove_leading_and_trailing_punctuation="removeLeadingAndTrailingPunctuation",
    remove_leading_and_trailing_quotes="removeLeadingAndTrailingQuotes",
    remove_leading_and_trailing_whitespace="removeLeadingAndTrailingWhitespace",
    remove_letters="removeLetters",
    remove_numbers="removeNumbers",
    remove_source_column="removeSourceColumn",
    remove_special_characters="removeSpecialCharacters",
    right_columns="rightColumns",
    sample_size="sampleSize",
    sample_type="sampleType",
    secondary_inputs=[databrew.CfnRecipe.SecondaryInputProperty(
        data_catalog_input_definition=databrew.CfnRecipe.DataCatalogInputDefinitionProperty(
            catalog_id="catalogId",
            database_name="databaseName",
            table_name="tableName",
            temp_directory=databrew.CfnRecipe.S3LocationProperty(
                bucket="bucket",

                # the properties below are optional
                key="key"
            )
        ),
        s3_input_definition=databrew.CfnRecipe.S3LocationProperty(
            bucket="bucket",

            # the properties below are optional
            key="key"
        )
    )],
    second_input="secondInput",
    sheet_indexes=[123],
    sheet_names=["sheetNames"],
    source_column="sourceColumn",
    source_column1="sourceColumn1",
    source_column2="sourceColumn2",
    source_columns="sourceColumns",
    start_column_index="startColumnIndex",
    start_pattern="startPattern",
    start_position="startPosition",
    start_value="startValue",
    stemming_mode="stemmingMode",
    step_count="stepCount",
    step_index="stepIndex",
    stop_words_mode="stopWordsMode",
    strategy="strategy",
    target_column="targetColumn",
    target_column_names="targetColumnNames",
    target_date_format="targetDateFormat",
    target_index="targetIndex",
    time_zone="timeZone",
    tokenizer_pattern="tokenizerPattern",
    true_string="trueString",
    udf_lang="udfLang",
    units="units",
    unpivot_column="unpivotColumn",
    upper_bound="upperBound",
    use_new_data_frame="useNewDataFrame",
    value="value",
    value1="value1",
    value2="value2",
    value_column="valueColumn",
    view_frame="viewFrame"
)
Attributes
- aggregate_function
The name of an aggregation function to apply.
- base
The number of digits used in a counting system.
- case_statement
A case statement associated with a recipe.
- category_map
A category map used for one-hot encoding.
- chars_to_remove
Characters to remove from a step that applies one-hot encoding or tokenization.
- collapse_consecutive_whitespace
Remove any non-word non-punctuation character.
- column_data_type
The data type of the column.
- column_range
A range of columns to which a step is applied.
- count
The number of times a string needs to be repeated.
- custom_characters
One or more characters that can be substituted or removed, depending on the context.
- custom_stop_words
A list of words to ignore in a step that applies word tokenization.
- custom_value
A list of custom values to use in a step that requires that you provide a value to finish the operation.
- datasets_columns
A list of the dataset columns included in a project.
- date_add_value
A value that specifies how many units of time to add or subtract for a date math operation.
- date_time_format
A date format to apply to a date.
- date_time_parameters
A set of parameters associated with a datetime.
- delete_other_rows
Determines whether unmapped rows in a categorical mapping should be deleted.
- delimiter
The delimiter to use when parsing separated values in a text file.
- end_pattern
The end pattern to locate.
- end_position
The end position to locate.
- end_value
The end value to locate.
- expand_contractions
A list of word contractions and what they expand to.
For example: can’t ; cannot ; can not .
- exponent
The exponent to apply in an exponential operation.
- false_string
A value that represents FALSE.
- group_by_agg_function_options
Specifies options to apply to the GROUP BY used in an aggregation.
- group_by_columns
The columns to use in the GROUP BY clause.
- hidden_columns
A list of columns to hide.
- ignore_case
Indicates that lower and upper case letters are treated equally.
- include_in_split
Indicates if this column is participating in a split transform.
- input
The input location to load the dataset from - Amazon S3 or AWS Glue Data Catalog .
- interval
The number of characters to split by.
- is_text
Indicates if the content is text.
- join_keys
The keys or columns involved in a join.
- join_type
The type of join to use, for example, INNER JOIN, OUTER JOIN, and so on.
- left_columns
The columns on the left side of the join.
- limit
The number of times to perform split or replaceBy in a string.
- lower_bound
The lower boundary for a value.
- map_type
The type of mappings to apply to construct a new dynamic frame.
- mode_type
Determines the manner in which mode value is calculated, in case there is more than one mode value.
Valid values: NONE | AVERAGE | MINIMUM | MAXIMUM
- multi_line
Specifies whether JSON input contains embedded new line characters.
- num_rows
The number of rows to consider in a window.
- num_rows_after
The number of rows to consider after the current row in a window.
- num_rows_before
The number of rows to consider before the current row in a window.
- order_by_column
A column to sort the results by.
- order_by_columns
The columns to sort the results by.
- other
The value to assign to unmapped cells, in categorical mapping.
- pattern
The pattern to locate.
- pattern_option1
The starting pattern to split between.
- pattern_option2
The ending pattern to split between.
- pattern_options
For splitting by multiple delimiters: a JSON-encoded string that lists the patterns in the format.
For example: [{\"pattern\":\"1\",\"includeInSplit\":true}]
- period
The size of the rolling window.
- position
The character index within a string.
- remove_all_punctuation
If true, removes all of the following characters: . ! , ?
- remove_all_quotes
If true, removes all single quotes and double quotes.
- remove_all_whitespace
If true, removes all whitespaces from the value.
- remove_custom_characters
If true, removes all characters specified by CustomCharacters.
- remove_custom_value
If true, removes all characters specified by CustomValue.
- remove_leading_and_trailing_punctuation
If true, removes the following characters if they occur at the start or end of the value: . ! , ?
- remove_leading_and_trailing_quotes
If true, removes single quotes and double quotes from the beginning and end of the value.
- remove_leading_and_trailing_whitespace
If true, removes all whitespaces from the beginning and end of the value.
- remove_letters
If true, removes all uppercase and lowercase alphabetic characters (A through Z; a through z).
- remove_numbers
If true, removes all numeric characters (0 through 9).
- remove_source_column
If true, the source column will be removed after un-nesting that column. (Used with nested column types, such as Map, Struct, or Array.)
- remove_special_characters
If true, removes all of the following characters: ! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ ] ^ _ ` { | } ~
- right_columns
The columns on the right side of a join.
- sample_size
The number of rows in the sample.
- sample_type
The sampling type to apply to the dataset.
Valid values: FIRST_N | LAST_N | RANDOM
- second_input
An object value to indicate the second dataset used in a join.
- secondary_inputs
A list of secondary inputs in a UNION transform.
- sheet_indexes
One or more sheet numbers in the Excel file, which will be included in a dataset.
- sheet_names
One or more named sheets in the Excel file, which will be included in a dataset.
- source_column
A source column needed for an operation, step, or transform.
- source_column1
A source column needed for an operation, step, or transform.
- source_column2
A source column needed for an operation, step, or transform.
- source_columns
A list of source columns needed for an operation, step, or transform.
- start_column_index
The index number of the first column used by an operation, step, or transform.
- start_pattern
The starting pattern to locate.
- start_position
The starting position to locate.
- start_value
The starting value to locate.
- stemming_mode
Indicates this operation uses stems and lemmas (base words) for word tokenization.
- step_count
The total number of transforms in this recipe.
- step_index
The index ID of a step.
- stop_words_mode
Indicates this operation uses stop words as part of word tokenization.
- strategy
The resolution strategy to apply in resolving ambiguities.
- target_column
The column targeted by this operation.
- target_column_names
The names to give columns altered by this operation.
- target_date_format
The date format to convert to.
- target_index
The index number of an object that is targeted by this operation.
- time_zone
The current timezone that you want to use for dates.
- tokenizer_pattern
A regular expression to use when splitting text into terms, also called words or tokens.
- true_string
A value to use to represent TRUE.
- udf_lang
The language that’s used in the user-defined function.
- units
Specifies a unit of time.
For example: MINUTES ; SECONDS ; HOURS ; etc.
- unpivot_column
Cast columns as rows, so that each value is a different row in a single column.
- upper_bound
The upper boundary for a value.
- use_new_data_frame
Create a new container to hold a dataset.
- value
A static value that can be used in a comparison, a substitution, or in another context-specific way.
A Value can be a number, string, or other datatype, depending on the recipe action in which it’s used.
- value1
A value that’s used by this operation.
- value2
A value that’s used by this operation.
- value_column
The column that is provided as a value that’s used by this operation.
- view_frame
The subset of rows currently available for viewing.
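In a recipe step, these parameters travel as the string-to-string parameters map of an ActionProperty, keyed by the camelCase names shown in the example above. A minimal sketch in plain Python; the RENAME operation name and its sourceColumn/targetColumn keys are illustrative assumptions, not taken from this reference:

```python
# Sketch: the shape of an action with a parameters map.
# Every key is a camelCase string and every value is a string,
# matching the {"parameters_key": "parameters"} placeholder above.
def make_rename_action(source: str, target: str) -> dict:
    """Build a plain-dict description of a hypothetical RENAME step."""
    return {
        "operation": "RENAME",       # illustrative operation name
        "parameters": {
            "sourceColumn": source,  # column to rename
            "targetColumn": target,  # new column name
        },
    }

action = make_rename_action("old_name", "new_name")
```

The same dict could then be passed where an ActionProperty (or an equivalent Dict[str, Any]) is accepted.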
RecipeStepProperty
- class CfnRecipe.RecipeStepProperty(*, action, condition_expressions=None)
Bases:
object
Represents a single step from a DataBrew recipe to be performed.
- Parameters:
action (Union[IResolvable, ActionProperty, Dict[str, Any]]) – The particular action to be performed in the recipe step.
condition_expressions (Union[IResolvable, Sequence[Union[IResolvable, ConditionExpressionProperty, Dict[str, Any]]], None]) – One or more conditions that must be met for the recipe step to succeed. All of the conditions in the array must be met. In other words, all of the conditions must be combined using a logical AND operation.
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

recipe_step_property = databrew.CfnRecipe.RecipeStepProperty(
    action=databrew.CfnRecipe.ActionProperty(
        operation="operation",

        # the properties below are optional
        parameters={
            "parameters_key": "parameters"
        }
    ),

    # the properties below are optional
    condition_expressions=[databrew.CfnRecipe.ConditionExpressionProperty(
        condition="condition",
        target_column="targetColumn",

        # the properties below are optional
        value="value"
    )]
)
Attributes
- action
The particular action to be performed in the recipe step.
- condition_expressions
One or more conditions that must be met for the recipe step to succeed.
All of the conditions in the array must be met. In other words, all of the conditions must be combined using a logical AND operation.
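The AND semantics described above can be sketched in a few lines of plain Python; this is a sketch of the evaluation rule only, not DataBrew's actual implementation:

```python
# Sketch of the rule above: a step runs only if every condition
# expression in the array evaluates to true (logical AND).
def step_should_run(condition_results: list) -> bool:
    """An empty list means no conditions gate the step."""
    return all(condition_results)

ran_all_met = step_should_run([True, True])    # every condition met
ran_one_unmet = step_should_run([True, False]) # one unmet condition blocks the step
```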
S3LocationProperty
- class CfnRecipe.S3LocationProperty(*, bucket, key=None)
Bases:
object
Represents an Amazon S3 location (bucket name, bucket owner, and object key) where DataBrew can read input data, or write output from a job.
- Parameters:
bucket (str) – The Amazon S3 bucket name.
key (Optional[str]) – The unique name of the object in the bucket.
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

s3_location_property = databrew.CfnRecipe.S3LocationProperty(
    bucket="bucket",

    # the properties below are optional
    key="key"
)
Attributes
- bucket
The Amazon S3 bucket name.
- key
The unique name of the object in the bucket.
SecondaryInputProperty
- class CfnRecipe.SecondaryInputProperty(*, data_catalog_input_definition=None, s3_input_definition=None)
Bases:
object
Represents secondary inputs in a UNION transform.
- Parameters:
data_catalog_input_definition (Union[IResolvable, DataCatalogInputDefinitionProperty, Dict[str, Any], None]) – The AWS Glue Data Catalog parameters for the data.
s3_input_definition (Union[IResolvable, S3LocationProperty, Dict[str, Any], None]) – The Amazon S3 location where the data is stored.
- ExampleMetadata:
fixture=_generated
Example:
# The code below shows an example of how to instantiate this type.
# The values are placeholders you should change.
from aws_cdk import aws_databrew as databrew

secondary_input_property = databrew.CfnRecipe.SecondaryInputProperty(
    data_catalog_input_definition=databrew.CfnRecipe.DataCatalogInputDefinitionProperty(
        catalog_id="catalogId",
        database_name="databaseName",
        table_name="tableName",
        temp_directory=databrew.CfnRecipe.S3LocationProperty(
            bucket="bucket",

            # the properties below are optional
            key="key"
        )
    ),
    s3_input_definition=databrew.CfnRecipe.S3LocationProperty(
        bucket="bucket",

        # the properties below are optional
        key="key"
    )
)
Attributes
- data_catalog_input_definition
The AWS Glue Data Catalog parameters for the data.
- s3_input_definition
The Amazon S3 location where the data is stored.
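A secondary input names its data through either the Data Catalog definition or the S3 definition. A small sketch of a guard that enforces picking exactly one source; the exactly-one rule is an assumption for illustration, since both parameters are optional in the API:

```python
# Sketch: check that a secondary input names exactly one data source,
# either a Data Catalog definition or an S3 location (plain dicts here).
def validate_secondary_input(data_catalog_input_definition=None,
                             s3_input_definition=None):
    sources = [d for d in (data_catalog_input_definition, s3_input_definition)
               if d is not None]
    if len(sources) != 1:
        raise ValueError("Provide exactly one of data_catalog_input_definition "
                         "or s3_input_definition")
    return sources[0]

chosen = validate_secondary_input(s3_input_definition={"bucket": "bucket",
                                                       "key": "key"})
```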