TransformInput

class aws_cdk.aws_stepfunctions_tasks.TransformInput(*, transform_data_source, compression_type=None, content_type=None, split_type=None)

Bases: object

Dataset to be transformed and the Amazon S3 location where it is stored.

Parameters:

transform_data_source (Union[TransformDataSource, Dict[str, Any]]) – S3 location of the channel data.
compression_type (Optional[CompressionType]) – The compression type of the transform data. Default: NONE
content_type (Optional[str]) – Multipurpose internet mail extension (MIME) type of the data. Default: None
split_type (Optional[SplitType]) – Method to use to split the transform job’s data files into smaller batches. Default: NONE

Example:

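# Imports assume CDK v2 (aws-cdk-lib); in CDK v1, Duration comes from aws_cdk.core.
from aws_cdk import Duration, aws_ec2 as ec2, aws_stepfunctions_tasks as tasks

tasks.SageMakerCreateTransformJob(self, "Batch Inference",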
    transform_job_name="MyTransformJob",
    model_name="MyModelName",
    model_client_options=tasks.ModelClientOptions(
        invocations_max_retries=3,  # default is 0
        invocations_timeout=Duration.minutes(5)
    ),
    transform_input=tasks.TransformInput(
        transform_data_source=tasks.TransformDataSource(
            s3_data_source=tasks.TransformS3DataSource(
                s3_uri="s3://inputbucket/train",
                s3_data_type=tasks.S3DataType.S3_PREFIX
            )
        )
    ),
    transform_output=tasks.TransformOutput(
        s3_output_path="s3://outputbucket/TransformJobOutputPath"
    ),
    transform_resources=tasks.TransformResources(
        instance_count=1,
        instance_type=ec2.InstanceType.of(ec2.InstanceClass.M4, ec2.InstanceSize.XLARGE)
    )
)
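
The example above sets only transform_data_source. As a minimal sketch, the optional parameters could be set as follows; the bucket name is the same placeholder used above, and the "text/csv" content type with gzip compression and line splitting is an assumed scenario, not a recommendation.

```python
# Assumes CDK v2 (aws-cdk-lib) is installed.
from aws_cdk import aws_stepfunctions_tasks as tasks

transform_input = tasks.TransformInput(
    transform_data_source=tasks.TransformDataSource(
        s3_data_source=tasks.TransformS3DataSource(
            s3_uri="s3://inputbucket/train",  # placeholder location
            s3_data_type=tasks.S3DataType.S3_PREFIX,
        )
    ),
    compression_type=tasks.CompressionType.GZIP,  # input objects are gzip-compressed
    content_type="text/csv",                      # MIME type of the data
    split_type=tasks.SplitType.LINE,              # split input files on line boundaries
)
```

The resulting object can be passed as the transform_input argument of SageMakerCreateTransformJob, as in the example above.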

Attributes

compression_type

The compression type of the transform data.

Default: NONE

content_type

Multipurpose internet mail extension (MIME) type of the data.

Default: None

split_type

Method to use to split the transform job’s data files into smaller batches.

Default: NONE

transform_data_source: S3 location of the channel data.