AWS::DataPipeline::Pipeline
The AWS::DataPipeline::Pipeline resource specifies a data pipeline that you can use to automate the movement and transformation of data.
Important
AWS Data Pipeline is no longer available to new customers. Existing customers of AWS Data Pipeline can continue to use the service as normal. Learn more
In each pipeline, you define pipeline objects, such as activities, schedules, data nodes, and resources.
The AWS::DataPipeline::Pipeline resource adds tasks, schedules, and preconditions to the specified pipeline. You can use PutPipelineDefinition to populate a new pipeline.
PutPipelineDefinition also validates the configuration as it adds it to the pipeline. Changes to the pipeline are saved unless one of the following validation errors exists in the pipeline:
- An object is missing a name or identifier field.
- A string or reference field is empty.
- The number of objects in the pipeline exceeds the allowed maximum number of objects.
- The pipeline is in a FINISHED state.
Pipeline object definitions are passed to the PutPipelineDefinition action and returned by the GetPipelineDefinition action.
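For orientation, the following minimal sketch shows how a single pipeline object is expressed in the PipelineObjects property, drawn from the full example later on this page: each entry carries an Id and a Name (the fields the validation rules above require) and a list of Fields, where each field pairs a Key with either a StringValue or a RefValue that points to another object. The MySchedule identifier and name are illustrative placeholders, not required values.
YAML
# Illustrative PipelineObjects entry; identifier and name are placeholders.
PipelineObjects:
  - Id: "MySchedule"          # hypothetical identifier (required for every object)
    Name: "RunOnceADay"       # hypothetical name (required for every object)
    Fields:
      - Key: "type"
        StringValue: "Schedule"
      - Key: "startAt"
        StringValue: "FIRST_ACTIVATION_DATE_TIME"
      - Key: "period"
        StringValue: "1 Day"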
Syntax
To declare this entity in your AWS CloudFormation template, use the following syntax:
JSON
{ "Type" : "AWS::DataPipeline::Pipeline", "Properties" : { "Activate" :Boolean, "Description" :String, "Name" :String, "ParameterObjects" :[ ParameterObject, ... ], "ParameterValues" :[ ParameterValue, ... ], "PipelineObjects" :[ PipelineObject, ... ], "PipelineTags" :[ PipelineTag, ... ]} }
YAML
Type: AWS::DataPipeline::Pipeline
Properties:
  Activate: Boolean
  Description: String
  Name: String
  ParameterObjects:
    - ParameterObject
  ParameterValues:
    - ParameterValue
  PipelineObjects:
    - PipelineObject
  PipelineTags:
    - PipelineTag
Properties
- Activate
- Indicates whether to validate and start the pipeline or stop an active pipeline. By default, the value is set to true.
  Required: No
  Type: Boolean
  Update requires: No interruption
- Description
- A description of the pipeline.
  Required: No
  Type: String
  Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\r\n\t]*
  Minimum: 0
  Maximum: 1024
  Update requires: Replacement
- Name
- The name of the pipeline.
  Required: Yes
  Type: String
  Pattern: [\u0020-\uD7FF\uE000-\uFFFD\uD800\uDC00-\uDBFF\uDFFF\n\t]*
  Minimum: 1
  Maximum: 1024
  Update requires: Replacement
- ParameterObjects
- The parameter objects used with the pipeline.
  Required: No
  Type: Array of ParameterObject
  Update requires: No interruption
- ParameterValues
- The parameter values used with the pipeline.
  Required: No
  Type: Array of ParameterValue
  Update requires: No interruption
- PipelineObjects
- The objects that define the pipeline. These objects overwrite the existing pipeline definition. Not all objects, fields, and values can be updated. For information about restrictions, see Editing Your Pipeline in the AWS Data Pipeline Developer Guide.
  Required: No
  Type: Array of PipelineObject
  Update requires: No interruption
- PipelineTags
- A list of arbitrary tags (key-value pairs) to associate with the pipeline, which you can use to control permissions. For more information, see Controlling Access to Pipelines and Resources in the AWS Data Pipeline Developer Guide. A minimal tagging sketch follows this list.
  Required: No
  Type: Array of PipelineTag
  Update requires: No interruption
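The following sketch shows a minimal resource declaration that attaches a single tag through the PipelineTags property. The logical ID MyTaggedPipeline, the pipeline name, and the tag key and value are placeholders; each PipelineTag entry is assumed to take a Key and a Value, mirroring the key-value pairs described above.
YAML
MyTaggedPipeline:                  # hypothetical logical ID
  Type: AWS::DataPipeline::Pipeline
  Properties:
    Name: "my-pipeline"            # placeholder pipeline name
    Activate: false                # validate the definition without starting the pipeline
    PipelineTags:
      - Key: "environment"         # placeholder tag key
        Value: "test"              # placeholder tag value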
Return values
Ref
When you pass the logical ID of this resource to the intrinsic Ref function, Ref returns the pipeline ID.
For more information about using the Ref function, see Ref.
Fn::GetAtt
- PipelineId
- The ID of the pipeline.
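To illustrate both return values, the following sketch exports the pipeline ID from a template, assuming a pipeline resource with the logical ID DynamoDBInputS3OutputHive (the logical ID used in the examples below); the output names are arbitrary.
YAML
Outputs:
  PipelineRefId:
    Description: "Pipeline ID returned by Ref"
    Value:
      Ref: "DynamoDBInputS3OutputHive"
  PipelineAttrId:
    Description: "Pipeline ID returned by Fn::GetAtt"
    Value:
      Fn::GetAtt:
        - "DynamoDBInputS3OutputHive"
        - "PipelineId"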
Examples
The following data pipeline backs up data from an Amazon DynamoDB table to an Amazon S3 bucket. The pipeline uses the HiveCopyActivity activity to copy the data, and runs it once a day. The roles for the pipeline and the pipeline resource are declared elsewhere in the same template.
JSON
"DynamoDBInputS3OutputHive": { "Type": "AWS::DataPipeline::Pipeline", "Properties": { "Name": "DynamoDBInputS3OutputHive", "Description": "Pipeline to backup DynamoDB data to S3", "Activate": "true", "ParameterObjects": [ { "Id": "myDDBReadThroughputRatio", "Attributes": [ { "Key": "description", "StringValue": "DynamoDB read throughput ratio" }, { "Key": "type", "StringValue": "Double" }, { "Key": "default", "StringValue": "0.2" } ] }, { "Id": "myOutputS3Loc", "Attributes": [ { "Key": "description", "StringValue": "S3 output bucket" }, { "Key": "type", "StringValue": "AWS::S3::ObjectKey" }, { "Key": "default", "StringValue": { "Fn::Join" : [ "", [ "s3://", { "Ref": "S3OutputLoc" } ] ] } } ] }, { "Id": "myDDBTableName", "Attributes": [ { "Key": "description", "StringValue": "DynamoDB Table Name " }, { "Key": "type", "StringValue": "String" } ] } ], "ParameterValues": [ { "Id": "myDDBTableName", "StringValue": { "Ref": "TableName" } } ], "PipelineObjects": [ { "Id": "S3BackupLocation", "Name": "Copy data to this S3 location", "Fields": [ { "Key": "type", "StringValue": "S3DataNode" }, { "Key": "dataFormat", "RefValue": "DDBExportFormat" }, { "Key": "directoryPath", "StringValue": "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}" } ] }, { "Id": "DDBSourceTable", "Name": "DDBSourceTable", "Fields": [ { "Key": "tableName", "StringValue": "#{myDDBTableName}" }, { "Key": "type", "StringValue": "DynamoDBDataNode" }, { "Key": "dataFormat", "RefValue": "DDBExportFormat" }, { "Key": "readThroughputPercent", "StringValue": "#{myDDBReadThroughputRatio}" } ] }, { "Id": "DDBExportFormat", "Name": "DDBExportFormat", "Fields": [ { "Key": "type", "StringValue": "DynamoDBExportDataFormat" } ] }, { "Id": "TableBackupActivity", "Name": "TableBackupActivity", "Fields": [ { "Key": "resizeClusterBeforeRunning", "StringValue": "true" }, { "Key": "type", "StringValue": "HiveCopyActivity" }, { "Key": "input", "RefValue": "DDBSourceTable" }, { "Key": "runsOn", "RefValue": "EmrClusterForBackup" }, { "Key": "output", "RefValue": "S3BackupLocation" } ] }, { "Id": "DefaultSchedule", "Name": "RunOnce", "Fields": [ { "Key": "occurrences", "StringValue": "1" }, { "Key": "startAt", "StringValue": "FIRST_ACTIVATION_DATE_TIME" }, { "Key": "type", "StringValue": "Schedule" }, { "Key": "period", "StringValue": "1 Day" } ] }, { "Id": "Default", "Name": "Default", "Fields": [ { "Key": "type", "StringValue": "Default" }, { "Key": "scheduleType", "StringValue": "cron" }, { "Key": "failureAndRerunMode", "StringValue": "CASCADE" }, { "Key": "role", "StringValue": "DataPipelineDefaultRole" }, { "Key": "resourceRole", "StringValue": "DataPipelineDefaultResourceRole" }, { "Key": "schedule", "RefValue": "DefaultSchedule" } ] }, { "Id": "EmrClusterForBackup", "Name": "EmrClusterForBackup", "Fields": [ { "Key": "terminateAfter", "StringValue": "2 Hours" }, { "Key": "amiVersion", "StringValue": "3.3.2" }, { "Key": "masterInstanceType", "StringValue": "m1.medium" }, { "Key": "coreInstanceType", "StringValue": "m1.medium" }, { "Key": "coreInstanceCount", "StringValue": "1" }, { "Key": "type", "StringValue": "EmrCluster" } ] } ] } }
YAML
DynamoDBInputS3OutputHive:
  Type: AWS::DataPipeline::Pipeline
  Properties:
    Name: DynamoDBInputS3OutputHive
    Description: "Pipeline to backup DynamoDB data to S3"
    Activate: true
    ParameterObjects:
      - Id: "myDDBReadThroughputRatio"
        Attributes:
          - Key: "description"
            StringValue: "DynamoDB read throughput ratio"
          - Key: "type"
            StringValue: "Double"
          - Key: "default"
            StringValue: "0.2"
      - Id: "myOutputS3Loc"
        Attributes:
          - Key: "description"
            StringValue: "S3 output bucket"
          - Key: "type"
            StringValue: "AWS::S3::ObjectKey"
          - Key: "default"
            StringValue:
              Fn::Join:
                - ""
                - - "s3://"
                  - Ref: "S3OutputLoc"
      - Id: "myDDBTableName"
        Attributes:
          - Key: "description"
            StringValue: "DynamoDB Table Name "
          - Key: "type"
            StringValue: "String"
    ParameterValues:
      - Id: "myDDBTableName"
        StringValue:
          Ref: "TableName"
    PipelineObjects:
      - Id: "S3BackupLocation"
        Name: "Copy data to this S3 location"
        Fields:
          - Key: "type"
            StringValue: "S3DataNode"
          - Key: "dataFormat"
            RefValue: "DDBExportFormat"
          - Key: "directoryPath"
            StringValue: "#{myOutputS3Loc}/#{format(@scheduledStartTime, 'YYYY-MM-dd-HH-mm-ss')}"
      - Id: "DDBSourceTable"
        Name: "DDBSourceTable"
        Fields:
          - Key: "tableName"
            StringValue: "#{myDDBTableName}"
          - Key: "type"
            StringValue: "DynamoDBDataNode"
          - Key: "dataFormat"
            RefValue: "DDBExportFormat"
          - Key: "readThroughputPercent"
            StringValue: "#{myDDBReadThroughputRatio}"
      - Id: "DDBExportFormat"
        Name: "DDBExportFormat"
        Fields:
          - Key: "type"
            StringValue: "DynamoDBExportDataFormat"
      - Id: "TableBackupActivity"
        Name: "TableBackupActivity"
        Fields:
          - Key: "resizeClusterBeforeRunning"
            StringValue: "true"
          - Key: "type"
            StringValue: "HiveCopyActivity"
          - Key: "input"
            RefValue: "DDBSourceTable"
          - Key: "runsOn"
            RefValue: "EmrClusterForBackup"
          - Key: "output"
            RefValue: "S3BackupLocation"
      - Id: "DefaultSchedule"
        Name: "RunOnce"
        Fields:
          - Key: "occurrences"
            StringValue: "1"
          - Key: "startAt"
            StringValue: "FIRST_ACTIVATION_DATE_TIME"
          - Key: "type"
            StringValue: "Schedule"
          - Key: "period"
            StringValue: "1 Day"
      - Id: "Default"
        Name: "Default"
        Fields:
          - Key: "type"
            StringValue: "Default"
          - Key: "scheduleType"
            StringValue: "cron"
          - Key: "failureAndRerunMode"
            StringValue: "CASCADE"
          - Key: "role"
            StringValue: "DataPipelineDefaultRole"
          - Key: "resourceRole"
            StringValue: "DataPipelineDefaultResourceRole"
          - Key: "schedule"
            RefValue: "DefaultSchedule"
      - Id: "EmrClusterForBackup"
        Name: "EmrClusterForBackup"
        Fields:
          - Key: "terminateAfter"
            StringValue: "2 Hours"
          - Key: "amiVersion"
            StringValue: "3.3.2"
          - Key: "masterInstanceType"
            StringValue: "m1.medium"
          - Key: "coreInstanceType"
            StringValue: "m1.medium"
          - Key: "coreInstanceCount"
            StringValue: "1"
          - Key: "type"
            StringValue: "EmrCluster"
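The example references two template parameters, S3OutputLoc and TableName, that are declared elsewhere in the same template. A minimal sketch of such declarations, assuming plain String parameters with illustrative descriptions, might look like the following.
YAML
Parameters:
  S3OutputLoc:
    Type: String
    Description: "S3 bucket (and optional prefix) for the DynamoDB backup"   # assumed description
  TableName:
    Type: String
    Description: "Name of the DynamoDB table to back up"                      # assumed description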
See also
- Pipeline Object Reference in the AWS Data Pipeline Developer Guide.
- PutPipelineDefinition in the AWS Data Pipeline API Reference.