

# Creating a task for transferring your data
<a name="create-task-how-to"></a>

A *task* describes where and how AWS DataSync transfers data. A task consists of the following:
+ [**Source location**](working-with-locations.md) – The storage system or service where DataSync transfers data from.
+ [**Destination location**](working-with-locations.md) – The storage system or service where DataSync transfers data to.
+ [**Task options**](task-options.md) – Settings such as what files to transfer, how data gets verified, when the task runs, and more.
+ [**Task executions**](run-task.md) – When you run a task, it's called a *task execution*.

## Creating your task
<a name="create-task-steps"></a>

When you create a DataSync task, you specify your source and destination locations. You can also customize your task by choosing which files to transfer, configuring how metadata gets handled, setting up a schedule, and more.

Before you create your task, make sure that you understand [how DataSync transfers work](how-datasync-transfer-works.md#transferring-files) and review the [task quotas](datasync-limits.md#task-hard-limits).

**Important**  
If you're planning to transfer data to or from an Amazon S3 location, review [how DataSync can affect your S3 request charges](create-s3-location.md#create-s3-location-s3-requests) and the [DataSync pricing page](https://aws.amazon.com/datasync/pricing/) before you begin.

### Using the DataSync console
<a name="create-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. Make sure you're in one of the AWS Regions where you plan to transfer data.

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. On the **Configure source location** page, [create](transferring-data-datasync.md) or choose a source location, then choose **Next**.

1. On the **Configure destination location** page, [create](transferring-data-datasync.md) or choose a destination location, then choose **Next**.

1. (Recommended) On the **Configure settings** page, give your task a name that you can remember.

1. While still on the **Configure settings** page, choose your task options or use the default settings.

   You might be interested in some of the following options:
   + Specify the [task mode](choosing-task-mode.md) that you want to use.
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch](monitor-datasync.md). We recommend setting up some kind of monitoring for your task.

   When you're done, choose **Next**.

1. Review your task configuration, then choose **Create task**.

You're ready to [start your task](run-task.md).

### Using the AWS CLI
<a name="create-task-cli"></a>

Once you [create your DataSync source and destination locations](transferring-data-datasync.md), you can create your task.

1. In your AWS CLI settings, make sure that you're using one of the AWS Regions where you plan to transfer data.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --name "task-name"
   ```

1. For `--source-location-arn`, specify the Amazon Resource Name (ARN) of your source location.

1. For `--destination-location-arn`, specify the ARN of your destination location.

   If you're transferring across AWS Regions or accounts, make sure that the ARN includes the other Region or account ID.

1. (Recommended) For `--name`, specify a name for your task that you can remember.

1. Specify other task options as needed. You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch](monitor-datasync.md). We recommend setting up some kind of monitoring for your task.

   For more options, see [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html). Here's an example `create-task` command that specifies several options:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --cloud-watch-log-group-arn "arn:aws:logs:region:account-id:log-group:log-group-name:*" \
     --name "task-name" \
     --options VerifyMode=NONE,OverwriteMode=NEVER,Atime=BEST_EFFORT,Mtime=PRESERVE,Uid=INT_VALUE,Gid=INT_VALUE,PreserveDevices=PRESERVE,PosixPermissions=PRESERVE,PreserveDeletedFiles=PRESERVE,TaskQueueing=ENABLED,LogLevel=TRANSFER
   ```

1. Run the `create-task` command.

   If the command is successful, you get a response that shows you the ARN of the task that you created. For example:

   ```
   { 
       "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026" 
   }
   ```

You're ready to [start your task](run-task.md).
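
If you're scripting these CLI steps, you can capture the task ARN from the `create-task` response and pass it to `start-task-execution`. Here's a minimal sketch in Python; the response text mirrors the example above, and the ARN value is illustrative:

```
import json

# Response from `aws datasync create-task` (ARN value is illustrative).
response_text = """
{
    "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026"
}
"""

task_arn = json.loads(response_text)["TaskArn"]

# You would pass this ARN to `aws datasync start-task-execution --task-arn`.
print(task_arn)
```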

## Task statuses
<a name="understand-task-creation-statuses"></a>

When you create a DataSync task, you can check its status to see if it's ready to run.


| Console status | API status | Description | 
| --- | --- | --- | 
| Available |  `AVAILABLE`  |  The task is ready to start transferring data.  | 
| Running |  `RUNNING`  | A task execution is in progress. For more information, see [Task execution statuses](run-task.md#understand-task-execution-statuses). | 
|  Unavailable  |  `UNAVAILABLE`  |  A DataSync agent used by the task is offline. For more information, see [What do I do if my agent is offline?](troubleshooting-datasync-agents.md#troubleshoot-agent-offline)  | 
|  Queued  |  `QUEUED`  |  Another task execution that uses the same DataSync agent is in progress. For more information, see [Knowing when your task is queued](run-task.md#queue-task-execution).  | 

## Partitioning large datasets with multiple tasks
<a name="multiple-tasks-large-dataset"></a>

If you're transferring a large dataset, such as [migrating](datasync-large-migration.md) millions of files or objects, we recommend using DataSync Enhanced mode, which can transfer datasets with virtually unlimited numbers of files. For very large datasets with billions of files, consider partitioning your dataset across multiple DataSync tasks. Partitioning your data across multiple tasks (and possibly [agents](do-i-need-datasync-agent.md#multiple-agents), depending on your locations) helps reduce the time it takes DataSync to prepare and transfer your data.

Consider some of the ways that you can partition a large dataset across several DataSync tasks:
+ Create tasks that transfer separate folders. For example, you might create two tasks that target `/FolderA` and `/FolderB`, respectively, in your source storage.
+ Create tasks that transfer subsets of files, objects, and folders by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).

Be mindful that this approach can increase the I/O operations on your storage and affect your network bandwidth. For more information, see the blog on [How to accelerate your data transfers with DataSync scale out architectures](https://aws.amazon.com/blogs/storage/how-to-accelerate-your-data-transfers-with-aws-datasync-scale-out-architectures/).
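
For example, one way to plan a folder-based partition is to split your top-level folders into one group per task and turn each group into that task's include filter (DataSync filters separate patterns with a `|` character). A rough sketch, assuming hypothetical folder names and task count:

```
# Distribute top-level source folders across a fixed number of DataSync
# tasks. Each group becomes one task's include filter. Folder names and
# the task count are hypothetical.
def partition_folders(folders, task_count):
    groups = [[] for _ in range(task_count)]
    for i, folder in enumerate(sorted(folders)):
        groups[i % task_count].append(folder)
    return groups

groups = partition_folders(["/FolderA", "/FolderB", "/FolderC", "/FolderD"], 2)
include_filters = ["|".join(group) for group in groups]
print(include_filters)  # one include-filter string per task
```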

## Segmenting transferred data with multiple tasks
<a name="multiple-tasks-organize-transfer"></a>

If you're transferring different sets of data to the same destination, you can create multiple tasks to help segment the data that you transfer.

For example, if you're transferring to the same S3 bucket named `MyBucket`, you can create a different prefix in the bucket for each task. This approach prevents file name conflicts between the datasets and allows you to set different permissions for each prefix. Here's how you might set this up:

1. Create three prefixes in the destination `MyBucket` named `task1`, `task2`, and `task3`:
   + `s3://MyBucket/task1`
   + `s3://MyBucket/task2`
   + `s3://MyBucket/task3`

1. Create three DataSync tasks named `task1`, `task2`, and `task3` that transfer to the corresponding prefix in `MyBucket`.
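
If you're scripting this setup, the mapping from task name to destination prefix is straightforward. A sketch using the bucket and task names from the example:

```
# Map each DataSync task to its own prefix in the shared destination bucket.
bucket = "MyBucket"
task_names = ["task1", "task2", "task3"]
destinations = {name: f"s3://{bucket}/{name}" for name in task_names}

for name, uri in destinations.items():
    print(name, "->", uri)
```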

# Choosing a task mode for your data transfer
<a name="choosing-task-mode"></a>

Your AWS DataSync task can run in one of the following modes:
+ **Enhanced mode** – Transfer virtually unlimited numbers of files or objects with higher performance than Basic mode. Enhanced mode tasks optimize the data transfer process by listing, preparing, transferring, and verifying data in parallel. Enhanced mode is currently available for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent.
+ **Basic mode** – Transfer files or objects between AWS storage and all other supported DataSync locations. Basic mode tasks are subject to [quotas](datasync-limits.md) on the number of files, objects, and directories in a dataset. Basic mode sequentially prepares, transfers, and verifies data, making it slower than Enhanced mode for most workloads.

## Understanding task mode differences
<a name="task-mode-differences"></a>

The following information can help you determine which task mode to use.


| Capability | Enhanced mode behavior | Basic mode behavior | 
| --- | --- | --- | 
| [Performance](how-datasync-transfer-works.md#transferring-files) | DataSync lists, prepares, transfers, and verifies your data in parallel. Provides higher performance than Basic mode for most workloads (such as transferring large objects) | DataSync prepares, transfers, and verifies your data sequentially. Performance is slower than Enhanced mode for most workloads | 
| Number of items in a dataset that DataSync can work with per task execution |  Virtually unlimited numbers of objects  |  [Quotas](datasync-limits.md#task-hard-limits) apply  | 
|  Data transfer [counters](transfer-performance-counters.md) and [metrics](monitor-datasync.md)  |  More counters and metrics than Basic mode, such as the number of objects that DataSync finds at your source location, how many objects are prepared during each task execution, and folder counters similar to file and object counters  |  Fewer counters and metrics than Enhanced mode  | 
|  [Logging](configure-logging.md)  | Structured logs (JSON format) | Unstructured logs | 
|  [Supported locations](working-with-locations.md)  | Currently for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent. |  For transfers between all locations that DataSync supports  | 
|  [Data verification options](configure-data-verification-options.md)  | DataSync verifies only transferred data | DataSync verifies all data by default | 
| Cost | For more information, see the [DataSync pricing](https://aws.amazon.com/datasync/pricing) page | For more information, see the [DataSync pricing](https://aws.amazon.com/datasync/pricing) page | 
| Failure handling for unsupported object tags | For cloud storage transfers to or from locations that don't support object tagging, the task execution fails immediately if the ObjectTags option is unspecified or set to PRESERVE. | For cloud storage transfers to or from locations that don't support object tagging, the task execution runs normally but reports per-object failures for tagged objects if the ObjectTags option is unspecified or set to PRESERVE. | 

## Choosing a task mode
<a name="choosing-task-mode-how-to"></a>

You can choose Enhanced mode only for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent. Otherwise, you must use Basic mode. For example, a transfer from an on-premises [HDFS location](create-hdfs-location.md) to an S3 location requires Basic mode.

Your task options and performance might vary depending on the task mode you choose. Once you create your task, you can't change the task mode.

**Required permissions**  
To create an Enhanced mode task, the IAM identity that you're using with DataSync must have the `iam:CreateServiceLinkedRole` permission.  
For your DataSync user permissions, consider using [AWSDataSyncFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-awsdatasyncfullaccess). This is an AWS managed policy that provides a user full access to DataSync and minimal access to its dependencies.

### Using the DataSync console
<a name="choosing-task-mode-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Task mode**, choose one of the following options:
   + **Enhanced**
   + **Basic**

   For more information, see [Understanding task mode differences](#task-mode-differences).

1. While still on the **Configure settings** page, choose other task options or use the default settings.

   You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch Logs](monitor-datasync.md).

   When you're done, choose **Next**.

1. Review your task configuration, then choose **Create task**.

### Using the AWS CLI
<a name="choosing-task-mode-cli"></a>

1. In your AWS CLI settings, make sure that you're using one of the AWS Regions where you plan to transfer data.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --task-mode "ENHANCED-or-BASIC"
   ```

1. For `--source-location-arn`, specify the Amazon Resource Name (ARN) of your source location.

1. For `--destination-location-arn`, specify the ARN of your destination location.

   If you're transferring across AWS Regions or accounts, make sure that the ARN includes the other Region or account ID.

1. For `--task-mode`, specify `ENHANCED` or `BASIC`.

   For more information, see [Understanding task mode differences](#task-mode-differences).

1. Specify other task options as needed. You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch Logs](monitor-datasync.md).

   For more options, see [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html). Here's an example `create-task` command that specifies Enhanced mode and several other options:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --name "task-name" \
     --task-mode "ENHANCED" \
     --options TransferMode=CHANGED,VerifyMode=ONLY_FILES_TRANSFERRED,ObjectTags=PRESERVE,LogLevel=TRANSFER
   ```

1. Run the `create-task` command.

   If the command is successful, you get a response that shows you the ARN of the task that you created. For example:

   ```
   { 
       "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026" 
   }
   ```

### Using the DataSync API
<a name="choosing-task-mode-api"></a>

You can specify the DataSync task mode by configuring the `TaskMode` parameter in the [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html) operation.

# Choosing what AWS DataSync transfers
<a name="task-options"></a>

AWS DataSync lets you choose what to transfer and how you want your data handled. Some options include:
+ Transferring an exact list of files or objects by using a manifest.
+ Including or excluding certain types of data in your transfer by using a filter.
+ For recurring transfers, moving only the data that's changed since the last transfer.
+ Overwriting data in the destination location to match what's in the source location.
+ Choosing which file or object metadata to preserve between your storage locations.

**Topics**
+ [Transferring specific files or objects by using a manifest](transferring-with-manifest.md)
+ [Transferring specific files, objects, and folders by using filters](filtering.md)
+ [Understanding how DataSync handles file and object metadata](metadata-copied.md)
+ [Links and directories copied by AWS DataSync](special-files-copied.md)
+ [Configuring how to handle files, objects, and metadata](configure-metadata.md)

# Transferring specific files or objects by using a manifest
<a name="transferring-with-manifest"></a>

A *manifest* is a list of files or objects that you want AWS DataSync to transfer. For example, instead of having to transfer everything in an S3 bucket with potentially millions of objects, DataSync transfers only the objects that you list in your manifest.

Manifests are similar to [filters](filtering.md), but they let you name exactly which files or objects to transfer rather than transferring whatever matches a pattern.

**Note**  
The maximum allowable size for a manifest file with Enhanced mode tasks is 20 GB.

## Creating your manifest
<a name="transferring-with-manifest-create"></a>

A manifest is a comma-separated values (CSV)-formatted file that lists the files or objects in your source location that you want DataSync to transfer. If your source is an S3 bucket, you can also include which version of an object to transfer.

**Topics**
+ [Guidelines](#transferring-with-manifest-guidelines)
+ [Example manifests](#manifest-examples)

### Guidelines
<a name="transferring-with-manifest-guidelines"></a>

Use these guidelines to help you create a manifest that works with DataSync.

------
#### [ Do ]
+ Specify the full path of each file or object that you want to transfer.

  You can't specify only a directory or folder with the intention of transferring all of its contents. For these situations, consider using an [include filter](filtering.md) instead of a manifest.
+ Make sure that each file or object path is relative to the mount path, folder, directory, or prefix that you specified when configuring your DataSync source location.

  For example, let's say you [configure an S3 location](create-s3-location.md#create-s3-location-how-to) with a prefix named `photos`. That prefix includes an object `my-picture.png` that you want to transfer. In the manifest, you then only need to specify the object (`my-picture.png`) instead of the prefix and object (`photos/my-picture.png`).
+ To specify Amazon S3 object version IDs, separate the object's path and version ID by using a comma.

  The following example shows a manifest entry with two fields. The first field includes an object named `picture1.png`. The second field is separated by a comma and includes a version ID of `111111`:

  ```
  picture1.png,111111
  ```
+ Use quotes in the following situations:
  + When a path contains special characters (commas, quotes, and line endings):

    `"filename,with,commas.txt"`
  + When a path spans multiple lines:

    ```
    "this
    is
    a
    filename.txt"
    ```
  + When a path includes quotes:

    `"filename""with""quotes.txt"`

    This represents a path named `filename"with"quotes.txt`.

  These quote rules also apply to version ID fields. In general, if a manifest field has a quote, you must escape it with another quote.
+ Separate each file or object entry with a new line.

  You can separate entries by using Unix-style (line feed) or Windows-style (carriage return followed by a line feed) line breaks.
+ Save your manifest (for example, `my-manifest.csv` or `my-manifest.txt`).
+ Upload the manifest to an S3 bucket that [DataSync can access](#transferring-with-manifest-access).

  This bucket doesn't have to be in the same AWS Region or account where you're using DataSync.

------
#### [ Don't ]
+ Specify only a directory or folder with the intention of transferring all of its contents.

  A manifest can only include full paths to the files or objects that you want to transfer. If you configure your source location to use a specific mount path, folder, directory, or prefix, you don't have to include that in your manifest.
+ Specify a file or object path that exceeds 4,096 characters.
+ Specify a file path, object path, or Amazon S3 object version ID that exceeds 1,024 bytes.
+ Specify duplicate file or object paths.
+ Include an object version ID if your source location isn't an S3 bucket.
+ Include more than two fields in a manifest entry.

  An entry can include only a file or object path and (if applicable) an Amazon S3 object version ID.
+ Include characters that don't conform to UTF-8 encoding.
+ Include unintentional spaces in your entry fields outside of quotes.

------
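
The quoting rules above are standard CSV quoting, so you can let a CSV library apply them instead of escaping by hand. Here's a sketch using Python's `csv` module; the file name, paths, and version ID are illustrative:

```
import csv

# Each entry is a path plus an optional S3 object version ID.
# The paths and version ID here are illustrative.
entries = [
    ("photos/picture1.png", "111111"),
    ("filename,with,commas.txt", None),
    ('filename"with"quotes.txt', None),
]

with open("my-manifest.csv", "w", newline="") as f:
    writer = csv.writer(f, lineterminator="\n")
    for path, version_id in entries:
        writer.writerow([path, version_id] if version_id else [path])

manifest_lines = open("my-manifest.csv").read().splitlines()
print(manifest_lines)
```

The writer quotes only the fields that need it and doubles any embedded quotes, matching the rules described above.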

### Example manifests
<a name="manifest-examples"></a>

Use these examples to help you create a manifest that works with DataSync. 

**Manifest with full file or object paths**  
The following example shows a manifest with full file or object paths to transfer.  

```
photos/picture1.png
photos/picture2.png
photos/picture3.png
```

**Manifest with only object keys**  
The following example shows a manifest with objects to transfer from an Amazon S3 source location. Since the [location is configured](create-s3-location.md#create-s3-location-how-to) with the prefix `photos`, only the object keys are specified.  

```
picture1.png
picture2.png
picture3.png
```

**Manifest with object paths and version IDs**  
The first two entries in the following manifest example include specific Amazon S3 object versions to transfer.  

```
photos/picture1.png,111111
photos/picture2.png,121212
photos/picture3.png
```

**Manifest with UTF-8 characters**  
The following example shows a manifest with files that include UTF-8 characters.  

```
documents/résumé1.pdf
documents/résumé2.pdf
documents/résumé3.pdf
```
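
If your source is a file server, you can generate a manifest like the examples above by walking the source tree and recording each file relative to the path that you configured for the location. A sketch using a throwaway directory (the directory and file names are hypothetical, and the sketch assumes plain paths; paths containing commas or quotes would need the CSV quoting described in the guidelines):

```
import os
import tempfile

def write_manifest(source_root, manifest_path):
    # Write one full path per line, relative to source_root, using "/"
    # separators regardless of platform.
    with open(manifest_path, "w", newline="") as f:
        for dirpath, _dirnames, filenames in os.walk(source_root):
            for name in sorted(filenames):
                full_path = os.path.join(dirpath, name)
                relative = os.path.relpath(full_path, source_root).replace(os.sep, "/")
                f.write(relative + "\n")

# Demo: build a small tree, then generate the manifest from it.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "photos"))
for name in ("picture1.png", "picture2.png"):
    open(os.path.join(root, "photos", name), "w").close()

# Keep the manifest outside the tree so it doesn't list itself.
manifest_path = os.path.join(tempfile.mkdtemp(), "my-manifest.csv")
write_manifest(root, manifest_path)
print(open(manifest_path).read())
```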

## Providing DataSync access to your manifest
<a name="transferring-with-manifest-access"></a>

You need an AWS Identity and Access Management (IAM) role that gives DataSync access to your manifest in its S3 bucket. This role must include the following permissions:
+ `s3:GetObject`
+ `s3:GetObjectVersion`

You can generate this role automatically in the DataSync console or create the role yourself.

**Note**  
If your manifest is in a different AWS account, you must create this role manually.

### Creating the IAM role automatically
<a name="creating-manfiest-role-automatically"></a>

When creating or starting a transfer task in the console, DataSync can create an IAM role for you with the `s3:GetObject` and `s3:GetObjectVersion` permissions that you need to access your manifest.

**Required permissions to automatically create the role**  
To automatically create the role, make sure that the role that you're using to access the DataSync console has the following permissions:  
+ `iam:CreateRole`
+ `iam:CreatePolicy`
+ `iam:AttachRolePolicy`

### Creating the IAM role (same account)
<a name="creating-manfiest-role-automatically-same-account"></a>

You can manually create the IAM role that DataSync needs to access your manifest. The following instructions assume that you're in the same AWS account where you use DataSync and your manifest's S3 bucket is located. 

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **AWS service**.

1. For **Use case**, choose **DataSync** in the dropdown list and select **DataSync**. Choose **Next**.

1. On the **Add permissions** page, choose **Next**. Give your role a name and choose **Create role**.

1. On the **Roles** page, search for the role that you just created and choose its name.

1. On the role's details page, choose the **Permissions** tab. Choose **Add permissions** then **Create inline policy**.

1. Choose the **JSON** tab and paste the following sample policy into the policy editor:

------
#### [ JSON ]

****  

   ```
   {
       "Version": "2012-10-17",
       "Statement": [{
           "Sid": "DataSyncAccessManifest",
           "Effect": "Allow",
           "Action": [
               "s3:GetObject",
               "s3:GetObjectVersion"
           ],
           "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/my-manifest.csv"
       }]
   }
   ```

------

1. In the sample policy that you just pasted, replace the following values with your own:

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest.

   1. Replace `my-manifest.csv` with the file name of your manifest.

1. Choose **Next**. Give your policy a name and choose **Create policy**.

1. (Recommended) To prevent the [cross-service confused deputy problem](cross-service-confused-deputy-prevention.md), do the following:

   1. On the role's details page, choose the **Trust relationships** tab. Choose **Edit trust policy**.

   1. Update the trust policy by using the following example, which includes the `aws:SourceArn` and `aws:SourceAccount` global condition context keys:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                  "Service": "datasync.amazonaws.com"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                  "StringEquals": {
                  "aws:SourceAccount": "555555555555"
                  },
                  "ArnLike": {
                  "aws:SourceArn": "arn:aws:datasync:us-east-1:555555555555:*"
                  }
              }
            }
        ]
      }
      ```

------
      + Replace each instance of `555555555555` with the AWS account ID where you're using DataSync.
      + Replace `us-east-1` with the AWS Region where you're using DataSync.

   1. Choose **Update policy**.

You've created an IAM role that allows DataSync to access your manifest. Specify this role when [creating](#manifest-creating-task) or [starting](#manifest-starting-task) your task.

### Creating the IAM role (different account)
<a name="creating-manfiest-role-automatically-different-account"></a>

If your manifest is in an S3 bucket that belongs to a different AWS account, you must manually create the IAM role that DataSync uses to access the manifest. Then, in the AWS account where your manifest is located, you need to include the role in the S3 bucket policy.

#### Creating the role
<a name="creating-manfiest-role-automatically-different-account-1"></a>

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **AWS service**.

1. For **Use case**, choose **DataSync** in the dropdown list and select **DataSync**. Choose **Next**.

1. On the **Add permissions** page, choose **Next**. Give your role a name and choose **Create role**.

1. On the **Roles** page, search for the role that you just created and choose its name.

1. On the role's details page, choose the **Permissions** tab. Choose **Add permissions** then **Create inline policy**.

1. Choose the **JSON** tab and paste the following sample policy into the policy editor:

------
#### [ JSON ]

****  

   ```
   {
       "Version": "2012-10-17",
       "Statement": [{
           "Sid": "DataSyncAccessManifest",
           "Effect": "Allow",
           "Action": [
               "s3:GetObject",
               "s3:GetObjectVersion"
           ],
           "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/my-manifest.csv"
       }]
   }
   ```

------

1. In the sample policy that you just pasted, replace the following values with your own:

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest.

   1. Replace `my-manifest.csv` with the file name of your manifest.

1. Choose **Next**. Give your policy a name and choose **Create policy**.

1. (Recommended) To prevent the [cross-service confused deputy problem](cross-service-confused-deputy-prevention.md), do the following:

   1. On the role's details page, choose the **Trust relationships** tab. Choose **Edit trust policy**.

   1. Update the trust policy by using the following example, which includes the `aws:SourceArn` and `aws:SourceAccount` global condition context keys:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                  "Service": "datasync.amazonaws.com"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                  "StringEquals": {
                  "aws:SourceAccount": "000000000000"
                  },
                  "ArnLike": {
                  "aws:SourceArn": "arn:aws:datasync:us-east-1:000000000000:*"
                  }
              }
           }
        ]
      }
      ```

------
      + Replace each instance of `000000000000` with the AWS account ID where you're using DataSync.
      + Replace `us-east-1` with the AWS Region where you're using DataSync.

   1. Choose **Update policy**.

You've created the IAM role that you can include in your S3 bucket policy.

#### Updating your S3 bucket policy with the role
<a name="creating-manfiest-role-automatically-different-account-2"></a>

Once you've created the IAM role, you must add it to the S3 bucket policy in the other AWS account where your manifest is located.

1. In the AWS Management Console, switch to the account with your manifest's S3 bucket.

1. Open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. On the bucket's detail page, choose the **Permissions** tab.

1. Under **Bucket policy**, choose **Edit** and do the following to modify your S3 bucket policy:

   1. Update what's in the editor to include the following policy statements:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "DataSyncAccessManifestBucket",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::account-id:role/datasync-role"
            },
            "Action": [
              "s3:GetObject",
              "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
          }
        ]
      }
      ```

------

   1. Replace `account-id` with the AWS account ID for the account that you're using DataSync with.

   1. Replace `datasync-role` with the IAM role that you just created that allows DataSync to access your manifest.

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest in the other AWS account.

1. Choose **Save changes**.

You've created an IAM role that allows DataSync to access your manifest in the other account. Specify this role when [creating](#manifest-creating-task) or [starting](#manifest-starting-task) your task.
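
To see how the pieces fit together, here's a minimal Python sketch that fills in the bucket-policy template with hypothetical placeholder values (`111122223333` and `datasync-manifest-role` are illustrations, not values from this guide) and prints the JSON you'd paste into the bucket-policy editor:

```python
import json

# Hypothetical values for illustration -- substitute your own.
account_id = "111122223333"
role_name = "datasync-manifest-role"
bucket_name = "amzn-s3-demo-bucket"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DataSyncAccessManifestBucket",
            "Effect": "Allow",
            # The cross-account role that DataSync assumes to read the manifest.
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{role_name}"},
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            # Object-level ARN: s3:GetObject applies to objects, not the bucket itself.
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }
    ],
}

# Emit the JSON you would paste into the bucket-policy editor.
print(json.dumps(policy, indent=2))
```

Note the object-level `Resource` ARN (ending in `/*`): `s3:GetObject` and `s3:GetObjectVersion` operate on objects, so a bucket-level ARN alone won't grant access to the manifest.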

## Specifying your manifest when creating a task
<a name="manifest-creating-task"></a>

You can specify the manifest that you want DataSync to use when creating a task.

### Using the DataSync console
<a name="manifest-creating-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using a manifest**.

1. For **S3 URI**, choose your manifest that's hosted on an S3 bucket.

   Alternatively, you can enter the URI (for example, `s3://bucket/prefix/my-manifest.csv`).

1. For **Object version**, choose the version of the manifest that you want DataSync to use.

   By default, DataSync uses the latest version of the object.

1. For **Manifest access role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role with the permissions required to access your manifest in its S3 bucket.
   + Choose an existing IAM role that can access your manifest.

   For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).

1. Configure any other task settings you need, then choose **Next**.

1. Choose **Create task**.

### Using the AWS CLI
<a name="manifest-creating-task-cli"></a>

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
     --destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
     --manifest-config '{
         "Source": {
           "S3": {
               "ManifestObjectPath": "s3-object-key-of-manifest",
               "BucketAccessRoleArn": "bucket-iam-role",
               "S3BucketArn": "amzn-s3-demo-bucket-arn",
               "ManifestObjectVersionId": "manifest-version-to-use"
           }
         }
     }'
   ```

1. For the `--source-location-arn` parameter, specify the Amazon Resource Name (ARN) of the location that you're transferring data from.

1. For the `--destination-location-arn` parameter, specify the ARN of the location that you're transferring data to.

1. For the `--manifest-config` parameter, do the following:
   + `ManifestObjectPath` – Specify the S3 object key of your manifest.
   + `BucketAccessRoleArn` – Specify the IAM role that allows DataSync to access your manifest in its S3 bucket.

     For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).
   + `S3BucketArn` – Specify the ARN of the S3 bucket that's hosting your manifest.
   + `ManifestObjectVersionId` – Specify the version of the manifest that you want DataSync to use.

     By default, DataSync uses the latest version of the object.

1. Run the `create-task` command to create your task.

When you're ready, you can [start your transfer task](run-task.md).
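
The `--manifest-config` value must reach the CLI as a single shell token, which is why it's wrapped in single quotation marks. As a quick local sanity check, this Python sketch (using placeholder ARNs, not real resources) builds the same JSON structure and prints it single-quoted for the shell:

```python
import json

# Placeholder values mirroring the create-task example above -- substitute your own.
manifest_config = {
    "Source": {
        "S3": {
            "ManifestObjectPath": "prefix/my-manifest.csv",
            "BucketAccessRoleArn": "arn:aws:iam::123456789012:role/datasync-manifest-role",
            "S3BucketArn": "arn:aws:s3:::amzn-s3-demo-bucket",
            # Optional: omit this key to use the latest version of the manifest object.
            "ManifestObjectVersionId": "1111111",
        }
    }
}

# Single-quote the serialized JSON so the shell passes it as one argument.
cli_argument = "'" + json.dumps(manifest_config) + "'"
print(cli_argument)
```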

## Specifying your manifest when starting a task
<a name="manifest-starting-task"></a>

You can specify the manifest that you want DataSync to use when starting a task.

### Using the DataSync console
<a name="manifest-starting-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose the task that you want to start.

1. On the task overview page, choose **Start**, and then choose **Start with overriding options**.

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using a manifest**.

1. For **S3 URI**, choose your manifest that's hosted on an S3 bucket.

   Alternatively, you can enter the URI (for example, `s3://bucket/prefix/my-manifest.csv`).

1. For **Object version**, choose the version of the manifest that you want DataSync to use.

   By default, DataSync uses the latest version of the object.

1. For **Manifest access role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role to access your manifest in its S3 bucket.
   + Choose an existing IAM role that can access your manifest.

   For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).

1. Choose **Start** to begin your transfer.

### Using the AWS CLI
<a name="manifest-starting-task-cli"></a>

1. Copy the following `start-task-execution` command:

   ```
   aws datasync start-task-execution \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh \
     --manifest-config '{
         "Source": {
           "S3": {
               "ManifestObjectPath": "s3-object-key-of-manifest",
               "BucketAccessRoleArn": "bucket-iam-role",
               "S3BucketArn": "amzn-s3-demo-bucket-arn",
               "ManifestObjectVersionId": "manifest-version-to-use"
           }
         }
     }'
   ```

1. For the `--task-arn` parameter, specify the Amazon Resource Name (ARN) of the task that you're starting.

1. For the `--manifest-config` parameter, do the following:
   + `ManifestObjectPath` – Specify the S3 object key of your manifest.
   + `BucketAccessRoleArn` – Specify the IAM role that allows DataSync to access your manifest in its S3 bucket.

     For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).
   + `S3BucketArn` – Specify the ARN of the S3 bucket that's hosting your manifest.
   + `ManifestObjectVersionId` – Specify the version of the manifest that you want DataSync to use.

     By default, DataSync uses the latest version of the object.

1. Run the `start-task-execution` command to begin your transfer.

## Limitations
<a name="transferring-with-manifest-limitations"></a>
+ You can't use a manifest together with [filters](filtering.md).
+ You can't specify only a directory or folder with the intention of transferring all of its contents. For these situations, consider using an [include filter](filtering.md) instead of a manifest.
+ You can't use the **Keep deleted files** task option (`PreserveDeletedFiles` in the [API](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PreserveDeletedFiles)) to [maintain files or objects in the destination that aren't in the source](configure-metadata.md). DataSync only transfers what's listed in your manifest and doesn't delete anything in the destination.

## Troubleshooting
<a name="manifests-troubleshooting"></a>

**Errors related to `HeadObject` or `GetObjectTagging`**  
If you're transferring objects with specific version IDs from an S3 bucket, you might see an error related to `HeadObject` or `GetObjectTagging`. For example, here's an error related to `GetObjectTagging`:

```
[WARN] Failed to read metadata for file /picture1.png (versionId: 111111): S3 Get Object Tagging Failed
[ERROR] S3 Exception: op=GetObjectTagging photos/picture1.png, code=403, type=15, exception=AccessDenied, 
msg=Access Denied req-hdrs: content-type=application/xml, x-amz-api-version=2006-03-01 rsp-hdrs: content-type=application/xml, 
date=Wed, 07 Feb 2024 20:16:14 GMT, server=AmazonS3, transfer-encoding=chunked, 
x-amz-id-2=IOWQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa3Km, x-amz-request-id=79104EXAMPLEB723
```

If you see either of these errors, validate that the IAM role that DataSync uses to access your S3 source location has the following permissions:
+ `s3:GetObjectVersion`
+ `s3:GetObjectVersionTagging`

If you need to update your role with these permissions, see [Creating an IAM role for DataSync to access your Amazon S3 location](create-s3-location.md#create-role-manually).

**Error: `ManifestFileDoesNotExist`**  
This error indicates that a file within the manifest was not found at the source. Review the [guidelines](#transferring-with-manifest-guidelines) for creating a manifest.

## Next steps
<a name="manifests-next-steps"></a>

If you haven't already, [start your task](run-task.md). Otherwise, [monitor your task's activity](monitoring-overview.md).

# Transferring specific files, objects, and folders by using filters
<a name="filtering"></a>

AWS DataSync lets you apply filters to include or exclude data from your source location in a transfer. For example, if you don't want to transfer temporary files that end with `.tmp`, you can create an exclude filter so that these files don't make their way to your destination location.

You can use a combination of exclude and include filters in the same transfer task. If you modify a task's filters, those changes are applied the next time you run the task.

## Filtering terms, definitions, and syntax
<a name="filter-overview"></a>

Familiarize yourself with the concepts related to DataSync filtering:

**Filter**  
The whole string that makes up a particular filter (for example, `*.tmp|*.temp` or `/folderA|/folderB`).  
Filters are made up of patterns delimited by using a pipe (`|`). You don't need a delimiter when you add patterns in the DataSync console because you add each pattern separately.  
Filters are case sensitive. For example, the filter `/folderA` won't match `/FolderA`.

**Pattern**  
A pattern within a filter. For example, `*.tmp` is a pattern that's part of the `*.tmp|*.temp` filter. If your filter has multiple patterns, you delimit each pattern by using a pipe (`|`).

**Folders**  
+ All filters are relative to the source location path. For example, suppose that you specify `/my_source/` as the source path when you create your source location and task and specify the include filter `/transfer_this/`. In this case, DataSync transfers only the directory `/my_source/transfer_this/` and its contents.
+ To specify a folder directly under the source location, include a forward slash (/) in front of the folder name. In the example preceding, the pattern uses `/transfer_this`, not `transfer_this`.
+ DataSync interprets the following patterns the same way and matches both the folder and its content.

  `/dir` 

  `/dir/`
+ When you are transferring data from or to an Amazon S3 bucket, DataSync treats the `/` character in the object key as the equivalent of a folder on a file system.

**Special characters**  
Following are special characters for use with filtering.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/filtering.html)

## Example filters
<a name="sample-filters"></a>

The following examples show common filters you can use with DataSync.

**Note**  
There are limits to how many characters you can use in a filter. For more information, see [DataSync quotas](datasync-limits.md#task-hard-limits).

**Exclude some folders from your source location**  
In some cases, you might want to exclude folders in your source location so that they aren't copied to your destination location. For example, if you have temporary work-in-progress folders, you can use something like the following filter:

`*/.temp`

To exclude folders with similar content (such as `/reports2021` and `/reports2022`), you can use an exclude filter like the following:

`/reports*`

To exclude folders at any level in the file hierarchy, you can use an exclude filter like the following:

`*/folder-to-exclude-1|*/folder-to-exclude-2`

To exclude folders at the top level of the source location, you can use an exclude filter like the following:

`/top-level-folder-to-exclude-1|/top-level-folder-to-exclude-2`

**Include a subset of the folders on your source location**  
In some cases, your source location might be a large share and you need to transfer a subset of the folders under the root. To include specific folders, start a task execution with an include filter like the following.

`/folder-to-transfer/*`

**Exclude specific file types**  
To exclude certain file types from the transfer, you can create a task execution with an exclude filter such as `*.temp`.

**Transfer individual files you specify**  
To transfer a list of individual files, start a task execution with an include filter like the following: `/folder/subfolder/file1.txt|/folder/subfolder/file2.txt|/folder/subfolder/file3.txt`
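
The examples above can be approximated locally. The following Python sketch is an illustrative approximation of the matching rules described in this section (pipe-delimited patterns, case sensitivity, `*` matching any characters, and folder patterns matching the folder's contents); it is not DataSync's actual implementation:

```python
from fnmatch import fnmatchcase

def matches_filter(path: str, filter_string: str) -> bool:
    """Approximate DataSync SIMPLE_PATTERN matching: patterns are
    pipe-delimited and case sensitive, and `*` matches any characters.
    A pattern that names a folder also matches the folder's contents."""
    for pattern in filter_string.split("|"):
        pattern = pattern.rstrip("/")  # `/dir` and `/dir/` are treated the same
        if fnmatchcase(path, pattern) or fnmatchcase(path, pattern + "/*"):
            return True
    return False

print(matches_filter("/reports2021/q1.csv", "/reports*"))      # True
print(matches_filter("/FolderA/file.txt", "/folderA"))         # False (case sensitive)
print(matches_filter("/projects/temp/scratch.bin", "*/temp"))  # True (folder at any level)
```

You might use a sketch like this to test a filter string against a sample of your source paths before running a task.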

## Creating include filters
<a name="include-filters"></a>

Include filters define the files, objects, and folders that you want DataSync to transfer. You can configure include filters when you create, edit, or start a task.

DataSync scans and transfers only files and folders that match the include filters. For example, to include a subset of your source folders, you might specify `/important_folder_1|/important_folder_2`.

**Note**  
Include filters support the wildcard (`*`) character only as the rightmost character in a pattern. For example, `/documents*|/code*` is supported, but `*.txt` isn't.

### Using the DataSync console
<a name="include-filters-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using filters**.

1. For **Includes**, enter your filter (for example, `/important_folders` to include an important directory), then choose **Add pattern**.

1. Add other include filters as needed. 

### Using the AWS CLI
<a name="include-filters-cli"></a>

When using the AWS CLI, you must use single quotation marks (`'`) around the filter and a pipe (`|`) as a delimiter if your filter has more than one pattern.

The following example specifies two include patterns, `/important_folder1` and `/important_folder2`, when running the `create-task` command.

```
aws datasync create-task \
   --source-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --destination-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --includes FilterType=SIMPLE_PATTERN,Value='/important_folder1|/important_folder2'
```

## Creating exclude filters
<a name="exclude-filters"></a>

Exclude filters define the files, objects, and folders in your source location that you don't want DataSync to transfer. You can configure these filters when you create, edit, or start a task.

**Topics**
+ [Data excluded by default](#directories-ignored-during-transfers)

### Data excluded by default
<a name="directories-ignored-during-transfers"></a>

DataSync automatically excludes some data from being transferred:
+ `.snapshot` – DataSync ignores any path ending with `.snapshot`, which typically is used for point-in-time snapshots of a storage system's files or directories.
+ `/.aws-datasync` and `/.awssync` – DataSync creates these folders in your location to help facilitate your transfer.
+ `/.zfs` – You might see this folder with Amazon FSx for OpenZFS locations.

### Using the DataSync console
<a name="adding-exclude-filters"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Excludes**, enter your filter (for example, `*/temp` to exclude temporary folders), then choose **Add pattern**.

1. Add other exclude filters as needed. 

1. If needed, add [include filters](#include-filters).

### Using the AWS CLI
<a name="adding-exclude-filters-cli"></a>

When using the AWS CLI, you must use single quotation marks (`'`) around the filter and a pipe (`|`) as a delimiter if your filter has more than one pattern.

The following example specifies two exclude patterns, `*/temp` and `*/tmp`, when running the `create-task` command.

```
aws datasync create-task \
   --source-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --destination-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --excludes FilterType=SIMPLE_PATTERN,Value='*/temp|*/tmp'
```

# Understanding how DataSync handles file and object metadata
<a name="metadata-copied"></a>

AWS DataSync can preserve your file or object metadata during a data transfer. How your metadata gets copied depends on your transfer locations and if those locations use similar types of metadata.

## System-level metadata
<a name="metadata-copied-system-level"></a>

In general, DataSync doesn't copy system-level metadata. For example, when transferring from an SMB file server, the permissions you configured at the file system level aren't copied to the destination storage system.

There are exceptions. When transferring between Amazon S3 and other object storage, DataSync does copy some [system-defined object metadata](#metadata-copied-between-object-s3).

## Metadata copied in Amazon S3 transfers
<a name="metadata-copied-amazon-s3"></a>

The following tables describe what metadata DataSync can copy when a transfer involves an Amazon S3 location.

**Topics**
+ [To Amazon S3](#metadata-copied-to-s3)
+ [Between Amazon S3 and other object storage](#metadata-copied-between-object-s3)
+ [Between Amazon S3 and HDFS](#metadata-copied-between-hdfs-s3)

### To Amazon S3
<a name="metadata-copied-to-s3"></a>


| When copying from one of these locations | To this location | DataSync can copy | 
| --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  The following as Amazon S3 user metadata: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html) The file metadata stored in Amazon S3 user metadata is interoperable with NFS shares on file gateways using AWS Storage Gateway. A file gateway enables low-latency access from on-premises networks to data that was copied to Amazon S3 by DataSync. This metadata is also interoperable with FSx for Lustre. When DataSync copies objects that contain this metadata back to an NFS server, the file metadata is restored. Restoring metadata requires granting elevated permissions to the NFS server. For more information, see [Configuring AWS DataSync transfers with an NFS file server](create-nfs-location.md).  | 

### Between Amazon S3 and other object storage
<a name="metadata-copied-between-object-s3"></a>

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)

### Between Amazon S3 and HDFS
<a name="metadata-copied-between-hdfs-s3"></a>


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | The following as Amazon S3 user metadata:[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)HDFS uses strings to store file and folder user and group ownership, rather than numeric identifiers, such as UIDs and GIDs. | 

## Metadata copied in NFS transfers
<a name="metadata-copied-nfs"></a>

The following table describes what metadata DataSync can copy between locations that use Network File System (NFS).


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | 

## Metadata copied in SMB transfers
<a name="metadata-copied-smb"></a>

The following table describes what metadata DataSync can copy between locations that use Server Message Block (SMB).


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | 

## Metadata copied in other transfer scenarios
<a name="metadata-copied-different"></a>

DataSync handles metadata the following ways when copying between these storage systems (most of which have different metadata structures).

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)

## Understanding when and how DataSync applies default POSIX metadata
<a name="POSIX-metadata"></a>

DataSync applies default POSIX metadata in the following situations:
+ When your transfer's source and destination locations don't have similar metadata structures
+ When metadata is missing from the source location

The following table describes how DataSync applies default POSIX metadata during these types of transfers:


| Source | Destination | File permissions | Folder permissions | UID | GID | 
| --- | --- | --- | --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0755  | 0755 |  65534  |  65534  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0644  |  0755  |  65534  |  65534  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0644  |  0755  |  65534  |  65534  | 

1 In cases where the objects don't have metadata that was previously applied by DataSync.
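
The defaults in the table (`0755` for folders, `0644` for files) are standard POSIX octal modes, and `65534` is conventionally the `nobody`/`nogroup` ID on Linux systems. A quick Python sketch showing what those default modes look like when applied and read back:

```python
import os
import stat
import tempfile

# Apply the default modes from the table above to a scratch folder and file,
# then read them back the way `ls -l` would report them.
scratch = tempfile.mkdtemp()
folder = os.path.join(scratch, "folder")
os.mkdir(folder)
os.chmod(folder, 0o755)  # default folder permissions: rwxr-xr-x

file_path = os.path.join(scratch, "file.txt")
with open(file_path, "w") as f:
    f.write("example")
os.chmod(file_path, 0o644)  # default file permissions: rw-r--r--

print(oct(stat.S_IMODE(os.stat(folder).st_mode)))  # 0o755
print(stat.filemode(os.stat(file_path).st_mode))   # -rw-r--r--
```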

# Links and directories copied by AWS DataSync
<a name="special-files-copied"></a>

AWS DataSync handles hard links, symbolic links, and directories differently depending on the storage locations involved in your transfer.

## Hard links
<a name="special-files-copied-hard-links"></a>

Here's how DataSync handles hard links in some common transfer scenarios:
+ **When transferring between an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), and Amazon EFS**, hard links are preserved.
+ **When transferring to Amazon S3**, each underlying file referenced by a hard link is transferred only once. During incremental transfers, separate objects are created in your S3 bucket. If a hard link is unchanged in Amazon S3, it's correctly restored when transferred to an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), or Amazon EFS file system.
+ **When transferring to Microsoft Azure Blob Storage**, each underlying file referenced by a hard link is transferred only once. During incremental transfers, separate objects are created in your blob storage if there are new references in the source. When transferring from Azure Blob Storage, DataSync transfers hard links as if they are individual files.
+ **When transferring between an SMB file server, FSx for Windows File Server, and FSx for ONTAP (using SMB)**, hard links aren't supported. If DataSync encounters hard links in these situations, the transfer task completes with an error. To learn more, check your CloudWatch logs.
+ **When transferring to HDFS**, hard links aren't supported. CloudWatch logs show these links as skipped.

## Symbolic links
<a name="special-files-copied-symbolic-links"></a>

Here's how DataSync handles symbolic links in some common transfer scenarios:
+ **When transferring between an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), and Amazon EFS**, symbolic links are preserved.
+ **When transferring to Amazon S3**, the link target path is stored in the Amazon S3 object. The link is correctly restored when transferred to an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP, or Amazon EFS file system.
+ **When transferring to Azure Blob Storage**, symbolic links aren't supported. CloudWatch logs show these links as skipped.
+ **When transferring between an SMB file server, FSx for Windows File Server, and FSx for ONTAP (using SMB)**, symbolic links aren't supported. DataSync doesn't transfer a symbolic link itself but instead a file referenced by the symbolic link. To recognize duplicate files and deduplicate them with symbolic links, you must configure deduplication on your destination file system.
+ **When transferring to HDFS**, symbolic links aren't supported. CloudWatch logs show these links as skipped.
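
The hard-link and symbolic-link behavior above follows from POSIX semantics, which you can observe locally. A minimal Python sketch (not DataSync code) showing why a hard-linked file is a single underlying file, while a symbolic link is only a stored target path:

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.bin")
with open(target, "wb") as f:
    f.write(b"payload")

# A hard link is a second directory entry for the same inode -- there is
# only one underlying file, which is why DataSync transfers it once.
hard = os.path.join(workdir, "data-hardlink.bin")
os.link(target, hard)
print(os.stat(target).st_ino == os.stat(hard).st_ino)  # True: same inode
print(os.stat(target).st_nlink)                        # 2

# A symbolic link stores only a target path; that path (not the file
# contents) is what gets preserved when symbolic links are supported.
sym = os.path.join(workdir, "data-symlink.bin")
os.symlink(target, sym)
print(os.readlink(sym) == target)  # True
```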

## Directories
<a name="special-files-copied-directories"></a>

In general, DataSync preserves directories when transferring between storage systems. This isn’t the case in the following situations:
+ **When transferring to Amazon S3**, directories are represented as empty objects that have prefixes and end with a forward slash (`/`).
+ **When transferring to Azure Blob Storage without a hierarchical namespace**, directories don't exist. What looks like a directory is just part of an object name.

# Configuring how to handle files, objects, and metadata
<a name="configure-metadata"></a>

You can configure how AWS DataSync handles your files, objects, and their associated metadata when transferring between locations.

For example, with recurring transfers, you might want to overwrite files in your destination with changes in the source to keep the locations in sync. You can copy properties such as POSIX permissions for files and folders, tags associated with objects, and access control lists (ACLs).

## Transfer mode options
<a name="task-option-transfer-mode"></a>

You can configure whether DataSync transfers only the data (including metadata) that's changed following an initial copy or all data every time you run the task. If you're planning on recurring transfers, you might only want to transfer what's changed since your previous task execution.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Transfer only data that has changed**  |  [TransferMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-TransferMode) set to `CHANGED`  | After your initial full transfer, DataSync copies only the data and metadata that differs between the source and destination location. | 
|  **Transfer all data**  |  [TransferMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-TransferMode) set to `ALL`  |  DataSync copies everything in the source to the destination without comparing differences between the locations.   | 
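
As a conceptual illustration only (DataSync's actual comparison covers additional metadata and is more involved), `CHANGED` mode can be thought of as skipping items whose size and modification time already match the destination:

```python
import os
import shutil
import tempfile

def needs_transfer(src: str, dst: str) -> bool:
    """Conceptual sketch of CHANGED mode: transfer when the destination is
    missing or its size or modification time differs from the source."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "src.txt")
dst = os.path.join(workdir, "dst.txt")
with open(src, "w") as f:
    f.write("v1")

print(needs_transfer(src, dst))  # True: destination missing
shutil.copy2(src, dst)           # copy2 preserves the modification time
print(needs_transfer(src, dst))  # False: size and mtime match
```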

## File and object handling options
<a name="task-option-file-object-handling"></a>

You can control some aspects of how DataSync treats your files or objects in the destination location. For example, DataSync can delete files in the destination that aren't in the source.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Keep deleted files**  |  [PreserveDeletedFiles](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PreserveDeletedFiles)  |  Specifies whether DataSync maintains files or objects in the destination location that don't exist in the source. If you configure your task to delete objects from your Amazon S3 bucket, you might incur minimum storage duration charges for certain storage classes. For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  You can't configure your task to delete data in the destination and also [transfer all data](#task-option-transfer-mode). When you transfer all data, DataSync doesn't scan your destination location and doesn't know what to delete.   | 
|  **Overwrite files**  |  [OverwriteMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-OverwriteMode)  |  Specifies whether DataSync modifies data in the destination location when the source data or metadata has changed. If you don't configure your task to overwrite data, the destination data isn't overwritten even if the source data differs. If your task overwrites objects, you might incur additional charges for certain storage classes (for example, for retrieval or early deletion). For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  | 

## Metadata handling options
<a name="task-option-metadata-handling"></a>

DataSync can preserve file and object metadata during a transfer. The metadata that DataSync can preserve depends on the storage systems involved and whether those systems use a similar metadata structure.

Before configuring your task, make sure that you understand how DataSync handles [metadata](metadata-copied.md) and [special files](special-files-copied.md) when transferring between your source and destination locations.

**Important**  
DataSync supports transfers to and from certain third-party cloud storage systems, such as Google Cloud Storage and IBM Cloud Object Storage, which handle system metadata in a way that is not fully S3-compatible. For these transfers, DataSync attempts to copy metadata attributes such as `ContentType`, `ContentEncoding`, `ContentLanguage`, and `CacheControl` on a best-effort basis. If the destination storage system does not apply these attributes, they will be ignored during task verification.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Copy ownership**  | [Gid](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Gid) and [Uid](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Uid) |  Specifies whether DataSync copies POSIX file and folder ownership, such as the user ID (UID) of the file's owner and the group ID (GID) of the file's group.  | 
|  **Copy permissions**  | [PosixPermissions](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PosixPermissions) |  Specifies whether DataSync copies POSIX permissions for files and folders from the source to the destination.  | 
|  **Copy timestamps**  | [Atime](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Atime) and [Mtime](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Mtime) |  Specifies whether DataSync copies timestamp metadata from the source to the destination. Required when you need to run a task more than once.  | 
|  **Copy object tags**  | [ObjectTags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-ObjectTags) |  Specifies whether DataSync preserves the tags associated with your objects when transferring between object storage systems.  | 
|  **Copy ownership, DACLs, and SACLs**  | [SecurityDescriptorCopyFlags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to `OWNER_DACL_SACL` |  DataSync copies the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/configure-metadata.html)  | 
|  **Copy ownership and DACLs**  | [SecurityDescriptorCopyFlags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to `OWNER_DACL` |  DataSync copies the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/configure-metadata.html) DataSync won't copy SACLs when you choose this option.  | 
|  **Do not copy ownership or ACLs**  | [SecurityDescriptorCopyFlags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to `NONE` |  DataSync doesn't copy any ownership or permissions data. The objects that DataSync writes to your destination location are owned by the user whose credentials are provided for DataSync to access the destination. Destination object permissions are determined based on the permissions configured on the destination server.  | 
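In an API call, these settings are fields of a single `Options` object. As an illustrative sketch (not a complete task definition), a payload that preserves POSIX ownership, permissions, and timestamps might look like the following; which fields apply depends on your source and destination locations:

```python
# Sketch of a DataSync Options payload that preserves POSIX metadata.
# The keys and values follow the Options API fields linked in the table
# above; adjust them to match your locations.
def posix_metadata_options():
    return {
        "Uid": "INT_VALUE",              # Copy ownership: user ID
        "Gid": "INT_VALUE",              # Copy ownership: group ID
        "PosixPermissions": "PRESERVE",  # Copy permissions
        "Atime": "BEST_EFFORT",          # Copy timestamps: access time
        "Mtime": "PRESERVE",             # Copy timestamps: modification time
    }

print(posix_metadata_options()["PosixPermissions"])  # PRESERVE
```

You would pass a dictionary like this as the `Options` parameter of `CreateTask`, `UpdateTask`, or `StartTaskExecution`.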

## Configuring file, object, and metadata handling options
<a name="configure-file-metadata-options"></a>

You can configure how DataSync handles files, objects, and metadata when creating, editing, or starting your transfer task.

### Using the DataSync console
<a name="configure-metadata-console"></a>

The following instructions describe how to configure file, object, and metadata handling options when creating a task.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Transfer mode**, choose one of the following options:
   + **Transfer only data that has changed**
   + **Transfer all data**

   For more information about these options, see [Transfer mode options](#task-option-transfer-mode).

1. Select **Keep deleted files** if you want DataSync to maintain files or objects in the destination location that don't exist in the source.

   If you don't choose this option and your task deletes objects from your Amazon S3 bucket, you might incur minimum storage duration charges for certain storage classes. For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).
**Warning**  
You can't deselect this option and enable **Transfer all data**. When you transfer all data, DataSync doesn't scan your destination location and doesn't know what to delete.

1. Select **Overwrite files** if you want DataSync to modify data in the destination location when the source data or metadata has changed.

   If your task overwrites objects, you might incur additional charges for certain storage classes (for example, for retrieval or early deletion). For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).

   If you don't choose this option, the destination data isn't overwritten even if the source data differs.

1. Under **Transfer options**, select how you want DataSync to handle metadata. For more information about the options, see [Metadata handling options](#task-option-metadata-handling).
**Important**  
The options you see in the console depend on your task's source and destination locations. You might have to expand **Additional settings** to see some of these options.
   + **Copy ownership**
   + **Copy permissions**
   + **Copy timestamps**
   + **Copy object tags**
   + **Copy ownership, DACLs, and SACLs**
   + **Copy ownership and DACLs**
   + **Do not copy ownership or ACLs**

### Using the DataSync API
<a name="configure-file-metadata-options-api"></a>

You can configure file, object, and metadata handling options by using the `Options` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)

# Configuring how AWS DataSync verifies data integrity
<a name="configure-data-verification-options"></a>

During a transfer, AWS DataSync uses checksum verification to verify the integrity of the data that you copy between locations. You also can configure DataSync to perform additional verification at the end of your transfer.

## Data verification options
<a name="data-verification-options"></a>

Use the following information to help you decide if and how you want DataSync to perform these additional checks.


| Console option | API option | Description | 
| --- | --- | --- | 
|  **Verify only transferred data** (recommended)  |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `ONLY_FILES_TRANSFERRED`  |  DataSync calculates the checksum of transferred data (including metadata) at the source location. At the end of your transfer, DataSync compares this checksum to the checksum calculated on that same data at the destination. We recommend this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  | 
|  **Verify all data**  |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `POINT_IN_TIME_CONSISTENT`  |  At the end of your transfer, DataSync checks the entire source and destination to verify that both locations are fully synchronized.   Not supported when your task uses [Enhanced mode](choosing-task-mode.md).  If you use a [manifest](transferring-with-manifest.md), DataSync only scans and verifies what's listed in the manifest. You can't use this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).   | 
| Don't verify data after transfer |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `NONE`  | DataSync performs data integrity checks only during your transfer. Unlike other options, there's no additional verification at the end of your transfer. | 
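Each console choice corresponds directly to a `VerifyMode` value. The following minimal lookup mirrors the table above:

```python
# Mapping of console verification choices to VerifyMode API values,
# as described in the table above.
VERIFY_MODES = {
    "Verify only transferred data": "ONLY_FILES_TRANSFERRED",
    "Verify all data": "POINT_IN_TIME_CONSISTENT",
    "Don't verify data after transfer": "NONE",
}

print(VERIFY_MODES["Verify only transferred data"])  # ONLY_FILES_TRANSFERRED
```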

## Configuring data verification
<a name="configure-data-verification"></a>

You can configure data verification options when creating a task, updating a task, or starting a task execution.

### Using the DataSync console
<a name="configure-data-verification-options-console"></a>

The following instructions describe how to configure data verification options when creating a task.

**To configure data verification by using the console**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Verification**, choose one of the following:
   + **Verify only transferred data** (recommended)
   + **Verify all data**
   + **Don't verify data after transfer**

### Using the DataSync API
<a name="configure-data-verification-options-api"></a>

You can configure how DataSync verifies data by using the `VerifyMode` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)

# Setting bandwidth limits for your AWS DataSync task
<a name="configure-bandwidth"></a>

You can configure network bandwidth limits for your AWS DataSync task and each of its executions.

## Limiting bandwidth for a task
<a name="configure-bandwidth-create"></a>

Set a bandwidth limit when creating, editing, or starting a task.

### Using the DataSync console
<a name="configure-bandwidth-create-console"></a>

The following instructions describe how to configure a bandwidth limit for your task when you're creating it.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Bandwidth limit**, choose one of the following:
   + Select **Use available** to use all of the available network bandwidth for each task execution.
   + Select **Set bandwidth limit (MiB/s)** and enter the maximum bandwidth that you want DataSync to use for each task execution.

### Using the DataSync API
<a name="configure-bandwidth-create-api"></a>

You can configure a task's bandwidth limit by using the `BytesPerSecond` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)
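Note that the console takes the limit in MiB/s, while the `BytesPerSecond` API parameter takes bytes per second (with `-1` meaning use all available bandwidth). A small conversion sketch:

```python
def mib_per_sec_to_bytes_per_second(mib_per_sec):
    """Convert the console's MiB/s setting to the API's BytesPerSecond.

    Pass -1 to use all available bandwidth, which the API also
    represents as BytesPerSecond = -1.
    """
    if mib_per_sec == -1:
        return -1
    return int(mib_per_sec * 1024 * 1024)

print(mib_per_sec_to_bytes_per_second(10))  # 10485760
```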

## Throttling bandwidth for a task execution
<a name="adjust-bandwidth-throttling"></a>

You can modify the bandwidth limit for a running or queued task execution.

### Using the DataSync console
<a name="adjust-bandwidth-throttling-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. Choose the task and then select **History** to view the task's executions.

1. Choose the task execution that you want to modify and then choose **Edit**.

1. In the dialog box, choose one of the following:
   + Select **Use available** to use all of the available network bandwidth for the task execution.
   + Select **Set bandwidth limit (MiB/s)** and enter the maximum bandwidth that you want DataSync to use for the task execution.

1. Choose **Save changes**.

   The new bandwidth limit takes effect within 60 seconds.

### Using the DataSync API
<a name="adjust-bandwidth-throttling-api"></a>

You can modify the bandwidth limit for a running or queued task execution by using the `BytesPerSecond` parameter with the [UpdateTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTaskExecution.html) operation.

# Scheduling when your AWS DataSync task runs
<a name="task-scheduling"></a>

You can set up an AWS DataSync task schedule to periodically transfer data between storage locations.

## How DataSync task scheduling works
<a name="how-task-scheduling-works"></a>

A scheduled DataSync task runs at a frequency that you specify, with a minimum interval of 1 hour. You can create a task schedule by using a cron or rate expression.

**Important**  
You can't schedule a task to run at an interval faster than 1 hour.

**Using cron expressions**  
Use cron expressions for task schedules that run at a specific time and day. For example, here's a cron expression that you can use in the AWS CLI to run a task at 12:00 PM UTC every Sunday and Wednesday.  

```
cron(0 12 ? * SUN,WED *)
```

**Using rate expressions**  
Use rate expressions for task schedules that run on a regular interval, such as every 12 hours. For example, here's how you can configure a task schedule in the AWS CLI that runs every 12 hours:  

```
rate(12 hours)
```
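As a quick sanity check before calling the API, you could validate an expression's form and DataSync's 1-hour minimum with a hypothetical helper like this (it checks only the overall shape, not every cron field):

```python
import re

# Hypothetical checker for DataSync schedule expressions. Accepts the
# cron(...) and rate(...) forms shown above and rejects rate intervals
# shorter than DataSync's 1-hour minimum.
CRON_RE = re.compile(r"^cron\(([^)]+)\)$")
RATE_RE = re.compile(r"^rate\((\d+)\s+(minutes?|hours?|days?)\)$")

def is_valid_schedule(expression):
    if CRON_RE.match(expression):
        return True
    m = RATE_RE.match(expression)
    if not m:
        return False
    value, unit = int(m.group(1)), m.group(2).rstrip("s")
    minutes = value * {"minute": 1, "hour": 60, "day": 1440}[unit]
    return minutes >= 60  # DataSync's minimum interval is 1 hour

print(is_valid_schedule("rate(12 hours)"))    # True
print(is_valid_schedule("rate(30 minutes)"))  # False
```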

**Tip**  
For more information about cron and rate expression syntax, see [Cron and rate expressions](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-cron-expressions.html) in the *Amazon EventBridge User Guide*.

## Creating a DataSync task schedule
<a name="configure-task-schedule"></a>

You can schedule how frequently your task runs by using the DataSync console, AWS CLI, or DataSync API.

### Using the DataSync console
<a name="configure-task-schedule-console"></a>

The following instructions describe how to set up a schedule when creating a task. You can modify the schedule later when editing the task.

In the console, some scheduling options let you specify the exact time that your task runs (such as daily at 10:30 PM). If you don't include a time for these options, your task runs at the time that you create (or update) the task.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For schedule **Frequency**, do one of the following:
   + Choose **Not scheduled** if you don't want your task to run on a schedule.
   + Choose **Hourly**, then choose the minute during the hour that you want your task to run. 
   + Choose **Daily** and enter the UTC time that you want your task to run.
   + Choose **Weekly**, choose the day of the week, and enter the UTC time that you want your task to run.
   + Choose **Days of the week**, choose a specific day or days, and enter the UTC time (in HH:MM format) that you want your task to run.
   + Choose **Custom**, and then select **Cron expression** or **Rate expression**. Enter your task schedule with a minimum interval of 1 hour. 

### Using the AWS CLI
<a name="configure-task-schedule-api"></a>

You can create a schedule for your DataSync task by using the `--schedule` parameter with the `create-task` or `update-task` command.

The following instructions describe how to do this with the `create-task` command.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
     --destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
     --schedule '{
       "ScheduleExpression": "cron(0 12 ? * SUN,WED *)"
     }'
   ```

1. For the `--source-location-arn` parameter, specify the Amazon Resource Name (ARN) of the location that you're transferring data from.

1. For the `--destination-location-arn` parameter, specify the ARN of the location that you're transferring data to.

1. For the `--schedule` parameter, specify a cron or rate expression for your schedule.

   In the example, the cron expression `cron(0 12 ? * SUN,WED *)` sets a task schedule that runs at 12:00 PM UTC every Sunday and Wednesday.

1. Run the `create-task` command to create your task with the schedule.

## Pausing a DataSync task schedule
<a name="pause-task-schedule"></a>

There can be situations where you need to pause your DataSync task schedule. For example, you might need to temporarily disable a recurring transfer to fix an issue with your task or perform maintenance on your storage system.

DataSync might disable your task schedule automatically for the following reasons:
+ Your task fails repeatedly with the same error.
+ You [disable an AWS Region](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-regions.html) that your task is using.

### Using the DataSync console
<a name="pause-scheduled-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. Choose the task that you want to pause the schedule for, and then choose **Edit**.

1. For **Schedule**, turn off **Enable schedule**. Choose **Save changes**.

### Using the AWS CLI
<a name="pause-scheduled-task-cli"></a>

1. Copy the following `update-task` command:

   ```
   aws datasync update-task \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh \
     --schedule '{
       "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
       "Status": "DISABLED"
     }'
   ```

1. For the `--task-arn` parameter, specify the ARN of the task that you want to pause the schedule for.

1. For the `--schedule` parameter, do the following:
   + For `ScheduleExpression`, specify a cron or rate expression for your schedule.

     In the example, the expression `cron(0 12 ? * SUN,WED *)` sets a task schedule that runs at 12:00 PM UTC every Sunday and Wednesday.
   + For `Status`, specify `DISABLED` to pause the task schedule.

1. Run the `update-task` command.

1. To resume the schedule, run the same `update-task` command with `Status` set to `ENABLED`.
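If you're scripting these commands, you can build the `--schedule` JSON programmatically. A hypothetical helper that produces the payload used in the `update-task` commands above:

```python
import json

# Hypothetical helper that builds the --schedule JSON shown above,
# toggling between ENABLED and DISABLED to resume or pause a schedule.
def schedule_payload(expression, enabled):
    return json.dumps({
        "ScheduleExpression": expression,
        "Status": "ENABLED" if enabled else "DISABLED",
    })

paused = schedule_payload("cron(0 12 ? * SUN,WED *)", enabled=False)
print(paused)  # {"ScheduleExpression": "cron(0 12 ? * SUN,WED *)", "Status": "DISABLED"}
```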

## Checking the status of a DataSync task schedule
<a name="check-scheduled-task"></a>

You can see whether your DataSync task schedule is enabled. 

### Using the DataSync console
<a name="check-scheduled-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. In the **Schedule** column, check whether the task's schedule is enabled or disabled.

### Using the AWS CLI
<a name="check-scheduled-task-cli"></a>

1. Copy the following `describe-task` command:

   ```
   aws datasync describe-task \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh
   ```

1. For the `--task-arn` parameter, specify the ARN of the task that you want information about.

1. Run the `describe-task` command.

You get a response that provides details about your task, including its schedule. (The following example focuses primarily on the task schedule configuration and doesn't show a full `describe-task` response.)

The example shows that the task's schedule is manually disabled. If the schedule is disabled by the DataSync `SERVICE`, you see an error message for `DisabledReason` to help you understand why the task keeps failing. For more information, see [Troubleshooting AWS DataSync issues](troubleshooting-datasync.md).

```
{
    "TaskArn": "arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh",
    "Status": "AVAILABLE",
    "Schedule": {
        "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
        "Status": "DISABLED",
        "StatusUpdateTime": 1697736000,
        "DisabledBy": "USER",
        "DisabledReason": "Manually disabled by user."
    },
    ...
}
```
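When processing a `describe-task` response in a script, the `Schedule.DisabledBy` field tells you whether a user or DataSync itself disabled the schedule. A sketch of interpreting the response above:

```python
# Sketch of interpreting the Schedule block from a describe-task
# response. DisabledBy is "USER" when the schedule was paused manually,
# or "SERVICE" when DataSync disabled it (with a DisabledReason).
def explain_schedule(task):
    schedule = task.get("Schedule", {})
    if schedule.get("Status") != "DISABLED":
        return "Schedule is enabled."
    if schedule.get("DisabledBy") == "SERVICE":
        return "Disabled by DataSync: " + schedule.get("DisabledReason", "")
    return "Paused by a user."

task = {
    "Schedule": {
        "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
        "Status": "DISABLED",
        "DisabledBy": "USER",
        "DisabledReason": "Manually disabled by user.",
    }
}
print(explain_schedule(task))  # Paused by a user.
```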

# Tagging your AWS DataSync tasks
<a name="tagging-tasks"></a>

*Tags* are key-value pairs that help you manage, filter, and search for your AWS DataSync resources. You can add up to 50 tags to each DataSync task and task execution.

For example, you might create a task for a large data migration and tag the task with the key **Project** and value **Large Migration**. To further organize the migration, you could tag one run of the task with the key **Transfer Date** and value **May 2021** (subsequent task executions might be tagged **June 2021**, **July 2021**, and so on).

## Tagging your DataSync task
<a name="tagging-tasks-console"></a>

You can tag your DataSync task only when creating the task.

### Using the DataSync console
<a name="tagging-tasks-console-steps"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. On the **Configure settings** page, choose **Add new tag** to tag your task.

### Using the AWS CLI
<a name="tagging-tasks-cli-steps"></a>

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
       --source-location-arn 'arn:aws:datasync:region:account-id:location/source-location-id' \
       --destination-location-arn 'arn:aws:datasync:region:account-id:location/destination-location-id' \
       --tags Key=tag-key,Value=tag-value
   ```

1. Specify the following parameters in the command:
   + `--source-location-arn` – Specify the Amazon Resource Name (ARN) of the source location in your transfer.
   + `--destination-location-arn` – Specify the ARN of the destination location in your transfer.
   + `--tags` – Specify the tags that you want to apply to the task.

     For more than one tag, separate each key-value pair with a space.

1. (Optional) Specify other parameters that make sense for your transfer scenario.

   For a list of options, see the [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html) command.

1. Run the `create-task` command.

   You get a response that shows the task that you just created.

   ```
   {
       "TaskArn": "arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890"
   }
   ```

To view the tags you added to this task, you can use the [list-tags-for-resource](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/list-tags-for-resource.html) command.
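The CLI's `Key=...,Value=...` shorthand corresponds to the list of key-value structures that the `CreateTask` API expects for `Tags`. A small sketch of building that structure, using the hypothetical tags from the example earlier in this section:

```python
# Sketch of the Tags structure that the CreateTask API expects.
# The tag names below are illustrative, not required values.
def build_tags(tags):
    return [{"Key": key, "Value": value} for key, value in tags.items()]

tags = build_tags({"Project": "Large Migration", "Transfer Date": "May 2021"})
print(tags[0])  # {'Key': 'Project', 'Value': 'Large Migration'}
```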

## Tagging your DataSync task execution
<a name="tagging-task-executions-console"></a>

You can tag each run of your DataSync task.

If your task already has tags, remember the following about using tags with task executions:
+ If you start your task with the console, its user-created tags are applied automatically to the task execution. However, system-created tags that begin with `aws:` are not applied.
+ If you start your task with the DataSync API or AWS CLI, its tags are not applied automatically to the task execution.

### Using the DataSync console
<a name="tagging-task-executions-console-steps"></a>

To add, edit, or remove tags from a task execution, you must start the task with overriding options.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**.

1. Choose the task.

1. Choose **Start**, then choose one of the following options: 
   + **Start with defaults** – Applies any tags associated with your task.
   + **Start with overriding options** – Allows you to add, edit, or remove tags for this particular task execution.

### Using the AWS CLI
<a name="tagging-task-executions-cli"></a>

1. Copy the following `start-task-execution` command:

   ```
   aws datasync start-task-execution \
       --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
       --tags Key=tag-key,Value=tag-value
   ```

1. Specify the following parameters in the command:
   + `--task-arn` – Specify the ARN of the task that you want to start.
   + `--tags` – Specify the tags that you want to apply to this specific run of the task.

     For more than one tag, separate each key-value pair with a space.

1. (Optional) Specify other parameters that make sense for your situation.

   For more information, see the [start-task-execution](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/start-task-execution.html) command.

1. Run the `start-task-execution` command.

   You get a response that shows the task execution that you just started.

   ```
   {
       "TaskExecutionArn": "arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890/execution/exec-abcdef01234567890"
   }
   ```

To view the tags you added to this task execution, you can use the [list-tags-for-resource](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/list-tags-for-resource.html) command.