

# Transferring your data with AWS DataSync
<a name="transferring-data-datasync"></a>

With AWS DataSync, you can transfer data to or from storage that's on-premises, in AWS, or in other clouds.

Setting up a DataSync transfer generally involves the following steps:

1. Determine if DataSync [supports your transfer](working-with-locations.md).

1. If [you need a DataSync agent](do-i-need-datasync-agent.md) for your transfer, deploy and activate an agent as close as possible to one of your storage systems.

   For example, if you're transferring from an on-premises Network File System (NFS) file server, deploy the agent as close as you can to that file server.

1. Provide DataSync access to your storage system.

   DataSync needs permission to read from or write to your storage (depending on whether your storage is a source or destination location). For example, learn how to [provide DataSync access to NFS file servers](create-nfs-location.md#accessing-nfs).

1. [Connect your network](networking-datasync.md) for traffic between your storage system and DataSync.

1. Create a location for your source storage system by using the DataSync console, AWS CLI, or DataSync API.

   For example, learn how to [create an NFS location](create-nfs-location.md#create-nfs-location-how-to) or [Amazon S3 location](create-s3-location.md#create-s3-location-how-to).

1. Repeat steps 3-5 to create your transfer's destination location.

1. [Create and start a DataSync transfer task](create-task-how-to.md) that includes your source and destination locations.
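The steps above can be sketched end to end with the AWS CLI. All ARNs, host names, and paths below are placeholders for illustration, and the example pairs a hypothetical NFS source with an Amazon S3 destination:

```shell
# Step 5: create the source location (here, an NFS file server).
aws datasync create-location-nfs \
    --server-hostname nfs.example.com \
    --on-prem-config AgentArns=arn:aws:datasync:us-east-1:111122223333:agent/agent-0example1234567 \
    --subdirectory /exports/data

# Step 6: create the destination location (here, an S3 bucket).
aws datasync create-location-s3 \
    --s3-bucket-arn arn:aws:s3:::amzn-s3-demo-bucket \
    --s3-config BucketAccessRoleArn=arn:aws:iam::111122223333:role/datasync-s3-access

# Step 7: create a task from the two location ARNs returned above,
# then start it.
aws datasync create-task \
    --source-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-0example-source \
    --destination-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-0example-dest

aws datasync start-task-execution \
    --task-arn arn:aws:datasync:us-east-1:111122223333:task/task-0example1234567
```

Each `create-*` command returns an ARN that the next command consumes; the console walks through the same sequence with forms instead of flags.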

**Topics**
+ [Where can I transfer my data with AWS DataSync?](working-with-locations.md)
+ [Transferring to or from on-premises storage with AWS DataSync](transferring-on-premises-storage.md)
+ [Transferring to or from AWS storage with AWS DataSync](transferring-aws-storage.md)
+ [Transferring to or from other cloud storage with AWS DataSync](transferring-other-cloud-storage.md)
+ [Creating a task for transferring your data](create-task-how-to.md)
+ [Starting a task to transfer your data](run-task.md)

# Where can I transfer my data with AWS DataSync?
<a name="working-with-locations"></a>

# Where can I transfer my data with AWS DataSync?
<a name="working-with-locations"></a>

Where you can transfer your data with AWS DataSync depends on the following factors:
+ Your transfer's source and destination [locations](how-datasync-transfer-works.md#sync-locations)
+ Whether your locations are in different AWS accounts
+ Whether your locations are in different AWS Regions
+ Whether you're using Basic mode or Enhanced mode

## Supported transfers in the same AWS account
<a name="working-with-locations-same-account"></a>

DataSync supports transfers between the following storage resources that are associated with the same AWS account.


| Source | Destination | Requires an agent? | Supported task mode | 
| --- | --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Only for Basic mode  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Only for Basic mode  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 

## Supported transfers across AWS accounts
<a name="working-with-locations-across-accounts"></a>

DataSync supports some transfers between storage resources that are associated with different AWS accounts.


| Source | Destination | Requires an agent? | Supported task mode | 
| --- | --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic, Enhanced  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  No  |  Basic only  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/working-with-locations.html)  |  Yes (when used as an NFS/SMB location)  |  Basic only  | 

1 Configured as an [NFS location](create-nfs-location.md).

2 Configured as an [SMB location](create-smb-location.md).

3 Configured as an NFS or SMB location.

## Supported transfers in the same AWS Region
<a name="working-with-locations-same-region"></a>

There are no restrictions when transferring data within the same AWS Region (including [opt-in Regions](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-regions.html)). For more information, see [AWS Regions supported by DataSync](https://docs.aws.amazon.com/general/latest/gr/datasync.html).

## Supported transfers between AWS Regions
<a name="working-with-locations-cross-regions"></a>

Note the following when transferring data between [AWS Regions supported by DataSync](https://docs.aws.amazon.com/general/latest/gr/datasync.html):
+ When transferring between AWS storage services in different AWS Regions, one of the two locations must be in the Region where you're using DataSync.
+ You can't transfer across Regions with an NFS, SMB, HDFS, or object storage location. In these situations, both of your transfer locations must be in the same Region where you [activate your DataSync agent](activate-agent.md).
+ With AWS GovCloud (US) Regions, you can:
  + Transfer between the AWS GovCloud (US-East) and AWS GovCloud (US-West) Regions.
  + Transfer between an AWS GovCloud (US) Region and a commercial AWS Region, such as US East (N. Virginia). This type of transfer requires an [agent](agent-requirements.md) when transferring between Amazon EFS or Amazon FSx file systems.

**Important**  
You pay for data transferred between AWS Regions. This transfer is billed as data transfer out from the source to destination Region. For more information, see [AWS DataSync Pricing](https://aws.amazon.com/datasync/pricing/).

## Determining if your transfer requires a DataSync agent
<a name="datasync-transfer-requirements"></a>

Depending on your transfer scenario, you might need a DataSync agent. For more information, see [Do I need an AWS DataSync agent?](do-i-need-datasync-agent.md)

# Transferring to or from on-premises storage with AWS DataSync
<a name="transferring-on-premises-storage"></a>

With AWS DataSync, you can transfer files and objects between a number of on-premises or self-managed storage systems and the following AWS storage services:
+ [Amazon S3](create-s3-location.md)
+ [Amazon EFS](create-efs-location.md)
+ [Amazon FSx for Windows File Server](create-fsx-location.md)
+ [Amazon FSx for Lustre](create-lustre-location.md)
+ [Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Amazon FSx for NetApp ONTAP](create-ontap-location.md)

**Topics**
+ [Configuring AWS DataSync transfers with an NFS file server](create-nfs-location.md)
+ [Configuring AWS DataSync transfers with an SMB file server](create-smb-location.md)
+ [Configuring AWS DataSync transfers with an HDFS cluster](create-hdfs-location.md)
+ [Configuring DataSync transfers with an object storage system](create-object-location.md)

# Configuring AWS DataSync transfers with an NFS file server
<a name="create-nfs-location"></a>

With AWS DataSync, you can transfer data between your Network File System (NFS) file server and the following AWS storage services. Supported storage services depend on your task mode, as shown below:


| Basic mode | Enhanced mode | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-nfs-location.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-nfs-location.html)  | 

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your NFS file server. You can use this location as a transfer source or destination.

## Providing DataSync access to NFS file servers
<a name="accessing-nfs"></a>

For DataSync to access your NFS file server, you need a DataSync [agent](how-datasync-transfer-works.md#sync-agents). The agent mounts an export on your file server by using the NFS protocol. Be sure to use the agent that corresponds to your desired task mode.

**Topics**
+ [Configuring your NFS export](#accessing-nfs-configuring-export)
+ [Supported NFS versions](#supported-nfs-versions)

### Configuring your NFS export
<a name="accessing-nfs-configuring-export"></a>

The export that DataSync needs for your transfer depends on whether your NFS file server is a source or destination location and on how your file server's permissions are configured.

If your file server is a source location, DataSync just has to read and traverse your files and folders. If it's a destination location, DataSync needs root access to write to the location and set ownership, permissions, and other metadata on the files and folders that you're copying. You can use the `no_root_squash` option to allow root access for your export.

The following examples describe how to configure an NFS export that provides access to DataSync.

**When your NFS file server is a source location (root access)**  
Configure your export by using the following command, which provides DataSync read-only permissions (`ro`) and root access (`no_root_squash`):

```
export-path datasync-agent-ip-address(ro,no_root_squash)
```

**When your NFS file server is a destination location**  
Configure your export by using the following command, which provides DataSync write permissions (`rw`) and root access (`no_root_squash`):

```
export-path datasync-agent-ip-address(rw,no_root_squash)
```

**When your NFS file server is a source location (no root access)**  
Configure your export by using the following command, which specifies the POSIX user ID (UID) and group ID (GID) that you know would provide DataSync read-only permissions on the export:

```
export-path datasync-agent-ip-address(ro,all_squash,anonuid=uid,anongid=gid)
```

### Supported NFS versions
<a name="supported-nfs-versions"></a>

By default, DataSync uses NFS version 4.1. DataSync also supports NFS 4.0 and 3.x.

## Configuring your network for NFS transfers
<a name="configure-network-nfs-location"></a>

For your DataSync transfer, you must configure traffic for a few network connections: 

1. Allow traffic on the following ports from your DataSync agent to your NFS file server:
   + **For NFS version 4.1 and 4.0** – TCP port 2049
   + **For NFS version 3.x** – TCP ports 111 and 2049

   Other NFS clients in your network should be able to mount the NFS export that you're using to transfer data. The export must also be accessible without Kerberos authentication.

1. Configure traffic for your [service endpoint connection](datasync-network.md) (such as a VPC, public, or FIPS endpoint).

1. Allow traffic from the DataSync service to the [AWS storage service](datasync-network.md#storage-service-network-requirements) you're transferring to or from.
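Before creating the location, you can verify the first connection from a machine in the agent's network segment. The file server address below is a placeholder:

```shell
# Check that the NFS ports are reachable from the agent's network
# (replace nfs.example.com with your file server's address).
nc -zv nfs.example.com 2049   # NFS 4.1, 4.0, and 3.x
nc -zv nfs.example.com 111    # rpcbind, needed for NFS 3.x only
```

If either check fails, review the firewall rules between the agent and the file server before troubleshooting DataSync itself.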

## Creating your NFS transfer location
<a name="create-nfs-location-how-to"></a>

Before you begin, note the following:
+ You need an NFS file server that you want to transfer data from.
+ You need a DataSync agent that can [access your file server](#accessing-nfs).
+  DataSync doesn't support copying NFS version 4 access control lists (ACLs).

### Using the DataSync console
<a name="create-nfs-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Network File System (NFS)**.

1. For **Agents**, choose the DataSync agent that can connect to your NFS file server.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For **NFS server**, enter the Domain Name System (DNS) name or IP address of the NFS file server that your DataSync agent connects to.

1. For **Mount path**, enter the NFS export path that you want DataSync to mount.

   This path (or a subdirectory of the path) is where DataSync transfers data to or from. For more information, see [Configuring your NFS export](#accessing-nfs-configuring-export).

1. (Optional) Expand **Additional settings** and choose a specific **NFS version** for DataSync to use when accessing your file server.

   For more information, see [Supported NFS versions](#supported-nfs-versions).

1. (Optional) Choose **Add tag** to tag your NFS location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-nfs-cli"></a>
+ Use the following command to create an NFS location.

  ```
  aws datasync create-location-nfs \
      --server-hostname nfs-server-address \
      --on-prem-config AgentArns=datasync-agent-arns \
      --subdirectory nfs-export-path
  ```

  For more information on creating the location, see [Providing DataSync access to NFS file servers](#accessing-nfs).

  DataSync automatically chooses the NFS version that it uses to read from an NFS location. To specify an NFS version, use the optional `Version` parameter in the [NfsMountOptions](API_NfsMountOptions.md) API operation.

This command returns the Amazon Resource Name (ARN) of the NFS location, similar to the ARN shown following.

```
{
    "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0f01451b140b2af49"
}
```

To make sure that the directory can be mounted, you can connect to any computer that has the same network configuration as your agent and run the following command. 

```
mount -t nfs -o nfsvers=<nfs-server-version> <nfs-server-address>:<nfs-export-path> <test-folder>
```

The following is an example of the command.

```
mount -t nfs -o nfsvers=3 198.51.100.123:/path_for_sync_to_read_from /temp_folder_to_test_mount_on_local_machine
```

# Configuring AWS DataSync transfers with an SMB file server
<a name="create-smb-location"></a>

With AWS DataSync, you can transfer data between your Server Message Block (SMB) file server and the following AWS storage services. Supported storage services depend on your task mode, as shown below:


| Basic mode | Enhanced mode | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-smb-location.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-smb-location.html)  | 

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your SMB file server. You can use this location as a transfer source or destination. Be sure to use an agent that corresponds to your desired task mode.

## Providing DataSync access to SMB file servers
<a name="configuring-smb"></a>

DataSync connects to your file server using the SMB protocol and can authenticate with NTLM or Kerberos.

**Topics**
+ [Supported SMB versions](#configuring-smb-version)
+ [Using NTLM authentication](#configuring-smb-ntlm-authentication)
+ [Using Kerberos authentication](#configuring-smb-kerberos-authentication)
+ [Required permissions](#configuring-smb-permissions)
+ [DFS Namespaces](#configuring-smb-location-dfs)

### Supported SMB versions
<a name="configuring-smb-version"></a>

By default, DataSync automatically chooses a version of the SMB protocol based on negotiation with your SMB file server.

You also can configure DataSync to use a specific SMB version, but we recommend doing this only if DataSync has trouble negotiating with the SMB file server automatically. DataSync supports SMB versions 1.0 and later. For security reasons, we recommend using SMB version 3.0.2 or later. Earlier versions, such as SMB 1.0, contain known security vulnerabilities that attackers can exploit to compromise your data.

See the following table for a list of options in the DataSync console and API:


| Console option | API option | Description | 
| --- | --- | --- | 
| Automatic |  `AUTOMATIC`  |  DataSync and the SMB file server negotiate the highest version of SMB that they mutually support between 2.1 and 3.1.1. This is the default and recommended option. If you instead choose a specific version that your file server doesn't support, you may get an `Operation Not Supported` error.  | 
|  SMB 3.0.2  |  `SMB3`  |  Restricts the protocol negotiation to only SMB version 3.0.2.  | 
| SMB 2.1 |  `SMB2`  | Restricts the protocol negotiation to only SMB version 2.1. | 
| SMB 2.0 | `SMB2_0` | Restricts the protocol negotiation to only SMB version 2.0. | 
| SMB 1.0 | `SMB1` | Restricts the protocol negotiation to only SMB version 1.0. | 
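From the AWS CLI, the API options in the table map to the `Version` mount option. For example, restricting negotiation to SMB 3.0.2 (with the same placeholder values used in the CLI examples later in this topic):

```shell
aws datasync create-location-smb \
    --agent-arns datasync-agent-arns \
    --server-hostname smb-server-address \
    --subdirectory smb-share-path \
    --user user-who-can-mount-share \
    --password user-password \
    --mount-options Version=SMB3
```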

### Using NTLM authentication
<a name="configuring-smb-ntlm-authentication"></a>

To use NTLM authentication, you provide a user name and password that allows DataSync to access the SMB file server that you're transferring to or from. The user can be a local user on your file server or a domain user in your Microsoft Active Directory.

### Using Kerberos authentication
<a name="configuring-smb-kerberos-authentication"></a>

To use Kerberos authentication, you provide a Kerberos principal, Kerberos key table (keytab) file, and Kerberos configuration file that allows DataSync to access the SMB file server that you're transferring to or from.

**Topics**
+ [Prerequisites](#configuring-smb-kerberos-prerequisites)
+ [DataSync configuration options for Kerberos](#configuring-smb-kerberos-options)

#### Prerequisites
<a name="configuring-smb-kerberos-prerequisites"></a>

You need to create a couple of Kerberos artifacts and configure your network so that DataSync can access your SMB file server.
+ Create a Kerberos keytab file by using the [ktpass](https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/ktpass) or [ktutil](https://web.mit.edu/kerberos/krb5-1.12/doc/admin/admin_commands/ktutil.html) utility.

  The following example creates a keytab file by using `ktpass`. The Kerberos realm that you specify (`MYDOMAIN.ORG`) must be upper case.

  ```
  ktpass /out C:\YOUR_KEYTAB.keytab /princ HOST/kerberosuser@MYDOMAIN.ORG /mapuser kerberosuser /pass * /crypto AES256-SHA1 /ptype KRB5_NT_PRINCIPAL
  ```
+ Prepare a simplified version of the Kerberos configuration file (`krb5.conf`). Include information about the realm, the location of the domain admin servers, and mappings of hostnames onto a Kerberos realm.

  Verify that the `krb5.conf` content is formatted with the correct mixed casing for the realms and domain realm names. For example:

  ```
  [libdefaults] 
    dns_lookup_realm = true 
    dns_lookup_kdc = true 
    forwardable = true 
    default_realm = MYDOMAIN.ORG
  
  [realms] 
    MYDOMAIN.ORG = { 
      kdc = mydomain.org 
      admin_server = mydomain.org 
    }
  
  [domain_realm] 
    .mydomain.org = MYDOMAIN.ORG 
    mydomain.org = MYDOMAIN.ORG
  ```
+ In your network configuration, make sure that your Kerberos Key Distribution Center (KDC) server port is open. The KDC port is typically TCP port 88.
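You can sanity-check these prerequisites from a machine in the agent's network before creating the location. The host name, keytab path, and principal below are examples:

```shell
# Check that the KDC answers on TCP port 88 (host name is an example).
nc -zv mydomain.org 88

# Confirm that the keytab and principal work together by requesting a
# ticket (keytab path and principal are examples).
kinit -kt YOUR_KEYTAB.keytab HOST/kerberosuser@MYDOMAIN.ORG
klist
```

If `kinit` fails with a preauthentication or principal-unknown error, recheck the principal's spelling and case against the keytab before involving DataSync.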

#### DataSync configuration options for Kerberos
<a name="configuring-smb-kerberos-options"></a>

When creating an SMB location that uses Kerberos, you configure the following options.


| Console option | API option | Description | 
| --- | --- | --- | 
|  **SMB server**  |  `ServerHostName`  |  The domain name of the SMB file server that your DataSync agent will mount. For Kerberos, you can't specify the file server's IP address.  | 
|  **Kerberos principal**  |  `KerberosPrincipal`  |  An identity in your Kerberos realm that has permission to access the files, folders, and file metadata in your SMB file server. A Kerberos principal might look like `HOST/kerberosuser@MYDOMAIN.ORG`. Principal names are case sensitive.  | 
|  **Keytab file**  |  `KerberosKeytab`   |  A Kerberos key table (keytab) file, which includes mappings between your Kerberos principal and encryption keys.  | 
|  **Kerberos configuration file**  |  `KerberosKrbConf`  |  A `krb5.conf` file that defines your Kerberos realm configuration.  | 
|  **DNS IP addresses** (optional)  |  `DnsIpAddresses`  |  The IPv4 addresses for the DNS servers that your SMB file server belongs to. If you have multiple domains in your environment, configuring this makes sure that DataSync connects to the right SMB file server.  | 
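A CLI sketch of a Kerberos-based SMB location follows. The flag spellings are inferred from the API option names in the table and may differ by AWS CLI version, so verify them with `aws datasync create-location-smb help`; all values are placeholders:

```shell
aws datasync create-location-smb \
    --agent-arns datasync-agent-arns \
    --server-hostname smb-server-domain-name \
    --subdirectory smb-share-path \
    --authentication-type KERBEROS \
    --kerberos-principal HOST/kerberosuser@MYDOMAIN.ORG \
    --kerberos-keytab fileb://YOUR_KEYTAB.keytab \
    --kerberos-krb5-conf fileb://krb5.conf
```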

### Required permissions
<a name="configuring-smb-permissions"></a>

The identity that you provide DataSync must have permission to mount and access your SMB file server's files, folders, and file metadata.

If you provide an identity in your Active Directory, it must be a member of an Active Directory group with one or both of the following user rights (depending on the [metadata that you want DataSync to copy](configure-metadata.md)):


| User right | Description | 
| --- | --- | 
|  **Restore files and directories** (`SE_RESTORE_NAME`)  |  Allows DataSync to copy object ownership, permissions, file metadata, and NTFS discretionary access lists (DACLs). This user right is usually granted to members of the **Domain Admins** and **Backup Operators** groups (both of which are default Active Directory groups).  | 
|  **Manage auditing and security log** (`SE_SECURITY_NAME`)  |  Allows DataSync to copy NTFS system access control lists (SACLs). This user right is usually granted to members of the **Domain Admins** group.   | 

If you want to copy Windows ACLs and are transferring between an SMB file server and another storage system that uses SMB (such as Amazon FSx for Windows File Server or FSx for ONTAP), the identity that you provide DataSync must belong to the same Active Directory domain or have an Active Directory trust relationship between their domains.

### DFS Namespaces
<a name="configuring-smb-location-dfs"></a>

DataSync doesn't support Microsoft Distributed File System (DFS) Namespaces. We recommend specifying an underlying file server or share instead when creating your DataSync location.

## Creating your SMB transfer location
<a name="create-smb-location-how-to"></a>

Before you begin, you need an SMB file server that you want to transfer data from.

### Using the DataSync console
<a name="create-smb-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Server Message Block (SMB)**.

   You configure this location as a source or destination later.

1. For **Agents**, choose the DataSync agent that can connect to your SMB file server.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For **SMB server**, enter the domain name or IP address of the SMB file server that your DataSync agent will mount.

   Remember the following with this setting:
   + You can't specify an IP version 6 (IPv6) address.
   + If you're using Kerberos authentication, you must specify a domain name.

1. For **Share name**, enter the name of the share exported by your SMB file server where DataSync will read or write data.

   You can include a subdirectory in the share path (for example, `/path/to/subdirectory`). Make sure that other SMB clients in your network can also mount this path. 

   To copy all the data in the subdirectory, DataSync must be able to mount the SMB share and access all of its data. For more information, see [Required permissions](#configuring-smb-permissions).

1. (Optional) Expand **Additional settings** and choose an **SMB Version** for DataSync to use when accessing your file server.

   By default, DataSync automatically chooses a version based on negotiation with the SMB file server. For information, see [Supported SMB versions](#configuring-smb-version).

1. For **Authentication type**, choose **NTLM** or **Kerberos**.

1. Do one of the following depending on your authentication type:

------
#### [ NTLM ]
   + For **User**, enter a user name that can mount your SMB file server and has permission to access the files and folders involved in your transfer.

     For more information, see [Required permissions](#configuring-smb-permissions).
   + For **Password**, enter the password of the user who can mount your SMB file server and has permission to access the files and folders involved in your transfer.
   + (Optional) For **Domain**, enter the Windows domain name that your SMB file server belongs to.

     If you have multiple domains in your environment, configuring this setting makes sure that DataSync connects to the right SMB file server.

------
#### [ Kerberos ]
   + For **Kerberos principal**, specify a principal in your Kerberos realm that has permission to access the files, folders, and file metadata in your SMB file server.

     A Kerberos principal might look like `HOST/kerberosuser@MYDOMAIN.ORG`.

     Principal names are case sensitive. Your DataSync task execution will fail if the principal that you specify for this setting doesn’t exactly match the principal that you use to create the keytab file.
   + For **Keytab file**, upload a keytab file that includes mappings between your Kerberos principal and encryption keys.
   + For **Kerberos configuration file**, upload a `krb5.conf` file that defines your Kerberos realm configuration.
   + (Optional) For **DNS IP addresses**, specify up to two IPv4 addresses for the DNS servers in the domain that your SMB file server belongs to.

     If you have multiple domains in your environment, configuring this parameter makes sure that DataSync connects to the right SMB file server.

------

1. (Optional) Choose **Add tag** to tag your SMB location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-smb-cli"></a>

The following instructions describe how to create SMB locations with NTLM or Kerberos authentication.

------
#### [ NTLM ]

1. Copy the following `create-location-smb` command.

   ```
   aws datasync create-location-smb \
       --agent-arns datasync-agent-arns \
       --server-hostname smb-server-address \
       --subdirectory smb-export-path \
       --authentication-type "NTLM" \
       --user user-who-can-mount-share \
       --password user-password \
       --domain windows-domain-of-smb-server
   ```

1. For `--agent-arns`, specify the DataSync agent that can connect to your SMB file server.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For `--server-hostname`, specify the domain name or IPv4 address of the SMB file server that your DataSync agent will mount. 

1. For `--subdirectory`, specify the name of the share exported by your SMB file server where DataSync will read or write data.

   You can include a subdirectory in the share path (for example, `/path/to/subdirectory`). Make sure that other SMB clients in your network can also mount this path. 

   To copy all the data in the subdirectory, DataSync must be able to mount the SMB share and access all of its data. For more information, see [Required permissions](#configuring-smb-permissions).

1. For `--user`, specify a user name that can mount your SMB file server and has permission to access the files and folders involved in your transfer.

   For more information, see [Required permissions](#configuring-smb-permissions).

1. For `--password`, specify the password of the user who can mount your SMB file server and has permission to access the files and folders involved in your transfer.

1. (Optional) For `--domain`, specify the Windows domain name that your SMB file server belongs to.

   If you have multiple domains in your environment, configuring this setting makes sure that DataSync connects to the right SMB file server.

1. (Optional) Add the `--version` option if you want DataSync to use a specific SMB version. For more information, see [Supported SMB versions](#configuring-smb-version).

1. Run the `create-location-smb` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890example"
   }
   ```

------
#### [ Kerberos ]

1. Copy the following `create-location-smb` command.

   ```
   aws datasync create-location-smb \
       --agent-arns datasync-agent-arns \
       --server-hostname smb-server-address \
       --subdirectory smb-export-path \
       --authentication-type "KERBEROS" \
       --kerberos-principal "HOST/kerberosuser@EXAMPLE.COM" \
       --kerberos-keytab "fileb://path/to/file.keytab" \
       --kerberos-krb5-conf "file://path/to/krb5.conf" \
       --dns-ip-addresses array-of-ipv4-addresses
   ```

1. For `--agent-arns`, specify the DataSync agent that can connect to your SMB file server.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For `--server-hostname`, specify the domain name of the SMB file server that your DataSync agent will mount. 

1. For `--subdirectory`, specify the name of the share exported by your SMB file server where DataSync will read or write data.

   You can include a subdirectory in the share path (for example, `/path/to/subdirectory`). Make sure that other SMB clients in your network can also mount this path. 

   To copy all the data in the subdirectory, DataSync must be able to mount the SMB share and access all of its data. For more information, see [Required permissions](#configuring-smb-permissions).

1. For the Kerberos options, do the following:
   + `--kerberos-principal`: Specify a principal in your Kerberos realm that has permission to access the files, folders, and file metadata in your SMB file server.

     A Kerberos principal might look like `HOST/kerberosuser@MYDOMAIN.ORG`.

     Principal names are case sensitive. Your DataSync task execution will fail if the principal that you specify for this option doesn’t exactly match the principal that you use to create the keytab file.
   + `--kerberos-keytab`: Specify a keytab file that includes mappings between your Kerberos principal and encryption keys.
   + `--kerberos-krb5-conf`: Specify a `krb5.conf` file that defines your Kerberos realm configuration.
   + (Optional) `--dns-ip-addresses`: Specify up to two IPv4 addresses for the DNS servers in the domain that your SMB file server belongs to.

     If you have multiple domains in your environment, configuring this parameter makes sure that DataSync connects to the right SMB file server.

1. (Optional) Add the `--version` option if you want DataSync to use a specific SMB version. For more information, see [Supported SMB versions](#configuring-smb-version).

1. Run the `create-location-smb` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890example"
   }
   ```

------
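Because principal names are case sensitive, even a lowercased realm or host component makes the location's principal differ from the keytab's principal and causes task executions to fail. A minimal sketch of the comparison that must hold (the principal strings here are placeholders):

```python
def principals_match(configured: str, keytab_principal: str) -> bool:
    """Kerberos principals are case sensitive; an exact string match is required."""
    return configured == keytab_principal

# Same principal, but the realm case differs: this does NOT match.
print(principals_match("HOST/kerberosuser@MYDOMAIN.ORG",
                       "HOST/kerberosuser@mydomain.org"))  # False
print(principals_match("HOST/kerberosuser@MYDOMAIN.ORG",
                       "HOST/kerberosuser@MYDOMAIN.ORG"))  # True
```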

# Configuring AWS DataSync transfers with an HDFS cluster
<a name="create-hdfs-location"></a>

With AWS DataSync, you can transfer data between your Hadoop Distributed File System (HDFS) cluster and one of the following AWS storage services using Basic mode tasks:
+ [Amazon S3](create-s3-location.md)
+ [Amazon EFS](create-efs-location.md)
+ [Amazon FSx for Windows File Server](create-fsx-location.md)
+ [Amazon FSx for Lustre](create-lustre-location.md)
+ [Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Amazon FSx for NetApp ONTAP](create-ontap-location.md)

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your HDFS cluster. You can use this location as a transfer source or destination.

## Providing DataSync access to HDFS clusters
<a name="accessing-hdfs"></a>

To connect to your HDFS cluster, DataSync uses a Basic mode [agent that you deploy](deploy-agents.md) as close as possible to your HDFS cluster. The DataSync agent acts as an HDFS client and communicates with the NameNodes and DataNodes in your cluster.

When you start a transfer task, DataSync queries the NameNode for locations of files and folders on the cluster. If you configure your HDFS location as a source location, DataSync reads files and folder data from the DataNodes in your cluster and copies that data to the destination. If you configure your HDFS location as a destination location, then DataSync writes files and folders from the source to the DataNodes in your cluster.

### Authentication
<a name="accessing-hdfs-authentication"></a>

When connecting to an HDFS cluster, DataSync supports simple authentication or Kerberos authentication. To use simple authentication, provide the user name of a user with rights to read and write to the HDFS cluster. To use Kerberos authentication, provide a Kerberos configuration file, a Kerberos key table (keytab) file, and a Kerberos principal name. The credentials of the Kerberos principal must be in the provided keytab file.

### Encryption
<a name="accessing-hdfs-encryption"></a>

When using Kerberos authentication, DataSync supports encryption of data as it's transmitted between the DataSync agent and your HDFS cluster. Encrypt your data by using the Quality of Protection (QOP) configuration settings on your HDFS cluster and by specifying the QOP settings when creating your HDFS location. The QOP configuration includes settings for data transfer protection and Remote Procedure Call (RPC) protection. 

**DataSync supports the following Kerberos encryption types:**
+ `des-cbc-crc`
+ `des-cbc-md4`
+ `des-cbc-md5`
+ `des3-cbc-sha1`
+ `arcfour-hmac`
+ `arcfour-hmac-exp`
+ `aes128-cts-hmac-sha1-96`
+ `aes256-cts-hmac-sha1-96`
+ `aes128-cts-hmac-sha256-128`
+ `aes256-cts-hmac-sha384-192`
+ `camellia128-cts-cmac`
+ `camellia256-cts-cmac`

You can also configure HDFS clusters for encryption at rest using Transparent Data Encryption (TDE). When using simple authentication, DataSync reads and writes to TDE-enabled clusters. If you're using DataSync to copy data to a TDE-enabled cluster, first configure the encryption zones on the HDFS cluster. DataSync doesn't create encryption zones. 

## Unsupported HDFS features
<a name="hdfs-unsupported-features"></a>

The following HDFS capabilities aren't currently supported by DataSync:
+ Transparent Data Encryption (TDE) when using Kerberos authentication
+ Configuring multiple NameNodes
+ Hadoop HDFS over HTTP (HttpFS)
+ POSIX access control lists (ACLs)
+ HDFS extended attributes (xattrs)
+ HDFS clusters using Apache HBase

## Creating your HDFS transfer location
<a name="create-hdfs-location-how-to"></a>

You can use your location as a source or destination for your DataSync transfer.

**Before you begin**: Verify network connectivity between your agent and Hadoop cluster by doing the following:
+ Test access to the TCP ports listed in [Network requirements for on-premises, self-managed, and other cloud storage](datasync-network.md#on-premises-network-requirements).
+ Test access between your local agent and your Hadoop cluster. For instructions, see [Verifying your agent's connection to your storage system](test-agent-connections.md#self-managed-storage-connectivity).

### Using the DataSync console
<a name="create-hdfs-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Hadoop Distributed File System (HDFS)**.

   You can configure this location as a source or destination later. 

1. For **Agents**, choose the agent that can connect to your HDFS cluster.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For **NameNode**, provide the domain name or IP address of your HDFS cluster's primary NameNode.

1. For **Folder**, enter a folder on your HDFS cluster that you want DataSync to use for the data transfer.

   If your HDFS location is a source, DataSync copies the files in this folder to the destination. If your location is a destination, DataSync writes files to this folder.

1. To set the **Block size** or **Replication factor**, choose **Additional settings**.

   The default block size is 128 MiB. The block size that you provide must be a multiple of 512 bytes.

   The default replication factor is three DataNodes when transferring to the HDFS cluster. 

1. In the **Security** section, choose the **Authentication type** used on your HDFS cluster. 
   + **Simple** – For **User**, specify the user name with the following permissions on the HDFS cluster (depending on your use case):
     + If you plan to use this location as a source location, specify a user that only has read permissions.
     + If you plan to use this location as a destination location, specify a user that has read and write permissions.

     Optionally, specify the URI of the Key Management Server (KMS) of your HDFS cluster. 
   + **Kerberos** – Specify the Kerberos **Principal** with access to your HDFS cluster. Next, provide the **KeyTab file** that contains the provided Kerberos principal. Then, provide the **Kerberos configuration file**. Finally, specify the type of in-transit encryption protection in the **RPC protection** and **Data transfer protection** dropdown lists.

1. (Optional) Choose **Add tag** to tag your HDFS location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
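The **Block size** setting above must be a multiple of 512 bytes. A quick sanity check before creating the location might look like the following sketch (the helper name and defaults shown are illustrative, not part of DataSync):

```python
DEFAULT_BLOCK_SIZE = 128 * 1024 * 1024  # 128 MiB, the DataSync default
DEFAULT_REPLICATION = 3                 # default DataNode replication factor

def validate_hdfs_settings(block_size: int = DEFAULT_BLOCK_SIZE,
                           replication: int = DEFAULT_REPLICATION) -> bool:
    """Raise ValueError if the HDFS location settings are invalid."""
    if block_size % 512 != 0:
        raise ValueError(f"Block size {block_size} is not a multiple of 512 bytes")
    if replication < 1:
        raise ValueError("Replication factor must be at least 1")
    return True

validate_hdfs_settings()                               # defaults pass
validate_hdfs_settings(block_size=256 * 1024 * 1024)   # also a multiple of 512
```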

### Using the AWS CLI
<a name="create-location-hdfs-cli"></a>

1. Copy the following `create-location-hdfs` command.

   ```
   aws datasync create-location-hdfs \
       --name-nodes '[{"Hostname": "host1", "Port": 8020}]' \
       --authentication-type "SIMPLE|KERBEROS" \
       --agent-arns "arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890example" \
       --subdirectory "/path/to/my/data"
   ```

1. For the `--name-nodes` parameter, specify the hostname or IP address of your HDFS cluster's primary NameNode and the TCP port that the NameNode is listening on.

1. For the `--authentication-type` parameter, specify the type of authentication to use when connecting to the Hadoop cluster. You can specify `SIMPLE` or `KERBEROS`.

   If you use `SIMPLE` authentication, use the `--simple-user` parameter to specify the user name of the user. If you use `KERBEROS` authentication, use the `--kerberos-principal`, `--kerberos-keytab`, and `--kerberos-krb5-conf` parameters. For more information, see [create-location-hdfs](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-location-hdfs.html).

1. For the `--agent-arns` parameter, specify the ARN of the DataSync agent that can connect to your HDFS cluster.

   You can choose more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. (Optional) For the `--subdirectory` parameter, specify a folder on your HDFS cluster that you want DataSync to use for the data transfer.

   If your HDFS location is a source, DataSync copies the files in this folder to the destination. If your location is a destination, DataSync writes files to this folder.

1. Run the `create-location-hdfs` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890example"
   }
   ```
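The `--name-nodes` parameter takes a JSON list of NameNode objects. If you build the command programmatically, serializing the structure with a JSON library avoids quoting mistakes. A small sketch (the hostname is a placeholder):

```python
import json

def name_nodes_arg(hostname: str, port: int = 8020) -> str:
    """Build the JSON value for the create-location-hdfs --name-nodes parameter."""
    return json.dumps([{"Hostname": hostname, "Port": port}])

print(name_nodes_arg("namenode.example.com"))
# [{"Hostname": "namenode.example.com", "Port": 8020}]
```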

# Configuring DataSync transfers with an object storage system
<a name="create-object-location"></a>

With AWS DataSync, you can transfer data between your object storage system and one of the following AWS storage services using Basic mode tasks:
+ [Amazon S3](create-s3-location.md)
+ [Amazon EFS](create-efs-location.md)
+ [Amazon FSx for Windows File Server](create-fsx-location.md)
+ [Amazon FSx for Lustre](create-lustre-location.md)
+ [Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Amazon FSx for NetApp ONTAP](create-ontap-location.md)

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your object storage system. You can use this location as a transfer source or destination. Transferring data to or from your on-premises object storage requires a Basic mode DataSync agent.

## Prerequisites
<a name="create-object-location-prerequisites"></a>

Your object storage system must be compatible with the following [Amazon S3 API operations](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations.html) for DataSync to connect to it:
+ `AbortMultipartUpload`
+ `CompleteMultipartUpload`
+ `CopyObject`
+ `CreateMultipartUpload`
+ `DeleteObject`
+ `DeleteObjects`
+ `DeleteObjectTagging`
+ `GetBucketLocation`
+ `GetObject`
+ `GetObjectTagging`
+ `HeadBucket`
+ `HeadObject`
+ `ListObjectsV2`
+ `PutObject`
+ `PutObjectTagging`
+ `UploadPart`
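If your storage vendor documents which S3 API operations it supports, a simple set comparison surfaces any gaps before you attempt a transfer. This is an illustrative sketch, not a DataSync tool:

```python
# S3 API operations that an object storage system must support for DataSync
REQUIRED_S3_OPERATIONS = {
    "AbortMultipartUpload", "CompleteMultipartUpload", "CopyObject",
    "CreateMultipartUpload", "DeleteObject", "DeleteObjects",
    "DeleteObjectTagging", "GetBucketLocation", "GetObject",
    "GetObjectTagging", "HeadBucket", "HeadObject", "ListObjectsV2",
    "PutObject", "PutObjectTagging", "UploadPart",
}

def missing_operations(supported):
    """Return the required operations that a storage system doesn't support."""
    return sorted(REQUIRED_S3_OPERATIONS - set(supported))

# Example: a system lacking multipart-upload part support
print(missing_operations(REQUIRED_S3_OPERATIONS - {"UploadPart"}))
# ['UploadPart']
```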

## Creating your object storage transfer location
<a name="create-object-location-how-to"></a>

Before you begin, you need an object storage system that you plan to transfer data to or from.

### Using the DataSync console
<a name="create-object-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

   You configure this location as a source or destination later.

1. For **Server**, provide the domain name or IP address of the object storage server. 

1. For **Bucket name**, enter the name of the object storage bucket involved in the transfer.

1. For **Folder**, enter an object prefix.

   DataSync only copies objects with this prefix. 

1. If your transfer requires an agent, choose **Use agents**, then choose the DataSync agent that connects to your object storage system.

   Some transfers don't require agents. In other scenarios, you might want to use more than one agent. For more information, see [Situations when you don't need a DataSync agent](do-i-need-datasync-agent.md#when-agent-not-required) and [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. To configure the connection to the object storage server, expand **Additional settings** and do the following:

   1. For **Server protocol**, choose **HTTP** or **HTTPS**.

   1. For **Server port**, use a default port (**80** for HTTP or **443** for HTTPS) or specify a custom port if needed.

   1. For **Certificate**, if your object storage system uses a private or self-signed certificate authority (CA), select **Choose file** and specify a single `.pem` file with a full certificate chain.

      The certificate chain might include:
      + The object storage system's certificate
      + All intermediate certificates (if there are any)
      + The root certificate of the signing CA

      You can concatenate your certificates into a `.pem` file (which can be up to 32768 bytes before base64 encoding). The following example `cat` command creates an `object_storage_certificates.pem` file that includes three certificates:

      ```
      cat object_server_certificate.pem intermediate_certificate.pem ca_root_certificate.pem > object_storage_certificates.pem
      ```

1. If the object storage server requires credentials for access, select **Requires credentials** and enter the **Access key** you use to access the bucket. Then either enter the **Secret key** directly, or specify an AWS Secrets Manager secret that contains the key. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

   The access key and secret key can be a user name and password, respectively.

1. (Optional) Choose **Add tag** to tag your object storage location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
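The concatenated `.pem` file described above must contain the full chain and stay under 32768 bytes. A small inspection helper (a sketch; the function name is hypothetical) can confirm both before you upload the file:

```python
MAX_PEM_BYTES = 32768  # DataSync limit before base64 encoding

def inspect_pem_chain(pem_bytes: bytes):
    """Return (certificate_count, size_ok) for a concatenated .pem chain."""
    count = pem_bytes.count(b"-----BEGIN CERTIFICATE-----")
    return count, len(pem_bytes) <= MAX_PEM_BYTES

# Example with a dummy two-certificate chain
chain = b"-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n" * 2
print(inspect_pem_chain(chain))  # (2, True)
```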

### Using the AWS CLI
<a name="create-location-object-cli"></a>

1. Copy the following `create-location-object-storage` command:

   ```
   aws datasync create-location-object-storage \
       --server-hostname object-storage-server.example.com \
       --bucket-name your-bucket \
       --agent-arns arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890deadfb
   ```

1. Specify the following required parameters in the command:
   + `--server-hostname` – Specify the domain name or IP address of your object storage server.
   + `--bucket-name` – Specify the name of the bucket on your object storage server that you're transferring to or from.

1. (Optional) Add any of the following parameters to the command:
   + `--agent-arns` – Specify the DataSync agents that you want to connect to your object storage server.
   + `--server-port` – Specify the port that your object storage server accepts inbound network traffic on (for example, port `443`).
   + `--server-protocol` – Specify the protocol (`HTTP` or `HTTPS`) that your object storage server uses to communicate.
   + `--access-key` – Specify the access key (for example, a user name) if credentials are required to authenticate with the object storage server.
   + `--secret-key` – Specify the secret key (for example, a password) if credentials are required to authenticate with the object storage server.

     You can also provide additional parameters for securing your keys using AWS Secrets Manager. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).
   + `--server-certificate` – Specify a certificate chain for DataSync to authenticate with your object storage system if the system uses a private or self-signed certificate authority (CA). You must specify a single `.pem` file with a full certificate chain (for example, `file:///home/user/.ssh/object_storage_certificates.pem`).

     The certificate chain might include:
     + The object storage system's certificate
     + All intermediate certificates (if there are any)
     + The root certificate of the signing CA

     You can concatenate your certificates into a `.pem` file (which can be up to 32768 bytes before base64 encoding). The following example `cat` command creates an `object_storage_certificates.pem` file that includes three certificates:

     ```
     cat object_server_certificate.pem intermediate_certificate.pem ca_root_certificate.pem > object_storage_certificates.pem
     ```
   + `--subdirectory` – Specify the object prefix for your object storage server.

     DataSync only copies objects with this prefix. 
   + `--tags` – Specify the key-value pair that represents a tag that you want to add to the location resource.

     Tags can help you manage, filter, and search for your resources. We recommend creating a name tag for your location.

1. Run the `create-location-object-storage` command.

   You get a response that shows you the location ARN that you just created.

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890abcdef"
   }
   ```

# Transferring to or from AWS storage with AWS DataSync
<a name="transferring-aws-storage"></a>

With AWS DataSync, you can transfer data to or from a number of AWS storage services. For more information, see [Where can I transfer my data with DataSync?](working-with-locations.md)

**Topics**
+ [Configuring AWS DataSync transfers with Amazon S3](create-s3-location.md)
+ [Configuring AWS DataSync transfers with Amazon EFS](create-efs-location.md)
+ [Configuring transfers with FSx for Windows File Server](create-fsx-location.md)
+ [Configuring DataSync transfers with FSx for Lustre](create-lustre-location.md)
+ [Configuring DataSync transfers with Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Configuring transfers with Amazon FSx for NetApp ONTAP](create-ontap-location.md)

# Configuring AWS DataSync transfers with Amazon S3
<a name="create-s3-location"></a>

To transfer data to or from your Amazon S3 bucket, you create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to S3 buckets
<a name="create-s3-location-access"></a>

DataSync needs access to the S3 bucket that you're transferring to or from. To do this, you must create an AWS Identity and Access Management (IAM) role that DataSync assumes with the permissions required to access the bucket. You then specify this role when [creating your Amazon S3 location for DataSync](#create-s3-location-how-to).

**Contents**
+ [Required permissions](#create-s3-location-required-permissions)
+ [Creating an IAM role for DataSync to access your Amazon S3 location](#create-role-manually)
+ [Accessing S3 buckets using server-side encryption](#create-s3-location-encryption)
+ [Accessing restricted S3 buckets](#denying-s3-access)
+ [Accessing S3 buckets with restricted VPC access](#create-s3-location-restricted-vpc)

### Required permissions
<a name="create-s3-location-required-permissions"></a>

The permissions that your IAM role needs can depend on whether the bucket is a DataSync source or destination location. Amazon S3 on Outposts requires a different set of permissions.

------
#### [ Amazon S3 (source location) ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket"
        },
        {
            "Action": [
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionTagging",
                "s3:ListMultipartUploadParts"
              ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
        }
    ]
}
```

------
#### [ Amazon S3 (destination location) ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:GetBucketLocation",
                "s3:ListBucket",
                "s3:ListBucketMultipartUploads"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": "123456789012"
                }
            }
        },
        {
            "Action": [
                "s3:AbortMultipartUpload",
                "s3:DeleteObject",
                "s3:GetObject",
                "s3:GetObjectTagging",
                "s3:GetObjectVersion",
                "s3:GetObjectVersionTagging",
                "s3:ListMultipartUploadParts",
                "s3:PutObject",
                "s3:PutObjectTagging"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceAccount": "123456789012"
                }
            }
        }
    ]
}
```

------
#### [ Amazon S3 on Outposts ]

```
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3-outposts:ListBucket",
                "s3-outposts:ListBucketMultipartUploads"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3-outposts:us-east-1:123456789012:outpost/outpost-id/bucket/amzn-s3-demo-bucket",
                "arn:aws:s3-outposts:us-east-1:123456789012:outpost/outpost-id/accesspoint/bucket-access-point-name"
            ]
        },
        {
            "Action": [
                "s3-outposts:AbortMultipartUpload",
                "s3-outposts:DeleteObject",
                "s3-outposts:GetObject",
                "s3-outposts:GetObjectTagging",
                "s3-outposts:GetObjectVersion",
                "s3-outposts:GetObjectVersionTagging",
                "s3-outposts:ListMultipartUploadParts",
                "s3-outposts:PutObject",
                "s3-outposts:PutObjectTagging"
            ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3-outposts:us-east-1:123456789012:outpost/outpost-id/bucket/amzn-s3-demo-bucket/*",
                "arn:aws:s3-outposts:us-east-1:123456789012:outpost/outpost-id/accesspoint/bucket-access-point-name/*"
            ]
        },
        {
            "Action": "s3-outposts:GetAccessPoint",
            "Effect": "Allow",
            "Resource": "arn:aws:s3-outposts:us-east-1:123456789012:outpost/outpost-id/accesspoint/bucket-access-point-name"
        }
    ]
}
```

------
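If you manage these policies with infrastructure-as-code, generating the policy document from the bucket name reduces copy-paste errors. The following sketch builds the source-location policy shown above (the function name is illustrative):

```python
import json

def source_bucket_policy(bucket: str) -> dict:
    """Build the DataSync source-location S3 policy for a given bucket (sketch)."""
    arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:ListBucketMultipartUploads",
                ],
                "Effect": "Allow",
                "Resource": arn,  # bucket-level actions
            },
            {
                "Action": [
                    "s3:GetObject",
                    "s3:GetObjectTagging",
                    "s3:GetObjectVersion",
                    "s3:GetObjectVersionTagging",
                    "s3:ListMultipartUploadParts",
                ],
                "Effect": "Allow",
                "Resource": f"{arn}/*",  # object-level actions
            },
        ],
    }

print(json.dumps(source_bucket_policy("amzn-s3-demo-bucket"), indent=4))
```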

### Creating an IAM role for DataSync to access your Amazon S3 location
<a name="create-role-manually"></a>

When you [create your Amazon S3 location](#create-s3-location-how-to) in the console, DataSync can automatically create and assume an IAM role that typically has the right permissions to access your S3 bucket.

In some situations, you might need to create this role manually (for example, when accessing buckets with extra layers of security or transferring to or from a bucket in a different AWS account).

#### Manually creating an IAM role for DataSync
<a name="create-role-manually-steps"></a>

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **AWS service**.

1. For **Use case**, choose **DataSync** in the dropdown list and select **DataSync**. Choose **Next**.

1. On the **Add permissions** page, choose **Next**. Give your role a name and choose **Create role**.

1. On the **Roles** page, search for the role that you just created and choose its name.

1. On the role's details page, choose the **Permissions** tab. Choose **Add permissions** then **Create inline policy**.

1. Choose the **JSON** tab and [add the permissions required](#create-s3-location-required-permissions) to access your bucket into the policy editor.

1. Choose **Next**. Give your policy a name and choose **Create policy**.

1. (Recommended) To prevent the [cross-service confused deputy problem](cross-service-confused-deputy-prevention.md), do the following:

   1. On the role's details page, choose the **Trust relationships** tab. Choose **Edit trust policy**.

   1. Update the trust policy by using the following example, which includes the `aws:SourceArn` and `aws:SourceAccount` global condition context keys:

------
#### [ JSON ]

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                  "Service": "datasync.amazonaws.com"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                  "StringEquals": {
                      "aws:SourceAccount": "444455556666"
                  },
                  "ArnLike": {
                      "aws:SourceArn": "arn:aws:datasync:us-east-1:444455556666:*"
                  }
              }
            }
        ]
      }
      ```

------

   1. Choose **Update policy**.

You can specify this role when creating your Amazon S3 location.
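If you create the role programmatically, you can build the same confused-deputy trust policy from your account ID and Region. A sketch (the function name is illustrative; the standard service principal is `datasync.amazonaws.com`):

```python
import json

def datasync_trust_policy(account_id: str, region: str) -> dict:
    """Build a DataSync trust policy with confused-deputy condition keys (sketch)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
                "Condition": {
                    # Restrict which account and which DataSync resources
                    # can assume this role.
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:datasync:{region}:{account_id}:*"
                    },
                },
            }
        ],
    }

print(json.dumps(datasync_trust_policy("444455556666", "us-east-1"), indent=2))
```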

### Accessing S3 buckets using server-side encryption
<a name="create-s3-location-encryption"></a>

DataSync can transfer data to or from [S3 buckets that use server-side encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/serv-side-encryption.html). The type of encryption key a bucket uses can determine if you need a custom policy allowing DataSync to access the bucket.

When using DataSync with S3 buckets that use server-side encryption, remember the following:
+ **If your S3 bucket is encrypted with an AWS managed key** – DataSync can access the bucket's objects by default if all your resources are in the same AWS account.
+ **If your S3 bucket is encrypted with a customer managed AWS Key Management Service (AWS KMS) key (SSE-KMS)** – The [key's policy](https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying.html) must include the IAM role that DataSync uses to access the bucket.
+ **If your S3 bucket is encrypted with a customer managed SSE-KMS key and is in a different AWS account** – DataSync needs permission to access the bucket in the other AWS account. You can set this up by doing the following:
  + In the IAM role that DataSync uses, you must specify the cross-account bucket's SSE-KMS key by using the key's fully qualified Amazon Resource Name (ARN). This is the same key ARN that you use to configure the bucket's [default encryption](https://docs.aws.amazon.com/AmazonS3/latest/userguide/default-bucket-encryption.html). You can't specify a key ID, alias name, or alias ARN in this situation.

    Here's an example key ARN:

    `arn:aws:kms:us-west-2:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab`

    For more information on specifying KMS keys in IAM policy statements, see the *[AWS Key Management Service Developer Guide](https://docs.aws.amazon.com/kms/latest/developerguide/cmks-in-iam-policies.html)*.
  + In the SSE-KMS key policy, [specify the IAM role used by DataSync](https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying-external-accounts.html).
+ **If your S3 bucket is encrypted with a customer managed AWS KMS key (DSSE-KMS) for dual-layer server-side encryption** – The [key's policy](https://docs.aws.amazon.com/kms/latest/developerguide/key-policy-modifying.html) must include the IAM role that DataSync uses to access the bucket. (Keep in mind that DSSE-KMS doesn't support [S3 Bucket Keys](https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-key.html), which can reduce AWS KMS request costs.)
+ **If your S3 bucket is encrypted with a customer-provided encryption key (SSE-C)** – DataSync can't access this bucket.
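If you're not sure which of these situations applies, you can check a bucket's default encryption configuration with the AWS CLI before setting up access. The following is a sketch only, with a placeholder bucket name:

```
# Show a bucket's default server-side encryption settings
# (amzn-s3-demo-bucket is a placeholder)
aws s3api get-bucket-encryption \
    --bucket amzn-s3-demo-bucket \
    --query 'ServerSideEncryptionConfiguration.Rules[0].ApplyServerSideEncryptionByDefault'
```

In the output, an `SSEAlgorithm` of `aws:kms` together with a `KMSMasterKeyID` value indicates SSE-KMS with a specific KMS key. In that case, the key's policy must include the IAM role that DataSync uses.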

#### Example: SSE-KMS key policy for DataSync
<a name="create-s3-location-encryption-example"></a>

The following example is a [key policy](https://docs.aws.amazon.com/kms/latest/developerguide/key-policies.html) for a customer managed SSE-KMS key. The policy is associated with an S3 bucket that uses server-side encryption.

If you want to use this example, replace the following values with your own:
+ *111122223333* – Your AWS account ID.
+ *admin-role-name* – The name of the IAM role that can administer the key.
+ *datasync-role-name* – The name of the IAM role that allows DataSync to use the key when accessing the bucket.

------
#### [ JSON ]


```
{
    "Id": "key-consolepolicy-3",
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "Enable IAM Permissions",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:root"
            },
            "Action": "kms:*",
            "Resource": "*"
        },
        {
            "Sid": "Allow access for Key Administrators",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/admin-role-name"
            },
            "Action": [
                "kms:Create*",
                "kms:Describe*",
                "kms:Enable*",
                "kms:List*",
                "kms:Put*",
                "kms:Update*",
                "kms:Revoke*",
                "kms:Disable*",
                "kms:Get*",
                "kms:Delete*",
                "kms:TagResource",
                "kms:UntagResource",
                "kms:ScheduleKeyDeletion",
                "kms:CancelKeyDeletion"
            ],
            "Resource": "*"
        },
        {
            "Sid": "Allow use of the key",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111122223333:role/datasync-role-name"
            },
            "Action": [
                "kms:Encrypt",
                "kms:Decrypt",
                "kms:ReEncrypt*",
                "kms:GenerateDataKey*"
            ],
            "Resource": "*"
        }
    ]
}
```

------

### Accessing restricted S3 buckets
<a name="denying-s3-access"></a>

If you need to transfer to or from an S3 bucket that typically denies all access, you can edit the bucket policy so that DataSync can access the bucket only for your transfer.

#### Example: Allowing access based on IAM roles
<a name="denying-s3-access-example"></a>

1. Copy the following S3 bucket policy.

------
#### [ JSON ]


   ```
   {
       "Version":"2012-10-17",		 	 	 
       "Statement": [{
           "Sid": "Deny-access-to-bucket",
           "Effect": "Deny",
           "Principal": "*",
           "Action": "s3:*",
           "Resource": [
               "arn:aws:s3:::amzn-s3-demo-bucket",
               "arn:aws:s3:::amzn-s3-demo-bucket/*"
           ],
           "Condition": {
               "StringNotLike": {
                   "aws:userid": [
                       "datasync-iam-role-id:*",
                       "your-iam-role-id"
                   ]
               }
           }
       }]
   }
   ```

------

1. In the policy, replace the following values:
   + `amzn-s3-demo-bucket` – Specify the name of the restricted S3 bucket.
   + `datasync-iam-role-id` – Specify the ID of the [IAM role that DataSync uses](#create-s3-location-access) to access the bucket.

     Run the following AWS CLI command to get the IAM role ID:

     `aws iam get-role --role-name datasync-iam-role-name`

     In the output, look for the `RoleId` value:

     `"RoleId": "ANPAJ2UCCR6DPCEXAMPLE"`
   + `your-iam-role-id` – Specify the ID of the IAM role that you use to create your DataSync location for the bucket.

     Run the following command to get the IAM role ID:

     `aws iam get-role --role-name your-iam-role-name`

     In the output, look for the `RoleId` value:

     `"RoleId": "AIDACKCEVSQ6C2EXAMPLE"`

1. [Add this policy](https://docs.aws.amazon.com/AmazonS3/latest/userguide/add-bucket-policy.html) to your S3 bucket policy.

1. When you're done using DataSync with the restricted bucket, remove the conditions for both IAM roles from the bucket policy.
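When looking up these role IDs, you can also have the AWS CLI print just the `RoleId` value instead of scanning the full output. The following is a sketch with a placeholder role name:

```
# Print only the role ID (datasync-iam-role-name is a placeholder)
aws iam get-role --role-name datasync-iam-role-name \
    --query 'Role.RoleId' --output text
```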

### Accessing S3 buckets with restricted VPC access
<a name="create-s3-location-restricted-vpc"></a>

An Amazon S3 bucket that [limits access to specific virtual private cloud (VPC) endpoints or VPCs](https://docs.aws.amazon.com/AmazonS3/latest/userguide/example-bucket-policies-vpc-endpoint.html) will deny DataSync from transferring to or from that bucket. To enable transfers in these situations, you can update the bucket's policy to include the IAM role that you [specify with your DataSync location](#create-s3-location-how-to).

------
#### [ Option 1: Allowing access based on DataSync location role ARN ]

In the S3 bucket policy, you can specify the Amazon Resource Name (ARN) of your DataSync location IAM role.

The following example is an S3 bucket policy that denies access from all but two VPCs (`vpc-1234567890abcdef0` and `vpc-abcdef01234567890`). However, the policy also includes an [ArnNotLikeIfExists](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html) condition operator with the [aws:PrincipalArn](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html#condition-keys-principalarn) condition key, which allows the ARN of a DataSync location role to access the bucket.


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "Access-to-specific-VPCs-only",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*",
            "Condition": {
                "StringNotEqualsIfExists": {
                    "aws:SourceVpc": [
                        "vpc-1234567890abcdef0",
                        "vpc-abcdef01234567890"
                    ]
                },
                "ArnNotLikeIfExists": {
                    "aws:PrincipalArn": [
                        "arn:aws:iam::111122223333:role/datasync-location-role-name"
                    ]
                }
            }
        }
    ]
}
```

------
#### [ Option 2: Allowing access based on DataSync location role tag ]

In the S3 bucket policy, you can specify a tag attached to your DataSync location IAM role.

The following example is an S3 bucket policy that denies access from all but two VPCs (`vpc-1234567890abcdef0` and `vpc-abcdef01234567890`). However, the policy also includes a [StringNotEqualsIfExists](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_elements_condition_operators.html) condition operator with the [aws:PrincipalTag](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_condition-keys.html#condition-keys-principaltag) condition key, which allows a principal with the tag key `exclude-from-vpc-restriction` and value `true`. You can take a similar approach in your bucket policy by specifying a tag attached to your DataSync location role.


```
{
    "Version":"2012-10-17",		 	 	 
    "Statement": [
        {
            "Sid": "Access-to-specific-VPCs-only",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*",
            "Condition": {
                "StringNotEqualsIfExists": {
                    "aws:SourceVpc": [
                        "vpc-1234567890abcdef0",
                        "vpc-abcdef01234567890"
                    ],
                    "aws:PrincipalTag/exclude-from-vpc-restriction": "true"
                }
            }
        }
    ]
}
```

------

## Storage class considerations with Amazon S3 transfers
<a name="using-storage-classes"></a>

When Amazon S3 is your destination location, DataSync can transfer your data directly into a specific [Amazon S3 storage class](https://aws.amazon.com/s3/storage-classes/).

Some storage classes have behaviors that can affect your Amazon S3 storage costs. With storage classes that charge for overwriting, deleting, or retrieving objects, any change to object data or metadata results in those charges. For more information, see [Amazon S3 pricing](https://aws.amazon.com/s3/pricing/).

**Important**  
New objects transferred to your Amazon S3 destination location are stored using the storage class that you specify when [creating your location](#create-s3-location-how-to).  
By default, DataSync preserves the storage class of existing objects in your destination location unless you configure your task to [transfer all data](configure-metadata.md#task-option-transfer-mode). In that case, the storage class that you specify when creating your location is used for all objects.


| Amazon S3 storage class | Considerations | 
| --- | --- | 
| S3 Standard | Choose S3 Standard to store your frequently accessed files redundantly in multiple Availability Zones that are geographically separated. This is the default if you don't specify a storage class.  | 
| S3 Intelligent-Tiering |  Choose S3 Intelligent-Tiering to optimize storage costs by automatically moving data to the most cost-effective storage access tier. You pay a monthly charge per object stored in the S3 Intelligent-Tiering storage class. This Amazon S3 charge includes monitoring data access patterns and moving objects between tiers.  | 
| S3 Standard-IA |  Choose S3 Standard-IA to store your infrequently accessed objects redundantly in multiple Availability Zones that are geographically separated.  Objects stored in the S3 Standard-IA storage class can incur additional charges for overwriting, deleting, or retrieving. Consider how often these objects change, how long you plan to keep these objects, and how often you need to access them. Changes to object data or metadata are equivalent to deleting an object and creating a new one to replace it. This results in additional charges for objects stored in the S3 Standard-IA storage class. Objects less than 128 KB are smaller than the minimum capacity charge per object in the S3 Standard-IA storage class. These objects are stored in the S3 Standard storage class.  | 
| S3 One Zone-IA  |  Choose S3 One Zone-IA to store your infrequently accessed objects in a single Availability Zone.  Objects stored in the S3 One Zone-IA storage class can incur additional charges for overwriting, deleting, or retrieving. Consider how often these objects change, how long you plan to keep these objects, and how often you need to access them. Changes to object data or metadata are equivalent to deleting an object and creating a new one to replace it. This results in additional charges for objects stored in the S3 One Zone-IA storage class. Objects less than 128 KB are smaller than the minimum capacity charge per object in the S3 One Zone-IA storage class. These objects are stored in the S3 Standard storage class.  | 
| S3 Glacier Instant Retrieval |  Choose S3 Glacier Instant Retrieval to archive objects that are rarely accessed but require retrieval in milliseconds. Data stored in the S3 Glacier Instant Retrieval storage class offers cost savings compared to the S3 Standard-IA storage class with the same latency and throughput performance. S3 Glacier Instant Retrieval has higher data access costs than S3 Standard-IA, though. Objects stored in S3 Glacier Instant Retrieval can incur additional charges for overwriting, deleting, or retrieving. Consider how often these objects change, how long you plan to keep these objects, and how often you need to access them. Changes to object data or metadata are equivalent to deleting an object and creating a new one to replace it. This results in additional charges for objects stored in the S3 Glacier Instant Retrieval storage class. Objects less than 128 KB are smaller than the minimum capacity charge per object in the S3 Glacier Instant Retrieval storage class. These objects are stored in the S3 Standard storage class.  | 
| S3 Glacier Flexible Retrieval | Choose S3 Glacier Flexible Retrieval for more active archives. Objects stored in S3 Glacier Flexible Retrieval can incur additional charges for overwriting, deleting, or retrieving. Consider how often these objects change, how long you plan to keep these objects, and how often you need to access them. Changes to object data or metadata are equivalent to deleting an object and creating a new one to replace it. This results in additional charges for objects stored in the S3 Glacier Flexible Retrieval storage class. The S3 Glacier Flexible Retrieval storage class requires 40 KB of additional metadata for each archived object. DataSync puts objects that are less than 40 KB in the S3 Standard storage class. You must restore objects archived in this storage class before DataSync can read them. For information, see [Working with archived objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/archived-objects.html) in the *Amazon S3 User Guide*. When using S3 Glacier Flexible Retrieval, choose the **Verify only the data transferred** task option to compare data and metadata checksums at the end of the transfer. You can't use the **Verify all data in the destination** option for this storage class because it requires retrieving all existing objects from the destination. | 
| S3 Glacier Deep Archive |  Choose S3 Glacier Deep Archive to archive your objects for long-term data retention and digital preservation where data is accessed once or twice a year. Objects stored in S3 Glacier Deep Archive can incur additional charges for overwriting, deleting, or retrieving. Consider how often these objects change, how long you plan to keep these objects, and how often you need to access them. Changes to object data or metadata are equivalent to deleting an object and creating a new one to replace it. This results in additional charges for objects stored in the S3 Glacier Deep Archive storage class. The S3 Glacier Deep Archive storage class requires 40 KB of additional metadata for each archived object. DataSync puts objects that are less than 40 KB in the S3 Standard storage class. You must restore objects archived in this storage class before DataSync can read them. For information, see [Working with archived objects](https://docs.aws.amazon.com/AmazonS3/latest/userguide/archived-objects.html) in the *Amazon S3 User Guide*. When using S3 Glacier Deep Archive, choose the **Verify only the data transferred** task option to compare data and metadata checksums at the end of the transfer. You can't use the **Verify all data in the destination** option for this storage class because it requires retrieving all existing objects from the destination.  | 
|  S3 Outposts  |  The storage class for Amazon S3 on Outposts.  | 
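If you're deciding whether existing destination objects might need restoring or could incur retrieval charges, you can check an object's storage class with the AWS CLI. The following is a sketch with placeholder bucket and key names; note that `head-object` omits the `StorageClass` field for objects in S3 Standard:

```
# Show an object's storage class (bucket and key are placeholders)
aws s3api head-object \
    --bucket amzn-s3-demo-bucket \
    --key photos/2006/January/sample.jpg \
    --query 'StorageClass'
```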

## Evaluating S3 request costs when using DataSync
<a name="create-s3-location-s3-requests"></a>

With Amazon S3 locations, you incur costs related to S3 API requests made by DataSync. This section can help you understand how DataSync uses these requests and how they might affect your [Amazon S3 costs](https://aws.amazon.com/s3/pricing/).

**Topics**
+ [

### S3 requests made by DataSync
](#create-s3-location-s3-requests-made)
+ [

### Cost considerations
](#create-s3-location-s3-requests-cost)

### S3 requests made by DataSync
<a name="create-s3-location-s3-requests-made"></a>

The following table describes the S3 requests that DataSync can make when you’re copying data to or from an Amazon S3 location.


| S3 request | How DataSync uses it | 
| --- | --- | 
|  [ListObjectsV2](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html)  |  DataSync makes at least one `LIST` request for every object ending in a forward slash (`/`) to list the objects that start with that prefix. DataSync makes this request during a task’s [preparing](run-task.md#understand-task-execution-statuses) phase.  | 
|  [HeadObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html)  | DataSync makes `HEAD` requests to retrieve object metadata during a task’s [preparing](run-task.md#understand-task-execution-statuses) and [verifying](run-task.md#understand-task-execution-statuses) phases. There can be multiple `HEAD` requests per object depending on how you want DataSync to [verify the integrity of the data it transfers](configure-data-verification-options.md). | 
|  [GetObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html)  |  DataSync makes `GET` requests to read data from an object during a task’s [transferring](run-task.md#understand-task-execution-statuses) phase. There can be multiple `GET` requests for large objects.  | 
|  [GetObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObjectTagging.html)  |  If you configure your task to [copy object tags](configure-metadata.md), DataSync makes these `GET` requests to check for object tags during the task's [preparing](run-task.md#understand-task-execution-statuses) and [transferring](run-task.md#understand-task-execution-statuses) phases.  | 
|  [PutObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObject.html)  |  DataSync makes `PUT` requests to create objects and prefixes in a destination S3 bucket during a task’s [transferring](run-task.md#understand-task-execution-statuses) phase. Since DataSync uses the [Amazon S3 multipart upload feature](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html), there can be multiple `PUT` requests for large objects. To help minimize storage costs, we recommend using a lifecycle configuration to stop incomplete multipart uploads.  | 
|  [PutObjectTagging](https://docs.aws.amazon.com/AmazonS3/latest/API/API_PutObjectTagging.html)  | If your source objects have tags and you configure your task to [copy object tags](configure-metadata.md), DataSync makes these `PUT` requests when [transferring](run-task.md#understand-task-execution-statuses) those tags. | 
|  [CopyObject](https://docs.aws.amazon.com/AmazonS3/latest/API/API_CopyObject.html)  |  DataSync makes a `COPY` request to create a copy of an object only if that object’s metadata changes. This can happen if you originally copied data to the S3 bucket using another service or tool that didn’t carry over its metadata.  | 

### Cost considerations
<a name="create-s3-location-s3-requests-cost"></a>

DataSync makes S3 requests on your S3 buckets every time you run your task. In certain situations, these charges can add up. For example:
+ You’re frequently transferring objects to or from an S3 bucket.
+ You may not be transferring much data, but your S3 bucket has lots of objects in it. You can still see high charges in this scenario because DataSync makes S3 requests on each of the bucket's objects.
+ You're transferring between S3 buckets, so DataSync is making S3 requests on the source and destination.
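To get a sense of the scale, consider a rough estimate. Because DataSync makes at least one `HEAD` request per object during each run's preparing phase, request volume grows with both object count and run frequency. The numbers below are illustrative assumptions, not measurements:

```
# Illustrative estimate of monthly HEAD request volume for a daily task
# over a bucket with 1,000,000 objects (assumed values, not measurements)
OBJECTS=1000000
RUNS_PER_MONTH=30
HEAD_REQUESTS=$((OBJECTS * RUNS_PER_MONTH))
echo "HEAD requests per month: $HEAD_REQUESTS"
```

Even without transferring any data, this schedule generates 30 million `HEAD` requests per month, which you can multiply by the per-request price for your objects' storage class.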

To help minimize S3 request costs related to DataSync, consider the following:

**Topics**
+ [

#### What S3 storage classes am I using?
](#create-s3-location-s3-requests-storage-classes)
+ [

#### How often do I need to transfer my data?
](#create-s3-location-s3-requests-recurring-transfers)

#### What S3 storage classes am I using?
<a name="create-s3-location-s3-requests-storage-classes"></a>

S3 request charges can vary based on the Amazon S3 storage class your objects are using, particularly for classes that archive objects (such as S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive).

Here are some scenarios in which storage classes can affect your S3 request charges when using DataSync:
+ Each time you run a task, DataSync makes `HEAD` requests to retrieve object metadata. These requests result in charges even if you aren’t moving any objects. How much these requests affect your bill depends on the storage class your objects are using along with the number of objects that DataSync scans.
+ If you moved objects into the S3 Glacier Instant Retrieval storage class (either directly or through a bucket lifecycle configuration), requests on objects in this class are more expensive than requests on objects in other storage classes.
+ If you configure your DataSync task to [verify that your source and destination locations are fully synchronized](configure-data-verification-options.md), there will be `GET` requests for each object in all storage classes (except S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive).
+ In addition to `GET` requests, you incur data retrieval costs for objects in the S3 Standard-IA, S3 One Zone-IA, or S3 Glacier Instant Retrieval storage class.

For more information, see [Amazon S3 pricing](https://aws.amazon.com/s3/pricing/).

#### How often do I need to transfer my data?
<a name="create-s3-location-s3-requests-recurring-transfers"></a>

If you need to move data on a recurring basis, think about a [schedule](task-scheduling.md) that doesn't run more tasks than you need.

You may also consider limiting the scope of your transfers. For example, you can configure DataSync to focus on objects in certain prefixes or [filter what data gets transferred](filtering.md). These options can help reduce the number of S3 requests made each time you run your DataSync task.
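For example, the following sketch limits a single task execution to objects under one prefix by using an include filter. The task ARN and filter value are placeholders:

```
# Run a task against only one prefix (ARN and pattern are placeholders)
aws datasync start-task-execution \
    --task-arn 'arn:aws:datasync:us-east-1:111222333444:task/task-01234567890abcdef' \
    --includes FilterType=SIMPLE_PATTERN,Value="/photos/2006*"
```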

## Object considerations with Amazon S3 transfers
<a name="create-s3-location-considerations"></a>
+ If you're transferring from an S3 bucket, use [S3 Storage Lens](https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage_lens_basics_metrics_recommendations.html) to determine how many objects you're moving.
+ When transferring between S3 buckets, we recommend using [Enhanced task mode](choosing-task-mode.md) because you aren't subject to DataSync task [quotas](datasync-limits.md).
+ DataSync might not transfer an object with nonstandard characters in its name. For more information, see the [object key naming guidelines](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-keys.html#object-key-guidelines) in the *Amazon S3 User Guide*.
+ When using DataSync with an S3 bucket that uses [versioning](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Versioning.html), remember the following:
  + When transferring to an S3 bucket, DataSync creates a new version of an object if that object is modified at the source. This results in additional charges.
  + An object has different version IDs in the source and destination buckets.
  + Only the most recent version of each object is transferred from the source bucket. Earlier versions are not copied to the destination.
+ After initially transferring data from an S3 bucket to a file system (for example, NFS or Amazon FSx), subsequent runs of the same DataSync task won't include objects that have been modified but are the same size they were during the first transfer.
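If you need to confirm the versioning behavior described above, you can list an object's versions in the destination bucket after a transfer. The following is a sketch with placeholder names:

```
# List versions for objects under a prefix (bucket and prefix are placeholders)
aws s3api list-object-versions \
    --bucket amzn-s3-demo-bucket \
    --prefix photos/ \
    --query 'Versions[].{Key:Key,VersionId:VersionId,IsLatest:IsLatest}'
```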

## Creating your transfer location for an Amazon S3 general purpose bucket
<a name="create-s3-location-how-to"></a>

To create a location for your transfer, you need an existing S3 general purpose bucket. If you don't have one, see [Getting started with Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/GetStartedWithS3.html) in the *Amazon S3 User Guide*.

**Important**  
Before you create your location, make sure that you read the following sections:  
[Storage class considerations with Amazon S3 transfers](#using-storage-classes)
[Evaluating S3 request costs when using DataSync](#create-s3-location-s3-requests)

### Using the DataSync console
<a name="create-s3-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon S3**, and then choose **General purpose bucket**.

1. For **S3 URI**, enter or choose the bucket and prefix that you want to use for your location.
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. For **S3 storage class when used as a destination**, choose a storage class that you want your objects to use when Amazon S3 is a transfer destination.

   For more information, see [Storage class considerations with Amazon S3 transfers](#using-storage-classes).

1. For **IAM role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role with the permissions required to access the S3 bucket.

     If DataSync previously created an IAM role for this S3 bucket, that role is chosen by default.
   + Choose a custom IAM role that you created. For more information, see [Creating an IAM role for DataSync to access your Amazon S3 location](#create-role-manually).

1. (Optional) Choose **Add new tag** to tag your Amazon S3 location.

   Tags can help you manage, filter, and search for your resources. We recommend creating a name tag for your location.

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-s3-cli"></a>

1. Copy the following `create-location-s3` command:

   ```
   aws datasync create-location-s3 \
       --s3-bucket-arn 'arn:aws:s3:::amzn-s3-demo-bucket' \
       --s3-storage-class 'your-S3-storage-class' \
       --s3-config 'BucketAccessRoleArn=arn:aws:iam::account-id:role/role-allowing-datasync-operations' \
       --subdirectory /your-prefix-name
   ```

1. For `--s3-bucket-arn`, specify the ARN of the S3 bucket that you want to use as a location.

1. For `--s3-storage-class`, specify a storage class that you want your objects to use when Amazon S3 is a transfer destination.

1. For `--s3-config`, specify the ARN of the IAM role that DataSync needs to access your bucket.

   For more information, see [Creating an IAM role for DataSync to access your Amazon S3 location](#create-role-manually).

1. For `--subdirectory`, specify a prefix in the S3 bucket that DataSync reads from or writes to (depending on whether the bucket is a source or destination location).
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. Run the `create-location-s3` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d"
   }
   ```

You can use this location as a source or destination for your DataSync task.
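Because prefixes with a leading slash or `//`, `/./`, or `/../` patterns can't be transferred by DataSync, you might want to validate a prefix locally before running `create-location-s3`. The following helper is an illustrative sketch of our own (not part of DataSync or the AWS CLI):

```
# Reject prefixes DataSync can't transfer: a leading slash, or any
# //, /./, or /../ pattern (illustrative helper, not part of DataSync)
valid_prefix() {
  case "$1" in
    /*|*//*|*/./*|*/../*|*/.|*/..) return 1 ;;
    *) return 0 ;;
  esac
}

valid_prefix 'photos/2006/January' && echo "ok"
valid_prefix '/photos' || echo "rejected"
```

Running the sketch prints `ok` for the valid prefix and `rejected` for the invalid one.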

## Creating your transfer location for an S3 on Outposts bucket
<a name="create-s3-location-outposts-how-to"></a>

To create a location for your transfer, you need an existing Amazon S3 on Outposts bucket. If you don't have one, see [What is Amazon S3 on Outposts?](https://docs.aws.amazon.com/AmazonS3/latest/s3-outposts/S3onOutposts.html)

You also need a DataSync agent. For more information, see [Deploying your Basic mode agent on AWS Outposts](deploy-agents.md#outposts-agent).

When transferring from an S3 on Outposts bucket prefix that contains a large dataset (such as hundreds of thousands or millions of objects), your DataSync task might time out. To avoid this, consider using a [DataSync manifest](transferring-with-manifest.md), which lets you specify the exact objects that you need to transfer.

### Using the DataSync console
<a name="create-s3-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon S3**, and then choose **Outposts bucket**.

1. For **S3 bucket**, choose an Amazon S3 access point that can access your S3 on Outposts bucket. 

   For more information, see [Managing data access with Amazon S3 access points](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-points.html) in the *Amazon S3 User Guide*.

1. For **S3 storage class when used as a destination**, choose a storage class that you want your objects to use when Amazon S3 is a transfer destination.

   For more information, see [Storage class considerations with Amazon S3 transfers](#using-storage-classes). DataSync by default uses the S3 Outposts storage class for Amazon S3 on Outposts.

1. For **Agents**, specify the Amazon Resource Name (ARN) of the DataSync agent on your Outpost.

1. For **Folder**, enter a prefix in the S3 bucket that DataSync reads from or writes to (depending on whether the bucket is a source or destination location).
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. For **IAM role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role with the permissions required to access the S3 bucket.

     If DataSync previously created an IAM role for this S3 bucket, that role is chosen by default.
   + Choose a custom IAM role that you created. For more information, see [Creating an IAM role for DataSync to access your Amazon S3 location](#create-role-manually).

1. (Optional) Choose **Add new tag** to tag your Amazon S3 location.

   Tags can help you manage, filter, and search for your resources. We recommend creating a name tag for your location.

1. Choose **Create location**.
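The **Folder** prefix restrictions above can be checked locally before you create a location. The following shell function is a sketch of those documented rules; it isn't part of DataSync or the AWS CLI:

```shell
# Returns 0 (success) if a prefix satisfies DataSync's documented rules:
# it must not begin with a slash or contain //, /./, or /../ patterns.
valid_s3_prefix() {
  case "$1" in
    /*|*//*|*/./*|*/../*) return 1 ;;
    *) return 0 ;;
  esac
}

valid_s3_prefix "photos/2006/January" && echo "usable prefix"
valid_s3_prefix "/photos" || echo "rejected: begins with a slash"
```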

### Using the AWS CLI
<a name="create-location-s3-cli"></a>

1. Copy the following `create-location-s3` command:

   ```
   aws datasync create-location-s3 \
       --s3-bucket-arn 'bucket-access-point' \
       --s3-storage-class 'your-S3-storage-class' \
       --s3-config 'BucketAccessRoleArn=arn:aws:iam::account-id:role/role-allowing-datasync-operations' \
       --subdirectory /your-folder \
       --agent-arns 'arn:aws:datasync:your-region:account-id:agent/your-agent-id'
   ```

1. For `--s3-bucket-arn`, specify the ARN of an Amazon S3 access point that can access your S3 on Outposts bucket.

   For more information, see [Amazon S3 access points](https://docs.aws.amazon.com/AmazonS3/latest/userguide/access-points.html) in the *Amazon S3 User Guide*.

1. For `--s3-storage-class`, specify a storage class that you want your objects to use when Amazon S3 is a transfer destination.

   For more information, see [Storage class considerations with Amazon S3 transfers](#using-storage-classes). DataSync by default uses the S3 Outposts storage class for S3 on Outposts.

1. For `--s3-config`, specify the ARN of the IAM role that DataSync needs to access your bucket.

   For more information, see [Creating an IAM role for DataSync to access your Amazon S3 location](#create-role-manually).

1. For `--subdirectory`, specify a prefix in the S3 bucket that DataSync reads from or writes to (depending on whether the bucket is a source or destination location).
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. For `--agent-arns`, specify the ARN of the DataSync agent on your Outpost.

1. Run the `create-location-s3` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d"
   }
   ```

You can use this location as a source or destination for your DataSync task.

## Amazon S3 transfers across AWS accounts
<a name="create-s3-location-cross-transfers"></a>

With DataSync, you can move data to or from S3 buckets in [different AWS accounts](working-with-locations.md#working-with-locations-across-accounts). For more information, see the following tutorials:
+ [Transferring data from on-premises storage to Amazon S3 across AWS accounts](s3-cross-account-transfer.md)
+ [Transferring data from Amazon S3 to Amazon S3 across AWS accounts](tutorial_s3-s3-cross-account-transfer.md)

## Amazon S3 transfers between commercial and AWS GovCloud (US) Regions
<a name="create-s3-location-govcloud"></a>

By default, DataSync doesn't transfer between S3 buckets in commercial and AWS GovCloud (US) Regions. However, you can set up this kind of transfer by creating an object storage location for one of the S3 buckets in your transfer. You can perform this type of transfer with or without an agent: with an agent, your task must be configured for **Basic** mode; without an agent, your task must use **Enhanced** mode.

**Before you begin**: Make sure that you understand the cost implications of transferring between Regions. For more information, see [AWS DataSync Pricing](https://aws.amazon.com/datasync/pricing/).

**Contents**
+ [

### Providing DataSync access to your object storage location's bucket
](#create-s3-location-govcloud-iam)
+ [

### Creating your DataSync agent (optional)
](#create-s3-location-govcloud-create-agent)
+ [

### Creating an object storage location for your S3 bucket
](#create-s3-location-govcloud-how-to)

### Providing DataSync access to your object storage location's bucket
<a name="create-s3-location-govcloud-iam"></a>

When creating the object storage location for this transfer, you must provide DataSync the credentials of an IAM user with permission to access the location's S3 bucket. For more information, see [Required permissions](#create-s3-location-required-permissions).

**Warning**  
IAM users have long-term credentials, which presents a security risk. To help mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed.

### Creating your DataSync agent (optional)
<a name="create-s3-location-govcloud-create-agent"></a>

If you run your transfer in **Basic** mode, you need an agent. Because you're transferring between a commercial Region and an AWS GovCloud (US) Region, you deploy your DataSync agent as an Amazon EC2 instance in one of those Regions. We recommend that your agent use a VPC service endpoint to avoid data transfer charges to the public internet. For more information, see [Amazon EC2 Data Transfer pricing](https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer).

Choose one of the following scenarios that describe how to create an agent based on the Region where you plan to run your DataSync task.

#### When running a DataSync task in a commercial Region
<a name="using-datasync-in-commercial"></a>

The following diagram shows a transfer where your DataSync task and agent are in the commercial Region.

![\[A DataSync agent deployed in a commercial Region for a cross-Region transfer to an S3 bucket in an AWS GovCloud (US) Region.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/s3-task-in-commercial.png)



| Reference | Description | 
| --- | --- | 
| 1 | In the commercial Region where you're running a DataSync task, data transfers from the source S3 bucket. The source bucket is configured as an [Amazon S3 location](#create-s3-location-how-to) in the commercial Region. | 
| 2 | Data transfers through the DataSync agent, which is in the same VPC and subnet where the VPC service endpoint and [network interfaces](required-network-interfaces.md) are located. | 
| 3 | Data transfers to the destination S3 bucket in the AWS GovCloud (US) Region. The destination bucket is configured as an [object storage location](#create-s3-location-govcloud-how-to) in the commercial Region.  | 

You can use this same setup to transfer in the opposite direction, too, from the AWS GovCloud (US) Region to the commercial Region.

**To create your DataSync agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent-how-to) in your commercial Region.

1. Configure your agent to use a [VPC service endpoint](choose-service-endpoint.md#datasync-in-vpc).

1. [Activate your agent](activate-agent.md).

#### When running a DataSync task in a GovCloud (US) Region
<a name="using-datasync-in-govcloud-1"></a>

The following diagram shows a transfer where your DataSync task and agent are in the AWS GovCloud (US) Region.

![\[A DataSync agent deployed in an AWS GovCloud (US) Region for a cross-Region transfer to an S3 bucket in the same AWS GovCloud (US) Region.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/s3-task-in-govcloud-1.png)



| Reference | Description | 
| --- | --- | 
| 1 | Data transfers from the source S3 bucket in the commercial Region to the AWS GovCloud (US) Region where you're running a DataSync task. The source bucket is configured as an [object storage location](#create-s3-location-govcloud-how-to) in the AWS GovCloud (US) Region.  | 
| 2 | In the AWS GovCloud (US) Region, data transfers through the DataSync agent in the same VPC and subnet where the VPC service endpoint and [network interfaces](required-network-interfaces.md) are located. | 
| 3 | Data transfers to the destination S3 bucket in the AWS GovCloud (US) Region. The destination bucket is configured as an [Amazon S3 location](#create-s3-location-how-to) in the AWS GovCloud (US) Region. | 

You can use this same setup to transfer in the opposite direction, too, from the AWS GovCloud (US) Region to the commercial Region.

**To create your DataSync agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent-how-to) in your AWS GovCloud (US) Region.

1. Configure your agent to use a [VPC service endpoint](choose-service-endpoint.md#datasync-in-vpc).

1. [Activate your agent](activate-agent.md).

If your dataset is highly compressible, you might see reduced costs by instead creating your agent in a commercial Region while running a task in an AWS GovCloud (US) Region. There's more setup than normal for creating this agent, including preparing the agent for use in a commercial Region. For information about creating an agent for this setup, see the [Move data in and out of AWS GovCloud (US) with AWS DataSync](https://aws.amazon.com/blogs/publicsector/move-data-in-out-aws-govcloud-datasync/) blog.

### Creating an object storage location for your S3 bucket
<a name="create-s3-location-govcloud-how-to"></a>

You need an object storage location for the S3 bucket that's in the Region where you aren't running your DataSync task.

#### Using the DataSync console
<a name="create-s3-location-govcloud-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. Make sure that you're in the same Region where you plan to run your task.

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

1. For **Agents**, choose the DataSync agent that you created for this transfer.

1. For **Server**, enter an Amazon S3 endpoint for your bucket by using one of the following formats:
   + **Commercial Region bucket:** `s3.your-region.amazonaws.com`
   + **AWS GovCloud (US) Region bucket**: `s3.your-gov-region.amazonaws.com`

   For a list of Amazon S3 endpoints, see the *[AWS General Reference](https://docs.aws.amazon.com/general/latest/gr/s3.html)*.

1. For **Bucket name**, enter the name of the S3 bucket.

1. For **Folder**, enter a prefix in the S3 bucket that DataSync reads from or writes to (depending on whether the bucket is a source or destination location).
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. Select **Requires credentials** and do the following:
   + For **Access key**, enter the access key for an [IAM user](#create-s3-location-govcloud-iam) that can access the bucket.
   + For **Secret key**, enter the same IAM user’s secret key.

1. (Optional) Choose **Add tag** to tag your location.

   Tags can help you manage, filter, and search for your resources. We recommend creating a name tag for your location.

1. Choose **Create location**.

#### Using the AWS CLI
<a name="create-s3-location-govcloud-how-to-cli"></a>

1. Copy the following `create-location-object-storage` command:

   ```
   aws datasync create-location-object-storage \
       --server-hostname s3-endpoint \
       --bucket-name amzn-s3-demo-bucket \
       --agent-arns arn:aws:datasync:your-region:123456789012:agent/agent-01234567890deadfb \
       --access-key your-access-key \
       --secret-key your-secret-key
   ```

1. For the `--server-hostname` parameter, specify an Amazon S3 endpoint for your bucket by using one of the following formats:
   + **Commercial Region bucket:** `s3.your-region.amazonaws.com`
   + **AWS GovCloud (US) Region bucket**: `s3.your-gov-region.amazonaws.com`

   For the Region in the endpoint, make sure that you specify the same Region where you plan to run your task.

   For a list of Amazon S3 endpoints, see the *[AWS General Reference](https://docs.aws.amazon.com/general/latest/gr/s3.html)*.

1. For the `--bucket-name` parameter, specify the name of the S3 bucket.

1. For the `--agent-arns` parameter, specify the DataSync agent that you created for this transfer.

1. For the `--access-key` parameter, specify the access key for an [IAM user](#create-s3-location-govcloud-iam) that can access the bucket.

1. For the `--secret-key` parameter, enter the same IAM user's secret key.

1. (Optional) For the `--subdirectory` parameter, specify a prefix in the S3 bucket that DataSync reads from or writes to (depending on whether the bucket is a source or destination location).
**Warning**  
DataSync can't transfer objects with a prefix that begins with a slash (`/`) or includes `//`, `/./`, or `/../` patterns. For example:  
`/photos`
`photos//2006/January`
`photos/./2006/February`
`photos/../2006/March`

1. (Optional) For the `--tags` parameter, specify key-value pairs that represent tags for the location resource.

   Tags can help you manage, filter, and search for your resources. We recommend creating a name tag for your location.

1. Run the `create-location-object-storage` command.

   You get a response that shows you the location ARN that you just created.

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-01234567890abcdef"
   }
   ```

You can use this location as a source or destination for your DataSync task. For the other S3 bucket in this transfer, [create an Amazon S3 location](#create-s3-location-how-to).

## Next steps
<a name="create-s3-location-next-steps"></a>

Some possible next steps include:

1. If needed, create your other location. For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. [Configure DataSync task settings](task-options.md), such as which files to transfer and how to handle metadata.

1. [Set a schedule](task-scheduling.md) for your DataSync task.

1. [Configure monitoring](monitoring-overview.md) for your DataSync task.

1. [Start](run-task.md) your task.
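Once both of your locations exist, the task-related steps above can be sketched with the AWS CLI as follows. The location and task ARNs here are placeholders:

```shell
# Sketch: create a task from existing source and destination locations,
# then start it. All ARNs below are placeholders.
aws datasync create-task \
    --name 'my-transfer-task' \
    --source-location-arn 'arn:aws:datasync:us-east-1:111222333444:location/loc-source-id' \
    --destination-location-arn 'arn:aws:datasync:us-east-1:111222333444:location/loc-dest-id'

# create-task returns a TaskArn; pass it to start-task-execution.
aws datasync start-task-execution \
    --task-arn 'arn:aws:datasync:us-east-1:111222333444:task/your-task-id'
```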

# Configuring AWS DataSync transfers with Amazon EFS
<a name="create-efs-location"></a>

To transfer data to or from your Amazon EFS file system, you must create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to Amazon EFS file systems
<a name="create-efs-location-access"></a>

[Creating a location](#create-efs-location-how-to) involves understanding how DataSync can access your storage. For Amazon EFS, DataSync mounts your file system as a root user from your virtual private cloud (VPC) using [network interfaces](required-network-interfaces.md).

**Contents**
+ [

### Determining the subnet and security groups for your mount target
](#create-efs-location-mount-target)
+ [

### Accessing restricted file systems
](#create-efs-location-iam)
  + [

#### Creating a DataSync IAM role for file system access
](#create-efs-location-iam-role)
  + [

#### Example file system policy allowing DataSync access
](#create-efs-location-iam-policy)

### Determining the subnet and security groups for your mount target
<a name="create-efs-location-mount-target"></a>

When creating your location, you specify the subnet and security groups that allow DataSync to connect to one of your Amazon EFS file system's [mount targets](https://docs.aws.amazon.com/efs/latest/ug/accessing-fs.html).

The subnet that you specify must be located:
+ In the same VPC as your file system.
+ In the same Availability Zone as at least one mount target for your file system.

**Note**  
You don't need to specify a subnet that includes a file system mount target.

The security groups that you specify must allow inbound traffic on Network File System (NFS) port 2049. For information on creating and updating security groups for your mount targets, see the *[Amazon Elastic File System User Guide](https://docs.aws.amazon.com/efs/latest/ug/network-access.html)*.

**Specifying security groups associated with a mount target**  
You can specify a security group that's associated with one of your file system's mount targets. We recommend this approach from a network management standpoint.

**Specifying security groups that aren't associated with a mount target**  
You also can specify a security group that isn't associated with one of your file system's mount targets. However, this security group must be able to communicate with a mount target's security group.  
For example, here's how you might create a relationship between security group D (for DataSync) and security group M (for the mount target):  
+ Security group D, which you specify when creating your location, must have a rule that allows outbound connections on NFS port 2049 to security group M.
+ Security group M, which you associate with the mount target, must allow inbound access on NFS port 2049 from security group D.
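Those two rules can be sketched with the AWS CLI as follows. The group IDs here are hypothetical (`sg-0123datasync` stands in for group D and `sg-0123mount` for group M):

```shell
# Group D: allow outbound NFS (port 2049) to group M.
aws ec2 authorize-security-group-egress \
    --group-id sg-0123datasync \
    --ip-permissions 'IpProtocol=tcp,FromPort=2049,ToPort=2049,UserIdGroupPairs=[{GroupId=sg-0123mount}]'

# Group M: allow inbound NFS (port 2049) from group D.
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123mount \
    --protocol tcp --port 2049 \
    --source-group sg-0123datasync
```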

**To find a mount target's security group**

The following instructions can help you identify the security group of an Amazon EFS file system mount target that you want DataSync to use for your transfer.

1. In the AWS CLI, run the following `describe-mount-targets` command.

   ```
   aws efs describe-mount-targets \
       --region file-system-region  \
       --file-system-id file-system-id
   ```

   This command returns information about your file system's mount targets (similar to the following example output).

   ```
   {
       "MountTargets": [
           {
               "OwnerId": "111222333444",
               "MountTargetId": "fsmt-22334a10",
               "FileSystemId": "fs-123456ab",
               "SubnetId": "subnet-f12a0e34",
               "LifeCycleState": "available",
               "IpAddress": "11.222.0.123",
               "NetworkInterfaceId": "eni-1234a044"
           }
       ]
   }
   ```

1. Take note of the `MountTargetId` value that you want to use.

1. Run the following `describe-mount-target-security-groups` command using the `MountTargetId` to see the security group of your mount target.

   ```
   aws efs describe-mount-target-security-groups \
       --region file-system-region \
       --mount-target-id mount-target-id
   ```

You specify this security group when [creating your location](#create-efs-location-how-to).
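If your file system has several mount targets, a JMESPath `--query` can make step 1's output easier to scan. This is a sketch; the file system ID is a placeholder:

```shell
# List each mount target's ID, subnet, and state in a table so that you
# can pick the MountTargetId for describe-mount-target-security-groups.
aws efs describe-mount-targets \
    --file-system-id fs-123456ab \
    --query 'MountTargets[].[MountTargetId,SubnetId,LifeCycleState]' \
    --output table
```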

### Accessing restricted file systems
<a name="create-efs-location-iam"></a>

DataSync can transfer to or from Amazon EFS file systems that restrict access through [access points](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html) and [IAM policies](https://docs.aws.amazon.com/efs/latest/ug/iam-access-control-nfs-efs.html).

**Note**  
If DataSync accesses a destination file system through an access point that [enforces user identity](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html#enforce-identity-access-points), the POSIX user and group IDs for your source data aren't preserved if you configure your DataSync task to [copy ownership](configure-metadata.md). Instead, the transferred files and folders are set to the access point's user and group IDs. When this happens, task verification fails because DataSync detects a mismatch between metadata in the source and destination locations.

**Contents**
+ [

#### Creating a DataSync IAM role for file system access
](#create-efs-location-iam-role)
+ [

#### Example file system policy allowing DataSync access
](#create-efs-location-iam-policy)

#### Creating a DataSync IAM role for file system access
<a name="create-efs-location-iam-role"></a>

If you have an Amazon EFS file system that restricts access through an IAM policy, you can create an IAM role that provides DataSync permission to read from or write to the file system. You then might need to specify that role in your [file system policy](#create-efs-location-iam-policy).

**To create the DataSync IAM role**

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **Custom trust policy**.

1. Paste the following JSON into the policy editor:

   ```
   {
    "Version": "2012-10-17",
       "Statement": [{
           "Effect": "Allow",
           "Principal": {
               "Service": "datasync.amazonaws.com"
           },
           "Action": "sts:AssumeRole"
       }]
   }
   ```


1. Choose **Next**. On the **Add permissions** page, choose **Next**.

1. Give your role a name and choose **Create role**.

You specify this role when [creating your location](#create-efs-location-how-to).
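The same role can also be created with the AWS CLI. This sketch assumes you saved the custom trust policy from step 4 as `trust-policy.json`; the role name is an example:

```shell
# Create the DataSync IAM role from a saved trust policy document.
aws iam create-role \
    --role-name MyDataSyncEfsRole \
    --assume-role-policy-document file://trust-policy.json
```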

#### Example file system policy allowing DataSync access
<a name="create-efs-location-iam-policy"></a>

The following example file system policy shows how access to an Amazon EFS file system (identified in the policy as `fs-1234567890abcdef0`) is restricted but still allows access to DataSync through an IAM role named `MyDataSyncRole`:

```
{
    "Version": "2012-10-17",
    "Id": "ExampleEFSFileSystemPolicy",
    "Statement": [{
        "Sid": "AccessEFSFileSystem",
        "Effect": "Allow",
        "Principal": {
            "AWS": "arn:aws:iam::111122223333:role/MyDataSyncRole"
        },
        "Action": [
            "elasticfilesystem:ClientMount",
            "elasticfilesystem:ClientWrite",
            "elasticfilesystem:ClientRootAccess"
        ],
        "Resource": "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-1234567890abcdef0",
        "Condition": {
            "Bool": {
                "aws:SecureTransport": "true"
            },
            "StringEquals": {
                "elasticfilesystem:AccessPointArn": "arn:aws:elasticfilesystem:us-east-1:111122223333:access-point/fsap-abcdef01234567890"
            }
        }
    }]
}
```

+ `Principal` – Specifies an [IAM role](#create-efs-location-iam) that gives DataSync permission to access the file system.
+ `Action` – Gives DataSync root access and allows it to read from and write to the file system.
+ `aws:SecureTransport` – Requires NFS clients to use TLS when connecting to the file system.
+ `elasticfilesystem:AccessPointArn` – Allows access to the file system only through a specific access point.
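To attach a policy like this one with the AWS CLI, you can use `put-file-system-policy`. This is a sketch: the file system ID is a placeholder, and `policy.json` holds your policy document:

```shell
# Attach the file system policy to the Amazon EFS file system.
aws efs put-file-system-policy \
    --file-system-id fs-1234567890abcdef0 \
    --policy file://policy.json
```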

## Network considerations with Amazon EFS transfers
<a name="efs-network-considerations"></a>

VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.

## Performance considerations with Amazon EFS transfers
<a name="efs-considerations"></a>

Your Amazon EFS file system's throughput mode can affect transfer duration and file system performance during the transfer. Consider the following:
+ For best results, we recommend using Elastic throughput mode. If you don't use Elastic throughput mode, your transfer might take longer.
+ If you use Bursting throughput mode, the performance of your file system's applications might be affected because DataSync consumes file system burst credits.
+ How you [configure DataSync to verify your transferred data](configure-data-verification-options.md) can affect file system performance and data access costs.

For more information, see [Amazon EFS performance](https://docs.aws.amazon.com/efs/latest/ug/performance.html) in the *Amazon Elastic File System User Guide* and the [Amazon EFS Pricing](https://aws.amazon.com/efs/pricing/) page.

## Creating your Amazon EFS transfer location
<a name="create-efs-location-how-to"></a>

To create the transfer location, you need an existing Amazon EFS file system. If you don't have one, see [Getting started with Amazon EFS](https://docs.aws.amazon.com/efs/latest/ug/getting-started.html) in the *Amazon Elastic File System User Guide*.

### Using the DataSync console
<a name="create-efs-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon EFS file system**.

   You configure this location as a source or destination later. 

1. For **File system**, choose the Amazon EFS file system that you want to use as a location.

1. For **Mount path**, enter a mount path for your Amazon EFS file system.

   This specifies where DataSync reads or writes data (depending on whether this is a source or destination location) on your file system.

   By default, DataSync uses the root directory (or [access point](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html) if you provide one for the **EFS access point** setting). You can also specify subdirectories using forward slashes (for example, `/path/to/directory`).

1. For **Subnet**, choose a subnet where you want DataSync to create the [network interfaces](required-network-interfaces.md) for managing your data transfer traffic.

   The subnet must be located:
   + In the same VPC as your file system.
   + In the same Availability Zone as at least one file system mount target.
**Note**  
You don't need to specify a subnet that includes a file system mount target.

1. For **Security groups**, choose the security group associated with your Amazon EFS file system's mount target. You can choose more than one security group.
**Note**  
The security groups that you specify must allow inbound traffic on NFS port 2049. For more information, see [Determining the subnet and security groups for your mount target](#create-efs-location-mount-target).

1. For **In-transit encryption**, choose whether you want DataSync to use Transport Layer Security (TLS) encryption when it transfers data to or from your file system.
**Note**  
You must enable this setting to configure an access point, IAM role, or both with your Amazon EFS location.

1. (Optional) For **EFS access point**, choose an access point that DataSync can use to mount your file system.

   For more information, see [Accessing restricted file systems](#create-efs-location-iam).

1. (Optional) For **IAM role**, specify a role that allows DataSync to access your file system.

   For information on creating this role, see [Creating a DataSync IAM role for file system access](#create-efs-location-iam-role).

1. (Optional) Select **Add tag** to tag your file system.

   A *tag* is a key-value pair that helps you manage, filter, and search for your locations. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-efs-cli"></a>

1. Copy the following `create-location-efs` command:

   ```
   aws datasync create-location-efs \
       --efs-filesystem-arn 'arn:aws:elasticfilesystem:region:account-id:file-system/file-system-id' \
       --subdirectory /path/to/your/subdirectory \
       --ec2-config SecurityGroupArns='arn:aws:ec2:region:account-id:security-group/security-group-id',SubnetArn='arn:aws:ec2:region:account-id:subnet/subnet-id' \
       --in-transit-encryption TLS1_2 \
       --access-point-arn 'arn:aws:elasticfilesystem:region:account-id:access-point/access-point-id' \
       --file-system-access-role-arn 'arn:aws:iam::account-id:role/datasync-efs-access-role'
   ```

1. For `--efs-filesystem-arn`, specify the Amazon Resource Name (ARN) of the Amazon EFS file system that you're transferring to or from.

1. For `--subdirectory`, specify a mount path for your file system.

   This is where DataSync reads or writes data (depending on whether this is a source or destination location) on your file system.

   By default, DataSync uses the root directory (or [access point](https://docs.aws.amazon.com/efs/latest/ug/efs-access-points.html) if you provide one with `--access-point-arn`). You can also specify subdirectories using forward slashes (for example, `/path/to/directory`).

1. For `--ec2-config`, do the following:
   + For `SecurityGroupArns`, specify the ARN of the security group associated with your file system's mount target. You can specify more than one security group.
**Note**  
The security groups that you specify must allow inbound traffic on NFS port 2049. For more information, see [Determining the subnet and security groups for your mount target](#create-efs-location-mount-target).
   + For `SubnetArn`, specify the ARN of the subnet where you want DataSync to create the [network interfaces](required-network-interfaces.md) for managing your data transfer traffic.

     The subnet must be located:
     + In the same VPC as your file system.
     + In the same Availability Zone as at least one file system mount target.
**Note**  
You don't need to specify a subnet that includes a file system mount target.

1. For `--in-transit-encryption`, specify whether you want DataSync to use Transport Layer Security (TLS) encryption when it transfers data to or from your file system.
**Note**  
You must set this to `TLS1_2` to configure an access point, IAM role, or both with your Amazon EFS location.

1. (Optional) For `--access-point-arn`, specify the ARN of an access point that DataSync can use to mount your file system.

   For more information, see [Accessing restricted file systems](#create-efs-location-iam).

1. (Optional) For `--file-system-access-role-arn`, specify the ARN of an IAM role that allows DataSync to access your file system.

   For information on creating this role, see [Creating a DataSync IAM role for file system access](#create-efs-location-iam-role).

1. Run the `create-location-efs` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:111222333444:location/loc-0b3017fc4ba4a2d8d"
   }
   ```

# Configuring transfers with FSx for Windows File Server
<a name="create-fsx-location"></a>

To transfer data to or from your Amazon FSx for Windows File Server file system, you must create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to FSx for Windows File Server file systems
<a name="create-fsx-location-access"></a>

DataSync connects to your FSx for Windows File Server file system with the Server Message Block (SMB) protocol and mounts it from your virtual private cloud (VPC) using [network interfaces](required-network-interfaces.md).

**Note**  
VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.

**Topics**
+ [

### Required permissions
](#create-fsx-windows-location-permissions)
+ [

### Required authentication protocols
](#configuring-fsx-windows-authentication-protocols)
+ [

### DFS Namespaces
](#configuring-fsx-windows-location-dfs)

### Required permissions
<a name="create-fsx-windows-location-permissions"></a>

You must provide DataSync with a user that has the rights needed to mount and access your FSx for Windows File Server files, folders, and file metadata.

We recommend that this user belong to a Microsoft Active Directory group for administering your file system. The specifics of this group depend on your Active Directory setup:
+ If you're using AWS Directory Service for Microsoft Active Directory with FSx for Windows File Server, the user must be a member of the **AWS Delegated FSx Administrators** group.
+ If you're using self-managed Active Directory with FSx for Windows File Server, the user must be a member of one of two groups:
  + The **Domain Admins** group, which is the default delegated administrators group.
  + A custom delegated administrators group with user rights that allow DataSync to copy object ownership permissions and Windows access control lists (ACLs).
**Important**  
You can't change the delegated administrators group after the file system has been deployed. You must either redeploy the file system or restore it from a backup to use a custom delegated administrators group with the user rights that DataSync needs to copy metadata.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-fsx-location.html)
+ If you want to copy Windows ACLs and are transferring between an SMB file server and FSx for Windows File Server file system or between FSx for Windows File Server file systems, the users that you provide DataSync must belong to the same Active Directory domain or have an Active Directory trust relationship between their domains.

**Warning**  
Your FSx for Windows File Server file system's SYSTEM user must have **Full control** permissions on all folders in your file system. Don't change the NTFS ACL permissions for this user on your folders. If you do, DataSync can change your file system's permissions in a way that makes your file share inaccessible and prevents file system backups from being usable. For more information on file- and folder-level access, see the *[Amazon FSx for Windows File Server User Guide](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/limit-access-file-folder.html)*.

### Required authentication protocols
<a name="configuring-fsx-windows-authentication-protocols"></a>

Your FSx for Windows File Server file system must use NTLM authentication for DataSync to access it. DataSync can't access a file server that uses Kerberos authentication.

### DFS Namespaces
<a name="configuring-fsx-windows-location-dfs"></a>

DataSync doesn't support Microsoft Distributed File System (DFS) Namespaces. We recommend specifying an underlying file server or share instead when creating your DataSync location.

For more information, see [Grouping multiple file systems with DFS Namespaces](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/group-file-systems.html) in the *Amazon FSx for Windows File Server User Guide*.

## Creating your FSx for Windows File Server transfer location
<a name="create-fsx-location-how-to"></a>

Before you begin, make sure that you have an existing FSx for Windows File Server file system in your AWS Region. For more information, see [Getting started with Amazon FSx](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/getting-started.html) in the *Amazon FSx for Windows File Server User Guide*.

### Using the DataSync console
<a name="create-fsx-location-access-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon FSx**.

1. For **FSx file system**, choose the FSx for Windows File Server file system that you want to use as a location.

1. For **Share name**, enter a mount path for your FSx for Windows File Server file system using forward slashes.

   This specifies the path where DataSync reads or writes data (depending on whether this is a source or destination location).

   You can also include subdirectories (for example, `/path/to/directory`).

1. For **Security groups**, choose up to five Amazon EC2 security groups that provide access to your file system's preferred subnet.

   The security groups that you choose must be able to communicate with your file system's security groups. For information about configuring security groups for file system access, see [File System Access Control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/limit-access-security-groups.html) in the *Amazon FSx for Windows File Server User Guide*.
**Note**  
If you choose a security group that doesn't allow connections from within itself, do one of the following:  
Configure the security group to allow it to communicate within itself.
Choose a different security group that can communicate with the mount target's security group.

1. For **User**, enter the name of a user that can access your FSx for Windows File Server.

   For more information, see [Required permissions](#create-fsx-windows-location-permissions).

1. For **Password**, enter the password of the user that you specified.

1. (Optional) For **Domain**, enter the name of the Windows domain that your FSx for Windows File Server file system belongs to.

   If you have multiple Active Directory domains in your environment, configuring this setting makes sure that DataSync connects to the right file system.

1. (Optional) Enter values for the **Key** and **Value** fields to tag the FSx for Windows File Server.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-fsx-cli"></a>

**To create an FSx for Windows File Server location by using the AWS CLI**
+ Use the following command to create an Amazon FSx location.

  ```
  aws datasync create-location-fsx-windows \
      --fsx-filesystem-arn arn:aws:fsx:region:account-id:file-system/filesystem-id \
      --security-group-arns arn:aws:ec2:region:account-id:security-group/group-id \
      --user smb-user --password password
  ```

  In the `create-location-fsx-windows` command, do the following:
  + `fsx-filesystem-arn` – Specify the Amazon Resource Name (ARN) of the file system that you want to transfer to or from.
  + `security-group-arns` – Specify the ARNs of up to five Amazon EC2 security groups that provide access to your file system's preferred subnet.

    The security groups that you specify must be able to communicate with your file system's security groups. For information about configuring security groups for file system access, see [File System Access Control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/WindowsGuide/limit-access-security-groups.html) in the *Amazon FSx for Windows File Server User Guide*.
**Note**  
If you choose a security group that doesn't allow connections from within itself, do one of the following:  
Configure the security group to allow it to communicate within itself.
Choose a different security group that can communicate with the mount target's security group.
  + The AWS Region – Make sure that the Region in each ARN is the one where your FSx for Windows File Server file system is located.

The preceding command returns a location ARN similar to the following.

```
{ 
    "LocationArn": "arn:aws:datasync:us-west-2:111222333444:location/loc-07db7abfc326c50fb" 
}
```

# Configuring DataSync transfers with FSx for Lustre
<a name="create-lustre-location"></a>

To transfer data to or from your Amazon FSx for Lustre file system, you must create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to FSx for Lustre file systems
<a name="create-lustre-location-access"></a>

DataSync accesses your FSx for Lustre file system using the Lustre client. DataSync requires access to all data on your FSx for Lustre file system. To have this level of access, DataSync mounts your file system as the root user using a user ID (UID) and group ID (GID) of `0`.

DataSync mounts your file system from your virtual private cloud (VPC) using [network interfaces](required-network-interfaces.md). DataSync fully manages the creation, the use, and the deletion of these network interfaces on your behalf.

**Note**  
VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.

## Creating your FSx for Lustre transfer location
<a name="create-lustre-location-how-to"></a>

To create the transfer location, you need an existing FSx for Lustre file system. For more information, see [Getting started with Amazon FSx for Lustre](https://docs.aws.amazon.com/fsx/latest/LustreGuide/getting-started.html) in the *Amazon FSx for Lustre User Guide*.

### Using the DataSync console
<a name="create-lustre-location-how-to-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon FSx**.

   You configure this location as a source or destination later. 

1. For **FSx file system**, choose the FSx for Lustre file system that you want to use as a location. 

1. For **Mount path**, enter the mount path for your FSx for Lustre file system.

   The path can include a subdirectory. When the location is used as a source, DataSync reads data from the mount path. When the location is used as a destination, DataSync writes all data to the mount path. If a subdirectory isn't provided, DataSync uses the root directory (`/`).

1. For **Security groups**, choose up to five security groups that provide access to your FSx for Lustre file system.

   The security groups must be able to access the file system's ports. The file system must also allow access from the security groups.

   For more information about security groups, see [File System Access Control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/LustreGuide/limit-access-security-groups.html) in the *Amazon FSx for Lustre User Guide*.

1. (Optional) Enter values for the **Key** and **Value** fields to tag the FSx for Lustre file system.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-location-lustre-cli"></a>

**To create an FSx for Lustre location by using the AWS CLI**
+ Use the following command to create an FSx for Lustre location.

  ```
  aws datasync create-location-fsx-lustre \
      --fsx-filesystem-arn arn:aws:fsx:region:account-id:file-system/filesystem-id \
      --security-group-arns arn:aws:ec2:region:account-id:security-group/group-id
  ```

  The following parameters are required in the `create-location-fsx-lustre` command.
  + `fsx-filesystem-arn` – The fully qualified Amazon Resource Name (ARN) of the file system that you want to read from or write to.
  + `security-group-arns` – The ARNs of up to five Amazon EC2 security groups that provide access to the [network interfaces](required-network-interfaces.md) in your file system's preferred subnet.

The preceding command returns a location ARN similar to the following.

```
{
    "LocationArn": "arn:aws:datasync:us-west-2:111222333444:location/loc-07db7abfc326c50fb"
}
```

# Configuring DataSync transfers with Amazon FSx for OpenZFS
<a name="create-openzfs-location"></a>

To transfer data to or from your Amazon FSx for OpenZFS file system, you must create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to FSx for OpenZFS file systems
<a name="create-openzfs-access"></a>

DataSync mounts your FSx for OpenZFS file system from your virtual private cloud (VPC) using [network interfaces](required-network-interfaces.md). DataSync fully manages the creation, the use, and the deletion of these network interfaces on your behalf.

**Note**  
VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.

## Configuring FSx for OpenZFS file system authorization
<a name="configure-openzfs-authorization"></a>

DataSync accesses your FSx for OpenZFS file system as an NFS client, mounting the file system as a root user with a user ID (UID) and group ID (GID) of `0`.

For DataSync to copy all of your file metadata, you must configure the NFS export settings on your file system volumes using `no_root_squash`. However, you can limit this level of access to only a specific DataSync task.

For more information, see [Volume properties](https://docs.aws.amazon.com/fsx/latest/OpenZFSGuide/managing-volumes.html#volume-properties) in the *Amazon FSx for OpenZFS User Guide*.

### Configuring NFS exports specific to DataSync (recommended)
<a name="configure-nfs-export-recommended"></a>

You can configure an NFS export specific to each volume that’s accessed only by your DataSync task. Do this for the most recent ancestor volume of the mount path that you specify when creating your FSx for OpenZFS location.

**To configure an NFS export specific to DataSync**

1. Create your [DataSync task](create-task-how-to.md).

   This creates the task’s network interfaces that you specify in your NFS export settings.

1. Locate the private IP addresses of the task's network interfaces by using the Amazon EC2 console or AWS CLI.

1. For your FSx for OpenZFS file system volume, configure the following NFS export settings for each of the task’s network interfaces:
   + **Client address**: Enter the network interface’s private IP address (for example, `10.24.34.0`).
   + **NFS options**: Enter `rw,no_root_squash`.
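For step 2 of this procedure, one way to find the private IP addresses with the AWS CLI is sketched below. The task ARN and network interface ID are hypothetical placeholders:

```
# Hypothetical example: list the network interface ARNs of a DataSync task.
# The task ARN is a placeholder.
aws datasync describe-task \
    --task-arn arn:aws:datasync:us-west-2:111222333444:task/task-0123456789abcdef0 \
    --query '[SourceNetworkInterfaceArns, DestinationNetworkInterfaceArns]'

# Using the network interface IDs from those ARNs, look up the private IPs.
aws ec2 describe-network-interfaces \
    --network-interface-ids eni-0123456789abcdef0 \
    --query 'NetworkInterfaces[].PrivateIpAddress'
```

Use each returned IP address as a **Client address** in your volume's NFS export settings.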

### Configuring NFS exports for all clients
<a name="configure-nfs-export-general"></a>

You can specify an NFS export that allows root access to all clients.

**To configure an NFS export for all clients**
+ For your FSx for OpenZFS file system volume, configure the following NFS export settings:
  + **Client address**: Enter `*`.
  + **NFS options**: Enter `rw,no_root_squash`.

## Creating your FSx for OpenZFS transfer location
<a name="create-openzfs-location-how-to"></a>

To create the location, you need an existing FSx for OpenZFS file system. If you don't have one, see [Getting started with Amazon FSx for OpenZFS](https://docs.aws.amazon.com/fsx/latest/OpenZFSGuide/getting-started.html) in the *Amazon FSx for OpenZFS User Guide*.

### Using the DataSync console
<a name="create-openzfs-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Locations**, and then choose **Create location**.

1. For **Location type**, choose **Amazon FSx**.

   You configure this location as a source or destination later.

1. For **FSx file system**, choose the FSx for OpenZFS file system that you want to use as a location. 

1. For **Mount path**, enter the mount path for your FSx for OpenZFS file system. 

   The path must begin with `/fsx` and can be any existing directory path in the file system. When the location is used as a source, DataSync reads data from the mount path. When the location is used as a destination, DataSync writes all data to the mount path. If a subdirectory isn't provided, DataSync uses the root volume directory (for example, `/fsx`).

1. For **Security groups**, choose up to five security groups that provide network access to your FSx for OpenZFS file system. 

   The security groups must provide access to the network ports that are used by the FSx for OpenZFS file system. The file system must allow network access from the security groups.

   For more information about security groups, see [File system access control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/OpenZFSGuide/limit-access-security-groups.html) in the *Amazon FSx for OpenZFS User Guide*.

1. (Optional) Expand **Additional settings** and for **NFS version** choose the NFS version that DataSync uses to access your file system.

   By default, DataSync uses NFS version 4.1.

1. (Optional) Enter values for the **Key** and **Value** fields to tag the FSx for OpenZFS file system.

   Tags help you manage, filter, and search for your location. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="create-openzfs-location-cli"></a>

**To create an FSx for OpenZFS location by using the AWS CLI**

1. Copy the following `create-location-fsx-open-zfs` command:

   ```
   aws datasync create-location-fsx-open-zfs \
      --fsx-filesystem-arn arn:aws:fsx:region:account-id:file-system/filesystem-id \
      --security-group-arns arn:aws:ec2:region:account-id:security-group/group-id \
      --protocol NFS={}
   ```

1. Specify the following required options in the command:
   + For `fsx-filesystem-arn`, specify the location file system's fully qualified Amazon Resource Name (ARN). This includes the AWS Region where your file system resides, your AWS account, and the file system ID.
   + For `security-group-arns`, specify the ARN of the Amazon EC2 security group that provides access to the [network interfaces](required-network-interfaces.md) of your FSx for OpenZFS file system's preferred subnet. This includes the AWS Region where your Amazon EC2 instance resides, your AWS account, and the security group ID.

     For more information about security groups, see [File System Access Control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/OpenZFSGuide/limit-access-security-groups.html) in the *Amazon FSx for OpenZFS User Guide*.
   + For `protocol`, specify the protocol that DataSync uses to access your file system. (DataSync currently supports only NFS.)

1. Run the command. You get a response showing the location that you just created.

   ```
   { 
       "LocationArn": "arn:aws:datasync:us-west-2:123456789012:location/loc-abcdef01234567890" 
   }
   ```
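If you need DataSync to use a specific NFS version rather than the default, you can extend the `protocol` option with mount options. The following is a sketch with placeholder ARNs; it assumes the `NFS4_1` value of the DataSync API's NFS mount options:

```
# Hypothetical example: pin the NFS version that DataSync uses to mount the
# file system. All ARNs are placeholders.
aws datasync create-location-fsx-open-zfs \
    --fsx-filesystem-arn arn:aws:fsx:us-west-2:123456789012:file-system/fs-0123456789abcdef0 \
    --security-group-arns arn:aws:ec2:us-west-2:123456789012:security-group/sg-0123456789abcdef0 \
    --protocol 'NFS={MountOptions={Version=NFS4_1}}'
```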

# Configuring transfers with Amazon FSx for NetApp ONTAP
<a name="create-ontap-location"></a>

To transfer data to or from your Amazon FSx for NetApp ONTAP file system, you must create an AWS DataSync transfer *location*. DataSync can use this location as a source or destination for transferring data.

## Providing DataSync access to FSx for ONTAP file systems
<a name="create-ontap-location-access"></a>

To access an FSx for ONTAP file system, DataSync mounts a storage virtual machine (SVM) on your file system using [network interfaces](required-network-interfaces.md) in your virtual private cloud (VPC). DataSync creates these network interfaces in your file system’s preferred subnet only when you create a task that includes your FSx for ONTAP location.

**Note**  
VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.

DataSync can connect to an FSx for ONTAP file system's SVM and copy data by using the Network File System (NFS) or Server Message Block (SMB) protocol.

**Topics**
+ [

### Using the NFS protocol
](#create-ontap-location-supported-protocols)
+ [

### Using the SMB protocol
](#create-ontap-location-smb)
+ [

### Unsupported protocols
](#create-ontap-location-unsupported-protocols)
+ [

### Choosing the right protocol
](#create-ontap-location-choosing-protocol)
+ [

### Accessing SnapLock volumes
](#create-ontap-location-snaplock)

### Using the NFS protocol
<a name="create-ontap-location-supported-protocols"></a>

With the NFS protocol, DataSync uses the `AUTH_SYS` security mechanism with a user ID (UID) and group ID (GID) of `0` to authenticate with your SVM.

**Note**  
DataSync currently supports only NFS version 3 with FSx for ONTAP locations.

### Using the SMB protocol
<a name="create-ontap-location-smb"></a>

With the SMB protocol, DataSync uses credentials that you provide to authenticate with your SVM.

**Supported SMB versions**  
By default, DataSync automatically chooses a version of the SMB protocol based on negotiation with your SMB file server. You also can configure DataSync to use a specific version, but we recommend doing this only if DataSync has trouble negotiating with the SMB file server automatically. For security reasons, we recommend using SMB version 3.0.2 or later.  
See the following table for a list of options in the DataSync console and API for configuring an SMB version with your FSx for ONTAP location:      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-ontap-location.html)

**Required permissions**  
You must provide DataSync with a local user in your SVM or a domain user in your Microsoft Active Directory that has the rights needed to mount and access your files, folders, and file metadata.  
If you provide a user in your Active Directory, note the following:  
+ If you're using AWS Directory Service for Microsoft Active Directory, the user must be a member of the **AWS Delegated FSx Administrators** group.
+ If you're using a self-managed Active Directory, the user must be a member of one of two groups:
  + The **Domain Admins** group, which is the default delegated administrators group.
  + A custom delegated administrators group with user rights that allow DataSync to copy object ownership permissions and Windows access control lists (ACLs).
**Important**  
You can't change the delegated administrators group after the file system has been deployed. You must either redeploy the file system or restore it from a backup to use a custom delegated administrators group with the user rights that DataSync needs to copy metadata.    
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/create-ontap-location.html)
+ If you want to copy Windows ACLs and are transferring between FSx for ONTAP file systems using SMB (or other types of file systems using SMB), the users that you provide DataSync must belong to the same Active Directory domain or have an Active Directory trust relationship between their domains.

**Required authentication protocols**  
For DataSync to access your SMB share, your FSx for ONTAP file system must use NTLM authentication. DataSync can't access FSx for ONTAP file systems that use Kerberos authentication.

**DFS Namespaces**  
DataSync doesn't support Microsoft Distributed File System (DFS) Namespaces. We recommend specifying an underlying file server or share instead when creating your DataSync location.

### Unsupported protocols
<a name="create-ontap-location-unsupported-protocols"></a>

DataSync can't access FSx for ONTAP file systems using the iSCSI (Internet Small Computer Systems Interface) protocol.

### Choosing the right protocol
<a name="create-ontap-location-choosing-protocol"></a>

To preserve file metadata in FSx for ONTAP migrations, configure your DataSync source and destination locations to use the same protocol. Between the supported protocols, SMB preserves metadata with the highest fidelity (see [Understanding how DataSync handles file and object metadata](metadata-copied.md) for details).

When migrating from a Unix (Linux) server or network-attached storage (NAS) share that serves users through NFS, do the following:

1. [Create an NFS location](create-nfs-location.md) for the Unix (Linux) server or NAS share. (This will be your source location.)

1. Configure the FSx for ONTAP volume you’re transferring data to with the [Unix security style](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/managing-volumes.html#volume-security-style).

1. Create a location for your FSx for ONTAP file system that’s configured for NFS. (This will be your destination location.)

When migrating from a Windows server or NAS share that serves users through SMB, do the following:

1. [Create an SMB location](create-smb-location.md) for the Windows server or NAS share. (This will be your source location.)

1. Configure the FSx for ONTAP volume you’re transferring data to with the [NTFS security style](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/managing-volumes.html#volume-security-style).

1. Create a location for your FSx for ONTAP file system that’s configured for SMB. (This will be your destination location.)

If your FSx for ONTAP environment uses multiple protocols, we recommend working with an AWS storage specialist. To learn about best practices for multiprotocol access, see [Enabling multiprotocol workloads with Amazon FSx for NetApp ONTAP](https://aws.amazon.com/blogs/storage/enabling-multiprotocol-workloads-with-amazon-fsx-for-netapp-ontap/).

### Accessing SnapLock volumes
<a name="create-ontap-location-snaplock"></a>

If you're transferring data to a [SnapLock volume](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/snaplock.html) on an FSx for ONTAP file system, make sure the SnapLock settings **Autocommit** and **Volume append mode** are disabled on the volume during your transfer. You can re-enable these settings when you're done transferring data.

## Creating your FSx for ONTAP transfer location
<a name="create-ontap-location-how-to"></a>

To create the location, you need an existing FSx for ONTAP file system. If you don't have one, see [Getting started with Amazon FSx for NetApp ONTAP](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/getting-started.html) in the *Amazon FSx for NetApp ONTAP User Guide*.

### Using the DataSync console
<a name="create-ontap-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Amazon FSx**.

   You configure this location as a source or destination later.

1. For **FSx file system**, choose the FSx for ONTAP file system that you want to use as a location.

1. For **Storage virtual machine**, choose a storage virtual machine (SVM) in your file system where you want to copy data to or from.

1. For **Mount path**, specify a path to the file share in that SVM where you'll copy your data.

   You can specify a junction path (also known as a mount point), qtree path (for NFS file shares), or share name (for SMB file shares). For example, your mount path might be `/vol1`, `/vol1/tree1`, or `/share1`.
**Tip**  
Don't specify a path in the SVM's root volume. For more information, see [Managing FSx for ONTAP storage virtual machines](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/managing-svms.html) in the *Amazon FSx for NetApp ONTAP User Guide*.

1. For **Security groups**, choose up to five Amazon EC2 security groups that provide access to your file system's preferred subnet.

   The security groups must allow outbound traffic on the following ports (depending on the protocol you use):
   + **NFS** – TCP ports 111, 635, and 2049 
   + **SMB** – TCP port 445

   Your file system's security groups must also allow inbound traffic on the same ports.

1. For **Protocol**, choose the data transfer protocol that DataSync uses to access your file system's SVM.

   For more information, see [Choosing the right protocol](#create-ontap-location-choosing-protocol).

------
#### [ NFS ]

   DataSync uses NFS version 3.

------
#### [ SMB ]

   Configure an SMB version, user, password, and Active Directory domain name (if needed) to access the SVM.
   + (Optional) Expand **Additional settings** and choose an **SMB version** for DataSync to use when accessing your SVM.

     By default, DataSync automatically chooses a version based on negotiation with the SMB file server. For more information, see [Using the SMB protocol](#create-ontap-location-smb).
   + For **User**, enter a user name that can mount and access the files, folders, and metadata that you want to transfer in the SVM.

     For more information, see [Using the SMB protocol](#create-ontap-location-smb).
   + For **Password**, enter the password of the user that you specified that can access the SVM.
   + (Optional) For **Active Directory domain name**, enter the fully qualified domain name (FQDN) of the Active Directory that your SVM belongs to.

     If you have multiple domains in your environment, configuring this setting makes sure that DataSync connects to the right SVM.

------

1. (Optional) Enter values for the **Key** and **Value** fields to tag the FSx for ONTAP file system.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
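If the security groups that you chose in the steps above don't yet allow the required ports, you can open them with the AWS CLI. The following sketch uses placeholder security group IDs and covers the main NFS data port; repeat it for TCP ports 111 and 635 (or use TCP port 445 for SMB):

```
# Hypothetical example: allow inbound NFS traffic (TCP 2049) on the file
# system's security group from the security group that DataSync uses.
# Both group IDs are placeholders.
aws ec2 authorize-security-group-ingress \
    --group-id sg-0123456789abcdef0 \
    --protocol tcp \
    --port 2049 \
    --source-group sg-0fedcba9876543210
```

Make sure that the DataSync security group also allows the matching outbound traffic.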

### Using the AWS CLI
<a name="create-ontap-location-cli"></a>

**To create an FSx for ONTAP location by using the AWS CLI**

1. Copy the following `create-location-fsx-ontap` command:

   ```
   aws datasync create-location-fsx-ontap \
      --storage-virtual-machine-arn arn:aws:fsx:region:account-id:storage-virtual-machine/fs-file-system-id \
      --security-group-arns arn:aws:ec2:region:account-id:security-group/group-id \
      --protocol data-transfer-protocol={}
   ```

1. Specify the following required options in the command:
   + For `storage-virtual-machine-arn`, specify the fully qualified Amazon Resource Name (ARN) of a storage virtual machine (SVM) in your file system where you want to copy data to or from.

     This ARN includes the AWS Region where your file system resides, your AWS account, and the file system and SVM IDs.
   + For `security-group-arns`, specify the ARNs of the Amazon EC2 security groups that provide access to the [network interfaces](required-network-interfaces.md) of your file system's preferred subnet.

     This includes the AWS Region where your Amazon EC2 instance resides, your AWS account, and your security group IDs. You can specify up to five security group ARNs.

     For more information about security groups, see [File System Access Control with Amazon VPC](https://docs.aws.amazon.com/fsx/latest/ONTAPGuide/limit-access-security-groups.html) in the *Amazon FSx for NetApp ONTAP User Guide*.
   + For `protocol`, configure the protocol that DataSync uses to access your file system's SVM.
     + For NFS, you can use the default configuration:

       `--protocol NFS={}`
     + For SMB, you must specify a user name and password that can access the SVM:

       `--protocol SMB={User=smb-user,Password=smb-password}`

1. Run the command.

   You get a response that shows the location that you just created.

   ```
   { 
       "LocationArn": "arn:aws:datasync:us-west-2:123456789012:location/loc-abcdef01234567890" 
   }
   ```

# Transferring to or from other cloud storage with AWS DataSync
<a name="transferring-other-cloud-storage"></a>

With AWS DataSync, you can transfer data between some other cloud providers and AWS storage services. For more information, see [Where can I transfer my data with DataSync?](working-with-locations.md)

**Topics**
+ [

# Planning transfers to or from third-party cloud storage systems
](third-party-cloud-transfer-considerations.md)
+ [

# Configuring AWS DataSync transfers with Google Cloud Storage
](tutorial_transfer-google-cloud-storage.md)
+ [

# Configuring transfers with Microsoft Azure Blob Storage
](creating-azure-blob-location.md)
+ [

# Configuring AWS DataSync transfers with Microsoft Azure Files SMB shares
](transferring-azure-files.md)
+ [

# Configuring transfers with other cloud object storage
](creating-other-cloud-object-location.md)

# Planning transfers to or from third-party cloud storage systems
<a name="third-party-cloud-transfer-considerations"></a>

When planning cross-cloud data transfers, consider the following:
+ **Using an agent:** An agent is required to access storage in other clouds only when you use Basic mode tasks. [Enhanced mode tasks](https://docs.aws.amazon.com/datasync/latest/userguide/choosing-task-mode.html) don't require an agent. If you use an agent, you can deploy it as an [Amazon EC2 instance](https://docs.aws.amazon.com/datasync/latest/userguide/deploy-agents.html#ec2-deploy-agent) when transferring from a cloud provider's S3-compatible object storage, or as a Google Compute Engine VM or Azure virtual machine when transferring from those respective services. When transferring from file systems in Google Cloud or Azure, we recommend deploying the agent as a Google Cloud or Azure VM so that the agent is as close to the file system as possible. Additionally, DataSync compresses the data in transit from the agent to AWS, which can help reduce egress costs. DataSync provides a list of [validated cloud locations](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html) that provide the required [Amazon S3 API compatibility](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html#other-cloud-access).
+ **The other cloud’s object storage endpoint:** The storage endpoint for a third-party cloud provider is typically Region-specific or account-specific. You use this endpoint as the server value in your DataSync object storage location, along with the name of the bucket that you're transferring to or from.
+ **Storage classes of the source objects:** Like Amazon S3, some cloud providers support an archive tier that requires a restore before being able to access the archived objects. For example, objects in the Azure Blob archive tier must be retrieved for standard access prior to a data transfer. Objects in the Google Cloud Storage archive tier can be accessed immediately and do not require restore, but there are retrieval costs associated with direct archive tier access. Review your cross-cloud storage class documentation to determine access requirements and retrieval fees prior to beginning your data transfer. For more information about restoring archived objects in Amazon S3, see [Restoring an archived object](https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html) in the *Amazon Simple Storage Service User Guide*.
+ **Object storage access:** Transferring data between third-party cloud providers requires access to the other cloud's object storage in the form of authentication keys. For example, to provide access to Google Cloud Storage, you configure a DataSync object storage location that connects to the [Google Cloud Storage XML API](https://cloud.google.com/storage/docs/xml-api/overview) and authenticates using a [Hash-based Message Authentication Code (HMAC) key](https://docs.aws.amazon.com/datasync/latest/userguide/tutorial_transfer-google-cloud-storage.html#transfer-google-cloud-storage-create-hmac-key) for your service account. For Azure Blob storage, you configure a dedicated [Azure Blob DataSync location](https://docs.aws.amazon.com/datasync/latest/userguide/creating-azure-blob-location.html#creating-azure-blob-location-how-to) that authenticates using [SAS tokens](https://docs.aws.amazon.com/datasync/latest/userguide/creating-azure-blob-location.html#azure-blob-access). DataSync uses AWS Secrets Manager to securely store your object storage credentials. For more information, see [Securing storage location credentials](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).
+ **Object tag support:**
  + Unlike Amazon S3, not all cloud providers support [object tags](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html). DataSync tasks can fail while attempting to read tags from the source location if the cloud provider does not support object tags through the Amazon S3 API, or if the credentials you provide are insufficient to retrieve the tags. DataSync provides a task option to turn off [reading and copying object tags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-ObjectTags) during a transfer if object tags are not supported, or you don't want to retain the tags. Review your cloud provider documentation to determine if object tags are supported, and verify your transfer task's object tag settings before initiating the transfer.
  + You can use the Amazon S3 API to check whether a cloud provider responds to a `get-object-tagging` request. For more information, see [get-object-tagging](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/get-object-tagging.html) in the *AWS CLI Command Reference*.

    A cloud provider that supports object tags will return a response similar to the following example:

    ```
    aws s3api get-object-tagging --bucket BUCKET_NAME --endpoint-url https://BUCKET_ENDPOINT --key prefix/file1

    {
        "TagSet": []
    }
    ```

    A cloud provider that doesn’t support `get-object-tagging` will return the following message:

    ```
    aws s3api get-object-tagging --bucket BUCKET_NAME --endpoint-url https://BUCKET_ENDPOINT --key prefix/file1

    An error occurred (OperationNotSupported) when calling the GetObjectTagging operation: The operation is not supported for this resource
    ```
+ **Associated costs for requests and data egress:** Transferring data from cloud object storage has [request and egress costs](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html#other-cloud-considerations-costs) associated with reading data and data transfer out. Request charges vary between cloud providers and between storage classes where applicable. Consult your cloud provider documentation regarding specific costs for requests relative to the storage class you plan to read from. For an overview of request charges that DataSync makes for data transfers, see [Evaluating S3 request costs when using DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html#create-s3-location-s3-requests) and [AWS DataSync pricing](https://aws.amazon.com/datasync/pricing/). Transferring data out of specific cloud providers results in egress charges. Data transfer costs vary between cloud providers and are also dependent on the region where the data is stored.
+ **Object storage request rates:** Cloud providers have various performance and request rate characteristics for their object storage platforms. Review your other cloud provider's request rates and determine where the request limits are applied. Plan ahead for highly parallelized transfers consisting of multiple agents, where specific partitioning or performance increases might be required.

  Amazon S3 has documented request rates that you can build your solution around. Amazon S3 request rates are per partitioned prefix and are scalable across multiple prefixes. For more information, see [Best practices design patterns: optimizing Amazon S3 performance](https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html) in the *Amazon Simple Storage Service User Guide*.
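If your provider doesn't support object tags, you can turn off tag copying when you create your DataSync task. The following sketch uses placeholder location ARNs:

```shell
# Hypothetical ARNs; creates a task that skips reading and copying object tags
# by setting the ObjectTags task option to NONE.
aws datasync create-task \
    --source-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-abcdef01234567890 \
    --destination-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-1234567890abcdef0 \
    --options ObjectTags=NONE
```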

# Configuring AWS DataSync transfers with Google Cloud Storage
<a name="tutorial_transfer-google-cloud-storage"></a>

With AWS DataSync, you can transfer data between Google Cloud Storage and the following AWS storage services:
+ Amazon S3
+ Amazon EFS
+ Amazon FSx for Windows File Server
+ Amazon FSx for Lustre
+ Amazon FSx for OpenZFS
+ Amazon FSx for NetApp ONTAP

To begin the transfer setup, create a location for your Google Cloud Storage. This location can serve as either your transfer source or destination. A DataSync agent is required only when you transfer data between Google Cloud Storage and Amazon EFS or Amazon FSx, or when using **Basic mode** tasks. **Enhanced mode** data transfers between Google Cloud Storage and Amazon S3 don't require an agent.

**Note**  
For private cloud connectivity between Google Cloud Storage and AWS, use Basic mode with agents.

## Overview
<a name="transfer-google-cloud-storage-overview"></a>

DataSync uses the [Google Cloud Storage XML API](https://cloud.google.com/storage/docs/xml-api/overview) for data transfers. This API provides an Amazon S3 compatible interface for reading and writing data with Google Cloud Storage buckets.
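Because the XML API is Amazon S3 compatible, you can try this interface yourself by pointing the AWS CLI at the Google endpoint with your HMAC credentials. The bucket name and key values in this sketch are placeholders:

```shell
# Hypothetical HMAC credentials and bucket; lists objects in a Google Cloud
# Storage bucket through its S3-compatible XML API endpoint.
AWS_ACCESS_KEY_ID=GOOG1EXAMPLEHMACID \
AWS_SECRET_ACCESS_KEY=example-hmac-secret \
aws s3api list-objects \
    --bucket my-gcs-bucket \
    --endpoint-url https://storage.googleapis.com
```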

When you use Basic mode for transfers, you can deploy the agent in Google Cloud or in your Amazon VPC.

------
#### [ Agent in Google Cloud ]

1. You deploy a DataSync agent in your Google Cloud environment.

1. The agent reads your Google Cloud Storage bucket by using a Hash-based Message Authentication Code (HMAC) key.

1. The objects from your Google Cloud Storage bucket transfer securely through TLS 1.3 into the AWS Cloud by using a public endpoint.

1. The DataSync service writes the data to your S3 bucket.

The following diagram illustrates the transfer.

![\[An example DataSync transfer shows how object data transfers from a Google Cloud Storage bucket to an S3 bucket. First, the DataSync agent is deployed in your Google Cloud environment. Then, the DataSync agent reads the Google Cloud Storage bucket. The data moves securely through a public endpoint into AWS, where DataSync writes the objects to an S3 bucket in the same AWS Region where you're using DataSync.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/diagram-transfer-google-cloud-storage-public.png)


------
#### [ Agent in your VPC ]

1. You deploy a DataSync agent in a virtual private cloud (VPC) in your AWS environment.

1. The agent reads your Google Cloud Storage bucket by using a Hash-based Message Authentication Code (HMAC) key.

1. The objects from your Google Cloud Storage bucket transfer securely through TLS 1.3 into the AWS Cloud by using a private VPC endpoint.

1. The DataSync service writes the data to your S3 bucket.

The following diagram illustrates the transfer.

![\[An example DataSync transfer shows how object data transfers from a Google Cloud Storage bucket to an S3 bucket. First, the DataSync agent is deployed in a VPC in AWS. Then, the DataSync agent reads the Google Cloud Storage bucket. The data moves securely through a VPC endpoint into AWS, where DataSync writes the objects to an S3 bucket in the same AWS Region as the VPC.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/diagram-transfer-google-cloud-storage.png)


------

## Costs
<a name="transfer-google-cloud-storage-cost"></a>

The fees associated with this migration might include:
+ Running a Google [Compute Engine](https://cloud.google.com/compute/all-pricing) virtual machine (VM) instance (if you deploy your DataSync agent in Google Cloud)
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance (if you deploy your DataSync agent in a VPC within AWS)
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to [Google Cloud Storage](https://cloud.google.com/storage/pricing) and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is one of your transfer locations)
+ Transferring data out of [Google Cloud Storage](https://cloud.google.com/storage/pricing)
+ Storing data in [Amazon S3](https://aws.amazon.com/s3/pricing/)

## Prerequisites
<a name="transfer-google-cloud-storage-prerequisites"></a>

Before you begin, do the following if you haven’t already:
+ [Create a Google Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) with the objects that you want to transfer to AWS.
+ [Sign up for an AWS account](https://portal.aws.amazon.com/billing/signup).
+ [Create an Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) for storing your objects after they're in AWS.

## Step 1: Create an HMAC key for your Google Cloud Storage bucket
<a name="transfer-google-cloud-storage-create-hmac-key"></a>

DataSync uses an HMAC key that's associated with your Google service account to authenticate with and read the bucket that you’re transferring data from. (For detailed instructions on how to create HMAC keys, see the [Google Cloud Storage documentation](https://cloud.google.com/storage/docs/authentication/hmackeys).)

**To create an HMAC key**

1. Create an HMAC key for your Google service account.

1. Make sure that your Google service account has at least `Storage Object Viewer` permissions.

1. Save your HMAC key's access ID and secret in a secure location.

   You'll need these items later to configure your DataSync source location.
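If you use the Google Cloud CLI, the steps above can be sketched as follows. The service account and bucket names are hypothetical:

```shell
# Grant the service account read access to the bucket, then create an HMAC key
# for that service account. The command output includes the access ID and secret.
gcloud storage buckets add-iam-policy-binding gs://my-gcs-bucket \
    --member=serviceAccount:datasync-reader@my-project.iam.gserviceaccount.com \
    --role=roles/storage.objectViewer
gcloud storage hmac create datasync-reader@my-project.iam.gserviceaccount.com
```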

## Step 2: Configure your network
<a name="transfer-google-cloud-storage-configure-network"></a>

Network configuration is required only when using a DataSync agent with your transfer. The network requirements for this migration depend on where you choose to deploy your agent.

### For a DataSync agent in Google Cloud
<a name="transfer-google-cloud-storage-configure-public"></a>

If you want to host your DataSync agent in Google Cloud, configure your network to [allow DataSync transfers through a public endpoint](datasync-network.md#using-public-endpoints).

### For a DataSync agent in your VPC
<a name="transfer-google-cloud-storage-configure-vpc"></a>

If you want to host your agent in AWS, you need a VPC with an interface endpoint. DataSync uses the VPC endpoint to facilitate the transfer.

**To configure your network for a VPC endpoint**

1. If you don't have one, [create a VPC](https://docs.aws.amazon.com/vpc/latest/userguide/working-with-vpcs.html#Create-VPC) in the same AWS Region as your S3 bucket.

1. [Create a private subnet for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/create-subnets.html).

1. [Create a VPC service endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) for DataSync.

1. Configure your network to [allow DataSync transfers through a VPC service endpoint](datasync-network.md#using-vpc-endpoint).

   To do this, modify the [security group](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html) that's associated with your VPC service endpoint.
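Step 3 of this procedure can be sketched with the AWS CLI as follows. The VPC, subnet, and security group IDs are placeholders, and the Region in the service name must match your VPC's Region:

```shell
# Hypothetical IDs; creates an interface endpoint for DataSync in your VPC.
aws ec2 create-vpc-endpoint \
    --vpc-endpoint-type Interface \
    --vpc-id vpc-0123456789abcdef0 \
    --subnet-ids subnet-0123456789abcdef0 \
    --security-group-ids sg-0123456789abcdef0 \
    --service-name com.amazonaws.us-east-1.datasync
```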

## Step 3: Create a DataSync agent (optional)
<a name="transfer-google-cloud-storage-create-agent"></a>

A DataSync agent is required only when using **Basic** mode tasks. If you use **Enhanced** mode to transfer between Google Cloud Storage and Amazon S3, no agent is required. If you use **Basic** mode, you need a DataSync agent that can access your Google Cloud Storage bucket.

### For Google Cloud
<a name="transfer-google-cloud-storage-choose-endpoint"></a>

In this scenario, the DataSync agent runs in your Google Cloud environment.

**Before you begin**: [Install the Google Cloud CLI](https://cloud.google.com/sdk/docs/install).

**To create the agent for Google Cloud**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, then choose **Create agent**.

1. For **Hypervisor**, choose **VMware ESXi**, then choose **Download the image** to download a `.zip` file that contains the agent.

1. Open a terminal. Unzip the image by running the following command:

   ```
   unzip AWS-DataSync-Agent-VMWare.zip
   ```

1. Extract the contents of the agent's `.ova` file beginning with `aws-datasync` by running the following command:

   ```
   tar -xvf aws-datasync-2.0.1655755445.1-x86_64.xfs.gpt.ova
   ```

1. Import the agent's `.vmdk` file into Google Cloud by running the following Google Cloud CLI command:

   ```
   gcloud compute images import aws-datasync-2-test \
      --source-file aws-datasync-2.0.1655755445.1-x86_64.xfs.gpt-disk1.vmdk \
      --os centos-7
   ```
**Note**  
Importing the `.vmdk` file might take up to two hours.

1. Create and start a VM instance for the agent image that you just imported. 

   The instance needs the following configurations for your agent. (For detailed instructions on how to create an instance, see the [Google Cloud Compute Engine documentation](https://cloud.google.com/compute/docs/instances).)
   + For the machine type, choose one of the following:
     + **e2-standard-8** – For DataSync task executions working with up to 20 million objects.
     + **e2-standard-16** – For DataSync task executions working with more than 20 million objects.
   + For the boot disk settings, go to the custom images section. Then choose the DataSync agent image that you just imported.
   + For the service account setting, choose your Google service account (the same account that you used in [Step 1](#transfer-google-cloud-storage-create-hmac-key)).
   + For the firewall setting, choose the option to allow HTTP (port 80) traffic.

     To activate your DataSync agent, port 80 must be open on the agent. The port doesn't need to be publicly accessible. Once activated, DataSync closes the port.

1. After the VM instance is running, take note of its public IP address.

   You'll need this IP address to activate the agent.

1. Go back to the DataSync console. On the **Create agent** screen where you downloaded the agent image, do the following to activate your agent:
   + For **Endpoint type**, choose the public service endpoints option (for example, **Public service endpoints in US East Ohio**).
   + For **Activation key**, choose **Automatically get the activation key from your agent**.
   + For **Agent address**, enter the public IP address of the agent VM instance that you just created.
   + Choose **Get key**.

1. Give your agent a name, and then choose **Create agent**.

Your agent is online and ready to transfer data.
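The VM instance configuration described in step 7 can be sketched with the Google Cloud CLI. The instance name, image name, and service account here are placeholders, and the sketch assumes your project has a firewall rule that allows HTTP traffic to instances tagged `http-server`:

```shell
# Hypothetical names; launches the imported agent image on an e2-standard-8
# machine with read-only storage scope and HTTP (port 80) open for activation.
gcloud compute instances create datasync-agent \
    --machine-type=e2-standard-8 \
    --image=aws-datasync-2-test \
    --service-account=datasync-reader@my-project.iam.gserviceaccount.com \
    --scopes=storage-ro \
    --tags=http-server
```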

### For your VPC
<a name="transfer-google-cloud-storage-deploy-agent"></a>

In this scenario, the agent runs as an Amazon EC2 instance in a VPC that's associated with your AWS account.

**Before you begin**: [Set up the AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html).

**To create the agent for your VPC**

1. Open a terminal. Make sure to configure your AWS CLI profile to use the account that's associated with your S3 bucket.

1. Copy the following command. Replace `vpc-region` with the AWS Region where your VPC resides (for example, `us-east-1`).

   ```
   aws ssm get-parameter --name /aws/service/datasync/ami --region vpc-region
   ```

1. Run the command. In the output, take note of the `"Value"` property.

   This value is the DataSync Amazon Machine Image (AMI) ID of the Region that you specified. For example, an AMI ID could look like `ami-1234567890abcdef0`.

1. Copy the following URL. Again, replace `vpc-region` with the AWS Region where your VPC resides. Then, replace `ami-id` with the AMI ID that you noted in the previous step.

   ```
   https://console.aws.amazon.com/ec2/v2/home?region=vpc-region#LaunchInstanceWizard:ami=ami-id
   ```

1. Paste the URL into a browser.

   The Amazon EC2 instance launch page in the AWS Management Console displays.

1. For **Instance type**, choose one of the [recommended Amazon EC2 instances for DataSync agents](agent-requirements.md#ec2-instance-types).

1. For **Key pair**, choose an existing key pair, or create a new one.

1. For **Network settings**, choose the VPC and subnet where you want to deploy the agent.

1. Choose **Launch instance**.

1. Once the Amazon EC2 instance is running, [choose your VPC endpoint](choose-service-endpoint.md#datasync-in-vpc).

1. [Activate your agent](activate-agent.md).
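Steps 2 and 3 of this procedure can be collapsed into one command that prints only the AMI ID. Replace `us-east-1` with your VPC's Region:

```shell
# Prints just the DataSync AMI ID for the specified Region.
aws ssm get-parameter \
    --name /aws/service/datasync/ami \
    --region us-east-1 \
    --query 'Parameter.Value' \
    --output text
```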

## Step 4: Create a DataSync source location for your Google Cloud Storage bucket
<a name="transfer-google-cloud-storage-create-source"></a>

To set up a DataSync location for your Google Cloud Storage bucket, you need the access ID and secret for the HMAC key that you created in [Step 1](#transfer-google-cloud-storage-create-hmac-key).

**To create the DataSync source location**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

1. For **Server**, enter **storage.googleapis.com**.

1. For **Bucket name**, enter the name of your Google Cloud Storage bucket.

1. For **Folder**, enter an object prefix.

   DataSync only copies objects with this prefix.

1. If your transfer requires an agent, choose **Use agents**, then choose the agent that you created in [Step 3](#transfer-google-cloud-storage-create-agent).

1. Expand **Additional settings**. For **Server protocol**, choose **HTTPS**. For **Server port**, choose **443**.

1. Scroll down to the **Authentication** section. Make sure that the **Requires credentials** check box is selected, and then do the following:
   + For **Access key**, enter your HMAC key's access ID.
   + For **Secret key**, either enter your HMAC key's secret key directly, or specify an AWS Secrets Manager secret that contains the key. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

1. Choose **Create location**.
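If you prefer the AWS CLI, the same source location can be sketched with `create-location-object-storage`. The agent ARN, bucket name, and HMAC values are placeholders; omit `--agent-arns` if your transfer doesn't require an agent:

```shell
# Hypothetical values; creates an object storage location that points to a
# Google Cloud Storage bucket through its S3-compatible endpoint.
aws datasync create-location-object-storage \
    --server-hostname storage.googleapis.com \
    --server-protocol HTTPS \
    --server-port 443 \
    --bucket-name my-gcs-bucket \
    --access-key GOOG1EXAMPLEHMACID \
    --secret-key example-hmac-secret \
    --agent-arns arn:aws:datasync:us-east-1:111122223333:agent/agent-0123456789abcdef0
```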

## Step 5: Create a DataSync destination location for your S3 bucket
<a name="transfer-google-cloud-storage-create-destination"></a>

You also need a DataSync location that specifies where you want your data to end up.

**To create the DataSync destination location**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. [Create a DataSync location for the S3 bucket](create-s3-location.md).

   If you deployed the DataSync agent in your VPC, this tutorial assumes that the S3 bucket is in the same AWS Region as your VPC and DataSync agent. 
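As a sketch, you can also create the destination location with the AWS CLI. The bucket name and IAM role ARN are placeholders, and the role must allow DataSync to write to the bucket:

```shell
# Hypothetical ARNs; creates an S3 location that DataSync writes to using the
# specified bucket access role.
aws datasync create-location-s3 \
    --s3-bucket-arn arn:aws:s3:::amzn-s3-demo-destination-bucket \
    --s3-config BucketAccessRoleArn=arn:aws:iam::111122223333:role/datasync-s3-access
```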

## Step 6: Create and start a DataSync task
<a name="transfer-google-cloud-storage-start-task"></a>

With your source and destination locations configured, you can start moving your data into AWS.

**To create and start the DataSync task**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. On the **Configure source location** page, do the following:

   1. Choose **Choose an existing location**.

   1. Choose the source location that you created in [Step 4](#transfer-google-cloud-storage-create-source), then choose **Next**.

1. On the **Configure destination location** page, do the following:

   1. Choose **Choose an existing location**.

   1. Choose the destination location that you created in [Step 5](#transfer-google-cloud-storage-create-destination), then choose **Next**.

1. On the **Configure settings** page, do the following:

   1. Under **Data transfer configuration**, expand **Additional settings** and clear the **Copy object tags** check box.
**Important**  
Because the Google Cloud Storage XML API does not support reading or writing object tags, your DataSync task might fail if you try to copy object tags.

   1. Configure any other task settings that you want, and then choose **Next**.

1. On the **Review** page, review your settings, and then choose **Create task**.

1. On the task's details page, choose **Start**, and then choose one of the following:
   + To run the task without modification, choose **Start with defaults**.
   + To modify the task before running it, choose **Start with overriding options**.

When your task finishes, you'll see the objects from your Google Cloud Storage bucket in your S3 bucket.
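The console steps above map to two AWS CLI calls, sketched here with placeholder ARNs. Setting `ObjectTags=NONE` corresponds to clearing the **Copy object tags** check box:

```shell
# Hypothetical ARNs; creates the task with object tag copying turned off,
# then starts a task execution.
aws datasync create-task \
    --source-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-abcdef01234567890 \
    --destination-location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-1234567890abcdef0 \
    --options ObjectTags=NONE
aws datasync start-task-execution \
    --task-arn arn:aws:datasync:us-east-1:111122223333:task/task-0123456789abcdef0
```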

# Configuring transfers with Microsoft Azure Blob Storage
<a name="creating-azure-blob-location"></a>

With AWS DataSync, you can transfer data between Microsoft Azure Blob Storage (including Azure Data Lake Storage Gen2 blob storage) and the following AWS storage services:
+ [Amazon S3](create-s3-location.md)
+ [Amazon EFS](create-efs-location.md)
+ [Amazon FSx for Windows File Server](create-fsx-location.md)
+ [Amazon FSx for Lustre](create-lustre-location.md)
+ [Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Amazon FSx for NetApp ONTAP](create-ontap-location.md)

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your Azure Blob Storage. You can use this location as a transfer source or destination. A DataSync agent is required only when transferring data between Azure Blob and Amazon EFS or Amazon FSx, or when using **Basic** mode tasks. You don't need an agent to transfer data between Azure Blob and Amazon S3 using **Enhanced** mode.

## Providing DataSync access to your Azure Blob Storage
<a name="azure-blob-access"></a>

How DataSync accesses your Azure Blob Storage depends on several factors, including whether you're transferring to or from blob storage and what kind of [shared access signature (SAS) token](#azure-blob-sas-tokens) you're using. Your objects also must be in an [access tier](#azure-blob-access-tiers) that DataSync can work with.

**Topics**
+ [

### SAS tokens
](#azure-blob-sas-tokens)
+ [

### Access tiers
](#azure-blob-access-tiers)

### SAS tokens
<a name="azure-blob-sas-tokens"></a>

A SAS token specifies the access permissions for your blob storage. (For more information about SAS, see the [Azure Blob Storage documentation](https://learn.microsoft.com/azure/storage/common/storage-sas-overview).)

You can generate SAS tokens to provide different levels of access. DataSync supports tokens with the following access levels:
+ Account
+ Container

The access permissions that DataSync needs depend on the scope of your token. Not having the correct permissions can cause your transfer to fail. For example, your transfer won't succeed if you're moving objects with tags to Azure Blob Storage but your SAS token doesn't have tag permissions.

**Topics**
+ [

#### SAS token permissions for account-level access
](#account-sas-tokens)
+ [

#### SAS token permissions for container-level access
](#container-sas-tokens)
+ [

#### SAS expiration policies
](#azure-blob-sas-expiration-policies)

#### SAS token permissions for account-level access
<a name="account-sas-tokens"></a>

DataSync needs an account-level access token with the following permissions (depending on whether you're transferring to or from Azure Blob Storage).

------
#### [ Transfers from blob storage ]
+ **Allowed services** – Blob
+ **Allowed resource types** – Container, Object

  If you don't include these permissions, DataSync can't transfer your object metadata, including [object tags](#azure-blob-considerations-object-tags).
+ **Allowed permissions** – Read, List
+ **Allowed blob index permissions** – Read/Write (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))

------
#### [ Transfers to blob storage ]
+ **Allowed services** – Blob
+ **Allowed resource types** – Container, Object

  If you don't include these permissions, DataSync can't transfer your object metadata, including [object tags](#azure-blob-considerations-object-tags).
+ **Allowed permissions** – Read, Write, List, Delete (if you want DataSync to remove files that aren't in your transfer source)
+ **Allowed blob index permissions** – Read/Write (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))

------

#### SAS token permissions for container-level access
<a name="container-sas-tokens"></a>

DataSync needs a container-level access token with the following permissions (depending on whether you're transferring to or from Azure Blob Storage).

------
#### [ Transfers from blob storage ]
+ Read
+ List
+ Tag (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))
**Note**  
You can't add the tag permission when generating a SAS token in the Azure portal. To add the tag permission, instead generate the token by using the [Microsoft Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer) app or generate a [SAS token that provides account-level access](#account-sas-tokens).

------
#### [ Transfers to blob storage ]
+ Read
+ Write
+ List
+ Delete (if you want DataSync to remove files that aren't in your transfer source)
+ Tag (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))
**Note**  
You can't add the tag permission when generating a SAS token in the Azure portal. To add the tag permission, instead generate the token by using the [Microsoft Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer) app or generate a [SAS token that provides account-level access](#account-sas-tokens).

------

#### SAS expiration policies
<a name="azure-blob-sas-expiration-policies"></a>

Make sure that your SAS doesn't expire before you expect to finish your transfer. For information about configuring a SAS expiration policy, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/sas-expiration-policy).

If the SAS expires during the transfer, DataSync can no longer access your Azure Blob Storage location. (You might see a "Failed to open directory" error.) If this happens, [update your location](#azure-blob-update-location) with a new SAS token and restart your DataSync task.
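For example, you might refresh the token with the AWS CLI, as in this sketch. The location ARN and SAS token are placeholders:

```shell
# Hypothetical values; updates the Azure Blob location with a new SAS token.
aws datasync update-location-azure-blob \
    --location-arn arn:aws:datasync:us-east-1:111122223333:location/loc-abcdef01234567890 \
    --sas-configuration Token='sv=2021-06-08&ss=b&srt=co&sp=rl&sig=EXAMPLE'
```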

### Access tiers
<a name="azure-blob-access-tiers"></a>

When transferring from Azure Blob Storage, DataSync can copy objects in the hot and cool tiers. For objects in the archive access tier, you must rehydrate those objects to the hot or cool tier before you can copy them.

When transferring to Azure Blob Storage, DataSync can copy objects into the hot, cool, and archive access tiers. If you're copying objects into the archive access tier, DataSync can't verify the transfer if you're trying to [verify all data in the destination](configure-data-verification-options.md).

DataSync doesn't support the cold access tier. For more information about access tiers, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview?tabs=azure-portal).

## Considerations with Azure Blob Storage transfers
<a name="azure-blob-considerations"></a>

When planning to transfer data to or from Azure Blob Storage with DataSync, there are some things to keep in mind.

**Topics**
+ [

### Costs
](#azure-blob-considerations-costs)
+ [

### Blob types
](#blob-types)
+ [

### AWS Region availability
](#azure-blob-considerations-regions)
+ [

### Copying object tags
](#azure-blob-considerations-object-tags)
+ [

### Transferring to Amazon S3
](#azure-blob-considerations-s3)
+ [

### Deleting directories in a transfer destination
](#azure-blob-considerations-deleted-files)
+ [

### Limitations
](#azure-blob-limitations)

### Costs
<a name="azure-blob-considerations-costs"></a>

The fees associated with moving data in or out of Azure Blob Storage can include:
+ Running an [Azure virtual machine (VM)](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) (if you deploy a DataSync agent in Azure)
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance (if you deploy a DataSync agent in a VPC within AWS)
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to [Azure Blob Storage](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/) and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is one of your transfer locations)
+ Transferring data in or out of [Azure Blob Storage](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/)
+ Storing data in an [AWS storage service](working-with-locations.md) supported by DataSync

### Blob types
<a name="blob-types"></a>

How DataSync works with blob types depends on whether you're transferring to or from Azure Blob Storage. When you're moving data into blob storage, the objects or files that DataSync transfers can only be block blobs. When you're moving data out of blob storage, DataSync can transfer block, page, and append blobs.

For more information about blob types, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs).

### AWS Region availability
<a name="azure-blob-considerations-regions"></a>

You can create an Azure Blob Storage transfer location in any [AWS Region that's supported by DataSync](https://docs.aws.amazon.com/general/latest/gr/datasync.html#datasync-region).

### Copying object tags
<a name="azure-blob-considerations-object-tags"></a>

The ability for DataSync to preserve object tags when transferring to or from Azure Blob Storage depends on the following factors:
+ **The size of an object's tags** – DataSync can't transfer an object with tags that exceed 2 KB.
+ **Whether DataSync is configured to copy object tags** – DataSync [copies object tags](configure-metadata.md) by default.
+ **The namespace that your Azure storage account uses** – DataSync can copy object tags if your Azure storage account uses a flat namespace but not if your account uses a hierarchical namespace (a feature of Azure Data Lake Storage Gen2). Your DataSync task will fail if you try to copy object tags and your storage account uses a hierarchical namespace.
+ **Whether your SAS token authorizes tagging** – The permissions that you need to copy object tags vary depending on the level of access that your token provides. Your task will fail if you try to copy object tags and your token doesn't have the right permissions for tagging. For more information, check the permission requirements for [account-level access tokens](#account-sas-tokens) or [container-level access tokens](#container-sas-tokens).
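Because a task fails outright in some of these cases, it can help to pre-check your tag sets before enabling tag copying. The sketch below is a rough guard against the 2 KB limit; it assumes the limit is measured as the combined UTF-8 byte length of keys and values, which is an approximation rather than DataSync's documented accounting:

```python
def tag_set_size_bytes(tags: dict[str, str]) -> int:
    """Approximate an object's tag-set size as the combined UTF-8
    byte length of all tag keys and values."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in tags.items())

def exceeds_tag_limit(tags: dict[str, str], limit_bytes: int = 2048) -> bool:
    """Flag tag sets that would exceed the 2 KB transfer limit."""
    return tag_set_size_bytes(tags) > limit_bytes

print(exceeds_tag_limit({"project": "migration", "owner": "data-team"}))  # False
```

Objects that fail this guard are candidates for trimming their tags before the transfer, rather than discovering the failure mid-task.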

### Transferring to Amazon S3
<a name="azure-blob-considerations-s3"></a>

When transferring to Amazon S3, DataSync won't transfer Azure Blob Storage objects larger than 5 TB or objects with metadata larger than 2 KB.

### Deleting directories in a transfer destination
<a name="azure-blob-considerations-deleted-files"></a>

When transferring to Azure Blob Storage, DataSync can [remove objects in your blob storage that aren't present in your transfer source](configure-metadata.md). (You can configure this option by clearing the **Keep deleted files** setting in the DataSync console. Your [SAS token](#azure-blob-sas-tokens) must also have delete permissions.)

When you configure your transfer this way, DataSync won't delete directories in your blob storage if your Azure storage account uses a hierarchical namespace. In this case, you must manually delete the directories (for example, by using [Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-explorer)).

### Limitations
<a name="azure-blob-limitations"></a>

Remember the following limitations when transferring data to or from Azure Blob Storage:
+ DataSync [creates some directories](filtering.md#directories-ignored-during-transfers) in a location to help facilitate your transfer. If Azure Blob Storage is a destination location and your storage account uses a hierarchical namespace, you might notice task-specific subdirectories (such as `task-000011112222abcde`) in the `/.aws-datasync` folder. DataSync typically deletes these subdirectories following a transfer. If that doesn't happen, you can delete these task-specific directories yourself as long as a task isn't running.
+ DataSync doesn't support using a SAS token to access only a specific folder in your Azure Blob Storage container.
+ You can't provide DataSync a user delegation SAS token for accessing your blob storage.

## Creating your DataSync agent (optional)
<a name="azure-blob-creating-agent"></a>

A DataSync agent is required only when transferring data between Azure Blob Storage and Amazon EFS or Amazon FSx, or when using **Basic** mode tasks. You don't need an agent to transfer data between Azure Blob Storage and Amazon S3 by using **Enhanced** mode. This section describes how to deploy and activate an agent.

**Tip**  
Although you can deploy your agent on an Amazon EC2 instance, using a Microsoft Hyper-V agent might result in decreased network latency and more data compression. 

### Microsoft Hyper-V agents
<a name="azure-blob-creating-agent-hyper-v"></a>

You can deploy your DataSync agent directly in Azure with a Microsoft Hyper-V image.

**Tip**  
Before you continue, consider using a shell script that can help you deploy your Hyper-V agent in Azure more quickly. You can get more information and download the code on [GitHub](https://github.com/aws-samples/aws-datasync-deploy-agent-azure).  
If you use the script, you can skip ahead to the section about [Getting your agent's activation key](#azure-blob-creating-agent-hyper-v-3).

**Topics**
+ [

#### Prerequisites
](#azure-blob-creating-agent-hyper-v-0)
+ [

#### Downloading and preparing your agent
](#azure-blob-creating-agent-hyper-v-1)
+ [

#### Deploying your agent in Azure
](#azure-blob-creating-agent-hyper-v-2)
+ [

#### Getting your agent's activation key
](#azure-blob-creating-agent-hyper-v-3)
+ [

#### Activating your agent
](#azure-blob-creating-agent-hyper-v-4)

#### Prerequisites
<a name="azure-blob-creating-agent-hyper-v-0"></a>

To prepare your DataSync agent and deploy it in Azure, you must do the following:
+ Enable Hyper-V on your local machine.
+ Install [PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell?view=powershell-7.3&viewFallbackFrom=powershell-7.1) (including the Hyper-V Module).
+ Install the [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli).
+ Install [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json).

#### Downloading and preparing your agent
<a name="azure-blob-creating-agent-hyper-v-1"></a>

Download an agent from the DataSync console. Before you can deploy the agent in Azure, you must convert it to a fixed-size virtual hard disk (VHD). For more information, see the [Azure documentation](https://learn.microsoft.com/en-us/azure/virtual-machines/windows/prepare-for-upload-vhd-image).

**To download and prepare your agent**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Microsoft Hyper-V**, and then choose **Download the image**.

   The agent downloads in a `.zip` file that contains a `.vhdx` file.

1. Extract the `.vhdx` file on your local machine.

1. Open PowerShell and do the following:

   1. Copy the following `Convert-VHD` cmdlet:

      ```
      Convert-VHD -Path .\local-path-to-vhdx-file\aws-datasync-2.0.1686143940.1-x86_64.xfs.gpt.vhdx `
      -DestinationPath .\local-path-to-vhdx-file\aws-datasync-2.0.1686143940.1-x86_64.vhd -VHDType Fixed
      ```

   1. Replace each instance of `local-path-to-vhdx-file` with the location of the `.vhdx` file on your local machine.

   1. Run the command.

   Your agent is now a fixed-size VHD (with a `.vhd` file format) and ready to deploy in Azure.
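Before moving on, you can sanity-check that the conversion produced a fixed-size VHD. The helper below is a hypothetical sketch based on the published VHD footer layout: the footer occupies the file's final 512 bytes, starts with the `conectix` cookie, and stores disk type `2` for fixed disks:

```python
import struct

def vhd_is_fixed(path: str) -> bool:
    """Read the VHD footer (the final 512 bytes of the file) and check
    that the cookie is 'conectix' and the disk type field is 2 (fixed)."""
    with open(path, "rb") as f:
        f.seek(-512, 2)  # seek to the start of the footer
        footer = f.read(512)
    cookie = footer[0:8]
    disk_type = struct.unpack(">I", footer[60:64])[0]  # big-endian uint32
    return cookie == b"conectix" and disk_type == 2
```

If the check fails, rerun `Convert-VHD` with `-VHDType Fixed`.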

#### Deploying your agent in Azure
<a name="azure-blob-creating-agent-hyper-v-2"></a>

Deploying your DataSync agent in Azure involves:
+ Creating a managed disk in Azure
+ Uploading your agent to that managed disk
+ Attaching the managed disk to a Linux virtual machine

**To deploy your agent in Azure**

1. In PowerShell, go to the directory that contains your agent's `.vhd` file.

1. Run the `ls` command and save the `Length` value (for example, `85899346432`).

   This is the size of your agent image in bytes, which you need when creating a managed disk that can hold the image.

1. Do the following to create a managed disk:

   1. Copy the following Azure CLI command:

      ```
      az disk create -n your-managed-disk `
      -g your-resource-group `
      -l your-azure-region `
      --upload-type Upload `
      --upload-size-bytes agent-size-bytes `
      --sku standard_lrs
      ```

   1. Replace `your-managed-disk` with a name for your managed disk.

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-azure-region` with the Azure region where your resource group is located.

   1. Replace `agent-size-bytes` with the size of your agent image.

   1. Run the command.

   This command creates an empty managed disk with a [standard SKU](https://learn.microsoft.com/en-us/rest/api/storagerp/srp_sku_types) where you can upload your DataSync agent.

1. To generate a shared access signature (SAS) that allows write access to the managed disk, do the following:

   1. Copy the following Azure CLI command:

      ```
      az disk grant-access -n your-managed-disk `
      -g your-resource-group `
      --access-level Write `
      --duration-in-seconds 86400
      ```

   1. Replace `your-managed-disk` with the name of the managed disk that you created.

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Run the command.

      In the output, take note of the SAS URI. You need this URI when uploading the agent to Azure.

   The SAS allows you to write to the disk for up to 24 hours (86,400 seconds). This means that you have 24 hours to upload your agent to the managed disk.

1. To upload your agent to your managed disk in Azure, do the following:

   1. Copy the following `AzCopy` command:

      ```
      .\azcopy copy local-path-to-vhd-file sas-uri --blob-type PageBlob
      ```

   1. Replace `local-path-to-vhd-file` with the location of the agent's `.vhd` file on your local machine.

   1. Replace `sas-uri` with the SAS URI that you got when you ran the `az disk grant-access` command.

   1. Run the command.

1. When the agent upload finishes, revoke access to your managed disk. To do this, copy the following Azure CLI command:

   ```
   az disk revoke-access -n your-managed-disk -g your-resource-group
   ```

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-managed-disk` with the name of the managed disk that you created.

   1. Run the command.

1. Do the following to attach your managed disk to a new Linux VM:

   1. Copy the following Azure CLI command:

      ```
      az vm create --resource-group your-resource-group `
      --location your-azure-region `
      --name your-agent-vm `
      --size Standard_E4as_v4 `
      --os-type linux `
      --attach-os-disk your-managed-disk
      ```

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-azure-region` with the Azure region where your resource group is located.

   1. Replace `your-agent-vm` with a name for the VM that you can remember.

   1. Replace `your-managed-disk` with the name of the managed disk that you're attaching to the VM.

   1. Run the command.

You've deployed your agent. Before you can start configuring your data transfer, you must activate the agent.

#### Getting your agent's activation key
<a name="azure-blob-creating-agent-hyper-v-3"></a>

To manually get your DataSync agent's activation key, follow these steps. 

Alternatively, [DataSync can automatically get the activation key for you](activate-agent.md), but this approach requires some network configuration.

**To get your agent's activation key**

1. In the Azure portal, [enable boot diagnostics for the VM for your agent](https://learn.microsoft.com/en-us/azure/virtual-machines/boot-diagnostics) by choosing the **Enable with custom storage account** setting and specifying your Azure storage account.

   After you've enabled the boot diagnostics for your agent's VM, you can access your agent’s local console to get the activation key.

1. While still in the Azure portal, go to your VM and choose **Serial console**.

1. In the agent's local console, log in by using the following default credentials: 
   + **Username** – **admin**
   + **Password** – **password**

   We recommend changing the agent's password at some point. To do this in the agent's local console, enter **5** on the main menu, and then use the `passwd` command to change the password.

1. Enter **0** to get the agent's activation key.

1. Enter the AWS Region where you're using DataSync (for example, **us-east-1**).

1. Choose the [service endpoint](choose-service-endpoint.md) that the agent will use to connect with AWS. 

1. Save the value of the `Activation key` output. 

#### Activating your agent
<a name="azure-blob-creating-agent-hyper-v-4"></a>

After you have the activation key, you can finish creating your DataSync agent.

**To activate your agent**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Microsoft Hyper-V**.

1. For **Endpoint type**, choose the same type of service endpoint that you specified when you got your agent's activation key (for example, choose **Public service endpoints in *Region name***).

1. Configure your network to work with the service endpoint type that your agent is using. For service endpoint network requirements, see the following topics:
   + [VPC endpoints](datasync-network.md#using-vpc-endpoint)
   + [Public endpoints](datasync-network.md#using-public-endpoints)
   + [Federal Information Processing Standard (FIPS) endpoints](datasync-network.md#using-public-endpoints)

1. For **Activation key**, do the following:

   1. Choose **Manually enter your agent's activation key**.

   1. Enter the activation key that you got from the agent's local console.

1. Choose **Create agent**.

Your agent is ready to connect with your Azure Blob Storage. For more information, see [Creating your Azure Blob Storage transfer location](#creating-azure-blob-location-how-to).

### Amazon EC2 agents
<a name="azure-blob-creating-agent-ec2"></a>

You can deploy your DataSync agent on an Amazon EC2 instance.

**To create an Amazon EC2 agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent).

1. [Choose a service endpoint](choose-service-endpoint.md) that the agent uses to communicate with AWS.

   In this situation, we recommend using a virtual private cloud (VPC) service endpoint.

1. Configure your network to work with [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. [Activate the agent](https://docs.aws.amazon.com/datasync/latest/userguide/activate-agent.html).

## Creating your Azure Blob Storage transfer location
<a name="creating-azure-blob-location-how-to"></a>

You can configure DataSync to use your Azure Blob Storage as a transfer source or destination.

**Before you begin**  
Make sure that you know [how DataSync accesses Azure Blob Storage](#azure-blob-access) and works with [access tiers](#azure-blob-access-tiers) and [blob types](#blob-types). You also need a [DataSync agent](#azure-blob-creating-agent) that can connect to your Azure Blob Storage container.

### Using the DataSync console
<a name="creating-azure-blob-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Microsoft Azure Blob Storage**.

1. For **Container URL**, enter the URL of the container that's involved in your transfer.

1. (Optional) For **Access tier when used as a destination**, choose the [access tier](#azure-blob-access-tiers) that you want your objects or files transferred into.

1. For **Folder**, enter path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. If your transfer requires an agent, choose **Use agents**, then choose the DataSync agent that can connect with your Azure Blob Storage container.

1. For **SAS token**, provide the credentials that DataSync needs to access your blob storage. (Some public datasets on Azure Blob Storage don't require credentials.) You can enter a SAS token directly or specify an AWS Secrets Manager secret that contains the token. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

   Your SAS token is part of the SAS URI string that comes after your storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2023-12-20T14:54:52Z&se=2023-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=aBBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. (Optional) Enter values for the **Key** and **Value** fields to tag the location.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="creating-azure-blob-location-cli"></a>

1. Copy the following `create-location-azure-blob` command:

   ```
   aws datasync create-location-azure-blob \
     --container-url "https://path/to/container" \
     --authentication-type "SAS" \
     --sas-configuration '{
         "Token": "your-sas-token"
       }' \
     --agent-arns my-datasync-agent-arn \
     --subdirectory "/path/to/my/data" \
     --access-tier "access-tier-for-destination" \
     --tags [{"Key": "key1","Value": "value1"}]
   ```

1. For the `--container-url` parameter, specify the URL of the Azure Blob Storage container that's involved in your transfer.

1. For the `--authentication-type` parameter, specify `SAS`. If you're accessing a public dataset that doesn't require authentication, specify `NONE` instead.

1. For the `--sas-configuration` parameter's `Token` option, specify the SAS token that allows DataSync to access your blob storage. 

   You can also provide additional parameters for securing your keys using AWS Secrets Manager. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

   Your SAS token is part of the SAS URI string that comes after your storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2023-12-20T14:54:52Z&se=2023-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=aBBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. (Optional) For the `--agent-arns` parameter, specify the Amazon Resource Name (ARN) of the DataSync agent that can connect to your container.

   Here's an example agent ARN: `arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890aaabfb`

   You can specify more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For the `--subdirectory` parameter, specify path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. (Optional) For the `--access-tier` parameter, specify the [access tier](#azure-blob-access-tiers) (`HOT`, `COOL`, or `ARCHIVE`) that you want your objects or files transferred into.

   This parameter applies only when you're using this location as a transfer destination.

1. (Optional) For the `--tags` parameter, specify key-value pairs that can help you manage, filter, and search for your location.

   We recommend creating a name tag for your location.

1. Run the `create-location-azure-blob` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   { 
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh" 
   }
   ```
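When scripting the `create-location-azure-blob` call, a common slip is passing the full SAS URI instead of only the token. A small helper like the following (the account and container names are placeholders) extracts the portion after the question mark:

```python
from urllib.parse import urlsplit

def sas_token_from_uri(sas_uri: str) -> str:
    """Return only the token portion of a SAS URI (everything after
    the '?'), which is what the SAS configuration expects."""
    token = urlsplit(sas_uri).query
    if not token:
        raise ValueError("No query string found; expected a full SAS URI")
    return token

uri = ("https://myaccount.blob.core.windows.net/container-1"
       "?sp=r&se=2023-12-20T22:54:52Z&sig=EXAMPLE")
print(sas_token_from_uri(uri))  # sp=r&se=2023-12-20T22:54:52Z&sig=EXAMPLE
```

The same extraction applies when you later rotate the token with `update-location-azure-blob`.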

## Viewing your Azure Blob Storage transfer location
<a name="azure-blob-view-location"></a>

You can get details about the existing DataSync transfer location for your Azure Blob Storage.

### Using the DataSync console
<a name="azure-blob-view-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations**.

1. Choose your Azure Blob Storage location.

   You can see details about your location, including any DataSync transfer tasks that are using it.

### Using the AWS CLI
<a name="azure-blob-view-location-cli"></a>

1. Copy the following `describe-location-azure-blob` command:

   ```
   aws datasync describe-location-azure-blob \
     --location-arn "your-azure-blob-location-arn"
   ```

1. For the `--location-arn` parameter, specify the ARN for the Azure Blob Storage location that you created (for example, `arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh`).

1. Run the `describe-location-azure-blob` command.

   You get a response that shows you details about your location. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh",
       "LocationUri": "azure-blob://my-user.blob.core.windows.net/container-1",
       "AuthenticationType": "SAS",
       "Subdirectory": "/my/images",
       "AgentArns": ["arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890deadfb"]
   }
   ```

## Updating your Azure Blob Storage transfer location
<a name="azure-blob-update-location"></a>

If needed, you can modify your location's configuration (for example, to provide a new SAS token) by using the AWS CLI.

### Using the AWS CLI
<a name="azure-blob-update-location-cli"></a>

1. Copy the following `update-location-azure-blob` command:

   ```
   aws datasync update-location-azure-blob \
     --location-arn "your-azure-blob-location-arn" \
     --authentication-type "SAS" \
     --sas-configuration '{
         "Token": "your-sas-token"
       }' \
     --agent-arns my-datasync-agent-arn \
     --subdirectory "/path/to/my/data" \
     --access-tier "access-tier-for-destination"
   ```

1. For the `--location-arn` parameter, specify the ARN for the Azure Blob Storage location that you're updating (for example, `arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh`).

1. For the `--authentication-type` parameter, specify `SAS`.

1. For the `--sas-configuration` parameter's `Token` option, specify the SAS token that allows DataSync to access your blob storage. 

   The token is part of the SAS URI string that comes after the storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2022-12-20T14:54:52Z&se=2022-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=qCBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. For the `--agent-arns` parameter, specify the Amazon Resource Name (ARN) of the DataSync agent that you want to connect to your container.

   Here's an example agent ARN: `arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890aaabfb`

   You can specify more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For the `--subdirectory` parameter, specify path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. (Optional) For the `--access-tier` parameter, specify the [access tier](#azure-blob-access-tiers) (`HOT`, `COOL`, or `ARCHIVE`) that you want your objects to be transferred into.

   This parameter applies only when you're using this location as a transfer destination.

## Next steps
<a name="create-azure-blob-location-next-steps"></a>

After you finish creating a DataSync location for your Azure Blob Storage, you can continue setting up your transfer. Here are some next steps to consider:

1. If you haven't already, [create another location](working-with-locations.md) where you plan to transfer your data to or from your Azure Blob Storage.

1. Learn how DataSync [handles metadata and special files](metadata-copied.md), particularly if your transfer locations don't have a similar metadata structure.

1. Configure how your data gets transferred. For example, you can [transfer only a subset of your data](filtering.md) or delete files in your blob storage that aren't in your source location (as long as your [SAS token](#azure-blob-sas-tokens) has delete permissions).

1. [Start your transfer](run-task.md). 

# Configuring AWS DataSync transfers with Microsoft Azure Files SMB shares
<a name="transferring-azure-files"></a>

You can configure AWS DataSync to transfer data to or from a Microsoft Azure Files Server Message Block (SMB) share.

**Tip**  
For a full walkthrough on moving data from Azure Files SMB shares to AWS, see the [AWS Storage Blog](https://aws.amazon.com/blogs/storage/how-to-move-data-from-azure-files-smb-shares-to-aws-using-aws-datasync/).

## Providing DataSync access to SMB shares
<a name="configuring-smb-azure-files"></a>

DataSync connects to your SMB share using the SMB protocol and authenticates with credentials that you provide it.

**Topics**
+ [

### Supported SMB protocol versions
](#configuring-smb-version-azure-files)
+ [

### Required permissions
](#configuring-smb-permissions-azure-files)

### Supported SMB protocol versions
<a name="configuring-smb-version-azure-files"></a>

By default, DataSync automatically chooses a version of the SMB protocol based on negotiation with your SMB file server.

You can also configure DataSync to use a specific SMB version, but we recommend doing this only if DataSync has trouble negotiating with the SMB file server automatically. DataSync supports SMB versions 1.0 and later. For security reasons, we recommend using SMB version 3.0.2 or later. Earlier versions, such as SMB 1.0, contain known security vulnerabilities that attackers can exploit to compromise your data.

See the following table for a list of options in the DataSync console and API:


| Console option | API option | Description | 
| --- | --- | --- | 
| Automatic |  `AUTOMATIC`  |  DataSync and the SMB file server negotiate the highest version of SMB that they mutually support between 2.1 and 3.1.1. This is the default and recommended option. If you instead choose a specific version that your file server doesn't support, you may get an `Operation Not Supported` error.  | 
|  SMB 3.0.2  |  `SMB3`  |  Restricts the protocol negotiation to only SMB version 3.0.2.  | 
| SMB 2.1 |  `SMB2`  | Restricts the protocol negotiation to only SMB version 2.1. | 
| SMB 2.0 | `SMB2_0` | Restricts the protocol negotiation to only SMB version 2.0. | 
| SMB 1.0 | `SMB1` | Restricts the protocol negotiation to only SMB version 1.0. | 

### Required permissions
<a name="configuring-smb-permissions-azure-files"></a>

DataSync needs a user who has permission to mount and access your SMB location. This can be a local user on your Windows file server or a domain user that's defined in your Microsoft Active Directory.

To set object ownership, DataSync requires the `SE_RESTORE_NAME` privilege, which is usually granted to members of the built-in Active Directory groups **Backup Operators** and **Domain Admins**. Providing a user to DataSync with this privilege also helps ensure sufficient permissions to files, folders, and file metadata, except for NTFS system access control lists (SACLs).

Additional privileges are required to copy SACLs. Specifically, this requires the Windows `SE_SECURITY_NAME` privilege, which is granted to members of the **Domain Admins** group. If you configure your task to copy SACLs, make sure that the user has the required privileges. To learn more about configuring a task to copy SACLs, see [Configuring how to handle files, objects, and metadata](configure-metadata.md).

When you copy data between an SMB file server and an Amazon FSx for Windows File Server file system, the source and destination locations must belong to the same Microsoft Active Directory domain or have an Active Directory trust relationship between their domains.

## Creating your Azure Files transfer location by using the console
<a name="create-azure-files-smb-location-how-to"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Server Message Block (SMB)**.

   You configure this location as a source or destination later.

1. For **Agents**, choose one or more DataSync agents that you want to connect to your SMB share.

   If you choose more than one agent, make sure you understand using [multiple agents for a location](do-i-need-datasync-agent.md#multiple-agents).

1. For **SMB Server**, enter the Domain Name System (DNS) name or IP address of the SMB share that your DataSync agent will mount.
**Note**  
You can't specify an IP version 6 (IPv6) address.

1. For **Share name**, enter the name of the share exported by your SMB share where DataSync will read or write data.

   You can include a subdirectory in the share path (for example, `/path/to/subdirectory`). Make sure that other SMB clients in your network can also mount this path. 

   To copy all the data in the subdirectory, DataSync must be able to mount the SMB share and access all of its data. For more information, see [Required permissions](create-smb-location.md#configuring-smb-permissions).

1. (Optional) Expand **Additional settings** and choose an **SMB Version** for DataSync to use when accessing your SMB share.

   By default, DataSync automatically chooses a version based on negotiation with the SMB share. For information, see [Supported SMB versions](create-smb-location.md#configuring-smb-version).

1. For **User**, enter a user name that can mount your SMB share and has permission to access the files and folders involved in your transfer.

   For more information, see [Required permissions](create-smb-location.md#configuring-smb-permissions).

1. For **Password**, enter the password of the user who can mount your SMB share and has permission to access the files and folders involved in your transfer.

1. (Optional) For **Domain**, enter the Windows domain name that your SMB share belongs to.

   If you have multiple domains in your environment, configuring this setting makes sure that DataSync connects to the right share.

1. (Optional) Choose **Add tag** to tag your location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
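You can also create this location with the AWS CLI. The following sketch builds a `create-location-smb` command from hypothetical values (the agent ARN, server name, user, and domain are placeholders you must replace). The CLI also requires `--password`, which is deliberately left out here so that the secret doesn't end up in your shell history; see the AWS CLI reference for `create-location-smb` for all options.

```shell
# Hypothetical values -- replace with your own agent ARN, server, and user.
AGENT_ARN="arn:aws:datasync:us-east-1:123456789012:agent/agent-0example1234567890"
SMB_SERVER="smb-server.example.com"
SUBDIR="/path/to/subdirectory"
SMB_USER="datasync-user"
SMB_DOMAIN="EXAMPLE-DOMAIN"

# Build the command as a string first so you can review it before running it.
CMD="aws datasync create-location-smb \
  --agent-arns $AGENT_ARN \
  --server-hostname $SMB_SERVER \
  --subdirectory $SUBDIR \
  --user $SMB_USER \
  --domain $SMB_DOMAIN"
echo "$CMD"
```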

# Configuring transfers with other cloud object storage
<a name="creating-other-cloud-object-location"></a>

With AWS DataSync, you can transfer data between [AWS storage services](transferring-aws-storage.md) and the following cloud object storage providers:
+ [https://docs.wasabi.com/](https://docs.wasabi.com/)
+ [https://docs.digitalocean.com/](https://docs.digitalocean.com/)
+ [https://docs.oracle.com/iaas/Content/home.htm](https://docs.oracle.com/iaas/Content/home.htm)
+ [https://developers.cloudflare.com/r2/](https://developers.cloudflare.com/r2/)
+ [https://www.backblaze.com/docs/cloud-storage](https://www.backblaze.com/docs/cloud-storage)
+ [https://guide.ncloud-docs.com/docs/](https://guide.ncloud-docs.com/docs/)
+ [https://www.alibabacloud.com/help/en/oss/product-overview/what-is-oss](https://www.alibabacloud.com/help/en/oss/product-overview/what-is-oss)
+ [https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started-cloud-object-storage](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started-cloud-object-storage)
+ [https://help.lyvecloud.seagate.com/en/product-features.html](https://help.lyvecloud.seagate.com/en/product-features.html)

A DataSync agent is required only when transferring data between storage systems in other clouds and Amazon EFS or Amazon FSx, or when using **Basic** mode tasks. You don't need an agent to transfer data between storage systems in other clouds and Amazon S3 using **Enhanced** mode.

Regardless of whether you use an agent, you must also create a transfer [location](how-datasync-transfer-works.md#sync-locations) for your cloud object storage (specifically an **Object storage** location). DataSync can use this location as a source or destination for your transfer.
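As an illustrative sketch (not part of DataSync itself), the agent rule above reduces to a simple check on the AWS storage service and task mode:

```shell
# Illustrative helper: an agent is needed unless the AWS side of the
# transfer is Amazon S3 and the task runs in Enhanced mode.
needs_agent() {
  aws_service="$1"   # S3, EFS, or FSX
  task_mode="$2"     # ENHANCED or BASIC
  if [ "$aws_service" = "S3" ] && [ "$task_mode" = "ENHANCED" ]; then
    echo "no"
  else
    echo "yes"
  fi
}

needs_agent S3 ENHANCED   # no
needs_agent S3 BASIC      # yes
needs_agent EFS BASIC     # yes
```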

## Providing DataSync access to your other cloud object storage
<a name="other-cloud-access"></a>

How DataSync accesses your cloud object storage depends on several factors, including whether your storage is compatible with the Amazon S3 API and the permissions and credentials that DataSync needs to access your storage.

**Topics**
+ [

### Amazon S3 API compatibility
](#other-cloud-s3-compatibility)
+ [

### Storage permissions and endpoints
](#other-cloud-permissions)
+ [

### Storage credentials
](#other-cloud-credentials)

### Amazon S3 API compatibility
<a name="other-cloud-s3-compatibility"></a>

Your cloud object storage must be compatible with the following [Amazon S3 API operations](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations.html) for DataSync to connect to it:
+ `AbortMultipartUpload`
+ `CompleteMultipartUpload`
+ `CopyObject`
+ `CreateMultipartUpload`
+ `DeleteObject`
+ `DeleteObjects`
+ `DeleteObjectTagging`
+ `GetBucketLocation`
+ `GetObject`
+ `GetObjectTagging`
+ `HeadBucket`
+ `HeadObject`
+ `ListObjectsV2`
+ `PutObject`
+ `PutObjectTagging`
+ `UploadPart`
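If your provider documents which Amazon S3 API operations it supports, a quick script can show what's missing from the list above. This is an illustrative sketch; the supported-operations list that you pass in is an assumption you'd fill in from your provider's documentation.

```shell
# The Amazon S3 API operations that DataSync requires (from the list above).
REQUIRED="AbortMultipartUpload CompleteMultipartUpload CopyObject \
CreateMultipartUpload DeleteObject DeleteObjects DeleteObjectTagging \
GetBucketLocation GetObject GetObjectTagging HeadBucket HeadObject \
ListObjectsV2 PutObject PutObjectTagging UploadPart"

# Print each required operation that is absent from the supported list.
missing_ops() {
  supported=" $* "
  for op in $REQUIRED; do
    case "$supported" in
      *" $op "*) ;;       # supported
      *) echo "$op" ;;    # missing
    esac
  done
}

# Example: a provider that supports everything except the tagging operations.
missing_ops AbortMultipartUpload CompleteMultipartUpload CopyObject \
  CreateMultipartUpload DeleteObject DeleteObjects GetBucketLocation \
  GetObject HeadBucket HeadObject ListObjectsV2 PutObject UploadPart
```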

### Storage permissions and endpoints
<a name="other-cloud-permissions"></a>

You must configure the permissions that allow DataSync to access your cloud object storage. If your object storage is a source location, DataSync needs read and list permissions for the bucket that you're transferring data from. If your object storage is a destination location, DataSync needs read, list, write, and delete permissions for the bucket.

DataSync also needs an endpoint (or server) to connect to your storage. The following table describes the endpoints that DataSync can use to access other cloud object storage:


| Other cloud provider | Endpoint | 
| --- | --- | 
| Wasabi Cloud Storage |  `s3.region.wasabisys.com`  | 
| DigitalOcean Spaces |  `region.digitaloceanspaces.com`  | 
| Oracle Cloud Infrastructure Object Storage |  `namespace.compat.objectstorage.region.oraclecloud.com`  | 
|  Cloudflare R2 Storage  |  `account-id.r2.cloudflarestorage.com`  | 
|  Backblaze B2 Cloud Storage  |  `s3.region.backblazeb2.com`  | 
| NAVER Cloud Object Storage |  `region.object.ncloudstorage.com` (most regions)  | 
| Alibaba Cloud Object Storage Service | `region.aliyuncs.com` | 
| IBM Cloud Object Storage | `s3.region.cloud-object-storage.appdomain.cloud` | 
| Seagate Lyve Cloud | `s3.region.lyvecloud.seagate.com` | 

**Important**  
For details on how to configure bucket permissions and updated information on storage endpoints, see your cloud provider's documentation.
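A small helper can substitute your own values into the endpoint patterns from the table. This is an illustrative sketch; the `region` value (and, for providers not shown here, the namespace or account ID) is a placeholder from your own environment, and you should confirm the current pattern in your provider's documentation.

```shell
# Substitute a region into a few of the endpoint patterns from the table above.
endpoint_for() {
  provider="$1"; region="$2"
  case "$provider" in
    wasabi)       echo "s3.$region.wasabisys.com" ;;
    digitalocean) echo "$region.digitaloceanspaces.com" ;;
    backblaze)    echo "s3.$region.backblazeb2.com" ;;
    ibm)          echo "s3.$region.cloud-object-storage.appdomain.cloud" ;;
    *)            echo "unknown-provider" >&2; return 1 ;;
  esac
}

endpoint_for wasabi us-east-1     # s3.us-east-1.wasabisys.com
endpoint_for digitalocean nyc3    # nyc3.digitaloceanspaces.com
```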

### Storage credentials
<a name="other-cloud-credentials"></a>

DataSync also needs the credentials to access the object storage bucket involved in your transfer. This might be an access key and secret key or something similar depending on how your cloud storage provider refers to these credentials.

For more information, see your cloud provider's documentation.

## Considerations when transferring from other cloud object storage
<a name="other-cloud-considerations"></a>

When planning to transfer objects to or from another cloud storage provider by using DataSync, there are some things to keep in mind.

**Topics**
+ [

### Costs
](#other-cloud-considerations-costs)
+ [

### Storage classes
](#other-cloud-considerations-storage-classes)
+ [

### Object tags
](#other-cloud-considerations-object-tags)
+ [

### Transferring to Amazon S3
](#other-cloud-considerations-s3)

### Costs
<a name="other-cloud-considerations-costs"></a>

The fees associated with moving data in and out of another cloud storage provider can include:
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance for your DataSync agent
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to your cloud object storage and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is your transfer destination)
+ Transferring data in or out of your cloud storage (check your cloud provider's pricing)
+ Storing data in an [AWS storage service](transferring-aws-storage.md) supported by DataSync
+ Storing data in another cloud provider (check your cloud provider's pricing)

### Storage classes
<a name="other-cloud-considerations-storage-classes"></a>

Some cloud storage providers have storage classes (similar to [Amazon S3](create-s3-location.md#using-storage-classes)) whose objects DataSync can't read until they're restored. For example, Oracle Cloud Infrastructure Object Storage has an archive storage class. You must restore objects in that storage class before DataSync can transfer them. For more information, see your cloud provider's documentation.

### Object tags
<a name="other-cloud-considerations-object-tags"></a>

Not all cloud providers support object tags. The ones that do might not allow querying tags through the Amazon S3 API. In either situation, your DataSync transfer task might fail if you try to copy object tags.

You can avoid this by clearing the **Copy object tags** checkbox in the DataSync console when creating, starting, or updating your task.

### Transferring to Amazon S3
<a name="other-cloud-considerations-s3"></a>

When transferring to Amazon S3, DataSync can't transfer objects larger than 5 TB. DataSync also can't copy object metadata larger than 2 KB.
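If you can list object sizes and metadata sizes at your source, a pre-flight check like the following sketch can flag items that would hit these limits before you start a task. The byte thresholds interpret the limits as 5 TiB and 2 KiB; treat them as assumptions.

```shell
MAX_OBJECT_BYTES=$((5 * 1024 * 1024 * 1024 * 1024))   # 5 TiB
MAX_METADATA_BYTES=2048                               # 2 KiB

# Classify a single object by its size and metadata size in bytes.
check_object() {
  size="$1"; metadata="$2"
  if [ "$size" -gt "$MAX_OBJECT_BYTES" ]; then
    echo "object-too-large"
  elif [ "$metadata" -gt "$MAX_METADATA_BYTES" ]; then
    echo "metadata-too-large"
  else
    echo "ok"
  fi
}

check_object 1048576 512          # ok
check_object 6597069766656 512    # object-too-large (6 TiB)
check_object 1048576 4096         # metadata-too-large
```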

## Creating your DataSync agent
<a name="other-cloud-creating-agent"></a>

You need a DataSync agent for this transfer only if your AWS storage is Amazon EFS or Amazon FSx, or if you're using a **Basic** mode task. This section describes how to deploy and activate an agent on an Amazon EC2 instance in your virtual private cloud (VPC) in AWS.

**To create an Amazon EC2 agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent).

1. [Choose a service endpoint](choose-service-endpoint.md) that the agent uses to communicate with AWS.

   In this situation, we recommend using a VPC service endpoint.

1. Configure your network to work with [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. [Activate the agent](activate-agent.md).

## Creating a transfer location for your other cloud object storage
<a name="creating-other-cloud-location-how-to"></a>

You can configure DataSync to use your cloud object storage as a source or destination location.

**Before you begin**  
Make sure that you know [how DataSync accesses your cloud object storage](#other-cloud-access). You also need a [DataSync agent](#other-cloud-creating-agent) that can connect to your cloud object storage.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

1. For **Server**, enter the [endpoint](#other-cloud-permissions) that DataSync can use to access your cloud object storage:
   + **Wasabi Cloud Storage** – `s3.region.wasabisys.com`
   + **DigitalOcean Spaces** – `region.digitaloceanspaces.com`
   + **Oracle Cloud Infrastructure Object Storage** – `namespace.compat.objectstorage.region.oraclecloud.com`
   + **Cloudflare R2 Storage** – `account-id.r2.cloudflarestorage.com`
   + **Backblaze B2 Cloud Storage** – `s3.region.backblazeb2.com`
   + **NAVER Cloud Object Storage** – `region.object.ncloudstorage.com` (most regions)
   + **Alibaba Cloud Object Storage Service** – `region.aliyuncs.com`
   + **IBM Cloud Object Storage** – `s3.region.cloud-object-storage.appdomain.cloud`
   + **Seagate Lyve Cloud** – `s3.region.lyvecloud.seagate.com`

1. For **Bucket name**, enter the name of the object storage bucket that you're transferring data to or from.

1. For **Folder**, enter an object prefix. DataSync only transfers objects with this prefix.

1. If your transfer requires an agent, choose **Use agents**, then choose the DataSync agent that can connect with your cloud object storage.

1. Expand **Additional settings**. For **Server protocol**, choose **HTTPS**. For **Server port**, choose **443**.

1. Scroll down to the **Authentication** section. Make sure that the **Requires credentials** check box is selected, and then provide DataSync with your [storage credentials](#other-cloud-credentials).
   + For **Access key**, enter the ID to access your cloud object storage.
   + For **Secret key**, provide the secret key to access your cloud object storage. You can either enter the key directly, or specify an AWS Secrets Manager secret that contains the key. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

1. (Optional) Enter values for the **Key** and **Value** fields to tag the location.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
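You can also create this location with the AWS CLI. The following sketch builds a `create-location-object-storage` command from hypothetical values (the endpoint, bucket, subdirectory, and agent ARN are placeholders). The secret key is left out so it doesn't end up in your shell history, and `--agent-arns` applies only if your transfer requires an agent.

```shell
# Hypothetical values -- replace with your endpoint, bucket, and agent ARN.
SERVER="s3.us-east-1.wasabisys.com"
BUCKET="my-example-bucket"
AGENT_ARN="arn:aws:datasync:us-east-1:123456789012:agent/agent-0example1234567890"

# Build the command as a string first so you can review it before running it.
CMD="aws datasync create-location-object-storage \
  --server-hostname $SERVER \
  --bucket-name $BUCKET \
  --subdirectory /photos \
  --server-protocol HTTPS \
  --server-port 443 \
  --agent-arns $AGENT_ARN \
  --access-key EXAMPLEACCESSKEYID"
echo "$CMD"
```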

## Next steps
<a name="other-cloud-location-next-steps"></a>

After you finish creating a DataSync location for your cloud object storage, you can continue setting up your transfer. Here are some next steps to consider:

1. If you haven't already, [create another location](transferring-aws-storage.md) where you plan to transfer your data to or from in AWS.

1. Learn how DataSync [handles metadata and special files](metadata-copied.md) for object storage locations.

1. Configure how your data gets transferred. For example, maybe you only want to [transfer a subset of your data](filtering.md).
**Important**  
Make sure that you configure how DataSync copies object tags correctly. For more information, see considerations with [object tags](#other-cloud-considerations-object-tags).

1. [Start your transfer](run-task.md). 

 

# Creating a task for transferring your data
<a name="create-task-how-to"></a>

A *task* describes where and how AWS DataSync transfers data. A task consists of the following:
+ [**Source location**](working-with-locations.md) – The storage system or service where DataSync transfers data from.
+ [**Destination location**](working-with-locations.md) – The storage system or service where DataSync transfers data to.
+ [**Task options**](task-options.md) – Settings such as what files to transfer, how data gets verified, when the task runs, and more.
+ [**Task executions**](run-task.md) – When you run a task, it's called a *task execution*.

## Creating your task
<a name="create-task-steps"></a>

When you create a DataSync task, you specify your source and destination locations. You also can customize your task by choosing which files to transfer, how metadata gets handled, setting up a schedule, and more.

Before you create your task, make sure that you understand [how DataSync transfers work](how-datasync-transfer-works.md#transferring-files) and review the [task quotas](datasync-limits.md#task-hard-limits).

**Important**  
If you're planning to transfer data to or from an Amazon S3 location, review [how DataSync can affect your S3 request charges](create-s3-location.md#create-s3-location-s3-requests) and the [DataSync pricing page](https://aws.amazon.com/datasync/pricing/) before you begin.

### Using the DataSync console
<a name="create-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. Make sure you're in one of the AWS Regions where you plan to transfer data.

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. On the **Configure source location** page, [create](transferring-data-datasync.md) or choose a source location, then choose **Next**.

1. On the **Configure destination location** page, [create](transferring-data-datasync.md) or choose a destination location, then choose **Next**.

1. (Recommended) On the **Configure settings** page, give your task a name that you can remember.

1. While still on the **Configure settings** page, choose your task options or use the default settings.

   You might be interested in some of the following options:
   + Specify the [task mode](choosing-task-mode.md) that you want to use.
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch](monitor-datasync.md). We recommend setting up some kind of monitoring for your task.

   When you're done, choose **Next**.

1. Review your task configuration, then choose **Create task**.

You're ready to [start your task](run-task.md).

### Using the AWS CLI
<a name="create-task-cli"></a>

Once you [create your DataSync source and destination locations](transferring-data-datasync.md), you can create your task.

1. In your AWS CLI settings, make sure that you're using one of the AWS Regions where you plan to transfer data.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --name "task-name"
   ```

1. For `--source-location-arn`, specify the Amazon Resource Name (ARN) of your source location.

1. For `--destination-location-arn`, specify the ARN of your destination location.

   If you're transferring across AWS Regions or accounts, make sure that the ARN includes the other Region or account ID.

1. (Recommended) For `--name`, specify a name for your task that you can remember.

1. Specify other task options as needed. You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch](monitor-datasync.md). We recommend setting up some kind of monitoring for your task.

   For more options, see [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html). Here's an example `create-task` command that specifies several options:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --cloud-watch-log-group-arn "arn:aws:logs:region:account-id" \
     --name "task-name" \
     --options VerifyMode=NONE,OverwriteMode=NEVER,Atime=BEST_EFFORT,Mtime=PRESERVE,Uid=INT_VALUE,Gid=INT_VALUE,PreserveDevices=PRESERVE,PosixPermissions=PRESERVE,PreserveDeletedFiles=PRESERVE,TaskQueueing=ENABLED,LogLevel=TRANSFER
   ```

1. Run the `create-task` command.

   If the command is successful, you get a response that shows you the ARN of the task that you created. For example:

   ```
   { 
       "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026" 
   }
   ```

You're ready to [start your task](run-task.md).
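When you script this step, you often need the task ID from the response for later commands. The following sketch parses the example response above (the ARN shown is the hypothetical one from the example):

```shell
# Example response from create-task (single line for simplicity).
RESPONSE='{ "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026" }'

# Pull the ARN out of the JSON, then take everything after the last "/".
TASK_ARN=$(printf '%s' "$RESPONSE" | sed -n 's/.*"TaskArn": "\([^"]*\)".*/\1/p')
TASK_ID=${TASK_ARN##*/}
echo "$TASK_ID"   # task-08de6e6697796f026
```

In practice, adding `--query TaskArn --output text` to the `create-task` command avoids the JSON parsing entirely.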

## Task statuses
<a name="understand-task-creation-statuses"></a>

When you create a DataSync task, you can check its status to see if it's ready to run.


| Console status | API status | Description | 
| --- | --- | --- | 
| Available |  `AVAILABLE`  |  The task is ready to start transferring data.  | 
| Running |  `RUNNING`  | A task execution is in progress. For more information, see [Task execution statuses](run-task.md#understand-task-execution-statuses). | 
|  Unavailable  |  `UNAVAILABLE`  |  A DataSync agent used by the task is offline. For more information, see [What do I do if my agent is offline?](troubleshooting-datasync-agents.md#troubleshoot-agent-offline)  | 
|  Queued  |  `QUEUED`  |  Another task execution that uses the same DataSync agent is in progress. For more information, see [Knowing when your task is queued](run-task.md#queue-task-execution).  | 

## Partitioning large datasets with multiple tasks
<a name="multiple-tasks-large-dataset"></a>

If you're transferring a large dataset, such as [migrating](datasync-large-migration.md) millions of files or objects, we recommend using DataSync Enhanced mode for your transfer, which can transfer datasets with virtually unlimited numbers of files. For very large datasets, with billions of files, you should consider partitioning your dataset with multiple DataSync tasks. Partitioning your data across multiple tasks (and possibly [agents](do-i-need-datasync-agent.md#multiple-agents), depending on your locations) helps reduce the time it takes DataSync to prepare and transfer your data.

Consider some of the ways that you can partition a large dataset across several DataSync tasks:
+ Create tasks that transfer separate folders. For example, you might create two tasks that target `/FolderA` and `/FolderB`, respectively, in your source storage.
+ Create tasks that transfer subsets of files, objects, and folders by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).

Be mindful that this approach can increase the I/O operations on your storage and consume more of your network bandwidth. For more information, see the blog post [How to accelerate your data transfers with DataSync scale out architectures](https://aws.amazon.com/blogs/storage/how-to-accelerate-your-data-transfers-with-aws-datasync-scale-out-architectures/).
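As a sketch of the folder-based approach, the following loop generates one include filter per top-level folder; each filter would go on its own task. The folder names are hypothetical, and you should verify the exact `SIMPLE_PATTERN` syntax against [filters](filtering.md).

```shell
# One include filter per partition; each belongs to a separate DataSync task.
FILTERS=""
for folder in FolderA FolderB FolderC; do
  filter="FilterType=SIMPLE_PATTERN,Value=/$folder/*"
  echo "task for /$folder: --includes $filter"
  FILTERS="$FILTERS $filter"
done
```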

## Segmenting transferred data with multiple tasks
<a name="multiple-tasks-organize-transfer"></a>

If you're transferring different sets of data to the same destination, you can create multiple tasks to help segment the data that you transfer.

For example, if you're transferring to the same S3 bucket named `MyBucket`, you can create different prefixes in the bucket that correspond to each task. This approach prevents file name conflicts between the datasets and allows you to set different permissions for each prefix. Here's how you might set this up:

1. Create three prefixes in the destination `MyBucket` named `task1`, `task2`, and `task3`:
   + `s3://MyBucket/task1`
   + `s3://MyBucket/task2`
   + `s3://MyBucket/task3`

1. Create three DataSync tasks named `task1`, `task2`, and `task3` that transfer to the corresponding prefix in `MyBucket`.
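The layout above can be sketched as a loop that derives each task's destination URI from the shared bucket name:

```shell
BUCKET="MyBucket"
DESTINATIONS=""
for task in task1 task2 task3; do
  # Each task writes under its own prefix in the shared bucket.
  DESTINATIONS="$DESTINATIONS s3://$BUCKET/$task"
done
echo "$DESTINATIONS"   # lists all three prefixes
```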

# Choosing a task mode for your data transfer
<a name="choosing-task-mode"></a>

Your AWS DataSync task can run in one of the following modes:
+ **Enhanced mode** – Transfer virtually unlimited numbers of files or objects with higher performance than Basic mode. Enhanced mode tasks optimize the data transfer process by listing, preparing, transferring, and verifying data in parallel. Enhanced mode is currently available for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent.
+ **Basic mode** – Transfer files or objects between AWS storage and all other supported DataSync locations. Basic mode tasks are subject to [quotas](datasync-limits.md) on the number of files, objects, and directories in a dataset. Basic mode sequentially prepares, transfers, and verifies data, making it slower than Enhanced mode for most workloads.

## Understanding task mode differences
<a name="task-mode-differences"></a>

The following information can help you determine which task mode to use.


| Capability | Enhanced mode behavior | Basic mode behavior | 
| --- | --- | --- | 
| [Performance](how-datasync-transfer-works.md#transferring-files) | DataSync lists, prepares, transfers, and verifies your data in parallel. Provides higher performance than Basic mode for most workloads (such as transferring large objects) | DataSync prepares, transfers, and verifies your data sequentially. Performance is slower than Enhanced mode for most workloads | 
| Number of items in a dataset that DataSync can work with per task execution |  Virtually unlimited numbers of objects  |  [Quotas](datasync-limits.md#task-hard-limits) apply  | 
|  Data transfer [counters](transfer-performance-counters.md) and [metrics](monitor-datasync.md)  |  More counters and metrics than Basic mode, such as the number of objects that DataSync finds at your source location, how many objects are prepared during each task execution, and folder counters similar to file and object counters  |  Fewer counters and metrics than Enhanced mode  | 
|  [Logging](configure-logging.md)  | Structured logs (JSON format) | Unstructured logs | 
|  [Supported locations](working-with-locations.md)  | Currently for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent. |  For transfers between all locations that DataSync supports  | 
|  [Data verification options](configure-data-verification-options.md)  | DataSync verifies only transferred data | DataSync verifies all data by default | 
| Cost | For more information, see the [DataSync pricing](https://aws.amazon.com/datasync/pricing) page | For more information, see the [DataSync pricing](https://aws.amazon.com/datasync/pricing) page | 
| Failure handling for unsupported object tags | For cloud storage transfers to or from locations that don't support object tagging, the task execution fails immediately if the `ObjectTags` option is unspecified or set to `PRESERVE`. | For cloud storage transfers to or from locations that don't support object tagging, the task execution runs normally, but reports per-object failures for tagged objects if the `ObjectTags` option is unspecified or set to `PRESERVE`. | 

## Choosing a task mode
<a name="choosing-task-mode-how-to"></a>

You can choose Enhanced mode only for transfers between Amazon S3 locations, transfers between Azure Blob and Amazon S3 without an agent, transfers between other clouds and Amazon S3 without an agent, and transfers between NFS or SMB file servers and Amazon S3 using an Enhanced mode agent. Otherwise, you must use Basic mode. For example, a transfer from an on-premises [HDFS location](create-hdfs-location.md) to an S3 location requires Basic mode.

Your task options and performance might vary depending on the task mode you choose. Once you create your task, you can't change the task mode.
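As an illustrative sketch of this rule (simplified: it treats every eligible pairing as "Amazon S3 on at least one end with a supported peer", where the peer is Azure Blob or other-cloud object storage without an agent, NFS or SMB with an Enhanced mode agent, or another S3 location; confirm eligibility against the supported-locations list):

```shell
# Location types that can pair with Amazon S3 in an Enhanced mode task.
ENHANCED_PEERS=" S3 AZURE_BLOB OBJECT_STORAGE NFS SMB "

enhanced_allowed() {
  src="$1"; dst="$2"
  if [ "$src" = "S3" ]; then peer="$dst"
  elif [ "$dst" = "S3" ]; then peer="$src"
  else echo "no"; return; fi
  case "$ENHANCED_PEERS" in
    *" $peer "*) echo "yes" ;;
    *) echo "no" ;;
  esac
}

enhanced_allowed S3 S3      # yes
enhanced_allowed NFS S3     # yes
enhanced_allowed HDFS S3    # no  (requires Basic mode, as in the example above)
```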

**Required permissions**  
To create an Enhanced mode task, the IAM role that you're using DataSync with must have the `iam:CreateServiceLinkedRole` permission.  
For your DataSync user permissions, consider using [AWSDataSyncFullAccess](security-iam-awsmanpol.md#security-iam-awsmanpol-awsdatasyncfullaccess). This is an AWS managed policy that provides a user full access to DataSync and minimal access to its dependencies.

### Using the DataSync console
<a name="choosing-task-mode-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Task mode**, choose one of the following options:
   + **Enhanced**
   + **Basic**

   For more information, see [Understanding task mode differences](#task-mode-differences).

1. While still on the **Configure settings** page, choose other task options or use the default settings.

   You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch Logs](monitor-datasync.md).

   When you're done, choose **Next**.

1. Review your task configuration, then choose **Create task**.

### Using the AWS CLI
<a name="choosing-task-mode-cli"></a>

1. In your AWS CLI settings, make sure that you're using one of the AWS Regions where you plan to transfer data.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --task-mode "ENHANCED-or-BASIC"
   ```

1. For `--source-location-arn`, specify the Amazon Resource Name (ARN) of your source location.

1. For `--destination-location-arn`, specify the ARN of your destination location.

   If you're transferring across AWS Regions or accounts, make sure that the ARN includes the other Region or account ID.

1. For `--task-mode`, specify `ENHANCED` or `BASIC`.

   For more information, see [Understanding task mode differences](#task-mode-differences).

1. Specify other task options as needed. You might be interested in some of the following options:
   + Specify what data to transfer by using a [manifest](transferring-with-manifest.md) or [filters](filtering.md).
   + Configure how to [handle file metadata](configure-metadata.md) and [verify data integrity](configure-data-verification-options.md).
   + Monitor your transfer with [task reports](task-reports.md) or [Amazon CloudWatch Logs](monitor-datasync.md).

   For more options, see [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html). Here's an example `create-task` command that specifies Enhanced mode and several other options:

   ```
   aws datasync create-task \
     --source-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --destination-location-arn "arn:aws:datasync:us-east-1:account-id:location/location-id" \
     --name "task-name" \
     --task-mode "ENHANCED" \
     --options TransferMode=CHANGED,VerifyMode=ONLY_FILES_TRANSFERRED,ObjectTags=PRESERVE,LogLevel=TRANSFER
   ```

1. Run the `create-task` command.

   If the command is successful, you get a response that shows you the ARN of the task that you created. For example:

   ```
   { 
       "TaskArn": "arn:aws:datasync:us-east-1:111222333444:task/task-08de6e6697796f026" 
   }
   ```

### Using the DataSync API
<a name="choosing-task-mode-api"></a>

You can specify the DataSync task mode by configuring the `TaskMode` parameter in the [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html) operation.

# Choosing what AWS DataSync transfers
<a name="task-options"></a>

AWS DataSync lets you choose what to transfer and how you want your data handled. Some options include:
+ Transferring an exact list of files or objects by using a manifest.
+ Including or excluding certain types of data in your transfer by using a filter.
+ For recurring transfers, moving only the data that's changed since the last transfer.
+ Overwriting data in the destination location to match what's in the source location.
+ Choosing which file or object metadata to preserve between your storage locations.

**Topics**
+ [

# Transferring specific files or objects by using a manifest
](transferring-with-manifest.md)
+ [

# Transferring specific files, objects, and folders by using filters
](filtering.md)
+ [

# Understanding how DataSync handles file and object metadata
](metadata-copied.md)
+ [

# Links and directories copied by AWS DataSync
](special-files-copied.md)
+ [

# Configuring how to handle files, objects, and metadata
](configure-metadata.md)

# Transferring specific files or objects by using a manifest
<a name="transferring-with-manifest"></a>

A *manifest* is a list of files or objects that you want AWS DataSync to transfer. For example, instead of having to transfer everything in an S3 bucket with potentially millions of objects, DataSync transfers only the objects that you list in your manifest.

Manifests are similar to [filters](filtering.md) but let you identify exactly which files or objects to transfer instead of data that matches a filter pattern.

**Note**  
The maximum allowable size for a manifest file with Enhanced mode tasks is 20 GB.

## Creating your manifest
<a name="transferring-with-manifest-create"></a>

A manifest is a comma-separated values (CSV)-formatted file that lists the files or objects in your source location that you want DataSync to transfer. If your source is an S3 bucket, you can also include which version of an object to transfer.

**Topics**
+ [

### Guidelines
](#transferring-with-manifest-guidelines)
+ [

### Example manifests
](#manifest-examples)

### Guidelines
<a name="transferring-with-manifest-guidelines"></a>

Use these guidelines to help you create a manifest that works with DataSync.

------
#### [ Do ]
+ Specify the full path of each file or object that you want to transfer.

  You can't specify only a directory or folder with the intention of transferring all of its contents. For these situations, consider using an [include filter](filtering.md) instead of a manifest.
+ Make sure that each file or object path is relative to the mount path, folder, directory, or prefix that you specified when configuring your DataSync source location.

  For example, let's say you [configure an S3 location](create-s3-location.md#create-s3-location-how-to) with a prefix named `photos`. That prefix includes an object `my-picture.png` that you want to transfer. In the manifest, you then only need to specify the object (`my-picture.png`) instead of the prefix and object (`photos/my-picture.png`).
+ To specify Amazon S3 object version IDs, separate the object's path and version ID by using a comma.

  The following example shows a manifest entry with two fields. The first field includes an object named `picture1.png`. The second field is separated by a comma and includes a version ID of `111111`:

  ```
  picture1.png,111111
  ```
+ Use quotes in the following situations:
  + When a path contains special characters (commas, quotes, and line endings):

    `"filename,with,commas.txt"`
  + When a path spans multiple lines:

    ```
    "this
    is
    a
    filename.txt"
    ```
  + When a path includes quotes:

    `filename""with""quotes.txt`

    This represents a path named `filename"with"quotes.txt`.

  These quote rules also apply to version ID fields. In general, if a manifest field has a quote, you must escape it with another quote.
+ Separate each file or object entry with a new line.

  You can separate lines by using Linux (line feed or carriage return) or Windows (carriage return followed by a line feed) style line breaks.
+ Save your manifest (for example, `my-manifest.csv` or `my-manifest.txt`).
+ Upload the manifest to an S3 bucket that [DataSync can access](#transferring-with-manifest-access).

  This bucket doesn't have to be in the same AWS Region or account where you're using DataSync.

------
#### [ Don't ]
+ Specify only a directory or folder with the intention of transferring all of its contents.

  A manifest can only include full paths to the files or objects that you want to transfer. If you configure your source location to use a specific mount path, folder, directory, or prefix, you don't have to include that in your manifest.
+ Specify a file or object path that exceeds 4,096 characters.
+ Specify a file path, object path, or Amazon S3 object version ID that exceeds 1,024 bytes.
+ Specify duplicate file or object paths.
+ Include an object version ID if your source location isn't an S3 bucket.
+ Include more than two fields in a manifest entry.

  An entry can include only a file or object path and (if applicable) an Amazon S3 object version ID.
+ Include characters that don't conform to UTF-8 encoding.
+ Include unintentional spaces in your entry fields outside of quotes.

------
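The Do and Don't lists above can be sketched as a small local validation pass for sanity-checking a manifest before you upload it. This is a rough sketch based only on the rules stated here (the field count, duplicate, and length checks come from the Don't list); DataSync performs its own authoritative validation.

```python
import csv
import io

MAX_PATH_CHARS = 4096   # per-path character limit from the Don't list
MAX_FIELD_BYTES = 1024  # per-field byte limit from the Don't list

def validate_manifest(text, source_is_s3=True):
    """Return a list of problems found in a manifest, per the guidelines above."""
    problems = []
    seen = set()
    for row_num, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row:
            continue
        if len(row) > 2:
            problems.append(f"line {row_num}: more than two fields")
            continue
        path = row[0]
        if len(row) == 2 and not source_is_s3:
            problems.append(f"line {row_num}: version ID, but the source isn't an S3 bucket")
        if len(path) > MAX_PATH_CHARS:
            problems.append(f"line {row_num}: path exceeds {MAX_PATH_CHARS} characters")
        for field in row:
            if len(field.encode("utf-8")) > MAX_FIELD_BYTES:
                problems.append(f"line {row_num}: field exceeds {MAX_FIELD_BYTES} bytes")
        if path in seen:
            problems.append(f"line {row_num}: duplicate path {path!r}")
        seen.add(path)
    return problems
```

A clean manifest returns an empty list; anything else tells you which entry to fix before DataSync rejects it.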

### Example manifests
<a name="manifest-examples"></a>

Use these examples to help you create a manifest that works with DataSync. 

**Manifest with full file or object paths**  
The following example shows a manifest with full file or object paths to transfer.  

```
photos/picture1.png
photos/picture2.png
photos/picture3.png
```

**Manifest with only object keys**  
The following example shows a manifest with objects to transfer from an Amazon S3 source location. Since the [location is configured](create-s3-location.md#create-s3-location-how-to) with the prefix `photos`, only the object keys are specified.  

```
picture1.png
picture2.png
picture3.png
```

**Manifest with object paths and version IDs**  
The first two entries in the following manifest example include specific Amazon S3 object versions to transfer.  

```
photos/picture1.png,111111
photos/picture2.png,121212
photos/picture3.png
```

**Manifest with UTF-8 characters**  
The following example shows a manifest with files that include UTF-8 characters.  

```
documents/résumé1.pdf
documents/résumé2.pdf
documents/résumé3.pdf
```
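Because a manifest is just a CSV file, you can also generate one programmatically and let a CSV library apply the quoting rules described in the guidelines. A minimal sketch (the file name, object keys, and version IDs are illustrative):

```python
import csv

# (path, optional S3 object version ID) entries; values are illustrative
entries = [
    ("photos/picture1.png", "111111"),
    ("photos/picture2.png", "121212"),
    ("photos/picture,3.png", None),  # comma in the name gets quoted automatically
]

with open("my-manifest.csv", "w", newline="") as f:
    writer = csv.writer(f, lineterminator="\n")
    for path, version_id in entries:
        writer.writerow([path, version_id] if version_id else [path])
```

The `csv` module wraps fields containing commas, quotes, or line breaks in quotes and doubles any embedded quotes, matching the escaping rules above.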

## Providing DataSync access to your manifest
<a name="transferring-with-manifest-access"></a>

You need an AWS Identity and Access Management (IAM) role that gives DataSync access to your manifest in its S3 bucket. This role must include the following permissions:
+ `s3:GetObject`
+ `s3:GetObjectVersion`

You can generate this role automatically in the DataSync console or create the role yourself.

**Note**  
If your manifest is in a different AWS account, you must create this role manually.

### Creating the IAM role automatically
<a name="creating-manfiest-role-automatically"></a>

When creating or starting a transfer task in the console, DataSync can create an IAM role for you with the `s3:GetObject` and `s3:GetObjectVersion` permissions that you need to access your manifest.

**Required permissions to automatically create the role**  
To automatically create the role, make sure that the role that you're using to access the DataSync console has the following permissions:  
+ `iam:CreateRole`
+ `iam:CreatePolicy`
+ `iam:AttachRolePolicy`

### Creating the IAM role (same account)
<a name="creating-manfiest-role-automatically-same-account"></a>

You can manually create the IAM role that DataSync needs to access your manifest. The following instructions assume that you're in the same AWS account where you use DataSync and your manifest's S3 bucket is located. 

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **AWS service**.

1. For **Use case**, choose **DataSync** in the dropdown list and select **DataSync**. Choose **Next**.

1. On the **Add permissions** page, choose **Next**. Give your role a name and choose **Create role**.

1. On the **Roles** page, search for the role that you just created and choose its name.

1. On the role's details page, choose the **Permissions** tab. Choose **Add permissions** then **Create inline policy**.

1. Choose the **JSON** tab and paste the following sample policy into the policy editor:

------
#### [ JSON ]

****  

   ```
   {
    "Version": "2012-10-17",
       "Statement": [{
           "Sid": "DataSyncAccessManifest",
           "Effect": "Allow",
           "Action": [
               "s3:GetObject",
               "s3:GetObjectVersion"
           ],
           "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/my-manifest.csv"
       }]
   }
   ```

------

1. In the sample policy that you just pasted, replace the following values with your own:

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest.

   1. Replace `my-manifest.csv` with the file name of your manifest.

1. Choose **Next**. Give your policy a name and choose **Create policy**.

1. (Recommended) To prevent the [cross-service confused deputy problem](cross-service-confused-deputy-prevention.md), do the following:

   1. On the role's details page, choose the **Trust relationships** tab. Choose **Edit trust policy**.

   1. Update the trust policy by using the following example, which includes the `aws:SourceArn` and `aws:SourceAccount` global condition context keys:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                  "Service": "datasync.amazonaws.com"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                  "StringEquals": {
                      "aws:SourceAccount": "555555555555"
                  },
                  "ArnLike": {
                      "aws:SourceArn": "arn:aws:datasync:us-east-1:555555555555:*"
                  }
              }
            }
        ]
      }
      ```

------
      + Replace each instance of `555555555555` with the AWS account ID where you're using DataSync.
      + Replace `us-east-1` with the AWS Region where you're using DataSync.

   1. Choose **Update policy**.

You've created an IAM role that allows DataSync to access your manifest. Specify this role when [creating](#manifest-creating-task) or [starting](#manifest-starting-task) your task.

### Creating the IAM role (different account)
<a name="creating-manfiest-role-automatically-different-account"></a>

If your manifest is in an S3 bucket that belongs to a different AWS account, you must manually create the IAM role that DataSync uses to access the manifest. Then, in the AWS account where your manifest is located, you need to include the role in the S3 bucket policy.

#### Creating the role
<a name="creating-manfiest-role-automatically-different-account-1"></a>

1. Open the IAM console at [https://console.aws.amazon.com/iam/](https://console.aws.amazon.com/iam/).

1. In the left navigation pane, under **Access management**, choose **Roles**, and then choose **Create role**.

1. On the **Select trusted entity** page, for **Trusted entity type**, choose **AWS service**.

1. For **Use case**, choose **DataSync** in the dropdown list and select **DataSync**. Choose **Next**.

1. On the **Add permissions** page, choose **Next**. Give your role a name and choose **Create role**.

1. On the **Roles** page, search for the role that you just created and choose its name.

1. On the role's details page, choose the **Permissions** tab. Choose **Add permissions** then **Create inline policy**.

1. Choose the **JSON** tab and paste the following sample policy into the policy editor:

------
#### [ JSON ]

****  

   ```
   {
    "Version": "2012-10-17",
       "Statement": [{
           "Sid": "DataSyncAccessManifest",
           "Effect": "Allow",
           "Action": [
               "s3:GetObject",
               "s3:GetObjectVersion"
           ],
           "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/my-manifest.csv"
       }]
   }
   ```

------

1. In the sample policy that you just pasted, replace the following values with your own:

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest.

   1. Replace `my-manifest.csv` with the file name of your manifest.

1. Choose **Next**. Give your policy a name and choose **Create policy**.

1. (Recommended) To prevent the [cross-service confused deputy problem](cross-service-confused-deputy-prevention.md), do the following:

   1. On the role's details page, choose the **Trust relationships** tab. Choose **Edit trust policy**.

   1. Update the trust policy by using the following example, which includes the `aws:SourceArn` and `aws:SourceAccount` global condition context keys:

------
#### [ JSON ]

****  

      ```
      {
          "Version": "2012-10-17",
          "Statement": [
            {
              "Effect": "Allow",
              "Principal": {
                  "Service": "datasync.amazonaws.com"
              },
              "Action": "sts:AssumeRole",
              "Condition": {
                  "StringEquals": {
                      "aws:SourceAccount": "000000000000"
                  },
                  "ArnLike": {
                      "aws:SourceArn": "arn:aws:datasync:us-east-1:000000000000:*"
                  }
              }
           }
        ]
      }
      ```

------
      + Replace each instance of `000000000000` with the AWS account ID where you're using DataSync.
      + Replace `us-east-1` with the AWS Region where you're using DataSync.

   1. Choose **Update policy**.

You created the IAM role that you can include in your S3 bucket policy.

#### Updating your S3 bucket policy with the role
<a name="creating-manfiest-role-automatically-different-account-2"></a>

Once you've created the IAM role, you must add it to the S3 bucket policy in the other AWS account where your manifest is located.

1. In the AWS Management Console, switch over to the account with your manifest's S3 bucket.

1. Open the Amazon S3 console at [https://console.aws.amazon.com/s3/](https://console.aws.amazon.com/s3/).

1. On the bucket's detail page, choose the **Permissions** tab.

1. Under **Bucket policy**, choose **Edit** and do the following to modify your S3 bucket policy:

   1. Update what's in the editor to include the following policy statements:

------
#### [ JSON ]

****  

      ```
      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "DataSyncAccessManifestBucket",
            "Effect": "Allow",
            "Principal": {
              "AWS": "arn:aws:iam::account-id:role/datasync-role"
            },
            "Action": [
              "s3:GetObject",
              "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::amzn-s3-demo-bucket/*"
          }
        ]
      }
      ```

------

   1. Replace `account-id` with the AWS account ID for the account that you're using DataSync with.

   1. Replace `datasync-role` with the IAM role that you just created that allows DataSync to access your manifest.

   1. Replace `amzn-s3-demo-bucket` with the name of the S3 bucket that's hosting your manifest in the other AWS account.

1. Choose **Save changes**.

You've created an IAM role that allows DataSync to access your manifest in the other account. Specify this role when [creating](#manifest-creating-task) or [starting](#manifest-starting-task) your task.

## Specifying your manifest when creating a task
<a name="manifest-creating-task"></a>

You can specify the manifest that you want DataSync to use when creating a task.

### Using the DataSync console
<a name="manifest-creating-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using a manifest**.

1. For **S3 URI**, choose your manifest that's hosted on an S3 bucket.

   Alternatively, you can enter the URI (for example, `s3://bucket/prefix/my-manifest.csv`).

1. For **Object version**, choose the version of the manifest that you want DataSync to use.

   By default, DataSync uses the latest version of the object.

1. For **Manifest access role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role with the permissions required to access your manifest in its S3 bucket.
   + Choose an existing IAM role that can access your manifest.

   For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).

1. Configure any other task settings you need, then choose **Next**.

1. Choose **Create task**.

### Using the AWS CLI
<a name="manifest-creating-task-cli"></a>

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
     --destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
     --manifest-config '{
         "Source": {
           "S3": {
               "ManifestObjectPath": "s3-object-key-of-manifest",
               "BucketAccessRoleArn": "bucket-iam-role",
               "S3BucketArn": "amzn-s3-demo-bucket-arn",
               "ManifestObjectVersionId": "manifest-version-to-use"
           }
         }
     }'
   ```

1. For the `--source-location-arn` parameter, specify the Amazon Resource Name (ARN) of the location that you're transferring data from.

1. For the `--destination-location-arn` parameter, specify the ARN of the location that you're transferring data to.

1. For the `--manifest-config` parameter, do the following:
   + `ManifestObjectPath` – Specify the S3 object key of your manifest.
   + `BucketAccessRoleArn` – Specify the IAM role that allows DataSync to access your manifest in its S3 bucket.

     For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).
   + `S3BucketArn` – Specify the ARN of the S3 bucket that's hosting your manifest.
   + `ManifestObjectVersionId` – Specify the version of the manifest that you want DataSync to use.

     By default, DataSync uses the latest version of the object.

1. Run the `create-task` command to create your task.

When you're ready, you can [start your transfer task](run-task.md).
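Because `--manifest-config` takes a JSON value, one way to avoid shell-quoting mistakes is to serialize the structure programmatically and paste the result into your command. A minimal sketch (the ARNs and object key are placeholders, not real resources):

```python
import json

# Placeholder values -- substitute your own object key, role ARN, and bucket ARN.
manifest_config = {
    "Source": {
        "S3": {
            "ManifestObjectPath": "prefix/my-manifest.csv",
            "BucketAccessRoleArn": "arn:aws:iam::111222333444:role/datasync-manifest-role",
            "S3BucketArn": "arn:aws:s3:::amzn-s3-demo-bucket",
        }
    }
}

# Serialize once so the value passed to --manifest-config is valid JSON.
# On the command line, wrap the result in single quotes to protect it from the shell.
arg = json.dumps(manifest_config)
print(f"--manifest-config '{arg}'")
```

Building the argument this way guarantees balanced braces and properly escaped strings before the CLI ever sees it.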

## Specifying your manifest when starting a task
<a name="manifest-starting-task"></a>

You can specify the manifest that you want DataSync to use when executing a task.

### Using the DataSync console
<a name="manifest-starting-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose the task that you want to start.

1. In the task overview page, choose **Start**, and then choose **Start with overriding options**.

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using a manifest**.

1. For **S3 URI**, choose your manifest that's hosted on an S3 bucket.

   Alternatively, you can enter the URI (for example, `s3://bucket/prefix/my-manifest.csv`).

1. For **Object version**, choose the version of the manifest that you want DataSync to use.

   By default, DataSync uses the latest version of the object.

1. For **Manifest access role**, do one of the following:
   + Choose **Autogenerate** for DataSync to automatically create an IAM role to access your manifest in its S3 bucket.
   + Choose an existing IAM role that can access your manifest.

   For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).

1. Choose **Start** to begin your transfer.

### Using the AWS CLI
<a name="manifest-starting-task-cli"></a>

1. Copy the following `start-task-execution` command:

   ```
   aws datasync start-task-execution \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh \
     --manifest-config '{
         "Source": {
           "S3": {
               "ManifestObjectPath": "s3-object-key-of-manifest",
               "BucketAccessRoleArn": "bucket-iam-role",
               "S3BucketArn": "amzn-s3-demo-bucket-arn",
               "ManifestObjectVersionId": "manifest-version-to-use"
           }
         }
     }'
   ```

1. For the `--task-arn` parameter, specify the Amazon Resource Name (ARN) of the task that you're starting.

1. For the `--manifest-config` parameter, do the following:
   + `ManifestObjectPath` – Specify the S3 object key of your manifest.
   + `BucketAccessRoleArn` – Specify the IAM role that allows DataSync to access your manifest in its S3 bucket.

     For more information, see [Providing DataSync access to your manifest](#transferring-with-manifest-access).
   + `S3BucketArn` – Specify the ARN of the S3 bucket that's hosting your manifest.
   + `ManifestObjectVersionId` – Specify the version of the manifest that you want DataSync to use.

     By default, DataSync uses the latest version of the object.

1. Run the `start-task-execution` command to begin your transfer.

## Limitations
<a name="transferring-with-manifest-limitations"></a>
+ You can't use a manifest together with [filters](filtering.md).
+ You can't specify only a directory or folder with the intention of transferring all of its contents. For these situations, consider using an [include filter](filtering.md) instead of a manifest.
+ You can't use the **Keep deleted files** task option (`PreserveDeletedFiles` in the [API](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PreserveDeletedFiles)) to [maintain files or objects in the destination that aren't in the source](configure-metadata.md). DataSync only transfers what's listed in your manifest and doesn't delete anything in the destination.

## Troubleshooting
<a name="manifests-troubleshooting"></a>

**Errors related to `HeadObject` or `GetObjectTagging`**  
If you're transferring objects with specific version IDs from an S3 bucket, you might see an error related to `HeadObject` or `GetObjectTagging`. For example, here's an error related to `GetObjectTagging`:

```
[WARN] Failed to read metadata for file /picture1.png (versionId: 111111): S3 Get Object Tagging Failed
[ERROR] S3 Exception: op=GetObjectTagging photos/picture1.png, code=403, type=15, exception=AccessDenied, 
msg=Access Denied req-hdrs: content-type=application/xml, x-amz-api-version=2006-03-01 rsp-hdrs: content-type=application/xml, 
date=Wed, 07 Feb 2024 20:16:14 GMT, server=AmazonS3, transfer-encoding=chunked, 
x-amz-id-2=IOWQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa3Km, x-amz-request-id=79104EXAMPLEB723
```

If you see either of these errors, validate that the IAM role that DataSync uses to access your S3 source location has the following permissions:
+ `s3:GetObjectVersion`
+ `s3:GetObjectVersionTagging`

If you need to update your role with these permissions, see [Creating an IAM role for DataSync to access your Amazon S3 location](create-s3-location.md#create-role-manually).

**Error: `ManifestFileDoesNotExist`**  
This error indicates that a file within the manifest was not found at the source. Review the [guidelines](#transferring-with-manifest-guidelines) for creating a manifest.

## Next steps
<a name="manifests-next-steps"></a>

If you haven't already, [start your task](run-task.md). Otherwise, [monitor your task's activity](monitoring-overview.md).

# Transferring specific files, objects, and folders by using filters
<a name="filtering"></a>

AWS DataSync lets you apply filters to include or exclude data from your source location in a transfer. For example, if you don't want to transfer temporary files that end with `.tmp`, you can create an exclude filter so that these files don't make their way to your destination location.

You can use a combination of exclude and include filters in the same transfer task. If you modify a task's filters, those changes are applied the next time you run the task.

## Filtering terms, definitions, and syntax
<a name="filter-overview"></a>

Familiarize yourself with the concepts related to DataSync filtering:

**Filter**  
The whole string that makes up a particular filter (for example, `*.tmp|*.temp` or `/folderA|/folderB`).  
Filters are made up of patterns delimited by using a pipe (`|`). You don't need a delimiter when you add patterns in the DataSync console because you add each pattern separately.  
Filters are case sensitive. For example, the filter `/folderA` won't match `/FolderA`.

**Pattern**  
A pattern within a filter. For example, `*.tmp` is a pattern that's part of the `*.tmp|*.temp` filter. If your filter has multiple patterns, you delimit each pattern by using a pipe (`|`).

**Folders**  
+ All filters are relative to the source location path. For example, suppose that you specify `/my_source/` as the source path when you create your source location and task and specify the include filter `/transfer_this/`. In this case, DataSync transfers only the directory `/my_source/transfer_this/` and its contents.
+ To specify a folder directly under the source location, include a forward slash (/) in front of the folder name. In the preceding example, the pattern uses `/transfer_this`, not `transfer_this`.
+ DataSync interprets the following patterns the same way and matches both the folder and its content.

  `/dir` 

  `/dir/`
+ When you are transferring data from or to an Amazon S3 bucket, DataSync treats the `/` character in the object key as the equivalent of a folder on a file system.

**Special characters**  
Following are special characters for use with filtering.      
[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/filtering.html)

## Example filters
<a name="sample-filters"></a>

The following examples show common filters you can use with DataSync.

**Note**  
There are limits to how many characters you can use in a filter. For more information, see [DataSync quotas](datasync-limits.md#task-hard-limits).

**Exclude some folders from your source location**  
In some cases, you might want to exclude folders in your source location so that they aren't copied to your destination location. For example, if you have temporary work-in-progress folders, you can use a filter like the following:

`*/.temp`

To exclude folders with similar names (such as `/reports2021` and `/reports2022`), you can use an exclude filter like the following:

`/reports*`

To exclude folders at any level in the file hierarchy, you can use an exclude filter like the following:

`*/folder-to-exclude-1|*/folder-to-exclude-2`

To exclude folders at the top level of the source location, you can use an exclude filter like the following:

`/top-level-folder-to-exclude-1|/top-level-folder-to-exclude-2`

**Include a subset of the folders on your source location**  
In some cases, your source location might be a large share and you need to transfer a subset of the folders under the root. To include specific folders, start a task execution with an include filter like the following.

`/folder-to-transfer/*`

**Exclude specific file types**  
To exclude certain file types from the transfer, you can create a task execution with an exclude filter such as `*.temp`.

**Transfer individual files you specify**  
To transfer a list of individual files, start a task execution with an include filter like the following: `/folder/subfolder/file1.txt|/folder/subfolder/file2.txt|/folder/subfolder/file3.txt`
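To sanity-check a filter before running a task, you can approximate the pattern matching locally with glob-style matching. This is only a rough approximation of the behavior described above (case-sensitive patterns, pipe-delimited, folder patterns matching their contents), not DataSync's actual matching engine:

```python
from fnmatch import fnmatchcase

def matches_filter(path, filter_string):
    """Approximate check of a path against a pipe-delimited filter string.

    Case sensitive, like DataSync filters. A folder pattern such as /dir
    also matches the folder's contents (/dir/...).
    """
    for pattern in filter_string.split("|"):
        pattern = pattern.rstrip("/")  # /dir and /dir/ are treated the same
        if fnmatchcase(path, pattern) or fnmatchcase(path, pattern + "/*"):
            return True
    return False
```

Running a few of your own paths through a helper like this can catch mistakes such as case mismatches before a transfer scans millions of files.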

## Creating include filters
<a name="include-filters"></a>

Include filters define the files, objects, and folders that you want DataSync to transfer. You can configure include filters when you create, edit, or start a task.

DataSync scans and transfers only files and folders that match the include filters. For example, to include a subset of your source folders, you might specify `/important_folder_1|/important_folder_2`.

**Note**  
Include filters support the wildcard (`*`) character only as the rightmost character in a pattern. For example, `/documents*|/code*` is supported, but `*.txt` isn't.

### Using the DataSync console
<a name="include-filters-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Contents to scan**, choose **Specific files, objects, and folders**, then select **Using filters**.

1. For **Includes**, enter your filter (for example, `/important_folders` to include an important directory), then choose **Add pattern**.

1. Add other include filters as needed. 

### Using the AWS CLI
<a name="include-filters-cli"></a>

When using the AWS CLI, you must use single quotation marks (`'`) around the filter and a `|` (pipe) as a delimiter if you have more than one filter.

The following example specifies two include filters `/important_folder1` and `/important_folder2` when running the `create-task` command.

```
aws datasync create-task \
   --source-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --destination-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --includes FilterType=SIMPLE_PATTERN,Value='/important_folder1|/important_folder2'
```

## Creating exclude filters
<a name="exclude-filters"></a>

Exclude filters define the files, objects, and folders in your source location that you don't want DataSync to transfer. You can configure these filters when you create, edit, or start a task.

**Topics**
+ [

### Data excluded by default
](#directories-ignored-during-transfers)

### Data excluded by default
<a name="directories-ignored-during-transfers"></a>

DataSync automatically excludes some data from being transferred:
+ `.snapshot` – DataSync ignores any path ending with `.snapshot`, which typically is used for point-in-time snapshots of a storage system's files or directories.
+ `/.aws-datasync` and `/.awssync` – DataSync creates these folders in your location to help facilitate your transfer.
+ `/.zfs` – You might see this folder with Amazon FSx for OpenZFS locations.
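As a local sanity check, the default exclusions above can be approximated with a small helper (this reflects only the rules listed here and isn't DataSync's actual logic):

```python
# Folders that DataSync creates or skips at the top level of a location
DEFAULT_EXCLUDED_FOLDERS = ("/.aws-datasync", "/.awssync", "/.zfs")

def excluded_by_default(path):
    """Rough check for paths that DataSync skips automatically."""
    if path.endswith(".snapshot"):  # any path ending with .snapshot is ignored
        return True
    return any(path == folder or path.startswith(folder + "/")
               for folder in DEFAULT_EXCLUDED_FOLDERS)
```

This can help explain why certain items never appear in your task's scan results.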

### Using the DataSync console
<a name="adding-exclude-filters"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Excludes**, enter your filter (for example, `*/temp` to exclude temporary folders), then choose **Add pattern**.

1. Add other exclude filters as needed. 

1. If needed, add [include filters](#include-filters).

### Using the AWS CLI
<a name="adding-exclude-filters-cli"></a>

When using the AWS CLI, you must use single quotation marks (`'`) around the filter and a `|` (pipe) as a delimiter if you have more than one filter.

The following example specifies two exclude filters `*/temp` and `*/tmp` when running the `create-task` command.

```
aws datasync create-task \
   --source-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --destination-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --excludes FilterType=SIMPLE_PATTERN,Value='*/temp|*/tmp'
```

# Understanding how DataSync handles file and object metadata
<a name="metadata-copied"></a>

AWS DataSync can preserve your file or object metadata during a data transfer. How your metadata gets copied depends on your transfer locations and if those locations use similar types of metadata.

## System-level metadata
<a name="metadata-copied-system-level"></a>

In general, DataSync doesn't copy system-level metadata. For example, when transferring from an SMB file server, the permissions you configured at the file system level aren't copied to the destination storage system.

There are exceptions. When transferring between Amazon S3 and other object storage, DataSync does copy some [system-defined object metadata](#metadata-copied-between-object-s3).

## Metadata copied in Amazon S3 transfers
<a name="metadata-copied-amazon-s3"></a>

The following tables describe what metadata DataSync can copy when a transfer involves an Amazon S3 location.

**Topics**
+ [

### To Amazon S3
](#metadata-copied-to-s3)
+ [

### Between Amazon S3 and other object storage
](#metadata-copied-between-object-s3)
+ [

### Between Amazon S3 and HDFS
](#metadata-copied-between-hdfs-s3)

### To Amazon S3
<a name="metadata-copied-to-s3"></a>


| When copying from one of these locations | To this location | DataSync can copy | 
| --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  The following as Amazon S3 user metadata: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html) The file metadata stored in Amazon S3 user metadata is interoperable with NFS shares on file gateways using AWS Storage Gateway. A file gateway enables low-latency access from on-premises networks to data that was copied to Amazon S3 by DataSync. This metadata is also interoperable with FSx for Lustre. When DataSync copies objects that contain this metadata back to an NFS server, the file metadata is restored. Restoring metadata requires granting elevated permissions to the NFS server. For more information, see [Configuring AWS DataSync transfers with an NFS file server](create-nfs-location.md).  | 

### Between Amazon S3 and other object storage
<a name="metadata-copied-between-object-s3"></a>

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)

### Between Amazon S3 and HDFS
<a name="metadata-copied-between-hdfs-s3"></a>


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | The following as Amazon S3 user metadata: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html) HDFS uses strings, rather than numeric identifiers such as UIDs and GIDs, to store user and group ownership of files and folders. | 

## Metadata copied in NFS transfers
<a name="metadata-copied-nfs"></a>

The following table describes what metadata DataSync can copy between locations that use Network File System (NFS).


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | 

## Metadata copied in SMB transfers
<a name="metadata-copied-smb"></a>

The following table describes what metadata DataSync can copy between locations that use Server Message Block (SMB).


| When copying between these locations | DataSync can copy | 
| --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  | 

## Metadata copied in other transfer scenarios
<a name="metadata-copied-different"></a>

DataSync handles metadata in the following ways when copying between these storage systems (most of which have different metadata structures).

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)

## Understanding when and how DataSync applies default POSIX metadata
<a name="POSIX-metadata"></a>

DataSync applies default POSIX metadata in the following situations:
+ When your transfer's source and destination locations don't have similar metadata structures
+ When metadata is missing from the source location

The following table describes how DataSync applies default POSIX metadata during these types of transfers:


| Source | Destination | File permissions | Folder permissions | UID | GID | 
| --- | --- | --- | --- | --- | --- | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0755  | 0755 |  65534  |  65534  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0644  |  0755  |  65534  |  65534  | 
|  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/metadata-copied.html)  |  0644  |  0755  |  65534  |  65534  | 

¹ In cases where the objects don't have metadata that was previously applied by DataSync.

# Links and directories copied by AWS DataSync
<a name="special-files-copied"></a>

AWS DataSync handles hard links, symbolic links, and directories differently depending on the storage locations involved in your transfer.

## Hard links
<a name="special-files-copied-hard-links"></a>

Here's how DataSync handles hard links in some common transfer scenarios:
+ **When transferring between an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), and Amazon EFS**, hard links are preserved.
+ **When transferring to Amazon S3**, each underlying file referenced by a hard link is transferred only once. During incremental transfers, separate objects are created in your S3 bucket. If a hard link is unchanged in Amazon S3, it's correctly restored when transferred to an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), or Amazon EFS file system.
+ **When transferring to Microsoft Azure Blob Storage**, each underlying file referenced by a hard link is transferred only once. During incremental transfers, separate objects are created in your blob storage if there are new references in the source. When transferring from Azure Blob Storage, DataSync transfers hard links as if they are individual files.
+ **When transferring between an SMB file server, FSx for Windows File Server, and FSx for ONTAP (using SMB)**, hard links aren't supported. If DataSync encounters hard links in these situations, the transfer task completes with an error. To learn more, check your CloudWatch logs.
+ **When transferring to HDFS**, hard links aren't supported. CloudWatch logs show these links as skipped.

## Symbolic links
<a name="special-files-copied-symbolic-links"></a>

Here's how DataSync handles symbolic links in some common transfer scenarios:
+ **When transferring between an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP (using NFS), and Amazon EFS**, symbolic links are preserved.
+ **When transferring to Amazon S3**, the link target path is stored in the Amazon S3 object. The link is correctly restored when transferred to an NFS file server, FSx for Lustre, FSx for OpenZFS, FSx for ONTAP, or Amazon EFS file system.
+ **When transferring to Azure Blob Storage**, symbolic links aren't supported. CloudWatch logs show these links as skipped.
+ **When transferring between an SMB file server, FSx for Windows File Server, and FSx for ONTAP (using SMB)**, symbolic links aren't supported. DataSync doesn't transfer a symbolic link itself but instead a file referenced by the symbolic link. To recognize duplicate files and deduplicate them with symbolic links, you must configure deduplication on your destination file system.
+ **When transferring to HDFS**, symbolic links aren't supported. CloudWatch logs show these links as skipped.

## Directories
<a name="special-files-copied-directories"></a>

In general, DataSync preserves directories when transferring between storage systems. This isn’t the case in the following situations:
+ **When transferring to Amazon S3**, directories are represented as empty objects that have prefixes and end with a forward slash (`/`).
+ **When transferring to Azure Blob Storage without a hierarchical namespace**, directories don't exist. What looks like a directory is just part of an object name.

# Configuring how to handle files, objects, and metadata
<a name="configure-metadata"></a>

You can configure how AWS DataSync handles your files, objects, and their associated metadata when transferring between locations.

For example, with recurring transfers, you might want to overwrite files in your destination with changes in the source to keep the locations in sync. You can copy properties such as POSIX permissions for files and folders, tags associated with objects, and access control lists (ACLs).

## Transfer mode options
<a name="task-option-transfer-mode"></a>

You can configure whether DataSync transfers only the data (including metadata) that's changed following an initial copy or all data every time you run the task. If you're planning on recurring transfers, you might only want to transfer what's changed since your previous task execution.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Transfer only data that has changed**  |  [TransferMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-TransferMode) set to `CHANGED`  | After your initial full transfer, DataSync copies only the data and metadata that differs between the source and destination location. | 
|  **Transfer all data**  |  [TransferMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-TransferMode) set to `ALL`  |  DataSync copies everything in the source to the destination without comparing differences between the locations.   | 
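For example, you can set the transfer mode when creating a task with the AWS CLI. This is a sketch with placeholder location ARNs:

```shell
# Sketch with placeholder ARNs: after the initial full transfer,
# copy only data and metadata that differs between the locations.
aws datasync create-task \
   --source-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --destination-location-arn 'arn:aws:datasync:region:account-id:location/location-id' \
   --options TransferMode=CHANGED
```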

## File and object handling options
<a name="task-option-file-object-handling"></a>

You can control some aspects of how DataSync treats your files or objects in the destination location. For example, DataSync can delete files in the destination that aren't in the source.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Keep deleted files**  |  [PreserveDeletedFiles](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PreserveDeletedFiles)  |  Specifies whether DataSync maintains files or objects in the destination location that don't exist in the source. If you configure your task to delete objects from your Amazon S3 bucket, you might incur minimum storage duration charges for certain storage classes. For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  You can't configure your task to delete data in the destination and also [transfer all data](#task-option-transfer-mode). When you transfer all data, DataSync doesn't scan your destination location and doesn't know what to delete.   | 
|  **Overwrite files**  |  [OverwriteMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-OverwriteMode)  |  Specifies whether DataSync modifies data in the destination location when the source data or metadata has changed. If you don't configure your task to overwrite data, the destination data isn't overwritten even if the source data differs. If your task overwrites objects, you might incur additional charges for certain storage classes (for example, for retrieval or early deletion). For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  | 
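With the AWS CLI, these options map to the `--options` parameter. As a sketch (the task ARN is a placeholder), the following keeps the destination in sync by removing deleted files and overwriting changed ones:

```shell
# Sketch with a placeholder task ARN: delete destination files that
# aren't in the source, and overwrite files whose source data changed.
aws datasync update-task \
   --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
   --options PreserveDeletedFiles=REMOVE,OverwriteMode=ALWAYS
```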

## Metadata handling options
<a name="task-option-metadata-handling"></a>

DataSync can preserve file and object metadata during a transfer. The metadata that DataSync can preserve depends on the storage systems involved and whether those systems use a similar metadata structure.

Before configuring your task, make sure that you understand how DataSync handles [metadata](metadata-copied.md) and [special files](special-files-copied.md) when transferring between your source and destination locations.

**Important**  
DataSync supports transfers to and from certain third-party cloud storage systems, such as Google Cloud Storage and IBM Cloud Object Storage, which handle system metadata in a way that is not fully S3-compatible. For these transfers, DataSync attempts to copy metadata attributes such as `ContentType`, `ContentEncoding`, `ContentLanguage`, and `CacheControl` on a best-effort basis. If the destination storage system does not apply these attributes, they will be ignored during task verification.


| Option in console | Option in API | Description | 
| --- | --- | --- | 
|  **Copy ownership**  | [Gid](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Gid) and [Uid](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Uid) |  Specifies whether DataSync copies POSIX file and folder ownership, such as the group ID of the file's owners and the user ID of the file's owner.  | 
|  **Copy permissions**  | [PosixPermissions](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-PosixPermissions) |  Specifies whether DataSync copies POSIX permissions for files and folders from the source to the destination.  | 
| Copy timestamps | [Atime](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Atime) and [Mtime](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-Mtime) |  Specifies whether DataSync copies the timestamp metadata from the source to the destination. Required when you need to run a task more than once.  | 
| Copy object tags | [ObjectTags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-ObjectTags) |  Specifies whether DataSync preserves the tags associated with your objects when transferring between object storage systems.  | 
| Copy ownership, DACLs, and SACLs | [SecurityDescriptorCopyFlags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to `OWNER_DACL_SACL` |  DataSync copies the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/configure-metadata.html)  | 
| Copy ownership and DACLs | [SecurityDescriptorCopyFlags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to `OWNER_DACL` |  DataSync copies the following: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/datasync/latest/userguide/configure-metadata.html) DataSync won't copy SACLs when you choose this option.  | 
| Do not copy ownership or ACLs | [SecurityDescriptorCopyFlags ](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-SecurityDescriptorCopyFlags) set to NONE |  DataSync doesn't copy any ownership or permissions data. The objects that DataSync writes to your destination location are owned by the user whose credentials are provided for DataSync to access the destination. Destination object permissions are determined based on the permissions configured on the destination server.  | 
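As a sketch of how the POSIX-related options combine (the task ARN is a placeholder), the following preserves ownership, permissions, and timestamps for an NFS-style transfer:

```shell
# Sketch with a placeholder task ARN: preserve POSIX user and group
# ownership, file permissions, and timestamps.
aws datasync update-task \
   --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
   --options Uid=INT_VALUE,Gid=INT_VALUE,PosixPermissions=PRESERVE,Atime=BEST_EFFORT,Mtime=PRESERVE
```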

## Configuring file, object, and metadata handling options
<a name="configure-file-metadata-options"></a>

You can configure how DataSync handles files, objects, and metadata when creating, editing, or starting your transfer task.

### Using the DataSync console
<a name="configure-metadata-console"></a>

The following instructions describe how to configure file, object, and metadata handling options when creating a task.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Transfer mode**, choose one of the following options:
   + **Transfer only data that has changed**
   + **Transfer all data**

   For more information about these options, see [Transfer mode options](#task-option-transfer-mode).

1. Select **Keep deleted files** if you want DataSync to maintain files or objects in the destination location that don't exist in the source.

   If you don't choose this option and your task deletes objects from your Amazon S3 bucket, you might incur minimum storage duration charges for certain storage classes. For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).
**Warning**  
You can't deselect this option and enable **Transfer all data**. When you transfer all data, DataSync doesn't scan your destination location and doesn't know what to delete.

1. Select **Overwrite files** if you want DataSync to modify data in the destination location when the source data or metadata has changed.

   If your task overwrites objects, you might incur additional charges for certain storage classes (for example, for retrieval or early deletion). For detailed information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).

   If you don't choose this option, the destination data isn't overwritten even if the source data differs.

1. Under **Transfer options**, select how you want DataSync to handle metadata. For more information about the options, see [Metadata handling options](#task-option-metadata-handling).
**Important**  
The options you see in the console depend on your task's source and destination locations. You might have to expand **Additional settings** to see some of these options.
   + **Copy ownership**
   + **Copy permissions**
   + **Copy timestamps**
   + **Copy object tags**
   + **Copy ownership, DACLs, and SACLs**
   + **Copy ownership and DACLs**
   + **Do not copy ownership or ACLs**

### Using the DataSync API
<a name="configure-file-metadata-options-api"></a>

You can configure file, object, and metadata handling options by using the `Options` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)

# Configuring how AWS DataSync verifies data integrity
<a name="configure-data-verification-options"></a>

During a transfer, AWS DataSync uses checksum verification to verify the integrity of the data that you copy between locations. You can also configure DataSync to perform additional verification at the end of your transfer.

## Data verification options
<a name="data-verification-options"></a>

Use the following information to help you decide if and how you want DataSync to perform these additional checks.


| Console option | API option | Description | 
| --- | --- | --- | 
|  **Verify only transferred data** (recommended)  |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `ONLY_FILES_TRANSFERRED`  |  DataSync calculates the checksum of transferred data (including metadata) at the source location. At the end of your transfer, DataSync compares this checksum to the checksum calculated on that same data at the destination. We recommend this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).  | 
|  **Verify all data**  |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `POINT_IN_TIME_CONSISTENT`  |  At the end of your transfer, DataSync checks the entire source and destination to verify that both locations are fully synchronized.   Not supported when your task uses [Enhanced mode](choosing-task-mode.md).  If you use a [manifest](transferring-with-manifest.md), DataSync only scans and verifies what's listed in the manifest. You can't use this option when transferring to S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive storage classes. For more information, see [Storage class considerations with Amazon S3 transfers](create-s3-location.md#using-storage-classes).   | 
| Don't verify data after transfer |  [VerifyMode](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-VerifyMode) set to `NONE`  | DataSync performs data integrity checks only during your transfer. Unlike other options, there's no additional verification at the end of your transfer. | 
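With the AWS CLI, the verification mode is set through the `--options` parameter. A sketch with a placeholder task ARN:

```shell
# Sketch with a placeholder task ARN: verify only the data that was
# transferred (the recommended option).
aws datasync update-task \
   --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
   --options VerifyMode=ONLY_FILES_TRANSFERRED
```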

## Configuring data verification
<a name="configure-data-verification"></a>

You can configure data verification options when creating a task, updating a task, or starting a task execution.

### Using the DataSync console
<a name="configure-data-verification-options-console"></a>

The following instructions describe how to configure data verification options when creating a task.

**To configure data verification by using the console**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Verification**, choose one of the following:
   + **Verify only transferred data** (recommended)
   + **Verify all data**
   + **Don't verify data after transfer**

### Using the DataSync API
<a name="configure-data-verification-options-api"></a>

You can configure how DataSync verifies data by using the `VerifyMode` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)

# Setting bandwidth limits for your AWS DataSync task
<a name="configure-bandwidth"></a>

You can configure network bandwidth limits for your AWS DataSync task and each of its executions.

## Limiting bandwidth for a task
<a name="configure-bandwidth-create"></a>

Set a bandwidth limit when creating, editing, or starting a task.

### Using the DataSync console
<a name="configure-bandwidth-create-console"></a>

The following instructions describe how to configure a bandwidth limit for your task when you're creating it.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For **Bandwidth limit**, choose one of the following:
   + Select **Use available** to use all of the available network bandwidth for each task execution.
   + Select **Set bandwidth limit (MiB/s)** and enter the maximum bandwidth that you want DataSync to use for each task execution.

### Using the DataSync API
<a name="configure-bandwidth-create-api"></a>

You can configure a task's bandwidth limit by using the `BytesPerSecond` parameter with any of the following operations:
+ [CreateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateTask.html)
+ [UpdateTask](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTask.html)
+ [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html)
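As a sketch (the task ARN is a placeholder), note that `BytesPerSecond` is expressed in bytes, so a 10 MiB/s limit is 10,485,760:

```shell
# Sketch with a placeholder task ARN: cap each task execution at
# about 10 MiB/s. A value of -1 means use all available bandwidth.
aws datasync update-task \
   --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
   --options BytesPerSecond=10485760
```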

## Throttling bandwidth for a task execution
<a name="adjust-bandwidth-throttling"></a>

You can modify the bandwidth limit for a running or queued task execution.

### Using the DataSync console
<a name="adjust-bandwidth-throttling-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. Choose the task and then select **History** to view the task's executions.

1. Choose the task execution that you want to modify and then choose **Edit**.

1. In the dialog box, choose one of the following:
   + Select **Use available** to use all of the available network bandwidth for the task execution.
   + Select **Set bandwidth limit (MiB/s)** and enter the maximum bandwidth that you want DataSync to use for the task execution.

1. Choose **Save changes**.

   The new bandwidth limit takes effect within 60 seconds.

### Using the DataSync API
<a name="adjust-bandwidth-throttling-api"></a>

You can modify the bandwidth limit for a running or queued task execution by using the `BytesPerSecond` parameter with the [UpdateTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_UpdateTaskExecution.html) operation.
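A sketch of that call (the execution ARN is a placeholder):

```shell
# Sketch with a placeholder execution ARN: remove the bandwidth limit
# on a running or queued execution (-1 means unlimited).
aws datasync update-task-execution \
   --task-execution-arn 'arn:aws:datasync:region:account-id:task/task-id/execution/exec-id' \
   --options BytesPerSecond=-1
```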

# Scheduling when your AWS DataSync task runs
<a name="task-scheduling"></a>

You can set up an AWS DataSync task schedule to periodically transfer data between storage locations.

## How DataSync task scheduling works
<a name="how-task-scheduling-works"></a>

A scheduled DataSync task runs at a frequency that you specify, with a minimum interval of 1 hour. You can create a task schedule by using a cron or rate expression.

**Important**  
You can't schedule a task to run at an interval faster than 1 hour.

**Using cron expressions**  
Use cron expressions for task schedules that run at a specific time on specific days. For example, here's how you can configure a task schedule in the AWS CLI that runs at 12:00 PM UTC every Sunday and Wednesday.  

```
cron(0 12 ? * SUN,WED *)
```

**Using rate expressions**  
Use rate expressions for task schedules that run on a regular interval, such as every 12 hours. For example, here's how you can configure a task schedule in the AWS CLI that runs every 12 hours:  

```
rate(12 hours)
```

**Tip**  
For more information about cron and rate expression syntax, see [Cron expressions](https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-cron-expressions.html) in the *Amazon EventBridge User Guide*.

## Creating a DataSync task schedule
<a name="configure-task-schedule"></a>

You can schedule how frequently your task runs by using the DataSync console, AWS CLI, or DataSync API.

### Using the DataSync console
<a name="configure-task-schedule-console"></a>

The following instructions describe how to set up a schedule when creating a task. You can modify the schedule later when editing the task.

In the console, some scheduling options let you specify the exact time that your task runs (such as daily at 10:30 PM). If you don't include a time for these options, your task runs at the time that you create (or update) the task.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. For schedule **Frequency**, do one of the following:
   + Choose **Not scheduled** if you don't want your task to run on a schedule.
   + Choose **Hourly**, then choose the minute during the hour that you want your task to run. 
   + Choose **Daily** and enter the UTC time that you want your task to run.
   + Choose **Weekly** and the day of the week and enter the UTC time that you want the task to run.
   + Choose **Days of the week**, choose a specific day or days, and enter the UTC time that the task should run in the format HH:MM.
   + Choose **Custom**, and then select **Cron expression** or **Rate expression**. Enter your task schedule with a minimum interval of 1 hour. 

### Using the AWS CLI
<a name="configure-task-schedule-api"></a>

You can create a schedule for your DataSync task by using the `--schedule` parameter with the `create-task`, `update-task`, or `start-task-execution` command.

The following instructions describe how to do this with the `create-task` command.

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
     --source-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh \
     --destination-location-arn arn:aws:datasync:us-east-1:123456789012:location/loc-abcdefgh12345678 \
     --schedule '{
       "ScheduleExpression": "cron(0 12 ? * SUN,WED *)"
     }'
   ```

1. For the `--source-location-arn` parameter, specify the Amazon Resource Name (ARN) of the location that you're transferring data from.

1. For the `--destination-location-arn` parameter, specify the ARN of the location that you're transferring data to.

1. For the `--schedule` parameter, specify a cron or rate expression for your schedule.

   In the example, the cron expression `cron(0 12 ? * SUN,WED *)` sets a task schedule that runs at 12:00 PM UTC every Sunday and Wednesday.

1. Run the `create-task` command to create your task with the schedule.

## Pausing a DataSync task schedule
<a name="pause-task-schedule"></a>

There can be situations where you need to pause your DataSync task schedule. For example, you might need to temporarily disable a recurring transfer to fix an issue with your task or perform maintenance on your storage system.

DataSync might disable your task schedule automatically for the following reasons:
+ Your task fails repeatedly with the same error.
+ You [disable an AWS Region](https://docs.aws.amazon.com/accounts/latest/reference/manage-acct-regions.html) that your task is using.

### Using the DataSync console
<a name="pause-scheduled-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. Choose the task that you want to pause the schedule for, and then choose **Edit**.

1. For **Schedule**, turn off **Enable schedule**. Choose **Save changes**.

### Using the AWS CLI
<a name="pause-scheduled-task-cli"></a>

1. Copy the following `update-task` command:

   ```
   aws datasync update-task \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh \
     --schedule '{
       "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
       "Status": "DISABLED"
     }'
   ```

1. For the `--task-arn` parameter, specify the ARN of the task that you want to pause the schedule for.

1. For the `--schedule` parameter, do the following:
   + For `ScheduleExpression`, specify a cron or rate expression for your schedule.

     In the example, the expression `cron(0 12 ? * SUN,WED *)` sets a task schedule that runs at 12:00 PM UTC every Sunday and Wednesday.
   + For `Status`, specify `DISABLED` to pause the task schedule.

1. Run the `update-task` command.

1. To resume the schedule, run the same `update-task` command with `Status` set to `ENABLED`.
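The pause/resume toggle in the steps above comes down to the `Status` field of the schedule JSON. As a sketch, this hypothetical Python helper renders the `--schedule` value for `update-task` either way:

```python
import json

def schedule_arg(expression, paused):
    """Render the JSON for update-task's --schedule parameter.
    Status=DISABLED pauses the schedule; Status=ENABLED resumes it."""
    return json.dumps({
        "ScheduleExpression": expression,
        "Status": "DISABLED" if paused else "ENABLED",
    })

print(schedule_arg("cron(0 12 ? * SUN,WED *)", paused=True))
```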

## Checking the status of a DataSync task schedule
<a name="check-scheduled-task"></a>

You can see whether your DataSync task schedule is enabled. 

### Using the DataSync console
<a name="check-scheduled-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, and then choose **Tasks**.

1. In the **Schedule** column, check whether the task's schedule is enabled or disabled.

### Using the AWS CLI
<a name="check-scheduled-task-cli"></a>

1. Copy the following `describe-task` command:

   ```
   aws datasync describe-task \
     --task-arn arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh
   ```

1. For the `--task-arn` parameter, specify the ARN of the task that you want information about.

1. Run the `describe-task` command.

You get a response that provides details about your task, including its schedule. The following example focuses on the schedule configuration and doesn't show a full `describe-task` response.

In this example, the task's schedule was disabled manually. If the DataSync `SERVICE` disabled the schedule, `DisabledReason` includes an error message to help you understand why the task keeps failing. For more information, see [Troubleshooting AWS DataSync issues](troubleshooting-datasync.md).

```
{
    "TaskArn": "arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh",
    "Status": "AVAILABLE",
    "Schedule": {
        "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
        "Status": "DISABLED",
        "StatusUpdateTime": 1697736000,
        "DisabledBy": "USER",
        "DisabledReason": "Manually disabled by user."
    },
    ...
}
```
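If you're scripting around `describe-task`, you can read the schedule details out of the JSON response. The following Python sketch parses a trimmed response like the example above (the real response contains more fields):

```python
import json

# A trimmed describe-task response, matching the example above.
response = json.loads("""
{
    "TaskArn": "arn:aws:datasync:us-east-1:123456789012:task/task-12345678abcdefgh",
    "Status": "AVAILABLE",
    "Schedule": {
        "ScheduleExpression": "cron(0 12 ? * SUN,WED *)",
        "Status": "DISABLED",
        "DisabledBy": "USER",
        "DisabledReason": "Manually disabled by user."
    }
}
""")

schedule = response.get("Schedule", {})
if schedule.get("Status") == "DISABLED":
    # DisabledBy is USER for a manual pause and SERVICE when DataSync
    # disabled the schedule itself (for example, after repeated failures).
    print(f"Paused by {schedule['DisabledBy']}: {schedule['DisabledReason']}")
```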

# Tagging your AWS DataSync tasks
<a name="tagging-tasks"></a>

*Tags* are key-value pairs that help you manage, filter, and search for your AWS DataSync resources. You can add up to 50 tags to each DataSync task and task execution.

For example, you might create a task for a large data migration and tag the task with the key **Project** and value **Large Migration**. To further organize the migration, you could tag one run of the task with the key **Transfer Date** and value **May 2021** (subsequent task executions might be tagged **June 2021**, **July 2021**, and so on).

## Tagging your DataSync task
<a name="tagging-tasks-console"></a>

These instructions describe how to tag your DataSync task when you create it.

### Using the DataSync console
<a name="tagging-tasks-console-steps"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. Configure your task's source and destination locations.

   For more information, see [Where can I transfer my data with AWS DataSync?](working-with-locations.md)

1. On the **Configure settings** page, choose **Add new tag** to tag your task.

### Using the AWS CLI
<a name="tagging-tasks-cli-steps"></a>

1. Copy the following `create-task` command:

   ```
   aws datasync create-task \
       --source-location-arn 'arn:aws:datasync:region:account-id:location/source-location-id' \
       --destination-location-arn 'arn:aws:datasync:region:account-id:location/destination-location-id' \
       --tags Key=tag-key,Value=tag-value
   ```

1. Specify the following parameters in the command:
   + `--source-location-arn` – Specify the Amazon Resource Name (ARN) of the source location in your transfer.
   + `--destination-location-arn` – Specify the ARN of the destination location in your transfer.
   + `--tags` – Specify the tags that you want to apply to the task.

     For more than one tag, separate each key-value pair with a space.

1. (Optional) Specify other parameters that make sense for your transfer scenario.

   For a list of `--options`, see the [create-task](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/create-task.html) command.

1. Run the `create-task` command.

   You get a response that shows the task that you just created.

   ```
   {
       "TaskArn": "arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890"
   }
   ```

To view the tags you added to this task, you can use the [list-tags-for-resource](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/list-tags-for-resource.html) command.
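If you're generating the `--tags` value from a script, a small helper (hypothetical, not part of any SDK) can render a dict of tags into the space-separated `Key=...,Value=...` shorthand shown above:

```python
def tags_shorthand(tags):
    """Render a dict of tags as the space-separated Key=...,Value=...
    shorthand that the --tags parameter expects."""
    return " ".join(f"Key={key},Value={value}" for key, value in tags.items())

print(tags_shorthand({"Project": "Large Migration", "Transfer Date": "May 2021"}))
```

In a real shell, quote each pair (or the whole argument) if a key or value contains spaces.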

## Tagging your DataSync task execution
<a name="tagging-task-executions-console"></a>

You can tag each run of your DataSync task.

If your task already has tags, remember the following about using tags with task executions:
+ If you start your task with the console, its user-created tags are applied automatically to the task execution. However, system-created tags that begin with `aws:` are not applied.
+ If you start your task with the DataSync API or AWS CLI, its tags are not applied automatically to the task execution.

### Using the DataSync console
<a name="tagging-task-executions-console"></a>

To add, edit, or remove tags from a task execution, you must start the task with overriding options.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**.

1. Choose the task.

1. Choose **Start**, then choose one of the following options: 
   + **Start with defaults** – Applies any tags associated with your task.
   + **Start with overriding options** – Allows you to add, edit, or remove tags for this particular task execution.

### Using the AWS CLI
<a name="tagging-task-executions-cli"></a>

1. Copy the following `start-task-execution` command:

   ```
   aws datasync start-task-execution \
       --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
       --tags Key=tag-key,Value=tag-value
   ```

1. Specify the following parameters in the command:
   + `--task-arn` – Specify the ARN of the task that you want to start.
   + `--tags` – Specify the tags that you want to apply to this specific run of the task.

     For more than one tag, separate each key-value pair with a space.

1. (Optional) Specify other parameters that make sense for your situation.

   For more information, see the [start-task-execution](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/start-task-execution.html) command.

1. Run the `start-task-execution` command.

   You get a response that shows the task execution that you just started.

   ```
   {
       "TaskExecutionArn": "arn:aws:datasync:us-east-2:123456789012:task/task-abcdef01234567890"
   }
   ```

To view the tags you added to this task execution, you can use the [list-tags-for-resource](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/list-tags-for-resource.html) command.

# Starting a task to transfer your data
<a name="run-task"></a>

Once you create your AWS DataSync transfer task, you can start moving data. Each run of a task is called a *task execution*. For information about what happens during a task execution, see [How DataSync transfers files, objects, and directories](how-datasync-transfer-works.md#transferring-files).

**Important**  
If you're planning to transfer data to or from an Amazon S3 location, review [how DataSync can affect your S3 request charges](create-s3-location.md#create-s3-location-s3-requests) and the [DataSync pricing page](https://aws.amazon.com/datasync/pricing/) before you begin.

## Starting your task
<a name="starting-task"></a>

Once you've created your task, you can begin moving data right away.

### Using the DataSync console
<a name="starting-task-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**.

1. Choose the task that you want to run.

   Make sure that the task has an **Available** status. You can also select multiple tasks.

1. Choose **Actions** and then choose one of the following options:
   + **Start** – Runs the task (or tasks if you selected more than one).
   + **Start with overriding options** – Allows you to modify some of your task settings before you begin moving data. When you're ready, choose **Start**.

1. Choose **See execution details** to view details about the running task execution.

### Using the AWS CLI
<a name="start-task-execution"></a>

To start your DataSync task, you just need to specify the Amazon Resource Name (ARN) of the task you want to run. Here's an example `start-task-execution` command:

```
aws datasync start-task-execution \
    --task-arn 'arn:aws:datasync:region:account-id:task/task-id'
```

The following example starts a task with a few settings that differ from the task's defaults:

```
aws datasync start-task-execution \
    --task-arn 'arn:aws:datasync:region:account-id:task/task-id' \
    --override-options VerifyMode=NONE,OverwriteMode=NEVER,PosixPermissions=NONE
```

The command returns an ARN for your task execution similar to the following example:

```
{ 
    "TaskExecutionArn": "arn:aws:datasync:us-east-1:209870788375:task/task-08de6e6697796f026/execution/exec-04ce9d516d69bd52f"
}
```
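If you want to save just the execution ID from that response (for example, to pass to other commands later), you can split the ARN apart. A hypothetical Python helper:

```python
def execution_id(task_execution_arn):
    """Pull the execution ID out of a task execution ARN, which has the
    form ...:task/task-id/execution/exec-id."""
    resource = task_execution_arn.split(":")[-1]
    parts = resource.split("/")
    if len(parts) != 4 or parts[2] != "execution":
        raise ValueError(f"Not a task execution ARN: {task_execution_arn}")
    return parts[3]

print(execution_id(
    "arn:aws:datasync:us-east-1:209870788375:task/"
    "task-08de6e6697796f026/execution/exec-04ce9d516d69bd52f"))
```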

**Note**  
Each agent can run a single task at a time.

### Using the DataSync API
<a name="starting-task-api"></a>

You can start your task by using the [StartTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_StartTaskExecution.html) operation. Use the [DescribeTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_DescribeTaskExecution.html) operation to get details about the running task execution.

Once started, you can [check the task execution's status](#understand-task-execution-statuses) as DataSync copies your data. You also can [throttle the task execution's bandwidth](configure-bandwidth.md#adjust-bandwidth-throttling) if needed.

## Task execution statuses
<a name="understand-task-execution-statuses"></a>

When you start a DataSync task, you might see these statuses. ([Task statuses](create-task-how-to.md#understand-task-creation-statuses) are different from task execution statuses.)


| Console status | API status | Description | 
| --- | --- | --- | 
|  Queueing  |  `QUEUED`  |  Another task execution is running and using the same DataSync agent. For more information, see [Knowing when your task is queued](#queue-task-execution).  | 
|  Launching  |  `LAUNCHING`  |  DataSync is initializing the task execution. This status usually passes quickly but can take up to a few minutes.  | 
|  Preparing  |  `PREPARING`  |  DataSync is determining what data to transfer. Preparation can take just minutes, a few hours, or even longer depending on the number of files, objects, or directories in both locations and how you configure your task. How preparation works also depends on your task mode. For more information, see [How DataSync prepares your data transfer](how-datasync-transfer-works.md#how-datasync-prepares).  | 
|  Transferring  |  `TRANSFERRING`  |  DataSync is performing the actual data transfer.  | 
|  Verifying  |  `VERIFYING`  |  DataSync is verifying the integrity of your data at the end of the transfer.  | 
|  Success  |  `SUCCESS`  |  The task execution succeeded.  | 
|  Cancelling  |  `CANCELLING`  | The task execution is in the process of being cancelled. | 
|  Error  |  `ERROR`  |  The task execution failed.  | 
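When polling a task execution's status from a script, it helps to distinguish in-progress statuses from terminal ones. A Python sketch of that grouping (a convention assumed here, not an official SDK helper):

```python
# In-progress statuses mean keep polling; terminal statuses end the loop.
IN_PROGRESS = {"QUEUED", "LAUNCHING", "PREPARING",
               "TRANSFERRING", "VERIFYING", "CANCELLING"}
TERMINAL = {"SUCCESS", "ERROR"}

def is_finished(status):
    """Return True once a task execution has reached a terminal status."""
    return status in TERMINAL

print(is_finished("TRANSFERRING"), is_finished("SUCCESS"))
```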

## Knowing when your task is queued
<a name="queue-task-execution"></a>

When running multiple tasks (for example, you're [transferring a large dataset](create-task-how-to.md#multiple-tasks-large-dataset)), DataSync might queue the tasks to run in a series (first in, first out). Some examples of when this happens include:
+ You run different tasks that use the same DataSync agent. While you can use the same agent for multiple tasks, an agent can only run one task at a time.
+ A task execution is in progress and you start additional executions of the same task using different [filters](filtering.md) or [manifests](transferring-with-manifest.md).

In each example, the queued tasks don't start until the task ahead of them finishes.

## Cancelling your task execution
<a name="cancel-running-task"></a>

You can stop any running or queued DataSync task execution.

**To cancel a task execution by using the console**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**.

1. Choose the **Task ID** for the running task with the execution that you want to stop.

   The task status should be **Running**.

1. Choose **History** to view the task's executions.

1. Select the task execution that you want to stop, and then choose **Stop**.

1. In the dialog box, choose **Stop**.

To cancel a running or queued task by using the DataSync API, see [CancelTaskExecution](https://docs.aws.amazon.com/datasync/latest/userguide/API_CancelTaskExecution.html).

### Automated cancellation of stuck tasks
<a name="auto-cancel-stuck-tasks"></a>

At times a running DataSync task execution can become stuck. 