

# Transferring to or from other cloud storage with AWS DataSync
<a name="transferring-other-cloud-storage"></a>

With AWS DataSync, you can transfer data between some other cloud providers and AWS storage services. For more information, see [Where can I transfer my data with DataSync?](working-with-locations.md)

**Topics**
+ [Planning transfers to or from third-party cloud storage systems](third-party-cloud-transfer-considerations.md)
+ [Configuring AWS DataSync transfers with Google Cloud Storage](tutorial_transfer-google-cloud-storage.md)
+ [Configuring transfers with Microsoft Azure Blob Storage](creating-azure-blob-location.md)
+ [Configuring AWS DataSync transfers with Microsoft Azure Files SMB shares](transferring-azure-files.md)
+ [Configuring transfers with other cloud object storage](creating-other-cloud-object-location.md)

# Planning transfers to or from third-party cloud storage systems
<a name="third-party-cloud-transfer-considerations"></a>

When planning cross-cloud data transfers, consider the following:
+ **Using an agent:** An agent is only required to access storage in other clouds when using Basic mode tasks. [Enhanced mode tasks](https://docs.aws.amazon.com/datasync/latest/userguide/choosing-task-mode.html) don't require an agent. If you decide to use an agent, you can deploy it as an [Amazon EC2 instance](https://docs.aws.amazon.com/datasync/latest/userguide/deploy-agents.html#ec2-deploy-agent) when transferring from a cloud provider's S3-compatible object storage, or as a Google Compute Engine instance or Azure virtual machine when transferring from those providers' storage services. When transferring from file systems in Google Cloud or Azure, we recommend deploying the agent as a Google Cloud or Azure VM so that the agent is as close to the file system as possible. Additionally, DataSync compresses data in transit from the agent to AWS, which can help reduce egress costs. DataSync provides a list of [validated cloud locations](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html) that offer the required [Amazon S3 API compatibility](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html#other-cloud-access).
+ **The other cloud’s object storage endpoint:** A third-party cloud provider's storage endpoint is typically specific to a region or account. You use this regional endpoint as the server in your DataSync object storage location, together with the bucket name.
+ **Storage classes of the source objects:** Like Amazon S3, some cloud providers offer an archive tier that requires a restore before you can access the archived objects. For example, objects in the Azure Blob archive tier must be rehydrated for standard access before a data transfer. Objects in the Google Cloud Storage archive tier can be accessed immediately without a restore, but direct archive tier access incurs retrieval costs. Review your cloud provider's storage class documentation to determine access requirements and retrieval fees before beginning your data transfer. For more information about restoring archived objects in Amazon S3, see [Restoring an archived object](https://docs.aws.amazon.com/AmazonS3/latest/userguide/restoring-objects.html) in the *Amazon Simple Storage Service User Guide*.
+ **Object storage access:** Transferring data between third-party cloud providers requires access to the other cloud's object storage in the form of authentication keys. For example, to provide access to Google Cloud Storage, you configure a DataSync object storage location that connects to the [Google Cloud Storage XML API](https://cloud.google.com/storage/docs/xml-api/overview) and authenticates using a [Hash-based Message Authentication Code (HMAC) key](https://docs.aws.amazon.com/datasync/latest/userguide/tutorial_transfer-google-cloud-storage.html#transfer-google-cloud-storage-create-hmac-key) for your service account. For Azure Blob storage, you configure a dedicated [Azure Blob DataSync location](https://docs.aws.amazon.com/datasync/latest/userguide/creating-azure-blob-location.html#creating-azure-blob-location-how-to) that authenticates using [SAS tokens](https://docs.aws.amazon.com/datasync/latest/userguide/creating-azure-blob-location.html#azure-blob-access). DataSync uses AWS Secrets Manager to securely store your object storage credentials. For more information, see [Securing storage location credentials](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).
+ **Object tag support:**
  + Unlike Amazon S3, not all cloud providers support [object tags](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-tagging.html). DataSync tasks can fail while attempting to read tags from the source location if the cloud provider does not support object tags through the Amazon S3 API, or if the credentials you provide are insufficient to retrieve the tags. DataSync provides a task option to turn off [reading and copying object tags](https://docs.aws.amazon.com/datasync/latest/userguide/API_Options.html#DataSync-Type-Options-ObjectTags) during a transfer if object tags are not supported, or you don't want to retain the tags. Review your cloud provider documentation to determine if object tags are supported, and verify your transfer task's object tag settings before initiating the transfer.
  + You can use the Amazon S3 API to check whether a cloud provider supports `get-object-tagging` requests. For more information, see [get-object-tagging](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/s3api/get-object-tagging.html) in the *AWS CLI Command Reference*.

    A cloud provider that supports object tags will return a response similar to the following example:

    ```
    aws s3api get-object-tagging --bucket BUCKET_NAME \
        --endpoint-url https://BUCKET_ENDPOINT --key prefix/file1

    {
        "TagSet": []
    }
    ```

    A cloud provider that doesn’t support `get-object-tagging` will return the following message:

    ```
    aws s3api get-object-tagging --bucket BUCKET_NAME \
        --endpoint-url https://BUCKET_ENDPOINT --key prefix/file1

    An error occurred (OperationNotSupported) when calling the GetObjectTagging operation: The operation is not supported for this resource
    ```
+ **Associated costs for requests and data egress:** Transferring data from cloud object storage involves [request and egress costs](https://docs.aws.amazon.com/datasync/latest/userguide/creating-other-cloud-object-location.html#other-cloud-considerations-costs) for reading data and transferring it out. Request charges vary between cloud providers and, where applicable, between storage classes. Consult your cloud provider's documentation for the request costs of the storage class that you plan to read from. For an overview of the request charges that DataSync makes during transfers, see [Evaluating S3 request costs when using DataSync](https://docs.aws.amazon.com/datasync/latest/userguide/create-s3-location.html#create-s3-location-s3-requests) and [AWS DataSync pricing](https://aws.amazon.com/datasync/pricing/). Transferring data out of a cloud provider typically incurs egress charges, which vary between providers and depend on the region where the data is stored.
+ **Object storage request rates:** Cloud providers have various performance and request rate characteristics for their object storage platforms. Review your other cloud provider's request rates and determine where the request limits are applied. Plan ahead for highly parallelized transfers consisting of multiple agents, where specific partitioning or performance increases might be required.

  Amazon S3 has documented request rates that you can build your solution around. Amazon S3 request rates are per partitioned prefix and are scalable across multiple prefixes. For more information, see [Best practices design patterns: optimizing Amazon S3 performance](https://docs.aws.amazon.com/AmazonS3/latest/userguide/optimizing-performance.html) in the *Amazon Simple Storage Service User Guide*.
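Under the hood, authenticating to an S3-compatible endpoint with an access key and secret (such as a Google Cloud Storage HMAC key) follows the standard AWS Signature Version 4 scheme. The sketch below shows the well-known SigV4 signing-key derivation, not DataSync's internal implementation; the credential and region strings are placeholders.

```python
import hashlib
import hmac


def sigv4_signing_key(secret_key: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive a Signature Version 4 signing key from a secret key.

    S3-compatible endpoints authenticate requests signed with a key
    derived through this HMAC-SHA256 chain.
    """
    def _sign(key: bytes, msg: str) -> bytes:
        return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()

    k_date = _sign(("AWS4" + secret_key).encode("utf-8"), date)  # date in YYYYMMDD form
    k_region = _sign(k_date, region)
    k_service = _sign(k_region, service)
    return _sign(k_service, "aws4_request")


# Placeholder credentials for illustration only:
key = sigv4_signing_key("EXAMPLE_SECRET", "20240101", "us-east-1")
print(len(key))  # a 32-byte SHA-256 digest
```

The derived key is then used to sign each request's string-to-sign; the endpoint recomputes the same chain from its copy of the secret to verify the request.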

# Configuring AWS DataSync transfers with Google Cloud Storage
<a name="tutorial_transfer-google-cloud-storage"></a>

With AWS DataSync, you can transfer data between Google Cloud Storage and the following AWS storage services:
+ Amazon S3
+ Amazon EFS
+ Amazon FSx for Windows File Server
+ Amazon FSx for Lustre
+ Amazon FSx for OpenZFS
+ Amazon FSx for NetApp ONTAP

To begin the transfer setup, create a location for your Google Cloud Storage. This location can serve as either your transfer source or destination. A DataSync agent is required only when you transfer data between Google Cloud Storage and Amazon EFS or Amazon FSx, or when using **Basic mode** tasks. **Enhanced mode** data transfers between Google Cloud Storage and Amazon S3 don't require an agent.

**Note**  
For private connectivity between Google Cloud and AWS, use a Basic mode task with an agent.
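Task mode is set when you create the transfer task. The following is a sketch of a `CreateTask` request body that selects Enhanced mode through the DataSync API's `TaskMode` parameter; the location ARNs and task name are placeholders.

```python
import json

# Placeholder ARNs for illustration; substitute your own location ARNs.
create_task_params = {
    "SourceLocationArn": "arn:aws:datasync:us-east-1:111122223333:location/loc-source",
    "DestinationLocationArn": "arn:aws:datasync:us-east-1:111122223333:location/loc-dest",
    "TaskMode": "ENHANCED",  # Enhanced mode GCS-to-S3 transfers don't need an agent
    "Name": "gcs-to-s3-transfer",
}

# The JSON could be passed to, for example:
#   aws datasync create-task --cli-input-json file://task.json
print(json.dumps(create_task_params, indent=2))
```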

## Overview
<a name="transfer-google-cloud-storage-overview"></a>

DataSync uses the [Google Cloud Storage XML API](https://cloud.google.com/storage/docs/xml-api/overview) for data transfers. This API provides an Amazon S3-compatible interface for reading and writing data with Google Cloud Storage buckets.

When you use Basic mode for transfers, you can deploy the agent in Google Cloud or in your Amazon VPC.

------
#### [ Agent in Google Cloud ]

1. You deploy a DataSync agent in your Google Cloud environment.

1. The agent reads your Google Cloud Storage bucket by using a Hash-based Message Authentication Code (HMAC) key.

1. The objects from your Google Cloud Storage bucket transfer securely through TLS 1.3 into the AWS Cloud by using a public endpoint.

1. The DataSync service writes the data to your S3 bucket.

The following diagram illustrates the transfer.

![\[An example DataSync transfer shows how object data transfers from a Google Cloud Storage bucket to an S3 bucket. First, the DataSync agent is deployed in your Google Cloud environment. Then, the DataSync agent reads the Google Cloud Storage bucket. The data moves securely through a public endpoint into AWS, where DataSync writes the objects to an S3 bucket in the same AWS Region where you're using DataSync.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/diagram-transfer-google-cloud-storage-public.png)


------
#### [ Agent in your VPC ]

1. You deploy a DataSync agent in a virtual private cloud (VPC) in your AWS environment.

1. The agent reads your Google Cloud Storage bucket by using a Hash-based Message Authentication Code (HMAC) key.

1. The objects from your Google Cloud Storage bucket transfer securely through TLS 1.3 into the AWS Cloud by using a private VPC endpoint.

1. The DataSync service writes the data to your S3 bucket.

The following diagram illustrates the transfer.

![\[An example DataSync transfer shows how object data transfers from a Google Cloud Storage bucket to an S3 bucket. First, the DataSync agent is deployed in a VPC in AWS. Then, the DataSync agent reads the Google Cloud Storage bucket. The data moves securely through a VPC endpoint into AWS, where DataSync writes the objects to an S3 bucket in the same AWS Region as the VPC.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/diagram-transfer-google-cloud-storage.png)


------

## Costs
<a name="transfer-google-cloud-storage-cost"></a>

The fees associated with this migration might include:
+ Running a Google [Compute Engine](https://cloud.google.com/compute/all-pricing) virtual machine (VM) instance (if you deploy your DataSync agent in Google Cloud)
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance (if you deploy your DataSync agent in a VPC within AWS)
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to [Google Cloud Storage](https://cloud.google.com/storage/pricing) and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is one of your transfer locations)
+ Transferring data out of [Google Cloud Storage](https://cloud.google.com/storage/pricing)
+ Storing data in [Amazon S3](https://aws.amazon.com/s3/pricing/)

## Prerequisites
<a name="transfer-google-cloud-storage-prerequisites"></a>

Before you begin, do the following if you haven’t already:
+ [Create a Google Cloud Storage bucket](https://cloud.google.com/storage/docs/creating-buckets) with the objects that you want to transfer to AWS.
+ [Sign up for an AWS account](https://portal.aws.amazon.com/billing/signup).
+ [Create an Amazon S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html) for storing your objects after they're in AWS.

## Step 1: Create an HMAC key for your Google Cloud Storage bucket
<a name="transfer-google-cloud-storage-create-hmac-key"></a>

DataSync uses an HMAC key that's associated with your Google service account to authenticate with and read the bucket that you’re transferring data from. (For detailed instructions on how to create HMAC keys, see the [Google Cloud Storage documentation](https://cloud.google.com/storage/docs/authentication/hmackeys).)

**To create an HMAC key**

1. Create an HMAC key for your Google service account.

1. Make sure that your Google service account has at least `Storage Object Viewer` permissions.

1. Save your HMAC key's access ID and secret in a secure location.

   You'll need these items later to configure your DataSync source location.
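One way to keep the access ID and secret together is a single JSON secret string that you can store in AWS Secrets Manager (for example, with `aws secretsmanager create-secret --secret-string`). The values and field names below are illustrative, not a schema that DataSync requires; when you configure the location later, DataSync reads the secret key value itself.

```python
import json

# Placeholder values; use your real HMAC access ID and secret.
hmac_access_id = "GOOGEXAMPLEACCESSID"
hmac_secret = "EXAMPLE_SECRET"

# Bundle both halves of the key pair into one secret string.
secret_string = json.dumps({"access_id": hmac_access_id, "secret": hmac_secret})
print(secret_string)
```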

## Step 2: Configure your network
<a name="transfer-google-cloud-storage-configure-network"></a>

Network configuration is required only when using a DataSync agent with your transfer. The network requirements for this migration depend on where you choose to deploy your agent.

### For a DataSync agent in Google Cloud
<a name="transfer-google-cloud-storage-configure-public"></a>

If you want to host your DataSync agent in Google Cloud, configure your network to [allow DataSync transfers through a public endpoint](datasync-network.md#using-public-endpoints).

### For a DataSync agent in your VPC
<a name="transfer-google-cloud-storage-configure-vpc"></a>

If you want to host your agent in AWS, you need a VPC with an interface endpoint. DataSync uses the VPC endpoint to facilitate the transfer.

**To configure your network for a VPC endpoint**

1. If you don't have one, [create a VPC](https://docs.aws.amazon.com/vpc/latest/userguide/working-with-vpcs.html#Create-VPC) in the same AWS Region as your S3 bucket.

1. [Create a private subnet for your VPC](https://docs.aws.amazon.com/vpc/latest/userguide/create-subnets.html).

1. [Create a VPC service endpoint](https://docs.aws.amazon.com/vpc/latest/privatelink/create-interface-endpoint.html) for DataSync.

1. Configure your network to [allow DataSync transfers through a VPC service endpoint](datasync-network.md#using-vpc-endpoint).

   To do this, modify the [security group](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html) that's associated with your VPC service endpoint.

## Step 3: Create a DataSync agent (optional)
<a name="transfer-google-cloud-storage-create-agent"></a>

A DataSync agent is only required for **Basic** mode tasks. If you use **Enhanced** mode to transfer between Google Cloud Storage and Amazon S3, no agent is required. For **Basic** mode, you need a DataSync agent that can access your Google Cloud Storage bucket.

### For Google Cloud
<a name="transfer-google-cloud-storage-choose-endpoint"></a>

In this scenario, the DataSync agent runs in your Google Cloud environment.

**Before you begin**: [Install the Google Cloud CLI](https://cloud.google.com/sdk/docs/install).

**To create the agent for Google Cloud**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, then choose **Create agent**.

1. For **Hypervisor**, choose **VMware ESXi**, then choose **Download the image** to download a `.zip` file that contains the agent.

1. Open a terminal. Unzip the image by running the following command:

   ```
   unzip AWS-DataSync-Agent-VMWare.zip
   ```

1. Extract the agent's `.ova` file (its name begins with `aws-datasync`) by running the following command:

   ```
   tar -xvf aws-datasync-2.0.1655755445.1-x86_64.xfs.gpt.ova
   ```

1. Import the agent's `.vmdk` file into Google Cloud by running the following Google Cloud CLI command:

   ```
   gcloud compute images import aws-datasync-2-test \
      --source-file aws-datasync-2.0.1655755445.1-x86_64.xfs.gpt-disk1.vmdk \
      --os centos-7
   ```
**Note**  
Importing the `.vmdk` file might take up to two hours.

1. Create and start a VM instance for the agent image that you just imported. 

   The instance needs the following configurations for your agent. (For detailed instructions on how to create an instance, see the [Google Cloud Compute Engine documentation](https://cloud.google.com/compute/docs/instances).)
   + For the machine type, choose one of the following:
     + **e2-standard-8** – For DataSync task executions working with up to 20 million objects.
     + **e2-standard-16** – For DataSync task executions working with more than 20 million objects.
   + For the boot disk settings, go to the custom images section. Then choose the DataSync agent image that you just imported.
   + For the service account setting, choose your Google service account (the same account that you used in [Step 1](#transfer-google-cloud-storage-create-hmac-key)).
   + For the firewall setting, choose the option to allow HTTP (port 80) traffic.

     To activate your DataSync agent, port 80 must be open on the agent. The port doesn't need to be publicly accessible. Once activated, DataSync closes the port.

1. After the VM instance is running, take note of its public IP address.

   You'll need this IP address to activate the agent.

1. Go back to the DataSync console. On the **Create agent** screen where you downloaded the agent image, do the following to activate your agent:
   + For **Endpoint type**, choose the public service endpoints option (for example, **Public service endpoints in US East (Ohio)**).
   + For **Activation key**, choose **Automatically get the activation key from your agent**.
   + For **Agent address**, enter the public IP address of the agent VM instance that you just created.
   + Choose **Get key**.

1. Give your agent a name, and then choose **Create agent**.

Your agent is online and ready to transfer data.

### For your VPC
<a name="transfer-google-cloud-storage-deploy-agent"></a>

In this scenario, the agent runs as an Amazon EC2 instance in a VPC that's associated with your AWS account.

**Before you begin**: [Set up the AWS Command Line Interface (AWS CLI)](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html).

**To create the agent for your VPC**

1. Open a terminal. Make sure to configure your AWS CLI profile to use the account that's associated with your S3 bucket.

1. Copy the following command. Replace `vpc-region` with the AWS Region where your VPC resides (for example, `us-east-1`).

   ```
   aws ssm get-parameter --name /aws/service/datasync/ami --region vpc-region
   ```

1. Run the command. In the output, take note of the `"Value"` property.

   This value is the DataSync Amazon Machine Image (AMI) ID of the Region that you specified. For example, an AMI ID could look like `ami-1234567890abcdef0`.

1. Copy the following URL. Again, replace `vpc-region` with the AWS Region where your VPC resides. Then, replace `ami-id` with the AMI ID that you noted in the previous step.

   ```
   https://console.aws.amazon.com/ec2/v2/home?region=vpc-region#LaunchInstanceWizard:ami=ami-id
   ```

1. Paste the URL into a browser.

   The Amazon EC2 instance launch page in the AWS Management Console displays.

1. For **Instance type**, choose one of the [recommended Amazon EC2 instances for DataSync agents](agent-requirements.md#ec2-instance-types).

1. For **Key pair**, choose an existing key pair, or create a new one.

1. For **Network settings**, choose the VPC and subnet where you want to deploy the agent.

1. Choose **Launch instance**.

1. Once the Amazon EC2 instance is running, [choose your VPC endpoint](choose-service-endpoint.md#datasync-in-vpc).

1. [Activate your agent](activate-agent.md).
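Steps 2 through 4 above can be sketched in a few lines: pull the AMI ID out of the `get-parameter` JSON output and build the launch URL from it. The JSON below mirrors the output shape of that command, using the example AMI ID from above.

```python
import json

# Example output shape from:
#   aws ssm get-parameter --name /aws/service/datasync/ami --region us-east-1
ssm_output = """
{
  "Parameter": {
    "Name": "/aws/service/datasync/ami",
    "Value": "ami-1234567890abcdef0",
    "Type": "String"
  }
}
"""

region = "us-east-1"  # replace with the Region where your VPC resides
ami_id = json.loads(ssm_output)["Parameter"]["Value"]
launch_url = (
    "https://console.aws.amazon.com/ec2/v2/home"
    f"?region={region}#LaunchInstanceWizard:ami={ami_id}"
)
print(launch_url)
```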

## Step 4: Create a DataSync source location for your Google Cloud Storage bucket
<a name="transfer-google-cloud-storage-create-source"></a>

To set up a DataSync location for your Google Cloud Storage bucket, you need the access ID and secret for the HMAC key that you created in [Step 1](#transfer-google-cloud-storage-create-hmac-key).

**To create the DataSync source location**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

1. For **Server**, enter **storage.googleapis.com**.

1. For **Bucket name**, enter the name of your Google Cloud Storage bucket.

1. For **Folder**, enter an object prefix.

   DataSync only copies objects with this prefix.

1. If your transfer requires an agent, choose **Use agents**, then choose the agent that you created in [Step 3](#transfer-google-cloud-storage-create-agent).

1. Expand **Additional settings**. For **Server protocol**, choose **HTTPS**. For **Server port**, choose **443**.

1. Scroll down to the **Authentication** section. Make sure that the **Requires credentials** check box is selected, and then do the following:
   + For **Access key**, enter your HMAC key's access ID.
   + For **Secret key**, either enter your HMAC key's secret key directly, or specify an AWS Secrets Manager secret that contains the key. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

1. Choose **Create location**.
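The console settings above map onto the DataSync `CreateLocationObjectStorage` API. The following is a sketch of that request body, not a complete working call; the bucket name, prefix, agent ARN, and access key are placeholders, and you would supply the secret key (or a Secrets Manager reference) separately.

```python
import json

# Placeholder values mirroring the console steps above.
location_params = {
    "ServerHostname": "storage.googleapis.com",
    "BucketName": "my-gcs-bucket",
    "Subdirectory": "/prefix/",
    "ServerProtocol": "HTTPS",
    "ServerPort": 443,
    "AgentArns": ["arn:aws:datasync:us-east-1:111122223333:agent/agent-0example"],
    "AccessKey": "GOOGEXAMPLEACCESSID",
}

# Could be passed with, for example:
#   aws datasync create-location-object-storage --cli-input-json file://loc.json
print(json.dumps(location_params, indent=2))
```

Omit `AgentArns` for Enhanced mode transfers that don't use an agent.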

## Step 5: Create a DataSync destination location for your S3 bucket
<a name="transfer-google-cloud-storage-create-destination"></a>

You need a DataSync location for the S3 bucket where you want your data to end up.

**To create the DataSync destination location**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. [Create a DataSync location for the S3 bucket](create-s3-location.md).

   If you deployed the DataSync agent in your VPC, this tutorial assumes that the S3 bucket is in the same AWS Region as your VPC and DataSync agent. 

## Step 6: Create and start a DataSync task
<a name="transfer-google-cloud-storage-start-task"></a>

With your source and destination locations configured, you can start moving your data into AWS.

**To create and start the DataSync task**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Tasks**, and then choose **Create task**.

1. On the **Configure source location** page, do the following:

   1. Choose **Choose an existing location**.

   1. Choose the source location that you created in [Step 4](#transfer-google-cloud-storage-create-source), then choose **Next**.

1. On the **Configure destination location** page, do the following:

   1. Choose **Choose an existing location**.

   1. Choose the destination location that you created in [Step 5](#transfer-google-cloud-storage-create-destination), then choose **Next**.

1. On the **Configure settings** page, do the following:

   1. Under **Data transfer configuration**, expand **Additional settings** and clear the **Copy object tags** check box.
**Important**  
Because the Google Cloud Storage XML API does not support reading or writing object tags, your DataSync task might fail if you try to copy object tags.

   1. Configure any other task settings that you want, and then choose **Next**.

1. On the **Review** page, review your settings, and then choose **Create task**.

1. On the task's details page, choose **Start**, and then choose one of the following:
   + To run the task without modification, choose **Start with defaults**.
   + To modify the task before running it, choose **Start with overriding options**.

When your task finishes, you'll see the objects from your Google Cloud Storage bucket in your S3 bucket.

# Configuring transfers with Microsoft Azure Blob Storage
<a name="creating-azure-blob-location"></a>

With AWS DataSync, you can transfer data between Microsoft Azure Blob Storage (including Azure Data Lake Storage Gen2 blob storage) and the following AWS storage services:
+ [Amazon S3](create-s3-location.md)
+ [Amazon EFS](create-efs-location.md)
+ [Amazon FSx for Windows File Server](create-fsx-location.md)
+ [Amazon FSx for Lustre](create-lustre-location.md)
+ [Amazon FSx for OpenZFS](create-openzfs-location.md)
+ [Amazon FSx for NetApp ONTAP](create-ontap-location.md)

To set up this kind of transfer, you create a [location](how-datasync-transfer-works.md#sync-locations) for your Azure Blob Storage. You can use this location as a transfer source or destination. A DataSync agent is required only when transferring data between Azure Blob and Amazon EFS or Amazon FSx, or when using **Basic** mode tasks. You don't need an agent to transfer data between Azure Blob and Amazon S3 using **Enhanced** mode.

## Providing DataSync access to your Azure Blob Storage
<a name="azure-blob-access"></a>

How DataSync accesses your Azure Blob Storage depends on several factors, including whether you're transferring to or from blob storage and what kind of [shared access signature (SAS) token](#azure-blob-sas-tokens) you're using. Your objects also must be in an [access tier](#azure-blob-access-tiers) that DataSync can work with.

**Topics**
+ [SAS tokens](#azure-blob-sas-tokens)
+ [Access tiers](#azure-blob-access-tiers)

### SAS tokens
<a name="azure-blob-sas-tokens"></a>

A SAS token specifies the access permissions for your blob storage. (For more information about SAS, see the [Azure Blob Storage documentation](https://learn.microsoft.com/azure/storage/common/storage-sas-overview).)

You can generate SAS tokens to provide different levels of access. DataSync supports tokens with the following access levels:
+ Account
+ Container

The access permissions that DataSync needs depend on the scope of your token. Not having the correct permissions can cause your transfer to fail. For example, your transfer won't succeed if you're moving objects with tags to Azure Blob Storage but your SAS token doesn't have tag permissions.

**Topics**
+ [SAS token permissions for account-level access](#account-sas-tokens)
+ [SAS token permissions for container-level access](#container-sas-tokens)
+ [SAS expiration policies](#azure-blob-sas-expiration-policies)

#### SAS token permissions for account-level access
<a name="account-sas-tokens"></a>

DataSync needs an account-level access token with the following permissions (depending on whether you're transferring to or from Azure Blob Storage).

------
#### [ Transfers from blob storage ]
+ **Allowed services** – Blob
+ **Allowed resource types** – Container, Object

  If you don't include these permissions, DataSync can't transfer your object metadata, including [object tags](#azure-blob-considerations-object-tags).
+ **Allowed permissions** – Read, List
+ **Allowed blob index permissions** – Read/Write (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))

------
#### [ Transfers to blob storage ]
+ **Allowed services** – Blob
+ **Allowed resource types** – Container, Object

  If you don't include these permissions, DataSync can't transfer your object metadata, including [object tags](#azure-blob-considerations-object-tags).
+ **Allowed permissions** – Read, Write, List, Delete (if you want DataSync to remove files that aren't in your transfer source)
+ **Allowed blob index permissions** – Read/Write (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))

------

#### SAS token permissions for container-level access
<a name="container-sas-tokens"></a>

DataSync needs a container-level access token with the following permissions (depending on whether you're transferring to or from Azure Blob Storage).

------
#### [ Transfers from blob storage ]
+ Read
+ List
+ Tag (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))
**Note**  
You can't add the tag permission when generating a SAS token in the Azure portal. To add the tag permission, instead generate the token by using the [Microsoft Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer) app or generate a [SAS token that provides account-level access](#account-sas-tokens).

------
#### [ Transfers to blob storage ]
+ Read
+ Write
+ List
+ Delete (if you want DataSync to remove files that aren't in your transfer source)
+ Tag (if you want DataSync to copy [object tags](#azure-blob-considerations-object-tags))
**Note**  
You can't add the tag permission when generating a SAS token in the Azure portal. To add the tag permission, instead generate the token by using the [Microsoft Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer) app or generate a [SAS token that provides account-level access](#account-sas-tokens).

------

#### SAS expiration policies
<a name="azure-blob-sas-expiration-policies"></a>

Make sure that your SAS doesn't expire before you expect to finish your transfer. For information about configuring a SAS expiration policy, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/common/sas-expiration-policy).

If the SAS expires during the transfer, DataSync can no longer access your Azure Blob Storage location. (You might see a `Failed to open directory` error.) If this happens, [update your location](#azure-blob-update-location) with a new SAS token and restart your DataSync task.
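A SAS token carries its own expiry in the `se` query field, so you can check how much lifetime remains before starting a long transfer. The token below is a shortened, made-up example; the `sig` value is a placeholder, and the 24-hour window is an arbitrary illustration.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

# A truncated example SAS token; the "se" field carries the expiry time.
sas_token = "sv=2022-11-02&ss=b&srt=co&sp=rl&se=2030-01-01T00:00:00Z&sig=EXAMPLE"

# Parse the expiry timestamp from the token's query-string fields.
expiry = datetime.strptime(
    parse_qs(sas_token)["se"][0], "%Y-%m-%dT%H:%M:%SZ"
).replace(tzinfo=timezone.utc)

# Flag tokens that would expire within, say, a 24-hour transfer window.
remaining = expiry - datetime.now(timezone.utc)
print(remaining.total_seconds() > 24 * 3600)
```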

### Access tiers
<a name="azure-blob-access-tiers"></a>

When transferring from Azure Blob Storage, DataSync can copy objects in the hot and cool tiers. For objects in the archive access tier, you must rehydrate those objects to the hot or cool tier before you can copy them.

When transferring to Azure Blob Storage, DataSync can copy objects into the hot, cool, and archive access tiers. If you're copying objects into the archive access tier, DataSync can't verify the transfer if you're trying to [verify all data in the destination](configure-data-verification-options.md).

DataSync doesn't support the cold access tier. For more information about access tiers, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/azure/storage/blobs/access-tiers-overview?tabs=azure-portal).

## Considerations with Azure Blob Storage transfers
<a name="azure-blob-considerations"></a>

When planning to transfer data to or from Azure Blob Storage with DataSync, there are some things to keep in mind.

**Topics**
+ [Costs](#azure-blob-considerations-costs)
+ [Blob types](#blob-types)
+ [AWS Region availability](#azure-blob-considerations-regions)
+ [Copying object tags](#azure-blob-considerations-object-tags)
+ [Transferring to Amazon S3](#azure-blob-considerations-s3)
+ [Deleting directories in a transfer destination](#azure-blob-considerations-deleted-files)
+ [Limitations](#azure-blob-limitations)

### Costs
<a name="azure-blob-considerations-costs"></a>

The fees associated with moving data in or out of Azure Blob Storage can include:
+ Running an [Azure virtual machine (VM)](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/linux/) (if you deploy a DataSync agent in Azure)
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance (if you deploy a DataSync agent in a VPC within AWS)
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to [Azure Blob Storage](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/) and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is one of your transfer locations)
+ Transferring data in or out of [Azure Blob Storage](https://azure.microsoft.com/en-us/pricing/details/storage/blobs/)
+ Storing data in an [AWS storage service](working-with-locations.md) supported by DataSync

### Blob types
<a name="blob-types"></a>

How DataSync works with blob types depends on whether you're transferring to or from Azure Blob Storage. When you're moving data into blob storage, the objects or files that DataSync transfers can only be block blobs. When you're moving data out of blob storage, DataSync can transfer block, page, and append blobs.

For more information about blob types, see the [Azure Blob Storage documentation](https://learn.microsoft.com/en-us/rest/api/storageservices/understanding-block-blobs--append-blobs--and-page-blobs).

### AWS Region availability
<a name="azure-blob-considerations-regions"></a>

You can create an Azure Blob Storage transfer location in any [AWS Region that's supported by DataSync](https://docs.aws.amazon.com/general/latest/gr/datasync.html#datasync-region).

### Copying object tags
<a name="azure-blob-considerations-object-tags"></a>

The ability for DataSync to preserve object tags when transferring to or from Azure Blob Storage depends on the following factors:
+ **The size of an object's tags** – DataSync can't transfer an object with tags that exceed 2 KB.
+ **Whether DataSync is configured to copy object tags** – DataSync [copies object tags](configure-metadata.md) by default.
+ **The namespace that your Azure storage account uses** – DataSync can copy object tags if your Azure storage account uses a flat namespace but not if your account uses a hierarchical namespace (a feature of Azure Data Lake Storage Gen2). Your DataSync task will fail if you try to copy object tags and your storage account uses a hierarchical namespace.
+ **Whether your SAS token authorizes tagging** – The permissions that you need to copy object tags vary depending on the level of access that your token provides. Your task will fail if you try to copy object tags and your token doesn't have the right permissions for tagging. For more information, check the permission requirements for [account-level access tokens](#account-sas-tokens) or [container-level access tokens](#container-sas-tokens).
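To anticipate the 2 KB limit, you can estimate the combined size of an object's tag keys and values before transferring. This is a rough sketch with made-up tags (real tags would come from your own storage inventory):

```shell
# Sketch: estimate whether an object's tags fit within DataSync's 2 KB limit.
# The key=value pairs below are illustrative placeholders.
tags='project=migration team=storage env=production'

total=0
for kv in $tags; do
  total=$((total + ${#kv}))   # characters for key, '=', and value
done
echo "approximate tag bytes: $total"

if [ "$total" -gt 2048 ]; then
  echo "tags exceed 2 KB; DataSync can't transfer this object"
fi
```

Note that this counts characters, which matches bytes only for ASCII tag values; multi-byte characters take more space.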

### Transferring to Amazon S3
<a name="azure-blob-considerations-s3"></a>

When transferring to Amazon S3, DataSync won't transfer Azure Blob Storage objects larger than 5 TB or objects with metadata larger than 2 KB.

### Deleting directories in a transfer destination
<a name="azure-blob-considerations-deleted-files"></a>

When transferring to Azure Blob Storage, DataSync can [remove objects in your blob storage that aren't present in your transfer source](configure-metadata.md). (You can configure this option by clearing the **Keep deleted files** setting in the DataSync console. Your [SAS token](#azure-blob-sas-tokens) must also have delete permissions.)

When you configure your transfer this way, DataSync won't delete directories in your blob storage if your Azure storage account is using a hierarchical namespace. In this case, you must manually delete the directories (for example, by using [Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-explorer)).

### Limitations
<a name="azure-blob-limitations"></a>

Remember the following limitations when transferring data to or from Azure Blob Storage:
+ DataSync [creates some directories](filtering.md#directories-ignored-during-transfers) in a location to help facilitate your transfer. If Azure Blob Storage is a destination location and your storage account uses a hierarchical namespace, you might notice task-specific subdirectories (such as `task-000011112222abcde`) in the `/.aws-datasync` folder. DataSync typically deletes these subdirectories following a transfer. If that doesn't happen, you can delete these task-specific directories yourself as long as a task isn't running.
+ DataSync doesn't support using a SAS token to access only a specific folder in your Azure Blob Storage container.
+ You can't provide DataSync a user delegation SAS token for accessing your blob storage.

## Creating your DataSync agent (optional)
<a name="azure-blob-creating-agent"></a>

A DataSync agent is required only for **Basic** mode tasks or when transferring data between Azure Blob Storage and Amazon EFS or Amazon FSx. You don't need an agent to transfer data between Azure Blob Storage and Amazon S3 by using an **Enhanced** mode task. This section describes how to deploy and activate an agent.

**Tip**  
Although you can deploy your agent on an Amazon EC2 instance, deploying a Microsoft Hyper-V agent in Azure puts the agent closer to your storage, which might reduce network latency and allow more of your data to be compressed before it leaves Azure.

### Microsoft Hyper-V agents
<a name="azure-blob-creating-agent-hyper-v"></a>

You can deploy your DataSync agent directly in Azure with a Microsoft Hyper-V image.

**Tip**  
Before you continue, consider using a shell script that can help you deploy your Hyper-V agent in Azure more quickly. You can get more information and download the code on [GitHub](https://github.com/aws-samples/aws-datasync-deploy-agent-azure).  
If you use the script, you can skip ahead to the section about [Getting your agent's activation key](#azure-blob-creating-agent-hyper-v-3).

**Topics**
+ [Prerequisites](#azure-blob-creating-agent-hyper-v-0)
+ [Downloading and preparing your agent](#azure-blob-creating-agent-hyper-v-1)
+ [Deploying your agent in Azure](#azure-blob-creating-agent-hyper-v-2)
+ [Getting your agent's activation key](#azure-blob-creating-agent-hyper-v-3)
+ [Activating your agent](#azure-blob-creating-agent-hyper-v-4)

#### Prerequisites
<a name="azure-blob-creating-agent-hyper-v-0"></a>

To prepare your DataSync agent and deploy it in Azure, you must do the following:
+ Enable Hyper-V on your local machine.
+ Install [PowerShell](https://learn.microsoft.com/en-us/powershell/scripting/install/installing-powershell?view=powershell-7.3&viewFallbackFrom=powershell-7.1) (including the Hyper-V Module).
+ Install the [Azure CLI](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli).
+ Install [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-v10?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&bc=%2Fazure%2Fstorage%2Fblobs%2Fbreadcrumb%2Ftoc.json).

#### Downloading and preparing your agent
<a name="azure-blob-creating-agent-hyper-v-1"></a>

Download an agent from the DataSync console. Before you can deploy the agent in Azure, you must convert it to a fixed-size virtual hard disk (VHD). For more information, see the [Azure documentation](https://learn.microsoft.com/en-us/azure/virtual-machines/windows/prepare-for-upload-vhd-image).

**To download and prepare your agent**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Microsoft Hyper-V**, and then choose **Download the image**.

   The agent downloads in a `.zip` file that contains a `.vhdx` file.

1. Extract the `.vhdx` file on your local machine.

1. Open PowerShell and do the following:

   1. Copy the following `Convert-VHD` cmdlet:

      ```
      Convert-VHD -Path .\local-path-to-vhdx-file\aws-datasync-2.0.1686143940.1-x86_64.xfs.gpt.vhdx `
      -DestinationPath .\local-path-to-vhdx-file\aws-datasync-2.0.1686143940.1-x86_64.vhd -VHDType Fixed
      ```

   1. Replace each instance of `local-path-to-vhdx-file` with the location of the `.vhdx` file on your local machine.

   1. Run the command.

   Your agent is now a fixed-size VHD (with a `.vhd` file format) and ready to deploy in Azure.
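Before uploading, a quick sanity check can save a failed deployment: Azure expects a fixed-size VHD whose data portion (the file size minus the 512-byte VHD footer) is aligned to 1 MiB. The following sketch uses a small synthetic file as a stand-in for your converted `.vhd`:

```shell
# Sketch: verify a fixed-size VHD is 1 MiB-aligned. A fixed VHD is raw disk
# data followed by a 512-byte footer. A synthetic file stands in for the agent.
vhd_file=agent-standin.vhd
dd if=/dev/zero of="$vhd_file" bs=1M count=4 2>/dev/null   # 4 MiB of "disk" data
dd if=/dev/zero bs=512 count=1 >> "$vhd_file" 2>/dev/null  # VHD footer

size=$(wc -c < "$vhd_file")
data_size=$((size - 512))
if [ $((data_size % 1048576)) -eq 0 ]; then
  result="aligned"
else
  result="not aligned: re-run Convert-VHD with -VHDType Fixed"
fi
echo "$result ($data_size bytes)"
```

If your real `.vhd` reports "not aligned", resize the disk to a whole number of MiB before converting.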

#### Deploying your agent in Azure
<a name="azure-blob-creating-agent-hyper-v-2"></a>

Deploying your DataSync agent in Azure involves:
+ Creating a managed disk in Azure
+ Uploading your agent to that managed disk
+ Attaching the managed disk to a Linux virtual machine

**To deploy your agent in Azure**

1. In PowerShell, go to the directory that contains your agent's `.vhd` file.

1. Run the `ls` command and save the `Length` value (for example, `85899346432`).

   This is the size of your agent image in bytes, which you need when creating a managed disk that can hold the image.

1. Do the following to create a managed disk:

   1. Copy the following Azure CLI command:

      ```
      az disk create -n your-managed-disk `
      -g your-resource-group `
      -l your-azure-region `
      --upload-type Upload `
      --upload-size-bytes agent-size-bytes `
      --sku standard_lrs
      ```

   1. Replace `your-managed-disk` with a name for your managed disk.

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-azure-region` with the Azure region where your resource group is located.

   1. Replace `agent-size-bytes` with the size of your agent image.

   1. Run the command.

   This command creates an empty managed disk with a [standard SKU](https://learn.microsoft.com/en-us/rest/api/storagerp/srp_sku_types) where you can upload your DataSync agent.

1. To generate a shared access signature (SAS) that allows write access to the managed disk, do the following:

   1. Copy the following Azure CLI command:

      ```
      az disk grant-access -n your-managed-disk `
      -g your-resource-group `
      --access-level Write `
      --duration-in-seconds 86400
      ```

   1. Replace `your-managed-disk` with the name of the managed disk that you created.

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Run the command.

      In the output, take note of the SAS URI. You need this URI when uploading the agent to Azure.

   The SAS allows write access to the disk for up to 24 hours (`--duration-in-seconds 86400`). You must finish uploading your agent to the managed disk within that window.

1. To upload your agent to your managed disk in Azure, do the following:

   1. Copy the following `AzCopy` command:

      ```
      .\azcopy copy local-path-to-vhd-file sas-uri --blob-type PageBlob
      ```

   1. Replace `local-path-to-vhd-file` with the location of the agent's `.vhd` file on your local machine.

   1. Replace `sas-uri` with the SAS URI that you got when you ran the `az disk grant-access` command.

   1. Run the command.

1. When the agent upload finishes, revoke access to your managed disk. To do this, copy the following Azure CLI command:

   ```
   az disk revoke-access -n your-managed-disk -g your-resource-group
   ```

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-managed-disk` with the name of the managed disk that you created.

   1. Run the command.

1. Do the following to attach your managed disk to a new Linux VM:

   1. Copy the following Azure CLI command:

      ```
      az vm create --resource-group your-resource-group `
      --location your-azure-region `
      --name your-agent-vm `
      --size Standard_E4as_v4 `
      --os-type linux `
      --attach-os-disk your-managed-disk
      ```

   1. Replace `your-resource-group` with the name of the Azure resource group that your storage account belongs to.

   1. Replace `your-azure-region` with the Azure region where your resource group is located.

   1. Replace `your-agent-vm` with a name for the VM that you can remember.

   1. Replace `your-managed-disk` with the name of the managed disk that you're attaching to the VM.

   1. Run the command.

You've deployed your agent. Before you can start configuring your data transfer, you must activate the agent.

#### Getting your agent's activation key
<a name="azure-blob-creating-agent-hyper-v-3"></a>

To manually get your DataSync agent's activation key, follow these steps. 

Alternatively, [DataSync can automatically get the activation key for you](activate-agent.md), but this approach requires some network configuration.

**To get your agent's activation key**

1. In the Azure portal, [enable boot diagnostics for the VM for your agent](https://learn.microsoft.com/en-us/azure/virtual-machines/boot-diagnostics) by choosing the **Enable with custom storage account** setting and specifying your Azure storage account.

   After you've enabled the boot diagnostics for your agent's VM, you can access your agent’s local console to get the activation key.

1. While still in the Azure portal, go to your VM and choose **Serial console**.

1. In the agent's local console, log in by using the following default credentials: 
   + **Username** – **admin**
   + **Password** – **password**

   We recommend changing at least the agent's password at some point. To do this in the agent's local console, enter **5** on the main menu, then use the `passwd` command.

1. Enter **0** to get the agent's activation key.

1. Enter the AWS Region where you're using DataSync (for example, **us-east-1**).

1. Choose the [service endpoint](choose-service-endpoint.md) that the agent will use to connect with AWS. 

1. Save the value of the `Activation key` output. 

#### Activating your agent
<a name="azure-blob-creating-agent-hyper-v-4"></a>

After you have the activation key, you can finish creating your DataSync agent.

**To activate your agent**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Microsoft Hyper-V**.

1. For **Endpoint type**, choose the same type of service endpoint that you specified when you got your agent's activation key (for example, choose **Public service endpoints in *Region name***).

1. Configure your network to work with the service endpoint type that your agent is using. For service endpoint network requirements, see the following topics:
   + [VPC endpoints](datasync-network.md#using-vpc-endpoint)
   + [Public endpoints](datasync-network.md#using-public-endpoints)
   + [Federal Information Processing Standard (FIPS) endpoints](datasync-network.md#using-public-endpoints)

1. For **Activation key**, do the following:

   1. Choose **Manually enter your agent's activation key**.

   1. Enter the activation key that you got from the agent's local console.

1. Choose **Create agent**.

Your agent is ready to connect with your Azure Blob Storage. For more information, see [Creating your Azure Blob Storage transfer location](#creating-azure-blob-location-how-to).

### Amazon EC2 agents
<a name="azure-blob-creating-agent-ec2"></a>

You can deploy your DataSync agent on an Amazon EC2 instance.

**To create an Amazon EC2 agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent).

1. [Choose a service endpoint](choose-service-endpoint.md) that the agent uses to communicate with AWS.

   In this situation, we recommend using a virtual private cloud (VPC) service endpoint.

1. Configure your network to work with [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. [Activate the agent](https://docs.aws.amazon.com/datasync/latest/userguide/activate-agent.html).

## Creating your Azure Blob Storage transfer location
<a name="creating-azure-blob-location-how-to"></a>

You can configure DataSync to use your Azure Blob Storage as a transfer source or destination.

**Before you begin**  
Make sure that you know [how DataSync accesses Azure Blob Storage](#azure-blob-access) and works with [access tiers](#azure-blob-access-tiers) and [blob types](#blob-types). You also need a [DataSync agent](#azure-blob-creating-agent) that can connect to your Azure Blob Storage container.

### Using the DataSync console
<a name="creating-azure-blob-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Microsoft Azure Blob Storage**.

1. For **Container URL**, enter the URL of the container that's involved in your transfer.

1. (Optional) For **Access tier when used as a destination**, choose the [access tier](#azure-blob-access-tiers) that you want your objects or files transferred into.

1. For **Folder**, enter path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. If your transfer requires an agent, choose **Use agents**, then choose the DataSync agent that can connect with your Azure Blob Storage container.

1. For **SAS token**, provide the credentials that DataSync needs to access your blob storage. You can enter a SAS token directly or specify an AWS Secrets Manager secret that contains the token. (Some public datasets on Azure Blob Storage don't require credentials.) For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

   Your SAS token is part of the SAS URI string that comes after your storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2023-12-20T14:54:52Z&se=2023-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=aBBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. (Optional) Enter values for the **Key** and **Value** fields to tag the location.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.

### Using the AWS CLI
<a name="creating-azure-blob-location-cli"></a>

1. Copy the following `create-location-azure-blob` command:

   ```
   aws datasync create-location-azure-blob \
     --container-url "https://path/to/container" \
     --authentication-type "SAS" \
     --sas-configuration '{
         "Token": "your-sas-token"
       }' \
     --agent-arns my-datasync-agent-arn \
     --subdirectory "/path/to/my/data" \
     --access-tier "access-tier-for-destination" \
     --tags '[{"Key": "key1","Value": "value1"}]'
   ```

1. For the `--container-url` parameter, specify the URL of the Azure Blob Storage container that's involved in your transfer.

1. For the `--authentication-type` parameter, specify `SAS`. If you are accessing a public dataset that does not require authentication, specify `NONE`.

1. For the `--sas-configuration` parameter's `Token` option, specify the SAS token that allows DataSync to access your blob storage. 

   You can also provide additional parameters for securing your keys using AWS Secrets Manager. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

   Your SAS token is part of the SAS URI string that comes after your storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2023-12-20T14:54:52Z&se=2023-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=aBBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. (Optional) For the `--agent-arns` parameter, specify the Amazon Resource Name (ARN) of the DataSync agent that can connect to your container.

   Here's an example agent ARN: `arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890aaabfb`

   You can specify more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For the `--subdirectory` parameter, specify path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. (Optional) For the `--access-tier` parameter, specify the [access tier](#azure-blob-access-tiers) (`HOT`, `COOL`, or `ARCHIVE`) that you want your objects or files transferred into.

   This parameter applies only when you're using this location as a transfer destination.

1. (Optional) For the `--tags` parameter, specify key-value pairs that can help you manage, filter, and search for your location.

   We recommend creating a name tag for your location.

1. Run the `create-location-azure-blob` command.

   If the command is successful, you get a response that shows you the ARN of the location that you created. For example:

   ```
   { 
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh" 
   }
   ```

## Viewing your Azure Blob Storage transfer location
<a name="azure-blob-view-location"></a>

You can get details about the existing DataSync transfer location for your Azure Blob Storage.

### Using the DataSync console
<a name="azure-blob-view-location-console"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations**.

1. Choose your Azure Blob Storage location.

   You can see details about your location, including any DataSync transfer tasks that are using it.

### Using the AWS CLI
<a name="azure-blob-view-location-cli"></a>

1. Copy the following `describe-location-azure-blob` command:

   ```
   aws datasync describe-location-azure-blob \
     --location-arn "your-azure-blob-location-arn"
   ```

1. For the `--location-arn` parameter, specify the ARN for the Azure Blob Storage location that you created (for example, `arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh`).

1. Run the `describe-location-azure-blob` command.

   You get a response that shows you details about your location. For example:

   ```
   {
       "LocationArn": "arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh",
       "LocationUri": "azure-blob://my-user.blob.core.windows.net/container-1",
       "AuthenticationType": "SAS",
       "Subdirectory": "/my/images",
       "AgentArns": ["arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890deadfb"]
   }
   ```

## Updating your Azure Blob Storage transfer location
<a name="azure-blob-update-location"></a>

If needed, you can modify your location's configuration by using the AWS CLI.

### Using the AWS CLI
<a name="azure-blob-update-location-cli"></a>

1. Copy the following `update-location-azure-blob` command:

   ```
   aws datasync update-location-azure-blob \
     --location-arn "your-azure-blob-location-arn" \
     --authentication-type "SAS" \
     --sas-configuration '{
         "Token": "your-sas-token"
       }' \
     --agent-arns my-datasync-agent-arn \
     --subdirectory "/path/to/my/data" \
     --access-tier "access-tier-for-destination"
   ```

1. For the `--location-arn` parameter, specify the ARN for the Azure Blob Storage location that you're updating (for example, `arn:aws:datasync:us-east-1:123456789012:location/loc-12345678abcdefgh`).

1. For the `--authentication-type` parameter, specify `SAS`.

1. For the `--sas-configuration` parameter's `Token` option, specify the SAS token that allows DataSync to access your blob storage. 

   The token is part of the SAS URI string that comes after the storage resource URI and a question mark (`?`). A token looks something like this:

   ```
   sp=r&st=2022-12-20T14:54:52Z&se=2022-12-20T22:54:52Z&spr=https&sv=2021-06-08&sr=c&sig=qCBKDWQvyuVcTPH9EBp%2FXTI9E%2F%2Fmq171%2BZU178wcwqU%3D
   ```

1. For the `--agent-arns` parameter, specify the Amazon Resource Name (ARN) of the DataSync agent that you want to connect to your container.

   Here's an example agent ARN: `arn:aws:datasync:us-east-1:123456789012:agent/agent-01234567890aaabfb`

   You can specify more than one agent. For more information, see [Using multiple DataSync agents](do-i-need-datasync-agent.md#multiple-agents).

1. For the `--subdirectory` parameter, specify path segments if you want to limit your transfer to a virtual directory in your container (for example, `/my/images`).

1. (Optional) For the `--access-tier` parameter, specify the [access tier](#azure-blob-access-tiers) (`HOT`, `COOL`, or `ARCHIVE`) that you want your objects to be transferred into.

   This parameter applies only when you're using this location as a transfer destination.

## Next steps
<a name="create-azure-blob-location-next-steps"></a>

After you finish creating a DataSync location for your Azure Blob Storage, you can continue setting up your transfer. Here are some next steps to consider:

1. If you haven't already, [create another location](working-with-locations.md) where you plan to transfer your data to or from your Azure Blob Storage.

1. Learn how DataSync [handles metadata and special files](metadata-copied.md), particularly if your transfer locations don't have a similar metadata structure.

1. Configure how your data gets transferred. For example, you can [transfer only a subset of your data](filtering.md) or delete files in your blob storage that aren't in your source location (as long as your [SAS token](#azure-blob-sas-tokens) has delete permissions).

1. [Start your transfer](run-task.md). 

# Configuring AWS DataSync transfers with Microsoft Azure Files SMB shares
<a name="transferring-azure-files"></a>

You can configure AWS DataSync to transfer data to or from a Microsoft Azure Files Server Message Block (SMB) share.

**Tip**  
For a full walkthrough on moving data from Azure Files SMB shares to AWS, see the [AWS Storage Blog](https://aws.amazon.com/blogs/storage/how-to-move-data-from-azure-files-smb-shares-to-aws-using-aws-datasync/).

## Providing DataSync access to SMB shares
<a name="configuring-smb-azure-files"></a>

DataSync connects to your SMB share using the SMB protocol and authenticates with credentials that you provide it.

**Topics**
+ [Supported SMB protocol versions](#configuring-smb-version-azure-files)
+ [Required permissions](#configuring-smb-permissions-azure-files)

### Supported SMB protocol versions
<a name="configuring-smb-version-azure-files"></a>

By default, DataSync automatically chooses a version of the SMB protocol based on negotiation with your SMB file server.

You can also configure DataSync to use a specific SMB version, but we recommend doing this only if DataSync has trouble negotiating with the SMB file server automatically. DataSync supports SMB versions 1.0 and later. For security reasons, we recommend using SMB version 3.0.2 or later. Earlier versions, such as SMB 1.0, contain known security vulnerabilities that attackers can exploit to compromise your data.

See the following table for a list of options in the DataSync console and API:


| Console option | API option | Description | 
| --- | --- | --- | 
| Automatic |  `AUTOMATIC`  |  DataSync and the SMB file server negotiate the highest version of SMB that they mutually support between 2.1 and 3.1.1. This is the default and recommended option. If you instead choose a specific version that your file server doesn't support, you may get an `Operation Not Supported` error.  | 
|  SMB 3.0.2  |  `SMB3`  |  Restricts the protocol negotiation to only SMB version 3.0.2.  | 
| SMB 2.1 |  `SMB2`  | Restricts the protocol negotiation to only SMB version 2.1. | 
| SMB 2.0 | `SMB2_0` | Restricts the protocol negotiation to only SMB version 2.0. | 
| SMB 1.0 | `SMB1` | Restricts the protocol negotiation to only SMB version 1.0. | 

### Required permissions
<a name="configuring-smb-permissions-azure-files"></a>

DataSync needs a user who has permission to mount and access your SMB location. This can be a local user on your Windows file server or a domain user that's defined in your Microsoft Active Directory.

To set object ownership, DataSync requires the `SE_RESTORE_NAME` privilege, which is usually granted to members of the built-in Active Directory groups **Backup Operators** and **Domain Admins**. Providing a user to DataSync with this privilege also helps ensure sufficient permissions to files, folders, and file metadata, except for NTFS system access control lists (SACLs).

Additional privileges are required to copy SACLs. Specifically, this requires the Windows `SE_SECURITY_NAME` privilege, which is granted to members of the **Domain Admins** group. If you configure your task to copy SACLs, make sure that the user has the required privileges. To learn more about configuring a task to copy SACLs, see [Configuring how to handle files, objects, and metadata](configure-metadata.md).

When you copy data between an SMB file server and an Amazon FSx for Windows File Server file system, the source and destination locations must belong to the same Microsoft Active Directory domain or have an Active Directory trust relationship between their domains.

## Creating your Azure Files transfer location by using the console
<a name="create-azure-files-smb-location-how-to"></a>

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Server Message Block (SMB)**.

   You configure this location as a source or destination later.

1. For **Agents**, choose one or more DataSync agents that you want to connect to your SMB share.

   If you choose more than one agent, make sure you understand using [multiple agents for a location](do-i-need-datasync-agent.md#multiple-agents).

1. For **SMB Server**, enter the Domain Name System (DNS) name or IP address of the SMB share that your DataSync agent will mount.
**Note**  
You can't specify an IP version 6 (IPv6) address.

1. For **Share name**, enter the name of the share exported by your SMB file server where DataSync will read or write data.

   You can include a subdirectory in the share path (for example, `/path/to/subdirectory`). Make sure that other SMB clients in your network can also mount this path. 

   To copy all the data in the subdirectory, DataSync must be able to mount the SMB share and access all of its data. For more information, see [Required permissions](create-smb-location.md#configuring-smb-permissions).

1. (Optional) Expand **Additional settings** and choose an **SMB Version** for DataSync to use when accessing your SMB share.

   By default, DataSync automatically chooses a version based on negotiation with the SMB share. For information, see [Supported SMB versions](create-smb-location.md#configuring-smb-version).

1. For **User**, enter a user name that can mount your SMB share and has permission to access the files and folders involved in your transfer.

   For more information, see [Required permissions](create-smb-location.md#configuring-smb-permissions).

1. For **Password**, enter the password of the user who can mount your SMB share and has permission to access the files and folders involved in your transfer.

1. (Optional) For **Domain**, enter the Windows domain name that your SMB share belongs to.

   If you have multiple domains in your environment, configuring this setting makes sure that DataSync connects to the right share.

1. (Optional) Choose **Add tag** to tag your location.

   *Tags* are key-value pairs that help you manage, filter, and search for your locations. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
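
If you prefer to script this, the console settings above map onto the `CreateLocationSmb` operation in the AWS SDK for Python (boto3). The following is a minimal sketch, not a definitive implementation: every hostname, credential, and ARN below is a placeholder.

```python
# Sketch of a CreateLocationSmb request; all values are placeholders --
# substitute your own file server, credentials, and agent ARN.
def build_smb_location_request(server, share, user, password, domain, agent_arns):
    """Assemble keyword arguments for CreateLocationSmb."""
    return {
        "ServerHostname": server,      # DNS name or IPv4 address (IPv6 isn't supported)
        "Subdirectory": share,         # share name, optionally with a subdirectory
        "User": user,                  # must be able to mount the share
        "Password": password,
        "Domain": domain,              # Windows domain the file server belongs to
        "AgentArns": agent_arns,
        "MountOptions": {"Version": "AUTOMATIC"},  # let DataSync negotiate the SMB version
        "Tags": [{"Key": "Name", "Value": "smb-source"}],
    }

request = build_smb_location_request(
    server="smb-server.example.com",
    share="/share/path/to/subdirectory",
    user="datasync-user",
    password="example-password",
    domain="EXAMPLE.COM",
    agent_arns=["arn:aws:datasync:us-east-1:111122223333:agent/agent-0123456789abcdef0"],
)
# boto3.client("datasync").create_location_smb(**request)
```

The final call is shown commented out because it requires AWS credentials and a deployed agent.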

# Configuring transfers with other cloud object storage
<a name="creating-other-cloud-object-location"></a>

With AWS DataSync, you can transfer data between [AWS storage services](transferring-aws-storage.md) and the following cloud object storage providers:
+ [Wasabi Cloud Storage](https://docs.wasabi.com/)
+ [DigitalOcean Spaces](https://docs.digitalocean.com/)
+ [Oracle Cloud Infrastructure Object Storage](https://docs.oracle.com/iaas/Content/home.htm)
+ [Cloudflare R2 Storage](https://developers.cloudflare.com/r2/)
+ [Backblaze B2 Cloud Storage](https://www.backblaze.com/docs/cloud-storage)
+ [NAVER Cloud Object Storage](https://guide.ncloud-docs.com/docs/)
+ [Alibaba Cloud Object Storage Service](https://www.alibabacloud.com/help/en/oss/product-overview/what-is-oss)
+ [IBM Cloud Object Storage](https://cloud.ibm.com/docs/cloud-object-storage?topic=cloud-object-storage-getting-started-cloud-object-storage)
+ [Seagate Lyve Cloud](https://help.lyvecloud.seagate.com/en/product-features.html)

A DataSync agent is required only when transferring data between storage systems in other clouds and Amazon EFS or Amazon FSx, or when using **Basic** mode tasks. You don't need an agent to transfer data between storage systems in other clouds and Amazon S3 using **Enhanced** mode.

Regardless of whether you use an agent, you must also create a transfer [location](how-datasync-transfer-works.md#sync-locations) for your cloud object storage (specifically an **Object storage** location). DataSync can use this location as a source or destination for your transfer.

## Providing DataSync access to your other cloud object storage
<a name="other-cloud-access"></a>

How DataSync accesses your cloud object storage depends on several factors, including whether your storage is compatible with the Amazon S3 API and the permissions and credentials that DataSync needs to access your storage.

**Topics**
+ [Amazon S3 API compatibility](#other-cloud-s3-compatibility)
+ [Storage permissions and endpoints](#other-cloud-permissions)
+ [Storage credentials](#other-cloud-credentials)

### Amazon S3 API compatibility
<a name="other-cloud-s3-compatibility"></a>

Your cloud object storage must be compatible with the following [Amazon S3 API operations](https://docs.aws.amazon.com/AmazonS3/latest/API/API_Operations.html) for DataSync to connect to it:
+ `AbortMultipartUpload`
+ `CompleteMultipartUpload`
+ `CopyObject`
+ `CreateMultipartUpload`
+ `DeleteObject`
+ `DeleteObjects`
+ `DeleteObjectTagging`
+ `GetBucketLocation`
+ `GetObject`
+ `GetObjectTagging`
+ `HeadBucket`
+ `HeadObject`
+ `ListObjectsV2`
+ `PutObject`
+ `PutObjectTagging`
+ `UploadPart`
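
As a quick sanity check when evaluating a provider, the list above can be captured in a short sketch that reports any required operations the provider's documentation doesn't mention. The operation names are taken verbatim from the list; how you collect a provider's supported set is up to you.

```python
# Amazon S3 API operations that DataSync requires (from the list above).
REQUIRED_S3_OPERATIONS = {
    "AbortMultipartUpload", "CompleteMultipartUpload", "CopyObject",
    "CreateMultipartUpload", "DeleteObject", "DeleteObjects",
    "DeleteObjectTagging", "GetBucketLocation", "GetObject",
    "GetObjectTagging", "HeadBucket", "HeadObject", "ListObjectsV2",
    "PutObject", "PutObjectTagging", "UploadPart",
}

def missing_operations(supported):
    """Return the required operations that a provider doesn't support."""
    return sorted(REQUIRED_S3_OPERATIONS - set(supported))
```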

### Storage permissions and endpoints
<a name="other-cloud-permissions"></a>

You must configure the permissions that allow DataSync to access your cloud object storage. If your object storage is a source location, DataSync needs read and list permissions for the bucket that you're transferring data from. If your object storage is a destination location, DataSync needs read, list, write, and delete permissions for the bucket.

DataSync also needs an endpoint (or server) to connect to your storage. The following table describes the endpoints that DataSync can use to access other cloud object storage:


| Other cloud provider | Endpoint | 
| --- | --- | 
| Wasabi Cloud Storage |  `s3.region.wasabisys.com`  | 
| DigitalOcean Spaces |  `region.digitaloceanspaces.com`  | 
| Oracle Cloud Infrastructure Object Storage |  `namespace.compat.objectstorage.region.oraclecloud.com`  | 
|  Cloudflare R2 Storage  |  `account-id.r2.cloudflarestorage.com`  | 
|  Backblaze B2 Cloud Storage  |  `s3.region.backblazeb2.com`  | 
| NAVER Cloud Object Storage |  `region.object.ncloudstorage.com` (most regions)  | 
| Alibaba Cloud Object Storage Service | `region.aliyuncs.com` | 
| IBM Cloud Object Storage | `s3.region.cloud-object-storage.appdomain.cloud` | 
| Seagate Lyve Cloud | `s3.region.lyvecloud.seagate.com` | 

**Important**  
For details on how to configure bucket permissions and updated information on storage endpoints, see your cloud provider's documentation.
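
For scripting, the endpoint patterns in the table can be expressed as a small helper that fills in the region, namespace, or account ID. This is only a convenience sketch based on the table above; your provider's documentation remains the authoritative source for current endpoint formats.

```python
# Endpoint patterns from the table above; confirm current values in your
# cloud provider's documentation before relying on them.
ENDPOINT_PATTERNS = {
    "wasabi": "s3.{region}.wasabisys.com",
    "digitalocean": "{region}.digitaloceanspaces.com",
    "oracle": "{namespace}.compat.objectstorage.{region}.oraclecloud.com",
    "cloudflare": "{account_id}.r2.cloudflarestorage.com",
    "backblaze": "s3.{region}.backblazeb2.com",
    "naver": "{region}.object.ncloudstorage.com",   # most regions
    "alibaba": "{region}.aliyuncs.com",
    "ibm": "s3.{region}.cloud-object-storage.appdomain.cloud",
    "seagate": "s3.{region}.lyvecloud.seagate.com",
}

def endpoint(provider, **parts):
    """Fill a provider's endpoint pattern with region/namespace/account values."""
    return ENDPOINT_PATTERNS[provider].format(**parts)
```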

### Storage credentials
<a name="other-cloud-credentials"></a>

DataSync also needs credentials to access the object storage bucket involved in your transfer. These are typically an access key and secret key, though your cloud storage provider might use different names for them.

For more information, see your cloud provider's documentation.

## Considerations when transferring from other cloud object storage
<a name="other-cloud-considerations"></a>

When planning to transfer objects to or from another cloud storage provider by using DataSync, there are some things to keep in mind.

**Topics**
+ [Costs](#other-cloud-considerations-costs)
+ [Storage classes](#other-cloud-considerations-storage-classes)
+ [Object tags](#other-cloud-considerations-object-tags)
+ [Transferring to Amazon S3](#other-cloud-considerations-s3)

### Costs
<a name="other-cloud-considerations-costs"></a>

The fees associated with moving data in and out of another cloud storage provider can include:
+ Running an [Amazon EC2](https://aws.amazon.com/ec2/pricing/) instance for your DataSync agent
+ Transferring the data by using [DataSync](https://aws.amazon.com/datasync/pricing/), including request charges related to your cloud object storage and [Amazon S3](create-s3-location.md#create-s3-location-s3-requests) (if S3 is your transfer destination)
+ Transferring data in or out of your cloud storage (check your cloud provider's pricing)
+ Storing data in an [AWS storage service](transferring-aws-storage.md) supported by DataSync
+ Storing data in another cloud provider (check your cloud provider's pricing)

### Storage classes
<a name="other-cloud-considerations-storage-classes"></a>

Some cloud storage providers have storage classes (similar to [Amazon S3](create-s3-location.md#using-storage-classes)) whose objects DataSync can't read until they're restored. For example, Oracle Cloud Infrastructure Object Storage has an archive storage class; you must restore objects in that storage class before DataSync can transfer them. For more information, see your cloud provider's documentation.
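
As an illustrative sketch, you could screen a bucket listing for archived objects before starting a task. The storage-class names and listing fields below are placeholders; check your provider's documentation for the actual values.

```python
# Hypothetical pre-transfer check: flag objects in an archive storage class
# that must be restored before DataSync can read them. Class names here are
# examples only -- each provider uses its own names.
ARCHIVE_CLASSES = {"Archive", "DeepArchive"}

def objects_needing_restore(listing):
    """Return keys of objects whose storage class requires a restore first."""
    return [obj["Key"] for obj in listing if obj.get("StorageClass") in ARCHIVE_CLASSES]
```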

### Object tags
<a name="other-cloud-considerations-object-tags"></a>

Not all cloud providers support object tags. The ones that do might not allow querying tags through the Amazon S3 API. In either situation, your DataSync transfer task might fail if you try to copy object tags.

You can avoid this by clearing the **Copy object tags** checkbox in the DataSync console when creating, starting, or updating your task.

### Transferring to Amazon S3
<a name="other-cloud-considerations-s3"></a>

When transferring to Amazon S3, DataSync can't transfer objects larger than 5 TB. DataSync also copies object metadata only up to 2 KB.
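
A pre-flight sketch can flag objects that would hit these limits before you start a task. The `Size` and `MetadataBytes` fields in the listing are assumed here for illustration; how you gather them depends on your provider's listing API.

```python
# Amazon S3 destination limits noted above: objects over 5 TB can't be
# transferred, and object metadata is copied only up to 2 KB.
MAX_OBJECT_BYTES = 5 * 1024**4   # 5 TB
MAX_METADATA_BYTES = 2 * 1024    # 2 KB

def flag_limit_violations(objects):
    """Return (key, reason) pairs for objects that exceed a transfer limit."""
    problems = []
    for obj in objects:
        if obj["Size"] > MAX_OBJECT_BYTES:
            problems.append((obj["Key"], "over 5 TB"))
        if obj.get("MetadataBytes", 0) > MAX_METADATA_BYTES:
            problems.append((obj["Key"], "metadata over 2 KB"))
    return problems
```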

## Creating your DataSync agent
<a name="other-cloud-creating-agent"></a>

As noted earlier, a DataSync agent is required only when you use **Basic** mode tasks or transfer between storage systems in other clouds and Amazon EFS or Amazon FSx. If your transfer needs an agent, this section describes how to deploy and activate one on an Amazon EC2 instance in your virtual private cloud (VPC) in AWS.

**To create an Amazon EC2 agent**

1. [Deploy an Amazon EC2 agent](deploy-agents.md#ec2-deploy-agent).

1. [Choose a service endpoint](choose-service-endpoint.md) that the agent uses to communicate with AWS.

   In this situation, we recommend using a VPC service endpoint.

1. Configure your network to work with [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. [Activate the agent](activate-agent.md).

## Creating a transfer location for your other cloud object storage
<a name="creating-other-cloud-location-how-to"></a>

You can configure DataSync to use your cloud object storage as a source or destination location.

**Before you begin**  
Make sure that you know [how DataSync accesses your cloud object storage](#other-cloud-access). You also need a [DataSync agent](#other-cloud-creating-agent) that can connect to your cloud object storage.

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, expand **Data transfer**, then choose **Locations** and **Create location**.

1. For **Location type**, choose **Object storage**.

1. For **Server**, enter the [endpoint](#other-cloud-permissions) that DataSync can use to access your cloud object storage:
   + **Wasabi Cloud Storage** – `s3.region.wasabisys.com`
   + **DigitalOcean Spaces** – `region.digitaloceanspaces.com`
   + **Oracle Cloud Infrastructure Object Storage** – `namespace.compat.objectstorage.region.oraclecloud.com`
   + **Cloudflare R2 Storage** – `account-id.r2.cloudflarestorage.com`
   + **Backblaze B2 Cloud Storage** – `s3.region.backblazeb2.com`
   + **NAVER Cloud Object Storage** – `region.object.ncloudstorage.com` (most regions)
   + **Alibaba Cloud Object Storage Service** – `region.aliyuncs.com`
   + **IBM Cloud Object Storage** – `s3.region.cloud-object-storage.appdomain.cloud`
   + **Seagate Lyve Cloud** – `s3.region.lyvecloud.seagate.com`

1. For **Bucket name**, enter the name of the object storage bucket that you're transferring data to or from.

1. For **Folder**, enter an object prefix. DataSync only transfers objects with this prefix.

1. If your transfer requires an agent, choose **Use agents**, then choose the DataSync agent that can connect with your cloud object storage.

1. Expand **Additional settings**. For **Server protocol**, choose **HTTPS**. For **Server port**, choose **443**.

1. Scroll down to the **Authentication** section. Make sure that the **Requires credentials** check box is selected, and then provide DataSync your [storage credentials](#other-cloud-credentials).
   + For **Access key**, enter the ID to access your cloud object storage.
   + For **Secret key**, provide the secret key to access your cloud object storage. You can either enter the key directly, or specify an AWS Secrets Manager secret that contains the key. For more information, see [Providing credentials for storage locations](https://docs.aws.amazon.com/datasync/latest/userguide/location-credentials.html).

1. (Optional) Enter values for the **Key** and **Value** fields to tag the location.

   Tags help you manage, filter, and search for your AWS resources. We recommend creating at least a name tag for your location. 

1. Choose **Create location**.
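
The same settings can also be assembled for the `CreateLocationObjectStorage` operation in the AWS SDK for Python (boto3). This is a minimal sketch with placeholder values; pass the result to `boto3.client("datasync").create_location_object_storage(**request)`.

```python
# Sketch of a CreateLocationObjectStorage request; all values are
# placeholders -- substitute your own endpoint, bucket, and credentials.
def build_object_storage_location_request(server, bucket, prefix,
                                          access_key, secret_key, agent_arns=None):
    request = {
        "ServerHostname": server,   # provider endpoint, e.g. s3.region.wasabisys.com
        "ServerProtocol": "HTTPS",
        "ServerPort": 443,
        "BucketName": bucket,
        "Subdirectory": prefix,     # DataSync only transfers objects with this prefix
        "AccessKey": access_key,
        "SecretKey": secret_key,
        "Tags": [{"Key": "Name", "Value": "other-cloud-source"}],
    }
    if agent_arns:                  # needed only for Basic mode or EFS/FSx transfers
        request["AgentArns"] = agent_arns
    return request

request = build_object_storage_location_request(
    server="s3.us-east-1.wasabisys.com",
    bucket="example-bucket",
    prefix="/exports",
    access_key="EXAMPLE-ACCESS-KEY",
    secret_key="example-secret-key",
)
```

For the secret key, consider referencing an AWS Secrets Manager secret instead of embedding the value, as described in the **Authentication** step above.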

## Next steps
<a name="other-cloud-location-next-steps"></a>

After you finish creating a DataSync location for your cloud object storage, you can continue setting up your transfer. Here are some next steps to consider:

1. If you haven't already, [create another location](transferring-aws-storage.md) where you plan to transfer your data to or from in AWS.

1. Learn how DataSync [handles metadata and special files](metadata-copied.md) for object storage locations.

1. Configure how your data gets transferred. For example, maybe you only want to [transfer a subset of your data](filtering.md).
**Important**  
Make sure that you configure how DataSync copies object tags correctly. For more information, see considerations with [object tags](#other-cloud-considerations-object-tags).

1. [Start your transfer](run-task.md). 

 