

# Getting started with AWS DataSync
<a name="setting-up"></a>

Before you get started with AWS DataSync, you need to sign up for an AWS account if you don't have one. We also recommend learning where DataSync can be used and how much it might cost to transfer your data.

## Sign up for an AWS account
<a name="sign-up-for-aws"></a>

If you do not have an AWS account, complete the following steps to create one.

**To sign up for an AWS account**

1. Open [https://portal.aws.amazon.com/billing/signup](https://portal.aws.amazon.com/billing/signup).

1. Follow the online instructions.

   Part of the sign-up procedure involves receiving a phone call or text message and entering a verification code on the phone keypad.

   When you sign up for an AWS account, an *AWS account root user* is created. The root user has access to all AWS services and resources in the account. As a security best practice, assign administrative access to a user, and use the root user only to perform [tasks that require root user access](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_root-user.html#root-user-tasks).

AWS sends you a confirmation email after the sign-up process is complete. At any time, you can view your current account activity and manage your account by going to [https://aws.amazon.com/](https://aws.amazon.com/) and choosing **My Account**.

## Create a user with administrative access
<a name="create-an-admin"></a>

After you sign up for an AWS account, secure your AWS account root user, enable AWS IAM Identity Center, and create an administrative user so that you don't use the root user for everyday tasks.

**Secure your AWS account root user**

1. Sign in to the [AWS Management Console](https://console.aws.amazon.com/) as the account owner by choosing **Root user** and entering your AWS account email address. On the next page, enter your password.

   For help signing in as the root user, see [Signing in as the root user](https://docs.aws.amazon.com/signin/latest/userguide/console-sign-in-tutorials.html#introduction-to-root-user-sign-in-tutorial) in the *AWS Sign-In User Guide*.

1. Turn on multi-factor authentication (MFA) for your root user.

   For instructions, see [Enable a virtual MFA device for your AWS account root user (console)](https://docs.aws.amazon.com/IAM/latest/UserGuide/enable-virt-mfa-for-root.html) in the *IAM User Guide*.

**Create a user with administrative access**

1. Enable IAM Identity Center.

   For instructions, see [Enabling AWS IAM Identity Center](https://docs.aws.amazon.com//singlesignon/latest/userguide/get-set-up-for-idc.html) in the *AWS IAM Identity Center User Guide*.

1. In IAM Identity Center, grant administrative access to a user.

   For a tutorial about using the IAM Identity Center directory as your identity source, see [Configure user access with the default IAM Identity Center directory](https://docs.aws.amazon.com//singlesignon/latest/userguide/quick-start-default-idc.html) in the *AWS IAM Identity Center User Guide*.

**Sign in as the user with administrative access**
+ To sign in with your IAM Identity Center user, use the sign-in URL that was sent to your email address when you created the IAM Identity Center user.

  For help signing in using an IAM Identity Center user, see [Signing in to the AWS access portal](https://docs.aws.amazon.com/signin/latest/userguide/iam-id-center-sign-in-tutorial.html) in the *AWS Sign-In User Guide*.

**Assign access to additional users**

1. In IAM Identity Center, create a permission set that follows the best practice of applying least-privilege permissions.

   For instructions, see [Create a permission set](https://docs.aws.amazon.com//singlesignon/latest/userguide/get-started-create-a-permission-set.html) in the *AWS IAM Identity Center User Guide*.

1. Assign users to a group, and then assign single sign-on access to the group.

   For instructions, see [Add groups](https://docs.aws.amazon.com//singlesignon/latest/userguide/addgroups.html) in the *AWS IAM Identity Center User Guide*.

## Required IAM permissions for using DataSync
<a name="permissions-requirements"></a>

DataSync can transfer your data to or from an Amazon S3 bucket, Amazon EFS file system, or Amazon FSx file system. To get your data where you want it to go, you need the right IAM permissions granted to your identity. For example, the IAM role that you use with DataSync needs permission to use the Amazon S3 operations required to transfer data to an S3 bucket.

You can grant these permissions with IAM policies provided by AWS or by creating your own policies.

**Contents**
+ [AWS managed policies](#permissions-requirements-managed)
+ [Customer managed policies](#permissions-requirements-customer-managed)

### AWS managed policies
<a name="permissions-requirements-managed"></a>

AWS provides the following managed policies for common DataSync use cases:
+ `AWSDataSyncReadOnlyAccess` – Provides read-only access to DataSync.
+ `AWSDataSyncFullAccess` – Provides full access to DataSync and minimal access to its dependencies.

For more information, see [AWS managed policies for AWS DataSync](security-iam-awsmanpol.md).
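
As one possible way to grant these permissions from the command line, the following sketch attaches an AWS managed policy to an IAM role with the AWS CLI. The role name `MyDataSyncRole` is hypothetical, and the policy ARN assumes the `arn:aws:iam::aws:policy/` prefix used for AWS managed policies in commercial Regions. Confirm both for your account before running it.

```shell
# Build the ARN for an AWS managed policy.
# Assumption: the commercial "aws" partition (AWS GovCloud (US) uses "aws-us-gov").
managed_policy_arn() {
    printf 'arn:aws:iam::aws:policy/%s\n' "$1"
}

# Attach a DataSync managed policy to an existing IAM role.
# $1 = role name (hypothetical), $2 = managed policy name from the list above.
attach_datasync_policy() {
    aws iam attach-role-policy \
        --role-name "$1" \
        --policy-arn "$(managed_policy_arn "$2")"
}

# Example (uncomment to run against your account):
# attach_datasync_policy MyDataSyncRole AWSDataSyncReadOnlyAccess
```

Starting with the read-only policy and widening access only when needed keeps the sketch in line with least-privilege practice.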

### Customer managed policies
<a name="permissions-requirements-customer-managed"></a>

You can create custom IAM policies to use with DataSync. For more information, see [IAM customer managed policies for AWS DataSync](using-identity-based-policies.md).

## Where can I use DataSync?
<a name="datasync-regions"></a>

For a list of AWS Regions and endpoints that DataSync supports, see [AWS DataSync endpoints and quotas](https://docs.aws.amazon.com/general/latest/gr/datasync.html) in the *AWS General Reference*.

## How can I use DataSync?
<a name="datasync-access"></a>

There are several ways to use DataSync:
+ [DataSync console](https://console.aws.amazon.com/datasync/home), which is part of the AWS Management Console.
+ [DataSync API](API_Reference.md) or the [AWS CLI](https://awscli.amazonaws.com/v2/documentation/api/latest/reference/datasync/index.html#cli-aws-datasync) to programmatically configure and manage DataSync.
+ [AWS CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/AWS_DataSync.html) or [Terraform](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/datasync_agent) to provision your DataSync resources.
+ [AWS SDKs](https://aws.amazon.com/developer/) to build applications that use DataSync.
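
For instance, a quick way to confirm that the AWS CLI can reach DataSync is to list the resources that already exist in a Region. The sketch below wraps three `aws datasync` list commands in a helper function. The function name is made up for this example, and `us-east-1` in the usage line is only a sample Region.

```shell
# List the DataSync agents, locations, and tasks in one Region.
# $1 = AWS Region code (for example, us-east-1).
list_datasync_resources() {
    aws datasync list-agents --region "$1"
    aws datasync list-locations --region "$1"
    aws datasync list-tasks --region "$1"
}

# Example (requires configured AWS credentials):
# list_datasync_resources us-east-1
```

In a new account each command is expected to return an empty list, which still shows that your credentials and permissions work.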

## How much will DataSync cost?
<a name="datasync-pricing"></a>

To create a custom estimate using the amount of data that you plan to transfer, see [DataSync pricing](https://aws.amazon.com/datasync/pricing). 

## Open-source components used by DataSync
<a name="datasync-os-attributions"></a>

To view the open-source components used by DataSync, download the following file:
+ [datasync-open-source-components.zip](samples/datasync-open-source-components.zip)

# Do I need an AWS DataSync agent?
<a name="do-i-need-datasync-agent"></a>

To use AWS DataSync, you might need an agent. An *agent* is a virtual machine (VM) appliance that you deploy in your storage environment for data transfers.

Whether you need an agent depends on several factors, including the type of storage you're transferring to or from, whether you're transferring across AWS accounts, and which AWS Regions you're transferring between. Before reading further, [check that DataSync supports the transfer you're interested in](working-with-locations.md).

After you determine that DataSync supports your transfer scenario, review the following information to help you understand whether you need an agent.

## Situations when you need a DataSync agent
<a name="when-agent-required"></a>

Most situations that require a DataSync agent involve storage that's managed by you or another cloud provider.
+ Transferring between AWS storage services and on-premises storage
+ Transferring between Amazon EFS or Amazon FSx and storage in other clouds
+ Transferring between some AWS storage services [across AWS accounts](working-with-locations.md#working-with-locations-across-accounts) (when neither storage service is Amazon S3)
+ Transferring between a commercial AWS Region and an AWS GovCloud (US) Region where the source and destination are either Amazon EFS or Amazon FSx

## Situations when you don't need a DataSync agent
<a name="when-agent-not-required"></a>

The situations that don't require an agent apply whether you're transferring in the [same AWS Region](working-with-locations.md#working-with-locations-same-region) or [across Regions](working-with-locations.md#working-with-locations-cross-regions).
+ Transferring between AWS storage services in the same AWS account
+ Transferring between Amazon S3 and a different AWS storage service across AWS accounts
+ Transferring between Amazon S3 and object storage in other clouds
+ Transferring between a commercial AWS Region and an AWS GovCloud (US) Region where either the source or destination is Amazon S3

## Choosing an agent for your task mode
<a name="choose-task-mode-agent"></a>

DataSync tasks run in Basic mode or Enhanced mode. Basic mode tasks require a Basic mode agent. Enhanced mode tasks require an Enhanced mode agent.

Basic mode supports using an agent when copying to or from the following locations:
+ NFS
+ SMB
+ HDFS
+ Object storage (including other clouds)
+ Microsoft Azure Blob Storage

Enhanced mode supports using an agent for transfers to or from Amazon S3 with the following locations:
+ NFS
+ SMB

For more information, see [Choosing a task mode for your data transfer](choosing-task-mode.md).
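
The two lists above can be expressed as a small lookup, which may be handy in a deployment script that validates its inputs. This is only a sketch: the function name and the lowercase keywords (`basic`, `enhanced`, `nfs`, `object-storage`, and so on) are labels invented for this example, not DataSync API values.

```shell
# Return success (0) if an agent of the given task mode can be used with
# the given location type, according to the two lists above.
# $1 = task mode: basic | enhanced
# $2 = location type: nfs | smb | hdfs | object-storage | azure-blob
# Enhanced mode entries apply to transfers to or from Amazon S3.
agent_supported() {
    case "$1:$2" in
        basic:nfs|basic:smb|basic:hdfs|basic:object-storage|basic:azure-blob)
            return 0 ;;
        enhanced:nfs|enhanced:smb)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Example:
# agent_supported enhanced hdfs || echo "Use a Basic mode agent for HDFS"
```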

## Using multiple DataSync agents
<a name="multiple-agents"></a>

While most transfers only need one agent, using multiple agents can speed up transfers for large datasets with millions of files or objects. In these situations, we recommend running transfer tasks in parallel, using one agent per task. This approach spreads the transfer workload across multiple tasks, with each task using its own agent. It also helps reduce the time it takes DataSync to prepare and transfer your data. For more information, see [Partitioning large datasets with multiple tasks](create-task-how-to.md#multiple-tasks-large-dataset).

Another option, especially if you have millions of small files, is to use multiple agents with a transfer location. For example, you can connect up to four agents to your on-premises Network File System (NFS) file service. This option might speed up your transfer, although the time it takes DataSync to prepare the transfer doesn't change.

Be mindful that either approach can increase the I/O operations on your storage and affect your network bandwidth. For more information on using multiple agents for your DataSync transfers, see the [AWS Storage Blog](https://aws.amazon.com/blogs/storage/how-to-accelerate-your-data-transfers-with-aws-datasync-scale-out-architectures/).
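
As a rough illustration of the first approach (parallel tasks, one agent per task), the sketch below splits a list of top-level folders round-robin across a fixed number of tasks. Partitioning by top-level folder and the folder names in the usage line are assumptions for this example; how you divide a dataset depends on how your data is laid out.

```shell
# Assign each item to one of N tasks in round-robin order.
# $1 = number of tasks; remaining arguments = folders (or prefixes).
# Prints one "task-<n> <item>" line per item.
partition_round_robin() {
    task_count=$1
    shift
    index=0
    for item in "$@"; do
        printf 'task-%d %s\n' "$((index % task_count + 1))" "$item"
        index=$((index + 1))
    done
}

# Example:
# partition_round_robin 2 /data/a /data/b /data/c
```

Each group of folders could then become the source path or include filter for a separate task. Round-robin balances folder counts, not data volume, so uneven folders may still need manual adjustment.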

If you're thinking of using multiple agents, remember the following:
+ A location can have up to four Basic mode agents and up to four Enhanced mode agents assigned. A task that uses the location will only use the agents that correspond to the configured task mode.
+ Using multiple agents with a location doesn't provide high availability. All the agents associated with a location must be online before you can start your transfer task. If one of the agents is [offline](managing-agent.md#understand-agent-statuses), you can't run your task.
+ If you're [using a virtual private cloud (VPC) service endpoint](choose-service-endpoint.md#datasync-in-vpc) to communicate with the DataSync service, all the agents must use the same endpoint and subnet.
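
Because a single offline agent blocks the task, it can help to check agent status before starting a transfer. The following sketch uses `aws datasync describe-agent` and assumes the reported status is either `ONLINE` or `OFFLINE`. The function names are made up for this example, and the agent ARNs you pass in are your own.

```shell
# Print the status that DataSync reports for an agent.
# $1 = agent ARN
agent_status() {
    aws datasync describe-agent --agent-arn "$1" \
        --query Status --output text
}

# Return success only if no more than four agent ARNs are given
# (the per-mode limit for a location) and every one of them is ONLINE.
all_agents_online() {
    if [ "$#" -gt 4 ]; then
        echo "A location supports at most four agents per task mode" >&2
        return 1
    fi
    for arn in "$@"; do
        if [ "$(agent_status "$arn")" != "ONLINE" ]; then
            echo "Agent not online: $arn" >&2
            return 1
        fi
    done
    return 0
}

# Example:
# all_agents_online "$AGENT_ARN_1" "$AGENT_ARN_2" && echo "Ready to start the task"
```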

## Next steps
<a name="do-i-need-agent-next-steps"></a>
+ If you need an agent, review the [agent requirements](agent-requirements.md) to understand what makes sense for your storage environment.
+ If you don't need an agent for your transfer, you can start [configuring your transfer](transferring-data-datasync.md).

# Requirements for AWS DataSync agents
<a name="agent-requirements"></a>

Before you [deploy](deploy-agents.md) an AWS DataSync agent in your storage environment, make sure that you understand the agent hypervisor and resource requirements.

## Hypervisor requirements
<a name="hosts-requirements"></a>

DataSync agents can be deployed on supported hypervisors to facilitate data transfer.

**Note**  
Enhanced mode agents only support VMware ESXi, KVM, Nutanix AHV, and EC2.

You can run a DataSync agent on the following hypervisors:
+ **VMware ESXi (version 7.0 or 8.0)**: VMware ESXi is available on the [Broadcom website](https://knowledge.broadcom.com/external/article?articleId=366685#mcetoc_1i29sq73la). You also need a VMware vSphere client to connect to the host. 
+ **Linux Kernel-based Virtual Machine (KVM)**: A free, open-source virtualization technology. KVM is included in Linux versions 2.6.20 and newer. DataSync is tested and supported for the CentOS/RHEL 7 and 8, Ubuntu 16.04 LTS, and Ubuntu 18.04 LTS distributions. Other modern Linux distributions might work, but functionality or performance isn't guaranteed. You must enable hardware accelerated virtualization on your KVM host to deploy your DataSync agent.

  We recommend this option if you already have a KVM environment up and running and you're already familiar with how KVM works.

  Running KVM on Amazon EC2 isn't supported and can't be used for DataSync agents.
+ **Microsoft Hyper-V (version 2012 R2, 2016, or 2019)**: Basic mode agents only. For this setup, you need a Microsoft Hyper-V Manager on a Microsoft Windows client computer to connect to the host.

  The DataSync agent is a generation 1 virtual machine (VM). For more information about the differences between generation 1 and generation 2 VMs, see [Should I create a generation 1 or 2 virtual machine in Hyper-V?](https://docs.microsoft.com/en-us/windows-server/virtualization/hyper-v/plan/should-i-create-a-generation-1-or-2-virtual-machine-in-hyper-v) 
+ **Amazon EC2**: DataSync provides an Amazon Machine Image (AMI) that contains the DataSync image. For the recommended instance types, see [Amazon EC2 instance requirements](#ec2-instance-types).
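
For the KVM option, one common first check for hardware accelerated virtualization on a Linux host is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) CPU flag. The sketch below does that. A matching flag shows that the CPU supports the feature, but it can still be disabled in firmware, so treat this as a preliminary check rather than proof. The function name is invented for this example.

```shell
# Return success if the CPU advertises hardware virtualization support.
# "vmx" is the Intel VT-x flag and "svm" is the AMD-V flag.
# $1 = cpuinfo file to inspect (defaults to /proc/cpuinfo on Linux).
has_hw_virtualization() {
    grep -E -q '^flags.*[[:space:]](vmx|svm)([[:space:]]|$)' "${1:-/proc/cpuinfo}"
}

# Example:
# has_hw_virtualization || echo "Enable virtualization in the host firmware"
```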

## Agent requirements for DataSync transfers
<a name="agent-tranfer-resource-requirements"></a>

For DataSync transfers, your agent must meet the following resource requirements.

**Important**  
Keep in mind that the Basic mode agent requirements for working with up to 20 million files, objects, or directories are general guidelines. Your agent may need more resources because of other factors, such as how many directories you have and object metadata size. For example, the m5.2xlarge instance for an Amazon EC2 agent still might not be enough for a transfer of fewer than 20 million files.  
Enhanced mode agents don't have file quotas.

**Contents**
+ [Virtual machine requirements](#hardware)
+ [Amazon EC2 instance requirements](#ec2-instance-types)

### Virtual machine requirements
<a name="hardware"></a>

When deploying a DataSync agent that isn't on an Amazon EC2 instance, the agent VM requires the following resources, depending upon whether you use a Basic mode agent or an Enhanced mode agent:


| Resource | Basic mode | Enhanced mode | 
| --- | --- | --- | 
| Virtual processors | Four virtual processors assigned to the VM | Eight virtual processors assigned to the VM | 
| Disk space | 80 GB of disk space for installing the VM image and system data | 80 GB of disk space for installing the VM image and system data | 
| RAM | 32 GB of RAM assigned to the VM for task executions working with up to 20 million files, objects, or directories<br>64 GB of RAM assigned to the VM for task executions working with more than 20 million files, objects, or directories | 32 GB of RAM assigned to the VM | 
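
If you script your VM provisioning, the table can be encoded as a small lookup. The sketch below is one way to do that. The function name and the `key=value` output format are choices made for this example, and the values are the general guidelines from the table, not guaranteed sizing.

```shell
# Print the minimum VM resources from the table above.
# $1 = agent mode: basic | enhanced
# $2 = number of files, objects, or directories (used for Basic mode only)
vm_requirements() {
    case "$1" in
        basic)
            if [ "$2" -gt 20000000 ]; then ram_gb=64; else ram_gb=32; fi
            echo "vcpus=4 ram_gb=$ram_gb disk_gb=80" ;;
        enhanced)
            echo "vcpus=8 ram_gb=32 disk_gb=80" ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

# Example:
# vm_requirements basic 25000000
```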

### Amazon EC2 instance requirements
<a name="ec2-instance-types"></a>

When deploying a DataSync agent on an Amazon EC2 instance, the instance size must be at least 2xlarge. We recommend using one of the following instance sizes, depending upon whether you use a Basic mode agent or an Enhanced mode agent: 


| Basic mode agent | Enhanced mode agent | 
| --- | --- | 
|  For task executions working with up to 20 million files, objects, or directories, use **m5.2xlarge**.<br>For task executions working with more than 20 million files, objects, or directories, use **m5.4xlarge**.  |  Use **m6a.2xlarge** regardless of the number of files, objects, or directories in your dataset.  | 
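
The same recommendation can be captured in a helper for launch scripts. This is a sketch under the assumption that you know the approximate item count ahead of time. The function name is hypothetical, and as noted earlier, the recommended size might still be too small for some datasets.

```shell
# Print the recommended EC2 instance type from the table above.
# $1 = agent mode: basic | enhanced
# $2 = number of files, objects, or directories (used for Basic mode only)
recommended_instance_type() {
    case "$1" in
        basic)
            if [ "$2" -gt 20000000 ]; then
                echo m5.4xlarge
            else
                echo m5.2xlarge
            fi ;;
        enhanced)
            echo m6a.2xlarge ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
}

# Example:
# recommended_instance_type basic 30000000
```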

## Agent requirements for AWS Region partitions
<a name="agent-partition-requirements"></a>

DataSync agent images are associated with specific [AWS Region partitions](https://docs.aws.amazon.com/glossary/latest/reference/glos-chap.html?id=docs_gateway#partition). For example, by default you can't download an agent in a commercial AWS Region and then activate it in an AWS GovCloud (US) Region.

## Agent management requirements
<a name="agent-management-requirements"></a>

Once you [activate](activate-agent.md) your DataSync agent, AWS manages the agent for you. For more information, see [Managing your AWS DataSync agent](managing-agent.md).

# Deploying your AWS DataSync agent
<a name="deploy-agents"></a>

When creating an AWS DataSync agent, the first step is to deploy the agent in your storage environment. You can deploy an agent as a virtual machine (VM) on VMware ESXi, Linux Kernel-based Virtual Machine (KVM), Nutanix AHV (using the KVM image), and Microsoft Hyper-V hypervisors. You also can deploy an agent as an Amazon EC2 instance in a virtual private cloud (VPC) within AWS.

**Tip**  
Before you begin, confirm whether you [need a DataSync agent](do-i-need-datasync-agent.md).

## Deploying your agent on VMware
<a name="create-vmw-agent"></a>

You can download an agent from the DataSync console and deploy it in your VMware environment.

**Before you begin**: Make sure that your storage environment can support a DataSync agent. For more information, see [Virtual machine requirements](agent-requirements.md#hardware).

**To deploy an agent on VMware**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**. 

1. For **Hypervisor**, choose **VMWare ESXi**, and then choose **Download the image**.
   + The Enhanced mode agent downloads as an `.ova` image file.
   + The Basic mode agent downloads in a `.zip` file that contains the `.ova` image file.

1. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see [Network requirements for on-premises, self-managed, and other cloud storage](datasync-network.md#on-premises-network-requirements).

   If needed, see your hypervisor's documentation on how to deploy an `.ova` file in a VMware host.

1. Power on your hypervisor, log in to the agent VM, and get the agent's IP address. You need this IP address to activate the agent.

   The agent VM's default credentials are login **admin** and password **password**. If needed, change the password through the [VM's local console](local-console-vm.md).

**Next step: [Choosing a service endpoint for your AWS DataSync agent](choose-service-endpoint.md)**

## Deploying your agent on KVM
<a name="create-kvm-agent"></a>

You can download an agent from the DataSync console and deploy it in your KVM environment.

**Before you begin**: Make sure that your storage environment can support a DataSync agent. For more information, see [Virtual machine requirements](agent-requirements.md#hardware).

**To deploy an agent on KVM**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Kernel-based Virtual Machine (KVM)**, and then choose **Download the image**.
   + The Enhanced mode agent downloads as a `.qcow2` image file.
   + The Basic mode agent downloads in a `.zip` file that contains the `.qcow2` image file.

1. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see [Network requirements for on-premises, self-managed, and other cloud storage](datasync-network.md#on-premises-network-requirements).

1. Run the following command to install your `.qcow2` image. 

   ```
   virt-install \
       --name "datasync" \
       --description "DataSync agent" \
       --os-type=generic \
       --ram=32768 \
       --vcpus=4 \
       --disk path=datasync-yyyymmdd-x86_64.qcow2,bus=virtio,size=80 \
       --network default,model=virtio \
       --graphics none \
       --virt-type kvm \
       --import
   ```

   For information about how to manage this VM and your KVM host, see your hypervisor's documentation.

1. Power on your hypervisor, log in to your VM, and get the IP address of the agent. You need this IP address to activate the agent.

   The agent VM's default credentials are login **admin** and password **password**. If needed, change the password through the [VM's local console](local-console-vm.md).
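
If you installed the image with `virt-install` as in the earlier step, one way to look up the agent's state and IP address from the KVM host is `virsh`. This sketch assumes libvirt's default network, where `virsh domifaddr` can read the VM's DHCP lease. On other network setups the command may return nothing, and you would read the address from the VM's local console instead. The function name is invented for this example.

```shell
# Show the state of the agent VM and the IP addresses libvirt knows about.
# $1 = VM name; "datasync" matches the --name value used with virt-install.
agent_vm_info() {
    virsh domstate "${1:-datasync}"
    virsh domifaddr "${1:-datasync}"
}

# Example:
# agent_vm_info datasync
```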

**Next step: [Choosing a service endpoint for your AWS DataSync agent](choose-service-endpoint.md)**

## Deploying your Basic mode agent on Microsoft Hyper-V
<a name="create-hyper-v-agent"></a>

You can download a Basic mode agent from the DataSync console and deploy it in your Microsoft Hyper-V environment.

**Before you begin**: Make sure that your storage environment can support a DataSync agent. For more information, see [Virtual machine requirements](agent-requirements.md#hardware).

**To deploy a Basic mode agent on Hyper-V**

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. For **Hypervisor**, choose **Microsoft Hyper-V**, and then choose **Download the image**.

   The agent downloads in a `.zip` file that contains a `.vhdx` image file.

1. To minimize network latency, deploy the agent as close as possible to the storage system that DataSync needs to access (the same local network if possible). For more information, see [Network requirements for on-premises, self-managed, and other cloud storage](datasync-network.md#on-premises-network-requirements).

   If needed, see your hypervisor's documentation on how to deploy a `.vhdx` file in a Hyper-V host.
**Warning**  
You may notice poor network performance if you enable virtual machine queue (VMQ) on a Hyper-V host that's using a Broadcom network adapter. For information about a workaround, see the [Microsoft documentation](https://learn.microsoft.com/en-us/troubleshoot/windows-server/networking/poor-network-performance-hyper-v-host-vm).

1. Power on your hypervisor, log in to your VM, and get the IP address of the agent. You need this IP address to activate the agent.

   The agent VM's default credentials are login **admin** and password **password**. If needed, change the password through the [VM's local console](local-console-vm.md).

**Next step: [Choosing a service endpoint for your AWS DataSync agent](choose-service-endpoint.md)**

## Deploying your Amazon EC2 agent
<a name="ec2-deploy-agent"></a>

You might deploy a DataSync agent as an Amazon EC2 instance when transferring data between:
+ A self-managed cloud storage system (for example, an NFS file server in AWS) and an AWS storage service.
+ A cloud storage provider (such as Microsoft Azure Blob Storage or Google Cloud Storage) and an AWS storage service using Basic mode.
+ An S3 bucket in a commercial AWS Region and an S3 bucket in an AWS GovCloud (US) Region.
+ [Amazon S3 on AWS Outposts](#outposts-agent) and an AWS storage service using Basic mode.

**Warning**  
We don't recommend using an Amazon EC2 agent with on-premises storage because of increased network latency. Instead, deploy the agent as a VMware, KVM, or Hyper-V virtual machine in your data center as close to your on-premises storage as possible. 

### Deploying your EC2 agent
<a name="ec2-deploy-agent-how-to"></a>

**To choose the agent AMI for your AWS Region**<a name="AMI-command"></a>

1. Open a terminal and copy the following AWS CLI command to get the latest DataSync Amazon Machine Image (AMI) ID for the Region where you want to deploy your Amazon EC2 agent.

   Basic mode agents

   ```
   aws ssm get-parameter --name /aws/service/datasync/ami --region your-region
   ```

   Enhanced mode agents

   ```
   aws ssm get-parameter --name /aws/service/datasync/ami/v3 --region your-region
   ```

1. Run the command. In the output, take note of the `"Value"` property with the DataSync AMI ID.  
**Example command and output**  

   ```
   aws ssm get-parameter --name /aws/service/datasync/ami --region us-east-1
   
   {
       "Parameter": {
           "Name": "/aws/service/datasync/ami",
           "Type": "String",
           "Value": "ami-1234567890abcdef0",
           "Version": 6,
           "LastModifiedDate": 1569946277.996,
           "ARN": "arn:aws:ssm:us-east-1::parameter/aws/service/datasync/ami"
       }
   }
   ```
<a name="efs-efs-steps"></a>
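
To capture only the AMI ID (for example, to reuse it in a script), you can add the AWS CLI `--query` and `--output` options to the same command. The sketch below wraps this in a helper. The function name and the `basic`/`enhanced` keywords are invented for this example, while the parameter names come from the commands above.

```shell
# Print only the AMI ID for a DataSync agent in a Region.
# $1 = agent mode: basic | enhanced
# $2 = AWS Region code (for example, us-east-1)
datasync_ami_id() {
    case "$1" in
        basic)    parameter_name=/aws/service/datasync/ami ;;
        enhanced) parameter_name=/aws/service/datasync/ami/v3 ;;
        *)
            echo "unknown mode: $1" >&2
            return 1 ;;
    esac
    aws ssm get-parameter --name "$parameter_name" --region "$2" \
        --query Parameter.Value --output text
}

# Example:
# ami_id="$(datasync_ami_id basic us-east-1)"
```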

**To deploy your Amazon EC2 agent**
**Tip**  
To avoid charges for transferring across Availability Zones, deploy your agent in a way that it doesn't require network traffic between Availability Zones. (To learn more about data transfer prices for all AWS Regions, see [Amazon EC2 Data Transfer pricing](https://aws.amazon.com/ec2/pricing/on-demand/#Data_Transfer).)  
For example, deploy your agent in the Availability Zone where your self-managed cloud storage system is located. 

1. Copy the following URL:

   ```
   https://console.aws.amazon.com/ec2/v2/home?region=agent-region#LaunchInstanceWizard:ami=ami-id
   ```
   + Replace `agent-region` with the Region where you want to deploy your agent.
   + Replace `ami-id` with the DataSync AMI ID that you obtained.

1. Paste the URL into a browser.

   The Amazon EC2 instance launch page in the AWS Management Console displays.

1. For **Instance type**, choose one of the [recommended Amazon EC2 instances](agent-requirements.md#ec2-instance-types) for DataSync.

1. For **Key pair**, choose an existing key pair, or create a new one. 

1. For **Network settings**, choose **Edit** and then do the following:

   1. For **VPC**, choose a VPC where you want to deploy your agent.

   1. For **Auto-assign public IP**, choose whether you want your agent to be accessible from the public internet.

      You use the instance's public or private IP address later to activate your agent.

   1. For **Firewall (security groups)**, create or select a security group that does the following:
      + If needed, allows inbound traffic to the Amazon EC2 instance on port 80 (HTTP). Some options for [getting an agent activation key](activate-agent.md#get-activation-key) require this connection.
      + Allows inbound and outbound traffic between the Amazon EC2 instance and the storage system that you're transferring data to or from. For more information, see [Network requirements for on-premises, self-managed, and other cloud storage](datasync-network.md#on-premises-network-requirements).
**Note**  
There are additional ports to configure depending on the type of [service endpoint](choose-service-endpoint.md) that your agent uses.

1. (Recommended) To increase performance when transferring from a cloud-based file system, expand **Advanced details** and choose a **Placement group** value where your storage is located.

1. Choose **Launch instance** to launch your Amazon EC2 instance.

1. Once your instance status is **Running**, choose the instance.

1. If you configured your instance to be accessible from the public internet, make note of the instance's public IP address. If you didn't, make note of the private IP address.

   You need this IP address when [activating your agent](activate-agent.md).
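
The launch URL from the first step of this procedure can also be assembled in a script instead of edited by hand. The following sketch substitutes the Region and AMI ID into the URL template shown in that step. The function name is hypothetical, and the AMI ID in the usage line is the placeholder value from the example output earlier, not a real image.

```shell
# Build the Amazon EC2 launch URL for a DataSync agent.
# $1 = Region where you want to deploy the agent
# $2 = DataSync AMI ID for that Region
ec2_launch_url() {
    printf 'https://console.aws.amazon.com/ec2/v2/home?region=%s#LaunchInstanceWizard:ami=%s\n' "$1" "$2"
}

# Example:
# ec2_launch_url us-east-1 ami-1234567890abcdef0
```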

### Examples: Deploying your EC2 agent in an AWS Region
<a name="using-ec2-agent-in-region"></a>

The following guidance can help with common scenarios if you deploy a DataSync agent in an AWS Region.

**Topics**
+ [Deploying your Basic mode agent for transfers between cloud storage and AWS storage services](#efs-efs)
+ [Deploying your Basic mode agent for transfers between Amazon S3 and AWS file systems](#s3-cloud-nfs)

#### Deploying your Basic mode agent for transfers between cloud storage and AWS storage services
<a name="efs-efs"></a>

To transfer data between AWS accounts, or between cloud storage systems, the DataSync agent must be located in the same AWS Region and AWS account where the source file system resides. This type of transfer includes the following:
+ Transfers from Amazon EFS or Amazon FSx to AWS storage in a different AWS account.
+ Transfers from self-managed file systems to AWS storage services.

**Important**  
Deploy your agent such that it doesn't require network traffic between Availability Zones (to avoid charges for such traffic).   
To access your Amazon EFS or FSx for Windows File Server file system, deploy the agent in an Availability Zone that has a mount target to your file system.
For self-managed file systems, deploy the agent in the Availability Zone where your file system resides.
To learn more about data transfer prices for all AWS Regions, see [Amazon EC2 On-Demand pricing](https://aws.amazon.com/ec2/pricing/on-demand/). 
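
To see which Availability Zones have a mount target for an Amazon EFS file system, you could query the mount targets with the AWS CLI. This is a sketch: the function name is invented, the file system ID in the usage line is a placeholder, and the query assumes the `AvailabilityZoneName` field in the `describe-mount-targets` output.

```shell
# List the Availability Zone of each mount target for an EFS file system.
# $1 = file system ID, $2 = AWS Region code
efs_mount_target_zones() {
    aws efs describe-mount-targets --file-system-id "$1" --region "$2" \
        --query 'MountTargets[].AvailabilityZoneName' --output text
}

# Example (placeholder file system ID):
# efs_mount_target_zones fs-0123456789abcdef0 us-east-1
```

Launching the agent's EC2 instance in a subnet that belongs to one of the listed Availability Zones keeps agent-to-file-system traffic inside a single zone.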

For example, the following diagram shows a high-level view of the DataSync architecture for transferring data from in-cloud Network File System (NFS) to in-cloud NFS or Amazon S3.

![\[Diagram showing data transfer between source Region containing a virtual private cloud (VPC) with an EFS file system and DataSync agent, and a destination Region with a DataSync endpoint and EFS file system.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/efs-efs-ec2.png)


Remember the following when transferring between AWS storage services across AWS accounts:
+ When transferring between Amazon EFS file systems or Amazon FSx file systems using the NFS protocol, configure your source file system as an [NFS location](create-nfs-location.md).
+ When transferring between Amazon FSx file systems using the SMB protocol, configure your source file system as an [SMB location](create-smb-location.md).

#### Deploying your Basic mode agent for transfers between Amazon S3 and AWS file systems
<a name="s3-cloud-nfs"></a>

The following diagram provides a high-level view of the DataSync architecture for transferring data from Amazon S3 to an AWS file system, such as Amazon EFS or Amazon FSx. You can use this architecture to transfer data from one AWS account to another, or to transfer data from Amazon S3 to a self-managed in-cloud file system. 

![\[Diagram showing data transfer between source Region containing an S3 bucket and DataSync endpoint, and a destination Region containing a VPC with an EFS file system and DataSync agent.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/s3-efs-ec2.png)


## Deploying your Basic mode agent on AWS Outposts
<a name="outposts-agent"></a>

You can launch a DataSync Amazon EC2 instance on your Outpost. To learn more about launching an AMI on AWS Outposts, see [Launch an instance on your Outpost](https://docs.aws.amazon.com/outposts/latest/userguide/launch-instance.html) in the *AWS Outposts User Guide*. 

When using DataSync to access Amazon S3 on Outposts, you must use a Basic mode agent. Launch the agent in a VPC that's allowed to access your Amazon S3 access point, and activate the agent in the parent Region of the Outpost. The agent must also be able to route to the Amazon S3 on Outposts endpoint for the bucket. To learn more about working with Amazon S3 on Outposts endpoints, see [Working with Amazon S3 on Outposts](https://docs.aws.amazon.com/AmazonS3/latest/userguide/WorkingWithS3Outposts.html#AccessingS3Outposts) in the *Amazon S3 User Guide*.

# Choosing a service endpoint for your AWS DataSync agent
<a name="choose-service-endpoint"></a>

A [service endpoint](https://docs.aws.amazon.com/general/latest/gr/rande.html#datasync-region) is how your AWS DataSync [agent communicates with the DataSync service](networking-datasync.md#2-network-between-agent-service). DataSync supports the following types of service endpoints:
+ **Public service endpoint** – Data is sent over the public internet.
+ **Federal Information Processing Standard (FIPS) service endpoint** – Data is sent over the public internet by using processes that comply with FIPS.
+ **Virtual private cloud (VPC) service endpoint** – Data is sent through your VPC instead of over the public internet, increasing the security of your transferred data.
+ **FIPS VPC service endpoint** – Data is sent through your VPC using processes that comply with FIPS. 

You need a service endpoint to [activate your agent](activate-agent.md). When choosing a service endpoint, remember the following:
+ An agent can use only one type of endpoint. If you need to transfer data using different endpoint types, create an agent for each type.
+ How you [connect your storage network to AWS](networking-datasync.md#connecting-options-to-amazon) determines what service endpoints you can use.

## Choosing a public service endpoint
<a name="choose-service-endpoint-public"></a>

If you use a public service endpoint, all communication between your DataSync agent and the DataSync service occurs over the public internet. 

1. Determine the DataSync [public service endpoint](https://docs.aws.amazon.com/general/latest/gr/datasync.html) that you want to use.

1. [Configure your network](datasync-network.md#using-public-endpoints) to allow the traffic required for using DataSync public service endpoints.

**Next step: [Activating your AWS DataSync agent](activate-agent.md)**

## Choosing a FIPS service endpoint
<a name="choose-service-endpoint-fips"></a>

DataSync provides some service endpoints that comply with FIPS. For more information, see [FIPS endpoints](https://docs.aws.amazon.com/general/latest/gr/rande.html#FIPS-endpoints) in the *AWS General Reference*.

1. Determine the DataSync [FIPS service endpoint](https://docs.aws.amazon.com/general/latest/gr/datasync.html) that you want to use.

1. [Configure your network](datasync-network.md#using-public-endpoints) to allow the traffic required for using DataSync FIPS service endpoints.
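
As a rough illustration of step 1 in the public and FIPS procedures above, the following sketch derives a service endpoint hostname from a Region code. The function name `datasync_endpoint_host` is hypothetical, and the `datasync.<region>.amazonaws.com` and `datasync-fips.<region>.amazonaws.com` patterns are assumptions based on standard commercial AWS Regions. Confirm the exact hostname, and whether a FIPS endpoint exists for your Region, in the *AWS General Reference*.

```bash
# Hypothetical helper: print a DataSync service endpoint hostname.
# Arguments: Region code, endpoint type (PUBLIC or FIPS; defaults to PUBLIC).
# Assumes the commercial-partition naming pattern; verify before relying on it.
datasync_endpoint_host() {
  case "${2:-PUBLIC}" in
    PUBLIC) printf 'datasync.%s.amazonaws.com\n' "$1" ;;
    FIPS)   printf 'datasync-fips.%s.amazonaws.com\n' "$1" ;;
    *)      printf 'unknown endpoint type: %s\n' "$2" >&2; return 1 ;;
  esac
}
```

For example, `datasync_endpoint_host us-east-1 FIPS` prints `datasync-fips.us-east-1.amazonaws.com`.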

**Next step: [Activating your AWS DataSync agent](activate-agent.md)**

## Choosing a VPC service endpoint
<a name="datasync-in-vpc"></a>

If you use a VPC service endpoint, your data isn't transferred across the public internet. DataSync instead transfers data through a VPC that's based on the Amazon VPC service.

**Contents**
+ [How DataSync agents work with VPC service endpoints](#working-with-endpoints)
+ [DataSync limitations with VPCs](#datasync-in-vpc-limitations)
+ [Creating a VPC service endpoint for DataSync](#create-agent-steps-vpc)

### How DataSync agents work with VPC service endpoints
<a name="working-with-endpoints"></a>

VPC service endpoints are provided by AWS PrivateLink. These types of endpoints let you privately connect supported AWS services to your VPC. When you use a VPC service endpoint with DataSync, all communication between your DataSync agent and the DataSync service remains in your VPC. 

The VPC service endpoint (along with the [network interfaces](required-network-interfaces.md) DataSync creates for data transfer traffic) uses private IP addresses that are only accessible from inside your VPC. For more information, see [Connecting your network for AWS DataSync transfers](networking-datasync.md).

### DataSync limitations with VPCs
<a name="datasync-in-vpc-limitations"></a>
+ VPCs that you use with DataSync must have default tenancy. VPCs with dedicated tenancy aren't supported.
+ DataSync doesn't support [shared VPCs](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-sharing.html).

### Creating a VPC service endpoint for DataSync
<a name="create-agent-steps-vpc"></a>

You create a VPC service endpoint for DataSync in a VPC that you manage. Your service endpoint, VPC, and DataSync agent must belong to the same AWS account.

The following diagram shows an example of DataSync using a VPC service endpoint for transferring from an on-premises storage system to an Amazon S3 bucket. The numbered callouts correspond to the steps to create a VPC service endpoint.

![\[A network diagram showing the order in which you can create a VPC service endpoint for DataSync.\]](http://docs.aws.amazon.com/datasync/latest/userguide/images/datasync-agent-vpc-endpoint.png)


**To create a VPC service endpoint for DataSync**

1. [Create](https://docs.aws.amazon.com/vpc/latest/userguide/create-vpc.html) or determine a VPC and subnet where you want to create your VPC service endpoint.

   If you're transferring to or from storage that's outside AWS, the VPC should extend to that storage environment (for example, your storage environment might be a data center where your on-premises NFS file server is located). You can do this by using routing rules over [Direct Connect](direct-connect-architecture.md) or VPN.

1. Create a DataSync VPC service endpoint by doing the following:

   1. Open the Amazon VPC console at [https://console.aws.amazon.com/vpc/](https://console.aws.amazon.com/vpc/).

   1. In the left navigation pane, choose **Endpoints**, then choose **Create endpoint**.

   1. For **Service category**, choose **AWS services**.

   1. For **Services**, search for **datasync** and choose the endpoint for the AWS Region that you're in (for example, `com.amazonaws.us-east-1.datasync` or `com.amazonaws.us-east-1.datasync-fips`).

   1. For **VPC**, choose the VPC where you want to create the VPC service endpoint.

   1. Expand **Additional settings** and clear the **Enable Private DNS Name** check box to disable this setting.

      We recommend disabling this setting in case you have agents in the same VPC that need to use a public service endpoint. An agent can't reach a [public service endpoint](datasync-network.md#using-public-endpoints) over the network when this setting is enabled.

   1. For **Subnet**, choose the subnet where you want to create the VPC service endpoint. Take note of the subnet ARN (you need this when activating your agent).

   1. Choose **Create endpoint**. Take note of the endpoint ID (you need this when activating your agent).

1. In your VPC, configure a [security group](https://docs.aws.amazon.com/vpc/latest/userguide/vpc-security-groups.html) that allows the traffic required for using DataSync [VPC service endpoints](datasync-network.md#using-vpc-endpoint). Take note of the security group ARN (you need this when activating your agent).

   The security group must allow your agent to connect with the private IP addresses of the VPC service endpoint and your [network interfaces](required-network-interfaces.md) (which get created when you create your task).
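
The console steps above can also be approximated with the AWS CLI. The following is a minimal sketch, not the documented procedure. The function names and all IDs are placeholders, and it assumes the `aws ec2 create-vpc-endpoint` command with an interface endpoint and private DNS disabled, as recommended above.

```bash
# Hypothetical helper: build the DataSync service name for a Region.
# Pass "fips" as the second argument for the FIPS variant.
datasync_service_name() {
  if [ "${2:-}" = "fips" ]; then
    printf 'com.amazonaws.%s.datasync-fips\n' "$1"
  else
    printf 'com.amazonaws.%s.datasync\n' "$1"
  fi
}

# Hypothetical wrapper: create an interface VPC endpoint for DataSync and
# print its endpoint ID.
# Arguments: Region, VPC ID, subnet ID, security group ID, optional "fips".
create_datasync_vpc_endpoint() {
  aws ec2 create-vpc-endpoint \
    --region "$1" \
    --vpc-endpoint-type Interface \
    --vpc-id "$2" \
    --service-name "$(datasync_service_name "$1" "${5:-}")" \
    --subnet-ids "$3" \
    --security-group-ids "$4" \
    --no-private-dns-enabled \
    --query 'VpcEndpoint.VpcEndpointId' \
    --output text
}
```

For example, `create_datasync_vpc_endpoint us-east-1 vpc-0123456789abcdef0 subnet-0123456789abcdef0 sg-0123456789abcdef0` would print the ID of the new endpoint, which you need when activating your agent.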

**Next step: [Activating your AWS DataSync agent](activate-agent.md)**

# Activating your AWS DataSync agent
<a name="activate-agent"></a>

To finish creating your AWS DataSync agent, you must activate it. This step associates the agent with your AWS account.

**Note**  
You can't activate an agent in more than one AWS account and AWS Region at a time.

## Prerequisites
<a name="activate-agent-prerequisites"></a>

To activate your DataSync agent, make sure that you have the following information:
+ The [DataSync service endpoint](choose-service-endpoint.md) that you're activating your agent with.

  If you're using a VPC service endpoint, you need these details:
  + The VPC service endpoint ID.
  + The subnet where your VPC service endpoint is located.
  + The security group that allows the traffic required for using DataSync [VPC service endpoints](datasync-network.md#using-vpc-endpoint).
+ Your agent's IP address or domain name.

  How you find this depends on the type of agent that you [deploy](deploy-agents.md). For example, if your agent is an Amazon EC2 instance, you can find its IP address by going to the instance's page on the Amazon EC2 console.

**Note**  
To activate an agent that uses a FIPS VPC service endpoint, use the AWS CLI or DataSync API.
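
If you plan to activate your agent with the AWS CLI, the subnet and security group are passed as ARNs rather than IDs. The following sketch shows one way to assemble them. The function names are hypothetical, the account and resource IDs are placeholders, and the `aws` partition prefix is an assumption (other partitions, such as AWS GovCloud (US), use a different prefix).

```bash
# Hypothetical helpers: build EC2 resource ARNs from their parts.
# Arguments: Region, 12-digit account ID, resource ID.
# Assumes the standard "aws" partition.
ec2_subnet_arn() {
  printf 'arn:aws:ec2:%s:%s:subnet/%s\n' "$1" "$2" "$3"
}

ec2_security_group_arn() {
  printf 'arn:aws:ec2:%s:%s:security-group/%s\n' "$1" "$2" "$3"
}
```

If you don't know your account ID, `aws sts get-caller-identity --query Account --output text` prints it for the credentials you're currently using.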

## Getting an activation key
<a name="get-activation-key"></a>

You can obtain an activation key for your deployed DataSync agent in a few different ways. Some options require access to your agent on port 80 (HTTP). If you use one of these options, DataSync closes the port once you activate the agent.

**Note**  
Agent activation keys expire in 30 minutes if unused.

------
#### [ DataSync console ]

When [activating your agent in the DataSync console](#activate-agent-how-to), DataSync can get the activation key for you by using the **Automatically get the activation key from your agent** option.

To use this option, your browser must be able to reach your agent on port 80.

------
#### [ Agent local console ]

Unlike the other options for getting an activation key, this option doesn't require your agent to be accessible on port 80.

1. Log in to the [local console](local-console-vm.md#local-console-login) of your agent virtual machine (VM) or Amazon EC2 instance.

1. On the **AWS DataSync Activation - Configuration** main menu, enter **0** to get an activation key.

1. Enter the AWS Region that you're activating your agent in.

1. Enter the type of service endpoint that your agent is using.

1. Copy the activation key that's displayed.

   For example: `F0EFT-7FPPR-GG7MC-3I9R3-27DOH`

   You specify this key when [activating your agent](#activate-agent-how-to).
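
Because the key is copied by hand in this procedure, a quick format check can catch copy errors before you try to activate the agent. This sketch assumes that activation keys always consist of five hyphen-separated groups of five uppercase letters or digits, as in the example above. The function name `is_activation_key` is hypothetical.

```bash
# Hypothetical helper: succeed only if the argument looks like an activation key.
# Assumes the format shown in the example key (XXXXX-XXXXX-XXXXX-XXXXX-XXXXX).
# LC_ALL=C keeps the A-Z range limited to ASCII uppercase letters.
is_activation_key() {
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Z0-9]{5}(-[A-Z0-9]{5}){4}$'
}
```

For example, `is_activation_key "F0EFT-7FPPR-GG7MC-3I9R3-27DOH" && echo "format looks valid"` prints the message, while a key with a missing group or lowercase characters fails the check. Passing this check doesn't mean that the key is still valid, because unused keys expire.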

------
#### [ CLI ]

With standard Unix tools, you can run a `curl` request to your agent's IP address to get its activation key.

To use this option, your client must be able to reach your agent on port 80. You can run the following command to check:

```
nc -vz agent-ip-address 80
```

Once you confirm you can reach the agent, run one of the following commands depending on the type of service endpoint that you're using:
+ **Public service endpoints**:

  ```
  curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=your-region&no_redirect"
  ```
+ **FIPS service endpoints**:

  ```
  curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=your-region&endpointType=FIPS&no_redirect"
  ```
+ **VPC service endpoints**:

  ```
  curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=your-region&privateLinkEndpoint=vpc-endpoint-ip-address&endpointType=PRIVATE_LINK&no_redirect"
  ```
+ **FIPS VPC service endpoints**:

  ```
  curl "http://agent-ip-address/?gatewayType=SYNC&activationRegion=your-region&privateLinkEndpoint=vpc-endpoint-ip-address&endpointType=FIPS_PRIVATE_LINK&no_redirect"
  ```

**Note**  
To find the `vpc-endpoint-ip-address`, open the [Amazon VPC console](https://console.aws.amazon.com/vpc/), choose **Endpoints**, and select your DataSync VPC service endpoint. On the **Subnets** tab, locate the IP address for your [VPC service endpoint's subnet](choose-service-endpoint.md#datasync-in-vpc). This is the endpoint's IP address.

The `curl` command returns an activation key. For example:

`F0EFT-7FPPR-GG7MC-3I9R3-27DOH`

You specify this key when [activating your agent](#activate-agent-how-to).
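
The four `curl` variants above differ only in their query parameters. The following sketch wraps them in a single function. The names `activation_url` and `get_activation_key` are hypothetical, and the `PUBLIC` label is only a convention of this sketch (the public variant sends no `endpointType` parameter).

```bash
# Hypothetical helper: build the activation URL for a service endpoint type.
# Arguments: agent address, Region, endpoint type (PUBLIC, FIPS, PRIVATE_LINK,
# or FIPS_PRIVATE_LINK), and the VPC endpoint IP address (VPC types only).
activation_url() {
  case "${3:-PUBLIC}" in
    PUBLIC)
      printf 'http://%s/?gatewayType=SYNC&activationRegion=%s&no_redirect\n' "$1" "$2" ;;
    FIPS)
      printf 'http://%s/?gatewayType=SYNC&activationRegion=%s&endpointType=FIPS&no_redirect\n' "$1" "$2" ;;
    PRIVATE_LINK|FIPS_PRIVATE_LINK)
      # VPC endpoint types also need the endpoint's private IP address.
      [ -n "${4:-}" ] || { echo "VPC endpoint IP address required" >&2; return 1; }
      printf 'http://%s/?gatewayType=SYNC&activationRegion=%s&privateLinkEndpoint=%s&endpointType=%s&no_redirect\n' "$1" "$2" "$4" "$3" ;;
    *)
      echo "unknown endpoint type: $3" >&2; return 1 ;;
  esac
}

# Hypothetical wrapper: request the key from the agent (needs access on port 80).
get_activation_key() {
  local url
  url="$(activation_url "$@")" || return 1
  curl --silent --show-error --max-time 10 "$url"
}
```

For example, `get_activation_key 10.0.0.5 us-east-1 PRIVATE_LINK 10.1.2.3` sends the same request as the VPC service endpoint command above, with placeholder addresses.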

------

## Activating your agent
<a name="activate-agent-how-to"></a>

You have several options for activating your DataSync agent. Once your agent is activated, AWS [manages the agent](managing-agent.md) for you.

------
#### [ DataSync console ]

1. Open the AWS DataSync console at [https://console.aws.amazon.com/datasync/](https://console.aws.amazon.com/datasync/).

1. In the left navigation pane, choose **Agents**, and then choose **Create agent**.

1. In the **Service endpoint** section, do the following to specify the service endpoint for your agent:
   + For a public service endpoint, choose **Public service endpoints in *your current AWS Region***.
   + For a FIPS service endpoint, choose **FIPS service endpoints in *your current AWS Region***.
   + For a VPC service endpoint, do the following:
     + Choose **VPC endpoints using AWS PrivateLink**.
     + For **VPC endpoint**, choose the VPC service endpoint that you want your agent to use.
     + For **Subnet**, choose the subnet where your VPC service endpoint is located.
     + For **Security group**, choose the security group that allows the traffic required for using DataSync [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. In the **Activation key** section, do one of the following to specify your agent's activation key:
   + Choose **Automatically get the activation key from your agent** for DataSync to get the key for you. 
     + For **Agent address**, enter your agent's IP address or domain name.
     + Choose **Get key**.

       If activation fails, [check your network configuration](datasync-network.md) based on the type of service endpoint you're using.
   + Choose **Manually enter your agent's activation key** if you don't want a connection between your browser and agent.
     +  [Get the key](#get-activation-key) from the agent local console or by using a `curl` command.
     + Back in the DataSync console, enter the key in the **Activation key** field.

1. (Recommended) For **Agent name**, give your agent a name that you can remember.

1. (Optional) For **Tags**, enter values for the **Key** and **Value** fields to tag your agent.

   Tags help you manage, filter, and search for your AWS resources. 

1. Choose **Create agent**.

1. On the **Agents** page, verify that your agent is using the correct service endpoint type.
**Note**  
At this point, you might notice that your agent is offline. This happens briefly after activating an agent.

------
#### [ AWS CLI ]

1. Once you [get your activation key](#get-activation-key), copy one of the following `create-agent` commands depending on the type of service endpoint that you're using:
   + **Public or FIPS service endpoint**:

     ```
     aws datasync create-agent \
       --activation-key activation-key \
       --agent-name name-for-agent
     ```
   + **VPC or FIPS VPC service endpoint**:

     ```
     aws datasync create-agent \
       --activation-key activation-key \
       --agent-name name-for-agent \
       --vpc-endpoint-id vpc-endpoint-id \
       --subnet-arns subnet-arn \
       --security-group-arns security-group-arn
     ```

1. For `--activation-key`, specify your [agent activation key](#get-activation-key).

1. (Recommended) For `--agent-name`, specify a name for your agent that you can remember.

1. If you're using a VPC service endpoint, specify the following options:
   + For `--vpc-endpoint-id`, specify the ID of the VPC service endpoint that you're using.
   + For `--subnet-arns`, specify the ARN of the subnet where your VPC service endpoint is located.
   + For `--security-group-arns`, specify the ARN of the security group that allows the traffic required for using DataSync [VPC service endpoints](datasync-network.md#using-vpc-endpoint).

1. Run the `create-agent` command.

   You get a response with the ARN of the agent that you just activated. For example:

   ```
   {
       "AgentArn": "arn:aws:datasync:us-east-1:111222333444:agent/agent-0b0addbeef44baca3"
   }
   ```

1. Verify that your agent is activated by running the `list-agents` command:

   ```
   aws datasync list-agents
   ```
**Note**  
At this point, you might notice that your agent `Status` is `OFFLINE`. This happens briefly after activating an agent.
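
Because a newly activated agent can briefly report `OFFLINE`, a script that continues with further setup might wait for the status to change first. The following is a minimal polling sketch. The function names, retry count, and delay are assumptions, and it relies on the `aws datasync describe-agent` command returning a `Status` field.

```bash
# Hypothetical helper: print the agent's current status by using the AWS CLI.
agent_status() {
  aws datasync describe-agent --agent-arn "$1" --query 'Status' --output text
}

# Hypothetical helper: poll until the agent reports ONLINE or attempts run out.
# Arguments: agent ARN, max attempts (default 10), delay in seconds (default 15),
# and the command used to fetch the status (default agent_status).
wait_for_agent_online() {
  local arn="$1" attempts="${2:-10}" delay="${3:-15}" status_cmd="${4:-agent_status}"
  local i=1 status=""
  while [ "$i" -le "$attempts" ]; do
    status="$("$status_cmd" "$arn")"
    if [ "$status" = "ONLINE" ]; then
      echo "ONLINE after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "agent not ONLINE after $attempts attempt(s) (last status: $status)" >&2
  return 1
}
```

For example, `wait_for_agent_online "arn:aws:datasync:us-east-1:111222333444:agent/agent-0b0addbeef44baca3"` checks the example agent up to 10 times, 15 seconds apart. The status command is a parameter so that the loop can be exercised without calling AWS.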

------
#### [ DataSync API ]

Once you [get your activation key](#get-activation-key), activate your agent by using the [CreateAgent](https://docs.aws.amazon.com/datasync/latest/userguide/API_CreateAgent.html) operation.

**Note**  
When you're done, you might notice that your agent is offline. This happens briefly after activating an agent.

------

## Next steps
<a name="activate-agent-next-steps"></a>
+ [Verify your agent's connection](test-agent-connections.md) to your storage system and the DataSync service.
+ If you run into issues trying to activate your agent, get help with [troubleshooting](troubleshooting-datasync-agents.md).
+ Create the DataSync location that you want to use with your agent. This might be an [on-premises](transferring-on-premises-storage.md) or [other cloud](transferring-other-cloud-storage.md) location.

# Verifying your agent's network connections
<a name="test-agent-connections"></a>

Once you activate your AWS DataSync agent, make sure that the agent has network connectivity to your storage system and the DataSync service.

## Accessing your agent's local console
<a name="local-console-login-getting-started"></a>

How you access your agent's local console depends on the type of agent you're using. 

### Accessing the local console (VMware ESXi, Linux KVM, or Microsoft Hyper-V)
<a name="local-console-login-agent-vm-getting-started"></a>

For security reasons, you can't remotely connect to the local console of the DataSync agent virtual machine (VM). 
+ If this is your first time using the local console, log in with the default credentials. The default user name is **admin** and the password is **password**.
**Note**  
We recommend changing the default password. To do this, on the console main menu enter **5** (or **6** for VMware VMs), then run the `passwd` command to change the password. 

### Accessing the local console (Amazon EC2)
<a name="local-console-login-agent-ec2-getting-started"></a>

To connect to an Amazon EC2 agent's local console, you must use SSH.

**Before you begin**: Make sure that your EC2 instance's security group allows access with SSH (TCP port 22).

1. Open a terminal and copy the following `ssh` command:

   ```
   ssh -i /path/key-pair-name.pem instance-user-name@instance-public-ip-address
   ```
   + For */path/key-pair-name*, specify the path and file name (`.pem`) of the private key required to connect to your instance.
   + For *instance-user-name*, specify `admin`.
   + For *instance-public-ip-address*, specify the public IP address of your instance.

1. Run the `ssh` command to connect to the instance.

Once you're connected, the main menu of the agent's local console appears.
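
The prerequisite and the `ssh` command above can be sketched as two small helpers. The function names are hypothetical and the IDs and addresses are placeholders. The sketch assumes the `aws ec2 authorize-security-group-ingress` command for opening TCP port 22 and uses the `admin` user name described above.

```bash
# Hypothetical wrapper: allow inbound SSH (TCP port 22) from one CIDR block.
# Arguments: security group ID, CIDR block (for example, 203.0.113.10/32).
allow_ssh_from() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" --protocol tcp --port 22 --cidr "$2"
}

# Hypothetical helper: print the ssh command for an EC2 agent's local console.
# Arguments: path to the private key (.pem), public IP address of the instance.
# The printed command doesn't quote the key path, so avoid paths with spaces.
agent_ssh_command() {
  printf 'ssh -i %s admin@%s\n' "$1" "$2"
}
```

For example, `agent_ssh_command /path/key-pair-name.pem 203.0.113.25` prints `ssh -i /path/key-pair-name.pem admin@203.0.113.25`. Limiting the CIDR block to your own address keeps port 22 closed to other sources.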

## Verifying your agent's connection to your storage system
<a name="self-managed-storage-connectivity"></a>

Test whether your DataSync agent can connect to your storage system. For more information, see [1. Network connection between your storage system and agent](networking-datasync.md#1-network-between-storage-agent).

1. [Access your agent's local console](#local-console-login-getting-started).

1. On the **AWS DataSync Activation - Configuration** main menu, enter **3**.

1. Enter one of the following options:

   1. Enter **1** to test an NFS server connection.

   1. Enter **2** to test an SMB server connection.

   1. Enter **3** to test an object storage server connection.

   1. Enter **4** to test an HDFS connection.

   1. Enter **5** to test a Microsoft Azure Blob Storage connection.

1. Enter the storage server's IP address or domain name.

   Remember the following when entering the IP address or domain name:
   + Don't include a protocol. For example, enter **mystorage.com** instead of **https://mystorage.com**.
   + For HDFS, enter the IP address or domain name of the NameNode or DataNode in the Hadoop cluster.

1. If requested, enter the TCP port for connecting to the storage server (for example, **443**).

Check whether the connectivity test **PASSED** or **FAILED**.
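
If you're starting from a full URL, the following sketch shows one way to reduce it to the bare host name that the local console expects. The function name `storage_host` is hypothetical. It removes only a leading protocol and any trailing path, and it leaves a port suffix in place because the console asks for the port separately.

```bash
# Hypothetical helper: strip the protocol and path from a storage server URL.
storage_host() {
  local host="${1#*://}"   # drop a leading scheme such as https://
  host="${host%%/*}"       # drop any path that follows the host
  printf '%s\n' "$host"
}
```

For example, `storage_host https://mystorage.com` prints `mystorage.com`, and a value that's already a bare host name or IP address is returned unchanged.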

## Verifying your agent's connection to the DataSync service
<a name="test-network"></a>

Test whether your DataSync agent can connect to the DataSync service. For more information, see [2. Network connection between your agent and DataSync service](networking-datasync.md#2-network-between-agent-service).

1. [Access your agent's local console](#local-console-login-getting-started).

1. On the **AWS DataSync Activation - Configuration** main menu, enter **2** to begin testing network connectivity.

   If your agent is already activated, the **Test Network Connectivity** option runs without additional input because the Region and endpoint type are taken from the activated agent's information.

1. Enter the type of DataSync service endpoint that your agent uses:

   1. For public service endpoints, enter **1** and the AWS Region where your agent is activated.

   1. For FIPS service endpoints, enter **2** and the Region where your agent is activated.

   1. For VPC service endpoints, enter **3**.

   1. For FIPS VPC service endpoints, enter **4**.

   You see a **PASSED** or **FAILED** message.

1. If you see a **FAILED** message, check your network configuration. For more information, see [AWS DataSync network requirements](datasync-network.md).

## Next steps
<a name="test-agent-connections-next-steps"></a>

Create the DataSync location that you want to use with your agent. This might be an [on-premises](transferring-on-premises-storage.md) or [other cloud](transferring-other-cloud-storage.md) location.