
Use Case 2: Continuous on-premises data migration

Figure 3 - Ongoing data migration from on-premises storage solution

In use case 2, a customer has a hybrid cloud deployment, with data used by both the on-premises environment and systems deployed in AWS. Additionally, the customer wants a dedicated connection to AWS that provides consistent network performance. For the ongoing data migration, AWS Direct Connect acts as the backbone, providing a dedicated connection that bypasses the Internet to connect to the AWS Cloud. The customer also deploys AWS Storage Gateway with Gateway-Cached Volumes in the data center, which sends data to an Amazon S3 bucket in their target AWS Region. The following steps describe how to build this solution:

  1. The customer creates an AWS Direct Connect connection between their corporate data center and the AWS Cloud.

    1. To set up the connection using the Connection Wizard ordering type, the customer provides the following information in the AWS Direct Connect console (a sketch of the equivalent API call follows this list):

      1. Choose a resiliency level.

        • Maximum Resiliency (for critical workloads): You can achieve maximum resiliency for critical workloads by using separate connections that terminate on separate devices in more than one location. This topology provides resiliency against device, connectivity, and complete location failures.

        • High Resiliency (for critical workloads): You can achieve high resiliency for critical workloads by using two independent connections to multiple locations. This topology provides resiliency against connectivity failures caused by a fiber cut or a device failure. It also helps prevent a complete location failure.

        • Development and Test (non-critical or test/dev workloads): You can achieve development and test resiliency for non-critical workloads by using separate connections that terminate on separate devices in one location. This topology provides resiliency against device failure, but does not provide resiliency against location failure.

      2. Enter connection settings:

        • Bandwidth – choose from 1 Gbps to 100 Gbps

        • First location – the first physical location for your first Direct Connect connection

        • First location service provider

        • Second location – the second physical location for your second Direct Connect connection

        • Second location service provider

      3. On the Review and create page, confirm the selections and choose Create.
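
    For customers who automate this step, the same connection request can be made through the AWS SDK. The following is a minimal sketch using the boto3 Direct Connect client; the Region, location code, bandwidth, and connection name are placeholder values that would be replaced with the selections made above (describe_locations returns the valid location codes).

      import boto3

      dx = boto3.client("directconnect", region_name="us-west-2")  # assumed target Region

      # List Direct Connect locations to find a valid location code for the request.
      locations = dx.describe_locations()["locations"]

      # Request a dedicated connection; the values below are illustrative only.
      connection = dx.create_connection(
          location="EqDC2",                          # hypothetical code chosen from describe_locations
          bandwidth="10Gbps",                        # 1Gbps, 10Gbps, or 100Gbps for dedicated connections
          connectionName="onprem-to-aws-primary",
      )
      print(connection["connectionId"], connection["connectionState"])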

    2. After the customer creates a connection using the AWS Direct Connect console, AWS sends an email within 72 hours that includes a Letter of Authorization and Connecting Facility Assignment (LOA-CFA). After receiving the LOA-CFA, the customer forwards it to their network provider so that the provider can order a cross connect on the customer's behalf. The customer cannot order a cross connect for themselves at the AWS Direct Connect location unless they already have equipment there; the network provider has to do this for the customer.
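
    The LOA-CFA can also be downloaded programmatically once the connection has been approved. A minimal sketch, assuming the connection ID returned when the connection was created:

      import boto3

      dx = boto3.client("directconnect", region_name="us-west-2")

      # Download the LOA-CFA as a PDF; the connection ID is a placeholder.
      loa = dx.describe_loa(connectionId="dxcon-EXAMPLE", loaContentType="application/pdf")
      with open("loa-cfa.pdf", "wb") as f:
          f.write(loa["loaContent"])  # forward this document to the network provider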

    3. After the physical connection is set up, the customer creates the virtual interfaces within AWS Direct Connect to connect to AWS public services, such as Amazon S3.
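
    A public virtual interface for reaching Amazon S3 over the connection can be created in the console or through the API. The sketch below is illustrative only; the VLAN, BGP ASN, peer addresses, and advertised prefixes are placeholder values that must match the customer's own network configuration.

      import boto3

      dx = boto3.client("directconnect", region_name="us-west-2")

      # Create a public virtual interface on the dedicated connection (placeholder values).
      vif = dx.create_public_virtual_interface(
          connectionId="dxcon-EXAMPLE",
          newPublicVirtualInterface={
              "virtualInterfaceName": "public-vif-s3",
              "vlan": 101,                              # assumed VLAN tag
              "asn": 65000,                             # customer-side BGP ASN
              "amazonAddress": "198.51.100.1/30",       # example peer addresses
              "customerAddress": "198.51.100.2/30",
              "addressFamily": "ipv4",
              "routeFilterPrefixes": [{"cidr": "203.0.113.0/24"}],  # public prefixes to advertise
          },
      )
      print(vif["virtualInterfaceId"], vif["virtualInterfaceState"])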

    4. After creating virtual interfaces, the customer runs the AWS Direct Connect failover test to make sure that traffic routes to alternate online virtual interfaces.
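
    The failover test can also be started through the API. A sketch, assuming the virtual interface ID created above:

      import boto3

      dx = boto3.client("directconnect", region_name="us-west-2")

      # Bring down the BGP session on one virtual interface for a fixed window to
      # verify that traffic fails over to the remaining virtual interfaces.
      test = dx.start_bgp_failover_test(
          virtualInterfaceId="dxvif-EXAMPLE",  # placeholder virtual interface ID
          testDurationInMinutes=60,
      )
      print(test)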

  2. After the AWS Direct Connect connection is set up, the customer creates an Amazon S3 bucket into which the on-premises data can be backed up.
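
  Creating the target bucket is a single API call. A minimal sketch with boto3; the bucket name and Region are placeholders:

    import boto3

    s3 = boto3.client("s3", region_name="us-west-2")

    # Create the bucket that will hold the migrated data (the name must be globally unique).
    s3.create_bucket(
        Bucket="example-onprem-migration-bucket",
        CreateBucketConfiguration={"LocationConstraint": "us-west-2"},  # omit for us-east-1
    )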

  3. The customer deploys the AWS Storage Gateway in their existing data center using the following steps (a sketch of programmatic activation follows this list):

    1. Deploy a new gateway using AWS Storage Gateway console.

    2. Select Volume Gateway-Cached volumes as the gateway type.

    3. Download the gateway virtual machine (VM) image and deploy it in the on-premises virtualization environment.

    4. Provision two local disks to be attached to the VM.

    5. After the gateway VM is powered on, record the IP address of the machine, and then enter the IP address in the AWS Storage Gateway console to activate the gateway.
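
  Activation can also be performed with the Storage Gateway API once the activation key has been retrieved from the gateway VM (the key is returned when the VM's IP address is requested over HTTP). A minimal sketch, with placeholder values:

    import boto3

    sgw = boto3.client("storagegateway", region_name="us-west-2")

    # Activate the on-premises gateway VM as a cached volume gateway.
    resp = sgw.activate_gateway(
        ActivationKey="ABCDE-12345-FGHIJ-67890-KLMNO",  # placeholder key from the gateway VM
        GatewayName="onprem-volume-gateway",
        GatewayTimezone="GMT-8:00",
        GatewayRegion="us-west-2",
        GatewayType="CACHED",                           # cached volume gateway
    )
    gateway_arn = resp["GatewayARN"]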

  4. After the gateway is activated, the customer can configure the volume gateway in the AWS Storage Gateway console (a sketch of the equivalent API calls follows these steps):

    1. Configure the local storage by allocating the two local disks attached to the storage gateway VM, one as the upload buffer and the other as cache storage.

    2. Create volumes on the Amazon S3 bucket.
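
  These two sub-steps map to a handful of Storage Gateway API calls. The following sketch assumes the gateway ARN from activation and the two local disks provisioned earlier; the disk assignment, target name, and volume size are illustrative.

    import boto3

    sgw = boto3.client("storagegateway", region_name="us-west-2")
    gateway_arn = "arn:aws:storagegateway:us-west-2:111122223333:gateway/sgw-EXAMPLE"

    # Discover the two local disks attached to the gateway VM.
    disks = sgw.list_local_disks(GatewayARN=gateway_arn)["Disks"]

    # Allocate one disk as cache storage and the other as the upload buffer.
    sgw.add_cache(GatewayARN=gateway_arn, DiskIds=[disks[0]["DiskId"]])
    sgw.add_upload_buffer(GatewayARN=gateway_arn, DiskIds=[disks[1]["DiskId"]])

    # Create a cached volume backed by Amazon S3 and exposed as an iSCSI target.
    volume = sgw.create_cached_iscsi_volume(
        GatewayARN=gateway_arn,
        VolumeSizeInBytes=500 * 1024**3,          # 500 GiB, illustrative
        TargetName="migration-volume-1",
        NetworkInterfaceId="10.0.0.25",           # gateway VM IP address used by iSCSI initiators
        ClientToken="migration-volume-1-token",   # idempotency token
    )
    print(volume["TargetARN"])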

  5. The customer connects a client machine to the Amazon S3-backed gateway volume over iSCSI, using the storage gateway's IP address as the iSCSI target.

  6. After setup is completed and the customer applications write data to the storage volumes, the gateway first stores the data on the on-premises disks (referred to as cache storage) before uploading it to Amazon S3. The cache storage acts as the on-premises durable store for data that is waiting to upload to Amazon S3 from the upload buffer. The cache storage also lets the gateway store the customer application's recently accessed data on-premises for low-latency access. If an application requests data, the gateway first checks the cache storage for the data before checking Amazon S3. To prepare for upload to Amazon S3, the gateway also stores incoming data in a staging area, referred to as an upload buffer. Storage Gateway uploads this buffer data over an encrypted Secure Sockets Layer (SSL) connection to AWS, where it is stored encrypted in Amazon S3.
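
  Cache and upload-buffer utilization can be checked over time to confirm that the gateway is keeping up with application writes. A small monitoring sketch, assuming the gateway ARN from the previous steps:

    import boto3

    sgw = boto3.client("storagegateway", region_name="us-west-2")
    gateway_arn = "arn:aws:storagegateway:us-west-2:111122223333:gateway/sgw-EXAMPLE"

    # Report how full the cache is and how much data is still waiting in the upload buffer.
    cache = sgw.describe_cache(GatewayARN=gateway_arn)
    buffer = sgw.describe_upload_buffer(GatewayARN=gateway_arn)

    print(f"Cache used: {cache['CacheUsedPercentage']:.1f}% "
          f"(dirty: {cache['CacheDirtyPercentage']:.1f}%)")
    print(f"Upload buffer: {buffer['UploadBufferUsedInBytes']} of "
          f"{buffer['UploadBufferAllocatedInBytes']} bytes in use")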