Back up and archive data to Amazon S3 with Veeam Backup & Replication
Created by Jeanna James, Anthony Fiore (AWS), and William Quigley
Environment: Production | Technologies: Storage & backup | AWS services: Amazon EC2; Amazon S3; Amazon S3 Glacier
Summary
This pattern details the process for sending backups created by Veeam Backup & Replication to supported Amazon Simple Storage Service (Amazon S3) object storage classes by using the Veeam scale-out backup repository capability.
Veeam supports multiple Amazon S3 storage classes to best fit your specific needs. You can choose the type of storage based on the data access, resiliency, and cost requirements of your backup or archive data. For example, you can store data that you don’t plan to use for 30 days or longer in an Amazon S3 Infrequent Access (IA) storage class for lower cost. If you’re planning to archive data for 90 days or longer, you can use Amazon Simple Storage Service Glacier (Amazon S3 Glacier) Flexible Retrieval or S3 Glacier Deep Archive with Veeam’s archive tier. You can also use S3 Object Lock to make backups immutable within Amazon S3.
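The 30-day and 90-day guidance above can be sketched as a small lookup. This is only an illustration of the thresholds named in the summary: the function name is hypothetical, and returning S3 Standard for data that is needed sooner than 30 days is an assumption, not something this pattern states.

```python
def suggest_storage_target(days_before_access: int) -> str:
    """Map an expected idle period (in days) to the options named in the summary."""
    if days_before_access < 0:
        raise ValueError("days_before_access must be non-negative")
    if days_before_access >= 90:
        # Archive candidates that the summary names for Veeam's archive tier.
        return "S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive (archive tier)"
    if days_before_access >= 30:
        # Infrequent access storage classes for lower cost.
        return "S3 Standard-IA or S3 One Zone-IA"
    # Assumption: frequently accessed backups stay in S3 Standard.
    return "S3 Standard"


for days in (7, 45, 365):
    print(days, "->", suggest_storage_target(days))
```

Actual placement also depends on resiliency and cost requirements, so treat the result as a starting point rather than a rule.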
This pattern doesn’t cover how to set up Veeam Backup & Replication with a tape gateway in AWS Storage Gateway. For information about that topic, see Veeam Backup & Replication using AWS VTL Gateway - Deployment Guide.
Warning: This scenario requires IAM users with programmatic access and long-term credentials, which present a security risk. To help mitigate this risk, we recommend that you provide these users with only the permissions they require to perform the task and that you remove these users when they are no longer needed. Access keys can be updated if necessary. For more information, see Updating access keys in the IAM User Guide.
Prerequisites and limitations
Prerequisites
Veeam Backup & Replication, including Veeam Availability Suite or Veeam Backup Essentials, installed (you can register for a free trial)
Veeam Backup & Replication license with Enterprise or Enterprise Plus functionality, which includes Veeam Universal License (VUL)
An active AWS Identity and Access Management (IAM) user with access to an Amazon S3 bucket
An active IAM user with access to Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Virtual Private Cloud (Amazon VPC) (if you use the archive tier)
Network connectivity from on premises to AWS services with available bandwidth for backup and restore traffic through a public internet connection or an AWS Direct Connect public virtual interface (VIF)
The following network ports and endpoints opened to ensure proper communication with object storage repositories:
Amazon S3 storage – TCP – port 443: Used to communicate with Amazon S3 storage.
Amazon S3 storage – cloud endpoints – *.amazonaws.com for AWS Regions and the AWS GovCloud (US) Regions, or *.amazonaws.com.cn for China Regions: Used to communicate with Amazon S3 storage. For a complete list of connection endpoints, see Amazon S3 endpoints in the AWS documentation.
Amazon S3 storage – TCP HTTP – port 80: Used to verify certificate status. Consider that certificate verification endpoints—certificate revocation list (CRL) URLs and Online Certificate Status Protocol (OCSP) servers—are subject to change. The actual list of addresses can be found in the certificate itself.
Amazon S3 storage – certificate verification endpoints – *.amazontrust.com: Used to verify certificate status. Consider that certificate verification endpoints (CRL URLs and OCSP servers) are subject to change. The actual list of addresses can be found in the certificate itself.
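Before you configure repositories, it can help to confirm that the backup or gateway server can reach Amazon S3 on port 443. The following is a minimal sketch that uses only the Python standard library. The helper names are hypothetical, and it assumes the standard Regional endpoint form `s3.<region>.<suffix>`. It doesn't test the certificate verification endpoints on port 80, because those host names come from the certificate itself.

```python
import socket


def s3_endpoint(region: str) -> str:
    """Return the Regional Amazon S3 endpoint host name for a Region code."""
    # China Regions use the amazonaws.com.cn suffix. Other Regions,
    # including the AWS GovCloud (US) Regions, use amazonaws.com.
    suffix = "amazonaws.com.cn" if region.startswith("cn-") else "amazonaws.com"
    return f"s3.{region}.{suffix}"


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_connect(s3_endpoint("us-east-1"), 443)` returns `True` when outbound HTTPS to that Region's endpoint is open. A successful TCP connection confirms only that the port is reachable, not that the credentials or bucket permissions are correct.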
Limitations
Veeam doesn’t support S3 Lifecycle policies on any S3 buckets that are used as Veeam object storage repositories. These include policies with Amazon S3 storage class transitions and S3 Lifecycle expiration rules. Veeam must be the sole entity that manages these objects. Enabling S3 Lifecycle policies might have unexpected results, including data loss.
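One way to guard against this limitation is to check a bucket for S3 Lifecycle rules before you add it as a repository. The sketch below assumes a boto3 S3 client is passed in. `get_bucket_lifecycle_configuration` is a boto3 call that raises a client error with the code `NoSuchLifecycleConfiguration` when the bucket has no configuration. The function name and the bucket placeholder are hypothetical.

```python
def has_lifecycle_rules(s3_client, bucket: str) -> bool:
    """Return True if the bucket has at least one S3 Lifecycle rule."""
    try:
        response = s3_client.get_bucket_lifecycle_configuration(Bucket=bucket)
    except Exception as err:
        # botocore's ClientError carries the service error code in err.response.
        code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if code == "NoSuchLifecycleConfiguration":
            return False
        raise  # Any other failure (for example, access denied) is surfaced as-is.
    return bool(response.get("Rules"))


# Usage (requires boto3 and credentials that allow s3:GetLifecycleConfiguration):
#   import boto3
#   print(has_lifecycle_rules(boto3.client("s3"), "<yourbucketname>"))
```

Passing the client in as a parameter keeps the function easy to exercise without network access.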
Product versions
Veeam Backup & Replication v9.5 Update 4 or later (backup only or capacity tier)
Veeam Backup & Replication v10 or later (backup or capacity tier and S3 Object Lock)
Veeam Backup & Replication v11 or later (backup or capacity tier, archive or archive tier, and S3 Object Lock)
Veeam Backup & Replication v12 or later (performance tier, backup or capacity tier, archive or archive tier, and S3 Object Lock)
S3 Standard
S3 Standard-IA
S3 One Zone-IA
S3 Glacier Flexible Retrieval (v11 and later only)
S3 Glacier Deep Archive (v11 and later only)
S3 Glacier Instant Retrieval (v12 and later only)
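The version qualifiers in the list above can be captured in a small lookup table. This sketch is illustrative only. The names are hypothetical, update levels such as "Update 4" are ignored, and tying the unqualified storage classes to v9.5 is an assumption based on the earliest product version listed.

```python
from typing import List, Tuple

# (storage class, earliest Veeam Backup & Replication version as (major, minor)).
# The Glacier entries follow the qualifiers in the list above. The (9, 5) entries
# are an assumption: the list gives no qualifier for those storage classes.
STORAGE_CLASS_MIN_VERSION: List[Tuple[str, Tuple[int, int]]] = [
    ("S3 Standard", (9, 5)),
    ("S3 Standard-IA", (9, 5)),
    ("S3 One Zone-IA", (9, 5)),
    ("S3 Glacier Flexible Retrieval", (11, 0)),
    ("S3 Glacier Deep Archive", (11, 0)),
    ("S3 Glacier Instant Retrieval", (12, 0)),
]


def supported_storage_classes(version: Tuple[int, int]) -> List[str]:
    """Return the storage classes available at the given (major, minor) version."""
    return [name for name, minimum in STORAGE_CLASS_MIN_VERSION if version >= minimum]


print(supported_storage_classes((11, 0)))
```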
Architecture
Source technology stack
On-premises Veeam Backup & Replication installation with connectivity from a Veeam backup server or a Veeam gateway server to Amazon S3
Target technology stack
Amazon S3
Amazon VPC and Amazon EC2 (if using archive tier)
Target architecture: SOBR
The following diagram shows the scale-out backup repository (SOBR) architecture.
Veeam Backup & Replication software protects data from logical errors such as system failures, application errors, or accidental deletion. In this diagram, backups are run on premises first, and a secondary copy is sent directly to Amazon S3. A backup represents a point-in-time copy of the data.
The workflow consists of three primary components that are required for tiering or copying backups to Amazon S3, and one optional component:
Veeam Backup & Replication (1) – The backup server that is responsible for coordinating, controlling, and managing backup infrastructure, settings, jobs, recovery tasks, and other processes.
Veeam gateway server (not shown in the diagram) – An optional on-premises gateway server that is required if the Veeam backup server doesn’t have outbound connectivity to Amazon S3.
Scale-out backup repository (2) – Repository system with horizontal scaling support for multi-tier storage of data. The scale-out backup repository consists of one or more backup repositories that provide fast access to data and can be expanded with Amazon S3 object storage repositories for long-term storage (capacity tier) and archiving (archive tier). Veeam uses the scale-out backup repository to tier data automatically between local (performance tier) and Amazon S3 object storage (capacity and archive tiers).
Amazon S3 (3) – AWS object storage service that offers scalability, data availability, security, and performance.
Target architecture: DTO
The following diagram shows the direct-to-object (DTO) architecture.
In this diagram, backup data goes directly to Amazon S3 without being stored on premises first. Secondary copies can be stored in S3 Glacier.
Automation and scale
You can automate the creation of IAM resources and S3 buckets by using the AWS CloudFormation templates provided in the VeeamHub GitHub repository.
Tools
Tools and AWS services
Veeam Backup & Replication is a solution from Veeam for protecting, backing up, replicating, and restoring your virtual and physical workloads.
AWS CloudFormation helps you model and set up your AWS resources, provision them quickly and consistently, and manage them throughout their lifecycle. You can use a template to describe your resources and their dependencies, and launch and configure them together as a stack, instead of managing resources individually. You can manage and provision stacks across multiple AWS accounts and AWS Regions.
Amazon Elastic Compute Cloud (Amazon EC2) provides scalable computing capacity in the AWS Cloud. You can use Amazon EC2 to launch as many or as few virtual servers as you need, and you can scale out or scale in.
AWS Identity and Access Management (IAM) is a web service for securely controlling access to AWS services. With IAM, you can centrally manage users, security credentials such as access keys, and permissions that control which AWS resources users and applications can access.
Amazon Simple Storage Service (Amazon S3) is an object storage service. You can use Amazon S3 to store and retrieve any amount of data at any time, from anywhere on the web.
Amazon S3 Glacier (S3 Glacier) is a secure and durable service for low-cost data archiving and long-term backup.
Amazon Virtual Private Cloud (Amazon VPC) provisions a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you've defined. This virtual network closely resembles a traditional network that you'd operate in your own data center, with the benefits of using the scalable infrastructure of AWS.
Code
Use the CloudFormation templates provided in the VeeamHub GitHub repository.
Best practices
In accordance with IAM best practices, we strongly recommend that you regularly rotate long-term IAM user credentials, such as the IAM user that you use for writing Veeam Backup & Replication backups to Amazon S3. For more information, see Security best practices in the IAM documentation.
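To support regular rotation, you can flag access keys that have passed an age threshold. The sketch below works on the `AccessKeyMetadata` list that the boto3 IAM call `list_access_keys` returns. The function name and the 90-day default are assumptions for illustration, not values taken from this pattern.

```python
from datetime import datetime, timedelta, timezone


def keys_due_for_rotation(access_key_metadata, max_age_days=90, now=None):
    """Return the IDs of active access keys that are at least max_age_days old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in access_key_metadata
        # Inactive keys can't authenticate, so only active keys are reported.
        if key["Status"] == "Active" and key["CreateDate"] <= cutoff
    ]


# Usage (requires boto3; "veeam-backup-user" is a hypothetical user name):
#   import boto3
#   metadata = boto3.client("iam").list_access_keys(UserName="veeam-backup-user")
#   print(keys_due_for_rotation(metadata["AccessKeyMetadata"]))
```

After you create a replacement key and update the credentials stored in Veeam, deactivate the old key before you delete it so that you can roll back if a job fails.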
Epics
Task | Description | Skills required |
---|---|---|
Create an IAM user. | Follow the instructions in the IAM documentation to create an IAM user. This user should not have AWS console access, and you will need to create an access key for this user. Veeam uses this entity to authenticate with AWS to read and write to your S3 buckets. You must grant least privilege (that is, grant only the permissions required to perform a task) so the user doesn’t have more authority than it needs. For example IAM policies to attach to your Veeam IAM user, see the Additional information section. Note: Alternatively, you can use the CloudFormation templates provided in the VeeamHub GitHub repository. | AWS administrator |
Create an S3 bucket. | For more information, see Creating a bucket in the Amazon S3 documentation. | AWS administrator |
Task | Description | Skills required |
---|---|---|
Launch the New Object Repository wizard. | Before you set up the object storage and scale-out backup repositories in Veeam, you must add the Amazon S3 and Amazon S3 Glacier storage repositories that you want to use for the capacity and archive tiers. In the next epic, you’ll connect these storage repositories to your scale-out backup repository. | AWS administrator, App owner |
Add Amazon S3 storage for the capacity tier. | | AWS administrator, App owner |
Add S3 Glacier storage for the archive tier. | If you want to create an archive tier, use the IAM permissions detailed in the Additional information section. | AWS administrator, App owner |
Task | Description | Skills required |
---|---|---|
Launch the New Scale-Out Backup Repository wizard. | | App owner, AWS systems administrator |
Add a scale-out backup repository and configure capacity and archive tiers. | | App owner, AWS systems administrator |
Related resources
Creating an IAM user in your AWS account (IAM documentation)
Creating a bucket (Amazon S3 documentation)
Blocking public access to your Amazon S3 storage (Amazon S3 documentation)
Using S3 Object Lock (Amazon S3 documentation)
How to Create Secure IAM Policy for Connection to S3 Object Storage (Veeam documentation)
Additional information
The following sections provide sample IAM policies you can use when you create an IAM user in the Epics section of this pattern.
IAM policy for capacity tier
Note: Change the name of the S3 buckets in the example policy from <yourbucketname> to the name of the S3 bucket that you want to use for Veeam capacity tier backups.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:GetObjectVersion",
        "s3:ListBucketVersions",
        "s3:ListBucket",
        "s3:PutObjectLegalHold",
        "s3:GetBucketVersioning",
        "s3:GetObjectLegalHold",
        "s3:GetBucketObjectLockConfiguration",
        "s3:PutObject*",
        "s3:GetObject*",
        "s3:GetEncryptionConfiguration",
        "s3:PutObjectRetention",
        "s3:PutBucketObjectLockConfiguration",
        "s3:DeleteObject*",
        "s3:DeleteObjectVersion",
        "s3:GetBucketLocation"
      ],
      "Resource": [
        "arn:aws:s3:::<yourbucketname>/*",
        "arn:aws:s3:::<yourbucketname>"
      ]
    },
    {
      "Sid": "VisualEditor1",
      "Effect": "Allow",
      "Action": [
        "s3:ListAllMyBuckets",
        "s3:ListBucket"
      ],
      "Resource": "*"
    }
  ]
}
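If you generate this policy for several buckets, a small helper can fill in the bucket ARNs for you. The following sketch is an assumption-laden illustration: the function name and the example bucket name are hypothetical, the bucket name check is a simplified version of the general S3 naming rules, and `example_policy` is an abbreviated stand-in for the capacity tier policy rather than a replacement for it.

```python
import copy
import json
import re

# Simplified check based on the general S3 bucket naming rules: 3-63 characters,
# lowercase letters, digits, dots, and hyphens, starting and ending with a
# letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


def scope_policy_to_bucket(policy: dict, bucket_name: str) -> dict:
    """Return a copy of the policy whose first statement targets one bucket."""
    if not _BUCKET_NAME.match(bucket_name):
        raise ValueError(f"not a valid S3 bucket name: {bucket_name!r}")
    scoped = copy.deepcopy(policy)  # Leave the caller's template untouched.
    scoped["Statement"][0]["Resource"] = [
        f"arn:aws:s3:::{bucket_name}/*",  # Objects in the bucket
        f"arn:aws:s3:::{bucket_name}",    # The bucket itself
    ]
    return scoped


# Abbreviated stand-in for the capacity tier policy; use the full action list in practice.
example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": [],
        },
    ],
}

print(json.dumps(scope_policy_to_bucket(example_policy, "example-veeam-capacity"), indent=2))
```

Only the first statement is rewritten, because the second statement in the capacity tier policy intentionally applies to all resources.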
IAM policy for archive tier
Note: Change the name of the S3 buckets in the example policy from <yourbucketname> to the name of the S3 bucket that you want to use for Veeam archive tier backups.
To use your existing VPC, subnet, and security groups:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:GetObject",
        "s3:RestoreObject",
        "s3:ListBucket",
        "s3:AbortMultipartUpload",
        "s3:GetBucketVersioning",
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation",
        "s3:GetBucketObjectLockConfiguration",
        "s3:PutObjectRetention",
        "s3:GetObjectVersion",
        "s3:PutObjectLegalHold",
        "s3:GetObjectRetention",
        "s3:DeleteObjectVersion",
        "s3:ListBucketVersions",
        "ec2:DescribeInstances",
        "ec2:CreateKeyPair",
        "ec2:DescribeKeyPairs",
        "ec2:RunInstances",
        "ec2:DeleteKeyPair",
        "ec2:DescribeVpcAttribute",
        "ec2:CreateTags",
        "ec2:DescribeSubnets",
        "ec2:TerminateInstances",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeImages",
        "ec2:DescribeVpcs"
      ],
      "Resource": "*"
    }
  ]
}
To create a new VPC, subnet, and security groups:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject",
        "s3:PutObject",
        "s3:GetObject",
        "s3:RestoreObject",
        "s3:ListBucket",
        "s3:AbortMultipartUpload",
        "s3:GetBucketVersioning",
        "s3:ListAllMyBuckets",
        "s3:GetBucketLocation",
        "s3:GetBucketObjectLockConfiguration",
        "s3:PutObjectRetention",
        "s3:GetObjectVersion",
        "s3:PutObjectLegalHold",
        "s3:GetObjectRetention",
        "s3:DeleteObjectVersion",
        "s3:ListBucketVersions",
        "ec2:DescribeInstances",
        "ec2:CreateKeyPair",
        "ec2:DescribeKeyPairs",
        "ec2:RunInstances",
        "ec2:DeleteKeyPair",
        "ec2:DescribeVpcAttribute",
        "ec2:CreateTags",
        "ec2:DescribeSubnets",
        "ec2:TerminateInstances",
        "ec2:DescribeSecurityGroups",
        "ec2:DescribeImages",
        "ec2:DescribeVpcs",
        "ec2:CreateVpc",
        "ec2:CreateSubnet",
        "ec2:DescribeAvailabilityZones",
        "ec2:CreateRoute",
        "ec2:CreateInternetGateway",
        "ec2:AttachInternetGateway",
        "ec2:ModifyVpcAttribute",
        "ec2:CreateSecurityGroup",
        "ec2:DeleteSecurityGroup",
        "ec2:AuthorizeSecurityGroupIngress",
        "ec2:AuthorizeSecurityGroupEgress",
        "ec2:DescribeRouteTables",
        "ec2:DescribeInstanceTypes"
      ],
      "Resource": "*"
    }
  ]
}