Amazon S3 backups - AWS Backup

Amazon S3 backups

Overview

AWS Backup supports centralized backup and restore of applications storing data in S3 alone or alongside other AWS services for database, storage, and compute. Many features are available for S3 backups, including Backup Audit Manager.

You can use a single backup policy in AWS Backup to centrally automate the creation of backups of your application data. AWS Backup automatically organizes backups across different AWS services and third-party applications in one centralized, encrypted location (known as a backup vault) so that you can manage backups of your entire application through a centralized experience. For S3, you can create continuous backups of your application data and restore them to a point in time with a single click.

Prerequisites for S3 backups

Permissions and policies for Amazon S3 backup and restore

To back up, copy, and restore S3 resources, you must have the correct policies in your role. To add these policies, go to AWS managed policies. Add the AWSBackupServiceRolePolicyForS3Backup and AWSBackupServiceRolePolicyForS3Restore policies to the roles that you intend to use to back up and restore S3 buckets.

If you do not have sufficient permission, please request the manager of your organization's administrative (admin) account to add the policies to the intended roles.

For more information, please see Managed policies and inline policies in the IAM User Guide.
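As a minimal sketch of the step above, the following Python shows one way to attach the two managed policies to a role with boto3. The policy names come from this page; the ARN prefix `arn:aws:iam::aws:policy/` and the role name passed by the caller are assumptions, so confirm the ARNs in the IAM console before relying on them.

```python
# Policy names are taken from this page. The ARN prefix is an assumption;
# verify the full ARNs in the IAM console.
MANAGED_POLICY_PREFIX = "arn:aws:iam::aws:policy/"
S3_BACKUP_POLICIES = (
    "AWSBackupServiceRolePolicyForS3Backup",
    "AWSBackupServiceRolePolicyForS3Restore",
)


def policy_arns(prefix: str = MANAGED_POLICY_PREFIX) -> list[str]:
    """Return the full ARNs for the S3 backup and restore managed policies."""
    return [prefix + name for name in S3_BACKUP_POLICIES]


def attach_s3_backup_policies(role_name: str) -> None:
    """Attach both policies to an existing IAM role.

    Requires boto3 and credentials that are allowed to call
    iam:AttachRolePolicy. The role name is supplied by the caller.
    """
    import boto3  # imported lazily so policy_arns() works without boto3

    iam = boto3.client("iam")
    for arn in policy_arns():
        iam.attach_role_policy(RoleName=role_name, PolicyArn=arn)
```

For example, `attach_s3_backup_policies("MyBackupRole")` would attach both policies to a hypothetical role named `MyBackupRole`.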

Backups and versioning

You must enable S3 Versioning on your S3 bucket to use AWS Backup for Amazon S3. This prerequisite reflects the AWS recommendation to use S3 Versioning as a best practice for data protection.

We recommend that you set a lifecycle expiration period for your S3 versions. Not setting up a lifecycle expiration period might increase your S3 costs because AWS Backup backs up and stores all unexpired versions of your S3 data. To learn more about setting up S3 lifecycle policies, follow the instructions on this page.
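The two prerequisites above (versioning enabled, plus an expiration period for old versions) can be sketched with boto3 as follows. This is an illustration, not a recommended configuration: the rule ID, the empty prefix filter, and the number of days are assumptions chosen for the example.

```python
def versioning_request(bucket: str) -> dict:
    """Build the arguments for s3.put_bucket_versioning that enable versioning."""
    return {"Bucket": bucket, "VersioningConfiguration": {"Status": "Enabled"}}


def lifecycle_request(bucket: str, noncurrent_days: int) -> dict:
    """Build the arguments for s3.put_bucket_lifecycle_configuration.

    The single rule expires noncurrent object versions after
    `noncurrent_days` days. The rule ID and the empty prefix
    (which matches every object) are illustrative choices.
    """
    if noncurrent_days < 1:
        raise ValueError("noncurrent_days must be at least 1")
    return {
        "Bucket": bucket,
        "LifecycleConfiguration": {
            "Rules": [
                {
                    "ID": "expire-noncurrent-versions",
                    "Status": "Enabled",
                    "Filter": {"Prefix": ""},
                    "NoncurrentVersionExpiration": {
                        "NoncurrentDays": noncurrent_days
                    },
                }
            ]
        },
    }


def prepare_bucket_for_backup(bucket: str, noncurrent_days: int) -> None:
    """Apply both settings. Requires boto3 and S3 permissions.

    Note: put_bucket_lifecycle_configuration replaces any lifecycle
    configuration that already exists on the bucket.
    """
    import boto3  # imported lazily so the builders work without boto3

    s3 = boto3.client("s3")
    s3.put_bucket_versioning(**versioning_request(bucket))
    s3.put_bucket_lifecycle_configuration(
        **lifecycle_request(bucket, noncurrent_days)
    )
```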

Considerations for Amazon S3 backups

Consider the following points when you back up S3 resources:

  • Focused object metadata support – AWS Backup supports the following metadata: tags, access control lists (ACLs), user-defined metadata, original creation date, and version ID. You can restore all backed-up data and metadata except original creation date, version ID, storage class, and ETags.

  • When you restore an S3 object, AWS Backup applies a checksum value, even if the original object did not use the checksum feature.

  • An S3 object key name can be made up of most UTF-8 encodable strings. The following Unicode characters are allowed: #x9 | #xA | #xD | #x20 to #xD7FF | #xE000 to #xFFFD | #x10000 to #x10FFFF.

    Object key names that include characters not in this list might be excluded from backups.

  • Cold storage transition – Use AWS Backup lifecycle management policy to define the timeline for backup expiration. Cold storage transition of S3 backups is not supported.

  • Backups of S3 buckets with many versions of the same object that were created at the same second are not supported.

  • For periodic backups, AWS Backup makes a best effort to track all changes to your object metadata. However, if you update a tag or ACL multiple times within 1 minute, AWS Backup might not capture all intermediate states.

  • AWS Backup does not offer support for backups of SSE-C-encrypted objects. AWS Backup also does not support backups of bucket configurations, including bucket policy, settings, name, or access point.

  • AWS Backup does not support backups of S3 on AWS Outposts.

  • CloudTrail logging – If you log data read events, you must deliver the CloudTrail logs to a different target bucket. If you save CloudTrail logs in the bucket that they log, the result is an infinite loop, which can cause unexpected charges.

    For more information, see Data events in the CloudTrail User Guide.

  • Server access logging – If you enable server access logging, you must have the logs delivered to a different target bucket. If you save these logs in the bucket that they log, there is an infinite loop. For more information, see Enabling Amazon S3 server access logging.
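The object key consideration in the list above lends itself to a quick pre-check. The following Python is a small sketch that flags characters outside the allowed Unicode ranges listed there; the function names are hypothetical, and the check only mirrors the ranges on this page rather than any AWS Backup API.

```python
# Inclusive code point ranges copied from the list of allowed characters:
# #x9 | #xA | #xD | #x20 to #xD7FF | #xE000 to #xFFFD | #x10000 to #x10FFFF
ALLOWED_RANGES = (
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
)


def is_allowed_char(ch: str) -> bool:
    """Return True if a single character falls in one of the allowed ranges."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in ALLOWED_RANGES)


def disallowed_chars(key: str) -> list[str]:
    """Return the characters in an object key that are outside the allowed ranges.

    An empty list means the key uses only allowed characters. A non-empty
    list means the object might be excluded from backups.
    """
    return [ch for ch in key if not is_allowed_char(ch)]
```

For example, `disallowed_chars("reports/2024/summary.csv")` returns an empty list, while a key that contains a NUL character (`"\x00"`) is flagged.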

Supported bucket types and quantities

AWS Backup supports backup and restore of general purpose S3 buckets. Directory buckets are not supported at this time.

The upper limit of a quantity of a resource (known as a quota), such as a bucket, allowed in an AWS account depends on the service. Amazon S3 quotas are different from AWS Backup quotas.

In each AWS account, you can create backups for up to 100 buckets by default. You are able to request a quota increase up to 1,000 buckets. Visit the Service Quotas console for more information.

Accounts with more than 1,000 buckets are subject to quota limits; requests that exceed the quota can result in failed jobs. It is a best practice to limit an account to 1,000 buckets.

Supported S3 Storage Classes

AWS Backup allows you to back up your S3 data stored in the following S3 Storage Classes:

  • S3 Standard

  • S3 Standard - Infrequent Access (IA)

  • S3 One Zone-IA

  • S3 Glacier Instant Retrieval

  • S3 Intelligent-Tiering (S3 INT)

Backups of an object in the storage class S3 Intelligent-Tiering (INT) access those objects. This access triggers S3 Intelligent-Tiering to automatically move those objects to Frequent Access.

Backups that access Infrequent Access tiers, including the S3 Standard - Infrequent Access (IA) and S3 One Zone-IA classes, move under the S3 storage charge of Frequent Access (this applies to the Infrequent Access and Archive Instant Access tiers).

With the exception of Glacier Instant Retrieval, archived storage classes are not supported.

For more information about storage pricing for Amazon S3, see Amazon S3 Pricing.

S3 backup types

With AWS Backup, you can create the following types of backups of your S3 buckets, including object data, tags, Access Control Lists (ACLs), and user-defined metadata:

  • Continuous backups allow you to restore to any point in time within the last 35 days. Continuous backups for an S3 bucket should only be configured in one backup plan.

    See Point-in-Time Recovery for a list of supported services and instructions on how to use AWS Backup to take continuous backups.

  • Periodic backups use snapshots of your data to allow you to retain data for your specified duration up to 99 years. You can schedule periodic backups in frequencies such as 1 hour, 12 hours, 1 day, 1 week, or 1 month. AWS Backup takes periodic backups during the backup window you define in your backup plan.

    See Creating a backup plan to understand how AWS Backup applies your backup plan to your resources.

Cross-account and cross-Region copies are available for S3 backups, but copies of continuous backups do not have point-in-time restore capabilities.

Continuous and periodic backups of S3 buckets must both reside in the same backup vault.

AWS Backup for S3 relies on receiving S3 events through Amazon EventBridge. If this setting is disabled in S3 bucket notification settings, continuous backups will stop for those buckets with the setting turned off. For more information, see Using EventBridge.

For both backup types, the first backup is a full backup, while subsequent backups are incremental at object-level.
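To make the rules above concrete, the following sketch builds a backup plan that contains one continuous rule and one periodic rule that target the same vault, as this section requires. The rule names, cron schedules, and default retention values are assumptions for illustration; the field names follow the boto3 `create_backup_plan` request shape.

```python
MAX_CONTINUOUS_RETENTION_DAYS = 35  # limit stated for continuous backups


def s3_backup_plan(
    plan_name: str,
    vault_name: str,
    continuous_retention_days: int = 35,
    snapshot_retention_days: int = 365,
) -> dict:
    """Build a BackupPlan document with a continuous and a periodic rule.

    Both rules use the same vault because continuous and periodic
    backups of an S3 bucket must reside in the same backup vault.
    Rule names and schedules are illustrative.
    """
    if not 1 <= continuous_retention_days <= MAX_CONTINUOUS_RETENTION_DAYS:
        raise ValueError("continuous backups can be retained for 1 to 35 days")
    return {
        "BackupPlanName": plan_name,
        "Rules": [
            {
                "RuleName": "s3-continuous",
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": "cron(0 5 ? * * *)",  # daily, 05:00 UTC
                "EnableContinuousBackup": True,
                "Lifecycle": {"DeleteAfterDays": continuous_retention_days},
            },
            {
                "RuleName": "s3-periodic",
                "TargetBackupVaultName": vault_name,
                "ScheduleExpression": "cron(0 5 ? * 1 *)",  # weekly, Sunday
                "Lifecycle": {"DeleteAfterDays": snapshot_retention_days},
            },
        ],
    }


def create_plan_and_enable_events(plan: dict, bucket: str) -> None:
    """Create the plan and turn on EventBridge delivery for the bucket.

    Requires boto3 and suitable permissions. Note that
    put_bucket_notification_configuration replaces the bucket's existing
    notification configuration, so merge any existing settings first.
    """
    import boto3  # imported lazily so s3_backup_plan() works without boto3

    boto3.client("backup").create_backup_plan(BackupPlan=plan)
    boto3.client("s3").put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration={"EventBridgeConfiguration": {}},
    )
```

The plan still needs a resource assignment (a backup selection) before AWS Backup applies it to a bucket; that step is omitted here.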

Compare S3 backup types

Your backup strategy for S3 resources can involve just continuous backups, just periodic (snapshot) backups, or a combination of both. The information below can help you choose what works best for your organization:

Continuous backups only:

  • After the first full backup of your existing data is complete, changes in your S3 bucket data are tracked as they occur.

  • The tracked changes allow you to use PITR (point-in-time restore) for the retention period of the continuous backup. To perform a restore job, you choose the point in time to which you wish to restore.

  • The retention period of each continuous backup has a maximum of 35 days.

Periodic (snapshot) backups only, scheduled or on-demand:

  • AWS Backup scans the entire S3 bucket, retrieves each object’s ACL and tags, and initiates a Head request for every object that was in the prior snapshot but was not found in the snapshot being created.

  • The backup is point-in-time consistent.

  • The backup date and time recorded is the time at which AWS Backup completes the traversal of the bucket, not the time at which the backup job was created.

  • The first backup of a bucket is a full backup. Each subsequent backup is incremental, representing the change in data since the last snapshot.

  • The snapshot made by the periodic backup can have a retention period of up to 99 years.

Continuous backups combined with periodic/snapshot backups:

  • After the first full backup of your existing data (each bucket) is complete, changes in your bucket are tracked as they occur.

  • You can perform a point-in-time restore from a continuous recovery point.

  • Snapshots are point-in-time consistent.

  • Snapshots are taken directly from the continuous recovery point, which eliminates the need to rescan the bucket and makes the process faster.

  • Snapshots and continuous recovery points share data lineage; storage of data between snapshot and continuous recovery points is not duplicated.

S3 backup completion windows

The table below shows sample buckets of various sizes to help you estimate the completion time of the initial full backup of an S3 bucket. Backup times vary with the size, content, configuration, and settings of each bucket.

Bucket size           Number of objects   Estimated time to complete initial backup
425 GB (gigabytes)    135 million         31 hours
800 TB (terabytes)    670 million         38 hours
6 PB (petabytes)      5 billion           100 hours
370 TB (terabytes)    7.5 billion         180 hours

Best practices and cost considerations for S3 backups

Best practices

For buckets with more than 300 million objects:

  • The backup rate can reach up to 17,000 objects per second during the initial full backup of the bucket (incremental backups run at a different speed). By comparison, buckets containing fewer than 300 million objects back up at a rate close to 1,000 objects per second.

  • Continuous backups are recommended.

  • If backup lifecycle is planned for more than 35 days, you can also enable snapshot backups for the bucket in the same vault in which your continuous backups are stored.
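The backup rates quoted above allow a rough back-of-the-envelope estimate of the initial full backup. The sketch below simply divides the object count by the quoted rate. It is an assumption-laden approximation: 17,000 objects per second is described as a rate the backup "can reach", so for large buckets the result is closer to a lower bound, and real times depend on bucket content and configuration. How a bucket with exactly 300 million objects behaves is not stated; this sketch treats it as the slower case.

```python
LARGE_BUCKET_THRESHOLD = 300_000_000  # objects
LARGE_BUCKET_RATE = 17_000            # objects per second (peak, large buckets)
SMALL_BUCKET_RATE = 1_000             # objects per second (approximate)


def initial_backup_hours(object_count: int) -> float:
    """Estimate hours for the initial full backup from the quoted rates."""
    if object_count < 0:
        raise ValueError("object_count must not be negative")
    if object_count > LARGE_BUCKET_THRESHOLD:
        rate = LARGE_BUCKET_RATE
    else:
        rate = SMALL_BUCKET_RATE
    return object_count / rate / 3600
```

For a bucket with 5 billion objects, this gives roughly 82 hours, compared with the 100 hours shown for the sample bucket of that size, which illustrates why the figure should be treated as optimistic.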

Cost considerations

  • S3 lifecycle policies have an optional feature called Delete expired object delete markers. When this feature is left off, delete markers, sometimes in the millions, expire with no cleanup plan. When buckets without this feature are backed up, two issues impact time and cost:

    • Delete markers are backed up, just like objects. Backup time and restore time can be impacted depending on the ratio of objects to delete markers.

    • Each object and marker that is backed up has a minimum charge. Each delete marker is charged the same as a 128 KiB object.

  • For accounts that make backups daily or more frequently, continuous backups can reduce costs if the data changes minimally between backups.

  • Larger buckets that do not change frequently can benefit from continuous backups, because costs are lower when scans of the whole bucket, along with multiple requests per object, don't need to be performed on pre-existing objects (objects that are unchanged from the previous backup).

  • Buckets that contain more than 100 million objects and that have a small delete rate compared to the overall backup size might realize cost benefits with a backup plan that contains both a continuous backup with a retention period of 2 days along with snapshots of a longer retention.

  • Periodic (snapshot) backup time aligns with the start of the backup process when a bucket scan is not needed. Scans are not needed in a bucket that contains both continuous backup and snapshots since in these cases snapshots are taken from a continuous recovery point.

  • For each object in the S3 Glacier Instant Retrieval (S3-GIR) storage class, AWS Backup performs multiple calls, which results in retrieval charges when a backup is conducted.

    Similar retrieval costs apply to buckets with objects in S3-IA and S3 One Zone-IA storage classes.

  • AWS KMS, CloudTrail, and Amazon CloudWatch features that are part of your backup strategy can result in additional costs beyond S3 bucket data storage. See the following for information on adjusting these features:

    • Reducing the cost of SSE-KMS with Amazon S3 Bucket keys in the Amazon S3 User Guide.

    • You can reduce CloudTrail costs by excluding AWS KMS events and by disabling S3 data events:

      • Exclude AWS KMS events: In the CloudTrail User Guide, Creating a trail in the console (basic event selectors) describes the option to exclude AWS KMS events, which filters these events out of your trail (the default setting includes all KMS events):

        • The option to log or exclude KMS events is available only if you log management events on your trail. If you choose not to log management events, KMS events are not logged, and you cannot change KMS event logging settings.

        • AWS KMS actions such as Encrypt, Decrypt, and GenerateDataKey typically generate a large volume (more than 99%) of events. These actions are now logged as Read events. Low-volume, relevant KMS actions such as Disable, Delete, and ScheduleKey (which typically account for less than 0.5% of KMS event volume) are logged as Write events.

        • To exclude high-volume events like Encrypt, Decrypt, and GenerateDataKey, but still log relevant events such as Disable, Delete, and ScheduleKey, choose to log Write management events, and clear the check box for Exclude AWS KMS events.

      • Disable S3 data events: By default, trails and event data stores do not log data events. Disable S3 data events before your initial backup to reduce costs.

    • To reduce CloudWatch costs, you can stop sending CloudTrail events to CloudWatch Logs by updating the trail to disable its CloudWatch Logs settings.
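The CloudTrail options described in the list above can also be set programmatically. The following sketch builds a basic event selector that either excludes all AWS KMS management events or logs only Write management events, and omits data resources so that no S3 data events are logged. It assumes the boto3 CloudTrail `put_event_selectors` call and the `kms.amazonaws.com` event source; the trail name is supplied by the caller.

```python
KMS_EVENT_SOURCE = "kms.amazonaws.com"


def management_event_selector(
    exclude_kms: bool = True, write_only: bool = False
) -> dict:
    """Build one basic event selector for a trail.

    exclude_kms=True filters every KMS management event out of the trail.
    write_only=True with exclude_kms=False keeps low-volume KMS Write
    events while dropping the high-volume Read events.
    No DataResources are listed, so data events are not logged.
    """
    selector = {
        "ReadWriteType": "WriteOnly" if write_only else "All",
        "IncludeManagementEvents": True,
    }
    if exclude_kms:
        selector["ExcludeManagementEventSources"] = [KMS_EVENT_SOURCE]
    return selector


def apply_selector(trail_name: str, selector: dict) -> None:
    """Replace the trail's event selectors with the given one.

    Requires boto3 and CloudTrail permissions. put_event_selectors
    overwrites the selectors that are currently set on the trail.
    """
    import boto3  # imported lazily so the builder works without boto3

    boto3.client("cloudtrail").put_event_selectors(
        TrailName=trail_name, EventSelectors=[selector]
    )
```

For example, `management_event_selector(exclude_kms=False, write_only=True)` corresponds to the last KMS option in the list: Write management events are logged and KMS events are not excluded.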

Restoring S3 backups

You can restore your S3 data that you backed up using AWS Backup to the S3 Standard Storage class. You can restore your S3 data to an existing bucket, including the original bucket. During restore, you can also create a new S3 bucket as the restore target. You can restore S3 backups only to the same AWS Region where your backup is located.

You can restore the entire S3 bucket, or folders or objects within the bucket. AWS Backup restores the current version of each object.

To restore your S3 data using AWS Backup, see Restore S3 data using AWS Backup.