Understanding how DataSync handles file and object metadata
AWS DataSync can preserve your file or object metadata during a data transfer. How your metadata gets copied depends on your transfer locations and whether those locations use similar types of metadata.
System-level metadata
In general, DataSync doesn't copy system-level metadata. For example, when transferring from an SMB file server, the permissions you configured at the file system level aren't copied to the destination storage system.
There are exceptions. When transferring between Amazon S3 and other object storage, DataSync does copy some system-defined object metadata.
Metadata copied in Amazon S3 transfers
The following tables describe what metadata DataSync can copy when a transfer involves an Amazon S3 location.
To Amazon S3
When copying from one of these locations | To this location | DataSync can copy |
---|---|---|
|
|
The following as Amazon S3 user metadata:
The file metadata stored in Amazon S3 user metadata is interoperable with NFS shares on file gateways using AWS Storage Gateway. A file gateway enables low-latency access from on-premises networks to data that was copied to Amazon S3 by DataSync. This metadata is also interoperable with FSx for Lustre. When DataSync copies objects that contain this metadata back to an NFS server, the file metadata is restored. Restoring metadata requires granting elevated permissions to the NFS server. For more information, see Configuring AWS DataSync transfers with an NFS file server. |
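As a minimal sketch of how you might inspect the user metadata on an object that DataSync copied to Amazon S3, the following Python helper reads the `Metadata` map returned by the S3 `HeadObject` operation. The helper name `get_user_metadata`, the bucket name, and the object key are hypothetical. The client is assumed to be a boto3 S3 client (or any object that exposes a compatible `head_object` method). The sketch doesn't assume any particular metadata key names.

```python
from typing import Any, Dict


def get_user_metadata(s3_client: Any, bucket: str, key: str) -> Dict[str, str]:
    """Return the user-defined metadata stored on a single S3 object."""
    # HeadObject returns the object's headers without downloading its body.
    response = s3_client.head_object(Bucket=bucket, Key=key)
    # User-defined metadata is returned under the "Metadata" key.
    return dict(response.get("Metadata", {}))


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   metadata = get_user_metadata(
#       boto3.client("s3"), "amzn-s3-demo-bucket", "path/to/object"
#   )
#   for name, value in sorted(metadata.items()):
#       print(f"{name} = {value}")
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, so you can try it without touching a real bucket.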
Between Amazon S3 and other object storage
When copying between these locations | DataSync can copy |
---|---|
|
DataSync doesn't copy other object metadata, such as object access control lists (ACLs), prior object versions, or the Last-Modified key. |
|
Between Amazon S3 and HDFS
When copying between these locations | DataSync can copy |
---|---|
|
The following as Amazon S3 user metadata:
|
Metadata copied in NFS transfers
The following table describes what metadata DataSync can copy between locations that use Network File System (NFS).
When copying between these locations | DataSync can copy |
---|---|
|
|
Metadata copied in SMB transfers
The following table describes what metadata DataSync can copy between locations that use Server Message Block (SMB).
When copying between these locations | DataSync can copy |
---|---|
|
|
Metadata copied in other transfer scenarios
DataSync handles metadata in the following ways when copying between these storage systems (most of which have different metadata structures).
When copying from one of these locations | To one of these locations | DataSync can copy |
---|---|---|
| | Default POSIX metadata for all files and folders on the destination file system or objects in the destination S3 bucket. This approach includes using the default POSIX user ID and group ID values. Windows-based metadata (such as ACLs) is not preserved. |
| | Default POSIX metadata on the destination files and folders. This approach includes using the default POSIX user ID and group ID values. |
|
|
The following as user-defined metadata:
|
|
|
HDFS stores file and folder user and group ownership as strings rather than numeric identifiers (such as UIDs and GIDs). Default values for UIDs and GIDs are applied on the destination file system. For more information, see Understanding when and how DataSync applies default POSIX metadata. |
| | File and folder timestamps from the source location. The file or folder owner is set based on the HDFS user or Kerberos principal you specified when creating the HDFS transfer location. The Groups Mapping configuration on the Hadoop cluster determines the group. |
| | File and folder timestamps from the source location. Ownership is set based on the Windows user that was specified in DataSync to access the Amazon FSx or SMB share. Permissions are inherited from the parent directory. |
|
|
Understanding when and how DataSync applies default POSIX metadata
DataSync applies default POSIX metadata in the following situations:
- When your transfer's source and destination locations don't have similar metadata structures
- When metadata is missing from the source location
The following table describes how DataSync applies default POSIX metadata during these types of transfers:
Source | Destination | File permissions | Folder permissions | UID | GID |
---|---|---|---|---|---|
| | 0755 | 0755 | 65534 | 65534 |
| | 0644 | 0755 | 65534 | 65534 |
| | 0644 | 0755 | 65534 | 65534 |
1 In cases where the objects don't have metadata that was previously applied by DataSync.
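To make the default values in the preceding table easier to read, the following sketch decodes the octal permission modes into the familiar `ls -l` notation by using Python's standard `stat` module. The constant and function names are illustrative only and aren't part of DataSync. On many Linux systems the ID 65534 maps to the `nobody` user, though that mapping depends on the destination system.

```python
import stat

# Default IDs as listed in the table above.
DEFAULT_UID = 65534
DEFAULT_GID = 65534


def describe_mode(mode: int, is_dir: bool = False) -> str:
    """Render a POSIX permission mode the way `ls -l` displays it."""
    # Combine the file-type bits with the permission bits before formatting.
    file_type = stat.S_IFDIR if is_dir else stat.S_IFREG
    return stat.filemode(file_type | mode)


print(describe_mode(0o644))               # -rw-r--r--
print(describe_mode(0o755))               # -rwxr-xr-x
print(describe_mode(0o755, is_dir=True))  # drwxr-xr-x
```

In other words, 0644 gives the owner read and write access and everyone else read-only access, while 0755 additionally lets every user traverse a folder (or execute a file).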