

# Managing data deduplication
<a name="managing-data-dedup"></a>

You can manage your file system's [data deduplication settings](managing-storage-configuration.md#using-data-dedup) using the Amazon FSx CLI for remote management on PowerShell. For more information about using the Amazon FSx CLI for remote management on PowerShell, see [Using the Amazon FSx CLI for PowerShell](administering-file-systems.md#remote-pwrshell). 

Following are commands that you can use for data deduplication. 


| Data deduplication command | Description | 
| --- | --- | 
| **[Enable-FSxDedup](#enable-dedup)** | Enables data deduplication on the file share. Data compression after deduplication is enabled by default when you enable data deduplication. | 
| **Disable-FSxDedup** | Disables data deduplication on the file share. | 
| **Get-FSxDedupConfiguration** | Retrieves deduplication configuration information, including minimum file size and age for optimization, compression settings, and excluded file types and folders. | 
| **Set-FSxDedupConfiguration** | Changes the deduplication configuration settings, including minimum file size and age for optimization, compression settings, and excluded file types and folders. | 
| **[Get-FSxDedupStatus](#get-dedup-status)** | Retrieves the deduplication status, including read-only properties that describe optimization savings and status on the file system, as well as times and completion status for the last deduplication jobs on the file system. | 
| **Get-FSxDedupMetadata** | Retrieves deduplication optimization metadata. | 
| **Update-FSxDedupStatus** | Computes and retrieves updated data deduplication savings information. | 
| **Measure-FSxDedupFileMetadata** | Measures and retrieves the potential storage space that you can reclaim on your file system if you delete a group of folders. Files often have chunks that are shared across other folders, and the deduplication engine calculates which chunks are unique and would be deleted. | 
| **Get-FSxDedupSchedule** | Retrieves deduplication schedules that are currently defined. | 
| **[New-FSxDedupSchedule](#new-dedup-sched)** | Creates and customizes a data deduplication schedule. | 
| **[Set-FSxDedupSchedule](#set-dedup-sched)** | Changes configuration settings for existing data deduplication schedules. | 
| **Remove-FSxDedupSchedule** | Deletes a deduplication schedule. | 
| **Get-FSxDedupJob** | Retrieves status and information for all currently running or queued deduplication jobs. | 
| **Stop-FSxDedupJob** | Cancels one or more specified data deduplication jobs. | 

The online help for each command provides a reference of all command options. To access this help, run the command with **-?**, for example **Enable-FSxDedup -?**. 

## Enabling data deduplication
<a name="enable-dedup"></a>

You enable data deduplication on an Amazon FSx for Windows File Server file share using the `Enable-FSxDedup` command, as follows.

```
PS C:\Users\Admin> Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {Enable-FSxDedup}
```

When you enable data deduplication, a default schedule and configuration are created. You can create, modify, and remove schedules and configurations using the commands below.
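As a minimal sketch, you can inspect the default schedule and configuration with the `Get-FSxDedupSchedule` and `Get-FSxDedupConfiguration` commands listed in the table above. The example reuses the placeholder endpoint from this page and sends each command in its own `Invoke-Command` call. No output is shown because the returned values depend on your file system.

```powershell
# Placeholder endpoint taken from the other examples on this page;
# replace it with your file system's remote PowerShell endpoint.
$endpoint = "amznfsxzzzzzzzz.corp.example.com"

# List the deduplication schedules that are currently defined.
Invoke-Command -ComputerName $endpoint -ConfigurationName FSxRemoteAdmin -ScriptBlock { Get-FSxDedupSchedule }

# Show the current deduplication configuration settings.
Invoke-Command -ComputerName $endpoint -ConfigurationName FSxRemoteAdmin -ScriptBlock { Get-FSxDedupConfiguration }
```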

You can use the `Disable-FSxDedup` command to disable data deduplication entirely on your file system.
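For example, the following sketch disables data deduplication by reusing the same remote invocation pattern as `Enable-FSxDedup`. It assumes that `Disable-FSxDedup` needs no additional parameters; run `Disable-FSxDedup -?` to check the options available to you.

```powershell
# Placeholder endpoint taken from the other examples on this page.
# Assumption: Disable-FSxDedup is called without parameters.
Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock { Disable-FSxDedup }
```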

## Creating a data deduplication schedule
<a name="new-dedup-sched"></a>

Although the default schedule works well in most cases, you can create a new deduplication schedule by using the `New-FSxDedupSchedule` command, shown as follows. Data deduplication schedules use UTC time.

```
PS C:\Users\Admin> Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {   
New-FSxDedupSchedule -Name "CustomOptimization" -Type Optimization -Days Mon,Wed,Sat -Start 08:00 -DurationHours 7
}
```

This command creates a schedule named `CustomOptimization` that runs on Monday, Wednesday, and Saturday, starting the job at 8:00 AM (UTC) on each of those days. The job has a maximum duration of 7 hours, after which it stops if it is still running.
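Because schedules use UTC, a local maintenance window may map to a different start time, and sometimes a different day, than you expect. The following sketch converts a hypothetical local window (Monday 22:00 at a fixed UTC-08:00 offset, chosen only for illustration) into the UTC day and time you would use for the schedule. It runs locally and does not contact the file system.

```powershell
# Hypothetical local window: Monday, 2024-01-15 at 22:00, offset UTC-08:00.
$local = [DateTimeOffset]::new(2024, 1, 15, 22, 0, 0, [TimeSpan]::FromHours(-8))

# Convert to UTC and extract the weekday and the HH:mm start time.
$utc   = $local.UtcDateTime
$day   = $utc.DayOfWeek
$start = $utc.ToString('HH:mm', [cultureinfo]::InvariantCulture)

"{0} {1}" -f $day, $start   # Tuesday 06:00
```

In this illustration, a Monday evening window corresponds to a Tuesday schedule entry starting at `06:00` UTC.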

Note that creating a custom deduplication schedule does not override or remove the existing default schedule. Before creating a custom schedule, you may want to disable the default schedule if you don't need it.

You can disable the default deduplication schedule by using the `Set-FSxDedupSchedule` command, shown as follows.

```
PS C:\Users\Admin> Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {Set-FSxDedupSchedule -Name "BackgroundOptimization" -Enabled $false}
```

You can remove a deduplication schedule by using the `Remove-FSxDedupSchedule -Name "ScheduleName"` command. Note that the default `BackgroundOptimization` deduplication schedule cannot be modified or removed; disable it instead.
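As a sketch, the following removes the `CustomOptimization` schedule created earlier in this section, using the same remote invocation pattern as the other examples. The schedule name is simply the one used on this page; substitute your own.

```powershell
# Placeholder endpoint taken from the other examples on this page.
# Removes the custom schedule named "CustomOptimization".
Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {
    Remove-FSxDedupSchedule -Name "CustomOptimization"
}
```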

## Modifying a data deduplication schedule
<a name="set-dedup-sched"></a>

You can modify an existing deduplication schedule by using the `Set-FSxDedupSchedule` command, shown as follows.

```
PS C:\Users\Admin> Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {   
Set-FSxDedupSchedule -Name "CustomOptimization" -Type Optimization -Days Mon,Tues,Wed,Sat -Start 09:00 -DurationHours 9
}
```

This command modifies the existing `CustomOptimization` schedule to run on Monday, Tuesday, Wednesday, and Saturday, starting the job at 9:00 AM (UTC) on each of those days. The job has a maximum duration of 9 hours, after which it stops if it is still running.

To change the minimum age that a file must reach before it is optimized, use the `Set-FSxDedupConfiguration` command.
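The following is a sketch only. It assumes that `Set-FSxDedupConfiguration` accepts a `-MinimumFileAgeDays` parameter, mirroring the Windows Server deduplication setting of the same name. That parameter name does not appear elsewhere on this page, so confirm it with `Set-FSxDedupConfiguration -?` before relying on it. The value `5` is an arbitrary illustration.

```powershell
# Placeholder endpoint taken from the other examples on this page.
# Assumption: -MinimumFileAgeDays is the parameter that controls minimum file age.
# The value 5 is illustrative, not a recommendation.
Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock {
    Set-FSxDedupConfiguration -MinimumFileAgeDays 5
}
```

You can then run `Get-FSxDedupConfiguration` to check the resulting settings.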

## Viewing the amount of saved space
<a name="get-dedup-status"></a>

To view the amount of disk space you are saving from running data deduplication, use the `Get-FSxDedupStatus` command, as follows.

```
PS C:\Users\Admin> Invoke-Command -ComputerName amznfsxzzzzzzzz.corp.example.com -ConfigurationName FSxRemoteAdmin -ScriptBlock { 
Get-FSxDedupStatus } | select OptimizedFilesCount,OptimizedFilesSize,SavedSpace,OptimizedFilesSavingsRate

OptimizedFilesCount OptimizedFilesSize SavedSpace OptimizedFilesSavingsRate
------------------- ------------------ ---------- -------------------------
              12587           31163594   25944826                        83
```

**Note**  
The values shown in the command response for the following parameters are not reliable, and you should not use these values: Capacity, FreeSpace, UsedSpace, UnoptimizedSize, and SavingsRate.
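To see how the fields in the sample output relate to one another, the following local sketch recomputes the savings rate from the sample `SavedSpace` and `OptimizedFilesSize` values. The formula is an assumption inferred from the sample numbers, not a documented definition of `OptimizedFilesSavingsRate`.

```powershell
# Values copied from the sample Get-FSxDedupStatus output on this page.
$optimizedFilesSize = 31163594
$savedSpace         = 25944826

# Assumption: the rate is saved space divided by optimized file size,
# expressed as a whole-number percentage.
$rate = [math]::Round($savedSpace / $optimizedFilesSize * 100)

$rate   # 83
```

The result matches the `OptimizedFilesSavingsRate` value of 83 in the sample output.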