

# Monitoring with Amazon CloudWatch
<a name="monitoring-cloudwatch"></a>

You can monitor Amazon FSx for Lustre using CloudWatch, which collects and processes raw data from Amazon FSx for Lustre into readable, near real-time metrics. These statistics are retained for a period of 15 months, so that you can access historical information and gain a better perspective on how your application or service is performing. For more information about CloudWatch, see [What is Amazon CloudWatch?](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html) in the *Amazon CloudWatch User Guide*.

CloudWatch metrics for FSx for Lustre are organized into six categories:
+ **Network I/O metrics** – Measure activity between clients and your file system.
+ **Object storage server metrics** – Measure object storage server (OSS) network throughput and disk throughput utilization.
+ **Object storage target metrics** – Measure object storage target (OST) disk throughput and disk IOPS utilization.
+ **Metadata metrics** – Measure metadata server (MDS) CPU utilization, metadata target (MDT) IOPS utilization, and client metadata operations.
+ **Storage capacity metrics** – Measure storage capacity utilization.
+ **S3 data repository metrics** – Measure age of oldest message waiting to be imported or exported, and renames processed by the file system.

The following diagram illustrates an FSx for Lustre file system, its components, and its metric categories.

![\[FSx for Lustre reports metrics in CloudWatch.\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/metrics-overview.png)


FSx for Lustre sends metric data to CloudWatch at 1-minute intervals.

**Note**  
Metrics might not be published during file system maintenance windows for your Amazon FSx for Lustre file system.

**Topics**
+ [How to use Amazon FSx for Lustre CloudWatch metrics](how_to_use_metrics.md)
+ [Accessing CloudWatch metrics](accessingmetrics.md)
+ [Amazon FSx for Lustre metrics and dimensions](fs-metrics.md)
+ [Performance warnings and recommendations](performance-insights.md)
+ [Creating CloudWatch alarms to monitor metrics](creating_alarms.md)

# How to use Amazon FSx for Lustre CloudWatch metrics
<a name="how_to_use_metrics"></a>

There are two primary architectural components of each Amazon FSx for Lustre file system:
+ One or more **object storage servers (OSSs)** that serve data to clients that access the file system. Each OSS is attached to one or more storage volumes, known as **object storage targets (OSTs)**, that host the data in your file system.
+ One or more **metadata servers (MDSs)** that serve metadata to clients that access the file system. Each MDS is attached to a storage volume, known as a **metadata target (MDT)**, that stores metadata such as filenames, directories, access permissions, and file layouts.

FSx for Lustre reports metrics in CloudWatch that track performance and resource utilization for your file system's storage and metadata servers, and their associated storage volumes. The following diagram illustrates an Amazon FSx for Lustre file system with its architectural components, and the performance and resource CloudWatch metrics that are available for monitoring.

![\[Diagram displaying the different types of FSx for Lustre CloudWatch metrics.\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/images/file-server-metrics.png)


You can use the **Monitoring & performance** panel on your file system's dashboard in the Amazon FSx for Lustre console to view the metrics that are described in the following tables. For more information, see [Accessing CloudWatch metrics](accessingmetrics.md).


**File system activity (in Summary tab)**  

| How do I... | Chart | Relevant metrics | 
| --- | --- | --- | 
| ...determine the amount of available storage capacity on my file system? | Available storage capacity (bytes) | `FreeDataStorageCapacity` | 
| ...determine my file system's total client throughput? | Total client throughput (bytes/sec) | SUM(`DataReadBytes` + `DataWriteBytes`)/PERIOD (in seconds) | 
| …determine my file system’s total client IOPS? | Total client IOPS (operations/sec) | SUM(`DataReadOperations` + `DataWriteOperations` + `MetadataOperations`)/PERIOD (in seconds) | 
| ...determine the number of connections that are established between clients and my file system? | Client connections (count) | `ClientConnections` | 
| …determine my file system’s metadata performance utilization? | Metadata IOPS utilization (percent) | MAX(MDT Disk IOPS) | 
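
The throughput and IOPS expressions in the preceding table reduce to simple arithmetic on the `Sum` statistic. The following Python sketch assumes that you have already retrieved the `Sum` values for one period. The function names and sample numbers are illustrative and aren't part of any AWS SDK.

```python
def total_client_throughput(read_bytes_sum, write_bytes_sum, period_seconds):
    """Return bytes/sec: SUM(DataReadBytes + DataWriteBytes) / PERIOD."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_bytes_sum + write_bytes_sum) / period_seconds


def total_client_iops(read_ops_sum, write_ops_sum, metadata_ops_sum, period_seconds):
    """Return operations/sec across data reads, data writes, and metadata."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops_sum + write_ops_sum + metadata_ops_sum) / period_seconds


# Hypothetical Sum values for a single 60-second period.
throughput = total_client_throughput(6_000_000_000, 3_000_000_000, 60)
iops = total_client_iops(30_000, 12_000, 6_000, 60)
print(f"{throughput:.0f} bytes/sec, {iops:.0f} operations/sec")
```

Pass the same period length, in seconds, that you requested from CloudWatch. Dividing by a different period produces a misleading rate.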


**Storage tab**  

| How do I... | Chart | Relevant metrics | 
| --- | --- | --- | 
| ...determine how much storage is available? | Available storage capacity (bytes) | `FreeDataStorageCapacity` | 
| …determine the percentage of used storage for my file system, excluding space reserved for cached writes on clients? | Total storage capacity utilization (percent) | `StorageCapacityUtilization` | 
| …determine the percentage of used storage for my file system, including space reserved for cached writes on clients? | Total storage capacity utilization (percent) | `StorageCapacityUtilizationWithCachedWrites` | 
| …determine the percentage of used storage for my file system’s OSTs, excluding space reserved for cached writes on clients? | Total storage capacity utilization per OST (percent) | `StorageCapacityUtilization` | 
| …determine the percentage of used storage for my file system’s OSTs, including space reserved for cached writes on clients? | Total storage capacity utilization per OST with client grants (percent) | `StorageCapacityUtilizationWithCachedWrites` | 
| …determine my file system’s data compression ratio? | Compression savings | 100\*(`LogicalDiskUsage` - `PhysicalDiskUsage`)/`LogicalDiskUsage` | 
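
Compression savings is the share of logical data that compression removed, expressed as a percentage of `LogicalDiskUsage`. The following Python sketch shows the calculation. The function name, the zero-usage guard, and the sample values are assumptions made for illustration.

```python
def compression_savings_percent(logical_bytes, physical_bytes):
    """Return 100 * (logical - physical) / logical, or 0.0 for an empty file system."""
    if logical_bytes <= 0:
        # Assumption: report no savings when nothing is stored, instead of
        # dividing by zero.
        return 0.0
    return 100 * (logical_bytes - physical_bytes) / logical_bytes


# Hypothetical values: 1000 logical bytes stored in 600 physical bytes.
savings = compression_savings_percent(1000, 600)
print(f"Compression savings: {savings:.1f}%")
```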


**Object storage performance (in Performance tab)**  

| How do I... | Chart | Relevant metrics | 
| --- | --- | --- | 
| …determine the network throughput between the clients and the OSSs as a percentage of the provisioned limit? | Network throughput (percent) | `NetworkThroughputUtilization` | 
| …determine the disk throughput between the OSS and its OSTs as a percentage of the provisioned limit? | Disk throughput (percent) | `FileServerDiskThroughputUtilization` | 
| …determine the IOPS for operations that access OSTs as a percentage of the provisioned limit? | Disk IOPS (percent) | `DiskIopsUtilization` | 


**Metadata performance (in Performance tab)**  

| How do I... | Chart | Relevant metrics | 
| --- | --- | --- | 
| …determine the metadata server's CPU utilization percentage? | CPU utilization (percent) | `CPUUtilization` | 
| …determine the metadata IOPS utilization as a percentage of the provisioned limit? | Metadata IOPS utilization | MAX(MDT Disk IOPS) | 

# Accessing CloudWatch metrics
<a name="accessingmetrics"></a>

You can access Amazon FSx for Lustre metrics for CloudWatch in the following ways: 
+ The Amazon FSx for Lustre console.
+ The CloudWatch console.
+ The CloudWatch command line interface (CLI).
+ The CloudWatch API.

The following procedures show you how to access the metrics using these tools.

## Using the Amazon FSx for Lustre console
<a name="cwmetrics-fsx-console"></a>

**To view metrics using the Amazon FSx for Lustre console**

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the navigation pane, choose **File systems**, then choose the file system that has the metrics that you want to view.

1. On the **Summary** page, choose **Monitoring & performance** to see the metrics for your file system.

   There are four tabs on the **Monitoring & performance** panel.
   + Choose **Summary** (the default tab) to display any active warnings, CloudWatch alarms, and graphs for **File system activity**.
   + Choose **Storage** to view storage capacity, utilization metrics, and active warnings.
   + Choose **Performance** to view file server and storage performance metrics, and active warnings.
   + Choose **CloudWatch alarms** to view graphs of any alarms configured for your file system.

## Using the CloudWatch console
<a name="cwmetrics-cw-console"></a>

**To view metrics using the CloudWatch console**

1. Open the [CloudWatch console](https://console.aws.amazon.com/cloudwatch).

1. In the navigation pane, choose **Metrics**. 

1. Select the **FSx** namespace.

1. (Optional) To view a metric, enter its name in the search field.

1. (Optional) To explore metrics, select the category that best matches your question.

## Using the AWS CLI
<a name="cw-metrics-cli"></a>

**To access metrics from the AWS CLI**
+ Use the [`list-metrics`](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/list-metrics.html) command with the `--namespace "AWS/FSx"` option. For more information, see the [AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/).
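
The same listing can be done programmatically. The following Python sketch assumes that the `boto3` SDK is installed and that credentials are configured. The function name `list_fsx_metric_names` is hypothetical. It pages through the CloudWatch `ListMetrics` results for the `AWS/FSx` namespace and collects the metric names reported for one file system.

```python
def list_fsx_metric_names(file_system_id, client=None):
    """Return the sorted metric names that CloudWatch lists for one file system."""
    if client is None:
        # Assumption: boto3 is installed and AWS credentials are configured.
        import boto3
        client = boto3.client("cloudwatch")

    paginator = client.get_paginator("list_metrics")
    names = set()
    for page in paginator.paginate(
        Namespace="AWS/FSx",
        Dimensions=[{"Name": "FileSystemId", "Value": file_system_id}],
    ):
        for metric in page["Metrics"]:
            names.add(metric["MetricName"])
    return sorted(names)
```

For example, `list_fsx_metric_names("fs-01234567890123456")` would return the names of the metrics published for that placeholder file system ID. The optional `client` parameter exists only so that a stub can be passed in for testing.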

## Using the CloudWatch API
<a name="cw-metrics-cw-api"></a>

**To access metrics from the CloudWatch API**
+ Call [`GetMetricStatistics`](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_GetMetricStatistics.html). For more information, see [Amazon CloudWatch API Reference](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/). 
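
As a rough illustration, the following Python sketch builds the parameters for a `GetMetricStatistics` request for the `DataReadBytes` metric and converts the returned `Sum` datapoints into bytes per second. The helper names, the one-hour window, and the file system ID are assumptions. The commented lines show how the request might be sent with `boto3`, if it is installed.

```python
from datetime import datetime, timedelta, timezone


def build_read_bytes_request(file_system_id, hours=1, period_seconds=60, now=None):
    """Build GetMetricStatistics parameters for DataReadBytes (Sum statistic)."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/FSx",
        "MetricName": "DataReadBytes",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Sum"],
    }


def read_throughput(datapoints, period_seconds=60):
    """Convert datapoints into time-ordered (timestamp, bytes/sec) pairs."""
    ordered = sorted(datapoints, key=lambda point: point["Timestamp"])
    return [(point["Timestamp"], point["Sum"] / period_seconds) for point in ordered]


# Assumption: boto3 is installed and credentials are configured.
#   import boto3
#   request = build_read_bytes_request("fs-01234567890123456")
#   response = boto3.client("cloudwatch").get_metric_statistics(**request)
#   rates = read_throughput(response["Datapoints"])
```

CloudWatch doesn't guarantee the order of returned datapoints, which is why the sketch sorts them by timestamp before computing rates.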

# Amazon FSx for Lustre metrics and dimensions
<a name="fs-metrics"></a>

Amazon FSx for Lustre publishes the metrics described in the following tables in the `AWS/FSx` namespace in Amazon CloudWatch for all FSx for Lustre file systems.

**Topics**
+ [FSx for Lustre network I/O metrics](#fsx-networkio-metrics)
+ [FSx for Lustre object storage server metrics](#fsx-oss-metrics)
+ [FSx for Lustre object storage target metrics](#fsx-ost-metrics)
+ [FSx for Lustre metadata metrics](#fs-metadata-metrics)
+ [FSx for Lustre storage capacity metrics](#fsx-storage-capacity-metrics)
+ [FSx for Lustre S3 repository metrics](#auto-import-export-metrics)
+ [FSx for Lustre dimensions](#fsx-dimensions)

## FSx for Lustre network I/O metrics
<a name="fsx-networkio-metrics"></a>

The `AWS/FSx` namespace includes the following network I/O metrics. All of these metrics take one dimension, `FileSystemId`.


| Metric | Description | 
| --- | --- | 
| DataReadBytes |  The number of bytes from reads by clients to the file system. The `Sum` statistic is the total number of bytes associated with read operations during the specified period. The `Minimum` statistic is the minimum number of bytes associated with read operations on a single OST. The `Maximum` statistic is the maximum number of bytes associated with read operations on the OST. The `Average` statistic is the average number of bytes associated with read operations per OST. The `SampleCount` statistic is the number of OSTs. To calculate the average throughput (bytes per second) for a period, divide the `Sum` statistic by the number of seconds in the period. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| DataWriteBytes |  The number of bytes from writes by clients to the file system. The `Sum` statistic is the total number of bytes associated with write operations. The `Minimum` statistic is the minimum number of bytes associated with write operations on a single OST. The `Maximum` statistic is the maximum number of bytes associated with write operations on the OST. The `Average` statistic is the average number of bytes associated with write operations per OST. The `SampleCount` statistic is the number of OSTs. To calculate the average throughput (bytes per second) for a period, divide the `Sum` statistic by the number of seconds in the period. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| DataReadOperations |  The number of read operations. The `Sum` statistic is the total number of read operations. The `Minimum` statistic is the minimum number of read operations on a single OST. The `Maximum` statistic is the maximum number of read operations on the OST. The `Average` statistic is the average number of read operations per OST. The `SampleCount` statistic is the number of OSTs. To calculate the average number of read operations (operations per second) for a period, divide the `Sum` statistic by the number of seconds in the period. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| DataWriteOperations |  The number of write operations. The `Sum` statistic is the total number of write operations. The `Minimum` statistic is the minimum number of write operations on a single OST. The `Maximum` statistic is the maximum number of write operations on the OST. The `Average` statistic is the average number of write operations per OST. The `SampleCount` statistic is the number of OSTs. To calculate the average number of write operations (operations per second) for a period, divide the `Sum` statistic by the number of seconds in the period. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| MetadataOperations |  The number of metadata operations. The `Sum` statistic is the count of metadata operations. The `Minimum` statistic is the minimum number of metadata operations per MDT. The `Maximum` statistic is the maximum number of metadata operations per MDT. The `Average` statistic is the average number of metadata operations per MDT. The `SampleCount` statistic is the number of MDTs. To calculate the average number of metadata operations (operations per second) for a period, divide the `Sum` statistic by the number of seconds in the period. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| ClientConnections | The number of active connections between clients and the file system. Unit: Count | 

## FSx for Lustre object storage server metrics
<a name="fsx-oss-metrics"></a>

The `AWS/FSx` namespace includes the following object storage server (OSS) metrics. All of these metrics take two dimensions, `FileSystemId` and `FileServer`.
+ `FileSystemId` – Your file system's AWS resource ID.
+ `FileServer` – The name of the object storage server (OSS) in your Lustre file system. Each OSS is provisioned with one or more object storage targets (OSTs). OSSs use the naming convention `OSS<HostIndex>`, where *HostIndex* represents a 4-digit hexadecimal value (for example, `OSS0001`). An OSS's ID is the ID of the first OST attached to it. For example, the first OSS, attached to `OST0000` and `OST0001`, uses `OSS0000`, and the second OSS, attached to `OST0002` and `OST0003`, uses `OSS0002`.


| Metric | Description | 
| --- | --- | 
| NetworkThroughputUtilization | Network throughput utilization as a percentage of available network throughput for your file system. This metric is equivalent to the sum of `NetworkSentBytes` and `NetworkReceivedBytes` as a percentage of the network throughput capacity of one OSS for your file system. There is one metric emitted each minute for each of your file system's OSSs. The `Average` statistic is the average network throughput utilization for the given OSS over the specified period. The `Minimum` statistic is the lowest network throughput utilization for the given OSS over one minute, for the specified period. The `Maximum` statistic is the highest network throughput utilization for the given OSS over one minute, for the specified period. Unit: Percent Valid statistics: `Average`, `Minimum`, `Maximum` | 
| NetworkSentBytes | The number of bytes sent by the file system. All traffic is considered in this metric, including data movement to and from linked data repositories. There is one metric emitted each minute for each of your file system's OSSs. The `Sum` statistic is the total number of bytes sent over the network by the given OSS over the specified period. The `Average` statistic is the average number of bytes sent over the network by the given OSS over the specified period. The `Minimum` statistic is the lowest number of bytes sent over the network by the given OSS over the specified period. The `Maximum` statistic is the highest number of bytes sent over the network by the given OSS over the specified period. To calculate sent throughput (bytes per second) for any statistic, divide the statistic by the seconds in the specified period. Unit: Bytes Valid statistics: `Sum`, `Average`, `Minimum`, `Maximum` | 
| NetworkReceivedBytes | The number of bytes received by the file system. All traffic is considered in this metric, including data movement to and from linked data repositories. There is one metric emitted each minute for each of your file system's OSSs. The `Sum` statistic is the total number of bytes received over the network by the given OSS over the specified period. The `Average` statistic is the average number of bytes received over the network by the given OSS over the specified period. The `Minimum` statistic is the lowest number of bytes received over the network by the given OSS over the specified period. The `Maximum` statistic is the highest number of bytes received over the network by the given OSS over the specified period. To calculate throughput (bytes per second) for any statistic, divide the statistic by the seconds in the specified period. Unit: Bytes Valid statistics: `Sum`, `Average`, `Minimum`, `Maximum` | 
| FileServerDiskThroughputUtilization |  The disk throughput between your OSS and associated OSTs, as a percentage of the provisioned limit determined by throughput capacity. This metric is equivalent to the sum of `DiskReadBytes` and `DiskWriteBytes` as a percentage of the OSS's disk throughput capacity for your file system. There is one metric emitted each minute for each of your file system's OSSs. The `Average` statistic is the average OSS disk throughput utilization for the given OSS over the specified period. The `Minimum` statistic is the lowest OSS disk throughput utilization for the given OSS over the specified period. The `Maximum` statistic is the highest OSS disk throughput utilization for the given OSS over the specified period. Unit: Percent Valid statistics: `Average`, `Minimum`, `Maximum` | 
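
The `FileServer` naming convention described at the start of this section can be sketched in Python as follows. The sketch assumes that OSTs are assigned to OSSs in contiguous, equally sized groups and that the hexadecimal digits are lowercase. Neither detail is confirmed here, and the function name is hypothetical.

```python
def oss_names(ost_count, osts_per_oss):
    """Return OSS names, each derived from the index of its first attached OST."""
    if ost_count <= 0 or osts_per_oss <= 0:
        raise ValueError("ost_count and osts_per_oss must be positive")
    # Assumption: OST indexes are grouped contiguously, osts_per_oss at a time.
    return [f"OSS{first_ost:04x}" for first_ost in range(0, ost_count, osts_per_oss)]


# Four OSTs with two OSTs per OSS, matching the example in this section.
print(oss_names(4, 2))
```

You can use names generated this way as values for the `FileServer` dimension, but confirm them against the dimension values that CloudWatch actually lists for your file system.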

## FSx for Lustre object storage target metrics
<a name="fsx-ost-metrics"></a>

The `AWS/FSx` namespace includes the following object storage target (OST) metrics. All of these metrics take two dimensions, `FileSystemId` and `StorageTargetId`.

**Note**  
`DiskReadOperations` and `DiskWriteOperations` metrics are not available on Scratch file systems, and `DiskIopsUtilization` metrics are not available on Scratch and Persistent HDD file systems.


| Metric | Description | 
| --- | --- | 
| DiskReadBytes | The number of bytes (disk IO) from any disk reads from this OST. There is one metric emitted each minute for each of your file system's OSTs. The `Sum` statistic is the total number of bytes read from the given OST over the specified period. The `Average` statistic is the average number of bytes read each minute from the given OST over the specified period. The `Minimum` statistic is the lowest number of bytes read each minute from the given OST over the specified period. The `Maximum` statistic is the highest number of bytes read each minute from the given OST over the specified period. To calculate read disk throughput (bytes per second) for any statistic, divide the statistic by the seconds in the period. Unit: Bytes Valid statistics: `Sum`, `Average`, `Minimum`, and `Maximum` | 
| DiskWriteBytes | The number of bytes (disk IO) from any disk writes to this OST. There is one metric emitted each minute for each of your file system's OSTs. The `Sum` statistic is the total number of bytes written to the given OST over the specified period. The `Average` statistic is the average number of bytes written each minute to the given OST over the specified period. The `Minimum` statistic is the lowest number of bytes written each minute to the given OST over the specified period. The `Maximum` statistic is the highest number of bytes written each minute to the given OST over the specified period. To calculate write disk throughput (bytes per second) for any statistic, divide the statistic by the seconds in the period. Unit: Bytes Valid statistics: `Sum`, `Average`, `Minimum`, and `Maximum` | 
| DiskReadOperations |  The number of read operations (disk IO) to this OST. There is one metric emitted each minute for each of your file system's OSTs. The `Sum` statistic is the total number of read operations performed by the given OST over the specified period. The `Average` statistic is the average number of read operations performed each minute by the given OST over the specified period. The `Minimum` statistic is the lowest number of read operations performed each minute by the given OST over the specified period. The `Maximum` statistic is the highest number of read operations performed each minute by the given OST over the specified period. To calculate average disk IOPS over the period, use the `Average` statistic and divide the result by 60 (seconds). Units: Count  Valid statistics: `Sum`, `Average`, `Minimum`, and `Maximum`  | 
| DiskWriteOperations |  The number of write operations (disk IO) to this OST. There is one metric emitted each minute for each of your file system's OSTs. The `Sum` statistic is the total number of write operations performed by the given OST over the specified period. The `Average` statistic is the average number of write operations performed each minute by the given OST over the specified period. The `Minimum` statistic is the lowest number of write operations performed each minute by the given OST over the specified period. The `Maximum` statistic is the highest number of write operations performed each minute by the given OST over the specified period. To calculate average disk IOPS over the period, use the `Average` statistic and divide the result by 60 (seconds). Units: Count  Valid statistics: `Sum`, `Average`, `Minimum`, and `Maximum`  | 
| DiskIopsUtilization | The disk IOPS utilization of one OST, as a percentage of the OST's disk IOPS limit. There is one metric emitted each minute for each of your file system's OSTs. The `Average` statistic is the average disk IOPS utilization for the given OST over the specified period. The `Minimum` statistic is the lowest disk IOPS utilization for the given OST over the specified period. The `Maximum` statistic is the highest disk IOPS utilization for the given OST over the specified period. Unit: Percent Valid statistics: `Average`, `Minimum`, and `Maximum`  | 

## FSx for Lustre metadata metrics
<a name="fs-metadata-metrics"></a>

The `AWS/FSx` namespace includes the following metadata metrics. The `CPUUtilization` metric takes the `FileSystemId` and `FileServer` dimensions, while the other metrics take the `FileSystemId` and `StorageTargetId` dimensions.
+ `FileSystemId` – Your file system's AWS resource ID.
+ `StorageTargetId` – The name of the metadata target (MDT). MDTs use the naming convention `MDT<MDTIndex>` (for example, `MDT0001`).
+ `FileServer` – The name of the metadata server (MDS) in your Lustre file system. Each MDS is provisioned with one metadata target (MDT). MDSs use the naming convention `MDS<HostIndex>`, where `HostIndex` represents a 4-digit hexadecimal value derived using the MDT index on the server. For example, the first MDS, provisioned with `MDT0000`, uses `MDS0000`, and the second MDS, provisioned with `MDT0001`, uses `MDS0001`. Your file system contains multiple metadata servers if your file system has a metadata configuration specified.


| Metric | Description | 
| --- | --- | 
| CPUUtilization |  The percent utilization of your file system's MDS CPU resources. There is one metric emitted each minute for each of your file system's MDSs. The `Average` statistic is the average CPU utilization of the MDS over a specified period. The `Minimum` statistic is the lowest CPU utilization for the given MDS over the specified period. The `Maximum` statistic is the highest CPU utilization for the given MDS over the specified period. Unit: Percent Valid statistics: `Average`, `Minimum` and `Maximum`  | 
| FileCreateOperations |  Total number of file create operations. Unit: Count  | 
| FileOpenOperations |  Total number of file open operations. Unit: Count  | 
| FileDeleteOperations |  Total number of file delete operations. Unit: Count  | 
| StatOperations |  Total number of stat operations. Unit: Count  | 
| RenameOperations |  Total number of directory renames, whether in-place directory renames or cross directory renames. Unit: Count  | 

## FSx for Lustre storage capacity metrics
<a name="fsx-storage-capacity-metrics"></a>

The `AWS/FSx` namespace includes the following storage capacity metrics. All of these metrics take two dimensions, `FileSystemId` and `StorageTargetId`, except `LogicalDiskUsage` and `PhysicalDiskUsage`, which take only the `FileSystemId` dimension.


| Metric | Description | 
| --- | --- | 
| FreeDataStorageCapacity |  The amount of available storage capacity in this OST. There is one metric emitted each minute for each of your file system's OSTs. The `Sum` statistic is the total number of bytes available in the given OST over the specified period. The `Average` statistic is the average number of bytes available in the given OST over the specified period. The `Minimum` statistic is the lowest number of bytes available in the given OST over the specified period. The `Maximum` statistic is the highest number of bytes available in the given OST over the specified period. Unit: Bytes Valid statistics: `Sum`, `Average`, `Minimum`, and `Maximum`  | 
| `StorageCapacityUtilization`  |  The storage capacity utilization for a given file system OST. There is one metric emitted each minute for each of your file system's OSTs. The `Average` statistic is the average amount of storage capacity utilization for a given OST over a specified period. The `Minimum` statistic is the minimum amount of storage capacity utilization for a given OST over a specified period. The `Maximum` statistic is the maximum amount of storage capacity utilization for a given OST over a specified period. Unit: Percent Valid statistics: `Average`, `Minimum`, `Maximum`  | 
| `StorageCapacityUtilizationWithCachedWrites`  |  The storage capacity utilization for a given file system OST including space reserved for cached writes on the client. There is one metric emitted each minute for each of your file system's OSTs. The `Average` statistic is the average amount of storage capacity utilization for a given OST over a specified period. The `Minimum` statistic is the minimum amount of storage capacity utilization for a given OST over a specified period. The `Maximum` statistic is the maximum amount of storage capacity utilization for a given OST over a specified period. Unit: Percent Valid statistics: `Average`, `Minimum`, `Maximum`  | 
| LogicalDiskUsage |  The amount of logical data stored (uncompressed). The `Sum` statistic is the total number of logical bytes stored in the file system. The `Minimum` statistic is the least number of logical bytes stored in an OST in the file system. The `Maximum` statistic is the largest number of logical bytes stored in an OST in the file system. The `Average` statistic is the average number of logical bytes stored per OST. The `SampleCount` statistic is the number of OSTs. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 
| PhysicalDiskUsage |  The amount of storage physically occupied by file system data (compressed). The `Sum` statistic is the total number of bytes occupied in OSTs in the file system. The `Minimum` statistic is the total number of bytes occupied in the emptiest OST. The `Maximum` statistic is the total number of bytes occupied in the fullest OST. The `Average` statistic is the average number of bytes occupied per OST. The `SampleCount` statistic is the number of OSTs. Units: [\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/fs-metrics.html) Valid statistics: `Sum`, `Minimum`, `Maximum`, `Average`, `SampleCount`  | 

## FSx for Lustre S3 repository metrics
<a name="auto-import-export-metrics"></a>

FSx for Lustre publishes the following `AutoImport` (automatic import) and `AutoExport` (automatic export) metrics in the `AWS/FSx` namespace in CloudWatch. These metrics use dimensions to enable more granular measurements of your data. All `AutoImport` and `AutoExport` metrics have the `FileSystemId` and `Publisher` dimensions.


| Metric | Description | 
| --- | --- | 
|  `AgeOfOldestQueuedMessage` Dimension: `AutoExport`  |  The age, in seconds, of the oldest message waiting to be exported. The `Average` statistic is the average age of the oldest message waiting to be exported. The `Maximum` statistic is the maximum number of seconds a message lived in the export queue. The `Minimum` statistic is the minimum number of seconds a message lived in the export queue. A value of zero indicates that no messages are waiting to be exported. Units: Seconds Valid statistics: `Average`, `Minimum`, `Maximum`  | 
|  `RepositoryRenameOperations` Dimension: `AutoExport`  |  The number of renames processed by the file system in response to a larger directory rename. The `Sum` statistic is the total number of rename operations that result from a directory rename. The `Average` statistic is the average number of rename operations for the file system. The `Maximum` statistic is the maximum number of rename operations associated with a directory rename on the file system. The `Minimum` statistic is the minimum number of renames associated with a directory rename on the file system. Units: Count Valid statistics: `Sum`, `Average`, `Minimum`, `Maximum`  | 
|  `AgeOfOldestQueuedMessage` Dimension: `AutoImport`  |  The age, in seconds, of the oldest message waiting to be imported. The `Average` statistic is the average age of the oldest message waiting to be imported. The `Maximum` statistic is the maximum number of seconds a message lived in the import queue. The `Minimum` statistic is the minimum number of seconds a message lived in the import queue. A value of zero indicates that no messages are waiting to be imported. Units: Seconds Valid statistics: `Average`, `Minimum`, `Maximum`  | 

## FSx for Lustre dimensions
<a name="fsx-dimensions"></a>

Amazon FSx for Lustre metrics use the `AWS/FSx` namespace and use the following dimensions.
+ The `FileSystemId` dimension denotes a file system's ID and filters the metrics that you request to that individual file system. You can find the ID on the Amazon FSx console on the **Summary** panel of the file system details page, in the **File system ID** field. The file system ID takes the form of *fs-01234567890123456*. You can also see the ID in the response of a [`describe-file-systems`](https://docs.aws.amazon.com/cli/latest/reference/fsx/describe-file-systems.html) CLI command (the equivalent API action is [`DescribeFileSystems`](https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeFileSystems.html)).
+ The `StorageTargetId` dimension denotes which OST (object storage target) or MDT (metadata target) published the metrics. A `StorageTargetId` takes the form of `OSTxxxx` (for example, `OST0001`) or `MDTxxxx` (for example, `MDT0001`).
+ The `FileServer` dimension denotes the following:
  + For OSS metrics: the name of the object storage server (OSS). OSS use the `OSSxxxx` naming convention (for example, `OSS0002`).
  + For the `CPUUtilization` metric: the name of a metadata server (MDS). MDS use the `MDSxxxx` naming convention (for example, `MDS0002`).
+ The `Publisher` dimension is available in CloudWatch and the AWS CLI for the `AutoImport` and `AutoExport` metrics to denote which service published the metrics.

For more information about dimensions, see [Dimensions](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_concepts.html#Dimension) in the *Amazon CloudWatch User Guide*.
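As an illustration of how the namespace and dimensions fit together, the following minimal sketch builds a request for the CloudWatch `GetMetricStatistics` action and sends it with the AWS SDK for Python (Boto3). The helper names (`build_dimensions`, `build_metric_query`, `fetch_datapoints`) are hypothetical, the use of Boto3 is an assumption, and passing `AutoImport` as the value of the `Publisher` dimension is also an assumption; confirm the dimension values that your file system publishes in the CloudWatch console before relying on it.

```python
from datetime import datetime, timedelta, timezone


def build_dimensions(file_system_id, publisher=None, storage_target_id=None, file_server=None):
    """Return a CloudWatch Dimensions list for an FSx for Lustre metric.

    Only FileSystemId is always included; the other dimensions are added
    when the caller supplies them.
    """
    dimensions = [{"Name": "FileSystemId", "Value": file_system_id}]
    if publisher is not None:
        dimensions.append({"Name": "Publisher", "Value": publisher})
    if storage_target_id is not None:
        dimensions.append({"Name": "StorageTargetId", "Value": storage_target_id})
    if file_server is not None:
        dimensions.append({"Name": "FileServer", "Value": file_server})
    return dimensions


def build_metric_query(file_system_id, metric_name, statistic, publisher=None, minutes=60, now=None):
    """Build keyword arguments for GetMetricStatistics over the last `minutes` minutes."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/FSx",
        "MetricName": metric_name,
        "Dimensions": build_dimensions(file_system_id, publisher=publisher),
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": 60,  # seconds; a 1-minute period is assumed here
        "Statistics": [statistic],
    }


def fetch_datapoints(query):
    """Send the query to CloudWatch (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without the SDK

    cloudwatch = boto3.client("cloudwatch")
    response = cloudwatch.get_metric_statistics(**query)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

For example, `fetch_datapoints(build_metric_query("fs-01234567890123456", "AgeOfOldestQueuedMessage", "Maximum", publisher="AutoImport"))` would request the maximum age of the oldest queued import message for each minute of the past hour, using the placeholder file system ID from this section.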

# Performance warnings and recommendations
<a name="performance-insights"></a>

FSx for Lustre displays a warning for CloudWatch metrics when one of these metrics approaches or crosses a predetermined threshold for multiple consecutive data points. These warnings provide you with actionable recommendations that you can use to optimize your file system's performance.

Warnings are accessible in several areas of the **Monitoring & performance** dashboard on the Amazon FSx for Lustre console. All active or recent Amazon FSx performance warnings and CloudWatch alarms configured for the file system that are in an alarm state appear in the **Monitoring & performance** panel in the **Summary** section. The warning also appears in the section of the dashboard where the metric graph is displayed. These warnings automatically disappear from the dashboard 24 hours after the underlying metrics fall below the warning threshold.

You can create CloudWatch alarms for any of the Amazon FSx metrics. For more information, see [Creating CloudWatch alarms to monitor metrics](creating_alarms.md).

## Use performance warnings to improve file system performance
<a name="resolve-warnings"></a>

Amazon FSx provides actionable recommendations that you can use to optimize your file system's performance. You can take the recommended action if you expect the issue to continue, or if it's affecting your file system's performance. Depending on which metric has triggered a warning, you can resolve it by increasing the file system's throughput capacity, storage capacity, or metadata IOPS, as described in the following table.

[\[See the AWS documentation website for more details\]](http://docs.aws.amazon.com/fsx/latest/LustreGuide/performance-insights.html)

For more information about file system performance, see [Amazon FSx for Lustre performance](performance.md).

# Creating CloudWatch alarms to monitor metrics
<a name="creating_alarms"></a>

You can create a CloudWatch alarm that sends an Amazon SNS message when the alarm changes state. An alarm watches a single metric over a time period that you specify and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods. The action is a notification that's sent to an Amazon SNS topic or Auto Scaling policy.

Alarms invoke actions for sustained state changes only. CloudWatch alarms don't invoke actions because they are in a particular state. The state must change and remain changed for a specified period of time. You can create an alarm on the Amazon FSx console or the CloudWatch console.
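The following sketch illustrates the idea that actions follow state changes rather than states. It is a deliberately simplified model and not CloudWatch's actual evaluation logic: it assumes a greater-than-threshold comparison, requires every data point in the evaluation window to breach, and ignores missing data and the `INSUFFICIENT_DATA` state. The function name `alarm_transitions` and the sample values are hypothetical.

```python
def alarm_transitions(values, threshold, evaluation_periods):
    """Return (index, new_state) pairs for each state change in a simplified alarm.

    The alarm is in ALARM only while the most recent `evaluation_periods`
    data points all exceed `threshold`; otherwise it is OK. An action would
    be invoked only at the returned transitions, not on every data point.
    """
    state = "OK"
    transitions = []
    for i in range(len(values)):
        window = values[max(0, i - evaluation_periods + 1): i + 1]
        if len(window) < evaluation_periods:
            continue  # not enough data points yet to evaluate
        new_state = "ALARM" if all(v > threshold for v in window) else "OK"
        if new_state != state:
            transitions.append((i, new_state))
            state = new_state
    return transitions


samples = [10, 95, 20, 95, 96, 97, 98, 30]
print(alarm_transitions(samples, threshold=90, evaluation_periods=3))
# prints [(5, 'ALARM'), (7, 'OK')]
```

In this model the isolated breach at index 1 invokes nothing, and the alarm stays silent at index 6 because it is already in the `ALARM` state.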

The following procedures describe how to create alarms for Amazon FSx for Lustre using the console, AWS CLI, and API.

## To set alarms using the Amazon FSx for Lustre console
<a name="fsx-console-alarms"></a>

1. Open the Amazon FSx console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. From the navigation pane, choose **File systems**, and then choose the file system that you want to create the alarm for.

1. On the **Summary** page, choose **Monitoring & performance**. 

1. Choose **Create CloudWatch alarm**. You are redirected to the CloudWatch console.

1. Choose **Select metrics**, and choose **Next**.

1. In the **Metrics** section, choose **FSX**.

1. Choose **File System Metrics**, choose the metric that you want to set the alarm for, and then choose **Select metric**.

1. In the **Conditions** section, choose the conditions for the alarm, and choose **Next**.
**Note**  
Metrics might not be published during file system maintenance. To prevent unnecessary and misleading alarm condition changes and to configure your alarms so that they are resilient to missing data points, see [Configuring how CloudWatch alarms treat missing data](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/AlarmThatSendsEmail.html#alarms-and-missing-data) in the *Amazon CloudWatch User Guide*.

1. If you want CloudWatch to send you an email or SNS notification when the alarm state triggers the action, choose **Whenever this alarm state is**.

   For **Select an SNS topic**, choose an existing SNS topic. If you select **Create topic**, you can set the name and email addresses for a new email subscription list. This list is saved and appears in the field for future alarms. Choose **Next**.
**Warning**  
If you use **Create topic** to create a new Amazon SNS topic, the email addresses must be verified before they receive notifications. Emails are only sent when the alarm enters an alarm state. If this alarm state change happens before the email addresses are verified, they do not receive a notification.

1. Fill in the **Name**, **Description**, and **Whenever** values for the metric, and choose **Next**. 

1. On the **Preview and create** page, review the alarm and choose **Create Alarm**. 

## To set alarms using the CloudWatch console
<a name="cloudwatch-console-alarms"></a>

1. Sign in to the AWS Management Console and open the CloudWatch console at [https://console.aws.amazon.com/cloudwatch/](https://console.aws.amazon.com/cloudwatch/).

1. Choose **Create Alarm** to start the **Create Alarm Wizard**. 

1. Choose **FSx Metrics** to locate a metric. To narrow the results, you can search for your file system ID. Select the metric that you want to create an alarm for and choose **Next**.

1. Enter a **Name** and a **Description**, and choose a **Whenever** value for the metric.

1. If you want CloudWatch to send you an email when the alarm state is reached, choose **State is ALARM** for **Whenever this alarm**. For **Send notification to**, choose an existing SNS topic. If you select **Create topic**, you can set up the names and email addresses for a new email subscription list. This list is saved and appears in the field for future alarms.
**Warning**  
If you use **Create topic** to create a new Amazon SNS topic, the email addresses must be verified before they receive notifications. Emails are only sent when the alarm enters an alarm state. If this alarm state change happens before the email addresses are verified, they do not receive a notification.

1. View the **Alarm Preview** and then choose **Create Alarm** or go back to make changes. 

## To set an alarm using the AWS CLI
<a name="cli-alarms"></a>
+ Call `[put-metric-alarm](https://docs.aws.amazon.com/cli/latest/reference/cloudwatch/put-metric-alarm.html)`. For more information, see *[AWS CLI Command Reference](https://docs.aws.amazon.com/cli/latest/reference/)*.

## To set an alarm using the CloudWatch API
<a name="api-alarms"></a>
+ Call `[PutMetricAlarm](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/API_PutMetricAlarm.html)`. For more information, see *[Amazon CloudWatch API Reference](https://docs.aws.amazon.com/AmazonCloudWatch/latest/APIReference/)*. 
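
As a rough sketch of what a `PutMetricAlarm` request might look like, the following example assembles the parameters and sends them with the AWS SDK for Python (Boto3). The choice of Boto3, the helper names, the alarm name, the 900-second threshold, the five evaluation periods, and the `Publisher` value of `AutoImport` are all illustrative assumptions rather than recommended settings. `TreatMissingData` is set to `notBreaching` here on the assumption that gaps during file system maintenance should not change the alarm state; choose the setting that matches your own requirements.

```python
def build_alarm_request(file_system_id, sns_topic_arn, threshold_seconds=900):
    """Build keyword arguments for PutMetricAlarm on AgeOfOldestQueuedMessage."""
    return {
        "AlarmName": f"fsx-{file_system_id}-auto-import-backlog",  # hypothetical name
        "AlarmDescription": "Oldest queued import message is older than the threshold.",
        "Namespace": "AWS/FSx",
        "MetricName": "AgeOfOldestQueuedMessage",
        "Dimensions": [
            {"Name": "FileSystemId", "Value": file_system_id},
            {"Name": "Publisher", "Value": "AutoImport"},  # assumed dimension value
        ],
        "Statistic": "Maximum",
        "Period": 60,            # seconds per evaluation period (assumed)
        "EvaluationPeriods": 5,  # breach must persist for 5 periods (assumed)
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        "AlarmActions": [sns_topic_arn],  # SNS topic notified on entering ALARM
    }


def create_alarm(request):
    """Create or update the alarm (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    boto3.client("cloudwatch").put_metric_alarm(**request)
```

Calling `create_alarm(build_alarm_request("fs-01234567890123456", topic_arn))`, where `topic_arn` is the ARN of an existing Amazon SNS topic, would create the alarm. Because `PutMetricAlarm` updates an alarm that already has the same name, rerunning the call with a different threshold changes the existing alarm instead of creating a second one.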