Amazon FSx for Lustre metrics and dimensions

Amazon FSx for Lustre publishes the metrics described in the following tables in the AWS/FSx namespace in Amazon CloudWatch for all FSx for Lustre file systems.

FSx for Lustre network I/O metrics

The AWS/FSx namespace includes the following network I/O metrics. All of these metrics take one dimension, FileSystemId.

Metric Description
DataReadBytes

The number of bytes from reads by clients to the file system.

The Sum statistic is the total number of bytes associated with read operations during the specified period. The Minimum statistic is the minimum number of bytes associated with read operations on a single OST. The Maximum statistic is the maximum number of bytes associated with read operations on a single OST. The Average statistic is the average number of bytes associated with read operations per OST. The SampleCount statistic is the number of OSTs.

To calculate the average throughput (bytes per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

  • Bytes for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount
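
The relationship between the per-OST statistics and the throughput calculation described for DataReadBytes can be illustrated with a minimal Python sketch. The function names and the sample values are hypothetical, chosen only for illustration; CloudWatch computes these statistics itself, so this is a model of the arithmetic rather than anything you need to run against the service.

```python
from statistics import mean


def summarize_per_ost(bytes_per_ost):
    """Derive the five statistics from one value per OST (illustrative only)."""
    return {
        "Sum": sum(bytes_per_ost),
        "Minimum": min(bytes_per_ost),
        "Maximum": max(bytes_per_ost),
        "Average": mean(bytes_per_ost),
        "SampleCount": len(bytes_per_ost),
    }


def average_throughput(sum_bytes, period_seconds):
    """Average throughput in bytes per second: Sum divided by the period length."""
    return sum_bytes / period_seconds


# Hypothetical read volumes for one 60-second period on a file system with three OSTs.
stats = summarize_per_ost([600_000_000, 300_000_000, 900_000_000])
throughput = average_throughput(stats["Sum"], 60)
print(stats["SampleCount"], throughput)
```

The same division applies to DataWriteBytes, and to the operation-count metrics when you want operations per second instead of bytes per second.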

DataWriteBytes

The number of bytes from writes by clients to the file system.

The Sum statistic is the total number of bytes associated with write operations during the specified period. The Minimum statistic is the minimum number of bytes associated with write operations on a single OST. The Maximum statistic is the maximum number of bytes associated with write operations on a single OST. The Average statistic is the average number of bytes associated with write operations per OST. The SampleCount statistic is the number of OSTs.

To calculate the average throughput (bytes per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

  • Bytes for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

DataReadOperations

The number of read operations.

The Sum statistic is the total number of read operations. The Minimum statistic is the minimum number of read operations on a single OST. The Maximum statistic is the maximum number of read operations on a single OST. The Average statistic is the average number of read operations per OST. The SampleCount statistic is the number of OSTs.

To calculate the average number of read operations (operations per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

  • Count for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

DataWriteOperations

The number of write operations.

The Sum statistic is the total number of write operations. The Minimum statistic is the minimum number of write operations on a single OST. The Maximum statistic is the maximum number of write operations on a single OST. The Average statistic is the average number of write operations per OST. The SampleCount statistic is the number of OSTs.

To calculate the average number of write operations (operations per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

  • Count for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

MetadataOperations

The number of metadata operations.

The Sum statistic is the count of metadata operations. The Minimum statistic is the minimum number of metadata operations per MDT. The Maximum statistic is the maximum number of metadata operations per MDT. The Average statistic is the average number of metadata operations per MDT. The SampleCount statistic is the number of MDTs.

To calculate the average number of metadata operations (operations per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

  • Count for Sum, Minimum, Maximum, Average, SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

ClientConnections

The number of active connections between clients and the file system.

Unit: Count

FSx for Lustre object storage server metrics

The AWS/FSx namespace includes the following object storage server (OSS) metrics. All of these metrics take two dimensions, FileSystemId and FileServer.

  • FileSystemId – Your file system's AWS resource ID.

  • FileServer – The name of the object storage server (OSS) in your Lustre file system. Each OSS is provisioned with one or more object storage targets (OSTs). OSSs use the naming convention OSS<HostIndex>, where HostIndex is a 4-digit hexadecimal value (for example, OSS0001). An OSS's index is the index of the first OST attached to it. For example, the first OSS, attached to OST0000 and OST0001, is named OSS0000, and the second OSS, attached to OST0002 and OST0003, is named OSS0002.
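
As a rough illustration of the OSS naming rule described above, the following Python sketch derives FileServer names from the number of OSTs attached to each server. It assumes that OSTs are numbered consecutively from zero in server order and that the hexadecimal digits are uppercase; neither assumption is stated in this guide, and the function name oss_names is hypothetical.

```python
def oss_names(osts_per_oss):
    """Return the OSS name for each server, given how many OSTs each one holds.

    Assumption: OSTs are numbered consecutively from 0 in server order, so each
    OSS takes the index of the first OST attached to it.
    """
    names = []
    first_ost_index = 0
    for ost_count in osts_per_oss:
        # 4-digit, zero-padded hexadecimal (uppercase is an assumption).
        names.append(f"OSS{first_ost_index:04X}")
        first_ost_index += ost_count
    return names


# Two servers with two OSTs each, matching the example in the text.
print(oss_names([2, 2]))
```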

Metric Description
NetworkThroughputUtilization

Network throughput utilization as a percentage of available network throughput for your file system. This metric is equivalent to the sum of NetworkSentBytes and NetworkReceivedBytes as a percentage of the network throughput capacity of one OSS for your file system. There is one metric emitted each minute for each of your file system's OSSs.

The Average statistic is the average network throughput utilization for the given OSS over the specified period.

The Minimum statistic is the lowest network throughput utilization for the given OSS over one minute, for the specified period.

The Maximum statistic is the highest network throughput utilization for the given OSS over one minute, for the specified period.

Unit: Percent

Valid statistics: Average, Minimum, Maximum
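
The description of NetworkThroughputUtilization as the sum of NetworkSentBytes and NetworkReceivedBytes relative to one OSS's network capacity can be sketched as follows. This is a minimal illustration, not the service's implementation: the function name, the capacity figure, and the assumption that capacity is expressed in bytes per second are all hypothetical.

```python
def network_throughput_utilization(sent_bytes, received_bytes,
                                   capacity_bytes_per_second,
                                   period_seconds=60):
    """Percent of one OSS's network capacity used during the period."""
    observed_bytes_per_second = (sent_bytes + received_bytes) / period_seconds
    return 100.0 * observed_bytes_per_second / capacity_bytes_per_second


# Hypothetical one-minute sample: 3 GB sent and 1.5 GB received on an OSS
# that is assumed to have 150 MB/s of network throughput capacity.
utilization = network_throughput_utilization(
    3_000_000_000, 1_500_000_000, 150_000_000
)
print(f"{utilization:.1f}%")
```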

NetworkSentBytes

The number of bytes sent by the file system. All traffic is considered in this metric, including data movement to and from linked data repositories. There is one metric emitted each minute for each of your file system's OSSs.

The Sum statistic is the total number of bytes sent over the network by the given OSS over the specified period.

The Average statistic is the average number of bytes sent over the network by the given OSS over the specified period.

The Minimum statistic is the lowest number of bytes sent over the network by the given OSS over the specified period.

The Maximum statistic is the highest number of bytes sent over the network by the given OSS over the specified period.

To calculate sent throughput (bytes per second) for any statistic, divide the statistic by the seconds in the specified period.

Unit: Bytes

Valid statistics: Sum, Average, Minimum, Maximum

NetworkReceivedBytes

The number of bytes received by the file system. All traffic is considered in this metric, including data movement to and from linked data repositories. There is one metric emitted each minute for each of your file system's OSSs.

The Sum statistic is the total number of bytes received over the network by the given OSS over the specified period.

The Average statistic is the average number of bytes received over the network by the given OSS over the specified period.

The Minimum statistic is the lowest number of bytes received over the network by the given OSS over the specified period.

The Maximum statistic is the highest number of bytes received over the network by the given OSS over the specified period.

To calculate throughput (bytes per second) for any statistic, divide the statistic by the seconds in the specified period.

Unit: Bytes

Valid statistics: Sum, Average, Minimum, Maximum

FileServerDiskThroughputUtilization

The disk throughput between your OSS and associated OSTs, as a percentage of the provisioned limit determined by throughput capacity. This metric is equivalent to the sum of DiskReadBytes and DiskWriteBytes as a percentage of the OSS's disk throughput capacity for your file system. There is one metric emitted each minute for each of your file system's OSSs.

The Average statistic is the average OSS disk throughput utilization for the given OSS over the specified period.

The Minimum statistic is the lowest OSS disk throughput utilization for the given OSS over the specified period.

The Maximum statistic is the highest OSS disk throughput utilization for the given OSS over the specified period.

Unit: Percent

Valid statistics: Average, Minimum, Maximum

FSx for Lustre object storage target metrics

The AWS/FSx namespace includes the following object storage target (OST) metrics. All of these metrics take two dimensions, FileSystemId and StorageTargetId.

Note

The DiskReadOperations and DiskWriteOperations metrics are not available on Scratch file systems, and the DiskIopsUtilization metric is not available on Scratch or Persistent HDD file systems.

Metric Description
DiskReadBytes

The number of bytes (disk IO) from any disk reads from this OST. There is one metric emitted each minute for each of your file system's OSTs.

The Sum statistic is the total number of bytes read from the given OST over the specified period.

The Average statistic is the average number of bytes read each minute from the given OST over the specified period.

The Minimum statistic is the lowest number of bytes read each minute from the given OST over the specified period.

The Maximum statistic is the highest number of bytes read each minute from the given OST over the specified period.

To calculate read disk throughput (bytes per second) for any statistic, divide the statistic by the seconds in the period.

Unit: Bytes

Valid statistics: Sum, Average, Minimum, and Maximum

DiskWriteBytes

The number of bytes (disk IO) from any disk writes to this OST. There is one metric emitted each minute for each of your file system's OSTs.

The Sum statistic is the total number of bytes written to the given OST over the specified period.

The Average statistic is the average number of bytes written each minute to the given OST over the specified period.

The Minimum statistic is the lowest number of bytes written each minute to the given OST over the specified period.

The Maximum statistic is the highest number of bytes written each minute to the given OST over the specified period.

To calculate write disk throughput (bytes per second) for any statistic, divide the statistic by the seconds in the period.

Unit: Bytes

Valid statistics: Sum, Average, Minimum, and Maximum

DiskReadOperations

The number of read operations (disk IO) to this OST. There is one metric emitted each minute for each of your file system's OSTs.

The Sum statistic is the total number of read operations performed by the given OST over the specified period.

The Average statistic is the average number of read operations performed each minute by the given OST over the specified period.

The Minimum statistic is the lowest number of read operations performed each minute by the given OST over the specified period.

The Maximum statistic is the highest number of read operations performed each minute by the given OST over the specified period.

To calculate average disk IOPS over the period, use the Average statistic and divide the result by 60 (seconds).

Units: Count

Valid statistics: Sum, Average, Minimum, and Maximum
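
Because DiskReadOperations is emitted once per minute, the IOPS conversion described above is a division of the per-minute Average by 60. A minimal Python sketch of that arithmetic, with a hypothetical function name and sample value:

```python
def average_disk_iops(average_ops_per_minute):
    """Convert the Average statistic (operations per minute) to operations per second."""
    return average_ops_per_minute / 60


# Hypothetical Average statistic: 180,000 read operations per minute on one OST.
iops = average_disk_iops(180_000)
print(iops)
```

The same conversion applies to the Average statistic of DiskWriteOperations.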

DiskWriteOperations

The number of write operations (disk IO) to this OST. There is one metric emitted each minute for each of your file system's OSTs.

The Sum statistic is the total number of write operations performed by the given OST over the specified period.

The Average statistic is the average number of write operations performed each minute by the given OST over the specified period.

The Minimum statistic is the lowest number of write operations performed each minute by the given OST over the specified period.

The Maximum statistic is the highest number of write operations performed each minute by the given OST over the specified period.

To calculate average disk IOPS over the period, use the Average statistic and divide the result by 60 (seconds).

Units: Count

Valid statistics: Sum, Average, Minimum, and Maximum

DiskIopsUtilization

The disk IOPS utilization of one OST, as a percentage of the OST's disk IOPS limit. There is one metric emitted each minute for each of your file system's OSTs.

The Average statistic is the average disk IOPS utilization for the given OST over the specified period.

The Minimum statistic is the lowest disk IOPS utilization for the given OST over the specified period.

The Maximum statistic is the highest disk IOPS utilization for the given OST over the specified period.

Unit: Percent

Valid statistics: Average, Minimum, and Maximum

FSx for Lustre metadata metrics

The AWS/FSx namespace includes the following metadata metrics. The CPUUtilization metric takes the FileSystemId and FileServer dimensions, while the other metrics take the FileSystemId and StorageTargetId dimensions.

  • FileSystemId – Your file system's AWS resource ID.

  • StorageTargetId – The name of the metadata target (MDT). MDTs use the naming convention of MDT<MDTIndex> (for example, MDT0001).

  • FileServer – The name of the metadata server (MDS) in your Lustre file system. Each MDS is provisioned with one metadata target (MDT). MDS use the naming convention of MDS<HostIndex>, where HostIndex represents a 4-digit hexadecimal value derived using the MDT index on the server. For example, the first MDS provisioned with MDT0000 will use MDS0000, and the second MDS provisioned with MDT0001 will use MDS0001. Your file system contains multiple metadata servers if your file system has a metadata configuration specified.

Metric Description
CPUUtilization

The percent utilization of your file system's MDS CPU resources. There is one metric emitted each minute for each of your file system's MDSs.

The Average statistic is the average CPU utilization of the MDS over a specified period.

The Minimum statistic is the lowest CPU utilization for the given MDS over the specified period.

The Maximum statistic is the highest CPU utilization for the given MDS over the specified period.

Unit: Percent

Valid statistics: Average, Minimum, and Maximum

FileCreateOperations

Total number of file create operations.

Unit: Count

FileOpenOperations

Total number of file open operations.

Unit: Count

FileDeleteOperations

Total number of file delete operations.

Unit: Count

StatOperations

Total number of stat operations.

Unit: Count

RenameOperations

Total number of directory renames, whether in-place directory renames or cross directory renames.

Unit: Count

FSx for Lustre storage capacity metrics

The AWS/FSx namespace includes the following storage capacity metrics. All of these metrics take two dimensions, FileSystemId and StorageTargetId, except LogicalDiskUsage and PhysicalDiskUsage, which take only the FileSystemId dimension.

Metric Description
FreeDataStorageCapacity

The amount of available storage capacity in this OST. There is one metric emitted each minute for each of your file system's OSTs.

The Sum statistic is the total number of bytes available in the given OST over the specified period.

The Average statistic is the average number of bytes available in the given OST over the specified period.

The Minimum statistic is the lowest number of bytes available in the given OST over the specified period.

The Maximum statistic is the highest number of bytes available in the given OST over the specified period.

Unit: Bytes

Valid statistics: Sum, Average, Minimum, and Maximum

StorageCapacityUtilization

The storage capacity utilization for a given file system OST. There is one metric emitted each minute for each of your file system's OSTs.

The Average statistic is the average amount of storage capacity utilization for a given OST over a specified period.

The Minimum statistic is the minimum amount of storage capacity utilization for a given OST over a specified period.

The Maximum statistic is the maximum amount of storage capacity utilization for a given OST over a specified period.

Unit: Percent

Valid statistics: Average, Minimum, Maximum

StorageCapacityUtilizationWithCachedWrites

The storage capacity utilization for a given file system OST including space reserved for cached writes on the client. There is one metric emitted each minute for each of your file system's OSTs.

The Average statistic is the average amount of storage capacity utilization for a given OST over a specified period.

The Minimum statistic is the minimum amount of storage capacity utilization for a given OST over a specified period.

The Maximum statistic is the maximum amount of storage capacity utilization for a given OST over a specified period.

Unit: Percent

Valid statistics: Average, Minimum, Maximum

LogicalDiskUsage

The amount of logical data stored (uncompressed).

The Sum statistic is the total number of logical bytes stored in the file system. The Minimum statistic is the least number of logical bytes stored in an OST in the file system. The Maximum statistic is the largest number of logical bytes stored in an OST in the file system. The Average statistic is the average number of logical bytes stored per OST. The SampleCount statistic is the number of OSTs.

Units:

  • Bytes for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

PhysicalDiskUsage

The amount of storage physically occupied by file system data (compressed).

The Sum statistic is the total number of bytes occupied in OSTs in the file system. The Minimum statistic is the total number of bytes occupied in the emptiest OST. The Maximum statistic is the total number of bytes occupied in the fullest OST. The Average statistic is the average number of bytes occupied per OST. The SampleCount statistic is the number of OSTs.

Units:

  • Bytes for Sum, Minimum, Maximum, Average.

  • Count for SampleCount.

Valid statistics: Sum, Minimum, Maximum, Average, SampleCount

FSx for Lustre S3 repository metrics

FSx for Lustre publishes the following AutoImport (automatic import) and AutoExport (automatic export) metrics in the AWS/FSx namespace in CloudWatch. These metrics use dimensions to enable more granular measurements of your data. All AutoImport and AutoExport metrics have the FileSystemId and Publisher dimensions.

Metric Description

AgeOfOldestQueuedMessage

Publisher dimension: AutoExport

The age, in seconds, of the oldest message waiting to be exported.

The Average statistic is the average age of the oldest message waiting to be exported. The Maximum statistic is the maximum number of seconds a message lived in the export queue. The Minimum statistic is the minimum number of seconds a message lived in the export queue. A value of zero indicates that no messages are waiting to be exported.

Units: Seconds

Valid statistics: Average, Minimum, Maximum

RepositoryRenameOperations

Publisher dimension: AutoExport

The number of renames processed by the file system in response to a larger directory rename.

The Sum statistic is the total number of rename operations that result from a directory rename. The Average statistic is the average number of rename operations for the file system. The Maximum statistic is the maximum number of rename operations associated with a directory rename on the file system. The Minimum statistic is the minimum number of renames associated with a directory rename on the file system.

Units: Count

Valid statistics: Sum, Average, Minimum, Maximum

AgeOfOldestQueuedMessage

Publisher dimension: AutoImport

The age, in seconds, of the oldest message waiting to be imported.

The Average statistic is the average age of the oldest message waiting to be imported. The Maximum statistic is the maximum number of seconds a message lived in the import queue. The Minimum statistic is the minimum number of seconds a message lived in the import queue. A value of zero indicates that no messages are waiting to be imported.

Units: Seconds

Valid statistics: Average, Minimum, Maximum

FSx for Lustre dimensions

Amazon FSx for Lustre metrics use the AWS/FSx namespace and use the following dimensions.

  • The FileSystemId dimension denotes a file system's ID and filters the metrics that you request to that individual file system. You can find the ID on the Amazon FSx console on the Summary panel of the file system details page, in the File system ID field. The file system ID takes the form of fs-01234567890123456. You can also see the ID in the response of a describe-file-systems CLI command (the equivalent API action is DescribeFileSystems).

  • The StorageTargetId dimension denotes which OST (object storage target) or MDT (metadata target) published the metrics. A StorageTargetId takes the form of OSTxxxx (for example, OST0001) or MDTxxxx (for example, MDT0001).

  • The FileServer dimension denotes the following:

    • For OSS metrics: the name of the object storage server (OSS). OSS use the OSSxxxx naming convention (for example, OSS0002).

    • For the CPUUtilization metric: the name of a metadata server (MDS). MDS use the MDSxxxx naming convention (for example, MDS0002).

  • The Publisher dimension is available in CloudWatch and the AWS CLI for the AutoImport and AutoExport metrics to denote which service published the metrics.
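
To show how the dimensions listed above combine in a request, here is a minimal Python sketch that assembles the parameters for a CloudWatch GetMetricStatistics call. The helper name build_metric_query and its defaults are hypothetical. The sketch only builds the parameter dictionary; passing it to a client (for example, boto3's cloudwatch.get_metric_statistics(**query)) is left as an assumption about your environment and credentials.

```python
from datetime import datetime, timedelta, timezone


def build_metric_query(metric_name, file_system_id, end_time,
                       storage_target_id=None, file_server=None,
                       period_seconds=60, statistics=("Average",),
                       lookback=timedelta(hours=1)):
    """Assemble GetMetricStatistics parameters for an AWS/FSx metric."""
    # Every FSx for Lustre metric takes the FileSystemId dimension.
    dimensions = [{"Name": "FileSystemId", "Value": file_system_id}]
    # OST and MDT metrics additionally take StorageTargetId.
    if storage_target_id is not None:
        dimensions.append({"Name": "StorageTargetId", "Value": storage_target_id})
    # OSS metrics and CPUUtilization additionally take FileServer.
    if file_server is not None:
        dimensions.append({"Name": "FileServer", "Value": file_server})
    return {
        "Namespace": "AWS/FSx",
        "MetricName": metric_name,
        "Dimensions": dimensions,
        "StartTime": end_time - lookback,
        "EndTime": end_time,
        "Period": period_seconds,
        "Statistics": list(statistics),
    }


# Example: per-OST free capacity for the placeholder file system ID used in this guide.
query = build_metric_query(
    "FreeDataStorageCapacity",
    "fs-01234567890123456",
    end_time=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    storage_target_id="OST0001",
    statistics=("Minimum",),
)
print(query["Dimensions"])
```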

For more information about dimensions, see Dimensions in the Amazon CloudWatch User Guide.