Working with EFA-enabled file systems
If you are creating a file system with over 10 GB/s of throughput capacity, we recommend enabling Elastic Fabric Adapter (EFA) to optimize throughput per client instance. EFA is a high-performance network interface that uses a custom-built operating system bypass technique and the AWS Scalable Reliable Datagram (SRD) network protocol to increase performance. For information about EFA, see Elastic Fabric Adapter for AI/ML and HPC workloads on Amazon EC2 in the Amazon EC2 User Guide.
EFA-enabled file systems support two additional performance features: GPUDirect Storage (GDS) and ENA Express. GDS support builds on EFA to further enhance performance by enabling direct data transfer between the file system and GPU memory, bypassing the CPU. This direct path eliminates redundant memory copies and CPU involvement in data transfer operations. With EFA and GDS support, you can achieve higher throughput to individual EFA-enabled client instances.
ENA Express provides optimized network communication for Amazon EC2 instances using an advanced path selection algorithm and an enhanced congestion control mechanism. With ENA Express support, you can achieve higher throughput to individual ENA Express-enabled client instances. For information about ENA Express, see Improve network performance between EC2 instances with ENA Express in the Amazon EC2 User Guide.
Considerations when using EFA-enabled file systems
Here are a few important items to consider when creating EFA-enabled file systems:
Multiple connectivity options: EFA-enabled file systems can communicate with client instances using ENA, ENA Express, and EFA.
Deployment type: EFA is supported on Persistent 2 file systems with a metadata configuration specified.
Updating EFA setting: You can enable EFA only when you create a new file system. You cannot enable or disable EFA on an existing file system.
Scaling throughput with storage capacity: You can scale storage capacity on an EFA-enabled file system to increase throughput capacity, but you cannot change the throughput tier of an EFA-enabled file system.
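As an illustration of the considerations above, the following minimal sketch builds a request body for creating an EFA-enabled Persistent 2 file system. The helper name build_efa_file_system_request is hypothetical. The field names (DeploymentType, PerUnitStorageThroughput, EfaEnabled, MetadataConfiguration) are assumed to match the Amazon FSx CreateFileSystem API, so verify them against the API reference before use. The storage capacity and throughput values are caller-supplied placeholders that this sketch does not validate.

```python
def build_efa_file_system_request(subnet_id, security_group_id,
                                  storage_capacity_gib, per_unit_throughput):
    """Return a request dict for an EFA-enabled Persistent 2 file system.

    Hypothetical helper: field names are assumed to follow the Amazon FSx
    CreateFileSystem API. No AWS call is made here.
    """
    return {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": storage_capacity_gib,
        "SubnetIds": [subnet_id],
        # The security group is expected to be EFA-enabled.
        "SecurityGroupIds": [security_group_id],
        "LustreConfiguration": {
            # EFA is supported on Persistent 2 file systems ...
            "DeploymentType": "PERSISTENT_2",
            "PerUnitStorageThroughput": per_unit_throughput,
            # ... with a metadata configuration specified.
            "MetadataConfiguration": {"Mode": "AUTOMATIC"},
            # EFA can be set only at creation time.
            "EfaEnabled": True,
        },
    }


# Placeholder identifiers and sizes, for illustration only.
request = build_efa_file_system_request(
    "subnet-EXAMPLE", "sg-EXAMPLE", 4800, 125)
# With boto3 installed and credentials configured, the dict could be passed as:
#   boto3.client("fsx").create_file_system(**request)
```

Because EFA cannot be changed after creation, keeping the flag in one reviewed request builder may help avoid creating a file system without it by mistake.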
Prerequisites for using EFA-enabled file systems
The following are prerequisites for using EFA-enabled file systems:
To create your EFA-enabled file system:
Use an EFA-enabled security group. For more information, see EFA-enabled security groups.
Use the same Availability Zone and /16 CIDR as your EFA-enabled client instances within your Amazon VPC.
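The placement prerequisite above can be checked before you create the file system. The following sketch takes one reading of the requirement, namely that two IPv4 addresses are in the same /16 CIDR when their first 16 bits match. The function name and the sample addresses and Availability Zone names are hypothetical.

```python
import ipaddress


def same_az_and_slash16(fs_az, fs_ip, client_az, client_ip):
    """Return True if both endpoints share an AZ and a /16 network.

    Assumption: "same /16 CIDR" means the two IPv4 addresses fall inside
    the same /16 network.
    """
    if fs_az != client_az:
        return False
    # strict=False lets us derive the enclosing /16 from a host address.
    fs_network = ipaddress.ip_network(f"{fs_ip}/16", strict=False)
    return ipaddress.ip_address(client_ip) in fs_network


# Same AZ, both addresses inside 10.0.0.0/16.
print(same_az_and_slash16("us-east-1a", "10.0.1.5", "us-east-1a", "10.0.200.9"))
# Same AZ, but the client is in 10.1.0.0/16.
print(same_az_and_slash16("us-east-1a", "10.0.1.5", "us-east-1a", "10.1.0.9"))
```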
To access your file system using Elastic Fabric Adapter (EFA):
Use Nitro v4 (or higher) EC2 instances that support EFA, excluding the p5en and trn2 instance families. See Supported instance types in the Amazon EC2 User Guide.
Run Ubuntu 22.04 with a kernel version of 6.8 or higher. For more information, see To install the Lustre client on Ubuntu 22.04.
Install the EFA modules and configure EFA interfaces on your client instances. For more information, see Configuring EFA clients.
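The kernel version prerequisite above can be verified on a client instance with a short script. This is a minimal sketch, assuming the kernel release string begins with major.minor (for example, the form reported by uname -r). The function name kernel_meets_minimum is hypothetical.

```python
import platform
import re


def kernel_meets_minimum(release, minimum=(6, 8)):
    """Return True if a kernel release string is at least `minimum`.

    Assumes the string starts with "<major>.<minor>", which is the usual
    form on Linux; anything else raises ValueError.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    version = (int(match.group(1)), int(match.group(2)))
    # Tuple comparison orders by major version first, then minor.
    return version >= minimum


if __name__ == "__main__":
    # platform.release() returns the running kernel release on Linux.
    release = platform.release()
    try:
        print(release, kernel_meets_minimum(release))
    except ValueError as error:
        print(error)
```

Comparing integer tuples rather than strings avoids ordering mistakes such as treating 6.10 as lower than 6.8.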
To access your file system using GPUDirect Storage (GDS):
Use an Amazon EC2 P5, P5e, G6, or G6e client instance.
Install the NVIDIA Compute Unified Device Architecture (CUDA) package, the open source NVIDIA driver, and the NVIDIA GPUDirect Storage Driver on your client instance. For more information, see Installing the GDS driver.
To access your file system using ENA Express:
Use Amazon EC2 instances that support ENA Express. See the Supported instance types for ENA Express in the Amazon EC2 User Guide.
Update the settings for your Linux instance. See Prerequisites for Linux instances in the Amazon EC2 User Guide.
Enable ENA Express on network interfaces for your client instances. For details, see Review ENA Express settings for your EC2 instance in the Amazon EC2 User Guide.