# Getting started with Amazon File Cache
<a name="getting-started"></a>

Learn how to start using Amazon File Cache. These steps walk you through creating an Amazon File Cache resource and accessing it from your compute instances. Amazon File Cache can link to an Amazon Simple Storage Service (Amazon S3) or Network File System (NFS) data repository (but not to both types at the same time). This exercise uses an Amazon S3 bucket as the data repository, and shows how to use your cache to process the data in your Amazon S3 bucket with your file-based applications.

This getting started exercise includes the following steps.

**Topics**
+ [Prerequisites](prerequisites.md)
+ [Step 1: Create your cache](getting-started-step1.md)
+ [Step 2: Install and configure the Lustre client on your instance before mounting your cache](getting-started-step2.md)
+ [Step 3: Run your analysis](getting-started-step3.md)
+ [Step 4: Clean up resources](getting-started-step4.md)

# Prerequisites
<a name="prerequisites"></a>

To perform this getting started exercise, you'll need the following:
+ An AWS account with the permissions necessary to create an Amazon File Cache and an Amazon Elastic Compute Cloud (Amazon EC2) instance. For more information, see [Setting up](setting-up.md).
+ Enough available IP addresses in your VPC subnet. Each cache requires four IP addresses for the metadata servers (MDS) and one IP address for each object storage server (OSS). Caches are provisioned with 2.4 TiB of storage per OSS.
+ An Amazon EC2 instance running a supported Linux release in your virtual private cloud (VPC) based on the Amazon VPC service. You'll install the Lustre client on this Amazon EC2 instance, and then mount your cache on the Amazon EC2 instance. The Lustre client supports Amazon Linux, Amazon Linux 2, CentOS and Red Hat Enterprise Linux 7.9 and 8.4 through 8.6, Rocky Linux 8.4 through 8.6, and Ubuntu 18.04, 20.04, and 22.04. For this getting started exercise, we'll use Ubuntu 22.04.

  When creating your Amazon EC2 instance for this getting started exercise, keep the following in mind:
  + We recommend that you create your instance in your default VPC.
  + We recommend that you use the default security group when creating your Amazon EC2 instance.
+ An Amazon S3 bucket storing the data for your workload to process. The Amazon S3 bucket will be the linked data repository for your cache.

# Step 1: Create your cache
<a name="getting-started-step1"></a>

Next, you create your cache. For this exercise, the following procedure includes instructions for creating a data repository association (DRA) that links your cache to an Amazon S3 bucket when you create the cache.

1. Open the AWS Management Console at [https://console.aws.amazon.com/fsx/](https://console.aws.amazon.com/fsx/).

1. Choose **Caches** in the navigation pane.

1. On the dashboard, choose **Create cache** to start the cache creation wizard.

   Begin your configuration with the **Cache details** section.  
![\[The Cache details screen, where you can enter cache parameters.\]](http://docs.aws.amazon.com/fsx/latest/FileCacheGuide/images/cache-create-standard.png)

1. For **Cache name**, enter a name for your cache. We recommend using a name that helps you identify and manage the cache later. You can use a maximum of 256 Unicode letters, white space, and numbers, plus these special characters: + - = . _ : /

1. For **Cache storage capacity**, set the amount of storage capacity for your cache, in TiB. Valid values are 1.2 TiB, 2.4 TiB, and increments of 2.4 TiB.

   Additionally, metadata storage capacity of 2.4 TiB is provisioned for all caches.

1. The amount of **Throughput capacity** is calculated by multiplying the cache storage capacity by the throughput tier of 1,000 MBps per TiB. For example, for a 1.2 TiB cache, it's `1200` MBps; for a 9.6 TiB cache, it's `9600` MBps.

   **Throughput capacity** is the sustained speed at which the file server that hosts your cache can serve data.

1. In the **Network & security** section, provide networking and security group information:
   + For **Virtual Private Cloud (VPC)**, choose the Amazon VPC that you want to associate with your cache.
   + For **VPC Security Groups**, the ID for the default security group for your VPC should already be added.
   + For **Subnet**, choose any value from the list of available subnets.

1. In the **Encryption** section, for **Encryption key**, choose the AWS Key Management Service (AWS KMS) encryption key that protects your cache's data at rest.

1. For **Tags - *optional***, you can enter a key and value to add tags to your cache. A tag is a case-sensitive key-value pair that helps you to manage, filter, and search for your cache.

1. Choose **Next**.

1. In the **Data repository associations (DRAs)** section, your cache initially has no DRAs linking it to Amazon S3 or NFS data repositories. For detailed information about linking data repositories to Amazon File Cache, see [To link an S3 bucket or NFS file system while creating a cache (console)](create-linked-repo.md#link-new-repo-console).

   The following instructions describe how to link your cache to an existing Amazon S3 bucket for this getting started exercise. In the **Data repository association** dialog box, provide information for the following fields.

   1. For **Repository type**, choose `S3`.

   1. For **Data repository path**, enter the path of the existing S3 bucket or prefix to associate with your cache (for example, `s3://my-bucket/my-prefix/`).

   1. For **Cache path**, enter the name of a high-level directory (such as `/ns1`) or subdirectory (such as `/ns1/subdir`) within Amazon File Cache to associate with the S3 data repository. The first forward slash in the path is required.
**Note**  
**Cache path** can only be set to root (/) on NFS DRAs when **Subdirectories** is specified. If you specify root (/) as the **Cache path**, you can create only one DRA on the cache.  
**Cache path** can't be set to root (/) for an S3 DRA.

   1. Choose **Add**.

1. Choose **Next**.

1. Review the cache configuration shown on the **Cache summary** page. For your reference, note which cache settings you can modify after the cache is created.

1. Choose **Create cache**.
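As a quick cross-check of the **Throughput capacity** value shown in the wizard, the figures work out to 1,000 MBps for each TiB of cache storage. This small sketch (the per-TiB rate is derived from the 1.2 TiB and 9.6 TiB examples earlier in this procedure) prints the expected throughput for a few cache sizes.

```
# Throughput scales at 1,000 MBps per TiB of cache storage capacity.
for tib in 1.2 2.4 9.6; do
  awk -v t="$tib" 'BEGIN { printf "%.1f TiB -> %.0f MBps\n", t, t * 1000 }'
done
```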

Now that you've created your cache, note its fully qualified domain name and mount name for a later step. You can find the fully qualified domain name and mount name for a cache by choosing the name of the cache in the **Caches** dashboard, and then choosing **Attach**.
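You can also retrieve both values programmatically with the **describe-file-caches** AWS CLI command. The following sketch parses a trimmed, hypothetical sample of a `DescribeFileCaches` response with `sed`, purely to show where the two values live; the cache ID, mount name, and exact field layout here are illustrative assumptions, so verify the field names against the API reference before relying on them.

```
# A trimmed, hypothetical DescribeFileCaches response (real responses contain
# many more fields). DNSName and LustreConfiguration.MountName are the two
# values needed for the mount command in Step 2.
response='{"FileCaches":[{"DNSName":"fc-0123456789abcdef0.fsx.us-east-1.amazonaws.com","LustreConfiguration":{"MountName":"demo1234"}}]}'

dns_name="$(printf '%s' "$response" | sed -n 's/.*"DNSName":"\([^"]*\)".*/\1/p')"
mount_name="$(printf '%s' "$response" | sed -n 's/.*"MountName":"\([^"]*\)".*/\1/p')"

echo "DNS name:   $dns_name"
echo "mount name: $mount_name"
```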

# Step 2: Install and configure the Lustre client on your instance before mounting your cache
<a name="getting-started-step2"></a>

To mount your cache from your Amazon EC2 instance, first install the Lustre 2.12 client.

You can get Lustre packages from the Ubuntu 22.04 AWS Lustre client repository. To validate that the contents of the repository weren't tampered with before or during download, a GNU Privacy Guard (GPG) signature is applied to the repository metadata. Installing packages from the repository fails unless you have the correct public GPG key installed on your system.

**To download the Lustre client onto your Amazon EC2 instance**

1. Open a terminal on your client.

1. Follow these steps to add the Lustre client Ubuntu repository:

   1. If you have not previously registered an AWS Lustre client Ubuntu repository on your client instance, download and install the required public key. Use the following command.

      ```
      wget -O - https://fsx-lustre-client-repo-public-keys.s3.amazonaws.com/fsx-ubuntu-public-key.asc | gpg --dearmor | sudo tee /usr/share/keyrings/fsx-ubuntu-public-key.gpg >/dev/null
      ```

   1. Add the AWS Lustre package repository to your local package manager using the following command.

      ```
      sudo bash -c 'echo "deb [signed-by=/usr/share/keyrings/fsx-ubuntu-public-key.gpg] https://fsx-lustre-client-repo.s3.amazonaws.com/ubuntu jammy main" > /etc/apt/sources.list.d/fsxlustreclientrepo.list && apt-get update'
      ```

1. Determine which kernel is currently running on your client instance, and update it if needed. The AWS Lustre client on Ubuntu 22.04 requires kernel `5.15.0-1020-aws` or later for both x86-based EC2 instances and Arm-based EC2 instances powered by AWS Graviton processors.

   1. Run the following command to determine which kernel is running.

      ```
      uname -r
      ```

   1. Run the following command to update to the latest Ubuntu kernel and Lustre version and then reboot.

      ```
      sudo apt install -y linux-aws lustre-client-modules-aws && sudo reboot
      ```

      If your kernel version is `5.15.0-1020-aws` or later on an x86-based or Graviton-based EC2 instance, and you don't want to update to the latest kernel version, you can install the Lustre client for the current kernel with the following command.

      ```
      sudo apt install -y lustre-client-modules-$(uname -r)
      ```

      This installs the two Lustre packages that are required for mounting and interacting with your cache. Optionally, you can install related packages from the repository, such as the source code package and the test packages.

   1. List all available packages in the repository by using the following command. 

      ```
      sudo apt-cache search ^lustre
      ```

   1. (Optional) If you want system upgrades to also upgrade the Lustre client modules automatically, verify that the `lustre-client-modules-aws` package is installed by using the following command.

      ```
      sudo apt install -y lustre-client-modules-aws
      ```

For information about installing the Lustre client on other Linux distributions, see [Installing the Lustre client](install-lustre-client.md).
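The kernel check in step 3 can also be scripted. This sketch compares the running kernel against the minimum version using a version-aware sort; the `5.15.0-1020-aws` floor follows the `uname -r` naming style, so adjust it for your distribution.

```
# Minimum kernel for the AWS Lustre client on Ubuntu 22.04, in uname -r format.
required="5.15.0-1020-aws"
current="$(uname -r)"

# sort -V orders version strings; if the smallest of the two is the required
# version, the running kernel is at least that new.
oldest="$(printf '%s\n' "$required" "$current" | sort -V | head -n 1)"
if [ "$oldest" = "$required" ]; then
  echo "kernel $current meets the minimum"
else
  echo "kernel $current is older than $required; update before installing"
fi
```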

**To mount your cache**

1. Make a directory for the mount point with the following command.

   ```
   sudo mkdir -p /mnt
   ```

1. Mount the Amazon File Cache to the directory that you created. Use the following command and replace these items:
   + Replace `cache_dns_name` with the actual file cache's Domain Name System (DNS) name.
   + Replace `mountname` with the cache's mount name, which you can get by running the **describe-file-caches** AWS CLI command or the [DescribeFileCaches](https://docs.aws.amazon.com/fsx/latest/APIReference/API_DescribeFileCaches.html) API operation.

   ```
   sudo mount -t lustre -o relatime,flock cache_dns_name@tcp:/mountname /mnt
   ```

    This command mounts your cache with these options:
   +  `relatime` – Maintains `atime` (inode access times) data, but not for each time that a file is accessed. With this option enabled, `atime` data is written to disk only if the file was modified after the `atime` data was last updated (mtime), or if the file was last accessed more than a certain amount of time ago (one day by default). `relatime` is required for [automatic cache eviction](cache-eviction.md#auto-cache-eviction) to work properly.
   +  `flock` – Enables file locking for your cache. If you don't want file locking enabled, use the `mount` command without `flock`.

1. Verify that the mount command succeeded by listing the contents of the directory where you mounted the cache (`/mnt`), using the following command.

   ```
   ls /mnt
   import-path  lustre
   ```

   You can also use the `df` command.

   ```
   df
   Filesystem                      1K-blocks    Used  Available Use% Mounted on
   devtmpfs                         1001808       0    1001808   0% /dev
   tmpfs                            1019760       0    1019760   0% /dev/shm
   tmpfs                            1019760     392    1019368   1% /run
   tmpfs                            1019760       0    1019760   0% /sys/fs/cgroup
   /dev/xvda1                       8376300 1263180    7113120  16% /
   123.456.789.0@tcp:/mountname  3547698816   13824 3547678848   1% /mnt
   tmpfs                             203956       0     203956   0% /run/user/1000
   ```

   The results show the Amazon File Cache resource mounted on /mnt.
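Putting the mount step together as a script can help avoid typos. The sketch below only assembles and prints the mount command for review; the DNS name and mount name are placeholders that you replace with the values you noted in Step 1.

```
# Placeholders; substitute the values you recorded after creating the cache.
cache_dns_name="fc-0123456789abcdef0.fsx.us-east-1.amazonaws.com"
mountname="demo1234"
mount_point="/mnt"

# Assemble the command and print it so it can be reviewed before running.
mount_cmd="sudo mount -t lustre -o relatime,flock ${cache_dns_name}@tcp:/${mountname} ${mount_point}"
echo "$mount_cmd"
```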

# Step 3: Run your analysis
<a name="getting-started-step3"></a>

Now that your cache is created and mounted on a compute instance, you can use it to run your high-performance compute workload. The cache loads data from the Amazon S3 data repository as your workload accesses files.

After you run the workload, you can export the data that you write to your cache back to your Amazon S3 bucket at any time. From a terminal on one of your compute instances, run the following command to export a file to your Amazon S3 bucket.

```
sudo lfs hsm_archive file_name
```

For more information about how to run this command on a folder or large collection of files quickly, see [Exporting files using HSM commands](exporting-files-hsm.md).
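As a dry run of the batch approach, the sketch below walks a scratch directory with `find` and batches files into `xargs` invocations, printing the commands instead of executing them. Drop the leading `echo` (and point the command at files in your mounted cache) to perform real exports.

```
# Dry run in a scratch directory; no lfs commands are actually executed here.
target_dir="$(mktemp -d)"
touch "$target_dir/a.out" "$target_dir/b.out"

# Batch files 50 at a time; removing `echo` would run the real export.
cmds="$(find "$target_dir" -type f -print0 | xargs -0 -n 50 echo sudo lfs hsm_archive)"
echo "$cmds"

rm -rf "$target_dir"
```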

# Step 4: Clean up resources
<a name="getting-started-step4"></a>

After you finish this exercise, we recommend that you follow these steps to clean up your resources and protect your AWS account.

**To clean up resources**

1. On the Amazon EC2 console, terminate your instance. For more information, see [Terminate your instance](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/terminating-instances.html) in the *Amazon EC2 User Guide.*

1. On the AWS Management Console, delete your cache with the following procedure:

   1. In the navigation pane, choose **Caches**.

   1. Choose the cache that you want to delete from the list of caches on the dashboard.

   1. For **Actions**, choose **Delete cache**.

   1. In the dialog box that appears, provide the cache ID to confirm the deletion. Choose **Delete cache**.

1. If you created an Amazon S3 bucket for this exercise, and don't want to preserve the data that you exported, you can now delete it. For more information, see [Deleting a bucket](https://docs.aws.amazon.com/AmazonS3/latest/userguide/delete-bucket.html) in the *Amazon Simple Storage Service User Guide.*