

# Choosing a DLAMI instance type
<a name="instance-select"></a>

Consider the following when choosing an instance type for a DLAMI.
+ If you're new to deep learning, then an instance with a single GPU might suit your needs.
+ If you're budget conscious, then you can use CPU-only instances.
+ If you're looking to optimize high performance and cost efficiency for deep learning model inference, then you can use instances with AWS Inferentia chips.
+ If you're looking for a high performance GPU instance with an Arm64-based CPU architecture, then you can use the G5g instance type.
+ If you're interested in running a pretrained model for inference and predictions, then you can attach an [Amazon Elastic Inference](https://docs.aws.amazon.com/elastic-inference/latest/developerguide/what-is-ei.html) accelerator to your Amazon EC2 instance. Amazon Elastic Inference gives you access to an accelerator with a fraction of a GPU.
+ For high-volume inference services, a single CPU instance with a lot of memory, or a cluster of such instances, might be a better solution. 
+ If you're using a large model with a lot of data or a high batch size, then you need a larger instance with more memory. You can also distribute your model across a cluster of GPUs. If you decrease your batch size, an instance with less memory might work for you, though this can affect your accuracy and training speed.
+ If you're interested in running machine learning applications that use the NVIDIA Collective Communications Library (NCCL) and require high levels of inter-node communication at scale, you might want to use [Elastic Fabric Adapter (EFA)](https://docs.aws.amazon.com/dlami/latest/devguide/tutorial-efa-launching.html).
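The guidance above can be sketched as a simple decision helper. The function, its parameters, and the returned family names are illustrative assumptions for this sketch, not an AWS API:

```python
def recommend_instance_family(workload: str, gpu_arch: str = "x86_64",
                              budget_conscious: bool = False) -> str:
    """Toy mapping from the guidance above to an EC2 instance family.

    All branches and return values are illustrative only; check the
    EC2 instance type pages for current offerings and pricing.
    """
    if workload == "inference" and not budget_conscious:
        # AWS Inferentia (Inf2) targets cost-efficient, high-throughput inference.
        return "inf2"
    if budget_conscious:
        # CPU-only instances are the cheapest way to learn or serve small models.
        return "c5"
    if gpu_arch == "arm64":
        # G5g pairs NVIDIA GPUs with Arm64-based Graviton2 CPUs.
        return "g5g"
    # Default: a GPU family for training; a single-GPU size suits beginners.
    return "p5"

print(recommend_instance_family("inference"))                        # inf2
print(recommend_instance_family("training", budget_conscious=True))  # c5
print(recommend_instance_family("training", gpu_arch="arm64"))       # g5g
```

Real selection also depends on memory, Region availability, and price, so treat this as a starting point rather than a rule.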

For more detail on instances, see [EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/).

The following topics provide information about instance type considerations. 

**Important**  
The Deep Learning AMIs include drivers, software, or toolkits developed, owned, or provided by NVIDIA Corporation. You agree to use these NVIDIA drivers, software, or toolkits only on Amazon EC2 instances that include NVIDIA hardware.

**Topics**
+ [Pricing for the DLAMI](#pricing)
+ [DLAMI Region Availability](#region)
+ [Recommended GPU Instances](gpu.md)
+ [Recommended CPU Instances](cpu.md)
+ [Recommended Inferentia Instances](inferentia.md)
+ [Recommended Trainium Instances](trainium.md)

## Pricing for the DLAMI
<a name="pricing"></a>

The deep learning frameworks included in the DLAMI are free, and each has its own open-source licenses. Although the software included in the DLAMI is free, you still have to pay for the underlying Amazon EC2 instance hardware.

Some Amazon EC2 instance types are eligible for the free tier, and it is possible to run the DLAMI on one of them. Using the DLAMI is then entirely free as long as you only use that instance's capacity. If you need a more powerful instance with more CPU cores, more disk space, more RAM, or one or more GPUs, then you need an instance type outside the free tier.
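Because you pay only for the underlying instance, estimating cost is simple arithmetic. The hourly rate below is a made-up placeholder, not a real EC2 price:

```python
def monthly_on_demand_cost(hourly_rate_usd: float, hours_per_day: float = 24,
                           days: int = 30) -> float:
    """Estimate on-demand cost for an instance over a month.

    hourly_rate_usd is whatever the EC2 pricing page lists for your
    instance type and Region; the value used below is hypothetical.
    """
    return hourly_rate_usd * hours_per_day * days

# Hypothetical $3.06/hour GPU instance, running 8 hours a day:
print(f"${monthly_on_demand_cost(3.06, hours_per_day=8):.2f}")  # $734.40
```

Stopping instances when you aren't using them, or using Spot or Savings Plans pricing, can reduce this substantially.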

For more information about instance choice and pricing, see [Amazon EC2 pricing](https://aws.amazon.com/ec2/pricing/).

## DLAMI Region Availability
<a name="region"></a>

Each Region supports a different range of instance types, and an instance type often has a slightly different cost in different Regions. DLAMIs are not available in every Region, but you can copy a DLAMI to the Region of your choice. For more information, see [Copying an AMI](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/CopyingAMIs.html). Check the Region selection list and be sure to pick a Region that's close to you or your customers. If you plan to use more than one DLAMI and potentially create a cluster, be sure to use the same Region for all of the nodes in the cluster.

For more information about Regions, see [Amazon EC2 service endpoints](https://docs.aws.amazon.com/general/latest/gr/ec2-service.html#ec2_region).

**Next Up**  
[Recommended GPU Instances](gpu.md)

# Recommended GPU Instances
<a name="gpu"></a>

We recommend a GPU instance for most deep learning purposes. Training new models is faster on a GPU instance than on a CPU instance. Performance typically scales sub-linearly as you add GPUs, whether within a multi-GPU instance or across many GPU instances in distributed training.
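"Sub-linear" scaling means that doubling the GPU count less than doubles throughput. A quick way to quantify it (the throughput numbers below are illustrative, not benchmarks):

```python
def scaling_efficiency(throughput_1gpu: float, throughput_ngpu: float,
                       n_gpus: int) -> float:
    """Fraction of ideal linear speedup actually achieved across n_gpus."""
    speedup = throughput_ngpu / throughput_1gpu
    return speedup / n_gpus

# Illustrative: 1 GPU trains 100 images/sec; 8 GPUs train 680 images/sec.
eff = scaling_efficiency(100.0, 680.0, 8)
print(f"{eff:.0%} scaling efficiency")  # 85% scaling efficiency
```

Measuring this on your own workload tells you whether adding more GPUs is worth the cost.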

The following instance types support the DLAMI. For information about GPU instance type options and their uses, see [EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/) and select **Accelerated Computing**.

**Note**  
The size of your model should be a factor in choosing an instance. If your model exceeds an instance's available RAM, choose a different instance type with enough memory for your application. 
+ [Amazon EC2 P6-B200 Instances](https://aws.amazon.com/ec2/instance-types/p6/) have up to 8 NVIDIA Blackwell B200 GPUs.
+ [Amazon EC2 P6-B300 Instances](https://aws.amazon.com/ec2/instance-types/p6/) have up to 8 NVIDIA Blackwell B300 GPUs.
+ [Amazon EC2 P6e-GB200 Instances](https://aws.amazon.com/ec2/instance-types/p6/) have up to 4 NVIDIA Blackwell GB200 GPUs.
+ [Amazon EC2 P5e Instances](https://aws.amazon.com/ec2/instance-types/p5/) have up to 8 NVIDIA H200 Tensor Core GPUs.
+ [Amazon EC2 P5 Instances](https://aws.amazon.com/ec2/instance-types/p5/) have up to 8 NVIDIA H100 Tensor Core GPUs.
+ [Amazon EC2 P4 Instances](https://aws.amazon.com/ec2/instance-types/p4/) have up to 8 NVIDIA A100 Tensor Core GPUs.
+ [Amazon EC2 P3 Instances](https://aws.amazon.com/ec2/instance-types/p3/) have up to 8 NVIDIA Tesla V100 GPUs.
+ [Amazon EC2 G3 Instances](https://aws.amazon.com/ec2/instance-types/g3/) have up to 4 NVIDIA Tesla M60 GPUs.
+ [Amazon EC2 G4 Instances](https://aws.amazon.com/ec2/instance-types/g4/) have up to 4 NVIDIA T4 GPUs.
+ [Amazon EC2 G5 Instances](https://aws.amazon.com/ec2/instance-types/g5/) have up to 8 NVIDIA A10G GPUs.
+ [Amazon EC2 G6 Instances](https://aws.amazon.com/ec2/instance-types/g6/) have up to 8 NVIDIA L4 GPUs.
+ [Amazon EC2 G6e Instances](https://aws.amazon.com/ec2/instance-types/g6e/) have up to 8 NVIDIA L40S Tensor Core GPUs.
+ [Amazon EC2 G5g Instances](https://aws.amazon.com/ec2/instance-types/g5g/) have Arm64-based [AWS Graviton2 processors](https://aws.amazon.com/ec2/graviton/).
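The note above about model size can be made concrete with a rough rule of thumb: weight memory is parameter count times bytes per parameter, plus overhead for activations and workspace (training with an optimizer needs several times more). A sketch, with made-up sizes:

```python
def model_memory_gib(n_params: float, bytes_per_param: int = 2,
                     overhead_factor: float = 1.2) -> float:
    """Rough memory footprint of a model's weights in GiB.

    bytes_per_param: 2 for fp16/bf16, 4 for fp32. overhead_factor is a
    crude allowance for activations and workspace; the 1.2 default is
    an assumption for inference only, not a measured value.
    """
    return n_params * bytes_per_param * overhead_factor / 2**30

# Does a hypothetical 7B-parameter model fit on a 24 GiB GPU for fp16 inference?
needed = model_memory_gib(7e9)
print(f"{needed:.1f} GiB needed; fits in 24 GiB: {needed <= 24}")
```

If the estimate exceeds a single GPU's memory, pick an instance with larger GPUs, shard the model across multiple GPUs, or reduce precision.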

DLAMI instances provide tooling to monitor and optimize your GPU processes. For more information about monitoring your GPU processes, see [GPU Monitoring and Optimization](tutorial-gpu.md).

For specific tutorials on working with G5g instances, see [The ARM64 DLAMI](tutorial-arm64.md).

**Next Up**  
[Recommended CPU Instances](cpu.md)

# Recommended CPU Instances
<a name="cpu"></a>

Whether you're on a budget, learning about deep learning, or just want to run a prediction service, you have many affordable options in the CPU category. Some frameworks take advantage of Intel's MKL-DNN, which speeds up training and inference on C5 CPU instance types (not available in all Regions). For information about CPU instance types, see [EC2 Instance Types](https://aws.amazon.com/ec2/instance-types/) and select **Compute Optimized**.

**Note**  
The size of your model should be a factor in choosing an instance. If your model exceeds an instance's available RAM, choose a different instance type with enough memory for your application. 
+ [Amazon EC2 C5 Instances](https://aws.amazon.com/ec2/instance-types/c5/) have up to 72 Intel vCPUs. C5 instances excel at scientific modeling, batch processing, distributed analytics, high-performance computing (HPC), and machine and deep learning inference.
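If you're counting on MKL-DNN (now called oneDNN) acceleration, you can check whether your framework build enables it. The sketch below uses PyTorch's `torch.backends.mkldnn` flag and falls back gracefully if PyTorch isn't installed:

```python
def onednn_status() -> str:
    """Report whether the installed PyTorch build can use oneDNN (MKL-DNN)."""
    try:
        import torch
        return f"oneDNN available: {torch.backends.mkldnn.is_available()}"
    except ImportError:
        return "PyTorch not installed"

print(onednn_status())
```

On a DLAMI with a PyTorch environment activated, this should report that oneDNN is available; other frameworks expose similar build flags.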

**Next Up**  
[Recommended Inferentia Instances](inferentia.md)

# Recommended Inferentia Instances
<a name="inferentia"></a>

AWS Inferentia instances are designed to provide high performance and cost efficiency for deep learning model inference workloads. Specifically, Inf2 instance types use AWS Inferentia chips and the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/), which is integrated with popular machine learning frameworks such as TensorFlow and PyTorch.

Customers can use Inf2 instances to run large scale machine learning inference applications such as search, recommendation engines, computer vision, speech recognition, natural language processing, personalization, and fraud detection, at the lowest cost in the cloud.

**Note**  
The size of your model should be a factor in choosing an instance. If your model exceeds an instance's available RAM, choose a different instance type with enough memory for your application. 
+ [Amazon EC2 Inf2 Instances](https://aws.amazon.com/ec2/instance-types/inf2/) have up to 16 AWS Inferentia chips and 100 Gbps of networking throughput.

For more information about getting started with AWS Inferentia DLAMIs, see [The AWS Inferentia Chip With DLAMI](tutorial-inferentia.md).

**Next Up**  
[Recommended Trainium Instances](trainium.md)

# Recommended Trainium Instances
<a name="trainium"></a>

AWS Trainium instances are designed to provide high performance and cost efficiency for deep learning model training workloads. Specifically, Trn1 instance types use AWS Trainium chips and the [AWS Neuron SDK](https://awsdocs-neuron.readthedocs-hosted.com/en/latest/), which is integrated with popular machine learning frameworks such as TensorFlow and PyTorch.

Customers can use Trn1 instances to run large scale machine learning training applications such as search, recommendation engines, computer vision, speech recognition, natural language processing, personalization, and fraud detection, at a low cost in the cloud.

**Note**  
The size of your model should be a factor in choosing an instance. If your model exceeds an instance's available RAM, choose a different instance type with enough memory for your application. 
+ [Amazon EC2 Trn1 Instances](https://aws.amazon.com/ec2/instance-types/trn1/) have up to 16 AWS Trainium chips and 100 Gbps of networking throughput.