Cluster-Specific Configurations
SageMaker HyperPod lets you run training jobs in different cluster environments, each with its own configuration requirements and setup process. This section outlines the steps and configurations needed to run training jobs with SageMaker HyperPod Slurm orchestration, SageMaker HyperPod Amazon Elastic Kubernetes Service (Amazon EKS) orchestration, and SageMaker training jobs. Knowing which configuration applies to your environment is a prerequisite for running distributed training there.
You can use a recipe in the following cluster environments:
- SageMaker HyperPod Slurm orchestration
- SageMaker HyperPod Amazon Elastic Kubernetes Service orchestration
- SageMaker training jobs
To launch a training job in a cluster, set up the corresponding environment and install the cluster configuration that matches it.
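As a rough illustration of choosing one of the three environments listed above before launching, the following Python sketch maps a short key to an environment and rejects anything else. It is not part of any SageMaker API: the `ClusterEnvironment` enum, the `resolve_cluster_environment` helper, and the short keys `slurm`, `k8s`, and `sm_jobs` are hypothetical names used here only for illustration.

```python
from enum import Enum


class ClusterEnvironment(Enum):
    """The three environments named in this section (hypothetical enum)."""

    SLURM = "SageMaker HyperPod Slurm orchestration"
    EKS = "SageMaker HyperPod Amazon Elastic Kubernetes Service orchestration"
    TRAINING_JOBS = "SageMaker training jobs"


# Assumed short keys for illustration only; check your recipe configuration
# for the values it actually expects.
_KEYS = {
    "slurm": ClusterEnvironment.SLURM,
    "k8s": ClusterEnvironment.EKS,
    "sm_jobs": ClusterEnvironment.TRAINING_JOBS,
}


def resolve_cluster_environment(name: str) -> ClusterEnvironment:
    """Return the environment for a short key, or raise ValueError."""
    key = name.strip().lower()
    if key not in _KEYS:
        valid = ", ".join(sorted(_KEYS))
        raise ValueError(f"Unknown cluster environment {name!r}; expected one of: {valid}")
    return _KEYS[key]


if __name__ == "__main__":
    env = resolve_cluster_environment("slurm")
    print(f"Selected environment: {env.value}")
```

Failing early on an unrecognized key keeps a mistyped environment name from surfacing later as a confusing launch error.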