What particular configurations HyperPod manages in Slurm configuration files
When you create a Slurm cluster on HyperPod, the HyperPod agent sets up the
slurm.conf and gres.conf files at /opt/slurm/etc/ to manage the Slurm cluster
based on your HyperPod cluster creation request and lifecycle scripts. The
following list shows which specific parameters the HyperPod agent handles and
overwrites.
Important
We strongly recommend that you do not change these parameters managed by HyperPod.
- In slurm.conf, HyperPod sets up the following basic parameters: ClusterName, SlurmctldHost, PartitionName, and NodeName.
  Also, to enable the Auto-resume functionality, HyperPod requires the TaskPlugin and SchedulerParameters parameters set as follows. The HyperPod agent sets up these two parameters with the required values by default.
      TaskPlugin=task/none
      SchedulerParameters=permit_job_expansion
- In gres.conf, HyperPod manages NodeName for GPU nodes.
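To confirm that a manual edit has not disturbed the two Auto-resume values listed above, the contents of slurm.conf can be checked with a short script. The following is a minimal sketch, not part of HyperPod: the function names parse_slurm_conf and check_auto_resume_settings are hypothetical, and the parser is a simplification that records only the first Key=Value pair on each line and compares keys case-insensitively.

```python
# Values the section above says HyperPod sets for Auto-resume.
REQUIRED_TASK_PLUGIN = "task/none"
REQUIRED_SCHEDULER_PARAMETER = "permit_job_expansion"


def parse_slurm_conf(text):
    """Return a dict of settings from slurm.conf text (hypothetical helper).

    Simplification: only the first Key=Value pair on each line is kept,
    and a later line with the same key overwrites an earlier one.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip()
    return settings


def check_auto_resume_settings(text):
    """Return a list of problems; an empty list means both values are present."""
    settings = parse_slurm_conf(text)
    problems = []

    task_plugin = settings.get("taskplugin")
    if task_plugin != REQUIRED_TASK_PLUGIN:
        problems.append(
            f"TaskPlugin is {task_plugin!r}, expected {REQUIRED_TASK_PLUGIN!r}"
        )

    # SchedulerParameters is a comma-separated list, so test for membership.
    scheduler_parameters = settings.get("schedulerparameters", "")
    options = [option.strip() for option in scheduler_parameters.split(",")]
    if REQUIRED_SCHEDULER_PARAMETER not in options:
        problems.append(
            f"SchedulerParameters does not include {REQUIRED_SCHEDULER_PARAMETER!r}"
        )
    return problems


if __name__ == "__main__":
    # Illustrative input only; ClusterName=demo is a made-up value.
    sample = (
        "ClusterName=demo\n"
        "TaskPlugin=task/none\n"
        "SchedulerParameters=permit_job_expansion\n"
    )
    print(check_auto_resume_settings(sample))
```

To check a real cluster, read the text of /opt/slurm/etc/slurm.conf and pass it to check_auto_resume_settings. The script only reads the file, which is consistent with the recommendation not to change the parameters that HyperPod manages.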