Examples

The following example AWS ParallelCluster configurations demonstrate setups that use the Slurm, Torque, and AWS Batch schedulers.

Note

Starting with version 2.11.5, AWS ParallelCluster doesn't support the use of SGE or Torque schedulers.

Slurm Workload Manager (`slurm`)

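The following example launches a cluster with the `slurm` scheduler. The example configuration launches 1 cluster with 2 job queues. The first queue, `spot`, initially has 2 `t3.micro` Spot Instances available. It can scale up to a maximum of 10 instances, and scales down to a minimum of 1 instance when no jobs have run for 10 minutes (adjustable using the `scaledown_idletime` setting). The second queue, `ondemand`, starts with no instances and can scale up to a maximum of 5 `t3.micro` On-Demand Instances.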


[global]
update_check = true
sanity_check = true
cluster_template = slurm

[aws]
aws_region_name = <your AWS Region>

[vpc public]
master_subnet_id = <your subnet>
vpc_id = <your VPC>

[cluster slurm]
key_name = <your EC2 keypair name>
base_os = alinux2                   # optional, defaults to alinux2
scheduler = slurm
master_instance_type = t3.micro     # optional, defaults to t3.micro
vpc_settings = public
queue_settings = spot,ondemand

[queue spot]
compute_resource_settings = spot_i1
compute_type = spot                 # optional, defaults to ondemand

[compute_resource spot_i1]
instance_type = t3.micro
min_count = 1                       # optional, defaults to 0
initial_count = 2                   # optional, defaults to 0

[queue ondemand]
compute_resource_settings = ondemand_i1

[compute_resource ondemand_i1]
instance_type = t3.micro
max_count = 5                       # optional, defaults to 10
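
The queue limits described above can be checked by reading the configuration back. The following is a minimal sketch, assuming the file can be parsed with Python's standard `configparser` module and that inline `#` comments should be stripped; the function name `summarize_queues` is hypothetical, and the fallback values are the defaults stated in the comments of the example configuration. It is not how AWS ParallelCluster itself validates the file.

```python
import configparser

# Trimmed copy of the example above; only the sections the sketch reads.
CONFIG_TEXT = """
[cluster slurm]
scheduler = slurm
queue_settings = spot,ondemand

[queue spot]
compute_resource_settings = spot_i1
compute_type = spot                 # optional, defaults to ondemand

[compute_resource spot_i1]
instance_type = t3.micro
min_count = 1                       # optional, defaults to 0
initial_count = 2                   # optional, defaults to 0

[queue ondemand]
compute_resource_settings = ondemand_i1

[compute_resource ondemand_i1]
instance_type = t3.micro
max_count = 5                       # optional, defaults to 10
"""


def summarize_queues(config_text, cluster="slurm"):
    """Return {queue: {compute_resource: settings}} for one cluster section."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)

    summary = {}
    queue_names = parser[f"cluster {cluster}"]["queue_settings"].split(",")
    for queue in (name.strip() for name in queue_names):
        queue_section = parser[f"queue {queue}"]
        resources = queue_section["compute_resource_settings"].split(",")
        summary[queue] = {}
        for resource in (name.strip() for name in resources):
            res = parser[f"compute_resource {resource}"]
            summary[queue][resource] = {
                # Fallbacks mirror the defaults noted in the example comments.
                "compute_type": queue_section.get("compute_type", fallback="ondemand"),
                "instance_type": res["instance_type"],
                "min_count": res.getint("min_count", fallback=0),
                "initial_count": res.getint("initial_count", fallback=0),
                "max_count": res.getint("max_count", fallback=10),
            }
    return summary


if __name__ == "__main__":
    for queue, resources in summarize_queues(CONFIG_TEXT).items():
        for name, settings in resources.items():
            print(queue, name, settings)
```

For the `spot` queue, the sketch reports `min_count` 1, `initial_count` 2, and the default `max_count` of 10. For the `ondemand` queue, it reports the defaults of 0 for `min_count` and `initial_count`, and a `max_count` of 5.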

Son of Grid Engine (`sge`) and Torque Resource Manager (`torque`)

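Note

This example only applies to AWS ParallelCluster versions up to and including version 2.11.4. Starting with version 2.11.5, AWS ParallelCluster doesn't support the use of SGE or Torque schedulers.

The following example launches a cluster with the `torque` or `sge` scheduler. To use SGE, change `scheduler = torque` to `scheduler = sge`. The example configuration allows a maximum of 5 concurrent nodes, and scales down to 2 nodes when no jobs have run for 10 minutes.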


[global]
update_check = true
sanity_check = true
cluster_template = torque

[aws]
aws_region_name = <your AWS Region>

[vpc public]
master_subnet_id = <your subnet>
vpc_id = <your VPC>

[cluster torque]
key_name = <your EC2 keypair name>
base_os = alinux2                   # optional, defaults to alinux2
scheduler = torque                  # optional, defaults to sge
master_instance_type = t3.micro     # optional, defaults to t3.micro
vpc_settings = public
initial_queue_size = 2              # optional, defaults to 0
maintain_initial_size = true        # optional, defaults to false
max_queue_size = 5                  # optional, defaults to 10
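
The node limits in this example follow from three settings: `initial_queue_size`, `maintain_initial_size`, and `max_queue_size`. The sketch below shows one way to derive the lower and upper bounds from them. It assumes Python's standard `configparser` can read the file and that the fleet floor equals `initial_queue_size` when `maintain_initial_size` is true and 0 otherwise; the function name `scaling_bounds` is hypothetical, and the fallback values are the defaults from the example comments.

```python
import configparser

# Trimmed copy of the example above; only the settings the sketch reads.
CONFIG_TEXT = """
[cluster torque]
scheduler = torque                  # optional, defaults to sge
initial_queue_size = 2              # optional, defaults to 0
maintain_initial_size = true        # optional, defaults to false
max_queue_size = 5                  # optional, defaults to 10
"""


def scaling_bounds(config_text, cluster):
    """Return (minimum_nodes, maximum_nodes) for one cluster section."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)
    section = parser[f"cluster {cluster}"]

    initial = section.getint("initial_queue_size", fallback=0)
    maintain = section.getboolean("maintain_initial_size", fallback=False)
    maximum = section.getint("max_queue_size", fallback=10)

    # Assumption: the initial size acts as the floor only when it is maintained.
    minimum = initial if maintain else 0
    return minimum, maximum


if __name__ == "__main__":
    print(scaling_bounds(CONFIG_TEXT, "torque"))  # prints (2, 5)
```

With `maintain_initial_size = true`, the result matches the description above: the cluster scales between 2 and 5 nodes.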

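Note

Starting with version 2.11.5, AWS ParallelCluster doesn't support the use of SGE or Torque schedulers. If you use versions that include these schedulers, you can continue using them, but they aren't eligible for future updates or troubleshooting support from the AWS service and AWS Support teams.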

AWS Batch (`awsbatch`)

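The following example launches a cluster with the `awsbatch` scheduler. It's set to select the best-fitting instance type based on your job resource needs.

The example configuration allows a maximum of 40 concurrent vCPUs, and scales down to zero when no jobs have run for 10 minutes (adjustable using the `scaledown_idletime` setting).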


[global]
update_check = true
sanity_check = true
cluster_template = awsbatch

[aws]
aws_region_name = <your AWS Region>

[vpc public]
master_subnet_id = <your subnet>
vpc_id = <your VPC>

[cluster awsbatch]
scheduler = awsbatch
compute_instance_type = optimal # optional, defaults to optimal
min_vcpus = 0                   # optional, defaults to 0
desired_vcpus = 0               # optional, defaults to 4
max_vcpus = 40                  # optional, defaults to 20
base_os = alinux2               # optional, defaults to alinux2, controls the base_os of
                                # the head node and the docker image for the compute fleet
key_name = <your EC2 keypair name>
vpc_settings = public
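
The vCPU limits in this example can be read back and sanity-checked before launching. The following is a minimal sketch, assuming Python's standard `configparser` can read the file; the function name `vcpu_bounds` is hypothetical, the fallback values are the defaults from the example comments, and the ordering check (`min_vcpus <= desired_vcpus <= max_vcpus`) is an assumed sanity rule for illustration, not a description of the validation that AWS ParallelCluster performs.

```python
import configparser

# Trimmed copy of the example above; only the settings the sketch reads.
CONFIG_TEXT = """
[cluster awsbatch]
scheduler = awsbatch
compute_instance_type = optimal # optional, defaults to optimal
min_vcpus = 0                   # optional, defaults to 0
desired_vcpus = 0               # optional, defaults to 4
max_vcpus = 40                  # optional, defaults to 20
"""


def vcpu_bounds(config_text, cluster):
    """Return (min_vcpus, desired_vcpus, max_vcpus) for one cluster section."""
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    parser.read_string(config_text)
    section = parser[f"cluster {cluster}"]

    minimum = section.getint("min_vcpus", fallback=0)
    desired = section.getint("desired_vcpus", fallback=4)
    maximum = section.getint("max_vcpus", fallback=20)

    # Assumed sanity rule: the desired size must sit between the two limits.
    if not minimum <= desired <= maximum:
        raise ValueError(
            f"expected min_vcpus <= desired_vcpus <= max_vcpus, "
            f"got {minimum}, {desired}, {maximum}"
        )
    return minimum, desired, maximum


if __name__ == "__main__":
    print(vcpu_bounds(CONFIG_TEXT, "awsbatch"))  # prints (0, 0, 40)
```

For the example configuration, the sketch reports a floor of 0 vCPUs, a desired size of 0, and a ceiling of 40, which matches the description above.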
