Create a SageMaker HyperPod cluster on training plans using the SageMaker API or AWS CLI

To use SageMaker training plans for your Amazon SageMaker HyperPod cluster, specify the ARN of the training plan you want to use in the TrainingPlanArn parameter of the ClusterInstanceGroupSpecification when calling the CreateCluster API operation.

Ensure that the VpcConfig of your cluster configuration includes a subnet in the Availability Zone (AZ) designated by your plan. You can retrieve the AvailabilityZone of a training plan from the response of a DescribeTrainingPlan API call.
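The lookup described above can be sketched with two small shell helpers. This is a minimal sketch, not an official procedure: the function names plan_az and subnets_in_az are hypothetical, and the query path ReservedCapacitySummaries[0].AvailabilityZone assumes the Availability Zone is reported on the plan's first reserved capacity entry. Check the output of describe-training-plan for your own plan before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the JMESPath query path are assumptions.

# Print the Availability Zone reported for a training plan.
# $1: training plan name
plan_az() {
  aws sagemaker describe-training-plan \
    --training-plan-name "$1" \
    --query 'ReservedCapacitySummaries[0].AvailabilityZone' \
    --output text
}

# Print the IDs of the subnets of a VPC that live in a given Availability Zone.
# $1: VPC ID, $2: Availability Zone name
subnets_in_az() {
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$1" "Name=availability-zone,Values=$2" \
    --query 'Subnets[].SubnetId' \
    --output text
}

# Example usage (placeholders, not real resources):
#   az="$(plan_az training-plan-name)"
#   subnets_in_az vpc-id "$az"
```

At least one of the subnet IDs printed by the second helper should then appear in the cluster's VpcConfig.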

The following sample illustrates how to create a new SageMaker HyperPod cluster and provide an instance group with a training plan in the --instance-groups attribute of the create-cluster AWS CLI command.

# Create a cluster
# Replace the placeholder values (cluster-name, source_s3_uri,
# customer_account_id, execution_role, training_plan_arn) with your own.
aws sagemaker create-cluster \
    --cluster-name cluster-name \
    --instance-groups '[
      {
        "InstanceCount": 1,
        "InstanceGroupName": "controller-nodes",
        "InstanceType": "ml.t3.xlarge",
        "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
        "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
        "ThreadsPerCore": 1
      },
      {
        "InstanceCount": 2,
        "InstanceGroupName": "worker-nodes",
        "InstanceType": "ml.p4d.24xlarge",
        "LifeCycleConfig": {"SourceS3Uri": "source_s3_uri", "OnCreate": "on_create.sh"},
        "ExecutionRole": "arn:aws:iam::customer_account_id:role/execution_role",
        "ThreadsPerCore": 1,
        "TrainingPlanArn": "training_plan_arn"
      }
    ]'

For information about how to create a HyperPod cluster using the AWS CLI, see create-cluster.
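Hand-written JSON inside a shell string is easy to get wrong (unquoted values, trailing commas), so one option is to generate each instance group from variables. The following is a minimal sketch under stated assumptions: worker_group_json is a hypothetical helper, the group name, instance type, and on_create.sh script are taken from the sample above, and the values passed in are assumed to contain no characters that need JSON escaping (true for typical ARNs and S3 URIs).

```shell
#!/usr/bin/env bash
# Hypothetical helper: emits one instance-group JSON object that is
# backed by a training plan. Values are inserted as-is, without JSON escaping.
# $1: instance count, $2: training plan ARN, $3: execution role ARN, $4: S3 URI
worker_group_json() {
  local count="$1" plan_arn="$2" role_arn="$3" s3_uri="$4"
  printf '{"InstanceCount": %d, "InstanceGroupName": "worker-nodes", ' "$count"
  printf '"InstanceType": "ml.p4d.24xlarge", '
  printf '"LifeCycleConfig": {"SourceS3Uri": "%s", "OnCreate": "on_create.sh"}, ' "$s3_uri"
  printf '"ExecutionRole": "%s", "ThreadsPerCore": 1, ' "$role_arn"
  printf '"TrainingPlanArn": "%s"}\n' "$plan_arn"
}

# Example usage (placeholders, not real resources):
#   groups="[$(worker_group_json 2 "$TRAINING_PLAN_ARN" "$ROLE_ARN" "$S3_URI")]"
#   aws sagemaker create-cluster --cluster-name cluster-name --instance-groups "$groups"
```

Building the value in a variable and passing it in double quotes keeps the JSON free of the shell line-continuation characters that a single-quoted string would otherwise carry literally.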

After creating the cluster, you can verify that your instance group was properly assigned capacity from the training plan by calling the DescribeCluster API.

aws sagemaker describe-cluster --cluster-name cluster-name
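To focus the output on the training plan assignment, the response can be filtered with the AWS CLI --query option. This is a sketch rather than a required step: the helper name group_plan_status is hypothetical, and the query assumes that each entry under InstanceGroups in the DescribeCluster response carries TrainingPlanArn and TrainingPlanStatus fields.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lists each instance group with its training plan
# ARN and status. The field names in the query are assumptions about the
# DescribeCluster response shape.
# $1: cluster name
group_plan_status() {
  aws sagemaker describe-cluster \
    --cluster-name "$1" \
    --query 'InstanceGroups[].[InstanceGroupName, TrainingPlanArn, TrainingPlanStatus]' \
    --output table
}

# Example usage (placeholder cluster name):
#   group_plan_status cluster-name
```

Instance groups created without a training plan, such as controller-nodes in the sample above, would be expected to show empty values in the plan columns.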