UpdateCluster
Updates a SageMaker HyperPod cluster.
Request Syntax
{
   "ClusterName": "string",
   "InstanceGroups": [
      {
         "ExecutionRole": "string",
         "InstanceCount": number,
         "InstanceGroupName": "string",
         "InstanceStorageConfigs": [
            { ... }
         ],
         "InstanceType": "string",
         "LifeCycleConfig": {
            "OnCreate": "string",
            "SourceS3Uri": "string"
         },
         "OnStartDeepHealthChecks": [ "string" ],
         "OverrideVpcConfig": {
            "SecurityGroupIds": [ "string" ],
            "Subnets": [ "string" ]
         },
         "ThreadsPerCore": number
      }
   ],
   "NodeRecovery": "string"
}
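As an illustration of the request shape above, the following Python sketch assembles a request body as a plain dictionary. The helper name `build_update_cluster_request` and every value in `example_group` (group name, instance type, role ARN, S3 URI, script name) are hypothetical placeholders, and the choice of which instance group fields to include is an assumption, since this page does not state which of them are required.

```python
def build_update_cluster_request(cluster_name, instance_groups, node_recovery=None):
    """Assemble an UpdateCluster request body as a plain dict (hypothetical helper)."""
    request = {
        "ClusterName": cluster_name,
        "InstanceGroups": list(instance_groups),
    }
    # NodeRecovery is optional, so it is only included when supplied.
    if node_recovery is not None:
        request["NodeRecovery"] = node_recovery
    return request


# Placeholder values for illustration only; none of them come from this page.
example_group = {
    "InstanceGroupName": "worker-group",
    "InstanceType": "ml.g5.xlarge",
    "InstanceCount": 2,
    "ExecutionRole": "arn:aws:iam::123456789012:role/ExampleHyperPodRole",
    "LifeCycleConfig": {
        "SourceS3Uri": "s3://example-bucket/lifecycle/",
        "OnCreate": "on_create.sh",
    },
    "ThreadsPerCore": 1,
}

request = build_update_cluster_request("example-cluster", [example_group], "Automatic")
```

With the AWS SDK for Python installed and credentials configured, a dictionary like this could be passed as keyword arguments, for example `boto3.client("sagemaker").update_cluster(**request)`.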
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- ClusterName
Specify the name of the SageMaker HyperPod cluster you want to update.
Type: String
Length Constraints: Maximum length of 256.
Pattern: ^(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})$
Required: Yes
- InstanceGroups
Specify the instance groups to update.
Type: Array of ClusterInstanceGroupSpecification objects
Array Members: Minimum number of 1 item. Maximum number of 100 items.
Required: Yes
- NodeRecovery
The node recovery mode to be applied to the SageMaker HyperPod cluster.
Type: String
Valid Values: Automatic | None
Required: No
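The constraints listed for the request parameters can be checked on the client side before a call is made. The sketch below is one possible pre-flight check, not part of the service or any SDK: the function name `validate_update_cluster_request` is hypothetical, the pattern and limits are copied from the parameter descriptions above, and `re.fullmatch` is used so that the whole value must match one of the two alternatives in the pattern.

```python
import re

# Pattern and limits copied from the ClusterName, InstanceGroups, and
# NodeRecovery descriptions above.
CLUSTER_NAME_PATTERN = re.compile(
    r"^(arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12})"
    r"|([a-zA-Z0-9](-*[a-zA-Z0-9]){0,62})$"
)
VALID_NODE_RECOVERY = {"Automatic", "None"}


def validate_update_cluster_request(request: dict) -> list:
    """Return a list of constraint violations; an empty list means none were found."""
    problems = []

    name = request.get("ClusterName")
    if not isinstance(name, str):
        problems.append("ClusterName is required and must be a string")
    elif len(name) > 256:
        problems.append("ClusterName exceeds the maximum length of 256")
    elif CLUSTER_NAME_PATTERN.fullmatch(name) is None:
        problems.append("ClusterName does not match the documented pattern")

    groups = request.get("InstanceGroups")
    if not isinstance(groups, list) or not 1 <= len(groups) <= 100:
        problems.append("InstanceGroups is required and must contain 1 to 100 items")

    # NodeRecovery is optional; validate it only when present.
    recovery = request.get("NodeRecovery")
    if recovery is not None and recovery not in VALID_NODE_RECOVERY:
        problems.append("NodeRecovery must be 'Automatic' or 'None'")

    return problems
```

This only mirrors the constraints stated on this page; the service may apply further validation, for example to the contents of each instance group.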
Response Syntax
{
   "ClusterArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- ClusterArn
The Amazon Resource Name (ARN) of the updated SageMaker HyperPod cluster.
Type: String
Length Constraints: Maximum length of 256.
Pattern: ^arn:aws[a-z\-]*:sagemaker:[a-z0-9\-]*:[0-9]{12}:cluster/[a-z0-9]{12}$
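The ClusterArn pattern above can also be used to split a returned ARN into its parts. In the following sketch the capture groups and the function name `parse_cluster_arn` are additions made for illustration; the expression otherwise follows the documented pattern, and the sample ARN in the usage note is made up.

```python
import re

# Documented ClusterArn pattern with capture groups added for illustration.
CLUSTER_ARN_PATTERN = re.compile(
    r"arn:(aws[a-z\-]*):sagemaker:([a-z0-9\-]*):([0-9]{12}):cluster/([a-z0-9]{12})"
)


def parse_cluster_arn(arn: str) -> dict:
    """Split a cluster ARN into partition, region, account ID, and cluster ID."""
    match = CLUSTER_ARN_PATTERN.fullmatch(arn)
    if match is None or len(arn) > 256:
        raise ValueError(f"not a valid SageMaker HyperPod cluster ARN: {arn!r}")
    partition, region, account_id, cluster_id = match.groups()
    return {
        "partition": partition,
        "region": region,
        "account_id": account_id,
        "cluster_id": cluster_id,
    }
```

For example, `parse_cluster_arn("arn:aws:sagemaker:us-west-2:123456789012:cluster/abcdef123456")["cluster_id"]` evaluates to `"abcdef123456"`.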
Errors
For information about the errors that are common to all actions, see Common Errors.
- ConflictException
There was a conflict when you attempted to modify a SageMaker entity such as an Experiment or Artifact.
HTTP Status Code: 400
- ResourceLimitExceeded
You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.
HTTP Status Code: 400
- ResourceNotFound
The resource being accessed was not found.
HTTP Status Code: 400
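Because all three errors above share HTTP status code 400, a caller generally has to tell them apart by error code. The sketch below assumes the exception exposes a `response` dictionary with an `Error` entry containing `Code`, which is how `botocore.exceptions.ClientError` reports service errors in the AWS SDK for Python. The function name `describe_update_cluster_error` and the short descriptions are hypothetical.

```python
# Error codes listed in the Errors section above, with short paraphrases.
DOCUMENTED_ERRORS = {
    "ConflictException": "conflicting modification of a SageMaker entity",
    "ResourceLimitExceeded": "a SageMaker resource limit was exceeded",
    "ResourceNotFound": "the resource being accessed was not found",
}


def describe_update_cluster_error(exc) -> str:
    """Summarize an UpdateCluster failure from an exception's error code."""
    # Assumes a botocore-style exception with a `response` dict attribute.
    response = getattr(exc, "response", None) or {}
    code = response.get("Error", {}).get("Code", "")
    if code in DOCUMENTED_ERRORS:
        return f"{code}: {DOCUMENTED_ERRORS[code]}"
    return f"{code or 'UnknownError'}: not listed for this action; see Common Errors"
```

A typical use would be to wrap the `update_cluster` call in `try`/`except botocore.exceptions.ClientError as exc` and log the returned string.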
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: