UpdateInferenceComponent
Updates an inference component.
Request Syntax
{
"DeploymentConfig": {
"AutoRollbackConfiguration": {
"Alarms": [
{
"AlarmName": "string"
}
]
},
"RollingUpdatePolicy": {
"MaximumBatchSize": {
"Type": "string",
"Value": number
},
"MaximumExecutionTimeoutInSeconds": number,
"RollbackMaximumBatchSize": {
"Type": "string",
"Value": number
},
"WaitIntervalInSeconds": number
}
},
"InferenceComponentName": "string",
"RuntimeConfig": {
"CopyCount": number
},
"Specification": {
"BaseInferenceComponentName": "string",
"ComputeResourceRequirements": {
"MaxMemoryRequiredInMb": number,
"MinMemoryRequiredInMb": number,
"NumberOfAcceleratorDevicesRequired": number,
"NumberOfCpuCoresRequired": number
},
"Container": {
"ArtifactUrl": "string",
"Environment": {
"string" : "string"
},
"Image": "string"
},
"DataCacheConfig": {
"EnableCaching": boolean
},
"InstanceType": "string",
"ModelName": "string",
"SchedulingConfig": {
"AvailabilityZoneBalance": {
"EnforcementMode": "string",
"MaxImbalance": number
},
"PlacementStrategy": "string"
},
"StartupParameters": {
"ContainerStartupHealthCheckTimeoutInSeconds": number,
"ModelDataDownloadTimeoutInSeconds": number
}
},
"Specifications": [
{
"BaseInferenceComponentName": "string",
"ComputeResourceRequirements": {
"MaxMemoryRequiredInMb": number,
"MinMemoryRequiredInMb": number,
"NumberOfAcceleratorDevicesRequired": number,
"NumberOfCpuCoresRequired": number
},
"Container": {
"ArtifactUrl": "string",
"Environment": {
"string" : "string"
},
"Image": "string"
},
"DataCacheConfig": {
"EnableCaching": boolean
},
"InstanceType": "string",
"ModelName": "string",
"SchedulingConfig": {
"AvailabilityZoneBalance": {
"EnforcementMode": "string",
"MaxImbalance": number
},
"PlacementStrategy": "string"
},
"StartupParameters": {
"ContainerStartupHealthCheckTimeoutInSeconds": number,
"ModelDataDownloadTimeoutInSeconds": number
}
}
]
}
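As a rough illustration of the syntax above, the following Python sketch assembles a request body that changes the copy count and attaches a rolling update policy. Only the key names come from the request syntax. The component name, alarm name, and numeric values are hypothetical, and the `COPY_COUNT` capacity type is an assumption that this section does not state.

```python
import json

# Hypothetical values throughout; only the key names are taken from the request syntax.
request = {
    "InferenceComponentName": "my-inference-component",
    "RuntimeConfig": {"CopyCount": 4},
    "DeploymentConfig": {
        "RollingUpdatePolicy": {
            # "COPY_COUNT" is an assumed value for Type; it is not stated in this section.
            "MaximumBatchSize": {"Type": "COPY_COUNT", "Value": 1},
            "WaitIntervalInSeconds": 120,
            "MaximumExecutionTimeoutInSeconds": 1800,
            "RollbackMaximumBatchSize": {"Type": "COPY_COUNT", "Value": 1},
        },
        "AutoRollbackConfiguration": {
            "Alarms": [{"AlarmName": "my-latency-alarm"}]
        },
    },
}

# Serialize the request the way it would appear on the wire.
body = json.dumps(request, indent=2)
print(body)
```

With the AWS SDK for Python, the same dictionary could be passed as keyword arguments, for example `boto3.client("sagemaker").update_inference_component(**request)`.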
Request Parameters
For information about the parameters that are common to all actions, see Common Parameters.
The request accepts the following data in JSON format.
- DeploymentConfig
The deployment configuration for the inference component. The configuration contains the desired deployment strategy and rollback settings.
Type: InferenceComponentDeploymentConfig object
Required: No
- InferenceComponentName
The name of the inference component.
Type: String
Length Constraints: Minimum length of 0. Maximum length of 63.
Pattern: [a-zA-Z0-9]([\-a-zA-Z0-9]*[a-zA-Z0-9])?
Required: Yes
- RuntimeConfig
Runtime settings for a model that is deployed with an inference component.
Type: InferenceComponentRuntimeConfig object
Required: No
- Specification
Details about the resources to deploy with this inference component, including the model, container, and compute resources.
Type: InferenceComponentSpecification object
Required: No
- Specifications
A list of specification objects for the inference component, one per instance type. Use this parameter when you want to specify different model or resource configurations for the inference component on each instance type. You can use either this parameter or the singular Specification parameter, but not both.
Type: Array of InferenceComponentSpecification objects
Array Members: Minimum number of 1 item. Maximum number of 5 items.
Required: No
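The constraints listed above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: the function name `validate_update_request` and its messages are hypothetical. It covers only the rules stated in this section: the required name with its length and pattern constraints, the mutual exclusivity of Specification and Specifications, and the 1 to 5 item limit on Specifications.

```python
import re

# Constraints copied from the parameter descriptions above.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9]([\-a-zA-Z0-9]*[a-zA-Z0-9])?")
MAX_NAME_LENGTH = 63
MAX_SPECIFICATIONS = 5


def validate_update_request(request: dict) -> list[str]:
    """Return a list of constraint violations (empty if none were found)."""
    problems = []

    name = request.get("InferenceComponentName")
    if name is None:
        problems.append("InferenceComponentName is required")
    elif len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.fullmatch(name):
        problems.append("InferenceComponentName violates length or pattern constraints")

    # The singular and plural parameters cannot be combined.
    if "Specification" in request and "Specifications" in request:
        problems.append("Specification and Specifications are mutually exclusive")

    specs = request.get("Specifications")
    if specs is not None and not 1 <= len(specs) <= MAX_SPECIFICATIONS:
        problems.append("Specifications must contain 1 to 5 items")

    return problems
```

Although the documented minimum length is 0, the pattern requires at least one alphanumeric character, so this sketch rejects an empty name. The service remains the authority on validation; a check like this only catches obvious mistakes early.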
Response Syntax
{
"InferenceComponentArn": "string"
}
Response Elements
If the action is successful, the service sends back an HTTP 200 response.
The following data is returned in JSON format by the service.
- InferenceComponentArn
The Amazon Resource Name (ARN) of the inference component.
Type: String
Length Constraints: Minimum length of 20. Maximum length of 2048.
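A caller often wants the component name back out of the returned ARN. The sketch below checks the documented length constraint and then splits the ARN. The helper name `component_name_from_arn` is hypothetical, and the resource layout `inference-component/<name>` is an assumption about the ARN format rather than something stated in this section.

```python
def component_name_from_arn(arn: str) -> str:
    """Extract the trailing resource name from an inference component ARN."""
    # Length constraint from the response element description.
    if not 20 <= len(arn) <= 2048:
        raise ValueError("ARN length is outside the documented 20-2048 range")

    # General ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or "/" not in parts[5]:
        raise ValueError("unexpected ARN layout")

    # Assumed resource layout: inference-component/<name>
    return parts[5].split("/", 1)[1]


# Hypothetical response, shaped like the response syntax above.
response = {
    "InferenceComponentArn": (
        "arn:aws:sagemaker:us-east-1:123456789012:inference-component/my-ic"
    )
}
name = component_name_from_arn(response["InferenceComponentArn"])
```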
Errors
For information about the errors that are common to all actions, see Common Error Types.
- ResourceLimitExceeded
You have exceeded a SageMaker resource limit. For example, you might have created too many training jobs.
HTTP Status Code: 400
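One way to tell this error apart from others is to inspect the parsed error response. The sketch below assumes the dictionary shape that botocore exposes on `ClientError.response` (an `Error` entry with a `Code`, and `ResponseMetadata` with an `HTTPStatusCode`). The helper name `is_resource_limit_error` and the sample message are hypothetical.

```python
def is_resource_limit_error(error_response: dict) -> bool:
    """Return True if the parsed error response reports ResourceLimitExceeded."""
    code = error_response.get("Error", {}).get("Code")
    status = error_response.get("ResponseMetadata", {}).get("HTTPStatusCode")
    # The error is documented with HTTP status code 400.
    return code == "ResourceLimitExceeded" and status == 400


# Hypothetical error response used only to exercise the helper.
sample = {
    "Error": {"Code": "ResourceLimitExceeded", "Message": "limit reached"},
    "ResponseMetadata": {"HTTPStatusCode": 400},
}
limit_hit = is_resource_limit_error(sample)
```

With boto3, the helper could be called inside an `except botocore.exceptions.ClientError as err:` block as `is_resource_limit_error(err.response)`, for example to decide whether to request a quota increase instead of retrying.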