
UpdateInferenceComponent

Updates an inference component.

Request Syntax


{
   "DeploymentConfig": { 
      "AutoRollbackConfiguration": { 
         "Alarms": [ 
            { 
               "AlarmName": "string"
            }
         ]
      },
      "RollingUpdatePolicy": { 
         "MaximumBatchSize": { 
            "Type": "string",
            "Value": number
         },
         "MaximumExecutionTimeoutInSeconds": number,
         "RollbackMaximumBatchSize": { 
            "Type": "string",
            "Value": number
         },
         "WaitIntervalInSeconds": number
      }
   },
   "InferenceComponentName": "string",
   "RuntimeConfig": { 
      "CopyCount": number
   },
   "Specification": { 
      "BaseInferenceComponentName": "string",
      "ComputeResourceRequirements": { 
         "MaxMemoryRequiredInMb": number,
         "MinMemoryRequiredInMb": number,
         "NumberOfAcceleratorDevicesRequired": number,
         "NumberOfCpuCoresRequired": number
      },
      "Container": { 
         "ArtifactUrl": "string",
         "ContainerMetricsConfig": { 
            "MetricsEndpoints": [ 
               { 
                  "MetricPublishFrequencyInSeconds": number,
                  "MetricsEndpointPath": "string"
               }
            ]
         },
         "Environment": { 
            "string" : "string" 
         },
         "Image": "string"
      },
      "DataCacheConfig": { 
         "EnableCaching": boolean
      },
      "InstanceType": "string",
      "ModelName": "string",
      "SchedulingConfig": { 
         "AvailabilityZoneBalance": { 
            "EnforcementMode": "string",
            "MaxImbalance": number
         },
         "PlacementStrategy": "string"
      },
      "StartupParameters": { 
         "ContainerStartupHealthCheckTimeoutInSeconds": number,
         "ModelDataDownloadTimeoutInSeconds": number
      }
   },
   "Specifications": [ 
      { 
         "BaseInferenceComponentName": "string",
         "ComputeResourceRequirements": { 
            "MaxMemoryRequiredInMb": number,
            "MinMemoryRequiredInMb": number,
            "NumberOfAcceleratorDevicesRequired": number,
            "NumberOfCpuCoresRequired": number
         },
         "Container": { 
            "ArtifactUrl": "string",
            "ContainerMetricsConfig": { 
               "MetricsEndpoints": [ 
                  { 
                     "MetricPublishFrequencyInSeconds": number,
                     "MetricsEndpointPath": "string"
                  }
               ]
            },
            "Environment": { 
               "string" : "string" 
            },
            "Image": "string"
         },
         "DataCacheConfig": { 
            "EnableCaching": boolean
         },
         "InstanceType": "string",
         "ModelName": "string",
         "SchedulingConfig": { 
            "AvailabilityZoneBalance": { 
               "EnforcementMode": "string",
               "MaxImbalance": number
            },
            "PlacementStrategy": "string"
         },
         "StartupParameters": { 
            "ContainerStartupHealthCheckTimeoutInSeconds": number,
            "ModelDataDownloadTimeoutInSeconds": number
         }
      }
   ]
}
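
As a rough illustration of the syntax above, the sketch below builds a minimal request body that sets only the required InferenceComponentName plus RuntimeConfig.CopyCount. The helper name build_update_request and the component name are hypothetical. With boto3, the same keys would typically be passed as keyword arguments to the SageMaker client's update_inference_component method, which this sketch does not call.

```python
import json


def build_update_request(name, copy_count):
    """Build a minimal request body that changes only the copy count.

    Every other top-level field (DeploymentConfig, Specification,
    Specifications) is optional and is simply left out here.
    """
    return {
        "InferenceComponentName": name,
        "RuntimeConfig": {"CopyCount": copy_count},
    }


# Hypothetical component name, used for illustration only.
request = build_update_request("my-inference-component", 2)
print(json.dumps(request, indent=3))
```

Leaving optional fields out entirely, rather than sending them as empty objects, keeps the request limited to the settings you intend to change.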

Request Parameters

For information about the parameters that are common to all actions, see Common Parameters.

The request accepts the following data in JSON format.

DeploymentConfig

The deployment configuration for the inference component. The configuration contains the desired deployment strategy and rollback settings.

Type: InferenceComponentDeploymentConfig object

Required: No

InferenceComponentName

The name of the inference component.

Type: String

Length Constraints: Minimum length of 0. Maximum length of 63.

Pattern: [a-zA-Z0-9]([\-a-zA-Z0-9]*[a-zA-Z0-9])?

Required: Yes
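
The length and pattern constraints above can be checked on the client before sending a request. The following is a minimal sketch, assuming the pattern must match the entire name; the function name is_valid_component_name is hypothetical. Although the stated minimum length is 0, the pattern itself requires at least one character, so an empty name fails the check in this sketch.

```python
import re

# Pattern and maximum length copied from the InferenceComponentName constraints.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9]([\-a-zA-Z0-9]*[a-zA-Z0-9])?")
MAX_NAME_LENGTH = 63


def is_valid_component_name(name: str) -> bool:
    """Return True if name satisfies the documented length and pattern."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    # fullmatch requires the whole string to match, not just a prefix
    # (assumption: the service applies the pattern to the full name).
    return NAME_PATTERN.fullmatch(name) is not None
```

Under this reading, a name may contain hyphens but cannot start or end with one.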

RuntimeConfig

Runtime settings for a model that is deployed with an inference component.

Type: InferenceComponentRuntimeConfig object

Required: No

Specification

Details about the resources to deploy with this inference component, including the model, container, and compute resources.

Type: InferenceComponentSpecification object

Required: No

Specifications

A list of specification objects for the inference component, one per instance type. Use this parameter when you want to specify different model or resource configurations for the inference component on each instance type. You can use either this parameter or the singular Specification parameter, but not both.

Type: Array of InferenceComponentSpecification objects

Array Members: Minimum number of 1 item. Maximum number of 5 items.

Required: No
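
The two rules stated for Specifications (mutual exclusivity with Specification, and 1 to 5 array members) can be sketched as a client-side check. The function name check_specification_args is hypothetical, and this is not how the service itself validates requests; it only mirrors the constraints described above.

```python
def check_specification_args(request: dict) -> None:
    """Raise ValueError if the request breaks the documented rules."""
    has_single = "Specification" in request
    has_list = "Specifications" in request

    # The two parameters are mutually exclusive.
    if has_single and has_list:
        raise ValueError("Use either Specification or Specifications, not both")

    # The array form must contain between 1 and 5 items.
    if has_list:
        count = len(request["Specifications"])
        if not 1 <= count <= 5:
            raise ValueError(
                f"Specifications must contain 1 to 5 items, got {count}"
            )
```

A request that includes neither parameter passes this check, since both are optional.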

Response Syntax


{
   "InferenceComponentArn": "string"
}

Response Elements

If the action is successful, the service sends back an HTTP 200 response.

The following data is returned in JSON format by the service.

InferenceComponentArn

The Amazon Resource Name (ARN) of the inference component.

Type: String

Length Constraints: Minimum length of 20. Maximum length of 2048.

Errors

For information about the errors that are common to all actions, see Common Error Types.

ResourceLimitExceeded

You have exceeded a SageMaker resource limit. For example, you might have too many training jobs created.

HTTP Status Code: 400
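
A caller may want to distinguish ResourceLimitExceeded from other failures. The sketch below assumes the exception carries a botocore-style response dictionary with the error code under response["Error"]["Code"], as botocore's ClientError does. The helper is_resource_limit_error and the stand-in exception class are hypothetical and exist only so the example runs without AWS access.

```python
def is_resource_limit_error(exc: Exception) -> bool:
    """Return True if exc looks like a ResourceLimitExceeded error."""
    # Assumption: the exception exposes a dict shaped like
    # {"Error": {"Code": "...", "Message": "..."}}.
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ResourceLimitExceeded"


class FakeClientError(Exception):
    """Stand-in for an SDK client error, for illustration only."""

    def __init__(self, code: str):
        super().__init__(code)
        self.response = {"Error": {"Code": code, "Message": "example"}}


try:
    raise FakeClientError("ResourceLimitExceeded")
except Exception as exc:
    limit_hit = is_resource_limit_error(exc)
```

Because this error indicates a quota problem rather than a malformed request, retrying the same call unchanged is unlikely to help until resources are freed or the limit is raised.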