ProductionVariantSummary
Describes weight and capacities for a production variant associated with an
endpoint. If you sent a request to the UpdateEndpointWeightsAndCapacities
API and the endpoint status is Updating, you get different desired and
current values.
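As an illustration of the desired/current distinction, the following is a minimal sketch that reports which fields of a single variant summary still differ. It assumes the summary has already been obtained as a plain dictionary (for example, one entry of the ProductionVariants list in a DescribeEndpoint response); the helper name `pending_changes` and the sample values are hypothetical and not part of the API.

```python
from typing import Any


def pending_changes(variant: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {label: (current, desired)} for each field pair that differs."""
    pairs = {
        "InstanceCount": ("CurrentInstanceCount", "DesiredInstanceCount"),
        "Weight": ("CurrentWeight", "DesiredWeight"),
        "ServerlessConfig": ("CurrentServerlessConfig", "DesiredServerlessConfig"),
    }
    changes: dict[str, tuple[Any, Any]] = {}
    for label, (current_key, desired_key) in pairs.items():
        # Every one of these fields is optional, so compare only when both exist.
        if current_key in variant and desired_key in variant:
            if variant[current_key] != variant[desired_key]:
                changes[label] = (variant[current_key], variant[desired_key])
    return changes


# Hypothetical sample shaped like a ProductionVariantSummary; values are made up.
sample_variant = {
    "VariantName": "variant-a",
    "CurrentInstanceCount": 2,
    "DesiredInstanceCount": 4,
    "CurrentWeight": 1.0,
    "DesiredWeight": 1.0,
}

print(pending_changes(sample_variant))  # {'InstanceCount': (2, 4)}
```

An empty result suggests that the variant has reached its requested weight and capacity, although the endpoint status remains the authoritative signal.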
Contents
- VariantName
-
The name of the variant.
Type: String
Length Constraints: Maximum length of 63.
Pattern:
^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}
Required: Yes
- CurrentInstanceCount
-
The number of instances associated with the variant.
Type: Integer
Valid Range: Minimum value of 0.
Required: No
- CurrentServerlessConfig
-
The serverless configuration for the endpoint.
Type: ProductionVariantServerlessConfig object
Required: No
- CurrentWeight
-
The weight associated with the variant.
Type: Float
Valid Range: Minimum value of 0.
Required: No
- DeployedImages
-
An array of DeployedImage objects that specify the Amazon EC2 Container Registry paths of the inference images deployed on instances of this ProductionVariant.
Type: Array of DeployedImage objects
Required: No
- DesiredInstanceCount
-
The number of instances requested in the UpdateEndpointWeightsAndCapacities request.
Type: Integer
Valid Range: Minimum value of 0.
Required: No
- DesiredServerlessConfig
-
The serverless configuration requested for the endpoint update.
Type: ProductionVariantServerlessConfig object
Required: No
- DesiredWeight
-
The requested weight, as specified in the UpdateEndpointWeightsAndCapacities request.
Type: Float
Valid Range: Minimum value of 0.
Required: No
- ManagedInstanceScaling
-
Settings that control the range in the number of instances that the endpoint provisions as it scales up or down to accommodate traffic.
Type: ProductionVariantManagedInstanceScaling object
Required: No
- RoutingConfig
-
Settings that control how the endpoint routes incoming traffic to the instances that the endpoint hosts.
Type: ProductionVariantRoutingConfig object
Required: No
- VariantStatus
-
The endpoint variant status, which describes the current deployment stage status or operational status.
Type: Array of ProductionVariantStatus objects
Array Members: Minimum number of 0 items. Maximum number of 5 items.
Required: No
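The VariantName constraints listed above (maximum length of 63 and the documented pattern) can be checked on the client side before a name is used. The following is a minimal sketch, not the service's own validation: the helper name `is_valid_variant_name` is hypothetical, and matching the pattern against the whole string is an assumption, because the documented pattern has no trailing anchor. The length is checked separately because the pattern alone permits runs of hyphens that could exceed 63 characters.

```python
import re

# Pattern and length limit copied from the VariantName entry.
VARIANT_NAME_PATTERN = re.compile(r"^[a-zA-Z0-9](-*[a-zA-Z0-9]){0,62}")
MAX_VARIANT_NAME_LENGTH = 63


def is_valid_variant_name(name: str) -> bool:
    """Check a candidate name against the documented length and pattern."""
    if len(name) > MAX_VARIANT_NAME_LENGTH:
        return False
    # Assumption: the entire string must match, so a trailing hyphen is rejected.
    return VARIANT_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_variant_name("variant-a"))  # True
print(is_valid_variant_name("-variant"))   # False
```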
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: