Class CfnInferenceComponent
- All Implemented Interfaces:
IInspectable,ITaggableV2,IInferenceComponentRef,software.amazon.jsii.JsiiSerializable,software.constructs.IConstruct,software.constructs.IDependable
In the inference component settings, you specify the model, the endpoint, and how the model utilizes the resources that the endpoint hosts. You can optimize resource utilization by tailoring how the required CPU cores, accelerators, and memory are allocated. You can deploy multiple inference components to an endpoint, where each inference component contains one model and the resource utilization needs for that individual model. After you deploy an inference component, you can directly invoke the associated model when you use the InvokeEndpoint API action.
Example:
// The code below shows an example of how to instantiate this type.
// The values are placeholders you should change.
import java.util.List;
import java.util.Map;
import software.amazon.awscdk.CfnTag;
import software.amazon.awscdk.services.sagemaker.*;
// Nested property types (for example InferenceComponentSpecificationProperty) live inside CfnInferenceComponent.
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent.*;

CfnInferenceComponent cfnInferenceComponent = CfnInferenceComponent.Builder.create(this, "MyCfnInferenceComponent")
        .endpointName("endpointName")
        .specification(InferenceComponentSpecificationProperty.builder()
                .baseInferenceComponentName("baseInferenceComponentName")
                .computeResourceRequirements(InferenceComponentComputeResourceRequirementsProperty.builder()
                        .maxMemoryRequiredInMb(123)
                        .minMemoryRequiredInMb(123)
                        .numberOfAcceleratorDevicesRequired(123)
                        .numberOfCpuCoresRequired(123)
                        .build())
                .container(InferenceComponentContainerSpecificationProperty.builder()
                        .artifactUrl("artifactUrl")
                        .deployedImage(DeployedImageProperty.builder()
                                .resolutionTime("resolutionTime")
                                .resolvedImage("resolvedImage")
                                .specifiedImage("specifiedImage")
                                .build())
                        .environment(Map.of(
                                "environmentKey", "environment"))
                        .image("image")
                        .build())
                .modelName("modelName")
                .startupParameters(InferenceComponentStartupParametersProperty.builder()
                        .containerStartupHealthCheckTimeoutInSeconds(123)
                        .modelDataDownloadTimeoutInSeconds(123)
                        .build())
                .build())
        // the properties below are optional
        .deploymentConfig(InferenceComponentDeploymentConfigProperty.builder()
                .autoRollbackConfiguration(AutoRollbackConfigurationProperty.builder()
                        .alarms(List.of(AlarmProperty.builder()
                                .alarmName("alarmName")
                                .build()))
                        .build())
                .rollingUpdatePolicy(InferenceComponentRollingUpdatePolicyProperty.builder()
                        .maximumBatchSize(InferenceComponentCapacitySizeProperty.builder()
                                .type("type")
                                .value(123)
                                .build())
                        .maximumExecutionTimeoutInSeconds(123)
                        .rollbackMaximumBatchSize(InferenceComponentCapacitySizeProperty.builder()
                                .type("type")
                                .value(123)
                                .build())
                        .waitIntervalInSeconds(123)
                        .build())
                .build())
        .endpointArn("endpointArn")
        .inferenceComponentName("inferenceComponentName")
        .runtimeConfig(InferenceComponentRuntimeConfigProperty.builder()
                .copyCount(123)
                .currentCopyCount(123)
                .desiredCopyCount(123)
                .build())
        .tags(List.of(CfnTag.builder()
                .key("key")
                .value("value")
                .build()))
        .variantName("variantName")
        .build();
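The generated example above sets every property with placeholder values. As a complement, the following is a minimal sketch that sets only the two properties listed before the "optional" marker (endpointName and specification), plus a variant name and a copy count. The stack class name, the construct id "LlmComponent", and the values "my-endpoint", "AllTraffic", and "my-model" are hypothetical, and the choice of which specification fields to set is an assumption rather than a statement of what the service requires.

```java
import software.amazon.awscdk.Stack;
import software.amazon.awscdk.StackProps;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent.InferenceComponentComputeResourceRequirementsProperty;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent.InferenceComponentRuntimeConfigProperty;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent.InferenceComponentSpecificationProperty;
import software.constructs.Construct;

// Hypothetical stack that attaches one inference component to an existing endpoint.
public class InferenceComponentStack extends Stack {
    public InferenceComponentStack(final Construct scope, final String id, final StackProps props) {
        super(scope, id, props);

        CfnInferenceComponent.Builder.create(this, "LlmComponent")
                // Endpoint and production variant that host the component (assumed to exist already).
                .endpointName("my-endpoint")
                .variantName("AllTraffic")
                .specification(InferenceComponentSpecificationProperty.builder()
                        // Name of an existing SageMaker model (hypothetical value).
                        .modelName("my-model")
                        .computeResourceRequirements(InferenceComponentComputeResourceRequirementsProperty.builder()
                                .minMemoryRequiredInMb(1024)
                                .numberOfCpuCoresRequired(1)
                                .build())
                        .build())
                // Request a single runtime copy of the model container.
                .runtimeConfig(InferenceComponentRuntimeConfigProperty.builder()
                        .copyCount(1)
                        .build())
                .build();
    }
}
```

Every builder method used here also appears in the generated example; only the values differ.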
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static interface | CfnInferenceComponent.AlarmProperty | An Amazon CloudWatch alarm configured to monitor metrics on an endpoint.
static interface | CfnInferenceComponent.AutoRollbackConfigurationProperty |
static final class | CfnInferenceComponent.Builder | A fluent builder for CfnInferenceComponent.
static interface | CfnInferenceComponent.DeployedImageProperty | Gets the Amazon EC2 Container Registry path of the Docker image of the model that is hosted in this ProductionVariant.
static interface | CfnInferenceComponent.InferenceComponentCapacitySizeProperty | Specifies the type and size of the endpoint capacity to activate for a rolling deployment or a rollback strategy.
static interface | CfnInferenceComponent.InferenceComponentComputeResourceRequirementsProperty | Defines the compute resources to allocate to run a model, plus any adapter models, that you assign to an inference component.
static interface | CfnInferenceComponent.InferenceComponentContainerSpecificationProperty | Defines a container that provides the runtime environment for a model that you deploy with an inference component.
static interface | CfnInferenceComponent.InferenceComponentDeploymentConfigProperty | The deployment configuration for an endpoint that hosts inference components.
static interface | CfnInferenceComponent.InferenceComponentRollingUpdatePolicyProperty | Specifies a rolling deployment strategy for updating a SageMaker AI inference component.
static interface | CfnInferenceComponent.InferenceComponentRuntimeConfigProperty | Runtime settings for a model that is deployed with an inference component.
static interface | CfnInferenceComponent.InferenceComponentSpecificationProperty | Details about the resources to deploy with this inference component, including the model, container, and compute resources.
static interface | CfnInferenceComponent.InferenceComponentStartupParametersProperty | Settings that take effect while the model container starts up.
Nested classes/interfaces inherited from class software.amazon.jsii.JsiiObject
software.amazon.jsii.JsiiObject.InitializationMode
Nested classes/interfaces inherited from interface software.constructs.IConstruct
software.constructs.IConstruct.Jsii$Default
Nested classes/interfaces inherited from interface software.amazon.awscdk.services.sagemaker.IInferenceComponentRef
IInferenceComponentRef.Jsii$Default, IInferenceComponentRef.Jsii$Proxy
Nested classes/interfaces inherited from interface software.amazon.awscdk.IInspectable
IInspectable.Jsii$Default, IInspectable.Jsii$Proxy
Nested classes/interfaces inherited from interface software.amazon.awscdk.ITaggableV2
ITaggableV2.Jsii$Default, ITaggableV2.Jsii$Proxy
-
Field Summary
Fields
Modifier and Type | Field | Description
static final String | CFN_RESOURCE_TYPE_NAME | The CloudFormation resource type name for this resource class.
-
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | CfnInferenceComponent(software.amazon.jsii.JsiiObject.InitializationMode initializationMode) |
protected | CfnInferenceComponent(software.amazon.jsii.JsiiObjectRef objRef) |
public | CfnInferenceComponent(software.constructs.Construct scope, String id, CfnInferenceComponentProps props) |
-
Method Summary
Modifier and Type | Method | Description
 | getAttrCreationTime() | The time when the inference component was created.
 | getAttrFailureReason() | The failure reason if the inference component is in a failed state.
 | getAttrInferenceComponentArn() | The Amazon Resource Name (ARN) of the inference component.
 | getAttrInferenceComponentStatus() | The status of the inference component.
 | getAttrLastModifiedTime() | The time when the inference component was last updated.
 | getAttrRuntimeConfigCurrentCopyCount() | The number of runtime copies of the model container that are currently deployed.
 | getAttrRuntimeConfigDesiredCopyCount() | The number of runtime copies of the model container that you requested to deploy with the inference component.
 | getAttrSpecificationContainerDeployedImage() |
 | getCdkTagManager() | Tag Manager which manages the tags for this resource.
 | getCfnProperties() |
 | getDeploymentConfig() | The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
 | getEndpointArn() | The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
 | getEndpointName() | The name of the endpoint that hosts the inference component.
 | getInferenceComponentName() | The name of the inference component.
 | getInferenceComponentRef() | A reference to an InferenceComponent resource.
 | getRuntimeConfig() | The runtime config for the inference component.
 | getSpecification() | The specification for the inference component.
 | getTags() | An array of tags to apply to the resource.
 | getVariantName() | The name of the production variant that hosts the inference component.
void | inspect(TreeInspector inspector) | Examines the CloudFormation resource and discloses attributes.
protected Map<String,Object> | renderProperties(Map<String, Object> props) |
void | setDeploymentConfig(IResolvable value) | The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
void | setDeploymentConfig(CfnInferenceComponent.InferenceComponentDeploymentConfigProperty value) | The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
void | setEndpointArn(String value) | The Amazon Resource Name (ARN) of the endpoint that hosts the inference component.
void | setEndpointName(String value) | The name of the endpoint that hosts the inference component.
void | setInferenceComponentName(String value) | The name of the inference component.
void | setRuntimeConfig(IResolvable value) | The runtime config for the inference component.
void | setRuntimeConfig(CfnInferenceComponent.InferenceComponentRuntimeConfigProperty value) | The runtime config for the inference component.
void | setSpecification(IResolvable value) | The specification for the inference component.
void | setSpecification(CfnInferenceComponent.InferenceComponentSpecificationProperty value) | The specification for the inference component.
void | setTags(List<CfnTag> value) | An array of tags to apply to the resource.
void | setVariantName(String value) | The name of the production variant that hosts the inference component.
Methods inherited from class software.amazon.awscdk.CfnResource
addDeletionOverride, addDependency, addDependsOn, addMetadata, addOverride, addPropertyDeletionOverride, addPropertyOverride, applyRemovalPolicy, applyRemovalPolicy, applyRemovalPolicy, getAtt, getAtt, getCfnOptions, getCfnResourceType, getMetadata, getUpdatedProperites, getUpdatedProperties, isCfnResource, obtainDependencies, obtainResourceDependencies, removeDependency, replaceDependency, shouldSynthesize, toString, validateProperties
Methods inherited from class software.amazon.awscdk.CfnRefElement
getRef
Methods inherited from class software.amazon.awscdk.CfnElement
getCreationStack, getLogicalId, getStack, isCfnElement, overrideLogicalId
Methods inherited from class software.constructs.Construct
getNode, isConstruct
Methods inherited from class software.amazon.jsii.JsiiObject
jsiiAsyncCall, jsiiAsyncCall, jsiiCall, jsiiCall, jsiiGet, jsiiGet, jsiiSet, jsiiStaticCall, jsiiStaticCall, jsiiStaticGet, jsiiStaticGet, jsiiStaticSet, jsiiStaticSet
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface software.constructs.IConstruct
getNode
Methods inherited from interface software.amazon.jsii.JsiiSerializable
$jsii$toJson
-
Field Details
-
CFN_RESOURCE_TYPE_NAME
The CloudFormation resource type name for this resource class.
-
-
Constructor Details
-
CfnInferenceComponent
protected CfnInferenceComponent(software.amazon.jsii.JsiiObjectRef objRef) -
CfnInferenceComponent
protected CfnInferenceComponent(software.amazon.jsii.JsiiObject.InitializationMode initializationMode) -
CfnInferenceComponent
@Stability(Stable) public CfnInferenceComponent(@NotNull software.constructs.Construct scope, @NotNull String id, @NotNull CfnInferenceComponentProps props)
- Parameters:
scope - Scope in which this resource is defined. This parameter is required.
id - Construct identifier for this resource (unique in its scope). This parameter is required.
props - Resource properties. This parameter is required.
-
-
Method Details
-
inspect
Examines the CloudFormation resource and discloses attributes.
- Specified by:
inspect in interface IInspectable
- Parameters:
inspector - Tree inspector to collect and process attributes. This parameter is required.
-
renderProperties
@Stability(Stable) @NotNull protected Map<String,Object> renderProperties(@NotNull Map<String, Object> props)
- Overrides:
renderProperties in class CfnResource
- Parameters:
props - This parameter is required.
-
getAttrCreationTime
The time when the inference component was created. -
getAttrFailureReason
The failure reason if the inference component is in a failed state. -
getAttrInferenceComponentArn
The Amazon Resource Name (ARN) of the inference component. -
getAttrInferenceComponentStatus
The status of the inference component. -
getAttrLastModifiedTime
The time when the inference component was last updated. -
getAttrRuntimeConfigCurrentCopyCount
The number of runtime copies of the model container that are currently deployed. -
getAttrRuntimeConfigDesiredCopyCount
The number of runtime copies of the model container that you requested to deploy with the inference component. -
getAttrSpecificationContainerDeployedImage
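The attribute getters above expose values that CloudFormation resolves at deploy time. The sketch below shows one common way to surface two of them as stack outputs. It assumes that getAttrInferenceComponentArn() and getAttrInferenceComponentStatus() return String tokens, as is usual for generated CDK attribute getters; the helper class name and the output ids are hypothetical.

```java
import software.amazon.awscdk.CfnOutput;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent;
import software.constructs.Construct;

// Hypothetical helper that publishes inference component attributes as stack outputs.
public final class InferenceComponentOutputs {
    private InferenceComponentOutputs() {}

    public static void export(final Construct scope, final CfnInferenceComponent component) {
        // The getter returns a deploy-time token, not the literal ARN, at synthesis time.
        CfnOutput.Builder.create(scope, "InferenceComponentArn")
                .value(component.getAttrInferenceComponentArn())
                .build();

        CfnOutput.Builder.create(scope, "InferenceComponentStatus")
                .value(component.getAttrInferenceComponentStatus())
                .build();
    }
}
```

Because the values are tokens, avoid inspecting or parsing them in synthesis-time Java code; pass them on to other constructs or outputs instead.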
-
getCdkTagManager
Tag Manager which manages the tags for this resource.
- Specified by:
getCdkTagManager in interface ITaggableV2
-
getCfnProperties
- Overrides:
getCfnProperties in class CfnResource
-
getInferenceComponentRef
A reference to an InferenceComponent resource.
- Specified by:
getInferenceComponentRef in interface IInferenceComponentRef
-
getEndpointName
The name of the endpoint that hosts the inference component. -
setEndpointName
The name of the endpoint that hosts the inference component. -
getSpecification
The specification for the inference component.
Returns union: either IResolvable or CfnInferenceComponent.InferenceComponentSpecificationProperty
-
setSpecification
The specification for the inference component. -
setSpecification
@Stability(Stable) public void setSpecification(@NotNull CfnInferenceComponent.InferenceComponentSpecificationProperty value)
The specification for the inference component.
-
getDeploymentConfig
The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
Returns union: either IResolvable or CfnInferenceComponent.InferenceComponentDeploymentConfigProperty
-
setDeploymentConfig
The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations. -
setDeploymentConfig
@Stability(Stable) public void setDeploymentConfig(@Nullable CfnInferenceComponent.InferenceComponentDeploymentConfigProperty value)
The deployment configuration for an endpoint, which contains the desired deployment strategy and rollback configurations.
-
getEndpointArn
The Amazon Resource Name (ARN) of the endpoint that hosts the inference component. -
setEndpointArn
The Amazon Resource Name (ARN) of the endpoint that hosts the inference component. -
getInferenceComponentName
The name of the inference component. -
setInferenceComponentName
The name of the inference component. -
getRuntimeConfig
The runtime config for the inference component.
Returns union: either IResolvable or CfnInferenceComponent.InferenceComponentRuntimeConfigProperty
-
setRuntimeConfig
The runtime config for the inference component. -
setRuntimeConfig
@Stability(Stable) public void setRuntimeConfig(@Nullable CfnInferenceComponent.InferenceComponentRuntimeConfigProperty value)
The runtime config for the inference component.
-
getTags
An array of tags to apply to the resource. -
setTags
An array of tags to apply to the resource. -
getVariantName
The name of the production variant that hosts the inference component. -
setVariantName
The name of the production variant that hosts the inference component.
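getSpecification, getDeploymentConfig, and getRuntimeConfig are each documented as returning a union of IResolvable and a property type, so callers need a type check before reading fields. The following is a minimal sketch of that check for getSpecification. The helper class name, the method name, and the fallback strings are hypothetical, and the sketch assumes the property interface exposes a getModelName() getter matching the modelName builder method.

```java
import software.amazon.awscdk.IResolvable;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent;
import software.amazon.awscdk.services.sagemaker.CfnInferenceComponent.InferenceComponentSpecificationProperty;

// Hypothetical helper that reads the model name from a union-typed getter.
public final class SpecificationReader {
    private SpecificationReader() {}

    public static String describeModel(final CfnInferenceComponent component) {
        // Assigning to Object works whichever branch of the union is returned.
        final Object spec = component.getSpecification();

        if (spec instanceof InferenceComponentSpecificationProperty) {
            // Concrete property object: its fields can be read directly.
            final String modelName = ((InferenceComponentSpecificationProperty) spec).getModelName();
            return modelName != null ? modelName : "(no model name set)";
        }
        if (spec instanceof IResolvable) {
            // Deploy-time token: the value is not known during synthesis.
            return "(unresolved token)";
        }
        return "(unknown)";
    }
}
```

The same pattern applies to the other two union-typed getters with their respective property types.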
-