SageMaker AI endpoint parameters for large model inference


You can customize the following parameters to support low-latency large model inference (LMI) with SageMaker AI:

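Maximum Amazon EBS volume size on the instance (VolumeSizeInGB): If the model is larger than 30 GB and you are using an instance without a local disk, increase this parameter to slightly more than the size of the model.
Health check timeout quota (ContainerStartupHealthCheckTimeoutInSeconds): If your container is configured correctly and the CloudWatch logs indicate a health check timeout, increase this quota so that the container has enough time to respond to health checks.
Model download timeout quota (ModelDataDownloadTimeoutInSeconds): If the model is larger than 40 GB, increase this quota to allow enough time to download the model from Amazon S3 to the instance.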
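
The size-based guidance above can be expressed as a small helper. This is only a minimal sketch: the function name `suggest_variant_overrides`, the 20 percent headroom, and the 1800-second timeout are illustrative assumptions, not values prescribed by SageMaker AI. Only the 30 GB and 40 GB thresholds come from the text.

```python
import math


def suggest_variant_overrides(model_size_gb, has_local_disk, headroom_percent=20):
    """Suggest ProductionVariant overrides from the model size (hypothetical helper).

    The 30 GB and 40 GB thresholds follow the guidance above. The headroom
    percentage and the timeout value are assumptions chosen for illustration.
    """
    overrides = {}

    # A larger EBS volume is only suggested when the model exceeds 30 GB
    # and the instance type has no local disk.
    if model_size_gb > 30 and not has_local_disk:
        overrides["VolumeSizeInGB"] = math.ceil(
            model_size_gb * (100 + headroom_percent) / 100
        )

    # A longer download timeout is suggested when the model exceeds 40 GB.
    if model_size_gb > 40:
        overrides["ModelDataDownloadTimeoutInSeconds"] = 1800  # assumed value

    return overrides
```

The returned dictionary could be merged into a production variant definition before calling `create_endpoint_config`. The health check timeout is left out because the guidance ties it to what the CloudWatch logs show, not to the model size.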

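The following code snippet shows how to set these parameters programmatically. Replace the placeholder text in the example with your own information.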


import boto3

aws_region = "aws-region"
sagemaker_client = boto3.client('sagemaker', region_name=aws_region)

# The name of the endpoint. The name must be unique within an AWS Region in your AWS account.
endpoint_name = "endpoint-name"

# Create an endpoint config name.
endpoint_config_name = "endpoint-config-name"

# The name of the model that you want to host.
model_name = "the-name-of-your-model"

instance_type = "instance-type"

sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            "VariantName": "variant1", # The name of the production variant.
            "ModelName": model_name,
            "InstanceType": instance_type, # Specify the compute instance type.
            "InitialInstanceCount": 1, # Number of instances to launch initially.
            "VolumeSizeInGB": 256, # Specify the size of the Amazon EBS volume.
            "ModelDataDownloadTimeoutInSeconds": 1800, # Specify the model download timeout in seconds.
            "ContainerStartupHealthCheckTimeoutInSeconds": 1800, # Specify the health check timeout in seconds.
        },
    ],
)

sagemaker_client.create_endpoint(EndpointName=endpoint_name, EndpointConfigName=endpoint_config_name)
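
Because large models can take a long time to download and start, it can help to poll the endpoint status after creating it. The following is a minimal sketch, not part of the original example: the helper name `wait_for_endpoint` and its polling defaults are assumptions. It relies only on the `describe_endpoint` call and the `EndpointStatus` and `FailureReason` fields of its response.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_polls=120):
    """Poll describe_endpoint until the endpoint is InService (hypothetical helper).

    `client` is expected to be a boto3 SageMaker client, or any object that
    exposes a compatible describe_endpoint method.
    """
    for _ in range(max_polls):
        response = client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]

        if status == "InService":
            return status

        # Surface the failure reason, for example a health check timeout.
        if status == "Failed":
            raise RuntimeError(
                response.get("FailureReason", "endpoint creation failed")
            )

        time.sleep(poll_seconds)

    raise TimeoutError(
        f"{endpoint_name} was not InService after {max_polls} polls"
    )
```

With the variables from the snippet above, this could be called as `wait_for_endpoint(sagemaker_client, endpoint_name)`. boto3 also ships an `endpoint_in_service` waiter that serves a similar purpose.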

For more information about the keys in ProductionVariants, see ProductionVariant.

For examples that demonstrate how to achieve low-latency inference with large models, see Generative AI Inference Examples on Amazon SageMaker AI in the aws-samples GitHub repository.
