Create an endpoint configuration - Amazon SageMaker AI

Create an endpoint configuration

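After you create a model, create an endpoint configuration with CreateEndpointConfig. Amazon SageMaker AI hosting services use this configuration to deploy models. In the configuration, you identify one or more models, created with CreateModel, to deploy, along with the resources that you want Amazon SageMaker AI to provision. Specify the AsyncInferenceConfig object and provide an output Amazon S3 location for OutputConfig. You can optionally specify Amazon SNS topics on which to send notifications about prediction results. For more information about Amazon SNS topics, see Configuring Amazon SNS.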

The following example shows how to create an endpoint configuration using the AWS SDK for Python (Boto3):

    from time import gmtime, strftime

    import boto3

    # SageMaker client; assumes default AWS credentials and Region are configured.
    sagemaker_client = boto3.client("sagemaker")

    # Placeholders: replace with your own bucket name and key prefix.
    s3_bucket = "<your-s3-bucket>"
    bucket_prefix = "<your-prefix>"

    # Create an endpoint config name. Here we create one based on the date
    # so that we can search endpoints based on creation time.
    endpoint_config_name = f"XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"

    # The name of the model that you want to host.
    # This is the name that you specified when creating the model.
    model_name = "<The_name_of_your_model>"

    create_endpoint_config_response = sagemaker_client.create_endpoint_config(
        # You will specify this name in a CreateEndpoint request.
        EndpointConfigName=endpoint_config_name,
        # List of ProductionVariant objects, one for each model that you want
        # to host at this endpoint.
        ProductionVariants=[
            {
                "VariantName": "variant1",  # The name of the production variant.
                "ModelName": model_name,
                "InstanceType": "ml.m5.xlarge",  # Specify the compute instance type.
                "InitialInstanceCount": 1,  # Number of instances to launch initially.
            }
        ],
        AsyncInferenceConfig={
            "OutputConfig": {
                # Location to upload response outputs when no location is
                # provided in the request.
                "S3OutputPath": f"s3://{s3_bucket}/{bucket_prefix}/output",
                # (Optional) Specify Amazon SNS topics.
                "NotificationConfig": {
                    "SuccessTopic": "arn:aws:sns:aws-region:account-id:topic-name",
                    "ErrorTopic": "arn:aws:sns:aws-region:account-id:topic-name",
                },
            },
            "ClientConfig": {
                # (Optional) Specify the max number of inflight invocations per instance.
                # If no value is provided, Amazon SageMaker will choose an optimal value for you.
                "MaxConcurrentInvocationsPerInstance": 4
            },
        },
    )

    print(f"Created EndpointConfig: {create_endpoint_config_response['EndpointConfigArn']}")

In the preceding example, you can specify the following keys for OutputConfig in the AsyncInferenceConfig field:

  • S3OutputPath: The location to upload response outputs to when no location is provided in the request.

  • NotificationConfig: (Optional) The SNS topics that receive a notification when an inference request succeeds (SuccessTopic) or fails (ErrorTopic).

You can also specify the following optional argument for ClientConfig in the AsyncInferenceConfig field:

  • MaxConcurrentInvocationsPerInstance: (Optional) The maximum number of concurrent requests that the SageMaker AI client sends to the model container.
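
Because only S3OutputPath is required here and the other keys are optional, it can be convenient to assemble the AsyncInferenceConfig dictionary programmatically and leave out the keys you do not set. The following is a minimal sketch of that idea. The helper name build_async_inference_config and its parameter names are hypothetical and not part of Boto3; only the dictionary keys come from the example above. The s3:// prefix check is a simple local sanity check added for illustration, not validation performed by SageMaker AI.

```python
from typing import Any, Dict, Optional


def build_async_inference_config(
    s3_output_path: str,
    success_topic: Optional[str] = None,
    error_topic: Optional[str] = None,
    max_concurrent_invocations: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an AsyncInferenceConfig dict, omitting unset optional keys."""
    # Local sanity check only (an assumption of this sketch).
    if not s3_output_path.startswith("s3://"):
        raise ValueError("s3_output_path must be an s3:// URI")

    output_config: Dict[str, Any] = {"S3OutputPath": s3_output_path}

    # Add NotificationConfig only if at least one SNS topic ARN is given.
    notification_config: Dict[str, str] = {}
    if success_topic:
        notification_config["SuccessTopic"] = success_topic
    if error_topic:
        notification_config["ErrorTopic"] = error_topic
    if notification_config:
        output_config["NotificationConfig"] = notification_config

    config: Dict[str, Any] = {"OutputConfig": output_config}

    # Add ClientConfig only if a concurrency limit is given.
    if max_concurrent_invocations is not None:
        config["ClientConfig"] = {
            "MaxConcurrentInvocationsPerInstance": max_concurrent_invocations
        }
    return config
```

You could then pass the result as the AsyncInferenceConfig argument of create_endpoint_config, for example AsyncInferenceConfig=build_async_inference_config("s3://<your-s3-bucket>/<your-prefix>/output", max_concurrent_invocations=4).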