Create an Endpoint Configuration

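Once you have a model, create an endpoint configuration with CreateEndpointConfig. Amazon SageMaker hosting services use this configuration to deploy models. In the configuration, you identify one or more models, created with CreateModel, and the resources that you want Amazon SageMaker to provision for them. Specify the AsyncInferenceConfig object and provide an output Amazon S3 location for OutputConfig. You can optionally specify Amazon SNS topics on which to send notifications about prediction results. For more information about Amazon SNS topics, see Configuring Amazon SNS.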

The following example shows how to create an endpoint configuration using the AWS SDK for Python (Boto3).


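import boto3
from time import gmtime, strftime

# Create a low-level SageMaker service client. The Region value is a placeholder.
sagemaker_client = boto3.client("sagemaker", region_name="<aws_region>")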

# Create an endpoint config name. Here we create one based on the date
# so that we can search endpoints based on creation time.
endpoint_config_name = f"XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"

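# The name of the model that you want to host. This is the name that you specified when creating the model.
model_name = '<The_name_of_your_model>'

# Placeholders for the S3 bucket and key prefix that receive the inference output.
s3_bucket = '<The_name_of_your_S3_bucket>'
bucket_prefix = '<Your_output_key_prefix>'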

create_endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name, # You will specify this name in a CreateEndpoint request.
    # List of ProductionVariant objects, one for each model that you want to host at this endpoint.
    ProductionVariants=[
        {
            "VariantName": "variant1", # The name of the production variant.
            "ModelName": model_name, 
            "InstanceType": "ml.m5.xlarge", # Specify the compute instance type.
            "InitialInstanceCount": 1 # Number of instances to launch initially.
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            # Location to upload response outputs when no location is provided in the request.
            "S3OutputPath": f"s3://{s3_bucket}/{bucket_prefix}/output",
            # (Optional) Specify Amazon SNS topics for success and error notifications.
            "NotificationConfig": {
                "SuccessTopic": "arn:aws:sns:aws-region:account-id:topic-name",
                "ErrorTopic": "arn:aws:sns:aws-region:account-id:topic-name",
            }
        },
        "ClientConfig": {
            # (Optional) Specify the maximum number of in-flight invocations per instance.
            # If no value is provided, Amazon SageMaker chooses an optimal value for you.
            "MaxConcurrentInvocationsPerInstance": 4
        }
    }
)

print(f"Created EndpointConfig: {create_endpoint_config_response['EndpointConfigArn']}")

In the preceding example, you specify the following keys for OutputConfig in the AsyncInferenceConfig field:

S3OutputPath: The location to upload response outputs when no location is provided in the request.
NotificationConfig: (Optional) The SNS topics that post notifications when an inference request succeeds (SuccessTopic) or fails (ErrorTopic).
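
Because NotificationConfig is optional, it can be convenient to assemble OutputConfig conditionally. The following is a minimal sketch under stated assumptions, not part of the SageMaker API: the helper name build_output_config and its parameters are hypothetical, the bucket name is made up, and the ARN is the same placeholder used in the example above.

```python
from typing import Optional


def build_output_config(
    s3_output_path: str,
    success_topic: Optional[str] = None,
    error_topic: Optional[str] = None,
) -> dict:
    """Assemble an OutputConfig dict, adding NotificationConfig only when a topic is given.

    Hypothetical helper for illustration; it only builds a dict and calls no AWS API.
    """
    if not s3_output_path.startswith("s3://"):
        raise ValueError("s3_output_path must be an S3 URI such as s3://bucket/prefix")

    output_config = {"S3OutputPath": s3_output_path}

    # Collect only the topics that were actually supplied.
    notification_config = {}
    if success_topic:
        notification_config["SuccessTopic"] = success_topic
    if error_topic:
        notification_config["ErrorTopic"] = error_topic
    if notification_config:
        output_config["NotificationConfig"] = notification_config

    return output_config


# Without topics, only the S3 location is set.
minimal = build_output_config("s3://example-bucket/async/output")

# With an error topic only (placeholder ARN).
with_errors = build_output_config(
    "s3://example-bucket/async/output",
    error_topic="arn:aws:sns:aws-region:account-id:topic-name",
)
```

The returned dict could then be passed as the OutputConfig value inside AsyncInferenceConfig.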

You can also specify the following optional argument for ClientConfig in the AsyncInferenceConfig field:

MaxConcurrentInvocationsPerInstance: (Optional) The maximum number of concurrent requests that the SageMaker client sends to the model container.
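
To confirm which asynchronous settings were stored, you can read the endpoint configuration back. The sketch below assumes that a Boto3 SageMaker client is passed in and uses its describe_endpoint_config call. The function name async_settings is hypothetical, and the stub client and its canned response exist only so that the example runs without AWS credentials.

```python
def async_settings(sagemaker_client, endpoint_config_name: str) -> dict:
    """Return the async inference settings stored for an endpoint configuration.

    Hypothetical helper; sagemaker_client is expected to behave like
    boto3.client("sagemaker").
    """
    response = sagemaker_client.describe_endpoint_config(
        EndpointConfigName=endpoint_config_name
    )
    async_config = response.get("AsyncInferenceConfig", {})
    output_config = async_config.get("OutputConfig", {})
    client_config = async_config.get("ClientConfig", {})
    return {
        "s3_output_path": output_config.get("S3OutputPath"),
        "notifications": output_config.get("NotificationConfig", {}),
        # None means the response carried no explicit concurrency limit.
        "max_concurrent": client_config.get("MaxConcurrentInvocationsPerInstance"),
    }


class _StubClient:
    """Stand-in for a real client so that the sketch runs offline (illustration only)."""

    def describe_endpoint_config(self, EndpointConfigName):
        return {
            "EndpointConfigName": EndpointConfigName,
            "AsyncInferenceConfig": {
                "OutputConfig": {"S3OutputPath": "s3://example-bucket/async/output"},
                "ClientConfig": {"MaxConcurrentInvocationsPerInstance": 4},
            },
        }


settings = async_settings(_StubClient(), "example-endpoint-config")
print(settings)
```

With a real client, replace _StubClient() with the sagemaker_client object and pass the endpoint configuration name that you used in the CreateEndpointConfig request.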
