Create an Endpoint Configuration

Once you have a model, create an endpoint configuration with CreateEndpointConfig. Amazon SageMaker AI hosting services use this configuration to deploy models. In the configuration, you identify one or more models, created with CreateModel, and the resources that you want Amazon SageMaker AI to provision for them. Specify the AsyncInferenceConfig object and provide an output Amazon S3 location for OutputConfig. You can optionally specify Amazon SNS topics on which to send notifications about prediction results. For more information about Amazon SNS topics, see Configuring Amazon SNS.

The following example shows how to create an endpoint configuration using the AWS SDK for Python (Boto3):


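import boto3
from time import gmtime, strftime

# Create a low-level SageMaker client. Skip this line if sagemaker_client
# already exists in your session (for example, from the Create Model step).
sagemaker_client = boto3.client("sagemaker")

# Placeholders: the Amazon S3 bucket and key prefix that receive inference outputs.
s3_bucket = '<The_name_of_your_bucket>'
bucket_prefix = '<Your_output_key_prefix>'

# Create an endpoint config name. Here we create one based on the date
# so that we can search endpoints based on creation time.
endpoint_config_name = f"XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"

# The name of the model that you want to host. This is the name that you specified when creating the model.
model_name = '<The_name_of_your_model>'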

create_endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name, # You will specify this name in a CreateEndpoint request.
    # List of ProductionVariant objects, one for each model that you want to host at this endpoint.
    ProductionVariants=[
        {
            "VariantName": "variant1", # The name of the production variant.
            "ModelName": model_name, 
            "InstanceType": "ml.m5.xlarge", # Specify the compute instance type.
            "InitialInstanceCount": 1 # Number of instances to launch initially.
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            # Location to upload response outputs when no location is provided in the request.
            "S3OutputPath": f"s3://{s3_bucket}/{bucket_prefix}/output",
            # (Optional) Specify Amazon SNS topics for success and error notifications.
            "NotificationConfig": {
                "SuccessTopic": "arn:aws:sns:aws-region:account-id:topic-name",
                "ErrorTopic": "arn:aws:sns:aws-region:account-id:topic-name",
            }
        },
        "ClientConfig": {
            # (Optional) Specify the max number of inflight invocations per instance
            # If no value is provided, Amazon SageMaker will choose an optimal value for you
            "MaxConcurrentInvocationsPerInstance": 4
        }
    }
)

print(f"Created EndpointConfig: {create_endpoint_config_response['EndpointConfigArn']}")
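
To confirm what was stored, you can read the configuration back. The following is a minimal sketch, assuming a Boto3 SageMaker client is passed in; the helper name read_async_output_path is hypothetical and not part of the SDK. It relies only on the describe_endpoint_config call and on the same AsyncInferenceConfig keys used in the request above.

```python
def read_async_output_path(sagemaker_client, endpoint_config_name):
    """Return the S3OutputPath stored on an endpoint configuration, or None.

    Hypothetical helper for illustration; sagemaker_client is expected to be
    a Boto3 SageMaker client (or any object with describe_endpoint_config).
    """
    response = sagemaker_client.describe_endpoint_config(
        EndpointConfigName=endpoint_config_name
    )
    # AsyncInferenceConfig is only present on asynchronous configurations,
    # so fall back to empty dictionaries when a key is missing.
    async_config = response.get("AsyncInferenceConfig", {})
    return async_config.get("OutputConfig", {}).get("S3OutputPath")
```

For example, read_async_output_path(sagemaker_client, endpoint_config_name) should return the same s3:// URI that was passed as S3OutputPath in the request.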

In the preceding example, you specify the following keys for OutputConfig in the AsyncInferenceConfig field:

S3OutputPath: The location to upload response outputs when no location is provided in the request.
NotificationConfig: (Optional) The SNS topics that post notifications to you when an inference request succeeds (SuccessTopic) or fails (ErrorTopic).
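
Because NotificationConfig is optional, it can help to assemble OutputConfig so that the key appears only when a topic is supplied. The following is a minimal sketch under that assumption; build_output_config and its parameter names are hypothetical, and the S3 path layout simply mirrors the example above.

```python
def build_output_config(s3_bucket, bucket_prefix, success_topic=None, error_topic=None):
    """Build an OutputConfig dictionary for AsyncInferenceConfig.

    Hypothetical helper for illustration. The SNS topic ARNs are optional;
    NotificationConfig is omitted entirely when neither is given.
    """
    output_config = {"S3OutputPath": f"s3://{s3_bucket}/{bucket_prefix}/output"}

    notification_config = {}
    if success_topic:
        notification_config["SuccessTopic"] = success_topic
    if error_topic:
        notification_config["ErrorTopic"] = error_topic

    # Only attach NotificationConfig when at least one topic was provided.
    if notification_config:
        output_config["NotificationConfig"] = notification_config
    return output_config
```

The returned dictionary can be passed as the "OutputConfig" value inside AsyncInferenceConfig.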

You can also specify the following optional argument for ClientConfig in the AsyncInferenceConfig field:

MaxConcurrentInvocationsPerInstance: (Optional) The maximum number of concurrent requests sent by the SageMaker AI client to the model container.
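
The optional ClientConfig can be handled the same way: leave it out when you want SageMaker AI to choose the concurrency, and include it only when you set a value. The following is a minimal sketch; build_async_inference_config is a hypothetical helper, and the check for a positive integer is an assumption made for illustration rather than a statement of the service's exact limits.

```python
def build_async_inference_config(output_config, max_concurrent_invocations=None):
    """Build an AsyncInferenceConfig dictionary from an OutputConfig dictionary.

    Hypothetical helper for illustration. ClientConfig is added only when a
    concurrency value is provided; otherwise the service picks one.
    """
    config = {"OutputConfig": dict(output_config)}  # copy to avoid mutating the input

    if max_concurrent_invocations is not None:
        # Assumed sanity check: the value should be a positive integer.
        if not isinstance(max_concurrent_invocations, int) or max_concurrent_invocations < 1:
            raise ValueError("max_concurrent_invocations must be a positive integer")
        config["ClientConfig"] = {
            "MaxConcurrentInvocationsPerInstance": max_concurrent_invocations
        }
    return config
```

Calling build_async_inference_config({"S3OutputPath": "s3://my-bucket/my-prefix/output"}, 4) yields a dictionary with the same shape as the AsyncInferenceConfig argument in the example above, without the notification topics.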
