Create an endpoint configuration

After you create a model, create an endpoint configuration with CreateEndpointConfig. Amazon SageMaker AI hosting services use this configuration to deploy models. In the configuration, you identify one or more models, created with CreateModel, and the resources that you want Amazon SageMaker AI to provision for them. Specify the AsyncInferenceConfig object and provide an Amazon S3 output location for OutputConfig. Optionally, you can specify Amazon SNS topics on which to send notifications about prediction results. For more information about Amazon SNS topics, see Configuring Amazon SNS.

The following example shows how to create an endpoint configuration using the AWS SDK for Python (Boto3):


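import boto3
from time import gmtime, strftime

# Create a low-level SageMaker client. Replace the placeholder with your AWS Region.
sagemaker_client = boto3.client("sagemaker", region_name="<aws_region>")

# Amazon S3 bucket and key prefix where asynchronous inference outputs are stored.
s3_bucket = "<your_s3_bucket>"
bucket_prefix = "<your_key_prefix>"

# Create an endpoint config name. Here we create one based on the date
# so that we can search endpoints based on creation time.
endpoint_config_name = f"XGBoostEndpointConfig-{strftime('%Y-%m-%d-%H-%M-%S', gmtime())}"

# The name of the model that you want to host. This is the name that you specified when creating the model.
model_name = '<The_name_of_your_model>'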

create_endpoint_config_response = sagemaker_client.create_endpoint_config(
    EndpointConfigName=endpoint_config_name, # You will specify this name in a CreateEndpoint request.
    # List of ProductionVariant objects, one for each model that you want to host at this endpoint.
    ProductionVariants=[
        {
            "VariantName": "variant1", # The name of the production variant.
            "ModelName": model_name, 
            "InstanceType": "ml.m5.xlarge", # Specify the compute instance type.
            "InitialInstanceCount": 1 # Number of instances to launch initially.
        }
    ],
    AsyncInferenceConfig={
        "OutputConfig": {
            # Location to upload response outputs when no location is provided in the request.
            "S3OutputPath": f"s3://{s3_bucket}/{bucket_prefix}/output",
            # (Optional) specify Amazon SNS topics
            "NotificationConfig": {
                "SuccessTopic": "arn:aws:sns:aws-region:account-id:topic-name",
                "ErrorTopic": "arn:aws:sns:aws-region:account-id:topic-name",
            }
        },
        "ClientConfig": {
            # (Optional) Specify the max number of inflight invocations per instance
            # If no value is provided, Amazon SageMaker will choose an optimal value for you
            "MaxConcurrentInvocationsPerInstance": 4
        }
    }
)

print(f"Created EndpointConfig: {create_endpoint_config_response['EndpointConfigArn']}")

In the preceding example, you specify the following OutputConfig keys for the AsyncInferenceConfig field:

S3OutputPath: The location to upload response outputs to when no location is provided in the request.
NotificationConfig: (Optional) SNS topics that post notifications to you when an inference request succeeds (SuccessTopic) or fails (ErrorTopic).

You can also specify the following optional argument for ClientConfig in the AsyncInferenceConfig field:

MaxConcurrentInvocationsPerInstance: (Optional) The maximum number of concurrent requests sent by the SageMaker AI client to the model container.
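
Because NotificationConfig and ClientConfig are optional, it can be convenient to assemble the AsyncInferenceConfig dictionary programmatically and leave out the keys you do not need. The following is a minimal sketch of that idea, not part of the SDK: the helper name build_async_inference_config and its parameter names are hypothetical, and the s3:// prefix check is an assumption added for illustration. The key names themselves are the ones used in the example above.

```python
from typing import Any, Dict, Optional


def build_async_inference_config(
    s3_output_path: str,
    success_topic: Optional[str] = None,
    error_topic: Optional[str] = None,
    max_concurrent_invocations: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble an AsyncInferenceConfig dict, omitting optional keys that are unset.

    Hypothetical helper for illustration; it only builds a dictionary and
    makes no AWS calls.
    """
    # Assumption: reject anything that is not an S3 URI before it reaches the API.
    if not s3_output_path.startswith("s3://"):
        raise ValueError("s3_output_path must be an s3:// URI")

    output_config: Dict[str, Any] = {"S3OutputPath": s3_output_path}

    # Add NotificationConfig only if at least one SNS topic ARN was given.
    notification_config: Dict[str, str] = {}
    if success_topic:
        notification_config["SuccessTopic"] = success_topic
    if error_topic:
        notification_config["ErrorTopic"] = error_topic
    if notification_config:
        output_config["NotificationConfig"] = notification_config

    config: Dict[str, Any] = {"OutputConfig": output_config}

    # Add ClientConfig only when a concurrency limit was requested; otherwise
    # the key is left out and the service picks a value.
    if max_concurrent_invocations is not None:
        config["ClientConfig"] = {
            "MaxConcurrentInvocationsPerInstance": max_concurrent_invocations
        }
    return config
```

The returned dictionary can be passed as the AsyncInferenceConfig argument of create_endpoint_config, for example AsyncInferenceConfig=build_async_inference_config(f"s3://{s3_bucket}/{bucket_prefix}/output", max_concurrent_invocations=4).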
