Amazon SageMaker AI

Developer Guide

Request Inferences from a Deployed Service (Boto3)

Once your SageMaker AI endpoint has the status InService, you can submit inference requests using the AWS SDK for Python (Boto3) client and its invoke_endpoint() API. The following code examples show how to send input data for inference:

PyTorch and MXNet


import json

import boto3

endpoint = 'insert name of your endpoint here'
image = 'path/to/image'  # Path to the image file to send

runtime = boto3.Session().client('sagemaker-runtime')

# Read image into memory
with open(image, 'rb') as f:
    payload = f.read()

# Send image via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint, ContentType='application/x-image', Body=payload)

# Unpack response
result = json.loads(response['Body'].read().decode())
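
The example above assumes the endpoint is already InService. The following is a minimal sketch of checking that status with the DescribeEndpoint API before invoking. The helper name invoke_when_in_service and its parameters are hypothetical and not part of Boto3; the two clients are passed in so that you can create them however you prefer.

```python
def invoke_when_in_service(sagemaker_client, runtime_client, endpoint_name,
                           payload, content_type):
    """Invoke the endpoint only if its status is InService (hypothetical helper)."""
    # DescribeEndpoint reports the current status in "EndpointStatus".
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    status = description["EndpointStatus"]
    if status != "InService":
        raise RuntimeError(f"Endpoint {endpoint_name!r} is {status}, not InService")

    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType=content_type,
        Body=payload,
    )
    # The response body is a stream; read it once and return the raw bytes.
    return response["Body"].read()


# Usage with real clients (requires AWS credentials and a deployed endpoint):
#   import boto3
#   raw = invoke_when_in_service(
#       boto3.client("sagemaker"),
#       boto3.client("sagemaker-runtime"),
#       "insert name of your endpoint here",
#       payload,
#       "application/x-image",
#   )
```

Raising an error rather than waiting keeps the sketch short. If you would rather block until the endpoint is ready, the Boto3 SageMaker client also provides an endpoint_in_service waiter.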

TensorFlow

For TensorFlow, submit the input as JSON and set the content type to application/json.


import json

import boto3
import numpy as np
from PIL import Image

client = boto3.client('sagemaker-runtime')

# Load the image and resize it to 224x224 pixels
input_file = 'path/to/image'
image = Image.open(input_file)
batch_size = 1
image = np.asarray(image.resize((224, 224)))

# Scale pixel values from [0, 255] to roughly [-1, 1)
image = image / 128 - 1

# Add a batch dimension and repeat the image batch_size times
image = np.concatenate([image[np.newaxis, :, :]] * batch_size)

# Serialize the batch as a JSON request body
body = json.dumps({"instances": image.tolist()})

ioc_predictor_endpoint_name = 'insert name of your endpoint here'
content_type = 'application/json'
ioc_response = client.invoke_endpoint(
    EndpointName=ioc_predictor_endpoint_name,
    Body=body,
    ContentType=content_type
)
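
The TensorFlow example stops after invoke_endpoint and does not unpack ioc_response. The following sketch shows one way to read the result. It assumes the container replies with a JSON object of the form {"predictions": [[score, ...], ...]}, which is the TensorFlow Serving REST convention; your model's output may be shaped differently. The helper name top_class is hypothetical.

```python
import json


def top_class(response):
    """Return (index, score) of the highest-scoring class for the first instance."""
    # invoke_endpoint returns the body as a stream of bytes.
    result = json.loads(response["Body"].read().decode("utf-8"))
    # Assumption: the reply has the form {"predictions": [[score, ...], ...]}.
    scores = result["predictions"][0]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Usage with the response from the example above:
#   index, score = top_class(ioc_response)
```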

XGBoost

For an XGBoost application, submit CSV text instead:


import boto3
import json
 
endpoint = 'insert your endpoint name here'
 
runtime = boto3.Session().client('sagemaker-runtime')
 
csv_text = '1,-1.0,1.0,1.5,2.6'
# Send CSV text via InvokeEndpoint API
response = runtime.invoke_endpoint(EndpointName=endpoint, ContentType='text/csv', Body=csv_text)
# Unpack response
result = json.loads(response['Body'].read().decode())
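
The XGBoost example sends a single hand-written record. The following sketch builds the CSV text from a list of feature rows and parses a CSV reply into floats. Both helper names are hypothetical, and the reply format (numbers separated by commas or newlines) is an assumption; check what your container returns before relying on it.

```python
def rows_to_csv(rows):
    """Serialize feature rows as CSV text, one record per line, with no header."""
    return "\n".join(",".join(str(value) for value in row) for row in rows)


def parse_csv_predictions(text):
    """Parse a CSV reply into floats, accepting comma or newline separators."""
    tokens = text.replace("\n", ",").split(",")
    # Skip empty tokens such as the one left by a trailing newline.
    return [float(token) for token in tokens if token.strip()]


# Usage with the runtime client from the example above:
#   csv_text = rows_to_csv([[1, -1.0, 1.0, 1.5, 2.6]])
#   response = runtime.invoke_endpoint(
#       EndpointName=endpoint, ContentType='text/csv', Body=csv_text)
#   predictions = parse_csv_predictions(response['Body'].read().decode())
```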

Note that bring your own model (BYOM) allows for a custom content type. For more information, see the InvokeEndpoint API reference (runtime_InvokeEndpoint).
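
As a rough illustration of a custom content type, the sketch below passes caller-chosen ContentType and Accept values to invoke_endpoint and returns the content type of the reply with the raw body. The helper name invoke_custom and the MIME type application/x-custom are placeholders, not values defined by SageMaker AI; your own container decides which types it accepts.

```python
def invoke_custom(runtime_client, endpoint_name, payload,
                  content_type="application/x-custom", accept=None):
    """Send a payload with a caller-chosen content type (hypothetical helper)."""
    kwargs = {
        "EndpointName": endpoint_name,
        "ContentType": content_type,  # Placeholder MIME type; use your container's
        "Body": payload,
    }
    # Accept is optional; pass it only when a specific reply format is wanted.
    if accept is not None:
        kwargs["Accept"] = accept

    response = runtime_client.invoke_endpoint(**kwargs)
    # Return the reply's content type (if reported) together with the raw bytes.
    return response.get("ContentType"), response["Body"].read()


# Usage with a real client:
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   reply_type, raw = invoke_custom(runtime, "insert name of your endpoint here", b"...")
```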
