Amazon SageMaker Autopilot example notebooks
The following notebooks serve as practical, hands-on examples that address various use cases of Autopilot.
You can find all of Autopilot's notebooks in the autopilot directory of the Amazon SageMaker AI examples GitHub repository.
We recommend cloning the full Git repository within Studio Classic to access and run the notebooks directly. For information on how to clone a Git repository in Studio Classic, see Clone a Git Repository in SageMaker Studio Classic.
Use case | Description |
---|---|
Serverless inference | By default, Autopilot deploys generated models to real-time inference endpoints. This notebook illustrates how to deploy Autopilot models trained with ensembling and hyperparameter optimization (HPO) modes to serverless endpoints instead (a minimal deployment sketch follows this table). |
Bring custom data processing code to Amazon SageMaker Autopilot | Autopilot inspects your dataset and runs a number of candidates to figure out the optimal combination of data preprocessing steps, machine learning algorithms, and hyperparameters. You can easily deploy the result either on a real-time endpoint or for batch processing. In some cases, you might want the flexibility to bring custom data processing code to Autopilot. For example, your datasets might contain a large number of independent variables, and you may wish to incorporate a custom feature selection step to remove irrelevant variables first. The resulting smaller dataset can then be used to launch an Autopilot job. Ultimately, you would also want to include both the custom processing code and the models from Autopilot for real-time or batch processing. |
Launch Autopilot experiments directly from within Amazon SageMaker Pipelines | While Autopilot streamlines the process of building ML models, MLOps engineers are still responsible for creating, automating, and managing end-to-end ML workflows in production. SageMaker Pipelines can assist in automating various steps of the ML lifecycle, such as data preprocessing, model training, hyperparameter tuning, model evaluation, and deployment. This notebook demonstrates how to incorporate Autopilot into a SageMaker Pipelines end-to-end AutoML training workflow. To launch an Autopilot experiment within Pipelines, you must create a model-building workflow by writing custom integration code using Pipelines Lambda or Processing steps. For more information, refer to Move Amazon SageMaker Autopilot ML models from experimentation to production using Amazon SageMaker AI Pipelines. Alternatively, when using Autopilot in ensembling mode, you can refer to the notebook example that demonstrates how to use the native AutoML step in SageMaker Pipelines (a sketch of the native step follows this table). |
Direct marketing with Amazon SageMaker Autopilot | This notebook demonstrates how to use Autopilot with the Bank Marketing Data Set to predict whether a customer will enroll for a term deposit at a bank (the first sketch after this table shows the basic job launch). |
Customer Churn Prediction with Amazon SageMaker Autopilot | This notebook describes using machine learning for the automated identification of unhappy customers, also known as customer churn prediction. The example shows how to analyze a publicly available dataset and perform feature engineering on it. Next, it shows how to tune a model by selecting the best performing pipeline along with the optimal hyperparameters for the training algorithm. Finally, it shows how to deploy the model to a hosted endpoint and how to evaluate its predictions against ground truth. Because ML models rarely give perfect predictions, the notebook also shows how to incorporate the relative costs of prediction mistakes when determining the financial outcome of using ML. |
Top Candidates Customer Churn Prediction with Amazon SageMaker Autopilot and Batch Transform (Python SDK) | This notebook also addresses customer churn prediction. It demonstrates how to configure the model to return the inference probability, select the top N candidates, and run batch transform on a hold-out test set for evaluation (a batch transform sketch closes this section). Note: This notebook works with SageMaker Python SDK >= 1.65.1, released on 6/19/2020. |
Bringing your own data processing code to Amazon SageMaker Autopilot | This notebook demonstrates how to incorporate and deploy custom data processing code when using Amazon SageMaker Autopilot. It adds a custom feature selection step that removes irrelevant variables before launching an Autopilot job (see the feature-selection sketch after this table). It then shows how to deploy both the custom processing code and the models generated by Autopilot on a real-time endpoint or, alternatively, for batch processing. |
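The direct marketing and churn notebooks above all begin by launching an Autopilot experiment. The following is a minimal sketch of that step using the SageMaker Python SDK's `AutoML` class; the bucket path, job name, and target column are hypothetical placeholders, and each notebook's actual setup differs in its details.

```python
# Minimal sketch: launch an Autopilot experiment with the SageMaker Python SDK.
# The S3 path, job name, and target column below are placeholders.
import sagemaker
from sagemaker.automl.automl import AutoML

session = sagemaker.Session()
role = sagemaker.get_execution_role()

automl = AutoML(
    role=role,
    target_attribute_name="y",   # label column in the training CSV
    max_candidates=10,           # cap the number of candidate pipelines
    sagemaker_session=session,
)

# The training data is a headered CSV in S3; Autopilot infers the problem type.
automl.fit(inputs="s3://my-bucket/autopilot/train.csv", job_name="my-autopilot-job")

# Inspect the best candidate once the job completes.
best = automl.best_candidate()
print(best["CandidateName"], best["FinalAutoMLJobObjectiveMetric"])
```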
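For the serverless inference use case, the sketch below hosts the best candidate from the job above on a serverless endpoint using the boto3 SageMaker client. It assumes an ensembling-mode job, whose best candidate is a single inference container, and all resource names are placeholders.

```python
# Minimal sketch: host the best candidate from the `automl` job above on a
# serverless endpoint. Assumes an ensembling-mode job (single inference
# container); model, config, and endpoint names are placeholders.
import boto3

sm = boto3.client("sagemaker")
best = automl.best_candidate()

sm.create_model(
    ModelName="my-autopilot-model",
    ExecutionRoleArn=role,
    Containers=best["InferenceContainers"],
)
sm.create_endpoint_config(
    EndpointConfigName="my-serverless-config",
    ProductionVariants=[{
        "VariantName": "AllTraffic",
        "ModelName": "my-autopilot-model",
        "ServerlessConfig": {"MemorySizeInMB": 2048, "MaxConcurrency": 5},
    }],
)
sm.create_endpoint(
    EndpointName="my-autopilot-serverless",
    EndpointConfigName="my-serverless-config",
)
```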
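The two custom data-processing entries share one idea: reduce the dataset with your own code, then hand the smaller result to Autopilot. Below is a minimal sketch of that idea with a scikit-learn feature selector; the file names, label column, and choice of `k` are arbitrary placeholders.

```python
# Minimal sketch: a custom feature-selection step ahead of Autopilot.
# File names, label column, and k are placeholders.
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif

df = pd.read_csv("train.csv")
X, y = df.drop(columns=["label"]), df["label"]

# Keep only the 20 features most associated with the label.
selector = SelectKBest(f_classif, k=20).fit(X, y)
reduced = pd.concat([X.loc[:, selector.get_support()], y], axis=1)
reduced.to_csv("train_reduced.csv", index=False)

# Upload the smaller dataset and launch Autopilot on it, for example:
# s3_uri = session.upload_data("train_reduced.csv", key_prefix="autopilot/reduced")
# automl.fit(inputs=s3_uri)
```

The notebook goes further than this sketch by also deploying the fitted selection step together with the Autopilot models for real-time or batch inference.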
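For the Pipelines use case, the ensembling-mode path with the native step looks roughly like the sketch below. `PipelineSession`, `AutoMLInput`, and `AutoMLStep` are SageMaker Python SDK constructs; the pipeline, step, and data names are placeholders.

```python
# Minimal sketch: run Autopilot as a native step of a SageMaker Pipeline.
# The native AutoMLStep requires ensembling mode; names are placeholders.
import sagemaker
from sagemaker.automl.automl import AutoML, AutoMLInput
from sagemaker.workflow.automl_step import AutoMLStep
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.pipeline_context import PipelineSession

role = sagemaker.get_execution_role()
pipeline_session = PipelineSession()   # defers SDK calls until the pipeline runs

automl = AutoML(
    role=role,
    target_attribute_name="y",
    mode="ENSEMBLING",
    sagemaker_session=pipeline_session,
)
step_args = automl.fit(
    inputs=[AutoMLInput(inputs="s3://my-bucket/autopilot/train.csv",
                        target_attribute_name="y",
                        channel_type="training")]
)
automl_step = AutoMLStep(name="AutoMLTraining", step_args=step_args)

pipeline = Pipeline(name="my-autopilot-pipeline",
                    steps=[automl_step],
                    sagemaker_session=pipeline_session)
pipeline.upsert(role_arn=role)  # create or update the pipeline, then pipeline.start()
```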
More notebooks |
You can find more notebooks illustrating other use cases, such as batch transform, in the autopilot directory of the Amazon SageMaker AI examples GitHub repository.
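As a companion to the top-candidates notebook above, here is a minimal sketch of batch-scoring a hold-out set with the top three candidates while requesting prediction probabilities. It reuses the `automl` object from the first sketch after the table; the bucket paths are placeholders, and the hold-out CSV is assumed to contain only feature columns.

```python
# Minimal sketch: batch-score a hold-out set with the top 3 candidates,
# requesting probabilities alongside labels. Reuses `automl` from above.
candidates = automl.list_candidates(
    sort_by="FinalObjectiveMetricValue", sort_order="Descending", max_results=3
)

for cand in candidates:
    model = automl.create_model(
        name=cand["CandidateName"],
        candidate=cand,
        inference_response_keys=["predicted_label", "probability"],
    )
    transformer = model.transformer(
        instance_count=1,
        instance_type="ml.m5.xlarge",
        output_path=f"s3://my-bucket/autopilot/eval/{cand['CandidateName']}",
        accept="text/csv",
    )
    # The test CSV must contain only feature columns (no target, no header).
    transformer.transform("s3://my-bucket/autopilot/test.csv",
                          content_type="text/csv", split_type="Line")
```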