LightGBM
Amazon EC2 instance recommendation for the LightGBM algorithm
SageMaker LightGBM currently supports single-instance and multi-instance CPU training. For multi-instance CPU training (distributed training), specify an `instance_count` greater than 1 when you define your Estimator. For more information on distributed training with LightGBM, see Amazon SageMaker LightGBM Distributed training using Dask.
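As a minimal sketch of the setting described above, the following Python collects the arguments for a multi-instance training job. The helper names, the role ARN, and the image URI string are hypothetical placeholders, not values from this guide. A real LightGBM training job also needs further arguments (such as the training script location and an output path) that are omitted here.

```python
def lightgbm_estimator_kwargs(role, image_uri, instance_count=2,
                              instance_type="ml.m5.2xlarge"):
    """Collect keyword arguments for sagemaker.estimator.Estimator.

    An instance_count greater than 1 requests multi-instance
    (distributed) CPU training; 1 requests single-instance training.
    """
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")
    return {
        "role": role,
        "image_uri": image_uri,
        "instance_count": instance_count,
        "instance_type": instance_type,
    }


def make_estimator(**kwargs):
    """Build the Estimator. Requires the SageMaker Python SDK and AWS credentials."""
    # Imported lazily so the helper above works without the SDK installed.
    from sagemaker.estimator import Estimator
    return Estimator(**kwargs)


# Placeholder values (assumptions): substitute your own role and training image.
kwargs = lightgbm_estimator_kwargs(
    role="arn:aws:iam::111122223333:role/ExampleRole",
    image_uri="<lightgbm-training-image-uri>",
)
print(kwargs["instance_count"])
```

Calling `make_estimator(**kwargs)` in an environment with the SDK configured would then create the Estimator with two instances.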
LightGBM is a memory-bound (as opposed to compute-bound) algorithm, so a general-purpose compute instance (for example, M5) is a better choice than a compute-optimized instance (for example, C5). We also recommend that the selected instances have enough total memory to hold the training data.
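The memory recommendation above can be turned into a rough sizing check. The sketch below estimates how many instances are needed so that their combined memory holds the training data. The function name and the `overhead_factor` default are assumptions for illustration only; the actual in-memory footprint depends on your data and should be measured.

```python
import math

# Memory per instance in GiB for a few general-purpose M5 sizes.
M5_MEMORY_GIB = {
    "ml.m5.xlarge": 16,
    "ml.m5.2xlarge": 32,
    "ml.m5.4xlarge": 64,
}


def min_instance_count(dataset_gib, instance_type, overhead_factor=2.0):
    """Estimate the instances needed to hold the training data in memory.

    overhead_factor is a hypothetical multiplier for the gap between the
    dataset size on disk and its footprint once loaded; tune it for your data.
    """
    if dataset_gib <= 0:
        raise ValueError("dataset_gib must be positive")
    required_gib = dataset_gib * overhead_factor
    return max(1, math.ceil(required_gib / M5_MEMORY_GIB[instance_type]))


# A 40 GiB dataset with 2x overhead needs 80 GiB, so two 64 GiB instances.
print(min_instance_count(40, "ml.m5.4xlarge"))  # → 2
```

A result greater than 1 suggests either a larger instance size or multi-instance training.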
LightGBM sample notebooks
The following table outlines sample notebooks that address different use cases of the Amazon SageMaker LightGBM algorithm.
Notebook Title | Description |
---|---|
Tabular classification with Amazon SageMaker LightGBM and CatBoost algorithm | This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular classification model. |
Tabular regression with Amazon SageMaker LightGBM and CatBoost algorithm | This notebook demonstrates the use of the Amazon SageMaker LightGBM algorithm to train and host a tabular regression model. |
Amazon SageMaker LightGBM Distributed training using Dask | This notebook demonstrates distributed training with the Amazon SageMaker LightGBM algorithm using the Dask framework. |
For instructions on how to create and access Jupyter notebook instances that you can use to run the examples in SageMaker, see Amazon SageMaker Notebook Instances. After you create and open a notebook instance, choose the SageMaker Examples tab to see a list of all the SageMaker samples. To open a notebook, choose its Use tab, and then choose Create copy.