How CatBoost Works
CatBoost implements a conventional Gradient Boosting Decision Tree (GBDT) algorithm with the addition of two critical algorithmic advances:
1. The implementation of ordered boosting, a permutation-driven alternative to the classic algorithm
2. An innovative algorithm for processing categorical features
Both techniques were designed to combat the prediction shift caused by a special kind of target leakage present in all previously existing implementations of gradient boosting algorithms. A simplified sketch of the categorical encoding idea follows.
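To make the permutation idea concrete, the following is a minimal Python sketch of ordered target statistics, the leakage-free categorical encoding described in the CatBoost paper. It is a simplified illustration, not CatBoost's implementation: the function name, the single permutation, and the prior and smoothing values are all assumptions (CatBoost itself uses multiple permutations and additional machinery).

```python
import numpy as np

def ordered_target_statistics(categories, targets, prior=0.5, alpha=1.0, seed=0):
    """Encode one categorical column with ordered target statistics.

    Each example is encoded using only the targets of examples that
    precede it in a random permutation, so an example's own label never
    leaks into its encoding. `prior` and `alpha` smooth rare categories.
    (Simplified single-permutation sketch for illustration only.)
    """
    rng = np.random.default_rng(seed)
    n = len(targets)
    perm = rng.permutation(n)
    sums = {}    # running sum of targets seen so far, per category
    counts = {}  # running count of examples seen so far, per category
    encoded = np.empty(n)
    for idx in perm:
        cat = categories[idx]
        s = sums.get(cat, 0.0)
        c = counts.get(cat, 0)
        # Encode from *preceding* examples only, smoothed toward the prior.
        encoded[idx] = (s + alpha * prior) / (c + alpha)
        sums[cat] = s + targets[idx]
        counts[cat] = c + 1
    return encoded

# Example: a tiny binary-target dataset with one categorical feature.
cats = ["red", "blue", "red", "red", "blue"]
y = np.array([1, 0, 1, 0, 1])
print(ordered_target_statistics(cats, y))
```

The first example of each category visited by the permutation receives only the prior, because no earlier targets exist for that category; later examples blend the running category average with the prior.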
The CatBoost algorithm performs well in machine learning competitions because it robustly handles a variety of data types, relationships, and distributions, and because it offers a diverse set of hyperparameters that you can fine-tune. You can use CatBoost for regression, classification (binary and multiclass), and ranking problems, as in the brief example below.
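As a usage illustration only, here is a minimal sketch using the open-source catboost Python package (an assumption; this is the standalone library, distinct from the SageMaker built-in algorithm interface). It fits a binary classifier on toy data with one declared categorical column.

```python
import pandas as pd
from catboost import CatBoostClassifier, Pool

# Toy training data with one categorical and one numeric feature.
X = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue", "red"],
    "size": [1.0, 2.5, 0.7, 3.1, 2.2, 1.8],
})
y = [1, 0, 1, 0, 0, 1]

# Declaring cat_features triggers CatBoost's categorical processing;
# boosting_type="Ordered" selects the permutation-driven scheme.
train_pool = Pool(X, label=y, cat_features=["color"])
model = CatBoostClassifier(
    iterations=50,
    depth=3,
    learning_rate=0.1,
    boosting_type="Ordered",
    verbose=False,
)
model.fit(train_pool)
print(model.predict(train_pool))
```

The same library exposes CatBoostRegressor and CatBoostRanker for the regression and ranking cases.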
For more information on gradient boosting, see How the SageMaker AI XGBoost algorithm works. For in-depth details about the ordered boosting and categorical feature processing techniques used in the CatBoost method, see CatBoost: unbiased boosting with categorical features.