Gradient Boosting
Short Definition
Gradient boosting is an ensemble method that builds decision trees sequentially, with each new tree trained to correct the errors of the trees before it.
Full Definition
Gradient Boosting is one of the most successful machine learning algorithms for structured and tabular data, consistently winning competitions and powering production systems across industries. Unlike Random Forest, which builds trees independently, gradient boosting builds trees sequentially, with each new tree focused on correcting the mistakes of the existing ensemble. The algorithm computes the gradient of the loss function with respect to the current model's predictions, then fits a new decision tree to these gradients (the pseudo-residuals). The new tree is added to the ensemble scaled by a small learning rate, gradually improving predictions. The name reflects this combination of gradient descent optimization with boosting (sequentially focusing on hard examples).

Jerome Friedman introduced the modern gradient boosting framework in 2001, and it has since spawned highly optimized implementations. XGBoost (2016) introduced regularization and efficient computing. LightGBM (2017) added leaf-wise tree growth and histogram-based splitting for speed. CatBoost (2018) added native categorical feature handling. These implementations dominate tabular data competitions and are widely deployed in finance, healthcare, advertising, and recommendation systems. Gradient boosting remains the algorithm of choice for most structured data problems where deep learning is overkill.
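The sequential tree-building described above can be sketched with scikit-learn's GradientBoostingClassifier; the dataset and hyperparameter values here are illustrative, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Toy binary classification problem.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Each of the 200 trees is fit to the gradient of the loss with respect
# to the current ensemble's predictions, then added with shrinkage.
model = GradientBoostingClassifier(
    n_estimators=200,   # number of sequential trees
    learning_rate=0.1,  # shrinkage applied to each tree's contribution
    max_depth=3,        # shallow trees act as weak learners
).fit(X_tr, y_tr)

print(f"test accuracy: {model.score(X_te, y_te):.3f}")
```

Lowering the learning rate generally requires more trees but tends to generalize better, which is why the two hyperparameters are usually tuned together.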
Technical Explanation
The algorithm iteratively fits pseudo-residuals: at step m, compute r_im = -[dL(y_i, F(x_i))/dF(x_i)] evaluated at F = F_{m-1} for each sample i, fit a tree h_m to these pseudo-residuals, and update F_m(x) = F_{m-1}(x) + eta * h_m(x), where eta is the learning rate. Key hyperparameters include the number of trees (n_estimators), the learning rate (eta, typically 0.01-0.3), maximum tree depth (typically 3-10), and the subsampling ratio. XGBoost adds L1 and L2 regularization to the objective: sum_i L(y_i, F(x_i)) + sum over trees of (gamma*T + 0.5*lambda*||w||^2 + alpha*||w||_1), where T is the number of leaves and w is the vector of leaf weights. LightGBM uses Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) for efficiency.
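The update rule above can be sketched from scratch for squared-error loss, where the pseudo-residuals reduce to y - F(x); tree fitting is delegated to scikit-learn's DecisionTreeRegressor, and the data and settings are illustrative.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression problem: noisy sine wave.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

eta, n_trees = 0.1, 100
F = np.full_like(y, y.mean())  # F_0: constant initial prediction
trees = []
for m in range(n_trees):
    residuals = y - F                # -dL/dF for L = 0.5 * (y - F)^2
    h = DecisionTreeRegressor(max_depth=3).fit(X, residuals)
    F += eta * h.predict(X)          # F_m = F_{m-1} + eta * h_m
    trees.append(h)

print(f"training MSE: {np.mean((y - F) ** 2):.4f}")
```

Prediction on new data sums the same series: y_hat = F_0 + eta * sum of h_m(x) over the stored trees.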
Use Cases
Finance, healthcare, advertising, and recommendation systems; tabular and structured data competitions.
Advantages
Disadvantages
Schema Type
Featured Snippet Candidate
Difficulty Level