A data scientist is using a gradient boosting model (XGBoost) for a regression task and observes that the model's performance on the training set is much better than on the test set. Which hyperparameter tuning strategy would most effectively reduce overfitting?
Shallow trees are less complex and generalize better, reducing overfitting.
Why this answer
Option A (Increase learning rate) makes each tree more influential, increasing overfitting. Option C (Increase boosting rounds) increases model complexity. Option D (Subsample less than 1.0) introduces randomness but is less direct than tree depth.
Option B (Reduce max depth) limits tree complexity, reducing overfitting.