MLS-C01 Modeling • Set 31
MLS-C01 Modeling Practice Test 31 — 15 questions with explanations. Free, no signup.
A data scientist is using SageMaker to train an XGBoost model for regression. The training data contains high-cardinality categorical features (e.g., ZIP code with over 10,000 unique values). Which feature engineering approach is MOST appropriate to avoid overfitting while preserving predictive power?