An e-commerce company uses a linear regression model to predict customer lifetime value (LTV). The model shows high variance: its training RMSE is much lower than its test RMSE. Which of the following is the MOST effective approach to reduce overfitting?
L2 regularization shrinks coefficients and reduces variance.
Why this answer
High variance (low training RMSE, high test RMSE) indicates overfitting. L2 regularization (Ridge regression) adds a penalty proportional to the sum of the squared coefficients, shrinking them toward zero without setting any of them exactly to zero. This lowers the model's effective complexity and makes it less sensitive to noise in the training data, which improves generalization.
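The shrinkage effect can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the synthetic data, the helper names `fit_ridge` and `rmse`, the penalty strength `alpha=10.0`, and the omission of an intercept. It is not the model from the question, only one way to see the penalty at work.

```python
import numpy as np

def fit_ridge(X, y, alpha):
    """Closed-form ridge solution: w = (X^T X + alpha * I)^-1 X^T y.

    Hypothetical helper; no intercept term, for brevity.
    alpha=0.0 reduces to ordinary least squares.
    """
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)

def rmse(X, y, w):
    """Root mean squared error of the linear predictions X @ w."""
    return float(np.sqrt(np.mean((X @ w - y) ** 2)))

# Synthetic stand-in for an LTV dataset: few rows, many mostly irrelevant features.
rng = np.random.default_rng(0)
n_train, n_test, n_features = 40, 200, 30
true_w = np.zeros(n_features)
true_w[:3] = [2.0, -1.0, 0.5]

X_train = rng.normal(size=(n_train, n_features))
X_test = rng.normal(size=(n_test, n_features))
y_train = X_train @ true_w + rng.normal(size=n_train)
y_test = X_test @ true_w + rng.normal(size=n_test)

w_ols = fit_ridge(X_train, y_train, alpha=0.0)     # unregularized fit
w_ridge = fit_ridge(X_train, y_train, alpha=10.0)  # L2-penalized fit

for name, w in [("OLS", w_ols), ("Ridge", w_ridge)]:
    print(
        f"{name:5s} ||w||={np.linalg.norm(w):.3f} "
        f"train RMSE={rmse(X_train, y_train, w):.3f} "
        f"test RMSE={rmse(X_test, y_test, w):.3f}"
    )
```

Two outcomes hold by construction: the ridge coefficient norm is smaller than the unregularized one, and the ridge training RMSE is no lower. Whether test RMSE improves depends on the data and on `alpha`, which in practice would be chosen by cross-validation.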
Exam trap
Cisco often tests the misconception that adding more data always reduces overfitting. The trap here is Option D: duplicating existing samples adds rows but no new, diverse examples, so it fails to address the root cause of high variance.
How to eliminate wrong answers
Option B is wrong because using a polynomial kernel in a support vector regressor increases model complexity by implicitly mapping the data into a higher-dimensional space, which would exacerbate overfitting rather than reduce it. Option C is wrong because adding more features, including interaction terms, further increases model complexity and variance, making overfitting worse. Option D is wrong because duplicating existing samples does not introduce new information. Duplicating every sample equally leaves an ordinary least squares fit unchanged, and duplicating only some samples inflates the weight of those patterns, which can increase overfitting by reinforcing noise in the training data.
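The argument against Option D can be checked with a short sketch. It assumes synthetic data, an unregularized least squares fit with no intercept, and a hypothetical helper `fit_ols`; none of these come from the question itself.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares coefficients via NumPy's least-squares solver."""
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 5))
y = X @ np.array([1.0, 0.0, -2.0, 0.5, 0.0]) + rng.normal(size=50)

# Fit once on the original rows.
w_original = fit_ols(X, y)

# "Add more data" by stacking an exact copy of every row, then refit.
X_dup = np.vstack([X, X])
y_dup = np.concatenate([y, y])
w_duplicated = fit_ols(X_dup, y_dup)

# Doubling every row scales X^T X and X^T y by the same factor,
# so the least squares solution does not move.
same_fit = np.allclose(w_original, w_duplicated)
print("coefficients unchanged after duplication:", same_fit)
```

Because the coefficients are identical, the gap between training and test RMSE is identical too, which is why duplicated rows do not behave like genuinely new observations.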