A data scientist trains a deep learning model on a large dataset. The training loss decreases steadily but the validation loss starts increasing after 20 epochs. The scientist uses early stopping with patience=5. Which of the following is the MOST likely cause and best corrective action?
Training loss that keeps falling while validation loss rises is classic overfitting; adding dropout regularization is the best corrective action.
Why this answer
Training loss that keeps decreasing while validation loss starts rising after epoch 20 is a classic sign of overfitting: the model memorizes noise in the training data instead of generalizing. Early stopping with patience=5 halts training once validation loss has failed to improve for 5 consecutive epochs, which limits the damage but does not address the root cause. Dropout regularization randomly deactivates neurons during training, which forces the network to learn more robust features and reduces overfitting.
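The two mechanisms described above can be sketched in plain Python. This is a minimal illustration, not any framework's implementation: the function names `dropout` and `early_stop`, the drop probability `p=0.5`, and the synthetic validation curve are all assumptions made for the example. The dropout shown is the common "inverted" variant, which rescales the surviving activations during training so nothing needs to change at inference time.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout (illustrative): zero each value with probability p
    during training and scale the survivors by 1 / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At inference time dropout is a no-op.
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def early_stop(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once validation loss has failed to improve for
    `patience` consecutive epochs. stop_epoch is None if that never happens.
    """
    best_loss = float("inf")
    best_epoch = None
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    return None, best_epoch


# Hypothetical curve: validation loss falls through epoch 20, then rises.
val_losses = [1.0 - 0.03 * e for e in range(1, 21)]
val_losses += [0.4 + 0.02 * e for e in range(1, 11)]

stop_epoch, best_epoch = early_stop(val_losses, patience=5)
print(stop_epoch, best_epoch)  # 25 20
```

On this made-up curve, early stopping ends the run at epoch 25 and the best checkpoint is the one from epoch 20. It caps how far the model overfits, whereas dropout acts on the cause by making memorization harder during training.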
Exam trap
CompTIA often tests the distinction between overfitting and underfitting by showing a diverging validation loss curve. The trap is that candidates may mistake overfitting for a learning rate issue or a data quality problem, and so choose 'reduce learning rate' or 'collect more data' instead of the correct regularization technique.
How to eliminate wrong answers
Option B is wrong because the validation loss increasing while training loss decreases indicates overfitting, not unrepresentative data; collecting more data might help but is not the most direct corrective action for overfitting.
Option C is wrong because underfitting would show high training loss that does not decrease, not a decreasing training loss with increasing validation loss.
Option D is wrong because a high learning rate would typically cause training loss to oscillate or diverge, not steadily decrease; reducing learning rate addresses convergence issues, not overfitting.
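The elimination logic above amounts to reading three different loss-curve shapes. The sketch below encodes that reasoning as a rough heuristic. The function name `diagnose`, the tolerance `tol`, and the "more than a third of steps go up" rule are arbitrary illustrative choices, not a standard diagnostic, and real curves are noisier than these toy inputs.

```python
def diagnose(train_losses, val_losses, tol=1e-3):
    """Classify a pair of loss curves with a crude, illustrative heuristic."""
    # Count epochs where training loss went UP: frequent jumps suggest
    # oscillation or divergence, the pattern associated with a high learning rate.
    ups = sum(1 for a, b in zip(train_losses, train_losses[1:]) if b > a + tol)
    if ups > len(train_losses) // 3:
        return "unstable"

    # Training loss that barely moves points to underfitting.
    if train_losses[0] - train_losses[-1] <= tol:
        return "underfitting"

    # Training loss fell steadily; if validation loss has climbed back above
    # its minimum, the curves are diverging, which is the overfitting pattern.
    if val_losses[-1] - min(val_losses) > tol:
        return "overfitting"

    return "ok"


# Toy curves (hypothetical numbers chosen to match each scenario).
print(diagnose([1.0, 0.8, 0.6, 0.4, 0.2], [1.0, 0.8, 0.7, 0.8, 0.9]))  # overfitting
print(diagnose([1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 1.0, 1.0]))            # underfitting
print(diagnose([1.0, 1.5, 0.9, 1.6, 1.0, 1.7], [1.0] * 6))             # unstable
```

Only the first pattern matches the question: training loss decreases steadily, so Options C and D do not fit what the scientist observed.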