A data scientist is training a binary classification model on an imbalanced dataset (95% negative class, 5% positive class). The model currently achieves 94% accuracy but a recall of only 0.10 on the positive class. Which TWO strategies should the data scientist consider to improve recall without significantly sacrificing precision? (Choose 2.)
A higher weight on the positive class penalizes false negatives more heavily, which improves recall.
Why this answer
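The 94% accuracy is misleading here: a model that always predicts the negative class would score 95% on this dataset, so the real problem is the recall of 0.10. Oversampling the minority class (option A) gives the model more positive examples to learn from, either by duplicating existing ones or by synthesizing new ones, which helps it learn a better decision boundary for the positive class. Using class weights (option B) makes each misclassified positive example cost more in the loss, which pushes the model to reduce false negatives. Both techniques address the class imbalance directly, and neither discards training data.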
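As a rough illustration of the two techniques above, the following minimal sketch computes class weights and performs random oversampling on a toy label list with the same 95% / 5% split as the question. The function names `balanced_class_weights` and `random_oversample` and the placeholder feature values are assumptions made for this example, not part of the question. The weighting formula, n_samples / (n_classes * class_count), is the "balanced" heuristic that scikit-learn documents for its `class_weight` option; other weighting schemes are possible.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate randomly chosen minority rows until both classes are equal in size.

    Assumes a binary problem in which minority_label is the smaller class.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == minority_label]
    majority_count = len(labels) - len(minority)
    # Draw (with replacement) as many extra minority indices as needed to close the gap.
    extra = [rng.choice(minority) for _ in range(majority_count - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy training set: 95 negatives, 5 positives. Feature values are placeholders.
labels = [0] * 95 + [1] * 5
rows = [[float(i)] for i in range(len(labels))]

weights = balanced_class_weights(labels)
print(weights)  # positive class gets weight 10.0, negative class about 0.53

over_rows, over_labels = random_oversample(rows, labels, minority_label=1)
print(Counter(over_labels))  # 95 examples of each class
```

In practice the weights would be passed to the training loss or to an estimator's class-weight parameter, and oversampling would be applied to the training split only. Oversampling the validation or test split would distort the measured precision and recall.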
Option C (undersampling) discards negative samples that may carry useful information, which can harm performance. Option D (increasing regularization) typically reduces overfitting but does nothing specific to address class imbalance or improve recall. Option E (using a deeper network) adds model capacity, which may increase overfitting, and does not target recall directly.