A data scientist is training a binary classification model to predict customer churn. The dataset has 10,000 records with 9,500 non-churners and 500 churners. After training a logistic regression model, the model achieves 95% accuracy on the test set. However, the business team reports that the model is not useful because it predicts almost all customers as non-churners. Which metric should the data scientist use to evaluate the model's performance in this scenario?
Trap 1: Accuracy
Accuracy is misleading on imbalanced datasets: with 95% non-churners, a model that predicts only the majority class already reaches about 95% accuracy.
Trap 2: R-squared
R-squared is a metric for regression models, not classification.
Trap 3: Precision
Precision measures how many of the predicted churners are actual churners, but it does not reflect how many actual churners were missed.
- A
Accuracy
Why wrong: Accuracy is misleading on imbalanced datasets: with 95% non-churners, a model that predicts only the majority class already reaches about 95% accuracy.
- B
R-squared
Why wrong: R-squared is a metric for regression models, not classification.
- C
Precision
Why wrong: Precision measures how many of the predicted churners are actual churners, but it does not reflect how many actual churners were missed.
- D
Recall
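Recall measures the proportion of actual churners that the model correctly identifies. Because the model predicts almost every customer as a non-churner, its recall is close to zero, which exposes the failure that the 95% accuracy hides.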
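
The gap between accuracy and recall in this scenario can be checked with a short sketch. It uses the class counts from the question (9,500 non-churners, 500 churners) and assumes a hypothetical degenerate model that predicts "non-churner" for every customer. The helper names and the convention of reporting precision as 0.0 when nothing is predicted positive are illustrative assumptions, and the metrics are computed over the full dataset rather than a held-out test split.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


# Class balance from the question: 0 = non-churner, 1 = churner.
y_true = [0] * 9500 + [1] * 500

# Hypothetical model that always predicts the majority class.
y_pred = [0] * len(y_true)

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = safe_div(tp + tn, len(y_true))   # share of all predictions that are correct
precision = safe_div(tp, tp + fp)           # share of predicted churners that are real
recall = safe_div(tp, tp + fn)              # share of real churners that were found

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Under these assumptions the model is right on every non-churner and wrong on every churner, so accuracy stays high while recall drops to zero: all 500 churners are missed.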