A financial services company is deploying Einstein Prediction Builder to predict customer churn. The data includes both numerical and categorical fields. Which step is essential to ensure the model is not biased against protected attributes like race or gender?
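The correct answer is to exclude protected attributes such as race and gender from the training data and to check that no remaining field acts as a correlated proxy for them. This is the standard approach to mitigating bias.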
Why this answer
Excluding protected attributes such as race or gender from the training data, and checking that the remaining fields do not act as correlated proxies for them, is the essential step for limiting bias in Einstein Prediction Builder. The platform learns from whatever data it is given and does not automatically enforce fairness constraints, so removing these fields prevents the model from learning discriminatory patterns directly from them. Removing the fields is not sufficient on its own, because the model can still infer protected characteristics from correlated features; a postal code, for example, can correlate strongly with race. This is why the proxy check matters, and why neither including the attributes nor relying on built-in fairness would guarantee unbiased predictions.
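The proxy check described above can be sketched as an offline audit on exported data, before the fields are selected in Einstein Prediction Builder. This is a minimal illustration, not a Salesforce feature or API: the field names (`zip_code`, `plan_type`, `gender`), the sample rows, and the 0.5 threshold are all hypothetical, and Cramér's V is just one reasonable association measure for categorical fields. The protected attribute is used here only for auditing and is never passed to the model as a predictor.

```python
from collections import Counter
from math import sqrt


def cramers_v(xs, ys):
    """Association between two categorical columns: 0 (none) to 1 (perfect)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    x_totals = Counter(xs)
    y_totals = Counter(ys)
    k = min(len(x_totals), len(y_totals))
    if n == 0 or k < 2:
        return 0.0  # a constant column carries no information
    chi2 = 0.0
    for x, x_count in x_totals.items():
        for y, y_count in y_totals.items():
            expected = x_count * y_count / n
            observed = joint.get((x, y), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


def flag_proxies(rows, protected, candidates, threshold=0.5):
    """Return candidate fields whose association with `protected` meets the threshold."""
    protected_values = [row[protected] for row in rows]
    flagged = {}
    for field in candidates:
        score = cramers_v([row[field] for row in rows], protected_values)
        if score >= threshold:
            flagged[field] = score
    return flagged


# Hypothetical audit sample: zip_code tracks gender exactly, plan_type does not.
rows = [
    {"gender": "F", "zip_code": "94110", "plan_type": "basic"},
    {"gender": "F", "zip_code": "94110", "plan_type": "premium"},
    {"gender": "F", "zip_code": "94110", "plan_type": "basic"},
    {"gender": "F", "zip_code": "94110", "plan_type": "premium"},
    {"gender": "M", "zip_code": "10001", "plan_type": "basic"},
    {"gender": "M", "zip_code": "10001", "plan_type": "premium"},
    {"gender": "M", "zip_code": "10001", "plan_type": "basic"},
    {"gender": "M", "zip_code": "10001", "plan_type": "premium"},
]

flagged = flag_proxies(rows, "gender", ["zip_code", "plan_type"])
print(flagged)
```

In this sample only `zip_code` is flagged, so it would be a candidate to drop or coarsen before training. Numerical fields would need a different measure or binning first, and a real audit would use far more rows than this toy sample.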
Exam trap
Salesforce often tests two misconceptions. The first is that including protected attributes lets the model 'adjust' for bias; in reality, it lets the model learn bias directly from those fields. The second is that built-in fairness constraints or more advanced algorithms can fix bias automatically, without explicit data preparation.
How to eliminate wrong answers
Option A is wrong because including race and gender as predictors lets the model learn from them directly and potentially amplify existing biases, producing discriminatory outcomes rather than adjusting for them.
Option B is wrong because Einstein Prediction Builder does not have built-in fairness constraints that automatically correct for bias; it relies on careful data preparation and feature selection by the user.
Option C is wrong because deep learning algorithms do not inherently correct for bias. They can exacerbate biases present in the data unless those biases are explicitly mitigated through techniques such as adversarial debiasing or reweighting.
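To make the reweighting technique mentioned under Option C concrete, here is a minimal sketch in the style of Kamiran and Calders' reweighing. Each record gets the weight P(group) × P(label) / P(group, label), so that group membership and outcome become statistically independent in the weighted data. This is a general illustration of explicit mitigation, not something Einstein Prediction Builder is claimed to expose; the group names and churn labels are hypothetical, and the sketch assumes every group and label combination appears at least once.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-record weights that make group and label independent when applied."""
    n = len(groups)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    return [
        group_counts[g] * label_counts[y] / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_rate(groups, labels, weights, group):
    """Weighted share of positive labels (churn = 1) within one group."""
    total = sum(w for g, w in zip(groups, weights) if g == group)
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    return positive / total


# Hypothetical sample: group "a" churns 75% of the time, group "b" 25%.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_rate(groups, labels, weights, "a")
rate_b = weighted_rate(groups, labels, weights, "b")
print(weights, rate_a, rate_b)
```

Over-represented combinations (here, churners in group "a") are down-weighted and under-represented ones are up-weighted, so both groups end up with the same weighted churn rate. No algorithm applies this step automatically; it has to be done deliberately during data preparation, which is the point of the option.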