An admin is training a new Einstein Prediction Builder model to predict whether a support case will be escalated (binary). They have selected the prediction field 'Escalated__c' and the data set of all cases from the past year. Which step is essential to ensure the model can distinguish between escalated and non-escalated cases?
Binary classification requires both outcomes present; otherwise the model cannot learn the difference.
Why this answer
Option A is correct because Einstein Prediction Builder requires the target prediction field to contain both positive and negative examples (e.g., 'True' and 'False') in the training data. Without both values, the model cannot learn the decision boundary between escalated and non-escalated cases, making binary classification impossible.
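The both-outcomes requirement can be sketched as a quick pre-check on exported case data. This is a minimal illustration in plain Python, not part of Einstein Prediction Builder: the helper names `outcome_counts` and `has_both_outcomes` and the sample records are hypothetical, and the only name taken from the question is the field `Escalated__c`.

```python
from collections import Counter

def outcome_counts(records, field="Escalated__c"):
    """Count each known value of `field`, skipping records where it is blank."""
    return Counter(r[field] for r in records if r.get(field) is not None)

def has_both_outcomes(records, field="Escalated__c"):
    """Return True only if the data holds at least one True and one False example."""
    counts = outcome_counts(records, field)
    return counts[True] > 0 and counts[False] > 0

# Hypothetical sample of exported case records.
cases = [
    {"Id": "c1", "Escalated__c": True},
    {"Id": "c2", "Escalated__c": False},
    {"Id": "c3", "Escalated__c": False},
]

# A data set filtered down to escalated cases only has nothing to contrast against.
only_escalated = [c for c in cases if c["Escalated__c"]]

print(has_both_outcomes(cases))
print(has_both_outcomes(only_escalated))
```

The second data set fails the check because every record carries the same label, which is the situation Option A guards against.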
Exam trap
Salesforce exams often test the misconception that more features or larger datasets are always better, but the essential requirement for binary classification is that the target field has both outcome values present in the training data.
How to eliminate wrong answers
Option B is wrong because the prediction window (e.g., next 30 days) is used for time-series or event-based predictions, not for a binary classification model where the target field already exists in historical data.
Option C is wrong because Einstein Prediction Builder automatically selects relevant features from the object; there is no requirement to manually select at least 20 features, and forcing too many features can lead to overfitting.
Option D is wrong because while having sufficient data is important, Einstein Prediction Builder does not enforce a strict minimum of 500 records; the actual requirement depends on the number of features and the rarity of the target event, and the platform provides guidance on data sufficiency during model training.