A company plans to use Einstein Discovery to analyze sales data. Which data preparation step is essential for time-series forecasting?
Einstein Discovery relies on date fields for trend detection.
Why this answer
For time-series forecasting in Einstein Discovery, the date field must be stored as a date or datetime data type and must cover enough history to reveal patterns such as seasonality and trend. Without adequate historical data, the model cannot learn temporal dependencies, which is why preparing the date field is the essential step.
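The two checks described above (a parseable date field and a long enough history) can be sketched outside Salesforce before the data is loaded. This is only an illustration in plain Python: the function name `check_date_field`, the `%Y-%m-%d` format, and the 24-month threshold are assumptions made for the example, not Einstein Discovery requirements.

```python
from datetime import datetime


def check_date_field(raw_dates, min_months=24, fmt="%Y-%m-%d"):
    """Parse date strings and report whether the history spans enough months.

    min_months and fmt are hypothetical defaults chosen for this sketch.
    """
    parsed, unparseable = [], []
    for value in raw_dates:
        try:
            parsed.append(datetime.strptime(value, fmt).date())
        except (TypeError, ValueError):
            # Wrong format or missing value: the row cannot be placed on a timeline.
            unparseable.append(value)

    if not parsed:
        return {"unparseable": unparseable, "months_covered": 0, "enough_history": False}

    first, last = min(parsed), max(parsed)
    # Count calendar months from the earliest to the latest record, inclusive.
    months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    return {
        "unparseable": unparseable,
        "months_covered": months,
        "enough_history": months >= min_months,
    }


report = check_date_field(["2022-01-15", "2023-12-31", "31/12/2023", None])
print(report)
```

Rows listed under `unparseable` would need to be fixed or excluded before the field can serve as a true date dimension.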
Exam trap
Salesforce often tests the misconception that data normalization (scaling) is always required for AI models. For tree-based algorithms like those in Einstein Discovery, scaling is irrelevant. Candidates fall into the trap by picking Option D because they treat scaling as a universal preprocessing step.
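To see why scaling does not matter for a tree-style split, consider a toy single-split regression "stump" written in plain Python. It is a simplified illustration, not Einstein Discovery's internal algorithm, and the sales amounts and targets are made-up values. The split is chosen by comparing rows on either side of a threshold, so min-max scaling, which preserves the ordering of values, leaves the resulting partition unchanged.

```python
def best_split_partition(xs, ys):
    """Return the row indices on the left side of the best single split.

    "Best" means the lowest total squared error around each side's mean.
    """
    order = sorted(range(len(xs)), key=lambda i: xs[i])

    def sse(indices):
        if not indices:
            return 0.0
        mean = sum(ys[i] for i in indices) / len(indices)
        return sum((ys[i] - mean) ** 2 for i in indices)

    best_cost, best_left = float("inf"), None
    for cut in range(1, len(order)):
        if xs[order[cut - 1]] == xs[order[cut]]:
            continue  # no threshold can separate equal feature values
        left, right = order[:cut], order[cut:]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_cost, best_left = cost, frozenset(left)
    return best_left


def min_max_scale(xs):
    """Rescale values to the 0 to 1 range (assumes at least two distinct values)."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


# Hypothetical sales amounts and a target to predict.
amounts = [120.0, 4500.0, 300.0, 9800.0, 150.0, 7200.0]
targets = [1.0, 8.0, 2.0, 15.0, 1.5, 11.0]

raw_partition = best_split_partition(amounts, targets)
scaled_partition = best_split_partition(min_max_scale(amounts), targets)
print(raw_partition == scaled_partition)  # True: same rows end up on each side
```

Only the numeric threshold changes after scaling; the grouping of rows, and therefore the predictions, stay the same.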
How to eliminate wrong answers
Option A is wrong because removing all outliers in sales amounts can discard legitimate extreme values that represent real-world events (e.g., holiday spikes, promotions), which are critical for accurate time-series forecasting; Einstein Discovery handles outliers through model tuning rather than blanket removal.
Option C is wrong because removing duplicate records, while a general data cleaning best practice, is not specifically essential for time-series forecasting. Duplicates in date-indexed data are typically handled by aggregation or deduplication, and that step is not a prerequisite for the forecasting algorithm.
Option D is wrong because scaling numeric fields to a 0-1 range is unnecessary for time-series forecasting in Einstein Discovery: the tree-based models used internally (like Gradient Boosted Trees) are invariant to monotonic transformations of the input features and do not require normalization.
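The risk behind Option A can be illustrated with a small sketch. It applies a common blanket rule (flag anything beyond 1.5 times the interquartile range) to made-up monthly sales figures. The rule and the data are assumptions chosen for illustration, not something Einstein Discovery applies. The months it flags are exactly the holiday peaks a forecast needs to learn from.

```python
from statistics import quantiles

# Hypothetical monthly sales totals with a holiday-season peak.
monthly_sales = {
    "2023-01": 100, "2023-02": 95, "2023-03": 105, "2023-04": 110,
    "2023-05": 108, "2023-06": 102, "2023-07": 98, "2023-08": 104,
    "2023-09": 107, "2023-10": 112, "2023-11": 180, "2023-12": 260,
}

# Quartiles of the sales amounts, then the usual 1.5 * IQR fences.
q1, _median, q3 = quantiles(monthly_sales.values(), n=4)
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Months a blanket outlier filter would delete.
flagged = [
    month for month, amount in monthly_sales.items()
    if amount < lower_fence or amount > upper_fence
]
print(flagged)  # ['2023-11', '2023-12']
```

Dropping the flagged rows would erase the seasonal signal, which is why extreme values should be reviewed individually rather than removed wholesale.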