
Data for AI · Hard · Multiple Choice · Objective-mapped

Quick Answer

The correct answer is to create a holdout set of transactions from the last six months and compare model performance on it versus older data. This directly tests for concept drift, where the relationship between the features and the target variable changes over time; here the likely cause is the new loyalty program altering purchasing patterns. A clear drop in R-squared (or a rise in error metrics) on the recent holdout set is strong evidence that the model no longer generalizes to the shifted data distribution. On the Salesforce AI Associate exam, this scenario tests your understanding of model monitoring and validation strategies; a common trap is retraining on all data without first isolating the drift period. Memory tip: “Hold out the shift to catch the drift.” Always isolate the time window where behavior changed when validating performance degradation.

AI Associate Data for AI Practice Question

This AI Associate practice question tests your understanding of Data for AI. The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. After answering, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

You are a data scientist at a retail company. The company uses Einstein Discovery to analyze customer purchase patterns. The model is built on a dataset of 50,000 transactions. The model's R-squared is 0.85, but the predictions for new customers are consistently off by a large margin. The data includes features like 'Customer Age', 'Income', 'Previous Purchases', and 'Product Category'. The model was trained on data from the past two years. However, six months ago, the company launched a new loyalty program that significantly changed purchasing behavior. You suspect the model is not generalizing to new customers. What should you do to validate your hypothesis?


A. Create a holdout set of transactions from the last six months and compare model performance on it vs. older data
Why correct: If performance is worse on recent data, concept drift is confirmed.
B. Exclude new customers from the dataset entirely
Why wrong: This would avoid the problem but not solve it.
C. Increase the training data size to include older transactions
Why wrong: More old data will not help the model adapt to new patterns.
D. Remove the 'Product Category' feature to simplify the model
Why wrong: Feature reduction is not a diagnostic for drift.


Correct answer & explanation

✓ Create a holdout set of transactions from the last six months and compare model performance on it vs. older data

Option A is correct because creating a holdout set of transactions from the last six months directly tests whether the model's performance has degraded due to the loyalty program's impact on purchasing behavior. By comparing the R-squared or other metrics on this recent holdout set versus older data, you can quantify the drop in predictive accuracy and confirm that the model fails to generalize to the new data distribution. This approach is a standard method for detecting concept drift in machine learning models, especially when external changes (like a loyalty program) alter the underlying patterns.
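The comparison described above can be sketched outside Einstein Discovery in a few lines of plain Python. This is a minimal illustration, not the product's own workflow: the data is synthetic, the single-feature linear model stands in for whatever model is deployed, and the names `fit_line`, `r_squared`, and `make_period` are hypothetical helpers introduced only for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with a single feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def make_period(rng, n, slope, intercept):
    """Synthetic transactions: one feature x, target y = intercept + slope*x + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [intercept + slope * x + rng.gauss(0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)
# Assumption: the feature-target relationship changes after the program launch.
old_train = make_period(rng, 400, slope=2.0, intercept=5.0)
old_holdout = make_period(rng, 100, slope=2.0, intercept=5.0)
recent_holdout = make_period(rng, 100, slope=0.5, intercept=20.0)

# Fit only on pre-change data, then score both holdout sets with the same model.
a, b = fit_line(*old_train)


def predict(xs):
    return [a + b * x for x in xs]


r2_old = r_squared(old_holdout[1], predict(old_holdout[0]))
r2_recent = r_squared(recent_holdout[1], predict(recent_holdout[0]))
print(f"R-squared on older holdout:  {r2_old:.3f}")
print(f"R-squared on recent holdout: {r2_recent:.3f}")
```

The key design choice is that the model is never refit between the two evaluations, so any gap between the two scores reflects a change in the data rather than a change in the model.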

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.


Common exam traps

Common exam trap: answer the scenario, not the keyword

Salesforce often tests the misconception that improving model performance (e.g., by adding more data or simplifying features) is the correct response to poor generalization, rather than first validating the hypothesis of concept drift through a time-based holdout evaluation.

Detailed technical explanation

How to think about this question

Concept drift, specifically sudden drift, occurs when a change in the environment (e.g., a loyalty program) shifts the joint distribution of features and target. The reported R-squared of 0.85 indicates a strong fit to historical patterns, but it says little about the post-change period; a holdout set drawn from that period would likely show a much lower R-squared or higher error metrics (e.g., MAE, RMSE). This is analogous to a time-based split in time series validation, where the model is evaluated on a contiguous later period to assess its robustness to temporal shifts.
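To make the idea of sudden drift concrete, the sketch below tracks MAE and RMSE per month and reports the first month where MAE jumps well above a baseline. It is an illustration under stated assumptions: the records are made-up numbers, the 2x threshold is arbitrary, and `errors_by_period` and `first_jump` are hypothetical helper names, not part of Einstein Discovery.

```python
import math
from collections import defaultdict


def errors_by_period(records):
    """records: iterable of (period, actual, predicted). Returns {period: (mae, rmse)}."""
    grouped = defaultdict(list)
    for period, actual, predicted in records:
        grouped[period].append(actual - predicted)
    metrics = {}
    for period, residuals in sorted(grouped.items()):
        mae = sum(abs(r) for r in residuals) / len(residuals)
        rmse = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        metrics[period] = (mae, rmse)
    return metrics


def first_jump(metrics, baseline_periods, factor=2.0):
    """First non-baseline period whose MAE exceeds factor * mean baseline MAE, else None."""
    baseline = sum(metrics[p][0] for p in baseline_periods) / len(baseline_periods)
    for period in sorted(metrics):
        if period not in baseline_periods and metrics[period][0] > factor * baseline:
            return period
    return None


# Hypothetical data: months 1-18 precede the program launch, months 19-24 follow it.
records = []
for month in range(1, 25):
    shift = 0.0 if month <= 18 else 8.0  # assumed post-launch change in the target
    for i in range(5):
        actual = 50.0 + i + shift
        predicted = 50.0 + i + (1.0 if i % 2 == 0 else -1.0)
        records.append((month, actual, predicted))

metrics = errors_by_period(records)
drift_month = first_jump(metrics, baseline_periods=list(range(1, 13)))
print("First month with a sharp MAE increase:", drift_month)
```

A step change in the per-period error, rather than a slow climb, is what distinguishes sudden drift from gradual drift in a view like this.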

Key Concepts to Remember

Read the scenario before looking for a memorised answer.
Find the constraint that changes the correct option.
Eliminate answers that are true in general but not in this case.

Exam Day Tips

Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.


Real-world example

How this comes up in practice

A retailer trains a purchase-prediction model on two years of transactions, then launches a loyalty program that changes how customers buy. Errors on recent predictions grow even though the historical fit still looks strong. Scoring post-launch transactions separately from older ones shows whether the degradation lines up with the launch, which is the evidence needed before deciding to retrain.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.


FAQ

Questions learners often ask

What does this AI Associate question test?

This question tests the Data for AI domain: validating a suspected case of concept drift with a time-based holdout before changing the model.

What is the correct answer to this question?

The correct answer is A: create a holdout set of transactions from the last six months and compare model performance on it vs. older data. A measurable drop in R-squared or other metrics on the recent holdout shows that the model fails to generalize to the post-loyalty-program data distribution.


About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

1 more way this is tested on AI Associate

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. You are a Salesforce AI Specialist at a mid-sized manufacturing company. The company uses Einstein Lead Scoring to prioritize leads. The model was trained on historical lead data and has been in production for three months. Recently, the sales team reports that high-scoring leads are not converting as expected. You investigate and find that the model's data source includes leads from the past 18 months. However, six months ago, the company changed its lead qualification process: they started requiring a demo before scoring leads as 'qualified.' As a result, the definition of a converted lead changed. What is the best course of action to improve model performance?

hard

A. Manually adjust the model's prediction threshold to account for the new process
✓ B. Retrain the model using only leads from the last six months after the process change
C. Remove the 'Demo Scheduled' field from the model to avoid bias
D. Add more historical leads from before the process change to increase data volume

Why B: Option B is correct because the change in lead qualification process six months ago introduced a data distribution shift (concept drift), making older leads no longer representative of the current conversion behavior. Retraining the model on only the last six months of data aligns the training set with the new definition of a 'converted lead,' allowing Einstein Lead Scoring to learn the updated patterns and improve prediction accuracy.
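The effect of restricting training data to the post-change window can be illustrated with a toy example. This is not how Einstein Lead Scoring is configured or retrained; it is a plain-Python sketch in which the lead records, the field names (`created`, `demo_scheduled`, `converted`), the cutoff date, and the frequency-table "model" are all assumptions made for illustration.

```python
from datetime import date


def split_by_change(leads, change_date):
    """Partition leads into those created before and on/after the process change."""
    before = [lead for lead in leads if lead["created"] < change_date]
    after = [lead for lead in leads if lead["created"] >= change_date]
    return before, after


def conversion_rate_by_demo(leads):
    """Toy 'model': observed conversion rate for each value of demo_scheduled."""
    totals, wins = {}, {}
    for lead in leads:
        key = lead["demo_scheduled"]
        totals[key] = totals.get(key, 0) + 1
        wins[key] = wins.get(key, 0) + (1 if lead["converted"] else 0)
    return {key: wins[key] / totals[key] for key in totals}


# Hypothetical leads: before the change, conversion did not depend on a demo.
leads = [
    {"created": date(2023, 1, 10), "demo_scheduled": False, "converted": True},
    {"created": date(2023, 2, 5), "demo_scheduled": False, "converted": True},
    {"created": date(2023, 3, 1), "demo_scheduled": True, "converted": False},
    {"created": date(2023, 9, 1), "demo_scheduled": True, "converted": True},
    {"created": date(2023, 9, 15), "demo_scheduled": False, "converted": False},
    {"created": date(2023, 10, 2), "demo_scheduled": True, "converted": True},
    {"created": date(2023, 10, 20), "demo_scheduled": False, "converted": False},
]
change_date = date(2023, 7, 1)  # assumed date of the qualification-process change

_, recent_leads = split_by_change(leads, change_date)
rates_all = conversion_rate_by_demo(leads)
rates_recent = conversion_rate_by_demo(recent_leads)
print("Trained on all leads:   ", rates_all)
print("Trained on recent leads:", rates_recent)
```

In this toy data, mixing both periods blurs the signal from the demo field, while the recent-only window reflects the current definition of a converted lead. The trade-off is a smaller training set, which matters more with real data volumes.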


This AI Associate practice question is part of Courseiva's free Salesforce certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the AI Associate exam.