Courseiva

Free IT certification practice questions with explained answers for CCNA, CompTIA, AWS, Azure, Google Cloud, and more.


Courseiva is a free IT certification practice platform offering original exam-style practice questions, detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics for Cisco, CompTIA, Microsoft, AWS, and other technology certifications.

© 2026 Courseiva. Courseiva is operated by JTNetSolutions Ltd. All rights reserved.

Courseiva is an independent certification practice platform and is not affiliated with, endorsed by, or sponsored by Cisco, Microsoft, AWS, CompTIA, Google, ISC2, ISACA, or any other certification vendor. Vendor names and certification marks are used only to identify the exams learners are preparing for.


Data for AI

Practice AI Associate Data for AI questions with full explanations on every answer.

163 questions


All AI Associate Data for AI questions (163)


1

A company wants to use Einstein Prediction Builder to predict customer churn. Which data preparation step is essential before building the model?

2

A data scientist needs to prepare data for Einstein Discovery. The dataset includes a field 'Customer_Status__c' with values 'Active', 'Inactive', and 'Churned'. How should this field be treated?

3

A company uses Salesforce Data Cloud to unify customer data from multiple sources. After connecting a data stream, they notice that records are missing from the unified profile. What is the most likely cause?

4

A Salesforce admin is training an Einstein Bot to answer customer questions. Which data source should the bot use to provide accurate responses?

5

A company uses Einstein Discovery to identify factors that increase case resolution time. After training, the model shows that 'Case_Origin__c' has high importance. What action should the company take?

6

A company has set up Einstein Next Best Action with a recommendation strategy. They want to ensure that recommendations are personalized based on the customer's recent behavior. What data should be used?

7

A company wants to use Einstein Article Recommendations to suggest knowledge articles to support agents. What is a prerequisite for this feature?

8

Which TWO actions are required to prepare data for an Einstein Discovery model?

9

Which THREE factors should be considered when evaluating the quality of a dataset for an AI model?

10

Which TWO data sources can be used with Einstein Prediction Builder?

11

A company uses Salesforce Data Platform to store customer data. They want to use this data to train an AI model for lead scoring, but they are concerned about data quality. Which step should they take first to ensure the data is suitable for AI?

12

A data scientist is building a predictive model for customer churn using Salesforce data. The dataset has 20 features, and the target variable is highly imbalanced (5% churn, 95% non-churn). Which technique should be applied to handle the class imbalance before training?

13

An administrator is configuring a Salesforce AI model that uses historical sales data. The data includes fields like 'Amount', 'Close_Date', and 'Lead_Source'. What is the primary purpose of data preprocessing in this context?

14

A company is deploying an AI model that recommends next best actions for sales reps. They notice that the model's recommendations are biased towards high-revenue opportunities. Which data-related action can help reduce this bias?

15

A Salesforce admin wants to use Einstein Prediction Builder to predict case resolution time. What type of data is most critical for training this model?

16

During the data preparation phase for an AI model, a data engineer discovers that the 'AnnualRevenue' field contains some negative values. What is the best course of action?

17

Which TWO techniques are commonly used to handle missing values in a dataset for AI training?

18

Which THREE factors should be considered when selecting features for a predictive model in Salesforce?

19

Which TWO are common data quality issues that can negatively impact AI model performance?

20

A company wants to use its data from Salesforce to train an Einstein AI model. However, they need to exclude records where the customer has opted out of data use. Which field should they configure in the Data Manager?

21

A Salesforce admin is troubleshooting an Einstein Prediction Builder model that is not generating predictions. The model was created with a custom object 'Feedback__c'. The admin notices that the model's data source includes records with status 'In Progress' and 'Closed'. What is the most likely cause of the model not generating predictions?

22

A large enterprise is using Einstein Lead Scoring and notices that the model score is not updating for leads created via a web-to-lead form. The leads have all required fields populated. The admin has verified that the model is active and the data source includes the Lead object. What could be causing the score to remain static?

23

A company wants to use Einstein Article Recommendations to surface relevant knowledge articles to its support agents. What two data components are required to set up this feature?

24

An admin is configuring Einstein Vision and wants to train a model to identify product defects from images. The admin has uploaded 500 images of defective products and 500 images of non-defective products. However, the model training fails with an error about data quality. What is the most likely cause?

25

A company is using Einstein Discovery to predict customer churn. The model was created six months ago and has been making predictions. Recently, the model's accuracy has dropped significantly. The data scientist confirms that the data schema has not changed. What is the most likely reason for the drop in accuracy?

26

A company is implementing Einstein Prediction Builder to predict whether a support case will escalate. Which TWO data preparation steps should the admin take to improve model accuracy?

27

You are a Salesforce AI Specialist at a mid-sized manufacturing company. The company uses Einstein Lead Scoring to prioritize leads. The model was trained on historical lead data and has been in production for three months. Recently, the sales team reports that high-scoring leads are not converting as expected. You investigate and find that the model's data source includes leads from the past 18 months. However, six months ago, the company changed its lead qualification process: they started requiring a demo before scoring leads as 'qualified.' As a result, the definition of a converted lead changed. What is the best course of action to improve model performance?

28

You are an admin at a financial services firm. The firm wants to use Einstein Next Best Action to offer personalized product recommendations to customers on its service portal. The data includes customer profiles, transaction history, and support case history. The Einstein Next Best Action strategy is configured with a recommendation that shows a 'Savings Account' offer to customers who have a checking account. However, the recommendation is not appearing for any customers. You check the Data Flow and see that the 'Account' object data is flowing correctly. The recommendation's filter condition is: AND( Has_Checking_Account__c = true, Age__c > 18 ). You verify that many customers meet these conditions. What is the most likely reason the recommendation is not appearing?

29

You are a Salesforce admin at a nonprofit organization. The organization uses Einstein Engagement Scoring to prioritize donors for outreach. The model is based on donation history and event attendance. Recently, the model stopped generating new scores for recently added donors. You check the data source and see that the model's data includes the 'Contact' and 'Opportunity' objects. The data refresh is scheduled daily. The model status is 'Active'. What should you investigate first to resolve the issue?

30

You are a data scientist at a retail company. The company uses Einstein Discovery to analyze customer purchase patterns. The model is built on a dataset of 50,000 transactions. The model's R-squared is 0.85, but the predictions for new customers are consistently off by a large margin. The data includes features like 'Customer Age', 'Income', 'Previous Purchases', and 'Product Category'. The model was trained on data from the past two years. However, six months ago, the company launched a new loyalty program that significantly changed purchasing behavior. You suspect the model is not generalizing to new customers. What should you do to validate your hypothesis?

31

A company is preparing customer data for a predictive model. They notice that many records have missing values for the 'annual income' field. Which approach is best to handle this issue while minimizing bias?

32

A team is labeling text data for a sentiment analysis model. To ensure consistency and quality, which practice should they prioritize?

33

For a real-time AI application that requires low-latency access to customer interaction data, which storage solution is most appropriate?

34

A company wants to use customer purchase history to train a recommendation model. Which action is essential to comply with data privacy regulations?

35

A data pipeline fails intermittently when processing large CSV files. The error log shows 'OutOfMemoryError'. Which configuration change is most likely to resolve this?

36

Which data transformation is most appropriate for converting categorical variables into numerical format for a machine learning model?

37

A dataset contains a 'date' column. Which feature engineering technique would best capture both long-term trends and seasonal patterns?

38

Which method is most suitable for ingesting streaming data from IoT sensors into a data lake?

39

A global company needs to ensure that customer data used for AI models complies with multiple regional regulations (GDPR, CCPA, LGPD). Which data governance practice is most effective?

40

Which TWO data preparation steps are critical for ensuring high-quality training data?

41

Which THREE are key dimensions of data quality that directly impact AI model performance?

42

Which TWO considerations are important when labeling data for a supervised learning model?

43

Refer to the exhibit. A data access policy is defined for a customer data set. Which statement best describes this policy?

44

Refer to the exhibit. The data pipeline is failing. What is the most likely cause?

45

Refer to the exhibit. A developer runs a SOQL query. What does the output indicate?

46

A data engineer is troubleshooting a predictive model that stopped updating. The data flow from Data Cloud shows 'Data Transform Failed' with error: 'Field Amount cannot be null'. What is the most likely cause?

47

A company is preparing data for Einstein Article Recommendations. Which data source is most appropriate for training the model?

48

A retail company uses Einstein Next Best Action with customer data from Data Cloud. The recommendations are not personalized. The admin checks the data quality dashboard and finds that 40% of records in the 'Customer_Profile' object are missing the 'PreferredChannel' field. What is the best course of action?

49

An admin created a data stream to bring external customer data into Data Cloud for Einstein. The data stream fails with error 'Schema mismatch: expected 10 fields, got 8'. What is the likely cause?

50

A company wants to use Einstein Reply Recommendations in Service Cloud. What data is required to train the model?

51

A data architect is designing a data model for Einstein Discovery. The data includes categorical variables with high cardinality (e.g., postal codes). What is the best practice to handle such features?

52

A company uses Einstein Prediction Builder to predict customer churn. The model's accuracy is low. The admin reviews the training data and notices that only 2% of records are labeled as churned. What should the admin do to improve the model?

53

A system administrator receives an error when running a Data Cloud data transform: 'Row-level security settings are preventing access to the source data.' The admin has appropriate permissions. What is the most likely cause?

54

A marketer wants to use Einstein Segment Creation to build a segment for a campaign. Which data source can be used?

55

A data analyst is evaluating data quality for an Einstein model. Which TWO dimensions are most critical for model accuracy?

56

A company is ingesting data from multiple sources into Data Cloud for Einstein. Which THREE data preparation steps should be performed?

57

A data scientist is using Einstein Discovery to analyze sales data. The model results show a high correlation between two predictor variables. Which TWO actions should the data scientist take?

58

Refer to the exhibit. What effect does this masking policy have on the data used for training an Einstein model?

59

Refer to the exhibit. What is the most likely cause of this error?

60

Refer to the exhibit. What data quality issue does the exhibit reveal?

61

A company is preparing data for Einstein Prediction Builder to forecast lead conversion. They have historical data with fields like Lead Source, Industry, Number of Employees, and Converted (boolean). Which data preparation step is most critical?

62

A data scientist notices that an Einstein model for predicting customer churn has unusually high accuracy on training data but performs poorly on validation data. Which data issue is the most likely cause?

63

A company wants to build a sentiment analysis model using customer feedback. What is the best practice for labeling the training data?

64

A large enterprise needs to integrate data from Salesforce CRM, an external ERP, and marketing automation to train an AI model for cross-sell recommendations. Which data storage strategy is most aligned with Salesforce's AI capabilities?

65

A company is using customer support tickets to train a model for auto-classifying issues. The dataset includes fields like 'Case Title', 'Description', 'Product', and 'Customer Name'. Which privacy concern is most critical to address before training?

66

A fraud detection model is being trained on transaction data where only 1% of transactions are fraudulent. The current model predicts 'non-fraud' for all transactions, achieving 99% accuracy. Which technique should be applied to improve model performance?

67

After applying a log transformation to a numeric feature, an Einstein model’s performance dropped significantly. What is the most likely cause?

68

A bank uses Einstein Discovery to generate insights about loan approval decisions. After deployment, they notice the model denies loans to a higher percentage of applicants from a certain postal code. Which action should be taken to ensure responsible AI?

69

A company plans to use Einstein Discovery to analyze sales data. Which data preparation step is essential for time-series forecasting?

70

A company is training a customer service chatbot using historical conversation logs. Which TWO data preparation practices should be followed to ensure data quality?

71

Before training an Einstein Prediction model, a data analyst must perform data quality checks. Which THREE checks are most critical?

72

A data scientist is preparing numeric features for a regression model. Which TWO transformations are commonly applied to improve model performance?

73

Refer to the exhibit. A data analyst has defined this field mapping for Einstein Prediction Builder. Which data issue would most likely arise from this mapping?

74

Refer to the exhibit. A data file for click-through model training has the above content. Which data quality issue is most critical to address before training?

75

Refer to the exhibit. A data analyst runs a profile on a dataset and sees these statistics. Based on best practices, which action should be taken first?

76

A data scientist notices that the model accuracy drops significantly after retraining with new data. Upon inspection, they find that many records have missing values for a key feature. Which data quality improvement should be prioritized first?

77

A company is building a text classification model for customer support tickets. They have a dataset of 10,000 tickets. The team decides to use active learning for labeling. Which approach best aligns with active learning principles?

78

For an AI project, data must be stored in a way that supports both training and real-time inference. Which storage solution meets this requirement?

79

A data engineer needs to create a feature that represents the average purchase amount per customer over the last 30 days. The transactional data is timestamped. Which feature engineering technique is most appropriate?

80

A healthcare AI model uses patient data. The legal team requires that all data used for training be de-identified according to the HIPAA Safe Harbor method. Which data handling process satisfies this?

81

A team is building a pipeline to train a model daily. The source data arrives in CSV files but needs to be converted to Parquet for efficiency. Which pipeline step should perform this conversion?

82

During data transformation, a data scientist applies one-hot encoding to a categorical feature with 50 unique values. The resulting dataset has 50 new columns. What is a potential drawback of this transformation?

83

An organization uses Salesforce Data Cloud to unify customer data from multiple sources. They want to ensure that data lineage is tracked for AI models. Which practice supports data lineage?

84

A machine learning team is preparing a dataset for a supervised learning task. They have 100,000 labeled samples. Which data preparation step is essential before splitting into train/test sets?

85

Which TWO of the following are common dimensions of data quality that must be addressed for AI training?

86

Which TWO considerations are critical when planning data labeling for a computer vision project in a regulated industry?

87

Which THREE types of data sources are commonly integrated into Salesforce Data Cloud for AI use cases?

88

Refer to the exhibit. A data scientist tries to query the dataset but receives an error. Which of the following is the most likely cause?

89

Refer to the exhibit. A data pipeline fails during the DataTransformation stage. What is the most likely root cause?

90

Refer to the exhibit. A data transformation configuration is shown. Which of the following describes the outcome of applying this transformation?

91

A company uses Einstein Prediction Builder to predict customer churn. The data includes account creation date, number of support cases, and average payment delay. After training, the model shows low confidence scores. What is the most likely cause?

92

A Salesforce admin wants to use Einstein Recommendations to suggest products. What is a key requirement for the data used to train the recommendation model?

93

An organization is preparing data for Einstein Next Best Action. They have multiple action types (discounts, product suggestions, content). Which data model approach best ensures accurate recommendations?

94

A data scientist is preparing data for Einstein Discovery. The dataset has 10,000 records with 5 predictors and one outcome. The outcome is binary (1/0). What is the minimum number of positive outcomes typically required for a reliable model?

95

An admin is setting up Einstein Article Recommendations. Which type of data is essential for the model to learn which articles are relevant?

96

A company uses Einstein Forecasting for revenue prediction. The historical data shows seasonal spikes every quarter. The model consistently underestimates peak periods. What is the best data preparation step to improve accuracy?

97

An admin is troubleshooting Einstein Sentiment. The model returns high confidence but wrong sentiment (e.g., positive reviews labeled negative). What is the most likely issue?

98

When using Einstein Lead Scoring, which data source is most critical for generating accurate lead scores?

99

A company has international customers and wants Einstein Prediction Builder to forecast deal closure probability. The data includes fields like 'region', 'product line', and 'deal amount'. What is a best practice to ensure the model works for all regions?

100

Which TWO data preparation steps are required before using Einstein Discovery for sales forecasting? (Choose 2)

101

Which THREE actions are recommended when preparing data for Einstein Next Best Action? (Choose 3)

102

A data analyst is troubleshooting Einstein Article Recommendations that are not showing up on the site. Which TWO checks should be performed first? (Choose 2)

103

Refer to the exhibit. A data scientist sees this error when training an Einstein Discovery model for customer churn prediction. What is the most likely reason for the error?

104

Refer to the exhibit. A developer runs this SOQL query to prepare data for Einstein Lead Scoring. The query returns an error. What is the most likely issue?

105

Refer to the exhibit. A dataflow is set up to prepare data for a prediction model. The model is expected to predict close probability for all open opportunities. What is wrong with this dataflow?

106

A company wants to train an AI model to predict customer churn using historical data that contains many missing values. What is the best practice for handling missing data?

107

A data scientist needs to feed customer interaction data into Einstein Discovery for predictive analysis. Which data format is required?

108

A company uses Salesforce Data Cloud to unify customer data from multiple sources for AI model training. After adding a new data source, model performance degrades significantly. What is the most likely cause?

109

Which data type is most commonly used for image recognition AI models?

110

A team has limited labeled data for a Salesforce predictive model but wants to leverage a pre-trained model from a related task. Which machine learning approach should they use?

111

After deploying an AI model in Salesforce, the data scientist notices high accuracy on the training set but poor accuracy on new incoming data. What is this phenomenon called?

112

To ensure AI model fairness and avoid biased outcomes, which practice is most critical when preparing training data?

113

A company wants to integrate external customer behavior data into Salesforce to enhance AI predictions. Which Salesforce Data Cloud feature is specifically designed to ingest and map external data?

114

A data scientist discovers that an AI model used for loan approval predicts high default risk disproportionately for a specific demographic group. What is the first step to address this issue?

115

Which TWO are best practices for data labeling in AI projects? (Choose two.)

116

Which THREE are key considerations for data privacy when using AI models that process customer data? (Choose three.)

117

Which TWO are common data quality issues that negatively impact AI model performance? (Choose two.)

118

What is the most likely cause of the error?

119

What is the primary purpose of this policy?

120

What is being performed in this command?

121

A Salesforce admin is preparing a dataset for Einstein Prediction Builder. The dataset contains a field "Income" with many missing values. The admin wants to minimize bias in the model. What is the best practice?

122

When training an Einstein Discovery model, which data type is not supported as a predictor field?

123

A data scientist notices that a Salesforce Einstein model's performance degrades over time. The model was trained on data from the last year. What is the most likely cause?

124

To integrate external data into Salesforce for AI, which tool is recommended by Salesforce for building data pipelines?

125

In Salesforce CRM Analytics (formerly Einstein Analytics), what is the primary purpose of a dataset?

126

A company wants to use Einstein Next Best Action but needs to ensure data privacy. What is the required step for anonymizing customer data in Data Pipelines?

127

While building a prediction model in Einstein Studio, the system warns about "high cardinality" for a categorical field. What should the admin do?

128

Which Salesforce feature automatically flags data quality issues before training an AI model?

129

A data integration specialist is using Data Pipelines to combine Salesforce data with an external CSV file. The CSV has a header row but some rows have extra commas, causing parsing errors. What should the specialist do?

130

A Salesforce admin is reviewing data sources for Einstein Recommendation Builder. Which two data types are required for training? (Choose two.)

131

Which three practices help maintain data quality for AI models in Salesforce? (Choose three.)

132

When preparing data for Einstein Next Best Action, which two aspects must be considered for compliance with data privacy regulations? (Choose two.)

133

Refer to the exhibit. In the JSON configuration above, which data preparation step could introduce bias?

134

Refer to the exhibit. What is the most likely cause of the pipeline failure?

135

A global company uses Salesforce Einstein Discovery to predict customer churn. They have a dataset with fields: Customer_Since__c (date), Last_Interaction_Date__c (date), Support_Cases__c (number), Product_Usage__c (percentage), Region__c (picklist), and Churned__c (boolean target). The model was trained and deployed, but predictions show bias against customers in the "EMEA" region. The data scientist notices that in the training data, 80% of EMEA customers are labeled as churned, compared with only 20% of customers in other regions. Additionally, the Product_Usage__c field has many missing values for EMEA customers. The company wants to retrain the model to reduce bias. What is the best course of action?

136

A marketing agency needs to ingest real-time social media mentions for a sentiment analysis AI model. Which Data Cloud object type should they use to set up the ingestion?

137

A retailer's AI recommendation model is producing poor results. Analysis shows that the customer entity has many duplicate records with slight variations. Which Data Cloud feature should be used to address this?

138

A large enterprise uses Data Cloud to power an Einstein model for lead scoring. The model's feature pipeline includes dozens of fields from multiple data streams. Performance has degraded, and the team suspects slow feature retrieval. What is the most efficient way to speed up feature computation in Data Cloud?

139

A company plans to train an AI model using data from Salesforce CRM and an external marketing automation platform. What is the first step to unify these data sources in Data Cloud?

140

A financial institution must ensure that customer data used for AI models does not expose personally identifiable information (PII) to unauthorized users. Which Data Cloud feature should be applied to the data model?

141

A data architect notices that a Data Stream from an external ERP system is failing intermittently with schema mismatch errors. The ERP team says the schema changes occasionally. What is the most effective long-term solution?

142

A news outlet wants to build an AI model that predicts article popularity using real-time social media mentions. Which data source type should they use to ingest tweets?

143

A manufacturer wants to improve demand forecasting by enriching its CRM orders with external demographic data. The external data is available via a SOAP API. How should the data architect implement this?

144

Which TWO of the following are valid methods to improve data quality in Data Cloud before training an AI model?

145

Which THREE of the following are required when setting up a data stream from Salesforce to Data Cloud?

146

Which THREE of the following are best practices for feature engineering in Einstein Studio?

147

A large retail company uses Data Cloud to consolidate customer data from e-commerce, POS, and loyalty programs. They plan to use Einstein Studio to build a churn prediction model. The data architect notices that the churn model's accuracy is below expectations. Upon investigation, they find that the customer entity in Data Cloud has multiple records for the same customer with slightly different spellings and addresses. The data comes from different streams. What should the data architect do to improve the model?

148

A financial services firm uses Data Cloud to enrich sales data with external credit scores via an API. They set up a Data Action to call the credit bureau API for each new lead. Over time, API costs are rising, and the action is slowing down lead processing. They only need credit scores for leads with a high probability of conversion. What is the best approach to reduce costs and improve performance?

149

A non-profit organization uses Data Cloud to manage donor data from multiple sources (email campaigns, event attendance, donations). They want to use an AI model to predict future donations. The data scientist says the model needs a unified view of each donor with consistent fields. What is the first step the data architect should take in Data Cloud to enable this?

150

A healthcare provider implements Data Cloud to predict patient readmission rates. They have HIPAA compliance requirements. The data includes sensitive patient health information (PHI). The AI model must be trained without exposing PHI to unauthorized users. The data architect uses Data Cloud's data masking on PHI fields. However, model performance drops significantly after masking because the masked values lose predictive value. What additional step should the architect consider to maintain model performance while protecting PHI?

151

A company is preparing customer data to train a custom AI model for sentiment analysis. Which two data preparation best practices should they follow? (Choose two.)

152

Data quality is critical for AI model performance. Which three data quality dimensions should be monitored? (Choose three.)

153

A retail company has implemented a Salesforce AI lead scoring model to prioritize high-value customers. After three months, the model's AUC-ROC score is only 0.55, indicating poor performance. The data scientist reviews the training data and finds that 20% of the records are exact duplicates due to multiple data imports from different sources. The duplicates have inconsistent target labels (some labeled 'converted', others 'not converted'). What should the data scientist do to improve model performance?

154

A telecom company uses Einstein Discovery to predict customer churn. The training dataset contains 100,000 records, but only 5% represent churned customers. The model achieves 95% accuracy on a holdout test set, but the recall for churn is only 20%. The business wants to proactively retain at-risk customers, so they need to identify as many churners as possible. What action should the data scientist take to improve churn recall?

155

A healthcare organization uses Salesforce to develop an AI model for patient readmission prediction. They must comply with HIPAA regulations. The dataset includes patient names, addresses, medical record numbers, and detailed clinical notes. The data scientist plans to train a supervised model using historical readmission outcomes. What is the most important data governance step before model training?

156

A marketing team wants to use Einstein Recommendations to personalize product offers on their e-commerce site. They have a dataset of 50,000 customers with purchase history. However, 40% of customers have no purchase history (new registrations). The model performs well for returning customers but gives generic recommendations for new ones. The team wants to improve recommendations for new customers. What data preparation step should they take?

157

A sales operations team is training an AI model to forecast quarterly revenue. They have five years of historical data, which includes a strong seasonal pattern but also a significant outlier: during the pandemic year, revenue dropped by 70% from typical values. The model trains with high accuracy on historical data but fails to predict future quarters accurately, consistently overestimating revenue. What should the data scientist do to improve forecast accuracy?

158

A financial services company uses Salesforce AI to detect fraudulent transactions. The dataset has 1 million legitimate transactions and only 1,000 fraudulent ones. The model trained with default parameters achieves 99.9% accuracy but identifies no fraud (precision and recall of 0). The data scientist wants to maximize fraud detection (recall) while minimizing false positives. Which approach is most effective?

159

A company is building a chatbot using the AI capabilities of Einstein Bots. They want to train intent recognition using historical chat transcripts. The transcripts contain many typos (e.g., 'hellp' instead of 'help') and slang (e.g., 'gonna' instead of 'going to'). The initial model performs poorly, misclassifying many intents. What data cleaning step is most important?

160

A multinational corporation uses Salesforce AI to analyze customer feedback across multiple languages. They have 10,000 English reviews, 2,000 Spanish reviews, and 500 French reviews. The sentiment model performs well on English (F1=0.85) but poorly on French (F1=0.40). The data scientist wants to improve French sentiment performance without collecting new data. What should they do?

161

A company is preparing its Salesforce Data Cloud data for Einstein AI predictions and needs to ensure data quality and governance. Which two actions should they take? (Choose two.)

162

Refer to the exhibit. A data analyst receives an error when trying to use this model configuration for Einstein AI predictions. Which issue is most likely causing the error?

163

A retail company uses Salesforce Data Cloud to power Einstein AI for personalized product recommendations. They have integrated customer data from multiple sources: ERP (order history), marketing automation (email engagement), and web analytics (browsing behavior). The data model includes a unified Customer__dlm object with fields: Age__c, TotalSpend__c, LastPurchaseDate__c, EmailEngagementScore__c, and WebSessionCount__c. The AI model is configured to predict "LikelyToPurchaseNextWeek__c" (Boolean). The data team has noticed that the predictions are less accurate for new customers (those with fewer than 30 days of data). The model was trained on all customer data without any filtering. The team wants to improve model performance without increasing training frequency. What should they do?
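
The class-imbalance figures in question 158 above (1,000,000 legitimate and 1,000 fraudulent transactions) can be checked with a few lines of arithmetic. The sketch below is only an illustration of why accuracy looks high in that scenario; the `confusion_metrics` helper and the "label everything as legitimate" model are assumptions made for the example, not part of any Salesforce API, and reporting precision as 0 when no positives are predicted is a convention chosen here.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    Precision and recall are reported as 0.0 when their denominator is zero,
    which is a common convention and an assumption of this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Counts taken from the question stem.
LEGIT, FRAUD = 1_000_000, 1_000

# Hypothetical model that labels every transaction as legitimate:
# no true positives, no false positives, every fraud case is missed.
accuracy, precision, recall = confusion_metrics(tp=0, fp=0, fn=FRAUD, tn=LEGIT)

print(f"accuracy={accuracy:.4f} precision={precision:.1f} recall={recall:.1f}")
```

With these counts the accuracy is 1,000,000 / 1,001,000, roughly 99.9%, even though no fraudulent transaction is caught, which matches the numbers quoted in the question.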

Frequently asked questions

What does the Data for AI domain cover on the AI Associate exam?

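The Data for AI domain covers the data-related concepts in the AI Associate exam blueprint published by Salesforce. The questions listed on this page touch on data quality, data preparation and cleaning, class imbalance, governance of sensitive data, and unifying records in Data Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all AI Associate domains, with no account required.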

How many Data for AI questions are in the AI Associate question bank?

The Courseiva AI Associate question bank contains 163 questions in the Data for AI domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Data for AI for AI Associate?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Data for AI questions for AI Associate?

Yes — the session launcher on this page draws questions exclusively from the Data for AI domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
