Salesforce AI Associate (AI Associate): Questions 151-225

506 questions total · 7 pages · All types, answers revealed

Page 3 of 7

151
MCQhard

A large financial institution uses Einstein Discovery to automate loan pre-approval decisions. The model was trained on ten years of historical data. After deployment, the compliance team finds that the approval rate for minority groups is 15% lower than the majority group, even after controlling for credit score and income. The data is balanced across groups. The model uses features like zip code, employment history, and debt-to-income ratio. The institution has a strict policy of fairness and non-discrimination. The AI team proposes three options: (1) remove zip code and employment history from the model, (2) add a fairness constraint to the model training, (3) lower the decision threshold for minority groups to balance approval rates. The compliance officer must choose the most ethical and effective course of action that aligns with Salesforce AI ethical guidelines. Which option should they choose?

A.Add a fairness constraint to the model training
B.Lower the decision threshold for minority groups
C.Remove zip code and employment history from the model
D.Continue using the model as is, since data is balanced
AnswerA

Fairness constraints adjust the model to reduce bias while maintaining accuracy.

Why this answer

Option A is correct because adding a fairness constraint addresses the bias directly during training, without applying different thresholds to different groups (Option B) and without relying on feature removal (Option C), which may not eliminate bias when other features are correlated with the removed ones. Option B is wrong because it applies different standards to groups, which may be discriminatory and illegal. Option C is wrong because zip code and employment history may be proxies; removing them could reduce predictive power without fully solving the bias.

Option D is wrong because continuing to use a model with a known disparity is unethical; balanced data alone does not guarantee fair outcomes.

152
MCQhard

A financial services company uses Salesforce AI to detect fraudulent transactions. The dataset has 1 million legitimate transactions and only 1,000 fraudulent ones. The model trained with default parameters achieves 99.9% accuracy but identifies no fraud (precision and recall of 0). The data scientist wants to maximize fraud detection (recall) while minimizing false positives. Which approach is most effective?

A.Increase the weight of the majority class in the loss function.
B.Use SMOTE to generate synthetic fraud samples to balance the dataset.
C.Train multiple models on different random subsets and average predictions.
D.Use a simpler model to avoid overfitting on the majority class.
AnswerB

SMOTE creates synthetic instances of the minority class, allowing the model to learn fraud patterns effectively and improve recall.

Why this answer

With extreme imbalance, oversampling the minority class (e.g., SMOTE) generates synthetic fraud examples, helping the model learn fraud patterns and improve recall without discarding legitimate data.
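
The oversampling idea can be made concrete with a minimal sketch of SMOTE-style interpolation in plain Python. This is not the reference SMOTE implementation or any Salesforce API; the function name `smote_like_oversample`, the parameter `k`, and the toy fraud points are assumptions made for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by Euclidean distance to the base point.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical fraud samples: (transaction amount, hour of day).
fraud = [(900.0, 2.0), (950.0, 3.0), (1000.0, 1.0), (870.0, 4.0)]
new_fraud = smote_like_oversample(fraud, n_new=6)
print(len(fraud) + len(new_fraud), "fraud samples after oversampling")
```

Each synthetic point lies on a line segment between two real fraud samples, so the minority class grows without duplicating records exactly and without discarding any legitimate transactions.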

153
MCQeasy

A Salesforce admin wants to use Einstein Prediction Builder to predict case resolution time. What type of data is most critical for training this model?

A.Customer satisfaction survey responses
B.Historical case records including resolution time
C.Product inventory levels
D.Employee work schedules
AnswerB

Historical data is essential for training.

Why this answer

Einstein Prediction Builder requires historical data with known outcomes to train a supervised machine learning model. Historical case records containing actual resolution times provide the labeled examples needed for the model to learn patterns and predict future case resolution times. Without this ground truth data, the model cannot be trained to make accurate predictions.

Exam trap

The trap here is that candidates may confuse factors that influence resolution time (like employee schedules or inventory) with the actual labeled outcome data required to train a supervised prediction model.

How to eliminate wrong answers

Option A is wrong because customer satisfaction survey responses measure post-resolution sentiment, not the actual resolution time, and they lack the precise timestamp data required for regression-based time prediction. Option C is wrong because product inventory levels are unrelated to case resolution time; they might be relevant for supply chain predictions but not for service case duration. Option D is wrong because employee work schedules, while potentially influencing resolution time, are not the historical outcome data needed to train the model — the model needs actual resolution times from past cases, not staffing inputs.

154
Multi-Selecteasy

Which TWO are key principles of Salesforce's AI ethics? (Choose two.)

Select 2 answers
A.Speed of deployment
B.Profitability
C.Transparency
D.Accountability
E.Full automation
AnswersC, D

Salesforce emphasizes explainable AI.

Why this answer

Options C and D are correct because transparency and accountability are core principles. Option A is wrong because speed of deployment is not an ethical principle. Option B is wrong because profitability is not an ethical principle.

Option E is wrong because full automation is a capability, not a principle.

155
MCQhard

A company's Einstein Discovery model for customer lifetime value shows a significant correlation between predicted value and customer's postal code. The company is concerned about ethical implications. What is the most appropriate response?

A.Remove the postal code field from the model immediately
B.Investigate whether postal code is a proxy for protected attributes and, if so, consider retraining the model without it or with fairness constraints
C.Add more demographic data to the model to improve its accuracy
D.Ignore the correlation since the model is predicting business value, not demographic attributes
AnswerB

This approach addresses the ethical concern while preserving model utility.

Why this answer

Option B (Investigate whether postal code is a proxy for protected attributes and, if so, consider retraining the model without it or with fairness constraints) is correct because postal code can be a proxy for race or income. Option A (removing postal code immediately) skips the investigation and may not resolve the issue, since other features can carry the same signal. Option C (adding more demographic data) could increase bias.

Option D (ignoring the correlation) is unethical.
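
One way to approach the proxy investigation is to check how much better a protected attribute can be guessed from postal code than from no information at all. The sketch below is an assumption-laden illustration, not an Einstein Discovery feature: the function `proxy_strength`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict


def proxy_strength(records, feature, protected):
    """Return (baseline, by_feature): accuracy of guessing the protected
    attribute by overall majority vs. by majority within each feature value."""
    overall = Counter(r[protected] for r in records)
    baseline = overall.most_common(1)[0][1] / len(records)

    per_value = defaultdict(Counter)
    for r in records:
        per_value[r[feature]][r[protected]] += 1
    correct = sum(c.most_common(1)[0][1] for c in per_value.values())
    return baseline, correct / len(records)


# Hypothetical audit sample; group labels are placeholders.
customers = [
    {"postal_code": "11111", "group": "X"},
    {"postal_code": "11111", "group": "X"},
    {"postal_code": "11111", "group": "X"},
    {"postal_code": "22222", "group": "Y"},
    {"postal_code": "22222", "group": "Y"},
    {"postal_code": "22222", "group": "X"},
]
baseline, by_postal = proxy_strength(customers, "postal_code", "group")
print(f"baseline={baseline:.2f}, using postal code={by_postal:.2f}")
```

A large gap between the two numbers suggests that postal code carries information about the protected attribute and deserves the closer review described above.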

156
MCQmedium

While building a prediction model in Einstein Studio, the system warns about "high cardinality" for a categorical field. What should the admin do?

A.Convert the field to a numeric type
B.Use frequency encoding or binning to reduce cardinality
C.Remove the field from the model
D.Increase the model complexity by adding more trees
AnswerB

Frequency encoding (replacing each value with its count) and binning (grouping rare values into broader categories) reduce cardinality while preserving signal.

Why this answer

Option B is correct because high cardinality (many unique values) can hurt model performance, and frequency encoding or binning reduces cardinality while retaining information. Removing the field (Option C) discards its information, and converting it to a numeric type (Option A) may impose an ordering that does not exist. Adding more trees (Option D) increases model complexity without addressing the cardinality itself.
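
Both techniques fit in a few lines of plain Python. This is a generic sketch, not how Einstein Studio handles the field internally; the helper names, the `min_count` cutoff, and the sample values are assumptions.

```python
from collections import Counter


def frequency_encode(values):
    """Replace each category with the number of times it occurs."""
    counts = Counter(values)
    return [counts[v] for v in values]


def bin_rare(values, min_count=2, other="Other"):
    """Keep categories seen at least min_count times; group the rest."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other for v in values]


# Hypothetical high-cardinality field.
cities = ["Paris", "Paris", "Lyon", "Oslo", "Paris", "Lyon", "Turin"]
encoded = frequency_encode(cities)
binned = bin_rare(cities)
print(encoded)  # [3, 3, 2, 1, 3, 2, 1]
print(binned)   # ['Paris', 'Paris', 'Lyon', 'Other', 'Paris', 'Lyon', 'Other']
```

Frequency encoding yields a single numeric column, while binning keeps the field categorical with fewer distinct values.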

157
MCQmedium

A data scientist is evaluating the performance of an Einstein Discovery model. They observe that the model has high accuracy but low precision for a specific prediction class. What does this indicate?

A.The model is overfitted to the training data.
B.The model correctly predicts most instances but has many false positives for that class.
C.The model correctly predicts most instances but has many false negatives for that class.
D.The model rarely predicts that class, leading to high accuracy.
AnswerB

Low precision indicates many false positives.

Why this answer

High accuracy with low precision for a specific class indicates that while the model correctly classifies the majority of instances overall, it produces a high number of false positives for that class. Precision measures the proportion of positive identifications that were actually correct, so low precision means many of the predicted positive cases are false alarms. In Einstein Discovery, this trade-off is critical when optimizing for business outcomes where false positives are costly.

Exam trap

Salesforce often tests the distinction between precision and recall, trapping candidates who confuse false positives (low precision) with false negatives (low recall) when interpreting accuracy metrics.

How to eliminate wrong answers

Option A is wrong because overfitting typically leads to high accuracy on training data but poor generalization to unseen data, not specifically to low precision for a single class. Option C is wrong because many false negatives would reduce recall, not precision; the scenario describes low precision, which is about false positives. Option D is wrong because if the model rarely predicts that class, precision could be high (few predictions, mostly correct) and accuracy could be high due to class imbalance, but low precision indicates frequent incorrect predictions for that class.
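
A small worked example shows how high accuracy and low precision can coexist. The confusion-matrix counts below are hypothetical and chosen only to illustrate the arithmetic.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts for one class out of 1,000 predictions.
accuracy, precision, recall = classification_metrics(tp=20, fp=30, fn=5, tn=945)
print(f"accuracy={accuracy:.3f} precision={precision:.2f} recall={recall:.2f}")
```

Here 965 of 1,000 predictions are correct, yet only 20 of the 50 positive predictions are right, so the 30 false positives drag precision down to 0.40 while accuracy stays at 0.965.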

158
MCQmedium

A company is preparing customer data for a predictive model. They notice that many records have missing values for the 'annual income' field. Which approach is best to handle this issue while minimizing bias?

A.Remove all records with missing values.
B.Use model-based imputation considering other features.
C.Replace missing values with the mean.
D.Set missing values to zero.
AnswerB

Model-based imputation leverages other features to predict missing values, preserving relationships and minimizing bias.

Why this answer

Model-based imputation (Option B) is best because it uses relationships between features (e.g., education, job role) to predict missing 'annual income' values, preserving data distribution and minimizing bias. This approach avoids the distortion caused by simple mean/zero imputation and retains sample size better than deletion.

Exam trap

Salesforce often tests the misconception that mean imputation is a safe default, but the trap here is that it ignores feature dependencies and can artificially shrink variance, leading to overconfident model predictions and biased coefficients.

How to eliminate wrong answers

Option A is wrong because removing all records with missing values can introduce selection bias and reduce sample size, potentially discarding valuable patterns in the data. Option C is wrong because replacing missing values with the mean ignores feature correlations, artificially compresses variance, and can bias relationships in the predictive model. Option D is wrong because setting missing values to zero is arbitrary and unrealistic for income data, likely creating a skewed distribution and misleading model coefficients.
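
The difference between model-based and mean imputation can be sketched with a one-feature linear fit. This is a deliberately small illustration under stated assumptions: the `years_experience` feature, the helper names, and the sample rows are hypothetical, and a real pipeline would typically use more features and a more capable model.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def impute_income(rows):
    """Fill missing 'income' from 'years_experience' using a fitted line."""
    known = [r for r in rows if r["income"] is not None]
    slope, intercept = fit_line(
        [r["years_experience"] for r in known], [r["income"] for r in known]
    )
    return [
        dict(r, income=slope * r["years_experience"] + intercept)
        if r["income"] is None
        else r
        for r in rows
    ]


rows = [
    {"years_experience": 1, "income": 40000},
    {"years_experience": 3, "income": 50000},
    {"years_experience": 5, "income": 60000},
    {"years_experience": 10, "income": None},
]
filled = impute_income(rows)
mean_fill = mean(r["income"] for r in rows if r["income"] is not None)
print(filled[-1]["income"], "vs mean imputation", mean_fill)
```

For the record with ten years of experience, the fitted line imputes 85,000 while the mean would impute 50,000, which shows how mean imputation ignores the relationship between features.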

159
Multi-Selectmedium

An AI system is used to detect fraud in financial transactions. Which THREE steps should be taken to address ethical concerns?

Select 3 answers
A.Lower the fraud detection threshold to catch more cases
B.Automatically accept all flagged transactions to improve user experience
C.Implement a human-in-the-loop for high-stakes decisions
D.Ensure the model provides explanations for its decisions
E.Test the model for disparate impact across demographic groups
AnswersC, D, E

Human oversight ensures accountability.

Why this answer

Option C is correct because implementing a human-in-the-loop ensures that high-stakes decisions, such as blocking a legitimate transaction or allowing a potentially fraudulent one, are reviewed by a human before final action. This addresses ethical concerns by preventing fully automated decisions that could cause financial harm or violate user trust, and it aligns with principles of accountability and fairness in AI governance.

Exam trap

Salesforce often tests the misconception that ethical AI is solely about improving model performance or user experience, when in fact it requires balancing accuracy, fairness, and human oversight—candidates may incorrectly choose options that sound beneficial (like lowering thresholds) without considering the ethical trade-offs.

160
MCQmedium

A retail company uses Salesforce Einstein Vision to analyze customer images for product recommendations. The AI team notices that the model performs poorly on images of customers with darker skin tones, leading to fewer recommendations for that demographic. The team has access to a dataset of diverse skin tones but the company's data privacy policy prohibits using demographic data in training. What should the team do?

A.Retrain the model on the diverse dataset without considering privacy regulations.
B.Ignore the performance disparity as it is a result of natural data distribution.
C.Document the bias, escalate to the ethics board, and seek guidance on using diverse data while maintaining privacy.
D.Use the diverse dataset but remove skin tone labels to avoid privacy issues.
AnswerC

This addresses the bias through proper channels while respecting privacy policies.

Why this answer

Option C is correct because it addresses the bias through proper channels while respecting privacy policies. Option A is wrong because violating privacy regulations is not acceptable. Option B is wrong because ignoring performance disparities is unethical.

Option D is wrong because removing labels may still encode bias and could violate the spirit of the privacy policy.

161
Multi-Selectmedium

Which TWO actions are most effective in promoting transparency in AI systems? (Choose two.)

Select 2 answers
A.Publish the entire source code of the model online.
B.Provide a model card that describes the purpose, accuracy, and limitations of the AI model.
C.Withhold the data sources used for training to protect the company's competitive advantage.
D.Encrypt all customer data used in the model.
E.Offer an explanation feature that shows why a specific prediction was made for a given user.
AnswersB, E

Model cards are a standard transparency tool.

Why this answer

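Options B and E are correct: a model card documents the model's purpose, accuracy, and limitations, and an explanation feature shows why a specific prediction was made. Option A is wrong because publishing source code without context can be confusing rather than clarifying. Option C is wrong because withholding the data sources reduces transparency.

Option D is wrong because encryption relates to privacy, not transparency.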

162
MCQhard

A data pipeline fails intermittently when processing large CSV files. The error log shows 'OutOfMemoryError'. Which configuration change is most likely to resolve this?

A.Use a smaller file size limit.
B.Increase the number of worker threads.
C.Switch to XML format.
D.Increase the heap memory for the processing application.
AnswerD

Increasing heap memory provides more space for large file processing.

Why this answer

The OutOfMemoryError indicates that the Java Virtual Machine (JVM) heap space is exhausted while processing large CSV files. Increasing the heap memory (e.g., using -Xmx flag) allocates more memory to the application, allowing it to handle larger datasets without crashing. This directly addresses the root cause of insufficient memory for the data pipeline's processing workload.

Exam trap

Salesforce often tests the misconception that increasing parallelism (worker threads) solves memory issues, but in reality, more threads increase memory pressure and can trigger OutOfMemoryError faster.

How to eliminate wrong answers

Option A is wrong because using a smaller file size limit is a workaround that avoids the problem rather than solving it, and it may not be feasible if large files are required by the business. Option B is wrong because increasing worker threads typically increases memory consumption and contention, which would worsen the OutOfMemoryError, not resolve it. Option C is wrong because switching to XML format would likely increase memory usage due to verbose markup and parsing overhead, making the error more likely, not less.

163
Multi-Selectmedium

Which TWO actions are most effective for ensuring fairness in an AI model used for loan approvals?

Select 2 answers
A.Regularly audit model outcomes for disparate impact across demographic groups.
B.Rely solely on historical data without any adjustments.
C.Allow the model to self-correct over time without human intervention.
D.Use diverse and representative training data.
E.Use a single, simple algorithm to avoid complexity.
AnswersA, D

Audits identify bias even after deployment, enabling corrective action.

Why this answer

Options A and D are correct. Using diverse and representative training data helps the model learn from all groups, and regular auditing detects bias. Option B perpetuates historical bias.

Option C is not a systematic approach. Option E limits complexity but does nothing to ensure fairness.
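
A disparate impact audit can be as simple as comparing approval rates between groups. The sketch below is illustrative only: the group names and decision log are hypothetical, and the 0.8 cutoff is an assumption borrowed from the commonly cited four-fifths rule rather than a value stated in the question.

```python
def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> {group: rate}."""
    totals, approved = {}, {}
    for group, ok in decisions:
        totals[group] = totals.get(group, 0) + 1
        approved[group] = approved.get(group, 0) + (1 if ok else 0)
    return {g: approved[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates, reference_group):
    """Each group's approval rate divided by the reference group's rate."""
    ref = rates[reference_group]
    return {g: rate / ref for g, rate in rates.items()}


# Hypothetical audit log: 100 decisions per group.
log = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40
    + [("group_b", True)] * 45 + [("group_b", False)] * 55
)
rates = approval_rates(log)
ratios = disparate_impact_ratio(rates, "group_a")
flagged = [g for g, r in ratios.items() if r < 0.8]  # assumed cutoff
print(ratios, flagged)
```

With approval rates of 0.60 and 0.45, the ratio for `group_b` is 0.75, so it falls under the assumed cutoff and would be flagged for review. Running this check on a schedule is what turns a one-off test into a regular audit.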

164
MCQeasy

A company wants to use customer data to train an AI model. Which ethical consideration is paramount?

A.Model accuracy
B.Cost efficiency
C.Data minimization
D.Speed of deployment
AnswerC

Ensures only necessary data is collected, reducing privacy risks.

Why this answer

Option C is correct because data minimization ensures only necessary data is collected, reducing privacy risks. Option A is wrong because accuracy, while important, is secondary to ethical data use. Option B is wrong because cost efficiency is not an ethical consideration.

Option D is wrong because speed of deployment is not an ethical consideration.

165
MCQhard

A company has international customers and wants Einstein Prediction Builder to forecast deal closure probability. The data includes fields like 'region', 'product line', and 'deal amount'. What is a best practice to ensure the model works for all regions?

A.One-hot encode the region field using 50+ dummy variables.
B.Remove the region field to avoid bias.
C.Use region as a numeric rank based on past conversion rates.
D.Group regions into broader categories like 'Americas', 'EMEA', 'APAC'.
AnswerD

Grouping reduces noise and improves generalizability while maintaining regional distinction.

Why this answer

Option D is correct because grouping regions into broader categories like 'Americas', 'EMEA', and 'APAC' reduces high cardinality and sparsity in categorical features, which improves model stability and prevents overfitting in Einstein Prediction Builder. This approach ensures each region group has sufficient training data to learn meaningful patterns, enabling the model to generalize better across all regions without introducing bias from rare categories.

Exam trap

Salesforce often tests the misconception that more granular data (like one-hot encoding with many categories) always improves model accuracy, when in fact it can harm performance due to sparsity and overfitting in prediction builder tools.

How to eliminate wrong answers

Option A is wrong because one-hot encoding a region field with 50+ dummy variables introduces high cardinality and sparsity, which can cause the model to overfit to rare categories and degrade prediction performance in Einstein Prediction Builder. Option B is wrong because removing the region field entirely discards valuable geographic information that can significantly influence deal closure probability, leading to a less accurate model. Option C is wrong because using region as a numeric rank based on past conversion rates introduces ordinal bias and assumes a linear relationship that may not exist, which can misrepresent the true categorical nature of the data and reduce model interpretability.
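
Grouping is usually done as a data preparation step before the field reaches the model. The following sketch assumes a hypothetical country-level `region` value and a hand-written mapping; neither comes from Einstein Prediction Builder itself.

```python
from collections import Counter

# Hypothetical country-to-group mapping; extend it to match real data.
REGION_GROUPS = {
    "United States": "Americas",
    "Brazil": "Americas",
    "Germany": "EMEA",
    "South Africa": "EMEA",
    "Japan": "APAC",
    "India": "APAC",
}


def group_region(region, default="Other"):
    """Map a fine-grained region to a broad group; unknown values go to default."""
    return REGION_GROUPS.get(region, default)


deal_regions = ["Japan", "Brazil", "Germany", "India", "United States", "Iceland"]
grouped = [group_region(r) for r in deal_regions]
group_sizes = Counter(grouped)
print(group_sizes)
```

Six distinct values collapse into four, and each broad group now pools the records of several countries. The explicit `Other` bucket keeps unmapped regions visible instead of silently dropping them.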

166
MCQmedium

Refer to the exhibit. A Salesforce admin is reviewing an AI model's fairness report. Which action should the admin take?

A.Remove the email_engagement feature to improve fairness.
B.Retrain the model because the equal opportunity score is below threshold.
C.Increase the bias threshold to 0.9.
D.Deploy the model because all metrics exceed the threshold.
AnswerB

The low equal opportunity score indicates bias that needs mitigation.

Why this answer

Option B is correct because the equal opportunity score (0.72) is below the bias threshold (0.8), indicating potential unfairness in true positive rates across groups. Option A is wrong because removing features may not address the root cause. Option C is wrong because changing the threshold does not fix the underlying bias.

Option D is wrong because demographic parity is above the threshold but equal opportunity is not, so not all metrics exceed the threshold.

167
MCQmedium

Refer to the exhibit. Based on the JSON policy for AI fairness checks, which fairness metric is NOT enabled?

A.Demographic parity
B.All are enabled
C.Disparate impact
D.Equal opportunity
AnswerD

Correct. The 'equal_opportunity' field is false.

Why this answer

Option D (Equal opportunity) is correct because the JSON policy shown in the exhibit configures fairness checks for demographic parity, disparate impact, and equalized odds, but does NOT include the equal opportunity metric. Equal opportunity requires equal true positive rates across groups, which is a separate metric from equalized odds and must be explicitly enabled in the policy definition.

Exam trap

Salesforce often tests the distinction between equalized odds and equal opportunity, trapping candidates who assume equalized odds automatically includes equal opportunity, when in fact they are separate metrics with different mathematical definitions.

How to eliminate wrong answers

Option A is wrong because demographic parity is explicitly enabled in the JSON policy under the 'fairness_metrics' array. Option B is wrong because not all metrics are enabled; the policy omits equal opportunity. Option C is wrong because disparate impact is also explicitly listed in the policy's fairness metrics.
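
The two metrics can be compared side by side using their usual definitions: equal opportunity compares true positive rates across groups, while equalized odds compares both true positive and false positive rates. The sketch below is a generic illustration with hypothetical groups and counts; it does not reproduce the exhibit's policy format.

```python
def rates_by_group(samples):
    """samples: (group, actual, predicted) triples with 0/1 labels.
    Returns {group: (true_positive_rate, false_positive_rate)}.
    Assumes every group has at least one actual positive and one negative."""
    counts = {}
    for group, actual, predicted in samples:
        c = counts.setdefault(group, {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
        if actual:
            c["tp" if predicted else "fn"] += 1
        else:
            c["fp" if predicted else "tn"] += 1
    return {
        g: (c["tp"] / (c["tp"] + c["fn"]), c["fp"] / (c["fp"] + c["tn"]))
        for g, c in counts.items()
    }


def equal_opportunity_gap(rates):
    """Largest difference in true positive rate between groups."""
    tprs = [tpr for tpr, _ in rates.values()]
    return max(tprs) - min(tprs)


def equalized_odds_gap(rates):
    """Largest difference in either true or false positive rate."""
    fprs = [fpr for _, fpr in rates.values()]
    return max(equal_opportunity_gap(rates), max(fprs) - min(fprs))


def make(group, tp, fn, fp, tn):
    """Build a hypothetical sample list from confusion-matrix counts."""
    return ([(group, 1, 1)] * tp + [(group, 1, 0)] * fn
            + [(group, 0, 1)] * fp + [(group, 0, 0)] * tn)


samples = make("a", 8, 2, 1, 9) + make("b", 8, 2, 4, 6)
rates = rates_by_group(samples)
eo_gap = equal_opportunity_gap(rates)
eodds_gap = equalized_odds_gap(rates)
print(f"equal opportunity gap={eo_gap:.2f}, equalized odds gap={eodds_gap:.2f}")
```

In this toy data both groups have a true positive rate of 0.8, so the equal opportunity gap is zero, but their false positive rates differ (0.1 versus 0.4), so the equalized odds gap is 0.3. The two checks can therefore give different verdicts on the same model.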

168
MCQeasy

A small business uses a pre-built Salesforce AI model to predict inventory needs. The model recommends ordering extra stock based on seasonal trends. One month, the model fails to predict a sudden demand spike, resulting in stockouts and lost sales. The business owner is frustrated and considers disabling the AI. The owner wants to know if this is an ethical issue and what to do next. As an AI ethics advisor, what is the best response?

A.Agree that the AI is unreliable and recommend disabling it immediately
B.Recommend using a different AI model that is guaranteed to be 100% accurate
C.Explain that this is a model performance issue, not an ethics violation, and suggest reviewing the model's accuracy and retraining with recent data
D.State that all AI failures are ethical issues and the company should stop using AI
AnswerC

Ethical issues involve bias, transparency, etc.; a single failure is a reliability issue.

Why this answer

The correct answer is C because a single failure does not necessarily indicate an ethical problem; reliability is a consideration but not an ethics violation. The model should be reviewed and improved. Option A is wrong because blaming the model prematurely is not constructive, and disabling the AI might not be the best solution if the model generally helps.

Option B is wrong because no model can be guaranteed to be 100% accurate. Option D is wrong because ethical concerns are about fairness, transparency, etc., not just accuracy.

169
MCQeasy

A company wants to build a sentiment analysis model using customer feedback. What is the best practice for labeling the training data?

A.Ignore labeling and use unsupervised learning
B.Have a single domain expert label all data
C.Employ a diverse set of human labelers with clear guidelines
D.Use automated keyword matching to assign sentiment
AnswerC

Human labeling with guidelines provides accurate, consistent labels.

Why this answer

Using diverse human labelers with clear guidelines ensures label consistency and reduces bias. Automated keyword matching is error-prone, a single expert may introduce personal bias, and skipping labeling leaves the model without the labeled examples that supervised sentiment analysis needs.

170
MCQeasy

A company wants to train an AI model to predict customer churn using historical data that contains many missing values. What is the best practice for handling missing data?

A.Use only features without missing values.
B.Ignore missing values as they do not affect AI training.
C.Impute missing values using mean or median.
D.Remove all records with missing values.
AnswerC

Imputation preserves data and reduces bias.

Why this answer

Option C is correct because imputing missing values using mean or median is a standard practice that preserves the dataset size and statistical properties, allowing the AI model to learn from all available features without introducing bias from data removal. This approach is particularly effective for numerical features in customer churn prediction, where missing values are often random and imputation maintains the distribution for algorithms like logistic regression or gradient boosting.

Exam trap

Salesforce often tests the misconception that removing missing data is safe, but the trap here is that candidates overlook how data removal can shrink the dataset and introduce bias, while imputation is a more balanced and widely accepted practice in AI workflows.

How to eliminate wrong answers

Option A is wrong because discarding features with missing values can remove valuable predictors, reducing model accuracy and ignoring the fact that missingness itself may carry predictive signal. Option B is wrong because ignoring missing values causes most AI algorithms to fail or produce incorrect results, as they cannot process null or NaN entries, leading to runtime errors or biased learning. Option D is wrong because removing all records with missing values can drastically reduce the dataset size, introduce selection bias, and discard potentially useful patterns in the remaining data.
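
Simple imputation takes only a few lines. The sketch below uses the median, one of the two statistics named in the correct answer; the field name and values are hypothetical.

```python
from statistics import mean, median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    known = [v for v in values if v is not None]
    fill = median(known)
    return [fill if v is None else v for v in values]


# Hypothetical monthly spend with two missing entries and one outlier.
monthly_spend = [20.0, None, 35.0, 500.0, None, 30.0]
imputed = impute_median(monthly_spend)
observed_mean = mean(v for v in monthly_spend if v is not None)
print(imputed, "mean would have filled", observed_mean)
```

The median fill is 32.5, whereas the mean of the observed values is 146.25 because of the single 500.0 outlier. That difference is why the median is often preferred for skewed numeric fields.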

171
MCQhard

A company deploys an AI recommender system that personalizes content. The system is trained on user click data. After deployment, the company notices that the system increasingly recommends sensationalist content, leading to user polarization. Which principle is being violated?

A.Accuracy
B.Privacy
C.Beneficence
D.Transparency
AnswerC

The system should promote well-being and avoid harm.

Why this answer

The recommender system's shift toward sensationalist content, which polarizes users, violates the principle of beneficence because it causes harm (user polarization) rather than promoting well-being. Beneficence requires AI systems to act in the best interests of users and society, not to optimize for engagement metrics at the expense of ethical outcomes.

Exam trap

Salesforce often tests the distinction between ethical principles by presenting a scenario where a system functions correctly (accurate) but produces harmful outcomes, leading candidates to mistakenly choose accuracy or transparency instead of beneficence.

How to eliminate wrong answers

Option A is wrong because accuracy refers to the system's ability to make correct predictions or recommendations based on training data, not to the ethical impact of those recommendations; the system may be accurately predicting clicks on sensationalist content. Option B is wrong because privacy concerns unauthorized access or misuse of personal data, whereas the issue here is about the content being recommended, not data exposure. Option D is wrong because transparency involves explainability and openness about how the system works, but the problem is the harmful outcome of the recommendations, not a lack of clarity in the system's logic.

172
MCQmedium

A retailer's AI model for recommendation is producing poor results. Analysis shows that the customer entity has many duplicate records with slight variations. Which Data Cloud feature should be used to address this?

A.Create a Data Transformation to merge duplicates using rules
B.Increase the Data Stream frequency to get fresher data
C.Ignore the duplicates and use all records as-is
D.Set up a Data Action to deduplicate at the source
AnswerA

Data Transformations can deduplicate records effectively.

Why this answer

Option A is correct because Data Transformations can apply fuzzy matching to merge records. Option B is wrong because increasing stream frequency does not fix existing duplicates. Option C is wrong because ignoring duplicates leaves poor data.

Option D is wrong because Data Actions trigger external actions, not deduplication.
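
The general idea of rule-based fuzzy merging can be sketched with Python's standard `difflib` module. This illustrates the concept only and is not how Data Cloud implements it; the 0.85 similarity threshold, the helper names, and the customer names are assumptions.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.85):
    """True when two names are close after trimming and lowercasing."""
    ratio = SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()
    return ratio >= threshold


def merge_duplicates(names, threshold=0.85):
    """Greedy pass: keep the first record of each cluster of similar names."""
    kept = []
    for name in names:
        if not any(similar(name, k, threshold) for k in kept):
            kept.append(name)
    return kept


# Hypothetical customer names with slight variations.
customers = ["Jane Smith", "jane smith ", "Jane Smyth", "Robert Brown", "Bob Lee"]
unique = merge_duplicates(customers)
print(unique)
```

The three variants of "Jane Smith" collapse into one record while the unrelated names survive. In practice the threshold needs tuning, because a value that is too low merges distinct customers and one that is too high misses real duplicates.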

173
MCQmedium

A healthcare provider uses Einstein's Prediction Builder to predict patient readmission risk. The model outputs a risk score, but clinicians do not understand how the score is calculated. According to ethical AI principles, what should the provider implement?

A.Integrate model explanations using Einstein's Explainability feature to show key factors influencing each prediction.
B.Proceed with the model since it provides accurate predictions.
C.Train only the data science team on the model's inner workings.
D.Replace the model with a simpler, less accurate but fully transparent model.
AnswerA

Explainability supports transparency and human oversight.

Why this answer

Option A is correct because explainability is a key ethical principle, and providing interpretable insights builds trust. Option B is wrong because ignoring the clinicians' lack of understanding is risky. Option C is wrong because it only provides training to a few.

Option D is wrong because simplifying the model may reduce accuracy, and the core issue is explainability, not complexity.

174
MCQeasy

A retail company uses AI to personalize marketing emails. A customer complains that their data was used without explicit permission. What ethical principle was most likely violated?

A.Transparency
B.Fairness
C.Consent and privacy
D.Accountability
AnswerC

Using data without consent violates privacy and autonomy.

Why this answer

Option C is correct: autonomy and privacy require obtaining explicit consent. Option A is wrong because transparency is about disclosure, not permission. Option B is wrong because fairness is about impartiality.

Option D is wrong because accountability is about responsibility.

175
Multi-Selecthard

A multinational corporation uses Einstein Discovery to predict employee performance. An audit reveals potential bias against employees in certain countries. Which THREE actions should they take to address ethical concerns? (Choose three.)

Select 3 answers
A.Investigate the training data for biased labels or sampling bias across countries
B.Engage local ethics representatives from the affected regions to understand context
C.Discontinue use of the AI model worldwide immediately
D.Use the model only in countries where it shows no bias
E.Implement a fairness metric and set acceptable thresholds for performance across groups
AnswersA, B, E

Data investigation is fundamental to identifying bias.

Why this answer

Option A (Investigate the training data for biased labels or sampling bias) is correct because data is often the source. Option B (Engage local ethics representatives from affected regions) ensures diverse perspectives. Option E (Implement a fairness metric and set acceptable thresholds) provides quantitative governance.

Option C (Discontinue the model globally) may be too extreme. Option D (Use the model only in countries where it shows no bias) could still be unfair to others.

176
MCQhard

After deploying an AI model in Salesforce, the data scientist notices high accuracy on the training set but poor accuracy on new incoming data. What is this phenomenon called?

A.Data leakage
B.Overfitting
C.Underfitting
D.Concept drift
AnswerB

Overfitting causes high training accuracy but low test accuracy.

Why this answer

Option B is correct because overfitting occurs when the model learns noise instead of the underlying pattern, performing well on training data but poorly on new data. Option A is wrong because data leakage means information unavailable at prediction time slipped into training, which inflates evaluation scores; the gap between training and new-data accuracy described here is the defining symptom of overfitting. Option C is wrong because underfitting would show poor performance on both.

Option D is wrong because concept drift refers to changing data distribution over time, not immediate poor generalization.
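
The gap between training and new-data accuracy can be reproduced with a deliberately extreme toy model that memorizes its training set. The `Memorizer` class and the data points are hypothetical and exist only to illustrate the symptom.

```python
from collections import Counter


class Memorizer:
    """Deliberately overfit 'model': a lookup table of the training set."""

    def fit(self, data):
        self.table = dict(data)
        # Fall back to the most common training label for unseen inputs.
        self.default = Counter(y for _, y in data).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.table.get(x, self.default)


def accuracy(model, data):
    return sum(model.predict(x) == y for x, y in data) / len(data)


# Underlying rule in this toy data: label is 1 when x >= 5, else 0.
train = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
new_data = [(0, 0), (4, 0), (5, 1), (10, 1)]

model = Memorizer().fit(train)
train_acc = accuracy(model, train)
new_acc = accuracy(model, new_data)
print(f"training accuracy={train_acc:.2f}, new-data accuracy={new_acc:.2f}")
```

The lookup table scores 1.00 on the data it memorized but only 0.50 on unseen inputs, because it never learned the underlying rule. Comparing the two numbers is the basic check for overfitting.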

177
MCQhard

A healthcare organization uses Einstein Discovery to predict patient readmission risk. The model uses protected attributes like race and age as features. Which action best aligns with Salesforce's ethical AI principles?

A.Retain the features but monitor for disparate impact and ensure compliance with regulations.
B.Remove race and age features entirely to ensure fairness.
C.Replace age with an age group bucket to reduce granularity.
D.Use the model as is because predictions are accurate.
AnswerA

Ethical AI allows use if monitored and regulated.

Why this answer

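Option A is correct because protected attributes such as age can reflect legitimate medical factors, so they may be retained as long as the model is monitored for disparate impact and complies with regulations. Option B is too aggressive: removing protected attributes is a common step, but here it would discard medically relevant signal. Option C ignores that age can be medically relevant.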

Option D violates transparency and accountability.

178
MCQeasy

To identify common customer issues from chat transcripts using AI, which feature should be used?

A.Einstein Conversation Mining
B.Post-Interaction Survey
C.Reports
D.Dashboards
AnswerA

Extracts insights from conversation data using NLP.

Why this answer

Einstein Conversation Mining analyzes unstructured chat data to surface common themes. Surveys collect feedback, and reports and dashboards show aggregated data, but none of them extracts insights from transcript text.

179
MCQhard

An AI model predicts employee performance. The HR team uses it to identify high-potential employees. What is a potential ethical risk?

A.Over-reliance on the model
B.Privacy violation
C.Underutilization of human judgment
D.All of the above
AnswerD

Correct. All listed risks are potential ethical concerns.

Why this answer

Option D is correct because all three listed risks—over-reliance on the model, privacy violation, and underutilization of human judgment—are potential ethical risks when an AI model predicts employee performance. Over-reliance can lead to automated decisions without human oversight, privacy violation may occur if sensitive employee data is mishandled, and underutilization of human judgment ignores contextual factors that the model cannot capture. Together, these represent a comprehensive set of ethical concerns in AI-driven HR practices.

Exam trap

Salesforce often tests the 'all of the above' trap where candidates think only one or two risks apply, but the question explicitly lists multiple interconnected ethical concerns that collectively form the correct answer.

How to eliminate wrong answers

Option A is wrong because over-reliance on the model is indeed a risk, but it is not the only risk, so selecting only A ignores other ethical issues. Option B is wrong because privacy violation is a valid risk, but it is incomplete without considering over-reliance and underutilization of human judgment. Option C is wrong because underutilization of human judgment is a real concern, but it does not cover the full spectrum of risks including privacy and over-reliance.

180
MCQeasy

A user asks Einstein GPT to generate a product description. The AI returns a response with a confidence score of 0.65. What does this score indicate?

A.There is a 65% probability that the response exactly matches the training data
B.The model is 65% confident that the response is accurate
C.The response is 65% shorter than the optimal length
D.The model has a 65% likelihood of generating the same response again
AnswerB

Confidence scores indicate the model's assessment of how likely the generated answer is correct.

Why this answer

Option B is correct because the confidence score in AI models like Einstein GPT quantifies the model's internal certainty that its generated output is factually correct or contextually appropriate. A score of 0.65 means the model estimates a 65% probability that the response is accurate based on its training and inference algorithms, not that it matches training data or has a fixed length.

Exam trap

Salesforce often tests the misconception that a confidence score indicates a direct probability of correctness or a measure of output quality, when in reality it is a model's self-assessed certainty that can be misleading and is not a guarantee of factual accuracy.

How to eliminate wrong answers

Option A is wrong because the confidence score does not measure how closely the response matches training data; it reflects the model's probabilistic assessment of correctness, not a similarity metric. Option C is wrong because confidence scores are unrelated to output length; they are a probability value between 0 and 1, not a measure of brevity. Option D is wrong because the score does not indicate the likelihood of generating the same response again; that depends on the model's stochastic sampling parameters (e.g., temperature), not the confidence score.

181
Multi-Selecthard

Which THREE factors can affect the accuracy of an Einstein GPT response?

Select 3 answers
A.Brightness of the user interface
B.Model temperature setting
C.Clarity and specificity of the prompt
D.Quality of the grounding data provided
E.Length of the response generated
AnswersB, C, D

Higher temperature increases creativity but may reduce factual accuracy.

Why this answer

Option B is correct because the model temperature setting directly controls the randomness of the output. A higher temperature (e.g., 0.9) produces more creative but potentially less accurate responses, while a lower temperature (e.g., 0.1) makes the model more deterministic and factual. This parameter is a core hyperparameter in large language models like those powering Einstein GPT, and it significantly influences response accuracy.

Exam trap

Salesforce often tests the misconception that output length or interface settings affect model accuracy, when in fact only prompt clarity, grounding data quality, and model hyperparameters like temperature are the true determinants.
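The effect of temperature described above can be illustrated with a temperature-scaled softmax. This is a minimal sketch of the general technique, not Einstein GPT's internals; the function name and the three candidate scores are hypothetical.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens (assumption).
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.1)   # near-deterministic
high = softmax_with_temperature(logits, 0.9)  # flatter, more varied sampling

print("T=0.1:", [round(p, 4) for p in low])
print("T=0.9:", [round(p, 4) for p in high])
```

At the lower temperature almost all probability mass sits on the top-scoring candidate, which is why low temperatures tend to give more repeatable output.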

182
MCQeasy

In Salesforce CRM Analytics (formerly Einstein Analytics), what is the primary purpose of a dataset?

A.To prepare data for AI and analytics
B.To run SQL queries directly
C.To store raw, unprocessed records
D.To create dashboards only
AnswerA

Datasets are the building blocks for AI modeling, dashboards, and analytical queries.

Why this answer

In Salesforce CRM Analytics, a dataset is the foundational data structure that transforms raw data into an optimized, columnar format for analytics and AI features like Einstein Discovery. It is created by extracting, cleaning, and aggregating data from sources such as Salesforce objects or external connectors, enabling efficient querying, dashboarding, and machine learning model training. This makes option A correct because the primary purpose is to prepare data specifically for AI and analytics workloads.

Exam trap

Salesforce often tests the misconception that datasets are simply raw storage containers, but the trap here is that candidates overlook the 'preparation for AI' aspect and choose 'store raw records' because they confuse datasets with database tables or data lakes.

How to eliminate wrong answers

Option B is wrong because datasets do not support direct SQL query execution; instead, they use SAQL (Salesforce Analytics Query Language) or lens-based exploration for querying. Option C is wrong because datasets store processed, flattened, and indexed data, not raw, unprocessed records—raw data is typically held in dataflows or external systems before transformation. Option D is wrong because while datasets can be used to build dashboards, their primary purpose is broader, encompassing AI, analytics, and data preparation, not just dashboard creation.

183
MCQmedium

What is the most likely cause of the error?

A.Authentication failure
B.Data quality threshold violation
C.Data schema mismatch
D.Network timeout
AnswerB

Null values exceed acceptable threshold.

Why this answer

Option B is correct because the error mentions a high percentage of null values in a critical field, which violates a data quality threshold. Option C is wrong because a schema mismatch would show field type inconsistencies. Option A is wrong because an authentication failure would show a different error.

Option D is wrong because network timeout would mention connection issues.

184
Multi-Selecthard

Which TWO techniques are commonly used to handle missing values in a dataset for AI training?

Select 2 answers
A.L1 regularization
B.Deletion of rows with missing values
C.One-hot encoding
D.Min-max normalization
E.Imputation with mean or median
AnswersB, E

Simple but valid method.

Why this answer

Option B is correct because deleting rows with missing values is a straightforward technique to handle missing data, especially when the missingness is random and the dataset is large enough that removing a few rows does not significantly impact model performance. This approach avoids introducing bias from imputation methods but can lead to loss of valuable information if too many rows are removed.

Exam trap

Salesforce often tests the distinction between data preprocessing techniques (like handling missing values) and model regularization or feature engineering, so candidates may confuse L1 regularization or one-hot encoding as methods for missing data when they serve entirely different purposes.
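Both techniques named in the answer, row deletion and mean or median imputation, can be sketched in a few lines of plain Python. The records, field names, and helper functions below are hypothetical and only illustrate the idea.

```python
from statistics import mean, median

# Hypothetical records; None marks a missing value (assumption).
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": 61000},
    {"age": 45, "income": None},
    {"age": 29, "income": 48000},
]

def drop_incomplete(records):
    """Deletion: keep only rows with no missing values."""
    return [r for r in records if all(v is not None for v in r.values())]

def impute(records, field, strategy=mean):
    """Imputation: fill missing values in one field with a summary statistic."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = strategy(observed)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

dropped = drop_incomplete(rows)
imputed = impute(impute(rows, "age", median), "income", mean)

print(len(dropped), "rows survive deletion")
print(imputed)
```

Deletion shrinks the dataset, while imputation keeps every row at the cost of inserting estimated values, which matches the trade-off described in the explanation.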

185
MCQeasy

When training an Einstein Discovery model, which data type is not supported as a predictor field?

A.Multi-select picklist
B.Numeric
C.Picklist
D.Date
AnswerA

Multi-select picklists have multiple values per record and cannot be used directly as predictors.

Why this answer

Option A is correct because multi-select picklists are not supported as predictors in Einstein Discovery. Numeric, picklist, and date fields are supported.

186
Multi-Selectmedium

Which TWO actions are best practices when implementing Einstein Prediction Service?

Select 2 answers
A.Ignore correlated features to simplify the model.
B.Clean the data to handle missing values and outliers.
C.Select relevant features that are likely to influence the prediction.
D.Include all available fields in the dataset for maximum information.
E.Use the default field mapping without review.
AnswersB, C

Data cleaning improves model accuracy.

Why this answer

Option B is correct because data quality directly impacts the accuracy and reliability of Einstein Prediction Service models. Cleaning data to handle missing values and outliers ensures that the training data is representative and reduces the risk of skewed predictions, which is a fundamental prerequisite for any machine learning model within Salesforce's AI framework.

Exam trap

Salesforce often tests the misconception that more data always leads to better predictions, but in practice, irrelevant or noisy features degrade model performance, making feature selection and data cleaning critical.

187
MCQhard

Refer to the exhibit. The model is deployed and monitoring triggers an alert for a fairness violation. What does this indicate?

A.The demographic parity difference exceeded 0.1.
B.The protected attributes were removed.
C.The model has been retrained.
D.The model's accuracy has dropped below threshold.
AnswerA

That is the threshold for fairness violation.

Why this answer

Option A is correct because the demographic parity constraint has a threshold of 0.1, so a violation means the difference in selection rates across protected groups exceeded 0.1. Option D is wrong because accuracy is a separate metric. Option C is wrong because retraining is not indicated.

Option B is wrong because the exhibit does not mention removing attributes.
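A demographic parity check of the kind described can be sketched as follows. The exhibit is not reproduced here, so the group names and decisions are made up for illustration; only the 0.1 threshold comes from the explanation.

```python
def selection_rate(decisions):
    """Fraction of positive (1) decisions in a list of 0/1 outcomes."""
    return sum(decisions) / len(decisions)

def demographic_parity_difference(decisions_by_group):
    """Largest gap in selection rate between any two groups."""
    rates = [selection_rate(d) for d in decisions_by_group.values()]
    return max(rates) - min(rates)

# Hypothetical model decisions per protected group (assumption).
decisions_by_group = {
    "group_a": [1, 1, 1, 0, 1, 0, 1, 1, 0, 1],  # 7 of 10 selected
    "group_b": [1, 0, 0, 1, 0, 1, 0, 1, 0, 1],  # 5 of 10 selected
}

THRESHOLD = 0.1  # threshold cited in the explanation
gap = demographic_parity_difference(decisions_by_group)
violation = gap > THRESHOLD

print(f"parity difference = {gap:.2f}, violation = {violation}")
```

Here the selection rates differ by about 0.2, so a monitor using the 0.1 threshold would raise the alert.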

188
Multi-Selectmedium

A company is preparing their Salesforce Data Cloud for Einstein AI predictions. They need to ensure data quality and governance. Which TWO actions should they take? (Choose two.)

Select 2 answers
A.Declare uniqueness rules on calculated insights.
B.Create profiling and auditing dashboards to monitor data health.
C.Set role-based access controls on data model objects.
D.Use Data Cloud's data model to establish relationships between objects.
E.Enable automatic field mapping for all data sources.
AnswersB, D

Monitoring data health is essential for ongoing data quality and governance.

Why this answer

Option D is correct because establishing relationships in the data model is fundamental for accurate predictions. Option B is correct because profiling and auditing dashboards help monitor data health and governance. Option A is incorrect because uniqueness rules on calculated insights are not a standard data quality practice.

Option E is incorrect because automatic field mapping may introduce errors without validation. Option C is incorrect because, although role-based access contributes to governance, it is not the primary action for data quality.

189
MCQhard

A financial services firm uses a deep learning model to approve loans. The model is highly accurate but cannot explain its decisions. Regulators now require the firm to provide reasons for loan denials. What is the best approach to address this ethical concern?

A.Remove the most influential features from the model.
B.Retrain the model with more data to improve accuracy further.
C.Replace the deep learning model with a simpler, interpretable model like logistic regression.
D.Use a post-hoc explanation tool like LIME to approximate decisions.
AnswerC

Interpretable models can provide clear reasons for decisions.

Why this answer

Option C is correct because an inherently interpretable model (e.g., logistic regression) can provide explanations for its decisions. Option B is wrong because retraining the same model doesn't guarantee explainability. Option D is wrong because post-hoc approximations may be inaccurate.

Option A is wrong because removing features doesn't address the need for explanations.
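To see why a logistic regression can supply reasons for a denial, consider this minimal sketch. The coefficients, intercept, and feature names are hypothetical and assume standardized inputs; the point is that each feature's contribution to the score can be read directly.

```python
import math

# Hypothetical coefficients for standardized features (assumption).
COEFFICIENTS = {"credit_score": 1.2, "debt_to_income": -0.9, "late_payments": -0.7}
INTERCEPT = 0.2

def approval_probability(applicant):
    """Logistic regression: sigmoid of intercept plus weighted feature sum."""
    z = INTERCEPT + sum(COEFFICIENTS[k] * applicant[k] for k in COEFFICIENTS)
    return 1 / (1 + math.exp(-z))

def denial_reasons(applicant, top_n=2):
    """Features that pushed the score down the most, worst first."""
    contributions = {k: COEFFICIENTS[k] * applicant[k] for k in COEFFICIENTS}
    negative = sorted((v, k) for k, v in contributions.items() if v < 0)
    return [k for _, k in negative[:top_n]]

# Hypothetical applicant, expressed in standard deviations from the mean.
applicant = {"credit_score": -0.5, "debt_to_income": 1.5, "late_payments": 1.0}

print(f"approval probability: {approval_probability(applicant):.3f}")
print("main reasons for denial:", denial_reasons(applicant))
```

Because the score is a simple weighted sum, the ranked contributions can be reported to the applicant or a regulator without a separate approximation step.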

190
Multi-Selecteasy

Which TWO are common data quality issues that negatively impact AI model performance? (Choose two.)

Select 2 answers
A.Multicollinearity
B.Missing values
C.Outliers
D.Data volume
E.High dimensionality
AnswersB, C

Missing values can lead to biased or incomplete training.

Why this answer

Options B and C are correct. Missing values and outliers can skew model training. Option E is wrong because high dimensionality is more about feature count than quality.

Option A is wrong because multicollinearity affects interpretability but not necessarily quality. Option D is wrong because data volume alone is not a quality issue.

191
Multi-Selecteasy

Which TWO capabilities are available in Einstein GPT for Sales?

Select 2 answers
A.Sentiment analysis of customer emails
B.Automated lead enrichment
C.Generating Apex code
D.Summarizing call transcripts
E.Generating personalized email drafts
AnswersD, E

Einstein GPT can summarize conversations for quick review.

Why this answer

Options D and E are correct. Option A is wrong because sentiment analysis is not a primary feature of Einstein GPT for Sales (it is more of a Service feature). Option B is wrong because automated lead enrichment is not a direct GPT feature.

Option C is wrong because code generation is for developers, not sales.

192
Multi-Selectmedium

A company is deploying Einstein Prediction Builder to predict equipment failure. Which three considerations are essential for building an accurate prediction model? (Choose 3)

Select 3 answers
A.Missing values in features should be handled appropriately.
B.The dataset should span a sufficient time period to capture patterns.
C.All input features must be numerical.
D.The model should be retrained only once after initial deployment.
E.The prediction horizon must be clearly defined.
AnswersA, B, E

Missing data can bias the model.

Why this answer

Option A is correct because missing values in features can introduce bias or cause errors in the predictive model. Einstein Prediction Builder automatically handles missing data through imputation, but understanding how missing values are treated is essential for model accuracy, as inappropriate handling can distort relationships between features and the target outcome.

Exam trap

Salesforce often tests the misconception that all input features must be numerical for machine learning models, but Einstein Prediction Builder natively supports non-numerical data types through automated preprocessing.

193
MCQeasy

A marketing team wants to use Einstein Recommendations to personalize product offers on their e-commerce site. They have a dataset of 50,000 customers with purchase history. However, 40% of customers have no purchase history (new registrations). The model performs well for returning customers but gives generic recommendations for new ones. The team wants to improve recommendations for new customers. What data preparation step should they take?

A.Remove all customers with missing purchase history from the training set.
B.Assign a random purchase frequency to each new customer to add variety.
C.Impute missing purchase history with the average purchase frequency across all customers.
D.Use only customers with complete purchase history to train a more accurate model.
AnswerC

Imputation provides a baseline signal for new customers, enabling the model to make reasonable recommendations.

Why this answer

Imputing missing purchase data with a sensible default (e.g., average purchase frequency) gives the model signal for new customers, improving recommendations without discarding data.

194
Multi-Selecthard

Which TWO are best practices when implementing Einstein Bots? (Choose two.)

Select 2 answers
A.Start with high-volume, low-complexity conversations
B.Use complex intents for initial setup
C.Configure the bot to handle all customer conversations
D.Continuously improve intents based on conversation logs
E.Deploy the bot to all channels immediately without testing
AnswersA, D

Ensures easy wins and learning.

Why this answer

Option A is correct because Einstein Bots are designed to automate high-volume, low-complexity conversations first, such as password resets or order status inquiries. This approach allows the bot to handle the most frequent interactions efficiently, reducing agent workload and providing immediate ROI. Starting simple also enables easier intent training and faster deployment, aligning with best practices for conversational AI implementation.

Exam trap

Salesforce often tests the misconception that bots should handle everything immediately, but the best practice is to start small and iterate based on conversation logs to improve intent accuracy and coverage.

195
MCQhard

Refer to the exhibit. An administrator runs an audit on a sentiment analysis model. What is the primary ethical concern?

A.The bias score is below threshold so no action needed.
B.The training data is imbalanced.
C.The model has low accuracy on negative text, indicating potential bias.
D.The model is overfitting to positive text.
AnswerC

The accuracy gap suggests bias against negative sentiment.

Why this answer

Option C is correct because the large accuracy gap (98% vs 80%) indicates the model performs poorly on negative text, which could lead to unfair treatment of users expressing negative sentiment. Option D is wrong because overfitting is not evident from this data. Option A is wrong because, even though the bias score is below threshold, the significant accuracy disparity is a concern.

Option B is wrong because, while data imbalance is present, the disparate performance is the more direct ethical issue.

196
Multi-Selecteasy

A sales team is implementing Einstein Lead Scoring. Which two actions should they take to ensure the model is effective? (Choose 2)

Select 2 answers
A.Include only demographic data for scoring.
B.Set a fixed score threshold for all users.
C.Disable the model for low-volume leads.
D.Ensure the training data includes both converted and unconverted leads.
E.Regularly review and provide feedback on lead conversions.
AnswersD, E

Balanced data is necessary.

Why this answer

Option D is correct because Einstein Lead Scoring requires training data that includes both converted and unconverted leads to build a predictive model that can distinguish between leads likely to convert and those that are not. Without unconverted leads, the model cannot learn the negative patterns, leading to biased predictions and poor accuracy.

Exam trap

Salesforce often tests the misconception that Einstein Lead Scoring can work effectively with only positive examples (converted leads), but the model fundamentally requires both positive and negative examples to learn the difference between high- and low-quality leads.

197
MCQmedium

A company uses Einstein Discovery to identify factors that increase case resolution time. After training, the model shows that 'Case_Origin__c' has high importance. What action should the company take?

A.Remove the field from the model to reduce complexity.
B.Create interaction terms between Case_Origin and other fields.
C.Increase the data quality threshold for Case_Origin records.
D.Investigate the categories within Case_Origin to understand their impact.
AnswerD

Understanding which origins cause delays helps in process improvement.

Why this answer

Option D is correct because the model identifies 'Case_Origin__c' as important; analyzing its categories can reveal which origins cause delays. Option A is wrong because removing the field loses information. Option B is wrong because the model already accounts for interactions.

Option C is wrong because the origin is not necessarily a data quality issue.

198
MCQhard

Refer to the exhibit. A company uses this policy for a customer-facing AI model. What is the most critical ethical risk?

A.Performance
B.Data privacy
C.Model accuracy
D.Lack of transparency
AnswerD

With explainability set to none, users cannot understand how decisions are made, violating transparency.

Why this answer

Option D is correct because lack of transparency (explainability: none) makes it impossible for users to understand or challenge decisions, which is a major ethical risk. Option C is wrong because accuracy is not indicated in the policy. Option B is wrong because, while data privacy is important, the policy does not mention privacy settings.

Option A is wrong because performance is not addressed.

199
MCQmedium

A large retail company uses Data Cloud to consolidate customer data from e-commerce, POS, and loyalty programs. They plan to use Einstein Studio to build a churn prediction model. The data architect notices that the churn model's accuracy is below expectations. Upon investigation, they find that the customer entity in Data Cloud has multiple records for the same customer with slightly different spellings and addresses. The data comes from different streams. What should the data architect do to improve the model?

A.Create a Data Transform to merge duplicate records based on fuzzy matching on name and address fields
B.Increase the data stream frequency to get more recent data
C.Change the primary key in the data model to use a different identifier
D.Use a Calculated Insight to aggregate customer behavior over time
AnswerA

Directly addresses the duplicate issue and creates a unified view.

Why this answer

Option A is the best course of action because creating a Data Transform with fuzzy matching merges duplicates into a single clean record, improving data quality for the model. Option B is flawed because increasing frequency does not fix existing duplicates. Option D aggregates behavior but doesn't resolve the duplication.

Option C changes the primary key, but the duplicates remain.
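The fuzzy-matching idea can be sketched outside Data Cloud with Python's standard `difflib` module. This is a stand-in for illustration, not how a Data Transform is configured; the records, the 0.85 threshold, and the helper names are assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio in [0, 1] after basic normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def is_duplicate(r1, r2, threshold=0.85):
    """Treat two records as the same customer if name AND address are close."""
    return (similarity(r1["name"], r2["name"]) >= threshold
            and similarity(r1["address"], r2["address"]) >= threshold)

def merge_duplicates(records, threshold=0.85):
    """Keep the first record seen per customer and note every source stream."""
    unified = []
    for rec in records:
        for kept in unified:
            if is_duplicate(rec, kept, threshold):
                kept["sources"].append(rec["source"])
                break
        else:  # no existing match: start a new unified record
            unified.append({"name": rec["name"], "address": rec["address"],
                            "sources": [rec["source"]]})
    return unified

# Hypothetical customer records from three streams (assumption).
records = [
    {"name": "Jonathan Smith", "address": "12 Oak Street", "source": "ecommerce"},
    {"name": "Jonathon Smith", "address": "12 Oak Sreet", "source": "pos"},
    {"name": "Maria Garcia", "address": "98 Pine Avenue", "source": "loyalty"},
]

unified = merge_duplicates(records)
for customer in unified:
    print(customer)
```

Requiring both name and address to match keeps the rule conservative; a looser rule would merge more records but risks combining different customers.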

200
MCQhard

A healthcare company uses an AI model built on Salesforce to predict patient readmission risk. The model is trained on historical data that underrepresents certain ethnic groups. During testing, the model shows significantly higher false negative rates for those groups, meaning it fails to flag high-risk patients. The ethical concern is most directly related to which AI principle?

A.Privacy
B.Accountability
C.Transparency
D.Fairness
AnswerD

Fairness ensures AI does not discriminate against groups; the model's bias is a fairness issue.

Why this answer

The correct answer is D because the underrepresentation in the training data leads to unfair outcomes for specific groups, violating the principle of fairness. Option C is wrong because transparency is about explainability, not outcome disparity. Option B is wrong because accountability refers to who is responsible, not the bias itself.

Option A is wrong because privacy is about data protection, not fairness.
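The disparity in this scenario is measured by the false negative rate per group. A minimal sketch, using made-up labels and predictions for two hypothetical groups:

```python
def false_negative_rate(y_true, y_pred):
    """Share of truly high-risk patients (label 1) that the model missed."""
    predictions_for_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_for_positives:
        raise ValueError("no positive cases to evaluate")
    missed = sum(1 for p in predictions_for_positives if p == 0)
    return missed / len(predictions_for_positives)

# Hypothetical test results per group: 1 = readmitted / flagged (assumption).
groups = {
    "well_represented": {"y_true": [1, 1, 1, 1, 0, 0], "y_pred": [1, 1, 1, 0, 0, 0]},
    "underrepresented": {"y_true": [1, 1, 1, 1, 0, 0], "y_pred": [1, 0, 0, 0, 0, 1]},
}

fnr = {name: false_negative_rate(g["y_true"], g["y_pred"])
       for name, g in groups.items()}

for name, rate in fnr.items():
    print(f"{name}: false negative rate = {rate:.2f}")
```

A large gap between the two rates is the kind of evidence that points to a fairness problem rather than a privacy or transparency one.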

201
MCQhard

A company has a custom AI model for sentiment analysis and wants to use it in Salesforce without rebuilding. Which approach should they take?

A.Use Data Export and import into external system
B.Use Bring Your Own Model (BYOM) for Einstein
C.Use MuleSoft
D.Build in Apex
AnswerB

Enables custom model deployment in Salesforce.

Why this answer

Bring Your Own Model (BYOM) for Einstein allows companies to deploy their own pre-trained AI models directly into Salesforce without rebuilding them. This approach leverages Salesforce's infrastructure for inference while keeping the custom model intact, making it the ideal solution for integrating a custom sentiment analysis model.

Exam trap

The trap here is that candidates may confuse MuleSoft (an integration tool) with a model deployment service, or assume that any external model must be rebuilt in Apex, when BYOM is specifically designed to avoid that.

How to eliminate wrong answers

Option A is wrong because Data Export is a tool for exporting Salesforce data to external systems, not for importing or running custom AI models within Salesforce. Option C is wrong because MuleSoft is an integration platform for connecting applications and data, not a service for deploying custom AI models into Salesforce's AI framework. Option D is wrong because building the model in Apex would require rewriting the entire model from scratch in Apex code, which is impractical for complex machine learning models and defeats the purpose of using an existing custom model.

202
MCQmedium

Refer to the exhibit. A data scientist tries to query the dataset but receives an error. Which of the following is the most likely cause?

A.The requested fields are not included in the policy.
B.The condition filters out records with amount=5000.
C.The data scientist is not listed in the allowedUsers array.
D.The policy format is invalid JSON.
AnswerA

If the query requests a field not listed (e.g., customer_name), it would be denied.

Why this answer

Option A is correct because a query that requests a field not listed in the policy (e.g., customer_name) is denied. Option B is wrong because the policy condition keeps amounts greater than 0 and less than 10000, so amount=5000 is included. Option C is wrong because 'data_scientist' is in allowedUsers, so the user is allowed.

203
MCQhard

Refer to the exhibit. An admin configures Einstein Next Best Action with the above JSON. The expected behavior is to recommend the top 5 actions for open leads with a score of at least 70. However, only 2 recommendations appear for some leads. Which is the most likely cause?

A.The filter on Lead Status is incorrectly excluding actions.
B.The scoreThreshold of 70 excludes many actions, so fewer than 5 meet the criteria.
C.The recommendation strategy is misconfigured for this object.
D.The maxRecommendations is set to 2 instead of 5.
AnswerB

Score threshold filters out low-scoring actions.

Why this answer

Option B is correct because the scoreThreshold of 70 filters out any actions with a score below 70. If fewer than 5 actions meet this threshold, the system returns only those that qualify, resulting in fewer than 5 recommendations. The maxRecommendations setting defines the upper limit, but the actual number returned is constrained by the scoreThreshold.

Exam trap

Salesforce often tests the interaction between scoreThreshold and maxRecommendations, where candidates mistakenly assume maxRecommendations is the sole determinant of the number of recommendations, overlooking that scoreThreshold can reduce the count below that limit.

How to eliminate wrong answers

Option A is wrong because the filter on Lead Status is not mentioned in the JSON exhibit; the issue is with the score threshold, not a status filter. Option C is wrong because the recommendation strategy is correctly configured for the object; the JSON shows valid strategy settings, and the problem is purely about scoring. Option D is wrong because maxRecommendations is set to 5 in the JSON (as stated in the expected behavior), not 2; the reduced count is due to the scoreThreshold, not a misconfiguration of maxRecommendations.
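The interaction between the two settings can be modeled in plain Python. This is a sketch of the filtering logic as the explanation describes it, not the actual Next Best Action engine; the action names and scores are hypothetical.

```python
def recommend(actions, score_threshold, max_recommendations):
    """Drop actions below the threshold, then cap the list at the maximum."""
    qualifying = [a for a in actions if a["score"] >= score_threshold]
    qualifying.sort(key=lambda a: a["score"], reverse=True)
    return qualifying[:max_recommendations]

# Hypothetical candidate actions for one lead (assumption).
actions = [
    {"name": "Schedule demo", "score": 91},
    {"name": "Send pricing sheet", "score": 74},
    {"name": "Invite to webinar", "score": 69},
    {"name": "Share case study", "score": 55},
    {"name": "Offer discount", "score": 40},
    {"name": "Request referral", "score": 30},
]

result = recommend(actions, score_threshold=70, max_recommendations=5)
print([a["name"] for a in result])
```

With a threshold of 70 only two of the six candidates qualify, so the cap of 5 never comes into play, which mirrors the behavior in the question.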

204
Multi-Selecthard

Before training an Einstein Prediction model, a data analyst must perform data quality checks. Which THREE checks are most critical?

Select 3 answers
A.Confirm that label distribution matches the target baseline
B.Remove duplicate records that could cause data leakage
C.Verify consistent data types across records (e.g., all dates as Date)
D.Ensure all features follow a normal distribution
E.Check for missing values in key fields
AnswersB, C, E

Duplicates can over-represent certain patterns.

Why this answer

Option B is correct because duplicate records can cause data leakage by allowing the model to see the same or highly similar data in both training and validation splits, leading to overfitting and inflated performance metrics. Removing duplicates ensures that the model generalizes to unseen data rather than memorizing repeated instances.

Exam trap

Salesforce often tests the misconception that all features must be normally distributed, which is a requirement for some statistical tests but not for machine learning models like those in Einstein Prediction Builder, which can handle non-normal data via tree-based or ensemble methods.

205
MCQhard

A legal firm wants to automate contract clause generation using AI with preapproved language. Which approach should they use?

A.Email Templates
B.Flow
C.Document Templates
D.Einstein GPT with Clause Library
AnswerD

Generates AI text based on approved clauses.

Why this answer

Option D is correct because Einstein GPT with Clause Library is specifically designed to generate contract clauses using preapproved language. It leverages generative AI combined with a library of approved legal clauses, ensuring compliance and consistency while automating clause creation within Salesforce.

Exam trap

The trap here is that candidates confuse Document Templates (static, pre-filled forms) with AI-powered clause generation, not realizing that Einstein GPT with Clause Library dynamically selects and inserts preapproved language, whereas Document Templates require manual clause insertion or simple merge fields.

How to eliminate wrong answers

Option A is wrong because Email Templates are used for standardizing email communications, not for generating contract clauses with preapproved language. Option B is wrong because Flow is a process automation tool for orchestrating actions and approvals, not a content generation system for legal clauses. Option C is wrong because Document Templates provide static document structures but lack the AI-driven clause selection and generation capabilities needed for dynamic, preapproved clause insertion.

206
MCQeasy

For a real-time AI application that requires low-latency access to customer interaction data, which storage solution is most appropriate?

A.Flat files on a network drive.
B.In-memory data store.
C.Relational database with complex joins.
D.Data lake with batch processing.
AnswerB

In-memory storage offers microsecond latency, ideal for real-time AI.

Why this answer

In-memory data stores (e.g., Redis, Memcached) store data in RAM rather than on disk, providing sub-millisecond read/write latencies essential for real-time AI applications that need immediate access to customer interaction data. This eliminates disk I/O bottlenecks and enables high-throughput, low-latency data retrieval for time-sensitive inference or decision-making.

Exam trap

Salesforce often tests the misconception that relational databases are always the best for structured data, but the trap here is that candidates overlook the strict latency requirement and choose a relational database (Option C) without considering that complex joins and disk-based storage make it too slow for real-time AI workloads.

How to eliminate wrong answers

Option A is wrong because flat files on a network drive introduce high latency due to network overhead and disk I/O, and they lack the indexing and concurrency control needed for real-time access. Option C is wrong because relational databases with complex joins incur significant query processing overhead and disk-based storage, making them unsuitable for low-latency requirements despite ACID compliance. Option D is wrong because data lakes with batch processing are designed for high-throughput, periodic analytics (e.g., hourly/daily) and cannot provide the sub-second response times required for real-time AI interactions.

207
MCQmedium

A company wants an AI chatbot that can handle customer inquiries about order status. Which tool should be configured?

A.Queues
B.Einstein Bot
C.Omni-Channel
D.Case Assignment Rules
AnswerB

Provides AI-powered chat for automated responses.

Why this answer

Einstein Bot is the correct tool because it is Salesforce's native AI-powered chatbot designed to handle customer inquiries, including order status, through natural language processing and automated conversations. It can be configured to answer common questions, escalate complex issues, and integrate with backend systems to retrieve real-time order data without human intervention.

Exam trap

The trap here is that candidates often confuse routing tools (Omni-Channel, Queues) or automation rules (Case Assignment Rules) with AI-powered conversational tools, mistakenly thinking any routing or assignment feature can handle customer inquiries directly.

How to eliminate wrong answers

Option A is wrong because Queues are used for routing work items (like cases or leads) to a group of users based on assignment rules, not for building conversational AI or handling customer inquiries directly. Option C is wrong because Omni-Channel is a routing engine that distributes work across channels (chat, phone, etc.) to available agents, but it does not provide AI-driven chatbot capabilities or automated responses. Option D is wrong because Case Assignment Rules automatically assign cases to users or queues based on criteria, but they lack the conversational AI and natural language understanding needed to interact with customers and answer order status questions.

208
Multi-Selectmedium

Which TWO of the following are limitations of Einstein GPT? (Choose two.)

Select 2 answers
A.It requires structured prompts for best results
B.It can automatically generate account summaries from leads
C.It may produce biased or inaccurate content
D.It supports all languages equally
E.It requires no training data
AnswersA, C

Prompts must be well-formatted.

Why this answer

Einstein GPT relies on structured prompts to guide the generative AI model toward relevant and accurate outputs. Without clear, well-formed prompts, the model may produce vague or off-target responses, making prompt engineering a critical skill for users.

Exam trap

Salesforce often tests the misconception that generative AI tools like Einstein GPT are fully autonomous or require no user input, when in reality they depend on structured prompts and quality training data for reliable results.

209
Multi-Selectmedium

Which TWO data preparation steps are required before using Einstein Discovery for sales forecasting? (Choose 2)

Select 2 answers
A.Convert all text fields to numeric using one-hot encoding
B.Remove duplicate records
C.Include a date or timestamp field for time series analysis
D.Ensure all predictor fields have no missing values
E.Normalize numeric fields to a 0-1 scale
AnswersC, D

For forecasting, a date field is needed to order records.

Why this answer

Einstein Discovery requires a date or timestamp field to perform time series analysis, which is essential for identifying trends, seasonality, and patterns in historical sales data. Without this field, the model cannot properly order observations or forecast future values based on temporal dependencies.

Exam trap

Salesforce often tests the misconception that manual data preprocessing steps like normalization or one-hot encoding are required, when in fact Einstein Discovery automates these steps, and the key prerequisite is ensuring a proper date/timestamp field exists for time-based analysis.

210
MCQhard

A mid-sized company uses Salesforce for sales and service. They have implemented Einstein Prediction Builder on a custom object 'Support_Ticket__c' to predict whether a ticket will be escalated (field: 'Escalated__c' Boolean). The model was trained with 10,000 records and 15 fields including 'Subject', 'Description_Summary__c', 'Priority__c', 'Hours_to_Resolution__c', and others. After deployment, the model's precision for escalated tickets is only 30%, while recall is 80%. The business finds too many false positives. The admin notices that the 'Priority__c' field has many missing values (60% null) and that the field 'Is_Critical__c' (a formula field) was included though it flags tickets as critical only rarely. The data spans 12 months but the last 3 months have a significantly higher escalation rate due to a product bug that has since been fixed. Which course of action will most likely improve the model's precision without harming recall?

A.Roll back the model to the version trained 6 months ago when escalation rates were lower.
B.Exclude the 'Priority__c' field from the model and retrain.
C.Filter training data to exclude tickets from the last 3 months and impute missing 'Priority__c' values with the most common priority.
D.Remove the 'Is_Critical__c' field and increase training data to 50,000 records.
AnswerC

Removing anomalous period and fixing data quality improves model relevance.

Why this answer

Option C is correct because it addresses both the data drift and data quality issues that degrade precision. Excluding the last 3 months removes the biased escalation pattern caused by a fixed product bug, ensuring the model learns from stable historical patterns. Imputing missing 'Priority__c' values with the most common priority reduces noise from nulls without discarding the field entirely, which helps maintain recall by preserving predictive signal.
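The precision and recall figures in the scenario can be reproduced with a small worked example. The counts below are hypothetical, picked only so that they yield 30% precision and 80% recall; they are not taken from the question.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts: 300 tickets really escalated, the model caught 240 of them
# but also flagged 560 tickets that never escalated (false positives).
before = precision_recall(tp=240, fp=560, fn=60)

# If cleaner training data removes most false positives while the same
# true escalations are still caught, precision rises and recall is unchanged.
after = precision_recall(tp=240, fp=160, fn=60)

print(before, after)
```

The point of the sketch is that precision depends on false positives and recall on false negatives, so a fix that reduces false positives alone raises precision without lowering recall.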

Exam trap

Salesforce often tests the misconception that simply removing a problematic field or adding more data will fix model performance, when the real issue is data drift and missing value handling that require both temporal filtering and imputation.

How to eliminate wrong answers

Option A is wrong because rolling back to a 6-month-old model ignores the fact that the recent 3-month spike was due to a temporary bug that has been fixed; the older model may not generalize to current data and could still have poor precision. Option B is wrong because simply excluding 'Priority__c' without addressing missing values or data drift may harm recall, as the field could still be predictive when populated; the core issue is the biased training period and null handling, not the field itself. Option D is wrong because removing 'Is_Critical__c' alone does not fix the data drift from the last 3 months, and increasing training data to 50,000 records without cleaning or rebalancing could amplify the bias from the bug period, potentially worsening precision.

211
MCQmedium

A company deploys an AI system to screen job applications. The system is found to consistently reject candidates from a particular university, even though those candidates are qualified. What is the most ethical first step?

A.Increase rejections from other universities to balance
B.Ignore the finding as correlation, not causation
C.Change the screening criteria to include more universities
D.Investigate the training data and model for bias
AnswerD

Bias investigation is the ethical first step to identify and mitigate unfairness.

Why this answer

Option D is correct because investigating the training data and model for bias is the appropriate ethical first step: it identifies the root cause before any corrective action is taken. Option A is wrong because increasing rejections from other universities adds new unfairness instead of removing the existing bias. Option B is wrong because ignoring the finding leaves a potentially unfair system in place.

Option C is wrong because changing the screening criteria without analysis may not address the root cause.

212
MCQmedium

Refer to the exhibit. In the JSON configuration above, which data preparation step could introduce bias?

A.Excluding rows with missing Stage
B.Ignoring missing Description
C.Filling missing Amount with median
D.Using default for missing CreatedDate
AnswerA

Excluding rows can systematically remove cases if missing is not random, especially if Stage is related to the target.

Why this answer

Option A is correct because excluding rows with missing Stage (a picklist that may correlate with the outcome) can introduce selection bias. Filling missing Amount with the median (C) and using a default for missing CreatedDate (D) are common imputation methods; ignoring missing Description (B) is generally safe as it treats missing as information.
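A small sketch shows how excluding rows with a missing value can shift what a model learns when the missingness is related to the outcome. The records below are hypothetical and are not the exhibit's data; the `Won` field is an assumed outcome label.

```python
# Hypothetical opportunities: every row with a missing Stage was lost.
opportunities = [
    {"Stage": "Negotiation", "Won": True},
    {"Stage": "Proposal", "Won": True},
    {"Stage": "Negotiation", "Won": True},
    {"Stage": "Prospecting", "Won": False},
    {"Stage": None, "Won": False},
    {"Stage": None, "Won": False},
    {"Stage": None, "Won": False},
    {"Stage": None, "Won": False},
]


def win_rate(rows: list) -> float:
    """Fraction of rows where the opportunity was won."""
    return sum(1 for r in rows if r["Won"]) / len(rows)


full_rate = win_rate(opportunities)

# Excluding rows with missing Stage removes only lost deals in this sample.
kept = [r for r in opportunities if r["Stage"] is not None]
filtered_rate = win_rate(kept)

print(full_rate, filtered_rate)
```

In this toy data the win rate doubles after the exclusion, so a model trained on the filtered rows would see an overly optimistic picture.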

213
MCQhard

A financial services firm deploys Einstein Prediction Builder to predict loan default risk. The model uses sensitive attributes like zip code and age. During testing, the model shows a disparate impact on minority neighborhoods. The compliance team requires explanation of individual predictions for regulatory audits. The data science team wants to use a complex deep learning model that is not interpretable. Which approach best balances performance and ethical responsibility?

A.Use the complex model but provide post-hoc explanations like SHAP values to satisfy compliance.
B.Use the complex model but only for a subset of customers to limit exposure.
C.Use the complex model and hide the disparate impact by adjusting thresholds per group.
D.Use a simpler, interpretable model (e.g., logistic regression) that may have slightly lower accuracy but ensures transparency and reduces bias.
AnswerD

A simpler, interpretable model ensures transparency and reduces bias, aligning with ethical AI principles.

Why this answer

Option D is correct because a simpler, interpretable model ensures transparency and reduces bias, aligning with ethical AI principles. Option A is wrong because post-hoc explanations may not be reliable or accepted by regulators. Option C is wrong because adjusting thresholds per group to hide the disparate impact is discriminatory and illegal.

Option B is wrong because using the model on a subset of customers does not resolve the underlying bias or the compliance requirement.

214
MCQmedium

A Salesforce admin builds an Einstein Prediction Builder model to predict customer churn. The model assigns higher churn risk to customers in a certain demographic group. What is the MOST ethical FIRST step?

A.Disable the model immediately
B.Remove demographic features from the model
C.Analyze the data and model for potential bias
D.Retrain the model with more data
AnswerC

Investigation is the ethical first step to understand and mitigate bias.

Why this answer

Option C is correct because understanding the cause of the disparity is essential before taking action. Option A is wrong because disabling the model immediately gives up its business value before the cause is known. Option D is wrong because simply retraining with more data may not address the root cause.

Option B is wrong because removing demographic features might not eliminate bias if other features are correlated with them.

215
Multi-Selectmedium

A company is implementing Einstein Prediction Builder to predict whether a support case will escalate. Which TWO data preparation steps should the admin take to improve model accuracy?

Select 2 answers
A.Include as many fields as possible to provide more context
B.Ensure missing values are handled appropriately (e.g., imputed or excluded)
C.Encrypt all fields containing personally identifiable information
D.Exclude cases that were closed without escalation
E.Remove fields that have a one-to-one relationship with the outcome
AnswersB, E

Missing values can bias the model; proper handling improves accuracy.

Why this answer

Options B and E are correct: handling missing values and removing redundant fields (such as record IDs or fields that map one-to-one to the outcome) are crucial for model accuracy. Option A is wrong because more fields can introduce noise. Option C is wrong because data encryption is about security, not accuracy.

Option D is wrong because all cases should be included to represent the full pattern.

216
Multi-Selecthard

A data analyst is reviewing an Einstein Discovery story and notices that one input feature has a very high influence on the predicted outcome. Which two conclusions are justified based on this observation? (Choose 2)

Select 2 answers
A.The feature has a causal relationship with the outcome.
B.Removing the feature will significantly improve model performance.
C.The model is likely overfitted to that feature.
D.The feature could be a surrogate for other correlated features.
E.The feature is the most important predictor of the outcome.
AnswersD, E

Could represent collinear features.

Why this answer

Option D is correct because a feature with high influence in an Einstein Discovery story may be acting as a proxy for other correlated features, meaning its apparent importance could be due to shared variance with other predictors. This is a known phenomenon in machine learning where collinearity can inflate a feature's influence score without it being uniquely causal.
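One way to check whether a high-influence feature might be standing in for another is to measure how strongly the two move together. The sketch below computes a Pearson correlation by hand; the two feature columns and their values are hypothetical.

```python
import math


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns: the feature flagged as highly influential,
# and a second feature suspected of carrying the same signal.
influential_feature = [1, 2, 3, 4, 5]
other_feature = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson(influential_feature, other_feature)
print(round(r, 3))
```

A correlation close to 1 suggests the two features share most of their variance, so the influence credited to one could partly belong to the other. It says nothing about causation.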

Exam trap

Salesforce often tests the distinction between correlation and causation, and the trap here is assuming that high feature influence implies a direct causal link or that removing the feature will always improve the model.

217
MCQeasy

A user asks an AI assistant to generate content that may be offensive. What should the AI do?

A.Ignore the request
B.Refuse and explain why
C.Generate and report the user
D.Generate with a warning
AnswerB

Correct. The AI should not produce offensive content and should provide reasoning.

Why this answer

The AI should refuse the request and explain why, upholding ethical standards.

218
Multi-Selectmedium

Which TWO of the following are common dimensions of data quality that must be addressed for AI training?

Select 2 answers
A.Storage efficiency
B.Accuracy of values
C.Encryption strength
D.Consistency with external benchmarks
E.Completeness of records
AnswersB, E

Accuracy ensures data correctly represents real-world entities.

Why this answer

Accuracy of values (Option B) is a fundamental dimension of data quality because AI models learn patterns from training data; if the data contains incorrect values, the model will learn and propagate those errors, leading to unreliable predictions. For example, in a dataset of customer ages, a single erroneous entry of '200' can skew the model's understanding of age distributions, directly impacting model performance.

Exam trap

Salesforce often tests the distinction between data quality dimensions (accuracy, completeness, consistency) and operational or security attributes (storage efficiency, encryption strength), tricking candidates into selecting options that sound technical but are irrelevant to data quality for AI training.

219
MCQeasy

Refer to the exhibit. Which ethical principle is violated?

A.Safety
B.Fairness
C.Accountability
D.Transparency
AnswerD

The user cannot understand why the loan was denied, violating transparency.

Why this answer

Option D is correct because the AI fails to provide an explanation for the decision, violating transparency. Option C is wrong because accountability is about responsibility, whereas the issue here is the lack of explanation. Option B is wrong because fairness is not directly addressed.

Option A is wrong because safety is not relevant here.

220
Multi-Selecteasy

Which TWO data types can be used as input for Einstein Vision?

Select 2 answers
A.Video files (MP4).
B.URLs pointing to images.
C.Image files (JPEG, PNG).
D.Text documents (PDF).
E.Audio files (WAV).
AnswersB, C

URLs are accepted as references to images.

Why this answer

Einstein Vision is designed to analyze and classify visual content, specifically images. It accepts image files (JPEG, PNG) and URLs pointing to images as input, allowing the model to process the visual data for object detection, classification, or other vision tasks. Video, text, and audio files are not supported because the service is not built for temporal or non-visual data.

Exam trap

Salesforce often tests the misconception that Einstein Vision can handle video or multimedia files because it is an 'AI' service, but the exam specifically limits input to static images and image URLs.

221
MCQhard

Refer to the exhibit. A developer runs this SOQL query to prepare data for Einstein Lead Scoring. The query returns an error. What is the most likely issue?

A.The alias 'TotalAmount' is not allowed in the HAVING clause.
B.The query misses a GROUP BY clause.
C.The SUM(Amount) cannot be used in the HAVING clause.
D.The WHERE clause condition is invalid.
AnswerA

In SOQL, HAVING must use the full aggregate expression, not an alias.

Why this answer

The HAVING clause references alias TotalAmount, but SOQL does not allow aliases in HAVING; the aggregated expression must be repeated.

222
MCQhard

A company has set up Einstein Next Best Action with a recommendation strategy. They want to ensure that recommendations are personalized based on the customer's recent behavior. What data should be used?

A.Event data from the website tracked via Google Analytics.
B.Streaming data from Data Cloud that includes recent website interactions.
C.Static profile fields like customer age and location.
D.Historical data from a data warehouse updated daily.
AnswerB

Data Cloud can ingest streaming events and make them available for real-time decisions.

Why this answer

Option B is correct because Einstein Next Best Action requires real-time or near-real-time data to personalize recommendations based on recent customer behavior. Streaming data from Data Cloud captures website interactions as they happen, enabling the recommendation engine to use the most current signals (e.g., page views, clicks) to adjust offers dynamically.

Exam trap

The trap here is that candidates often treat 'historical data' or 'static profile data' as sufficient for personalization, but Salesforce tests the understanding that real-time behavior requires streaming data, not batch or static sources.

How to eliminate wrong answers

Option A is wrong because Google Analytics event data is not natively integrated with Einstein Next Best Action; the platform requires data through Salesforce Data Cloud or connected Salesforce sources, not third-party analytics tools. Option C is wrong because static profile fields like age and location do not reflect recent behavior, which is essential for real-time personalization; they are useful for segmentation but not for dynamic, behavior-driven recommendations. Option D is wrong because historical data updated daily is too stale for real-time personalization; Einstein Next Best Action needs streaming or near-real-time data to respond to a customer's latest actions, not batch-loaded historical records.

223
MCQeasy

A Salesforce admin wants to deploy an Einstein bot that uses natural language processing. Which practice best ensures ethical use?

A.Provide clear disclaimers that the user is interacting with an AI.
B.Use the bot only for internal processes.
C.Collect as much personal data as possible to improve accuracy.
D.Allow the bot to make autonomous decisions without human review.
AnswerA

Clear disclaimers ensure transparency and informed consent.

Why this answer

Option A is correct because transparency is a key ethical principle; users should know they are interacting with AI. Option B is wrong because restricting to internal processes does not address ethical use. Option C is wrong because collecting excessive personal data violates privacy.

Option D is wrong because autonomous decisions may require human oversight.

224
MCQeasy

What is the primary purpose of Einstein Studio in the Salesforce AI ecosystem?

A.Create and manage prompt templates
B.Unify customer data from multiple sources
C.Train, test, and deploy custom AI models using AutoML
D.Deploy pre-built Einstein GPT models
AnswerC

Einstein Studio provides AutoML capabilities for building custom models without deep data science expertise.

Why this answer

Option C is correct because Einstein Studio is specifically designed to allow users to train, test, and deploy custom AI models using AutoML, without requiring deep data science expertise. It provides a no-code interface for building models on Salesforce Data Cloud data, enabling predictive scoring and recommendations directly within the Salesforce ecosystem.

Exam trap

The trap here is confusing Einstein Studio's custom AutoML model building with Einstein GPT's pre-built generative AI capabilities, leading candidates to select Option D or A when the question specifically asks about the primary purpose of Einstein Studio.

How to eliminate wrong answers

Option A is wrong because creating and managing prompt templates is the function of Prompt Builder, not Einstein Studio. Option B is wrong because unifying customer data from multiple sources is the role of Data Cloud (formerly Customer Data Platform), not Einstein Studio. Option D is wrong because deploying pre-built Einstein GPT models is handled by Einstein GPT and its pre-built actions, whereas Einstein Studio focuses on custom model creation using AutoML.

225
MCQhard

A large enterprise uses Data Cloud to power an Einstein model for lead scoring. The model's feature pipeline includes dozens of fields from multiple data streams. Performance has degraded, and the team suspects slow feature retrieval. What is the most efficient way to speed up feature computation in Data Cloud?

A.Increase the parallelism of the data streams
B.Implement external caching in the application layer
C.Use Calculated Insights to pre-compute and cache common features
D.Move all data to a single data lake object
AnswerC

Reduces on-the-fly computation by storing results.

Why this answer

Option C is correct because Calculated Insights can pre-aggregate and store frequently used features, reducing on-the-fly computation. Option A is wrong because parallelism isn't always the bottleneck. Option D is wrong because storage location affects latency but not computation.

Option B is wrong because caching at the application layer bypasses Data Cloud optimizations.
