Knowledge + Practice

CompTIA AI+ AI0-001 (AI0-001) — Questions 226–300

500 questions total · 7 pages · All types, answers revealed

226

Multi-Select · easy

Which TWO actions are most appropriate for managing model drift in a production AI system?

Select 2 answers

A.Freeze the model to prevent any changes

B.Roll back to a previous model version if performance degrades

C.Periodically retrain the model on recent data

D.Manually review all model predictions

E.Implement automated monitoring to detect drift indicators

Answers: C, E

Regular retraining helps the model adapt to new patterns.

Why this answer

Option C is correct because periodically retraining the model on recent data is a fundamental strategy to combat model drift, ensuring the model adapts to changes in the underlying data distribution (e.g., concept drift or covariate shift). This aligns with MLOps best practices for maintaining model accuracy over time in production AI systems.

Exam trap

CompTIA often tests the distinction between reactive fixes (like rollback) and proactive, automated strategies (like monitoring and retraining), tricking candidates into choosing rollback as a valid long-term drift management action.
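
As a rough illustration of the automated monitoring in option E, the sketch below computes a Population Stability Index (PSI) between a reference sample and recent production data. PSI is one possible drift indicator, not one named by the question; the function name `psi`, the synthetic data, and the 0.2 threshold (a common rule of thumb) are all assumptions made for this example.

```python
import math
import random

def psi(expected, actual, bins=10):
    """Population Stability Index between a reference and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

random.seed(0)
reference = [random.gauss(50, 10) for _ in range(5000)]  # training-time feature values
same = [random.gauss(50, 10) for _ in range(5000)]       # production data, no drift
shifted = [random.gauss(60, 10) for _ in range(5000)]    # production data, mean has moved

psi_same = psi(reference, same)
psi_shifted = psi(reference, shifted)

DRIFT_THRESHOLD = 0.2  # assumed alerting threshold, not from the question
needs_retraining = psi_shifted > DRIFT_THRESHOLD
```

In a pipeline, a check like `needs_retraining` could trigger the periodic retraining described in option C instead of waiting for a fixed schedule.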

227

Multi-Select · medium

A financial services firm has deployed an AI model for real-time credit scoring. The operations team needs to ensure the model remains reliable and compliant over time. Which TWO actions should the team prioritize? (Choose two.)

Select 2 answers

A.Implement automated monitoring for data drift and model performance metrics.

B.Deploy a model versioning system with automated rollback capabilities.

C.Establish a governance process for version-controlled model deployment and retraining.

D.Schedule monthly manual retraining of the model using historical data.

E.Generate weekly compliance reports for regulatory review.

Answers: A, C

Monitoring data drift and performance metrics is proactive and addresses the root cause of model degradation.

Why this answer

Option A is correct because monitoring for data drift helps detect when the distribution of input features changes, which can degrade model performance. Option C is correct because version-controlled deployment and retraining ensures reproducibility and auditability. Option B is wrong because automated rollback is reactive, not proactive; monitoring is more fundamental.

Option D is wrong because, while useful, manual retraining on fixed intervals does not adapt to drift as effectively as drift-triggered retraining. Option E is wrong because compiling frequent reports creates overhead without directly ensuring model reliability.

228

Multi-Select · hard

An organization is deploying a deep learning model in production. Which THREE components are essential for maintaining model performance over time?

Select 3 answers

A.Performance monitoring

B.Hyperparameter tuning

C.Model retraining pipeline

D.Feature importance analysis

E.Data drift detection

Answers: A, C, E

Continuous monitoring of key metrics alerts teams to degradation in model performance.

Why this answer

Performance monitoring (A) is essential because it provides continuous visibility into model metrics such as accuracy, latency, and throughput, enabling early detection of degradation. Without ongoing monitoring, teams cannot identify when a model's predictions deviate from expected behavior, which is critical for maintaining reliability in production.

Exam trap

CompTIA often tests the distinction between development-phase activities (hyperparameter tuning, feature analysis) and production-phase operational components (monitoring, retraining, drift detection), so candidates mistakenly include tuning or analysis as essential for ongoing maintenance.

229

MCQ · medium

A financial institution is training a risk assessment model. The dataset includes customer credit scores, income, age, and past loan defaults. During feature engineering, a data engineer creates a new feature 'income_to_debt_ratio'. Which type of feature engineering technique is this?

A.Feature encoding

B.Feature scaling

C.Feature selection

D.Feature combination

Answer: D

Creating a ratio from two continuous variables is a combination technique to capture interaction.

Why this answer

Option D is correct because 'income_to_debt_ratio' is created by combining two existing features (income and debt) into a single derived feature. This is a classic example of feature combination (also known as feature crossing or feature construction), where arithmetic operations or logical rules are applied to existing variables to generate new predictive signals. The goal is to capture interactions or relationships that the original features alone may not express linearly.

Exam trap

CompTIA often tests the distinction between feature engineering techniques by presenting a derived feature and expecting candidates to recognize it as feature combination rather than confusing it with scaling or encoding.

How to eliminate wrong answers

Option A is wrong because feature encoding transforms categorical variables into numerical representations (e.g., one-hot encoding, label encoding) rather than creating new numerical ratios from existing numerical features. Option B is wrong because feature scaling normalizes or standardizes the range of feature values (e.g., min-max scaling, z-score normalization) without generating new features. Option C is wrong because feature selection reduces the number of features by choosing a subset of the original ones (e.g., using correlation analysis or recursive feature elimination), not by engineering new derived attributes.
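
A minimal sketch of feature combination, assuming each record carries `income` and `debt` fields. The question's dataset does not list a debt column, so the field names, the sample values, and the helper `add_income_to_debt_ratio` are hypothetical.

```python
# Hypothetical customer records; "debt" is assumed to be available
customers = [
    {"income": 60000.0, "debt": 15000.0},
    {"income": 45000.0, "debt": 0.0},
]

def add_income_to_debt_ratio(row):
    """Return a copy of the row with a derived ratio feature."""
    out = dict(row)
    # Guard against division by zero for customers with no debt
    out["income_to_debt_ratio"] = (
        row["income"] / row["debt"] if row["debt"] > 0 else None
    )
    return out

engineered = [add_income_to_debt_ratio(r) for r in customers]
```

The original columns are kept and a new one is added, which is what separates feature combination from scaling (same columns, new ranges) and selection (fewer columns).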

230

MCQ · medium

A healthcare AI startup has developed a model to detect diabetic retinopathy from retinal images. The model achieved 96% sensitivity and 94% specificity on a validation set from the same distribution as the training data. After deployment in a rural clinic, the model's sensitivity drops to 80%. The data team analyzes the clinical images from the clinic and finds that the images have lower resolution and different lighting conditions compared to the training dataset. The team has the ability to collect more data from the clinic and retrain the model. What is the BEST course of action?

A.Reduce the model's complexity by removing several convolutional layers to improve generalization.

B.Apply transfer learning using a model pre-trained on a different medical imaging dataset.

C.Implement adversarial validation to identify which images are out-of-distribution and filter them out.

D.Collect additional retinal images from the rural clinic, label them, and retrain the model including the new data.

Answer: D

Adding data from the target domain re-aligns the model with the deployment environment.

Why this answer

Option D is correct because the performance drop is caused by a domain shift (lower resolution, different lighting) between the training and deployment data. The most direct and effective solution is to collect labeled images from the target domain (rural clinic) and retrain the model, which is a form of domain adaptation: the training set is extended with real examples from the deployment environment. This approach addresses the root cause by exposing the model to the actual distribution it will encounter in production.

Exam trap

CompTIA often tests the misconception that reducing model complexity or using generic transfer learning can fix domain shift, when in reality the most reliable solution is to retrain with data from the target deployment environment.

How to eliminate wrong answers

Option A is wrong because reducing model complexity (e.g., removing convolutional layers) would likely decrease capacity to learn domain-specific features, potentially worsening performance rather than fixing the domain shift. Option B is wrong because transfer learning from a different medical imaging dataset (e.g., X-rays or MRIs) may not help if the source domain still differs significantly from the rural clinic's retinal images; it could introduce irrelevant features or negative transfer. Option C is wrong because adversarial validation only identifies out-of-distribution samples but does not improve model performance on those samples; filtering them out would reduce the usable data and fail to address the need for the model to work on the clinic's images.

231

MCQ · medium

Refer to the exhibit. A team created an access policy for a fraud detection model endpoint. An intern reports being unable to access the model for testing. Reviewing the policy, what is the most likely cause?

A.The intern's role is not included in the allowed roles

B.The policy JSON has a syntax error

C.The endpoint path is incorrect

D.The intern's role is explicitly denied in the policy

Answer: D

Denied roles override any allowed list.

Why this answer

Option D is correct because the exhibit shows an explicit `Deny` effect for the intern's role in the policy. In AWS IAM (or similar cloud provider) access policies, an explicit deny overrides any allow, so even if the intern's role is listed in allowed roles, the explicit deny will block access. This is a fundamental principle of IAM policy evaluation logic.

Exam trap

CompTIA often tests the explicit deny override principle, where candidates mistakenly think that listing a role in allowed roles guarantees access, ignoring that an explicit deny in the same policy will block it.

How to eliminate wrong answers

Option A is wrong because the intern's role is actually listed in the allowed roles section, so the issue is not a missing role. Option B is wrong because the policy JSON is syntactically valid (no missing commas, brackets, or quotes) and would parse correctly. Option C is wrong because the endpoint path is correctly specified in the policy's `Resource` element, matching the model endpoint ARN.
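
The explicit-deny rule can be sketched as a small evaluator. The exhibit is not reproduced on this page, so the policy below is a hypothetical stand-in, not the exhibit's policy and not real AWS IAM syntax; the role names and the `is_allowed` helper are assumptions for illustration.

```python
def is_allowed(role, statements):
    """Evaluate a policy: explicit Deny wins, otherwise any Allow grants access."""
    allowed = False
    for statement in statements:
        if role in statement["roles"]:
            if statement["effect"] == "Deny":
                return False  # an explicit deny overrides every allow
            if statement["effect"] == "Allow":
                allowed = True
    return allowed  # no matching statement means implicit deny

# Hypothetical policy: the intern role appears in both statements
policy = [
    {"effect": "Allow", "roles": ["data-scientist", "intern"]},
    {"effect": "Deny", "roles": ["intern"]},
]

intern_access = is_allowed("intern", policy)
scientist_access = is_allowed("data-scientist", policy)
guest_access = is_allowed("guest", policy)
```

Under these assumptions the intern is blocked even though the role is in the allow list, which is the behavior the question tests.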

232

MCQ · medium

A company is deploying an AI model to recommend products. The model's training data included historical purchases from the past two years, but the business environment has changed significantly due to a market shift. What is the most likely issue affecting model performance?

A.Concept drift

B.Overfitting

C.Underfitting

D.Data leakage

Answer: A

Concept drift is the change in the underlying relationship between features and target variable over time, making the model outdated.

Why this answer

Concept drift occurs when the relationship between the input features and the target variable changes over time, which is common in dynamic business environments such as the market shift described here. Overfitting and underfitting describe how well a model fits its training data, not changes that occur after training. Data leakage involves training on information that would not be available at prediction time.

233

MCQ · easy

A healthcare provider wants to use AI to predict patient readmission risk. They have structured data (age, diagnosis, lab results) and unstructured clinical notes. Which approach is most appropriate?

A.Convolutional neural network (CNN) on clinical notes

B.Recurrent neural network (RNN) on structured data

C.Logistic regression on structured data only

D.Multimodal model combining structured and text embeddings

Answer: D

A multimodal model can process both structured data and text, leveraging all available information.

Why this answer

Option D is correct because the scenario involves both structured data (age, diagnosis, lab results) and unstructured clinical notes. A multimodal model can process both types by combining embeddings from text (e.g., via a transformer or RNN) with structured features, enabling the model to learn cross-modal patterns that improve readmission risk prediction. This approach leverages the complementary strengths of structured and unstructured data, which is essential for capturing the full clinical picture.

Exam trap

The trap here is that candidates may assume a single model type (like CNN or RNN) is sufficient for all data, overlooking the need to combine structured and unstructured data through a multimodal architecture.

How to eliminate wrong answers

Option A is wrong because a convolutional neural network (CNN) on clinical notes alone ignores the structured data (age, diagnosis, lab results), which are critical for readmission prediction; CNNs are also less effective for sequential text than transformers or RNNs. Option B is wrong because a recurrent neural network (RNN) on structured data is suboptimal—structured data is typically tabular and better handled by tree-based models or dense layers, and RNNs are designed for sequential data like time series or text. Option C is wrong because logistic regression on structured data only discards the valuable unstructured clinical notes, missing key risk factors embedded in free text, and logistic regression cannot capture complex nonlinear interactions in the data.

234

MCQ · medium

A machine learning engineer is training a Support Vector Machine (SVM) with an RBF kernel on a dataset with features on different scales (e.g., age 0-100, income 0-1,000,000). The model converges slowly and yields poor accuracy. What should the engineer do first?

A.Standardize the features to have zero mean and unit variance

B.Increase the regularization parameter C to penalize misclassifications more

C.Decrease the gamma parameter to reduce the influence of each data point

D.Switch to a linear kernel to avoid distance calculations

Answer: A

Standardization ensures all features contribute equally to the distance metric.

Why this answer

Option A is correct because feature scaling (normalization or standardization) is crucial for SVMs with an RBF kernel, as the distance metric depends on feature scales. Option D is wrong because switching to a linear kernel may not capture non-linearity and does not fix the scale mismatch. Option B is wrong because increasing C changes the regularization strength and does not address feature scale.

Option C is wrong because reducing gamma may help but without scaling, distances are dominated by large-scale features.
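
To see why scale matters for a distance-based kernel, the sketch below measures how much of the squared Euclidean distance between two people comes from age, before and after standardization. The three sample records are invented for illustration, and plain Euclidean distance stands in for the distance term inside the RBF kernel.

```python
import statistics

ages = [25.0, 40.0, 60.0]
incomes = [30000.0, 900000.0, 31000.0]

def standardize(values):
    """Rescale to zero mean and unit variance (z-scores)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def age_share(age_a, age_b, income_a, income_b):
    """Fraction of the squared distance contributed by the age difference."""
    age_term = (age_a - age_b) ** 2
    income_term = (income_a - income_b) ** 2
    return age_term / (age_term + income_term)

z_ages, z_incomes = standardize(ages), standardize(incomes)

# Compare person 0 and person 2: 35 years apart, incomes nearly equal
age_share_raw = age_share(ages[0], ages[2], incomes[0], incomes[2])
age_share_scaled = age_share(z_ages[0], z_ages[2], z_incomes[0], z_incomes[2])
```

On the raw values the income difference swamps the age difference even though the incomes are almost identical; after standardization the age gap dominates, as intuition suggests it should.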

235

MCQ · hard

An AI system is deployed to detect fraudulent transactions. The system flags 5% of transactions as fraudulent, but the actual fraud rate is 0.1%. The business sees many false positives and wants to reduce them without significantly increasing false negatives. Which metric should be prioritized for optimization?

A.Recall

B.F1 score

C.Accuracy

D.Precision

Answer: B

F1 score balances precision and recall, allowing trade-off to reduce false positives while maintaining reasonable recall.

Why this answer

The F1 score balances precision and recall, making it ideal when false positives are costly but false negatives must not increase significantly. Optimizing precision alone would reduce false positives but could increase false negatives, while recall alone would not address the false positive problem. The F1 score ensures both metrics are jointly optimized, aligning with the business requirement.

Exam trap

CompTIA often tests the misconception that precision is the best metric for reducing false positives, but the trap here is that precision alone ignores the impact on false negatives, which the business explicitly wants to avoid increasing.

How to eliminate wrong answers

Option A is wrong because recall focuses on minimizing false negatives, but does not address the false positive problem; optimizing recall alone would likely increase false positives, worsening the business issue. Option C is wrong because accuracy is misleading in highly imbalanced datasets (0.1% fraud rate); a system that never flags any transaction would achieve 99.9% accuracy but fail to detect fraud. Option D is wrong because precision reduces false positives, but optimizing precision alone could increase false negatives (missed fraud), which the business wants to avoid; the F1 score balances both.
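
The numbers in the scenario can be worked through on a hypothetical batch of 100,000 transactions. The 5% flag rate and 0.1% fraud rate come from the question; the batch size, the count of fraud cases caught, and the "stricter threshold" scenario are assumptions chosen only to show how the three metrics move.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

total = 100_000
fraud = total // 1000        # 0.1% actual fraud -> 100 cases
flagged = total * 5 // 100   # 5% flagged -> 5,000 transactions

# Assumption: the current system catches 90 of the 100 fraud cases
tp_now = 90
p_now, r_now, f1_now = precision_recall_f1(tp_now, flagged - tp_now, fraud - tp_now)

# Assumption: a stricter threshold flags 200 transactions and catches 80 cases
tp_strict = 80
p_strict, r_strict, f1_strict = precision_recall_f1(tp_strict, 200 - tp_strict, fraud - tp_strict)
```

Under these assumed counts, precision rises sharply while recall falls only slightly, and F1 captures that trade in a single number.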

236

Multi-Select · medium

A data science team is building a model to predict customer churn. The dataset includes categorical variables like 'region' and 'subscription_type'. Which three preprocessing steps should be applied to these categorical features? (Select THREE).

Select 3 answers

A.Normalization

B.Label encoding

C.Standard scaling

D.Ordinal encoding

E.One-hot encoding

Answers: B, D, E

Label encoding assigns integers to each category, suitable for ordinal categories.

Why this answer

Label encoding (B) is correct because it converts each unique category in a categorical variable into a unique integer, which is a simple and memory-efficient way to prepare categorical data for machine learning models. Ordinal encoding (D) is correct for categorical variables with a natural order, such as 'subscription_type' if tiers exist (e.g., basic, premium, enterprise), preserving ordinal relationships. One-hot encoding (E) is correct for nominal categorical variables like 'region' where no order exists, creating binary columns for each category to avoid implying false ordinality.

Exam trap

CompTIA often tests the distinction between ordinal and nominal categorical variables, trapping candidates who apply label encoding to nominal data or one-hot encoding to ordinal data without considering the feature's inherent order.
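
The nominal versus ordinal distinction can be sketched with two small encoders. The sample values and the tier order `basic < premium < enterprise` follow the example in the explanation above; the helper names `one_hot` and `ordinal_encode` are made up for this illustration.

```python
regions = ["north", "south", "east", "south"]       # nominal: no natural order
tiers = ["basic", "enterprise", "premium", "basic"]  # ordinal: tiers are ranked

def one_hot(values):
    """Return one binary column per category, plus the column names."""
    categories = sorted(set(values))
    matrix = [[1 if v == c else 0 for c in categories] for v in values]
    return matrix, categories

def ordinal_encode(values, order):
    """Map each category to its rank in an explicitly supplied order."""
    rank = {name: i for i, name in enumerate(order)}
    return [rank[v] for v in values]

region_matrix, region_columns = one_hot(regions)
tier_codes = ordinal_encode(tiers, ["basic", "premium", "enterprise"])
```

Passing the order explicitly is the design point: an encoder that assigns integers alphabetically would rank `enterprise` below `premium`.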

237

MCQ · easy

Refer to the exhibit. The data scientist notices that the model achieves 98% accuracy on the training set but only 72% on the test set. Which change to the model parameters is most likely to reduce this gap?

A.Increase n_estimators to 500.

B.Set max_depth to None to allow trees to grow fully.

C.Reduce max_depth to 3.

D.Switch from RandomForest to a linear model like LogisticRegression.

Answer: C

Reducing max_depth restricts the tree depth, reducing overfitting.

Why this answer

The model is overfitting: 98% training accuracy vs. 72% test accuracy. Reducing max_depth to 3 limits the depth of each decision tree, preventing them from memorizing noise and forcing them to learn more generalizable patterns. This is a standard regularization technique for tree-based ensembles.

Exam trap

CompTIA often tests the bias-variance tradeoff by presenting overfitting symptoms and expecting candidates to choose a regularization parameter (like reducing max_depth) rather than increasing model complexity or switching model families entirely.

How to eliminate wrong answers

Option A is wrong because increasing n_estimators to 500 adds more trees, which generally stabilizes predictions but does not address overfitting caused by individual trees that are too deep. Option B is wrong because setting max_depth to None allows trees to grow fully, which increases the risk of overfitting by capturing every detail in the training data, widening the accuracy gap. Option D is wrong because switching to a linear model like LogisticRegression is a drastic architectural change that may underfit if the data has non-linear relationships; the goal is to regularize the existing RandomForest, not replace it entirely.

238

MCQ · hard

An MLOps team uses a CI/CD pipeline to automate model retraining. The pipeline triggers on new labeled data, runs feature engineering, retrains the model, evaluates against a holdout set, and deploys if metrics exceed thresholds. Recently, a retrained model passed validation but caused a 5% accuracy drop in production. Which improvement best prevents this?

A.Implement canary deployment with shadow scoring to compare with current model

B.Require manual approval before deployment

C.Use the entire production dataset for validation instead of a holdout set

D.Increase the amount of training data used in each retraining cycle

Answer: A

Canary deployment allows testing on live traffic with minimal risk.

Why this answer

Option A is correct because canary deployment with shadow scoring catches performance issues on live traffic before full rollout. Option D is wrong because more training data might not help and could introduce bias. Option B is wrong because manual approval slows down the pipeline.

Option C is wrong because only using full dataset for evaluation doesn't simulate production conditions.
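
A minimal sketch of shadow scoring, assuming both models can be called as plain functions and that ground-truth labels eventually arrive for the sampled traffic. The threshold models, the sample requests, and the `max_drop` tolerance are hypothetical stand-ins, not part of the pipeline described in the question.

```python
def shadow_compare(requests, labels, current_model, candidate_model, max_drop=0.01):
    """Score the candidate on live requests without serving its predictions."""
    current_correct = candidate_correct = 0
    for x, y in zip(requests, labels):
        served = current_model(x)    # only this prediction reaches users
        shadow = candidate_model(x)  # logged for comparison, never served
        current_correct += served == y
        candidate_correct += shadow == y
    n = len(labels)
    current_acc = current_correct / n
    candidate_acc = candidate_correct / n
    return {
        "current_accuracy": current_acc,
        "candidate_accuracy": candidate_acc,
        "promote": candidate_acc >= current_acc - max_drop,
    }

# Hypothetical stand-ins for the deployed and the retrained model
def current_model(x):
    return 1 if x > 0.5 else 0

def candidate_model(x):
    return 1 if x > 0.7 else 0

requests = [0.1, 0.4, 0.6, 0.9, 0.55, 0.3]
labels = [0, 0, 1, 1, 1, 0]
report = shadow_compare(requests, labels, current_model, candidate_model)
```

Here the candidate would be held back because it underperforms the current model on the same live inputs, even if it had passed an offline holdout check.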

239

MCQ · medium

Refer to the exhibit. A data scientist observes the training output. Which issue is most likely?

A.Underfitting

B.Data augmentation failure

C.Overfitting

D.Model compression

Answer: C

Correct; high training accuracy with lower validation accuracy suggests overfitting.

Why this answer

The exhibit shows training loss decreasing while validation loss increases after a certain epoch, which is the classic signature of overfitting. The model is memorizing the training data rather than learning generalizable patterns, leading to poor performance on unseen data.

Exam trap

CompTIA often tests the distinction between overfitting and underfitting by showing a loss curve where training loss is low but validation loss rises, tricking candidates who focus only on the low training loss without checking validation performance.

How to eliminate wrong answers

Option A is wrong because underfitting would show both training and validation loss remaining high and not decreasing, not the divergence seen here. Option B is wrong because data augmentation failure would typically cause both losses to be high or erratic, not a clear divergence with low training loss. Option D is wrong because model compression reduces model size and may affect accuracy, but it does not produce the specific loss divergence pattern of overfitting.
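
The divergence pattern can be detected programmatically. The exhibit is not reproduced here, so the loss values below are invented to mimic the shape described above; the helpers `best_epoch` and `early_stop_epoch` and the patience of 2 are illustrative assumptions.

```python
# Hypothetical per-epoch losses: training keeps falling, validation turns upward
train_loss = [0.90, 0.60, 0.42, 0.30, 0.21, 0.15, 0.10]
val_loss = [0.95, 0.70, 0.55, 0.50, 0.53, 0.60, 0.70]

def best_epoch(val_losses):
    """Index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__)

def early_stop_epoch(val_losses, patience=2):
    """Epoch at which training would stop after `patience` epochs without improvement."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None  # validation loss never stalled long enough

checkpoint = best_epoch(val_loss)
stop_at = early_stop_epoch(val_loss)
gap_at_end = val_loss[-1] - train_loss[-1]
```

Early stopping of this kind restores the weights from the best validation epoch rather than the last one, which limits how far the model can drift into memorizing the training set.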

240

MCQ · medium

Based on the exhibit, what is the most likely issue with the model training?

A.Vanishing gradient

B.Learning rate too high

C.Underfitting

D.Overfitting

Answer: D

The diverging validation loss after initial improvement indicates the model is memorizing the training data and failing to generalize.

Why this answer

The exhibit shows training loss decreasing while validation loss increases after a certain point, which is a classic sign of overfitting. The model is memorizing the training data rather than generalizing, leading to poor performance on unseen validation data.

Exam trap

CompTIA often tests the distinction between overfitting and underfitting by showing a diverging validation loss curve, which candidates may misinterpret as a learning rate issue or vanishing gradient.

How to eliminate wrong answers

Option A is wrong because vanishing gradient typically causes training to stall early with both losses high and flat, not a diverging validation loss. Option B is wrong because a learning rate too high would cause both training and validation losses to oscillate or diverge together, not just validation loss increasing. Option C is wrong because underfitting would show both training and validation losses remaining high and plateauing, not a decreasing training loss.

241

Multi-Select · easy

Which TWO techniques are commonly used for feature scaling? (Choose two.)

Select 2 answers

A.Standardization

B.One-hot encoding

C.Min-Max scaling

D.Normalization

E.PCA

Answers: A, C

Correct: Centers features to mean 0 and standard deviation 1.

Why this answer

Options A and C are correct because standardization and Min-Max scaling are standard feature scaling methods. Options B, D, and E are incorrect: one-hot encoding is for categorical variables, PCA is dimensionality reduction, and normalization is often synonymous with scaling but here it is less specific.

242

MCQ · medium

A dataset for a binary classification problem has 95% of samples in class "0" and 5% in class "1". The data scientist trains a logistic regression model and achieves 95% accuracy. Which metric should the scientist primarily use to evaluate model performance?

A.Precision, recall, and F1-score.

B.R-squared.

C.Accuracy.

D.Mean squared error.

Answer: A

These metrics evaluate performance on the minority class, crucial for imbalanced data.

Why this answer

In a highly imbalanced dataset (95% class 0, 5% class 1), accuracy is misleading because a model can achieve 95% accuracy by simply predicting the majority class for all samples. Precision, recall, and F1-score provide a more nuanced view of performance on the minority class, which is typically the class of interest in binary classification problems. The F1-score, in particular, balances precision and recall, making it the primary metric for evaluating model effectiveness on imbalanced data.

Exam trap

CompTIA often tests the concept that accuracy is a poor metric for imbalanced datasets, trapping candidates who assume high accuracy always indicates good model performance without considering class distribution.

How to eliminate wrong answers

Option B is wrong because R-squared is a metric for regression models, measuring the proportion of variance in the dependent variable explained by the independent variables, and is not applicable to classification tasks. Option C is wrong because accuracy is not a reliable metric for imbalanced datasets; a model that always predicts the majority class can achieve high accuracy without actually learning meaningful patterns, as seen with the 95% accuracy matching the class distribution. Option D is wrong because mean squared error (MSE) is a loss function for regression problems, used to quantify the average squared difference between predicted and actual continuous values, and is not appropriate for evaluating binary classification outputs.
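
The accuracy trap is easy to reproduce with a baseline that always predicts the majority class. The 95/5 class split comes from the question; the 100-sample size and the always-zero predictor are assumptions for the illustration.

```python
# 100 samples matching the question's class balance: 5% positive, 95% negative
labels = [1] * 5 + [0] * 95
predictions = [0] * len(labels)  # baseline that always predicts the majority class

pairs = list(zip(labels, predictions))
tp = sum(1 for y, p in pairs if y == 1 and p == 1)
fp = sum(1 for y, p in pairs if y == 0 and p == 1)
fn = sum(1 for y, p in pairs if y == 1 and p == 0)

accuracy = sum(1 for y, p in pairs if y == p) / len(labels)
precision = tp / (tp + fp) if tp + fp else 0.0  # undefined here, reported as 0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

The baseline matches the 95% accuracy in the question while finding none of the minority class, which is why recall and F1 are the more informative numbers.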

243

MCQ · medium

A team is training a convolutional neural network (CNN) for medical image diagnosis. They have a limited dataset of 500 labeled images. Which strategy is most effective to improve model generalization?

A.Increasing network depth

B.Data augmentation

C.Using a larger batch size

D.Reducing the number of filters

Answer: B

Augmentation (e.g., rotation, flip) generates more training examples, improving generalization.

Why this answer

Data augmentation artificially increases the size and diversity of the training set by applying transformations, reducing overfitting.
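
A minimal sketch of the transformations mentioned above, using a nested list as a stand-in for an image. Real pipelines would use an image library; the tiny 2x2 "image" and the function names are assumptions for illustration, and whether a given flip or rotation preserves the diagnostic label depends on the imaging task.

```python
def horizontal_flip(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]

def vertical_flip(image):
    """Reverse the row order (top to bottom)."""
    return [row[:] for row in image[::-1]]

def rotate_90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]

image = [[1, 2],
         [3, 4]]

# One labeled image becomes four training samples that share its label
augmented = [image, horizontal_flip(image), vertical_flip(image), rotate_90(image)]
```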

244

MCQ · medium

An e-commerce company uses a machine learning model to recommend products to users. The model is retrained weekly and deployed to production. For the past three weeks, the model's click-through rate (CTR) has been stable except on Mondays, when it drops by 15%. Analysis reveals that the training data is extracted on Sundays and includes only weekday behavior. On Mondays, user behavior shifts due to weekend browsing patterns not captured in the training data. The team wants to maintain a weekly retraining cadence but fix the Monday performance drop. Which solution best addresses the Monday CTR drop without changing the retraining frequency?

A.Deploy a separate model specifically for Monday predictions

B.Modify the data pipeline to include the full week (including the past weekend) in each retraining

C.Serve the previous week's model on Mondays to use older but stable patterns

D.Change to daily retraining to include weekend data more promptly

Answer: B

Captures weekend behavior without altering frequency.

Why this answer

Option B is correct because it directly addresses the root cause: the training data excludes weekend behavior, causing the model to be blind to Monday patterns. By modifying the data pipeline to include the full week (including the past weekend) in each retraining, the model learns from weekend browsing patterns and can generalize to Monday user behavior without changing the weekly retraining cadence. This ensures the training distribution matches the inference distribution on Mondays, stabilizing CTR.

Exam trap

CompTIA often tests the misconception that changing retraining frequency (Option D) is the only way to incorporate new data, when in fact adjusting the data window within the existing cadence (Option B) is a more efficient and correct solution.

How to eliminate wrong answers

Option A is wrong because deploying a separate model for Monday predictions introduces operational complexity and does not fix the data gap; it merely treats the symptom by creating a specialized model that still lacks weekend data unless separately trained. Option C is wrong because serving the previous week's model on Mondays would use older patterns that also exclude the most recent weekend behavior, and the model would be even more stale, likely worsening the drop. Option D is wrong because changing to daily retraining alters the retraining frequency, which the team explicitly wants to maintain; it also adds unnecessary overhead and does not address the fact that the training data extraction point (Sundays) is the core issue.

245

MCQ · easy

A team is building a regression model to predict house prices. Which data transformation is most appropriate if the target variable exhibits right skewness?

A.Principal component analysis (PCA)

B.Standardization (Z-score)

C.One-hot encoding

D.Log transformation

Answer: D

Log transformation reduces right skewness by compressing large values.

Why this answer

Log transformation is the most appropriate technique for right-skewed target variables because it compresses the long tail, making the distribution more symmetric and closer to Gaussian. This stabilizes variance and often improves the performance of regression models that assume normally distributed errors, such as linear regression.

Exam trap

CompTIA often tests the misconception that standardization can fix skewness, but candidates must remember that standardization only rescales the data and does not reshape its distribution.

How to eliminate wrong answers

Option A is wrong because Principal Component Analysis (PCA) is a dimensionality reduction technique for features, not a transformation applied to the target variable; it does not address skewness in the target. Option B is wrong because Standardization (Z-score) centers and scales the data but does not change the shape of the distribution, so it cannot correct right skewness. Option C is wrong because One-hot encoding is used to convert categorical variables into numerical format, not to transform a continuous target variable.
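
The effect on skewness can be checked numerically. The house prices below are invented to have a long right tail, and `skewness` is a small helper written for this sketch (the population form of the third standardized moment).

```python
import math
import statistics

def skewness(values):
    """Third standardized moment; positive values indicate a right tail."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)

# Hypothetical prices with a few very expensive houses
prices = [120000, 150000, 160000, 180000, 200000,
          220000, 250000, 400000, 900000, 2500000]
log_prices = [math.log(p) for p in prices]

skew_raw = skewness(prices)
skew_log = skewness(log_prices)

# Standardizing leaves the shape, and therefore the skewness, unchanged
mean, std = statistics.mean(prices), statistics.pstdev(prices)
skew_standardized = skewness([(p - mean) / std for p in prices])

# Predictions made on the log scale are mapped back with the exponential
recovered = math.exp(log_prices[0])
```

On this sample the log transform lowers the skewness while standardization leaves it as it was, which matches the reasoning for eliminating option B.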

246

MCQ · hard

A multinational corporation deploys an AI recruitment tool that must comply with GDPR's right to explanation. Which practice best ensures the tool meets this requirement?

A.Obtain explicit consent from candidates

B.Implement a system to output the key factors influencing each decision

C.Anonymize all candidate data before processing

D.Always have a human review the AI's recommendations

Answer: B

GDPR requires automated decisions to be explainable.

Why this answer

Option B is correct because providing meaningful information about the decision logic aligns with GDPR Article 22. Option C is wrong because anonymization is for privacy, not explanation. Option A is wrong because consent is for data processing, not explanation.

Option D is wrong because a human-in-the-loop is a safeguard but not a direct explanation.

247

Multi-Select · easy

Which TWO of the following are common techniques to improve the transparency and interpretability of an AI model?

Select 2 answers

A.Generate SHAP (SHapley Additive exPlanations) values

B.Use differential privacy to add noise to training data

C.Implement a random forest algorithm

D.Use deep neural networks to increase model complexity

E.Apply LIME (Local Interpretable Model-agnostic Explanations)

Answers: A, E

SHAP values explain the contribution of each feature to predictions.

Why this answer

Options A and E are correct. SHAP values and LIME are model-agnostic interpretability methods. Option D is wrong because deep learning models are often black boxes.

Option C is wrong as random forest is a specific model, not a technique. Option B is wrong because differential privacy is for privacy, not interpretability.

248

MCQ · medium

A healthcare AI system that diagnoses medical images must provide explanations for its predictions to comply with regulatory requirements. Which technique should the team implement?

A.Reduce the model's accuracy to make it simpler.

B.Only deploy rule-based systems.

C.Apply model interpretability methods such as SHAP or LIME.

D.Use a more complex deep learning model.

AnswerC

These methods provide explanations for individual predictions without sacrificing model accuracy.

Why this answer

Option C is correct because SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are established model interpretability techniques that provide per-prediction explanations, which are essential for regulatory compliance in healthcare AI. These methods generate feature attribution scores or local surrogate models to explain why a specific diagnosis was made, meeting transparency requirements without sacrificing model performance.
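
To make "feature attribution scores" concrete, here is a minimal sketch of the Shapley idea that SHAP builds on. It is not the `shap` library's API: it enumerates every feature ordering exactly, which is only feasible for a handful of features, and `toy_model`, `instance`, and `baseline` are hypothetical stand-ins (a tabular score, not an image model).

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions: average each feature's marginal
    contribution over every possible feature ordering."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)        # start from the reference input
        previous = predict(current)
        for i in order:
            current[i] = instance[i]    # reveal feature i
            updated = predict(current)
            totals[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in totals]

def toy_model(x):
    """Hypothetical score with an interaction between features 0 and 1."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[1] - 3.0 * x[2]

instance = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is what makes them usable as a per-prediction explanation.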

Exam trap

The trap here is that candidates often assume complex models are inherently better for compliance, but CompTIA tests the understanding that interpretability techniques are required to bridge the gap between high-performance black-box models and regulatory transparency.

How to eliminate wrong answers

Option A is wrong because reducing model accuracy to make it simpler would degrade diagnostic performance and still not guarantee interpretability; a simpler model is not inherently explainable in a regulatory sense. Option B is wrong because only deploying rule-based systems is overly restrictive and impractical for complex medical image analysis, where deep learning models often achieve superior accuracy; rule-based systems may also lack the flexibility to handle edge cases. Option D is wrong because using a more complex deep learning model typically reduces interpretability, making it harder to provide the required explanations, and does not address regulatory compliance.

249

Multi-Selecthard

Which three techniques are commonly used to mitigate overfitting in neural networks? (Choose three.)

Select 3 answers

A.Adding L2 regularization

B.Increasing training data

C.Dropout

D.Reducing number of layers

E.Early stopping

AnswersA, C, E

L2 regularization adds a penalty on large weights, discouraging overfitting by constraining the model complexity.

Why this answer

Adding L2 regularization (also known as weight decay) penalizes large weights by adding a term proportional to the squared magnitude of the weights to the loss function. This forces the network to keep weights small, reducing the model's sensitivity to noise in the training data and preventing it from fitting spurious patterns, which is a direct and effective method to combat overfitting.
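
The effect of the penalty term can be sketched with a one-weight linear model. This is an illustrative example, assuming plain gradient descent on mean squared error plus `lam * w**2`; the data, `lam`, and learning rate are made-up values.

```python
def fit(xs, ys, lam, lr=0.05, steps=500):
    """Gradient descent on MSE + lam * w**2 for the model y = w * x."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_data = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        grad_penalty = 2 * lam * w      # derivative of lam * w**2
        w -= lr * (grad_data + grad_penalty)
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit(xs, ys, lam=0.0)   # no regularization
w_l2 = fit(xs, ys, lam=1.0)      # L2 penalty shrinks the weight toward zero

print(f"unregularized w={w_plain:.4f}, L2-regularized w={w_l2:.4f}")
```

The regularized weight is smaller in magnitude, which is the "keep weights small" behaviour described above.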

Exam trap

CompTIA often tests the distinction between data-level strategies (like increasing training data) and algorithmic regularization techniques (like L2, dropout, early stopping), leading candidates to mistakenly select 'increasing training data' as a technique when the question specifically asks for techniques commonly used within the neural network training process.

250

MCQmedium

A security team discovers that an AI-based anomaly detection system frequently misclassifies benign network traffic as malicious when the source IP is from a specific geographic region. Which type of AI vulnerability is most likely being exploited?

A.Data poisoning

B.Model inversion

C.Adversarial evasion

D.Membership inference

AnswerC

Adversarial evasion manipulates input features to cause misclassification. The regional bias suggests crafted inputs bypassing detection.

Why this answer

The system's biased behavior due to geographic region indicates an adversarial attack that exploits the model's sensitivity to certain features. Data poisoning would require alteration of training data, model inversion extracts training data, and membership inference determines if a record was in training set. The scenario describes an evasion attack using adversarial examples to cause misclassification.

251

MCQeasy

A startup has developed a natural language processing model for sentiment analysis. Their CI/CD pipeline includes a step that runs unit tests on the model's output format and a validation step that checks accuracy on a static test dataset. Recently, the pipeline often fails during the validation step, but the failures are inconsistent—sometimes the same model version passes, sometimes fails. The team suspects the test dataset is small and randomly sampled. They need a reliable validation process to deploy models with confidence. Which approach should the team implement?

A.Replace the static test set with k-fold cross-validation in each pipeline run

B.Increase the accuracy threshold to 95% so only very good models pass

C.Remove the validation step and rely on unit tests only

D.Fix the test dataset to be larger and more representative, and use a statistical test to compare against baseline

AnswerD

A fixed dataset and statistical test provide consistent and objective validation.

Why this answer

Option D is correct because the core issue is that the static test dataset is too small and randomly sampled, leading to inconsistent validation results. By fixing the dataset to be larger and more representative, and using a statistical test (e.g., a paired t-test or McNemar's test) to compare the model's accuracy against a baseline, the team can reliably determine if performance changes are statistically significant, eliminating the randomness that causes pipeline failures to be inconsistent.
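
As one possible shape for the statistical check, the sketch below implements an exact two-sided McNemar test using only the standard library. It assumes both models are scored on the same fixed test set, and the per-example correctness flags are hypothetical.

```python
from math import comb

def mcnemar_exact(baseline_correct, candidate_correct):
    """Two-sided exact McNemar test on paired per-example correctness flags."""
    pairs = list(zip(baseline_correct, candidate_correct))
    b = sum(1 for base, cand in pairs if base and not cand)   # baseline-only wins
    c = sum(1 for base, cand in pairs if cand and not base)   # candidate-only wins
    n = b + c
    if n == 0:
        return b, c, 1.0
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split 50/50: Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)

# Hypothetical 200-example test set:
# 150 both right, 30 both wrong, 5 baseline-only right, 15 candidate-only right.
baseline_flags = [True] * 150 + [False] * 30 + [True] * 5 + [False] * 15
candidate_flags = [True] * 150 + [False] * 30 + [False] * 5 + [True] * 15

b, c, p_value = mcnemar_exact(baseline_flags, candidate_flags)
print(f"b={b}, c={c}, p={p_value:.4f}")
```

A pipeline gate could then compare `p_value` against a chosen significance level instead of relying on a raw accuracy threshold.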

Exam trap

CompTIA often tests the misconception that increasing the accuracy threshold or using cross-validation alone can fix validation instability, when the real solution is to address the root cause of small, non-representative test data with statistical rigor.

How to eliminate wrong answers

Option A is wrong because k-fold cross-validation is computationally expensive and time-consuming for a CI/CD pipeline, and it does not directly address the root cause of a small, randomly sampled test set; it would still suffer from variance if the dataset is small. Option B is wrong because simply raising the accuracy threshold to 95% does not fix the underlying inconsistency from a small test set; it may cause even more frequent failures due to random sampling noise, and it does not provide a statistical basis for decision-making. Option C is wrong because removing the validation step entirely would allow models with poor accuracy to be deployed, undermining the goal of deploying with confidence; unit tests alone cannot assess model performance.

252

Multi-Selectmedium

A team is developing a natural language processing model to classify customer feedback. The dataset contains text in multiple languages. Which THREE preprocessing steps are essential to ensure the model performs well across all languages?

Select 3 answers

A.One-hot encoding

B.Lowercasing

C.Tokenization

D.Stemming

E.Removing stop words

AnswersB, C, E

Lowercasing reduces vocabulary size and helps generalize across different cases.

Why this answer

Lowercasing is essential because it normalizes text across languages by converting all characters to the same case, reducing vocabulary size and ensuring that words like 'Good' and 'good' are treated identically. This prevents the model from learning separate representations for case variations, which is critical for multilingual datasets where case usage may differ (e.g., German capitalizes nouns). Without lowercasing, the model's performance degrades due to sparsity and increased feature space.

Exam trap

CompTIA often tests the distinction between preprocessing steps (like lowercasing, tokenization, stop word removal) and feature engineering techniques (like one-hot encoding), leading candidates to mistakenly include one-hot encoding as a preprocessing step when it is actually a vectorization method applied after preprocessing.

253

MCQeasy

A data scientist trains a regression model and notices the training loss is low but validation loss is high. Which technique should be applied FIRST to address this issue?

A.Increase the learning rate.

B.Add more layers to the neural network.

C.Increase the size of the training dataset.

D.Apply L1 or L2 regularization to the model.

AnswerD

Regularization penalizes large weights, reducing overfitting.

Why this answer

The scenario describes overfitting, where the model memorizes the training data but fails to generalize to unseen data. Applying L1 or L2 regularization (Option D) is the correct first step because it adds a penalty to the loss function for large weights, discouraging complexity and reducing overfitting without requiring additional data or architectural changes.

Exam trap

CompTIA often tests the distinction between overfitting and underfitting, and the trap here is that candidates may incorrectly choose to increase dataset size (Option C) as the first action, when regularization is the more immediate and practical first step to address overfitting without requiring new data collection.

How to eliminate wrong answers

Option A is wrong because increasing the learning rate would make training more unstable and could cause the loss to diverge, worsening both training and validation performance. Option B is wrong because adding more layers increases model capacity, which exacerbates overfitting when the training loss is already low and validation loss is high. Option C is wrong because increasing the size of the training dataset can help reduce overfitting, but it is not the first technique to apply; regularization is a simpler, more immediate fix that does not require collecting new data.

254

MCQmedium

A healthcare startup deploys an AI model to predict patient readmission rates. An internal audit reveals that the model consistently underestimates readmission risk for non-native English speakers. According to AI ethics principles, what is the most appropriate course of action?

A.Add a confidence score disclaimer to model outputs

B.Reduce the sample size of non-native English speakers to balance the dataset

C.Continue using the model as is, since overall accuracy is acceptable

D.Retrain the model with a more representative dataset that includes diverse language backgrounds

AnswerD

Retraining with balanced data addresses the root cause of bias.

Why this answer

Option D is correct because the issue stems from biased training data; retraining on a more representative dataset that includes diverse language backgrounds can reduce the bias. Option C is wrong because ignoring the issue is unethical and may violate regulations. Option A is wrong because a confidence-score disclaimer does not fix the underlying bias.

Option B is wrong because reducing the sample size of the affected group may worsen the bias.

255

MCQmedium

An AI operations team notices that the accuracy of a deployed fraud detection model has been declining over the past month. Which action should the team take to address this issue proactively?

A.Retrain the model with the most recent data immediately.

B.Manually update the model weights weekly.

C.Replace the model with a rule-based system.

D.Set up automated retraining pipeline triggered by performance degradation thresholds.

AnswerD

This allows continuous monitoring and automated response to drift, keeping the model accurate.

Why this answer

Option D is correct because it establishes an automated retraining pipeline triggered by performance degradation thresholds, which aligns with MLOps best practices for maintaining model accuracy in production. This proactive approach ensures the model is retrained when its performance drops below a predefined metric (e.g., AUC or F1 score), without requiring manual intervention. It addresses concept drift, which is a common cause of declining accuracy in deployed fraud detection models.
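
A threshold-based trigger might look like the sketch below. The `should_retrain` helper, the `patience` window, and the F1 values are assumptions made for illustration; a real pipeline would take its scores from a monitoring system.

```python
def should_retrain(recent_scores, threshold, patience=3):
    """Return True when the last `patience` scores are all below `threshold`.

    Requiring several consecutive low scores avoids retraining on one noisy day.
    """
    if len(recent_scores) < patience:
        return False
    return all(score < threshold for score in recent_scores[-patience:])

# Hypothetical daily F1 scores for the deployed fraud model.
daily_f1 = [0.91, 0.90, 0.88, 0.84, 0.83, 0.82]

trigger_early = should_retrain(daily_f1[:4], threshold=0.85)   # only one low day so far
trigger_now = should_retrain(daily_f1, threshold=0.85)         # three low days in a row

print(trigger_early, trigger_now)
```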

Exam trap

CompTIA often tests the misconception that retraining with the most recent data immediately is the best proactive action, when in fact automated threshold-based retraining is the correct MLOps practice to avoid overfitting and ensure controlled updates.

How to eliminate wrong answers

Option A is wrong because retraining with the most recent data immediately may introduce data leakage or overfit to recent noise, and it does not address the root cause of performance degradation (e.g., concept drift) in a controlled manner. Option B is wrong because manually updating model weights weekly is not a scalable or reliable practice; it introduces human error and does not leverage automated monitoring or drift detection. Option C is wrong because replacing a machine learning model with a rule-based system would likely reduce the model's ability to detect complex fraud patterns, and it ignores the potential to retrain or update the existing model.

256

MCQmedium

A machine learning engineer is building a spam filter. The dataset contains 10,000 emails, of which 1,000 are spam. The engineer decides to use a Random Forest classifier. Which preprocessing step is most critical to ensure the model generalizes well to new, unseen emails?

A.Apply Principal Component Analysis (PCA) to reduce dimensionality

B.Normalize the numerical features to have zero mean and unit variance

C.Split the data into training and testing sets before any other preprocessing

D.Encode all features using one-hot encoding

AnswerC

Splitting first prevents data leakage and ensures realistic evaluation.

Why this answer

Option C is correct because splitting the data into training and testing sets before any other preprocessing prevents data leakage. If preprocessing like normalization or PCA is applied to the entire dataset first, the test set information influences the training process, leading to overly optimistic performance estimates and poor generalization to new, unseen emails.
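
The ordering can be sketched as follows: split first, learn the scaling statistics from the training rows only, then apply them to both sets. The helper names and the single `word_counts` feature are hypothetical.

```python
import random
import statistics

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into train and test parts."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = len(shuffled) - int(len(shuffled) * test_fraction)
    return shuffled[:cut], shuffled[cut:]

def fit_scaler(values):
    """Learn scaling statistics from training values only."""
    return statistics.fmean(values), statistics.pstdev(values)

def transform(values, mean, std):
    return [(v - mean) / std for v in values]

# Hypothetical single feature: word count per email.
word_counts = [float(n) for n in range(10, 110)]

train, test = train_test_split(word_counts)
mean, std = fit_scaler(train)                 # test rows never influence these
train_scaled = transform(train, mean, std)
test_scaled = transform(test, mean, std)      # reuse the training statistics
```

Fitting the scaler on the full dataset before splitting would let test-set values shape `mean` and `std`, which is the leakage described above.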

Exam trap

CompTIA often tests the concept of data leakage by presenting preprocessing steps that seem harmless but actually incorporate test set information, tricking candidates into thinking scaling or dimensionality reduction is always necessary for tree-based models.

How to eliminate wrong answers

Option A is wrong because PCA is an unsupervised dimensionality reduction technique that, if applied before splitting, would leak information from the test set into the training set, and Random Forest is robust to high-dimensional sparse data, making PCA unnecessary for generalization. Option B is wrong because Random Forest is a tree-based ensemble method that is invariant to monotonic transformations and does not require feature scaling; normalizing before splitting would also risk data leakage if done on the full dataset. Option D is wrong because one-hot encoding is only relevant for categorical features, and applying it before splitting could introduce data leakage if the encoding uses levels present only in the test set; moreover, not all features in an email dataset are categorical, and Random Forest can handle label encoding without one-hot encoding.

257

MCQeasy

A model serving endpoint is tested using curl commands. Based on the exhibit, what is the most likely issue?

A.The server is returning HTTP 500 errors

B.The input features are malformed

C.The model is experiencing intermittent high latency leading to timeouts

D.The model is not deployed on the server

AnswerC

The third request timed out, suggesting occasional performance degradation.

Why this answer

The exhibit shows that the first curl request succeeds (HTTP 200), but subsequent requests fail with 'curl: (28) Operation timed out' after a 30-second timeout. This pattern of intermittent success followed by timeouts is characteristic of a model experiencing high latency spikes, not a persistent server error or configuration issue. The server is reachable and the model responds correctly some of the time, ruling out deployment or malformed input issues.

Exam trap

CompTIA often tests the distinction between persistent errors (like 500 or 404) and intermittent timeout failures, where candidates mistakenly attribute timeouts to server errors or input issues rather than recognizing the pattern of variable latency.

How to eliminate wrong answers

Option A is wrong because the exhibit shows HTTP 200 responses for successful requests, not HTTP 500 errors; a server returning 500 errors would consistently fail with a 5xx status code, not timeouts. Option B is wrong because the first request succeeds, proving the input features are correctly formatted and accepted by the model; malformed features would cause persistent failures across all requests. Option D is wrong because the successful first request confirms the model is deployed and serving predictions; an undeployed model would return a 404 or 503 error, not a timeout after a successful response.

258

MCQhard

An ML team monitors a production model using a dashboard that shows daily performance metrics. Over the past month, the model's accuracy has dropped from 92% to 87%, while the data distribution of input features has remained stable according to statistical tests. Which type of model drift is most likely occurring?

A.Data drift (covariate shift)

B.Model decay

C.Overfitting

D.Concept drift

AnswerD

Concept drift changes the mapping from inputs to outputs, reducing accuracy.

Why this answer

Concept drift occurs when the relationship between input features and the target variable changes, even if the input data distribution remains stable. In this scenario, the model's accuracy declines from 92% to 87% while input feature distributions are unchanged, indicating that the underlying mapping from features to labels has shifted—a classic sign of concept drift.
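
A small simulation can illustrate the distinction. In this sketch the inputs are drawn from the same uniform distribution before and after, and only the labeling rule changes; the frozen `model`, the thresholds, and the sample sizes are all invented for illustration.

```python
import random

def model(x):
    """Frozen production model: it learned the old rule 'positive if x > 0.5'."""
    return x > 0.5

def accuracy(xs, label_rule):
    """Fraction of inputs where the frozen model agrees with the true label."""
    return sum(model(x) == label_rule(x) for x in xs) / len(xs)

def old_rule(x):
    return x > 0.5      # relationship between input and label at training time

def new_rule(x):
    return x > 0.6      # hypothetical shifted relationship P(Y|X)

rng = random.Random(42)
# Same input distribution in both periods, so there is no covariate shift.
xs_before = [rng.random() for _ in range(10000)]
xs_after = [rng.random() for _ in range(10000)]

acc_before = accuracy(xs_before, old_rule)
acc_after = accuracy(xs_after, new_rule)

print(f"before={acc_before:.3f} after={acc_after:.3f}")
```

Accuracy falls even though a statistical test on the inputs alone would report no change.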

Exam trap

CompTIA often tests the distinction between data drift and concept drift by presenting a scenario where input distributions are stable but model performance degrades, leading candidates to mistakenly choose data drift (covariate shift) because they focus on the input features rather than the label relationship.

How to eliminate wrong answers

Option A is wrong because data drift (covariate shift) refers to changes in the distribution of input features, which the question explicitly states has remained stable according to statistical tests. Option B is wrong because model decay is a general term for performance degradation over time, but it is not a specific type of drift; the question asks for the type of drift, and concept drift is the precise classification. Option C is wrong because overfitting is a training-time issue where a model fits noise in the training data, leading to poor generalization on new data; it does not explain a gradual performance drop in production while input distributions remain stable.

259

MCQmedium

A machine learning team notices that their model's performance degrades when deployed to a new geographic region. The data distribution in the new region differs from the training data. Which concept best describes this issue?

A.Covariate shift

B.Data leakage

C.Underfitting

D.Overfitting

AnswerA

Covariate shift happens when the distribution of input features changes between training and deployment.

Why this answer

Covariate shift occurs when the distribution of the input features (covariates) changes between training and deployment, while the conditional relationship P(Y|X) remains the same. In this scenario, the model's performance degrades because the new geographic region has a different data distribution than the training data, which is the classic definition of covariate shift. This is a common issue in machine learning when models are deployed in environments not represented in the training set.
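
One way to detect such a shift is to compare a feature's distribution in training data against the new region. The sketch below computes the two-sample Kolmogorov-Smirnov statistic by hand; the Gaussian samples and the one-unit shift are hypothetical, and turning the statistic into a p-value is left out.

```python
import random

def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    largest_gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        largest_gap = max(largest_gap, abs(i / len(a) - j / len(b)))
    return largest_gap

rng = random.Random(7)
training = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same_region = [rng.gauss(0.0, 1.0) for _ in range(2000)]
new_region = [rng.gauss(1.0, 1.0) for _ in range(2000)]   # hypothetical shifted inputs

ks_same = ks_statistic(training, same_region)
ks_shifted = ks_statistic(training, new_region)

print(f"same region={ks_same:.3f} new region={ks_shifted:.3f}")
```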

Exam trap

CompTIA often tests the distinction between covariate shift and overfitting, where candidates mistakenly think performance degradation on new data is always due to overfitting, but the key is that overfitting implies poor performance on the same distribution, not a different one.

How to eliminate wrong answers

Option B is wrong because data leakage refers to information from outside the training set (e.g., future data or target information) being used to train the model, which artificially inflates performance, not a distribution shift between training and deployment. Option C is wrong because underfitting occurs when a model is too simple to capture patterns in the training data, resulting in poor performance on both training and test sets, not specifically a degradation due to a change in data distribution. Option D is wrong because overfitting happens when a model learns noise or specific patterns in the training data too well, leading to poor generalization on unseen data from the same distribution, not a shift to a different distribution.

260

Multi-Selecthard

Which TWO are valid techniques to reduce overfitting in a deep neural network? (Choose TWO.)

Select 2 answers

A.Increase batch size

B.Increase learning rate

C.L2 regularization

D.Gradient clipping

E.Dropout

AnswersC, E

L2 regularization adds a penalty for large weights, discouraging complex models.

Why this answer

L2 regularization (option C) is a valid technique to reduce overfitting by adding a penalty term proportional to the square of the weight magnitudes to the loss function. This discourages the network from learning overly complex patterns, effectively shrinking weights and improving generalization. Dropout (option E) randomly drops a fraction of neurons during training, which prevents co-adaptation of features and forces the network to learn more robust representations, also reducing overfitting.
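
The dropout mechanism can be sketched in a few lines. This example uses the "inverted dropout" variant, which is an assumption about the implementation detail: surviving activations are scaled up during training so that nothing needs to change at inference time.

```python
import random

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each unit with probability `rate` during training
    and scale survivors by 1 / (1 - rate) so the expected value is unchanged."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(0)
hidden = [1.0] * 10000                      # hypothetical layer activations

dropped = dropout(hidden, rate=0.5, rng=rng)
at_inference = dropout(hidden, rate=0.5, rng=rng, training=False)

print(sum(dropped) / len(dropped))          # close to 1.0 on average
```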

Exam trap

CompTIA often tests the distinction between techniques that improve training stability (like gradient clipping or adjusting batch size/learning rate) versus those that directly regularize the model to reduce overfitting (like L2 regularization and dropout), leading candidates to confuse optimization tricks with regularization methods.

261

Multi-Selecthard

A team is using k-fold cross-validation to evaluate a model. They observe high variance in performance scores across folds. Which TWO actions are most likely to reduce this variance? (Choose TWO.)

Select 2 answers

A.Increase the number of folds

B.Use stratified cross-validation

C.Decrease the number of folds

D.Shuffle data before splitting

E.Use a more complex model

AnswersA, B

More folds mean each training set is larger and more similar to the full dataset, reducing variance.

Why this answer

Increasing the number of folds (Option A, e.g., from 5 to 10) uses more training data per fold, reducing variance. Using stratified cross-validation (Option B) ensures that each fold has a representative class distribution, which stabilizes scores. Option C is wrong because decreasing the number of folds increases variance.

Option D is wrong because shuffling is already a common practice. Option E is wrong because a more complex model typically increases variance.
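
Stratification can be sketched as a round-robin assignment within each class. The `stratified_folds` helper and the 10%-positive label list are hypothetical, and shuffling within each class is omitted to keep the example deterministic.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign each index to one of k folds, round-robin within each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds

# Hypothetical imbalanced dataset: 10 positives, 90 negatives.
labels = [1] * 10 + [0] * 90
folds = stratified_folds(labels, k=5)
positives_per_fold = [sum(labels[i] for i in fold) for fold in folds]

print(positives_per_fold)
```

Every fold ends up with the same share of positives, so no fold is scored on an unrepresentative class mix.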

262

MCQeasy

A team is building a recommendation system using collaborative filtering. They have a sparse user-item matrix. Which technique should they use to handle the sparsity and improve recommendations?

A.Association rule mining

B.Matrix factorization

C.k-nearest neighbors

D.Content-based filtering

AnswerB

Matrix factorization reduces dimensionality and captures latent features, effectively handling sparsity.

Why this answer

Matrix factorization (B) is the correct technique because it decomposes the sparse user-item matrix into lower-dimensional latent factor matrices, effectively capturing underlying patterns and filling in missing entries. This directly addresses sparsity by learning dense representations that generalize beyond observed interactions, which is a core strength in collaborative filtering for recommendation systems.
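
The decomposition can be sketched with stochastic gradient descent over the observed entries only. This is an illustrative toy, not a production recommender: the ratings, the number of latent factors, and the hyperparameters are all made up.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user and item latent vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    return (sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5

def factorize(ratings, n_users, n_items, n_factors=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Learn user factors P and item factors Q from observed (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)   # regularized SGD step
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Hypothetical sparse matrix: 4 users x 4 items, only 8 of 16 ratings observed.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 5), (2, 3, 2), (3, 2, 2), (3, 3, 1)]

P0, Q0 = factorize(ratings, 4, 4, epochs=0)   # untrained starting point
P, Q = factorize(ratings, 4, 4)

rmse_before = rmse(P0, Q0, ratings)
rmse_after = rmse(P, Q, ratings)
print(f"RMSE before={rmse_before:.3f} after={rmse_after:.3f}")
print(f"prediction for unobserved (user 0, item 2): {predict(P, Q, 0, 2):.2f}")
```

Because every user and item gets a dense latent vector, the model can score pairs that never appeared in the data.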

Exam trap

CompTIA often tests the misconception that k-nearest neighbors (k-NN) is the go-to for collaborative filtering, but candidates fail to recognize that k-NN's performance collapses under high sparsity, whereas matrix factorization explicitly models latent factors to overcome this.

How to eliminate wrong answers

Option A is wrong because association rule mining (e.g., Apriori algorithm) is designed for market basket analysis to find frequent itemsets and rules, not for handling sparse user-item matrices in collaborative filtering; it fails to generalize from sparse data and does not model latent factors. Option C is wrong because k-nearest neighbors (k-NN) is a memory-based collaborative filtering method that relies on direct similarity computations between users or items, which degrades severely with high sparsity due to lack of overlapping ratings, leading to poor recommendations. Option D is wrong because content-based filtering uses item features (e.g., genre, keywords) to recommend similar items, not the user-item interaction matrix; it does not address sparsity in collaborative filtering and ignores collaborative signals from other users.

263

Multi-Selectmedium

Which TWO are best practices for deploying AI models in a containerized production environment? (Select TWO.)

Select 2 answers

A.Always pull the latest image tag for automatic updates

B.Store model artifacts inside the container image for portability

C.Use an orchestration platform like Kubernetes for scaling and health management

D.Package the model and its dependencies into a single container image

E.Configure JVM heap arguments inside the container if using Java

AnswersC, D

Kubernetes provides automated scaling and self-healing.

Why this answer

Option C is correct because orchestration platforms like Kubernetes provide automated scaling, self-healing, and rolling updates for containerized AI models. Kubernetes uses liveness and readiness probes to monitor model health and restart failed containers, ensuring high availability in production.

Exam trap

CompTIA often tests the distinction between containerization best practices (e.g., immutable images, external model storage) and generic software deployment habits (e.g., using latest tags, embedding data), so candidates mistakenly select options that seem convenient but violate production reliability principles.

264

Multi-Selecteasy

Which TWO of the following are common activation functions used in deep neural networks?

Select 2 answers

A.Linear Regression

B.Support Vector Machine

C.K-means

D.ReLU

E.Sigmoid

AnswersD, E

ReLU is the most common activation for hidden layers.

Why this answer

Sigmoid and ReLU are widely used activation functions. Support Vector Machine is a classifier, not an activation. K-means is a clustering algorithm.

Linear regression is a model, not an activation function.
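
For reference, both activation functions are one-liners. The sketch below uses scalar inputs for clarity; frameworks apply the same functions element-wise to tensors.

```python
import math

def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)

def sigmoid(x):
    """Logistic sigmoid: squashes any real input into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), round(sigmoid(value), 4))
```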

265

MCQeasy

A machine learning engineer has a dataset of 100,000 records. She splits it into 70% training, 15% validation, and 15% test sets. After training, the model achieves 95% accuracy on training and 85% on validation. What does the accuracy difference most likely indicate?

A.The validation set is too small

B.The model generalizes well

C.The model is overfitting

D.The test set should be larger

AnswerC

Overfitting explains high training accuracy and lower validation accuracy.

Why this answer

A wide gap between training and validation accuracy is a classic sign of overfitting, where the model memorizes training data but fails to generalize.

266

MCQeasy

A startup is building a chatbot for customer service. They have 500 recorded conversations and want to use a pre-trained language model to generate responses. However, they have limited computational resources and need the chatbot to respond in real-time. They are considering fine-tuning a large model like GPT-3 or using a smaller model like DistilBERT. The conversation data contains industry-specific jargon. Which approach should they take?

A.Use GPT-3 via API without fine-tuning

B.Fine-tune DistilBERT on the conversation data

C.Train a custom RNN from scratch on the conversations

D.Implement a rule-based system with keywords

AnswerB

DistilBERT is smaller, faster, and fine-tuning on domain-specific data will adapt it to jargon while meeting real-time requirements.

Why this answer

Option B is correct because fine-tuning DistilBERT on the 500 recorded conversations allows the model to adapt to industry-specific jargon while maintaining real-time responsiveness due to its smaller size. DistilBERT is a distilled version of BERT that retains 97% of BERT’s language understanding with 40% fewer parameters, making it suitable for limited computational resources. Fine-tuning on domain-specific data is essential here, as pre-trained models like GPT-3 lack exposure to the startup’s specialized terminology, and using a smaller model ensures low-latency inference for real-time chatbot responses.

Exam trap

CompTIA often tests the misconception that larger pre-trained models like GPT-3 are always superior for domain adaptation, ignoring the critical trade-offs of computational cost, latency, and the need for fine-tuning on small, specialized datasets.

How to eliminate wrong answers

Option A is wrong because using GPT-3 via API without fine-tuning would not adapt to the industry-specific jargon in the 500 conversations, leading to generic or incorrect responses, and the API call latency and cost are unsuitable for real-time constraints with limited resources. Option C is wrong because training a custom RNN from scratch on only 500 conversations is insufficient for learning complex language patterns, resulting in poor generalization and high risk of overfitting, while also requiring significant computational resources for training. Option D is wrong because a rule-based system with keywords cannot handle the variability and nuance of natural language in customer service conversations, especially with industry-specific jargon, and would fail to generate coherent, context-aware responses beyond predefined patterns.

267

MCQeasy

A security analyst is reviewing logs from an AI-powered recommendation system and notices an unusually high number of requests for products from a specific vendor. The analyst suspects data poisoning. Which mitigation strategy should be implemented first?

A.Encrypt all training data at rest

B.Deploy an anomaly detection system on model outputs

C.Retrain the model with a smaller, curated dataset

D.Implement input validation and sanitization for training data

AnswerD

Input validation prevents poisoned data from entering the training pipeline.

Why this answer

Option D is correct because input validation and sanitization help prevent malicious data from entering the training pipeline. Option A is wrong because encryption does not detect or prevent data poisoning. Option B is wrong because anomaly detection on model outputs is reactive, not proactive.

Option C is wrong because, while retraining is part of the solution, the immediate step is to validate the input.

268

MCQeasy

A data analyst wants to predict housing prices based on square footage, number of bedrooms, and location. Which machine learning approach is most suitable?

A.K-means clustering

B.Decision tree regression

C.Association rule mining

D.Linear regression

AnswerD

Linear regression models the linear relationship between input features and a continuous output.

Why this answer

Linear regression is a simple and interpretable model for predicting a continuous target variable like housing price.

269

Multi-Selecteasy

Which TWO are common types of adversarial attacks on AI models?

Select 2 answers

A.Hyperparameter tuning

B.Transfer learning

C.Evasion attack

D.Backdoor attack

E.Data poisoning

AnswersC, E

Evasion attacks craft input perturbations to cause misclassification at test time.

Why this answer

Evasion attacks and data poisoning are well-known adversarial attack vectors.

270

MCQ · easy

Which metric is most appropriate for evaluating a binary classification model where the positive class is rare and false positives are costly?

A.Accuracy

B.F1-score

C.Precision

D.Recall

Answer: C

Correct; precision measures how many predicted positives are actually positive, reducing false positives.

Why this answer

Precision is the most appropriate metric when the positive class is rare and false positives are costly because it measures the proportion of true positive predictions among all positive predictions. In this scenario, minimizing false positives is critical, and precision directly penalizes them by requiring high confidence before labeling an instance as positive. This aligns with the business need to avoid costly false alarms, such as in fraud detection or medical diagnosis for rare diseases.
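
To make the contrast with accuracy concrete, here is a minimal sketch that computes both metrics from confusion-matrix counts. The counts are hypothetical, chosen to represent 1,000 cases of which 10 are truly positive.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted positives that are actually positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


# Hypothetical counts: 10 true positives exist; the model flags 20 cases.
tp, fp, tn, fn = 8, 12, 978, 2

model_accuracy = accuracy(tp, fp, tn, fn)   # dominated by the many true negatives
model_precision = precision(tp, fp)         # exposes the 12 false alarms
```

With these counts accuracy looks excellent while precision shows that most flagged cases are false alarms, which is the costly failure the question describes.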

Exam trap

CompTIA often tests the misconception that accuracy is always the best metric, but the trap here is that candidates overlook how class imbalance and asymmetric costs make precision or recall more relevant, and they fail to distinguish between F1-score and precision when the cost of false positives is explicitly stated.

How to eliminate wrong answers

Option A is wrong because accuracy is misleading for imbalanced datasets; a model that predicts the majority class for all instances can achieve high accuracy while failing to identify any positive cases, which is useless when the positive class is rare. Option B is wrong because F1-score balances precision and recall, but when false positives are costly, precision alone is more appropriate; F1-score would still allow some false positives in favor of recall, which is undesirable here. Option D is wrong because recall focuses on capturing all positive instances, but it does not penalize false positives; in a rare positive class scenario with high cost of false positives, maximizing recall would likely increase false positives, which is counterproductive.

271

MCQ · hard

A deep learning model for image classification is overfitting the training data. The team has already tried data augmentation and dropout. Which additional technique should they implement to reduce overfitting?

A.Batch normalization

B.Increase number of epochs

C.Gradient clipping

D.Early stopping

Answer: D

Early stopping monitors validation loss and stops training when it starts to increase, reducing overfitting.

Why this answer

Early stopping (Option D) is the correct additional technique because it halts training when validation performance stops improving, directly preventing the model from memorizing noise in the training data. Since data augmentation and dropout are already in use, early stopping provides a complementary regularization effect by limiting the number of training iterations before overfitting occurs.
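
The stopping rule can be sketched independently of any framework. The function below is an illustrative helper (not a library API) that scans per-epoch validation losses and stops once the loss has failed to improve for `patience` consecutive epochs; the loss values are made up.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # Validation loss improved: remember this epoch and reset the counter.
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    # Never triggered: training ran through every epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses: improving, then rising as overfitting sets in.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.70]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice the weights saved at `best_epoch` would be restored, so the deployed model is the one from before validation loss began to climb.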

Exam trap

CompTIA often tests the distinction between techniques that address overfitting versus those that solve optimization issues, leading candidates to confuse batch normalization or gradient clipping as overfitting solutions when they are not.

How to eliminate wrong answers

Option A is wrong because batch normalization primarily accelerates training and stabilizes learning by normalizing layer inputs; although it can have a slight regularizing effect, it is not a primary overfitting countermeasure. Option B is wrong because increasing the number of epochs would exacerbate overfitting by giving the model more opportunities to memorize training data, making the problem worse. Option C is wrong because gradient clipping is used to prevent exploding gradients in deep networks, especially in RNNs, and does not address overfitting from excessive model capacity or insufficient regularization.

272

MCQ · medium

An AI engineer is tuning a deep learning model and observes that the training loss decreases very slowly. The learning rate is set to 0.001. Which adjustment is most likely to speed up convergence?

A.Increase the learning rate to 0.01

B.Add more hidden layers

C.Decrease the learning rate to 0.0001

D.Increase the batch size

Answer: A

A higher learning rate allows larger weight updates, potentially speeding up convergence.

Why this answer

A learning rate of 0.001 is causing the model to take very small steps toward the minimum of the loss function, resulting in slow convergence. Increasing the learning rate to 0.01 allows larger weight updates per iteration, which typically speeds up training. However, care must be taken not to overshoot the optimum, as an excessively high learning rate can cause divergence.
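
The effect of step size can be seen on a toy problem. This sketch runs plain gradient descent on the one-dimensional loss (w - 3)^2, which is an assumption chosen for illustration and not the model in the question, and counts the steps needed to get close to the minimum.

```python
def steps_to_converge(lr, start=0.0, target=3.0, tol=0.01, max_steps=100_000):
    """Count gradient-descent steps on (w - target)^2; None if it never converges."""
    w = start
    for step in range(1, max_steps + 1):
        grad = 2.0 * (w - target)   # derivative of (w - target)^2
        w -= lr * grad
        if abs(w - target) < tol:
            return step
        if abs(w - target) > 1e6:   # the updates are overshooting and diverging
            return None
    return None


slow_steps = steps_to_converge(0.001)   # small steps, many iterations
fast_steps = steps_to_converge(0.01)    # ten times larger steps, far fewer iterations
diverged = steps_to_converge(1.5)       # too large: each step overshoots further
```

The third call shows the caveat in the explanation: past a certain point a larger learning rate stops helping and training diverges.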

Exam trap

CompTIA often tests the misconception that decreasing the learning rate always improves training, when in fact a learning rate that is too low is a primary cause of slow convergence, and the correct adjustment is to increase it within a safe range.

How to eliminate wrong answers

Option B is wrong because adding more hidden layers increases model complexity and the number of parameters, which generally slows training and can exacerbate the slow convergence problem rather than solving it. Option C is wrong because decreasing the learning rate to 0.0001 would make the updates even smaller, further slowing convergence. Option D is wrong because increasing the batch size provides a more accurate gradient estimate but reduces the frequency of updates per epoch, which can actually slow convergence in terms of steps needed to reach a given loss.

273

MCQ · medium

An AI system experiences degraded accuracy over time due to changes in user behavior. Which monitoring metric should be prioritized to detect this issue earliest?

A.API response latency

B.Data drift detection on input features

C.Area under the ROC curve (AUC)

D.Model accuracy on a holdout validation set

Answer: B

Data drift detects changes before performance degrades.

Why this answer

Option B is correct: data drift detection monitors changes in the input distribution, which often precede a drop in accuracy. Option A is wrong because latency does not reflect data shift. Option C is wrong because AUC is a lagging indicator.

Option D is wrong because accuracy on a holdout validation set is also a lagging indicator.
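
One way to monitor an input feature for drift is the Population Stability Index (PSI), sketched below in plain Python. The bin count, the epsilon floor, and the sample values are assumptions for illustration; a PSI above roughly 0.2 is a commonly cited rule of thumb for a significant shift, not a fixed standard.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index between a baseline sample and a recent sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0   # fall back to 1.0 if all baseline values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)   # clamp values outside the baseline range
            counts[idx] += 1
        eps = 1e-6                             # avoid log(0) for empty bins
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values at training time and after user behavior changed.
baseline = list(range(100))
shifted = [v + 30 for v in baseline]

psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

Because the check needs only input values, it can raise an alert before ground-truth labels arrive and before accuracy metrics can be recomputed.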

274

Multi-Select · easy

A data scientist is monitoring a deployed image classification model. Which TWO actions are best practices for detecting model drift? (Choose 2.)

Select 2 answers

A.Schedule automatic weekly retraining of the model.

B.Increase the model's complexity to improve generalization.

C.Use a holdout test set to periodically evaluate model accuracy.

D.Monitor the average prediction confidence of the model.

E.Track the distribution of input data over time.

Answers: C, E

Comparing performance on a static test set reveals concept drift.

Why this answer

Option C is correct because periodically evaluating the model on a holdout test set that reflects the current production data distribution is a direct method to detect accuracy degradation caused by model drift. This approach measures whether the model's performance on unseen data has declined over time, which is a key indicator of drift.

Exam trap

CompTIA often tests the distinction between detection and remediation actions, so candidates mistakenly choose retraining (Option A) as a detection method when it is actually a corrective action.

275

MCQ · hard

Refer to the exhibit. An AI governance review finds that a model was deployed without required ethics approval. Based on the audit log, who is most responsible for the compliance failure?

A.Bob

B.Alice

C.Carol

D.System

Answer: A

Bob deployed the model without ethics approval.

Why this answer

Option A (Bob) is correct because he deployed the model without ethics approval. The log shows no approval step before deployment. Option B (Alice) trained the model but did not deploy it.

Option C (Carol) performed the rollback after the issue. Option D (System) detected the anomaly but is not responsible.

276

MCQ · hard

A large hospital system deploys an AI triage system for emergency rooms. The system uses patient vitals and symptoms to recommend treatment priority. Six months after deployment, complaints arise that the system frequently underestimates the severity of symptoms for patients from certain ethnic backgrounds. A data scientist runs a bias audit and finds that the model's false negative rate is 20% higher for the minority group. The hospital's AI governance board requires immediate corrective action. The data science team has limited resources and cannot retrain the entire model from scratch. They have access to the training data, which is imbalanced. The model is a gradient boosted tree. Which course of action best addresses the bias while minimizing operational impact?

A.Rebalance the training data using SMOTE and retrain the model

B.Use adversarial debiasing during training to remove protected attribute correlations

C.Post-process the model's predictions by adjusting thresholds for the minority group

D.Replace the model with a simpler logistic regression model to improve interpretability

Answer: C

Threshold adjustment is fast, cheap, and directly minimizes false negative disparity.

Why this answer

Post-processing threshold adjustment is quick, does not require retraining, and directly reduces false negative disparities across groups.
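
A minimal sketch of group-specific thresholds follows. The scores, labels, group names, and threshold values are all hypothetical; in practice the thresholds would be chosen on a validation set and reviewed by the governance board.

```python
def false_negative_rate(scores, labels, threshold):
    """Share of true positives whose score falls below the decision threshold."""
    positive_scores = [s for s, y in zip(scores, labels) if y == 1]
    if not positive_scores:
        return 0.0
    missed = sum(1 for s in positive_scores if s < threshold)
    return missed / len(positive_scores)


def predict(score, group, thresholds, default=0.5):
    """Apply a per-group threshold to an unchanged model score."""
    return int(score >= thresholds.get(group, default))


# Hypothetical model scores for one group; the model itself is not retrained.
scores_minority = [0.35, 0.45, 0.55, 0.65, 0.20, 0.10]
labels_minority = [1, 1, 1, 1, 0, 0]

fnr_before = false_negative_rate(scores_minority, labels_minority, 0.5)
fnr_after = false_negative_rate(scores_minority, labels_minority, 0.3)

thresholds = {"minority": 0.3}   # other groups keep the default of 0.5
```

Only the decision rule changes, so the deployed gradient boosted tree and its scoring pipeline stay as they are.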

277

Multi-Select · easy

A data scientist is tuning a deep learning model. Which TWO hyperparameters directly affect the model's capacity to overfit?

Select 2 answers

A.Number of layers in the network.

B.Batch size.

C.Optimizer choice (e.g., SGD vs Adam).

D.Dropout rate.

E.Learning rate.

Answers: A, D

More layers increase capacity, raising overfitting risk.

Why this answer

Option A is correct because increasing the number of layers increases the model's depth, which expands its representational capacity and allows it to learn more complex patterns, including noise, thereby directly increasing overfitting risk. Option D is correct because dropout is a regularization technique that randomly drops neurons during training; a low dropout rate (e.g., 0.0) removes this regularization, while a high rate (e.g., 0.5) reduces overfitting by preventing co-adaptation of neurons.

Exam trap

CompTIA often tests the distinction between hyperparameters that affect model capacity (number of layers, dropout rate) versus those that affect training dynamics (batch size, optimizer, learning rate), leading candidates to mistakenly select learning rate or batch size as direct overfitting controls.

278

MCQ · hard

A company deploys an AI model for loan approval. The model shows bias against a protected group. The team decides to use adversarial debiasing. What is the PRIMARY advantage of this approach?

A.It guarantees the model's predictions are private.

B.It reduces bias while preserving predictive performance by learning representations that are invariant to sensitive attributes.

C.It is simpler to implement than pre-processing techniques.

D.It ensures equal approval rates across all groups.

Answer: B

This is the core benefit of adversarial debiasing.

Why this answer

Adversarial debiasing is an in-processing technique that trains a primary model to predict the target (e.g., loan approval) while simultaneously training an adversary to predict the sensitive attribute from the model's learned representations. The primary model is penalized when the adversary succeeds, forcing it to learn representations that are invariant to the sensitive attribute. This reduces bias while preserving predictive performance because the model retains the ability to learn task-relevant patterns that are not correlated with the protected attribute.

Exam trap

The trap here is that candidates confuse 'reducing bias' with 'ensuring equal outcomes' (demographic parity), but adversarial debiasing targets equalized odds or equal opportunity by focusing on representation invariance, not strict rate equality.

How to eliminate wrong answers

Option A is wrong because adversarial debiasing does not guarantee privacy; it addresses fairness, not confidentiality, and does not provide differential privacy or encryption. Option C is wrong because adversarial debiasing is an in-processing technique that is generally more complex to implement than pre-processing techniques like reweighing or sampling, which modify the dataset before training. Option D is wrong because adversarial debiasing aims to reduce bias by learning invariant representations, but it does not enforce equal approval rates across groups; equal approval rates would be demographic parity, which is a different fairness metric and may not align with the model's predictive performance.

279

MCQ · medium

A financial services company has a real-time fraud detection system that uses Apache Kafka to stream transaction events, a TensorFlow Serving model for scoring, and a Redis cache for lookup of historical fraud patterns. The system processes 10,000 transactions per second with an SLA of 100ms latency per transaction. Recently, after a model update, the latency for some transactions spiked to over 500ms, causing timeouts. The model uses a deep neural network with 10 million parameters. The engineering team suspects the issue is due to increased model inference time. Which action should be taken to reduce latency without significant loss in accuracy?

A.Add more Redis nodes to the cache cluster

B.Increase the number of Kafka partitions and consumer threads

C.Decrease the inference batch size from 32 to 1

D.Quantize the model weights from FP32 to FP16

Answer: D

FP16 quantization reduces model size and speeds up inference, typically with minimal accuracy impact.

Why this answer

The latency spike is caused by increased model inference time after a model update. Quantizing model weights from FP32 to FP16 reduces memory bandwidth and computation requirements, directly speeding up inference on compatible hardware (e.g., GPUs with Tensor Cores) with minimal accuracy loss. This addresses the root cause—model inference latency—without changing the system architecture.

Exam trap

The trap here is that candidates confuse system-level scaling (adding cache nodes or Kafka partitions) with model-level optimization, failing to recognize that the latency spike originates from the model inference step itself.

How to eliminate wrong answers

Option A is wrong because adding Redis nodes improves cache lookup throughput, but the latency spike is due to model inference time, not cache performance. Option B is wrong because increasing Kafka partitions and consumer threads improves message ingestion parallelism, but does not reduce the per-transaction inference latency of the TensorFlow Serving model. Option C is wrong because decreasing the inference batch size from 32 to 1 reduces throughput and increases per-transaction overhead (e.g., kernel launch latency), which would worsen latency, not improve it.

280

MCQ · hard

An AI model is deployed to a mobile app with limited computational resources. The model is a deep neural network with high latency. Which technique is best to reduce inference time?

A.Increase batch size

B.Add more layers

C.Use a larger model

D.Quantization

Answer: D

Quantization reduces model size and speeds up inference by using lower-precision arithmetic.

Why this answer

Quantization reduces the precision of model weights (e.g., from float32 to int8), significantly speeding up inference and reducing memory footprint with minimal accuracy loss. Increasing batch size is for throughput, not single inference latency. Using a larger model or adding more layers would increase latency.
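
The float32-to-int8 idea can be sketched with symmetric per-tensor quantization in plain Python. This is an illustration of the arithmetic only, with made-up weights; real deployments would rely on a framework's quantization tooling.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] plus one shared scale."""
    max_abs = max(abs(w) for w in weights) or 1.0   # avoid a zero scale
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the int8 values."""
    return [q * scale for q in quantized]


# Hypothetical weights from one layer.
weights = [0.5, -1.27, 0.003, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize(q_weights, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now needs one byte instead of four, and the rounding error per weight is bounded by half the scale, which is why accuracy loss is usually small.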

281

MCQ · medium

An AI model's performance drops significantly in production compared to testing. The data shows distribution shift. What is the best first step?

A.Add more features

B.Retrain model with new data

C.Use a different algorithm

D.Reduce model complexity

Answer: B

Retraining with current data addresses drift.

Why this answer

Option B (Retrain model with new data) is correct because retraining with more representative data adapts the model to the distribution shift. Option A (Add more features) may not address the shift. Option C (Use a different algorithm) is a larger change that does not address the data.

Option D (Reduce model complexity) might worsen performance.

282

MCQ · hard

Based on the exhibit, what is the most likely cause of the accuracy drop?

A.A required feature is missing from the production data pipeline.

B.Data drift in the 'income' feature has caused the model to become less accurate.

C.The model was overfitted to the training data.

D.The model's confidence threshold needs to be adjusted.

Answer: B

The detected distribution shift for 'income' indicates data drift, a common cause of performance degradation.

Why this answer

The exhibit shows a sudden and sustained drop in model accuracy coinciding with a shift in the distribution of the 'income' feature. This is a classic symptom of data drift, where the statistical properties of the input feature change over time, causing the model's learned patterns to no longer match the production data. Option B correctly identifies this as the most likely cause because the model was trained on a prior income distribution and is now encountering values outside that range.

Exam trap

CompTIA often tests the distinction between data drift and model overfitting by presenting a sudden accuracy drop after stable performance, leading candidates to incorrectly attribute it to overfitting when the exhibit clearly shows a distribution shift in a specific feature.

How to eliminate wrong answers

Option A is wrong because a missing feature in the production data pipeline would typically cause a pipeline failure or missing-value error, not an accuracy drop that coincides with a specific feature's distribution shift. Option C is wrong because overfitting would manifest as high training accuracy with poor generalization from the start, not a sudden accuracy drop after a period of stable performance; the exhibit shows a clear change point, not a consistently low accuracy. Option D is wrong because adjusting the confidence threshold changes the precision-recall trade-off but does not address the underlying cause of the model's predictions becoming less reliable due to shifted input distributions; it would not restore the original accuracy level.

283

MCQ · medium

A healthcare startup is developing a deep learning model to detect diabetic retinopathy from retinal images. The model is trained on a dataset of 10,000 labeled images. During initial testing, the model achieves 99% accuracy on the training set but only 85% on the test set. The startup wants to deploy the model in a clinical setting where false negatives (missing a disease) are critical. The team has access to additional unlabeled retinal images from multiple sources. Which strategy should the team use to improve the model's generalization and reduce false negatives?

A.Use semi-supervised learning with the unlabeled images to improve feature representations

B.Apply aggressive data augmentation to the training set

C.Increase the learning rate during training

D.Add more convolutional layers to the model

Answer: A

Semi-supervised learning utilizes unlabeled data to learn generalizable features, reducing overfitting and improving test performance.

Why this answer

Semi-supervised learning leverages the large pool of unlabeled retinal images to learn robust feature representations, which helps the model generalize better to unseen data. By reducing overfitting (the gap between 99% training and 85% test accuracy), this approach directly improves test-set performance. Additionally, semi-supervised methods can be tuned to emphasize recall, thereby reducing false negatives critical in clinical diabetic retinopathy screening.

Exam trap

CompTIA often tests the misconception that simply increasing data or model complexity (augmentation, layers) always improves generalization, when in fact semi-supervised learning is the targeted solution for leveraging unlabeled data to close the train-test accuracy gap and address class-specific metrics like false negatives.

How to eliminate wrong answers

Option B is wrong because aggressive data augmentation, while helpful for generalization, does not directly address the high false-negative rate; it may even distort critical pathological features if applied too aggressively. Option C is wrong because increasing the learning rate typically destabilizes training, leading to divergence or poor convergence, and does not reduce false negatives or improve generalization. Option D is wrong because adding more convolutional layers increases model capacity, which would likely worsen overfitting given the already large gap between training and test accuracy, and does not specifically target false negatives.

284

MCQ · hard

A company uses the above policy to control AI model access. A data scientist tries to run inference with model "llama-3-70b" at 150 requests in 30 minutes. What will happen?

A.All requests are allowed because the model is in the allowed list

B.All requests are denied because the rate limit is per minute and 150 exceeds the limit

C.The first 100 requests are allowed; the remaining 50 are denied

D.All requests are denied because the second rule blocks all models

Answer: C

The rate limit allows 100 requests per window; requests beyond the limit are denied.

Why this answer

Option C is correct because the policy allows up to 100 requests per 30 minutes for models in the allowed list, and 'llama-3-70b' is in that list. The rate limit is applied per 30-minute window, not per minute, so the first 100 requests are allowed, and the remaining 50 exceed the limit and are denied.
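
The behavior described above can be sketched as a fixed-window rate limiter. The policy exhibit is not reproduced here, so the limit of 100 and the 30-minute window are taken from the explanation, and the class name and evenly spaced timestamps are assumptions for illustration.

```python
class FixedWindowRateLimiter:
    """Allow at most `limit` requests per window of `window_seconds`."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Start a new window on the first request or once the old one has expired.
        if self.window_start is None or now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


limiter = FixedWindowRateLimiter(limit=100, window_seconds=30 * 60)

# 150 requests spread evenly over 30 minutes (one every 12 seconds).
decisions = [limiter.allow(now=i * 12) for i in range(150)]
allowed = sum(decisions)
denied = len(decisions) - allowed
```

All 150 requests fall inside one window, so the counter reaches the cap at request 100 and every later request is rejected.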

Exam trap

The trap here is that candidates often misinterpret the rate limit as a per-minute value (like 100 per minute) rather than the stated 100 per 30 minutes, leading them to incorrectly select Option B.

How to eliminate wrong answers

Option A is wrong because it ignores the rate limit; being in the allowed list does not bypass the 100 requests per 30-minute cap. Option B is wrong because it misinterprets the rate limit as per minute, but the policy specifies a per-30-minute window, so 150 requests in 30 minutes does not exceed a per-minute limit. Option D is wrong because the second rule does not block all models; it only blocks models not in the allowed list, and 'llama-3-70b' is explicitly allowed.

285

Multi-Select · easy

A data scientist is preparing a dataset for a classification model. The dataset contains several categorical variables with high cardinality. Which TWO encoding methods are appropriate for converting these categorical variables into numerical features?

Select 2 answers

A.Min-max scaling

B.K-means clustering

C.One-hot encoding

D.Principal component analysis (PCA)

E.Label encoding

Answers: C, E

One-hot encoding converts each category into a binary vector, suitable for categorical variables.

Why this answer

One-hot encoding and label encoding are both appropriate techniques for encoding categorical variables. One-hot encoding creates binary columns for each category, while label encoding assigns a unique integer to each category. PCA is a dimensionality reduction technique, K-means is a clustering algorithm, and min-max scaling is a normalization method, none of which are encoding methods.
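
Both encodings are easy to sketch without a library. The helper names and the sample column below are hypothetical; categories are sorted so that the mapping is deterministic.

```python
def label_encode(values):
    """Assign each distinct category a unique integer."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    """Create one binary column per distinct category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        rows.append(row)
    return rows, categories


# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]

labels, mapping = label_encode(colors)
one_hot, categories = one_hot_encode(colors)
```

One-hot encoding adds a column for every distinct value, so its width grows with cardinality, while label encoding stays a single column but implies an ordering between categories.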

286

MCQ · easy

Refer to the exhibit. What is the recall of the model?

A.0.72

B.0.8

C.0.7

D.0.73

Answer: B

TP=72, FN=18, recall=72/90=0.8

Why this answer

Recall is calculated as True Positives divided by (True Positives + False Negatives). From the confusion matrix, True Positives = 72 and False Negatives = 18, so recall = 72 / (72 + 18) = 72 / 90 = 0.8. This measures the model's ability to correctly identify all actual positive cases.
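
The same arithmetic as a short sketch, using the true-positive and false-negative counts quoted in the explanation above (the exhibit itself is not reproduced here):

```python
def recall(tp: int, fn: int) -> float:
    """True positives divided by all actual positives (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


model_recall = recall(tp=72, fn=18)   # 72 / 90
```

False positives and true negatives do not appear in the formula, so they have no effect on recall.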

Exam trap

CompTIA often tests recall by providing a confusion matrix and expects candidates to correctly identify the denominator as TP+FN, not total samples, to avoid confusing recall with accuracy or precision.

How to eliminate wrong answers

Option A (0.72) is wrong because it divides True Positives by the total number of predictions (72/100 = 0.72), which is not the recall formula. Option C (0.7) is wrong because it likely results from misreading the matrix (e.g., using 72/102 or confusing with precision). Option D (0.73) is wrong because it may come from a miscalculation such as (72 + 1)/(72 + 18 + 10) = 73/100, which is not a standard metric.

287

MCQ · medium

An e-commerce company uses an AI system to set dynamic prices for products. A customer complains that the price they see is higher than the price shown to a friend for the same product at the same time. The company wants to ensure pricing fairness. Which ethical principle should guide the redesign of the pricing algorithm?

A.Transparency and explainability

B.Privacy by design

C.Accountability

D.Beneficence

Answer: A

Transparency requires the company to disclose how prices are determined, helping to ensure fairness and build trust.

Why this answer

Transparency and explainability is the correct principle because the core issue is that the customer cannot understand why the AI system set a different price for them compared to their friend. Redesigning the algorithm to provide clear, understandable reasons for price variations—such as demand, purchase history, or time of day—directly addresses this lack of visibility. This principle ensures that the system's decision-making process is open to scrutiny, which is essential for building trust and resolving fairness complaints in dynamic pricing models.

Exam trap

CompTIA often tests the distinction between 'accountability' (who is responsible) and 'transparency' (how the decision is made), leading candidates to pick accountability when the question explicitly asks for the principle that guides the redesign to ensure fairness through understanding.

How to eliminate wrong answers

Option B (Privacy by design) is wrong because the complaint is about price disparity and lack of understanding, not about how customer data is collected, stored, or protected. Option C (Accountability) is wrong because while accountability is important for assigning responsibility, it does not directly solve the customer's need to understand why the price differs; it focuses on who is responsible rather than making the algorithm's logic visible. Option D (Beneficence) is wrong because beneficence refers to doing good or maximizing benefits, but the immediate ethical failure here is the lack of clarity and justification for the pricing decision, not the absence of overall positive outcomes.

288

Multi-Select · medium

Which TWO actions should be taken to ensure an AI model complies with GDPR requirements when processing personal data?

Select 2 answers

A.Limit data collection to only what is necessary for the model

B.Provide a full explanation of model predictions

C.Store all user data for a minimum of 10 years

D.Anonymize all personal data before use

E.Implement user data deletion upon request

Answers: A, E

Data minimization is a GDPR principle.

Why this answer

Option A is correct because GDPR's data minimization principle (Article 5(1)(c)) requires that personal data collected be adequate, relevant, and limited to what is necessary for the purpose for which it is processed. In AI model training, this means collecting only the features essential for the model's objective, reducing the risk of processing excessive or irrelevant personal data.

Exam trap

CompTIA often tests the misconception that anonymization is always required before any AI processing of personal data, but GDPR allows processing under lawful bases without anonymization, making Option D a tempting but incorrect choice.

289

MCQ · medium

A data science team uses Git for version control of model code and DVC for data versioning. They want to implement a model registry to track trained models, their hyperparameters, and performance metrics. Which tool is specifically designed for this purpose and integrates with the existing workflow?

A.Apache Airflow

B.Docker

C.MLflow Model Registry

D.Kubernetes

Answer: C

MLflow provides a model registry that stores model versions and metadata.

Why this answer

MLflow Model Registry is specifically designed for managing model versions, tracking metadata, and integrating with Git and DVC. Apache Airflow is for workflow orchestration, not model registry. Kubernetes is for container orchestration.

Docker is for containerization.

290

MCQ · hard

A healthcare startup is developing a deep learning model to detect diabetic retinopathy from retinal fundus images. The dataset contains 50,000 images, but only 5% are labeled as positive for the disease. The team uses a convolutional neural network (CNN) with a final sigmoid layer and binary cross-entropy loss. After training for 20 epochs, the model achieves 95% accuracy on the test set, but the recall for the positive class is only 10%. The team suspects the model is biased toward the negative class due to class imbalance. The data is stored in a secure environment, and no additional labeled data can be obtained. The team has access to the following techniques: oversampling the minority class, undersampling the majority class, using class weights in the loss function, applying data augmentation, and using a different architecture. Which course of action is most likely to improve recall for the positive class while maintaining reasonable overall performance?

A.Undersample the majority class to balance the dataset

B.Oversample the minority class using synthetic image generation

C.Assign higher class weights to the positive class in the loss function

D.Replace the CNN with a transformer-based architecture

Answer: C

Class weights force the model to focus on the minority class, improving recall.

Why this answer

Assigning higher class weights to the positive class in the loss function directly penalizes misclassifications of the minority class during training. This forces the model to pay more attention to positive samples without altering the dataset distribution, which is critical when no additional labeled data can be obtained and the data is in a secure environment. It improves recall by increasing the gradient contribution from positive samples, while maintaining overall performance because the model still sees the original data distribution.
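
A minimal sketch of a class-weighted binary cross-entropy follows. The weight of 19 is an assumption based on the inverse class ratio (95% negative to 5% positive), a common heuristic, and the labels and probabilities are made up to mimic a model that predicts "negative" for everything.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight=1.0, neg_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with a separate weight for each class."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        if y == 1:
            total += -pos_weight * math.log(p)
        else:
            total += -neg_weight * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical batch: one positive, three negatives, all scored as unlikely.
y_true = [1, 0, 0, 0]
y_prob = [0.1, 0.1, 0.1, 0.1]

unweighted_loss = weighted_bce(y_true, y_prob)
weighted_loss = weighted_bce(y_true, y_prob, pos_weight=19.0)
```

The missed positive now dominates the loss, so gradient updates push the model toward detecting the minority class without duplicating or discarding any images.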

Exam trap

The trap here is that candidates often choose oversampling (Option B) as the default solution for class imbalance, but fail to recognize that synthetic image generation for medical images can introduce unrealistic patterns and is not a standard or safe technique, whereas class weights are a lightweight, data-preserving approach that directly addresses the loss function.

How to eliminate wrong answers

Option A is wrong because undersampling the majority class discards a large number of negative samples, which can lead to loss of valuable information and degrade overall accuracy, especially with a 95% negative class. Option B is wrong because oversampling the minority class using synthetic image generation (e.g., SMOTE) is not directly applicable to high-dimensional image data without careful adaptation, and it may introduce unrealistic artifacts that harm generalization; the question specifies 'synthetic image generation' which is not a standard or safe approach for retinal fundus images. Option D is wrong because replacing the CNN with a transformer-based architecture does not address the class imbalance problem; transformers are not inherently better at handling imbalanced data and would require more data and computational resources, which are not available here.


291

MCQ · easy

A company wants to deploy a machine learning model that requires continuous learning as new data arrives. The model must be able to adapt to changing patterns without retraining from scratch. Which approach should be used?

A. Transfer learning

B. Online learning

C. Batch learning

D. Unsupervised learning

Answer: B

Online learning updates the model incrementally, allowing adaptation to new data without full retraining.

Why this answer

Online learning (also called incremental learning) updates the model incrementally as each new data point arrives, without requiring full retraining. This makes it ideal for scenarios where data arrives continuously and patterns shift over time, as the model can adapt its parameters on the fly.
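As a rough sketch of what "updating incrementally" means, the snippet below applies one stochastic gradient step per arriving example to a tiny linear model. The function name `online_sgd_step`, the learning rate, and the synthetic stream (whose pattern shifts from y = 2x to y = -x) are hypothetical choices for illustration.

```python
def online_sgd_step(weights, x, y, lr=0.05):
    """One incremental update of a linear model on a single (x, y) example."""
    pred = sum(w * xi for w, xi in zip(weights, x))
    error = pred - y
    # Gradient of 0.5 * error**2 with respect to each weight is error * xi.
    return [w - lr * error * xi for w, xi in zip(weights, x)]

# Hypothetical stream: the relationship is y = 2x at first, then shifts to y = -x.
inputs = [k / 10 for k in range(1, 11)]
before_shift = [([x], 2.0 * x) for x in inputs] * 50
after_shift = [([x], -1.0 * x) for x in inputs] * 50

weights = [0.0]
for x, y in before_shift:
    weights = online_sgd_step(weights, x, y)
w_before = weights[0]   # close to 2 after the first phase

for x, y in after_shift:
    weights = online_sgd_step(weights, x, y)   # no retraining from scratch
w_after = weights[0]    # has moved close to -1, tracking the new pattern
```

The model is never refit on the full history; each example nudges the existing weights, which is what lets it follow the shift.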

Exam trap

CompTIA often tests the distinction between training paradigms (online vs. batch) and other ML concepts like transfer learning or unsupervised learning, so candidates may confuse 'continuous learning' with 'transfer learning' or incorrectly assume that any learning method can handle streaming data.

How to eliminate wrong answers

Option A is wrong because transfer learning reuses a pre-trained model on a new but related task, but it does not inherently support continuous adaptation to streaming data—it typically requires a separate fine-tuning phase. Option C is wrong because batch learning trains the model on the entire dataset at once and requires retraining from scratch when new data arrives, making it unsuitable for continuous learning. Option D is wrong because unsupervised learning is a paradigm for finding patterns in unlabeled data, not a deployment strategy for handling streaming data or model updates.


292

Multi-Select · hard

A data scientist is evaluating a binary classification model for fraud detection. The dataset is highly imbalanced (99% non-fraud, 1% fraud). Which TWO metrics are most appropriate for assessing model performance? (Choose two.)

Select 2 answers

A. Precision

B. Recall

C. F1 score

D. Area under the ROC curve (AUC-ROC)

E. Accuracy

Answers: A, B

Precision measures the proportion of predicted fraud that is actually fraud, important to avoid false positives.

Why this answer

Precision is appropriate because it measures the proportion of predicted fraud cases that are actually fraudulent, which is critical when false positives (flagging legitimate transactions as fraud) are costly. In a highly imbalanced dataset like this (99% non-fraud), precision directly evaluates the model's ability to avoid overwhelming fraud analysts with false alarms. Recall complements it by measuring the proportion of actual fraud cases the model catches, which matters because missed fraud (false negatives) is also costly.
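A short worked example shows why accuracy is uninformative at a 99/1 split while precision and recall are not. The confusion-matrix counts below are hypothetical, chosen only to match the 1% fraud rate in the question.

```python
def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, and accuracy from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / total
    return {"precision": precision, "recall": recall, "accuracy": accuracy}

# Hypothetical 10,000 transactions: 100 fraud, 9,900 legitimate.
# A model that never flags anything as fraud:
never_flags = classification_metrics(tp=0, fp=0, fn=100, tn=9900)

# A model that catches 60 of the 100 frauds and raises 40 false alarms:
some_model = classification_metrics(tp=60, fp=40, fn=40, tn=9860)
```

The model that never flags fraud still reaches 99% accuracy with zero recall, whereas the second model's precision and recall (both 0.6) describe its behavior on the fraud class directly.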

Exam trap

CompTIA often tests the misconception that AUC-ROC is always the best metric for imbalanced datasets, but the trap here is that AUC-ROC can be misleadingly high even when the model performs poorly on the minority class, whereas precision and recall directly address the class imbalance.


293

MCQ · hard

A research lab trains a language model using DP-SGD. What primary privacy risk does this technique mitigate?

A. Data poisoning attacks

B. Membership inference attacks

C. Adversarial patch attacks

D. Model inversion attacks

Answer: B

DP-SGD explicitly bounds the contribution of each datapoint, making membership inference harder.

Why this answer

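DP-SGD clips each example's gradient and adds calibrated noise during training, which bounds the influence of any single training record on the final model. That limits memorization of individual records and makes it harder for an attacker to infer whether a specific record was part of the training set, which is the goal of a membership inference attack.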
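The core step of DP-SGD can be sketched as follows, assuming per-example gradients are already available as plain lists. The function names and toy gradients are hypothetical, and privacy accounting (tracking the epsilon and delta budget) is deliberately left out.

```python
import math
import random

def clip_gradient(grad, max_norm):
    """Scale a per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]

def dp_sgd_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    dim = len(per_example_grads[0])
    clipped = [clip_gradient(g, max_norm) for g in per_example_grads]
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm  # noise scale is tied to the clipping bound
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]

# Hypothetical batch: one ordinary gradient and one outlier with norm 50.
grads = [[0.1, -0.2], [30.0, 40.0]]
clipped_outlier = clip_gradient(grads[1], max_norm=1.0)   # rescaled to norm 1
noiseless = dp_sgd_gradient(grads, 1.0, 0.0, random.Random(0))
private = dp_sgd_gradient(grads, 1.0, 1.1, random.Random(0))
```

Clipping caps how far any one record can move the update, and the added noise masks whatever contribution remains, which is why the presence or absence of a single record becomes hard to detect.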


294

MCQ · medium

You are an AI governance officer at a bank that uses a machine learning model to predict credit risk. The model was developed by an external vendor and uses a proprietary algorithm. The bank's compliance team has determined that the model must be explainable to meet regulatory requirements. However, the vendor claims the model is a 'black box' and cannot provide explanations. You need to ensure compliance while maintaining the model's performance. What is the best course of action?

A. Ignore the requirement as the model is proprietary

B. Ask the vendor to develop a custom explanation module

C. Replace the model with a simpler, interpretable model

D. Use a model-agnostic explanation technique like SHAP

Answer: D

SHAP provides explanations for any model, satisfying regulatory needs.

Why this answer

D is correct because model-agnostic explanation techniques like SHAP (SHapley Additive exPlanations) can provide post-hoc interpretability for any black-box model without requiring access to its internal structure or proprietary algorithm. This allows the bank to meet regulatory explainability requirements while preserving the vendor's proprietary model and its predictive performance.
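To illustrate what "model-agnostic" means here, the sketch below computes exact Shapley values for a single prediction by calling a model only through its prediction function. It does not use the SHAP library, which relies on approximations of these values to scale to real models; the scoring function, feature names, and zero baseline are hypothetical stand-ins.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley attributions for one prediction of a black-box `predict`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

# Hypothetical stand-in for the vendor model: it is only called, never inspected.
def black_box_score(features):
    income, debt, late_payments = features
    return 0.5 * income - 0.3 * debt - 0.2 * late_payments * debt

applicant = [4.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(black_box_score, applicant, baseline)
```

The attributions sum to the difference between the applicant's score and the baseline score, so each feature's share of the decision can be reported without any access to the model's internals.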

Exam trap

The trap here is that candidates may assume that a 'black box' model cannot be explained at all, leading them to choose replacement with a simpler model (Option C), when in fact model-agnostic techniques like SHAP or LIME can provide explanations without altering the model itself.

How to eliminate wrong answers

Option A is wrong because ignoring regulatory requirements is not a viable option for a financial institution; it would lead to non-compliance and potential penalties. Option B is wrong because the vendor has already stated that the model is a 'black box' for which it cannot provide explanations, so requesting a custom explanation module is impractical and unlikely to succeed. Option C is wrong because replacing the model with a simpler, interpretable model would sacrifice the predictive performance that the current model provides, which may be critical for accurate credit risk assessment.


295

MCQ · easy

A dataset used for training a classification model contains 10% missing values in a feature that is known to be important. The data scientist decides to impute the missing values. Which imputation method is most robust if the data is not missing completely at random?

A. Delete all rows with missing values

B. Use multiple imputation to model missing values

C. Replace missing values with the mean of the feature

D. Fill missing values with 0

Answer: B

Multiple imputation provides unbiased estimates under the missing-at-random (MAR) assumption.

Why this answer

Option B is correct because multiple imputation accounts for the uncertainty in the imputed values and is robust when data are missing at random. Option A is wrong because deleting rows with missing values reduces sample size and may bias results. Option C is wrong because mean imputation distorts distributions and relationships.

Option D is wrong because filling with zero is arbitrary and introduces bias.
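The distortion caused by mean imputation is easy to see numerically. The sketch below uses a hypothetical ten-value feature with two missing entries; it demonstrates the weakness of mean imputation only and is not an implementation of multiple imputation.

```python
from statistics import mean, pvariance

# Hypothetical feature with 2 of 10 values missing (None).
feature = [12.0, 15.0, None, 9.0, 20.0, 14.0, None, 18.0, 11.0, 13.0]

observed = [v for v in feature if v is not None]
fill = mean(observed)                       # mean of the observed values
imputed = [fill if v is None else v for v in feature]

var_observed = pvariance(observed)          # spread of the real data
var_imputed = pvariance(imputed)            # spread after mean imputation
```

The mean is preserved, but the variance shrinks because every filled-in value sits exactly at the center of the distribution. Multiple imputation avoids this by drawing several plausible values per gap and pooling the results.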


296

Multi-Select · easy

Which TWO of the following are common activation functions used in neural networks? (Choose two.)

Select 2 answers

A. Gradient descent

B. LSTM

C. Dropout

D. ReLU

E. Sigmoid

Answers: D, E

ReLU is a widely used activation function.

Why this answer

ReLU (Rectified Linear Unit) is a widely used activation function that outputs the input directly if it is positive, and zero otherwise, introducing non-linearity while mitigating the vanishing gradient problem. Sigmoid is another common activation function that maps any real-valued input to a value between 0 and 1, making it useful for binary classification output layers. Both are fundamental building blocks in neural network architectures.
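Both functions are short enough to write out directly. The sketch below is a plain-Python version for scalar inputs; the sigmoid is written in a two-branch form to avoid overflow for large negative inputs, which is an implementation choice made for this example.

```python
import math

def relu(x):
    """Pass positive inputs through unchanged; map everything else to zero."""
    return x if x > 0 else 0.0

def sigmoid(x):
    """Map any real input to the interval (0, 1), in a numerically stable form."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)          # safe: x is negative, so exp(x) cannot overflow
    return z / (1.0 + z)

samples = [-2.0, 0.0, 2.0]
relu_out = [relu(v) for v in samples]
sigmoid_out = [sigmoid(v) for v in samples]
```

ReLU keeps a gradient of 1 for positive inputs, while the sigmoid flattens toward 0 and 1 at the extremes, which is where its gradients become very small.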

Exam trap

CompTIA often tests the distinction between activation functions and other neural network components like optimizers (gradient descent), architectures (LSTM), or regularization techniques (dropout), expecting candidates to recognize that only ReLU and Sigmoid directly compute a neuron's output from its input.


297

Multi-Select · hard

An AI operations team is monitoring a deployed image classification model. They notice a gradual increase in prediction confidence but a drop in accuracy. Which THREE actions should they take to diagnose the issue?

Select 3 answers

A. Analyze the model's calibration curve to see if confidence scores align with actual accuracy.

B. Increase the size of the training dataset by collecting more unlabeled data.

C. Compare the distribution of input features between training and recent production data.

D. Evaluate model performance on a held-out test set collected at deployment time.

E. Retrain the model immediately with the most recent data.

Answers: A, C, D

Calibration analysis reveals whether the model has become overconfident due to drift.

Why this answer

Option A is correct because a calibration curve (reliability diagram) directly compares predicted confidence scores against actual accuracy. In this scenario, increasing confidence with dropping accuracy indicates miscalibration—the model is becoming overconfident. Analyzing the calibration curve reveals whether the confidence scores systematically deviate from true probabilities, which is the core diagnostic step for this specific symptom.
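A reliability table is simple to compute by hand, as sketched below. The function names, the bin count, and the ten sample predictions are hypothetical; the summary statistic shown is the expected calibration error (ECE), one common way to reduce a calibration curve to a single number.

```python
def reliability_table(confidences, correct, n_bins=5):
    """Bin predictions by confidence; return (mean confidence, accuracy, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the top bin
        bins[idx].append((conf, ok))
    rows = []
    for items in bins:
        if items:
            mean_conf = sum(c for c, _ in items) / len(items)
            accuracy = sum(1 for _, ok in items if ok) / len(items)
            rows.append((mean_conf, accuracy, len(items)))
    return rows

def expected_calibration_error(rows):
    """Count-weighted average gap between confidence and accuracy."""
    total = sum(count for _, _, count in rows)
    return sum(count / total * abs(conf - acc) for conf, acc, count in rows)

# Hypothetical recent production predictions: high confidence, mediocre accuracy.
confidences = [0.95, 0.92, 0.97, 0.91, 0.99, 0.93, 0.96, 0.94, 0.98, 0.95]
correct = [True, False, True, False, True, False, True, False, True, True]

rows = reliability_table(confidences, correct)
ece = expected_calibration_error(rows)
```

Here every prediction lands in the top confidence bin with an average confidence of 0.95 but only 60% accuracy, the same overconfidence pattern the question describes.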

Exam trap

CompTIA often tests the distinction between diagnostic actions and corrective actions—candidates mistakenly jump to retraining (Option E) or data collection (Option B) instead of first analyzing calibration and data distribution (Options A, C, D) to identify the specific type of drift or miscalibration.


298

MCQ · easy

A company wants to use AI to automatically categorize customer support tickets into topics like 'billing', 'technical', 'account'. They have 10,000 labeled examples. Which algorithm is most suitable for this task?

A. DBSCAN

B. Apriori

C. Principal component analysis (PCA)

D. Logistic regression

Answer: D

Logistic regression is a supervised learning algorithm for classification, suitable for multi-class problems with moderate data.

Why this answer

Option D is correct because logistic regression is a supervised classifier that handles multi-class problems and works well with a moderate amount of labeled data, such as the 10,000 examples here. Option A is wrong because DBSCAN is a clustering algorithm, not a classifier. Option B is wrong because Apriori mines association rules.

Option C is wrong because PCA is a dimensionality-reduction technique, not a classifier.


299

MCQ · hard

A model trained on a dataset with imbalanced classes achieves 98% accuracy but only 50% recall for the minority class. Which technique should be applied first to address the imbalance?

A. Apply cost-sensitive learning

B. Reduce the majority class size

C. Use SMOTE to generate synthetic samples

D. Collect more data for the minority class

Answer: A

Cost-sensitive learning adjusts class weights in the loss function, directly tackling imbalance without data modification.

Why this answer

Cost-sensitive learning directly modifies the model's loss function to penalize misclassifications of the minority class more heavily than those of the majority class. This approach addresses the root cause of the imbalance—the model's bias toward the majority class—without altering the dataset distribution, making it the most immediate and effective first step.
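One common way to choose the misclassification costs is inverse class frequency. The sketch below computes weights as n_samples / (n_classes * class_count), the same heuristic scikit-learn documents for `class_weight='balanced'`. The function name and the 96% / 4% label split are hypothetical; that split was picked because catching 20 of 40 positives with no false positives gives 980 / 1000 = 98% accuracy and 50% recall, matching the scenario.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical dataset: 960 majority-class and 40 minority-class examples.
labels = [0] * 960 + [1] * 40
weights = balanced_class_weights(labels)

# Relative cost of a minority-class error versus a majority-class error.
cost_ratio = weights[1] / weights[0]
```

With these weights, each minority-class mistake costs 24 times as much as a majority-class mistake in the loss, and no samples are added to or removed from the dataset.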

Exam trap

CompTIA often tests the misconception that data-level techniques like SMOTE or undersampling should always be the first approach, when in fact cost-sensitive learning is a simpler, less invasive, and often more effective initial step that directly adjusts the model's learning objective.

How to eliminate wrong answers

Option B is wrong because reducing the majority class size (random undersampling) discards potentially valuable data, which can lead to loss of information and increased variance in the model, and it is not typically the first technique applied. Option C is wrong because SMOTE generates synthetic samples for the minority class, which can introduce noise and is a data-level augmentation technique that should be considered after cost-sensitive adjustments or as a complementary method, not as the first step. Option D is wrong because collecting more data for the minority class is often impractical, time-consuming, and may not be feasible in real-world scenarios; it is not a guaranteed or immediate solution to the imbalance.


300

Multi-Select · medium

Which THREE of the following are types of machine learning paradigms? (Choose three.)

Select 3 answers

A. Gradient boosting

B. Reinforcement learning

C. Unsupervised learning

D. Quantum computing

E. Supervised learning

Answers: B, C, E

Reinforcement learning involves an agent learning from rewards.

Why this answer

Reinforcement learning is a machine learning paradigm in which an agent learns to make decisions by interacting with an environment, receiving rewards or penalties based on its actions. This trial-and-error approach is distinct from supervised learning (learning from labeled examples) and unsupervised learning (finding structure in unlabeled data), as it focuses on maximizing cumulative reward through exploration and exploitation.

Exam trap

CompTIA often tests candidates by listing specific algorithms (like gradient boosting) or adjacent technologies (like quantum computing) as distractors, hoping you confuse a technique or enabling technology with a fundamental learning paradigm.
