Scenario-based practice

Hard Difficulty Questions

Practise CompTIA AI+ AI0-001 questions: original exam-style scenarios covering every exam domain, with detailed explanations, wrong-answer analysis, and common exam traps.

20 scenario questions | Exam code: AI0-001 | Vendor: CompTIA

Scenario guide

How to approach hard difficulty questions

These are the questions most candidates get wrong. They require connecting multiple concepts, reading tricky output, or knowing edge-case behaviour that isn't on most study cards. Practising them trains you to operate under uncertainty, a necessary skill on the real exam.

Quick answer

Hard difficulty questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Related practice questions

Related AI0-001 topic practice pages

Scenario questions usually connect to one or more exam topics. Use these links to review the underlying concepts behind the scenario.

Practice set

Practice scenarios

Question 1 (hard, multiple choice)

An AI system used for autonomous driving is found to have a lower accuracy in detecting pedestrians with darker skin tones. The development team wants to address this ethical issue. Which action is most effective?

Question 2 (hard, multiple choice)

A model trained on a dataset with imbalanced classes achieves 98% accuracy but only 50% recall for the minority class. Which technique should be applied first to address the imbalance?
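
The two figures in this scenario can coexist, as a small worked example shows. The sketch below assumes a hypothetical test set of 1,000 samples with 40 in the minority class. These counts are illustrative and are not part of the question.

```python
# Hypothetical confusion-matrix counts, assumed for illustration only.
tp, fn = 20, 20    # minority class: half detected, half missed
tn, fp = 960, 0    # majority class: every sample classified correctly

# Accuracy counts all correct predictions, so the large majority class dominates it.
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Recall looks only at the minority class samples.
minority_recall = tp / (tp + fn)

print(f"accuracy        = {accuracy:.2f}")
print(f"minority recall = {minority_recall:.2f}")
```

With these assumed counts, accuracy stays high even though half of the minority class is missed, which is why accuracy alone says little about an imbalanced dataset.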

Question 3 (hard, multiple choice)

An MLOps team automates model deployment with a CI/CD pipeline. A performance regression is detected after deploying a new model version. The team needs to automatically roll back to the previous version. Which approach best enables safe automated rollback?

Question 4 (hard, multi select)

A data scientist is evaluating a binary classification model for fraud detection. The dataset is highly imbalanced (99% non-fraud, 1% fraud). Which TWO metrics are most appropriate for assessing model performance? (Choose two.)

Question 5 (hard, multiple choice)

Refer to the exhibit. A deep learning model is being trained. Based on the training log, which problem is most evident?

Exhibit

```
Epoch 1/10
 - loss: 1.2345 - accuracy: 0.6543 - val_loss: 1.9876 - val_accuracy: 0.4321
Epoch 2/10
 - loss: 1.0123 - accuracy: 0.7123 - val_loss: 2.3456 - val_accuracy: 0.3987
Epoch 3/10
 - loss: 0.8765 - accuracy: 0.7654 - val_loss: 2.8765 - val_accuracy: 0.3654
```
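
One way to read a log like this is to compare the direction of the training loss with the direction of the validation loss. The sketch below parses the three exhibit lines and checks whether the two curves move apart, a pattern commonly associated with overfitting. The helper names `parse_metric`, `falling`, and `rising` are hypothetical and are not part of any training framework.

```python
import re

# Log lines copied from the exhibit.
LOG = """\
Epoch 1/10
 - loss: 1.2345 - accuracy: 0.6543 - val_loss: 1.9876 - val_accuracy: 0.4321
Epoch 2/10
 - loss: 1.0123 - accuracy: 0.7123 - val_loss: 2.3456 - val_accuracy: 0.3987
Epoch 3/10
 - loss: 0.8765 - accuracy: 0.7654 - val_loss: 2.8765 - val_accuracy: 0.3654
"""

def parse_metric(log, name):
    """Return every value logged for `name`, in epoch order.

    The lookbehind stops "loss" from also matching inside "val_loss".
    """
    pattern = rf"(?<!\w){re.escape(name)}: ([0-9.]+)"
    return [float(value) for value in re.findall(pattern, log)]

def falling(values):
    return all(later < earlier for earlier, later in zip(values, values[1:]))

def rising(values):
    return all(later > earlier for earlier, later in zip(values, values[1:]))

train_loss = parse_metric(LOG, "loss")
val_loss = parse_metric(LOG, "val_loss")

# Training loss improving while validation loss worsens means the curves diverge.
diverging = falling(train_loss) and rising(val_loss)
print("train loss:", train_loss)
print("val loss:  ", val_loss)
print("curves diverging:", diverging)
```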

Question 6 (hard, multiple choice)

A large e-commerce company uses a recommendation system based on collaborative filtering. The system uses a matrix factorization model that is trained nightly on the entire user-item interaction history. Recently, the company launched a flash sale with thousands of new products. Users are reporting that the recommendations are not showing the new products, even for users who have purchased them during the sale. The data engineering team notices that the new products have very few interactions in the training data. The model's loss on the validation set has increased, and the recall@10 metric has dropped from 0.45 to 0.32. The team needs to improve the recommendation of new items without retraining the entire model from scratch every hour. Which approach should the team take?

Question 7 (hard, multiple choice)

A data scientist is training a convolutional neural network (CNN) for object detection. The training loss decreases rapidly but then plateaus at a high value, and the validation loss starts increasing. Which action should the scientist take to improve the model?

Question 8 (hard, multiple choice)

A data scientist is training a multi-class classifier with 10 classes. The exhibit shows the training log for the first two epochs. What is the most likely cause?

Exhibit

```
Epoch 1/10
 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
Epoch 2/10
 - loss: 2.3026 - accuracy: 0.1000 - val_loss: 2.3026 - val_accuracy: 0.1000
```
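
The specific numbers in this exhibit are worth checking by hand. The sketch below computes the cross-entropy loss of a model that assigns equal probability to each of 10 classes, and the accuracy expected from guessing. It assumes balanced classes and natural-log cross-entropy, neither of which is stated in the question.

```python
import math

num_classes = 10

# Cross-entropy when every class receives probability 1/num_classes.
uniform_loss = -math.log(1 / num_classes)

# Expected accuracy of a constant or random guess, assuming balanced classes.
chance_accuracy = 1 / num_classes

print(round(uniform_loss, 4))   # 2.3026
print(chance_accuracy)          # 0.1
```

Under those assumptions, both values match the log exactly, which suggests the model's outputs are not changing between epochs.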

Question 9 (hard, multiple choice)

An organization is implementing an AI-powered chatbot for customer service. The chatbot must comply with GDPR and handle data subject access requests (DSARs). Which design approach best ensures compliance?

Question 10 (hard, multi select)

A team is deploying a deep learning model that uses a convolutional neural network (CNN) for image recognition. The model achieves high accuracy but is very slow to infer on edge devices. Which THREE optimization techniques should the team consider to speed up inference without significant accuracy loss? (Select three.)

Question 11 (hard, multiple choice)

A self-driving car company is testing an AI model for pedestrian detection. During simulation, the model fails to detect pedestrians in low-light conditions. The safety team wants to improve robustness without retraining the entire model from scratch. Which approach is most appropriate?

Question 12 (hard, multiple choice)

An AI system for autonomous vehicles uses reinforcement learning (RL) to navigate. The reward function encourages reaching the destination quickly but penalizes collisions heavily. The agent learns to drive aggressively, causing minor accidents. Which modification to the reward function would best align the agent's behavior with desired safe driving?

Question 13 (hard, multi select)

Which TWO of the following are techniques used for reducing overfitting in neural networks? (Choose two.)

Question 14 (hard, multiple choice)

Refer to the exhibit. A team deploys a sentiment analysis model with this policy. After one month, the monitoring system triggers an alert for feature drift. Which action should the team take first?

Exhibit

JSON policy for model deployment:

```
{
  "model": "sentiment_analysis_v2",
  "threshold": 0.7,
  "fairness_check": {
    "protected_attributes": ["gender", "age_group"],
    "metric": "demographic_parity",
    "tolerance": 0.05
  },
  "explainability": {
    "method": "LIME",
    "num_features": 5
  },
  "monitoring": {
    "drift_detection": {
      "feature_drift": true,
      "prediction_drift": true,
      "alert_threshold": 0.2
    }
  }
}
```
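
To make the monitoring section of the policy concrete, the sketch below shows one way a monitor might compare drift scores with `alert_threshold`. The exhibit does not say how drift is scored or whether the comparison is strict, so the `drift_alerts` helper, the `observed` scores, and the use of `>` are all assumptions made for illustration.

```python
import json

# Policy copied from the exhibit.
POLICY = json.loads("""
{
  "model": "sentiment_analysis_v2",
  "threshold": 0.7,
  "fairness_check": {
    "protected_attributes": ["gender", "age_group"],
    "metric": "demographic_parity",
    "tolerance": 0.05
  },
  "explainability": {"method": "LIME", "num_features": 5},
  "monitoring": {
    "drift_detection": {
      "feature_drift": true,
      "prediction_drift": true,
      "alert_threshold": 0.2
    }
  }
}
""")

def drift_alerts(policy, scores):
    """Return the enabled drift checks whose score exceeds the alert threshold.

    Hypothetical helper: assumes one scalar score per drift type and a strict
    greater-than comparison.
    """
    config = policy["monitoring"]["drift_detection"]
    limit = config["alert_threshold"]
    return [
        kind for kind, score in scores.items()
        if config.get(kind, False) and score > limit
    ]

# Made-up scores: features have shifted, predictions have not (yet).
observed = {"feature_drift": 0.27, "prediction_drift": 0.08}
print(drift_alerts(POLICY, observed))
```

With these made-up scores only the feature drift check fires, which mirrors the situation described in the question: the inputs have shifted, while the policy's other checks have raised nothing.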

Question 15 (hard, multi select)

Which THREE of the following are key considerations when deploying an AI model in a production environment?

Question 16 (hard, multi select)

An organization is deploying a deep learning model in production. Which THREE components are essential for maintaining model performance over time?

Question 17 (hard, multiple choice)

An ML engineering team has a retraining pipeline that triggers automatically when model accuracy drops below a threshold. Recently, the model's accuracy has been fluctuating, causing frequent retraining and high compute costs. The team suspects the data distribution is changing slowly. Which approach should the team implement to reduce unnecessary retraining while maintaining model performance?
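
To see why a raw accuracy threshold can fire repeatedly on a noisy signal, the sketch below counts threshold crossings on a made-up accuracy series, before and after a rolling mean. All values, the window size, and the helper names are hypothetical. Smoothing is shown only to illustrate the noise problem in the scenario and is not presented as the expected answer.

```python
from collections import deque

def count_triggers(values, threshold):
    """Count evaluations that would start a retraining run."""
    return sum(1 for value in values if value < threshold)

def rolling_mean(values, window):
    """Mean of the most recent `window` values at each step."""
    recent = deque(maxlen=window)
    means = []
    for value in values:
        recent.append(value)
        means.append(sum(recent) / len(recent))
    return means

# Made-up accuracy readings in percent, fluctuating around 91.
accuracy_pct = [92, 89, 93, 88, 92, 89, 93, 89, 92, 88]
THRESHOLD = 90

raw_triggers = count_triggers(accuracy_pct, THRESHOLD)
smoothed_triggers = count_triggers(rolling_mean(accuracy_pct, 5), THRESHOLD)

print("raw triggers:     ", raw_triggers)
print("smoothed triggers:", smoothed_triggers)
```

On this made-up series the raw threshold fires on half of the evaluations, while the smoothed signal never crosses it, even though the underlying level has not changed.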

Question 18 (hard, multiple choice)

A financial institution needs to integrate an AI-based credit scoring model into an existing mainframe system that processes transactions in COBOL. The model is deployed as a REST API. What is the best strategy to ensure minimal disruption and maintain data integrity?

Question 19 (hard, multiple choice)

A deep learning model for image classification is overfitting due to a small dataset. The team decides to apply data augmentation. Which augmentation technique is least likely to preserve the label?

Question 20 (hard, multiple choice)

A healthcare startup is deploying a machine learning model to predict patient readmission within 30 days using electronic health records (EHR). The data pipeline uses Apache Spark for preprocessing and training on an Amazon EMR cluster. The training dataset is 50 GB and composed of structured numeric and categorical features, along with unstructured clinical notes. The data scientist observes that training takes over 12 hours and frequently fails due to out-of-memory (OOM) errors, especially when processing the clinical notes via TF-IDF vectorization. The cluster has 10 nodes with 64 GB RAM each. The data engineer has already tried increasing spark.sql.shuffle.partitions to 400 and using Kryo serialization, but OOM persists. Which action should the data engineer take next to resolve the OOM errors?

These AI0-001 practice questions are part of Courseiva's free CompTIA certification practice question bank. Courseiva provides original exam-style AI0-001 questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.