How many Machine Learning and Deep Learning questions are on the AI0-001 exam?

The Machine Learning and Deep Learning domain is one of the weighted domains on the AI0-001 exam. The Courseiva question bank has 106 practice questions for this domain.

Free AI0-001 Machine Learning and Deep Learning Practice Questions (2026)

How can I practice Machine Learning and Deep Learning questions for AI0-001?

Click any of the 106 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30 or 50 questions drawn only from the Machine Learning and Deep Learning domain.

All AI0-001 Machine Learning and Deep Learning questions (106)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A data scientist is building a classification model to detect fraudulent transactions. The dataset is highly imbalanced with only 1% fraudulent cases. Which approach should the scientist use to evaluate model performance most effectively?

A machine learning team is deploying a model that predicts customer churn. They notice that the model's predictions are highly sensitive to small changes in input features, leading to inconsistent outputs. Which technique should the team apply to improve model stability?

A deep learning model for image classification is overfitting the training data. The team has already tried data augmentation and dropout. Which additional technique should they implement to reduce overfitting?

A company wants to deploy a machine learning model that requires continuous learning as new data arrives. The model must be able to adapt to changing patterns without retraining from scratch. Which approach should be used?

A data engineer is designing a pipeline to train a linear regression model on a dataset with 10 million rows and 50 features. The dataset fits in memory. Which approach should the engineer use to train the model efficiently?

A data scientist is training a convolutional neural network (CNN) for object detection. The training loss decreases rapidly but then plateaus at a high value, and the validation loss starts increasing. Which action should the scientist take to improve the model?

A team is building a recommendation system using collaborative filtering. They have a sparse user-item matrix. Which technique should they use to handle the sparsity and improve recommendations?

Which TWO techniques are commonly used to handle missing data in a machine learning dataset? (Choose TWO.)

Which THREE are common activation functions used in neural networks? (Choose THREE.)

Which TWO are valid techniques to reduce overfitting in a deep neural network? (Choose TWO.)

A data scientist is training a multi-class classifier with 10 classes. The training log shows the above output for the first two epochs. What is the most likely cause?

A team is reviewing a neural network model summary. The input layer expects 784 features (e.g., 28x28 images). How many parameters does the first dense layer have?

A data scientist is training a neural network to classify images of handwritten digits. The model achieves 99% accuracy on training data but only 85% on validation data. Which technique should the scientist apply first to address this issue?

A company is deploying a machine learning model to predict customer churn. The dataset is highly imbalanced (95% non-churn, 5% churn). The model achieves 96% accuracy, but the F1-score for the churn class is only 0.2. Which metric should the team prioritize to evaluate model performance for this business problem?

An autonomous vehicle system uses a deep reinforcement learning agent to navigate. The agent's reward function gives +1 for reaching the destination and -0.1 for each time step. After training, the agent learns to circle the block repeatedly without reaching the destination. Which modification is most likely to fix this behavior?

A machine learning engineer is building a spam filter. The dataset contains 10,000 emails, of which 1,000 are spam. The engineer decides to use a Random Forest classifier. Which preprocessing step is most critical to ensure the model generalizes well to new, unseen emails?

Which TWO techniques are commonly used to prevent overfitting in deep neural networks?

Refer to the exhibit. A data scientist is training a binary classifier. Based on the training log, which problem is the model experiencing?

A healthcare startup is developing a deep learning model to detect diabetic retinopathy from retinal fundus images. The dataset contains 50,000 images, but only 5% are labeled as positive for the disease. The team uses a convolutional neural network (CNN) with a final sigmoid layer and binary cross-entropy loss. After training for 20 epochs, the model achieves 95% accuracy on the test set, but the recall for the positive class is only 10%. The team suspects the model is biased toward the negative class due to class imbalance. The data is stored in a secure environment, and no additional labeled data can be obtained. The team has access to the following techniques: oversampling the minority class, undersampling the majority class, using class weights in the loss function, applying data augmentation, and using a different architecture. Which course of action is most likely to improve recall for the positive class while maintaining reasonable overall performance?

A retail company uses a gradient boosting model to predict customer lifetime value (CLV). The model currently uses 50 features including purchase history, demographics, and web behavior. The model's RMSE on the test set is 120. The data science team wants to improve the model's accuracy without increasing training time significantly. They have access to additional data: customer support interaction logs (text), social media sentiment (text), and third-party credit scores (numeric). They also have the ability to perform feature engineering, hyperparameter tuning, and ensemble methods. Which approach is most likely to yield the best improvement in predictive performance with minimal increase in training time?

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset is highly imbalanced with only 1% fraud cases. Which technique is most appropriate to address the class imbalance?

A machine learning engineer is tuning a neural network for image classification. The training loss decreases steadily, but the validation loss starts increasing after 50 epochs. Which action best addresses this issue?

A company deploys a deep learning model for real-time object detection in autonomous vehicles. The model was trained on high-end GPUs but needs to run on edge devices with limited computational resources. Which technique is most effective for reducing model size and inference latency while maintaining acceptable accuracy?

A data analyst wants to predict housing prices based on square footage, number of bedrooms, and location. Which machine learning approach is most suitable?

A team is training a convolutional neural network (CNN) for medical image diagnosis. They have a limited dataset of 500 labeled images. Which strategy is most effective to improve model generalization?

An AI developer observes that the training accuracy of a neural network is high, but the test accuracy is low. The model uses a ReLU activation function and Adam optimizer. Which approach is most likely to improve test accuracy?

A machine learning engineer needs to choose an algorithm for grouping customers into segments based on purchasing behavior without any labels. Which algorithm should the engineer use?

While training a deep neural network, the loss function fails to converge and oscillates wildly. Which adjustment is most likely to stabilize training?

A data scientist is training a random forest model on a large dataset and notices that the model is overfitting. Which hyperparameter adjustment is most likely to reduce overfitting?

A company is preparing a dataset for training a supervised machine learning model. The dataset contains missing values, outliers, and categorical features. Which two preprocessing steps are typically performed to prepare the data? (Choose two.)

A deep learning engineer is training a convolutional neural network for image classification. The model is overfitting the training data. Which three techniques can help reduce overfitting? (Choose three.)

A data scientist is evaluating a trained binary classification model. The model has high accuracy but the precision is low and recall is high. Which three actions are most appropriate to improve precision? (Choose three.)

Refer to the exhibit. A data scientist is training a neural network and observes the training log above. What is the most likely cause?

Refer to the exhibit. An AI specialist reviews the model evaluation report for a binary classifier. The specialist wants to improve recall. Which action is most likely effective?

Refer to the exhibit. An AI developer implements the above neural network architecture for handwritten digit recognition. The model achieves 85% training accuracy and 83% test accuracy. Which modification is most likely to improve training accuracy?

A data scientist is building a binary classification model to predict customer churn. The dataset has 10,000 samples with 80% non-churn and 20% churn. The model achieves 95% accuracy but fails to identify churners correctly. Which metric should the scientist focus on to evaluate model performance properly?

A team is implementing a machine learning pipeline to classify images for a defect detection system. They are considering using a pre-trained convolutional neural network (CNN) and fine-tuning it on their small dataset. What is the primary advantage of transfer learning in this scenario?

A company uses linear regression to predict sales based on advertising spend. The model's residuals show a pattern of increasing variance as spend increases. Which assumption of linear regression is violated?

An AI engineer is training a deep neural network for image recognition. The training loss decreases steadily for the first few epochs but then plateaus and starts to oscillate. Which adjustment is most likely to improve convergence?

A healthcare organization wants to use patient data to predict disease risk. They are concerned about bias in the model. Which step is most critical during the data preparation phase to mitigate bias?

A team trains a random forest model on a dataset with 50 features. The model's performance on the test set is significantly worse than on the training set. Which technique is most appropriate to address this issue?

A deep learning model for natural language processing uses a recurrent neural network (RNN) to process long sequences. The gradients vanish after many time steps. Which architectural change is most effective to mitigate this problem?

An organization has a dataset with categorical features having high cardinality (e.g., ZIP codes). They plan to use a tree-based model. Which encoding method is most appropriate?

A company deploys a machine learning model that makes predictions on streaming data. Over time, the data distribution shifts, causing model performance to degrade. Which monitoring strategy is most appropriate to detect this drift?

A data scientist is tuning hyperparameters for a support vector machine (SVM) with an RBF kernel. Which two hyperparameters most significantly affect model performance? (Select TWO.)

A team is designing a deep learning pipeline for a computer vision task. They want to reduce overfitting. Which two techniques are specifically effective for this purpose? (Select TWO.)

A data scientist is using an ensemble method to combine multiple models. Which three statements about bagging (Bootstrap Aggregating) are true? (Select THREE.)

Refer to the exhibit. The training log shows losses and accuracies over 5 epochs. What is the most likely problem?

Refer to the exhibit. A developer is using the above configuration for a multi-class classification task. The model performs well on training data but poorly on validation data. Which modification could help?

Refer to the exhibit. The training pod is using 2 GPUs. During training, the GPU utilization is only 30% each. What is the most likely cause?

A data scientist needs to predict whether a customer will churn based on historical data containing features like account age, monthly charges, and support tickets. The target variable is binary (churn or not). Which type of machine learning algorithm should be used?

A team trained a deep neural network on a limited dataset. The training loss decreases consistently, but the validation loss starts increasing after 20 epochs. What is the most likely issue and the best corrective action?

A company is building a computer vision system to detect defects in manufactured parts. They have 10,000 labeled images per class (defective and non-defective). They want to achieve high accuracy with limited computational resources. Which deep learning architecture and approach is most appropriate?

A machine learning engineer has a dataset of 100,000 records. She splits it into 70% training, 15% validation, and 15% test sets. After training, the model achieves 95% accuracy on training and 85% on validation. What does the accuracy difference most likely indicate?

A deep learning model for sentiment analysis uses a softmax output layer. The hidden layers currently use tanh activation. Which activation function should replace tanh to mitigate vanishing gradients in deeper networks?

A fraud detection model is trained on a dataset where only 0.1% of transactions are fraudulent. The model achieves 99.9% accuracy but fails to catch most frauds. Which metric should the team prioritize, and which technique could help?

A dataset contains features on vastly different scales (e.g., age 0-100 vs. income 0-1,000,000). Which preprocessing step is essential before training a neural network?

During training of a neural network, the loss oscillates and does not converge smoothly. The learning rate is set to 0.1. What is the most likely cause and what adjustment should be made?

A team is building a model to predict stock prices based on time series data. They need to capture long-term dependencies and avoid vanishing gradients. Which architecture is best suited?

Which TWO are characteristics of supervised learning?

Which THREE techniques can help reduce overfitting in neural networks?

Which TWO are key differences between Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN)?

Refer to the exhibit. What is the most likely issue and what action should be taken?

Refer to the exhibit. A compliance audit requires that model predictions be explainable for regulatory reasons. Which setting in the deployment configuration supports this requirement?

Refer to the exhibit. What is the recall of the model?

A data scientist trains a linear regression model on housing prices. The training error is low, but test error is high. What is the most likely issue?

A team trains a deep learning model for image classification with 1000 classes. The training loss decreases but validation loss starts increasing after 10 epochs. What should they do first?

A company uses a neural network for fraud detection. The dataset has 99% legitimate, 1% fraudulent. The model achieves 99% accuracy but fails to detect most frauds. Which metric should they focus on?

A data scientist wants to reduce the dimensionality of a dataset with 200 features before training a regression model. Which technique should they use?

A deep learning model for sentiment analysis has millions of parameters and is trained on a small dataset. Which technique can help prevent overfitting?

An organization needs to classify customer emails into categories. They have labeled data for some categories but not all. Which approach should they use?

A machine learning engineer notices that the gradient values in a deep network are becoming extremely small during backpropagation. What is this problem?

A team wants to predict monthly sales using historical data. Which algorithm is most appropriate?

A model trained on a dataset has high bias and low variance. What does this indicate?

Which TWO techniques are commonly used for feature scaling? (Choose two.)

Which THREE are common activation functions used in neural networks? (Choose three.)

Which TWO are evaluation metrics for classification problems? (Choose two.)

Based on the exhibit, what is the likely problem with the model?

The exhibit shows a model configuration for a classification task with 10 classes. What is wrong with this setup?

Based on the exhibit, what does this indicate about the model?

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset is highly imbalanced with 99% legitimate and 1% fraudulent. Which evaluation metric should be prioritized to assess model performance?

A team is deploying a deep learning model for real-time image classification on edge devices with limited computational resources. Which technique would best help reduce model size and inference time without significant accuracy loss?

A machine learning engineer notices that a linear regression model has high bias. Which action is most likely to reduce bias?

A team is developing a recommendation system for an e-commerce platform. They want to use collaborative filtering but are concerned about cold-start problems for new users. Which approach would best mitigate the cold-start problem?

A data scientist is training a deep neural network for sentiment analysis. The training loss decreases steadily but the validation loss starts to increase after 10 epochs. What is the most likely cause and best corrective action?

An organization wants to automate the detection of defective products on an assembly line using computer vision. They have a limited number of labeled images for defective items. Which approach would be most effective?

A machine learning engineer is troubleshooting a recurrent neural network that fails to learn long-range dependencies in sequential data. The gradients are computed using backpropagation through time. Which phenomenon is most likely occurring, and what architectural change would best address it?

A data scientist is using a gradient boosting model (XGBoost) for a regression task and observes that the model's performance on the training set is much better than on the test set. Which hyperparameter tuning strategy would most effectively reduce overfitting?

A deep learning model for autonomous vehicle perception uses a large convolutional neural network. During deployment, the model misclassifies a stop sign that has a small sticker on it. This is likely an example of what type of vulnerability, and which defense is most appropriate?

Which TWO of the following are common activation functions used in deep neural networks?

Which THREE of the following are techniques for handling missing data in machine learning?

Which THREE of the following are best practices for preventing overfitting in deep learning models?

A hospital wants to deploy a machine learning model to predict patient readmission risk within 30 days. They have a dataset with 10,000 records, 70 features including demographics, lab results, and past admissions. The target variable is binary (readmitted or not). The data scientist trains a logistic regression model and achieves an AUC of 0.85 on the test set. However, the hospital's clinicians require interpretability of predictions to trust the model. Which action should the data scientist take to ensure the model meets the interpretability requirement while maintaining performance?

An e-commerce company uses a gradient boosting model to forecast daily sales. Recently, the model's predictions have become less accurate, showing a significant drop in R-squared on validation data. The data scientist checks for data drift but finds no significant changes in feature distributions. The model was trained on data from the past 24 months and is retrained monthly. Upon inspecting the feature importance, the data scientist notices that the top feature 'promotion_flag' has decreased in importance over time. What is the most likely cause of the performance degradation, and what should be done?

A financial institution uses a deep learning model for fraud detection. The model is a feedforward neural network with three hidden layers. It was trained on a balanced dataset of 100,000 transactions. During deployment, the model achieves high accuracy on the test set but the fraud detection rate (true positive rate) is only 40% while the false positive rate is 0.1%. The business requires a true positive rate of at least 80%. Which of the following actions is most likely to achieve the required true positive rate while minimizing the increase in false positives?

A data scientist is training a binary classification model to detect fraudulent transactions. The dataset contains 99.9% legitimate transactions and 0.1% fraudulent transactions. After training a logistic regression model, the accuracy is 99.9%, but the recall for the fraud class is 0%. Which of the following is the MOST likely cause?

A machine learning engineer is preparing to train a deep neural network for image classification. To avoid overfitting, which TWO techniques should the engineer apply? (Select TWO.)

A company is deploying a machine learning model that predicts customer churn. The model currently has high variance. Which THREE actions should the data scientist take to reduce variance? (Select THREE.)

A healthcare startup is developing a diagnostic system using medical images. The team has collected 10,000 labeled images of skin lesions. They plan to train a convolutional neural network (CNN) from scratch. However, training converges slowly, and the validation accuracy plateaus at 70%. The data scientist suspects overfitting. The dataset contains 8,000 images of benign lesions and 2,000 of malignant. The team has limited GPU resources. Which of the following is the MOST effective course of action to improve validation accuracy?
A. Reduce the number of convolutional layers.
B. Apply transfer learning using a pre-trained model on ImageNet.
C. Increase the learning rate by a factor of 10.
D. Add more dropout after every convolutional layer.

A financial institution uses a random forest model to approve loan applications. Recently, the model's false positive rate has increased, leading to more defaults. The data science team reviews the feature importance and finds that the model heavily relies on a feature 'zip code' which correlates with income. The company is concerned about fairness. The regulatory team requires that the model's predictions are not biased against protected groups. Which action BEST addresses the fairness concern while maintaining predictive performance?
A. Remove the 'zip code' feature and retrain the model.
B. Use adversarial debiasing to train a model that is invariant to protected attributes.
C. Add more training data from underrepresented zip codes.
D. Apply a post-processing technique that adjusts thresholds for different groups.

An e-commerce company deploys a deep learning model for product recommendation. After a new data pipeline is implemented, the model's online performance degrades: recall drops by 20% and the click-through rate decreases. The data scientists suspect data drift. They compare the distribution of the input features between the training data and recent production data. The Kolmogorov-Smirnov test shows significant differences for two numerical features (price and rating). The team also notices that the frequency of categorical feature 'category' has changed. Which of the following is the MOST appropriate first step?
A. Immediately retrain the model on all available data including new production data.
B. Roll back to the previous data pipeline and investigate the root cause of drift.
C. Use feature selection to remove the drifting features and retrain.
D. Implement a monitoring dashboard to track drift over time and set up alerts.

A self-driving car company uses a reinforcement learning agent to navigate. The agent was trained in a simulated environment and achieved high rewards. When deployed in the real world, the agent fails to avoid obstacles. The team collects real-world driving data and uses it to fine-tune the model. However, fine-tuning leads to catastrophic forgetting of the simulated knowledge. Which technique should the team use to mitigate this?
A. Increase the learning rate during fine-tuning.
B. Use elastic weight consolidation (EWC) to regularize important weights.
C. Train the model from scratch using only real-world data.
D. Increase the number of layers in the network.

A media company uses a natural language processing (NLP) model to classify news articles into topics. The model was trained on articles from 2015-2018. In 2023, the model's F1 score drops significantly. The data scientists find that the word embeddings no longer capture the meaning of some terms (e.g., 'covid', 'metaverse'). The model uses static word embeddings (Word2Vec) trained on the original corpus. Which solution BEST addresses the observed degradation?
A. Replace static embeddings with contextual embeddings from a transformer model like BERT, then fine-tune the classifier.
B. Retrain the static Word2Vec embeddings on a larger corpus from 2023.
C. Apply data augmentation to the original training data by replacing words with synonyms.
D. Increase the dimensionality of the static embeddings.

A data scientist is preparing a dataset for a binary classification neural network. The dataset contains both numerical and categorical features, and some rows have identical entries. Which TWO preprocessing steps are most essential to improve model performance and avoid overfitting?

Based on the exhibit, what is the most likely issue with the trained model?

A financial institution is developing a fraud detection model using historical transaction data. The dataset contains over 10 million records, but only 0.01% of transactions are fraudulent. The current model uses a neural network trained with standard cross-entropy loss, and the team applies random undersampling of the majority class to create a balanced training set. However, the model still produces a high number of false positives (legitimate transactions flagged as fraud) and misses approximately 30% of actual fraud cases. The business requires that at least 95% of frauds be caught, and the false positive rate must be below 1% to avoid overwhelming fraud analysts. The team has limited resources to collect additional data and cannot change the model architecture significantly. Which approach should the team take to best meet the business requirements?
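
Several of the questions above describe a classifier that reports very high accuracy on a heavily imbalanced dataset while recall on the rare class stays poor. The sketch below illustrates that effect only. The counts (100,000 transactions at a 0.1% fraud rate) and the baseline that labels every transaction as legitimate are made-up assumptions, and the helper name `accuracy_and_recall` is hypothetical. None of it is taken from a question's explanation.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall for the positive class) from two label lists."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(t == p for t, p in pairs)
    true_pos = sum(t == positive and p == positive for t, p in pairs)
    actual_pos = sum(t == positive for t in y_true)
    accuracy = correct / len(y_true)
    # Guard against a dataset that contains no positive examples at all.
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Hypothetical data: 100,000 transactions, 0.1% of them fraudulent (label 1).
y_true = [1] * 100 + [0] * 99_900

# Hypothetical baseline that predicts "legitimate" (label 0) for every transaction.
y_pred = [0] * len(y_true)

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.3f} recall={recall:.3f}")
```

With these assumed counts the baseline is right on 99,900 of 100,000 transactions yet catches none of the 100 fraud cases, which is why such questions point toward recall, precision, or F1 rather than accuracy alone.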

Other AI0-001 exam domains

AI Concepts and Foundations
AI Models and Data Engineering
AI Implementation and Operations
AI Security, Ethics and Governance

Frequently asked questions

What does the Machine Learning and Deep Learning domain cover on the AI0-001 exam?

The Machine Learning and Deep Learning domain covers the key concepts tested in this area of the AI0-001 exam blueprint published by CompTIA. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all AI0-001 domains — no account required.

How many Machine Learning and Deep Learning questions are in the AI0-001 question bank?

The Courseiva AI0-001 question bank contains 106 questions in the Machine Learning and Deep Learning domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice Machine Learning and Deep Learning for AI0-001?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only Machine Learning and Deep Learning questions for AI0-001?

Yes — the session launcher on this page draws questions exclusively from the Machine Learning and Deep Learning domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
