Practice AI0-001 AI Concepts and Foundations questions with full explanations on every answer.
AI Concepts and Foundations — choose a session length
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A company deploys an AI model to predict equipment failure. The model performs well on historical data but fails to generalize to new data from a different factory. Which concept best describes this issue?
2. A data scientist trains a linear regression model to predict house prices. The model has high bias and low variance. Which action would most likely reduce bias?
3. An AI engineer trains a deep learning model for image classification. After training, the training accuracy is 99% but validation accuracy is 85%. Which technique would best address this discrepancy?
4. A company implements a chatbot using a rule-based system. Users complain the chatbot cannot handle new queries. Which AI approach should be considered to improve flexibility?
5. An AI model for detecting fraudulent transactions has high precision but low recall. Which business impact is most likely?
6. A data scientist splits a dataset into training (80%) and test (20%). After training, the model achieves 95% accuracy on training and 60% on test. Which step should the data scientist take first?
7. An organization wants to classify support tickets into categories (billing, technical, etc.). Which type of machine learning is most suitable?
8. Which TWO techniques are commonly used to handle missing data in a dataset?
9. Which THREE factors are common causes of bias in AI systems?
10. Which TWO statements correctly describe the difference between supervised and unsupervised learning?
11. Refer to the exhibit. A deep learning model is being trained. Based on the training log, which problem is most evident?
12. Refer to the exhibit. A data scientist defines a model configuration in JSON. Which component is missing from the configuration for a complete machine learning pipeline?
13. A hospital uses an AI system to prioritize patient triage based on vital signs and medical history. During a trial, the system consistently assigns lower urgency to elderly patients with chronic conditions, even when their symptoms suggest high risk. Which approach best addresses this bias?
14. An e-commerce company deploys a recommendation system using collaborative filtering. After launch, the system shows high accuracy for popular items but fails to recommend niche products to users who would likely buy them. Which technique should the team implement to improve recommendations for long-tail items?
15. A data scientist is training a binary classification model to detect fraudulent transactions. The dataset has 99% legitimate transactions and 1% fraudulent. The model achieves 99% accuracy but fails to catch most fraud. Which metric should the team prioritize to evaluate model performance?
16. A startup is building a chatbot to handle customer inquiries. They want the chatbot to understand context and provide accurate responses without requiring extensive labeled data. Which AI approach is most suitable?
17. Which TWO of the following are key characteristics of unsupervised learning?
18. A company uses the above policy to control AI model access. A data scientist tries to run inference with model "llama-3-70b" at 150 requests in 30 minutes. What will happen?
19. A manufacturing company uses a computer vision AI to inspect products on an assembly line for defects. The AI model was trained on images from a single camera angle under bright, uniform lighting. Recently, the company moved the inspection station to a different part of the factory where lighting is dimmer and varies due to nearby windows. The model now misclassifies many non-defective products as defective, causing false alarms and production delays. The team has limited labeled data from the new environment. Which action should the team take to restore inspection accuracy while minimizing downtime?
20. A financial institution uses a machine learning model to approve loan applications. The model was trained on historical data that inadvertently encoded a bias against applicants from certain zip codes, leading to discriminatory lending practices. A recent audit reveals that the model's decisions are unfair, and regulators require the bank to remediate the bias without significantly reducing overall approval accuracy. The data science team has access to the training data, the model, and a set of fairness metrics. They also have a small, unbiased validation set. Which course of action should the team take to satisfy regulatory requirements?
21. A retail company wants to build a model to predict customer churn based on purchase history and demographics. The dataset includes categorical features like region and gender, and numerical features like total spend. What is the best initial step before training the model?
22. An AI system is being designed to automatically detect fraudulent transactions in real-time. The system must have low latency and high precision to minimize false alarms. Which algorithm is most appropriate?
23. A data scientist trains a deep neural network for image classification. The training loss decreases but validation loss starts increasing after 50 epochs. What should the data scientist do to improve generalization?
24. A company wants to deploy a chatbot that uses natural language understanding (NLU) to answer customer queries. Which AI technique is most suitable for understanding the intent of user input?
25. An AI model is trained to predict loan default. The training data contains 95% non-default and 5% default. Which metric is most appropriate to evaluate model performance given the imbalanced dataset?
26. A self-driving car company is developing an object detection system using a convolutional neural network (CNN). The system needs to detect pedestrians and vehicles in real-time with high accuracy. Which technique can reduce inference time while maintaining accuracy?
27. A marketing team wants to segment customers into groups based on purchasing behavior without predefined categories. Which algorithm should they use?
28. An AI model is being developed for medical diagnosis from X-ray images. The dataset contains only frontal chest X-rays. The model achieves high accuracy on the test set but fails on lateral views. What is the most likely cause?
29. A team is training a deep learning model for natural language processing using a large corpus. They notice the model has a very high number of parameters and training is slow. Which technique can reduce the number of parameters without significant performance loss?
30. A data analyst needs to select two appropriate unsupervised learning techniques for clustering unlabeled data. (Choose two.)
31. When evaluating a binary classification model, which two metrics are most appropriate for imbalanced datasets? (Choose two.)
32. Which three techniques are commonly used to mitigate overfitting in neural networks? (Choose three.)
33. Based on the exhibit, what is the most likely issue with the model training?
34. A data scientist notices the model overfits. Which change to the exhibit's configuration would most likely reduce overfitting?
35. Based on the exhibit, what issue should the team address?
36. A data scientist is preparing a dataset for a classification task. The dataset contains 10,000 rows and 50 features, but many features have missing values. Which approach should the scientist take first to address the missing data?
37. A company built a speech-to-text model using a recurrent neural network (RNN). During deployment, the model performs poorly on accented speech. Which action would most effectively improve model robustness?
38. An AI team is deploying a predictive maintenance model for industrial equipment. The model predicts failure within a 30-day window. The cost of a false positive is 10% of the cost of a false negative. Which evaluation metric should the team prioritize?
39. A marketing team uses a recommendation system to suggest products to customers. The system currently uses collaborative filtering. Which scenario would most likely cause the cold-start problem?
40. A hospital deploys an AI system to detect pneumonia from chest X-rays. The model achieves 95% accuracy on the test set but is later found to be less accurate for patients under 18. The development team suspects bias. Which step should be taken first to investigate?
41. A research team is training a deep neural network for image classification. The training loss decreases rapidly for the first few epochs but then plateaus, while validation loss starts to increase after epoch 10. Which action would best address this issue?
42. A chatbot developer uses a transformer-based model for customer service. Users complain that the chatbot sometimes gives offensive responses. Which technique should be applied first to mitigate this issue?
43. A financial institution uses a regression model to predict credit risk. The model has a high R-squared on training data but low R-squared on test data. Which of the following is the most likely cause?
44. An AI system for autonomous vehicles uses reinforcement learning (RL) to navigate. The reward function encourages reaching the destination quickly but penalizes collisions heavily. The agent learns to drive aggressively, causing minor accidents. Which modification to the reward function would best align the agent's behavior with desired safe driving?
45. A data scientist is building a natural language processing model to classify customer reviews as positive or negative. Which TWO preprocessing steps are most essential before tokenization? (Select two.)
46. A company is implementing an AI solution for fraud detection. The dataset is highly imbalanced (only 1% fraudulent transactions). Which THREE techniques are most appropriate to address class imbalance? (Select three.)
47. A team is deploying a deep learning model that uses a convolutional neural network (CNN) for image recognition. The model achieves high accuracy but is very slow to infer on edge devices. Which THREE optimization techniques should the team consider to speed up inference without significant accuracy loss? (Select three.)
48. Refer to the exhibit. The data scientist notices that the model achieves 98% accuracy on the training set but only 72% on the test set. Which change to the model parameters is most likely to reduce this gap?
49. Refer to the exhibit. The model is a neural network for 10-class classification. The training log shows no improvement over 5 epochs. Which of the following is the most likely root cause?
50. Refer to the exhibit. A team deploys a sentiment analysis model with this policy. After one month, the monitoring system triggers an alert for feature drift. Which action should the team take first?
51. A data scientist is training a model to classify customer support tickets into categories. The dataset has 10,000 labeled examples, but the 'billing' category contains 8,000 examples while the 'technical' category contains 2,000. Which technique is most appropriate to address this imbalance before training?
52. A team deploying an AI model for real-time fraud detection notices that inference latency is too high. The model is a deep neural network with 50 layers, deployed on a cloud GPU. Which of the following is the BEST approach to reduce latency while maintaining acceptable accuracy?
53. An AI system is being developed to diagnose diseases from medical images. The model achieves 99% accuracy on the test set, but when deployed in a different hospital, performance drops significantly. Which of the following is the MOST likely cause?
54. A company wants to use AI to analyze customer reviews and determine sentiment (positive, negative, neutral). Which AI subfield is most directly applicable?
55. A team is training a neural network for image classification. They observe that training loss decreases steadily but validation loss starts increasing after 20 epochs. What is the most likely issue?
56. An organization is developing an AI system to approve loan applications. They want to ensure the model does not discriminate based on race or gender. Which technique BEST addresses this concern?
57. A machine learning engineer wants to evaluate a binary classifier. Which metric is MOST appropriate when the positive class is rare (e.g., 1% of total data)?
58. A company uses a pre-trained language model for a legal document classification task. They have limited labeled data (500 documents). Which strategy is MOST effective for adapting the model to this domain?
59. A team is designing an AI system for autonomous driving. They need to decide between an end-to-end deep learning approach versus a modular pipeline (perception, planning, control). Which is a key advantage of the modular approach?
60. Which TWO of the following are common techniques to reduce overfitting in a neural network?
61. Which TWO of the following are appropriate uses of unsupervised learning?
62. Which THREE of the following are key considerations when deploying an AI model in a production environment?
63. A company uses an AI model to screen job applications. The model is trained on historical hiring data that reflects past biases. After deployment, the model disproportionately rejects candidates from certain demographics. Which concept does this best illustrate?
64. A data scientist wants to group customers into segments based on purchasing behavior without predefined labels. Which type of machine learning is most appropriate?
65. An AI model achieves high accuracy on training data but performs poorly on new test data. The data scientist suspects the model has memorized noise. Which technique directly adds a penalty term to the loss function to address this?
66. A company deploys a chatbot using a large language model (LLM). After launch, users report that the chatbot sometimes generates plausible but false information. This phenomenon is known as:
67. In the AI lifecycle, which phase involves splitting data into training, validation, and test sets?
68. A self-driving car uses an AI model that learns by trial and error, receiving rewards for correct actions and penalties for mistakes. This type of learning is:
69. An AI team notices that their model's performance degrades over time because the statistical relationship between input features and the target variable changes. This issue is called:
70. Which metric is most appropriate for evaluating a binary classification model where the positive class is rare and false positives are costly?
71. A company wants to create an AI system that can identify objects in images. They have a large dataset of labeled images. Which type of neural network architecture is most suitable?
72. A data scientist is preparing a dataset for supervised learning. Which TWO steps are essential?
73. Which THREE are common machine learning algorithms used for regression?
74. A team is deploying an AI model for credit approval. Which TWO ethical considerations must be addressed?
75. Refer to the exhibit. A data scientist observes the training output. Which issue is most likely?
76. Refer to the exhibit. An AI auditor reviews the fairness configuration. What is the purpose of this policy?
77. Refer to the exhibit. A system administrator reviews the deployment. Which action should be taken to meet the SLA?
78. A company is building a recommendation system for an e-commerce platform. They want the system to learn from user purchase history and browsing behavior to suggest products. Which type of machine learning is most appropriate for this task?
79. A data scientist is training a neural network to classify images of animals. The training accuracy is 99%, but validation accuracy is only 65%. Which technique should the data scientist use to address this issue?
80. An AI system is deployed to detect fraudulent transactions. The system flags 5% of transactions as fraudulent, but the actual fraud rate is 0.1%. The business sees many false positives and wants to reduce them without significantly increasing false negatives. Which metric should be prioritized for optimization?
81. A company wants to use AI to automatically categorize customer support tickets into topics like 'billing', 'technical', 'account'. They have 10,000 labeled examples. Which algorithm is most suitable for this task?
82. A machine learning team notices that their model's performance degrades when deployed to a new geographic region. The data distribution in the new region differs from the training data. Which concept best describes this issue?
83. A team is building a natural language processing (NLP) model to analyze customer feedback. They have a large corpus of unlabeled text data and want to generate word embeddings that capture semantic meaning. Which approach should they use?
84. A healthcare provider wants to use AI to predict patient readmission risk. They have structured data (age, diagnosis, lab results) and unstructured clinical notes. Which approach is most appropriate?
85. An AI engineer is tuning a deep learning model and observes that the training loss decreases very slowly. The learning rate is set to 0.001. Which adjustment is most likely to speed up convergence?
86. A company develops an AI model that recommends job candidates. The model inadvertently discriminates against a protected group. Which approach is most effective for mitigating this bias?
87. Which TWO of the following are common activation functions used in neural networks? (Choose two.)
88. Which THREE of the following are types of machine learning paradigms? (Choose three.)
89. Which TWO of the following are techniques used for reducing overfitting in neural networks? (Choose two.)
90. A startup is building a chatbot for customer service. They have 500 recorded conversations and want to use a pre-trained language model to generate responses. However, they have limited computational resources and need the chatbot to respond in real-time. They are considering fine-tuning a large model like GPT-3 or using a smaller model like DistilBERT. The conversation data contains industry-specific jargon. Which approach should they take?
91. A hospital uses an AI system to predict patient deterioration from vital signs. The system currently uses a logistic regression model trained on data from the past year. Recently, the hospital adopted a new patient monitoring device that provides more accurate readings. The model's performance has dropped significantly. The data science team has access to the new device's data for the past month and wants to improve the model with minimal disruption. The team also wants to ensure the model remains interpretable for regulatory compliance. Which approach should they take?
92. A financial institution is deploying a reinforcement learning agent to optimize stock trading decisions. The agent is trained in a simulated environment that mimics historical market data. After deployment, the agent performs well initially but then suffers large losses during a period of high volatility that was underrepresented in the training data. The team wants to make the agent more robust to such market conditions without retraining from scratch. They have a budget for additional simulation compute and access to a broader historical dataset including past crises. The agent uses a deep Q-network (DQN) architecture. Which strategy should they adopt?
93. A financial services company is developing an AI model to detect fraudulent transactions. The dataset contains 99.9% legitimate transactions and 0.1% fraudulent ones. Which technique should the data scientist use to address the class imbalance problem?
94. A data scientist is training a supervised learning model for customer churn prediction. Which TWO types of bias are most likely to affect the model's fairness and accuracy if not addressed?
95. An organization is deploying a deep learning model in production. Which THREE components are essential for maintaining model performance over time?
96. A healthcare startup is building an AI system to predict patient readmission risk. The team collects structured data from electronic health records (EHR) including age, diagnosis codes, lab results, and previous admissions. During initial training, the model achieves 95% accuracy on the validation set but only 60% accuracy on a holdout test set from a different hospital. The data scientist suspects overfitting. Which action should the team take first to improve generalization?
97. A retail company wants to implement a recommendation system using collaborative filtering. The dataset contains user-item interactions (ratings) for 10,000 users and 5,000 products. The matrix is very sparse (99% missing values). The team plans to use matrix factorization to predict missing ratings. However, the training time is excessively long, and the model is not converging. The data engineer suggests using a smaller learning rate and more iterations. Which additional technique should the team apply to speed up training and improve convergence?
98. A government agency is deploying an AI model to screen loan applications. The model uses features like income, credit score, employment history, and zip code. During fairness auditing, the model is found to deny a disproportionately high number of applicants from a particular demographic group, even when controlling for legitimate financial factors. The agency wants to mitigate this bias without significantly reducing overall accuracy. Which approach should the data scientist prioritize?
99. A manufacturing company is using a convolutional neural network (CNN) to detect defects on an assembly line. The model was trained on a balanced dataset of defective and non-defective parts. In production, the model shows high precision (95%) but very low recall (50%). The production line manager wants to minimize missed defects (false negatives). The data scientist has access to the original training data and can retrain the model. Which strategy is most effective for increasing recall while maintaining acceptable precision?
100. A cybersecurity firm is developing an AI system to detect zero-day malware using behavior analysis. The team collects a dataset of 1,000 malware samples and 10,000 benign files from corporate endpoints. The model is a random forest classifier. After deployment, the false positive rate is 5%, which is acceptable, but the detection rate for new malware variants drops to 30%. The security analyst suspects the model is overfitting to the specific malware families in the training set. Which improvement should the team implement first?
101. Which TWO of the following are key stages in the AI lifecycle?
102. Refer to the exhibit. The training log shows loss and accuracy for a binary classification model. What is the most likely issue with this model?
103. A retail company deploys a machine learning model to predict customer churn. The model outputs a probability between 0 and 1, and churn is predicted if probability > 0.5. After deployment, the model has a high false positive rate (many non-churning customers labeled as churn), which leads to unnecessary retention offers and increased costs. The data science team confirms the model was trained on historical data with a balanced class distribution. The business team wants to reduce false positives while maintaining a reasonable true positive rate. However, they cannot retrain the model because the original training data is no longer available. What is the best course of action to reduce false positives?
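Several questions in this list turn on class imbalance, such as the fraud-detection scenario with 99% legitimate and 1% fraudulent transactions. As a rough illustration of why a 99% accuracy figure can coexist with missed fraud, the sketch below works through the arithmetic in Python. The total of 10,000 transactions and the "always predict legitimate" model are assumptions made for the example; they are not taken from any exhibit or official explanation.

```python
# Hypothetical illustration of a 99% / 1% imbalanced fraud dataset.
# The total of 10,000 transactions is an assumed figure for the example.
total = 10_000
fraud = total // 100          # 100 fraudulent transactions (1%)
legit = total - fraud         # 9,900 legitimate transactions (99%)

# Assumed worst case: a model that labels every transaction "legitimate".
true_pos = 0                  # fraud correctly flagged
false_neg = fraud             # fraud missed
false_pos = 0                 # legitimate transactions flagged as fraud
true_neg = legit              # legitimate transactions correctly passed

accuracy = (true_pos + true_neg) / total
recall = true_pos / (true_pos + false_neg)

# Precision is undefined when nothing is flagged; report 0.0 by convention here.
flagged = true_pos + false_pos
precision = true_pos / flagged if flagged else 0.0

print(f"accuracy={accuracy:.2%} recall={recall:.2%} precision={precision:.2%}")
```

Under these assumptions the model scores 99% accuracy while catching no fraud at all, which is why questions of this kind ask about metrics other than accuracy.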
The AI Concepts and Foundations domain covers the key concepts tested in this area of the AI0-001 exam blueprint published by CompTIA. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all AI0-001 domains — no account required.
The Courseiva AI0-001 question bank contains 103 questions in the AI Concepts and Foundations domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the AI Concepts and Foundations domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.