Practice AI0-001 AI Infrastructure and Technologies questions with full explanations on every answer.
AI Infrastructure and Technologies — choose a session length
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A machine learning team is training a large transformer model on a text corpus. They need to reduce training time while maintaining model accuracy. Which hardware configuration would be MOST effective for this task?
2. An organization wants to integrate an AI-powered summarization feature into their existing web application. The AI service will be called via API. Which factor is MOST important to consider for cost management?
3. A data science team is deploying a real-time fraud detection model on edge devices in retail stores. The model must infer under 10ms and fit within 50MB memory. Which combination of techniques should the team apply?
4. A company has a TensorFlow model trained on-premises and wants to deploy it on AWS SageMaker for scalable inference. What is the BEST way to package the model for deployment?
5. A data engineer is building a pipeline to process streaming clickstream data and feed it into a real-time ML feature store. Which tool is BEST suited for the streaming ingestion?
6. A developer is building a mobile app that uses a pre-trained image classification model on-device. Which framework should they use to run the model on iOS devices?
7. An ML team uses Kubeflow to orchestrate a pipeline that includes data preprocessing, model training, and evaluation. The pipeline runs on a Kubernetes cluster. After a cluster upgrade, the pipeline fails at the training step with an 'OOMKilled' error. What is the MOST likely cause?
8. A security team needs to ensure that all data used for AI model training in the cloud is encrypted at rest and in transit. Which set of measures meets this requirement on AWS?
9. A data scientist is using PyTorch to train a custom NLP model. The training is slow on a single GPU. They want to speed up training by using multiple GPUs on a single machine. Which PyTorch feature should they use?
10. Which of the following is a key advantage of using ONNX (Open Neural Network Exchange) format for model deployment?
11. A team is deploying a BERT-based question-answering model using a REST API endpoint with gRPC for internal microservices. They notice high latency for small payloads. Which optimization is MOST likely to reduce latency?
12. An organization uses Azure Machine Learning to manage the ML lifecycle. They want to automatically retrain a model when new data arrives in Azure Blob Storage. Which Azure service should they integrate with Azure ML to trigger retraining?
13. A startup is building a recommendation system that requires low-latency similarity search over millions of product embeddings. They need a vector database that offers high performance and has a managed cloud option. Which TWO databases are best suited for this requirement?
14. A financial services company needs to deploy an ML model for loan approval that must be explainable to regulators. The model is a gradient boosting ensemble. They need to track experiments, log model parameters, and serve the model with explanations. Which THREE tools from the MLOps ecosystem should they use?
15. A data scientist wants to build a proof-of-concept chatbot using a large language model. They need to choose a cloud AI platform that provides easy access to pre-trained models via API, with built-in safety filters and prompt engineering tools. Which TWO platforms are best suited?
16. A data scientist needs to train a deep learning model on a large image dataset. Which hardware is most suitable for parallel matrix operations and faster training compared to a CPU?
17. An ML engineer wants to deploy a model as a REST API that can scale to handle thousands of inference requests per second. Which serving approach is most appropriate?
18. A team wants to deploy a large language model on edge devices with limited memory and compute. They need to reduce model size by at least 50% while preserving accuracy. Which combination of techniques is most effective?
19. A company uses a vector database to store embeddings for a RAG application. Users report that some queries return irrelevant results. Which adjustment is most likely to improve relevance?
20. An AI team uses SageMaker Pipelines to orchestrate their ML workflow. They need to version the pipeline and track experiments across runs. Which complementary MLflow feature should they integrate?
21. Which AWS service would a developer use to integrate a pre-built foundation model into an application via API, without managing underlying infrastructure?
22. A team uses Apache Kafka to stream real-time sensor data for ML inference. They need to process the stream, perform feature engineering, and store results in a data lake. Which tool is best suited for this streaming ML pipeline?
23. An organization must ensure that an AI model deployed on an IoT device meets stringent latency requirements. The model is currently in FP32 and runs at 200ms per inference on the device; the target is 50ms. Which technique will provide the greatest latency reduction with the least accuracy loss?
24. A company wants to store unstructured text data for AI model training while enabling SQL-based queries for analytics. Which storage solution should they use as the primary data source?
25. Which open-source framework is commonly used for building, training, and deploying machine learning models and provides high-level APIs like Keras?
26. A team uses Kubeflow to manage ML workflows on Kubernetes. They want to automate hyperparameter tuning for a training job. Which Kubeflow component should they use?
27. During inference, a model served via a REST API occasionally returns high latency due to cold starts. The team uses a containerized service on Kubernetes with horizontal pod autoscaling. Which solution minimizes cold start impact while controlling cost?
28. A company uses Azure OpenAI to generate marketing copy. They need to manage costs and ensure consistent response quality. Which TWO actions should they take?
29. An organization is building a recommendation system that requires low-latency vector similarity search. They need to store and query millions of embeddings. Which THREE technologies are appropriate for this task?
30. A data science team uses Vertex AI for model training and deployment. They want to implement CI/CD for ML pipelines. Which THREE Google Cloud services should they integrate?
31. A machine learning engineer needs to deploy a PyTorch model for real-time inference with low latency. The model uses custom operators that are not supported by standard ONNX conversion. Which deployment approach is MOST appropriate?
32. A data scientist is training a large language model on a custom dataset using PyTorch on AWS. The training is taking too long due to GPU memory constraints. The team wants to use multiple GPUs across instances with minimal code changes. Which AWS service should they use?
33. A company wants to build a real-time anomaly detection system for IoT sensor data using edge AI. The model must run on resource-constrained devices with minimal power consumption. Which model optimization technique is MOST important?
34. A team is building a retrieval-augmented generation (RAG) pipeline. They need to store embeddings of company documents and perform fast similarity searches. Which data store is BEST suited for this task?
35. A data engineering team needs to orchestrate a complex ML pipeline that involves data extraction, transformation, model training, and deployment. They require scheduling, monitoring, and retry logic. Which MLOps tool is BEST suited for this task?
36. A company uses Azure OpenAI to generate customer support responses. The team notices that repeated queries with similar context incur high costs due to token usage. They want to reduce costs without affecting response quality. Which strategy is MOST effective?
37. A developer is using Hugging Face Transformers to fine-tune a BERT model for sentiment analysis. They want to track experiments, log metrics, and compare runs. Which MLOps tool should they integrate?
38. A company is deploying a computer vision model to smartphones for offline object detection. The model was trained in PyTorch. Which format should they use for deployment on iOS devices?
39. A data scientist is building a recommendation system using Apache Spark for feature engineering. They need to process streaming user click data in real-time before feeding into the model. Which tool should they use for the streaming data ingestion?
40. A team is deploying a model on AWS SageMaker and needs to handle variable traffic patterns with automatic scaling based on request latency. They want to minimize costs during low traffic. Which endpoint configuration should they use?
41. Which hardware accelerator is designed by Google specifically for training and inference of machine learning models, particularly those built with its TensorFlow framework?
42. A company is using Google Cloud Vertex AI for model training. They want to automate the retraining pipeline when new data arrives in BigQuery. Which Vertex AI feature should they use?
43. A data scientist is deploying a model on edge devices using TensorFlow Lite. The model currently uses FP32 precision. Which TWO techniques can reduce the model size and improve inference speed without significant accuracy loss? (Choose TWO.)
44. A company is building a multi-modal AI application that processes text, images, and audio. They need a unified platform to store embeddings for all modalities, perform hybrid search (vector + metadata filtering), and scale to millions of vectors. Which THREE services are suitable for this purpose? (Choose THREE.)
45. A team is using Kubeflow to orchestrate ML workflows on Kubernetes. They need to ensure reproducibility, track experiments, and share models across the organization. Which THREE components or tools should they integrate? (Choose THREE.)
46. A data science team is deploying a deep learning model for real-time inference on edge devices with limited power and memory. Which model optimisation technique would be MOST effective for reducing latency and memory footprint while maintaining acceptable accuracy?
47. A company uses AWS SageMaker to train a large language model. The training job fails with an out-of-memory error. The team is already using the largest available GPU instance. Which step should the team take to resolve the issue without modifying the model architecture?
48. An AI team wants to version control datasets, track experiments, and log model parameters across multiple projects. Which MLOps platform is specifically designed for experiment tracking and model management?
49. A financial institution requires that all AI model predictions be explainable and auditable for regulatory compliance. Which model serving approach should be used to meet these requirements?
50. A company is implementing a retrieval-augmented generation (RAG) pipeline using a vector database. They notice that the retrieved documents often lack relevance to the query. Which adjustment would MOST improve retrieval quality?
51. An organisation needs to deploy PyTorch models on mobile devices with minimal latency. Which framework or tool should they use to convert and optimise the model for on-device inference?
52. A data engineer needs to process streaming clickstream data for real-time feature engineering in an ML pipeline. Which data pipeline technology is BEST suited for this task?
53. A team is using Hugging Face Transformers to serve an LLM via a REST API. They notice high latency during inference. The model is deployed on a single GPU. Which optimisation would reduce inference latency WITHOUT changing the model architecture?
54. A healthcare AI startup must store and query high-dimensional embeddings of medical records for a RAG system. They need low-latency similarity search at scale. Which database should they choose?
55. A company wants to use a pre-trained model from Azure OpenAI but must ensure that customer data is not used to improve the service. Which configuration should they choose?
56. Which AI accelerator is specifically designed by Google to accelerate the training and inference of large neural networks, especially in their cloud environment?
57. A team is deploying a model on Kubernetes using Kubeflow. They want to automatically scale the number of inference pods based on request latency. Which Kubernetes-native feature should they configure?
58. A machine learning engineer wants to track hyperparameter experiments and compare results across runs. Which TWO tools are best suited for this purpose? (Choose 2)
59. A data scientist needs to store large volumes of unstructured log data for future AI model training. They also need to run SQL-based analytics on the data. Which THREE services are appropriate for this requirement? (Choose 3)
60. An organisation is deploying a fine-tuned LLM for internal use. They need to ensure the API endpoint is secure and cost-effective. Which TWO measures should they implement? (Choose 2)
61. A machine learning engineer needs to train a deep neural network on a large image dataset. Which hardware component is specifically optimized for this task due to its high parallel processing capability and is commonly used in AI training?
62. A data scientist needs to deploy a PyTorch model to production with low-latency inference. The model must be served as a REST API and should support GPU acceleration. Which combination of tools is MOST suitable for this task?
63. A company is building a recommendation system that uses user embeddings stored in a vector database. The system must retrieve the top 10 most similar items for a given user query. Which vector database feature is MOST critical for this task?
64. An MLOps team observes that their production inference API experiences increasing latency as more concurrent requests arrive. They need to scale horizontally while maintaining session state of preprocessing steps. Which deployment strategy should they implement?
65. A developer wants to integrate an AI-powered text summarization API into their application. They need to authenticate securely and manage usage limits. What is the standard mechanism for authenticating with cloud-based AI services?
66. A data engineering team is building a pipeline to ingest streaming user activity data, process it in real-time, and store features in a feature store for ML models. Which streaming technology is BEST suited for this real-time data ingestion and processing?
67. A team has trained a large transformer model that achieves 95% accuracy but requires 8 GB of GPU memory for inference. They need to deploy it on edge devices with only 2 GB of memory and minimal accuracy loss. Which combination of techniques should they apply?
68. An organization wants to centralize experiment tracking, model versioning, and deployment management across its data science team. Which MLOps platform is specifically designed for experiment tracking and model registry?
69. A company uses AWS SageMaker to train a model and wants to deploy it for real-time inference. They also need to monitor the endpoint for data drift and retrain automatically. Which SageMaker feature enables this automated retraining pipeline?
70. A team is deploying a machine learning model on a Kubernetes cluster. They need to ensure low-latency inference and efficient resource utilization. Which approach should they use to dynamically scale inference pods based on request volume?
71. A data scientist is using a Hugging Face transformer model for a sentiment analysis task. They want to optimize inference latency for a mobile app. Which model format and framework combination is BEST suited for on-device deployment?
72. An AI developer needs to store large amounts of unstructured data (e.g., images, logs) for training datasets. Which cloud storage solution is purpose-built for data lakes?
73. A team is deploying a model that must comply with GDPR. Users can request deletion of their data. Which TWO practices should be implemented to support this compliance? (Select TWO.)
74. A machine learning engineer is designing a pipeline to train a computer vision model using PyTorch on a large dataset stored in an S3 data lake. They need to preprocess images (resize, normalize) and stream them efficiently to GPUs. Which THREE components are essential in this pipeline? (Select THREE.)
75. A data scientist is building a RAG (Retrieval-Augmented Generation) system. They need to store document embeddings and retrieve relevant chunks efficiently. Which TWO technologies are most appropriate for this task? (Select TWO.)
76. A data scientist needs to train a deep learning model on a large image dataset. Which hardware component is specifically designed to accelerate deep learning training workloads?
77. An MLOps team wants to deploy a trained PyTorch model to production with low latency inference. The model must be interoperable across different frameworks and runtimes. Which approach is BEST?
78. A company is deploying a real-time object detection model on a fleet of IoT cameras. The model must run at 30 FPS on a device with limited memory and no internet connectivity. Which combination of techniques is MOST suitable?
79. A team is using an API from a cloud AI service to generate text. They notice that repeated requests with the same prompt return different outputs. They want consistent responses for testing. Which parameter should they adjust?
80. A company needs to store large volumes of unstructured data (PDFs, images, logs) for future AI model training. The data must be easily accessible by data scientists using Spark and must support cost-effective storage. Which data infrastructure is MOST appropriate?
81. A machine learning engineer wants to track experiment parameters, metrics, and model artifacts across multiple runs. Which MLOps tool is specifically designed for experiment tracking?
82. An organization is deploying a large language model on-premises for compliance reasons. They need to serve inference requests with low latency. Which architecture should they use?
83. A team uses a retrieval-augmented generation (RAG) system to answer questions from a large enterprise document repository. They observe that the generated answers sometimes contain information not present in the retrieved documents. What is the MOST likely cause?
84. A company wants to build an AI pipeline that processes streaming data from IoT sensors, performs feature engineering, trains a model incrementally, and deploys the updated model. Which data pipeline technology is BEST suited for the streaming ingestion step?
85. A developer wants to deploy a scikit-learn model as a REST API endpoint with minimal infrastructure management. Which cloud service is MOST appropriate?
86. A team is using a cloud AI service with a pay-per-token pricing model. They want to minimize costs while maintaining response quality. Which strategy is MOST effective?
87. An engineer is deploying a model on edge devices with limited compute. The model was trained in PyTorch. They need to convert it to a format optimized for mobile CPUs. Which framework should they use?
88. A data science team wants to implement a feature store to serve pre-computed features for both training and inference with low latency. Which TWO tools are commonly used for building a feature store?
89. A company is building a secure AI system that must comply with GDPR. They want to allow users to request deletion of their personal data from training sets and model outputs. Which THREE techniques should they implement?
90. A developer is choosing a vector database for a RAG application that requires real-time updates and millisecond query latency. Which TWO vector databases are best suited for this requirement?
91. A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?
92. A data scientist is choosing a hardware accelerator for training a large transformer model. Which of the following is specifically designed for deep learning workloads and offers the highest throughput for matrix multiplications?
93. An ML team deploys a model on edge devices using INT8 quantization. They notice a significant drop in accuracy on a subset of classes. Which technique should they apply to recover accuracy without increasing model size?
94. A healthcare startup needs to deploy an AI model for real-time patient monitoring on IoT devices with limited battery and compute. The model must run locally with minimal latency. Which TWO strategies are most appropriate?
95. A data engineering team is designing a data pipeline to process streaming sensor data and feed it into an ML model for anomaly detection. Which THREE components are essential for this pipeline?
96. A team is selecting a vector database for a RAG application that requires low-latency similarity search on millions of embeddings. They prioritize ease of use and fully managed cloud service. Which TWO options meet these requirements?
97. A machine learning engineer needs to containerize a PyTorch model for deployment on Kubernetes. Which THREE tools or formats should they use?
98. A company is deploying a large language model via a REST API using a cloud AI service. They expect high traffic and need to minimize latency while controlling costs. Which THREE strategies should they implement?
99. A team is evaluating MLOps platforms to manage experiments, track model versions, and deploy models to production. Which THREE platforms provide end-to-end capabilities including experiment tracking and model deployment?
100. A data scientist wants to develop a computer vision model using transfer learning. They need a framework that provides pre-trained models and easy-to-use APIs for data augmentation and training. Which TWO frameworks are best suited for this task?
The AI Infrastructure and Technologies domain covers the key concepts tested in this area of the AI0-001 exam blueprint published by CompTIA. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all AI0-001 domains — no account required.
The Courseiva AI0-001 question bank contains 100 questions in the AI Infrastructure and Technologies domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the AI Infrastructure and Technologies domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.