How many AI Infrastructure and Technologies questions are on the AI0-001 exam?

The AI Infrastructure and Technologies domain is one of the weighted domains on the AI0-001 exam. The Courseiva question bank has 100 practice questions for this domain.

Free AI0-001 AI Infrastructure and Technologies Practice Questions (2026)

Q: How can I practice AI Infrastructure and Technologies questions for AI0-001?

Click any of the 100 questions listed on this page to see the full question and explanation, or use the session launcher to start a focused practice session of 10, 20, 30 or 50 questions drawn only from the AI Infrastructure and Technologies domain.

All AI0-001 AI Infrastructure and Technologies questions (100)

Click any question to see the full explanation and answer options, or start a focused practice session above.

A machine learning team is training a large transformer model on a text corpus. They need to reduce training time while maintaining model accuracy. Which hardware configuration would be MOST effective for this task?

An organization wants to integrate an AI-powered summarization feature into their existing web application. The AI service will be called via API. Which factor is MOST important to consider for cost management?

A data science team is deploying a real-time fraud detection model on edge devices in retail stores. The model must run inference in under 10 ms and fit within 50 MB of memory. Which combination of techniques should the team apply?

A company has a TensorFlow model trained on-premises and wants to deploy it on AWS SageMaker for scalable inference. What is the BEST way to package the model for deployment?

A data engineer is building a pipeline to process streaming clickstream data and feed it into a real-time ML feature store. Which tool is BEST suited for the streaming ingestion?

A developer is building a mobile app that uses a pre-trained image classification model on-device. Which framework should they use to run the model on iOS devices?

An ML team uses Kubeflow to orchestrate a pipeline that includes data preprocessing, model training, and evaluation. The pipeline runs on a Kubernetes cluster. After a cluster upgrade, the pipeline fails at the training step with an 'OOMKilled' error. What is the MOST likely cause?

A security team needs to ensure that all data used for AI model training in the cloud is encrypted at rest and in transit. Which set of measures meets this requirement on AWS?

A data scientist is using PyTorch to train a custom NLP model. The training is slow on a single GPU. They want to speed up training by using multiple GPUs on a single machine. Which PyTorch feature should they use?

Which of the following is a key advantage of using ONNX (Open Neural Network Exchange) format for model deployment?

A team is deploying a BERT-based question-answering model using a REST API endpoint with gRPC for internal microservices. They notice high latency for small payloads. Which optimization is MOST likely to reduce latency?

An organization uses Azure Machine Learning to manage the ML lifecycle. They want to automatically retrain a model when new data arrives in Azure Blob Storage. Which Azure service should they integrate with Azure ML to trigger retraining?

A startup is building a recommendation system that requires low-latency similarity search over millions of product embeddings. They need a vector database that offers high performance and has a managed cloud option. Which TWO databases are best suited for this requirement?

A financial services company needs to deploy an ML model for loan approval that must be explainable to regulators. The model is a gradient boosting ensemble. They need to track experiments, log model parameters, and serve the model with explanations. Which THREE tools from the MLOps ecosystem should they use?

A data scientist wants to build a proof-of-concept chatbot using a large language model. They need to choose a cloud AI platform that provides easy access to pre-trained models via API, with built-in safety filters and prompt engineering tools. Which TWO platforms are best suited?

A data scientist needs to train a deep learning model on a large image dataset. Which hardware is most suitable for parallel matrix operations and faster training compared to a CPU?

An ML engineer wants to deploy a model as a REST API that can scale to handle thousands of inference requests per second. Which serving approach is most appropriate?

A team wants to deploy a large language model on edge devices with limited memory and compute. They need to reduce model size by at least 50% while preserving accuracy. Which combination of techniques is most effective?

A company uses a vector database to store embeddings for a RAG application. Users report that some queries return irrelevant results. Which adjustment is most likely to improve relevance?

An AI team uses SageMaker Pipelines to orchestrate their ML workflow. They need to version the pipeline and track experiments across runs. Which complementary MLflow feature should they integrate?

Which AWS service would a developer use to integrate a pre-built foundation model into an application via API, without managing underlying infrastructure?

A team uses Apache Kafka to stream real-time sensor data for ML inference. They need to process the stream, perform feature engineering, and store results in a data lake. Which tool is best suited for this streaming ML pipeline?

An organization must ensure that an AI model deployed on an IoT device meets stringent latency requirements. The model is currently in FP32 and runs at 200ms per inference on the device; the target is 50ms. Which technique will provide the greatest latency reduction with the least accuracy loss?

A company wants to store unstructured text data for AI model training while enabling SQL-based queries for analytics. Which storage solution should they use as the primary data source?

Which open-source framework is commonly used for building, training, and deploying machine learning models and provides high-level APIs like Keras?

A team uses Kubeflow to manage ML workflows on Kubernetes. They want to automate hyperparameter tuning for a training job. Which Kubeflow component should they use?

During inference, a model served via a REST API occasionally returns high latency due to cold starts. The team uses a containerized service on Kubernetes with horizontal pod autoscaling. Which solution minimizes cold start impact while controlling cost?

A company uses Azure OpenAI to generate marketing copy. They need to manage costs and ensure consistent response quality. Which TWO actions should they take?

An organization is building a recommendation system that requires low-latency vector similarity search. They need to store and query millions of embeddings. Which THREE technologies are appropriate for this task?

A data science team uses Vertex AI for model training and deployment. They want to implement CI/CD for ML pipelines. Which THREE Google Cloud services should they integrate?

A machine learning engineer needs to deploy a PyTorch model for real-time inference with low latency. The model uses custom operators that are not supported by standard ONNX conversion. Which deployment approach is MOST appropriate?

A data scientist is training a large language model on a custom dataset using PyTorch on AWS. The training is taking too long due to GPU memory constraints. The team wants to use multiple GPUs across instances with minimal code changes. Which AWS service should they use?

A company wants to build a real-time anomaly detection system for IoT sensor data using edge AI. The model must run on resource-constrained devices with minimal power consumption. Which model optimization technique is MOST important?

A team is building a retrieval-augmented generation (RAG) pipeline. They need to store embeddings of company documents and perform fast similarity searches. Which data store is BEST suited for this task?

A data engineering team needs to orchestrate a complex ML pipeline that involves data extraction, transformation, model training, and deployment. They require scheduling, monitoring, and retry logic. Which MLOps tool is BEST suited for this task?

A company uses Azure OpenAI to generate customer support responses. The team notices that repeated queries with similar context incur high costs due to token usage. They want to reduce costs without affecting response quality. Which strategy is MOST effective?

A developer is using Hugging Face Transformers to fine-tune a BERT model for sentiment analysis. They want to track experiments, log metrics, and compare runs. Which MLOps tool should they integrate?

A company is deploying a computer vision model to smartphones for offline object detection. The model was trained in PyTorch. Which format should they use for deployment on iOS devices?

A data scientist is building a recommendation system using Apache Spark for feature engineering. They need to process streaming user click data in real-time before feeding into the model. Which tool should they use for the streaming data ingestion?

A team is deploying a model on AWS SageMaker and needs to handle variable traffic patterns with automatic scaling based on request latency. They want to minimize costs during low traffic. Which endpoint configuration should they use?

Which hardware accelerator is specifically designed by Google for training and inference of machine learning models, particularly with its TensorFlow framework?

A company is using Google Cloud Vertex AI for model training. They want to automate the retraining pipeline when new data arrives in BigQuery. Which Vertex AI feature should they use?

A data scientist is deploying a model on edge devices using TensorFlow Lite. The model currently uses FP32 precision. Which TWO techniques can reduce the model size and improve inference speed without significant accuracy loss? (Choose TWO.)

A company is building a multi-modal AI application that processes text, images, and audio. They need a unified platform to store embeddings for all modalities, perform hybrid search (vector + metadata filtering), and scale to millions of vectors. Which THREE services are suitable for this purpose? (Choose THREE.)

A team is using Kubeflow to orchestrate ML workflows on Kubernetes. They need to ensure reproducibility, track experiments, and share models across the organization. Which THREE components or tools should they integrate? (Choose THREE.)

A data science team is deploying a deep learning model for real-time inference on edge devices with limited power and memory. Which model optimisation technique would be MOST effective for reducing latency and memory footprint while maintaining acceptable accuracy?

A company uses AWS SageMaker to train a large language model. The training job fails with an out-of-memory error. The team is already using the largest available GPU instance. Which step should the team take to resolve the issue without modifying the model architecture?

An AI team wants to version control datasets, track experiments, and log model parameters across multiple projects. Which MLOps platform is specifically designed for experiment tracking and model management?

A financial institution requires that all AI model predictions be explainable and auditable for regulatory compliance. Which model serving approach should be used to meet these requirements?

A company is implementing a retrieval-augmented generation (RAG) pipeline using a vector database. They notice that the retrieved documents often lack relevance to the query. Which adjustment would MOST improve retrieval quality?

An organisation needs to deploy PyTorch models on mobile devices with minimal latency. Which framework or tool should they use to convert and optimise the model for on-device inference?

A data engineer needs to process streaming clickstream data for real-time feature engineering in an ML pipeline. Which data pipeline technology is BEST suited for this task?

A team is using Hugging Face Transformers to serve an LLM via a REST API. They notice high latency during inference. The model is deployed on a single GPU. Which optimisation would reduce inference latency WITHOUT changing the model architecture?

A healthcare AI startup must store and query high-dimensional embeddings of medical records for a RAG system. They need low-latency similarity search at scale. Which database should they choose?

A company wants to use a pre-trained model from Azure OpenAI but must ensure that customer data is not used to improve the service. Which configuration should they choose?

Which AI accelerator is specifically designed by Google to accelerate the training and inference of large neural networks, especially in its cloud environment?

A team is deploying a model on Kubernetes using Kubeflow. They want to automatically scale the number of inference pods based on request latency. Which Kubernetes-native feature should they configure?

A machine learning engineer wants to track hyperparameter experiments and compare results across runs. Which TWO tools are best suited for this purpose? (Choose 2)

A data scientist needs to store large volumes of unstructured log data for future AI model training. They also need to run SQL-based analytics on the data. Which THREE services are appropriate for this requirement? (Choose 3)

An organisation is deploying a fine-tuned LLM for internal use. They need to ensure the API endpoint is secure and cost-effective. Which TWO measures should they implement? (Choose 2)

A machine learning engineer needs to train a deep neural network on a large image dataset. Which hardware component is specifically optimized for this task due to its high parallel processing capability and is commonly used in AI training?

A data scientist needs to deploy a PyTorch model to production with low-latency inference. The model must be served as a REST API and should support GPU acceleration. Which combination of tools is MOST suitable for this task?

A company is building a recommendation system that uses user embeddings stored in a vector database. The system must retrieve the top 10 most similar items for a given user query. Which vector database feature is MOST critical for this task?

An MLOps team observes that their production inference API experiences increasing latency as more concurrent requests arrive. They need to scale horizontally while maintaining session state of preprocessing steps. Which deployment strategy should they implement?

A developer wants to integrate an AI-powered text summarization API into their application. They need to authenticate securely and manage usage limits. What is the standard mechanism for authenticating with cloud-based AI services?

A data engineering team is building a pipeline to ingest streaming user activity data, process it in real-time, and store features in a feature store for ML models. Which streaming technology is BEST suited for this real-time data ingestion and processing?

A team has trained a large transformer model that achieves 95% accuracy but requires 8 GB of GPU memory for inference. They need to deploy it on edge devices with only 2 GB of memory and minimal accuracy loss. Which combination of techniques should they apply?

An organization wants to centralize experiment tracking, model versioning, and deployment management across its data science team. Which MLOps platform is specifically designed for experiment tracking and model registry?

A company uses AWS SageMaker to train a model and wants to deploy it for real-time inference. They also need to monitor the endpoint for data drift and retrain automatically. Which SageMaker feature enables this automated retraining pipeline?

A team is deploying a machine learning model on a Kubernetes cluster. They need to ensure low-latency inference and efficient resource utilization. Which approach should they use to dynamically scale inference pods based on request volume?

A data scientist is using a Hugging Face transformer model for a sentiment analysis task. They want to optimize inference latency for a mobile app. Which model format and framework combination is BEST suited for on-device deployment?

An AI developer needs to store large amounts of unstructured data (e.g., images, logs) for training datasets. Which cloud storage solution is purpose-built for data lakes?

A team is deploying a model that must comply with GDPR. Users can request deletion of their data. Which TWO practices should be implemented to support this compliance? (Select TWO.)

A machine learning engineer is designing a pipeline to train a computer vision model using PyTorch on a large dataset stored in an S3 data lake. They need to preprocess images (resize, normalize) and stream them efficiently to GPUs. Which THREE components are essential in this pipeline? (Select THREE.)

A data scientist is building a RAG (Retrieval-Augmented Generation) system. They need to store document embeddings and retrieve relevant chunks efficiently. Which TWO technologies are most appropriate for this task? (Select TWO.)

A data scientist needs to train a deep learning model on a large image dataset. Which hardware component is specifically designed to accelerate deep learning training workloads?

An MLOps team wants to deploy a trained PyTorch model to production with low latency inference. The model must be interoperable across different frameworks and runtimes. Which approach is BEST?

A company is deploying a real-time object detection model on a fleet of IoT cameras. The model must run at 30 FPS on a device with limited memory and no internet connectivity. Which combination of techniques is MOST suitable?

A team is using an API from a cloud AI service to generate text. They notice that repeated requests with the same prompt return different outputs. They want consistent responses for testing. Which parameter should they adjust?

A company needs to store large volumes of unstructured data (PDFs, images, logs) for future AI model training. The data must be easily accessible by data scientists using Spark and must support cost-effective storage. Which data infrastructure is MOST appropriate?

A machine learning engineer wants to track experiment parameters, metrics, and model artifacts across multiple runs. Which MLOps tool is specifically designed for experiment tracking?

An organization is deploying a large language model on-premises for compliance reasons. They need to serve inference requests with low latency. Which architecture should they use?

A team uses a retrieval-augmented generation (RAG) system to answer questions from a large enterprise document repository. They observe that the generated answers sometimes contain information not present in the retrieved documents. What is the MOST likely cause?

A company wants to build an AI pipeline that processes streaming data from IoT sensors, performs feature engineering, trains a model incrementally, and deploys the updated model. Which data pipeline technology is BEST suited for the streaming ingestion step?

A developer wants to deploy a scikit-learn model as a REST API endpoint with minimal infrastructure management. Which cloud service is MOST appropriate?

A team is using a cloud AI service with a pay-per-token pricing model. They want to minimize costs while maintaining response quality. Which strategy is MOST effective?

An engineer is deploying a model on edge devices with limited compute. The model was trained in PyTorch. They need to convert it to a format optimized for mobile CPUs. Which framework should they use?

A data science team wants to implement a feature store to serve pre-computed features for both training and inference with low latency. Which TWO tools are commonly used for building a feature store?

A company is building a secure AI system that must comply with GDPR. They want to allow users to request deletion of their personal data from training sets and model outputs. Which THREE techniques should they implement?

A developer is choosing a vector database for a RAG application that requires real-time updates and millisecond query latency. Which TWO vector databases are best suited for this requirement?

A company wants to build a customer service chatbot that answers questions about their internal policy documents. The documents are updated monthly, and the team cannot afford to retrain a model each time. Which approach is MOST appropriate?

A data scientist is choosing a hardware accelerator for training a large transformer model. Which of the following is specifically designed for deep learning workloads and offers the highest throughput for matrix multiplications?

An ML team deploys a model on edge devices using INT8 quantization. They notice a significant drop in accuracy on a subset of classes. Which technique should they apply to recover accuracy without increasing model size?

A healthcare startup needs to deploy an AI model for real-time patient monitoring on IoT devices with limited battery and compute. The model must run locally with minimal latency. Which TWO strategies are most appropriate?

A data engineering team is designing a data pipeline to process streaming sensor data and feed it into an ML model for anomaly detection. Which THREE components are essential for this pipeline?

A team is selecting a vector database for a RAG application that requires low-latency similarity search on millions of embeddings. They prioritize ease of use and fully managed cloud service. Which TWO options meet these requirements?

A machine learning engineer needs to containerize a PyTorch model for deployment on Kubernetes. Which THREE tools or formats should they use?

A company is deploying a large language model via a REST API using a cloud AI service. They expect high traffic and need to minimize latency while controlling costs. Which THREE strategies should they implement?

A team is evaluating MLOps platforms to manage experiments, track model versions, and deploy models to production. Which THREE platforms provide end-to-end capabilities including experiment tracking and model deployment?

A data scientist wants to develop a computer vision model using transfer learning. They need a framework that provides pre-trained models and easy-to-use APIs for data augmentation and training. Which TWO frameworks are best suited for this task?

Other AI0-001 exam domains

AI Security · AI Concepts and Foundations · AI Concepts and Techniques · Machine Learning and Deep Learning · AI Models and Data Engineering · Implementing AI Solutions · AI Implementation and Operations · AI Security, Ethics and Governance · AI Governance and Ethics

Frequently asked questions

What does the AI Infrastructure and Technologies domain cover on the AI0-001 exam?

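The AI Infrastructure and Technologies domain covers the concepts tested in this area of the AI0-001 exam blueprint published by CompTIA. The practice questions on this page span hardware accelerators (GPUs and TPUs), model optimization for edge devices, model serving and scaling, vector databases for RAG, MLOps tooling, streaming data pipelines, and cloud AI services. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all AI0-001 domains, with no account required.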

How many AI Infrastructure and Technologies questions are in the AI0-001 question bank?

The Courseiva AI0-001 question bank contains 100 questions in the AI Infrastructure and Technologies domain. Click any question to see the full explanation and answer breakdown.

What is the best way to practice AI Infrastructure and Technologies for AI0-001?

Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.

Can I practice only AI Infrastructure and Technologies questions for AI0-001?

Yes — the session launcher on this page draws questions exclusively from the AI Infrastructure and Technologies domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.
