CCNA AI Implementation and Operations Questions

76

Multi-Select · easy

An organization wants to implement a robust MLOps pipeline. Which THREE components are essential for a complete MLOps framework? (Choose three.)

Select 3 answers

A. Continuous integration and continuous deployment (CI/CD) pipeline

B. Automated testing and validation

C. Automated code review and approval gates

D. Real-time model monitoring dashboard

E. Version control for data and model code

Answers: A, B, E

CI/CD automates model building and deployment.

Why this answer

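A is correct because a CI/CD pipeline automates the integration of code changes and deployment of models into production, ensuring consistent and reliable releases. In MLOps, this includes building, testing, and deploying both application code and ML model artifacts, which is fundamental for operationalizing machine learning at scale. B is correct because automated testing and validation catch code, data, and model-quality regressions before a release reaches production. E is correct because versioning data together with model code makes each trained model reproducible and traceable.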

Exam trap

The exam often tests the distinction between 'essential framework components' and 'optional operational tools,' leading candidates to mistakenly select monitoring dashboards or code review gates as core MLOps requirements instead of the foundational pillars of CI/CD, automated testing, and version control.

77

MCQ · medium

A model serving pod is failing with OOMKilled. What is the most likely cause?

A. The container image is corrupted

B. The model version is outdated

C. The model requires more memory than the 2Gi limit

D. The Kubernetes cluster has run out of disk space

Answer: C

The pod was killed because it used more memory than allowed.

Why this answer

Option C is correct because an OOMKilled error in Kubernetes indicates that a container exceeded its memory limit and was terminated by the Out Of Memory (OOM) killer. The most common cause is that the model's inference or training workload requires more memory than the configured resource limit (e.g., 2Gi), forcing the kernel to kill the process. In other words, the configured memory limit is lower than what the workload actually consumes.

Exam trap

The exam often tests the distinction between OOMKilled (memory limit exceeded) and other pod failure reasons like CrashLoopBackOff (application crash) or ImagePullBackOff (image issues), so candidates must associate OOMKilled specifically with memory resource constraints, not general pod failures.

How to eliminate wrong answers

Option A is wrong because a corrupted container image would typically cause an ImagePullBackOff or CrashLoopBackOff error, not an OOMKilled termination, which is specifically a memory-related kernel action. Option B is wrong because an outdated model version might cause performance or accuracy issues, but it does not directly trigger the OOM killer; memory exhaustion is a resource constraint, not a version compatibility problem. Option D is wrong because running out of disk space on the Kubernetes cluster would result in Evicted pods or ImagePullBackOff errors due to node pressure, not an OOMKilled status, which is tied to memory limits enforced by cgroups.
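
The comparison behind an OOMKilled termination can be sketched in a few lines. This is a minimal illustration, not Kubernetes code: the helper names `parse_memory` and `would_be_oom_killed` are hypothetical, and only the common memory suffixes are handled.

```python
# Common suffixes used in Kubernetes memory quantities (binary and decimal).
_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '2Gi' or '512Mi' to a byte count."""
    # Check two-character suffixes first so 'Gi' is not mistaken for 'G'.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if quantity.endswith(suffix):
            number = quantity[: -len(suffix)]
            return int(float(number) * _SUFFIXES[suffix])
    return int(quantity)  # no suffix: plain bytes


def would_be_oom_killed(peak_usage: str, limit: str) -> bool:
    """True when the workload's peak memory exceeds the container limit."""
    return parse_memory(peak_usage) > parse_memory(limit)
```

For instance, a model that peaks at 2500Mi does not fit under a 2Gi limit, while one that peaks at 1500Mi does.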

78

MCQ · easy

A company deploys an AI model via a REST API that handles sensitive customer data. To secure the endpoint, the security team requires that only authenticated and authorized applications can invoke the API. Which mechanism should be implemented?

A. API key or bearer token in the HTTP header

B. TLS encryption for the connection

C. Input sanitization to prevent injection

D. IP whitelisting

Answer: A

API keys/tokens authenticate the caller and are standard for API security.

Why this answer

Option A is correct because API keys or bearer tokens (e.g., OAuth 2.0 access tokens) are the standard mechanism for authenticating and authorizing client applications when invoking a REST API. These tokens are passed in the HTTP Authorization header, allowing the server to verify the client's identity and permissions before processing requests containing sensitive customer data.

Exam trap

The exam often tests the distinction between transport-layer security (TLS) and application-layer authentication. Candidates mistakenly choose TLS because it 'secures' the endpoint, but it does not verify who is calling the API.

How to eliminate wrong answers

Option B is wrong because TLS encryption secures data in transit but does not authenticate or authorize the calling application; it only prevents eavesdropping and tampering. Option C is wrong because input sanitization protects against injection attacks (e.g., SQL injection) but does not verify the identity or authorization of the API caller. Option D is wrong because IP whitelisting restricts access based on source IP addresses, which can be spoofed or shared, and does not provide per-application authentication or authorization; it is a network-layer control, not an application-layer identity mechanism.
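
A bearer-token check of the kind described above might look like the following sketch. The token store, the application name, and the scope names are all hypothetical; a real service would typically validate signed tokens issued by an identity provider rather than look them up in a dictionary.

```python
import hmac

# Hypothetical token store: token -> (application name, allowed scopes).
TOKENS = {"s3cr3t-token-a": ("billing-app", {"predict"})}


def authorize(headers: dict, required_scope: str) -> tuple[int, str]:
    """Return (HTTP status, detail) for a request's Authorization header."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "Bearer" or not token:
        return 401, "missing or malformed bearer token"
    for known, (app, scopes) in TOKENS.items():
        # Constant-time comparison avoids leaking token contents via timing.
        if hmac.compare_digest(token.encode(), known.encode()):
            if required_scope in scopes:
                return 200, app  # authenticated and authorized
            return 403, "token lacks required scope"
    return 401, "unknown token"
```

The 401 and 403 branches separate the two questions the scenario asks: who is calling (authentication) and whether that caller may invoke this operation (authorization).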

79

MCQ · medium

You are an AI engineer at a financial services firm. The company has deployed a gradient boosting model to predict loan default risk. The model takes features such as credit score, debt-to-income ratio, loan amount, and employment length. In production, the model processes about 10,000 predictions per day with an average latency of 50ms. Recently, the accuracy has dropped from 92% to 85%. You also notice that the average credit score of applicants has increased significantly because the marketing team launched a campaign targeting prime borrowers. The model was originally trained on data from the past three years, which included a mix of prime and subprime borrowers. You need to restore model performance while minimizing downtime and retraining cost. Which action should you take first?

A. Add a regularization term to penalize high credit scores.

B. Deploy an ensemble of the original model and a neural network.

C. Reject all predictions where the confidence score is below 0.9.

D. Retrain the model using the last three months of production data with labels.

Answer: D

Retraining with recent data realigns the model with the current applicant pool, directly addressing the covariate shift.

Why this answer

The drop in accuracy is due to data drift—the production data now has a different distribution (higher credit scores) than the training data. Retraining on the most recent three months of production data with labels directly addresses this shift by adapting the model to the new population, and it minimizes downtime because it uses existing infrastructure and avoids complex architectural changes.

Exam trap

The exam often tests the misconception that model performance degradation is always due to model architecture or hyperparameters. Candidates who miss data drift as the primary cause tend to choose complex solutions like ensembles or threshold adjustments instead of retraining on recent data.

How to eliminate wrong answers

Option A is wrong because adding a regularization term to penalize high credit scores would artificially bias the model against a legitimate feature value, reducing accuracy rather than correcting for distribution shift. Option B is wrong because deploying an ensemble with a neural network adds complexity, latency, and retraining cost without addressing the root cause of data drift, and it may not be feasible with the current 50ms latency requirement. Option C is wrong because rejecting predictions with confidence below 0.9 would discard many valid predictions (especially if the model is miscalibrated due to drift), reducing throughput and not fixing the underlying accuracy issue.
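
Before retraining, the shift in a feature such as credit score can be quantified. The sketch below uses the Population Stability Index (PSI), which is one common drift measure and not something the question itself prescribes; the bin edges and sample values are made up for illustration, and the 0.25 cut-off is only a widely quoted rule of thumb.

```python
import math


def psi(expected, actual, edges):
    """Population Stability Index between two samples over shared bin edges."""

    def fractions(values):
        counts = [0] * (len(edges) - 1)
        for v in values:
            for i in range(len(counts)):
                is_last = i == len(counts) - 1
                # Bins are [low, high); the last bin also includes its upper edge.
                if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                    counts[i] += 1
                    break
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical credit scores: mixed training pool vs. prime-heavy production pool.
training_scores = [580, 620, 660, 700, 740, 780]
production_scores = [690, 720, 750, 770, 790, 810]
score_edges = [300, 600, 700, 850]
```

Identical samples give a PSI of zero, while the prime-heavy production sample scores far above the rule-of-thumb level, which is the kind of signal that would justify retraining on recent labeled data.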

80

MCQ · medium

A healthcare company must deploy a diagnostic AI model that uses protected health information (PHI). To comply with HIPAA, the operations team needs to ensure data privacy during model inference. Which practice should be implemented?

A. Run the model on-premises to avoid cloud data transmission

B. Encrypt all PHI at rest and in transit within the inference pipeline

C. Mask sensitive fields in the input data before inference

D. Apply differential privacy during model training only

Answer: B

Encryption ensures confidentiality of PHI.

Why this answer

Option B is correct because the HIPAA Security Rule identifies encryption of protected health information (PHI), both at rest and in transit, as a technical safeguard for data confidentiality, and that protection has to cover model inference as well. Encrypting the entire inference pipeline ensures that even if data is intercepted or accessed without authorization, it remains unreadable. This practice directly addresses the compliance requirement for data privacy without relying on network location or partial obfuscation.

Exam trap

The exam often tests the misconception that on-premises deployment or data masking alone satisfies HIPAA, when in fact encryption of PHI at rest and in transit is the technical safeguard the HIPAA Security Rule points to for protecting confidentiality.

How to eliminate wrong answers

Option A is wrong because running the model on-premises does not inherently ensure data privacy; PHI could still be exposed through insecure storage, unencrypted logs, or internal network breaches, and HIPAA requires encryption regardless of deployment location. Option C is wrong because masking sensitive fields before inference only obscures data at the input stage, but the model may still process and output PHI in intermediate layers or results, leaving the pipeline vulnerable. Option D is wrong because differential privacy applied only during training does not protect PHI during inference; inference-time data must be protected with encryption and access controls to comply with HIPAA's operational requirements.

81

MCQ · medium

An AIOps platform monitors server metrics and triggers alerts. The team notices too many false positives. Which adjustment should be made to the anomaly detection model?

A. Use a more complex model to better fit the data.

B. Shorten the observation window to detect anomalies faster.

C. Increase the training data to include more normal patterns.

D. Raise the anomaly score threshold for triggering alerts.

Answer: D

A higher threshold means only more extreme deviations trigger alerts.

Why this answer

Raising the anomaly score threshold (Option D) directly reduces false positives by requiring a higher deviation from normal behavior before an alert is triggered. In AIOps platforms, the anomaly score is a numeric value (e.g., 0–100) that quantifies how unusual a metric is; a higher threshold means only more extreme deviations generate alerts, filtering out minor fluctuations that were incorrectly flagged.

Exam trap

The exam often tests the misconception that adding more data or using a more complex model inherently improves accuracy, when in fact threshold tuning is the direct lever for controlling false positive rates in operational AIOps systems.

How to eliminate wrong answers

Option A is wrong because using a more complex model increases the risk of overfitting to noise in the training data, which can actually increase false positives by treating random variations as anomalies. Option B is wrong because shortening the observation window makes the model more sensitive to short-term spikes and noise, which typically increases false positives rather than reducing them. Option C is wrong because increasing training data with more normal patterns can improve baseline accuracy, but it does not directly control the alerting sensitivity; false positives are primarily managed by the threshold, not by adding more normal data.
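
The effect of the threshold can be shown with a handful of scored events. The scores and incident labels below are invented for illustration, and `alert_stats` is a hypothetical helper, not part of any AIOps product.

```python
def alert_stats(scores, is_real_incident, threshold):
    """Return (false_positives, true_positives) for a given alert threshold."""
    fired = [(s >= threshold, real) for s, real in zip(scores, is_real_incident)]
    false_positives = sum(1 for alert, real in fired if alert and not real)
    true_positives = sum(1 for alert, real in fired if alert and real)
    return false_positives, true_positives


# Hypothetical anomaly scores on a 0-100 scale, with ground-truth labels.
scores = [35, 52, 58, 61, 64, 71, 88, 93]
labels = [False, False, False, False, False, False, True, True]
```

With a threshold of 50 this sample produces five false alerts alongside the two real incidents; at 80 the false alerts disappear and both incidents are still caught. Pushing the threshold to 90 would start missing real incidents, so the setting is a trade-off and not a value to raise indefinitely.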

82

MCQ · easy

A company deployed a machine learning model on a cloud inference service. Users report high latency during peak hours. The model is deployed on a single instance. Which action should the team take to reduce latency without significant architectural changes?

A. Increase the model size to improve accuracy

B. Switch to a batch inference pipeline

C. Enable autoscaling for the inference instances

D. Add an API gateway to route requests

Answer: C

Autoscaling adds capacity during peak demand, reducing latency.

Why this answer

Enabling autoscaling (Option C) allows the inference service to automatically add instances during high demand, distributing the load and reducing latency. Increasing the model size (Option A) would worsen latency. Switching to a batch inference pipeline (Option B) would increase latency for real-time requests. Adding an API gateway (Option D) does not address compute capacity.
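
How an autoscaler decides on instance count can be sketched with the proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: scale the current replica count by the ratio of the observed metric to its target, then round up. The question does not name a platform, so treating the service as HPA-like is an assumption, and the replica bounds here are illustrative.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

A single instance running at 90% utilization against a 60% target would be scaled to two instances, and the count falls back toward the minimum once the peak passes.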

83

MCQ · easy

A team deploys a real-time fraud detection model on a streaming platform. The model must produce predictions within 100 milliseconds per event. Initial latency is 150 ms. Which optimization is most likely to meet the latency requirement?

A. Apply model quantization to reduce precision from FP32 to INT8.

B. Increase the batch size to process more events simultaneously.

C. Add more feature engineering steps to improve model accuracy.

D. Migrate from a decision tree ensemble to a deep neural network.

Answer: A

Quantization reduces model size and speeds up computation, lowering latency.

Why this answer

Model quantization reduces the numerical precision of the model's weights and activations from FP32 to INT8, which decreases memory footprint and speeds up inference. This optimization targets the 150 ms latency by enabling faster integer arithmetic on hardware with INT8 support; a speedup in the 2-4x range, where it is achieved, would bring latency below the 100 ms requirement.

Exam trap

The exam often tests the misconception that increasing batch size or model complexity helps real-time systems. Candidates must recognize that real-time streaming requires low per-event latency, not high aggregate throughput.

How to eliminate wrong answers

Option B is wrong because increasing batch size processes more events simultaneously, which increases per-batch latency and is unsuitable for real-time streaming where each event must be handled individually within 100 ms. Option C is wrong because adding more feature engineering steps increases preprocessing time, worsening latency without guaranteeing a reduction in model inference time. Option D is wrong because migrating from a decision tree ensemble to a deep neural network typically increases model complexity and computational cost, raising latency rather than reducing it.
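
The precision reduction itself is simple to illustrate. The sketch below applies symmetric per-tensor INT8 quantization to a short list of weights; it is a toy version of the idea and makes no claim about how any particular inference runtime implements it.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] with one shared scale."""
    largest = max(abs(w) for w in weights)
    scale = largest / 127 if largest else 1.0  # avoid dividing by zero
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the INT8 representation."""
    return [q * scale for q in quantized]
```

Each weight now fits in one byte instead of four, at the cost of a small rounding error bounded by the scale.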

84

Multi-Select · easy

Which THREE components are essential in an MLOps pipeline?

Select 3 answers

A. Data versioning

B. Manual code review

C. Deployment automation

D. Hardware procurement

E. Automated model testing

Answers: A, C, E

Versioning data is crucial for reproducibility and tracking.

Why this answer

Data versioning (A) is essential in an MLOps pipeline because it ensures reproducibility and traceability of datasets used for training, validation, and testing. Without versioning, changes to data cannot be tracked, leading to inconsistent model behavior and difficulty in debugging. Tools like DVC or Git LFS enable precise snapshotting of data, which is critical for auditing and rollback in production AI systems. Deployment automation (C) removes manual, error-prone release steps, and automated model testing (E) verifies each candidate model before it is promoted.

Exam trap

The exam often tests the distinction between operational pipeline components (automation, testing, versioning) and peripheral activities (procurement, manual reviews) to see if candidates understand that MLOps is about automating the ML lifecycle, not general IT operations.

85

MCQ · easy

A team of data scientists and engineers is working on multiple AI projects. They often struggle to reproduce experiments and manage model versions. Which tool or practice should they adopt?

A. Document experiments in a shared Word document.

B. Share code via email attachments.

C. Keep all models in a shared network drive.

D. Use an MLOps platform that provides version control, tracking, and reproducibility.

Answer: D

MLOps platforms are designed to manage the ML lifecycle effectively.

Why this answer

Option D is correct because an MLOps platform (e.g., MLflow, Kubeflow, or Vertex AI) provides integrated version control for code, data, and models, along with experiment tracking and reproducibility. This directly addresses the team's struggle to reproduce experiments and manage model versions by automating lineage capture and enabling consistent environment recreation.

Exam trap

The exam often tests the misconception that simple file-sharing or document-based approaches are sufficient for reproducibility, when in fact they lack the automated lineage and environment locking that MLOps platforms provide.

How to eliminate wrong answers

Option A is wrong because a shared Word document lacks automated versioning, dependency tracking, and execution capture, making it impossible to reliably reproduce experiments from static text. Option B is wrong because sharing code via email attachments introduces version confusion, lacks any form of change tracking or environment locking, and violates basic software engineering practices for collaboration. Option C is wrong because keeping models on a shared network drive provides no version history, no lineage to training code or data, and no mechanism to roll back or compare model iterations, leading to overwrites and irreproducible results.

86

MCQ · medium

During model monitoring, a loan approval model shows disparate impact against a protected group. The model's overall accuracy is high, but the false positive rate for the protected group is 0.12 compared to 0.02 for other groups. Which action should the operations team take first?

A. Document the disparity and proceed with deployment because accuracy is high

B. Replace the model with a simpler model that is less discriminatory

C. Retrain the model with reweighted training data to minimize disparity

D. Adjust the decision threshold for the protected group to equalize false positive rates

Answer: C

Retraining with reweighted data mitigates the bias in the model itself.

Why this answer

Option C is correct because retraining the model with reweighted training data directly addresses the root cause of disparate impact—biased historical data—by assigning higher weights to underrepresented groups during training. This technique, often implemented via cost-sensitive learning or sample reweighting, adjusts the model's internal decision boundaries to reduce false positive rate disparities without sacrificing overall accuracy. The operations team should first attempt to mitigate bias at the data level before considering threshold adjustments or model replacement, as reweighting preserves the model's learned patterns while promoting fairness.

Exam trap

The exam often tests the misconception that adjusting the decision threshold for a specific group is a quick fix for disparate impact. The trap is that this approach applies a different decision rule to each group, which can introduce legal liability, whereas retraining with reweighted data addresses bias during training while keeping a single decision rule for everyone.

How to eliminate wrong answers

Option A is wrong because documenting the disparity and proceeding with deployment ignores the ethical and regulatory requirement to address disparate impact, even if overall accuracy is high; high accuracy can mask significant bias against protected groups. Option B is wrong because replacing the model with a simpler model does not guarantee less discrimination—simplicity does not correlate with fairness, and a simpler model may still exhibit bias or have lower predictive performance. Option D is wrong because adjusting the decision threshold for the protected group alone treats the symptom (unequal false positive rates) rather than the cause, and can lead to calibration drift, reduced model interpretability, and potential legal issues under the Equal Credit Opportunity Act (ECOA) by applying different standards to different groups.
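
Sample reweighting can be made concrete with one common scheme, often called reweighing: each training example receives the weight P(group) × P(label) / P(group, label), so that group membership and label look statistically independent to the learner. This is only one way to realize the "reweighted training data" in the answer, and the group and label values below are illustrative.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Per-sample weights that decorrelate group membership from the label."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return [
        (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]
```

The resulting list would typically be passed as per-sample weights to the training routine, so that group and label combinations that are rare relative to independence count for more during retraining.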

87

MCQ · hard

A CI/CD pipeline for a computer vision model uses canary deployment. After deploying a new version to 5% of traffic, the pipeline automatically rolls back due to a spike in error rate. The new model's inference time is 20% higher than the previous version. The operations team finds that the error is caused by timeout in the inference service. Which action should be taken to prevent future rollbacks?

A. Increase the timeout threshold for inference requests

B. Implement a fallback to the previous model when timeout occurs

C. Optimize the model using TensorRT or ONNX Runtime before deployment

D. Reduce the canary percentage to 1% to minimize impact

Answer: C

Optimizing reduces inference time, addressing the cause of timeouts.

Why this answer

Option C is correct because the root cause of the timeout is the 20% higher inference time of the new model. Optimizing the model using TensorRT or ONNX Runtime reduces inference latency directly, addressing the performance bottleneck that causes timeouts. This prevents the spike in error rate and subsequent rollback without masking the underlying issue.

Exam trap

The trap here is that candidates may confuse symptom management (increasing timeout or fallback) with root-cause resolution (model optimization), which Cisco tests to see if you understand that performance issues must be fixed at the source in AI/ML operations.

How to eliminate wrong answers

Option A is wrong because increasing the timeout threshold only masks the symptom (timeout) without fixing the underlying performance degradation; it may lead to poor user experience and does not prevent future rollbacks if the model remains slow. Option B is wrong because implementing a fallback to the previous model on timeout is a reactive workaround that does not address the root cause; it can cause inconsistent behavior and still result in errors during the fallback transition. Option D is wrong because reducing the canary percentage to 1% only minimizes the blast radius but does not prevent the timeout errors from occurring; the spike in error rate would still trigger a rollback, just with less traffic affected.

88

MCQ · easy

An organization deploys an AI model on edge devices for real-time image classification. Which metric is most important to monitor for ensuring the device's operational health?

A. Model calibration error

B. Inference memory consumption

C. Average prediction confidence

D. Model accuracy on local test data

Answer: B

Memory is a key operational health indicator for edge devices.

Why this answer

For edge devices with limited resources, inference memory consumption is the most critical operational health metric because exceeding available memory can cause the model to crash or the device to become unresponsive. Unlike accuracy or confidence, memory usage directly reflects whether the device can sustain real-time inference without resource exhaustion.

Exam trap

The exam often tests the misconception that model accuracy or confidence is the primary concern for operational health. The trap here is that edge device stability depends on resource constraints like memory, not on model performance metrics.

How to eliminate wrong answers

Option A is wrong because model calibration error measures the reliability of predicted probabilities, not the operational health of the device. Option C is wrong because average prediction confidence indicates model certainty, not whether the device has sufficient memory to run inference. Option D is wrong because model accuracy on local test data evaluates model performance, not the device's ability to operate without memory overflow or system failure.

89

MCQ · easy

A company must deploy a new model version with zero downtime. The current model is served via a REST API on a Kubernetes cluster. Which deployment strategy should the team use to gradually shift traffic to the new version while monitoring for errors?

A. Blue-green deployment

B. Canary deployment

C. Recreate deployment

D. Rolling update

Answer: B

Canary deployment gradually routes traffic to the new version for safe rollout.

Why this answer

A canary deployment gradually shifts a small percentage of traffic to the new model version while the majority continues to hit the stable version. This allows the team to monitor for errors and roll back quickly if issues arise, achieving zero downtime. It is the ideal strategy for validating a new model in production with minimal risk.

Exam trap

The trap here is that candidates confuse 'rolling update' with 'canary deployment' because both involve gradual changes, but a rolling update replaces pods sequentially without the ability to route a controlled subset of traffic for targeted monitoring and rollback.

How to eliminate wrong answers

Option A is wrong because blue-green deployment switches all traffic at once from the old to the new environment, which does not provide gradual traffic shifting or incremental error monitoring; it is an all-or-nothing cutover. Option C is wrong because recreate deployment tears down the old version before deploying the new one, causing downtime and violating the zero-downtime requirement. Option D is wrong because a rolling update replaces pods incrementally but does not allow fine-grained traffic splitting or canary-style monitoring; it updates all instances without a separate traffic-routing phase for error detection.
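
The traffic split at the heart of a canary rollout can be sketched as a deterministic routing function. In practice this is usually handled by an ingress controller or service mesh; the `route` helper and the request-ID scheme below are hypothetical and only show the principle.

```python
import hashlib


def route(request_id: str, canary_percent: int) -> str:
    """Send roughly canary_percent of requests to the new model version."""
    # Hashing the ID gives a stable bucket in [0, 100) for each request.
    digest = hashlib.sha256(request_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "canary" if bucket < canary_percent else "stable"
```

Raising `canary_percent` in steps (for example 5, 25, 50, 100) while watching error rates gives the gradual shift the question describes, and setting it back to 0 acts as an immediate rollback.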

90

MCQ · hard

An organization is implementing an AI-powered chatbot for customer service. The chatbot must comply with GDPR and handle data subject access requests (DSARs). Which design approach best ensures compliance?

A. Minimize data collection by not logging any user interactions.

B. Anonymize all user data before logging interactions.

C. Implement an audit trail that logs interactions with a unique user identifier, and provide a mechanism to delete logs upon user request.

D. Encrypt all chat logs and store them indefinitely for audit purposes.

Answer: C

This ensures compliance with the right to access and erasure under GDPR.

Why this answer

Option C is correct because GDPR requires that personal data be stored only as long as necessary and that data subjects have the right to erasure. By logging interactions with a unique user identifier and providing a deletion mechanism, the chatbot can fulfill DSARs while maintaining an audit trail for compliance monitoring. This approach balances operational needs with regulatory obligations.

Exam trap

The exam often tests the misconception that GDPR requires logging nothing at all (Option A) or permits indefinite retention as long as the data is encrypted (Option D), when in fact the regulation calls for a balance between data utility and privacy rights, including the ability to delete data upon request.

How to eliminate wrong answers

Option A is wrong because not logging any user interactions prevents the organization from monitoring chatbot performance, improving the AI model, or detecting security incidents, and GDPR does not prohibit all logging—only excessive or unnecessary data collection. Option B is wrong because anonymization must be irreversible to be GDPR-compliant; if the data can be re-identified (e.g., via correlation with other logs), it is pseudonymization, which still subjects it to GDPR requirements, and anonymizing before logging does not address the need to handle DSARs for data that was originally personal. Option D is wrong because storing chat logs indefinitely violates the GDPR storage limitation principle (Article 5(1)(e)), which mandates that personal data be kept no longer than necessary for the purpose for which it is processed.
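
The design in Option C comes down to keying every log entry by a user identifier so that access and erasure requests can be served per user. The class below is a minimal in-memory sketch with hypothetical names; a production system would add durable storage, retention limits, and access controls.

```python
from collections import defaultdict


class ChatAuditLog:
    """Interaction log indexed by user ID to support access and erasure requests."""

    def __init__(self):
        self._by_user = defaultdict(list)

    def record(self, user_id: str, message: str) -> None:
        self._by_user[user_id].append(message)

    def export(self, user_id: str) -> list:
        """Right of access: return a copy of everything held for this user."""
        return list(self._by_user.get(user_id, []))

    def erase(self, user_id: str) -> int:
        """Right to erasure: delete the user's entries and report how many."""
        return len(self._by_user.pop(user_id, []))
```

Because entries are grouped by identifier, one user's data can be exported or deleted without touching anyone else's records.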

91

MCQ · easy

A data science team uses a CI/CD pipeline for ML models. They need to ensure that each model version is traceable back to the exact training data and hyperparameters. Which practice should be implemented?

A. Use a model registry with metadata tracking (e.g., MLflow)

B. Use Git LFS for model files

C. Store model artifacts in blob storage with timestamped filenames

D. Record hyperparameters in a shared spreadsheet

Answer: A

A model registry stores versions and associated metadata for full traceability.

Why this answer

A model registry (Option A) serves as a centralized repository that tracks model versions along with their metadata, including training data snapshots and hyperparameters. Git LFS (Option B) only handles large files, not metadata. Blob storage with timestamped filenames (Option C) lacks structured tracking. A spreadsheet (Option D) is error-prone and not integrated into the pipeline.
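
What "traceable back to the exact training data and hyperparameters" means can be shown with a toy registry. This is deliberately not MLflow's API: `register_model` and the dictionary layout are hypothetical, and the point is only that each version stores a fingerprint of its data next to its settings.

```python
import hashlib


def register_model(registry: dict, name: str, training_data: bytes, hyperparams: dict) -> dict:
    """Append a new version entry that links a model to its data and settings."""
    versions = registry.setdefault(name, [])
    entry = {
        "version": len(versions) + 1,
        # A content hash identifies the exact dataset snapshot used for training.
        "data_sha256": hashlib.sha256(training_data).hexdigest(),
        "hyperparams": dict(hyperparams),
    }
    versions.append(entry)
    return entry
```

Given any registered version, the team can check whether a dataset file matches the recorded hash and rerun training with the recorded hyperparameters.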

92

MCQ · medium

A company uses an AI system to recommend products. The recommendation accuracy is high, but users complain about lack of diversity. Which strategy should the team adopt to improve diversity without significantly sacrificing accuracy?

A. Randomly replace some recommendations with popular items.

B. Use only popularity-based recommendations.

C. Increase the number of recommendations and use collaborative filtering.

D. Modify the loss function to include a term that penalizes overly similar recommendations.

Answer: D

This explicitly encourages diversity while retaining accuracy.

Why this answer

Option D is correct because modifying the loss function to include a diversity penalty directly addresses the lack of recommendation diversity at the algorithmic level. By adding a regularization term that penalizes overly similar recommendations, the model learns to balance accuracy with variety, ensuring that the output set remains diverse without a significant drop in relevance. This approach is a standard technique in recommendation systems, often implemented via Determinantal Point Processes (DPPs) or diversity-aware loss functions.

Exam trap

The exam often tests the misconception that simply adding more recommendations or using popularity will solve diversity issues, when in reality algorithmic constraints like loss function modification are required to maintain accuracy while improving diversity.

How to eliminate wrong answers

Option A is wrong because randomly replacing some recommendations with popular items introduces noise and can significantly degrade accuracy, as popular items may not be relevant to the user's specific preferences. Option B is wrong because using only popularity-based recommendations completely ignores personalization, leading to a severe loss of accuracy and user-specific relevance. Option C is wrong because simply increasing the number of recommendations and using collaborative filtering does not inherently enforce diversity; it may still produce a homogeneous set of similar items, and the increased list size can dilute relevance without a diversity constraint.

93

MCQ · hard

A real-time recommendation system uses a model retrained daily. The operations team notices that click-through rate drops sharply at 8 AM each day and recovers by noon. The retraining job runs at midnight. What is the most likely cause?

A. The model overfits to late-night user behavior

B. The model suffers from catastrophic forgetting due to daily retraining

C. There is data drift due to morning user patterns not seen in training

D. The retraining pipeline has a bug that only affects morning predictions

Answer: C

Morning patterns differ from training data, causing a temporary performance drop until the model adapts through retraining.

Why this answer

The sharp drop in click-through rate at 8 AM, followed by recovery by noon, strongly indicates data drift caused by a shift in user behavior patterns during morning hours. Since the model is retrained at midnight using data that predominantly captures late-night user behavior, it fails to generalize to the distinct morning user patterns (e.g., different browsing habits, content preferences). This is a classic example of temporal data drift where the training distribution does not match the inference distribution at specific times of day.

Exam trap

CompTIA often tests the distinction between data drift and model degradation issues; the trap here is that candidates might confuse a temporary performance dip due to distribution shift (data drift) with a permanent model flaw like overfitting or catastrophic forgetting, which would not self-correct within the same day.

How to eliminate wrong answers

Option A is wrong because overfitting to late-night user behavior would cause poor performance during morning hours, but the recovery by noon suggests the model adapts as more morning data becomes available, not that it is permanently overfit. Option B is wrong because catastrophic forgetting refers to a model losing previously learned knowledge when trained on new data, which would cause a persistent performance drop, not a temporary one that recovers within hours. Option D is wrong because a pipeline bug that only affects morning predictions would likely cause consistent errors or failures at 8 AM every day, not a gradual recovery by noon, and there is no evidence of a bug in the retraining process itself.

94

Multi-Selectmedium

Which TWO of the following are best practices for monitoring AI models in production?

Select 2 answers

A.Set up alerts for prediction latency and error rates.

B.Monitor model accuracy only at deployment time.

C.Regularly retrain without checking performance.

D.Freeze the model version once deployed to avoid changes.

E.Track input data distribution and compare with training data.

AnswersA, E

Operational metrics like latency and errors are critical for production monitoring.

Why this answer

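Tracking input data distribution helps detect drift, and alerts on latency and error rates ensure operational health. Option B is wrong because accuracy measured only at deployment time misses degradation that develops later. Option C is wrong because retraining without checking performance can push a worse model into production. Option D is wrong because freezing the model version prevents the team from responding to drift.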
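Alerting on latency and error rates can be sketched as a simple threshold check over a window of recent requests. The function names and the limits (200 ms p95, 1% server errors) are assumptions chosen for illustration, not values from the exam material.

```python
import math

def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def check_alerts(latencies_ms, status_codes, p95_limit_ms=200, error_rate_limit=0.01):
    """Return a list of alert messages for one monitoring window."""
    alerts = []
    p95 = percentile(latencies_ms, 95)
    if p95 > p95_limit_ms:
        alerts.append(f"p95 latency {p95} ms exceeds {p95_limit_ms} ms")
    errors = sum(1 for code in status_codes if code >= 500)
    error_rate = errors / len(status_codes)
    if error_rate > error_rate_limit:
        alerts.append(f"error rate {error_rate:.1%} exceeds {error_rate_limit:.1%}")
    return alerts

# Hypothetical window: 10% slow requests and 3% server errors.
window_latencies = [50] * 90 + [400] * 10
window_statuses = [200] * 97 + [500] * 3
for message in check_alerts(window_latencies, window_statuses):
    print("ALERT:", message)
```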

95

MCQhard

A team is implementing an ML pipeline using a feature store. Which benefit does a feature store primarily provide in an AI operations context?

A.Automated scaling of inference endpoints

B.Real-time monitoring of model performance

C.Consistency of feature computation between training and inference

D.Automatic model versioning and rollback

AnswerC

Feature store provides a centralized, consistent feature computation pipeline.

Why this answer

A feature store ensures that feature engineering logic is stored, versioned, and reused consistently across both training and inference pipelines. This eliminates training-serving skew, a common cause of model degradation in production, by guaranteeing that the same transformations are applied to data regardless of when or where it is computed.
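The core idea can be illustrated with a toy in-memory registry in which each feature is defined once and both pipelines call the same definition. This is a sketch only: `FeatureRegistry`, the feature names, and the raw record fields are hypothetical and do not correspond to any particular feature store product.

```python
from typing import Callable, Dict

class FeatureRegistry:
    """Toy stand-in for a feature store: one definition per feature name."""

    def __init__(self) -> None:
        self._features: Dict[str, Callable[[dict], float]] = {}

    def register(self, name: str):
        def decorator(fn):
            self._features[name] = fn
            return fn
        return decorator

    def compute(self, raw: dict) -> Dict[str, float]:
        # Every caller gets the same transformations in the same form.
        return {name: fn(raw) for name, fn in self._features.items()}

registry = FeatureRegistry()

@registry.register("basket_value_per_item")
def basket_value_per_item(raw: dict) -> float:
    return raw["basket_total"] / max(raw["item_count"], 1)

@registry.register("is_weekend")
def is_weekend(raw: dict) -> float:
    return 1.0 if raw["day_of_week"] >= 5 else 0.0

def build_training_row(raw: dict) -> Dict[str, float]:
    return registry.compute(raw)  # offline training pipeline

def build_serving_row(raw: dict) -> Dict[str, float]:
    return registry.compute(raw)  # online inference path

record = {"basket_total": 90.0, "item_count": 3, "day_of_week": 6}
print(build_training_row(record) == build_serving_row(record))
```

Because neither pipeline re-implements the transformation, there is no second copy of the logic that could drift out of sync.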

Exam trap

CompTIA often tests the distinction between infrastructure-level benefits (scaling, monitoring, versioning) and the core data-consistency problem that a feature store solves, leading candidates to confuse feature stores with model registries or serving platforms.

How to eliminate wrong answers

Option A is wrong because automated scaling of inference endpoints is a function of model serving infrastructure (e.g., Kubernetes Horizontal Pod Autoscaler or serverless inference platforms), not a primary benefit of a feature store. Option B is wrong because real-time monitoring of model performance is handled by observability tools (e.g., Prometheus, Grafana, or custom drift detection systems), not by the feature store itself. Option D is wrong because automatic model versioning and rollback is a capability of model registries and CI/CD pipelines (e.g., MLflow Model Registry), whereas a feature store focuses on feature definitions and values, not model artifacts.

96

Multi-Selecthard

A deployed NLP sentiment analysis model experiences a sharp decline in accuracy on customer reviews. The team has verified the input data format and pipeline are correct. Which THREE actions should be taken to diagnose and remediate? (Choose 3.)

Select 3 answers

A.Analyze recent user input for distribution shifts compared to training data.

B.Immediately retrain the model with all available data.

C.Increase the size of the training dataset by adding synthetic data.

D.Revert to a previous model version that performed well.

E.Conduct a root cause analysis focusing on concept drift.

AnswersA, D, E

Analyzing recent input identifies data drift, which is a common cause of degradation.

Why this answer

Options A, D, and E are correct. Analyzing recent user input detects distribution shift, reverting to a previous model version provides quick recovery, and a root cause analysis focused on concept drift prevents recurrence. Option C is wrong because synthetic data may introduce noise.

Option B is wrong because immediate retraining without analysis could embed the underlying issue in the new model.
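For text input, one simple first check for distribution shift is the share of recent tokens that never appeared in the training data. The sketch below assumes whitespace tokenization and uses made-up reviews; a real pipeline would reuse the model's own tokenizer.

```python
def tokenize(text):
    """Lowercase whitespace tokenizer that strips basic punctuation."""
    return [t.strip(".,!?").lower() for t in text.split() if t.strip(".,!?")]

def oov_rate(texts, vocabulary):
    """Fraction of tokens in texts that are absent from the training vocabulary."""
    tokens = [tok for text in texts for tok in tokenize(text)]
    if not tokens:
        return 0.0
    unseen = sum(1 for tok in tokens if tok not in vocabulary)
    return unseen / len(tokens)

# Hypothetical training and production samples.
training_reviews = ["Great product, fast delivery", "Terrible quality, slow delivery"]
recent_reviews = ["Mid product, delivery was sus"]

vocabulary = {tok for review in training_reviews for tok in tokenize(review)}
print("training OOV rate:", oov_rate(training_reviews, vocabulary))
print("recent OOV rate:", oov_rate(recent_reviews, vocabulary))
```

A rising out-of-vocabulary rate points to new wording in the input (data drift). It does not detect concept drift, where familiar words change sentiment, which is why the root cause analysis is a separate step.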

97

MCQmedium

A company deployed a chatbot using a pre-trained language model. Users report that the chatbot provides incorrect answers to domain-specific questions. Which approach should the AI team prioritize to improve accuracy without retraining the entire model?

A.Fine-tune the model on a curated dataset of domain-specific conversations.

B.Increase the temperature parameter to reduce randomness.

C.Collect more general training data and retrain the model from scratch.

D.Roll back to a previous version of the model that was more accurate.

AnswerA

Fine-tuning adapts the model to the domain with less data and compute.

Why this answer

Fine-tuning on a curated domain-specific dataset is the most efficient way to improve accuracy for specialized queries without retraining the entire model. It adjusts the model's weights using a smaller, targeted dataset, preserving general language understanding while adapting to domain terminology and context.

Exam trap

CompTIA often tests the misconception that increasing temperature reduces randomness (when it actually increases it) or that rolling back to an older version is a valid fix for new domain-specific issues, leading candidates to choose B or D instead of recognizing fine-tuning as the targeted, efficient solution.

How to eliminate wrong answers

Option B is wrong because increasing the temperature parameter increases randomness in token selection, which would make answers less deterministic and more likely to be incorrect, not more accurate. Option C is wrong because collecting more general training data and retraining from scratch is resource-intensive, time-consuming, and contradicts the requirement to avoid retraining the entire model. Option D is wrong because rolling back to a previous version does not address the domain-specific inaccuracies; the older model likely lacks the specialized knowledge needed and may have its own deficiencies.
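The effect of temperature on token selection can be seen by dividing the logits by the temperature before applying softmax. This is a minimal sketch with made-up logits for three candidate tokens; it is not tied to any specific model API.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens

for temperature in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

As the temperature rises, probability mass moves from the top token to the alternatives, so sampling becomes more random.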

98

MCQhard

An AI system misclassifies rare but critical events. The team considers using synthetic data. Which consideration is MOST important for ensuring the synthetic data improves performance on real rare events?

A.The synthetic data should include a wide variety of events, even if not realistic.

B.The synthetic data should be generated using an unsupervised generative model.

C.The synthetic data should accurately represent the distribution and features of real rare events.

D.The synthetic data should be as large as possible to cover all possibilities.

AnswerC

Fidelity to real event characteristics is crucial for generalization.

Why this answer

Option C is correct because synthetic data must faithfully replicate the distribution and feature space of real rare events to enable the model to learn meaningful decision boundaries. If the synthetic data does not capture the true underlying patterns—such as specific sensor readings or transaction anomalies—the model will fail to generalize to actual rare events, defeating the purpose of augmentation.
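A crude first-pass fidelity check compares a feature's mean in the synthetic data against the real rare events, measured in units of the real data's standard deviation. The sensor readings and the 0.5 cut-off below are assumptions for illustration; a single-feature check like this does not verify the joint distribution.

```python
from statistics import mean, stdev

def standardized_mean_gap(real, synthetic):
    """Absolute mean difference in units of the real data's standard deviation."""
    return abs(mean(real) - mean(synthetic)) / stdev(real)

# Hypothetical sensor readings recorded during real rare events.
real_events = [98.0, 101.0, 99.5, 102.0, 100.5]
faithful_synthetic = [99.0, 100.0, 101.5, 100.5, 99.5]
unrealistic_synthetic = [10.0, 250.0, 500.0, 5.0, 900.0]

MAX_GAP = 0.5  # assumed acceptance threshold

for label, sample in (("faithful", faithful_synthetic), ("unrealistic", unrealistic_synthetic)):
    gap = standardized_mean_gap(real_events, sample)
    print(label, round(gap, 2), "accept" if gap <= MAX_GAP else "reject")
```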

Exam trap

CompTIA often tests the misconception that 'more data is always better' or that 'any synthetic data helps,' when in reality the fidelity of the synthetic data to the real rare event distribution is the paramount factor for improving model performance on those events.

How to eliminate wrong answers

Option A is wrong because including a wide variety of unrealistic events introduces noise and spurious correlations, which can degrade the model's precision and recall on real rare events. Option B is wrong because the choice of generative model (unsupervised vs. supervised) is secondary; the critical factor is that the synthetic data accurately reflects the real rare event distribution, not the training paradigm. Option D is wrong because simply maximizing dataset size without ensuring fidelity to real rare events can lead to overfitting on synthetic artifacts and poor generalization to authentic edge cases.

99

Multi-Selectmedium

A team monitors a production model for bias. They measure the selection rate for two demographic groups and find a significant difference. Which TWO actions should the team take to mitigate bias? (Choose two.)

Select 2 answers

A.Increase the complexity of the model to capture more patterns

B.Add more training data from both groups

C.Retrain the model with a balanced training dataset

D.Remove the protected attribute from the model input

E.Implement a post-processing fairness adjustment

AnswersC, E

Balanced data reduces bias by ensuring the model learns from fair representations.

Why this answer

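Retraining with a balanced training dataset (Option C) directly addresses the root cause of bias by ensuring the model learns from equal representation of both demographic groups, which reduces skewed selection rates. This is a standard data-level mitigation technique in AI fairness, as it prevents the model from overfitting to majority patterns. Implementing a post-processing fairness adjustment (Option E), such as calibrating decision thresholds per group, corrects the disparity in the deployed model's outputs without waiting for a full retraining cycle.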
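The selection-rate comparison and a post-processing adjustment can be sketched as follows. The scores, thresholds, and group labels are hypothetical, and whether group-specific thresholds are acceptable is a policy decision outside the scope of this sketch.

```python
def selection_rate(scores, threshold):
    """Share of candidates whose score meets the decision threshold."""
    return sum(1 for s in scores if s >= threshold) / len(scores)

def disparate_impact(rate_a, rate_b):
    """Ratio of the lower selection rate to the higher one (1.0 means parity)."""
    highest = max(rate_a, rate_b)
    return 1.0 if highest == 0 else min(rate_a, rate_b) / highest

# Hypothetical model scores for two demographic groups.
group_a_scores = [0.9, 0.8, 0.7, 0.6, 0.4]
group_b_scores = [0.7, 0.55, 0.5, 0.45, 0.3]

# Before mitigation: one threshold for everyone.
before = disparate_impact(selection_rate(group_a_scores, 0.6),
                          selection_rate(group_b_scores, 0.6))

# Post-processing adjustment: a separate threshold for group B.
after = disparate_impact(selection_rate(group_a_scores, 0.6),
                         selection_rate(group_b_scores, 0.45))

print("disparate impact before:", before)
print("disparate impact after:", after)
```

A ratio below 0.8 is often treated as a warning sign (the "four-fifths" rule of thumb), so monitoring this value over time shows whether the mitigation holds.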

Exam trap

CompTIA often tests the misconception that removing the protected attribute (Option D) is sufficient to eliminate bias, when in reality proxy features and correlated variables can perpetuate discrimination.

100

Multi-Selecthard

Which TWO deployment strategies allow for testing a new model version before fully rolling it out?

Select 2 answers

A.Shadow deployment

B.Canary deployment

C.Direct cutover

D.A/B testing with traffic splitting

E.Blue-Green deployment

AnswersB, D

Canary releases route a subset of users to the new version for validation.

Why this answer

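Canary deployment is correct because it routes a small percentage of live traffic to the new model version while the majority continues using the stable version. This allows real-world validation of the new model's performance and error rates under production load before a full rollout, minimizing blast radius if issues arise. A/B testing with traffic splitting is correct for the same reason: it divides live traffic between the current and new versions so their metrics can be compared before the new version takes all traffic.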
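Traffic splitting is commonly implemented by hashing a stable request key into buckets so that each user consistently sees the same version. The sketch below assumes a string user ID and a percentage-based split; the function name and version labels are illustrative.

```python
import hashlib

def route(user_id: str, canary_percent: int) -> str:
    """Deterministically assign a user to the canary or the stable version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # bucket in the range 0..99
    return "canary" if bucket < canary_percent else "stable"

# Hypothetical population of 10,000 users with a 10% canary.
assignments = [route(f"user-{i}", 10) for i in range(10_000)]
canary_share = assignments.count("canary") / len(assignments)
print("canary share:", canary_share)
```

Raising `canary_percent` in steps widens the rollout, and setting it to 0 sends everyone back to the stable version.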

Exam trap

The trap here is that candidates confuse shadow deployment with canary deployment. A shadow model receives mirrored production traffic, but its predictions are never returned to users, so it cannot validate how live users respond to the new version.

101

MCQhard

An e-commerce company deploys a recommendation model that must serve predictions with sub-100 ms latency for millions of users during peak hours. The model is a large neural network. Which architecture is most suitable?

A.Batch process predictions every hour.

B.Use a distributed system with load balancers and model replicas.

C.Deploy the model on a single powerful GPU server.

D.Use serverless functions with auto-scaling.

AnswerB

This architecture handles high traffic and meets latency requirements efficiently.

Why this answer

Option B is correct because running replicas of the model on multiple servers behind load balancers allows horizontal scaling to handle millions of concurrent users while maintaining sub-100 ms latency. This architecture provides fault tolerance and can adjust the number of replicas to peak traffic loads, which is essential for real-time inference with large neural networks.
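A rough replica count can be estimated with Little's law (requests in flight = arrival rate × time per request). All numbers below are assumptions for illustration, not figures from the question, and real sizing would be validated with load tests.

```python
import math

def replicas_needed(peak_qps, service_time_ms, concurrency_per_replica, target_utilization):
    """Estimate replicas from peak load using Little's law."""
    in_flight = peak_qps * service_time_ms / 1000   # average concurrent requests
    usable = concurrency_per_replica * target_utilization  # headroom per replica
    return math.ceil(in_flight / usable)

# Hypothetical peak: 20,000 requests/s, 40 ms per inference,
# 8 concurrent requests per replica, run at 50% utilization.
print(replicas_needed(20_000, 40, 8, 0.5))
```

Keeping utilization well below 100% leaves room for traffic bursts, which helps hold tail latency under the 100 ms budget.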

Exam trap

CompTIA often tests the misconception that a single powerful server or serverless functions can meet strict latency and throughput requirements; of the options listed, only horizontal scaling with load-balanced replicas suits high-concurrency, low-latency inference with large models.

How to eliminate wrong answers

Option A is wrong because batch processing predictions every hour introduces latency of up to 3600 seconds, which fails the sub-100 ms requirement and is unsuitable for real-time recommendation systems. Option C is wrong because a single powerful GPU server creates a single point of failure and cannot scale horizontally to handle millions of concurrent users during peak hours, leading to resource contention and latency spikes. Option D is wrong because serverless functions typically have cold start delays (often 100 ms to several seconds) and may not support large neural network models due to memory and execution time limits (e.g., AWS Lambda max 15 minutes, 10 GB memory), making them unsuitable for low-latency, high-throughput inference.

102

MCQhard

An MLOps team automates model deployment with a CI/CD pipeline. A performance regression is detected after deploying a new model version. The team needs to automatically roll back to the previous version. Which approach best enables safe automated rollback?

A.Use a blue/green deployment with automated health checks and traffic switching

B.Maintain a manual rollback script that the operations team can run

C.Deploy new models as canary releases and monitor for 24 hours

D.Automatically keep the previous model version in storage for later use

AnswerA

Blue/green allows instant rollback by redirecting traffic.

Why this answer

Blue/green deployment with automated health checks and traffic switching is the best approach because it allows the team to instantly route all traffic back to the previous (blue) environment if the new (green) version fails health checks. This ensures zero-downtime rollback without manual intervention, directly addressing the need for safe automated rollback in a CI/CD pipeline.
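The switch-then-verify logic can be sketched as follows. The `Router` class, the metric names, and the thresholds are hypothetical; in practice the traffic switch would be a load balancer or service mesh operation, and the metrics would come from the monitoring system.

```python
class Router:
    """Tracks which environment receives live traffic."""

    def __init__(self, live: str, idle: str) -> None:
        self.live = live
        self.idle = idle

    def switch(self) -> None:
        self.live, self.idle = self.idle, self.live

def healthy(metrics, max_error_rate=0.02, min_accuracy=0.90):
    """Automated health check on the environment that just went live."""
    return metrics["error_rate"] <= max_error_rate and metrics["accuracy"] >= min_accuracy

def deploy_with_rollback(router, fetch_metrics):
    router.switch()  # send traffic to the new version
    if not healthy(fetch_metrics(router.live)):
        router.switch()  # automated rollback to the previous version
        return "rolled_back"
    return "promoted"

# Hypothetical regression: the new version fails both checks.
router = Router(live="model-v1", idle="model-v2")
outcome = deploy_with_rollback(router, lambda env: {"error_rate": 0.10, "accuracy": 0.80})
print(outcome, "live:", router.live)
```

Because the previous environment stays fully deployed, the rollback is only a traffic switch rather than a redeployment.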

Exam trap

CompTIA often tests the distinction between preserving artifacts (storage) and enabling automated traffic switching (deployment strategy), so candidates mistakenly choose Option D thinking storage alone ensures rollback capability.

How to eliminate wrong answers

Option B is wrong because a manual rollback script introduces human delay and error risk, contradicting the requirement for automated rollback. Option C is wrong because canary releases with a 24-hour monitoring window do not provide immediate automated rollback; they rely on manual decision-making after observation, which is not fully automated. Option D is wrong because simply keeping the previous model version in storage does not enable automatic traffic switching or rollback; it only preserves the artifact, not the deployment state.

103

MCQeasy

A company has developed a deep learning model for image classification. The team wants to deploy the model to production with high availability and scalability. Which approach should they use?

A.Run the model on a laptop during business hours.

B.Deploy the model as a monolithic application on a single server.

C.Embed the model directly into a mobile app.

D.Use a containerized approach with Kubernetes.

AnswerD

Kubernetes provides orchestration, scaling, and high availability for containerized applications.

Why this answer

Option D is correct because containerization with Kubernetes provides the orchestration, auto-scaling, and self-healing capabilities required for high availability and scalability in production. Kubernetes manages container lifecycles, distributes traffic across replicas via Services and Ingress controllers, and can automatically scale pods based on CPU/memory metrics or custom metrics, ensuring the deep learning model handles variable loads without downtime.
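The scaling decision described in the Kubernetes Horizontal Pod Autoscaler documentation is desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below is a simplified Python rendering of that rule with a tolerance band and replica bounds; the function name and example values are assumptions, and the real controller applies additional stabilization logic.

```python
import math

def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified form of the Horizontal Pod Autoscaler scaling rule."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, do nothing
    desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))

# Hypothetical: 4 pods at 90% average CPU with a 60% target.
print(desired_replicas(4, 90, 60))
```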

Exam trap

CompTIA often tests the misconception that embedding AI models directly into mobile apps or running them on a single server is sufficient for production, when in reality enterprise-grade deployments require container orchestration for resilience and elasticity.

How to eliminate wrong answers

Option A is wrong because running the model on a laptop during business hours lacks any production-grade availability, scalability, or fault tolerance; it is a single point of failure and cannot handle concurrent requests. Option B is wrong because a monolithic application on a single server creates a single point of failure, cannot scale horizontally, and offers no load balancing or automated recovery, making it unsuitable for high availability. Option C is wrong because embedding the model directly into a mobile app offloads inference to client devices, which introduces latency, security risks, and inconsistent performance; it does not provide centralized high availability or scalability for the production service.
