PMLE · topic practice

Monitoring ML solutions practice questions

Practise the Monitoring ML solutions domain of the Google Professional Machine Learning Engineer (PMLE) exam with original exam-style scenarios, answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations, not to memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security
20 questions · Domain: Monitoring ML solutions

What the exam tests

What to know about Monitoring ML solutions

Monitoring ML solutions questions test whether you can apply the concept in context, not just recognise a definition.

  • How the topic appears in realistic exam-style scenarios.
  • Which detail in the question changes the correct answer.
  • How to eliminate plausible but wrong options.
  • How to connect the question back to the wider exam objective.

Watch out for

Common Monitoring ML solutions exam traps

  • Answering from memory before reading the full scenario.
  • Missing a constraint such as cost, availability, security, scope or command context.
  • Choosing a broad answer when the question asks for the most specific fix.
  • Ignoring why the wrong options are tempting.

Practice set

Monitoring ML solutions questions

You have deployed a regression model that predicts house prices. Over the past month, the model's predictions have been consistently too high. You suspect data drift in the input features. Which monitoring metric should you prioritize to confirm this?

Your team has deployed a text classification model on Vertex AI Endpoints. You notice that the model's latency has increased significantly over the last week, but the request rate has remained stable. Which of the following is the most likely cause?

You are monitoring a classification model that predicts loan default. The model was trained on data from 2020-2022. In 2023, the economic conditions changed, and the model's accuracy dropped significantly. Which monitoring approach would best help you detect this issue early?

You are responsible for monitoring a batch prediction pipeline that runs daily. Recently, the pipeline started failing intermittently with out-of-memory errors. The input data volume has not changed. What is the most likely cause?

You need to set up monitoring for a Vertex AI model that serves predictions in real time. The model has a latency SLA of under 100 ms. Which metric should you configure an alert on to ensure the SLA is met?
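
To see why the choice of latency statistic matters in a scenario like this one, the sketch below compares the mean with a tail percentile on made-up latency samples. The function names, the nearest-rank percentile method, and the sample values are illustrative assumptions, not part of any Vertex AI or Cloud Monitoring API.

```python
def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of
    all samples at or below it. pct is an integer from 1 to 100."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def breaches_sla(latencies_ms, sla_ms=100.0, pct=99):
    """True when the chosen tail percentile is not under the SLA threshold."""
    return percentile(latencies_ms, pct) >= sla_ms


# Hypothetical window of 100 requests: most are fast, two are very slow.
latencies_ms = [20.0] * 98 + [150.0, 400.0]
mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
```

In this made-up window the mean stays far below 100 ms while the 99th percentile does not, which is the kind of gap an alert on an average would hide.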

Your company uses a custom container for model serving on Vertex AI. After a recent update, the model returns predictions but they are clearly wrong (e.g., negative probabilities for a classification model). The logs show no errors. What is the most likely cause?

You are monitoring a machine learning pipeline that runs on Vertex AI Pipelines. The pipeline occasionally fails with a 'ResourceExhausted' error when attempting to read data from BigQuery. Which action should you take to resolve this issue?

You have an online prediction model that is showing increasing prediction latency. You have already verified that the request rate and input data size are unchanged. Which of the following should you investigate next?

Which TWO metrics should you monitor to detect data drift in a batch prediction pipeline?
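
As background for drift questions like this one, here is a minimal sketch of two commonly used distribution-distance measures for a numerical feature: Jensen-Shannon divergence and the population stability index (PSI). It assumes you bin the baseline and the current batch with the same edges; the helper names and the epsilon floor are illustrative choices, and any alert threshold would be yours to pick.

```python
import math


def histogram(values, edges):
    """Fraction of values per bin. Bin i covers [edges[i], edges[i+1]);
    values outside the range are clamped into the first or last bin."""
    if not values:
        raise ValueError("values must be non-empty")
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = len(counts) - 1
        for i in range(len(counts)):
            if v < edges[i + 1]:
                idx = i
                break
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence with log base 2, so the result is in [0, 1]."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def psi(expected, actual, eps=1e-6):
    """Population stability index; eps floors empty bins to avoid log(0)."""
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

A typical use would be to compute `histogram` once on the training or previous-batch values, reuse the same `edges` for each new batch, and track either distance over time.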

Which THREE components should you include in a comprehensive model monitoring dashboard for a production ML system?

Which TWO actions are appropriate when you detect that a production model's prediction distribution has shifted significantly from the training distribution?

You are the ML engineer for a financial services company. You have deployed a fraud detection model on Vertex AI Endpoints using a custom container. The model is a gradient boosting model trained on transactional data. Over the past week, the model's precision has dropped from 95% to 80%, while recall has remained stable. The input data volume and distribution have not changed significantly. The model is served on a single endpoint with autoscaling enabled (min replicas=2, max replicas=10). You notice that the average CPU utilization of the serving containers has increased from 40% to 90%, and the p99 latency has increased from 50ms to 200ms. The model is retrained weekly using the latest data, and the last retraining was 3 days ago. The logs show no errors, and the model version is unchanged. Given these symptoms, what is the most likely cause of the precision drop?

A data science team deploys a regression model to predict house prices. After one month, the mean absolute error (MAE) on the serving data increases by 20% compared to the test set. Which monitoring strategy should the team implement first to diagnose the issue?

An e-commerce company uses a recommendation model deployed on Vertex AI Endpoints. The model's latency increases gradually over two weeks, causing timeouts. The model is served using a custom container. What is the most likely root cause and corrective action?

A financial services firm deploys a binary classification model for fraud detection. The model's precision is 0.95 and recall is 0.60 on the test set. After deployment, the fraud rate in production is 0.5% compared to 5% in the test set. The model shows good calibration on the test set (Brier score 0.02) but poor calibration in production (Brier score 0.15). What is the most likely explanation for the calibration degradation?
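
One factor worth examining in a scenario like this is the difference in base rate between test and production data. The sketch below shows how a Brier score is computed and how a predicted probability can be rescaled for a different class prior using the standard odds-ratio adjustment. It assumes only the class prior changed (not the feature distribution within each class), and the function names are illustrative.

```python
def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def adjust_for_prior(p, train_rate, prod_rate):
    """Rescale probability p from the training base rate to the production
    base rate by multiplying the odds by the ratio of prior odds.
    Assumes only the class prior shifted between the two settings."""
    ratio = (prod_rate / (1 - prod_rate)) / (train_rate / (1 - train_rate))
    # Equivalent to odds / (1 + odds), written to stay defined at p == 1.
    return p * ratio / (p * ratio + (1 - p))


# With the rates from the scenario (5% in test, 0.5% in production), a score
# equal to the test base rate maps to the production base rate.
adjusted = adjust_for_prior(0.05, train_rate=0.05, prod_rate=0.005)
```

Comparing the Brier score of raw and adjusted probabilities on labelled production data is one way to check whether a prior shift accounts for the calibration gap.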

A company implements an ML pipeline using Vertex AI Pipelines. The pipeline trains a model using custom training jobs and then deploys it to an endpoint. The team notices that the endpoint occasionally serves an older model version for a few minutes after a new pipeline run completes. What is the most likely cause?

A team has deployed a model on Vertex AI Prediction and wants to monitor for data drift. Which TWO metrics should they use to detect drift in numerical features?

A company uses Vertex AI Model Monitoring to detect training-serving skew. They have a categorical feature 'product_category' with high cardinality. The monitoring job alerts for skew, but the data scientists believe the model performance is still acceptable. Which THREE actions should the team take to investigate and resolve the alert?
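
When investigating a skew alert on a categorical feature, it can help to see which categories drive the distance. The sketch below computes an L-infinity distance between two category distributions, which, to the best of my knowledge, is the measure the Vertex AI Model Monitoring documentation describes for categorical features (worth verifying against the current docs). The helper names and sample categories are hypothetical.

```python
from collections import Counter


def category_distribution(values):
    """Relative frequency of each category in a list of values."""
    if not values:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    return {k: c / len(values) for k, c in counts.items()}


def l_infinity_distance(p, q):
    """Largest absolute difference in frequency across all categories."""
    keys = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def top_contributors(p, q, n=3):
    """Categories with the largest frequency change from p to q, signed."""
    keys = set(p) | set(q)
    diffs = {k: q.get(k, 0.0) - p.get(k, 0.0) for k in keys}
    return sorted(diffs.items(), key=lambda kv: abs(kv[1]), reverse=True)[:n]


# Hypothetical product_category samples: "c" appears only at serving time.
train = category_distribution(["a", "a", "a", "b"])
serve = category_distribution(["a", "c", "c", "c"])
distance = l_infinity_distance(train, serve)
biggest = top_contributors(train, serve, n=1)
```

With a high-cardinality feature, listing the top contributors shows whether the alert comes from a few new or renamed categories or from a broad shift.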

You are an ML engineer at a logistics company. You have deployed a deep learning model on Vertex AI Endpoints using a custom container with GPU acceleration. The model predicts delivery times based on route features. After one week, you notice that the endpoint's GPU utilization is consistently at 10%, but the prediction latency has increased by 50%. The number of prediction requests per second has remained stable. You check the container logs and see no errors. The model is served using TensorFlow Serving with batching enabled (batch size: 32, batch timeout: 100ms). The custom container uses a single NVIDIA T4 GPU. You have also set the Vertex AI endpoint to use autoscaling with minReplicaCount: 1 and maxReplicaCount: 5, and the CPU utilization target is 60%. Which action should you take to reduce latency?

A company deploys a custom ML model on Vertex AI to predict customer churn. The model retrains weekly, and predictions are served via a Vertex AI endpoint. After a recent retraining, the monitoring dashboard shows a sudden increase in prediction requests but a decrease in predicted churn probabilities. The model's accuracy on the validation set remains stable. What is the most likely cause of the observed behavior?

Frequently asked questions

What does the PMLE exam test about Monitoring ML solutions?
Monitoring ML solutions questions test whether you can apply the concept in context, not just recognise a definition.

How should I use these practice questions?
Select your answer before revealing the explanation. Then read why each option is right or wrong; this active recall approach builds retention far faster than re-reading notes.

Can I practise just Monitoring ML solutions questions in a focused session?
Yes. The session launcher on this page draws every question from the Monitoring ML solutions domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.

Where can I practise other PMLE topics?
Use the topic links above to move to related areas, or go back to the PMLE question bank to see all topics.

Are these real exam questions or dumps?
These are original practice questions written to test the same concepts the PMLE exam covers. They are not copied from any real exam or dump site.