Scaling prototypes into ML models · hard · Multiple Choice · Objective-mapped

Quick Answer

The correct answer is that data drift is the most likely cause of the performance degradation and latency increase. When the production data distribution no longer matches the training data, the model makes more incorrect predictions, and out-of-distribution inputs can trigger additional computation, such as fallback logic or uncertainty estimation, which can drive up latency. On the Google Professional Machine Learning Engineer exam, this scenario tests your understanding of how data drift affects both accuracy and operational metrics; a common trap is to overlook that degraded predictions can consume extra resources. A memory tip is the “drift double-hit”: accuracy drops while latency rises, because the serving path does extra work on unfamiliar data.

PMLE Scaling prototypes into ML models Practice Question

This PMLE practice question tests your understanding of scaling prototypes into ML models. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Once you have made your selection, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

Exhibit

Refer to the exhibit.

```
Model accuracy: 0.92
Training data: 10,000 records
Online prediction latency: 95th percentile = 450ms
QPS: 50

After moving to production:
- New data from users: 100,000 records/day
- Data distribution shift detected (new features emerge)
- Prediction latency increases to 95th percentile = 1200ms
- QPS drops to 30
```

A team deployed a prototype classification model to Vertex AI Prediction. After a week, they notice the metrics shown in the exhibit. What is the most likely cause of the performance degradation and latency increase?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

  • Clue: "most likely"

    Why it matters: Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

Correct answer & explanation

The training data does not represent the current production data distribution, causing the model to make incorrect predictions and requiring more computation.

The exhibit reports a detected data distribution shift together with higher p95 latency and lower throughput, and the question stem adds that performance has degraded. Option C is correct because when the production data distribution shifts away from the training data (data drift), the model makes more incorrect predictions, which can trigger additional computation (e.g., retries, fallback logic, or increased uncertainty estimation) and cause latency spikes. Vertex AI Prediction does not inherently add computation for wrong predictions, but the model's internal confidence thresholds or post-processing steps may consume extra resources when handling out-of-distribution inputs.
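
To make the "extra computation" idea concrete, here is a minimal sketch of a serving path that routes low-confidence predictions to a slower fallback. Everything in it is hypothetical: `fast_model`, `slow_fallback`, and `CONFIDENCE_THRESHOLD` are illustrative stand-ins, not part of Vertex AI or of the scenario's actual model. It only shows how drifted inputs can send more requests down the expensive branch.

```python
from dataclasses import dataclass
from typing import List, Tuple

CONFIDENCE_THRESHOLD = 0.7  # hypothetical cutoff, not a Vertex AI setting


@dataclass
class Result:
    label: int
    confidence: float
    used_fallback: bool


def fast_model(x: float) -> Tuple[int, float]:
    """Stand-in classifier: confident only inside the assumed training range [0, 1]."""
    if 0.0 <= x <= 1.0:
        return (1 if x >= 0.5 else 0), 0.95
    return 0, 0.55  # out-of-distribution input yields low confidence


def slow_fallback(x: float) -> Tuple[int, float]:
    """Stand-in for costlier post-processing (rules, a second model, and so on)."""
    return (1 if x > 0 else 0), 0.8


def predict(x: float) -> Result:
    """Use the fast path when confident, otherwise pay for the fallback."""
    label, conf = fast_model(x)
    if conf >= CONFIDENCE_THRESHOLD:
        return Result(label, conf, used_fallback=False)
    label, conf = slow_fallback(x)
    return Result(label, conf, used_fallback=True)


def fallback_rate(inputs: List[float]) -> float:
    """Fraction of requests that needed the slower branch."""
    results = [predict(x) for x in inputs]
    return sum(r.used_fallback for r in results) / len(results)


training_like = [0.1, 0.4, 0.6, 0.9]   # inside the assumed training range
drifted = [0.2, 1.5, 2.0, -0.3]        # mostly outside it

print(fallback_rate(training_like))  # 0.0
print(fallback_rate(drifted))        # 0.75
```

If the fallback branch is several times slower than the fast path, a rising fallback rate would show up in tail latency first, which is consistent with the p95 figure in the exhibit.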

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • The prediction endpoint's autoscaling is too slow, causing requests to queue and time out.

    Why it's wrong here

    Autoscaling may contribute but does not explain the accuracy drop.

  • The prediction requests are too large, exceeding the maximum request size limit for Vertex AI.

    Why it's wrong here

    Request size limit would cause errors, not a gradual latency increase.

  • The training data does not represent the current production data distribution, causing the model to make incorrect predictions and requiring more computation.

    Why this is correct

    Data distribution shift degrades accuracy and can increase latency if the model is uncertain.

    Clue confirmation

    The clue word "most likely" in the question points toward this answer.

    Related concept

    Read the scenario before looking for a memorised answer.

  • The custom prediction container uses outdated libraries that are incompatible with Vertex AI's runtime.

    Why it's wrong here

    Library incompatibility would cause errors, not a gradual latency increase and accuracy drop.

Common exam traps

Common exam trap: answer the scenario, not the keyword

The exam often tests the misconception that a latency increase must come from infrastructure issues (autoscaling or request size) rather than model behavior. The key clue is that the exhibit reports a detected distribution shift alongside the degraded performance, which points to data drift as the root cause.
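
As a quick sanity check on the exhibit, the snippet below works out the relative changes in p95 latency and throughput. It uses only the four numbers given in the exhibit; the dictionary keys are illustrative names.

```python
# Values copied from the exhibit; key names are illustrative.
before = {"p95_latency_ms": 450, "qps": 50}
after = {"p95_latency_ms": 1200, "qps": 30}

latency_ratio = after["p95_latency_ms"] / before["p95_latency_ms"]
qps_change_pct = (after["qps"] - before["qps"]) * 100 / before["qps"]

print(f"p95 latency grew {latency_ratio:.2f}x")  # p95 latency grew 2.67x
print(f"QPS changed by {qps_change_pct:.0f}%")   # QPS changed by -40%
```

Latency nearly tripling while throughput falls by 40% is a sizeable operational change, but on its own it does not identify the cause; the distribution shift reported in the exhibit is what separates option C from the infrastructure-only options.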

Detailed technical explanation

How to think about this question

Data drift is often detected by comparing the distribution of prediction inputs against the training features, using measures such as the Kolmogorov-Smirnov test or the Population Stability Index (PSI). In Vertex AI, you can enable Model Monitoring to detect training-serving skew and prediction drift and to raise alerts, which a team can then use to trigger retraining. A real-world scenario is a retail demand model trained on pre-pandemic data failing during a pandemic because customer behavior shifted. Accuracy falls, and latency can also rise if the serving path does extra work on uncertain inputs (e.g., a Bayesian neural network configured to draw more samples).
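
The two drift measures named above can be sketched in plain Python. This is a minimal illustration on synthetic data, not how Vertex AI Model Monitoring computes drift internally; the function names, bin count, and sample values are all assumptions made for the example.

```python
import math
from bisect import bisect_right
from typing import List, Sequence


def psi(expected: Sequence[float], actual: Sequence[float], bins: int = 10) -> float:
    """Population Stability Index over equal-width bins spanning the baseline range."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant baselines

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        eps = 1e-6  # floor so empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Synthetic feature values: the serving sample is the baseline shifted by 0.5.
training = [i / 100 for i in range(100)]
serving = [0.5 + i / 100 for i in range(100)]

print("PSI (no drift):", psi(training, training))
print("PSI (shifted): ", psi(training, serving))
print("KS  (no drift):", ks_statistic(training, training))
print("KS  (shifted): ", ks_statistic(training, serving))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, though the threshold is a convention rather than a guarantee and should be tuned per feature.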

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described — not a general cloud recommendation. Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option. Cloud exam questions reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PMLE question test?

This question tests Scaling prototypes into ML models. The key habit is to read the scenario before looking for a memorised answer.

What is the correct answer to this question?

The correct answer is option C: the training data does not represent the current production data distribution, causing the model to make incorrect predictions and requiring more computation. When the production data distribution shifts away from the training data (data drift), the model makes more incorrect predictions, which can trigger additional computation (e.g., retries, fallback logic, or increased uncertainty estimation) and cause latency spikes. Vertex AI Prediction does not inherently add computation for wrong predictions, but the model's internal confidence thresholds or post-processing steps may consume extra resources when handling out-of-distribution inputs.

What should I do if I get this PMLE question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

Are there clue words in this question I should notice?

Yes — watch for: "most likely". Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Last reviewed: Jun 30, 2026
