Operationalizing machine learning models · hard · Multiple Select · Objective-mapped

Diagnosing AUC Drop Despite Training Validation

This PDE practice question tests your understanding of operationalizing machine learning models. The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. A key principle to apply: data drift. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.

An MLOps team manages a pipeline that retrains an XGBoost classifier weekly using BigQuery data. The pipeline is orchestrated with Cloud Composer and deploys the new model to a Vertex AI endpoint if validation metrics (AUC > 0.9) are met. Over the past month, the deployed model's AUC has dropped from 0.95 to 0.88, despite the training pipeline consistently reporting AUC > 0.9. Which THREE steps should the team take to diagnose and fix this issue?

A. Review the training pipeline's hyperparameter tuning configuration to ensure it is not overfitting to stale data.
Why wrong: Hyperparameter tuning is unlikely to be the root cause; the issue is more likely training-serving skew or concept drift.
B. Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.
Why correct: Canary testing can catch performance issues before the model is fully deployed.
C. Compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring.
Why correct: This can detect data skew, which is a common cause of performance degradation.
D. Retrain the model using a longer training history to include older data that may still be relevant.
Why wrong: Adding older data might dilute recent patterns; the issue is likely skew or drift, not lack of data.
E. Implement model validation on the deployed endpoint by logging predictions and comparing against actuals for a sample of traffic using Vertex Explainable AI.
Why correct: This helps monitor actual model performance in production and detect drift.

Quick Answer

The correct answers are B, C, and E: add a canary deployment step, compare feature distributions between training and serving data with Vertex AI Model Monitoring, and validate the deployed endpoint by logging predictions and comparing them against actuals for a sample of traffic. The training pipeline keeps reporting AUC > 0.9 because it validates on data that resembles what the model was trained on, while the live endpoint sees a distribution that has shifted through training-serving skew or drift. Capturing live predictions and joining them to actual outcomes measures the real production AUC; comparing feature distributions, and where useful feature attributions, shows which inputs have moved away from the training dataset. On the Google Professional Data Engineer exam, this scenario tests your understanding of MLOps monitoring and the difference between offline validation and online evaluation, a common trap where candidates focus only on retraining or hyperparameter tuning. A useful memory tip is "validate where you serve": always monitor the endpoint, not just the pipeline, to catch silent model decay.
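To make "comparing against actuals" concrete, here is a minimal sketch of how a production AUC could be computed from logged predictions once ground-truth labels arrive. It is plain Python, not a Vertex AI API: the function names, the request-ID join key, and the dictionary inputs are all assumptions chosen for illustration, and in practice the logs would more likely live in BigQuery.

```python
from typing import Dict, List, Tuple


def join_predictions_with_actuals(
    predictions: Dict[str, float], actuals: Dict[str, int]
) -> List[Tuple[float, int]]:
    """Pair each logged score with its label, keyed by a (hypothetical) request ID.

    Requests whose actual outcome has not arrived yet are skipped.
    """
    return [
        (score, actuals[request_id])
        for request_id, score in predictions.items()
        if request_id in actuals
    ]


def auc(pairs: List[Tuple[float, int]]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [score for score, label in pairs if label == 1]
    negatives = [score for score, label in pairs if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


def passes_online_gate(pairs: List[Tuple[float, int]], threshold: float = 0.9) -> bool:
    """Apply the same AUC > 0.9 bar the training pipeline uses, but to live traffic."""
    return auc(pairs) > threshold
```

The pairwise loop is quadratic, which is fine for a traffic sample; a rank-based formula would suit larger volumes. The point of the design is that the gate runs on serving data, so it can fail even when the offline validation step passes.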

Answer choices

Why each option matters

Correct answer & explanation

✓

Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.

Option B is correct because a canary deployment rolls the new model out to a small percentage of traffic, so degradation shows up in production before a full rollout. It directly addresses the gap between training metrics and live performance by exposing the model to real-world data that may differ from the training set. On Vertex AI, this is done by setting the endpoint's traffic split so that a fraction of requests goes to the new deployed model, with Cloud Composer orchestrating the rollout; the canary's AUC can then be computed once actual outcomes for its logged predictions arrive. Options C and E are the other two correct choices: C identifies which features have shifted, and E measures production performance against actuals.
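As an illustration of the canary step, the sketch below computes a traffic split that gives a new model version a small share of requests and scales the existing shares down. A Vertex AI endpoint's traffic split is a map from deployed model ID to a percentage, and the percentages must add up to 100; the helper itself, its name, and the model IDs are hypothetical and are not part of any Google Cloud SDK.

```python
from typing import Dict


def canary_traffic_split(
    current_split: Dict[str, int], new_model_id: str, canary_percent: int
) -> Dict[str, int]:
    """Give the new model `canary_percent` of traffic and shrink the rest to fit."""
    if not 0 < canary_percent < 100:
        raise ValueError("canary_percent must be between 1 and 99")
    if sum(current_split.values()) != 100:
        raise ValueError("current split must sum to 100")
    if new_model_id in current_split:
        raise ValueError("new model is already serving traffic")

    remaining = 100 - canary_percent
    # Scale each existing share proportionally, rounding down to whole percents.
    new_split = {
        model_id: share * remaining // 100
        for model_id, share in current_split.items()
    }
    # Hand any rounding remainder to the model that had the largest share.
    largest = max(current_split, key=current_split.get)
    new_split[largest] += remaining - sum(new_split.values())
    new_split[new_model_id] = canary_percent
    return new_split
```

A pipeline task could pass the resulting map to whatever deployment call it uses, then promote or roll back the canary depending on the monitored metrics.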

Key principle: Data Drift

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

✗
Review the training pipeline's hyperparameter tuning configuration to ensure it is not overfitting to stale data.
Why it's wrong here
Hyperparameter tuning is unlikely to be the root cause; the issue is more likely training-serving skew or concept drift.
✓
Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.
Why this is correct
Canary testing can catch performance issues early before the model is fully deployed.
Related concept
Data Drift
✓
Compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring.
Why this is correct
This can detect data skew, which is a common cause of performance degradation.
Related concept
Data Drift
✗
Retrain the model using a longer training history to include older data that may still be relevant.
Why it's wrong here
Adding older data might dilute recent patterns; the issue is likely skew or drift, not lack of data.
✓
Implement model validation on the deployed endpoint by logging predictions and comparing against actuals for a sample of traffic using Vertex Explainable AI.
Why this is correct
This helps monitor actual model performance in production and detect drift.
Related concept
Data Drift

Common exam traps

Common exam trap: answer the scenario, not the keyword

A common mistake is assuming that retraining with more historical data (Option D) or tuning hyperparameters (Option A) will solve the performance drop. Neither explains why offline validation passes while production AUC falls; that pattern points to data drift or a mismatch between training and serving data. In Google Cloud, the correct approach is to use Vertex AI Model Monitoring to compare feature distributions (Option C), route a small share of live traffic to each new model as a canary step in the Cloud Composer pipeline (Option B), and log endpoint predictions and compare them against actual outcomes (Option E). Note that Vertex Explainable AI, named in Option E, supplies feature attributions; the prediction logs themselves typically come from the endpoint's request-response logging.

Detailed technical explanation

How to think about this question

Vertex AI Model Monitoring (Option C) detects training-serving skew by comparing serving feature distributions with the training data, and prediction drift by comparing serving distributions over time. It scores the difference with a distance measure: Jensen-Shannon divergence for numerical features and L-infinity distance for categorical features. When a model's AUC drops in production despite high training AUC, it often indicates covariate shift (feature distribution change) or concept drift (change in the relationship between features and target). Feature-distribution checks catch the former; only a comparison against actuals (Option E) exposes the latter. Canary deployments (Option B) combined with prediction logging and actuals comparison (Option E) form a feedback loop that catches such degradation early, allowing rollback before full rollout.
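The two distance measures named above are simple to compute by hand, which helps when interpreting a monitoring alert. The following is a minimal sketch that takes raw counts per category (or per histogram bin, for a numerical feature) from training and serving data. It illustrates the formulas only; the function names are assumptions, and it does not reproduce how Vertex AI bins features or sets alert thresholds.

```python
import math
from typing import Dict


def normalize(counts: Dict[str, float]) -> Dict[str, float]:
    """Turn raw counts into a probability distribution."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {key: value / total for key, value in counts.items()}


def l_infinity_distance(train: Dict[str, float], serving: Dict[str, float]) -> float:
    """Largest absolute difference in probability for any single category."""
    p, q = normalize(train), normalize(serving)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def jensen_shannon_divergence(
    train: Dict[str, float], serving: Dict[str, float]
) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint inputs."""
    p, q = normalize(train), normalize(serving)
    keys = set(p) | set(q)
    # The mixture m is the average of the two distributions.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Dict[str, float]) -> float:
        # KL divergence from a to the mixture; zero-probability terms contribute 0.
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)
```

For example, a categorical feature that was 80% "web" and 20% "app" in training but is split 50/50 in serving has an L-infinity distance of about 0.3, which would exceed a threshold set at, say, 0.1.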

Key Concepts to Remember

Data Drift
Canary Deployment
Model Validation

Exam Day Tips

Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

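Data drift: a model can keep passing offline validation while its production performance decays, so monitor the endpoint as well as the pipeline.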

Real-world example

How this comes up in practice

A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described, not a general cloud recommendation. Cloud exam questions on data drift reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.

What to study next

Got this wrong? Here's your next step.

Review data drift, then practise related PDE questions on the same topic to reinforce the concept.

FAQ

Questions learners often ask

What does this PDE question test?

This question tests operationalizing machine learning models, specifically data drift.

What is the correct answer to this question?

The correct answers are B, C, and E: add a canary deployment step where the new model version receives a small percentage of traffic before full rollout; compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring; and validate the deployed endpoint by logging predictions and comparing them against actuals for a sample of traffic. Together they expose the gap between the pipeline's validation AUC and live performance: the canary limits exposure to a degraded model, Model Monitoring shows which features have shifted, and the comparison against actuals measures production AUC directly.

What should I do if I get this PDE question wrong?

Review data drift, then practise related PDE questions on the same topic to reinforce the concept.

What is the key concept behind this question?

Data Drift

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.
