- A
Review the training pipeline's hyperparameter tuning configuration to ensure it is not overfitting to stale data.
Why wrong: Hyperparameter tuning is unlikely to be the root cause; the issue is more likely training-serving skew or concept drift.
- B
Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.
Canary testing can catch performance issues early before the model is fully deployed.
- C
Compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring.
This can detect data skew, which is a common cause of performance degradation.
- D
Retrain the model using a longer training history to include older data that may still be relevant.
Why wrong: Adding older data might dilute recent patterns; the issue is likely skew or drift, not lack of data.
- E
Implement model validation on the deployed endpoint by logging predictions and comparing against actuals for a sample of traffic using Vertex Explainable AI.
This helps monitor actual model performance in production and detect drift.
Diagnosing AUC Drop Despite Training Validation
This PDE practice question tests your understanding of operationalizing machine learning models. The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. The key principle to apply is data drift. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.
An MLOps team manages a pipeline that retrains an XGBoost classifier weekly using BigQuery data. The pipeline is orchestrated with Cloud Composer and deploys the new model to a Vertex AI endpoint if the validation metric is met (AUC > 0.9). Over the past month, the deployed model's AUC has dropped from 0.95 to 0.88, even though the training pipeline consistently reports AUC > 0.9. Which THREE steps should the team take to diagnose and fix this issue?
Quick Answer
The three correct steps are B, C, and E: add a canary deployment step, compare training and serving feature distributions with Vertex AI Model Monitoring, and validate the deployed endpoint by logging predictions and comparing them against actuals for a sample of traffic. The core issue is a gap between offline and online performance: the model clears the pipeline's validation gate on data seen at training time but degrades on the distribution it meets at inference, which points to training-serving skew or drift. Logging live predictions and joining them with ground-truth labels measures the real production AUC, while feature attributions from Vertex Explainable AI can help show which inputs are behind the change. On the Google Professional Data Engineer exam, this scenario tests your understanding of MLOps monitoring and the difference between offline validation and online evaluation, a common trap where candidates focus only on retraining or hyperparameter tuning. A useful memory tip is “validate where you serve”: always monitor the endpoint, not just the pipeline, to catch silent model decay.
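The "validate where you serve" idea can be sketched in a few lines of Python. This is a minimal illustration rather than the Vertex AI API: it assumes that scores logged from the endpoint and the ground-truth labels that arrive later are already available as dictionaries keyed by a request ID. The names `roc_auc`, `online_auc`, `logged_scores`, and `actuals` are hypothetical.

```python
from typing import Dict, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the rank-sum (Mann-Whitney) formulation; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        rank_sum_pos += avg_rank * sum(label for _, label in pairs[i:j])
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def online_auc(logged_scores: Dict[str, float], actuals: Dict[str, int]) -> float:
    """Join logged predictions with actuals on request ID, then score only the labeled sample."""
    labeled_ids = sorted(logged_scores.keys() & actuals.keys())
    return roc_auc(
        [actuals[rid] for rid in labeled_ids],
        [logged_scores[rid] for rid in labeled_ids],
    )


# Hypothetical sample: request r5 has no label yet, so it is skipped.
logged_scores = {"r1": 0.9, "r2": 0.2, "r3": 0.7, "r4": 0.4, "r5": 0.6}
actuals = {"r1": 1, "r2": 0, "r3": 0, "r4": 1}
print(online_auc(logged_scores, actuals))  # 0.75
```

Tracking this number per week next to the pipeline's offline AUC makes the gap described in the scenario visible as soon as labels arrive.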
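Correct answer & explanation
Correct answers: B, C, and E
B. Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.
C. Compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring.
E. Implement model validation on the deployed endpoint by logging predictions and comparing against actuals for a sample of traffic using Vertex Explainable AI.
Option B is correct because a canary deployment lets the team roll out the new model to a small percentage of traffic, enabling early detection of performance degradation in production before a full rollout. This step directly addresses the discrepancy between training metrics and live performance by exposing the model to real-world data patterns that may differ from the training set. In Vertex AI, a canary can be implemented with the endpoint's traffic split, which routes a fraction of requests to the new model version, while Cloud Composer orchestrates the rollout steps; the canary's AUC can then be tracked as ground-truth labels arrive. Option C is correct because comparing training and serving feature distributions detects data skew, a common cause of performance degradation. Option E is correct because logging predictions and comparing them against actuals measures the model's real performance in production, which the training pipeline's validation metric cannot show.
Key principle: Data Drift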
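A canary gate can be expressed as a small decision rule. The sketch below is plain Python, not a Vertex AI or Cloud Composer API: the 0.9 AUC floor comes from the scenario, while `max_regression`, `min_samples`, the function names, and the `"baseline"`/`"canary"` keys are assumptions made for illustration. On a real endpoint the split would be keyed by deployed model IDs, with percentages summing to 100.

```python
from typing import Dict


def canary_decision(
    canary_auc: float,
    baseline_auc: float,
    n_labeled: int,
    min_auc: float = 0.9,          # validation gate taken from the scenario
    max_regression: float = 0.02,  # assumed tolerance versus the current model
    min_samples: int = 500,        # assumed minimum number of labeled canary requests
) -> str:
    """Return 'wait', 'rollback', or 'promote' for a canary model version."""
    if n_labeled < min_samples:
        return "wait"  # not enough ground truth yet to trust the online AUC
    if canary_auc < min_auc or canary_auc < baseline_auc - max_regression:
        return "rollback"
    return "promote"


def next_traffic_split(decision: str, canary_pct: int) -> Dict[str, int]:
    """Map a decision to the next traffic split (percentages sum to 100)."""
    if decision == "rollback":
        canary_pct = 0
    elif decision == "promote":
        canary_pct = 100
    # 'wait' keeps the current canary share unchanged.
    return {"baseline": 100 - canary_pct, "canary": canary_pct}


decision = canary_decision(canary_auc=0.88, baseline_auc=0.93, n_labeled=1000)
print(decision, next_traffic_split(decision, canary_pct=10))
```

Requiring a minimum number of labeled requests before deciding is a deliberate choice here: an online AUC computed from a handful of labels is too noisy to justify either a rollback or a promotion.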
Answer analysis
Option-by-option breakdown
For each option: why learners choose it and why it is or isn't the right answer here.
- ✗ A. Review the training pipeline's hyperparameter tuning configuration to ensure it is not overfitting to stale data.
  Why it's wrong here: Hyperparameter tuning is unlikely to be the root cause; the issue is more likely training-serving skew or concept drift.
- ✓ B. Add a canary deployment step where the new model version receives a small percentage of traffic before full rollout.
  Why this is correct: Canary testing can catch performance issues early, before the model is fully deployed.
  Related concept: Data Drift
- ✓ C. Compare feature distributions between the training data and online serving data using Vertex AI Model Monitoring.
  Why this is correct: This can detect data skew, which is a common cause of performance degradation.
  Related concept: Data Drift
- ✗ D. Retrain the model using a longer training history to include older data that may still be relevant.
  Why it's wrong here: Adding older data might dilute recent patterns; the issue is likely skew or drift, not lack of data.
- ✓ E. Implement model validation on the deployed endpoint by logging predictions and comparing against actuals for a sample of traffic using Vertex Explainable AI.
  Why this is correct: This helps monitor actual model performance in production and detect drift.
  Related concept: Data Drift
Common exam traps
Common exam trap: answer the scenario, not the keyword
A common mistake is assuming that retraining with more historical data (Option D) or tuning hyperparameters (Option A) will solve the performance drop. However, the real issue is often data drift or a mismatch between training and serving environments. In Google Cloud, the correct approach is to use Vertex AI Model Monitoring to compare feature distributions, add a canary deployment that tests new models against a small share of live traffic (the Vertex AI endpoint splits the traffic, and Cloud Composer can orchestrate the rollout), and validate the deployed endpoint by logging predictions and comparing them against actual outcomes, with Vertex Explainable AI supplying feature attributions for the diagnosis.
Detailed technical explanation
How to think about this question
Vertex AI Model Monitoring (Option C) detects training-serving skew by comparing the feature distributions of online serving data against the training data, and drift by comparing recent serving data against earlier serving data. It scores the difference with distance measures: Jensen-Shannon divergence for numerical features and L-infinity distance for categorical features. When a model's AUC drops in production despite a high training AUC, it often indicates covariate shift (a change in the feature distribution) or concept drift (a change in the relationship between features and target). Feature-distribution monitoring catches the former; the latter shows up only when predictions are compared with actuals. Canary deployments (Option B) combined with prediction logging and actuals comparison (Option E) form a robust feedback loop that catches such degradation early, allowing rollback before full rollout.
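The two distance measures named above are simple to compute by hand, which helps when interpreting a monitoring alert. The following is a minimal standard-library sketch, not the Model Monitoring implementation: it assumes numerical features have already been bucketed into the same bins for training and serving data, and the function names and sample counts are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn raw bin counts into a probability distribution."""
    total = float(sum(counts))
    return [c / total for c in counts]


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence (log base 2, so the result lies in [0, 1])."""
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def l_infinity(train: Dict[str, int], serve: Dict[str, int]) -> float:
    """Largest absolute gap in category frequency between two count tables."""
    t_total, s_total = sum(train.values()), sum(serve.values())
    categories = set(train) | set(serve)
    return max(
        abs(train.get(c, 0) / t_total - serve.get(c, 0) / s_total)
        for c in categories
    )


# Numerical feature: same bins, serving mass has shifted toward higher values.
train_hist = normalize([40, 30, 20, 10])
serve_hist = normalize([10, 20, 30, 40])
print(round(js_divergence(train_hist, serve_hist), 4))

# Categorical feature: a category unseen in training appears at serving time.
print(l_infinity({"web": 70, "app": 30}, {"web": 40, "app": 40, "kiosk": 20}))
```

A score of 0 means the two distributions match; an alerting threshold would be chosen per feature, and none is assumed here.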
Key Concepts to Remember
- Data Drift
- Canary Deployment
- Model Validation
Exam Day Tips
- Watch for words such as best, first, most likely and least administrative effort.
- Review why wrong options are wrong, not only why the correct option is correct.
Key takeaway
Data Drift
Real-world example
How this comes up in practice
A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described, not a general cloud recommendation. Exam questions on data drift reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.
What to study next
Got this wrong? Here's your next step.
Review data drift, then practise related PDE questions on the same topic to reinforce the concept.
About these practice questions
Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.
Last reviewed: Jul 4, 2026
This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.