PMLE Practice Questions

Question 1

A travel booking company has a real-time recommendation system that suggests hotels and flights to users. The model is served using TensorFlow Serving on a Google Kubernetes Engine (GKE) cluster with auto-scaling enabled. The cluster uses n1-standard-4 machine types. The team has set up Cloud Monitoring dashboards and alerts. Last week, during a major holiday promotion, the team noticed that the model's inference latency P99 increased from 150 ms to 450 ms over a 30-minute period, while the request throughput increased from 500 to 1,200 requests per second. CPU utilization across the cluster rose to 95%, but memory utilization remained at 60%. The model version and the serving infrastructure configuration have not changed since the last deployment. Which action should the team take to mitigate the latency issue?

Accepted Answer

Add more nodes to the GKE cluster to increase the total CPU resources available for serving. The latency spike is caused by CPU saturation (95% utilization) under increased load (500 to 1,200 RPS). Adding nodes directly increases the total CPU available, so the TensorFlow Serving pods can handle the higher throughput without contention. This is the most immediate, infrastructure-level fix: the model version and serving configuration have not changed, so the regression comes from load rather than from a model or code change.

Answer

Implement a feature engineering pipeline that compresses the input features to reduce data size and inference time.

Answer

Deploy a newer version of the model that uses a more efficient architecture to reduce computational complexity.

Answer

Increase the number of TensorFlow Serving instances by reducing the CPU request per pod in GKE to allow more pods per node.
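
The CPU-saturation reasoning in Question 1 can be turned into a rough sizing estimate. The sketch below uses the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents (desired = ceil(current × currentMetric / targetMetric)). The pod count of 10 and the 60% CPU target are hypothetical values chosen for illustration; the question does not state them.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float) -> int:
    """Proportional scaling rule: ceil(current * currentMetric / targetMetric)."""
    if current_replicas <= 0 or target_cpu_percent <= 0:
        raise ValueError("replica count and target utilization must be positive")
    return math.ceil(current_replicas * current_cpu_percent / target_cpu_percent)


# Hypothetical cluster: 10 serving pods at 95% CPU, aiming for 60% CPU.
print(desired_replicas(10, 95, 60))
```

Extra pods only help if there are nodes with free CPU to schedule them on, which is why the accepted answer adds nodes rather than shrinking per-pod CPU requests.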

Question 2

A global retail company uses Vertex AI Recommendations to provide product recommendations on its website. The company has a large catalog and millions of users. The initial deployment works well for active users, but new users (with no purchase history) receive generic recommendations that are not personalized. The company wants to improve the cold-start experience and has user demographic data (age, location) available at sign-up. The current recommendation model is a collaborative filtering model built with Vertex AI Recommendations. What should the company do to improve personalization for new users?

Accepted Answer

Increase the user exploration parameter in the Vertex AI Recommendations configuration. Option C is correct because increasing the user exploration parameter in Vertex AI Recommendations instructs the model to allocate a higher percentage of recommendations to items with less historical data, effectively enabling personalized suggestions for cold-start users based on available demographic signals. This parameter directly controls the balance between exploiting known user-item interactions and exploring new or less-seen items, which is the standard mechanism within Vertex AI's built-in collaborative filtering to address the cold-start problem without requiring a custom model.

Answer

Collect more historical interaction data before showing recommendations

Answer

Disable recommendations for new users until they have at least 10 interactions

Answer

Build a custom two-tower recommendation model using Vertex AI Training

Question 3

Your team is developing a machine learning model for real-time fraud detection. The training pipeline runs on Vertex AI and uses BigQuery for feature engineering. Recently, the pipeline has been taking significantly longer to execute. Upon investigation, you find that the BigQuery query for feature extraction is being rerun every time the pipeline runs, even though the underlying data hasn't changed. The pipeline is scheduled to run every hour. You want to reduce cost and execution time without losing the ability to detect data drifts. Which approach should you take?

Accepted Answer

Move the feature extraction to a separate scheduled query in BigQuery and load the results into a table that the pipeline reads from. Option B is correct because it decouples feature extraction from the training pipeline: a separate scheduled BigQuery query writes its results to a table. This eliminates redundant query execution on every pipeline run, reducing cost and execution time, while the scheduled query can still run often enough (e.g., hourly) to detect data drift. The pipeline then reads from the precomputed table and avoids repeated full scans of the source data.

Answer

Implement a caching mechanism in the pipeline that stores the results of the BigQuery query and reuses them if the data hasn't changed.

Answer

Reduce the pipeline frequency to once a day to minimize the number of runs.

Answer

Use a conditional pipeline that checks if the data has changed before running the feature extraction step.

Question 4

A healthcare organization is building a machine learning model to predict patient readmission risk. They have sensitive data stored in BigQuery that includes protected health information (PHI). The data science team uses Vertex AI Workbench notebooks to explore the data and develop models. The organization's security policy requires that all PHI data must be encrypted at rest and in transit, and that access to the data is logged and audited. They also need to ensure that the data used for model training is de-identified to remove direct identifiers such as patient names and SSNs. The team wants to automate the de-identification process as part of the data pipeline. Which approach meets these requirements?

Accepted Answer

Create a Dataflow pipeline that reads from the original BigQuery table, applies Cloud DLP de-identification transforms, and writes to a new BigQuery table. Grant the data science team access to the de-identified table. Option A is correct because it uses Cloud DLP within a Dataflow pipeline to automatically de-identify PHI as it is read from the original BigQuery table and written to a new, de-identified table. This satisfies the requirement for automated de-identification, while the original table remains encrypted at rest (BigQuery default) and in transit (TLS), and access to the original data can be logged via Cloud Audit Logs. The data science team only gets access to the de-identified table, so PHI is not exposed during model development.

Answer

Enable Shielded VM on Vertex AI Workbench notebooks and use VPC-SC to restrict data access.

Answer

Use Cloud Key Management Service to encrypt the PHI columns in BigQuery, and share the encryption key with the data science team.

Answer

Use BigQuery row-level security to mask PHI columns for the data science team, and train the model directly on the original table.
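
To make the de-identification step in Question 4 concrete, here is a minimal pure-Python sketch of the kind of per-record transform such a pipeline applies. It is a stand-in for illustration only and does not call Cloud DLP or Dataflow. The field names (`patient_name`, `ssn`, `notes`), the `[SSN]` placeholder, and the regular expression are assumptions, not part of the original question.

```python
import re

# Assumed SSN layout for illustration: 3 digits, 2 digits, 4 digits.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def deidentify_record(record: dict, direct_identifier_fields: set) -> dict:
    """Drop direct-identifier fields and mask SSN-like strings in free text."""
    cleaned = {}
    for key, value in record.items():
        if key in direct_identifier_fields:
            continue  # remove the direct identifier entirely
        if isinstance(value, str):
            value = SSN_PATTERN.sub("[SSN]", value)
        cleaned[key] = value
    return cleaned


row = {
    "patient_name": "Jane Doe",
    "ssn": "123-45-6789",
    "notes": "SSN 123-45-6789 on file",
    "age": 70,
}
print(deidentify_record(row, {"patient_name", "ssn"}))
```

A managed service is preferable to hand-written patterns in practice, since a single regular expression misses many identifier formats.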

Question 5

You are an ML engineer at a global e-commerce company. Your team has developed a deep learning model for product recommendation that runs on Vertex AI Prediction. The model is deployed on a single n1-highmem-2 instance (CPU only) with autoscaling enabled (min replicas=1, max replicas=10). During Black Friday, traffic spikes to 1000 requests per second (QPS), and you observe that latency increases from 50ms to over 5000ms, and many requests time out. You check the monitoring dashboard and see that CPU utilization is at 100% on the single instance, and autoscaling is not triggering quickly enough. The team has a budget for this service and wants to handle the spike without compromising latency. What should you do?

Accepted Answer

Switch to GPU instances (e.g., n1-standard-4 with T4) and set min replicas=2 with autoscaling up to 10. Option A is correct because switching to GPU instances (n1-standard-4 with T4) offloads compute-intensive recommendation model inference to GPUs, significantly reducing per-request latency. Setting min replicas=2 ensures that at least two instances are always warm, reducing cold-start delays and allowing autoscaling to handle traffic spikes more responsively. This combination addresses both the CPU bottleneck and the slow scaling trigger, keeping latency under 50ms even at 1000 QPS.

Answer

Increase min replicas to 5 to keep warm instances

Answer

Set min replicas=1 and max replicas=5 to control cost

Answer

Increase max replicas to 20 and keep CPU instances

Question 6

A financial services company uses Vertex AI AutoML Tables to build a credit risk model. The dataset contains 500,000 rows and 50 features, including loan amount, credit score, debt-to-income ratio, and employment length. The target variable is binary: 'default' (1) or 'no default' (0). The data is highly imbalanced, with only 2% defaults. The data scientist trains a model with AutoML Tables using default settings. The evaluation metrics show an AUC of 0.85, but the confusion matrix reveals that the model predicts 'no default' for almost all cases, missing most defaults. The data scientist needs to improve the model's ability to identify defaults without significantly increasing false positives. They have limited time and cannot write custom code. What should they do?

Accepted Answer

Turn on 'Enable weighted evaluation' and set the optimization objective to 'Maximize recall at a specific precision' with a target precision of 0.5. Option C is correct because AutoML Tables lets you choose the optimization objective to handle class imbalance without custom code. With weighted evaluation enabled and the objective set to maximize recall at a target precision of 0.5, the model is tuned to prioritize identifying defaults (recall) while holding precision at the specified level, which addresses the need to catch more defaults without a large increase in false positives.

Answer

Manually split the data into a stratified train/test set to ensure the same proportion of defaults in each.

Answer

Train multiple models with different algorithms (e.g., XGBoost, Random Forest) and blend them using a custom script.

Answer

Under-sample the majority class to create a balanced dataset and retrain.
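
The "recall at a target precision" idea in Question 6 can be illustrated with a small threshold sweep. This is a sketch of the metric itself, not of how AutoML Tables optimizes it internally; the labels and scores below are made-up values.

```python
def recall_at_precision(labels, scores, min_precision):
    """Return (best_recall, threshold) over thresholds with precision >= min_precision.

    A prediction is positive when score >= threshold. Returns (0.0, None)
    when no threshold reaches the required precision.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive label")
    best_recall, best_threshold = 0.0, None
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because threshold is a real score
        recall = tp / total_positives
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, threshold
    return best_recall, best_threshold


# Made-up example: 1 = default, 0 = no default.
labels = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.3, 0.2, 0.1]
print(recall_at_precision(labels, scores, 0.5))
```

Raising the precision floor in this example pushes the chosen threshold up and lowers the achievable recall, which is the trade-off the objective controls.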

Question 7

A financial services firm deploys a binary classification model for fraud detection. The model's precision is 0.95 and recall is 0.60 on the test set. After deployment, the fraud rate in production is 0.5% compared to 5% in the test set. The model shows good calibration on the test set (Brier score 0.02) but poor calibration in production (Brier score 0.15). What is the most likely explanation for the calibration degradation?

Accepted Answer

The relationship between features and the target has changed (concept drift), causing the model's probability estimates to be misaligned with the true probabilities. The model is well calibrated on the test set, which has a 5% fraud rate, but production has a 0.5% fraud rate. This change in base rate (prior probability shift) directly affects the probability estimates, because the model's predicted probabilities are conditional on the training distribution. Option D is correct because concept drift, here in the form of a change in the base rate of fraud, means the model's probability estimates no longer reflect the true posterior probabilities in production, leading to a higher Brier score.

Answer

The distribution of input features has shifted significantly, causing the model to produce incorrect probabilities.

Answer

The model overfits to noise in the training data, leading to poor generalization.

Answer

The production data has a different class imbalance than the training data, causing the model to be biased toward the majority class.
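
The base-rate effect discussed in Question 7 can be shown numerically. The sketch below applies the standard prior-shift re-weighting of a predicted probability, under the assumption that only the class prior changed (5% to 0.5%) and the class-conditional feature distributions stayed the same. This correction is an illustration and is not something the question itself prescribes.

```python
def adjust_for_prior_shift(p: float, old_prior: float, new_prior: float) -> float:
    """Re-weight a calibrated probability from old_prior to new_prior.

    Assumes only P(y) changed; P(x | y) is taken to be unchanged.
    """
    if not (0 < old_prior < 1 and 0 < new_prior < 1):
        raise ValueError("priors must be strictly between 0 and 1")
    pos_weight = new_prior / old_prior
    neg_weight = (1 - new_prior) / (1 - old_prior)
    numerator = p * pos_weight
    return numerator / (numerator + (1 - p) * neg_weight)


def brier_score(probabilities, outcomes) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


# A score of 0.5 learned at a 5% fraud rate, reinterpreted at a 0.5% fraud rate.
print(adjust_for_prior_shift(0.5, old_prior=0.05, new_prior=0.005))
```

Without a correction of this kind, scores tuned to a 5% base rate overstate fraud probability at 0.5%, which inflates the Brier score.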

Question 8

A logistics company uses Vertex AI AutoML Tables to predict delivery delays based on order attributes, weather data, and traffic data. The model is retrained weekly using a Vertex AI Pipeline that runs a BigQuery query to get training data, then triggers AutoML training. Recently, the pipeline fails with the error 'Dataset not found' when the AutoML training step starts. The BigQuery query runs successfully and outputs a table. Which is the most likely cause?

Accepted Answer

The BigQuery output table is not being passed as a Vertex AI Dataset resource. The error 'Dataset not found' occurs because AutoML Tables requires a Vertex AI Dataset resource (a metadata wrapper) to reference the training data, not just a BigQuery table. The pipeline's BigQuery query produces a table, but if that table is not explicitly converted into or passed as a Vertex AI Dataset resource (for example, with a dataset-creation step such as `aiplatform.TabularDataset.create`), AutoML training cannot locate it. Option D correctly identifies this missing step as the root cause.

Answer

The AutoML training step is referencing a different dataset location.

Answer

The training data has been manually deleted from Cloud Storage.

Answer

The pipeline's IAM permissions are insufficient to access BigQuery.

Question 9

Match each ML model interpretability method to its description.

Question 10

Match each feature engineering technique to its description.

Question 11

A financial services company has deployed a classification model on Vertex AI to detect fraudulent transactions. The model is monitored using Vertex AI Model Monitoring for skew and drift detection, and also logs predictions to BigQuery for analysis. After a month, the monitoring alerts show a significant drift in one feature (transaction_amount). Which TWO actions should the team take to diagnose and address this issue?

Accepted Answer

Compare the feature distribution in the training data with the recent serving data using statistical tests.. Option A is correct because comparing the feature distribution of the training data with recent serving data using statistical tests (e.g., Kolmogorov-Smirnov or Jensen-Shannon divergence) is the standard first step to quantify the drift and confirm it is statistically significant. This diagnostic action helps the team understand the nature and magnitude of the drift before deciding on remediation steps. Vertex AI Model Monitoring already performs such comparisons, but the team should independently verify the results in BigQuery to ensure accuracy.

Answer

Increase the frequency of model monitoring checks to every hour.

Answer

Increase the sampling rate for prediction logging to ensure full data capture.

Answer

Reduce the alert threshold to minimize false positives.
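
The distribution comparison named in Question 11 can be sketched with the Jensen-Shannon divergence, one of the two tests the explanation mentions. This is a minimal pure-Python version; the bin edges and the `transaction_amount` samples are made-up values, and a real check would read both samples from the training data and the prediction logs.

```python
import bisect
import math


def bin_fractions(values, edges):
    """Histogram as fractions, with extra underflow and overflow bins."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect.bisect_right(edges, v)] += 1
    return [c / len(values) for c in counts]


def js_divergence(p, q) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up transaction_amount samples and bin edges.
edges = [25, 50, 100]
training = bin_fractions([10, 20, 30, 40, 50], edges)
serving = bin_fractions([60, 70, 80, 200, 300], edges)
print(js_divergence(training, serving))
```

A value near 0 suggests the serving distribution matches training; a value near 1 means the two histograms barely overlap.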

Question 12

Drag and drop the steps to set up model monitoring for drift detection on Vertex AI in the correct order.

Question 13

Drag and drop the steps to create and deploy a custom ML model on Vertex AI using a container in the correct order.

Question 14

Drag and drop the steps to set up a BigQuery ML linear regression model for forecasting in the correct order.

Question 15

Drag and drop the steps to deploy a trained TensorFlow model to Vertex AI Prediction in the correct order.

Google Professional Machine Learning Engineer practice questions

Drag and drop the steps to set up a distributed training job on Vertex AI using a custom container in the correct order.

Drag and drop the steps to set up a batch prediction job using Vertex AI in the correct order.

A team uses Vertex AI Feature Store to serve features for real-time predictions. They notice that feature values are frequently updated from multiple source systems, leading to inconsistencies. They need to ensure that feature values are consistent across all serving endpoints. What should they do?

An organization uses Cloud Composer to orchestrate ML workflows. A DAG that triggers Vertex AI training jobs fails because the training job exceeds the 7-day maximum runtime. What is the best way to handle long-running training jobs in Cloud Composer?

A company deploys a TensorFlow model on Vertex AI Prediction with a single node. During peak hours, inference latency increases. What should they do first to reduce latency?

A company uses Vertex AI Prediction with a custom container for a TensorFlow model. They notice that after deploying a new model version, requests still go to the old version. What is the most likely cause?

A company needs to serve a model with strict latency requirements (<100ms). They are using Vertex AI Prediction with CPU. During testing, latency is 150ms. What should they do?
