Back to Google Professional Machine Learning Engineer

Google Cloud exam questions

Google Professional Machine Learning Engineer PMLE practice test

Practise CPU questions covering socket types, core counts, clock speeds, and cooling solutions for the PMLE exam.

506
practice questions
8
topics covered
PMLE
exam code
Google Cloud
vendor

Study modes

Three ways to study

Start with the Study Sheet to learn the material, switch to Practice Tests for active recall, then take a Mock Exam to simulate the real thing.

Study Sheet

All 506 questions with correct answers and explanations already visible. Read at your own pace — no time pressure.

Start reading →

Practice Test

Answer first, then see feedback and explanation. Tracks your score per session. Best for active recall and identifying weak areas.

Mock Exam

Full timed simulation with countdown. Answers hidden until the end. Includes all question types just like the real exam.

Start mock exam →

Study Sheet

All 506 PMLE questions with answers

Every question in the bank, paginated 75 per page. Correct answers and full explanations are revealed upfront — ideal for first-pass learning and pre-exam review.

7 pages · 75 questions per page · 506 total

Related practice questions

Study PMLE by topic

Topic pages go deep on individual concepts — each one covers a specific exam topic with questions, explanations, and study notes.

Courseiva uses original exam-style practice questions created for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations — not memorise copied exam dumps. Learn the difference →

Sample questions

Google Professional Machine Learning Engineer practice questions

Start practice test

A travel booking company has a real-time recommendation system that suggests hotels and flights to users. The model is served using TensorFlow Serving on a Google Kubernetes Engine (GKE) cluster with auto-scaling enabled. The cluster uses n1-standard-4 machine types. The team has set up Cloud Monitoring dashboards and alerts. Last week, during a major holiday promotion, the team noticed that the model's inference latency P99 increased from 150 ms to 450 ms over a 30-minute period, while the request throughput increased from 500 to 1,200 requests per second. CPU utilization across the cluster rose to 95%, but memory utilization remained at 60%. The model version and the serving infrastructure configuration have not changed since the last deployment. Which action should the team take to mitigate the latency issue?

A global retail company uses Vertex AI Recommendations to provide product recommendations on their website. They have a large catalog and millions of users. The initial deployment works well for active users, but they notice that new users (with no purchase history) receive generic recommendations that are not personalized. The company wants to improve the cold-start experience. They have user demographic data (age, location) available at sign-up. Current recommendation model is a collaborative filtering model using the built-in Vertex AI Recommendations. What should the company do to improve personalization for new users?

Your team is developing a machine learning model for real-time fraud detection. The training pipeline runs on Vertex AI and uses BigQuery for feature engineering. Recently, the pipeline has been taking significantly longer to execute. Upon investigation, you find that the BigQuery query for feature extraction is being rerun every time the pipeline runs, even though the underlying data hasn't changed. The pipeline is scheduled to run every hour. You want to reduce cost and execution time without losing the ability to detect data drifts. Which approach should you take?

Question 4mediummultiple choice
Read the full NAT/PAT explanation →

A healthcare organization is building a machine learning model to predict patient readmission risk. They have sensitive data stored in BigQuery that includes protected health information (PHI). The data science team uses Vertex AI Workbench notebooks to explore the data and develop models. The organization's security policy requires that all PHI data must be encrypted at rest and in transit, and that access to the data is logged and audited. They also need to ensure that the data used for model training is de-identified to remove direct identifiers such as patient names and SSNs. The team wants to automate the de-identification process as part of the data pipeline. Which approach meets these requirements?

You are an ML engineer at a global e-commerce company. Your team has developed a deep learning model for product recommendation that runs on Vertex AI Prediction. The model is deployed on a single n1-highmem-2 instance (CPU only) with autoscaling enabled (min replicas=1, max replicas=10). During Black Friday, traffic spikes to 1000 requests per second (QPS), and you observe that latency increases from 50ms to over 5000ms, and many requests time out. You check the monitoring dashboard and see that CPU utilization is at 100% on the single instance, and autoscaling is not triggering quickly enough. The team has a budget for this service and wants to handle the spike without compromising latency. What should you do?

A financial services company uses Vertex AI AutoML Tables to build a credit risk model. The dataset contains 500,000 rows and 50 features, including loan amount, credit score, debt-to-income ratio, and employment length. The target variable is binary: 'default' (1) or 'no default' (0). The data is highly imbalanced, with only 2% defaults. The data scientist trains a model with AutoML Tables using default settings. The evaluation metrics show an AUC of 0.85, but the confusion matrix reveals that the model predicts 'no default' for almost all cases, missing most defaults. The data scientist needs to improve the model's ability to identify defaults without significantly increasing false positives. They have limited time and cannot write custom code. What should they do?

Question 7hardmultiple choice
Read the full NAT/PAT explanation →

A financial services firm deploys a binary classification model for fraud detection. The model's precision is 0.95 and recall is 0.60 on the test set. After deployment, the fraud rate in production is 0.5% compared to 5% in the test set. The model shows good calibration on the test set (Brier score 0.02) but poor calibration in production (Brier score 0.15). What is the most likely explanation for the calibration degradation?

A logistics company uses Vertex AI AutoML Tables to predict delivery delays based on order attributes, weather data, and traffic data. The model is retrained weekly using a Vertex AI Pipeline that runs a BigQuery query to get training data, then triggers AutoML training. Recently, the pipeline fails with the error 'Dataset not found' when the AutoML training step starts. The BigQuery query runs successfully and outputs a table. Which is the most likely cause?

Match each ML model interpretability method to its description.

Drag a concept onto its matching description — or click a concept then click the description.

Concepts
Matches

Game-theoretic approach to explain feature contributions

Local surrogate model to explain individual predictions

Ranking features by their impact on model output

Shows marginal effect of a feature on predictions

Measures decrease in performance when feature is shuffled

Match each feature engineering technique to its description.

Drag a concept onto its matching description — or click a concept then click the description.

Concepts
Matches

Convert categorical variable into binary columns

Combine two or more features to capture interactions

Normalize numeric features to a standard range

Group continuous values into discrete intervals

Weight term frequency by inverse document frequency

A financial services company has deployed a classification model on Vertex AI to detect fraudulent transactions. The model is monitored using Vertex AI Model Monitoring for skew and drift detection, and also logs predictions to BigQuery for analysis. After a month, the monitoring alerts show a significant drift in one feature (transaction_amount). Which TWO actions should the team take to diagnose and address this issue?

Drag and drop the steps to set up model monitoring for drift detection on Vertex AI in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Drag and drop the steps to create and deploy a custom ML model on Vertex AI using a container in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Drag and drop the steps to set up a BigQuery ML linear regression model for forecasting in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Drag and drop the steps to deploy a trained TensorFlow model to Vertex AI Prediction in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Drag and drop the steps to set up a distributed training job on Vertex AI using a custom container in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

Drag and drop the steps to set up a batch prediction job using Vertex AI in the correct order.

Drag steps to the numbered slots on the right, or tap a step then tap a slot.

Steps
Order
1Step 1
2Step 2
3Step 3
4Step 4
5Step 5

You are designing an ML pipeline for a large-scale recommendation system that runs weekly retraining on historical user interaction data. The pipeline uses TensorFlow and is deployed on Google Cloud. The pipeline must be orchestrated and automated with minimal manual intervention. Which THREE options should you include in your design? (Choose three.)

A manufacturing company wants to predict equipment failure using sensor data stored in BigQuery. They have limited ML expertise and want to use AutoML Tables. The data includes timestamps, numerical sensor readings, and a boolean 'failure' column. The dataset is highly imbalanced with only 1% failure cases. Which of the following is the most effective approach to handle the imbalance in AutoML Tables?

A team uses Vertex AI Feature Store to serve features for real-time predictions. They notice that feature values are frequently updated from multiple source systems, leading to inconsistencies. They need to ensure that feature values are consistent across all serving endpoints. What should they do?

An organization uses Cloud Composer to orchestrate ML workflows. A DAG that triggers Vertex AI training jobs fails because the training job exceeds the 7-day maximum runtime. What is the best way to handle long-running training jobs in Cloud Composer?

A company deploys a TensorFlow model on Vertex AI Prediction with a single node. During peak hours, inference latency increases. What should they do first to reduce latency?

A company uses Vertex AI Prediction with a custom container for a TensorFlow model. They notice that after deploying a new model version, requests still go to the old version. What is the most likely cause?

A company needs to serve a model with strict latency requirements (<100ms). They are using Vertex AI Prediction with CPU. During testing, latency is 150ms. What should they do?

Question Discussion

Share a tip, memory trick, or ask about the reasoning behind this question. Do not post real exam questions, leaked content, braindumps, or copyrighted exam material. Comments are moderated and may be removed without notice.

Loading comments…

Sign in to join the discussion.

Exam question guide

How to use these PMLE questions

Use these questions as active recall, not passive reading. Try the question first, review the answer choices, then open the explanation and connect the result back to the exam topic.

Quick answer

CPU questions test socket types, core count, clock speed, and cooling methods for PMLE.

Identify CPU socket types and compatibility with motherboards.

Distinguish between 32-bit and 64-bit processor architectures.

Recognize hyperthreading and multi-core processor features.

Select appropriate cooling methods: air vs liquid cooling.

These PMLE practice questions are part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style PMLE questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.