
Scenario-based practice

Hard Difficulty Questions

Practise Google Professional Machine Learning Engineer (PMLE) questions: original exam-style scenarios covering every exam domain, with detailed explanations, wrong-answer analysis, and common exam traps.

Scenario questions: 20
Exam code: PMLE
Vendor: Google Cloud

Scenario guide

How to approach hard difficulty questions

These are the questions most candidates get wrong. They require connecting multiple concepts, reading tricky output, or knowing edge-case behaviour that isn't on most study cards. Practising them trains you to operate under uncertainty — a necessary skill on the real exam.

Quick answer

Hard-difficulty questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.


Practice set

Practice scenarios

Question 1 (hard, multiple choice)

A travel booking company has a real-time recommendation system that suggests hotels and flights to users. The model is served using TensorFlow Serving on a Google Kubernetes Engine (GKE) cluster with auto-scaling enabled. The cluster uses n1-standard-4 machine types. The team has set up Cloud Monitoring dashboards and alerts. Last week, during a major holiday promotion, the team noticed that the model's inference latency P99 increased from 150 ms to 450 ms over a 30-minute period, while the request throughput increased from 500 to 1,200 requests per second. CPU utilization across the cluster rose to 95%, but memory utilization remained at 60%. The model version and the serving infrastructure configuration have not changed since the last deployment. Which action should the team take to mitigate the latency issue?
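As a study aid, the numbers in this scenario can be turned into a rough capacity estimate. The sketch below assumes CPU load spreads evenly across replicas and scales linearly with traffic, and it uses the replica formula documented for the Kubernetes Horizontal Pod Autoscaler. The current replica count of 10 and the 60% CPU target are hypothetical values, since the scenario does not state them.

```python
import math

def replicas_needed(current_replicas, current_cpu, target_cpu):
    """Replicas required to bring average CPU down to target_cpu.

    Mirrors the HPA rule: ceil(currentReplicas * currentMetric / desiredMetric).
    Assumes load is spread evenly and CPU scales linearly with traffic.
    """
    return math.ceil(current_replicas * current_cpu / target_cpu)

# Figures taken from the scenario.
throughput_growth = 1200 / 500   # requests per second, after / before
latency_growth = 450 / 150       # P99 latency in ms, after / before

# Hypothetical: 10 replicas today, 60% CPU utilization target.
needed = replicas_needed(current_replicas=10, current_cpu=0.95, target_cpu=0.60)

print(f"throughput grew {throughput_growth:.1f}x, P99 latency grew {latency_growth:.1f}x")
print(f"replicas needed for the CPU target: {needed}")
```

Latency growing faster than throughput while CPU sits near saturation is the pattern to look for when reading the answer options.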

Question 2 (hard, multiple choice)

You are an ML engineer at a global e-commerce company. Your team has developed a deep learning model for product recommendation that runs on Vertex AI Prediction. The model is deployed on a single n1-highmem-2 instance (CPU only) with autoscaling enabled (min replicas=1, max replicas=10). During Black Friday, traffic spikes to 1,000 queries per second (QPS), latency increases from 50 ms to over 5,000 ms, and many requests time out. The monitoring dashboard shows CPU utilization at 100% on the single instance, and autoscaling is not triggering quickly enough. The team has a budget for this service and wants to handle the spike without compromising latency. What should you do?

Question 3 (hard, multiple choice)

A financial services firm deploys a binary classification model for fraud detection. The model's precision is 0.95 and recall is 0.60 on the test set. After deployment, the fraud rate in production is 0.5% compared to 5% in the test set. The model shows good calibration on the test set (Brier score 0.02) but poor calibration in production (Brier score 0.15). What is the most likely explanation for the calibration degradation?
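One effect worth quantifying here is the change in class prevalence between the test set and production. The sketch below is an illustration, not the reference explanation: it assumes the model's recall and false positive rate stay fixed and only the fraud rate changes. The function names are hypothetical. The second function applies the standard prior-shift adjustment to a single predicted probability.

```python
def precision_at_prevalence(precision, recall, old_prev, new_prev):
    """Precision expected at new_prev if recall and false positive rate are unchanged."""
    tp = recall * old_prev                    # true positives per unit of test data
    fp = tp * (1 - precision) / precision     # false positives implied by precision
    fpr = fp / (1 - old_prev)                 # false positive rate on negatives
    new_tp = recall * new_prev
    new_fp = fpr * (1 - new_prev)
    return new_tp / (new_tp + new_fp)

def adjust_for_prior(p, old_prev, new_prev):
    """Rescale a probability calibrated at old_prev to a population at new_prev."""
    pos = p * new_prev / old_prev
    neg = (1 - p) * (1 - new_prev) / (1 - old_prev)
    return pos / (pos + neg)

# Figures from the scenario: 5% fraud in the test set, 0.5% in production.
prod_precision = precision_at_prevalence(0.95, 0.60, old_prev=0.05, new_prev=0.005)
adjusted = adjust_for_prior(0.5, old_prev=0.05, new_prev=0.005)

print(f"expected production precision: {prod_precision:.3f}")
print(f"a test-set score of 0.50 corresponds to about {adjusted:.3f} in production")
```

Under these assumptions the same score means a much lower fraud probability in production, which is the kind of gap a worse Brier score would reflect.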

Question 4 (hard, multiple choice)

A logistics company uses Vertex AI AutoML Tables to predict delivery delays based on order attributes, weather data, and traffic data. The model is retrained weekly using a Vertex AI Pipeline that runs a BigQuery query to get training data, then triggers AutoML training. Recently, the pipeline fails with the error 'Dataset not found' when the AutoML training step starts. The BigQuery query runs successfully and outputs a table. Which is the most likely cause?

Question 5 (hard, multi-select)

A financial services company has deployed a classification model on Vertex AI to detect fraudulent transactions. The model is monitored using Vertex AI Model Monitoring for skew and drift detection, and also logs predictions to BigQuery for analysis. After a month, the monitoring alerts show a significant drift in one feature (transaction_amount). Which TWO actions should the team take to diagnose and address this issue?

Question 6 (hard, multi-select)

You are designing an ML pipeline for a large-scale recommendation system that runs weekly retraining on historical user interaction data. The pipeline uses TensorFlow and is deployed on Google Cloud. The pipeline must be orchestrated and automated with minimal manual intervention. Which THREE options should you include in your design? (Choose three.)

Question 7 (hard, multiple choice)

A team uses Vertex AI Feature Store to serve features for real-time predictions. They notice that feature values are frequently updated from multiple source systems, leading to inconsistencies. They need to ensure that feature values are consistent across all serving endpoints. What should they do?

Question 8 (hard, multiple choice)

A company uses Vertex AI Prediction with a custom container for a TensorFlow model. They notice that after deploying a new model version, requests still go to the old version. What is the most likely cause?

Question 9 (hard, multiple choice)

A company uses Vertex AI Pipelines to orchestrate their ML training workflow. The pipeline includes a BigQuery ML training step, a model evaluation step, and a deployment step to Vertex AI Endpoints. The engineer notices that the pipeline fails intermittently due to a quota exceeded error on Vertex AI Endpoints during model deployment. What is the best long-term solution to prevent this failure?

Question 10 (hard, multiple choice)

An ML team uses Vertex AI Pipelines to automate model retraining. The pipeline includes a step that queries BigQuery to create a training dataset. The team notices that the pipeline fails intermittently with a '403 Exceeded rate limits' error. What is the most likely cause and solution?
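Intermittent rate-limit errors are commonly handled on the client side with retries and exponential backoff. The sketch below shows that general pattern only; it is not presented as the expected answer. `RateLimitError` and `call_with_backoff` are hypothetical names, and a real pipeline step would catch whatever exception its client library raises for a 403 rate-limit response.

```python
import random
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for a client library's rate-limit exception."""

def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter.

    The delay before retry k (0-based) is base_delay * 2**k plus random
    jitter in [0, base_delay]. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

The `sleep` parameter exists so the function can be exercised without real waiting, for example by passing `sleep=lambda seconds: None`.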

Question 11 (hard, multiple choice)

Your team has deployed a text classification model on Vertex AI Endpoints. You notice that the model's latency has increased significantly over the last week, but the request rate has remained stable. Which of the following is the most likely cause?

Question 12 (hard, multiple choice)

You are an ML engineer at a large e-commerce company. Your team has developed a product recommendation model using TensorFlow and deployed it on Vertex AI Endpoints for real-time inference. The model is retrained weekly using a Vertex AI Pipeline that reads new user interaction data from BigQuery, trains the model, evaluates it, and deploys the new version to the endpoint with a traffic split: 10% to the new model and 90% to the previous champion model. Recently, the team noticed that the new model's online prediction latency has increased significantly (from 50ms to 200ms) after deployment, causing timeouts for some requests. The training code has not changed, and the model size is similar. The pipeline uses a custom container with the same TensorFlow Serving image as before. The deployment step uses the same machine type (n1-standard-4) for the endpoint. What is the most likely cause of the latency increase?

Question 13 (hard, multiple choice)

A machine learning engineer needs to share a trained model with the product team for integration. The model is stored in Cloud Storage, and the product team’s service account needs read access. The engineer wants to follow the principle of least privilege. Which IAM configuration should be used?

Question 14 (hard, multi-select)

A healthcare company uses AutoML Tables to predict patient readmission risk. The dataset contains 500,000 rows and 200 features, including patient demographics, lab results, and medical history. The model accuracy is lower than expected. The engineer wants to improve performance using low-code techniques. Which THREE actions are most effective? (Choose THREE.)

Question 15 (hard, multiple choice)

A financial institution uses BigQuery ML to train a linear regression model to predict loan default risk. The model is trained on a dataset with 100 million rows and 50 features. During inference, the engineer uses the ML.PREDICT function. However, the query takes several minutes to run and times out frequently. The data is static and updated monthly. What is the most cost-effective and low-code solution to improve prediction latency?

Question 16 (hard, multi-select)

Which TWO factors should you consider when choosing between BigQuery and Cloud Storage for storing training data? (Choose 2)

Question 17 (hard, multi-select)

A company trains a model using Vertex AI Training and then deploys it to Vertex AI Prediction. They notice that prediction requests fail with 'InvalidArgument: input tensor shape mismatch'. Which THREE are possible causes?
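When working through a shape-mismatch scenario like this one, it can help to check request payloads against the shape the model expects before they are sent. The sketch below is a minimal client-side check using plain nested lists. The helper names and the expected shape of `(3,)` are hypothetical, and the scenario does not say which shape the model requires.

```python
def shape_of(value):
    """Return the shape of a nested list, e.g. [[1, 2], [3, 4]] -> (2, 2).

    Only the first element at each level is inspected, so ragged inputs
    are not detected by this helper.
    """
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

def mismatched_instances(instances, expected_shape):
    """Return the indices of instances whose shape differs from expected_shape."""
    expected = tuple(expected_shape)
    return [i for i, inst in enumerate(instances) if shape_of(inst) != expected]

# Hypothetical request: the model expects three features per instance.
instances = [[0.1, 0.2, 0.3], [0.4, 0.5], [[0.1, 0.2, 0.3]]]
bad = mismatched_instances(instances, expected_shape=(3,))
print(f"instances with unexpected shape: {bad}")
```

The second instance is missing a feature and the third carries an extra nesting level, two common ways a payload can disagree with the serving signature.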

Question 18 (hard, multiple choice)

A team uses Vertex AI Pipelines with CustomJob components that pull training code from a Cloud Source Repository. The pipeline fails with a 'Permission denied' error when trying to access the repository. The service account used by the pipeline has the 'Source Repository Viewer' role. What is the likely issue?

Question 19 (hard, multiple choice)

A company serves a scikit-learn model on Vertex AI Prediction but receives a 400 error with 'Prediction failed: Model evaluation error'. What is the most likely cause?

Question 20 (hard, multiple choice)

Your company uses a custom container for model serving on Vertex AI. After a recent update, the model returns predictions but they are clearly wrong (e.g., negative probabilities for a classification model). The logs show no errors. What is the most likely cause?
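Silent failures like the one described are easier to catch with an explicit sanity check on the outputs. The sketch below is a diagnostic aid, not the answer to the question: it assumes a multi-class model whose outputs should be probabilities summing to 1, and it shows that raw scores (logits) fail the check while the same scores passed through softmax pass it. The function names and sample values are hypothetical.

```python
import math

def looks_like_probabilities(scores, tol=1e-6):
    """True if every score is in [0, 1] and the scores sum to 1 within tol."""
    in_range = all(0.0 <= s <= 1.0 for s in scores)
    return in_range and abs(sum(scores) - 1.0) <= tol

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical container output containing a negative "probability".
raw = [-1.2, 0.3, 2.5]
print(looks_like_probabilities(raw))
print(looks_like_probabilities(softmax(raw)))
```

A check of this kind can run in a smoke test after each container update, so that a change in output semantics is flagged even when the logs show no errors.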

These PMLE practice questions are part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style PMLE questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.