How should I use these Scaling prototypes into ML models practice questions?

Read each scenario carefully and choose your answer before revealing the explanation. Then check why your choice was right or wrong. Repeat until the reasoning feels automatic.

Can I practise just Scaling prototypes into ML models questions in a focused session?

Yes — use the session launcher on this page to start a 10-, 20-, 30- or 50-question session drawn entirely from the Scaling prototypes into ML models domain.

PMLE · topic practice

Scaling prototypes into ML models practice questions

Practise Scaling prototypes into ML models questions for the Google Professional Machine Learning Engineer (PMLE) exam: original exam-style scenarios with answer choices, explanations, and analysis of common mistakes.

Courseiva uses original exam-style practice questions designed for learning and revision. The goal is to understand the concepts, recognise exam patterns, and improve through explanations, not to memorise copied exam dumps.

Reviewed by Johnson Ajibi · MSc IT Security

20 questions · Domain: Scaling prototypes into ML models

What the exam tests

What to know about Scaling prototypes into ML models

Scaling prototypes into ML models questions test whether you can apply the concept in context, not just recognise a definition.

▸How the topic appears in realistic exam-style scenarios.
▸Which detail in the question changes the correct answer.
▸How to eliminate plausible but wrong options.
▸How to connect the question back to the wider exam objective.

Watch out for

Common Scaling prototypes into ML models exam traps

▸Answering from memory before reading the full scenario.
▸Missing a constraint such as cost, availability, security, scope or command context.
▸Choosing a broad answer when the question asks for the most specific fix.
▸Ignoring why the wrong options are tempting.

Practice set

Scaling prototypes into ML models questions

20 questions · select your answer, then reveal the explanation

Question 1 · medium · multiple choice

A startup has developed a prototype ML model using scikit-learn on a single machine. They now need to scale it to handle larger datasets and deploy it for real-time predictions. The team is small and wants minimal operational overhead. Which Google Cloud service should they use?

A. AI Platform Prediction
Why wrong: AI Platform Prediction is a legacy service; Vertex AI is the recommended unified platform.
B. Vertex AI
Correct: Vertex AI provides managed training, deployment, and autoscaling with minimal operational overhead.
C. Cloud Functions
Why wrong: Serverless but limited to short-running functions; not suited for large-scale ML inference.
D. Compute Engine with TensorFlow Serving
Why wrong: Requires manual management of infrastructure and load balancing.

Question 2 · hard · multiple choice

A data science team has trained a TensorFlow model on-premises using a large dataset. When they try to deploy the model to Vertex AI for online predictions, the deployed model fails to start with a ‘MemoryError’. The model artifact is 2 GB, and the machine type is n1-standard-4 (15 GB RAM). What is the most likely cause?

A. The model is stored in a regional bucket and the Vertex AI endpoint is in a different region.
Why wrong: Cross-region access is allowed, though not optimal; it would not cause MemoryError.
B. The machine type does not support TensorFlow models larger than 1 GB.
Why wrong: No such limitation exists; TensorFlow can load larger models given sufficient memory.
C. The model is too large for the machine's memory, causing an out-of-memory (OOM) error during loading.
Correct: The 2 GB model may require more than 15 GB RAM during loading due to overhead and intermediate structures.
D. The model file is corrupted or missing dependencies, causing a crash.
Why wrong: Corruption typically causes import errors, not MemoryError.

Question 3 · easy · multiple choice

A company has a prototype ML model that works well on historical data, but when deployed to production, the model performance degrades over time. The data distribution shifts gradually. Which strategy should they implement to maintain model accuracy?

A. Increase the regularization strength to prevent overfitting.
Why wrong: Regularization reduces overfitting but does not address distribution shift.
B. Increase the amount of training data by using more historical records.
Why wrong: Using more historical data does not address the shift in data distribution.
C. Implement a retraining pipeline that periodically retrains the model on recent data.
Correct: Periodic retraining with fresh data helps the model adapt to gradual distribution shifts.
D. Switch to a more complex model architecture to better capture patterns.
Why wrong: A more complex model may overfit to old patterns and not generalize to new data.
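
To see why retraining on recent data helps under gradual drift, here is a toy sketch. It is an illustration, not a Vertex AI pipeline: the "model" is just a constant predictor (the mean of its training data), and the drifting series and 5-point window are made-up assumptions.

```python
from statistics import mean

def fit_mean_model(series, window=None):
    """'Train' a constant predictor on the whole series or only its last `window` points."""
    data = series if window is None else series[-window:]
    return mean(data)

def mean_abs_error(prediction, actuals):
    """Average absolute error of a constant prediction against new observations."""
    return mean(abs(a - prediction) for a in actuals)

# Hypothetical target that drifts upward by 1.0 per period.
history = [float(t) for t in range(20)]
upcoming = [20.0, 21.0]

# Model trained once on all historical data versus one retrained on the latest window.
stale_error = mean_abs_error(fit_mean_model(history), upcoming)
fresh_error = mean_abs_error(fit_mean_model(history, window=5), upcoming)
```

Under these assumptions the model fitted on the recent window has the smaller error, which is the intuition behind option C; adding older history (option B) would pull the prediction further from current values.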

Question 4 · medium · multiple choice

An ML engineer is scaling a prototype to production using Vertex AI Pipelines. The pipeline includes data validation, preprocessing, training, and deployment steps. They want to ensure that the pipeline can be reproduced and audited. What is the best practice?

A. Define the pipeline using Kubeflow Pipelines SDK and run it on Vertex AI Pipelines.
Correct: Vertex AI Pipelines automatically tracks artifacts, parameters, and lineage.
B. Use a Docker container with fixed tags and manually record runs.
Why wrong: Manual recording is error-prone and not scalable.
C. Store all data and models in a single Cloud Storage bucket with no versioning.
Why wrong: Versioning is important for reproducibility; no versioning loses history.
D. Pin all library versions in a requirements.txt file.
Why wrong: Pinning versions is good but not sufficient; pipeline orchestration and tracking are needed.

Question 5 · medium · multi select

A team has trained a sentiment analysis model using PyTorch on Vertex AI Training. They now want to deploy it for online predictions with low latency. Which TWO actions should they take? (Choose 2)

A. Create multiple model versions for A/B testing.
Why wrong: A/B testing is for evaluation, not low latency.
B. Use a machine type with a GPU for faster inference.
Correct: GPUs can accelerate inference for deep learning models.
C. Enable batch prediction instead of online prediction.
Why wrong: Batch prediction is for high throughput, not low latency.
D. Convert the model to TensorFlow SavedModel format.
Why wrong: Conversion is not necessary; PyTorch can be deployed as is.
E. Package the model in a custom container with a web server (e.g., FastAPI).
Correct: Custom containers allow deploying PyTorch models on Vertex AI.
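
As a rough illustration of what the web server inside such a custom container has to do, the sketch below uses Python's standard-library `http.server` in place of FastAPI so that it stays self-contained. It assumes the Vertex AI custom-container conventions of reading `AIP_HTTP_PORT`, `AIP_HEALTH_ROUTE` and `AIP_PREDICT_ROUTE` from the environment and exchanging `{"instances": [...]}` and `{"predictions": [...]}` JSON bodies. The `score` function is a hypothetical stand-in for the PyTorch model, and the default routes are placeholders.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

# Routes and port are injected by the serving platform; defaults are placeholders.
HEALTH_ROUTE = os.environ.get("AIP_HEALTH_ROUTE", "/health")
PREDICT_ROUTE = os.environ.get("AIP_PREDICT_ROUTE", "/predict")
PORT = int(os.environ.get("AIP_HTTP_PORT", "8080"))

def score(text: str) -> float:
    """Hypothetical stand-in for the sentiment model's forward pass."""
    positive = {"good", "great", "love"}
    words = text.lower().split()
    return sum(w in positive for w in words) / max(len(words), 1)

def handle_predict(body: bytes) -> bytes:
    """Turn an {"instances": [...]} request body into a {"predictions": [...]} body."""
    instances = json.loads(body)["instances"]
    return json.dumps({"predictions": [score(x) for x in instances]}).encode()

class Handler(BaseHTTPRequestHandler):
    def _send(self, status: int, payload: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        # Health checks must return 200 once the model is ready to serve.
        if self.path == HEALTH_ROUTE:
            self._send(200, b'{"status": "ok"}')
        else:
            self._send(404, b"{}")

    def do_POST(self):
        if self.path != PREDICT_ROUTE:
            self._send(404, b"{}")
            return
        length = int(self.headers.get("Content-Length", 0))
        self._send(200, handle_predict(self.rfile.read(length)))

def serve() -> None:
    """Blocking entry point; a container would call this from its start command."""
    HTTPServer(("0.0.0.0", PORT), Handler).serve_forever()
```

In a real container the model would be loaded once at startup rather than per request, which matters for the low-latency goal in the question.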

Question 6 · hard · multi select

A company has a prototype ML model that predicts equipment failure. They want to deploy it to production using Vertex AI. The model must be retrained weekly with new data. They also need to monitor for data drift and model performance. Which THREE components should they include in their MLOps pipeline? (Choose 3)

A. A scheduled training pipeline that retrains the model weekly.
Correct: Scheduled retraining is essential for keeping the model up-to-date.
B. A manual QA step where data scientists approve each deployment.
Why wrong: Manual steps reduce automation and slow down the pipeline.
C. A manual review of new data before it is used for training.
Why wrong: Manual review is not scalable and introduces delays.
D. An automated trigger that redeploys the model when performance drops below a threshold.
Correct: Automated redeployment based on performance ensures quick recovery.
E. A monitoring system that checks for data drift and triggers alerts.
Correct: Monitoring is critical for detecting when the model degrades.

Question 7 · hard · multiple choice

An ML engineer is trying to upload a TensorFlow model to Vertex AI using the gcloud command shown. The model was trained using TensorFlow 2.11 and saved with model.save('model/'). The engineer sees the error. What is the most likely cause?

A. The container port should be 8080 instead of 8501.
Why wrong: Port 8501 is standard for TF Serving; port mismatch would not cause this error.
B. The service account does not have permission to access the bucket.
Why wrong: Permissions issues would yield a different error (e.g., PERMISSION_DENIED).
C. The container image is for TensorFlow 2.11 but the model was saved with an older version.
Why wrong: The container image matches TF2.11; version mismatch usually gives a different error.
D. The model was saved in a format other than SavedModel (e.g., HDF5) or the artifact path does not contain the expected directory structure.
Correct: The error explicitly states no saved_model.pb found, indicating the model is not in SavedModel format.
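
A quick local check before uploading can catch this class of problem. The sketch below assumes the export lives on the local filesystem (for a Cloud Storage path you would list the objects instead) and only tests for the two entries a TensorFlow 2 SavedModel export normally contains: a `saved_model.pb` file and a `variables/` directory. The function name is made up for this example.

```python
from pathlib import Path

def looks_like_saved_model(export_dir: str) -> bool:
    """Return True if the directory has the basic SavedModel layout."""
    root = Path(export_dir)
    has_graph = (root / "saved_model.pb").is_file()
    has_variables = (root / "variables").is_dir()
    return has_graph and has_variables
```

If this returns False for the directory the artifact URI points at, check whether the URI points one level too high or too low, or whether the model was written as a single `.h5` file.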

Question 8 · medium · multiple choice

You are an ML engineer at a fintech company. You have a prototype credit risk model built using XGBoost that achieves high accuracy on historical data. The model is trained on a dataset with 500,000 rows and 50 features. The company wants to deploy this model to production to score loan applications in real-time. The production environment must handle a peak load of 100 requests per second with a latency under 200ms. You have decided to use Vertex AI for deployment. After deploying the model as a Vertex AI endpoint with a single n1-standard-4 machine, you notice that latency exceeds 500ms at peak load and some requests time out. You have verified that the model prediction itself (excluding network overhead) takes about 50ms on average. What should you do to meet the latency and throughput requirements?

A. Change the machine type to a GPU-accelerated machine like n1-standard-4 with a T4 GPU.
Why wrong: For low-latency, single-row scoring, an XGBoost model of this size typically gains little from a GPU. The bottleneck here is request concurrency, so scaling CPU capacity is more appropriate.
B. Prune the model to reduce size and improve prediction speed.
Why wrong: Prediction time is already 50ms, so pruning may not significantly reduce latency; the bottleneck is throughput.
C. Enable autoscaling with a minimum of 2 replicas and use a larger machine type (e.g., n1-standard-8) to handle more concurrent requests.
Correct: Autoscaling increases replicas to handle load, and a larger machine can process more requests concurrently, reducing queueing time.
D. Switch from online prediction to batch prediction using Vertex AI Batch Prediction.
Why wrong: Batch prediction is for offline scoring, not real-time.
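
The queueing argument can be checked with a back-of-the-envelope calculation. The sketch below applies Little's law (average requests in service = arrival rate × service time) to the numbers in the scenario. It assumes each in-flight request keeps one vCPU busy for the full 50 ms, and the 60% target utilization is a hypothetical planning figure; neither assumption comes from the question.

```python
import math

def busy_workers(qps: float, service_time_s: float) -> float:
    """Little's law: average number of requests being served at once."""
    return qps * service_time_s

def utilization(qps: float, service_time_s: float, workers: int) -> float:
    """Fraction of worker capacity consumed; above 1.0 the queue grows without bound."""
    return busy_workers(qps, service_time_s) / workers

def required_workers(qps: float, service_time_s: float, target_utilization: float = 0.6) -> int:
    """Workers needed to keep utilization at or below the target."""
    return math.ceil(busy_workers(qps, service_time_s) / target_utilization)

# Scenario numbers: 100 requests per second, 50 ms per prediction.
single_n1_standard_4 = utilization(100, 0.05, workers=4)    # one replica, 4 vCPUs
two_n1_standard_8 = utilization(100, 0.05, workers=16)      # two replicas, 8 vCPUs each
needed = required_workers(100, 0.05)
```

Under these assumptions a single n1-standard-4 is asked for more work than it has vCPUs, so requests queue and latency climbs well past the 50 ms model time, while two n1-standard-8 replicas leave ample headroom.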

Question 9 · medium · multiple choice

A machine learning team has a prototype using a custom TensorFlow model trained on a small dataset stored in Cloud Storage. They want to scale the prototype to production with minimal code changes while ensuring the model can handle increased traffic and new data. The model currently loads data using tf.data.Dataset from CSV files. Which approach best meets these requirements?

A. Use Vertex AI Training with hyperparameter tuning and distributed training, then deploy the model to Vertex AI Prediction with autoscaling.
Correct: Vertex AI provides seamless scaling with minimal code changes and supports tf.data.Dataset.
B. Deploy the model to AI Platform (Unified) Prediction with a custom container, and use AI Platform Training to retrain on larger datasets.
Why wrong: AI Platform (Unified) is deprecated; Vertex AI is the recommended service.
C. Migrate the model to BigQuery ML and use SQL for training and prediction to leverage BigQuery's scalability.
Why wrong: BigQuery ML requires rewriting the model and does not support custom TensorFlow models directly.
D. Package the model as a Cloud Run Function and use Cloud Scheduler to trigger retraining periodically.
Why wrong: Cloud Run Functions are stateless and have request limits, not suitable for ML serving.

Question 10 · easy · multi select

Which TWO actions are best practices when scaling a prototype ML model to production in Google Cloud?

A. Store and manage features in a feature store like Vertex AI Feature Store.
Correct: Feature store ensures consistency and reuse across models.
B. Test the model only on a small sample of the production data to save costs.
Why wrong: Testing on a small sample may not represent production data distribution.
C. Set up monitoring and logging for model performance and data drift.
Correct: Monitoring is critical for production ML systems.
D. Manually scale inference instances based on historical traffic patterns.
Why wrong: Manual scaling is inefficient; use autoscaling.
E. Use one-hot encoding for all categorical features without considering cardinality.
Why wrong: High-cardinality features may need embeddings or other techniques.
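
One of the "other techniques" for high-cardinality categoricals is the hashing trick, sketched below as an illustration rather than a recommendation for any particular model. It maps each category to one of a fixed number of buckets, so the input width stays bounded no matter how many distinct values appear. The bucket count of 1000 and the `user_…` identifiers are hypothetical, and `hashlib` is used because Python's built-in `hash` for strings varies between processes.

```python
import hashlib

def hash_bucket(value: str, num_buckets: int) -> int:
    """Map a category string to a stable bucket index in [0, num_buckets)."""
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets

# A million distinct IDs would need a million one-hot columns;
# hashed, they share a fixed set of 1000 buckets (with some collisions).
buckets = [hash_bucket(f"user_{i}", 1000) for i in range(5)]
```

The trade-off is that unrelated categories can collide in the same bucket, which is why learned embeddings are often preferred when the model can support them.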

Question 11 · hard · multiple choice

A team deployed a prototype classification model to Vertex AI Prediction. After a week, they notice the metrics shown in the exhibit. What is the most likely cause of the performance degradation and latency increase?

Exhibit

Refer to the exhibit.

```
Model accuracy: 0.92
Training data: 10,000 records
Online prediction latency: 95th percentile = 450ms
QPS: 50

After moving to production:
- New data from users: 100,000 records/day
- Data distribution shift detected (new features emerge)
- Prediction latency increases to 95th percentile = 1200ms
- QPS drops to 30
```

A. The prediction endpoint's autoscaling is too slow, causing requests to queue and time out.
Why wrong: Autoscaling may contribute but does not explain the accuracy drop.
B. The prediction requests are too large, exceeding the maximum request size limit for Vertex AI.
Why wrong: Request size limit would cause errors, not a gradual latency increase.
C. The training data does not represent the current production data distribution, causing the model to make incorrect predictions and requiring more computation.
Correct: Data distribution shift degrades accuracy, and this is the only option that accounts for the accuracy drop alongside the latency increase.
D. The custom prediction container uses outdated libraries that are incompatible with Vertex AI's runtime.
Why wrong: Library incompatibility would cause errors, not gradual latency increase and accuracy drop.

Question 12 · medium · drag order

Drag and drop the steps to create and deploy a custom ML model on Vertex AI using a container in the correct order.

Question 13 · medium · drag order

Drag and drop the steps to set up model monitoring for drift detection on Vertex AI in the correct order.

Question 14 · medium · matching

Match each ML acronym to its definition.

AUC: Area Under the ROC Curve
MSE: Mean Squared Error
TPU: Tensor Processing Unit
SVM: Support Vector Machine
PCA: Principal Component Analysis
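
Two of these terms are evaluation metrics that are easy to compute by hand, which can help the definitions stick. The sketch below is a plain-Python illustration with made-up labels and scores. The AUC version uses the pairwise interpretation (the fraction of positive/negative pairs in which the positive is scored higher, with ties counted as half), which is fine for small examples but slow for large datasets.

```python
def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up example values.
mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

In the AUC example, three of the four positive/negative pairs are ranked correctly, giving 0.75.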

Question 15 · medium · matching

Match each ML model interpretability method to its description.

SHAP: Game-theoretic approach to explain feature contributions
LIME: Local surrogate model to explain individual predictions
Feature importance: Ranking features by their impact on model output
Partial dependence plot: Shows marginal effect of a feature on predictions
Permutation importance: Measures decrease in performance when feature is shuffled
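
The last description (performance drop when a feature is shuffled) is usually called permutation importance, and it is simple enough to sketch from scratch. The example below is a minimal illustration in plain Python. The model, the data and the choice of mean squared error as the metric are all hypothetical; the model deliberately ignores the second feature so that its importance comes out as zero.

```python
import random

def mse(model, X, y):
    """Mean squared error of a callable model over rows X and targets y."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=5, seed=0):
    """Average increase in error after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Hypothetical model that depends only on feature 0.
def model(row):
    return 2.0 * row[0]

X = [[float(i), float((i * 7) % 5)] for i in range(10)]
y = [2.0 * row[0] for row in X]
importances = permutation_importance(model, X, y)
```

Shuffling the unused feature leaves the error unchanged, while shuffling the feature the model relies on raises it; that gap is what the method reports.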

Question 16 · easy · multiple choice

A team has a trained TensorFlow model running locally and wants to deploy it for low-latency online predictions on Google Cloud. Which service should they use?

A. Vertex AI Prediction
Correct: Vertex AI Prediction is purpose-built for low-latency online ML predictions.
B. AI Platform Training
Why wrong: AI Platform Training is for training jobs, not serving predictions.
C. Cloud Run
Why wrong: Cloud Run can serve models but is less optimized for ML than Vertex AI Prediction.
D. Cloud Functions
Why wrong: Cloud Functions is for event-driven functions, not ideal for ML prediction latency.

Question 17 · medium · multiple choice

An ML team is scaling a prototype to production. The data pipeline currently reads from Cloud Storage and transforms data with a custom Python script. They need to handle higher throughput and add monitoring. Which approach should they take?

A. Deploy the Python script on a large Compute Engine instance with a cron job
Why wrong: Single instance limits scalability and lacks built-in monitoring and fault tolerance.
B. Migrate the pipeline to Apache Beam on Dataflow with Cloud Monitoring
Correct: Dataflow is serverless, auto-scales, and integrates with Cloud Monitoring for observability.
C. Rewrite the pipeline to use Pub/Sub and Cloud Functions for processing
Why wrong: Pub/Sub is for messaging; Cloud Functions are ephemeral and not designed for high-throughput data transformation.
D. Use Cloud Composer to orchestrate the Python script at scale
Why wrong: Composer orchestrates but doesn't auto-scale data processing; custom scripts still have scaling limits.

Question 18 · hard · multiple choice

A company has a prototype ML model that achieves 85% accuracy on historical data. In production, accuracy drops to 70% after two weeks due to data drift. They need an automated retraining pipeline with minimal manual oversight. Which solution is most cost-effective?

A. Use Cloud Functions to trigger a Dataflow job that trains the model using custom containers
Why wrong: Dataflow is for data processing, not model training; would need complex custom setup.
B. Deploy the model on a GPU-equipped Compute Engine VM and run retraining every time new data arrives
Why wrong: Constant GPU cost, manual setup, no drift detection.
C. Set up Vertex AI Model Monitoring to detect drift, which triggers a Cloud Function that submits a Vertex AI Training job with new data
Correct: Monitoring detects drift, automation triggers retraining with new data, cost-effective.
D. Schedule a weekly Cloud Composer DAG that runs a new training job with all available data
Why wrong: Scheduled retraining doesn't adapt to actual drift, may waste resources.
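
The "detect drift, then trigger retraining" pattern can be illustrated without any cloud services. The sketch below is not how Vertex AI Model Monitoring is implemented; it swaps in the population stability index (PSI) as one simple, widely used drift score. The bin edges, the 0.2 threshold and the sample data are illustrative assumptions, and in the scenario a `True` result would correspond to the alert that invokes the retraining function.

```python
import math

def psi(expected, actual, edges, eps=1e-6):
    """Population stability index between two samples over shared bin edges."""
    def proportions(values):
        counts = [0] * (len(edges) + 1)
        for v in values:
            counts[sum(v > e for e in edges)] += 1  # index of the bin containing v
        return [max(c / len(values), eps) for c in counts]  # eps avoids log(0)
    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def should_retrain(expected, actual, edges, threshold=0.2):
    """Fire the retraining trigger only when drift exceeds the threshold."""
    return psi(expected, actual, edges) > threshold

# Hypothetical feature values: training baseline versus shifted production traffic.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]
edges = [0.25, 0.5, 0.75]
```

Because retraining only runs when the score crosses the threshold, compute is spent when drift actually occurs, which is the cost argument for option C over a fixed weekly schedule.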

Question 19 · easy · multiple choice

A team prototypes a recommendation model using a Jupyter notebook on Vertex AI Workbench. They want to productionize the model with CI/CD. Which approach should they use to package the model for deployment?

A. Use Cloud Build to deploy the notebook directly as a prediction endpoint
Why wrong: Notebooks are not directly deployable; need to export the model.
B. Store the model in Cloud Source Repositories and deploy from there
Why wrong: Source Repositories is for code, not model artifacts.
C. Containerize the model and push to Artifact Registry, then deploy via Cloud Run
Why wrong: Possible but lacks model versioning and monitoring integration.
D. Upload the model to Vertex AI Model Registry and use it for deployment
Correct: Model Registry manages versions and deployment targets.

Question 20 · medium · multiple choice

A data scientist trains an XGBoost model on Vertex AI with a custom container. The model performs well on a held-out test set but fails to generalize in production. They suspect data leakage between training and validation. What is the best practice to prevent this?

A. Store and serve features using Vertex AI Feature Store with point-in-time correctness
Correct: Feature Store provides consistent feature values for each timestamp, preventing leakage.
B. Implement feature engineering in Vertex AI Pipelines to ensure temporal ordering
Why wrong: Pipelines orchestrate but don't enforce feature consistency by themselves.
C. Store all features in BigQuery and join on timestamp during training and serving
Why wrong: Manual joins are error-prone and may still introduce leakage.
D. Use Vertex AI AutoML instead of custom training
Why wrong: AutoML may not solve custom feature engineering leakage.
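
Point-in-time correctness means that each training example only sees feature values that were already known at its own timestamp. The sketch below shows the lookup rule in plain Python. It is an illustration of the idea, not the Feature Store API; the integer timestamps and values are made up, and treating a value written exactly at the lookup time as visible is a design assumption.

```python
from bisect import bisect_right

def point_in_time_value(feature_history, as_of):
    """Latest feature value recorded at or before `as_of`.

    feature_history: list of (timestamp, value) pairs sorted by timestamp.
    Returns None when no value existed yet at that time.
    """
    timestamps = [ts for ts, _ in feature_history]
    idx = bisect_right(timestamps, as_of)
    return feature_history[idx - 1][1] if idx else None

# Hypothetical history of one feature for one entity.
history = [(1, 100.0), (5, 150.0), (9, 400.0)]

value_at_6 = point_in_time_value(history, 6)   # sees the value written at t=5
value_at_0 = point_in_time_value(history, 0)   # nothing known yet
```

A naive join on "latest value" would hand the t=9 value to an example labelled at t=6, which is exactly the leakage the question describes.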

Frequently asked questions

What does the PMLE exam test about Scaling prototypes into ML models?

Scaling prototypes into ML models questions test whether you can apply the concept in context, not just recognise a definition.

How should I use these practice questions?

Select your answer before revealing the explanation. Then read why each option is right or wrong; this active recall approach builds retention far faster than re-reading notes.

Can I practise just Scaling prototypes into ML models questions in a focused session?

Yes. The session launcher on this page draws every question from the Scaling prototypes into ML models domain. Use a 10-question session first to gauge your baseline, then move to 20 or 30 once the weak spots are clear.

Where can I practise other PMLE topics?

Use the topic links above to move to related areas, or go back to the PMLE question bank to see all topics.

Are these real exam questions or dumps?

These are original practice questions written to test the same concepts the PMLE exam covers. They are not copied from any real exam or dump site.
