Practice PMLE Scaling Prototypes into ML Models questions with full explanations on every answer.
Scaling Prototypes into ML Models — choose a session length
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. You have a TensorFlow training script that runs on a single machine. To speed up training on Vertex AI with 8 GPUs on a single machine, which strategy should you use?
2. A data science team is building a feature engineering pipeline that processes large-scale data from BigQuery daily. They need to compute aggregate features and store the results in Vertex AI Feature Store for both online serving and offline training. Which Google Cloud service is best suited for this batch computation?
3. You are fine-tuning a large language model (LLM) from Hugging Face Transformers using Vertex AI Training. The model has 7 billion parameters and does not fit into the memory of a single GPU. You need to train across multiple GPUs, splitting the model layers across devices. Which distributed training approach should you use?
4. A company is using Vertex AI Vizier for hyperparameter tuning of a model with 5 integer hyperparameters, each with a range of 10-100. They have a budget of 50 trials and want to maximize the chance of finding the best configuration. Which Vizier algorithm should they use?
5. You want to use a pre-trained model from TensorFlow Hub for image classification, but you need to adapt it to classify your own custom categories with a small dataset. Which Vertex AI approach is most appropriate?
6. Your Vertex AI custom training job is failing with an out-of-memory error on a single GPU. You need to reduce memory usage without changing the model architecture. Which approach should you try first?
7. You are deploying a deep learning model on edge devices with limited computational resources. The model must run inference in <10 ms and the model size must be under 50 MB. Currently, your trained model is 200 MB and runs in 50 ms. Which combination of model compression techniques should you apply?
8. You are running a Vertex AI custom training job with a pre-built TensorFlow container. You want to use TPU v3 pods for faster training. Which configuration is required?
9. You need to perform a large-scale feature computation on streaming data from Pub/Sub, transforming raw events into features, and writing results to Vertex AI Feature Store for online serving. Which Google Cloud architecture is most appropriate?
10. You want to use Vertex AI JumpStart to quickly deploy a pre-built foundation model for text summarization. Which action is required?
11. Your PyTorch training script uses DistributedDataParallel (DDP) across 4 nodes, each with 4 GPUs (16 GPUs total). You submit a Vertex AI custom training job. How should you configure the worker pool spec?
12. You are fine-tuning a pre-trained BERT model from Hugging Face on a custom text classification dataset using Vertex AI Training. You want to speed up training by using mixed precision. What should you do?
13. You are designing a distributed training job for a very large neural network that does not fit on a single machine. You need to split the model across multiple devices. Which TWO techniques can you use?
14. You are fine-tuning a large language model using Vertex AI Training with spot VMs to reduce cost. Your training job keeps getting preempted, causing delays. Which THREE strategies can help mitigate the impact of preemption?
15. You are building a machine learning pipeline on Google Cloud. You need to perform feature engineering on large datasets stored in BigQuery and store the resulting features in Vertex AI Feature Store for both online and offline use. Which TWO Google Cloud services should you use?
16. A data scientist wants to train a TensorFlow model on Vertex AI using a pre-built container. Which of the following pre-built containers is NOT available for custom training in Vertex AI?
17. You need to run a distributed training job on Vertex AI using TensorFlow with MirroredStrategy on a single machine with 4 GPUs. Which training configuration should you use?
18. You have a very large language model that does not fit on a single GPU. You need to train it efficiently across multiple GPUs on a single machine. Which approach should you use?
19. You want to reduce training costs by using preemptible VMs on Vertex AI for a fault-tolerant distributed training job that uses checkpointing. Which machine type should you choose in the worker pool configuration?
20. You are performing hyperparameter tuning on Vertex AI with Vizier. You want to maximize the accuracy of your model, and you have a budget of 50 trials. Which algorithm should you choose to best explore the search space?
21. You need to preprocess a large dataset (terabytes) for training a TensorFlow model. The preprocessing includes scaling and bucketizing features, and the same transformations must be applied during serving. Which tool should you use?
22. You are fine-tuning a pre-trained BERT model from Hugging Face for a sentiment analysis task using Vertex AI training. The dataset has 100k examples. To avoid catastrophic forgetting, which layer freezing strategy should you apply?
23. Which Vertex AI service allows you to discover, fine-tune, and deploy foundation models with a few clicks, including models like Llama and Gemma?
24. You want to deploy a trained scikit-learn model to Vertex AI for online predictions. The model file is 2 GB. Which option should you use?
25. You are designing a distributed training job on Vertex AI for a PyTorch model using DistributedDataParallel (DDP). You have 4 nodes, each with 4 GPUs. What is the total number of workers that should be configured in the TF_CONFIG equivalent for PyTorch?
26. You have an edge device with limited compute resources. You need to deploy a deep learning model for real-time inference. Which model compression technique should you apply to reduce the model size and latency with minimal accuracy loss?
27. You want to use Vertex AI Vizier for hyperparameter tuning. You have 2 categorical parameters and 3 continuous parameters. Which algorithm is best suited for this mixed parameter space?
28. You are using tf.Transform to preprocess data at scale. Which TWO services are required to run tf.Transform on Google Cloud? (Choose 2)
29. You need to reduce the cost of training a large model on Vertex AI while maintaining fault tolerance. Which THREE actions should you take? (Choose 3)
30. You are fine-tuning a Gemma model using Vertex AI JumpStart. You want to combine the fine-tuned model with a custom output layer for a unique task. Which TWO components are required to deploy the combined model? (Choose 2)
31. A data scientist needs to train a large PyTorch model on a custom dataset using Vertex AI. The training script expects data from Cloud Storage and uses GPU acceleration. Which option correctly configures a custom training job with a pre-built container for PyTorch and attaches a single NVIDIA V100 GPU?
32. A machine learning engineer is preparing to train a Transformer-based model using TensorFlow on a single TPU v3-8 pod slice. The training script uses tf.distribute.TPUStrategy. Which environment variable must be set in Vertex AI to enable TPU training with the appropriate topology?
33. A team wants to perform hyperparameter tuning on a Vertex AI custom training job with 100 trials. They require an algorithm that efficiently explores the search space by learning from previous trials. Which algorithm should they select in the study configuration?
34. A data engineering team needs to compute rolling window features (7-day average, 30-day sum) from a high-volume stream of e-commerce events stored in BigQuery. They must output the features to Vertex AI Feature Store for online serving. Which approach is MOST cost-effective and scalable?
35. A machine learning engineer wants to use Vertex AI Vizier to tune three hyperparameters: learning rate (log scale), number of layers (integer), and optimizer (categorical). They have 50 parallel trials available. Which parameter specification types should they define?
36. A team is fine-tuning a large language model (LLaMA 2) using Vertex AI with a custom container on a multi-node GPU cluster. They need to implement model parallelism to fit the model across multiple GPUs because it does not fit into a single GPU memory. Which distributed training strategy should they use?
37. An engineer is using TensorFlow Transform (tf.Transform) to preprocess training data. They want to ensure that the same preprocessing logic is applied during inference without code duplication. Which approach should they take?
38. A company wants to bring their own Docker container to Vertex AI for training a model with a custom framework. They need to ensure the container is compatible with the Vertex AI training service. What is the minimum requirement for the container?
39. A machine learning team is deploying a PyTorch model on Vertex AI Prediction for real-time inference. The model was trained with preprocessing that includes tokenization and normalization. They want to embed the preprocessing logic in the model to reduce prediction latency and avoid additional service calls. Which approach should they take?
40. A data scientist wants to quickly experiment with a pre-trained Vision Transformer model from Hugging Face and fine-tune it on a custom dataset using Vertex AI. They want to use a managed environment with minimal setup. Which Vertex AI service should they use?
41. A team is training a large TensorFlow model that requires more memory than a single GPU provides. They have access to multiple GPUs on a single machine. Which distributed training strategy should they use to split the model layers across GPUs?
42. A company is deploying a deep learning model on edge devices with limited storage and computational resources. They need to reduce the model size by 80% while maintaining acceptable accuracy. Which two techniques should they combine?
43. A company wants to train a custom machine learning model on Vertex AI using a pre-built container for scikit-learn. They want to use spot VMs to reduce costs. However, the training job fails intermittently due to preemption. Which TWO actions should they take to ensure the training job completes successfully?
44. A data scientist is training a very large neural network using Vertex AI with multiple GPUs across multiple nodes. The model does not fit on a single GPU, so they need to use both data parallelism and model parallelism (pipeline parallelism). Which THREE components or configurations are required to set up distributed training with Vertex AI?
45. A company is fine-tuning a large language model (Gemma 7B) using Vertex AI JumpStart. They want to reduce the model's memory footprint for deployment on edge devices. Which THREE model compression techniques should they consider?
46. A data scientist has a TensorFlow 2.x model trained on a single GPU. They want to scale training to multiple GPUs on a single Vertex AI machine without code changes. Which strategy should they use?
47. You are fine-tuning a BERT model from Hugging Face Transformers on Vertex AI. You want to minimise cost for a short experiment. Which compute configuration should you use?
48. An ML engineer is using Vertex AI Vizier to tune hyperparameters for a PyTorch model. They want to maximise the chance of finding the global optimum within a fixed trial budget of 50 trials. Which algorithm should they select?
49. Your team is deploying a large language model (LLM) on Vertex AI for online prediction. The model exceeds the maximum request size for Vertex AI Prediction. Which approach should you take to serve this model?
50. You are using TensorFlow Transform (tf.Transform) to preprocess data for a model that will be deployed on Vertex AI. What is the primary benefit of using tf.Transform over Dataflow alone?
51. An ML engineer wants to use Vertex AI Model Garden to deploy a pre-trained foundation model for text summarisation. What is the quickest way to achieve this?
52. Your team is training a very large transformer model that does not fit on a single GPU. They are using Vertex AI custom training with PyTorch. Which distributed training approach should they use?
53. You are performing post-training quantisation of a trained TensorFlow model to INT8 for deployment on edge devices. Which technique should you use to minimise accuracy loss?
54. An ML team is building a feature pipeline with Dataflow that reads from BigQuery, computes features, and writes to Vertex AI Feature Store. They need to ensure that features are available for both training and serving with low latency. Which Feature Store option should they use?
55. You need to run a custom training job on Vertex AI using a pre-built container for scikit-learn. Which container image should you specify?
56. You are fine-tuning a pre-trained model using transfer learning. The new dataset is small and very similar to the original training data. To avoid overfitting, which layer freezing strategy should you adopt?
57. You are using Vertex AI hyperparameter tuning with a custom container. The training job reports the objective metric but Vizier is not converging. Which configuration change could improve convergence?
58. You are designing a distributed training job for a PyTorch model on Vertex AI using multiple machines with GPUs. Which TWO configurations are required to enable data parallelism with PyTorch DDP? (Choose 2.)
59. Your team is deploying a large model on edge devices and needs to reduce its size by 80% while maintaining reasonable accuracy. Which THREE techniques should they consider? (Choose 3.)
60. You are using Vertex AI to train a model with a custom container. You need to pass command-line arguments for hyperparameters. Which TWO methods can you use? (Choose 2.)
61. A data scientist wants to train a PyTorch model on Vertex AI using a pre-built container for GPU training. She needs to use 4 NVIDIA A100 GPUs on a single machine. Which machine configuration should she select?
62. An ML engineer is using Vertex AI Vizier to tune hyperparameters for a custom training job. The training job takes 2 hours per trial. To speed up the process, the engineer wants to run 10 trials in parallel. What is the correct way to configure parallel trial execution?
63. A team is building a feature pipeline for an ML model. They need to compute aggregate features over a sliding time window from streaming data. Which Google Cloud service is most appropriate for this task?
64. An ML team is fine-tuning a large language model using a custom container on Vertex AI. They want to reduce costs by using preemptible (spot) VMs for training. The training job is long-running and uses checkpointing. Which statement is correct regarding spot VM usage?
65. A machine learning engineer is training a TensorFlow model on Vertex AI using distributed training with the MultiWorkerMirroredStrategy. The training job uses 4 workers with 4 GPUs each. The engineer notices that the training is not scaling linearly. What is the most likely cause?
66. An organization wants to deploy a pre-trained BERT model for sentiment analysis on Vertex AI. They want to fine-tune it on their domain-specific data. Which feature in Vertex AI allows them to find and fine-tune a suitable foundation model with minimal effort?
67. A machine learning engineer is deploying a TensorFlow model on an edge device with limited memory and compute. The model needs to perform inference with low latency. The engineer has a trained float32 model. Which model compression technique should be applied first to reduce the model size and improve inference speed without significant accuracy loss?
68. A data scientist wants to use tf.Transform for preprocessing a large dataset stored in BigQuery before training a TensorFlow model. The preprocessing should be consistent during training and serving. What is the correct way to use tf.Transform in this scenario?
69. A team is training a large image classification model using transfer learning from a pre-trained ResNet50. The model will be deployed on mobile devices. They want to fine-tune only the last few layers while keeping the earlier layers frozen. Which approach should they use?
70. An engineer is training a model on Vertex AI using a custom container. The training job fails with an error indicating that the container exited with a non-zero status. The engineer wants to debug the issue. What is the best way to access the logs?
71. A research team is training a very large Transformer model that does not fit into the memory of a single GPU. They have access to multiple GPUs on a single machine and want to split the model layers across GPUs. Which distributed training strategy should they use?
72. An ML team wants to use Vertex AI Hyperparameter Tuning to tune a custom training job. They have a budget of 50 trials and want to use an algorithm that balances exploration and exploitation. Which algorithm should they choose?
73. A company is deploying a TensorFlow model on Vertex AI Prediction. The model is memory-intensive and requires GPU acceleration. The team wants to minimize latency and cost. Which TWO configurations should they select? (Select 2)
74. An ML engineer is using Vertex AI for distributed training of a PyTorch model across multiple nodes. The training job must use TPUs for high throughput. The engineer sets up the job configuration. Which THREE components are required for the training to work correctly? (Select 3)
75. A machine learning team is building a feature engineering pipeline using Dataflow. They need to compute features from streaming data and store them in Vertex AI Feature Store for online serving. The features must be updated within 5 seconds of the event. Which TWO services should they combine? (Select 2)
76. A team is scaling a prototype ML model to production on Vertex AI. The model was developed using scikit-learn and requires custom preprocessing. They want to minimize operational overhead and ensure consistency between training and serving. Which approach should they use?
77. A data scientist is fine-tuning a large language model from Hugging Face using Vertex AI Training with a GPU. The model has 7 billion parameters and does not fit on a single GPU. They need to split the model across multiple GPUs and train with data parallelism. Which strategy should they use?
78. A company wants to use Vertex AI Vizier to tune hyperparameters for a PyTorch model. They have a limited budget of 50 training jobs. The objective metric is validation accuracy, and they want to find the best configuration efficiently. Which algorithm should they choose?
79. A data engineer wants to compute feature aggregates over a large dataset stored in BigQuery and write the results to Vertex AI Feature Store. The pipeline must handle both batch and streaming data. Which Google Cloud service should they use?
80. An ML team is using Vertex AI to train a deep learning model on a large dataset. To reduce costs, they want to use preemptible VMs for training jobs. However, training must complete within a bounded time. Which strategy should they use?
81. A developer wants to quickly deploy a pre-trained foundation model for text generation without writing any code. Which Vertex AI feature should they use?
82. A company has a TensorFlow model for image classification that must run on edge devices with limited memory. They need to reduce the model size without significant accuracy loss. Which technique should they use?
83. An ML engineer is using Vertex AI distributed training for a TensorFlow model that uses the MirroredStrategy. They notice that the training throughput drops significantly when moving from a single GPU to multiple GPUs on the same machine. What is the most likely cause?
84. A data scientist wants to use a pre-trained ResNet model from Keras Applications and fine-tune it on a small custom dataset. Which approach should they take to avoid overfitting?
85. A team is using TensorFlow Transform (tf.Transform) to create preprocessing functions that will be used both in training and serving. They want to ensure consistency. Which artifact should they save after analyzing the training data?
86. An organization wants to use Vertex AI JumpStart to fine-tune a foundation model for a custom classification task. They have a labeled dataset stored in BigQuery. Which steps should they take?
87. An ML engineer is training a very large PyTorch model on Vertex AI using a TPU v3 pod. The training is slower than expected, and the TPU utilization is low. What is the most likely cause?
88. An ML team is optimizing an inference model for deployment on edge devices. They need to reduce the model size and improve latency while maintaining accuracy as much as possible. Which two techniques should they use? (Choose TWO.)
89. A company wants to use Vertex AI for hyperparameter tuning. Which three components are required to configure a hyperparameter tuning job? (Choose THREE.)
90. An engineer is designing a distributed training job on Vertex AI for a TensorFlow model that uses the MultiWorkerMirroredStrategy. They need to ensure proper communication between workers. Which two environment variables must be set correctly for each worker? (Choose TWO.)
91. A machine learning team is training a large transformer model on Vertex AI. They need to reduce training time by utilizing multiple GPUs across nodes, but the model is too large to fit into a single GPU memory. Which distributed training strategy should they use?
92. You are deploying a pre-trained BERT model for inference on edge devices. The model must be under 500 MB and inference latency under 50 ms. Which approach should you take?
93. A data science team is building a real-time feature engineering pipeline for ML model training and serving. They need to compute features from streaming data, store them for low-latency serving, and ensure consistency between training and serving. Which TWO Google Cloud services should they use?
94. You are fine-tuning a large language model (LLM) from Vertex AI Model Garden using a custom dataset. You need to minimize training cost while maintaining reasonable throughput. Which THREE strategies should you combine?
95. A company wants to use Vertex AI JumpStart to deploy a pre-trained image classification model and later fine-tune it on their own data. Which TWO statements are true about Vertex AI JumpStart?
96. You are setting up a hyperparameter tuning job on Vertex AI for a large neural network. The objective is to minimize validation loss. You want to explore the hyperparameter space efficiently with a limited budget of 100 trials. Which THREE settings should you configure in the study?
97. A team is training a custom TensorFlow model on Vertex AI using a pre-built container. They need to use a TPU pod slice (v3-32). What THREE actions are required to set up the training job correctly?
98. A company is deploying a computer vision model on edge devices using TensorFlow Lite. They want to reduce model size without significant accuracy loss. Which TWO model compression techniques are most suitable?
99. You are using tf.Transform to preprocess data for a TensorFlow model. You want to ensure that the same transformations applied during training are also applied during serving. Which THREE components are necessary to achieve this?
The Scaling Prototypes into ML Models domain covers the key concepts tested in this area of the PMLE exam blueprint published by Google Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PMLE domains — no account required.
The Courseiva PMLE question bank contains 99 questions in the Scaling Prototypes into ML Models domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Scaling Prototypes into ML Models domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.