CCNA Solving business challenges with ML Questions

40 questions · Solving business challenges with ML · All types, answers revealed

1
MCQ · medium

Refer to the exhibit. This IAM policy is applied at the project level. What is the effect of the condition?

A. The service account can only access AI Platform resources that start with 'projects/ml-'
B. The service account can only be used in projects whose ID starts with 'ml-'
C. The role is granted only if the project's name contains 'ml-'
D. The condition is ignored because conditions are not supported for service accounts
Answer: A

Condition on resource name limits access to resources with that prefix.

Why this answer

Option A is correct because the condition is a CEL expression that calls `startsWith` on the `resource.name` attribute, so the role binding applies only to AI Platform resources whose names begin with `projects/ml-`. The service account can therefore interact only with AI Platform resources (such as models, jobs, or endpoints) whose resource name starts with that prefix, which scopes the permission to a specific set of projects or resources.

Exam trap

Google Cloud often tests the distinction between resource-level conditions (like `resource.name`) and identity-level conditions (like `principal` or `request.auth`), and candidates mistakenly apply the condition to the service account's project ID instead of the target resource's name.

How to eliminate wrong answers

Option B is wrong because the condition checks the resource name (the AI Platform resource path), not the project ID of the service account itself; the service account can be from any project, but the resources it can access must have names starting with `projects/ml-`. Option C is wrong because the condition uses `resource.name.startsWith`, which operates on the resource name, not the project's display name or label; the project name is irrelevant. Option D is wrong because IAM conditions are fully supported for service accounts; the condition is evaluated at access time and can restrict permissions based on resource attributes.
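The prefix check can be mimicked in a few lines of Python. This is only a sketch: the exhibit is not reproduced here, so the prefix `projects/ml-` is taken from the explanation above, and `condition_allows` and the sample resource names are hypothetical. Real IAM evaluates the CEL expression server-side.

```python
# Simplified stand-in for the CEL expression
#   resource.name.startsWith("projects/ml-")
# Real IAM evaluates this server-side; this only mimics the prefix test.
ASSUMED_PREFIX = "projects/ml-"  # assumed from the explanation, exhibit not shown


def condition_allows(resource_name: str) -> bool:
    """Return True when the target resource name begins with the prefix."""
    return resource_name.startswith(ASSUMED_PREFIX)


# Hypothetical resource names, for illustration only.
samples = [
    "projects/ml-prod/models/churn",
    "projects/analytics/models/churn",
]
for name in samples:
    print(name, "->", "allowed" if condition_allows(name) else "denied")
```

The check looks only at the target resource's name, which is why options B and C (service account project, project display name) do not apply.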

2
MCQ · medium

A data science team is using AI Platform for training. They want to track hyperparameters and metrics across multiple experiments. What should they use?

A. Cloud Logging with custom metrics
B. Vertex AI Experiments
C. Store metrics in Cloud Storage and compare manually
D. Cloud Monitoring dashboards
Answer: B

Provides experiment tracking, comparison, and analysis.

Why this answer

Vertex AI Experiments is the correct choice because it is the native service within Vertex AI designed specifically for tracking, comparing, and analyzing hyperparameters and metrics across multiple training runs. It provides a centralized UI and SDK to log parameters, metrics, and artifacts, enabling systematic experiment management without manual effort or external tools.

Exam trap

Google Cloud often tests the distinction between logging/monitoring services (Cloud Logging, Cloud Monitoring) and ML-specific experiment tracking (Vertex AI Experiments), leading candidates to pick a generic monitoring tool instead of the purpose-built ML service.

How to eliminate wrong answers

Option A is wrong because Cloud Logging is intended for collecting and querying log data (e.g., application logs, error messages), not for structured tracking of hyperparameters and metrics across experiments; it lacks built-in experiment comparison features. Option C is wrong because storing metrics in Cloud Storage and comparing manually is inefficient, error-prone, and does not provide automated tracking, visualization, or versioning of experiments, which is the core requirement. Option D is wrong because Cloud Monitoring dashboards are designed for monitoring infrastructure and application performance metrics (e.g., CPU usage, latency), not for tracking ML experiment hyperparameters and metrics across multiple runs.

3
MCQ · medium

A data scientist trained a custom TensorFlow model using Vertex AI Training and wants to deploy it for online predictions with low latency (<100ms). Which deployment option on Google Cloud is best?

A. Deploy on Cloud Run with a custom container
B. Deploy on Cloud Functions
C. Deploy on AI Platform Prediction (legacy)
D. Deploy on Vertex AI Endpoints
Answer: D

Vertex AI Endpoints provide managed, scalable, low-latency online prediction.

Why this answer

Vertex AI Endpoints is the correct choice because it is the managed online-serving option for models trained on Vertex AI, offering prebuilt TensorFlow serving containers, automatic scaling, optional GPU acceleration, and built-in monitoring for latency-sensitive predictions. With a suitable machine type and a minimum replica count that keeps nodes warm, a managed endpoint can serve sub-100ms predictions without the team operating its own serving stack.

Exam trap

Google Cloud often tests the misconception that any serverless option (like Cloud Run or Cloud Functions) is sufficient for low-latency ML inference, overlooking cold starts and the do-it-yourself serving, scaling, and monitoring work that a managed Vertex AI endpoint avoids.

How to eliminate wrong answers

Option A is wrong because Cloud Run, while it supports custom containers, leaves model serving, versioning, and monitoring to the team, and cold starts after scaling to zero can exceed 100ms. Option B is wrong because Cloud Functions is built for short event-driven code rather than model serving: cold-start latency often exceeds 100ms and large TensorFlow models fit poorly. Option C is wrong because AI Platform Prediction (legacy) is deprecated in favor of Vertex AI and does not integrate with Vertex AI's model registry, monitoring, and autoscaling features.

4
MCQ · hard

A company has a large dataset of 1 million unlabeled images for object detection. They want to use AutoML Vision but need to minimize labeling effort. Which strategy should they use?

A. Use Vertex AI Active Learning to choose a subset for labeling
B. Apply data augmentation techniques to increase dataset size
C. Manually label all 1 million images
D. Train a custom object detection model on unlabeled data with unsupervised learning
Answer: A

Active learning selects the most valuable images, reducing labeling effort significantly.

Why this answer

Vertex AI Active Learning is the correct strategy because it intelligently selects the most informative unlabeled images for human labeling, maximizing model accuracy while minimizing labeling effort. This approach uses the model's uncertainty to prioritize data points that will most improve performance, making it ideal for large datasets where manual labeling of all images is impractical.

Exam trap

Google Cloud often tests the misconception that data augmentation can replace the need for initial labeling, when in reality it only expands existing labeled data and does not address the core challenge of obtaining labels for unlabeled images.

How to eliminate wrong answers

Option B is wrong because data augmentation techniques increase dataset size by creating modified copies of existing labeled images, but they do not reduce the initial labeling effort required for the original dataset. Option C is wrong because manually labeling all 1 million images is prohibitively time-consuming and expensive, directly contradicting the goal of minimizing labeling effort. Option D is wrong because unsupervised learning cannot train a custom object detection model without labeled data; object detection requires bounding box annotations or similar labels to learn object locations and classes.
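The uncertainty-based selection described above can be sketched generically. This is not the Vertex AI API: it is a minimal least-confidence sampler, and the function names, image ids, and probabilities are all made up for illustration.

```python
def least_confidence(probs):
    """Uncertainty score: 1 minus the highest class probability."""
    return 1.0 - max(probs)


def select_for_labeling(predictions, budget):
    """Pick the `budget` image ids the model is least sure about."""
    ranked = sorted(
        predictions,
        key=lambda item: least_confidence(item[1]),
        reverse=True,  # most uncertain first
    )
    return [image_id for image_id, _ in ranked[:budget]]


# Hypothetical model outputs: (image id, class probabilities).
predictions = [
    ("img_001", [0.98, 0.01, 0.01]),
    ("img_002", [0.40, 0.35, 0.25]),
    ("img_003", [0.55, 0.44, 0.01]),
    ("img_004", [0.90, 0.05, 0.05]),
]
to_label = select_for_labeling(predictions, budget=2)
print(to_label)
```

Only the selected images go to human labelers; the model is then retrained and the loop repeats until accuracy is acceptable.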

5
MCQ · easy

Refer to the exhibit. A data scientist runs this Vertex AI training job code. What will be the outcome?

A. The job runs as a regular custom training with 10 replicas.
B. A HyperparameterTuningJob is created and runs trials.
C. A CustomJob is created with hyperparameters from the spec.
D. The job fails because parallel_trial_count cannot be less than max_trial_count.
Answer: B

The hyperparameter_tuning_job_spec instructs Vertex AI to run tuning.

Why this answer

The code uses `HyperparameterTuningJob` with `parallel_trial_count=1` and `max_trial_count=10`. This creates a hyperparameter tuning job that runs up to 10 trials, each trial being a separate training run with different hyperparameter values. The `parallel_trial_count=1` means trials run sequentially, not in parallel, but this is valid and does not cause failure.

Exam trap

Google Cloud often tests the misconception that `parallel_trial_count` must be equal to or greater than `max_trial_count`, when in reality it can be any value from 1 to `max_trial_count`, and sequential trials are perfectly valid.

How to eliminate wrong answers

Option A is wrong because the code explicitly creates a `HyperparameterTuningJob`, not a regular custom training job; a regular custom training job would use `CustomJob` or `CustomContainerTrainingJob` without hyperparameter tuning parameters. Option C is wrong because a `CustomJob` does not accept hyperparameter tuning parameters like `parallel_trial_count` or `max_trial_count`; those are specific to `HyperparameterTuningJob`. Option D is wrong because `parallel_trial_count` can be less than `max_trial_count`; the constraint is that `parallel_trial_count` must be less than or equal to `max_trial_count`, and 1 ≤ 10 is valid.
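The relationship between the two counts can be checked with simple arithmetic. The sketch below is not the Vertex AI SDK: `plan_trials` is a hypothetical helper that applies the constraint as stated above and reports how many sequential batches of trials would run.

```python
import math


def plan_trials(max_trial_count: int, parallel_trial_count: int) -> int:
    """Return the number of sequential batches needed to run all trials."""
    # Constraint as described in the explanation above.
    if not 1 <= parallel_trial_count <= max_trial_count:
        raise ValueError(
            "parallel_trial_count must be between 1 and max_trial_count"
        )
    return math.ceil(max_trial_count / parallel_trial_count)


# Values from the question: trials run one at a time, ten batches in total.
sequential_batches = plan_trials(max_trial_count=10, parallel_trial_count=1)
# A hypothetical alternative: five trials at a time, two batches.
parallel_batches = plan_trials(max_trial_count=10, parallel_trial_count=5)
print(sequential_batches, parallel_batches)
```

Running trials sequentially takes longer, but each new trial can use the results of all earlier ones.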

6
MCQ · medium

A financial services company uses BigQuery ML to build a logistic regression model for fraud detection. The model is trained on the last 6 months of transaction data (about 50 million rows). After deployment, the fraud detection team notices a high false positive rate, causing customer dissatisfaction and extra manual review costs. The model is currently retrained monthly. The team wants to reduce false positives without sacrificing recall. They have access to real-time transaction streaming and can compute new features quickly. What is the most effective approach?

A. Replace logistic regression with gradient boosted trees (XGBoost) in BigQuery ML
B. Use Vertex AI AutoML Tables to train a more complex model
C. Increase retraining frequency to daily
D. Add engineered features like rolling transaction count and velocity per user
Answer: D

New features provide more signal to reduce false positives.

Why this answer

Option D is correct because adding engineered features like rolling transaction count and velocity per user directly addresses the high false positive rate by providing the logistic regression model with more discriminative temporal signals. Since the team has access to real-time streaming and can compute features quickly, these features capture behavioral patterns that reduce false positives without sacrificing recall, and logistic regression can effectively leverage them with proper feature engineering.

Exam trap

The trap here is that candidates often assume a more complex model (XGBoost or AutoML) is always better for reducing false positives, but the question specifically tests the principle that feature engineering—especially temporal aggregations—is the most effective lever when the model is already appropriate and data is streaming.

How to eliminate wrong answers

Option A is wrong because replacing logistic regression with gradient boosted trees (XGBoost) may improve model capacity but does not directly target the root cause of high false positives—lack of informative features—and could increase complexity without guaranteed recall preservation. Option B is wrong because using Vertex AI AutoML Tables to train a more complex model similarly addresses model complexity rather than feature insufficiency, and may introduce overfitting or latency issues without solving the false positive problem. Option C is wrong because increasing retraining frequency to daily does not change the underlying feature set or model architecture; it only refreshes weights on the same features, which will not reduce false positives if the model lacks discriminative signals.
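A rolling count and a velocity feature per user could be computed along the following lines. This is a sketch under assumptions: the one-hour window, the definition of velocity as amount per minute over the window, and the sample transactions are all illustrative and not taken from the question.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # assumed one-hour window


def rolling_features(transactions, window=WINDOW_SECONDS):
    """Yield (user, ts, count_in_window, amount_per_minute) per transaction.

    `transactions` is an iterable of (user, ts_seconds, amount),
    sorted by timestamp.
    """
    recent = defaultdict(deque)  # user -> (ts, amount) pairs inside the window
    for user, ts, amount in transactions:
        q = recent[user]
        q.append((ts, amount))
        # Drop this user's transactions that fell out of the window.
        while q and q[0][0] <= ts - window:
            q.popleft()
        count = len(q)
        velocity = sum(a for _, a in q) / (window / 60)
        yield user, ts, count, velocity


# Hypothetical stream of (user, timestamp in seconds, amount).
txns = [("u1", 0, 20.0), ("u1", 600, 35.0), ("u2", 900, 10.0), ("u1", 4000, 500.0)]
features = list(rolling_features(txns))
for row in features:
    print(row)
```

Features like these give the existing logistic regression a view of recent behavior per user, which single-transaction attributes cannot express.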

7
MCQ · easy

A financial company is building a fraud detection model. The dataset has 1% fraud cases and 99% legitimate transactions. Which technique should they use to handle the class imbalance?

A. Use class weighting or synthetic oversampling (SMOTE) during training
B. Randomly undersample the majority class to balance the dataset
C. Collect more data until the fraud rate increases
D. Train without any modifications; the model will naturally handle it
Answer: A

This addresses imbalance effectively.

Why this answer

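Class weighting and resampling techniques such as SMOTE are standard remedies for imbalanced datasets, so option A is correct. Option B is weaker because randomly undersampling the majority class discards information from legitimate transactions. Option C is impractical: collecting more data does not change the underlying fraud rate. Option D is wrong because an unmodified model will be biased toward the majority class and can reach 99% accuracy by predicting every transaction as legitimate.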
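To make class weighting concrete, the sketch below computes weights with the common "balanced" heuristic, n_samples / (n_classes * class_count). The helper name and the 1,000-row label list are illustrative; the 1% / 99% split comes from the question.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# 1% fraud (label 1) and 99% legitimate (label 0), as in the question.
labels = [1] * 10 + [0] * 990
weights = balanced_class_weights(labels)
print(weights)
```

With these weights, each fraud example contributes roughly 99 times as much to the loss as a legitimate one, so the model cannot minimize loss by ignoring the minority class.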

8
Multi-Select · hard

A team is deploying a model on Vertex AI Prediction. Which THREE configuration settings have a direct impact on both latency and cost? (Choose THREE.)

Select 3 answers
A. Size of training dataset
B. Minimum and maximum number of nodes (autoscaling)
C. Machine type (e.g., n1-standard-2)
D. Model architecture (e.g., number of layers)
E. Number of replicas in the endpoint
Answers: B, C, E

More nodes lower latency but increase cost.

Why this answer

Option B is correct because the minimum and maximum number of nodes in autoscaling directly control how many compute instances are provisioned to handle prediction requests. A higher minimum node count increases baseline cost and reduces cold-start latency, while a lower maximum can cause queuing and higher latency under load, directly impacting both metrics.

Exam trap

Google Cloud often tests the distinction between model-level properties (architecture, training data) and deployment-level configuration settings (machine type, replicas, autoscaling) to see if candidates confuse model development with serving infrastructure.

9
Multi-Select · medium

A media company uses a custom Python script on a Compute Engine VM to run batch predictions with a large ML model. The script loads the model from Cloud Storage, processes records from a Pub/Sub pull subscription, and writes results to BigQuery. Predictions are taking too long and the VM often runs out of memory. Which two changes should the company implement to improve performance and scalability? (Choose TWO)

Select 2 answers
A. Deploy the model on Vertex AI Prediction for batch prediction
B. Change Pub/Sub to a push subscription that sends messages to a load-balanced group of VMs
C. Use Dataflow to read from Pub/Sub, run predictions using the model, and write to BigQuery
D. Switch to a larger VM with more memory
E. Store results in Cloud SQL instead of BigQuery
Answers: B, C

Push subscriptions with load balancing allow horizontal scaling across multiple VMs.

Why this answer

Option B is correct because switching to a push subscription with a load-balanced group of VMs distributes the message processing load across multiple instances, preventing any single VM from being overwhelmed. This directly addresses the memory exhaustion issue by parallelizing the work and allowing horizontal scaling.

Exam trap

Google Cloud often tests the distinction between vertical scaling (larger VM) and horizontal scaling (load-balanced VMs or Dataflow), where candidates mistakenly choose a larger VM thinking it solves memory issues without recognizing the scalability bottleneck.

10
MCQ · hard

A mobile app company needs to run an image classification model on-device for real-time performance. The model is a ResNet-50 trained in TensorFlow. They need to reduce latency to under 50ms on a mid-range phone. Which optimization should they apply first?

A. Convert the model to TensorFlow Lite
B. Quantize the model weights to 8-bit integers
C. Replace ResNet-50 with MobileNet
D. Apply weight pruning to remove 50% of connections
Answer: B

Quantization reduces model size and speeds up inference significantly.

Why this answer

Quantizing the model weights to 8-bit integers (option B) is the most effective first optimization because it shrinks a float32 model by roughly 4x and lets mobile CPUs and accelerators use integer arithmetic, often cutting inference latency by 2-3x without architectural changes. It is a common first step for on-device deployment of TensorFlow models and usually costs only a small amount of accuracy, although whether it alone reaches the 50ms target depends on the device.

Exam trap

Google Cloud often tests the misconception that converting to TensorFlow Lite alone is sufficient for latency reduction, but the real performance gain comes from quantization, not the format change.

How to eliminate wrong answers

Option A is wrong because simply converting to TensorFlow Lite (TFLite) without quantization does not reduce latency; TFLite is a runtime format that enables on-device inference but does not inherently speed up computation—quantization must be applied during conversion. Option C is wrong because replacing ResNet-50 with MobileNet is a model architecture change that would require retraining and potentially degrade accuracy for the specific image classification task, and the question asks for the first optimization to apply, not a model swap. Option D is wrong because weight pruning (removing 50% of connections) can reduce model size but often requires specialized hardware or software support for sparse matrix multiplication, which is not universally available on mid-range phones, and the latency improvement is less predictable than quantization.
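The idea behind 8-bit weight quantization can be shown with plain Python. This is a generic sketch of symmetric per-tensor quantization, not what the TensorFlow Lite converter does internally; the function names and the four sample weights are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map the stored integers back to approximate float values."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.0, 0.89]  # hypothetical float32 weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
size_ratio = 32 // 8  # bits per float32 weight vs. bits per int8 weight

print(q, scale, max_error, size_ratio)
```

Each weight now needs 8 bits instead of 32, which is where the roughly 4x size reduction comes from; the rounding error per weight is bounded by half the scale.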

11
MCQ · easy

A data scientist wants to perform feature engineering on a large dataset stored in BigQuery before training a model. Which feature engineering tool is most appropriate?

A. Use Vertex AI Feature Store to store engineered features
B. Export data to Cloud Dataproc for feature engineering
C. Create a Dataflow pipeline to compute features
D. Use BigQuery ML TRANSFORM clause
Answer: D

Enables SQL-based feature transformations.

Why this answer

Option D is correct because the BigQuery ML TRANSFORM clause lets you define feature transformations directly in SQL, inside the CREATE MODEL statement, so the same preprocessing is applied automatically at prediction time. Option A is wrong because Vertex AI Feature Store stores and serves features that have already been engineered. Option B is wrong because Cloud Dataproc targets Hadoop/Spark workloads and requires exporting the data out of BigQuery. Option C is wrong because Dataflow is meant for building data pipelines, not for in-place SQL feature engineering.

12
MCQ · hard

A healthcare startup is using Vertex AI to train a deep learning model for detecting anomalies in chest X-rays. The training dataset is 500 GB of images stored in Cloud Storage (GCS). They use a custom training container with TPU v3-32. The training job completes successfully, but the model performance is poor. On investigation, they discover that the input images were not preprocessed correctly: the images were resized to 256x256 instead of the required 512x512. They need to fix the preprocessing and retrain as quickly as possible. The preprocessing pipeline involves decompressing, resizing, normalizing, and augmenting images. They have a small team and limited time. Which approach should they take?

A. Use Vertex AI Batch Transform to preprocess the images
B. Run another Vertex AI Training job with a modified container that preprocesses and trains
C. Use Dataflow with Apache Beam to build a parallel preprocessing pipeline
D. Use Cloud Data Fusion to orchestrate the preprocessing steps
Answer: C

Dataflow scales to process large volumes of data quickly in parallel.

Why this answer

Option C is correct because Dataflow with Apache Beam provides a fully managed, serverless, and highly parallel preprocessing pipeline that can efficiently process 500 GB of images in Cloud Storage. This approach decouples preprocessing from training, allowing the team to fix the resize step (256x256 to 512x512) and run the pipeline independently, then feed the corrected data into a new training job. Dataflow automatically scales resources to handle large datasets, minimizing retraining time without requiring infrastructure management.

Exam trap

Google Cloud often tests the misconception that Vertex AI Training should handle preprocessing inline, but the trap here is that decoupling preprocessing with a scalable, serverless pipeline like Dataflow is faster and more maintainable than modifying the training container or using prediction-oriented services like Batch Transform.

How to eliminate wrong answers

Option A is wrong because Vertex AI Batch Transform is designed for batch predictions on already-preprocessed data, not for transforming raw images (decompressing, resizing, normalizing, augmenting) — it lacks the flexibility to run custom preprocessing logic like image resizing. Option B is wrong because running a combined preprocessing and training container would require modifying the training code and container, which is inefficient for a quick fix; it also ties preprocessing to the training job, preventing parallelization and reuse of the preprocessing step. Option D is wrong because Cloud Data Fusion is a visual data integration tool for ETL/ELT workflows, but it is overkill for image preprocessing and does not natively support the high-throughput, parallel image transformations needed for 500 GB of X-ray images; it is better suited for structured data pipelines.

13
Multi-Select · medium

A company is evaluating Google Cloud ML solutions. Which TWO services are appropriate for building custom machine learning models (not using pre-built APIs)? (Choose TWO.)

Select 2 answers
A. Vertex AI Workbench
B. Cloud Translation API
C. Vertex AI Training
D. Cloud AutoML
E. Cloud Vision API
Answers: A, C

Notebooks for custom model development.

Why this answer

Vertex AI Workbench is correct because it provides a Jupyter-based development environment where data scientists can write custom code, train models from scratch, and manage the entire ML workflow without relying on pre-built APIs. It supports custom containers, frameworks like TensorFlow and PyTorch, and integrates with Vertex AI Training for distributed training.

Exam trap

Google Cloud often tests the distinction between 'building custom models' and 'using pre-built APIs' — candidates mistakenly choose AutoML or pre-built APIs because they think any ML service that trains models qualifies, but the question explicitly requires building from scratch without pre-built models.

14
MCQ · easy

A retail company wants to forecast weekly sales for each of its 500 stores. The data includes historical sales, promotions, holidays, and local weather. The company needs to update forecasts every week with new data. Which ML approach should they use?

A. Use BigQuery ML to create a linear regression model on historical data
B. Use Vertex AI Forecasting to train a time-series model with holiday and weather features
C. Export data to AutoML Tables and train a regression model
D. Build a custom LSTM model using TensorFlow on Vertex AI Workbench
Answer: B

Vertex AI Forecasting is designed for time series with multiple features and supports automatic retraining.

Why this answer

Vertex AI Forecasting is purpose-built for time-series forecasting with support for exogenous features like holidays and weather, making it the ideal choice for weekly sales predictions across 500 stores. It handles multiple time series automatically and integrates with the required weekly retraining cycle, unlike generic regression models that lack temporal awareness.

Exam trap

Google Cloud often tests the distinction between general regression (which assumes i.i.d. data) and time-series forecasting (which requires temporal dependencies and exogenous features), leading candidates to pick a simpler regression option like BigQuery ML or AutoML Tables instead of the specialized forecasting service.

How to eliminate wrong answers

Option A is wrong because BigQuery ML linear regression treats data as independent rows, ignoring the temporal ordering and seasonality inherent in sales forecasting, and cannot natively handle multiple time series (500 stores) with exogenous features like holidays. Option C is wrong because AutoML Tables is designed for tabular regression with independent rows, not time-series forecasting, and would require manual feature engineering to capture time dependencies, leading to poor forecast accuracy. Option D is wrong because building a custom LSTM on Vertex AI Workbench is overkill for this problem—Vertex AI Forecasting already provides a managed, scalable time-series solution with built-in support for holiday and weather features, avoiding the operational overhead of custom model development and hyperparameter tuning.

15
MCQ · medium

A data scientist deployed a TensorFlow model for sentiment analysis to Vertex AI Prediction. The model expects input key 'text' but the client sends requests with key 'review_text'. Which step should the data scientist take to resolve the error without retraining the model?

A. Use a Cloud Function to strip the 'review_text' key and replace it with 'text'
B. Retrain the model with input key 'review_text'
C. Create a new Vertex AI Endpoint with an alias mapping 'review_text' to 'text'
D. Modify the client code to send requests with input key 'text'
Answer: D

This aligns the request with the model's expected signature without changing the model.

Why this answer

Option D is correct because the most straightforward and reliable solution is to modify the client code to send the request with the expected input key 'text'. This avoids any additional infrastructure, latency, or complexity, and does not require retraining the model or altering the deployed endpoint. Vertex AI Prediction serves the model as-is, so aligning the client's request format with the model's expected input is the simplest and most maintainable fix.

Exam trap

Google Cloud often tests the misconception that you need to add infrastructure (like Cloud Functions) or modify the model to handle input key mismatches, when the correct answer is to adjust the client code to match the model's expected input schema.

How to eliminate wrong answers

Option A is wrong because introducing a Cloud Function adds an unnecessary hop, increases latency, and creates an extra point of failure; it also violates the principle of keeping the architecture simple when a direct client-side fix exists. Option B is wrong because retraining the model is an expensive and time-consuming process that is not needed when the only issue is a key name mismatch in the request payload. Option C is wrong because Vertex AI Endpoints do not support alias mappings for input keys; the endpoint simply forwards the request payload to the model, and the model's input signature is fixed at deployment time.
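The client-side fix amounts to re-keying the request body before it is sent. The sketch below builds the `{"instances": [...]}` JSON body used for online prediction requests; `build_request` and the sample review are hypothetical, and the actual HTTP call or SDK call is deliberately left out.

```python
import json


def build_request(reviews, input_key="text"):
    """Build a prediction request body: {"instances": [{key: value}, ...]}."""
    return {"instances": [{input_key: review} for review in reviews]}


# What the client sent before the fix (wrong key name).
old_body = {"instances": [{"review_text": "Great product, fast shipping."}]}

# Fix: re-key on the client so the payload matches the model's input signature.
reviews = [instance["review_text"] for instance in old_body["instances"]]
new_body = build_request(reviews)

print(json.dumps(new_body))
```

The model and endpoint stay untouched; only the code that assembles the payload changes.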

16
MCQ · medium

Refer to the exhibit. A machine learning engineer deployed a model on Vertex AI using this configuration. When testing the endpoint, the engineer receives a 400 error with the message: 'Invalid argument: Explanation metadata missing required field: `outputs`.' What is the most likely cause?

A. The explanation metadata outputs field is missing the required 'displayName' attribute.
B. The explanation metadata needs a 'baseline' configuration for the input.
C. The explanation metadata inputs field should be wrapped inside a 'visualization' block.
D. The explainability method chosen is not supported for the model type.
Answer: A

Vertex AI requires each output in explanation metadata to have a 'displayName' field.

Why this answer

The error message indicates that the explanation metadata provided in the Vertex AI endpoint configuration is missing the required `outputs` field. In Vertex AI's Explainable AI, the `outputs` field must contain at least one entry with a `displayName` attribute to define which output tensor to explain. Without this, the API rejects the request with a 400 error.

Exam trap

Google Cloud often tests the distinction between required fields in the explanation metadata (inputs vs. outputs) and their sub-attributes (like displayName), leading candidates to confuse a missing baseline or unsupported method with the actual missing outputs field.

How to eliminate wrong answers

Option B is wrong because a `baseline` configuration is required for the input, not the output; the error specifically points to the missing `outputs` field, not the input baseline. Option C is wrong because the `visualization` block is used for image-specific explanations (e.g., integrated gradients with visualization), not for wrapping the inputs field; the error is about the `outputs` field, not the inputs. Option D is wrong because the error message does not mention an unsupported explainability method; it explicitly states that the `outputs` field is missing, which is a metadata configuration issue, not a method compatibility problem.

17
Multi-Select · hard

A manufacturing company wants to predict equipment failure using sensor data. The data is highly imbalanced (only 1% failures). They are using a gradient boosted tree model with class weights. The model achieves 0.99 recall but 0.2 precision on the test set. Which two actions should they take to improve precision without significantly hurting recall? (Choose TWO)

Select 2 answers
A. Oversample the minority class using SMOTE
B. Try an anomaly detection algorithm like Isolation Forest
C. Add more features to the model
D. Increase the class weight for the minority class
E. Increase the decision threshold for classifying a positive
Answers: B, E

Anomaly detection is designed for imbalanced data and can improve precision by focusing on outliers.

Why this answer

Option B is correct because anomaly detection algorithms like Isolation Forest are designed to identify rare events by isolating anomalies rather than modeling the majority class, which can improve precision when the minority class is extremely rare (1%). Option E is correct because increasing the decision threshold for classifying a positive reduces false positives by requiring higher confidence for a positive prediction, directly improving precision while only minimally reducing recall if the model's probability scores are well-calibrated.

Exam trap

Google Cloud often tests the misconception that oversampling or adding features always improves model performance, but in highly imbalanced scenarios, these actions can degrade precision without recall benefit, and the correct approach is to adjust the decision threshold or use anomaly detection.
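The effect of raising the decision threshold can be seen on a toy example. The scores and labels below are invented and unusually well separated; on real data recall would typically drop somewhat as the threshold rises, so the threshold should be chosen from a precision-recall curve on validation data.

```python
def precision_recall(scores, labels, threshold):
    """Compute (precision, recall) when predicting positive for score >= threshold."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical predicted failure probabilities and true labels (1 = failure).
scores = [0.95, 0.90, 0.85, 0.60, 0.55, 0.40, 0.35, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

low = precision_recall(scores, labels, threshold=0.3)
high = precision_recall(scores, labels, threshold=0.7)
print("threshold 0.3:", low)
print("threshold 0.7:", high)
```

At the lower threshold five healthy machines are flagged alongside the three failures; at the higher threshold those false positives disappear while all three failures are still caught.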

18
MCQ · medium

A logistics company uses a regression model to predict delivery times. The model currently uses features: distance (km), traffic index, weather condition, and time of day. The data scientist notices that the model's predictions are systematically too low for deliveries during peak traffic hours. Which action would best address this issue?

A. Switch to a deep neural network model
B. Remove the traffic index feature as it is causing bias
C. Add a cross-feature that multiplies distance by traffic index
D. Collect more training data during peak traffic hours
Answer: C

This interaction term allows the model to capture the combined effect.

Why this answer

The model's systematic underestimation during peak traffic hours indicates a missing interaction effect between distance and traffic. Adding a cross-feature (distance × traffic index) allows a linear model to capture the non-linear relationship where traffic disproportionately increases delivery time over longer distances. This directly addresses the bias without discarding useful data or unnecessarily complicating the model.

Exam trap

Google Cloud often tests the misconception that systematic bias is always due to insufficient data or the wrong model type, when in fact it is frequently caused by missing feature interactions that can be fixed with simple feature engineering.

How to eliminate wrong answers

Option A is wrong because switching to a deep neural network is overkill and does not guarantee fixing systematic bias; it may even introduce overfitting without addressing the root cause of missing feature interactions. Option B is wrong because removing the traffic index feature would eliminate a key predictor entirely, likely worsening the model's accuracy and increasing bias rather than correcting it. Option D is wrong because collecting more data during peak hours would not fix the model's inability to model the interaction between distance and traffic; the model would still systematically underpredict unless the feature representation is improved.
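Adding the cross-feature is a one-line transformation. In the sketch below, the column names `distance_km` and `traffic_index`, the helper `add_cross_feature`, and the sample rows are assumptions made for illustration.

```python
def add_cross_feature(row):
    """Return a copy of the row with a distance x traffic interaction term."""
    crossed = dict(row)
    crossed["distance_x_traffic"] = row["distance_km"] * row["traffic_index"]
    return crossed


# Hypothetical deliveries: short/light traffic, short/peak, long/peak.
rows = [
    {"distance_km": 5.0, "traffic_index": 1.0},
    {"distance_km": 5.0, "traffic_index": 3.0},
    {"distance_km": 20.0, "traffic_index": 3.0},
]
crossed_rows = [add_cross_feature(r) for r in rows]
for r in crossed_rows:
    print(r)
```

The new column grows fastest for long trips in heavy traffic, which is exactly the region where a purely additive model underpredicts.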

19
MCQhard

A hospital wants to deploy a machine learning model for detecting anomalies in patient vital signs. The model was trained on historical data but must comply with HIPAA regulations. The model serving must be low-latency (under 100 ms) and handle up to 1000 requests per second. Which architecture should they use on Google Cloud?

A.Use Vertex AI Batch Prediction to run predictions in batch jobs every hour
B.Use BigQuery ML to run predictions directly from a BigQuery table
C.Deploy the model as a container on Cloud Run with a load balancer
D.Deploy the model to Vertex AI Prediction with a private endpoint and use VPC Service Controls for data isolation
AnswerD

Vertex AI Prediction with private endpoints offers low latency and VPC-SC provides HIPAA-compliant data boundaries.

Why this answer

Vertex AI Prediction with a private endpoint and VPC Service Controls meets all requirements: it provides low-latency (sub-100 ms) online predictions for up to 1000 QPS with autoscaling, the private endpoint keeps prediction traffic off the public internet, and the service perimeter guards against data exfiltration, which supports HIPAA compliance. Batch Prediction (A) cannot meet the latency requirement, BigQuery ML (B) is designed for analytical queries rather than real-time serving, and Cloud Run behind a load balancer (C), as described, includes no data isolation controls.

Exam trap

Google Cloud often tests the distinction between batch and online prediction, and candidates mistakenly choose Cloud Run because it offers low latency, but they overlook the HIPAA data isolation requirement that VPC Service Controls uniquely satisfy in a managed ML context.

How to eliminate wrong answers

Option A is wrong because Vertex AI Batch Prediction processes predictions in batch jobs with latency of minutes to hours, not sub-100 ms, and cannot handle real-time requests at 1000 QPS. Option B is wrong because BigQuery ML runs predictions via SQL queries on BigQuery tables, which incurs query execution latency (typically seconds) and is not designed for low-latency online serving. Option C is wrong because Cloud Run, while capable of low-latency serving, is described here only as a container behind a load balancer, with no private endpoint or service perimeter isolating patient data; those controls would need additional configuration, and the option lacks the managed ML serving optimizations of Vertex AI Prediction.

20
MCQhard

A media company wants to build a real-time recommendation system for articles. They have a large user base (10M+) and frequent updates to user interactions. They need to handle cold-start users and new articles. Which architecture on Vertex AI is most suitable?

A.Deploy a Deep Learning Recommendation Model (DLRM) for prediction
B.Use a contextual bandit algorithm for exploration only
C.Use matrix factorization with collaborative filtering
D.Implement a two-tower model (user and item towers) with embeddings and nearest neighbor search
AnswerD

Two-tower models can incorporate side features and enable fast retrieval.

Why this answer

The two-tower model (user and item towers) with embeddings and nearest neighbor search is the most suitable because each tower maps features into a shared embedding space: user attributes and context on one side, article content and metadata on the other. A cold-start user or a new article can therefore be embedded from its features alone, without any interaction history, and retrieved efficiently via approximate nearest neighbor (ANN) search. Decoupling user and item representations also lets the system scale to 10M+ users and index new articles as they arrive, with incremental training on new interactions instead of a full retrain.
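A two-tower model copes with cold start because it computes embeddings from side features, not from interaction history. The sketch below illustrates that idea with several simplifications, all of them assumptions: the tower weights are fixed by hand, not learned; each tower is a single linear layer; and retrieval uses brute-force dot products where a production system would use an ANN index. The feature layouts and article names are hypothetical.

```python
import math

# Hypothetical tower weights: each row maps a feature vector to one embedding
# dimension. In a real system these are learned; here they are fixed by hand.
USER_TOWER = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
ITEM_TOWER = [[1.0, 0.0], [0.0, 1.0]]


def embed(tower, features):
    """Apply a linear tower and L2-normalise the result."""
    raw = [sum(w * x for w, x in zip(row, features)) for row in tower]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0
    return [v / norm for v in raw]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def top_k(user_vec, item_index, k):
    """Brute-force nearest neighbour search by dot product (stand-in for ANN)."""
    ranked = sorted(item_index, key=lambda name: dot(user_vec, item_index[name]),
                    reverse=True)
    return ranked[:k]


# Item features: [is_sports, is_finance]
item_index = {
    "match-report": embed(ITEM_TOWER, [1.0, 0.0]),
    "market-wrap": embed(ITEM_TOWER, [0.0, 1.0]),
}

# A brand-new article with no clicks yet is indexed from its features alone.
item_index["transfer-news"] = embed(ITEM_TOWER, [0.9, 0.1])

# A brand-new user: [likes_sports, likes_finance, is_new_signup], taken from
# sign-up data, not from interaction history.
new_user = embed(USER_TOWER, [1.0, 0.0, 1.0])
print(top_k(new_user, item_index, k=2))
```

Neither the new user nor the new article has any interaction data, yet both receive embeddings and the article can be retrieved. A pure matrix factorization model has no equivalent path for unseen IDs.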

Exam trap

Google Cloud often tests the misconception that matrix factorization (Option C) is sufficient for cold-start scenarios, but candidates miss that it requires retraining on new data and cannot generate embeddings for unseen users or items without side features.

How to eliminate wrong answers

Option A is wrong because DLRM is a deep learning model for click-through rate prediction that requires retraining on new data and does not natively handle cold-start items or users without additional feature engineering, making it less suitable for frequent updates and real-time recommendation. Option B is wrong because a contextual bandit algorithm for exploration only lacks exploitation of known user preferences, leading to suboptimal recommendations over time, and does not provide a full recommendation system. Option C is wrong because matrix factorization with collaborative filtering cannot handle cold-start users or new articles without retraining the entire model, as it relies on existing interaction matrices and lacks a mechanism for incorporating new entities in real time.

21
MCQhard

A financial institution needs to deploy a fraud detection model with strict latency <100ms per prediction and high throughput (1000 predictions/sec). The model is a deep neural network. Which architecture on Google Cloud meets these requirements?

A.Deploy the model on AI Platform Training with a single large VM
B.Deploy the model as a Cloud Function triggered by Cloud Pub/Sub
C.Use Vertex AI Batch Prediction with a fixed number of machines
D.Use Vertex AI Prediction with autoscaling enabled and GPU machine types
AnswerD

Vertex AI Prediction provides real-time endpoints with autoscaling and GPU support for low latency and high throughput.

Why this answer

Vertex AI Prediction with autoscaling and GPU machine types is correct because it provides low-latency online serving with autoscaling to handle high throughput (1000 predictions/sec) while keeping latency under 100ms. GPUs accelerate deep neural network inference, and autoscaling ensures resources match demand without over-provisioning.

Exam trap

Google Cloud often tests the distinction between batch and online prediction services, where candidates mistakenly choose batch prediction for real-time requirements because they focus on throughput without considering latency constraints.

How to eliminate wrong answers

Option A is wrong because AI Platform Training is designed for model training, not real-time serving, and a single large VM cannot guarantee sub-100 ms latency under high throughput due to resource contention and lack of autoscaling. Option B is wrong because a Cloud Function triggered by Pub/Sub processes messages asynchronously, so it does not return a prediction to the caller within a bounded response time; Cloud Functions also lack GPU support, which makes deep neural network inference too slow. Option C is wrong because Vertex AI Batch Prediction is for asynchronous, offline predictions on large datasets, not real-time serving with strict latency requirements; it processes jobs in batches and cannot meet sub-100 ms per prediction.

22
Multi-Selecteasy

Which TWO are best practices for building ML pipelines on Vertex AI Pipelines?

Select 2 answers
A.Store all trained models in Cloud Storage without versioning
B.Use Cloud Build as the pipeline orchestrator
C.Use a container-based approach for each component
D.Define pipelines using the Kubeflow Pipelines SDK
E.Use Cloud Composer as the primary pipeline tool
AnswersC, D

Containerized components are reusable and scalable.

Why this answer

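Option C is correct because Vertex AI Pipelines is designed to run container-based components, where each step in the pipeline is a Docker container that encapsulates its dependencies and execution logic. This approach ensures reproducibility, isolation, and scalability, aligning with best practices for ML pipelines on Vertex AI. Option D is correct because Vertex AI Pipelines runs pipelines defined with the Kubeflow Pipelines (KFP) SDK, which compiles a Python pipeline definition into a specification that the service executes.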

Exam trap

Google Cloud often tests the distinction between general-purpose orchestration tools (Cloud Composer, Cloud Build) and ML-specific pipeline services (Vertex AI Pipelines), expecting candidates to recognize that container-based components and the Kubeflow Pipelines SDK are the correct building blocks for ML pipelines on Vertex AI.

23
MCQeasy

A startup wants to add sentiment analysis to their customer feedback app without any labeled data or custom model training. Which Google Cloud service should they use?

A.Cloud Natural Language API
B.AutoML Natural Language with manual labeling
C.Use BigQuery ML to train a text classification model
D.Train a custom sentiment model on Vertex AI
AnswerA

Pre-trained model available via API call.

Why this answer

The Cloud Natural Language API provides pre-trained models for sentiment analysis that require no labeled data or custom training. It offers a ready-to-use sentiment analysis feature via a simple API call, making it ideal for a startup that wants to add sentiment analysis without any machine learning expertise or data preparation.
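To make "a simple API call" concrete, the sketch below builds a request for the REST method `documents:analyzeSentiment` using only the Python standard library. The API key is a placeholder and the sample text is invented; the request is constructed but not sent, so the example runs without credentials or network access.

```python
import json
import urllib.request

# Placeholder credential (assumption): a real call needs a valid API key
# or OAuth token for a project with the Natural Language API enabled.
API_KEY = "YOUR_API_KEY"
URL = ("https://language.googleapis.com/v1/documents:analyzeSentiment"
       f"?key={API_KEY}")


def build_sentiment_request(text):
    """Build (but do not send) a sentiment analysis request for plain text."""
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sentiment_request("The delivery was late, but support was great.")

# Sending is left commented out so the sketch runs offline:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["documentSentiment"]["score"],
#           result["documentSentiment"]["magnitude"])
```

No training data, model artifact, or endpoint deployment appears anywhere in the flow, which is what separates the pre-trained API from the other three options.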

Exam trap

Google Cloud often tests the distinction between pre-trained APIs and custom training services, where candidates mistakenly choose AutoML or Vertex AI because they think any ML task requires custom training, overlooking the existence of fully managed, pre-trained APIs like Cloud Natural Language API.

How to eliminate wrong answers

Option B is wrong because AutoML Natural Language with manual labeling requires labeled data and custom model training, which contradicts the requirement of no labeled data or custom training. Option C is wrong because BigQuery ML is designed for training models using SQL queries on structured data, not for pre-trained sentiment analysis, and it still requires labeled data and training. Option D is wrong because training a custom sentiment model on Vertex AI involves building, training, and deploying a custom model, which requires labeled data and significant ML effort, not a pre-built solution.

24
MCQmedium

A financial services company wants to detect fraudulent transactions in real-time. They have a trained XGBoost model that runs on a single Compute Engine instance. The current solution processes about 100 transactions per second, but they need to scale to 10,000 transactions per second. Which approach should they take?

A.Increase the VM to a machine type with more vCPUs and memory
B.Deploy the model to Vertex AI Prediction with autoscaling enabled
C.Use Dataflow to process transactions in micro-batches every second
D.Rewrite the model as a Cloud Function triggered by Pub/Sub messages
AnswerB

Vertex AI Prediction automatically scales based on traffic.

Why this answer

Vertex AI Prediction with autoscaling is the correct choice because it is purpose-built for serving ML models at scale, automatically adjusting the number of compute nodes based on incoming request traffic. This allows the company to seamlessly handle the increase from 100 to 10,000 transactions per second without manual intervention, while XGBoost is natively supported as a framework.
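A rough capacity estimate shows why this is a horizontal scaling problem. The sketch assumes that each serving replica sustains about the same 100 transactions per second as the current Compute Engine instance; real per-replica throughput depends on machine type and model, and the 25% headroom figure is likewise an assumed value.

```python
import math


def replicas_needed(target_tps, per_replica_tps, headroom=0.0):
    """Smallest replica count that serves target_tps with the given headroom."""
    if per_replica_tps <= 0:
        raise ValueError("per_replica_tps must be positive")
    return math.ceil(target_tps * (1.0 + headroom) / per_replica_tps)


# Assumption: one replica handles ~100 TPS, like the existing single VM.
print(replicas_needed(10_000, 100))         # no headroom
print(replicas_needed(10_000, 100, 0.25))   # 25% headroom (assumed value)
```

Under these assumptions the target load needs on the order of a hundred replicas at peak, which no single larger VM can replace. Autoscaling also allows the replica count to drop again when traffic falls.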

Exam trap

Google Cloud often tests the misconception that vertical scaling (bigger VM) is sufficient for large throughput increases, when in reality horizontal scaling with a managed service like Vertex AI is required for elasticity and high availability.

How to eliminate wrong answers

Option A is wrong because simply scaling up a single VM (vertical scaling) has hardware limits and cannot reliably handle a 100x increase in throughput; it also introduces a single point of failure and lacks automatic scaling. Option C is wrong because Dataflow is designed for batch and stream processing pipelines, not for low-latency real-time model serving; processing in micro-batches every second would add unacceptable latency for fraud detection. Option D is wrong because Cloud Functions are suited to lightweight, event-driven tasks, not sustained high-throughput inference at 10,000 TPS; a Pub/Sub trigger also makes scoring asynchronous, which adds queueing delay to every transaction, and rewriting the model is extra work that a managed serving endpoint avoids.

25
MCQhard

A retail company uses Vertex AI Tabular (AutoML Tables) to build a customer churn prediction model. The training dataset contains 50,000 rows and 30 features, with a 5% churn rate. The model achieves an AUC of 0.85 on the test set. When deployed for online predictions, the average latency is 800ms, while the business requirement is under 200ms. The engineer has already reduced the feature set to 10 features, but latency only dropped to 600ms. The model size is 2GB. The endpoint is in us-central1 using an n1-standard-4 machine with minReplicaCount=1. What should the engineer do to meet the latency requirement?

A.Move the endpoint to a region geographically closer to the majority of customers.
B.Use a larger machine type (e.g., n1-highmem-8) for the endpoint.
C.Convert the model to a custom TensorFlow Lite model and deploy it.
D.Enable model compression in Vertex AI Tabular.
AnswerD

Model compression reduces model size and inference latency, which directly addresses the issue.

Why this answer

Vertex AI Tabular (AutoML Tables) supports model compression, which reduces model size and inference latency by applying techniques like quantization and pruning. Since the model is 2 GB and latency is still 600 ms after feature reduction (well above the 200 ms target), the remaining bottleneck is the model itself; enabling compression shrinks the model and shortens inference time without changing infrastructure or converting to a different framework.

Exam trap

Google Cloud often tests the misconception that latency issues are always solved by scaling up hardware or changing regions, but the real bottleneck here is model size and inference complexity, which Vertex AI Tabular's built-in compression directly addresses without requiring framework conversion or infrastructure changes.

How to eliminate wrong answers

Option A is wrong because moving the endpoint geographically reduces network round-trip time, but the 600ms latency is dominated by model inference time on the server, not network latency; the business requirement is under 200ms total, and network latency is typically <50ms within a region. Option B is wrong because using a larger machine type (e.g., n1-highmem-8) increases CPU/memory resources, but AutoML Tables models are often CPU-bound and the bottleneck is model size and complexity, not compute capacity; a larger machine may only shave off a small fraction of latency and is cost-inefficient. Option C is wrong because Vertex AI Tabular models are not TensorFlow-based; they are ensemble models (e.g., gradient-boosted trees, neural networks) that cannot be directly converted to TensorFlow Lite, which is designed for TensorFlow/Keras models, not AutoML Tables output.

26
MCQhard

A company wants to use ML to predict customer churn. They have user activity logs in Cloud Storage, account data in BigQuery, and want an automated pipeline. Which pipeline architecture on Google Cloud should they use?

A.Load both data sources into AutoML Tables and train directly
B.Export logs from Cloud Storage to Cloud Dataproc for preprocessing, then train
C.Use Cloud Functions to preprocess data, then train on AI Platform
D.Use BigQuery to join logs and account data, train on Vertex AI, deploy to an endpoint
AnswerD

Seamless integration: BigQuery queries external tables, Vertex AI trains from BigQuery, endpoint serves.

Why this answer

Option D is correct because it leverages BigQuery's ability to join structured account data with semi-structured logs (via federated queries or external tables), then uses Vertex AI for end-to-end ML training and deployment. This architecture minimizes data movement, keeps the pipeline serverless, and directly addresses the requirement for an automated pipeline with both data sources.

Exam trap

Google Cloud often tests the misconception that AutoML Tables can handle multi-source data natively, when in fact it requires a single pre-joined dataset, and that Cloud Functions are suitable for heavy preprocessing workloads despite their strict resource limits.

How to eliminate wrong answers

Option A is wrong because AutoML Tables expects a single pre-joined table (for example, a BigQuery table or a CSV file) and cannot directly ingest two separate sources without prior joining; it also lacks native pipeline automation. Option B is wrong because Cloud Dataproc (managed Spark/Hadoop) is overkill for simple preprocessing and introduces unnecessary cluster management overhead; BigQuery can perform the join and preprocessing more efficiently without spinning up ephemeral clusters. Option C is wrong because Cloud Functions have strict execution-time and memory limits, making them unsuitable for preprocessing large-scale log data; the preprocessing belongs in BigQuery, and AI Platform is the legacy predecessor of Vertex AI, which is the current training platform.

27
MCQeasy

A retail company wants to forecast daily sales for inventory planning. They have 3 years of historical sales data with clear weekly and yearly seasonality. Which approach should they use?

A.Call a pre-built Google Cloud API for sales prediction
B.Use a linear regression model in Vertex AI
C.Use Vertex AI AutoML Tables with date as feature
D.Use BigQuery ML to train an ARIMA_PLUS model
AnswerD

ARIMA_PLUS handles seasonality and is optimized for time series.

Why this answer

Option D is correct because ARIMA_PLUS in BigQuery ML is specifically designed for time-series forecasting with multiple seasonalities (weekly and yearly). It automatically handles seasonality detection, trend decomposition, and holiday effects, making it ideal for retail sales data with clear periodic patterns.
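For orientation, the BigQuery ML statements involved might look like the sketch below, held as Python strings so they can be passed to whichever BigQuery client is in use. The dataset, table, and column names (`retail.daily_sales`, `sale_date`, `total_sales`) are hypothetical, and the 30-day horizon, 0.9 confidence level, and `'US'` holiday region are assumed values.

```python
# Hypothetical dataset, table, and column names throughout.
CREATE_MODEL_SQL = """
CREATE OR REPLACE MODEL `retail.daily_sales_forecast`
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'sale_date',
  time_series_data_col = 'total_sales',
  data_frequency = 'DAILY',
  holiday_region = 'US'
) AS
SELECT sale_date, total_sales
FROM `retail.daily_sales`
"""

# Forecast the next 30 days with a 90% prediction interval (assumed values).
FORECAST_SQL = """
SELECT
  forecast_timestamp,
  forecast_value,
  prediction_interval_lower_bound,
  prediction_interval_upper_bound
FROM ML.FORECAST(
  MODEL `retail.daily_sales_forecast`,
  STRUCT(30 AS horizon, 0.9 AS confidence_level)
)
"""

print(CREATE_MODEL_SQL.strip())
print(FORECAST_SQL.strip())
```

The training query selects only a timestamp and a value. No lag features or Fourier terms are engineered by hand, in contrast with the linear regression and AutoML Tables options.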

Exam trap

The trap here is that candidates often choose AutoML Tables (Option C) thinking it can handle any structured data, but they miss that AutoML Tables is not a dedicated time-series model and requires manual feature engineering to capture seasonality, whereas ARIMA_PLUS is purpose-built for this scenario.

How to eliminate wrong answers

Option A is wrong because calling a pre-built Google Cloud API for sales prediction is vague and not a specific, integrated solution for time-series forecasting with seasonality; such APIs may not exist or may not handle custom seasonality patterns. Option B is wrong because linear regression in Vertex AI is a general-purpose model that does not inherently capture time-series dependencies like weekly and yearly seasonality without extensive feature engineering (e.g., lag features, Fourier terms). Option C is wrong because Vertex AI AutoML Tables with date as a feature treats the problem as a regression on tabular data, not as a dedicated time-series model, and may fail to properly model temporal autocorrelation and multiple seasonalities without manual time-series preprocessing.

28
MCQeasy

A global retail company uses Vertex AI Recommendations to provide product recommendations on their website. They have a large catalog and millions of users. The initial deployment works well for active users, but they notice that new users (with no purchase history) receive generic recommendations that are not personalized. The company wants to improve the cold-start experience. They have user demographic data (age, location) available at sign-up. Current recommendation model is a collaborative filtering model using the built-in Vertex AI Recommendations. What should the company do to improve personalization for new users?

A.Collect more historical interaction data before showing recommendations
B.Disable recommendations for new users until they have at least 10 interactions
C.Increase the user exploration parameter in the Vertex AI Recommendations configuration
D.Build a custom two-tower recommendation model using Vertex AI Training
AnswerC

Exploration helps serve diverse items to new users to learn preferences.

Why this answer

Option C is correct because increasing the user exploration parameter in Vertex AI Recommendations instructs the model to allocate a higher percentage of recommendations to items with less historical data, effectively enabling personalized suggestions for cold-start users based on available demographic signals. This parameter directly controls the balance between exploiting known user-item interactions and exploring new or less-seen items, which is the standard mechanism within Vertex AI's built-in collaborative filtering to address the cold-start problem without requiring a custom model.

Exam trap

Google Cloud often tests the misconception that cold-start problems always require custom models or additional data collection, when in fact built-in platform features like exploration parameters are designed specifically to handle this scenario without custom development.

How to eliminate wrong answers

Option A is wrong because collecting more historical interaction data before showing recommendations does not solve the immediate cold-start problem for new users; it merely delays personalization and contradicts the goal of improving the experience from sign-up. Option B is wrong because disabling recommendations for new users until they have at least 10 interactions is a poor user experience and ignores the fact that Vertex AI Recommendations can leverage user demographic data (age, location) to provide personalized suggestions even without purchase history. Option D is wrong because building a custom two-tower recommendation model using Vertex AI Training is unnecessary and over-engineered; Vertex AI's built-in service already supports exploration parameters and can utilize demographic features for cold-start personalization without requiring custom model development.

29
MCQhard

Refer to the exhibit. A user attempts to upload a model to Vertex AI Model Registry using the gcloud CLI. The command fails with the error shown. What is the most likely cause?

A.The region us-central1 does not support Vertex AI
B.The --container-command flag is misspelled
C.The --artifact-uri points to a directory instead of a model file
D.The --container-ports flag expects a comma-separated list
AnswerC

Error indicates URI must point to a single file.

Why this answer

The error indicates that the `--artifact-uri` flag points to a directory (e.g., `gs://bucket/model/`) rather than a specific model file (e.g., `gs://bucket/model/saved_model.pb`). Vertex AI Model Registry requires a direct path to the model artifact file, not a container directory, because the service needs to locate and register the exact model binary for deployment.

Exam trap

Google Cloud often tests the distinction between a directory path and a file path in cloud CLI commands, exploiting the common mistake of assuming a folder URI is acceptable when the service expects a specific artifact file.

How to eliminate wrong answers

Option A is wrong because us-central1 is a fully supported region for Vertex AI, including Model Registry, and the error message does not indicate a regional restriction. Option B is wrong because the `--container-command` flag is correctly spelled in the command; the error is unrelated to flag spelling. Option D is wrong because the `--container-ports` flag does accept a comma-separated list, but the error message points to the `--artifact-uri` value, not to the ports flag.

30
Multi-Selecthard

An e-commerce company uses a recommendation model that suggests products based on user browsing history. The model was trained on data from the past year and has high accuracy on the test set. However, after deployment, the click-through rate (CTR) on recommendations is much lower than expected. Which three steps should the data scientist take to diagnose and improve the model? (Choose THREE)

Select 3 answers
A.Run offline evaluation on a holdout dataset to confirm accuracy
B.Set up an A/B experiment comparing the model's recommendations against a baseline
C.Retrain the model on the most recent three months of data to capture recent trends
D.Check the distribution of predictions versus the training set to detect drift
E.Increase the training dataset size by including data from two years ago
AnswersB, C, D

A/B testing validates the model's real-world performance and identifies issues.

Why this answer

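Option B is correct because an A/B experiment directly measures the model's real-world impact by comparing its CTR against a baseline (e.g., random or popularity-based recommendations). This isolates the model's performance from confounding factors like seasonality or user behavior changes, providing a causal estimate of its effectiveness. Option C is correct because a model trained on a full year of data can lag behind current user behavior; retraining on the most recent three months lets it reflect recent trends. Option D is correct because comparing the distribution of live predictions against the training set reveals drift, a common reason that high offline accuracy does not carry over to production.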

Exam trap

Google Cloud often tests the misconception that high offline accuracy guarantees online success, ignoring that offline metrics can be misleading due to distribution shift, feedback loops, or mismatched optimization objectives (e.g., accuracy vs. CTR).
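One simple way to compare prediction distributions is the population stability index (PSI), sketched below. This is one possible drift statistic, not a method the question prescribes. The two score samples are synthetic, and scores are assumed to lie in [0, 1].

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between two samples of scores in [0, 1]."""
    def fractions(values):
        counts = [0] * bins
        for v in values:
            counts[min(int(v * bins), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic samples (assumed): training scores spread evenly, serving scores
# bunched near the top of the range.
training_scores = [i / 100 for i in range(100)]
serving_scores = [0.6 + 0.4 * i / 100 for i in range(100)]

print(f"same distribution:    {psi(training_scores, training_scores):.3f}")
print(f"shifted distribution: {psi(training_scores, serving_scores):.3f}")
```

Identical distributions give a PSI of zero, and the value grows as the serving distribution moves away from the training one. A large value is a prompt to investigate and possibly retrain, not proof of a specific cause.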

31
MCQeasy

A data scientist runs a BigQuery ML prediction query and gets a region mismatch error. The model is in the US region, but the new_data table is in the EU region. What is the simplest way to resolve this?

A.Recreate the model in the EU region using the same training data
B.Copy the new_data table to the US region using the BigQuery UI or CLI
C.Enable cross-region query in BigQuery settings
D.Export the model from US and import it to EU
AnswerB

Copying the table to the same region resolves the mismatch with minimal effort.

Why this answer

Option B is correct because the simplest fix is to move the new_data table to the same region as the model (US). BigQuery ML requires that the model and the data used for predictions reside in the same multi-region or regional location. Copying the table via the BigQuery UI or CLI (e.g., `bq cp`) is a straightforward, no-code operation that avoids retraining or exporting the model.

Exam trap

The trap here is that candidates may overthink the solution and choose to recreate the model or export/import it, not realizing that the simplest and most efficient fix is to copy the data table to the model's region.

How to eliminate wrong answers

Option A is wrong because recreating the model in the EU region would require retraining the model from scratch, which is unnecessary and time-consuming when a simple data copy resolves the mismatch. Option C is wrong because BigQuery does not support a 'cross-region query' setting; queries are always restricted to a single region or multi-region, and enabling such a feature is not possible. Option D is wrong because exporting and importing a model between regions is more complex and involves additional steps (e.g., using Cloud Storage as an intermediary), whereas copying the table is simpler and directly addresses the region mismatch.

32
Multi-Selectmedium

Which THREE are key capabilities of Vertex AI Feature Store?

Select 3 answers
A.Automatic generation of feature embeddings
B.Feature monitoring and validation to detect skew
C.Online serving for low-latency feature retrieval
D.Real-time streaming ingestion from Apache Kafka
E.Offline batch serving for training
AnswersB, C, E

Feature Store includes monitoring for distribution changes.

Why this answer

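Option B is correct because Vertex AI Feature Store provides built-in feature monitoring and validation capabilities that detect training-serving skew and data drift. This is critical for maintaining model performance in production, as it alerts when the distribution of feature values changes between training and serving environments. Option C is correct because the online store serves the latest feature values at low latency for real-time predictions. Option E is correct because offline batch serving exports historical feature values in bulk to build training datasets.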

Exam trap

Google Cloud often tests the misconception that Vertex AI Feature Store includes automatic embedding generation or direct Kafka integration, when in fact these are separate services or require custom implementation.

33
MCQmedium

A company is training a large neural network on Vertex AI and training jobs keep failing with 'Out of memory' errors. The VM uses a standard n1-standard-4 machine with 15 GB RAM. Which action should they take first?

A.Use a larger machine type like n1-standard-16
B.Reduce the batch size in the training script
C.Enable distributed training across multiple VMs
D.Switch the training to CPU only
AnswerB

Smaller batch size reduces peak memory usage.

Why this answer

The 'Out of memory' error on an n1-standard-4 VM (15 GB RAM) indicates the training job's memory footprint exceeds available RAM. Reducing the batch size directly decreases the memory required for storing intermediate activations and gradients during training, which is the most immediate and cost-effective fix without changing the underlying infrastructure.
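The relationship between batch size and memory can be sketched with a simple linear estimate: a fixed cost for weights and optimizer state plus a per-sample cost for activations. The 4 GB and 0.1 GB figures and the starting batch size of 256 are hypothetical; real values have to be measured for the actual model.

```python
def training_memory_gb(batch_size, fixed_gb, per_sample_gb):
    """Fixed cost (weights, optimizer state) plus activations per sample."""
    return fixed_gb + batch_size * per_sample_gb


def largest_batch_that_fits(ram_gb, fixed_gb, per_sample_gb, start=256):
    """Halve the batch size until the estimated footprint fits in RAM."""
    batch = start
    while batch > 1 and training_memory_gb(batch, fixed_gb, per_sample_gb) > ram_gb:
        batch //= 2
    return batch


# Hypothetical numbers: 4 GB fixed cost and 0.1 GB per sample on the 15 GB VM.
print(largest_batch_that_fits(ram_gb=15, fixed_gb=4, per_sample_gb=0.1))
```

Halving is a common first step because it needs no infrastructure change. If even a batch size of 1 does not fit, the fixed cost is the problem and a larger machine becomes the appropriate next step.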

Exam trap

The trap here is that candidates often jump to scaling up infrastructure (larger machine or distributed training) instead of first tuning the training hyperparameter (batch size) that directly controls memory consumption, which is the simplest and most cost-effective fix.

How to eliminate wrong answers

Option A is wrong because upgrading to a larger machine type (e.g., n1-standard-16) increases cost and may still fail if the batch size remains too large for the model; it works around the memory pressure without addressing its cause. Option C is wrong because distributed training across multiple VMs adds network and synchronization overhead (e.g., all-reduce) and does not reduce per-VM memory consumption unless the per-worker batch size is also reduced. Option D is wrong because the n1-standard-4 described has no GPU attached, so training already runs on the CPU and host RAM; switching to CPU only would not lower memory usage and would not resolve the out-of-memory error.

34
Multi-Selecteasy

Refer to the exhibit. A data scientist is evaluating a binary classification model trained with BigQuery ML on an imbalanced dataset. The exhibit shows the output of ML.EVALUATE run on two different thresholds. Which TWO actions should the data scientist take to improve model performance? (Choose two.)

Select 2 answers
A.Add more features from the source data.
B.Use AUC-ROC as the evaluation metric instead of accuracy.
C.Apply SMOTE oversampling in the preprocessing pipeline.
D.Use class weights in the CREATE MODEL statement.
E.Increase the number of training iterations.
AnswersB, D

AUC-ROC is robust to class imbalance and provides a better measure of model discrimination.

Why this answer

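Option B is correct because AUC-ROC is insensitive to class imbalance and evaluates the model's ability to rank positive instances higher than negative ones across all thresholds, unlike accuracy, which can be misleading when the majority class dominates. In BigQuery ML, ML.EVALUATE returns metrics like accuracy, precision, recall, and AUC-ROC; for imbalanced datasets, AUC-ROC provides a more reliable measure of discriminative power. Option D is correct because class weights make errors on the minority class count for more during training; in BigQuery ML they are set in the CREATE MODEL statement through the `AUTO_CLASS_WEIGHTS` or `CLASS_WEIGHTS` option.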
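The gap between accuracy and AUC-ROC on imbalanced data is easy to reproduce. The sketch below uses synthetic labels with 5% positives and a degenerate model that gives every row the same low score; both the data and the helper functions are illustrative, not output from ML.EVALUATE.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of rows whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the area under the ROC curve.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data: 5% positives, and a model that scores every row the same.
labels = [1] * 5 + [0] * 95
constant_scores = [0.1] * 100

print("accuracy:", accuracy(labels, constant_scores))
print("AUC-ROC: ", roc_auc(labels, constant_scores))
```

The constant model never identifies a positive case yet reports high accuracy, while its AUC-ROC sits at the chance level, which is why accuracy alone is a poor guide here.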

Exam trap

Google Cloud often tests the misconception that adding more data or features is a universal fix for imbalanced datasets, when in fact the core issue requires adjustments to the evaluation metric or the loss function (e.g., class weights) rather than simply increasing data volume or iterations.

35
Multi-Selecteasy

A company wants to use pre-built Google Cloud APIs for text analysis. Which TWO APIs can they use? (Choose TWO.)

Select 2 answers
A.Cloud Natural Language API
B.Cloud Translation API
C.Cloud Vision API
D.Video Intelligence API
E.Document AI
AnswersA, B

For text analysis.

Why this answer

The Cloud Natural Language API provides pre-built machine learning models for text analysis tasks such as entity recognition, sentiment analysis, and syntax analysis. The Cloud Translation API can translate text between languages, which is a form of text analysis. Both are pre-built Google Cloud APIs that directly address the company's need for text analysis without requiring custom model training.

Exam trap

The trap here is that candidates may confuse Document AI with a general text analysis API, but Document AI is specifically for document parsing and OCR, not for core NLP tasks like sentiment or entity extraction, which are the focus of the Cloud Natural Language API.

36
MCQhard

A healthcare startup is building a diagnostic tool that uses a deep learning model to classify medical images. The model is trained on TensorFlow and deployed on Vertex AI Prediction. The startup has strict latency requirements: predictions must return within 200 ms for 95% of requests. Current performance shows p95 latency of 350 ms. The team has already tried using a smaller model, but accuracy dropped below acceptable levels. The traffic pattern is spiky: low load during nights but bursts of 1000 requests per second during business hours. Currently, they use a single n1-highmem-8 VM with a GPU attached. They have a budget for additional resources but need to optimize cost. The model is about 500 MB and requires GPU for inference. Which course of action should they take to meet the latency requirement while managing costs?

A. Upgrade to an n1-highmem-16 VM with a more powerful GPU
B. Switch to batch prediction using Vertex AI Batch Prediction and store results in a database for retrieval
C. Create a Vertex AI Prediction endpoint with an accelerator (GPU) and enable autoscaling (min 1, max 5 nodes)
D. Deploy the model as a Cloud Function using TensorFlow Serving
Answer: C

Autoscaling with GPU provides low latency during bursts and cost efficiency by scaling down during low load.

Why this answer

Option C is correct because it leverages Vertex AI Prediction's autoscaling to handle spiky traffic efficiently, using GPU-accelerated endpoints that can scale from 1 to 5 nodes to meet the 200 ms p95 latency requirement. This approach minimizes cost during low-load periods while providing burst capacity for the 1000 requests per second peak, addressing both the latency and budget constraints without compromising model accuracy.
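
A rough sketch of what Option C could look like with the Vertex AI Python SDK (`google-cloud-aiplatform`). The replica range comes from the question and the machine type reuses the one the team already runs. The accelerator type and the function name are assumptions made for illustration.

```python
# Deployment settings mirroring Option C.
DEPLOY_KWARGS = {
    "machine_type": "n1-highmem-8",         # machine type named in the question
    "accelerator_type": "NVIDIA_TESLA_T4",  # assumed GPU type; not given in the question
    "accelerator_count": 1,
    "min_replica_count": 1,                 # keeps cost low overnight
    "max_replica_count": 5,                 # burst capacity for business hours
}


def deploy_with_autoscaling(project, location, model_resource_name):
    """Deploy an uploaded model to a new endpoint with GPU autoscaling."""
    # Imported lazily so the settings above can be inspected without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    model = aiplatform.Model(model_name=model_resource_name)
    return model.deploy(**DEPLOY_KWARGS)
```

Setting `min_replica_count` to 1 rather than a higher value is what keeps the overnight cost down; the trade-off is that the first burst of the day has to wait for additional replicas to start.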

Exam trap

The trap here is that candidates often choose a single-node upgrade (Option A), thinking more power solves latency, and overlook the need for horizontal scaling to handle spiky traffic. Option B seems cost-effective but ignores the real-time requirement, and Option D appears serverless but lacks GPU support and suffers from cold starts.

How to eliminate wrong answers

Option A is wrong because upgrading to a more powerful VM (n1-highmem-16 with a better GPU) does not solve the spiky traffic pattern; it increases cost during low-load periods and still risks latency spikes during bursts due to a single-node bottleneck. Option B is wrong because batch prediction is asynchronous and not suitable for real-time diagnostic tools requiring sub-200 ms responses; storing results in a database for retrieval introduces additional latency and cannot meet the strict p95 latency requirement. Option D is wrong because Cloud Functions do not natively support GPU acceleration, which this model requires, and loading a 500 MB model on each cold start would push response times well beyond the 200 ms target.
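
The single-node bottleneck can be checked with back-of-the-envelope arithmetic. The sketch below uses the peak rate and latency target from the question; the per-replica throughput figures are hypothetical and would have to come from a load test.

```python
import math


def in_flight_requests(rps, latency_seconds):
    """Little's law: average concurrency = arrival rate x time in system."""
    return rps * latency_seconds


def replicas_needed(peak_rps, per_replica_rps):
    """Smallest replica count whose combined throughput covers the peak."""
    return math.ceil(peak_rps / per_replica_rps)


peak_rps = 1000           # from the question
target_latency_s = 0.200  # from the question

concurrency = in_flight_requests(peak_rps, target_latency_s)
fast_nodes = replicas_needed(peak_rps, per_replica_rps=250)  # hypothetical throughput
slow_nodes = replicas_needed(peak_rps, per_replica_rps=180)  # hypothetical throughput
```

If one GPU node sustains about 250 requests per second, four replicas cover the peak and fit under the maximum of 5. At about 180 per node the peak needs six, so the cap itself would have to be raised.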

37
MCQ · medium

A model deployed on Vertex AI Prediction is returning high latency for real-time requests. The model is a small TensorFlow model. Which troubleshooting step should the team take first?

A. Retrain the model with a larger batch size
B. Check if the machine type is too small and enable autoscaling
C. Use a custom container with optimized runtime
D. Enable Cloud Armor to reduce traffic
Answer: B

High latency on a small model usually points to under-provisioned serving infrastructure, so check the machine type and scaling settings first.

Why this answer

Option B is correct because high latency for real-time predictions from a small TensorFlow model often indicates that the serving infrastructure is under-provisioned. Checking the machine type and enabling autoscaling directly addresses whether the instance is too small to handle the request volume, which is the most common first step in diagnosing latency issues on Vertex AI Prediction.

Exam trap

Google Cloud often tests the principle of 'start with the simplest infrastructure fix before optimizing the model or container,' so candidates mistakenly jump to retraining or custom containers without first checking if the instance type and scaling settings are appropriate.

How to eliminate wrong answers

Option A is wrong because retraining with a larger batch size affects training throughput, not inference latency for real-time requests; inference batch size is set at serving time, not during training. Option C is wrong because using a custom container with an optimized runtime is a more advanced optimization step that should be considered only after verifying that the base infrastructure (machine type and scaling) is adequate. Option D is wrong because Cloud Armor is a security service for DDoS protection and traffic filtering, not a tool for reducing latency caused by insufficient compute resources.
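
Before resizing anything, it can help to quantify the problem from request logs. The following is a small sketch of a nearest-rank percentile calculation over hypothetical latency samples. It shows how a single slow request dominates the tail while the median looks healthy.

```python
import math


def percentile_nearest_rank(values, pct):
    """Smallest sample with at least pct percent of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Hypothetical per-request latencies in milliseconds.
latencies_ms = [120, 130, 125, 140, 135, 128, 132, 138, 900, 126]
p50 = percentile_nearest_rank(latencies_ms, 50)
p95 = percentile_nearest_rank(latencies_ms, 95)
```

A large gap between the median and the tail, as in this sample, is consistent with requests queuing on an undersized or non-scaling deployment.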

38
Multi-Select · hard

Which TWO actions are recommended to detect and mitigate data drift in a production ML system on Vertex AI?

Select 2 answers
A. Deploy multiple models and use an ensemble to average predictions
B. Manually review model predictions daily
C. Automatically retrain the model when drift exceeds thresholds
D. Set up Vertex AI Model Monitoring to alert on feature distribution changes
E. Monitor prediction errors and flag when confidence is low
Answers: C, D

Model Monitoring detects drift; automated retraining mitigates it.

Why this answer

Option C is correct because a retraining pipeline (for example, one built on Vertex AI Pipelines) can be triggered automatically when data drift exceeds a predefined threshold, so the model adapts to distribution changes without manual intervention. Option D is correct because Vertex AI Model Monitoring continuously tracks feature distribution statistics (using Jensen-Shannon divergence for numerical features and L-infinity distance for categorical features) and sends alerts when drift is detected, enabling proactive mitigation.

Exam trap

Google Cloud often tests the distinction between drift detection (monitoring input distributions) and model performance monitoring (tracking prediction errors or confidence), leading candidates to confuse E with a valid drift mitigation technique.
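
The detect-then-mitigate loop can be sketched in a few lines. This example computes the L-infinity distance between two categorical distributions and compares it to a threshold. The feature values, the 0.3 threshold, and the `should_retrain` helper are illustrative assumptions rather than the service's actual implementation.

```python
from collections import Counter


def category_frequencies(values):
    """Relative frequency of each category in a non-empty list."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(baseline, current):
    """Largest absolute difference in frequency across all categories."""
    p = category_frequencies(baseline)
    q = category_frequencies(current)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


def should_retrain(distance, threshold=0.3):
    """Hypothetical trigger: retrain when drift exceeds the threshold."""
    return distance > threshold


# Hypothetical 'channel' feature at training time and in production.
baseline = ["web"] * 70 + ["app"] * 30
current = ["web"] * 30 + ["app"] * 60 + ["kiosk"] * 10
drift = l_infinity_distance(baseline, current)
retrain = should_retrain(drift)
```

Note that this operates only on input features; it needs no ground-truth labels, which is what separates drift detection from the error monitoring described in option E.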

39
MCQ · medium

A data scientist deployed a classification model on Vertex AI Endpoints. After a week, the model's accuracy drops significantly from 92% to 78%. The data scientist suspects training-serving skew. What is the first step to confirm this?

A. Look for data leakage in the training pipeline
B. Compare feature distributions between training and serving data using Vertex AI Model Monitoring
C. Examine the feature importance of the model
D. Check the prediction confidence over time
Answer: B

Model Monitoring can detect skew by comparing distributions.

Why this answer

Option B is correct because Vertex AI Model Monitoring provides a built-in capability to automatically detect training-serving skew by comparing feature distributions between the training data and the live serving data. This is the most direct and efficient first step to confirm whether the accuracy drop is due to a shift in the input data distribution, which is the hallmark of training-serving skew. The data scientist can set up monitoring jobs that compute statistical distance metrics (e.g., Jensen-Shannon divergence) and alert when significant deviations occur.

Exam trap

Google Cloud often tests the distinction between diagnosing the root cause of a performance drop versus investigating a specific type of issue; the trap here is that candidates may jump to data leakage (Option A) because it sounds similar to skew, but leakage is a pre-deployment problem, not a post-deployment distribution shift.

How to eliminate wrong answers

Option A is wrong because looking for data leakage in the training pipeline addresses a different problem—where the model inadvertently uses information from the future or target during training—not a post-deployment distribution shift between training and serving data. Option C is wrong because examining feature importance helps understand which features drive predictions but does not directly compare training and serving distributions to confirm skew. Option D is wrong because checking prediction confidence over time can indicate model uncertainty but does not isolate whether the cause is a change in input data distribution versus model drift or other issues.
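
For a numerical feature, the comparison in Option B amounts to bucketing both datasets and measuring the distance between the histograms. Below is a self-contained sketch using Jensen-Shannon divergence with base-2 logarithms, so the result falls between 0 and 1. The ages and bucket edges are hypothetical.

```python
import math


def histogram(values, edges):
    """Normalized bin frequencies; out-of-range values fall into the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """KL(p || q) in bits, skipping bins where p is zero."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Symmetric divergence against the midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


edges = [0, 30, 60, 90]                           # hypothetical age buckets
training_ages = [22, 25, 31, 38, 41, 45, 52, 58]  # hypothetical training sample
serving_ages = [28, 35, 48, 55, 62, 66, 71, 74]   # hypothetical serving sample

skew = js_divergence(histogram(training_ages, edges), histogram(serving_ages, edges))
```

A value near 0 means the serving data looks like the training data; a value that stays well above 0 for a feature supports the skew hypothesis and points to which feature to investigate.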

40
MCQ · easy

A retail company wants to predict customer churn using their transaction history and customer demographics. They have limited ML expertise and want to use a managed service on Google Cloud. Which service should they use?

A. AI Platform Notebooks
B. Vertex AI AutoML (Tables)
C. Cloud TPU
D. BigQuery ML
Answer: B

AutoML Tables provides end-to-end automated model building for tabular data, ideal for limited ML expertise.

Why this answer

Vertex AI AutoML (Tables) is the correct choice because it is a managed service specifically designed for tabular data and requires little ML expertise. It automates feature preprocessing, model training, and hyperparameter tuning for classification tasks like churn prediction, and the resulting model can be deployed to an endpoint, directly handling transaction history and demographic features.

Exam trap

The trap here is that candidates may confuse BigQuery ML as a fully managed no-code solution, but it still requires SQL proficiency and manual model selection, whereas Vertex AI AutoML is the true zero-code managed service for tabular data.

How to eliminate wrong answers

Option A is wrong because AI Platform Notebooks provides a Jupyter-based development environment for custom ML coding, not a managed no-code solution, and requires ML expertise to build and train models. Option C is wrong because Cloud TPU is a hardware accelerator for training large deep learning models (e.g., NLP or vision), not a managed service for tabular churn prediction, and is overkill for this use case. Option D is wrong because BigQuery ML enables SQL-based model creation directly in BigQuery, but it requires some ML knowledge to write queries and tune models, and is less automated than AutoML for users with limited ML expertise.
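
For teams that prefer a script to the console, the same AutoML workflow can be driven from the Vertex AI Python SDK. This is a hedged sketch: the display names, the `churned` target column, and the training budget are hypothetical, and `bq_source` is expected in the `bq://project.dataset.table` form.

```python
TRAINING_CONFIG = {
    "display_name": "churn-automl",  # hypothetical job name
    "optimization_prediction_type": "classification",
}

RUN_CONFIG = {
    "target_column": "churned",       # hypothetical label column
    "budget_milli_node_hours": 1000,  # 1 node hour of training budget
}


def train_churn_model(project, location, bq_source):
    """Create a tabular dataset from BigQuery and train an AutoML classifier."""
    # Imported lazily so the configuration can be inspected without the SDK.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    dataset = aiplatform.TabularDataset.create(
        display_name="churn-dataset",  # hypothetical dataset name
        bq_source=bq_source,
    )
    job = aiplatform.AutoMLTabularTrainingJob(**TRAINING_CONFIG)
    return job.run(dataset=dataset, **RUN_CONFIG)
```

Nothing in the sketch selects an algorithm or tunes hyperparameters; the caller only names the target column and a budget, which is why AutoML suits a team with limited ML expertise.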
