Knowledge + Practice

Google Professional Data Engineer (PDE) — Questions 76–150

499 questions total · 7 pages · All types, answers revealed



76

MCQ · medium

A company uses GKE to run microservices. They want to ensure the application restarts automatically if it becomes unresponsive. Which probes should they configure in their pod spec?

A. Startup probes only.

B. Liveness probes only.

C. Readiness probes only.

D. Both readiness and liveness probes.

Answer: D

Using both ensures traffic is only sent to ready pods and unresponsive pods are restarted, providing complete health management.

Why this answer

Liveness probes determine when to restart a container; readiness probes determine when a pod is ready to serve traffic. Both are needed for full resilience.
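As a rough illustration of the two probes described above, the sketch below builds the probe section of a container spec as a Python dictionary and prints it as JSON. The field names follow the Kubernetes pod spec; the paths `/healthz` and `/ready`, the port 8080, and the timing values are assumptions chosen for the example, not values from the question.

```python
import json


def health_probes(liveness_path="/healthz", readiness_path="/ready", port=8080):
    """Return the probe fields for one container in a pod spec.

    The paths, port, and timings are hypothetical defaults for illustration.
    """
    return {
        # Liveness: the kubelet restarts the container after repeated failures.
        "livenessProbe": {
            "httpGet": {"path": liveness_path, "port": port},
            "initialDelaySeconds": 10,
            "periodSeconds": 10,
            "failureThreshold": 3,
        },
        # Readiness: the pod is removed from Service endpoints while failing.
        "readinessProbe": {
            "httpGet": {"path": readiness_path, "port": port},
            "periodSeconds": 5,
            "failureThreshold": 3,
        },
    }


probes = health_probes()
print(json.dumps(probes, indent=2))
```

The printed structure can be merged into a container entry of a Deployment manifest once the endpoints match what the application actually exposes.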


77

MCQ · medium

You run batch predictions using Vertex AI Batch Prediction on a tabular dataset. The job processes 1 million rows and takes 6 hours to complete. You need to reduce the processing time to under 2 hours without increasing cost significantly. What should you do?

A. Switch to a machine type with more CPU cores and vCPUs

B. Increase the machine count (number of worker replicas) in the batch prediction job

C. Downsample the dataset to 500k rows

D. Use Online prediction instead of batch

Answer: B

More workers process the rows in parallel, so runtime drops roughly in proportion to the worker count while total machine-hours, and therefore cost, stay about the same.

Why this answer

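Correct: B. Increasing the number of worker replicas speeds up batch jobs because the rows are split across more machines. Option A is wrong because a larger machine type may help but is usually less effective than parallelization.

Option D is wrong because online prediction is meant for low-latency individual requests, not for scoring a large dataset in bulk. Option C is wrong because downsampling discards rows that still need predictions, so reducing the data size is not an option.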
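The trade-off behind this answer can be checked with a little arithmetic. The sketch below assumes near-linear scaling and a baseline of one replica; both are assumptions made for illustration, since the question does not state the original replica count, and real jobs carry some fixed startup overhead.

```python
def estimated_hours(baseline_hours, baseline_replicas, new_replicas):
    """Estimate runtime after changing the replica count, assuming linear scaling."""
    return baseline_hours * baseline_replicas / new_replicas


def replica_hours(hours, replicas):
    """Total machine time consumed, a rough proxy for cost."""
    return hours * replicas


baseline_hours, baseline_replicas = 6.0, 1  # baseline replica count is assumed
new_replicas = 4

new_hours = estimated_hours(baseline_hours, baseline_replicas, new_replicas)
print(new_hours)  # 1.5
print(replica_hours(baseline_hours, baseline_replicas))  # 6.0
print(replica_hours(new_hours, new_replicas))  # 6.0
```

Under these assumptions, four replicas finish in 1.5 hours while consuming the same 6 replica-hours as the original run.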


78

Multi-Select · easy

A company is designing a data processing system that must handle both batch and streaming workloads with unified pipeline code. Which two Google Cloud services are most suitable for implementing a unified batch and streaming pipeline? (Choose TWO.)

Select 2 answers

A. Cloud Data Fusion

B. BigQuery

C. Apache Beam SDK

D. Cloud Dataflow

E. Cloud Dataproc

Answers: C, D

Beam is the unified model; Dataflow is one runner.

Why this answer

Apache Beam SDK (C) provides a unified programming model that allows developers to write a single pipeline that can execute in both batch and streaming modes without code changes. It abstracts the underlying execution engine, making it the correct choice for unified pipeline code.

Exam trap

Google Cloud often tests the misconception that Cloud Data Fusion or Cloud Dataproc can achieve unified batch and streaming with a single codebase, but only Apache Beam SDK combined with Cloud Dataflow provides the native programming model and execution engine for this requirement.


79

MCQ · easy

A data pipeline using Dataflow processes streaming data. Late-arriving events are currently being dropped. How should the team modify the pipeline to ensure late data is processed correctly?

A. Use side inputs to join late data with the main stream.

B. Use streaming inserts into BigQuery and ignore late data.

C. Configure triggers with allowed lateness and accumulation of late firings.

D. Increase the window duration to cover late data.

Answer: C

This is the standard pattern for handling late data in Dataflow: set allowed lateness and use triggers to emit on late arrival.

Why this answer

Dataflow allows setting allowed lateness on windows; trigger configuration can handle late data by emitting updates.
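To make the role of allowed lateness concrete, here is a small plain-Python simulation of how a fixed window treats an event, depending on where the watermark is when the event arrives. It is a simplified model and does not use the Beam SDK; the function names and the second-based timestamps are assumptions for illustration.

```python
def window_end(event_time, window_size):
    """End timestamp of the fixed window that contains event_time."""
    return (event_time // window_size + 1) * window_size


def classify(event_time, watermark, window_size, allowed_lateness):
    """Classify an arriving event relative to its window.

    on_time:       the watermark has not yet passed the end of the window
    late_accepted: the watermark passed the end, but within allowed lateness
    dropped:       the window has expired and its state was discarded
    """
    end = window_end(event_time, window_size)
    if watermark < end:
        return "on_time"
    if watermark < end + allowed_lateness:
        return "late_accepted"
    return "dropped"


# One-minute windows; the event belongs to the window [0, 60).
print(classify(30, watermark=45, window_size=60, allowed_lateness=300))   # on_time
print(classify(30, watermark=200, window_size=60, allowed_lateness=300))  # late_accepted
print(classify(30, watermark=200, window_size=60, allowed_lateness=0))    # dropped
```

With zero allowed lateness the late event is discarded, which matches the symptom in the question; a non-zero value keeps the window open long enough for a late trigger firing to emit an updated result.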


80

MCQ · hard

A financial services company must comply with GDPR "right to be forgotten". They store customer transactions in BigQuery partitioned by date. When a user requests deletion, all their data must be removed within 48 hours. The deletion requests are received via a Pub/Sub topic. What is the most scalable and cost-effective approach?

A. Use Cloud Functions to execute a BigQuery DELETE statement on each request

B. Use Cloud DLP to redact the user's data in Cloud Storage

C. Use a Dataflow pipeline that reads the deletion IDs from Pub/Sub, joins with the transactions table using a side input, writes the filtered data to a new table, and then swaps the new table in for the original

D. Use BigQuery table snapshots and restore after deletion

Answer: C

This scales well because deletions are applied in one batched rewrite rather than one DELETE per request; the side input contains the IDs to delete.

Why this answer

Option C is correct because it uses Dataflow to process deletion requests from Pub/Sub, join them with the BigQuery transactions table via a side input, and write a filtered copy to a new table. This approach is scalable (handles high-throughput streaming deletions) and cost-effective (avoids expensive DELETE mutations on BigQuery, which consume slot resources and can be slow for large tables). Swapping the new table for the old one completes the deletion efficiently within the 48-hour SLA.

Exam trap

Google Cloud often tests the misconception that BigQuery DELETE statements are the simplest way to remove data, but the trap here is that DELETE operations on large partitioned tables are expensive and not scalable for streaming deletion requests, whereas a Dataflow-based rewrite is both cost-effective and meets the 48-hour SLA.

How to eliminate wrong answers

Option A is wrong because executing a BigQuery DELETE statement per request is not scalable for high-volume deletion requests; each DELETE incurs slot consumption and can be slow on large partitioned tables, potentially exceeding the 48-hour SLA. Option B is wrong because Cloud DLP is designed for data masking and redaction in Cloud Storage, not for deleting rows from BigQuery tables; it does not address the requirement to remove customer transactions from BigQuery. Option D is wrong because BigQuery table snapshots are read-only copies used for point-in-time recovery, not for deleting specific user data; restoring a snapshot would revert the table to a previous state, not selectively remove a user's records.
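The filtering step at the heart of option C can be sketched in plain Python, with a set standing in for the side input of deletion IDs. This is only an illustration of the join-and-filter logic, not a Dataflow pipeline; the field name `customer_id` and the sample rows are hypothetical.

```python
def rewrite_without_deleted(rows, deleted_ids):
    """Return the rows that do not belong to any customer in deleted_ids.

    deleted_ids plays the role of the side input: a small lookup set
    that every worker can consult while scanning the large table.
    """
    ids = frozenset(deleted_ids)
    return [row for row in rows if row["customer_id"] not in ids]


transactions = [
    {"customer_id": "c1", "amount": 10},
    {"customer_id": "c2", "amount": 25},
    {"customer_id": "c1", "amount": 7},
    {"customer_id": "c3", "amount": 99},
]
pending_deletions = {"c1"}

kept = rewrite_without_deleted(transactions, pending_deletions)
print(kept)
```

Batching many deletion IDs into one pass is what keeps the cost roughly constant as the number of requests grows.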


81

MCQ · hard

A gaming company uses Pub/Sub to ingest player events and Dataflow for real-time analytics. They notice that the Pub/Sub subscription backlog is growing despite the Dataflow pipeline running continuously. The pipeline has a 1-hour window for aggregations. What is the most effective way to reduce the backlog?

A. Increase the Dataflow pipeline's worker count via autoscaling.

B. Use a push subscription instead of pull.

C. Decrease the window duration to 10 minutes.

D. Enable Pub/Sub topic retention.

Answer: A

More workers increase parallelism and processing rate, reducing backlog.

Why this answer

Increasing the Dataflow pipeline's worker count via autoscaling directly addresses the backlog by adding more parallel processing capacity to consume messages from the Pub/Sub subscription faster. Since the pipeline is continuously running but the backlog grows, the bottleneck is processing throughput, not pipeline availability. Autoscaling allows Dataflow to dynamically allocate more workers based on the backlog size, matching consumption rate to the incoming message rate.

Exam trap

Google Cloud often tests the misconception that changing window duration or subscription type can fix a throughput bottleneck, when the real solution is scaling compute resources to match the consumption rate.

How to eliminate wrong answers

Option B is wrong because switching from pull to push subscription does not inherently increase throughput; push subscriptions have their own limitations (e.g., endpoint capacity, HTTP timeouts) and the backlog growth is a processing capacity issue, not a delivery mechanism issue. Option C is wrong because decreasing the window duration to 10 minutes does not reduce the backlog; it changes the aggregation granularity but does not affect the rate at which messages are consumed from the subscription. Option D is wrong because enabling Pub/Sub topic retention controls how long unacknowledged messages are kept, not the rate of consumption; it would only extend the time messages remain available, not reduce the backlog.
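The throughput argument can be expressed as simple arithmetic: the backlog grows whenever messages are published faster than the workers consume them. The sketch below uses made-up rates (50,000 messages per second published, 4,000 per worker) purely for illustration and assumes throughput scales linearly with worker count.

```python
import math


def backlog_growth_per_sec(publish_rate, workers, per_worker_rate):
    """Messages added to the backlog each second (negative means it drains)."""
    return publish_rate - workers * per_worker_rate


def workers_to_keep_up(publish_rate, per_worker_rate):
    """Smallest worker count whose combined rate matches the publish rate."""
    return math.ceil(publish_rate / per_worker_rate)


publish_rate = 50_000    # hypothetical messages per second
per_worker_rate = 4_000  # hypothetical messages per second per worker

print(backlog_growth_per_sec(publish_rate, 10, per_worker_rate))  # 10000
needed = workers_to_keep_up(publish_rate, per_worker_rate)
print(needed)  # 13
print(backlog_growth_per_sec(publish_rate, needed, per_worker_rate))  # -2000
```

Changing the window size or the subscription type leaves both rates unchanged, which is why neither reduces the backlog.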


82

MCQ · easy

A startup is deploying a PyTorch model on Google Cloud. They need to serve predictions for a mobile app with bursty traffic. Which service is most cost-effective?

A. Vertex AI Endpoints with autoscaling and a minimum of 0 replicas to scale down to zero

B. Vertex AI Endpoints with a minimum number of replicas

C. App Engine with manual scaling

D. Cloud Run with CPU always allocated

Answer: A

Scaling to zero minimizes cost when idle, ideal for bursty traffic.

Why this answer

Option A is correct because Vertex AI Endpoints with autoscaling and minimum replicas set to 0 can scale down to zero when idle, reducing cost. Option B keeps a minimum number of replicas running, which incurs cost even when idle. Options C and D may not scale to zero or lack ML optimization.


83

MCQ · hard

A company wants to implement a near-real-time lake architecture using Cloud Storage and BigQuery. They need to enable queries on data within 5 minutes of arrival. Which approach meets the requirement with minimal operational overhead?

A. Use BigQuery Omni with external tables pointing to Cloud Storage

B. Set up a Cloud Function to trigger BigQuery load jobs every 5 minutes

C. Use Cloud Storage FUSE to mount buckets and query with Spark on Dataproc

D. Stream data into a BigQuery table via streaming inserts, then use a scheduled query to merge into the main table

Answer: A

BigQuery Omni allows querying data directly from Cloud Storage with minimal latency.

Why this answer

Option A is correct because BigQuery Omni with external tables can query data directly from Cloud Storage without loading. Option C is wrong because Cloud Storage FUSE adds a filesystem layer that may not be fast enough. Option D is wrong because streaming inserts into a separate table and then merging adds complexity and latency.

Option B is wrong because scheduled batch loads have a minimum 10-minute interval, not meeting the 5-minute requirement.


84

MCQ · hard

A data engineer configures the above lifecycle rule on a Cloud Storage bucket that stores daily log files. After 60 days, they notice that files older than 30 days have been transitioned to Nearline, but files older than 90 days are still present. What is the most likely cause?

A. The delete rule is missing `isLive: true` condition, so it does not apply to live objects.

B. The `age` condition in the delete rule is calculated from the transition date, not creation date.

C. The bucket has object versioning enabled, and the delete rule only applies to non-current versions.

D. The delete rule's condition includes `matchesStorageClass`: `STANDARD`, which does not match the Nearline storage class of transitioned objects.

Answer: D

After the first rule transitions objects to Nearline, they no longer match the `STANDARD` storage class required by the delete rule, so they are not deleted.

Why this answer

Option D is correct because the lifecycle delete rule includes a `matchesStorageClass` condition set to `STANDARD`. Once objects are transitioned to Nearline (which is a different storage class), they no longer match the `STANDARD` condition, so the delete rule does not apply to them. As a result, files older than 90 days that were moved to Nearline remain in the bucket.

Exam trap

Google Cloud often tests the interaction between lifecycle rules and storage class transitions, specifically that a `matchesStorageClass` condition filters objects based on their current storage class, not the original class at creation.

How to eliminate wrong answers

Option A is wrong because the delete rule does not need an `isLive: true` condition to apply to live objects; a rule without an `isLive` condition applies to live objects as well, so omitting it does not prevent deletion. Option B is wrong because the `age` condition in lifecycle rules is always calculated from the object's creation date, not from the transition date. Option C is wrong because enabling object versioning does not by itself restrict a delete rule to non-current versions; that happens only if the rule explicitly targets them. The scenario describes current versions that are still present, and versioning does not inherently prevent deletion of current versions.
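The exhibit itself is not reproduced here, so the sketch below uses a hypothetical pair of rules, written in the shape of the Cloud Storage lifecycle JSON, that would produce the behavior described. The small matcher is a simplified model that only understands the `age` and `matchesStorageClass` conditions; the object fields `age_days` and `storage_class` are names invented for this example.

```python
# Hypothetical reconstruction of the kind of rules the question describes.
RULES = [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
     "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]}},
    {"action": {"type": "Delete"},
     "condition": {"age": 90, "matchesStorageClass": ["STANDARD"]}},
]


def rule_matches(rule, obj):
    """Return True if every condition in the rule holds for the object."""
    cond = rule["condition"]
    if obj["age_days"] < cond.get("age", 0):
        return False
    classes = cond.get("matchesStorageClass")
    # The check uses the object's current class, not its class at creation.
    if classes is not None and obj["storage_class"] not in classes:
        return False
    return True


old_log = {"age_days": 95, "storage_class": "NEARLINE"}  # moved at day 30
delete_rule = RULES[1]
print(rule_matches(delete_rule, old_log))  # False

fixed_rule = {"action": {"type": "Delete"}, "condition": {"age": 90}}
print(rule_matches(fixed_rule, old_log))  # True
```

Dropping the storage class condition from the delete rule, or adding `NEARLINE` to its list, lets the rule reach objects that were transitioned earlier.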


85

Multi-Select · easy

A company uses Dataproc for transient clusters. Which TWO actions can reduce costs?

Select 2 answers

A. Increase master node size

B. Set cluster autoscaling to minimize idle resources

C. Use standard VMs for all nodes

D. Use persistent clusters to avoid creation overhead

E. Use preemptible VMs for worker nodes

Answers: B, E

Autoscaling reduces resource waste, lowering cost.

Why this answer

Option B is correct because Dataproc cluster autoscaling automatically adjusts the number of worker nodes based on the YARN memory and CPU utilization metrics. By scaling down during idle periods, you avoid paying for unused compute capacity, directly reducing costs for transient clusters that have variable workloads.

Exam trap

The trap here is that candidates often think 'persistent clusters' are cheaper because they avoid re-creation overhead, but they overlook the continuous compute cost of idle persistent clusters versus the pay-per-use model of transient clusters.


86

MCQ · medium

Refer to the exhibit. This log entry was generated by Vertex AI Model Monitoring for a production model. What should the data engineer do to address this issue?

A. Increase the drift threshold to 0.9 to suppress alerts

B. Retrain the model with more recent data

C. Deploy a new model version trained on the original dataset

D. Disable monitoring for the 'age' feature

Answer: B

Addresses the root cause by adapting to data shift.

Why this answer

Option B is correct because Vertex AI Model Monitoring detected a drift in the 'age' feature, indicating that the production data distribution has shifted from the training data. Retraining the model with more recent data aligns the model with the current data distribution, mitigating the drift and maintaining prediction accuracy. This is the standard remediation for model drift in production ML systems.

Exam trap

Google Cloud often tests the misconception that adjusting thresholds or disabling monitoring is a valid fix for drift, when the correct action is always to retrain the model with current data.

How to eliminate wrong answers

Option A is wrong because increasing the drift threshold to 0.9 would suppress alerts without addressing the underlying data drift, allowing the model to continue making inaccurate predictions. Option C is wrong because deploying a new model version trained on the original dataset would not resolve the drift; it would reuse the same outdated training data that no longer represents the current production distribution. Option D is wrong because disabling monitoring for the 'age' feature would hide the drift issue rather than fixing it, leaving the model vulnerable to degraded performance due to a drifted feature.
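To give a feel for what a drift score measures, the sketch below computes the Jensen-Shannon divergence between two histograms of a feature such as 'age'. The exhibit is not shown in this text, so the histograms and the 0.3 threshold are invented for illustration, and this is not presented as the exact computation Vertex AI performs.

```python
import math


def _kl(p, q):
    """Kullback-Leibler divergence in bits, skipping empty bins of p."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions (0 to 1)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl(p, m) + 0.5 * _kl(q, m)


# Hypothetical share of records per age bucket: <30, 30-50, >50.
training_age = [0.5, 0.3, 0.2]
serving_age = [0.1, 0.3, 0.6]

score = js_divergence(training_age, serving_age)
threshold = 0.3  # hypothetical alerting threshold
print(round(score, 3), score > threshold)
```

Raising the threshold only changes the comparison on the last line; the distance between the two distributions, and the model's exposure to it, stays the same until the model is retrained.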


87

MCQ · medium

A data pipeline uses Cloud Composer (Airflow) to orchestrate Dataproc jobs. Each job submits a Spark application that reads from BigQuery and writes to Cloud Storage. The pipeline runs nightly and takes 6 hours. Management wants to reduce costs. Which approach is most effective?

A. Use preemptible VMs for the Dataproc cluster

B. Switch to Cloud Dataproc billing per second instead of per minute

C. Increase the memory of the driver node to improve performance

D. Upgrade the Cloud Storage class from Standard to Nearline

Answer: A

Preemptible VMs are cheaper and suitable for batch jobs.

Why this answer

Preemptible VMs are significantly cheaper (up to 80% discount) than standard VMs and are ideal for fault-tolerant, batch workloads like nightly Dataproc jobs. Since the pipeline runs nightly and takes 6 hours, it can tolerate the occasional preemption of worker nodes by using Spark's built-in resilience (e.g., task retries). This directly reduces compute cost without sacrificing completion, assuming the cluster is configured with enough preemptible workers to handle the workload.

Exam trap

Google Cloud often tests the misconception that 'upgrading' storage class or changing billing granularity saves money, when in fact the correct answer involves leveraging cheaper compute resources (preemptible VMs) that are designed for fault-tolerant batch jobs.

How to eliminate wrong answers

Option B is wrong because Dataproc already bills per second after a 1-minute minimum, so switching to per-second billing is not a change that reduces costs further. Option C is wrong because increasing driver memory does not reduce costs; it may actually increase costs by requiring a larger, more expensive VM, and performance gains are unlikely if the bottleneck is not driver memory. Option D is wrong because upgrading from Standard to Nearline storage increases cost (Nearline has higher retrieval and minimum storage duration fees) and is intended for infrequently accessed data, not for nightly write workloads where data is read soon after writing.


88

MCQ · hard

A financial services company uses a custom container on Vertex AI Prediction to serve a fraud detection model. The container runs a Flask app that loads a large feature engineering library (~2 GB) at startup. The model is updated weekly. For the past two weeks, the new model version has been failing health checks and showing 'Container failed to start' errors in the logs. The previous versions worked fine. You inspect the container image and confirm it is built correctly using Cloud Build. The only change in the latest build is an updated version of the feature engineering library. What is the most likely cause and how should you fix it?

A. The Cloud Build step that pushes the image is misconfigured. Rebuild using a different approach.

B. The Vertex AI endpoint machine type is too small for the new container. Upgrade to a larger machine type.

C. The new library version increased memory consumption during startup, exceeding the health check timeout. Increase the startup probe initial delay.

D. The new library has a dependency conflict that causes the Flask app to crash. Roll back to the previous library version.

Answer: C

A larger library could cause longer initialization; adjusting the health check timing accommodates that.


89

MCQ · easy

A team has set up a push subscription to an HTTPS endpoint. They notice that messages are not being acknowledged and are resent every 10 seconds. What is the most likely issue?

A. The push endpoint is returning HTTP 200 but taking too long to process

B. The push endpoint is returning HTTP 500

C. The push endpoint is returning HTTP 200 with 'ack' in the body

D. The push endpoint is returning HTTP 400

Answer: B

Any non-200 response (e.g., 500) causes Pub/Sub to retry; 500 indicates a server error.

Why this answer

In Google Cloud Pub/Sub push subscriptions, the subscriber must acknowledge messages by returning an HTTP 200 status code. If the endpoint returns HTTP 500, Pub/Sub interprets this as a failure and will retry delivery with exponential backoff, but the default minimum retry interval is 10 seconds. This matches the observed behavior of messages being resent every 10 seconds without acknowledgment.

Exam trap

Google Cloud often tests the misconception that the response body or processing time affects acknowledgment, when in fact only the HTTP status code determines whether a message is acknowledged or retried.

How to eliminate wrong answers

Option A is wrong because returning HTTP 200, even with slow processing, is treated as a successful acknowledgment; Pub/Sub would not resend the message. Option C is wrong because returning HTTP 200 with 'ack' in the body is still a valid acknowledgment (the body content is irrelevant; only the status code matters). Option D is wrong because HTTP 400 indicates a client error, which Pub/Sub treats as a permanent failure and will not retry indefinitely with a 10-second interval.
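The point that only the status code matters can be illustrated with a framework-free sketch of a push handler. The function names and the `process` callback are hypothetical; in a real service this logic would sit inside whatever web framework serves the HTTPS endpoint.

```python
def handle_push(message, process):
    """Run the processing callback and return the HTTP status to send back.

    Returning 200 acknowledges the message; returning 500 signals failure,
    so Pub/Sub will redeliver it according to the retry policy.
    """
    try:
        process(message)
    except Exception:
        return 500
    return 200


def healthy_processor(message):
    """Pretend to handle the message successfully."""
    return len(message)


def broken_processor(message):
    """Simulate a bug that makes every request fail."""
    raise RuntimeError("downstream database unavailable")


print(handle_push("event-1", healthy_processor))  # 200
print(handle_push("event-1", broken_processor))   # 500
```

An endpoint stuck in the second case reproduces the symptom in the question: every delivery fails, nothing is acknowledged, and the same messages keep coming back.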


90

MCQ · hard

A company runs a Dataproc cluster with 10 worker nodes for a Spark streaming job that processes data from Pub/Sub (via Pub/Sub Lite) and writes to Cloud Storage. They observe that the job is producing many small files in Cloud Storage, leading to high costs and performance issues in downstream batch pipelines. The team wants to consolidate output files while maintaining low latency. What is the best solution?

A. Run a separate compaction job that periodically merges small files into larger ones

B. Use windowed streaming with a longer window duration and Spark's file size configuration

C. Reduce the number of workers to force more data per task

D. Switch from Dataproc to Dataflow, which has built-in file sharding optimization

Answer: B

Allows batching data to create larger files with acceptable latency.

Why this answer

Option B is correct because using a longer window duration in Spark Streaming allows more data to accumulate before writing, and combining this with Spark's file size configuration (e.g., `spark.sql.files.maxRecordsPerFile` or `spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2`) ensures that output files are consolidated into larger sizes. This reduces the number of small files in Cloud Storage while maintaining low latency by avoiding an extra compaction job or reducing parallelism.

Exam trap

The trap here is that candidates often choose a separate compaction job (Option A) because it seems like a straightforward fix, but they overlook the latency penalty and the fact that Spark's native streaming configurations can achieve the same goal without extra overhead.

How to eliminate wrong answers

Option A is wrong because running a separate compaction job introduces additional latency and resource overhead, which contradicts the requirement to maintain low latency; it also adds complexity and potential data consistency issues. Option C is wrong because reducing the number of workers decreases parallelism, which can increase processing latency and may not guarantee larger files if the data volume per task remains small due to Spark's default partitioning. Option D is wrong because switching to Dataflow does not inherently solve the small files problem; Dataflow's built-in file sharding optimization (e.g., via `FileIO.write()` with `withNumShards`) still requires explicit configuration, and the question specifically asks for a solution within the existing Dataproc/Spark context.


91

Matching · medium

Match each Google Cloud IAM role to its description.


Concepts

Matches

Read access to BigQuery datasets and tables

Permission to run BigQuery jobs

Read access to Cloud Storage objects

Permissions for Dataflow worker nodes

Why these pairings

Predefined IAM roles relevant to data engineering.


92

Multi-Select · hard

A Dataflow batch job frequently fails with 'OutOfMemoryError'. Which THREE are common causes? (Choose 3)

Select 3 answers

A. Too many parallel workers

B. Inefficient GroupByKey with hot keys

C. Too many side inputs

D. Too large window accumulation in streaming mode

E. Using Dataflow Shuffle

Answers: B, C, D

Hot keys cause all values to be processed by a single worker, leading to memory exhaustion.

Why this answer

Option B is correct because a hot key in a GroupByKey operation causes all values for that key to be processed by a single worker, leading to memory exhaustion when the key's associated data exceeds the worker's memory capacity. This is a common cause of OutOfMemoryError in Dataflow batch jobs, as the SDK buffers all values for a key before emitting the result.

Exam trap

Google Cloud often tests the misconception that increasing parallelism (Option A) always reduces memory errors, but in Dataflow, hot keys cause memory issues regardless of worker count because the hot key's data is processed by a single worker.
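The hot key effect is easy to see by counting how many values each key collects. The sketch below also shows key salting, a common mitigation that the explanation above does not mention and that is included here only as an illustration; the key names and shard count are made up.

```python
from collections import Counter


def max_group_size(keys):
    """Largest number of values that a single key would have to buffer."""
    return max(Counter(keys).values())


def salt(keys, shards):
    """Spread each key over several sub-keys by appending a shard suffix."""
    return [f"{key}#{i % shards}" for i, key in enumerate(keys)]


# One key dominates the input, regardless of how many workers exist.
keys = ["hot"] * 1000 + ["cold"] * 10

print(max_group_size(keys))           # 1000
print(max_group_size(salt(keys, 8)))  # 125
```

After grouping on the salted keys, a second and much smaller aggregation would merge the partial results per original key. Adding workers alone leaves the first number unchanged, because all 1000 values still share one key.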


93

MCQ · medium

A data scientist is using Vertex AI to train a model and wants to ensure that the training code and environment are reproducible. Which approach should they take?

A. Use Jupyter notebooks on Vertex AI Workbench

B. Use Vertex AI Training with a pre-built container and specify the exact version of the framework

C. Use custom containers with fixed tags

D. Use Cloud Build to train the model

Answer: B

Pre-built containers with version pinning ensure consistent environment and code execution.

Why this answer

Option B is correct because pre-built containers with specific versions ensure the exact same environment across runs. Option C, custom containers with fixed tags, is good but less standard. Options A and D are not best practices for reproducibility.


94

MCQ · medium

Refer to the exhibit. A Dataflow pipeline is failing intermittently with the shown error. Which step should the team take to ensure data quality and prevent such errors?

A. Increase the number of workers to process the data faster.

B. Add a monitoring alert on the 'system_lag' metric.

C. Use a strongly typed schema for the PCollection and let Beam automatically reject malformed data.

D. Modify the pipeline to handle parsing failures by sending invalid records to a dead letter queue.

Answer: D

A dead letter queue isolates bad data for later inspection without failing the pipeline.

Why this answer

Option D is correct because the error indicates that the pipeline is failing due to malformed or unparseable data. By sending invalid records to a dead letter queue (DLQ), the pipeline can continue processing valid data while capturing and isolating bad records for later analysis or reprocessing. This pattern is a standard data quality practice in Apache Beam and Dataflow, ensuring that transient or corrupt data does not cause pipeline failures.

Exam trap

Google Cloud often tests the distinction between scaling solutions (like increasing workers) and data quality patterns (like dead letter queues), trapping candidates who confuse performance optimization with error handling.

How to eliminate wrong answers

Option A is wrong because increasing the number of workers addresses throughput and latency, not data quality or malformed data errors; it does not prevent parsing failures. Option B is wrong because monitoring the 'system_lag' metric tracks pipeline latency, not data quality issues; it would not prevent or handle malformed records. Option C is wrong because while strongly typed schemas can help catch type mismatches at compile time, they do not automatically reject malformed data at runtime in Beam; the pipeline would still fail if a record cannot be parsed into the schema, and Beam does not have built-in automatic rejection to a dead letter queue without explicit handling.
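The dead letter pattern can be sketched without any pipeline framework: parse each record, keep the good ones, and set the failures aside with enough context to debug them later. This is a plain-Python illustration assuming JSON input; in a Beam pipeline the same split is typically expressed with a DoFn that emits failures to a separate tagged output.

```python
import json


def parse_with_dead_letter(raw_records):
    """Split raw records into parsed values and unparseable leftovers."""
    valid, dead_letter = [], []
    for raw in raw_records:
        try:
            valid.append(json.loads(raw))
        except json.JSONDecodeError as err:
            # Keep the original payload and the reason, instead of failing.
            dead_letter.append({"raw": raw, "error": str(err)})
    return valid, dead_letter


records = ['{"id": 1}', "not json", '{"id": 2}']
valid, dead_letter = parse_with_dead_letter(records)

print(valid)                  # [{'id': 1}, {'id': 2}]
print(dead_letter[0]["raw"])  # not json
```

The valid records continue downstream, while the dead letter list would be written to a separate sink, such as a table or topic, for inspection and replay.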


95

MCQ · medium

A company deploys a machine learning model to Vertex AI for real-time predictions. After deployment, they notice that prediction latency spikes during peak traffic hours. Which approach should they take to reduce latency without sacrificing accuracy?

A. Configure auto-scaling with higher min and max instances

B. Reduce the number of input features

C. Switch from online to batch prediction

D. Use a larger machine type for the model

Answer: A

Auto-scaling handles traffic spikes.

Why this answer

Option A is correct because configuring auto-scaling with higher min and max instances ensures that Vertex AI has sufficient pre-warmed replicas to handle traffic spikes without cold-start latency. This approach maintains model accuracy because it does not alter the model architecture or inference logic, only the infrastructure capacity.

Exam trap

Google Cloud often tests the misconception that reducing features or using batch prediction is the primary way to reduce latency, but the real exam trap is that candidates overlook the need to maintain real-time capability and accuracy, and instead choose a solution that changes the model or prediction mode rather than scaling infrastructure.

How to eliminate wrong answers

Option B is wrong because reducing the number of input features may degrade model accuracy, and the question explicitly requires not sacrificing accuracy. Option C is wrong because switching from online to batch prediction eliminates real-time capability, which contradicts the requirement for real-time predictions. Option D is wrong because a larger machine type raises per-replica capacity and cost but does not add replicas as traffic grows. It leaves accuracy untouched, but it is not the most targeted fix for latency spikes caused by peak traffic.


96

MCQ · easy

A company is designing a real-time clickstream analytics pipeline using Pub/Sub and Dataflow. The pipeline must handle late-arriving data (up to 1 hour) and ensure exactly-once processing. Which Dataflow feature should be configured to handle late data correctly?

A. Configure the trigger with allowed lateness of 1 hour.

B. Use fixed windows with a 1-hour period and enable data discarding.

C. Use session windows with a gap duration of 1 hour.

D. Set the watermark estimate to 1 hour.

Answer: A

Allowed lateness specifies how long after the watermark the system waits for late data before considering the window complete.

Why this answer

Option A is correct because Dataflow's allowed lateness setting controls how long a window's state is kept after the watermark passes the end of the window. With allowed lateness set to 1 hour, data that arrives up to an hour late is still assigned to its window and processed, and Dataflow's exactly-once processing continues to apply to it. This directly addresses the requirement to handle late data up to 1 hour while avoiding duplicates and data loss.

Exam trap

Google Cloud often tests the distinction between allowed lateness (which extends window lifetime for late data) and watermark estimation (which is a system property, not a user-set parameter), leading candidates to incorrectly choose D.

How to eliminate wrong answers

Option B is wrong because fixed windows with a 1-hour period and data discarding would drop any data arriving after the window's end, failing the late-data requirement. Option C is wrong because session windows with a 1-hour gap duration merge events into sessions based on inactivity gaps, not fixed lateness, and do not guarantee handling of data arriving up to 1 hour late for a specific event time. Option D is wrong because the watermark estimate is a system-managed heuristic, not a configurable feature; setting it to 1 hour is not a valid Dataflow configuration and would not correctly handle late data.


97

MCQ · medium

A team needs to orchestrate a complex ETL workflow that includes conditional branching (if new data arrives, run transformation A, else run transformation B), error handling, and coordination across multiple services. Which service should they use?

A. Cloud Functions

B. Cloud Composer (Apache Airflow)

C. Cloud Workflows

D. Cloud Scheduler

Answer: B

Airflow natively supports branching, dependencies, and error handling in Python DAGs, ideal for complex orchestration.

Why this answer

Cloud Composer (Apache Airflow) is the correct choice because it is designed for orchestrating complex, multi-step ETL workflows with conditional branching, error handling, and cross-service coordination. Airflow's directed acyclic graphs (DAGs) natively support conditional logic (e.g., BranchPythonOperator), retries, and dependency management across heterogeneous services, making it ideal for this use case.

Exam trap

Google Cloud often tests the distinction between orchestration (Cloud Composer) and simple scheduling or event-driven compute (Cloud Scheduler, Cloud Functions), leading candidates to pick Cloud Functions for its event-driven nature or Cloud Workflows for its branching capability, without recognizing that Airflow is the only service purpose-built for complex, multi-step ETL orchestration with conditional logic and error handling.

How to eliminate wrong answers

Option A is wrong because Cloud Functions is a serverless compute service for single-purpose, event-driven functions, not a workflow orchestrator; it lacks native support for conditional branching, retry policies, and multi-step coordination across services. Option C is wrong because Cloud Workflows is a low-code orchestration service that can handle branching and error handling, but it is designed for simpler, synchronous workflows and does not provide the same level of scheduling, retry, and monitoring capabilities as Airflow for complex ETL pipelines. Option D is wrong because Cloud Scheduler is a cron job service that triggers tasks on a schedule, but it cannot manage conditional branching, error handling, or multi-service coordination within a single workflow.

98

MCQhard

A company wants to replicate a Cloud SQL (PostgreSQL) database to BigQuery in near real-time for analytics. The volume is about 10GB per day with frequent updates and deletes. They need to capture changes with low latency and ensure exactly-once delivery to BigQuery. Which approach should they use?

A.Export the entire database to Cloud Storage as CSV files every hour and load them into BigQuery using a load job with WRITE_TRUNCATE.

B.Use a Dataflow pipeline with JDBCIO to read from Cloud SQL every minute and write changes to BigQuery using upserts.

C.Use Cloud Data Fusion with a Debezium streaming source to capture CDC from Cloud SQL and a BigQuery sink with exactly-once mode.

D.Use Cloud SQL's change data capture feature to write changes to a Pub/Sub topic and use a Dataflow pipeline to stream into BigQuery.

AnswerC

C is correct because Data Fusion with Debezium provides near real-time CDC with exactly-once semantics.

Why this answer

Option C is correct because Cloud Data Fusion with a Debezium streaming source provides native change data capture (CDC) from PostgreSQL, capturing inserts, updates, and deletes with low latency. The BigQuery sink in exactly-once mode ensures no duplicate records, meeting the requirement for near real-time analytics with frequent updates and deletes.
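
To see why CDC handles updates and deletes where periodic snapshots do not, the sketch below applies Debezium-style change events to an in-memory table. It assumes the common Debezium envelope fields (`op`, `before`, `after`) and a primary key column named `id`; the sample rows are invented for illustration and no real connector is involved.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def apply_changes(table: Dict[int, Row], events: List[Dict[str, Any]]) -> Dict[int, Row]:
    """Apply change events in order and return the resulting keyed table."""
    result = dict(table)
    for event in events:
        op = event["op"]
        if op in ("c", "r", "u"):       # create, snapshot read, update
            row = event["after"]
            result[row["id"]] = row      # upsert by primary key
        elif op == "d":                  # delete
            result.pop(event["before"]["id"], None)
    return result


events = [
    {"op": "c", "before": None, "after": {"id": 1, "status": "new"}},
    {"op": "u", "before": {"id": 1, "status": "new"}, "after": {"id": 1, "status": "paid"}},
    {"op": "c", "before": None, "after": {"id": 2, "status": "new"}},
    {"op": "d", "before": {"id": 2, "status": "new"}, "after": None},
]

replica = apply_changes({}, events)
print(replica)  # {1: {'id': 1, 'status': 'paid'}}
```

Because every event is a keyed upsert or delete, replaying the same ordered batch against the result leaves the table unchanged, which is the property an exactly-once sink relies on.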

Exam trap

Google Cloud often tests the misconception that Cloud SQL has a native CDC feature to write to Pub/Sub, but in reality, it requires an external CDC tool like Debezium or Datastream to capture changes.

How to eliminate wrong answers

Option A is wrong because exporting the entire database as CSV files every hour and using WRITE_TRUNCATE overwrites the entire BigQuery table, losing all historical data and failing to capture updates and deletes in near real-time; it also does not provide exactly-once delivery. Option B is wrong because JDBCIO reads snapshots of the table at each poll interval, not change data capture, so it cannot capture deletes and may miss updates between polls; it also does not guarantee exactly-once semantics for upserts in BigQuery. Option D is wrong because Cloud SQL does not have a built-in change data capture feature that writes directly to Pub/Sub; this option describes a non-existent capability, as Cloud SQL requires third-party tools like Debezium or Datastream to capture CDC.

99

MCQeasy

A data scientist has trained an XGBoost model on Vertex AI and wants to deploy it to an endpoint with automatic scaling based on traffic. What is the recommended deployment approach?

A.Export the model to a container and deploy on Cloud Run

B.Use AI Platform Prediction with batch prediction

C.Deploy the model as an API on App Engine

D.Use Vertex AI Endpoints with automatic scaling enabled

AnswerD

Vertex AI Endpoints support automatic scaling based on traffic, making it the recommended approach.

Why this answer

Vertex AI Endpoints with automatic scaling enabled is the recommended approach because it directly supports deploying trained models (including XGBoost) as online prediction endpoints with built-in autoscaling based on incoming traffic. This service manages the underlying infrastructure, load balancing, and scaling policies, aligning with the requirement for automatic scaling without additional containerization or serverless overhead.

Exam trap

Google Cloud often tests the distinction between online (real-time) and batch prediction services, and the trap here is that candidates may confuse Vertex AI Endpoints with generic serverless options like Cloud Run or App Engine, overlooking the fact that Vertex AI provides a purpose-built, managed endpoint service with native autoscaling for ML models.

How to eliminate wrong answers

Option A is wrong because exporting the model to a container and deploying on Cloud Run requires manual containerization and does not natively integrate with Vertex AI's model registry, versioning, or monitoring, and Cloud Run's scaling is based on request concurrency rather than the model-specific metrics Vertex AI provides. Option B is wrong because AI Platform Prediction with batch prediction is designed for offline, asynchronous predictions on large datasets, not for real-time online serving with automatic scaling based on live traffic. Option C is wrong because deploying the model as an API on App Engine introduces unnecessary complexity and lacks the optimized serving infrastructure, model versioning, and traffic splitting capabilities that Vertex AI Endpoints offer for ML models.

100

MCQhard

A model deployed on Vertex AI Endpoint is making predictions with high accuracy but the business team suspects bias against a certain demographic group. You need to analyze the model's predictions for fairness. What is the most effective approach?

A.Use Vertex AI Explainable AI to generate feature attributions for each prediction and analyze whether the demographic feature has disproportionate impact.

B.Compute overall fairness metrics by comparing prediction rates across demographic groups.

C.Collect more data for the under-represented group and retrain the model.

D.Use Vertex AI Model Monitoring to check for training-serving skew on the demographic feature.

AnswerA

Explanations help identify if a sensitive attribute is influencing predictions unfairly.

Why this answer

Vertex AI Explainable AI provides per-instance feature attributions, which allow you to examine how the model uses each feature—including sensitive demographic attributes—to arrive at a prediction. By analyzing these attributions across demographic groups, you can detect whether the model disproportionately relies on the demographic feature, indicating potential bias. This approach is more granular than aggregate metrics and directly addresses the business team's concern about bias in individual predictions.
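
One way to analyze attributions across groups is to average the absolute attribution of the sensitive feature per group. The sketch below assumes the attributions have already been retrieved and flattened into plain dictionaries; the record layout, the `age_band` feature, and the group labels are hypothetical and do not reflect the exact response format of the Explainable AI API.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List


def mean_abs_attribution(records: List[dict], feature: str, group_key: str) -> Dict[str, float]:
    """Average the absolute attribution of one feature within each group."""
    by_group = defaultdict(list)
    for record in records:
        by_group[record[group_key]].append(abs(record["attributions"][feature]))
    return {group: mean(values) for group, values in by_group.items()}


# Invented sample data: one attribution dict per prediction.
records = [
    {"group": "A", "attributions": {"age_band": 0.5, "income": 0.25}},
    {"group": "A", "attributions": {"age_band": -0.25, "income": 0.5}},
    {"group": "B", "attributions": {"age_band": 0.125, "income": 0.5}},
]

summary = mean_abs_attribution(records, feature="age_band", group_key="group")
print(summary)  # {'A': 0.375, 'B': 0.125}
```

A large gap between groups suggests the model leans on the sensitive feature more heavily for one group, which is a signal to investigate further rather than proof of bias.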

Exam trap

Google Cloud often tests the distinction between bias detection (analysis) and bias mitigation (retraining), so candidates may incorrectly choose Option C as a quick fix instead of the correct analytical approach using Explainable AI.

How to eliminate wrong answers

Option B is wrong because computing overall fairness metrics (e.g., demographic parity) only compares aggregate prediction rates across groups, which can mask per-instance bias and does not reveal whether the model is using the demographic feature in a discriminatory way. Option C is wrong because collecting more data and retraining the model is a remediation step, not an analysis step; it does not help diagnose whether the current model exhibits bias. Option D is wrong because Vertex AI Model Monitoring checks for training-serving skew (distribution drift between training and serving data), not for bias or fairness in predictions against demographic groups.

101

MCQhard

A company has a model that requires GPU for inference and has strict latency requirements. They deployed on Vertex AI Endpoint with autoscaling but observe cold start latency when scaling up. What is the best solution?

A.Set a higher min_replica_count to keep instances warm

B.Pre-compile the model with TensorRT

C.Use a larger GPU instance

D.Switch to batch prediction

AnswerA

Keeping a minimum number of instances online avoids cold starts when traffic spikes.

Why this answer

Option A is correct: setting a higher min_replica_count ensures there are always some warm instances ready to serve traffic, reducing cold start latency. Option B is wrong because pre-compiling with TensorRT can improve inference speed but does not eliminate cold start delays from scaling. Option C is wrong because a larger GPU does not address the cold start issue.

Option D is wrong because batch prediction is not suitable for online serving.
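
As a rough illustration of the warm-replica setting, the helper below assembles deployment arguments with a replica floor. The parameter names match keyword arguments accepted by `Model.deploy` in the google-cloud-aiplatform SDK; the machine type, accelerator, and replica counts are assumed example values, not recommendations, and the SDK call itself is shown only as a comment.

```python
def deploy_kwargs(min_replicas: int, max_replicas: int) -> dict:
    """Build deployment arguments that keep at least `min_replicas` warm."""
    if min_replicas < 1:
        raise ValueError("min_replicas must be at least 1 to keep an instance warm")
    if max_replicas < min_replicas:
        raise ValueError("max_replicas must be >= min_replicas")
    return {
        "machine_type": "n1-standard-8",        # assumed example machine type
        "accelerator_type": "NVIDIA_TESLA_T4",  # assumed example GPU
        "accelerator_count": 1,
        "min_replica_count": min_replicas,      # floor of always-on replicas
        "max_replica_count": max_replicas,      # autoscaling ceiling
    }


kwargs = deploy_kwargs(min_replicas=3, max_replicas=10)
print(kwargs["min_replica_count"], kwargs["max_replica_count"])  # 3 10

# With the SDK installed and a model already uploaded, this would be used as:
#   endpoint = model.deploy(**kwargs)
```

The trade-off is cost: replicas counted toward the minimum are billed even when idle, so the floor is usually sized to the expected baseline traffic rather than the peak.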

102

MCQeasy

A company is building a data lake on Cloud Storage for log analysis. Log files (CSV) arrive every 5 minutes from multiple sources. The files should be ingested into BigQuery for reporting within 15 minutes. Which approach best meets the requirements with minimal operational overhead?

A.Set up a Cloud Storage notification to trigger a Cloud Function that loads each file into BigQuery using the BigQuery API.

B.Schedule a daily batch load from Cloud Storage to BigQuery using the BigQuery Data Transfer Service.

C.Use Dataflow to read from Pub/Sub (ingested from Cloud Storage) and write to BigQuery.

D.Use BigQuery federated queries to query the CSV files directly from Cloud Storage.

AnswerA

This approach provides near-real-time loading (within minutes) with minimal operational overhead, as Cloud Functions are serverless.

Why this answer

Option A is correct because Cloud Storage notifications trigger a Cloud Function on each file upload, which then loads the file into BigQuery via the BigQuery API. This provides near-real-time ingestion (within seconds of file arrival) with minimal operational overhead, as there are no servers to manage and no scheduling needed. The 5-minute file arrival and 15-minute SLA are easily met without complex infrastructure.
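
The function body for this pattern is small. The sketch below shows only the part that turns a Cloud Storage event into load-job settings; it assumes the event payload carries `bucket` and `name` fields, that the CSV files have a header row, and that the dataset and table names are placeholders. The actual BigQuery client call is left as a comment so the snippet runs without the client library.

```python
def load_request(event: dict, dataset: str, table: str) -> dict:
    """Translate a Cloud Storage object event into BigQuery load settings."""
    return {
        "source_uri": f"gs://{event['bucket']}/{event['name']}",
        "destination": f"{dataset}.{table}",
        "source_format": "CSV",
        "skip_leading_rows": 1,               # assumes a header row
        "write_disposition": "WRITE_APPEND",  # append each new file
    }


request = load_request(
    {"bucket": "example-logs", "name": "app/2024-01-01T00-05.csv"},
    dataset="analytics",
    table="raw_logs",
)
print(request["source_uri"])  # gs://example-logs/app/2024-01-01T00-05.csv

# With google-cloud-bigquery installed, the function would then call roughly:
#   client.load_table_from_uri(request["source_uri"], request["destination"],
#                              job_config=bigquery.LoadJobConfig(...))
```

Appending rather than truncating matters here: each file holds only the latest five minutes of logs, so overwriting the table would discard earlier data.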

Exam trap

Google Cloud often tests the misconception that serverless options like Cloud Functions are only for simple tasks, but here they are the most efficient choice for near-real-time ingestion with minimal overhead, while Dataflow is overkill for this straightforward file-load pattern.

How to eliminate wrong answers

Option B is wrong because a daily batch load does not meet the 15-minute ingestion requirement; it would only load data once per day, causing up to 24 hours of latency. Option C is wrong because it introduces unnecessary complexity and operational overhead by adding Pub/Sub and Dataflow, which are not needed when files are already in Cloud Storage and can be loaded directly via a Cloud Function. Option D is wrong because BigQuery federated queries do not ingest data into BigQuery; they query the CSV files directly from Cloud Storage, which is slower and does not support the required reporting use case where data must be stored in BigQuery for efficient analysis.

103

Drag & Dropmedium

Drag and drop the steps to deploy a Cloud Dataflow pipeline from a template into the correct order.

Why this order

Templates simplify deployment of common Dataflow patterns without writing code.

104

MCQeasy

A data engineer needs to design a batch processing pipeline using Cloud Data Fusion. The pipeline should read data from Cloud Storage, perform transformations (join, filter, aggregate), and write to BigQuery. What is the most efficient way to handle the transformations?

A.Use Data Fusion Wrangler to visually design the transformations and then run the pipeline on a Dataproc cluster.

B.Use SQL queries in BigQuery to perform the transformations after loading raw data into staging tables.

C.Use custom Python scripts in a Cloud Function triggered after the files land in Cloud Storage.

D.Use Apache Spark on Dataproc to code the transformations manually, bypassing Data Fusion.

AnswerA

Wrangler provides a UI for transformations and Data Fusion executes them on Dataproc.

Why this answer

Option A is correct because Cloud Data Fusion Wrangler provides a visual, no-code interface for designing transformations (join, filter, aggregate) that are then compiled into an Apache Spark or MapReduce program and executed on a Dataproc cluster. This approach leverages Data Fusion's native integration with Dataproc for efficient, scalable batch processing without manual coding, while keeping the pipeline fully managed within the Data Fusion ecosystem.

Exam trap

Google Cloud often tests the misconception that Cloud Data Fusion is only a visual tool and that transformations must be coded manually in Spark or SQL, when in fact Wrangler generates optimized Spark code under the hood and integrates seamlessly with Dataproc for execution.

How to eliminate wrong answers

Option B is wrong because it bypasses Data Fusion entirely, requiring raw data to be loaded into BigQuery staging tables first, which adds latency and storage costs; transformations in BigQuery are better suited for analytics queries, not as a primary ETL step in a Data Fusion pipeline. Option C is wrong because Cloud Functions (1st gen) have a maximum timeout of 9 minutes (540 seconds) and limited memory (up to 8 GB), making them unsuitable for large-scale batch transformations like joins and aggregations on datasets that may be gigabytes or terabytes in size. Option D is wrong because it suggests manually coding Spark on Dataproc, which defeats the purpose of using Data Fusion's visual design and managed execution; while Spark can be used, Data Fusion already abstracts and optimizes the Spark execution, so manual coding adds unnecessary complexity and maintenance overhead.

105

Multi-Selectmedium

A company is designing a data lake on Google Cloud. They need to store raw data in multiple formats (CSV, Parquet, Avro) and allow various downstream processing frameworks. Which two storage solutions provide flexibility and scalability? (Choose two.)

Select 2 answers

A.Cloud Filestore

B.BigQuery

C.Cloud Storage

D.Cloud Spanner

E.Cloud Bigtable

AnswersB, C

BigQuery can store and query structured data, and with federated queries it can access external files.

Why this answer

BigQuery is correct because it can directly query raw data stored in Cloud Storage in formats like CSV, Parquet, and Avro using external tables or federated queries, without requiring data loading. This provides a flexible, serverless analytics layer that scales automatically and integrates with downstream processing frameworks like Apache Spark, Dataflow, and Dataproc.

Exam trap

Google Cloud often tests the misconception that any database or storage service can serve as a data lake, but the trap here is that only object storage (Cloud Storage) and a serverless query engine (BigQuery) provide the schema-on-read flexibility and scalability required for raw multi-format data, while transactional or operational databases (Spanner, Bigtable) impose schema-on-write constraints and are not designed for bulk analytical storage.

106

MCQhard

A healthcare company streams patient monitoring data to Cloud Pub/Sub. A Dataflow pipeline reads the stream, enriches with patient records from BigQuery, and writes to Bigtable for real-time queries. The BigQuery lookup is slow and causes pipeline lag. What is the best approach to improve performance?

A.Increase the number of Dataflow workers and use vertical scaling.

B.Use BigQuery's streaming read API in the pipeline.

C.Pre-join the data in a batch pipeline and load into Bigtable.

D.Use a side input from a BigQuery query with a global window and periodic refresh.

AnswerD

Side inputs cache data efficiently.

Why this answer

Option D is correct because using a side input from BigQuery with a global window and periodic refresh allows the Dataflow pipeline to cache the patient records in memory across all workers, avoiding per-element slow lookups. This pattern leverages Beam's side input semantics to broadcast a relatively static lookup table, significantly reducing latency compared to synchronous BigQuery queries for each incoming event.
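
The difference between per-element lookups and a cached snapshot can be simulated without Beam. In the sketch below, `SlowLookup` is a hypothetical stand-in for BigQuery that counts round trips; it is not a Beam or BigQuery API, and a real pipeline would pass the snapshot as a side input rather than a local variable.

```python
class SlowLookup:
    """Stand-in for a remote table that counts how often it is queried."""

    def __init__(self, records: dict):
        self.records = records
        self.calls = 0

    def fetch_one(self, patient_id: str):
        self.calls += 1
        return self.records.get(patient_id)

    def fetch_all(self) -> dict:
        self.calls += 1
        return dict(self.records)


def enrich_per_element(events: list, source: SlowLookup) -> list:
    # One remote call per event: the pattern that causes pipeline lag.
    return [{**e, "patient": source.fetch_one(e["patient_id"])} for e in events]


def enrich_with_snapshot(events: list, source: SlowLookup) -> list:
    # One remote call in total; every event reads the in-memory copy.
    snapshot = source.fetch_all()
    return [{**e, "patient": snapshot.get(e["patient_id"])} for e in events]


patients = {"p1": {"name": "Ada"}, "p2": {"name": "Lin"}}
events = [{"patient_id": "p1", "hr": 72}, {"patient_id": "p2", "hr": 80}, {"patient_id": "p1", "hr": 75}]

slow, cached = SlowLookup(patients), SlowLookup(patients)
assert enrich_per_element(events, slow) == enrich_with_snapshot(events, cached)
print(slow.calls, cached.calls)  # 3 1
```

The periodic refresh mentioned in the answer corresponds to calling `fetch_all` again on a timer, so the snapshot picks up new or changed patient records without returning to per-event queries.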

Exam trap

The trap here is that candidates often assume that increasing parallelism (Option A) or using a faster read API (Option B) will solve the latency issue, when in fact the core problem is the synchronous per-element lookup pattern, which is best addressed by caching the reference data as a side input.

How to eliminate wrong answers

Option A is wrong because increasing the number of workers and vertical scaling does not address the root cause: the per-element synchronous BigQuery lookup is the bottleneck, and simply adding more workers will not reduce the latency of each individual query. Option B is wrong because BigQuery's streaming read API is designed for high-throughput ingestion, not for low-latency point lookups; it still requires a query per event and does not eliminate the network round-trip overhead. Option C is wrong because pre-joining in a batch pipeline and loading into Bigtable would work only if the patient records are static and the data is not truly streaming; it sacrifices the real-time nature of the pipeline and cannot handle late-arriving or updated patient data without reprocessing.

107

MCQhard

You are responsible for deploying a PyTorch model for real-time inference. The model requires GPU acceleration. You want to minimize infrastructure management overhead. Which serving option should you choose?

A.Deploy the model as a Cloud Function with a GPU backend

B.Use Cloud Run with GPU enabled

C.Use AI Platform Training to host the model as a prediction service

D.Deploy the model on Vertex AI Endpoints using a custom container with GPU support

AnswerD

Vertex AI supports custom containers and GPUs for serving.

Why this answer

Vertex AI Endpoints with a custom container and GPU support is the correct choice because it is purpose-built for serving ML models at scale, fully managed, and supports GPU acceleration for low-latency inference. It minimizes infrastructure overhead by handling auto-scaling, health checks, and model versioning, unlike the other options that lack GPU support or are designed for training rather than serving.

Exam trap

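Google Cloud often tests the misconception that any serverless compute service is suitable for GPU inference. Cloud Functions offers no GPU acceleration, and Cloud Run is a general-purpose container platform that gained GPU support only after questions like this were written, so Vertex AI Endpoints remains the purpose-built managed option for GPU model serving.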

How to eliminate wrong answers

Option A is wrong because Cloud Functions do not support GPU backends; they are serverless compute for lightweight, event-driven code and cannot accelerate PyTorch inference. Option B is wrong because Cloud Run did not support GPUs when this question was written; GPU support has since been added, but Cloud Run is a general-purpose platform for containerized applications and lacks ML-specific serving features such as Model Registry integration and model monitoring. Option C is wrong because AI Platform Training is designed for model training jobs, not for hosting a real-time prediction service; it lacks the endpoint management, autoscaling, and low-latency serving features required for production inference.

108

MCQhard

Your company runs a real-time recommendation system for a popular e-commerce website using a machine learning model deployed on Vertex AI Endpoints. The model takes user features and product catalog data as input and returns top-10 product recommendations. The system uses a feature store to serve user embeddings and product embeddings. Recently, the recommender team retrained the model with a new algorithm and deployed it as a new version. Since the deployment, the latency for recommendation requests has increased from 100ms to 500ms on average, exceeding the 200ms SLO. The model accuracy is acceptable, and there are no errors. The endpoint uses an n1-standard-8 machine with a single GPU. The new model is larger but still fits on the GPU. You investigate and find that the GPU utilization remains low (<20%), but CPU utilization is high (90%). What should you do to reduce latency while maintaining accuracy?

A.Upgrade the machine type to one with more GPU memory (e.g., n1-standard-8 with a larger GPU) to reduce model inference time.

B.Change the batch size in the model serving code to process multiple requests together, improving GPU utilization.

C.Increase the number of replicas (nodes) to parallelize the CPU-bound preprocessing work.

D.Offload preprocessing to a dedicated Cloud Run service that runs asynchronously and returns precomputed feature vectors.

AnswerC

Adding more nodes will distribute the preprocessing load across multiple CPUs, reducing the overall latency per request if the load balancer dispatches requests efficiently. However, this increases cost.

Why this answer

Option C is correct because the high CPU utilization (90%) with low GPU utilization (<20%) indicates that the bottleneck is CPU-bound preprocessing, not GPU inference. Increasing the number of replicas (nodes) distributes the CPU preprocessing load across multiple instances, reducing per-request latency without affecting model accuracy. This directly addresses the root cause while keeping the existing GPU resources.

Exam trap

Google Cloud often tests the misconception that GPU utilization must be increased to reduce latency, but the trap here is that the bottleneck is CPU-bound preprocessing, not GPU inference, so scaling replicas (horizontal scaling) is the correct fix, not GPU upgrades or batching.

How to eliminate wrong answers

Option A is wrong because upgrading to a larger GPU does not address the CPU bottleneck; GPU memory is sufficient and GPU utilization is low, so more GPU memory would not reduce latency. Option B is wrong because increasing batch size would increase latency per request (as requests wait to be batched) and does not solve CPU-bound preprocessing; it may even worsen CPU contention. Option D is wrong because offloading preprocessing to a Cloud Run service asynchronously would add network round-trip latency and complexity, and the preprocessing is likely synchronous and required per request; it would not reduce the CPU bottleneck on the serving path.

109

MCQhard

A Dataflow streaming pipeline uses stateful transformations with per-key state and timers. After a deployment, the team observes that the pipeline is reprocessing events from the last 30 minutes every time it restarts. The pipeline's checkpoint is configured to persist every 10 seconds. Which change should be made to prevent unnecessary reprocessing?

A.Use a non-volatile state backend like Cloud Bigtable for state storage.

B.Increase the checkpoint interval to 60 seconds to reduce frequency of checkpoints.

C.Enable idempotent writes to the sink by adding a unique identifier per event.

D.Decrease the checkpoint interval to 1 second to checkpoint more frequently.

AnswerC

Idempotent writes prevent duplicates from being written when reprocessing occurs.

Why this answer

Option C is correct because enabling idempotent writes ensures that even if events are reprocessed due to pipeline restarts, the sink will deduplicate them based on the unique identifier. This prevents duplicate data from being written, which is the core issue when stateful transformations cause reprocessing of events from the last 30 minutes. The checkpoint interval (10 seconds) is already frequent enough; the problem is not checkpoint frequency but the lack of deduplication at the sink.
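
A minimal sketch of an idempotent sink, assuming each event carries a unique `event_id` field (a hypothetical name). The class is an in-memory stand-in for the real sink; the point is that a keyed write makes replayed events harmless.

```python
class IdempotentSink:
    """In-memory sink that stores at most one row per event_id."""

    def __init__(self):
        self.rows = {}

    def write(self, event: dict) -> None:
        # Keyed upsert: writing the same event twice leaves a single row.
        self.rows[event["event_id"]] = event


events = [
    {"event_id": "e1", "key": "sensor-a", "value": 10},
    {"event_id": "e2", "key": "sensor-a", "value": 12},
    {"event_id": "e3", "key": "sensor-b", "value": 7},
]

sink = IdempotentSink()
for event in events:
    sink.write(event)

# Simulate a restart that replays the last two events.
for event in events[1:]:
    sink.write(event)

print(len(sink.rows))  # 3
```

An append-only sink would hold five rows after the same replay, which is the duplication the unique identifier is meant to prevent.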

Exam trap

Google Cloud often tests the misconception that increasing checkpoint frequency or changing state backends solves reprocessing issues, when the real solution is idempotent sinks to handle duplicates from replay.

How to eliminate wrong answers

Option A is wrong because using a non-volatile state backend like Cloud Bigtable does not prevent reprocessing; it only ensures state survives restarts, but the pipeline still replays uncommitted events from the last checkpoint. Option B is wrong because increasing the checkpoint interval to 60 seconds would actually increase the window of potential reprocessing, making the problem worse, not better. Option D is wrong because decreasing the checkpoint interval to 1 second would increase overhead and still not prevent reprocessing; the pipeline will always replay events from the last successful checkpoint, regardless of frequency.

110

Multi-Selectmedium

An organization is moving on-premises Hadoop workloads to Google Cloud. They need to minimize code changes and manage transient clusters for cost savings. Which two Google Cloud services should they consider? (Choose TWO.)

Select 2 answers

A.Compute Engine with self-managed Hadoop

B.BigQuery

C.Dataproc on GKE

D.Cloud Dataproc

E.Cloud Dataflow

AnswersC, D

Allows running Spark workloads on GKE, leveraging container orchestration.

Why this answer

Options C and D are correct: Dataproc is a managed Hadoop/Spark service that can run transient clusters, and Dataproc on GKE allows running Spark workloads on GKE for flexibility. Option A is wrong because Compute Engine requires manual cluster setup. Option B is wrong because BigQuery is not Hadoop-compatible.

Option E is wrong because Dataflow is not compatible with Hadoop.

111

Multi-Selecteasy

A company uses Pub/Sub to decouple services. They have a topic with two subscriptions: Subscription A is a push subscription that sends messages to a Cloud Function; Subscription B is a pull subscription used by a Dataflow job. They need to ensure that messages are processed in order for a specific device_id. Which TWO configurations should they apply?

Select 2 answers

A.Enable message ordering on the topic and set an ordering key for each message.

B.Disable duplicate filtering on the topic.

C.Configure the Cloud Function to retry on failure with exponential backoff.

D.Use exactly one subscription for both the Cloud Function and Dataflow job.

E.Use a single subscription with multiple concurrent consumers.

AnswersA, D

Ordering key is required for ordered delivery.

Why this answer

Option A is correct because enabling message ordering on the topic and setting an ordering key (e.g., device_id) ensures that messages with the same key are delivered to subscribers in the order they were published. This is a fundamental Pub/Sub feature that guarantees FIFO (first-in, first-out) delivery per ordering key, which directly addresses the requirement for processing messages in order for a specific device_id.
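
On the publishing side, using device_id as the ordering key could look like the sketch below. The helper only builds the arguments; the `reading` layout is a hypothetical example, and the client calls are shown as comments because they require the google-cloud-pubsub library and a real topic.

```python
import json


def to_publish_kwargs(reading: dict) -> dict:
    """Build publish arguments that use device_id as the ordering key."""
    return {
        "data": json.dumps(reading, sort_keys=True).encode("utf-8"),
        "ordering_key": str(reading["device_id"]),
    }


kwargs = to_publish_kwargs({"device_id": "dev-7", "temp_c": 21})
print(kwargs["ordering_key"])  # dev-7

# With google-cloud-pubsub installed, ordering must also be enabled on the client:
#   publisher = pubsub_v1.PublisherClient(
#       publisher_options=pubsub_v1.types.PublisherOptions(enable_message_ordering=True)
#   )
#   publisher.publish(topic_path, **kwargs)
```

Messages with different ordering keys carry no ordering guarantee relative to each other, so readings from separate devices can still be processed in parallel.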

Exam trap

Google Cloud often tests the misconception that multiple subscriptions or multiple consumers can maintain ordering independently, but in Pub/Sub, ordering is per subscription and per ordering key, and only a single subscriber per subscription can guarantee FIFO delivery.

112

MCQhard

A company runs a production Dataflow streaming pipeline that reads from Pub/Sub, groups events by customer ID, and writes to BigQuery. The pipeline uses global windows with triggers. After a recent code change, the pipeline started generating duplicate events in BigQuery for the same customer ID. The previous version did not have duplicates. The team reviews the code and sees that the trigger was changed from 'afterProcessingTime' to 'afterWatermark'. What is the most likely reason for duplicates?

A.The afterProcessingTime trigger fired multiple times for the same window

B.Late-arriving events cause the afterWatermark trigger to fire additional panes for the same window

C.The pipeline is firing early and on-time panes for the same window

D.The pipeline uses accumulation mode which accumulates results across firings

AnswerB

Watermark triggers can fire again for late data, producing duplicates if not deduplicated.

Why this answer

The change from `afterProcessingTime` to `afterWatermark` introduces a dependency on the watermark, which estimates event time progress. When late-arriving events (those with timestamps before the watermark) arrive after the watermark has advanced, the `afterWatermark` trigger fires an additional pane for the same window, causing duplicate writes to BigQuery. The previous trigger (`afterProcessingTime`) fired based on processing time, which does not react to late data in the same way, hence no duplicates.

Exam trap

Google Cloud often tests the distinction between processing-time and event-time triggers, and the trap here is that candidates assume `afterWatermark` is simply a 'one-time' trigger, overlooking that late-arriving data can cause additional firings.

How to eliminate wrong answers

Option A is wrong because `afterProcessingTime` fires based on wall-clock time, not on data arrival, and it does not inherently cause multiple firings for the same window unless combined with other triggers or accumulation; the issue here is specifically the switch to watermark-based triggering. Option C is wrong because firing early and on-time panes is a feature of `afterWatermark` with early firings, but the question states the trigger was changed to `afterWatermark` alone (without early firings), so this does not explain the duplicates. Option D is wrong because accumulation mode (e.g., `accumulatingFiredPanes`) determines whether results are accumulated across firings, but the core cause of additional panes is the watermark reacting to late data, not the accumulation mode itself.

113

Multi-Selecthard

Which TWO strategies help reduce prediction latency for a real-time model deployed on Vertex AI Endpoint?

Select 2 answers

A.Use batch prediction instead of online

B.Use Cloud CDN to cache predictions

C.Use a larger machine type (e.g., n1-highcpu-16)

D.Reduce model complexity (e.g., quantize or prune)

E.Enable autoscaling with a minimum replica count

AnswersD, E

Reduces inference time.

Why this answer

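Enabling autoscaling with a minimum replica count keeps warm replicas available and avoids cold start latency. Reducing model complexity (e.g., quantization or pruning) speeds up inference. A larger machine type does not reduce latency unless the model is actually constrained by CPU or memory.

Batch prediction is offline, so it does not serve real-time requests. Cloud CDN caches static content, not per-request predictions.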

114

MCQhard

You are a data engineer at a financial services company that uses Vertex AI to train and deploy models for credit risk assessment. The company has strict governance requirements: every model version must be approved by the risk committee before going to production. The approval process can take several days. Currently, the team trains a new model weekly and manually deploys it to a staging endpoint for review, then manually promotes to production after approval. This process is error-prone and slow. You want to automate the pipeline: training should trigger automatically when new data arrives, the model should be automatically deployed to a staging endpoint for review, and after manual approval, it should be promoted to production. Additionally, you need to ensure that if a model in staging performs poorly (e.g., low accuracy), it should not be promoted even if approved. What should you do?

A.Use Vertex AI Experiments to track model versions, then manually deploy from the Experiments UI.

B.Use Cloud Scheduler to run training weekly, then use Cloud Functions to deploy to staging, and after manual approval, use another Cloud Function to check performance and deploy to production.

C.Create a Vertex AI Pipeline that: (1) Triggers on new data, (2) Trains model, (3) Evaluates and stores metrics in the model registry, (4) Deploys to staging endpoint as a new model version. Then use a manual approval step (e.g., via Cloud Build approval or external system) to trigger a second pipeline that checks the stored metrics and, if acceptable, deploys to production endpoint.

D.Train models on Vertex AI Workbench and use a CI/CD tool like Cloud Build to deploy to staging. Use a Cloud Build approval step to promote to production after manual check.

AnswerC

This automates training and staging deployment, keeps manual approval as a separate gate, and uses a metric check to conditionally promote to production.

Why this answer

The best approach uses Vertex AI Pipelines to automatically train and deploy to a staging endpoint. After manual approval, a separate pipeline step checks model performance metrics (which were stored during training/evaluation) and if they meet a threshold, promotes to production. This enforces governance and automation.
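The promotion gate described above can be sketched in plain Python. This is only an illustration of the decision logic, not Vertex AI Pipelines code: the function name `should_promote`, the metric key `accuracy`, and the 0.85 threshold are all assumptions made for the example.

```python
def should_promote(metrics, approved, min_accuracy=0.85):
    """Promote only when the committee approved AND stored metrics pass.

    metrics: dict of evaluation metrics saved at training time (hypothetical shape).
    approved: True once the manual approval step has completed.
    min_accuracy: assumed threshold; the source does not give a number.
    """
    accuracy = metrics.get("accuracy", 0.0)
    return bool(approved) and accuracy >= min_accuracy


# Approved and accurate: promote.
good = should_promote({"accuracy": 0.91}, approved=True)
# Approved but poorly performing: do not promote.
poor = should_promote({"accuracy": 0.70}, approved=True)
# Accurate but not yet approved: do not promote.
pending = should_promote({"accuracy": 0.95}, approved=False)
```

Keeping the metric check separate from the approval flag means an approval alone can never push a poorly performing model to production.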

Full explanation →

115

MCQeasy

A data scientist has iterated on a model and produced a new version. The organization requires the ability to roll back to the previous version quickly if the new version performs poorly in production. Which approach should be used?

A.Store each model version in a separate Cloud Storage bucket.

B.Keep the previous model in a container image and redeploy via Cloud Run.

C.Use Cloud Source Repositories to tag model versions.

D.Upload both versions to Vertex AI Model Registry and use endpoint traffic splitting to route 100% to the safe version if needed.

AnswerD

The registry keeps versions; endpoint traffic splitting allows an instant switch.

Why this answer

Vertex AI Model Registry allows you to deploy multiple model versions and use endpoint traffic splitting to gradually shift traffic or instantly route 100% to a specific version. This enables immediate rollback by setting the traffic split to 100% for the previous model version without redeploying or changing infrastructure.
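As a rough sketch of the rollback idea, a traffic split can be modelled as a mapping from deployed-model ID to a percentage that sums to 100. The code below is plain Python, not the Vertex AI SDK; the IDs `model-v1` and `model-v2` and the helper `rollback_split` are hypothetical.

```python
def rollback_split(current_split, safe_id):
    """Return a new traffic split that routes 100% of traffic to safe_id.

    current_split: dict mapping deployed-model ID -> percentage (sums to 100).
    safe_id: the deployed model that should receive all traffic.
    """
    if safe_id not in current_split:
        raise KeyError(f"{safe_id} is not deployed on this endpoint")
    return {
        model_id: (100 if model_id == safe_id else 0)
        for model_id in current_split
    }


# The new version currently receives most of the traffic.
split = {"model-v1": 10, "model-v2": 90}
# Roll back: everything goes to the previous version, nothing is redeployed.
rolled_back = rollback_split(split, "model-v1")
```

Because both versions stay deployed on the endpoint, the rollback is only a change to this mapping.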

Exam trap

Google Cloud often tests the misconception that version control tools (like Cloud Source Repositories) or storage buckets are sufficient for rollback, when in fact the key requirement is a managed model registry with traffic splitting capabilities for instant, no-downtime rollback.

How to eliminate wrong answers

Option A is wrong because storing each model version in a separate Cloud Storage bucket does not provide a mechanism for quick rollback; you would still need to redeploy the model from that bucket, which is not instantaneous. Option B is wrong because keeping the previous model in a container image and redeploying via Cloud Run is not a rollback strategy—it requires a new deployment, which takes time and does not leverage Vertex AI's managed traffic splitting. Option C is wrong because Cloud Source Repositories is a source code version control service, not a model registry; tagging model versions there does not affect production endpoint traffic.

Full explanation →

116

MCQeasy

You are operating a streaming data pipeline that uses Cloud Pub/Sub and Dataflow. The data source sometimes emits events that are delayed by several minutes due to network issues. Your pipeline must produce accurate aggregations (e.g., counts per minute) even for late data, but you also need to avoid waiting for a long time before emitting results. Which approach should you use?

A.Use processing-time windows and ignore the event timestamps entirely.

B.Use event-time processing with allowed lateness and a trigger that fires early to provide speculative results.

C.Use global windows and hold all data for 24 hours before processing to ensure completeness.

D.Use event-time processing and discard any data that arrives after the window ends.

AnswerB

Dataflow supports allowed lateness and triggers; you can set a trigger to emit early results every minute, and then a final result after the allowed lateness period, ensuring both low latency and eventual accuracy.

Why this answer

Option B is correct because it uses event-time processing to handle late data via allowed lateness, combined with early triggers to emit speculative results before the window closes. This balances accuracy for delayed events with low latency for downstream consumers, which is a common requirement in streaming pipelines using Cloud Pub/Sub and Dataflow.
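The allowed-lateness half of this answer can be illustrated with a small simulation. This is plain Python, not Apache Beam, and it does not model triggers; the 60-second window, the 300-second allowed lateness, and the `(event_time, watermark)` input shape are assumptions chosen for the example.

```python
from collections import defaultdict

WINDOW = 60             # fixed window size in seconds (assumed)
ALLOWED_LATENESS = 300  # how long after window end late data is accepted (assumed)


def window_start(event_time):
    """Start of the fixed event-time window that contains event_time."""
    return event_time - (event_time % WINDOW)


def count_per_window(events):
    """Count events per event-time window, dropping data that is too late.

    events: iterable of (event_time, watermark_at_arrival) in arrival order.
    Returns (counts keyed by window start, number of dropped events).
    """
    counts = defaultdict(int)
    dropped = 0
    for event_time, watermark in events:
        start = window_start(event_time)
        window_end = start + WINDOW
        if watermark > window_end + ALLOWED_LATENESS:
            dropped += 1  # arrived after the allowed lateness expired
            continue
        counts[start] += 1  # on time, or late but still accepted
    return dict(counts), dropped


events = [
    (5, 0),     # on time, window [0, 60)
    (30, 20),   # on time, window [0, 60)
    (65, 70),   # on time, window [60, 120)
    (50, 200),  # late, but within allowed lateness for window [0, 60)
    (10, 500),  # too late: watermark is past 60 + 300
]
counts, dropped = count_per_window(events)
```

The late event at time 50 is still counted in its original window, which is what processing-time windows (Option A) would get wrong.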

Exam trap

Google Cloud often tests the distinction between processing-time and event-time semantics, and the trap here is that candidates may choose processing-time windows (Option A) thinking they are simpler, not realizing they sacrifice correctness for late data.

How to eliminate wrong answers

Option A is wrong because processing-time windows ignore event timestamps entirely, so late-arriving data would be assigned to the wrong window, producing inaccurate aggregations. Option C is wrong because global windows with a 24-hour hold would cause unbounded latency and memory pressure, violating the requirement to avoid waiting a long time before emitting results. Option D is wrong because discarding late data after the window ends would lose delayed events, failing the requirement for accurate aggregations even with late data.

Full explanation →

117

MCQmedium

The exhibit shows a Cloud Logging query result. A data engineer sees this log for a streaming Dataflow job. What is the most likely cause?

A.The job is experiencing network latency.

B.The job is using too much memory per worker.

C.The job has insufficient permissions to scale.

D.The job has reached the maximum number of workers allowed by the project quota.

AnswerD

Worker pool exhausted indicates quota limit.

Why this answer

The log shows that the Dataflow job is not scaling up despite pending work. This typically occurs when the job has reached the maximum number of workers allowed by the project quota. Dataflow workers are Compute Engine VMs, so they count against the project's Compute Engine quotas; if the job attempts to exceed that limit, it stops scaling and logs messages indicating that it cannot add more workers.

Exam trap

Google Cloud often tests the distinction between resource quotas and permissions, so the trap here is that candidates confuse a quota limit (which is a hard resource cap) with an IAM permissions issue (which would produce a different error).

How to eliminate wrong answers

Option A is wrong because network latency would cause delays in data processing but would not prevent the job from scaling up; the log would show slow progress or timeouts, not a scaling block. Option B is wrong because excessive memory usage per worker would cause worker crashes or OOM errors, not a failure to scale; the job would still attempt to add workers. Option C is wrong because insufficient permissions to scale would result in an authorization error when trying to create new worker instances, not a quota-related log message; the error would reference IAM roles or service account permissions.

Full explanation →

118

MCQhard

You are a machine learning engineer at a FinTech company. Your team has developed a credit risk model using XGBoost and deployed it on Vertex AI Prediction using a custom container. The model is used for real-time credit decisions, and the endpoint is configured with a single machine type (n1-standard-4) and min_replica_count = 2, max_replica_count = 10. Recently, the team observed that during a promotional campaign, the endpoint's prediction latency increased from 200ms to over 2 seconds, and some requests resulted in 503 errors. You check the Cloud Monitoring metrics and see that CPU utilization reached 100% on the existing replicas, but the number of replicas never scaled beyond the initial 2. The deployment uses a custom container that runs a TensorFlow Serving-like model server. The container image is stored in Artifact Registry. The Vertex AI endpoint is configured with a traffic split of 100% to this model version. What is the most likely cause of the scaling failure, and what step should you take to resolve it?

A.Increase min_replica_count to 5 to handle the baseline load.

B.Change the endpoint configuration to use gRPC instead of HTTP to reduce latency.

C.Ensure the custom container exposes the correct metrics for CPU utilization so that Vertex AI autoscaling can trigger.

D.Set the max_replica_count to a higher value like 20.

AnswerC

Autoscaling relies on metrics; if the container doesn't expose them, scaling won't happen.

Why this answer

Option C is correct because Vertex AI's autoscaling relies on the custom container exposing standard metrics (e.g., CPU utilization via the /metrics endpoint in a Prometheus format or through the Vertex AI custom metric adapter). If the container does not expose these metrics, the autoscaler cannot detect high CPU usage and will not trigger scaling beyond the initial replicas, leading to latency spikes and 503 errors under load.

Exam trap

The trap here is that candidates assume autoscaling is automatic based on CPU utilization alone, but Vertex AI requires explicit metric exposure from custom containers; otherwise, the autoscaler remains inactive.

How to eliminate wrong answers

Option A is wrong because increasing min_replica_count only sets a baseline number of replicas; it does not fix the autoscaling mechanism that failed to add replicas when CPU hit 100%. Option B is wrong because switching to gRPC can reduce network overhead and latency, but it does not address the root cause of scaling failure—the autoscaler not triggering due to missing metrics. Option D is wrong because raising max_replica_count only increases the upper limit; if the autoscaler never triggers scaling (due to missing metrics), the replicas will remain at the initial count regardless of the max setting.

Full explanation →

119

MCQmedium

After deploying a model to Vertex AI Endpoints, the prediction responses include unexpected data. The model returns logits instead of probabilities. What is the most likely cause?

A.The model was trained with a different loss function

B.The input data is scaled incorrectly

C.The endpoint is not properly configured

D.The model output is not post-processed

AnswerD

Missing softmax or similar transformation leads to raw logits being returned.

Why this answer

The most likely cause is that the model output is not post-processed. In Vertex AI Endpoints, models often output raw logits (unnormalized scores) from the final layer, and a softmax or sigmoid activation must be applied as a post-processing step to convert these logits into probabilities. Without this post-processing, the endpoint returns the raw logits, which is why the prediction responses contain unexpected data.
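The missing post-processing step is typically a softmax over the logits. A minimal standard-library sketch follows; the input values are made up for illustration, and a real serving function would apply this inside the exported model or a custom prediction routine.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Raw scores as a model without a final activation might return them.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
```

The ordering of the classes is unchanged by softmax; only the scale changes, so each value becomes interpretable as a probability.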

Exam trap

Google Cloud often tests the distinction between model training configurations and serving/post-processing steps, and the trap here is that candidates assume the endpoint or deployment configuration controls output formatting, when in fact the model's exported graph or serving function determines whether logits or probabilities are returned.

How to eliminate wrong answers

Option A is wrong because training with a different loss function (e.g., cross-entropy vs. mean squared error) does not directly cause the model to output logits instead of probabilities; the output layer's activation function (or lack thereof) determines whether outputs are logits or probabilities. Option B is wrong because incorrect input scaling would affect the prediction values (e.g., shifting or scaling them), but it would not change the fundamental nature of the output from logits to probabilities; the model would still output whatever its final layer produces. Option C is wrong because the endpoint configuration (e.g., machine type, traffic splitting, or model version) does not alter the model's output format; the endpoint simply serves the model's raw predictions as-is.

Full explanation →

120

MCQmedium

A streaming Dataflow pipeline ingests events from Cloud Pub/Sub and writes to BigQuery. The event schema evolves occasionally (new columns added). The pipeline fails when new columns appear. What is the best long-term solution?

A.Configure the BigQuery sink to use stored 'dynamic' schema by setting create_disposition to CREATE_NEVER and writing to a temporary table with schema auto-detection

B.Stop the pipeline and update the BigQuery schema manually whenever a new column appears

C.Switch to Dataproc to process the data with Spark and write to BigQuery using the Avro format

D.Use a Cloud Function to transform the data and add null columns for missing fields

AnswerA

Using schema auto-detection on a temporary table and then merging into the main table with wildcard tables or using BigQuery's schema flexibility can handle new columns.

Why this answer

Option A is correct because it leverages BigQuery's schema auto-detection with a temporary table to handle schema evolution dynamically. By setting create_disposition to CREATE_NEVER, the pipeline writes to a table that already exists, while the temporary table with auto-detection allows the pipeline to infer new columns from the incoming data. This approach avoids pipeline failures when new columns appear, as the sink can adapt without manual intervention or pipeline restarts.

Exam trap

Google Cloud often tests the misconception that manual schema updates or external transformations are acceptable long-term solutions, when in fact the correct answer leverages a built-in BigQuery feature (schema auto-detection) to handle schema evolution dynamically without pipeline downtime.

How to eliminate wrong answers

Option B is wrong because it requires manual intervention to stop the pipeline and update the BigQuery schema each time a new column appears, which is not a long-term solution and defeats the purpose of a streaming pipeline that needs to handle schema evolution automatically. Option C is wrong because switching to Dataproc with Spark and Avro format does not inherently solve the schema evolution problem; it adds unnecessary complexity and still requires handling schema changes in the Spark job or BigQuery sink. Option D is wrong because using a Cloud Function to transform data and add null columns for missing fields is a brittle workaround that requires maintaining a separate function and does not scale well with frequent schema changes; it also introduces additional latency and cost.

Full explanation →

121

Multi-Selecthard

A company uses Cloud Build to deploy containerized applications. They want to ensure build and deployment quality. Which THREE steps should they include in their CI/CD pipeline? (Choose three.)

Select 3 answers

A.Scan container images for vulnerabilities using Container Analysis.

B.Run unit tests after deployment.

C.Deploy directly to production on every commit.

D.Use canary deployments with gradual traffic shifting.

E.Pin base image digests in Dockerfile.

AnswersA, D, E

Vulnerability scanning ensures images are secure before deployment.

Why this answer

Options A, D, and E are correct: Container scanning catches vulnerabilities, canary deployments reduce risk, and pinning base image digests ensures reproducibility. Option B (unit tests after deployment) is too late. Option C (direct deployment to production) bypasses safety checks.

Full explanation →

122

Multi-Selectmedium

Which TWO factors should be considered when choosing between Cloud Dataflow and Dataproc for a batch processing pipeline?

Select 2 answers

A.Dataproc allows custom Docker containers, while Dataflow does not.

B.Dataflow is built for data processing patterns, while Dataproc is better for general-purpose compute.

C.Dataproc supports Python, while Dataflow only supports Java.

D.Dataflow provides auto-scaling, while Dataproc requires manual cluster sizing.

E.Dataflow supports Java and Python, while Dataproc only supports Java.

AnswersB, D

Dataflow is specialized for data pipelines.

Why this answer

Option B is correct because Dataflow is purpose-built for data processing patterns like batch and stream processing with unified programming models (Apache Beam), while Dataproc is optimized for general-purpose compute workloads such as running custom Spark, Hadoop, or ML jobs. Option D is correct because Dataflow provides automatic horizontal autoscaling based on pipeline throughput, whereas Dataproc requires manual cluster sizing or configuration of autoscaling policies, which are not as granular or reactive as Dataflow's.

Exam trap

Google Cloud often tests the misconception that Dataflow only supports Java and that Dataproc requires manual scaling, when in fact both services support multiple languages and Dataproc offers optional autoscaling, but Dataflow's autoscaling is more dynamic and fine-grained.

Full explanation →

123

MCQeasy

Your company wants to analyze real-time user clickstream data from a website. The data arrives as JSON messages via an HTTP endpoint. The pipeline should be able to handle spikes in traffic, provide low-latency insights, and store the raw data in a data lake for historical analysis. Which Google Cloud service should you use to ingest and process the streaming data?

A.Cloud Pub/Sub combined with Dataflow

B.Cloud Dataproc

C.Cloud Functions

D.Cloud IoT Core

AnswerA

Cloud Pub/Sub provides reliable, scalable ingestion; Dataflow enables stream processing with exactly-once semantics and can write to Cloud Storage.

Why this answer

Cloud Pub/Sub is the correct ingestion service because it provides a highly scalable, fully managed message queue that can handle traffic spikes by decoupling producers from consumers. Dataflow (Apache Beam) then processes the streaming data with low latency, supports exactly-once semantics, and can write raw data to a data lake like Cloud Storage for historical analysis. This combination meets all requirements: spike handling, low-latency insights, and raw data storage.

Exam trap

Google Cloud often tests the misconception that Cloud Functions can handle streaming ingestion due to its HTTP trigger, but its 9-minute timeout and lack of native streaming support make it unsuitable for high-throughput, low-latency pipelines.

How to eliminate wrong answers

Option B (Cloud Dataproc) is wrong because it is a managed Hadoop/Spark service designed for batch and stream processing but requires manual cluster management and autoscaling configuration, making it less suitable for handling unpredictable traffic spikes with low latency compared to the serverless Pub/Sub + Dataflow pipeline. Option C (Cloud Functions) is wrong because it is a lightweight, event-driven compute service with a maximum timeout of 9 minutes and limited throughput, making it unsuitable for high-volume, real-time streaming ingestion and processing. Option D (Cloud IoT Core) is wrong because it is specifically designed for ingesting data from IoT devices using MQTT/HTTP protocols, not for general web clickstream data from an HTTP endpoint, and it lacks the native streaming analytics capabilities needed for low-latency insights.

Full explanation →

124

MCQhard

A data science team deploys a TensorFlow image classification model to Vertex AI Prediction. The model performs well in offline evaluation but shows a 15% drop in accuracy in production. The production data distribution has shifted compared to the training data. The team needs to continuously monitor and retrain the model. Which solution is most appropriate for detecting drift and triggering retraining?

A.Enable Vertex AI Model Monitoring for feature drift; configure alerts to trigger a Vertex AI Pipelines retraining run.

B.Export production predictions to Cloud Logging, then use Log Analytics to compare distributions.

C.Store predictions in BigQuery and run scheduled SQL queries to detect drift; trigger retraining via Cloud Functions.

D.Use Cloud Monitoring to track prediction latency and error rates; manually retrain when errors increase.

AnswerA

Vertex AI Model Monitoring detects drift and can trigger automated retraining.

Why this answer

Vertex AI Model Monitoring is purpose-built for detecting feature drift in production ML models by comparing live inference data against a baseline distribution. When drift is detected, it can directly trigger a Vertex AI Pipelines retraining run, creating an automated, end-to-end MLOps loop that addresses the production accuracy drop without manual intervention.
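To make "comparing live inference data against a baseline distribution" concrete, here is a small sketch that scores drift on one categorical feature using the L-infinity distance (the largest per-category difference in frequency). It only illustrates the general idea and is not the service's implementation; the feature values and the 0.3 alert threshold are assumptions.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each category in a non-empty list."""
    counts = Counter(values)
    total = len(values)
    return {category: n / total for category, n in counts.items()}


def linf_distance(baseline, live):
    """Largest absolute difference in category frequency between two samples."""
    p, q = distribution(baseline), distribution(live)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Training-time baseline versus recent production traffic (made-up data).
baseline = ["mobile"] * 50 + ["desktop"] * 50
live = ["mobile"] * 90 + ["desktop"] * 10

DRIFT_THRESHOLD = 0.3  # assumed alert threshold
score = linf_distance(baseline, live)
drift_detected = score > DRIFT_THRESHOLD
```

In an automated loop, a `drift_detected` result would be the signal that raises an alert and starts a retraining run.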

Exam trap

Google Cloud often tests the distinction between operational monitoring (latency, errors) and data-quality monitoring (feature drift), leading candidates to mistakenly choose Cloud Monitoring (Option D) because they confuse production health metrics with model-specific distribution shifts.

How to eliminate wrong answers

Option B is wrong because exporting predictions to Cloud Logging and using Log Analytics for distribution comparison is a manual, ad-hoc approach that lacks native drift detection algorithms and automated retraining triggers, making it unsuitable for continuous monitoring. Option C is wrong because storing predictions in BigQuery and running scheduled SQL queries to detect drift requires custom statistical logic and does not leverage Vertex AI's built-in drift detection, alerting, or pipeline integration, leading to higher maintenance overhead. Option D is wrong because Cloud Monitoring tracks prediction latency and error rates, which are operational metrics, not feature distribution shifts; relying on error rates as a proxy for drift is indirect and unreliable, and manual retraining defeats the goal of continuous automation.

Full explanation →

125

Drag & Dropmedium

Drag and drop the steps to create a Cloud Bigtable instance and table using the CLI into the correct order.

Why this order

Bigtable instances contain clusters; tables are created within instances and must have column families.

Full explanation →

126

Multi-Selecthard

A company is migrating their on-premises Apache Spark jobs to Google Cloud Dataproc. They want to minimize operational overhead and cost for jobs that run only a few times per day. Which TWO strategies should they adopt? (Choose TWO.)

Select 2 answers

A.Configure HDFS replication factor to 3 to ensure data durability during cluster restarts.

B.Rewrite the Spark jobs as Dataflow pipelines to take advantage of serverless processing.

C.Store all data in Cloud Storage instead of HDFS, and use the Cloud Storage connector to access it.

D.Create an ephemeral Dataproc cluster for each job and delete it after completion.

E.Use a small persistent cluster that runs continuously and submit jobs to it.

AnswersC, D

C is correct because Cloud Storage is durable and eliminates HDFS management.

Why this answer

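Option C is correct because storing data in Cloud Storage decouples storage from compute, allowing ephemeral clusters to be spun up and down without data loss. The Cloud Storage connector provides Hadoop-compatible file system access, eliminating the need for HDFS replication and reducing costs by avoiding persistent cluster storage. Option D is correct because an ephemeral cluster exists only for the duration of a job, so the company pays for compute only while jobs run and has no long-lived cluster to maintain, which suits jobs that run only a few times per day.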

Exam trap

Google Cloud often tests the misconception that persistent clusters are necessary for data durability, but the correct approach for intermittent workloads is to use ephemeral clusters with Cloud Storage to minimize cost and operational overhead.

Full explanation →

127

MCQeasy

Your team wants to continuously monitor a deployed model's performance in production. They need to detect when the model's predictions become unreliable due to changes in the real world (e.g., new customer behavior). Which Vertex AI service should they use?

A.Vertex AI Explainable AI

B.Vertex AI Experiments

C.Vertex AI Model Monitoring

D.Vertex AI Prediction

AnswerC

Model Monitoring continuously checks for skew, drift, and performance issues.

Why this answer

Vertex AI Model Monitoring is the correct choice because it is specifically designed to continuously track a deployed model's prediction quality over time, detecting issues like data drift, feature drift, and prediction skew that indicate the model's reliability is degrading due to changes in the real world. It automatically compares incoming prediction data against a baseline training dataset and alerts when statistical distributions shift beyond configurable thresholds, enabling proactive retraining or intervention.

Exam trap

Google Cloud often tests the distinction between services that 'serve' predictions (Vertex AI Prediction) versus those that 'monitor' predictions (Vertex AI Model Monitoring), leading candidates to mistakenly choose the prediction service when the question asks about detecting unreliability.

How to eliminate wrong answers

Option A is wrong because Vertex AI Explainable AI provides feature attributions and explanations for individual predictions, but it does not continuously monitor model performance or detect drift in production. Option B is wrong because Vertex AI Experiments is used for tracking and comparing machine learning experiments during model development, not for monitoring deployed models in production. Option D is wrong because Vertex AI Prediction is the service that hosts and serves the model for online predictions, but it has no built-in monitoring capabilities for detecting performance degradation or data drift.

Full explanation →

128

MCQhard

In the Vertex AI Pipeline component YAML exhibit, the component is designed to evaluate a model and produce metrics. If the threshold_accuracy is set to 0.85, what is the expected behavior of this component?

A.It will output the evaluation metrics, and the pipeline can use them for conditional decisions

B.It will deploy the model if the accuracy meets the threshold

C.It will ignore the threshold_accuracy input if not provided

D.It will fail if the model accuracy is below 0.85

AnswerA

The component outputs metrics for downstream use.

Why this answer

In Vertex AI Pipelines, a component's YAML definition specifies inputs, outputs, and implementation. Setting `threshold_accuracy` to 0.85 defines a parameter that the component can use internally, but by itself it does not trigger deployment or cause failure. The component's expected behavior is to output evaluation metrics, and the pipeline can then use those metrics in conditional logic (e.g., via `Condition` or `if/else` tasks) to decide subsequent steps, such as model deployment or retraining.

Exam trap

Google Cloud often tests the misconception that setting a threshold in a component's YAML automatically enforces that threshold (e.g., causing failure or deployment), when in reality the YAML only defines the interface and the component's code must explicitly implement such logic.

How to eliminate wrong answers

Option B is wrong because Vertex AI Pipeline components do not inherently deploy models; deployment is a separate step typically handled by a deployment component or a pipeline condition that triggers a deployment task. Option C is wrong because if `threshold_accuracy` is not provided, the component will either use a default value defined in the YAML or fail validation, depending on whether the input is required; it does not simply ignore it. Option D is wrong because the component does not fail when accuracy is below the threshold; it merely outputs the metrics, and the pipeline logic (e.g., a conditional branch) must be explicitly configured to handle such cases.

Full explanation →

129

MCQmedium

Refer to the exhibit. A developer sees this log entry when trying to get a prediction. What is the most likely cause?

A.The model ID is incorrect

B.The model version is not deployed

C.The endpoint does not exist

D.The project ID is wrong

AnswerB

A model version must be deployed to an endpoint to serve predictions; 'not found' suggests it is not deployed.

Why this answer

The log entry indicates that the model version specified in the request is not currently deployed to the serving infrastructure. In Google Cloud's Vertex AI, a model version must be explicitly deployed to an endpoint before it can serve predictions; attempting to predict against a non-deployed version returns an error. This is the most likely cause because the error message directly references the model version's deployment status.

Exam trap

Google Cloud often tests the distinction between model registry operations (uploading, versioning) and serving operations (deploying, predicting), trapping candidates who assume any model version in the registry is automatically available for predictions.

How to eliminate wrong answers

Option A is wrong because an incorrect model ID would typically result in a 'Model not found' or 'Invalid model' error, not a deployment status error. Option C is wrong because a non-existent endpoint would produce an 'Endpoint not found' or 'Resource not found' error, not a version deployment issue. Option D is wrong because a wrong project ID would cause an authentication or permission error (e.g., 'Project not found' or 'Permission denied'), not a model version deployment error.

Full explanation →

130

MCQeasy

Refer to the exhibit. A subscriber is unable to pull messages from the topic. What is the most likely cause?

A.The service account has the subscriber role but the topic is not configured correctly.

B.The service account needs roles/pubsub.viewer to list subscriptions.

C.No subscription has been created for the topic.

D.The service account lacks roles/pubsub.publisher.

AnswerC

A subscription is required to pull messages; the topic only provides the ability to publish.

Why this answer

Option C is correct because a subscription must exist for pulling messages; the topic alone is not enough. Option A is wrong because holding the subscriber role does not help if there is no subscription to pull from. Option B (viewer role) is irrelevant to pulling messages. Option D (publisher role) is not needed for subscribers.

Full explanation →

131

Multi-Selecthard

A company uses Cloud Pub/Sub with pull subscriptions to process orders. The application requires at-least-once delivery and the ability to process orders in order per customer_id. Which THREE features should they configure? (Choose three.)

Select 3 answers

A.Configure a dead letter topic

B.Use a push subscription with a HTTPS endpoint

C.Enable ordering keys on the topic

D.Enable message ordering on the subscription

E.Set the subscription's ackDeadline to 600 seconds

AnswersA, C, D

Allows failed messages to be stored without blocking subsequent messages.

Why this answer

Correct answers are A, C, and D. Publishing with an ordering key (here, customer_id) and enabling message ordering on the subscription together ensure that messages with the same key are delivered in order. Dead letter topics allow failed messages to be isolated and reprocessed later without blocking the messages behind them. Pub/Sub delivers at least once by default, so that requirement needs no extra feature.

Option B (using a push subscription) does not inherently improve ordering, and the application already uses pull subscriptions. Option E (ackDeadline of 600 seconds) only lengthens the time before an unacknowledged message is redelivered; it does not provide ordering.
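The per-key ordering guarantee can be pictured with a short simulation. This is plain Python rather than the Pub/Sub client library; the customer IDs and payloads are invented, and the helper `deliver_in_order` is hypothetical.

```python
from collections import defaultdict


def deliver_in_order(messages):
    """Group messages by ordering key, preserving publish order within each key.

    messages: list of (ordering_key, payload) tuples in publish order.
    """
    per_key = defaultdict(list)
    for key, payload in messages:
        per_key[key].append(payload)
    return dict(per_key)


published = [
    ("cust-1", "create"),
    ("cust-2", "create"),
    ("cust-1", "pay"),
    ("cust-1", "ship"),
]
delivered = deliver_in_order(published)
```

Order is preserved within each customer_id, while messages for different customers carry no ordering relationship to each other.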

Full explanation →

132

MCQeasy

A company is running a Cloud Dataflow streaming pipeline that aggregates events in 1-minute windows. They notice that the watermark is lagging significantly behind real-time. What is the most likely cause?

A.A hot key is causing data skew.

B.The window duration is too short.

C.The pipeline was recently updated.

D.The allowed lateness is set too high.

AnswerA

Hot key causes processing delays.

Why this answer

A hot key causes data skew, which means a disproportionate amount of data is assigned to a single key. In Cloud Dataflow, this leads to a single worker processing the bulk of the events, creating a processing bottleneck. The watermark, which tracks the progress of event-time processing, cannot advance until all data for a given window is processed, so the skewed key delays watermark progression significantly behind real-time.
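A quick way to check a sample of records for this kind of skew is to measure each key's share of the total. The sketch below uses only the standard library; the key names and the 50% threshold are assumptions for illustration.

```python
from collections import Counter


def hot_keys(keys, share_threshold=0.5):
    """Return keys that account for more than share_threshold of all records."""
    counts = Counter(keys)
    total = sum(counts.values())
    return sorted(k for k, n in counts.items() if n / total > share_threshold)


# One key dominates the sample, so one worker would receive most of the data.
sample = ["user-1"] * 8 + ["user-2", "user-3"]
skewed = hot_keys(sample)
```

A key that shows up here is a candidate for mitigation, for example by adding a salt to the key so its records spread across workers.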

Exam trap

Google Cloud often tests the misconception that watermark lag is caused by configuration settings like window duration or allowed lateness, rather than by data-level issues like hot keys that create processing bottlenecks.

How to eliminate wrong answers

Option B is wrong because a short window duration does not inherently cause watermark lag; it may increase computational overhead but does not prevent the watermark from advancing based on data arrival. Option C is wrong because a pipeline update (e.g., via a new job version) does not cause persistent watermark lag; it may cause a brief reprocessing delay but not a sustained lag. Option D is wrong because setting allowed lateness too high only affects how long the pipeline waits for late data after the watermark passes; it does not cause the watermark itself to lag behind real-time.

133

MCQmedium

A financial services company uses Cloud Composer to orchestrate daily batch jobs. One job extracts data from MongoDB to Cloud Storage, then loads into BigQuery, and finally runs a Dataflow pipeline for aggregations. The Dataflow job fails intermittently. They want to automatically restart only the failed Dataflow job without re-running the earlier extraction and load. Which Airflow operator configuration should they use?

A.Implement a SlaMiss sensor

B.Use a DAG with depends_on_past=True

C.Set retries=2 on the Dataflow operator

D.Set trigger_rule='one_success' for downstream tasks

AnswerC

Retries automatically re-run the failed task without affecting upstream tasks.

Why this answer

Option C is correct because setting retries=2 on the Dataflow operator instructs Airflow to automatically restart only that specific task upon failure, without affecting upstream tasks (MongoDB extraction, BigQuery load). This isolates the retry to the Dataflow job, preserving the earlier completed work and avoiding redundant data movement.
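
The task-level retry behavior can be illustrated with a small stand-alone simulation. This is a sketch in plain Python, not Airflow code: the `Task` class, `run_pipeline`, and the flaky job are hypothetical, and only the `retries` name mirrors the Airflow parameter from the question. It shows that a retry re-runs the failed task alone, while tasks that already succeeded are left untouched.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    run: Callable[[], None]
    retries: int = 0  # extra attempts allowed after the first failure


def run_pipeline(tasks: List[Task]) -> Dict[str, int]:
    """Run tasks in order and return how many attempts each one needed."""
    attempts: Dict[str, int] = {}
    for task in tasks:
        for attempt in range(1, task.retries + 2):
            attempts[task.name] = attempt
            try:
                task.run()
                break  # success: move on to the next task
            except RuntimeError:
                if attempt == task.retries + 1:
                    raise  # retries exhausted: the whole run fails
    return attempts


calls = {"dataflow": 0}


def flaky_dataflow() -> None:
    """Hypothetical job that fails on its first call and then succeeds."""
    calls["dataflow"] += 1
    if calls["dataflow"] == 1:
        raise RuntimeError("transient Dataflow failure")


attempts = run_pipeline([
    Task("extract_mongodb", lambda: None),
    Task("load_bigquery", lambda: None),
    Task("run_dataflow", flaky_dataflow, retries=2),
])
print(attempts)
# {'extract_mongodb': 1, 'load_bigquery': 1, 'run_dataflow': 2}
```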

Exam trap

Google Cloud often tests the distinction between task-level retry mechanisms and dependency/trigger rules, so the trap here is confusing `retries` (which restarts the failed task) with `trigger_rule` or `depends_on_past` (which only affect task scheduling or downstream execution).

How to eliminate wrong answers

Option A is wrong because SlaMiss sensors are used to detect when tasks have not completed within a defined SLA window, not to trigger automatic retries of failed tasks. Option B is wrong because depends_on_past=True enforces sequential execution order across DAG runs (e.g., today’s task waits for yesterday’s success), but does not provide automatic retry on failure within the same run. Option D is wrong because trigger_rule='one_success' controls downstream task execution based on upstream task outcomes (e.g., if one upstream succeeds, proceed), but does not restart a failed task; it only affects task dependencies.

134

MCQeasy

A data engineer notices that Spark jobs on the Dataproc cluster shown often fail with executor lost errors. What is the most likely reason?

A.All 10 workers are preemptible and can be reclaimed by Compute Engine at any time.

B.The master node has only 4 vCPUs, which may be insufficient for job coordination.

C.The cluster is in a single zone, so a zone failure could cause all workers to shut down.

D.Autoscaling is enabled and scaling down is causing workers to be removed during job execution.

AnswerA

Preemptible VMs can be reclaimed at any time and always stop within 24 hours; Spark executors fail when workers are preempted.

Why this answer

Preemptible VMs in Google Compute Engine can be terminated at any time due to resource contention or other factors, with only 30 seconds notice. If all 10 worker nodes are preemptible, Spark executors running on them will be frequently lost, causing job failures. This is the most direct cause of 'executor lost' errors in a Dataproc cluster.

Exam trap

The trap here is that candidates may overlook the 'all 10 workers are preemptible' detail and instead focus on common misconfigurations like single-zone risk or autoscaling, but the explicit mention of preemptible VMs is the key indicator of frequent, unpredictable executor loss.

How to eliminate wrong answers

Option B is wrong because the master node's vCPUs (4) are typically sufficient for job coordination; executor lost errors are not caused by insufficient master resources but by worker instability. Option C is wrong because a single-zone cluster does not cause frequent executor losses; zone failures are rare and would cause complete cluster failure, not intermittent executor lost errors. Option D is wrong because autoscaling removes workers gracefully, allowing Spark to reschedule tasks before termination; it does not cause the abrupt 'executor lost' errors seen here.

135

Multi-Selectmedium

A data engineer is migrating on-premises Hadoop jobs to Dataproc. Which TWO considerations are important?

Select 2 answers

A.Use Preemptible VMs for worker nodes to reduce cost

B.Use Cloud Storage instead of HDFS for data storage

C.Avoid using Cloud Storage connector to prevent overhead

D.Keep HDFS for better performance

E.Use on-demand VMs for master node to ensure availability

AnswersA, B

Preemptible VMs are cost-effective for fault-tolerant jobs.

Why this answer

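Options A and B are correct: use Cloud Storage instead of HDFS for scalability and cost, and use Preemptible VMs for worker nodes that run fault-tolerant, transient tasks. Option C is wrong because the Cloud Storage connector is what lets Hadoop and Spark jobs read and write Cloud Storage, so it is necessary rather than overhead to avoid. Option D is wrong because keeping data in HDFS is not recommended on Dataproc; it ties the data to the lifetime of the cluster.

Option E is not one of the two because Dataproc master nodes always run on standard (non-preemptible) VMs, so it is not a decision the migration has to make.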

136

MCQhard

A team uses Vertex AI Feature Store for real-time features. They notice that features are frequently missing during prediction serving. What is the best practice to handle missing features?

A.Retrain the model to handle missing values

B.Impute missing values in the serving function

C.Use a default value in the feature store definition

D.Drop the prediction request

AnswerC

Feature store allows defining default values for missing features.

Why this answer

Option C is correct because Vertex AI Feature Store allows you to define a default value for each feature at the time of feature store creation or feature definition. When a feature value is missing during serving, the feature store automatically returns this default value instead of failing or returning null. This ensures that the serving function always receives a valid feature value without requiring custom imputation logic or model retraining.

Exam trap

Google Cloud often tests the misconception that missing values should be handled by the model or serving code, but the correct approach is to leverage the feature store's built-in default value capability to ensure consistency and low latency.

How to eliminate wrong answers

Option A is wrong because retraining the model to handle missing values does not address the root cause of missing features during serving; it only adapts the model to potentially missing inputs, but the feature store should guarantee a value is present. Option B is wrong because imputing missing values in the serving function introduces latency and custom logic that should be handled at the feature store level; Vertex AI Feature Store provides built-in default value support to avoid this. Option D is wrong because dropping the prediction request is a drastic measure that leads to poor user experience and loss of business value; the feature store should gracefully handle missing features with defaults.

137

MCQeasy

A data engineer needs to process large CSV files (hundreds of GB) stored in Cloud Storage using Spark on a Dataproc cluster. The job performs a series of transformations and aggregations. Which configuration is most cost-effective and operationally efficient?

A.Use a cluster with 10 high-memory (n1-highmem-8) VMs as workers to improve shuffle performance.

B.Use a cluster with a standard master node and 10 preemptible worker nodes (n1-standard-4).

C.Use a single-node cluster with a high-memory machine type.

D.Use a cluster with 10 standard (n1-standard-4) VMs as master and worker nodes, all non-preemptible.

AnswerB

Preemptible workers are cost-effective and suitable for fault-tolerant jobs like Spark.

Why this answer

Option B is correct because preemptible workers are significantly cheaper (about 80% discount) and ideal for batch processing of large CSV files where fault tolerance is built into Spark via RDD lineage. Using standard nodes for the master ensures cluster stability, while preemptible workers handle the distributed transformations and aggregations cost-effectively. This configuration balances cost and operational efficiency for ephemeral, fault-tolerant workloads.

Exam trap

Google Cloud often tests the misconception that preemptible VMs are unreliable for all workloads, but in Spark batch processing with fault tolerance, they are both cost-effective and operationally efficient, unlike stateful or latency-sensitive applications.

How to eliminate wrong answers

Option A is wrong because using high-memory VMs (n1-highmem-8) for all workers increases cost unnecessarily; shuffle performance is better addressed by tuning Spark parameters (e.g., spark.sql.shuffle.partitions) and using SSDs, not by over-provisioning memory. Option C is wrong because a single-node cluster cannot process hundreds of GB of data efficiently due to lack of parallelism and memory constraints, and it defeats the purpose of Spark's distributed processing model. Option D is wrong because running every node on non-preemptible standard VMs (n1-standard-4) gives up the cost savings of preemptible workers without adding anything this job needs, since Spark already recovers from lost workers by recomputing their tasks.

138

MCQmedium

A machine learning team wants to deploy a new model version for canary testing, where only 5% of traffic is routed to the new version. Which Vertex AI endpoint configuration supports this?

A.Have the client application randomly select which model to call with 5% probability.

B.Deploy the new version to a separate endpoint and direct 5% of users via a load balancer.

C.Configure the endpoint with traffic split: 95% to old version, 5% to new version.

D.Use an A/B testing framework outside of Vertex AI to compare results.

AnswerC

Vertex AI endpoints allow splitting traffic between deployed models; the platform handles routing.

Why this answer

Vertex AI supports traffic splitting between models deployed to the same endpoint, so Option C is correct. Option A is wrong because it puts traffic control in the client application instead of the endpoint configuration.

Option B is wrong because it deploys the new version to a separate endpoint and relies on an external load balancer. Option D is wrong because A/B testing outside Vertex AI is a process, not an endpoint configuration.
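
A traffic split is essentially a map from deployed-model ID to a percentage, with the percentages adding up to 100. The sketch below is a local simulation under that assumption, not a call to the Vertex AI API: the model IDs, `validate_traffic_split`, and `route` are hypothetical names used only to show what a 95/5 canary split does to incoming requests.

```python
import random
from collections import Counter
from typing import Dict


def validate_traffic_split(split: Dict[str, int]) -> None:
    """Percentages must be non-negative and sum to exactly 100."""
    if any(p < 0 for p in split.values()):
        raise ValueError("percentages must be non-negative")
    if sum(split.values()) != 100:
        raise ValueError("percentages must sum to 100")


def route(split: Dict[str, int], rng: random.Random) -> str:
    """Pick a deployed model for one request, weighted by the split."""
    models = list(split)
    return rng.choices(models, weights=[split[m] for m in models], k=1)[0]


# Hypothetical deployed-model IDs: 95% stays on the old version, 5% is the canary.
traffic_split = {"model-v1": 95, "model-v2": 5}
validate_traffic_split(traffic_split)

rng = random.Random(42)  # seeded so repeated runs give the same counts
counts = Counter(route(traffic_split, rng) for _ in range(10_000))
print(counts)
```

Over 10,000 simulated requests, roughly 500 reach the canary. Because the platform does the routing, the client code never needs to know that two versions exist.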

139

MCQmedium

A data engineering team needs to build a data integration pipeline that involves connecting to multiple sources, performing data transformations with visual editing, and then running custom machine learning algorithms. The team has both data analysts and data scientists. Which approach is most suitable?

A.Use Cloud Composer to orchestrate both Data Fusion and Dataproc

B.Use only Cloud Dataproc for all steps

C.Use only Cloud Data Fusion for all steps

D.Use Cloud Data Fusion for the initial ingestion and transformations, then export the data to Cloud Dataproc for the ML algorithms

AnswerD

This leverages the strengths of both services: visual integration and custom ML.

Why this answer

Option D is correct because it leverages Cloud Data Fusion's visual, no-code interface for data ingestion and transformation, which is ideal for data analysts, and then exports the prepared data to Cloud Dataproc, which provides native support for custom machine learning algorithms using Spark or Hadoop, meeting the data scientists' needs. This separation of concerns optimizes the pipeline for both user groups and avoids forcing all tasks into a single tool that may not excel at both visual ETL and custom ML.

Exam trap

Google Cloud often tests the misconception that a single tool can handle both visual ETL and custom ML, leading candidates to choose Cloud Data Fusion alone (Option C) without realizing it lacks native support for running custom algorithms like Spark MLlib or TensorFlow.

How to eliminate wrong answers

Option A is wrong because Cloud Composer is an orchestration tool (based on Apache Airflow) that manages workflow dependencies and scheduling, but it does not perform data transformations or run ML algorithms itself; using it to orchestrate both Data Fusion and Dataproc adds unnecessary complexity and does not directly address the need for visual editing or custom ML execution. Option B is wrong because Cloud Dataproc is a managed Spark/Hadoop service that requires coding for data transformations, which does not provide the visual editing capabilities needed by data analysts, and it would force all team members to write code, reducing productivity. Option C is wrong because Cloud Data Fusion is designed for visual ETL and data integration but lacks native support for running custom machine learning algorithms; it can only trigger external services like Dataproc for such tasks, making it insufficient for the ML step.

140

Multi-Selecteasy

A data engineer is setting up CI/CD for a machine learning model using Cloud Build and Vertex AI. Which two components are essential? (Select 2)

Select 2 answers

A.Cloud Storage for datasets

B.Container Registry for model images

C.Cloud Source Repositories

D.Vertex AI Endpoints for deployment

E.Cloud Functions for triggers

AnswersB, D

Model images must be stored and versioned in a registry like Container Registry to deploy to Vertex AI.

Why this answer

Container Registry stores model images, and Vertex AI Endpoints hosts the deployed model. Both are essential in a CI/CD pipeline for ML.

141

MCQeasy

A company wants to stream data from Cloud Pub/Sub into BigQuery with minimal latency. They have a small team and limited operational resources. Which approach is best?

A.Write a custom application on Compute Engine that polls Pub/Sub and writes to BigQuery.

B.Create a Dataproc cluster running a Spark Streaming job.

C.Create a Cloud Function that writes to BigQuery.

D.Use a Dataflow pipeline with a BigQuery subscription.

AnswerD

Serverless and low maintenance.

Why this answer

Option D is correct because it is a fully managed, serverless path from Pub/Sub to BigQuery with minimal latency. Dataflow handles autoscaling, checkpointing, and exactly-once processing, which suits a team with limited operational resources, and the Google-provided Pub/Sub to BigQuery template removes the need for custom code or cluster management. A BigQuery subscription is itself a Pub/Sub subscription type that writes messages directly to a BigQuery table, so when no transformation is needed it keeps the operational overhead even lower.

Exam trap

Google Cloud often tests the misconception that a simple serverless function (Cloud Function) is sufficient for streaming workloads, but candidates overlook that Cloud Functions are designed for event-driven, short-lived tasks and lack the state management, exactly-once guarantees, and sustained throughput needed for continuous data ingestion into BigQuery.

How to eliminate wrong answers

Option A is wrong because writing a custom application on Compute Engine requires the team to manage polling logic, handle failures, and scale instances manually, which contradicts the requirement for minimal operational resources and introduces unnecessary latency and complexity. Option B is wrong because creating a Dataproc cluster running a Spark Streaming job introduces significant operational overhead for cluster provisioning, scaling, and maintenance, and Spark Streaming's micro-batch model typically adds latency compared with Dataflow's streaming engine, making it suboptimal for minimal latency. Option C is wrong because a Cloud Function that writes to BigQuery is not designed for continuous streaming; Cloud Functions are triggered per event and have a maximum timeout (9 minutes for event-driven functions, up to 60 minutes for 2nd gen HTTP functions), which can lead to throttling, out-of-order writes, and lack of exactly-once semantics, making it unsuitable for sustained, low-latency streaming into BigQuery.

142

Multi-Selectmedium

A company deploys a TensorFlow model on Vertex AI for online predictions. They want to monitor model performance in production to detect degradation. Which TWO practices should they implement? (Choose 2.)

Select 2 answers

A.Use a separate endpoint for shadow testing new model versions.

B.Log prediction requests and responses to Cloud Logging and analyze distribution metrics.

C.Set up Cloud Monitoring alerts for high prediction latency.

D.Schedule daily retraining of the model regardless of monitoring alerts.

E.Enable Vertex AI Model Monitoring for feature drift and skew detection on the deployed model.

AnswersB, E

Analyzing request distributions can detect changes in input data patterns that may affect model performance.

Why this answer

Option B is correct because logging prediction requests and responses to Cloud Logging allows you to analyze distribution metrics (e.g., mean, variance, quantiles) over time. This enables detection of data drift or performance degradation by comparing live distributions against baseline distributions, which is a standard monitoring practice for production ML models.
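
Comparing a live distribution against a baseline needs a distance measure. The sketch below uses Jensen-Shannon divergence over histogram counts as one common choice; the bucket counts and the alert threshold are made-up illustration values, and nothing here is taken from a specific Vertex AI configuration.

```python
import math
from typing import List, Sequence


def normalize(counts: Sequence[float]) -> List[float]:
    """Turn histogram counts into probabilities."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) in bits; buckets where p is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p_counts: Sequence[float], q_counts: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical, 1 for disjoint."""
    p, q = normalize(p_counts), normalize(q_counts)
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical histograms of one feature: baseline window vs. live traffic.
baseline = [50, 30, 20]
live = [20, 30, 50]
THRESHOLD = 0.05  # assumed alert threshold, chosen only for this example

score = js_divergence(baseline, live)
print(f"JS divergence: {score:.4f}, alert: {score > THRESHOLD}")
```

Running the same comparison on each new batch of logged requests turns raw logs into a time series that can be charted or alerted on.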

Exam trap

Google Cloud often tests the distinction between monitoring for model degradation (data drift/skew) versus monitoring for operational issues (latency, errors), leading candidates to confuse infrastructure alerts with model performance monitoring.

143

MCQeasy

You are responsible for monitoring a production ML model on Vertex AI. The model predicts loan approval probability. The business team reports that the model's predictions are becoming less accurate over the last week. You check the model's monitoring dashboard and see that the prediction distribution has changed significantly. What is the most likely issue?

A.The model is suffering from overfitting to the training data.

B.There is a bug in the model's preprocessing code.

C.There is data drift in the input features.

D.The model is experiencing concept drift.

AnswerD

Concept drift means the underlying relationship between features and target has changed, causing prediction distribution to shift and accuracy to drop.

Why this answer

The correct answer is D because concept drift occurs when the underlying relationship between input features and the target variable changes over time, causing the model's predictions to become less accurate even if the input data distribution remains stable. In this scenario, the prediction distribution has changed significantly, which is a hallmark of concept drift, as the model's learned decision boundary no longer reflects the current real-world patterns. Vertex AI's monitoring dashboard can track prediction distribution shifts, and this symptom points to concept drift rather than data drift.

Exam trap

Google Cloud often tests the distinction between data drift and concept drift, and the trap here is that candidates see 'prediction distribution has changed' and incorrectly assume it must be data drift, when in fact a change in prediction distribution without a change in input features is a classic sign of concept drift.

How to eliminate wrong answers

Option A is wrong because overfitting to the training data is a static issue that would manifest as poor generalization from the start, not as a sudden degradation in accuracy over the last week; overfitting does not cause a change in prediction distribution over time. Option B is wrong because a bug in the model's preprocessing code would likely cause consistent, systematic errors or failures, not a gradual shift in prediction distribution over a week; preprocessing bugs are typically static and would be caught during deployment. Option C is wrong because data drift refers to changes in the input feature distribution, which would be detected by monitoring input feature statistics, not directly by a change in prediction distribution; the question states the prediction distribution has changed, which is more directly tied to concept drift.

144

MCQmedium

After deploying a model, the team notices that predictions are significantly different from training data distribution. What should they do?

A.Update the model endpoint

B.Review the training data pipeline

C.Set up Vertex AI Model Monitoring for skew detection

D.Retrain the model with new data

AnswerC

Model Monitoring provides continuous tracking of distribution differences.

Why this answer

Vertex AI Model Monitoring is specifically designed to detect skew between training data and serving data, including prediction drift. When predictions differ significantly from the training distribution, this indicates a skew or drift issue that Model Monitoring can alert on, enabling proactive investigation. Updating the endpoint or retraining without diagnosis would not address the root cause, and reviewing the pipeline alone does not provide ongoing detection.

Exam trap

Google Cloud often tests the distinction between reactive troubleshooting (reviewing pipelines, retraining) and proactive monitoring (skew detection), tempting candidates to choose a fix like retraining instead of the monitoring solution that detects the issue first.

How to eliminate wrong answers

Option A is wrong because updating the model endpoint does not diagnose or resolve the distribution mismatch; it only changes the serving target without addressing the underlying data or model behavior. Option B is wrong because reviewing the training data pipeline is a reactive, one-time investigation step, whereas the question describes a deployed model scenario where continuous monitoring is needed to detect and alert on skew in real time. Option D is wrong because retraining with new data without first understanding the cause of the skew may introduce new biases or fail to fix the issue; monitoring should be used to detect and diagnose before retraining.

145

MCQeasy

An online retailer uses BigQuery for analytics. They have a time-series table with 5 billion rows and new data arrives every day. They want to optimize query performance and reduce costs by ensuring that queries scan only the partitions they need. Which table design should they use?

A.Use a table partitioned on the timestamp column.

B.Use a table clustered on the timestamp column.

C.Use a table with no partitioning but use LIMIT in queries.

D.Use a table partitioned by ingestion time with a partition expiration.

AnswerA

Allows queries to scan only relevant time-range partitions.

Why this answer

Partitioning on the timestamp column allows BigQuery to perform partition pruning, so queries with filters on that column only scan the relevant partitions. This directly reduces the amount of data read, lowering both query cost (pay-per-byte) and improving performance. For a 5-billion-row table with daily data arrival, time-unit partitioning is the standard design to meet the stated goals.
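
The effect of partition pruning can be shown with a toy model of a date-partitioned table. This is a sketch only: the 30-day table, the 1,000 rows per day, and the function names are assumptions, and it counts rows rather than billed bytes.

```python
from datetime import date, timedelta

ROWS_PER_DAY = 1000  # assumed daily volume for this illustration

# Toy table: one partition per day for 30 days, mapped to its row count.
partitions = {
    date(2024, 1, 1) + timedelta(days=offset): ROWS_PER_DAY
    for offset in range(30)
}


def rows_scanned_partitioned(lo: date, hi: date) -> int:
    """Only partitions whose date falls inside [lo, hi] are opened."""
    return sum(rows for day, rows in partitions.items() if lo <= day <= hi)


def rows_scanned_unpartitioned() -> int:
    """Without partitions, every row is read before the filter applies."""
    return sum(partitions.values())


# A query filtered to one week of data.
week_start, week_end = date(2024, 1, 10), date(2024, 1, 16)
print("partitioned:  ", rows_scanned_partitioned(week_start, week_end))  # 7000
print("unpartitioned:", rows_scanned_unpartitioned())                    # 30000
```

The same ratio carries over to cost under pay-per-byte pricing, provided the query filters on the partitioning column.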

Exam trap

Google Cloud often tests the distinction between partitioning (which prunes data at the storage level) and clustering (which only sorts data within a partition or table), leading candidates to mistakenly believe clustering alone can reduce bytes scanned for time-range queries.

How to eliminate wrong answers

Option B is wrong because clustering only sorts data within partitions or within the table, but does not enable partition pruning; without partitioning, queries still scan the entire table unless a filter matches the clustering key, and clustering alone does not reduce the bytes billed to only the needed time range. Option C is wrong because using LIMIT does not reduce the amount of data scanned; BigQuery still reads all bytes from the entire table before applying the LIMIT, so costs remain high and performance is not improved. Option D is wrong because partitioning by ingestion time (using _PARTITIONTIME or _PARTITIONDATE) only works for append-only streaming or load jobs and does not allow querying on an arbitrary timestamp column; also, partition expiration would delete old data automatically, but the requirement is to scan only needed partitions, not to expire them.

146

MCQhard

A financial services company needs to explain predictions from a complex ensemble model for regulatory compliance. Which Vertex AI service should they use?

A.Vertex AI Explainable AI

B.Vertex AI Vizier

C.Vertex AI Feature Store

D.Vertex AI Prediction

AnswerA

Provides explanations via feature attributions.

Why this answer

Vertex AI Explainable AI is the correct service because it provides feature attributions and other explainability techniques (e.g., Shapley value approximations, integrated gradients) that help interpret predictions from complex ensemble models. This is essential for regulatory compliance, where the company must demonstrate how input features influence each prediction, ensuring transparency and auditability.
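
To make feature attribution concrete, the sketch below computes exact Shapley values for a tiny scoring function by averaging each feature's marginal contribution over every feature ordering. The model, feature names, and baseline are hypothetical, and this brute-force approach only works for a handful of features; production services rely on sampled approximations instead.

```python
from itertools import permutations
from typing import Callable, Dict


def shapley_values(
    predict: Callable[[Dict[str, float]], float],
    instance: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Average marginal contribution of each feature over all orderings."""
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)      # start from the baseline input
        previous = predict(current)
        for feature in order:
            current[feature] = instance[feature]  # switch one feature on
            updated = predict(current)
            totals[feature] += updated - previous
            previous = updated
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(x: Dict[str, float]) -> float:
    """Hypothetical model with one additive term and one interaction term."""
    return 0.5 * x["income"] - 2.0 * x["debt"] * x["late_payments"]


instance = {"income": 4.0, "debt": 1.0, "late_payments": 3.0}
baseline = {"income": 0.0, "debt": 0.0, "late_payments": 0.0}

attributions = shapley_values(toy_score, instance, baseline)
print(attributions)
# {'income': 2.0, 'debt': -3.0, 'late_payments': -3.0}
```

The attributions sum to the difference between the prediction for the instance (-4.0) and for the baseline (0.0), and the interaction term is shared equally between the two features that produce it.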

Exam trap

Google Cloud often tests the distinction between services that optimize or deploy models versus those that interpret them, so the trap here is assuming that Vertex AI Prediction includes built-in explainability, when in fact it only serves predictions and requires a separate Explainable AI request for attributions.

How to eliminate wrong answers

Option B (Vertex AI Vizier) is wrong because it is a hyperparameter tuning and optimization service, not designed for explaining model predictions. Option C (Vertex AI Feature Store) is wrong because it serves as a centralized repository for feature management and serving, not for generating post-hoc explanations of model outputs. Option D (Vertex AI Prediction) is wrong because it handles model deployment and online/batch inference requests, but does not natively provide interpretability or attribution explanations for individual predictions.

147

MCQhard

Refer to the exhibit. A data scientist notices that the evaluation component rarely passes the threshold, causing the pipeline to fail often. What should they do to improve efficiency?

A.Reduce the training dataset size

B.Add a conditional component that only runs evaluation if training metrics are above a certain level

C.Remove the evaluation component

D.Increase the threshold value

AnswerB

Conditional execution saves cost and time by skipping evaluation on underperforming models.

Why this answer

Adding a conditional component that only runs evaluation when training metrics exceed a certain threshold prevents unnecessary evaluation runs on poorly performing models. This reduces pipeline failures by ensuring that evaluation, which may be resource-intensive or prone to failure with low-quality inputs, is only triggered when the model has demonstrated sufficient training performance. This approach optimizes resource usage and pipeline reliability without sacrificing the evaluation step entirely.

Exam trap

Google Cloud often tests the misconception that simply adjusting thresholds or removing components is the solution, when the correct approach is to add conditional logic to gate resource-intensive steps based on upstream quality metrics.

How to eliminate wrong answers

Option A is wrong because reducing the training dataset size would likely degrade model quality and does not address the root cause of evaluation failures; it may even increase variance and instability. Option C is wrong because removing the evaluation component entirely would eliminate the ability to validate model performance, which is critical for ensuring model quality and compliance in production pipelines. Option D is wrong because increasing the threshold value would make it even harder for the evaluation component to pass, exacerbating the failure rate rather than improving efficiency.

148

MCQmedium

A company uses BigQuery ML to create a classification model. The model is used for batch prediction on a weekly basis. After six months, the data distribution shifts, and model accuracy drops. Which approach should the company take to maintain model performance?

A.Use Cloud Dataflow to preprocess the data and then update the model with new features.

B.Perform hyperparameter tuning on the original training data.

C.Apply model quantization to reduce model size and improve inference speed.

D.Schedule automatic retraining of the model using the most recent three months of data.

AnswerD

Retraining on recent data adapts to distribution shift.

Why this answer

Option D is correct because the model's accuracy drop is caused by a shift in the data distribution. Scheduling automatic retraining on the most recent three months of data lets the model adapt to the new patterns without manual intervention. In BigQuery ML, a scheduled query can rerun a `CREATE OR REPLACE MODEL` statement, which makes this approach practical and consistent with MLOps best practices for batch prediction pipelines.
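
A retraining statement of this kind might look like the one generated below. This is a sketch under assumptions: the project, dataset, table, and column names are placeholders, the date column is assumed to be of type DATE, and logistic regression is used only because the question describes a classification model. The helper just builds the SQL text; running it on a schedule would be configured separately.

```python
def build_retrain_sql(
    model: str,
    source_table: str,
    label: str,
    date_column: str,
    months: int = 3,
) -> str:
    """Build a BigQuery ML statement that retrains on a rolling window.

    Identifiers are interpolated directly, so pass trusted values only.
    """
    if months <= 0:
        raise ValueError("months must be positive")
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['{label}']) AS\n"
        f"SELECT * EXCEPT ({date_column})\n"
        f"FROM `{source_table}`\n"
        f"WHERE {date_column} >= DATE_SUB(CURRENT_DATE(), INTERVAL {months} MONTH)"
    )


# Placeholder names for illustration only.
sql = build_retrain_sql(
    model="my_project.my_dataset.churn_model",
    source_table="my_project.my_dataset.training_events",
    label="churned",
    date_column="event_date",
)
print(sql)
```

Because the window is relative to `CURRENT_DATE()`, every scheduled run trains on the latest three months without anyone editing the statement.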

Exam trap

Google Cloud often tests the misconception that hyperparameter tuning or feature engineering alone can fix data drift, when in fact only retraining on fresh data addresses the shift.

How to eliminate wrong answers

Option A is wrong because Cloud Dataflow is a data processing tool, not a solution for retraining; preprocessing and adding new features does not address the distribution shift unless the model is retrained on the new data. Option B is wrong because hyperparameter tuning on the original training data optimizes the model for the old distribution, not the shifted one, and will not recover accuracy. Option C is wrong because model quantization reduces model size and speeds up inference but does not improve accuracy or address data drift; it may even slightly degrade performance.

149

Multi-Selecteasy

Which TWO are valid approaches to handle late-arriving data in a Cloud Dataflow streaming pipeline?

Select 2 answers

A.Change to processing time windows instead of event time windows

B.Set allowed lateness on the window

C.Use a side input with a fixed window to join late data

D.Discard any events that arrive after the window closes

E.Use a trigger that fires every second

AnswersB, C

Allowed lateness tells the pipeline how long to wait for late data.

Why this answer

Option B is correct because setting allowed lateness on a window in Cloud Dataflow allows the pipeline to wait for late-arriving data within a specified duration after the watermark passes the window end. This is a standard mechanism to handle out-of-order or delayed events without discarding them, ensuring completeness of windowed aggregations.
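
The rule can be sketched as a small classifier that compares the watermark with the end of an event's window. This is a simplified model in plain Python, not Apache Beam code: the `FixedWindow` class, the integer-second timestamps, and the exact boundary handling are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedWindow:
    size: int              # window length in seconds
    allowed_lateness: int  # extra seconds during which late events are kept

    def window_end(self, event_time: int) -> int:
        """End (exclusive) of the fixed window containing event_time."""
        return (event_time // self.size + 1) * self.size

    def classify(self, event_time: int, watermark: int) -> str:
        """Decide what happens to an event given the current watermark."""
        end = self.window_end(event_time)
        if watermark < end:
            return "on_time"            # window has not closed yet
        if watermark < end + self.allowed_lateness:
            return "late_but_accepted"  # inside the allowed lateness
        return "dropped"                # arrived too late to be included


# One-minute windows that accept events up to two minutes late.
window = FixedWindow(size=60, allowed_lateness=120)

# An event with event time 30 belongs to the window [0, 60).
for watermark in (45, 90, 200):
    print(watermark, window.classify(event_time=30, watermark=watermark))
# 45 on_time
# 90 late_but_accepted
# 200 dropped
```

A longer allowed lateness keeps more late events at the cost of holding window state for longer.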

Exam trap

Google Cloud often tests the misconception that processing time windows are a valid substitute for handling late data, but they fundamentally change the semantics from event-time to processing-time, which is not a proper solution for late-arriving events.

150

Multi-Selectmedium

Which TWO actions should you take to ensure model reliability in a production Vertex AI Endpoint?

Select 2 answers

A.Use only batch predictions to avoid real-time issues

B.Monitor prediction accuracy in production with logging and alerts

C.Disable request/response logging to reduce latency

D.Use a single model endpoint for all traffic

E.Gradually shift traffic to new model versions (canary deployment)

AnswersB, E

Detects model degradation.

Why this answer

Monitoring prediction accuracy with logging and alerts (B) is essential for detecting model drift, data drift, and performance degradation in production. Vertex AI provides model monitoring features that automatically log prediction requests and responses, compute statistics, and trigger alerts when skew or drift thresholds are breached, enabling proactive remediation.

Exam trap

Google Cloud often tests the misconception that disabling logging improves reliability by reducing latency, when in fact it removes the observability needed to detect and diagnose failures, which is a core tenet of MLOps reliability.
