Google Professional Data Engineer (PDE) — Questions 175

499 questions total · 7 pages · All types, answers revealed

1
MCQ · medium

Your team uses a CI/CD pipeline with Cloud Build to train and deploy ML models on Vertex AI. You want to ensure that only models that pass validation checks (e.g., accuracy threshold, fairness metrics) are promoted to production. What is the best way to implement this?

A. Use Cloud Scheduler to trigger retraining and only deploy if the new model outperforms the previous one on a holdout set.
B. Use Vertex AI Model Registry's automatic promotion feature that moves models to production based on evaluation results.
C. Configure Cloud Functions to re-evaluate the model daily and promote if it passes.
D. In the Cloud Build pipeline, after training, run validation scripts. If validation passes, deploy to a staging endpoint for manual approval, then promote to production.
Answer: D

This ensures automated validation before any deployment, with a manual approval gate before production.

Why this answer

Option D is correct because it integrates validation directly into the CI/CD pipeline using Cloud Build, ensuring that only models passing specific checks (e.g., accuracy threshold, fairness metrics) are promoted. By running validation scripts after training and requiring manual approval before production promotion, this approach provides both automated gatekeeping and human oversight, aligning with MLOps best practices for safe model deployment.
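As a rough illustration of the validation step described above, the sketch below shows a script that a build step could run after training. A build step that exits with a non-zero status fails the build, so later deployment steps never run. The metric names and threshold values are hypothetical, not taken from the question.

```python
import sys

# Hypothetical thresholds; real values depend on the team's requirements.
THRESHOLDS = {"accuracy": 0.90, "fairness_gap": 0.05}


def validate(metrics: dict) -> list:
    """Return a list of human-readable failures; an empty list means pass."""
    failures = []
    if metrics["accuracy"] < THRESHOLDS["accuracy"]:
        failures.append("accuracy below threshold")
    if metrics["fairness_gap"] > THRESHOLDS["fairness_gap"]:
        failures.append("fairness gap above threshold")
    return failures


def main(metrics: dict) -> int:
    """Exit code for the build step: 0 lets the pipeline continue, 1 stops it."""
    failures = validate(metrics)
    for failure in failures:
        print(f"VALIDATION FAILED: {failure}", file=sys.stderr)
    return 1 if failures else 0
```

In a real pipeline the script would end with something like `sys.exit(main(loaded_metrics))`, where the metrics are read from the evaluation output of the training step.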

Exam trap

Google Cloud often tests the misconception that Vertex AI Model Registry has built-in automatic promotion based on evaluation metrics, but in reality, it requires external orchestration (like Cloud Build) to implement such logic.

How to eliminate wrong answers

Option A is wrong because Cloud Scheduler triggers retraining on a schedule, not based on validation results, and it does not integrate with the CI/CD pipeline to enforce promotion gates. Option B is wrong because Vertex AI Model Registry does not have an automatic promotion feature based on evaluation results; it stores and manages models but requires external logic to decide promotion. Option C is wrong because Cloud Functions re-evaluating the model daily is reactive and does not tie into the build pipeline's validation step, potentially promoting a model that was not validated at the time of training.

2
MCQ · easy

Refer to the exhibit. An auditor sees the following output from `gcloud ai models list`. What can they conclude about versioning?

A. The model is deployed on a single endpoint
B. The model has two versions with v2 being the latest
C. Only the latest version is available
D. The model is automatically scaled
Answer: B

Two distinct versions are shown; v2 has a later timestamp.

Why this answer

The `gcloud ai models list` output shows two model versions (v1 and v2) under the same model resource. The default traffic split or the listed order indicates v2 is the latest version. This directly confirms that the model has two versions, with v2 being the latest, making option B correct.

Exam trap

Google Cloud often tests that candidates confuse model versioning with endpoint deployment details, leading them to assume a single endpoint or automatic scaling from a model list output that contains no such information.

How to eliminate wrong answers

Option A is wrong because the output does not show any endpoint information; model versions can be deployed to multiple endpoints or not deployed at all. Option C is wrong because the output explicitly lists two versions (v1 and v2), so both are available, not just the latest. Option D is wrong because the output provides no scaling configuration or metrics; autoscaling is a deployment setting, not a model version property.

3
MCQ · hard

A company processes IoT sensor data in near real-time. They ingest data via Cloud Pub/Sub, then a Dataflow streaming pipeline writes to Bigtable for low-latency queries. Recently, they observed increased Pub/Sub message backlog during traffic spikes. What is the most effective scaling strategy?

A. Increase Pub/Sub subscription throughput by increasing the number of partitions
B. Increase Dataflow worker count and adjust autoscaling configuration
C. Use a Cloud Scheduler to throttle Pub/Sub publishing
D. Add a Cloud Function to pre-process messages before they are consumed by Dataflow
Answer: B

Dataflow autoscaling can handle backlogs if enough workers are provisioned; increasing the max number of workers allows the pipeline to catch up during spikes.

Why this answer

The correct answer is B because the increased Pub/Sub backlog during traffic spikes indicates that the Dataflow pipeline is unable to consume messages as fast as they are being published. Increasing the Dataflow worker count and adjusting autoscaling configuration allows the pipeline to scale horizontally, processing more messages per second and reducing the backlog. Pub/Sub itself is designed to handle high throughput, so the bottleneck is the consumer (Dataflow), not the ingestion layer.
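The reasoning that the consumer is the bottleneck can be made concrete with a back-of-the-envelope calculation. The sketch below is not a Dataflow API; the function name and all rates are hypothetical, and it assumes throughput scales linearly with worker count.

```python
import math


def workers_needed(publish_rate: float, per_worker_rate: float,
                   backlog: float, drain_seconds: float) -> int:
    """Workers required to keep up with new messages and clear the backlog in time.

    Rates are in messages per second; backlog is a message count.
    """
    # The pipeline must absorb live traffic plus a share of the backlog each second.
    required_rate = publish_rate + backlog / drain_seconds
    return math.ceil(required_rate / per_worker_rate)
```

For example, with 10,000 messages per second arriving, 1,000 messages per second per worker, and a 600,000-message backlog to clear in 10 minutes, `workers_needed(10000, 1000, 600000, 600)` gives 11, which is the kind of headroom a maximum worker setting would need to allow.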

Exam trap

The trap here is that candidates mistakenly think Pub/Sub's throughput is limited by partitions (like Kafka) or that throttling the publisher is a valid scaling strategy, when in fact the bottleneck is the streaming pipeline's processing capacity, which must be scaled horizontally.

How to eliminate wrong answers

Option A is wrong because Pub/Sub does not use partitions like Kafka; increasing partitions is not a valid concept for Pub/Sub subscriptions, and throughput is managed by the subscriber's ability to pull messages, not by partitioning. Option C is wrong because throttling Pub/Sub publishing with Cloud Scheduler would reduce the incoming data rate, but this is counterproductive for near real-time processing and does not address the root cause of insufficient consumer capacity. Option D is wrong because adding a Cloud Function to pre-process messages would introduce an additional processing step that could further increase latency and does not directly solve the Dataflow pipeline's inability to keep up with the message volume.

4
Multi-Select · medium

A Dataflow streaming job is processing data from Pub/Sub and writing to BigQuery. The job is stuck with the message 'No progress has been made' for several minutes. Which TWO actions should the team take to troubleshoot and resolve the issue? (Choose TWO.)

Select 2 answers
A. Set the updateCompatibility flag to true and restart the pipeline.
B. Increase the persistent disk size for all workers to reduce I/O contention.
C. Examine the worker logs in Cloud Logging for any error messages or exceptions.
D. Force stop the pipeline and update it with a new version using the --update flag.
E. Enable Dataflow Streaming Engine to move state to the backend and reduce worker load.
Answers: C, E

C is correct because logs can reveal the root cause, such as out-of-memory errors or stuck transforms.

Why this answer

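Option C is correct because examining worker logs in Cloud Logging is the first step to identify the root cause of a stuck pipeline. Common issues like out-of-memory errors, serialization failures, or worker crashes are logged there, and without inspecting logs, troubleshooting is guesswork. Option E is correct because Streaming Engine moves window state and shuffle storage off the worker VMs into the Dataflow service backend, which reduces memory and disk pressure on the workers.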

Exam trap

Google Cloud often tests the misconception that increasing resources (like disk size) or restarting the pipeline is the default fix, when in reality the first step is always to inspect logs to understand the failure mode.

5
Multi-Select · easy

Which TWO actions can reduce the cost of running a Dataproc cluster for a nightly batch job?

Select 2 answers
A. Increase the number of worker nodes for faster processing.
B. Use high-memory machine types for master node.
C. Use preemptible VMs for worker nodes.
D. Attach local SSDs to all nodes.
E. Delete the cluster after the job completes.
Answers: C, E

Preemptible VMs are much cheaper.

Why this answer

Preemptible VMs (Option C) are significantly cheaper than standard VMs because Compute Engine can terminate them at any time, making them ideal for fault-tolerant, stateless batch jobs like nightly data processing on Dataproc. Deleting the cluster after the job completes (Option E) eliminates ongoing compute costs for idle resources, which is a best practice for ephemeral workloads.

Exam trap

Google Cloud often tests the misconception that scaling up resources (more nodes or faster hardware) always reduces cost by shortening runtime, but in reality, the increased per-hour cost usually outweighs the time savings for batch jobs.

6
MCQ · medium

A company has a production model deployed on Vertex AI that shows declining accuracy over time. The model uses features from a BigQuery feature store. The data science team suspects data drift. What is the most efficient way to monitor and detect drift for this model?

A. Enable Vertex AI Model Monitoring on the endpoint to automatically detect skew and drift
B. Periodically export training data and production data to CSV and compare distributions manually
C. Create a scheduled retraining pipeline that runs weekly
D. Set up Cloud Monitoring dashboards to track prediction request volumes and error rates
Answer: A

Vertex AI Model Monitoring provides built-in drift detection for deployed models.

Why this answer

Option A is correct because Vertex AI Model Monitoring can automatically monitor prediction input data for skew and drift and send alerts. Option B (manual comparison of exported CSV files) is not efficient. Option C (a scheduled retraining pipeline) is reactive, not proactive monitoring. Option D (Cloud Monitoring dashboards) can show request volumes and error rates but does not automatically detect drift.
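To make "drift" concrete, the sketch below compares the distribution of a categorical feature in a baseline sample against a serving sample using the L-infinity distance (the largest per-category difference in relative frequency). This is a hand-rolled illustration of the idea only, not the managed service's implementation; the function names and the 0.1 alert threshold in the usage note are assumptions.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each category in a sample."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(baseline, serving):
    """Largest absolute difference in category frequency between two samples."""
    p = distribution(baseline)
    q = distribution(serving)
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)
```

A monitoring job would compute this per feature on a schedule and raise an alert when the distance exceeds a chosen threshold, for example `l_infinity_distance(train_sample, recent_requests) > 0.1`.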

7
MCQ · easy

A company is designing a streaming data pipeline to process real-time clickstream events. They need to aggregate events by session window with a 5-minute gap and enable exactly-once processing semantics. Which Google Cloud service should they use?

A. Cloud Pub/Sub with Cloud Functions
B. Cloud Dataflow with Apache Beam
C. Cloud Dataproc with Spark Streaming
D. Cloud Bigtable with Dataflow templates
Answer: B

Dataflow with Beam natively supports session windows and exactly-once processing via its processing guarantees.

Why this answer

Cloud Dataflow with Apache Beam is the correct choice because it provides native support for session windows with a 5-minute gap duration and exactly-once processing semantics via its sink and source integrations. Dataflow's Beam SDK allows you to define session windows using `Window.into(Sessions.withGapDuration(Duration.standardMinutes(5)))`, and its checkpointing and idempotent writes ensure exactly-once delivery even in failure scenarios.
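The session-window semantics mentioned above can be illustrated without a Beam runtime. The sketch below is plain Python, not Beam code: it groups one user's event timestamps into sessions, starting a new session whenever the gap since the previous event reaches 5 minutes. The function name is hypothetical.

```python
from datetime import datetime, timedelta

GAP = timedelta(minutes=5)


def sessionize(timestamps):
    """Group event timestamps into sessions separated by at least GAP of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        # Extend the current session only if the event falls strictly inside the gap.
        if sessions and ts - sessions[-1][-1] < GAP:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions
```

A stream processor has to keep this per-key session state across events and across worker failures, which is the state management that stateless functions do not provide.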

Exam trap

Google Cloud often tests the distinction between stateless serverless services (like Cloud Functions) and stateful stream processing engines (like Dataflow), leading candidates to incorrectly choose Cloud Pub/Sub with Cloud Functions because they overlook the need for session window state management and exactly-once semantics.

How to eliminate wrong answers

Option A is wrong because Cloud Pub/Sub with Cloud Functions does not support session windowing natively; Cloud Functions are stateless and cannot maintain session state across invocations, and Pub/Sub offers at-least-once delivery, not exactly-once. Option C is wrong because Cloud Dataproc with Spark Streaming can implement session windows but requires manual state management and does not provide built-in exactly-once semantics; Spark Streaming's checkpointing can lead to duplicate outputs in failure recovery. Option D is wrong because Cloud Bigtable with Dataflow templates is a storage and template combination, not a processing service; Dataflow templates can be used for streaming but the question asks for the service to use, and Bigtable is a NoSQL database, not a stream processing engine.

8
Multi-Select · hard

Which TWO metrics are most important to monitor for a real-time online prediction system to ensure service reliability and model performance?

Select 2 answers
A. Feature distribution skew between training and serving
B. Prediction latency (p50, p99)
C. Number of training examples used for the latest model version
D. Batch prediction job throughput
E. Prediction error rate (e.g., 4xx/5xx responses)
Answers: B, E

Latency is critical for real-time applications; p99 shows tail performance.

Why this answer

Prediction latency (p50, p99) is critical because it directly impacts user experience and system reliability; high tail latency (p99) can indicate resource contention or model complexity issues. Prediction error rate (4xx/5xx) is essential for detecting serving infrastructure failures, such as model server crashes or misconfigured endpoints, which degrade service reliability. Both metrics provide real-time visibility into the serving layer's health and performance, distinct from offline training metrics.
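For readers unfamiliar with the two metrics, here is a small sketch of how they could be computed from raw request records. It uses the nearest-rank percentile definition, which is one of several in common use; the function names are hypothetical and a real system would read these values from its monitoring backend instead.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def error_rate(status_codes):
    """Fraction of responses with a 4xx or 5xx HTTP status."""
    errors = sum(1 for code in status_codes if code >= 400)
    return errors / len(status_codes)
```

With latencies of 1 through 100 ms, `percentile(latencies, 50)` is 50 and `percentile(latencies, 99)` is 99; a large distance between the two is the tail-latency signal described above.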

Exam trap

Google Cloud often tests the distinction between offline training metrics (like feature skew or training example count) and real-time serving metrics (like latency and error rate), trapping candidates who confuse model performance monitoring with service reliability monitoring.

9
MCQ · medium

A team trained a model on a Vertex AI custom training job and wants to deploy it to an endpoint for online predictions. They have the model artifacts stored in Cloud Storage. What steps are required?

A. Upload model to Model Registry, create endpoint, deploy model
B. Directly deploy from Cloud Storage without Model Registry
C. Create endpoint, then upload model
D. Use Vertex AI Batch Prediction only
Answer: A

This is the standard workflow: register model, create endpoint, then deploy.

Why this answer

To deploy a model for online predictions on Vertex AI, you must first upload the model artifacts from Cloud Storage to the Model Registry, which creates a versioned model resource. Then you create an endpoint (or use an existing one) and deploy the model to that endpoint, specifying machine type, traffic split, and other settings. This three-step process (upload → create endpoint → deploy) is the required workflow for online serving.

Exam trap

Google Cloud often tests the misconception that you can deploy directly from Cloud Storage without the Model Registry, or that the endpoint must be created before the model is uploaded, when in fact the model must be registered first.

How to eliminate wrong answers

Option B is wrong because Vertex AI does not allow direct deployment from Cloud Storage without first registering the model in the Model Registry; the registry is required to manage model versions and associate deployment configurations. Option C is wrong because you cannot create an endpoint before uploading the model to the Model Registry, as the endpoint deployment references a model resource that must already exist. Option D is wrong because the question explicitly asks for online predictions, and batch prediction is a separate, asynchronous process that does not involve endpoints or real-time serving.

10
Multi-Select · hard

A data science team uses Cloud Build and Vertex AI to implement CI/CD for their machine learning models. Which THREE steps are essential for a production-ready operationalization pipeline? (Choose 3.)

Select 3 answers
A. Store all training artifacts in Cloud Storage without versioning.
B. Deploy the model to a staging endpoint for manual approval before promoting to production.
C. Automatically deploy every new model version directly to the production endpoint.
D. Use Vertex AI Model Evaluation to validate the new model against the current production model metrics.
E. Include unit and integration tests for the training code in the Cloud Build pipeline.
Answers: B, D, E

Staging allows human review and canary testing before full production rollout.

Why this answer

Option B is correct because deploying to a staging endpoint for manual approval before promoting to production is a critical step in a production-ready CI/CD pipeline. This allows data scientists to validate model behavior, performance, and fairness in a near-production environment, preventing regressions and ensuring governance compliance before the model serves live traffic.

Exam trap

Google Cloud often tests the misconception that full automation (Option C) is always better, but the trap here is that production-ready pipelines require human-in-the-loop approval for critical model changes to ensure accountability and safety.

11
MCQ · medium

A data pipeline processes streaming data from Pub/Sub to BigQuery. The pipeline needs to handle late-arriving data that is up to 1 hour late. Which Dataflow feature should be used?

A. Global windows with watermark
B. Session windows
C. Sliding windows with allowed lateness
D. Fixed windows with allowed lateness
Answer: D

Fixed windows with allowed lateness (set to 1 hour) ensure late events are processed in the correct window.

Why this answer

Fixed windows with allowed lateness are the correct choice because the pipeline needs to handle late-arriving data up to 1 hour late while processing data in fixed time intervals (e.g., 1-hour windows). The `allowedLateness` parameter in Dataflow (Apache Beam) allows late data to be included in the appropriate fixed window for up to the specified duration after the watermark passes the window end. This ensures that late Pub/Sub messages are correctly joined with their original window in BigQuery.
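The interaction between fixed windows, the watermark, and allowed lateness can be sketched in plain Python. This is a simplified model of the rule, not Beam code: times are integer seconds, the window size and lateness are both one hour as in the example above, and the function names are hypothetical.

```python
WINDOW_SECONDS = 3600            # fixed 1-hour windows
ALLOWED_LATENESS_SECONDS = 3600  # accept data up to 1 hour late


def window_for(event_time):
    """Return the [start, end) fixed window that contains the event time."""
    start = event_time - event_time % WINDOW_SECONDS
    return (start, start + WINDOW_SECONDS)


def is_accepted(event_time, watermark):
    """An element is kept while the watermark has not passed window end + allowed lateness."""
    _, end = window_for(event_time)
    return watermark < end + ALLOWED_LATENESS_SECONDS
```

An event stamped 1800 belongs to window (0, 3600). It is still accepted when the watermark is at 5000, but dropped once the watermark reaches 7200, which is when the window's state can be released.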

Exam trap

Google Cloud often tests the distinction between window types and lateness handling, and the trap here is that candidates mistake 'allowed lateness' for a feature exclusive to sliding windows or global windows, when in fact it is a parameter that can be applied to fixed windows to handle late data within a bounded delay.

How to eliminate wrong answers

Option A is wrong because global windows with watermark process all data in a single unbounded window and rely on the watermark to trigger output, but they cannot segment data into fixed time intervals for BigQuery loading, and late data handling is not as precise for per-window aggregation. Option B is wrong because session windows group events based on gaps of inactivity, which is not suitable for processing data in fixed time intervals as required by the pipeline. Option C is wrong because sliding windows produce overlapping windows that emit multiple outputs per element, which is unnecessary and inefficient for a simple fixed-interval pipeline; allowed lateness can be set on any windowing strategy, so it is the overlapping window shape, not lateness support, that rules this option out.

12
MCQ · hard

A company runs a daily batch data processing pipeline using Cloud Dataproc. The pipeline reads 10 TB of CSV files from Cloud Storage, performs a heavy aggregation (GroupBy) and joins with a small reference table, then writes the results to BigQuery. The cluster consists of 20 n1-standard-8 nodes, including 10 preemptible workers for cost savings. Recently, the job completion time has doubled from 30 minutes to over an hour. The job logs show many tasks being retried, and the Shuffle spill ratio is high. No significant data volume change was observed. What is the most likely root cause?

A. The cluster's HDFS is running out of space due to intermediate shuffle data.
B. Data skew has developed, causing a few tasks to process most of the data.
C. Preemptible workers are being reclaimed, causing YARN container failures and task retries.
D. The reference table has increased in size, causing more data to be broadcast to all workers.
Answer: C

Preemptible nodes can be taken at any time; Shuffle-heavy jobs suffer greatly from lost intermediate data.

Why this answer

The correct answer is C because preemptible workers are frequently reclaimed by Google Cloud, causing YARN containers to fail and tasks to be retried. This leads to increased job completion time and a high shuffle spill ratio, as partial shuffle data is lost and must be recomputed. The doubling of job time without data volume change strongly points to infrastructure instability rather than data or configuration issues.

Exam trap

The trap here is that candidates may attribute high shuffle spill and task retries to data skew or HDFS space, but the key clue is the unchanged data volume and the use of preemptible workers, which directly cause container failures and retries.

How to eliminate wrong answers

Option A is wrong because Spark writes intermediate shuffle data to the local disks of the worker nodes rather than to HDFS, so HDFS space is not the bottleneck. Option B is wrong because data skew would cause a few tasks to process most data, but the symptom of many tasks being retried and high shuffle spill ratio is more consistent with container failures, not skew; skew typically manifests as a few long-running tasks, not widespread retries. Option D is wrong because the reference table is described as small, and even if it increased, broadcasting more data would not cause task retries or high shuffle spill; it would instead increase memory pressure on executors, not trigger widespread failures.

13
Drag & Drop · medium

Drag and drop the steps to configure a VPC network with private Google access for on-premises connectivity using Cloud VPN into the correct order.

Why this order

Private Google Access allows on-premises hosts to reach Google APIs via VPN without public IPs.

14
MCQ · hard

A financial services company uses Dataflow pipelines with late data handling. They need to ensure that all late-arriving data is processed correctly but also want to control costs. What is the best configuration?

A. Use a global window with a very long allowed lateness (e.g., 7 days).
B. Use session windows with a gap duration of 1 hour and allowed lateness of 2 days.
C. Use sliding windows with a short allowed lateness (e.g., 10 minutes) and a side input containing historical data.
D. Use fixed windows with allowed lateness set to the maximum expected delay (e.g., 2 days) and a trivial watermark.
Answer: D

Fixed windows with a realistic allowed lateness capture late data without excessive state cost, and a trivial watermark ensures no data is dropped.

Why this answer

Option D is correct because using fixed windows with allowed lateness set to the maximum expected delay and a trivial watermark balances completeness and cost. Option A (global window with long allowed lateness) can cause high state cost. Option B (session windows) may merge late data incorrectly.

Option C (sliding windows with short allowed lateness and side input) is complex and may miss data.

15
MCQ · easy

A company uses Cloud Dataflow to process streaming data from Pub/Sub into BigQuery. The pipeline uses a side input from a Cloud Bigtable table containing user profile information to enrich the events. The side input is updated every hour. Which approach should the company use to ensure that the pipeline uses the latest profile data without causing high memory usage?

A. Use a side input that is periodically refreshed by reading the Cloud Bigtable table at a regular interval.
B. For each incoming event, read the corresponding profile from Cloud Bigtable using a synchronous call.
C. Use a CoGroupByKey transform to join the stream with a bounded PCollection created from the Cloud Bigtable table.
D. Stream the profile updates into a separate BigQuery table and use a BigQuery streaming query to join in real-time.
Answer: A

A is correct because side inputs with periodic refreshes provide a fresh snapshot of the reference data without high memory overhead.

Why this answer

Option A is correct because Cloud Dataflow supports periodically refreshing side inputs by reading from an external source like Cloud Bigtable at a specified interval. This approach keeps the profile data up-to-date without storing the entire side input in memory for the lifetime of the pipeline; instead, the side input is rebuilt and cached only when refreshed, controlling memory usage.
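The refresh-on-interval idea can be illustrated with a small cache that reloads a snapshot once its age exceeds a time-to-live. This is a plain-Python analogy, not the Beam side-input API; the class name, the `loader` callable, and the injectable `clock` are all hypothetical.

```python
import time


class RefreshingLookup:
    """Holds one snapshot of reference data and reloads it after ttl_seconds."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # callable returning a dict snapshot
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the behaviour can be tested
        self._snapshot = None
        self._loaded_at = None

    def get(self, key, default=None):
        now = self._clock()
        # Reload only when there is no snapshot yet or the current one has expired.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._snapshot = self._loader()
            self._loaded_at = now
        return self._snapshot.get(key, default)
```

With `ttl_seconds=3600`, the external table is read once per hour regardless of event volume, and only one snapshot is held at a time, which is the trade-off the explanation above describes.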

Exam trap

Google Cloud often tests the misconception that side inputs are static and cannot be updated, leading candidates to choose per-element lookups (Option B) or complex joins (Option C), when in fact Dataflow's side input refresh mechanism is the correct, efficient solution for periodically updated reference data.

How to eliminate wrong answers

Option B is wrong because making a synchronous call to Cloud Bigtable for every incoming event would introduce high latency and potentially overwhelm Bigtable with thousands of read requests per second, leading to performance degradation and increased cost. Option C is wrong because a bounded PCollection read once from Cloud Bigtable is a static snapshot; joining the unbounded Pub/Sub stream against it would not reflect the hourly updates to the profile data. Option D is wrong because streaming profile updates into a separate BigQuery table and using a streaming query to join in real-time would add unnecessary complexity and latency, and BigQuery is not designed for high-frequency per-event joins in a streaming pipeline.

16
MCQ · medium

A production model deployed on Vertex AI Endpoint is experiencing high latency during traffic spikes. The current configuration uses a single replica. What is the most efficient solution?

A. Set a higher min replica count (e.g., 3)
B. Enable autoscaling with minReplicaCount=1 and maxReplicaCount=10
C. Use a larger machine type (e.g., n1-highmem-8)
D. Switch to batch prediction to handle spikes
Answer: B

Autoscaling adjusts replicas based on load, balancing latency and cost.

Why this answer

Enabling autoscaling with a min replica count ensures always-on capacity and scales up during spikes. Using a larger machine type might help but is less dynamic. Using batch prediction doesn't solve real-time latency.

Increasing min replicas without autoscaling leaves resources idle at quiet times.
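The difference between a fixed replica count and autoscaling can be sketched with a simplified target-tracking rule. This is an illustrative model only, not the service's actual scaling algorithm; the function name is hypothetical and utilization is expressed in whole percent.

```python
import math


def target_replicas(current_replicas, utilization_pct, target_pct,
                    min_replicas, max_replicas):
    """Scale replicas so that average utilization moves toward the target, within bounds."""
    desired = math.ceil(current_replicas * utilization_pct / target_pct)
    return max(min_replicas, min(max_replicas, desired))
```

With bounds of 1 and 10 and a 60% target, one overloaded replica at 300% of its capacity yields `target_replicas(1, 300, 60, 1, 10)`, which is 5, while two replicas idling at 20% scale back to 1. A fixed count of 3 would be too few in the first case and wasteful in the second.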

17
Multi-Select · medium

Which TWO configurations are required to enable online prediction for a model deployed on Vertex AI Endpoints?

Select 2 answers
A. A feature store must be attached to the endpoint.
B. The endpoint must be configured with a machine type (e.g., n1-standard-2).
C. The model must be trained on Vertex AI.
D. A model must be deployed to an endpoint.
E. Autoscaling must be enabled.
Answers: B, D

A machine type must be specified to allocate resources for serving.

Why this answer

Option B is correct because Vertex AI Endpoints require a machine type to be specified when deploying a model. The machine type determines the compute resources (CPU/memory) allocated to the serving container, which is essential for handling prediction requests. Without a machine type, the endpoint cannot provision the underlying infrastructure to serve online predictions.

Exam trap

The trap here is that candidates often confuse optional features (like Feature Store or autoscaling) with mandatory configurations, or assume the model must be trained on Vertex AI, when in fact only the machine type and model deployment are strictly required for online prediction.

18
Multi-Select · easy

A data pipeline uses Cloud Pub/Sub to ingest events, then a Cloud Dataflow job writes to BigQuery. The Dataflow job is failing with 'deadline exceeded' errors. Which TWO actions can resolve this? (Choose TWO.)

Select 2 answers
A. Increase the number of Dataflow workers.
B. Switch to BigQuery Storage Write API.
C. Decrease the batch size for writes to BigQuery.
D. Set the --maxStreamingRowsToBundle parameter to a higher value.
E. Change the windowing from fixed to global.
Answers: A, D

Adding workers reduces the load per worker.

Why this answer

Increasing the number of Dataflow workers (Option A) is correct because 'deadline exceeded' errors typically indicate that the pipeline is falling behind on processing due to insufficient parallelism. By adding more workers, the workload is distributed across more virtual machines, reducing the per-worker load and allowing the pipeline to keep up with the incoming Pub/Sub stream, thereby avoiding timeouts when writing to BigQuery.

Exam trap

Google Cloud often tests the misconception that 'deadline exceeded' errors are always caused by slow writes to the sink, leading candidates to choose options like decreasing batch size or switching write APIs, when the real issue is insufficient parallelism in the streaming pipeline.

19
MCQ · hard

A financial services firm processes sensitive transactions using Cloud Dataflow. The pipeline reads from Pub/Sub, performs stateful processing (e.g., fraud detection), and writes to Cloud Spanner. Compliance requires exactly-once processing semantics. Which configuration ensures exactly-once processing?

A. Configure Pub/Sub to use exactly-once delivery mode.
B. Use Pub/Sub with at-least-once delivery and Dataflow with at-least-once processing mode.
C. Set Dataflow pipeline to exactly-once mode and design Spanner writes to be idempotent.
D. Enable Dataflow's streaming engine and use Spanner's built-in retry logic.
Answer: C

Exactly-once mode with idempotent sinks prevents duplicates.

Why this answer

Option C is correct because exactly-once processing in a Dataflow pipeline requires the pipeline itself to be set to exactly-once mode (which uses consistent snapshots and transactional sinks) and the output writes to Spanner to be idempotent. This combination ensures that even if a record is reprocessed due to failures, the final state in Spanner remains consistent, satisfying compliance requirements.
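What "idempotent writes" buys can be shown with a toy sink. In the sketch below a dictionary stands in for the database table and each write is an upsert keyed by a unique transaction ID, so replaying a record after a retry leaves the final state unchanged. The function names and record fields are hypothetical; this is not Spanner client code.

```python
def apply_write(table, transaction):
    """Upsert keyed by transaction ID, so replaying a transaction changes nothing."""
    table[transaction["id"]] = {
        "account": transaction["account"],
        "amount": transaction["amount"],
    }


def apply_all(transactions):
    """Apply a sequence of writes to an empty table and return the final state."""
    table = {}
    for transaction in transactions:
        apply_write(table, transaction)
    return table
```

Contrast this with a non-idempotent write such as "add the amount to a running balance", where a replayed record would be counted twice.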

Exam trap

The trap here is that candidates often assume Pub/Sub's exactly-once delivery alone is sufficient, but they overlook that Dataflow's internal processing and output writes must also be idempotent or transactional to achieve end-to-end exactly-once semantics.

How to eliminate wrong answers

Option A is wrong because Pub/Sub's exactly-once delivery mode only guarantees that a message is delivered exactly once to the subscriber, but it does not prevent duplicate processing within the Dataflow pipeline due to retries or checkpoint recovery. Option B is wrong because using at-least-once delivery in Pub/Sub combined with at-least-once processing in Dataflow inherently allows duplicates, violating exactly-once semantics. Option D is wrong because enabling Dataflow's streaming engine improves scalability and latency but does not enforce exactly-once processing, and Spanner's built-in retry logic only handles transient failures, not duplicate writes from reprocessing.

20
MCQ · hard

A data science team is operationalizing a batch prediction job using Vertex AI Batch Prediction. The model uses a custom container that requires a specific GPU for inference. The job processes a large dataset stored in Cloud Storage. The team wants to minimize cost while ensuring the job completes within a 2-hour window. Which configuration should they choose?

A. Use a custom training job with a GPU worker pool and run the inference as a custom job.
B. Use a custom machine type with a GPU accelerator in the batch prediction request.
C. Use a high-memory machine type (e.g., n1-highmem-32) without GPU to reduce cost.
D. Configure a Vertex AI endpoint with GPU and submit batch requests to the endpoint.
Answer: A

This approach allows GPU usage and is cost-effective for batch processing within a time window.

Why this answer

Option A is correct because Vertex AI Batch Prediction does not support custom containers with GPU accelerators; it only supports CPUs for batch prediction jobs. To run GPU-accelerated inference on a large dataset, the team must use a custom training job (which supports GPU worker pools) and run inference as a custom job. This approach allows them to leverage GPU hardware for the 2-hour window while minimizing cost by using preemptible VMs or choosing the smallest GPU instance that meets throughput requirements.

Exam trap

Google Cloud often tests the misconception that Vertex AI Batch Prediction supports GPU accelerators because it is a managed service, but in reality, GPU support is only available for online prediction endpoints and custom training jobs, not for batch prediction.

How to eliminate wrong answers

Option B is wrong because Vertex AI Batch Prediction does not allow attaching GPU accelerators to custom machine types; the batch prediction service only supports CPU-based machine types. Option C is wrong because a high-memory CPU-only machine type would likely be too slow for GPU-required inference, causing the job to exceed the 2-hour window or require many more instances, increasing cost. Option D is wrong because configuring an endpoint with GPU and submitting batch requests would incur ongoing endpoint deployment costs (even when idle) and is designed for online prediction, not cost-efficient batch processing; it also introduces unnecessary latency and scaling complexity.

21
MCQhard

A company uses Cloud Spanner for a global transactional application. During peak hours, commit latency increases by over 50%. Which configuration issue is the most likely root cause?

A.Insufficient compute capacity (nodes) allocated to the instance.
B.Hotspotting due to monotonically increasing primary keys.
C.Incorrect indexing of secondary indexes.
D.Network bandwidth constraints between regions.
AnswerB

This is a common cause of latency spikes in Spanner; use hash-prefixed keys to distribute writes.

Why this answer

Monotonically increasing primary keys in Spanner create hotspots: every new row is appended to the end of the key space, so writes concentrate on a single split (and the server that hosts it), causing contention and increased commit latency.
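
As a rough illustration of the hash-prefix idea mentioned above, the sketch below derives a shard prefix from a hash of a sequential ID so that consecutive IDs land in different key ranges. The function name `sharded_key` and the shard count of 16 are hypothetical choices for this example and are not Spanner API names.

```python
import hashlib
from collections import Counter
from typing import Tuple

NUM_SHARDS = 16  # hypothetical shard count chosen for illustration


def sharded_key(sequential_id: int, num_shards: int = NUM_SHARDS) -> Tuple[int, int]:
    """Return (shard, id) so that consecutive IDs spread over many key ranges."""
    digest = hashlib.sha256(str(sequential_id).encode("utf-8")).digest()
    shard = int.from_bytes(digest[:8], "big") % num_shards
    return (shard, sequential_id)


# Consecutive IDs no longer sort next to each other in the primary key.
keys = [sharded_key(i) for i in range(1000)]
shard_counts = Counter(shard for shard, _ in keys)
print(sorted(shard_counts.items()))
```

The trade-off in this kind of design is that range scans over the original ID now have to fan out across all shard prefixes.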

22
Multi-Selectmedium

A data engineer needs to monitor the performance of BigQuery queries to identify opportunities for optimization. Which TWO metrics should they focus on? (Choose two.)

Select 2 answers
A.Slot usage
B.Data scanned per query
C.Query execution time
D.Number of tables joined
E.Number of users
AnswersA, B

Monitoring slot usage helps identify query resource consumption and opportunities to optimize.

Why this answer

Options A and B are correct: slot usage indicates resource consumption, and data scanned per query directly correlates with cost and performance. Option C (query execution time) is important but can be affected by many factors. Option D (number of tables joined) is not a direct performance metric.

Option E (number of users) is irrelevant.

23
MCQhard

A team is implementing CI/CD for their ML models using Google Cloud. They want to automatically retrain and deploy a new model version when new training data arrives in Cloud Storage. Which combination of services should they use?

A.Cloud Storage triggers, Cloud Functions, and Vertex AI Pipelines
B.Cloud Scheduler and Vertex AI Training
C.Cloud Pub/Sub and Cloud Composer
D.Cloud Storage notifications and Cloud Build
AnswerA

Event-driven pipeline with managed ML services.

Why this answer

Option A is correct because Cloud Storage triggers fire an event when new data arrives, which invokes a Cloud Function that can start a Vertex AI Pipeline for retraining and deploying the model. This combination provides a fully managed, event-driven CI/CD pipeline for ML models without manual intervention.
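
The event-driven flow described above might look roughly like the following sketch. It assumes the event payload carries the `bucket` and `name` fields of the new object; `on_new_training_data` and `submit_pipeline` are hypothetical names, and the pipeline submission is injected as a callable so the example runs without any cloud client library.

```python
from typing import Callable, Dict, List


def on_new_training_data(event: Dict[str, str],
                         submit_pipeline: Callable[[str], str]) -> str:
    """Handle a 'new object' event and start a retraining run.

    `submit_pipeline` is a hypothetical callable that starts the pipeline
    for the given data URI and returns an identifier for the run.
    """
    data_uri = f"gs://{event['bucket']}/{event['name']}"
    return submit_pipeline(data_uri)


# Local usage with a stand-in submitter instead of a real Vertex AI call.
submitted: List[str] = []


def fake_submit(data_uri: str) -> str:
    submitted.append(data_uri)
    return f"run-{len(submitted)}"


run_id = on_new_training_data(
    {"bucket": "example-training-data", "name": "2024/batch-001.csv"},
    fake_submit,
)
print(run_id, submitted)
```

In a real deployment the injected callable would wrap whatever pipeline-submission call the team uses; keeping it separate also makes the handler easy to unit test.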

Exam trap

Google Cloud often tests the distinction between event-driven triggers (Cloud Storage triggers) and time-based scheduling (Cloud Scheduler), leading candidates to choose B or D when they overlook the need for automatic retraining upon data arrival.

How to eliminate wrong answers

Option B is wrong because Cloud Scheduler is for time-based scheduling, not event-driven triggers from Cloud Storage, so it cannot automatically retrain when new data arrives. Option C is wrong because Cloud Pub/Sub and Cloud Composer (Apache Airflow) are more suited for complex workflow orchestration with multiple dependencies, not a simple event-driven retraining trigger from Cloud Storage. Option D is wrong because Cloud Build is designed for building and testing application code, not for orchestrating ML training pipelines with Vertex AI, and it lacks native integration for model deployment.

24
MCQmedium

Refer to the exhibit. What is the most likely cause of the error?

A.The model artifact was not uploaded to Cloud Storage
B.The endpoint does not exist
C.The service account lacks permissions
D.The model ID is invalid
AnswerA

The error explicitly states the artifact URI is missing.

Why this answer

The error occurs because the model artifact must be uploaded to Cloud Storage before it can be deployed to an endpoint. Vertex AI requires the model to be stored in a Cloud Storage bucket, and the deployment process references that artifact. Without the artifact in Cloud Storage, the endpoint creation or model deployment fails with an error indicating the resource is missing.

Exam trap

Google Cloud often tests the distinction between resource existence errors (like missing artifact) and permission or configuration errors, leading candidates to incorrectly choose permission issues when the actual problem is a missing prerequisite resource.

How to eliminate wrong answers

Option B is wrong because if the endpoint did not exist, the error would typically be a 404 Not Found or a message stating the endpoint resource is not found, not a generic error about missing artifact. Option C is wrong because a lack of permissions would result in a 403 Forbidden error or an IAM-related message, not an error about a missing model artifact. Option D is wrong because an invalid model ID would produce an error like 'Model not found' or 'Invalid model ID', not an error indicating the artifact is missing from Cloud Storage.

25
MCQmedium

A company uses Cloud Dataproc to run nightly Spark ETL jobs that process about 500 GB of data each night. The jobs currently take 4 hours to complete. The company wants to reduce the runtime to under 2 hours to meet a new SLA. The cluster is configured with 10 worker nodes (n1-standard-4) and 1 master node (n1-standard-4). The jobs are CPU-bound and use only default settings. The cluster is deleted after each job and recreated. The data is stored in Cloud Storage. The company is open to increasing cost but wants the most cost-effective solution to meet the SLA. Which approach should they take?

A.Use a regional Cloud Storage bucket to improve read throughput.
B.Replace worker nodes with n1-highmem-16 instances to increase memory.
C.Increase the number of worker nodes to 20 and use preemptible VMs for half of them.
D.Change machine type to n2-standard-8 for all nodes.
AnswerC

Doubles processing power cost-effectively.

Why this answer

Option C is correct because adding more worker nodes (from 10 to 20) directly increases parallelism for CPU-bound Spark jobs, and using preemptible VMs for half of them reduces cost while still meeting the SLA. Since the job is CPU-bound and uses default settings, scaling horizontally with a mix of standard and preemptible VMs is the most cost-effective way to halve runtime, as Spark can efficiently distribute the workload across more cores.

Exam trap

The trap here is that candidates may assume CPU-bound jobs require faster CPUs (Option D) or more memory (Option B), but horizontal scaling with preemptible VMs is the most cost-effective way to increase parallelism in Cloud Dataproc.

How to eliminate wrong answers

Option A is wrong because using a regional Cloud Storage bucket improves data durability and availability but does not significantly increase read throughput for a single job; the bottleneck is CPU, not I/O. Option B is wrong because the job is CPU-bound, not memory-bound; increasing memory with n1-highmem-16 instances does not address the CPU bottleneck and adds unnecessary cost. Option D is wrong because changing to n2-standard-8 (8 vCPUs per node) doubles vCPUs per node but only increases total vCPUs from 40 to 80, which may not halve runtime, and is less cost-effective than using 20 n1-standard-4 nodes (80 vCPUs) with preemptible VMs for half.
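
The vCPU totals quoted above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `total_worker_vcpus` is made up for the example, and it counts worker vCPUs only, ignoring the master node and pricing.

```python
# vCPUs per node for the two machine types compared in the explanation.
VCPUS_PER_NODE = {"n1-standard-4": 4, "n2-standard-8": 8}


def total_worker_vcpus(machine_type: str, workers: int) -> int:
    """Total vCPUs available across all worker nodes."""
    return VCPUS_PER_NODE[machine_type] * workers


current = total_worker_vcpus("n1-standard-4", 10)   # existing cluster
option_c = total_worker_vcpus("n1-standard-4", 20)  # 20 workers, half preemptible
option_d = total_worker_vcpus("n2-standard-8", 10)  # larger machines, same count
print(current, option_c, option_d)  # 40 80 80
```

Options C and D reach the same vCPU total, so the difference between them comes down to cost, which is where the preemptible workers in option C matter.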

26
MCQeasy

A data science team has trained a TensorFlow model for image classification and wants to deploy it to production with minimal latency. They have already exported the model as a SavedModel directory. Which service should they use to create an online prediction endpoint?

A.Cloud Functions
B.Vertex AI Endpoints
C.AI Platform Prediction (legacy)
D.Cloud Dataflow
AnswerB

Vertex AI Endpoints provide scalable, low-latency online prediction serving.

Why this answer

Vertex AI Endpoints is the correct service for deploying a TensorFlow SavedModel to an online prediction endpoint with minimal latency. It provides managed, autoscaling infrastructure optimized for real-time inference, including GPU/TPU support, request batching, and automatic health checking, which are essential for production deployment.

Exam trap

The trap here is that candidates may confuse Vertex AI Endpoints with AI Platform Prediction (legacy) or think Cloud Functions can serve models, but Google Cloud tests that Vertex AI is the modern, fully managed service for online prediction with minimal latency, while the others are either deprecated or designed for different workloads.

How to eliminate wrong answers

Option A is wrong because Cloud Functions is a serverless compute service for event-driven, short-lived functions, not designed for hosting persistent ML models with low-latency prediction endpoints; it lacks built-in model serving, batching, and autoscaling for inference workloads. Option C is wrong because AI Platform Prediction (legacy) is the older, deprecated service that has been replaced by Vertex AI; while it could serve models, it is no longer the recommended or supported path for new deployments, and Vertex AI offers superior latency optimization and integration. Option D is wrong because Cloud Dataflow is a batch and stream data processing service based on Apache Beam, intended for ETL and data pipelines, not for hosting online prediction endpoints; it cannot serve real-time inference requests with sub-second latency.

27
MCQmedium

A team wants to ingest streaming data from millions of IoT devices and store historical data in BigQuery for analysis. They need near real-time analytics on the most recent data, with sub-second latency. Which architecture should they use?

A.Use Pub/Sub to receive data, then stream directly into BigQuery using the streaming API, and use standard SQL queries for real-time analytics.
B.Use Pub/Sub, then a Dataflow pipeline that filters and transforms data, writing to Cloud Bigtable for real-time queries and to Cloud Storage for periodic BigQuery loads.
C.Use Pub/Sub to ingest data into a Dataproc Spark Streaming job that writes to both Bigtable and BigQuery.
D.Use Cloud SQL to store the latest data and periodically move historical data to BigQuery via cron jobs.
AnswerB

Bigtable provides single-digit-millisecond latency for real-time queries, and BigQuery handles large-scale analytics.

Why this answer

Option B is correct because it uses Cloud Bigtable for sub-second latency on recent data, which is ideal for near real-time analytics on streaming IoT data. Dataflow provides the necessary stream processing, filtering, and transformation before writing to Bigtable for low-latency queries and to Cloud Storage for periodic batch loads into BigQuery for historical analysis. This architecture decouples real-time and historical paths, meeting both latency and storage requirements.

Exam trap

Google Cloud often tests the misconception that BigQuery's streaming API can provide sub-second query latency, but in reality, BigQuery is a columnar analytics engine optimized for large scans, not for low-latency point reads, which is why a separate low-latency store like Bigtable is required for real-time access.

How to eliminate wrong answers

Option A is wrong because streaming directly into BigQuery via the streaming API does not guarantee sub-second latency for queries; BigQuery is optimized for analytical queries on large datasets, not for real-time point lookups or low-latency access to the most recent data. Option C is wrong because Dataproc Spark Streaming adds unnecessary operational overhead and latency compared to a managed service like Dataflow, and writing directly to both Bigtable and BigQuery from Spark can cause contention and complexity without the built-in exactly-once semantics and auto-scaling of Dataflow. Option D is wrong because Cloud SQL is not designed for high-throughput streaming ingestion from millions of devices and cannot handle the scale; also, periodic cron jobs to move data to BigQuery introduce latency that violates the sub-second requirement for near real-time analytics.

28
Multi-Selecteasy

A company uses Cloud Logging to monitor application errors. They want to set up real-time notifications for critical errors. Which two actions are essential? (Choose two.)

Select 2 answers
A.Create a log-based metric for critical errors.
B.Export logs to BigQuery for later analysis.
C.Create a Cloud Pub/Sub notification directly on the log sink.
D.Enable VPC Flow Logs to capture network traffic.
E.Set up a Cloud Monitoring alert policy based on the log-based metric.
AnswersA, E

A log-based metric extracts error counts from logs, enabling quantitative alerting.

Why this answer

First, create a log-based metric to count critical error events. Then, set up an alert policy in Cloud Monitoring that triggers when that metric crosses a threshold.
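
The two steps can be pictured with a small local simulation. This is not the Cloud Logging or Cloud Monitoring API; `count_critical` and `should_alert` are hypothetical helpers that mimic a counter-style log-based metric and a threshold alert condition.

```python
from typing import Dict, Iterable


def count_critical(entries: Iterable[Dict[str, str]]) -> int:
    """Step 1: the log-based metric, a count of entries with CRITICAL severity."""
    return sum(1 for entry in entries if entry.get("severity") == "CRITICAL")


def should_alert(metric_value: int, threshold: int = 0) -> bool:
    """Step 2: the alert condition, true when the metric exceeds the threshold."""
    return metric_value > threshold


# One evaluation window of log entries (made-up sample data).
window = [
    {"severity": "INFO", "message": "request served"},
    {"severity": "CRITICAL", "message": "payment service unreachable"},
    {"severity": "ERROR", "message": "retrying upstream call"},
]
metric = count_critical(window)
print(metric, should_alert(metric))
```

Separating the metric from the alert condition mirrors the answer: the metric turns log text into a number, and the alert policy decides when that number warrants a notification.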

29
Multi-Selecteasy

A team is deploying a TensorFlow model for online predictions on AI Platform Prediction. They want to monitor for data drift and model performance degradation. Which TWO Google Cloud services should they use?

Select 2 answers
A.Cloud Composer
B.AI Platform Continuous Evaluation
C.Cloud Monitoring
D.AI Platform Pipelines
E.Cloud Logging
AnswersB, C

Provides automated drift detection and model evaluation.

Why this answer

AI Platform Continuous Evaluation (option B) is correct because it is a managed service specifically designed to detect data drift and model performance degradation in deployed models. It automatically compares incoming prediction data against the training data distribution and monitors metrics like accuracy over time, triggering alerts when significant drift is detected. Cloud Monitoring (option C) is correct because it provides the underlying metrics and alerting infrastructure that can track model performance indicators (e.g., prediction latency, error rates) and integrate with Continuous Evaluation for comprehensive observability.

Exam trap

Google Cloud often tests the distinction between services that orchestrate pipelines (Composer, Pipelines) versus services that monitor and evaluate deployed models (Continuous Evaluation, Monitoring), leading candidates to mistakenly choose orchestration tools for monitoring tasks.

30
Matchingmedium

Match each data pipeline term to its definition.

Concepts
Matches

Extract, Transform, Load

Extract, Load, Transform

Raw data storage in native format

Optimized storage for structured analytics

Why these pairings

Common data pipeline concepts and their meanings.

31
Multi-Selectmedium

A team is debugging a sudden increase in prediction latency for a model deployed on Vertex AI Endpoints. Which TWO metrics in Cloud Monitoring should they examine first? (Choose two.)

Select 2 answers
A.CPU utilization
B.Memory utilization
C.gRPC port errors
D.Number of predictions
E.Prediction request latency
AnswersA, B

High CPU utilization can cause processing delays.

Why this answer

CPU utilization (A) is correct because a sudden increase in prediction latency often stems from the model consuming excessive CPU cycles during inference, especially for compute-intensive models like deep neural networks. Monitoring CPU utilization helps identify whether the endpoint's compute resources are saturated, causing requests to queue and latency to spike. Memory utilization (B) is correct because insufficient memory can lead to swapping or garbage collection pauses, directly increasing latency.

Vertex AI Endpoints autoscales based on these metrics, so examining them first pinpoints resource bottlenecks.

Exam trap

Google Cloud often tests the distinction between symptom metrics (like prediction request latency) and root-cause metrics (like CPU/memory utilization), trapping candidates who select the symptom as a diagnostic metric instead of the underlying resource indicators.

32
MCQeasy

A team wants to retrain a model weekly using new data stored in BigQuery. They want to minimize manual effort. Which approach should they use?

A.Use Cloud Scheduler to trigger a Cloud Function that retrains
B.Retrain manually in a notebook each week
C.Use Cloud Composer to orchestrate retraining
D.Create a Vertex AI Pipeline scheduled via Cloud Scheduler
AnswerD

Pipelines automate retraining end-to-end.

Why this answer

Vertex AI Pipelines allow you to define a repeatable, automated ML workflow that can be triggered on a schedule via Cloud Scheduler. This minimizes manual effort by handling data extraction from BigQuery, model retraining, and deployment without human intervention, while also providing versioning and monitoring capabilities.

Exam trap

Google Cloud often tests the distinction between simple scheduling (Cloud Scheduler + Cloud Function) and full ML orchestration (Vertex AI Pipelines), where candidates mistakenly choose the simpler option without considering the need for a managed, scalable ML workflow.

How to eliminate wrong answers

Option A is wrong because Cloud Scheduler triggering a Cloud Function is suitable for lightweight tasks, but retraining a model typically requires more complex orchestration, dependency management, and resource handling that a Cloud Function alone cannot efficiently provide. Option B is wrong because manual retraining in a notebook each week introduces significant manual effort and is error-prone, directly contradicting the goal of minimizing manual effort. Option C is wrong because Cloud Composer (based on Apache Airflow) is a powerful orchestration tool but is overkill for a simple weekly retraining schedule; it adds unnecessary complexity and cost compared to a Vertex AI Pipeline scheduled via Cloud Scheduler.

33
MCQmedium

A company has a Cloud Functions function that triggers on new files in Cloud Storage and writes a message to Pub/Sub for downstream processing. Recently, the function has been timing out after 60 seconds. The downstream processing is critical. What is the best solution?

A.Replace Cloud Functions with a Cloud Run job that has longer timeout
B.Increase the function memory to 2 GB to speed up execution
C.Reduce the function timeout to 30 seconds to force faster execution
D.Increase function timeout to 540 seconds and delegate heavy processing to Cloud Dataflow
AnswerD

This addresses both the timeout and the heavy processing.

Why this answer

Option D is correct because Cloud Functions allows the timeout to be raised to 540 seconds (9 minutes) for event-driven functions such as this Cloud Storage trigger, which gives the function room to complete its work. Delegating heavy processing to Cloud Dataflow offloads the computationally intensive tasks, preventing future timeouts and ensuring scalable, reliable downstream processing for critical workloads.

Exam trap

Google Cloud often tests the misconception that increasing memory or reducing timeout directly solves performance issues, but the real solution is to extend the timeout and delegate heavy processing to a scalable service like Dataflow.

How to eliminate wrong answers

Option A is wrong because Cloud Run jobs are designed for batch workloads that run to completion, not for event-driven triggers like Cloud Storage; replacing Cloud Functions with a Cloud Run job would require a different invocation pattern and does not directly solve the timeout issue. Option B is wrong because increasing memory may improve performance for memory-bound tasks but does not guarantee faster execution for I/O-bound or CPU-bound operations, and the function still has a 60-second timeout limit. Option C is wrong because reducing the timeout to 30 seconds would force the function to fail even faster, making the timeout problem worse and potentially losing critical messages.

34
MCQhard

Refer to the exhibit. A data engineer sees these metrics from Cloud Monitoring for a deployed Vertex AI Endpoint. What is the most effective action to reduce latency?

A.Switch to batch prediction
B.Increase the number of replicas
C.Reduce the machine type
D.Enable model quantization
AnswerB

Adding replicas scales horizontally, reducing load per replica and improving latency.

Why this answer

The metrics show high CPU utilization and increasing latency, indicating the current instance is overloaded. Increasing the number of replicas distributes the inference requests across multiple instances, reducing per-replica load and lowering response times. This is the most direct way to scale horizontally and address latency caused by resource saturation.

Exam trap

Google Cloud often tests the misconception that model optimization (quantization) or switching to batch mode is the primary fix for latency, when the metrics clearly point to a scaling bottleneck.

How to eliminate wrong answers

Option A is wrong because batch prediction is designed for asynchronous, large-scale processing and does not reduce real-time endpoint latency; it actually increases latency for individual requests. Option C is wrong because reducing the machine type would decrease compute capacity, worsening CPU saturation and increasing latency further. Option D is wrong because model quantization reduces model size and inference time per request but does not address the root cause of high concurrent load; it may help marginally but is less effective than scaling out replicas.

35
Multi-Selectmedium

Which THREE steps are required to set up a continuous training pipeline on Google Cloud using Vertex AI?

Select 3 answers
A.Run training on a single Compute Engine VM with a cron job.
B.Create a Vertex AI Pipeline to orchestrate data preprocessing, training, and model evaluation.
C.Set up a trigger (e.g., Cloud Scheduler or Cloud Build) to start training on a schedule or new data.
D.Manually upload the model to Vertex AI Model Registry after each training run.
E.Configure model evaluation and promotion rules (e.g., if accuracy > threshold, deploy to endpoint).
AnswersB, C, E

Pipeline orchestrates the steps.

Why this answer

Option B is correct because Vertex AI Pipelines provide a managed, repeatable, and scalable way to orchestrate the entire ML workflow, including data preprocessing, training, and model evaluation. This is essential for a continuous training pipeline, as it automates the sequence of steps and ensures consistency across runs.
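
The evaluation-and-promotion rule from option E ("if accuracy > threshold, deploy to endpoint") can be sketched as a small gate function. The name `should_promote` and the 0.90 threshold are assumptions made for illustration; in practice this check would run as a step inside the pipeline.

```python
from typing import Dict


def should_promote(metrics: Dict[str, float], thresholds: Dict[str, float]) -> bool:
    """Return True only if every required metric strictly exceeds its threshold.

    A metric that is missing from `metrics` fails the gate.
    """
    return all(
        name in metrics and metrics[name] > threshold
        for name, threshold in thresholds.items()
    )


thresholds = {"accuracy": 0.90}  # hypothetical promotion rule
print(should_promote({"accuracy": 0.93}, thresholds))
print(should_promote({"accuracy": 0.85}, thresholds))
```

Treating a missing metric as a failure is a deliberately conservative choice, so that an incomplete evaluation step cannot promote a model by accident.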

Exam trap

Google Cloud often tests the distinction between manual, ad-hoc automation (like cron jobs) and fully managed, integrated orchestration services (like Vertex AI Pipelines), leading candidates to incorrectly select simpler but non-scalable options.

36
Multi-Selecthard

Which TWO are common causes of prediction bias in a deployed machine learning model in production?

Select 2 answers
A.Model accuracy is too high.
B.Data drift between training and serving data distributions.
C.Model is overfitted to training data.
D.Low latency predictions.
E.Training-serving skew due to differences in feature engineering.
AnswersB, E

Changes in the real-world data distribution can cause the model to produce biased results.

Why this answer

Option B is correct because data drift refers to changes in the statistical properties of the input features between the training and serving environments. When the distribution of real-world data shifts (e.g., seasonal trends, user behavior changes), the model's predictions become biased even if the model itself hasn't changed. This is a primary cause of prediction bias in production ML systems.

Exam trap

Google Cloud often tests the distinction between training-time issues (like overfitting) and production-time causes (like data drift and training-serving skew), so candidates mistakenly select overfitting as a production bias cause.

37
MCQhard

A real-time recommendation system uses a custom container deployed on AI Platform Prediction. The model requires a large in-memory embedding lookup table that is loaded from Cloud Storage at startup. The current startup time is over 5 minutes, causing prediction requests to timeout. Which strategy would most effectively reduce startup time?

A.Increase the machine type to one with more memory and CPU.
B.Preload the embedding table into a persistent disk and attach it to the container.
C.Reduce the size of the embedding table by using a smaller embedding dimension or fewer categories.
D.Use a faster storage class for the Cloud Storage bucket, such as Standard instead of Nearline.
AnswerC

Smaller table loads faster, directly addressing startup time.

Why this answer

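Option C is correct because startup time is dominated by loading the large embedding table, so shrinking the table (a smaller embedding dimension or fewer categories) directly reduces the amount of data that must be loaded. Option A is unlikely to help much, since more CPU and memory do not make the load itself substantially faster. Option B adds operational complexity. Option D may not be possible or effective, because the storage class is not the bottleneck.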

38
MCQhard

Refer to the exhibit. The feature store 'my_fs' responds to offline queries but online serving requests fail. What should the team do to resolve this?

A.Create a new feature store with online serving enabled
B.Use Cloud Bigtable directly
C.Update the existing feature store to enable online serving
D.Re-import features into a new store
AnswerC

Online serving can be enabled by setting appropriate scaling configuration.

Why this answer

The feature store 'my_fs' responds to offline queries but not online serving requests, which indicates that online serving is not enabled for the feature store. In Vertex AI Feature Store, online serving requires a dedicated endpoint and underlying infrastructure (e.g., Bigtable) to serve low-latency requests. Updating the existing feature store to enable online serving (option C) is the correct fix, as it activates the necessary serving resources without recreating the store.

Exam trap

Google Cloud often tests the misconception that a feature store's offline and online serving are automatically coupled, leading candidates to think a new store or data re-import is required when online serving fails, rather than recognizing that online serving is an optional configuration that must be explicitly enabled on the existing store.

How to eliminate wrong answers

Option A is wrong because creating a new feature store with online serving enabled is unnecessary and wasteful; the existing store can be updated to enable online serving without data re-import. Option B is wrong because using Cloud Bigtable directly bypasses the feature store's managed serving layer, losing integration with Vertex AI's serving APIs, monitoring, and consistency guarantees. Option D is wrong because re-importing features into a new store does not address the root cause—the existing store simply needs its online serving configuration enabled, not a full data migration.

39
MCQmedium

Refer to the exhibit. A Dataflow streaming pipeline subscribes to this Pub/Sub subscription. The pipeline occasionally takes more than 10 seconds to process a message. Which behavior will occur?

A.The message will be sent to the dead letter topic immediately.
B.The message will be retried with exponential backoff as per retry policy.
C.The message will be redelivered after 10 seconds if not acknowledged.
D.The message will be dropped after 10 seconds due to expiration policy.
AnswerC

The ack deadline is 10 seconds; if processing exceeds that, Pub/Sub redelivers the message.

Why this answer

Option C is correct because Pub/Sub delivery requires an acknowledgment within the configurable `ackDeadlineSeconds` (default 10 seconds). If the pipeline takes longer than the ack deadline to process a message, Pub/Sub considers the message unacknowledged and redelivers it. This is the standard behavior for at-least-once delivery in Google Cloud Pub/Sub.

Exam trap

Google Cloud often tests the distinction between ack deadline expiration and dead letter topics, trapping candidates who assume any processing delay immediately triggers a dead letter or that Pub/Sub uses exponential backoff like some other messaging systems.

How to eliminate wrong answers

Option A is wrong because a message is forwarded to a dead letter topic only after the configured number of delivery attempts (`maxDeliveryAttempts`) has been exhausted, not immediately upon exceeding the ack deadline. Option B is wrong because redelivery here is driven by the expired `ackDeadlineSeconds`; by default Pub/Sub redelivers once the deadline lapses, and exponential backoff applies only when a retry policy is explicitly configured on the subscription. Option D is wrong because no retention setting drops a message after 10 seconds: `messageRetentionDuration` controls how long unacknowledged messages are kept (7 days by default), and the subscription expiration policy governs deletion of an inactive subscription, not of individual messages.
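
The interaction between the ack deadline and dead lettering can be sketched with a simplified local model. It assumes the subscriber never extends the deadline and that a dead letter topic is configured; `simulate_deliveries` and its parameters are hypothetical names, not Pub/Sub client calls.

```python
from typing import List


def simulate_deliveries(processing_times: List[float],
                        ack_deadline_seconds: float = 10.0,
                        max_delivery_attempts: int = 5) -> List[str]:
    """Return the outcome of each delivery attempt for a single message.

    processing_times[i] is how long the subscriber takes on attempt i + 1.
    """
    outcomes = []
    for attempt, seconds in enumerate(processing_times, start=1):
        if seconds <= ack_deadline_seconds:
            outcomes.append("acknowledged")  # acked before the deadline
            break
        if attempt >= max_delivery_attempts:
            outcomes.append("dead-lettered")  # attempts exhausted
            break
        outcomes.append("redelivered")  # deadline lapsed, message comes back
    return outcomes


print(simulate_deliveries([12.0, 3.0]))
print(simulate_deliveries([12.0] * 5))
```

A single slow attempt only causes a redelivery; the dead letter topic comes into play after repeated failures, which is the distinction the question is testing.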

40
MCQeasy

A company wants to version its ML models and track lineage from training data to deployed model. Which Google Cloud service should they use?

A.Cloud Storage with object versioning
B.Data Catalog
C.Artifact Registry
D.Vertex AI ML Metadata
AnswerD

ML Metadata tracks artifacts, lineage, and metadata for ML models.

Why this answer

Option D is correct because Vertex AI ML Metadata manages lineage and artifacts. Option A (Cloud Storage) is for storage only. Option C (Artifact Registry) is for container images, not ML models.

Option B (Data Catalog) is for data discovery.

41
MCQeasy

Refer to the exhibit. What is the most likely cause?

A.The model container does not support this prediction route
B.The request format is incorrect
C.The model was built for batch prediction only
D.The endpoint ID is wrong
AnswerA

The error indicates the prediction method is not supported by the model, likely due to container configuration.

Why this answer

Option A is correct: the error 'model type is not supported for this prediction method' suggests the model container does not support the online prediction route (e.g., it expects a different protocol or is designed for batch only). Option D is wrong because the endpoint ID is likely correct; the error is model-specific. Option B is wrong because the request format might be correct but the model rejects it.

Option C is wrong because batch-only models would show a different error; this error indicates the model's container doesn't handle the request.

42
MCQeasy

A data pipeline ingests streaming data from Pub/Sub into BigQuery via Dataflow. Recently, the pipeline has been failing with 'deadline exceeded' errors. What is the most likely cause?

A.The BigQuery streaming quota is exceeded.
B.Dataflow workers are underutilized due to batch size settings.
C.Dataflow autoscaling is disabled.
D.The Pub/Sub subscription's acknowledgement deadline is too short for the processing time.
AnswerD

A short acknowledgment deadline causes messages to be redelivered, leading to repeated processing attempts and eventual deadline exceeded errors.

Why this answer

Option D is correct because 'deadline exceeded' errors in a Dataflow pipeline reading from Pub/Sub indicate that the subscriber is taking longer to process messages than the acknowledgement deadline allows. When the deadline expires, Pub/Sub redelivers the message, causing duplicate processing and eventual pipeline failure. This is a common issue when processing time exceeds the default 10-second acknowledgement deadline.

Exam trap

Google Cloud often tests the distinction between resource quota errors (like BigQuery streaming quota) and Pub/Sub-specific timeout errors, trapping candidates who confuse 'deadline exceeded' with general quota exhaustion.

How to eliminate wrong answers

Option A is wrong because BigQuery streaming quota exceeded would produce 'quota exceeded' or 'rate limit exceeded' errors, not 'deadline exceeded' errors. Option B is wrong because underutilized workers due to batch size settings would cause poor performance or backpressure, not 'deadline exceeded' errors; the error is about processing time vs. acknowledgement deadline, not worker utilization. Option C is wrong because disabled autoscaling would lead to resource exhaustion or latency, but the specific 'deadline exceeded' error is tied to Pub/Sub's acknowledgement mechanism, not Dataflow's scaling behavior.

43
MCQmedium

A retail company is using a machine learning model for inventory forecasting. They observe that the model's predictions become less accurate over time, especially during holiday seasons. Which monitoring metric should they prioritize?

A.Model latency
B.Prediction counts
C.Resource utilization
D.Prediction drift (feature drift)
AnswerD

Monitoring feature drift helps detect when the serving data distribution shifts away from the training data, leading to accuracy loss.

Why this answer

Prediction drift (feature drift) is the correct metric because it directly measures changes in the input data distribution over time, which is the root cause of degrading model accuracy during holiday seasons. When customer behavior shifts (e.g., buying patterns during holidays), the features the model relies on drift, causing predictions to become less accurate. Monitoring prediction drift allows the team to detect when retraining or updating the model is necessary.

Exam trap

Google Cloud often tests the misconception that model latency or resource utilization are the primary concerns for accuracy degradation, when in fact drift monitoring is the key metric for detecting data shifts that cause performance decay.

How to eliminate wrong answers

Option A is wrong because model latency measures the time taken for a single prediction, which is unrelated to accuracy degradation over time. Option B is wrong because prediction counts track the volume of predictions made, not the quality or drift of those predictions. Option C is wrong because resource utilization (CPU, memory, etc.) monitors infrastructure health, not model performance or data distribution shifts.

44
MCQhard

Refer to the exhibit. A team received this error when running a query. Which optimization should they apply first?

A.Use clustering on the date column.
B.Reduce the number of rows by aggregating in a subquery.
C.Run the query as a batch job.
D.Add a WHERE clause on a partitioning column.
AnswerD

If the table is partitioned by date, this prunes partitions and reduces data scanned, directly addressing the resource limit.

Why this answer

Option D is correct because adding a WHERE clause on a partitioning column (if the table is partitioned by date) allows BigQuery to prune partitions, significantly reducing the data scanned. Option A (clustering) is helpful but less effective than partition pruning for date-range queries. Option C (batch job) does not reduce resource usage.

Option B (reducing rows by aggregating in a subquery) is a good practice but not as impactful as partition pruning.
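
The effect of partition pruning can be illustrated with a small simulation. This sketch does not call BigQuery; the partition sizes and the helper name are hypothetical, and the point is only that a filter on the partitioning column limits which partitions are read.

```python
from datetime import date

# Hypothetical table: partition date -> bytes stored in that partition.
PARTITIONS = {
    date(2024, 1, 1): 400,
    date(2024, 1, 2): 350,
    date(2024, 1, 3): 500,
}


def bytes_scanned(partitions, start=None, end=None):
    """Sum the bytes of partitions that survive a date-range filter.

    With no filter, every partition is read (a full scan).
    """
    total = 0
    for day, size in partitions.items():
        if start is not None and day < start:
            continue  # partition pruned: entirely before the range
        if end is not None and day > end:
            continue  # partition pruned: entirely after the range
        total += size
    return total


full = bytes_scanned(PARTITIONS)
pruned = bytes_scanned(PARTITIONS, start=date(2024, 1, 3), end=date(2024, 1, 3))
print(full, pruned)
```

A filter on a non-partitioning column would not skip any partition in this model, which is why the filter column matters.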

45
MCQeasy

A company needs to process real-time clickstream data and store it in a data warehouse for SQL-based analytics. The data volume is moderate. Which combination of Google Cloud services is most cost-effective?

A.Cloud Pub/Sub, Cloud Dataproc, Cloud Storage
B.Cloud Pub/Sub, Cloud Dataflow, Cloud Spanner
C.Cloud Pub/Sub, Cloud Dataflow, BigQuery
D.Cloud Pub/Sub, Cloud Dataflow, Cloud Storage
AnswerC

Best for real-time SQL analytics.

Why this answer

Option C is correct because Cloud Pub/Sub ingests real-time clickstream data, Cloud Dataflow processes it with low latency, and BigQuery provides a serverless, SQL-based data warehouse that is cost-effective for moderate data volumes due to its pay-per-query pricing and automatic scaling. This combination avoids the overhead of managing clusters (Dataproc) or expensive storage (Cloud Spanner) while directly supporting SQL analytics.

Exam trap

Google Cloud often tests the misconception that Cloud Storage is a suitable destination for analytics-ready data. It lacks native SQL querying, and candidates who choose it overlook BigQuery's direct integration with Dataflow for real-time analytics.

How to eliminate wrong answers

Option A is wrong because Cloud Dataproc requires a running cluster (even with preemptible VMs) and is optimized for batch processing, not real-time streaming, and Cloud Storage is not a SQL-queryable data warehouse, forcing additional ETL steps. Option B is wrong because Cloud Spanner is a globally distributed, strongly consistent relational database designed for transactional workloads, not cost-effective for analytics at moderate data volumes; its per-node pricing makes it expensive compared to BigQuery's serverless model. Option D is wrong because Cloud Storage is an object store, not a data warehouse; storing processed data there would require additional services (e.g., BigQuery external tables or Dataproc) to run SQL analytics, increasing complexity and cost.

46
Multi-Selecthard

Which THREE considerations are important when designing a batch prediction pipeline for a large dataset on Vertex AI?

Select 3 answers
A.Batch prediction automatically uses GPUs if the model framework requires them
B.Batch prediction requires a dedicated real-time endpoint
C.Choosing the appropriate machine type (e.g., n1-standard-16) balances cost and throughput
D.Large input files can be split into multiple smaller files to improve parallelism
E.Input data should be in Cloud Storage in a format supported by Vertex AI (e.g., JSONL, TFRecord)
AnswersC, D, E

Machine type impacts performance and cost.

Why this answer

Option C is correct because selecting the appropriate machine type, such as n1-standard-16, directly impacts the cost-performance trade-off in batch prediction. Vertex AI batch prediction jobs run on Compute Engine instances, and choosing a machine type with more vCPUs and memory can increase throughput for large datasets, but also raises cost. The key is to match the machine type to the model's computational needs and the data volume, avoiding over-provisioning while ensuring the job completes within acceptable time.

Exam trap

Google Cloud often tests the misconception that batch prediction requires a real-time endpoint or automatically uses GPUs, when in fact batch prediction is a serverless, endpoint-free process that requires explicit machine type and GPU configuration.
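
Splitting one large input into several smaller files, as option D suggests, can be sketched as a round-robin distribution of records. The function name and the in-memory lists are assumptions for illustration; a real job would write each shard to its own file in Cloud Storage.

```python
def shard_lines(lines, num_shards):
    """Distribute records round-robin into num_shards lists."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    shards = [[] for _ in range(num_shards)]
    for index, line in enumerate(lines):
        shards[index % num_shards].append(line)
    return shards


# Ten hypothetical JSONL records split into three shards.
records = ['{"id": %d}' % i for i in range(10)]
shards = shard_lines(records, 3)
print([len(shard) for shard in shards])
```

Round-robin keeps shard sizes within one record of each other, so no single input file dominates the job's runtime.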

47
MCQmedium

A data platform uses Cloud Spanner for transactional data. They are experiencing high latency during write-heavy periods. To maintain solution quality, what configuration change is most effective?

A.Use interleaved tables to reduce the number of split operations.
B.Enable online schema changes.
C.Increase the number of nodes in the Cloud Spanner instance.
D.Manually split the table using ALTER TABLE statements.
AnswerA

Interleaved tables store related rows in the same split, minimizing distributed transaction overhead.

Why this answer

Option A is correct because interleaved tables improve data locality, reducing the number of splits and distributed commits, thus reducing write latency. Option C (increasing nodes) can increase throughput but may increase latency due to more distributed transactions. Option B (online schema changes) does not directly affect write performance.

Option D (manual splitting) is not recommended and may worsen performance.

48
MCQmedium

A financial company processes transactions in real-time and requires exactly-once processing semantics. They also need to reprocess historical data for backtesting. Which Google Cloud service should they use?

A.Cloud Pub/Sub
B.Cloud Functions
C.Cloud Dataproc
D.Cloud Dataflow
AnswerD

Supports exactly-once and batch/streaming.

Why this answer

Cloud Dataflow (D) is correct because it provides exactly-once processing semantics through checkpointing and record deduplication (the approach described in the MillWheel paper) and supports both real-time streaming and batch processing for historical backtesting under a unified programming model. This allows the company to reprocess historical data using the same pipeline code, ensuring consistency across real-time and batch modes.

Exam trap

Google Cloud often tests the misconception that Cloud Pub/Sub (A) provides exactly-once delivery by default; its default guarantee is at-least-once delivery. Candidates also overlook Dataflow's unified batch/streaming model for reprocessing historical data.

How to eliminate wrong answers

Option A is wrong because Cloud Pub/Sub is a messaging service that offers at-least-once delivery by default, not exactly-once processing, and it lacks built-in capabilities for reprocessing historical data in a unified batch/streaming manner. Option B is wrong because Cloud Functions is an event-driven serverless compute service that does not provide exactly-once processing guarantees or native support for reprocessing large historical datasets; it is designed for lightweight, stateless functions. Option C is wrong because Cloud Dataproc is a managed Hadoop/Spark service that does not natively guarantee exactly-once processing semantics and requires manual handling of state and reprocessing logic, unlike Dataflow's automatic checkpointing.
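
The gap between at-least-once delivery and exactly-once results can be shown with a small deduplication sketch. This is an illustration of the general idea (apply each record once, keyed by a unique ID), not Dataflow's internal implementation; the event tuples and function name are hypothetical.

```python
def apply_once(events, balances=None, seen_ids=None):
    """Apply each (event_id, account, amount) at most once."""
    balances = {} if balances is None else balances
    seen_ids = set() if seen_ids is None else seen_ids
    for event_id, account, amount in events:
        if event_id in seen_ids:
            continue  # duplicate delivery, already applied
        seen_ids.add(event_id)
        balances[account] = balances.get(account, 0) + amount
    return balances


# "e1" is delivered twice, as can happen with at-least-once delivery.
stream = [
    ("e1", "acct-1", 100),
    ("e2", "acct-1", -30),
    ("e1", "acct-1", 100),
]
print(apply_once(stream))
```

Replaying the same historical events through the same function yields the same balances, which is the property backtesting relies on.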

49
MCQmedium

What is the most likely cause of this error?

A.The BigQuery table is not partitioned
B.The Dataflow worker does not have the correct time zone
C.The pipeline is using a fixed window but the data is out of order
D.The schema of the BigQuery table expects a TIMESTAMP but the pipeline is sending a STRING
AnswerD

The error clearly shows an attempt to convert a string to a timestamp, indicating a schema mismatch.

Why this answer

Option D is correct because the error message indicates a type mismatch: BigQuery expects a TIMESTAMP column, but the pipeline is sending a STRING. Dataflow's BigQuery sink performs automatic schema validation, and if the source data type (STRING) does not match the target column type (TIMESTAMP), the write operation fails with a mismatch error. This is a common issue when pipeline code or source data formats timestamps as strings without explicit conversion.

Exam trap

Google Cloud often tests the distinction between schema type mismatches and data ordering or partitioning issues, so candidates may confuse a type error with a windowing or time zone problem.

How to eliminate wrong answers

Option A is wrong because a non-partitioned BigQuery table would not cause a type mismatch error; it would instead cause performance issues or quota errors on large writes. Option B is wrong because the Dataflow worker's time zone affects timestamp interpretation, not the data type of the field being written; the error is about schema type, not time zone conversion. Option C is wrong because out-of-order data with a fixed window causes late data handling or watermark issues, not a schema type mismatch; the error is specifically about the data type sent to BigQuery.

50
MCQmedium

Your organization deploys multiple versions of the same model to Vertex AI Endpoint for A/B testing. You have a production model (v1) serving 90% of traffic and a candidate model (v2) serving 10%. After one week, you observe that v2 has a slightly lower AUC but significantly higher business metrics like click-through rate. The product team wants to gradually increase v2's traffic. However, you need to ensure that the overall prediction latency remains under 200 ms. Currently, the endpoint has 10 replicas for v1 and 2 replicas for v2. What is the best approach to roll out v2 while maintaining latency SLO?

A.Merge v2's model into v1 by retraining v1 with v2's architecture and deploy as a single model.
B.Immediately set v2 to serve 100% traffic and monitor latency; if it exceeds 200 ms, roll back.
C.Increase v2's traffic split by 10% each day while also adding replicas for v2 based on CPU utilization.
D.Use a separate endpoint for v2 and route traffic at the load balancer level.
AnswerC

Gradual increase with autoscaling ensures latency remains within bounds.
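
A gradual rollout like the one in option C can be planned as a simple schedule of traffic percentages. This is a sketch under stated assumptions: the function name is hypothetical, the 10% starting point and daily step come from the question, and replica counts are left to autoscaling rather than computed here.

```python
def rollout_schedule(start_percent=10, step=10):
    """Return daily (v1, v2) traffic percentages until v2 reaches 100."""
    schedule = []
    v2 = start_percent
    while v2 < 100:
        v2 = min(100, v2 + step)
        schedule.append((100 - v2, v2))  # the two shares always sum to 100
    return schedule


for day, (v1, v2) in enumerate(rollout_schedule(), start=1):
    print(f"day {day}: v1={v1}% v2={v2}%")
```

Each step would only be applied if latency stayed under the 200 ms target on the previous day; that check is outside this sketch.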

51
MCQmedium

Your team has implemented a CI/CD pipeline using Cloud Composer (Apache Airflow) to retrain a model every day. The pipeline reads new data from BigQuery, trains a model using Vertex AI Training, evaluates it, and if the accuracy improves, deploys it to a Vertex AI Endpoint. For the past week, the pipeline has been running successfully but no new model has been deployed because the evaluation accuracy never exceeds the previous model's accuracy. The training data volume has been consistent. You suspect that the model is not learning from the new data. What should you do?

A.Deploy the new model anyway and run an A/B test in production to see if it performs better online.
B.Examine the training data for any data quality issues such as missing values or label leakage.
C.Increase the training budget or number of training steps to allow the model to converge better.
D.Change the evaluation metric to a different one that may show improvement, such as F1 score instead of accuracy.
AnswerB

Data quality issues can prevent the model from learning meaningful patterns despite sufficient data volume.

52
Matchingmedium

Match each data lifecycle stage to its description.

Concepts
Matches

Collecting data from various sources

Persisting data in a durable system

Transforming and analyzing data

Making data available for consumption

Moving data to long-term, low-cost storage

Why these pairings

Common stages in a data lifecycle.

53
MCQeasy

You are monitoring a Dataflow streaming job and need to track the freshness of data being processed. What metric should you alert on?

A.Output throughput (elements/sec)
B.Error count
C.Data freshness (seconds)
D.CPU utilization
AnswerC

Data freshness measures the latency of the last processed event, indicating pipeline delay.

Why this answer

Data freshness (seconds) is the correct metric to alert on because it directly measures the lag between when an event occurs and when it is processed by the Dataflow pipeline. Dataflow monitoring reports it as the age of the data watermark, separately from system latency (system lag), so it indicates how up-to-date the output is relative to real time. Alerting on data freshness ensures that the pipeline is meeting service-level agreements (SLAs) for real-time or near-real-time processing.

Exam trap

Google Cloud often tests the distinction between throughput and latency metrics, and the trap here is that candidates confuse high throughput with low latency, not realizing that a pipeline can process many elements per second while still having stale data due to watermark delays or unprocessed late data.

How to eliminate wrong answers

Option A is wrong because output throughput (elements/sec) measures processing rate, not timeliness; a pipeline can have high throughput but still be processing stale data due to backlog or watermark delays. Option B is wrong because error count tracks failures (e.g., exceptions, dropped elements) but does not indicate how current the processed data is; a pipeline with zero errors could still have high latency. Option D is wrong because CPU utilization is a resource metric that reflects compute efficiency, not data freshness; high CPU might cause delays, but it is an indirect indicator and not the direct measure of data staleness.
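
The freshness calculation itself is a subtraction, which the following sketch makes explicit. It treats freshness as the gap between the current time and the watermark; the function names and the 300-second alert threshold are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def data_freshness_seconds(now, watermark):
    """Seconds between the current time and the data watermark."""
    return max(0.0, (now - watermark).total_seconds())


def should_alert(now, watermark, threshold_seconds=300):
    """True when the pipeline is further behind than the threshold."""
    return data_freshness_seconds(now, watermark) > threshold_seconds


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
watermark = now - timedelta(minutes=8)  # pipeline is 8 minutes behind
print(data_freshness_seconds(now, watermark), should_alert(now, watermark))
```

Note that nothing in the calculation depends on throughput, which is why a high elements-per-second figure says nothing about staleness.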

54
MCQhard

A company is migrating their on-premises Hadoop cluster to Google Cloud. The existing cluster runs HDFS, Hive, and Spark jobs. The migration must minimize changes to existing job code and configuration. The data volume is 50 TB and growing. The team expects to run both batch and interactive SQL queries. Which architecture should they use?

A.Keep HDFS on persistent Cloud Dataproc clusters and use BigQuery for SQL queries.
B.Use Cloud Dataflow for all batch processing and BigQuery for storage and querying.
C.Migrate HDFS to Cloud Storage, create a Cloud Dataproc cluster for Spark jobs, and use BigQuery for interactive SQL queries via a Hive metastore linked to BigQuery.
D.Use Cloud Dataproc with ephemeral clusters and Cloud Storage (instead of HDFS) for data storage. Run Spark jobs directly, and use Cloud Dataproc's built-in Hive on Cloud Dataproc for SQL queries.
AnswerD

Cloud Dataproc can use Cloud Storage as the data layer; most Spark and Hive jobs need minimal changes (e.g., file path prefix). Ephemeral clusters reduce cost. This preserves existing code.

Why this answer

Option D is correct because it uses Cloud Storage as the underlying storage layer; the Cloud Storage connector exposes it through the Hadoop file system interface, so existing Spark and Hive jobs typically run with little more than a path change from hdfs:// to gs://. Ephemeral Dataproc clusters reduce costs, and Hive on Dataproc serves the SQL queries, meeting both batch and interactive requirements without rewriting jobs.

Exam trap

Google Cloud often tests the misconception that BigQuery must be used for all SQL queries in a migration, ignoring that Dataproc's Hive can directly query data in Cloud Storage without code changes, making it a simpler path for interactive SQL on existing Hive workloads.

How to eliminate wrong answers

Option A is wrong because keeping HDFS on persistent Dataproc clusters does not leverage Cloud Storage's scalability and cost benefits, and using BigQuery for SQL queries would require significant code changes to redirect queries away from Hive. Option B is wrong because Cloud Dataflow is not designed for Spark job compatibility, and using BigQuery for storage would break existing HDFS-based job code and configurations. Option C is wrong because linking a Hive metastore to BigQuery requires modifying the Hive configuration and does not support running Spark jobs directly on BigQuery storage without additional connectors, increasing complexity and potential code changes.

55
Multi-Selecthard

Which TWO statements about designing a data processing pipeline on Google Cloud are correct? (Choose 2.)

Select 2 answers
A.Pub/Sub guarantees message ordering across all subscribers globally.
B.Cloud Bigtable is ideal for data warehousing and SQL analytics.
C.Dataproc is the best choice for fully managed data warehousing and analytics.
D.Cloud Data Fusion allows you to build and manage data pipelines visually without writing code.
E.Dataflow supports both batch and streaming modes in a single pipeline model.
AnswersD, E

Cloud Data Fusion provides a visual UI for designing pipelines.

Why this answer

Cloud Data Fusion provides a visual, no-code interface for building and managing data pipelines, enabling users to design ETL/ELT workflows through a drag-and-drop UI. It abstracts the underlying complexity of Apache Spark and Cloud Dataproc, making it suitable for users who prefer a graphical approach over writing code.

Exam trap

Google Cloud often tests the distinction between fully managed services (like BigQuery for warehousing) and managed cluster services (like Dataproc), as well as the limitations of Pub/Sub ordering guarantees, to see if candidates confuse operational databases with analytical systems.

56
MCQeasy

A data engineer needs to design a batch pipeline that processes daily log files from Cloud Storage and writes aggregated results to BigQuery. Which service is most appropriate for this ETL job?

A.Cloud Pub/Sub with Cloud Functions
B.Cloud Composer
C.Cloud Data Fusion
D.Dataproc with PySpark
AnswerD

Dataproc handles large batch processing efficiently with Spark.

Why this answer

Dataproc with PySpark is the most appropriate choice because it provides a managed Spark/Hadoop environment that can efficiently process large daily log files stored in Cloud Storage using distributed computing. PySpark's native integration with BigQuery via the Spark BigQuery connector allows direct writing of aggregated results, making it ideal for batch ETL workloads that require complex transformations and high throughput.

Exam trap

The trap here is that candidates often confuse orchestration (Cloud Composer) with execution, or assume serverless options like Cloud Functions can handle heavy batch ETL, but the question specifically requires a service that performs the ETL processing, not just schedules or triggers it.

How to eliminate wrong answers

Option A is wrong because Cloud Pub/Sub with Cloud Functions is designed for event-driven, real-time streaming pipelines, not for batch processing of daily log files; Cloud Functions also enforces an execution timeout (9 minutes for 1st gen functions) and is not suited for heavy ETL jobs. Option B is wrong because Cloud Composer is a workflow orchestration tool (based on Apache Airflow) that schedules and monitors jobs, but it does not perform the actual data processing or transformation itself. Option C is wrong because Cloud Data Fusion is a visual data integration service for building pipelines, but it is more suited for low-code ETL and may lack the flexibility and performance of PySpark for large-scale batch log processing with custom transformations.

57
MCQeasy

A data engineer needs to monitor model performance over time for drift detection. What tool is specifically designed for this?

A.Vertex AI Model Monitoring
B.Cloud Monitoring
C.Cloud Logging
D.BigQuery ML
AnswerA

Vertex AI Model Monitoring provides drift detection, skew detection, and alerts for deployed models.

Why this answer

Vertex AI Model Monitoring is specifically designed to detect prediction drift and feature skew in deployed machine learning models. It continuously analyzes serving data against training data distributions and alerts when statistical metrics (e.g., Jensen-Shannon divergence, L-infinity distance) exceed configured thresholds, making it the correct tool for drift detection in the context of operationalizing ML models.

Exam trap

Google Cloud often tests the distinction between general-purpose monitoring tools (Cloud Monitoring, Cloud Logging) and ML-specific monitoring services (Vertex AI Model Monitoring), trapping candidates who assume any monitoring tool can handle drift detection.

How to eliminate wrong answers

Option B (Cloud Monitoring) is wrong because it is a general-purpose infrastructure and application monitoring service for metrics, uptime, and alerting, not specialized for ML model drift detection. Option C (Cloud Logging) is wrong because it is a centralized log management and analysis service for storing and querying log data, not designed to compute statistical drift between training and serving distributions. Option D (BigQuery ML) is wrong because it is a service for creating and executing machine learning models using SQL queries in BigQuery, not a monitoring tool for detecting drift in already-deployed models.
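
The two statistics named above, Jensen-Shannon divergence and L-infinity distance, can be computed directly on a pair of categorical distributions. This is a minimal sketch of the math only, not the Vertex AI Model Monitoring implementation; the example distributions are made up, and both inputs are assumed to be probability lists over the same categories.

```python
import math


def l_infinity(p, q):
    """Largest absolute difference between matching probabilities."""
    return max(abs(a - b) for a, b in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence using base-2 logs (range 0 to 1)."""
    def kl(x, y):
        # Terms with x == 0 contribute nothing to the sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    m = [(a + b) / 2 for a, b in zip(p, q)]  # midpoint distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


training = [0.5, 0.3, 0.2]  # hypothetical category shares at training time
serving = [0.2, 0.3, 0.5]   # hypothetical category shares in production
print(round(l_infinity(training, serving), 3))
print(round(js_divergence(training, serving), 4))
```

A monitoring job would compare each value against a configured threshold and raise an alert when it is exceeded.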

58
MCQhard

A healthcare organization is deploying a model that processes protected health information (PHI). They need to ensure that the inference data is encrypted in transit and at rest, and access is audited. Which combination of services meets these requirements?

A.Cloud Run with VPC connector and Cloud KMS
B.Vertex AI Endpoints with IAM and Cloud Monitoring
C.Vertex AI Endpoints with VPC-SC, CMEK, and Cloud Audit Logs
D.AI Platform Prediction with Cloud Armor
AnswerC

VPC Service Controls protect against unauthorized data movement, CMEK for customer-managed encryption keys, and Cloud Audit Logs for compliance.

Why this answer

Option C is correct because VPC Service Controls (VPC-SC) provides data exfiltration protection and ensures inference data remains within a defined security perimeter, Customer-Managed Encryption Keys (CMEK) encrypt data at rest with keys controlled by the organization, and Cloud Audit Logs capture all access events for auditing. This combination addresses encryption at rest (via CMEK) and access auditing (via Cloud Audit Logs); encryption in transit comes from the TLS that Google Cloud APIs, including Vertex AI endpoints, use by default, while VPC-SC constrains where the data can flow.

Exam trap

Google Cloud often tests the misconception that IAM and Cloud Monitoring alone satisfy encryption and auditing requirements, but IAM controls access without encrypting data at rest, and Cloud Monitoring tracks performance metrics, not access logs; candidates must recognize that VPC-SC, CMEK, and Cloud Audit Logs are the specific services needed for encryption and auditing in ML inference.

How to eliminate wrong answers

Option A is wrong because Cloud Run with VPC connector and Cloud KMS does not provide a managed inference endpoint optimized for ML models; Cloud Run is a general-purpose compute service, and while VPC connector enables private networking and Cloud KMS manages encryption keys, it lacks the specific model hosting, scaling, and monitoring capabilities of Vertex AI Endpoints. Option B is wrong because Vertex AI Endpoints with IAM and Cloud Monitoring provides access control and performance monitoring but does not encrypt data at rest with customer-managed keys (CMEK) or enforce a data perimeter via VPC-SC; Cloud Monitoring logs metrics, not access events for auditing. Option D is wrong because AI Platform Prediction (now legacy) with Cloud Armor provides DDoS protection but does not encrypt data at rest with CMEK or provide VPC-SC perimeter controls; Cloud Armor operates at the network edge and does not address encryption or audit logging requirements.

59
Multi-Selectmedium

You are optimizing a Dataflow pipeline that performs a group-by-key transformation on a large, skewed dataset. The pipeline is experiencing high latency due to data skew (some keys have many more values). Which TWO actions can help mitigate the skew? (Choose two.)

Select 2 answers
A.Use hot key detection and split the hot key into multiple sub-keys (e.g., append a random number).
B.Enable the Dataflow service's automatic reshuffling feature.
C.Use CoGroupByKey to reduce the number of keys.
D.Increase the number of worker machines.
E.Use Combine.perKey with a combiner to aggregate values locally before shuffling.
AnswersA, E

Splitting a hot key distributes its values across multiple workers, reducing bottleneck.

Why this answer

Option A is correct because splitting a hot key into multiple sub-keys (e.g., by appending a random number) distributes the values across multiple shards during the shuffle phase, reducing the load on any single worker. This technique, often called "salting," is a standard pattern in Dataflow and Apache Beam to handle data skew by breaking the bottleneck caused by a single key with disproportionately many values.

Exam trap

Google Cloud often tests the misconception that simply adding more workers (Option D) or enabling automatic reshuffling (Option B) can fix data skew, when in fact these actions do not address the root cause of a single key being processed by one shard.
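
The salting pattern can be sketched without a Beam runner as a two-stage aggregation: first sum per (key, salt), then strip the salt and sum again. The function name, the number of salts, and the fixed seed are assumptions for illustration; in a pipeline the first stage is what spreads a hot key across workers.

```python
import random
from collections import defaultdict


def salted_sum(pairs, num_salts=4, seed=0):
    """Sum values per key using a salted two-stage aggregation."""
    rng = random.Random(seed)

    # Stage 1: aggregate per (key, salt) so one hot key becomes
    # up to num_salts smaller sub-keys.
    partial = defaultdict(int)
    for key, value in pairs:
        partial[(key, rng.randrange(num_salts))] += value

    # Stage 2: drop the salt and combine the partial sums.
    totals = defaultdict(int)
    for (key, _salt), subtotal in partial.items():
        totals[key] += subtotal
    return dict(totals), len(partial)


pairs = [("hot", 1)] * 1000 + [("cold", 1)] * 3
totals, sub_key_count = salted_sum(pairs)
print(totals, sub_key_count)
```

The final totals match an unsalted sum; only the intermediate grouping changes. This works because addition is associative and commutative, the same property a combiner relies on.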

60
MCQhard

Your MLOps pipeline uses Vertex AI Pipelines. You want to ensure that model training uses a consistent environment with specific Python package versions. Which approach best achieves this?

A.Include a requirements.txt file in the pipeline step and let Vertex AI install them
B.Use a pre-built deep learning container from Deep Learning Containers and install packages at runtime
C.Specify the Python version and package versions in the training job configuration
D.Build a custom container image with all dependencies and use it in the training step
AnswerD

Custom containers ensure exact same environment.

Why this answer

Option D is correct because building a custom container image with all dependencies ensures a fully deterministic and reproducible environment for model training. Vertex AI Pipelines executes each step as a container, so by pre-installing specific Python package versions into a custom image, you eliminate any risk of version drift or network issues during package installation at runtime. This approach aligns with MLOps best practices for environment consistency and is the most reliable method when exact package versions are critical.

Exam trap

Google Cloud often tests the distinction between runtime configuration (options A, B, C) and pre-built containerization (option D), trapping candidates who think specifying versions in a config file or installing at runtime is sufficient for full environment consistency in a pipeline context.

How to eliminate wrong answers

Option A is wrong because including a requirements.txt file and letting Vertex AI install them at runtime introduces variability; the installation may fail due to network issues, dependency conflicts, or changes in package repositories, and it does not guarantee the same environment across pipeline retries. Option B is wrong because using a pre-built deep learning container and installing packages at runtime still relies on runtime installation, which can lead to inconsistent environments if the installation process fails or if package versions are not pinned correctly. Option C is wrong because versions declared in job or step configuration are still resolved and installed when the step starts, so it shares the runtime-installation risks of options A and B; Vertex AI Pipelines runs each step as a container, and only an image with the dependencies already installed fixes the environment ahead of time.

61
MCQmedium

Refer to the exhibit. An ML engineer sees this error when invoking a Vertex AI endpoint. What is the most likely cause?

A.The input format should be JSON
B.The model has a bug in the ResNet50 architecture
C.The model expects 128x128 images but raw input is 256x256
D.The endpoint is overloaded
AnswerC

The error shows expected shape [1,128,128,3] but got [1,256,256,3], indicating image size mismatch.

Why this answer

The error indicates a mismatch between the input dimensions expected by the model and the dimensions of the data being sent to the Vertex AI endpoint. This model's serving signature expects 128x128 images (ResNet50 is more commonly trained on 224x224 inputs, but a model can be exported with a different input size), so 256x256 raw input is rejected because it does not match the input tensor shape. This is a typical input validation error, raised when the shape of the prediction request is checked against the model's signature.

Exam trap

Google Cloud often tests the distinction between input validation errors (e.g., shape mismatch) and model logic errors (e.g., architecture bugs), so candidates mistakenly attribute the error to a model bug or endpoint overload rather than a simple data preprocessing mismatch.

How to eliminate wrong answers

Option A is wrong because the error is about image dimensions, not the serialization format; Vertex AI endpoints accept JSON by default, and the error message would explicitly mention 'invalid format' if that were the issue. Option B is wrong because a bug in the ResNet50 architecture would cause inference errors or incorrect predictions, not a dimension mismatch error at the endpoint level. Option D is wrong because an overloaded endpoint would return a 429 HTTP status code or a 'resource exhausted' error, not a dimension mismatch error.
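
A client can catch this kind of mismatch before sending the request by checking the shape of each instance. The sketch below uses nested Python lists in place of a tensor library; the helper names are hypothetical, and the expected per-instance shape (128, 128, 3) is taken from the error in the exhibit, without the leading batch dimension.

```python
def tensor_shape(nested):
    """Infer the shape of a rectangular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return shape


def check_instance(instance, expected=(128, 128, 3)):
    """Raise ValueError if the instance shape differs from expected."""
    actual = tuple(tensor_shape(instance))
    if actual != tuple(expected):
        raise ValueError(f"expected shape {tuple(expected)}, got {actual}")
    return True


def blank_image(height, width, channels=3):
    """Build an all-zero image as nested lists (test data only)."""
    return [[[0] * channels for _ in range(width)] for _ in range(height)]


print(check_instance(blank_image(128, 128)))
try:
    check_instance(blank_image(256, 256))
except ValueError as err:
    print(err)
```

The actual fix is to resize images to the model's input size in preprocessing; the check only makes the failure visible on the client side.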

62
MCQeasy

Your company runs batch predictions using Vertex AI Batch Prediction on a monthly basis. The predictions are used to generate customer segments for marketing campaigns. This month, the batch prediction job failed with an error: 'The number of rows in the input table does not match the number of rows in the output table.' The input table in BigQuery has 5 million rows, but the output table has only 4.5 million rows. You need to identify and handle the missing predictions. What is the most efficient course of action?

A.Manually inspect the input table to find which rows are missing and rerun the batch prediction for those rows.
B.Run the batch prediction job with the 'generate_explanation' parameter enabled to get additional output for debugging.
C.Enable the 'write_prediction_errors' flag in the batch prediction configuration to capture failed predictions in a separate table.
D.Use a Cloud Dataflow pipeline to process the input data and call the model for each row, handling errors programmatically.
AnswerC

This flag causes failed predictions to be written to an error table, allowing you to identify and correct the problematic rows.

63
MCQeasy

You are using AI Platform Prediction (now Vertex AI) for online predictions. You notice that some requests are failing with a 503 status code. Which is the most likely cause?

A.The model is experiencing high traffic and the underlying nodes are still scaling up
B.The input data format does not match the model's expected schema
C.The project has exceeded its prediction requests quota
D.The service account used for prediction does not have the required permissions
AnswerA

503 errors often occur during scaling.

Why this answer

A 503 status code in Vertex AI (formerly AI Platform Prediction) indicates that the prediction service is temporarily unavailable, most commonly due to autoscaling latency. When a model receives a sudden spike in traffic, the underlying nodes (compute instances) may still be provisioning and initializing, causing requests to be rejected until the new nodes are ready to serve. This is a transient condition that resolves once scaling completes.

Exam trap

Google Cloud often tests the distinction between HTTP 503 (service unavailable, transient), HTTP 429 (quota exceeded), and HTTP 400 (bad request). Candidates who blur these codes mistakenly attribute scaling issues to quota exhaustion or permission errors.

How to eliminate wrong answers

Option B is wrong because a mismatch in input data format (e.g., wrong tensor shape or feature names) would result in a 400 Bad Request error, not a 503. Option C is wrong because exceeding prediction request quota would return a 429 Too Many Requests error, not a 503. Option D is wrong because insufficient permissions (e.g., a service account lacking the `aiplatform.endpoints.predict` permission) would cause a 403 Forbidden error, not a 503.
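
Because a 503 during scale-up is transient while 400, 403, and 429 point to other causes, a client can reasonably retry only the 503 case. The following is a minimal sketch of that idea, not Vertex AI client code: `send` is a hypothetical callable returning a `(status, body)` tuple, and the attempt count and delays are arbitrary assumptions.

```python
import time

# Likely causes per status code, as summarized in the explanation above.
LIKELY_CAUSE = {
    400: "bad request: input does not match the expected schema",
    403: "forbidden: caller lacks permission to call predict",
    429: "too many requests: quota exceeded",
    503: "service unavailable: nodes may still be scaling up (transient)",
}


def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns 503.

    `send` is a hypothetical zero-argument callable returning (status, body).
    Any status other than 503 is returned immediately without retrying.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Delays grow as base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    return status, body
```

A 400 or 403 is returned on the first attempt, since retrying would not change the outcome.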

64
MCQhard

A company processes large volumes of GPS sensor data stored in Cloud Storage. Each hour, they run an Apache Spark job that aggregates the data by geohash region. The job must be cost-effective and scale automatically. Currently, they are using a Dataproc cluster with preemptible workers. Which improvement would best reduce costs while maintaining performance?

A.Use a larger Dataproc cluster with standard workers
B.Migrate the job to BigQuery scheduled queries
C.Switch to Dataflow batch pipeline with Apache Beam
D.Use Dataproc Serverless Spark
AnswerD

Dataproc Serverless Spark runs Spark jobs without cluster management, scales automatically, and you pay only for resources used, reducing cost.

Why this answer

Dataproc Serverless Spark (Option D) eliminates the need to manage a cluster, automatically scaling resources to match job demand and charging only for the resources consumed during execution. This removes the overhead of preemptible worker management and idle cluster costs, directly reducing expenses while maintaining performance for the hourly aggregation job.

Exam trap

Google Cloud often tests the misconception that migrating to a different processing engine (like Dataflow or BigQuery) is always the best cost-saving move, when in fact reusing existing Spark code on a serverless platform avoids migration costs and leverages the same API.

How to eliminate wrong answers

Option A is wrong because using a larger cluster with standard workers increases costs due to higher per-hour instance pricing and potential idle time, without addressing the cost inefficiency of preemptible workers. Option B is wrong because BigQuery scheduled queries are designed for SQL-based analytics on data already in BigQuery, not for processing large volumes of GPS sensor data stored in Cloud Storage with Apache Spark aggregations; migrating would require rewriting the Spark logic and may incur high BigQuery slot costs. Option C is wrong because while Dataflow batch pipelines with Apache Beam can process data cost-effectively, they require rewriting the existing Spark job into Beam, introducing development overhead and potential performance differences, whereas Dataproc Serverless Spark directly runs the existing Spark code without migration.

65
MCQeasy

A company needs to stream real-time user click events from a web application to BigQuery for analysis. Which Google Cloud architecture is most suitable?

A.App Engine -> Pub/Sub -> Dataflow -> BigQuery
B.Cloud Scheduler -> BigQuery
C.Compute Engine -> Cloud Storage -> BigQuery
D.Cloud Functions -> BigQuery
AnswerA

This architecture supports real-time streaming with decoupled components.

Why this answer

Option A is correct because it provides a fully managed, scalable, and decoupled architecture for ingesting real-time click events. Pub/Sub acts as a durable, asynchronous message buffer that can handle high-throughput streams, Dataflow (Apache Beam) processes the events in near real-time with exactly-once semantics, and BigQuery serves as the analytics warehouse. This pattern is the recommended Google Cloud approach for streaming analytics, as it decouples producers from consumers and supports auto-scaling.

Exam trap

The trap here is that candidates often choose Cloud Functions (Option D) thinking it is sufficient for real-time ingestion, but they overlook its execution timeout and lack of built-in streaming semantics, which makes it unsuitable for sustained high-throughput event pipelines.

How to eliminate wrong answers

Option B is wrong because Cloud Scheduler is a cron job service for triggering actions on a schedule, not a real-time event ingestion mechanism; it cannot stream continuous click events. Option C is wrong because writing events from Compute Engine to Cloud Storage and then loading them is a batch-oriented path; it introduces latency and requires additional batch processing to load into BigQuery, making it unsuitable for real-time streaming. Option D is wrong because Cloud Functions is designed for short-lived, event-driven compute with a bounded execution timeout (9 minutes for 1st gen functions), not for continuous, high-throughput streaming; it would also require custom code to buffer and batch writes to BigQuery, losing the managed streaming capabilities of Dataflow.

66
MCQeasy

A data engineer runs this Dataflow template to load CSV files from Cloud Storage into BigQuery. The job fails with a 'File pattern not matching any files' error. What is the most likely cause?

A.The bucket name is incorrectly spelled
B.The CSV files are stored in a subdirectory that is not matched by the pattern
C.The template has a bug
D.The output table does not exist
AnswerB

The pattern `*.csv` in a prefix does not include files in nested subdirectories.

Why this answer

The error 'File pattern not matching any files' indicates that the file pattern specified in the Dataflow template does not resolve to any existing objects in Cloud Storage. If the CSV files are stored in a subdirectory (e.g., gs://bucket/subdir/*.csv) but the pattern only references the root (e.g., gs://bucket/*.csv), no files will be matched. This is the most likely cause because the pattern must explicitly include the subdirectory path.
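
The matching behavior described above can be sketched with a small glob matcher. This is an illustrative stand-in, not the template's actual code: it assumes the common object-store convention that `*` stays within one path segment while `**` may cross `/` boundaries, and the object names are the example paths from the explanation.

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob: '*' does not cross '/', '**' does."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")      # any characters, including '/'
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")   # any characters except '/'
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts) + r"\Z")


def match_objects(pattern, object_names):
    """Return the object names that the pattern matches in full."""
    rx = glob_to_regex(pattern)
    return [name for name in object_names if rx.match(name)]


objects = ["gs://bucket/subdir/a.csv", "gs://bucket/subdir/b.csv"]
root_only = match_objects("gs://bucket/*.csv", objects)            # matches nothing
with_subdir = match_objects("gs://bucket/subdir/*.csv", objects)   # matches both
recursive = match_objects("gs://bucket/**.csv", objects)           # matches both
```

An empty `root_only` list corresponds to the 'File pattern not matching any files' situation in the question.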

Exam trap

Google Cloud often tests the distinction between file pattern matching errors and bucket-level errors, trapping candidates who confuse a missing subdirectory in the pattern with a misspelled bucket name.

How to eliminate wrong answers

Option A is wrong because an incorrectly spelled bucket name would result in a 'bucket not found' or 'access denied' error, not a 'file pattern not matching any files' error. Option C is wrong because the template is a well-tested Google-provided template; a bug is unlikely and would typically cause different errors (e.g., runtime exceptions). Option D is wrong because the output table not existing would cause a BigQuery table creation or write error, not a file pattern matching error in Cloud Storage.

67
MCQhard

A team configured a garbage collection rule on a Cloud Bigtable column family with max_age of 100 seconds. After 2 minutes, they notice that data older than 100 seconds is still present. What is the most likely reason?

A.They need to apply the rule using a different API
B.Garbage collection runs only periodically (e.g., once per day)
C.The max_age must be at least 1 hour
D.The rule is applied only to new data, not existing data
AnswerB

Bigtable GC runs as a background process rather than in real time, so data can outlive a newly set rule for some time.

Why this answer

Cloud Bigtable garbage collection (GC) is not applied in real time; it runs as a background process, and Google's documentation notes that expired data can take up to about a week to be physically removed. Even though the max_age rule is set to 100 seconds, deletion happens only when the background process next reaches that data, so expired cells remain readable in the meantime. Therefore, observing data older than 100 seconds after only 2 minutes is expected behavior.

Exam trap

The trap here is that candidates assume garbage collection is immediate or near-real-time, but Cloud Bigtable's GC is a deferred background process, so expired data persists until it is actually collected.

How to eliminate wrong answers

Option A is wrong because Cloud Bigtable garbage collection rules are configured through the standard Bigtable admin tooling (e.g., the `cbt setgcpolicy` command or the Admin API's ModifyColumnFamilies method); no different API is required. Option C is wrong because Cloud Bigtable does not enforce a minimum max_age of 1 hour; the max_age can be set to any positive duration, including 100 seconds. Option D is wrong because garbage collection rules apply to both existing and new data; once set, the rule governs all data in the column family.
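
Since expired cells can remain readable until garbage collection actually runs, readers that must never see stale data usually apply the same age limit at read time. The sketch below models that idea in plain Python; the `Cell` type and `live_cells` helper are hypothetical and do not use the Bigtable client library.

```python
from dataclasses import dataclass

MAX_AGE_SECONDS = 100  # same max_age as the rule in the question


@dataclass(frozen=True)
class Cell:
    value: str
    timestamp: float  # seconds since an arbitrary epoch


def live_cells(cells, now, max_age=MAX_AGE_SECONDS):
    """Return only cells younger than max_age.

    This filters at read time, so the result does not depend on whether
    background garbage collection has physically removed older cells yet.
    """
    cutoff = now - max_age
    return [c for c in cells if c.timestamp >= cutoff]


now = 1_000.0
# Two minutes after the rule is set, the 120-second-old cell is still stored.
stored = [Cell("old", now - 120), Cell("fresh", now - 30)]
visible = live_cells(stored, now)
```

Here `stored` still holds both cells, while `visible` contains only the fresh one.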

68
MCQeasy

A data scientist wants to test a new model version on a small percentage of traffic before full rollout. Which Vertex AI feature allows this?

A.A/B testing
B.Endpoint traffic splitting
C.Model monitoring
D.Model versioning with canary deployments
AnswerB

Traffic splitting allows routing a subset of requests to a different model version.

Why this answer

Vertex AI Endpoint traffic splitting allows you to route a specified percentage of inference requests to different model versions deployed on the same endpoint. This enables gradual rollout by directing a small fraction of traffic (e.g., 5%) to the new model while the rest goes to the current version, without needing separate endpoints or manual routing logic.
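
Conceptually, a traffic split is a weighted choice between deployed models. The following sketch simulates that routing in plain Python under stated assumptions: the model IDs are hypothetical, the 95/5 split mirrors the example above, and nothing here calls the Vertex AI API.

```python
import random


def pick_model(traffic_split, rng):
    """Pick a deployed model ID with probability proportional to its share.

    `traffic_split` maps model ID -> integer percentage; shares must sum to 100.
    """
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    ids = list(traffic_split)
    weights = [traffic_split[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]


split = {"current-model": 95, "new-model": 5}  # hypothetical IDs
rng = random.Random(42)                         # seeded for repeatability
counts = {"current-model": 0, "new-model": 0}
for _ in range(10_000):
    counts[pick_model(split, rng)] += 1
```

Over many requests, roughly 5% land on the new model, which limits the blast radius of a bad release.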

Exam trap

The trap here is that candidates confuse the conceptual practice of 'canary deployments' (Option D) with the specific Vertex AI feature 'endpoint traffic splitting' (Option B), but the exam expects the exact feature name as defined in the Google Cloud documentation.

How to eliminate wrong answers

Option A is wrong because A/B testing is a general experimentation practice for comparing model versions on live traffic, not the name of the Vertex AI feature that routes requests; on Vertex AI, endpoint traffic splitting is the mechanism such an experiment is built on. Option C is wrong because Model monitoring is used to detect data drift, feature skew, and prediction anomalies on deployed models, not to control traffic distribution between versions. Option D is wrong because model versioning with canary deployments is a conceptual practice, not a specific Vertex AI feature; the actual feature that implements canary-style traffic routing is endpoint traffic splitting, which is the correct answer.

69
MCQmedium

A retail company uses a Vertex AI endpoint to serve product recommendations. The model is a TensorFlow model deployed with a custom container. Recently, users have reported that recommendations are stale. The model is retrained daily using Vertex AI Pipelines. The pipeline completes successfully, but the endpoint continues to serve the old model. The team checks the pipeline logs and sees that the new model is uploaded to the Vertex AI Model Registry. The endpoint has traffic split set to 100% for the old model. The team needs to update the endpoint to serve the new model version. What should they do?

A.Check the pipeline for errors in the deployment step
B.Re-upload the model with a different version ID
C.Redeploy the same model to the endpoint
D.Update the endpoint to deploy the new model version from the registry and adjust traffic split
AnswerD

Explicitly deploy new version to endpoint.

Why this answer

Option D is correct because the pipeline successfully uploaded the new model to the Vertex AI Model Registry, but the endpoint still has its traffic split configured to 100% for the old model. To serve the new model, the team must explicitly update the endpoint to deploy the new model version from the registry and adjust the traffic split to route 100% of traffic to it. This is a standard operational step in Vertex AI: uploading a model does not automatically update the endpoint's deployment or traffic allocation.
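
The operational step described above amounts to moving percentage points from the old deployed model to the new one while keeping the total at 100. A minimal sketch of that bookkeeping follows; the helper name and model IDs are hypothetical, and the real change would be made through the endpoint's deploy and traffic-split settings rather than this function.

```python
def shift_traffic(traffic_split, old_id, new_id, percent):
    """Return a new split with `percent` points moved from old_id to new_id.

    The input mapping is left unchanged; the total share is preserved.
    """
    if not 0 <= percent <= traffic_split.get(old_id, 0):
        raise ValueError("cannot move more traffic than old_id currently has")
    updated = dict(traffic_split)
    updated[old_id] -= percent
    updated[new_id] = updated.get(new_id, 0) + percent
    return updated


before = {"old-model": 100}                                   # state in the question
canary = shift_traffic(before, "old-model", "new-model", 10)  # optional gradual step
full = shift_traffic(canary, "old-model", "new-model", 90)    # all traffic on new model
```

Until a step like this happens, the endpoint keeps serving the old model no matter how many new versions reach the registry.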

Exam trap

Google Cloud often tests the misconception that uploading a new model version to the registry automatically updates the endpoint's serving configuration, when in fact the traffic split must be explicitly adjusted to route requests to the new model.

How to eliminate wrong answers

Option A is wrong because the pipeline logs show no errors in the deployment step; the model was successfully uploaded to the registry, so checking for errors is unnecessary and misdiagnoses the issue. Option B is wrong because re-uploading the model with a different version ID does not change the endpoint's deployment or traffic split; the endpoint still points to the old model version. Option C is wrong because redeploying the same model (the old version) to the endpoint would not serve the new model; the team needs to deploy the new model version from the registry, not redeploy the old one.

70
Multi-Selecthard

Which THREE factors should be considered when designing a Vertex AI Pipeline for continuous training?

Select 3 answers
A.Cost of training and infrastructure
B.Debugging tools like Cloud Debugger
C.Trigger mechanism (time-based or event-based)
D.Number of model versions to keep
E.Data freshness and staleness tolerance
AnswersA, C, E

Budget impacts resource selection.

Why this answer

Cost of training and infrastructure (A) is correct because Vertex AI Pipelines incur compute costs for each pipeline run, including training, data processing, and orchestration. Continuous training amplifies these costs, so you must consider budget constraints, resource optimization (e.g., using preemptible VMs), and cost monitoring to avoid unexpected bills.

Exam trap

Google Cloud often tests the distinction between operational pipeline design factors (triggers, cost, data freshness) and peripheral management tasks (versioning, debugging tools), leading candidates to incorrectly select options like D or B that are valid but not core to pipeline design.

71
MCQmedium

A company uses Vertex AI to serve a model. They notice that some predictions are incorrect due to data drift. What is the best way to detect and retrain the model automatically?

A.Store predictions in BigQuery and run scheduled queries
B.Create a Cloud Monitoring dashboard
C.Set up Cloud Logging metrics to monitor predictions
D.Use Vertex AI Model Monitoring with alerts and retraining pipeline
AnswerD

Monitors drift and triggers retraining.

Why this answer

Option D is correct because Vertex AI Model Monitoring is specifically designed to detect data drift and feature skew in production models. It can be configured to send alerts and trigger an automated retraining pipeline via Cloud Functions or Vertex AI Pipelines, enabling continuous model improvement without manual intervention. This directly addresses the need for automatic detection and retraining in response to data drift.
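
To make "drift detection" concrete, the sketch below compares a baseline distribution of a categorical feature with a recent one using L-infinity distance (the largest per-category difference in frequency) and flags drift above a threshold. It is an illustrative stand-in rather than the service's implementation; the threshold of 0.3 and the sample data are assumptions.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def l_infinity_distance(p, q):
    """Largest absolute difference in frequency over all categories."""
    keys = set(p) | set(q)
    if not keys:
        return 0.0
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def drift_detected(baseline_values, recent_values, threshold=0.3):
    """Return (flag, score); flag is True when the score exceeds threshold."""
    score = l_infinity_distance(distribution(baseline_values),
                                distribution(recent_values))
    return score > threshold, score


baseline = ["a"] * 50 + ["b"] * 50   # training-time feature values
recent = ["a"] * 90 + ["b"] * 10     # recent serving-time feature values
flag, score = drift_detected(baseline, recent)
```

In an automated setup, a `True` flag is the kind of signal that would raise an alert and start the retraining pipeline.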

Exam trap

The trap here is that candidates may confuse general monitoring tools (Cloud Monitoring, Cloud Logging) with the specialized drift detection and automated retraining capabilities of Vertex AI Model Monitoring, assuming any monitoring solution can trigger retraining without native integration.

How to eliminate wrong answers

Option A is wrong because storing predictions in BigQuery and running scheduled queries is a manual, batch-oriented approach that does not provide real-time drift detection or automated retraining; it requires custom code and lacks native integration with Vertex AI's monitoring capabilities. Option B is wrong because Cloud Monitoring dashboards visualize metrics but do not inherently detect data drift or trigger retraining pipelines; they are for observability, not automated action. Option C is wrong because Cloud Logging metrics can track prediction logs but are not designed for statistical drift analysis (e.g., distribution comparisons) and cannot directly initiate retraining workflows without additional custom logic.

72
MCQhard

A data analyst runs a complex SQL query in BigQuery that joins multiple large tables and receives the above error. Which action is most likely to resolve the issue?

A.Use a larger number of workers in the query execution.
B.Use smaller tables by sampling data.
C.Add clustering on join columns.
D.Increase the number of slots allocated to the project.
AnswerD

More slots provide more memory and CPU, reducing resource exceeded errors.

Why this answer

The error indicates that the query exceeded the available slot resources in the BigQuery project. Increasing the number of slots allocated to the project (option D) directly addresses this by providing more compute capacity for parallel query execution, which is the correct action to resolve resource exhaustion in BigQuery's serverless architecture.

Exam trap

Google Cloud often tests the misconception that performance tuning (e.g., clustering or sampling) can resolve resource exhaustion errors, when in fact the root cause is insufficient compute capacity that must be addressed by increasing slot allocation.

How to eliminate wrong answers

Option A is wrong because BigQuery automatically manages parallelism; manually specifying a larger number of workers is not supported and would not increase slot capacity. Option B is wrong because sampling data reduces accuracy and may not reflect the full dataset, which is not a valid solution for resource exhaustion—it changes the query result rather than fixing the resource issue. Option C is wrong because clustering on join columns improves query performance and reduces data scanned, but it does not increase the number of slots available; the error is about insufficient compute resources, not about inefficient data access patterns.

73
Multi-Selectmedium

A data engineer is designing a streaming pipeline with Cloud Pub/Sub and Cloud Dataflow. They need to guarantee at-least-once delivery and handle occasional duplicates. Which TWO configurations should they implement?

Select 2 answers
A.Use idempotent sinks
B.Use global windows with triggers
C.Use fixed windows
D.Use at-least-once Pub/Sub subscription
E.Enable Dataflow Streaming Engine
AnswersA, D

Idempotent sinks allow safe duplicate writes, ensuring exactly-once effect despite duplicates.

Why this answer

Option A is correct because idempotent sinks (e.g., BigQuery with insertId, Cloud Storage with object generation numbers) allow the pipeline to safely process duplicate records without causing data corruption or double-counting. In a streaming pipeline with at-least-once semantics, duplicates are inevitable, and idempotent sinks ensure that repeated writes produce the same result as a single write, maintaining data consistency.
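
The effect of an idempotent sink can be sketched with an in-memory store keyed by a unique event ID, so that a redelivered message overwrites its earlier copy instead of adding a second row. The class and event IDs below are hypothetical and only model the idea; they are not BigQuery or Cloud Storage client code.

```python
class IdempotentSink:
    """In-memory stand-in for a sink that keys each write by a unique ID."""

    def __init__(self):
        self.rows = {}

    def write(self, insert_id, row):
        """Store the row under insert_id; return True if the ID was new."""
        is_new = insert_id not in self.rows
        # Writing the same ID again replaces the row, so duplicates do not
        # accumulate and aggregates are not double-counted.
        self.rows[insert_id] = row
        return is_new


sink = IdempotentSink()
# At-least-once delivery: "evt-1" arrives twice.
deliveries = [
    ("evt-1", {"clicks": 1}),
    ("evt-2", {"clicks": 1}),
    ("evt-1", {"clicks": 1}),
]
for event_id, row in deliveries:
    sink.write(event_id, row)

total_clicks = sum(r["clicks"] for r in sink.rows.values())
```

Three deliveries produce two stored rows, which is the same result a single delivery of each event would give.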

Exam trap

Google Cloud often tests the misconception that windowing strategies (global or fixed) or execution engine features (Streaming Engine) can substitute for explicit delivery guarantees and idempotent sinks, when in fact they address entirely different concerns.

74
Drag & Dropmedium

Drag and drop the steps to create a Cloud Function triggered by Cloud Storage events into the correct order.

Why this order

Cloud Functions can respond to changes in Cloud Storage buckets.

75
MCQmedium

You have deployed a classification model on Vertex AI Endpoints. The model's training data had a balanced class distribution, but over time, the production data has shifted such that one class appears 90% of the time. The model's overall accuracy remains high, but the recall for the minority class has dropped significantly. What is the best approach to detect and address this issue?

A.Retrain the model daily on the entire historical dataset
B.Set up Vertex AI Model Monitoring to detect skew and drift, and retrain using a sliding window of recent data
C.Increase the number of replicas on the endpoint to reduce latency
D.Adjust the decision threshold to improve minority class recall
AnswerB

Model Monitoring detects skew/drift; retraining on recent data adapts to new distribution.

Why this answer

Vertex AI Model Monitoring is specifically designed to detect skew and drift between training and serving data. In this scenario, the production data has shifted to 90% of one class, which is a clear case of data drift. By setting up monitoring, you can be alerted to this drift and then retrain the model using a sliding window of recent data, which adapts to the new distribution without requiring full retraining on the entire historical dataset.

This approach directly addresses the root cause—the shift in class distribution—rather than just treating symptoms.
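
A short worked example shows how overall accuracy can stay high while minority-class recall collapses under a 90/10 class split. The confusion-matrix counts below are hypothetical numbers chosen for illustration, not figures from the scenario.

```python
def accuracy_and_recall(tp, fn, fp, tn):
    """Return (accuracy, recall) for the positive (minority) class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)
    return accuracy, recall


# Hypothetical 1,000 production requests: 100 minority, 900 majority.
# The model finds 40 of the 100 minority cases and mislabels 10 majority cases.
acc, rec = accuracy_and_recall(tp=40, fn=60, fp=10, tn=890)
# acc is 930 / 1000 = 0.93, rec is 40 / 100 = 0.40
```

Because the majority class dominates the total, accuracy alone hides the drop in recall, which is why monitoring the input and prediction distributions matters here.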

Exam trap

Google Cloud often tests the distinction between monitoring/detection (Model Monitoring) and reactive fixes (threshold tuning), where candidates mistakenly choose a quick fix like adjusting the decision threshold instead of addressing the root cause of data drift.

How to eliminate wrong answers

Option A is wrong because retraining daily on the entire historical dataset is computationally expensive and does not prioritize recent data; it would still include the old balanced distribution, potentially diluting the model's ability to adapt to the new skewed production data. Option C is wrong because increasing the number of replicas on the endpoint reduces latency and improves throughput, but it does not address data drift or the drop in minority class recall; it is a scaling solution, not a monitoring or retraining solution. Option D is wrong because adjusting the decision threshold can improve recall for the minority class in the short term, but it does not fix the underlying model's inability to generalize to the shifted data distribution; it is a band-aid that may hurt precision and overall model performance.
