Google Professional Data Engineer (PDE): Questions 451-499

499 questions total · 7 pages · All types, answers revealed

Page 7 of 7

451
MCQhard

A company runs a large Dataflow pipeline that aggregates user activity data from Pub/Sub into BigQuery every 10 minutes using fixed windows. Recently, the daily summary reports have shown 5-10% lower user engagement for certain segments compared to historical trends. The pipeline is completing successfully with no errors in Cloud Monitoring, and the Dataflow job dashboard shows all steps in green. There are no alarms. The team suspects data is being dropped or missed. They have verified that the Pub/Sub topic is receiving data correctly. After reviewing the pipeline code, they find that the pipeline uses a global window with a default 10-minute trigger, and writes results to a single BigQuery table partitioned by date. They also use exactly-once processing mode. Which of the following is the most likely cause and the best course of action to diagnose and fix the data quality issue?

A.Implement a retry mechanism in the Pub/Sub subscription to ensure no messages are lost.
B.Enable Cloud Logging for all pipeline steps and analyze the logs for dropped elements.
C.Add a global window with a late-data trigger to capture any data arriving after the window ends.
D.Use Dataflow’s built-in metrics to compare the number of elements read from Pub/Sub and written to BigQuery for each window.
AnswerD

This identifies exactly where data is lost, enabling targeted debugging without overhead.

Why this answer

Option D is correct because it is the most direct way to confirm whether, and where, elements go missing. The pipeline uses a global window with a 10-minute trigger, so results are emitted in periodic panes but the window itself never closes, which means late-arriving data is still included rather than dropped. Late data is therefore an unlikely culprit. Comparing the number of elements read from Pub/Sub with the number of rows written to BigQuery for each 10-minute period, using the per-step element counts that Dataflow already reports, shows whether data is lost between the source and the sink. A common cause of such a gap is BigQuery streaming inserts that fail for individual rows because of schema mismatches or quota limits without failing the job.

Exam trap

The trap here is that candidates assume 'exactly-once processing' guarantees no data loss, but in reality, exactly-once only ensures no duplicates, not that all data is successfully written to the sink; silent failures in streaming inserts to BigQuery can cause data to be dropped without triggering pipeline errors.

How to eliminate wrong answers

Option A is wrong because Pub/Sub already redelivers unacknowledged messages (at-least-once delivery), and the team has verified that the topic is receiving data correctly, so message loss in Pub/Sub is not the issue. Option B is wrong because enabling Cloud Logging for all pipeline steps would generate excessive logs and is not the most efficient diagnostic approach; Dataflow already reports element counts for each step, which can be compared directly without parsing logs. Option C is wrong because the pipeline already uses a global window with a 10-minute trigger, which inherently captures late data (the global window never closes); adding a late-data trigger is redundant and does not address potential data loss between Pub/Sub and BigQuery.
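
The comparison described in option D can be sketched as a per-window reconciliation. This is a minimal illustration only: the counts are hypothetical, and in a real job they would come from Dataflow's step-level element counts or from custom counters. The function name and the `tolerance` parameter are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def find_lossy_windows(
    read_counts: Dict[str, int],
    written_counts: Dict[str, int],
    tolerance: float = 0.0,
) -> List[Tuple[str, int, int]]:
    """Return (window, read, written) for windows where elements went missing."""
    lossy = []
    for window, read in sorted(read_counts.items()):
        written = written_counts.get(window, 0)
        # Flag the window if the missing fraction exceeds the tolerance.
        if read > 0 and (read - written) / read > tolerance:
            lossy.append((window, read, written))
    return lossy


# Hypothetical counts keyed by window start time.
read = {"10:00": 1000, "10:10": 1200, "10:20": 900}
written = {"10:00": 1000, "10:10": 1100, "10:20": 900}

for window, r, w in find_lossy_windows(read, written):
    print(f"window {window}: read={r} written={w} missing={r - w}")
```

A window that shows up here narrows the search to the steps between the read and the write for that time range.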

452
MCQeasy

A company deploys a new machine learning model for real-time predictions using Vertex AI. The model is stored in a Cloud Storage bucket and deployed to an endpoint. To ensure traceability and rollback capability, which practice should be followed?

A.Deploy multiple versions of the model to the same endpoint using traffic splitting and set the primary version to 100% traffic.
B.Use the same model name for all deployments and overwrite the existing model.
C.Store the model in a Cloud Storage bucket with a fixed name and rely on Cloud Build for rollback.
D.Create a new model resource in Vertex AI for each version and deploy the specific version to an endpoint.
AnswerD

This allows version tracking, easy rollback by redeploying a previous version, and maintains a clean deployment history.

Why this answer

Option D is correct because creating a new model resource in Vertex AI for each version ensures that each model iteration is independently tracked, versioned, and can be deployed to an endpoint with full rollback capability. This practice aligns with Vertex AI's model versioning and endpoint deployment model, where each model resource has a unique ID and can be deployed or undeployed without affecting other versions, enabling precise traceability and rollback.

Exam trap

Google Cloud often tests the misconception that traffic splitting alone (Option A) provides sufficient versioning and rollback. Traffic splitting only controls how requests are distributed among the models currently deployed to an endpoint; it does not by itself preserve an independent history of each version or give a clean way to return to a prior model without manual intervention.

How to eliminate wrong answers

Option A is wrong because deploying multiple versions to the same endpoint and sending 100% of traffic to the primary version is a routing decision, not a versioning practice; it does not by itself record which artifact each version came from, so rollback becomes difficult if an earlier version is overwritten or removed. Option B is wrong because using the same model name for all deployments and overwriting the existing model destroys the previous version's metadata and artifacts, making rollback impossible without manual restoration from backups. Option C is wrong because storing the model in a Cloud Storage bucket with a fixed name and relying on Cloud Build for rollback does not provide native Vertex AI model versioning or endpoint deployment tracking; Cloud Build is a CI/CD tool, not a model registry, and overwriting the bucket contents loses previous versions.

453
MCQeasy

A company wants to ingest IoT sensor data from thousands of devices into BigQuery for near-real-time analytics. The data volume is approximately 10 GB per hour. Which combination of Google Cloud services should they use for a cost-effective and scalable solution?

A.Pub/Sub → Dataflow → BigQuery
B.Cloud IoT Core → Cloud Functions → BigQuery
C.Cloud IoT Core → Cloud Dataproc → BigQuery
D.Cloud IoT Core → Cloud Storage → BigQuery load jobs
AnswerA

Pub/Sub ingests events, Dataflow streams them to BigQuery, scaling automatically.

Why this answer

Pub/Sub provides a scalable, managed ingestion layer for high-volume IoT data, decoupling producers from consumers. Dataflow (Apache Beam) processes the streaming data in near-real-time with exactly-once semantics and auto-scaling, writing directly to BigQuery for analytics. This combination minimizes operational overhead and cost by avoiding intermediate storage and manual scaling.

Exam trap

Google Cloud often tests the misconception that Cloud Functions can handle streaming workloads, but its synchronous nature and timeout limit make it unsuitable for sustained high-throughput ingestion, whereas Pub/Sub + Dataflow is the standard pattern for near-real-time analytics.

How to eliminate wrong answers

Option B is wrong because Cloud Functions (1st gen) has a 9-minute timeout per invocation and is not designed for sustained high-throughput streaming (10 GB/hour), which risks timeouts and data loss. Option C is wrong because Cloud Dataproc (managed Spark/Hadoop) is optimized for batch processing, not near-real-time streaming; it adds latency and complexity compared to Dataflow's native streaming. Option D is wrong because BigQuery load jobs from Cloud Storage are batch-oriented, introducing minutes-to-hours latency and requiring manual orchestration, which fails the near-real-time requirement.
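
To put the stated volume in perspective, the arithmetic below converts 10 GB per hour into a sustained rate. It assumes decimal gigabytes, which the question does not specify.

```python
# Convert the stated volume (10 GB per hour) into a sustained per-second rate.
GB = 10**9  # assumption: decimal gigabytes
volume_bytes_per_hour = 10 * GB

bytes_per_second = volume_bytes_per_hour / 3600
mb_per_second = bytes_per_second / 10**6

print(f"{mb_per_second:.2f} MB/s sustained")  # prints "2.78 MB/s sustained"
```

The rate itself is modest; what matters for the service choice is that it is continuous and arrives from thousands of devices at once.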

454
Multi-Selecthard

Which THREE considerations are important when designing a data lake on Google Cloud using Cloud Storage?

Select 3 answers
A.Use Cloud Storage's eventual consistency model for cost savings.
B.Define a schema when writing data to enforce data quality.
C.Choose the appropriate storage class based on access patterns.
D.Enable encryption at rest using CMEK or CSEK.
E.Use object lifecycle management to transition data to colder storage classes.
AnswersC, D, E

Storage class impacts cost and latency.

Why this answer

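Option C is correct because selecting the appropriate storage class (e.g., Standard, Nearline, Coldline, Archive) based on data access patterns directly optimizes cost and performance in Cloud Storage. For a data lake, where data may be accessed frequently at first and rarely later, matching the storage class to the access pattern avoids paying premium rates for infrequently accessed data. Option D is correct because CMEK or CSEK gives the company control over the keys that protect data at rest. Option E is correct because object lifecycle management automates the move to colder storage classes as data ages. Option B does not apply because Cloud Storage stores objects as-is and does not enforce a schema on write.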

Exam trap

Google Cloud often tests the misconception that Cloud Storage uses eventual consistency. Cloud Storage provides strong consistency for object reads after writes, updates, and deletes, and for listings, so there is no eventual-consistency mode to choose for cost savings, which makes option A a trap.
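
Options C and E can be illustrated with a lifecycle configuration in the JSON shape that Cloud Storage accepts for bucket lifecycle rules. This is a sketch only: the age thresholds (30, 90, and 365 days) are hypothetical and should follow the data lake's actual access patterns.

```python
import json

# Hypothetical policy: move objects to colder classes as they age.
lifecycle_config = {
    "rule": [
        {
            "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
            "condition": {"age": 30, "matchesStorageClass": ["STANDARD"]},
        },
        {
            "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
            "condition": {"age": 90, "matchesStorageClass": ["NEARLINE"]},
        },
        {
            "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
            "condition": {"age": 365, "matchesStorageClass": ["COLDLINE"]},
        },
    ]
}

# Serialize to a file-ready JSON document.
print(json.dumps(lifecycle_config, indent=2))
```

Each rule only matches objects in the previous class, so an object steps down one class at a time as it ages.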

455
Multi-Selecteasy

Which THREE Google Cloud services are considered fully managed serverless data processing services? (Choose THREE.)

Select 3 answers
A.Cloud Dataproc
B.Cloud Functions
C.Cloud Composer
D.Cloud Data Fusion
E.Cloud Dataflow
AnswersB, D, E

B is correct because Cloud Functions is a serverless compute service often used for data transformation.

Why this answer

Cloud Functions is a fully managed serverless data processing service because it executes code in response to events without requiring any server provisioning or management. It automatically scales from zero to thousands of instances based on incoming requests, and you pay only for compute time used while your code runs. This makes it ideal for lightweight, event-driven data processing tasks such as transforming data in Cloud Storage or reacting to Pub/Sub messages.

Exam trap

Google Cloud often tests the distinction between 'fully managed' and 'serverless'—the trap here is that Cloud Dataproc and Cloud Composer are fully managed (Google handles infrastructure) but still require you to manage cluster resources or worker nodes, so they are not serverless; candidates mistakenly equate 'fully managed' with 'serverless'.

456
MCQhard

A company is designing a data lake on Google Cloud. The data lake will store raw, curated, and analytics-ready data. Security requirements include: data must be encrypted at rest and in transit, access must be controlled based on data sensitivity (public, internal, confidential), and all access to sensitive data must be audited. The company also wants to minimize data transfer costs for frequently accessed curated datasets. Which combination of services and configurations best meets these requirements?

A.Use Cloud Storage with default encryption, bucket policies, and Cloud Audit Logs. For frequent access, use Cloud CDN.
B.Use Cloud Storage with CMEK, and use Cloud HSM for key storage. Use Cloud Audit Logs. Avoid caching to ensure security.
C.Use Cloud Storage with SSE-C, bucket policies, and Cloud Audit Logs. Use Cloud Load Balancing for caching.
D.Use Cloud Storage with CMEK, bucket-level IAM, and object ACLs. Use Cloud Data Loss Prevention API to classify data. Enable Cloud Audit Logs. Use Cloud CDN to cache curated datasets.
AnswerD

CMEK ensures customer-controlled encryption; IAM+ACLs give granular access; DLP inspects and classifies; audit logs capture access; CDN caches data for lower latency and cost.

Why this answer

Option D is correct because it combines CMEK for customer-controlled encryption at rest, bucket-level IAM and object ACLs for granular access control, the Cloud Data Loss Prevention API to classify data by sensitivity, Cloud Audit Logs for auditing access to sensitive data, and Cloud CDN to cache curated datasets, reducing data transfer costs for frequently accessed data. This configuration meets all security requirements (encryption at rest and in transit, access control, auditing) while optimizing cost for frequent access.

Exam trap

Google Cloud often tests the misconception that caching (Cloud CDN) is inherently insecure or that it cannot be used with sensitive data, but in reality, Cloud CDN can be secured with signed URLs, IAM, and encryption, and it is the correct way to reduce data transfer costs for frequently accessed data.

How to eliminate wrong answers

Option A is wrong because default encryption uses Google-managed keys, not customer-managed keys (CMEK), which may not satisfy compliance requirements for controlling encryption keys, and it includes nothing to classify data by sensitivity; its use of Cloud CDN is not the flaw, since option D relies on the same caching. Option B is wrong because 'Avoid caching to ensure security' contradicts the requirement to minimize data transfer costs for frequently accessed curated datasets; caching with Cloud CDN is secure when properly configured (e.g., signed URLs, IAM), and avoiding it increases costs. Option C is wrong because SSE-C (server-side encryption with customer-provided keys, an AWS term; the Cloud Storage equivalent is CSEK) requires the client to manage keys and is not integrated with Cloud HSM or Cloud KMS; Cloud Load Balancing does not cache data (it distributes traffic), so it does not reduce data transfer costs for frequent access.

457
MCQeasy

A company wants to analyze server logs stored in Cloud Storage using SQL. They need to get results in seconds without setting up any clusters. Which service should they use?

A.Cloud Dataflow
B.Cloud Logging
C.BigQuery
D.Cloud Dataproc
AnswerC

BigQuery supports federated queries on Cloud Storage using SQL, providing fast results without clusters.

Why this answer

BigQuery is a serverless data warehouse that analyzes large datasets with standard SQL, with no clusters to provision or manage. Server logs stored in Cloud Storage can be queried in place through external tables, or loaded into BigQuery for faster queries, which fits the requirement for results in seconds.

Exam trap

Google Cloud often tests the distinction between serverless SQL analytics (BigQuery) and managed compute frameworks (Dataflow, Dataproc), where candidates mistakenly choose Dataflow or Dataproc for SQL-like analysis without recognizing the need for cluster management or pipeline setup.

How to eliminate wrong answers

Option A is wrong because Cloud Dataflow is a unified stream and batch data processing service that requires setting up and managing pipelines (though serverless, it is not primarily for ad-hoc SQL queries on stored logs). Option B is wrong because Cloud Logging is a real-time log management and analysis service for monitoring and debugging, not designed for complex SQL analytics on large historical log datasets stored in Cloud Storage. Option D is wrong because Cloud Dataproc is a managed Spark and Hadoop service that requires provisioning clusters (even if ephemeral) and is not serverless SQL querying.

458
MCQhard

You are designing a streaming pipeline using Cloud Dataflow with exactly-once semantics. The source is Pub/Sub and the sink is Cloud Bigtable. The pipeline must handle late data up to 10 minutes. You need to minimize cost while maintaining correctness. Which configuration should you use?

A.Fixed windows of 1 minute with allowed lateness 10 minutes and accumulating fired panes
B.Sliding windows of 1 minute with allowed lateness 10 minutes and accumulating fired panes
C.Global window with allowed lateness 10 minutes and trigger=afterWatermark with early firings
D.Session windows of 5 minutes with gap duration 1 minute and discarding fired panes
AnswerC

Global window with watermark-based triggers handles late data efficiently.

Why this answer

Option C is correct because a global window with an after-watermark trigger and early firings is the most cost-effective way to handle unbounded data from Pub/Sub with exactly-once semantics, while allowing up to 10 minutes of lateness. Fixed or sliding windows would create many small window states, increasing Bigtable write costs and shuffle overhead. The global window minimizes state and processing, and the trigger ensures results are emitted promptly without accumulating panes.

Exam trap

Google Cloud often tests the misconception that windowing is always required for streaming pipelines, but here the sink (Bigtable) stores individual records, so a global window with triggers is the most efficient and correct choice, not fixed or sliding windows.

How to eliminate wrong answers

Option A is wrong because fixed windows of 1 minute with accumulating panes would create a new window every minute, leading to excessive state and write amplification in Bigtable, increasing cost without benefit for a global sink. Option B is wrong because sliding windows of 1 minute would create overlapping windows, multiplying state and processing overhead even more than fixed windows, which is wasteful for a use case that doesn't require windowed aggregations. Option D is wrong because session windows with a 1-minute gap duration and discarding panes are designed for grouping events by activity sessions, not for a simple streaming pipeline to Bigtable; the option also specifies no allowed lateness, so data arriving up to 10 minutes late risks being lost, violating correctness.
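
The role of allowed lateness in the options above can be sketched with a simplified model of fixed windows. This is not Apache Beam code and glosses over Beam's exact boundary rules; the function names and the three labels are illustrative assumptions.

```python
WINDOW_SIZE_S = 60            # 1-minute fixed windows, as in option A
ALLOWED_LATENESS_S = 10 * 60  # 10 minutes, from the question


def window_end(event_time_s: int, size_s: int = WINDOW_SIZE_S) -> int:
    """Exclusive end of the fixed window that contains event_time_s."""
    return (event_time_s // size_s + 1) * size_s


def classify(event_time_s: int, watermark_s: int) -> str:
    """Classify an element by how far the watermark is past its window."""
    end = window_end(event_time_s)
    if watermark_s < end:
        return "on-time"
    if watermark_s < end + ALLOWED_LATENESS_S:
        return "late-but-accepted"
    return "dropped"


# An event stamped at t=30s belongs to the window [0, 60).
print(classify(30, watermark_s=45))   # on-time
print(classify(30, watermark_s=300))  # late-but-accepted
print(classify(30, watermark_s=700))  # dropped
```

With a 10-minute allowance, each 1-minute window has to be kept for 11 minutes in total, which is the state cost the explanation refers to.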

459
MCQhard

A data scientist uses Vertex AI Workbench notebooks for model development. They want to share the environment with team members while maintaining version control. Which approach should they use?

A.Use Cloud Shell and clone the repo
B.Use a user-managed notebook instance with multiple users
C.Share the notebook via Cloud Storage
D.Store notebooks in Cloud Source Repositories
AnswerB

Allows collaboration with version control.

Why this answer

A user-managed notebook instance with multiple users is the correct approach because Vertex AI Workbench supports collaboration by allowing multiple users to access the same instance via IAM permissions, while the underlying Git integration enables version control. This setup provides a shared, persistent environment where team members can work on the same codebase without duplicating work, and changes can be tracked through Git repositories.

Exam trap

The trap here is that candidates confuse storing notebooks in a version control system (like Cloud Source Repositories) with having a shared, interactive development environment, overlooking that version control alone does not provide the compute and collaboration features of a user-managed notebook instance.

How to eliminate wrong answers

Option A is wrong because Cloud Shell is a temporary, per-user environment with limited resources and no persistent storage, making it unsuitable for sharing a development environment with version control across a team. Option C is wrong because sharing notebooks via Cloud Storage is a static file-sharing method that does not provide version control, collaborative editing, or a live execution environment. Option D is wrong because Cloud Source Repositories is a Git repository hosting service for storing code, not a shared interactive development environment; it lacks the compute and runtime capabilities needed for model development.

460
Multi-Selectmedium

A team monitors a deployed Vertex AI model and notices an increasing number of prediction errors with status code 413 (Request Entity Too Large). Which TWO actions should they consider to resolve this issue?

Select 2 answers
A.Implement client-side pre-processing to compress or downsample input data
B.Switch the model to batch prediction to handle large payloads offline
C.Increase the number of replicas to handle load
D.Decrease the machine type to reduce resource consumption
E.Increase the maximum request size limit in the endpoint configuration
AnswersA, E

Reducing input size prevents exceeding the limit.

Why this answer

Option A is correct because status code 413 indicates the HTTP request payload exceeds the server's size limit. Implementing client-side pre-processing to compress or downsample input data reduces the payload size before it reaches the Vertex AI endpoint, directly addressing the root cause. This approach is efficient because it shifts the computational burden to the client and avoids hitting the server-imposed request size cap, which is typically 1.5 MB for online predictions in Vertex AI.

Exam trap

Google Cloud often tests the misconception that scaling resources (replicas or machine type) can fix request size errors, but 413 is a protocol-level limit that must be addressed by reducing payload size, not by increasing infrastructure capacity.
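
A client-side guard along the lines of option A might look like the sketch below. It assumes the 1.5 MB limit mentioned above means 1,500,000 bytes and that the request body is JSON with an `instances` field; the helper names are hypothetical.

```python
import json

# Assumption: 1.5 MB read as 1,500,000 bytes; confirm against current quotas.
MAX_REQUEST_BYTES = 1_500_000


def request_size_bytes(instances: list) -> int:
    """Size of the JSON body that would be sent for these instances."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return len(body)


def fits_online_limit(instances: list, limit: int = MAX_REQUEST_BYTES) -> bool:
    """True if the payload can be sent as a single online request."""
    return request_size_bytes(instances) <= limit


small = [{"values": [0.1] * 100}]
large = [{"values": [0.1] * 400_000}]

print(fits_online_limit(small), fits_online_limit(large))  # True False
```

A client can run this check before sending and downsample, compress, or split the input when it fails, instead of waiting for a 413 response.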

461
MCQhard

A company needs to serve predictions for a model that runs an expensive computation on each request. The model is used by a batch job that processes millions of records each night, and also by a real-time API for a few thousand queries per hour. Which prediction strategy minimizes cost and latency for both use cases?

A.Deploy two identical models, one on a Compute Engine VM for batch, one on Vertex AI for online, and synchronize updates.
B.Use Vertex AI batch prediction for the nightly job and a separate online endpoint with auto-scaling for the real-time API.
C.Use Vertex AI batch prediction for both workloads.
D.Use a single online Vertex AI endpoint with auto-scaling to handle both workloads.
AnswerB

This separates concerns: batch prediction is optimized for throughput, online endpoint for low-latency, and auto-scaling handles varying traffic.

Why this answer

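Option B is correct. Vertex AI batch prediction is built for high-throughput offline scoring and provisions resources only for the duration of the job, so it suits the nightly run over millions of records, while a separate online endpoint with auto-scaling (and a modestly sized machine type) serves the few thousand real-time queries per hour at low latency.

Option A is wrong because running and synchronizing two identical models across a Compute Engine VM and Vertex AI adds operational overhead without lowering cost. Option C is wrong because batch prediction alone cannot serve real-time requests. Option D is wrong because pushing millions of nightly records through an online endpoint is expensive, and sharing one endpoint lets the batch load interfere with real-time latency.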

462
MCQhard

A company needs to process sensitive healthcare data with strict compliance requirements. They want to use Cloud Dataflow but must ensure data is encrypted end-to-end and audit logs are retained. Which combination of features should they enable?

A.Use Customer-Managed Encryption Keys (CMEK) and VPC Service Controls.
B.Use Data Loss Prevention API to redact sensitive data.
C.Enable Cloud Audit Logs and VPC Service Controls.
D.Enable default encryption at rest and in transit.
AnswerA

Provides control and exfiltration prevention.

Why this answer

Option A is correct because Customer-Managed Encryption Keys (CMEK) allow the company to control the encryption keys used to protect data at rest in Cloud Dataflow, while VPC Service Controls provide a security perimeter that prevents data exfiltration and ensures end-to-end encryption boundaries. Together, they address the compliance requirement for encryption control and audit logging by restricting data movement within a VPC service perimeter and using customer-managed keys for data encryption.

Exam trap

The trap here is that candidates often assume default encryption (Option D) or audit logs alone (Option C) satisfy compliance requirements, but they overlook the need for customer-managed keys and network-level exfiltration controls that VPC Service Controls provide.

How to eliminate wrong answers

Option B is wrong because the Data Loss Prevention (DLP) API is used for inspecting and redacting sensitive data (e.g., PII), not for ensuring end-to-end encryption or audit log retention; it does not provide encryption key management or network-level controls. Option C is wrong because while Cloud Audit Logs capture API activity and VPC Service Controls provide a security perimeter, this combination lacks customer-managed encryption keys (CMEK), which are required for the 'encrypted end-to-end' and key control compliance mandate. Option D is wrong because default encryption at rest and in transit uses Google-managed keys, not customer-managed keys, and does not include VPC Service Controls to enforce data exfiltration prevention or audit log retention policies.

463
MCQmedium

Refer to the exhibit. What is the cause of this error?

A.The machine type flag is only used during model deployment, not endpoint creation
B.The endpoint name already exists
C.The user must specify a model name
D.The region is missing
AnswerA

Correct: machine type is a property of the deployed model, not the endpoint.

Why this answer

The --machine-type flag is not valid for the endpoints create command; it should be specified when deploying a model to the endpoint using 'gcloud ai endpoints deploy-model'. The user must first create an endpoint without machine type, then deploy a model.
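
The two-step sequence can be sketched as follows. The exhibit is not reproduced here, so this does not recreate the failing command; the region, display names, IDs, and machine type are placeholders. The commands are only assembled and printed, not executed.

```python
import shlex

# Placeholder values; substitute real identifiers before running anything.
REGION = "us-central1"
ENDPOINT_ID = "ENDPOINT_ID"
MODEL_ID = "MODEL_ID"

# Step 1: create the endpoint. No machine type is given at this stage.
create_cmd = [
    "gcloud", "ai", "endpoints", "create",
    f"--region={REGION}",
    "--display-name=my-endpoint",
]

# Step 2: deploy a model to the endpoint. The machine type belongs here.
deploy_cmd = [
    "gcloud", "ai", "endpoints", "deploy-model", ENDPOINT_ID,
    f"--region={REGION}",
    f"--model={MODEL_ID}",
    "--display-name=my-deployed-model",
    "--machine-type=n1-standard-4",
]

for cmd in (create_cmd, deploy_cmd):
    print(shlex.join(cmd))
```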

464
MCQmedium

A company deploys a model to Vertex AI Endpoint. They want to run a canary deployment to test a new model version with 10% of traffic. How should they configure this?

A.Deploy to a new endpoint and update the application to call both
B.Use Cloud Load Balancing to route traffic
C.Deploy the new model to the same endpoint and set traffic split
D.Deploy to Cloud Run and use gradual rollout
AnswerC

Traffic splitting allows canary.

Why this answer

Option C is correct because Vertex AI Endpoints natively support traffic splitting between model versions deployed to the same endpoint. By deploying the new model version to the same endpoint and setting a traffic split of 10% to the new version and 90% to the current version, the company can perform a canary deployment without changing the application code or infrastructure.

Exam trap

Google Cloud often tests the misconception that canary deployments require separate endpoints or external load balancers, when in fact Vertex AI Endpoints provide a built-in traffic splitting feature that handles this at the model version level.

How to eliminate wrong answers

Option A is wrong because deploying to a new endpoint and updating the application to call both endpoints adds unnecessary complexity and defeats the purpose of a canary deployment, which should be transparent to the application. Option B is wrong because Cloud Load Balancing operates at the network layer and cannot route traffic based on model version within a single Vertex AI Endpoint; it is designed for distributing traffic across regional endpoints or backends, not for model version canary testing. Option D is wrong because deploying to Cloud Run and using gradual rollout is not the native way to manage model versions in Vertex AI; Vertex AI Endpoints provide built-in traffic splitting for model versions, which is the recommended approach for canary deployments in this context.
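
The effect of a 90/10 split can be simulated locally. This is a sketch of weighted routing in general, not of how Vertex AI implements it; the deployed-model IDs are hypothetical.

```python
import random
from collections import Counter
from typing import Dict


def validate_split(split: Dict[str, int]) -> None:
    """A traffic split needs non-negative percentages that sum to 100."""
    if any(pct < 0 for pct in split.values()):
        raise ValueError("percentages must be non-negative")
    if sum(split.values()) != 100:
        raise ValueError("percentages must sum to 100")


def route(split: Dict[str, int], rng: random.Random) -> str:
    """Pick a deployed model for one request, weighted by the split."""
    ids = list(split)
    return rng.choices(ids, weights=[split[i] for i in ids], k=1)[0]


# Hypothetical deployed-model IDs on one endpoint.
traffic_split = {"stable-model": 90, "canary-model": 10}
validate_split(traffic_split)

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(route(traffic_split, rng) for _ in range(10_000))
print(counts)
```

Roughly one request in ten lands on the canary, and the caller never needs to know which model served it.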

465
MCQmedium

A company uses BigQuery for analytics. They need to ensure data quality by preventing duplicate records from being inserted. Which approach is most effective?

A.Use BigQuery ML to train a model that identifies anomalies.
B.Use a DML MERGE statement that filters out duplicates based on a unique key.
C.Use Cloud Data Loss Prevention API to scan for duplicates.
D.Use COUNT DISTINCT in queries to ignore duplicates.
AnswerB

MERGE with deduplication logic ensures only one copy of each record is inserted, maintaining data quality.

Why this answer

Using MERGE with ROW_NUMBER() to identify and skip duplicates in a staging table before inserting into the final table is a common pattern for deduplication.
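
The pattern can be tried locally with SQLite, used here only as a stand-in for BigQuery. SQLite has no MERGE statement, so the sketch swaps in INSERT ... SELECT with NOT EXISTS to play the role of MERGE's WHEN NOT MATCHED branch. The table and column names are hypothetical, and window functions require SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE target  (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER);
    CREATE TABLE staging (id INTEGER, payload TEXT, updated_at INTEGER);

    INSERT INTO target  VALUES (1, 'existing', 100);
    INSERT INTO staging VALUES (1, 'dup-of-existing', 200),
                               (2, 'old', 100),
                               (2, 'new', 300),
                               (3, 'only', 150);
    """
)

# Keep the newest staging row per id (rn = 1), then insert only ids
# that the target table does not already contain.
conn.execute(
    """
    INSERT INTO target (id, payload, updated_at)
    SELECT id, payload, updated_at
    FROM (
        SELECT id, payload, updated_at,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
        FROM staging
    ) AS s
    WHERE s.rn = 1
      AND NOT EXISTS (SELECT 1 FROM target AS t WHERE t.id = s.id)
    """
)

rows = conn.execute("SELECT id, payload FROM target ORDER BY id").fetchall()
print(rows)  # [(1, 'existing'), (2, 'new'), (3, 'only')]
```

The row for id 1 is left untouched because it already exists, and only the newest of the two staging rows for id 2 is inserted.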

466
MCQeasy

A data engineer wants to automatically detect when the distribution of input features to a production model has shifted significantly. Which Vertex AI feature should they enable?

A.Vertex AI Vizier
B.Vertex AI Model Monitoring
C.Vertex AI Explainable AI
D.Vertex AI Feature Store
AnswerB

Monitors prediction and feature drift/skew.

Why this answer

Vertex AI Model Monitoring is the correct service because it is specifically designed to continuously detect feature distribution drift and prediction skew in production models. It automatically compares the current input feature distribution against a baseline (e.g., training data) and triggers alerts when significant statistical shifts occur, enabling proactive retraining or investigation.

Exam trap

The trap here is that candidates confuse 'monitoring model performance' (e.g., accuracy, latency) with 'monitoring input feature distribution drift', leading them to incorrectly choose Vertex AI Vizier or Explainable AI, which address different aspects of model lifecycle management.

How to eliminate wrong answers

Option A is wrong because Vertex AI Vizier is a hyperparameter tuning service that optimizes model performance through black-box optimization, not for monitoring distribution shifts in production. Option C is wrong because Vertex AI Explainable AI provides feature attributions and explanations for individual predictions, but it does not monitor aggregate distribution changes over time. Option D is wrong because Vertex AI Feature Store is a centralized repository for storing, serving, and sharing feature data, but it lacks built-in drift detection or alerting capabilities.
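
The idea of comparing a serving distribution against a baseline can be sketched for one categorical feature using the L-infinity distance (the largest difference in any category's share). This is an illustration of the concept, not the service's implementation; the feature values and the 0.2 threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def distribution(values: Iterable[str]) -> Dict[str, float]:
    """Share of each category in the sample."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def l_infinity_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest absolute difference in share across all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Hypothetical 'device' feature: training baseline versus recent traffic.
baseline = ["mobile"] * 70 + ["desktop"] * 30
serving = ["mobile"] * 40 + ["desktop"] * 50 + ["tablet"] * 10

THRESHOLD = 0.2  # hypothetical alerting threshold
distance = l_infinity_distance(distribution(baseline), distribution(serving))
print(f"distance={distance:.2f} drift={distance > THRESHOLD}")
```

Here the share of mobile traffic falls from 70% to 40%, a difference of 0.30, which exceeds the threshold and would count as drift.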

467
MCQhard

You manage a team that deploys multiple versions of a computer vision model for A/B testing on Vertex AI Endpoints. You need to route a small percentage of traffic to a canary version while the rest goes to the stable version. You also need to gradually increase the canary traffic over time based on performance metrics. Which approach should you take?

A.Create two separate endpoints, one for each version, and use a separate load balancer to route a percentage of requests to the canary endpoint.
B.Deploy both models to the same endpoint and configure traffic splitting percentages using the Vertex AI console or API.
C.Use Cloud Armor with weighted backend services to route a portion of requests to the canary version.
D.Implement feature flags in the application code to randomly select the model version for each prediction request.
AnswerB

Vertex AI endpoints natively support traffic splitting between deployed models, allowing gradual rollout and canary testing.

Why this answer

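Vertex AI Endpoints support traffic splitting between models deployed to the same endpoint. You can assign percentage splits and adjust them programmatically, which covers both the initial canary and the gradual increase.

Option A is wrong because two separate endpoints plus a separate load balancer adds infrastructure to reproduce what a single endpoint already does. Option C is wrong because Cloud Armor is a security service for filtering and protecting traffic, not a tool for routing between model versions. Option D is wrong because feature flags push model routing into application code rather than the serving layer.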
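
The gradual increase can be driven by a small policy function that proposes the next split after each evaluation period. This is a sketch of one possible policy: the step size, the rollback-to-zero rule, and the `healthy` signal are assumptions, and applying the split to the endpoint is left out.

```python
from typing import Dict


def next_split(
    canary_pct: int,
    healthy: bool,
    step: int = 10,
    max_pct: int = 100,
) -> Dict[str, int]:
    """Raise canary traffic by one step if healthy, otherwise roll back to 0."""
    if healthy:
        canary = min(canary_pct + step, max_pct)
    else:
        canary = 0
    return {"canary": canary, "stable": 100 - canary}


# Hypothetical sequence of health evaluations, starting at 5% canary traffic.
pct = 5
for healthy in (True, True, False):
    split = next_split(pct, healthy)
    print(split)
    pct = split["canary"]
```

The two percentages always sum to 100, so each proposed split is valid to apply as-is.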

468
MCQmedium

A company has a Dataflow pipeline that reads from Pub/Sub, applies transformations, and writes to BigQuery. The pipeline is failing with 'deadline exceeded' errors during peak hours. The team suspects that the pipeline cannot keep up with the incoming data rate. They also notice that the autoscaling algorithm sets maxNumWorkers to 10, but the pipeline only scales to 5 workers. What is the most likely cause of the inadequate scaling?

A.The maxNumWorkers setting is too low and should be reduced to trigger more aggressive scaling
B.BigQuery streaming quota is limiting the number of concurrent writes
C.The Pub/Sub subscription has a per-subscriber throughput limit of 5 workers
D.The pipeline is CPU-bound and the autoscaler evaluates that adding more workers would not improve throughput
AnswerD

Autoscaler uses utilization metrics; if workers are already saturated, it may not add more.

Why this answer

Option D is correct because the autoscaler in Dataflow evaluates CPU utilization and throughput per worker. If the pipeline is CPU-bound, adding more workers does not reduce per-worker CPU load or improve throughput, so the autoscaler stops at 5 workers even though maxNumWorkers is 10. This is a classic symptom of a bottleneck that cannot be parallelized further, such as a single-threaded transformation or a hot key in a GroupByKey operation.

Exam trap

The trap here is that candidates assume autoscaling always scales to maxNumWorkers when there is a backlog, but the autoscaler only adds workers if they will actually improve throughput, and a CPU-bound pipeline is a common reason for scaling to stall.

How to eliminate wrong answers

Option A is wrong because reducing maxNumWorkers would further restrict scaling, not trigger more aggressive scaling; the autoscaler already has permission to scale to 10 but chooses not to. Option B is wrong because BigQuery streaming quota limits the rate of inserts, not the number of concurrent workers; quota exhaustion would cause insert errors, not prevent the autoscaler from adding workers. Option C is wrong because Pub/Sub subscriptions have a per-subscriber throughput limit that is very high (typically hundreds of MB/s per subscriber), and the pipeline is not hitting that limit; the limit is on throughput, not on the number of subscribers.

469
MCQeasy

A team has trained a scikit-learn model and wants to deploy it to AI Platform Prediction for online predictions. What is the required format for the model artifact?

A.A model.joblib file (or model.pkl) along with any custom code.
B.A single .h5 file containing the model weights.
C.A SavedModel directory containing the model for TensorFlow.
D.A model.pt file for PyTorch models.
AnswerA

AI Platform supports joblib/pickle for scikit-learn.

Why this answer

Option A is correct because AI Platform Prediction expects scikit-learn models to be saved as joblib or pickle files. Option B is incorrect because .h5 is the Keras format; C is the TensorFlow SavedModel format; D is a PyTorch format.

470
Multi-Selecteasy

A company is designing a data processing pipeline for real-time sensor data. They want to ensure low latency and exactly-once processing semantics. Which two Google services should they combine to achieve this? (Choose 2)

Select 2 answers
A.Cloud Dataproc with Spark Streaming
B.Cloud Functions with Cloud Pub/Sub triggers
C.Cloud Pub/Sub with exactly-once delivery
D.Cloud Dataflow with exactly-once processing mode
E.Cloud IoT Core with device gateways
AnswersC, D

Pub/Sub can be configured for exactly-once delivery to subscribers.

Why this answer

Cloud Pub/Sub with exactly-once delivery (Option C) ensures that each message is delivered to subscribers exactly once, preventing duplicates in the pipeline. Cloud Dataflow with exactly-once processing mode (Option D) provides end-to-end exactly-once semantics by leveraging consistent snapshots and idempotent sinks, which is critical for real-time sensor data pipelines requiring low latency and accuracy.
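The role of an idempotent sink can be sketched in a few lines: if every write is keyed by a stable message ID, a redelivered message changes nothing. This is a minimal in-memory illustration, not Dataflow's or BigQuery's implementation; the `IdempotentSink` class and the sample message IDs are hypothetical.

```python
class IdempotentSink:
    """Stores at most one row per message ID, so replays have no effect."""

    def __init__(self):
        self.rows = {}

    def write(self, message_id, payload):
        # Skip the write if this message ID has already been stored.
        if message_id in self.rows:
            return False
        self.rows[message_id] = payload
        return True


sink = IdempotentSink()
# "m1" arrives twice, as it could after a retry upstream.
deliveries = [("m1", 20.5), ("m2", 21.0), ("m1", 20.5)]
written = [sink.write(message_id, value) for message_id, value in deliveries]
```

The duplicate delivery is reported as not written, and the sink ends up with one row per distinct message.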

Exam trap

Google Cloud often tests the misconception that Cloud Pub/Sub alone provides end-to-end exactly-once processing, but candidates must recognize that Pub/Sub only guarantees delivery exactly once to subscribers, while Dataflow is needed to ensure processing exactly once across transformations and sinks.

471
MCQmedium

A data engineer is designing a batch ETL pipeline using Cloud Composer and Dataflow. The pipeline must be self-healing and retry on failures. Which Composer feature should they configure?

A.Use Cloud Tasks for retries
B.Retry policy on the DAG
C.Cloud Composer with high availability
D.Dataflow retries
AnswerB

Composer DAGs can have retry policies for tasks.

Why this answer

Option B is correct because Cloud Composer (based on Apache Airflow) allows you to configure a retry policy directly on the DAG or individual tasks. This enables the pipeline to automatically retry failed tasks according to parameters like `retries`, `retry_delay`, and `retry_exponential_backoff`, making the ETL pipeline self-healing without external services.
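A retry policy of this kind is usually expressed as a `default_args` dictionary that is passed to the DAG. The sketch below builds such a dictionary with the three parameters named above and estimates the resulting waits. The values (3 retries, 5 minutes) are illustrative, and `approximate_delays` is a hypothetical helper that uses simple doubling; Airflow's actual backoff also applies jitter and an optional maximum delay.

```python
from datetime import timedelta

# Task-level retry settings; in a real DAG this dict is passed as default_args.
default_args = {
    "retries": 3,
    "retry_delay": timedelta(minutes=5),
    "retry_exponential_backoff": True,
}


def approximate_delays(args):
    """Simplified model of the wait before each retry attempt."""
    base = args["retry_delay"]
    if not args["retry_exponential_backoff"]:
        return [base] * args["retries"]
    # Double the base delay on each successive attempt.
    return [base * (2 ** attempt) for attempt in range(args["retries"])]


delays = approximate_delays(default_args)
```

Setting the policy in `default_args` applies it to every task in the DAG, and individual operators can still override it.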

Exam trap

Google Cloud often tests the distinction between orchestration-level retries (Composer DAG) and execution-level retries (Dataflow), leading candidates to pick Dataflow retries (Option D) when the question explicitly asks for a Composer feature.

How to eliminate wrong answers

Option A is wrong because Cloud Tasks is a fully managed queue service for asynchronous task execution, not a feature of Cloud Composer; it would introduce unnecessary complexity and is not the native way to handle retries within a Composer DAG. Option C is wrong because high availability (HA) for Cloud Composer ensures the Airflow components are resilient to zone failures, but it does not configure task-level retry behavior for pipeline failures. Option D is wrong because Dataflow retries handle failures at the Dataflow job level (e.g., worker failures), but the question asks for a Composer feature to manage retries of the overall pipeline orchestration, not the underlying data processing job.

472
Matchingmedium

Match each BigQuery feature to its description.

Concepts
Matches

Sorting data within partitions to improve query performance

Dividing tables into segments based on a date/timestamp column

Unit of computational capacity in BigQuery

Pre-computed query results for faster access

Why these pairings

BigQuery features that optimize performance and cost.

473
MCQeasy

A company uses Cloud Dataflow to process streaming data. They notice that the pipeline's throughput is lower than expected and the system is experiencing high latency. What is the most likely cause?

A.Using batch mode instead of streaming mode
B.Too many workers
C.Too few workers
D.Incorrect watermark setting
AnswerC

Insufficient workers cause backpressure and latency.

Why this answer

Option C is correct because insufficient workers are a common cause of low throughput and high latency: autoscaling may be disabled, or the worker count may be capped too low. Option A is wrong because the pipeline is already processing streaming data, so batch mode is not in play.

Option D is incorrect; watermark settings affect how late data is handled, not throughput. Option B is wrong; too many workers would not cause high latency.

474
MCQeasy

A team deployed a model to Vertex AI Endpoint and notices latency spikes during peak hours. What should they first investigate?

A.Switch to batch prediction
B.Reduce number of features
C.Increase machine type
D.Check if autoscaling is enabled and configured correctly
AnswerD

Autoscaling misconfiguration is a common cause of latency spikes during traffic surges.

Why this answer

Latency spikes during peak hours typically indicate that the serving infrastructure is unable to handle the increased request volume. The first step is to check if autoscaling is enabled and configured correctly on the Vertex AI Endpoint, as this determines whether additional compute nodes are automatically provisioned to match demand. Without proper autoscaling, the endpoint will be overwhelmed, leading to queuing delays and latency spikes.

Exam trap

Google Cloud often tests the misconception that latency spikes are always due to model complexity or feature engineering, when in fact the first diagnostic step should always be to verify the serving infrastructure's scaling configuration.

How to eliminate wrong answers

Option A is wrong because switching to batch prediction is for asynchronous, non-real-time inference and does not address the root cause of latency spikes during online serving. Option B is wrong because reducing the number of features may lower model complexity but does not directly resolve infrastructure scaling issues; latency spikes are typically due to insufficient compute resources, not feature count. Option C is wrong because increasing the machine type (e.g., using a larger VM) may improve per-request performance but does not solve the problem of handling concurrent peak traffic; without autoscaling, a single larger machine can still be overwhelmed.

475
Matchingmedium

Match each Google Cloud data service to its primary use case.

Concepts
Matches

Serverless data warehouse for analytics

Object storage for unstructured data

Globally distributed relational database

NoSQL wide-column database for low-latency workloads

Asynchronous messaging service for event-driven systems

Why these pairings

These are core Google Cloud data services with distinct primary use cases.

476
MCQmedium

Refer to the exhibit. A team uses this Cloud Build configuration to deploy a service to Cloud Run. The deployment step fails with a 'Permission denied' error. What is the most likely cause?

A.The Dockerfile is missing from the repository.
B.The Docker image tag is missing or malformed.
C.The region 'us-central1' is incorrect for Cloud Run.
D.The Cloud Build service account does not have the Cloud Run Admin role.
AnswerD

The deploy step requires IAM permissions to create/update Cloud Run services; typically the Cloud Build service account needs roles/run.admin.

Why this answer

Cloud Build uses its default service account (or custom service account) which needs the Cloud Run Admin role (roles/run.admin) to deploy services. The error indicates the service account lacks permission to create or update the Cloud Run service.

477
MCQhard

A healthcare analytics company runs a nightly Dataproc workflow that reads radiology reports from Cloud Storage (CSV files), transforms them using PySpark, and writes results to BigQuery. The workflow is orchestrated by Cloud Composer. Recently, the job has started failing with 'Disk quota exceeded' errors on the worker nodes. The data volume has grown 5x over the past month. Currently, the cluster uses 5 n1-standard-4 workers (each 10GB persistent disk). The PySpark jobs heavily use intermediate shuffles. You need a cost-effective solution that avoids future failures as data grows. What should you do?

A.Upgrade the worker machine type to n1-standard-8 with local SSDs for shuffle storage.
B.Increase the persistent disk size on each worker node to 100 GB.
C.Add more preemptible workers to the cluster and keep boot disk size at 10GB.
D.Use Cloud Dataflow instead of Dataproc, as it handles disk management transparently.
AnswerB

More disk space per worker allows shuffles to complete without quota errors.

Why this answer

The 'Disk quota exceeded' error occurs because the 10 GB persistent disks on the n1-standard-4 workers are too small to accommodate the intermediate shuffle data, which has grown 5x. Increasing the persistent disk size to 100 GB directly addresses the storage bottleneck without changing the machine type or incurring the cost of local SSDs, making it a cost-effective solution that scales with data growth.

Exam trap

The trap here is that candidates may over-engineer the solution by upgrading machine types or switching to a different service (Dataflow) when the root cause is simply insufficient disk space for shuffle data, which is easily fixed by increasing the persistent disk size.

How to eliminate wrong answers

Option A is wrong because upgrading to n1-standard-8 with local SSDs is overkill and more expensive; the issue is disk space for shuffle data, not CPU or memory, and local SSDs are ephemeral and not cost-effective for persistent storage needs. Option C is wrong because adding more preemptible workers does not increase the persistent disk size per worker; each worker still has only 10 GB, so shuffle data will still exceed the disk quota on those nodes. Option D is wrong because migrating to Cloud Dataflow is a significant architectural change that incurs migration costs and learning curve, and it does not address the immediate disk quota issue in the existing Dataproc workflow; Dataflow also has its own disk management limits.

478
Multi-Selectmedium

A company uses Cloud Composer to orchestrate data pipelines. They have a DAG that runs hourly and processes files from Cloud Storage. The DAG is triggered by a Pub/Sub message sent from a Cloud Storage bucket notification. Recently, some DAG runs are not starting even though the Pub/Sub messages are published. Which two likely causes should the team investigate? (Choose TWO.)

Select 2 answers
A.The Cloud Storage bucket notification is not sending messages to the correct Pub/Sub topic, or the subscription's ack deadline is too short.
B.The DAG's start_date is set in the past and catchup is set to False, so DAG runs are only triggered on schedule.
C.The total number of DAGs in the environment exceeds the maximum limit of 100, causing DAG processing to stop.
D.The DAG's schedule interval is set too frequently, causing the executor queue to be full and new runs are skipped.
E.The Cloud Composer environment is using a pull subscription instead of a push subscription for the Pub/Sub sensor.
AnswersA, D

A is correct because misconfiguration of the notification or subscription can cause message loss.

Why this answer

Option A is correct because if the Cloud Storage bucket notification is misconfigured to send messages to the wrong Pub/Sub topic, the Pub/Sub sensor in the DAG will never receive the trigger message, causing DAG runs to not start. Additionally, if the subscription's ack deadline is too short, the message may be acknowledged before the sensor processes it, leading to message loss and missed triggers. Both issues directly prevent the DAG from being triggered by Pub/Sub messages.

Exam trap

Google Cloud often tests the misconception that a push subscription is required for Pub/Sub sensors in Cloud Composer, when in fact the sensor uses a pull subscription and the ack deadline is the critical parameter to manage.

479
Drag & Dropmedium

Drag and drop the steps to set up a BigQuery dataset with a scheduled query into the correct order.

Steps
Order

Why this order

Scheduled queries allow automating recurring data transformations and loads.

480
MCQeasy

A company uses BigQuery for real-time analytics. They stream data from IoT devices into a BigQuery table. After a few hours, some of the recent data becomes visible in the table although it was streamed less than 10 minutes ago. The data team confirms that no one ran any manual queries. What is the most likely reason for the data visibility?

A.The data was stored in the streaming buffer for more than 24 hours, and BigQuery automatically flushes it to the table.
B.BigQuery time travel allows querying data from the past, including data still in the streaming buffer.
C.The table has an expiration set, and the data is made visible as soon as the table is about to expire.
D.The streaming buffer reached its maximum capacity (default 90 minutes) and automatically flushed the data to the table.
AnswerD

D is correct because the streaming buffer flushes data approximately every 90 minutes, making it visible.

Why this answer

Option D is correct because BigQuery's streaming buffer has a maximum capacity limit, typically around 90 minutes. When the buffer reaches this capacity, BigQuery automatically flushes the buffered data to the table, making it visible. This explains why data streamed less than 10 minutes ago became visible after a few hours.

Exam trap

The trap here is that candidates often assume streaming data is immediately visible or that time travel is responsible for visibility, but BigQuery's streaming buffer has a finite capacity that triggers automatic flushes, making data visible after a delay.

How to eliminate wrong answers

Option A is wrong because the streaming buffer does not have a 24-hour retention; data is flushed automatically within about 90 minutes or when the buffer reaches capacity, not after 24 hours. Option B is wrong because BigQuery time travel allows querying historical data within a 7-day window, but it does not cause data in the streaming buffer to become visible; it only affects how you query already-committed data. Option C is wrong because table expiration settings control when the table is deleted, not when streaming data becomes visible; data visibility is independent of table expiration.

481
Multi-Selectmedium

Which TWO are best practices for monitoring a deployed machine learning model in production on Vertex AI?

Select 2 answers
A.Set up a weekly retraining pipeline triggered by calendar schedule
B.Enable Vertex AI Model Monitoring to track feature drift and skew
C.Monitor the training job duration to detect anomalies
D.Monitor the distribution of predictions over time to detect concept drift
E.Monitor the model's file size to ensure it hasn't changed
AnswersB, D

Model Monitoring automatically detects drift.

Why this answer

Option B is correct because Vertex AI Model Monitoring automatically tracks feature skew and drift: skew compares the serving data distribution against the training data distribution, and drift compares serving data over time. It scores the difference with distance measures such as Jensen-Shannon divergence for numerical features and L-infinity distance for categorical features. This is a best practice for detecting data quality issues that can degrade model performance in production.
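Drift detection comes down to scoring the distance between a baseline distribution and a serving distribution and comparing it to a threshold. The sketch below uses Jensen-Shannon divergence as one such distance over binned feature values. The histograms and the `THRESHOLD` value are made up for illustration, and this is not the service's own code.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""

    def kl(a, b):
        # Terms with zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    midpoint = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, midpoint) + 0.5 * kl(q, midpoint)


# Binned feature distributions (each sums to 1); values are illustrative.
training = [0.5, 0.3, 0.2]
serving_same = [0.5, 0.3, 0.2]
serving_shifted = [0.1, 0.2, 0.7]

THRESHOLD = 0.1  # hypothetical alerting threshold
drifted = js_divergence(training, serving_shifted) > THRESHOLD
```

With base-2 logarithms the score is bounded between 0 and 1, which makes a single threshold easy to reason about across features.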

Exam trap

The trap here is that candidates confuse operational maintenance tasks (like scheduled retraining) with monitoring tasks, or they focus on infrastructure metrics (like job duration or file size) instead of data and prediction distribution monitoring, which directly impact model accuracy in production.

482
MCQmedium

Your team is using Vertex AI Pipelines to orchestrate a model retraining workflow. The pipeline includes a data validation step, a training step, and a model evaluation step. You want to ensure that if the evaluation step fails due to low model performance, the pipeline stops and does not deploy the model. Which approach should you use?

A.Run the evaluation step after deployment and roll back if performance is low
B.Configure the evaluation step to retry up to 3 times on failure
C.Use a Conditional in the pipeline to check evaluation metrics and only run the deployment step if metrics pass thresholds
D.Create a separate pipeline for deployment and trigger it manually after review
AnswerC

Conditionals allow pipeline to branch based on results.

Why this answer

Option C is correct because Vertex AI Pipelines supports conditional execution via the `Condition` component, which allows you to evaluate model performance metrics (e.g., accuracy, RMSE) and gate subsequent steps. By placing the deployment step inside a conditional branch that only executes when evaluation metrics meet predefined thresholds, the pipeline automatically stops and avoids deploying a poor-performing model. This approach aligns with MLOps best practices for automated gating in production pipelines.
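The gating logic itself is small. The sketch below shows it in plain Python rather than the pipeline DSL; in Kubeflow Pipelines the same check would sit inside a `dsl.Condition` block. The step names, metric names, and thresholds are hypothetical.

```python
def should_deploy(metrics, thresholds):
    """True only if every thresholded metric meets its minimum value."""
    return all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )


def run_pipeline(metrics, thresholds):
    """Return the steps that would run for the given evaluation metrics."""
    steps = ["validate_data", "train", "evaluate"]
    # The deployment step is only reached when the gate passes.
    if should_deploy(metrics, thresholds):
        steps.append("deploy")
    return steps


good_run = run_pipeline({"accuracy": 0.91}, {"accuracy": 0.90})
bad_run = run_pipeline({"accuracy": 0.80}, {"accuracy": 0.90})
```

A missing metric is treated as a failure here, which errs on the side of not deploying.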

Exam trap

The trap here is that candidates confuse retry logic (Option B) with conditional gating, mistakenly thinking that retrying a failed evaluation step will somehow improve model performance, when in fact retries only handle transient errors, not metric-based failures.

How to eliminate wrong answers

Option A is wrong because running the evaluation step after deployment and then rolling back violates the principle of failing fast; it wastes compute resources and risks serving a bad model to users before rollback. Option B is wrong because retrying the evaluation step on failure does not address the root cause — low model performance — and would simply re-run the same evaluation, potentially masking the failure or delaying the pipeline. Option D is wrong because creating a separate pipeline for manual deployment defeats the purpose of automation and introduces human latency and error, contradicting the goal of an automated orchestrated workflow.

483
MCQhard

A company runs a batch data processing workload using Dataproc clusters that are auto-scaled based on YARN memory utilization. During peak times, jobs take much longer than expected. Analysis shows the cluster is not scaling up despite high YARN memory utilization. What is the most likely cause?

A.Spark dynamic allocation is disabled, preventing executors from using added workers
B.The cluster autoscaler is misconfigured to scale based on CPU, not memory
C.The autoscaler is set to scale down secondary workers, not up
D.The cluster is using primary workers only; auto-scaling only adds secondary workers
AnswerD

Auto-scaling adds secondary workers, not primary; if only primary workers exist, no scale-up occurs.

Why this answer

Dataproc clusters have two types of workers: primary workers (which run both HDFS and compute) and secondary workers (compute-only). The autoscaler can only add or remove secondary workers; it cannot scale primary workers. If the cluster uses only primary workers, the autoscaler has no secondary workers to add, so it cannot scale up even under high YARN memory utilization.

This explains why the cluster remains static during peak times.

Exam trap

The trap here is that candidates assume autoscaling applies to all worker nodes equally, overlooking the Dataproc-specific distinction between primary and secondary workers and the autoscaler's limitation to secondary workers only.

How to eliminate wrong answers

Option A is wrong because Spark dynamic allocation controls how executors are distributed within existing nodes, not how the cluster adds new nodes; even if disabled, the autoscaler would still attempt to add workers if configured correctly. Option B is wrong because the question explicitly states the autoscaler is based on YARN memory utilization, not CPU; a misconfiguration to CPU would cause scaling based on CPU metrics, but the symptom here is no scaling at all, not scaling on the wrong metric. Option C is wrong because the autoscaler is designed to scale up secondary workers when utilization is high; a misconfiguration to scale down would cause premature removal of workers, not a failure to scale up.

484
MCQhard

A team is training a large model using a custom container with TensorFlow on Vertex AI Training. They need to use multiple GPUs across several machines. Which strategy should they implement to maximize training throughput?

A.Use Cloud TPU Pods for distributed training
B.Use Dataflow for distributed training
C.Use Vertex AI Training with a custom job specifying workerPoolSpecs and MultiWorkerMirroredStrategy
D.Use a single worker with multiple GPUs and TensorFlow MirroredStrategy
AnswerC

MultiWorkerMirroredStrategy distributes across multiple machines.

Why this answer

Vertex AI custom jobs support multi-worker distributed training: workerPoolSpecs defines the machines and replica counts, and the training code uses TensorFlow's MultiWorkerMirroredStrategy to synchronize gradients across them. Using a single VM with multiple GPUs is limited by that machine's capabilities. MirroredStrategy addresses single-machine multi-GPU training, not multi-machine training.
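As a rough sketch of what a multi-machine job specification looks like, the helper below builds a list of worker pool specs as plain dictionaries: one chief replica plus a pool of workers, all on the same GPU machine shape. Field names follow the snake_case form used for worker pool specs in the Python SDK. The `build_worker_pool_specs` helper, the machine type, the accelerator choice, and the image URI are assumptions for illustration.

```python
def build_worker_pool_specs(image_uri, worker_count, gpus_per_machine):
    """Build a chief pool and a worker pool that share one machine shape."""
    machine_spec = {
        "machine_type": "n1-standard-8",        # illustrative choice
        "accelerator_type": "NVIDIA_TESLA_T4",  # illustrative choice
        "accelerator_count": gpus_per_machine,
    }

    def pool(replicas):
        return {
            "machine_spec": dict(machine_spec),
            "replica_count": replicas,
            "container_spec": {"image_uri": image_uri},
        }

    # The first pool holds the single chief; the second holds the other workers.
    return [pool(1), pool(worker_count)]


# Hypothetical image URI; three workers plus the chief, two GPUs each.
specs = build_worker_pool_specs("example-registry/trainer:latest", 3, 2)
total_gpus = sum(
    s["replica_count"] * s["machine_spec"]["accelerator_count"] for s in specs
)
```

The training code inside the container still has to create the distribution strategy; the job spec only provisions the machines.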

485
Multi-Selectmedium

A data warehouse team uses Cloud BigQuery for analytics. They want to optimize query performance and reduce costs. Which three actions should they take? (Choose 3)

Select 3 answers
A.Use partitioned tables on time columns
B.Use clustered tables on frequently filtered columns
C.Use automatic reclustering
D.Use materialized views for aggregations
E.Use BI Engine for all queries
AnswersA, B, D

Partitioning allows queries to skip irrelevant partitions, reducing cost and improving speed.

Why this answer

Option A is correct because partitioning tables on time columns (e.g., DATE, TIMESTAMP) in BigQuery allows the query engine to perform partition pruning, scanning only the relevant partitions instead of the entire table. This directly reduces the amount of data read, lowering query costs and improving performance by limiting I/O to the necessary time range.
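The effect of partition pruning can be shown with simple arithmetic: a filter on the partitioning column means only the matching partitions are read. The sketch below models a date-partitioned table as a mapping from day to size; the dates and sizes are made up, and `bytes_scanned` is a hypothetical helper rather than a BigQuery API.

```python
from datetime import date

# Size of each daily partition in MB (illustrative values).
partitions = {
    date(2024, 1, 1): 100,
    date(2024, 1, 2): 120,
    date(2024, 1, 3): 90,
}


def bytes_scanned(partitions, start, end):
    """Sum only the partitions whose date falls inside the filter range."""
    return sum(size for day, size in partitions.items() if start <= day <= end)


full_scan = bytes_scanned(partitions, date.min, date.max)
pruned_scan = bytes_scanned(partitions, date(2024, 1, 3), date(2024, 1, 3))
```

Under on-demand pricing the cost tracks the data scanned, so the pruned query here would be billed for 90 MB instead of 310 MB.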

Exam trap

Google Cloud often tests the distinction between automatic reclustering as a passive maintenance feature versus an active optimization action, leading candidates to mistakenly select it as a cost-saving measure when it is actually a built-in behavior that does not require manual intervention.

486
Multi-Selectmedium

Which TWO steps are required to deploy a custom scikit-learn model to Vertex AI for online predictions?

Select 2 answers
A.Write a custom prediction routine
B.Containerize the model using Docker
C.Save the model using joblib or pickle
D.Create a Vertex AI Endpoint manually
E.Upload the model to Vertex AI Model Registry
AnswersC, E

Vertex AI expects a saved model artifact.

Why this answer

Option C is correct because scikit-learn models must be serialized using joblib or pickle to be saved as a model artifact that can be uploaded to Vertex AI. Vertex AI's pre-built prediction containers for scikit-learn expect the model file to be in this format (typically model.joblib or model.pkl) to serve online predictions.

Exam trap

Google Cloud often tests the misconception that you must always write a custom prediction routine or containerize your model, when in fact Vertex AI provides pre-built containers for popular frameworks like scikit-learn, making steps A and B unnecessary for standard deployments.

487
MCQeasy

A data scientist wants to automate retraining of a classification model when new labeled data arrives. The model is deployed on AI Platform Prediction. Which Google Cloud service should be used to orchestrate the retraining pipeline?

A.AI Platform Prediction
B.AI Platform Pipelines
C.AI Platform Continuous Evaluation
D.Cloud Dataflow
AnswerB

AI Platform Pipelines provides a way to build and orchestrate ML pipelines.

Why this answer

AI Platform Pipelines (now Vertex AI Pipelines) is the correct service because it provides a fully managed, serverless orchestration engine for building, deploying, and running machine learning pipelines. It integrates with Kubeflow Pipelines and TensorFlow Extended (TFX) to automate the retraining workflow when new labeled data arrives, enabling continuous training and model versioning without manual intervention.

Exam trap

Google Cloud often tests the distinction between services that execute ML tasks (like prediction or evaluation) versus services that orchestrate the workflow; the trap here is that candidates confuse AI Platform Prediction (serving) or Cloud Dataflow (data processing) with pipeline orchestration, missing that AI Platform Pipelines is purpose-built for automating multi-step ML workflows.

How to eliminate wrong answers

Option A is wrong because AI Platform Prediction is a serving endpoint for deploying trained models to make predictions; it does not orchestrate retraining pipelines. Option C is wrong because AI Platform Continuous Evaluation is a service for monitoring model performance and detecting drift, not for orchestrating retraining workflows. Option D is wrong because Cloud Dataflow is a stream and batch data processing service (based on Apache Beam) used for data transformation and ETL, not for orchestrating end-to-end ML pipelines with conditional retraining logic.

488
MCQhard

A company has a production machine learning model deployed on Vertex AI Endpoint that predicts customer churn. The model is retrained weekly using a Vertex AI Pipeline that pulls new data from BigQuery. Recently, the model's accuracy has been declining. The data science team suspects data drift but is unsure. They have enabled Vertex AI Model Monitoring but have not set up any alerts. The team wants to diagnose and address the issue quickly. The pipeline runs successfully, and no errors are reported. The model endpoint is serving predictions with average latency of 200ms. What should the team do first?

A.Immediately trigger a retraining pipeline with more recent data
B.Increase the number of replicas to reduce latency
C.Examine Cloud Logging for prediction errors
D.Review Vertex AI Model Monitoring drift reports and set up alerts for significant drift
AnswerD

Directly addresses drift detection.

Why this answer

Option D is correct because the team has already enabled Vertex AI Model Monitoring, which automatically tracks feature distributions and prediction statistics over time. The first diagnostic step should be to review the drift reports generated by Model Monitoring to confirm whether data drift is occurring, and then set up alerts so the team is proactively notified of significant drift in the future. This directly addresses the suspected root cause without unnecessary operational changes.

Exam trap

Google Cloud often tests the misconception that any model performance decline must be fixed by immediate retraining or infrastructure scaling, when the correct first step is always to diagnose the root cause using the monitoring tools already in place.

How to eliminate wrong answers

Option A is wrong because blindly retraining with more recent data without first confirming data drift may waste resources and could even degrade model performance if the new data is not representative or contains label errors. Option B is wrong because increasing replicas addresses latency, not accuracy decline; the current 200ms latency is well within acceptable bounds and is unrelated to the accuracy problem. Option C is wrong because Cloud Logging captures prediction errors (e.g., runtime exceptions, invalid inputs), but the pipeline runs successfully with no errors, so examining logs for errors will not reveal gradual accuracy degradation caused by data drift.

489
MCQhard

A financial services company deploys a fraud detection model on Vertex AI using a custom prediction container that runs a PyTorch model. The model requires GPU acceleration. The deployment succeeds but predictions return an error: 'CUDA error: out of memory'. What should the team do to resolve this issue?

A.Change the container to use a CPU-only image to avoid CUDA errors
B.Increase the GPU machine type to one with more memory (e.g., from NVIDIA T4 to A100)
C.Enable Vertex AI Model Monitoring to automatically scale the endpoint
D.Add CPU replicas to distribute the inferencing load
AnswerB

The CUDA out of memory error indicates the current GPU cannot hold the model; a larger GPU or model optimization is needed.

Why this answer

Option B is correct because the GPU memory is insufficient; using a machine with more GPU memory or optimizing the model is the solution. Option C (enabling model monitoring) does not fix memory and does not scale the endpoint. Option D (adding CPU replicas) does not address GPU memory.

Option A (using a CPU-only image) would defeat the purpose of GPU acceleration.

490
MCQeasy

Your company has a machine learning model that predicts customer churn. The model is deployed on Vertex AI Endpoints with autoscaling. After a marketing campaign, traffic to the endpoint increases by 10x. Some predictions start failing with 'HTTP 503 Service Unavailable' errors. What is the most likely cause?

A.The model container has a memory leak.
B.The model's accuracy has degraded due to data drift.
C.The autoscaling configuration has insufficient maximum nodes to handle the traffic.
D.The model is using an older version that is not supported.
AnswerC

Autoscaling with too few max nodes cannot scale up to meet demand, causing overload and 503 errors.

Why this answer

A 503 Service Unavailable error from Vertex AI Endpoints indicates that the endpoint is overwhelmed and cannot handle the incoming request volume. With a 10x traffic spike and autoscaling configured, the most likely cause is that the autoscaling configuration has insufficient maximum nodes, so the endpoint cannot scale out enough to handle the load, causing requests to be rejected.
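A quick capacity estimate makes the failure mode concrete. The sketch below computes how many replicas a traffic level needs and checks it against the autoscaling ceiling. The request rates, per-replica capacity, and maximum replica count are hypothetical numbers, and the helpers are not part of any SDK.

```python
import math


def replicas_needed(requests_per_second, per_replica_capacity):
    """Smallest replica count that can serve the given request rate."""
    return math.ceil(requests_per_second / per_replica_capacity)


def can_absorb(requests_per_second, per_replica_capacity, max_replica_count):
    """True if autoscaling can reach the required replica count."""
    needed = replicas_needed(requests_per_second, per_replica_capacity)
    return needed <= max_replica_count


baseline_rps = 50       # illustrative normal traffic
per_replica_rps = 25    # illustrative capacity of one replica
max_replicas = 5        # illustrative autoscaling ceiling

normal_ok = can_absorb(baseline_rps, per_replica_rps, max_replicas)
spike_ok = can_absorb(baseline_rps * 10, per_replica_rps, max_replicas)
```

In this example the 10x spike needs 20 replicas against a ceiling of 5, so the excess requests have nowhere to go and are rejected.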

Exam trap

Google Cloud often tests the distinction between model-level errors (e.g., data drift, accuracy degradation) and infrastructure-level errors (e.g., 503, 429, timeout), so the trap here is that candidates confuse a model performance issue with a scaling/availability issue.

How to eliminate wrong answers

Option A is wrong because a memory leak in the model container would cause gradual performance degradation or OOM kills, not a sudden 503 error under high traffic; Vertex AI would still attempt to serve requests until the container crashes. Option B is wrong because data drift affects prediction accuracy (e.g., wrong predictions), not the availability or HTTP status of the endpoint; 503 errors are infrastructure-level, not model-level. Option D is wrong because using an unsupported older version would cause deployment or startup failures, not transient 503 errors under load; Vertex AI would reject the deployment or return a different error (e.g., 400 or 404) if the version is incompatible.

491
MCQmedium

A company runs a Dataflow pipeline that reads from Pub/Sub, aggregates events in a 10-minute fixed window, and writes to BigQuery. Recently, the pipeline has been failing with 'high uncommitted bytes' errors during periods of high traffic. What is the most likely cause and recommended action?

A.Reduce the window size from 10 minutes to 1 minute to decrease the amount of data per window.
B.Increase the number of worker machines to handle higher throughput.
C.Use a global window with a trigger that fires early based on element count to reduce the number of open windows.
D.Set a maximum number of workers and use a Pub/Sub flow control setting to limit incoming messages.
AnswerC

A global window with early triggers can reduce the number of panes and mitigate the high uncommitted bytes problem.

Why this answer

The 'high uncommitted bytes' error in Dataflow occurs when the system holds too much data in memory across many open windows, exceeding the default 200 MB limit. Using a global window with an early trigger based on element count reduces the number of simultaneous open windows and allows data to be committed more frequently, preventing memory pressure. This approach is recommended over reducing window size or scaling workers because the root cause is window fan-out, not throughput or parallelism.

Exam trap

Google Cloud often tests the misconception that scaling workers or reducing window size solves memory pressure, when the real issue is the number of open windows in a stateful pipeline.

How to eliminate wrong answers

Option A is wrong because reducing the window size from 10 minutes to 1 minute increases the number of open windows (from 6 per hour to 60 per hour), which would worsen the 'high uncommitted bytes' issue by creating more in-memory state. Option B is wrong because increasing worker machines does not address the fundamental problem of excessive open windows consuming memory; it may temporarily mask the issue but will not reduce the per-worker uncommitted bytes. Option D is wrong because setting a maximum number of workers and Pub/Sub flow control limits incoming messages but does not reduce the number of open windows or the memory used by uncommitted data; it may cause backpressure and data loss without fixing the window state explosion.
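
The "6 per hour versus 60 per hour" figure used to eliminate option A follows from how fixed windows are assigned. The sketch below is a plain-Python illustration, not Beam code; `window_start` and `distinct_windows` are hypothetical helper names, and timestamps are whole seconds.

```python
def window_start(timestamp_s: int, window_size_s: int) -> int:
    """Start of the fixed window that contains the given timestamp."""
    return timestamp_s - (timestamp_s % window_size_s)


def distinct_windows(timestamps, window_size_s: int) -> int:
    """Number of different fixed windows touched by the timestamps."""
    return len({window_start(t, window_size_s) for t in timestamps})


one_hour_of_events = range(0, 3600)  # one event per second for an hour

print(distinct_windows(one_hour_of_events, 600))  # 6  (10-minute windows)
print(distinct_windows(one_hour_of_events, 60))   # 60 (1-minute windows)
```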

492
MCQmedium

A company is using Dataflow to stream data from Cloud Pub/Sub to BigQuery. The pipeline includes a custom ParDo transformation that enriches the data with external API calls. The pipeline is experiencing high latency and occasional failures due to API timeouts. What strategy should be employed to improve reliability and performance?

A.Remove the enrichment step and store raw data in BigQuery.
B.Use a global window to accumulate all data before enrichment.
C.Use a DoFn with stateful processing and batch API calls using asynchronous HTTP client.
D.Increase the number of workers to parallelize API calls.
AnswerC

Batching and async calls reduce per-element latency and handle timeouts gracefully.

Why this answer

Option C is correct because using a DoFn with stateful processing and an asynchronous HTTP client allows the pipeline to batch API calls and handle timeouts without blocking the main processing thread. This reduces latency by enabling concurrent requests and improves reliability through retry logic and state management, which is essential for external API enrichment in Dataflow.

Exam trap

Google Cloud often tests the misconception that scaling workers (Option D) is a universal fix for performance issues, but the trap here is that API timeouts are often caused by the external service's capacity, not the pipeline's parallelism, and stateful batching with async calls is the correct architectural pattern.

How to eliminate wrong answers

Option A is wrong because removing the enrichment step defeats the purpose of the pipeline and does not address the underlying issue of API call reliability. Option B is wrong because using a global window to accumulate all data before enrichment would introduce unbounded state and memory pressure, and it does not solve API timeout problems; it would also break the streaming nature of the pipeline. Option D is wrong because simply increasing the number of workers does not fix API timeouts; it may even exacerbate the problem by overwhelming the external API with more concurrent requests, leading to more failures.
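
The batching-plus-async pattern behind option C can be sketched in plain Python with `asyncio`. This is a minimal illustration under stated assumptions, not Beam code: `fake_enrich_api` is a hypothetical stand-in for the external service, and the batch size, timeout, retry count, and concurrency limit are made-up values. In a real pipeline this logic would sit inside a DoFn, and Beam's `GroupIntoBatches` transform is one way to form the batches.

```python
import asyncio


async def fake_enrich_api(batch):
    """Hypothetical external API: one call enriches a whole batch."""
    await asyncio.sleep(0.01)  # simulated network latency
    return [{**record, "enriched": True} for record in batch]


async def enrich_batch(batch, timeout_s=1.0, max_attempts=3):
    """Call the API with a timeout; retry with backoff, then give up."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await asyncio.wait_for(fake_enrich_api(batch), timeout=timeout_s)
        except asyncio.TimeoutError:
            if attempt == max_attempts:
                # Out of retries: pass the records through, marked as not enriched.
                return [{**record, "enriched": False} for record in batch]
            await asyncio.sleep(0.05 * 2 ** (attempt - 1))  # exponential backoff


def chunk(records, batch_size):
    """Split records into consecutive batches of at most batch_size."""
    return [records[i:i + batch_size] for i in range(0, len(records), batch_size)]


async def enrich_all(records, batch_size=10, max_in_flight=4):
    """Enrich all records, capping the number of concurrent API calls."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(batch):
        async with semaphore:
            return await enrich_batch(batch)

    results = await asyncio.gather(*(guarded(b) for b in chunk(records, batch_size)))
    return [record for batch in results for record in batch]


events = [{"id": i} for i in range(25)]
enriched = asyncio.run(enrich_all(events))
print(len(enriched), all(r["enriched"] for r in enriched))  # 25 True
```

The semaphore is the part that addresses the concern raised against option D: concurrency is bounded on purpose so the pipeline does not overwhelm the external API.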

493
Multi-Selectmedium

Which THREE metrics should be monitored to detect model drift in a production ML system?

Select 3 answers
A.Training loss convergence.
B.Prediction distribution (prediction drift).
C.Feature distribution (data drift).
D.CPU utilization of the serving nodes.
E.Model performance metrics (e.g., accuracy, precision, recall) on a ground truth dataset.
AnswersB, C, E

Changes in prediction distribution can indicate concept drift.

Why this answer

Prediction drift (distribution of predictions), feature drift (distribution of input features), and model performance metrics (e.g., accuracy) are key indicators. Infrastructure metrics (CPU usage) and training loss are not directly drift indicators.
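
One way to turn "feature distribution" or "prediction distribution" into a number is the Population Stability Index (PSI), which compares binned fractions between a baseline sample and a current sample. The explanation above does not name PSI; it is used here only as one common illustration. The sample values, bin edges, and the 0.2 alert threshold are assumptions for this sketch.

```python
import math


def bin_fractions(values, edges):
    """Fraction of values in each bin [edges[i], edges[i+1]); the last bin is closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return [c / len(values) for c in counts]


def psi(baseline, current, edges, floor=1e-6):
    """Population Stability Index between two samples over shared bins."""
    total = 0.0
    for b, c in zip(bin_fractions(baseline, edges), bin_fractions(current, edges)):
        b, c = max(b, floor), max(c, floor)  # avoid log(0) for empty bins
        total += (c - b) * math.log(c / b)
    return total


edges = [0.0, 0.25, 0.5, 0.75, 1.0]
training_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
serving_scores = [0.6, 0.7, 0.8, 0.9, 0.8, 0.9, 0.95, 0.85]

print(psi(training_scores, training_scores, edges))      # 0.0 (identical samples)
print(psi(training_scores, serving_scores, edges) > 0.2)  # True (assumed alert threshold)
```

The same function works for input features (data drift, option C) and for model outputs (prediction drift, option B). Option E still needs ground-truth labels, which a distribution comparison cannot replace.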

494
MCQeasy

A user gets the above error when trying to get online predictions. The model was created and the endpoint exists. What is the most likely reason?

A.The endpoint does not exist.
B.The endpoint is in a different region than the model.
C.No version of the model is deployed to the endpoint.
D.The model does not exist.
AnswerC

A model must be deployed to the endpoint before the endpoint can serve predictions.

Why this answer

Correct: C. A model must be deployed to the endpoint before it can serve predictions. Option A is wrong because the endpoint exists.

Option B is wrong because a region mismatch would give a different error. Option D is wrong because the model was created.

495
MCQhard

Refer to the exhibit. A team is trying to run a custom prediction container on Vertex AI Endpoint. They get this error when the container starts. What is the most likely cause?

A.The container image is too large
B.The entry point is missing or incorrect
C.The container is built for a different CPU architecture
D.The model file is missing from the container
AnswerB

The error message directly states to ensure the container has an entry point.

Why this answer

The error occurs when the container starts, which typically happens during the initial health check or readiness probe. Vertex AI Endpoints require a valid entry point (e.g., CMD or ENTRYPOINT in the Dockerfile) to start the prediction server. If the entry point is missing or incorrect, the container fails to launch, resulting in the observed error.

Exam trap

Google Cloud often tests the distinction between container startup failures (entry point issues) and runtime failures (missing model files or architecture mismatches), leading candidates to confuse a missing model file with a startup error.

How to eliminate wrong answers

Option A is wrong because container image size does not prevent startup; Vertex AI supports images up to 10 GB, and a large image would only affect pull time, not the container's ability to start. Option C is wrong because a CPU architecture mismatch typically surfaces as an 'exec format error' rather than a message about the entry point, and Vertex AI uses x86_64 architecture by default. Option D is wrong because a missing model file would cause a runtime error during prediction (e.g., 404 or model load failure), not a container startup failure, as the container can still start and listen for requests.

496
Multi-Selecteasy

Which TWO options can help reduce costs for a Dataflow batch pipeline that processes 100 GB of data daily from Cloud Storage? (Choose 2)

Select 2 answers
A.Use Dataflow Prime
B.Use high-memory machine types
C.Use Streaming Engine
D.Use FlexRS (Flexible Resource Scheduling)
E.Use preemptible VMs for Dataflow workers
AnswersD, E

FlexRS offers discounted pricing for batch jobs that are flexible on start time.

Why this answer

FlexRS (Flexible Resource Scheduling) allows you to run batch workloads on a discounted, flexible schedule. It reduces costs by offering lower prices in exchange for the job being able to wait up to 6 hours for resources to become available. This is ideal for a daily 100 GB batch pipeline that can tolerate some scheduling delay. FlexRS gets its discount by combining preemptible and regular worker VMs with advanced scheduling, which is why option E points at the same saving.

Exam trap

Google Cloud often tests the distinction between batch and streaming optimizations, so the trap here is that candidates might select Streaming Engine (Option C) thinking it reduces costs in batch pipelines, when it is only relevant for streaming.
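
For reference, FlexRS is requested through a pipeline option when the batch job is launched. The sketch below only assembles the argument list for a Python pipeline. The helper name `flexrs_pipeline_args` and the project, region, and bucket values are hypothetical placeholders. `--flexrs_goal=COST_OPTIMIZED` is the Python SDK spelling of the option.

```python
def flexrs_pipeline_args(project: str, region: str, temp_location: str) -> list:
    """Command-line style options for a Dataflow batch job that opts in to FlexRS."""
    return [
        "--runner=DataflowRunner",
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        "--flexrs_goal=COST_OPTIMIZED",  # accept delayed scheduling for a lower price
    ]


# Placeholder values; replace with real project settings.
args = flexrs_pipeline_args("my-project", "us-central1", "gs://my-bucket/tmp")
print("\n".join(args))
```

These arguments would typically be passed to the pipeline's options object when the job is constructed.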

497
MCQmedium

A company uses a custom container image for model serving. The image is large (10 GB). During deployment, they get timeouts. What should they do?

A.Pre-pull the image on all nodes
B.Increase the timeout in the deployment config
C.Switch to a larger machine type
D.Use a smaller base image
AnswerD

Smaller image reduces pull time and deployment time.

Why this answer

Option D is correct because using a smaller base image directly addresses the root cause of the timeout: the 10 GB image takes too long to download from the container registry during startup. By reducing the image size (e.g., using a slim or distroless base image), the pull time decreases, so the deployment finishes within the platform's timeout without requiring infrastructure changes.

Exam trap

Google Cloud often tests the misconception that increasing timeouts or scaling up hardware solves performance bottlenecks, when the correct answer is to optimize the artifact itself (image size) to meet the system's implicit constraints.

How to eliminate wrong answers

Option A is wrong because pre-pulling the image on all nodes is a manual workaround that does not solve the underlying issue of a bloated image; it also adds operational overhead and fails in dynamic clusters where new nodes are added. Option B is wrong because increasing the timeout in the deployment config only masks the symptom and does not reduce the pull time, potentially leading to other timeouts in the cluster. Option C is wrong because switching to a larger machine type does not affect the network transfer time for pulling the image; it only provides more local resources, which does not address the slow image download.

498
Multi-Selectmedium

A company is planning to migrate a legacy batch ETL pipeline to Google Cloud. The pipeline involves reading from a relational database, transforming data, and writing to a data warehouse. Which three Google Cloud services can be used as the orchestration layer? (Choose three.)

Select 3 answers
A.Cloud Dataproc
B.Cloud Scheduler
C.Cloud Dataflow
D.Cloud Workflows
E.Cloud Composer
AnswersB, D, E

Cloud Scheduler can trigger jobs on a schedule, acting as a simple orchestrator.

Why this answer

Cloud Scheduler is a fully managed cron job service that can trigger orchestration workflows on a schedule. It is correct because it can initiate batch ETL pipelines by sending HTTP requests to Cloud Run, Cloud Functions, or Pub/Sub, making it a lightweight orchestration trigger for scheduled batch jobs.

Exam trap

Google Cloud often tests the distinction between data processing services (Dataproc, Dataflow) and orchestration services (Workflows, Composer, Scheduler), so candidates mistakenly select Dataproc or Dataflow thinking they can orchestrate, when they are actually execution engines.

499
MCQmedium

A company is building a real-time streaming pipeline using Pub/Sub and Dataflow to process clickstream data. The pipeline writes aggregated metrics to BigQuery every 10 seconds using a fixed window. During peak traffic, some windows produce duplicate rows in BigQuery. What is the most likely cause?

A.Dataflow is retrying BigQuery streaming inserts after a timeout, and the retries succeed even though the original insert succeeded.
B.The pipeline uses default triggers instead of after-watermark triggers.
C.The fixed window duration is too short, causing overlapping windows.
D.The pipeline is using too many Dataflow workers, causing load balancing issues.
AnswerA

This is a known scenario: BigQuery streaming inserts are not idempotent, and retries can lead to duplicates.

Why this answer

Option A is correct because Dataflow uses at-least-once semantics for streaming inserts into BigQuery. When a streaming insert times out, Dataflow retries the insert, and if the original insert actually succeeded but the acknowledgment was lost, the retry produces a duplicate row. This is a known behavior of BigQuery streaming inserts with retry logic.

Exam trap

The trap here is that candidates often confuse trigger behavior (Option B) with the root cause of duplicates, not realizing that duplicates stem from retry semantics in the sink, not from windowing or parallelism.

How to eliminate wrong answers

Option B is wrong because the default trigger (which fires when the watermark passes the end of the window) does not cause duplicate rows; triggers affect when results are emitted, not whether a sink retry writes the same row twice. Option C is wrong because fixed windows of 10 seconds do not overlap by design; overlapping windows would require a sliding window, not a fixed window. Option D is wrong because using too many Dataflow workers can cause resource inefficiency or shuffle issues, but it does not directly cause duplicate rows in BigQuery output.
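
The retry scenario in option A can be reproduced with a small in-memory model. This is a sketch only: `FakeTable` and `write_with_retry` are hypothetical and do not call any BigQuery API. The `insert_id` argument is modeled on the `insertId` field that BigQuery streaming inserts accept for best-effort deduplication.

```python
class FakeTable:
    """Hypothetical in-memory sink; stands in for a table receiving streamed rows."""

    def __init__(self, dedupe_on_insert_id: bool):
        self.dedupe_on_insert_id = dedupe_on_insert_id
        self.rows = []
        self.seen_ids = set()

    def insert(self, insert_id: str, row: dict) -> None:
        if self.dedupe_on_insert_id and insert_id in self.seen_ids:
            return  # same logical write seen before: drop the retry
        self.seen_ids.add(insert_id)
        self.rows.append(row)


def write_with_retry(table: FakeTable, insert_id: str, row: dict, ack_lost: bool) -> None:
    """Write once; if the acknowledgment is lost, the caller retries the same row."""
    table.insert(insert_id, row)
    if ack_lost:
        # The first write landed, but the writer never saw the response.
        table.insert(insert_id, row)


row = {"window": "12:00:00-12:00:10", "clicks": 42}

naive = FakeTable(dedupe_on_insert_id=False)
write_with_retry(naive, "w1", row, ack_lost=True)
print(len(naive.rows))  # 2: the retry produced a duplicate

idempotent = FakeTable(dedupe_on_insert_id=True)
write_with_retry(idempotent, "w1", row, ack_lost=True)
print(len(idempotent.rows))  # 1: the retry was recognized and dropped
```

The duplicate comes entirely from the write path, which is why changing triggers, window size, or worker count (options B, C, and D) would not remove it.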
