Google Cloud · Free Practice Questions · Last reviewed May 2026
24 real exam-style questions organised by domain, each with the correct answer highlighted and a plain-English explanation of why it's right and why the others are wrong.
A company is migrating on-premises Apache Spark jobs to Google Cloud Dataproc. They want to reduce operational overhead and minimize costs. Which architecture is most appropriate?
Use Cloud Dataproc Serverless for all Spark jobs.
Migrate jobs to Cloud Dataflow.
Run Spark on Compute Engine instances with startup scripts.
Use Dataproc clusters with auto-scaling and preemptible VMs.
Autoscaling sizes the cluster to the workload and preemptible VMs lower compute cost, which reduces both spend and operational overhead.
A data pipeline ingests sensor data from IoT devices via Cloud Pub/Sub, processes it with Cloud Dataflow, and writes to BigQuery. The pipeline is failing with high latency and data loss. Which troubleshooting step should be taken first?
Check Cloud Logging (formerly Stackdriver Logging) for error messages.
Logs identify the root cause before any configuration is changed.
Disable exactly-once processing in Dataflow.
Increase the number of Dataflow workers.
Switch to BigQuery streaming inserts.
A company needs to process real-time clickstream data and store it in a data warehouse for SQL-based analytics. The data volume is moderate. Which combination of Google Cloud services is most cost-effective?
Cloud Pub/Sub, Cloud Dataproc, Cloud Storage
Cloud Pub/Sub, Cloud Dataflow, Cloud Spanner
Cloud Pub/Sub, Cloud Dataflow, BigQuery
Pub/Sub ingests the stream, Dataflow processes it, and BigQuery is the data warehouse for real-time SQL analytics.
Cloud Pub/Sub, Cloud Dataflow, Cloud Storage
A financial company processes transactions in real-time and requires exactly-once processing semantics. They also need to reprocess historical data for backtesting. Which Google Cloud service should they use?
Cloud Pub/Sub
Cloud Functions
Cloud Dataproc
Cloud Dataflow
Dataflow supports exactly-once processing and runs pipelines in both streaming and batch mode, so historical data can be reprocessed.
A company is building a data lake on Cloud Storage with data from multiple sources. They need to apply schema-on-read and support ad-hoc SQL queries. Which architecture is most suitable?
Ingest to Cloud Spanner, query directly.
Ingest to Cloud SQL, then export to Cloud Storage for queries.
Ingest to Cloud Storage, create BigQuery external tables.
External tables apply the schema at query time (schema-on-read) and allow ad-hoc SQL over the files in Cloud Storage.
Ingest to Cloud Storage, load into Dataproc for queries.
A company wants to stream data from Cloud Pub/Sub into BigQuery with minimal latency. They have a small team and limited operational resources. Which approach is best?
Write a custom application on Compute Engine that polls Pub/Sub and writes to BigQuery.
Create a Dataproc cluster running a Spark Streaming job.
Create a Cloud Function that writes to BigQuery.
Use a Dataflow pipeline with a BigQuery subscription.
Serverless and low maintenance.
A company is migrating its on-premises Apache Spark jobs to Dataproc. The jobs read from and write to Cloud Storage. After migration, the jobs are slower than expected. The Dataproc cluster uses standard worker machines with local SSDs. What is the most likely cause of the performance degradation?
The Spark shuffle service is not enabled on the cluster.
The local SSDs are not mounted or are misconfigured.
The Cloud Storage connector is not using the gRPC protocol.
The jobs use the Cloud Storage connector instead of HDFS, causing network latency.
Reading from Cloud Storage over network is slower than local HDFS reads.
A data pipeline ingests real-time events from Cloud Pub/Sub into BigQuery using Dataflow. The pipeline uses a sliding window of 5 minutes with a 1-minute period to aggregate event counts. Recently, the pipeline started failing with 'The worker failed to provide a heartbeat.' The Dataflow logs show high CPU usage on the workers. What is the best course of action to resolve the issue?
Increase the number of workers and enable autoscaling to distribute the load.
More workers, added through autoscaling, spread the CPU load of the sliding-window aggregation so that workers stay responsive and keep sending heartbeats.
Reduce the number of workers to minimize coordination overhead.
Use a global window with a trigger to reduce state size.
Change the windowing to a fixed 5-minute window to reduce computations.
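As a rough illustration of the windowing in the question above (a 5-minute sliding window with a 1-minute period), the sketch below lists the windows that a single event timestamp falls into. It is plain Python, not Apache Beam code; the function name `sliding_windows` and the use of integer seconds are assumptions made for this example.

```python
def sliding_windows(ts, size=300, period=60):
    """Return the [start, end) windows, in seconds, that contain timestamp ts.

    Hypothetical helper: size is the window length and period is how often
    a new window starts (5 minutes and 1 minute in the question).
    """
    # Latest window start that is at or before the timestamp.
    last_start = ts - (ts % period)
    # Walk backwards one period at a time while the window still covers ts.
    starts = range(last_start, ts - size, -period)
    return [(start, start + size) for start in starts]


windows = sliding_windows(125)
for start, end in windows:
    print(f"[{start}, {end})")
print(len(windows), "windows contain the event")
```

Each event lands in size / period = 5 overlapping windows, so the aggregation does roughly five times the per-element work of a fixed 5-minute window.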
A company wants to process large CSV files stored in Cloud Storage and load them into BigQuery. The files are generated daily and each file is about 10 GB. The data is not time-sensitive and can be processed within a 24-hour window. Which service is most cost-effective for this use case?
Dataproc Serverless with PySpark
Dataproc Serverless is cost-effective and suitable for batch processing of large CSVs.
Dataflow with batch mode
Cloud Data Fusion
BigQuery Data Transfer Service
A financial services company uses Cloud Composer to orchestrate a daily workflow that includes a Dataproc job for risk analysis. The workflow sometimes fails because the Dataproc cluster creation times out. The cluster creation typically takes 3 minutes, but occasionally takes over 10 minutes. What is the most effective way to handle this variability?
Create a long-running Dataproc cluster that remains idle and reuse it for each workflow.
Reusing an existing cluster eliminates the creation step and associated timeout.
Implement a retry loop with exponential backoff in the DAG.
Use preemptible VMs for the cluster to reduce cost and improve creation speed.
Increase the cluster creation timeout in the Airflow configuration.
A company is using Dataflow to stream data from Cloud Pub/Sub to BigQuery. The pipeline includes a custom ParDo transformation that enriches the data with external API calls. The pipeline is experiencing high latency and occasional failures due to API timeouts. What strategy should be employed to improve reliability and performance?
Remove the enrichment step and store raw data in BigQuery.
Use a global window to accumulate all data before enrichment.
Use a DoFn with stateful processing and batch API calls using asynchronous HTTP client.
Batching and async calls reduce per-element latency and handle timeouts gracefully.
Increase the number of workers to parallelize API calls.
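The batching and asynchronous-call idea behind the correct answer can be sketched in plain Python with asyncio. This is not an Apache Beam DoFn; `fake_enrich` is a hypothetical stand-in for the external API, and the batch size, timeout, and concurrency limit are illustrative assumptions.

```python
import asyncio


async def fake_enrich(batch):
    """Hypothetical stand-in for one batched call to the external API."""
    await asyncio.sleep(0.01)  # simulated network latency
    return [{**event, "enriched": True} for event in batch]


async def enrich_all(events, batch_size=3, timeout_s=1.0, max_in_flight=2):
    """Group events into batches and call the API concurrently with a timeout."""
    limiter = asyncio.Semaphore(max_in_flight)  # cap concurrent API calls
    batches = [events[i:i + batch_size] for i in range(0, len(events), batch_size)]

    async def call(batch):
        async with limiter:
            try:
                return await asyncio.wait_for(fake_enrich(batch), timeout_s)
            except asyncio.TimeoutError:
                # On timeout, pass the raw events through, flagged so a later
                # step could retry them or route them to a dead-letter sink.
                return [{**event, "enriched": False} for event in batch]

    results = await asyncio.gather(*(call(batch) for batch in batches))
    # gather preserves batch order, so the output order matches the input.
    return [event for batch in results for event in batch]


events = [{"id": i} for i in range(7)]
enriched = asyncio.run(enrich_all(events))
print(len(enriched), "events processed")
```

Seven events become three API calls instead of seven, and a slow call costs at most `timeout_s` rather than stalling the whole bundle.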
A data engineer needs to process a large dataset (500 TB) stored in Cloud Storage using Dataproc. The processing job requires reading the entire dataset and writing results back to Cloud Storage. The job is expected to run for 6 hours. Which configuration minimizes cost?
Use a single-node cluster with standard VMs.
Use a cluster with local SSDs for faster I/O.
Use a cluster with a mix of standard and preemptible VMs.
Preemptible VMs reduce cost significantly while providing sufficient compute.
Use a cluster with n1-highmem-32 instances and 1000 cores.
A company deploys a machine learning model to Vertex AI for real-time predictions. After deployment, they notice that prediction latency spikes during peak traffic hours. Which approach should they take to reduce latency without sacrificing accuracy?
Configure auto-scaling with higher min and max instances
Auto-scaling handles traffic spikes.
Reduce the number of input features
Switch from online to batch prediction
Use a larger machine type for the model
A data science team uses Vertex AI Pipelines to automate retraining. They want to ensure that only models with performance above a threshold are deployed. Which component should they add to the pipeline?
Vertex AI Feature Store
Vertex AI Model Evaluation
Evaluates model and can block deployment if threshold not met.
Cloud Build trigger
Cloud Monitoring alert
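The threshold check described in the question above amounts to a simple gate between the evaluation step and the deployment step. The sketch below shows that logic in plain Python; it does not use the Vertex AI SDK, and the function name `passes_gate`, the metric names, and the threshold values are all hypothetical.

```python
def passes_gate(metrics, thresholds):
    """Return True only if every required metric meets its minimum value.

    A metric that is missing from the evaluation results fails the gate.
    """
    return all(
        metrics.get(name, float("-inf")) >= minimum
        for name, minimum in thresholds.items()
    )


# Hypothetical evaluation output and deployment thresholds.
candidate_metrics = {"auc": 0.91, "recall": 0.78}
required = {"auc": 0.90, "recall": 0.80}

if passes_gate(candidate_metrics, required):
    print("deploy the candidate model")
else:
    print("keep the current model")
```

Here the candidate clears the AUC bar but misses the recall bar, so the gate blocks deployment.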
A company trains a custom model using TensorFlow and wants to deploy it to Vertex AI for low-latency predictions. The model is large (2 GB). Which deployment option should they choose?
Use Vertex AI Batch Prediction job
Deploy as a Cloud Function
Deploy to Vertex AI Endpoint with a custom container
Custom containers allow large models.
Deploy to Cloud Run with minimum instances
A company uses Vertex AI to serve a model. They notice that some predictions are incorrect due to data drift. What is the best way to detect and retrain the model automatically?
Store predictions in BigQuery and run scheduled queries
Create a Cloud Monitoring dashboard
Set up Cloud Logging metrics to monitor predictions
Use Vertex AI Model Monitoring with alerts and retraining pipeline
Monitors drift and triggers retraining.
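To make "data drift" in the question above concrete, the sketch below compares the distribution of one categorical feature in training data with the same feature in recent serving data, using the L-infinity distance (the largest gap between category frequencies). This is only an illustration of one drift statistic; the function name, the sample data, and the 0.2 alert threshold are assumptions, and a managed monitoring service would do this comparison for you.

```python
from collections import Counter


def l_infinity_distance(baseline, serving):
    """Largest absolute difference in category frequency between two samples."""
    base_counts, serve_counts = Counter(baseline), Counter(serving)
    categories = set(base_counts) | set(serve_counts)
    return max(
        abs(base_counts[c] / len(baseline) - serve_counts[c] / len(serving))
        for c in categories
    )


# Hypothetical feature values: training data versus recent serving traffic.
training = ["card"] * 8 + ["wire"] * 2
recent = ["card"] * 5 + ["wire"] * 5

drift = l_infinity_distance(training, recent)
ALERT_THRESHOLD = 0.2  # assumed value, chosen for the example
if drift > ALERT_THRESHOLD:
    print("drift detected: trigger the retraining pipeline")
```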
A financial services company needs to explain predictions from a complex ensemble model for regulatory compliance. Which Vertex AI service should they use?
Vertex AI Explainable AI
Provides explanations via feature attributions.
Vertex AI Vizier
Vertex AI Feature Store
Vertex AI Prediction
A team wants to retrain a model weekly using new data stored in BigQuery. They want to minimize manual effort. Which approach should they use?
Use Cloud Scheduler to trigger a Cloud Function that retrains
Retrain manually in a notebook each week
Use Cloud Composer to orchestrate retraining
Create a Vertex AI Pipeline scheduled via Cloud Scheduler
Pipelines automate retraining end-to-end.
A data pipeline ingests streaming data from Pub/Sub into BigQuery via Dataflow. Recently, the pipeline has been failing with 'deadline exceeded' errors. What is the most likely cause?
The BigQuery streaming quota is exceeded.
Dataflow workers are underutilized due to batch size settings.
Dataflow autoscaling is disabled.
The Pub/Sub subscription's acknowledgment deadline is too short for the processing time.
A short acknowledgment deadline causes messages to be redelivered, leading to repeated processing attempts and eventual deadline exceeded errors.
A team is designing a data lake on Google Cloud using Cloud Storage and BigQuery. They need to ensure that sensitive data (e.g., PII) is encrypted at rest and have the ability to audit access. Which approach meets these requirements?
Use Customer-Managed Encryption Keys (CMEK) and enable VPC Service Controls.
Use Customer-Managed Encryption Keys (CMEK) and enable Cloud Audit Logs.
CMEK provides control over encryption keys, and Cloud Audit Logs record access to data.
Use Default Encryption and enable Data Loss Prevention (DLP) API.
Use Customer-Supplied Encryption Keys (CSEK) and enable VPC Service Controls.
A company runs a batch processing job on Dataproc that uses Apache Spark to process 500 GB of data daily. The job completes successfully but takes 4 hours. The team wants to reduce the runtime to under 2 hours without increasing cost. What should they do?
Use preemptible VMs for worker nodes and increase the number of workers.
Preemptible VMs are cheaper, allowing more workers for the same cost, reducing runtime.
Increase the master node's machine type to n2-standard-8.
Increase the machine type of worker nodes to n2-highmem-8.
Migrate the job to Dataflow with autoscaling enabled.
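The cost reasoning in the question above can be checked with back-of-the-envelope arithmetic. The prices below are made-up units, not Google Cloud list prices, and the sketch assumes the job scales linearly with worker count; both are assumptions for illustration only.

```python
def job_cost(standard_workers, preemptible_workers, hours,
             standard_price=100, preemptible_price=30):
    """Total cost in hypothetical price units (price per worker-hour)."""
    hourly = (standard_workers * standard_price
              + preemptible_workers * preemptible_price)
    return hourly * hours


# Current setup: 10 standard workers for 4 hours (hypothetical sizes).
baseline = job_cost(10, 0, hours=4)

# Doubling the workers with preemptible VMs, assuming the runtime halves.
scaled = job_cost(10, 10, hours=2)

print("baseline:", baseline, "scaled:", scaled)
```

Under these assumptions the job finishes in half the time for less money. In practice preemptible VMs can be reclaimed at any time, so the real runtime depends on how many workers survive the job.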
Which TWO actions are recommended to improve the reliability of a Cloud Dataflow streaming pipeline that processes event data from Pub/Sub?
Use a pull subscription with a 10-second acknowledgment deadline.
Enable Dataflow Streaming Engine.
Streaming Engine offloads state management to the backend, improving reliability.
Enable exactly-once processing sinks (e.g., BigQuery with guaranteed row-level insertion).
Exactly-once processing prevents duplicate data.
Disable autoscaling to prevent worker churn.
Use micro-batch processing with a small batch size.
A data analyst runs a complex SQL query in BigQuery that joins multiple large tables and receives a resources exceeded error. Which action is most likely to resolve the issue?
Use a larger number of workers in the query execution.
Use smaller tables by sampling data.
Add clustering on join columns.
Increase the number of slots allocated to the project.
More slots provide more memory and CPU, reducing resource exceeded errors.
A company runs a real-time anomaly detection system on Google Cloud. Streaming data from IoT devices is ingested via Pub/Sub, processed by Dataflow (Apache Beam), and results are written to Bigtable for low-latency serving. Recently, the system has been experiencing increased latency and occasional data loss. The Dataflow pipeline shows high system lag and backlog in Pub/Sub. The Bigtable cluster has 3 nodes and is reporting high CPU utilization (over 90%). The team suspects the issue is with the pipeline configuration. They have already verified that there are no errors in the pipeline code and no network issues. Which action should they take to resolve the issue?
Increase the number of Bigtable nodes to handle the write throughput.
High CPU utilization suggests Bigtable is overwhelmed; adding nodes increases capacity.
Change the Dataflow worker machine type to n2-standard-8.
Decrease the batch size in the Dataflow pipeline to reduce latency.
Increase the number of Dataflow workers to process messages faster.
The PDE exam has 60 questions and must be completed in 120 minutes. The passing score is 720/1000.
Scenario-based questions covering exam objectives with detailed answer explanations.
The exam covers 4 domains: Designing data processing systems, Building and operationalizing data processing systems, Operationalizing machine learning models, Ensuring solution quality. Questions are weighted by domain — higher-weight domains appear more on your actual exam.
No. These are original exam-style practice questions written against the official Google Cloud PDE exam objectives. They are not copied from the real exam. Courseiva focuses on genuine understanding, not memorisation of braindumps.
Courseiva tracks your accuracy per domain and routes you toward weak areas automatically. Free, no account required.