Practice PDE Ensuring solution quality questions with full explanations on every answer.
Click any question to see the full explanation and answer options, or start a focused practice session above.
1. A data pipeline ingests streaming data from Pub/Sub into BigQuery via Dataflow. Recently, the pipeline has been failing with 'deadline exceeded' errors. What is the most likely cause?
2. A team is designing a data lake on Google Cloud using Cloud Storage and BigQuery. They need to ensure that sensitive data (e.g., PII) is encrypted at rest and have the ability to audit access. Which approach meets these requirements?
3. A company runs a batch processing job on Dataproc that uses Apache Spark to process 500 GB of data daily. The job completes successfully but takes 4 hours. The team wants to reduce the runtime to under 2 hours without increasing cost. What should they do?
4. Which TWO actions are recommended to improve the reliability of a Cloud Dataflow streaming pipeline that processes event data from Pub/Sub?
5. A data analyst runs a complex SQL query in BigQuery that joins multiple large tables and receives the above error. Which action is most likely to resolve the issue?
6. A company runs a real-time anomaly detection system on Google Cloud. Streaming data from IoT devices is ingested via Pub/Sub, processed by Dataflow (Apache Beam), and results are written to Bigtable for low-latency serving. Recently, the system has been experiencing increased latency and occasional data loss. The Dataflow pipeline shows high system lag and backlog in Pub/Sub. The Bigtable cluster has 3 nodes and is reporting high CPU utilization (over 90%). The team suspects the issue is with the pipeline configuration. They have already verified that there are no errors in the pipeline code and no network issues. Which action should they take to resolve the issue?
7. Drag and drop the steps to create a Cloud Composer environment for Apache Airflow into the correct order.
8. Drag and drop the steps to migrate an on-premises MySQL database to Cloud SQL using Database Migration Service into the correct order.
9. Match each machine learning term to its description.
10. Match each data encryption concept to its description.
11. A data pipeline processes streaming data with Dataflow. The team notices occasional data duplication in BigQuery. What is the best approach to ensure exactly-once processing?
12. A company is deploying a large-scale streaming application on Google Kubernetes Engine. They need to ensure the application can handle sudden traffic spikes without dropping data. Which architectural pattern is most appropriate?
13. A data science team uses AI Platform Training with hyperparameter tuning. They observe that some trials fail due to transient errors. To improve solution quality and reduce costs, what should they do?
14. A company runs batch jobs on Dataproc. They need to ensure that if a job fails, it automatically retries with exponential backoff. What is the recommended approach?
15. A team developed a microservice that writes logs to stdout. They want to centralize logs for analysis. Which GCP service should they use to automatically collect and store logs?
16. A data platform uses Cloud Spanner for transactional data. They are experiencing high latency during write-heavy periods. To maintain solution quality, what configuration change is most effective?
17. A company uses Cloud Functions to process events from Cloud Storage. They notice that occasionally functions are not triggered. What should they check first to ensure solution quality?
18. A financial services company uses Dataflow pipelines with late data handling. They need to ensure that all late-arriving data is processed correctly but also want to control costs. What is the best configuration?
19. A team is deploying a model on AI Platform Prediction. They want to monitor for data drift to maintain model quality. Which service should they use?
20. A data engineer needs to monitor the performance of BigQuery queries to identify opportunities for optimization. Which TWO metrics should they focus on? (Choose two.)
21. A company uses Cloud Build to deploy containerized applications. They want to ensure build and deployment quality. Which THREE steps should they include in their CI/CD pipeline? (Choose three.)
22. A company wants to ensure high availability for their Cloud SQL instance. Which TWO actions are most appropriate? (Choose two.)
23. Refer to the exhibit. A team configured a Cloud Monitoring alerting policy as shown. They recently started receiving false positive alerts. What is the most likely cause?
24. Refer to the exhibit. A team received this error when running a query. Which optimization should they apply first?
25. Refer to the exhibit. A subscriber is unable to pull messages from the topic. What is the most likely cause?
26. A company uses Cloud Monitoring to track application latency. They notice a spike in latency every 30 minutes. What is the best initial step to diagnose the issue?
27. A team deploys a new version of a Cloud Function. After deployment, error rates increase significantly. What is the most efficient way to diagnose the cause?
28. A data pipeline using Dataflow processes streaming data. Late-arriving events are currently being dropped. How should the team modify the pipeline to ensure late data is processed correctly?
29. A company uses GKE to run microservices. They want to ensure the application restarts automatically if it becomes unresponsive. Which probes should they configure in their pod spec?
30. After migrating a production Cloud SQL for PostgreSQL database to a larger machine type, the team notices slower queries. What is the best step to identify the cause?
31. A company uses BigQuery for analytics. They need to ensure data quality by preventing duplicate records from being inserted. Which approach is most effective?
32. A company uses Cloud Spanner for a global transactional application. During peak hours, commit latency increases by over 50%. Which configuration issue is the most likely root cause?
33. A team deploys a Cloud Run service that processes user-uploaded files. Some requests time out after 60 minutes. They need to handle large files reliably without losing tasks. What is the best solution?
34. A data engineering team uses Cloud Composer (Airflow) for workflow orchestration. They notice DAG runs frequently fail, and the error indicates insufficient Airflow workers. The team wants to ensure reliable execution. Which approach best addresses the issue?
35. A company uses Cloud Logging to monitor application errors. They want to set up real-time notifications for critical errors. Which two actions are essential? (Choose two.)
36. A team runs a production application on Compute Engine. They want to ensure high availability and quality. Which three best practices should they implement? (Choose three.)
37. A company uses Cloud Dataproc for ephemeral clusters to run batch jobs. They want to ensure job reliability and data quality. Which two configuration options should they use? (Choose two.)
38. Refer to the exhibit. A team uses this Cloud Build configuration to deploy a service to Cloud Run. The deployment step fails with a 'Permission denied' error. What is the most likely cause?
39. Refer to the exhibit. A BigQuery dataset is shared with the group 'analysts@example.com' using the IAM policy shown. A user who is a member of this group reports that they cannot run queries on the dataset, though they can see the tables. What is the most likely reason?
40. A company runs a data pipeline that ingests clickstream events from multiple websites into Cloud Pub/Sub; the events are then processed by Dataflow to generate user sessions and written to BigQuery for analytics. The pipeline runs 24/7. Recently, the team noticed that some sessions are incomplete due to missing events, and data quality checks reveal that about 2% of sessions have gaps of more than 30 minutes. The pipeline uses fixed 30-minute windows for sessionization, with allowed lateness set to 10 minutes. They have Cloud Monitoring dashboards tracking system throughput and pipeline lag but do not have custom metrics tracking per-element delays or watermark progress. The team suspects two possible causes: (a) the Pub/Sub subscription accumulates backlog and some messages are delivered after the window end; (b) the Dataflow job has insufficient workers causing checkpoint failures. The team needs to determine the root cause and improve data quality. What is the best first course of action?
41. A media company runs a batch data pipeline on Cloud Dataflow that ingests log files from Cloud Storage, transforms them, and writes results to BigQuery for analytics. The pipeline runs daily and has been stable for months. Recently, the source log format changed: a new optional field was added to some records. The pipeline started failing with ParseErrors for rows that contain the new field. The error logs show that the Dataflow job uses a hardcoded JSON schema that does not include the new field. The Dataflow pipeline logs are written to Stackdriver Logging, but no alerts are configured. The team wants to ensure that future schema changes do not break the pipeline and that failures are detected promptly. The team has limited experience with streaming and wants to keep the batch approach. Which course of action should the team take to improve solution quality?
42. A financial services company operates a real-time fraud detection pipeline using Apache Beam running on Google Cloud Dataflow. The pipeline reads transactions from Pub/Sub, enriches them with customer data from Bigtable, runs a machine learning model with side inputs from a Redis cluster, and writes results to BigQuery for downstream reporting. The data must be processed with exactly-once semantics to avoid duplicate fraud alerts or missing transactions. The pipeline currently uses a global window with 5-minute accumulation, but the team is experiencing high latency and occasional duplicates when the model side input is updated (triggered every 15 minutes via a WatchTransform). Additionally, the pipeline has a dead letter queue that outputs failed records to a separate Pub/Sub topic, but these records are never reprocessed. The team needs to ensure high reliability and data quality. Which course of action should the team take to improve solution quality?
43. A company is developing a streaming Dataflow pipeline to process real-time sensor data. To ensure data quality, the team wants to detect malformed records and late data. Which two practices should they implement? (Choose two.)
44. Refer to the exhibit. A Dataflow pipeline is failing intermittently with the shown error. Which step should the team take to ensure data quality and prevent such errors?
45. A company runs a large Dataflow pipeline that aggregates user activity data from Pub/Sub into BigQuery every 10 minutes using fixed windows. Recently, the daily summary reports have shown 5-10% lower user engagement for certain segments compared to historical trends. The pipeline is completing successfully with no errors in Cloud Monitoring, and the Dataflow job dashboard shows all steps in green. There are no alarms. The team suspects data is being dropped or missed. They have verified that the Pub/Sub topic is receiving data correctly. After reviewing the pipeline code, they find that the pipeline uses a global window with a default 10-minute trigger, and writes results to a single BigQuery table partitioned by date. They also use exactly-once processing mode. Which of the following is the most likely cause and the best course of action to diagnose and fix the data quality issue?
The Ensuring solution quality domain covers the key concepts tested in this area of the PDE exam blueprint published by Google Cloud. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all PDE domains — no account required.
The Courseiva PDE question bank contains 45 questions in the Ensuring solution quality domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Ensuring solution quality domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.