
Scenario-based practice

Select Two (Multi-Select) Questions

Practise original, exam-style Google Professional Data Engineer scenarios covering every exam domain, with detailed explanations, wrong-answer analysis, and common exam traps.

Scenario questions: 20
Exam code: PDE
Vendor: Google Cloud

Scenario guide

How to approach select two (multi-select) questions

Multi-select questions tell you to 'Choose TWO' or 'Choose THREE'. There is no partial credit: you must select every correct answer and no incorrect ones. The stem always states how many to choose, so trust it. These questions reward precision, not best-guess elimination.
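The all-or-nothing rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on this page, not Google's actual scoring system; the function name, option letters, and answer key are hypothetical.

```python
def score_multi_select(selected, correct):
    """Return 1 only when the selection matches the answer key exactly.

    Hypothetical helper: models the 'no partial credit' rule, where a
    missing correct option or an extra incorrect option both score 0.
    """
    return 1 if set(selected) == set(correct) else 0


# Hypothetical answer key for a 'Choose TWO' question.
answer_key = {"A", "D"}

print(score_multi_select({"A", "D"}, answer_key))       # exact match
print(score_multi_select({"A"}, answer_key))            # one correct option missing
print(score_multi_select({"A", "C", "D"}, answer_key))  # one incorrect option added
```

Only the first call scores a point. Under this model, selecting one right option out of two is worth the same as selecting none.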

Quick answer

Select Two (Multi-Select) questions test whether you can apply the concept in context, not just recognise a definition.

How the topic appears in realistic exam-style scenarios.

Which detail in the question changes the correct answer.

How to eliminate plausible but wrong options.

How to connect the question back to the wider exam objective.

Related practice questions

Related PDE topic practice pages

Scenario questions usually connect to one or more exam topics. Use these links to review the underlying concepts behind the scenario.

Practice set

Practice scenarios

Question 1 (medium, multi-select)

Which TWO actions are recommended to improve the reliability of a Cloud Dataflow streaming pipeline that processes event data from Pub/Sub?

Question 2 (easy, multi-select)

A data engineering team is operationalizing a machine learning model for real-time fraud detection. The model must process transactions with sub-100ms latency and be highly available. Which TWO strategies should the team implement?

Question 3 (medium, multi-select)

Which TWO are best practices for monitoring a deployed machine learning model in production on Vertex AI?

Question 4 (hard, multi-select)

A company is migrating an on-premises Hadoop cluster to Google Cloud. They need to run existing Spark jobs with minimal modification. Which THREE strategies should they consider? (Choose THREE.)

Question 5 (medium, multi-select)

A data engineer is designing a batch processing system using Cloud Dataproc. Which TWO practices improve performance and reduce costs? (Choose TWO.)

Question 6 (medium, multi-select)

Which TWO statements are correct about designing a data pipeline using Cloud Dataflow for processing unbounded data?

Question 7 (hard, multi-select)

Which THREE considerations are important when designing a data lake on Google Cloud using Cloud Storage?

Question 8 (hard, multi-select)

Which THREE best practices should be followed when designing a Dataflow pipeline for real-time data processing?

Question 9 (medium, multi-select)

Which TWO factors should be considered when choosing between Cloud Dataflow and Dataproc for a batch processing pipeline?

Question 10 (medium, multi-select)

Which TWO actions can help reduce the latency of a Vertex AI endpoint serving a large neural network model?

Question 11 (easy, multi-select)

Which TWO actions can reduce the cost of running a Dataproc cluster for a nightly batch job?

Question 12 (easy, multi-select)

A data pipeline uses Cloud Pub/Sub to ingest events, then a Cloud Dataflow job writes to BigQuery. The Dataflow job is failing with 'deadline exceeded' errors. Which TWO actions can resolve this? (Choose TWO.)

Question 13 (easy, multi-select)

A company is deploying a machine learning model for fraud detection. The model is trained using TensorFlow and will be served on Vertex AI Prediction. The team wants to implement model monitoring to detect prediction drift. Which TWO actions should they take? (Choose TWO.)

Question 14 (hard, multi-select)

Which TWO statements about designing a data processing pipeline on Google Cloud are correct? (Choose TWO.)

Question 15 (easy, multi-select)

A data team uses Cloud Composer to orchestrate Airflow DAGs. They need to ensure that a downstream task runs only if at least two out of three upstream sensor tasks succeed. Which TWO configurations should they combine?

Question 16 (hard, multi-select)

Which THREE factors should be considered when designing a Vertex AI Pipeline for continuous training?

Question 17 (hard, multi-select)

Which THREE considerations are important when designing a batch prediction pipeline for a large dataset on Vertex AI?

Question 18 (medium, multi-select)

Which TWO steps are required to deploy a custom scikit-learn model to Vertex AI for online predictions?

Question 19 (easy, multi-select)

Which TWO approaches are recommended for handling late-arriving data in a streaming Dataflow pipeline?

Question 20 (easy, multi-select)

Which TWO actions can help reduce prediction latency for a Vertex AI endpoint?

These PDE practice questions are part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style PDE questions with detailed explanations, topic-based practice, mock exams, readiness tracking, and study analytics.