- A
Use cross-region replication with two separate Dataflow pipelines reading from a Pub/Sub cross-region subscription and writing to a BigQuery cross-region dataset
Why correct: Messages and results are replicated to a second region, and a pipeline is already running there, so a regional outage stays within both the RPO and the RTO.
- B
Run the pipeline using Dataflow batch mode with a 1-minute trigger and store intermediate results in Cloud Storage
Why wrong: Batch jobs add scheduling and startup latency to every run, so the RPO can exceed 1 minute, and the design does not address a regional outage at all.
- C
Deploy resources in a single region with regular backups to Cloud Storage
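Why wrong: A single region is unavailable during a regional outage, and restoring from periodic backups loses everything written since the last backup, which typically pushes the RPO well past 1 minute.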
- D
Use a single Dataflow pipeline with a standby cluster in another region, but failover is manual
Why wrong: Manual failover depends on someone detecting the outage and activating the standby, which is unlikely to complete within the 5-minute RTO.
Quick Answer
The correct architecture is an active-active cross-region setup: two separate Dataflow pipelines reading from a Pub/Sub cross-region subscription and writing to a BigQuery cross-region dataset. This design targets the RPO of under one minute and the RTO of under five minutes. Cross-region Pub/Sub delivery makes messages available in the second region with sub-second latency, so a regional outage should lose little or no data, and because an independent Dataflow pipeline is already running in each region, processing continues without manual intervention. On the Google Professional Data Engineer exam, this scenario tests your understanding of disaster recovery architecture for streaming pipelines with low RPO and RTO requirements. It often appears as a trap in which candidates choose a single-region pipeline with backups or a cold standby. The key insight is that targets this tight demand active-active processing across regions, not just data durability. Memory tip: “Two pipes, two regions, zero downtime.” If your RPO is under a minute, you cannot afford to start a pipeline and replay data after the outage; both pipelines need to be running already.
PDE Practice Question: Building and operationalizing data processing systems
This PDE practice question tests your understanding of building and operationalizing data processing systems. This is an architecture task: choose the design that satisfies every stated requirement. Small differences, such as automatic versus manual failover or streaming versus batch processing, change whether the answer is correct. After answering, compare your reasoning against the explanation and the wrong-answer breakdown to understand why each distractor is designed to mislead on exam day.
You are designing a disaster recovery strategy for a critical streaming data processing pipeline. The pipeline reads from Cloud Pub/Sub, processes with Dataflow streaming, and writes to BigQuery. The required RPO is less than 1 minute, and RTO is less than 5 minutes. Which architecture should you implement?
Correct answer & explanation
Use cross-region replication with two separate Dataflow pipelines reading from a Pub/Sub cross-region subscription and writing to a BigQuery cross-region dataset
Option A is correct because Pub/Sub makes messages available to a subscriber in a secondary region with sub-second latency, and a separate Dataflow pipeline in each region provides active-active processing. BigQuery cross-region dataset replication keeps a copy of the results in a second region, which supports data durability and availability within the RPO of <1 minute. (In the option text, "cross-region dataset" is best read as a dataset replicated to a second region, not as a named dataset location.) This architecture meets both RPO and RTO by eliminating single points of failure and enabling automatic failover without manual intervention.
Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.
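The RPO argument can be checked with simple arithmetic. The sketch below is illustrative only: the intervals and copy durations are assumed numbers, not measurements of any Google Cloud service, and the `Design` class is a hypothetical helper. It shows why any design that copies data off-region at fixed intervals struggles with a sub-minute RPO, while continuous replication can fit inside it.

```python
from dataclasses import dataclass

RPO_TARGET_S = 60  # from the scenario: RPO under 1 minute


@dataclass(frozen=True)
class Design:
    name: str
    copy_interval_s: float  # ASSUMED gap between successive off-region copies
    copy_duration_s: float  # ASSUMED time for one copy to land off-region

    def worst_case_loss_s(self) -> float:
        # Data written just after one copy starts is not safe until the
        # next copy has finished, so the exposure is interval + duration.
        return self.copy_interval_s + self.copy_duration_s


# All figures below are assumptions chosen for illustration.
designs = [
    Design("continuous replication", copy_interval_s=0, copy_duration_s=1),
    Design("1-minute batch trigger", copy_interval_s=60, copy_duration_s=30),
    Design("hourly backup", copy_interval_s=3600, copy_duration_s=120),
]

for d in designs:
    verdict = "meets" if d.worst_case_loss_s() < RPO_TARGET_S else "misses"
    print(f"{d.name}: worst-case loss {d.worst_case_loss_s():.0f}s, {verdict} RPO")
```

Under these assumptions only continuous replication stays below the 60-second target. Note that the 1-minute batch trigger misses it too, because the time a batch takes to complete adds to the interval.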
Common exam traps
Common exam trap: answer the scenario, not the keyword
The trap here is that a single pipeline with a standby cluster sounds sufficient. Candidates overlook two things: manual failover cannot reliably meet the strict RTO of <5 minutes, and a sub-minute RPO requires active-active processing across regions rather than an active-passive standby.
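A rough time budget makes the RTO side of the trap concrete. Every duration below is an assumed, illustrative figure (real detection and response times vary by team and tooling), and the step names are hypothetical. The point is only that the human steps in a manual failover add up quickly.

```python
RTO_TARGET_S = 300  # from the scenario: RTO under 5 minutes

# ASSUMED durations, in seconds, for a manual failover to a standby.
manual_failover_steps_s = {
    "alert fires": 60,
    "on-call engineer acknowledges page": 180,
    "engineer confirms regional outage": 120,
    "standby job launched and workers start": 180,
}

# ASSUMED durations when a second pipeline is already running.
automatic_failover_steps_s = {
    "health check detects failure": 30,
    "second pipeline takes over (already running)": 0,
}


def total_seconds(steps: dict) -> int:
    """Sum the step durations of one failover procedure."""
    return sum(steps.values())


for label, steps in [
    ("manual", manual_failover_steps_s),
    ("automatic", automatic_failover_steps_s),
]:
    total = total_seconds(steps)
    verdict = "within" if total < RTO_TARGET_S else "exceeds"
    print(f"{label} failover: {total}s, {verdict} the {RTO_TARGET_S}s RTO")
```

With these assumed figures the manual path takes 540 seconds and the automatic path 30 seconds. Even if each manual step were faster, there is little margin left inside a 5-minute window.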
Detailed technical explanation
How to think about this question
Under the hood, Pub/Sub is a global service: publishers and subscribers in different regions can use the same topic and subscription, and messages travel over Google's network with low latency. Dataflow streaming pipelines provide exactly-once processing and persist their state durably, but a Dataflow job runs in a single region, so the key is that each pipeline operates independently in its own region. Because both pipelines read from the same subscription, any message that a failed region pulled but never acknowledged is redelivered to the surviving pipeline, which avoids data loss. BigQuery cross-region dataset replication keeps a copy of the dataset in a second region; confirm that the replication lag of the configuration you choose fits the RPO requirement, because replication across regions is not instantaneous.
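The redelivery behaviour described above can be illustrated with a toy model. This is not the Pub/Sub client library: `ToySubscription` and its methods are hypothetical, in-memory stand-ins, and `expire_leases` simply models an acknowledgement deadline passing for a consumer that has failed.

```python
from collections import deque


class ToySubscription:
    """In-memory stand-in for one subscription shared by two consumers."""

    def __init__(self, messages):
        self._pending = deque(messages)
        self._outstanding = {}  # message -> consumer currently holding it

    def pull(self, consumer):
        """Hand the next pending message to a consumer, or None if empty."""
        if not self._pending:
            return None
        msg = self._pending.popleft()
        self._outstanding[msg] = consumer
        return msg

    def ack(self, msg):
        """Mark a message as fully processed."""
        self._outstanding.pop(msg, None)

    def expire_leases(self, failed_consumer):
        """Requeue unacked messages held by a consumer that has failed."""
        for msg, owner in list(self._outstanding.items()):
            if owner == failed_consumer:
                del self._outstanding[msg]
                self._pending.append(msg)


sub = ToySubscription(range(6))
processed = []

# Region A pulls two messages, acks the first, then fails holding the second.
first = sub.pull("region-a")
second = sub.pull("region-a")
processed.append(first)
sub.ack(first)
sub.expire_leases("region-a")

# Region B keeps pulling and drains the rest, including the redelivery.
while (msg := sub.pull("region-b")) is not None:
    processed.append(msg)
    sub.ack(msg)

print(sorted(processed))
```

Every message ends up processed exactly once in this toy run, even though region A failed mid-flight. In a real system redelivery can produce duplicates, which is why the exactly-once handling in the processing layer matters.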
Key Concepts to Remember
- Read the scenario before looking for a memorised answer.
- Find the constraint that changes the correct option.
- Eliminate answers that are true in general but not in this case.
Exam Day Tips
- Watch for words such as best, first, most likely and least administrative effort.
- Review why wrong options are wrong, not only why the correct option is correct.
About these practice questions
Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.
Last reviewed: Jun 24, 2026
This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.