Quick Answer

The answer is to increase the number of workers in the pipeline configuration and raise the maximum worker count. This resolves the Dataflow out-of-memory and worker-termination issue because keyed state, such as per-store counters in a stateful ParDo, is partitioned by key across worker VMs; adding workers means each worker handles fewer keys and therefore a smaller share of the state. On the Google Professional Data Engineer exam, this scenario tests your understanding that even with Streaming Engine enabled, workers still load and cache the state for the keys they process, so scaling out is the direct fix, not disabling state or reducing parallelism. A common trap is to assume that Streaming Engine removes all state-related memory pressure: it moves shuffle and state storage off the worker VMs, but working state for active keys still occupies worker memory. Memory tip: “More workers, less per-worker burden: state scales with nodes, not with code.”

PDE Practice Question: Building and operationalizing data processing systems

This PDE practice question tests your understanding of building and operationalizing data processing systems. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Then compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

A retail company uses Cloud Dataflow for a streaming pipeline that aggregates sales events from thousands of stores. The pipeline writes aggregated results to BigQuery every 5 minutes. Recently, the Dataflow job has been restarting multiple times a day with the error 'Worker ran out of memory' in the logs. Streaming Engine is enabled. The pipeline uses keyed state (a ParDo with stateful processing) to maintain per-store counters. The average event size is 2 KB, and the throughput is 2,000 events/sec. You need to resolve the out-of-memory issues without losing data. What should you do?

Question 1 · medium · multiple choice

Correct answer & explanation

Increase the number of workers in the pipeline configuration and ensure the maximum worker count is set higher to allow better distribution of state.

Option C is correct because increasing the number of workers spreads the keyed state (per-store counters) across more VMs, so each worker handles fewer keys and faces less memory pressure. Even with Streaming Engine enabled, workers load and cache state for the keys they are processing, so adding workers is the direct way to reduce the per-worker state footprint. This avoids data loss because the pipeline continues processing with exactly-once semantics and state is persisted durably by the service.
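
As a rough sketch of what "increase the number of workers and raise the maximum" can look like at launch time, the helper below assembles command-line flags for a Beam Python pipeline on Dataflow. The flag names (`--num_workers`, `--max_num_workers`, `--autoscaling_algorithm`, `--enable_streaming_engine`) are standard Beam/Dataflow pipeline options; the function name and the worker counts are hypothetical and are not taken from the scenario.

```python
def dataflow_scaling_args(num_workers, max_num_workers):
    """Build Dataflow launch flags that raise the initial and maximum worker counts.

    The numbers passed in are placeholders; choose them from observed
    per-worker memory usage rather than from this example.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    if max_num_workers < num_workers:
        raise ValueError("max_num_workers must be >= num_workers")
    return [
        "--runner=DataflowRunner",
        "--streaming",
        "--enable_streaming_engine",
        "--autoscaling_algorithm=THROUGHPUT_BASED",
        f"--num_workers={num_workers}",          # workers at job start
        f"--max_num_workers={max_num_workers}",  # ceiling for the autoscaler
    ]


if __name__ == "__main__":
    # Hypothetical values: start with 10 workers, let autoscaling reach 40.
    print(" ".join(dataflow_scaling_args(10, 40)))
```

The returned list can be passed to the pipeline's option parser or appended to the launch command. Keeping the maximum well above the starting count is what gives the autoscaler room to spread keys further.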

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • A. Disable stateful processing and use side inputs from BigQuery to get per-store aggregates.

    Why it's wrong here

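    Side inputs are read-only snapshots, so the per-store aggregates would lag behind the live stream and real-time updates would be lost. This removes the functionality the pipeline needs rather than fixing worker memory.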

  • B. Modify the pipeline to use sliding windows with a shorter duration to reduce the state size.

    Why it's wrong here

    Changing the window type does not necessarily reduce state size, and sliding windows assign each event to several overlapping windows, so the same event can be counted more than once.

  • C. Increase the number of workers in the pipeline configuration and ensure the maximum worker count is set higher to allow better distribution of state.

    Why this is correct

    More workers spread the stateful processing and reduce memory per worker.

    Related concept

    Read the scenario before looking for a memorised answer.

  • D. Reduce the number of workers to limit the overhead of data shuffling.

    Why it's wrong here

    Fewer workers increase per-worker load and exacerbate memory issues.

Common exam traps

Common exam trap: answer the scenario, not the keyword

The trap here is that candidates may confuse window-based state (which can be reduced by shortening windows) with keyed state (which is independent of window duration), leading them to incorrectly choose option B.

Detailed technical explanation

How to think about this question

In Dataflow, a stateful ParDo uses Beam's state API, which keeps state per key. Workers hold the state for the keys they are actively processing in memory, while the service persists it durably (in the Streaming Engine backend when Streaming Engine is enabled). When the number of keys (stores) is large and workers are few, the heap can still be exhausted with Streaming Engine enabled, because each worker must load and cache state for all the keys assigned to it. Increasing the maximum worker count lets the autoscaler add workers, spreading the keys more evenly and reducing per-worker memory usage.
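
The distribution argument above can be illustrated with a small standalone simulation: keys are assigned to workers by a stable hash, and the number of keys on the busiest worker is reported for different worker counts. This is a sketch under stated assumptions, not Dataflow's actual mechanism: the store IDs, the count of 5,000 stores, the hash-modulo assignment, and the function name are all hypothetical, and the real service assigns key ranges internally.

```python
import hashlib
from collections import Counter


def keys_per_worker(store_ids, num_workers):
    """Assign each key to a worker by stable hash and count keys per worker."""
    counts = Counter()
    for store_id in store_ids:
        digest = hashlib.sha256(store_id.encode("utf-8")).digest()
        worker = int.from_bytes(digest[:8], "big") % num_workers
        counts[worker] += 1
    return counts


if __name__ == "__main__":
    stores = [f"store-{i}" for i in range(5000)]  # hypothetical key set
    for workers in (2, 8, 32):
        counts = keys_per_worker(stores, workers)
        print(workers, "workers -> busiest worker holds", max(counts.values()), "keys")
```

Because per-key state is roughly proportional to the number of keys a worker owns, the busiest worker's key count is a reasonable proxy for its state footprint in this simplified model.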

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A startup's cloud architect reviews their monthly bill and notices costs are higher than expected for a long-running batch job. Switching from on-demand instances to Reserved Instances, or using Spot/Preemptible VMs, can reduce compute costs by up to 72%. Questions like this test whether you understand the tradeoffs between commitment, flexibility, and cost across cloud pricing models.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PDE question test?

It tests the exam domain "Building and operationalizing data processing systems", and in particular whether you read the scenario before reaching for a memorised answer.

What is the correct answer to this question?

The correct answer is option C: increase the number of workers in the pipeline configuration and ensure the maximum worker count is set higher to allow better distribution of state. More workers spread the keyed state (per-store counters) across more VMs, reducing the memory pressure on each one. No data is lost, because the pipeline continues processing with exactly-once semantics and state is persisted durably by the service.

What should I do if I get this PDE question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

2 more ways this is tested on PDE

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. Your company uses Cloud Dataflow to process streaming data from Pub/Sub. The pipeline occasionally fails with a 'worker terminated unexpectedly' error. What is the most likely cause of this error?

easy
  • A. Insufficient memory per worker causing OOM errors
  • B. Incorrect VPC firewall rules blocking internal communication
  • C. Staging location bucket lacks write permissions
  • D. Pub/Sub subscription throughput quota exceeded

Why A: The 'worker terminated unexpectedly' error in Cloud Dataflow typically indicates that a worker process ran out of memory (OOM) and was killed by the operating system. This occurs when the pipeline's memory requirements exceed the configured worker machine type's memory capacity, often due to large windowing accumulations, skewed data, or inefficient state handling.

Variation 2. A data pipeline ingests real-time events from Cloud Pub/Sub into BigQuery using Dataflow. The pipeline uses a sliding window of 5 minutes with a 1-minute period to aggregate event counts. Recently, the pipeline started failing with 'The worker failed to provide a heartbeat.' The Dataflow logs show high CPU usage on the workers. What is the best course of action to resolve the issue?

hard
  • A. Increase the number of workers and enable autoscaling to distribute the load.
  • B. Reduce the number of workers to minimize coordination overhead.
  • C. Use a global window with a trigger to reduce state size.
  • D. Change the windowing to a fixed 5-minute window to reduce computations.

Why A: The 'worker failed to provide a heartbeat' error combined with high CPU usage indicates that workers are overloaded and cannot process data fast enough to maintain their heartbeat to the Dataflow service. Increasing the number of workers and enabling autoscaling distributes the computational load across more machines, reducing per-worker CPU pressure and allowing heartbeats to be sent on time. This directly addresses the root cause of resource exhaustion.
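
To see why the windowing in Variation 2 adds to the CPU load, note that a 5-minute sliding window with a 1-minute period places every event in five overlapping windows. The helper below is a standalone sketch of the usual sliding-window assignment rule, not Beam code; the function name and the sample timestamp are assumptions for illustration.

```python
def sliding_window_starts(timestamp_s, size_s=300, period_s=60):
    """Return the start times of all sliding windows that contain the timestamp.

    A window starting at s covers [s, s + size_s), and windows start at
    multiples of period_s.
    """
    # Latest window start at or before the timestamp.
    last_start = timestamp_s - (timestamp_s % period_s)
    # Step back one period at a time while the window still covers the timestamp.
    return list(range(last_start, last_start - size_s, -period_s))


if __name__ == "__main__":
    starts = sliding_window_starts(1000)
    print(len(starts), "windows:", starts)
```

Each event is therefore aggregated size/period times, which is part of the per-worker work that adding workers spreads out.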

Last reviewed: Jun 24, 2026

This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.