
Quick Answer

The answer is that the stateful processing is causing large state sizes that lead to GC overhead, requiring a more efficient state backend or increased worker memory. This is correct because the GC overhead limit exceeded error directly indicates that the JVM is spending more than 98% of its time garbage collecting while recovering less than 2% of heap space, a classic symptom of excessive per-key state accumulation. In stateful processing with sliding windows, each key maintains overlapping state across multiple window firings, which can balloon memory usage and force constant garbage collection, driving system lag above five minutes. On the Google Professional Data Engineer exam, this scenario tests your understanding of how Dataflow’s stateful operations interact with JVM memory management, and it’s a common trap to blame source or sink bottlenecks when the real issue is worker heap pressure. A useful memory tip: when you see GC overhead errors in a stateful pipeline, think “state size, not source size.”
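
The 98% / 2% rule above can be written as a small check. This is a minimal sketch, not JVM code: the two thresholds are the defaults of the HotSpot flags `-XX:GCTimeLimit` and `-XX:GCHeapFreeLimit`, the function name and the sample measurements are hypothetical, and the real JVM evaluates the condition over several consecutive collections rather than a single sample.

```python
# Defaults of the HotSpot flags -XX:GCTimeLimit and -XX:GCHeapFreeLimit.
GC_TIME_LIMIT_PCT = 98.0
GC_HEAP_FREE_LIMIT_PCT = 2.0


def gc_overhead_exceeded(gc_time_pct: float, heap_freed_pct: float) -> bool:
    """Return True when one sample matches the 'GC overhead limit exceeded' condition.

    gc_time_pct:    share of wall-clock time spent in garbage collection.
    heap_freed_pct: share of the heap that the collections managed to free.
    Simplification: the JVM checks this over consecutive collections.
    """
    return gc_time_pct > GC_TIME_LIMIT_PCT and heap_freed_pct < GC_HEAP_FREE_LIMIT_PCT


# Hypothetical measurements, for illustration only.
print(gc_overhead_exceeded(99.1, 1.2))   # True: almost all time in GC, almost nothing freed
print(gc_overhead_exceeded(35.0, 40.0))  # False: GC is busy but still reclaiming memory
```

The second sample is the useful contrast: frequent GC alone is not the error. The error appears only when collection stops reclaiming memory, which is what live, ever-growing state produces.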

PDE Practice Question: Building and operationalizing data processing systems

This PDE practice question tests your understanding of building and operationalizing data processing systems. The scenario asks you to isolate a root cause, so eliminate options that address a different problem before choosing. After answering, compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

A company runs a critical real-time data pipeline using Dataflow that ingests events from Cloud Pub/Sub, performs aggregations using sliding windows, and writes results to BigQuery. The pipeline is deployed in us-central1. The pipeline's latency has increased recently, and the Dataflow monitoring shows that the 'system lag' metric is consistently above 5 minutes. The pipeline is using Streaming Engine and has 10 workers with 4 vCPUs each. The pipeline processes approximately 100,000 events per second. The team has verified that the source Pub/Sub topic has sufficient publish throughput and the BigQuery table has no quota issues. The pipeline logs show that some workers are experiencing GC overhead limit exceeded errors. The pipeline code uses stateful processing with a custom keyed state for deduplication. What is the most likely cause of the increased latency?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

  • Clue: "most likely"

    Why it matters: Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

Question 1 · hard · multiple choice

Correct answer & explanation

The stateful processing is causing large state sizes that lead to GC overhead; use a more efficient state backend or increase worker memory.

The GC overhead limit exceeded errors indicate that workers are spending too much time garbage collecting, which is a classic symptom of excessive heap memory usage. Stateful processing with custom keyed state for deduplication can cause large per-key state sizes, especially with sliding windows that maintain overlapping state for each key. This forces the JVM to constantly garbage collect, increasing system lag beyond 5 minutes. Using a more efficient state backend (e.g., reducing state size or using Dataflow's built-in deduplication) or increasing worker memory directly addresses the root cause.
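
One way to read "reducing state size" is to expire deduplication entries instead of keeping every ID forever. The sketch below is plain Python that only models the idea. It is not the Apache Beam or Dataflow API, the class name `ExpiringDeduplicator` and the 600-second TTL are hypothetical, and it assumes timestamps arrive in non-decreasing order so that insertion order matches age.

```python
from collections import OrderedDict


class ExpiringDeduplicator:
    """Remember event IDs for a limited time so the state stays bounded."""

    def __init__(self, ttl_seconds: float):
        self.ttl_seconds = ttl_seconds
        # event_id -> first-seen timestamp, oldest entry first.
        self._seen = OrderedDict()

    def _evict_expired(self, now: float) -> None:
        # Drop entries from the old end until the oldest one is still fresh.
        while self._seen:
            _, first_seen = next(iter(self._seen.items()))
            if now - first_seen < self.ttl_seconds:
                break
            self._seen.popitem(last=False)

    def is_duplicate(self, event_id: str, now: float) -> bool:
        self._evict_expired(now)
        if event_id in self._seen:
            return True
        self._seen[event_id] = now
        return False

    def __len__(self) -> int:
        return len(self._seen)


dedup = ExpiringDeduplicator(ttl_seconds=600)
results = [
    dedup.is_duplicate("a", now=0),    # first sighting
    dedup.is_duplicate("a", now=30),   # repeat inside the TTL
    dedup.is_duplicate("b", now=60),   # first sighting
    dedup.is_duplicate("a", now=700),  # "a" and "b" have expired by now
]
print(results)     # [False, True, False, False]
print(len(dedup))  # 1
```

The trade-off is explicit: a duplicate that arrives after the TTL is no longer caught, so the TTL has to cover the longest redelivery delay the pipeline needs to tolerate.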

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • The number of workers is insufficient; increasing to 20 workers will reduce latency.

    Why it's wrong here

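    Adding workers spreads keys across more machines but does not shrink per-key state; if the state is unbounded or concentrated on a few hot keys, worker heaps still fill up and the GC errors continue.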

  • The stateful processing is causing large state sizes that lead to GC overhead; use a more efficient state backend or increase worker memory.

    Why this is correct

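    GC overhead indicates memory pressure from large state, so increasing worker memory or shrinking the state helps. Dataflow does not expose a pluggable state backend, so in practice "a more efficient state backend" means bounding or expiring the keyed state, or moving deduplication lookups to an external store such as Cloud Bigtable.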

    Clue confirmation

    The clue words "most likely" in the question point toward this answer.

    Related concept

    Read the scenario before looking for a memorised answer.

  • The sliding window duration is too long; reducing it to 1 minute will improve performance.

    Why it's wrong here

    Shorter windows can increase state churn and may worsen GC.

  • The deduplication logic is causing a bottleneck; removing it will reduce latency.

    Why it's wrong here

    Removing deduplication may be unacceptable and may not address GC overhead.

Common exam traps

Common exam trap: answer the scenario, not the keyword

Google Cloud often tests the misconception that scaling workers (Option A) is the universal fix for latency, when in reality memory-related issues like GC overhead require tuning state management or worker resources, not just parallelism.

Detailed technical explanation

How to think about this question

Dataflow's Streaming Engine offloads shuffle and state storage to backend services, but workers still cache and deserialize the per-key state they are actively processing, so custom keyed state (e.g., ValueState or MapState) used for deduplication still puts pressure on the worker heap. Sliding windows compound this because each key maintains state for multiple overlapping windows, leading to O(n) memory per key, where n is the number of active windows (roughly the window size divided by the slide period). In practice, a pipeline processing 100k events/s across 10 workers with 4 vCPUs each (about 15 GB of RAM per worker on a machine type such as n1-standard-4, of which the JVM heap is only a part) can easily exhaust the heap if the deduplication state grows unbounded or if keys are high cardinality.
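
The O(n) growth can be made concrete with back-of-envelope arithmetic. This is a rough sketch under stated assumptions: the question does not give the window size, slide period, key count, or bytes per entry, so the values below (10-minute windows sliding every minute, 5 million active keys, 200 bytes per key per window) are hypothetical. Only the worker count comes from the scenario.

```python
def windows_per_element(window_size_s: int, slide_period_s: int) -> int:
    """Number of sliding windows that contain a single element (ceiling division)."""
    return -(-window_size_s // slide_period_s)


def estimated_state_bytes(
    active_keys: int,
    window_size_s: int,
    slide_period_s: int,
    bytes_per_key_per_window: int,
) -> int:
    """Rough upper bound: every active key holds state in every overlapping window."""
    n = windows_per_element(window_size_s, slide_period_s)
    return active_keys * n * bytes_per_key_per_window


WORKERS = 10  # from the scenario

# Hypothetical inputs, chosen only to illustrate the arithmetic.
n = windows_per_element(600, 60)
total_bytes = estimated_state_bytes(5_000_000, 600, 60, 200)
per_worker_gib = total_bytes / WORKERS / 2**30

print(n)            # 10
print(total_bytes)  # 10000000000
print(f"{per_worker_gib:.2f} GiB of raw state per worker")
```

The estimate counts raw payload only. Deserialized objects on the JVM heap are typically larger than their encoded size, and deduplication entries that never expire come on top, so the real footprint would be higher than this figure.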

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A streaming pipeline that deduplicates events per key keeps adding entries to its keyed state. If nothing expires that state, growth in traffic or in key cardinality gradually fills the worker heap until workers log GC overhead errors and system lag climbs, even though the source and sink are healthy. Questions like this test whether you can tell memory pressure from stateful processing apart from a throughput or scaling problem.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PDE question test?

This question tests the exam domain "Building and operationalizing data processing systems". The underlying skill is to read the scenario before looking for a memorised answer.

What is the correct answer to this question?

The correct answer is: The stateful processing is causing large state sizes that lead to GC overhead; use a more efficient state backend or increase worker memory. — The GC overhead limit exceeded errors indicate that workers are spending too much time garbage collecting, which is a classic symptom of excessive heap memory usage. Stateful processing with custom keyed state for deduplication can cause large per-key state sizes, especially with sliding windows that maintain overlapping state for each key. This forces the JVM to constantly garbage collect, increasing system lag beyond 5 minutes. Using a more efficient state backend (e.g., reducing state size or using Dataflow's built-in deduplication) or increasing worker memory directly addresses the root cause.

What should I do if I get this PDE question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

Are there clue words in this question I should notice?

Yes — watch for: "most likely". Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Last reviewed: Jun 30, 2026

This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.