Question 390 of 499
Designing data processing systems · medium · Multiple Choice · Objective-mapped

Quick Answer

The correct design change is to implement retries with exponential backoff for Cloud Composer tasks. This approach directly addresses transient failures like API rate limits or resource contention by automatically re-attempting failed tasks with progressively longer delays, preventing repeated immediate retries from overwhelming the system. On the Google Professional Data Engineer exam, this concept tests your understanding of Apache Airflow’s built-in retry parameters and how they integrate with Cloud Composer’s managed environment—a common trap is choosing static retries without delay, which can worsen throttling. Remember the mnemonic “Backoff to bounce back”: exponential backoff gives transient errors time to resolve, making your pipeline resilient without manual intervention.

PDE Designing data processing systems Practice Question

This PDE practice question tests your understanding of designing data processing systems. The scenario asks you to isolate a root cause — eliminate options that address a different problem before choosing. After answering, compare your reasoning against the explanation and wrong-answer breakdown below. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.

A data pipeline uses Cloud Composer to orchestrate Dataflow and BigQuery jobs. The pipeline fails intermittently with dependency errors. Which design change can improve reliability?

Correct answer & explanation

Use retries with exponential backoff

Cloud Composer (Apache Airflow) tasks can fail due to transient issues like API rate limits or resource contention. Implementing retries with exponential backoff allows the DAG to automatically re-attempt failed tasks with increasing delays, reducing the impact of intermittent failures without manual intervention. This is a standard Airflow pattern for improving reliability in orchestrated pipelines.
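
The retry pattern described above can be sketched in plain Python, independent of Airflow. This is a minimal illustration only: `TransientError` and `retry_with_backoff` are hypothetical names invented for this example, and the delay values are assumptions rather than anything Cloud Composer prescribes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. rate limits)."""


def retry_with_backoff(
    func: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on TransientError with a doubling delay."""
    for attempt in range(retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == retries:
                raise  # retries exhausted: surface the failure
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be zero or greater")
```

Only errors marked as transient are retried here; a genuine bug (any other exception) fails immediately, which mirrors the idea that retries address intermittent failures rather than permanent ones.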

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Use retries with exponential backoff

    Why this is correct

    Retries with backoff handle transient failures, improving reliability.

    Related concept

    Read the scenario before looking for a memorised answer.

  • Switch to Cloud Functions for orchestration

    Why it's wrong here

    Cloud Functions is event-driven, not suited for complex workflow orchestration with retries.

  • Increase worker count in Dataflow

    Why it's wrong here

    Worker count affects throughput, not dependency resolution.

  • Use a simpler DAG with fewer dependencies

    Why it's wrong here

    Simplification may reduce complexity but does not address transient errors.

Common exam traps

Common exam trap: answer the scenario, not the keyword

Google Cloud often tests the distinction between scaling compute resources (Dataflow workers) and improving orchestration reliability (retries), leading candidates to mistakenly choose "Increase worker count in Dataflow" when the problem is transient task failures, not resource bottlenecks.

Detailed technical explanation

How to think about this question

In Apache Airflow, retries are configured with the `retries` and `retry_delay` parameters on tasks. Setting `retry_exponential_backoff=True` makes the wait grow roughly exponentially (about doubling, with some jitter) after each failed attempt, and `max_retry_delay` caps how long any single wait can become. This is particularly effective for API rate limiting (e.g., BigQuery's 403 `rateLimitExceeded` errors) or transient Dataflow job submission failures, where immediate retries would likely fail again. Airflow also bounds the computed retry delay at 24 hours, preventing runaway backoff.
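
To make the arithmetic concrete, here is a simplified model of how a doubling delay with a cap plays out. It is a sketch under stated assumptions, not Airflow's actual computation (which also adds jitter); `backoff_schedule` is a hypothetical helper written for this example.

```python
from datetime import timedelta
from typing import List


def backoff_schedule(
    retry_delay: timedelta,
    retries: int,
    max_retry_delay: timedelta,
) -> List[timedelta]:
    """Simplified model: double the wait on each retry, never exceeding the cap."""
    return [
        min(retry_delay * (2 ** i), max_retry_delay)
        for i in range(retries)
    ]


# Five retries starting at 1 minute, capped at 10 minutes.
for wait in backoff_schedule(timedelta(minutes=1), 5, timedelta(minutes=10)):
    print(wait)
```

With a 1-minute base and a 10-minute cap, the waits in this model are 1, 2, 4, 8, and 10 minutes, so a throttled API gets progressively more time to recover.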

Key Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

Exam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Real-world example

How this comes up in practice

A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer here reflects best practice for the specific scenario described, not a general cloud recommendation. Cloud exam questions reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PDE question test?

This question tests Designing data processing systems, specifically how to handle transient task failures in a Cloud Composer pipeline with retries and exponential backoff.

What is the correct answer to this question?

The correct answer is: Use retries with exponential backoff. Re-attempting failed Cloud Composer (Apache Airflow) tasks with increasing delays absorbs transient issues such as API rate limits or resource contention without manual intervention.

What should I do if I get this PDE question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content. Learn why practice questions differ from exam dumps →

Same concept, more angles

1 more way this is tested on PDE

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. A company uses Cloud Composer to orchestrate Dataproc and BigQuery jobs. They need to implement retry logic for transient failures. Which THREE features can help?

hard
  • A.Dataflow pipeline retries
  • B.DAG retry_delay
  • C.BigQuery job retries
  • D.Cloud Composer high availability
  • E.Task retries and retry_delay

Why B: Option B is correct because Cloud Composer (Apache Airflow) lets you set `retry_delay` for every task in a DAG through the DAG's `default_args`. `retry_delay` is a native Airflow task parameter that defines the wait between retries, so failed tasks are retried automatically after the specified delay, reducing manual intervention.
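
As a rough sketch of what DAG-wide retry settings might look like, the dictionary below uses Airflow's task parameter names (`retries`, `retry_delay`, `retry_exponential_backoff`, `max_retry_delay`). The specific values and the DAG id in the comment are assumptions chosen for illustration; the block itself is plain Python, so it runs without Airflow installed.

```python
from datetime import timedelta

# Shared retry settings; Airflow applies default_args to every task in the DAG
# unless a task overrides them. Values here are illustrative assumptions.
default_args = {
    "retries": 4,                               # re-attempt a failed task up to 4 times
    "retry_delay": timedelta(minutes=1),        # base wait before the first retry
    "retry_exponential_backoff": True,          # grow the wait after each failure
    "max_retry_delay": timedelta(minutes=30),   # never wait longer than this
}

# In a DAG file this would typically be passed as, for example:
#   with DAG("example_pipeline", default_args=default_args, ...) as dag:
# where "example_pipeline" is a hypothetical DAG id.
```

Individual tasks can still pass their own `retries` or `retry_delay` to override these defaults, which is useful when one step (such as a job submission) needs a more patient policy than the rest.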

Last reviewed: Jun 30, 2026

This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.