Designing data processing systems · Easy · Multiple Choice · Objective-mapped

Quick Answer

The correct choice is B: the table is partitioned by sale_date. Partitioning enables BigQuery partition pruning, the mechanism that lets the query read only the partitions whose dates fall within the filter range rather than the entire table. Because the WHERE clause filters on the partitioning column, sale_date, the engine skips non-matching partitions before reading any data, which reduces I/O and processing time. On the Google Professional Data Engineer exam, this concept tests your understanding of how table design directly affects query performance and cost. A common trap is assuming that a large total row count automatically means slow queries, when effective partitioning can keep them efficient. Memory tip: “Partition to prune, your query’s best boon.”

PDE Designing data processing systems Practice Question

This PDE practice question tests your understanding of designing data processing systems. Read the scenario carefully and evaluate each option against the stated constraints before committing to an answer. Then compare your reasoning against the explanation and wrong-answer breakdown below to see why each distractor is designed to mislead on exam day.

Exhibit

Refer to the exhibit.

```sql
SELECT product_id, SUM(amount) AS total_sales
FROM sales
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY product_id
```
The job metadata shows: Input: 10 billion rows, Output: 500 million rows, Slot time: 20000 seconds, Elapsed time: 10 minutes, Shuffle: 100% locally, Joins: 0.

Given the query and job metadata, what is the most likely reason this query is efficient despite processing 10 billion rows?

Clue words in this question

Noticing these words before you look at the options changes how you read each choice.

Clue: "most likely"
Why it matters: Probability qualifier — the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

A. The query uses a wildcard function.
Why wrong: No wildcard function is used in the query.

B. The table is partitioned by sale_date.
Why correct: Partition pruning skips partitions outside the filter range, reducing scanned data from billions of rows to only those in the date range.

C. The table is materialized.
Why wrong: Nothing in the query or job metadata indicates a materialized view, and the job still reads 10 billion input rows, so the result was not precomputed.

D. The table is clustered by product_id.
Why wrong: Clustering by product_id can speed up filters and aggregations on product_id, but this query filters on sale_date, so clustering alone would not reduce the rows scanned.

Correct answer & explanation

✓

The table is partitioned by sale_date.

Option B is correct because partitioning by sale_date enables partition pruning, which allows the query engine to scan only the relevant partitions instead of the entire 10-billion-row table. This drastically reduces the amount of data read and processed, making the query efficient even with a large total row count.

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

✗ The query uses a wildcard function.
Why it's wrong here: No wildcard function is used in the query.

✓ The table is partitioned by sale_date.
Why this is correct: Partition pruning skips partitions outside the filter range, reducing scanned data from billions of rows to only those in the date range.
Clue confirmation: The clue words "most likely" in the question point toward this answer.
Related concept: Read the scenario before looking for a memorised answer.

✗ The table is materialized.
Why it's wrong here: Nothing in the query or job metadata indicates a materialized view, and the job still reads 10 billion input rows, so the result was not precomputed.

✗ The table is clustered by product_id.
Why it's wrong here: Clustering by product_id can speed up filters and aggregations on product_id, but this query filters on sale_date, so clustering alone would not reduce the rows scanned.

Common exam traps

Common exam trap: answer the scenario, not the keyword

Google Cloud exams often test the distinction between partitioning, which lets BigQuery skip whole partitions that fall outside a filter on the partitioning column, and clustering, which sorts data by the clustering columns and helps only queries that filter or aggregate on those columns. Candidates who blur the two mistakenly choose clustering as the primary efficiency driver.
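
The contrast can be sketched with a hypothetical table that is both partitioned and clustered. The dataset name `mydataset`, the table name `sales_clustered`, the column types, and the product ID are assumptions made for illustration; the question does not show the table definition.

```sql
-- Hypothetical table: partitioned by date and clustered by product.
CREATE TABLE mydataset.sales_clustered (
  sale_date  DATE,
  product_id STRING,
  amount     NUMERIC
)
PARTITION BY sale_date
CLUSTER BY product_id;

-- Partitioning helps here: the date filter skips every partition outside 2024.
SELECT product_id, SUM(amount) AS total_sales
FROM mydataset.sales_clustered
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY product_id;

-- Clustering helps here: within the partitions that are read, blocks that
-- cannot contain product 'P-1001' (a made-up ID) can be skipped.
SELECT SUM(amount) AS total_sales
FROM mydataset.sales_clustered
WHERE sale_date BETWEEN '2024-01-01' AND '2024-12-31'
  AND product_id = 'P-1001';
```

In the exam scenario the only filter is on sale_date, which is why partitioning, not clustering, is the most likely explanation.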

Detailed technical explanation

How to think about this question

Partition pruning uses table metadata to skip entire partitions that cannot match the query's filter conditions. For example, a query filtering on sale_date = '2024-01-15' against a table partitioned daily by sale_date scans only the partition for that date, which can reduce the scanned rows from billions to millions. In the exhibit, the filter spans all of 2024, so only the partitions dated within 2024 are read. Pruning is decided before any data is read, so skipped partitions add nothing to the data processed.
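
As a rough illustration, the sketch below shows how such a table might be declared in BigQuery and how a filter on the partitioning column allows pruning. The dataset name `mydataset`, the column types, and the `require_partition_filter` setting are assumptions for illustration, not details given in the question.

```sql
-- Hypothetical schema: the question shows only the query, not the DDL.
CREATE TABLE mydataset.sales (
  sale_date  DATE,
  product_id STRING,
  amount     NUMERIC
)
PARTITION BY sale_date                      -- one partition per day
OPTIONS (require_partition_filter = TRUE);  -- reject queries with no partition filter

-- Filter on the partitioning column: only the 2024-01-15 partition is read.
SELECT product_id, SUM(amount) AS total_sales
FROM mydataset.sales
WHERE sale_date = DATE '2024-01-15'
GROUP BY product_id;

-- Inspect per-partition row counts to see what a date filter would skip.
SELECT partition_id, total_rows
FROM mydataset.INFORMATION_SCHEMA.PARTITIONS
WHERE table_name = 'sales'
ORDER BY partition_id;
```

Running the aggregate query as a dry run reports the estimated bytes processed, which should shrink as the date range in the filter narrows.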

Key Concepts to Remember

Read the scenario before looking for a memorised answer.
Find the constraint that changes the correct option.
Eliminate answers that are true in general but not in this case.

Exam Day Tips

Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A cloud solutions architect for a retail company is evaluating services for a new workload. The correct answer reflects best practice for the specific scenario described, not a general cloud recommendation. Cloud exam questions reward reading the constraint carefully: the same technology can be right or wrong depending on the use case.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

FAQ

Questions learners often ask

What does this PDE question test?

This question tests Designing data processing systems. Read the scenario before looking for a memorised answer.

What is the correct answer to this question?

The correct answer is B: The table is partitioned by sale_date. Partitioning by sale_date enables partition pruning, which allows the query engine to scan only the relevant partitions instead of the entire table. This drastically reduces the amount of data read and processed, making the query efficient even with a large total row count.

What should I do if I get this PDE question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

Are there clue words in this question I should notice?

Yes. Watch for "most likely". It is a probability qualifier: the question wants the most probable cause or outcome, not a guaranteed one. Eliminate low-probability options.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Last reviewed: Jun 30, 2026

This PDE practice question is part of Courseiva's free Google Cloud certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the PDE exam.