Question 191 of 846
Design and implement data storagehardMultiple ChoiceObjective-mapped

Quick Answer

The correct approach is to create a separate staging table for the last 24 hours of data with round-robin distribution and a clustered columnstore index, then merge it into the main partitioned Sales table after each batch load. This solution directly addresses the Synapse dedicated SQL pool real-time analytics optimization requirement by isolating the hot data into a compact, columnstore-optimized table, which avoids expensive full partition scans on the 2-billion-row main table and enables sub-second aggregations. On the DP-203 exam, this scenario tests your understanding of hybrid transactional/analytical processing patterns, specifically the hot/cold data separation strategy using staging tables to minimize latency without costly repartitioning or additional dedicated SQL pools. A common trap is assuming you must repartition the main table or use a different distribution key, but the key insight is that columnstore indexes excel at fast aggregations on recent data when combined with a round-robin distribution for balanced writes. Memory tip: think “hot staging, cold merging” to recall that recent data lives in a separate, fast-query staging table while historical data remains partitioned for integrity.

DP-203 Design and implement data storage Practice Question

This DP-203 practice question tests your understanding of design and implement data storage. Match the stated requirement to the specific cloud service, access model, or configuration option — many options are valid in isolation but not for this scenario. After answering, compare your reasoning against the explanation and wrong-answer breakdown below. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.

You are a data engineer for a large e-commerce company. The company uses Azure Synapse Analytics dedicated SQL pool as its enterprise data warehouse. A new business requirement mandates that the Sales fact table, which contains 2 billion rows, must support real-time analytics with a maximum query latency of 1 second for aggregations on the most recent 24 hours of data. The table is currently hash-distributed on CustomerID and partitioned monthly by SaleDate. The current query performance for recent data is slow due to full partition scans. The data is ingested via Azure Event Hubs and processed by Azure Stream Analytics, which writes to staging tables every minute. You need to redesign the storage to meet the latency requirement while minimizing cost and maintaining data integrity. Which approach should you take?

Question 1hardmultiple choice
Full question →

Answer choices

Why each option matters

Answer the question above first, then reveal the full breakdown to understand why each option is right or wrong.

Correct answer & explanation

Create a separate staging table for the last 24 hours of data with round-robin distribution and a clustered columnstore index. After each batch load, merge the staging table into the main partitioned Sales table. Queries for recent data should target the staging table.

Option C is correct because it isolates the hot (recent 24-hour) data into a separate staging table with a clustered columnstore index, which is optimized for fast aggregations and high compression. This avoids full partition scans on the 2-billion-row main table, and merging into the partitioned table after each batch load maintains data integrity without requiring expensive repartitioning or additional dedicated SQL pools.

Key principle: Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Answer analysis

Option-by-option breakdown

For each option: why learners choose it and why it is or isn't the right answer here.

  • Increase partition granularity to hourly partitions for the Sales table.

    Why it's wrong here

    Even hourly partitions would be large and may not meet 1-second latency; also increases partition management overhead.

  • Change the Sales table to use a clustered index on SaleDate and a round-robin distribution.

    Why it's wrong here

    Clustered index is inefficient for large fact tables; round-robin distribution causes data movement.

  • Create a separate staging table for the last 24 hours of data with round-robin distribution and a clustered columnstore index. After each batch load, merge the staging table into the main partitioned Sales table. Queries for recent data should target the staging table.

    Why this is correct

    The staging table is small, enabling fast queries for recent data; merging maintains the historical archive.

    Related concept

    Read the scenario before looking for a memorised answer.

  • Provision a second dedicated SQL pool optimized for real-time queries and replicate the recent data there.

    Why it's wrong here

    A second pool adds significant cost and complexity; replication overhead may impact latency.

Common exam traps

Common exam trap: answer the scenario, not the keyword

The trap here is that candidates assume finer partitioning (Option A) always improves query performance, but they overlook that partition elimination still requires scanning the entire partition, and that a separate hot table with columnstore indexing is more efficient for real-time aggregations on a sliding window of recent data.

Detailed technical explanation

How to think about this question

Clustered columnstore indexes in Azure Synapse dedicated SQL pool use batch-mode processing and column segment elimination, which can achieve sub-second aggregation performance on hot data by scanning only the relevant column segments. The staging table approach leverages this while the main partitioned table retains historical data for broader analytics; merging via CTAS (CREATE TABLE AS SELECT) or partition switching ensures atomicity and minimizes downtime.

KKey Concepts to Remember

  • Read the scenario before looking for a memorised answer.
  • Find the constraint that changes the correct option.
  • Eliminate answers that are true in general but not in this case.

TExam Day Tips

  • Watch for words such as best, first, most likely and least administrative effort.
  • Review why wrong options are wrong, not only why the correct option is correct.

Key takeaway

Answer the scenario, not the keyword: identify the specific constraint before choosing the most familiar-sounding option.

Real-world example

How this comes up in practice

A startup's cloud architect reviews their monthly bill and notices costs are higher than expected for a long-running batch job. Switching from on-demand instances to Reserved Instances — or using Spot/Preemptible VMs — can reduce compute costs by up to 72 %. Questions like this test whether you understand the tradeoffs between commitment, flexibility, and cost across cloud pricing models.

What to study next

Got this wrong? Here's your next step.

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

Related practice questions

Related DP-203 practice-question pages

Use these pages to review the topic behind this question. This is how one missed question becomes focused revision.

Practice this exam

Start a free DP-203 practice session

Short sessions build daily habit. Longer sessions build exam-day stamina. Try a timed session to simulate real conditions.

FAQ

Questions learners often ask

What does this DP-203 question test?

Design and implement data storage — This question tests Design and implement data storage — Read the scenario before looking for a memorised answer..

What is the correct answer to this question?

The correct answer is: Create a separate staging table for the last 24 hours of data with round-robin distribution and a clustered columnstore index. After each batch load, merge the staging table into the main partitioned Sales table. Queries for recent data should target the staging table. — Option C is correct because it isolates the hot (recent 24-hour) data into a separate staging table with a clustered columnstore index, which is optimized for fast aggregations and high compression. This avoids full partition scans on the 2-billion-row main table, and merging into the partitioned table after each batch load maintains data integrity without requiring expensive repartitioning or additional dedicated SQL pools.

What should I do if I get this DP-203 question wrong?

Identify which exam domain this question belongs to, review the core concept, then practise similar questions from the same domain.

What is the key concept behind this question?

Read the scenario before looking for a memorised answer.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content. Learn why practice questions differ from exam dumps →

How Courseiva writes practice questions · Editorial policy

Keep practising

More DP-203 practice questions

Last reviewed: Jun 24, 2026

Question Discussion

Share a tip, memory trick, or ask about the reasoning behind this question. Do not post real exam questions, leaked content, braindumps, or copyrighted exam material. Comments are moderated and may be removed without notice.

Loading comments…

Sign in to join the discussion.

This DP-203 practice question is part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the DP-203 exam.