Describe an analytics workload on Azure → hard · Multiple Choice · Objective-mapped

Quick Answer

The answer is to replicate both the Store and Product dimension tables. Replication places a full copy of each dimension table on every compute node, so joins with the large hash-distributed fact table run locally instead of shuffling dimension rows across nodes. At 10,000 and 500,000 rows, the Store and Product tables are small enough to be good candidates for replication, and replicating both removes data movement for both joins at once. On the DP-900 exam, this tests your understanding of dedicated SQL pool distribution strategies and the trade-off between the extra storage used by replicated copies and data movement at query time. A common trap is to assume hash-distributing dimension tables is always best, but for tables under about 2 GB compressed, replication avoids costly shuffle operations. Memory tip: “Small tables get replicated, big tables get distributed; replication means zero shuffle.”

DP-900 Describe an analytics workload on Azure Practice Question

This DP-900 practice question tests your understanding of the objective “Describe an analytics workload on Azure”. Match the stated requirement to the specific cloud service, access model, or configuration option; many options are valid in isolation but not for this scenario. A key principle to apply: replicated tables store a full copy on every compute node in a Synapse dedicated SQL pool. Once you have made your selection, read the full explanation to reinforce the concept and understand why each distractor is designed to mislead on exam day.

A company uses Azure Synapse Analytics dedicated SQL pool to store sales data. The fact table contains billions of rows and is hash-distributed on ProductID. Queries aggregate sales by store and product for the current month and join with a small Store dimension table (10,000 rows) and a medium-sized Product dimension table (500,000 rows). The queries are slow due to data movement during joins. Which design change will most reduce data movement and improve query performance?

Question 1 · hard · multiple choice

A. Change the fact table to round-robin distribution.
Why wrong: Round-robin spreads rows evenly without a distribution key, so joins and aggregations usually need extra data shuffling, which often degrades performance.
B. Replicate the Store dimension table and the Product dimension table.
Correct. With a full copy of each small dimension table on every compute node, each distribution can join its fact rows locally, so no dimension data moves during the joins.
C. Change the hash distribution key of the fact table to StoreID.
Why wrong: Changing the distribution key to StoreID would colocate fact rows by store, but the join with the Product dimension and aggregations by both store and product may still cause movement. It also requires redistributing (rebuilding) the entire fact table.
D. Implement a clustered columnstore index on the fact table.
Why wrong: A clustered columnstore index improves compression and scan performance but does not address data movement during joins, which is the root cause of the slowness.

Correct answer & explanation

✓ Replicate the Store dimension table and the Product dimension table.

Replicating the Store and Product dimension tables across all compute nodes eliminates the need to shuffle data during joins with the large fact table. At 10,000 and 500,000 rows, both dimension tables are small enough to be practical candidates for replication, which avoids costly data movement and significantly improves query performance for aggregations that join on multiple dimensions.

Key principle: Replicated tables store a full copy on every compute node in Synapse SQL pool.
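
A toy simulation can make this principle concrete. The sketch below is not Synapse code: it assumes a hypothetical four-node pool, uses modulo as a stand-in for the real hash function, and invents a tiny fact table purely for illustration. It counts how many fact rows would need a Store row from another node when the Store table is spread round-robin versus replicated.

```python
NUM_NODES = 4  # hypothetical node count, chosen only for illustration


def node_for_key(key: int) -> int:
    """Stand-in for the pool's hash function: map a key to a node."""
    return key % NUM_NODES


def round_robin_store(store_id: int) -> set:
    """Each Store row lives on exactly one node (simplified round-robin)."""
    return {store_id % NUM_NODES}


def replicated_store(store_id: int) -> set:
    """Every node holds a full copy, so each Store row is on all nodes."""
    return set(range(NUM_NODES))


def remote_lookups(fact_rows, store_nodes) -> int:
    """Count fact rows whose matching Store row is absent from their node."""
    remote = 0
    for product_id, store_id in fact_rows:
        fact_node = node_for_key(product_id)  # fact table is hashed on ProductID
        if fact_node not in store_nodes(store_id):
            remote += 1
    return remote


# Invented fact table: 100 products x 20 stores = 2,000 (ProductID, StoreID) rows.
fact_rows = [(p, s) for p in range(100) for s in range(20)]

print("round-robin Store:", remote_lookups(fact_rows, round_robin_store))
print("replicated Store: ", remote_lookups(fact_rows, replicated_store))
```

In this toy model, three quarters of the fact rows need a Store row from another node under round-robin placement, and none do once the Store table is replicated.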

Common exam traps

Common exam trap: answer the scenario, not the keyword

The trap here is that candidates often focus on indexing or distribution key changes (like C or D) without recognizing that data movement during joins is the root cause, and that replicating small dimension tables is the most direct solution to eliminate that movement.

Detailed technical explanation

How to think about this question

In an Azure Synapse dedicated SQL pool, a replicated table has a full copy available on each compute node, so joins against it run locally without moving data. The decision to replicate depends on table size (typically under 2 GB compressed) and update frequency; replication is ideal for dimension tables that are frequently joined but rarely updated. Under the hood, the pool keeps a master copy of the table and rebuilds the per-node copies the first time the table is queried after its data changes, which is why frequent updates make replication less attractive. Queries then benefit because the distributed query plan no longer needs shuffle or broadcast move operations for that table.
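
As a rough illustration of how an existing dimension table might be converted, the sketch below builds a CREATE TABLE AS SELECT (CTAS) statement that uses the `DISTRIBUTION = REPLICATE` table option. The helper name and the table names (`dbo.DimStore`, `dbo.DimStore_Replicated`) are hypothetical, and the function only produces the T-SQL text; running it against a pool is left out.

```python
def build_replicate_ctas(source_table: str, target_table: str) -> str:
    """Return T-SQL that copies source_table into a new replicated table.

    Hypothetical helper: it only formats the statement and does not
    validate identifiers or connect to a database.
    """
    return (
        f"CREATE TABLE {target_table}\n"
        f"WITH (HEAP, DISTRIBUTION = REPLICATE)\n"
        f"AS SELECT * FROM {source_table};"
    )


# Assumed table names, not taken from the question.
statement = build_replicate_ctas("dbo.DimStore", "dbo.DimStore_Replicated")
print(statement)
```

The same pattern would apply to the Product dimension; after the copy is verified, the old and new tables are usually swapped by renaming them.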

Key Concepts to Remember

Replicated tables store a full copy on every compute node in Synapse SQL pool.
Replication eliminates data movement during joins with any distributed table.
Best suited for small to medium dimension tables (typically under 2 GB).
Reduces network traffic and improves query performance for star schema joins.
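
The concepts above can be condensed into a small rule-of-thumb helper. This is a simplified heuristic, not an official sizing tool: the 2 GB threshold comes from the guidance quoted above, while the function name, the `is_staging` flag, and the example sizes are assumptions made for illustration.

```python
REPLICATE_LIMIT_GB = 2.0  # rule-of-thumb threshold for compressed table size


def suggest_distribution(compressed_size_gb: float, is_staging: bool = False) -> str:
    """Suggest a distribution type for a dedicated SQL pool table."""
    if is_staging:
        # Staging tables are loaded quickly and rarely joined in place.
        return "ROUND_ROBIN"
    if compressed_size_gb < REPLICATE_LIMIT_GB:
        # Small tables: keep a full copy on every compute node.
        return "REPLICATE"
    # Large tables: spread rows across distributions by a hash key.
    return "HASH"


# Hypothetical sizes; the question gives row counts, not gigabytes.
print(suggest_distribution(0.01))   # small dimension table
print(suggest_distribution(500.0))  # large fact table
print(suggest_distribution(5.0, is_staging=True))
```

Real designs also weigh update frequency and query patterns, so the size check is a starting point rather than a final answer.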

Exam Day Tips

Watch for words such as best, first, most likely and least administrative effort.
Review why wrong options are wrong, not only why the correct option is correct.

Real-world example

How this comes up in practice

A retail analytics team finds that its month-end sales report runs slowly in a dedicated SQL pool. The query plan shows data movement steps each time the sales fact table joins the Store and Product dimensions. Recreating both dimensions as replicated tables lets each compute node resolve those joins locally, so the movement steps drop out of the plan. Questions like this test whether you can match a table's size and role to the right distribution strategy.

What to study next

Got this wrong? Here's your next step.

Review how replicated tables store a full copy on every compute node in a Synapse dedicated SQL pool, then practise related DP-900 questions on the same topic to reinforce the concept.

About these practice questions

Courseiva creates original exam-style practice questions with explanations and wrong-answer analysis. It does not publish real exam questions, exam dumps, or protected exam content.

Same concept, more angles

1 more way this is tested on DP-900

These questions test the same concept from different angles. Work through them to make sure you can recognise it however the exam phrases it.

Variation 1. A company uses Azure Synapse Analytics dedicated SQL pool for a large data warehouse. The fact table contains billions of rows and is hash-distributed on ProductID. Frequent queries join this fact table with a small Store dimension table (10,000 rows) and a medium-sized Product dimension table (500,000 rows). The queries aggregate sales by store and product for recent months, but run slowly due to data movement during joins. Which design change will most reduce data movement and improve query performance?

hard

✓ A. Replicate the Store dimension table
B. Change the distribution of the fact table to round-robin
C. Change the distribution key of the fact table to StoreID
D. Add a nonclustered index on the StoreID column in the fact table

Why A: Replicating the small Store dimension table (10,000 rows) across all compute nodes eliminates the need to shuffle data during joins with the fact table. In an Azure Synapse dedicated SQL pool, a replicated table has a full copy on each compute node, so queries that join it with a distributed fact table avoid costly data movement, which significantly improves performance for frequent aggregation queries.

This DP-900 practice question is part of Courseiva's free Microsoft certification practice question bank. Courseiva provides original exam-style practice questions with explanations, topic-based practice, mock exams, readiness tracking, and study analytics to help learners prepare for the DP-900 exam.