A retail company uses Azure SQL Database to store a large fact table of sales transactions with millions of rows. They run complex aggregate queries (SUM, COUNT, AVG) across many rows for monthly reports. These queries take too long. Which index type should they add to the table to improve performance?
Correct. A clustered columnstore index is ideal for data warehousing and analytical workloads, significantly improving aggregate query performance.
Why this answer
Clustered columnstore indexes are optimized for large fact tables and analytical workloads because they store data in a columnar format, which significantly reduces the amount of data read from disk for aggregate queries like SUM, COUNT, and AVG. This index type also uses batch processing and compression to accelerate query performance on millions of rows, making it ideal for monthly reporting queries.
Exam trap
The trap here is that candidates often confuse nonclustered columnstore indexes with clustered columnstore indexes, assuming any columnstore index will suffice, but only the clustered version is designed for large fact tables with heavy aggregation workloads and avoids the overhead of maintaining a separate rowstore index.
How to eliminate wrong answers
Option A is wrong because a clustered B-tree index stores data in row-based pages and is optimized for point lookups and range scans, not for large-scale aggregations across millions of rows. Option B is wrong because a nonclustered rowstore index also uses a row-based structure and, while it can cover some queries, it does not provide the columnar compression and batch processing needed for efficient aggregate operations. Option D is wrong because a nonclustered columnstore index is a secondary index that requires the table to have a separate clustered index, adding overhead for writes and not being as efficient as a clustered columnstore index for full-table scans and aggregations.