AZ-204Chapter 47 of 102Objective 2.2

Azure Table Storage Query Performance

This chapter covers Azure Table Storage query performance, a critical topic for the AZ-204 exam under Objective 2.2: Implement Azure Cosmos DB and Azure Table Storage. You will learn how to design efficient queries by leveraging PartitionKey and RowKey, understand the cost model based on Request Units, and avoid common pitfalls that lead to full table scans. This area typically accounts for 5-10% of exam questions, often appearing as scenario-based questions where you must choose the optimal query pattern or diagnose performance issues.

25 min read
Intermediate
Updated May 31, 2026

Library Card Catalog for Table Storage

Imagine a massive library with millions of books, each with a unique ID (PartitionKey) and a secondary ID (RowKey). The library has a card catalog organized by these IDs. When you want a specific book, you go to the catalog, look up the exact ID, and retrieve the book instantly. This is a point query. Now, suppose you want all books on 'Azure' written after 2020. Without an index, you'd have to walk every aisle and check every book—a full table scan. But if the library has a secondary index sorted by topic and date, you can quickly find the relevant section. However, the library charges you for every book you touch during the search. So, using the index costs a bit upfront but saves you from scanning millions of books. Similarly, Azure Table Storage uses PartitionKey and RowKey as its primary index. Queries that specify these keys are fast and inexpensive. Queries without them must scan all partitions, which consumes more Request Units (RUs) and takes longer. Secondary indexes, like those in Azure Cognitive Search, can speed up such queries but add cost and complexity. The library analogy breaks down at scale: Azure Table Storage is a NoSQL key-value store, not a relational database. It doesn't support joins or complex queries. But the core idea of indexing and partitioning to avoid full scans is exactly how it works. The PartitionKey determines the physical server (partition) where data is stored. Queries that filter on PartitionKey are routed to the correct partition, minimizing data movement. RowKey is used within a partition to sort and locate rows. Together, they form the primary key, which is the only guaranteed fast access path. Any other query requires a partition scan or a full table scan, which the exam tests heavily.

How It Actually Works

What is Azure Table Storage and Why It Exists

Azure Table Storage is a NoSQL key-value store that offers schemaless design, high scalability, and low cost for storing large amounts of structured, non-relational data. It is part of Azure Storage, which also includes Blob, Queue, and File Storage. Table Storage is ideal for storing flexible datasets like user data, device logs, metadata, or any data that doesn't require complex joins or relational integrity. It supports trillions of entities (rows) and petabytes of data, with automatic partitioning and load balancing.

How It Works Internally

Azure Table Storage uses a flat namespace of tables, each containing entities. An entity has a PartitionKey, RowKey, and a Timestamp (system-managed). The PartitionKey determines the partition (physical node) where the entity is stored. All entities with the same PartitionKey are stored together, and the RowKey uniquely identifies an entity within a partition. The primary key is the combination of PartitionKey and RowKey. This key is used to build a clustered index that sorts entities first by PartitionKey, then by RowKey.

When you query a table, the storage service evaluates whether the query can be satisfied by the primary index. If the query specifies both PartitionKey and RowKey, it is a point query and is extremely efficient—it directly locates the entity in O(1) time. If the query specifies only PartitionKey, it is a partition scan—it retrieves all entities in that partition, which is still efficient but consumes more RUs as the partition grows. If the query specifies neither, it is a full table scan—it must scan every partition, which is the most expensive and slowest operation.

Key Components, Values, Defaults, and Timers

PartitionKey: String value up to 1 KB. Determines the partition. Queries must include PartitionKey to avoid full table scans. The choice of PartitionKey is critical for performance. A good PartitionKey distributes load evenly across partitions (e.g., user ID, date). A bad PartitionKey can cause hot partitions (e.g., a single value like "all" for all entities).

RowKey: String value up to 1 KB. Uniquely identifies an entity within a partition. Sorted lexicographically. Use RowKey to enable range queries (e.g., timestamp).

Timestamp: System-managed datetime. Used for optimistic concurrency control. Not indexed.

Request Units (RUs): Cost metric for operations. Each operation consumes RUs based on entity size, number of entities returned, and whether it's a point query, partition scan, or table scan. Point queries cost ~1 RU per 1 KB entity. Partition scans cost more. Table scans are expensive and should be avoided.

Timeout: Default 30 seconds for queries. Can be increased up to 5 minutes. Long queries may time out if scanning large tables.

Max results per query: 1,000 entities. Use continuation tokens to retrieve more.

Max table size: 500 TB per storage account.

Max entity size: 1 MB (including property names and values).

Max properties per entity: 255 (including PartitionKey, RowKey, Timestamp).

Configuration and Verification Commands

To create a table and query it using Azure CLI:

# Create storage account
az storage account create --name mystorageaccount --resource-group myResourceGroup --location eastus --sku Standard_LRS

# Get connection string
az storage account show-connection-string --name mystorageaccount --resource-group myResourceGroup --query connectionString

# Create table
az storage table create --name MyTable --connection-string "<connection-string>"

# Insert entity
az storage entity insert --table-name MyTable --entity PartitionKey=pk1 RowKey=rk1 Name=John Age=30 --connection-string "<connection-string>"

# Query by PartitionKey and RowKey (fast)
az storage entity query --table-name MyTable --filter "PartitionKey eq 'pk1' and RowKey eq 'rk1'" --connection-string "<connection-string>"

# Query by PartitionKey only (partition scan)
az storage entity query --table-name MyTable --filter "PartitionKey eq 'pk1'" --connection-string "<connection-string>"

# Query without PartitionKey (full table scan) - avoid
az storage entity query --table-name MyTable --filter "Name eq 'John'" --connection-string "<connection-string>"

In .NET, using the Azure.Data.Tables SDK:

// Point query
TableClient tableClient = new TableClient(connectionString, tableName);
TableEntity entity = await tableClient.GetEntityAsync<TableEntity>(partitionKey, rowKey);

// Partition scan
Pageable<TableEntity> entities = tableClient.Query<TableEntity>(filter: $"PartitionKey eq '{partitionKey}'");

// Full table scan (bad)
Pageable<TableEntity> allEntities = tableClient.Query<TableEntity>(filter: $"Name eq 'John'");

How It Interacts with Related Technologies

Azure Cosmos DB Table API: Provides the same Table Storage API but with global distribution, multi-region writes, and lower latency at higher cost. Queries in Cosmos DB Table API also use PartitionKey and RowKey, but Cosmos DB offers additional indexing options and can automatically index all properties. However, the exam focuses on Azure Table Storage specifically.

Azure Cognitive Search: Can index Table Storage data for full-text search and complex queries. This is a workaround for queries that don't use PartitionKey/RowKey, but it adds cost and complexity.

Azure Functions: Often used to process Table Storage data. When writing functions, ensure queries are optimized to avoid table scans.

Power BI and Azure Synapse: Can query Table Storage via connectors, but performance depends on the query pattern.

Performance Best Practices

1.

Design PartitionKey for even distribution: Avoid a single partition for all entities. Use a high-cardinality key like user ID, device ID, or date-hour combination.

2.

Use RowKey for sorting and range queries: If you need to query by timestamp, use a reverse tick (e.g., DateTime.MaxValue.Ticks - timestamp.Ticks) to sort newest first.

3.

Avoid full table scans: Always include PartitionKey in your queries. If you need to query by other properties, consider duplicating data or using a secondary index in Azure Cognitive Search.

4.

Batch operations: Use Entity Group Transactions (ETags) to batch up to 100 entities in a single transaction, but all entities must share the same PartitionKey.

5.

Monitor with Azure Storage Analytics: Enable logging to track slow queries and high RU consumption.

6.

Use continuation tokens: For queries returning more than 1,000 entities, use continuation tokens to paginate results efficiently.

7.

Limit properties returned: Use the select parameter to only retrieve needed properties, reducing RU cost.

Common Pitfalls

Using PartitionKey as a constant: If you set PartitionKey to "all", every query becomes a full table scan. This is the most common performance mistake.

Not using RowKey for uniqueness: If you don't ensure RowKey uniqueness, you'll overwrite entities. Use a GUID or timestamp for RowKey.

Ignoring continuation tokens: If you don't handle continuation tokens, you'll only get the first 1,000 results.

Assuming Table Storage supports joins: It doesn't. You must denormalize data.

Advanced: Partition Splitting and Load Balancing

Azure Storage automatically splits partitions when they exceed 10 GB. This can cause performance issues if your PartitionKey design leads to large partitions. After a split, queries that previously targeted a single partition may now hit multiple partitions, increasing latency. To avoid this, keep partitions small (e.g., by using a fine-grained PartitionKey like user ID).

Summary of RU Costs

Point query (PartitionKey + RowKey): ~1 RU per 1 KB entity.

Partition scan (PartitionKey only): ~1 RU per 1 KB entity retrieved, but for all entities in the partition.

Table scan (no PartitionKey): ~1 RU per 1 KB entity scanned, which can be millions of RUs.

Insert: ~1 RU per 1 KB entity.

Update: ~1 RU per 1 KB entity (plus delete if replacing).

Delete: ~1 RU per entity.

These costs are approximate; actual RU depends on entity size and consistency model. For Table Storage, RUs are not directly exposed like in Cosmos DB, but the concept is similar: the storage service throttles requests if you exceed scale targets (20,000 IOPS per partition).

Exam-Relevant Numbers

Max entity size: 1 MB.

Max properties per entity: 255.

Max table size: 500 TB per storage account.

Max results per query: 1,000.

Default timeout: 30 seconds.

Max batch size: 100 entities.

Max batch size in bytes: 4 MB.

PartitionKey max size: 1 KB.

RowKey max size: 1 KB.

Storage account scale target: 20,000 IOPS for table storage (assuming no throttling).

Walk-Through

1

Define PartitionKey and RowKey

Start by designing your entity schema. Choose a PartitionKey that distributes data evenly across partitions. For example, use UserId for user data or DateHour for logs. Ensure RowKey is unique within a partition, such as a GUID or timestamp. This design is critical because it determines query performance. The PartitionKey is hashed to determine the physical partition. A good design avoids hot partitions where one partition receives most requests. For high-traffic entities, consider using a composite PartitionKey like UserId_Date to spread load.

2

Insert entities into the table

Use the Insert operation to add entities. Each insert costs ~1 RU per 1 KB. You can batch up to 100 entities in a single transaction if they share the same PartitionKey. The batch must be under 4 MB. Insert operations are atomic within a batch. After insertion, the entity is immediately available for queries. The storage service automatically partitions data based on PartitionKey. If a partition exceeds 10 GB, it may be split into multiple partitions, which can affect query routing.

3

Execute a point query

A point query specifies both PartitionKey and RowKey. This is the fastest and cheapest query. The storage service hashes the PartitionKey to locate the partition, then uses the RowKey to find the exact entity within that partition. The operation consumes ~1 RU per 1 KB returned. Point queries are O(1) in time complexity. On the exam, whenever you see a query that needs a single entity, always recommend a point query. The wrong answer might suggest a filter without PartitionKey.

4

Execute a partition scan

A partition scan specifies only PartitionKey, optionally with a filter on RowKey or other properties. The service retrieves all entities in that partition and applies the filter. This is more expensive than a point query because it reads all entities in the partition. The cost is proportional to the partition size. For large partitions, this can be slow and costly. To optimize, use RowKey range queries (e.g., RowKey ge '2020-01-01' and RowKey le '2020-01-31') to limit the scan. The service can leverage the RowKey index to skip entities outside the range.

5

Execute a full table scan

A query that does not include PartitionKey forces a full table scan. The service must iterate over every partition, reading all entities and applying the filter. This is the most expensive operation and should be avoided. It can consume millions of RUs and time out. On the exam, this is a classic trap: a query filtering by a non-key property like 'Name eq 'John'' will scan the entire table. The correct answer is to redesign the schema to include that property in the key or use a secondary index like Azure Cognitive Search.

What This Looks Like on the Job

Scenario 1: IoT Device Telemetry Ingestion

A company ingests telemetry from millions of IoT devices. Each device sends a message every minute with device ID, timestamp, temperature, and humidity. They store this in Azure Table Storage. The PartitionKey is set to device ID, and RowKey is a reverse tick timestamp (DateTime.MaxValue.Ticks - timestamp.Ticks) to sort newest first. Queries typically ask: "Get the last hour of data for a specific device." This is a partition scan with a RowKey range filter, which is efficient. However, if they forget to include the device ID in the query, they would scan all partitions—disastrous. In production, they also use a secondary table with PartitionKey as date-hour and RowKey as device ID for cross-device queries. They monitor RU consumption using Azure Monitor and set alerts for throttling. The biggest issue is hot partitions: if one device sends more data than others, its partition grows large and may be split, causing temporary performance degradation. To mitigate, they use a composite PartitionKey like device ID + hour to spread load.

Scenario 2: User Profile Storage for a Web Application

A social media app stores user profiles in Table Storage. PartitionKey is user ID (a GUID), RowKey is 'profile'. Each user has one entity. Queries are point queries: get a user's profile by user ID. This is fast and cheap. However, the app also needs to search users by email. Without a secondary index, they would have to scan the entire table. They solve this by maintaining a separate table where PartitionKey is the email hash and RowKey is user ID. This allows point queries by email. They also use Azure Cognitive Search for full-text search. The common mistake is to query the main table with a filter on email, which causes a full table scan. The correct approach is to use the secondary table or a search service.

Scenario 3: Order Management System

An e-commerce platform stores orders with PartitionKey = customer ID and RowKey = order ID. They need to query all orders for a customer (partition scan) and get a specific order (point query). Performance is good. But they also need to query orders by date range across all customers. This would be a full table scan. They solve this by creating a separate table with PartitionKey = date (e.g., '2023-01-01') and RowKey = order ID. This allows date-range queries to scan only one partition. They also use a materialized view in Azure SQL Database for complex analytics. The key lesson: design your PartitionKey based on the most common query patterns. If you need multiple access patterns, consider multiple tables or denormalization.

How AZ-204 Actually Tests This

What AZ-204 Tests on This Topic

Objective 2.2: Implement Azure Cosmos DB and Azure Table Storage. For Table Storage, the exam focuses on:

Designing PartitionKey and RowKey for optimal query performance.

Understanding the cost implications of different query patterns (point query vs. partition scan vs. table scan).

Using continuation tokens to paginate results.

Batch operations and Entity Group Transactions.

Identifying scenarios where Table Storage is appropriate vs. other storage solutions.

Common Wrong Answers and Why Candidates Choose Them

1.

"Use a filter on a non-key property without PartitionKey": Candidates think they can query efficiently like SQL. They don't realize this causes a full table scan. The exam will present a scenario where a query filters by a property like 'Status' or 'Email'. The correct answer is to redesign the schema or use a secondary index.

2.

"Always use a single PartitionKey for simplicity": Some candidates set PartitionKey to a constant like 'all' to simplify coding. This creates a hot partition and kills performance. The exam tests the principle of even distribution.

3.

"Use RowKey as the only key": Candidates may omit PartitionKey and use RowKey as a unique identifier. But without PartitionKey, every query is a table scan. The exam expects you to know that PartitionKey is required for efficient queries.

4.

"Batch operations can include entities from different partitions": Candidates may think you can batch any entities. In reality, all entities in a batch must share the same PartitionKey. The exam tests this limit.

Specific Numbers and Terms That Appear Verbatim

Max entity size: 1 MB.

Max properties: 255.

Max batch size: 100 entities, 4 MB.

Max results per query: 1,000.

Default timeout: 30 seconds.

PartitionKey and RowKey max size: 1 KB each.

Storage account scale target: 20,000 IOPS.

Continuation token: Used to retrieve more than 1,000 entities.

Edge Cases and Exceptions

Table Storage vs. Cosmos DB Table API: The exam may ask which to use. Cosmos DB offers global distribution and lower latency but costs more. Table Storage is cheaper and sufficient for most scenarios.

Querying with a filter on RowKey without PartitionKey: Still a full table scan because PartitionKey is missing. The RowKey index is only used within a partition.

Using select to limit properties: Reduces RU cost because less data is transferred. The exam may ask how to optimize a query that returns large entities.

Timestamp property: Not indexed. Do not filter on Timestamp without PartitionKey.

How to Eliminate Wrong Answers

If the question asks for the most efficient query, look for both PartitionKey and RowKey in the filter.

If the question involves a query that must return all entities matching a non-key property, the answer is not to scan the table but to redesign or use a secondary index.

If the question involves batch operations, check that all entities share the same PartitionKey.

If the question involves pagination, the correct answer is to use continuation tokens, not skip/take.

Key Takeaways

Always include PartitionKey in queries to avoid full table scans.

Point queries (PartitionKey + RowKey) are the most efficient and cheapest.

PartitionKey should be chosen for even distribution and based on the most common query pattern.

RowKey can be used for range queries within a partition (e.g., timestamp).

Batch operations require all entities to share the same PartitionKey; max 100 entities or 4 MB.

Queries return at most 1,000 entities; use continuation tokens for pagination.

Full table scans are expensive and should be avoided; redesign schema or use secondary indexes.

Max entity size is 1 MB; max properties per entity is 255.

Table Storage does not support joins, triggers, or stored procedures.

Monitor performance using Azure Storage Analytics and set alerts for throttling.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure Table Storage

Part of Azure Storage; lower cost.

No global distribution; single region.

Eventual consistency only.

Automatic partition splitting based on size.

Max 20,000 IOPS per partition.

Azure Cosmos DB Table API

Separate service; higher cost.

Global distribution with multi-region writes.

Multiple consistency levels (strong, bounded staleness, session, consistent prefix, eventual).

Manual or automatic indexing of all properties.

Virtually unlimited throughput with request units.

Watch Out for These

Mistake

Azure Table Storage supports SQL-like queries with joins.

Correct

Table Storage is a NoSQL key-value store. It does not support joins, foreign keys, or complex queries. You must denormalize data to avoid joins.

Mistake

You can query efficiently by any property if you add an index.

Correct

Table Storage only has a primary index on PartitionKey + RowKey. There are no secondary indexes. To query by other properties, you must either scan or use a separate table or Azure Cognitive Search.

Mistake

A query with a filter on RowKey is always efficient even without PartitionKey.

Correct

Without PartitionKey, the query must scan all partitions. The RowKey filter is only applied after scanning each partition. This is a full table scan.

Mistake

Batch operations can include up to 100 entities from different partitions.

Correct

All entities in a batch must share the same PartitionKey. The batch is atomic within that partition.

Mistake

Setting PartitionKey to a constant like 'all' is fine for small tables.

Correct

Even for small tables, a single partition limits scalability. As data grows, the partition becomes hot and performance degrades. Always use a high-cardinality PartitionKey.

Do You Actually Know This?

Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.

Frequently Asked Questions

How do I optimize a query that filters by a non-key property like 'Status'?

The best approach is to redesign your PartitionKey or RowKey to include that property. For example, if you often query by Status, use Status as the PartitionKey (or part of a composite key). Alternatively, create a separate table where PartitionKey is the property value. If the property is highly selective, you can also use Azure Cognitive Search to index the table. Avoid scanning the entire table.

What is the difference between a partition scan and a table scan?

A partition scan queries all entities within a single partition (specified by PartitionKey). It is efficient if the partition is small. A table scan queries every partition in the table, which is expensive and slow. Always include PartitionKey to limit the scan to one partition.

Can I use a filter on RowKey without PartitionKey?

Yes, but it will perform a full table scan. The RowKey filter is applied after scanning each partition. This is inefficient. Always include PartitionKey in the filter to avoid scanning all partitions.

How many entities can I insert in a single batch?

Up to 100 entities, with a total batch size not exceeding 4 MB. All entities must share the same PartitionKey. The batch is atomic within that partition.

What happens if a query returns more than 1,000 entities?

The response includes a continuation token. You must use this token in subsequent requests to retrieve the next set of entities. The token is valid for the same table and query filter. Without it, you only get the first 1,000 entities.

How do I choose between Azure Table Storage and Azure Cosmos DB Table API?

Use Table Storage for simple, cost-effective storage with moderate throughput and single-region needs. Use Cosmos DB Table API for global distribution, multi-region writes, lower latency, and higher throughput. Cosmos DB also offers multiple consistency levels and automatic indexing of all properties.

What is the maximum size of an entity in Azure Table Storage?

1 MB, including property names and values. The maximum number of properties per entity is 255 (including PartitionKey, RowKey, and Timestamp).

Terms Worth Knowing

Ready to put this to the test?

You've just covered Azure Table Storage Query Performance — now see how well it sticks with free AZ-204 practice questions. Full explanations included, no account needed.

Done with this chapter?