AZ-204 | Chapter 44 of 102 | Objective 2.1

Cosmos DB Partition Key Selection

This chapter covers Azure Cosmos DB partition key selection, a critical skill for optimizing performance and cost in Cosmos DB solutions. For the AZ-204 exam, partition key design is a high-weight topic, appearing in roughly 15-20% of questions related to Cosmos DB. Mastering this concept is essential for passing the Storage domain (Objective 2.1) and for designing scalable, efficient database solutions. We will explore the mechanics of partitioning, how to choose an effective partition key, and common pitfalls to avoid.

25 min read
Intermediate
Updated May 31, 2026

Library Filing System for Cosmos DB

Imagine a massive library with millions of books. The library files each book under a single attribute (for example, genre) that determines which shelf it goes on. This attribute is the partition key. When a patron requests a book, the librarian uses the partition key to go straight to the correct shelf (partition) and then scans that shelf for the exact title. If every book has the same genre, they all land on one shelf, and the librarian must search the entire collection to find any one book. At the other extreme, if every book has its own unique genre, the books spread out well, but any request that spans many genres forces the librarian to visit many shelves. The best partition key has many distinct values, so books spread evenly across shelves (high cardinality), and matches the most common search patterns. For example, if patrons usually search by author, the partition key should be author, so the librarian goes to the author's shelf and finds all books by that author together. This mirrors how Cosmos DB distributes data across physical partitions based on the partition key value, and why query performance depends on a key that spreads request units (RU) and storage evenly across partitions.

How It Actually Works

What is a Partition Key and Why Does It Exist?

Azure Cosmos DB is a globally distributed, multi-model database service that automatically scales throughput and storage across multiple servers. To achieve this, Cosmos DB uses horizontal partitioning (sharding). Every container (table, collection, graph, etc.) has a partition key—a property within each item (document, row, node) that Cosmos DB uses to distribute data across logical and physical partitions.

The partition key is a JSON property path (e.g., /userId, /city, /date). Cosmos DB hashes the partition key value using a consistent hash function to determine which physical partition stores the item. This hash-based distribution ensures that items with the same partition key value are stored together on the same physical partition.

Why is this necessary? Without partitioning, all data would reside on a single server, limiting storage and throughput. Partitioning allows Cosmos DB to scale out by adding more servers (physical partitions) as data grows, and to distribute request units (RU) across partitions to achieve high throughput.

How Partitioning Works Internally

Cosmos DB operates with two levels of partitioning: logical partitions and physical partitions.

Logical partition: A group of items that share the same partition key value. For example, if partition key is /city, all items with city = "Seattle" belong to the same logical partition. There is no limit on the number of logical partitions per container.

Physical partition: A server node that stores one or more logical partitions. Each physical partition has a fixed storage limit (currently 50 GB) and a reserved amount of throughput (RU/s). Cosmos DB automatically manages the mapping of logical partitions to physical partitions, splitting physical partitions when they exceed storage limits or when throughput demands require more capacity.

When you provision throughput at the container level (dedicated throughput), Cosmos DB divides the RU/s equally among all physical partitions. For example, if you provision 4000 RU/s and there are 4 physical partitions, each partition gets 1000 RU/s. If one logical partition receives more than its share of requests, it may throttle even if other partitions are idle.
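The even split described above can be illustrated with a few lines of Python. This is a simplified model rather than how the service actually meters requests; the function names and the demand figures are invented for illustration.

```python
def per_partition_budget(provisioned_rus: int, physical_partitions: int) -> float:
    """RU/s available to each physical partition under an even split."""
    return provisioned_rus / physical_partitions


def throttled_partitions(provisioned_rus: int, demand_by_partition: dict) -> list:
    """Return the partitions whose demand exceeds their equal share."""
    budget = per_partition_budget(provisioned_rus, len(demand_by_partition))
    return [pid for pid, demand in demand_by_partition.items() if demand > budget]


# Hypothetical demand in RU/s: the total is 2,100, well under the 4,000 provisioned.
demand = {"p0": 1500, "p1": 200, "p2": 200, "p3": 200}

print(per_partition_budget(4000, 4))        # 1000.0
print(throttled_partitions(4000, demand))   # ['p0']
```

Partition p0 is throttled even though the container as a whole uses barely half of its provisioned throughput, which is the hot-partition effect in miniature.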

When you create an item, Cosmos DB computes the hash of the partition key value. The hash space is divided into contiguous ranges (partition key ranges), each owned by one physical partition, and the item is stored on the partition whose range contains the hash.
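The routing idea can be sketched in Python. This is a toy model under stated assumptions: MD5 and a 32-bit hash space stand in for Cosmos DB's internal hash function and ranges, which are not reproduced here, and the function names are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 32  # illustrative hash space, not the service's real range


def hash_key(partition_key_value: str) -> int:
    """Map a partition key value to a stable integer in [0, HASH_SPACE)."""
    digest = hashlib.md5(partition_key_value.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def physical_partition_for(partition_key_value: str, partition_count: int) -> int:
    """Pick the equal-width hash range that contains the key's hash."""
    range_width = HASH_SPACE // partition_count
    return min(hash_key(partition_key_value) // range_width, partition_count - 1)


# The same value always maps to the same partition.
for city in ["Seattle", "London", "Tokyo", "Seattle"]:
    print(city, "->", physical_partition_for(city, 4))
```

Because the mapping depends only on the key value, every item with city = "Seattle" lands in the same place, while different values scatter across the available ranges.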

Key Components, Values, and Defaults

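Partition key path: Must point to a scalar property, typically a string or number (boolean values are also accepted). It cannot point to an array or object. The property should be present in every item. An item that omits it is still accepted, but it is stored under a special undefined partition key value, and all such items share one logical partition.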

Max partition key length: 2 KB for string values.

Logical partition limit: No limit on number of logical partitions, but each logical partition has a maximum storage of 20 GB (as of 2025). This is a hard limit—exceeding it will cause write failures.

Physical partition storage limit: 50 GB per physical partition.

Throughput per physical partition: Maximum 10,000 RU/s per physical partition. If you need more, Cosmos DB creates more physical partitions.

RU consumption: Cross-partition queries consume more RU because they must fan out to all partitions. Point reads (single partition key + item ID) consume the least RU.

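Indexing: The default indexing policy indexes every property, including the partition key path. If you customize the policy, keep the partition key path included so that filters on it stay efficient.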
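The 50 GB and 10,000 RU/s figures above give a rough lower bound on how many physical partitions a container needs. The sketch below is a rule-of-thumb estimate only; the service decides the real count, and the function name is hypothetical.

```python
import math

MAX_GB_PER_PHYSICAL_PARTITION = 50
MAX_RUS_PER_PHYSICAL_PARTITION = 10_000


def min_physical_partitions(total_storage_gb: float, provisioned_rus: int) -> int:
    """Lower bound on physical partitions implied by the storage and RU/s limits."""
    by_storage = math.ceil(total_storage_gb / MAX_GB_PER_PHYSICAL_PARTITION)
    by_throughput = math.ceil(provisioned_rus / MAX_RUS_PER_PHYSICAL_PARTITION)
    return max(1, by_storage, by_throughput)


print(min_physical_partitions(120, 4_000))   # 3 (storage-bound)
print(min_physical_partitions(10, 35_000))   # 4 (throughput-bound)
```

Whichever limit demands more partitions wins, which is why a small but very busy container can still be spread over several physical partitions.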

Configuration and Verification

You set the partition key when creating a container. It cannot be changed after creation.

Azure Portal:

In Cosmos DB account → Data Explorer → New Container → specify Partition key (e.g., /userId).

Azure CLI:

az cosmosdb sql container create \
    --resource-group myResourceGroup \
    --account-name myCosmosAccount \
    --database-name myDatabase \
    --name myContainer \
    --partition-key-path "/userId" \
    --throughput 400

PowerShell:

New-AzCosmosDBSqlContainer `
    -ResourceGroupName myResourceGroup `
    -AccountName myCosmosAccount `
    -DatabaseName myDatabase `
    -Name myContainer `
    -PartitionKeyKind Hash `
    -PartitionKeyPath "/userId" `
    -Throughput 400

SDK (C#):

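using Microsoft.Azure.Cosmos;

// Assumes `database` is an existing Microsoft.Azure.Cosmos.Database instance.
ContainerProperties containerProperties = new ContainerProperties
{
    Id = "myContainer",
    PartitionKeyPath = "/userId" // fixed at creation time
};
// Creates the container with 400 RU/s of dedicated throughput if it does not already exist.
Container container = await database.CreateContainerIfNotExistsAsync(containerProperties, throughput: 400);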

To verify partition distribution, use Azure Monitor metrics: "Normalized RU Consumption" per partition key range, "Data Size" per partition, and "Max RU/s consumed per partition".

Interaction with Related Technologies

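Change Feed: The change feed guarantees order only within a single partition key value, not across values. If consumers need a global order, they must impose it themselves, for example by sorting on a timestamp downstream, or the items that must stay ordered must share the same partition key value.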

Stored Procedures: Stored procedures execute within the scope of a single logical partition. They cannot span partitions.

Transactions: Multi-item transactions (using stored procedures or batch operations) are only supported within the same logical partition.

Global Distribution: Replication is independent of the partition key. Every region holds a copy of all partitions, so the same key design, and the same hot-partition behavior, applies in each region.

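Analytical Store: The analytical store (for Synapse Link) is a separate column store whose layout is independent of the transactional partition key, so the key chosen here mainly affects the transactional workload.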
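Because stored procedures and multi-item transactions are scoped to one logical partition, an application usually has to group pending writes by partition key value before submitting them. A minimal sketch of that grouping step follows; the function name and sample documents are hypothetical, and the actual batch submission through an SDK is left out.

```python
from collections import defaultdict


def group_by_partition_key(items: list, key_property: str) -> dict:
    """Group items so that each group shares one partition key value."""
    groups = defaultdict(list)
    for item in items:
        groups[item[key_property]].append(item)
    return dict(groups)


orders = [
    {"id": "o1", "customerId": "c1", "total": 20},
    {"id": "o2", "customerId": "c2", "total": 35},
    {"id": "o3", "customerId": "c1", "total": 15},
]

# Each group could be written atomically; items in different groups could not.
batches = group_by_partition_key(orders, "customerId")
for customer_id, batch in batches.items():
    print(customer_id, [order["id"] for order in batch])
```

Orders o1 and o3 can participate in one transaction because they share customerId c1, while o2 would need a separate one.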

Choosing an Effective Partition Key

A good partition key has the following properties:

1. High cardinality: Many distinct values (e.g., user ID, device ID) to distribute data evenly.

2. Even request distribution: The workload should hit all partitions uniformly. Avoid hot partitions where one key receives most requests.

3. Bounded size per logical partition: No single partition key value should exceed 20 GB of storage.

4. Common query filter: Queries should usually include the partition key in the WHERE clause to avoid cross-partition scans.

Good examples:

/userId for a user profile container (high cardinality, even distribution if users are similarly active).

/deviceId for IoT telemetry.

/customerId for e-commerce orders.

Bad examples:

/status (e.g., "active", "inactive"): low cardinality, hot partition for "active".

/date (e.g., a date string): if there are many items per day, the partition may exceed 20 GB or become hot.

/category (few values): uneven distribution.

Synthetic Partition Keys

If no natural property meets all criteria, you can create a synthetic partition key by concatenating multiple properties. For example, combine /userId and /date into a property like /userId-date. This increases cardinality and distributes writes evenly. However, queries must include the synthetic key to avoid cross-partition scans.
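One way to build such a key is to compute it in application code before the item is written. The sketch below assumes the container's partition key path is a dedicated property named /partitionKey; that property name and the helper function are illustrative choices, not something the service requires.

```python
from datetime import date


def with_synthetic_key(item: dict) -> dict:
    """Return a copy of the item with a 'partitionKey' property added."""
    day = date.fromisoformat(item["date"])  # validates the date format
    enriched = dict(item)
    enriched["partitionKey"] = f"{item['userId']}-{day.isoformat()}"
    return enriched


doc = with_synthetic_key({"id": "evt-1", "userId": "user42", "date": "2025-03-14"})
print(doc["partitionKey"])  # user42-2025-03-14
```

Readers must be able to rebuild the same string from values they already know (here, the user and the day); otherwise the query cannot supply the key and falls back to a cross-partition scan.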

Partition Key and Throughput

Dedicated throughput: RU/s are divided equally among physical partitions. A hot partition can throttle even if other partitions are idle.

Shared throughput (database level): Throughput is shared across containers, but each container still has its own partition key. The same hot partition issues apply.

Autoscale: Throughput scales automatically based on usage, but partition key still affects distribution.

Monitoring and Troubleshooting

Azure Monitor metrics: "Data Size" and "Max RU/s consumed per partition" reveal skew.

Diagnostic logs: Query CDBPartitionKeyStatistics to see logical partition sizes.

SDK diagnostics: The Cosmos DB SDK provides request diagnostics including partition key range ID and RU charge.

If you see throttling (HTTP 429) on a specific partition, consider:

Changing the partition key (must create new container and migrate data).

Using a synthetic key.

Increasing throughput to reduce throttling (temporary).

Using caching or load leveling to reduce peak load.

Walk-Through

Step 1: Identify Candidate Partition Keys

List all properties that could serve as partition keys. Evaluate each for cardinality (number of distinct values), access patterns (which properties are most used in WHERE clauses), and storage growth per value. For each candidate, estimate the maximum storage per logical partition over time—it must stay under 20 GB. Also assess whether write and read requests will be evenly distributed across values. Avoid properties with few distinct values (e.g., 'status' with 3 values) or that are monotonically increasing (e.g., 'createdDate' with high write volume on latest date).
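A quick way to compare candidates is to measure cardinality and skew on a sample of real or representative documents. The following sketch uses synthetic sample data and a hypothetical helper name; it only counts items, so it approximates storage skew when items are similar in size.

```python
from collections import Counter


def key_stats(items: list, key_property: str) -> dict:
    """Summarize cardinality and skew for one candidate partition key."""
    counts = Counter(item[key_property] for item in items)
    largest_value, largest_count = counts.most_common(1)[0]
    return {
        "distinct_values": len(counts),
        "largest_value": largest_value,
        "largest_share": largest_count / len(items),
    }


# Synthetic sample: 50 users, and 90% of items have status "active".
sample = [
    {"userId": f"u{i % 50}", "status": "active" if i % 10 else "inactive"}
    for i in range(1000)
]

print(key_stats(sample, "userId"))  # 50 distinct values, largest share 0.02
print(key_stats(sample, "status"))  # 2 distinct values, largest share 0.9
```

A candidate whose largest share is close to 1 divided by the number of distinct values is evenly spread; a candidate like status, where one value holds 90% of the items, is a hot partition waiting to happen.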

Step 2: Analyze Query Patterns

Review the application's most frequent and critical queries. Determine which properties are used as filter conditions. The ideal partition key is one that appears in the WHERE clause of the majority of queries. If a query includes the partition key, Cosmos DB routes it directly to the physical partition that holds that key value, avoiding a costly cross-partition fan-out. For point reads (single item by ID and partition key), the RU cost is minimal (about 1 RU for a 1 KB item). For cross-partition queries, the RU cost grows with the number of physical partitions, because every partition must be checked.
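The three access patterns look like this with the azure-cosmos Python SDK. This is a sketch that assumes `container` is a ContainerProxy for a container partitioned on /customerId; the function names, the status property, and the container layout are illustrative.

```python
def read_order(container, order_id: str, customer_id: str) -> dict:
    """Point read: id plus partition key value, the cheapest way to fetch one item."""
    return container.read_item(item=order_id, partition_key=customer_id)


def orders_for_customer(container, customer_id: str, status: str) -> list:
    """Single-partition query: the partition key scopes it to one logical partition."""
    return list(container.query_items(
        query="SELECT * FROM c WHERE c.status = @status",
        parameters=[{"name": "@status", "value": status}],
        partition_key=customer_id,
    ))


def orders_by_status(container, status: str) -> list:
    """Cross-partition query: no partition key is given, so the query fans out."""
    return list(container.query_items(
        query="SELECT * FROM c WHERE c.status = @status",
        parameters=[{"name": "@status", "value": status}],
        enable_cross_partition_query=True,
    ))
```

The first two calls touch a single partition. Only the third needs the explicit cross-partition opt-in, which is a useful signal during code review that a query may be expensive.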

Step 3: Evaluate Storage Distribution

Estimate the maximum size of each logical partition. If any single partition key value is expected to exceed 20 GB, that key is invalid—you must choose a different key or use a synthetic key to further subdivide. For example, if partition key is '/city' and one city has 30 GB of data, you need to combine with another property like '/city-date' to reduce per-partition size. Use historical data or projected growth to calculate. Cosmos DB enforces the 20 GB limit at write time; exceeding it results in a 403 (Forbidden) error.
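The estimate is simple arithmetic, sketched below. The workload numbers are hypothetical, and the calculation ignores index overhead, so treat the result as a lower bound on actual storage.

```python
LOGICAL_PARTITION_LIMIT_GB = 20


def projected_partition_gb(items_per_day: int, avg_item_kb: float, retention_days: int) -> float:
    """Projected size in GB of one logical partition over its retention period."""
    total_kb = items_per_day * avg_item_kb * retention_days
    return total_kb / (1024 * 1024)


def fits_in_logical_partition(items_per_day: int, avg_item_kb: float, retention_days: int) -> bool:
    """True if the projection stays under the 20 GB logical partition limit."""
    return projected_partition_gb(items_per_day, avg_item_kb, retention_days) < LOGICAL_PARTITION_LIMIT_GB


# Hypothetical busiest city: 50,000 items per day, 2 KB each, kept for 365 days.
print(round(projected_partition_gb(50_000, 2, 365), 1))  # 34.8
print(fits_in_logical_partition(50_000, 2, 365))         # False

# Same city with a synthetic city-month key: each partition holds about 30 days.
print(round(projected_partition_gb(50_000, 2, 30), 1))   # 2.9
print(fits_in_logical_partition(50_000, 2, 30))          # True
```

Subdividing by month turns one partition that would outgrow the limit into many partitions of under 3 GB each.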

Step 4: Simulate Workload Distribution

Use the Cosmos DB capacity calculator or perform a proof-of-concept with realistic data and workload. Monitor metrics like 'Normalized RU Consumption per PartitionKeyRange' in Azure Monitor. Aim for a normalized RU consumption below 80% on all partitions under peak load. If one partition shows consistently higher consumption, it's a hot partition. Also check 'Data Size per PartitionKeyRange' to ensure even storage distribution. Adjust the partition key or throughput accordingly.
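Once the per-partition figures have been exported from Azure Monitor, the check itself is easy to automate. The sketch below assumes the metrics have already been collected into a dictionary of partition key range ID to peak normalized RU consumption (in percent); the sample values and function names are invented.

```python
def find_hot_partitions(normalized_ru_percent: dict, threshold: float = 80.0) -> list:
    """Return partition key range IDs at or above the threshold, sorted."""
    return sorted(pid for pid, pct in normalized_ru_percent.items() if pct >= threshold)


def skew_ratio(values) -> float:
    """Ratio of the busiest partition to the average; 1.0 means perfectly even."""
    vals = list(values)
    return max(vals) / (sum(vals) / len(vals))


# Hypothetical peak-load readings per partition key range.
peak = {"0": 35.0, "1": 97.0, "2": 41.0, "3": 38.0}

print(find_hot_partitions(peak))            # ['1']
print(round(skew_ratio(peak.values()), 2))  # 1.84
```

A skew ratio well above 1 combined with a single range over the threshold points to a partition key problem rather than a general lack of throughput.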

Step 5: Finalize and Implement

Once you select the partition key, create the container with that key path. Because partition key cannot be changed after creation, you must migrate data if you later need to change it. Use Azure Data Factory or the Cosmos DB change feed to copy data to a new container with the new key. After creation, monitor the container for throttling and skew. If issues arise, consider using a synthetic key or increasing throughput. Document the chosen key and rationale for future reference.

What This Looks Like on the Job

Enterprise Scenario 1: E-Commerce Order Management

A large online retailer uses Cosmos DB to store order history. Each order document includes customerId, orderId, orderDate, and status. They initially chose /customerId as the partition key, assuming customers would have a few orders each. However, some VIP customers placed thousands of orders, causing their logical partitions to approach the 20 GB limit. Additionally, the sales team frequently ran queries to retrieve orders by date range across all customers, and these cross-partition queries were slow. The solution was a synthetic partition key that combines customerId with the order month (e.g., customer123-2023-01). This kept each logical partition small and let queries for one customer's orders in a given month target a single partition. Date-range queries across all customers remain cross-partition with this key, so it addresses the storage problem rather than the reporting queries. They also implemented caching for hot customer data to reduce read RU.

Enterprise Scenario 2: IoT Device Telemetry

A manufacturing company ingests sensor data from thousands of devices every second. Each reading includes deviceId, timestamp, and sensorValue. They chose /deviceId as the partition key. This distributed writes evenly because each device writes at a similar rate. However, queries that analyzed trends across all devices (e.g., average temperature per hour) were extremely expensive because they had to scan all partitions. To optimize, they created a separate container for aggregated data with partition key /hour (low cardinality, with only 24 distinct values, but acceptable because hourly summaries are small and written infrequently) and used Azure Functions to aggregate raw data into hourly summaries. The raw data container kept /deviceId for point queries.

Enterprise Scenario 3: Multi-Tenant SaaS Application

A SaaS provider stores tenant configuration in Cosmos DB. Each tenant has a unique tenantId. They used /tenantId as partition key, which worked well because tenants are isolated and queries are always scoped to a single tenant. However, a few large tenants had high write throughput, causing throttling on their partitions. They mitigated by increasing RU/s and implementing client-side retry policies with exponential backoff. They also monitored storage per tenant and alerted when approaching 20 GB, triggering a migration to a new container with a synthetic key combining tenantId and a shard key (e.g., tenantId-modulo-10).
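The tenantId-plus-shard approach from this scenario can be sketched as follows. The choice of CRC32 over the item ID and the shard count of 10 are assumptions made for illustration; any stable hash works as long as writers and readers agree on it.

```python
import zlib

SHARD_COUNT = 10  # hypothetical number of shards per tenant


def sharded_key(tenant_id: str, item_id: str, shard_count: int = SHARD_COUNT) -> str:
    """Derive a stable shard suffix from the item ID and append it to the tenant ID."""
    shard = zlib.crc32(item_id.encode("utf-8")) % shard_count
    return f"{tenant_id}-{shard}"


def all_shard_keys(tenant_id: str, shard_count: int = SHARD_COUNT) -> list:
    """Every partition key value a tenant-wide read has to cover."""
    return [f"{tenant_id}-{shard}" for shard in range(shard_count)]


print(sharded_key("tenantA", "doc-123"))   # always the same shard for this item
print(all_shard_keys("tenantA")[:3])       # ['tenantA-0', 'tenantA-1', 'tenantA-2']
```

The trade-off is visible in the second function: writes for a large tenant now spread over ten logical partitions, but a read that needs all of the tenant's data must query each of those ten key values.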

Common Misconfigurations and Consequences

Choosing a low-cardinality key (e.g., /status): Results in hot partitions; the 'active' partition may throttle while others are idle. Throughput is wasted.

Ignoring storage limits: A logical partition exceeding 20 GB causes write failures. Recovery requires migrating to a new container.

Not considering query patterns: Cross-partition queries become slow and expensive. Users experience high latency.

Monotonically increasing keys (e.g., /date or a coarse-grained /timestamp): All writes in the current time window share one key value, so they go to the same logical partition and create a hot partition. This is a classic anti-pattern.

How AZ-204 Actually Tests This

AZ-204 Exam Focus on Partition Key Selection

The AZ-204 exam tests partition key selection under Objective 2.1: Develop solutions that use Cosmos DB storage. Specifically, you must be able to:

Determine the appropriate partition key based on data characteristics and access patterns.

Identify hot partitions and propose solutions (synthetic keys, increasing throughput).

Understand the 20 GB logical partition limit and its implications.

Recognize the RU cost difference between point reads and cross-partition queries.

Common Wrong Answers and Why Candidates Choose Them

1. Choosing /id as partition key: Candidates pick /id because it is unique and has very high cardinality, which does spread data evenly. However, if queries rarely filter by id, every query becomes a cross-partition scan. The exam expects you to choose a key that matches query patterns, not just one with high cardinality.

2. Selecting /timestamp or /date: Candidates assume date-based partitioning is good for time-series data, but it creates hot partitions on the current date. The exam tests this anti-pattern.

3. Thinking partition key can be changed later: Many candidates believe you can update the partition key via SDK or portal. The exam emphasizes that it must be set at container creation and is immutable.

4. Ignoring the 20 GB limit: Candidates may choose a key like /country for a global app, not realizing a large country (e.g., USA) could exceed 20 GB. The exam includes scenarios where storage limits are tested.

Specific Numbers and Terms on the Exam

20 GB: Maximum logical partition size.

50 GB: Maximum physical partition size.

10,000 RU/s: Maximum throughput per physical partition.

Point read: About 1 RU for a 1 KB item, using partition key and ID.

Cross-partition query: RU cost scales with number of partitions.

Synthetic partition key: Concatenation of multiple properties.

Edge Cases and Exceptions

Null or missing partition key: If an item lacks the partition key property, it is stored under a special undefined partition key value (PartitionKey.None in the .NET SDK). All such items share one logical partition, which can become hot and is still subject to the 20 GB limit.

Partition key with array values: Not allowed—Cosmos DB does not support array partition keys. The path must point to a scalar value.

Large partition key values: Strings up to 2 KB are allowed, but hashing is still performed. Very long keys increase storage overhead.

How to Eliminate Wrong Answers

If the scenario describes a query that filters by a specific property, that property is likely the partition key. Eliminate options that don't match the filter.

If the scenario mentions throttling on a single partition, look for a low-cardinality or monotonically increasing key. The correct answer will suggest a synthetic key or different key.

If the scenario includes a requirement for transactions across multiple items, those items must share the same partition key. Choose a key that groups them.

Always check storage growth: if any value could exceed 20 GB, the key is invalid.

By understanding these patterns, you can quickly eliminate distractors and select the correct partition key.

Key Takeaways

The partition key is a JSON property path set at container creation; it is immutable.

Logical partitions have a maximum storage of 20 GB; exceeding it causes write failures.

Physical partitions have a maximum storage of 50 GB and a maximum throughput of 10,000 RU/s.

Choose a partition key with high cardinality and even request distribution to avoid hot partitions.

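Queries that include the partition key in the WHERE clause are routed to a single partition (low RU); those that don't are cross-partition (higher RU). A point read, which supplies both the item ID and the partition key value, is cheaper still.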

Synthetic partition keys combine multiple properties to improve distribution and query efficiency.

The partition key path cannot point to an array or object; items that omit the property all land in a single logical partition.

Stored procedures and transactions are scoped to a single logical partition.

Monitor normalized RU consumption and data size per partition to detect skew.

For time-series data, avoid monotonically increasing keys like /timestamp; use a synthetic key with a shard prefix.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

High Cardinality Key (e.g., /userId)

Many distinct values (e.g., 1 million users).

Distributes storage evenly across partitions.

Distributes requests evenly if workload is uniform.

Point reads are efficient if query includes userId.

Less risk of hitting 20 GB logical partition limit.

Low Cardinality Key (e.g., /status)

Few distinct values (e.g., 'active', 'inactive', 'pending').

Logical partitions can become very large (e.g., all active items).

Hot partition on the most common value (e.g., 'active').

Cross-partition queries still hit few partitions, but writes are skewed.

High risk of exceeding 20 GB logical partition limit.

Watch Out for These

Mistake

The partition key must be unique for each item.

Correct

The partition key does not need to be unique; multiple items can share the same partition key value. Uniqueness is enforced by the id property combined with the partition key (the logical partition).

Mistake

You can change the partition key after creating the container.

Correct

The partition key is immutable after container creation. To change it, you must create a new container and migrate data.

Mistake

A high-cardinality partition key always ensures good performance.

Correct

High cardinality is necessary but not sufficient. The key must also match query patterns to avoid cross-partition scans. Additionally, even distribution of requests is critical—a high-cardinality key can still have hot partitions if some values receive disproportionate traffic.

Mistake

The partition key can be an array or nested object.

Correct

The partition key path must point to a single scalar value (string, number, boolean). Arrays and objects are not supported.

Mistake

Cross-partition queries are always inefficient and should be avoided.

Correct

Cross-partition queries are acceptable for infrequent or analytical workloads. They consume more RU, and query options such as MaxConcurrency and MaxItemCount can tune their parallelism and page size to reduce latency. The exam expects you to know the trade-off.

Frequently Asked Questions

What is the maximum size of a logical partition in Cosmos DB?

The maximum size of a logical partition is 20 GB. This is a hard limit enforced by Cosmos DB. If you attempt to insert an item that would cause a logical partition to exceed 20 GB, the write fails with a 403 (Forbidden) error. To avoid this, ensure that no single partition key value accumulates more than 20 GB of data. If necessary, use a synthetic partition key to further subdivide the data.

Can I change the partition key after creating a Cosmos DB container?

No, the partition key is immutable after container creation. You cannot modify it via the portal, SDK, or CLI. To change the partition key, you must create a new container with the desired partition key and migrate the data from the old container using Azure Data Factory, the change feed, or a custom migration tool.

What happens if I don't include the partition key in a query?

If a query does not include the partition key in the WHERE clause, Cosmos DB performs a cross-partition query, meaning it fans out the query to all physical partitions. This consumes more RU (request units) and increases latency. The RU cost grows with the number of physical partitions, because each one must be contacted, although the exact cost also depends on how much data each partition has to scan. For example, a query that fans out to 10 partitions pays the per-partition overhead 10 times, where a single-partition query pays it once.

How do I choose between a single property and a synthetic partition key?

Use a single property if it has high cardinality, even distribution, and is commonly used in queries. Use a synthetic key if no single property meets all criteria—for example, if a property has high cardinality but some values exceed 20 GB, or if the most common query filter involves multiple properties. A synthetic key concatenates two or more properties (e.g., /userId-date) to create a more granular partitioning scheme.

What is a hot partition and how do I fix it?

A hot partition is a logical partition that receives a disproportionately high amount of requests (reads or writes) compared to other partitions, causing throttling (HTTP 429) even when overall throughput is not exhausted. To fix it, you can: (1) Increase the provisioned RU/s to give the hot partition more capacity (temporary), (2) Change the partition key to one that distributes requests more evenly, (3) Use a synthetic partition key to spread the load, or (4) Implement caching or client-side load leveling to reduce peak requests.

Can I use a partition key that is an array or nested object?

No, the partition key path must resolve to a scalar value (string, number, boolean). Arrays and objects are not supported. The path is a JSON property path like '/address/city' that must point to a single scalar. If the property is missing, the item is stored under a special undefined partition key value, and all such items share one logical partition.

How does the partition key affect global distribution?

The partition key determines how data is distributed across partitions, and that layout is the same in every region: global distribution replicates each partition to all regions configured for the account. The partition key still affects the RU cost of each write, and the 20 GB logical partition limit applies in every region, because each region holds a full copy of the data.
