DP-900Chapter 47 of 101Objective 2.4

Cosmos DB Partitioning Strategies

This chapter covers Cosmos DB partitioning strategies, a core concept for understanding how Cosmos DB achieves horizontal scaling and high performance. For the DP-900 exam, approximately 15-20% of questions touch on partitioning, including partition key selection, logical vs. physical partitions, and the impact on throughput and storage. You will learn the internal mechanics of partitioning, how to choose an effective partition key, and common pitfalls that lead to hot partitions and throttling.

25 min read
Intermediate
Updated May 31, 2026

Library Shelving with Multiple Floors

Imagine a massive library with millions of books. The library is divided into floors, each floor representing a physical partition. Within each floor, books are arranged on shelves by a specific attribute, such as the author's last name—this is the partition key. When a patron requests a book by 'Smith, John', the librarian doesn't search the entire library. Instead, she calculates the floor based on the partition key (e.g., floor number = hash('Smith') % 10) and goes directly to that floor. On that floor, she looks up the exact shelf using the partition key value. If the library adds more floors (scale out), the hash function redistributes books across new floors. However, if too many patrons request books by the same author (e.g., 'Smith'), one floor gets overloaded—this is a hot partition. The library's efficiency depends on choosing a partition key that spreads requests evenly. Unlike a single-floor library where you must search every shelf, this multi-floor design allows parallel access and faster retrieval, but only if the partition key is well-chosen.

How It Actually Works

What is Partitioning and Why Does Cosmos DB Use It?

Partitioning is the mechanism by which Cosmos DB distributes data across multiple physical servers (or nodes) to achieve horizontal scaling. Without partitioning, a single node would become a bottleneck as data grows, limiting storage and throughput. Cosmos DB uses automatic partitioning based on a partition key that you specify when creating a container. This key is a property in each document (e.g., /userId, /deviceId). The partition key value determines which logical partition the document belongs to. Cosmos DB then maps logical partitions to physical partitions, each hosted on a separate node.

How Partitioning Works Internally

When you insert a document, Cosmos DB computes a hash of the partition key value. The hash range is divided into ranges, each assigned to a physical partition. The document is stored on the physical partition responsible for that hash range. All documents with the same partition key value end up in the same logical partition, which resides on a single physical partition. This is crucial: a logical partition cannot span multiple physical partitions. Physical partitions can host multiple logical partitions.

Key Components and Values

Partition Key: A property you define at container creation. It can be a single property (e.g., /city) or a synthetic key (e.g., /userId). Max size of partition key value is 2 KB.

Logical Partition: A group of documents sharing the same partition key value. Maximum size of a logical partition is 20 GB (for Azure Cosmos DB for NoSQL).

Physical Partition: A replica set of servers that stores a subset of data. Each physical partition has a maximum storage of 50 GB (for most APIs) and up to 10,000 RU/s of throughput.

Request Unit (RU): The cost of operations. 1 RU = 1 KB read. Partitioning affects RU consumption because cross-partition queries cost more (at least 2.5 RU per partition touched).

Hash Function: Cosmos DB uses a consistent hash algorithm to distribute partition key values across physical partitions. The hash range is 0 to 2^64-1.

Choosing a Partition Key: Best Practices

1.

High Cardinality: The partition key should have many distinct values (e.g., userId for a user table) to distribute data evenly. Avoid keys with few values (e.g., status with only 'active' and 'inactive').

2.

Even Request Distribution: Choose a key that spreads read and write operations uniformly. A key that causes all writes to go to one partition (e.g., date for a time-series with mostly today's data) creates a hot partition.

3.

Transactional Boundaries: For stored procedures and triggers, all documents involved must share the same partition key value. If you need transactions across multiple entities, design the key to group them (e.g., /userId for orders and order details).

4.

Storage Limits: Ensure no single partition key value exceeds 20 GB of data. For example, if you store all documents under a single key /region = 'US', that logical partition may hit the 20 GB limit.

Impact on Throughput and Queries

When you provision throughput at the container level (e.g., 400 RU/s), Cosmos DB distributes this throughput across all physical partitions. Each physical partition gets an equal share. For example, if you have 4 physical partitions and 4000 RU/s, each partition gets 1000 RU/s. If one logical partition receives more requests than its host physical partition's share, requests get throttled (HTTP 429).

Queries that specify the partition key in the WHERE clause (e.g., WHERE userId = '123') are point queries and are routed to a single physical partition, costing minimal RUs. Queries without the partition key are cross-partition queries and must fan out to all physical partitions, increasing RU cost and latency.

Step-by-Step: How a Write Operation is Partitioned

1.

Client sends a write request with a document containing the partition key property.

2.

Cosmos DB SDK extracts the partition key value (e.g., userId = 'abc').

3.

SDK computes the hash of the value and determines which physical partition's hash range it falls into.

4.

SDK routes the request directly to that physical partition (gateway or direct mode).

5.

The physical partition stores the document in the logical partition for that key value.

6.

The write is replicated to all replicas within the physical partition (consistency level dependent).

Changing the Partition Key

Once a container is created, you cannot change the partition key. To use a different key, you must create a new container and migrate data using Azure Data Factory or change feed. This is a common exam trap: candidates think you can update it later.

Synthetic Partition Keys

If your natural data doesn't have a high-cardinality key, you can create a synthetic key by concatenating multiple properties. For example, /userId + /date (e.g., user123_2023-01-01). This increases cardinality and spreads writes evenly. However, it complicates queries because you must know the full key.

Partitioning in Different Cosmos DB APIs

NoSQL API: Partition key is a required property. You can use any JSON property path (e.g., /address/city).

MongoDB API: Partition key is defined as a shard key (e.g., shard key: { userId: 1 }). Similar behavior.

Cassandra API: Partition key is the first part of the primary key (e.g., PRIMARY KEY ((user_id), timestamp)).

Gremlin API: Partition key is a property on vertices/edges (e.g., /partitionKey).

Table API: Partition key is the PartitionKey property, required.

Monitoring and Troubleshooting

Azure Monitor Metrics: Use Normalized RU Consumption per partition key range to detect hot partitions. If one partition consistently shows high utilization (e.g., >90%), it's a hot partition.

Diagnostic Logs: Enable logging to see partition key statistics and throttling events.

Change Feed: Can be used to process data per logical partition.

Exam Focus: Common Misconceptions

Myth: Partition key can be changed after container creation. Reality: It cannot; you must migrate data.

Myth: More partitions always improve performance. Reality: Too many small partitions can increase overhead; Cosmos DB auto-scales partitions as needed.

Myth: Partition key must be unique across documents. Reality: It can have duplicates; all duplicates go to the same logical partition.

Myth: Cross-partition queries are always bad. Reality: They are acceptable for analytics but cost more RUs.

Interaction with Throughput and Consistency

Provisioned throughput is shared across physical partitions. If you have 10,000 RU/s and 5 physical partitions, each gets 2,000 RU/s. A hot partition can only use its share, leading to throttling even if other partitions are idle.

Autoscale throughput adjusts allocation but still per physical partition.

Consistency levels affect replication within a physical partition but not partitioning.

Summary of Key Values

Max logical partition size: 20 GB (NoSQL API)

Max physical partition storage: 50 GB (varies by API)

Max partition key value size: 2 KB

Default number of physical partitions: starts at 1, grows as data or throughput increases.

RU cost for cross-partition query: at least 2.5 RU per partition touched.

Code Example: Creating a Container with Partition Key (C# SDK)

var container = await database.CreateContainerIfNotExistsAsync(
    id: "users",
    partitionKeyPath: "/userId",
    throughput: 400);

CLI Command to Create Container

az cosmosdb sql container create \
    --resource-group myGroup \
    --account-name myAccount \
    --database-name myDB \
    --name users \
    --partition-key-path "/userId" \
    --throughput 400

Walk-Through

1

Select Partition Key Path

Before creating a container, you must choose a partition key (e.g., `/userId`). This decision is irreversible. The key should have high cardinality (many distinct values) and distribute requests evenly. Avoid keys that cause hot partitions, such as timestamps for write-heavy workloads. The partition key path is a JSON property path, limited to 2 KB per value. You can also use a synthetic key by concatenating multiple properties.

2

Create Container with Partition Key

Use the Azure portal, SDK, or CLI to create the container specifying the partition key path. For example, `az cosmosdb sql container create --partition-key-path "/userId"`. The container is created with a default number of physical partitions (usually 1). As data and throughput increase, Cosmos DB automatically splits physical partitions. You cannot change the partition key later.

3

Insert Documents with Partition Key

When inserting a document, you must include the partition key property. The SDK hashes the value to determine the target physical partition. For example, a document with `"userId": "abc123"` will be routed to a specific partition. All documents with the same userId share the same logical partition, which cannot exceed 20 GB. The write operation consumes RUs based on document size and indexing.

4

Query with Partition Key

To optimize queries, include the partition key in the WHERE clause (e.g., `WHERE userId = 'abc123'`). This is a point query, routed to a single physical partition, costing minimal RUs. Queries without the partition key become cross-partition queries, fanning out to all partitions. Cross-partition queries use more RUs (minimum 2.5 RU per partition) and may have higher latency.

5

Monitor and Adjust

Use Azure Monitor metrics like `Normalized RU Consumption` per partition key range to detect hot partitions. If one partition consumes >90% of its throughput, consider redesigning the partition key. You might need to create a new container with a better key and migrate data using change feed or Azure Data Factory. Also monitor logical partition size to avoid hitting the 20 GB limit.

What This Looks Like on the Job

Scenario 1: E-commerce Order Management

A company stores customer orders in Cosmos DB. Initially, they used /customerId as the partition key. This worked well because each customer has many orders, and cardinality is high. However, they noticed that when a VIP customer placed many orders simultaneously (e.g., during a flash sale), the partition for that customer became a hot spot, throttling requests. They mitigated by using a synthetic key /customerId_orderDate to spread writes across multiple logical partitions per customer. Queries that need all orders for a customer now require a cross-partition query, but the trade-off is acceptable for write-heavy workloads. They also set up autoscale throughput to handle spikes.

Scenario 2: IoT Telemetry Data

An IoT company ingests sensor readings from millions of devices. They chose /deviceId as the partition key, which distributes data evenly because each device sends data independently. However, they also need to query data across all devices for a specific time range. Without the partition key in queries, they face cross-partition queries that are costly. To optimize, they use a secondary index on timestamp and run queries that include deviceId when possible. For global analytics, they use Azure Synapse Link to run analytical queries without impacting transactional throughput.

Scenario 3: Multi-tenant SaaS Application

A SaaS provider stores tenant data in a single Cosmos DB container. They use /tenantId as the partition key. This ensures that all data for a tenant is in one logical partition, enabling transactional stored procedures within a tenant. However, a large tenant's data grew to 18 GB, approaching the 20 GB limit. They had to split the tenant across multiple logical partitions by using a composite key /tenantId_entityType. This required application changes to handle cross-partition transactions. They also faced throttling when the large tenant consumed more than its physical partition's share of throughput. They solved it by increasing provisioned throughput and using autoscale.

Common Misconfiguration

A common mistake is choosing a partition key with low cardinality, such as /status with values 'active' and 'inactive'. This results in only two logical partitions, each potentially huge, and all writes go to one partition (e.g., new active records). This causes severe throttling. Another mistake is using a monotonically increasing value like a timestamp as the partition key, which creates a hot partition for recent data. The fix is to use a synthetic key that includes a hash or random suffix to distribute writes.

How DP-900 Actually Tests This

What DP-900 Tests on Partitioning

DP-900 objective 2.4: 'Describe the partitioning strategies in Azure Cosmos DB.' Specifically, you need to understand:

The purpose of partitioning (horizontal scaling)

What a partition key is and how it is used

The difference between logical and physical partitions

How to choose an effective partition key (high cardinality, even distribution)

The consequences of a poor partition key (hot partitions, throttling)

That partition key cannot be changed after container creation

The 20 GB logical partition limit (NoSQL API)

That cross-partition queries consume more RUs

Common Wrong Answers on the Exam

1.

'Partition key can be modified after container creation.' This is false. Many candidates think you can update it like any property, but Cosmos DB does not allow it. The correct answer: you must create a new container.

2.

'You should choose a partition key with few distinct values to improve performance.' Wrong. Low cardinality leads to large logical partitions and uneven distribution. High cardinality is correct.

3.

'All queries must include the partition key to avoid high RU costs.' Not exactly. While optimal, you can run cross-partition queries. The exam may ask about the impact: cross-partition queries cost more RUs.

4.

'Logical partitions can span multiple physical partitions.' False. A logical partition is always contained within a single physical partition.

Specific Numbers and Terms That Appear

20 GB: Maximum size of a logical partition in Azure Cosmos DB for NoSQL.

10,000 RU/s: Maximum throughput per physical partition (though this can vary).

2 KB: Maximum size of a partition key value.

Hash range: 0 to 2^64-1.

Point query: A query that includes the partition key in the filter.

Cross-partition query: A query that does not specify the partition key.

Edge Cases and Exceptions

Synthetic partition keys: The exam may ask about using concatenated properties to improve cardinality.

Partition key in different APIs: For MongoDB API, it's called a shard key; for Table API, it's PartitionKey.

Change feed: Can be used per logical partition.

Stored procedures: Require all documents to have the same partition key.

How to Eliminate Wrong Answers

If an answer says you can change the partition key, it's wrong.

If an answer suggests using a key with low cardinality (e.g., 'region' with few values), it's wrong.

If an answer claims that cross-partition queries are always prohibited, it's wrong; they are allowed but more expensive.

If an answer states that logical partitions are unlimited in size, it's wrong; 20 GB limit.

Always look for the option that emphasizes even distribution, high cardinality, and the immutability of the partition key.

Key Takeaways

Partition key is chosen at container creation and cannot be changed later.

Logical partition max size is 20 GB for NoSQL API; exceeding this causes errors.

High cardinality and even request distribution are essential for partition key selection.

Cross-partition queries cost at least 2.5 RU per physical partition touched.

Point queries (including partition key) are the most efficient.

Hot partitions occur when a single partition key value receives disproportionate traffic.

Synthetic partition keys can solve cardinality issues but complicate queries.

Monitor Normalized RU Consumption per partition key range to detect hot partitions.

Physical partitions auto-scale; you don't manage them directly.

Stored procedures require all documents to share the same partition key.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Single-Property Partition Key

Uses a single property from the document (e.g., /userId).

Simple to implement and query (point queries).

May lead to hot partitions if the property has low cardinality or uneven distribution.

Cannot exceed 20 GB per logical partition if a single value dominates.

Best for natural high-cardinality keys like user ID or device ID.

Synthetic Partition Key

Concatenates multiple properties (e.g., /userId_date).

Increases cardinality, reducing hot partitions.

Complicates queries because you must know the full key to do point queries.

Helps avoid 20 GB limit by spreading data across more logical partitions.

Useful for time-series data or multi-tenant scenarios with large tenants.

Watch Out for These

Mistake

The partition key can be changed after the container is created.

Correct

Once a container is created, the partition key path is immutable. To change it, you must create a new container and migrate the data using Azure Data Factory, change feed, or bulk executor.

Mistake

A partition key must be unique across all documents.

Correct

The partition key does not need to be unique. Multiple documents can have the same partition key value, and they will all be stored in the same logical partition.

Mistake

Choosing a partition key with low cardinality (few distinct values) improves performance.

Correct

Low cardinality leads to large logical partitions that can approach the 20 GB limit and cause uneven load distribution. High cardinality (many distinct values) is better for performance.

Mistake

Cross-partition queries are not allowed in Cosmos DB.

Correct

Cross-partition queries are allowed but consume more RUs (at least 2.5 RU per partition) and may have higher latency. They are not prohibited.

Mistake

Physical partitions are static and never split.

Correct

Cosmos DB automatically splits physical partitions as data grows or throughput increases. You do not manage them manually.

Do You Actually Know This?

Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.

Frequently Asked Questions

Can I change the partition key of an existing Cosmos DB container?

No, you cannot change the partition key after container creation. You must create a new container with the desired partition key and migrate data using Azure Data Factory, the change feed, or the bulk executor library. This is a common exam question.

What happens if a logical partition exceeds 20 GB?

If a logical partition exceeds 20 GB, Cosmos DB will return a 413 (Request Entity Too Large) error for writes to that partition. You must redesign the partition key to distribute data more evenly, e.g., using a synthetic key.

What is a hot partition and how do I fix it?

A hot partition is a logical partition that receives a disproportionate amount of requests, causing throttling. Fix it by choosing a partition key with higher cardinality or using a synthetic key. You can also increase throughput, but that may not solve the root cause.

What is the difference between logical and physical partitions?

A logical partition is a group of documents with the same partition key value. A physical partition is a server node that hosts multiple logical partitions. Logical partitions cannot span physical partitions, but physical partitions can contain many logical partitions.

How does the partition key affect query performance?

Queries that include the partition key in the WHERE clause (point queries) are routed to a single physical partition, consuming minimal RUs. Queries without the partition key become cross-partition queries, fanning out to all physical partitions, which increases RU consumption and latency.

What is a synthetic partition key?

A synthetic partition key is created by concatenating two or more properties, e.g., /userId_date. It increases cardinality and distributes writes evenly. However, it requires knowing the full key for point queries.

Is the partition key required for all Cosmos DB APIs?

Yes, all Cosmos DB APIs require a partition key (or shard key) for containers. For Table API, it's the PartitionKey property; for MongoDB API, it's the shard key; for Cassandra API, it's the partition key in the primary key.

Terms Worth Knowing

Ready to put this to the test?

You've just covered Cosmos DB Partitioning Strategies — now see how well it sticks with free DP-900 practice questions. Full explanations included, no account needed.

Done with this chapter?