DP-900 | Chapter 26 of 101 | Objective 2.4

Cosmos DB Consistency Levels

This chapter covers Azure Cosmos DB's five consistency levels, a core concept for the DP-900 exam. These levels matter because the non-relational data domain, which includes Cosmos DB, makes up roughly 15-20% of the exam, and consistency is a frequent topic within it. You will learn what each level guarantees, how it impacts performance, and how to choose the right level for your application. The exam tests not just definitions but also trade-offs and real-world scenarios.

25 min read
Intermediate
Updated May 31, 2026

Cosmos DB Consistency: A Library Book System

Imagine a public library with multiple branches, each holding copies of the same book. The library system guarantees different levels of consistency for readers. At the strongest level (Strong), when you borrow a book from the main branch, all other branches immediately update their records—no one else can check out that book until you return it. This ensures everyone sees the same version, but it slows down the system because every branch must confirm the update. At the weakest level (Eventual), branches update independently. You might check out a book at one branch while another branch still shows it as available, leading to confusion. In between, there are levels like Bounded Staleness: the system guarantees that no branch is more than, say, 5 minutes behind the main branch. If you borrow a book, within 5 minutes all branches will know. Session consistency means that within your own library card session, you always see your own updates, but other readers might see older data until they refresh. Consistent Prefix ensures that if you read a series of updates, you always see them in order—no missing chapters. These levels mirror Cosmos DB's five consistency models, balancing read performance, write latency, and data coherence across globally distributed replicas.

How It Actually Works

What Are Consistency Levels and Why Do They Exist?

Cosmos DB is a globally distributed, multi-model database service. When you replicate data across multiple Azure regions, you face a fundamental trade-off between consistency, availability, and latency. The CAP theorem states that in the presence of a network partition, you must choose between consistency and availability. Cosmos DB offers five well-defined consistency levels to let you balance these factors according to your application's needs. Each level provides a specific guarantee about how and when replicas converge to the same data.

The Five Consistency Levels – Detailed Breakdown

#### 1. Strong Consistency

Strong consistency guarantees that a read operation returns the most recent write across all replicas. Internally, Cosmos DB uses a quorum-based replication protocol. For a write to be acknowledged, it must be committed to a majority of replicas (a write quorum). For a read to return the latest value, it must contact a read quorum that includes at least one replica that has the latest write. This ensures linearizability: once a write completes, all subsequent reads from any replica see that write.
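The overlap argument above can be checked by brute force. The sketch below is a generic quorum model, not Cosmos DB's actual replication code; the replica count of 4 and the quorum sizes are assumed purely for illustration.

```python
from itertools import combinations


def quorums_always_overlap(n_replicas: int, write_quorum: int, read_quorum: int) -> bool:
    """Return True if every write quorum shares a replica with every read quorum."""
    replicas = range(n_replicas)
    for write_set in combinations(replicas, write_quorum):
        for read_set in combinations(replicas, read_quorum):
            if not set(write_set) & set(read_set):
                return False  # this read could miss the latest write
    return True


# Hypothetical replica set of 4 with a write majority of 3.
print(quorums_always_overlap(4, 3, 2))  # True: 3 + 2 > 4, so the sets must intersect
print(quorums_always_overlap(4, 3, 1))  # False: a single-replica read can miss the write
```

The general rule is that a read is guaranteed to meet the latest write whenever the write quorum plus the read quorum exceeds the number of replicas.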

Default behavior: Strong is not the default. New accounts default to Session unless you choose another level.

Performance impact: Write latency is higher because every write must be acknowledged by a majority of replicas. Read latency is also higher because reads need to contact a quorum.

Availability: In the event of a regional outage, if the write quorum cannot be formed, writes may fail. Reads may also fail if a read quorum is unavailable.

Use case: Applications that require absolute data integrity, such as financial transactions or inventory systems where double-booking is unacceptable.

#### 2. Bounded Staleness

Bounded staleness guarantees that reads are not too far behind writes. You configure two parameters: the maximum staleness interval (in seconds) and the maximum number of stale versions (K). Cosmos DB ensures that a read returns data that is within these bounds. Internally, the system tracks the progress of replication and ensures that replicas lag no more than the configured staleness.

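Allowed values: You set both K and t when you choose this level. The maximum for t is 86,400 seconds (24 hours). Cosmos DB also enforces minimums: 10 operations and 5 seconds for a single-region account, and 100,000 operations and 300 seconds for a multi-region account.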

Performance: Writes are acknowledged after a local quorum (not a global quorum), so write latency is lower than strong. Reads may be served from any replica that is within the staleness bound, so read latency is also lower.

Use case: Applications that need near-strong consistency but can tolerate a small delay, such as social media feeds or news updates.

#### 3. Session Consistency

Session consistency is the most commonly used level for applications with a user context. It guarantees monotonic reads, monotonic writes, read-your-writes, and write-follows-reads within a single client session. Internally, Cosmos DB uses a session token that the client sends with each request. The server uses this token to ensure that the client sees its own writes and reads in a consistent order.

Session token: A string that encodes the progress of the session. The SDK tracks the token and sends it with subsequent requests; a client that calls the REST API directly must manage it itself.

Performance: Writes are acknowledged after a local quorum, and reads are served from any replica that has the session token's progress. This provides low latency and high availability.

Use case: Any application with user-specific data, such as e-commerce carts, user profiles, or gaming states.

#### 4. Consistent Prefix

Consistent prefix guarantees that reads never see out-of-order writes. If you perform writes A, B, C in that order, any read will see A, A+B, or A+B+C, but never A+C or B+A. This level does not guarantee that you see the latest write, only that the sequence is preserved.
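The A, B, C rule can be expressed as a one-line check. This is a minimal sketch with a hypothetical helper name, `is_consistent_prefix`; it models the guarantee itself, not how Cosmos DB enforces it.

```python
from typing import Sequence


def is_consistent_prefix(write_order: Sequence[str], observed: Sequence[str]) -> bool:
    """True if the observed writes are exactly the first len(observed) writes, in order."""
    return list(observed) == list(write_order[: len(observed)])


writes = ["A", "B", "C"]
print(is_consistent_prefix(writes, ["A"]))            # True
print(is_consistent_prefix(writes, ["A", "B"]))       # True
print(is_consistent_prefix(writes, ["A", "C"]))       # False: B is skipped
print(is_consistent_prefix(writes, ["B", "A"]))       # False: out of order
```

An empty read also passes the check, which matches the point that this level promises order, not freshness.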

Performance: Writes are acknowledged after a local quorum. Reads are served from any replica, but the replica must return data in order. This provides low latency and high availability.

Use case: Applications where order matters but not absolute freshness, such as chat messages or event sourcing.

#### 5. Eventual Consistency

Eventual consistency is the weakest level. It guarantees that if no new writes are made, replicas will eventually converge to the same state. There is no time bound or ordering guarantee. Reads may return stale data, and different replicas may return different values at the same time.
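To make "eventually converge" concrete, here is a toy simulation. It is an illustrative model only: the function names, the one-write-per-round rule, and the fixed random seed are all assumptions, and real replication is far more involved.

```python
import random


def sync_round(primary: set, replicas: list, rng: random.Random) -> None:
    """Each lagging replica receives one missing write, in no particular order."""
    for held in replicas:
        missing = sorted(primary - held)
        if missing:
            held.add(rng.choice(missing))


def rounds_to_converge(primary: set, replicas: list, seed: int = 0) -> int:
    """Run sync rounds (with no new writes) until every replica matches the primary."""
    rng = random.Random(seed)
    rounds = 0
    while any(held != primary for held in replicas):
        sync_round(primary, replicas, rng)
        rounds += 1
    return rounds


primary = {"A", "B", "C"}
replicas = [{"A"}, set(), {"A", "B", "C"}]
print(rounds_to_converge(primary, replicas))        # 3: the empty replica needs three rounds
print(all(held == primary for held in replicas))    # True
```

Before the final round the replicas hold different subsets of the writes, which is exactly the window in which two readers can get different answers.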

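Performance: Lowest read latency because a read can be served by any single replica without waiting for it to catch up. Writes are still acknowledged after a local quorum, as with the other relaxed levels. Highest availability because any replica can serve reads.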

Use case: Applications that tolerate stale data, such as product catalogs, ratings, or non-critical logs.

How Consistency Levels Are Implemented

Within each region, Cosmos DB keeps several replicas of every partition and commits each write to a majority of them (a quorum). The consistency level determines how far a write must propagate before it is acknowledged and how many replicas a read must consult.

Strong: Write quorum = local majority, and the write is also replicated to the other regions before it is acknowledged. Read quorum = more than one replica, so the read overlaps the write quorum. This ensures linearizability.

Bounded Staleness: Write quorum = local majority (within the region). Read quorum = more than one replica, with results kept within the staleness bounds.

Session: Write quorum = local majority. Read quorum = any single replica that has caught up to the session token.

Consistent Prefix: Write quorum = local majority. Read quorum = any single replica, but order is preserved.

Eventual: Write quorum = local majority. Read quorum = any single replica.

Configuration and Verification

You can set the default consistency level at the Cosmos DB account level using the Azure portal, CLI, or PowerShell. You can also override it per request using the consistency level header.

Using Azure CLI:

az cosmosdb update --name mycosmosdb --resource-group myrg --default-consistency-level Session

Using .NET SDK:

CosmosClient client = new CosmosClient(connectionString, new CosmosClientOptions() { ConsistencyLevel = ConsistencyLevel.Session });

Verification:

You can verify the current consistency level in the Azure portal under the "Default consistency" blade. The portal also shows the latency and throughput impact for each level.

Interaction with Related Technologies

Consistency levels affect Cosmos DB's multi-region writes, conflict resolution, and change feed. For multi-region writes, conflicts can arise. Cosmos DB provides automatic conflict resolution with last-writer-wins (using a timestamp) or custom conflict resolution policies. The consistency level influences how conflicts are detected and resolved. The change feed is ordered by the timestamp of writes, but the consistency level affects how quickly changes appear in the feed.
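Last-writer-wins resolution can be sketched in a few lines. This is a simplified model rather than the service's implementation: it assumes each conflicting version carries a numeric `_ts` timestamp property, and the documents and region names are made up for illustration.

```python
def last_writer_wins(versions):
    """Pick the conflicting version with the highest timestamp."""
    return max(versions, key=lambda doc: doc["_ts"])


# Two regions accepted a write to the same item at nearly the same time.
conflict = [
    {"id": "cart-1", "region": "West US", "items": 2, "_ts": 1700000005},
    {"id": "cart-1", "region": "North Europe", "items": 3, "_ts": 1700000009},
]

winner = last_writer_wins(conflict)
print(winner["region"])  # North Europe
print(winner["items"])   # 3
```

The losing write is discarded, which is why a custom policy is worth considering when both updates carry information you need to keep.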

Performance Trade-offs Summary

Strong: Highest write latency (each write waits for replication to other regions, so the cost grows with the distance between them), lowest availability during partitions.

Bounded Staleness: Low write latency (writes commit locally), high availability, with staleness capped by K and t.

Session: Low latency, high availability, best for user-centric apps.

Consistent Prefix: Low latency, high availability, but no freshness guarantee.

Eventual: Lowest latency, highest availability, but no ordering or freshness.

Key Terms for the Exam

Linearizability: The property that operations appear to occur in a single, global order.

Quorum: The minimum number of replicas that must agree on a write or read.

Session token: A string used to maintain session consistency.

Staleness bound: The maximum time or number of versions a replica can lag behind.

Multi-master: Cosmos DB allows multiple regions to accept writes, requiring conflict resolution.

Exam Trap: Confusing Consistency Levels

Candidates often confuse Bounded Staleness with Session. Remember: Bounded Staleness is a global guarantee across all clients, while Session is per-client. Also, Strong consistency is not available for multi-region write accounts (multi-master). If you enable multi-region writes, you cannot use Strong consistency. The exam may test this by asking which consistency level is not supported with multiple write regions.
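The multi-region write restriction can be written down as a simple validation rule. The function below is a hypothetical helper for study purposes, not part of any Azure SDK.

```python
def validate_account(default_level: str, multi_region_writes: bool) -> str:
    """Reject the one combination the text rules out: Strong plus multiple write regions."""
    if multi_region_writes and default_level == "Strong":
        raise ValueError("Strong consistency is not supported with multiple write regions")
    return default_level


print(validate_account("Strong", multi_region_writes=False))   # Strong
print(validate_account("Session", multi_region_writes=True))   # Session

try:
    validate_account("Strong", multi_region_writes=True)
except ValueError as err:
    print(err)  # Strong consistency is not supported with multiple write regions
```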

Walk-Through

1. Select Consistency Level for Account

When creating a Cosmos DB account, you choose a default consistency level. This level applies to all operations unless overridden per request. The Azure portal, CLI, or ARM templates allow you to set this. For multi-region write accounts, Strong is not available. The default is Session for new accounts. You must understand the trade-offs: Strong ensures linearizability but reduces availability and increases latency. Eventual offers best performance but no guarantees.

2. Configure Bounded Staleness Parameters

If you choose Bounded Staleness, you must set two parameters: maximum staleness interval (in seconds) and maximum staleness versions (K). These define the acceptable lag. The system ensures that no replica is behind by more than t seconds or K operations, whichever is stricter. For example, if you set t=5 seconds and K=10, a read will never see data older than 5 seconds or more than 10 writes behind. This is a global guarantee across all clients.
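The t=5, K=10 example can be turned into a small check. This is a sketch of the rule as stated above, with a hypothetical function name; it treats a replica as acceptable only when it satisfies both bounds.

```python
def within_staleness_bound(lag_versions: int, lag_seconds: float,
                           max_versions: int, max_seconds: float) -> bool:
    """True if a replica lags by no more than K versions and no more than t seconds."""
    return lag_versions <= max_versions and lag_seconds <= max_seconds


K, T = 10, 5  # bounds from the example in the text

print(within_staleness_bound(3, 2.0, K, T))    # True: inside both bounds
print(within_staleness_bound(12, 1.0, K, T))   # False: more than 10 writes behind
print(within_staleness_bound(4, 6.5, K, T))    # False: more than 5 seconds behind
```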

3. Manage Session Tokens in Client Code

For Session consistency, the client SDK automatically manages session tokens. The token is returned in the response header 'x-ms-session-token' and must accompany subsequent requests in the same session. If the token is lost, the client may read stale data. If you use a custom HTTP client instead of the SDK, you must manage the token yourself. Session consistency ensures read-your-writes and monotonic reads within the session.
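The role of the token is easier to see in a toy model. The sketch below is an assumption-laden simplification: real session tokens are opaque strings, whereas here the token is just an integer sequence number, and the `Replica` and `SessionClient` classes are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    lsn: int      # highest write sequence number this replica has applied
    value: str    # current value of the item on this replica


class SessionClient:
    def __init__(self) -> None:
        self.session_token = 0  # highest sequence number this session has observed

    def write(self, primary: Replica, value: str) -> None:
        primary.lsn += 1
        primary.value = value
        self.session_token = primary.lsn  # remember our own write

    def read(self, replicas: list) -> str:
        for replica in replicas:
            if replica.lsn >= self.session_token:   # replica has caught up to us
                self.session_token = replica.lsn    # never go backwards afterwards
                return replica.value
        raise RuntimeError("no replica has caught up to the session token")


primary = Replica(lsn=0, value="")
lagging = Replica(lsn=0, value="")

shopper = SessionClient()
shopper.write(primary, "cart: 1 item")
print(shopper.read([lagging, primary]))          # cart: 1 item (lagging replica is skipped)

stranger = SessionClient()                       # a different session with no token
print(repr(stranger.read([lagging, primary])))   # '' (stale read from the lagging replica)
```

The second client shows what losing the token means in practice: without it, nothing stops a read from landing on a replica that has not yet seen the write.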

4. Override Consistency per Request

You can override the default consistency for individual requests using the 'x-ms-consistency-level' header. Allowed values are 'Strong', 'BoundedStaleness', 'Session', 'ConsistentPrefix', and 'Eventual'. Overrides can only relax the guarantee, so they suit operations that can trade freshness for lower latency. For example, an account that defaults to Session might read a product catalog with Eventual. You cannot request a stronger level than the account default; for instance, if the account is set to Eventual, you cannot request Strong.
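The rule that a request cannot ask for more than the account provides reduces to an ordering check. This is a minimal sketch with a hypothetical `can_override` helper; the level names are the header values listed in the text.

```python
# Ordered from weakest to strongest.
LEVELS = ["Eventual", "ConsistentPrefix", "Session", "BoundedStaleness", "Strong"]


def can_override(account_default: str, requested: str) -> bool:
    """A per-request level is allowed only if it is no stronger than the account default."""
    return LEVELS.index(requested) <= LEVELS.index(account_default)


print(can_override("Session", "Eventual"))   # True: weaker than the default
print(can_override("Session", "Session"))    # True: equal to the default
print(can_override("Eventual", "Strong"))    # False: stronger than the default
```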

5. Monitor Consistency Impact via Metrics

Azure Monitor provides metrics for replication latency and request latency. For Bounded Staleness, you can monitor the 'Replication Latency' metric to check that replication between regions stays within your staleness bounds. For Strong consistency, high write latency may indicate quorum issues. Use these metrics to tune your consistency level. The exam may ask which metric to use for monitoring staleness.

What This Looks Like on the Job

Scenario 1: Global E-commerce Platform

A large e-commerce company uses Cosmos DB to store product inventory across multiple Azure regions. They want near-strong consistency for inventory counts to limit overselling, but they also want low latency for customers worldwide. They chose Bounded Staleness with the tightest bounds a multi-region account accepts (300 seconds and 100,000 operations). This caps how far any region can fall behind while keeping write latency low. In production, they monitor replication lag and have alerts if lag exceeds 1 second. Misconfiguration: if they set the bounds much looser (e.g., an hour), customers might see outdated inventory for longer during a replication slowdown and attempt to buy items that are out of stock, leading to cancellations and poor experience.

Scenario 2: Social Media Feed

A social media app uses Cosmos DB for user posts and feeds. They need fast reads and writes, and can tolerate some staleness. They use Eventual consistency for the feed reads because users don't need to see every post instantly. However, for the user's own posts, they use Session consistency to ensure read-your-writes. This is achieved by setting the account default to Session and relaxing feed reads to Eventual per request, because a request can weaken the account default but not strengthen it. In production, they handle millions of requests per second. If they used Strong consistency, latency would be too high and the app would feel sluggish. Misconfiguration: if they set the account default to Eventual, they could not raise reads of a user's own posts to Session, and users might not see their own posts immediately, causing confusion.

Scenario 3: Financial Trading System

A financial trading platform requires absolute consistency for trade orders. They use Strong consistency for order placement and confirmation. However, they also need high availability. They deploy Cosmos DB in a single region with multiple replicas to avoid the cross-region latency and reduced availability that Strong consistency brings in a multi-region account. They accept higher write latency (around 10 ms) for the guarantee. They also use Consistent Prefix for historical trade data where order matters but not real-time freshness. Misconfiguration: if they mistakenly used Eventual for order placement, trades could be duplicated or lost, leading to financial loss.

How DP-900 Actually Tests This

The DP-900 exam tests your ability to differentiate between the five consistency levels and understand their trade-offs. Objective 2.4 specifically covers "describe consistency models for Azure Cosmos DB." Expect 2-3 questions on this topic. The most common wrong answers come from confusing Session with Bounded Staleness or thinking that Strong is available with multi-region writes.

Common Wrong Answers:

1. "Strong consistency ensures the highest availability." Wrong: Strong reduces availability because it requires a quorum. Eventual offers highest availability.

2. "Session consistency guarantees global ordering." Wrong: Session only guarantees within a session. Consistent Prefix guarantees global ordering.

3. "Bounded Staleness is the same as Eventual with a time limit." Wrong: Bounded Staleness has a strict bound; Eventual has no bound.

4. "You can set Strong consistency on a multi-region write account." Wrong: Strong is not supported with multiple write regions.

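Numbers and Terms to Memorize:

- Maximum staleness interval (t): up to 86,400 seconds (24 hours); at least 5 seconds for a single-region account and 300 seconds for a multi-region account
- Maximum staleness versions (K): at least 10 for a single-region account and 100,000 for a multi-region account
- Default consistency level: Session (for new accounts)
- Session token header: x-ms-session-token
- Consistency level header: x-ms-consistency-level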

Edge Cases:

- Setting Bounded Staleness with t=0 and K=0 would mean no staleness is allowed, which is effectively Strong. Cosmos DB does not allow this configuration; both bounds must be positive and meet the service minimums.
- If you use Session consistency but lose the session token, you might read stale data. The exam may ask what happens if the token is not provided.

How to Eliminate Wrong Answers:

- If the question mentions "read-your-writes" or "within a session," the answer is Session.
- If it mentions "no bound" or "eventually consistent," answer Eventual.
- If it mentions "ordered writes" but not "latest," answer Consistent Prefix.
- If it mentions a time bound or version bound, answer Bounded Staleness.
- If it mentions "linearizability" or "strongest," answer Strong.
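As a study aid, the elimination rules above can be captured in a lookup table. This is only a mnemonic sketch: the clue phrases come from the list above, the `guess_level` helper is hypothetical, and real exam questions need more than keyword matching.

```python
# (clue phrase, consistency level) pairs, checked in order.
CLUES = [
    ("read-your-writes", "Session"),
    ("within a session", "Session"),
    ("no bound", "Eventual"),
    ("eventually consistent", "Eventual"),
    ("ordered writes", "Consistent Prefix"),
    ("time bound", "Bounded Staleness"),
    ("version bound", "Bounded Staleness"),
    ("linearizability", "Strong"),
    ("strongest", "Strong"),
]


def guess_level(question: str):
    """Return the level suggested by the first matching clue, or None if nothing matches."""
    text = question.lower()
    for clue, level in CLUES:
        if clue in text:
            return level
    return None


print(guess_level("Which level guarantees read-your-writes?"))            # Session
print(guess_level("Reads must respect ordered writes, freshness aside"))  # Consistent Prefix
print(guess_level("Which level offers linearizability?"))                 # Strong
```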

Key Takeaways

Cosmos DB offers five consistency levels: Strong, Bounded Staleness, Session, Consistent Prefix, and Eventual.

Strong consistency requires a quorum for both reads and writes, increasing latency and reducing availability.

Bounded Staleness allows you to configure maximum staleness in seconds (t, up to 86,400) and in versions (K); multi-region accounts require at least 300 seconds and 100,000 versions.

Session consistency guarantees read-your-writes and monotonic reads within a single session using a session token.

Consistent Prefix ensures reads never see out-of-order writes, but does not guarantee freshness.

Eventual consistency provides the lowest latency and highest availability, but no ordering or freshness guarantees.

You cannot set Strong consistency on a Cosmos DB account configured with multiple write regions.

The default consistency level for new Cosmos DB accounts is Session.

You can override consistency per request using the 'x-ms-consistency-level' header, but cannot request a stronger level than the account default.

Session tokens are returned in the 'x-ms-session-token' response header and must be included in subsequent requests for session consistency.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Strong Consistency

Guarantees linearizability: reads always return the latest write.

Highest write latency due to quorum requirements.

Lower availability during partitions because quorum may be lost.

Cannot be used with multi-region writes.

Suitable for financial transactions or inventory systems.

Eventual Consistency

No guarantee of recency; reads may return stale data.

Lowest write and read latency.

Highest availability: any replica can serve requests.

Works with multi-region writes.

Suitable for non-critical data like logs or product catalogs.

Bounded Staleness

Global guarantee across all clients within staleness bounds.

Requires configuration of K and t parameters.

Provides a bounded delay on data freshness.

Writes acknowledged after local quorum.

Good for near-strong consistency with lower latency.

Session Consistency

Per-client guarantee: read-your-writes, monotonic reads/writes.

Uses session token managed by client SDK.

No global ordering; different sessions may see different data.

Writes acknowledged after local quorum.

Ideal for user-specific data like shopping carts.

Watch Out for These

Mistake

Strong consistency is always the best choice because it guarantees the most recent data.

Correct

Strong consistency reduces availability and increases latency. It is not suitable for all applications, especially those requiring high availability or low latency. It also cannot be used with multi-region writes.

Mistake

Session consistency guarantees that all users see the same data at the same time.

Correct

Session consistency only guarantees consistency within a single client session. Different sessions may see different versions of data. It does not provide global consistency.

Mistake

Eventual consistency means data is never consistent.

Correct

Eventual consistency guarantees that if no new writes occur, replicas will eventually converge. It does not mean data is always inconsistent; it just doesn't guarantee when.

Mistake

Bounded Staleness is the same as Strong but with a tolerance for staleness.

Correct

Bounded Staleness is weaker than Strong because reads may not return the latest write if the write is within the staleness bound. Strong always returns the latest write.

Mistake

You can set a per-request consistency level stronger than the account default.

Correct

You cannot request a stronger consistency than the account allows. For example, if the account is set to Eventual, you cannot request Strong per request. You can only request weaker or equal levels.


Frequently Asked Questions

What is the difference between Strong and Bounded Staleness consistency in Cosmos DB?

Strong consistency guarantees that every read returns the most recent write across all replicas, ensuring linearizability. Bounded Staleness allows reads to return data that is not the absolute latest but is within a configurable time or version lag. For example, with Bounded Staleness set to 5 seconds, a read may return data up to 5 seconds old. Strong has higher latency and lower availability because it requires a quorum for reads and writes, while Bounded Staleness uses a local quorum for writes and any replica within bounds for reads. Exam tip: If a question mentions a configurable lag, it's Bounded Staleness; if it mentions the most recent write without exception, it's Strong.

Can I use Strong consistency with multi-region writes in Cosmos DB?

No. Strong consistency is not supported when multiple write regions are enabled. Strong consistency would require every write to be coordinated across all write regions, which would introduce unacceptable latency and availability trade-offs. If you need Strong consistency, you must use a single write region. The exam often tests this limitation. With multi-region writes, the strongest level you can select is Bounded Staleness.

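How does session consistency work in Cosmos DB?

Session consistency scopes its guarantees to a single client session. Each response carries a session token in the 'x-ms-session-token' header, and the client sends that token with later requests so that reads are served only by a replica that has caught up to it. Within the session this gives read-your-writes, monotonic reads, monotonic writes, and write-follows-reads. Other sessions may briefly see older data. The SDK manages the token automatically; a custom HTTP client must manage it itself.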

What happens if I set Bounded Staleness with K=0 and t=0?

Cosmos DB rejects this configuration. Both K and t are required for Bounded Staleness, and each must meet the service minimum: 10 operations and 5 seconds for a single-region account, or 100,000 operations and 300 seconds for a multi-region account. Zero staleness would amount to Strong consistency, which you select as its own level. The exam may ask about valid configurations.

Which consistency level should I choose for a global social media feed?

For a social media feed, Eventual consistency is often sufficient because users do not require real-time accuracy. However, for the user's own posts, you might use Session consistency to ensure they see their own posts immediately. Because a per-request override can only weaken the account default, set the default to Session and relax feed reads to Eventual per request. This balances performance and user experience. The exam may present a scenario and ask you to recommend a consistency level.

What is the difference between Consistent Prefix and Eventual consistency?

Consistent Prefix guarantees that reads never see out-of-order writes. For example, if writes A, B, C occur in that order, a read will see A, A+B, or A+B+C, but never A+C or B+A. Eventual consistency has no ordering guarantee; a read could see B before A. Consistent Prefix is stronger than Eventual but weaker than Session. It is useful for scenarios where order matters, like chat messages or event sourcing.

How can I monitor replication lag for Bounded Staleness?

Azure Monitor provides the 'Replication Latency' metric for Cosmos DB accounts with more than one region. You can set alerts to notify you if replication latency approaches your configured staleness bound. This helps ensure that your consistency guarantees are being met. The exam may ask which metric to use to monitor staleness.
