This chapter covers key-value stores and in-memory caching in Azure, both of which are core data storage patterns tested on the DP-900 exam. You'll learn how key-value stores provide simple, high-speed lookups using a unique key, and how in-memory caches like Azure Cache for Redis accelerate data access by storing frequently used data in RAM. These topics appear in roughly 10-15% of exam questions, focusing on use cases, characteristics, and comparison with other data stores. Mastering this chapter will help you identify when to use key-value versus relational or document stores, and how caching improves performance.
Imagine a hotel lobby with a wall of small lockers, each with a key. Guests check in and are assigned a locker number. They store their room key in the locker, and the front desk keeps a master list mapping guest names to locker numbers. When a guest returns, they give their name, and the front desk instantly tells them which locker holds their key. This avoids the guest having to carry the key all day or wait for a new one. In-memory caching works like this: the cache (locker wall) stores frequently accessed data (keys) mapped by a lookup key (guest name). The main database is the hotel safe where all keys are stored permanently. When an application needs data, it first checks the cache (locker). If found (cache hit), it retrieves it instantly. If not (cache miss), it fetches from the database (safe), copies it into the cache for future use, and sets an expiration time (like checkout time). The cache has limited capacity—just like the hotel has limited lockers. When all lockers are full, the front desk must evict a key (using an eviction policy like LRU—least recently used) to make room for a new one. The front desk also sets a time limit (TTL) for each key; after that, the key is automatically removed, ensuring stale data isn't used. This system dramatically speeds up check-in and check-out for frequent guests, just as caching speeds up data access for frequently queried data.
What Are Key-Value Stores?
A key-value store is a non-relational database that stores data as a collection of key-value pairs. Each record is identified by a unique key, and the value can be a simple string, a complex object (like JSON), or even a binary blob. The key is used for all operations: get, put, delete. There is no schema, no query language beyond key-based operations, and no relationships between records. This simplicity enables extremely high throughput and low latency, often serving millions of read/write operations per second.
Key-value stores are optimized for lookup-heavy workloads. Examples include Amazon DynamoDB, Redis (when used as a persistent store), and Azure Cosmos DB (when configured with the Table API or using the core SQL API with a single partition key). On the DP-900 exam, you need to know that key-value stores are ideal for session management, user profiles, shopping carts, and caching layers.
How Key-Value Stores Work Internally
Internally, key-value stores use a hash table or a distributed hash table (DHT) to map keys to storage locations. When a write occurs, the key is hashed, and the value is stored on the node responsible for that hash range. Reads hash the key and retrieve the value directly. To ensure durability, writes are often replicated to multiple nodes. Some key-value stores (like Redis) offer in-memory storage with optional persistence, while others (like DynamoDB) store data on SSDs.
Key-value stores typically support the following operations:
GET(key): retrieve the value for a given key
PUT(key, value): insert or update a key-value pair
DELETE(key): remove a key-value pair
SCAN: iterate over keys (often limited to a partition)
The performance of a key-value store is measured in operations per second (OPS) and latency (sub-millisecond for in-memory, a few milliseconds for disk-based).
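The hashing and the four operations described above can be sketched in a few lines of Python. This is a toy illustration, not how any Azure service is implemented: the class name `PartitionedKVStore`, the partition count, and the use of SHA-256 are all assumptions made for the example, and plain dictionaries stand in for storage nodes.

```python
import hashlib


class PartitionedKVStore:
    """Toy key-value store that hashes each key to one of several partitions."""

    def __init__(self, partition_count=4):
        # Each partition is a plain dict standing in for one storage node.
        self.partitions = [{} for _ in range(partition_count)]

    def _partition_for(self, key):
        # A stable hash (not Python's per-process hash()) picks the partition.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put(self, key, value):
        # PUT: insert or overwrite the value stored under the key.
        self.partitions[self._partition_for(key)][key] = value

    def get(self, key):
        # GET: hash the key and read directly from the owning partition.
        return self.partitions[self._partition_for(key)].get(key)

    def delete(self, key):
        # DELETE: report whether a pair was actually removed.
        partition = self.partitions[self._partition_for(key)]
        if key in partition:
            del partition[key]
            return True
        return False

    def scan(self, partition_index):
        # SCAN: iterate over the keys of a single partition only.
        return sorted(self.partitions[partition_index])


store = PartitionedKVStore()
store.put("user:123", b'{"name": "Alice"}')
store.put("user:456", b'{"name": "Bob"}')
print(store.get("user:123"))
print(store.delete("user:456"), store.get("user:456"))
```

Note that the store never inspects the value: it is handed bytes and returns bytes, which is what keeps the lookup path so short.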
Key Components and Defaults
For Azure Cosmos DB Table API (a key-value store):
Partition key: required; determines data distribution; must be chosen carefully to avoid hot partitions.
Row key: unique within a partition; together with partition key forms the primary key.
Time-to-live (TTL): optional; can be set at item level or container level; default is no TTL.
Consistency levels: five levels (strong, bounded staleness, session, consistent prefix, eventual); session is default.
For Azure Cache for Redis (in-memory key-value store with caching):
Maxmemory policy: default is volatile-lru (evict keys with TTL set using LRU).
Time-to-live (TTL): can be set per key using EXPIRE command; no default TTL.
Persistence: RDB (snapshot) or AOF (append-only file); disabled by default.
Clustering: up to 10 shards; each shard is a primary with up to 2 replicas.
Configuration and Verification Commands
For Azure Cache for Redis, you can configure via Azure Portal or CLI. To set maxmemory policy:
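az redis update --name MyCache --resource-group MyGroup --set redisConfiguration.maxmemory-policy=allkeys-lru
To verify the resulting cache configuration:
az redis show --name MyCache --resource-group MyGroup --query redisConfiguration
For Cosmos DB Table API, set TTL via the portal or SDK. The .NET SDK call below upserts an entity; it does not set a TTL by itself. In the Cosmos DB .NET SDK, expiration is configured through the container's DefaultTimeToLive setting or a ttl property on the item:
// Skip returning the written item in the response to reduce payload size.
ItemRequestOptions options = new ItemRequestOptions { EnableContentResponseOnWrite = false };
// Insert the entity, or replace it if the same key already exists.
await container.UpsertItemAsync<TableEntity>(entity, new PartitionKey(partitionKey), requestOptions: options);
How Key-Value Stores Interact with Related Technologies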
Key-value stores often serve as the backend for caching layers. For example, Redis can be used as a cache in front of a relational database. The application first checks Redis; if not found, it queries the SQL database and populates Redis. This pattern reduces load on the primary database and improves read latency.
Key-value stores also integrate with content delivery networks (CDNs) to cache static assets, and with message queues (e.g., Azure Service Bus) to store state temporarily.
What Is In-Memory Caching?
In-memory caching stores data in the RAM of a dedicated cache server or within the application process. The primary benefit is speed: reading from RAM is orders of magnitude faster than reading from disk or SSD. In-memory caches are commonly used to store results of expensive queries, session data, and frequently accessed configuration.
Azure offers two main caching services:
Azure Cache for Redis: a managed Redis cache (open-source Redis or Redis Enterprise).
Azure Managed Cache (deprecated): use Redis instead.
Additionally, you can implement in-memory caching within an application using IMemoryCache in ASP.NET Core, but this is not a managed service.
How In-Memory Caching Works
When an application requests data, it first checks the cache. If the data is present (cache hit), it returns immediately. If not (cache miss), the application retrieves the data from the original source (database, API, etc.), stores it in the cache with a TTL, and returns it. The cache uses an eviction policy to remove old data when memory is full.
Common eviction policies:
LRU (Least Recently Used): evicts the item that hasn't been accessed for the longest time.
LFU (Least Frequently Used): evicts the item with the fewest accesses.
TTL-based: evicts items when their TTL expires.
Random: evicts a random item.
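As an illustration of the first policy in the list, here is a minimal LRU cache sketch in Python. It tracks exact access order with an `OrderedDict`; Redis itself uses a sampled approximation of LRU, so treat this as a model of the idea rather than of Redis internals. The class name `LRUCache` and the capacity of 2 are assumptions for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest access first, newest access last

    def get(self, key):
        if key not in self.items:
            return None  # cache miss
        self.items.move_to_end(key)  # a read counts as a recent use
        return self.items[key]

    def put(self, key, value):
        """Store the pair and return the evicted key, or None if nothing was evicted."""
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            evicted_key, _ = self.items.popitem(last=False)
            return evicted_key
        return None


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")            # "a" is now more recently used than "b"
print(cache.put("c", 3))  # prints b
```

An LFU variant would keep an access counter per key instead of an ordering, and a random policy would simply pick any key.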
Key Cache Patterns
Cache-Aside: Application code explicitly loads data into cache on miss. Most common pattern.
Read-Through: Cache layer automatically loads missing data from the database.
Write-Through: Data is written to cache and database simultaneously.
Write-Behind: Data is written to cache immediately and asynchronously to database.
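The difference between the two write patterns in this list is easiest to see side by side. The sketch below is a simplified model under stated assumptions: dictionaries stand in for both the cache and the database, the class names are hypothetical, and the write-behind flush is triggered manually rather than by a background worker.

```python
from collections import deque


class WriteThroughCache:
    """Every write goes to the database and the cache before returning."""

    def __init__(self, database):
        self.database = database
        self.cache = {}

    def write(self, key, value):
        self.database[key] = value  # durable copy first
        self.cache[key] = value     # then the fast copy


class WriteBehindCache:
    """Writes land in the cache at once and reach the database later."""

    def __init__(self, database):
        self.database = database
        self.cache = {}
        self.pending = deque()  # writes not yet persisted

    def write(self, key, value):
        self.cache[key] = value
        self.pending.append((key, value))

    def flush(self):
        """Persist queued writes in order and return how many were written."""
        flushed = 0
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value
            flushed += 1
        return flushed


through_db, behind_db = {}, {}
through = WriteThroughCache(through_db)
behind = WriteBehindCache(behind_db)
through.write("cart:1", "book")
behind.write("cart:1", "book")
print(through_db, behind_db)  # the write-behind database is still empty here
print(behind.flush(), behind_db)
```

The trade-off is visible in the gap before `flush()`: write-behind answers faster, but anything still in the queue is lost if the cache fails first.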
Redis Data Structures
Redis is not just a key-value store; it supports multiple data structures:
Strings: simple key-value (up to 512 MB per value)
Lists: ordered collections of strings (linked list)
Sets: unordered collections of unique strings
Sorted Sets: ordered by score
Hashes: maps of field-value pairs
Bitmaps, HyperLogLogs, Geospatial indexes
Performance and Scale
Azure Cache for Redis offers three tiers:
Basic: single node, no SLA, no replication (dev/test)
Standard: two nodes (primary/replica), 99.9% SLA
Premium: clustering, persistence, zone redundancy, up to 99.95% SLA
Throughput varies by tier and size. For a Standard C1 (1 GB cache), expect ~15,000 requests per second. For Premium P5 (120 GB), up to 1.2 million requests per second.
Time-to-Live (TTL)
TTL is critical for cache freshness. Setting TTL too long can serve stale data; too short reduces cache effectiveness. Default TTL is infinite (no expiration), but best practice is to set a TTL based on data volatility. Use EXPIRE command in Redis:
EXPIRE mykey 3600
The second argument is the TTL in seconds.
Comparison with Other Azure Data Stores
Key-value stores differ from relational databases (no schema, no joins, no ACID beyond single record) and document databases (documents have structure, can be queried on fields). On the exam, remember that key-value stores are optimized for simple lookups at scale.
In-memory caching is distinct from content delivery networks (CDNs) which cache static content at edge locations, and from application-level caching (e.g., IMemoryCache) which is local to a single server.
Trap Patterns on the Exam
Confusing key-value stores with document stores: Cosmos DB SQL API is document, Table API is key-value.
Thinking all caches are persistent: In-memory caches lose data on restart unless persistence is enabled.
Assuming TTL is mandatory: It is optional; default is no expiration.
Mixing up eviction policies: LRU vs LFU vs TTL.
Overlooking that Redis supports multiple data structures beyond key-value.
Exam-Relevant Numbers
Redis value size limit: 512 MB
Azure Cache for Redis Standard tier: 99.9% SLA
Premium tier clustering: up to 10 shards
Maxmemory policy default: volatile-lru
Throughput for Standard C1: ~15,000 requests/second
TTL can be set per key using EXPIRE (seconds) or PEXPIRE (milliseconds).
Application Requests Data
The application needs to retrieve a piece of data, such as a user's profile. It generates a cache key (e.g., 'user:123') and sends a GET command to the cache (Azure Cache for Redis). This is a synchronous call; the application blocks until it receives a response. The cache client library (e.g., StackExchange.Redis) serializes the command and sends it over TCP to the Redis server. The Redis server receives the command, looks up the key in its in-memory hash table, and returns the value if found, or nil if not found. The entire round-trip typically takes less than 1 millisecond.
Cache Hit or Miss
If the key exists in the cache (cache hit), Redis returns the value immediately. The application deserializes the value (e.g., JSON to object) and uses it. No further action is needed. If the key does not exist (cache miss), Redis returns a nil response. The application recognizes the miss and proceeds to fetch the data from the primary data store (e.g., Azure SQL Database). The miss can happen because the data was never cached, the TTL expired, or it was evicted due to memory pressure.
Fetch from Primary Store
On a cache miss, the application queries the primary database (e.g., Azure SQL) using a SELECT statement. This involves a TCP connection, query parsing, index lookup, and data retrieval from disk or buffer pool. Latency is typically 5-50 milliseconds, depending on load and data size. The application retrieves the full data row (e.g., user name, email, preferences). It then prepares to store this data in the cache for future requests.
Store in Cache with TTL
The application serializes the retrieved data (e.g., as a JSON string) and sends a SET command to Redis, specifying the key and value. It also sets a TTL, either with a separate EXPIRE command or with the combined SETEX command (e.g., SETEX user:123 3600 '{"name":"Alice"}'). The TTL value (in seconds) determines how long the data remains in the cache. The Redis server stores the key-value pair in its in-memory dictionary and records the time at which the key expires. The application then returns the data to the caller.
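The request flow walked through so far (check the cache, fall back to the primary store on a miss, then store the result with a TTL) can be sketched as follows. This is a self-contained model, not StackExchange.Redis or Azure SQL code: `TTLCache` is a hypothetical dictionary-backed stand-in for Redis, the `database` dictionary stands in for the SQL table, and the injectable `clock` exists only so that expiry can be demonstrated without waiting.

```python
import json
import time


class TTLCache:
    """Dict-backed stand-in for a Redis cache with SETEX-style expiry."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.entries = {}  # key -> (serialized value, expiry time)

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None  # never cached
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self.entries[key]  # TTL has passed, treat as a miss
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self.entries[key] = (value, self.clock() + ttl_seconds)


def get_user_profile(user_id, cache, database, ttl_seconds=3600):
    """Cache-aside read: returns the profile and whether it was a hit or a miss."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), "hit"
    row = database[user_id]  # stands in for a SELECT against the primary store
    cache.setex(key, ttl_seconds, json.dumps(row))
    return row, "miss"


database = {123: {"name": "Alice"}}
cache = TTLCache()
print(get_user_profile(123, cache, database))  # first read misses
print(get_user_profile(123, cache, database))  # second read hits
```

The application code owns every step here, which is what distinguishes cache-aside from read-through, where the cache layer would perform the database lookup itself.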
Cache Eviction and Expiration
Over time, the cache may reach its maxmemory limit. When a new key is added and memory is full, Redis evicts keys based on the configured maxmemory policy. For volatile-lru (default), Redis removes the least recently used key among those with a TTL set. Keys without TTL are not evicted under this policy. Additionally, keys with expired TTL are lazily removed when accessed, or actively by Redis's periodic expiration cycle (every 100ms, it tests 20 random keys and deletes expired ones). This ensures the cache stays within memory limits and stale data is removed.
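The two expiration paths described above, lazy removal on access and periodic active sampling, can be modelled roughly like this. The sketch assumes a manually advanced clock (`now`) and a hypothetical `ExpiringStore` class; the sample size of 20 mirrors the figure in the text, but the real Redis cycle has additional rules that are not reproduced here.

```python
import random


class ExpiringStore:
    """Toy store showing lazy and active expiration of keys with a TTL."""

    def __init__(self):
        self.now = 0.0        # simulated clock, advanced manually
        self.values = {}
        self.expires_at = {}  # only keys that have a TTL appear here

    def set(self, key, value, ttl=None):
        self.values[key] = value
        if ttl is None:
            self.expires_at.pop(key, None)  # key never expires
        else:
            self.expires_at[key] = self.now + ttl

    def _is_expired(self, key):
        deadline = self.expires_at.get(key)
        return deadline is not None and self.now >= deadline

    def _remove(self, key):
        self.values.pop(key, None)
        self.expires_at.pop(key, None)

    def get(self, key):
        # Lazy expiration: the key is checked only when someone asks for it.
        if self._is_expired(key):
            self._remove(key)
            return None
        return self.values.get(key)

    def active_expire_cycle(self, sample_size=20, rng=random):
        # Active expiration: sample keys that have a TTL, delete the expired ones.
        candidates = list(self.expires_at)
        sample = rng.sample(candidates, min(sample_size, len(candidates)))
        expired = [key for key in sample if self._is_expired(key)]
        for key in expired:
            self._remove(key)
        return len(expired)


store = ExpiringStore()
for i in range(5):
    store.set(f"session:{i}", "data", ttl=10)
store.set("config", "value")        # no TTL
store.now = 11.0                    # all session keys are now past their TTL
print(store.active_expire_cycle())  # prints 5
print(sorted(store.values))         # prints ['config']
```

Without the active cycle, an expired key that nobody reads again would occupy memory indefinitely, which is why both mechanisms exist.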
E-Commerce Shopping Cart
A large e-commerce platform uses Azure Cache for Redis to store shopping cart data. Each user's cart is stored as a Redis hash with fields like item ID, quantity, and price. The cart is keyed by a session ID. This approach allows sub-millisecond reads and writes, essential for a smooth user experience. The cache is configured with a TTL of 24 hours to clean up abandoned carts. On add/remove, the application updates both Redis and the underlying SQL database for durability, which is a write-through approach rather than pure cache-aside. During peak traffic (Black Friday), the cache handles millions of operations per second. An earlier misconfiguration that set the TTL too low (e.g., 5 minutes) caused users to lose carts frequently, leading to support tickets. The fix was to increase the TTL and implement a background job to persist active carts to SQL.
Gaming Leaderboard
An online game uses Redis sorted sets to maintain a global leaderboard. Each player is stored as a member with an associated score. Redis adds or updates a score in O(log N), and ZREVRANGE retrieves the top M players in O(log N + M). The leaderboard is updated in real time as players finish games. The cache is deployed in the Premium tier with clustering to handle 500,000 concurrent players. Persistence (AOF) is enabled to avoid losing leaderboard data on restart. A common issue was using a single Redis instance without clustering, which saturated the CPU at 100% during peak load. After sharding across 10 shards, throughput scaled linearly. The exam may ask about using sorted sets for leaderboards; remember that Redis is the go-to solution.
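To make the sorted-set behaviour concrete, here is a small Python model of a leaderboard. It is only a stand-in: the `Leaderboard` class and its `zadd`/`zrevrange` methods are named after the Redis commands for readability, they handle non-negative indices only, and the list-based insert is O(N) rather than the O(log N) of a real sorted set.

```python
import bisect


class Leaderboard:
    """Toy stand-in for a Redis sorted set: unique members ordered by score."""

    def __init__(self):
        self.scores = {}   # member -> current score
        self.ordered = []  # (score, member) tuples kept in ascending order

    def zadd(self, member, score):
        # Updating a member replaces its old entry so members stay unique.
        if member in self.scores:
            old_entry = (self.scores[member], member)
            self.ordered.pop(bisect.bisect_left(self.ordered, old_entry))
        self.scores[member] = score
        bisect.insort(self.ordered, (score, member))

    def zrevrange(self, start, stop):
        # Highest score first; stop is inclusive, as in the Redis command.
        descending = self.ordered[::-1]
        return [member for _, member in descending[start:stop + 1]]


board = Leaderboard()
board.zadd("alice", 1500)
board.zadd("bob", 2300)
board.zadd("carol", 1900)
board.zadd("alice", 2500)        # alice finishes another game
print(board.zrevrange(0, 1))     # prints ['alice', 'bob']
```

Keeping the ordering up to date on every write is what makes the "top players" read cheap, at the cost of slightly more work per score update.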
Session Store for Web Applications
A SaaS provider uses Azure Cache for Redis as a distributed session store for its ASP.NET Core web app. Session data (user login, preferences) is stored in Redis using the built-in AddStackExchangeRedisCache extension (the replacement for the older AddDistributedRedisCache). This allows any web server in a load-balanced pool to access the same session data. The cache is configured with a TTL of 20 minutes for session timeout. The Standard tier with replication ensures high availability. A common mistake is not setting a TTL, which lets sessions accumulate indefinitely and leads to out-of-memory errors. Proper monitoring with Azure Monitor metrics (e.g., cache misses, evictions) is essential. The exam may ask about session state vs. cache; know that session state is a specific use case of caching.
What DP-900 Tests
Objective 1.3: Describe core data concepts including key-value stores and in-memory caching. Specific sub-objectives:
Identify characteristics of key-value data (simple schema, fast lookups, no relationships).
Identify use cases for key-value stores (session management, caching, IoT telemetry).
Identify characteristics of in-memory caching (low latency, volatile, TTL).
Identify Azure services: Azure Cache for Redis, Cosmos DB Table API.
Common Wrong Answers
"Key-value stores support complex queries and joins." – Wrong. Key-value stores only support key-based operations. Candidates confuse them with relational databases.
"In-memory caches are durable and survive restarts." – Wrong unless persistence is enabled. By default, Redis is an in-memory store; data is lost on restart. The exam expects you to know that.
"Azure Cache for Redis only supports string values." – Wrong. Redis supports strings, lists, sets, sorted sets, hashes, etc. The exam may test this.
"TTL is mandatory for all cache entries." – Wrong. TTL is optional; default is no expiration. A common trap.
"Cosmos DB Table API is a relational database." – Wrong. It is a key-value store (Table API) or document (SQL API). Know the difference.
Specific Numbers and Terms
Redis value size limit: 512 MB.
Azure Cache for Redis Standard tier SLA: 99.9%.
Default maxmemory policy: volatile-lru.
Commands: SET, GET, EXPIRE, SETEX.
TTL unit: seconds (EXPIRE) or milliseconds (PEXPIRE).
Eviction policies: volatile-lru, allkeys-lru, volatile-random, etc.
Edge Cases
What happens when cache is full and no keys have TTL? With volatile-lru, no eviction occurs; writes fail. Use allkeys-lru to evict any key.
Can Redis be used as a primary database? Yes, with persistence enabled, but it's not typical. The exam expects caching use cases.
What is the difference between Cosmos DB Table API and SQL API? Table API is key-value; SQL API is document (JSON) with SQL query support.
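The first edge case in this list, a full cache in which no key has a TTL, can be reproduced with a small simulation. This is a simplified model under stated assumptions: the hypothetical `BoundedCache` class counts keys instead of bytes, supports only the two policies discussed, and raises Python's `MemoryError` as a stand-in for the out-of-memory error reply a Redis client would receive.

```python
from collections import OrderedDict


class BoundedCache:
    """Toy cache that models volatile-lru and allkeys-lru eviction."""

    def __init__(self, max_keys, policy="volatile-lru"):
        if policy not in ("volatile-lru", "allkeys-lru"):
            raise ValueError(f"unsupported policy: {policy}")
        self.max_keys = max_keys
        self.policy = policy
        self.items = OrderedDict()  # key -> (value, has_ttl), oldest access first

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as recently used
        return self.items[key][0]

    def set(self, key, value, has_ttl=False):
        if key not in self.items and len(self.items) >= self.max_keys:
            victim = self._pick_victim()
            if victim is None:
                raise MemoryError("OOM: no evictable keys under " + self.policy)
            del self.items[victim]
        self.items[key] = (value, has_ttl)
        self.items.move_to_end(key)

    def _pick_victim(self):
        # Walk from least to most recently used and take the first eligible key.
        for key, (_, has_ttl) in self.items.items():
            if self.policy == "allkeys-lru" or has_ttl:
                return key
        return None


cache = BoundedCache(max_keys=2, policy="volatile-lru")
cache.set("a", 1)
cache.set("b", 2)
try:
    cache.set("c", 3)  # full, and neither existing key has a TTL
except MemoryError as error:
    print("write rejected:", error)
```

Constructing the same cache with `policy="allkeys-lru"` lets the third write succeed by evicting the least recently used key, regardless of whether it has a TTL.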
How to Eliminate Wrong Answers
If the question mentions "complex queries" or "relationships", eliminate key-value store.
If the question mentions "persistent storage" without "persistence enabled", eliminate in-memory cache.
If the question mentions "multiple data types" (lists, sets), lean toward Redis.
If the question mentions "session state" or "shopping cart", think key-value store or cache.
Key-value stores use a unique key to retrieve a value; they do not support complex queries or joins.
Azure Cache for Redis is an in-memory key-value store that also supports data structures like lists, sets, and hashes.
Default eviction policy for Azure Cache for Redis is volatile-lru (uses LRU to evict only keys that have a TTL set).
TTL is optional; set per key with EXPIRE (seconds) or SETEX (combined set+expire).
In-memory caches are volatile unless persistence (RDB/AOF) is enabled.
Common use cases: session management, caching database query results, real-time leaderboards.
Azure Cosmos DB Table API is a key-value store; SQL API is a document store.
Cache-aside is the most common caching pattern: check cache, on miss fetch from DB and store in cache.
Redis value size limit is 512 MB per key.
Standard tier Azure Cache for Redis offers 99.9% SLA with replication.
These come up on the exam all the time. Here's how to tell them apart.
Key-Value Store (e.g., Cosmos DB Table API)
Data stored as opaque key-value pairs; value is not interpreted.
Query only by key; no field-level queries.
No schema; value can be any format.
Ideal for simple lookups at massive scale.
Examples: session state, user profiles, IoT telemetry.
Document Store (e.g., Cosmos DB SQL API)
Data stored as JSON documents; schema is flexible but known.
Query by any field using SQL-like syntax.
Supports indexing on multiple fields.
Ideal for complex queries and hierarchical data.
Examples: product catalog, content management, orders.
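The contrast between the two models can be shown with the same data viewed two ways. The sketch below is illustrative only: the product records, the `kv_get` and `find_documents` helpers, and the in-memory collections are assumptions for the example and do not represent the Cosmos DB APIs.

```python
import json

# Key-value view: the store sees only bytes and can answer only "get by key".
kv_store = {
    "product:1": b'{"name": "Kettle", "category": "kitchen", "price": 30}',
    "product:2": b'{"name": "Lamp", "category": "lighting", "price": 45}',
}


def kv_get(key):
    """Return the opaque value for a key, or None if the key is absent."""
    return kv_store.get(key)


# Document view: the store understands the JSON fields and can filter on them.
documents = [json.loads(raw) for raw in kv_store.values()]


def find_documents(field, value):
    """Return every document whose field equals the given value."""
    return [doc for doc in documents if doc.get(field) == value]


print(kv_get("product:1"))
print(find_documents("category", "lighting"))
```

Answering "which products are in the lighting category?" from the key-value view would require fetching and parsing every value in application code, which is the work a document store does for you.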
Mistake
Key-value stores are the same as document stores.
Correct
Key-value stores treat the value as an opaque blob; document stores understand the structure (e.g., JSON) and allow querying on fields. Cosmos DB SQL API is document; Table API is key-value.
Mistake
Azure Cache for Redis is a relational database.
Correct
Redis is an in-memory key-value store (with data structures). It does not support SQL queries or joins, and its MULTI/EXEC transactions do not provide the full ACID guarantees of a relational database (for example, there is no rollback).
Mistake
In-memory caches are always faster than disk databases.
Correct
For most reads, yes, but if the cache is cold (empty) or has many misses, the overhead of checking cache plus fetching from database can be worse than direct database access. Caching is only beneficial for repeated reads.
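The overhead argument can be checked with a little arithmetic. In a cache-aside read, a hit costs one cache lookup, while a miss costs the cache lookup plus the database query. The function below is a back-of-the-envelope model; the 1 ms and 20 ms figures are illustrative values chosen from the latency ranges quoted in this chapter, not measurements.

```python
def expected_read_latency_ms(hit_ratio, cache_ms, db_ms):
    """Average latency of a cache-aside read.

    A hit pays for the cache lookup only; a miss pays for the cache
    lookup and then the database query.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * (cache_ms + db_ms)


# Cold cache: every read misses, so it is slower than going straight to the database.
print(f"{expected_read_latency_ms(0.0, 1.0, 20.0):.1f}")  # prints 21.0

# Warm cache: nine reads in ten are served from memory.
print(f"{expected_read_latency_ms(0.9, 1.0, 20.0):.1f}")  # prints 3.0
```

Simplifying the expression gives cache_ms + (1 - hit_ratio) * db_ms, so under this model caching beats direct access only when hit_ratio exceeds cache_ms / db_ms, which is 5% for the numbers used here.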
Mistake
All data in a key-value store must be strings.
Correct
Values can be strings, numbers, JSON, binary, etc. The store treats them as bytes. Redis supports typed structures (lists, sets, hashes) that are not just strings.
Mistake
TTL in Redis is set at the database level.
Correct
TTL is set per key using the EXPIRE command. There is no default TTL for all keys unless the application sets one. The maxmemory-policy configuration setting controls eviction, not TTL.
What is the difference between the Cosmos DB Table API and the SQL API?
Cosmos DB Table API is a key-value store where data is stored as entities with a partition key and row key. It supports simple key-based lookups. Cosmos DB SQL API is a document database that stores JSON documents and supports SQL queries, indexing on any field, and stored procedures. For DP-900, remember that Table API is key-value, SQL API is document.
Does Azure Cache for Redis persist data by default?
By default, no. Azure Cache for Redis stores data in memory only. You can enable persistence (RDB snapshots or AOF log) in the Premium tier. Without persistence, data is lost on a restart or failover. The exam expects you to know that caching is primarily for performance, not durability.
Which eviction policy should you use when keys have no TTL?
If you have keys without TTL and want the cache to evict them when memory is full, use `allkeys-lru` or `allkeys-random`. The default `volatile-lru` only evicts keys with TTL set. If no keys have TTL, writes will fail when memory is exhausted.
Can Redis be used as a primary database?
Yes, but it's not typical. Redis can be used as a primary database with persistence enabled (RDB/AOF) and replication. However, it lacks advanced querying and ACID transactions across multiple keys, and it is optimized for in-memory speed. For DP-900, focus on caching use cases.
What is the cache-aside pattern?
Cache-aside is a pattern where the application code explicitly checks the cache before querying the database. On a cache miss, the application retrieves data from the database, stores it in the cache, and returns it. Subsequent reads hit the cache. This is the most common pattern for using Azure Cache for Redis.
What is the Redis value size limit?
512 MB per key. This is a specific number that appears on the exam. Values larger than 512 MB must be split across multiple keys.
How is TTL set in Redis?
TTL (Time-To-Live) is set per key using the EXPIRE command with seconds, or PEXPIRE with milliseconds. Once the TTL expires, the key is automatically deleted. You can check the remaining TTL with the TTL command, which returns -1 if the key has no expiration and -2 if the key does not exist.