DVA-C02 | Chapter 38 of 101 | Objective 1.3

Amazon ElastiCache for Session Caching

This chapter covers Amazon ElastiCache for session caching, a critical pattern for building stateless, scalable web applications. Session caching with ElastiCache (especially Redis) is a recurring DVA-C02 exam topic wherever caching and performance optimization come up. You will learn how to design, configure, and troubleshoot ElastiCache for session management, including key concepts like TTL, eviction policies, replication, and cluster mode. Mastering this topic will help you answer scenario-based questions about reducing database load, improving latency, and maintaining session state across auto-scaled instances.

25 min read
Intermediate
Updated May 31, 2026

ElastiCache as a Hotel Front Desk Key Locker

Imagine a large hotel with hundreds of guests. Each guest has a room key that they must show to access the pool, gym, and other amenities. Instead of carrying their key everywhere, the hotel provides a secure locker at the front desk for each guest. When a guest arrives at the pool, they hand over their room number, and the front desk clerk retrieves their key from the locker, allowing access. The lockers are fast-access (in-memory) and shared by all guests. If the locker system fails, the hotel can fall back to a slower process: the guest must go to their room to get the physical key (database lookup). The front desk clerk (ElastiCache node) stores keys in memory for instant retrieval. The lockers have a time limit: if a guest doesn't use the key for a set period (TTL), the key is automatically returned to the guest's room (expired session). The hotel may have multiple front desks (cluster mode) to handle many guests, and each desk knows which desk holds any given guest's key (sharding). If a desk crashes, the keys it held are lost, but the hotel can rebuild them from the room (database) or from a backup (a Redis RDB snapshot). The hotel also uses a master desk that replicates keys to backup desks (replica nodes) for high availability. This analogy mirrors how ElastiCache for Redis provides fast, in-memory session caching with TTL, backups, replication, and sharding, reducing load on the primary database and improving application performance.

How It Actually Works

What is Session Caching and Why Use ElastiCache?

Session caching stores user session data (e.g., login state, shopping cart contents, user preferences) in a fast, in-memory data store so that it can be retrieved quickly by any application instance. Without a centralized session store, each web server would have to maintain its own session state, which breaks when the server is replaced by auto-scaling or a load balancer distributes requests to different instances. The traditional approach is to store sessions in the application's relational database, but this adds latency and increases database load, especially for read-heavy session lookups. ElastiCache provides a managed Redis or Memcached service that sits between the application and the database, offering sub-millisecond read/write performance.

How ElastiCache Works for Session Caching

When a user logs into an application, the application creates a session object (e.g., user ID, expiration time, permissions) and stores it in ElastiCache with a unique session ID as the key. The session ID is then sent to the user's browser as a cookie. On subsequent requests, the browser presents the session ID, and the application reads the session data from ElastiCache using that ID. If the data is present and not expired, the application uses it; otherwise, it may redirect to login or create a new session. ElastiCache nodes run in-memory, so reads and writes are extremely fast. The service handles node failures, replication, and scaling, allowing developers to focus on application logic.
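The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a production session library: `InMemoryStore` is a hypothetical stand-in that mimics only the `setex` and `get` calls of a Redis client so the example runs without a cluster, and the `session:` key prefix and 30-minute TTL are assumptions.

```python
import json
import secrets
import time


class InMemoryStore:
    """Hypothetical stand-in for a Redis client, for local runs only.

    It mimics the two calls used below, setex(key, seconds, value) and
    get(key), including expiry.
    """

    def __init__(self):
        self._data = {}

    def setex(self, key, seconds, value):
        self._data[key] = (value, time.monotonic() + seconds)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired keys behave like missing keys
            return None
        return value


def create_session(client, user_id, ttl_seconds=1800):
    """Store a new session and return the ID to send back as a cookie."""
    session_id = secrets.token_urlsafe(32)  # unguessable session ID
    payload = json.dumps({"user_id": user_id, "created_at": int(time.time())})
    client.setex(f"session:{session_id}", ttl_seconds, payload)
    return session_id


def load_session(client, session_id):
    """Return the session dict, or None if it is missing or expired."""
    raw = client.get(f"session:{session_id}")
    return json.loads(raw) if raw is not None else None


store = InMemoryStore()
sid = create_session(store, user_id=42)
print(load_session(store, sid)["user_id"])
print(load_session(store, "unknown-id"))
```

Because the functions only rely on `setex` and `get`, a real Redis client object could be passed in place of `InMemoryStore` without changing them.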

Key Components and Defaults

Redis vs. Memcached for Sessions: The DVA-C02 exam strongly favors Redis for session caching due to its support for persistence, replication, data structures (e.g., hashes, sets), and atomic operations. Memcached is simpler but lacks persistence and replication, making it unsuitable for sessions that must survive node failures. Redis also supports TTL (Time-To-Live) natively via the EXPIRE command, which is essential for session expiration.

TTL (Time-To-Live): Each session key in Redis can have a TTL set in seconds. The default TTL is not set by ElastiCache; it is the application's responsibility to set it. Common values range from 15 minutes (900 seconds) to 24 hours (86400 seconds). When a key expires, it is automatically deleted. This prevents the cache from filling with stale sessions.

Eviction Policies: When memory is full, Redis evicts keys based on the policy configured in the maxmemory-policy parameter. For session caching, the commonly recommended policies are allkeys-lru (Least Recently Used across all keys) and volatile-lru (LRU only among keys with a TTL). ElastiCache's default Redis parameter groups use volatile-lru. The exam may ask which policy is best for session caching to avoid losing important data.

Cluster Mode: Redis Cluster (enabled via cluster mode) shards data across multiple node groups. On Redis 5.0.6 and later a cluster can contain up to 500 nodes in total, for example 500 shards with no replicas or 83 shards with 5 replicas each; earlier engine versions are limited to 90 nodes. Each shard has a primary node and up to 5 replicas. Cluster mode is needed when session data exceeds the memory of a single node or when throughput exceeds what one primary can serve; this chapter uses roughly 50 GB and 100,000 operations/sec as rules of thumb. However, it introduces complexity: multi-key operations (e.g., MGET) are limited to keys in the same hash slot. The exam tests when to use cluster mode vs. non-cluster mode.

Replication: In non-cluster mode, you can have a primary node and up to 5 replica nodes. Replicas asynchronously replicate data from the primary. Read requests can be offloaded to replicas, but writes always go to the primary. For session caching, you typically configure the application to write to the primary and read from replicas to balance load; because replication is asynchronous, a replica can briefly return a slightly stale session right after a write. If the primary fails and automatic failover is enabled, ElastiCache promotes a replica to primary. This is critical for high availability.

Endpoints: ElastiCache provides a primary endpoint (for writes) and a reader endpoint (for read load balancing across replicas). In cluster mode, you get a configuration endpoint that returns the list of shard endpoints. Applications must use the appropriate endpoint.

Configuration and Verification

ElastiCache is configured via the AWS Management Console, CLI, or SDK. Key steps for session caching:

1. Create a Redis cluster (non-cluster or cluster mode enabled).

2. Set security group rules to allow inbound traffic from your application instances on port 6379 (default Redis port).

3. Configure application to use the Redis endpoint. For example, in a Node.js app using ioredis:

const Redis = require('ioredis');
const redis = new Redis({
  host: 'mycluster.xxxxxx.ng.0001.use1.cache.amazonaws.com',
  port: 6379,
  // For cluster mode, use Redis.Cluster with startup nodes
});

4. Set TTL on session keys:

redis.setex(sessionId, 3600, JSON.stringify(sessionData)); // TTL = 3600 seconds

5. Verify connectivity using redis-cli from an EC2 instance in the same VPC:

redis-cli -h mycluster.xxxxxx.ng.0001.use1.cache.amazonaws.com -p 6379 ping

6. Monitor using CloudWatch metrics: CacheHits, CacheMisses, CurrConnections, Evictions, ReplicationLag.

Interaction with Related Technologies

ElastiCache and Auto Scaling: When using Auto Scaling groups, each new EC2 instance must be able to access the same ElastiCache cluster. This is achieved by placing the cluster in the same VPC and using security groups that allow traffic from the Auto Scaling group's security group. The session data persists in ElastiCache regardless of instance churn.

ElastiCache and Load Balancers: Application Load Balancers (ALB) can be configured with sticky sessions (session affinity) using cookies, but this ties a user to a specific target. With ElastiCache, you can disable sticky sessions and let any instance serve any request, because session data is centralized.

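ElastiCache and DynamoDB DAX: DAX is an in-memory cache that sits in front of DynamoDB tables and only accelerates DynamoDB API calls; it is not a general-purpose key-value store. It cannot hold arbitrary session data on its own, so for a standalone session cache ElastiCache is the correct service.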

ElastiCache and CloudFront: CloudFront can cache static content, but session data is dynamic and must be stored server-side. ElastiCache is the server-side cache.

Performance Considerations and Scaling

Memory Sizing: Estimate session size (e.g., 2 KB per session) multiplied by concurrent sessions. For 1 million sessions at 2 KB each, you need at least 2 GB for the payloads alone. Add headroom for per-key overhead and replication buffers so that evictions stay at or near zero.

Network Latency: Place ElastiCache in the same Availability Zone as your application for lowest latency. Cross-AZ adds 1-2 ms.

Connection Limits: Each Redis node accepts up to 65,000 concurrent client connections on most node types. This is the maxclients parameter, which ElastiCache fixes and does not let you modify. Use connection pooling to avoid exhausting connections.

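Scaling: In non-cluster mode, you scale vertically by changing the node type. On Redis 3.2.10 and later this is an online operation: the cluster keeps serving requests while the new nodes are provisioned and synced, with a brief interruption when connections switch over. In cluster mode, you can also add or remove shards online (resharding) with minimal impact.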
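The memory sizing arithmetic above can be written out as a small helper. This is a rough sketch: the 25% headroom figure is an assumption for illustration, not an AWS recommendation, and decimal units are used (1 GB = 1,000,000 KB).

```python
def required_cache_gb(sessions, avg_session_kb, headroom=0.25):
    """Return (payload_gb, sized_gb) for a session cache.

    payload_gb is the raw session data; sized_gb adds a headroom fraction
    (assumed here) for per-key overhead and buffers.
    """
    payload_gb = sessions * avg_session_kb / 1_000_000  # KB to GB (decimal)
    return payload_gb, payload_gb * (1 + headroom)


payload, sized = required_cache_gb(sessions=1_000_000, avg_session_kb=2)
print(f"payload: {payload:.2f} GB, with headroom: {sized:.2f} GB")
# prints "payload: 2.00 GB, with headroom: 2.50 GB"
```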

Security

Encryption in transit: Enable TLS by setting TransitEncryptionEnabled to true. The application must use a TLS-compatible Redis client.

Encryption at rest: Enable using AWS KMS.

Auth token: Set a Redis AUTH token for password-based authentication. An AUTH token can only be used when encryption in transit is enabled.

VPC: ElastiCache clusters must be launched in a VPC. They cannot be accessed from the internet unless you use a proxy or VPN.
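On the client side, the security settings above translate into a few connection options. The sketch below builds keyword arguments intended for redis-py's `redis.Redis(**kwargs)`; the helper name, the endpoint, the token, and the 2-second connect timeout are all hypothetical values chosen for illustration.

```python
def redis_connection_kwargs(host, auth_token=None, tls=True, port=6379):
    """Build keyword arguments for redis.Redis(**kwargs) (redis-py)."""
    if auth_token and not tls:
        # An AUTH token is only accepted on clusters with in-transit encryption.
        raise ValueError("auth_token requires tls=True")
    kwargs = {
        "host": host,
        "port": port,
        "decode_responses": True,
        "socket_connect_timeout": 2,  # assumed value; fail fast if unreachable
    }
    if tls:
        kwargs["ssl"] = True  # redis-py option that turns on TLS
    if auth_token:
        kwargs["password"] = auth_token
    return kwargs


# Hypothetical endpoint and token, for illustration only.
kwargs = redis_connection_kwargs(
    "my-session-cache.example.internal", auth_token="example-token"
)
print(sorted(kwargs))
# With redis-py installed: client = redis.Redis(**kwargs)
```

Keeping the options in one helper makes it harder to ship a configuration that sends an AUTH token over an unencrypted connection.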

Common Pitfalls

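Not setting TTL: Sessions accumulate and fill memory. With a volatile-* policy, keys without a TTL are never eligible for eviction, so writes start failing with out-of-memory errors; with allkeys-lru, Redis evicts the least recently used keys, which can include sessions that are still valid.

Using Memcached for sessions: Memcached lacks persistence and replication; if a node fails, all sessions on that node are lost. Memcached does support a per-item expiration time, so TTL is not the differentiator; durability and failover are.

Ignoring eviction policy: The default volatile-lru evicts only keys that have a TTL, while allkeys-lru can also evict keys that have no TTL (data you meant to keep). Choose volatile-ttl to evict the keys closest to expiration first.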

Not using replicas for reads: Sending all reads to the primary increases load. Offload reads to replicas.

Cross-AZ latency: If replicas are in different AZs, ensure your application reads from the primary if latency is critical.

Exam Tips

The exam will present scenarios where an application experiences high database load due to session management. The correct solution is to use ElastiCache (Redis) to cache sessions.

Know the difference between Redis and Memcached: Redis supports persistence, replication, and complex data types; Memcached is simpler but not suitable for sessions.

Understand TTL: Setting appropriate TTL is crucial. The exam may ask what happens if TTL is too long (memory waste) or too short (frequent re-login).

Be familiar with eviction policies: allkeys-lru and volatile-lru are common answers.

Remember that ElastiCache clusters are not directly accessible from the internet; they must be in a VPC.

Cluster mode is needed for sharding, but it complicates multi-key operations.

Replication provides high availability; replicas can serve reads.

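The exam may ask about backup and restore: ElastiCache for Redis supports automatic and manual snapshots (RDB files) that can seed a new cluster. Append-only files (AOF) are only available on legacy engine versions (earlier than 2.8.22), so replicas with Multi-AZ are the usual way to protect session data.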

Step-by-Step: Setting Up ElastiCache for Session Caching

Step 1: Create a Redis Cluster

Use the AWS CLI:

aws elasticache create-cache-cluster \
    --cache-cluster-id my-session-cache \
    --cache-node-type cache.m5.large \
    --engine redis \
    --num-cache-nodes 1 \
    --security-group-ids sg-12345678 \
    --cache-subnet-group-name my-subnet-group

For cluster mode (automatic failover must be enabled for cluster-mode-enabled replication groups):

aws elasticache create-replication-group \
    --replication-group-id my-session-cluster \
    --replication-group-description "Session cache cluster" \
    --cache-node-type cache.m5.large \
    --engine redis \
    --num-node-groups 3 \
    --replicas-per-node-group 1 \
    --automatic-failover-enabled \
    --security-group-ids sg-12345678 \
    --cache-subnet-group-name my-subnet-group

Step 2: Configure Application to Use Redis

Example in Python using redis-py:

import redis
r = redis.Redis(
    host='my-session-cache.xxxxxx.ng.0001.use1.cache.amazonaws.com',
    port=6379,
    decode_responses=True
)
r.setex('session:abc123', 1800, '{"user_id": 42, "role": "admin"}')

Step 3: Set TTL and Handle Misses

When a request arrives, the application attempts to read the session from Redis. If the key does not exist (cache miss), the application should authenticate the user (e.g., via database) and then create a new session in Redis.
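The miss-handling path in Step 3 might look like the following sketch. `DictClient` is a hypothetical stand-in exposing `get` and `setex` like a Redis client (it ignores expiry), and `authenticate` is an assumed caller-supplied function, for example a database check of submitted credentials.

```python
import json
import secrets


class DictClient:
    """Hypothetical stand-in with get/setex like a Redis client (no expiry)."""

    def __init__(self):
        self.data = {}

    def get(self, key):
        return self.data.get(key)

    def setex(self, key, seconds, value):
        self.data[key] = value


def resolve_session(client, session_id, authenticate, ttl_seconds=1800):
    """Return (session_id, session), or (None, None) when a login is required.

    authenticate() returns a user dict such as {"id": 42, "role": "admin"},
    or None when the request carries no valid credentials.
    """
    if session_id is not None:
        raw = client.get(f"session:{session_id}")
        if raw is not None:
            return session_id, json.loads(raw)  # cache hit
    user = authenticate()  # cache miss: fall back to the slower check
    if user is None:
        return None, None  # caller should redirect to the login page
    new_id = secrets.token_urlsafe(32)
    session = {"user_id": user["id"], "role": user.get("role", "user")}
    client.setex(f"session:{new_id}", ttl_seconds, json.dumps(session))
    return new_id, session


client = DictClient()
sid, session = resolve_session(client, None, lambda: {"id": 42, "role": "admin"})
print(session)
print(resolve_session(client, "stale-id", lambda: None))
```

A miss always issues a fresh session ID rather than reusing the one the browser sent, which avoids reviving an ID that has already expired.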

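Step 4: Monitor and Tune

Use CloudWatch alarms for Evictions > 0 and for a miss rate above 10%, computed as CacheMisses / (CacheHits + CacheMisses), since CacheMisses on its own is a count rather than a percentage. If evictions occur, increase memory or adjust TTL.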
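The thresholds in Step 4 can be checked with a few lines of arithmetic once the metric sums are in hand. This sketch assumes the counts have already been read from CloudWatch for the same time window; the function names are hypothetical.

```python
def miss_rate(cache_hits, cache_misses):
    """Fraction of lookups that missed, or None when there was no traffic."""
    total = cache_hits + cache_misses
    return cache_misses / total if total else None


def alarm_reasons(cache_hits, cache_misses, evictions, max_miss_rate=0.10):
    """List which of the Step 4 thresholds were breached in this window."""
    reasons = []
    if evictions > 0:
        reasons.append("evictions")
    rate = miss_rate(cache_hits, cache_misses)
    if rate is not None and rate > max_miss_rate:
        reasons.append("miss_rate")
    return reasons


print(alarm_reasons(cache_hits=900, cache_misses=100, evictions=0))
print(alarm_reasons(cache_hits=850, cache_misses=150, evictions=3))
```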

Verification

Check connectivity: redis-cli -h endpoint -p 6379 ping should return PONG.

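List keys: redis-cli -h endpoint keys '*' (use with caution in production, because KEYS blocks the server while it walks the whole keyspace; redis-cli -h endpoint --scan --pattern 'session:*' iterates incrementally instead).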

Check memory usage: redis-cli -h endpoint info memory.

Walk-Through

1

Create ElastiCache Redis Cluster

Use the AWS Management Console, CLI, or CloudFormation to create a Redis cluster. Choose the node type based on expected memory (e.g., cache.m5.large has 6.38 GiB). For high availability, enable Multi-AZ with automatic failover and at least one replica. Set the port to 6379 (default). Ensure the cluster is in the same VPC as your application. For session caching, you typically start with a single shard (non-cluster mode) unless you anticipate more session data than a single node can hold (this chapter uses roughly 50 GB as a rule of thumb) or very high throughput. The creation process typically takes 5-10 minutes.

2

Configure Security Group

Create a security group for the ElastiCache cluster that allows inbound TCP traffic on port 6379 from the security group of your application instances (or from the VPC CIDR). Outbound rules are not needed for the cluster itself. This is a common exam point: ElastiCache nodes are not public; they must be accessed from within the VPC. If your application is outside the VPC (e.g., on-premises), you need a VPN or Direct Connect.

3

Update Application Code

Modify your application to use the ElastiCache endpoint for session storage. For example, in a Java Spring Boot application, configure the Redis connection factory with the primary endpoint. Use the `spring-session-data-redis` library to transparently store HTTP sessions in Redis. Set the session timeout (TTL) in the application configuration (e.g., `server.servlet.session.timeout=30m`). The library will automatically set the TTL on the Redis key. Ensure the application handles connection failures gracefully, e.g., by falling back to a database or returning an error.

4

Test Session Persistence

Deploy the updated application to an Auto Scaling group. Log in to the application and verify that the session is maintained across multiple requests. Then, terminate the EC2 instance serving your request. After Auto Scaling launches a new instance, your session should still be valid because the session data is in ElastiCache. If the session is lost, check that the new instance can reach ElastiCache (security group) and that the session ID cookie is being sent correctly. Use `redis-cli` to verify that the session key exists and has the correct TTL.

5

Monitor and Optimize

Set up CloudWatch alarms on ElastiCache metrics: `CacheHits` and `CacheMisses` to calculate hit rate (should be >90%). `Evictions` should be zero; if not, increase memory or reduce TTL. `CurrConnections` should be below the node's max connections (default 65,000). `ReplicationLag` should be near zero. If evictions occur despite adequate memory, check the eviction policy. For session caching, use `volatile-ttl` to evict keys with the shortest remaining TTL first. Also, consider using `allkeys-lru` if you have no persistent keys. Use the Redis `INFO` command to get detailed statistics.
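To see how the eviction policies mentioned above differ, the following sketch picks the key that an idealized version of each policy would evict first. It is a teaching model only: real Redis samples a handful of keys and approximates LRU, and the key names and timings below are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KeyInfo:
    name: str
    idle_seconds: int            # time since the key was last accessed
    ttl_seconds: Optional[int]   # None means no TTL is set


def eviction_candidate(keys, policy):
    """Return the key name an idealized `policy` would evict first."""
    if policy == "allkeys-lru":
        pool = list(keys)
    elif policy in ("volatile-lru", "volatile-ttl"):
        pool = [k for k in keys if k.ttl_seconds is not None]
    else:
        raise ValueError(f"unsupported policy: {policy}")
    if not pool:
        return None  # nothing is eligible, so writes would fail instead
    if policy == "volatile-ttl":
        return min(pool, key=lambda k: k.ttl_seconds).name
    return max(pool, key=lambda k: k.idle_seconds).name


keys = [
    KeyInfo("config:feature-flags", idle_seconds=900, ttl_seconds=None),
    KeyInfo("session:alice", idle_seconds=600, ttl_seconds=1200),
    KeyInfo("session:bob", idle_seconds=5, ttl_seconds=60),
]
for policy in ("allkeys-lru", "volatile-lru", "volatile-ttl"):
    print(policy, "->", eviction_candidate(keys, policy))
```

In this made-up data set, volatile-ttl selects `session:bob` even though it was used five seconds ago, because only the remaining TTL counts. That is worth keeping in mind if sessions do not refresh their TTL on access.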

What This Looks Like on the Job

Scenario 1: E-commerce Platform with User Carts

A large e-commerce platform uses ElastiCache for Redis to store shopping cart data. Each user's cart is stored as a Redis hash with a TTL of 24 hours. The application has multiple microservices (product catalog, pricing, checkout) that all need to read and write the cart. With ElastiCache, cart updates are atomic and fast. The platform handles 10,000 concurrent users with an average cart size of 5 KB. They run cluster mode with 3 shards (each roughly 6.4 GB) and one replica per shard. Because cluster mode is enabled, the application connects through the configuration endpoint with a cluster-aware client, which routes each command to the shard that owns the key. They set the eviction policy to volatile-ttl so that, under memory pressure, the carts closest to expiring are evicted first. A common issue they faced was forgetting to set TTL on new sessions, causing memory to fill up quickly. They resolved it by adding a default TTL in the application code. They also monitor Evictions and alarm on any value above 0. In production, they observed a 95% cache hit rate, reducing database load by 80%.

Scenario 2: Gaming Leaderboard and Session

A mobile game uses ElastiCache for both session caching and leaderboard storage. Player session data (level, inventory) is stored with a TTL of 1 hour. The leaderboard is a Redis sorted set, updated on each game completion. The application uses a single shard (cache.r5.large) with a replica. They enable encryption in transit because the data includes player tokens. The game experiences spikes during tournaments (100,000 concurrent players). They use connection pooling (JedisPool) to avoid exhausting connections. A mistake they made was not enabling Multi-AZ; when the primary node failed, there was downtime until a new node was provisioned. They now use Multi-AZ with automatic failover. They also set the timeout parameter to 300 seconds to close idle connections.

Scenario 3: SaaS Application with Multi-Tenant Sessions

A SaaS application stores sessions for thousands of tenants. Each tenant's sessions are stored with a key prefix like tenant123:session:xyz. Most tenants share a single non-cluster-mode Redis replication group with 2 replicas. To isolate tenants, they considered using separate Redis databases (SELECT command), but Redis databases are not a security boundary, so tenants with compliance requirements get their own dedicated clusters instead. They set TTL to 8 hours (a workday). A common problem was that some tenants had very long sessions (e.g., admin users), causing memory imbalance. They solved it by capping TTL at 24 hours. They use the allkeys-lru eviction policy because every key in the cache is a session with a TTL and nothing must be kept indefinitely. They monitor keyspace_hits and keyspace_misses from the Redis INFO output (surfaced in CloudWatch as CacheHits and CacheMisses) to detect session issues. When a tenant reports frequent logouts, they check whether the TTL is too short or evictions are happening.

How DVA-C02 Actually Tests This

What DVA-C02 Tests on ElastiCache for Session Caching

Caching strategies fall under Domain 1 (Development with AWS Services) and Domain 4 (Troubleshooting and Optimization) of the DVA-C02 exam guide. Specifically, you need to know:

How to offload session state from EC2 instances to ElastiCache (Redis) to achieve statelessness.

Differences between Redis and Memcached for session caching.

TTL, eviction policies, and their impact on session persistence.

Replication and Multi-AZ for high availability.

Cluster mode for scaling beyond a single node.

Security: VPC, encryption, auth token.

Common Wrong Answers and Why Candidates Choose Them

1. Choosing Memcached for sessions: Many candidates think Memcached is faster because it is simpler. However, Memcached lacks persistence and replication, so a node failure loses all sessions. The exam expects Redis.

2. Setting TTL too high or not setting it: Candidates may forget that sessions need TTL. Without TTL, memory fills up, causing evictions. The exam will test that you must set a reasonable TTL.

3. Using DynamoDB DAX for sessions: DAX is a cache for DynamoDB, not a general-purpose cache. It cannot store arbitrary session data. ElastiCache is the correct service.

4. Configuring ElastiCache without VPC: ElastiCache must be in a VPC. Some candidates think it can be public, but it's not.

5. Using sticky sessions with ALB instead of ElastiCache: Sticky sessions tie a user to a specific instance, which breaks with auto scaling. ElastiCache is the correct centralized solution.

Specific Numbers and Terms on the Exam

- Default Redis port: 6379 (Memcached: 11211).
- Default eviction policy: volatile-lru (Redis 6.x).
- Maximum number of replicas per shard: 5.
- Maximum number of shards in cluster mode: 500 (with no replicas; the limit is 500 nodes per cluster on Redis 5.0.6 and later).
- TTL values: typical session TTL 15-60 minutes.
- Node types: cache.m5.large (6.38 GiB), cache.r5.large (13.07 GiB).
- Multi-AZ: requires at least one replica.

Edge Cases and Exceptions

- If you set TransitEncryptionEnabled to true, your Redis client must support TLS. The exam may ask what additional configuration is needed.
- In cluster mode, MGET only works for keys in the same hash slot. If you need to retrieve multiple sessions, ensure they share the same hash tag (e.g., {user123}:session:abc and {user123}:session:def).
- When using an AUTH token, the client must provide the password, and the connection must use TLS because AUTH requires encryption in transit. The exam may ask the CLI command: redis-cli -h endpoint -p 6379 --tls -a your-auth-token.
- If you enable automatic failover, the primary endpoint remains the same, but DNS updates to point to the new primary. The application should use the primary endpoint, not individual node endpoints.
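The hash tag rule in the edge cases above can be checked locally. This sketch follows the slot formula from the Redis Cluster specification (CRC16 of the key, or of the hash tag if present, modulo 16384) and uses Python's standard `binascii.crc_hqx`, which computes the same CRC16 variant when seeded with 0. The key names are examples.

```python
import binascii


def hash_slot(key: str) -> int:
    """Compute the Redis Cluster hash slot (0..16383) for a key."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty {tag} is hashed; "{}" falls back to the whole key.
        if end != -1 and end > start + 1:
            data = data[start + 1:end]
    # crc_hqx with an initial value of 0 is CRC16-CCITT (XMODEM).
    return binascii.crc_hqx(data, 0) % 16384


a = hash_slot("{user123}:session:abc")
b = hash_slot("{user123}:session:def")
print(a == b)  # same hash tag, so both keys map to the same slot
print(hash_slot("session:abc"), hash_slot("session:def"))
```

Keys that share a slot can be read together with MGET in cluster mode. The trade-off is that a heavily used hash tag concentrates load on one shard.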

How to Eliminate Wrong Answers

- If the scenario mentions "high availability" or "survive node failures", eliminate Memcached and non-replicated Redis.
- If the scenario mentions "scale beyond memory of a single node", cluster mode is required.
- If the scenario mentions "reduce database load for session reads", ElastiCache is the answer.
- If the scenario mentions "in-memory cache with TTL" together with persistence, replication, or rich data types, it's Redis (TTL alone does not rule out Memcached).
- If the scenario mentions "multi-key operations across shards", be cautious: they are only possible if keys share a hash slot.

Key Takeaways

ElastiCache for Redis is the correct service for centralized session caching to achieve stateless applications.

Always set TTL on session keys to prevent memory exhaustion and auto-clean stale sessions.

Use the 'volatile-ttl' eviction policy to evict sessions with the shortest remaining TTL first.

Enable Multi-AZ with at least one replica for high availability and automatic failover.

Cluster mode is needed only when session data exceeds a single node's memory (typically >50 GB) or for very high throughput.

ElastiCache clusters must be reachable over private networking from your application (the same VPC, or a peered or VPN-connected network); they are not internet-accessible.

Use the primary endpoint for writes and the reader endpoint for read load balancing across replicas.

Monitor CloudWatch metrics: CacheHits, CacheMisses, Evictions, and CurrConnections.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

ElastiCache for Redis

Supports backups through RDB snapshots for session durability (AOF is only available on legacy engine versions).

Supports replication (up to 5 replicas) and automatic failover.

Native TTL via EXPIRE command.

Rich data structures (hashes, lists, sets) useful for complex session data.

Recommended for session caching on DVA-C02.

ElastiCache for Memcached

No persistence; data lost on node failure.

No replication; single point of failure.

Supports a per-item expiration time, but expired, evicted, or lost items cannot be recovered.

Simple key-value store; limited data types.

Not suitable for session caching; used for simple caching of database query results.

Watch Out for These

Mistake

ElastiCache can be accessed from the internet.

Correct

ElastiCache clusters are launched in a VPC and are not publicly accessible. They can only be reached from within the same VPC, via VPC peering, VPN, or Direct Connect. Internet access requires a proxy or a load balancer.

Mistake

Memcached is better for session caching because it is simpler and faster.

Correct

Memcached lacks persistence and replication. It does support per-item expiration, but if a Memcached node fails, all sessions on it are lost. Redis supports snapshot-based backups, replication with automatic failover, and TTL, making it the correct choice for session caching on the exam.

Mistake

Setting TTL is optional; Redis will automatically expire old sessions.

Correct

Redis only expires keys that have a TTL set via EXPIRE or SETEX. Without TTL, keys remain forever until memory is full, then eviction policies (like LRU) remove keys. For session caching, you must always set TTL to prevent memory exhaustion and to automatically clean up stale sessions.

Mistake

ElastiCache cluster mode is required for high availability.

Correct

High availability (automatic failover) is available in both cluster mode and non-cluster mode when you have at least one replica. Cluster mode is specifically for sharding data across multiple nodes to scale beyond a single node's memory or throughput.

Mistake

You can use the same ElastiCache cluster for both session caching and database query caching without any issues.

Correct

While technically possible, it is not recommended because session data is often sensitive and has different TTL and eviction requirements. Mixing use cases can lead to eviction of important session data due to cache pressure from query results. Better to use separate clusters or logical databases (though Redis databases are not isolated).

Frequently Asked Questions

What is the default eviction policy for ElastiCache Redis?

The default eviction policy for Redis 6.x is `volatile-lru`, which evicts keys with a TTL set using the LRU algorithm. For session caching, you may want to change it to `volatile-ttl` to evict keys with the nearest expiration time, ensuring active sessions are less likely to be evicted. You can set the policy using the `maxmemory-policy` parameter in the ElastiCache parameter group.

Can I use ElastiCache for session caching with an Application Load Balancer?

Yes. In fact, ElastiCache enables you to disable sticky sessions on the ALB because session state is centralized. Without ElastiCache, you would need sticky sessions to ensure a user's requests go to the same instance, which breaks with auto scaling. With ElastiCache, any instance can serve any request because it reads the session from the cache. This improves scalability and fault tolerance.

What happens if an ElastiCache Redis node fails?

If you have Multi-AZ enabled with at least one replica, ElastiCache automatically promotes a replica to primary with minimal downtime (typically under 30 seconds). The primary endpoint DNS record is updated to point to the new primary. If you have no replicas, the node is replaced automatically, but all data in memory is lost. For session caching, this would log out all users. Therefore, always use at least one replica for production.

How do I set TTL for a session in Redis?

Use the `SETEX` command: `SETEX session:abc123 3600 "data"` sets the key with a TTL of 3600 seconds. Alternatively, use `SET` followed by `EXPIRE`. In most client libraries, there is a method like `setex(key, ttl, value)`. For example, in Python: `r.setex('session:abc123', 1800, '{"user":"john"}')`. The TTL should be set to the desired session timeout.

What is the difference between Redis cluster mode and non-cluster mode?

Non-cluster mode (standard) uses a single shard with up to 5 replicas. All data fits on one node. Cluster mode shards data across multiple node groups (shards), each with its own primary and replicas. Cluster mode supports up to 500 shards and scales horizontally. The trade-off: multi-key operations (e.g., MGET) only work if keys are in the same hash slot. For session caching, you typically start with non-cluster mode and switch to cluster mode when you exceed ~50 GB of session data or need more than 100,000 operations per second.

Can I access ElastiCache from an on-premises application?

Yes, but only if you have a VPN connection or AWS Direct Connect between your on-premises network and the VPC where the ElastiCache cluster resides. ElastiCache does not have a public endpoint. You can also use a proxy (e.g., HAProxy) running on an EC2 instance that is accessible from on-premises and forwards traffic to ElastiCache.

How do I encrypt session data in ElastiCache?

ElastiCache for Redis supports encryption in transit (TLS) and encryption at rest (using KMS). To enable encryption in transit, set `TransitEncryptionEnabled` to `true` when creating the cluster. Your application client must support TLS (e.g., use `rediss://` scheme). For encryption at rest, set `AtRestEncryptionEnabled` to `true`. You can also use an AUTH token for password-based authentication.
