DVA-C02 | Chapter 3 of 101 | Objective 1.3

DynamoDB for Developers

This chapter covers Amazon DynamoDB, AWS's flagship NoSQL key-value and document database, which is heavily tested on the DVA-C02 exam (approximately 15-20% of questions). You will learn the core concepts: tables, items, attributes, primary keys (partition key and sort key), secondary indexes, read/write capacity modes, consistency models, and advanced features like DynamoDB Streams, Transactions, and DAX. Mastering DynamoDB is essential for building scalable serverless applications and acing the developer exam.

25 min read
Intermediate
Updated May 31, 2026

DynamoDB as a Massive Filing Cabinet with Automatic Index Cards

Imagine a giant filing cabinet with millions of drawers, each labeled by a unique drawer ID (partition key). Inside each drawer, documents are sorted by a secondary label (sort key). You can instantly open any drawer by its ID and retrieve a document by its sort key. This is DynamoDB's core access pattern: direct, predictable, and fast. Now imagine that the cabinet automatically creates and maintains extra index cards (Global Secondary Indexes) that reorganize documents by different labels, allowing you to search across drawers. The cabinet also has a 'strongly consistent' mode: if you just filed a document and ask for it immediately, the clerk checks the master copy of the drawer. In 'eventually consistent' mode (the default), the clerk may check a duplicate copy that can be a moment behind. As the cabinet fills up or traffic grows, it adds more drawers behind the scenes (partition splitting); you control how much traffic it accepts through provisioned capacity, auto scaling, or on-demand mode. If too many requests target one drawer, that drawer becomes a bottleneck and requests are throttled: that's a hot partition. DynamoDB Accelerator (DAX) is like a super-fast assistant who memorizes the most frequently requested documents and hands them to you without even opening the cabinet.

How It Actually Works

What is Amazon DynamoDB?

Amazon DynamoDB is a fully managed NoSQL database service that provides single-digit millisecond latency at any scale. It is a key-value and document database that supports both structured and semi-structured data (JSON, documents). It is serverless — no servers to provision, patch, or manage — and scales automatically to handle throughput and storage. DynamoDB is designed for high-traffic web applications, gaming, IoT, ad tech, and any workload requiring consistent performance at scale.

Why DynamoDB Exists

Traditional relational databases (RDBMS) like Amazon RDS enforce schemas, support complex joins, and provide ACID transactions. However, they struggle with horizontal scaling and can introduce latency under heavy read/write loads. DynamoDB sacrifices some relational features (joins, complex queries) to achieve predictable performance, automatic scaling, and high availability. It is optimized for access patterns where you know your primary key and need fast, consistent lookups.

How DynamoDB Works Internally

DynamoDB stores data across multiple partitions automatically. Each partition is a unit of storage and compute. When you create a table, DynamoDB initially creates enough partitions to handle the provisioned capacity. Each item is placed on a partition according to the hash of its partition key value. The hash spreads distinct key values evenly, but every request for the same key value lands on the same partition. A partition key with few distinct values, or one where most traffic targets a single value at a time (for example, today's date), therefore concentrates load and creates a hot partition.
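The effect of key choice on distribution can be sketched in a few lines. This is an illustration only: MD5 stands in for DynamoDB's internal hash function (which is not public), and the partition count of 4 is a hypothetical value, since DynamoDB manages partitions itself.

```python
import hashlib
from collections import Counter

NUM_PARTITIONS = 4  # hypothetical; DynamoDB decides the real partition count


def partition_for(partition_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a partition key value to a partition index via a hash.

    MD5 is a stand-in; DynamoDB's actual hash function is internal.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def write_distribution(keys) -> Counter:
    """Count how many writes land on each partition for a sequence of keys."""
    return Counter(partition_for(k) for k in keys)


# 1,000 writes keyed by distinct user IDs spread across the partitions.
spread = write_distribution(f"user{i}" for i in range(1000))

# 1,000 writes that all share one key value (a date) hit a single partition.
hot = write_distribution("2026-05-31" for _ in range(1000))

print("distinct keys:", dict(spread))
print("single key:   ", dict(hot))
```

The second counter always has exactly one entry, whichever partition the date happens to hash to. That is the hot-partition pattern described above.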

Key Components

#### Tables, Items, and Attributes

- Table: A collection of items, analogous to a table in an RDBMS.
- Item: A single record in a table, analogous to a row. Each item is uniquely identified by the primary key.
- Attribute: A data element within an item, analogous to a column. Attributes can be of various types: scalar (string, number, binary, boolean, null), document (list, map), or set (string set, number set, binary set).

#### Primary Key

Every table must have a primary key. Two options:

- Partition key only (simple primary key): A single attribute that uniquely identifies each item. Example: UserId (String).
- Partition key and sort key (composite primary key): Two attributes that together form the unique identifier. Example: UserId (partition key) and Timestamp (sort key). Items with the same partition key are stored together, sorted by sort key.

#### Secondary Indexes

DynamoDB supports two types of secondary indexes to enable querying on non-key attributes:

- Global Secondary Index (GSI): An index whose partition key and optional sort key can differ from the table's. It spans all partitions. You can create up to 20 GSIs per table (default limit, can be increased). In provisioned mode, each GSI has its own read/write capacity settings.
- Local Secondary Index (LSI): An index that uses the same partition key as the table but a different sort key. It is local to a partition key value. You can create up to 5 LSIs per table, and they must be defined at table creation time (cannot be added later). LSIs share the table's provisioned capacity.

#### Read/Write Capacity Modes

DynamoDB offers two capacity modes:

- Provisioned Capacity: You specify the number of reads and writes per second as Read Capacity Units (RCUs) and Write Capacity Units (WCUs). One RCU = one strongly consistent read per second for an item up to 4 KB, or two eventually consistent reads per second. One WCU = one write per second for an item up to 1 KB. Larger items consume additional units, rounded up to the next 4 KB (reads) or 1 KB (writes). You can use Auto Scaling to adjust capacity automatically based on utilization.
- On-Demand Capacity: You pay per request (reads and writes) and do no capacity planning. DynamoDB adapts to traffic automatically, although a table can still be throttled if traffic rises to more than double its previous peak within 30 minutes. Suitable for unpredictable workloads. On-demand is more expensive per request than provisioned.
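The RCU and WCU rules above reduce to a little arithmetic. The following is a minimal sketch; the function names are made up for illustration, and the `transactional` flag applies the 2x cost that this chapter attributes to transactional operations.

```python
import math


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = False) -> int:
    """RCUs needed for a steady read rate.

    Item size rounds up to the next 4 KB block; an eventually consistent
    read costs half as much as a strongly consistent one.
    """
    blocks = max(1, math.ceil(item_size_bytes / 4096))
    units = blocks * reads_per_second
    if not strongly_consistent:
        units = units / 2
    return math.ceil(units)


def write_capacity_units(item_size_bytes: int, writes_per_second: int,
                         transactional: bool = False) -> int:
    """WCUs needed for a steady write rate.

    Item size rounds up to the next 1 KB block; transactional writes
    cost double.
    """
    blocks = max(1, math.ceil(item_size_bytes / 1024))
    units = blocks * writes_per_second
    return units * 2 if transactional else units


# 6 KB items occupy two 4 KB blocks.
print(read_capacity_units(6 * 1024, 100, strongly_consistent=True))  # 200
print(read_capacity_units(6 * 1024, 100))                            # 100
# 2.5 KB items occupy three 1 KB blocks.
print(write_capacity_units(2560, 10))                                # 30
print(write_capacity_units(2560, 10, transactional=True))            # 60
```

Exam questions on capacity usually follow this shape: round the item size up first, multiply by the request rate, then halve for eventually consistent reads or double for transactions.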

#### Consistency Models

DynamoDB offers two consistency models for reads:

- Eventually Consistent Reads (default): The read may not reflect the most recent write. Consistency is usually achieved within one second. This option uses half the RCUs of strongly consistent reads.
- Strongly Consistent Reads: Returns the most up-to-date data. However, it may have higher latency and is not supported in some scenarios (e.g., GSIs are eventually consistent only).

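#### DynamoDB Streams

DynamoDB Streams captures a time-ordered sequence of item-level changes (inserts, updates, deletes) in a table. Stream records are written in near real-time and can be read by AWS Lambda, by applications built on the Kinesis Client Library (through the DynamoDB Streams Kinesis Adapter), or by custom consumers. Streams are useful for cross-region replication, materialized views, and event-driven architectures. You can enable streams on a table with one of four view types:

- KEYS_ONLY: only the key attributes of the modified item.
- NEW_IMAGE: the entire item as it appears after the change.
- OLD_IMAGE: the entire item as it appeared before the change.
- NEW_AND_OLD_IMAGES: both the new and the old images of the item.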

#### Transactions

DynamoDB Transactions provide ACID guarantees across one or more tables within a single AWS account and region. You can use the TransactGetItems and TransactWriteItems APIs. Transactions consume 2x the RCUs/WCUs of standard operations. They are useful for financial transactions, inventory management, and other scenarios requiring atomicity.
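As an illustration of what a TransactWriteItems request looks like, the sketch below builds the parameters for an all-or-nothing balance transfer. The table layout (an `AccountId` partition key and a numeric `Balance` attribute) and the function name are hypothetical. The dictionary uses the low-level DynamoDB JSON format and only constructs the request; it does not call AWS.

```python
def build_transfer(table: str, from_id: str, to_id: str, amount: int) -> dict:
    """Build TransactWriteItems parameters for an atomic transfer.

    Both updates succeed together or neither is applied.
    """
    names = {"#bal": "Balance"}               # placeholder for the attribute name
    values = {":amt": {"N": str(amount)}}     # numbers travel as strings
    return {
        "TransactItems": [
            {
                "Update": {
                    "TableName": table,
                    "Key": {"AccountId": {"S": from_id}},
                    "UpdateExpression": "SET #bal = #bal - :amt",
                    # If funds are short, the whole transaction is cancelled.
                    "ConditionExpression": "#bal >= :amt",
                    "ExpressionAttributeNames": names,
                    "ExpressionAttributeValues": values,
                }
            },
            {
                "Update": {
                    "TableName": table,
                    "Key": {"AccountId": {"S": to_id}},
                    "UpdateExpression": "SET #bal = #bal + :amt",
                    "ExpressionAttributeNames": names,
                    "ExpressionAttributeValues": values,
                }
            },
        ]
    }


params = build_transfer("Accounts", "acct-1", "acct-2", 50)
print(len(params["TransactItems"]))  # 2
```

Assuming boto3 is installed and credentials are configured, the result could be passed as `boto3.client("dynamodb").transact_write_items(**params)`.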

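#### Time to Live (TTL)

TTL allows you to define a per-item timestamp attribute to automatically delete items after a specified expiry. The attribute must be a Number holding a Unix epoch time in seconds. This helps manage storage costs and comply with data retention policies. TTL is best-effort: expired items are deleted in the background, typically within 48 hours (AWS documentation has described the window as within 48 hours and, more recently, as within a few days). Until an item is actually deleted it can still appear in reads, so filter on the expiry attribute if that matters.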
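A TTL attribute has to hold a Unix epoch timestamp in seconds, stored as a Number. Because deletion is best-effort, readers often filter out items that have expired but not yet been removed. A minimal sketch of both pieces follows; the attribute name `ExpiresAt` and the helper names are assumptions for illustration.

```python
import time
from typing import Optional

TTL_ATTRIBUTE = "ExpiresAt"  # hypothetical attribute name chosen for this sketch


def expires_at(days: int, now: Optional[float] = None) -> int:
    """Return a TTL value: epoch time in whole seconds, `days` from now."""
    base = time.time() if now is None else now
    return int(base) + days * 24 * 60 * 60


def is_expired(item: dict, now: Optional[float] = None) -> bool:
    """Check a DynamoDB-JSON item against its TTL attribute.

    Items without the attribute never expire.
    """
    attr = item.get(TTL_ATTRIBUTE)
    if attr is None:
        return False
    current = time.time() if now is None else now
    return int(attr["N"]) <= current


# Build an item that should expire 30 days from now.
item = {
    "DeviceId": {"S": "sensor-1"},
    TTL_ATTRIBUTE: {"N": str(expires_at(30))},
}
print(is_expired(item))  # False
```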

#### DynamoDB Accelerator (DAX)

DAX is an in-memory cache for DynamoDB that delivers up to 10x read performance improvement. It is fully managed, compatible with existing DynamoDB APIs, and sits between your application and the database. DAX is ideal for read-heavy workloads with frequent access to the same items (e.g., gaming leaderboards, product catalogs).

How DynamoDB Interacts with Related Technologies

AWS Lambda: Lambda can be triggered by DynamoDB Streams to process changes in real-time (e.g., update an Elasticsearch index, send notifications).

Amazon S3: DynamoDB can store metadata for objects in S3, enabling fast lookups.

Amazon SQS: Decouple microservices that write to DynamoDB and need asynchronous processing.

AWS AppSync: Use DynamoDB as a data source for GraphQL APIs.

Amazon CloudWatch: Monitor DynamoDB metrics like ConsumedReadCapacityUnits, ThrottledRequests, and SystemErrors.

Configuration and Verification Commands

Using AWS CLI:

Create a table:

aws dynamodb create-table \
    --table-name Users \
    --attribute-definitions AttributeName=UserId,AttributeType=S \
    --key-schema AttributeName=UserId,KeyType=HASH \
    --billing-mode PROVISIONED \
    --provisioned-throughput ReadCapacityUnits=5,WriteCapacityUnits=5

Put an item:

aws dynamodb put-item \
    --table-name Users \
    --item '{"UserId": {"S": "user123"}, "Name": {"S": "Alice"}}'

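Query with a GSI (this assumes a GSI named NameIndex, keyed on Name, already exists on the table; the create-table command above does not define one). Name is a DynamoDB reserved word, so the expression refers to it through the placeholder #n:

aws dynamodb query \
    --table-name Users \
    --index-name NameIndex \
    --key-condition-expression "#n = :name" \
    --expression-attribute-names '{ "#n": "Name" }' \
    --expression-attribute-values '{ ":name": {"S": "Alice"} }'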

Key Values, Defaults, and Timers

Maximum item size: 400 KB (including attribute names and values).

Maximum partition size: 10 GB.

Default read consistency: Eventually consistent.

RCU for eventually consistent read: 1 RCU = 2 reads of 4 KB items per second.

RCU for strongly consistent read: 1 RCU = 1 read of 4 KB item per second.

WCU: 1 WCU = 1 write of up to 1 KB per second.

GSI limit: Up to 20 per table.

LSI limit: Up to 5 per table, defined at creation.

TTL deletion: Typically within 48 hours of expiry.

Streams retention: 24 hours.

Transactional RCU/WCU: 2x standard.

DAX cache: Up to 10x read performance improvement.

Exam-First Structure

For the DVA-C02, you must know:

How to choose between provisioned and on-demand capacity.

How to design primary keys to avoid hot partitions.

How to use GSIs and LSIs to support query patterns.

The implications of eventually vs. strongly consistent reads.

How to use DynamoDB Streams with Lambda.

How to implement transactions correctly.

The limits and pricing model.

Walk-Through

1. Choose a primary key design

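The primary key determines how data is distributed and accessed. For simple lookups, use a partition key only (e.g., UserId). For hierarchical or time-series data, use a composite key (partition + sort key). Avoid partition keys where most traffic targets one value at a time, such as the current date or timestamp, because they create hot partitions. Instead, use a hash prefix or a random suffix. For example, a gaming leaderboard can use GameId as the partition key and PlayerId as the sort key, then expose Score as the sort key of a secondary index to query top scores efficiently. Using Score directly as the table's sort key would let two players with the same score overwrite each other, because the primary key must be unique.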
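The suffix technique mentioned above is often called write sharding. The sketch below derives the suffix deterministically from another attribute, so a writer and a reader that both know that attribute compute the same key. The shard count of 10 and the helper names are hypothetical choices for illustration.

```python
import hashlib
from typing import List

SHARD_COUNT = 10  # hypothetical; size this to the write rate you need to spread


def sharded_key(base_key: str, discriminator: str,
                shard_count: int = SHARD_COUNT) -> str:
    """Append a shard suffix derived from another attribute of the item.

    The same (base_key, discriminator) pair always maps to the same shard.
    """
    digest = hashlib.sha256(discriminator.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shard_count: int = SHARD_COUNT) -> List[str]:
    """Every partition key a reader must query to see the whole logical group."""
    return [f"{base_key}#{n}" for n in range(shard_count)]


# Writes for one busy date are spread over up to SHARD_COUNT partition keys.
key = sharded_key("2026-05-31", "order-1001")
print(key in all_shard_keys("2026-05-31"))  # True
```

The trade-off is on the read side: fetching everything for the date means one Query per shard key, with the results merged in the application.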

2. Choose a capacity mode

Decide between provisioned and on-demand. Provisioned is cost-effective for predictable workloads; you set RCUs and WCUs and can enable Auto Scaling. On-demand is simpler for unpredictable traffic but costs more per request. The exam often tests that on-demand is ideal for new applications with unknown traffic patterns or spiky workloads. Remember that on-demand can scale instantly but has no reserved capacity.

3. Define secondary indexes

If you need to query on non-key attributes, create GSIs or LSIs. GSIs can have a different partition key and sort key; they are eventually consistent only. LSIs use the same partition key but a different sort key; they support strongly consistent reads but must be created at table creation. The exam tests that GSIs have their own capacity settings (if using provisioned mode) and that they are eventually consistent. Also, GSIs can be added after table creation; LSIs cannot.
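To make the GSI and LSI differences concrete, here is a sketch of CreateTable parameters that define one of each. The `Orders` table, its attribute names, and the index names are hypothetical. The dictionary only constructs the request and does not call AWS.

```python
def orders_table_definition() -> dict:
    """CreateTable parameters with one LSI and one GSI (provisioned mode)."""
    return {
        "TableName": "Orders",
        # Only attributes used in some key schema are declared here.
        "AttributeDefinitions": [
            {"AttributeName": "CustomerId", "AttributeType": "S"},
            {"AttributeName": "OrderDate", "AttributeType": "S"},
            {"AttributeName": "OrderTotal", "AttributeType": "N"},
            {"AttributeName": "Status", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "CustomerId", "KeyType": "HASH"},
            {"AttributeName": "OrderDate", "KeyType": "RANGE"},
        ],
        "BillingMode": "PROVISIONED",
        "ProvisionedThroughput": {"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
        # LSI: same partition key as the table, different sort key,
        # no capacity of its own, and only definable at creation time.
        "LocalSecondaryIndexes": [
            {
                "IndexName": "ByTotal",
                "KeySchema": [
                    {"AttributeName": "CustomerId", "KeyType": "HASH"},
                    {"AttributeName": "OrderTotal", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "KEYS_ONLY"},
            }
        ],
        # GSI: its own partition key and its own provisioned capacity.
        "GlobalSecondaryIndexes": [
            {
                "IndexName": "ByStatus",
                "KeySchema": [
                    {"AttributeName": "Status", "KeyType": "HASH"},
                    {"AttributeName": "OrderDate", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
                "ProvisionedThroughput": {
                    "ReadCapacityUnits": 5,
                    "WriteCapacityUnits": 5,
                },
            }
        ],
    }


definition = orders_table_definition()
print([i["IndexName"] for i in definition["GlobalSecondaryIndexes"]])  # ['ByStatus']
```

Assuming boto3 is available, the dictionary could be passed as `boto3.client("dynamodb").create_table(**definition)`.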

4. Optimize read/write operations

Use BatchGetItem and BatchWriteItem to reduce API calls. For large scans, use parallel scans and paginate through the results. Use ProjectionExpression to retrieve only needed attributes (this trims the response, not the RCUs consumed, which are based on the full item size). When a read must reflect a write that just completed, set ConsistentRead=true; remember that a strongly consistent read consumes twice the RCUs of an eventually consistent one. Use DAX for read-heavy workloads to cache frequent queries. The exam tests that DAX is an in-memory read cache, not a write cache.
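Batch operations have per-call limits (25 put or delete requests for BatchWriteItem, 100 keys for BatchGetItem), so client code usually splits its input into chunks first. The sketch below shows that splitting for writes; the helper names are made up for illustration, and the requests are only built, not sent.

```python
from itertools import islice
from typing import Iterable, Iterator, List

BATCH_WRITE_LIMIT = 25   # max put/delete requests per BatchWriteItem call
BATCH_GET_LIMIT = 100    # max keys per BatchGetItem call


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` elements."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def batch_write_requests(table: str, items: Iterable[dict]) -> Iterator[dict]:
    """Build one BatchWriteItem parameter dict per chunk of items."""
    for batch in chunked(items, BATCH_WRITE_LIMIT):
        yield {
            "RequestItems": {
                table: [{"PutRequest": {"Item": item}} for item in batch]
            }
        }


# 60 items become three calls: 25 + 25 + 10.
items = [{"UserId": {"S": f"user{i}"}} for i in range(60)]
sizes = [len(r["RequestItems"]["Users"]) for r in batch_write_requests("Users", items)]
print(sizes)  # [25, 25, 10]
```

A real caller also needs to resend anything the service returns under `UnprocessedItems`, ideally with exponential backoff, because a batch can partially succeed.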

5. Enable DynamoDB Streams

Streams are essential for event-driven architectures. Enable streams with the appropriate view type. For example, use NEW_AND_OLD_IMAGES to capture both before and after states. Then attach a Lambda function to process stream records. The exam tests that stream records for a given item appear in the order the changes were made (there is no global ordering across items), and that processing should be idempotent because a failed batch is retried and the function can see the same record again. Also, streams have a 24-hour retention, so consumers must process within that window.
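A stream-triggered Lambda function receives a batch of records and should tolerate seeing the same record twice. The sketch below deduplicates on each record's `eventID`. The in-memory set is only a stand-in: in a real deployment it would be a durable store such as a conditional write to a table, since Lambda execution environments do not share memory.

```python
# Stand-in for a durable deduplication store (an assumption of this sketch).
_seen_event_ids = set()


def handler(event, context):
    """Process DynamoDB stream records, skipping any already handled."""
    processed = []
    for record in event.get("Records", []):
        event_id = record["eventID"]
        if event_id in _seen_event_ids:
            continue  # repeat delivery from a retried batch
        change = record["dynamodb"]
        processed.append({
            "action": record["eventName"],    # INSERT, MODIFY, or REMOVE
            "keys": change["Keys"],
            # Image fields are present only if the stream view type includes them.
            "old": change.get("OldImage"),
            "new": change.get("NewImage"),
        })
        _seen_event_ids.add(event_id)
    return {"processed": processed}
```

Invoking the handler twice with the same batch does the work once and returns an empty `processed` list the second time, which is the behaviour a retry-safe consumer needs.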

What This Looks Like on the Job

Enterprise Scenario 1: E-Commerce Product Catalog

A large e-commerce company stores product information in DynamoDB. Each product has a unique ProductId (partition key) and a SKU (sort key). They need to query products by category, so they create a GSI with Category as partition key and Price as sort key. This allows fast retrieval of all products in a category sorted by price. They use provisioned capacity with Auto Scaling because traffic peaks during holiday sales. They also use DAX to cache popular product details, reducing database load and improving latency from 10ms to 1ms. A common misconfiguration is choosing a partition key that doesn't distribute evenly; for example, using Category alone as partition key for the main table would cause hot partitions for popular categories. Instead, they use ProductId as the partition key and the GSI handles category queries.

Enterprise Scenario 2: Gaming Leaderboard

A mobile game uses DynamoDB to store player scores. The table has a composite primary key: GameId (partition key) and PlayerId (sort key). They need to display the top 100 players for each game. To avoid scanning all players, they create a GSI with GameId as partition key and Score as sort key. Index entries are stored in ascending sort-key order, so querying this GSI with ScanIndexForward=false and a limit of 100 returns the top scores efficiently. They use on-demand capacity because player activity spikes unpredictably after new game releases. They also enable DynamoDB Streams to trigger a Lambda function that updates a real-time leaderboard in ElastiCache. A pitfall: forgetting that GSIs are eventually consistent, so the leaderboard might be slightly stale. They mitigate by using strongly consistent reads on the main table when a player checks their own score.
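The two reads in this scenario can be sketched as request parameters. The table name `PlayerScores` and index name `GameScoreIndex` are hypothetical; the key attributes follow the scenario. The functions only build the dictionaries that would be passed to the Query and GetItem operations.

```python
def top_scores_query(game_id: str, limit: int = 100) -> dict:
    """Query parameters for the highest scores in one game, via the GSI."""
    return {
        "TableName": "PlayerScores",        # hypothetical table name
        "IndexName": "GameScoreIndex",      # hypothetical GSI: GameId + Score
        "KeyConditionExpression": "GameId = :g",
        "ExpressionAttributeValues": {":g": {"S": game_id}},
        "ScanIndexForward": False,          # descending by Score
        "Limit": limit,
        # No ConsistentRead here: GSIs support eventually consistent reads only.
    }


def own_score_get(game_id: str, player_id: str) -> dict:
    """GetItem parameters for one player's item, read from the base table."""
    return {
        "TableName": "PlayerScores",
        "Key": {"GameId": {"S": game_id}, "PlayerId": {"S": player_id}},
        "ConsistentRead": True,             # allowed on the base table
    }


print(top_scores_query("game-42")["Limit"])  # 100
```

Setting `ConsistentRead` to true on a GSI query is rejected by the service, which is why the strongly consistent read goes to the base table by its full primary key.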

Enterprise Scenario 3: IoT Device Telemetry

An IoT company ingests millions of sensor readings per second. They use DynamoDB with a composite key: DeviceId (partition key) and Timestamp (sort key). To avoid hot partitions, they add a random suffix to DeviceId (e.g., DeviceId#ShardId) to distribute writes across partitions. They use provisioned capacity with a write-heavy workload. They set up a TTL attribute to delete data older than 30 days. They also use DynamoDB Streams to feed a Kinesis Data Stream for real-time analytics. A common failure: not properly sizing the write capacity, leading to throttling. They monitor the WriteThrottleEvents metric in CloudWatch and adjust capacity or use on-demand. Also, if the table has an LSI, they must keep each item collection (all items sharing one partition key value) under 10 GB; sharding the key helps here too.

How DVA-C02 Actually Tests This

What DVA-C02 Tests on DynamoDB (Objective 1.3)

The exam focuses on your ability to design and implement DynamoDB solutions. Key areas:

- Primary key design: Choosing between partition key only and composite key; avoiding hot partitions.
- Capacity modes: When to use provisioned vs. on-demand; Auto Scaling.
- Secondary indexes: GSI vs. LSI; eventual consistency of GSIs; capacity implications.
- Consistency models: Eventually consistent (default) vs. strongly consistent; RCU consumption.
- Transactions: ACID guarantees; 2x RCU/WCU cost.
- DynamoDB Streams: Use with Lambda; view types; ordering; retention.
- DAX: In-memory cache; use cases; not a write cache.
- TTL: Automatic deletion; best-effort; 48-hour window.
- Limits: Item size (400 KB), partition size (10 GB), GSI count (20), LSI count (5).

Common Wrong Answers and Why

1. Choosing strongly consistent reads for all queries: Candidates think it's safer, but it doubles RCU cost and can increase latency. The exam expects you to use eventually consistent unless the application requires immediate consistency after a write.

2. Using on-demand for predictable workloads: On-demand is more expensive; provisioned with Auto Scaling is cheaper for steady traffic.

3. Adding an LSI after table creation: LSIs must be defined at creation; GSIs can be added later. Many candidates confuse this.

4. Assuming GSIs support strongly consistent reads: GSIs are always eventually consistent. Only the main table and LSIs support strongly consistent reads.

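5. Using DAX for write caching: DAX is a write-through cache, not a write buffer. A write sent through DAX is written to DynamoDB first and then reflected in the cache, so DAX speeds up reads only; it does not batch writes or reduce WCU consumption.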

Specific Numbers and Terms

RCU/WCU formulas: 1 RCU = 1 strongly consistent read of 4 KB per second; 1 RCU = 2 eventually consistent reads of 4 KB per second. 1 WCU = 1 write of 1 KB per second.

Item size limit: 400 KB.

Partition size limit: 10 GB.

GSI limit: 20 per table.

LSI limit: 5 per table.

Stream retention: 24 hours.

Transactional operations cost 2x.

DAX offers up to 10x read performance.

Edge Cases the Exam Tests

Empty or sparse GSIs: If many items don't have the GSI key attribute, they are not indexed. This can cause unexpected query results.

GSI write capacity: If a GSI's provisioned capacity is too low, writes to the main table can be throttled even if the main table has enough capacity.

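Transactional conflicts: If two transactions try to modify the same item simultaneously, one is rejected with a TransactionCanceledException. A non-transactional write (PutItem, UpdateItem, DeleteItem) that collides with an in-flight transaction on the same item receives a TransactionConflictException.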

Streams ordering: Stream records for a given item appear in the same order as the modifications to that item, but there is no global ordering across items.

How to Eliminate Wrong Answers

If a question mentions 'strong consistency' but the use case can tolerate eventual consistency, eliminate the strongly consistent option.

If a question mentions 'adding an index after table creation', eliminate LSI as an option.

If a question mentions 'cache for reads', consider DAX; for writes, consider SQS or Lambda.

If a question involves 'atomic transactions across multiple tables', consider DynamoDB Transactions.

If a question involves 'real-time processing of changes', consider DynamoDB Streams + Lambda.

Key Takeaways

DynamoDB is a fully managed NoSQL key-value and document database with single-digit millisecond latency at any scale.

Every table must have a primary key: partition key only or composite (partition key + sort key).

Choose provisioned capacity for predictable workloads; on-demand for unpredictable or new applications.

GSIs can be created anytime, are eventually consistent, and have their own capacity settings; LSIs must be defined at table creation, share table capacity, and support strongly consistent reads.

1 RCU = 1 strongly consistent read of 4 KB/sec or 2 eventually consistent reads of 4 KB/sec. 1 WCU = 1 write of 1 KB/sec.

DynamoDB Streams capture item-level changes with 24-hour retention; commonly used with Lambda for event-driven processing.

DynamoDB Transactions provide ACID guarantees across one or more tables but cost 2x RCUs/WCUs.

DAX is an in-memory read cache that improves read performance up to 10x; it does not cache writes.

Maximum item size is 400 KB; maximum partition size is 10 GB.

TTL automatically deletes expired items within 48 hours (best-effort).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Provisioned Capacity

Specify RCUs and WCUs in advance.

Cost-effective for predictable, steady traffic.

Supports Auto Scaling to adjust capacity based on utilization.

Can be throttled if capacity is exceeded.

Requires capacity planning and monitoring.

On-Demand Capacity

No capacity planning needed; scales instantly.

Pay per request (read/write units consumed).

Ideal for new applications, spiky, or unpredictable workloads.

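Not throttled for exceeding a provisioned limit, but requests can still be throttled by table and account quotas, or when traffic rises to more than double the previous peak within 30 minutes.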

Higher per-request cost than provisioned.

Global Secondary Index (GSI)

Different partition key and sort key from table.

Spans all partitions.

Eventually consistent only.

Can be created after table creation.

Has its own provisioned capacity settings (if using provisioned mode).

Local Secondary Index (LSI)

Same partition key as table, different sort key.

Local to a partition.

Supports strongly consistent reads.

Must be defined at table creation; cannot be added later.

Shares table's provisioned capacity.

Watch Out for These

Mistake

DynamoDB is a relational database that supports joins and complex queries.

Correct

DynamoDB is a NoSQL key-value and document database. It does not support joins, foreign keys, or complex SQL queries. Queries are limited to primary key and secondary index lookups. For complex queries, use Amazon RDS or Aurora.

Mistake

You can change the primary key of a table after creation.

Correct

The primary key schema is immutable after table creation. To change it, you must create a new table, migrate data, and delete the old table. You can add GSIs at any time, and LSIs only at table creation, but neither changes the base table's key schema.

Mistake

DynamoDB strongly consistent reads are always the best choice.

Correct

Strongly consistent reads consume twice the RCUs and may have higher latency. They are only needed when you must read the most recent write immediately. For most use cases, eventually consistent reads (default) are sufficient and more cost-effective.

Mistake

DynamoDB Accelerator (DAX) can be used for both read and write caching.

Correct

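DAX accelerates reads only. It is a write-through cache: a write sent through DAX goes to DynamoDB first and then updates the cached item, so writes are no faster and consume the same WCUs. DAX improves read performance by caching frequently accessed items. It does not buffer or batch writes.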

Mistake

On-demand capacity mode is cheaper than provisioned for steady workloads.

Correct

On-demand charges per request, which at steady, well-utilized traffic levels usually works out more expensive than provisioned capacity. Provisioned with Auto Scaling is more cost-effective for predictable traffic. On-demand is best for new applications with unknown or spiky traffic.


Frequently Asked Questions

What is the difference between eventually consistent and strongly consistent reads in DynamoDB?

Eventually consistent reads (default) may not reflect the most recent write; consistency is typically achieved within one second. They consume half the RCUs of strongly consistent reads. Strongly consistent reads return the most up-to-date data but may have higher latency and are not supported for GSIs. Use strongly consistent reads when you must read immediately after a write (e.g., updating a user's profile and then displaying it).

Can I change the primary key of a DynamoDB table after creation?

No, the primary key schema is immutable. To change it, you must create a new table with the desired key schema, migrate data (using AWS Data Pipeline, EMR, or custom scripts), and then delete the old table. You can, however, add GSIs after creation to support new query patterns.

When should I use DynamoDB Accelerator (DAX)?

Use DAX when you have read-heavy workloads with frequent access to the same items, such as gaming leaderboards, product catalogs, or session stores. DAX provides up to 10x read performance improvement by caching data in memory. It is not suitable for write-heavy workloads or for reads that need strong consistency: cached responses are eventually consistent, and strongly consistent read requests are passed through to DynamoDB without being cached, so they gain nothing from DAX.

How do DynamoDB Streams work with Lambda?

You can configure a Lambda function to be triggered by a DynamoDB Stream. When items in the table are inserted, updated, or deleted, the stream captures the change and Lambda invokes the function with a batch of records. The function can process the records (e.g., replicate to another table, update a search index). Ensure your function is idempotent: if an invocation fails, Lambda retries the batch, so the function may see the same record more than once.

What are the limitations of DynamoDB Transactions?

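A transaction can include up to 100 actions and 4 MB of data in total (the limit was 25 items until AWS raised it in 2022, and older material may still quote that figure). Transactions cost 2x the RCUs/WCUs of standard operations. They are available within a single AWS account and region. If a transaction is rejected because of a conflict or a failed condition check, the call fails with a TransactionCanceledException; a non-transactional write that collides with an in-flight transaction receives a TransactionConflictException. Use transactions for scenarios requiring atomicity, such as financial transfers.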

How can I avoid hot partitions in DynamoDB?

Choose a partition key that distributes traffic evenly. Avoid monotonically increasing values like timestamps. Instead, use a random suffix (e.g., UserId + random number) or a hash prefix. For time-series data, use a composite key with a coarse-grained partition key (e.g., DeviceId#Date) to spread writes. Also, consider using write sharding.

What is the maximum item size in DynamoDB?

The maximum item size is 400 KB, including attribute names and values. If you need to store larger objects, store them in Amazon S3 and keep a pointer (S3 key) in DynamoDB.
