This chapter covers DynamoDB Streams and event processing, a core topic for the SAA-C03 exam under Resilient Architectures (Objective 2.3). You will learn how to capture item-level changes in DynamoDB tables, process them in real time with AWS Lambda, and integrate with other services like Kinesis and EventBridge. Expect 3-5% of exam questions to test your understanding of stream mechanics, shard limits, and failure handling patterns. Mastering this topic will help you design decoupled, event-driven architectures that scale without polling or custom queues.
Imagine a stock exchange ticker tape machine that records every trade as it happens (buy orders, sell orders, price changes) and prints them on a continuous paper roll in real time. Each trade is an item-level change in a DynamoDB table. The ticker tape is the stream: an ordered, durable log of changes. The tape is cut into segments (shards), and brokers (consumers) each read a segment from their own starting point. The tape moves forward; once a trade is printed, it cannot be altered. A broker reads the tape at their own pace, moving a pointer (sequence number) to track progress. If a broker crashes, they restart from the last checkpoint, which may mean rereading a few trades: processing is at-least-once, not exactly-once. DynamoDB Streams works the same way: every create, update, or delete on a table item generates a stream record with a sequence number. AWS Lambda or Kinesis Client Library (KCL) applications poll shards, read records in order, and checkpoint after successful processing. The stream retains records for 24 hours, after which that part of the tape is discarded. Unlike a physical ticker tape, a stream is split into many shards that are read in parallel, and AWS recommends that no more than two processes read the same shard at once.
What is DynamoDB Streams?
DynamoDB Streams is a time-ordered sequence of item-level changes in a DynamoDB table. When you enable streams on a table, DynamoDB captures a log of every Create, Update, and Delete operation on items, including images of the item before and after the change (depending on the stream view type). Streams are designed for real-time event-driven processing, enabling use cases such as cross-region replication, audit logging, materialized views, and triggering downstream workflows.
How Streams Work Internally
Each DynamoDB table has a primary key (partition key and optional sort key). Data is distributed across partitions based on the partition key's hash. A stream is composed of shards, each corresponding to a partition. When you enable streams, DynamoDB creates a shard for each partition. As the table grows and splits, new shards are created automatically.
When an item is modified, DynamoDB appends a stream record with the following fields:
- eventID: A globally unique identifier for the event.
- eventName: INSERT, MODIFY, or REMOVE.
- eventSource: Always aws:dynamodb.
- awsRegion: The region where the table resides.
- dynamodb: Contains:
  - Keys: The primary key of the modified item.
  - NewImage: The item as it appears after the modification (if selected).
  - OldImage: The item as it appeared before the modification (if selected).
  - SequenceNumber: A monotonically increasing number per shard, representing the order of the record within that shard.
  - SizeBytes: The size of the stream record.
  - StreamViewType: The type of data captured.
- eventSourceARN: The ARN of the stream that emitted the record.
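To make the field list concrete, here is an illustrative record written as a Python dict, roughly as a Lambda function would receive it. Only the field names come from the list above; the key name `OrderId`, the `Status` attribute, the account ID, and every value are invented for the example.

```python
# Illustrative DynamoDB stream record; all values are invented for the example.
sample_record = {
    "eventID": "c4ca4238a0b923820dcc509a6f75849b",
    "eventName": "MODIFY",
    "eventSource": "aws:dynamodb",
    "awsRegion": "us-east-1",
    "dynamodb": {
        "Keys": {"OrderId": {"S": "order-1001"}},
        "OldImage": {"OrderId": {"S": "order-1001"}, "Status": {"S": "PENDING"}},
        "NewImage": {"OrderId": {"S": "order-1001"}, "Status": {"S": "PAID"}},
        "SequenceNumber": "111",  # sequence numbers are numeric strings
        "SizeBytes": 59,
        "StreamViewType": "NEW_AND_OLD_IMAGES",
    },
    "eventSourceARN": (
        "arn:aws:dynamodb:us-east-1:123456789012:"
        "table/MyTable/stream/2024-01-01T00:00:00.000"
    ),
}


def summarize(record):
    """Return (event name, primary key, sequence number) for one stream record."""
    change = record["dynamodb"]
    return record["eventName"], change["Keys"], change["SequenceNumber"]


print(summarize(sample_record))
```

Attribute values use DynamoDB's typed JSON form (`{"S": ...}` for strings, `{"N": ...}` for numbers), so consumers usually convert them before applying business logic.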
Stream records are retained for 24 hours. After that they are removed automatically, and the retention period cannot be changed.
Stream View Types
When enabling streams, you choose a stream view type:
- KEYS_ONLY: Only the primary key attributes of the modified item.
- NEW_IMAGE: The entire item after modification.
- OLD_IMAGE: The entire item before modification.
- NEW_AND_OLD_IMAGES: Both the before and after images.
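The view type decides which parts of the item a consumer can rely on. The sketch below encodes the four options as a lookup table and also accounts for the event type, since an INSERT has no previous version and a REMOVE has no new one. The helper name `images_available` is hypothetical.

```python
# Which parts of the item each stream view type writes into a record.
VIEW_TYPE_FIELDS = {
    "KEYS_ONLY": {"Keys"},
    "NEW_IMAGE": {"Keys", "NewImage"},
    "OLD_IMAGE": {"Keys", "OldImage"},
    "NEW_AND_OLD_IMAGES": {"Keys", "NewImage", "OldImage"},
}


def images_available(view_type, event_name):
    """Return the record fields a consumer can expect for this change."""
    fields = set(VIEW_TYPE_FIELDS[view_type])
    if event_name == "INSERT":
        fields.discard("OldImage")  # a new item has no previous version
    elif event_name == "REMOVE":
        fields.discard("NewImage")  # a deleted item has no new version
    return fields


for event_name in ("INSERT", "MODIFY", "REMOVE"):
    print(event_name, sorted(images_available("NEW_AND_OLD_IMAGES", event_name)))
```

A consumer that needs to compare before and after values therefore needs NEW_AND_OLD_IMAGES and must still tolerate a missing image on inserts and deletes.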
Shards and Limits
- DynamoDB creates, splits, and closes shards automatically as table partitions and traffic change; you do not provision or configure the shard count.
- AWS recommends that no more than two processes read from the same shard at the same time; additional readers can be throttled. Different shards can be read in parallel.
- A single GetRecords call returns up to 1 MB of data or 1,000 records, whichever comes first.
- DynamoDB items are limited to 400 KB, which bounds the size of each image in a stream record.
- Multiple Lambda functions or KCL applications can read from the same stream, subject to the two-readers-per-shard guidance. KCL applications coordinate shard assignment among their own workers.
Enabling DynamoDB Streams
You can enable streams via the AWS Management Console, CLI, or SDK. Using the AWS CLI:
aws dynamodb update-table --table-name MyTable --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES
To describe the stream:
aws dynamodb describe-table --table-name MyTable
The response includes LatestStreamArn and LatestStreamLabel.
Reading from a Stream
Applications read from streams using the DynamoDB Streams API:
- ListStreams: Returns all streams in a region.
- DescribeStream: Returns shard information for a stream.
- GetShardIterator: Returns a shard iterator (a pointer to a position in the shard).
- GetRecords: Returns records from a shard iterator.
A shard iterator can be:
- TRIM_HORIZON: Start from the oldest record (within 24 hours).
- LATEST: Start just after the most recent record, so only changes made from now on are read.
- AT_SEQUENCE_NUMBER: Start from a specific sequence number.
- AFTER_SEQUENCE_NUMBER: Start after a specific sequence number.
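The four iterator types differ only in where reading begins. The following is a toy in-memory model, not the Streams API: a sorted list of integers stands in for one shard's sequence numbers (real ones are numeric strings), and the hypothetical helper `start_index` returns the position of the first record a new reader would receive.

```python
from bisect import bisect_left, bisect_right


def start_index(sequence_numbers, iterator_type, sequence_number=None):
    """Index of the first record a new reader would receive.

    A result equal to len(sequence_numbers) means the reader waits for
    records that have not been written yet.
    """
    if iterator_type == "TRIM_HORIZON":
        return 0  # oldest record still retained
    if iterator_type == "LATEST":
        return len(sequence_numbers)  # nothing existing, only future records
    if iterator_type not in ("AT_SEQUENCE_NUMBER", "AFTER_SEQUENCE_NUMBER"):
        raise ValueError(f"unknown iterator type: {iterator_type}")
    if sequence_number is None:
        raise ValueError(f"{iterator_type} requires a sequence number")
    if iterator_type == "AT_SEQUENCE_NUMBER":
        return bisect_left(sequence_numbers, sequence_number)  # inclusive
    return bisect_right(sequence_numbers, sequence_number)  # exclusive


shard = [100, 200, 300]
for kind, seq in [("TRIM_HORIZON", None), ("LATEST", None),
                  ("AT_SEQUENCE_NUMBER", 200), ("AFTER_SEQUENCE_NUMBER", 200)]:
    print(kind, shard[start_index(shard, kind, seq):])
```

A consumer that checkpoints the last sequence number it finished would typically resume with AFTER_SEQUENCE_NUMBER so that the checkpointed record is not read again.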
Example using AWS CLI:
# Describe stream to get shard IDs
aws dynamodbstreams describe-stream --stream-arn arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/2024-01-01T00:00:00.000
# Get shard iterator
aws dynamodbstreams get-shard-iterator --stream-arn <arn> --shard-id <shardId> --shard-iterator-type LATEST
# Get records
aws dynamodbstreams get-records --shard-iterator <iterator>
Processing with AWS Lambda
The most common integration is to trigger an AWS Lambda function from a DynamoDB stream. Lambda polls the stream on your behalf, reading records and invoking your function synchronously. The Lambda function receives a batch of records (up to 10,000 records or 6 MB payload, whichever is smaller). The function must process records and then return a success status. If the function fails, Lambda retries the batch until it succeeds or the records expire (24 hours).
To configure a Lambda trigger:
aws lambda create-event-source-mapping --function-name MyFunction --event-source-arn <stream-arn> --starting-position LATEST
You can also specify BatchSize (default 100, max 10,000) and MaximumBatchingWindowInSeconds (default 0, max 300).
Failure Handling and Retries
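- By default, Lambda retries a failed batch until it succeeds or the records expire from the stream (24 hours). Because records in a shard are processed in order, a failing batch blocks that shard in the meantime.
- You can bound retries with MaximumRetryAttempts (default -1, meaning unlimited) and MaximumRecordAgeInSeconds (default -1, meaning no age limit other than stream retention).
- If the function is throttled, Lambda will retry with backoff.
- To avoid blocking the stream, configure an on-failure destination in DestinationConfig (an SQS queue or SNS topic). Lambda sends details about the discarded batch, such as the shard ID and sequence number range, not the record contents, so retrieve the records from the stream before they expire.
- For critical workloads, use an SQS queue as that destination so that details of failed batches are held, dead-letter-queue style, until you handle them.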
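The settings above can be gathered into one event source mapping request. This is a minimal sketch that only validates and assembles the arguments; the helper name `build_mapping_kwargs`, the queue ARN, and the chosen limits (3 retries, a one-hour record age) are assumptions for illustration. The keys follow the parameter names of the Lambda CreateEventSourceMapping API, so the resulting dict could be passed to boto3 as `boto3.client("lambda").create_event_source_mapping(**kwargs)`.

```python
def build_mapping_kwargs(function_name, stream_arn, on_failure_arn,
                         batch_size=100, batching_window_s=0,
                         max_retry_attempts=3, max_record_age_s=3600):
    """Validate the settings and return kwargs for create_event_source_mapping."""
    if not 1 <= batch_size <= 10_000:
        raise ValueError("BatchSize must be between 1 and 10,000")
    if not 0 <= batching_window_s <= 300:
        raise ValueError("MaximumBatchingWindowInSeconds must be between 0 and 300")
    if not -1 <= max_retry_attempts <= 10_000:
        raise ValueError("MaximumRetryAttempts must be -1 (unlimited) to 10,000")
    if max_record_age_s != -1 and not 60 <= max_record_age_s <= 604_800:
        raise ValueError("MaximumRecordAgeInSeconds must be -1 or 60 to 604,800")
    return {
        "FunctionName": function_name,
        "EventSourceArn": stream_arn,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": batching_window_s,
        "MaximumRetryAttempts": max_retry_attempts,
        "MaximumRecordAgeInSeconds": max_record_age_s,
        # Details of batches that exhaust their retries go to this queue.
        "DestinationConfig": {"OnFailure": {"Destination": on_failure_arn}},
    }


kwargs = build_mapping_kwargs(
    "MyFunction",
    "arn:aws:dynamodb:us-east-1:123456789012:table/MyTable/stream/2024-01-01T00:00:00.000",
    "arn:aws:sqs:us-east-1:123456789012:failed-batches",  # hypothetical queue
)
print(kwargs["DestinationConfig"])
```

Bounding both the retry count and the record age keeps a single poison record from stalling a shard for the full 24 hours.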
Integration with Kinesis Data Streams
If you need longer retention (up to 365 days) or higher throughput, you can enable Kinesis Data Streams for DynamoDB, which sends the table's item-level changes directly to a Kinesis data stream. This gives you Kinesis features like enhanced fan-out, long retention, and integration with Kinesis Data Analytics. The DynamoDB Streams Kinesis Adapter is a different tool: it lets a KCL application consume a DynamoDB stream directly.
Integration with Amazon EventBridge
You can also route DynamoDB stream records to EventBridge using a Lambda function that publishes events to the EventBridge bus. This enables event-driven architectures with multiple targets (SQS, SNS, Step Functions, etc.).
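A forwarding function mostly reshapes each stream record into an EventBridge event. The sketch below builds one entry in the shape accepted by the EventBridge PutEvents API (`Source`, `DetailType`, `Detail`, `EventBusName`); the helper name, the source string `custom.orders`, and the layout of the detail payload are assumptions chosen for illustration.

```python
import json


def to_eventbridge_entry(record, bus_name="default", source="custom.orders"):
    """Convert one DynamoDB stream record into a PutEvents entry."""
    change = record["dynamodb"]
    detail = {
        "eventID": record["eventID"],
        "keys": change["Keys"],
        "newImage": change.get("NewImage"),  # None if the view type omits it
        "oldImage": change.get("OldImage"),
    }
    return {
        "Source": source,
        "DetailType": f"DynamoDB {record['eventName']}",
        "Detail": json.dumps(detail),  # PutEvents expects a JSON string here
        "EventBusName": bus_name,
    }


record = {
    "eventID": "1",
    "eventName": "INSERT",
    "dynamodb": {
        "Keys": {"OrderId": {"S": "order-1001"}},
        "NewImage": {"OrderId": {"S": "order-1001"}, "Status": {"S": "PENDING"}},
    },
}
print(to_eventbridge_entry(record))
```

Inside a Lambda function the entries would then be sent with something like `boto3.client("events").put_events(Entries=entries)`, and EventBridge rules can match on `detail-type` to route inserts, updates, and deletes to different targets.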
Best Practices
Use NEW_AND_OLD_IMAGES only when necessary to reduce record size.
Set BatchSize to a high value (e.g., 1000) for high-throughput tables.
Use MaximumBatchingWindowInSeconds to batch records for efficiency.
Monitor IteratorAge metric to detect if processing is falling behind.
For idempotent processing, use eventID as a unique identifier to deduplicate.
When using KCL, ensure your application checkpoints after successful processing to avoid reprocessing.
Use CloudWatch alarms on ThrottledRequests and UserErrors.
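The deduplication practice above can be sketched as follows. An in-memory set stands in for a durable store of processed IDs (in production this might be a separate table or a cache), and `process_once` and `apply_change` are hypothetical names.

```python
def process_once(records, seen_event_ids, apply_change):
    """Apply each record at most once, keyed by eventID; return how many ran."""
    applied = 0
    for record in records:
        event_id = record["eventID"]
        if event_id in seen_event_ids:
            continue  # duplicate delivery from a retry; already handled
        apply_change(record)
        seen_event_ids.add(event_id)  # mark only after the change succeeded
        applied += 1
    return applied


seen = set()
handled = []
batch = [{"eventID": "a"}, {"eventID": "b"}, {"eventID": "a"}]
print(process_once(batch, seen, handled.append))  # first delivery
print(process_once(batch, seen, handled.append))  # redelivery of the same batch
```

Marking an ID only after `apply_change` succeeds means a crash between the two steps causes a repeat rather than a lost change, so `apply_change` itself should still be safe to run twice.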
Interaction with Global Tables
DynamoDB Global Tables use streams to replicate changes across regions. Each replica table has its own stream. When an item is written in one region, the change is propagated to the other regions. The replicated write appears in each replica's stream with the matching eventName (INSERT, MODIFY, or REMOVE), and replication is eventually consistent (typically within 1 second).
Enable DynamoDB Streams
Use the AWS CLI, SDK, or console to enable streams on a table. You must specify the stream view type: `KEYS_ONLY`, `NEW_IMAGE`, `OLD_IMAGE`, or `NEW_AND_OLD_IMAGES`. The command `update-table` with `--stream-specification` creates a stream. DynamoDB assigns a unique ARN and label. Once enabled, all future item changes are captured. Existing items are not recorded. The stream is automatically partitioned into shards based on table partitions.
Understand Stream Record Structure
Each stream record contains metadata: `eventID` (unique), `eventName` (INSERT/MODIFY/REMOVE), `eventSource` (aws:dynamodb), `awsRegion`, `dynamodb` (Keys, NewImage, OldImage, SequenceNumber, SizeBytes, StreamViewType), and `eventSourceARN`. The `SequenceNumber` is a string that increases monotonically within a shard. Records are ordered by sequence number. The `ApproximateCreationDateTime` is also included for temporal ordering.
Configure Lambda Trigger
Create an event source mapping using `create-event-source-mapping` with the Lambda function ARN and stream ARN. Specify `StartingPosition` (TRIM_HORIZON or LATEST). Lambda automatically polls the stream, reading batches of records. The function receives an event payload containing an array of records. Each record is parsed from JSON. Lambda handles scaling by adding more shard processors (one per shard).
Process Records in Lambda
Write Lambda code to iterate over the `Records` array. For each record, extract `dynamodb.NewImage`, `dynamodb.OldImage`, and `eventName`. Perform business logic (e.g., update an external database, call an API, publish to SNS). The function must return a success response. If it throws an error, Lambda retries the entire batch. Use `eventID` for idempotency. The function can log to CloudWatch for debugging.
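A minimal handler along these lines might look like the sketch below. It uses only the standard library; the small `unmarshal` helper is a simplified stand-in that covers string, number, boolean, null, list, and map attributes (boto3 ships a fuller `TypeDeserializer` in `boto3.dynamodb.types`). The `print` call is a placeholder for real business logic.

```python
from decimal import Decimal


def unmarshal(attr):
    """Convert one DynamoDB attribute value such as {"S": "x"} to Python."""
    (type_tag, value), = attr.items()
    if type_tag == "S":
        return value
    if type_tag == "N":
        return Decimal(value)  # numbers arrive as strings
    if type_tag == "BOOL":
        return value
    if type_tag == "NULL":
        return None
    if type_tag == "L":
        return [unmarshal(v) for v in value]
    if type_tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    raise ValueError(f"unsupported attribute type: {type_tag}")


def unmarshal_item(image):
    """Convert a whole image, or return None when the image is absent."""
    if not image:
        return None
    return {name: unmarshal(attr) for name, attr in image.items()}


def handler(event, context):
    """Process one batch of stream records; raising makes Lambda retry it."""
    processed = 0
    for record in event.get("Records", []):
        change = record["dynamodb"]
        old_item = unmarshal_item(change.get("OldImage"))
        new_item = unmarshal_item(change.get("NewImage"))
        # Placeholder for business logic (update a database, call an API, ...).
        print(record["eventID"], record["eventName"], old_item, new_item)
        processed += 1
    return {"processed": processed}
```

Returning normally signals success for the whole batch; any unhandled exception causes the batch to be retried, which is why the per-record work should be idempotent.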
Handle Failures and Retries
Lambda retries failed batches up to the configured `MaximumRetryAttempts` (default -1, unlimited) or until the records are older than `MaximumRecordAgeInSeconds` (default -1, no age limit, so in practice until the records leave the stream after 24 hours). If the function is throttled, it retries with exponential backoff. To avoid blocking the stream, configure a `DestinationConfig` with an SQS queue or SNS topic; Lambda sends details of the failed batch there, not the records themselves. Monitor `IteratorAge` to detect if processing is falling behind. Use CloudWatch alarms to alert on high iterator age.
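How the two retry limits interact can be expressed as a small decision function. This is a simplified model for reasoning about the settings, not Lambda's actual implementation; the function name and the three outcome labels are invented here.

```python
STREAM_RETENTION_S = 24 * 60 * 60  # DynamoDB Streams keeps records for 24 hours


def next_action(failed_attempts, oldest_record_age_s,
                max_retry_attempts=-1, max_record_age_s=-1):
    """Decide what happens to a batch after `failed_attempts` failed invocations.

    -1 means "no limit" for either setting, mirroring the Lambda defaults.
    "discard" means the batch is skipped and, if an on-failure destination
    is configured, details about it are sent there.
    """
    if oldest_record_age_s >= STREAM_RETENTION_S:
        return "expired"  # the records have aged out of the stream
    retries_used = failed_attempts - 1  # the first invocation is not a retry
    if max_retry_attempts != -1 and retries_used >= max_retry_attempts:
        return "discard"
    if max_record_age_s != -1 and oldest_record_age_s > max_record_age_s:
        return "discard"
    return "retry"


print(next_action(1, 10))                         # defaults: keep retrying
print(next_action(4, 10, max_retry_attempts=3))   # retries exhausted
print(next_action(1, 7200, max_record_age_s=3600))  # records too old
```

With the defaults, the only exit from the retry loop is success or expiry, which is what makes a permanently failing record block its shard.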
Enterprise Scenario 1: Real-Time Order Processing
An e-commerce platform uses DynamoDB to store orders. Each order status change (e.g., PENDING, PAID, SHIPPED) is captured via DynamoDB Streams. A Lambda function processes the stream and triggers downstream workflows: sends confirmation emails via SES, updates inventory in an RDS database, and sends events to EventBridge for analytics. The stream view type is NEW_AND_OLD_IMAGES to capture status transitions. The system handles 10,000 orders per second. The Lambda batch size is set to 1,000 with a batching window of 5 seconds to reduce costs. The IteratorAge metric is monitored; if it exceeds 10 seconds, an alarm triggers. A DLQ (SQS) captures any records that fail after 3 retries, and a separate Lambda processes the DLQ for manual intervention.
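Detecting a status transition in this scenario comes down to comparing the two images. The sketch assumes each order item stores its state in a string attribute named `Status`; that attribute name and the helper `status_transition` are assumptions for illustration.

```python
def status_transition(record):
    """Return (old_status, new_status) if the order status changed, else None."""
    change = record["dynamodb"]
    old = change.get("OldImage", {}).get("Status", {}).get("S")
    new = change.get("NewImage", {}).get("Status", {}).get("S")
    if old == new:
        return None  # some other attribute changed, or nothing did
    return old, new


paid = {
    "eventName": "MODIFY",
    "dynamodb": {
        "OldImage": {"Status": {"S": "PENDING"}},
        "NewImage": {"Status": {"S": "PAID"}},
    },
}
print(status_transition(paid))
```

An INSERT yields `(None, "PENDING")` because there is no old image, so downstream logic can treat a `None` old status as order creation.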
Enterprise Scenario 2: Cross-Region Replication
A financial services company uses DynamoDB Global Tables for disaster recovery across US East and US West. DynamoDB Streams on each table replicates changes to the other region. The stream view type is NEW_AND_OLD_IMAGES to ensure both images are available for conflict resolution. The replication latency is typically under 1 second. The company uses the stream to audit all changes: a Lambda function in each region writes the stream records to an S3 bucket for compliance. The stream retention of 24 hours is sufficient for auditing; longer retention is achieved via S3. If a regional failure occurs, the other region continues processing, and streams catch up when the failed region recovers.
Common Misconfigurations
Wrong stream view type: Choosing KEYS_ONLY when you need full item data forces additional reads to fetch the item, defeating the purpose of streams.
Ignoring shard count: A table with many partitions produces a stream with many shards. By default Lambda runs one concurrent invocation per shard (ParallelizationFactor can raise this to 10), so ensure your Lambda concurrency limit is high enough to cover all shards.
Not handling duplicates: Streams guarantee at-least-once delivery. Without idempotent processing, duplicates can cause data corruption. Always use eventID or a business key for deduplication.
Retention misunderstanding: Stream records are deleted after 24 hours. If your consumer is down for longer, you lose data. Send changes to Kinesis Data Streams or archive them to S3 if you need longer retention.
What SAA-C03 Tests (Objective 2.3)
The exam tests your ability to design event-driven architectures using DynamoDB Streams. Key areas:
Stream mechanics: shards, sequence numbers, view types.
Integration with Lambda: event source mapping, batch size, retries, DLQ.
Failure handling: iterator age, throttling, destination config.
Comparison with other streaming services: Kinesis vs. DynamoDB Streams.
Global Tables and replication.
Common Wrong Answers
Choosing Kinesis Data Streams over DynamoDB Streams for simple change capture: Candidates often pick Kinesis because they think it's more scalable. But DynamoDB Streams is simpler and cost-effective for low-latency change capture. Kinesis is better for high-throughput, long retention, or multiple consumers with different processing rates.
Setting stream retention to 7 days: DynamoDB Streams has a fixed 24-hour retention. Candidates confuse it with Kinesis (up to 365 days). The exam tests this exact number.
Using an SNS topic as a trigger directly: DynamoDB Streams cannot trigger SNS directly. You must use Lambda or KCL. The exam may include a distractor where SNS is used without Lambda.
Assuming streams capture existing items: Streams only capture changes after they are enabled. To capture existing data, you must scan the table or use a separate process.
Key Numbers and Terms
Stream retention: 24 hours (fixed).
Readers per shard: no more than 2 at a time (recommended).
Record size limit: 1 MB.
Lambda batch size: 1-10,000 (default 100).
Maximum batching window: 0-300 seconds (default 0).
MaximumRecordAgeInSeconds: 60-604800, or -1 for no limit (default -1).
MaximumRetryAttempts: 0-10000, or -1 for unlimited (default -1).
Edge Cases
Empty record batches: Lambda might receive an empty batch if the stream has no new records. The function should handle empty Records arrays.
Shard splitting: When a partition splits, the stream also splits into new shards. Lambda automatically handles this; you don't need to manage shard reassignment.
Throttling: If the Lambda function is throttled, the stream records accumulate, increasing IteratorAge. The exam may ask how to mitigate this: increase Lambda concurrency or reduce batch size.
Eliminating Wrong Answers
If the question asks for real-time processing of item changes and mentions 'low cost', 'simple setup', or 'within AWS', prefer DynamoDB Streams with Lambda over Kinesis.
If the question mentions 'multiple consumers with different processing speeds' or 'long retention', Kinesis is better.
If the question involves cross-region replication, remember that DynamoDB Global Tables use streams internally.
If the question involves a dead-letter queue, use SQS, not SNS (SNS is push-only, cannot hold messages).
DynamoDB Streams captures item-level changes in real time with a fixed 24-hour retention.
Stream records are organized into shards that DynamoDB manages automatically; no more than two readers should share a shard.
You can choose stream view types: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, or NEW_AND_OLD_IMAGES.
Lambda is the most common consumer; it polls the stream and processes batches of up to 10,000 records.
Lambda retries failed batches until the records expire (24 hours) or until MaximumRetryAttempts is reached.
Use a dead-letter queue (SQS) to capture records that fail after all retries.
Streams do not capture existing items; only changes after enabling are recorded.
Global Tables rely on DynamoDB Streams for cross-region replication.
Monitor IteratorAge to detect if stream processing is falling behind.
For longer retention or multiple consumers, replicate to Kinesis Data Streams.
These come up on the exam all the time. Here's how to tell them apart.
DynamoDB Streams
Fixed 24-hour retention
Automatic shard management tied to table partitions
Up to 2 simultaneous readers per shard (recommended)
Lower cost for simple change capture
No need to provision shard capacity
Kinesis Data Streams
Configurable retention up to 365 days
Manual shard management in provisioned mode (on-demand mode scales automatically)
Shard count capped by an adjustable account-level quota
Higher cost but supports multiple consumers with enhanced fan-out
Provisioned mode requires sizing write and read capacity (shards)
Mistake
DynamoDB Streams can retain records for up to 7 days.
Correct
DynamoDB Streams has a fixed retention period of 24 hours. For longer retention, you must replicate to Kinesis Data Streams (up to 365 days) or archive to S3.
Mistake
DynamoDB Streams captures all existing items when enabled.
Correct
Streams only capture changes made after the stream is enabled. Existing items are not included. To process existing data, you must perform a full table scan separately.
Mistake
You can directly trigger an SNS topic from a DynamoDB stream.
Correct
DynamoDB Streams cannot directly invoke SNS. You must use an intermediary like Lambda or Kinesis to process the stream and publish to SNS.
Mistake
Any number of consumers can read from the same shard concurrently.
Correct
AWS recommends that no more than two processes read from the same shard at the same time; additional readers can be throttled. Different shards can be read in parallel. Use Kinesis Data Streams with enhanced fan-out if you need more consumers per shard.
Mistake
Stream records are guaranteed to be processed exactly once.
Correct
DynamoDB Streams offers at-least-once delivery. Duplicates can occur due to retries. Your application must handle idempotency using `eventID` or other unique identifiers.
DynamoDB Streams retains records for 24 hours. This is fixed and cannot be changed. If you need longer retention, you must replicate the stream to Kinesis Data Streams (up to 365 days) or archive records to S3 via Lambda.
Create an event source mapping using the AWS CLI: `aws lambda create-event-source-mapping --function-name MyFunction --event-source-arn <stream-arn> --starting-position LATEST`. Lambda automatically polls the stream, reads batches of records, and invokes your function synchronously. Your function must return a success status; otherwise, Lambda retries the batch.
DynamoDB Streams is purpose-built for capturing DynamoDB table changes with automatic sharding tied to table partitions and a fixed 24-hour retention. Kinesis Data Streams is a general-purpose streaming service with configurable retention (up to 365 days), manual shard management, and support for multiple consumers with enhanced fan-out. Use DynamoDB Streams for simple, low-cost change capture; use Kinesis for complex streaming needs.
Yes, multiple Lambda functions can read from the same stream, but AWS recommends no more than two simultaneous readers per shard, since more can cause throttling. Lambda automatically distributes shards among function instances. If you need more consumers per shard, consider using Kinesis Data Streams with enhanced fan-out.
DynamoDB Streams provides at-least-once delivery, so duplicates are possible. Use the `eventID` field (unique per record) to deduplicate in your processing logic. Alternatively, use a business key from the item data. For example, store processed `eventID`s in a separate table or use a cache.
Lambda will retry the failed batch until it succeeds or the records expire (24 hours). You can configure `MaximumRetryAttempts` (default -1, unlimited) and `MaximumRecordAgeInSeconds` (default -1, no age limit beyond stream retention). To avoid blocking the stream, set a `DestinationConfig` to send details of failed batches to an SQS queue or SNS topic for later analysis.
Yes, each replica table in a Global Table has its own stream. Changes replicated from other regions appear in the local stream as ordinary `INSERT`, `MODIFY`, or `REMOVE` events. This allows you to process cross-region changes locally.