DVA-C02Chapter 47 of 101Objective 1.5

Kinesis Data Streams for Developers

The Development domain of the DVA-C02 exam tests Amazon Kinesis Data Streams (KDS) from a developer's perspective. KDS is a real-time data streaming service that allows you to build applications that process continuous data flows. On the DVA-C02 exam, approximately 5-8% of questions touch on Kinesis, typically comparing it to other AWS streaming services like Kinesis Data Firehose, Kinesis Data Analytics, and Amazon SQS. You will need to understand the architecture, shard mechanics, producer/consumer SDK usage, and how to design for scalability and fault tolerance.

25 min read

Intermediate

Updated Jul 20, 2026

Reviewed by Johnson Ajibi· Senior Network & Security Engineer · MSc IT Security

Jump to a section

Explain it to me simply Where people get tripped up Test what I know Look up key terms

Kinesis Data Streams: The Factory Conveyor Belt

Fifty workers, one conveyor belt, and a constant stream of parts from multiple suppliers. At the start, raw parts (data records) arrive from multiple suppliers (producers) and are placed onto a moving conveyor belt (the data stream). The conveyor belt is divided into numbered segments (shards). Each segment moves at a fixed speed (throughput limit per shard). A worker (consumer) stands at a specific segment and picks up parts that pass by. The worker has a notepad (shard iterator) that records which part they last picked up, so if they take a break (consumer failure), they can resume from where they left off, not from the beginning. The conveyor belt never stops; parts that are not picked up in time fall off the end (data retention limit, default 24 hours). If the factory needs to handle more parts, they can add more segments (increase shards) by splitting a segment into two, each with the same speed. Conversely, if fewer parts arrive, they can merge two adjacent segments into one. The factory manager (AWS Kinesis API) monitors the load on each segment and can trigger splits or merges. The key point: the conveyor belt does not process parts; it just transports them. Workers (consumers) pull parts at their own pace, and multiple workers can read from the same segment, but each part is only processed once per worker group (consumer group). This is exactly how Kinesis Data Streams works: producers push data records into shards, consumers pull records using a shard iterator, and the stream retains data for a configurable period (up to 365 days with extended retention).

How It Actually Works

What is Amazon Kinesis Data Streams?

Amazon Kinesis Data Streams (KDS) is a fully managed, serverless service for real-time streaming data ingestion and processing. It can capture and store terabytes of data per hour from hundreds of thousands of sources. The core abstraction is a data stream, which is a logical grouping of shards. Each shard is a unit of capacity that provides a fixed write and read throughput: 1 MB/s write or 1,000 records/s write, and 2 MB/s read or 5 API calls/s read (per shard for shared-throughput consumers). For enhanced fan-out consumers, each shard provides 2 MB/s read per consumer.

Why Kinesis Data Streams Exists

Traditional batch processing (e.g., cron jobs dumping logs to S3) introduces latency. KDS enables real-time analytics, alerting, and reactive applications. It decouples data producers from consumers, allowing multiple consumers to process the same stream independently. Key use cases: log and event data ingestion, real-time metrics, clickstream analysis, IoT device telemetry, and fraud detection.

How It Works Internally

Producers put data records into the stream. Each record consists of a partition key (a string up to 256 bytes), a data blob (up to 1 MB), and a sequence number (assigned by Kinesis). The partition key determines which shard the record goes to: Kinesis uses the MD5 hash of the partition key to map it to a shard. This ensures ordering within a shard but not across shards.

Shards are the base throughput unit. A stream can have multiple shards. The total throughput is the sum of all shard throughputs. Shards are self-contained; records in one shard are independent of others.

Consumers read records from shards using a shard iterator. The consumer calls GetShardIterator to get an iterator pointing to a position (TRIM_HORIZON for oldest, LATEST for newest, AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, or AT_TIMESTAMP). Then it calls GetRecords with the iterator to retrieve up to 10,000 records or 10 MB (whichever is smaller) per call. The iterator is valid for 5 minutes; after that, you must get a new one.

Enhanced Fan-Out (EFO) consumers register with the stream and get a dedicated 2 MB/s read throughput per shard, bypassing the shared 2 MB/s limit. They use SubscribeToShard via HTTP/2. This is ideal for multiple consumers that need low latency.

Resharding allows scaling: split a shard into two (increasing capacity) or merge two adjacent shards into one (decreasing capacity). Resharding is an asynchronous operation that can take seconds to minutes. During resharding, the stream remains available.

Key Components, Values, Defaults, and Timers

Stream name: must be unique within an AWS account per region, 1-128 characters, alphanumeric and hyphens.

Shard count: default 1, max 500 per stream (soft limit, can be increased via support ticket).

Data retention: default 24 hours, can be increased to 365 days (extended retention incurs additional cost).

Record size: max 1 MB for the data blob plus partition key.

Sequence number: unique per shard, monotonically increasing, used for ordering.

Shard iterator type: TRIM_HORIZON (oldest), LATEST (newest), AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP.

GetRecords max: 10,000 records or 10 MB per call.

GetRecords frequency: 5 API calls per second per shard (shared throughput). For EFO, no limit on calls but still subject to 2 MB/s throughput.

Shard iterator expiry: 5 minutes if not used; after use, it becomes invalid immediately.

PutRecord throughput: 1 MB/s or 1,000 records/s per shard.

PutRecords (batch): up to 500 records or 5 MB per request. The total throughput limit applies to the sum of all concurrent puts.

ProvisionedThroughputExceededException: occurs when you exceed the shard throughput. The SDK automatically retries with exponential backoff.

Configuration and Verification Commands

Using AWS CLI:

# Create a stream with 2 shards
aws kinesis create-stream --stream-name my-stream --shard-count 2

# Describe the stream
aws kinesis describe-stream --stream-name my-stream

# Put a record
aws kinesis put-record --stream-name my-stream --partition-key pk1 --data "$(echo 'Hello' | base64)"

# Get a shard iterator
aws kinesis get-shard-iterator --stream-name my-stream --shard-id shardId-000000000000 --shard-iterator-type TRIM_HORIZON

# Get records (use iterator from previous command)
aws kinesis get-records --shard-iterator <iterator>

# List shards
aws kinesis list-shards --stream-name my-stream

# Increase retention to 7 days
aws kinesis increase-stream-retention-period --stream-name my-stream --retention-period-hours 168

# Split a shard
aws kinesis split-shard --stream-name my-stream --shard-to-split shardId-000000000000 --new-starting-hash-key 113427455640312821154458202477256070485

# Merge shards
aws kinesis merge-shards --stream-name my-stream --shard-to-merge shardId-000000000000 --adjacent-shard-to-merge shardId-000000000001

How KDS Interacts with Related Technologies

Kinesis Data Firehose: can read from a KDS stream and deliver data to S3, Redshift, Elasticsearch, Splunk, etc. Firehose is a fully managed delivery service; it can write data in batches.

Kinesis Data Analytics: can consume KDS streams and perform SQL or Apache Flink-based real-time analytics.

AWS Lambda: can be a consumer (via event source mapping) or producer. Lambda can process records from a stream and write results to another stream.

Amazon DynamoDB Streams: similar concept but tied to DynamoDB tables; KDS is more general-purpose.

Amazon SQS: SQS is a message queue (pull-based), not a stream. SQS messages are deleted after consumption; KDS retains data. SQS is for decoupling microservices; KDS for real-time streaming.

Amazon MSK (Managed Kafka): offers Kafka API compatibility; KDS is a proprietary API. MSK is more complex but offers more configuration control.

Walk-Through

Create a Kinesis Data Stream

Define the stream name and initial number of shards. Each shard provides 1 MB/s write and 2 MB/s read (shared). The stream is created immediately but may take a few seconds to become ACTIVE. Use the AWS Console, CLI (`create-stream`), or SDK. The shard count determines the base throughput. For example, 5 shards give 5 MB/s write capacity. The stream's ARN is in the format `arn:aws:kinesis:region:account-id:stream/stream-name`. The stream must be in ACTIVE state before you can put records.

Produce Data Records

Producers call `PutRecord` or `PutRecords` to send data. Each record requires a partition key (string) and data blob (base64-encoded). The partition key is hashed using MD5 to determine the target shard. To ensure even distribution, use a high-cardinality partition key (e.g., user ID, device ID). If the partition key is too low-cardinality, records may hotspot to a single shard, causing throttling. The `PutRecord` response includes the sequence number and shard ID. `PutRecords` accepts up to 500 records or 5 MB total; individual failures are reported per record.

Configure Consumers

Consumers can be shared-throughput (classic) or enhanced fan-out (EFO). Classic consumers share the 2 MB/s read throughput per shard and are limited to 5 `GetRecords` calls per second. EFO consumers register via `RegisterStreamConsumer` and use `SubscribeToShard` (HTTP/2) to get a dedicated 2 MB/s per shard. EFO is recommended for multiple consumers that need low latency (sub-200ms). Consumers must handle shard changes (resharding) by discovering new shards using `ListShards` or `DescribeStream`.

Read Data from Shards

Consumer calls `GetShardIterator` with a shard ID and iterator type (e.g., LATEST for new records only, TRIM_HORIZON for all available records). It receives a shard iterator (a string). Then it calls `GetRecords` with that iterator to retrieve up to 10 MB or 10,000 records. The response includes the next shard iterator. The consumer must use the new iterator for subsequent calls. If the iterator expires (after 5 minutes of inactivity), the consumer gets an `ExpiredIteratorException` and must call `GetShardIterator` again.

Scale with Resharding

When throughput needs change, you can split or merge shards. Splitting increases shard count and throughput; merging decreases. To split, you specify the shard to split and a new starting hash key (any hash in the shard's range). The shard is split into two child shards. During resharding, the parent shard becomes CLOSED and is eventually expired (after retention period). The stream remains readable/writable. After resharding, producers continue to write to the same stream; the hash mapping automatically routes records to child shards. Consumers must discover new shards via `ListShards` and start reading from them.

What This Looks Like on the Job

Enterprise Scenario 1: Real-Time Clickstream Analytics

A large e-commerce platform ingests clickstream events from millions of users. They use Kinesis Data Streams with 100 shards to handle 100 MB/s write throughput. Producers are web servers that send events via the Kinesis Producer Library (KPL) for batching and compression. Consumers include a Lambda function that aggregates click counts per product (updated every minute to DynamoDB), and a Kinesis Data Analytics application that detects anomalies (e.g., sudden traffic spikes). They use enhanced fan-out for the analytics consumer to avoid latency interference. Common issue: if the partition key is the user ID, some popular users cause hot shards. Solution: add a random prefix to the partition key to distribute load evenly. They monitor with CloudWatch metrics: WriteProvisionedThroughputExceeded and ReadProvisionedThroughputExceeded. They set up auto-scaling using the UpdateShardCount API based on IncomingBytes metric.

Enterprise Scenario 2: IoT Device Telemetry

A smart building company collects temperature, humidity, and occupancy data from thousands of sensors. Each sensor sends a record every 30 seconds. They use a Kinesis stream with 10 shards. Producers are IoT devices using the AWS IoT Core rules engine to push data to Kinesis. Consumers include a Lambda function that alerts when temperature exceeds thresholds (fan-out to SNS), and a Firehose delivery stream that writes raw data to S3 for archival. Retention is set to 7 days for reprocessing. Problem: if a consumer fails, it can resume from the last checkpoint (stored in DynamoDB). They use the Kinesis Client Library (KCL) which handles checkpointing and shard discovery. They configure the KCL with a DynamoDB table for lease management.

Scenario 3: Financial Fraud Detection

A payment processing company streams transaction data in real-time. They need sub-second latency. They use Kinesis with 50 shards and enhanced fan-out for the fraud detection engine (a custom Java application using KCL). The engine checks transactions against known patterns and updates a risk score in DynamoDB. They also have a second consumer (S3 archive) using shared throughput, which is acceptable with higher latency. They use PutRecords with a batch size of 500 to maximize write throughput. They monitor MillisBehindLatest to ensure consumers keep up. If the metric grows, they increase shards or add more consumer instances. They also use AWS KMS for encryption at rest.

How DVA-C02 Actually Tests This

DVA-C02 Exam Focus on Kinesis Data Streams

The exam tests your ability to choose the right AWS service for a given scenario. For KDS, key differentiators: real-time processing, ordered data within a shard, multiple consumers, data retention up to 365 days, and replay capability. Common distractors: SQS (no ordering, no retention, messages deleted after consumption), SNS (pub/sub, no persistence), Firehose (near-real-time, no replay, no multiple consumers with independent processing).

Most Common Wrong Answers

'Use SQS with a Lambda consumer for real-time processing.' Wrong because SQS does not support ordering at scale (FIFO queues limit throughput) and messages are deleted after consumption. KDS retains data for replay.

'Use Kinesis Data Firehose for real-time analytics.' Firehose is near-real-time (minimum 60 seconds buffer) and does not support multiple consumers reading the same data independently. It delivers to a single destination.

'Kinesis Data Streams supports at-most-once delivery.' Actually, KDS provides at-least-once delivery. Records may be duplicated if a producer retries. Consumers must be idempotent.

'Increasing shard count automatically increases retention.' No, retention is independent. You must call IncreaseStreamRetentionPeriod.

Specific Numbers and Terms Tested

Default retention: 24 hours. Max with extended: 365 days.

Record size limit: 1 MB.

Shard throughput: 1 MB/s write, 2 MB/s read (shared), 2 MB/s read per consumer (EFO).

ProvisionedThroughputExceededException when exceeding limits.

Shard iterator types: TRIM_HORIZON, LATEST, AT_SEQUENCE_NUMBER, AFTER_SEQUENCE_NUMBER, AT_TIMESTAMP.

Enhanced fan-out uses HTTP/2 and SubscribeToShard.

KCL uses DynamoDB for checkpointing and lease management.

Edge Cases

Resharding during high load: if you split a shard that is already throttled, the split may fail. Plan resharding during off-peak.

Empty shards: after merging, the parent shard becomes CLOSED and eventually expires. Consumers must handle child shards.

Sequence number wrap: sequence numbers are unique per shard and do not wrap; they are monotonically increasing.

How to Eliminate Wrong Answers

If the scenario requires ordered processing of messages with multiple consumers reading the same data independently, eliminate SQS (no ordering across consumers, messages disappear). If the scenario requires replay of historical data, eliminate Firehose (no retention). If the scenario requires real-time processing with sub-second latency, eliminate Firehose (60-second buffer). If the scenario requires multiple applications to process the same stream independently, KDS with enhanced fan-out is the answer.

Key Takeaways

Kinesis Data Streams is a real-time streaming service with data retention from 24 hours to 365 days.

Each shard provides 1 MB/s write and 2 MB/s read (shared) or 2 MB/s read per consumer (enhanced fan-out).

Records are ordered within a shard; partition key determines shard assignment via MD5 hash.

Use Kinesis Client Library (KCL) for building resilient consumer applications that handle resharding and checkpointing.

Enhanced fan-out (EFO) uses HTTP/2 and provides dedicated read throughput per consumer, ideal for low-latency requirements.

Resharding (split/merge) allows scaling throughput but requires consumer logic to discover new shards.

Common exam distractors: SQS (no ordering for multiple consumers), Firehose (no data retention, single destination).

Monitor `MillisBehindLatest` to ensure consumers keep up with the stream.

KDS supports server-side encryption with AWS KMS at rest.

The `ProvisionedThroughputExceededException` indicates you need to increase shard count or reduce request rate.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Kinesis Data Streams

Streaming data with persistence up to 365 days

Ordering within a shard (per partition key)

Multiple consumers read same records independently

At-least-once delivery; records not deleted after consumption

Shard-based throughput scaling (1 MB/s write per shard)

Amazon SQS

Message queue; messages deleted after consumption

Ordering only with FIFO queues (limited throughput)

Single consumer per message (standard queues); multiple consumers compete for messages

At-least-once (standard) or exactly-once (FIFO) delivery

Throughput based on queue type: standard unlimited, FIFO 3000 msg/s with batching

Kinesis Data Streams

Real-time (sub-second latency)

Data retained for configurable period (24h to 365d)

Multiple consumers can process the same data

Requires custom consumer code (KCL, Lambda, etc.)

Manual scaling via shard management

Kinesis Data Firehose

Near-real-time (60-second minimum buffer)

No data retention; delivered to destination immediately

Single destination per delivery stream

No custom consumer; writes directly to S3, Redshift, etc.

Automatic scaling (no shards to manage)

Watch Out for These

Mistake

Kinesis Data Streams guarantees exactly-once delivery.

Correct

KDS provides at-least-once delivery. Duplicates can occur if a producer retries a `PutRecord` call after a timeout. Consumers should be idempotent.

Mistake

You can increase shard count without any downtime.

Correct

Resharding is an online operation; the stream remains available. However, during resharding, there is a brief period where the parent shard is closed and child shards are created. Producers and consumers continue to work, but consumers must discover new shards.

Mistake

Kinesis Data Streams is a queue like SQS.

Correct

KDS is a stream, not a queue. Records are not deleted after consumption; they persist for the retention period. Multiple consumers can read the same records independently. SQS deletes messages after they are processed by a consumer.

Mistake

You can only have one consumer per shard.

Correct

You can have multiple consumers per shard, but they share the 2 MB/s read throughput (unless using enhanced fan-out). The KCL handles multiple consumer instances by distributing shards among them via leases.

Mistake

Enhanced fan-out is always better than shared throughput.

Correct

EFO incurs additional cost per consumer. For a single consumer, shared throughput is sufficient and cheaper. EFO is only beneficial when you have multiple consumers that need low latency (<200ms) or high read throughput.

Do You Actually Know This?

Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.

Frequently Asked Questions

What is the difference between Kinesis Data Streams and Kinesis Data Firehose?

Kinesis Data Streams is a real-time data streaming service that stores data for a configurable retention period (up to 365 days) and allows multiple consumers to read the same data independently. It requires you to write consumer code (e.g., using KCL or Lambda). Kinesis Data Firehose is a fully managed data delivery service that can read from a KDS stream or other sources, buffer data for a minimum of 60 seconds, and deliver it to destinations like S3, Redshift, Elasticsearch, or Splunk. Firehose does not retain data; it delivers it immediately. Use KDS when you need real-time processing, multiple consumers, or data replay. Use Firehose when you want a simple, serverless way to load streaming data into a data store.

How do I ensure data ordering in Kinesis Data Streams?

Data is ordered within a shard. To maintain order for a specific entity (e.g., a user), use the entity's ID as the partition key. All records with the same partition key go to the same shard, preserving order. However, ordering across shards is not guaranteed. If you need global ordering, you must use a single shard, which limits throughput. For most use cases, per-shard ordering is sufficient.

What happens if a consumer falls behind in Kinesis Data Streams?

If a consumer cannot keep up with the write rate, records may expire before being read. The consumer's `MillisBehindLatest` metric will increase. To catch up, you can either increase the number of shards (to increase write capacity and reduce backlog), add more consumer instances (if using KCL, it will distribute shard leases), or use enhanced fan-out to get dedicated read throughput. If records expire, they are lost; you cannot recover them. Set appropriate retention and monitor consumer lag.

Can I use AWS Lambda to process Kinesis Data Streams?

Yes, Lambda can be a consumer via an event source mapping. You configure the stream and batch size. Lambda polls shards (using shared throughput) and invokes your function with a batch of records. For enhanced fan-out, you can use `SubscribeToShard` (requires Lambda support for HTTP/2, which is not available in all runtimes). Lambda is a good choice for simple processing, but for high throughput or complex stateful processing, consider using KCL or Kinesis Data Analytics.

What is the difference between TRIM_HORIZON and LATEST shard iterator types?

TRIM_HORIZON starts reading from the oldest record in the shard (the first record that hasn't expired). LATEST starts reading from the newest record, so you only get new records as they arrive. Use TRIM_HORIZON for replaying historical data or when starting a new consumer that needs to process all existing data. Use LATEST for consumers that only care about new data.

How do I handle resharding in my consumer application?

When a shard is split or merged, the parent shard becomes CLOSED and eventually expires. New child shards are created. Your consumer must periodically call `ListShards` or `DescribeStream` to discover new shards. The Kinesis Client Library (KCL) handles this automatically: it uses DynamoDB to track shard leases and reassigns them when resharding occurs. If you are not using KCL, you must implement your own logic to detect new shards and start reading from them.

What is the maximum record size in Kinesis Data Streams?

The maximum size of the data blob (payload) plus the partition key is 1 MB. The partition key itself can be up to 256 bytes. If your data exceeds 1 MB, you must split it into multiple records. The Kinesis Producer Library (KPL) can aggregate multiple small records into a single `PutRecord` call to improve efficiency.

Terms Worth Knowing

Data Data warehouse EventBridge Non-relational database Pub/Sub Relational database

Ready to put this to the test?

You've just covered Kinesis Data Streams for Developers — now see how well it sticks with free DVA-C02 practice questions. Full explanations included, no account needed.

Try DVA-C02 practice questions Back to all chapters

Done with this chapter?

SNS Message Filtering and Fan-Out

Kinesis Data Firehose

See the full DVA-C02 study guide