
Kinesis Data Firehose Delivery Streams

This chapter covers Kinesis Data Firehose delivery streams, a fully managed service for loading streaming data into data stores and analytics tools. For the SAA-C03 exam, understanding Firehose's architecture, buffering, transformation, and delivery options is essential, as it frequently appears in questions about near-real-time data ingestion and analytics pipelines. Questions often compare Firehose with Kinesis Data Streams or ask you to choose the correct service for a given latency and throughput requirement.

25 min read

Intermediate

Updated Jul 20, 2026

Reviewed by Johnson Ajibi· Senior Network & Security Engineer · MSc IT Security


Firehose Delivery Streams as a Conveyor Belt

A large warehouse accepts packages from many different sources — online orders, supplier deliveries, returns — all arriving at unpredictable rates. Instead of having each source drive a forklift directly to the final storage area, you install a conveyor belt system. The conveyor belt has a constant speed, but it can accept packages at any rate; if too many arrive at once, they queue up on a temporary buffer ramp. As packages move along the belt, they pass through stations that can optionally inspect, transform, or repackage them (like AWS Lambda functions). Finally, the belt delivers the packages to one or more destinations: a long-term storage warehouse (S3), a real-time analytics room (Redshift), a data lake (S3 in Parquet format), or an external partner's loading dock (HTTP endpoint). The conveyor belt automatically scales the buffer ramp when needed, and if the belt jams or a destination is full, it retries delivery or sends packages to a dead-letter ramp. The system is designed for near-real-time, not instant — a package might take 60 seconds from source to destination. You cannot reverse the belt or pick a specific package once it's on the belt. This is exactly how Kinesis Data Firehose works: it's a fully managed, near-real-time data ingestion service that buffers, optionally transforms, and reliably delivers streaming data to destinations like S3, Redshift, OpenSearch, and Splunk.

How It Actually Works

What is Kinesis Data Firehose?

Kinesis Data Firehose (renamed Amazon Data Firehose in 2024; exam material may use either name) is a fully managed, serverless service for capturing, transforming, and loading streaming data into AWS data stores and analytics services. It is designed for near-real-time data ingestion with minimal operational overhead. Unlike Kinesis Data Streams, which is a low-level streaming data platform that requires consumers to process records, Firehose is a delivery service that automatically buffers, optionally transforms, and writes data to destinations. The exam often tests the distinction: Firehose is for loading data into destinations (S3, Redshift, OpenSearch, Splunk, HTTP endpoints), while Kinesis Data Streams is for building custom real-time applications.

How Firehose Works Internally

When you create a Firehose delivery stream, you specify a source, optional transformations, and a destination. The source can be:

- Direct PUT using the AWS SDK, Kinesis Agent, or AWS Management Console.
- Kinesis Data Streams as a source (Firehose reads from the stream).
- Amazon CloudWatch Logs subscription filter.
- AWS IoT rules.
- Amazon EventBridge.

Once data arrives, Firehose buffers it based on two configurable parameters: buffer size (1 MB to 128 MB) and buffer interval (60 seconds to 900 seconds; 60 seconds is the long-standing minimum that exam questions assume, although the current API also accepts lower values). The first condition met triggers a delivery. For example, if buffer interval is 300 seconds and buffer size is 5 MB, Firehose will deliver after 5 MB of data or 300 seconds, whichever comes first. This buffering is internal to Firehose and scales automatically; there is nothing for you to provision.
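
The "whichever comes first" rule can be sketched as a small predicate. This is a toy model for intuition only, not how Firehose is implemented; the function name `should_flush` and its parameters are hypothetical.

```python
def should_flush(buffered_bytes: int, elapsed_seconds: float,
                 size_mb: int = 5, interval_seconds: int = 300) -> bool:
    """Return True once either buffering hint is satisfied."""
    size_reached = buffered_bytes >= size_mb * 1024 * 1024
    interval_reached = elapsed_seconds >= interval_seconds
    return size_reached or interval_reached


# 2 MB buffered after 300 s: the interval triggers delivery.
print(should_flush(2 * 1024 * 1024, 300))
# 5 MB buffered after 10 s: the size triggers delivery.
print(should_flush(5 * 1024 * 1024, 10))
# 1 MB buffered after 30 s: neither hint is met yet.
print(should_flush(1 * 1024 * 1024, 30))
```

Low-volume streams are therefore usually flushed by the interval, and high-volume streams by the size.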

Key Components and Defaults

Delivery stream name: Unique per AWS account and region.

Source: Direct PUT or Kinesis Data Streams.

Transform records: Optional AWS Lambda function that transforms each incoming record. The Lambda function must return records in the same order and with the same record ID. The Lambda function can add, drop, or modify fields. Default: No transformation.

Convert record format: Optionally convert incoming data to Apache Parquet or Apache ORC. Requires a schema (either from AWS Glue Data Catalog or manually defined). Default: No conversion.

Destination: S3, Redshift (via S3 COPY), Amazon OpenSearch Service, Splunk, or HTTP endpoint.

Buffering hints: For S3 destinations, you can set buffer size (1-128 MB) and buffer interval (60-900 seconds). Default: 5 MB and 300 seconds.

Compression: GZIP, Snappy, ZIP, or Hadoop-compatible Snappy. Default: no compression (UNCOMPRESSED).

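Encryption: Server-side encryption for S3 destinations using SSE-KMS configured on the delivery stream, or SSE-S3 through the bucket's default encryption. SSE-C (customer-provided keys) is not an option.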

Error handling: If an S3 destination is unreachable, Firehose retries for up to 24 hours. For Redshift, OpenSearch, Splunk, and HTTP endpoints, the retry duration is configurable, and records that still fail are written to an S3 backup (dead-letter) bucket.

Permissions: Firehose needs IAM roles to write to destinations and invoke Lambda functions.

Configuration and Verification

To create a delivery stream using AWS CLI:

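# RoleARN is required; the account ID and role name below are placeholders.
# The configuration is quoted so the shell does not brace-expand {...}.
aws firehose create-delivery-stream \
    --delivery-stream-name my-stream \
    --delivery-stream-type DirectPut \
    --extended-s3-destination-configuration \
        'RoleARN=arn:aws:iam::123456789012:role/my-firehose-role,BucketARN=arn:aws:s3:::my-bucket,BufferingHints={IntervalInSeconds=300,SizeInMBs=5},CompressionFormat=GZIP'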

To verify delivery stream status:

aws firehose describe-delivery-stream --delivery-stream-name my-stream
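
In a script, the same status check is usually done in a polling loop. The sketch below assumes `client` is a boto3 Firehose client (for example `boto3.client("firehose")`); the helper name `wait_until_active` and its polling parameters are hypothetical.

```python
import time


def wait_until_active(client, stream_name, poll_seconds=5.0, max_polls=60):
    """Poll describe_delivery_stream until the stream reports ACTIVE.

    Returns True when ACTIVE, False if max_polls is exhausted, and raises
    if the stream lands in a *_FAILED state.
    """
    for _ in range(max_polls):
        response = client.describe_delivery_stream(DeliveryStreamName=stream_name)
        status = response["DeliveryStreamDescription"]["DeliveryStreamStatus"]
        if status == "ACTIVE":
            return True
        if status.endswith("_FAILED"):
            raise RuntimeError(f"stream {stream_name} entered state {status}")
        time.sleep(poll_seconds)
    return False
```

A new stream normally passes through CREATING before it becomes ACTIVE, and records sent before that point are rejected.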

To put records:

aws firehose put-record --delivery-stream-name my-stream --record '{"Data":"<base64-encoded-data>"}'

Interaction with Related Technologies

Kinesis Data Streams: Firehose can consume from a Kinesis Data Stream as its source. This decouples producers from delivery and allows multiple consumers (including Firehose) to process the same data.

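AWS Lambda: Used for data transformation. The Lambda function receives a batch of records and returns transformed records. Firehose supports a Lambda invocation time of up to 5 minutes, so set the function timeout accordingly (Lambda's own default timeout of 3 seconds is usually too short). The batch size is controlled by separate buffering hints for the transformation step, not by the destination buffer.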

AWS Glue: Provides schema for converting data to Parquet/ORC. You can create a Glue Data Catalog table and reference it in Firehose.

Amazon S3: Primary destination. Firehose writes data in a partition-friendly path: YYYY/MM/DD/HH/ by default, or a custom prefix with !{timestamp:...} variables.

Amazon Redshift: Firehose first writes to S3, then issues a COPY command to load data into Redshift. This requires a Redshift cluster and proper IAM permissions.

Amazon OpenSearch Service: Firehose buffers data and then bulk indexes into an OpenSearch cluster. You can configure index rotation (e.g., daily, hourly) and document ID extraction.

Splunk: Firehose delivers data to Splunk via HTTP Event Collector (HEC). Requires a Splunk endpoint and token.

HTTP endpoint: Custom destination for on-premises systems or third-party APIs.

Performance and Scaling

Firehose automatically scales to handle throughput. The number of delivery streams per account and Region is capped by a soft quota that can be increased. Direct PUT throughput (records per second, requests per second, and MB per second) is also subject to soft quotas that vary by Region; the older figures of 5,000 records/second and 5 MB/second per stream still appear in study material. When a Kinesis Data Stream is the source, ingest capacity is set by the stream instead (1 MB/s or 1,000 records/s per shard, and you can add more shards). Each delivery batch can hold up to 128 MB. If producers exceed the Direct PUT quota, Firehose throttles them: PutRecord and PutRecordBatch return ServiceUnavailableException, and the producer should back off and retry.
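
When a Kinesis Data Stream feeds Firehose, the per-shard figures above turn sizing into simple arithmetic. A minimal sketch, assuming a provisioned stream and the 1 MB/s and 1,000 records/s per-shard write limits quoted in this section; `shards_needed` is a hypothetical helper.

```python
import math


def shards_needed(mb_per_second: float, records_per_second: float) -> int:
    """Shards required so that neither per-shard write limit is exceeded."""
    by_bytes = math.ceil(mb_per_second / 1.0)          # 1 MB/s per shard
    by_records = math.ceil(records_per_second / 1000)  # 1,000 records/s per shard
    return max(1, by_bytes, by_records)


print(shards_needed(12.5, 8000))   # byte rate dominates: 13
print(shards_needed(2, 25000))     # record rate dominates: 25
```

Whichever limit is hit first determines the shard count, which is why streams of many small records often need more shards than their byte rate suggests.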

Monitoring and Alarms

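CloudWatch metrics include:

- IncomingBytes, IncomingRecords
- DeliveryToS3.Success, DeliveryToS3.Records, DeliveryToS3.DataFreshness
- BackupToS3.Success, BackupToS3.Records (for the backup bucket)
- ThrottledRecords

Recommended alarms: ThrottledRecords > 0, DeliveryToS3.Success below 1 (the metric is a success ratio; there is no separate failure metric), and high DeliveryToS3.DataFreshness (the age of the oldest undelivered record, indicating slow delivery).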
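
As an illustration of the first alarm, the sketch below builds the keyword arguments for CloudWatch's `put_metric_alarm` call. The alarm name, period, and SNS topic are assumptions chosen for the example; the namespace `AWS/Firehose` and the `DeliveryStreamName` dimension are what Firehose metrics are published under.

```python
def throttled_records_alarm(stream_name: str, sns_topic_arn: str) -> dict:
    """Build keyword arguments for a 'ThrottledRecords > 0' alarm."""
    return {
        "AlarmName": f"{stream_name}-throttled-records",  # hypothetical naming scheme
        "Namespace": "AWS/Firehose",
        "MetricName": "ThrottledRecords",
        "Dimensions": [{"Name": "DeliveryStreamName", "Value": stream_name}],
        "Statistic": "Sum",
        "Period": 60,                 # evaluate one-minute sums
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no data means nothing was throttled
        "AlarmActions": [sns_topic_arn],
    }
```

With boto3 available, the dictionary can be passed as `boto3.client("cloudwatch").put_metric_alarm(**throttled_records_alarm(name, topic_arn))`.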

Error Handling and Retries

If delivery to a destination fails (e.g., S3 bucket permissions missing, Redshift cluster unavailable), Firehose retries: for up to 24 hours for S3, and for a configurable retry duration for the other destinations. You can configure a dead-letter (backup) S3 bucket to capture records that failed after all retries. For Redshift, if the COPY command fails, Firehose writes a manifest of the skipped objects to an errors/ folder in the intermediate S3 bucket. For OpenSearch, failed documents are written to the dead-letter bucket.

Pricing

You pay per GB of data ingested into Firehose. Optional features such as record format conversion, dynamic partitioning, and delivery into a VPC are billed separately. There is no charge for data transferred between Firehose and S3 within the same region. Lambda transformation costs are additional. The exam rarely asks about exact pricing, but you should know that Firehose is cost-effective for high-volume streaming data.

Walk-Through

Create Delivery Stream

Define the stream name, choose source (direct PUT or Kinesis Data Streams), configure optional transformations (Lambda), set destination (S3, Redshift, OpenSearch, Splunk, HTTP endpoint), specify buffering hints (size and interval), compression, encryption, and error handling (dead-letter bucket). If you create the stream in the console, it can create the required IAM role for you; with the CLI or an SDK you pass the role ARN yourself.

Data Ingestion

Producers send data to the Firehose endpoint using AWS SDK, Kinesis Agent, or via integrated services (CloudWatch Logs, IoT). Each record can be up to 1 MB (1,000 KiB) before base64 encoding; the SDKs encode the payload for you, while raw API calls must send the Data field base64-encoded. Firehose accepts the data and holds it in an internal buffer until a buffering hint is met.
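
Producers that send many records usually batch them with PutRecordBatch, which accepts at most 500 records and 4 MiB per call. The following is a sketch of client-side batching under those limits; `batch_records` is a hypothetical helper and the sample events are made up.

```python
MAX_RECORDS_PER_BATCH = 500
MAX_BYTES_PER_BATCH = 4 * 1024 * 1024
MAX_RECORD_BYTES = 1000 * 1024  # per-record limit before base64 encoding


def batch_records(payloads):
    """Group raw byte payloads into lists that fit one PutRecordBatch call."""
    batch, batch_bytes = [], 0
    for payload in payloads:
        if len(payload) > MAX_RECORD_BYTES:
            raise ValueError("record exceeds the 1,000 KiB per-record limit")
        full = len(batch) == MAX_RECORDS_PER_BATCH
        too_big = batch_bytes + len(payload) > MAX_BYTES_PER_BATCH
        if batch and (full or too_big):
            yield batch
            batch, batch_bytes = [], 0
        batch.append({"Data": payload})
        batch_bytes += len(payload)
    if batch:
        yield batch


# Newline-delimited JSON keeps records separable once Firehose concatenates them.
events = [b'{"click": %d}\n' % i for i in range(1200)]
print([len(b) for b in batch_records(events)])  # [500, 500, 200]
```

Each yielded list can be passed as `Records=batch` to a boto3 Firehose client's `put_record_batch`. The response's `FailedPutCount` should be checked, because a batch call can partially succeed.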

Optional Transformation via Lambda

If a transformation Lambda is configured, Firehose invokes it with a batch of records (sized by the transformation's own buffering hints). The Lambda function processes each record and returns every record with its original record ID and a result of Ok, Dropped, or ProcessingFailed. Dropped records are discarded intentionally; records marked ProcessingFailed are written to the S3 error output location. Each invocation must finish within 5 minutes, the maximum Firehose allows, so set the function timeout at or below that.
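
A minimal sketch of a transformation function, assuming the incoming records are JSON objects. The event and response shapes (`records`, `recordId`, `result`, base64 `data`) follow the Firehose transformation contract; the added `processed` field is purely illustrative.

```python
import base64
import json


def lambda_handler(event, context):
    """Firehose transformation: add a field to each JSON record."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # illustrative change only
            data = base64.b64encode((json.dumps(payload) + "\n").encode("utf-8"))
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": data.decode("utf-8"),
            })
        except (ValueError, TypeError):
            # Not a JSON object: hand the original data back, flagged as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Every input record appears in the response with its original `recordId`, which is what lets Firehose match results back to the batch it sent.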

Format Conversion

If enabled, Firehose converts incoming JSON to Apache Parquet or ORC. Other input formats, such as CSV, must first be converted to JSON by a Lambda transformation. The schema comes from a table in the AWS Glue Data Catalog, which you can define manually or with a crawler. The conversion happens after transformation and before buffering for delivery. This step is optional and is billed separately.

Buffering and Delivery

Firehose buffers data until either the buffer size (1-128 MB) or buffer interval (60-900 seconds) is reached, then delivers the batch to the destination. For S3, it writes an object with a unique name (e.g., `YYYY/MM/DD/HH/file_name.gz`). For Redshift, it first writes to S3 then issues a COPY command. For OpenSearch, it bulk indexes documents. Delivery retries for up to 24 hours on failure.
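
The default S3 time prefix is derived from the UTC arrival time. A small sketch of how a timestamp maps to that prefix; `default_prefix` is a hypothetical helper, and the object name that Firehose appends after the prefix is not modeled here.

```python
from datetime import datetime, timezone


def default_prefix(arrival: datetime) -> str:
    """Format the UTC time prefix YYYY/MM/DD/HH/ for a timezone-aware datetime."""
    return arrival.astimezone(timezone.utc).strftime("%Y/%m/%d/%H/")


print(default_prefix(datetime(2026, 7, 20, 9, 5, tzinfo=timezone.utc)))  # 2026/07/20/09/
```

Because the prefix is in UTC, queries that filter by local date need to account for the offset.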

Error Handling and Backup

If delivery fails after all retries, Firehose writes the failed records to the dead-letter S3 bucket (if configured). You can also enable backup of all original data to S3 before transformation. For Redshift, failed COPY commands generate error files in S3. CloudWatch metrics track success/failure rates, and you can set alarms for anomalies.

What This Looks Like on the Job

Enterprise Scenario 1: Real-Time Clickstream Analytics

A large e-commerce company wants to analyze user clickstream data in near-real-time to personalize recommendations and detect fraud. They use a Kinesis Data Firehose delivery stream with direct PUT from their web servers (using Kinesis Agent). The stream buffers data for 60 seconds or 64 MB (the minimum buffer size when format conversion is enabled), then delivers to an S3 bucket in Parquet format (converted using AWS Glue schema). From S3, Athena queries are run for ad-hoc analysis, and a separate pipeline loads data into Redshift for dashboarding. The company sets a dead-letter S3 bucket to capture any records that fail transformation or delivery. They monitor the ThrottledRecords and DeliveryToS3.Success metrics. A common misconfiguration when calling the API directly is forgetting to base64-encode the data, causing Firehose to reject records. Also, if the Glue schema changes, Parquet conversion fails; they handle this by versioning the schema.

Enterprise Scenario 2: Centralized Log Aggregation

A financial services firm collects logs from thousands of EC2 instances and on-premises servers using CloudWatch Logs subscription filters that feed into a Firehose delivery stream. The stream transforms logs via a Lambda function that masks sensitive data (e.g., credit card numbers) and converts timestamps to UTC. The transformed logs are delivered to Amazon OpenSearch Service for real-time search and alerting, with a backup copy to S3 for long-term archival. They use index rotation every hour to manage OpenSearch storage. A pitfall is that the Lambda transformation must be idempotent and fast; if it times out, Firehose retries the invocation and then treats the whole batch as failed, so those logs never reach OpenSearch. They set the Lambda timeout to 5 minutes and ensure the batch size is small enough to process within that window. They also configure a dead-letter S3 bucket for failed transformations.
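
The masking step in this scenario could look something like the sketch below. The regular expression is deliberately simplistic (13 to 16 digits with optional spaces or hyphens) and is an assumption for illustration, not a complete card-number detector; `mask_cards` is a hypothetical helper that a transformation function would call on each log line.

```python
import re

# Simplistic pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")


def mask_cards(line: str) -> str:
    """Mask anything that looks like a card number, keeping the last four digits."""
    def _mask(match):
        digits = re.sub(r"\D", "", match.group())
        return "*" * (len(digits) - 4) + digits[-4:]
    return CARD_PATTERN.sub(_mask, line)


print(mask_cards("paid with 4111 1111 1111 1111 at 10:02"))
# paid with ************1111 at 10:02
```

Applying the same function twice gives the same result, which is the idempotence the scenario calls for when Firehose retries a batch.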

Enterprise Scenario 3: IoT Sensor Data Ingestion

A manufacturing company collects sensor data from thousands of IoT devices via AWS IoT Core rules that route messages to a Firehose delivery stream. The stream delivers data to S3 in ORC format for later analysis with Amazon EMR and Redshift Spectrum. They use a buffer interval of 900 seconds to reduce small file counts in S3. A common issue is that the IoT rule sends data in a nested JSON format that Firehose cannot convert to ORC without a proper Glue schema. They mitigate by using a Lambda transformation to flatten the JSON before conversion. They also enable backup to S3 for raw data in case of schema changes.
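
The flattening step mentioned above can be sketched as a small recursive function that a transformation Lambda would apply to each decoded record. The function name, the underscore separator, and the sample sensor reading are assumptions for illustration.

```python
def flatten(obj: dict, parent: str = "", sep: str = "_") -> dict:
    """Flatten nested dictionaries into a single level with joined key names."""
    flat = {}
    for key, value in obj.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat


reading = {"device": {"id": "pump-7", "site": "plant-a"}, "temp_c": 71.5}
print(flatten(reading))
# {'device_id': 'pump-7', 'device_site': 'plant-a', 'temp_c': 71.5}
```

The flattened key names are what the Glue table columns would need to match for the conversion to pick up each field.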

How SAA-C03 Actually Tests This

What SAA-C03 Tests on This Topic

The exam objectives for Kinesis Data Firehose fall under Domain 3: Design High-Performing Architectures, specifically the task statement "Determine high-performing data ingestion and transformation solutions" (Task Statement 3.5 in the official exam guide). Questions typically present a scenario requiring near-real-time data ingestion (not real-time) and ask you to select Firehose over Kinesis Data Streams or vice versa. Key differentiators: Firehose is fully managed, has built-in buffering, transformation, and delivery to destinations; Kinesis Data Streams requires custom consumers and is for real-time processing with sub-second latency.

Common Wrong Answers and Why Candidates Choose Them

Choosing Kinesis Data Streams when the requirement is to load data into S3 or Redshift: Candidates think 'streaming' means they need Data Streams, but Firehose is designed for exactly this purpose with less operational overhead.

Selecting Firehose for real-time, sub-second processing: Firehose has a minimum buffer interval of 60 seconds, so it is not suitable for real-time analytics requiring immediate action. Candidates overlook the buffering latency.

Assuming Firehose can directly write to DynamoDB or RDS: Firehose only supports S3, Redshift, OpenSearch, Splunk, and HTTP endpoints. Candidates often confuse it with Kinesis Data Analytics or other services.

Thinking Firehose provides exactly-once delivery: Firehose provides at-least-once delivery, meaning duplicates are possible. Candidates may assume it's exactly-once because of the word 'delivery'.

Specific Numbers and Terms on the Exam

Buffer size: 1 MB to 128 MB (default 5 MB)

Buffer interval: 60 to 900 seconds (default 300 seconds)

Maximum record size: 1 MB (1,000 KiB), measured before base64 encoding

Retry duration: up to 24 hours

Supported destinations: S3, Redshift, OpenSearch, Splunk, HTTP endpoints

Transformation: AWS Lambda

Format conversion: Parquet, ORC (requires Glue schema)

Compression: GZIP, Snappy, ZIP, Hadoop-compatible Snappy

Encryption: SSE-KMS, or SSE-S3 via bucket default encryption (SSE-C is not supported)

Source options: Direct PUT, Kinesis Data Streams, CloudWatch Logs, IoT, EventBridge

Edge Cases and Exceptions

Firehose cannot deliver to S3 in a different region unless you use cross-region replication on the S3 bucket.

For Redshift destination, the S3 bucket and Redshift cluster must be in the same AWS region.

Firehose does not guarantee ordering of records within a delivery batch, but records from a single producer are generally ordered.

If the Lambda transformation cannot process a record, the function can mark it ProcessingFailed, and Firehose writes it to the S3 error output location (the dead-letter bucket); records the function marks Dropped are discarded.

Firehose supports interface VPC endpoints (AWS PrivateLink), so producers in private subnets can reach it without a NAT gateway or internet gateway.

How to Eliminate Wrong Answers

If the scenario mentions 'real-time' with sub-second latency, eliminate Firehose.

If the scenario requires custom processing logic (e.g., aggregations, joins), choose Kinesis Data Streams + Lambda/EC2.

If the requirement is to load data into S3/Redshift/OpenSearch with minimal coding, choose Firehose.

If the scenario mentions 'exactly-once delivery', Firehose is not the answer (it's at-least-once).

If the scenario mentions 'data transformation' without specifying a destination, Firehose can do it with Lambda.

Key Takeaways

Firehose is for near-real-time data delivery (60s+ latency) to S3, Redshift, OpenSearch, Splunk, or HTTP endpoints.

Buffering is configurable: buffer size 1-128 MB (default 5 MB) and buffer interval 60-900 seconds (default 300s).

Data transformation is optional via AWS Lambda; format conversion to Parquet/ORC requires a Glue schema.

Maximum record size is 1 MB (1,000 KiB) before base64 encoding.

Firehose provides at-least-once delivery; duplicates are possible.

Supports compression (GZIP, Snappy, ZIP, Hadoop-compatible Snappy) and encryption (SSE-KMS, or SSE-S3 via bucket default encryption).

Error handling: retries up to 24 hours; dead-letter S3 bucket for failed records.

Firehose can consume from Kinesis Data Streams, CloudWatch Logs, IoT, or EventBridge.

For Redshift destination, Firehose first writes to S3 then issues a COPY command.

Firehose supports interface VPC endpoints (AWS PrivateLink); a NAT gateway is not required to reach it from private subnets.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Kinesis Data Firehose

Fully managed delivery service – no consumer code needed

Built-in buffering (60-900s, 1-128 MB)

Automatic data transformation via Lambda

Directly delivers to S3, Redshift, OpenSearch, Splunk, HTTP endpoints

At-least-once delivery; no replay capability

Kinesis Data Streams

Low-level streaming platform – custom consumers required

Real-time with sub-second latency (no buffering)

No built-in transformation; consumers process raw data

No native delivery; consumers write to destinations

At-least-once delivery with per-shard ordering; supports replay via shard iterator within the retention period

Watch Out for These

Mistake

Firehose provides real-time data delivery with sub-second latency.

Correct

Firehose has a minimum buffer interval of 60 seconds, so it delivers data in near-real-time (60 seconds or more). For sub-second latency, use Kinesis Data Streams.

Mistake

Firehose can directly write to DynamoDB or RDS.

Correct

Firehose only supports S3, Redshift, OpenSearch, Splunk, and HTTP endpoints. To write to DynamoDB or RDS, use Kinesis Data Streams with a Lambda function or custom consumer that writes to the database.

Mistake

Firehose guarantees exactly-once delivery.

Correct

Firehose provides at-least-once delivery. Duplicates can occur if the destination acknowledges after a timeout and Firehose retries. For exactly-once, you need to implement deduplication in the destination.

Mistake

Firehose can ingest data from any source without any configuration.

Correct

Firehose supports only specific sources: direct PUT, Kinesis Data Streams, CloudWatch Logs, IoT, and EventBridge. For other sources, you need to use the AWS SDK or Kinesis Agent to send data.

Mistake

Firehose automatically handles schema evolution for Parquet/ORC conversion.

Correct

Firehose requires a schema from AWS Glue Data Catalog or manual definition. If the schema changes, conversion may fail unless you update the Glue table or use a Lambda transformation to adapt.

Frequently Asked Questions

What is the difference between Kinesis Data Firehose and Kinesis Data Streams?

Kinesis Data Firehose is a fully managed service for loading streaming data into data stores and analytics tools. It automatically buffers, transforms, and delivers data to destinations like S3, Redshift, OpenSearch, and Splunk. Kinesis Data Streams is a low-level streaming data platform that stores data for up to 365 days and requires custom consumers to process and deliver data. Firehose is for near-real-time delivery (60s+ latency), while Data Streams supports real-time, sub-second processing. For the exam, choose Firehose when the requirement is to load data into a supported destination with minimal coding, and choose Data Streams when you need real-time processing, custom consumer logic, or data replay.

Can Firehose deliver data to multiple destinations?

Firehose can deliver to only one destination type per delivery stream. If you need multiple destinations, you can create multiple delivery streams or use a Kinesis Data Streams source and have multiple Firehose streams consuming from it. Alternatively, you can use a Lambda function to fan out data to multiple destinations, but that adds complexity. The exam often tests that Firehose has a single destination per stream.

What happens if the Lambda transformation fails?

If the Lambda function fails (e.g., timeout, error), Firehose retries the invocation (three times by default). If it still fails, Firehose skips that batch and writes the records to the S3 error output location (the dead-letter bucket) rather than silently dropping them. Records are discarded only when your function explicitly returns them with a result of Dropped. You should also set CloudWatch alarms to monitor transformation failures.

How does Firehose handle data delivery to Redshift?

Firehose first writes the data to an intermediate S3 bucket in batches. Then it issues a Redshift COPY command to load the data from S3 into the Redshift table. The COPY command uses the specified IAM role for permissions. If the COPY fails, Firehose writes a manifest of the skipped objects to an `errors/` folder in the S3 bucket. The S3 bucket and Redshift cluster must be in the same AWS region.

Can Firehose guarantee exactly-once delivery?

No, Firehose provides at-least-once delivery. Duplicates can occur if the destination acknowledges after a timeout and Firehose retries the batch. To achieve exactly-once, you need to implement deduplication logic in the destination (e.g., using unique identifiers in S3 or primary keys in Redshift). The exam may test this: Firehose is not suitable for use cases requiring exactly-once semantics.
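
Deduplication on the consuming side can be as simple as tracking a unique identifier per record. The sketch below assumes each record carries a producer-assigned `event_id` field, which is a hypothetical name; Firehose does not add such a field for you.

```python
def deduplicate(records, key="event_id"):
    """Keep the first occurrence of each key, preserving arrival order."""
    seen = set()
    unique = []
    for record in records:
        if record[key] in seen:
            continue  # duplicate delivered by a retry
        seen.add(record[key])
        unique.append(record)
    return unique


delivered = [
    {"event_id": "a1", "amount": 10},
    {"event_id": "b2", "amount": 25},
    {"event_id": "a1", "amount": 10},  # redelivered after a retry
]
print(deduplicate(delivered))
```

In a warehouse such as Redshift, the same idea is usually expressed as a merge or a `SELECT DISTINCT` on the identifier rather than in application code.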

What is the maximum record size for Firehose?

The maximum record size is 1 MB (1,000 KiB) before base64 encoding. Base64 makes the payload about one third larger, so an encoded record can approach 1.4 MB. If the raw record exceeds the limit, Firehose will reject it. This is a common exam detail: remember 1 MB as the maximum record size.
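
The encoded size follows from base64 turning every 3 bytes into 4 characters. A quick check of the arithmetic, assuming the 1,000 KiB raw limit from the Firehose service quotas; `base64_length` is a hypothetical helper.

```python
import base64

MAX_RECORD_BYTES = 1000 * 1024  # 1,000 KiB of raw data


def base64_length(raw_bytes: int) -> int:
    """Length of the padded base64 encoding of raw_bytes bytes."""
    return 4 * ((raw_bytes + 2) // 3)


print(base64_length(MAX_RECORD_BYTES))                 # 1365336
print(len(base64.b64encode(bytes(MAX_RECORD_BYTES))))  # 1365336
```

So a maximum-size record is about 1.37 million characters on the wire, while the limit itself is always measured on the raw bytes.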

Does Firehose support VPC endpoints?

Yes. Firehose supports interface VPC endpoints powered by AWS PrivateLink, so producers in a private subnet can send data without a NAT gateway or public endpoint. Gateway endpoints are not used for Firehose; they exist only for S3 and DynamoDB. Older study material sometimes states that Firehose has no VPC endpoint, so read answer choices carefully.
