AZ-204, Chapter 66 of 102, Objective 5.3

Event Hubs: Partitions, Consumer Groups, Capture

This chapter covers Azure Event Hubs, focusing on three critical features: partitions, consumer groups, and Capture. These concepts are central to building scalable event ingestion pipelines on Azure. For the AZ-204 exam, Event Hubs appears in approximately 5-10% of questions, often integrated with other services like Azure Functions, Stream Analytics, or Blob Storage. Mastering partitions, consumer groups, and Capture will help you design solutions that scale, maintain ordering, and archive event data efficiently.

25 min read
Intermediate
Updated May 31, 2026

Event Hubs as a Multi-Lane Toll Plaza

Imagine a massive toll plaza with multiple lanes. Each lane (partition) is an independent queue that processes vehicles (events) in strict order. The plaza manager assigns each vehicle to a lane based on its license plate number (partition key), so vehicles with the same plate always use the same lane. Several inspection teams (consumer groups) work the plaza, each for its own purpose: one counts vehicles, another inspects cargo, and a third records video. Every team watches all lanes and keeps its own record of how far it has gotten. A team can have multiple operators (consumers), but each lane is covered by only one operator per team, which avoids duplicate work. The plaza also has a camera system (Event Hubs Capture) that automatically takes a snapshot of all vehicles passing through every few minutes and stores the images in a warehouse (Azure Blob Storage or Data Lake Storage). If the snapshot interval is set to 5 minutes, the system writes a file containing all vehicles that passed in that window. This allows later analysis without replaying the entire stream. The key insight: partitions provide ordering and scale, consumer groups allow multiple independent readers, and Capture provides automatic archiving.

How It Actually Works

What is Azure Event Hubs?

Azure Event Hubs is a managed, real-time data ingestion service that can receive millions of events per second from sources like applications, devices, or services. It acts as a distributed, buffered event stream that decouples event producers from event consumers. The service provides a unified platform for event processing, analytics, and storage.

Partitions: The Foundation of Scale

A partition is an ordered sequence of events stored in an Event Hub. Each partition is an independent, append-only log. When you create an Event Hub, you specify a partition count between 2 and 32 (default 4). In the Basic and Standard tiers the partition count cannot be changed after creation (the Premium and Dedicated tiers allow it to be increased, but never decreased), so choose carefully based on expected throughput and consumer parallelism.

How partitions work: Events are distributed across partitions using a partition key. The producer can specify a partition key (e.g., device ID, user ID). Events with the same partition key are guaranteed to go to the same partition, preserving order. If no partition key is provided, events are distributed in a round-robin fashion.
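The routing rule above can be sketched in a few lines. This is an illustrative model only: the service applies its own internal hash, so the SHA-256 hash, the `pick_partition` helper, and the `RoundRobin` class are hypothetical stand-ins, not the Event Hubs algorithm.

```python
import hashlib
from itertools import count


def pick_partition(partition_key: str, partition_count: int) -> int:
    """Map a key to a partition with a stable hash (stand-in for the service's own hash)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


class RoundRobin:
    """Cycle through partitions for events sent without a partition key."""

    def __init__(self, partition_count: int) -> None:
        self._partition_count = partition_count
        self._counter = count()

    def next_partition(self) -> int:
        return next(self._counter) % self._partition_count


if __name__ == "__main__":
    # The same key always maps to the same partition, so per-device order is preserved.
    print(pick_partition("device-42", 8) == pick_partition("device-42", 8))
    rr = RoundRobin(4)
    print([rr.next_partition() for _ in range(6)])  # cycles 0, 1, 2, 3, 0, 1
```

The point of the sketch is the contract, not the hash: a stable function of the key keeps related events together, while keyless events spread evenly.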

Throughput unit (TU) scaling: Each TU provides 1 MB/s or 1000 events/s of ingress (whichever limit is reached first) and 2 MB/s of egress. You scale capacity by adding TUs, while the number of partitions limits the maximum parallelism for consumers: each partition can be read by only one consumer per consumer group at a time when you use EventProcessorHost or a similar checkpointing processor.

Ordering guarantee: Within a partition, events are delivered in the order they were received. Across partitions, no order is guaranteed. This is a common exam point: if you need strict ordering for all events, you must ensure they all go to the same partition (e.g., by using the same partition key), which limits throughput.

Partition ownership: When using the EventProcessorHost, each partition is owned by a single consumer instance. If a consumer fails, its partitions are rebalanced to another instance, which resumes from the last checkpoint. This gives at-least-once processing: events handled after the last checkpoint can be delivered again, so handlers should be idempotent.
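A minimal sketch of single-owner partition assignment follows. It assumes a simple round-robin split; real processors coordinate ownership through leases stored in a storage account, and the `assign_partitions` helper and worker names here are hypothetical.

```python
from typing import Dict, List


def assign_partitions(partition_ids: List[str], consumers: List[str]) -> Dict[str, List[str]]:
    """Spread partitions across consumers so each partition has exactly one owner."""
    if not consumers:
        raise ValueError("at least one consumer is required")
    ownership: Dict[str, List[str]] = {name: [] for name in consumers}
    for index, partition_id in enumerate(partition_ids):
        owner = consumers[index % len(consumers)]
        ownership[owner].append(partition_id)
    return ownership


if __name__ == "__main__":
    partitions = ["0", "1", "2", "3"]
    print(assign_partitions(partitions, ["worker-a", "worker-b"]))
    # worker-b fails: its partitions move to the surviving instance.
    print(assign_partitions(partitions, ["worker-a"]))
    # More consumers than partitions: the extra instances own nothing and sit idle.
    print(assign_partitions(partitions, ["w1", "w2", "w3", "w4", "w5", "w6"]))
```

Recomputing the assignment after a failure models rebalancing: every partition still has exactly one owner, just fewer owners overall.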

Consumer Groups: Independent Views of the Stream

A consumer group is a logical group of consumers that independently read from an Event Hub. Each consumer group maintains its own offset and checkpoint state. This allows multiple applications to read the same event stream without interfering.

Default consumer group: Every Event Hub has a default consumer group named "$Default". You can create up to 20 consumer groups per Event Hub (standard tier) or 100 (dedicated tier).

Use cases: One consumer group might be used for real-time processing (e.g., Azure Stream Analytics), another for a custom archival job, and another for long-term analytics (e.g., Azure Databricks).

Consumer group and partitions: Each consumer group reads from all partitions. Within a consumer group, each partition is read by at most one consumer (for checkpointed processing). This ensures that events are not duplicated across consumers in the same group.

Exam trap: A common wrong answer is that consumer groups isolate events between groups. Actually, all consumer groups see the same events; they just track their own position independently.
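The difference between "same events, separate positions" and "separate events" can be shown with a small in-memory model. The `PartitionLog` class and the group names are hypothetical; a real consumer stores its position as a checkpoint outside the service.

```python
from collections import defaultdict
from typing import Dict, List


class PartitionLog:
    """In-memory stand-in for one partition: an append-only list of events."""

    def __init__(self) -> None:
        self._events: List[str] = []
        # One read position per consumer group; a group that has never read starts at 0.
        self._offsets: Dict[str, int] = defaultdict(int)

    def append(self, event: str) -> None:
        self._events.append(event)

    def read(self, consumer_group: str, max_events: int) -> List[str]:
        """Return the next events for this group and advance only this group's offset."""
        start = self._offsets[consumer_group]
        batch = self._events[start:start + max_events]
        self._offsets[consumer_group] = start + len(batch)
        return batch


if __name__ == "__main__":
    log = PartitionLog()
    for name in ("e1", "e2", "e3"):
        log.append(name)
    print(log.read("analytics", 2))  # first two events for this group
    print(log.read("$Default", 3))   # another group still sees all three
    print(log.read("analytics", 2))  # the first group continues where it stopped
```

Reading through one group never moves another group's offset, which is why separate applications should not share a group.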

Event Hubs Capture: Automatic Archiving

Event Hubs Capture automatically writes the events from an Event Hub to Azure Blob Storage or Azure Data Lake Storage Gen2 in Avro or Parquet format. It is a no-code solution for archiving event data.

- Configuration: You enable Capture at the Event Hub level. You specify a time window (1-15 minutes) and a size window (10-500 MB). The first condition met triggers a file write. The defaults are 5 minutes and 300 MB.
- File naming convention: Capture writes files with the following pattern: {Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}
- Format: Avro is the default format. You can also choose Parquet. The Avro file includes the schema and event data in a self-describing format.
- Performance considerations: Capture does not affect the event ingestion pipeline. It reads from the Event Hub as a separate consumer (using a hidden consumer group) and writes to storage. Ensure the storage account can absorb the resulting write throughput.
- Exam tip: Capture is often used with Azure Stream Analytics or Azure Data Lake Analytics for batch processing. Remember that Capture writes files per partition, so if you have 32 partitions, you get up to 32 files per window.
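The "first condition met" rule and the naming pattern can be sketched as two small functions. Both helpers are hypothetical, the defaults mirror the 5-minute and 300 MB values above, and the sketch assumes zero-padded date parts and leaves out the file extension.

```python
from datetime import datetime, timezone


def should_flush(elapsed_seconds: float, buffered_bytes: int,
                 interval_seconds: int = 300,
                 size_limit_bytes: int = 300 * 1024 * 1024) -> bool:
    """A file is written when either the time window or the size window is reached."""
    return elapsed_seconds >= interval_seconds or buffered_bytes >= size_limit_bytes


def capture_blob_path(namespace: str, event_hub: str, partition_id: str, when: datetime) -> str:
    """Fill in {Namespace}/{EventHub}/{PartitionId}/{Year}/.../{Second} for one file."""
    return f"{namespace}/{event_hub}/{partition_id}/" + when.strftime("%Y/%m/%d/%H/%M/%S")


if __name__ == "__main__":
    print(should_flush(elapsed_seconds=120, buffered_bytes=10_000))              # neither limit hit
    print(should_flush(elapsed_seconds=120, buffered_bytes=300 * 1024 * 1024))   # size limit hit first
    stamp = datetime(2026, 5, 31, 9, 5, 0, tzinfo=timezone.utc)
    print(capture_blob_path("mynamespace", "myeventhub", "0", stamp))
```

Because the partition ID is part of the path, each partition produces its own file for every window.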

How Partitions, Consumer Groups, and Capture Interact

Partitions provide the physical scale and ordering. Consumer groups provide logical separation of readers. Capture provides automatic durability and archival.

When Capture is enabled, it reads from all partitions through its own hidden consumer group, so it does not conflict with the consumer groups you create.

Maximum throughput is determined by TUs, which are purchased at the namespace level: each TU provides 1 MB/s (or 1000 events/s) of ingress. The partition count does not add capacity by itself, but a single partition can absorb roughly one TU's worth of ingress, so you need at least as many partitions as TUs to use them fully. On the consumer side, if a consumer group has more consumers than partitions, the extra consumers will be idle.
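The sizing arithmetic can be written down as a small helper. This is a sketch using only the per-TU ingress figures quoted in this chapter (1 MB/s or 1000 events/s); the function name is hypothetical and egress is ignored.

```python
import math


def required_throughput_units(ingress_mb_per_s: float, events_per_s: float) -> int:
    """TUs needed for ingress: each TU covers 1 MB/s or 1000 events/s, whichever binds first."""
    by_bandwidth = math.ceil(ingress_mb_per_s / 1.0)
    by_event_rate = math.ceil(events_per_s / 1000.0)
    return max(1, by_bandwidth, by_event_rate)


if __name__ == "__main__":
    # Bandwidth-bound workload: 10 MB/s of large events.
    print(required_throughput_units(ingress_mb_per_s=10, events_per_s=5_000))
    # Rate-bound workload: many small events.
    print(required_throughput_units(ingress_mb_per_s=2, events_per_s=20_000))
```

Whichever limit yields the larger number decides the TU count; the partition count should then be at least that large so every TU can be used.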

Configuration and Verification

To create an Event Hub with partitions and Capture using Azure CLI:

# Flag names follow recent Azure CLI releases; confirm with: az eventhubs eventhub create --help
az eventhubs eventhub create --name myeventhub \
  --namespace-name mynamespace \
  --resource-group myrg \
  --partition-count 8 \
  --cleanup-policy Delete \
  --retention-time-in-hours 24 \
  --enable-capture true \
  --capture-interval 300 \
  --capture-size-limit 314572800 \
  --destination-name EventHubArchive.AzureBlockBlob \
  --blob-container blobcontainer \
  --storage-account /subscriptions/.../storageAccounts/mystorage

To verify the partition count and partition IDs:

az eventhubs eventhub show --name myeventhub --namespace-name mynamespace --resource-group myrg --query "{count:partitionCount, ids:partitionIds}"

To list consumer groups:

az eventhubs consumer-group list --eventhub-name myeventhub --namespace-name mynamespace --resource-group myrg

Related Technologies

Azure Functions: Can be triggered by Event Hubs. The function is invoked with batches of events. The batch size and prefetch settings in host.json (maxBatchSize, named maxEventBatchSize in newer extension versions, and prefetchCount) affect performance.

Azure Stream Analytics: Can read from Event Hubs as input and write to Event Hubs as output. It uses its own consumer group.

Azure Data Lake Storage: Capture destination. Used for long-term storage and batch analytics.

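EventProcessorHost: A .NET library that simplifies reading from Event Hubs with checkpointing and partition lease management. It belongs to the legacy SDK; the current Azure.Messaging.EventHubs library provides EventProcessorClient for the same role.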

Key Numbers and Defaults for the Exam

Partition count: 2-32, default 4. Cannot be changed after creation.

Consumer groups: up to 20 (standard), 100 (dedicated).

Capture time window: 1-15 minutes, default 5.

Capture size window: 10-500 MB, default 300.

Message retention: 1-7 days (standard), up to 90 days (dedicated).

Throughput units: 1-20 (standard), auto-inflate available.

Maximum ingress per TU: 1 MB/s or 1000 events/s.

Maximum egress per TU: 2 MB/s.

Walk-Through

1. Define partition key strategy

Before creating an Event Hub, determine how you will distribute events across partitions. If you need ordering for a specific subset of events (e.g., all events from a device), use the device ID as the partition key. If ordering is not critical, omit the partition key to allow round-robin distribution. This decision affects both throughput and consumer parallelism.

2. Create Event Hub with partition count

Use Azure portal, CLI, or ARM template to create the Event Hub. Specify the partition count based on expected throughput. For example, if you anticipate 10 MB/s ingress, with 1 MB/s per TU, you need at least 10 TUs and at least 10 partitions to fully utilize them. Partition count cannot be changed later, so over-provision slightly (e.g., 16 partitions for future growth).

3. Configure consumer groups

Create additional consumer groups for each independent consumer application. For instance, one group for real-time processing, one for archival, and one for analytics. Each group will maintain its own offset. The default $Default group is used by the Azure portal and some SDKs. Avoid using the same group for multiple purposes to prevent offset conflicts.

4. Enable Event Hubs Capture

In the Event Hub settings, enable Capture and specify the storage account, container, and file naming convention. Set the time window (e.g., 5 minutes) and size window (e.g., 300 MB). Capture will automatically write Avro files. Ensure the storage account is in the same region to minimize latency and cost.

5. Implement consumer with checkpointing

When building a consumer application, use the EventProcessorHost (for .NET) or a similar SDK that supports checkpointing. The consumer reads events from each partition and periodically saves the offset (checkpoint) to a storage account. If the consumer restarts, it resumes from the last checkpoint. Events processed after that checkpoint are delivered again, so this gives at-least-once processing; make the handler idempotent if duplicates must have no effect.
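Resuming from a checkpoint can replay events that were handled after the last save. The following simulation sketches that behavior with a plain list standing in for a partition; `run_consumer` and its parameters are hypothetical and no SDK is involved.

```python
from typing import List, Optional


def run_consumer(events: List[str], start_offset: int, checkpoint_every: int,
                 processed: List[str], crash_after: Optional[int] = None) -> int:
    """Process events from start_offset and return the last saved checkpoint.

    A checkpoint is the offset of the next event to read. If crash_after is set,
    the consumer stops after handling that many events without a final checkpoint.
    """
    checkpoint = start_offset
    handled = 0
    for offset in range(start_offset, len(events)):
        processed.append(events[offset])
        handled += 1
        if handled % checkpoint_every == 0:
            checkpoint = offset + 1
        if crash_after is not None and handled == crash_after:
            return checkpoint
    return len(events)  # clean shutdown: checkpoint after the final event


if __name__ == "__main__":
    events = [f"e{i}" for i in range(10)]
    processed: List[str] = []
    saved = run_consumer(events, 0, checkpoint_every=4, processed=processed, crash_after=6)
    run_consumer(events, saved, checkpoint_every=4, processed=processed)
    print(processed)                       # events after the checkpoint appear twice
    print(list(dict.fromkeys(processed)))  # an idempotent handler would see each event once
```

Checkpointing less often reduces storage calls but widens the replay window after a crash, which is the trade-off behind choosing a checkpoint interval.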

What This Looks Like on the Job

Enterprise Scenario 1: IoT Device Telemetry

A manufacturing company ingests telemetry from 100,000 IoT sensors. Each sensor sends data every 5 seconds. They create an Event Hub with 32 partitions and use the sensor ID as the partition key to maintain per-sensor order. They enable Capture with a 5-minute window to archive raw data to Azure Data Lake Storage for historical analysis. A real-time Azure Stream Analytics job reads from a separate consumer group to detect anomalies and trigger alerts. The system handles 20,000 events per second (100,000 sensors divided by 5 seconds). A common misconfiguration is using too few partitions or a skewed partition key, which creates hot partitions where one partition receives far more data than the others and is throttled. They solved this by making sure partition keys are evenly distributed: keying on the individual sensor ID spreads load well because the key is hashed to choose a partition.

Enterprise Scenario 2: E-Commerce Order Processing

An e-commerce platform processes orders from multiple sources. They use Event Hubs to decouple order ingestion from processing. Each order event uses the order ID as its partition key to ensure ordering within an order. They have 16 partitions. Capture is enabled with a 10-minute window to store raw orders for auditing. Multiple microservices read from different consumer groups: one for payment processing, one for inventory updates, and one for analytics. The payment processor uses EventProcessorHost with checkpointing and idempotent handlers, so an order replayed after a restart is not charged twice. A common issue is checkpointing too frequently, which degrades performance. They checkpoint every 1000 events or every 30 seconds, whichever comes first.

Enterprise Scenario 3: Log Aggregation

A SaaS company aggregates logs from thousands of servers. They use Event Hubs with 8 partitions and no partition key (round-robin) to distribute load evenly. They initially enabled Capture with a 1-minute window to minimize data loss. The logs are then processed by Azure Functions for real-time alerting and by Azure Data Explorer for long-term analysis. They learned that a very short capture window (e.g., 1 minute) creates many small files, increasing storage costs, and moved to 5 minutes. Also, they initially did not create a consumer group for the Azure Functions trigger, so it shared $Default with other readers and caused conflicts. They now always create dedicated consumer groups.
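The small-files effect in the log aggregation scenario is easy to quantify. This sketch counts only time-triggered files (size-triggered writes would add more) and uses a hypothetical helper name.

```python
def max_capture_files_per_day(partition_count: int, window_minutes: int) -> int:
    """Upper bound on time-triggered Capture files per day: one per partition per window."""
    if not 1 <= window_minutes <= 15:
        raise ValueError("Capture time window must be between 1 and 15 minutes")
    windows_per_day = (24 * 60) // window_minutes
    return partition_count * windows_per_day


if __name__ == "__main__":
    print(max_capture_files_per_day(partition_count=8, window_minutes=1))  # 8 * 1440
    print(max_capture_files_per_day(partition_count=8, window_minutes=5))  # 8 * 288
```

Moving from a 1-minute to a 5-minute window cuts the file count by a factor of five for the same data volume.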

How AZ-204 Actually Tests This

The AZ-204 exam tests Event Hubs under objective "Integrate" with a focus on designing and implementing event-driven solutions. Key areas:

1. Partition key and ordering: The exam often asks whether events are ordered across partitions. The correct answer is: ordering is guaranteed only within a partition. If you need global ordering, you must use a single partition, which limits scale. A common wrong answer is that Event Hubs guarantees global ordering.

2. Consumer groups: The exam tests that consumer groups provide independent views of the stream. A typical question: "You have two applications reading from the same Event Hub. How do you ensure they don't interfere?" The answer is to use separate consumer groups. A wrong answer is to use separate partitions.

3. Capture settings: Know the default capture time (5 minutes) and size (300 MB). The exam may ask: "You need to archive events every 2 minutes. Can you set the time window to 2 minutes?" Yes, the range is 1-15 minutes. Another trap: Capture writes one file per partition per window. So with 4 partitions and a 5-minute window, you get up to 4 files every 5 minutes.

4. Throughput units vs partitions: A common mistake is thinking partitions directly affect throughput. Actually, TUs determine throughput. Partitions limit parallelism. The exam might present a scenario: "You need to increase ingress throughput from 1 MB/s to 10 MB/s. What do you do?" Answer: increase TUs, not partitions. However, if you have only 2 partitions, at most 2 consumers per consumer group can read in parallel, so you might need more partitions for consumer parallelism.

5. Event retention: The default retention is 1 day, configurable up to 7 days (standard) or 90 days (dedicated). The exam may ask: "Events must be available for reprocessing for 30 days. Which tier?" Dedicated tier.

6. Capture vs. Event Grid: Capture is for archiving to storage. Event Grid is for event routing. The exam might confuse these. Remember: Capture writes to Blob/ADLS; Event Grid sends events to subscribers.

7. Checkpointing: The exam tests that checkpointing enables resuming from the last processed event. A wrong answer is that checkpointing by itself ensures exactly-once delivery. Event Hubs delivers at least once, so exactly-once processing also requires idempotent handlers.

Eliminate wrong answers by focusing on the underlying mechanism: partitions are physical logs, consumer groups are logical pointers, Capture is a separate consumer writing to storage.

Key Takeaways

Partitions provide ordered event streams within each partition; global ordering requires a single partition.

Partition count is immutable after Event Hub creation; choose based on expected parallelism.

Consumer groups allow multiple independent applications to read the same event stream without conflict.

Event Hubs Capture automatically writes events to Blob Storage or Data Lake Storage in Avro or Parquet format.

Default capture time window is 5 minutes; range is 1-15 minutes. Default size window is 300 MB; range is 10-500 MB.

Throughput Units (TUs) control ingress/egress capacity; partitions enable consumer parallelism.

Checkpointing stores the offset of the last processed event so a consumer can resume; exactly-once processing additionally requires idempotent handling, because delivery is at-least-once.

Event Hubs can retain events for 1-7 days (standard) or up to 90 days (dedicated).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Event Hubs

Designed for high-throughput event ingestion (millions of events per second).

Supports multiple consumer groups for independent readers.

No built-in device management or twin support.

Partition count is fixed at creation (2-32).

Capture provides automatic archiving to storage.

IoT Hub

Designed for IoT device communication and management.

Supports device-to-cloud and cloud-to-device messaging.

Includes device identity registry, twins, and direct methods.

Built-in per-device authentication and throttling.

Events are stored for up to 7 days (default 1 day).

Watch Out for These

Mistake

Event Hubs guarantees global ordering across all partitions.

Correct

Ordering is only guaranteed within a single partition. Across partitions, events arrive in an undefined order.

Mistake

Adding more partitions increases throughput automatically.

Correct

Throughput is determined by Throughput Units (TUs), not partition count. Partitions enable parallelism for consumers but do not directly increase ingress/egress capacity.

Mistake

Consumer groups filter events so different groups see different events.

Correct

All consumer groups see the same set of events. Each group independently tracks its own offset, but the event stream is identical.

Mistake

Event Hubs Capture writes one file per Event Hub per time window.

Correct

Capture writes one file per partition per time window (or size window). So with 4 partitions, you get up to 4 files per window.

Mistake

You can change the partition count after creating an Event Hub.

Correct

Partition count is fixed at creation and cannot be changed. You must create a new Event Hub if you need a different partition count.


Frequently Asked Questions

What is the difference between a partition and a consumer group?

A partition is a physical, ordered log that stores events. A consumer group is a logical grouping of consumers that independently read from all partitions. Partitions provide scale and ordering; consumer groups provide separate views of the stream. Each consumer group can have multiple consumers, but each partition is read by at most one consumer per group.

Can I change the partition count of an existing Event Hub?

No, the partition count is set at creation and cannot be changed. You would need to create a new Event Hub with the desired partition count and migrate your producers and consumers.

How does Event Hubs Capture work?

Capture is a feature that automatically writes events from an Event Hub to Azure Blob Storage or Azure Data Lake Storage Gen2. It uses a hidden consumer group to read events and writes files in Avro or Parquet format based on a time window (1-15 min, default 5) or size window (10-500 MB, default 300). The first condition met triggers the write.

What is the default consumer group name?

The default consumer group is named "$Default". Every Event Hub has this group automatically. It is used by the Azure portal and many SDKs if no other group is specified.

How many consumer groups can I have per Event Hub?

In the standard tier, you can have up to 20 consumer groups. In the dedicated tier, up to 100. The default $Default group counts toward this limit.

What happens if I exceed my Throughput Units?

If ingress exceeds the allocated TUs, the service throttles requests and returns a server busy error (surfaced as ServerBusyException in the .NET SDK). Producers should implement retry logic with exponential backoff. You can enable auto-inflate to automatically increase TUs up to a maximum you define.
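The retry advice above can be sketched as follows. `ThrottledError` is a hypothetical exception standing in for whatever throttling error your client raises, and the delay values are illustrative; production code usually adds random jitter, and the official SDKs ship configurable retry policies of their own.

```python
import time
from typing import Callable, List


class ThrottledError(Exception):
    """Hypothetical error raised by a send function when the service throttles."""


def backoff_delays(attempts: int, base_seconds: float = 0.5,
                   factor: float = 2.0, max_seconds: float = 30.0) -> List[float]:
    """Delays before each retry: base, base*factor, and so on, capped at max_seconds."""
    return [min(max_seconds, base_seconds * factor ** i) for i in range(attempts)]


def send_with_retry(send: Callable[[], None], max_retries: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send(); on ThrottledError wait and retry. Returns the number of retries used."""
    delays = backoff_delays(max_retries)
    for retries_used in range(max_retries + 1):
        try:
            send()
            return retries_used
        except ThrottledError:
            if retries_used == max_retries:
                raise  # out of retries: surface the error to the caller
            sleep(delays[retries_used])
    raise AssertionError("unreachable")


if __name__ == "__main__":
    print(backoff_delays(4))  # delays double each time
```

Passing `sleep` as a parameter keeps the function testable without real waiting.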

Can I use Event Hubs for exactly-once delivery?

Event Hubs provides at-least-once delivery. To achieve exactly-once processing, you must use checkpointing in your consumer to track the last processed offset, combined with idempotent processing logic.
