
Azure Event Hubs

This chapter covers Azure Event Hubs, a fully managed real-time data ingestion service that is part of the Azure messaging and eventing portfolio. For the AZ-900 exam, Event Hubs falls under Domain 2 (Azure Architecture and Services), specifically Objective 2.2: Identify core data services. This objective area carries approximately 10-15% of the exam weight, and Event Hubs is a frequently tested service because it represents a key pattern for big data and telemetry ingestion. By the end of this chapter, you will understand what Event Hubs is, how it works, its key components and pricing tiers, and exactly what you need to know for the exam.

Updated May 31, 2026

The Busy Airport Baggage System

Imagine a major international airport. Thousands of passengers arrive every minute, each with luggage that needs to be sorted and sent to the correct gate, connecting flight, or baggage claim. The airport cannot have every passenger walk their own bag to the gate; that would cause chaos and delays. Instead, it uses a centralized baggage handling system. Passengers drop their bags at check-in counters (the producers). The bags are placed on a high-speed conveyor belt network (the event hub). The conveyor belt does not inspect or modify the bags; it just moves them quickly. At various points along the belt, automated scanners read the bag tags and divert bags to specific chutes leading to different gates, carousels, or cargo holds (the consumers). The system can handle millions of bags per day without any single point of failure because it uses redundant belts and multiple scanners. If a scanner fails, others take over. If a gate is overwhelmed, bags are temporarily held in a buffer area (the retained event log) until the gate catches up.

This is how Azure Event Hubs works: it ingests massive streams of events (like sensor data, clicks, or log entries) from many sources, holds them reliably in a partitioned log, and allows multiple downstream systems to read the same events at their own pace, each processing only the events they care about. The baggage system doesn't care what's inside the bags; it just moves them efficiently. Event Hubs doesn't care about the content of events; it just ingests, buffers, and distributes them at scale.

How It Actually Works

What is Azure Event Hubs and What Business Problem Does It Solve?

Azure Event Hubs is a big data streaming platform and event ingestion service. It can receive and process millions of events per second from concurrent sources, such as mobile devices, IoT sensors, web applications, and log files. The core problem it solves is the need to decouple event producers from event consumers in a scalable, reliable, and low-latency manner. In traditional on-premises architectures, if you have thousands of devices sending data, you would need to build a custom message queue or broker, manage its scaling, handle failures, and ensure data durability. Event Hubs does all of this as a PaaS (Platform as a Service) offering, so you pay for the capacity you provision and do not manage the underlying infrastructure.

How Event Hubs Works – Step by Step Mechanism

Event Hubs operates on a publish-subscribe model with a partitioned consumer pattern. Here is the mechanism:

1. Producers send events to an event hub inside an Event Hubs namespace. An event is a small packet of data (up to 1 MB) that contains a body (the actual data) and optional metadata (application properties and system properties).

2. The namespace is the management container for one or more event hubs. Each event hub is a specific data stream, analogous to a topic in a message broker.

3. Inside an event hub, events are stored in a partitioned log. Partitions are ordered sequences of events that are independent of each other. The number of partitions is specified when creating the event hub (up to 32 per event hub in the Basic and Standard tiers). Partitions allow horizontal scaling: producers can send events to any partition, and consumers can read from multiple partitions in parallel.

4. Events are retained for a configurable period: 1 day in the Basic tier, up to 7 days in Standard, and up to 90 days in Premium and Dedicated. The Capture feature does not lengthen this window; it copies events to storage for long-term archival. During retention, events are immutable and can be replayed by consumers.

5. Consumers read events from partitions using consumer groups. A consumer group is a view of the entire event hub that allows multiple consumer applications to read the same stream independently. For example, a real-time dashboard and a long-term archival process can each have their own consumer group, both reading the same events without interfering.

6. The consumer reads events using a checkpointing mechanism: it records the offset (position) of the last successfully processed event. If the consumer fails, it can resume from the checkpoint, ensuring at-least-once delivery.
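The steps above can be pictured with a small in-memory model. This is only a sketch of the idea, not the Event Hubs SDK: `ToyEventHub` and its methods are hypothetical names invented for illustration, and a real event hub also handles durability, retention expiry, and networking.

```python
from collections import defaultdict


class ToyEventHub:
    """In-memory stand-in for one event hub: a fixed set of append-only partitions."""

    def __init__(self, partition_count):
        self.partitions = [[] for _ in range(partition_count)]
        # One offset per (consumer group, partition); unseen pairs start at 0.
        self.offsets = defaultdict(int)

    def send(self, partition_id, body):
        # Appending never modifies earlier events, mirroring the immutable log.
        self.partitions[partition_id].append(body)

    def read(self, consumer_group, partition_id, max_events=10):
        # Each consumer group advances its own position independently.
        start = self.offsets[(consumer_group, partition_id)]
        events = self.partitions[partition_id][start:start + max_events]
        self.offsets[(consumer_group, partition_id)] = start + len(events)
        return events

    def rewind(self, consumer_group, partition_id, offset=0):
        # Replay: move this group's position back inside the retained log.
        self.offsets[(consumer_group, partition_id)] = offset


hub = ToyEventHub(partition_count=2)
for reading in ("t=20", "t=21", "t=22"):
    hub.send(0, reading)

dashboard_first = hub.read("dashboard", 0, max_events=2)
archive_all = hub.read("archive", 0)        # unaffected by the dashboard's reads
dashboard_rest = hub.read("dashboard", 0)
hub.rewind("dashboard", 0)
dashboard_replay = hub.read("dashboard", 0)
print(dashboard_first, archive_all, dashboard_rest, dashboard_replay)
```

Because reading only moves a per-group offset and never removes events, the "archive" group sees all three readings even after the "dashboard" group has consumed them, and the dashboard can replay from the start.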

Key Components, Tiers, and Pricing

Namespace: The management container. Provides a unique FQDN (e.g., myhub-ns.servicebus.windows.net). It can contain multiple event hubs.

Event Hub: The actual data stream. Each event hub has a name and a partition count.

Partition: An ordered sequence of events. Partitions are the unit of scale. Events are assigned to a partition based on a partition key (if provided) or round-robin.

Consumer Group: A named grouping of consumers. Each consumer group maintains its own offset. Default consumer group: $Default.

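Throughput Units (TUs), Processing Units (PUs), and Capacity Units (CUs): In the Basic and Standard tiers, you purchase TUs. Each TU provides up to 1 MB/s or 1,000 events per second of ingress (sending), whichever limit is reached first, and up to 2 MB/s or 4,096 events per second of egress (reading). The Premium tier is sized in PUs and the Dedicated tier in CUs, both of which provide more isolated and predictable performance.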

Capture: An automatic feature that stores events in Azure Blob Storage or Azure Data Lake Storage without writing any code. This is useful for long-term retention and batch processing.

Auto-Inflate: A feature that automatically scales up the number of TUs based on load, up to a maximum you set. This helps handle traffic spikes without manual intervention.

Pricing tiers:
- Basic: Entry-level tier limited to 1 consumer group and 1-day retention, without Capture. Suitable for test/dev.
- Standard: Full feature set for most workloads, including up to 20 consumer groups, auto-inflate, Capture, the Kafka endpoint, and Geo-disaster recovery. Pay per TU.
- Premium: Higher performance and more predictable latency through resource isolation. Sized in PUs rather than TUs.
- Dedicated: Single-tenant cluster, best for the largest workloads. Sized in CUs rather than TUs.
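As a rough sizing aid, the per-TU figures above can be turned into simple arithmetic. The following is a minimal sketch that assumes the 1 MB/s and 1,000 events/s ingress limits and the 2 MB/s egress limit quoted earlier; the function name `required_throughput_units` is made up for this example and is not part of any Azure tool.

```python
import math

# Per-throughput-unit limits as quoted in this chapter (Standard tier).
TU_INGRESS_MB_PER_S = 1
TU_INGRESS_EVENTS_PER_S = 1000
TU_EGRESS_MB_PER_S = 2


def required_throughput_units(ingress_mb_per_s, ingress_events_per_s, egress_mb_per_s=0.0):
    """Return the smallest TU count that satisfies every limit at once."""
    by_bytes = math.ceil(ingress_mb_per_s / TU_INGRESS_MB_PER_S)
    by_events = math.ceil(ingress_events_per_s / TU_INGRESS_EVENTS_PER_S)
    by_egress = math.ceil(egress_mb_per_s / TU_EGRESS_MB_PER_S)
    # A namespace always has at least one TU.
    return max(1, by_bytes, by_events, by_egress)


# 5,000 sensors, one 300-byte event each per second, read by two consumer groups.
sensors, event_bytes, consumer_groups = 5000, 300, 2
ingress_mb = sensors * event_bytes / 1_000_000   # 1.5 MB/s
egress_mb = ingress_mb * consumer_groups         # 3.0 MB/s
needed = required_throughput_units(ingress_mb, sensors, egress_mb)
print(needed)
```

In this example the event rate, not the byte rate, is the binding limit: many small events can require more TUs than the data volume alone suggests.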

Comparison to On-Premises Equivalent

On-premises, you might use Apache Kafka, RabbitMQ, or custom TCP-based ingestion. Kafka is the closest analogy: both use a partitioned log model. However, Kafka requires you to manage ZooKeeper, brokers, disks, replication, and monitoring. Event Hubs is fully managed: Microsoft handles the infrastructure, patching, and high availability. Event Hubs also integrates natively with other Azure services like Azure Stream Analytics, Azure Functions, and Power BI, reducing the need for custom integration code.

Azure Portal and CLI Touchpoints

To create an Event Hubs namespace and event hub:

1. In the Azure portal, search for "Event Hubs" and click "Create".

2. Choose a resource group, namespace name, location, and pricing tier (Standard is common for production).

3. Set throughput units (e.g., 1) and enable auto-inflate if desired.

4. After creation, go to the namespace and click "+ Event Hub" to add an event hub with a name and partition count (e.g., 4 partitions).

5. To send events, you can use the .NET, Java, Python, or Node.js SDK. For testing, the portal provides a "Generate data" feature.

Using Azure CLI:

# Create a resource group
az group create --name MyResourceGroup --location eastus

# Create an Event Hubs namespace
az eventhubs namespace create --name MyNamespace --resource-group MyResourceGroup --location eastus --sku Standard

# Create an event hub
az eventhubs eventhub create --name MyEventHub --namespace-name MyNamespace --resource-group MyResourceGroup --partition-count 4

# List authorization rules
az eventhubs namespace authorization-rule list --namespace-name MyNamespace --resource-group MyResourceGroup

Concrete Business Scenarios

IoT Telemetry: A factory has thousands of sensors sending temperature and vibration data every second. Event Hubs ingests all this data, then Azure Stream Analytics runs real-time queries to detect anomalies and trigger alerts. A second consumer group archives raw data to Azure Data Lake for machine learning training.

Clickstream Analysis: An e-commerce website sends page views, clicks, and purchases to Event Hubs. Multiple consumer groups allow a real-time dashboard (using Power BI) and a batch processing job (using Azure Databricks) to read the same stream independently.

Log Aggregation: Multiple applications send structured logs to Event Hubs. Azure Monitor or a third-party SIEM tool consumes the logs for analysis and alerting.

Walk-Through

1. Plan Partitions and TUs

Before creating an Event Hubs namespace, determine the required number of partitions and throughput units. In the Basic and Standard tiers, the partition count is fixed once the event hub is created (Premium and Dedicated allow it to be increased later), so plan ahead. A good rule of thumb is to start with 4 partitions for development and choose a higher count for production. Throughput units (TUs) can be changed at any time, and auto-inflate can raise them automatically during spikes. For the exam, remember that partitions are the unit of parallelism, while TUs set the throughput capacity: 1 MB/s ingress and 2 MB/s egress per TU, shared across all event hubs and partitions in the namespace.

2. Create Namespace and Event Hub

In the Azure portal, create an Event Hubs namespace with a globally unique name, select a pricing tier (Standard is typical), and set the initial TUs. After the namespace is deployed, create an event hub inside it. Specify the partition count (up to 32 in the Standard tier). The event hub name must be unique within the namespace. Behind the scenes, Azure allocates storage and network resources. A shared access policy named RootManageSharedAccessKey, with Manage, Send, and Listen rights, is created automatically at the namespace level; for applications, create narrower policies that grant only Send or only Listen.

3. Configure Producers

Producers are applications that send events to the event hub. You need to provide the connection string (from the SAS policy) and the event hub name. Producers can send events with a partition key to ensure related events go to the same partition, preserving order. Without a partition key, events are distributed round-robin. The SDKs handle batching and retries. For high throughput, producers can send events asynchronously. Event Hubs accepts events up to 1 MB in size.
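The difference between keyed and unkeyed sends can be sketched as follows. This is a toy model under stated assumptions: `ToyPartitionRouter` is a hypothetical class, and SHA-256 is used only as a stand-in for the service's internal hash function, which is not specified in this chapter.

```python
import hashlib
from itertools import count


class ToyPartitionRouter:
    """Illustrative router: hash the key if there is one, else rotate."""

    def __init__(self, partition_count):
        self.partition_count = partition_count
        self._round_robin = count()

    def route(self, partition_key=None):
        if partition_key is None:
            # No key: spread events evenly across partitions.
            return next(self._round_robin) % self.partition_count
        # Same key -> same hash -> same partition, which preserves per-key order.
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.partition_count


router = ToyPartitionRouter(partition_count=4)
keyed = [router.route("sensor-17") for _ in range(3)]
unkeyed = [router.route() for _ in range(5)]
print(keyed, unkeyed)
```

All three keyed sends land on one partition, while the unkeyed sends cycle through partitions 0 to 3 and wrap around. Ordering is only guaranteed within a partition, which is why related events need a shared key.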

4. Configure Consumers and Consumer Groups

Consumers read events from partitions. Each consumer group provides an independent view of the stream. Create additional consumer groups if you have multiple downstream processes (e.g., real-time analytics and archival). Consumers use a checkpoint store (Azure Blob Storage) to record offsets. The event processor pattern (EventProcessorClient in the current .NET and Java SDKs, which supersedes the older EventProcessorHost) manages partition ownership and load balancing across multiple consumer instances. For the exam, know that the default consumer group is `$Default` and you can have up to 20 consumer groups in Standard tier.
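To see why checkpointing gives at-least-once rather than exactly-once delivery, consider this small simulation. It is a sketch, not SDK code: the dictionary `checkpoint_store` stands in for the blob-backed checkpoint store, and `run_consumer` is a hypothetical helper written for this example.

```python
partition_log = ["evt-0", "evt-1", "evt-2", "evt-3", "evt-4"]
checkpoint_store = {"offset": 0}   # stands in for the blob-backed checkpoint store
processed_ids = set()              # idempotency guard in the downstream sink
deliveries = []                    # every delivery, including redeliveries


def run_consumer(crash_after=None, checkpoint_every=2):
    """Read from the last checkpoint; optionally 'crash' after N deliveries."""
    position = checkpoint_store["offset"]
    handled = 0
    while position < len(partition_log):
        event = partition_log[position]
        deliveries.append(event)
        processed_ids.add(event)   # adding to a set makes reprocessing harmless
        position += 1
        handled += 1
        if handled % checkpoint_every == 0:
            checkpoint_store["offset"] = position
        if crash_after is not None and handled == crash_after:
            return                 # simulated failure: progress since the last checkpoint is lost
    checkpoint_store["offset"] = position


run_consumer(crash_after=3)   # checkpoints after evt-1, then crashes after evt-2
run_consumer()                # resumes at offset 2, so evt-2 is delivered again
print(deliveries)
print(sorted(processed_ids))
```

The second run redelivers `evt-2` because the crash happened after processing but before the next checkpoint. Checkpointing less often reduces storage writes but widens the window of events that may be reprocessed, so the sink has to tolerate duplicates.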

5. Monitor and Scale

After deployment, monitor metrics like incoming messages, outgoing messages, throttled requests, and quota exceeded errors. If you see throttling (HTTP 429 or ServerBusyException), you need to increase TUs or enable auto-inflate. In the portal, you can adjust TUs manually or set auto-inflate with a maximum TU limit. The Premium tier is scaled in Processing Units (PUs) and the Dedicated tier in Capacity Units (CUs) instead. Use Azure Monitor alerts to notify you of high throttling rates.
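The interaction between load, auto-inflate, and the maximum TU limit can be sketched with a toy policy. This is a deliberate simplification: `simulate_auto_inflate` is a hypothetical function, it only considers ingress in MB/s, and it scales instantly, whereas the real service reacts after load exceeds capacity.

```python
def simulate_auto_inflate(load_mb_per_s, initial_tus, max_tus):
    """Return (tus, throttled) per interval for a toy scale-up-only policy."""
    tus = initial_tus
    timeline = []
    for load in load_mb_per_s:
        # Add one TU at a time while ingress exceeds capacity (1 MB/s per TU).
        while load > tus and tus < max_tus:
            tus += 1
        # Anything above the configured maximum is still throttled.
        throttled = load > tus
        timeline.append((tus, throttled))
    return timeline


timeline = simulate_auto_inflate([0.5, 3.0, 6.5, 1.0], initial_tus=1, max_tus=5)
print(timeline)
```

Two points stand out in the result: the 6.5 MB/s spike is throttled because it exceeds the 5 TU ceiling, and the TU count stays at 5 after load drops. Auto-inflate only scales up, so reducing TUs afterward is a manual step.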

What This Looks Like on the Job

Scenario 1: Real-Time Fraud Detection for a Payment Processor

A financial services company processes millions of credit card transactions per day. They need to detect fraudulent transactions in real time to block them before completion. They use Azure Event Hubs to ingest transaction events from their payment gateway. Each event contains transaction ID, amount, merchant, location, and timestamp. The event hub is configured with 16 partitions to handle high throughput. A Stream Analytics job reads from one consumer group, runs a pattern-matching query (e.g., multiple high-value transactions in different locations within minutes), and outputs alerts to an Azure Function that blocks the transaction. A second consumer group feeds raw data into Azure Data Lake Storage for historical analysis and model training. The company sets auto-inflate with a maximum of 10 TUs to handle peak hours like Black Friday. Cost is managed by using the Standard tier with reserved capacity. Common pitfalls: too few partitions, which limits how many consumers can process in parallel, or forgetting to enable auto-inflate, which causes sends to be throttled during spikes.

Scenario 2: IoT Sensor Telemetry for a Smart Building

A property management company deploys thousands of IoT sensors across multiple buildings to monitor temperature, humidity, CO2 levels, and occupancy. All sensors send data every 30 seconds to a central Event Hubs namespace. Each sensor has a unique device ID used as the partition key, ensuring all data from one sensor is in order. The event hub uses 8 partitions. A real-time dashboard built with Power BI reads from the $Default consumer group to show current conditions. An Azure Function reads from a second consumer group to trigger HVAC adjustments when temperature exceeds thresholds. Data is also captured to Azure Blob Storage for compliance (retention required for 2 years). The team uses the Basic tier initially but moves to Standard when they need multiple consumer groups. Issues arise when sensors send malformed events—Event Hubs does not validate content, so downstream consumers must handle errors gracefully.

Scenario 3: Clickstream Analytics for an E-Commerce Platform

An online retailer wants to analyze user behavior to personalize recommendations. They instrument their website to send page views, clicks, add-to-cart events, and purchases to Event Hubs. The event hub has 32 partitions to handle high traffic. A Spark job in Azure Databricks reads from one consumer group to build a real-time product recommendation model. Simultaneously, a Stream Analytics job reads from another consumer group to update a real-time sales dashboard. The Capture feature archives raw events to Azure Data Lake Storage for nightly batch processing. In one incident, the team misconfigures the checkpoint store, so consumers start from the beginning of the stream after a restart and reprocess millions of events. This is avoided by pointing the consumers at a persistent Azure Storage account for checkpoints, so that after a restart they resume from the last recorded offset.

How AZ-900 Actually Tests This

AZ-900 Exam Focus: Objective 2.2 – Identify Core Data Services

Event Hubs is tested under "Data ingestion and processing services". The exam expects you to:

Identify Event Hubs as a real-time data ingestion service for telemetry and event streams.

Understand the difference between Event Hubs, IoT Hub, and Event Grid.

Know the pricing tiers (Basic, Standard, Premium, Dedicated) and their basic differences.

Understand partitions, consumer groups, and throughput units (TUs/PUs).

Common Wrong Answers and Why Candidates Choose Them

1. "Event Hubs is a message queue like Service Bus." Wrong. Event Hubs is for high-throughput event streaming, not for point-to-point messaging with features like sessions, transactions, or dead-lettering. Service Bus is for enterprise messaging with those features. Candidates confuse the two because both are under the same Azure Messaging family.

2. "Event Hubs guarantees exactly-once delivery." Wrong. Event Hubs provides at-least-once delivery. Exactly-once is not guaranteed; consumers must handle duplicates using idempotent processing. Candidates assume all messaging services guarantee exactly-once.

3. "You can change the partition count after creation." Wrong for the Basic and Standard tiers, where partition count is fixed at creation (Premium and Dedicated allow it to be increased). Candidates think scaling is always flexible.

4. "Event Hubs can store events indefinitely." Wrong. Retention is limited: 1 day in Basic, up to 7 days in Standard, and up to 90 days in Premium and Dedicated. For longer storage, use Capture to archive events to Blob Storage or Data Lake Storage. Candidates confuse retention with archival storage.

Specific Terms and Values

Default consumer group: $Default

Maximum partition count per event hub: 32 (Basic and Standard), 100 (Premium), 1,024 (Dedicated)

Maximum event size: 1 MB

Throughput unit: 1 MB/s or 1,000 events/sec ingress; 2 MB/s or 4,096 events/sec egress

Retention: 1 day (Basic), up to 7 days (Standard), up to 90 days (Premium and Dedicated)

Capture automatically stores events in Azure Blob Storage or Data Lake Storage.

Edge Cases and Tricky Distinctions

Event Hubs vs. IoT Hub: IoT Hub is specifically for IoT device management and communication, with device identity, twin, and direct methods. Event Hubs is a general-purpose streaming service. The exam may ask which to use for device-to-cloud telemetry vs. simple event ingestion.

Event Hubs vs. Event Grid: Event Grid is for reactive event routing (push model) with a schema, not for high-throughput streaming. Event Hubs is pull-based for consumers. The exam may test scenarios: real-time analytics uses Event Hubs; reacting to blob creation uses Event Grid.

Throughput Units vs. Processing Units: Standard uses TUs; Premium uses PUs; Dedicated uses Capacity Units (CUs). PUs and CUs are backed by isolated resources, so performance is more predictable than with shared TUs.

Memory Trick: "Event Hubs = High-Throughput Stream"

Use the acronym H.U.B.S. to remember key features:
- High-throughput (millions of events/sec)
- Unlimited consumers (via consumer groups)
- Buffered storage (partitioned log with retention)
- Scalable (partitions and TUs)

When you see a scenario with "real-time telemetry from many devices", choose Event Hubs. If the scenario mentions "command and control of devices" or "device twins", choose IoT Hub. If it's about "reacting to events with serverless functions", choose Event Grid.

Key Takeaways

Azure Event Hubs is a fully managed, real-time data ingestion service for high-throughput event streaming.

Events are stored in a partitioned log with configurable retention (1 day in Basic, up to 7 days in Standard, up to 90 days in Premium and Dedicated).

Partition count is fixed at creation in the Basic and Standard tiers (up to 32 per event hub); Premium and Dedicated allow higher counts.

Throughput Units (Standard) provide 1 MB/s ingress and 2 MB/s egress per TU.

Consumer groups allow multiple independent consumers to read the same event stream.

Event Hubs provides at-least-once delivery; consumers must handle duplicates.

Capture automatically writes events to Azure Blob Storage or Data Lake Storage.

Pricing tiers: Basic (1 consumer group, 1-day retention), Standard (up to 20 consumer groups, auto-inflate, Capture), Premium (PUs, predictable performance), Dedicated (single-tenant, CUs).

Event Hubs integrates natively with Azure Stream Analytics, Azure Functions, Power BI, and Azure Databricks.

On the AZ-900 exam, distinguish Event Hubs from IoT Hub (device management) and Event Grid (reactive event routing).

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure Event Hubs

General-purpose event streaming service

Supports multiple protocols (AMQP, HTTPS, Kafka)

No device identity or twin management

Pull-based consumption (consumers read from partitions)

Best for high-throughput telemetry from any source

Azure IoT Hub

Purpose-built for IoT device connectivity and management

Supports MQTT, AMQP, HTTPS, and WebSockets

Includes device identity registry, twins, and direct methods

Supports cloud-to-device messages and device-to-cloud telemetry

Best for bidirectional communication with IoT devices

Watch Out for These

Mistake

Event Hubs is the same as Azure Service Bus.

Correct

Event Hubs is for high-throughput event streaming (telemetry, logs) with a partitioned log model, while Service Bus is for enterprise messaging with features like queues, topics, sessions, and dead-lettering. Service Bus is for point-to-point or publish-subscribe with transactional guarantees.

Mistake

Event Hubs provides exactly-once delivery.

Correct

Event Hubs provides at-least-once delivery. Consumers may receive duplicate events, especially during failures or rebalancing. Applications must be idempotent to handle duplicates.

Mistake

You can change the number of partitions after creating an event hub.

Correct

In the Basic and Standard tiers, partition count is fixed at creation and cannot be changed (Premium and Dedicated allow it to be increased). You must plan your partition count based on expected throughput. To scale beyond the initial partitions in Standard, you need to create a new event hub with more partitions and migrate producers/consumers.

Mistake

Event Hubs can store events for months without extra configuration.

Correct

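Retention in the event hub itself is limited: 1 day in Basic, up to 7 days in Standard, and up to 90 days in Premium and Dedicated. To keep events for longer, enable the Capture feature, which automatically writes events to Azure Blob Storage or Azure Data Lake Storage. Capture does not extend retention inside the event hub; the archived copies live in the storage account, which is billed and managed separately.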

Mistake

Event Hubs is only for IoT scenarios.

Correct

While Event Hubs is commonly used for IoT telemetry, it is a general-purpose streaming service suitable for clickstream analysis, log aggregation, real-time analytics, and any scenario requiring high-throughput event ingestion from many sources.

Frequently Asked Questions

What is the difference between Azure Event Hubs and Azure Service Bus?

Event Hubs is designed for high-throughput event streaming, like telemetry from millions of devices. It uses a partitioned log model and supports multiple consumer groups for independent processing. Service Bus is for enterprise messaging with features like queues, topics, sessions, and dead-lettering. It provides transactional guarantees and is suitable for point-to-point messaging or publish-subscribe with complex routing. For the exam, remember: Event Hubs = big data streams; Service Bus = reliable messaging.

Can Event Hubs be used with Apache Kafka?

Yes. In the Standard tier and above, Event Hubs exposes an endpoint that is compatible with the Apache Kafka protocol. Existing Kafka clients (e.g., Java, Python) can send and receive events by changing only their connection configuration, not their application code. The same event hub remains reachable over AMQP and HTTPS. On the exam, know that Event Hubs can stand in for a Kafka cluster in many scenarios.

How many partitions should I create for my event hub?

Partition count is fixed at creation in the Standard tier, so plan carefully. A general guideline is to start with 4-8 partitions for development and choose a higher count for production. Throughput capacity comes from TUs (1 MB/s ingress and 2 MB/s egress per TU) and is shared across the namespace, while the partition count determines how many consumers in a consumer group can usefully read in parallel. More partitions allow more parallel consumers. However, too many partitions can increase overhead. For the exam, remember that the maximum is 32 for Standard tier.

What happens if I exceed my throughput units?

If you exceed the allocated throughput units (TUs), Event Hubs will throttle requests and return a 429 (Too Many Requests) or ServerBusyException. To avoid this, enable auto-inflate to automatically scale up TUs, or manually increase TUs. In the Premium and Dedicated tiers, capacity is sized in Processing Units (PUs) and Capacity Units (CUs) respectively; their resources are isolated, but they still have a finite capacity. On the exam, know that throttling is a sign of insufficient TUs.

Can I replay events from Event Hubs?

Yes, because events are stored in a partitioned log for the retention period (1 day in Basic, up to 7 days in Standard, up to 90 days in Premium and Dedicated). Consumers can start reading from any offset, including the beginning of the retention window. This allows replaying events for reprocessing or catching up after a failure. For events older than the retention window, the Capture feature provides long-term archival by writing them to Azure Blob Storage or Azure Data Lake Storage.

What is a consumer group in Event Hubs?

A consumer group is a named grouping of consumers that provides an independent view of the event stream. Each consumer group maintains its own checkpoint (offset) per partition. Multiple consumer groups allow different applications to read the same events without interfering. For example, one consumer group for real-time analytics and another for archival. The default consumer group is `$Default`. Standard tier supports up to 20 consumer groups.

How does Event Hubs ensure high availability?

Event Hubs is a PaaS service with a 99.95% SLA for Standard tier (99.99% for Premium and Dedicated). It uses availability zones in supported regions to provide resilience against datacenter failures. Data is replicated across multiple replicas. For geo-disaster recovery, you can pair a primary namespace with a secondary namespace in another region and initiate a failover to the secondary if the primary region becomes unavailable. On the exam, know that Event Hubs provides high availability within a region without manual intervention.
