AZ-305 | Chapter 75 of 103 | Objective 4.4

Azure Event Grid Architecture

This chapter covers Azure Event Grid architecture, a critical topic for the AZ-305 exam under domain 'Infrastructure' objective 4.4. Event Grid is a fully managed event routing service that enables event-driven, reactive programming in Azure. Approximately 5-10% of exam questions touch on messaging and eventing services, with Event Grid being the most commonly tested for serverless event routing. Mastering this chapter will help you differentiate Event Grid from Event Hubs, Service Bus, and other Azure messaging services, and understand its role in building scalable, decoupled solutions.

25 min read
Intermediate
Updated May 31, 2026

Event Grid: The Postal Sorting Office

Imagine a massive postal sorting office in a busy city. Letters (events) arrive from countless senders (publishers): individuals, businesses, automated systems. The sorting office doesn't read each letter's content; it looks only at the envelope's address and topic (e.g., 'Invoice', 'Order Shipped'). The sorting office has a giant map (filtering system) that says: 'All letters with topic "Order Shipped" go to bin #7, which belongs to the Logistics team (subscriber).' Here's the key: the sender never waits. Once the sorting office accepts a letter, the sender moves on, and the office takes responsibility for getting the letter to the Logistics team and collecting a delivery receipt (the subscriber's HTTP acknowledgment). If the bin is full, the office tries again later on a lengthening schedule. If the letter still can't be delivered once its retry budget runs out, it goes to the Dead Letter Office (the dead-letter destination), provided one has been set up. The sorting office also guarantees that every letter is delivered at least once, but possibly more than once if a receipt goes missing and the office delivers the same letter again. This is how Event Grid works: it's a push-based, serverless event routing service that filters and delivers events to subscribers with at-least-once delivery, automatic retries, and optional dead-lettering, all while the publisher never waits for a subscriber's response.

How It Actually Works

What is Azure Event Grid?

Azure Event Grid is a fully managed, serverless event routing service that enables event-driven architectures. It acts as a central hub for events from various Azure services (e.g., Blob Storage, Resource Groups, Event Hubs) and custom applications, routing them to subscribers like Azure Functions, webhooks, or Logic Apps. Event Grid uses a publish-subscribe model where publishers emit events without knowing who will handle them, and subscribers express interest in certain event types.

How It Works Internally

Event Grid operates on a push-based model. When an event occurs (e.g., a blob is created), the publisher sends an HTTP POST request to Event Grid's endpoint. Event Grid then:

1. Validates the request and the event schema.
2. Evaluates the filters of each event subscription on the topic.
3. Forwards the event to each matching subscriber endpoint via HTTP POST.
4. Retries according to its retry policy if the subscriber does not acknowledge with a success status (such as HTTP 200 OK) within the timeout.
5. Sends the event to a dead-letter destination, if one is configured, once retries are exhausted; otherwise the event is dropped.

Key Components

Topics: Endpoints where publishers send events. There are system topics (Azure services) and custom topics (your applications).

Event Subscriptions: Define which events from a topic are forwarded to which subscriber. Can include filters on event types, subject, or advanced filtering.

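Events: JSON objects with a defined schema. In the Event Grid schema, the publisher supplies id, eventType, subject, and eventTime, plus the payload in data; Event Grid stamps the topic property itself. The CloudEvents 1.0 schema is also supported.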

Subscribers: Endpoints that receive events. Supported types: Azure Functions, Webhooks, Event Hubs, Service Bus, Queue Storage, Hybrid Connections, and more.
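To make the event shape from the list above concrete, here is a minimal Python sketch that builds one event in the Event Grid schema. The helper name make_event, the event type Contoso.Orders.OrderPlaced, and the order data are hypothetical and exist only for illustration.

```python
import json
import uuid
from datetime import datetime, timezone


def make_event(event_type, subject, data, data_version="1.0"):
    """Build one event in the Event Grid schema (illustrative helper)."""
    return {
        "id": str(uuid.uuid4()),          # unique per event; useful for de-duplication
        "eventType": event_type,
        "subject": subject,               # publisher-defined path, used by subject filters
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "data": data,                     # publisher-specific payload
        "dataVersion": data_version,
    }


# Hypothetical custom event; Event Grid expects a JSON array of events.
event = make_event("Contoso.Orders.OrderPlaced", "/orders/1001",
                   {"orderId": 1001, "total": 250.0})
payload = json.dumps([event])
```

The topic property is deliberately left out because Event Grid fills it in on receipt.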

Defaults and Timers

Retry Policy: Event Grid retries failed deliveries with exponential backoff on a best-effort schedule: 10 seconds, 30 seconds, 1 minute, 5 minutes, 10 minutes, 30 minutes, 1 hour, 3 hours, 6 hours, and then every 12 hours. Retries stop when the event's TTL expires or the maximum number of delivery attempts is reached, whichever comes first; the event is then dead-lettered (if configured) or dropped.

Event Time-to-Live (TTL): Default is 1440 minutes (24 hours) for both custom and system topics, configurable per event subscription from 1 to 1440 minutes. If an event is not delivered within its TTL, it expires.

Max Delivery Attempts: Default is 30. You can set it to any integer from 1 to 30 on the event subscription.

Dead-Lettering: Optional. You can configure a dead-letter destination (a Blob Storage container) where events that could not be delivered are written.

Event Size: Max 1 MB per event. Events over 64 KB are billed in 64 KB increments.

Throughput: Event Grid is designed to handle millions of events per second, typically with sub-second delivery latency.
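The interaction between the backoff schedule, the maximum delivery attempts, and the TTL is easier to see as arithmetic. The Python sketch below is a simplified model, not Event Grid's implementation: it assumes the documented schedule values are wait intervals between attempts, that the initial delivery counts as attempt 1, and it ignores the small randomization Event Grid applies. The function name retry_offsets is hypothetical.

```python
# Documented best-effort backoff steps in seconds (10s, 30s, 1m, 5m, 10m,
# 30m, 1h, 3h, 6h); after the last step, retries repeat every 12 hours.
BACKOFF_STEPS = [10, 30, 60, 300, 600, 1800, 3600, 10800, 21600]
REPEAT_STEP = 12 * 3600


def retry_offsets(max_attempts=30, ttl_minutes=1440):
    """Return the seconds after the first attempt at which each retry fires.

    Assumption: each step is a wait interval, and retries stop at whichever
    limit (attempt count or TTL) is hit first.
    """
    ttl_seconds = ttl_minutes * 60
    offsets = []
    elapsed = 0
    attempt = 1  # the initial delivery counts as attempt 1
    while attempt < max_attempts:
        index = len(offsets)
        step = BACKOFF_STEPS[index] if index < len(BACKOFF_STEPS) else REPEAT_STEP
        elapsed += step
        if elapsed > ttl_seconds:
            break  # the event would expire before this retry
        offsets.append(elapsed)
        attempt += 1
    return offsets


default_retries = retry_offsets()                               # defaults: 30 attempts, 24 h TTL
tight_retries = retry_offsets(max_attempts=5, ttl_minutes=10)   # [10, 40, 100, 400]
```

Under these assumptions, the default 24-hour TTL ends the retries long before the 30-attempt limit does, which is why the TTL is usually the setting that matters for a subscriber that stays down.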

Configuration and Verification

You can create custom topics and subscriptions via Azure Portal, CLI, or PowerShell. Example Azure CLI:

# Create a custom topic
az eventgrid topic create --name myTopic --location westus --resource-group myResourceGroup

# Create an event subscription to a webhook endpoint
az eventgrid event-subscription create \
  --name mySubscription \
  --source-resource-id /subscriptions/{sub-id}/resourceGroups/myResourceGroup/providers/Microsoft.EventGrid/topics/myTopic \
  --endpoint https://mywebhook.azurewebsites.net/api/events

To verify delivery, check subscriber logs, review the topic's and subscription's delivery metrics, or use Event Grid's diagnostic settings to send delivery-failure logs to a Log Analytics workspace.

Interaction with Related Technologies

Event Hubs: Used for high-throughput data ingestion, not event routing. Event Hubs can be a publisher to Event Grid (e.g., when Event Hubs captures a file).

Service Bus: Used for enterprise messaging with queues and topics, supporting message ordering, sessions, and transactions. Event Grid is simpler and faster for event routing.

Azure Functions: Often used as a subscriber to process events. Event Grid triggers a Function via HTTP.

Logic Apps: Can subscribe to events and orchestrate workflows.

Blob Storage: Can emit events on blob creation/deletion to trigger processing.

Event Grid vs. Other Services

| Feature | Event Grid | Event Hubs | Service Bus |
|---------|------------|------------|-------------|
| Primary use | Event routing | Data ingestion | Enterprise messaging |
| Delivery model | Push | Pull (via consumer groups) | Pull (via queues/topics) |
| Ordering | Not guaranteed | Per partition | First-in-first-out (queues) |
| At-least-once | Yes | Yes | Yes |
| Exactly-once | No | No | Yes (via dedup) |
| Dead-letter | Yes | No (via capture) | Yes |
| Throughput | Millions/s | Millions/s | Thousands/s |

Advanced Filtering

Event subscriptions support filtering on:

Event types (e.g., Microsoft.Storage.BlobCreated)

Subject prefix/suffix (e.g., subject starts with /blobServices/default/containers/images/)

Advanced filters: numeric comparisons, boolean, string contains/not contains, etc.
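The three filter kinds listed above combine with AND semantics: an event is delivered only if it passes all of them. The following Python sketch models that evaluation in a simplified way. It is not Event Grid's implementation; the function names are hypothetical, only three operators are modeled, and each takes a single comparison value here for brevity.

```python
# Simplified stand-ins for three advanced-filter operators.
OPERATORS = {
    "NumberGreaterThan": lambda actual, value: actual > value,
    "StringContains": lambda actual, value: value.lower() in actual.lower(),
    "BoolEquals": lambda actual, value: actual is value,
}


def _lookup(event, dotted_key):
    """Resolve a key such as 'data.contentLength' inside the event dict."""
    node = event
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def matches(event, event_types=None, subject_begins_with="",
            subject_ends_with="", advanced=()):
    """Return True only if the event passes every configured filter."""
    if event_types and event["eventType"] not in event_types:
        return False
    subject = event.get("subject", "")
    if not subject.startswith(subject_begins_with):
        return False
    if not subject.endswith(subject_ends_with):
        return False
    for key, operator, value in advanced:
        actual = _lookup(event, key)
        if actual is None or not OPERATORS[operator](actual, value):
            return False
    return True


blob_event = {
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/images/blobs/cat.png",
    "data": {"contentLength": 2048},
}
is_match = matches(
    blob_event,
    event_types={"Microsoft.Storage.BlobCreated"},
    subject_begins_with="/blobServices/default/containers/images/",
    subject_ends_with=".png",
    advanced=[("data.contentLength", "NumberGreaterThan", 1000)],
)
```

Narrowing by subject prefix first is the cheapest way to keep unrelated containers from triggering a handler at all.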

Security

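Publishers authenticate to a topic with an access key, a Shared Access Signature (SAS) token, or Microsoft Entra ID. Event Grid can use a managed identity when it delivers events to Azure destinations. Webhook subscribers must prove ownership of their endpoint through a validation handshake: Event Grid sends a validationCode, and the endpoint echoes it back as validationResponse.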
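The synchronous validation handshake can be sketched as a plain function that takes a request body and returns a status code and response body. This is a framework-free illustration under stated assumptions: the function name handle_post is hypothetical, and a real webhook would wire it into whatever HTTP framework hosts the endpoint.

```python
import json

VALIDATION_EVENT = "Microsoft.EventGrid.SubscriptionValidationEvent"


def handle_post(body):
    """Return (status_code, response_body) for an Event Grid webhook POST."""
    events = json.loads(body)
    for event in events:
        if event.get("eventType") == VALIDATION_EVENT:
            # Echo the code back so Event Grid activates the subscription.
            code = event["data"]["validationCode"]
            return 200, json.dumps({"validationResponse": code})
    # Ordinary events: acknowledge receipt; real processing would go here.
    return 200, ""


handshake_body = json.dumps([{
    "id": "1",
    "eventType": VALIDATION_EVENT,
    "subject": "",
    "data": {"validationCode": "abc-123"},   # sample value for illustration
}])
status, response = handle_post(handshake_body)
```

If the endpoint never returns the code, the subscription is not activated, which is the usual reason a newly created webhook subscription receives nothing.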

Walk-Through

1

Publisher sends event to topic

A publisher detects something worth announcing (for example, Azure Blob Storage registers a blob creation) and sends an HTTP POST request to the Event Grid topic endpoint (e.g., https://myTopic.westus-1.eventgrid.azure.net/api/events). The request body is a JSON array containing one or more event objects conforming to the Event Grid schema or CloudEvents 1.0. Custom publishers must authenticate with an access key, a SAS token, or a Microsoft Entra ID token; Azure services publish to system topics on your behalf. Event Grid validates the credentials and the event schema. A malformed payload returns HTTP 400, failed authentication returns HTTP 401, and an oversized payload returns HTTP 413. If the request is valid, Event Grid returns HTTP 200 and begins processing.
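For a custom topic, the publish step amounts to one authenticated HTTP POST. The Python sketch below only builds the request with the standard library and does not send it. The endpoint and key are placeholders, the function name build_publish_request is hypothetical, and key-based authentication through the aeg-sas-key header is assumed.

```python
import json
import urllib.request


def build_publish_request(endpoint, access_key, events):
    """Build (but do not send) a POST that publishes events to a custom topic."""
    body = json.dumps(events).encode("utf-8")   # Event Grid expects a JSON array
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            "aeg-sas-key": access_key,          # topic access key header
        },
        method="POST",
    )


# Placeholder endpoint and key; substitute real values before sending.
request = build_publish_request(
    "https://myTopic.westus-1.eventgrid.azure.net/api/events",
    "<topic-access-key>",
    [{
        "id": "1001",
        "eventType": "Contoso.Orders.OrderPlaced",
        "subject": "/orders/1001",
        "eventTime": "2026-01-01T00:00:00Z",
        "data": {"total": 250.0},
        "dataVersion": "1.0",
    }],
)
# Sending would be: urllib.request.urlopen(request)
```

The publisher's work ends once Event Grid accepts the request; it never learns which subscribers, if any, receive the event.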

2

Event Grid applies subscription filters

For each event subscription on the topic, Event Grid evaluates filters. Filters can be on eventType, subject, or advanced properties. For example, a subscription might only accept events with eventType 'Microsoft.Storage.BlobCreated' and subject starting with '/blobServices/default/containers/images/'. If the event does not match, it is simply not delivered to that subscription; other subscriptions on the same topic are unaffected. Only matching events proceed to delivery.

3

Event Grid delivers event to subscriber

Event Grid constructs an HTTP POST request to the subscriber's endpoint (e.g., an Azure Function URL). The request body contains the event in JSON format. The subscriber must respond with a success status (HTTP 200 through 204) within 30 seconds to acknowledge receipt. If the subscriber responds with an error status (e.g., 500, 503) or the request times out, Event Grid considers delivery failed and schedules a retry. The retry interval starts at 10 seconds and backs off exponentially, eventually reaching 12 hours between attempts. A few errors, such as 400 Bad Request and 413 Request Entity Too Large, are not retried; the event goes straight to the dead-letter destination if one is configured.

4

Event Grid retries on failure

If delivery fails, Event Grid retries according to the retry policy. Default: up to 30 attempts within the 24-hour TTL. The best-effort retry schedule is: 10s, 30s, 1m, 5m, 10m, 30m, 1h, 3h, 6h, then every 12h. After each failed attempt, Event Grid waits the scheduled interval. Retrying stops when either the maximum number of attempts or the event's TTL (24 hours by default) is reached, whichever comes first. At that point the event is written to the dead-letter destination (a Blob Storage container) if one is configured; otherwise it is dropped.
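The outcome of the retry step, delivered, dead-lettered, or dropped, can be modeled in a few lines. This Python sketch is a simplified simulation rather than Event Grid's actual logic: it skips the waiting between attempts and the TTL check, treats 400 and 413 as non-retryable, and uses a plain list as a stand-in for the dead-letter container. The names deliver and flaky_subscriber are hypothetical.

```python
SUCCESS_CODES = {200, 201, 202, 203, 204}
NON_RETRYABLE = {400, 413}


def deliver(event, send, max_attempts=30, dead_letter=None):
    """Attempt delivery until success, a non-retryable error, or the attempt limit.

    send is any callable that takes the event and returns an HTTP status code.
    Returns (outcome, attempts_used).
    """
    attempts = 0
    for attempts in range(1, max_attempts + 1):
        status = send(event)
        if status in SUCCESS_CODES:
            return "delivered", attempts
        if status in NON_RETRYABLE:
            break  # retrying will not help
    if dead_letter is not None:
        dead_letter.append(event)   # stand-in for a Blob Storage container
        return "dead-lettered", attempts
    return "dropped", attempts


# A subscriber that fails twice with 503 and then recovers.
responses = iter([503, 503, 200])
def flaky_subscriber(event):
    return next(responses)

outcome = deliver({"id": "1"}, flaky_subscriber)   # ("delivered", 3)
```

The only difference between the last two outcomes is whether a dead-letter destination was supplied, which mirrors the fact that dead-lettering must be configured explicitly.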

5

Subscriber processes event

Upon receiving the HTTP POST, the subscriber (e.g., Azure Function) deserializes the event JSON and performs its logic: perhaps resizing an image, updating a database, or triggering a workflow. The subscriber must return HTTP 200 within the timeout (default 30s). If the subscriber takes longer, it should acknowledge quickly and process asynchronously. After processing, the subscriber can optionally send a response body, which Event Grid ignores. The event is now considered delivered.
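Two habits follow from this step: acknowledge quickly and do the slow work elsewhere, and tolerate duplicates because delivery is at-least-once. The Python sketch below shows both with an in-process queue and an in-memory set of seen event IDs. Both are assumptions made for illustration; a production subscriber would need durable storage for each. The class name Subscriber is hypothetical.

```python
import queue


class Subscriber:
    """Acknowledge fast, process later, and ignore redelivered events."""

    def __init__(self):
        self.work = queue.Queue()
        self._seen_ids = set()      # in-memory only; not durable

    def receive(self, event):
        """Called per delivered event; returns the HTTP status to send back."""
        if event["id"] not in self._seen_ids:
            self._seen_ids.add(event["id"])
            self.work.put(event)    # defer slow work instead of blocking the ack
        return 200                  # acknowledge duplicates too, so retries stop

    def drain(self, process):
        """Run the deferred work; returns how many events were processed."""
        processed = 0
        while not self.work.empty():
            process(self.work.get())
            processed += 1
        return processed


subscriber = Subscriber()
subscriber.receive({"id": "42", "eventType": "Microsoft.Storage.BlobCreated"})
subscriber.receive({"id": "42", "eventType": "Microsoft.Storage.BlobCreated"})  # duplicate
handled = []
count = subscriber.drain(handled.append)
```

Returning 200 for a duplicate matters: rejecting it would only cause Event Grid to retry the same event again.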

What This Looks Like on the Job

Enterprise Scenario 1: Automated Image Processing Pipeline

A media company uploads millions of images daily to Azure Blob Storage. They need to automatically generate thumbnails and apply watermarks. Using Event Grid, they subscribe to Blob Created events. When an image is uploaded, Event Grid triggers an Azure Function that resizes the image. The function writes the thumbnail to a different container. This pattern decouples upload from processing, allowing the storage service to scale independently. In production, they configured a dead-letter destination (a Blob Storage container) to capture events that could not be delivered (e.g., because the function kept failing on corrupted images). They set the retry policy to a maximum of 5 delivery attempts and a 10-minute event TTL to avoid prolonged retries on permanent failures. Common issue: misconfigured subject filters caused functions to fire for every blob event, including small temp files, overwhelming the function. Solution: use a subject prefix filter on /blobServices/default/containers/images/.

Enterprise Scenario 2: Real-Time Resource Monitoring

A large enterprise wants to react to Azure resource changes (e.g., VM creation, storage account deletion) for compliance auditing. They create a system topic for Azure subscriptions and subscribe to all resource write events. Events are sent to Event Hubs for long-term storage and to a Logic App that sends alerts to a Teams channel. Event Grid's built-in filtering ensures only events of interest (e.g., 'Microsoft.Resources.ResourceWriteSuccess') trigger alerts. They also enable diagnostic settings on the topic so that delivery-failure logs and metrics land in a Log Analytics workspace for later analysis. Pitfall: Event Grid's default retry policy caused duplicate alerts when the Logic App occasionally returned 500 errors. They lowered the maximum delivery attempts to 2 to reduce duplicates while still handling transient failures.

Scenario 3: Custom Application Event Bus

A SaaS provider uses Event Grid as an internal event bus for microservices. Each microservice publishes events to custom topics (e.g., 'OrderPlaced', 'PaymentReceived'). Other microservices subscribe to relevant topics. This avoids tight coupling. They use advanced filtering on order value to route high-value orders to a premium processing service. Performance: they handle 500 events/second with <100ms latency. Issue: one subscriber's endpoint was slow, causing retries and back-pressure. They moved that subscriber to a queue (Service Bus) and used Event Grid to push events to the queue instead, decoupling the slow processor.

How AZ-305 Actually Tests This

What AZ-305 Tests on Event Grid

AZ-305 objective 4.4: 'Design an event-driven architecture.' The exam focuses on choosing the right messaging service (Event Grid vs. Event Hubs vs. Service Bus) for a given scenario. Key differentiators tested:

Event Grid is for event routing (push-based, serverless).

Event Hubs is for data ingestion (pull-based, high throughput).

Service Bus is for enterprise messaging (queues/topics, ordering, transactions).

Common Wrong Answers

1. 'Use Event Grid when you need message ordering.' Wrong. Event Grid does not guarantee order. Candidates confuse it with Service Bus queues or Event Hubs partitions.

2. 'Event Grid supports exactly-once delivery.' Wrong. It guarantees at-least-once, which can result in duplicates. Exactly-once processing is possible with Service Bus duplicate detection.

3. 'Event Grid is best for high-throughput streaming telemetry.' Wrong. Event Hubs is designed for ingesting millions of events per second. Event Grid is for routing discrete events.

4. 'Event Grid can receive events from on-premises systems without custom code.' Wrong. On-premises systems need to publish via HTTP to a custom topic, which requires internet access or a custom connector.

Exam Numbers and Terms

Default maximum delivery attempts: 30 (retries back off exponentially until the TTL expires or the attempts run out).

Default event TTL: 1440 minutes (24 hours).

Max event size: 1 MB.

Supported subscriber types: Azure Functions, Webhooks, Event Hubs, Service Bus, Queue Storage, Hybrid Connections.

Filter types: event type, subject prefix/suffix, advanced (numeric, boolean, string).

Authentication: access keys, SAS tokens, or Microsoft Entra ID for publishing; managed identity for delivery to Azure destinations.

Edge Cases

Expired events: If the TTL expires during retry, the event is dead-lettered if a dead-letter destination is configured; otherwise it is dropped. The exam may ask what happens to an event that fails delivery for 25 hours.

Validation handshake: Webhook subscribers must validate the endpoint by returning the validation code in the initial handshake. If not, Event Grid won't send events.

Dead-lettering only works if configured: If not set, events are dropped after retries.

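Event Grid is regional: Topics are scoped to a region. Subscriber endpoints can live in any region as long as Event Grid can reach them, but the topic itself does not fail over automatically, so cross-region resilience has to be designed in.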

How to Eliminate Wrong Answers

If the scenario mentions 'ordered processing', 'sessions', or 'transactions', think Service Bus.

If 'high throughput' or 'streaming', think Event Hubs.

If 'reactive', 'serverless', 'push-based', or 'event routing', think Event Grid.

If 'duplicates are acceptable' or 'at-least-once', Event Grid is fine.

If 'exactly-once' is required, eliminate Event Grid.
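The elimination rules above amount to a small decision function. The Python sketch below encodes them as a study aid only: the function name choose_service and its flags are hypothetical, and real designs weigh more factors than these four.

```python
def choose_service(*, ordering=False, transactions=False,
                   exactly_once=False, streaming=False):
    """Map scenario keywords to the service the rules of thumb point to."""
    if transactions or exactly_once:
        return "Service Bus"    # Event Grid and Event Hubs are eliminated
    if streaming:
        return "Event Hubs"     # high-throughput ingestion, ordering per partition
    if ordering:
        return "Service Bus"    # FIFO queues and sessions
    return "Event Grid"         # reactive, push-based routing; duplicates acceptable


reactive_choice = choose_service()                    # "Event Grid"
telemetry_choice = choose_service(streaming=True)     # "Event Hubs"
orders_choice = choose_service(ordering=True)         # "Service Bus"
```

Checking the disqualifying requirements first mirrors how exam questions are usually answered: eliminate what cannot work, then pick the simplest remaining fit.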

Key Takeaways

Event Grid is a push-based event routing service with at-least-once delivery.

Default event TTL is 1440 minutes (24 hours) and default maximum delivery attempts is 30; retries stop at whichever limit is reached first.

Maximum event size is 1 MB.

Event Grid does not guarantee ordering; use Event Hubs or Service Bus if ordering is required.

Supported subscribers: Azure Functions, Webhooks, Event Hubs, Service Bus, Queue Storage, Hybrid Connections.

Authentication via access keys, SAS tokens, or Microsoft Entra ID for publishing, and managed identity for delivery; webhook subscribers must validate their endpoint.

Dead-lettering is optional and must be configured; otherwise failed events are dropped after retries.

Use Event Grid for reactive, serverless event routing; use Event Hubs for streaming; use Service Bus for enterprise messaging.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Event Grid

Push-based delivery to subscribers.

Best for event routing and reactive scenarios.

No ordering guarantee.

No consumer groups needed.

Supports dead-lettering and retry policies.

Event Hubs

Pull-based via consumer groups.

Best for high-throughput data ingestion (streaming).

Ordering per partition.

Multiple consumer groups allow parallel processing.

No built-in dead-lettering; uses Capture for storage.

Event Grid

Push-based, serverless event routing.

At-least-once delivery (duplicates possible).

No message ordering or sessions.

No support for transactions.

Lower latency (<1 second).

Service Bus

Pull-based: receivers pull from queues and topic subscriptions.

Duplicate detection enables exactly-once processing.

Supports FIFO ordering and sessions.

Supports transactions and dead-letter queues.

Higher latency but richer messaging features.

Watch Out for These

Mistake

Event Grid guarantees exactly-once delivery.

Correct

Event Grid delivers at-least-once, meaning duplicates are possible. Exactly-once is not guaranteed. For exactly-once, use Service Bus with deduplication.

Mistake

Event Grid can order events in the order they were published.

Correct

Event Grid does not preserve order. Events may be delivered out of order. If ordering is required, use Event Hubs (per partition) or Service Bus queues.

Mistake

Event Grid is ideal for high-throughput data ingestion like IoT telemetry.

Correct

Event Grid is for event routing, not high-throughput ingestion. Event Hubs is designed for millions of events per second ingestion. Event Grid can handle millions of events per second for routing, but it is not a streaming ingestion service.

Mistake

Event Grid can only be used with Azure services.

Correct

Event Grid supports custom topics and webhook subscribers, so it can integrate with any HTTP endpoint, including on-premises or third-party services, as long as they can send/receive HTTP requests.

Mistake

Event Grid automatically retries forever until the event is delivered.

Correct

Event Grid retries on an exponential backoff schedule for at most 30 attempts or until the event's TTL (24 hours by default) expires, whichever comes first. After that, the event is dropped (or dead-lettered if configured). It does not retry indefinitely.


Frequently Asked Questions

What is the difference between Event Grid and Event Hubs for AZ-305?

Event Grid is for event routing (push-based, reactive), while Event Hubs is for data ingestion (pull-based, high throughput). On the exam, if the scenario mentions 'react to events' or 'serverless', choose Event Grid. If it mentions 'ingest millions of events per second' or 'streaming', choose Event Hubs. Also, Event Grid does not guarantee order; Event Hubs does per partition.

Does Event Grid support exactly-once delivery?

No, Event Grid guarantees at-least-once delivery. Duplicates can occur. If exactly-once is required, use Service Bus with deduplication. Exam trap: candidates often assume Event Grid provides exactly-once because it is a managed service, but it does not.

How do I configure dead-lettering in Event Grid?

Dead-lettering is configured at the event subscription level. You specify a Blob Storage container as the destination. When an event cannot be delivered within its TTL or its maximum number of delivery attempts, Event Grid writes it to that container. If dead-lettering is not configured, failed events are dropped. Exam tip: remember that dead-lettering is optional and must be explicitly set.

Can Event Grid deliver events to on-premises systems?

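Yes, if the on-premises system exposes an HTTPS endpoint (webhook) that is reachable from the internet, or if it listens on an Azure Relay Hybrid Connection. Event Grid can push events to any public HTTPS endpoint. However, the endpoint must respond with a success status and handle the validation handshake. Event Grid cannot push directly into a private network, so for endpoints without public exposure use a Hybrid Connection as the handler, or deliver to a queue that the on-premises system reads from.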

What happens if an Event Grid subscriber is down for an extended period?

Event Grid keeps retrying on its backoff schedule until the event's TTL (24 hours by default) expires or the maximum number of delivery attempts (30 by default) is reached. After that, the event is either dead-lettered (if configured) or dropped. If the subscriber comes back after the retry budget is exhausted, it will have missed any events that were not dead-lettered. To avoid loss, configure dead-lettering, or deliver to a durable destination such as a Storage queue or Service Bus queue and have the consumer read from there.

What is the validation handshake for webhook subscribers?

When you create a webhook subscription, Event Grid sends a validation request to the endpoint with a validation code. The endpoint must return the same code in the response body as validationResponse (synchronous handshake). Alternatively, the request includes a validationUrl that can be called with a GET request to complete the handshake manually. Until the handshake succeeds, Event Grid will not send events. This prevents events from being pushed to endpoints whose owners have not opted in. Exam: you may be asked why events aren't reaching a webhook; a likely answer is that the validation handshake failed.

Can I filter events by custom properties in Event Grid?

Yes, Event Grid supports advanced filtering on event data properties (numeric, boolean, string comparisons). You can filter on fields in the 'data' object, such as 'data.size > 1000'. This is done in the event subscription. Exam: you might be asked to design a filter to only process events from a specific container or with a specific property.
