AZ-305 | Chapter 43 of 103 | Objective 4.4

Retry, Throttling, and Backoff Patterns

This chapter covers retry, throttling, and backoff patterns in Azure, which are critical mechanisms for building resilient cloud applications. By this guide's estimate, roughly 5-10% of AZ-305 questions touch on these patterns, usually in the context of designing for reliability and performance. Understanding them helps you design applications that handle transient faults and service limits gracefully, without degrading the user experience or incurring unnecessary cost.

25 min read
Intermediate
Updated May 31, 2026

Bursty Water Supply with Pressure Regulators

Imagine a municipal water distribution system connecting a high-pressure reservoir to thousands of homes. The reservoir can supply water at a maximum rate of 10,000 gallons per minute (GPM), but each home has a pipe that can safely handle only 5 GPM. If too many homes open their taps simultaneously, the pressure in the main pipe drops, and some homes receive a trickle or no water at all. To prevent this, the water authority installs pressure-regulating valves at the entrance to each neighborhood. When a home's tap is opened, the valve measures the current flow. If the flow exceeds 80% of the pipe's capacity, the valve temporarily restricts further flow by partially closing, causing the home to experience reduced pressure. The resident must then wait before trying again, waiting longer after each failed attempt and adding a small random offset so that neighbors do not all retry at the same moment (exponential backoff with jitter). The valve also logs each restriction event and can notify the authority if a home repeatedly attempts to draw water too aggressively. This mechanism ensures fair distribution and prevents any single home from starving others. In computing, the reservoir is the service (like Azure SQL Database), the homes are client applications, and the pressure regulators are throttling and retry policies enforced by Azure or implemented by the client.

How It Actually Works

What Are Retry, Throttling, and Backoff Patterns?

Retry, throttling, and backoff are fundamental patterns for handling transient faults and resource limits in distributed systems. Transient faults are temporary errors that occur due to network glitches, service restarts, or resource contention. Retry logic automatically reattempts a failed operation after a delay. Throttling is a service-side mechanism that limits the rate of requests from a client to protect the service from overload. Backoff is a strategy where the client increases the delay between retries to reduce load on the service and avoid exacerbating congestion.

How Throttling Works in Azure Services

Azure services enforce limits on request rates or resource consumption, scoped per subscription, region, or resource. A common way to meter requests is a token bucket, which Azure Resource Manager uses for its subscription and tenant limits; other services apply their own mechanisms. Azure SQL Database, for example, governs resources by service tier: when a database reaches its worker or session limit, new requests are rejected with error 10928 (the message reports which limit has been reached), and sustained 100% DTU consumption shows up as queuing and slower queries. Azure Storage publishes scalability targets instead, such as a default of 20,000 requests per second for a standard general-purpose v2 account, and answers requests beyond them with HTTP 503 (Server Busy) or 500 (Operation Timeout). HTTP-based services generally signal throttling with 429 (Too Many Requests) or 503, and many include a Retry-After header giving the number of seconds to wait.
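To make the token bucket idea concrete, here is a minimal sketch in Python. It is an illustration only and not how any Azure service is implemented; the class name, the `rate` and `capacity` parameters, and the injectable `clock` are assumptions chosen for this example.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable so the sketch can be tested
        self.tokens = float(capacity)   # the bucket starts full
        self.last = clock()

    def try_acquire(self, cost=1.0):
        """Return 0.0 if the request is admitted, else the seconds to wait."""
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return 0.0
        # Not enough tokens: report how long until the bucket can cover `cost`.
        return (cost - self.tokens) / self.rate
```

A non-zero return value plays the role of a Retry-After hint: a server built this way would reject the request and tell the caller how long to wait.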

Client-Side Retry Logic

Client applications should implement retry logic with exponential backoff to handle throttling responses. The standard approach is to use a library like Polly (for .NET) or Azure SDK built-in retry policies. A typical retry policy might retry up to 3 times with delays of 1 second, 2 seconds, and 4 seconds (exponential backoff). Some policies add jitter (randomization) to prevent thundering herd problems—where all clients retry simultaneously after a service outage. The Azure SDK for .NET, for instance, uses a default retry policy with exponential backoff and jitter for storage operations.
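The retry behavior described above can be sketched without any library. The following Python example is a minimal illustration, not the Polly or Azure SDK implementation; `TransientError`, `retry_with_backoff`, and the injectable `sleep` and `rng` parameters are hypothetical names introduced here.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a 429 or 503)."""


def retry_with_backoff(operation, max_retries=3, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep, rng=random.random):
    """Call operation(); on TransientError wait and retry up to max_retries times."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_retries:
                raise  # retry budget exhausted: surface the error to the caller
            backoff = min(max_delay, base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
            sleep(backoff + rng() * base_delay)  # jitter: up to base_delay extra
```

Passing `sleep` and `rng` as parameters is a design choice for testability: a test can record the waits instead of actually sleeping.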

Key Defaults and Timers

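Azure Storage SDK default retry policy (current Azure.Core-based .NET clients): MaxRetries=3, Mode=Exponential, Delay=0.8 seconds, MaxDelay=1 minute. The legacy Microsoft.Azure.Storage library used different setting names (DeltaBackoff, MinBackoff, MaxBackoff) and different values.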

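Azure SQL Database throttling: errors such as 10928 arrive over the SQL protocol, not HTTP, so there is no Retry-After header; Microsoft's guidance is to wait about 5 seconds before the first retry and back off from there.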

Azure Cosmos DB: Request rate too large (429) includes an x-ms-retry-after-ms header in milliseconds; client SDKs automatically wait that interval and retry, up to 9 times by default (the .NET SDK also caps the cumulative wait at 30 seconds by default).

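Azure Key Vault: Limits are counted as transactions per 10 seconds per vault and differ by operation type (key, secret, and certificate operations have separate quotas); exceeding them returns 429, and Microsoft recommends exponential backoff on the client.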

Azure API Management: Configurable rate limits per subscription or API; returns 429 with Retry-After.

How Retry and Backoff Interact with Circuit Breaker

Retry patterns are often combined with a circuit breaker pattern to prevent excessive retries when a service is down. The circuit breaker monitors failure rates; if failures exceed a threshold (e.g., 50% in 10 seconds), it opens the circuit and immediately fails requests without attempting retries for a cooldown period (e.g., 30 seconds). After cooldown, it allows a limited number of test requests to see if the service has recovered. This prevents retry storms from overwhelming a failing service.
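A circuit breaker can be sketched in a few lines. The Python example below is a simplified illustration under stated assumptions: it counts consecutive failures instead of the failure rate over a time window described above, and the class names, thresholds, and injectable `clock` are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0       # consecutive failures seen so far
        self.opened_at = None   # None means the circuit is closed

    def call(self, operation):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("failing fast during cooldown")
            # Cooldown elapsed: half-open, let this one call probe the service.
        try:
            result = operation()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        self.failures = 0
        self.opened_at = None  # a success closes the circuit
        return result
```

In practice the retry loop sits inside the breaker: each retried call goes through `call`, so once the circuit opens the remaining retries fail immediately instead of adding load.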

Configuration and Verification

To configure retry policies in .NET using Polly:

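// Polly v7-style API; needs: using System.Net; using System.Net.Http; using Polly;
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .OrResult<HttpResponseMessage>(r => r.StatusCode == HttpStatusCode.ServiceUnavailable
        || (int)r.StatusCode == 429) // also retry on throttling (Too Many Requests)
    // retryAttempt starts at 1, so the waits are 2 s, 4 s, and 8 s
    .WaitAndRetryAsync(3, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));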

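To verify throttling behavior, you can use Azure Monitor metrics. For Azure SQL Database, check the 'dtu_consumption_percent' metric. The value tops out at 100%, so sustained readings at or near 100% suggest the database is resource-bound and requests are queuing or being rejected. For Azure Storage, split the 'Transactions' metric by its ResponseType dimension to count throttled requests (for example ServerBusyError or ClientThrottlingError), and watch 'SuccessE2ELatency' and 'SuccessServerLatency', where rising values point to server-side pressure.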

Interaction with Related Technologies

Azure Load Balancer: Distributes traffic but does not handle retries; clients must implement retry for failed connections.

Azure Traffic Manager: Provides DNS-level load balancing; does not retry on failure.

Azure Front Door: Uses health probes to route requests away from unhealthy origins; it does not replace client-side retry logic.

Azure Cache for Redis: Caps client connections by cache size (256 on the smallest Basic C0 cache); beyond the cap, new connections fail with 'ERR max number of clients reached'.

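Service Bus: Throttles when a namespace exceeds its capacity (credit-based in the Standard tier, messaging units in Premium); clients receive a server-busy error and should back off before retrying.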

Advanced Considerations

Idempotency: Retries can cause duplicate operations. Design operations to be idempotent (e.g., using idempotency keys) so that retrying does not cause side effects.

Backoff Strategies: Fixed interval, incremental (linear), exponential, and exponential with jitter. Jitter is crucial to avoid synchronized retries.

Retry-After Header: The most reliable way to determine backoff delay when the service provides it. Always respect this header.

Throttling vs. Rate Limiting: Throttling is reactive (based on current load), while rate limiting is proactive (based on fixed limits). Azure uses both.
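The four backoff strategies listed above differ only in how the delay is computed from the attempt number. The Python sketch below shows one possible formulation; the function names, the 0-based `attempt`, and the 60-second cap are assumptions for illustration, and the jitter variant shown is the "full jitter" form (a random value between zero and the exponential delay), which is one of several ways to randomize.

```python
import random


def fixed(attempt, base=1.0):
    """Same delay every time: 1 s, 1 s, 1 s, ..."""
    return base


def incremental(attempt, base=1.0):
    """Linear growth: 1 s, 2 s, 3 s, ..."""
    return base * (attempt + 1)


def exponential(attempt, base=1.0, cap=60.0):
    """Doubling growth with a ceiling: 1 s, 2 s, 4 s, ... up to cap."""
    return min(cap, base * 2 ** attempt)


def exponential_with_jitter(attempt, base=1.0, cap=60.0, rng=random.random):
    """Full jitter: uniform between 0 and the exponential delay."""
    return rng() * exponential(attempt, base, cap)


for name, fn in [("fixed", fixed), ("incremental", incremental),
                 ("exponential", exponential)]:
    print(name, [fn(a) for a in range(4)])
```

Only the last strategy gives two clients that failed at the same instant different retry times, which is why it is preferred when many clients share a service.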

Exam-Relevant Numbers

Default retry count in Azure SDK: 3

Exponential backoff: multiplier of 2 (the delay doubles each attempt), commonly with a 1-second base delay

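Retry-After typical values: a few seconds up to 60 seconds for HTTP-based services; Azure SQL Database sends no Retry-After header, and the usual guidance is to wait about 5 seconds before the first retry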

Circuit breaker example thresholds used in this chapter: open after 5 failures, cooldown 30 seconds (illustrative values, not a documented service default)

Azure SQL DTU: sustained 100% consumption means the database is resource-bound; requests queue and may be rejected

Max retries for Cosmos DB SDK: 9

Common Anti-Patterns

Immediate Retry: Retrying immediately without backoff floods the service.

Infinite Retry: Can cause indefinite resource consumption.

Ignoring Retry-After: Overrides the service's recommended delay, leading to continued throttling.

Same Backoff for All Clients: Causes thundering herd; use jitter.

Not Handling Non-Transient Errors: Retrying on 400 Bad Request is useless.
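The last anti-pattern comes down to classifying a response before retrying it. A minimal Python sketch follows; the exact set of status codes is an assumption for illustration (this chapter names 429 and 5xx as retriable and 400 and 404 as not), and real SDKs make their own choices.

```python
# Illustrative set: throttling, timeouts, and server-side faults.
RETRIABLE_STATUS = {408, 429, 500, 502, 503, 504}


def is_retriable(status_code):
    """Return True when a retry has a realistic chance of succeeding."""
    return status_code in RETRIABLE_STATUS


for code in (400, 404, 429, 503):
    print(code, "retry" if is_retriable(code) else "fail fast")
```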

Summary

Retry, throttling, and backoff are essential for building resilient Azure applications. The exam expects you to know default values, how to configure policies, and how these patterns interact with Azure services. Always implement exponential backoff with jitter, respect Retry-After headers, and combine with circuit breaker for robust fault handling.

Walk-Through

1. Client sends request to Azure service

The client application sends a request to an Azure service endpoint, for example an HTTP request to Azure Storage or a query to Azure SQL Database. At this point the client knows nothing about the service's current load or capacity. The retry policy has not yet been invoked; it acts only once a response or error comes back.

2. Service evaluates request against limits

The Azure service receives the request and checks it against its limits and current resource utilization. For example, Azure SQL Database checks the worker and session limits of the service tier, while rate-based services often use a token bucket: each request consumes a token, tokens are replenished at a fixed rate, and a request that finds the bucket empty is throttled.

3. Service returns throttling response

An HTTP-based service responds with status 429 (Too Many Requests) or 503 (Service Unavailable), often with a Retry-After header specifying how many seconds the client should wait before retrying. Azure SQL Database instead returns a SQL error such as 10928 over its own protocol, with no Retry-After header. When the header is present, the client must parse it and delay accordingly.

4. Client applies retry policy with backoff

The client's retry logic (e.g., Polly policy) intercepts the 429/503 response. It checks the retry count: if this is the first retry, it waits for a delay calculated by the policy. For exponential backoff, the delay is typically baseDelay * (2^retryAttempt). The client also checks the Retry-After header; if present, it should use that value instead of the policy's calculated delay. The client then waits for the specified time.
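The delay decision in this step can be written as a small function. This Python sketch is an illustration only; `next_delay`, its 0-based `attempt`, and the 60-second cap are hypothetical choices for this example.

```python
def next_delay(attempt, retry_after=None, base_delay=1.0, max_delay=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    if retry_after is not None:
        # The service's hint takes precedence over the client's own schedule.
        return max(0.0, float(retry_after))
    # Otherwise fall back to capped exponential backoff: base * 2^attempt.
    return min(max_delay, base_delay * 2 ** attempt)


print(next_delay(0))                  # no header, first retry
print(next_delay(2))                  # no header, third retry
print(next_delay(2, retry_after=10))  # header present
```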

5. Client resends request after delay

After the delay, the client resends the exact same request. If the operation is not idempotent, this could cause duplicate data. Ideally, the request includes an idempotency key so the service can deduplicate. The service again evaluates the request against current limits. If load has subsided, the request succeeds; otherwise, another throttling response may be returned.
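Deduplication by idempotency key can be sketched as follows. The `PaymentService` class and its in-memory dictionary are hypothetical stand-ins; a real service would persist the keys and expire them after some retention period.

```python
import uuid


class PaymentService:
    """Hypothetical service that deduplicates requests by idempotency key."""

    def __init__(self):
        self.results = {}   # idempotency key -> stored response
        self.balance = 0

    def deposit(self, idempotency_key, amount):
        if idempotency_key in self.results:
            # Replay of an earlier request: return the stored response,
            # without applying the side effect a second time.
            return self.results[idempotency_key]
        self.balance += amount
        response = {"status": "applied", "balance": self.balance}
        self.results[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())            # the client generates one key per logical operation
first = service.deposit(key, 50)
retry = service.deposit(key, 50)   # same key: treated as a retry, not a new deposit
print(first == retry, service.balance)
```

The key must be generated once per logical operation and reused on every retry; generating a fresh key per attempt would defeat the deduplication.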

6. Retry count exceeded or success

If the retry count (e.g., 3) is exceeded without success, the client throws an exception or returns an error to the application. If the request succeeds, the client proceeds normally. The retry policy may also implement circuit breaker: if a threshold of failures is reached (e.g., 5 in 10 seconds), the circuit opens and all subsequent requests fail immediately without retrying for a cooldown period.

What This Looks Like on the Job

Enterprise Scenario 1: High-Volume E-Commerce Platform

A major e-commerce company runs its product catalog on Azure Cosmos DB. During flash sales, the client app (a web API) sends thousands of read requests per second. Cosmos DB throttles requests when the request rate exceeds the provisioned throughput (e.g., 10,000 RU/s). The client uses the Cosmos DB SDK, which automatically retries throttled requests up to 9 times, waiting the interval the service returns. However, during peak load, many clients retry simultaneously, causing a thundering herd. The solution was to implement a distributed rate limiter using Azure Cache for Redis to pre-throttle requests at the API gateway, reducing the number of requests reaching Cosmos DB. Additionally, they increased provisioned throughput during sales events using autoscale. Misconfiguration: initially, the SDK's max retry count was set to 30, causing long delays and poor user experience. They reduced it to 9 and added a circuit breaker that opens after 5 consecutive throttling responses.

Enterprise Scenario 2: Financial Services with Azure SQL Database

A bank uses Azure SQL Database for transaction processing. A batch job runs daily to update account balances, sending many concurrent update queries. The database rejects requests with error 10928 once the concurrent queries exhaust its resource limits. The batch job originally used a retry policy with fixed 1-second intervals, which did not reduce the load. They switched to exponential backoff with jitter (base delay 1 second, max 30 seconds). They also implemented a circuit breaker: if 3 consecutive throttling errors occur, the job pauses for 60 seconds before retrying. This reduced database CPU from 100% to 70% during the batch window. Performance consideration: they moved to a higher DTU tier to reduce throttling frequency, balancing cost and performance.

Enterprise Scenario 3: IoT Telemetry with Azure Event Hubs

An IoT company sends telemetry from millions of devices to Azure Event Hubs. Each device sends data every minute. Event Hubs throttles when the ingress rate exceeds throughput units (TUs). The device SDK implements retry with exponential backoff (initial 1 second, max 60 seconds). However, when many devices are throttled simultaneously, they all retry after the same backoff intervals, causing repeated throttling. The solution was to add jitter: each device adds a random offset to its backoff delay (0-1 second). They also implemented a local cache on the device to store data during throttling and send it later. Misconfiguration: they did not respect the Retry-After header initially, causing continuous throttling. After fixing, throttling errors dropped by 90%.

How AZ-305 Actually Tests This

What AZ-305 Tests on This Topic

The AZ-305 exam objective 4.4 focuses on designing for reliability and performance. Specifically, you must know how to implement retry and backoff patterns to handle transient faults and throttling. The exam may ask you to choose the correct strategy for a given scenario, such as selecting between fixed interval, incremental, exponential backoff, or exponential backoff with jitter. It also tests your understanding of circuit breaker pattern and its integration with retry.

Common Wrong Answers and Why Candidates Choose Them

1. Fixed interval retry: Many candidates choose this because it's simple. However, the exam expects exponential backoff for most scenarios because it reduces load on the service. Fixed interval is only acceptable for very low retry counts or non-critical operations.

2. Immediate retry: Candidates think retrying immediately is fast, but it causes thundering herd and worsens throttling. The exam tests that you should always add a delay.

3. Ignoring Retry-After header: Some candidates assume the client's backoff policy is always sufficient. The exam emphasizes that you must respect the Retry-After header when provided, as it reflects the service's actual recovery time.

4. Infinite retry: Candidates may think 'retry until success' is resilient. The exam expects you to limit retries to avoid indefinite resource consumption and to implement circuit breaker.

Specific Numbers and Terms That Appear on the Exam

Default retry count: 3 (Azure SDK)

Exponential backoff formula: delay = baseDelay * (2^attempt)

Retry-After header: typical values 5, 10, 30 seconds

Circuit breaker: open after 5 failures, cooldown 30 seconds (example thresholds used in this chapter, not a documented service default)

Throttle error codes: 429 (Too Many Requests), 503 (Service Unavailable), 10928 (SQL Database)

Jitter: random addition of up to 1 second to avoid synchronization

Edge Cases and Exceptions

Non-transient errors: Do not retry on 400 (Bad Request) or 404 (Not Found).

Idempotency: Retrying a non-idempotent operation (e.g., POST without idempotency key) can cause duplicate data. The exam tests that you should design operations to be idempotent.

Circuit breaker in combination: The exam may ask when to use circuit breaker vs. retry alone. Circuit breaker is needed when the service is likely to be down for an extended period.

Backoff with jitter: Required when many clients are likely to be throttled simultaneously (e.g., after a service restart).

How to Eliminate Wrong Answers

If the scenario involves a service that provides Retry-After header, always choose an answer that respects it.

If the scenario involves many clients, choose exponential backoff with jitter over plain exponential.

If the scenario mentions transient faults (e.g., network timeout), retry is appropriate; if it mentions service overload, throttling is happening.

Always look for the option that limits retry count and includes a delay.

Key Takeaways

HTTP-based Azure services signal throttling with 429 or 503, often with a Retry-After header; respect the header whenever it is present.

Default retry count in Azure SDKs is 3; exponential backoff is the recommended strategy.

Exponential backoff formula: delay = baseDelay * (2^attempt); baseDelay is typically 1 second.

Jitter (random addition to delay) prevents thundering herd when many clients retry simultaneously.

Circuit breaker pattern should be combined with retry to prevent retries during extended outages.

Non-transient errors (most 4xx client errors, such as 400 and 404) should not be retried; transient faults (429, 408, and 5xx) are retriable.

Idempotency is essential when retrying write operations to avoid duplicate data.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Fixed Interval Retry

Delay is constant (e.g., 1 second) between retries.

Simple to implement and understand.

Does not reduce load on the service over time.

Can cause thundering herd if many clients retry simultaneously.

Suitable for low retry counts (1-2) or non-critical operations.

Exponential Backoff

Delay increases exponentially (e.g., 1s, 2s, 4s).

More complex but reduces load on the service.

Helps service recover by giving it time to clear backlog.

Without jitter, can still cause thundering herd.

Recommended for most production scenarios by Azure.

Watch Out for These

Mistake

Retrying immediately after a throttling response is effective because it gets the request through quickly.

Correct

Immediate retry without delay worsens congestion and is likely to be throttled again. Exponential backoff with a delay is required to give the service time to recover.

Mistake

The Retry-After header is optional and can be ignored if the client has its own backoff policy.

Correct

The Retry-After header indicates the service's recommended wait time. Ignoring it can lead to repeated throttling and is considered a best practice violation. The client should always use the Retry-After value when present.

Mistake

Exponential backoff without jitter is sufficient for all scenarios.

Correct

Without jitter, all clients may retry at the same time, causing a thundering herd. Jitter randomizes delays to spread retries over time, which is essential for large-scale systems.

Mistake

Throttling only occurs at the service level, so client-side retry logic is unnecessary.

Correct

Client-side retry logic is critical because throttling can happen at various layers (network, load balancer, service). The client must handle throttling responses gracefully to maintain reliability.

Mistake

Circuit breaker and retry are mutually exclusive; you should use one or the other.

Correct

Circuit breaker and retry are complementary. Retry handles transient faults, while circuit breaker prevents retries when the service is likely down. They are often used together: retry with backoff, and open circuit after repeated failures.

Frequently Asked Questions

What is the difference between throttling and rate limiting in Azure?

Throttling is reactive: it occurs when a service detects that a client is exhausting its allocated resources (e.g., DTU consumption pinned at 100%). Rate limiting is proactive: it enforces a fixed maximum number of requests per time window (e.g., 10 requests per second per subscription). On HTTP-based services both typically surface as 429 responses, but throttling is based on current load, while rate limiting is based on predefined limits. On the exam, understand that both can trigger retry logic.

How should I configure retry for Azure SQL Database?

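Use exponential backoff with jitter. Azure SQL Database does not send a Retry-After header with error 10928, so the client's own backoff policy sets the delay; Microsoft's guidance is to wait about 5 seconds before the first retry. SqlConnection retries connection attempts according to the ConnectRetryCount and ConnectRetryInterval connection-string settings, and Microsoft.Data.SqlClient offers configurable retry logic for commands. You can also use Polly to define custom policies. Set max retries to 3-5. Example: Policy.Handle<SqlException>(e => e.Number == 10928).WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));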

What is the purpose of jitter in backoff?

Jitter adds a random offset to the backoff delay (e.g., between 0 and 1 second) to prevent all clients from retrying at the same time. Without jitter, after a service outage, all clients might retry simultaneously, causing a thundering herd that overwhelms the service. Jitter spreads retries over time, improving success rates. The exam expects you to choose jitter when multiple clients are involved.
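The effect of jitter is easy to see in a small simulation. The Python sketch below is an illustration under simplifying assumptions: every client is throttled at time zero, all use the same exponential schedule, and retry instants are grouped into 0.1-second buckets. The function name and parameters are hypothetical.

```python
import random
from collections import Counter


def retry_times(clients, attempts, base=1.0, jitter=0.0, seed=0):
    """Count how many retries land in each 0.1 s bucket across all clients."""
    rng = random.Random(seed)
    instants = Counter()
    for _ in range(clients):
        t = 0.0  # every client is throttled at t = 0
        for attempt in range(attempts):
            # Exponential delay plus a random offset in [0, jitter].
            t += base * 2 ** attempt + rng.uniform(0.0, jitter)
            instants[round(t, 1)] += 1
    return instants


no_jitter = retry_times(clients=1000, attempts=3)
with_jitter = retry_times(clients=1000, attempts=3, jitter=1.0)
print("busiest bucket without jitter:", max(no_jitter.values()))
print("busiest bucket with jitter:   ", max(with_jitter.values()))
```

Without jitter all 1000 clients retry at exactly the same three instants; with up to one second of jitter the same retries are spread across many buckets, so the peak load on the service is far lower.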

When should I use circuit breaker instead of retry?

Use circuit breaker when the service is likely to be down for an extended period (e.g., due to a crash or maintenance). Retry with backoff is suitable for transient faults that resolve quickly (seconds). Circuit breaker monitors failure rates and opens the circuit after a threshold, preventing retries and allowing the service to recover. For example, after 5 consecutive failures in 10 seconds, open the circuit for 30 seconds.

What is the default retry policy for Azure Storage SDK?

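The default retry policy for the current Azure Storage SDK for .NET (e.g., BlobClient) comes from Azure.Core: exponential backoff with max retries = 3, initial delay = 0.8 seconds, and max delay = 1 minute. It also respects the Retry-After header if provided. You can customize this via the Retry property (RetryOptions) of the client options, such as BlobClientOptions. The legacy Microsoft.Azure.Storage library used different setting names (delta, min, and max backoff) and values.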

How can I monitor throttling in Azure?

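Use Azure Monitor metrics. For Azure SQL Database, monitor 'dtu_consumption_percent'; the value tops out at 100%, so sustained readings at or near 100% indicate the database is resource-bound. For Azure Storage, split the 'Transactions' metric by ResponseType to count throttled requests (for example ServerBusyError or ClientThrottlingError), and check 'SuccessE2ELatency' and 'SuccessServerLatency', where rising values point to server-side pressure. You can also log 429 responses in application logs. Set up alerts for these metrics to proactively adjust provisioned capacity.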

What is the Retry-After header and how is it used?

The Retry-After header is an HTTP response header sent by the server (e.g., Azure) to indicate how long the client should wait before retrying. It can be in seconds (e.g., Retry-After: 10) or a date. The client must use this value as the delay, overriding any client-side backoff calculation. This ensures the client waits the optimal time for the service to recover.
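Because the header has two forms, a client needs to handle both. The following Python sketch uses only the standard library; the function name and the choice to return `None` for unparseable values are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, now=None):
    """Return the wait in seconds for a Retry-After value, or None if unparseable."""
    value = value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "10"
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # treat a naive date as UTC
    now = now or datetime.now(timezone.utc)
    # A date in the past means "retry now", never a negative wait.
    return max(0.0, (when - now).total_seconds())


print(parse_retry_after("10"))
```

When the function returns `None`, a reasonable fallback is the client's own backoff schedule.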
