This chapter covers Azure Traffic Manager, Microsoft's DNS-based global traffic load balancer. It is a core topic in the AZ-305 exam under Infrastructure (Objective 4.2: Design a load balancing and failover strategy). Approximately 5-10% of exam questions relate to global load balancing, with Traffic Manager being the primary service. You will learn its routing methods, endpoint monitoring, configuration, and how it differs from other Azure load balancing services. Mastery of this topic is critical for designing resilient, high-performance multi-region architectures.
Azure Traffic Manager operates like a global air traffic control system for DNS queries. Each endpoint (web server) is like an airport in a different city. When a user types a domain name, the request is like an airplane wanting to land. Traffic Manager's DNS server acts as the air traffic controller, directing each plane to the best airport based on rules: the fastest route (performance), the airport assigned to the plane's region of origin (geographic), or a fixed share of traffic per airport (weighted). The controller doesn't fly the plane or handle the landing; it just tells the pilot which airport to go to. The pilot then flies directly to that airport. Similarly, Traffic Manager returns a DNS response with the IP address of the chosen endpoint, and the client connects directly to that IP. The controller continuously monitors each airport for delays or closures (endpoint health checks) and, if an airport is closed, directs planes to a backup airport instead (failover). The key is that the controller only directs; it never handles the actual flight. This is why Traffic Manager is DNS-based and not a proxy or gateway.
What is Azure Traffic Manager?
Azure Traffic Manager is a DNS-based traffic load balancer that answers DNS queries with one of several endpoints (in Azure regions or external locations), chosen by the configured routing method. It works purely at the DNS level: DNS is an application-layer (Layer 7) protocol, but Traffic Manager never sees or proxies the application traffic itself. It only returns a DNS response pointing to the best endpoint according to the configured policy. This makes it a good fit for global load balancing, disaster recovery, and latency-sensitive applications.
How Traffic Manager Works
When a client makes a DNS query for a domain (e.g., app.contoso.com) that is configured with Traffic Manager, the following steps occur:
The client's DNS resolver queries the authoritative DNS server for the domain, which returns a CNAME record pointing to the Traffic Manager profile's DNS name (e.g., app.trafficmanager.net).
The resolver then queries the Traffic Manager DNS server for the profile's DNS name.
Traffic Manager evaluates the configured routing method, checks the health of all endpoints, and selects the best endpoint.
Traffic Manager returns a DNS response containing the endpoint's IP address (or CNAME) with a Time-to-Live (TTL) value (default 300 seconds).
The client's DNS resolver caches this response and returns the IP to the client. The client then connects directly to that IP.
Traffic Manager does not sit in the data path. It only influences DNS resolution. This means it cannot see HTTP traffic or session state; it only directs clients to endpoints.
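The resolution chain above can be sketched as a toy lookup. This is only an illustration: the two dictionaries stand in for the customer's DNS zone and for Traffic Manager's decision, and the IP address (from a documentation range) and the function name resolve are assumptions, not part of any Azure API.

```python
# Toy model of the DNS chain; every name and address here is illustrative.
CUSTOMER_ZONE = {"app.contoso.com": ("CNAME", "app.trafficmanager.net")}

# Stand-in for Traffic Manager's answer: profile name -> (type, ip, ttl).
TRAFFIC_MANAGER = {"app.trafficmanager.net": ("A", "203.0.113.10", 300)}


def resolve(name):
    """Follow the CNAME to the profile, then return (ip, ttl) for the endpoint."""
    record_type, target = CUSTOMER_ZONE[name]
    if record_type != "CNAME":
        raise ValueError("expected a CNAME pointing at the Traffic Manager profile")
    _, ip, ttl = TRAFFIC_MANAGER[target]
    return ip, ttl


ip, ttl = resolve("app.contoso.com")
# From here on the client talks to the endpoint directly; nothing in this
# model (or in Traffic Manager) carries the HTTP traffic.
print(f"client connects directly to {ip}; resolver may cache for {ttl}s")
```

Note that the only thing returned is an address and a TTL, which is why Traffic Manager has no view of HTTP requests or session state.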
Routing Methods
Traffic Manager supports six routing methods:
Priority: Designates a primary endpoint and fallbacks. All traffic goes to the highest-priority healthy endpoint. If it fails, traffic shifts to the next priority. Used for active/passive failover.
Weighted: Distributes traffic across endpoints based on assigned weights (1-1000). Used for gradual rollout, A/B testing, or load distribution.
Performance: Uses the source IP of the DNS query, which is normally the client's recursive DNS resolver rather than the client itself, to look up the endpoint with the lowest network latency. Does NOT measure real-time latency; it uses a continuously updated table of round-trip times from IP ranges to Azure regions.
Geographic: Routes traffic based on the geographic location of the client's DNS resolver IP. Requires mapping geographic regions to endpoints. Used for data sovereignty or geo-restriction.
Multivalue: Returns multiple healthy endpoints in the DNS response. The client chooses one. Used when the client can handle multiple IPs for high availability.
Subnet: Routes traffic based on the client's source IP subnet. Allows granular IP-based routing.
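As a rough illustration of how three of these methods differ, the selection step can be modeled as a function over the healthy endpoints. This is a minimal sketch, not Traffic Manager's implementation: the Endpoint class, the pick function, and the latency_ms field (standing in for a latency-table entry for the caller's IP range) are hypothetical, and the case where every endpoint is unhealthy is simplified to returning None.

```python
import random
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    healthy: bool = True
    priority: int = 1         # lower number = preferred (Priority routing)
    weight: int = 1           # 1-1000 (Weighted routing)
    latency_ms: float = 0.0   # hypothetical latency-table entry


def pick(endpoints, method, rng=random):
    """Return the endpoint a DNS answer would point to, or None if none is healthy."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    if method == "priority":
        return min(healthy, key=lambda e: e.priority)
    if method == "performance":
        return min(healthy, key=lambda e: e.latency_ms)
    if method == "weighted":
        return rng.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]
    raise ValueError(f"unsupported method: {method}")


endpoints = [
    Endpoint("east", priority=1, latency_ms=40),
    Endpoint("west", priority=2, latency_ms=25),
]
print(pick(endpoints, "priority").name)     # east
print(pick(endpoints, "performance").name)  # west
endpoints[0].healthy = False
print(pick(endpoints, "priority").name)     # west
```

The same endpoint list produces different answers depending only on the method, which is the distinction exam questions usually probe.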
Endpoints and Monitoring
An endpoint can be an Azure cloud service, an App Service app, a public IP address, or an external (non-Azure) endpoint. Health monitoring is configured on the profile and applies to all of its endpoints: a protocol (HTTP, HTTPS, or TCP), a port, and, for HTTP and HTTPS, a path such as /health. Traffic Manager probes each endpoint at the Probing Interval, which is either 30 seconds (the default) or 10 seconds (fast probing). A probe fails if the endpoint does not respond within the timeout (default 10 seconds) or, for HTTP and HTTPS, returns a status other than 200 (or outside the expected status code ranges, if the profile defines them). Once consecutive failures exceed the Tolerated Number of Failures (default 3, range 0-9), the endpoint's monitor status becomes Degraded, the state this chapter also calls unhealthy, and it is removed from DNS responses.
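The failure counting can be sketched as a small state machine. This is an illustrative model only: the class and method names are hypothetical, and it assumes the endpoint flips on the first failed probe beyond the tolerated count, so that a tolerance of 0 still requires one failed probe. It also assumes a single successful probe brings the endpoint back.

```python
class EndpointMonitor:
    """Tracks consecutive probe failures for one endpoint (illustrative model)."""

    def __init__(self, tolerated_failures=3):
        self.tolerated_failures = tolerated_failures
        self.consecutive_failures = 0
        self.status = "Online"

    def record_probe(self, status_code=None, timed_out=False):
        """Record one probe result and return the resulting monitor status."""
        succeeded = (not timed_out) and status_code == 200
        if succeeded:
            # Any success resets the failure streak.
            self.consecutive_failures = 0
            self.status = "Online"
        else:
            self.consecutive_failures += 1
            # Assumption: flip only once the streak exceeds the tolerance.
            if self.consecutive_failures > self.tolerated_failures:
                self.status = "Degraded"
        return self.status


monitor = EndpointMonitor(tolerated_failures=3)
history = [monitor.record_probe(status_code=500) for _ in range(4)]
print(history)                                # ['Online', 'Online', 'Online', 'Degraded']
print(monitor.record_probe(status_code=200))  # Online
```

A streak that is interrupted by one successful probe starts over, which is why a flapping health endpoint can keep an unreliable region in rotation.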
Configuration and Verification
To create a Traffic Manager profile using Azure CLI:
az network traffic-manager profile create \
--name MyProfile \
--resource-group MyRG \
--routing-method Performance \
--unique-dns-name myapp \
--ttl 60 \
--protocol HTTP \
--port 80 \
--path "/health"
Add endpoints:
az network traffic-manager endpoint create \
--name endpoint1 \
--resource-group MyRG \
--profile-name MyProfile \
--type azureEndpoints \
--target-resource-id /subscriptions/.../webapps/myapp1 \
--endpoint-status Enabled
To check DNS resolution:
nslookup myapp.trafficmanager.net
Interaction with Other Services
Traffic Manager works with Azure Front Door (AFD) and Application Gateway. AFD is also a global load balancer but operates as a Layer 7 proxy with HTTP/S capabilities (SSL offload, WAF, URL-based routing). Traffic Manager is DNS-only and simpler. Application Gateway is a regional Layer 7 load balancer. In a typical architecture, Traffic Manager distributes traffic across regions, and each region has an Application Gateway that load balances within that region. Traffic Manager can also be used with Azure Load Balancer (Layer 4, regional) for global failover.
Performance and Scale
Traffic Manager is a global service, and you do not provision or scale DNS query capacity yourself; it is built to absorb very high query volumes. Clients and resolvers cache each DNS response for its TTL. A lower TTL (e.g., 60 seconds) provides faster failover but increases DNS query volume. Default TTL is 300 seconds. A profile supports up to 200 endpoints by default.
Pricing
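Traffic Manager is billed per million DNS queries received, with a lower per-million rate for usage above 1 billion queries per month. Health checks are billed separately per monitored endpoint, and external endpoints cost more to monitor than Azure endpoints. Optional features such as Real User Measurements and Traffic View are also charged separately.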
Limitations
DNS caching: Clients and resolvers may cache DNS responses beyond TTL, delaying failover.
No session affinity: Because it is DNS-based, it cannot maintain sticky sessions. Use Application Gateway or Azure Load Balancer for session affinity.
No HTTP inspection: Cannot examine HTTP headers, paths, or cookies. Use Front Door for that.
No SSL offloading: Traffic Manager does not terminate SSL.
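Limited HTTPS health checks: HTTPS probes only confirm that the endpoint responds with an expected status; they do not validate the server certificate. For HTTPS endpoints, use HTTPS health probes on port 443 rather than relying on a TCP probe, which only confirms that the port accepts connections.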
Best Practices
Set TTL appropriately: Low TTL (60s) for fast failover, high TTL (300s) for reduced DNS load.
Use a dedicated health endpoint that checks application health, not just server availability.
Combine with Azure Front Door for advanced routing and security.
Use Performance routing for latency-sensitive global apps.
Use Priority routing for disaster recovery (active-passive).
Monitor endpoint health via Azure Monitor and set alerts for endpoint degradation.
Client DNS Query Initiation
The client (or its DNS resolver) initiates a DNS query for the application domain (e.g., app.contoso.com). This query first goes to the authoritative DNS server for contoso.com, which returns a CNAME record pointing to the Traffic Manager profile's DNS name (e.g., app.trafficmanager.net). The resolver then queries the Traffic Manager DNS server for that name. At this point, no routing decision has been made yet.
Traffic Manager DNS Evaluation
The Traffic Manager DNS server receives the query. It evaluates the configured routing method (e.g., Performance, Priority). It checks the health status of all endpoints by reviewing the latest health probe results (which are collected every 30 seconds by default). It selects the best healthy endpoint based on the routing method. For Performance routing, it uses the source IP of the DNS query, which is normally the client's recursive resolver rather than the client itself, to look up latency data in a global table.
DNS Response with Endpoint IP
Traffic Manager returns a DNS response containing the IP address (or CNAME) of the selected endpoint, along with a TTL value (default 300 seconds). The response is authoritative for the trafficmanager.net zone. The client's DNS resolver caches this response for the TTL duration. The client then uses the IP address to establish a direct TCP connection to the endpoint. Traffic Manager is no longer involved.
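The effect of resolver caching can be sketched with a tiny TTL cache. This is a simplified model under stated assumptions: the ResolverCache class is hypothetical, time is passed in explicitly as seconds so the behavior is easy to follow, and the lookup callable stands in for a query to Traffic Manager.

```python
class ResolverCache:
    """Minimal TTL cache standing in for a recursive DNS resolver."""

    def __init__(self, lookup):
        self.lookup = lookup          # callable: name -> (ip, ttl_seconds)
        self.entries = {}             # name -> (ip, expires_at)
        self.upstream_queries = 0

    def resolve(self, name, now):
        cached = self.entries.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]          # served from cache; Traffic Manager is not asked
        ip, ttl = self.lookup(name)
        self.upstream_queries += 1
        self.entries[name] = (ip, now + ttl)
        return ip


# Hypothetical answers: the second one simulates a changed routing decision.
answers = iter([("203.0.113.10", 60), ("203.0.113.20", 60)])
cache = ResolverCache(lambda name: next(answers))

print(cache.resolve("app.trafficmanager.net", now=0))   # 203.0.113.10
print(cache.resolve("app.trafficmanager.net", now=59))  # 203.0.113.10 (still cached)
print(cache.resolve("app.trafficmanager.net", now=60))  # 203.0.113.20 (TTL expired)
print(cache.upstream_queries)                           # 2
```

Until the cached entry expires, a changed routing decision is invisible to clients behind this resolver, which is the root of the failover delay discussed in this chapter.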
Health Probe Execution
Traffic Manager continuously probes each endpoint using the configured protocol (HTTP/HTTPS/TCP) and path. Probes are sent from Azure's health check infrastructure (multiple IP ranges). If an endpoint fails to respond within the timeout (default 10 seconds) or returns a non-200 status (for HTTP), that probe is marked as failed. After the configured number of consecutive failures (Tolerated Number of Failures, default 3), the endpoint is marked as unhealthy.
Failover on Endpoint Degradation
When an endpoint is marked unhealthy, Traffic Manager stops including it in DNS responses. Depending on the routing method, traffic is redirected to the next best healthy endpoint. For Priority routing, traffic shifts to the next priority endpoint. For Performance routing, the next lowest-latency healthy endpoint is chosen. The failover is not instantaneous because DNS responses may still be cached by clients or resolvers for up to the TTL duration. To minimize downtime, use a low TTL (e.g., 60 seconds).
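A back-of-the-envelope estimate helps when comparing TTL and probe settings. The formula below is an assumption made for illustration, not an official Azure figure: it takes the worst case as one probe interval per tolerated failure plus one more failed probe, plus that probe's timeout, plus a full TTL of caching, and it ignores resolvers that hold answers longer than the TTL.

```python
def worst_case_failover_seconds(ttl, probe_interval=30, probe_timeout=10,
                                tolerated_failures=3):
    """Rough upper bound on the time before clients stop reaching a failed endpoint."""
    # Detection: the endpoint fails right after a good probe, then
    # (tolerated_failures + 1) probes must fail, the last one by timing out.
    detection = probe_interval * (tolerated_failures + 1) + probe_timeout
    # Propagation: an answer handed out just before detection lives for one TTL.
    return detection + ttl


print(worst_case_failover_seconds(ttl=300))  # 430
print(worst_case_failover_seconds(ttl=60))   # 190
print(worst_case_failover_seconds(ttl=60, probe_interval=10,
                                  probe_timeout=5, tolerated_failures=0))  # 75
```

Under these assumptions, lowering the TTL alone leaves detection time as the larger share of the delay, so probe settings matter as much as TTL.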
Scenario 1: Global E-Commerce Platform
A global e-commerce company wants to deploy its web application in three Azure regions: West US, West Europe, and Southeast Asia. They need low latency for users worldwide and automatic failover if a region goes down. They choose Traffic Manager with Performance routing. Each region has an App Service behind an Application Gateway. Traffic Manager probes each region's health endpoint. During normal operation, users are directed to the region with the lowest latency. When West US experiences an outage, Traffic Manager detects the health probe failures (after three consecutive failures) and stops returning that endpoint. Users in the Americas are then directed to West Europe or Southeast Asia based on latency. The TTL is set to 60 seconds to ensure quick failover. The team monitors endpoint health using Azure Monitor and alerts on endpoint degradation. One challenge is that cached DNS answers keep sending some users to the failed region until the TTL expires, and some ISP resolvers hold answers even longer. To mitigate this, they keep the TTL low and, before planned maintenance, lower it further in advance so that cached answers expire quickly once the endpoint is taken out of rotation.
Scenario 2: Disaster Recovery with Active-Passive
A financial services company uses an active-passive architecture for its core banking application. The primary site is in East US, and the disaster recovery (DR) site is in West US. They use Traffic Manager with Priority routing. The primary endpoint has priority 1, and the DR endpoint has priority 2. Traffic Manager probes both endpoints. Under normal conditions, all traffic goes to East US. If East US fails, Traffic Manager detects the health check failure and automatically routes all traffic to West US. They set the TTL to 300 seconds because failover time is less critical than reducing DNS load. They also configure the DR endpoint to be 'Disabled' manually during testing to prevent accidental failover. They test failover quarterly by disabling the primary endpoint and verifying traffic shifts to DR. They also use Azure Site Recovery to replicate data and VMs to the DR region.
Scenario 3: A/B Testing and Gradual Rollout
A software company wants to test a new version of their web app with a small percentage of users before full rollout. They deploy the new version in a separate App Service in the same region. They configure a Traffic Manager profile with Weighted routing. The old version gets weight 90, and the new version gets weight 10. 10% of DNS queries resolve to the new version. They monitor performance and errors. After successful testing, they gradually increase the weight of the new version to 100. They use a low TTL (60 seconds) so that changes take effect quickly. They also set up health probes for both endpoints. One pitfall: if the new version has a bug that causes it to return 500 errors, Traffic Manager will mark it as unhealthy and route all traffic to the old version, effectively aborting the test. They therefore ensure the health endpoint is robust.
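The 90/10 split and the pitfall described in this scenario can be checked with simple arithmetic: each healthy endpoint's expected share is its weight divided by the sum of healthy weights. The helper below is a hypothetical sketch for that calculation, not an Azure API.

```python
def expected_shares(weights, healthy):
    """Expected fraction of DNS answers per endpoint under Weighted routing."""
    total = sum(w for name, w in weights.items() if healthy[name])
    if total == 0:
        # No healthy endpoint: this sketch reports no traffic anywhere.
        return {name: 0.0 for name in weights}
    return {
        name: (w / total if healthy[name] else 0.0)
        for name, w in weights.items()
    }


weights = {"old": 90, "new": 10}
print(expected_shares(weights, {"old": True, "new": True}))   # {'old': 0.9, 'new': 0.1}
# If the new version fails its health probe, the test silently ends:
print(expected_shares(weights, {"old": True, "new": False}))  # {'old': 1.0, 'new': 0.0}
```

These are shares of DNS answers, not of requests; because answers are cached per resolver, the observed request split only approximates the weights.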
What AZ-305 Tests on Traffic Manager
AZ-305 objective 4.2: Design a load balancing and failover strategy. This includes selecting appropriate Azure load balancing solutions (Traffic Manager, Front Door, Application Gateway, Azure Load Balancer) based on requirements. The exam focuses on:
Routing methods: Know the difference between Performance, Priority, Weighted, Geographic, Multivalue, and Subnet. Performance uses latency data from a table, not real-time measurement. Geographic requires mapping regions.
Endpoint types: Azure endpoints, external endpoints, nested endpoints. Nested endpoints allow combining profiles.
Health monitoring: Default probe interval 30 seconds, timeout 10 seconds, tolerated failures 3. Endpoint becomes unhealthy after 3 consecutive failures.
TTL: Default 300 seconds. Lower TTL for faster failover, higher for reduced DNS load.
Failover behavior: DNS caching can delay failover. The exam may ask how to mitigate (use low TTL, but also be aware of ISP caching).
Comparison with Front Door: Traffic Manager is DNS-only (Layer 7 DNS), Front Door is HTTP/S proxy (Layer 7) with WAF, SSL offload, URL-based routing. The exam will ask which to use for given requirements.
Common Wrong Answers
'Traffic Manager can provide session affinity' – Wrong. Traffic Manager is DNS-based and cannot maintain session state. Use Application Gateway or Azure Load Balancer for session affinity.
'Performance routing uses real-time latency measurements' – Wrong. It uses a precomputed latency table updated periodically. It does not probe the endpoint for latency.
'Traffic Manager can inspect HTTP headers and route based on URL path' – Wrong. That is Front Door functionality. Traffic Manager only uses DNS.
'Setting TTL to 0 eliminates DNS caching' – Wrong. Traffic Manager accepts a TTL as low as 0, but many DNS resolvers ignore TTL=0 or enforce their own minimum cache time (e.g., 30 seconds), so actual caching behavior varies.
Specific Exam Numbers
Default TTL: 300 seconds
Default probe interval: 30 seconds
Default probe timeout: 10 seconds
Default tolerated number of failures: 3
Maximum endpoints per profile: 200
Supported routing methods: 6
Edge Cases
Nested endpoints: When you have multiple Traffic Manager profiles (e.g., one for each region) and a parent profile that combines them. The exam may test that nested endpoints allow hierarchical routing.
Geographic routing with missing region: If the query comes from a region that is not mapped to any endpoint, Traffic Manager returns a NODATA response. To avoid this, map the 'World' region to a default endpoint so that every query has an answer.
Multivalue routing: Returns up to 4 healthy endpoints in the response. The client can try them in order.
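The unmapped-region edge case for Geographic routing can be sketched as a dictionary lookup with an optional catch-all. This is a simplified illustration: real geographic mappings are hierarchical, while this hypothetical helper uses flat region names, endpoint names chosen for the example, and a "World" key as the default mapping.

```python
def geographic_answer(resolver_region, mapping):
    """Return the endpoint mapped to the resolver's region, else the World default."""
    if resolver_region in mapping:
        return mapping[resolver_region]
    # No specific mapping: fall back to "World" if present, otherwise NODATA.
    return mapping.get("World", "NODATA")


strict = {"Europe": "westeurope-app", "Asia": "southeastasia-app"}
with_default = {**strict, "World": "westus-app"}

print(geographic_answer("Europe", strict))                # westeurope-app
print(geographic_answer("South America", strict))         # NODATA
print(geographic_answer("South America", with_default))   # westus-app
```

Without a catch-all mapping, users in unmapped regions simply fail to resolve the name, which is easy to miss in testing done from mapped regions.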
How to Eliminate Wrong Answers
If the scenario requires HTTP inspection, SSL offload, or WAF, eliminate Traffic Manager and choose Front Door. If it requires session affinity or regional load balancing, eliminate Traffic Manager and choose Application Gateway or Azure Load Balancer. If it requires global DNS-based routing with simple failover or latency, choose Traffic Manager. Always check if the requirement is at global or regional level.
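The elimination rules above can be written down as a small decision function. It is a study aid that encodes only the rules stated in this section, in the order given; the function name and parameters are hypothetical, and real designs often combine several of these services.

```python
def choose_service(scope, http_features=False, session_affinity=False):
    """Apply this chapter's elimination rules to a requirement set.

    scope: "global" or "regional"
    http_features: HTTP inspection, SSL offload, or WAF is required
    session_affinity: sticky sessions are required
    """
    if http_features:
        return "Azure Front Door"
    if session_affinity or scope == "regional":
        return "Application Gateway or Azure Load Balancer"
    if scope == "global":
        return "Azure Traffic Manager"
    raise ValueError(f"unknown scope: {scope}")


print(choose_service("global"))                         # Azure Traffic Manager
print(choose_service("global", http_features=True))     # Azure Front Door
print(choose_service("regional"))                       # Application Gateway or Azure Load Balancer
print(choose_service("global", session_affinity=True))  # Application Gateway or Azure Load Balancer
```

Checking the HTTP-feature requirement first mirrors how exam scenarios are usually worded: a single mention of WAF or SSL offload rules out Traffic Manager regardless of scope.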
Traffic Manager is a DNS-based global load balancer; it does not proxy traffic.
Default TTL is 300 seconds; lower TTL (e.g., 60s) enables faster failover.
Health probes are sent every 30 seconds by default; endpoint is marked unhealthy after 3 consecutive failures.
Performance routing uses a latency table, not real-time measurements.
Traffic Manager supports six routing methods: Priority, Weighted, Performance, Geographic, Multivalue, Subnet.
Cannot provide session affinity, SSL termination, or HTTP inspection.
For HTTP/S routing, WAF, or SSL offload, use Azure Front Door instead.
Traffic Manager can use nested endpoints to combine multiple profiles.
Geographic routing requires mapping geographic regions to endpoints; queries from un-mapped regions receive a NODATA response unless a default ('World') mapping exists.
Multivalue routing returns up to 4 healthy endpoints in DNS response for client-side selection.
These come up on the exam all the time. Here's how to tell them apart.
Azure Traffic Manager
DNS-based global load balancer (Layer 7 DNS)
No SSL termination or offload
No WAF capabilities
Routing based on DNS (Performance, Priority, Weighted, Geographic, Multivalue, Subnet)
Cannot inspect HTTP headers or URL paths
Azure Front Door
HTTP/HTTPS global load balancer and CDN (Layer 7 proxy)
SSL termination and offload supported
Built-in WAF (Web Application Firewall)
Routing based on URL path, host header, query strings, etc.
Supports session affinity and HTTP-level routing
Azure Traffic Manager
Global scope (multi-region)
DNS-based, no proxy in data path
No SSL termination
No URL-based routing
No session affinity
Azure Application Gateway
Regional scope (single region)
Layer 7 proxy with full HTTP/S processing
SSL termination and end-to-end SSL
URL path-based routing, host header routing
Supports session affinity (sticky sessions) via cookie-based affinity
Mistake
Traffic Manager routes traffic based on real-time latency measurements.
Correct
Performance routing uses a latency table built from historical data, not real-time probes. It selects the endpoint with the lowest expected latency based on the client's IP range, but does not measure current network conditions.
Mistake
Traffic Manager can terminate SSL and inspect HTTP traffic.
Correct
Traffic Manager operates only at the DNS level (Layer 7 DNS). It cannot terminate SSL, inspect HTTP headers, or perform content-based routing. Those capabilities are provided by Azure Front Door or Application Gateway.
Mistake
Setting TTL to 0 ensures instant failover with no caching.
Correct
While TTL=0 tells resolvers not to cache, many DNS resolvers enforce a minimum TTL (e.g., 30 seconds) or ignore TTL=0. Additionally, client-side DNS caching may still occur. Failover is never instantaneous due to DNS propagation delays.
Mistake
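Traffic Manager is the right choice for load balancing within a single Azure region.
Correct
Traffic Manager is designed for DNS-level distribution across regions or deployments. It can point at endpoints in the same region (for example, a weighted rollout between two App Services), but it does not balance individual connections or requests. For regional load balancing within a single region, use Azure Load Balancer (Layer 4) or Application Gateway (Layer 7).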
Mistake
Traffic Manager supports session affinity (sticky sessions).
Correct
Because Traffic Manager only returns DNS responses and never sees the requests that follow, it cannot ensure that subsequent requests from the same client go to the same endpoint. Session affinity requires a component in the data path, such as Application Gateway (Layer 7, cookie-based affinity) or Azure Load Balancer (Layer 4, source IP affinity).
Traffic Manager is a DNS-based global load balancer that directs clients to endpoints based on DNS resolution. It does not inspect HTTP traffic, terminate SSL, or provide WAF. Azure Front Door is a Layer 7 HTTP/HTTPS global load balancer and CDN that terminates SSL, applies WAF rules, and routes based on URL paths, headers, and cookies. Use Traffic Manager for simple DNS-based routing and failover; use Front Door when you need advanced HTTP/S features.
Traffic Manager continuously probes endpoints. If an endpoint fails the configured number of consecutive health probes (default 3), it is marked unhealthy and removed from DNS responses. The next DNS query will return a different healthy endpoint based on the routing method (e.g., next priority for Priority routing, next lowest-latency for Performance). However, DNS caching by clients and resolvers can delay failover for up to the TTL duration. To minimize delay, set a low TTL (e.g., 60 seconds).
No. Traffic Manager is a public DNS-based service that resolves to public IP addresses. For internal load balancing within a VNet, use Azure Load Balancer (Layer 4) or Application Gateway (Layer 7). Traffic Manager endpoints must be publicly accessible or have public DNS names.
The default probe interval is 30 seconds. The default timeout for each probe is 10 seconds. The default number of tolerated failures before marking an endpoint unhealthy is 3. These values are configurable: the interval can be set to 10 or 30 seconds, the timeout can be 5 to 10 seconds with a 30-second interval (5 to 9 seconds with a 10-second interval), and tolerated failures can be 0-9.
Yes, Traffic Manager supports Weighted routing method. You assign a weight (1-1000) to each endpoint, and traffic is distributed proportionally. This is useful for A/B testing, gradual rollouts, or distributing load unevenly across endpoints.
If all endpoints in a profile are unhealthy, Traffic Manager does not stop answering. It makes a best-effort attempt by treating all enabled endpoints as if they were healthy and answering according to the routing method. The client may attempt to connect to an unhealthy endpoint and fail, but at least there is a chance of success if the endpoint recovers quickly. To give users a controlled experience in this situation, you can include a last-resort endpoint that is expected to stay healthy, such as a static error page.
Geographic routing directs traffic based on the geographic location of the client's DNS resolver IP. You map regions (e.g., 'Europe', 'Asia') to specific endpoints. The client's resolver IP is looked up in a GeoIP database. If the region is mapped, traffic goes to that endpoint. If not, you can configure a default endpoint. This is used for data sovereignty, content localization, or regulatory compliance.