AZ-104 | Chapter 129 of 168 | Objective 4.2

Load Balancer Frontend, Backend, and Health Probe Rules

This chapter covers the core components of Azure Load Balancer: frontend IP configurations, backend pools, and health probes. Understanding how these elements work together is critical for designing highly available and scalable applications in Azure. On the AZ-104 exam, approximately 15-20% of networking questions touch on load balancer configuration, making this a high-yield topic. You will learn the exact configuration options, default values, and traffic flow mechanics that the exam tests.

25 min read
Intermediate
Updated May 31, 2026

Azure Load Balancer as a Hotel Concierge

Imagine a busy hotel with a single concierge desk (the frontend IP) that guests call for services. The concierge receives every incoming request (traffic) and decides which staff member (backend VM) should handle it. The concierge doesn't pick at random; they follow rules. First, they check that each staff member is available by sending a quick ping (health probe) every 5 seconds. If a staff member misses two pings in a row (2 x 5-second intervals), the concierge marks them as unavailable and stops sending them new requests. To distribute work, the concierge does not use round-robin or least-connections. Instead, they derive the assignment from the details of the request itself (a hash of source IP, source port, destination IP, destination port, and protocol), so every message belonging to the same task reaches the same staff member without the concierge having to remember the guest. A different task from the same guest may land with a different staff member unless session persistence is enabled, in which case the concierge keys the assignment on the guest's address alone (source IP affinity). This mirrors Azure Load Balancer: by default it offers no stickiness between connections, but it can be configured for it using source IP affinity.

How It Actually Works

What is Azure Load Balancer and Why It Exists

Azure Load Balancer operates at Layer 4 (Transport layer) of the OSI model, distributing incoming TCP and UDP traffic across healthy backend instances. It is designed to provide high availability and scalability by spreading traffic across multiple virtual machines or instances within a virtual network. The service supports both inbound and outbound scenarios, though this chapter focuses on inbound traffic distribution.

Frontend IP Configuration

The frontend IP configuration is the public or private IP address that clients connect to. It acts as the entry point for all incoming traffic. You can configure:

- Public frontend: A public IP address (Standard or Basic SKU) that is reachable from the internet. The load balancer translates the destination IP to the backend instance's private IP.
- Private frontend: A private IP address within a virtual network subnet. Used for internal load balancing between VMs.

Each load balancer can have multiple frontend IP configurations, each with its own set of rules. For example, you could have one public frontend for web traffic and one private frontend for API traffic.

Backend Pool

The backend pool is a collection of resources that will handle the incoming traffic. Backend pool members can be:

Virtual machines (via NIC or IP address)

Virtual machine scale sets (VMSS)

IP addresses (for hybrid scenarios)

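Health probes work the same way for both association types: you define the probe on the load balancer, attach it to a load balancing rule, and the load balancer probes every pool member. The difference is in how membership is managed. NIC-based pools reference a VM's network interface IP configuration, so membership follows the VM. IP-based pools list addresses in the virtual network directly, so you must keep that list current yourself.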

Load Balancing Rules

A load balancing rule defines how traffic arriving at the frontend is distributed to the backend pool. Key parameters:

- Protocol: TCP or UDP
- Frontend port: The port on the frontend IP (e.g., 80)
- Backend port: The port on the backend instances (e.g., 80)
- Health probe: Associates a health probe to determine backend health
- Session persistence: None, Client IP, or Client IP and Protocol
- Idle timeout: Range 4-30 minutes, default 4 minutes
- Floating IP: Direct Server Return (DSR) mode
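The parameters above can be pictured as a small record with validation. The following is a minimal sketch, not Azure's API: the class name `LoadBalancingRule`, its field names, and the `web_rule` example are all hypothetical, and the 4-30 minute range is simply the one quoted in this chapter.

```python
from dataclasses import dataclass

VALID_PROTOCOLS = {"Tcp", "Udp"}
VALID_PERSISTENCE = {"None", "Client IP", "Client IP and Protocol"}


@dataclass(frozen=True)
class LoadBalancingRule:
    """Hypothetical model of the rule parameters listed in this chapter."""
    protocol: str
    frontend_port: int
    backend_port: int
    probe_name: str
    session_persistence: str = "None"   # default: no stickiness
    idle_timeout_minutes: int = 4       # default quoted in this chapter
    floating_ip: bool = False           # Direct Server Return off by default

    def __post_init__(self):
        if self.protocol not in VALID_PROTOCOLS:
            raise ValueError(f"protocol must be one of {sorted(VALID_PROTOCOLS)}")
        for port in (self.frontend_port, self.backend_port):
            if not 1 <= port <= 65535:
                raise ValueError(f"port out of range: {port}")
        if self.session_persistence not in VALID_PERSISTENCE:
            raise ValueError("unknown session persistence option")
        if not 4 <= self.idle_timeout_minutes <= 30:
            raise ValueError("idle timeout must be 4-30 minutes")


# A rule equivalent to the port-80 web example used in this chapter.
web_rule = LoadBalancingRule("Tcp", 80, 80, probe_name="MyProbe")
```

Constructing a rule with `idle_timeout_minutes=2` raises `ValueError`, which is the kind of out-of-range setting the exam likes to offer as a distractor.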

Health Probes

Health probes determine which backend instances are healthy and can receive traffic. Three types:

1. TCP probe: Attempts a TCP connection to the backend port. If the connection succeeds, the instance is healthy.
2. HTTP probe: Sends an HTTP GET request to a specified path (e.g., /health). A 200 OK response indicates health.
3. HTTPS probe: Same as HTTP but with TLS encryption.
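To make the HTTP probe contract concrete, here is a minimal sketch of a backend that answers `/health` with 200 OK, plus a function that imitates the probe's pass/fail decision. It uses only the Python standard library. The names `HealthHandler`, `http_probe`, and `start_health_server` are hypothetical, and the probe function is an illustration of the "200 means healthy" rule, not a reproduction of how Azure issues probes.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with 200 OK and everything else with 404."""

    def do_GET(self):
        if self.path == "/health":
            body = b"OK"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def http_probe(url, timeout=2.0):
    """Return True only when the endpoint answers with HTTP 200."""
    # Ignore any system proxy settings so the request goes straight to the target.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Non-2xx statuses, refused connections, and timeouts all count as failures.
        return False


def start_health_server(port=0):
    """Start the demo backend on localhost; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_health_server()` and probing `http://127.0.0.1:<port>/health` returns `True`, while any other path returns `False` because the handler answers 404.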

Probe configuration:

- Interval: Time between probes (5-300 seconds, default 5 seconds for Standard SKU)
- Unhealthy threshold: Number of consecutive failures before marking an instance unhealthy (2-100, default 2)
- Healthy threshold: Number of consecutive successes after being unhealthy to mark an instance as healthy (2-100, default 2 for TCP, 1 for HTTP/HTTPS)
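The threshold logic above amounts to counting consecutive results that disagree with the current state. The sketch below is a simplified illustration under two assumptions: the instance starts out healthy, and one class (the hypothetical `ProbeTracker`) holds both thresholds. It does not model probe timeouts or the interval itself.

```python
class ProbeTracker:
    """Tracks one backend's health from a stream of probe results."""

    def __init__(self, unhealthy_threshold=2, healthy_threshold=2):
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy_threshold = healthy_threshold
        self.healthy = True   # assumption: start in the healthy state
        self._streak = 0      # consecutive results contradicting the current state

    def record(self, success):
        """Feed one probe result; return the health state after applying it."""
        if success == self.healthy:
            # Result agrees with the current state, so any streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if success else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = success
            self._streak = 0
        return self.healthy


tracker = ProbeTracker()
states = [tracker.record(result) for result in (False, False, True, True)]
print(states)  # [True, False, False, True]
```

A single failed probe followed by a success leaves the instance healthy, because the failures are not consecutive.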

Traffic Flow Mechanics

When a client sends a packet to the frontend IP:

1. The load balancer uses a 5-tuple hash (source IP, source port, destination IP, destination port, protocol) to determine which healthy backend instance receives the packet. The hash is consistent for the duration of the flow, so all packets of the same flow go to the same backend.
2. If session persistence is enabled, the hash uses fewer fields: Client IP hashes on source IP and destination IP (a 2-tuple), and Client IP and Protocol adds the protocol (a 3-tuple). Connections from the same client then reach the same backend regardless of source port.
3. The load balancer rewrites the destination to the backend instance's private IP and port (unless floating IP is enabled) and preserves the original source IP address. The backend sees the client's source IP, not the load balancer's IP.
4. The backend's responses return to the client with the frontend IP as their source address. Without floating IP, the Azure platform reverses the destination rewrite on the way out; with floating IP, the backend itself replies from the frontend IP configured on its loopback interface.
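The distribution modes can be sketched as a choice of which fields feed the hash. This is an illustration only: Azure does not publish its hash function, so SHA-256 stands in for it here, and the names `Flow`, `hash_key`, and `pick_backend` are hypothetical. The field sets per mode (5-tuple by default, 2-tuple for Client IP, 3-tuple for Client IP and Protocol) follow Azure's documented distribution modes.

```python
import hashlib
from typing import NamedTuple


class Flow(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def hash_key(flow, persistence="None"):
    """Select the fields that feed the hash for each persistence mode."""
    if persistence == "Client IP":
        return (flow.src_ip, flow.dst_ip)                  # 2-tuple
    if persistence == "Client IP and Protocol":
        return (flow.src_ip, flow.dst_ip, flow.protocol)   # 3-tuple
    return tuple(flow)                                     # default 5-tuple


def pick_backend(flow, backends, persistence="None"):
    """Map a flow onto one backend; SHA-256 is a stand-in hash."""
    key = "|".join(str(part) for part in hash_key(flow, persistence))
    digest = hashlib.sha256(key.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


backends = ["vm1", "vm2", "vm3"]
first = Flow("203.0.113.7", 50000, "20.185.0.1", 80, "tcp")
second = first._replace(src_port=50001)  # same client, new connection

# With Client IP persistence the source port is ignored, so both
# connections always map to the same backend.
same = pick_backend(first, backends, "Client IP") == pick_backend(second, backends, "Client IP")
print(same)  # True
```

With the default 5-tuple mode, `first` and `second` have different hash keys, so they may or may not land on the same backend.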

Interaction with Related Technologies

Azure Application Gateway: Operates at Layer 7 and can perform SSL termination, URL-based routing, and Web Application Firewall. Use App Gateway for HTTP/HTTPS traffic needing advanced features; use Load Balancer for low-latency, high-throughput TCP/UDP traffic.

Traffic Manager: DNS-level load balancing across regions. Load Balancer handles regional traffic distribution.

NAT Gateway (formerly Virtual Network NAT): Provides outbound connectivity for VMs. Load Balancer can handle both inbound and outbound traffic, but NAT Gateway is preferred for outbound.

Configuration Examples

Create a load balancer with Azure CLI:

# Create a public IP
az network public-ip create --resource-group MyRG --name MyPublicIP --sku Standard

# Create a load balancer
az network lb create --resource-group MyRG --name MyLB --sku Standard --public-ip-address MyPublicIP --frontend-ip-name MyFrontend --backend-pool-name MyBackendPool

# Create a health probe
az network lb probe create --resource-group MyRG --lb-name MyLB --name MyProbe --protocol tcp --port 80 --interval 5 --threshold 2

# Create a load balancing rule
az network lb rule create --resource-group MyRG --lb-name MyLB --name MyRule --protocol tcp --frontend-port 80 --backend-port 80 --frontend-ip-name MyFrontend --backend-pool-name MyBackendPool --probe-name MyProbe

Default Values and Timers

Idle timeout: 4 minutes (can be increased to 30 minutes)

Probe interval: 5 seconds (Standard SKU), 15 seconds (Basic SKU)

Unhealthy threshold: 2 consecutive failures

Healthy threshold: 2 (TCP), 1 (HTTP/HTTPS)

Session persistence: None by default

Distribution mode: 5-tuple hash (default), or source IP affinity

SKU Differences

Basic SKU: No SLA, no availability zones, no health probe interval below 15 seconds, no HA ports, no outbound rules, limited to 300 backend instances in a single availability set or scale set.

Standard SKU: 99.99% SLA, availability zones, health probe interval as low as 5 seconds, HA ports, outbound rules, up to 1000 backend instances, required for VMSS.

High Availability Ports (HA Ports)

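Standard Load Balancer supports HA Ports on internal (private) frontends. An HA Ports rule is a single load balancing rule with the protocol set to All and the port set to 0, which matches every port (0-65535) for both TCP and UDP. This is used with Network Virtual Appliances (NVAs) to load balance all traffic to a set of VMs.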
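The matching behavior of an HA Ports rule can be sketched next to an ordinary single-port rule. The `Rule` class and `matches` function are hypothetical. The only Azure-specific convention borrowed here is that an HA Ports rule is expressed as protocol All with port 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    protocol: str       # "Tcp", "Udp", or "All"
    frontend_port: int  # 0 stands for "every port" (the HA Ports convention)


def matches(rule, protocol, port):
    """Return True when a packet's protocol and destination port hit the rule."""
    protocol_ok = rule.protocol == "All" or rule.protocol == protocol
    port_ok = rule.frontend_port == 0 or rule.frontend_port == port
    return protocol_ok and port_ok


ha_ports = Rule("ha-ports", "All", 0)
web_only = Rule("web", "Tcp", 80)

print(matches(ha_ports, "Udp", 53))   # True: HA Ports matches any protocol and port
print(matches(web_only, "Tcp", 443))  # False: the rule covers only TCP port 80
```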

Floating IP (Direct Server Return)

When enabled, the load balancer does not rewrite the destination IP, so the backend VM must have the frontend IP configured on a loopback interface. The backend then replies to the client with the frontend IP as its source address. This is used for SQL Server Always On availability groups and other scenarios where the backend must see the original destination IP.

Walk-Through

1. Client sends request to frontend IP

The client initiates a TCP connection or sends a UDP datagram to the public frontend IP address of the load balancer. For example, a web browser sends an HTTP request to 20.185.0.1:80. The packet arrives at Azure's edge network and is forwarded to the load balancer's frontend IP configuration. The load balancer inspects the destination IP and port, and matches it to a load balancing rule. If no rule matches, the packet is dropped.

2. Load balancer performs 5-tuple hash

The load balancer computes a hash using the source IP, source port, destination IP, destination port, and protocol. This hash determines which backend instance in the pool receives the packet. The hash is deterministic for the duration of the flow, so all packets of the same TCP connection go to the same backend. If session persistence is set to Client IP, the hash uses only the source and destination IP addresses (a 2-tuple), so all connections from the same client to that frontend go to the same backend. Client IP and Protocol adds the protocol (a 3-tuple).

3. Load balancer checks health probe status

The load balancer only considers backend instances whose health probe status is healthy when it places a new flow, so the hash in step 2 is effectively computed over the healthy members of the pool. Azure Load Balancer has no round-robin or least-connections mode; if an instance is marked unhealthy, new flows are simply hashed across the remaining healthy instances. The health probe runs independently of client traffic: a TCP probe attempts a connection to the backend port every 5 seconds, and if two consecutive probes fail, the instance is marked unhealthy.
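One way to picture the interaction between health status and flow placement is to hash only across the members currently marked healthy. This is a simplified sketch for new flows: the function name `pick_healthy_backend` is hypothetical, SHA-256 is a stand-in for the undisclosed hash, and handling of already-established flows is not modeled.

```python
import hashlib


def pick_healthy_backend(flow_key, health):
    """Choose a backend for a new flow from the healthy members only.

    flow_key: string identifying the flow (for example, a joined 5-tuple).
    health:   dict mapping backend name to True (healthy) or False.
    Returns None when no backend is healthy.
    """
    healthy = sorted(name for name, ok in health.items() if ok)
    if not healthy:
        return None  # nothing can accept new flows
    digest = hashlib.sha256(flow_key.encode()).digest()
    return healthy[int.from_bytes(digest[:8], "big") % len(healthy)]


health = {"vm1": True, "vm2": False, "vm3": True}
choice = pick_healthy_backend("203.0.113.7|50000|20.185.0.1|80|tcp", health)
print(choice in ("vm1", "vm3"))  # True: vm2 is never selected while unhealthy
```

If every entry in `health` is `False`, the function returns `None`, which corresponds to the case where clients see connection timeouts.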

4. Packet forwarded to backend instance

The load balancer rewrites the destination IP to the private IP of the backend instance and forwards the packet. The source IP remains the client's IP (unless using SNAT, which is not typical for inbound rules). The backend VM receives the packet and processes it. For TCP, the three-way handshake is completed between the client and backend directly (the load balancer does not terminate the connection).

5. Backend response sent to client

The backend VM addresses its response packets to the client's IP address (since the request's source IP is the client's). The client sees them arrive from the frontend IP in both modes. Without floating IP, the Azure platform reverses the destination rewrite on the way out, restoring the frontend IP as the source. With floating IP (DSR) enabled, no rewrite happened on the way in, so the backend itself replies from the frontend IP configured on its loopback interface. In neither case does the load balancer terminate or proxy the connection.

What This Looks Like on the Job

Scenario 1: High-Availability Web Tier

A SaaS company deploys a three-tier application with web, API, and database layers. For the web tier, they use a Standard Load Balancer with a public frontend. The backend pool contains 10 VMs in an availability set. They configure a TCP health probe on port 80 with a 5-second interval and an unhealthy threshold of 2. If a VM crashes or the web server process stops, new traffic is redirected within roughly 10 seconds (2 failed probes x 5 seconds). In production, they monitor the load balancer metrics (e.g., Data path availability, Health probe status) to alert on backend pool degradation. Common misconfiguration: setting the probe interval too high (e.g., 30 seconds) leads to slower failover, causing user-visible errors during a failure.
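The failover timing in this scenario is simple arithmetic: probe interval multiplied by the unhealthy threshold. The helper below is a rough estimate only. The function name `detection_window_seconds` is hypothetical, and the calculation ignores probe timeouts and where in the interval the failure happens.

```python
def detection_window_seconds(interval_seconds, unhealthy_threshold):
    """Approximate time for consecutive failed probes to mark an instance unhealthy."""
    return interval_seconds * unhealthy_threshold


for interval in (5, 30):
    window = detection_window_seconds(interval, unhealthy_threshold=2)
    print(f"interval={interval}s -> about {window}s to detect a failure")
# interval=5s -> about 10s to detect a failure
# interval=30s -> about 60s to detect a failure
```

Raising the interval from 5 to 30 seconds stretches detection from about 10 seconds to about a minute, which is why the high-interval misconfiguration produces user-visible errors.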

Scenario 2: Internal Load Balancing for Microservices

A financial services company uses an internal Standard Load Balancer to distribute traffic between microservices running on VMs in a virtual network. Each microservice has its own backend pool and health probe. They use HTTP probes with a custom path like /health to check application-level health, not just TCP connectivity. This allows them to detect when the application is running but returning errors. They also enable session persistence (Client IP) for stateful services like shopping carts. In production, they must ensure the backend VMs are in the same region and virtual network as the load balancer. A common mistake is using a basic SKU load balancer for production, which lacks SLA and has lower backend limits.

Scenario 3: Network Virtual Appliance (NVA) Load Balancing

An enterprise deploys third-party firewalls as NVAs in a hub-and-spoke topology. They use a Standard Load Balancer with HA Ports to distribute all traffic across two NVAs. The frontend is a private IP in the hub subnet, and the backend pool contains the NVA VMs. Health probes are TCP probes on a port the NVA listens on, such as its management port. The load balancer never rewrites the source address, so the NVAs see the original client IP whether or not floating IP (DSR) is enabled; floating IP only controls whether the destination address is rewritten. In this scenario, a mismatched health probe (e.g., an HTTP probe against a non-HTTP port) causes all NVAs to be marked unhealthy, dropping all traffic. If they do enable floating IP, they must also configure each NVA's loopback interface with the frontend IP.

How AZ-104 Actually Tests This

AZ-104 Exam Focus: Load Balancer Frontend, Backend, and Health Probe Rules

This topic falls under Objective 4.2: Configure and manage Azure Load Balancer. The exam tests your ability to:

Choose the correct SKU (Basic vs Standard) for a given scenario

Configure frontend IP (public vs private)

Configure backend pool (NIC-based vs IP-based)

Create load balancing rules with proper ports, protocol, and health probes

Select the appropriate health probe type (TCP, HTTP, HTTPS)

Understand session persistence options

Know default values (idle timeout 4 min, probe interval 5 sec, unhealthy threshold 2)

Common Wrong Answers and Traps

1. Choosing Basic SKU for production: Candidates see 'Basic' and think it's sufficient. But Basic has no SLA, no availability zones, and limited features. The exam will describe a production scenario requiring 99.99% SLA; the correct answer is Standard.

2. Health probe on the wrong port: Candidates configure the health probe to connect to the frontend port instead of the backend port. The probe must target the backend port on the VM.

3. Session persistence misunderstanding: Candidates think session persistence is always needed. The exam tests when to use 'None' (stateless apps) vs 'Client IP' (stateful apps).

4. Idle timeout too low: Candidates forget that the default is 4 minutes. For long-lived connections (e.g., SSH), they must increase it.

Specific Numbers and Terms to Memorize

Default idle timeout: 4 minutes (range 4-30)

Default probe interval: 5 seconds (Standard), 15 seconds (Basic)

Default unhealthy threshold: 2

Default healthy threshold: 2 (TCP), 1 (HTTP/HTTPS)

Maximum backend instances: 300 (Basic), 1000 (Standard)

HA Ports: 0-65535

Distribution modes: 5-tuple (default), source IP affinity

Edge Cases

UDP health probes: Not supported. Use TCP or HTTP/HTTPS probes even for UDP rules.

Floating IP: Requires backend VM to have frontend IP on loopback. If not configured, traffic fails.

Backend pool across regions: Not supported. Load balancer is regional.

Multiple frontends: Each rule must reference a single frontend. You can have multiple frontends on one load balancer.

How to Eliminate Wrong Answers

If the scenario mentions 'high availability' or 'SLA', eliminate Basic SKU.

If the scenario mentions 'HTTP health check with custom path', eliminate TCP probe.

If the scenario mentions 'sticky sessions', look for session persistence options.

If the scenario mentions 'all ports', look for HA Ports rule.

Key Takeaways

Frontend IP can be public or private; public is for internet-facing apps, private for internal.

Backend pool members can be NIC-based (membership follows the VM's network interface) or IP-based (you maintain the address list); health probes work the same way for both.

Health probes must target the backend port, not the frontend port.

Default idle timeout is 4 minutes; increase for long-lived connections like SSH.

Standard SKU is required for production workloads needing SLA and availability zones.

Session persistence 'Client IP' ensures all connections from same client go to same backend.

HA Ports rule (0-65535) is used for NVAs; it is available only on an internal Standard Load Balancer, and floating IP (DSR) is optional.

UDP rules still need TCP or HTTP health probes.

Load balancer is Layer 4 only; for Layer 7 features, use Application Gateway.

Basic SKU has no SLA; never use for production in exam scenarios.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Basic Load Balancer

No SLA (the 99.99% SLA is not available)

No availability zone support

Health probe interval minimum 15 seconds

Maximum 300 backend instances

No HA Ports support

No outbound rules

Open by default (network security groups optional)

Standard Load Balancer

99.99% SLA

Supports availability zones

Health probe interval as low as 5 seconds

Maximum 1000 backend instances

Supports HA Ports

Supports outbound rules

Secure by default (inbound traffic is blocked until a network security group allows it)

Watch Out for These

Mistake

Health probes check the frontend IP port.

Correct

Health probes check the backend port on the VM, not the frontend port. The probe connects to the backend IP and port to determine health.

Mistake

Basic Load Balancer supports availability zones.

Correct

Only Standard Load Balancer supports availability zones. Basic has no availability zone support at all: it is neither zonal nor zone-redundant, and it cannot distribute traffic across zones.

Mistake

Session persistence is enabled by default.

Correct

Session persistence is set to 'None' by default. You must explicitly configure 'Client IP' or 'Client IP and Protocol' if needed.

Mistake

HTTP health probes require the backend to return any status code.

Correct

HTTP probes require a 200 OK response. Any other status code (e.g., 301, 404) is considered a failure.

Mistake

You can use a UDP health probe for UDP load balancing rules.

Correct

Azure Load Balancer does not support UDP health probes. You must use TCP or HTTP/HTTPS probes even for UDP rules.


Frequently Asked Questions

What is the difference between Basic and Standard Load Balancer SKU?

Standard SKU offers a 99.99% SLA, availability zones, lower health probe intervals (5 seconds), a higher backend pool limit (1000 instances), HA Ports, and outbound rules, and it is secure by default. Basic SKU has no SLA, no zones, a 15-second probe interval, and a 300-instance limit, and it is open by default. Always choose Standard for production workloads.

Can I use the same health probe for multiple load balancing rules?

Yes, you can associate one health probe with multiple rules. However, the probe must be configured to check the port that is common across all rules. For example, if you have rules for port 80 and 443, you can use a TCP probe on port 80 for both if the backend responds on that port.

What happens if all backend instances are unhealthy?

If all instances are unhealthy, the load balancer has no healthy backend to send new flows to, so clients see connection timeouts. To avoid this, make sure the probe checks something the application actually serves and keep enough instances in the pool to survive failures. Azure Load Balancer has no built-in backup pool; failing over to another deployment requires a separate service such as Traffic Manager.

How does session persistence work with Azure Load Balancer?

Session persistence (sticky sessions) ensures that all traffic from a client goes to the same backend VM. Options: 'None' (no stickiness, 5-tuple hash), 'Client IP' (same source IP goes to same backend), 'Client IP and Protocol' (same source IP and protocol go to same backend). It works by hashing fewer fields: source and destination IP for 'Client IP' (a 2-tuple), plus the protocol for 'Client IP and Protocol' (a 3-tuple).

What is floating IP (Direct Server Return)?

Floating IP, or DSR, allows the backend VM to respond directly to the client using the frontend IP as the source address. This requires the backend VM to have the frontend IP configured on its loopback interface. It is used for scenarios like SQL AlwaysOn Availability Groups and NVAs.

Can I add VMs from different availability zones to the same backend pool?

Yes, but only with Standard Load Balancer. You can add VMs from different zones to the same backend pool. The load balancer will distribute traffic across zones, but if you want zone-redundant frontend, use a zone-redundant public IP.

What is the default distribution algorithm for Azure Load Balancer?

The default distribution algorithm is a 5-tuple hash (source IP, source port, destination IP, destination port, protocol). This ensures that packets of the same flow go to the same backend. It is not round-robin per packet; it is per flow.
