SAA-C03 | Chapter 81 of 189 | Objective 2.2

Gateway Load Balancer (GWLB) for Inline Security

This chapter covers the AWS Gateway Load Balancer (GWLB), a fully managed service designed for deploying, scaling, and managing third-party virtual network appliances—such as firewalls, intrusion detection/prevention systems (IDS/IPS), and deep packet inspection (DPI) appliances—in a transparent inline mode. For the SAA-C03 exam, understanding GWLB is critical for designing resilient architectures that require traffic inspection without compromising performance or availability. Approximately 5-10% of exam questions touch on network load balancing or traffic inspection, with GWLB appearing in scenarios involving security compliance, east-west traffic inspection, and integration with AWS Transit Gateway.

25 min read
Intermediate
Updated May 31, 2026

Airport Security Checkpoint with External Screening

Imagine an airport where passengers must pass through a security checkpoint before entering the departure lounge. In a traditional setup, the checkpoint is inside the terminal—passengers enter, get screened, then proceed. Now, consider a new design: the airport contracts a third-party security company to run the checkpoint. The airport installs a special gate that, when a passenger arrives, redirects them to the external security company's facility. The security company screens the passenger (inspects, X-rays, etc.) and then sends them back to the same gate, which now lets them into the lounge. The airport never sees the passenger's internal details; it only sees the security company's 'cleared' status. Similarly, a Gateway Load Balancer (GWLB) sits in the network path, intercepting traffic and redirecting it to a fleet of third-party security appliances (firewalls, IDS/IPS) that inspect and forward the traffic back to the GWLB, which then sends it to its original destination. The GWLB is transparent to the endpoints—they don't know the traffic was inspected. Just like the airport gate, the GWLB uses a GENEVE encapsulation tunnel to redirect traffic to the security appliances, ensuring low overhead and maintaining original packet headers. The security appliances can be scaled horizontally, and the GWLB handles health checks and load distribution, much like the airport manages multiple security lanes.

How It Actually Works

What is Gateway Load Balancer and Why It Exists

The Gateway Load Balancer (GWLB) is an AWS service introduced in 2020 to address a specific gap: the need to deploy third-party virtual appliances (like firewalls, IDS/IPS, or DPI) in a transparent inline mode across multiple VPCs. Before GWLB, customers had to manually route traffic through a chain of appliances using route tables, which was complex, brittle, and difficult to scale. Alternatives like using a Network Load Balancer (NLB) in front of appliances had limitations: an NLB operates at Layer 4 (TCP/UDP) and is itself the destination of the connection, so it cannot sit transparently in the path of traffic that is addressed to some other host. GWLB solves this by acting as a single entry and exit point for traffic, redirecting packets to a target group of appliances using GENEVE encapsulation, while remaining transparent to the endpoints.

How GWLB Works Internally – The Mechanism

GWLB operates at Layer 3 (network layer) and uses the GENEVE encapsulation protocol (Generic Network Virtualization Encapsulation, RFC 8926) to tunnel traffic between the GWLB and the virtual appliances. Here is the step-by-step packet flow:

1. Traffic Arrives: A packet destined for a workload (e.g., an EC2 instance) arrives at the GWLB endpoint. The endpoint is a network interface in a VPC subnet that connects to the GWLB through a VPC endpoint service and is used as a route table target. The source and destination IPs are the original ones (no NAT).

2. Load Balancing Decision: The GWLB selects a healthy target appliance from its target group based on a flow hash (by default the 5-tuple: source IP, destination IP, source port, destination port, protocol). All packets of the same flow go to the same appliance to maintain state.

3. GENEVE Encapsulation: The GWLB encapsulates the original packet inside a GENEVE header carried over UDP port 6081. The outer IP header is addressed to the appliance's IP (the target), and the inner packet remains intact. The GENEVE header includes a Virtual Network Identifier (VNI), which GWLB sets to 0, and option fields (TLVs). GWLB uses the options to carry metadata such as the GWLB endpoint ID and a flow cookie, so the header is not empty, and the appliance must return these options unchanged.

4. Appliance Processing: The appliance receives the GENEVE packet, decapsulates it, inspects the original packet, and makes a forwarding decision. The appliance must be configured to accept GENEVE traffic and must forward allowed packets back to the GWLB, also encapsulated in GENEVE. In practice it swaps the outer source and destination addresses, so the return packet is addressed to the GWLB's IP address and not to the original source or destination.

5. Return to GWLB: The GWLB receives the encapsulated packet from the appliance, decapsulates it, and forwards the original packet through the GWLB endpoint to its intended destination (the workload). The source IP of the packet is the original source (e.g., an internet client), not the appliance.

6. Return Traffic: For return traffic from the workload to the client, the same process occurs in reverse: the packet is routed to the GWLB endpoint, is encapsulated to the same appliance (based on flow affinity), inspected, and then forwarded to the client.
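The load balancing decision in the flow above can be sketched in a few lines of Python. This is only an illustration of flow affinity, not AWS's actual hashing algorithm, which is not public. The function names, the SHA-256 hash, the sorting trick that makes both directions of a flow map to the same appliance, and the sample addresses are all assumptions made for the example.

```python
import hashlib


def flow_key(src_ip, src_port, dst_ip, dst_port, protocol):
    """Build a direction-independent key for a flow (illustrative only)."""
    # Sorting the two endpoints means request and response share one key.
    low, high = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    return f"{low[0]}:{low[1]}|{high[0]}:{high[1]}|{protocol}".encode()


def pick_target(flow, healthy_targets):
    """Pick one healthy appliance for a flow using a stable hash."""
    if not healthy_targets:
        raise ValueError("no healthy targets registered")
    digest = hashlib.sha256(flow_key(*flow)).digest()
    index = int.from_bytes(digest[:8], "big") % len(healthy_targets)
    return healthy_targets[index]


# Hypothetical appliances and a hypothetical client/workload pair.
appliances = ["fw-a", "fw-b", "fw-c"]
request = ("203.0.113.10", 51514, "10.0.1.25", 443, "tcp")
response = ("10.0.1.25", 443, "203.0.113.10", 51514, "tcp")

# Both directions of the same connection land on the same appliance.
print(pick_target(request, appliances) == pick_target(response, appliances))  # True
```

A simple modulo scheme like this one reshuffles many flows when the target list changes. It is shown only to make the idea of hashing a flow to an appliance concrete.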

Key Components, Values, Defaults, and Timers

Gateway Load Balancer (GWLB): The main resource. It is regional and can span multiple Availability Zones (AZs). It has a DNS name but is typically used with an endpoint.

Gateway Load Balancer Endpoint (GWLBe): A VPC endpoint of type GatewayLoadBalancer, powered by AWS PrivateLink, that serves as the entry/exit point for traffic. It is created in a subnet and connects to a GWLB through a VPC endpoint service. Unlike an interface endpoint, it can be the next hop in a route table, and traffic is routed to the endpoint via route table entries.

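Target Group: A logical grouping of appliances (registered as targets). Targets are EC2 instances or IP addresses (on-premises via AWS Direct Connect or VPN). Health checks are performed by the GWLB to determine target health. By default the health check is a TCP check on port 80 every 10 seconds; HTTP and HTTPS checks are also available. The default timeout and healthy/unhealthy thresholds have differed between versions of the AWS documentation (this guide has used a 10-second timeout and thresholds of 3), so confirm the current values in the Elastic Load Balancing API reference before relying on them.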

GENEVE Protocol: Port 6081 is the default UDP port for GENEVE. Appliances must support GENEVE encapsulation (most modern third-party appliances do). The MTU of the appliance interfaces must account for the GENEVE overhead, which AWS documents as 68 bytes (outer IP + UDP + GENEVE header and options). GWLB supports packets of up to 8,500 bytes, so an appliance that should handle the largest packets needs an MTU of at least 8,568 bytes. GWLB does not fragment or reassemble packets.

Flow Affinity: The GWLB uses a hash of the 5-tuple to ensure all packets of a flow go to the same appliance. This is critical for stateful appliances. The hash is consistent as long as the target group does not change (e.g., addition/removal of targets).

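Sticky Sessions: Cookie-based sticky sessions are not supported. Flow stickiness is configurable on the target group: the default is the 5-tuple, and it can be changed to a 3-tuple (source IP, destination IP, protocol) or a 2-tuple (source IP, destination IP).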

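Cross-Zone Load Balancing: Disabled by default. Each GWLB node sends traffic only to appliances in its own AZ unless you enable cross-zone load balancing, in which case traffic can be distributed across appliances in all enabled AZs.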

Proxy Protocol v2: Not supported by GWLB; Proxy Protocol is a Network Load Balancer feature. It is also unnecessary here, because the appliance sees the original client IP in the inner packet.

Preserve Client IP: GWLB does not change the source IP of the packet. The appliance sees the original client IP in the inner packet. However, the appliance must be configured to return traffic to the GWLB, not directly to the client.
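The MTU point in the list above is easier to remember as arithmetic. The sketch below assumes an IPv4 outer header and breaks AWS's documented 68-byte overhead into its parts. The 32 bytes of GENEVE options are an assumption about how the total is made up (three metadata options, each with a 4-byte option header), and the constant and function names are made up for the example.

```python
# Sizes in bytes. The split of the 68-byte total is an assumption for illustration.
OUTER_IPV4_HEADER = 20
OUTER_UDP_HEADER = 8
GENEVE_BASE_HEADER = 8   # fixed part of the GENEVE header (RFC 8926)
GENEVE_OPTIONS = 32      # assumed: (4 + 8) + (4 + 8) + (4 + 4) for three options

GENEVE_OVERHEAD = (
    OUTER_IPV4_HEADER + OUTER_UDP_HEADER + GENEVE_BASE_HEADER + GENEVE_OPTIONS
)


def encapsulated_size(inner_packet_bytes):
    """Size of the packet the appliance receives for a given original packet."""
    return inner_packet_bytes + GENEVE_OVERHEAD


def appliance_mtu_ok(appliance_mtu, largest_inner_packet):
    """True if the appliance interface can accept the encapsulated packet."""
    return appliance_mtu >= encapsulated_size(largest_inner_packet)


print(GENEVE_OVERHEAD)                 # 68
print(encapsulated_size(1500))         # 1568
print(encapsulated_size(8500))         # 8568
print(appliance_mtu_ok(1500, 1500))    # False
print(appliance_mtu_ok(9001, 8500))    # True
```

The takeaway is that an appliance interface left at an MTU of 1500 cannot accept a full 1500-byte original packet once it is encapsulated.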

Configuration and Verification Commands

Creating a GWLB via AWS CLI:

aws elbv2 create-load-balancer --name my-gwlb --type gateway --subnets subnet-123 subnet-456

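Creating a target group:

aws elbv2 create-target-group --name my-tg --protocol GENEVE --port 6081 --target-type instance --vpc-id vpc-789

Registering targets:

aws elbv2 register-targets --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-tg/123 --targets Id=i-1234567890abcdef0

Creating a listener that forwards all traffic to the target group (a GWLB listener takes no protocol or port):

aws elbv2 create-listener --load-balancer-arn <gwlb-arn> --default-actions Type=forward,TargetGroupArn=<tg-arn>

Exposing the GWLB as a VPC endpoint service (this produces the service name used in the next step):

aws ec2 create-vpc-endpoint-service-configuration --gateway-load-balancer-arns <gwlb-arn> --no-acceptance-required

Creating a GWLB endpoint:

aws ec2 create-vpc-endpoint --vpc-endpoint-type GatewayLoadBalancer --vpc-id vpc-789 --service-name com.amazonaws.vpce.us-east-1.vpce-svc-123 --subnet-ids subnet-123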

Verification:

aws elbv2 describe-target-health --target-group-arn <tg-arn>
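When there are many appliances, the JSON that describe-target-health returns is easier to scan with a small script. The Python sketch below filters out the healthy targets. The sample document follows the shape of the command's output (TargetHealthDescriptions, Target, TargetHealth), but the instance IDs and states are invented, and the function name is a made-up helper.

```python
import json

# Invented sample in the shape returned by `aws elbv2 describe-target-health`.
SAMPLE_OUTPUT = """
{
  "TargetHealthDescriptions": [
    {
      "Target": {"Id": "i-1234567890abcdef0", "Port": 6081},
      "HealthCheckPort": "80",
      "TargetHealth": {"State": "healthy"}
    },
    {
      "Target": {"Id": "i-0fedcba0987654321", "Port": 6081},
      "HealthCheckPort": "80",
      "TargetHealth": {
        "State": "unhealthy",
        "Reason": "Target.FailedHealthChecks",
        "Description": "Health checks failed"
      }
    }
  ]
}
"""


def unhealthy_targets(describe_output):
    """Return (target id, state, reason) for every target that is not healthy."""
    document = json.loads(describe_output)
    problems = []
    for entry in document["TargetHealthDescriptions"]:
        health = entry["TargetHealth"]
        if health["State"] != "healthy":
            problems.append(
                (entry["Target"]["Id"], health["State"], health.get("Reason", ""))
            )
    return problems


for target_id, state, reason in unhealthy_targets(SAMPLE_OUTPUT):
    print(f"{target_id}: {state} ({reason})")
```

An appliance that is not healthy receives no new flows, so an empty result is what you want to see before sending production traffic through the GWLB.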

Interaction with Related Technologies

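AWS Transit Gateway (TGW): GWLB can be integrated with TGW to inspect traffic between VPCs or between VPC and on-premises. For example, you can attach a TGW to a central security VPC and use route tables to send inter-VPC traffic to a GWLB endpoint for inspection. Enable appliance mode on the security VPC's TGW attachment so that both directions of a flow use the same AZ and therefore reach the same appliance.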

VPC Peering: GWLB is not directly integrated with VPC peering. To inspect traffic across a VPC peering connection, you need to route traffic through a GWLB endpoint in the same VPC.

Direct Connect/VPN: GWLB endpoints can be used in a VPC that has Direct Connect or VPN connections. Traffic from on-premises can be routed to the GWLB endpoint for inspection before reaching workloads.

AWS Network Firewall: The two solve the same inline inspection problem in different ways. AWS Network Firewall is a managed firewall service operated by AWS, while GWLB lets you run your own or third-party appliances and handles only the traffic distribution.

Security Groups and NACLs: GWLB does not replace security groups or network ACLs. Traffic inspection by appliances is in addition to these controls.

Performance and Scaling

GWLB can handle millions of packets per second (PPS), and the GWLB itself is fully managed and scales automatically. Each GWLB endpoint supports up to 10 Gbps per AZ by default and scales automatically beyond that (AWS documents up to 100 Gbps per endpoint). For high availability, deploy endpoints in every AZ that carries traffic. The appliances must be scaled independently (e.g., using Auto Scaling groups). This guide uses limits of 100 targets per target group and 20 target groups per load balancer; quotas change over time, so confirm current values in the Elastic Load Balancing quotas documentation.
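Because the appliances are the part you must size yourself, a rough capacity estimate is useful. The Python sketch below is a back-of-the-envelope calculation, not an AWS formula. The function name, the 30% headroom, the single spare instance, and the throughput figures are all assumptions chosen for illustration.

```python
import math


def appliances_per_az(peak_gbps, per_appliance_gbps, headroom=0.3, spare=1):
    """Estimate how many appliances one AZ needs.

    headroom: fraction of each appliance's rated throughput kept unused.
    spare:    extra instances so that losing one does not cause overload.
    """
    if peak_gbps <= 0 or per_appliance_gbps <= 0:
        raise ValueError("throughput values must be positive")
    usable_gbps = per_appliance_gbps * (1 - headroom)
    return math.ceil(peak_gbps / usable_gbps) + spare


# Hypothetical numbers: 8 Gbps peak per AZ, appliances rated at 2.5 Gbps each.
print(appliances_per_az(8, 2.5))  # 6
```

With these inputs each appliance is planned at 1.75 Gbps, so five instances cover the peak and a sixth provides the spare. An Auto Scaling group would then use a number like this as its per-AZ minimum or desired capacity.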

Pricing

You pay for:

GWLB per hour (or partial hour) per GWLB.

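Gateway Load Balancer Capacity Units (GLCUs) consumed per hour, which reflect new connections, active connections, and data processed (GB) through the GWLB.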

GWLB endpoint per hour.

Data processed through the endpoint.

Pricing is similar to NLB but with additional charges for endpoints.

Walk-Through

1. Traffic arrives at GWLB endpoint

A packet from an internet client or another VPC destined to a workload (e.g., EC2 instance) arrives at the Gateway Load Balancer endpoint (GWLBe) in the VPC subnet. The packet's source and destination IPs are the original ones (no NAT is applied). The GWLBe accepts traffic based on the VPC route table entries that point to the endpoint. The GWLB receives the packet and examines its 5-tuple (source IP, destination IP, source port, destination port, protocol) to determine the flow.

2. GWLB selects target appliance

The GWLB uses a hash of the 5-tuple to select a healthy target from the target group. The hash ensures all packets of the same flow go to the same appliance, maintaining stateful inspection. If the target group has multiple appliances, the GWLB distributes flows across them. The selection is consistent as long as the target group membership does not change. If a target is unhealthy (based on health checks), it is excluded from selection.

3. Packet encapsulated in GENEVE

The GWLB encapsulates the original packet inside a GENEVE header. The outer IP header has source = the GWLB's private IP in its subnet and destination = the selected appliance's private IP. The UDP header uses port 6081. The GENEVE header includes a Virtual Network Identifier (VNI) set to 0, plus options that carry GWLB metadata such as the endpoint ID and a flow cookie. The original packet remains intact inside. This encapsulation allows the appliance to receive the packet without losing the original source/destination information.

4. Appliance decapsulates and inspects

The appliance receives the GENEVE packet on its interface. It decapsulates the packet, revealing the original packet. The appliance then performs its security functions (e.g., firewall rules, IDS/IPS inspection). It may drop the packet or allow it. If allowed, the appliance must re-encapsulate the packet in GENEVE, keeping the GENEVE options it received, and send it back to the GWLB by using the outer source address of the packet it received as the new outer destination. The appliance must not send the packet directly to the original source or destination.

5. GWLB forwards original packet to destination

The GWLB receives the encapsulated packet from the appliance, decapsulates it, and forwards the original packet through the GWLB endpoint to the workload's destination (e.g., the EC2 instance). The source IP of the packet remains the original client IP. The workload responds as if the traffic came directly from the client, unaware of the inspection. The return traffic from the workload is routed back to the same GWLB endpoint, and the GWLB again encapsulates it and sends it to the same appliance for inspection before it is forwarded to the client.
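To make the encapsulation steps in this walk-through concrete, here is a Python sketch that builds and parses a GENEVE header following the field layout in RFC 8926. It is a teaching aid, not an appliance implementation. The option class and type numbers, the zero-filled option data, and the helper names are placeholders assumed for the example, and the outer IP and UDP headers are represented only by a pair of address strings.

```python
import struct

GENEVE_UDP_PORT = 6081
ETHERTYPE_IPV4 = 0x0800
EXAMPLE_OPTION_CLASS = 0x0108  # assumed placeholder value for the sketch


def build_geneve(vni, options, inner_packet, protocol=ETHERTYPE_IPV4):
    """Return GENEVE base header + options + inner packet."""
    option_bytes = b""
    for opt_class, opt_type, data in options:
        if len(data) % 4:
            raise ValueError("option data must be a multiple of 4 bytes")
        # Option header: class (16 bits), type (8), reserved + length in words (8).
        option_bytes += struct.pack("!HBB", opt_class, opt_type, len(data) // 4)
        option_bytes += data
    first_byte = len(option_bytes) // 4  # version 0, options length in 4-byte words
    base = struct.pack("!BBH", first_byte, 0, protocol)
    base += vni.to_bytes(3, "big") + b"\x00"  # 24-bit VNI + reserved byte
    return base + option_bytes + inner_packet


def parse_geneve(packet):
    """Return (vni, protocol, options, inner_packet)."""
    first_byte, _flags, protocol = struct.unpack("!BBH", packet[:4])
    vni = int.from_bytes(packet[4:7], "big")
    options_end = 8 + (first_byte & 0x3F) * 4
    options, offset = [], 8
    while offset < options_end:
        opt_class, opt_type, length = struct.unpack("!HBB", packet[offset:offset + 4])
        size = (length & 0x1F) * 4
        options.append((opt_class, opt_type, packet[offset + 4:offset + 4 + size]))
        offset += 4 + size
    return vni, protocol, options, packet[options_end:]


def appliance_return(outer_src, outer_dst, geneve_packet):
    """An allowed packet goes back with outer addresses swapped, GENEVE untouched."""
    return outer_dst, outer_src, geneve_packet


# Three placeholder options sized 8, 8, and 4 bytes.
options = [
    (EXAMPLE_OPTION_CLASS, 1, bytes(8)),
    (EXAMPLE_OPTION_CLASS, 2, bytes(8)),
    (EXAMPLE_OPTION_CLASS, 3, bytes(4)),
]
inner = b"original-ip-packet"
packet = build_geneve(0, options, inner)

print(len(packet) - len(inner))        # 40
print(parse_geneve(packet)[3] == inner)  # True
```

The 40 bytes printed here are the GENEVE header and options; the outer IP and UDP headers come on top of that. The parse step shows that the original packet comes out of the tunnel byte for byte, which is why the workload still sees the real client IP.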

What This Looks Like on the Job

Enterprise Scenario 1: Centralized Firewall for Multi-VPC Architecture

A large enterprise runs multiple VPCs (e.g., production, development, shared services) connected via AWS Transit Gateway. They need to inspect all east-west traffic between VPCs using a third-party firewall (e.g., Palo Alto Networks, Fortinet) for compliance and threat detection. Without GWLB, they would need to create complex routing and firewall chains. With GWLB, they create a single GWLB in a central security VPC. The spoke VPCs send inter-VPC traffic to the TGW, and the route table of the TGW attachment subnet in the security VPC points to the GWLB endpoint. The firewall appliances are registered as targets in the GWLB target group, scaled with an Auto Scaling group. Traffic from VPC A to VPC B flows: VPC A -> TGW -> GWLB endpoint -> GWLB -> firewall -> GWLB -> GWLB endpoint -> TGW -> VPC B. They deploy endpoints in two AZs for high availability, with each endpoint supporting up to 10 Gbps per AZ by default. Common misconfigurations: forgetting to update the TGW route tables so that traffic bypasses inspection, and leaving appliance mode disabled on the security VPC attachment, which lets the two directions of a flow land in different AZs and on different appliances.
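The routing in this scenario can be checked on paper with a tiny longest-prefix-match lookup. The Python sketch below models three route tables as dictionaries and asks where a packet for a VPC B address goes next at each hop. Every CIDR, table name, and target label is hypothetical, and real route tables reference resource IDs and not short labels like these.

```python
import ipaddress

# Hypothetical route tables for a centralized inspection design.
ROUTE_TABLES = {
    "spoke-a-workload-subnet": {"10.1.0.0/16": "local", "0.0.0.0/0": "tgw"},
    "security-vpc-tgw-attachment-subnet": {"10.100.0.0/16": "local", "0.0.0.0/0": "gwlbe"},
    "security-vpc-gwlbe-subnet": {"10.100.0.0/16": "local", "10.0.0.0/8": "tgw"},
}


def next_hop(route_table, destination):
    """Return the target of the most specific route that matches, or None."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


destination_in_vpc_b = "10.2.0.15"
for name, table in ROUTE_TABLES.items():
    print(name, "->", next_hop(table, destination_in_vpc_b))
```

If the attachment subnet's default route pointed anywhere other than the GWLB endpoint, the second lookup would show traffic skipping the firewalls, which is exactly the misconfiguration described above.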

Enterprise Scenario 2: Inline IDS/IPS for Internet-Facing Applications

An e-commerce company uses an Application Load Balancer (ALB) in front of web servers. They want to inspect all incoming traffic for malicious payloads using an IDS/IPS (e.g., Snort, Suricata). They place a GWLB endpoint between the internet gateway and the ALB. A route table associated with the internet gateway (ingress routing) sends traffic destined for the ALB subnets to the GWLB endpoint, and the ALB subnets' route table sends 0.0.0.0/0 back to the endpoint so that return traffic is inspected as well. The GWLB forwards traffic to the IDS/IPS appliances. After inspection, traffic is sent back to the GWLB, which forwards it to the ALB. The ALB sees the original client IP. Performance considerations: the IDS/IPS appliances must handle the peak traffic, typically 1-5 Gbps, and their interfaces need an MTU large enough for the original packet plus the GENEVE overhead (GWLB supports packets up to 8,500 bytes). Proxy Protocol is not involved: GWLB does not support it, and the appliance already sees the client IP in the inner packet. A common mistake: the IDS/IPS appliance is not configured to return traffic to the GWLB, causing dropped connections.

Scenario 3: Outbound Traffic Inspection for Compliance

A financial institution needs to inspect all outbound traffic from a VPC to the internet for data loss prevention (DLP). They use a next-generation firewall (NGFW) as the inspection appliance. They deploy GWLB endpoints in dedicated subnets and route all outbound traffic (0.0.0.0/0) from the workload subnets to the endpoint. The NGFW appliances inspect traffic and enforce policies. They use Auto Scaling to handle variable traffic, with a minimum of 2 instances per AZ. They monitor health checks and set up CloudWatch alarms. A typical issue is address translation: workloads with private IPs still need NAT to reach the internet, and GWLB does not perform it. A common design places a NAT gateway after inspection, so the GWLB endpoint subnet routes 0.0.0.0/0 to the NAT gateway and the NAT gateway subnet routes the workload CIDRs back to the endpoint for return traffic.

How SAA-C03 Actually Tests This

SAA-C03 Exam Focus on Gateway Load Balancer

The SAA-C03 exam tests GWLB primarily under Domain 2: Design Resilient Architectures, Task Statement 2.2: Design highly available and/or fault-tolerant architectures. You may also see it in Domain 3: Design High-Performing Architectures. The exam expects you to know:

GWLB is used for transparent inline inspection of traffic (not for load balancing applications).

It uses GENEVE encapsulation (port 6081) to tunnel traffic to appliances.

GWLB endpoints (GWLBe) are VPC endpoints that serve as entry/exit points.

It preserves the original source and destination IP addresses because the inner packet is not modified (no NAT, and no Proxy Protocol is involved).

It supports flow affinity (5-tuple hash) for stateful appliances.

Cross-zone load balancing is disabled by default and can be enabled.

GWLB is regional and can span multiple AZs.

Common Wrong Answers and Why Candidates Choose Them

1. "Use a Network Load Balancer (NLB) in front of the appliances." – Candidates often confuse GWLB with NLB. NLB operates at Layer 4 and does not provide transparent inline inspection. NLB cannot easily preserve original source IP for return traffic. GWLB is specifically designed for inline security appliances.

2. "Use an Application Load Balancer (ALB) with Lambda functions for inspection." – ALB is Layer 7 and not suitable for packet-level inspection. Lambda cannot process raw packets. GWLB is the correct service for third-party virtual appliances.

3. "GWLB supports TCP health checks only." – While GWLB uses TCP health checks by default, it actually supports HTTP and HTTPS health checks as well (though GENEVE health checks are not supported). The exam may present a scenario where a specific health check type is needed.

4. "GWLB can be used to load balance web traffic to EC2 instances." – This is incorrect. GWLB is not a replacement for ALB or NLB for application traffic. Its purpose is to load balance traffic to security appliances.

Specific Numbers, Values, and Terms

GENEVE port: 6081 (UDP)

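Default health check: TCP on port 80, interval 10s. Default timeout and thresholds have varied across versions of the AWS documentation (this guide has used timeout 10s and healthy/unhealthy threshold 3), so confirm current values in the API reference.

MTU consideration: GENEVE adds 68 bytes per packet. GWLB supports packets up to 8,500 bytes, so appliances that should accept the largest packets need an MTU of at least 8,568 bytes.

Flow hash: 5-tuple (source IP, destination IP, source port, destination port, protocol).

Cross-zone load balancing: disabled by default.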

Target type: instance or IP.

Maximum targets per target group: 100.

Maximum target groups per GWLB: 20.

GWLB endpoint: type GatewayLoadBalancer.

Edge Cases and Exceptions

Non-GENEVE appliances: If an appliance does not support GENEVE, GWLB cannot be used. The exam may test this by presenting a scenario with legacy appliances.

Return traffic routing: The appliance must send traffic back to the GWLB (the outer source address of the packet it received), still encapsulated in GENEVE. If it sends directly to the client, traffic bypasses the GWLB and may be dropped.

MTU issues: GWLB does not fragment or reassemble packets. If the appliance interface MTU is too small for the original packet plus the 68-byte GENEVE overhead, large packets are dropped rather than fragmented.

Health checks: If the appliance's health check port is different from the data port, configure the health check accordingly. The exam may ask about configuring health checks on a non-standard port.

How to Eliminate Wrong Answers

If the question mentions "transparent inline inspection," "third-party virtual appliances," or "preserve source IP," the answer is likely GWLB.

If the question mentions "load balancing traffic to web servers" or "HTTP/HTTPS traffic," eliminate GWLB (use ALB or NLB).

If the question mentions "encapsulation" or "GENEVE," it must be GWLB.

If the question involves "VPC endpoints" and "security appliances," think GWLBe.

Remember: GWLB is not a replacement for NAT Gateway or Internet Gateway; it is an inspection layer.

Key Takeaways

GWLB is used for transparent inline inspection of traffic by third-party virtual appliances (firewalls, IDS/IPS).

GWLB uses GENEVE encapsulation (UDP port 6081) to tunnel traffic to appliances, preserving original source/destination IPs.

GWLB endpoints (GWLBe) are VPC endpoints that serve as entry/exit points; traffic is routed to them via route tables.

Flow affinity is based on a 5-tuple hash; all packets of a flow go to the same appliance.

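Cross-zone load balancing is disabled by default; it can be enabled.

Default health check: TCP on port 80, interval 10s; confirm the current default timeout and thresholds in the API reference, since documented values have varied (this guide has used timeout 10s and threshold 3).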

GWLB does not support sticky sessions (cookies) — only flow affinity.

GWLB is not a replacement for ALB or NLB; it is for security appliances only.

Maximum 100 targets per target group, 20 target groups per GWLB.

GWLB is regional and can span multiple AZs; endpoints are per-AZ.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Gateway Load Balancer (GWLB)

Operates at Layer 3 (network layer) using GENEVE encapsulation.

Designed for transparent inline inspection by virtual appliances.

Preserves original source IP (no proxy protocol needed for IP).

Uses flow affinity (5-tuple hash) for stateful appliances.

Requires GENEVE support on targets (port 6081).

Network Load Balancer (NLB)

Operates at Layer 4 (TCP/UDP) without encapsulation.

Designed for high-performance load balancing of TCP/UDP traffic.

Can preserve client IP via Proxy Protocol v2 (optional).

Supports sticky sessions based on source IP (no cookies, since it does not read HTTP).

Works with standard TCP/UDP targets (no special protocol).

GWLB with GENEVE

Allows use of third-party appliances (firewalls, IDS/IPS).

Requires manual scaling and management of appliances.

Supports custom inspection logic (any appliance).

Encapsulation adds overhead (68 bytes per packet).

More flexible but higher operational overhead.

AWS Network Firewall

Fully managed firewall service by AWS.

Automatic scaling and high availability.

Pre-configured rules (stateful/stateless) with limited customization.

No encapsulation overhead; integrates natively with VPC.

Less flexible but lower operational overhead.

Watch Out for These

Mistake

GWLB can be used to load balance traffic to any application, like web servers.

Correct

GWLB is specifically designed for virtual network appliances (firewalls, IDS/IPS) in transparent inline mode. It is not a general-purpose load balancer. Use ALB or NLB for application traffic.

Mistake

GWLB changes the source IP of the packet, similar to a NAT.

Correct

GWLB preserves the original source IP. The appliance sees the original client IP in the inner packet. GWLB does not perform NAT; it only encapsulates/decapsulates.

Mistake

GWLB uses VXLAN encapsulation.

Correct

GWLB uses GENEVE encapsulation (UDP port 6081), not VXLAN. VXLAN is used in other contexts (e.g., VPC Traffic Mirroring). GENEVE is more flexible and supports variable-length metadata.

Mistake

GWLB endpoints are the same as VPC Gateway Endpoints (like S3 or DynamoDB).

Correct

GWLB endpoints are of type GatewayLoadBalancer, not Gateway. Both are reached through route table entries, but they work differently. Gateway endpoints (S3, DynamoDB) are route targets that give private access to an AWS service. A GWLB endpoint is a PrivateLink-powered network interface in a subnet that forwards all traffic routed to it to a GWLB for inspection.

Mistake

GWLB supports sticky sessions via cookies.

Correct

GWLB does not support sticky sessions. It uses flow affinity based on the 5-tuple hash, which ensures all packets of the same flow go to the same appliance. This is not the same as cookie-based stickiness.

Frequently Asked Questions

What is the difference between a Gateway Load Balancer and a Network Load Balancer?

A Gateway Load Balancer (GWLB) is designed for transparent inline inspection of traffic by third-party virtual appliances (like firewalls). It operates at Layer 3 and uses GENEVE encapsulation to tunnel packets to appliances, preserving original IP addresses. A Network Load Balancer (NLB) operates at Layer 4 and is used for high-performance load balancing of TCP/UDP traffic to application servers. NLB can also preserve client IP via Proxy Protocol, but it does not support transparent inline inspection. On the exam, if the scenario involves security appliances and inspection, choose GWLB; if it's about distributing traffic to web servers, choose NLB.

Can I use a Gateway Load Balancer to load balance traffic to my web servers?

No. GWLB is specifically designed for virtual network appliances (firewalls, IDS/IPS, DPI) that need to inspect traffic in a transparent inline mode. It is not a general-purpose load balancer for application traffic. For web servers, use an Application Load Balancer (ALB) for HTTP/HTTPS or a Network Load Balancer (NLB) for TCP/UDP. On the exam, if the question mentions 'security inspection' or 'third-party appliances,' GWLB is the answer; if it mentions 'web traffic' or 'load balancing across instances,' use ALB or NLB.

Does Gateway Load Balancer support sticky sessions?

No, GWLB does not support sticky sessions in the traditional sense (e.g., cookie-based stickiness). However, it does support flow affinity, which ensures that all packets belonging to the same flow (based on 5-tuple hash) are sent to the same target appliance. This is critical for stateful appliances like firewalls. The hash is consistent unless the target group changes. On the exam, you may see a question asking about 'sticky sessions' — remember that GWLB uses flow affinity, not cookies.

What protocol does Gateway Load Balancer use to communicate with targets?

GWLB uses the GENEVE protocol (Generic Network Virtualization Encapsulation) on UDP port 6081. The target appliances must support GENEVE encapsulation and decapsulation. The GENEVE header includes a Virtual Network Identifier (VNI) and optional metadata. AWS sets the VNI to 0 by default. On the exam, remember the port number 6081 and the protocol name GENEVE.

How do I route traffic to a Gateway Load Balancer endpoint?

You create a Gateway Load Balancer endpoint (GWLBe) in a VPC subnet, and then update the VPC route table to send traffic to the endpoint. For example, to inspect all outbound traffic, you can add a route for 0.0.0.0/0 pointing to the GWLBe. The GWLBe then forwards the traffic to the GWLB, which load balances it to the target appliances. On the exam, remember that GWLB endpoints are VPC endpoints of type GatewayLoadBalancer.

Can I use Gateway Load Balancer with AWS Transit Gateway?

Yes, GWLB can be integrated with AWS Transit Gateway (TGW) to inspect traffic between VPCs. You can attach a TGW to a VPC and use route tables to send inter-VPC traffic to a GWLB endpoint in a central security VPC. The GWLB then forwards traffic to the security appliances for inspection before routing it to the destination VPC. This is a common pattern for centralized inspection of east-west traffic.

What are the performance limits of a Gateway Load Balancer?

Each GWLB endpoint supports up to 10 Gbps per Availability Zone by default and scales automatically beyond that (AWS documents up to 100 Gbps per endpoint). The GWLB itself scales automatically and can handle millions of packets per second. This guide uses limits of 100 targets per target group and 20 target groups per GWLB; confirm current quotas in the Elastic Load Balancing documentation. On the exam, you might be asked about scaling considerations: remember to distribute endpoints across AZs for high availability, and that the appliances behind the GWLB must be scaled separately.
