SAA-C03 | Chapter 177 of 189 | Objective 4.4

NAT Gateway vs NAT Instance Cost

This chapter provides a comprehensive, exam-focused comparison of NAT Gateway and NAT Instance in AWS, diving deep into their architectures, costs, performance, and operational trade-offs. Understanding when to choose one over the other is critical for the SAA-C03 exam, as cost optimization is a core domain (Objective 4.4). Roughly 5-10% of exam questions touch on NAT solutions, often in the context of designing cost-effective and highly available VPC architectures. By the end of this chapter, you will be able to confidently select the right NAT solution for any scenario and avoid common exam traps.

25 min read
Intermediate
Updated May 31, 2026

The Company Switchboard vs. The Auto-Attendant

Imagine a mid-sized company with 200 employees, each with a desk phone extension (private IP addresses). They need to make outbound calls to clients (the internet), but the company has only one public phone number (Elastic IP).

The company could hire a human switchboard operator (NAT Instance). The operator sits at a desk, logs each outbound call in a notebook, and when the return call comes in, flips through the notebook to find which extension made the call and patches it through. If she gets sick (instance fails), no one can call out. She also needs breaks and overtime pay (management overhead).

Alternatively, the company can install an auto-attendant system (NAT Gateway). This is a dedicated appliance that automatically rewrites the caller ID on every outbound call, keeps the mapping in memory, and routes return calls instantly without human intervention. It works 24/7, handles thousands of simultaneous calls, and scales automatically as call volume grows; the human operator would need a second operator (additional instance) and complex coordination. The auto-attendant sits in the lobby (public subnet) of a single building (one AZ) and is managed by the phone company (AWS).

The costs differ too. The auto-attendant carries a subscription fee plus a charge per call handled (hourly charge + data processing fee), while the human operator is paid only by the hour, although you must also pay for her desk, chair, and training (AMI, instance size, patches). The human operator works best if the company has very few calls (low traffic) and needs full control over how calls are logged (custom scripts). For most companies, the auto-attendant is simpler, more reliable, and more cost-effective despite the monthly fee.

How It Actually Works

What Are NAT Gateway and NAT Instance?

Network Address Translation (NAT) is a method used to enable instances in a private subnet to initiate outbound traffic to the internet while preventing the internet from initiating connections to those instances. In AWS, you have two primary options: a managed NAT Gateway service and a self-managed NAT instance (an EC2 instance configured with NAT software). Both perform source NAT (SNAT) by replacing the source IP address of outbound packets with the NAT device's elastic IP address and tracking the state of connections to route return traffic back to the correct private instance.

Why They Exist

Private subnets have no direct internet route. Without NAT, instances in private subnets cannot download updates, access external APIs, or send logs to external services. NAT solutions bridge this gap by masquerading private IPs behind a public IP. The choice between NAT Gateway and NAT Instance boils down to trade-offs in management overhead, cost predictability, availability, and performance.

How NAT Gateway Works Internally

A NAT Gateway is a horizontally scalable, fully managed AWS service; you never see or manage the hosts it runs on. When you create a NAT Gateway in a public subnet, you associate an Elastic IP (EIP) with it, and AWS runs it on redundant infrastructure within that Availability Zone (AZ). The gateway is stateful: for every outbound connection, it creates a NAT session entry in a connection tracking table. This entry maps the private source IP:port to the gateway's EIP:ephemeral port. Return packets are matched against this table and forwarded to the correct private instance. The idle timeout for NAT Gateway connections is 350 seconds (5 minutes 50 seconds). After this period of inactivity the session entry is removed, and if an instance behind the gateway tries to continue the connection, the gateway answers with a TCP RST rather than forwarding the packet. A NAT Gateway starts at 5 Gbps of bandwidth and scales automatically, up to 45 Gbps in the figures used throughout this chapter (AWS has since documented a higher ceiling of 100 Gbps, so check the current quotas). It supports TCP, UDP, and ICMP (though ICMP is not recommended for NAT). Each NAT Gateway is deployed in a single AZ.
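The connection tracking behaviour described above can be pictured with a small toy model. This is only a sketch for intuition, not how AWS implements the gateway: the `SnatTable` class and its method names are hypothetical, ports are handed out sequentially and never reused, and sessions are keyed on the private IP:port alone, whereas a real NAT device tracks the full connection tuple.

```python
IDLE_TIMEOUT_S = 350  # NAT Gateway idle timeout cited in this chapter


class SnatTable:
    """Toy source-NAT table: private ip:port <-> public ip:ephemeral port."""

    def __init__(self, public_ip, first_port=1024):
        self.public_ip = public_ip
        self._next_port = first_port
        self._forward = {}    # (private_ip, private_port) -> public_port
        self._reverse = {}    # public_port -> (private_ip, private_port)
        self._last_seen = {}  # public_port -> time of the last packet

    def expire(self, now):
        """Drop every session that has been idle longer than the timeout."""
        stale = [p for p, t in self._last_seen.items() if now - t > IDLE_TIMEOUT_S]
        for port in stale:
            key = self._reverse.pop(port)
            del self._forward[key]
            del self._last_seen[port]

    def outbound(self, private_ip, private_port, now):
        """Rewrite the source of an outbound packet and record the mapping."""
        self.expire(now)
        key = (private_ip, private_port)
        if key not in self._forward:
            port = self._next_port
            self._next_port += 1
            self._forward[key] = port
            self._reverse[port] = key
        port = self._forward[key]
        self._last_seen[port] = now
        return self.public_ip, port

    def inbound(self, public_port, now):
        """Map a return packet back to the private instance, or None if unknown."""
        self.expire(now)
        key = self._reverse.get(public_port)
        if key is not None:
            self._last_seen[public_port] = now
        return key


table = SnatTable("203.0.113.10")             # documentation-range address
print(table.outbound("10.0.1.5", 40000, 0))   # new session created
print(table.inbound(1024, 100))               # return traffic within the timeout
print(table.inbound(1024, 451))               # idle for 351 s, session is gone
```

The last call returns `None` because the session aged out, which is the situation in which a long-lived idle connection through a NAT Gateway stops working unless the application sends keepalives.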

How NAT Instance Works Internally

A NAT instance is an EC2 instance that you launch, configure, and manage yourself, either from a NAT-capable AMI or from a standard Amazon Linux AMI that you configure for NAT. The instance runs iptables or similar software to perform SNAT and uses conntrack to track connections. You must disable Source/Destination Check on the instance's ENI so that it can forward traffic that is not addressed to itself. The default connection tracking timeout for established TCP connections in Linux is 5 days (432,000 seconds); for NAT use it is commonly lowered (e.g., to 600 seconds). The instance's maximum throughput depends on its instance type, ranging from ~1 Gbps for t3.nano to 100 Gbps of rated network bandwidth for c5n.18xlarge. You must manage patching, scaling, and failover yourself. You can use an Auto Scaling group with a lifecycle hook to replace failed instances, but failover is not instantaneous.

Key Components, Values, and Defaults

NAT Gateway: Hourly charge (approximately $0.045 per hour per gateway, varies by region) plus $0.045 per GB of data processed. No free tier. Bandwidth starts at 5 Gbps and scales automatically (45 Gbps ceiling in this chapter's figures). Idle timeout: 350 seconds. Max concurrent connections: 55,000 to each unique destination. Each public gateway requires an EIP. Deploy one gateway per AZ for high availability.

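NAT Instance: EC2 instance cost (t3.nano ~$0.0052/hour, though production use typically calls for at least a t3.micro) plus the public IPv4 address charge (since February 2024 AWS bills about $0.005/hour for every public IPv4 address, including attached Elastic IPs; the same charge applies to a NAT Gateway's EIP). There is no hourly NAT fee and no per-GB NAT processing fee. Standard EC2 data transfer charges apply (internet egress is roughly $0.09/GB for the first 10 TB per month in us-east-1), and those transfer charges also apply to traffic sent through a NAT Gateway. You must manage the AMI, patches, and scaling. Instance type determines throughput. Default conntrack timeout: 432,000 seconds (5 days); it should be reduced for NAT use.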

Configuration and Verification

NAT Gateway: 1. Create a NAT Gateway in a public subnet, allocate an EIP. 2. Update the private subnet's route table to add a default route (0.0.0.0/0) pointing to the NAT Gateway's ID. 3. Verify: from a private instance, run curl ifconfig.me — it should show the NAT Gateway's EIP. Check CloudWatch metrics (BytesOutToSource, BytesInFromDestination, PacketsOutToSource, ConnectionAttemptCount).

NAT Instance: 1. Launch an Amazon Linux 2 instance and configure it for NAT (or start from a NAT-capable AMI). 2. Disable Source/Destination Check on the instance's primary ENI. 3. Enable IP forwarding: echo net.ipv4.ip_forward=1 > /etc/sysctl.d/ip_forward.conf && sysctl -p /etc/sysctl.d/ip_forward.conf. 4. Configure iptables: iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE. 5. Update the private subnet route table to point 0.0.0.0/0 at the NAT instance, using its instance ID or ENI ID as the route target. 6. Verify: same as NAT Gateway. Monitor instance health, CPU, and network.

Interaction with Related Technologies

VPC Peering: VPC peering does not support edge-to-edge routing, so instances in one VPC cannot reach the internet through a NAT Gateway or NAT instance in a peered VPC, whatever the route tables say. For multi-VPC setups, AWS recommends a centralized egress VPC with Transit Gateway.

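Direct Connect/VPN: NAT devices can be used to route traffic to on-premises networks, but careful routing is needed to avoid asymmetric routing. A public NAT Gateway handles only internet-bound traffic; for on-premises destinations, use a private NAT Gateway (which has no EIP and routes through a Transit Gateway or virtual private gateway), a NAT instance, or a firewall appliance.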

Security Groups: NAT Gateway does not support security groups — you cannot restrict traffic through it. NAT Instance supports security groups, allowing you to control inbound/outbound traffic at the instance level.

Network ACLs: Both are affected by subnet NACLs. For NAT Gateway, the public subnet's NACL must allow inbound ephemeral ports (1024-65535) from the internet and outbound to the internet. For NAT Instance, the instance's security group and subnet NACL apply.

Cost Analysis Deep Dive

The cost difference is not just about hourly rates. NAT Gateway has a fixed hourly cost regardless of usage, plus per-GB data processing. NAT Instance has only EC2 compute cost and standard data transfer fees. For low-volume workloads (e.g., <100 GB/month), a NAT instance (t3.nano) can be cheaper. For high-volume workloads (>1 TB/month), NAT Gateway becomes more cost-effective due to its higher throughput and no need for over-provisioning. However, you must factor in operational overhead: patching, monitoring, failover management for NAT Instance. The exam often presents scenarios where a NAT instance appears cheaper but fails to account for the cost of high availability (multiple instances, load balancer) or the risk of downtime.
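The comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the rates quoted in this chapter ($0.045/hour and $0.045/GB for the gateway, $0.0052/hour for a t3.nano, 730 hours per month); the function names are illustrative, and data transfer out is deliberately left out on the assumption that it is billed the same way for both options.

```python
HOURS_PER_MONTH = 730  # billing-month approximation used in this chapter


def nat_gateway_monthly(gb_processed, gateways=1, hourly=0.045, per_gb=0.045):
    """Hourly charge for each gateway plus the per-GB data processing fee."""
    return gateways * hourly * HOURS_PER_MONTH + per_gb * gb_processed


def nat_instance_monthly(instance_hourly, instances=1):
    """EC2 compute only; a NAT instance has no NAT-specific per-GB fee."""
    return instances * instance_hourly * HOURS_PER_MONTH


# Low-volume case: 50 GB/month, single AZ.
print(f"gateway : ${nat_gateway_monthly(50):.2f}")        # 32.85 + 2.25
print(f"t3.nano : ${nat_instance_monthly(0.0052):.2f}")   # 0.0052 * 730

# High-volume case: 30,000 GB/month across three AZs.
print(f"3 gateways     : ${nat_gateway_monthly(30_000, gateways=3):.2f}")
print(f"3 c5n.18xlarge : ${nat_instance_monthly(3.888, instances=3):.2f}")
```

The model ignores staff time for patching and failover, which is the part of the total cost that the exam scenarios usually hinge on.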

High Availability Considerations

NAT Gateway is inherently highly available within an AZ but not across AZs. For multi-AZ resilience, you must deploy one NAT Gateway per AZ and route traffic from each private subnet to the NAT Gateway in the same AZ. This increases cost but ensures no single point of failure. NAT Instance can be made highly available using an Auto Scaling group with a health check and a script to update route tables (e.g., using Lambda). However, failover time is minutes, not seconds. The exam expects you to recommend NAT Gateway for production workloads requiring high availability and minimal management.
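The cost side of the one-gateway-per-AZ recommendation can be estimated roughly. The sketch below compares the fixed cost of an additional gateway with the inter-AZ transfer charges incurred when a subnet sends its traffic to a gateway in another AZ. The $0.01/GB-in-each-direction inter-AZ rate is an assumption that does not come from this chapter, and the function names are illustrative; check current pricing before relying on the numbers.

```python
HOURS_PER_MONTH = 730


def extra_gateway_monthly(hourly=0.045):
    """Fixed monthly cost of adding one more NAT Gateway (chapter's rate)."""
    return hourly * HOURS_PER_MONTH


def cross_az_monthly(cross_az_gb, per_gb_each_way=0.01):
    """Inter-AZ transfer cost for traffic sent to a gateway in another AZ.

    Assumes the charge is applied on both sides of the AZ boundary.
    """
    return cross_az_gb * per_gb_each_way * 2


def breakeven_cross_az_gb(hourly=0.045, per_gb_each_way=0.01):
    """Monthly cross-AZ volume at which a local gateway pays for itself."""
    return extra_gateway_monthly(hourly) / (per_gb_each_way * 2)


print(f"extra gateway : ${extra_gateway_monthly():.2f}/month")
print(f"500 GB x-AZ   : ${cross_az_monthly(500):.2f}/month")
print(f"break-even    : {breakeven_cross_az_gb():.1f} GB/month")
```

Under these assumptions the second gateway is justified on cost alone only above roughly 1.6 TB of cross-AZ traffic per month; below that, the argument for one gateway per AZ is resilience rather than price.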

Performance and Scalability

NAT Gateway automatically scales to handle up to 45 Gbps per AZ (as of 2025). It supports up to 55,000 concurrent connections to each unique destination. If you need more, associate secondary IPv4 addresses with the gateway or split workloads across subnets that route to separate NAT Gateways. NAT Instance performance is limited by instance type. For example, a t3.medium can handle ~1 Gbps, while a c5n.18xlarge is rated at 100 Gbps of network bandwidth. You must scale by launching larger instances or by splitting private subnets across additional NAT instances. The exam often tests the fact that NAT Gateway scales automatically, while NAT Instance requires manual or automated scaling.

Walk-Through

1

Deploy NAT Gateway in Public Subnet

First, create a NAT Gateway using the AWS Management Console, CLI, or CloudFormation. Specify the public subnet (its route table must have a route to an Internet Gateway) and an Elastic IP (EIP). The primary EIP stays associated with the gateway for its lifetime. The gateway is automatically assigned a private IP from the subnet through its own network interface, and AWS runs it on redundant infrastructure within the AZ. The gateway starts in the Pending state and usually becomes Available within a few minutes. The gateway has no security groups; traffic is controlled by the subnet's NACL. The idle timeout for connection tracking is 350 seconds. You cannot move a gateway to a different subnet after creation; to change it, delete and recreate the gateway.

2

Update Private Subnet Route Table

Navigate to the VPC route table associated with the private subnet(s) that need internet access. Add a route with destination 0.0.0.0/0 and target set to the NAT Gateway ID (e.g., nat-0abcdef1234567890). This directs all outbound internet traffic to the NAT Gateway. Ensure the route table does not have a conflicting route (e.g., a route to a virtual private gateway). The route propagation is immediate. At the packet level, when an instance in the private subnet sends a packet to an internet IP, the VPC router checks the route table, finds the 0.0.0.0/0 route, and forwards the packet to the NAT Gateway's private IP. The NAT Gateway then performs SNAT: it changes the source IP to its EIP and source port to an ephemeral port, records the mapping, and forwards the packet to the internet gateway. The return packet follows the reverse path.
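The route lookup in this step follows longest-prefix matching: the most specific route that contains the destination wins, so VPC-internal traffic stays on the `local` route and everything else falls through to the default route. A minimal sketch of that selection logic, using Python's standard `ipaddress` module and an example route table (the CIDR ranges and the `select_route` helper are illustrative):

```python
import ipaddress


def select_route(routes, destination):
    """Return the target of the most specific route containing destination."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route: the packet is dropped
    return max(matches)[1]  # highest prefix length wins


# Example private-subnet route table (values are illustrative).
private_routes = {
    "10.0.0.0/16": "local",
    "0.0.0.0/0": "nat-0abcdef1234567890",
}

print(select_route(private_routes, "10.0.2.15"))      # stays inside the VPC
print(select_route(private_routes, "93.184.216.34"))  # goes to the NAT Gateway
```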

3

Test Connectivity from Private Instance

Launch a test EC2 instance in the private subnet with no public IP. Connect to it through a bastion host in a public subnet (or use AWS Systems Manager Session Manager if configured). From the private instance, run `curl ifconfig.me`; the response should show the NAT Gateway's EIP. `ping google.com` also works as a quick check, but ICMP may be blocked along the path, so fall back to `traceroute` or `nc` if it fails. Monitor CloudWatch metrics for the NAT Gateway: BytesOutToSource, BytesInFromDestination, PacketsOutToSource, PacketsInFromDestination, ConnectionAttemptCount, IdleTimeoutCount. If traffic fails, check the NACL of the public subnet: it must allow inbound traffic from the private subnet on the destination ports (80/443), outbound traffic to the internet on those same ports, inbound return traffic from the internet on ephemeral ports (1024-65535), and outbound return traffic to the private subnet on ephemeral ports. Also ensure the private subnet's NACL allows outbound traffic to the internet (port 80/443) and inbound return traffic on ephemeral ports.
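Because NACLs are stateless, return traffic needs its own rule, which is why the ephemeral port range matters here. The sketch below models NACL evaluation in a simplified way: rules are checked in ascending rule-number order, the first match decides, and anything unmatched is denied. The `NaclRule` type, the rule numbers, and the CIDR ranges are illustrative, and the model covers only TCP ports on the inbound side of the public subnet.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NaclRule:
    number: int
    cidr: str
    port_from: int
    port_to: int
    allow: bool


def nacl_allows(rules, source_ip, port):
    """Evaluate rules lowest number first; the first match decides."""
    ip = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.number):
        in_cidr = ip in ipaddress.ip_network(rule.cidr)
        in_ports = rule.port_from <= port <= rule.port_to
        if in_cidr and in_ports:
            return rule.allow
    return False  # implicit deny when nothing matches


# Inbound rules for the public subnet that hosts the NAT Gateway.
public_inbound = [
    NaclRule(100, "10.0.0.0/16", 443, 443, True),     # requests from private subnets
    NaclRule(110, "0.0.0.0/0", 1024, 65535, True),    # return traffic from the internet
]

print(nacl_allows(public_inbound, "10.0.2.15", 443))        # request reaches the gateway
print(nacl_allows(public_inbound, "93.184.216.34", 50000))  # reply on an ephemeral port
print(nacl_allows(public_inbound, "93.184.216.34", 22))     # unsolicited SSH is denied
```

Removing rule 110 from the list makes the second check fail, which mirrors the common symptom of outbound requests leaving the VPC while replies never arrive.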

4

Compare with NAT Instance Deployment

For a NAT instance, launch an EC2 instance from a NAT-capable AMI, or from a standard Amazon Linux 2 AMI that you configure for NAT yourself. Choose an instance type based on expected throughput. Disable Source/Destination Check on the instance's ENI. Enable IP forwarding, then configure iptables for SNAT: `iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE` (the interface may have a different name, such as ens5, on some AMIs). Update the private subnet route table to point 0.0.0.0/0 at the NAT instance, using its instance ID or network interface ID as the route target; a route cannot target a bare private IP address. Test connectivity. At the packet level, the NAT instance uses conntrack to track connections. The default conntrack timeout for established TCP connections is 5 days (432,000 seconds), which can leave stale entries in the table. To mitigate, set the timeout to 600 seconds using `sysctl -w net.netfilter.nf_conntrack_tcp_timeout_established=600`. The NAT instance's security group must allow inbound traffic from the private subnet on all ports (or specific ports) and outbound traffic to the internet. The instance's network performance depends on the instance type; for example, a t3.medium can handle ~1 Gbps.

5

Implement High Availability for NAT Instance

To achieve high availability with NAT instances, use an Auto Scaling group with a minimum of 1 instance, a health check, and a lifecycle hook. When an instance fails, Auto Scaling launches a new one. However, the route table still points to the failed instance (or its network interface). To update the route table automatically, use an AWS Lambda function triggered by the Auto Scaling lifecycle event (e.g., instance launch). The Lambda function replaces the default route so that it targets the new instance. This process takes 1-2 minutes. Alternatively, run a NAT instance in each AZ with a separate route table per AZ. The exam expects you to recognize that NAT Gateway provides simpler HA: deploy one gateway per AZ and route each private subnet to its AZ's gateway. This avoids the complexity of scripting route table updates and reduces downtime.
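The core of the route-update function described above could look like the sketch below. It assumes the caller passes in an EC2 client such as `boto3.client("ec2")`, whose `replace_route` call accepts `RouteTableId`, `DestinationCidrBlock`, and `InstanceId`. How the new instance ID and the route table IDs are extracted from the lifecycle event is left out, and the IDs shown are placeholders. A small recording stand-in replaces the real client so the sketch runs without AWS credentials.

```python
def repoint_default_route(ec2, route_table_ids, new_instance_id):
    """Point 0.0.0.0/0 in each route table at the replacement NAT instance."""
    for route_table_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=route_table_id,
            DestinationCidrBlock="0.0.0.0/0",
            InstanceId=new_instance_id,
        )
    return len(route_table_ids)


class RecordingClient:
    """Stand-in for an EC2 client so the sketch can be dry-run offline."""

    def __init__(self):
        self.calls = []

    def replace_route(self, **kwargs):
        self.calls.append(kwargs)


# Dry run with placeholder IDs; in Lambda you would pass boto3.client("ec2").
fake = RecordingClient()
updated = repoint_default_route(
    fake, ["rtb-0aaa1111bbbb2222c"], "i-0123456789abcdef0"
)
print(updated, fake.calls)
```

The replacement instance also needs Source/Destination Check disabled before it can forward traffic, so a real handler would typically do that first (or bake it into the launch process) before swapping the route.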

What This Looks Like on the Job

Enterprise Scenario 1: E-Commerce Platform with Variable Traffic

A large e-commerce company runs its application servers in private subnets across three Availability Zones (us-east-1a, 1b, 1c). The application needs outbound internet access to validate credit cards via an external API and to download security updates. Traffic is bursty: on Black Friday, outbound traffic peaks at 30 Gbps; on normal days it is far lower, and the monthly total is about 30 TB (an average of roughly 90 Mbps). The company initially used a single NAT instance (c5n.large) in one AZ, but during peak the instance saturated at about 3 Gbps, causing timeouts. They switched to three NAT Gateways (one per AZ), with each private subnet routing to its own AZ's gateway. Each gateway scales automatically to as much as 45 Gbps, so the three together absorb the peak. Cost: $0.045/hr × 3 × 730 hrs = $98.55/month for the gateways, plus $0.045/GB × 30,000 GB = $1,350 for data processing. Total ~$1,448.55/month. With NAT instances sized for the peak, they would run three c5n.18xlarge instances (one per AZ) at $3.888/hr each, about $8,515/month, before data transfer costs. The NAT Gateway solution was cheaper and required no patching or scaling management.

Enterprise Scenario 2: Dev/Test Environment with Low Traffic

A startup runs a small dev/test environment with 10 EC2 instances in a single AZ. Outbound traffic is about 50 GB/month. They use a NAT instance (t3.nano) costing $0.0052/hr × 730 hrs = $3.80/month, with no per-GB NAT fee. A NAT Gateway would cost $0.045/hr × 730 = $32.85/month plus $0.045/GB × 50 = $2.25, total ~$35.10/month, roughly 9x more expensive. Standard data transfer out charges are the same for both options, so they are left out of the comparison. In this low-traffic scenario, the NAT instance is the cost-effective choice. However, they must manage patching and accept that the instance is a single point of failure. They use a simple script to replace the instance if it fails, accepting 5 minutes of downtime. This is acceptable for dev/test.

Enterprise Scenario 3: Regulated Financial Services with Strict Security

A bank requires all outbound traffic to be inspected by a third-party security appliance (e.g., Check Point or Palo Alto). They cannot use NAT Gateway because it does not support security groups or deep packet inspection. Instead, they deploy a NAT instance (or a marketplace firewall AMI) that routes traffic through the security appliance. The NAT instance is placed in a public subnet, and the security appliance is in a separate inspection VPC. The bank uses a Transit Gateway with a central egress VPC. The NAT instance is sized to handle 10 Gbps of encrypted traffic. They implement Auto Scaling with a custom AMI that includes the security software. Despite the operational overhead, the NAT instance is the only option that meets compliance requirements. The exam may present a scenario where security requirements force the use of a NAT instance over a NAT Gateway.

How SAA-C03 Actually Tests This

What the SAA-C03 Tests

SAA-C03 Objective 4.4: "Design cost-optimized network architectures." Under this, NAT Gateway vs. NAT Instance is a classic cost optimization topic. The exam tests your ability to recommend the most cost-effective solution given traffic patterns, availability requirements, and operational overhead. Specific skills include:

Compare pricing models (hourly vs. per-GB).

Evaluate total cost of ownership (including management time).

Identify when a NAT instance is cheaper (low traffic, non-critical) and when NAT Gateway is cheaper (high traffic, critical).

Understand that NAT Gateway scales automatically while NAT Instance requires manual scaling.

Recognize that NAT Gateway is managed by AWS, reducing operational overhead.

Common Wrong Answers and Why Candidates Choose Them

1.

"NAT Instance is always cheaper." Candidates see the low hourly cost of a t3.nano and forget that a NAT instance sized for heavy traffic costs far more per hour, which can outweigh the NAT Gateway's per-GB data processing fee. They also ignore the cost of high availability (multiple instances plus failover automation).

2.

"NAT Gateway supports security groups." This is a common mistake. NAT Gateway does not support security groups; it relies on NACLs. Candidates assume it does because it's a managed service.

3.

"NAT Instance provides better performance." While a large NAT instance can outperform a single NAT Gateway, the gateway scales automatically to 45 Gbps per AZ. Candidates often forget that NAT Instance performance is limited by instance type and requires manual scaling.

4.

"Deploy one NAT Gateway in one AZ and route all private subnets to it." This creates a single point of failure and inter-AZ data transfer costs. The correct design is one NAT Gateway per AZ.

Specific Numbers and Values to Memorize

NAT Gateway hourly cost: ~$0.045/hr (varies by region).

NAT Gateway data processing cost: ~$0.045/GB.

NAT Gateway idle timeout: 350 seconds.

NAT Gateway max connections: 55,000 concurrent connections to each unique destination.

NAT Gateway bandwidth: starts at 5 Gbps and scales automatically, up to 45 Gbps per AZ.

NAT Instance: no NAT-specific hourly or per-GB fee; you pay for the EC2 instance and its public IPv4 address.

NAT Instance default conntrack timeout: 432,000 seconds (5 days) — must be reduced for NAT workloads.

NAT Instance requires disabling Source/Destination Check.

Edge Cases and Exceptions

If you need NAT for on-premises traffic as well as internet traffic, a public NAT Gateway only reaches the internet. Use a private NAT Gateway, a NAT instance, or a virtual appliance for the on-premises path.

For IPv6, NAT is not used — use Egress-Only Internet Gateway instead.

NAT Gateway does not support Port Address Translation (PAT) customization; NAT Instance allows full control.

In a VPC with multiple route tables, each private subnet can point to a different NAT Gateway. This is used for cost allocation or traffic segregation.

How to Eliminate Wrong Answers

If the question mentions "high availability" and "minimal management," eliminate NAT Instance.

If the question mentions "low traffic" and "cost-sensitive," eliminate NAT Gateway.

If the question mentions "security group" restrictions, eliminate NAT Gateway.

If the question mentions "custom packet inspection" or "third-party firewall," eliminate NAT Gateway.

If the question mentions throughput that must grow on its own ("automatic scaling"), eliminate NAT Instance.
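The elimination rules above can be written down as a small decision helper. This is a study aid only: the `recommend_nat` function and its flag names are made up for illustration, and the order in which the rules are checked (hard technical requirements first, then availability, then cost) is an assumption about how to resolve conflicting hints, not something the exam specifies.

```python
def recommend_nat(
    *,
    needs_security_groups=False,
    needs_custom_inspection=False,
    needs_ha_minimal_ops=False,
    needs_auto_scaling=False,
    low_traffic_cost_sensitive=False,
):
    """Apply the chapter's elimination rules in a fixed priority order."""
    # Hard requirements that a NAT Gateway cannot meet come first.
    if needs_security_groups or needs_custom_inspection:
        return "NAT instance or third-party appliance"
    # Availability and scaling requirements rule out a self-managed instance.
    if needs_ha_minimal_ops or needs_auto_scaling:
        return "NAT Gateway (one per AZ)"
    # Only then does the low-traffic cost argument apply.
    if low_traffic_cost_sensitive:
        return "NAT instance"
    return "NAT Gateway"


print(recommend_nat(low_traffic_cost_sensitive=True))
print(recommend_nat(needs_ha_minimal_ops=True, low_traffic_cost_sensitive=True))
print(recommend_nat(needs_custom_inspection=True, needs_ha_minimal_ops=True))
```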

Key Takeaways

NAT Gateway costs ~$0.045/hr + $0.045/GB data processing; NAT Instance costs only EC2 compute + data transfer.

NAT Gateway scales automatically to 45 Gbps per AZ; NAT Instance throughput is limited by instance type.

NAT Gateway is highly available within an AZ; for multi-AZ, deploy one per AZ.

NAT Instance requires disabling Source/Destination Check and configuring iptables.

NAT Gateway does not support security groups; use NACLs for traffic filtering.

For low-traffic (<100 GB/month) and non-critical workloads, NAT Instance is cost-effective.

For high-traffic (>1 TB/month) or production workloads, NAT Gateway is recommended despite higher base cost.

NAT Gateway idle timeout is 350 seconds; NAT Instance default conntrack timeout is 5 days (must be reduced).

NAT Gateway supports up to 55,000 concurrent connections to each unique destination.

Use Egress-Only Internet Gateway for IPv6 instead of NAT.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

NAT Gateway

Managed service: no patching, no maintenance.

Hourly charge ~$0.045/hr + $0.045/GB data processing.

Auto-scales to 45 Gbps per AZ; 55,000 concurrent connections to each unique destination.

Highly available within an AZ; must deploy per AZ for multi-AZ.

No security groups; relies on NACLs.

NAT Instance

Self-managed: you patch, monitor, and manage the instance.

EC2 cost only (e.g., t3.nano ~$0.0052/hr) + standard data transfer.

Throughput limited by instance type (e.g., t3.medium ~1 Gbps).

Single point of failure unless you implement Auto Scaling and route updates.

Supports security groups for fine-grained traffic control.

Watch Out for These

Mistake

NAT Gateway is always more expensive than NAT Instance.

Correct

For high-traffic workloads (>1 TB/month), NAT Gateway can be cheaper because it eliminates the need to over-provision EC2 instances and reduces operational overhead. The hourly cost is fixed, but data processing cost is per-GB. For low-traffic workloads, NAT Instance is cheaper.

Mistake

NAT Gateway supports security groups.

Correct

NAT Gateway does not support security groups. Traffic filtering is done at the subnet level using Network ACLs. NAT Instance does support security groups, which can be used to restrict inbound/outbound traffic.

Mistake

A single NAT Gateway in one AZ provides high availability across all AZs.

Correct

NAT Gateway is highly available only within its AZ. If the AZ fails, the gateway fails. For multi-AZ resilience, deploy one NAT Gateway per AZ and route private subnets to the gateway in the same AZ.

Mistake

NAT Instance can handle the same throughput as NAT Gateway without scaling.

Correct

NAT Gateway automatically scales to 45 Gbps per AZ. NAT Instance throughput is limited by the instance type (e.g., t3.medium ~1 Gbps; c5n.18xlarge is rated at 100 Gbps of network bandwidth). You must manually scale the instance or add more instances.

Mistake

NAT Gateway and NAT Instance both support ICMP.

Correct

NAT Gateway supports ICMP but it is not recommended because ICMP is connectionless and can cause issues with connection tracking. NAT Instance also supports ICMP but with similar caveats. The exam rarely tests ICMP details; focus on TCP/UDP.


Frequently Asked Questions

When should I use a NAT Gateway instead of a NAT Instance?

Use NAT Gateway when you need high availability, automatic scaling, and minimal operational overhead. It is ideal for production workloads with moderate to high traffic (e.g., >1 TB/month). NAT Instance is better for low-traffic, non-critical environments (e.g., dev/test) where cost is the primary concern and you can tolerate some downtime. Also, if you need security groups or custom packet inspection, NAT Instance (or a third-party appliance) is required.

How much does a NAT Gateway cost per month?

Approximately $32.85/month for the hourly fee (assuming 730 hours at $0.045/hr) plus $0.045 per GB of data processed. For example, processing 1 TB/month adds ~$46.08, for a total of ~$78.93/month per gateway. Prices vary by region. Always check the current AWS pricing page.

Can I use a NAT Gateway for IPv6 traffic?

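No. NAT Gateway does not provide outbound-only internet access for IPv6. For IPv6 outbound traffic from a private subnet, use an Egress-Only Internet Gateway (EIGW). Address translation is not needed for IPv6 because IPv6 addresses are globally unique. (NAT Gateway's NAT64 support is a separate feature that lets IPv6-only workloads reach IPv4 destinations.)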

Does NAT Gateway support port forwarding?

No. NAT Gateway only performs source NAT (SNAT) for outbound traffic. It does not support port forwarding or destination NAT (DNAT). For inbound port forwarding, use a Network Load Balancer or a NAT instance with iptables.

What happens if a NAT Gateway fails?

NAT Gateway is highly available within its AZ. If the underlying hardware fails, AWS automatically redirects traffic to another appliance in the same AZ. However, if the entire AZ fails, the gateway becomes unavailable. To protect against AZ failure, deploy one NAT Gateway per AZ and route traffic accordingly.

Can I use a NAT Instance in a public subnet with an Elastic IP?

Yes. A NAT instance must be in a public subnet with an Elastic IP (or a public IP) to access the internet, and that subnet must have a route to an Internet Gateway. The private subnet route table points 0.0.0.0/0 at the NAT instance, using its instance ID or network interface ID as the target.

How do I monitor NAT Gateway traffic?

Use Amazon CloudWatch metrics: BytesOutToSource, BytesInFromDestination, PacketsOutToSource, ConnectionAttemptCount, IdleTimeoutCount. You can also enable VPC Flow Logs to capture traffic metadata. For NAT Instance, monitor standard EC2 metrics (CPU, network) and use iptables logs.
