
Network Capacity Planning

This chapter covers network capacity planning, a critical skill for ensuring network performance and availability. For the N10-009 exam, this topic falls under Domain 3.0 (Network Operations), objective 3.3, and typically appears in 5-8% of questions. You'll need to understand how to measure current utilization, forecast future growth, and identify bottlenecks using tools like SNMP, NetFlow, and bandwidth monitors. We'll cover key metrics, thresholds, and best practices to design scalable networks.

Updated May 31, 2026

Network Capacity as a Highway System

Network capacity planning is like designing a highway system for a growing city. The 'roadway' is your network infrastructure—switches, routers, and links. Each lane represents a unit of bandwidth (e.g., 1 Gbps). During peak hour, traffic (data packets) flows from suburbs (end users) to downtown (servers). If too many cars enter the highway, congestion occurs—packets queue up (buffer bloat) or are dropped. A traffic engineer (network architect) must measure current traffic volume (baseline), predict future growth (e.g., 20% annual increase), and add lanes (upgrade links) or build alternate routes (load balancing, QoS). They also install on-ramp meters (rate limiting) and monitor traffic cameras (SNMP, NetFlow) to detect bottlenecks. If they only add lanes to the main highway but ignore intersections (core switches) or off-ramps (server NICs), congestion shifts. Similarly, upgrading a WAN link from 100 Mbps to 1 Gbps without upgrading the router's CPU or firewall throughput just moves the bottleneck. Proper capacity planning involves end-to-end analysis, considering peak utilization (typically 70% threshold for proactive upgrade), application demands, and headroom for failover.

How It Actually Works

What is Network Capacity Planning?

Network capacity planning is the process of determining the network resources required to meet current and future traffic demands while maintaining acceptable performance. It involves analyzing traffic patterns, measuring utilization, forecasting growth, and implementing upgrades before congestion occurs. The goal is to avoid performance degradation, packet loss, and downtime by ensuring that network links, devices, and services have adequate headroom.

Why It Exists

Networks are dynamic. User counts, application usage, and data volumes grow over time. Without planning, links become saturated, causing latency spikes, jitter, and packet drops. Capacity planning proactively identifies when and where upgrades are needed, aligning network investments with business needs. It also helps in budgeting and avoiding emergency upgrades.

Key Metrics and Measurements

Bandwidth Utilization: Percentage of link capacity used over a time interval, calculated as the measured traffic rate in bits per second (bps) divided by the link speed. Typical threshold: 70% average utilization triggers planning; 90% requires immediate action.

Throughput: Actual data transfer rate, often less than bandwidth due to protocol overhead.

Packet Loss: Percentage of packets dropped, usually due to buffer overflow. Even 1% loss can degrade TCP performance significantly.

Latency: Time a packet takes to cross the network, usually reported as round-trip time (RTT). Increases with queue buildup.

Jitter: Variation in latency, critical for real-time applications like VoIP.

Concurrent Connections: Number of active sessions a device (e.g., firewall, load balancer) can handle.

CPU/Memory Utilization: On routers, switches, firewalls. High CPU can cause packet drops even if link bandwidth is free.

Measurement Tools and Protocols

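SNMP (Simple Network Management Protocol): Polls devices for interface counters (bytes in/out, errors, discards). Standard MIB-II (RFC 1213) provides ifInOctets and ifOutOctets; on gigabit and faster links, use the 64-bit ifHCInOctets and ifHCOutOctets counters from the IF-MIB (RFC 2863), because the 32-bit counters can wrap between polls. Poll interval typically 5 minutes. Tools: PRTG, SolarWinds, Cacti.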

NetFlow / IPFIX: Provides flow-level data—source/destination IP, ports, protocol, packet/byte counts. Cisco NetFlow v5/v9, IPFIX (RFC 7011). Sampling rate often 1:1000 to reduce overhead.

sFlow: Packet sampling technology, exports header samples. Low overhead, good for high-speed links.

RMON (Remote Monitoring): Extends SNMP with history and alarm groups. RMON1 (RFC 2819, which obsoleted RFC 1757) monitors Ethernet segments.

Bandwidth Test Tools: iPerf3, iPerf2 for active throughput measurement. Uses TCP or UDP streams.

Packet Captures: Wireshark, tcpdump for deep analysis of traffic patterns.

Baselining

Baselining is the process of capturing normal traffic patterns over a representative period (e.g., one week). This includes peak hours, average utilization, and application mix. Baselining helps set thresholds for alerts and provides a reference for anomaly detection. For example, if baseline shows 40% average utilization on a link, a sudden jump to 80% may indicate a problem or growth.

Forecasting Growth

Growth can be linear, exponential, or seasonal. Common methods:

Trend Analysis: Plot historical utilization and extend the curve. Use regression or moving averages.

Business Drivers: New office openings, application rollouts, mergers. Multiply current usage by expected user growth.

Compound Annual Growth Rate (CAGR): (Ending Value / Beginning Value)^(1/n) - 1. Example: If bandwidth usage grew from 100 Mbps to 200 Mbps over 2 years, CAGR = (200/100)^(1/2) - 1 = 41.4% per year.
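As a rough illustration of the CAGR method, the short Python sketch below reproduces the 100 Mbps to 200 Mbps example and then compounds the result one more year. The function names cagr and project are made up for this example and do not come from any monitoring tool.

```python
def cagr(start: float, end: float, years: float) -> float:
    """Compound annual growth rate as a fraction (0.414 means 41.4%)."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


def project(current: float, rate: float, years: float) -> float:
    """Value after compounding `rate` for `years` (Future = Current * (1 + rate)^years)."""
    return current * (1 + rate) ** years


rate = cagr(100, 200, 2)  # usage doubled over two years
print(f"CAGR: {rate:.1%}")  # CAGR: 41.4%
print(f"Projected usage one year later: {project(200, rate, 1):.0f} Mbps")
```

Compounding matters here: a linear estimate of 50 Mbps per year would understate the following year's usage.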

Capacity Planning Process

1. Data Collection: Gather utilization data from all critical links and devices using SNMP or NetFlow. Collect at least 3-6 months of data.

2. Baseline Analysis: Identify peak utilization periods, average load, and top talkers.

3. Threshold Definition: Set warning (70%) and critical (90%) thresholds for bandwidth. For CPU, warning at 80%, critical at 90%.

4. Forecast: Apply growth rate to baseline. For example, if current peak is 600 Mbps on a 1 Gbps link and growth is 30% per year, in one year peak = 780 Mbps, still under 1 Gbps; in two years, peak = 1014 Mbps, exceeding capacity.

5. Upgrade Planning: Schedule upgrades (e.g., 10 Gbps) or implement traffic engineering (QoS, load balancing) before the forecasted saturation date.

6. Implementation and Monitoring: After upgrade, continue monitoring to validate the new capacity.
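The forecast step can be sketched in a few lines of Python. This is a minimal illustration that assumes growth compounds once per year and only checks whole years; the function name first_year_exceeding is hypothetical.

```python
def first_year_exceeding(peak_mbps, growth, limit_mbps, max_years=50):
    """Return the first whole year in which the projected peak exceeds limit_mbps.

    Assumes growth compounds annually. Returns None if the limit is not
    crossed within max_years.
    """
    for year in range(1, max_years + 1):
        if peak_mbps * (1 + growth) ** year > limit_mbps:
            return year
    return None


# The example from step 4: 600 Mbps peak on a 1 Gbps link, 30% annual growth.
print(first_year_exceeding(600, 0.30, 1000))  # 2 (780 Mbps, then 1014 Mbps)
# The same link checked against the 70% warning threshold (700 Mbps).
print(first_year_exceeding(600, 0.30, 700))   # 1
```

Running the check against the warning threshold rather than raw link capacity gives the earlier date by which the upgrade should be scheduled.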

Bottleneck Analysis

A bottleneck is the slowest component in the data path. Common bottlenecks:

WAN Links: Often the first to saturate due to limited bandwidth (e.g., 100 Mbps MPLS).

Router/Switch Backplane: Switching capacity may be less than the sum of port speeds. Example: A 24-port 1 Gbps switch with a 32 Gbps backplane can handle full line rate on all ports simultaneously (24 Gbps), but if the backplane is 20 Gbps, it's oversubscribed.

Firewall Throughput: Stateful inspection limits throughput. A firewall rated for 1 Gbps may only achieve 500 Mbps with all features enabled.

Server NICs: A single 1 Gbps NIC can be a bottleneck for a server handling many clients.

CPU/Memory: Routers with large routing tables or long ACLs may drop packets due to CPU overload.
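Because end-to-end throughput is capped by the slowest hop, a first-pass bottleneck check is just a minimum over the path. The sketch below uses the 1 Gbps WAN link and 500 Mbps firewall figures from the text; the router and server NIC values are assumed purely for illustration.

```python
# Effective capacity of each component in the path, in Mbps.
# Router and server NIC figures are illustrative assumptions.
path_mbps = {
    "WAN link": 1000,
    "router": 2000,
    "firewall (all features enabled)": 500,
    "server NIC": 1000,
}


def find_bottleneck(path):
    """Return (component, capacity) for the slowest component in the path."""
    name = min(path, key=path.get)
    return name, path[name]


component, capacity = find_bottleneck(path_mbps)
print(f"Bottleneck: {component} at {capacity} Mbps")
```

In practice the per-component numbers would come from measured throughput under the real feature set, not datasheet maximums.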

Capacity Planning for Wireless Networks

Wireless capacity depends on channel utilization, number of clients, and interference. Key metrics:

Channel Utilization: Percentage of time the medium is busy. Above 50% indicates potential congestion.

Client Count: Each client adds overhead. A typical AP can handle 25-30 clients for data, fewer for voice/video.

Signal-to-Noise Ratio (SNR): Low SNR forces lower data rates, reducing overall capacity.

Co-Channel Interference: Multiple APs on the same channel share airtime.

Planning involves conducting a site survey, selecting channels (1, 6, 11 for 2.4 GHz), and using load balancing (e.g., band steering).

Virtualization and Cloud Considerations

In virtualized environments, capacity planning must account for:

vSwitch Bandwidth: Virtual switches share physical NICs. Ensure sufficient uplinks (e.g., 4 x 10 Gbps for hypervisor).

VNF Performance: Virtual network functions (firewalls, routers) consume host CPU/memory. Overcommit ratios must be monitored.

Cloud Bandwidth: Direct peering, VPN bandwidth, and internet egress costs. Use cloud monitoring tools (AWS CloudWatch, Azure Monitor).

Best Practices

Monitor continuously, not just during problems.

Use a baseline period of at least one week.

Set thresholds with hysteresis to avoid alert flapping.

Plan for peak usage, not average.

Include failover capacity: if one link fails, the other must handle the combined load. For example, with two 1 Gbps links each at 40% utilization, the surviving link runs at 80% after a failure.

Document capacity plans and review quarterly.
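The failover rule in the list above can be checked numerically. The sketch below assumes equal-capacity links and that traffic redistributes evenly across the survivors, which real load balancing only approximates; utilization_after_failure is a hypothetical helper name.

```python
def utilization_after_failure(loads_mbps, capacity_mbps):
    """Utilization (0.0-1.0+) of the surviving links after one link fails.

    Assumes all links have the same capacity and that the total load
    spreads evenly over the remaining links.
    """
    if len(loads_mbps) < 2:
        raise ValueError("need at least two links for failover")
    survivors = len(loads_mbps) - 1
    return sum(loads_mbps) / (survivors * capacity_mbps)


# Two 1 Gbps links, each carrying 400 Mbps (40%).
after = utilization_after_failure([400, 400], 1000)
print(f"Surviving link utilization: {after:.0%}")  # 80%
```

A result above 1.0 means the remaining links cannot carry the load at all; a result above the critical threshold means failover works but leaves no headroom.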

Verification Commands

On Cisco IOS:

show interfaces [interface] – displays input/output rate, errors.

show processes cpu – CPU utilization.

show ip cache flow – NetFlow statistics.

show interfaces stats – packet counts.

On Linux:

ifstat – interface utilization.

sar -n DEV 1 10 – network statistics, sampled every second for 10 samples.

nload – real-time bandwidth usage.

iptraf-ng – traffic monitoring.

Traffic Shaping and QoS

When capacity cannot be upgraded immediately, traffic shaping and QoS can prioritize critical traffic. For example, mark VoIP as EF (Expedited Forwarding) and limit bulk traffic. However, QoS does not create bandwidth—it only manages congestion.
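To make the EF marking concrete, the Python sketch below shows how the DSCP value maps onto the IP header byte and how an application could request that marking on a UDP socket. It assumes a platform where socket.IP_TOS is available (Linux, for example), and network devices are free to ignore or rewrite the marking. The function names are illustrative.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper 6 bits of the IPv4 TOS byte, so shift left by 2."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp: int = EF_DSCP) -> socket.socket:
    """Create a UDP socket whose outgoing packets request the given DSCP.

    Assumes the OS exposes IP_TOS; the marking only helps if routers
    and switches along the path are configured to trust and act on it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(dscp_to_tos(EF_DSCP))  # 184 (0xB8)
```

Marking alone does nothing for capacity: the queues that honor EF still share the same link.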

Common Pitfalls

Overlooking Overhead: On 1 Gbps Ethernet, headers, the preamble, and inter-frame gaps consume roughly 50-60 Mbps with full-size frames. Actual TCP throughput is ~940 Mbps.

Ignoring Burst Traffic: Network traffic is bursty. A link may average 50% but spike to 95% for seconds. Consider burst capacity.

Assuming Symmetry: Upload and download patterns differ (e.g., typical web browsing: download-heavy). Plan asymmetrically.

Neglecting Management Traffic: SNMP polls, backups, and updates consume bandwidth.
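The ~940 Mbps figure in the overhead pitfall can be derived from per-frame byte counts. The sketch below assumes untagged Ethernet frames, IPv4 and TCP headers without IP options, and a 1500-byte MTU; the optional 12 bytes represent TCP timestamps, which many operating systems enable by default.

```python
# Per-frame overhead in bytes (untagged Ethernet, IPv4, TCP).
PREAMBLE_SFD = 8   # preamble plus start-of-frame delimiter
ETH_HEADER = 14    # destination MAC, source MAC, EtherType
FCS = 4            # frame check sequence
IFG = 12           # inter-frame gap, expressed in byte times
IP_HEADER = 20
TCP_HEADER = 20


def tcp_goodput_mbps(link_mbps, mtu=1500, tcp_options=0):
    """Best-case TCP payload rate on a link running full-size frames."""
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_options
    on_wire = mtu + ETH_HEADER + FCS + PREAMBLE_SFD + IFG
    return link_mbps * payload / on_wire


print(f"{tcp_goodput_mbps(1000):.0f} Mbps")                  # 949 Mbps
print(f"{tcp_goodput_mbps(1000, tcp_options=12):.0f} Mbps")  # 941 Mbps
```

Smaller packets push the ratio down further, so links carrying mostly small frames (VoIP, for instance) deliver noticeably less payload than this best case.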

Exam Relevance (N10-009 Objective 3.3)

The exam tests your ability to:

Identify tools for capacity planning (SNMP, NetFlow, sFlow, iPerf).

Interpret utilization graphs and determine when to upgrade.

Understand thresholds (70% warning, 90% critical).

Recognize bottlenecks (e.g., a server with 100 Mbps NIC on a 1 Gbps network).

Apply growth forecasting (CAGR).

Sample question: "A network engineer monitors a 1 Gbps link that averages 600 Mbps. Traffic grows 25% per year. In how many years will the link exceed capacity?" Answer: In 3 years (600 * 1.25^2 = 937.5 Mbps, still under 1 Gbps; 600 * 1.25^3 = 1171.875 Mbps, exceeds 1 Gbps).

Walk-Through

1. Collect Baseline Data

Begin by gathering utilization data from all critical network interfaces using SNMP polling every 5 minutes. Record the ifInOctets and ifOutOctets counters, or the 64-bit ifHCInOctets and ifHCOutOctets on gigabit and faster links, where a 32-bit counter can wrap within a single poll interval. Collect data for at least one full week to capture daily and weekly patterns. Tools like PRTG or SolarWinds can graph average, peak, and 95th percentile utilization. For example, a 1 Gbps link might show average 300 Mbps, peak 700 Mbps at 2 PM. Also collect error counters (CRC, collisions) to identify physical issues. Store data in a time-series database for trend analysis.
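Monitoring tools turn two successive octet-counter readings into a utilization figure. The Python sketch below shows that arithmetic, including tolerance for a single counter wrap. It is an illustration of the calculation only, not any tool's actual code, and it performs no SNMP queries itself.

```python
def utilization_pct(octets_start, octets_end, interval_s, link_bps, counter_bits=64):
    """Average utilization (%) between two readings of an octet counter.

    The modulo handles one counter wrap between polls; multiple wraps
    (possible with 32-bit counters on fast links) cannot be detected.
    """
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 * 100 / (interval_s * link_bps)


# 11.25 GB transferred in a 5-minute poll on a 1 Gbps link = 300 Mbps average.
print(f"{utilization_pct(0, 11_250_000_000, 300, 1_000_000_000):.1f}%")  # 30.0%
```

The result is an average over the poll interval, so a 5-minute poll hides any burst shorter than that.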

2. Analyze Traffic Patterns

Identify peak hours (e.g., 9-11 AM and 2-4 PM) and top talkers (hosts consuming most bandwidth). Use NetFlow to see application breakdown: HTTP, VoIP, backups. Determine whether traffic is inbound-heavy or outbound-heavy. For example, a web server may send 80% of its traffic outbound to clients, because responses are much larger than requests. Also note seasonal patterns (e.g., month-end reporting). Calculate average, peak, and 95th percentile utilization. The 95th percentile is often used for billing and capacity planning because it excludes short bursts.
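The 95th percentile is straightforward to compute from polled samples. The sketch below uses the nearest-rank method (sort, discard the top 5%, take the highest remaining sample); billing systems and monitoring tools may interpolate differently, so treat this as one reasonable convention rather than a universal definition.

```python
def percentile_95(samples):
    """95th percentile by nearest rank: the value at or below which 95% of samples fall."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) using integer math
    return ordered[rank - 1]


# Twenty 5-minute samples in Mbps: steady load with two short bursts.
samples = [300] * 18 + [700, 950]
print(max(samples))            # 950
print(percentile_95(samples))  # 700
```

Here the single highest burst is discarded, so the planning figure is 700 Mbps rather than the 950 Mbps absolute peak.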

3. Set Utilization Thresholds

Define warning and critical thresholds based on industry best practices. For bandwidth, set warning at 70% average utilization over 5 minutes, and critical at 90%. For CPU on routers/switches, warning at 80%, critical at 90%. Use SNMP trap thresholds or monitoring tool alerts. Configure hysteresis: e.g., clear alert when utilization falls below 65%. Document thresholds and review them quarterly. Overly sensitive thresholds cause alert fatigue; too high thresholds miss problems.
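Hysteresis is easier to see in code than in prose. This minimal sketch raises an alert at 70% and clears it only below 65%, matching the example values above; alert_states is a hypothetical function, not a feature of any particular monitoring product.

```python
def alert_states(samples_pct, raise_at=70.0, clear_at=65.0):
    """Return the alert state (True = active) after each utilization sample.

    The alert turns on at raise_at and stays on until utilization
    drops below clear_at, which prevents flapping near the threshold.
    """
    active = False
    states = []
    for utilization in samples_pct:
        if not active and utilization >= raise_at:
            active = True
        elif active and utilization < clear_at:
            active = False
        states.append(active)
    return states


print(alert_states([60, 72, 68, 71, 66, 64, 69]))
# [False, True, True, True, True, False, False]
```

With a single 70% threshold and no hysteresis, the same samples would raise and clear the alert twice.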

4. Forecast Future Growth

Apply a growth rate to the current baseline. Use historical data to calculate CAGR. For example, if bandwidth usage grew from 200 Mbps to 400 Mbps over 2 years, CAGR = 41.4% per year. Project future peak: Current peak = 700 Mbps, growth = 40% per year. Year 1 peak = 980 Mbps (under 1 Gbps, but already past the 90% critical threshold), Year 2 peak = 1372 Mbps (exceeds 1 Gbps). Because the link is already at the 70% warning threshold today, plan the upgrade to 10 Gbps during Year 1 rather than waiting for Year 2. Also consider business drivers: a new office opening may add 100 users, increasing traffic by 50 Mbps.

5. Implement Upgrades or Optimization

Based on the forecast, schedule upgrades: increase link speed (e.g., 1 Gbps to 10 Gbps), add links (link aggregation), upgrade router/firewall throughput, or deploy QoS to prioritize critical traffic. For example, if a firewall is the bottleneck (CPU at 95% during peak), upgrade to a model with higher throughput. If a WAN link is saturated, implement traffic shaping to limit non-critical traffic (e.g., YouTube) and ensure VoIP gets priority. After the upgrade, monitor to confirm the new capacity resolves the bottleneck.

What This Looks Like on the Job

Scenario 1: Enterprise Headquarters to Branch Office

A multinational company has 50 branch offices connected via MPLS links (100 Mbps each). The network team uses SolarWinds to monitor utilization. Over six months, they notice the London office link averages 85% utilization during business hours with 2% packet loss. They collect NetFlow data and discover that a new video conferencing system is consuming 60 Mbps. The growth forecast shows 30% annual increase. The team upgrades the link to 500 Mbps and implements QoS to guarantee 50 Mbps for video. They also add a secondary 4G LTE backup link that activates when primary exceeds 90%. This ensures no disruption during peak times. The cost of upgrade is justified by preventing productivity loss.

Scenario 2: Data Center Core Upgrade

A hosting provider has a data center with multiple 40 Gbps uplinks to the internet. They monitor interface counters and see aggregate traffic averaging 30 Gbps, with a peak of 55 Gbps. The 95th percentile is 50 Gbps. They plan to add a new customer expecting 20 Gbps traffic. Using CAGR of 25%, they forecast that within 18 months, peak will exceed 60 Gbps. They upgrade to 100 Gbps links and also add a load balancer to distribute traffic across multiple routers. They also implement sFlow to monitor flow-level data without overwhelming the switch CPU. The upgrade is done during a maintenance window, and post-upgrade monitoring shows peak utilization at 45% of new capacity, providing headroom for future growth.

Scenario 3: University Campus Wireless

A university with 10,000 students uses 500 APs across campus. They use Ekahau for site surveys and monitoring. During registration week, they get complaints of slow Wi-Fi. They check channel utilization and find that channel 6 in the library has 80% utilization with 50 clients per AP. They adjust channel allocation (use 5 GHz band more aggressively), add additional APs in high-density areas, and enable band steering to push dual-band clients to 5 GHz. They also implement client load balancing with a threshold of 30 clients per AP. After changes, channel utilization drops to 40% and complaints stop. They set up SNMP traps to alert when channel utilization exceeds 60%.

How N10-009 Actually Tests This

The N10-009 exam tests objective 3.3: 'Given a scenario, perform network capacity planning.' You must know the tools, metrics, thresholds, and process. The most common wrong answers involve confusing bandwidth with throughput, ignoring overhead, or misapplying growth formulas.

Common Wrong Answers:

1. Choosing 'bandwidth' as the only metric: Candidates think capacity planning is just about link speed, but CPU, memory, and concurrent connections are equally important. The exam will present a scenario where a link is underutilized but the router is dropping packets due to high CPU.

2. Assuming 1 Gbps link delivers 1 Gbps throughput: The correct answer accounts for overhead (Ethernet, IP, TCP). Actual throughput is ~940 Mbps. The exam may ask for the maximum throughput of a 1 Gbps link.

3. Using average instead of peak: A question may show average utilization of 40% but peaks of 95% and ask if upgrade is needed. The correct answer is yes because peaks cause packet loss. Candidates often pick 'no' based on average.

4. Misapplying growth rate: For example, if current peak is 600 Mbps on 1 Gbps link and growth is 20% per year, some candidates calculate 600 * 1.2 = 720 Mbps and think it's fine, but they forget compound growth over multiple years. The exam may ask 'in how many years will it exceed?' and the answer is 3 years (600 * 1.2^3 = 1036.8 Mbps).

Specific Numbers and Terms:

Thresholds: 70% warning, 90% critical for bandwidth; 80% warning, 90% critical for CPU.

SNMP polling interval: typically 5 minutes.

NetFlow v5/v9, IPFIX, sFlow.

95th percentile billing.

CAGR formula.

Tools: PRTG, SolarWinds, Cacti, MRTG, iPerf, Wireshark.

Edge Cases:

Asymmetric links: e.g., cable internet (download 100 Mbps, upload 10 Mbps). Capacity planning must consider both directions.

Oversubscription: In switch design, the backplane may be less than the sum of port speeds. Exam may ask: 'A 48-port 1 Gbps switch has a 32 Gbps backplane. What is the oversubscription ratio?' Answer: 48/32 = 1.5:1.

Virtualization: Host with four 10 Gbps NICs serving 20 VMs. Each VM may have 1 Gbps vNIC, but physical throughput is limited.

How to Eliminate Wrong Answers:

If the question mentions packet loss but link utilization is low, look for CPU or memory bottleneck.

If the question asks for the best tool to measure historical utilization, SNMP is correct; for real-time traffic analysis, use NetFlow.

If the question involves growth, always apply compound growth (not linear).

Remember that QoS does not increase capacity; it only prioritizes.

Study these trap patterns and practice with scenario-based questions.

Key Takeaways

Capacity planning involves measuring current utilization, setting thresholds (70% warning, 90% critical), forecasting growth, and upgrading before saturation.

SNMP is used for historical bandwidth monitoring; NetFlow/sFlow for flow-level analysis.

Actual throughput is about 94% of nominal bandwidth due to overhead (e.g., 1 Gbps link yields ~940 Mbps).

Always use peak or 95th percentile utilization, not average, to assess capacity needs.

Bottlenecks can be CPU, memory, or throughput of routers/firewalls, not just link bandwidth.

CAGR formula: (End/Start)^(1/n)-1. Example: 200 Mbps to 400 Mbps in 2 years = 41.4% growth per year.

QoS does not increase capacity; it only prioritizes traffic during congestion.

In wireless, channel utilization above 50% indicates need for additional APs or channel changes.

Virtualized environments require monitoring of vSwitch and host NIC utilization, not just VMs.

Document baseline and review capacity plans quarterly to adapt to changing traffic patterns.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

SNMP

Polls interface counters (bytes, errors) at intervals (e.g., 5 min).

Low overhead on device; uses minimal CPU.

Provides aggregate utilization, not per-flow data.

Standard MIB-II (RFC 1213) supported on almost all devices.

Best for long-term trending and capacity planning.

NetFlow

Exports flow records (IP pairs, ports, protocols) in real-time or near-real-time.

Higher CPU/memory overhead; sampling often used (e.g., 1:1000).

Provides per-flow granularity to identify top talkers and applications.

Proprietary (Cisco NetFlow) or standardized (IPFIX); sFlow is a separate, sampling-based alternative.

Best for traffic analysis, security, and short-term troubleshooting.

Watch Out for These

Mistake

Bandwidth and throughput are the same thing.

Correct

Bandwidth is the theoretical maximum data rate of a link (e.g., 1 Gbps). Throughput is the actual data transfer rate, which is always lower due to protocol overhead (Ethernet, IP, TCP headers), retransmissions, and congestion. For example, a 1 Gbps Ethernet link typically achieves ~940 Mbps throughput with TCP due to overhead.

Mistake

A link averaging 50% utilization has plenty of headroom.

Correct

Average utilization can be misleading because network traffic is bursty. A link may average 50% but spike to 95% for seconds, causing packet loss and latency. Capacity planning should use peak or 95th percentile utilization, not average. The 70% warning threshold is based on average; peaks can be much higher.

Mistake

Upgrading the WAN link alone solves all capacity issues.

Correct

Capacity planning must be end-to-end. Upgrading a WAN link from 100 Mbps to 1 Gbps may shift the bottleneck to the router's CPU, firewall throughput, or server NIC. For example, if the firewall is rated for 500 Mbps, a 1 Gbps WAN link will still be limited to 500 Mbps. All components in the path must be evaluated.

Mistake

QoS can create additional bandwidth.

Correct

QoS (Quality of Service) does not increase bandwidth; it only prioritizes certain traffic types over others during congestion. If a link is saturated, QoS ensures critical traffic (e.g., VoIP) is forwarded first, but non-critical traffic may be dropped. The only way to increase capacity is to upgrade the link or add additional links.

Mistake

SNMP polling at 5-minute intervals is sufficient for real-time monitoring.

Correct

SNMP polling every 5 minutes is good for historical trending and capacity planning, but it may miss short bursts. For near-real-time visibility, use NetFlow or sFlow, which export flow records or packet samples at finer granularity. Some tools also support sub-minute polling (e.g., 30 seconds), but that increases overhead.


Frequently Asked Questions

What is the difference between bandwidth and throughput in capacity planning?

Bandwidth is the maximum theoretical data rate of a link (e.g., 1 Gbps). Throughput is the actual data transfer rate achieved, which is lower due to protocol overhead (Ethernet, IP, TCP headers) and network conditions. For example, a 1 Gbps Ethernet link typically delivers ~940 Mbps TCP throughput. Capacity planning must consider throughput, not just bandwidth, to avoid overestimating usable capacity and under-provisioning.

What are the recommended utilization thresholds for bandwidth and CPU?

For bandwidth, set a warning threshold at 70% average utilization and a critical threshold at 90%. For CPU on routers and switches, set warning at 80% and critical at 90%. These thresholds trigger proactive upgrades before performance degrades. Use hysteresis (e.g., clear alert at 65%) to avoid flapping.

How do I calculate the compound annual growth rate (CAGR) for network traffic?

CAGR = (Ending Value / Beginning Value)^(1/n) - 1, where n is the number of years. For example, if traffic grew from 200 Mbps to 400 Mbps over 2 years, CAGR = (400/200)^(1/2)-1 = 0.414 = 41.4% per year. Use this rate to project future utilization: Future = Current * (1 + CAGR)^years.

What tools are best for network capacity planning?

SNMP-based tools like PRTG, SolarWinds, Cacti, and MRTG are used for historical bandwidth monitoring and trending. NetFlow analyzers (e.g., SolarWinds NetFlow Traffic Analyzer, Scrutinizer) provide per-flow data for application breakdown. Active testing tools like iPerf measure actual throughput. For wireless, Ekahau or AirMagnet are used for site surveys.

What is the 95th percentile and how is it used in capacity planning?

The 95th percentile is a statistical measure where the top 5% of data points are discarded. It is often used for billing and capacity planning because it excludes short bursts. For example, if a link's utilization is sampled every 5 minutes, the 95th percentile value represents the level below which 95% of measurements fall. This gives a realistic peak that ignores outliers.

How do I identify a bottleneck in the network?

A bottleneck is the slowest component in the data path. Start by checking link utilization—if a link is near 100%, it's a bottleneck. If not, check router/firewall CPU and memory. Use tools like traceroute to see where latency increases. For example, if a server has a 100 Mbps NIC but the network is 1 Gbps, the NIC is the bottleneck. Also check switch backplane oversubscription.

What is oversubscription in networking?

Oversubscription occurs when the sum of port speeds on a switch exceeds the backplane capacity. For example, a 48-port 1 Gbps switch with a 32 Gbps backplane has an oversubscription ratio of 48:32 = 1.5:1. This means not all ports can run at full speed simultaneously. It is common in access switches to save cost, but must be accounted for in capacity planning.
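The ratio is a one-line calculation. The sketch below follows this chapter's simplified convention of counting each port's speed once; vendor datasheets often quote switching capacity for both directions combined, so check which convention a given figure uses before comparing. The function name is illustrative.

```python
def oversubscription_ratio(ports, port_gbps, backplane_gbps):
    """Sum of port speeds divided by backplane capacity.

    Above 1.0 the switch is oversubscribed; at or below 1.0 every port
    can run at line rate simultaneously (under this simplified model).
    """
    return ports * port_gbps / backplane_gbps


print(oversubscription_ratio(48, 1, 32))  # 1.5  -> 1.5:1, oversubscribed
print(oversubscription_ratio(24, 1, 32))  # 0.75 -> not oversubscribed
```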
