This chapter covers EC2 instance types and placement groups, two fundamental concepts for designing resilient and cost-effective architectures on AWS. Understanding instance families, sizes, and placement strategies is critical for the SAA-C03 exam, as questions on these topics appear in roughly 15-20% of exam scenarios, often integrated with other services like Auto Scaling, Elastic Load Balancing, and Amazon EBS. You will learn how to select the appropriate instance type based on workload requirements and how to use placement groups to influence network performance, fault tolerance, and high availability.
Think of AWS EC2 instance types as a fleet of vehicles in a delivery company. The company has different vehicles for different jobs: a small scooter (t2.nano) for quick local envelopes, a sedan (t3.medium) for standard packages, a pickup truck (m5.large) for general cargo, a heavy truck (c5.4xlarge) for compute-heavy loads, a refrigerated truck (r5.large) for cold storage items, a flatbed truck (d2.8xlarge) for dense heavy materials, and a high-performance sports car (p3.16xlarge) for GPU-intensive tasks like video rendering. Each vehicle has a specific engine (CPU), fuel tank (memory), cargo space (storage), and suspension (network bandwidth). The company also has a garage (placement group) where vehicles can be parked close together (cluster placement group) for fast communication between them, like two delivery trucks loading at the same dock. Alternatively, vehicles can be spread out across different garages (spread placement group) to ensure that if one garage has a fire, not all vehicles are lost. The company can also place vehicles in a partition placement group, where they are separated into fire-resistant compartments, so a fire in one compartment doesn't affect others. Choosing the right vehicle and parking arrangement is crucial for efficiency and reliability.
EC2 Instance Types Overview
Amazon EC2 provides a broad selection of instance types optimized for different use cases. Each instance type comprises a combination of CPU, memory, storage, and networking capacity. The naming convention follows a pattern: <family><generation>[<attributes>].<size>, e.g., t3.large, m5.xlarge, c5n.4xlarge (family c, generation 5, attribute n, size 4xlarge). The family letter indicates the optimization: General purpose (m, t, a), Compute optimized (c), Memory optimized (r, x, z), Accelerated computing (p, g, inf, f), Storage optimized (i, d, h). The generation number increases over time (e.g., m4, m5, m6i). Optional attribute letters after the generation mark variants such as a (AMD processors), g (AWS Graviton processors), or n (enhanced networking). The size suffix indicates relative capacity within the family (nano, micro, small, medium, large, xlarge, 2xlarge, etc.); from large upward, each step roughly doubles the resources.
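To make the naming pattern concrete, here is a minimal sketch that splits a type name into its parts. It assumes the common form of lowercase family letters, generation digits, optional attribute letters, a dot, and a size; the function name parse_instance_type is hypothetical, and unusual names (for example the high-memory u- types) would not parse correctly.

```shell
# Split an EC2 instance type name into family, generation, attributes, and size.
# Assumes the pattern <family letters><generation digits>[<attribute letters>].<size>.
parse_instance_type() {
  prefix=${1%%.*}    # text before the dot, e.g. "c5n"
  size=${1#*.}       # text after the dot, e.g. "4xlarge"
  family=$(printf '%s' "$prefix" | sed 's/[0-9].*$//')
  generation=$(printf '%s' "$prefix" | sed 's/^[a-z]*\([0-9]*\).*$/\1/')
  attributes=$(printf '%s' "$prefix" | sed 's/^[a-z]*[0-9]*//')
  printf 'family=%s generation=%s attributes=%s size=%s\n' \
    "$family" "$generation" "$attributes" "$size"
}

parse_instance_type c5n.4xlarge   # family=c generation=5 attributes=n size=4xlarge
parse_instance_type m6g.large     # family=m generation=6 attributes=g size=large
```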
General Purpose Instances
T-series (T2, T3, T3a): Burstable performance instances designed for workloads that don't consistently use full CPU. They earn CPU credits during idle periods and spend them during bursts. T2 uses Intel, T3 uses Intel, T3a uses AMD. T3 and T3a have unlimited mode by default, allowing bursts beyond the baseline at additional cost. Use cases: small databases, development environments, web servers, microservices.
M-series (M5, M5a, M6i, M6g): Balanced compute, memory, and networking. M5 and M5a are Intel/AMD based, M6g uses AWS Graviton2 (ARM) for better price/performance. Use cases: application servers, gaming servers, back-end servers, small and medium databases.
A-series (A1): ARM-based instances for scale-out workloads using ARM architecture. Use cases: web servers, containerized microservices.
Compute Optimized Instances
C-series (C5, C5n, C6g, C6gn): High-performance processors, ideal for compute-bound applications. C5n offers enhanced networking. Use cases: batch processing, media transcoding, high-performance web servers, scientific modeling, machine learning inference.
Memory Optimized Instances
R-series (R5, R5a, R5n, R6g): Large memory per vCPU, suitable for memory-intensive workloads. Use cases: high-performance databases, distributed caches (Redis, Memcached), real-time big data analytics.
X-series (X1, X1e): Very large memory, up to 3,904 GiB on X1e. Use cases: SAP HANA, large-scale enterprise databases, memory-intensive applications.
z-series (z1d): High memory and high compute, with a fixed performance processor (up to 4.0 GHz). Use cases: electronic design automation (EDA), relational databases with high per-core licensing costs.
Accelerated Computing Instances
P-series (P3, P4d): GPU instances for parallel processing. P3 uses NVIDIA Tesla V100, P4d uses A100. Use cases: machine learning training, HPC, computational fluid dynamics.
G-series (G4ad, G4dn): GPU instances for graphics-intensive applications. G4dn uses NVIDIA T4, G4ad uses AMD Radeon Pro V520. Use cases: video transcoding, 3D visualization, game streaming.
Inf-series (Inf1): Instances optimized for machine learning inference, powered by AWS Inferentia chips. Use cases: high-throughput, low-latency inference for models like BERT, ResNet.
F-series (F1): FPGA instances for custom hardware acceleration. Use cases: genomics, financial analytics, video processing.
Storage Optimized Instances
I-series (I3, I3en, I4i): High local NVMe SSD storage, low latency. I3en offers higher storage density. Use cases: NoSQL databases (Cassandra, MongoDB), transactional databases, data warehousing.
D-series (D2, D3, D3en): Dense HDD storage for massive data throughput. Use cases: Hadoop, MapReduce, data lakes, log processing.
H-series (H1): Balanced disk throughput and compute. Use cases: MapReduce, distributed file systems.
Instance Sizes and Resource Ratios
Each instance family offers multiple sizes. As a general rule, moving up one size from large onward (e.g., from large to xlarge) doubles the number of vCPUs and the memory, and network bandwidth also increases up to a point. Some families scale differently. For example, T3 burstable instances have a baseline CPU utilization per vCPU and accumulate credits while running below it. The baseline for t3.nano is 5%, t3.micro is 10%, t3.small is 20%, t3.medium is 20%, t3.large is 30%, etc. Once the credits run out, an instance in unlimited mode keeps bursting and incurs additional charges, while an instance in standard mode is throttled back to its baseline.
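The baseline percentages translate directly into an hourly credit earn rate, since one CPU credit equals one vCPU running at 100% for one minute. The sketch below works through that arithmetic; the function name credits_per_hour is hypothetical, and the vCPU count of 2 for these T3 sizes is stated here as an assumption that should be checked against the current instance type documentation.

```shell
# Credits earned per hour = vCPUs * baseline percent * 60 minutes / 100.
credits_per_hour() {
  vcpus=$1
  baseline_pct=$2
  echo $(( $vcpus * $baseline_pct * 60 / 100 ))
}

# Assumption: each of these T3 sizes has 2 vCPUs.
for spec in "t3.nano 5" "t3.micro 10" "t3.small 20" "t3.medium 20" "t3.large 30"; do
  set -- $spec   # $1 = type name, $2 = baseline percent per vCPU
  printf '%-10s earns %s credits/hour\n' "$1" "$(credits_per_hour 2 "$2")"
done
```

Under these assumptions a t3.micro earns 12 credits per hour, enough to run one vCPU at full speed for 12 minutes out of every hour.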
EBS-Optimized Instances
Most current generation instances support EBS optimization by default, providing dedicated throughput to Amazon EBS. Older generations (e.g., t2, m3) require explicit enablement. EBS-optimized instances have dedicated network capacity for EBS I/O, separate from data network traffic. The dedicated bandwidth varies by instance size; for example, m5.large, m5.xlarge, and m5.2xlarge can each burst up to 4,750 Mbps, while larger sizes offer higher sustained bandwidth. Check the current instance type documentation for exact figures.
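Rather than memorizing bandwidth figures, you can look them up per instance type. The following is a sketch using the describe-instance-types command; the run helper and the ebs_bandwidth function name are hypothetical, and by default the helper only prints the command so it can be reviewed before being executed with DRY_RUN=0.

```shell
# Print a command instead of running it unless DRY_RUN=0 (illustrative helper).
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# Show baseline and maximum EBS bandwidth (Mbps) for one instance type.
ebs_bandwidth() {
  run aws ec2 describe-instance-types \
    --instance-types "$1" \
    --query 'InstanceTypes[].EbsInfo.EbsOptimizedInfo.[BaselineBandwidthInMbps,MaximumBandwidthInMbps]' \
    --output text
}

ebs_bandwidth m5.large
```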
Placement Groups Overview
A placement group is a logical grouping of instances that influences how they are placed on the underlying hardware. There are three types: Cluster, Spread, and Partition. A cluster group is confined to a single Availability Zone (AZ), while spread and partition groups can span multiple AZs in a Region. Each type targets a specific goal: low network latency (Cluster), maximum fault isolation for a few instances (Spread), or rack-level fault isolation at scale (Partition).
Cluster Placement Groups
Definition: A cluster placement group packs instances close together inside a single AZ. It is designed for low-latency, high-throughput networking between instances; a single traffic flow between two instances in the group can reach up to 10 Gbps.
Use Cases: High-performance computing (HPC), tightly coupled applications, financial modeling, large-scale simulations.
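Limitations: Single AZ dependency. If the AZ fails, all instances in the group become unavailable, and you cannot span a cluster placement group across AZs. Capacity can also be a constraint: AWS recommends launching all the instances you need in a single request, using the same instance type, to reduce the chance of insufficient-capacity errors. Capacity Reservations can be used to hold capacity in advance.
Networking: Within a cluster placement group, a single flow can reach up to 10 Gbps, and aggregate bandwidth is bounded by the instance type (up to 100 Gbps on types such as C5n and P3dn). This is possible because the instances are placed close together in the same high-bandwidth segment of the network, reducing network hops.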
Spread Placement Groups
Definition: A spread placement group distributes instances across distinct underlying hardware, each instance on a separate rack with its own power and network. A spread placement group can span multiple AZs, and you can have a maximum of seven running instances per AZ per group.
Use Cases: Small number of critical instances that must be isolated from each other, such as multiple web servers, or master nodes for a distributed system.
Limitations: Maximum of seven instances per AZ. This makes it unsuitable for large-scale deployments.
Fault Isolation: Each instance is on a distinct rack. If one rack fails, only that instance is affected.
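The seven-per-AZ limit makes spread group sizing simple arithmetic. As a rough sketch (the function names are hypothetical), the first helper gives the most running instances one spread group can hold across a number of AZs, and the second gives how many spread groups a single AZ would need for a given instance count.

```shell
SPREAD_LIMIT_PER_AZ=7   # running instances per AZ per spread group

# Maximum running instances one spread group can hold across $1 AZs.
spread_capacity() {
  echo $(( $SPREAD_LIMIT_PER_AZ * $1 ))
}

# Spread groups needed to hold $1 instances in a single AZ (ceiling division).
spread_groups_needed() {
  echo $(( ($1 + $SPREAD_LIMIT_PER_AZ - 1) / $SPREAD_LIMIT_PER_AZ ))
}

spread_capacity 3         # 21
spread_groups_needed 10   # 2
```

Needing more than one spread group per AZ is usually the signal to consider a partition placement group instead.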
Partition Placement Groups
Definition: A partition placement group divides instances into logical partitions, each on a separate set of racks. Each partition is isolated from other partitions in terms of rack failure. You can have up to seven partitions per AZ, and the group can span multiple AZs.
Use Cases: Large distributed applications like Hadoop, Cassandra, Kafka, where you want to spread replicas across partitions to tolerate rack failures. You can control which partition an instance goes into.
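Limitations: The maximum number of partitions per AZ is seven. Specifying a partition number at launch is optional; if you omit it, AWS distributes instances across the partitions for you, although the distribution is not guaranteed to be even.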
How It Works: AWS ensures that each partition has its own independent rack infrastructure. If a rack fails, only instances in that partition are affected. You can place replicas of your data in different partitions to maintain availability.
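One simple way to place replicas in different partitions is a round-robin assignment of nodes to partition numbers. The sketch below illustrates that idea only; the function name partition_for_node is hypothetical, the 1-based partition numbering is an assumption to verify against the EC2 documentation, and the group name MyPartition is a placeholder.

```shell
PARTITIONS=7   # partitions per AZ used in this sketch

# Round-robin mapping: 0-based node index -> partition number (assumed 1-based).
partition_for_node() {
  echo $(( $1 % $PARTITIONS + 1 ))
}

# Print the --placement value each of nine nodes would launch with.
i=0
while [ "$i" -lt 9 ]; do
  echo "node $i: --placement GroupName=MyPartition,PartitionNumber=$(partition_for_node "$i")"
  i=$(( $i + 1 ))
done
```

With this mapping, any three consecutive nodes land in three different partitions, so a replication factor of 3 keeps each replica on separate rack infrastructure.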
Interaction with Other Services
Auto Scaling: You can launch instances into a placement group using an Auto Scaling group by specifying the placement group in the Auto Scaling group settings or in the launch template. For cluster placement groups, Auto Scaling can help maintain the group size, but capacity constraints may cause launch failures.
Elastic Load Balancing: Load balancers can distribute traffic to instances in any placement group, but they are not placement-group-aware. The placement group is transparent to the load balancer.
EBS: Volumes attached to instances in a cluster placement group do not have special placement; EBS volumes are independent of the placement group and can be in different racks.
Key Defaults and Limits
Default tenancy: Shared (unless you specify dedicated).
Maximum instances per spread placement group per AZ: 7.
Maximum partitions per AZ per partition placement group: 7.
Maximum number of placement groups per account per region: 500.
Cluster placement groups cannot span AZs; spread and partition can.
You cannot merge placement groups.
You can move an existing instance into a placement group only while it is in the stopped state; start it again after changing the placement.
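Moving an instance into a placement group is a stop, modify, start sequence. The sketch below strings those steps together; the run helper and the move_to_placement_group function name are hypothetical, and by default the commands are only printed so the sequence can be reviewed before running it with DRY_RUN=0.

```shell
# Print a command instead of running it unless DRY_RUN=0 (illustrative helper).
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# $1: instance ID, $2: target placement group name.
# Each step runs only if the previous one succeeded.
move_to_placement_group() {
  run aws ec2 stop-instances --instance-ids "$1" &&
  run aws ec2 wait instance-stopped --instance-ids "$1" &&
  run aws ec2 modify-instance-placement --instance-id "$1" --group-name "$2" &&
  run aws ec2 start-instances --instance-ids "$1"
}

move_to_placement_group i-1234567890abcdef0 MyNewGroup
```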
Commands and Verification
To create a placement group using AWS CLI:
aws ec2 create-placement-group --group-name MyCluster --strategy cluster
aws ec2 create-placement-group --group-name MySpread --strategy spread
aws ec2 create-placement-group --group-name MyPartition --strategy partition
To launch an instance into a placement group (via CLI):
aws ec2 run-instances --image-id ami-0abcdef1234567890 --instance-type m5.large --placement GroupName=MyCluster
To describe placement groups:
aws ec2 describe-placement-groups --group-ids pg-12345678
To modify an instance's placement group (instance must be stopped):
aws ec2 modify-instance-placement --instance-id i-1234567890abcdef0 --group-name MyNewGroup
Monitoring Placement Group Performance
For cluster placement groups, you can monitor network throughput using CloudWatch metrics like NetworkIn and NetworkOut. You can also run a tool such as iperf3 between instances to measure bandwidth directly.
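NetworkIn and NetworkOut are reported in bytes, so a Sum over one period has to be converted before it can be compared with a bandwidth figure in Mbps. A minimal sketch of that conversion follows; the function name bytes_to_mbps is hypothetical, and the result is an average over the period, truncated to a whole number by integer arithmetic.

```shell
# Convert a byte count summed over one period into average megabits per second.
# $1: bytes transferred during the period, $2: period length in seconds.
bytes_to_mbps() {
  bytes=$1
  period_seconds=$2
  echo $(( $bytes * 8 / $period_seconds / 1000000 ))
}

bytes_to_mbps 37500000000 300   # 1000
```

Because this is an average, short bursts within the period can be much higher than the reported value, which is one reason a direct iperf3 test is still useful.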
Identify Workload Requirements
Begin by analyzing the application's compute, memory, storage, and networking needs. For example, a CPU-intensive batch job requires compute-optimized instances (C5), while an in-memory cache needs memory-optimized instances (R5). Determine if the workload is burstable (T3) or requires consistent performance (M5). Also consider licensing costs (if per-core, choose fewer powerful cores like z1d). This step determines the instance family and size.
Select Instance Family and Size
Based on requirements, choose the appropriate instance family. For general-purpose workloads, start with M5 or T3. For compute-intensive, choose C5. For memory-intensive, choose R5. For storage-intensive, choose I3 or D2. Then select the size (e.g., large, xlarge) based on required vCPU and memory. Refer to the AWS Instance Types documentation for exact specs. Consider using the AWS Pricing Calculator to estimate costs.
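The selection guidance above can be summarized as a small lookup. This is only a sketch of the starting points named in the text: the function name suggest_family and the workload keywords are hypothetical labels, and a real choice would also weigh size, cost, and generation.

```shell
# Map a workload category (hypothetical keywords) to a starting instance family.
suggest_family() {
  case "$1" in
    burstable) echo "T3" ;;   # variable load, low average CPU
    general)   echo "M5" ;;   # balanced, consistent performance
    compute)   echo "C5" ;;   # CPU-bound
    memory)    echo "R5" ;;   # large memory per vCPU
    iops)      echo "I3" ;;   # local NVMe SSD
    dense-hdd) echo "D2" ;;   # dense local HDD
    *)         echo "unknown" ;;
  esac
}

suggest_family compute   # C5
suggest_family memory    # R5
```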
Decide Placement Strategy
Determine if placement groups are needed. If low latency between instances is critical (e.g., HPC), choose a cluster placement group. If high fault tolerance for a small number of instances is needed, choose spread placement group. If you need to control rack-level failure for a large distributed system, choose partition placement group. If no special placement is needed, do not use a placement group.
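The decision described above reduces to two inputs: the primary need and the number of instances per AZ. The sketch below encodes that rule of thumb; the function name choose_strategy and the keywords latency, isolation, and none are hypothetical, and the threshold of 7 comes from the spread group limit per AZ.

```shell
# $1: primary need (latency | isolation | none)
# $2: number of instances per AZ (only used for isolation)
choose_strategy() {
  need=$1
  per_az=${2:-0}
  case "$need" in
    latency)
      echo "cluster" ;;
    isolation)
      if [ "$per_az" -le 7 ]; then echo "spread"; else echo "partition"; fi ;;
    *)
      echo "none" ;;
  esac
}

choose_strategy latency 100    # cluster
choose_strategy isolation 3    # spread
choose_strategy isolation 50   # partition
```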
Create Placement Group (if needed)
Use AWS Management Console, CLI, or SDK to create the placement group. Specify the strategy: cluster, spread, or partition. For partition, specify the number of partitions (max 7 per AZ). The placement group is created in a specific region and can span multiple AZs for spread and partition. Note that cluster placement groups cannot span AZs.
Launch Instances into Placement Group
When launching instances, specify the placement group name in the launch request (in the console launch wizard, the placement group setting is found among the advanced settings). For partition groups, you can optionally specify a partition number for each instance. For cluster groups, a launch can fail with an insufficient-capacity error if AWS cannot fit the new instance close to the others; launching all instances in a single request reduces this risk. Ensure the instances are in the same AZ for cluster groups. After launch, verify placement using describe-instances or describe-placement-groups.
Verify and Monitor
After instances are running, verify that they are in the correct placement group using the console or CLI. For cluster groups, test network performance using tools like iperf3 to confirm low latency and high throughput. Monitor CloudWatch metrics for network performance. For partition groups, check the partition number reported for each instance. AWS does not expose rack IDs, so for spread groups you can only confirm group membership, not the specific rack.
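Group membership and partition numbers can be listed in one query. The following is a sketch using describe-instances with the placement-group-name filter; the run helper and the show_placement function name are hypothetical, and by default the command is only printed so it can be reviewed before running it with DRY_RUN=0.

```shell
# Print a command instead of running it unless DRY_RUN=0 (illustrative helper).
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then printf '%s\n' "$*"; else "$@"; fi
}

# List instance ID, AZ, and partition number for every instance in a placement group.
show_placement() {
  run aws ec2 describe-instances \
    --filters "Name=placement-group-name,Values=$1" \
    --query 'Reservations[].Instances[].[InstanceId,Placement.AvailabilityZone,Placement.PartitionNumber]' \
    --output table
}

show_placement MyPartition
```

For cluster and spread groups the partition number column is expected to be empty, since partition numbers only apply to partition placement groups.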
Enterprise Scenario 1: High-Frequency Trading (HFT) Platform
A financial services company runs a proprietary trading algorithm that requires very low latency between compute nodes. They deploy a cluster placement group in us-east-1a with 100 c5n.18xlarge instances. The cluster placement group places the instances in close physical proximity, and c5n.18xlarge supports up to 100 Gbps of network bandwidth. They use Elastic Fabric Adapter (EFA) for even lower latency. The challenge is capacity; they must launch all instances together in advance or use Capacity Reservations. If they try to add instances during peak hours, they may get InsufficientInstanceCapacity errors. Misconfiguration: they initially tried to use a spread placement group, which caused higher latency because instances were deliberately placed on different racks, resulting in trade execution delays.
Enterprise Scenario 2: Distributed Database (Cassandra) on AWS
A social media company runs a Cassandra cluster across three AZs for high availability. They use partition placement groups in each AZ to ensure that replicas are spread across different racks. Each AZ has 7 partitions, and they place each Cassandra node in a specific partition. For example, with replication factor 3, the three replicas of each piece of data go to nodes in three different partitions. Because partitions do not share racks, a rack failure affects only one partition and therefore at most one replica of any given piece of data. They use i3en.24xlarge instances for local NVMe storage. Misconfiguration: they initially used spread placement groups but hit the 7-instance limit per AZ, forcing them to use multiple spread groups, which complicated management. Switching to partition allowed scaling to hundreds of instances per AZ.
Enterprise Scenario 3: Web Application with Small Critical Instances
A startup runs a critical web application with two web servers and one database server. They use a spread placement group across two AZs to ensure that no two instances share the same rack. The web servers are in different AZs, and the database server shares an AZ with one of the web servers but runs on distinct hardware. This ensures that a rack failure takes down at most one instance. They use t3.medium instances for cost efficiency. Misconfiguration: they mistakenly used a cluster placement group, which packed all instances close together in the same AZ, so a single hardware failure could take down the entire application. They realized the error after a maintenance event.
SAA-C03 Exam Focus on EC2 Instance Types and Placement Groups
The SAA-C03 exam tests your ability to choose the right instance type and placement group based on given requirements. These choices fall mainly under Domain 2, 'Design Resilient Architectures', and Domain 3, 'Design High-Performing Architectures'. Expect scenario-based questions where you must pick the most cost-effective and resilient option.
Common Wrong Answers and Traps
Choosing T2 over T3 for burstable workloads: T3 is newer, often cheaper, and offers unlimited mode by default. Many candidates choose T2 because they remember it, but T3 is the recommended option unless the question specifies older generation.
Selecting cluster placement group for fault tolerance: Cluster groups pack all instances close together in one AZ, which is the opposite of fault tolerance. Candidates see 'low latency' and choose cluster, but if the requirement is high availability, cluster is wrong.
Assuming spread placement groups can have more than 7 instances per AZ: The 7-instance limit is a hard limit. Questions may ask for a placement group for 10 critical instances; the correct answer is partition placement group (if you need isolation) or multiple spread groups, but the exam expects partition.
Confusing partition and spread: Both provide fault isolation, but partition allows more instances (up to 7 partitions * many instances per partition) and gives you control over placement. Spread is for a small number of instances.
Specific Numbers and Terms to Memorize
7 instances per AZ per spread placement group.
7 partitions per AZ per partition placement group.
Cluster placement groups cannot span AZs.
Maximum 500 placement groups per account per region.
Instance families: T3, M5, C5, R5, P3, I3, D2 – know their primary use cases.
Burstable instances: T2/T3/T4g (T4g is ARM-based Graviton2).
EBS-optimized by default for current generation.
Edge Cases the Exam Loves
Moving an existing instance into a placement group: You must stop the instance first, change its placement, and then start it again.
Capacity limitations in cluster placement groups: You may get insufficient capacity errors. To mitigate, launch all instances in a single request with the same instance type, or use Capacity Reservations. If adding an instance to a running group fails, stopping and starting all instances in the group may allow them to be placed together again.
Spread placement group across AZs: You can span AZs, but the 7-instance limit is per AZ, not total.
How to Eliminate Wrong Answers
If the question mentions 'lowest latency between instances', look for cluster placement group.
If the question mentions 'fault tolerance' or 'isolate from rack failures', look for spread or partition. If the number of instances is small (≤7 per AZ), spread is acceptable. If larger, partition.
If the question mentions 'cost-effective' and workload is variable, choose T3 burstable.
If the question mentions 'consistent performance', choose M5 or C5 (non-burstable).
If the question mentions 'GPU' or 'machine learning', choose P3 or G4.
If the question mentions 'local storage' with 'high IOPS', choose I3; if it mentions dense HDD storage with high sequential throughput, choose D2.
EC2 instance types are categorized into families: General Purpose (T, M, A), Compute Optimized (C), Memory Optimized (R, X, Z), Accelerated Computing (P, G, Inf, F), Storage Optimized (I, D, H).
Burstable instances (T2, T3, T4g) earn CPU credits when idle and spend them during bursts; exceeding baseline without credits incurs additional cost.
Cluster placement groups provide low latency and high throughput but are limited to a single AZ and do not provide fault tolerance.
Spread placement groups offer high fault isolation with a maximum of 7 instances per AZ.
Partition placement groups allow up to 7 partitions per AZ, each partition isolated from rack failures, and support many instances.
You can move an existing instance into a placement group only after stopping it.
EBS-optimized instances provide dedicated network throughput to EBS; current generation instances are EBS-optimized by default.
The exam expects you to choose placement groups based on latency, fault tolerance, and scalability requirements.
These come up on the exam all the time. Here's how to tell them apart.
Cluster Placement Group
Instances placed close together in the same AZ.
Provides low latency and high throughput (up to 10 Gbps per flow, and up to 100 Gbps aggregate on supported instance types).
Cannot span Availability Zones.
No instance limit per group (but capacity constraints may occur).
Best for HPC, tightly coupled applications requiring fast inter-node communication.
Spread Placement Group
Instances placed on distinct racks, each with independent power and network.
Provides high fault isolation (one rack failure affects only one instance).
Can span multiple Availability Zones.
Maximum 7 instances per AZ per group.
Best for small number of critical instances requiring high availability.
Spread Placement Group
Each instance on a separate rack; maximum 7 instances per AZ.
No control over which rack an instance goes to; AWS distributes automatically.
Simple to manage for small deployments.
Cannot scale beyond 7 instances per AZ.
Use for critical instances where you want to minimize correlated failures.
Partition Placement Group
Instances grouped into partitions (up to 7 per AZ); each partition is on separate rack(s).
You can specify which partition an instance belongs to, giving control over placement.
Allows many instances per AZ (e.g., 7 partitions * many instances per partition).
Scales to hundreds of instances per AZ.
Use for large distributed systems like Cassandra, Kafka, where you want to spread replicas across partitions.
Mistake
All EC2 instances can be moved into a placement group while running.
Correct
An instance cannot be moved into a placement group while it is running. It must be in the stopped state before you change its placement, and you start it again afterward. Always check the documentation.
Mistake
A cluster placement group can span multiple Availability Zones.
Correct
Cluster placement groups are confined to a single Availability Zone. They cannot span AZs. Spread and partition groups can span AZs.
Mistake
Spread placement groups can have unlimited instances per group.
Correct
A spread placement group can have a maximum of 7 running instances per Availability Zone. If you need more, you must use partition placement groups or multiple spread groups.
Mistake
T2 instances are always cheaper than T3 instances.
Correct
T3 instances often offer better price/performance. They launch in unlimited mode by default, which adds no charge as long as average CPU utilization over a rolling 24-hour period stays at or below the baseline. T2 may be cheaper in some regions, but T3 is generally recommended for new deployments.
Mistake
Partition placement groups only work with specific instance types.
Correct
Partition placement groups support most current generation instance types. However, some older types may not be supported. Always verify compatibility.
A cluster placement group places instances close together in a single Availability Zone to achieve low latency and high throughput, but it does not provide fault isolation. A spread placement group distributes instances across distinct racks (each on separate hardware) to maximize fault tolerance, but it is limited to 7 instances per AZ. Use cluster for HPC, spread for critical small-scale apps.
No, a spread placement group can have a maximum of 7 running instances per Availability Zone. If you need more instances with fault isolation, use a partition placement group, which allows many instances per AZ across up to 7 partitions.
You must stop the instance, then use the modify-instance-placement CLI command or the console to specify the placement group. After modification, start the instance.
The default limit is 500 placement groups per region per account. This limit can be increased by requesting a service limit increase.
No, cluster placement groups are confined to a single Availability Zone. To span AZs, use spread or partition placement groups.
Use a burstable instance like T3 (or T4g for ARM). These instances earn CPU credits during idle times and can burst when needed, offering lower cost for low-to-moderate traffic. Enable unlimited mode if you expect occasional spikes beyond baseline.
T3 instances are newer, use Intel (T3) or AMD (T3a) processors, offer better price/performance, and launch in unlimited mode by default (no additional cost unless average CPU utilization stays above the baseline). T2 instances are older and launch in standard mode by default, so unlimited mode must be enabled explicitly. For new deployments, choose T3.