This chapter covers EC2 Placement Groups, a critical feature for controlling how instances are placed on underlying hardware to optimize performance, fault tolerance, or both. For the SOA-C02 exam, placement groups appear in roughly 5-8% of questions, often as part of high-availability or high-performance architectures. Understanding the three types — Cluster, Spread, and Partition — is essential for designing resilient and performant workloads. The exam tests not only the definitions but also the specific use cases, limitations, and interactions with other AWS services like Auto Scaling and Elastic Load Balancing.
Imagine a large warehouse where workers assemble products. In a cluster placement group, it's like placing all workers at one central table: they are physically close, can pass parts instantly, and communicate with minimal delay. This is ideal for high-speed assembly lines (like HPC jobs), but if the table collapses (a hardware failure), everyone is affected. In a spread placement group, workers are placed at separate desks far apart across the warehouse, so no single accident (like a spilled coffee or a power outage) can take out more than one worker. This ensures high availability but brings no communication advantage. In a partition placement group, the warehouse is divided into separate rooms (partitions), each with a group of workers. If a fire breaks out in one room, only that room's workers are lost, and the other rooms continue. This balances fault isolation and scale, and suits large distributed systems like Hadoop or Cassandra, where each partition maps to its own set of racks. The key mechanic is that each placement strategy trades off latency, fault tolerance, and scale, which mirrors how AWS physically places EC2 instances on underlying hardware.
What Are EC2 Placement Groups?
EC2 placement groups are logical groupings of instances that influence how AWS places them on underlying hardware. They control how close instances are to each other, which affects network performance and fault tolerance. A cluster placement group is confined to a single Availability Zone (AZ); spread and partition placement groups can span multiple AZs in the same Region. There are three types:
Cluster Placement Group: Instances are packed close together inside a single AZ, on the same rack or set of racks, in a high-bisection-bandwidth segment of the network. This gives low-latency, high-throughput connectivity between instances, including a per-flow limit of up to 10 Gbps for TCP/IP traffic within the group. Aggregate bandwidth still depends on the instance type, and HPC workloads can add Elastic Fabric Adapter (EFA) for lower latency. Because the instances share underlying hardware, a single hardware failure can affect all of them.
Spread Placement Group: Instances are placed on distinct underlying hardware, each on a different rack. A spread group can have a maximum of 7 running instances per AZ per group. This provides maximum fault isolation — no two instances share the same rack. Ideal for small sets of critical instances that must be isolated from each other.
Partition Placement Group: Instances are divided into logical segments called partitions, each of which is isolated from others in terms of rack-level failure. A partition group can have up to 7 partitions per AZ, and each partition can contain multiple instances. AWS guarantees that each partition does not share underlying hardware with other partitions. This is designed for large distributed systems like HDFS, Cassandra, or Kafka that need to spread replicas across partitions to tolerate rack failures.
How They Work Internally
When you launch an instance into a placement group, AWS examines the group's strategy and attempts to place the instance accordingly. For a cluster placement group, instances are placed as close together as possible, often on the same physical rack or a set of interconnected racks. The group itself does not reserve capacity, so a launch can fail if that hardware segment has no room left. For a spread placement group, each instance is placed on a distinct rack. For a partition placement group, each instance is assigned to a partition, either the partition number you specify or one chosen automatically, and no two partitions share a rack.
Key Components, Values, Defaults, and Timers
Placement Group Name: Must be unique within your AWS account for each region.
Strategy: One of cluster, spread, or partition.
Partition Count: For partition groups, you can specify a number from 1 to 7. If not specified, AWS chooses.
Maximum Instances per Group:
Cluster: No explicit limit, but limited by the capacity of the underlying cluster (typically hundreds of instances).
Spread: 7 running instances per AZ per group. You can have multiple spread groups to exceed this.
Partition: No per-partition limit. The total number of instances is limited only by your account limits; there is no separate hard cap per group.
Supported Instance Types: Not all instance types support placement groups. For cluster groups, only certain instance types (e.g., compute optimized, memory optimized, GPU instances) that support enhanced networking are recommended. Spread and partition groups support most instance types except some bare metal instances.
Network Performance: Within a cluster placement group, a single TCP/IP flow between instances can reach up to 10 Gbps, and the instances sit in the same high-bisection-bandwidth segment of the network. Aggregate bandwidth is still capped by the instance type. HPC workloads can add Elastic Fabric Adapter (EFA) to reduce latency further. Spread and partition groups do not provide any special network performance benefits.
Interactions with Other Services:
Auto Scaling: You can specify a placement group in an Auto Scaling group launch template. However, if the placement group is full (e.g., spread group has 7 instances), Auto Scaling will fail to launch new instances.
Elastic Load Balancing: ELB can distribute traffic to instances in any placement group.
Reserved Instances: Placement groups are independent of reserved instances.
Configuration and Verification Commands
You can create placement groups using the AWS Management Console, AWS CLI, or SDKs. Using the AWS CLI:
aws ec2 create-placement-group --group-name my-cluster-group --strategy cluster
To launch an instance into a placement group:
aws ec2 run-instances --image-id ami-0abcdef1234567890 --instance-type c5.large --placement GroupName=my-cluster-group
To describe placement groups:
aws ec2 describe-placement-groups --group-names my-cluster-group
To move an existing instance into a placement group, you must stop the instance, modify its placement group, and start it. Note that the instance must be in the same AZ as the placement group.
How It Interacts with Related Technologies
Enhanced Networking: For cluster placement groups, using enhanced networking (SR-IOV) is highly recommended to achieve the low-latency benefits. Instance types that support enhanced networking include most current generation types.
Elastic Fabric Adapter (EFA): For HPC workloads, EFA can be used with cluster placement groups to lower latency for tightly coupled traffic such as MPI. The achievable bandwidth depends on the instance type.
Dedicated Hosts: You cannot launch Dedicated Hosts in placement groups. Dedicated Instances are also restricted: spread groups do not support them, and a partition group with Dedicated Instances is limited to two partitions.
Capacity Reservations: You can reserve capacity in a specific Availability Zone, but placement groups do not guarantee capacity; you still need to ensure that the required instance types are available.
Limitations
Cluster placement groups cannot span multiple Availability Zones. Spread and partition groups can span multiple AZs, but each AZ is treated independently.
You cannot merge placement groups.
You cannot change the strategy of an existing placement group; you must delete and recreate it.
Launch all instances in a cluster placement group in a single request, and use the same instance type, to maximize the chance they are placed together. This is a recommendation, not a hard requirement, but if you add instances later the launch is more likely to fail with an insufficient capacity error.
For spread placement groups, if you attempt to launch an 8th instance in the same AZ, the launch will fail with an InsufficientInstanceCapacity error.
Best Practices
Use cluster placement groups for tightly coupled, high-performance computing jobs that benefit from low latency and high throughput.
Use spread placement groups for a small number of critical instances that need to be isolated from hardware failures (e.g., primary and standby database nodes).
Use partition placement groups for large distributed applications that need to tolerate rack failures, such as HDFS, Cassandra, or Kafka. Place replicas in different partitions.
Always test your application's performance with and without placement groups to ensure the benefits outweigh the limitations.
Be aware of the maximum instance limits per group, especially for spread groups.
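The spread limit turns into simple arithmetic when sizing a deployment. As a minimal sketch, assuming the limit of 7 running instances per AZ per group stated in this chapter, the hypothetical helper `spread_groups_needed` (a local shell function, not an AWS command) computes how many spread groups one AZ needs for a given instance count:

```shell
# Limit quoted in this chapter: 7 running instances per AZ per spread group.
SPREAD_LIMIT=7

# spread_groups_needed N -> number of spread groups needed in one AZ.
# Hypothetical helper: ceiling division of N by the limit.
spread_groups_needed() (
  n=$1
  echo $(( (n + SPREAD_LIMIT - 1) / SPREAD_LIMIT ))
)

spread_groups_needed 8   # prints 2
spread_groups_needed 7   # prints 1
```

The same calculation applies per AZ, so a deployment across several AZs repeats it for each zone.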
Create Placement Group
First, you define a placement group using the AWS Management Console, CLI, or API. You specify a name and a strategy (cluster, spread, or partition). For partition groups, you can optionally specify the number of partitions (1-7). This step does not consume any resources; it simply creates a logical container. The placement group is region-specific and exists within an account. For example, using CLI: `aws ec2 create-placement-group --group-name my-group --strategy cluster`. This command returns a placement group ID and metadata. The group initially has no instances.
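For a partition group, the create step might look like the sketch below. The `run` and `create_partition_group` functions and the `DRY_RUN` variable are hypothetical helpers introduced for illustration; only `aws ec2 create-placement-group` and its `--partition-count` option come from the AWS CLI. The wrapper rejects partition counts outside the 1-7 range described above before calling the CLI.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# create_partition_group NAME COUNT
# Hypothetical wrapper: validates COUNT against the 1-7 range,
# then calls the real `aws ec2 create-placement-group` command.
create_partition_group() (
  name=$1
  count=$2
  if [ "$count" -lt 1 ] || [ "$count" -gt 7 ]; then
    echo "partition count must be between 1 and 7" >&2
    exit 1
  fi
  run aws ec2 create-placement-group \
    --group-name "$name" --strategy partition --partition-count "$count"
)
```

Setting `DRY_RUN=1` and then calling `create_partition_group my-partition-group 5` prints the command instead of running it, which is a convenient way to review it before touching a real account.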
Launch Instances into Group
When you launch an EC2 instance, you specify the placement group name in the launch configuration. The AWS scheduler then attempts to place the instance according to the strategy. For cluster groups, the scheduler tries to place the instance on the same physical rack as existing instances in the group. For spread groups, it ensures the instance is on a distinct rack from all other instances in the group. For partition groups, it assigns the instance to a specific partition (either user-specified or auto-assigned). If the placement cannot be satisfied (e.g., spread group already has 7 instances in that AZ), the launch fails with an `InsufficientInstanceCapacity` error. You can launch multiple instances in a single run-instances command; for cluster groups, they will be placed together.
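The launch step can be sketched as follows. The helper names (`run`, `placement_arg`, `launch_into_group`) and the `DRY_RUN` switch are assumptions made for this example; the `--count` option and the `GroupName` and `PartitionNumber` keys of `--placement` are real `aws ec2 run-instances` parameters. The sketch issues one request for all instances, which is the pattern this chapter recommends for cluster groups.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# placement_arg GROUP [PARTITION] -> value for run-instances --placement
placement_arg() (
  group=$1
  partition=${2:-}
  if [ -n "$partition" ]; then
    echo "GroupName=$group,PartitionNumber=$partition"
  else
    echo "GroupName=$group"
  fi
)

# launch_into_group AMI TYPE COUNT GROUP [PARTITION]
# Issues a single run-instances call for COUNT instances.
launch_into_group() (
  run aws ec2 run-instances --image-id "$1" --instance-type "$2" \
    --count "$3" --placement "$(placement_arg "$4" "${5:-}")"
)
```

For example, `launch_into_group ami-0abcdef1234567890 c5.large 4 my-cluster-group` requests four instances together, while adding a fifth argument such as `3` pins the instances to partition 3 of a partition group.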
Verify Placement
After launching, you can verify the placement group association by describing the instance: `aws ec2 describe-instances --instance-ids i-1234567890abcdef0`. The output includes a `Placement` object with the `GroupName` and, for partition groups, the `PartitionNumber`, so you can see which partition each instance belongs to. For spread groups, the output confirms group membership but not physical location: AWS does not expose the physical rack ID here, so you rely on the placement group's fault isolation guarantee.
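To narrow the verification output, a JMESPath `--query` and the `placement-group-name` and `placement-partition-number` filters of `describe-instances` can help. The sketch below assumes hypothetical helper names (`run`, `show_placement`, `list_partition_members`) and a `DRY_RUN` switch; the CLI options themselves are standard.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# show_placement INSTANCE_ID
# Prints only the Placement object of one instance.
show_placement() (
  run aws ec2 describe-instances --instance-ids "$1" \
    --query 'Reservations[].Instances[].Placement' --output json
)

# list_partition_members GROUP PARTITION
# Lists the instance IDs that sit in one partition of a group.
list_partition_members() (
  run aws ec2 describe-instances \
    --filters "Name=placement-group-name,Values=$1" \
              "Name=placement-partition-number,Values=$2" \
    --query 'Reservations[].Instances[].InstanceId' --output text
)
```

Calling `list_partition_members my-group 2` is one way to check that replicas of the same data did not land in the same partition.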
Modify Existing Instance Placement
If you have an instance that is not in a placement group, you can move it into an existing placement group, but only while the instance is stopped. Use the `modify-instance-placement` CLI command: `aws ec2 modify-instance-placement --instance-id i-1234567890abcdef0 --group-name my-group`. After the modification, start the instance. For a cluster group, the instance must be in the same Availability Zone as the instances already in the group. The same command moves a stopped instance from one placement group to another, and passing an empty group name removes the instance from its current group. You cannot change the strategy of an existing group; you must delete and recreate it.
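The stop, modify, start sequence can be scripted. This is a sketch under stated assumptions: `run`, `move_to_group`, and `DRY_RUN` are hypothetical names, while `stop-instances`, `wait instance-stopped`, `modify-instance-placement`, and `start-instances` are existing `aws ec2` subcommands. The waiter is included because the placement change is only accepted once the instance has fully stopped.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# move_to_group INSTANCE_ID GROUP
# Stop, wait until stopped, change placement, then start again.
# Each step runs only if the previous one succeeded.
move_to_group() (
  run aws ec2 stop-instances --instance-ids "$1" &&
  run aws ec2 wait instance-stopped --instance-ids "$1" &&
  run aws ec2 modify-instance-placement --instance-id "$1" --group-name "$2" &&
  run aws ec2 start-instances --instance-ids "$1"
)
```

With `DRY_RUN=1` set, `move_to_group i-1234567890abcdef0 my-group` prints the four commands in order so the sequence can be reviewed first.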
Delete Placement Group
To delete a placement group, it must contain no instances at all. Terminate the instances in the group or move them out of it (stopped instances still count as members), then use: `aws ec2 delete-placement-group --group-name my-group`. If the group still has instances, the deletion fails. Deleting the group removes only the logical container; instances that were moved out beforehand keep running outside any group and can be moved into a new group if needed. Note that you cannot delete a placement group that is referenced by an Auto Scaling group or other service; you must first remove those references.
Enterprise Scenario 1: High-Performance Computing (HPC) Cluster
A financial services firm runs a Monte Carlo risk simulation that requires massive parallel computation. They deploy a cluster placement group with 100 c5n.18xlarge instances in a single AZ. The instances communicate using MPI over EFA (Elastic Fabric Adapter). The cluster placement group ensures that all instances are on the same non-blocking fabric, providing up to 100 Gbps per instance and microsecond latency. The simulation completes in 2 hours instead of 8 hours without placement groups. The key challenge is capacity: they must launch all instances simultaneously to ensure they are placed together. They use a launch template and a single run-instances command with --count 100. If they tried to launch instances incrementally, later instances might fail due to lack of capacity on the same rack. They also monitor network throughput using CloudWatch metrics to ensure optimal performance.
Enterprise Scenario 2: High-Availability Web Application
A SaaS company runs a critical web application with a primary and a standby database protected by spread placement. They use a spread placement group per AZ across two AZs, which ensures that the database instances in an AZ sit on different racks, so a single rack failure does not take both down. Each group can hold up to 7 running instances per AZ; in production, they run 2 instances per AZ (primary and standby). They also use Auto Scaling groups for web servers, but these are not in the spread group because they can tolerate failures. Adding a third database instance (e.g., a read replica) in the same AZ is possible because the group is well under the limit; only an eighth running instance in one AZ would force them to create another spread group. They use CloudWatch alarms to detect a failed instance and replace it with a new one in the same group.
Enterprise Scenario 3: Distributed Data Store (Cassandra)
A large e-commerce company runs a Cassandra cluster with 30 nodes across 3 AZs (10 per AZ). They use partition placement groups with 5 partitions per AZ. Each partition represents a rack. They configure Cassandra's replication factor to 3 and ensure that each replica is placed in a different partition (rack). This way, if a rack fails, only one replica is lost, and the cluster remains available. They use a launch template that specifies the partition number for each node. The challenge is that they must manually assign partitions to ensure replicas are spread across partitions. They also monitor partition-level metrics using CloudWatch custom metrics. If a partition becomes unhealthy (e.g., all instances in a partition fail), they need to replace instances in that partition, which may require stopping and modifying placement. They also keep the total instance count within their account's EC2 limits, since a partition group has no separate per-group instance cap.
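The manual partition assignment in this scenario can be as simple as a round-robin rule. The sketch below is an illustration only: `partition_for` is a hypothetical helper, and how the resulting partition numbers map onto Cassandra's rack configuration is not covered by the scenario and is left as an assumption.

```shell
# partition_for NODE_INDEX PARTITION_COUNT -> partition number (1-based)
# Round-robin: node 0 goes to partition 1, node 1 to partition 2, and so on.
partition_for() (
  echo $(( $1 % $2 + 1 ))
)

# Ten nodes in one AZ over five partitions gives two nodes per partition.
i=0
while [ "$i" -lt 10 ]; do
  echo "node $i -> partition $(partition_for "$i" 5)"
  i=$((i + 1))
done
```

The computed number is what would be passed as `PartitionNumber` when launching each node, so that any three consecutive nodes fall into three different partitions.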
What the Exam Tests (Objective 2.1 - Reliability)
The SOA-C02 exam tests your understanding of placement groups as a tool for improving reliability and performance. Specifically:
Identify the correct placement group type for a given scenario: Cluster for low-latency/HPC, Spread for fault isolation of critical instances, Partition for large distributed systems.
Understand limitations: Spread group max 7 instances per AZ; Cluster group cannot span AZs; Partition group max 7 partitions per AZ.
Know how to create and modify placement groups: CLI commands, console steps, and the requirement to stop instances before moving them.
Understand network performance implications: Cluster groups provide low latency and high throughput; Spread and Partition do not.
Interactions with Auto Scaling and other services: Auto Scaling groups can use placement groups, but if the group is full, scaling fails.
Common Wrong Answers and Why Candidates Choose Them
Choosing Spread for large distributed systems: Candidates think spread provides fault isolation, but spread is limited to 7 instances per AZ, making it unsuitable for large clusters. The correct answer is Partition.
Assuming Cluster groups span AZs: Candidates may think cluster groups can span AZs for high availability, but they are single-AZ only. The exam tests this with a scenario requiring low latency across AZs, where the correct answer is to use a cluster group in one AZ.
Thinking you can change placement group strategy: Candidates may believe you can modify the strategy after creation. Reality: you must delete and recreate the group.
Believing spread groups provide network performance benefits: Spread groups only provide fault isolation, not better network performance. Candidates may confuse them with cluster groups.
Specific Numbers and Values to Memorize
Spread group: maximum 7 instances per AZ per group.
Partition group: maximum 7 partitions per AZ; instance count limited only by your account limits.
Cluster group: no explicit instance limit, but capacity constrained by underlying hardware.
Network throughput: up to 10 Gbps per TCP/IP flow within a cluster group; aggregate bandwidth depends on the instance type.
Instance types: only certain types support placement groups (e.g., C5, M5, R5, P3, etc.).
Edge Cases and Exceptions
Are placement groups tied to a single AZ? Only cluster groups are: every instance must be in the same AZ. Spread and partition groups can hold instances in multiple AZs of the same Region, and the per-AZ limits apply independently.
What if you need more than 7 instances in a spread group? Create multiple spread groups.
Can you use a placement group with a Dedicated Host? No, Dedicated Hosts cannot be launched in placement groups.
Can you use a placement group with a Spot Instance? Yes, but the same limitations apply.
How to Eliminate Wrong Answers
If the scenario mentions 'low latency' or 'high network throughput', eliminate Spread and Partition; choose Cluster.
If the scenario mentions 'fault isolation' for a small number of instances, choose Spread.
If the scenario mentions 'large distributed system' with 'rack-level failure', choose Partition.
If the scenario mentions 'multiple AZs' and 'fault tolerance', Spread and Partition can span AZs, but Cluster cannot.
If the scenario mentions 'maximum 7 instances per AZ', it's a Spread group.
If the scenario mentions 'partitions' or 'up to 7 partitions', it's a Partition group.
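The elimination rules above amount to a keyword lookup. As a study aid only, the hypothetical function `pick_strategy` below encodes them; the keyword labels are invented for this sketch and are not exam terminology.

```shell
# pick_strategy KEYWORD -> cluster | spread | partition
# Hypothetical study aid mirroring the elimination rules in this section.
pick_strategy() (
  case $1 in
    low-latency|high-throughput|hpc)            echo cluster ;;
    small-critical|max-7-instances)             echo spread ;;
    large-distributed|rack-failure|partitions)  echo partition ;;
    *) echo unknown; exit 1 ;;
  esac
)

pick_strategy low-latency    # prints cluster
pick_strategy rack-failure   # prints partition
```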
Cluster placement groups provide low latency and high throughput but are single-AZ and share risk.
Spread placement groups offer maximum fault isolation but are limited to 7 instances per AZ per group.
Partition placement groups provide rack-level isolation for large distributed systems, with up to 7 partitions per AZ.
You cannot change the strategy of an existing placement group; you must delete and recreate it.
To move an instance into a placement group, the instance must be stopped and in the same AZ.
Cluster placement groups require enhanced networking (SR-IOV) for optimal performance.
Spread and partition groups can span multiple AZs, but each AZ has its own limits.
The exam often tests the 7-instance limit for spread groups and the 7-partition limit for partition groups.
Auto Scaling groups can use placement groups, but if the group is full, scaling fails.
When launching multiple instances into a cluster group, launch them simultaneously to ensure they are placed together.
These come up on the exam all the time. Here's how to tell them apart.
Cluster Placement Group
Instances are placed close together on the same rack(s) for low latency.
Provides high network throughput (up to 10 Gbps per TCP/IP flow; EFA available for HPC workloads).
Single point of failure: a rack failure can affect all instances.
Cannot span multiple Availability Zones.
No hard limit on number of instances, but limited by cluster capacity.
Spread Placement Group
Each instance is placed on a distinct rack for fault isolation.
No network performance benefit beyond normal EC2 networking.
Maximum of 7 running instances per Availability Zone per group.
Can span multiple Availability Zones (each AZ has its own 7-instance limit).
Ideal for small sets of critical instances that must be isolated.
Spread Placement Group
Each instance on a separate rack (maximum isolation).
Limited to 7 instances per AZ per group.
Each instance is individually isolated from hardware failures.
Best for small, critical instances like primary/standby databases.
No concept of partitions; instances are individually placed.
Partition Placement Group
Instances are grouped into partitions; each partition is on a separate rack.
Can have up to 7 partitions per AZ, each with many instances.
Partition-level isolation: a rack failure affects only one partition.
Best for large distributed systems like HDFS, Cassandra, Kafka.
Partitions allow logical grouping for replica placement.
Mistake
Cluster placement groups provide high availability across multiple Availability Zones.
Correct
Cluster placement groups are confined to a single Availability Zone. They do not span AZs, so they do not provide AZ-level fault tolerance. Use Spread or Partition groups for multi-AZ deployments.
Mistake
You can have more than 7 instances in a Spread placement group per Availability Zone.
Correct
A Spread placement group can have a maximum of 7 running instances per AZ. To exceed this, you must create multiple Spread groups. This is a hard limit enforced by AWS.
Mistake
Partition placement groups guarantee that each partition is on a separate physical server.
Correct
Partitions isolate at the rack level, not the server level. Each partition is on a different rack, but multiple instances within a partition may share a server. The key is that no two partitions share the same rack.
Mistake
You can change the strategy of a placement group after creation.
Correct
The placement group strategy (cluster, spread, partition) is immutable after creation. To change it, you must delete the group and create a new one with the desired strategy. Instances must be removed from the old group first.
Mistake
All EC2 instance types support placement groups.
Correct
Not all instance types support placement groups. For cluster groups, only instance types that support enhanced networking (e.g., C5, M5, R5, P3) are recommended. Spread and partition groups support most instance types, but some bare metal instances may not. Always check the documentation.
Not for a cluster placement group. A cluster group is confined to a single Availability Zone, so every instance must launch in the same AZ as the group's existing instances; a launch that targets a different AZ fails. Spread and partition groups can span AZs in the same Region, so one group can hold instances in several AZs, with the per-AZ limits applied independently.
The launch will fail with an `InsufficientInstanceCapacity` error. Spread placement groups have a hard limit of 7 running instances per AZ. To add more instances, you must create another spread group in the same AZ. This is a common exam scenario: if you need 8 instances with spread placement, you need two spread groups.
Yes. You can launch Spot Instances into any type of placement group. However, the same limitations apply (e.g., spread group max 7 instances per AZ). Note that Spot Instances can be interrupted, which may affect the placement group's capacity. For critical workloads, consider using On-Demand or Reserved Instances.
First, stop the instance. Then use the `modify-instance-placement` CLI command or the console to specify the placement group name. After that, start the instance. For a cluster group, the instance must be in the same AZ as the group's other instances. The same command moves a stopped instance from one placement group to another; to remove an instance from its group, specify an empty group name.
No. There is no additional charge for using placement groups. You pay only for the EC2 instances and other resources you use. However, because cluster placement groups may require specific instance types that support enhanced networking, those instance types may have higher hourly rates. But the placement group itself is free.
Yes. You can specify a placement group in your launch template or launch configuration. Auto Scaling will launch instances into that placement group. However, if the placement group is full (e.g., spread group has 7 instances), Auto Scaling will fail to launch new instances. Also, if you scale in, instances are removed from the group. Make sure your placement group has enough capacity for your desired scaling limits.
Both provide fault isolation, but at different granularities. Spread placement puts each instance on a separate rack, isolating individual instances. Partition placement groups instances into partitions, each on a separate rack, so a rack failure affects only one partition (which may contain multiple instances). Spread is for small numbers of critical instances; partition is for large distributed systems where you want to control which instances share a rack.