SOA-C02 | Chapter 3 of 104 | Objective 2.1

Auto Healing and Auto Scaling

This chapter covers AWS Auto Scaling and Auto Healing, two critical mechanisms for maintaining application availability and reliability on AWS. For the SOA-C02 exam, questions on these topics appear in roughly 15-20% of exams, making them high-yield areas. You will learn how to configure Auto Scaling groups, scaling policies, health checks, and lifecycle hooks, as well as how to integrate them with Elastic Load Balancing and CloudWatch. Mastery of these concepts is essential for the Reliability domain.

25 min read
Intermediate
Updated May 31, 2026

Auto Scaling: A Hotel with Smart Staffing

Imagine a hotel with a fixed number of rooms but variable guest arrivals. The hotel manager uses a smart staffing system: a sensor at the entrance counts arriving guests and a sensor at the exit counts departing guests. The system compares the current occupancy to a target occupancy per staff member (e.g., 20 guests per front desk clerk). If occupancy exceeds 80% of that target, the system automatically calls in extra clerks from a pool of trained on-call staff. If occupancy drops below 40%, it sends some clerks home. The system also has health checks: if a clerk doesn't respond to a page within 30 seconds, they are considered sick and replaced. The manager sets minimum and maximum staff levels to ensure at least 2 clerks are always present and no more than 20 to avoid overcrowding. The system gradually adds or removes clerks one at a time to avoid sudden changes. This mirrors AWS Auto Scaling: the sensors are CloudWatch metrics, the target is a scaling policy, the on-call pool is an Auto Scaling group with launch templates, and the health checks are EC2 status checks. The hotel's front desk is the application, and the clerks are EC2 instances.

How It Actually Works

What is Auto Scaling and Auto Healing?

Auto Scaling is an AWS service that automatically adjusts the number of EC2 instances in a group based on demand. Auto Healing (or health checks) ensures that unhealthy instances are replaced automatically. Together, they form the foundation for building resilient, highly available applications. The SOA-C02 exam tests your ability to configure, monitor, and troubleshoot these features.

How Auto Scaling Works Internally

An Auto Scaling group (ASG) is a logical container for EC2 instances that share similar characteristics (e.g., same launch template, VPC, subnets). The ASG maintains a desired capacity, minimum size, and maximum size. The Auto Scaling service constantly monitors the number of healthy instances and compares it to the desired capacity. If the count deviates, it launches or terminates instances to match the desired capacity. Scaling policies (simple, step, or target tracking) modify the desired capacity based on CloudWatch alarms. For example, a target tracking policy might aim for an average CPU utilization of 50%.
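The reconciliation described above can be pictured with a minimal sketch. This is a toy model, not AWS code: the function name `reconcile` and its parameters are hypothetical, and it only illustrates how the healthy instance count is compared with the desired capacity.

```python
def reconcile(healthy_count: int, desired: int, min_size: int, max_size: int) -> int:
    """Return how many instances to launch (positive) or terminate (negative).

    Toy model of the comparison an Auto Scaling group performs; the real
    service also handles pending instances, AZ balance, and more.
    """
    if not min_size <= desired <= max_size:
        raise ValueError("desired capacity must lie between min and max size")
    return desired - healthy_count


if __name__ == "__main__":
    # One healthy instance, desired capacity 2: launch one more.
    print(reconcile(healthy_count=1, desired=2, min_size=1, max_size=5))
    # Four healthy instances, desired capacity 2: terminate two.
    print(reconcile(healthy_count=4, desired=2, min_size=1, max_size=5))
```

A scaling policy, in this picture, only changes the `desired` value; the launch or terminate decision follows from the same comparison.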

Key Components, Values, Defaults, and Timers

Launch Template or Configuration: Defines instance details (AMI, instance type, key pair, security groups). Launch templates are recommended over launch configurations (more features, e.g., T2/T3 unlimited).

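Health Check Grace Period: 300 seconds by default when you create the group in the console (the API and CLI default to 0 unless you set a value). After an instance launches, Auto Scaling waits this long before acting on failed health checks. This prevents premature termination during bootstrapping.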

Health Check Types: EC2 status checks (default) or ELB health checks. If ELB health checks are enabled, the ASG treats an instance as unhealthy when it fails either the EC2 status checks or the load balancer's health checks.

Cooldown Period: Default 300 seconds for simple scaling policies. Prevents Auto Scaling from launching or terminating additional instances before previous scaling activities take effect.

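Termination Policy: Default is 'Default', which first picks the Availability Zone with the most instances, then the instance launched from the oldest launch template or launch configuration, then the one closest to the next billing hour. Other options: OldestInstance, NewestInstance, OldestLaunchConfiguration, OldestLaunchTemplate, ClosestToNextInstanceHour, AllocationStrategy.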

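Lifecycle Hooks: Pause an instance during launch or termination to perform custom actions (e.g., install software, download logs). Hooks have a heartbeat timeout (default 3600 seconds, configurable from 30 to 7200 seconds); recording heartbeats can keep an instance waiting for up to 48 hours or 100 times the heartbeat timeout, whichever is smaller. A hook ends when you call CompleteLifecycleAction or when the timeout expires.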

Suspended Processes: Can suspend specific processes like Launch, Terminate, HealthCheck, ReplaceUnhealthy, etc. Useful for maintenance.
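To make the default termination policy easier to reason about, here is a simplified sketch of its selection order. It is an illustration under stated assumptions, not the service's implementation: the `Instance` fields and the function `pick_for_termination` are hypothetical, template age is reduced to a version number, and ties that AWS would break at random simply resolve to the first candidate here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Instance:
    instance_id: str
    az: str
    template_version: int          # assumption: lower number = older launch template
    minutes_to_billing_hour: int   # assumption: precomputed by the caller


def pick_for_termination(instances: Sequence[Instance]) -> Instance:
    """Pick one instance using a simplified 'Default' termination policy."""
    if not instances:
        raise ValueError("no instances to choose from")
    # Step 1: keep only instances in the Availability Zone(s) with the most instances.
    per_az = Counter(i.az for i in instances)
    busiest = max(per_az.values())
    candidates = [i for i in instances if per_az[i.az] == busiest]
    # Step 2: oldest launch template first; Step 3: closest to the next billing hour.
    return min(candidates, key=lambda i: (i.template_version, i.minutes_to_billing_hour))


if __name__ == "__main__":
    fleet = [
        Instance("i-a", "us-east-1a", template_version=1, minutes_to_billing_hour=10),
        Instance("i-b", "us-east-1a", template_version=2, minutes_to_billing_hour=5),
        Instance("i-c", "us-east-1b", template_version=1, minutes_to_billing_hour=1),
    ]
    print(pick_for_termination(fleet).instance_id)
```

In the sample fleet, us-east-1a holds two instances, so the choice is made there, and the instance on the older template is selected even though another instance is closer to its billing hour.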

Configuration and Verification Commands

Using AWS CLI:

# Create an Auto Scaling group
aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name my-asg \
    --launch-template LaunchTemplateName=my-template,Version=1 \
    --min-size 1 \
    --max-size 5 \
    --desired-capacity 2 \
    --vpc-zone-identifier subnet-abc,subnet-def

# Attach a target tracking scaling policy
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name my-asg \
    --policy-name cpu50-target \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration 'TargetValue=50.0,PredefinedMetricSpecification={PredefinedMetricType=ASGAverageCPUUtilization}'

# Describe scaling activities
aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg

How Auto Healing Works

Auto Healing is built into Auto Scaling groups. When an instance fails an EC2 status check (or ELB health check), the ASG marks it as unhealthy and terminates it. Then it launches a new instance to replace it. The termination and launch are separate steps. The ASG does not wait for the instance to recover; it immediately replaces it. This ensures quick recovery.
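The replacement decision can be sketched as a small predicate. This is a simplified model with hypothetical names (`should_replace` and its parameters); it ignores edge cases such as stopped instances and only shows how the grace period and the health check type interact.

```python
def should_replace(
    age_seconds: int,
    ec2_status_ok: bool,
    elb_health_ok: bool,
    health_check_type: str = "EC2",
    grace_period: int = 300,
) -> bool:
    """Return True if this simplified model would replace the instance."""
    # Inside the grace period, failed checks are not acted on.
    if age_seconds < grace_period:
        return False
    # EC2 status checks always count.
    if not ec2_status_ok:
        return True
    # Load balancer checks count only when the group's type is ELB.
    return health_check_type == "ELB" and not elb_health_ok


if __name__ == "__main__":
    # Application is failing behind the load balancer, instance itself is fine.
    print(should_replace(400, True, False, health_check_type="EC2"))
    print(should_replace(400, True, False, health_check_type="ELB"))
```

The two calls differ only in the health check type, which is why an application-level failure goes unnoticed when the group relies on EC2 status checks alone.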

Interaction with Elastic Load Balancing

An ASG can be associated with one or more target groups (ALB/NLB) or Classic Load Balancers. When an instance is launched, it is automatically registered with the load balancer. When terminated, it is deregistered. ELB health checks can be used by the ASG to determine instance health. The ASG also respects the load balancer's connection draining settings (called deregistration delay on ALB/NLB target groups).

Lifecycle Hooks in Detail

Lifecycle hooks allow you to pause an instance at launch or termination to perform custom actions. For example, you might want to install software or run a configuration management tool before the instance starts serving traffic. When a lifecycle hook is triggered, the instance enters the 'Pending:Wait' or 'Terminating:Wait' state. The wait ends when you call the CompleteLifecycleAction API or when the timeout expires, at which point the hook's default result (ABANDON unless you change it) applies. The default heartbeat timeout is 3600 seconds (1 hour), and recording heartbeats can extend the total wait to as much as 48 hours. You can also send a notification to SNS or SQS to trigger a Lambda function.
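How a hook's wait state resolves can be sketched as follows. The function `resolve_hook` and the string labels it returns are hypothetical; the sketch assumes the two documented results, CONTINUE and ABANDON, and that a timeout falls back to the hook's default result.

```python
from typing import Optional


def resolve_hook(
    transition: str,
    completed_with: Optional[str],
    default_result: str = "ABANDON",
) -> str:
    """Return the instance's fate once a lifecycle hook's wait ends.

    transition: "launching" or "terminating".
    completed_with: the result passed to CompleteLifecycleAction, or None
        if the hook timed out without being completed.
    """
    result = completed_with if completed_with is not None else default_result
    if result not in ("CONTINUE", "ABANDON"):
        raise ValueError(f"unknown lifecycle action result: {result}")
    if transition == "terminating":
        # Termination proceeds with either result.
        return "terminated"
    if transition == "launching":
        # CONTINUE puts the instance in service; ABANDON terminates it.
        return "InService" if result == "CONTINUE" else "terminated"
    raise ValueError(f"unknown transition: {transition}")


if __name__ == "__main__":
    print(resolve_hook("launching", "CONTINUE"))
    print(resolve_hook("launching", None))  # timed out, default result applies
```

The second call shows why an uncompleted launch hook matters: with the default result, a timeout ends with the new instance being terminated rather than entering service.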

Scaling Policies

Simple Scaling: Adjusts the desired capacity by a fixed number or percentage when an alarm triggers. Has a cooldown period during which scaling activities are blocked.

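Step Scaling: Adjusts based on the size of the alarm breach. Allows different adjustments for different breach sizes. Uses an instance warmup time rather than a cooldown period, so it can keep responding to further alarm breaches while a scaling activity is in progress.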

Target Tracking: Automatically adjusts the desired capacity to keep a metric (e.g., CPU, request count) at a target value. No cooldown needed; AWS manages it internally. This is the recommended policy.

Scheduled Scaling: Allows scaling at specific times, e.g., increase capacity at 8 AM every weekday.
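The difference between a fixed adjustment and step adjustments is easier to see in code. Below is a minimal sketch of step selection for a scale-out policy; the function `step_adjustment`, the threshold, and the step table are hypothetical values chosen for illustration. Bounds are expressed relative to the alarm threshold, with the lower bound inclusive and the upper bound exclusive.

```python
from typing import List, Optional, Tuple

# (lower_bound, upper_bound, instances_to_add); bounds are offsets from the threshold.
Step = Tuple[float, Optional[float], int]


def step_adjustment(metric_value: float, threshold: float, steps: List[Step]) -> int:
    """Return the capacity change for a scale-out step policy, or 0 if no step matches."""
    breach = metric_value - threshold
    for lower, upper, adjustment in steps:
        if breach >= lower and (upper is None or breach < upper):
            return adjustment
    return 0


if __name__ == "__main__":
    # Hypothetical policy: alarm threshold at 50% CPU.
    steps = [(0, 10, 1), (10, 20, 2), (20, None, 3)]
    for cpu in (40, 55, 65, 90):
        print(cpu, step_adjustment(cpu, threshold=50, steps=steps))
```

A simple scaling policy would correspond to a single step with no upper bound, which is why it cannot react more strongly to a larger breach.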

Important Timers and Defaults

Health Check Grace Period: 300 seconds (default) – time after launch before health checks start.

Cooldown Period: 300 seconds (default) for simple scaling. Step scaling and target tracking use instance warmup instead.

Default Termination Policy: 'Default' – picks the Availability Zone with the most instances, then the instance from the oldest launch template or launch configuration, then the one closest to the next billing hour.

Lifecycle Hook Timeout: 3600 seconds (default heartbeat timeout); with heartbeats, the total wait can reach 172800 seconds (48 hours).

Instance Warmup: For target tracking, you can set a warmup time (default 300 seconds) before a newly launched instance contributes to metrics.

Common Misconfigurations

Setting max size too low causes scaling to fail when demand spikes.

Not configuring health check grace period leads to premature termination of instances that are still booting.

Using simple scaling without cooldown leads to oscillating scaling.

Forgetting to attach a load balancer or target group means instances are not registered, causing traffic loss.

Verification Steps

Use aws autoscaling describe-auto-scaling-groups to see group details.

Use aws autoscaling describe-scaling-activities to see recent scaling events.

Check CloudWatch metrics for the ASG: GroupMinSize, GroupMaxSize, GroupDesiredCapacity, GroupInServiceInstances, GroupPendingInstances, GroupTerminatingInstances, GroupTotalInstances.
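When checking a group from a script, the JSON returned by describe-auto-scaling-groups can be summarized with a few lines of Python. The sketch below runs against a hand-written sample, so the group name and instance IDs are made up; only the field names (AutoScalingGroups, Instances, LifecycleState, HealthStatus, DesiredCapacity) follow the shape of the CLI output. The helper `summarize` is hypothetical.

```python
import json
from collections import Counter

# Hand-written sample shaped like `aws autoscaling describe-auto-scaling-groups` output.
SAMPLE = """
{
  "AutoScalingGroups": [
    {
      "AutoScalingGroupName": "my-asg",
      "MinSize": 1,
      "MaxSize": 5,
      "DesiredCapacity": 2,
      "Instances": [
        {"InstanceId": "i-0aaa", "LifecycleState": "InService", "HealthStatus": "Healthy"},
        {"InstanceId": "i-0bbb", "LifecycleState": "Pending", "HealthStatus": "Healthy"}
      ]
    }
  ]
}
"""


def summarize(raw_json: str) -> dict:
    """Map each group name to its desired capacity and instance state counts."""
    summary = {}
    for group in json.loads(raw_json)["AutoScalingGroups"]:
        states = Counter(i["LifecycleState"] for i in group["Instances"])
        unhealthy = [i["InstanceId"] for i in group["Instances"] if i["HealthStatus"] != "Healthy"]
        summary[group["AutoScalingGroupName"]] = {
            "desired": group["DesiredCapacity"],
            "in_service": states.get("InService", 0),
            "states": dict(states),
            "unhealthy": unhealthy,
        }
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

A gap between `desired` and `in_service` that persists is usually the cue to look at the scaling activities for the reason a launch failed.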

Exam Tips

Know the difference between EC2 status checks and ELB health checks. ELB health checks are more application-aware.

Remember that Auto Scaling balances instances only across the Availability Zones whose subnets you specify. For multi-AZ resilience, you must specify multiple subnets in different AZs.

If an instance is impaired (e.g., status check failed), Auto Scaling will terminate it and launch a new one. It does not attempt to recover it.

Lifecycle hooks can be used for graceful shutdown or custom startup.

Target tracking scaling policies are the most common on the exam because they are the simplest to set up.

Walk-Through

1. Define Launch Template

Create a launch template that specifies the AMI, instance type, key pair, security groups, and any user data scripts. This template is used for all instances launched by the Auto Scaling group. Launch templates support versioning and can be updated without affecting existing instances. For the exam, remember that launch templates are preferred over launch configurations because they offer more features, such as T2/T3 unlimited and network interfaces.

2. Create Auto Scaling Group

Create the Auto Scaling group with the launch template, specifying the VPC and subnets (at least two for high availability), min/max/desired capacity, health check type (EC2 or ELB), and health check grace period. The group will immediately launch the desired number of instances. The ASG continuously monitors the number of healthy instances and adjusts to maintain desired capacity.

3. Attach Load Balancer

Attach an Application Load Balancer (ALB) or Network Load Balancer (NLB) target group to the ASG. This ensures that newly launched instances are automatically registered with the load balancer and start receiving traffic after passing health checks. The load balancer's health checks can be used by the ASG to determine instance health. The deregistration delay (connection draining) setting on the target group allows in-flight requests to complete before an instance is terminated.

4. Configure Scaling Policy

Create a scaling policy, such as a target tracking policy based on average CPU utilization. This policy automatically adjusts the desired capacity to keep the metric at the target value (e.g., 50%). The ASG uses CloudWatch alarms to trigger scaling actions. For target tracking, AWS manages the alarms and cooldown internally. Step scaling allows more granular control with different adjustment steps for different metric thresholds.
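For intuition about what a target tracking policy aims for, a proportional estimate is a useful mental model. The sketch below is an approximation only, not the algorithm AWS uses: the function `estimate_desired` is hypothetical and assumes the metric scales inversely with the number of instances.

```python
import math


def estimate_desired(
    current_capacity: int,
    metric_value: float,
    target_value: float,
    min_size: int,
    max_size: int,
) -> int:
    """Estimate the capacity that would bring the metric back to its target.

    Proportional approximation: if 4 instances average 75% CPU and the
    target is 50%, roughly 4 * 75 / 50 = 6 instances are needed.
    """
    if target_value <= 0:
        raise ValueError("target value must be positive")
    raw = math.ceil(current_capacity * metric_value / target_value)
    # Scaling policies never move capacity outside the group's bounds.
    return max(min_size, min(max_size, raw))


if __name__ == "__main__":
    print(estimate_desired(4, 75.0, 50.0, min_size=1, max_size=10))
    print(estimate_desired(4, 200.0, 50.0, min_size=1, max_size=10))  # capped by max size
```

The second call illustrates why a max size set too low limits how far a spike can be absorbed, regardless of the policy type.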

5. Implement Lifecycle Hooks

Optionally add lifecycle hooks to perform custom actions during instance launch or termination. For example, a hook can pause the instance in the 'Pending:Wait' state to run a configuration script. The wait ends when you complete the hook via the CompleteLifecycleAction API or when the timeout expires. Hooks can send notifications to SNS or SQS to trigger a Lambda function for automation.

What This Looks Like on the Job

Scenario 1: E-Commerce Website with Variable Traffic

An e-commerce company runs a web application on EC2 behind an ALB. During normal hours, traffic is moderate, but during flash sales, traffic spikes 10x. They configure an Auto Scaling group with a target tracking policy on ALB RequestCountPerTarget. The group spans three Availability Zones with a min of 2 and max of 20 instances. They set the health check grace period to 300 seconds to allow instances to warm up. A lifecycle hook at launch runs a script to clear cache and register with a monitoring tool. During a flash sale, the ASG scales out rapidly, adding instances in batches. The cooldown (managed by target tracking) prevents oscillation. The team monitors CloudWatch metrics and sets up alarms for scaling failures. A common issue is that if the max size is too low, the ASG cannot handle the spike, causing throttling. They learned to set max size based on peak load estimates plus a buffer.

Scenario 2: Microservices with Blue/Green Deployment

A financial services company uses Auto Scaling groups for each microservice. They use lifecycle hooks for graceful shutdown: when an instance is terminating, a hook runs a script to drain connections and flush logs before the instance is terminated. They also use lifecycle hooks to run integration tests on new instances before they are marked healthy. They employ multiple scaling policies: a target tracking policy on CPU and a scheduled scaling policy to increase capacity before known peak times (e.g., market open). They encountered an issue where instances were terminated before draining connections, causing errors. They fixed it by increasing the lifecycle hook timeout and ensuring the draining script completed within the timeout. They also use ELB health checks with a custom path to verify application health. The ASG's health check grace period is set to 600 seconds because the application takes longer to initialize.

Scenario 3: Batch Processing with Spot Instances

A media processing company runs batch jobs on a mix of On-Demand and Spot Instances. They use an Auto Scaling group with a mixed instances policy to diversify across instance types and purchase options. They set a target capacity of 70% Spot and 30% On-Demand. The scaling policy is based on the number of jobs in an SQS queue (custom metric). They use lifecycle hooks to handle Spot interruption: when a Spot instance receives a termination notice, the termination hook gives the job a window to checkpoint to S3. The hook cannot extend the two-minute Spot notice, so they set the lifecycle hook timeout to 120 seconds. They also use CloudWatch to track Spot instance interruption rates and adjust the mix accordingly. A common mistake in load-balanced groups is not setting the health check type to 'ELB', causing the ASG to rely only on EC2 status checks, which do not detect application failures.
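A queue-driven policy like the one in this scenario is often built on a backlog-per-instance metric rather than the raw queue length. The sketch below shows the arithmetic under stated assumptions: the function names and all numbers (1500 visible messages, 20 seconds per job, a 600 second latency goal) are hypothetical and not taken from the scenario.

```python
import math


def backlog_per_instance(visible_messages: int, running_instances: int) -> float:
    """Queue backlog divided by fleet size; the custom metric to publish."""
    return visible_messages / max(running_instances, 1)


def acceptable_backlog(target_latency_s: float, avg_processing_s: float) -> float:
    """How many queued jobs one instance can hold and still meet the latency goal."""
    if avg_processing_s <= 0:
        raise ValueError("average processing time must be positive")
    return target_latency_s / avg_processing_s


def instances_needed(visible_messages: int, target_latency_s: float, avg_processing_s: float) -> int:
    """Fleet size at which the backlog per instance equals the acceptable backlog."""
    return math.ceil(visible_messages / acceptable_backlog(target_latency_s, avg_processing_s))


if __name__ == "__main__":
    print(backlog_per_instance(1500, 10))   # current metric value
    print(acceptable_backlog(600, 20))      # target value for the policy
    print(instances_needed(1500, 600, 20))  # capacity that meets the target
```

Using the acceptable backlog as the target value of a target tracking policy keeps the metric meaningful as the fleet grows, which a raw message count does not.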

How SOA-C02 Actually Tests This

What SOA-C02 Tests on This Topic

The SOA-C02 exam covers Auto Scaling and Auto Healing under Domain 2 (Reliability and Business Continuity), Objective 2.1: Implement scalability and elasticity. Specific sub-objectives include:
- 2.1.1: Configure Auto Scaling groups with launch templates
- 2.1.2: Implement scaling policies (simple, step, target tracking, scheduled)
- 2.1.3: Configure health checks (EC2 and ELB)
- 2.1.4: Implement lifecycle hooks
- 2.1.5: Troubleshoot scaling issues (e.g., not scaling, premature termination)

Common Wrong Answers and Why Candidates Choose Them

1. Wrong Answer: 'Auto Scaling will automatically balance instances across Availability Zones.' - Why Chosen: Candidates assume Auto Scaling distributes instances across every AZ in the region. Actually, it balances only across the AZs whose subnets you specify: with multiple subnets configured, the ASG launches new instances in the AZ with the fewest instances; with a single subnet, everything runs in one AZ. The exam tests this nuance.

2. Wrong Answer: 'Simple scaling policies are the most recommended because they are easy to configure.' - Why Chosen: Simple scaling is indeed simple, but AWS recommends target tracking because it manages cooldown and alarms automatically. The exam expects you to know best practices.

3. Wrong Answer: 'If an instance fails an EC2 status check, Auto Scaling will reboot it.' - Why Chosen: Candidates confuse Auto Scaling with EC2 auto-recovery. Auto Scaling terminates and replaces the instance; it does not reboot. EC2 auto-recovery is a separate feature for certain instance types.

4. Wrong Answer: 'Lifecycle hooks can only be used during instance launch.' - Why Chosen: Many think hooks are only for startup. They can also be used during termination. The exam tests both launch and termination hooks.

Specific Numbers and Terms That Appear on the Exam

Health Check Grace Period: Default 300 seconds in the console (0 when the group is created through the API or CLI without a value). Exam may ask what happens if you set it too low (premature termination) or too high (slow detection of unhealthy instances).

Cooldown Period: Default 300 seconds for simple scaling. Not applicable to step scaling or target tracking, which use instance warmup.

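Lifecycle Hook Timeout: Default 3600 seconds; with heartbeats, the total wait can reach 48 hours. Exam may ask what happens if you don't complete the action before timeout: the hook's default result applies. ABANDON (the default) terminates a launching instance, CONTINUE lets it enter service, and a terminating instance is terminated either way.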

Default Termination Policy: 'Default' – picks the Availability Zone with the most instances, then the instance from the oldest launch template or launch configuration, then the one closest to the next billing hour. Exam may ask which instance is terminated first.

Suspended Processes: 'Launch', 'Terminate', 'HealthCheck', 'ReplaceUnhealthy', etc. Exam may ask which process to suspend for maintenance.

Edge Cases and Exceptions

If you attach a load balancer or target group after the ASG is created, Auto Scaling registers the group's existing instances with it as well as any instances launched later.

If you set desired capacity outside min/max, the API call fails.

If you have multiple scaling policies, they can conflict. The exam may ask how conflicts are resolved: all policies are evaluated, and Auto Scaling follows the policy that yields the largest capacity, for both scale-out and scale-in.

Spot Instances: If a Spot request is interrupted, the ASG launches a replacement. If the Spot pool is empty, the ASG may fail to launch. Use mixed instances policies to diversify.
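Two of the edge cases above, conflicting policies and capacity bounds, can be combined in one small sketch. The function `resolve_capacity` is hypothetical; it assumes each active policy has already produced a recommended capacity and that the group follows the largest one, bounded by min and max size.

```python
from typing import Sequence


def resolve_capacity(recommendations: Sequence[int], min_size: int, max_size: int) -> int:
    """Pick the capacity to apply when several scaling policies are active.

    The largest recommendation wins (favoring availability), and the result
    is kept within the group's min and max size.
    """
    if not recommendations:
        raise ValueError("at least one policy recommendation is required")
    if min_size > max_size:
        raise ValueError("min size cannot exceed max size")
    return max(min_size, min(max_size, max(recommendations)))


if __name__ == "__main__":
    # CPU policy suggests 4 instances, request-count policy suggests 7.
    print(resolve_capacity([4, 7], min_size=2, max_size=20))
    # A recommendation above max size is capped rather than rejected.
    print(resolve_capacity([30, 7], min_size=2, max_size=20))
```

Note the contrast with a manual change: a scaling policy's result is capped at the bounds, whereas an explicit request for a desired capacity outside min/max is rejected.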

How to Eliminate Wrong Answers

Focus on the mechanism: Auto Scaling always maintains desired capacity. If an answer suggests it does something else (e.g., reboot), it's wrong.

Remember that health checks are separate from scaling policies. Health checks replace unhealthy instances; scaling policies adjust capacity.

Know the defaults: grace period, cooldown, termination policy. Many questions test exact default values.

Key Takeaways

Auto Scaling groups maintain a desired capacity of instances; they launch or terminate instances to match the desired capacity.

Default health check grace period is 300 seconds in the console (0 via the API or CLI unless specified); set it high enough to allow instance bootstrapping.

Target tracking scaling policies are recommended over simple and step scaling because they manage alarms and cooldown automatically.

Lifecycle hooks can pause instance launch or termination (default timeout 3600 seconds, extendable with heartbeats to a maximum of 48 hours); use them for custom actions.

Auto Scaling does not reboot unhealthy instances; it terminates and replaces them.

For high availability, specify subnets in at least two Availability Zones in the Auto Scaling group.

The default termination policy first picks the Availability Zone with the most instances, then the instance from the oldest launch template or launch configuration, then the instance closest to the next billing hour.

ELB health checks provide application-level health monitoring; EC2 status checks only verify instance reachability.

Suspended processes (e.g., HealthCheck, ReplaceUnhealthy) can be used for maintenance to prevent automatic replacement.

Instance refresh allows you to update instances in an ASG to a new launch template without manual termination.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Simple Scaling

Requires you to create CloudWatch alarms manually.

Has a cooldown period (default 300s) that blocks scaling activities.

Adjusts desired capacity by a fixed number or percentage.

Can cause oscillation if cooldown is too short.

Less recommended by AWS; more complex to configure.

Target Tracking Scaling

AWS automatically creates and manages CloudWatch alarms.

No cooldown period; AWS manages scaling internally.

Automatically adjusts to keep a metric at a target value.

Smoother scaling, less oscillation.

Recommended by AWS; simpler to set up.

Watch Out for These

Mistake

Auto Scaling automatically distributes instances evenly across all Availability Zones in the region.

Correct

Auto Scaling distributes instances across the subnets you specify in the VPCZoneIdentifier. If you specify subnets in two AZs, it will attempt to balance across them. If you specify only one subnet, all instances launch in that AZ. The exam expects you to know that you must configure multiple subnets for high availability.

Mistake

If an instance fails an ELB health check, Auto Scaling immediately terminates it.

Correct

The instance is marked unhealthy, but Auto Scaling waits for the health check grace period (if it's a new instance) and then terminates it. The termination is not instant; there is a brief delay. Also, if the ASG is using only EC2 status checks, ELB health checks are ignored.

Mistake

Lifecycle hooks can only be used during instance launch to install software.

Correct

Lifecycle hooks can be used for both launch and termination. For termination, they can be used to gracefully shut down applications, drain connections, or upload logs before the instance is terminated.

Mistake

Target tracking scaling policies require you to create CloudWatch alarms manually.

Correct

AWS automatically creates and manages the CloudWatch alarms for target tracking policies. You only specify the target metric and value. This is a key advantage over simple and step scaling.

Mistake

You can change the launch template of an existing Auto Scaling group and it will update all running instances.

Correct

Changing the launch template does not affect running instances. Only new instances launched after the change use the new template. To update existing instances, you must manually terminate them or use an instance refresh.

Frequently Asked Questions

What is the difference between an Auto Scaling group and a launch template?

A launch template defines the configuration for EC2 instances (AMI, instance type, security groups, etc.). An Auto Scaling group is a container that uses a launch template to launch and manage instances. The ASG controls the number of instances, scaling policies, health checks, and lifecycle hooks. You can have multiple ASGs using the same launch template.

How does Auto Scaling determine which instance to terminate when scaling in?

The default termination policy first looks at the Availability Zone with the most instances, then selects the instance launched from the oldest launch template or launch configuration. If multiple instances qualify, it selects the one closest to the next billing hour. You can also choose other policies like 'OldestInstance', 'NewestInstance', or 'ClosestToNextInstanceHour'.

Can I use lifecycle hooks to run a script on instance termination?

Yes, you can create a lifecycle hook for the 'autoscaling:EC2_INSTANCE_TERMINATING' transition. The instance will enter the 'Terminating:Wait' state, allowing you to run a script (e.g., to drain connections, upload logs). The wait ends when you complete the hook via the CompleteLifecycleAction API or when the timeout expires (default 3600 seconds).

What happens if I set the desired capacity outside the min and max range?

The API call will fail with an error. The desired capacity must be between min and max inclusive. If you attempt to update the desired capacity to a value outside the range, the operation is rejected.

Does Auto Scaling automatically register instances with a load balancer?

Yes, if you attach a target group (for ALB/NLB) or a Classic Load Balancer to the ASG, instances are automatically registered when they launch and deregistered when they terminate. You must ensure the security groups allow traffic from the load balancer.

Can I have multiple scaling policies on the same Auto Scaling group?

Yes, you can have multiple policies (e.g., target tracking on CPU and a scheduled scaling policy). However, they can conflict. AWS evaluates all policies and follows the one that yields the largest capacity, for both scale-out and scale-in. It's best to use one target tracking policy and supplement with scheduled scaling if needed.

What is the default health check grace period and why is it important?

The default is 300 seconds (5 minutes) when you create the group in the console; through the API or CLI it is 0 unless you set it. It prevents Auto Scaling from marking a newly launched instance as unhealthy and terminating it before it has time to boot and pass health checks. If you set it too low, instances may be terminated prematurely. If set too high, unhealthy instances may remain in service longer.
