This chapter covers EC2 Target Tracking Scaling, a core Auto Scaling policy type that simplifies dynamic scaling by letting you define a target metric value and letting AWS automatically adjust capacity. For the SAA-C03 exam, Target Tracking is a high-frequency topic, appearing in roughly 5-10% of questions related to Resilient Architectures (Objective 2.1). Understanding its mechanics, configuration, and common pitfalls is essential for choosing the right scaling policy in scenario-based questions.
Think of Target Tracking Scaling as cruise control in a modern car. You set a desired speed, say 65 mph, and the system continuously monitors the actual speed via wheel sensors. If the car slows on an uphill stretch, the system increases throttle to hold 65 mph. If it speeds up downhill, it reduces throttle or even applies the brakes. The driver makes no manual adjustments; the system reacts to keep the speed close to the target. Target Tracking works the same way: you define a target value for a metric, such as average CPU utilization at 50%, and Auto Scaling continuously monitors the CloudWatch metric for the ASG. When actual CPU utilization rises above 50%, it adds instances (more throttle). When utilization stays below the target, it removes instances (less throttle), deliberately more slowly than it adds them. The scaling is roughly proportional: small deviations cause small adjustments, and large deviations cause larger ones. AWS computes the required number of instances from the metric's current value and the target. An instance warmup period keeps newly launched instances from skewing the metric and causing oscillations, much as cruise control waits for the car to respond before adjusting again. The result is stable, automatic scaling without manual intervention.
What is Target Tracking Scaling?
Target Tracking Scaling is a policy type for Amazon EC2 Auto Scaling that automatically adjusts the desired capacity of an Auto Scaling group (ASG) to keep a specific CloudWatch metric at a target value. It is one of three dynamic scaling policy types (the others being Simple Scaling and Step Scaling), but it is the simplest to configure because you only need to specify the metric and the target value. AWS handles the logic of adding or removing instances proportionally.
Why It Exists
Before Target Tracking, you had to create CloudWatch alarms and define scaling adjustments manually. For example, you might create an alarm for CPU > 70% that adds 2 instances, and another for CPU < 30% that removes 1 instance. This required careful tuning to avoid oscillations and wasted capacity. Target Tracking automates this: you set a target (e.g., average CPU at 50%), and AWS calculates the number of instances needed to maintain that target, scaling smoothly up or down.
How It Works Internally
AWS does not publish the exact algorithm behind Target Tracking, but it behaves like a proportional feedback controller, similar to a thermostat: the further the metric is from the target, the larger the capacity change. A useful approximation of the required capacity is:
RequiredCapacity = (CurrentMetricValue / TargetMetricValue) * CurrentCapacity
For example, if target CPU is 50%, current CPU is 75%, and current capacity is 10 instances:
RequiredCapacity = (75 / 50) * 10 = 15
So the system would scale up to 15 instances. If CPU drops to 25%:
RequiredCapacity = (25 / 50) * 10 = 5
It would scale down to 5. However, the actual scaling decision also considers a stabilization period to avoid flapping.
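The arithmetic above can be checked with a few lines of Python. This is a minimal sketch of the proportional approximation only, not AWS's actual implementation; the function name `required_capacity` and the choice to round up are assumptions made for illustration.

```python
import math


def required_capacity(current_metric: float, target_metric: float,
                      current_capacity: int) -> int:
    """Proportional estimate: scale capacity by (current metric / target metric)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    # Multiply before dividing to keep the arithmetic exact for round numbers.
    # Rounding up is an assumption: it errs on the side of enough capacity.
    return math.ceil(current_metric * current_capacity / target_metric)


print(required_capacity(75, 50, 10))  # 15
print(required_capacity(25, 50, 10))  # 5
```

The estimate is only as good as the assumption that the metric falls in proportion as instances are added, which holds reasonably well for average CPU under evenly balanced load.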
Key Components and Defaults
- Target Metric: The metric you want to track. AWS provides these predefined metrics:
  - ASGAverageCPUUtilization
  - ASGAverageNetworkIn
  - ASGAverageNetworkOut
  - ALBRequestCountPerTarget (requires an ALB target group)
- Target Value: The desired value for the metric. For CPU, a common target is 50%.
- Disable Scale-In: By default, Target Tracking can both scale out and scale in. You can disable scale-in if you want to only scale out and manage scale-in separately (e.g., with a scheduled action).
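- Cooldown Period: The ASG's default cooldown is 300 seconds (5 minutes), but it applies to simple scaling policies. Target Tracking policies do not wait for a cooldown; they rely on instance warmup instead.
- Scale-in Cooldown Period: EC2 Auto Scaling Target Tracking policies have no separate scale-in cooldown setting. The ScaleInCooldown and ScaleOutCooldown parameters belong to Application Auto Scaling (for example, ECS services or DynamoDB tables).
- Estimated Instance Warmup: The time after launch before a new instance's metrics count toward the group's aggregated metric. If you do not set it, it falls back to the group's default instance warmup or, failing that, to the default cooldown of 300 seconds. This keeps instances that are still initializing from distorting scaling decisions.
- CloudWatch Alarms: Target Tracking automatically creates and manages a pair of CloudWatch alarms for the metric, one that triggers scale-out and one that triggers scale-in. They use a 60-second period, and their thresholds are derived from the target value. You can view them in the CloudWatch console.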
Configuration and Verification
You can configure Target Tracking via AWS Console, CLI, or CloudFormation. Example CLI command to create a scaling policy:
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-asg \
  --policy-name cpu-target \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration 'TargetValue=50.0,PredefinedMetricSpecification={PredefinedMetricType=ASGAverageCPUUtilization}'
To verify, describe the policy:
aws autoscaling describe-policies --auto-scaling-group-name my-asg
The output shows the policy type, target value, and the automatically created alarms.
Interaction with Related Technologies
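CloudWatch: Target Tracking relies on CloudWatch metrics and alarms. Metrics at 1-minute granularity give the fastest response, so enable detailed monitoring on the instances; with basic monitoring, EC2 metrics arrive only every 5 minutes. Custom metrics do not need to be high resolution, but publishing them every minute is recommended.
Load Balancers: For ALBRequestCountPerTarget, you must identify the target group. In the API this is the ResourceLabel field, which is built from the final portions of the load balancer ARN and the target group ARN. The metric is the request count per target in that group.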
Lifecycle Hooks: Target Tracking works with lifecycle hooks. Instances launched or terminated by scaling policies will trigger lifecycle hooks if configured.
Scheduled Scaling: You can combine Target Tracking with scheduled scaling. For example, schedule a larger ASG during business hours and let Target Tracking fine-tune within that range.
Predictive Scaling: This is a separate feature that forecasts future traffic from historical patterns and pre-scales. It is commonly paired with Target Tracking, which handles real-time deviations from the forecast.
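For the load balancer case, the policy configuration needs one extra field. The sketch below builds the configuration document as a Python dictionary; the helper name `target_tracking_config` and the ARN fragments in the example label are made up for illustration, while the key names (`TargetValue`, `PredefinedMetricSpecification`, `PredefinedMetricType`, `ResourceLabel`, `DisableScaleIn`) follow the EC2 Auto Scaling API.

```python
import json


def target_tracking_config(metric_type, target_value,
                           resource_label=None, disable_scale_in=False):
    """Build a TargetTrackingConfiguration document for a predefined metric."""
    spec = {"PredefinedMetricType": metric_type}
    if metric_type == "ALBRequestCountPerTarget":
        # The ALB metric is meaningless without knowing which target group to read.
        if not resource_label:
            raise ValueError("ALBRequestCountPerTarget needs a resource label")
        spec["ResourceLabel"] = resource_label
    return {
        "TargetValue": float(target_value),
        "PredefinedMetricSpecification": spec,
        "DisableScaleIn": disable_scale_in,
    }


cpu = target_tracking_config("ASGAverageCPUUtilization", 50)
alb = target_tracking_config(
    "ALBRequestCountPerTarget", 2000,
    # Hypothetical label: app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>
    resource_label="app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210",
)
print(json.dumps(alb, indent=2))
```

Saved to a file such as config.json, a document like this can be passed to the CLI with --target-tracking-configuration file://config.json instead of the inline shorthand.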
Important Values and Timers
Default cooldown: 300 seconds (group-level setting, used by simple scaling policies)
Scale-in cooldown: not a setting for EC2 Auto Scaling Target Tracking (it exists in Application Auto Scaling)
Estimated Instance Warmup: the time before a new instance contributes to the aggregated metric; effectively 300 seconds when nothing else is configured, and settable per policy
Metric period: 60 seconds (with detailed monitoring)
Limitations
You cannot create multiple Target Tracking policies with the same metric. If you need multiple targets, use different metrics (e.g., CPU and network).
Target Tracking does not support step adjustments. It always scales proportionally.
It cannot scale to zero instances unless the ASG's minimum size is zero and the metric stays below target.
If the metric is missing (e.g., no data points), the policy does not scale. You should configure a separate alarm for missing data if needed.
Exam Traps
Cooldown vs. Warmup: Cooldown blocks further scaling after a scaling activity and applies to simple scaling policies. Warmup keeps new instances out of metric calculations and is what Target Tracking and Step Scaling use. Candidates often confuse them.
Multiple Policies: If you have multiple Target Tracking policies (with different metrics), they can conflict. AWS scales out to the largest capacity that any policy requests, and scales in only when every policy with scale-in enabled agrees.
Disable Scale-In: If you disable scale-in, the ASG will only scale out. This can lead to over-provisioning if not managed.
ALB Request Count: The metric is per target, not overall. If 10 instances each receive 100 requests, the metric is 100, not 1,000. The target value is also per target, so a target of 2000 means 2,000 requests per instance. This is often tested.
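The multiple-policies trap is easier to remember as a one-line rule. The sketch below assumes each policy has already worked out the capacity it wants; the function name and the policy labels are hypothetical.

```python
def resolve_desired_capacity(policy_requests):
    """Pick the capacity that satisfies every policy: the largest request wins."""
    if not policy_requests:
        raise ValueError("at least one policy is required")
    return max(policy_requests.values())


current = 10
print(resolve_desired_capacity({"cpu": 12, "network": 15}))  # 15: scale out to the largest
print(resolve_desired_capacity({"cpu": 8, "network": 11}))   # 11: one policy still needs more
print(resolve_desired_capacity({"cpu": 8, "network": 9}))    # 9: scale in only when all agree
```

Taking the maximum also explains the scale-in behaviour: the result drops below the current capacity only when every policy asks for less.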
Step-by-Step Mechanism
Metric Collection: CloudWatch collects the specified metric (e.g., CPU) from all instances in the ASG, every 60 seconds when detailed monitoring is enabled.
Alarm Evaluation: The CloudWatch alarms associated with the policy evaluate the metric. The scale-out alarm enters ALARM when the metric stays above the target; the scale-in alarm enters ALARM when the metric stays below a threshold somewhat under the target.
Scaling Calculation: When an alarm triggers, Auto Scaling calculates the required capacity in proportion to how far the metric is from the target.
Scale Out/In: If required capacity > current capacity, a scale-out activity is initiated. If required capacity < current capacity, a scale-in activity is initiated (unless scale-in is disabled).
Instance Launch/Terminate: EC2 Auto Scaling launches new instances or terminates existing ones. The number of instances added/removed is the difference between required and current.
Cooldown: Unlike simple scaling, Target Tracking does not pause for the 300-second default cooldown after a scaling activity. It keeps evaluating the metric and relies on warmup to avoid over-reacting.
Warmup: New instances are in warming state for the estimated instance warmup period (default 300 seconds). Their metrics are not included in the metric aggregation until they are warmed up.
Metric Stabilization: After warmup, the new instances' metrics are included. The metric should move toward the target.
Real-World Scenario
A typical e-commerce application uses Target Tracking with a CPU target of 50%. During a flash sale, a traffic spike pushes CPU to 80%. The policy calculates the required capacity as roughly (80/50) * current, or 1.6 times the current capacity, and scales out by adding instances. Once the new instances have warmed up and traffic subsides, CPU drops to 40%, so the policy gradually scales in. Without Target Tracking, you would have to create and tune the alarms and adjustments yourself. With it, the system handles load changes automatically.
Exam Focus
SAA-C03 tests your ability to choose the right scaling policy for a given scenario. Key points:
Target Tracking is best for steady-state metrics where you want to maintain a specific utilization.
Use Simple or Step Scaling when you need more control over scaling steps (e.g., add 2 instances at 70%, add 4 at 90%).
For traffic with recurring daily or weekly patterns, consider Predictive Scaling combined with Target Tracking; Target Tracking on its own reacts to unpredictable changes.
Be aware of the default cooldown (300 seconds, used by simple scaling) and of the instance warmup that Target Tracking relies on.
The exam often presents a scenario that tempts you to choose Step Scaling on the assumption that Target Tracking cannot handle multiple load levels. In fact, Target Tracking adjusts continuously and proportionally.
Common wrong answers:
Choosing Simple Scaling because it's simpler: but Simple Scaling requires manual alarm setup and is less responsive.
Thinking Target Tracking can scale to zero: it can only if min=0 and metric stays below target.
Confusing cooldown with warmup: cooldown prevents scaling actions; warmup prevents metric inclusion.
Assuming multiple Target Tracking policies with the same metric are allowed: they are not.
Numbers to Memorize
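Default cooldown: 300 seconds (applies to simple scaling)
Default warmup: 300 seconds in effect when no instance warmup is configured
Metric period: 60 seconds (with detailed monitoring)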
Predefined metric types: ASGAverageCPUUtilization, ASGAverageNetworkIn, ASGAverageNetworkOut, ALBRequestCountPerTarget
Target value: typically 50% for CPU, where any value between 0 and 100 is meaningful; for request-count or network metrics the target is an absolute number.
Edge Cases
Missing data: If the metric has no data points, the policy does not scale. Use a separate alarm to handle missing data.
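Metric spike: A brief spike may not keep the metric above the target long enough to trigger the scale-out alarm. You cannot tune the managed alarms, so enable detailed monitoring to get 1-minute data points and the fastest reaction.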
Scale-in protection: Instances with scale-in protection will not be terminated during scale-in. This can cause the ASG to have more instances than desired.
Lifecycle hooks: If a scale-in is blocked by a lifecycle hook that is completing, the policy may continue to request scale-in, leading to multiple terminations once the hook completes.
Verification Commands
Check scaling activities:
aws autoscaling describe-scaling-activities --auto-scaling-group-name my-asg
Check policy details:
aws autoscaling describe-policies --auto-scaling-group-name my-asg --policy-names cpu-target
View the CloudWatch alarms:
aws cloudwatch describe-alarms --alarm-name-prefix "TargetTracking-my-asg"
Summary
Target Tracking Scaling is a powerful, easy-to-use dynamic scaling policy. It automates capacity management by maintaining a target metric value. Understand its components, defaults, and limitations to ace SAA-C03 questions.
Metric Collection and Aggregation
CloudWatch collects the specified metric (e.g., CPU utilization) from each EC2 instance in the Auto Scaling group. With detailed monitoring enabled, data points arrive every 60 seconds; with basic monitoring, every 5 minutes. The metric is aggregated across all instances, and the predefined metrics use the Average statistic. For example, if there are 10 instances with CPU values 40%, 50%, 60%, etc., the average is computed. This aggregated value is the current metric value used for scaling decisions.
CloudWatch Alarm Evaluation
The Target Tracking policy automatically creates and manages CloudWatch alarms that evaluate the metric: one for scale-out and one for scale-in. AWS calculates the exact thresholds. The scale-out alarm fires when the metric stays above the target for a few consecutive periods. The scale-in alarm uses a threshold somewhat below the target and a longer evaluation window, which avoids flapping and makes scale-in more conservative than scale-out. The alarms use a period of 60 seconds and are evaluated every minute.
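In practice the policy manages a pair of alarms, a fast one for scale-out and a slow one for scale-in. The sketch below models that asymmetry with a simple "N consecutive data points" rule. The window lengths (3 and 15 one-minute data points) and the scale-in threshold of 45 are assumptions chosen for illustration; AWS sets the real values itself.

```python
def alarm_fires(datapoints, threshold, needed, above=True):
    """True when the last `needed` data points all breach the threshold."""
    recent = datapoints[-needed:]
    if len(recent) < needed:
        return False  # not enough data yet, so no scaling decision
    if above:
        return all(value > threshold for value in recent)
    return all(value < threshold for value in recent)


TARGET = 50           # target CPU utilization, percent
SCALE_IN_BELOW = 45   # assumed margin under the target, for illustration

busy = [48, 62, 65, 70]
print(alarm_fires(busy, TARGET, needed=3, above=True))              # True: scale out

quiet = [40] * 14
print(alarm_fires(quiet, SCALE_IN_BELOW, needed=15, above=False))   # False: too soon
quiet.append(40)
print(alarm_fires(quiet, SCALE_IN_BELOW, needed=15, above=False))   # True: scale in
```

The longer scale-in window is the reason capacity is added within minutes but removed much more gradually.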
Scaling Calculation by Auto Scaling
When an alarm triggers, Auto Scaling computes the required capacity in proportion to how far the metric is from the target. AWS does not document the exact algorithm, but a good approximation is: required = (current metric / target metric) * current capacity. For example, with target 50%, current 75%, and capacity 10, required = (75/50)*10 = 15. The algorithm also considers a stabilization period to avoid reacting to transient spikes. The number of instances to add or remove is the difference between required and current capacity, rounded up for scale-out and down for scale-in, and the result always stays within the group's minimum and maximum size.
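The rounding rule is easier to follow in code. This is a sketch of the approximation described above, with the group's size limits passed in as hypothetical `min_size` and `max_size` parameters; it is not AWS's actual implementation.

```python
import math


def scaling_delta(current_metric, target_metric, current_capacity,
                  min_size, max_size):
    """Instances to add (positive) or remove (negative) for one scaling decision."""
    required = current_metric * current_capacity / target_metric
    if required > current_capacity:
        delta = math.ceil(required - current_capacity)    # round up when adding
    else:
        delta = -math.floor(current_capacity - required)  # round down when removing
    # Never leave the group's configured size limits.
    new_capacity = max(min_size, min(max_size, current_capacity + delta))
    return new_capacity - current_capacity


print(scaling_delta(75, 50, 10, min_size=1, max_size=20))  # 5
print(scaling_delta(47, 50, 10, min_size=1, max_size=20))  # 0
print(scaling_delta(90, 50, 10, min_size=1, max_size=12))  # 2
print(scaling_delta(5, 50, 10, min_size=2, max_size=20))   # -8
```

Both rounding directions err toward keeping capacity: a fractional requirement of 9.4 instances leaves the group at 10 rather than shrinking it to 9. The last two calls show the size limits overriding the calculation, which is why a group with a minimum of 1 never reaches zero.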
Initiate Scaling Activity
Auto Scaling initiates a scaling activity: either a scale-out (launch instances) or a scale-in (terminate instances). The number of instances to change is the calculated difference. For scale-out, new instances are launched using the launch template or launch configuration. For scale-in, instances are chosen by the termination policy (default: balance across Availability Zones first, then the oldest launch template or configuration, then the instance closest to the next billing hour). The activity is recorded in the scaling activities history.
Cooldown and Warmup Periods
Target Tracking does not wait out the 300-second default cooldown after a scaling activity; that setting governs simple scaling policies. Instead, for scale-out, new instances enter a warming state for the estimated instance warmup period (300 seconds unless configured otherwise). Their metrics are not included in the metric aggregation until they are warmed up. This keeps instances that are still initializing from distorting the metric and triggering premature scaling decisions.
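The warmup exclusion described above can be sketched as a filter applied before averaging. The `Instance` class, its field names, and the plain numeric clock are hypothetical; the point is only that instances younger than the warmup period do not contribute to the aggregate.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    cpu: float          # latest CPU utilization, percent
    launched_at: float  # launch time in seconds on an arbitrary clock


def aggregated_cpu(instances, now, warmup_seconds=300):
    """Average CPU over instances that have finished warming up."""
    warm = [i.cpu for i in instances if now - i.launched_at >= warmup_seconds]
    if not warm:
        return None  # nothing to aggregate yet
    return sum(warm) / len(warm)


fleet = [Instance(70, 0), Instance(80, 0), Instance(5, 900)]
print(aggregated_cpu(fleet, now=1000))  # 75.0: the new, idle instance is ignored
print(aggregated_cpu(fleet, now=1200))  # about 51.7 once it has warmed up
```

Without the filter, the idle newcomer would drag the average down to about 52% immediately and hide the fact that the established instances are still overloaded.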
Metric Stabilization and Continuous Monitoring
After warmup, the new instances' metrics are included in the aggregation, and the metric should move toward the target value. If the metric is still above target, the scale-out alarm remains in ALARM, leading to another scale-out. If it stays below target for long enough, scale-in occurs. This feedback loop continues, keeping the metric near the target value without manual intervention.
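The feedback loop can be simulated in a few lines. This is a deliberately idealized sketch: it assumes total load is spread evenly so that average utilization equals load divided by capacity, and it ignores warmup, alarm evaluation windows, and the slower pace of scale-in. The load figures are invented.

```python
import math


def simulate(loads, capacity, target=50.0):
    """Return (observed metric, new capacity) for each interval of load."""
    history = []
    for load in loads:
        metric = load / capacity  # average utilization seen this interval
        # round() guards against float noise pushing ceil() one instance too high
        required = math.ceil(round(metric * capacity / target, 6))
        capacity = max(1, required)
        history.append((metric, capacity))
    return history


for metric, capacity in simulate([500, 750, 900, 450], capacity=10):
    print(f"observed {metric:.1f}% -> capacity {capacity}")
# observed 50.0% -> capacity 10
# observed 75.0% -> capacity 15
# observed 60.0% -> capacity 18
# observed 25.0% -> capacity 9
```

Each interval the policy observes where the metric landed, resizes the group, and the next observation reflects the new capacity.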
Scenario 1: E-Commerce Platform with Variable Traffic
A large e-commerce company runs its web tier on an Auto Scaling group behind an Application Load Balancer. Traffic varies throughout the day, with peaks during lunch and evening hours. They use Target Tracking with the ASGAverageCPUUtilization metric set to 50%. During low traffic, the instance count shrinks to save costs, and during spikes, capacity increases automatically. The instance warmup is kept at 300 seconds to prevent oscillations. They also enable scale-in protection for instances running critical batch jobs so that those instances are not terminated prematurely. In production, they observed that during sudden flash sales, CPU could spike to 90% momentarily, but the policy would quickly add instances, bringing CPU back to 50% within 2-3 minutes. Misconfiguration: initially they set the target too low (30%), causing over-provisioning and wasted cost. After adjusting to 50%, they achieved a balance.
Scenario 2: Microservices with ALB Request Count
A SaaS provider uses a microservices architecture where each service runs on its own ASG. For a service that processes API requests, they use Target Tracking with the ALBRequestCountPerTarget metric, targeting 2000 requests per instance, so each instance handles a consistent load. They identify the ALB target group in the policy. During a marketing campaign, request volume tripled, and the policy scaled out from 10 to 30 instances within minutes. However, they noticed that during scale-in, instances were terminated while still processing requests, so they added lifecycle hooks to delay termination until in-flight requests complete. They did not need a separate scale-in cooldown: EC2 Auto Scaling Target Tracking has no such setting and already scales in more conservatively than it scales out. Common pitfall: forgetting to specify the target group, which causes policy creation to fail.
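The numbers in this scenario follow directly from the per-target definition of the metric. In the sketch below, the baseline of 20,000 requests per period is an assumption implied by 10 instances at 2,000 requests each; the function names are hypothetical.

```python
import math


def request_count_per_target(total_requests, targets):
    """The per-target metric: total requests divided by registered targets."""
    return total_requests / targets


def targets_needed(total_requests, target_per_instance):
    """Instances required to bring the per-target metric back to the target value."""
    return math.ceil(total_requests / target_per_instance)


baseline = 20_000          # assumed: 10 instances * 2,000 requests each
campaign = baseline * 3    # request volume triples

print(request_count_per_target(campaign, 10))  # 6000.0, three times the target
print(targets_needed(campaign, 2000))          # 30
```

The total request count never appears in the policy; only the per-target value of 2000 does, which is why the same policy keeps working as traffic grows.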
Scenario 3: Batch Processing with Custom Metrics
A financial services company runs batch processing jobs on EC2 instances. They use a custom metric that tracks the queue depth of a job queue. They publish this metric to CloudWatch every 60 seconds. They create a Target Tracking policy with a custom metric specification targeting a queue depth of 100. As jobs pile up, the metric increases, triggering scale-out. When the queue drains, scale-in reduces capacity. They set the estimated instance warmup to 600 seconds because batch jobs take longer to initialize. They also disable scale-in to avoid terminating instances that are still processing. Instead, they use a separate scheduled action to scale in during off-hours. This combination ensures that the processing capacity matches the workload without manual intervention. A mistake they made was using a metric with a period longer than 60 seconds, causing slow response. They switched to a 60-second period for faster reaction.
What SAA-C03 Tests
The SAA-C03 exam (Objective 2.1: Design resilient architectures) tests your ability to select the appropriate scaling policy for Auto Scaling groups. Target Tracking is one of the key options. You must understand when to use it vs. Simple/Step Scaling or Scheduled Scaling. Specific areas:
Recognize that Target Tracking is ideal for maintaining a steady metric value (e.g., CPU at 50%).
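Know the predefined metrics and that ALBRequestCountPerTarget requires you to identify the target group (the ResourceLabel in the API).
Understand cooldown versus warmup: the 300-second default cooldown governs simple scaling, while Target Tracking relies on instance warmup.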
Know that you can disable scale-in to prevent automatic reduction.
Be aware that multiple Target Tracking policies can coexist only if they use different metrics.
Common Wrong Answers
Choosing Step Scaling because it allows multiple thresholds: Candidates often think Target Tracking cannot handle different scaling levels. Actually, Target Tracking automatically adjusts proportionally, so it handles all levels seamlessly. Step Scaling is for when you need different adjustment sizes at different metric thresholds.
Believing Target Tracking can scale to zero instances: It can only scale to zero if the ASG's minimum size is set to zero and the metric remains below the target for an extended period. The exam might present a scenario where min=1, making scale-to-zero impossible.
Confusing cooldown and warmup: Cooldown prevents scaling actions; warmup prevents metric inclusion. A common trap: saying 'cooldown prevents new instances from being counted' — that's warmup.
Assuming Target Tracking works with any metric: It works with CloudWatch metrics, but for custom metrics, you must use CustomizedMetricSpecification. The exam tests that you know the difference between predefined and custom.
Numbers and Terms on the Exam
Cooldown: 300 seconds default (simple scaling)
Warmup: 300 seconds in effect when not configured
Predefined metric types: ASGAverageCPUUtilization, ASGAverageNetworkIn, ASGAverageNetworkOut, ALBRequestCountPerTarget
Target value: often 50% for CPU
DisableScaleIn: boolean parameter
Edge Cases
Missing data: If the metric has no data points, the alarm state is INSUFFICIENT_DATA. Target Tracking does not scale in this state, because it does not treat missing data as low utilization. The exam might ask what happens if the metric is missing; the answer is no scaling.
Multiple policies: If you have two Target Tracking policies with different metrics, Auto Scaling scales out to the largest capacity either policy requires and scales in only when both allow it. This is a key exam point.
Lifecycle hooks: If a lifecycle hook is in progress, scale-in might be delayed. The exam might test that scaling can continue even if hooks are pending.
How to Eliminate Wrong Answers
If the scenario mentions 'maintain CPU at 50%', Target Tracking is likely the answer.
If the scenario requires different scaling adjustments at different thresholds (e.g., add 1 at 70%, add 3 at 90%), choose Step Scaling.
If the scenario involves predictable time-based patterns, consider Scheduled Scaling.
If the scenario mentions 'simplest to configure', Target Tracking is the simplest dynamic policy.
Remember: Target Tracking is a 'set and forget' policy. It automates the calculation of capacity. The exam loves to test that you understand this automation and its limitations.
Target Tracking Scaling automatically adjusts ASG capacity to maintain a target metric value (e.g., CPU at 50%).
The default cooldown is 300 seconds, but it governs simple scaling; Target Tracking relies on estimated instance warmup, which is effectively 300 seconds unless configured.
Predefined metrics: ASGAverageCPUUtilization, ASGAverageNetworkIn, ASGAverageNetworkOut, ALBRequestCountPerTarget.
You can disable scale-in to prevent automatic instance termination.
Multiple Target Tracking policies are allowed only if they use different metrics.
Target Tracking cannot scale below the ASG's minimum size.
If the metric has no data points, no scaling occurs.
These come up on the exam all the time. Here's how to tell them apart.
Target Tracking Scaling
Simpler to configure: only need target metric value
Adjusts capacity in proportion to the gap between the metric and the target
Automatically creates CloudWatch alarms
Cannot have multiple policies with same metric
Best for steady-state metric maintenance
Step Scaling
Requires defining step adjustments for different metric ranges
Uses static step adjustments (e.g., add 2 instances at 70%)
Requires manual CloudWatch alarm creation
Can have multiple step adjustments per alarm
Best when different scaling magnitudes are needed at different thresholds
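The difference between the two policies shows up clearly when both react to the same CPU reading. The sketch below uses the step sizes mentioned earlier in this chapter (add 2 instances at 70%, add 4 at 90%) and the proportional approximation for Target Tracking; both functions are simplified models, not AWS's implementation.

```python
import math

# (CPU threshold in percent, instances to add), checked from the highest step down
STEPS = [(90, 4), (70, 2)]


def step_scaling_adjustment(cpu):
    """Fixed adjustment chosen by which threshold the metric has crossed."""
    for threshold, add in STEPS:
        if cpu >= threshold:
            return add
    return 0


def target_tracking_adjustment(cpu, capacity, target=50):
    """Adjustment proportional to how far the metric is from the target."""
    return math.ceil(cpu * capacity / target) - capacity


for cpu in (60, 75, 95):
    print(cpu, step_scaling_adjustment(cpu), target_tracking_adjustment(cpu, 10))
# 60 0 2
# 75 2 5
# 95 4 9
```

Step Scaling does nothing until a threshold you defined is crossed and then adds a fixed amount, whereas Target Tracking reacts to any deviation and sizes the change to the current capacity.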
Mistake
Target Tracking can scale to zero instances even if the minimum size is 1.
Correct
Target Tracking cannot scale below the ASG's minimum size. If the minimum is 1, the ASG will never go to zero, regardless of the metric. To scale to zero, you must set the minimum to 0.
Mistake
Target Tracking uses step adjustments like Step Scaling.
Correct
Target Tracking adjusts capacity proportionally rather than through step adjustments. It calculates the number of instances needed to meet the target. Step Scaling uses predefined step adjustments based on alarm breach magnitude.
Mistake
The cooldown period applies to individual instances, not the group.
Correct
Cooldown is a setting of the Auto Scaling group as a whole, not of individual instances, and it governs simple scaling policies: after a simple scaling activity, the group waits out the cooldown before acting again. Target Tracking does not wait for cooldown; it uses instance warmup instead.
Mistake
You can have multiple Target Tracking policies using the same metric.
Correct
AWS does not allow multiple Target Tracking policies with the same metric. If you try to create a second policy with the same predefined metric type, you will get an error. You can have policies with different metrics.
Mistake
Target Tracking automatically creates a CloudWatch alarm that you can modify.
Correct
The alarm is created and managed by Auto Scaling. You can view it, but you should not modify it. If you modify it, the policy may behave unexpectedly. AWS recommends leaving the alarm as is.
Target Tracking automatically adjusts capacity to keep a metric at a target value, using a proportional algorithm. Step Scaling requires you to define specific metric thresholds and corresponding capacity adjustments (e.g., add 2 instances when CPU > 70%). Target Tracking is simpler and better for maintaining a steady metric; Step Scaling offers more granular control for different load levels.
Yes, you can use a custom metric by specifying a CustomizedMetricSpecification in the policy. You must provide the metric name, namespace, statistic, and any dimensions. The metric should be a utilization-style metric that falls as capacity is added, ideally published to CloudWatch every minute.
If the CloudWatch alarm enters the INSUFFICIENT_DATA state because the metric has no data points, Target Tracking does not perform any scaling actions. It will not scale out or in until the metric becomes available again. You should set up a separate alarm to handle missing data if needed.
Yes, you can have multiple Target Tracking policies, but each must use a different metric. For example, one policy for CPU and another for network in. If multiple policies trigger simultaneously, Auto Scaling uses the one that requires the largest capacity.
If an instance has scale-in protection enabled, it will not be terminated during a scale-in activity. The ASG terminates unprotected instances first. If all instances are protected, the desired capacity is still lowered, but no instance is terminated until protection is removed, so the group keeps running more instances than desired.
The default cooldown period is 300 seconds (5 minutes). It is set on the Auto Scaling group and governs simple scaling policies: after a scaling activity, the group waits this long before another simple scaling action can start. Target Tracking and Step Scaling policies use instance warmup instead.
Yes, Target Tracking works with ASGs that use spot instances. The policy will launch spot instances during scale-out. However, if spot instances are interrupted, the ASG will try to maintain capacity, and Target Tracking may trigger scale-out to compensate.