This chapter covers Auto Scaling Group (ASG) scaling policies beyond simple dynamic scaling, specifically scheduled scaling and predictive scaling. These strategies are essential for handling predictable traffic patterns and for cost optimization in production environments. ASG scaling mechanisms appear regularly on the SAA-C03 exam, and the nuances of scheduled versus predictive scaling are frequently tested. We will explore their mechanisms, configuration, and when to use each.
Scheduled scaling is like a commuter train that runs on a fixed timetable. The train company knows that at 8:00 AM every weekday, there will be a surge of passengers, so they schedule an extra train at 7:55 AM. The schedule is based on historical patterns and does not change if today's demand is different. The train leaves at the scheduled time regardless of how many passengers are waiting. Predictive scaling, on the other hand, is like a smart shuttle service that uses machine learning to forecast demand. It analyzes past ridership, weather data, and local events to predict that tomorrow at 8:00 AM there will be 150 passengers, so it dispatches two shuttles at 7:50 AM. If the prediction is wrong, the service can adjust dynamically. The key difference: scheduled scaling follows a static calendar, while predictive scaling uses algorithms to forecast and proactively adjust capacity based on learned patterns.
What Are Scheduled and Predictive Scaling?
Amazon EC2 Auto Scaling provides several ways to adjust capacity: manual, dynamic (simple, step, target tracking), scheduled, and predictive. Scheduled scaling is a time-based policy that changes the desired capacity at specific times. It is ideal for known, repetitive patterns — for example, scaling up at 9 AM Monday and scaling down at 6 PM Friday. Predictive scaling uses machine learning to forecast future traffic and proactively adjust capacity before demand spikes. It learns from historical CloudWatch metrics and can scale out ahead of predicted increases.
Why They Exist
Dynamic scaling reacts to real-time metrics, which can lag behind sudden spikes. For predictable patterns, reacting is inefficient — you pay for capacity only after the load arrives, but you may experience short-term performance issues. Scheduled scaling pre-provisions capacity exactly when needed, eliminating lag. Predictive scaling goes further by anticipating demand using ML, reducing the need for manual schedules and handling patterns that are not strictly periodic.
How Scheduled Scaling Works
A scheduled action consists of a schedule (a recurring cron expression or a one-time start time), new values for minimum, maximum, and/or desired capacity, and an optional time zone. When the scheduled time arrives, the ASG updates the group's size settings, and instances launch or terminate to match. A scheduled action does not wait for a simple-scaling cooldown to expire, though new instances still have to pass health checks before they count as in service.
Key components:
- Recurrence: Using cron syntax, e.g., 0 9 * * 1-5 for 9 AM weekdays. One-time schedules use a specific UTC timestamp.
- Capacity: You specify new values for Min, Max, and/or DesiredCapacity. If you only set DesiredCapacity, Min and Max remain unchanged unless overridden.
- Time zone: Defaults to UTC. You can set to any valid time zone (e.g., America/New_York).
- End time: For recurring schedules, you can set an end time after which the action stops recurring.
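To make the recurrence field concrete, here is a minimal sketch of how a five-field cron expression such as `0 9 * * 1-5` can be evaluated. The helper names `parse_field` and `next_run` are hypothetical, the datetimes are treated as already being in the action's time zone, and the matcher is deliberately simplified (no step values like `*/5`, and day-of-month and day-of-week must both match). It is an illustration, not the service's scheduler.

```python
from datetime import datetime, timedelta


def parse_field(field, lo, hi):
    """Expand one cron field ('*', '5', '1-5', '1,3,5') into a set of ints."""
    values = set()
    for part in field.split(","):
        if part == "*":
            values.update(range(lo, hi + 1))
        elif "-" in part:
            start, end = part.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(part))
    return values


def next_run(recurrence, after):
    """Return the first whole minute strictly after `after` that matches."""
    minute, hour, dom, month, dow = recurrence.split()
    minutes = parse_field(minute, 0, 59)
    hours = parse_field(hour, 0, 23)
    days = parse_field(dom, 1, 31)
    months = parse_field(month, 1, 12)
    weekdays = parse_field(dow, 0, 6)  # cron convention: 0 = Sunday

    candidate = after.replace(second=0, microsecond=0) + timedelta(minutes=1)
    for _ in range(366 * 24 * 60):  # search at most one year ahead
        cron_weekday = candidate.isoweekday() % 7  # Sunday 7 -> 0
        if (candidate.minute in minutes and candidate.hour in hours
                and candidate.day in days and candidate.month in months
                and cron_weekday in weekdays):
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError("no matching time within one year")


# Friday 14 March 2025, 10:00: the next weekday 9 AM slot is Monday 17 March.
print(next_run("0 9 * * 1-5", datetime(2025, 3, 14, 10, 0)))
```

Walking minute by minute is slow but easy to verify by hand, which is the point when you are checking what a cron expression on the exam actually does.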
Example CLI command:
aws autoscaling put-scheduled-update-group-action \
--auto-scaling-group-name my-asg \
--scheduled-action-name scale-up-morning \
--recurrence "0 9 * * 1-5" \
--time-zone "America/New_York" \
--desired-capacity 10
How Predictive Scaling Works
Predictive scaling uses Auto Scaling's built-in ML model to analyze up to 14 days of historical metric data (default is 7 days). It forecasts future load and generates scaling recommendations. The policy can be configured in two modes: forecast only (dry run) or forecast and scale (active). In forecast-and-scale mode, ASG automatically adjusts capacity based on the forecast, up to a predefined maximum capacity.
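The arithmetic behind a forecast-driven capacity decision can be sketched in a few lines. This is a simplified model under stated assumptions: the forecast is expressed as total load across the group (for example, summed CPU percent), the target is the load one instance should carry, and the function names `capacity_for_forecast` and `launch_time` are hypothetical. The service's actual model is internal and is not reproduced here.

```python
import math
from datetime import datetime, timedelta


def capacity_for_forecast(forecast_load, target_per_instance, min_size, max_size):
    """Instances needed to carry the forecast load, clamped to the group limits."""
    needed = math.ceil(forecast_load / target_per_instance)
    return max(min_size, min(needed, max_size))


def launch_time(forecast_start, buffer_seconds):
    """Launch ahead of the forecast period by the configured buffer."""
    return forecast_start - timedelta(seconds=buffer_seconds)


# A forecast total of 400 CPU-percent with a 50% per-instance target -> 8 instances.
print(capacity_for_forecast(400, 50, min_size=2, max_size=20))
# A 300-second buffer moves an 08:00 launch to 07:55.
print(launch_time(datetime(2025, 3, 17, 8, 0), 300))
```

The clamp to `max_size` mirrors the forecast being capped at the group's maximum when the prediction calls for more.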
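Key concepts:
- Metric: Predictive scaling needs a load metric (total demand on the group, such as total CPU utilization or total request count) and a scaling metric (per-instance utilization, such as average CPU or request count per target). Predefined metric pairs supply both. The metric must have at least 24 hours of history.
- Forecast horizon: Up to 48 hours ahead.
- Policy attachment: A predictive scaling policy is attached directly to the Auto Scaling group with put-scaling-policy and can run alongside target tracking policies. The policy defines the metric, target value, and mode.
- Max capacity breach: By default (MaxCapacityBreachBehavior=HonorMaxCapacity), predictive scaling does not exceed the group's max capacity. If the forecast requires more, it scales to max only. Setting IncreaseMaxCapacity lets the forecast raise the maximum, by up to the MaxCapacityBuffer percentage above the forecast.
- Buffer: You can configure a buffer time (SchedulingBufferTime, e.g., 300 seconds) so that instances launch ahead of the forecast and have time to warm up.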
CLI example:
aws autoscaling put-scaling-policy \
--auto-scaling-group-name my-asg \
--policy-name predictive-policy \
--policy-type PredictiveScaling \
--predictive-scaling-configuration \
"MetricSpecifications=[{TargetValue=50,PredefinedMetricPairSpecification={PredefinedMetricType=ASGCPUUtilization}}],Mode=ForecastAndScale,SchedulingBufferTime=300"
Interaction with Other Scaling Policies
Multiple scaling policies can coexist. The final desired capacity is the maximum of all active policies. For example, a scheduled policy may set desired capacity to 5, while a target tracking policy may also be active. If target tracking demands 8, the actual desired capacity becomes 8. Predictive scaling works similarly — it sets a desired capacity based on forecast, which can be overridden by other policies if they demand higher capacity.
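The "highest demand wins" rule can be written down as a small function. This is a sketch of the rule as stated in this chapter, with a hypothetical function name and a plain dictionary standing in for the active policies; it also clamps the result to the group's min and max, which always bound the desired capacity.

```python
def effective_desired_capacity(policy_demands, min_size, max_size):
    """Pick the policy asking for the most capacity, then clamp to group limits."""
    if not policy_demands:
        raise ValueError("at least one policy demand is required")
    winner = max(policy_demands, key=policy_demands.get)
    capacity = max(min_size, min(policy_demands[winner], max_size))
    return winner, capacity


# Scheduled asks for 5, target tracking asks for 8 -> target tracking wins with 8.
print(effective_desired_capacity({"scheduled": 5, "target_tracking": 8}, 1, 20))
```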
Default Values and Timers
Cooldown period: Default 300 seconds, and it applies to simple scaling only; step scaling, target tracking, and predictive scaling do not use it (they rely on instance warm-up and their own mechanisms).
Instance warm-up: Default 300 seconds; the time before a new instance starts contributing to metrics.
Predictive scaling max capacity: Uses the group's own maximum size. The default breach behavior, HonorMaxCapacity, keeps forecast-driven capacity at or below that maximum.
Forecast history: Up to 14 days of metric data are used to build the forecast.
Verification Commands
To list scheduled actions:
aws autoscaling describe-scheduled-actions --auto-scaling-group-name my-asg
To describe scaling policies (including predictive):
aws autoscaling describe-policies --auto-scaling-group-name my-asg
To view forecast data (a start time and end time are required; the dates here are examples):
aws autoscaling get-predictive-scaling-forecast --auto-scaling-group-name my-asg --policy-name predictive-policy --start-time "2025-03-15T00:00:00Z" --end-time "2025-03-17T00:00:00Z"
Common Pitfalls
Scheduled scaling time zone: Forgetting to set time zone leads to UTC, causing scale times off by hours.
Cron expression: Using wrong syntax (e.g., * * * * * for every minute) can cause unexpected scaling.
Predictive scaling metric: Choosing a metric that is not correlated with load (e.g., memory utilization for a stateless app) yields poor forecasts.
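Max capacity: If the group's max capacity is too low, predictive scaling cannot launch the capacity the forecast calls for; if IncreaseMaxCapacity is enabled without a sensible buffer, it may scale beyond budget.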
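The first two pitfalls are easy to catch before an action is ever created. The following is a minimal pre-flight check, assuming a hypothetical helper `lint_scheduled_action` that only inspects the strings you intend to pass; it does not call AWS and does not validate time zone names.

```python
def lint_scheduled_action(recurrence, time_zone=None):
    """Return a list of warnings for common scheduled-action mistakes."""
    warnings = []
    fields = recurrence.split()
    if len(fields) != 5:
        warnings.append(
            "recurrence must have 5 fields: "
            "minute hour day-of-month month day-of-week"
        )
        return warnings
    if fields[0] == "*":
        warnings.append("minute field is '*': matches every minute of the selected hours")
    if time_zone is None:
        warnings.append("no time zone set: the schedule is evaluated in UTC")
    return warnings


print(lint_scheduled_action("0 9 * * 1-5", "America/New_York"))  # no warnings
print(lint_scheduled_action("* * * * *"))                        # two warnings
```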
Exam Relevance
On the SAA-C03 exam, you may be asked to choose between scheduled and predictive scaling based on a scenario. Key differentiators: scheduled is for fixed, known patterns (e.g., business hours); predictive is for patterns that repeat but vary in magnitude (e.g., website traffic that follows daily cycles but with different peaks). The exam also tests the combination of scaling policies and the fact that the highest desired capacity wins.
Identify Predictable Pattern
Analyze historical CloudWatch metrics to determine if traffic follows a recurring schedule. For example, an e-commerce site sees spikes every weekday at 10 AM. If the pattern is consistent in timing and magnitude, scheduled scaling is appropriate. If the pattern is consistent in timing but varies in magnitude (e.g., peak traffic between 10 AM and 12 PM but the exact load changes daily), predictive scaling is better. This step involves reviewing at least 2 weeks of data to identify the cycle.
Choose Scaling Strategy
Based on the pattern analysis, decide which policy type to implement. For fixed schedules (e.g., scale up at 8 AM every Monday), use scheduled scaling. For variable but predictable patterns (e.g., traffic follows a daily cycle but peak volume differs), use predictive scaling. If the pattern is unpredictable, use dynamic scaling (target tracking). The exam often tests this decision point: scheduled for known fixed times, predictive for learned patterns.
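The decision rule in this step can be condensed into a short function, which is a handy way to rehearse it. The function name, parameters, and return strings are hypothetical labels for the three outcomes described above; the 24-hour threshold is the metric-history minimum stated in this chapter.

```python
def choose_strategy(timing_predictable, magnitude_fixed, history_hours):
    """Map workload traits to the scaling strategy this chapter recommends."""
    if not timing_predictable:
        return "dynamic (target tracking)"
    if magnitude_fixed:
        return "scheduled"
    if history_hours >= 24:
        return "predictive"
    # Variable magnitude but not enough history for a forecast yet.
    return "scheduled or manual until 24 hours of history exist"


print(choose_strategy(True, True, 0))       # scheduled
print(choose_strategy(True, False, 336))    # predictive
print(choose_strategy(False, False, 336))   # dynamic (target tracking)
```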
Configure Scheduled Scaling
Create a scheduled action using the AWS CLI, console, or SDK. Specify recurrence using cron syntax, time zone, and the new capacity values. For example, to scale to 10 instances at 9 AM weekdays Eastern time: `0 9 * * 1-5` with time zone `America/New_York`. The action is stored and executed by the Auto Scaling service. Verify with `describe-scheduled-actions`.
Configure Predictive Scaling
Create a predictive scaling policy and attach it to the ASG with `put-scaling-policy`. First, ensure the ASG has at least 24 hours of metric history. Define the metric specification (e.g., CPU utilization target of 50%). Choose mode: `ForecastOnly` to test, or `ForecastAndScale` to activate. Confirm that the group's max capacity leaves room for the forecast, and set an optional buffer time. The ML model trains on historical data and generates forecasts. Use `get-predictive-scaling-forecast` to view predictions.
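Because the policy configuration is nested, it is often easier to generate it as JSON than to type it as CLI shorthand. The sketch below builds such a document using the field names from the predictive scaling configuration structure; the function name and its defaults are assumptions chosen for illustration, and starting in `ForecastOnly` mode is simply the cautious choice this step describes.

```python
import json


def predictive_scaling_config(target_value, metric_pair="ASGCPUUtilization",
                              mode="ForecastOnly", buffer_seconds=300):
    """Build a predictive scaling configuration as a plain dictionary."""
    if mode not in ("ForecastOnly", "ForecastAndScale"):
        raise ValueError("mode must be ForecastOnly or ForecastAndScale")
    return {
        "MetricSpecifications": [
            {
                "TargetValue": target_value,
                "PredefinedMetricPairSpecification": {
                    "PredefinedMetricType": metric_pair
                },
            }
        ],
        "Mode": mode,
        "SchedulingBufferTime": buffer_seconds,
        "MaxCapacityBreachBehavior": "HonorMaxCapacity",
    }


config = predictive_scaling_config(50.0)
print(json.dumps(config, indent=2))
```

Saved to a file such as `config.json` (a hypothetical name), the output could be supplied to the CLI as `--predictive-scaling-configuration file://config.json`.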
Monitor and Adjust
After deployment, monitor scaling activity in the ASG's activity history and in CloudWatch; CloudTrail records the underlying API calls. For scheduled scaling, verify that actions occur at the correct times. For predictive scaling, check the forecast accuracy by comparing predicted vs actual metrics. Adjust the target value or metric if needed. The exam may present a scenario where predictive scaling is under-provisioning, requiring a lower target value or different metric.
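Comparing predicted and actual values can be as simple as a mean absolute percentage error plus a count of the points where the forecast came in low. This is one reasonable way to score a forecast, not a metric the service itself reports; the function name and the sample numbers are made up for illustration, and actual values are assumed to be greater than zero.

```python
def forecast_report(predicted, actual):
    """Score a forecast: mean absolute percentage error and under-forecast count."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and equal length")
    errors = [abs(p - a) / a for p, a in zip(predicted, actual)]
    under = sum(1 for p, a in zip(predicted, actual) if p < a)
    return {"mape": sum(errors) / len(errors), "under_forecast_points": under}


# Illustrative hourly load values only.
report = forecast_report(predicted=[90, 220], actual=[100, 200])
print(report)
```

A persistently high under-forecast count is the signal that matches the under-provisioning scenario above, where a lower target value or a different metric is the usual remedy.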
Enterprise Scenario 1: E-commerce Flash Sales
A large retailer runs flash sales every Friday at 3 PM. Traffic spikes are massive but short-lived. The operations team uses scheduled scaling to increase desired capacity from 50 to 500 at 2:45 PM, giving instances time to warm up. At 4 PM, a second scheduled action scales down to 50. This ensures capacity is ready before the spike and cost is minimized after. A common mistake is setting the scale-up time exactly at 3 PM — by then, the spike has already started, causing latency. The fix: schedule 15 minutes early to account for instance launch time.
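The "schedule early" fix amounts to subtracting a lead time from the event time before writing the cron expression. A minimal sketch, assuming a hypothetical helper that handles same-day lead times only and takes the cron day-of-week number directly (5 = Friday):

```python
from datetime import datetime, timedelta


def scale_up_recurrence(event_hour, event_minute, lead_minutes, cron_day_of_week):
    """Cron expression for an action that fires `lead_minutes` before the event."""
    event = datetime(2000, 1, 1, event_hour, event_minute)  # date is a placeholder
    start = event - timedelta(minutes=lead_minutes)
    if start.day != event.day:
        raise ValueError("lead time crosses midnight; adjust the day-of-week by hand")
    return f"{start.minute} {start.hour} * * {cron_day_of_week}"


# Friday 3 PM sale with a 15-minute lead -> scale up at 2:45 PM.
print(scale_up_recurrence(15, 0, 15, 5))
# Scale back down at 4 PM with no lead.
print(scale_up_recurrence(16, 0, 0, 5))
```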
Enterprise Scenario 2: News Website with Variable Traffic
A news website experiences daily traffic peaks but the exact volume depends on breaking news. Scheduled scaling is too rigid; dynamic scaling reacts too slowly. The team implements predictive scaling using request count per target as the metric. The ML model learns that traffic typically rises between 6 AM and 9 AM, but the magnitude varies. Predictive scaling pre-launches instances based on forecast, reducing latency by 40% compared to dynamic scaling. The team reviews forecast accuracy weekly; the service maintains and refreshes the model itself, so there is nothing to retrain by hand.
Enterprise Scenario 3: Batch Processing at Midnight
A financial services company runs batch jobs every night at midnight. The batch workload is constant: exactly 20 EC2 instances are needed. They use scheduled scaling to set desired capacity to 20 at 11:55 PM and scale back to 2 at 6 AM. This is a textbook case for scheduled scaling because the pattern is fixed and known. Misconfiguration occurs when the cron expression is evaluated in UTC instead of local time. For a team on US Eastern time, for example, an 11:55 PM UTC scale-up fires at 6:55 PM or 7:55 PM local, hours before the batch starts, and the 6 AM UTC scale-in fires at 1 AM or 2 AM local, while the batch is still running. Always set the time zone explicitly.
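The UTC misconfiguration is easy to check with a quick conversion. The sketch below uses fixed offsets as stand-ins for US Eastern standard and daylight time (an assumption for illustration; the scenario does not say where the company is located), and the calendar date inside the helper is arbitrary because only the clock time matters.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset stand-ins for US Eastern time (illustrative assumption).
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def local_clock_time(utc_hour, utc_minute, local_tz):
    """Local wall-clock time at which a UTC-scheduled action actually fires."""
    fired = datetime(2025, 1, 15, utc_hour, utc_minute, tzinfo=timezone.utc)
    return fired.astimezone(local_tz).strftime("%H:%M")


print(local_clock_time(23, 55, EST))  # scale-up intended for 23:55 local
print(local_clock_time(23, 55, EDT))  # same action after the clocks change
print(local_clock_time(6, 0, EST))    # scale-in intended for 06:00 local
```

The same UTC schedule lands at two different local times across the year, which is why a named time zone on the action is the safer setting.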
Performance Considerations
Scheduled scaling: The number of scheduled actions per ASG is capped by a service quota (125 by default), and each action must have a unique name. Auto Scaling rejects a scheduled action whose start time matches an existing action in the same group, so give competing actions distinct times rather than relying on processing order.
Predictive scaling: Requires continuous metric data — gaps cause degraded forecasts. The ML model updates every 24 hours. For very spiky patterns, combine with target tracking as a safety net.
Cost: Scheduled scaling can lead to over-provisioning if the actual load is lower than expected. Predictive scaling reduces over-provisioning but incurs no additional AWS cost for the ML model.
SAA-C03 Exam Focus on Scheduled and Predictive Scaling
The SAA-C03 exam tests your ability to select the right scaling strategy based on workload characteristics. Domain 2 (Design Resilient Architectures), Task Statement 2.1, covers designing scalable and loosely coupled architectures. You must differentiate between scheduled, predictive, and dynamic scaling.
Common Wrong Answers
Choosing dynamic scaling when the pattern is predictable: Candidates often pick target tracking because it is 'automatic'. However, if the question describes a known schedule (e.g., 'every weekday at 9 AM'), scheduled scaling is more appropriate because it pre-provisions capacity, avoiding the lag of dynamic scaling.
Choosing scheduled scaling when the pattern varies: If the question says 'traffic peaks daily but the exact load varies', scheduled scaling is wrong because the required capacity changes each day. Predictive scaling is correct.
Assuming predictive scaling requires no metric history: The exam may state 'no historical data available'. Predictive scaling needs at least 24 hours of metric data. Without it, use scheduled scaling or manual.
Confusing predictive scaling with simple scaling: Predictive scaling uses ML; simple scaling uses a fixed threshold. The exam tests the difference in how they set capacity.
Specific Numbers and Terms
Cron syntax: 0 9 * * 1-5 (minute hour day-of-month month day-of-week). The exam may give a cron expression and ask what it does.
Default cooldown: 300 seconds. Not relevant for predictive scaling.
Predictive scaling modes: ForecastOnly and ForecastAndScale.
Metric requirement: At least 24 hours of historical data.
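Max capacity: Honored by default. Predictive scaling scales only up to the group's max capacity unless MaxCapacityBreachBehavior is set to IncreaseMaxCapacity.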
Edge Cases
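Multiple scheduled actions at the same time: Each scheduled action in a group needs a unique time. Auto Scaling rejects a new action whose scheduled start time matches an existing one. The exam may ask about conflict resolution.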
Predictive scaling with no forecast: If the ML model cannot generate a forecast (e.g., insufficient data), the policy does not scale.
Combining policies: The final desired capacity is the maximum of all active policies. If scheduled scaling sets 10, and predictive scaling sets 15, the result is 15.
Elimination Strategy
When you see a question about scaling, first identify if the pattern is known and fixed (scheduled), learned and variable (predictive), or reactive (dynamic). Eliminate options that don't match the pattern. Then check for specific requirements: metric history, cooldown, etc. Always read the scenario for time-related clues like 'every day at 10 AM' or 'traffic follows a daily pattern but varies'.
Scheduled scaling is for fixed, known times (e.g., scale up at 9 AM weekdays).
Predictive scaling uses ML to forecast and proactively scale based on historical metrics.
Predictive scaling requires at least 24 hours of metric data; default training uses 7 days.
Scheduled scaling uses cron expressions; always set the time zone to avoid UTC confusion.
When multiple scaling policies are active, the highest desired capacity wins.
Predictive scaling can be set to 'ForecastOnly' for dry-run testing.
By default, predictive scaling does not exceed the ASG's max capacity.
These come up on the exam all the time. Here's how to tell them apart.
Scheduled Scaling
Based on fixed cron schedules (e.g., every weekday at 9 AM).
Capacity changes are predetermined and static.
No historical data required; you specify exact times.
Best for known, consistent patterns (e.g., batch jobs).
No ML or forecasting involved; purely time-based.
Predictive Scaling
Uses ML to forecast future demand based on historical metrics.
Capacity changes are dynamic based on forecasted load.
Requires at least 24 hours of metric history.
Best for patterns that repeat but vary in magnitude (e.g., website traffic).
Uses ML models that update every 24 hours.
Mistake
Scheduled scaling can only be set for one-time events.
Correct
Scheduled scaling supports recurring schedules using cron expressions, as well as one-time events. Recurring schedules are common for business-hour patterns.
Mistake
Predictive scaling requires at least 14 days of metric history.
Correct
The default is 7 days, but you can use up to 14 days. The minimum requirement is 24 hours of historical data for the ML model to train.
Mistake
Scheduled scaling and predictive scaling cannot be used together.
Correct
They can coexist. The actual desired capacity is the maximum of all active policies. For example, a scheduled action may set a floor, while predictive scaling adds additional capacity based on forecast.
Mistake
Predictive scaling instantly responds to real-time spikes.
Correct
Predictive scaling is proactive, not reactive. It forecasts based on historical data and scales ahead of predicted demand. For real-time spikes, you need dynamic (target tracking) scaling.
Mistake
Scheduled scaling automatically adjusts for time zone changes like daylight saving.
Correct
Scheduled scaling evaluates the cron expression in the time zone you specify. If you set a zone that observes DST, such as America/New_York, a 9 AM schedule stays at 9 AM local time all year. If you leave the default of UTC, the action fires at a fixed UTC time, so it shifts by one hour in local time whenever the clocks change.
Yes. You can have both scheduled and target tracking policies on the same ASG. The actual desired capacity is the maximum of all active policies. For example, a scheduled action may set a minimum of 5 instances during business hours, while target tracking may scale up to 10 if CPU exceeds 50%. This combination ensures a baseline capacity and allows elastic scaling above that.
Predictive scaling is proactive: it forecasts future load and scales ahead of time. Target tracking is reactive: it adjusts capacity based on current metric values to maintain a target. Predictive scaling uses ML and requires historical data; target tracking uses a simple averaging algorithm. Use predictive for predictable patterns, target tracking for unpredictable spikes.
Use the `start-time` parameter instead of `recurrence`. For example: `aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg --scheduled-action-name one-time-scale-up --start-time "2025-03-15T14:00:00Z" --desired-capacity 10`. The action executes once at the specified UTC time and then is removed.
You can use predefined metric pairs (ASGCPUUtilization, ASGNetworkIn, ASGNetworkOut, ALBRequestCount), or specify the load and scaling metrics separately. The predefined load metrics are ASGTotalCPUUtilization, ASGTotalNetworkIn, ASGTotalNetworkOut, and ALBTargetGroupRequestCount. You can also use custom metrics by specifying a custom metric specification. The load metric must correlate with the number of instances needed.
Common reasons: (1) The time zone is set to UTC but you expected local time. (2) The cron expression is incorrect (e.g., using `*` for day-of-week when you meant a specific day). (3) The scheduled action was deleted or overwritten. (4) The ASG is in an unhealthy state or has suspended processes. Check with `describe-scheduled-actions` and verify the recurrence and time zone.
Not directly. Predictive scaling scales out ahead of predicted increases, and its forecast capacity drops as predicted demand falls, but it does not itself remove instances when actual capacity is above the forecast. Scale-in is left to dynamic scaling (for example, target tracking) or to scheduled actions, so pair predictive scaling with one of them to reduce cost during low-demand periods.
If the forecast underestimates demand, the ASG may be under-provisioned until dynamic scaling (if also configured) reacts. To mitigate, you can combine predictive scaling with a target tracking policy as a safety net. If the forecast overestimates, you may over-provision instances, increasing cost. You can adjust the target value or switch to forecast-only mode to fine-tune.