
ASG Scheduled and Predictive Scaling

Dynamic scaling reacts to load after it arrives. Scheduled and predictive scaling let an Auto Scaling group (ASG) add capacity ahead of predictable traffic, which matters for both performance and cost optimization in production environments. ASG scaling mechanisms come up regularly on the SAA-C03 exam, and the differences between scheduled and predictive scaling are frequently tested. This chapter covers how each mechanism works, how to configure it, and when to use it.

25 min read

Intermediate

Updated Jul 20, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security


Scheduled vs Predictive Scaling: Like a Commuter Train

Scheduled scaling is like a commuter train that runs on a fixed timetable. The train company knows that at 8:00 AM every weekday there will be a surge of passengers, so it schedules an extra train at 7:55 AM. The schedule is based on historical patterns and does not change if today's demand is different. The train leaves at the scheduled time regardless of how many passengers are waiting.

Predictive scaling, on the other hand, is like a smart shuttle service that uses machine learning to forecast demand. It analyzes past ridership to predict that tomorrow at 8:00 AM there will be 150 passengers, so it dispatches two shuttles at 7:50 AM. If the prediction is too low, the forecast itself does not react; a separate reactive mechanism (dynamic scaling) has to cover the gap.

The key difference: scheduled scaling follows a static calendar, while predictive scaling uses algorithms to forecast and proactively adjust capacity based on learned patterns.

How It Actually Works

What Are Scheduled and Predictive Scaling?

Amazon EC2 Auto Scaling provides several ways to adjust capacity: manual, dynamic (simple, step, target tracking), scheduled, and predictive. Scheduled scaling is a time-based policy that changes the desired capacity at specific times. It is ideal for known, repetitive patterns — for example, scaling up at 9 AM Monday and scaling down at 6 PM Friday. Predictive scaling uses machine learning to forecast future traffic and proactively adjust capacity before demand spikes. It learns from historical CloudWatch metrics and can scale out ahead of predicted increases.

Why They Exist

Dynamic scaling reacts to real-time metrics, which can lag behind sudden spikes. For predictable patterns, reacting is inefficient — you pay for capacity only after the load arrives, but you may experience short-term performance issues. Scheduled scaling pre-provisions capacity exactly when needed, eliminating lag. Predictive scaling goes further by anticipating demand using ML, reducing the need for manual schedules and handling patterns that are not strictly periodic.

How Scheduled Scaling Works

A scheduled action consists of a schedule (a recurring cron expression or a one-time start time), new values for minimum, maximum, and/or desired capacity, and an optional time zone. When the scheduled time arrives, the ASG applies the new values and then launches or terminates instances to match. New instances still need time to boot and pass health checks before they serve traffic, so schedule the action ahead of the expected load.

Key components:
- Recurrence: Using cron syntax, e.g., 0 9 * * 1-5 for 9 AM weekdays. One-time schedules use a specific UTC timestamp as the start time.
- Capacity: You specify new values for Min, Max, and/or DesiredCapacity. If you only set DesiredCapacity, Min and Max remain unchanged.
- Time zone: Defaults to UTC. You can set any valid time zone (e.g., America/New_York).
- End time: For recurring schedules, you can set an end time after which the action no longer repeats.

Example CLI command:

aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name my-asg \
  --scheduled-action-name scale-up-morning \
  --recurrence "0 9 * * 1-5" \
  --time-zone "America/New_York" \
  --desired-capacity 10
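To make the recurrence and time zone fields concrete, here is a minimal Python sketch of how a five-field cron expression could be checked against a moment in time. It is an illustration only, not how the Auto Scaling service is implemented: the helper names `field_matches` and `cron_fires_at` are hypothetical, and the matcher handles only `*`, single values, ranges, and comma lists.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def field_matches(field: str, value: int) -> bool:
    """Match one cron field; supports '*', 'a', 'a-b', and comma lists."""
    if field == "*":
        return True
    for part in field.split(","):
        if "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_fires_at(expr: str, instant_utc: datetime, tz_name: str = "UTC") -> bool:
    """Return True if the expression matches the instant when evaluated
    in tz_name (UTC when omitted, mirroring the default time zone)."""
    minute, hour, day_of_month, month, day_of_week = expr.split()
    local = instant_utc.astimezone(ZoneInfo(tz_name))
    cron_dow = (local.weekday() + 1) % 7  # Python: Monday=0; cron: Sunday=0
    return (
        field_matches(minute, local.minute)
        and field_matches(hour, local.hour)
        and field_matches(day_of_month, local.day)
        and field_matches(month, local.month)
        and field_matches(day_of_week, cron_dow)
    )


# Monday 2026-07-20 at 13:00 UTC is 9:00 AM in New York (daylight time).
monday_13_utc = datetime(2026, 7, 20, 13, 0, tzinfo=timezone.utc)
print(cron_fires_at("0 9 * * 1-5", monday_13_utc, "America/New_York"))
print(cron_fires_at("0 9 * * 1-5", monday_13_utc))  # evaluated in UTC instead
```

The same expression matches or misses the same instant depending on the time zone it is evaluated in, which is why the `--time-zone` flag in the command above matters.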

How Predictive Scaling Works

Predictive scaling uses Auto Scaling's built-in ML model to analyze up to 14 days of historical metric data and forecast load for the next 48 hours. The policy can be configured in two modes: forecast only (dry run) or forecast and scale (active). In forecast-and-scale mode, the ASG automatically adjusts capacity based on the forecast, by default up to the group's maximum capacity.

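Key concepts:
- Metric: Predictive scaling needs a load metric (total demand on the group, such as total CPU utilization or total request count) paired with a scaling metric (per-instance utilization, such as average CPU utilization or request count per target). The metric must have at least 24 hours of history.
- Forecast horizon: Up to 48 hours ahead.
- Policy: A predictive scaling policy is attached directly to the ASG with policy type PredictiveScaling. It defines the metric, target value, and mode, and it can run alongside target tracking policies.
- Max capacity breach: By default (MaxCapacityBreachBehavior set to HonorMaxCapacity), predictive scaling does not exceed the group's max capacity; if the forecast requires more, it scales to max only. Setting IncreaseMaxCapacity lets it go above the maximum, bounded by MaxCapacityBuffer.
- Buffer: You can configure a buffer time (SchedulingBufferTime, e.g., 300 seconds) so that instances launch ahead of the forecast and have time to warm up.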

CLI example:

aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-asg \
  --policy-name predictive-policy \
  --policy-type PredictiveScaling \
  --predictive-scaling-configuration '{
    "MetricSpecifications": [{"TargetValue": 50, "PredefinedMetricPairSpecification": {"PredefinedMetricType": "ASGCPUUtilization"}}],
    "Mode": "ForecastAndScale",
    "SchedulingBufferTime": 300
  }'
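As a rough mental model of how a forecast turns into instances, the sketch below divides a forecasted total load by the per-instance target value and keeps the result within the group's limits. This is a simplified assumption for study purposes, not the service's actual algorithm; the function name `forecast_capacity` and its parameters are hypothetical, and only the behaviour of capping at the group maximum is modelled.

```python
import math


def forecast_capacity(forecast_load: float, target_value: float,
                      min_size: int, max_size: int) -> int:
    """Instances needed so that each one runs at roughly target_value,
    kept within the group's minimum and maximum size."""
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    needed = math.ceil(forecast_load / target_value)
    return max(min_size, min(needed, max_size))


# Forecasted total CPU load of 1000 (sum across instances), target of 50 each.
print(forecast_capacity(1000, 50, min_size=2, max_size=30))
# A forecast that needs more than the maximum is capped at the maximum.
print(forecast_capacity(2000, 50, min_size=2, max_size=30))
```

Lowering the target value in this model raises the instance count for the same forecast, which is why a lower target is the usual remedy for under-provisioning.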

Interaction with Other Scaling Policies

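Multiple scaling policies can coexist. When several policies are in effect at the same time, the ASG follows the one that calls for the highest capacity, within the group's minimum and maximum. For example, a scheduled action may set desired capacity to 5 while a target tracking policy is also active. If target tracking then demands 8, the desired capacity becomes 8. Note that a scheduled action is a point-in-time update rather than a continuously evaluated policy: after it runs, dynamic policies can still move desired capacity up or down, so raise the minimum size in the scheduled action if you need a guaranteed floor. Predictive scaling works similarly: it sets a desired capacity based on the forecast, which other policies can exceed if they demand more.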
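The "highest capacity wins" rule can be sketched in a few lines. This is a simplified model for exam reasoning rather than the service's internal logic; `effective_desired_capacity` and the policy labels in the dictionary are hypothetical names.

```python
def effective_desired_capacity(requested: dict[str, int],
                               min_size: int, max_size: int) -> int:
    """Pick the largest capacity any mechanism asks for, then keep it
    within the group's minimum and maximum size."""
    if not requested:
        return min_size
    highest = max(requested.values())
    return max(min_size, min(highest, max_size))


# Scheduled action asked for 5, target tracking asks for 8.
print(effective_desired_capacity({"scheduled": 5, "target_tracking": 8}, 2, 20))
# Scheduled action asked for 10, predictive forecast asks for 15.
print(effective_desired_capacity({"scheduled": 10, "predictive": 15}, 2, 20))
```

The clamp at the end reflects that no combination of policies can push the group outside its configured minimum and maximum in this model.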

Default Values and Timers

Cooldown period: Default 300 seconds; applies to simple scaling policies. Step scaling, target tracking, and predictive scaling do not use it (step scaling and target tracking rely on instance warm-up instead).

Instance warm-up: Default 300 seconds; the time before a new instance starts contributing to metrics.

Predictive scaling max capacity: Predictive scaling uses the group's own maximum size, which every ASG defines. By default (HonorMaxCapacity) it does not scale beyond that value.

Forecast input: Up to the previous 14 days of metric data are used to generate the forecast.

Verification Commands

To list scheduled actions:

aws autoscaling describe-scheduled-actions --auto-scaling-group-name my-asg

To describe scaling policies (including predictive):

aws autoscaling describe-policies --auto-scaling-group-name my-asg

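To view forecast data (the start and end times are required; the timestamps here are example values):

aws autoscaling get-predictive-scaling-forecast \
  --auto-scaling-group-name my-asg \
  --policy-name predictive-policy \
  --start-time 2026-07-20T00:00:00Z \
  --end-time 2026-07-22T00:00:00Z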

Common Pitfalls

Scheduled scaling time zone: Forgetting to set time zone leads to UTC, causing scale times off by hours.

Cron expression: A valid but unintended expression can cause unexpected scaling. For example, * * * * * matches every minute, and swapping the minute and hour fields changes the schedule entirely.

Predictive scaling metric: Choosing a metric that is not correlated with load (e.g., memory utilization for a stateless app) yields poor forecasts.

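Max capacity: Predictive scaling is capped by the group's maximum size by default, so a maximum that is too low limits the forecasted capacity, while allowing the policy to raise the maximum can push spend beyond budget.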
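The time zone pitfall is easiest to see around daylight saving time. The sketch below uses Python's standard `zoneinfo` module to compare a schedule defined in America/New_York with one defined in UTC; the helper `local_fire_time` is a hypothetical name, and the dates are arbitrary examples in winter and summer.

```python
from datetime import datetime
from zoneinfo import ZoneInfo


def local_fire_time(year: int, month: int, day: int, hour: int, minute: int,
                    schedule_tz: str, viewer_tz: str) -> str:
    """Wall-clock time in viewer_tz at which a schedule defined as
    hour:minute in schedule_tz fires on the given date."""
    fired = datetime(year, month, day, hour, minute, tzinfo=ZoneInfo(schedule_tz))
    return fired.astimezone(ZoneInfo(viewer_tz)).strftime("%H:%M")


ny = "America/New_York"
# Schedule defined in New York time: stays at 09:00 local in winter and summer.
print(local_fire_time(2026, 1, 15, 9, 0, ny, ny))
print(local_fire_time(2026, 7, 15, 9, 0, ny, ny))
# Schedule defined as 14:00 UTC: local time drifts by an hour in summer.
print(local_fire_time(2026, 1, 15, 14, 0, "UTC", ny))
print(local_fire_time(2026, 7, 15, 14, 0, "UTC", ny))
```

A schedule left in UTC keeps firing at the same UTC instant, so the local time it corresponds to moves whenever local clocks change.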

Exam Relevance

On the SAA-C03 exam, you may be asked to choose between scheduled and predictive scaling based on a scenario. Key differentiators: scheduled is for fixed, known patterns (e.g., business hours); predictive is for patterns that repeat but vary in magnitude (e.g., website traffic that follows daily cycles but with different peaks). The exam also tests the combination of scaling policies and the fact that the highest desired capacity wins.

Walk-Through

Identify Predictable Pattern

Analyze historical CloudWatch metrics to determine if traffic follows a recurring schedule. For example, an e-commerce site sees spikes every weekday at 10 AM. If the pattern is consistent in timing and magnitude, scheduled scaling is appropriate. If the pattern is consistent in timing but varies in magnitude (e.g., peak traffic between 10 AM and 12 PM but the exact load changes daily), predictive scaling is better. This step involves reviewing at least 2 weeks of data to identify the cycle.

Choose Scaling Strategy

Based on the pattern analysis, decide which policy type to implement. For fixed schedules (e.g., scale up at 8 AM every Monday), use scheduled scaling. For variable but predictable patterns (e.g., traffic follows a daily cycle but peak volume differs), use predictive scaling. If the pattern is unpredictable, use dynamic scaling (target tracking). The exam often tests this decision point: scheduled for known fixed times, predictive for learned patterns.
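The decision described above can be written down as a small function. This is only a study aid that encodes the chapter's rule of thumb; `choose_scaling_strategy` and its boolean inputs are hypothetical names, and real workloads often combine several strategies.

```python
def choose_scaling_strategy(timing_repeats: bool, magnitude_stable: bool,
                            has_24h_history: bool) -> str:
    """Map the workload pattern to the strategy this chapter recommends."""
    if not timing_repeats:
        return "dynamic"      # unpredictable load: target tracking
    if magnitude_stable:
        return "scheduled"    # fixed time and fixed size
    if has_24h_history:
        return "predictive"   # fixed timing, varying size, data available
    return "scheduled"        # no history yet: fall back to a fixed schedule


print(choose_scaling_strategy(timing_repeats=True, magnitude_stable=True, has_24h_history=True))
print(choose_scaling_strategy(timing_repeats=True, magnitude_stable=False, has_24h_history=True))
print(choose_scaling_strategy(timing_repeats=False, magnitude_stable=False, has_24h_history=True))
```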

Configure Scheduled Scaling

Create a scheduled action using the AWS CLI, console, or SDK. Specify recurrence using cron syntax, time zone, and the new capacity values. For example, to scale to 10 instances at 9 AM weekdays Eastern time: `0 9 * * 1-5` with time zone `America/New_York`. The action is stored and executed by the Auto Scaling service. Verify with `describe-scheduled-actions`.

Configure Predictive Scaling

Create a predictive scaling policy on the ASG with `put-scaling-policy` and policy type `PredictiveScaling`. First, ensure the ASG has at least 24 hours of metric history. Define the metric specification (e.g., CPU utilization target of 50%). Choose mode: `ForecastOnly` to test, or `ForecastAndScale` to activate. Confirm that the group's maximum size leaves room for the forecast, and set an optional buffer time. The service trains on historical data and generates forecasts. Use `get-predictive-scaling-forecast` to view predictions.

Monitor and Adjust

After deployment, monitor the ASG's scaling activity in the group's activity history (`describe-scaling-activities`) and in CloudWatch; CloudTrail records the underlying API calls. For scheduled scaling, verify that actions occur at the correct times. For predictive scaling, check the forecast accuracy by comparing predicted vs actual metrics. Adjust the target value or metric if needed. The exam may present a scenario where predictive scaling is under-provisioning, requiring a lower target value or different metric.
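One way to compare predicted and actual values is a mean absolute percentage error plus a count of hours where the forecast fell short. The choice of this particular error measure is an assumption made for illustration, not something the service reports in this form, and both function names are hypothetical.

```python
def mean_absolute_percentage_error(predicted: list[float],
                                   actual: list[float]) -> float:
    """Average of |predicted - actual| / |actual| in percent,
    skipping points where the actual value is zero."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to compare against")
    return 100 * sum(abs(p - a) / abs(a) for p, a in pairs) / len(pairs)


def under_forecast_hours(predicted: list[float], actual: list[float]) -> int:
    """Number of data points where the forecast was below the actual load."""
    return sum(1 for p, a in zip(predicted, actual) if p < a)


predicted = [100.0, 200.0, 300.0]
actual = [100.0, 250.0, 300.0]
print(round(mean_absolute_percentage_error(predicted, actual), 2))
print(under_forecast_hours(predicted, actual))
```

A forecast that is frequently below the actual load points toward a lower target value or a better-correlated metric, as described above.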

What This Looks Like on the Job

Enterprise Scenario 1: E-commerce Flash Sales

A large retailer runs flash sales every Friday at 3 PM. Traffic spikes are massive but short-lived. The operations team uses scheduled scaling to increase desired capacity from 50 to 500 at 2:45 PM, giving instances time to warm up. At 4 PM, a second scheduled action scales down to 50. This ensures capacity is ready before the spike and cost is minimized after. A common mistake is setting the scale-up time exactly at 3 PM — by then, the spike has already started, causing latency. The fix: schedule 15 minutes early to account for instance launch time.
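The "schedule early" fix amounts to subtracting a lead time from the event and writing the result as a cron expression. A minimal sketch, assuming a weekly event and cron's day-of-week numbering (0 for Sunday through 6 for Saturday); `cron_with_lead_time` is a hypothetical helper.

```python
from datetime import datetime, timedelta


def cron_with_lead_time(event_hour: int, event_minute: int,
                        cron_day_of_week: int, lead_minutes: int) -> str:
    """Five-field cron expression that fires lead_minutes before a weekly
    event. cron_day_of_week uses 0=Sunday ... 6=Saturday."""
    # 2023-01-01 was a Sunday, so adding the cron weekday gives the right day.
    event = datetime(2023, 1, 1 + cron_day_of_week, event_hour, event_minute)
    start = event - timedelta(minutes=lead_minutes)
    start_dow = (start.weekday() + 1) % 7  # Python Monday=0 -> cron Sunday=0
    return f"{start.minute} {start.hour} * * {start_dow}"


# Flash sale on Friday at 3:00 PM, scale out 15 minutes early.
print(cron_with_lead_time(15, 0, 5, 15))
```

Working through a date rather than subtracting from the hour and minute directly keeps the result correct when the lead time crosses midnight into the previous day.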

Enterprise Scenario 2: News Website with Variable Traffic

A news website experiences daily traffic peaks but the exact volume depends on breaking news. Scheduled scaling is too rigid; dynamic scaling reacts too slowly. The team implements predictive scaling using request count per target as the metric. The ML model learns that traffic typically rises between 6 AM and 9 AM, but the magnitude varies. Predictive scaling pre-launches instances based on forecast, reducing latency by 40% compared to dynamic scaling. The team reviews forecast accuracy weekly; the forecast itself is refreshed automatically by the service, so no manual retraining is needed.

Enterprise Scenario 3: Batch Processing at Midnight

A financial services company runs batch jobs every night at midnight. The batch workload is constant: exactly 20 EC2 instances are needed. They use scheduled scaling to set desired capacity to 20 at 11:55 PM and scale back to 2 at 6 AM. This is a textbook case for scheduled scaling because the pattern is fixed and known. Misconfiguration occurs when the time zone is omitted, so the cron expression is evaluated in UTC instead of local time. For a team in a UTC+5 zone, for example, 11:55 PM UTC is 4:55 AM local, so the extra capacity arrives almost five hours after the batch has started. Always set the time zone explicitly.
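The UTC mix-up in this scenario can be reproduced with fixed offsets. The sketch below assumes a hypothetical team whose local time is UTC+5 and shows when an 11:55 PM schedule actually fires locally if it is evaluated in UTC; the function name and the reference date are illustrative only.

```python
from datetime import datetime, timedelta, timezone

LOCAL_TZ = timezone(timedelta(hours=5))  # assumed local zone, UTC+5


def utc_schedule_in_local(hour: int, minute: int,
                          local_tz: timezone) -> tuple[str, int]:
    """Local wall-clock time, and day shift, for a schedule that the
    service evaluates in UTC because no time zone was set."""
    fired_utc = datetime(2026, 1, 5, hour, minute, tzinfo=timezone.utc)
    fired_local = fired_utc.astimezone(local_tz)
    day_shift = (fired_local.date() - fired_utc.date()).days
    return fired_local.strftime("%H:%M"), day_shift


# Intended: 23:55 local. Actual: 23:55 UTC, which lands the next morning.
print(utc_schedule_in_local(23, 55, LOCAL_TZ))
```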

Performance Considerations

Scheduled scaling: Each scheduled action in a group must have a unique name and a unique time. A request to create an action at a time already taken by another action in the same group is rejected with an error. There is also a service quota on the number of scheduled actions per ASG (125 by default).

Predictive scaling: Requires continuous metric data, since gaps cause degraded forecasts. The forecast is refreshed every 6 hours from the most recent CloudWatch data. For very spiky patterns, combine with target tracking as a safety net.

Cost: Scheduled scaling can lead to over-provisioning if the actual load is lower than expected. Predictive scaling reduces over-provisioning and incurs no additional AWS cost for the ML model.

How SAA-C03 Actually Tests This

SAA-C03 Exam Focus on Scheduled and Predictive Scaling

The SAA-C03 exam tests your ability to select the right scaling strategy based on workload characteristics. Objective 2.1 (Resilient Architectures) includes designing scalable and cost-effective architectures. You must differentiate between scheduled, predictive, and dynamic scaling.

Common Wrong Answers

Choosing dynamic scaling when the pattern is predictable: Candidates often pick target tracking because it is 'automatic'. However, if the question describes a known schedule (e.g., 'every weekday at 9 AM'), scheduled scaling is more appropriate because it pre-provisions capacity, avoiding the lag of dynamic scaling.

Choosing scheduled scaling when the pattern varies: If the question says 'traffic peaks daily but the exact load varies', scheduled scaling is wrong because the required capacity changes each day. Predictive scaling is correct.

Assuming predictive scaling requires no metric history: The exam may state 'no historical data available'. Predictive scaling needs at least 24 hours of metric data. Without it, use scheduled scaling or manual.

Confusing predictive scaling with simple scaling: Predictive scaling uses ML; simple scaling uses a fixed threshold. The exam tests the difference in how they set capacity.

Specific Numbers and Terms

Cron syntax: 0 9 * * 1-5 (minute hour day-of-month month day-of-week). The exam may give a cron expression and ask what it does.

Default cooldown: 300 seconds. Not relevant for predictive scaling.

Predictive scaling modes: ForecastOnly and ForecastAndScale.

Metric requirement: At least 24 hours of historical data.

Max capacity: Predictive scaling honors the group's max capacity by default.

Edge Cases

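Multiple scheduled actions at the same time: Not allowed within one group. Each scheduled action needs a unique time, and creating a second action for the same time is rejected with an error. The exam may ask about conflict resolution.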

Predictive scaling with no forecast: If the ML model cannot generate a forecast (e.g., insufficient data), the policy does not scale.

Combining policies: The final desired capacity is the maximum of all active policies. If scheduled scaling sets 10, and predictive scaling sets 15, the result is 15.

Elimination Strategy

When you see a question about scaling, first identify if the pattern is known and fixed (scheduled), learned and variable (predictive), or reactive (dynamic). Eliminate options that don't match the pattern. Then check for specific requirements: metric history, cooldown, etc. Always read the scenario for time-related clues like 'every day at 10 AM' or 'traffic follows a daily pattern but varies'.

Key Takeaways

Scheduled scaling is for fixed, known times (e.g., scale up at 9 AM weekdays).

Predictive scaling uses ML to forecast and proactively scale based on historical metrics.

Predictive scaling requires at least 24 hours of metric data and uses up to 14 days of history when available.

Scheduled scaling uses cron expressions; always set the time zone to avoid UTC confusion.

When multiple scaling policies are active, the highest desired capacity wins.

Predictive scaling can be set to 'ForecastOnly' for dry-run testing.

By default, predictive scaling does not exceed the ASG's max capacity.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Scheduled Scaling

Based on fixed cron schedules (e.g., every weekday at 9 AM).

Capacity changes are predetermined and static.

No historical data required; you specify exact times.

Best for known, consistent patterns (e.g., batch jobs).

No ML or forecasting involved; purely time-based.

Predictive Scaling

Uses ML to forecast future demand based on historical metrics.

Capacity changes are dynamic based on forecasted load.

Requires at least 24 hours of metric history.

Best for patterns that repeat but vary in magnitude (e.g., website traffic).

Uses ML forecasts that are refreshed automatically every 6 hours.

Watch Out for These

Mistake

Scheduled scaling can only be set for one-time events.

Correct

Scheduled scaling supports recurring schedules using cron expressions, as well as one-time events. Recurring schedules are common for business-hour patterns.

Mistake

Predictive scaling requires at least 14 days of metric history.

Correct

Predictive scaling needs only 24 hours of historical data to start forecasting. It uses up to 14 days of history when available, and more history generally improves the forecast.

Mistake

Scheduled scaling and predictive scaling cannot be used together.

Correct

They can coexist. When several policies are in effect, the highest resulting capacity applies. For example, a scheduled action can raise the group's minimum size to set a floor, while predictive scaling adds capacity above it based on the forecast.

Mistake

Predictive scaling instantly responds to real-time spikes.

Correct

Predictive scaling is proactive, not reactive. It forecasts based on historical data and scales ahead of predicted demand. For real-time spikes, you need dynamic (target tracking) scaling.

Mistake

Scheduled scaling always adjusts for daylight saving time, whether or not you set a time zone.

Correct

It adjusts only when the scheduled action specifies a time zone that observes DST. With the time zone set to America/New_York, a 9 AM schedule stays at 9 AM Eastern all year. With the default of UTC, the action fires at a fixed UTC time, so its local time shifts by an hour whenever local clocks change.


Frequently Asked Questions

Can I use scheduled scaling and dynamic scaling together?

Yes. You can have both scheduled and target tracking policies on the same ASG. The actual desired capacity is the maximum of all active policies. For example, a scheduled action may set a minimum of 5 instances during business hours, while target tracking may scale up to 10 if CPU exceeds 50%. This combination ensures a baseline capacity and allows elastic scaling above that.

What is the difference between predictive scaling and target tracking?

Predictive scaling is proactive: it forecasts future load and scales ahead of time. Target tracking is reactive: it adjusts capacity based on current metric values to maintain a target. Predictive scaling uses ML and requires historical data; target tracking uses a simple averaging algorithm. Use predictive for predictable patterns, target tracking for unpredictable spikes.

How do I schedule a one-time scaling action?

Use the `start-time` parameter instead of `recurrence`. For example: `aws autoscaling put-scheduled-update-group-action --auto-scaling-group-name my-asg --scheduled-action-name one-time-scale-up --start-time "2025-03-15T14:00:00Z" --desired-capacity 10`. The action executes once at the specified UTC time and then is removed.

What metrics can I use for predictive scaling?

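You can use a predefined metric pair (ASGCPUUtilization, ASGNetworkIn, ASGNetworkOut, ALBRequestCount), or specify the load metric (ASGTotalCPUUtilization, ASGTotalNetworkIn, ASGTotalNetworkOut, ALBTargetGroupRequestCount) and the scaling metric (ASGAverageCPUUtilization, ASGAverageNetworkIn, ASGAverageNetworkOut, ALBRequestCountPerTarget) separately. You can also use custom metrics by specifying a custom metric specification. The load metric must correlate with the number of instances needed.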

Why did my scheduled scaling action not trigger?

Common reasons: (1) The time zone is set to UTC but you expected local time. (2) The cron expression is incorrect (e.g., the minute and hour fields are swapped, or the day-of-week field excludes the day you expected). (3) The scheduled action was deleted or overwritten. (4) The ASG has suspended processes (for example, ScheduledActions). Check with `describe-scheduled-actions` and verify the recurrence and time zone.

Can predictive scaling scale in before a predicted drop?

No. In forecast-and-scale mode, predictive scaling only scales out: if current capacity is below the forecast capacity, it adds instances, but if current capacity is above the forecast, it does not remove any. To reduce capacity and cost during low-demand periods, pair it with a dynamic scaling policy such as target tracking, which handles scale-in.

What happens if predictive scaling forecast is inaccurate?

If the forecast underestimates demand, the ASG may be under-provisioned until dynamic scaling (if also configured) reacts. To mitigate, you can combine predictive scaling with a target tracking policy as a safety net. If the forecast overestimates, you may over-provision instances, increasing cost. You can adjust the target value or switch to forecast-only mode to fine-tune.
