This chapter covers two foundational cloud concepts: elasticity and scalability. These are core to the AWS Cloud and appear prominently in the CLF-C02 exam under Domain 1: Cloud Concepts (which accounts for 24% of the exam). Understanding the difference between elasticity and scalability, and knowing which AWS services provide them, is critical for both the exam and real-world architecture. You will learn the mechanisms behind Auto Scaling, Elastic Load Balancing, and how they work together to handle variable traffic.
Imagine you run a movie theater and you sell popcorn. You have a single popcorn machine that can make 10 bags per minute. Most days, that's plenty. But on opening night of a blockbuster, hundreds of people show up all at once. Your one machine can't keep up — customers wait forever, and you lose sales. You could buy a second machine, but that costs money and sits idle most days. Now imagine a smarter system: you have a fleet of small popcorn machines that can be turned on or off automatically. When the line grows long, sensors detect the wait and power on extra machines. When the rush ends, they power down. You only pay for the electricity and kernels used by the machines that actually ran. That's elasticity — adding or removing capacity automatically to match demand. Scalability is the ability to add more machines in the first place. Elasticity is doing it dynamically, without human intervention. AWS Auto Scaling and Elastic Load Balancing work like this: they monitor your application's load (the line) and add or remove virtual servers (popcorn machines) to keep response times fast and costs low.
What Are Elasticity and Scalability?
Elasticity and scalability are often confused, but they have distinct meanings in AWS. Scalability is the ability of a system to handle growing amounts of work by adding resources. Elasticity is the ability to scale resources up or down automatically in response to changing demand. In other words, scalability is about capacity, and elasticity is about automation.
On AWS, you can achieve both through services like Amazon EC2 Auto Scaling and Elastic Load Balancing (ELB). These services allow your application to add or remove EC2 instances based on metrics like CPU utilization or request count. The cloud's pay-as-you-go model makes this cost-effective: you only pay for the resources you use.
How Elasticity Works: The Mechanism
Amazon EC2 Auto Scaling works from a launch template (or a legacy launch configuration) that specifies the Amazon Machine Image (AMI), instance type, security groups, and other settings for new instances. You create an Auto Scaling group (ASG) that maintains a desired number of instances, and you attach scaling policies that are triggered by CloudWatch alarms. For example, if average CPU utilization exceeds 70% for 5 minutes, the ASG adds two instances. If it drops below 30% for 10 minutes, it removes one instance.
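Behind the scenes, EC2 Auto Scaling applies a cooldown period (default 300 seconds) after a scaling activity, so that a simple scaling policy does not fire again before the previous change has taken effect. This prevents rapid scaling oscillations. The ASG also integrates with Elastic Load Balancing: new instances are automatically registered with the load balancer, and unhealthy instances are terminated and replaced.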
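The threshold rules and cooldown described above can be sketched as a small Python model. This is an illustration only, not how AWS implements Auto Scaling: `GroupState` and `decide_capacity` are hypothetical names, the size bounds are made up, and the "for 5 minutes" part of each alarm is assumed to have been evaluated already.

```python
from dataclasses import dataclass

COOLDOWN_SECONDS = 300  # default cooldown mentioned in the text


@dataclass
class GroupState:
    """Hypothetical snapshot of an Auto Scaling group."""
    capacity: int
    min_size: int
    max_size: int
    last_scaling_time: float  # seconds; when the last scaling activity finished


def decide_capacity(state: GroupState, avg_cpu: float, now: float) -> int:
    """Return the capacity to use after one alarm evaluation.

    Assumes avg_cpu is already the average over the alarm's period.
    """
    # During cooldown, ignore the alarm to avoid oscillation.
    if now - state.last_scaling_time < COOLDOWN_SECONDS:
        return state.capacity

    if avg_cpu > 70:
        wanted = state.capacity + 2  # scale out by two instances
    elif avg_cpu < 30:
        wanted = state.capacity - 1  # scale in by one instance
    else:
        wanted = state.capacity      # inside the normal band, no change

    # Never leave the configured minimum/maximum bounds.
    return max(state.min_size, min(state.max_size, wanted))


state = GroupState(capacity=2, min_size=1, max_size=10, last_scaling_time=0)
print(decide_capacity(state, avg_cpu=85, now=600))  # 4
```

Scaling out by two but in by one, as in the example rule, makes the group quick to add capacity and cautious about removing it.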
Key Components: Auto Scaling and ELB
- Auto Scaling: Manages the number of EC2 instances. You can configure:
  - Minimum size: The smallest number of instances the group keeps running (e.g., 1, or 2 across two AZs for high availability).
  - Maximum size: The largest number (e.g., 10 to control costs).
  - Desired capacity: The number of instances the group tries to maintain, starting from launch (e.g., 2).
- Elastic Load Balancing: Distributes incoming traffic across multiple targets (EC2 instances, Lambda, containers). Types:
  - Application Load Balancer (ALB): Layer 7, for HTTP/HTTPS traffic.
  - Network Load Balancer (NLB): Layer 4, for TCP/UDP, ultra-low latency.
  - Classic Load Balancer: Legacy, not recommended for new applications.
Pricing Models
EC2 Auto Scaling: No additional charge. You pay only for the EC2 instances and other resources used (e.g., CloudWatch metrics).
Elastic Load Balancing: Charges based on hours of use and capacity consumed. An ALB costs about $0.0225 per hour plus $0.008 per LCU-hour (LCU = Load Balancer Capacity Unit). An NLB costs about $0.0225 per hour plus $0.006 per LCU-hour. These figures are approximate and vary by Region.
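As a rough worked example of the pricing above, the following Python sketch estimates a monthly load balancer bill. The rates are the approximate figures quoted in the text. The 730-hour month and the average of 5 LCUs are assumptions chosen for illustration, and `monthly_cost` is a hypothetical helper.

```python
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

# Approximate rates quoted in the text (USD).
RATES = {
    "ALB": {"hourly": 0.0225, "per_lcu_hour": 0.008},
    "NLB": {"hourly": 0.0225, "per_lcu_hour": 0.006},
}


def monthly_cost(lb_type: str, avg_lcus: float, hours: int = HOURS_PER_MONTH) -> float:
    """Estimate the monthly cost of one load balancer.

    avg_lcus is the assumed average number of capacity units used per hour.
    """
    rate = RATES[lb_type]
    return hours * (rate["hourly"] + avg_lcus * rate["per_lcu_hour"])


for lb_type in ("ALB", "NLB"):
    print(f"{lb_type}: ${monthly_cost(lb_type, avg_lcus=5):.2f} per month")
```

With these assumed inputs the estimate comes to roughly $46 per month for the ALB and $38 for the NLB. The capacity-unit charge, not the hourly charge, makes up most of the bill.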
Comparison to On-Premises
In an on-premises data center, scaling requires purchasing, racking, and configuring servers — a process that can take weeks. You must over-provision to handle peak load, wasting money during off-peak times. With AWS, you can provision resources in minutes and de-provision just as fast. This elasticity is a key advantage of cloud computing.
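The cost of over-provisioning can be made concrete with a little arithmetic. The sketch below compares a fleet sized for peak load all day with one that follows demand hour by hour. The demand profile and the $0.10 per instance-hour rate are invented for illustration, and both function names are hypothetical.

```python
def fixed_fleet_cost(peak_instances: int, hourly_rate: float, hours: int) -> float:
    """Cost of running peak capacity the whole time (on-premises style)."""
    return peak_instances * hourly_rate * hours


def elastic_fleet_cost(demand_by_hour: list[int], hourly_rate: float) -> float:
    """Cost when capacity follows demand hour by hour."""
    return sum(demand_by_hour) * hourly_rate


# Hypothetical day: 2 instances for 20 hours, 10 instances for a 4-hour peak.
demand = [2] * 20 + [10] * 4
rate = 0.10  # assumed price per instance-hour, in USD

fixed = fixed_fleet_cost(max(demand), rate, len(demand))
elastic = elastic_fleet_cost(demand, rate)
print(f"fixed: ${fixed:.2f}, elastic: ${elastic:.2f}")
```

In this made-up profile the always-at-peak fleet costs three times as much as the elastic one, because it pays for 10 instances during the 20 hours when only 2 are needed.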
When to Use Elasticity vs. Scalability
Use scalability (manual or scheduled scaling) when traffic patterns are predictable, like a retail site that gets more traffic on weekends.
Use elasticity (automatic scaling) when traffic is unpredictable, like a news site that spikes during breaking news.
For stateless applications (e.g., web servers), elasticity works well. For stateful applications (e.g., databases), scaling is more complex and often requires sharding or read replicas.
AWS Services for Scalability
Amazon EC2 Auto Scaling: For EC2 instances.
AWS Auto Scaling: Centralized scaling for multiple services (EC2, DynamoDB, Aurora, etc.).
Elastic Load Balancing: Distributes traffic.
Amazon S3: Scales automatically to handle any amount of data.
Amazon DynamoDB: Scales tables automatically with on-demand capacity.
AWS Lambda: Scales automatically by running more function instances.
Limits and Defaults
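Auto Scaling groups: The default quota limits the number of groups per Region (200 in older documentation; soft limit, can be increased). The number of instances in a single group is bounded by the maximum size you set and by your EC2 instance quotas.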
ELB: Default 50 load balancers per region (soft limit).
CloudWatch metrics: Standard monitoring provides data every 5 minutes; detailed monitoring (extra cost) provides data every 1 minute.
Code Example: Creating an Auto Scaling Group with AWS CLI
aws autoscaling create-auto-scaling-group \
--auto-scaling-group-name my-asg \
--launch-configuration-name my-launch-config \
--min-size 1 \
--max-size 10 \
--desired-capacity 2 \
--vpc-zone-identifier "subnet-abc123,subnet-def456"
CloudFormation Example
Resources:
  MyASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: !Ref MyLaunchConfig
      MinSize: '1'
      MaxSize: '10'
      DesiredCapacity: '2'
      VPCZoneIdentifier:
        - subnet-abc123
        - subnet-def456
Summary
Elasticity and scalability are fundamental cloud concepts. AWS Auto Scaling and ELB are the primary services for achieving them. Understanding the differences, pricing, and limits is essential for the CLF-C02 exam.
Define a Launch Template
First, create a launch template that specifies the EC2 instance configuration: AMI (e.g., Amazon Linux 2), instance type (e.g., t3.micro), security group, key pair, and user data script. The launch template is versioned, allowing updates without breaking existing ASGs. AWS recommends using launch templates over launch configurations because they support newer features like T2/T3 unlimited and placement groups. You can create the template via the AWS Management Console, CLI, or CloudFormation. Example CLI: `aws ec2 create-launch-template --launch-template-name my-template --launch-template-data '{"ImageId":"ami-0abcdef1234567890","InstanceType":"t3.micro"}'`.
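When the template carries more than two fields, hand-writing the JSON for `--launch-template-data` gets error-prone. The Python sketch below assembles the same CLI command as a string and sends nothing to AWS. The helper name `launch_template_command`, the security group ID, and the user data script are placeholders invented for illustration.

```python
import base64
import json
import shlex


def launch_template_command(name: str, ami: str, instance_type: str,
                            security_group: str, user_data: str) -> str:
    """Build the CLI command string; nothing is sent to AWS."""
    data = {
        "ImageId": ami,
        "InstanceType": instance_type,
        "SecurityGroupIds": [security_group],
        # User data in a launch template must be base64-encoded.
        "UserData": base64.b64encode(user_data.encode()).decode(),
    }
    return (
        "aws ec2 create-launch-template "
        f"--launch-template-name {shlex.quote(name)} "
        f"--launch-template-data {shlex.quote(json.dumps(data))}"
    )


command = launch_template_command(
    name="my-template",
    ami="ami-0abcdef1234567890",
    instance_type="t3.micro",
    security_group="sg-0123456789abcdef0",        # placeholder ID
    user_data="#!/bin/bash\nyum install -y httpd\n",  # placeholder script
)
print(command)
```

Generating the JSON with `json.dumps` and quoting it with `shlex.quote` avoids the shell-escaping mistakes that are easy to make when typing the payload by hand.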
Create an Auto Scaling Group
Next, create an Auto Scaling group (ASG) using the launch template. Specify the VPC and subnets (at least two for high availability). Set the desired capacity (e.g., 2 instances), minimum size (e.g., 1), and maximum size (e.g., 10). The ASG will launch instances to meet the desired capacity. You can also attach an existing Elastic Load Balancer. The ASG automatically registers new instances with the load balancer. Behind the scenes, AWS uses the subnets to distribute instances across Availability Zones. If one AZ fails, the ASG launches instances in another AZ.
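To see why at least two subnets matter, the following Python sketch models an even spread of instances across Availability Zones. It is a simplified picture: the real service chooses zones itself, and `spread_across_zones` and the zone names are hypothetical.

```python
def spread_across_zones(desired: int, zones: list[str]) -> dict[str, int]:
    """Split `desired` instances as evenly as possible across zones."""
    base, extra = divmod(desired, len(zones))
    # The first `extra` zones each get one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


zones = ["us-east-1a", "us-east-1b"]
print(spread_across_zones(5, zones))      # {'us-east-1a': 3, 'us-east-1b': 2}
print(spread_across_zones(5, zones[1:]))  # {'us-east-1b': 5}
```

The second call mimics losing one zone: the same desired capacity is met entirely in the surviving zone, which is only possible if the group was given a subnet there.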
Configure Scaling Policies
Define scaling policies based on CloudWatch alarms. For example, create a policy to add 2 instances when average CPU utilization exceeds 70% for 5 minutes. Use a step scaling policy for precise adjustments or a simple scaling policy for fixed increments. You can also use target tracking scaling, which automatically sets the right number of instances to keep a metric (e.g., average CPU) at a target value (e.g., 50%). AWS recommends target tracking for most workloads. Example CLI: `aws autoscaling put-scaling-policy --auto-scaling-group-name my-asg --policy-name scale-out --scaling-adjustment 2 --adjustment-type ChangeInCapacity`.
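The idea behind target tracking can be approximated with a proportional rule: if the metric is above the target, capacity should grow by roughly the same ratio. The Python sketch below is a simplified model, not the service's actual algorithm. It assumes the metric falls in proportion as instances are added, and `target_tracking_capacity` is a hypothetical name.

```python
import math


def target_tracking_capacity(current: int, metric: float, target: float,
                             min_size: int, max_size: int) -> int:
    """Proportional estimate: capacity scales with metric / target."""
    wanted = math.ceil(current * metric / target)
    # Stay inside the group's configured bounds.
    return max(min_size, min(max_size, wanted))


# 4 instances at 80% average CPU, aiming for 50%.
print(target_tracking_capacity(4, metric=80, target=50, min_size=1, max_size=10))  # 7
# 4 instances at 25% average CPU, aiming for 50%.
print(target_tracking_capacity(4, metric=25, target=50, min_size=1, max_size=10))  # 2
```

Unlike the fixed "add 2 instances" policy, the size of the adjustment depends on how far the metric is from the target, which is why target tracking needs less tuning.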
Set Up Elastic Load Balancing
Create an Elastic Load Balancer (e.g., ALB) and configure listeners (e.g., HTTP on port 80). Create a target group for the EC2 instances and attach it to the ASG so that new instances are registered automatically. The ALB distributes incoming traffic across healthy instances. Health checks (e.g., HTTP GET /health) determine instance health, and the ALB stops routing requests to an instance that fails them until it passes again. The ALB supports path-based routing, host-based routing, and sticky sessions (using cookies). For the exam: ALB is Layer 7, NLB is Layer 4. Use ALB for HTTP/HTTPS, NLB for TCP/UDP with extreme performance.
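How health checks affect traffic distribution can be sketched in a few lines of Python. This toy model assumes simple round-robin distribution and a precomputed health status per target. `route_requests` and the instance IDs are hypothetical.

```python
from itertools import cycle


def route_requests(targets: dict[str, bool], request_count: int) -> list[str]:
    """Round-robin requests across targets whose health check passed.

    `targets` maps a hypothetical instance ID to its health status.
    """
    healthy = [name for name, ok in targets.items() if ok]
    if not healthy:
        raise RuntimeError("no healthy targets available")
    rotation = cycle(healthy)
    return [next(rotation) for _ in range(request_count)]


targets = {"i-aaa": True, "i-bbb": False, "i-ccc": True}
print(route_requests(targets, 4))  # ['i-aaa', 'i-ccc', 'i-aaa', 'i-ccc']
```

The unhealthy instance `i-bbb` receives no requests, so a failing server degrades capacity rather than returning errors to users.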
Monitor and Adjust
After setup, monitor the ASG and ELB using CloudWatch dashboards. Check metrics like HealthyHostCount, UnHealthyHostCount, RequestCount, and CPUUtilization. If scaling is too slow, reduce the cooldown period (default 300 seconds) or increase the step adjustment size. If scaling is too fast (oscillation), increase the cooldown or widen the gap between the scale-out and scale-in thresholds. You can also set up scheduled scaling for predictable traffic (e.g., scale up at 8 AM weekdays). AWS Trusted Advisor can provide recommendations for underutilized instances. Remember: when a group becomes unbalanced across Availability Zones, EC2 Auto Scaling rebalances it automatically by launching instances in the underused zones before terminating instances in the overused ones, so manual intervention is normally not needed.
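Scheduled scaling amounts to a lookup from the clock to a capacity. The Python sketch below models the "scale up at 8 AM weekdays" example. The 6 PM scale-down, the capacity values, and the function name `scheduled_capacity` are assumptions added for illustration.

```python
from datetime import datetime


def scheduled_capacity(now: datetime, base: int = 2, business_hours: int = 6) -> int:
    """Return the capacity a hypothetical weekday schedule would request."""
    is_weekday = now.weekday() < 5   # Monday=0 ... Friday=4
    in_window = 8 <= now.hour < 18   # assumed window: 8 AM to 6 PM
    return business_hours if is_weekday and in_window else base


print(scheduled_capacity(datetime(2024, 1, 8, 9, 0)))  # Monday 9 AM: 6
print(scheduled_capacity(datetime(2024, 1, 6, 9, 0)))  # Saturday 9 AM: 2
```

A schedule like this sets a baseline ahead of known demand. A dynamic policy can still run alongside it to absorb anything the schedule did not anticipate.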
Scenario 1: E-commerce Flash Sale
An online retailer runs a flash sale that lasts 2 hours. Traffic spikes from 1,000 requests per second to 50,000 requests per second. Without elasticity, the site would crash. The team uses an Auto Scaling group with a target tracking scaling policy to keep CPU at 50%. The ASG launches hundreds of EC2 instances within minutes. An ALB distributes traffic. After the sale, the ASG scales down to 2 instances. Cost: They pay for the extra instances only for the hours they ran. If misconfigured (e.g., insufficient maximum size), the site would still crash. If cooldown is too short, the system might oscillate.
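The warning about an insufficient maximum size can be checked with simple arithmetic. The sketch below estimates how many instances the peak needs and compares that with the group's maximum. The figure of 200 requests per second per instance and the maximum of 100 are assumptions invented for the example, and the function names are hypothetical.

```python
import math


def instances_needed(requests_per_second: int, per_instance_rps: int) -> int:
    """Instances required to serve the load, rounded up."""
    return math.ceil(requests_per_second / per_instance_rps)


def shortfall(max_size: int, requests_per_second: int, per_instance_rps: int) -> int:
    """How many instances the group is missing at peak (0 if max_size is enough)."""
    return max(0, instances_needed(requests_per_second, per_instance_rps) - max_size)


PER_INSTANCE_RPS = 200  # assumption: sustainable load per instance

print(instances_needed(50_000, PER_INSTANCE_RPS))            # 250
print(shortfall(max_size=100, requests_per_second=50_000,
                per_instance_rps=PER_INSTANCE_RPS))          # 150
```

Under these assumptions a maximum size of 100 would leave the group 150 instances short at peak, no matter how well the scaling policy is tuned.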
Scenario 2: News Website with Unpredictable Spikes
A news site experiences sudden traffic surges when a major story breaks. They use Amazon EC2 Auto Scaling with a simple scaling policy that adds 5 instances whenever request count exceeds 10,000 per minute. An Application Load Balancer fronts the web tier, which runs partly on Spot Instances to reduce costs. The ASG uses a mixed instances policy to combine On-Demand and Spot Instances. If Spot Instances are interrupted, the ASG launches replacements to restore the desired capacity, while the On-Demand portion keeps a reliable baseline running. This provides both cost savings and reliability. A common mistake: setting the minimum size too low (e.g., 1) can cause downtime if that single instance fails. Best practice: minimum size of 2 across two AZs.
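The On-Demand/Spot split in a mixed instances policy follows from two settings: a base number of On-Demand instances and a percentage of On-Demand capacity above that base. The Python sketch below models that arithmetic. The parameter names mirror those settings, but the function `split_capacity` is hypothetical and rounding the On-Demand share up is an assumption.

```python
import math


def split_capacity(desired: int, on_demand_base: int,
                   on_demand_pct_above_base: int) -> tuple[int, int]:
    """Return (on_demand, spot) instance counts for a desired capacity."""
    base = min(desired, on_demand_base)
    above = desired - base
    # Assumption: round the On-Demand share up, favoring reliability.
    extra_on_demand = math.ceil(above * on_demand_pct_above_base / 100)
    on_demand = base + extra_on_demand
    return on_demand, desired - on_demand


# 12 instances, first 2 always On-Demand, then 20% On-Demand above the base.
print(split_capacity(12, on_demand_base=2, on_demand_pct_above_base=20))  # (4, 8)
```

In this example a Spot interruption can remove at most 8 of the 12 instances, so the site keeps at least a third of its capacity while replacements launch.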
Scenario 3: Video Streaming Platform
A streaming service uses AWS Lambda and Amazon API Gateway for its backend. Lambda scales automatically, so no ASG is needed. However, the encoding pipeline uses EC2 instances with Auto Scaling. During peak hours (evenings), they use scheduled scaling to increase capacity. They also use a Network Load Balancer to handle millions of concurrent connections with low latency. Cost considerations: NLB and ALB have similar hourly rates, and the NLB's per-LCU rate is slightly lower, so the choice is driven by protocol rather than price; an NLB is required for raw TCP/UDP traffic. An ALB works at Layer 7 (HTTP/HTTPS), so using it instead could mean higher latency for streaming. Key lesson: choose the right load balancer type for the protocol.
What CLF-C02 Tests
Domain 1: Cloud Concepts (24% of exam). Objective 1.1: Define the benefits of the AWS Cloud including elasticity, scalability, high availability, and fault tolerance. The exam tests your ability to distinguish between these concepts and identify which AWS service provides which benefit.
Common Wrong Answers
"Elasticity means adding more resources permanently" — Incorrect. Elasticity is automatic and temporary. Scalability can be permanent.
"Auto Scaling automatically distributes traffic" — No, that's Elastic Load Balancing. Auto Scaling only manages instance count.
"You can scale a single EC2 instance vertically indefinitely" — No, instance types have maximum sizes (e.g., u-24tb1.metal). Vertical scaling requires stopping the instance. Horizontal scaling (adding more instances) is preferred.
"All AWS services are elastic by default" — No, only some (e.g., Lambda, DynamoDB on-demand). Others require configuration (e.g., EC2 Auto Scaling).
Specific Terms on the Exam
Elasticity: The ability to automatically scale resources up or down based on demand.
Scalability: The ability to increase capacity to handle growth.
Auto Scaling group: The logical group of EC2 instances.
Launch template: The configuration template for new instances.
Target tracking scaling policy: Automatically adjusts capacity to maintain a target metric.
ELB: Elastic Load Balancing (ALB, NLB, CLB).
High availability: Running across multiple Availability Zones.
Fault tolerance: Ability to continue operating after a component failure.
Tricky Distinctions
Elasticity vs. Scalability: Elasticity is a subset of scalability that implies automation. The exam may ask: "Which benefit allows you to automatically add instances during a traffic spike?" Answer: Elasticity.
Horizontal vs. Vertical Scaling: Horizontal = adding more instances. Vertical = increasing instance size (e.g., t3.micro to t3.large). The exam tests that horizontal scaling is more common in cloud.
ELB vs. Auto Scaling: ELB distributes traffic; Auto Scaling manages instance count. They often work together but are separate services.
Decision Rule
When you see a question about automatically adjusting capacity to match demand, think Auto Scaling. For distributing traffic, think ELB. If a question involves both, remember that the two services work together. If the question mentions "pay for what you use" and "no upfront cost", that points to the pay-as-you-go pricing model, which elasticity lets you take full advantage of.
Elasticity = automatic scaling up/down; Scalability = ability to handle growth.
Auto Scaling manages EC2 instance count; ELB distributes traffic.
Target tracking scaling policy is the recommended method for Auto Scaling.
Default cooldown period for Auto Scaling is 300 seconds.
ELB types: ALB (layer 7), NLB (layer 4), CLB (legacy).
Auto Scaling groups can span multiple Availability Zones for high availability.
You pay only for the EC2 instances and ELB hours used, no extra charge for Auto Scaling.
These come up on the exam all the time. Here's how to tell them apart.
Elasticity
Automatic scaling up/down based on demand
Uses Auto Scaling and CloudWatch alarms
Key benefit: cost optimization (pay only for what you use)
Reacts to real-time metrics (e.g., CPU, request count)
Example: ASG adds instances during a spike, removes when traffic drops
Scalability
Ability to handle increased load by adding resources
Can be manual, scheduled, or automatic
Key benefit: capacity to grow without re-architecture
May involve vertical (bigger instance) or horizontal (more instances) scaling
Example: Migrating from t3.micro to t3.large for more memory
Mistake
Auto Scaling and Elastic Load Balancing are the same service.
Correct
They are separate services. Auto Scaling manages the number of EC2 instances. Elastic Load Balancing distributes incoming traffic across those instances. They often work together but have distinct roles.
Mistake
Elasticity means you can scale up an existing instance to a larger type without downtime.
Correct
Changing an EC2 instance type requires stopping the instance, which causes downtime. Elasticity typically refers to adding or removing instances (horizontal scaling), not resizing a single instance.
Mistake
All AWS services are elastic by default.
Correct
Only some services are inherently elastic (e.g., AWS Lambda, Amazon DynamoDB on-demand). Others, like EC2, require you to configure Auto Scaling. S3 scales automatically, but not all services do.
Mistake
You should always set the minimum size of an Auto Scaling group to 0 to save costs.
Correct
Setting minimum to 0 can cause the application to have zero capacity during low traffic, leading to downtime. Best practice is to set minimum to at least 1 (or 2 for high availability) to ensure the application is always available.
Mistake
Elasticity is the same as high availability.
Correct
Elasticity is about scaling capacity to match demand. High availability is about ensuring the system remains operational despite failures. They are different benefits. For example, an ASG can provide both by launching instances across AZs.
Elasticity is a subset of scalability that involves automatically provisioning and de-provisioning resources in response to demand. Scalability is the broader ability to increase capacity. For the CLF-C02 exam, remember that elasticity implies automation and cost optimization, while scalability is about capacity planning. Example: Auto Scaling provides elasticity; manually adding instances provides scalability but not elasticity.
No. Auto Scaling only manages the number of EC2 instances. Traffic distribution is handled by Elastic Load Balancing (ELB). However, Auto Scaling can automatically register new instances with an attached ELB. The exam tests that these two services work together but are separate.
No. To change an EC2 instance type, you must stop the instance, change the type, and start it again. This causes a brief downtime. For zero-downtime scaling, use horizontal scaling (add more instances) with a load balancer. The exam expects you to know that vertical scaling is limited and involves downtime.
ALB operates at Layer 7 (HTTP/HTTPS) and supports path-based routing, host-based routing, and WebSockets. NLB operates at Layer 4 (TCP/UDP), offers ultra-low latency, and can handle millions of requests per second. Use ALB for web applications, NLB for high-performance TCP/UDP traffic. The exam may ask you to choose based on protocol or latency requirements.
Use a cooldown period (default 300 seconds) to prevent rapid scaling. Also, set a maximum size limit. For step scaling, use smaller step adjustments. Target tracking scaling automatically adjusts but you can set a higher target value to reduce sensitivity. The exam may test your understanding of cooldown and scaling policies.
Launch templates are newer and support versioning, T2/T3 unlimited, and placement groups. Launch configurations are legacy and do not support these features. AWS recommends using launch templates. The exam may ask which one to use for new deployments: launch template.
Yes. Lambda scales automatically by running multiple instances of your function in response to incoming events. There is no Auto Scaling configuration needed. This is a key benefit of serverless. The exam contrasts Lambda's built-in elasticity with EC2's need for Auto Scaling.