CLF-C02 | Chapter 121 of 130 | Objective 3.5

AWS Service Quotas and Limits

This chapter covers AWS Service Quotas and Limits, a fundamental concept for managing resource usage and avoiding service disruptions. For the CLF-C02 exam, this objective falls under Domain 3: Cloud Technology and Services, and understanding quotas matters for cost management and operational stability. You will learn what quotas are, how adjustable (soft) quotas differ from fixed (hard) ones, how to monitor usage and request increases, and common exam traps. This topic typically appears in 2–3 questions on the exam.

25 min read
Beginner
Updated May 31, 2026

The City Building Permit System

Imagine you are a developer building a new city. The city council sets limits on how many skyscrapers you can build in a district, how much water you can use, and how many building permits you can apply for per month. These limits ensure no single developer consumes all resources, leaving others unable to build. Most of these caps are 'soft limits': defaults that are generous but not infinite. If your city grows and you need more permits, you can submit a 'limit increase request' to the council, explaining why you need more. The council reviews your request, and if it is approved, your cap is raised to the new value. A few rules are 'hard limits' that the council never waives, no matter how good your justification is, such as a maximum building height near the airport. AWS Service Quotas work much the same way: each AWS account has default quotas for things like EC2 capacity, S3 buckets, or API calls per second. Most are adjustable, and you can request increases via the Service Quotas console or API; AWS evaluates the request based on factors such as your usage history and stated need. Once approved, the higher value becomes your new applied quota, and you can request a further increase later. A smaller set of quotas is fixed and cannot be raised. Exceeding any quota, adjustable or fixed, results in an error, just as building beyond your permit would get a stop-work order. This system prevents resource exhaustion and ensures fair usage across all AWS customers.

How It Actually Works

What Are AWS Service Quotas and Why Do They Exist?

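AWS Service Quotas, previously known as 'service limits', are the maximum number of resources or operations you can create or perform in an AWS account. Every AWS service has predefined quotas to prevent accidental overuse, ensure fair resource allocation among customers, and maintain overall system stability. For example, EC2 On-Demand capacity is capped per region by a vCPU quota for each instance family group (older material, and some exam prep, still describes this as a small number of instances, such as 5), S3 caps the number of buckets per account (100 was the default for many years; AWS raised the default to 10,000 in late 2024), and Lambda caps concurrent executions at 1,000 per region by default. Exact defaults change over time and can differ between accounts, so treat the Service Quotas console as the source of truth. Quotas apply at the account level or per region, depending on the service.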

Quotas are essential for capacity planning. Without them, a single account could consume all available resources in a region, degrading performance for others. They also protect you from runaway costs due to misconfigured auto-scaling or malicious activity. On the exam, you need to distinguish between two types of quotas: soft limits (which can be increased by requesting) and hard limits (which are fixed and cannot be changed).

How AWS Service Quotas Work

AWS enforces quotas per account, and for most services per region. When you attempt to create a resource (e.g., launch an EC2 instance), AWS checks your current usage against the applied quota for that resource type in the target region. If the usage plus the new resource would exceed the quota, the API call fails. The exact error code varies by service: examples include 'VcpuLimitExceeded' from EC2 and 'LimitExceededException' or 'ServiceQuotaExceededException' from other services. Usage is evaluated at request time, so quotas are enforced immediately.
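The check described above can be sketched in a few lines of Python. This is a toy model, not AWS's implementation: `RegionalQuota`, `QuotaExceededError`, and the numbers used are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


class QuotaExceededError(Exception):
    """Raised when a request would push usage past the applied quota."""


@dataclass
class RegionalQuota:
    """Hypothetical model of one quota in one account and Region."""
    name: str
    applied_value: float  # the quota currently in force
    usage: float = 0.0    # resources already counted against it

    def reserve(self, amount: float) -> float:
        """Admit the request only if usage + amount stays within the quota."""
        if self.usage + amount > self.applied_value:
            raise QuotaExceededError(
                f"{self.name}: {self.usage} + {amount} exceeds {self.applied_value}"
            )
        self.usage += amount
        return self.usage


quota = RegionalQuota("on-demand-standard-vcpus", applied_value=32, usage=28)
quota.reserve(4)          # fits exactly: usage is now 32
try:
    quota.reserve(2)      # 32 + 2 > 32, so the request is rejected
except QuotaExceededError as err:
    print("rejected:", err)
```

Note that the rejected request leaves usage unchanged, which mirrors how a failed API call creates nothing.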

You can view your current quotas and usage in the Service Quotas console, AWS CLI (aws service-quotas), or via CloudWatch metrics. For example, to list EC2 instance quotas:

aws service-quotas list-service-quotas --service-code ec2 --region us-east-1

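To look up the applied value of a specific quota, replace the placeholder quota code L-12345678 with a code returned by the list command. This call returns the quota value, not your current usage; usage is published as CloudWatch metrics in the AWS/Usage namespace: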

aws service-quotas get-service-quota --service-code ec2 --quota-code L-12345678

Key Tiers and Configurations

Soft Quotas: These are adjustable by submitting a request to AWS. Examples include: running On-Demand EC2 instances per region (measured in vCPUs; new accounts start with a low default that can be raised substantially), S3 buckets per account, Lambda concurrent executions (default 1,000 per region), VPCs per region (default 5), Elastic IP addresses per region (default 5), and IAM roles per account (default 1,000).

Hard Quotas: These cannot be increased, and no request or support ticket will change them. Examples include the maximum Lambda function timeout (900 seconds) and the number of access keys per IAM user (2). Service Quotas indicates for each quota whether it is adjustable, so you can check rather than guess.

Rate-Based Quotas: These limit how many requests you can make per second. Most AWS APIs throttle calls above a per-account request rate, and DynamoDB applies throughput quotas per table and per account. S3 is often cited here as well: it supports at least 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix, but that figure is a performance characteristic you design around, not a quota you can ask AWS to raise.

Resource-Based Quotas: These limit the total number of resources, like EC2 instances, RDS databases, or S3 buckets.
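To make the difference between rate-based and resource-based quotas concrete, here is a minimal token-bucket sketch, a common way to model request throttling. It is an illustrative assumption, not a statement of how any particular AWS service throttles; `TokenBucket` and its parameters are hypothetical. A resource-based quota counts what exists, while a rate-based quota refills over time.

```python
class TokenBucket:
    """Toy rate limiter: `rate_per_sec` tokens refill each second, up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: float):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a request at time `now` is admitted, False if throttled."""
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(rate_per_sec=2, burst=3)
burst_results = [bucket.allow(now=0.0) for _ in range(4)]  # three admitted, fourth throttled
after_wait = bucket.allow(now=1.0)                         # one second later, tokens have refilled
print(burst_results, after_wait)
```

Waiting restores capacity under a rate-based quota, which is why retrying with backoff helps there but does nothing for a resource-based quota.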

Comparison to On-Premises or Competing Approaches

In an on-premises data center, you have physical capacity constraints (e.g., number of servers, network bandwidth) that act as hard limits. These are not dynamically adjustable: to exceed them, you must purchase new hardware. Cloud quotas are similar in that they cap usage, but they are much more flexible: you can request increases quickly (often within minutes) without capital expenditure. Other cloud providers such as Google Cloud and Azure also have quotas. On AWS, Service Quotas centralizes management and provides a unified API for viewing quotas and requesting increases across services, and it integrates with CloudWatch so you can alarm on quota usage and manage it proactively.

When to Use vs Alternatives

You don't 'use' quotas as a service; you manage them to avoid hitting limits. Use the Service Quotas console or API when you need to:

Track current usage against quotas

Request quota increases for planned growth

Set CloudWatch alarms to warn when approaching limits

Automate quota monitoring with AWS Config or custom scripts

Alternatives include manually tracking usage via CloudWatch dashboards or third-party tools, but Service Quotas provides a single pane of glass. For the exam, remember that you cannot increase hard quotas, and that some quotas are per-region while others are per-account.

Walk-Through

1

Identify Required Quota Increase

First, determine which quota you need to increase. For example, suppose you are launching a new application that requires 20 EC2 t3.micro instances in us-east-1. EC2 On-Demand quotas are measured in vCPUs, and a t3.micro has 2 vCPUs, so you need 40 vCPUs under the 'Running On-Demand Standard (A, C, D, H, I, M, R, T, Z) instances' quota. If your applied quota is lower than that, you need an increase. Navigate to the Service Quotas console, search for EC2, and locate the specific quota. Note the applied value and the AWS default value. This step is crucial because requesting the wrong quota will not resolve your issue. AWS CLI command to list EC2 quotas: `aws service-quotas list-service-quotas --service-code ec2`.

2

Submit a Quota Increase Request

In the Service Quotas console, select the quota and click 'Request increase'. Where the console offers the choice, select 'At account level' or 'At resource level'. Enter the new desired value (e.g., 40 vCPUs to run twenty t3.micro instances) and, if prompted, a brief use case description (e.g., 'Production web application for customer XYZ'). AWS evaluates the request automatically for many quotas, and you may receive an immediate approval or a request for more information. For some quotas, especially at high values, AWS may require a manual review. You can also use the AWS CLI, replacing the placeholder quota code with the real one: `aws service-quotas request-service-quota-increase --service-code ec2 --quota-code L-12345678 --desired-value 40`.

3

Monitor Request Status

After submitting, you can track the status in the Service Quotas console under 'Request history'. Statuses include 'Pending', 'Approved', 'Denied', or 'Case opened' (if AWS Support is involved). The time to approval varies: automatic approvals often take minutes, while manual requests may take hours to days. If denied, AWS provides a reason (e.g., insufficient justification). You can then refine your request and resubmit. Use CloudWatch Events (now Amazon EventBridge) to get notified when the request status changes. This step is important for exam questions about how long quota increases take.

4

Implement Alarms for Quota Usage

To avoid hitting quotas unexpectedly, set CloudWatch alarms on usage metrics. For example, you can create an alarm on EC2 On-Demand vCPU usage using the `AWS/Usage` namespace with the metric `ResourceCount` and the dimensions `Service` = EC2, `Type` = Resource, `Resource` = vCPU, and `Class` = Standard/OnDemand. Set the threshold to 80% of your quota. When usage approaches the limit, the alarm triggers a notification via SNS. This proactive approach is a best practice and is tested on the exam. You can also use AWS Trusted Advisor, whose service limit checks flag usage above 80% of a quota.
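As a rough sketch of the alarm described above, the function below only builds the parameter dictionary; it does not call AWS. The keys follow the CloudWatch `PutMetricAlarm` parameters, so the result could be passed to boto3 as `cloudwatch.put_metric_alarm(**params)`, assuming boto3 and credentials are available. The function name, alarm name, statistic, period, and SNS topic ARN are illustrative assumptions.

```python
def usage_alarm_params(quota_value: float, topic_arn: str, percent: float = 80.0) -> dict:
    """Build alarm parameters that fire when vCPU usage reaches `percent` of the quota."""
    return {
        "AlarmName": f"ec2-standard-vcpu-usage-{int(percent)}pct",  # hypothetical name
        "Namespace": "AWS/Usage",
        "MetricName": "ResourceCount",
        "Dimensions": [
            {"Name": "Service", "Value": "EC2"},
            {"Name": "Type", "Value": "Resource"},
            {"Name": "Resource", "Value": "vCPU"},
            {"Name": "Class", "Value": "Standard/OnDemand"},
        ],
        "Statistic": "Maximum",   # assumed: alarm on the peak within each period
        "Period": 300,            # assumed: five-minute evaluation window
        "EvaluationPeriods": 1,
        "Threshold": quota_value * percent / 100.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that receives the notification
    }


# Placeholder ARN; substitute a real SNS topic in your own account.
params = usage_alarm_params(100, "arn:aws:sns:us-east-1:123456789012:quota-alerts")
print(params["Threshold"])
```

Computing the threshold from the quota value, rather than hard-coding it, means the alarm can be regenerated whenever a quota increase is approved.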

5

Automate Quota Monitoring with AWS Config

For advanced governance, use AWS Config rules to evaluate whether your resource counts exceed a certain percentage of quotas. For example, create a custom AWS Config rule that triggers an evaluation when a new EC2 instance is launched and compares the total count to the quota. If the count exceeds 90%, mark the resource as non-compliant. This helps enforce organizational policies. Additionally, you can use AWS Organizations to apply service control policies (SCPs) that prevent requesting certain quota increases from child accounts. This step is more advanced but demonstrates depth for the exam.
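The comparison at the heart of such a custom rule might look like the sketch below. It covers only the decision logic: a real custom AWS Config rule would run in a Lambda function, fetch the resource count and the quota value, and report the result back to AWS Config. The function name and the returned dictionary shape are hypothetical, although `COMPLIANT` and `NON_COMPLIANT` are the compliance values AWS Config uses.

```python
def evaluate_quota_usage(resource_count: int, quota_value: float,
                         threshold_percent: float = 90.0) -> dict:
    """Flag the account as non-compliant when usage exceeds the threshold."""
    if quota_value <= 0:
        raise ValueError("quota_value must be positive")
    utilization = 100.0 * resource_count / quota_value
    compliance = "NON_COMPLIANT" if utilization > threshold_percent else "COMPLIANT"
    return {
        "ComplianceType": compliance,
        "Annotation": f"{resource_count} of {quota_value} used ({utilization:.1f}%)",
    }


print(evaluate_quota_usage(28, 32))  # 87.5% of quota: still within the 90% threshold
print(evaluate_quota_usage(29, 32))  # 90.6% of quota: over the threshold
```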

What This Looks Like on the Job

Scenario 1: Scaling a Web Application During a Marketing Campaign

A startup runs a web application on EC2 behind an Auto Scaling group. During a planned marketing campaign, they expect traffic to spike from 1,000 to 50,000 users per hour. Their current EC2 On-Demand quota is 10 vCPUs, which can handle about 5,000 users. To scale safely, they request a quota increase to 100 vCPUs, ideally a couple of weeks in advance so there is time for a manual review if one is needed. They submit the request via Service Quotas with a detailed use case, and AWS approves it. On campaign day, Auto Scaling launches instances as traffic grows. If the quota had not been increased, new instances would have failed to launch once the account reached 10 vCPUs, leaving the application unable to serve the extra traffic. The team also sets CloudWatch alarms on vCPU usage at 80% of quota to catch any unexpected spikes. Cost-wise, they only pay for instances launched, and the quota increase itself is free. A higher quota does not reserve capacity or add cost, so leaving it in place after the campaign is harmless; the risk to manage is runaway scaling, which is controlled by the Auto Scaling group's maximum size rather than by the quota.
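The sizing in this scenario is simple proportional arithmetic, sketched below using the scenario's own figures (10 vCPUs serving about 5,000 users). The helper name is hypothetical, and real capacity planning would rest on load testing rather than a linear assumption.

```python
import math


def vcpus_needed(target_users: int, users_per_vcpu: float) -> int:
    """Round up, because a fractional vCPU cannot be requested."""
    return math.ceil(target_users / users_per_vcpu)


users_per_vcpu = 5_000 / 10               # scenario figure: 10 vCPUs handle about 5,000 users
required = vcpus_needed(50_000, users_per_vcpu)
alarm_threshold = 0.8 * required          # alarm at 80% of the requested quota
print(required, alarm_threshold)
```

With these inputs the required quota comes out at 100 vCPUs, matching the value the team requests, and the 80% alarm fires at 80 vCPUs.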

Scenario 2: Data Lake Ingestion with S3 and Lambda

A media company ingests thousands of video files per day into S3, triggering Lambda functions for transcoding. S3 supports at least 3,500 PUT requests per second per prefix. During peak hours, they exceed this rate on a single prefix and S3 throttles requests, causing upload failures. Their first instinct is to request an increase to the S3 PUT request rate, but S3 request rates are performance guidelines, not quotas you can increase; this is a common misconception. The real fix is architectural: distribute objects across multiple prefixes so the load is spread out, and retry throttled requests with backoff. For Lambda, the default concurrent execution quota is 1,000 per region. Because they expect 2,000 concurrent invocations, they submit a quota increase request for Lambda concurrent executions to 2,000. AWS approves after reviewing their account history. They also use CloudWatch metrics to monitor Lambda throttles. This scenario highlights that not all limits are quotas; some are performance constraints that require architectural changes.
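One way to distribute objects across multiple prefixes is to derive a shard from a hash of the file name, as in this sketch. The `uploads/` layout, the shard count, and the function name are assumptions for illustration; the per-prefix rate is the figure quoted in the scenario.

```python
import hashlib


def sharded_key(filename: str, shards: int = 16) -> str:
    """Place each object under one of `shards` prefixes, chosen by a stable hash."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    shard = int(digest[:8], 16) % shards
    return f"uploads/{shard:02d}/{filename}"


key = sharded_key("episode-0042.mp4")
aggregate_put_rate = 16 * 3_500   # per-prefix rate multiplied by the number of prefixes
print(key, aggregate_put_rate)
```

Because the hash is deterministic, a reader can recompute the key from the file name alone, so no lookup table is needed to find an object later.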

Scenario 3: Enterprise Multi-Account Governance

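A large enterprise uses AWS Organizations with 50 accounts. Each account has a default quota of 5 VPCs per region. The networking team wants each account to have 10 VPCs for isolation. Contrary to a common belief, the VPC quota is adjustable, so they can request an increase to 10 in each account and region through Service Quotas; a quota request template in Service Quotas can apply the same request automatically to new accounts created in the organization. They also weigh the alternative of fewer VPCs with subnets shared across accounts, which reduces the number of VPCs to manage. The design-around approach becomes mandatory only for quotas that are truly fixed, such as the 15-minute maximum Lambda function timeout: there, no request will help and the workload itself must change. Cost considerations: the quota increase is free, but each additional VPC may bring its own NAT gateways and endpoints, and spreading across regions incurs data transfer costs. Misjudging which quotas are adjustable leads to failed architecture reviews and unnecessary workarounds.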

How CLF-C02 Actually Tests This

What CLF-C02 Tests on Service Quotas and Limits

This objective (3.5) falls under Domain 3: Cloud Technology and Services. The exam focuses on:

Differentiating between soft limits (adjustable) and hard limits (fixed).

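Knowing default quotas for common services: EC2 On-Demand instances (per region and measured in vCPUs; older material cites 5 instances per region), S3 buckets (per account; long cited as 100, with current AWS documentation listing a default of 10,000), Lambda concurrent executions (1,000 per region), VPCs (5 per region), Elastic IPs (5 per region), IAM roles (1,000 per account). All of these defaults are adjustable.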

Understanding that you can request increases for soft limits via Service Quotas console or API.

Recognizing that some quotas are per-region (e.g., EC2 instances) and some are per-account (e.g., S3 buckets).

Knowing that AWS Trusted Advisor checks for quota usage >80% and provides recommendations.

Understanding that hitting a quota causes an API error (e.g., 'QuotaExceeded').

Common Wrong Answers and Why Candidates Choose Them

1. 'All AWS service limits can be increased by submitting a support ticket.' Wrong because hard limits (e.g., the maximum Lambda function timeout) cannot be increased. Candidates assume flexibility is universal.

2. 'Quotas apply per account globally, not per region.' Wrong because most quotas are per-region (e.g., EC2 instances). Candidates confuse these with quotas for global services like IAM.

3. 'AWS automatically increases quotas when you approach them.' Wrong because you must request increases yourself. Candidates think AWS is proactive.

4. 'Service Quotas is a paid add-on.' Wrong because Service Quotas is a free feature available in all AWS accounts.

Specific Terms and Values to Memorize

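Soft limit: Adjustable (e.g., EC2 On-Demand vCPUs, Lambda concurrency, VPCs per region, Elastic IPs).

Hard limit: Fixed (e.g., Lambda function timeout of 900 seconds, 2 access keys per IAM user).

Default EC2 On-Demand limit: per region and measured in vCPUs per instance family group (older material uses a generic 5 instances per region).

Default S3 bucket limit: per account; long cited as 100, with current AWS documentation listing 10,000. Adjustable.

Default VPC limit: 5 per region. Adjustable.

Default IAM role limit: 1,000 per account. Adjustable.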

Default Lambda concurrent executions: 1,000 per region.

Service Quotas console: Centralized place to view and manage quotas.

AWS Trusted Advisor: Checks quota usage and flags services that exceed 80% of a quota.

Tricky Distinctions

Quota vs. Limit: Used interchangeably, but 'quota' is the official term in AWS documentation.

Resource quota vs. API rate limit: Resource quotas cap total resources; API rate limits cap requests per second (e.g., DynamoDB throughput).

Per-account vs. per-region: Most quotas are per-region; some are per-account (e.g., S3 buckets).

Decision Rule for Multiple-Choice Questions

If a question asks about increasing a limit, first determine if it's a soft or hard limit. If it's a hard limit, the answer cannot be 'request an increase'. Instead, the answer is 'redesign the architecture' or 'use multiple regions'. If it's a soft limit, the answer is 'use Service Quotas to request an increase'. Also, remember that Trusted Advisor is the tool for monitoring quota usage, not for increasing quotas.
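The decision rule above can be written as a small function, shown here purely as a memory aid. The function name and the returned strings are hypothetical.

```python
def recommended_action(adjustable: bool, needed: float, applied: float) -> str:
    """Apply the exam decision rule: check need first, then adjustability."""
    if needed <= applied:
        return "no action needed"
    if adjustable:
        return "request an increase through Service Quotas"
    return "redesign the architecture"


print(recommended_action(adjustable=True, needed=2000, applied=1000))
print(recommended_action(adjustable=False, needed=20, applied=15))
```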

Key Takeaways

AWS Service Quotas are the maximum number of resources or operations allowed per account/region.

Soft limits (adjustable) can be increased via Service Quotas console; hard limits cannot.

Default EC2 On-Demand quotas are per region and measured in vCPUs; the S3 bucket quota is per account (default long cited as 100, now 10,000).

Default Lambda concurrent execution limit is 1,000 per region; default VPC limit is 5 per region.

AWS Trusted Advisor monitors quota usage and alerts when exceeding 80%.

Quota increase requests are free and often approved automatically within minutes.

Rate-based quotas (e.g., API requests per second) are different from resource quotas.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Soft Limits (Adjustable Quotas)

Can be increased by submitting a request via Service Quotas console or API.

Examples: EC2 On-Demand instances, Lambda concurrent executions, S3 buckets.

AWS automatically approves many increases within minutes.

Default values are set low to prevent accidental overuse.

Used for resources that can scale with demand.

Hard Limits (Fixed Quotas)

Cannot be increased; they are architectural constraints.

Examples: Lambda function timeout (900 seconds), access keys per IAM user (2).

No increase request is possible, even with a support ticket.

Values reflect how the service is designed rather than your account's usage history.

Requires redesign to work around (e.g., split a long-running job into shorter steps).

Watch Out for These

Mistake

All AWS service limits can be increased by submitting a support request.

Correct

Only soft limits can be increased. Hard limits (e.g., the 900-second Lambda function timeout or 2 access keys per IAM user) have a fixed maximum that cannot be changed, even with a support ticket.

Mistake

Service quotas apply globally across all regions.

Correct

Most service quotas are per-region. For example, EC2 instance limits are separate for each region. Some quotas are per-account instead: the S3 bucket quota counts buckets across all regions in the account, and IAM quotas apply account-wide because IAM is a global service.

Mistake

AWS automatically increases quotas when your usage approaches the limit.

Correct

AWS does not automatically increase quotas. You must proactively request an increase through the Service Quotas console or API. However, Trusted Advisor can alert you when usage exceeds 80%.

Mistake

Service Quotas is a paid feature only available with Business or Enterprise support plans.

Correct

Service Quotas is a free feature available to all AWS accounts, regardless of support plan. You can view and request quota increases without any additional cost.

Mistake

The default of 5 VPCs per region is a hard limit that can never be increased.

Correct

The default VPC quota per region is 5, but it is adjustable: you can request an increase through Service Quotas. Architectures such as shared VPCs remain useful for reducing the number of VPCs you manage, but they are a design choice, not the only option.

Frequently Asked Questions

What is the difference between a soft limit and a hard limit in AWS?

A soft limit is a default quota that you can increase by submitting a request to AWS (e.g., EC2 On-Demand vCPUs, Lambda concurrency, VPCs per region, Elastic IPs per region). A hard limit is a fixed maximum that cannot be increased (e.g., the 900-second Lambda function timeout or 2 access keys per IAM user). Soft limits are adjustable to accommodate growth, while hard limits require architectural changes if exceeded. On the exam, remember that you cannot increase hard limits even with a support ticket.

How do I request a quota increase in AWS?

You can request a quota increase through the Service Quotas console, AWS CLI, or API. In the console, navigate to the service, select the quota, click 'Request increase', enter the desired value and a use case description. For CLI: `aws service-quotas request-service-quota-increase --service-code <code> --quota-code <code> --desired-value <value>`. AWS automatically approves many requests; others may require manual review. The process is free.

Can I increase the number of VPCs per region?

Yes. The default VPC quota per region is 5, but it is an adjustable quota, and you can request an increase through Service Quotas. Study material often misstates this as a fixed limit. If you would rather not manage many VPCs, a shared VPC architecture is an alternative, but it is a design choice rather than a requirement.

What happens when I exceed an AWS service quota?

When you attempt to create a resource or perform an operation that exceeds a quota, AWS returns an error such as 'QuotaExceeded' or 'LimitExceeded'. The operation fails. For example, if you try to launch an EC2 instance beyond your instance quota, you get an error. You must then either reduce usage or request a quota increase (if it's a soft limit).

Does AWS automatically increase quotas when I need them?

No, AWS does not automatically increase quotas. You must proactively request an increase. However, you can use AWS Trusted Advisor to monitor your quota usage and get recommendations when usage exceeds 80%. This helps you plan ahead. The exam tests that you know you have to request increases manually.

What is the default S3 bucket limit per AWS account?

The S3 bucket quota is per AWS account. The default was 100 buckets for many years, and current AWS documentation lists a default of 10,000. It is a soft limit, so if you need more you can request an increase through Service Quotas. Note that the bucket quota applies to the entire account, not per region.

How can I monitor my AWS quota usage?

You can monitor quota usage using the Service Quotas console, which shows current usage vs. quota for each service. You can also set CloudWatch alarms on usage metrics (e.g., `AWS/Usage` namespace) to get notified when you approach a quota. Additionally, AWS Trusted Advisor provides quota usage checks and alerts when usage exceeds 80%. These tools help you avoid hitting limits unexpectedly.
