AZ-900 · Chapter 6 of 127 · Objective 1.5

Benefits of Cloud Computing

This chapter covers the core benefits of cloud computing as tested on the AZ-900 exam. Understanding these benefits is critical because they form the foundation for why organizations migrate to the cloud. The Cloud Concepts domain carries approximately 20-25% of the exam weight, and objective 1.5 specifically asks you to describe the benefits of cloud computing. By the end of this chapter, you will be able to articulate the key advantages: high availability, scalability, elasticity, agility, fault tolerance, disaster recovery, and cost savings. You will also learn how Azure implements these benefits and how they differ from on-premises infrastructure.

25 min read

Beginner

Updated May 31, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security


Renting a Fleet vs. Owning One Van

Imagine you run a delivery business. If you own a single delivery van, you pay for the van whether it's on the road or parked. You also pay for maintenance, insurance, and a dedicated garage. When demand spikes, say during the holidays, you can't deliver more than one van's worth of packages. You lose sales. To handle spikes, you'd have to buy more vans, which sit idle most of the year.

Now imagine you rent vans from a fleet company. You pay only for the vans you use, by the hour or mile. When demand spikes, you rent 50 vans instantly. When demand drops, you return them and pay nothing. The fleet company handles maintenance, insurance, and parking. They have thousands of vans in depots worldwide, so you can deliver anywhere.

This is exactly how cloud computing works. Instead of buying and maintaining physical servers (your own van), you rent compute, storage, and networking from a cloud provider like Azure. You scale up or down instantly, pay only for what you use, and the provider handles hardware failures, security patches, and capacity planning.

The mechanism: Azure's global network of data centers acts as the fleet depots. Virtualization software (hypervisors) lets Azure slice physical servers into many virtual machines, just as the fleet company can assign any van to any driver. Metering tracks your usage per second, and billing aggregates it into a monthly invoice. You never see the physical server, only the virtual resources you provision via the portal or CLI.

How It Actually Works

What Are the Benefits of Cloud Computing?

Cloud computing provides on-demand access to computing resources—servers, storage, databases, networking, software, and analytics—over the internet. Instead of owning and maintaining physical data centers, you rent resources from a cloud provider like Microsoft Azure. The benefits fall into two categories: operational and economic. Operationally, the cloud offers high availability, scalability, elasticity, agility, fault tolerance, and disaster recovery. Economically, it shifts capital expenditure (CapEx) to operating expenditure (OpEx), reduces total cost of ownership (TCO), and provides a consumption-based pricing model.

How Cloud Benefits Work in Azure

Azure delivers these benefits through a global network of data centers, virtualization, and software-defined networking. Here’s the mechanism step by step:

High Availability (HA): Azure ensures your applications stay running even if hardware fails. HA is achieved through redundancy—multiple copies of data across availability zones (physically separate data centers within a region) and availability sets (groups of VMs in different fault domains and update domains). Azure’s Service Level Agreement (SLA) for VMs in an availability set is 99.95% uptime; for VMs in multiple availability zones, it’s 99.99%. The SLA is a contractual uptime guarantee—if Azure fails to meet it, you get service credits.

Scalability: The ability to increase resources as demand grows. Azure supports vertical scaling (increasing the size of a VM—more CPU, RAM) and horizontal scaling (adding more VMs or instances). Azure Virtual Machine Scale Sets allow you to automatically create and manage a group of load-balanced VMs. You set rules based on metrics like CPU usage or queue length; Azure adds or removes VMs accordingly.

Elasticity: The ability to scale both up and down automatically in response to real-time demand. Elasticity is broader than scalability—it includes scaling down when demand drops to avoid paying for idle resources. Azure Autoscale works with VM Scale Sets, App Service, and Azure Functions. For example, an e-commerce site can configure autoscale to add instances during Black Friday and remove them after the sale ends.

Agility: The speed at which you can provision and deploy resources. In on-premises, ordering a new server takes weeks (procurement, shipping, racking, cabling, OS install). In Azure, you can spin up a VM in minutes via the portal, CLI, or ARM templates. This agility enables rapid experimentation and faster time-to-market.

Fault Tolerance: The ability to continue operating without interruption despite component failures. Azure achieves fault tolerance through redundancy, automatic failover, and load balancing. For example, Azure SQL Database automatically replicates data to a secondary replica; if the primary fails, Azure fails over to the secondary with minimal downtime.

Disaster Recovery (DR): The ability to recover from catastrophic failures (e.g., entire region goes down). Azure Site Recovery replicates workloads from a primary site to a secondary Azure region. You can test DR drills without impacting production. Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are key metrics—Azure allows RTOs of minutes and RPOs of seconds.

Cost Benefits: Cloud shifts CapEx (buying servers, software licenses, data center real estate) to OpEx (paying for what you use). The consumption-based model means no upfront costs. Azure also offers reserved instances (commit to 1 or 3 years for discounts of up to 72%) and Azure Hybrid Benefit (reuse existing Windows Server and SQL Server licenses). The TCO Calculator helps compare on-premises and cloud costs.
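The trade-off between pay-as-you-go and a reservation comes down to simple arithmetic. The sketch below is illustrative only: the hourly rate is a made-up number, and the 72% figure is the upper bound quoted above, not a price for any specific VM size.

```python
# Illustrative comparison of pay-as-you-go vs. reserved pricing.
# HOURLY_RATE is a hypothetical price; DISCOUNT uses the "up to 72%" upper bound.
HOURS_PER_YEAR = 8760
HOURLY_RATE = 0.10
DISCOUNT = 0.72

def annual_cost_payg(hourly_rate: float, utilization: float) -> float:
    """Pay-as-you-go: billed only for the fraction of the year the VM runs."""
    return hourly_rate * HOURS_PER_YEAR * utilization

def annual_cost_reserved(hourly_rate: float, discount: float) -> float:
    """Reservation: billed for every hour of the year at the discounted rate."""
    return hourly_rate * HOURS_PER_YEAR * (1 - discount)

def break_even_utilization(discount: float) -> float:
    """Utilization above which the reservation becomes the cheaper option."""
    return 1 - discount

for utilization in (0.25, 0.50, 1.00):
    payg = annual_cost_payg(HOURLY_RATE, utilization)
    reserved = annual_cost_reserved(HOURLY_RATE, DISCOUNT)
    cheaper = "pay-as-you-go" if payg < reserved else "reserved"
    print(f"utilization {utilization:.0%}: cheaper option is {cheaper}")
```

Under these assumptions the reservation only pays off when the VM runs more than about 28% of the year, which is why reservations suit steady workloads and pay-as-you-go suits spiky ones.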

Key Components and Tiers

Regions and Availability Zones: Azure has 60+ regions worldwide. Not every region offers availability zones; regions that do have a minimum of 3. Zones are physically separate locations within a region, each with independent power, cooling, and networking. This is the foundation for HA and DR.

Resource Groups: Logical containers for Azure resources. They help manage costs, apply policies, and organize resources by lifecycle.

Azure Resource Manager (ARM): The management layer that enables consistent deployment and management via templates, RBAC, and tags.

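SLA Tiers: Azure SLAs vary by service and configuration. Free tiers generally carry no SLA, while some multi-region deployments reach 99.999%. For AZ-900, know the VM SLAs: 99.9% for a single VM that uses premium SSD or Ultra Disk for all disks, 99.95% for two or more VMs in an availability set, and 99.99% for two or more VMs spread across availability zones.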

Comparison to On-Premises

On-premises: You buy hardware, wait weeks for delivery, install, configure, and maintain. Capacity is fixed—you either overprovision (wasting money) or underprovision (losing customers). Hardware failures require manual replacement. Scaling requires new purchases. Disaster recovery means a second data center.

Cloud: Resources are on-demand, pay-as-you-go. You can scale globally in minutes. Azure handles hardware failures automatically. DR is built-in with geo-replication. Agility is orders of magnitude higher.

Azure Portal and CLI Touchpoints

Portal: Navigate to "Create a resource" > "Compute" > "Virtual machine". Under "Availability options" you choose an availability zone or an availability set. Autoscale is not a setting on a single VM; you configure it on a Virtual Machine Scale Set, in the scale set's scaling settings.

Azure CLI:

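az vm create --resource-group MyRG --name MyVM --image Ubuntu2204 --size Standard_DS1_v2 --zone 1
az vmss create --resource-group MyRG --name MyScaleSet --image Ubuntu2204 --instance-count 3
# Autoscale is a separate Azure Monitor setting attached to the scale set
az monitor autoscale create --resource-group MyRG --resource MyScaleSet --resource-type Microsoft.Compute/virtualMachineScaleSets --name MyAutoscale --min-count 1 --max-count 10 --count 3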

PowerShell:

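New-AzVm -ResourceGroupName MyRG -Name MyVM -Location eastus -Zone 1
New-AzVmss -ResourceGroupName MyRG -VMScaleSetName MyScaleSet -Location eastus -InstanceCount 3
# New-AzVmss has no -Autoscale switch; autoscale rules are defined separately (for example with New-AzAutoscaleSetting)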

Concrete Business Scenarios

Startup with unpredictable growth: A mobile app startup uses Azure App Service with autoscale. During a viral launch, traffic spikes 1000x—Azure scales from 1 to 100 instances automatically. When traffic dies down, it scales back to 1. They pay only for the hours each instance runs.

Global e-commerce: A retailer deploys VMs across three availability zones in each of several regions. If one zone fails, traffic is routed to healthy zones. They use Azure Traffic Manager for global load balancing. Spreading VMs across zones qualifies for the 99.99% VM SLA, and the additional regions protect against a regional outage.

Financial services DR: A bank uses Azure Site Recovery to replicate on-premises VMs to Azure. They run quarterly DR drills with zero impact on production. During a real disaster, they fail over to Azure in minutes.

Walk-Through

Understand High Availability

High availability means your application remains accessible despite failures. In Azure, you design for HA by distributing resources across availability zones (physically separate data centers within a region). Each zone has independent power, cooling, and networking. For VMs, you place them in an availability set (two or more VMs in different fault domains and update domains) or across zones. Azure guarantees 99.9% uptime for a single VM that uses premium SSD or Ultra Disk for all disks, 99.95% for VMs in an availability set, and 99.99% for VMs spread across availability zones. Behind the scenes, the Azure platform monitors host health; if the underlying hardware fails, the VM is automatically restarted on a healthy host. For databases, Azure SQL Database has built-in HA with automatic failover to a secondary replica. The key exam point: HA is about uptime, not just redundancy. Redundancy is the mechanism; HA is the outcome.
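It can help to translate SLA percentages into the downtime they still allow. The following is a minimal sketch of that arithmetic; the 30-day month is a simplifying assumption, and the function name is made up for illustration.

```python
# Convert an SLA percentage into the downtime it still permits.
MINUTES_PER_DAY = 24 * 60

def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime in minutes that still meets the SLA over `days` days."""
    return (1 - sla_percent / 100) * days * MINUTES_PER_DAY

for sla in (99.95, 99.99, 99.999):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime per 30 days")
```

For a 30-day month this works out to roughly 21.6 minutes at 99.95%, 4.3 minutes at 99.99%, and 0.4 minutes at 99.999%, which shows how much each extra "nine" tightens the budget.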

Leverage Scalability

Scalability is the ability to increase resources to handle growth. Azure offers two types: vertical (scaling up) and horizontal (scaling out). Vertical scaling means increasing the VM size (e.g., from Standard_DS1_v2 to Standard_DS3_v2) to get more CPU/RAM. Horizontal scaling means adding more VMs or instances. Use Azure Virtual Machine Scale Sets for horizontal scaling. You define a scaling rule: for example, if average CPU > 75% for 5 minutes, add 1 VM. Azure monitors metrics via Azure Monitor and executes the rule. For databases, Azure SQL Database supports scaling up the service tier (DTU or vCore) with minimal downtime. The exam tests that scalability is a planned increase, while elasticity is automatic and bidirectional.
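For planned horizontal scaling, a rough sizing calculation is often the starting point. The sketch below assumes you know how many requests per second one instance can serve at full load; that capacity figure and the function name are hypothetical, and the 75% target simply mirrors the CPU threshold in the rule above.

```python
import math

def instances_needed(peak_rps: float, rps_per_instance: float,
                     target_utilization: float = 0.75) -> int:
    """Instances required so that each one stays at or below the target utilization."""
    usable_rps = rps_per_instance * target_utilization
    return max(1, math.ceil(peak_rps / usable_rps))

# Hypothetical workload: 1,200 requests/s at peak, 200 requests/s per instance.
print(instances_needed(peak_rps=1200, rps_per_instance=200))
```

With these example numbers each instance is planned for 150 requests per second, so the scale set needs 8 instances to carry the expected peak.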

Implement Elasticity

Elasticity is the ability to automatically scale resources up and down in real time based on demand. This is a key benefit of cloud over on-premises. In Azure, you configure Autoscale for VM Scale Sets, App Service, or Azure Functions. You set a minimum and maximum instance count, and define rules based on metrics like CPU, memory, or queue length. For example, an e-commerce site might have a rule: if CPU > 70% for 10 minutes, increase count by 2; if CPU < 30% for 10 minutes, decrease by 1. The Autoscale engine evaluates rules roughly every 30 to 60 seconds, depending on the resource type. A common exam trap: elasticity does not mean unlimited resources. Subscriptions have quotas (for example, a default regional vCPU quota) that you can raise through a quota increase request.
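The example rule above can be sketched as a small simulation. This is a simplified model, not Azure's Autoscale engine: it ignores cooldown periods and other safeguards, and every class and function name here is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class AutoscaleProfile:
    minimum: int = 1
    maximum: int = 10
    scale_out_above: float = 70.0  # average CPU % that triggers scale-out
    scale_in_below: float = 30.0   # average CPU % that triggers scale-in
    scale_out_step: int = 2
    scale_in_step: int = 1

def next_instance_count(current: int, avg_cpu: float, profile: AutoscaleProfile) -> int:
    """Apply one rule evaluation and clamp the result to the min/max bounds."""
    if avg_cpu > profile.scale_out_above:
        desired = current + profile.scale_out_step
    elif avg_cpu < profile.scale_in_below:
        desired = current - profile.scale_in_step
    else:
        desired = current
    return max(profile.minimum, min(profile.maximum, desired))

def simulate(cpu_windows, profile: AutoscaleProfile):
    """Each entry in cpu_windows is the average CPU over one 10-minute window."""
    count = profile.minimum
    history = []
    for avg_cpu in cpu_windows:
        count = next_instance_count(count, avg_cpu, profile)
        history.append(count)
    return history

print(simulate([85, 90, 80, 50, 20, 20, 10], AutoscaleProfile(maximum=6)))
```

The instance count climbs in steps of 2 while CPU stays high, stops at the configured maximum of 6, and then steps back down by 1 as load falls. The maximum is the part that makes "elastic" different from "unlimited".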

Achieve Agility

Agility is the speed at which you can provision and deploy resources. In on-premises, provisioning a server takes weeks. In Azure, you can create a VM in under 5 minutes via the portal, CLI, or ARM template. Agility enables rapid experimentation—developers can spin up test environments, run tests, and tear them down without waiting. Azure Resource Manager templates allow you to deploy complex multi-tier applications consistently. For example, a team can deploy a web app, database, and load balancer with one template. Agility also supports DevOps practices like CI/CD. The exam tests that agility reduces time-to-market and enables faster innovation.

Ensure Fault Tolerance

Fault tolerance means the system continues operating even when components fail. Azure achieves this through redundancy and automatic failover. For compute, availability sets and zones ensure that if a server or rack fails, your workload keeps running on VMs in another fault domain or zone. For storage, Azure Storage keeps multiple copies of your data: within a single data center (LRS), across availability zones (ZRS), or in a second region (GRS, RA-GRS). For databases, Azure SQL Database uses built-in HA with automatic failover to a secondary replica in the same region; failover to a different region requires configuring geo-replication or a failover group. Load balancers distribute traffic to healthy instances. The key concept: fault tolerance is about designing for failure. Assume components will fail and build in redundancy. The exam may ask: which service provides fault tolerance for VMs? Answer: Availability sets or availability zones.
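A short calculation shows why redundancy is the core of fault tolerance. The sketch below assumes replicas fail independently of each other, which real systems only approximate, and the availability figures are example values rather than Azure SLA numbers.

```python
def availability_of_redundant(single: float, replicas: int) -> float:
    """Probability that at least one of `replicas` independent copies is up."""
    return 1 - (1 - single) ** replicas

def availability_of_chain(*components: float) -> float:
    """Probability that every component in a serial dependency chain is up."""
    result = 1.0
    for availability in components:
        result *= availability
    return result

# Two replicas that are each up 99% of the time (example value).
web_tier = availability_of_redundant(0.99, replicas=2)
# The web tier depends on a database that is up 99.95% of the time (example value).
whole_app = availability_of_chain(web_tier, 0.9995)
print(f"web tier: {web_tier:.4%}, whole app: {whole_app:.4%}")
```

Adding a second replica lifts the web tier from 99% to about 99.99%, while chaining it to a dependency pulls the overall figure back down. Redundancy raises availability; every extra dependency in series lowers it.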

Plan Disaster Recovery

Disaster recovery (DR) is the ability to recover from a catastrophic failure that takes down an entire region. Azure Site Recovery (ASR) is the primary DR service. You replicate on-premises VMs or Azure VMs from a primary region to a secondary region. You can run DR drills without affecting production—this is called a test failover. Recovery Time Objective (RTO) is the target time to restore service (e.g., 1 hour). Recovery Point Objective (RPO) is the target maximum data loss (e.g., 15 minutes). Azure Site Recovery supports RTOs of minutes and RPOs of seconds for some workloads. The exam tests that DR is different from backup—backup is for accidental deletion or corruption; DR is for full-site failure.
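RTO and RPO are easiest to understand as two time differences measured around an outage. The following sketch checks a recovery against the example targets above (RTO of 1 hour, RPO of 15 minutes); the timestamps and the function name are hypothetical.

```python
from datetime import datetime, timedelta

def evaluate_recovery(last_replicated: datetime, outage_start: datetime,
                      service_restored: datetime,
                      rto: timedelta, rpo: timedelta) -> dict:
    """Compare actual downtime and data loss against the RTO and RPO targets."""
    data_loss = outage_start - last_replicated   # work done since the last recovery point
    downtime = service_restored - outage_start   # how long the service was unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }

result = evaluate_recovery(
    last_replicated=datetime(2026, 1, 10, 9, 50),
    outage_start=datetime(2026, 1, 10, 10, 0),
    service_restored=datetime(2026, 1, 10, 10, 45),
    rto=timedelta(hours=1),
    rpo=timedelta(minutes=15),
)
print(result)
```

In this made-up incident, 10 minutes of data is lost and the service is down for 45 minutes, so both targets are met. RPO looks backward from the failure; RTO looks forward from it.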

What This Looks Like on the Job

Scenario 1: E-commerce Platform Handling Black Friday Traffic

A large online retailer runs its e-commerce platform on Azure VMs behind a load balancer. They use Virtual Machine Scale Sets with Autoscale configured to add instances when CPU exceeds 70% for 5 minutes. During Black Friday, traffic surges 50x. Autoscale adds hundreds of VMs automatically. The load balancer distributes traffic evenly. Azure's global infrastructure ensures low latency. Without cloud, the retailer would have to overprovision servers that sit idle 99% of the year. The cost benefit is enormous: they pay only for the extra VMs during the surge. A common mistake: setting the autoscale max too low, causing throttling. They use Azure Monitor to track performance and adjust rules. This scenario tests your understanding of elasticity and cost savings.
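The cost argument in Scenario 1 can be made concrete with instance-hours. All numbers below are assumptions chosen for illustration: a baseline of 4 instances, a 50x surge lasting 4 days, and a made-up hourly rate.

```python
HOURS_PER_YEAR = 8760
HOURLY_RATE = 0.10   # hypothetical price per instance-hour
BASELINE = 4         # instances needed on a normal day (assumption)
SURGE_HOURS = 96     # a 4-day peak period (assumption)
PEAK = BASELINE * 50 # the 50x surge from the scenario

def fixed_capacity_cost(peak_instances: int, hours: int, hourly_rate: float) -> float:
    """Fixed capacity: pay for peak-sized capacity every hour of the year."""
    return peak_instances * hours * hourly_rate

def elastic_cost(demand_by_hour, hourly_rate: float) -> float:
    """Elastic capacity: pay only for the instances running in each hour."""
    return sum(demand_by_hour) * hourly_rate

demand = [BASELINE] * (HOURS_PER_YEAR - SURGE_HOURS) + [PEAK] * SURGE_HOURS
fixed = fixed_capacity_cost(PEAK, HOURS_PER_YEAR, HOURLY_RATE)
elastic = elastic_cost(demand, HOURLY_RATE)
print(f"elastic spend is {elastic / fixed:.1%} of the fixed-capacity spend")
```

Under these assumptions the elastic deployment costs about 3% of what peak-sized fixed capacity would cost. The exact ratio depends entirely on how short and how tall the spike is.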

Scenario 2: Healthcare Application with Compliance Requirements

A healthcare provider hosts a patient portal that must be highly available and compliant with HIPAA. They deploy VMs across two availability zones in the East US region. Azure SQL Database is configured with geo-redundant backup to West US. If a zone fails, traffic fails over to the other zone automatically. They use Azure Policy to enforce encryption and access controls. Azure's SLA for multi-zone VMs is 99.99%, and they monitor compliance with Microsoft Defender for Cloud (formerly Azure Security Center). The challenge: ensuring the application is stateless so failover works seamlessly. They also use Azure Site Recovery for full region failover. This scenario tests high availability, fault tolerance, and disaster recovery.

Scenario 3: Startup Rapidly Iterating on a Mobile App

A startup develops a mobile game that gains viral traction. They use Azure App Service (PaaS) with autoscaling. They deploy new versions multiple times a day using deployment slots (staging vs. production). Azure's agility lets them spin up a new backend feature in hours, not weeks. They use Azure Functions for serverless background tasks (e.g., sending push notifications). The Consumption plan means they pay only when code runs. A pitfall: on the Consumption plan a function app scales to zero when idle, so the first request after a quiet period incurs a cold start. For latency-sensitive paths, they learn to keep a minimum of 1 instance running. This scenario tests agility, PaaS benefits, and serverless cost models.

How AZ-900 Actually Tests This

AZ-900 Objective 1.5: Describe the benefits of cloud computing

This objective explicitly asks you to describe the following benefits: high availability, scalability, elasticity, agility, fault tolerance, disaster recovery, and cost savings. The exam will present scenarios and ask which benefit is being demonstrated.

Common Wrong Answers and Why

1. "Scalability" vs. "Elasticity": Many candidates choose "scalability" when the scenario describes automatic scaling down. Remember: scalability is the ability to increase resources (planned), elasticity is the ability to automatically scale up AND down (unplanned). If the scenario says "automatically adds and removes VMs based on demand," the answer is elasticity, not scalability.
2. "High availability" vs. "Fault tolerance": HA ensures uptime via redundancy; fault tolerance means the system continues despite failure. The exam may describe a system that continues operating after a server failure; that is fault tolerance. If it describes an SLA guarantee, that is high availability.
3. "Disaster recovery" vs. "Backup": DR is about full-site recovery (e.g., region failure), backup is about restoring individual files or databases. The exam often uses "recover from a catastrophic failure" to indicate DR.
4. "Agility" vs. "Scalability": Agility is about speed of provisioning, not amount. If the scenario says "provision resources in minutes," it's agility.

Specific Terms and Values

- VM SLA percentages: 99.9% (single VM on premium SSD or Ultra Disk), 99.95% (availability set), 99.99% (availability zones). Some multi-region service configurations reach 99.999%.
- Availability zones: minimum 3 in each region that supports them.
- Azure Site Recovery: RTO of minutes, RPO of seconds (for some workloads).
- Consumption-based model: pay only for what you use.

Edge Cases

- A scenario where a company uses reserved instances: this is a cost benefit (lower cost) but reduces elasticity because you prepay. The exam may ask which benefit is reduced: elasticity.
- A scenario with a single VM: it can still have high availability if it's in an availability zone with automatic restart, but the SLA is lower.

Memory Trick

Use the acronym "SHAFED C" for the benefits: Scalability, High availability, Agility, Fault tolerance, Elasticity, Disaster recovery, Cost savings. Or remember: "Cloud gives you SHAFED C benefits."

Key Takeaways

High availability is achieved through redundancy across availability zones or sets; Azure SLA for multi-zone VMs is 99.99%.

Scalability is the ability to increase resources; elasticity is automatic scaling both up and down based on demand.

Agility means provisioning resources in minutes via the portal, CLI, or ARM templates, enabling faster time-to-market.

Fault tolerance ensures continued operation despite component failures; Azure uses redundancy and automatic failover.

Disaster recovery via Azure Site Recovery replicates workloads to a secondary region; RTO can be minutes, RPO seconds.

Cost benefits include consumption-based pricing, no upfront CapEx, and reserved instance discounts up to 72%.

The acronym SHAFED C helps remember the benefits: Scalability, High availability, Agility, Fault tolerance, Elasticity, Disaster recovery, Cost savings.

On-premises requires overprovisioning or risks underprovisioning; cloud offers on-demand resources that match actual usage.

Azure Resource Manager templates enable consistent, repeatable deployments for agility and governance.

Understanding the difference between scalability and elasticity is a common exam trap—elasticity includes automatic scale-in.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Scalability

Planned increase in capacity (e.g., adding more VMs for expected growth).

Can be vertical (scale up) or horizontal (scale out).

Often manual or scheduled (e.g., scale out during business hours).

Does not automatically reduce resources when demand drops.

Example: Adding 5 VMs to a scale set for a marketing campaign.

Elasticity

Automatic adjustment of capacity based on real-time demand.

Includes both scale out and scale in (up and down).

Uses autoscale rules based on metrics (CPU, queue length).

Optimizes cost by reducing resources when not needed.

Example: Autoscale adds VMs during a traffic spike and removes them when traffic subsides.

High Availability

Ensures application remains accessible during localized failures (server, rack).

Uses redundancy within a region (availability zones, sets).

Recovery time is seconds to minutes (automatic failover).

Protects against single point of failure, not full region outage.

SLA-driven (e.g., 99.99% uptime).

Disaster Recovery

Ensures recovery from catastrophic failures (region outage, natural disaster).

Uses replication to a secondary region (Azure Site Recovery).

Recovery time is minutes to hours (manual or automated failover).

Protects against full site loss.

RTO and RPO are key metrics (e.g., RTO 1 hour, RPO 15 minutes).

Watch Out for These

Mistake

Cloud computing is always cheaper than on-premises.

Correct

Cloud can be cheaper, but not always. For predictable, steady-state workloads, on-premises may be cheaper when the hardware is fully utilized, although reserved instances narrow that gap by discounting long-term cloud commitments. Cloud is most cost-effective for variable workloads. Use the Azure TCO Calculator to compare.

Mistake

High availability means 100% uptime.

Correct

No cloud provider guarantees 100% uptime. Azure SLAs are typically 99.9% to 99.99%. Even with multiple availability zones, there is a small chance of simultaneous failure. HA is about minimizing downtime, not eliminating it.

Mistake

Elasticity and scalability are the same thing.

Correct

Scalability is the ability to increase resources (up or out). Elasticity includes automatic scaling in both directions (up and down). Elasticity = scalability + automatic management + bidirectional.

Mistake

Disaster recovery is the same as backup.

Correct

Backup protects against data loss (accidental deletion, corruption). Disaster recovery protects against full-site failure (e.g., region outage). DR involves replicating entire workloads and failing over. Azure Backup and Azure Site Recovery are different services.

Mistake

Fault tolerance means no single point of failure.

Correct

Fault tolerance aims to remove single points of failure, but no design eliminates every one. The goal is to design so that no single component failure brings down the system. Redundancy is key, but you must also handle software bugs and configuration errors.

Frequently Asked Questions

What is the difference between high availability and fault tolerance in Azure?

High availability (HA) is a measure of uptime, typically expressed as an SLA percentage (e.g., 99.99%). It is achieved through redundancy and automatic failover. Fault tolerance is the ability of a system to continue operating without interruption when a component fails. HA focuses on minimizing downtime, while fault tolerance focuses on maintaining functionality during failures. In Azure, HA is often provided by availability zones, while fault tolerance is a design principle that includes redundancy and graceful degradation. For the exam, remember: HA is about uptime guarantees; fault tolerance is about resilience to failure.

How does Azure Autoscale work?

Azure Autoscale automatically adjusts the number of instances (e.g., VMs in a scale set, App Service plan instances) based on predefined rules. You define a minimum and maximum instance count, and rules based on metrics like CPU, memory, disk queue, or HTTP queue length. For example, a rule might say: if average CPU > 75% for 10 minutes, increase instance count by 1. Autoscale evaluates rules roughly every 30 to 60 seconds, depending on the resource type, and scales out or in accordingly. It uses Azure Monitor to collect metrics. Autoscale helps achieve elasticity: scaling out during demand spikes and scaling in to save costs. Key exam point: Autoscale is for scaling out/in, not scaling up/down (vertical scaling is manual or requires changing VM size).

What is the difference between a fault domain and an update domain in Azure?

Fault domains represent a group of hardware that shares a common power source and network switch. If a fault domain fails (e.g., power outage), all VMs in that domain go down. Update domains are groups of VMs that are updated together during planned maintenance. Azure updates one update domain at a time to ensure only a subset of VMs are offline. In an availability set, you place VMs across multiple fault domains (up to 3) and update domains (up to 20). This ensures that during a hardware failure or maintenance, at least one VM remains running. For the exam, remember: fault domains protect against hardware failure; update domains protect against maintenance downtime.
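A toy placement model makes the two kinds of domain easier to picture. The sketch below spreads VMs round-robin across domains; that assignment scheme and the choice of 5 update domains are assumptions for illustration, not a description of how Azure actually places VMs.

```python
def place_vms(vm_count: int, fault_domains: int = 3, update_domains: int = 5):
    """Return (name, fault_domain, update_domain) for each VM, assigned round-robin."""
    return [(f"vm{i}", i % fault_domains, i % update_domains) for i in range(vm_count)]

def survivors_after_fault(placement, failed_fault_domain: int):
    """VMs still running when one fault domain loses power or networking."""
    return [name for name, fd, _ in placement if fd != failed_fault_domain]

def survivors_during_update(placement, updating_update_domain: int):
    """VMs still running while one update domain is rebooted for maintenance."""
    return [name for name, _, ud in placement if ud != updating_update_domain]

placement = place_vms(6)
print(survivors_after_fault(placement, failed_fault_domain=0))
print(survivors_during_update(placement, updating_update_domain=0))
```

With 6 VMs in this model, losing fault domain 0 or patching update domain 0 takes 2 VMs offline each time and leaves 4 running. A single VM, by contrast, always sits in exactly one fault domain and one update domain, so either event takes it down.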

How does Azure achieve cost savings compared to on-premises?

Azure offers a consumption-based pricing model where you pay only for the resources you use (compute hours, storage GB, etc.). This eliminates upfront capital expenditure (CapEx) for hardware, software licenses, and data center facilities. You also save on operational costs like power, cooling, and staff for maintenance. Azure provides reserved instances (1 or 3 years) for up to 72% discount, and hybrid benefit if you have existing Windows Server or SQL Server licenses. The Total Cost of Ownership (TCO) calculator helps compare on-premises vs. cloud costs. However, for steady-state workloads, on-premises might be cheaper if fully utilized. The exam tests that cloud shifts CapEx to OpEx and reduces TCO for variable workloads.

What is the difference between Azure Backup and Azure Site Recovery?

Azure Backup is a service that backs up data (files, folders, VMs, databases) to Azure. It protects against accidental deletion, corruption, or ransomware. You can restore individual files or entire VMs. Azure Site Recovery (ASR) is a disaster recovery service that replicates entire workloads (VMs, apps) from a primary site to a secondary Azure region. It protects against full-site failures (e.g., region outage). ASR enables failover and failback with defined RTO and RPO. Key difference: Backup is for data recovery; DR is for full infrastructure recovery. They are complementary—you should use both for comprehensive protection.

Can I achieve 100% uptime in Azure?

No. Azure's SLAs are less than 100% (e.g., 99.99% for multi-zone VMs). Even with multiple regions, there is always a small probability of simultaneous failures. 100% uptime is not guaranteed because of factors like software bugs, human error, or natural disasters. Azure compensates with service credits if SLA is not met. For critical applications, design for failure using redundancy, but accept that zero downtime is impossible. The exam tests that you understand SLA percentages and that HA is about minimizing downtime, not eliminating it.

What is the Azure TCO Calculator and how is it used?

The Azure Total Cost of Ownership (TCO) Calculator is a web tool that compares the cost of running your on-premises infrastructure versus migrating to Azure. You input details about your current environment (servers, storage, networking, databases, licenses). The calculator estimates the cost of equivalent Azure resources, factoring in electricity, IT labor, and hardware maintenance. It then shows a side-by-side comparison and potential savings. It also accounts for reserved instances and hybrid benefit. The TCO Calculator is used for cost justification in migration proposals. For the exam, know that it helps estimate cost savings but is not a billing tool—actual costs depend on usage.

