AZ-305 | Chapter 10 of 103 | Objective 4.1

Designing Compute Solutions

This chapter covers designing compute solutions in Azure, a core topic in Domain 4 of the AZ-305 exam. You will learn to select and configure Azure Virtual Machines, Virtual Machine Scale Sets, Containers, Azure App Service, Azure Functions, and Azure Kubernetes Service (AKS) to meet scalability, availability, and performance requirements. Compute solutions appear in approximately 15-20% of exam questions, often integrated with storage, networking, and identity decisions. Mastery of compute design is critical for the 'Design Infrastructure Solutions' section of the exam.

25 min read
Intermediate
Updated May 31, 2026

Azure VMs as Custom-Built Office Suites

Think of Azure Virtual Machines like renting and customizing individual office suites in a high-rise building. Each suite (VM) has its own floor plan (OS), furniture (applications), and lease terms (pricing model). You can choose a standard layout (Azure Marketplace image) or bring your own design (custom image). The building provides shared services like electricity (Azure networking), security (NSGs), and parking (managed disks). You decide how long to lease (pay-as-you-go or reserved), and you can resize your suite (change VM size) or move to a different floor (availability set) during maintenance. If you need more space, you add more suites (scale out) or upgrade to a larger suite (scale up). The building manager (Azure Fabric Controller) handles physical maintenance, but you are responsible for everything inside your suite (guest OS, apps, security patches). This analogy mirrors the IaaS model: you control the OS and applications, Azure manages the hypervisor and hardware. Just as you wouldn't rent a ballroom for a one-person office, you must choose the right VM size, storage, and networking to match your workload's compute, memory, and I/O requirements.

How It Actually Works

Overview of Azure Compute Options

Azure offers a spectrum of compute services ranging from Infrastructure as a Service (IaaS) to Platform as a Service (PaaS) and serverless. The AZ-305 exam expects you to evaluate trade-offs between control, scalability, cost, and operational overhead. The primary compute services are:
- Azure Virtual Machines (VMs): IaaS providing full control over the operating system and applications.
- Azure Virtual Machine Scale Sets (VMSS): Automated scaling of identical VMs.
- Azure App Service: PaaS for web apps, APIs, and mobile backends.
- Azure Functions: Serverless event-driven compute.
- Azure Container Instances (ACI): Lightweight container hosting without orchestration.
- Azure Kubernetes Service (AKS): Managed Kubernetes for container orchestration.

Azure Virtual Machines: Deep Dive

What it is: An Azure VM is a virtualized server running on Azure's hypervisor. You choose the VM size (vCPUs, memory, temporary storage), OS (Windows or Linux), and disk configuration. Azure supports both generation 1 and generation 2 VMs; generation 2 supports UEFI boot and larger memory.

How it works internally: When you create a VM, Azure's Fabric Controller allocates physical resources from a cluster of hosts. The VM runs on a hypervisor (Microsoft Hyper-V) that provides isolation. The VM's virtual hardware includes:
- vCPUs: Each vCPU is a logical processor backed by either a hyper-thread or a full physical core. Newer series (e.g., Dv3, Fsv2) are hyper-threaded; older series (e.g., Dv2, F) map each vCPU to a physical core.
- Memory: RAM allocated from the host, with some reserved for the hypervisor.
- Disks: Managed disks (Standard HDD, Standard SSD, Premium SSD, Ultra Disk) are stored as page blobs in Azure Storage. Each disk has a maximum IOPS and throughput. The VM's temporary disk (D: on Windows, /dev/sdb1 on Linux) provides local SSD storage but is not persistent.
- Virtual Network Interface (vNIC): Connects the VM to a virtual network (VNet). A VM can have multiple vNICs, which can be attached to different subnets of the same VNet.

Key components, values, defaults:
- VM Sizes: Grouped into families: General purpose (B, D, Dv3, Dsv3), Compute optimized (F, Fsv2), Memory optimized (E, Esv3, M), Storage optimized (Lsv2), GPU (NC, ND), and High-performance compute (H). Each size has a specific number of vCPUs, RAM, temp disk size, and max data disks.
- Availability: A single VM that uses premium storage for all disks has a 99.9% SLA. Two or more instances in an availability set have a 99.95% SLA, and two or more instances across availability zones have a 99.99% SLA. Availability sets distribute VMs across fault domains (server racks) and update domains (maintenance windows). Availability zones are physically separate datacenters within a region.
- Pricing: Pay-as-you-go (per second), Reserved Instances (1 or 3 years, up to 72% discount), Spot VMs (up to 90% discount, but can be evicted).
- Disks: Managed disks are recommended. Maximum disk size is 32 TiB for Premium SSD and 64 TiB for Ultra Disk. Premium SSD provides up to 20,000 IOPS per disk at the largest size (P80); a P50 disk provides 7,500 IOPS.
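To make the fault domain and update domain idea concrete, the following is a minimal sketch of how instances could be spread across an availability set in round-robin order. The function name is hypothetical, and the defaults of 2 fault domains and 5 update domains are assumptions for illustration; the real counts depend on the region and on how the availability set is configured.

```python
def place_in_availability_set(vm_count, fault_domains=2, update_domains=5):
    """Return (fault_domain, update_domain) for each VM, assigned sequentially.

    The default domain counts are illustrative assumptions, not fixed Azure values.
    """
    return [(i % fault_domains, i % update_domains) for i in range(vm_count)]


# Six VMs: the sixth wraps around to update domain 0, sharing it with the first.
for index, (fd, ud) in enumerate(place_in_availability_set(6)):
    print(f"vm{index}: fault domain {fd}, update domain {ud}")
```

The point of the sketch is that two VMs never share both a rack and a maintenance window until the domain counts are exhausted, which is why at least two instances are needed before the set provides any protection.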

Configuration and verification commands:

# Create a VM using Azure CLI
az vm create --resource-group MyRG --name MyVM --image Ubuntu2204 --size Standard_D2s_v3 --admin-username azureuser --generate-ssh-keys

# Resize a VM (requires restart)
az vm resize --resource-group MyRG --name MyVM --size Standard_D4s_v3

# Check VM status
az vm get-instance-view --resource-group MyRG --name MyVM --query "instanceView.statuses[?code=='PowerState/running']"

Interactions: VMs integrate with Azure Load Balancer, Azure Application Gateway, Azure Backup, Azure Site Recovery, and Azure Monitor. VMs can be joined to an Active Directory domain (AD DS or Microsoft Entra Domain Services) and can use Microsoft Entra ID (formerly Azure Active Directory) for sign-in.

Azure Virtual Machine Scale Sets (VMSS)

What it is: VMSS allows you to create and manage a group of identical, load-balanced VMs that automatically scale based on demand or a schedule.

How it works: A VMSS uses a scale set model (VM configuration template) to create VMs. The scale set is deployed in an availability zone or region. Scaling can be manual or autoscaling based on metrics like CPU percentage, memory, or custom metrics. Autoscaling uses Azure Monitor autoscale with rules (e.g., add 1 instance when CPU > 75% for 5 minutes).
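As a rough illustration of how a rule such as "add 1 instance when CPU > 75% for 5 minutes" might be evaluated, the sketch below averages one-minute CPU samples over the rule's window. It is a simplified model and not the Azure Monitor implementation; the function name, the sample format, and the maximum of 10 instances are assumptions.

```python
from statistics import mean


def evaluate_scale_out(cpu_samples, current, threshold=75.0, window=5, step=1, maximum=10):
    """Return the instance count after applying one scale-out rule.

    cpu_samples: one average-CPU reading per minute, oldest first (assumed format).
    """
    if len(cpu_samples) < window:
        return current  # not enough data to cover the whole window
    recent = cpu_samples[-window:]
    if mean(recent) > threshold:
        return min(current + step, maximum)  # never exceed the configured maximum
    return current


print(evaluate_scale_out([60, 80, 85, 90, 78, 82], current=2))  # 3
print(evaluate_scale_out([60, 62, 85, 70, 65, 71], current=2))  # 2
```

In the second call a single spike to 85% does not trigger scaling because the five-minute average stays below the threshold, which is the behavior the time window is meant to provide.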

Key values:
- Overprovisioning: Default is on; creates extra VMs during scale-out to ensure availability.
- Upgrade policy: Manual (the default; you upgrade instances yourself), Automatic (instances are upgraded in no guaranteed order), or Rolling (instances are upgraded in batches with a pause between batches).
- Instance protection: Prevent instances from being removed during scale-in.

Configuration command:

az vmss create --resource-group MyRG --name MyScaleSet --image Ubuntu2204 --instance-count 2 --vm-sku Standard_D2s_v3 --admin-username azureuser --generate-ssh-keys

Interaction: VMSS works with Azure Load Balancer (default) or Application Gateway. For stateful applications, use persistent disks (managed disks) or external storage.

Azure App Service

What it is: A PaaS offering for hosting web applications, REST APIs, and mobile backends. Supports multiple languages (.NET, Java, Node.js, Python, PHP) and frameworks.

How it works: App Service runs inside an App Service plan, which defines the compute resources (shared or dedicated VMs). The plan has pricing tiers: Free/Shared (shared infrastructure, limited), Basic (dedicated VMs, no auto-scale), Standard (auto-scale, staging slots), Premium (more features, high density), and Isolated (dedicated App Service Environment).

Key components:
- App Service Plan: Defines region, number of instances, size of instances (e.g., S1, S2, S3), and scaling behavior.
- Deployment Slots: Standard tier and above allow multiple slots (e.g., production, staging) for zero-downtime deployment.
- Auto-scale: Based on metrics or schedule. Maximum instances depend on tier.
- Networking: VNet integration (regional or gateway-required) allows access to resources in a VNet.
- Authentication: Built-in authentication with Azure AD, Facebook, Google, etc.

Configuration command:

az webapp create --resource-group MyRG --plan MyAppServicePlan --name MyUniqueAppName --runtime "DOTNET|6.0"

Interaction: App Service integrates with Azure SQL Database, Cosmos DB, Redis Cache, and Azure CDN. It can use Managed Identity to access other Azure resources.

Azure Functions

What it is: Serverless compute that executes code in response to events without managing infrastructure.

How it works: Functions run in either a Consumption plan (pay per execution, auto-scale, up to 10-minute timeout), Premium plan (pre-warmed instances, no timeout, VNet integration), or Dedicated plan (on an App Service plan). The function app is the unit of deployment. Triggers (HTTP, Timer, Blob, Queue, etc.) invoke the function. Bindings simplify input/output to services.

Key values:
- Timeout: Consumption plan default 5 minutes, max 10 minutes. Premium/Dedicated: 30 minutes default, no enforced maximum (long runs can still be interrupted when an instance is recycled).
- Concurrency: Depends on the trigger type and host.json settings. For HTTP triggers on the Consumption plan, the default limit is 100 concurrent requests per instance.
- Scale: Consumption plan scales based on event-driven triggers. Maximum instances: up to 200 for Consumption, up to 100 for Premium.
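The timeout limits above lend themselves to a simple elimination check. The sketch below is a hypothetical helper (not part of any Azure SDK) that lists the hosting plans able to satisfy a required execution time, treating the Consumption plan as capped at 10 minutes and without VNet integration, as this chapter does.

```python
def candidate_plans(max_runtime_minutes, needs_vnet=False):
    """List Functions hosting plans that fit, using the limits quoted in this chapter."""
    plans = []
    # Consumption: 10-minute hard cap per execution; treated here as having no VNet integration.
    if max_runtime_minutes <= 10 and not needs_vnet:
        plans.append("Consumption")
    # Premium and Dedicated: no enforced maximum once the timeout setting is raised.
    plans.extend(["Premium", "Dedicated"])
    return plans


print(candidate_plans(3))                    # ['Consumption', 'Premium', 'Dedicated']
print(candidate_plans(30))                   # ['Premium', 'Dedicated']
print(candidate_plans(3, needs_vnet=True))   # ['Premium', 'Dedicated']
```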

Configuration command:

az functionapp create --resource-group MyRG --consumption-plan-location westus --name MyFunctionApp --storage-account mystorageaccount --os-type Linux --runtime python --functions-version 4

Interaction: Functions often work with Azure Storage (queues, blobs), Event Grid, Service Bus, and Cosmos DB.

Azure Container Instances (ACI)

What it is: The fastest way to run a container in Azure without managing virtual machines or orchestrators.

How it works: ACI launches a container directly on Azure infrastructure. You specify a container image (from Docker Hub or Azure Container Registry), CPU, memory, and networking. Containers can be exposed via public IP or DNS name. ACI supports Linux and Windows containers.

Key values:
- Pricing: Per second of container runtime. You pay for allocated CPU and memory.
- Restart policy: Always, OnFailure, Never.
- Volume mounts: Azure Files shares for persistent storage.

Configuration command:

az container create --resource-group MyRG --name mycontainer --image mcr.microsoft.com/azuredocs/aci-helloworld --cpu 1 --memory 1 --ports 80 --dns-name-label mydnsname

Azure Kubernetes Service (AKS)

What it is: Managed Kubernetes service for deploying and managing containerized applications at scale.

How it works: AKS provides a managed control plane (API server, etcd, scheduler) and lets you manage worker nodes (VM scale sets). You define a cluster with node pools. Each node pool has a VM size and scaling policy (manual or autoscale). AKS integrates with Azure Container Registry (ACR), Azure AD, Azure Monitor, and Azure Policy.

Key components:
- Node pools: System node pool (for critical system pods) and user node pools (for applications).
- Scaling: Cluster autoscaler automatically adjusts the number of nodes based on pending pods.
- Networking: Azure CNI (each pod gets a VNet IP) or kubenet (pods get IPs from a private CIDR, nodes NAT).
- Storage: Persistent volumes using Azure Disks (ReadWriteOnce) or Azure Files (ReadWriteMany).
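To show why pending pods drive node count, here is a back-of-the-envelope estimate of how many nodes a pool needs for a given set of pod CPU requests. It divides total requests by per-node allocatable CPU and clamps the result to the pool's limits. This is a deliberate simplification: the real cluster autoscaler simulates scheduling pod by pod and also considers memory and other constraints. The function name and the 3,800 millicore allocatable figure are assumptions.

```python
import math


def nodes_needed(pod_cpu_requests_m, node_allocatable_m, min_nodes, max_nodes):
    """Estimate the node count for the given pod CPU requests (in millicores)."""
    total = sum(pod_cpu_requests_m)
    required = math.ceil(total / node_allocatable_m) if total else 0
    # Clamp to the autoscaler's configured minimum and maximum.
    return max(min_nodes, min(required, max_nodes))


# 60 pods requesting 500m each, on nodes with an assumed 3800m allocatable CPU.
print(nodes_needed([500] * 60, node_allocatable_m=3800, min_nodes=5, max_nodes=20))  # 8
```

If the estimate exceeds the maximum, the remaining pods stay pending, which is the symptom to look for when a pool's upper limit is set too low.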

Configuration command:

az aks create --resource-group MyRG --name MyAKSCluster --node-count 3 --enable-addons monitoring --generate-ssh-keys

Interaction: AKS works with Azure DevOps for CI/CD, the Istio-based service mesh add-on, and Azure Policy for governance.

Design Considerations for Compute Solutions

High Availability: Use availability zones for VMs and AKS node pools. For App Service, deploy across multiple instances in different zones (via zone-redundant App Service plan).

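Disaster Recovery: Use Azure Site Recovery for VMs. For App Service, deploy the app to a second region and keep its data in geo-redundant storage. For AKS, run clusters in more than one region and geo-replicate container images in ACR, because a single cluster cannot span regions.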

Scalability: Prefer horizontal scaling (scale out) for most stateless workloads. Vertical scaling (scale up) is for stateful or legacy apps.

Cost Optimization: Use Reserved Instances for predictable workloads, Spot VMs for batch jobs, and serverless for event-driven tasks.

Security: Use managed identities, NSGs, Azure Firewall, and private endpoints. For AKS, enable Azure AD integration and RBAC.

Performance: Choose VM sizes with premium storage for high I/O. Use proximity placement groups for low latency between VMs.

Exam-Specific Details

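VM SLA: 99.9% for a single VM with premium storage on all disks; 99.95% for two or more VMs in an availability set; 99.99% for two or more VMs across availability zones.

App Service SLA: 99.95% for paid dedicated tiers; Free and Shared carry no SLA.

AKS SLA: With the paid tier's uptime SLA, 99.95% for the Kubernetes API server on clusters that use availability zones and 99.9% on clusters that do not.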

Azure Functions: Consumption plan has a 10-minute timeout; Premium plan has no timeout (but 30-minute default).

VMSS: Default overprovisioning is enabled; upgrade policy defaults to Manual.

Container Instances: Restart policy defaults to Always.

Common Exam Traps

Trap 1: Choosing a single VM for high availability. Wrong because a single VM tops out at a 99.9% SLA even with premium storage; the 99.95% and 99.99% SLAs require at least two VMs.

Trap 2: Selecting App Service Basic tier for auto-scaling. Wrong because auto-scaling requires Standard or higher.

Trap 3: Using a VMSS with manual scaling for automatic scale-out. Wrong because VMSS autoscaling must be configured via Azure Monitor autoscale rules.

Trap 4: Assuming Azure Functions Consumption plan supports VNet integration. Wrong; VNet integration requires the Premium plan or a Dedicated (App Service) plan.

Trap 5: Thinking AKS node pools can be resized without recreating nodes. Wrong; you must add a new node pool with the desired size and migrate workloads.

Walk-Through

1. Define Compute Requirements

Start by analyzing the workload: Is it stateless or stateful? What are the CPU, memory, and I/O requirements? What SLA is needed? For example, a web frontend is stateless and can scale out, while a legacy database requires stateful VMs with high IOPS. Determine the region and availability requirements (zones or sets). Also consider cost constraints: pay-as-you-go, reserved, or spot. This step sets the foundation for all subsequent design decisions.

2. Select Compute Service Type

Based on requirements, choose between IaaS (VMs, VMSS), PaaS (App Service), Serverless (Functions), or Containers (ACI, AKS). For full control and legacy apps, VMs are appropriate. For web apps with minimal management, App Service is ideal. For event-driven tasks, Azure Functions. For containerized microservices, AKS. For simple container deployment, ACI. The exam often tests the trade-offs: e.g., App Service for auto-scaling web apps, VMs for custom OS requirements.
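The selection logic in this step can be written down as a small decision function. This is only a sketch of the rules of thumb stated above, with hypothetical parameter names; real designs weigh cost, skills, and compliance as well.

```python
def recommend_compute(needs_os_control, containerized, event_driven, needs_orchestration=False):
    """Map coarse workload traits to a compute service, following this step's rules of thumb."""
    if needs_os_control:
        return "Virtual Machines / VMSS"  # custom OS or legacy apps need IaaS
    if containerized:
        return "AKS" if needs_orchestration else "ACI"
    if event_driven:
        return "Azure Functions"
    return "App Service"  # default for web apps with minimal management


print(recommend_compute(needs_os_control=False, containerized=False, event_driven=False))
# App Service
print(recommend_compute(needs_os_control=False, containerized=True, event_driven=False,
                        needs_orchestration=True))
# AKS
```

The order of the checks mirrors how exam questions are usually eliminated: a hard requirement for OS control rules out everything except IaaS before any other trait is considered.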

3. Configure VM Size and Storage

If using VMs, select the appropriate size family and tier. For example, D-series for balanced compute/memory, E-series for memory-intensive, F-series for compute-only. Choose managed disk type: Standard HDD for dev/test, Standard SSD for production with moderate I/O, Premium SSD for high I/O, Ultra Disk for extremely low latency. Configure data disks and use striping (Storage Spaces) for throughput. For VMSS, specify the same size for all instances.
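When striping data disks, the aggregate IOPS is limited by whichever is lower: the sum of the disks or the VM size's own disk limit. The sketch below captures that arithmetic. The function name is hypothetical, and the numbers in the example (5,000 IOPS per disk, a 6,400 IOPS VM cap) are illustrative assumptions, so check the published limits for the disk tier and VM size you actually pick.

```python
def striped_iops(disk_count, per_disk_iops, vm_iops_cap):
    """Effective IOPS of a striped volume: the sum of the disks, capped by the VM limit."""
    return min(disk_count * per_disk_iops, vm_iops_cap)


# Illustrative values only: adding disks stops helping once the VM cap is reached.
print(striped_iops(1, per_disk_iops=5000, vm_iops_cap=6400))  # 5000
print(striped_iops(4, per_disk_iops=5000, vm_iops_cap=6400))  # 6400
```

This is why choosing a larger VM size, not just more disks, is sometimes the right fix for an I/O bottleneck.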

4. Design for High Availability

For VMs, place at least two instances in an availability set or across availability zones. For App Service, configure multiple instances in a Standard or higher plan and enable zone redundancy if needed. For AKS, create node pools across zones. Ensure the SLA requirements are met. Remember: a single VM with premium storage gets a 99.9% SLA at best; reaching 99.95% or 99.99% requires a second VM in the same availability set or in another zone.
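To put the SLA percentages in perspective, it helps to convert them into allowed downtime. The following sketch does that arithmetic for a 30-day month; the function name and the 30-day assumption are mine, and actual SLA credits are calculated per the terms of each service's SLA.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Maximum downtime in minutes that still meets the SLA over the given period."""
    total_minutes = days * 24 * 60
    return round((1 - sla_percent / 100) * total_minutes, 2)


for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla)} minutes per 30 days")
# 99.9% -> 43.2, 99.95% -> 21.6, 99.99% -> 4.32
```

Moving from 99.9% to 99.99% cuts the monthly downtime budget from about 43 minutes to just over 4, which is the practical difference between a single VM and a zone-spanning deployment.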

5. Implement Scaling Strategy

For VMSS, configure autoscale rules based on metrics like CPU > 75% for 5 minutes. For App Service, set up autoscale in the plan (Standard tier+). For Azure Functions, scaling is automatic in Consumption plan. For AKS, enable cluster autoscaler and set minimum and maximum node counts. Consider scale-in and scale-out cool-down periods (default 5 minutes) to avoid thrashing. Test scaling behavior under load.
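The effect of a cool-down period can be seen with a tiny simulation: scale requests that arrive too soon after the last honored action are ignored. This is a simplified model with hypothetical names, not the autoscale engine itself.

```python
def apply_cooldown(decisions, cooldown_minutes=5):
    """Filter (minute, delta) scale requests, dropping any inside the cool-down window."""
    honored = []
    last_action = None
    for minute, delta in decisions:
        if last_action is None or minute - last_action >= cooldown_minutes:
            honored.append((minute, delta))
            last_action = minute  # the window restarts after each honored action
    return honored


# Rapid out/in/out requests collapse to two actions instead of four.
print(apply_cooldown([(0, +1), (2, -1), (3, +1), (6, -1)]))  # [(0, 1), (6, -1)]
```

Without the cool-down, the same metric noise would add and remove an instance three times in six minutes, which is the thrashing this step warns about.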

6. Plan Networking and Security

VMs and VMSS require a VNet with subnets. Use NSGs to filter traffic. For App Service, use VNet integration for private access. For Functions, Premium plan enables VNet integration. For AKS, choose Azure CNI or kubenet. Enable private endpoints for secure access to PaaS services. Use managed identities for authentication to other Azure resources. Apply Azure Policy to enforce compliance.

7. Optimize Cost and Performance

Use Reserved Instances or Savings Plans for steady-state workloads. Use Spot VMs for batch or fault-tolerant workloads. For App Service, right-size the plan tier. For Functions, choose Consumption plan for low-usage scenarios. Monitor performance with Azure Monitor and adjust VM sizes or scale settings. Use Azure Advisor recommendations for cost and performance optimization. Regularly review and deallocate unused resources.
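A quick calculation shows how much the pricing model matters for a steady workload. The sketch below compares a month of pay-as-you-go against the maximum quoted discounts for Reserved Instances (72%) and Spot (90%). The $0.20 hourly rate and 730 hours per month are made-up inputs, and the discounts are ceilings, so real savings are usually lower.

```python
def monthly_cost(hourly_rate, hours=730, discount=0.0):
    """Monthly cost for one VM after applying a fractional discount (0.72 means 72% off)."""
    return round(hourly_rate * hours * (1 - discount), 2)


rate = 0.20  # hypothetical pay-as-you-go price in USD per hour
print("Pay-as-you-go:", monthly_cost(rate))                  # 146.0
print("Reserved (72%):", monthly_cost(rate, discount=0.72))  # 40.88
print("Spot (90%):", monthly_cost(rate, discount=0.90))      # 14.6
```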

What This Looks Like on the Job

Scenario 1: E-commerce Web Application

A retail company deploys a customer-facing e-commerce site on Azure. The workload is stateless (web tier) and stateful (shopping cart and orders). They choose Azure App Service for the web frontend on a Premium v3 plan (zone redundancy is not offered on the Standard tier), enabling auto-scale from 2 to 10 instances based on CPU usage. The backend uses Azure SQL Database for orders and Azure Redis Cache for session state. For high availability, they make the App Service plan zone-redundant so that instances are spread across availability zones. During Black Friday, traffic spikes 500%; auto-scale responds within minutes, adding instances. The site stays up with 99.99% uptime. Common misconfiguration: setting autoscale cooldown too short (e.g., 1 minute), causing thrashing. They set cooldown to 10 minutes for scale-out and 15 for scale-in.

Scenario 2: Batch Processing Pipeline

A financial services firm runs daily risk calculations using a custom .NET application. The job is CPU-intensive and runs for 2 hours. They use a VM scale set with Spot VMs (Standard_F16s_v2) to reduce costs by 80%. The scale set is configured with a manual instance count of 20, triggered by an Azure Logic App that runs on a schedule. Each VM processes a subset of data from Azure Blob Storage. They set the Spot eviction policy to 'Delete'. To handle evictions, they checkpoint progress to Azure Table Storage. The challenge: Spot VMs can be evicted with 30 seconds' notice; the checkpointing ensures completed work is not lost. They also set up a second scale set using pay-as-you-go VMs as a fallback for critical runs.
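The checkpointing pattern in this scenario can be sketched in a few lines. The example below uses an in-memory dict as a stand-in for the durable checkpoint table and doubles each number as a placeholder for the real risk calculation; the function name, the job ID, and the `evict_after` simulation hook are all hypothetical.

```python
def process_with_checkpoints(items, store, job_id, evict_after=None):
    """Process items in order, recording progress so a replacement VM can resume.

    store: dict-like stand-in for a durable checkpoint table (an assumption).
    evict_after: simulate an eviction after this many items in the current run.
    """
    start = store.get(job_id, 0)  # resume from the last recorded position
    done_this_run = 0
    results = []
    for index in range(start, len(items)):
        if evict_after is not None and done_this_run == evict_after:
            return results, False  # evicted before finishing
        results.append(items[index] * 2)  # placeholder for the real calculation
        store[job_id] = index + 1         # checkpoint after each completed item
        done_this_run += 1
    return results, True


store = {}
first, finished_first = process_with_checkpoints([1, 2, 3, 4], store, "risk-run", evict_after=2)
second, finished_second = process_with_checkpoints([1, 2, 3, 4], store, "risk-run")
print(first, finished_first)    # [2, 4] False
print(second, finished_second)  # [6, 8] True
```

The second call picks up at item 3 rather than starting over, so an eviction costs at most the item that was in flight. In practice the checkpoint interval is a trade-off between write overhead and the amount of work repeated after an eviction.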

Scenario 3: Microservices on AKS

A SaaS provider builds a multi-tenant application using microservices. They choose AKS for orchestration. They create a cluster with two node pools: a system pool (Standard_D2s_v3, 3 nodes) for core services (DNS, metrics), and a user pool (Standard_D4s_v3, 5-20 nodes) for application microservices. They enable cluster autoscaler with min 5, max 20. Each microservice is deployed with Horizontal Pod Autoscaler (HPA) based on CPU and custom metrics. For debugging they used Azure Dev Spaces, which Microsoft has since retired. A common issue: pods are not scheduled due to insufficient node resources; they use pod resource requests and limits to avoid overcommit. They also enable Azure Policy to enforce that all containers come from their private ACR. The system handles 10,000 requests per second with p99 latency under 200ms.

How AZ-305 Actually Tests This

What AZ-305 Tests (Objective 4.1: Design compute solutions)

The exam focuses on your ability to recommend the right compute service based on requirements. Key objective codes: 4.1.1 (recommend a compute solution), 4.1.2 (recommend a compute service based on requirements), 4.1.3 (design for high availability and scalability).

Common Wrong Answers and Why Candidates Choose Them

1. Choosing VMSS when App Service is sufficient: Candidates see 'scaling' and jump to VMSS, but App Service provides auto-scaling with less management. The trap: VMSS requires managing OS updates, while App Service handles patching.
2. Selecting Azure Functions Consumption plan for a long-running process: The 10-minute timeout is a hard limit; candidates forget this and choose Consumption for simplicity. The correct answer is Premium plan or VMs.
3. Picking a single VM for high availability: The exam often asks for a solution that meets a 99.95% or 99.99% SLA. Candidates think one VM with premium storage is enough, but that configuration tops out at 99.9%; the higher SLAs require at least two instances in an availability set or across zones.
4. Using a VMSS with manual scaling for automatic scaling: The question might say 'automatically scale', but candidates miss that VMSS needs autoscale rules configured via Azure Monitor.

Specific Numbers and Terms to Memorize

- VM SLA: 99.9% (single VM with premium storage), 99.95% (two or more VMs in an availability set), 99.99% (two or more VMs across availability zones).
- App Service SLA: 99.95% on paid dedicated tiers.
- App Service tiers: Free (no SLA), Shared (no SLA), Basic (no auto-scale), Standard (auto-scale), Premium, Isolated.
- Azure Functions timeout: Consumption 10 min, Premium unlimited (default 30 min).
- VMSS overprovisioning: default enabled.
- AKS cluster autoscaler: must be enabled separately.

Edge Cases and Exceptions

- Stateful workloads: Cannot use VMSS with ephemeral disks for state; use managed disks or external storage.
- Windows containers on AKS: Run in separate Windows Server node pools; the cluster still needs a Linux system node pool.
- Azure Functions Premium plan: Supports VNet integration and runs on its own Elastic Premium plan (EP1 to EP3) rather than a regular App Service plan.
- Spot VMs: Can be evicted with 30-second notice; not suitable for stateful or time-critical workloads.

How to Eliminate Wrong Answers

- If the requirement mentions 'minimal management', eliminate IaaS options (VMs, VMSS) and choose PaaS (App Service) or serverless.
- If the workload is event-driven and short-lived, choose Azure Functions.
- For container orchestration, AKS is the managed option covered here; ACI is for single containers.
- Always check SLA requirements: 99.95% requires at least two VMs in an availability set, and 99.99% requires instances spread across availability zones.
- If the question says 'auto-scale', ensure the service supports it: Basic App Service does not; Standard+ does. VMSS needs autoscale rules.

Key Takeaways

Azure VMs provide full control but require management; use for legacy or custom workloads.

App Service is PaaS for web apps; auto-scaling requires Standard tier or higher.

Azure Functions Consumption plan has a 10-minute timeout; use Premium for long-running functions.

VMSS enables automatic scaling with autoscale rules; overprovisioning is enabled by default.

AKS is managed Kubernetes; cluster autoscaler must be enabled separately.

High availability for VMs requires at least two instances in an availability set or zones.

Spot VMs can reduce costs up to 90% but can be evicted with 30-second notice.

Use proximity placement groups for low-latency between VMs.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Azure Virtual Machine Scale Sets

IaaS – full control over OS and software

Requires manual OS patching and configuration

Supports any workload (stateful or stateless)

Scaling adds new VMs (slower, minutes)

Pricing per VM hour; reserved instances available

Azure App Service

PaaS – no OS management

Automatic patching and updates

Best for web apps, APIs, mobile backends

Scaling adds instances quickly (seconds to minutes)

Pricing per App Service plan; reserved capacity available

Watch Out for These

Mistake

A single Azure VM with premium storage provides high availability.

Correct

A single VM that uses premium storage for all of its disks carries a 99.9% connectivity SLA, but that is not high availability. Two or more instances in the same availability set raise the SLA to 99.95%, and two or more instances deployed across two or more availability zones raise it to 99.99%. On the exam, treat a single VM as failing any high-availability requirement; you need at least two instances.

Mistake

Azure Functions Consumption plan can run indefinitely.

Correct

The maximum execution time for a function in the Consumption plan is 10 minutes (the default is 5 minutes, configurable up to 10). For longer executions, you must use the Premium plan (no timeout) or a Dedicated plan.

Mistake

VMSS automatically scales without any configuration.

Correct

VMSS provides a scaling mechanism, but you must configure autoscale rules (e.g., CPU > 75%) in Azure Monitor. Without rules, the instance count remains fixed at the initial value.

Mistake

App Service Basic tier supports auto-scaling.

Correct

Auto-scaling is only available in Standard, Premium, and Isolated tiers. Basic tier supports manual scaling only.

Mistake

AKS node pools can be resized by changing the VM size.

Correct

You cannot change the VM size of an existing node pool. To use a different size, you must create a new node pool with the desired size and migrate workloads.

Frequently Asked Questions

What is the difference between Azure VM availability sets and availability zones?

Availability sets distribute VMs across fault domains (physical racks) and update domains (maintenance windows) within a single datacenter. They protect against rack failures and planned maintenance. Availability zones are physically separate datacenters within a region, each with independent power, cooling, and networking. Zones protect against datacenter-level failures. For higher SLA (99.99%), use zones. For cost-effective high availability within a datacenter, use availability sets.

Can I use Azure Functions for a long-running process that takes 30 minutes?

Yes, but you must use the Premium plan (or a Dedicated plan). The Consumption plan has a maximum execution timeout of 10 minutes. The Premium plan has no timeout (default 30 minutes, can be increased). Also, Premium plan supports VNet integration and pre-warmed instances.

How do I choose between Azure Kubernetes Service (AKS) and Azure Container Instances (ACI)?

Use ACI for simple, single-container deployments or burst scenarios (e.g., run a container for a few minutes). ACI is fast to deploy and you pay per second. Use AKS for multi-container microservices, orchestration, scaling, and complex networking. AKS is a full Kubernetes platform with managed control plane. For production-grade containerized applications, AKS is recommended.

What is the SLA for a single Azure VM?

For a single VM using premium storage, the SLA is 99.9%. However, to achieve high availability, you need at least two VMs in an availability set (99.95%) or across availability zones (99.99%). The exam expects you to know that a single VM does not provide high availability.

Can I autoscale Azure VM Scale Sets automatically without any configuration?

No. VMSS has the infrastructure to scale, but you must configure autoscale rules (e.g., based on CPU, memory, or custom metrics) using Azure Monitor. Without rules, the instance count remains static. You can also scale manually.

What is the difference between vertical and horizontal scaling in Azure?

Vertical scaling (scale up) increases the size of a single VM (e.g., from D2s_v3 to D4s_v3). This requires downtime. Horizontal scaling (scale out) adds more instances (e.g., from 2 to 5 VMs). For stateless workloads, horizontal scaling is preferred for better availability and elasticity. Azure App Service and VMSS support horizontal scaling.

How do I ensure my App Service app is highly available across regions?

Use Azure Traffic Manager or Azure Front Door to route traffic to App Service deployments in multiple regions. Each region's App Service should have multiple instances (Standard tier or higher) and be zone-redundant if available. For disaster recovery, implement active-passive or active-active configurations with data replication.
