This chapter covers designing application architecture on Azure, a critical topic for the AZ-305 exam. Approximately 15-20% of exam questions touch on application architecture patterns, including compute, networking, storage, and messaging. Mastering these concepts is essential for designing scalable, resilient, and cost-effective solutions. We'll explore key services, design patterns, and best practices that the exam tests heavily.
Designing application architecture in Azure is like planning a city's infrastructure. The city has zones (regions) connected by highways (Azure backbone). Each zone has districts (availability zones) with their own power and water (compute, storage). Buildings (applications) are constructed using blueprints (ARM templates) and materials (compute, storage, networking). Traffic flow (load balancing) is managed by traffic lights and roundabouts (Azure Load Balancer, Application Gateway). Emergency services (disaster recovery) ensure the city can recover from disasters. Security (Azure AD, NSGs) is like police and fire departments. The city planner (architect) must consider growth (scalability), reliability (high availability), and cost (budget). Just as a city needs to handle rush hour spikes (autoscaling) and withstand a storm (resilience), an application must handle traffic bursts and hardware failures without downtime.
What is Application Architecture?
Application architecture defines the structure, components, and interactions of an application. In Azure, it involves selecting compute services (VMs, App Service, Functions), storage (Blob, SQL Database, Cosmos DB), networking (VNet, Load Balancer, Traffic Manager), and integration (Service Bus, Event Grid). The goal is to meet functional and non-functional requirements: scalability, availability, security, and cost.
Why It Exists
Modern applications are distributed, often microservices-based, running on cloud infrastructure. Azure provides building blocks to compose these applications. Architects must choose the right services and patterns to ensure the application can handle load, recover from failures, and evolve over time.
How It Works Internally
Azure services are deployed within a subscription, resource group, and region. Compute services run on hypervisors (Azure Hyper-V) with dedicated or shared resources. Storage is replicated across fault domains. Networking uses software-defined networking (SDN) with virtual switches. Load balancers distribute traffic using hash-based algorithms. Autoscaling monitors metrics like CPU or queue length and adjusts resources.
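To make the hash-based distribution concrete, here is a minimal sketch of 5-tuple flow hashing. The SHA-256 hash, the `pick_backend` function, and the backend names are assumptions made for illustration; Azure does not publish the hash function its load balancer uses.

```python
import hashlib

def pick_backend(src_ip, src_port, dst_ip, dst_port, protocol, backends):
    """Map a flow's 5-tuple to one backend; the same tuple always maps to the same backend."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}|{protocol}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it onto the pool size.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

backends = ["vm-0", "vm-1", "vm-2"]  # hypothetical backend pool
flow = ("203.0.113.7", 50123, "20.0.0.4", 443, "tcp")
print(pick_backend(*flow, backends))
```

Because the source port is part of the key, a new connection from the same client may land on a different backend, which is why session affinity needs a different distribution mode or a Layer 7 service.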
Key Components and Defaults
Azure App Service: Supports .NET, Java, Node.js, Python, PHP. Example plan size: Standard S1 (1 vCPU, 1.75 GB RAM). Scaling: manual or autoscale (10 instances max for Standard, 30 for Premium).
Azure Functions: Consumption plan (pay-per-execution, 5-minute timeout default, 10-minute max). Premium plan (30-minute timeout default, configurable to unbounded; VNet integration).
Azure Kubernetes Service (AKS): Managed Kubernetes control plane. Node pools can be Linux or Windows. Default node size: Standard_DS2_v2.
Azure SQL Database: DTU or vCore purchasing model. DTU Basic: 5 DTU, 2 GB storage. DTU Standard S2: 50 DTU, 250 GB storage.
Azure Cosmos DB: Multi-model (SQL, MongoDB, Cassandra, Gremlin, Table). Default consistency: Session. RU/s per container: 400 minimum.
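Azure Storage: Blob, File, Queue, Table. Redundancy is chosen when the account is created: LRS (locally redundant) is the lowest-cost option, with ZRS (zone-redundant) and GRS/GZRS (geo-redundant) available for higher durability.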
Azure Load Balancer: Public or internal. Distribution mode: 5-tuple hash (source IP, source port, destination IP, destination port, protocol). Idle timeout: 4 minutes default (configurable up to 30 minutes).
Azure Application Gateway: Layer 7 load balancer with SSL termination. WAF (Web Application Firewall) available. The v2 SKU supports autoscaling (min capacity 0, max 125 instances).
Azure Traffic Manager: DNS-based traffic routing. Routing methods: Performance, Weighted, Priority, Geographic, Multivalue, Subnet. TTL default: 300 seconds.
Azure Front Door: Global HTTP/S load balancer with SSL offload, URL-based routing, WAF. Supports acceleration via Microsoft edge network.
Design Patterns
N-tier architecture: Separate presentation, business, data tiers. Each tier can scale independently. Use VNet with subnets for isolation.
Microservices: Small, independent services communicating via APIs or messaging. Use AKS, Service Fabric, or App Service with containers.
Event-driven architecture: Services triggered by events (e.g., Blob upload, queue message). Use Functions, Event Grid, Event Hubs.
CQRS (Command Query Responsibility Segregation): Separate read and write models so each can be scaled and optimized independently. For example, use SQL for writes and Cosmos DB for reads.
Strangler Fig: Incrementally migrate a monolithic app to microservices by routing traffic to new services.
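As an illustration of the Strangler Fig pattern in the list above, the following minimal sketch routes requests by path prefix. The prefixes, the `route` function, and the backend labels are hypothetical; in practice this routing would live in a gateway such as API Management or Application Gateway rather than in application code.

```python
# Hypothetical path prefixes whose functionality has already moved to new services.
MIGRATED_PREFIXES = ("/orders", "/catalog")

def route(path: str) -> str:
    """Return which backend should serve the request path."""
    for prefix in MIGRATED_PREFIXES:
        # Match the prefix itself or anything below it, but not "/ordershistory".
        if path == prefix or path.startswith(prefix + "/"):
            return "microservice"
    return "monolith"

print(route("/orders/42"))
print(route("/billing/invoices"))
```

Migration then proceeds by adding prefixes to the list one at a time until the monolith serves nothing.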
Interaction with Related Technologies
Azure Active Directory (Azure AD, now renamed Microsoft Entra ID): Authentication and authorization for apps. Integrates with App Service, Functions, API Management.
Azure Monitor: Collects metrics and logs. Autoscaling uses metrics like CPU > 70% to add instances.
Azure Key Vault: Stores secrets, keys, certificates. Apps retrieve them via managed identity.
Azure DevOps: CI/CD pipelines to deploy to App Service, AKS, VMs.
Azure Policy: Enforces compliance (e.g., only allow certain VM sizes).
Configuration and Verification Commands
Deploy App Service via CLI: az webapp create --name myapp --resource-group myRG --plan myPlan
Create AKS cluster: az aks create --resource-group myRG --name myAKS --node-count 3 --enable-addons monitoring
Create Load Balancer: az network lb create --resource-group myRG --name myLB --backend-pool-name myBackendPool --frontend-ip-name myFrontend
Create Traffic Manager profile: az network traffic-manager profile create --name myProfile --resource-group myRG --routing-method Performance --unique-dns-name myApp
Verify autoscale settings: az monitor autoscale show --resource-group myRG --name myAutoscale
Check Azure SQL DTU usage: Query sys.dm_db_resource_stats
Common Values and Timers
Autoscale cool-down: Default 5 minutes between scale operations.
Load balancer probe interval: 15 seconds default.
Application Gateway connection draining: 60 seconds default.
Traffic Manager DNS TTL: 300 seconds default (can be lowered to 0).
Event Grid retry policy: 24-hour retention, exponential backoff (10 seconds, 30 seconds, 1 minute, up to 1 hour).
Service Bus duplicate detection: 10-minute window default.
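The duplicate detection window in the list above can be pictured with a small in-memory sketch. The `DuplicateDetector` class is hypothetical and only models the idea (remember message IDs for a fixed window and drop repeats); it is not how Service Bus is implemented.

```python
from datetime import datetime, timedelta

class DuplicateDetector:
    """Reject a message ID that was already accepted within the detection window."""

    def __init__(self, window=timedelta(minutes=10)):
        self.window = window
        self.seen = {}  # message_id -> time it was first accepted

    def accept(self, message_id, now):
        # Forget IDs that have aged out of the window.
        cutoff = now - self.window
        self.seen = {m: t for m, t in self.seen.items() if t > cutoff}
        if message_id in self.seen:
            return False  # duplicate inside the window: drop it
        self.seen[message_id] = now
        return True

detector = DuplicateDetector()
start = datetime(2024, 1, 1, 12, 0)
print(detector.accept("order-17", start))
print(detector.accept("order-17", start + timedelta(minutes=5)))
```

A longer window catches later retries but requires the broker to track more IDs, which is the trade-off behind the configurable setting.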
Trap Patterns
Confusing Azure Load Balancer (Layer 4) with Application Gateway (Layer 7): Exam tests scenarios requiring URL-based routing or SSL termination – choose Application Gateway.
Choosing Traffic Manager for load balancing within a region: Traffic Manager is DNS-based, not for real-time load balancing. Use Load Balancer or Application Gateway.
Assuming all Azure Functions run indefinitely: Consumption plan has a 5-minute timeout by default. For long-running functions, use Premium or App Service.
Not considering data residency: Exam may ask about storing data in specific regions for compliance. Use Azure Policy to enforce region restrictions.
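The first two traps reduce to a pair of questions: is the traffic HTTP(S), and is the scope global or regional? A rough sketch of that decision follows. The `choose_load_balancer` function is hypothetical and deliberately simplified; real designs often combine services, for example Front Door in front of regional Application Gateways.

```python
def choose_load_balancer(http: bool, global_scope: bool) -> str:
    """Simplified service selection from traffic type and scope."""
    if global_scope:
        # Global HTTP(S) traffic goes to Front Door; other protocols rely on DNS routing.
        return "Azure Front Door" if http else "Azure Traffic Manager"
    # Regional: Layer 7 features need Application Gateway, otherwise Layer 4 is enough.
    return "Azure Application Gateway" if http else "Azure Load Balancer"

print(choose_load_balancer(http=True, global_scope=False))
print(choose_load_balancer(http=False, global_scope=False))
```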
Summary
Designing application architecture requires understanding Azure compute, networking, storage, and integration services. Architects must select appropriate patterns based on requirements. The exam focuses on choosing the right service for a given scenario, understanding scaling and availability options, and knowing default values and limitations.
Define Application Requirements
Identify functional and non-functional requirements: expected user load (concurrent users, requests per second), latency requirements (e.g., <200ms), availability SLA (e.g., 99.9%), data storage needs (structured vs unstructured), geographic distribution (global vs single region), and compliance (e.g., HIPAA, GDPR). This step determines which Azure services are suitable. For example, a global e-commerce site with sub-second latency may require Azure Front Door, Cosmos DB with multi-region writes, and App Service in multiple regions.
Select Compute Service
Choose between VMs, App Service, AKS, Functions, or Service Fabric based on requirements. Use VMs for full control and legacy apps; App Service for web apps and APIs with built-in scaling; AKS for containerized microservices; Functions for event-driven serverless code; Service Fabric for stateful microservices. Consider factors: startup time, scaling granularity, OS support, and cost. For example, a stateless web API with variable load is best on App Service with autoscale.
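The selection guidance above can be written down as a small decision function. This is only a sketch: the `choose_compute` name, its flags, and the order in which they are checked are assumptions, and real decisions also weigh cost, team skills, and startup time.

```python
def choose_compute(full_os_control=False, stateful_microservices=False,
                   containerized=False, event_driven=False) -> str:
    """Pick a compute service from coarse requirement flags (checked in priority order)."""
    if full_os_control:
        return "Virtual Machines"   # legacy apps or custom OS configuration
    if stateful_microservices:
        return "Service Fabric"
    if containerized:
        return "AKS"
    if event_driven:
        return "Azure Functions"
    return "App Service"            # default for stateless web apps and APIs

print(choose_compute())
print(choose_compute(event_driven=True))
```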
Design Networking and Security
Plan VNet architecture: address space, subnets for each tier (e.g., web, business, data). Use NSGs and ASGs to filter traffic. For internet-facing apps, place public IP on a load balancer or Application Gateway. Use private endpoints for PaaS services (e.g., SQL Database, Storage) to keep traffic within the VNet. Implement Azure Firewall or NVAs for egress filtering. Enable DDoS Protection Standard for public endpoints.
Implement Data Storage and Caching
Choose storage based on data model: Azure SQL for relational, Cosmos DB for NoSQL (low latency, global distribution), Blob Storage for unstructured data, Azure Cache for Redis for caching frequently accessed data. Use read replicas for scaling reads (Azure SQL, Cosmos DB). Implement data partitioning (sharding) in Cosmos DB using partition keys. For caching, use the Premium tier of Azure Cache for Redis when you need persistence and clustering.
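Caching in front of a database usually follows the cache-aside pattern: read from the cache, fall back to the database on a miss, then populate the cache. Here is a minimal in-memory sketch. The `CacheAside` class and the loader function are hypothetical, and a plain dictionary stands in for a Redis instance.

```python
import time

class CacheAside:
    """Serve reads from a cache; load from the backing store only on a miss or expiry."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache = {}  # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        entry = self._cache.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                      # cache hit
        value = self._load(key)                  # cache miss: go to the database
        self._cache[key] = (value, now + self._ttl)
        return value

db_calls = []

def load_product(key):
    # Stand-in for a database query; records each call so hits and misses are visible.
    db_calls.append(key)
    return {"id": key, "name": key.upper()}

catalog = CacheAside(load_product)
catalog.get("sku-1")
catalog.get("sku-1")
print(len(db_calls))
```

The time-to-live bounds how stale a cached entry can get; writes that must be visible immediately also need to invalidate or update the cached key.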
Configure Scalability and Availability
Set up autoscale for compute (App Service, VMSS, AKS) based on metrics (CPU, memory, queue length). Use availability zones for VMs, App Service, and SQL Database for 99.99% SLA. Implement load balancing across zones (Standard Load Balancer). For disaster recovery, use Azure Site Recovery for VMs, geo-redundant storage for data, and Traffic Manager for global failover. Test failover regularly.
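To see why a cool-down matters, the sketch below simulates a metric-based autoscale rule minute by minute. The `simulate_autoscale` function, its thresholds, and the one-instance step are assumptions for illustration. Real Azure autoscale evaluates metric aggregates over a time window, and in this simplified model the CPU readings do not react to added instances.

```python
def simulate_autoscale(cpu_by_minute, min_instances=2, max_instances=10,
                       scale_out_above=70, scale_in_below=30, cooldown_minutes=5):
    """Return the instance count after each minute of CPU readings."""
    instances = min_instances
    last_change = None  # minute of the most recent scale operation
    history = []
    for minute, cpu in enumerate(cpu_by_minute):
        in_cooldown = (last_change is not None
                       and minute - last_change < cooldown_minutes)
        if not in_cooldown:
            if cpu > scale_out_above and instances < max_instances:
                instances += 1
                last_change = minute
            elif cpu < scale_in_below and instances > min_instances:
                instances -= 1
                last_change = minute
        history.append(instances)
    return history

# Seven minutes of sustained 90% CPU: one scale-out at minute 0, the next at minute 5.
print(simulate_autoscale([90] * 7))
```

With a five-minute cool-down, a sudden spike adds capacity slowly, which is the reason scheduled scaling is often preferred for predictable peaks.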
Integrate with Messaging and Events
Use Azure Service Bus for reliable messaging with queues and topics (e.g., order processing). Use Event Grid for event-driven architecture (e.g., blob created triggers function). Use Event Hubs for high-throughput streaming (IoT, telemetry). Choose based on message size, throughput, and ordering requirements. Service Bus supports sessions and transactions; Event Grid is for push-based events with retry; Event Hubs supports massive ingestion.
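The difference between a queue and a topic can be shown with a small in-memory model: a queue hands each message to exactly one receiver, while a topic copies each message to every subscription. The `Queue` and `Topic` classes here are hypothetical stand-ins, not the Service Bus SDK.

```python
from collections import deque

class Queue:
    """Point-to-point: each message is received once, by whichever consumer asks first."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None

class Topic:
    """Publish/subscribe: every subscription gets its own copy of each message."""

    def __init__(self):
        self._subscriptions = {}

    def subscribe(self, name):
        self._subscriptions[name] = Queue()
        return self._subscriptions[name]

    def publish(self, message):
        for subscription in self._subscriptions.values():
            subscription.send(message)

orders = Topic()
billing = orders.subscribe("billing")
shipping = orders.subscribe("shipping")
orders.publish("order-1001")
print(billing.receive(), shipping.receive())
```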
Monitor and Optimize
Implement Azure Monitor for metrics and logs. Set up alerts for key indicators (e.g., CPU > 80%, HTTP 5xx errors). Use Application Insights for application performance monitoring (dependencies, request rates, failures). Enable autoscale based on metrics. Review cost reports and right-size resources (e.g., downsize VMs, use reserved instances). Use Azure Advisor for recommendations.
Scenario 1: Global E-commerce Platform
A retail company needs a global e-commerce platform with high availability, low latency, and the ability to handle flash sales. They deploy:
- Azure Front Door at the edge for global load balancing, SSL termination, and WAF (protection against DDoS and OWASP attacks).
- App Service in multiple regions (e.g., West US, West Europe, Southeast Asia) behind Front Door. Each region uses at least two instances in different availability zones.
- Azure Cosmos DB with multi-region writes (99.999% availability) and session consistency for cart data. Partition key: user ID.
- Azure Redis Cache for session state and product catalog caching, reducing database load.
- Azure Service Bus for order processing: orders are placed in a queue, processed by backend workers (Functions), ensuring reliable delivery.
- Azure Monitor with Application Insights for real-time performance dashboards and alerts.
In production, they handle 50,000 requests per second during peak. Autoscale adds instances based on CPU > 70%. They use Traffic Manager as a backup for Front Door (if needed). Common issues: misconfigured Front Door health probes causing false failures; too aggressive autoscale causing thrashing; forgetting to enable geo-replication for Cosmos DB.
Scenario 2: Microservices Migration
A financial services company is migrating a monolithic .NET application to microservices on AKS. They need:
- AKS cluster with multiple node pools (Linux for services, Windows for legacy components).
- Azure API Management to expose APIs with throttling, caching, and authentication (OAuth2 via Azure AD).
- Azure Service Bus for asynchronous communication between services (e.g., trade execution).
- Azure SQL Database with elastic pools for relational data, and Cosmos DB for real-time market data.
- Azure Key Vault for secrets (connection strings, API keys).
- Azure DevOps for CI/CD with Helm charts and Azure Container Registry.
They use the Strangler Fig pattern: gradually route traffic to new microservices via API Management. Challenges: network latency between microservices in different pods; managing distributed transactions (use Saga pattern with Service Bus). Misconfigurations: not setting pod resource limits causing noisy neighbors; not using Azure Policy to enforce security.
Scenario 3: IoT Telemetry Pipeline
A manufacturing company collects sensor data from thousands of devices. Architecture:
- Azure IoT Hub for device registration, authentication, and message ingestion (millions of messages per second).
- Azure Stream Analytics for real-time processing (e.g., detect anomalies, compute averages).
- Azure Event Hubs for high-throughput buffering (optional, if IoT Hub not enough).
- Azure Functions (consumption plan) for event-driven actions (e.g., send alert when temperature exceeds threshold).
- Azure Blob Storage for cold storage of raw data (hot tier for recent data, cool tier for archival).
- Azure Time Series Insights for historical analysis.
Common pitfalls: not partitioning IoT Hub device identities correctly; using Functions with long execution times (need Premium plan); not enabling auto-inflate on Event Hubs throughput units.
The AZ-305 exam tests your ability to design application architecture that meets business requirements. Key objectives under "Design application architecture" (AZ-305: 4.4) include:
- Selecting appropriate compute services (VMs, App Service, AKS, Functions, Service Fabric)
- Designing for high availability and disaster recovery (availability zones, geo-redundancy, Traffic Manager, Azure Site Recovery)
- Implementing autoscaling (based on metrics, schedules)
- Choosing messaging patterns (queues, topics, events)
- Designing for cost optimization (right-sizing, reserved instances, serverless)
Common Wrong Answers
Choosing Azure Load Balancer for URL-based routing: Candidates often pick Load Balancer because they think it handles HTTP, but it's Layer 4 (TCP/UDP). For URL routing, SSL termination, or WAF, you need Application Gateway or Front Door.
Selecting Traffic Manager for intra-region load balancing: Traffic Manager is DNS-based, not for real-time load balancing. For distributing traffic within a region, use Load Balancer or Application Gateway.
Using Azure Functions for long-running tasks: The default 5-minute timeout on consumption plan is a trap. For tasks >5 minutes, choose Premium plan or App Service.
Ignoring data consistency requirements: Cosmos DB offers five consistency levels (Strong, Bounded Staleness, Session, Consistent Prefix, Eventual). The exam may ask which to use for a global app with low latency – Session is often the best balance.
Specific Numbers and Terms
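99.9% SLA: Single VM with Premium SSD or Ultra Disk on all disks.
99.95% SLA: Two or more VMs in the same availability set; App Service on a paid tier.
99.99% SLA: Two or more VMs spread across availability zones; Cosmos DB single-region account.
99.999% SLA: Cosmos DB with multiple regions (multi-region writes extend this to write availability).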
Autoscale default cool-down: 5 minutes.
Load Balancer idle timeout: 4 minutes (default), configurable up to 30 minutes.
Application Gateway connection draining: 60 seconds (default).
Traffic Manager TTL: 300 seconds (default), can be set to 0.
Service Bus duplicate detection window: 10 minutes.
Event Grid retry policy: 24 hours, exponential backoff.
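SLA percentages are easier to compare once converted into allowed downtime, and chained services multiply their SLAs together. The helper names below are hypothetical; the arithmetic assumes a 30-day month and components that fail independently.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime permitted in a period of the given length."""
    return (1 - sla_percent / 100) * days * 24 * 60

def composite_sla(*sla_percents):
    """SLA of a chain where every component must be up (independent failures assumed)."""
    result = 1.0
    for percent in sla_percents:
        result *= percent / 100
    return result * 100

for sla in (99.9, 99.95, 99.99):
    print(sla, round(allowed_downtime_minutes(sla), 2))

# A web tier at 99.95% calling a database at 99.99%:
print(round(composite_sla(99.95, 99.99), 4))
```

For a 30-day month, 99.9% allows about 43.2 minutes of downtime, 99.95% about 21.6 minutes, and 99.99% about 4.3 minutes. A composite SLA is always lower than that of its weakest component.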
Edge Cases
Stateful vs stateless: For stateful services (e.g., WebSockets), use SignalR Service or VMSS with sticky sessions (Application Gateway with session affinity).
Hybrid applications: On-premises apps connect via VPN or ExpressRoute; use Azure Relay or Hybrid Connections for internal resources.
Compliance: Data must remain in specific regions – use Azure Policy to restrict resource creation, and the Geographic routing method in Traffic Manager to keep users on endpoints in their own geography.
How to Eliminate Wrong Answers
Read the scenario carefully: Look for keywords like "URL path", "SSL termination", "WAF" – these point to Application Gateway. "Global", "low latency", "anycast" – point to Front Door. "DNS-based" – Traffic Manager.
Check SLA requirements: If 99.99% is needed, you need availability zones or multi-region. If 99.95% is enough, two VMs in an availability set work.
Consider cost: For variable workloads, serverless (Functions, Consumption plan) is cheaper. For predictable workloads, reserved instances save money.
Messaging pattern: If you need reliable delivery with duplicate suppression (the closest Service Bus gets to exactly-once processing), use Service Bus with duplicate detection. If you need fan-out to multiple subscribers, use Event Grid. For high throughput streaming, use Event Hubs.
For URL-based routing, SSL termination, or WAF, use Azure Application Gateway or Front Door, not Load Balancer.
Traffic Manager is DNS-based and does not provide real-time load balancing; use Load Balancer or Application Gateway for intra-region traffic.
Azure Functions consumption plan has a 5-minute timeout (10-minute max). Use Premium plan for longer executions.
Cosmos DB offers five consistency levels; Session consistency is the default and suitable for most global apps.
Autoscale cool-down is 5 minutes by default; plan for pre-warming or use scheduled scaling for predictable spikes.
For 99.99% SLA, use availability zones (VMs, App Service, SQL Database) or multi-region deployment with Traffic Manager.
Service Bus is for reliable messaging with features like duplicate detection; Event Grid is for event-driven push notifications.
Always consider cost: serverless (Functions, Consumption plan) for variable workloads; reserved instances for steady-state.
Use Azure Policy to enforce compliance (e.g., region restrictions, allowed VM sizes).
Application Gateway connection draining default is 60 seconds; configure for graceful shutdown during updates.
These come up on the exam all the time. Here's how to tell them apart.
Azure Load Balancer
Layer 4 (TCP/UDP) load balancer
Distributes traffic based on 5-tuple hash
No SSL termination or URL routing
Supports backend pools of VMs or VMSS
Health probes via TCP, HTTP, or HTTPS
Azure Application Gateway
Layer 7 (HTTP/HTTPS) load balancer
URL-based routing, SSL termination, WAF
Supports cookie-based session affinity
Can route to multiple backend pools based on path
Autoscaling and zone redundancy available
Azure Service Bus
Message broker for queues and topics
Supports sessions, transactions, duplicate detection
Pull-based (client polls) or push via Service Bus connector
Maximum message size: 256 KB (Standard), 1 MB by default on Premium (configurable up to 100 MB)
Ideal for reliable, ordered messaging with high durability
Azure Event Grid
Event routing service (pub/sub)
Push-based with automatic retry and dead-lettering
Supports filtering and event schemas
Maximum event size: 1 MB (events over 64 KB are billed in 64 KB increments)
Ideal for reactive programming and serverless triggers
Azure App Service
PaaS for web apps, APIs, and mobile backends
Built-in autoscaling, load balancing, and CI/CD
Supports multiple languages (.NET, Java, Node.js, Python, PHP)
Limited to Windows or Linux, not custom OS
Simpler management, less control
Azure Kubernetes Service (AKS)
Managed Kubernetes for containerized microservices
Full control over pods, services, and networking
Supports any containerized application (any OS/language)
Requires more operational overhead (Kubernetes expertise)
More flexible scaling and deployment options (Helm, canary)
Mistake
Azure Load Balancer can route based on URL path.
Correct
Azure Load Balancer is Layer 4 (TCP/UDP) and cannot inspect HTTP headers. Use Application Gateway (Layer 7) for URL-based routing, SSL termination, and WAF.
Mistake
Traffic Manager provides real-time load balancing within a region.
Correct
Traffic Manager is DNS-based; it directs clients to endpoints based on DNS resolution. It does not balance live traffic. For intra-region load balancing, use Azure Load Balancer or Application Gateway.
Mistake
Azure Functions can run indefinitely on the consumption plan.
Correct
Consumption plan functions have a default timeout of 5 minutes (configurable up to 10 minutes). For longer executions, use the Premium plan (30-minute default, configurable to unbounded) or App Service.
Mistake
Cosmos DB strong consistency is always the best choice for global apps.
Correct
Strong consistency ensures immediate consistency but increases latency and reduces availability during failures. For globally distributed apps, Session consistency is often sufficient and provides better performance.
Mistake
Autoscaling in Azure is instant and can handle sudden spikes immediately.
Correct
Autoscaling has a cool-down period (default 5 minutes) and takes time to provision new instances. For sudden spikes, consider using Azure Front Door, which can absorb traffic at the edge, or pre-warm instances with scheduled scaling.
Azure Load Balancer operates at Layer 4 (TCP/UDP) and distributes traffic based on source IP, port, destination IP, port, and protocol. It does not inspect HTTP headers. Application Gateway operates at Layer 7 (HTTP/HTTPS), supporting URL-based routing, SSL termination, Web Application Firewall (WAF), and cookie-based session affinity. Use Load Balancer for non-HTTP traffic or simple load balancing; use Application Gateway for web applications needing intelligent routing and security.
Use Azure Functions for event-driven, short-lived (under 5-10 minutes) workloads that scale to zero when idle, such as processing messages from a queue or responding to HTTP triggers with low traffic. Use App Service for always-on web APIs that require consistent performance, longer execution times, and features like custom domains, SSL, and autoscale. App Service is better for production APIs with high traffic.
Deploy your application in multiple Azure regions. Use Azure Traffic Manager or Front Door to route users to the nearest healthy region. For data, use Cosmos DB with multi-region writes or Azure SQL with active geo-replication. Ensure each region's deployment is in availability zones for intra-region resilience. Test failover regularly with Azure Site Recovery for VMs.
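Regional failover with a priority routing method comes down to "return the highest-priority endpoint that is still healthy". A minimal sketch of that rule follows. The `pick_endpoint` function and the endpoint names are hypothetical, and the health set stands in for the results of health probes.

```python
def pick_endpoint(endpoints, healthy):
    """endpoints: list of (name, priority) pairs; a lower number is preferred.

    Returns the preferred healthy endpoint, or None if every endpoint is down.
    """
    for name, _priority in sorted(endpoints, key=lambda e: e[1]):
        if name in healthy:
            return name
    return None

regions = [("westeurope-app", 1), ("northeurope-app", 2)]  # hypothetical endpoints
print(pick_endpoint(regions, healthy={"westeurope-app", "northeurope-app"}))
print(pick_endpoint(regions, healthy={"northeurope-app"}))
```

With a DNS-based service, clients keep using a cached answer until the TTL expires, so the failover time seen by users is the probe detection time plus the TTL.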
Azure Service Bus is a message broker for reliable, ordered messaging with features like queues, topics, sessions, transactions, and duplicate detection. It is pull-based (clients poll) or push via connectors. Event Grid is a serverless event routing service that pushes events to subscribers (e.g., Functions, webhooks) with automatic retry and filtering. Use Service Bus for enterprise messaging with complex routing; use Event Grid for reactive event-driven architectures.
Autoscaling in App Service can scale based on metrics (CPU, memory, HTTP queue length) or schedule. When a metric crosses a threshold (e.g., CPU > 70%), a scale-out operation adds instances. There is a cool-down period of 5 minutes between operations to avoid thrashing. You can set minimum and maximum instance counts. Autoscaling works only on App Service plans that support it (Standard, Premium, Isolated).
The default idle timeout for Azure Load Balancer is 4 minutes. This means if no traffic flows for 4 minutes, the load balancer closes the connection. You can configure this up to 30 minutes. For long-lived connections like WebSockets, ensure your client sends keepalive packets or increase the timeout.
Use Azure SQL Database if your data is relational and requires complex queries, joins, and ACID transactions. Use Cosmos DB if you need low-latency reads/writes globally, flexible schema (NoSQL), and built-in multi-region replication. For an e-commerce app, you might use both: SQL for orders and inventory (relational), Cosmos DB for product catalog and user sessions (low-latency).