This chapter covers Blue-Green Deployment on Azure, a critical deployment strategy that minimizes downtime and risk by running two identical environments (Blue and Green) and switching traffic between them. For the AZ-204 exam, this topic appears in approximately 5-10% of questions, often within the Compute domain (Objective 1.3: Create and manage container images for solutions). You will need to understand how to implement this pattern using Azure App Service slots, Traffic Manager, or Azure Kubernetes Service (AKS) to pass scenario-based questions.
Imagine a busy city with a river that must be crossed by commuters. The city builds two identical bridges side by side: Bridge A and Bridge B. Bridge A is currently open to all traffic, while Bridge B is closed for maintenance and testing. When the city wants to deploy a new traffic pattern (e.g., reversible lanes), they first implement and test it on Bridge B while Bridge A continues to handle traffic. Once Bridge B is verified safe and efficient, they flip a switch that instantly redirects all traffic to Bridge B, then close Bridge A for updates. If something goes wrong with the new pattern on Bridge B, they can flip the switch back to Bridge A, restoring the old pattern immediately. This ensures zero downtime for commuters and allows rollback without disrupting traffic. The key is that both bridges are fully functional and identical in capacity, so the switch is seamless. The city keeps both bridges ready at all times, which costs more but guarantees reliability.
What is Blue-Green Deployment?
Blue-Green Deployment is a release management strategy that reduces downtime and risk by running two identical production environments, referred to as Blue and Green. At any time, only one environment serves live production traffic. The other environment is idle or used for staging the new release. When a new version of the application is ready, it is deployed to the idle environment (e.g., Green). After thorough testing, the router or load balancer switches all traffic from Blue to Green. If any issues arise, traffic can be switched back to Blue (rollback) instantly.
Why Use Blue-Green Deployment?
The primary goals are zero-downtime deployments, instant rollback, and reduced risk. Traditional in-place updates involve stopping the service, deploying new code, and restarting, causing downtime. Rolling updates gradually replace instances, which can be slow and complex to roll back. Blue-Green avoids these issues by keeping the old version fully available until the new version is proven stable.
How It Works Internally
Two Environments: Two identical stacks are provisioned: Blue (current production) and Green (staging). Each environment includes compute resources (e.g., App Service plans, VMs, containers), databases, caches, and configuration.
Deployment to Idle Environment: The new application version is deployed to the idle environment (Green). This does not affect live traffic.
Testing: The new version is tested in isolation on Green. This can include smoke tests, integration tests, and performance tests.
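Traffic Switch: Once Green passes testing, the router or load balancer is reconfigured to direct all production traffic to Green. With a load balancer or slot swap the switch takes effect almost immediately; with DNS-based routing it takes effect only as cached DNS records expire.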
Blue Becomes Idle: Blue is now idle but remains available. If the new version has issues, traffic can be switched back to Blue immediately.
Cleanup: After a stabilization period (e.g., 24-48 hours), Blue can be decommissioned or repurposed for the next release.
Key Components on Azure
#### Azure App Service Deployment Slots
Azure App Service supports deployment slots, which are live environments with their own hostnames. App content and configuration can be swapped between slots. This is the simplest implementation of Blue-Green on Azure.
Default slot: production (Blue).
Additional slot: e.g., staging (Green).
Swap operation: az webapp deployment slot swap or via portal. The swap is atomic and includes warm-up to ensure the new version is ready before receiving traffic.
Auto swap: Enables continuous deployment by automatically swapping when the slot is updated.
Traffic routing: You can route a percentage of traffic to a slot for testing (e.g., 10% to staging) before full swap.
Default values: Slots are available in Standard, Premium, and Isolated tiers. Auto swap is disabled by default. Swap duration depends on app startup time (default timeout 90 seconds).
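The percentage-based traffic routing described above can also be driven from the CLI. The following is a minimal sketch, assuming the `az webapp traffic-routing` command group and the `MyApp`, `MyRG`, and `staging` names used elsewhere in this chapter. The `AZ_CLI` variable is a hypothetical hook (not part of the Azure CLI) that lets you substitute `echo` for a dry run.

```shell
# Route a share of production traffic to a slot (sketch; names are placeholders).
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
route_percentage_to_slot() {
  # $1 = app name, $2 = resource group, $3 = slot name, $4 = percentage (0-100)
  "${AZ_CLI:-az}" webapp traffic-routing set \
    --name "$1" --resource-group "$2" \
    --distribution "$3=$4"
}

# Remove all routing rules so production receives 100% of traffic again.
clear_traffic_routing() {
  # $1 = app name, $2 = resource group
  "${AZ_CLI:-az}" webapp traffic-routing clear \
    --name "$1" --resource-group "$2"
}
```

For example, `route_percentage_to_slot MyApp MyRG staging 10` would send roughly 10% of requests to the staging slot, and `clear_traffic_routing MyApp MyRG` would undo it before or after a full swap.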
#### Azure Traffic Manager
Azure Traffic Manager is a DNS-based traffic load balancer that can route traffic to different endpoints. It can be used for Blue-Green deployment by pointing the Traffic Manager profile to either the Blue or Green endpoint.
Endpoint types: Azure endpoints (App Service, Cloud Service), external endpoints, nested endpoints.
Routing methods: Priority (for active-passive), Weighted (for gradual rollouts), Performance, Geographic.
For Blue-Green: Use Priority routing method with two endpoints: primary (Blue) and secondary (Green). To switch, change the priority so that Green becomes primary.
DNS TTL: Traffic Manager returns DNS responses with a configurable TTL (default 300 seconds). Changes are not instantaneous; clients and resolvers may keep using a cached answer until the TTL expires.
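Because the TTL bounds how long clients keep a stale answer, lowering it before a planned switch shortens the mixed-traffic window. A minimal sketch, assuming the `MyTM` profile and `MyRG` resource group from this chapter; `AZ_CLI` is a hypothetical dry-run hook, not an Azure CLI feature.

```shell
# Set the DNS TTL (in seconds) on a Traffic Manager profile.
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
set_tm_ttl() {
  # $1 = profile name, $2 = resource group, $3 = TTL in seconds
  "${AZ_CLI:-az}" network traffic-manager profile update \
    --name "$1" --resource-group "$2" --ttl "$3"
}
```

Calling `set_tm_ttl MyTM MyRG 60` ahead of the release trades a higher DNS query volume for a faster cutover; the value can be raised again once the new environment is stable.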
#### Azure Kubernetes Service (AKS)
In AKS, Blue-Green can be implemented using multiple deployments and services, or using service mesh like Istio.
Simple approach: Create two deployments (blue and green) with different labels. Create a service that selects labels based on the active version. Update the service's selector to switch traffic.
Advanced approach: Use Istio VirtualService to route traffic to different subsets based on weight or header.
Rollback: Revert the service selector or VirtualService configuration.
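The simple selector-based approach can be sketched with `kubectl patch`. This assumes both deployments carry a `version` label with the value `blue` or `green`, and that the Service is named `myapp`; both names are illustrative. `KUBECTL` is a hypothetical dry-run hook.

```shell
# Point a Service at the blue or green Deployment by patching its selector.
# Assumes pods are labelled version=blue or version=green (illustrative labels).
# KUBECTL is a hypothetical override: set KUBECTL=echo to print instead of execute.
switch_service_to() {
  # $1 = service name, $2 = version label value (blue or green)
  case "$2" in
    blue|green) ;;
    *) echo "version must be blue or green" >&2; return 2 ;;
  esac
  "${KUBECTL:-kubectl}" patch service "$1" \
    --patch "{\"spec\":{\"selector\":{\"version\":\"$2\"}}}"
}
```

`switch_service_to myapp green` performs the cutover, and `switch_service_to myapp blue` is the rollback. The patch only changes the `version` key, so any other selector keys on the Service are left in place.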
Configuration and Verification Commands
#### App Service Slot Swap
# Create a staging slot
az webapp deployment slot create --name MyApp --resource-group MyRG --slot staging
# Deploy to staging slot
az webapp deployment source config-zip --resource-group MyRG --name MyApp --slot staging --src app.zip
# Swap staging to production
az webapp deployment slot swap --resource-group MyRG --name MyApp --slot staging --target-slot production
#### Traffic Manager Priority
# Create Traffic Manager profile
az network traffic-manager profile create --name MyTM --resource-group MyRG --routing-method Priority
# Add endpoints
az network traffic-manager endpoint create --name Blue --profile-name MyTM --resource-group MyRG --type azureEndpoints --target-resource-id /subscriptions/.../sites/MyApp --priority 1
az network traffic-manager endpoint create --name Green --profile-name MyTM --resource-group MyRG --type azureEndpoints --target-resource-id /subscriptions/.../sites/MyApp-staging --priority 2
# Switch traffic by updating priorities (each endpoint needs a unique priority, so move Blue to an unused value first)
az network traffic-manager endpoint update --name Blue --profile-name MyTM --resource-group MyRG --type azureEndpoints --priority 3
az network traffic-manager endpoint update --name Green --profile-name MyTM --resource-group MyRG --type azureEndpoints --priority 1
Interaction with Related Technologies
Azure DevOps Pipelines: Can automate Blue-Green deployments using tasks like "Azure App Service deploy" with slot swap.
Azure Load Balancer: For VMs, you can use Azure Load Balancer with backend pools representing Blue and Green, and update the load balancing rules to switch traffic.
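Azure Front Door: A global layer-7 (HTTP/HTTPS) entry point. Unlike Traffic Manager, it proxies requests rather than answering DNS queries, so routing changes are not held back by client DNS caches.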
Azure SQL Database: When switching environments, databases must be compatible. Blue-Green often requires backward-compatible schema changes (e.g., additive changes only) to avoid breaking the old version during switch.
Blue-Green vs. Canary vs. Rolling
Canary deployment: Gradual rollout to a subset of users. Blue-Green is an all-at-once switch.
Rolling deployment: Incrementally updates instances. Blue-Green requires double the resources but provides instant rollback.
The AZ-204 exam may ask you to choose the appropriate strategy based on requirements like zero downtime, instant rollback, or cost constraints.
Provision Blue and Green Environments
Create two identical environments in Azure, e.g., two App Service slots (production and staging) or two VM scale sets. Ensure both have the same configuration, capacity, and dependencies (databases, caches). For App Service slots, the production slot is Blue, and a new slot (e.g., staging) is Green. This step is done once and reused for multiple releases.
Deploy New Version to Green
Deploy the new application version to the Green environment. This does not affect live traffic. Use CI/CD pipelines (e.g., Azure DevOps) to automate deployment. For App Service slots, deploy to the staging slot using `az webapp deployment source config-zip` or Kudu. Ensure the Green environment is fully functional and warmed up.
Test Green Environment
Run automated smoke tests, integration tests, and performance tests against the Green environment. Access the Green environment via its direct URL (e.g., `https://myapp-staging.azurewebsites.net`). Verify that the new version works correctly with the existing backend services. This step catches issues before traffic is switched.
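A smoke test against the Green URL can be as small as one HTTP probe. The sketch below assumes a hypothetical `/health` path on the app and treats any 2xx response as healthy; `CURL` is a hypothetical override used only to make the function easy to dry-run.

```shell
# Succeeds only for 2xx HTTP status codes.
is_success_status() {
  case "$1" in
    2[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Probe the Green environment and return 0 if it answers with a 2xx status.
# CURL is a hypothetical override for the curl binary.
smoke_test() {
  # $1 = base URL of Green, $2 = path to probe (e.g. a hypothetical /health)
  status=$("${CURL:-curl}" --silent --output /dev/null \
    --write-out '%{http_code}' --max-time 10 "$1$2")
  is_success_status "$status"
}
```

For example, `smoke_test https://myapp-staging.azurewebsites.net /health && echo "Green is healthy"` gates the next step on the probe. A connection failure makes curl report status `000`, which the function treats as unhealthy.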
Switch Traffic to Green
Redirect all production traffic from Blue to Green. For App Service slots, perform a swap operation (`az webapp deployment slot swap`). This is atomic and includes warm-up of the source (staging) slot before it takes production traffic. For Traffic Manager, update endpoint priorities or disable the Blue endpoint. With a slot swap the cutover is effectively immediate once warm-up completes; with Traffic Manager, clients move over only as cached DNS records expire (TTL dependent).
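For the Traffic Manager variant, disabling the Blue endpoint is an alternative to reshuffling priorities. A minimal sketch, assuming the `Blue` endpoint, `MyTM` profile, and `MyRG` resource group used earlier in this chapter; `AZ_CLI` is a hypothetical dry-run hook.

```shell
# Enable or disable a Traffic Manager endpoint.
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
set_endpoint_status() {
  # $1 = endpoint name, $2 = profile name, $3 = resource group,
  # $4 = Enabled or Disabled
  case "$4" in
    Enabled|Disabled) ;;
    *) echo "status must be Enabled or Disabled" >&2; return 2 ;;
  esac
  "${AZ_CLI:-az}" network traffic-manager endpoint update \
    --name "$1" --profile-name "$2" --resource-group "$3" \
    --type azureEndpoints --endpoint-status "$4"
}
```

`set_endpoint_status Blue MyTM MyRG Disabled` takes Blue out of DNS responses so that Green, the next priority, is returned instead; passing `Enabled` restores it.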
Monitor and Rollback if Needed
After the switch, monitor application health, error rates, and performance. If issues are detected, rollback by swapping back (App Service) or re-enabling Blue endpoint (Traffic Manager). Rollback is immediate because Blue is still fully deployed. Keep Blue running for a stabilization period (e.g., 24 hours) before decommissioning.
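The swap-then-rollback flow for App Service can be sketched as one function that swaps, runs a health check you supply, and swaps back if the check fails. The slot names follow this chapter (`staging` and `production`); the health-check command and the `AZ_CLI` dry-run hook are assumptions for illustration.

```shell
# Swap staging into production.
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
swap_slots() {
  # $1 = app name, $2 = resource group
  "${AZ_CLI:-az}" webapp deployment slot swap \
    --resource-group "$2" --name "$1" \
    --slot staging --target-slot production
}

# Swap, then run the given health-check command; swap back if it fails.
swap_then_verify() {
  # $1 = app name, $2 = resource group, remaining args = health-check command
  app=$1; rg=$2; shift 2
  swap_slots "$app" "$rg" || return 1
  if ! "$@"; then
    echo "health check failed, swapping back" >&2
    swap_slots "$app" "$rg"   # the old version now sits in staging
    return 1
  fi
}
```

A call such as `swap_then_verify MyApp MyRG curl --fail --silent --output /dev/null https://myapp.azurewebsites.net/` rolls back automatically when the probe fails. A single probe is a weak signal, so in practice the check would usually watch error rates over a longer window.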
Enterprise Scenario 1: E-commerce Platform
A large e-commerce company uses Azure App Service with deployment slots for Blue-Green deployments. They have a production slot (Blue) and a staging slot (Green). During a major holiday sale, they need to deploy a new checkout flow. They deploy the new code to the staging slot, run automated tests, then perform a slot swap. The swap takes less than 30 seconds. After swap, they monitor for errors. If any critical bug appears, they swap back. This ensures zero downtime during peak traffic. The challenge is database schema changes: they use backward-compatible changes (add columns, not remove) to avoid breaking the old version during the swap window.
Enterprise Scenario 2: Financial Services
A bank uses Azure Traffic Manager with Priority routing for Blue-Green deployment of a critical payment processing API. They have two identical deployments in separate regions for disaster recovery. Blue is the primary endpoint, Green is secondary. When deploying a new version, they deploy to Green, test using a test endpoint, then update Traffic Manager to make Green primary. Because DNS caching may cause some clients to still hit Blue, they set a low TTL (60 seconds) and wait for the TTL to expire. After a day, they decommission Blue. This provides both Blue-Green deployment and geographic redundancy. The main issue is that Traffic Manager is DNS-based, so failover is not instant; they accept a few minutes of potential mixed traffic.
Common Pitfalls
- Database schema changes: If the new version changes the database schema in a non-backward-compatible way, the old version (Blue) may break if swapped back. Solution: use additive changes only, so that both versions can run against the same schema.
- Stateful applications: If the application stores state locally (e.g., session data in memory), switching traffic may lose sessions. Solution: use external session state (Azure Cache for Redis, SQL Server).
- Warm-up time: The new environment may need time to warm up (load caches, JIT compilation). App Service warms up the source slot during a swap, but if the app takes too long, the swap may time out (default 90 seconds). Configure `WEBSITE_SWAP_WARMUP_PING_PATH` to specify the endpoint pinged during swap warm-up (`WEBSITE_WARMUP_PATH` covers warm-up on any restart, not only swaps).
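App Service documents `WEBSITE_SWAP_WARMUP_PING_PATH` and `WEBSITE_SWAP_WARMUP_PING_STATUSES` as the app settings that control the swap warm-up request. The sketch below sets both on a slot; the `/warmup` path is a hypothetical endpoint your app would need to implement, and `AZ_CLI` is a hypothetical dry-run hook.

```shell
# Configure a custom warm-up request for slot swaps.
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
set_swap_warmup() {
  # $1 = app name, $2 = resource group, $3 = slot,
  # $4 = warm-up path served by the app (hypothetical, e.g. /warmup)
  "${AZ_CLI:-az}" webapp config appsettings set \
    --name "$1" --resource-group "$2" --slot "$3" \
    --settings "WEBSITE_SWAP_WARMUP_PING_PATH=$4" \
               "WEBSITE_SWAP_WARMUP_PING_STATUSES=200"
}
```

With `set_swap_warmup MyApp MyRG staging /warmup`, the swap would wait for that path to answer with status 200 instead of probing the site root.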
What AZ-204 Tests
AZ-204 Objective 1.3: Create and manage container images for solutions. Blue-Green deployment is tested as a deployment strategy for containerized solutions (AKS) and PaaS (App Service). Expect scenario-based questions that ask you to choose the best strategy (Blue-Green, Canary, Rolling) given requirements like zero downtime, instant rollback, or cost. The exam also tests implementation details: how to swap slots, how to route traffic in AKS, and how to use Traffic Manager.
Common Wrong Answers
1. Choosing Rolling Deployment when the requirement is zero downtime with instant rollback: Rolling updates can achieve zero downtime but rollback is slow (requires rolling back each instance). Candidates often confuse rolling with blue-green.
2. Using Azure Load Balancer with health probes for traffic switch: Health probes can detect failures but are not designed for manual traffic switching. Candidates may think updating health probe configuration can switch traffic, but it's indirect and slow.
3. Believing slot swap is instantaneous with zero latency: The swap involves warm-up and configuration synchronization, which takes time (up to 90 seconds). The exam may test the swap behavior and warm-up.
4. Assuming Blue-Green requires separate databases: Often, both environments share the same database, which can cause compatibility issues. The exam may test schema versioning.
Specific Numbers and Terms
- App Service slot swap default warm-up timeout: 90 seconds.
- Traffic Manager default DNS TTL: 300 seconds.
- Slot swap is only available in Standard, Premium, and Isolated tiers.
- Auto swap is disabled by default.
- In AKS, use kubectl apply to update service selectors or use Istio VirtualService.
Edge Cases
- Swap with auto swap enabled: If auto swap is enabled, every deployment to the staging slot triggers an automatic swap. This can be used for continuous deployment but reduces control.
- Traffic Manager with multiple endpoints: If more than two endpoints exist, priority routing may require careful management.
- Slot swap with connection strings: Connection strings can be slot-specific (sticky settings). If not configured correctly, the swapped slot may have wrong connection strings.
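To avoid the connection-string edge case, a connection string can be marked slot-specific when it is set. A minimal sketch, assuming an Azure SQL connection string and the `--slot-settings` option of `az webapp config connection-string set`; the `MyDb` name and its value are placeholders, and `AZ_CLI` is a hypothetical dry-run hook.

```shell
# Store a connection string as a slot-specific (sticky) setting so it stays
# with the slot during a swap instead of following the app content.
# AZ_CLI is a hypothetical override: set AZ_CLI=echo to print instead of execute.
set_sticky_connection_string() {
  # $1 = app name, $2 = resource group, $3 = slot,
  # $4 = NAME=VALUE pair (placeholder values only in this sketch)
  "${AZ_CLI:-az}" webapp config connection-string set \
    --name "$1" --resource-group "$2" --slot "$3" \
    --connection-string-type SQLAzure \
    --slot-settings "$4"
}
```

For example, `set_sticky_connection_string MyApp MyRG staging "MyDb=<staging connection string>"` keeps the staging database attached to the staging slot, so a swap does not point production code at it.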
How to Eliminate Wrong Answers
- If the question mentions "instant rollback," eliminate rolling and canary.
- If the question mentions "gradual rollout to 10% of users," eliminate blue-green and choose canary.
- If the question mentions "cost-effective" and "limited resources," blue-green may be too expensive (double resources); choose rolling.
- If the question involves App Service and slot swap, the answer likely involves swapping slots.
- If the question involves AKS and service mesh, the answer may involve Istio VirtualService.
Blue-Green deployment uses two identical environments (Blue and Green) to achieve zero-downtime deployments and instant rollback.
On Azure App Service, deployment slots enable Blue-Green via slot swap operation, available in Standard, Premium, and Isolated tiers.
Slot swap includes a warm-up phase (default timeout 90 seconds) to ensure the new version is ready before receiving traffic.
Traffic Manager with Priority routing can implement Blue-Green at the DNS level, but DNS TTL (default 300 seconds) delays traffic switching.
In AKS, Blue-Green can be implemented using multiple deployments and updating service selectors or using Istio VirtualService.
Database schema changes must be backward-compatible (additive only) to support rollback when sharing a database between environments.
Auto swap in App Service automatically swaps a slot when updated, enabling continuous deployment with Blue-Green.
Blue-Green deployment doubles resource costs because both environments must be fully provisioned.
The AZ-204 exam tests your ability to choose between Blue-Green, Canary, and Rolling deployments based on requirements like rollback speed and cost.
Slot-specific settings (sticky settings) in App Service stay with the slot during a swap rather than moving with the application code.
These come up on the exam all the time. Here's how to tell them apart.
Blue-Green Deployment
Two full environments run simultaneously (double resource cost).
Instant rollback by switching traffic back to the old environment.
Traffic switch is all-at-once; no gradual rollout.
Requires backward-compatible database changes or separate databases.
Best for critical applications where zero downtime and instant rollback are required.
Rolling Deployment
Single environment with incremental updates to instances (no extra cost).
Rollback is slow; requires rolling back each instance individually.
Traffic is gradually shifted as instances are updated.
Database changes must still work with both versions, because old and new instances serve traffic side by side during the rollout.
Best for cost-sensitive applications where a slower rollback is acceptable.
Mistake
Blue-Green deployment requires both environments to be identical in every aspect.
Correct
While they should be as identical as possible to avoid surprises, they can differ in configuration (e.g., connection strings for staging vs production). In App Service, such differences are held in slot-specific (sticky) settings, which stay with the slot during a swap instead of moving with the app content.
Mistake
Slot swap in Azure App Service is instant and has no downtime.
Correct
The swap is not instantaneous: App Service first warms up the source slot (the one holding the new version) before it receives production traffic. While this happens, the production slot continues to serve traffic, so there is no downtime. If warm-up fails, the swap does not complete and production keeps serving the old version.
Mistake
Blue-Green deployment eliminates the need for database compatibility.
Correct
If both environments share the same database, the new version must be backward-compatible with the old schema. Otherwise, rolling back may break the application. Often, additive schema changes are used, or the database is versioned.
Mistake
Traffic Manager provides instant traffic switching.
Correct
Traffic Manager is DNS-based, so changes are subject to DNS TTL. Clients may cache DNS for up to TTL seconds (default 300). To achieve faster switching, use Azure Front Door or App Service slot swap.
Mistake
Blue-Green deployment is only for web applications.
Correct
It can be used for any application that can be load-balanced, including APIs, microservices, and even databases (with careful planning). On Azure, it is commonly used with App Service, AKS, and VMs.
Blue-Green deployment switches all traffic from the old version (Blue) to the new version (Green) at once, providing instant rollback. Canary deployment gradually routes a small percentage of traffic to the new version, allowing monitoring before full rollout. Canary is slower to rollback (requires reversing the traffic split) but reduces risk by limiting exposure. On Azure, canary can be implemented using App Service slot traffic routing (preview) or Traffic Manager weighted routing.
Create a deployment slot (e.g., staging) in addition to the production slot. Deploy your new application version to the staging slot. Test it by accessing the staging URL. Then perform a slot swap via the portal, CLI (`az webapp deployment slot swap`), or Azure DevOps. The swap atomically switches the production and staging slots, making the new version live. To rollback, swap again.
Yes. Create two deployments (blue and green) with different labels. Create a service that selects the active version (e.g., via label `version: blue`). To switch, update the service's selector to `version: green`. Alternatively, use a service mesh like Istio to route traffic based on weights or headers. This allows instant rollback by reverting the selector.
Blue-Green deployment requires running two full environments simultaneously, doubling compute costs (e.g., App Service plan instances, VM scale sets, AKS node pools). You can downscale the idle environment to save costs, but this increases switch time if the idle environment needs to scale up. On Azure, you can use reserved instances or spot VMs to reduce costs.
To support rollback, database changes must be backward-compatible. Add new columns with default values, create new tables, or use feature flags to enable new code paths. Avoid destructive changes (drop columns, rename tables) until after the old version is decommissioned. Alternatively, run separate databases for each environment and migrate data after switch.
The default DNS TTL is 300 seconds (5 minutes). You can configure it to a lower value (e.g., 60 seconds) for faster propagation, but this increases DNS query load. For instant traffic switching, use Azure Front Door or App Service slot swap instead.
Yes. Use the Azure App Service deploy task with the 'Deploy to Slot' option, then use the 'Swap Slots' task. You can also use the 'Azure Traffic Manager' task to update endpoint priorities. For AKS, use kubectl tasks to update deployments and services.