This chapter covers designing backup and disaster recovery solutions for Azure workloads, which is a core part of Domain 3 (Business Continuity) and Objective 3.1 of the AZ-305 exam. Approximately 15-20% of exam questions touch on backup, restore, and disaster recovery strategies. You will learn how to choose between Azure Backup and Azure Site Recovery, configure Recovery Services Vaults, design backup policies, and implement cross-region recovery. The chapter provides the depth needed to answer scenario-based questions about RPO, RTO, retention, and failover.
Think of Azure Backup as a safety deposit box system for a business. Your original data is like the cash and documents you keep in your office safe (primary production). A safety deposit box at a bank (Recovery Services Vault) provides a separate, secure copy. The process of backing up is like a trusted courier making copies of your documents and delivering them to the bank vault each day. The vault has its own access controls (RBAC), and you can specify how long the bank keeps your copies (retention policy). If your office is burgled (disaster), you go to the bank, retrieve your copies, and restore them to a new office (target location). The bank vault is geo-redundant if you choose that option – the bank also stores copies in a different city. Azure Site Recovery (ASR) is like having a fully furnished backup office in another city that you can instantly move into. You continuously replicate your office setup (VMs, networks) to that remote office, and in a disaster, you just flip a switch (failover) and start working there. You don't need to rebuild – you just walk into the already-prepared office. When the primary office is repaired, you flip back (failback) and the remote office is vacated for future use. The replication is like a continuous, real-time document feed between the two offices, ensuring the backup office is always current.
Overview of Backup and Disaster Recovery in Azure
Azure offers two primary services for business continuity: Azure Backup and Azure Site Recovery (ASR). Azure Backup provides backup and restore capabilities for Azure VMs, SQL Server, SAP HANA, Azure Files, and on-premises workloads. Azure Site Recovery provides disaster recovery (DR) by replicating workloads to a secondary region, enabling failover and failback. The exam tests your ability to select the appropriate service based on RPO, RTO, data type, and recovery objectives.
Recovery Services Vault
A Recovery Services Vault is the management and storage entity for both Azure Backup and Azure Site Recovery. It is an Azure Resource Manager (ARM) resource that stores backup data. For Site Recovery it holds the replication configuration and metadata, while the replicated disks themselves live in the target region. Key features:
Storage Replication: Locally redundant storage (LRS), geo-redundant storage (GRS), or zone-redundant storage (ZRS). GRS is the default for backups and replicates data to a paired region.
Soft Delete: Enabled by default for Azure VM backups. Deleted backup data is retained for 14 days, which protects against accidental deletion.
Encryption: Data is encrypted at rest using platform-managed keys (SSE) or customer-managed keys (CMK) via Azure Key Vault.
RBAC: Built-in roles like Backup Contributor, Backup Operator, and Site Recovery Contributor control access.
Azure Backup Components
Azure Backup consists of multiple components depending on the workload:
Azure VM Backup: Uses a VM extension to take snapshots that are application-consistent on Windows (via VSS) and file-system-consistent by default on Linux. The backup agent is the Azure Backup extension.
Azure Backup Server (MABS): For backing up on-premises workloads to Azure. It uses System Center DPM technology.
Microsoft Azure Recovery Services (MARS) Agent: For backing up files, folders, and system state from on-premises Windows machines.
Azure Files Backup: Uses the Azure file share’s snapshot capability.
SQL Server / SAP HANA Backup: Uses the workload’s native backup integration with Azure Backup.
Backup Policy
A backup policy defines when backups occur and how long they are retained. Two types:
Standard Policy: Can include daily, weekly, monthly, yearly backups with retention rules.
Enhanced Policy: Available for Azure VMs, allows multiple backups per day (hourly) and instant restore snapshots.
Retention Rules: You can specify retention for each backup point based on the backup frequency. For example, keep daily backups for 30 days, weekly for 12 weeks, monthly for 5 years, yearly for 10 years.
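As a rough illustration of how layered retention rules combine, the sketch below computes when a single backup point would expire under the example policy above. The `RetentionPolicy` class and the rules for which backups count as weekly, monthly, or yearly points (Sunday, first of the month, January 1) are assumptions made for this example, not Azure Backup's actual scheduling model.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical model of the example retention rules in the text."""
    daily_days: int = 30
    weekly_weeks: int = 12
    monthly_years: int = 5
    yearly_years: int = 10


def retention_days(policy: RetentionPolicy, backup_date: date) -> int:
    """Return the longest retention (in days) that applies to this backup."""
    candidates = [policy.daily_days]
    # Assumption: the Sunday backup is the weekly point.
    if backup_date.weekday() == 6:
        candidates.append(policy.weekly_weeks * 7)
    # Assumption: the backup on the 1st is the monthly point.
    if backup_date.day == 1:
        candidates.append(policy.monthly_years * 365)
    # Assumption: the January 1 backup is the yearly point.
    if backup_date.month == 1 and backup_date.day == 1:
        candidates.append(policy.yearly_years * 365)
    return max(candidates)


def expiry_date(policy: RetentionPolicy, backup_date: date) -> date:
    """Date after which the backup point is no longer retained."""
    return backup_date + timedelta(days=retention_days(policy, backup_date))


if __name__ == "__main__":
    policy = RetentionPolicy()
    for d in (date(2024, 1, 1), date(2024, 1, 3), date(2024, 1, 7), date(2024, 2, 1)):
        print(d, "expires", expiry_date(policy, d))
```

The point to take away is that a backup matching several rules is kept for the longest of them, which is why a handful of yearly points can satisfy a multi-year compliance requirement without keeping every daily backup.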
Instant Restore
Azure VM Backup uses snapshots to enable instant restore. By default, snapshots are retained for 2 days (configurable from 1 to 5 days). This allows restoring from the snapshot without waiting for data to transfer from the vault. The snapshot is kept locally alongside the VM's disks rather than in the vault, which is what makes recovery quick.
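The effect of snapshot retention on restore speed can be sketched as a simple age check. The function name `restore_source` and its return labels are hypothetical; the 1 to 5 day range and the 2 day default come from the text above.

```python
from datetime import datetime, timedelta


def restore_source(recovery_point: datetime, now: datetime,
                   snapshot_retention_days: int = 2) -> str:
    """Return "snapshot" if the point is still within instant restore
    retention, otherwise "vault" (slower, data must be transferred)."""
    if not 1 <= snapshot_retention_days <= 5:
        raise ValueError("snapshot retention must be between 1 and 5 days")
    age = now - recovery_point
    if age < timedelta(0):
        raise ValueError("recovery point is in the future")
    if age <= timedelta(days=snapshot_retention_days):
        return "snapshot"
    return "vault"


if __name__ == "__main__":
    now = datetime(2024, 6, 10, 12, 0)
    print(restore_source(datetime(2024, 6, 9, 12, 0), now))  # 1 day old
    print(restore_source(datetime(2024, 6, 5, 12, 0), now))  # 5 days old
```

Raising the retention widens the window in which restores are fast, at the price of keeping more snapshots next to the production disks.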
Azure Site Recovery (ASR)
ASR replicates Azure VMs from one region to another (or from on-premises to Azure). Key components:
Replication Policy: Defines RPO threshold (default 15 minutes, minimum 30 seconds for Azure-to-Azure), recovery point retention (0-24 hours), and app-consistent snapshot frequency (0-12 hours).
Cache Storage Account: Stores changes before sending to target region.
Target Resources: Automatically created in the target region, including virtual networks, storage accounts, and availability sets/zones.
Failover: Can be planned (no data loss) or unplanned (may lose data up to RPO). Test failover is isolated and does not affect production.
Failback: After failover, you can reprotect and failback to the original region.
RPO and RTO Considerations
Azure Backup: RPO is typically 24 hours for daily backups, but can be as low as 4 hours with Enhanced Policy. RTO depends on restore type: instant restore (minutes) vs. vault restore (hours).
Azure Site Recovery: RPO as low as 30 seconds (for Azure VMs), RTO typically 30 minutes to 2 hours depending on application.
Exam Tip: For critical workloads with low RPO/RTO, use ASR. For archival or less critical data, use Azure Backup.
Cross-Region Restore (CRR)
Cross-Region Restore is a feature of Azure Backup that allows restoring Azure VMs in a secondary paired region when using GRS storage. It must be enabled at the vault level. It is useful for compliance or when the primary region is unavailable. The backup data is replicated to the paired region, but you can only restore to the secondary region, not failover.
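The dependency between Cross-Region Restore and GRS is the kind of rule that is easy to check mechanically. The sketch below is a plain Python model, not an Azure SDK call: `VaultConfig` and `validate_vault` are hypothetical names, and the production and soft delete checks reflect this chapter's recommendations rather than anything Azure enforces.

```python
from dataclasses import dataclass

VALID_REDUNDANCY = {"LRS", "ZRS", "GRS"}


@dataclass(frozen=True)
class VaultConfig:
    """Hypothetical description of a Recovery Services Vault's settings."""
    name: str
    redundancy: str = "GRS"          # GRS is the default for backups
    cross_region_restore: bool = False
    soft_delete: bool = True         # enabled by default


def validate_vault(cfg: VaultConfig, production: bool = True) -> list[str]:
    """Return a list of design problems; an empty list means none found."""
    problems = []
    if cfg.redundancy not in VALID_REDUNDANCY:
        problems.append(f"unknown redundancy {cfg.redundancy!r}")
    if cfg.cross_region_restore and cfg.redundancy != "GRS":
        problems.append("cross-region restore requires GRS")
    if production and cfg.redundancy == "LRS":
        problems.append("production vaults should use GRS")
    if not cfg.soft_delete:
        problems.append("soft delete is disabled")
    return problems


if __name__ == "__main__":
    dev = VaultConfig("dev-vault", redundancy="LRS", cross_region_restore=True)
    print(validate_vault(dev, production=False))
```

Returning a list of findings instead of raising on the first one mirrors how exam scenarios tend to hide more than one misconfiguration in the same design.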
Backup for On-Premises Workloads
For on-premises workloads, you can use the MARS agent for files/folders/system state, or MABS for more complex scenarios. Data is backed up over the internet or ExpressRoute to a Recovery Services Vault. The MARS agent encrypts data before sending, using a passphrase that you set.
Security Features
Soft Delete: Retains deleted backup data for 14 days.
Immutable Vault: Prevents deletion of backup data even by administrators.
Encryption in transit: HTTPS for backup data.
Private Endpoints: Can be used to keep backup traffic within the Azure backbone.
Best Practices for Designing Backup and Recovery
Separate vaults for production and test to avoid accidental deletion.
Use GRS for production to enable cross-region restore.
Enable soft delete to prevent data loss.
Monitor backup health using Azure Monitor and Backup Reports.
Test restore regularly to validate RTO.
For critical VMs, use both Azure Backup and ASR for layered protection.
Integration with Other Azure Services
Azure Policy: Enforce backup on VMs using built-in policies.
Azure Automation: Runbooks can automate failover or backup tasks.
Azure Monitor: Alerts on backup failures.
Azure Backup Center: Centralized management of backups across vaults.
Exam Focus on Backup and Recovery
The AZ-305 exam expects you to understand the differences between Azure Backup and Azure Site Recovery, when to use each, and how to configure them. You need to know the default settings (e.g., snapshot retention 2 days, GRS default for vaults) and the limitations (e.g., Azure Backup does not support cross-region failover, only restore). Scenario-based questions will ask you to recommend a solution given RPO, RTO, and cost constraints.
Create a Recovery Services Vault
In the Azure portal, navigate to Backup Center or Recovery Services Vaults and click Create. Specify a resource group, vault name, and region. The region determines where the metadata is stored. Choose storage replication: LRS for dev/test, GRS for production (enables cross-region restore). You can also enable soft delete at this step (default on). After creation, configure networking: choose public access or private endpoints. For backups, the vault must be in the same region as the workload (for Azure VMs) or the same region as the backup target.
Define Backup Policy
In the vault, go to Backup Policies and create a new policy. For Azure VM backup, choose frequency (daily, or hourly with Enhanced Policy), time of day, timezone, and retention rules. For daily backups, you can set retention for daily points (e.g., 30 days), weekly (12 weeks), monthly (5 years), and yearly (10 years). With Enhanced Policy, you can back up as often as every 4 hours and keep instant restore snapshots longer (1-30 days, compared with 1-5 days for Standard Policy). The policy applies to all VMs associated with it. For SQL/SAP HANA, you can set log backup frequency (every 15 minutes).
Configure Backup for Azure VMs
In the vault, go to Backup and select Azure Virtual Machine. Choose the VM to back up. The Azure Backup extension is automatically installed on the VM. Select the backup policy. The first backup is a full copy; subsequent backups are incremental. Snapshots are taken for instant restore and kept locally alongside the VM's disks. The snapshot data is transferred to the vault asynchronously. You can monitor progress in Backup Jobs. Ensure the VM has outbound connectivity to Azure Backup endpoints (or use private endpoints).
Perform a Restore
In the vault, go to Backup Items, select the VM, and click Restore VM. Choose restore point (from snapshots or vault). If using snapshot, the restore is instant. You can choose to create a new VM or restore disks. For cross-region restore (if enabled), you can select the secondary region. You can also restore individual files from the VM backup by mounting the snapshot as a drive (for Windows) or using a script (for Linux). After restore, the new VM is created in the specified resource group and virtual network.
Configure Azure Site Recovery for Azure VM Replication
In the Recovery Services Vault, go to Site Recovery and select Enable Replication for Azure VMs. Choose source region (e.g., East US) and target region (e.g., West US). Create or select a cache storage account (must be in source region). Define replication policy: RPO threshold (default 15 min), retention (default 24 hours), app-consistent snapshot frequency (default 1 hour). ASR automatically creates target resources: virtual network, storage accounts, availability set/zone. Review and enable replication. Initial replication copies the entire VM disk; subsequent replications are incremental. Monitor replication health in the vault.
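The replication policy settings in this step have defined ranges, so a small validating model can catch out-of-range values before anything is deployed. The following is a sketch only: `ReplicationPolicy` is a hypothetical class, and the limits and defaults it encodes are the ones quoted in this chapter, not values read from an Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    """Hypothetical model of an ASR replication policy (chapter's figures)."""
    rpo_threshold_seconds: int = 15 * 60   # default 15 minutes
    retention_hours: int = 24              # default 24 hours
    app_consistent_minutes: int = 60       # default 1 hour

    def __post_init__(self) -> None:
        if self.rpo_threshold_seconds < 30:
            raise ValueError("RPO threshold cannot be below 30 seconds")
        if not 0 <= self.retention_hours <= 24:
            raise ValueError("recovery point retention must be 0-24 hours")
        if not 0 <= self.app_consistent_minutes <= 12 * 60:
            raise ValueError("app-consistent frequency must be 0-12 hours")


if __name__ == "__main__":
    print(ReplicationPolicy())
    print(ReplicationPolicy(app_consistent_minutes=30))
    try:
        ReplicationPolicy(retention_hours=48)
    except ValueError as err:
        print("rejected:", err)
```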
Perform a Test Failover
In the vault, under Site Recovery, select the replicated item and click Test Failover. Choose a recovery point (latest, latest processed, latest app-consistent, or custom). Select an isolated Azure virtual network for the test (do not use production network). The test failover creates a VM in the target region using the selected recovery point. It does not affect the source VM or ongoing replication. After testing, click Cleanup test failover to delete the test VM and resources. This validates the failover process without disruption.
Perform an Unplanned Failover
In a disaster scenario, in the vault, select the replicated item and click Failover. Choose a recovery point (latest processed, latest app-consistent, or latest multi-VM consistent). Select the target network. Optionally, shut down the source VM before failover (recommended). Confirm failover. The target VM starts in the target region. After failover, you can commit the failover (makes it permanent) or re-protect the source VM. For failback, you need to reprotect the target VM to the source region and then fail over again.
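The order of operations across test failover, failover, commit, reprotect, and failback can be pictured as a small state machine. The state and action names below are invented for illustration and do not correspond to ASR API values; the allowed transitions follow the steps described above.

```python
# Hypothetical states and actions; transitions follow the steps in the text.
TRANSITIONS = {
    ("protected", "test_failover"): "testing",
    ("testing", "cleanup_test_failover"): "protected",
    ("protected", "failover"): "failed_over",
    ("failed_over", "commit"): "committed",
    ("failed_over", "reprotect"): "reverse_protected",
    ("committed", "reprotect"): "reverse_protected",
    # Failing over again from the reverse-protected state is the failback.
    ("reverse_protected", "failover"): "failed_back",
}


def run(actions: list[str], state: str = "protected") -> str:
    """Apply actions in order and return the final state.

    Raises ValueError if an action is not allowed in the current state.
    """
    for action in actions:
        key = (state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {action!r} while {state!r}")
        state = TRANSITIONS[key]
    return state


if __name__ == "__main__":
    print(run(["test_failover", "cleanup_test_failover"]))
    print(run(["failover", "commit", "reprotect", "failover"]))
```

Modelling it this way makes the exam-relevant constraint visible: failback is not a separate button, it is a second failover that is only possible after reprotecting in the reverse direction.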
Enterprise Scenario 1: Financial Services Company with Compliance Requirements
A financial services company must retain backup data for 7 years for regulatory compliance. They use Azure Backup with a Recovery Services Vault configured with GRS and a custom backup policy: daily backups retained for 30 days, weekly for 12 weeks, monthly for 5 years, and yearly for 10 years. They enable soft delete and immutable vault to prevent deletion by administrators. They also use Azure Backup Reports to audit backup compliance. The challenge is cost: long retention increases storage costs. They optimize by using archive tier for backups older than 6 months, which reduces storage costs by up to 70%. They also use Azure Policy to enforce backup on all production VMs. Misconfiguration often occurs when retention rules are not applied correctly, leading to premature deletion of data. They test restore quarterly to ensure RTO of 4 hours is met.
Enterprise Scenario 2: E-commerce Platform with Low RPO/RTO Requirements
An e-commerce platform requires RPO of 15 minutes and RTO of 1 hour for critical application VMs. They use Azure Site Recovery to replicate VMs from primary region (West Europe) to secondary region (North Europe). They configure replication policy with RPO threshold of 15 minutes and app-consistent snapshots every 30 minutes. They use a cache storage account with premium performance to minimize latency. They perform test failover monthly to ensure the DR plan works. During a regional outage, they perform unplanned failover; the application is back online in 45 minutes. Post-failover, they reprotect the secondary VMs to the primary region for failback. A common pitfall is not updating the DNS records after failover, causing client traffic to still go to the primary region. They use Azure Traffic Manager to automatically redirect traffic based on endpoint health.
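A drill is only useful if its result is compared against the stated objectives. The sketch below checks one drill against the scenario's targets (RPO 15 minutes, RTO 1 hour). `DrillResult` and `meets_objectives` are hypothetical names, and the 5 minutes of data loss in the usage example is an illustrative figure that does not come from the scenario.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrillResult:
    """Measurements from one failover drill (hypothetical structure)."""
    minutes_to_recover: float   # time until the application was back online
    data_loss_minutes: float    # age of the recovery point that was used


def meets_objectives(result: DrillResult, rto_minutes: float = 60,
                     rpo_minutes: float = 15) -> dict[str, bool]:
    """Compare a drill against the RTO and RPO targets."""
    return {
        "rto_met": result.minutes_to_recover <= rto_minutes,
        "rpo_met": result.data_loss_minutes <= rpo_minutes,
    }


if __name__ == "__main__":
    # 45 minutes matches the scenario; the data loss value is assumed.
    print(meets_objectives(DrillResult(minutes_to_recover=45, data_loss_minutes=5)))
```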
Enterprise Scenario 3: Healthcare Provider with Hybrid Workloads
A healthcare provider has on-premises SQL Server databases and Azure VMs. They use Azure Backup Server (MABS) to back up on-premises SQL databases to a Recovery Services Vault. They also use Azure Backup for Azure VMs. They enable private endpoints for backup traffic to avoid internet exposure. They use Azure Backup Center for centralized monitoring. They have a requirement to restore individual files from VM backups; they use file-level restore from the VM backup snapshot. A common issue is that the MABS agent becomes outdated, causing backup failures. They implement automatic patching via Azure Update Management. They also use Azure Site Recovery for on-premises Hyper-V VMs to Azure for disaster recovery, with a replication policy of 5 minutes RPO.
Exactly What AZ-305 Tests on Backup and Recovery
The AZ-305 exam (Objective 3.1) focuses on designing a backup and disaster recovery solution. You need to differentiate between Azure Backup and Azure Site Recovery based on RPO, RTO, data type, and cost. Specific topics tested:
Recovery Services Vault: Storage replication types (LRS, GRS, ZRS) and when to use each (GRS for cross-region restore, LRS for cost savings).
Backup Policy: Default snapshot retention (2 days), most frequent schedule (daily for standard, every 4 hours for enhanced), retention rules.
Cross-Region Restore: Enabled at the vault level, only works with GRS, allows restore to the paired region.
Soft Delete: Enabled by default, retention period of 14 days.
Azure Site Recovery: Default RPO threshold (15 minutes), minimum RPO (30 seconds for Azure-to-Azure), recovery point retention (0-24 hours).
Failover Types: Planned (no data loss), unplanned (may lose data), test failover (isolated).
Backup for On-Premises: MARS agent for files/folders, MABS for workloads, DPM for advanced scenarios.
Security: Private endpoints, encryption at rest (SSE or CMK), encryption in transit (HTTPS).
Common Wrong Answers and Why Candidates Choose Them
Choosing Azure Backup when ASR is needed: Candidates often pick Azure Backup for disaster recovery because it is more familiar. But ASR is required for low RPO/RTO (minutes) and automated failover. Azure Backup is for backup and restore, not failover.
Selecting LRS for production: Candidates may choose LRS to save costs, but the exam expects GRS for production to enable cross-region restore. LRS is for dev/test.
Enabling cross-region restore with LRS: Cross-region restore requires GRS. Candidates may think it works with any replication type.
Setting snapshot retention to 0 days: Snapshot retention must be between 1 and 5 days, so 0 is not a valid value. Candidates may pick 0 hoping to save costs, but instant restore snapshots cannot be switched off this way.
Using ASR for file-level recovery: ASR is for full VM recovery, not individual files. For file-level restore, use Azure Backup's file recovery.
Specific Numbers and Values to Memorize
Snapshot retention: 2 days default, 1-5 days configurable.
Soft delete retention: 14 days.
ASR default RPO threshold: 15 minutes.
ASR minimum RPO: 30 seconds (Azure-to-Azure).
ASR recovery point retention: 0-24 hours.
Backup frequency: daily (standard), 4 hours minimum (enhanced).
MARS agent: supports files, folders, system state.
Vault storage replication: LRS, GRS, ZRS.
Edge Cases and Exceptions
Azure Backup for Azure VMs does not support cross-region failover; only cross-region restore (restore to secondary region, not failover).
Azure Site Recovery does not support cross-region backup; it supports replication to a secondary region.
Backup of Azure Files uses snapshots; you cannot use ASR for Azure Files.
SQL Server backup uses native backup integration; you can set log backup frequency as low as every 15 minutes.
SAP HANA backup uses Backint interface; requires specific configuration.
How to Eliminate Wrong Answers
If the question mentions failover, automated recovery, or low RPO (< 1 hour), eliminate Azure Backup and choose ASR.
If the question mentions long-term retention (years) or file-level restore, eliminate ASR and choose Azure Backup.
If the question mentions cross-region restore, ensure the vault uses GRS.
If the question mentions cost optimization, consider LRS for dev/test, archive tier for old backups.
If the question mentions on-premises backup, consider MARS or MABS.
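The elimination rules above amount to a keyword lookup, which the sketch below makes explicit. It is a deliberately naive study aid: the `RULES` table and the `hints` function are hypothetical, matching is plain substring search, and a phrase such as "not failover" would still trigger the failover rule.

```python
# Each rule pairs trigger phrases with the hint from the list above.
RULES = [
    (("failover", "automated recovery", "low rpo"),
     "Prefer Azure Site Recovery; eliminate Azure Backup"),
    (("long-term retention", "file-level"),
     "Prefer Azure Backup; eliminate Azure Site Recovery"),
    (("cross-region restore",),
     "Vault must use GRS"),
    (("cost",),
     "Consider LRS for dev/test and archive tier for old backups"),
    (("on-premises",),
     "Consider MARS or MABS"),
]


def hints(question: str) -> list[str]:
    """Return every hint whose trigger phrases appear in the question."""
    text = question.lower()
    return [hint for keywords, hint in RULES
            if any(keyword in text for keyword in keywords)]


if __name__ == "__main__":
    for line in hints("Design long-term retention at the lowest cost."):
        print(line)
```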
Recovery Services Vault is the management container for both Azure Backup and Azure Site Recovery; it stores backup data and Site Recovery configuration.
For production workloads, use GRS to enable cross-region restore.
Soft delete is enabled by default and retains deleted backup data for 14 days.
Azure Backup instant restore snapshots are retained for 2 days by default (configurable 1-5 days).
Azure Site Recovery default RPO threshold is 15 minutes; minimum RPO is 30 seconds for Azure-to-Azure replication.
Cross-region restore is enabled at the vault level and requires GRS.
Azure Backup supports file-level restore for Azure VMs via snapshot mounting.
For on-premises backup, use MARS agent for files/folders, MABS for workloads.
Test failover in ASR is isolated and does not affect production.
Use Azure Policy to enforce backup on all VMs.
Archive tier for Azure Backup reduces storage costs for older backups.
Azure Backup does not support cross-region failover; only cross-region restore.
These come up on the exam all the time. Here's how to tell them apart.
Azure Backup
Purpose: Backup and restore of data, not failover.
RPO: Typically 24 hours (daily), can be 4 hours with Enhanced Policy.
RTO: Minutes (instant restore) to hours (vault restore).
Retention: Long-term retention possible (years).
Use case: Archival, compliance, file-level restore.
Azure Site Recovery
Purpose: Disaster recovery with automated failover.
RPO: As low as 30 seconds (Azure-to-Azure), default 15 min.
RTO: Typically 30 minutes to 2 hours.
Retention: Short retention (0-24 hours) for recovery points.
Use case: Critical apps with low RPO/RTO, cross-region failover.
Mistake
Azure Backup can be used for disaster recovery with automated failover.
Correct
Azure Backup provides backup and restore, not automated failover. For disaster recovery with failover, use Azure Site Recovery. Azure Backup can restore to a secondary region via cross-region restore, but it is a manual restore, not a failover.
Mistake
Cross-region restore works with any storage replication type.
Correct
Cross-region restore requires the Recovery Services Vault to be configured with geo-redundant storage (GRS). LRS and ZRS do not replicate data to a secondary region, so cross-region restore is not available.
Mistake
Soft delete permanently deletes backup data after 14 days.
Correct
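Soft delete applies only to backup data that has been deleted; it does not expire backups on its own. Deleted data is held for 14 days, during which it can be recovered, and is permanently removed only after that window. Backups that were never deleted follow the retention policy. You can disable soft delete, but it is enabled by default.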
Mistake
Azure Site Recovery replicates data to a Recovery Services Vault.
Correct
ASR uses a cache storage account in the source region for replication, and the replicated data is stored in the target region's managed disks, not in the vault. The vault stores metadata and configuration, not the actual replicated data.
Mistake
Azure Backup can back up Azure VMs to a different region directly.
Correct
Azure Backup backs up VMs to a vault in the same region as the VM. With GRS, the vault's data is also replicated to the paired region, and enabling cross-region restore lets you restore from that replica into the secondary region. Alternatively, you can use ASR to replicate the VM itself to another region.
Azure Backup is for backup and restore of data, with RPO typically 24 hours and RTO from minutes to hours. It supports long-term retention and file-level restore. Azure Site Recovery is for disaster recovery with automated failover, offering RPO as low as 30 seconds and RTO of 30 minutes to 2 hours. Use Azure Backup for compliance and archival; use ASR for critical workloads that need fast recovery.
No, Azure Backup does not support failover. It only supports restore. For failover, use Azure Site Recovery. With cross-region restore enabled, you can restore a VM to a secondary region, but this is a manual restore, not a failover.
The default snapshot retention is 2 days. You can configure it between 1 and 5 days. Snapshots are used for instant restore and are kept locally alongside the VM's disks rather than in the vault.
Cross-region restore is enabled at the vault level, either when creating the Recovery Services Vault or later in the vault's backup configuration properties. The vault must use GRS storage. To use cross-region restore, go to Backup Items, select a restore point, and choose the secondary region.
The minimum RPO is 30 seconds for Azure-to-Azure replication. The default RPO threshold is 15 minutes. You can set it as low as 30 seconds in the replication policy.
Yes, Azure Backup supports file-level restore for Azure VMs. For Windows VMs, you can mount the snapshot as a drive and copy files. For Linux VMs, you can run a script that mounts the snapshot.
Soft delete is a feature that retains deleted backup data for 14 days, allowing you to recover it. It is enabled by default for Azure VM backups. After 14 days, the data is permanently deleted.