This chapter covers Resource Health and Azure Service Health, two critical monitoring tools in Azure. For the AZ-104 exam, these topics fall under Domain 5 (Monitoring), Objective 5.1: Monitor Azure resources. Expect 5-10% of exam questions to touch on health monitoring, including interpreting health states, configuring alerts, and understanding the difference between Resource Health and Service Health. You'll need to know exactly what each service does, how to access it, and how to respond to health events.
Imagine you own a fleet of delivery vans (your Azure resources). Each van has a dashboard that shows its own status: engine temperature, fuel level, tire pressure. This is like Resource Health — it monitors a specific resource (van) and tells you if it's available, degraded, or unavailable. Now, the fleet manager has a central monitoring station that tracks all vans, but also receives alerts about road closures, weather warnings, or regional fuel shortages affecting multiple vans. That's Azure Service Health — it provides a global view of Azure services and regions, including planned maintenance and outages. The dashboard in each van gives you real-time, resource-level health (Resource Health), while the central station gives you service-wide and subscription-level alerts (Service Health). If a van's tire is flat, the dashboard shows 'degraded' (Resource Health). If a major highway is closed due to a storm, the central station alerts all drivers (Service Health). Both are needed: Resource Health helps you troubleshoot a specific VM, while Service Health tells you if an Azure region is having problems that might affect multiple resources.
What Are Resource Health and Azure Service Health?
Azure provides two distinct but complementary health monitoring services: Resource Health and Azure Service Health. Both are part of the Azure Monitor suite, but they serve different purposes.
Resource Health gives you a personalized dashboard of the health of your individual Azure resources (e.g., a specific VM, storage account, or SQL database). It detects whether a resource is available, degraded, or unavailable, and provides root cause analysis (RCA) when issues occur. It runs checks every 1-2 minutes and reports status in near real-time.
Azure Service Health is a broader service that tracks the health of Azure services across regions. It provides information about:
- Service issues: Ongoing problems affecting Azure services (e.g., a regional outage).
- Planned maintenance: Upcoming maintenance that may impact your resources.
- Health advisories: Changes that require your attention, such as service retirements or feature deprecations.
Both services are built on the same underlying health monitoring infrastructure, but Resource Health operates at the resource level, while Service Health operates at the service and subscription level.
How Resource Health Works Internally
Resource Health uses a series of health checks performed by the Azure Resource Health service. These checks are executed from multiple Azure datacenter locations to avoid single points of failure. The checks include:
- Platform health checks: Verify that the underlying Azure platform (e.g., host server, network, storage) is working correctly.
- VM health checks: For virtual machines, Resource Health checks whether the VM is running and has network connectivity, as seen from the platform. It does not probe processes inside the guest OS.
- Application health checks: For services like App Service, it checks if the application pool is running and responding.
Each check returns one of three states:
- Available: The resource is healthy and functioning normally.
- Degraded: The resource is functioning but with reduced performance or availability. For example, a VM might be running but with high latency to its disk.
- Unavailable: The resource is not functioning. For example, a VM has stopped.
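The states above can be modeled as a small enumeration. The sketch below is only an illustration for study purposes, not part of any Azure SDK: `HealthState`, `parse_state`, `next_step`, and the triage notes are hypothetical names and wording. It also includes the Unknown state, which the platform reports when it cannot determine health.

```python
from enum import Enum


class HealthState(Enum):
    AVAILABLE = "Available"
    DEGRADED = "Degraded"
    UNAVAILABLE = "Unavailable"
    UNKNOWN = "Unknown"


# Hypothetical triage notes; adapt them to your own runbooks.
NEXT_STEP = {
    HealthState.AVAILABLE: "No platform issue reported; look at the guest OS or application.",
    HealthState.DEGRADED: "Resource works with reduced performance; investigate before users are affected.",
    HealthState.UNAVAILABLE: "Resource is not functioning; read the reported cause and recommended steps.",
    HealthState.UNKNOWN: "Platform could not determine health; re-check before assuming a failure.",
}


def parse_state(text: str) -> HealthState:
    """Map a reported state string to HealthState, defaulting to UNKNOWN."""
    normalized = text.strip().capitalize()
    try:
        return HealthState(normalized)
    except ValueError:
        return HealthState.UNKNOWN


def next_step(text: str) -> str:
    """Return the triage note for a reported state string."""
    return NEXT_STEP[parse_state(text)]
```

Treating any unrecognized string as Unknown keeps the sketch conservative: it never reports a resource as healthy by accident.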
Resource Health also provides historical health data for up to 30 days, allowing you to review past incidents.
How Azure Service Health Works Internally
Azure Service Health aggregates health information from all Azure regions and services. It uses a global monitoring system that collects telemetry from Azure datacenters worldwide. When a problem is detected (e.g., a regional power outage), the system creates a service health event that can be viewed in the Azure portal or queried via the Azure Resource Manager API.
Service Health events are categorized as:
- Service Issue: An ongoing problem affecting one or more Azure services in a region.
- Planned Maintenance: Scheduled updates that may impact your resources.
- Health Advisory: Important changes or recommendations (e.g., the retirement of TLS 1.0 and 1.1).
- Security Advisory: Security-related announcements (e.g., a vulnerability disclosure).
You can configure Service Health alerts that notify you through Azure Monitor action groups (email, SMS, webhook, and other action types) when a service health event occurs that matches your criteria (e.g., a service issue affecting your subscription's region).
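To make "matches your criteria" concrete, here is a minimal sketch of how an alert filter might be evaluated against an event. It is a study model only: `ServiceHealthEvent`, `AlertCriteria`, and the field names are hypothetical and do not mirror the Azure API. An empty filter set is assumed to mean "match any value".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceHealthEvent:
    event_type: str  # e.g. "Service Issue" or "Planned Maintenance"
    service: str     # e.g. "Virtual Machines"
    region: str      # e.g. "East US"


@dataclass(frozen=True)
class AlertCriteria:
    # An empty set means "match any value" for that field (assumption).
    event_types: frozenset = frozenset()
    services: frozenset = frozenset()
    regions: frozenset = frozenset()

    def matches(self, event: ServiceHealthEvent) -> bool:
        def allowed(values: frozenset, value: str) -> bool:
            return not values or value in values

        return (
            allowed(self.event_types, event.event_type)
            and allowed(self.services, event.service)
            and allowed(self.regions, event.region)
        )


criteria = AlertCriteria(
    event_types=frozenset({"Service Issue"}),
    regions=frozenset({"East US"}),
)
outage = ServiceHealthEvent("Service Issue", "Virtual Machines", "East US")
print(criteria.matches(outage))
```

Every filter must pass for the alert to fire, so a rule that is too narrow (for example, one region only) stays silent for events elsewhere.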
Key Components and Defaults
Resource Health state: Available, Degraded, Unavailable. These are determined by automated checks every 1-2 minutes.
Resource Health history: Stored for 30 days.
Service Health event types: Service Issue, Planned Maintenance, Health Advisory, Security Advisory.
Service Health alerts: You can create alerts based on service, region, event type, and severity.
Azure Service Health portal: https://status.azure.com (public status page) and the Azure portal's Service Health blade (subscription-specific).
Azure Resource Health blade: In the Azure portal, under each resource's 'Help' section.
Configuration and Verification
View Resource Health for a VM:
1. In the Azure portal, navigate to the VM.
2. Under 'Help', click 'Resource health'.
3. You'll see the current health state and any past events.
View Azure Service Health:
1. In the Azure portal, search for 'Service Health'.
2. The dashboard shows active service issues, planned maintenance, and health advisories.
Create a Service Health Alert using Azure CLI:
az monitor activity-log alert create \
  --resource-group MyResourceGroup \
  --name ServiceHealthAlert \
  --condition category=ServiceHealth \
  --action-group /subscriptions/.../actionGroupName
Query Resource Health via Azure Resource Graph:
healthresources
| where type =~ 'microsoft.resourcehealth/availabilitystatuses'
| where properties.availabilityState == 'Unavailable'
Interaction with Related Technologies
Azure Monitor: Both Resource Health and Service Health are integrated with Azure Monitor. You can create alerts based on health changes using Azure Monitor action groups.
Azure Advisor: Advisor can recommend actions based on health issues (e.g., configuring availability sets).
Azure Automation: You can trigger runbooks when a health event occurs (e.g., auto-remediate a degraded VM).
Log Analytics: Resource Health data can be sent to Log Analytics for advanced querying and correlation with other logs.
Exam-Relevant Details
Resource Health checks are not performed by the guest OS; they are platform-level checks. For VMs, you can enable guest-level monitoring (via VM insights, formerly Azure Monitor for VMs) to get deeper health insights.
Resource Health can show 'Unknown' state if the platform cannot determine health (e.g., due to network issues).
Service Health events are subscription-scoped. You cannot create Service Health alerts for resources in other subscriptions unless you have cross-subscription access.
The Azure Status page (status.azure.com) shows global Azure health, but it does not include subscription-specific planned maintenance or advisories. Use the Azure portal for subscription-specific info.
Planned maintenance events include a maintenance window (start and end time). You can schedule maintenance for some services (e.g., VMs) to avoid disruption.
Common Misconfigurations
Not configuring Service Health alerts: Many admins rely on the public status page, but that doesn't notify you of issues affecting your specific resources. Always create Service Health alerts.
Confusing Resource Health with VM availability metrics: Resource Health shows platform health, not guest OS health. A VM might be 'Available' in Resource Health but unresponsive due to a guest OS crash.
Ignoring 'Degraded' state: A degraded resource might still be working but with performance issues. Ignoring it can lead to SLA breaches.
Access Resource Health for a VM
Navigate to the Azure portal, select your VM, and under 'Help' click 'Resource health'. The dashboard shows the current health state (Available, Degraded, Unavailable) and a timeline of past events. If the state is 'Unknown', it means the platform cannot reach the resource. This is the first step in troubleshooting a VM issue.
Interpret Resource Health State
The health state is determined by automated checks from Azure's platform. 'Available' means all checks passed. 'Degraded' means at least one check failed (e.g., high latency to storage). 'Unavailable' means the resource is not accessible (e.g., VM stopped). Click on each event to see the root cause and recommended steps.
Create a Service Health Alert
In the Azure portal, go to 'Service Health' and click 'Create service health alert'. Define the scope (subscription), filter by service (e.g., Virtual Machines), region, and event type (Service Issue, Planned Maintenance, etc.). Then configure an action group (email, SMS, webhook). This ensures you are notified when an Azure issue affects your resources.
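Behind the portal form, a Service Health alert is an activity log alert rule with a filter condition. The sketch below builds what such a condition might look like as a Python dictionary. Treat the details as assumptions to verify against the current ARM template reference: the `properties.incidentType` field name, the mapping from portal event-type names to values such as `Incident` and `Maintenance`, and the helper name `build_condition` (hypothetical).

```python
# Assumed mapping from portal event-type names to activity-log incidentType values.
INCIDENT_TYPE = {
    "Service Issue": "Incident",
    "Planned Maintenance": "Maintenance",
    "Health Advisory": "Informational",
    "Security Advisory": "Security",
}


def build_condition(event_types):
    """Build the 'condition' section of an activity log alert rule (sketch)."""
    # Every Service Health alert filters on the ServiceHealth category.
    clauses = [{"field": "category", "equals": "ServiceHealth"}]
    if event_types:
        # Any one of the selected event types is enough to match.
        clauses.append({
            "anyOf": [
                {"field": "properties.incidentType", "equals": INCIDENT_TYPE[name]}
                for name in event_types
            ]
        })
    return {"allOf": clauses}


condition = build_condition(["Service Issue", "Planned Maintenance"])
```

Passing an empty list leaves only the category filter, which corresponds to alerting on every Service Health event type.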
Review Planned Maintenance Events
In the Service Health blade, click 'Planned maintenance' to see upcoming maintenance that may affect your resources. Each event includes a maintenance window, affected services, and impact description. You can reschedule some maintenance (e.g., for VMs) using the 'Reschedule' option if available.
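When deciding whether to reschedule, the practical question is whether the maintenance window collides with something important on your side. A minimal sketch of that check, assuming both windows are known as UTC start and end times (the dates and the `windows_overlap` helper are hypothetical):

```python
from datetime import datetime, timezone


def windows_overlap(start_a, end_a, start_b, end_b) -> bool:
    """True when two half-open intervals [start, end) share any time."""
    return start_a < end_b and start_b < end_a


# Hypothetical maintenance window taken from a planned maintenance event.
maintenance = (
    datetime(2030, 1, 10, 22, 0, tzinfo=timezone.utc),
    datetime(2030, 1, 11, 2, 0, tzinfo=timezone.utc),
)
# Hypothetical release window owned by your team.
release = (
    datetime(2030, 1, 11, 1, 0, tzinfo=timezone.utc),
    datetime(2030, 1, 11, 3, 0, tzinfo=timezone.utc),
)

print(windows_overlap(*maintenance, *release))
```

Using half-open intervals means a release that starts exactly when maintenance ends is not counted as a conflict.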
Use Azure Resource Graph for Bulk Health
To check the health of all resources in a subscription, use Azure Resource Graph with a Kusto query like: healthresources | where type =~ 'microsoft.resourcehealth/availabilitystatuses' | project tostring(properties.targetResourceId), tostring(properties.availabilityState). This returns each resource ID with its health state, which is useful for large-scale audits.
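Once the query results are exported, a short script can turn them into an audit summary. The sketch below assumes the rows have already been loaded as dictionaries with the hypothetical keys `resource_id` and `state`; it does not call Azure.

```python
from collections import Counter


def summarize(rows):
    """Return (count per state, sorted IDs of resources that are not Available)."""
    counts = Counter(row.get("state") or "Unknown" for row in rows)
    needs_attention = sorted(
        row["resource_id"] for row in rows if row.get("state") != "Available"
    )
    return counts, needs_attention


# Hypothetical sample rows standing in for exported query results.
sample_rows = [
    {"resource_id": "vm-web-01", "state": "Available"},
    {"resource_id": "vm-web-02", "state": "Degraded"},
    {"resource_id": "sql-orders", "state": "Unavailable"},
    {"resource_id": "vm-batch-01", "state": None},
]

counts, needs_attention = summarize(sample_rows)
```

Rows with a missing state are counted as Unknown and listed for attention, since a missing value says nothing about whether the resource is healthy.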
In a typical enterprise scenario, a company runs hundreds of VMs across multiple regions. The cloud operations team uses Resource Health to monitor individual VMs. For example, if a VM shows 'Degraded' due to high disk latency, the team can investigate whether the disk is throttled or if there's a regional storage issue. They also set up Service Health alerts for 'Service Issues' in their primary region. When an Azure regional outage occurs, they receive an alert within minutes, allowing them to failover to a secondary region using Azure Site Recovery.
Another scenario involves a SaaS provider that uses Azure SQL Database. They rely on Resource Health to detect when a database becomes unavailable. They have an automation runbook that triggers when a database enters 'Unavailable' state: the runbook attempts to restart the database or failover to a geo-replica. They also subscribe to Service Health advisories to get early warnings about SQL database version retirements.
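The trigger logic in that runbook scenario amounts to detecting a transition into the Unavailable state. A minimal sketch, assuming two snapshots that map hypothetical resource names to their reported state; the remediation itself (restart or failover) is out of scope here:

```python
def newly_unavailable(previous, current):
    """Return sorted IDs that are Unavailable now but were not in the previous snapshot."""
    return sorted(
        resource_id
        for resource_id, state in current.items()
        if state == "Unavailable" and previous.get(resource_id) != "Unavailable"
    )


# Hypothetical snapshots taken a few minutes apart.
previous = {"db-a": "Available", "db-b": "Unavailable"}
current = {"db-a": "Unavailable", "db-b": "Unavailable", "db-c": "Unavailable"}

to_remediate = newly_unavailable(previous, current)
```

Acting only on transitions avoids re-running remediation on every poll for a database that is already being handled.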
A common problem arises when administrators confuse Resource Health with guest OS monitoring. For example, a VM might be 'Available' per Resource Health, but the application inside the VM is down due to a misconfigured firewall. Resource Health does not check guest-level processes. To get application-level health, you need to use Azure Monitor for VMs or custom health probes.
Performance considerations: Resource Health checks run on the platform side every 1-2 minutes, so they add no load to your resources. However, if you have thousands of resources, an Azure Resource Graph query might take a few seconds. For near-real-time alerting, use Resource Health alerts that trigger on state changes.
When misconfigured, teams might miss critical alerts. For instance, if Service Health alerts are not configured for all regions, a regional outage in a secondary region might go unnoticed until users complain. Also, if action groups are misconfigured (e.g., wrong email address), notifications are lost.
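A quick way to catch the region gap described above is to compare the regions you deploy to against the regions your alert rules cover. The sketch below works on plain dictionaries with hypothetical keys (`regions`, `enabled`) and assumes that an enabled rule with no region filter covers every region.

```python
def uncovered_regions(regions_in_use, alert_rules):
    """Return the set of regions in use that no enabled alert rule covers."""
    covered = set()
    for rule in alert_rules:
        if not rule.get("enabled", True):
            continue  # disabled rules notify nobody
        regions = rule.get("regions") or []
        if not regions:
            return set()  # no region filter: assumed to cover all regions
        covered.update(regions)
    return set(regions_in_use) - covered


# Hypothetical inventory: workloads in two regions, one alert rule.
in_use = {"East US", "West Europe"}
rules = [{"name": "primary-region-issues", "regions": ["East US"]}]

gaps = uncovered_regions(in_use, rules)
```

Here the secondary region is reported as a gap, which is exactly the case where an outage could go unnoticed until users complain.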
For AZ-104, you need to know the difference between Resource Health and Service Health. Objective 5.1 specifically tests your ability to 'monitor Azure resources', which includes interpreting health states and configuring alerts. Common exam questions ask:
- 'Which service shows you the health of a specific VM?' Answer: Resource Health.
- 'Where do you find information about planned maintenance?' Answer: Service Health (Planned maintenance blade).
- 'What does a "Degraded" state mean?' Answer: The resource is functioning but with reduced performance.
Trap patterns:
1. Choosing 'Azure Monitor' instead of 'Resource Health' when asked about resource-level health. Azure Monitor is the overarching platform; Resource Health is the specific service.
2. Confusing 'Service Health' with 'Azure Status' (status.azure.com). The public status page shows global health, not subscription-specific events. Service Health in the portal shows both.
3. Thinking Resource Health checks guest OS health. It does not. It checks platform health only. Guest OS monitoring requires Azure Monitor for VMs.
4. Assuming 'Unavailable' means the resource is deleted. It means the resource is not accessible (e.g., VM stopped, network issue).
Numbers and terms:
- Resource Health states: Available, Degraded, Unavailable, Unknown.
- Service Health event types: Service Issue, Planned Maintenance, Health Advisory, Security Advisory.
- Resource Health history retention: 30 days.
- Resource Health check interval: 1-2 minutes.
Edge cases:
- If a resource is in 'Unknown' state, it could be due to a network issue between the health check system and the resource. This does not necessarily mean the resource is unhealthy.
- Service Health alerts can be created for multiple subscriptions using Azure Management Groups.
- Some services (like Azure DNS) do not support Resource Health because they are global services.
How to eliminate wrong answers: Focus on the scope. If the question mentions a specific resource (e.g., 'a VM'), the answer is Resource Health. If it mentions a service or region, it's Service Health. Also, remember that Resource Health is typically opened from the resource's own blade, although the Service Health blade also offers a Resource health view across the subscription.
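The scope rule can be written down as a simple lookup, which some readers find easier to memorize. This is a study aid only; the scope labels and the `pick_tool` function are hypothetical.

```python
# Scope mentioned in the question -> tool to choose (labels are illustrative).
TOOL_BY_SCOPE = {
    "resource": "Resource Health",        # a specific VM, database, storage account
    "service": "Azure Service Health",    # an Azure service affecting your subscription
    "region": "Azure Service Health",     # a regional issue or planned maintenance
    "global": "Azure Status page",        # public, not subscription-specific
}


def pick_tool(scope: str) -> str:
    """Return the health tool that fits the scope named in an exam question."""
    try:
        return TOOL_BY_SCOPE[scope.strip().lower()]
    except KeyError:
        raise ValueError(f"unknown scope: {scope!r}") from None


print(pick_tool("resource"))
```

Raising an error for an unrecognized scope, rather than guessing, mirrors the exam advice: identify the scope first, then pick the tool.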
Resource Health checks run every 1-2 minutes and report state as Available, Degraded, Unavailable, or Unknown.
Resource Health does not monitor guest OS; it only checks platform health.
Azure Service Health provides subscription-specific information about service issues, planned maintenance, and advisories.
Service Health alerts use action groups (email, SMS, webhook, etc.) to notify you.
Resource Health history is stored for 30 days.
The Azure Status page (status.azure.com) shows global health, not subscription-specific events.
A 'Degraded' state means the resource is functioning but with reduced performance.
You can query Resource Health for all resources using Azure Resource Graph.
These come up on the exam all the time. Here's how to tell them apart.
Resource Health
Monitors individual resources (e.g., a specific VM).
States: Available, Degraded, Unavailable, Unknown.
Accessible from each resource's blade.
Provides root cause analysis for resource-level issues.
History retained for 30 days.
Azure Service Health
Monitors Azure services and regions globally.
Event types: Service Issue, Planned Maintenance, Health Advisory, Security Advisory.
Accessible from the Service Health blade in the portal.
Provides information about service-wide issues and planned changes.
Alerts can be configured for specific subscriptions.
Mistake
Resource Health monitors the guest operating system of a VM.
Correct
Resource Health only monitors the Azure platform health (host, network, storage). It does not check guest OS processes or application health. Use Azure Monitor for VMs for guest-level monitoring.
Mistake
Service Health and the Azure Status page (status.azure.com) show the same information.
Correct
The Azure Status page shows global Azure service health, but it does not include subscription-specific planned maintenance or health advisories. Service Health in the Azure portal provides personalized information for your subscription.
Mistake
A resource in 'Degraded' state means it will become unavailable soon.
Correct
'Degraded' means the resource is still operational but with reduced performance or availability. It may recover or worsen, but it is not necessarily a precursor to 'Unavailable'.
Mistake
Resource Health alerts are the only way to get notified of resource issues.
Correct
Resource Health alerts are one option among several. You can also use Azure Monitor metrics and logs to detect issues. For example, you can set a metric alert on VM CPU utilization.
Mistake
You can create Service Health alerts for any resource in any subscription.
Correct
Service Health alerts are scoped to a subscription. You cannot create an alert for resources in another subscription unless you have appropriate permissions.
Resource Health monitors the health of individual Azure resources (e.g., a VM, storage account) and reports states like Available, Degraded, Unavailable. Azure Service Health monitors the health of Azure services and regions, providing information about service issues, planned maintenance, and advisories that affect your subscription. Use Resource Health for resource-level troubleshooting, and Service Health for understanding broader Azure issues.
In the Azure portal, navigate to the VM, then under 'Help' click 'Resource health'. You'll see the current health state and a timeline of past events. If the VM is unavailable, you can see the root cause and recommended actions.
Yes, you can create Resource Health alerts that trigger when a resource's health state changes. Alternatively, you can use Azure Monitor metric alerts (e.g., VM availability metric) or log alerts based on Resource Health data sent to Log Analytics.
The 'Unknown' state means that Azure Resource Health has not received health information for the resource. This could be due to a network issue, the resource being newly created, or the health check service being temporarily unavailable. It does not necessarily mean the resource is unhealthy.
Go to the Azure portal, search for 'Service Health', then click 'Planned maintenance'. You'll see a list of upcoming maintenance events that may affect your resources. Each event includes a maintenance window and affected services.
No, Resource Health is available for most Azure services but not all. For example, it is available for VMs, storage accounts, SQL databases, and App Services. Some global services like Azure DNS do not support Resource Health.
Resource Health retains health history for 30 days. You can view past events in the Resource Health blade for each resource.