This chapter covers Azure Policy Remediation Tasks, a critical feature for automatically correcting non-compliant resources after policy evaluation. Remediation is a key part of Azure Policy's 'deny, audit, and remediate' lifecycle and is heavily tested on the AZ-104 exam (approximately 10-15% of policy-related questions). You will learn the internal mechanics of how remediation tasks work, how to configure them, and common exam traps. By the end, you will understand when to use remediation versus other policy effects and how to troubleshoot failures.
Imagine a city building department with a master set of zoning laws (Azure Policy definitions). A developer submits plans for a new building (a resource creation request). Before any construction begins, the building inspector (Azure Policy evaluation) checks the plans against the zoning laws. If the plans violate a rule — say, the building is too tall — the inspector rejects the plans immediately (deny effect). But what if the building is already standing and is non-compliant? The inspector issues a correction order (remediation task). The remediation task doesn't rebuild the building; it hires a contractor (a managed identity) to apply a specific fix — like painting the facade to match the approved color. The contractor works autonomously, using the building's existing blueprints (the resource's current configuration) and the inspector's instructions (the policy definition's remediation template). The contractor logs each action, and once complete, the inspector rechecks the building. If the fix was applied correctly, the building becomes compliant. If not, the inspector issues another order. The key mechanic: the inspector never changes the building itself — it only orders the change. The contractor (remediation task) is a separate entity with its own identity and permissions, ensuring the inspector's role stays pure (evaluation only) while the contractor handles modification. This separation of concerns is critical: evaluation and remediation are distinct processes that must be independently authorized.
What is Azure Policy Remediation?
Azure Policy evaluates resources against policy definitions and applies an effect, such as Deny (block the request), Audit (log non-compliance), Append (add fields), or Modify (alter properties). The Modify effect and its companion, remediation tasks, are how Azure Policy can automatically fix non-compliant resources that already exist. Remediation is not a separate policy effect; it is a post-evaluation action that applies a Modify policy (or a DeployIfNotExists policy) to existing resources. This chapter concentrates on the Modify case.
Remediation tasks are created at the scope of a policy assignment (management group, subscription, resource group). When triggered, the task uses a system-assigned or user-assigned managed identity to execute a deployment template (called a 'remediation template') that makes the required changes. The task runs asynchronously and can be monitored via Azure Portal, CLI, or PowerShell.
Why Remediation Exists
Policy effects like Deny and Audit are passive: Deny prevents non-compliant resources from being created, but does nothing for existing ones. Audit only logs compliance state. Append can add tags or fields at creation time, but cannot modify existing resources. Only the Modify effect can change existing resources, but it does so only when a resource is created or updated (at evaluation time). To fix resources that were created before the policy was assigned, or that became non-compliant due to policy changes, you need a remediation task.
How Remediation Works Internally
Policy Assignment with Modify Effect: You assign a policy definition that uses the Modify effect. The definition includes a 'policyRule' with an 'effect' of 'Modify' and a 'details' section specifying the operations (e.g., add a tag, change SKU). The assignment also specifies a managed identity (system-assigned by default) with permissions to perform the modify operations.
Creating a Remediation Task: When you trigger a remediation task (via portal, CLI, or PowerShell), Azure Policy scans all resources within the assignment scope that are evaluated as non-compliant for that policy. It then creates a task that iterates over each non-compliant resource and applies the modifications defined in the policy's 'details'.
Execution: The task uses the managed identity to call the Azure Resource Manager (ARM) for each resource. For each resource, it constructs a PUT or PATCH request (depending on the operation) with the required changes. The task does not modify the resource directly; it submits an ARM request, which is then processed like any other ARM operation (including any other policies that might evaluate the request).
Completion: The task runs until all resources have been processed or it times out (default 2 hours). If a resource fails (e.g., due to permissions or conflict), the task logs the error and continues. You can retry failed resources individually.
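The four stages above can be modeled in a few lines of Python. This is a minimal sketch of the control flow only, not the Azure Policy engine: the `Resource` class, the `writable` flag (standing in for an RBAC check), and the `remediate` function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for an Azure resource; not an SDK type."""
    resource_id: str
    tags: dict = field(default_factory=dict)
    writable: bool = True  # False simulates a missing RBAC write permission


def remediate(resources, tag_name, tag_value):
    """Apply an 'add tag' operation to each resource that lacks the tag.

    Returns a dict mapping resource_id to 'Succeeded', 'Failed', or 'Skipped'.
    """
    results = {}
    for res in resources:
        if tag_name in res.tags:
            # The tag is already present, so there is nothing to change.
            results[res.resource_id] = "Skipped"
        elif not res.writable:
            # Record the failure and continue with the remaining resources.
            results[res.resource_id] = "Failed"
        else:
            res.tags[tag_name] = tag_value
            results[res.resource_id] = "Succeeded"
    return results


vms = [
    Resource("vm-1"),
    Resource("vm-2", tags={"Environment": "Dev"}),
    Resource("vm-3", writable=False),
]
statuses = remediate(vms, "Environment", "Production")
print(statuses)
```

The point to notice is that one failed resource does not abort the loop, which matches the "logs the error and continues" behavior described above.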
Key Components and Defaults
Managed Identity: The identity used by the remediation task. By default, it's a system-assigned managed identity created at assignment time. You can also use a user-assigned managed identity for more control. The identity must have 'Contributor' or custom role with write permissions on the target resources.
Remediation Task Timeout: 2 hours per task. If the task cannot process all resources in that time, it fails with a timeout error. You can create a new task for remaining resources.
Parallelism: By default, a remediation task processes up to 10 resources concurrently (allowed range 1 to 30). This can be adjusted when the task is created, for example in the portal or with PowerShell.
Retry Logic: If a resource modification fails due to a transient error (e.g., conflict), the task retries up to 3 times with exponential backoff.
Prerequisite: The policy assignment must have the Modify effect enabled and a managed identity configured. The identity must have the necessary permissions before the task is created.
Configuration and Verification
Azure Portal: 1. Navigate to Policy > Remediation. 2. Select the policy assignment. 3. Click 'Create Remediation Task'. 4. The task runs immediately. You can monitor progress in the Remediation tasks pane.
Azure CLI:
# Create a remediation task for a policy assignment
az policy remediation create --name myRemediation --policy-assignment myAssignment
# List remediation tasks
az policy remediation list
# Show status of a task
az policy remediation show --name myRemediation
# Cancel a task
az policy remediation cancel --name myRemediation
PowerShell:
# Create a remediation task
Start-AzPolicyRemediation -Name myRemediation -PolicyAssignmentId "/subscriptions/{subId}/providers/Microsoft.Authorization/policyAssignments/myAssignment"
# Get remediation task status
Get-AzPolicyRemediation -Name myRemediation
# Remove a remediation task
Remove-AzPolicyRemediation -Name myRemediation
Verification: After remediation, check the resource's compliance state via Policy > Compliance. Resources that were successfully remediated should change from 'Non-compliant' to 'Compliant' once compliance has been re-evaluated.
Interaction with Related Technologies
Azure Policy Effects: Remediation tasks apply only to policies with the Modify or DeployIfNotExists effect. Deny, Audit, and Append do not have remediation tasks. If you need to fix resources for an Audit policy, you must manually modify them or use a different tool (e.g., Azure Automation).
Azure Blueprints: Blueprints can include policy assignments with remediation tasks. When you assign a blueprint, remediation tasks can be automatically triggered for existing resources.
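Azure Resource Manager: Remediation tasks use ARM to modify resources, so they are subject to ARM request throttling, which is applied per subscription and per security principal. The exact limits have changed over time (the long-standing figure for write requests was 1,200 per hour), so check the current ARM throttling documentation before a large-scale remediation, which may hit this limit.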
Azure Policy Exemptions: Resources with an exemption are not evaluated and thus not remediated. To remediate an exempt resource, you must first remove the exemption.
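Azure Policy Initiatives (Policy Sets): Remediation tasks can be created for an initiative assignment, but each task targets a single policy definition inside the initiative, selected by its definition reference ID (--definition-reference-id in the CLI, -PolicyDefinitionReferenceId in PowerShell). To remediate several definitions, create one task per definition.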
Common Parameters
When creating a remediation task via CLI or PowerShell, you can specify:
- --resource-discovery-mode (CLI) or -ResourceDiscoveryMode (PowerShell): Values are 'ExistingNonCompliant' (default) or 'ReEvaluateCompliance'. 'ExistingNonCompliant' only remediates resources currently shown as non-compliant. 'ReEvaluateCompliance' re-evaluates all resources in scope first, then remediates any that become non-compliant.
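- Parallel deployments (-ParallelDeploymentCount in PowerShell; parallelDeployments in the REST API): Number of resources to remediate concurrently. Microsoft's documentation gives a default of 10 and an allowed range of 1 to 30. Not every Azure CLI version exposes an equivalent flag, so check az policy remediation create --help.
- Failure threshold (-FailureThreshold in PowerShell): Share of resources that can fail before the task stops. The portal takes a percentage from 0 to 100 (default 100, meaning failures never stop the task); PowerShell takes the same value as a fraction from 0 to 1.0.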
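To make the relationship between these parameters concrete, here is a small sketch that assembles them into a request-style dictionary. The property names (`policyAssignmentId`, `resourceDiscoveryMode`, `parallelDeployments`, `failureThreshold.percentage`) follow my understanding of the remediation REST resource and should be treated as assumptions to verify against the current API reference; `build_remediation_properties` itself is a hypothetical helper, and nothing here calls Azure.

```python
import json

ALLOWED_MODES = {"ExistingNonCompliant", "ReEvaluateCompliance"}


def build_remediation_properties(assignment_id, mode="ExistingNonCompliant",
                                 parallel_deployments=None,
                                 failure_threshold_percent=None):
    """Build a request-style body for a remediation task (illustrative only)."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown resource discovery mode: {mode}")
    props = {"policyAssignmentId": assignment_id, "resourceDiscoveryMode": mode}
    if parallel_deployments is not None:
        if parallel_deployments < 1:
            raise ValueError("parallel_deployments must be at least 1")
        props["parallelDeployments"] = parallel_deployments
    if failure_threshold_percent is not None:
        if not 0 <= failure_threshold_percent <= 100:
            raise ValueError("failure threshold must be between 0 and 100")
        # Stored as a fraction (0.0 to 1.0) rather than a whole percentage.
        props["failureThreshold"] = {"percentage": failure_threshold_percent / 100}
    return {"properties": props}


body = build_remediation_properties(
    "/subscriptions/{subId}/providers/Microsoft.Authorization/policyAssignments/myAssignment",
    parallel_deployments=3,
    failure_threshold_percent=10,
)
print(json.dumps(body, indent=2))
```

Optional settings are left out of the body when not supplied, so the service-side defaults apply rather than values hard-coded in the script.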
Step-by-Step Example: Adding a Tag to All Non-Compliant VMs
Policy Definition:
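{
  "policyRule": {
    "if": {
      "allOf": [
        {
          "field": "type",
          "equals": "Microsoft.Compute/virtualMachines"
        },
        {
          "field": "tags.Environment",
          "exists": "false"
        }
      ]
    },
    "then": {
      "effect": "Modify",
      "details": {
        "roleDefinitionIds": [
          "/providers/Microsoft.Authorization/roleDefinitions/9980e02c-c2be-4d73-94e8-173b1dc7cf3c"
        ],
        "operations": [
          {
            "operation": "add",
            "field": "tags.Environment",
            "value": "Production"
          }
        ]
      }
    }
  }
}
The tag condition ensures that VMs which already carry the tag are reported as compliant, and roleDefinitionIds (here Virtual Machine Contributor) lists the roles the managed identity needs; a Modify definition is not valid without it.
Assignment: Assign this policy at subscription scope with a system-assigned managed identity. The identity needs 'Virtual Machine Contributor' role on the subscription.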
Remediation: Create a remediation task. The task will iterate over all non-compliant VMs and add the tag 'Environment: Production'.
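The effect of this example on a single resource can be traced with a toy evaluator. The sketch below uses a simplified version of the rule (one type condition and one add operation) and plain dictionaries for resources; `matches` and `apply_operations` are hypothetical helpers, and the "add never overwrites an existing tag" behavior is how I understand the add operation to differ from addOrReplace.

```python
POLICY_RULE = {
    "if": {"field": "type", "equals": "Microsoft.Compute/virtualMachines"},
    "then": {
        "effect": "Modify",
        "details": {
            "operations": [
                {"operation": "add", "field": "tags.Environment", "value": "Production"}
            ]
        },
    },
}


def matches(rule, resource):
    """Return True when the resource satisfies the rule's single 'if' condition."""
    cond = rule["if"]
    return resource.get(cond["field"]) == cond["equals"]


def apply_operations(rule, resource):
    """Return a copy of the resource with each 'add' tag operation applied."""
    updated = {**resource, "tags": dict(resource.get("tags", {}))}
    for op in rule["then"]["details"]["operations"]:
        prefix, _, tag_name = op["field"].partition(".")
        if op["operation"] == "add" and prefix == "tags":
            # Modeled so that 'add' sets the tag only when it is absent.
            updated["tags"].setdefault(tag_name, op["value"])
    return updated


vm = {"type": "Microsoft.Compute/virtualMachines", "tags": {"Owner": "ops"}}
disk = {"type": "Microsoft.Compute/disks", "tags": {}}

fixed = apply_operations(POLICY_RULE, vm) if matches(POLICY_RULE, vm) else vm
print(fixed["tags"])
print(matches(POLICY_RULE, disk))
```

The VM gains the Environment tag while keeping its Owner tag, and the disk is out of scope because its type does not match the rule.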
Identify Non-Compliant Resources
Before any remediation can occur, Azure Policy must determine which resources are non-compliant for the given policy assignment. This is done during the compliance evaluation scan, which runs automatically every 24 hours or on-demand. The scan checks each resource within the assignment scope against the policy rule. For the Modify effect, non-compliance means the resource does not have the desired configuration (e.g., missing a tag). The list of non-compliant resources is stored in the Azure Policy compliance state. When you create a remediation task with default mode 'ExistingNonCompliant', it uses this pre-existing list. If you choose 'ReEvaluateCompliance', the task triggers a fresh evaluation scan first, which can be time-consuming for large scopes.
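The difference between the two discovery modes is easiest to see when the stored compliance state is stale. The following sketch is a simplified model, assuming compliance is just "has the Environment tag"; `select_targets`, `cached_state`, and `has_tag` are hypothetical names, not part of any Azure SDK.

```python
def select_targets(cached_state, resources, is_compliant, mode="ExistingNonCompliant"):
    """Return the sorted resource ids that a remediation task would target."""
    if mode == "ExistingNonCompliant":
        # Trust the compliance results stored by the last evaluation scan.
        return sorted(rid for rid, state in cached_state.items()
                      if state == "NonCompliant")
    if mode == "ReEvaluateCompliance":
        # Ignore stored results and check every resource in scope again.
        return sorted(rid for rid, res in resources.items()
                      if not is_compliant(res))
    raise ValueError(f"unknown resource discovery mode: {mode}")


def has_tag(resource):
    return "Environment" in resource["tags"]


# Since the last scan, vm-1 was tagged by hand and vm-2 lost its tag.
cached_state = {"vm-1": "NonCompliant", "vm-2": "Compliant"}
resources = {
    "vm-1": {"tags": {"Environment": "Production"}},
    "vm-2": {"tags": {}},
}

existing = select_targets(cached_state, resources, has_tag)
fresh = select_targets(cached_state, resources, has_tag, mode="ReEvaluateCompliance")
print(existing, fresh)
```

With stale data the default mode selects vm-1 and misses vm-2, while re-evaluation selects only vm-2, at the cost of checking every resource in scope first.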
Create Remediation Task
The administrator creates a remediation task via portal, CLI, or PowerShell. The task is associated with a specific policy assignment. At creation, you can specify parameters like parallelism, failure threshold, and resource discovery mode. The task object is created in the Azure Policy service and is assigned a unique ID. Creation is asynchronous: the request returns once the task is queued, and the Azure Policy backend then schedules it for execution. While resources are being processed, the task's state is shown as 'Running'.
Task Execution Begins
The Azure Policy remediation engine picks up the task and begins processing. It uses the managed identity associated with the policy assignment. For each non-compliant resource, the engine constructs an ARM PUT or PATCH request that applies the modifications defined in the policy's 'details' section. The request is sent to the Azure Resource Manager. The engine can process multiple resources concurrently (default 10). It respects ARM throttling limits and will back off if rate-limited. Each resource modification is logged with a status (Succeeded, Failed, or Skipped).
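Concurrent processing with back-off on throttling can be sketched with a thread pool. This is an illustration of the pattern only, not how the service is implemented: `Throttled`, `with_backoff`, `make_fake_request`, and `run_task` are hypothetical, the "ARM call" is simulated in memory, and the retry count and delays are arbitrary values chosen to keep the example fast.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Throttled(Exception):
    """Hypothetical signal that the simulated ARM call was rate-limited."""


def with_backoff(call, retries=3, base_delay=0.01):
    """Retry `call` on Throttled, doubling the wait after each attempt."""
    for attempt in range(retries + 1):
        try:
            return call()
        except Throttled:
            if attempt == retries:
                return "Failed"
            time.sleep(base_delay * 2 ** attempt)


def make_fake_request(throttle_times):
    """Return a callable that is throttled `throttle_times` times, then succeeds."""
    remaining = [throttle_times]

    def request():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise Throttled()
        return "Succeeded"

    return request


def run_task(throttle_plan, parallelism):
    """throttle_plan maps resource id to a number of simulated throttled attempts."""
    ids = list(throttle_plan)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        statuses = pool.map(
            lambda rid: with_backoff(make_fake_request(throttle_plan[rid])), ids
        )
        return dict(zip(ids, statuses))


task_result = run_task({"vm-1": 0, "vm-2": 2, "vm-3": 5}, parallelism=3)
print(task_result)
```

Here vm-2 succeeds on its third attempt, while vm-3 is still throttled after the initial attempt plus three retries and is recorded as failed.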
ARM Processes the Request
The ARM receives the PUT/PATCH request from the remediation engine. It authenticates the request using the managed identity's token. ARM then validates the request against Azure RBAC (the identity must have write permissions). If valid, ARM applies the change to the resource. This may trigger other Azure policies (including the same policy again) because the change is an update operation. If another policy denies the change, the remediation fails for that resource. ARM also logs the operation in the activity log. The response is sent back to the remediation engine.
Completion and Monitoring
Once all resources have been processed (or the task times out after 2 hours), the task state changes to 'Succeeded' if all resources were remediated, 'Failed' if some or all failed, or 'Partial' if only some succeeded. You can view the task details to see per-resource status. If the task fails, you can retry failed resources by creating a new remediation task that only targets those resources. After successful remediation, the next compliance evaluation scan (or an on-demand scan) will show the resources as compliant.
Enterprise Scenario 1: Tagging Enforcement for Cost Management
A large enterprise with 10,000+ VMs across multiple subscriptions needs to enforce a mandatory 'CostCenter' tag on all VMs. They create a policy with Modify effect that adds the tag with a default value 'Unassigned' if missing. Initially, only 60% of VMs have the tag. The administrator creates a remediation task at the management group scope. The task processes 5,000 non-compliant VMs in about 45 minutes. However, they hit ARM throttling limits because of other concurrent operations. They reduce parallelism to 3 and schedule remediation during off-peak hours. After remediation, they run a compliance scan to verify. Common issues: the managed identity lacked 'Contributor' on some resource groups, causing partial failure. They had to grant permissions at the management group level to ensure uniform access.
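A quick back-of-the-envelope calculation shows what reducing parallelism costs in this scenario. The sketch assumes throughput scales linearly with parallelism and that the first run used 6 parallel deployments; the scenario states neither, so both are assumptions made purely for illustration.

```python
def throughput_per_minute(resources, minutes):
    """Average number of resources remediated per minute."""
    return resources / minutes


def scaled_minutes(minutes, old_parallelism, new_parallelism):
    """Estimate run time after changing parallelism, assuming linear scaling."""
    return minutes * old_parallelism / new_parallelism


rate = throughput_per_minute(5000, 45)
slower_run = scaled_minutes(45, old_parallelism=6, new_parallelism=3)

print(f"{rate:.1f} VMs per minute")
print(f"{slower_run:.0f} minutes at parallelism 3")
```

Under these assumptions the original run averages about 111 VMs per minute, and halving the parallelism stretches the same workload to roughly 90 minutes, which is why pairing the change with an off-peak window makes sense.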
Scenario 2: Enforcing Encryption Settings on Storage Accounts
A financial services company must ensure all storage accounts have 'requireInfrastructureEncryption' enabled. They use a policy with Modify effect to set this property. However, the property is not directly modifiable via ARM for existing storage accounts; it can only be set at creation time. The remediation task fails for all existing storage accounts because the ARM API rejects the change. This is a common gotcha: not all properties are modifiable after creation. The team had to use a custom script (Azure Automation) to recreate storage accounts with the correct setting. This scenario highlights that remediation only works for properties that support in-place updates.
Scenario 3: Compliance Drift Remediation in a DevOps Pipeline
A software company uses Azure Policy to enforce that all resources have specific tags (e.g., 'Environment', 'Owner'). Developers sometimes accidentally remove tags during deployments. They integrate remediation tasks into their CI/CD pipeline. After each deployment, a script runs an on-demand compliance evaluation and triggers remediation for any non-compliant resources. This ensures rapid correction. However, they must be careful not to create infinite loops: if the remediation itself triggers a new deployment that changes the tags, it could cause continuous remediation. They use exclusion scopes to exempt the pipeline's own resources. Also, they set a failure threshold so that if more than 10% of resources fail, the pipeline stops and alerts.
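A post-deployment pipeline step along these lines could be sketched as follows. The functions only build the argument lists for `az policy state trigger-scan` and `az policy remediation create` and do not run them; the helper names, the resource group, and the way failure counts are obtained are hypothetical, and a real pipeline would pass the lists to something like `subprocess.run`.

```python
def trigger_scan_cmd(resource_group):
    """Argument list for an on-demand compliance scan of one resource group."""
    return ["az", "policy", "state", "trigger-scan",
            "--resource-group", resource_group]


def create_remediation_cmd(name, assignment, resource_group):
    """Argument list for creating a remediation task scoped to one resource group."""
    return ["az", "policy", "remediation", "create",
            "--name", name,
            "--policy-assignment", assignment,
            "--resource-group", resource_group]


def should_stop(failed, total, threshold_percent=10):
    """Return True when more than threshold_percent of resources failed."""
    if total == 0:
        return False
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    return failed * 100 > threshold_percent * total


scan = trigger_scan_cmd("rg-app")
remediate_cmd = create_remediation_cmd("post-deploy-fix", "myAssignment", "rg-app")
print(" ".join(scan))
print(" ".join(remediate_cmd))
print(should_stop(failed=11, total=100))
```

Scoping both commands to the deployment's resource group keeps the step fast and limits the risk of the remediation touching resources owned by other pipelines.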
What AZ-104 Tests on Remediation Tasks
The AZ-104 exam (Objective 1.2 - Configure Azure Policy) tests your ability to:
Identify when to use remediation vs. other policy effects.
Understand the prerequisites: managed identity, permissions, Modify effect.
Know the default values: timeout (2 hours), parallelism (10), failure threshold (100%).
Differentiate between 'ExistingNonCompliant' and 'ReEvaluateCompliance' resource discovery modes.
Recognize that remediation tasks apply to the Modify effect (and DeployIfNotExists), not to Deny, Audit, or Append.
Understand that remediation tasks can be created at the same scope as the policy assignment or at a lower scope (if the assignment is at a higher scope).
Know that you can cancel a running remediation task.
Understand that remediation does not automatically re-evaluate compliance; you must trigger a scan separately.
Common Wrong Answers and Why Candidates Choose Them
'Remediation can be used with Deny effect.' Wrong. Deny blocks creation; it does not modify existing resources. Candidates confuse 'deny and remediate' as a combined process, but they are separate.
'Remediation tasks run automatically after policy assignment.' Wrong. You must manually create a remediation task. Candidates think the Modify effect automatically fixes existing resources; it only applies to new or updated resources.
'Remediation tasks require no special permissions.' Wrong. The managed identity must have write permissions on the target resources. Candidates assume the policy engine has inherent permissions.
'Remediation can fix any property.' Wrong. Only properties that support in-place modification via ARM can be remediated. Candidates forget that some properties are immutable after creation.
'Remediation tasks can be scheduled to run periodically.' Wrong. There is no built-in scheduler; you must trigger them manually or via automation. Candidates expect automatic periodic remediation.
Specific Values and Terms on the Exam
Default remediation timeout: 2 hours.
Default parallelism: 10 resources (maximum 30).
Default failure threshold: 100% (meaning ignore failures).
Resource discovery modes: 'ExistingNonCompliant' and 'ReEvaluateCompliance'.
The managed identity can be system-assigned or user-assigned.
Remediation tasks are created under the 'Remediation' blade in Azure Policy.
The CLI command: az policy remediation create.
The PowerShell cmdlet: Start-AzPolicyRemediation.
Edge Cases and Exceptions
If the policy definition uses 'auditIfNotExists', remediation tasks cannot be used because that effect only reports on related resources and changes nothing. 'deployIfNotExists' does support remediation: the task runs the template deployment defined in the policy instead of Modify operations.
For an initiative (policy set) assignment, a remediation task targets one policy definition within the initiative, identified by its definition reference ID. That definition must itself use a remediable effect such as Modify; the initiative as a whole has no effect of its own.
If a resource is exempted, it is not remediated. You must remove the exemption first.
Remediation tasks can be created at a lower scope than the assignment, but only if the assignment is at a higher scope (e.g., assignment at subscription, remediation at resource group).
How to Eliminate Wrong Answers
Always check if the question mentions 'existing resources' and 'automatic fix'. If the answer says 'Deny effect', eliminate it. If the answer says 'no permissions needed', eliminate it. If the answer says 'scheduled', eliminate it. Focus on the Modify effect and managed identity prerequisites. Remember that remediation is a one-time action, not continuous.
Remediation tasks apply to the Modify effect (and DeployIfNotExists), not Deny, Audit, or Append.
The managed identity used by remediation must have write permissions (e.g., Contributor) on the target resources.
Default remediation timeout is 2 hours; default parallelism is 10 resources.
Resource discovery mode can be 'ExistingNonCompliant' (default) or 'ReEvaluateCompliance'.
Remediation tasks do not automatically run; you must create them manually.
After remediation, compliance is not re-evaluated until the next scan or an on-demand trigger.
Remediation cannot modify immutable properties; check ARM API documentation for modifiability.
You can create remediation tasks at the same scope as the assignment or at a lower scope.
Remediation tasks can be canceled via CLI or PowerShell.
The failure threshold default is 100% (ignore failures); set a lower percentage to stop on errors.
These come up on the exam all the time. Here's how to tell them apart.
Remediation Task (Modify effect)
Applies to existing non-compliant resources via a one-time task.
Uses a managed identity to directly modify the resource via ARM.
Only works for properties that can be updated in-place.
Can be triggered manually or via automation.
Does not deploy additional resources; only modifies existing ones.
DeployIfNotExists effect
Deploys a new resource (e.g., Log Analytics workspace) if one does not exist.
Uses a template deployment (ARM template) to create resources.
Works for missing resources, not for modifying existing ones.
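Runs after a matching resource is created or updated; existing resources are not touched until a remediation task is created.
Supports remediation tasks as well: the task runs the policy's template deployment for existing non-compliant resources.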
Mistake
Remediation tasks automatically run on a schedule.
Correct
Remediation tasks are manual one-time actions. There is no built-in scheduler. You must trigger them via portal, CLI, PowerShell, or automation (e.g., Azure Automation, Logic Apps).
Mistake
Remediation tasks can fix resources that were denied by a Deny policy.
Correct
Deny policies block creation of non-compliant resources. If a resource was created before the Deny policy was assigned, it is not denied; it is only non-compliant. Remediation does not apply to Deny effect; it only applies to Modify effect. To fix resources denied at creation, you must manually create them in a compliant way.
Mistake
The managed identity for remediation is automatically granted all necessary permissions.
Correct
The managed identity must be explicitly granted RBAC roles (e.g., Contributor) on the target resources. Azure Policy does not auto-grant permissions. If the identity lacks permissions, the remediation task fails.
Mistake
Remediation tasks can be used with any policy effect.
Correct
Only the Modify and DeployIfNotExists effects support remediation. Effects like Deny, Audit, Append, and AuditIfNotExists do not have remediation tasks. For DeployIfNotExists, the remediation task runs the template deployment defined in the policy rather than Modify operations.
Mistake
Remediation tasks automatically re-evaluate compliance after completion.
Correct
Remediation tasks do not trigger a compliance scan. After remediation, the compliance state remains unchanged until the next scheduled scan (every 24 hours) or an on-demand scan is triggered. You must manually run 'az policy state trigger-scan' or equivalent.
No. Remediation tasks are only available for policies with the Modify or DeployIfNotExists effect. Audit policies only log non-compliance; they do not modify resources. To fix resources identified by an Audit policy, you must manually change them or use a different tool like Azure Automation.
The managed identity must have at least 'Contributor' role on the resources being remediated, or a custom role with write permissions for the specific resource types and properties. Without proper permissions, the remediation task will fail with an authorization error.
The default timeout is 2 hours. The actual time depends on the number of non-compliant resources, parallelism (default 10), and ARM throttling. For large numbers of resources, consider increasing parallelism (up to 30) or splitting the task into multiple smaller tasks.
There is no built-in scheduler in Azure Policy. You must trigger remediation tasks manually or via automation tools like Azure Automation, Logic Apps, or custom scripts. You can create a runbook that periodically creates remediation tasks.
The task continues processing other resources. You can view per-resource status in the task details. Failed resources are not retried automatically. You can create a new remediation task that targets only the failed resources (by specifying resource IDs or using a filter).
Yes, if the policy assignment is at a management group scope that includes those subscriptions. The remediation task will process resources in all subscriptions under that scope. Ensure the managed identity has permissions on all target subscriptions.
Remediation works for any resource type that supports the Modify effect. However, not all properties can be modified after creation. For example, you cannot change the SKU of a storage account from Standard to Premium via remediation if the account already exists. Check the ARM API documentation for each property.