MS-102 | Chapter 80 of 104 | Objective 3.3

Auto-Labeling Policies for Sensitive Content

This chapter covers auto-labeling policies for sensitive content in Microsoft 365, a critical feature for protecting data at scale. Auto-labeling automatically applies sensitivity labels to documents and emails based on their content, reducing manual effort and ensuring consistent classification. On the MS-102 exam, this topic appears in about 5-8% of questions, primarily in the 'Implement and manage information protection' domain. You must understand how auto-labeling policies work, their components, and how they differ from manual labeling and default labeling.

25 min read
Intermediate
Updated May 31, 2026

Auto-Labeling as a Smart Mail Sorter

Imagine a large corporate mailroom that receives thousands of letters daily. A smart sorter scans each envelope before it reaches any employee, looking for keywords such as 'Confidential', 'Legal Notice', or 'Personal' on the envelope or inside it (via X-ray-like scanning). Based on what it finds, the sorter stamps the envelope with a colored label: red for 'Top Secret', yellow for 'Internal Use', green for 'Public'. The sorter also follows rules: an envelope containing 'Credit Card' is stamped red and locked in a secure drawer, while one containing 'Meeting Agenda' gets a green stamp and goes to general delivery. When a manager corrects a stamp, the rules are updated so the mistake is not repeated. Because stamping is automatic, every letter is handled correctly without relying on each employee to apply labels by hand. Auto-labeling policies in Microsoft 365 work similarly: they scan content (emails and documents) for sensitive patterns, apply the appropriate sensitivity label, and through that label can enforce protections such as encryption or watermarking, all based on rules you define.

How It Actually Works

What Auto-Labeling Policies Are and Why They Exist

Auto-labeling policies are a feature of Microsoft Purview Information Protection (previously branded Microsoft 365 Information Protection) that automatically applies sensitivity labels to files and emails based on their content, context, or metadata. They are part of the broader sensitivity labeling ecosystem, which also includes manual labeling (users choose labels), default labeling (a default label is applied to new items), and mandatory labeling (users must select a label). Auto-labeling ensures that sensitive data is consistently classified and protected without relying on end users to remember or correctly apply labels. This matters most for organizations that handle large volumes of data containing personally identifiable information (PII), financial data, or intellectual property.

How Auto-Labeling Works Internally

Auto-labeling policies operate through a combination of content scanning, pattern matching, and rule evaluation. The process involves the following steps:

1. Content Discovery: The policy scans content stored in SharePoint Online, OneDrive for Business, and Exchange Online. For files, it uses built-in or custom sensitive information types and trainable classifiers to detect patterns such as credit card numbers, passport numbers, or custom regex patterns. For emails, it scans the body, subject line, and attachments.

2. Rule Evaluation: Each auto-labeling policy contains one or more rules. Each rule has conditions (e.g., 'content contains a credit card number') that lead to an action (e.g., 'apply the Confidential label'); the label itself is selected once for the whole policy. The policy evaluates content against these rules in order of priority. If multiple rules match, the highest-priority rule determines the outcome.

3. Label Application: If content matches a rule, the policy applies the specified sensitivity label. This label can then enforce protection actions such as encryption (via Azure Rights Management), visual markings (watermarks, headers, footers), and conditional access policies (e.g., require MFA to access).

4. Simulation Mode: Before deploying an auto-labeling policy, you can run it in simulation mode. In simulation mode, the policy identifies items that would be labeled but does not actually apply the label. This lets you review the impact and fine-tune rules without affecting users.

5. Automatic Re-labeling: Auto-labeling policies can also re-label content that already has a label. For example, a policy might upgrade the label from 'Internal' to 'Confidential' when a document contains a newly detected sensitive pattern.
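The difference between simulation and enforcement in the steps above can be sketched in a few lines of Python. This is a conceptual model only, not how the service is implemented: the `Item` and `PolicyRun` classes, the simplified card-number regex, and the sample file names are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, simplified detector: 16 digits in groups of four.
# Real sensitive information types also use checksums and supporting keywords.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


@dataclass
class Item:
    name: str
    text: str
    label: Optional[str] = None


@dataclass
class PolicyRun:
    label: str
    simulation: bool = True
    would_label: List[str] = field(default_factory=list)

    def process(self, item: Item) -> None:
        """Scan one item; always record a match, but label only when enforcing."""
        if not CARD_PATTERN.search(item.text):
            return
        self.would_label.append(item.name)
        if not self.simulation:
            item.label = self.label


items = [
    Item("invoice.docx", "Card on file: 4111 1111 1111 1111"),
    Item("agenda.docx", "Meeting agenda for Monday"),
]

simulated = PolicyRun(label="Confidential", simulation=True)
for item in items:
    simulated.process(item)

print(simulated.would_label)           # items the policy would label
print([item.label for item in items])  # labels are untouched in simulation
```

Running the same loop with `simulation=False` would set the label on the matching item, which mirrors the move from simulation to a live policy.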

Key Components, Values, Defaults, and Timers

Sensitive Information Types: Microsoft provides over 200 built-in sensitive information types, such as Credit Card Number (which validates candidates with a Luhn checksum), U.S. Social Security Number (SSN), and ABA Routing Number. You can also create custom sensitive information types using regex, keyword lists, and proximity rules.

Trainable Classifiers: These are machine-learning-based classifiers that can identify content based on examples (e.g., 'contracts' or 'invoices'). You train them by providing sample documents.

Policy Scope: You can scope auto-labeling policies to specific locations (SharePoint, OneDrive, Exchange) and to specific users or groups. For Exchange, you can also scope to inbound, outbound, or internal emails.

Default Values: There is no default auto-labeling policy; you must create one. However, there is a default label setting for SharePoint and OneDrive that applies a label to new files without a label, but this is separate from auto-labeling.

Timers: Auto-labeling does not have a specific timer; it runs continuously in the background. For Exchange, labeling occurs as messages are processed. For SharePoint and OneDrive, labeling can take up to 24 hours for large volumes of existing content, but new content is labeled within minutes.

Thresholds: You can set a minimum number of instances of a sensitive type required to trigger labeling. For example, require at least 3 credit card numbers to apply the label.
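To make the Luhn check and the minimum-count threshold concrete, here is a minimal Python sketch. The Luhn algorithm itself is standard; the candidate regex, the function names, and the sample text are assumptions for illustration and are much simpler than the built-in Credit Card Number definition.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def count_card_numbers(text: str) -> int:
    """Count candidates in the text that also pass the Luhn check."""
    return sum(1 for m in CANDIDATE.finditer(text) if luhn_valid(m.group()))


def meets_threshold(text: str, min_count: int = 1) -> bool:
    """Mimic a rule's minimum instance count."""
    return count_card_numbers(text) >= min_count


sample = (
    "Cards: 4111 1111 1111 1111 and 5555-5555-5555-4444, "
    "ref 1234 5678 9012 3456"
)
print(count_card_numbers(sample))            # the reference number fails Luhn
print(meets_threshold(sample, min_count=3))  # threshold of 3 is not met
```

The third number in the sample looks like a card number but fails the checksum, which is why a checksum plus a minimum count reduces false positives compared with a bare pattern match.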

Configuration and Verification Commands

Auto-labeling policies are configured in the Microsoft Purview compliance portal (formerly the Microsoft 365 compliance center) under Information Protection > Auto-labeling. You can also use Security & Compliance PowerShell for automation. Key cmdlets:

Get-AutoSensitivityLabelPolicy: Lists all auto-labeling policies.

New-AutoSensitivityLabelPolicy: Creates a new policy.

Set-AutoSensitivityLabelPolicy: Modifies an existing policy.

Get-AutoSensitivityLabelRule: Lists rules within a policy.

New-AutoSensitivityLabelRule: Adds a rule.

Set-AutoSensitivityLabelRule: Modifies a rule.

Example PowerShell commands:

# Run in Security & Compliance PowerShell (connect with Connect-IPPSSession).
# Create an auto-labeling policy for all SharePoint sites in simulation mode.
# The label to apply ("Confidential" here, as an example) is set on the policy.
New-AutoSensitivityLabelPolicy -Name "PII Policy" -Comment "Auto-label documents with PII" -SharePointLocation "All" -ApplySensitivityLabel "Confidential" -Mode TestWithoutNotifications

# Add a rule that matches when at least one credit card number is detected
New-AutoSensitivityLabelRule -Policy "PII Policy" -Name "Credit Card Rule" -Workload SharePoint -ContentContainsSensitiveInformation @{Name = "Credit Card Number"; minCount = "1"}

To verify a policy's state, run Get-AutoSensitivityLabelPolicy and check its Mode property (a simulation value such as TestWithoutNotifications, or Enable once the policy is turned on). You can also use Content explorer in the compliance portal to view labeled items.

Interaction with Related Technologies

Sensitivity Labels: Auto-labeling applies sensitivity labels, which can then trigger protection actions such as encryption (via the Azure Rights Management service) and visual markings. Labels can also control access through conditional access policies.

Data Loss Prevention (DLP): Auto-labeling policies are separate from DLP policies, but they can complement each other. DLP policies can detect and block sharing of sensitive data, while auto-labeling ensures the data is labeled first. However, auto-labeling does not block actions; it only labels.

Microsoft 365 Compliance Center: Both auto-labeling and DLP policies are managed in the Compliance Center. They share sensitive information types and classifiers.

Azure Information Protection (AIP): Auto-labeling uses the same labeling engine as AIP. If you have AIP deployed, auto-labeling policies can apply labels that enforce rights management protection.

Important Considerations for the Exam

Auto-labeling policies are not real-time for SharePoint and OneDrive; there is a delay (up to 24 hours for bulk scanning). For Exchange, labeling occurs during transport.

Auto-labeling policies can be set to simulation mode first, which is a key exam point.

You can scope policies to specific locations and users, but you cannot scope to specific document libraries or folders. The smallest unit is an entire SharePoint site or OneDrive account.

Trainable classifiers require at least 50 sample documents for training.

Auto-labeling does not apply labels to content that is already labeled unless you enable re-labeling in the policy settings.
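The re-labeling behavior described in the last point can be expressed as a small decision function. This is a sketch of the logic as this chapter states it, not the service's actual algorithm; the label names, their order, and the `relabel_enabled` flag are illustrative assumptions.

```python
from typing import Optional

# Assumed label order, from least to most restrictive (names are illustrative).
LABEL_ORDER = ["Public", "Internal", "Confidential", "Highly Confidential"]


def should_apply(existing: Optional[str], new: str, relabel_enabled: bool) -> bool:
    """Decide whether the policy's label replaces the item's current label."""
    if existing is None:
        return True   # unlabeled content is always eligible
    if not relabel_enabled:
        return False  # already labeled and re-labeling is off: leave it alone
    # Re-label only when the new label outranks the existing one.
    return LABEL_ORDER.index(new) > LABEL_ORDER.index(existing)


print(should_apply(None, "Confidential", relabel_enabled=False))
print(should_apply("Internal", "Confidential", relabel_enabled=True))
print(should_apply("Highly Confidential", "Confidential", relabel_enabled=True))
```

Comparing positions in an ordered list keeps the sketch independent of how a particular tool numbers label priorities.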

Walk-Through

1. Identify Sensitive Data Types

Begin by determining what sensitive content your organization needs to protect. Common examples include credit card numbers, social security numbers, passport numbers, bank account numbers, and custom patterns like employee IDs. Microsoft provides over 200 built-in sensitive information types, each with a unique ID and detection logic (e.g., credit card numbers must pass a Luhn check). You can also create custom types using regex, keyword lists, and proximity rules. This step is critical because auto-labeling policies rely on these patterns to trigger labeling. On the exam, you may be asked which sensitive information type to use for a given scenario.

2. Create or Select a Sensitivity Label

Auto-labeling policies require an existing sensitivity label to apply. If you don't have one, create it in the Compliance Center under Information Protection > Labels. Define the label's name, priority, and protection settings (encryption, visual markings). The label's priority determines which label wins if multiple labels could apply. In the label list, order reflects priority: the most restrictive label (for example, 'Highly Confidential') sits at the bottom of the list and outranks the labels above it, such as 'Confidential'. Ensure the label is published to the users or groups you want to target. Auto-labeling can only apply labels that are published.

3. Create Auto-Labeling Policy in Simulation

In the Compliance Center, go to Information Protection > Auto-labeling and click 'Create auto-labeling policy'. Choose whether to apply the policy to SharePoint, OneDrive, or Exchange. For Exchange, you can also choose to label messages sent to internal or external recipients. Set the policy to simulation mode initially. This mode identifies items that would be labeled but does not apply the label, so you can review the number of affected items and adjust rules before going live. The exam often tests that simulation is a recommended first step.

4. Configure Rules and Conditions

Within the policy, add rules that define when to apply the label. Each rule includes conditions (e.g., content contains a sensitive info type, content is from a specific domain) and an action (apply label or keep existing label). You can tune conditions with a minimum count (e.g., at least 2 credit card numbers) and a match accuracy (e.g., high confidence). You can also use trainable classifiers as conditions. Rules are evaluated in order; the first matching rule wins. You can also configure exceptions, such as not labeling content from a specific sender.

5. Review Simulation Results and Publish

After running simulation for a period (typically a few days), review the results in the Compliance Center. The policy report shows how many items would be labeled, their locations, and any issues. If satisfied, change the policy mode from simulation to published. Once published, the policy starts applying labels to new and existing content. For SharePoint and OneDrive, existing content is scanned in batches; for Exchange, labeling occurs during message flow. You can also choose to re-label content that already has a lower-priority label.
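The rule evaluation described in step 4 (ordered rules, first match wins, with sender exceptions) can be sketched as follows. The `Rule` class, the keyword-count conditions, and the email addresses are hypothetical stand-ins; real rules use sensitive information types and classifiers rather than substring counts.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set


@dataclass
class Rule:
    name: str
    priority: int                       # lower number is evaluated first
    condition: Callable[[dict], bool]   # stand-in for a rule's conditions
    excluded_senders: Set[str] = field(default_factory=set)

    def matches(self, message: dict) -> bool:
        """A rule never matches content from an excepted sender."""
        if message.get("sender") in self.excluded_senders:
            return False
        return self.condition(message)


def first_match(rules: List[Rule], message: dict) -> Optional[str]:
    """Evaluate rules in priority order and stop at the first match."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(message):
            return rule.name
    return None


def ssn_mentions(message: dict) -> int:
    return message["body"].lower().count("ssn")


rules = [
    Rule("Any SSN mention", 1, lambda m: ssn_mentions(m) >= 1,
         excluded_senders={"hr@contoso.com"}),
    Rule("Two or more SSN mentions", 0, lambda m: ssn_mentions(m) >= 2),
]

print(first_match(rules, {"sender": "alice@contoso.com", "body": "SSN 1, SSN 2"}))
print(first_match(rules, {"sender": "bob@contoso.com", "body": "SSN on file"}))
print(first_match(rules, {"sender": "hr@contoso.com", "body": "SSN on file"}))
```

The last message matches no rule: it has too few instances for the stricter rule, and its sender is excepted from the broader one.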

What This Looks Like on the Job

In a large financial services company, auto-labeling policies are used to automatically classify and protect customer account statements containing bank account numbers and social security numbers. The company has a sensitivity label called 'Financial Confidential' that encrypts documents and restricts access to authorized personnel only. They created a custom sensitive information type for their account number format (e.g., 'ACCT-XXXX-XXXX-XXXX'). The auto-labeling policy is scoped to all SharePoint Online sites used by the wealth management division. In simulation mode, they discovered that many legacy documents in team sites lacked labels, and the policy would label over 10,000 documents. After reviewing, they published the policy. Within 48 hours, all matching documents were labeled and encrypted. A common issue they encountered was false positives: generic numbers that matched the pattern but were not actual account numbers. To reduce them, they increased the minimum count to 2 and required a keyword from a list (e.g., 'Account', 'Statement') in proximity to the pattern.
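The tuning in this scenario (a pattern, a supporting keyword nearby, and a minimum count) can be sketched in Python. This is an illustration of the idea rather than how custom sensitive information types are evaluated: it assumes each 'X' in the account format is a digit, and the 50-character window is an arbitrary choice.

```python
import re
from typing import List

# Assumption: each X in ACCT-XXXX-XXXX-XXXX is a digit.
ACCOUNT = re.compile(r"\bACCT-\d{4}-\d{4}-\d{4}\b")
KEYWORDS = re.compile(r"\b(?:account|statement)\b", re.IGNORECASE)
WINDOW = 50  # characters inspected on each side of a match (hypothetical)


def supported_matches(text: str) -> List[str]:
    """Return account-number matches that have a keyword within WINDOW characters."""
    hits = []
    for m in ACCOUNT.finditer(text):
        start = max(0, m.start() - WINDOW)
        # Look only at the surrounding text, not at the match itself.
        context = text[start:m.start()] + " " + text[m.end():m.end() + WINDOW]
        if KEYWORDS.search(context):
            hits.append(m.group())
    return hits


def should_label(text: str, min_count: int = 2) -> bool:
    return len(supported_matches(text)) >= min_count


statement = (
    "Statement for account ACCT-1234-5678-9012. "
    "Linked account: ACCT-2222-3333-4444."
)
noise = "Part numbers ACCT-0000-1111-2222 and ACCT-3333-4444-5555 are back in stock."

print(should_label(statement))  # keywords nearby and two instances
print(should_label(noise))      # pattern matches, but no supporting keyword
```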

Another scenario involves a healthcare organization that must comply with HIPAA. They use auto-labeling to apply a 'Protected Health Information (PHI)' label to emails and documents containing patient names combined with medical record numbers. They configured a rule using the built-in U.S. Social Security Number type and a custom sensitive info type for medical record numbers. They also set the policy to re-label items that already have a lower-priority label (e.g., 'Internal Use') to 'PHI' when PHI is detected. After deployment, they noticed that some emails were not being labeled because the rule required a high confidence level (85), which was too strict. They lowered it to medium (75) to capture more true positives. Performance was not an issue because the policy runs asynchronously. However, they had to ensure that the auto-labeling policy did not conflict with their DLP policies that prevented sharing of PHI externally. They set DLP to take action (block) while auto-labeling only labels, creating a layered defense.

A third scenario is a law firm that handles contracts containing intellectual property. They use a trainable classifier trained on sample contracts to identify contract documents. The auto-labeling policy applies a 'Contracts' label that adds a watermark and restricts printing. The training required at least 50 sample documents per classifier. After publishing, they found that documents with heavy formatting or scanned images were not being classified correctly because the classifier relies on text content. They had to adjust their process to ensure OCR was enabled for scanned documents. They also learned that auto-labeling policies cannot be applied to on-premises file shares or third-party cloud storage without additional connectors.

How MS-102 Actually Tests This

The MS-102 exam tests auto-labeling policies under objective 3.3 'Implement and manage information protection', specifically sub-objectives related to sensitivity labels and auto-labeling. Expect 2-3 questions on this topic. Key exam points:

1. Simulation mode: The exam loves to ask about the purpose of simulation mode. Wrong answer choices often include 'simulation mode applies labels temporarily' or 'simulation mode blocks content'. The correct answer is that simulation mode identifies items that would be labeled without actually applying labels, allowing review before deployment.

2. Scope: Auto-labeling policies can be scoped to SharePoint, OneDrive, and Exchange. A common wrong answer is that they can also be scoped to Teams or Yammer (now Viva Engage). They cannot; those require different mechanisms. Also, for Exchange, you can scope to inbound, outbound, or internal messages, but not to specific folders.

3. Timing: For SharePoint and OneDrive, auto-labeling is not real-time; there is a delay. For Exchange, labeling occurs during message flow. A typical wrong answer states that auto-labeling is immediate for all locations. Know the difference.

4. Re-labeling: Auto-labeling can re-label content that already has a label, but only if the new label has higher priority. A common trap is the claim that auto-labeling can always override any existing label. The correct behavior is that it respects label priority order.

5. Trainable classifiers: These require a minimum of 50 sample documents for training. The exam may ask about the minimum number or the fact that classifiers are machine-learning-based.

6. Custom sensitive information types: You can create custom types using regex, keywords, and proximity. The exam may present a scenario where you need to choose between built-in and custom types.

7. Multiple rules: If multiple rules match, the rule with the highest priority (lowest number) wins. The exam may test that rule order matters.

8. Protection actions: Auto-labeling applies a label, but the label's protection settings (encryption, visual markings) are enforced separately. A wrong answer might suggest that auto-labeling itself encrypts content. It does not; it applies a label, and the label's settings trigger encryption.

To eliminate wrong answers, focus on the mechanism: auto-labeling is about classification, not enforcement. Enforcement comes from the label's settings. Also, remember that auto-labeling policies are separate from default labels and mandatory labeling.

Key Takeaways

Auto-labeling policies apply sensitivity labels automatically based on content patterns, not user action.

Simulation mode is a critical first step to review impact before publishing.

Auto-labeling scopes to SharePoint, OneDrive, and Exchange only; not Teams or Yammer.

For SharePoint and OneDrive, labeling is asynchronous with up to 24-hour delay for bulk; Exchange is real-time.

Re-labeling only occurs if the new label has higher priority than the existing label.

Auto-labeling does not enforce protection; it applies a label that may trigger protection settings.

Trainable classifiers require at least 50 sample documents for training.

Custom sensitive information types can be created using regex, keywords, and proximity.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Auto-Labeling Policy

Applies labels automatically based on content scanning.

Can label existing and new content.

Requires configuration of rules with conditions.

Can be run in simulation mode before publishing.

Supports re-labeling if a higher-priority label matches.

Default Label Policy

Applies a default label to new, unlabeled content only.

Does not scan content; applies based on location or user.

Simple configuration: choose a label and scope.

No simulation mode; applies immediately to new items.

Does not re-label existing content or override other labels.
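The contrast between the two mechanisms can be reduced to two small functions. This is a deliberately simplified sketch: the label names and the SSN-like regex are assumptions, and the auto-labeling function omits re-labeling of already labeled items.

```python
import re
from typing import Optional

SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # simplified stand-in pattern


def default_label(existing: Optional[str], is_new: bool,
                  default: str = "General") -> Optional[str]:
    """Default labeling: no content scan; only new, unlabeled items get the label."""
    if is_new and existing is None:
        return default
    return existing


def auto_label(existing: Optional[str], text: str,
               label: str = "Confidential") -> Optional[str]:
    """Auto-labeling: scans content; applies to new and existing unlabeled items."""
    if existing is None and SSN_LIKE.search(text):
        return label
    return existing


old_file_text = "Employee SSN: 123-45-6789"

# An existing, unlabeled file: the default label ignores it, auto-labeling catches it.
print(default_label(None, is_new=False))
print(auto_label(None, old_file_text))

# A new file with nothing sensitive: only the default label applies.
print(default_label(None, is_new=True))
print(auto_label(None, "Lunch menu"))
```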

Watch Out for These

Mistake: Auto-labeling policies can apply labels to content in real-time for all locations.

Correct: For SharePoint and OneDrive, auto-labeling is not real-time; it runs asynchronously and can take up to 24 hours for existing content. Only Exchange labels messages in real-time during transport.

Mistake: Auto-labeling policies automatically encrypt the content they label.

Correct: Auto-labeling only applies a sensitivity label. Encryption is a separate protection setting configured on the label itself. If the label does not have encryption enabled, no encryption occurs.

Mistake: You can use auto-labeling policies to label content in Teams messages and files.

Correct: Auto-labeling policies only support SharePoint Online, OneDrive for Business, and Exchange Online. Teams is not a supported location. Files shared in Teams are stored in SharePoint and OneDrive, so a policy scoped to those locations can reach them, but chat messages cannot be labeled.

Mistake: Simulation mode applies labels but allows you to undo them.

Correct: Simulation mode does not apply any labels. It only identifies which items would have been labeled, allowing you to review the impact without making changes.

Mistake: Auto-labeling policies can override any existing label regardless of priority.

Correct: Auto-labeling respects label priority. It can only re-label content if the new label has a higher priority than the existing label. If the existing label has higher priority, the policy does not change it.


Frequently Asked Questions

What is the difference between auto-labeling and default labeling?

Auto-labeling automatically applies a sensitivity label based on content scanning (e.g., detecting a credit card number). Default labeling applies a pre-configured label to all new, unlabeled items in a specific location (e.g., all new documents in a SharePoint site). Auto-labeling can also re-label existing content, while default labeling only affects new items. Auto-labeling requires rules and conditions; default labeling is simpler.

Can auto-labeling policies label emails sent to external recipients?

Yes, auto-labeling policies for Exchange can be scoped to inbound, outbound, or internal messages. You can choose to label emails sent to external recipients by selecting 'Outbound' in the policy scope. However, the label's protection settings (e.g., encryption) will apply regardless of recipient type.

How long does it take for auto-labeling to label existing SharePoint documents?

For existing documents in SharePoint or OneDrive, auto-labeling runs as a background process. It can take up to 24 hours to scan and label a large volume of documents, but new documents are typically labeled within minutes after upload.

What happens if an auto-labeling policy conflicts with a DLP policy?

Auto-labeling and DLP policies are independent. Auto-labeling applies a label, while DLP can block or alert on actions. They can work together: auto-labeling ensures data is classified, and DLP can enforce rules based on that classification. There is no direct conflict, but you should ensure consistency (e.g., DLP might block sharing of labeled content).

Can I use auto-labeling to apply labels to content stored on-premises?

No, auto-labeling policies only work with content in Microsoft 365 cloud locations: SharePoint Online, OneDrive for Business, and Exchange Online. For on-premises content, you would need the Microsoft Purview Information Protection scanner (formerly the Azure Information Protection scanner) or another on-premises solution.

What is the minimum number of instances required for a sensitive info type to trigger labeling?

The default is 1, but you can configure a minimum count in the rule conditions. For example, you can set 'minimum count: 3' to require at least 3 instances of a credit card number before labeling. This helps reduce false positives.

Can auto-labeling policies be applied to all users in the organization?

Yes, you can scope the policy to all users or specific groups. For Exchange, you can also scope to specific senders or recipients. The policy applies to content stored in the selected locations for the scoped users.
