SC-200Chapter 18 of 101Objective 1.1

Advanced Hunting with KQL in Defender

Advanced Hunting with Kusto Query Language (KQL) is a cornerstone capability in Microsoft 365 Defender that allows security analysts to proactively search for threats across their environment. This chapter covers the architecture of Advanced Hunting, the structure of KQL queries, key tables and schemas, and how to build effective hunts. For the SC-200 exam, approximately 15-20% of questions relate to KQL and Advanced Hunting, including constructing queries, understanding data sources, and interpreting results. Mastering this topic is essential for passing the exam and for real-world threat hunting.

25 min read
Intermediate
Updated May 31, 2026

Advanced Hunting as a Detective's Evidence Board

Imagine you are a detective investigating a complex crime. You have access to a massive evidence board (the Microsoft 365 Defender portal) containing thousands of pieces of information: phone records, emails, security camera timestamps, witness statements, and financial transactions. All this data is stored in separate folders, but you can freely search across them. Your goal is to reconstruct the sequence of events that led to the crime. To do this efficiently, you don't just read every document; you use a specialized query language (KQL) that lets you filter, join, and project data. For example, you might start by looking for all phone calls made on the day of the crime (filtering), then cross-reference those phone numbers with a list of known suspects (joining), and finally extract only the caller ID, time, and duration (projecting). You can also pipe the results into a time chart to visualize the pattern of calls. The evidence board is not static—new data streams in constantly, and you can set up custom detections that automatically alert you when a specific pattern occurs, like a suspect calling the victim after midnight. This is exactly how Advanced Hunting works in Microsoft Defender: it provides a unified interface to query raw data from various security sources using KQL, enabling you to hunt for threats, investigate incidents, and create custom detection rules.

How It Actually Works

What is Advanced Hunting and Why Does It Exist?

Advanced Hunting is a query-based threat hunting tool within Microsoft 365 Defender that provides access to raw, pre-processed data from multiple security domains—endpoints, email, identities, cloud apps, and more. Unlike pre-built detections or alerts, Advanced Hunting allows analysts to ask custom questions of their security data, enabling proactive discovery of threats that might evade automated detection. It is built on Azure Data Explorer (ADX) and uses KQL as its query language.

How Advanced Hunting Works Internally

When you run a query in Advanced Hunting, the request is sent to the ADX cluster that stores the data for your tenant. The data is organized into tables, each representing a specific type of event (e.g., DeviceProcessEvents for process creation, EmailEvents for email delivery). Each table has a predefined schema with columns such as Timestamp, DeviceId, AccountUpn, etc. The query is parsed, optimized, and executed across distributed nodes. Results are returned as a table that you can further refine, export, or visualize.

Key internal components: - Data Sources: Data flows from endpoints (via Microsoft Defender for Endpoint), email (Defender for Office 365), identities (Microsoft Defender for Identity), cloud apps (Defender for Cloud Apps), and others. Each source has its own connector that normalizes data into the common schema. - Data Retention: Raw data is retained for 30 days by default. You can configure custom retention policies (up to 180 days) for certain tables using the Advanced Hunting data retention settings. - Query Limits: Queries can run up to 30 minutes, but the portal has a 10-minute timeout for interactive queries. Results are limited to 100,000 rows per query (or 500,000 if using the API).

Key KQL Operators and Functions for Security Hunting

KQL is a read-only, tabular query language. The most common operators used in Advanced Hunting are: - `where`: Filters rows based on a predicate. Example: where Timestamp > ago(7d) - `project`: Selects specific columns. Example: project Timestamp, DeviceName, FileName - `extend`: Adds computed columns. Example: extend DayOfWeek = datetime_part('dayofweek', Timestamp) - `join`: Combines rows from two tables based on a key. Types: inner, outer, left, right, semi, anti. Example: DeviceProcessEvents | join kind=inner DeviceInfo on DeviceId - `summarize`: Aggregates data by groups, using functions like count(), dcount(), avg(), min(), max(). Example: summarize TotalProcesses = count() by DeviceName - `order by` and `sort by`: Sorts results. - `take` and `limit`: Returns a specified number of rows. - `let`: Defines variables or functions. Example: let threshold = 5; - `materialize`: Caches a table for reuse in the same query. - `make-series`: Creates time series for anomaly detection. - `render`: Visualizes results as chart (timechart, barchart, etc.).

Key Tables and Their Schemas

The SC-200 exam frequently tests knowledge of specific tables. Here are the most important ones:

DeviceProcessEvents: Records process creation events. Columns: Timestamp, DeviceId, DeviceName, AccountUpn, AccountName, ProcessName, ProcessId, CommandLine, FileName, FolderPath, SHA256, InitiatingProcessFileName, InitiatingProcessCommandLine, etc.

DeviceFileEvents: File creation, modification, deletion, rename. Columns: Timestamp, DeviceId, DeviceName, AccountUpn, FileName, FolderPath, SHA256, FileSize, ActionType (e.g., FileCreated, FileModified).

DeviceNetworkEvents: Network connections initiated by processes. Columns: Timestamp, DeviceId, DeviceName, RemoteIP, RemotePort, LocalIP, LocalPort, Protocol, ActionType (ConnectionSuccess, ConnectionFailed), InitiatingProcessFileName, etc.

DeviceLogonEvents: User logon/logoff events. Columns: Timestamp, DeviceId, DeviceName, AccountUpn, AccountDomain, LogonType (Interactive, Network, RemoteInteractive), LogonId, SessionId, etc.

EmailEvents: Email delivery events. Columns: Timestamp, NetworkMessageId, SenderFromAddress, SenderDisplayName, RecipientEmailAddress, Subject, ThreatTypes (Phish, Malware), DetectionMethods, DeliveryAction (Delivered, Blocked, Junked), etc.

EmailAttachmentInfo: Attachments on emails. Columns: Timestamp, NetworkMessageId, SenderFromAddress, RecipientEmailAddress, FileName, FileType, SHA256, ThreatTypes, DetectionMethods.

IdentityLogonEvents: Logon events from Microsoft Defender for Identity. Columns: Timestamp, AccountUpn, AccountName, AccountDomain, LogonType, Application, IPAddress, Country, IsAdmin, etc.

AlertInfo: Alerts generated by Microsoft 365 Defender. Columns: AlertId, Title, Category, Severity, DetectionSource, ServiceSource, AlertEvidence (list of evidence entities).

AlertEvidence: Evidence entities linked to alerts. Columns: AlertId, EntityType (File, Process, IP, etc.), EntityId, Timestamp, etc.

Building a Basic Query

A typical query follows this pattern:

// Find all processes that created a file with a specific SHA256 hash in the last 7 days
DeviceFileEvents
| where Timestamp > ago(7d)
| where SHA256 == "<hash>"
| join kind=inner DeviceProcessEvents on DeviceId, Timestamp
| project Timestamp, DeviceName, FileName, ProcessName, CommandLine

Advanced Techniques

Time Series and Anomaly Detection: Use make-series and series_decompose_anomalies to detect unusual spikes. Example:

DeviceNetworkEvents
| make-series Count=count() default=0 on Timestamp from ago(14d) to now() step 1h by RemoteIP
| extend anomalies = series_decompose_anomalies(Count)
| mv-expand Timestamp, Count, anomalies
| where anomalies == 1

Hunting Across Tables with Joins: Use joins to correlate events across domains. For example, to see which processes ran on a device that received a phishing email:

EmailEvents
| where ThreatTypes has "Phish"
| join kind=inner (DeviceProcessEvents) on RecipientEmailAddress
| project Timestamp, DeviceName, ProcessName, CommandLine

Using `let` for Reusability:

let suspiciousIPs = dynamic(["10.0.0.1", "192.168.1.1"]);
DeviceNetworkEvents
| where RemoteIP in (suspiciousIPs)
| project Timestamp, DeviceName, RemoteIP, RemotePort

Custom Detections

Queries can be saved as custom detection rules that run periodically and generate alerts. The rule can be configured to run every 1, 6, 12, or 24 hours. When a query returns results, an alert is created in the Microsoft 365 Defender portal. Custom detections support both scheduled and real-time (within 5 minutes) modes.

Performance and Best Practices

Use time filters (Timestamp > ago(7d)) to limit data scanned.

Avoid join on large tables without filtering first.

Use summarize with dcount() for approximate distinct counts (faster than count(distinct)).

Use materialize when a subquery is used multiple times.

Limit columns with project to reduce data transfer.

Integration with Other Tools

Advanced Hunting data can be exported to Azure Sentinel (now Microsoft Sentinel) for longer retention and correlation with other logs. Queries can also be run via the Microsoft 365 Defender API for automation.

Walk-Through

1

Access the Advanced Hunting Portal

Log into the Microsoft 365 Defender portal (https://security.microsoft.com). Navigate to 'Hunting' > 'Advanced Hunting' from the left navigation pane. The portal opens a query editor with a schema browser on the left, showing all available tables grouped by data source (e.g., Endpoints, Email & collaboration, Identities). You can expand each group to see column definitions. The editor supports syntax highlighting and intellisense for table names and columns. You can also switch between tabs for multiple queries.

2

Write a Simple Filter Query

Start with a basic query to retrieve process creation events from the last 7 days: `DeviceProcessEvents | where Timestamp > ago(7d) | take 10`. Click 'Run query' to execute. The results appear in a table below. Observe the columns returned. You can click on any column header to sort. Use 'Export' to download results as CSV. This step verifies connectivity and data flow.

3

Refine with Project and Extend

Modify the query to show only relevant columns and add a computed column: `DeviceProcessEvents | where Timestamp > ago(7d) | project Timestamp, DeviceName, ProcessName, CommandLine | extend DayOfWeek = datetime_part('dayofweek', Timestamp)`. Run again. Notice that the DayOfWeek column shows an integer (0=Sunday, 6=Saturday). This demonstrates how to transform data inline.

4

Correlate Events with Join

Write a query that correlates process creation with network events to find processes that made outbound connections to a known malicious IP. First, filter DeviceNetworkEvents for the IP, then join with DeviceProcessEvents on DeviceId and a time window: `let maliciousIP = '10.0.0.1'; DeviceNetworkEvents | where RemoteIP == maliciousIP and Timestamp > ago(1d) | join kind=inner DeviceProcessEvents on DeviceId | project Timestamp, DeviceName, ProcessName, RemoteIP, RemotePort`. This step shows how to link events across tables.

5

Aggregate and Summarize Data

Use `summarize` to count the number of processes per device: `DeviceProcessEvents | where Timestamp > ago(7d) | summarize ProcessCount = count() by DeviceName | order by ProcessCount desc`. This query groups by DeviceName and counts rows. For distinct count of processes, use `dcount(ProcessName)`. This is useful for identifying devices with unusual process activity.

6

Create a Custom Detection Rule

After finalizing a query, click 'Create detection rule' in the toolbar. Provide a name, description, and set the frequency (e.g., every 1 hour). Choose which entities to alert on (e.g., Device, User). Configure the impact and severity. The rule will run automatically and generate alerts when the query returns results. This step turns a hunt into a continuous detection.

What This Looks Like on the Job

Scenario 1: Hunting for Lateral Movement

A security operations center (SOC) analyst suspects that an attacker has moved laterally across workstations using remote desktop protocol (RDP). Using Advanced Hunting, the analyst queries DeviceLogonEvents for LogonType == 10 (RemoteInteractive) within the last 24 hours, then joins with DeviceNetworkEvents to see if the source IP of the RDP connection matches an internal IP that also had a suspicious process. The query: DeviceLogonEvents | where LogonType == 10 and Timestamp > ago(1d) | join DeviceNetworkEvents on $left.RemoteIP == $right.RemoteIP | where DeviceNetworkEvents.RemotePort == 3389. This reveals the lateral movement path. In production, the analyst would also check for unusual account usage (e.g., service accounts logging on interactively).

Scenario 2: Investigating a Phishing Campaign

A company receives a wave of phishing emails. The analyst uses EmailEvents to find all emails with ThreatTypes containing 'Phish' in the last 7 days. They then join with EmailAttachmentInfo to extract SHA256 hashes of attachments. Using those hashes, they query DeviceFileEvents to see if any file was created on endpoints, and then DeviceProcessEvents to see if the file was executed. The query chain: EmailEvents | where ThreatTypes has 'Phish' | join EmailAttachmentInfo on NetworkMessageId | project SHA256 | join DeviceFileEvents on SHA256 | join DeviceProcessEvents on SHA256. This provides a complete kill chain from inbox to execution.

Scenario 3: Anomalous Outbound Connections

A security engineer monitors for data exfiltration. They create a custom detection that uses time series anomaly detection on DeviceNetworkEvents to flag devices with sudden spikes in outbound connections to new external IPs. The query uses make-series and series_decompose_anomalies to detect outliers. In production, the rule runs every 6 hours and alerts on devices with anomaly score > 1.5. Misconfiguration often occurs when the time bin is too large (e.g., 1 day) causing missed short bursts, or when the threshold is too low causing noise.

How SC-200 Actually Tests This

SC-200 Objective Coverage

Advanced Hunting with KQL is tested under objective 1.1: Hunt for threats using Microsoft 365 Defender. Specifically, sub-objectives include:

Construct KQL queries to search for threats across data sources.

Use joins, filters, and aggregations to correlate events.

Interpret query results and identify indicators of compromise.

Create custom detection rules from queries.

Understand the schema of key tables (DeviceProcessEvents, DeviceFileEvents, DeviceNetworkEvents, EmailEvents, etc.).

Common Wrong Answers and Traps

1.

Confusing `join` types: Candidates often use join kind=leftouter when they need inner. The exam tests understanding of when to include non-matching rows. For example, if you want only processes that have a corresponding network event, use inner. If you want all processes even without network events, use leftouter.

2.

Misusing `summarize` vs `extend`: summarize collapses rows into groups; extend adds columns row-by-row. A common mistake is using extend to count, which doesn't aggregate.

3.

Forgetting time filters: Queries without where Timestamp > ago(7d) can scan the entire 30-day retention, causing timeouts. The exam expects you to always include a time range.

4.

Incorrect table names: Tables like DeviceProcessEvents are often misspelled as DeviceProcessEvents (correct) vs ProcessEvents (wrong). The exam may present similar names to test exact recall.

Specific Values and Terms

LogonType values: 2 (Interactive), 10 (RemoteInteractive), 3 (Network), 7 (Unlock).

Retention: 30 days default, extendable to 180 days for some tables.

Query timeout: 10 minutes in portal, 30 minutes via API.

Result limit: 100,000 rows (portal), 500,000 (API).

Custom detection frequencies: 1, 6, 12, 24 hours.

Edge Cases

When using join, ensure the time window is appropriate. Joining on Timestamp exactly rarely works due to millisecond differences; use a time range like | where abs(DeviceProcessEvents.Timestamp - DeviceNetworkEvents.Timestamp) < 1m.

has vs contains: has is faster and matches whole tokens; contains matches substrings. The exam may test which to use for partial matching.

let statements: Variables can be used for IP lists, but the dynamic type is required for arrays.

Eliminating Wrong Answers

If a question asks for the number of distinct processes, look for dcount() not count(). count() counts rows, not distinct values.

If a question asks to correlate data from two sources, the answer likely involves a join or lookup. lookup is a specialized join that extends a table with values from another table without increasing row count.

If a question asks to find processes that started after a specific event, use where Timestamp > datetime(2023-01-01).

Key Takeaways

Advanced Hunting uses KQL to query raw data from endpoints, email, identities, and cloud apps in Microsoft 365 Defender.

Key tables: DeviceProcessEvents, DeviceFileEvents, DeviceNetworkEvents, DeviceLogonEvents, EmailEvents, EmailAttachmentInfo, IdentityLogonEvents, AlertInfo, AlertEvidence.

Always include a time filter using `where Timestamp > ago(Nd)` to limit data scanned and avoid timeouts.

Use `join` to correlate events across tables; specify `kind` (inner, leftouter, etc.) and join keys.

Use `summarize` with `count()`, `dcount()`, `min()`, `max()` for aggregation; `dcount()` is faster for distinct counts.

Custom detections can be created from any query and run every 1, 6, 12, or 24 hours to generate alerts.

Query results are limited to 100,000 rows in the portal; use `take` or `limit` to restrict output.

Use `let` to define variables and functions for reusable queries.

The `has` operator is faster than `contains` for whole-token matching; `has` does not match substrings.

For time series anomaly detection, use `make-series` and `series_decompose_anomalies`.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Advanced Hunting (KQL)

Built into Microsoft 365 Defender, no additional cost

Data retention: 30 days default, up to 180 days with custom policy

Limited to security data from M365 Defender sources

Queries run directly against ADX cluster in tenant

Custom detections run at fixed intervals (1,6,12,24h)

Microsoft Sentinel (KQL)

Separate Azure resource, requires licensing

Retention up to 2 years or more with hot/cold tiers

Can ingest data from any source (firewalls, syslog, etc.)

Queries run against Log Analytics workspace

Analytics rules can run every 5 minutes or on a schedule

Watch Out for These

Mistake

Advanced Hunting only works for endpoint data.

Correct

Advanced Hunting covers multiple domains: endpoints, email, identities, cloud apps, and more. Each domain has its own set of tables (e.g., EmailEvents, IdentityLogonEvents).

Mistake

KQL is case-sensitive for table and column names.

Correct

KQL is case-insensitive for identifiers. However, string comparisons using `==` are case-sensitive; use `=~` for case-insensitive matching.

Mistake

The `take` operator returns random rows.

Correct

`take` returns an arbitrary subset of rows, not necessarily random. For reproducible results, use `order by` before `take`.

Mistake

Custom detections can run every minute.

Correct

Custom detections have minimum frequencies of 1 hour. For near-real-time alerts, use Microsoft 365 Defender's built-in detections or configure scheduled queries with 5-minute intervals via the API (not supported in UI).

Mistake

You can modify data using KQL in Advanced Hunting.

Correct

KQL is a read-only query language. You cannot insert, update, or delete data. Advanced Hunting is for querying only.

Do You Actually Know This?

Reveal each answer, then mark whether you got it right. Score 60%+ to unlock the next chapter.

Frequently Asked Questions

How do I find all processes that ran on a specific device in the last 24 hours?

Use the DeviceProcessEvents table and filter by DeviceName and time: `DeviceProcessEvents | where DeviceName == "PC-01" and Timestamp > ago(1d) | project Timestamp, ProcessName, CommandLine`. Replace PC-01 with the actual device name. If you don't know the exact name, use `where DeviceName contains "PC"` for partial match.

What is the difference between `has` and `contains` in KQL?

`has` matches whole terms (tokens) and is faster because it uses an index. For example, `"the quick brown fox" has "brown"` returns true, but `"brown" has "row"` returns false. `contains` matches any substring, so `"brown" contains "row"` returns true. Use `has` when you want to match a complete word; use `contains` when you need substring matching.

How can I join two tables on a time window?

Since timestamps rarely match exactly, use a condition like `where abs(Table1.Timestamp - Table2.Timestamp) < 1m`. For example: `DeviceProcessEvents | where Timestamp > ago(1d) | join DeviceNetworkEvents on DeviceId | where abs(DeviceProcessEvents.Timestamp - DeviceNetworkEvents.Timestamp) < 5m`. This joins events that occurred within 5 minutes of each other on the same device.

Can I export Advanced Hunting results to a SIEM?

Yes, you can export results to Microsoft Sentinel (Azure Sentinel) using the Microsoft 365 Defender data connector. Additionally, you can use the Advanced Hunting API to programmatically retrieve data and send it to any SIEM. The portal also allows CSV export for manual transfer.

What does the `materialize` function do?

`materialize` caches the result of a subquery so that it can be reused multiple times in the same query without re-executing. This improves performance when a subquery is referenced more than once. Example: `let materializedEvents = materialize(DeviceProcessEvents | where Timestamp > ago(1d)); materializedEvents | summarize ... ; materializedEvents | join ...`.

How do I find emails with specific attachment names?

Use the EmailAttachmentInfo table: `EmailAttachmentInfo | where FileName contains "invoice" and Timestamp > ago(7d) | project NetworkMessageId, SenderFromAddress, RecipientEmailAddress, FileName`. You can then join with EmailEvents for more details like subject.

What is the default data retention for Advanced Hunting?

The default retention is 30 days for all tables. You can extend retention up to 180 days for certain tables using the 'Data retention' settings in the Microsoft 365 Defender portal. Extended retention may incur additional costs.

Terms Worth Knowing

Ready to put this to the test?

You've just covered Advanced Hunting with KQL in Defender — now see how well it sticks with free SC-200 practice questions. Full explanations included, no account needed.

Done with this chapter?