This chapter covers Microsoft Sentinel's Entity Mapping and Investigation Graph, two critical features for correlating disparate security events into a coherent incident narrative. On the SC-200 exam, approximately 5-10% of questions touch on entity behavior analytics, entity mapping, and the investigation graph. Understanding how Sentinel automatically extracts and links entities from raw data, and how to use the graph to pivot during investigations, is essential for passing objective 2.4. We will cover the underlying mechanics, configuration steps, and common exam traps.
Imagine a detective investigating a cyber intrusion. They have a large corkboard where they pin photos and documents and run strings between suspects, locations, and events. Each pin represents an entity: a person (user account), a place (IP address), a thing (device), or an event (alert). The strings represent relationships, like 'logged into' or 'communicated with'. Initially, the detective has scattered notes: a log file shows an IP address, an alert mentions a username, and a threat intelligence report lists a malicious domain. Without the board, these are isolated facts that the detective must connect by hand to see the big picture.
Sentinel's Entity Mapping automates this process. It takes raw data from various sources, normalizes it into a common schema (the Entity schema), and links related entities based on shared identifiers (e.g., a matching AccountSid or HostName). The Investigation Graph is the corkboard itself: a visual representation of entities and their relationships that lets the analyst traverse from an alert to related entities and back to other alerts. Just as the detective can follow a string from a suspect to a location to a weapon, the analyst can click on a user entity to see all alerts involving that user, all devices they accessed, and all IPs they connected to. This lets the analyst pivot quickly and uncover the full scope of an attack.
What is Entity Mapping and Why Does It Exist?
Entity mapping is the process by which Microsoft Sentinel extracts structured entities—such as user accounts, IP addresses, hosts, and file hashes—from unstructured or semi-structured log data and normalizes them into a common schema. This allows the platform to automatically link related events that share the same entity, enabling the Investigation Graph to display a unified relationship map. Without entity mapping, an analyst would need to manually correlate data across different tables and time ranges, a tedious and error-prone process.
How Entity Mapping Works Internally
Sentinel ingests data from multiple sources: Azure AD logs, Windows Security Events, Azure Activity logs, third-party firewalls, and more. Each source has its own schema. For example, a Windows Security Event 4624 (successful logon) contains fields like TargetUserName, IpAddress, and WorkstationName. A firewall log might contain src_ip, dst_ip, and user. Sentinel uses built-in parsers (functions that normalize data) to map these disparate fields into a standard set of entity types defined in the Microsoft Sentinel entity schema.
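To make the normalization idea concrete, here is a minimal Python sketch of mapping source-specific fields onto common entity shapes. The column names come from the examples above; the `FIELD_MAPS` table and the `normalize` helper are hypothetical illustrations, not how Sentinel's parsers are actually implemented.

```python
# Hypothetical per-source field maps: source column -> (entity type, identifier).
# Column names come from the examples in the text; the table itself is an assumption.
FIELD_MAPS = {
    "windows_4624": {
        "TargetUserName": ("Account", "AccountName"),
        "IpAddress": ("IP", "Address"),
        "WorkstationName": ("Host", "HostName"),
    },
    "firewall": {
        "user": ("Account", "AccountName"),
        "src_ip": ("IP", "Address"),
        "dst_ip": ("IP", "Address"),
    },
}


def normalize(source, record):
    """Translate one raw record into a list of entity dicts with a common shape."""
    entities = []
    for column, (entity_type, identifier) in FIELD_MAPS[source].items():
        value = record.get(column)
        if value:  # skip columns that are absent or empty in this record
            entities.append({"Type": entity_type, identifier: value})
    return entities


logon = {"TargetUserName": "alice", "IpAddress": "203.0.113.5", "WorkstationName": "laptop01"}
flow = {"src_ip": "203.0.113.5", "dst_ip": "198.51.100.7", "user": "alice"}

print(normalize("windows_4624", logon))
print(normalize("firewall", flow))
```

Both records end up describing the same IP with the same `Address` identifier, which is what makes later linking possible.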
The entity types include:
- Account: Represents a user or service account. Key identifiers: AccountSid, AccountUPN, AccountName, AccountDomain.
- Host: Represents a device. Key identifiers: HostName, DnsDomain, AzureID, OMSAgentID.
- IP: Represents an IP address (IPv4 or IPv6). Identifier: Address.
- URL: Represents a URL. Identifier: Url.
- FileHash: Represents a file hash. Identifiers: Algorithm (e.g., SHA256) and Value.
- Process: Represents a running process. Identifiers: ProcessId, CommandLine, ImageFile.
- Malware: Represents a malware entity (often from threat intel).
- Mailbox: Represents an Exchange mailbox.
- MailMessage: Represents an email message.
- SubmissionMail: Represents an email submission.
When an alert is triggered (by either a built-in analytics rule or a custom rule), Sentinel extracts entities from the alert's results. For example, if a rule detects multiple failed logons from an IP, Sentinel extracts the Account and IP entities. These entities are stored in the Entities column of the SecurityAlert table as a JSON array of entity objects. Incident records in the SecurityIncident table do not carry their own Entities column; they reference their alerts through AlertIds.
The Investigation Graph
The Investigation Graph is a visual tool that displays entities and their relationships. It is accessible from the Sentinel incident details page. When you open an incident, click "Investigate" to launch the graph. The graph starts with the incident's root entities (those extracted from the triggering alert). From there, you can expand entities to see related alerts, other entities, and even bookmarks.
Key features of the graph:
- Entity expansion: Right-click an entity (e.g., an IP) and select "Explore related alerts" to see all alerts involving that IP in the last 30 days (configurable up to 90 days).
- Relationship types: Lines between entities are labeled with the relationship (e.g., "Connected to", "Contains", "Related alert"). The relationship is derived from the context in which entities appear together in an alert.
- Bookmarks: You can add bookmarks to save a particular state of the graph for later reference.
- Hunting integration: You can run a hunting query directly from the graph by selecting an entity and choosing "Run query".
Configuration and Verification
Entity mapping is largely automatic, but you can influence it via:
- Analytics rule settings: When creating a custom analytics rule, you can explicitly map fields from the query results to entity types. This is done in the "Set rule logic" step under "Entity mapping". For example, if your KQL query returns a column UserPrincipalName, you can map it to the Account entity's AccountUPN property.
- Normalization parsers: You can write custom parsers using KQL functions to normalize data from custom logs into the standard entity schema.
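As a rough illustration of what an explicit mapping does, the Python sketch below applies a mapping of query columns to entity identifiers for one result row. The `ENTITY_MAPPING` structure and the `map_entities` helper are hypothetical; they model the idea only and do not reflect the rule's real storage format.

```python
# Hypothetical representation of a rule's entity mapping:
# entity type -> {identifier: query result column}.
ENTITY_MAPPING = {
    "Account": {"AccountUPN": "UserPrincipalName"},
    "IP": {"Address": "SourceIP"},
}


def map_entities(row, mapping):
    """Build entity dicts for one query result row using an explicit mapping."""
    entities = []
    for entity_type, identifiers in mapping.items():
        entity = {"Type": entity_type}
        for identifier, column in identifiers.items():
            if row.get(column):
                entity[identifier] = row[column]
        if len(entity) > 1:  # keep the entity only if at least one identifier was filled
            entities.append(entity)
    return entities


row = {"UserPrincipalName": "user@contoso.com", "SourceIP": "203.0.113.5", "FailedCount": 12}
print(map_entities(row, ENTITY_MAPPING))
print(map_entities(row, {}))  # no mapping configured, so no entities come out
```

Columns that are not named in the mapping, such as `FailedCount` here, never become entities, no matter what they contain.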
To verify entity mapping, run a KQL query against the SecurityAlert table:
SecurityAlert
| where TimeGenerated > ago(24h)
| extend Entities = parse_json(Entities)
| mv-expand Entities
| project TimeGenerated, AlertName, Entities_Type = tostring(Entities['Type']), Entities_Name = tostring(Entities['Name'])
This returns one row per entity, showing which entity types were extracted for each alert. The Name property is populated mainly for Account entities; other types carry their value in properties such as Address or HostName.
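If you export alert records and want to inspect them offline, the same flattening can be sketched in Python. The sample records and the `expand_entities` helper are illustrative assumptions; the only thing taken from the text is that `Entities` holds a JSON array of entity objects.

```python
import json

# Illustrative alert records; the Entities field holds a JSON array serialized as text.
alerts = [
    {
        "AlertName": "Multiple failed logons",
        "Entities": json.dumps([
            {"Type": "account", "Name": "alice"},
            {"Type": "ip", "Address": "203.0.113.5"},
        ]),
    },
    {"AlertName": "Custom rule without mapping", "Entities": "[]"},
]


def expand_entities(records):
    """Yield one (alert name, entity type, entity name) row per entity."""
    for record in records:
        for entity in json.loads(record["Entities"] or "[]"):
            yield record["AlertName"], entity.get("Type", ""), entity.get("Name", "")


rows = list(expand_entities(alerts))
for row in rows:
    print(row)
```

In this sketch an alert with an empty array contributes no rows at all, which is a quick way to spot rules that are missing entity mapping.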
Interaction with Related Technologies
Entity mapping integrates with:
- UEBA (User and Entity Behavior Analytics): UEBA uses the same entity schema to build behavioral profiles for entities (e.g., baseline of logon times, locations). Anomalies detected by UEBA are surfaced as alerts and appear in the investigation graph.
- Threat Intelligence: Indicators of compromise (IOCs) from threat intelligence feeds are mapped to entity types (IP, URL, FileHash). When an alert involves an entity that matches an IOC, the graph shows a threat indicator.
- Watchlists: You can import custom data (e.g., VIP users) as watchlists. These can be joined with entity data in analytics rules, but they do not automatically become graph entities unless mapped.
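The threat intelligence match described above boils down to comparing entity values against a set of indicators. A minimal Python sketch, assuming a hypothetical in-memory `INDICATORS` set and illustrative values (the identifier names Address, Url, and Value are the ones listed earlier in this chapter):

```python
# Hypothetical indicator set keyed by (entity type, value); values are illustrative.
INDICATORS = {
    ("IP", "203.0.113.5"),
    ("URL", "http://malicious.example/payload"),
    ("FileHash", "a" * 64),  # placeholder standing in for a SHA256 value
}

# Which property of each entity type carries the value to compare.
VALUE_FIELD = {"IP": "Address", "URL": "Url", "FileHash": "Value"}


def matching_indicators(entities):
    """Return the entities whose value appears in the indicator set."""
    hits = []
    for entity in entities:
        field = VALUE_FIELD.get(entity.get("Type"))
        if field and (entity["Type"], entity.get(field)) in INDICATORS:
            hits.append(entity)
    return hits


alert_entities = [
    {"Type": "Account", "AccountName": "alice"},
    {"Type": "IP", "Address": "203.0.113.5"},
    {"Type": "FileHash", "Algorithm": "SHA256", "Value": "b" * 64},
]
print(matching_indicators(alert_entities))
```

Only the IP entity matches here; the Account entity is ignored because indicators are mapped to IP, URL, and FileHash types only.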
Default Values and Timers
The investigation graph queries for related alerts up to 30 days back by default. This can be changed in the graph settings (gear icon) to up to 90 days.
Entity expansion queries are limited to 100 results per expansion to avoid overwhelming the UI.
The graph will automatically refresh every 60 seconds if new alerts are ingested for the entities.
Common Pitfalls
Missing entity mapping: If an alert does not have entities extracted, the graph will show only the alert node with no connections. This often happens when custom analytics rules do not define entity mapping explicitly.
Duplicate entities: The same user may appear as multiple entity nodes if different identifiers are used (e.g., one alert uses AccountSid and another uses AccountUPN). Sentinel does not automatically merge them; you must use the graph's "Merge" feature (available in the right-click menu) to combine them.
Performance: For very large incidents with hundreds of entities, the graph can become slow. Use the "Filter" option to focus on specific entity types.
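The duplicate-entity pitfall above is easier to see with a small model. In this Python sketch, nodes are keyed by whichever identifier an alert supplied, so the same person shows up twice until a manual merge is recorded. The `node_key` rule and the `merges` table are simplifying assumptions, not the graph's actual logic.

```python
def node_key(entity):
    """Key an Account node by the first identifier present (an assumed, simplified rule)."""
    for identifier in ("AccountSid", "AccountUPN", "AccountName"):
        if identifier in entity:
            return (identifier, entity[identifier].lower())
    raise ValueError("entity has no known identifier")


def build_nodes(entities, merges=None):
    """Group entities into nodes; `merges` maps a duplicate key to the key it was merged into."""
    merges = merges or {}
    nodes = {}
    for entity in entities:
        key = node_key(entity)
        key = merges.get(key, key)
        nodes.setdefault(key, []).append(entity)
    return nodes


entities = [
    {"Type": "Account", "AccountSid": "S-1-5-21-1111-2222-3333-1001"},  # from one alert
    {"Type": "Account", "AccountUPN": "alice@contoso.com"},             # from another alert
]

print(len(build_nodes(entities)))  # separate nodes for the same person

sid_key = node_key(entities[0])
upn_key = node_key(entities[1])
print(len(build_nodes(entities, merges={sid_key: upn_key})))  # a single node after the merge
```

Nothing in the data links the SID to the UPN, which is why the merge has to be an explicit analyst decision in this model.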
Ingest Raw Log Data
Sentinel collects logs from various sources such as Azure AD, Windows Event Logs, and third-party security appliances. Each log source has its own schema. For example, a Windows Event ID 4624 contains fields like `TargetUserName`, `IpAddress`, `WorkstationName`. Sentinel uses Data Connectors to ingest this data into Log Analytics tables (e.g., `SecurityEvent`, `SigninLogs`). At this stage, no entities have been extracted; the data sits in source-specific columns.
Run Analytics Rule and Generate Alert
An analytics rule (e.g., a scheduled query) runs on the ingested data. The KQL query returns results that match a detection pattern. For instance, a rule detecting multiple failed logins might output rows with `AccountName`, `SourceIP`, and `TimeGenerated`. When the rule triggers, it creates an alert. During alert creation, Sentinel extracts entities from the query results based on the entity mapping defined in the rule. If the rule defines no mapping, the alert is created without entities, even when the query output contains columns such as `AccountName` that look like entity properties.
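The failed-login detection used as the example above can be sketched in Python as a simple group-and-threshold step. The event shape, the `Result` field, and the `THRESHOLD` value are assumptions made for illustration; a real rule would express this in KQL over the ingested tables.

```python
from collections import defaultdict

THRESHOLD = 3  # assumed value; a real rule would tune this


def detect_failed_logons(events, threshold=THRESHOLD):
    """Return one result row per source IP with at least `threshold` failed logons."""
    by_ip = defaultdict(list)
    for event in events:
        if event["Result"] == "Failure":
            by_ip[event["SourceIP"]].append(event["AccountName"])
    return [
        {"SourceIP": ip, "Accounts": sorted(set(accounts)), "FailedCount": len(accounts)}
        for ip, accounts in by_ip.items()
        if len(accounts) >= threshold
    ]


events = [
    {"AccountName": "alice", "SourceIP": "203.0.113.5", "Result": "Failure"},
    {"AccountName": "bob", "SourceIP": "203.0.113.5", "Result": "Failure"},
    {"AccountName": "alice", "SourceIP": "203.0.113.5", "Result": "Failure"},
    {"AccountName": "carol", "SourceIP": "198.51.100.7", "Result": "Failure"},
    {"AccountName": "alice", "SourceIP": "203.0.113.5", "Result": "Success"},
]
print(detect_failed_logons(events))
```

Each returned row is what the rule hands to alert creation; the entity mapping then decides which of its columns become Account and IP entities.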
Extract Entities and Store in Alert
Sentinel parses the alert's results and extracts entities according to the mapping. For each entity type, it populates the relevant identifiers (e.g., for Account: `AccountSid`, `AccountUPN`, `AccountName`). These entities are stored as a JSON array in the `Entities` column of the `SecurityAlert` table. The extraction is deterministic: if a field is mapped to `AccountUPN`, the value must be in UPN format (for example, user@contoso.com). If the value is invalid, the entity may be skipped.
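The format requirement can be pictured as a validation step that drops values that do not look like a UPN. The pattern below is a deliberately simplified assumption (something@domain.tld) and the `extract_account` helper is hypothetical; Sentinel's real validation rules are not described in this chapter.

```python
import re

# Simplified UPN check (something@domain.tld); an assumption, not Sentinel's actual validation.
UPN_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def extract_account(row, column):
    """Return an Account entity for `column`, or None when the value is not UPN-shaped."""
    value = row.get(column) or ""
    if not UPN_PATTERN.match(value):
        return None  # invalid value: the entity is skipped
    return {"Type": "Account", "AccountUPN": value}


print(extract_account({"User": "user@contoso.com"}, "User"))
print(extract_account({"User": "CONTOSO\\user"}, "User"))
```

A value in DOMAIN\user form fails the check, which mirrors the common mistake of mapping a down-level logon name to a UPN identifier.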
Create Incident and Open Investigation Graph
When an alert is generated, Sentinel can automatically create an incident (if configured). The incident groups related alerts. From the incident details, an analyst clicks "Investigate" to open the Investigation Graph. The graph initially shows the incident's root entities (those from the triggering alert) as nodes. Each node displays the entity type icon (e.g., a person for Account, a monitor for Host) and the primary name (e.g., username, IP address).
Expand Entities to Explore Relationships
The analyst can right-click an entity node (e.g., an IP address) and select "Explore related alerts". Sentinel queries the `SecurityAlert` table for all alerts that contain the same entity identifier within the last 30 days (default). The results are displayed as new nodes connected by relationship lines. The analyst can also expand to see related entities (e.g., other accounts that logged in from that IP). This process can be repeated to traverse the graph and uncover the full attack chain.
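Conceptually, the expansion step is a filtered lookup: find alerts inside the time window that contain the same identifier value, then cap the result count. The Python sketch below models that with in-memory records; the `related_alerts` helper and the record shape are assumptions, while the 30-day and 100-result defaults are the figures quoted in this chapter.

```python
from datetime import datetime, timedelta, timezone


def related_alerts(alerts, entity_type, identifier, value, now, days=30, limit=100):
    """Return alerts within `days` of `now` that contain the given entity identifier value."""
    cutoff = now - timedelta(days=days)
    matches = [
        alert for alert in alerts
        if alert["TimeGenerated"] >= cutoff
        and any(e.get("Type") == entity_type and e.get(identifier) == value
                for e in alert["Entities"])
    ]
    matches.sort(key=lambda alert: alert["TimeGenerated"], reverse=True)  # newest first
    return matches[:limit]


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
ip = {"Type": "IP", "Address": "203.0.113.5"}
alerts = [
    {"AlertName": "Unusual sign-in", "TimeGenerated": now - timedelta(days=2), "Entities": [ip]},
    {"AlertName": "Failed logons", "TimeGenerated": now - timedelta(days=10), "Entities": [ip]},
    {"AlertName": "Old alert", "TimeGenerated": now - timedelta(days=45), "Entities": [ip]},
    {"AlertName": "Other IP", "TimeGenerated": now - timedelta(days=1),
     "Entities": [{"Type": "IP", "Address": "198.51.100.7"}]},
]

for alert in related_alerts(alerts, "IP", "Address", "203.0.113.5", now):
    print(alert["AlertName"])
```

Passing `days=90` would also pick up the 45-day-old alert, which is the trade-off between coverage and result volume that the time range setting controls.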
Scenario 1: Incident Response for a Compromised User Account
A large enterprise uses Sentinel to monitor Azure AD sign-ins. An analytics rule detects a user logging in from an unusual location. The alert extracts entities: Account (user@contoso.com), IP (203.0.113.5), and Host (laptop01). The incident is assigned to an analyst who opens the Investigation Graph. The analyst expands the IP entity and discovers that the same IP was used in 15 other sign-in attempts against different accounts in the last 24 hours, a pattern that points to a password-spray or brute-force attack. By expanding one of those accounts, the analyst finds that the attacker successfully logged into a privileged account. The graph shows all related alerts, enabling the analyst to contain the breach quickly. In production, the organization has set the graph time range to 7 days to balance performance and coverage. They also use bookmarks to document the investigation path for compliance.
Scenario 2: Threat Hunting with File Hashes
A security operations center (SOC) receives a threat intelligence feed containing SHA256 hashes of known ransomware. Sentinel automatically maps these hashes to the FileHash entity. An analytics rule triggers when a hash from the feed appears in an alert from Microsoft Defender for Endpoint. The investigation graph shows the file hash node connected to the host where it was executed, the user who ran it, and the network connections made. The analyst expands the host to see all alerts involving that host, revealing that the same host also downloaded content from a malicious URL. The graph allows the analyst to pivot from file to host to network, identifying lateral movement. In this deployment, the SOC has configured the graph to show up to 90 days of related alerts, but they limit expansions to 50 results to avoid performance issues. They also use the "Merge" feature to combine duplicate host entities that appear due to different identifiers (AzureID vs. HostName).
Common Misconfigurations and Pitfalls
Missing entity mapping in custom rules: Many custom analytics rules fail to define entity mapping, resulting in alerts with no entities. The investigation graph then shows only the alert node, rendering it useless. To fix, always map at least one entity (e.g., Account or IP) in the rule's entity mapping section.
Overly broad time range: Setting the graph to query 90 days of data can cause slow performance. Best practice is to start with 30 days and increase only if needed.
Ignoring duplicate entities: When entities are not merged, the graph can become cluttered with multiple nodes for the same entity. Analysts should use the merge feature to consolidate.
SC-200 Objective 2.4: "Create and manage entities"
The exam tests your ability to:
Configure entity mapping in analytics rules
Interpret the Investigation Graph
Understand how entities are used in UEBA and threat intelligence
Most Common Wrong Answers
"Entities are automatically extracted from all logs without any configuration." This is false. Alerts from connected Microsoft security products arrive with entities already populated, but for custom logs or custom analytics rules you must map entities explicitly. The exam loves to present a scenario where a custom rule fails to produce entities, and the correct answer is to add entity mapping.
"The Investigation Graph shows all data from the last 90 days by default." The default is 30 days. The 90-day option is available but must be manually set. Exam questions often ask: "What is the default time range for related alerts in the investigation graph?" The answer is 30 days.
"Entity mapping is only available for built-in analytics rules." This is incorrect. Entity mapping can be configured for custom analytics rules as well, in the "Set rule logic" step.
Specific Numbers and Terms
Entity types tested: Account, Host, IP, URL, FileHash, Process, Malware, Mailbox, MailMessage, SubmissionMail.
Default graph time range: 30 days.
Maximum expansion results: 100.
The SecurityAlert table's Entities column holds a JSON array serialized as a string, which is why queries apply `parse_json()` before expanding it.
To view entities in KQL: `SecurityAlert | extend Entities = parse_json(Entities) | mv-expand Entities`.
Edge Cases
When different alerts identify the same entity by different identifiers (e.g., one by AccountSid and another by AccountUPN), the graph may show two separate nodes. The exam expects you to know that you can merge them manually.
If an analytics rule returns multiple rows with different entities, all are extracted. However, if the rule does not have entity mapping, no entities are extracted even if the query outputs field names that match entity properties.
How to Eliminate Wrong Answers
If an answer says "entities are automatically extracted from all logs", it is likely wrong because custom logs require mapping.
If an answer mentions "90 days" as default, it is wrong; default is 30 days.
If an answer says "entity mapping is only for built-in rules", it is wrong.
Focus on answers that emphasize explicit mapping in the analytics rule configuration.
Entity mapping extracts structured entities (Account, Host, IP, etc.) from alert data and stores them in the `Entities` column of the `SecurityAlert` table.
Custom analytics rules require explicit entity mapping in the rule configuration; otherwise, no entities are extracted.
The Investigation Graph defaults to showing related alerts within the last 30 days; this can be changed to up to 90 days.
Each entity expansion in the graph returns up to 100 results.
Duplicate entities (e.g., same user with different identifiers) must be merged manually using the graph's merge feature.
Entities are used by UEBA to build behavioral baselines and by threat intelligence to match IOCs.
The `SecurityAlert` table's `Entities` column holds a JSON array stored as a string; use `parse_json()` and `mv-expand` to query it.
The Investigation Graph is accessible from the incident details page by clicking 'Investigate'.
These come up on the exam all the time. Here's how to tell them apart.
Entity Mapping in Analytics Rules
Configured per rule in the Microsoft Sentinel interface.
Maps fields from the query result to entity types (e.g., Account, Host).
Entities are extracted at alert generation time.
Simpler to implement for individual rules.
Does not affect data in the underlying Log Analytics tables.
Entity Mapping via Normalization Parsers
Implemented as KQL functions that normalize data from custom logs.
Transforms raw data into a standardized schema (e.g., an ASIM schema such as Network Session).
Entities can be extracted at query time by any rule using the parser.
More complex but reusable across multiple rules.
Changes how data is presented at query time; the records stored in the Log Analytics workspace are not modified.
Mistake
Entities are automatically extracted from every log source without any configuration.
Correct
Alerts from connected Microsoft security products such as Microsoft Defender for Endpoint arrive with entities already populated, but for custom logs or custom analytics rules, explicit entity mapping is required in the rule's configuration. Without mapping, the `Entities` column in the alert will be empty.
Mistake
The Investigation Graph shows all historical data for an entity.
Correct
By default, the graph queries for related alerts within the last 30 days. This can be increased to up to 90 days, but not beyond. Also, the graph only shows alerts that have the entity extracted; it does not query raw logs directly.
Mistake
Entity mapping can only be done in built-in analytics rules.
Correct
Entity mapping is available for both built-in and custom analytics rules. When creating a custom rule, you can map fields from your KQL query to entity types in the "Entity mapping" section of the rule creation wizard.
Mistake
The Investigation Graph automatically merges duplicate entities.
Correct
The graph does not automatically merge duplicates. If the same user appears with different identifiers (e.g., AccountSid vs. AccountUPN), they will appear as separate nodes. The analyst must manually merge them using the right-click "Merge" option.
Mistake
Entities are stored in a separate table from alerts.
Correct
Entities are stored within the `SecurityAlert` table as a JSON array in the `Entities` column. There is no separate entity table; entities are embedded in the alert record.
In the Sentinel analytics rule creation wizard, after writing your KQL query, go to the 'Set rule logic' step. Scroll down to 'Entity mapping'. Click 'Add new entity'. Choose the entity type (e.g., Account) and then map the field from your query results to the appropriate entity identifier (e.g., AccountUPN). You can map multiple entities per rule. Without this mapping, no entities will be extracted.
This happens when the alert has no entities extracted. Check the analytics rule that generated the alert: does it have entity mapping configured? If not, add mapping. Also verify that the field values are in the correct format (e.g., UPN for AccountUPN). You can run `SecurityAlert | where SystemAlertId == '<alert-id>' | extend Entities = parse_json(Entities) | project Entities` to see if the Entities array is empty.
Yes. In the Investigation Graph, click the gear icon (Settings) in the top right. Under 'Related alerts time range', you can select from 1 day to 90 days. The default is 30 days. Note that increasing the range may slow down performance.
Right-click on one of the duplicate entity nodes (e.g., two Account nodes for the same user). Select 'Merge'. Then choose the target entity node to merge into. The merged node will combine all relationships from both original nodes. This is useful when the same entity appears with different identifiers (e.g., one with AccountSid and another with AccountUPN).
The supported entity types are: Account, Host, IP, URL, FileHash, Process, Malware, Mailbox, MailMessage, SubmissionMail, CloudApplication, DNS, AzureResource, and more. For the SC-200 exam, focus on Account, Host, IP, URL, FileHash, and Process.
Use the following KQL query: `SecurityAlert | where TimeGenerated > ago(24h) | extend Entities = parse_json(Entities) | mv-expand Entities | project TimeGenerated, AlertName, Entities_Type = tostring(Entities['Type']), Entities_Name = tostring(Entities['Name'])`. This expands the JSON array into one row per entity and projects each entity's type and name.
Yes. When you connect Microsoft Defender for Endpoint to Sentinel, alerts from Defender automatically include entities such as Account, Host, IP, and FileHash. The entity mapping is built-in for these connectors. However, if you create custom analytics rules on top of Defender data, you must still configure entity mapping manually.