This chapter covers Microsoft Sentinel data connectors, which ingest security data from diverse sources into Sentinel for analysis and threat detection. On the SC-200 exam, data connectors account for approximately 15-20% of the questions in the 'Manage a Microsoft Sentinel workspace' domain (objective 2.1). Understanding connector types, configuration steps, and prerequisites is critical both for passing the exam and for real-world deployment. This chapter walks through each connector category, best practices, and common pitfalls.
Think of Microsoft Sentinel as a central mail processing facility. Data connectors are like the various mail collection points and sorting machines that bring letters (log data) into the facility. Each connector is designed for a specific type of mail: Amazon Web Services (AWS) CloudTrail logs arrive via a dedicated courier service (the AWS S3 connector), Azure Active Directory sign-in logs come through a government mail route (the Azure AD connector), and Syslog messages from on-premises Linux servers are like bulk mail that gets sorted by a machine (the Syslog connector). The Common Event Format (CEF) connector is like a standard envelope format that allows different senders to use a consistent layout. Just as the postal facility has rules about what mail it accepts and how it's processed, each connector has prerequisites, configuration steps, and data format requirements. If a connector is misconfigured, logs may be lost, delayed, or malformed, just as a wrong address or insufficient postage causes mail to be returned or lost. The facility's health monitoring (Sentinel health monitoring) tracks whether each collection point is working, alerting the postmaster (security analyst) if a route fails.
What Are Microsoft Sentinel Data Connectors?
Microsoft Sentinel data connectors are the mechanisms that bring security data from various sources into your Sentinel workspace. They act as bridges between data producers (like firewalls, servers, cloud services) and the Sentinel data ingestion pipeline. Without connectors, Sentinel would have no data to analyze. The SC-200 exam tests your ability to select, configure, and troubleshoot these connectors.
Types of Data Connectors
Sentinel supports two main categories of connectors:
Service-to-service connectors: These are native connectors that directly pull data from Microsoft services (e.g., Azure Active Directory, Azure Activity, Office 365). They require minimal configuration and are typically enabled from the Sentinel portal.
Ingestion-based connectors: These require additional infrastructure to forward logs to Sentinel. Examples include Syslog, CEF, and the Log Analytics agent. They are used for non-Microsoft sources or on-premises devices.
Additionally, there are API-based connectors (e.g., AWS S3, GCP) and agent-based connectors (e.g., Windows Security Events via MMA or AMA).
How Connectors Work Internally
When you enable a connector, Sentinel creates a data collection rule (DCR) or uses a Log Analytics workspace as a target. For service-to-service connectors, the data flows directly from the source service to the workspace via Azure Monitor pipelines. For ingestion-based connectors, data is sent through an agent (the Log Analytics agent or the Azure Monitor Agent) or directly to the workspace via the HTTP Data Collector API. The connector then maps the incoming data to specific tables in the workspace (e.g., Syslog, CommonSecurityLog, AWSCloudTrail).
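To make the "directly to the workspace via the HTTP Data Collector API" path more concrete, here is a minimal sketch of how a client could build the `SharedKey` authorization header that API expects. The function name, the placeholder workspace ID, and the dummy key are assumptions for illustration only; nothing is sent over the network.

```python
import base64
import hashlib
import hmac


def build_signature(workspace_id: str, shared_key: str, date: str,
                    content_length: int, method: str = "POST",
                    content_type: str = "application/json",
                    resource: str = "/api/logs") -> str:
    """Return the Authorization header value for a Data Collector API request."""
    # The string to sign joins the request properties with newlines.
    string_to_sign = (
        f"{method}\n{content_length}\n{content_type}\n"
        f"x-ms-date:{date}\n{resource}"
    )
    # The workspace key is base64 text; decode it before using it as the HMAC key.
    decoded_key = base64.b64decode(shared_key)
    digest = hmac.new(decoded_key, string_to_sign.encode("utf-8"),
                      hashlib.sha256).digest()
    encoded = base64.b64encode(digest).decode("utf-8")
    return f"SharedKey {workspace_id}:{encoded}"


# Placeholder values for illustration only; these are not real credentials.
demo_key = base64.b64encode(b"not-a-real-key").decode("utf-8")
header = build_signature(
    workspace_id="00000000-0000-0000-0000-000000000000",
    shared_key=demo_key,
    date="Mon, 01 Jan 2024 00:00:00 GMT",
    content_length=42,
)
print(header)
```

In a real request the same date string would also be sent in the `x-ms-date` header, and the target table name would be supplied in the `Log-Type` header.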
#### Example: Syslog Connector
The Syslog connector uses the Log Analytics agent (or Azure Monitor Agent) installed on a Linux server that receives Syslog messages. The agent forwards the messages to the workspace. The connector defines the facility and severity levels to collect. By default, all facilities and severities are collected, but you can filter to reduce noise. The data lands in the Syslog table.
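Facility and severity are not free-text labels: every syslog message carries them in its leading `<PRI>` value, where PRI = facility × 8 + severity. The sketch below decodes that value and applies a facility/severity filter of the kind described above. It illustrates the arithmetic only; the function names are hypothetical and this is not how the agent itself is implemented.

```python
import re

# Conventional facility names by numeric code (0-23).
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert", "clock",
    "local0", "local1", "local2", "local3",
    "local4", "local5", "local6", "local7",
]
# Severity names by numeric code (0 = most severe).
SEVERITIES = [
    "Emergency", "Alert", "Critical", "Error",
    "Warning", "Notice", "Informational", "Debug",
]


def decode_pri(message: str) -> tuple[str, str]:
    """Return (facility, severity) decoded from the leading <PRI> value."""
    match = re.match(r"<(\d{1,3})>", message)
    if not match:
        raise ValueError("message has no <PRI> header")
    facility, severity = divmod(int(match.group(1)), 8)
    if facility >= len(FACILITIES):
        raise ValueError("PRI value out of range")
    return FACILITIES[facility], SEVERITIES[severity]


def should_collect(message: str, facility: str, min_severity: str) -> bool:
    """True if the message matches the facility and is at least min_severity."""
    msg_facility, msg_severity = decode_pri(message)
    return (msg_facility == facility
            and SEVERITIES.index(msg_severity) <= SEVERITIES.index(min_severity))


# 34 = 4 * 8 + 2, so facility auth and severity Critical.
failed_su = "<34>Oct 11 22:14:15 host su: 'su root' failed"
# 38 = 4 * 8 + 6, so facility auth and severity Informational.
login_ok = "<38>Oct 11 22:15:02 host sshd: session opened"

print(decode_pri(failed_su))
print(should_collect(failed_su, "auth", "Warning"))
print(should_collect(login_ok, "auth", "Warning"))
```

With a filter of `auth` at `Warning` and above, the first message is kept and the second is dropped, which is the noise reduction the connector's facility and severity settings are meant to provide.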
#### Example: CEF Connector
CEF is a standard log format used by many security appliances. The CEF connector uses a Syslog forwarder (a Linux server with the Log Analytics agent). The syslog daemon on the forwarder (rsyslog or syslog-ng) listens for appliance traffic on port 514 (TCP or UDP) and relays CEF messages to the agent on local TCP port 25226. The agent parses the CEF messages and forwards them to Sentinel, where they are stored in the CommonSecurityLog table. The forwarder is typically set up with Microsoft's deployment script (`cef_installer.py`), which configures both the daemon and the agent.
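To show what "parsing the CEF messages" means in practice, here is a simplified sketch that splits a CEF line into its seven pipe-delimited header fields plus the key=value extension. The sample message is invented, the output keys are illustrative labels rather than a guaranteed match for `CommonSecurityLog` column names, and edge cases such as an escaped backslash directly before a pipe or escaped `=` signs inside values are deliberately not handled.

```python
import re

# Header fields in the order the CEF format defines them.
CEF_HEADER_FIELDS = [
    "Version", "DeviceVendor", "DeviceProduct", "DeviceVersion",
    "DeviceEventClassID", "Name", "Severity",
]


def parse_cef(message: str) -> dict:
    """Parse a CEF message (optionally preceded by a syslog prefix)."""
    start = message.find("CEF:")
    if start == -1:
        raise ValueError("not a CEF message")
    body = message[start + len("CEF:"):]
    # Split on pipes that are not escaped with a backslash; the 8th part,
    # if present, is the extension and is left unsplit.
    parts = re.split(r"(?<!\\)\|", body, maxsplit=7)
    if len(parts) < 7:
        raise ValueError("incomplete CEF header")
    event = {
        name: value.replace("\\|", "|").replace("\\\\", "\\")
        for name, value in zip(CEF_HEADER_FIELDS, parts)
    }
    extension = parts[7] if len(parts) > 7 else ""
    # Extension values may contain spaces, so a value runs until the
    # next " key=" token or the end of the string.
    pairs = re.findall(r"(\w+)=(.*?)(?=\s+\w+=|$)", extension)
    event["Extension"] = dict(pairs)
    return event


# Hypothetical sample line, for illustration only.
sample = ("Jan 1 00:00:00 fw01 CEF:0|Palo Alto Networks|PAN-OS|10.1|threat|"
          "Port scan|7|src=10.0.0.1 dst=10.0.0.2 msg=Port scan detected")
event = parse_cef(sample)
print(event["DeviceVendor"], event["DeviceProduct"], event["Extension"])
```

If the same line were ingested through the plain Syslog connector, everything after the syslog prefix would remain a single unparsed message string, which is why CEF sources should use the CEF connector.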
Key Components and Defaults
Log Analytics agent (MMA): Used for Windows and Linux servers. Default heartbeat interval is 1 minute. The agent sends data to the workspace using TCP 443.
Azure Monitor Agent (AMA): The newer agent that supports data collection rules (DCRs). It is recommended for new deployments.
Data Collection Rules (DCRs): Define what data to collect and how to transform it. Used with AMA.
Syslog facility and severity: Default collects all. Exam may test that you can filter by facility (e.g., auth, cron, daemon, kern, local0-local7, mail, news, syslog, user, uucp) and severity (Emergency, Alert, Critical, Error, Warning, Notice, Informational, Debug).
CEF port: The agent listens on local TCP 25226 by default, while appliances send CEF to the forwarder's syslog daemon, commonly on UDP 514 (TCP 514 is also used). The forwarder must listen on the port the appliance sends to.
AWS S3 connector: Requires an AWS IAM role with read access to the S3 bucket and a Simple Queue Service (SQS) queue for notifications. The connector polls SQS for new logs.
Configuration and Verification Commands
To verify connector health, you can use the following KQL queries:
// Check Syslog connector health
Syslog
| where TimeGenerated > ago(1h)
| summarize Count = count() by Computer

// Check CEF connector health
CommonSecurityLog
| where TimeGenerated > ago(1h)
| summarize Count = count() by DeviceVendor, DeviceProduct

For agent connectivity:

# On a Windows agent, test connectivity to the workspace (run from the agent install folder)
cd "C:\Program Files\Microsoft Monitoring Agent\Agent"
.\TestCloudConnection.exe

# On a Linux agent, list onboarded workspaces and whether the agent is running
sudo /opt/microsoft/omsagent/bin/omsadmin.sh -l

Interaction with Related Technologies
Azure Monitor: Sentinel uses the same ingestion pipeline as Azure Monitor. Data connectors often leverage Log Analytics workspaces and agents.
Azure Policy: Can be used to deploy connectors at scale across multiple subscriptions using built-in policies like 'Deploy Azure Monitor Agent for Windows VMs'.
Microsoft Sentinel health monitoring: A workbook and solution that provides monitoring of connector health, ingestion latency, and data volume.
Microsoft 365 Defender: Connectors for Microsoft 365 Defender services (e.g., Microsoft Defender for Endpoint) automatically ingest alerts and incidents.
Exam-Specific Details
The exam expects you to know which connector to use for a given source. For example, for Palo Alto Networks firewall, use the CEF connector. For Azure Activity logs, use the Azure Activity connector (service-to-service).
Be familiar with the concept of data connectors being 'connected' vs 'disconnected' in the Sentinel UI. A connector showing 'Connected' means the data source is configured to send data, but it does not guarantee data is flowing; you must check the data ingestion.
The Syslog connector is often confused with the CEF connector. Know that Syslog stores data in Syslog table, while CEF stores in CommonSecurityLog. CEF requires parsing of the message format.
Multiple workspaces: Connectors are workspace-specific. You cannot share a connector across workspaces without additional configuration (e.g., using Azure Lighthouse or cross-workspace queries).
Common Pitfalls
Agent not installed or not running: The most common issue. Always verify agent heartbeat.
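Firewall blocking ports: Sources send Syslog and CEF to the forwarder on port 514 (UDP by default); TCP 25226 is used only locally between the syslog daemon and the agent. The forwarder also needs outbound TCP 443 to reach the workspace. Ensure network connectivity on each hop.
Misconfigured CEF forwarding: The deployment script (`cef_installer.py`) must have completed successfully so that the syslog daemon relays CEF messages to the agent. Check /var/opt/microsoft/omsagent/<workspace-id>/log/ for errors.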
AWS SQS not configured: The S3 connector relies on SQS notifications; without them, polling is delayed or fails.
Summary of Connector Categories
| Connector Type | Examples | Data Table | Agent Required |
|----------------|----------|------------|----------------|
| Service-to-service | Azure AD, Office 365, Azure Activity | Varies | No |
| API-based | AWS S3, GCP, Okta | Varies | No (but uses API) |
| Agent-based (Windows) | Windows Security Events | SecurityEvent | Yes (MMA/AMA) |
| Agent-based (Linux) | Syslog, CEF | Syslog, CommonSecurityLog | Yes (MMA/AMA) |
Identify the data source type
Determine whether the source is a Microsoft service (e.g., Azure AD, Office 365), a third-party cloud (AWS, GCP), an on-premises appliance (firewall, syslog), or a custom application. This dictates the connector type: service-to-service, API-based, or agent-based. For example, Azure AD uses the Azure AD connector (service-to-service), while a Palo Alto firewall uses the CEF connector (agent-based). The exam often presents a scenario and asks which connector to use.
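The decision described in this step can be written down as a small lookup, which is a handy way to rehearse scenario questions. This is only a study aid: the `source_kind` labels and the function name are hypothetical and do not correspond to anything in the Sentinel portal or API.

```python
def choose_connector_category(source_kind: str, supports_cef: bool = False) -> str:
    """Map a data source description to the connector category from this chapter."""
    if source_kind == "microsoft_service":
        # e.g. Azure AD, Office 365, Azure Activity
        return "service-to-service"
    if source_kind == "third_party_cloud":
        # e.g. AWS, GCP
        return "API-based"
    if source_kind == "on_prem_appliance":
        # Prefer CEF whenever the appliance can emit it.
        return "agent-based (CEF)" if supports_cef else "agent-based (Syslog)"
    if source_kind == "custom_application":
        return "agent-based (Custom Logs)"
    raise ValueError(f"unknown source kind: {source_kind}")


print(choose_connector_category("microsoft_service"))
print(choose_connector_category("on_prem_appliance", supports_cef=True))
print(choose_connector_category("on_prem_appliance"))
```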
Enable the connector in Sentinel
In the Azure portal, navigate to Microsoft Sentinel > Data connectors. Search for the connector, select it, and click 'Open connector page'. Then click 'Connect' (for service-to-service) or follow the configuration steps. For service-to-service connectors, this may involve granting permissions (e.g., for Azure AD, you need Global Admin or Security Admin role). The connector status changes to 'Connected' when the connection is established.
Configure the data source
Depending on the connector, you may need to install an agent, configure log forwarding, or set up an API integration. For Syslog, install the Log Analytics agent on a Linux server and configure rsyslog or syslog-ng to forward to the agent. For CEF, run the `cef_installer.py` deployment script on the forwarder. For AWS S3, create an IAM role and SQS queue. Follow the connector's instructions exactly; missing a step can prevent data from appearing.
Verify data ingestion
After configuration, check if data is flowing. Use the Sentinel 'Data connectors' page to see the 'Data received' graph. Run a KQL query on the relevant table (e.g., `Syslog | take 10`). For service-to-service connectors, data may take up to 15 minutes to appear. If no data, check agent heartbeat, firewall rules, and connector logs. The exam may ask you to troubleshoot by checking the agent or the `Heartbeat` table.
Monitor connector health
Use the Sentinel Health workbook or create custom alerts for missing data. The `Heartbeat` table shows agent connectivity every 1 minute. For Syslog/CEF, check the `Syslog` and `CommonSecurityLog` tables for recent entries. Set up a Sentinel scheduled query rule to alert if a connector stops sending data for more than 30 minutes. Regular monitoring ensures you catch failures early.
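The "alert if a connector stops sending data for more than 30 minutes" rule boils down to comparing each table's most recent record time against a threshold. The sketch below shows that logic in isolation. It assumes you have already obtained a last-seen timestamp per table (for example from a query that takes the maximum `TimeGenerated`); the function name and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def silent_sources(last_seen: dict[str, datetime], now: datetime,
                   threshold: timedelta = timedelta(minutes=30)) -> list[str]:
    """Return the names of sources whose newest record is older than threshold."""
    return sorted(
        name for name, timestamp in last_seen.items()
        if now - timestamp > threshold
    )


# Hypothetical last-seen times for three tables.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_seen = {
    "Syslog": now - timedelta(minutes=10),
    "CommonSecurityLog": now - timedelta(minutes=45),
    "Heartbeat": now - timedelta(minutes=1),
}
print(silent_sources(last_seen, now))
```

Here only `CommonSecurityLog` is flagged. A healthy `Heartbeat` alongside a silent data table suggests the agent is running but the upstream source or forwarding path has stopped.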
Enterprise Scenario 1: Ingesting Palo Alto Networks Firewall Logs
A multinational company uses Palo Alto Networks firewalls across its global offices. They need to ingest traffic logs, threat logs, and system logs into Sentinel for centralized monitoring. The solution uses the CEF connector. A Linux server in each region acts as a Syslog forwarder. The firewalls are configured to send CEF-formatted logs to the forwarder's UDP 514 port. The forwarder runs the Log Analytics agent and the CEF parser. The data lands in CommonSecurityLog. The team monitors the forwarder's disk space and CPU usage, as high log volume (e.g., 10,000 EPS) can overwhelm the server. They also use the Sentinel Health workbook to track ingestion latency. A common issue is that the CEF parser stops due to memory corruption; the team has a script to restart it automatically.
Enterprise Scenario 2: AWS CloudTrail Integration
A company uses AWS for its cloud infrastructure and wants to ingest CloudTrail logs into Sentinel. They use the AWS S3 connector. They create an S3 bucket for CloudTrail logs, enable SQS for notifications, and configure an IAM role that Sentinel assumes to read the bucket. The connector polls the SQS queue every 5 minutes. Data appears in the AWSCloudTrail table. They noticed that when the SQS notification was not set up correctly, new logs were delayed or did not arrive at all, because the connector relies on those notifications to discover new files. They also set up a Sentinel analytics rule to detect suspicious API calls. A challenge is managing cross-account IAM permissions and ensuring the S3 bucket policy allows Sentinel to read objects.
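To see why the SQS queue matters, it helps to look at what an S3 "object created" notification actually carries: the bucket name and the object key of each new log file. The sketch below extracts those from a notification body using only the standard library. It illustrates the message shape and is not a description of how the Sentinel connector is implemented; the bucket name, account ID, and key are made up.

```python
import json
from urllib.parse import unquote_plus


def objects_from_notification(body: str) -> list[tuple[str, str]]:
    """Return (bucket, key) pairs for created objects in an S3 event message."""
    payload = json.loads(body)
    found = []
    # Test messages sent when notifications are first enabled have no Records.
    for record in payload.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3_info = record["s3"]
        # Object keys arrive URL-encoded in event notifications.
        key = unquote_plus(s3_info["object"]["key"])
        found.append((s3_info["bucket"]["name"], key))
    return found


# Hypothetical notification body, for illustration only.
sample_body = json.dumps({
    "Records": [{
        "eventSource": "aws:s3",
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-cloudtrail-bucket"},
            "object": {"key": "AWSLogs/123456789012/CloudTrail/trail%3Dmain.json.gz"},
        },
    }]
})
print(objects_from_notification(sample_body))
```

Without these messages a reader has no cheap way to learn which new files exist, which is why a missing or misconfigured queue shows up as delayed or absent data.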
Scenario 3: On-Premises Windows Server Security Events
A company uses Windows Event Log forwarding to collect security events from 500 servers to a central Windows Event Collector (WEC). The WEC forwards events to Sentinel using the Windows Security Events via AMA connector. They use Azure Monitor Agent (AMA) with a Data Collection Rule (DCR) that filters for Event IDs 4624, 4625, and 4688. The DCR also transforms the data to include custom fields. The team monitors the SecurityEvent table for ingestion volume. A common misconfiguration is forgetting to install the AMA on the WEC server, which causes no data to appear. They also use Azure Policy to deploy the AMA and DCR at scale across multiple subscriptions.
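Event ID filtering in a DCR for Windows events is expressed as an XPath query string of the form `Channel!XPath`. As a rough sketch, the helper below builds such a string for the three event IDs in this scenario. The function name is hypothetical, and the `Security` channel is an assumption; the channel to use depends on which event log the agent actually reads.

```python
def build_xpath(event_ids: list[int], channel: str = "Security") -> str:
    """Build a 'Channel!XPath' filter selecting the given Windows event IDs."""
    if not event_ids:
        raise ValueError("at least one event ID is required")
    clause = " or ".join(f"EventID={int(event_id)}" for event_id in event_ids)
    return f"{channel}!*[System[({clause})]]"


# 4624 = successful logon, 4625 = failed logon, 4688 = process creation.
query = build_xpath([4624, 4625, 4688])
print(query)
# prints: Security!*[System[(EventID=4624 or EventID=4625 or EventID=4688)]]
```

Filtering at the agent in this way keeps unwanted events from being sent at all, which reduces both ingestion cost and noise in the workspace.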
What SC-200 Tests on Data Connectors (Objective 2.1)
The exam focuses on:
Identifying the correct connector for a given data source.
Understanding the difference between Syslog, CEF, and custom log connectors.
Knowing the prerequisites for each connector (e.g., IAM roles for AWS, Global Admin for Azure AD).
Troubleshooting common issues like missing data or agent connectivity.
Configuring connectors using the Azure portal and PowerShell/CLI.
Common Wrong Answers and Why Candidates Choose Them
Choosing Syslog for a CEF source: Candidates see 'Syslog' and assume any syslog-formatted log can use the Syslog connector. However, CEF logs require the CEF connector to parse the Common Event Format into structured fields. The Syslog connector would dump raw syslog messages into the Syslog table, losing the structured fields that security analytics rules depend on.
Selecting 'Connect' but not configuring the source: Many connectors require additional steps after clicking 'Connect'. For example, the Azure AD connector requires you to sign in and grant permissions. Candidates think the connector is ready after clicking 'Connect', but no data flows until the source is configured.
Using the wrong agent: For Windows Security Events, candidates might choose the Log Analytics agent (MMA) when the newer Azure Monitor Agent (AMA) is recommended. The exam may ask which agent to use; AMA is preferred for new deployments.
Assuming all connectors are service-to-service: Some candidates think all connectors are point-and-click. They don't realize that Syslog and CEF require a forwarder server.
Specific Values and Terms That Appear on the Exam
CEF port: TCP 25226 (agent) or UDP 514 (appliance).
Syslog default port: UDP 514.
Log Analytics agent heartbeat interval: 1 minute.
AWS S3 connector polling interval: 5 minutes (default).
Tables: Syslog, CommonSecurityLog, AWSCloudTrail, SecurityEvent, OfficeActivity, SigninLogs.
Roles: 'Security Administrator' or 'Global Administrator' for Azure AD connector; 'Contributor' on the workspace is often sufficient for other connectors.
Edge Cases and Exceptions
Multiple workspaces: A connector is tied to one workspace. To send data to multiple workspaces, you need multiple agents or a forwarding mechanism.
Custom logs: The 'Custom Logs' connector allows ingestion of any text-based log file using the Log Analytics agent. This is for unsupported formats.
Data transformation: Azure Monitor Agent supports data transformations via DCRs, which can modify the data before ingestion. The exam may test that you can use DCRs to filter or enrich logs.
How to Eliminate Wrong Answers
If the source is a Microsoft cloud service (Azure, Office 365, Microsoft 365 Defender), the correct answer is almost always a service-to-service connector.
If the source is a third-party appliance that supports CEF, choose CEF, not Syslog.
If the source is a Linux server sending syslog, choose the Syslog connector.
Always check prerequisites: if the question mentions 'IAM role', the answer is likely AWS S3 connector. If it mentions 'Global Admin', it's Azure AD.
For troubleshooting, always check the agent first: look for Heartbeat table or check agent installation.
Service-to-service connectors require no agent and are used for Microsoft services like Azure AD and Office 365.
CEF connector parses Common Event Format into `CommonSecurityLog` table; Syslog connector stores raw messages in `Syslog`.
The Azure Monitor Agent (AMA) is the recommended agent for new deployments; MMA is deprecated.
For AWS S3 connector, you must configure an IAM role and SQS queue for log discovery.
Connector status 'Connected' does not guarantee data flow; always verify with a KQL query.
Syslog default facility and severity collection includes all; you can filter to reduce noise.
On a CEF forwarder, the agent listens on local TCP 25226 by default; appliances reach the forwarder on port 514, so ensure the firewall allows that traffic.
The `Heartbeat` table shows agent connectivity every 1 minute; use it to troubleshoot missing data.
Data connectors are workspace-specific; you cannot share a connector across workspaces without additional setup.
Custom logs connector can ingest any text-based log file using the Log Analytics agent.
These come up on the exam all the time. Here's how to tell them apart.
Syslog Connector
Data stored in `Syslog` table with raw message.
No parsing of structured fields; fields are not extracted.
Used for generic syslog sources (e.g., Linux servers).
Listens on UDP 514 (default).
No additional parser script required.
CEF Connector
Data stored in `CommonSecurityLog` table with parsed fields.
Parses CEF header into fields like DeviceVendor, DeviceProduct, SignatureID.
Used for security appliances that output CEF (e.g., Palo Alto, Fortinet).
Listens on TCP 25226 (agent) or UDP 514 (appliance).
Requires the CEF deployment script (`cef_installer.py`) to be run on the forwarder.
Service-to-Service Connector
No agent required; direct API integration.
Configuration is simple: click 'Connect' and grant permissions.
Data flows automatically from Microsoft services.
Examples: Azure AD, Office 365, Azure Activity.
Limited to Microsoft sources.
Ingestion-Based Connector
Requires agent or forwarder infrastructure.
Configuration involves installing agents, setting up log forwarding.
Data must be sent to the agent, which forwards to Sentinel.
Examples: Syslog, CEF, Windows Security Events.
Supports non-Microsoft and on-premises sources.
Mistake
The Syslog connector can parse CEF messages into structured fields.
Correct
The Syslog connector stores raw syslog messages in the `Syslog` table without parsing the CEF header. CEF messages require the CEF connector, which parses them into `CommonSecurityLog` with fields like DeviceVendor, DeviceProduct, and SignatureID.
Mistake
Once a connector shows 'Connected' in the UI, data is automatically flowing.
Correct
The 'Connected' status only indicates that the initial configuration is complete. Data may not flow if the source is not sending logs, the agent is not running, or firewall rules block the traffic. Always verify by querying the relevant table.
Mistake
The Azure Monitor Agent (AMA) and Log Analytics agent (MMA) can be used interchangeably for any connector.
Correct
AMA is the newer agent and supports data collection rules (DCRs). Some connectors, like Windows Security Events via AMA, require AMA. Others, like Syslog, support both but AMA is recommended. MMA is deprecated for new deployments.
Mistake
All data connectors require an agent to be installed.
Correct
Service-to-service connectors (e.g., Azure AD, Office 365) do not require an agent. They pull data directly from the source via APIs. Agent-based connectors are only needed for on-premises or non-Microsoft cloud sources.
Mistake
The AWS S3 connector can ingest data from any S3 bucket without special permissions.
Correct
The connector requires an IAM role with read access to the S3 bucket and SQS queue. The bucket must also have an SQS notification configured for new objects. Without these, the connector cannot discover new logs.
The Syslog connector ingests raw syslog messages and stores them in the `Syslog` table without parsing the message body into structured fields. The CEF connector parses Common Event Format (CEF) messages into the `CommonSecurityLog` table, extracting fields like DeviceVendor, DeviceProduct, and SignatureID. Use Syslog for generic syslog sources (e.g., Linux servers) and CEF for security appliances that output CEF (e.g., Palo Alto Networks, Fortinet). The CEF connector requires a forwarder set up with the `cef_installer.py` deployment script.
First, verify that the Log Analytics agent (or AMA) is installed on the syslog forwarder and running. Check the `Heartbeat` table for recent heartbeats from that computer. Ensure the syslog source is sending logs to the forwarder's IP and port (default UDP 514). Check firewall rules to allow UDP 514. On the forwarder, check the agent logs at `/var/opt/microsoft/omsagent/<workspace-id>/log/` for errors. In Sentinel, run `Syslog | where TimeGenerated > ago(1h) | take 10` to see if any data arrived. Also check the connector status in the UI; if it shows 'Connected', the configuration is correct but data may not be flowing.
No, each data connector is tied to a single Log Analytics workspace. To send the same data to multiple workspaces, you need to configure separate connectors or use a forwarding mechanism. For example, you can configure a syslog forwarder to send logs to two different workspaces by installing two agents or using a log shipper like Logstash. Alternatively, you can use cross-workspace queries to view data from multiple workspaces in a single Sentinel instance.
To enable the Azure AD connector in Sentinel, you need at least the 'Security Administrator' role in Azure AD. This role allows you to grant the necessary permissions for Sentinel to read sign-in logs and audit logs. Additionally, you need 'Contributor' permissions on the Log Analytics workspace or the Sentinel resource. If you are not a Global Admin, you may need a Global Admin to consent to the required API permissions.
The AWS S3 connector ingests AWS CloudTrail logs from an S3 bucket. You create an S3 bucket for CloudTrail, enable SQS for object creation notifications, and create an IAM role that Sentinel can assume to read the bucket and queue. Sentinel polls the SQS queue every 5 minutes for new log file notifications, then reads the log files from S3. The data is stored in the `AWSCloudTrail` table. The connector requires proper IAM permissions and SQS configuration to function.
The Log Analytics agent (MMA) is the legacy agent that has been used for years. The Azure Monitor Agent (AMA) is the newer, more secure, and more flexible agent. AMA supports data collection rules (DCRs) for centralized configuration, multi-homing to multiple workspaces, and data transformation. Microsoft recommends using AMA for new deployments. For Sentinel connectors, AMA is required for some connectors like Windows Security Events via AMA. MMA is still supported but is in the process of being deprecated.
When configuring the Syslog connector, you can specify which facility and severity levels to collect. By default, all facilities and severities are collected. To filter, in the connector configuration page, you can add custom facility and severity combinations. For example, you can choose to collect only `auth` facility with severity `Warning` and above. This reduces data volume and noise. The filtering is applied on the agent side; messages that do not match are discarded before being sent to Sentinel.