This chapter covers SIEM log analysis, a core skill for the CompTIA CySA+ CS0-003 exam under Domain 1.0 (Security Operations), Objective 1.2: Given a scenario, analyze data to identify security incidents. Approximately 15-20% of exam questions touch SIEM-related topics, including log sources, normalization, correlation, and querying. You will learn how SIEM systems work internally, how to interpret common log fields, and how to use SIEM tools to detect and investigate incidents. This chapter provides the depth needed to answer scenario-based questions that require you to identify the correct log source, field, or correlation rule.
A SIEM is like a bank vault's integrated security system. The vault has multiple sensors: door contacts (logs from firewalls), motion detectors (IDS/IPS alerts), glass break sensors (antivirus detections), and temperature monitors (system performance logs). Each sensor sends its raw data to a central security panel (the SIEM collector). The panel doesn't just store the data; it correlates events. For example, if the motion detector fires (intrusion alert) and simultaneously the door contact shows the vault door was opened with a valid code (legitimate access), the system might suppress the alarm. But if motion is detected and the door contact shows forced entry (invalid code repeated 3 times), the panel triggers a high-priority alarm and notifies the police (incident response). The security guard (analyst) reviews the panel's correlated alerts, not raw sensor data. The panel also enforces compliance: it logs every entry and exit, and if the vault door is left open for more than 5 minutes (compliance rule), it generates a report. Without the panel, the guard would be overwhelmed by hundreds of individual sensor signals. The panel's job is to filter noise, correlate events, and present meaningful security incidents. Just like the bank vault system, a SIEM ingests logs from diverse sources, normalizes them, applies correlation rules, and produces actionable alerts. The exam tests your understanding of this correlation process and the components involved: log collection, normalization, correlation, alerting, and reporting.
What is SIEM Log Analysis?
SIEM (Security Information and Event Management) log analysis is the process of collecting, normalizing, correlating, and analyzing log data from various sources to detect security incidents, policy violations, and anomalies. A SIEM system aggregates logs from firewalls, IDS/IPS, servers, endpoints, applications, and cloud services. The core value of SIEM is not just storage but real-time correlation and alerting. The CS0-003 exam expects you to understand the entire pipeline: log generation → collection → parsing → normalization → correlation → alerting → investigation.
How SIEM Works Internally
#### 1. Log Collection
SIEMs use agents or agentless methods to collect logs. Agents are installed on endpoints to forward logs (e.g., Windows Event Log, syslog). Agentless methods include pulling logs via APIs (e.g., AWS CloudTrail) or receiving logs via syslog (UDP 514 or TCP 514). Common collection protocols:
Syslog: Standard for network devices (RFC 3164, RFC 5424). Typically UDP/514, but TCP/514 or TLS/6514 for reliability.
Windows Event Forwarding (WEF): Uses HTTP/HTTPS (WinRM).
SNMP traps: UDP/162 for legacy devices.
APIs: RESTful APIs for cloud services (e.g., Azure Monitor, AWS CloudWatch).
#### 2. Parsing and Normalization
Raw logs arrive in various formats (e.g., Cisco ASA syslog, Apache access log, Windows Event ID 4625). The SIEM parser extracts fields like timestamp, source IP, destination IP, user, event type, and action. Normalization maps these fields to a common schema (e.g., CIM - Common Information Model from Splunk, or LEEF - Log Event Extended Format from IBM QRadar). For example, a firewall log entry "Deny TCP 10.0.0.1:12345 -> 192.168.1.1:80" is parsed into fields: src_ip=10.0.0.1, src_port=12345, dest_ip=192.168.1.1, dest_port=80, action=deny, protocol=TCP.
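The parsing step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM implements its parsers: the regex and the function name `normalize_firewall_line` are assumptions written for the simplified firewall line used in this chapter, and real vendor formats need their own patterns.

```python
import re

# Hypothetical pattern for the simplified "Deny TCP a:b -> c:d" line in this chapter.
FIREWALL_PATTERN = re.compile(
    r"(?P<action>\w+)\s+(?P<protocol>\w+)\s+"
    r"(?P<src_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<src_port>\d+)\s+->\s+"
    r"(?P<dest_ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<dest_port>\d+)"
)


def normalize_firewall_line(line):
    """Parse one raw line into a dict keyed by common-schema field names."""
    match = FIREWALL_PATTERN.search(line)
    if match is None:
        return None  # a SIEM would typically surface this event as unparsed
    fields = match.groupdict()
    fields["action"] = fields["action"].lower()      # normalize case: Deny -> deny
    fields["src_port"] = int(fields["src_port"])     # ports become integers
    fields["dest_port"] = int(fields["dest_port"])
    return fields


event = normalize_firewall_line("Deny TCP 10.0.0.1:12345 -> 192.168.1.1:80")
print(event)
```

Returning `None` for a line that does not match mirrors the "unparsed" state a SIEM shows when no parser fits the log.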
#### 3. Correlation
Correlation is the heart of SIEM. Rules define conditions that indicate an incident. For example:
- Rule: "More than 5 failed logins from the same source IP within 10 minutes" → Brute force alert.
- Rule: "A successful login immediately after multiple failed logins from the same IP" → Account compromise.
Correlation can involve multiple log sources. For example, correlating a firewall deny log with an IDS alert for the same source IP indicates a scanning attack. Correlation engines use:
Aggregation: Count events over a time window.
Sequencing: Detect events in a specific order.
Thresholding: Trigger when count exceeds a value.
Temporal correlation: Time-based patterns.
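To make aggregation, thresholding, and the time window concrete, here is a small Python sketch of the "more than 5 failed logins from the same source IP within 10 minutes" rule. It is an illustration under stated assumptions: the function name `detect_brute_force` and the `(timestamp, src_ip)` input shape are hypothetical, and events are assumed to arrive sorted by time.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


def detect_brute_force(events, threshold=5, window=timedelta(minutes=10)):
    """Return source IPs with more than `threshold` failed logins inside `window`.

    `events` is an iterable of (timestamp, src_ip) tuples sorted by timestamp.
    """
    recent = defaultdict(deque)  # src_ip -> failure timestamps still in the window
    flagged = set()
    for timestamp, src_ip in events:
        timestamps = recent[src_ip]
        timestamps.append(timestamp)
        # Drop failures that have aged out of the sliding window.
        while timestamp - timestamps[0] > window:
            timestamps.popleft()
        if len(timestamps) > threshold:
            flagged.add(src_ip)
    return flagged


start = datetime(2023, 10, 1, 12, 0, 0)
# Six failures in under three minutes from one IP, plus one stray failure.
failed_logins = [(start + timedelta(seconds=30 * i), "10.0.0.2") for i in range(6)]
failed_logins.append((start + timedelta(minutes=1), "10.0.0.9"))
failed_logins.sort()
alerts = detect_brute_force(failed_logins)
print(alerts)
```

Raising `threshold` makes the rule quieter, which is the same tuning lever discussed later for reducing false positives.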
#### 4. Alerting and Incident Response
When a correlation rule fires, the SIEM generates an alert with severity (low, medium, high, critical). Alerts are sent to analysts via email, dashboard, or ticketing system. The analyst then investigates by querying the SIEM for related logs (drill-down). The exam tests your ability to interpret alerts and identify false positives.
Key Components, Values, Defaults, and Timers
Log Retention: Typically 90-365 days for compliance (e.g., PCI DSS requires 1 year).
Syslog Default Port: UDP 514 (RFC 3164). For secure syslog, use TCP 6514 (TLS).
Windows Event Log: Event IDs relevant to security: 4624 (successful logon), 4625 (failed logon), 4634 (logoff), 4688 (process creation), 5156 (Windows Filtering Platform connection).
Common Log Fields: timestamp, source_ip, destination_ip, source_port, destination_port, protocol, user, action (allow/deny), event_id, message.
Correlation Window: Typically 5-60 minutes. Exam may ask: "If a rule looks for 10 failed logins in 5 minutes, what is the time window?"
Threat Intelligence Feeds: SIEMs can ingest IoCs (IPs, domains, hashes) from feeds like AlienVault OTX, VirusTotal. Matching logs against IoCs triggers alerts.
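Matching logs against IoCs is, at its simplest, a set lookup. The sketch below assumes events are already normalized into dictionaries with a `dest_ip` field; the function name `match_iocs`, the `threat_match` field, and the addresses (taken from documentation-reserved ranges) are all hypothetical.

```python
def match_iocs(events, malicious_ips):
    """Return copies of events whose destination IP is in the IoC set."""
    matches = []
    for event in events:
        if event["dest_ip"] in malicious_ips:
            # Enrich the event so downstream rules can key on the match.
            matches.append({**event, "threat_match": True})
    return matches


# Hypothetical IoC set, as it might look after ingesting a threat feed.
ioc_ips = {"203.0.113.50", "198.51.100.23"}
firewall_events = [
    {"src_ip": "10.0.0.15", "dest_ip": "203.0.113.50", "dest_port": 443, "action": "allow"},
    {"src_ip": "10.0.0.16", "dest_ip": "192.0.2.10", "dest_port": 443, "action": "allow"},
]
alerts = match_iocs(firewall_events, ioc_ips)
print(alerts)
```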
Configuration and Verification Commands
While the exam does not test specific vendor syntax, understanding generic commands helps. For example, in Splunk:
Search: index=windows EventCode=4625 | stats count by src_ip
Correlation rule: ... | where count > 5 | eval severity="medium"
In QRadar:
Log search: SELECT * FROM events WHERE username = 'admin'
Rule: "When more than 5 failed logins in 10 minutes, create offense."
In ELK Stack:
Query: {"query": {"match": {"event_id": "4625"}}}
Watcher alert: threshold of 10 in 5 minutes.
Interaction with Related Technologies
SIEM interacts with:
- SOAR (Security Orchestration, Automation, and Response): SIEM sends alerts to SOAR for automated response (e.g., block IP on firewall).
- Threat Intelligence Platforms (TIP): SIEM ingests IoCs from TIP to enrich events.
- Endpoint Detection and Response (EDR): SIEM receives endpoint alerts (e.g., malware detected) and correlates with network logs.
- Network Traffic Analysis (NTA): Tools like Zeek or Cisco Stealthwatch feed metadata to SIEM.
The exam may ask: "Which technology would you integrate to automatically block an IP after a SIEM alert?" Answer: SOAR.
SIEM Log Analysis Workflow
Identify Log Sources: Firewalls, IDS/IPS, servers, endpoints, cloud, applications. Each source has specific log format.
Collect Logs: Ensure all sources send logs to SIEM (syslog, agent, API).
Normalize Logs: Check that fields are correctly parsed. Example: timestamp should be in UTC.
Create Correlation Rules: Define conditions for incidents. Use tuning to reduce false positives.
Monitor Alerts: Review dashboards and alerts. Prioritize based on severity.
Investigate: Drill down into raw logs. Use search queries to find related events.
Respond: Escalate to incident response team. Document findings.
Common Log Formats and Fields
- Syslog: <PRI>Timestamp Hostname App[PID]: Message
Example: <134>Oct 1 12:34:56 webserver sshd[12345]: Failed password for root from 10.0.0.2 port 22 ssh2
- Windows Event Log: XML-based. Key fields: EventID, TimeCreated, Source, EventData.
- Apache Access Log: IP - - [Timestamp] "Method URI Protocol" Status Bytes
Example: 10.0.0.1 - - [01/Oct/2023:12:34:56 +0000] "GET /admin HTTP/1.1" 403 1234
- Firewall Logs: Vary by vendor. Cisco ASA: %ASA-4-106023: Deny tcp src inside:10.0.0.1/12345 dst outside:192.168.1.1/80
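The `<PRI>` value at the start of a syslog message encodes both facility and severity: facility is the value divided by 8 (integer division) and severity is the remainder. The following sketch decodes the sshd example above. The regex is an assumption fitted to the RFC 3164-style layout shown in this section, and `parse_syslog` is a hypothetical helper name.

```python
import re

# Severity names indexed by numeric level 0-7.
SEVERITIES = [
    "emergency", "alert", "critical", "error",
    "warning", "notice", "informational", "debug",
]

# Hypothetical pattern for "<PRI>Timestamp Hostname App[PID]: Message".
SYSLOG_PATTERN = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) (?P<app>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: (?P<message>.*)"
)


def parse_syslog(line):
    """Split a syslog line into fields and decode PRI into facility and severity."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    fields["facility"] = pri // 8            # 134 // 8 = 16 (local0)
    fields["severity"] = SEVERITIES[pri % 8]  # 134 % 8 = 6 (informational)
    return fields


parsed = parse_syslog(
    "<134>Oct 1 12:34:56 webserver sshd[12345]: "
    "Failed password for root from 10.0.0.2 port 22 ssh2"
)
print(parsed["facility"], parsed["severity"], parsed["app"])
```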
SIEM Deployment Considerations
On-premises vs Cloud: On-prem gives full control but requires hardware. Cloud SIEM (e.g., Microsoft Sentinel, formerly Azure Sentinel) scales easily.
Log Volume: Average enterprise generates 1-10 TB per day. SIEM must handle ingestion rate.
Data Retention: Balance cost vs compliance. Hot storage (fast) for 30 days, cold storage (cheap) for longer.
High Availability: Deploy multiple collectors and indexers. Use load balancers.
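Log volume and ingestion rate are linked by simple arithmetic: events per second times average event size times seconds per day. The sketch below is a rough sizing aid only. The 500-byte average event size is an assumption chosen for illustration, and real sizes vary widely by log source.

```python
SECONDS_PER_DAY = 86_400


def daily_volume_tb(events_per_second, avg_event_bytes):
    """Estimate raw daily log volume in decimal terabytes (1 TB = 10**12 bytes)."""
    return events_per_second * avg_event_bytes * SECONDS_PER_DAY / 10**12


# Assumed figures: a sustained 50,000 EPS at 500 bytes per event.
estimate = daily_volume_tb(events_per_second=50_000, avg_event_bytes=500)
print(f"{estimate:.2f} TB/day")
```

Under those assumptions the estimate comes to 2.16 TB per day before compression or indexing overhead, which is the kind of number that drives hot versus cold storage decisions.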
Key Metrics and KPIs
Mean Time to Detect (MTTD): Goal < 1 hour.
Mean Time to Respond (MTTR): Goal < 24 hours.
False Positive Rate: Should be < 10%.
Events per Second (EPS): Typical SIEM handles 10,000-100,000 EPS.
The exam may ask: "What metric indicates how quickly the security team identifies incidents?" Answer: MTTD.
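MTTD and MTTR are averages over incident timestamps. The sketch below assumes each incident record carries `occurred`, `detected`, and `responded` times (hypothetical field names) and measures MTTR from detection to response. Definitions of MTTR vary between organizations, so treat that choice as an assumption.

```python
from datetime import datetime, timedelta


def mean_elapsed(pairs):
    """Average the elapsed time across (start, end) timestamp pairs."""
    pairs = list(pairs)
    total = sum((end - start for start, end in pairs), timedelta())
    return total / len(pairs)


# Hypothetical incident records for illustration.
incidents = [
    {"occurred": datetime(2023, 10, 1, 9, 0),
     "detected": datetime(2023, 10, 1, 9, 30),
     "responded": datetime(2023, 10, 1, 12, 30)},
    {"occurred": datetime(2023, 10, 2, 14, 0),
     "detected": datetime(2023, 10, 2, 15, 30),
     "responded": datetime(2023, 10, 2, 20, 30)},
]

mttd = mean_elapsed((i["occurred"], i["detected"]) for i in incidents)
mttr = mean_elapsed((i["detected"], i["responded"]) for i in incidents)
print("MTTD:", mttd, "MTTR:", mttr)
```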
Security Incident Detection Examples
1. Brute Force Attack: Multiple failed logins (Event ID 4625) from same IP to multiple users.
- Search: EventCode=4625 | stats count, dc(Account_Name) AS accounts by src_ip | where count > 10
2. Malware Communication: Outbound connection to known malicious IP (from threat feed) on port 443.
- Correlation: Firewall allow log + threat intelligence match.
3. Privilege Escalation: User added to local admin group (Event ID 4732) after failed logins.
- Search: EventCode=4732 | search Account_Name=*admin*
4. Data Exfiltration: Large outbound transfer (e.g., > 100 MB) to external IP.
- Search: ... | stats sum(bytes_out) AS total_bytes by src_ip, dest_ip | where total_bytes > 100000000
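For the data exfiltration example, one way to express the check is to sum outbound bytes per source and destination pair and only then compare against the threshold, which also catches a transfer split across several smaller connections. This is a sketch under assumptions: the `bytes_out` field and the function name `find_large_transfers` are hypothetical.

```python
from collections import defaultdict


def find_large_transfers(events, limit_bytes=100_000_000):
    """Sum bytes_out per (src_ip, dest_ip) pair and return pairs over the limit."""
    totals = defaultdict(int)
    for event in events:
        totals[(event["src_ip"], event["dest_ip"])] += event["bytes_out"]
    return {pair: total for pair, total in totals.items() if total > limit_bytes}


# Two 60 MB connections to the same external host exceed 100 MB only in aggregate.
flows = [
    {"src_ip": "10.0.0.5", "dest_ip": "203.0.113.7", "bytes_out": 60_000_000},
    {"src_ip": "10.0.0.5", "dest_ip": "203.0.113.7", "bytes_out": 60_000_000},
    {"src_ip": "10.0.0.8", "dest_ip": "198.51.100.4", "bytes_out": 5_000_000},
]
suspects = find_large_transfers(flows)
print(suspects)
```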
Troubleshooting SIEM Issues
Missing Logs: Check network connectivity (syslog ports), agent status, permissions.
Incorrect Normalization: Verify parser regex. Example: timestamp in wrong timezone.
High False Positives: Tune correlation rules: increase threshold, whitelist known good sources.
Performance Degradation: Increase resources (CPU, RAM, disk I/O). Optimize queries with indexes.
The exam may present a scenario: "Users report that SIEM alerts are not triggering for failed logins. What is the most likely cause?" Answer: Logs are not being collected from the domain controller.
SIEM vs Log Management
Log Management: Stores logs for compliance and troubleshooting. No real-time correlation.
SIEM: Real-time correlation, alerting, and incident detection. Includes log management.
The exam distinguishes: "Which solution provides real-time correlation?" SIEM.
Advanced SIEM Features
User and Entity Behavior Analytics (UEBA): Uses machine learning to detect anomalies (e.g., user logging in at unusual time).
Threat Hunting: Proactively search for threats using hypothesis-driven queries.
Automated Response: Integrate with SOAR to block IPs, disable accounts, or isolate endpoints.
Compliance Reporting: Pre-built reports for PCI DSS, HIPAA, GDPR.
The exam may ask: "Which SIEM feature helps detect insider threats by baselining normal behavior?" Answer: UEBA.
Summary
SIEM log analysis is critical for detecting and responding to security incidents. The CySA+ exam tests your ability to interpret logs, identify relevant fields, and understand correlation logic. Focus on common log formats, Windows Event IDs, syslog, correlation rules, and the overall workflow from collection to alerting. Practical experience with queries (even in a lab) will solidify these concepts.
Identify Log Sources and Types
First, determine which devices and systems generate logs relevant to the investigation. Common sources include firewalls (e.g., Cisco ASA, Palo Alto), IDS/IPS (e.g., Snort, Suricata), Windows domain controllers (Event Log), Linux servers (auth.log, syslog), and cloud services (AWS CloudTrail, Azure Activity Log). Each source has a specific log format. For example, Windows logs use Event IDs (4625 for failed logon), while syslog messages have facility and severity codes. In the exam, you might be given a log snippet and asked to identify the source (e.g., "Which device generated this log?"). Knowing the format and keywords (e.g., "%ASA-" for Cisco ASA) is key.
Collect and Aggregate Logs into SIEM
Ensure all identified sources forward logs to the SIEM collector. This involves configuring syslog servers (UDP 514), installing agents (e.g., Splunk Universal Forwarder), or setting up API integrations. The SIEM receives logs in various formats and may buffer them if the collector is overloaded. Check that the SIEM is receiving logs from all sources by verifying the last log received timestamp. In a scenario, if the SIEM shows no recent logs from a firewall, the most likely cause is a network issue or misconfigured syslog settings. The exam may ask: "What is the default syslog port?" Answer: UDP 514.
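Checking the last-received timestamp per source can be automated as a simple health rule. The sketch below assumes the SIEM exposes a mapping of source name to last event time. The function name `find_silent_sources`, the source names, and the 15-minute tolerance are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def find_silent_sources(last_seen, now, max_silence=timedelta(minutes=15)):
    """Return the names of log sources with no events for longer than max_silence."""
    return sorted(
        source for source, timestamp in last_seen.items()
        if now - timestamp > max_silence
    )


now = datetime(2023, 10, 1, 14, 0, tzinfo=timezone.utc)
# Hypothetical last-event times per source.
last_seen = {
    "perimeter-fw": datetime(2023, 10, 1, 12, 0, tzinfo=timezone.utc),
    "dc01": datetime(2023, 10, 1, 13, 55, tzinfo=timezone.utc),
}
silent = find_silent_sources(last_seen, now)
print(silent)
```

A source that appears in the result is a collection problem to fix first, since no correlation rule can fire on logs that never arrive.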
Parse and Normalize Logs
Raw logs are parsed to extract structured fields like timestamp, source IP, destination IP, user, event type, and action. The SIEM uses pre-built parsers (e.g., for Cisco ASA, Windows Event Log) or custom regex. Normalization maps these fields to a common schema (e.g., CIM). For example, a Windows failed logon event (Event ID 4625) is normalized to fields: event_type=FailedLogon, user=username, src_ip=10.0.0.2. If a log is not parsed correctly, it may appear as "unparsed" or "unknown" in the SIEM. The exam may show a raw log and ask: "Which field is missing?" or "Which parser would you use?"
Create and Apply Correlation Rules
Define rules that detect specific attack patterns. For example, a rule might trigger an alert if there are more than 10 failed logins from the same source IP within 5 minutes. The rule specifies the log source (e.g., Windows Security log), the condition (Event ID 4625), the aggregation (count by src_ip), and the threshold (10). The correlation engine evaluates incoming events against all rules. If a match is found, an alert is generated. The exam may ask: "What is the purpose of a correlation rule?" or "Which component evaluates rules?"
Investigate Alerts and Drill Down
When an alert fires, the analyst reviews the alert details and drills down into the raw logs to confirm the incident. For example, a brute force alert shows multiple failed logins from IP 10.0.0.2. The analyst queries the SIEM for all logs from that IP in the last hour, checks if any successful login occurred (Event ID 4624), and examines the user accounts targeted. The analyst also correlates with other sources, like firewall logs to see if the IP was blocked. The exam may present an alert and ask: "What is the next step?" Answer: Investigate by querying related logs.
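The drill-down described above, checking for a successful logon after a burst of failures from the same IP, is an example of sequencing. The sketch below assumes normalized events as time-sorted dictionaries. The field names, the function name `find_success_after_failures`, and the default of 5 prior failures are hypothetical.

```python
from datetime import datetime, timedelta


def find_success_after_failures(events, min_failures=5, window=timedelta(minutes=10)):
    """Return (src_ip, user, failure_count) for successes preceded by many failures.

    `events` is a time-sorted list of dicts with timestamp, event_id, src_ip, user.
    """
    failures = {}  # src_ip -> list of failed-logon timestamps
    hits = []
    for event in events:
        ip = event["src_ip"]
        if event["event_id"] == 4625:    # failed logon
            failures.setdefault(ip, []).append(event["timestamp"])
        elif event["event_id"] == 4624:  # successful logon
            recent = [t for t in failures.get(ip, [])
                      if event["timestamp"] - t <= window]
            if len(recent) >= min_failures:
                hits.append((ip, event["user"], len(recent)))
    return hits


start = datetime(2023, 10, 1, 12, 0, 0)
events = [
    {"timestamp": start + timedelta(seconds=10 * i), "event_id": 4625,
     "src_ip": "10.0.0.2", "user": f"user{i}"}
    for i in range(20)
]
events.append({"timestamp": start + timedelta(minutes=4), "event_id": 4624,
               "src_ip": "10.0.0.2", "user": "admin"})
hits = find_success_after_failures(events)
print(hits)
```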
In a large enterprise with 10,000 endpoints and 500 servers, the SIEM ingests approximately 5 TB of logs per day. The security team uses Splunk Enterprise Security (ES) with correlation rules for common attacks.
One scenario: a rule detects 20 failed logins from a single IP to different user accounts within 10 minutes. The alert fires, and the analyst queries the SIEM to see if any successful login occurred after the failures. They find a successful login (Event ID 4624) for the 'admin' account from the same IP. The analyst then checks the firewall logs and sees the IP is from a known malicious range (based on threat intel). The incident is escalated, and the account is disabled. The SIEM also triggers a SOAR playbook that automatically blocks the IP on the perimeter firewall.
In another scenario, a misconfigured rule causes high false positives: a rule that alerts on any outbound connection to a new IP triggers thousands of alerts for legitimate cloud services. The team tunes the rule by adding an allowlist for known IP ranges and increasing the threshold to 3 connections.
Performance is a constant concern: the SIEM indexers must be scaled to handle a peak of 50,000 EPS. The team uses indexers with SSDs and hot/warm/cold storage tiers.
A common mistake is not normalizing timestamps to UTC, which causes confusion during incident investigation. The team enforces UTC for all logs and uses the SIEM's time zone conversion.
Another issue is log source failure: a firewall stops sending syslog due to a configuration change, and the SIEM dashboard shows no logs from that firewall for 2 hours. The analyst detects the gap via a heartbeat rule and restores the syslog configuration. These real-world experiences highlight the importance of monitoring SIEM health and tuning rules.
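Normalizing timestamps to UTC, as the team in this scenario does, can be illustrated with the Python standard library. The sketch assumes the Apache-style timestamp layout shown earlier in this chapter, with an explicit offset. The helper name `to_utc` is hypothetical, and sources that omit the offset would need their time zone supplied from configuration instead.

```python
from datetime import datetime, timezone


def to_utc(timestamp_text, fmt="%d/%b/%Y:%H:%M:%S %z"):
    """Parse a timestamp that carries a UTC offset and convert it to UTC."""
    return datetime.strptime(timestamp_text, fmt).astimezone(timezone.utc)


# A server logging in a UTC-4 zone and one logging in UTC describe the same instant.
local_event = to_utc("01/Oct/2023:08:34:56 -0400")
utc_event = to_utc("01/Oct/2023:12:34:56 +0000")
print(local_event.isoformat(), local_event == utc_event)
```

Once both events are in UTC they sort and correlate correctly, which is what makes cross-source timelines reliable during an investigation.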
The CS0-003 exam tests SIEM log analysis under Objective 1.2: Given a scenario, analyze data to identify security incidents. You must be able to read a log snippet and identify the event type, source, and relevant fields. Common exam question formats include: (1) Given a log entry, what is the event? (2) Which log source would you check for a specific event? (3) What correlation rule would detect a brute force attack? (4) What is the next step after an alert?
Four common wrong answers: (1) Choosing 'syslog' for 'What protocol is used for Windows Event Log collection?' Windows uses WEF (WinRM), not syslog. (2) Selecting 'increase EPS' to reduce false positives. The fix is to tune rules. (3) Confusing SIEM with SOAR on 'Which tool automatically blocks an IP?' SIEM alerts; SOAR acts. (4) Picking 'Event ID 4624' for a failed logon. The correct ID is 4625.
Specific numbers: default syslog port UDP 514, Windows Event ID 4625 (failed logon), 4624 (successful logon), 4688 (process creation). Terms: normalization, correlation, aggregation, threshold, severity, false positive, true positive.
Edge cases: The exam loves to test that a successful login after multiple failures indicates account compromise, not just brute force. Another edge case: if logs are missing, the SIEM cannot detect incidents, so check collection first.
How to eliminate wrong answers: Use the underlying mechanism. For example, if a question asks 'What is the best way to reduce false positives?', the options might include 'increase storage' (wrong), 'disable rules' (wrong), and 'tune rule thresholds' (correct). Always think about the SIEM pipeline: collection → normalization → correlation → alerting. If the question is about detection, focus on correlation rules. If it is about missing data, focus on collection.
SIEM correlates logs from multiple sources to detect incidents; it is not just log storage.
Default syslog port is UDP 514; secure syslog uses TCP 6514 (TLS).
Windows Event ID 4625 = failed logon; 4624 = successful logon; 4688 = process creation.
Normalization maps raw log fields to a common schema (e.g., CIM).
Correlation rules use aggregation, threshold, and time windows (e.g., 10 failed logins in 5 minutes).
False positives are reduced by tuning rule thresholds and adding allowlists.
SIEM integrates with SOAR for automated response (e.g., block IP).
Missing logs are often due to network issues or misconfigured log forwarding.
UEBA uses machine learning to detect anomalous behavior.
MTTD (Mean Time to Detect) measures how quickly incidents are identified.
These come up on the exam all the time. Here's how to tell them apart.
Syslog (UDP 514)
Uses UDP 514, connectionless, may lose logs.
Common for network devices (routers, firewalls).
Plain text, easily parsed.
No built-in authentication or encryption.
RFC 3164/5424 standard.
Windows Event Forwarding (WinRM)
Uses HTTP/HTTPS (WinRM), reliable delivery.
Native to Windows domain-joined systems.
XML-based, structured.
Supports authentication and encryption.
Requires Group Policy configuration.
Mistake
SIEM and log management are the same thing.
Correct
Log management only stores and retrieves logs. SIEM adds real-time correlation, alerting, and incident detection. SIEM includes log management but is not limited to it.
Mistake
Windows Event Logs are sent via syslog by default.
Correct
Windows does not natively send logs via syslog. It uses Windows Event Forwarding (WEF) over HTTP/HTTPS (WinRM). Third-party agents can convert to syslog.
Mistake
More correlation rules always improve security.
Correct
Too many rules increase false positives and alert fatigue. Rules must be tuned to balance detection and noise. Quality over quantity.
Mistake
SIEM can detect all attacks automatically.
Correct
SIEM detects based on predefined rules and known patterns. Zero-day attacks or attacks without logs may go undetected. Threat hunting and UEBA help.
Mistake
The default syslog port is TCP 514.
Correct
The default is UDP 514. TCP 514 is also used but not the default. RFC 3164 specifies UDP. For reliable delivery, use TCP or TLS (6514).
SIEM (Security Information and Event Management) includes log management but adds real-time correlation, alerting, and incident detection. Log management only stores and retrieves logs for compliance or troubleshooting. SIEM can correlate events from multiple sources to detect attacks, while log management does not analyze logs in real time. For the exam, remember that SIEM = log management + correlation + alerting.
Key Event IDs: 4624 (successful logon), 4625 (failed logon), 4634 (logoff), 4648 (logon with explicit credentials), 4688 (process creation), 4732 (member added to security-enabled local group), 4768 (Kerberos TGT requested), 4776 (credential validation). The exam commonly tests 4624 and 4625 for logon successes and failures, respectively.
Tune correlation rules by increasing thresholds (e.g., from 5 to 10 failed logins), shortening time windows, or adding allowlists for known good IPs, users, and expected behavior (e.g., authorized admin scanning). Avoid disabling rules entirely. The exam may ask: 'Which action reduces false positives without losing detection?' Answer: tune thresholds.
The default syslog port is UDP 514 (per RFC 3164). TCP 514 is also used but not default. For encrypted syslog, use TCP 6514 (TLS). The exam expects you to know UDP 514 as the standard. If a question asks for 'secure syslog port,' answer 6514.
SIEM detects and alerts on security incidents. SOAR (Security Orchestration, Automation, and Response) takes actions based on alerts, such as blocking IPs, disabling accounts, or creating tickets. SIEM feeds alerts to SOAR. The exam may ask: 'Which technology automatically blocks a malicious IP?' — SOAR, not SIEM.
Search for failed logon events (Event ID 4625) aggregated by source IP. Look for high counts (e.g., >10) within a short time window (e.g., 5 minutes). Then check if any successful logon (4624) occurred from the same IP after failures, which indicates compromise. Also, correlate with firewall logs to see if the IP was blocked. The exam may present a scenario and ask: 'What is the first step?' — Query failed logons.
Log normalization is the process of parsing raw logs into a common structured format. For example, a firewall log 'Deny TCP 10.0.0.1:12345 -> 192.168.1.1:80' is normalized to fields: src_ip=10.0.0.1, src_port=12345, dest_ip=192.168.1.1, dest_port=80, action=Deny, protocol=TCP. This allows correlation across different log sources. The exam tests understanding of normalization as a step before correlation.