
VPC Flow Logs Analysis

This chapter covers VPC Flow Logs analysis, a critical skill for the SOA-C02 exam's Networking domain (Objective 5.1). VPC Flow Logs enable you to capture IP traffic metadata flowing to and from network interfaces in your VPC, providing visibility into network traffic patterns, security threats, and connectivity issues. Expect 5-10% of exam questions to test your understanding of flow log creation, publication destinations, log formats, and troubleshooting using these logs. Mastery of this topic is essential for any SysOps administrator responsible for network security and monitoring.

Updated May 31, 2026

VPC Flow Logs as Security Camera Tapes

Imagine a large office building with multiple floors, each floor having its own entrance, internal doors, and security cameras. The building's security team installs cameras at every door — both entrance and exit — to record every person who passes through, including their ID badge number, the time they entered or exited, and whether they were allowed through. These cameras do not stop anyone; they only record. The security tapes are stored in a central vault (CloudWatch Logs or S3). Later, if a theft occurs, the team reviews the tapes to see who entered the restricted area at the time of the incident. However, the cameras only show metadata — not the content of conversations or items carried. Similarly, VPC Flow Logs capture metadata about IP traffic: source/destination IP, port, protocol, and whether the traffic was accepted or rejected by security groups or network ACLs. They do not capture packet payloads. Just as cameras can be placed at every door or only specific ones, flow logs can be created at the VPC, subnet, or network interface level. The logs are published to CloudWatch Logs or S3 for analysis. If a camera is mispositioned (e.g., only recording exits), you might miss the intruder entering. Likewise, if you only log accepted traffic, you might not see blocked malicious attempts. The analogy holds: flow logs are passive recorders, not active enforcers, and they are essential for forensic analysis after a security event.

How It Actually Works

What Are VPC Flow Logs and Why Do They Exist?

VPC Flow Logs are a feature that captures metadata about the IP traffic going to and from network interfaces in a VPC. They are analogous to a network tap or a packet capture at layer 3/4, but without the payload. The primary use cases are:

- Security analysis: Identify denied traffic, detect anomalies, investigate security incidents.
- Network troubleshooting: Diagnose connectivity issues, verify that traffic is flowing as expected.
- Compliance: Meet audit requirements by logging network traffic.
- Usage monitoring: Understand traffic patterns, bandwidth usage, and application behavior.

Flow logs are not a replacement for deep packet inspection tools like AWS Network Firewall or third-party NGFWs, but they provide a cost-effective, native way to get baseline network visibility.

How VPC Flow Logs Work Internally

When you create a flow log, you specify:

- Resource: A VPC, subnet, or individual network interface (ENI).
- Destination: Amazon CloudWatch Logs log group or Amazon S3 bucket.
- Traffic type: ACCEPT, REJECT, or ALL.
- Format: The default format includes specific fields; you can also use custom formats.

Flow log data is collected outside the path of your network traffic, so enabling a flow log does not affect throughput or latency. The service captures IP traffic (IPv4 and IPv6) on each monitored network interface and aggregates it into flows. A flow is a set of packets that share the same source and destination IP address, source and destination port, and protocol (the 5-tuple); the two directions of a connection are recorded as separate flows. For each flow, the service writes one log record per aggregation interval (10 minutes by default, or 1 minute for higher granularity).

Each log entry contains fields like:

- version: The flow log version. The default format is version 2; higher versions add fields that are available only in custom formats.
- account-id: The AWS account ID.
- interface-id: The ENI ID.
- srcaddr: Source IP address.
- dstaddr: Destination IP address.
- srcport: Source port.
- dstport: Destination port.
- protocol: IANA protocol number (e.g., 6 for TCP, 17 for UDP).
- packets: Number of packets in the flow.
- bytes: Number of bytes in the flow.
- start: Start time (Unix timestamp).
- end: End time (Unix timestamp).
- action: ACCEPT or REJECT.
- log-status: OK, NODATA, or SKIPDATA.
- vpc-id, subnet-id, instance-id, and other fields (custom formats only).
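To make the field order concrete, here is a minimal shell sketch that splits a default-format record into named values. The function name `parse_flow_record` is hypothetical, the record uses placeholder account and ENI IDs, and the sketch assumes the 14-field default format in the order listed above.

```shell
# Hypothetical helper: reads default-format flow log records on stdin and
# prints a few named fields. Assumes the 14-field default (version 2) order:
# $3=interface-id $4=srcaddr $5=dstaddr $6=srcport $7=dstport $8=protocol
# $10=bytes $13=action $14=log-status
parse_flow_record() {
  awk 'NF == 14 {
    printf "eni=%s src=%s:%s dst=%s:%s proto=%s bytes=%s action=%s status=%s\n",
           $3, $4, $6, $5, $7, $8, $10, $13, $14
  }'
}

# Sample record (placeholder IDs): SSH traffic (TCP port 22) that was accepted.
printf '%s\n' '2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK' | parse_flow_record
```

Records that do not have exactly 14 fields are skipped, so a custom-format log would need its own column mapping.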

Key Components, Values, Defaults, and Timers

- Aggregation interval: Default 10 minutes; can be lowered to 1 minute, which produces more records and therefore higher ingestion and storage costs. The service collects packets for the interval, then writes a log entry summarizing each flow.
- Log destination: CloudWatch Logs (near-real-time delivery, typically within a few minutes) or S3 (batch delivery of log files, typically within several minutes).
- Traffic types: ACCEPT only, REJECT only, or ALL.
- Log format: Default includes 14 fields (version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status). Custom formats can include additional fields like region, az-id, subnet-id, instance-id, pkt-srcaddr, pkt-dstaddr, tcp-flags, type, flow-direction, and traffic-path.
- Log status:
  - OK: Data is being logged normally.
  - NODATA: No network traffic to/from the ENI during the aggregation interval.
  - SKIPDATA: Some log entries were skipped (e.g., due to internal capacity issues).
- Pricing: No charge for creating flow logs, but you pay for CloudWatch Logs ingestion and storage, or S3 storage and requests.
- Limitations:

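Flow logs record traffic only at network interfaces in your VPC. Interfaces that other AWS services create there (for example Elastic Load Balancing, Amazon RDS, Amazon Redshift, and NAT gateways) can be logged like any other ENI, and a VPC- or subnet-level flow log covers them automatically.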

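They do not capture the following traffic: queries from instances to the Amazon DNS server, DHCP traffic, traffic to and from the instance metadata address (169.254.169.254), traffic to and from the Amazon Time Sync Service (169.254.169.123), Windows license activation traffic, and mirrored traffic.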

They do not capture traffic to the reserved IP address of the default VPC router.

Maximum number of flow logs per region: 500.
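The log statuses described above are easy to tally before doing any deeper analysis. The following is a minimal sketch, assuming default-format records on standard input; `count_by_log_status` is a hypothetical helper name and the sample records use placeholder IDs.

```shell
# Hypothetical helper: counts default-format records by log-status (field 14).
# NODATA and SKIPDATA records carry "-" in the traffic fields, so they should
# not be mistaken for real flows.
count_by_log_status() {
  awk 'NF == 14 { n[$14]++ }
       END { printf "OK=%d NODATA=%d SKIPDATA=%d\n", n["OK"], n["NODATA"], n["SKIPDATA"] }'
}

printf '%s\n' \
  '2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK' \
  '2 123456789010 eni-1235b8ca123456789 - - - - - - - 1431280876 1431280934 - NODATA' \
  | count_by_log_status
# prints: OK=1 NODATA=1 SKIPDATA=0
```

A non-zero SKIPDATA count is a signal that byte and packet totals for that period are incomplete.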

Configuration and Verification Commands

Creating a flow log using AWS CLI:

aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-12345678 --log-group-name my-flow-logs --traffic-type ALL --deliver-logs-permission-arn arn:aws:iam::123456789012:role/FlowLogsRole

Creating a flow log with custom format to S3:

aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-12345678 --log-destination-type s3 --log-destination arn:aws:s3:::my-bucket/prefix/ --traffic-type ALL --log-format '${version} ${account-id} ${interface-id} ${srcaddr} ${dstaddr} ${srcport} ${dstport} ${protocol} ${packets} ${bytes} ${start} ${end} ${action} ${log-status} ${region} ${az-id} ${subnet-id} ${instance-id} ${pkt-srcaddr} ${pkt-dstaddr} ${tcp-flags} ${type}'

Verifying flow logs:

aws ec2 describe-flow-logs --filter "Name=resource-id,Values=vpc-12345678"

Viewing logs in CloudWatch Logs:

aws logs get-log-events --log-group-name my-flow-logs --log-stream-name eni-12345678-all

Interaction with Related Technologies

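Security Groups vs. Network ACLs: Flow logs show whether traffic was accepted or rejected. The action field reflects the combined effect of security group and network ACL rules: if either denies the traffic, the action is REJECT; if both allow it, ACCEPT. A record does not name the component that denied the traffic, but you can often infer it. Security groups are stateful, so the reply to an accepted request is always allowed by the security group, whereas network ACLs are stateless. If the inbound request shows ACCEPT and the outbound reply shows REJECT, a network ACL rule is the likely cause.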
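Although a single record does not name the component that rejected traffic, comparing a request with its reply can hint at it, because security groups are stateful and network ACLs are not. The sketch below is one way to look for that pattern in default-format records; `find_one_way_rejects` is a hypothetical helper, and the heuristic assumes both directions fall within the records supplied.

```shell
# Hypothetical helper: finds flows whose request was ACCEPTed while the reply
# in the opposite direction was REJECTed. A stateful security group would
# allow the reply, so this pattern points toward a network ACL.
# Default format: $4=srcaddr $5=dstaddr $6=srcport $7=dstport $8=protocol $13=action
find_one_way_rejects() {
  awk '
    NF == 14 { action[$4 " " $5 " " $6 " " $7 " " $8] = $13 }
    END {
      for (k in action) {
        if (action[k] != "ACCEPT") continue
        split(k, f, " ")
        # Reverse key: swap addresses and swap ports.
        rev = f[2] " " f[1] " " f[4] " " f[3] " " f[5]
        if ((rev in action) && action[rev] == "REJECT")
          print "request " f[1] " -> " f[2] " accepted, reply rejected: check network ACLs"
      }
    }'
}

# Sample: an inbound ICMP ping is accepted, but the reply is rejected.
printf '%s\n' \
  '2 123456789010 eni-1235b8ca123456789 203.0.113.12 172.31.16.139 0 0 1 4 336 1432917027 1432917142 ACCEPT OK' \
  '2 123456789010 eni-1235b8ca123456789 172.31.16.139 203.0.113.12 0 0 1 4 336 1432917094 1432917142 REJECT OK' \
  | find_one_way_rejects
```

The output is only a pointer for where to look; the rule itself still has to be confirmed in the network ACL configuration.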

VPC Traffic Mirroring: While flow logs give metadata, Traffic Mirroring copies actual packet payloads for deep inspection. Flow logs are lighter and cheaper.

AWS Network Firewall: Provides stateful inspection and can generate its own logs. Flow logs complement by showing baseline traffic.

AWS Transit Gateway: Transit Gateway Flow Logs can be created for an entire transit gateway or for an individual transit gateway attachment.

AWS PrivateLink: Flow logs capture traffic to interface VPC endpoints.

Advanced: Custom Formats and Rejected Traffic

Custom formats allow you to include fields like pkt-srcaddr and pkt-dstaddr, which are the original packet IP addresses (useful when traffic goes through a NAT or load balancer). The tcp-flags field is a bitmask of the TCP flags seen during the aggregation interval (for example, FIN = 1, SYN = 2, RST = 4, SYN-ACK = 18). Version 5 adds flow-direction (ingress or egress) and traffic-path (the path egress traffic takes to its destination). No field names the security group or network ACL rule that rejected a packet, so the cause of a REJECT has to be inferred from the direction of the flow and from the records at both ends of the connection.
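Because tcp-flags is reported as a number, a small decoder makes records easier to read. This is a minimal sketch using the standard TCP header bit values (FIN = 1, SYN = 2, RST = 4, ACK = 16); `decode_tcp_flags` is a hypothetical helper, and it deliberately ignores other bits.

```shell
# Hypothetical helper: decodes a tcp-flags bitmask into flag names.
# Only FIN, SYN, RST, and ACK are decoded; other bits are ignored.
decode_tcp_flags() {
  flags=$1
  names=""
  if [ $((flags & 1)) -ne 0 ]; then names="$names FIN"; fi
  if [ $((flags & 2)) -ne 0 ]; then names="$names SYN"; fi
  if [ $((flags & 4)) -ne 0 ]; then names="$names RST"; fi
  if [ $((flags & 16)) -ne 0 ]; then names="$names ACK"; fi
  # Strip the leading space; print "none" when no decoded bit is set.
  names=${names# }
  echo "${names:-none}"
}

decode_tcp_flags 18   # prints: SYN ACK
decode_tcp_flags 3    # prints: FIN SYN
```

A value of 3 (SYN and FIN in the same record) suggests a connection that was opened and closed within one aggregation interval.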

Troubleshooting with Flow Logs

Common scenarios:

- Connectivity failure: Check whether the action is REJECT. If so, use the source/destination IP, port, and direction of the flow to narrow down which security group or network ACL rule is blocking.
- Asymmetric routing: If you see traffic in one direction but not the reverse, there may be a routing issue.
- Missing logs: Check log-status. SKIPDATA means some records were skipped during the interval because of an internal capacity constraint or error; treat that interval as incomplete rather than as evidence of no traffic.
- No data: Ensure the flow log is attached to the correct resource and that the IAM role (for CloudWatch) or bucket policy (for S3) allows delivery.

Best Practices

Enable flow logs for all VPCs, especially production.

Use ALL traffic type to capture both accepted and rejected traffic.

Set aggregation interval to 1 minute for critical interfaces.

Store logs in S3 with lifecycle policies to move to cheaper storage after a retention period.

Use Amazon Athena to query flow logs in S3.

Monitor flow log delivery by checking the DeliverLogsStatus and DeliverLogsErrorMessage values returned by describe-flow-logs.

Walk-Through

1

Create an IAM Role for CloudWatch

If delivering to CloudWatch Logs, you need an IAM role that grants the flow logs service permission to publish log events to the log group. The role must have a trust policy allowing `vpc-flow-logs.amazonaws.com` to assume it, and a permissions policy allowing `logs:CreateLogGroup`, `logs:CreateLogStream`, `logs:PutLogEvents`, and `logs:DescribeLogGroups`. If delivering to S3, no role is needed; instead, the bucket policy must allow the log delivery service principal (`delivery.logs.amazonaws.com`) to call `s3:PutObject` on the bucket. The exam often tests that you must create the role or bucket policy before creating the flow log.
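The two policy documents for the CloudWatch Logs role can be sketched as follows. The function names are hypothetical, and the wildcard `Resource` is an assumption made to keep the example short; in practice you would likely scope it to the log group's ARN.

```shell
# Prints a trust policy that lets the VPC Flow Logs service assume the role.
flow_logs_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "vpc-flow-logs.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Prints a permissions policy with the CloudWatch Logs actions named above.
# "Resource": "*" is an illustrative assumption, not a recommendation.
flow_logs_permissions_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents",
        "logs:DescribeLogGroups"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}
```

One way to use these is to redirect each function's output to a file and pass it to `aws iam create-role --assume-role-policy-document file://...` and `aws iam put-role-policy --policy-document file://...` respectively.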

2

Select Resource Type and Traffic Type

You can create a flow log at the VPC, subnet, or ENI level. The scope determines which interfaces are monitored: VPC-level logs capture all ENIs in the VPC; subnet-level captures ENIs in that subnet; ENI-level captures a single interface. Traffic type can be ACCEPT, REJECT, or ALL. Choose ALL for comprehensive visibility. Note that you cannot modify the traffic type after creation; you must delete and recreate the flow log.

3

Choose Log Destination and Format

Destinations: CloudWatch Logs (near real-time, searchable, but higher cost) or S3 (cheaper, batch delivery, good for long-term storage and analysis with Athena). You can specify a custom log format using variables like `${version} ${srcaddr} ${dstaddr} ${action}`. The default format includes 14 fields. Custom formats allow you to include additional fields like `pkt-srcaddr`, `tcp-flags`, and `flow-direction`. The format must be defined at creation time and cannot be changed later.
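With a custom format, the column positions in each record follow the order of the variables in the format string, so any parsing script needs that mapping. A minimal sketch, assuming space-separated records; `field_index` is a hypothetical helper.

```shell
# Hypothetical helper: given a --log-format string and a field name, prints
# the 1-based column number of that field in the resulting records.
field_index() {
  # Strip the ${ } wrappers, then scan the remaining space-separated names.
  printf '%s\n' "$1" | tr -d '${}' |
    awk -v want="$2" '{ for (i = 1; i <= NF; i++) if ($i == want) print i }'
}

field_index '${version} ${srcaddr} ${dstaddr} ${action}' action   # prints: 4
```

The format string is single-quoted on purpose: inside double quotes the shell would try to expand `${srcaddr}` and the others as its own variables.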

4

Create the Flow Log

Use the AWS Management Console, CLI, or SDK to create the flow log. For example, using CLI: `aws ec2 create-flow-logs --resource-type VPC --resource-ids vpc-xxxxx --log-group-name my-log-group --traffic-type ALL --deliver-logs-permission-arn arn:aws:iam::123456789012:role/FlowLogRole`. If successful, the command returns a list of flow log IDs. Note that you cannot create a flow log for a resource that already has a flow log of the same type (same destination and traffic type) – you can have up to 2 flow logs per resource (e.g., one for ACCEPT and one for REJECT, or one to CloudWatch and one to S3).

5

Analyze Logs and Troubleshoot

After creation, records typically start appearing within several minutes of the end of each aggregation interval, with S3 delivery somewhat slower than CloudWatch Logs. Use CloudWatch Logs Insights to query logs: `fields @timestamp, srcAddr, dstAddr, action | filter action = "REJECT" | sort @timestamp desc` (Logs Insights discovers default-format fields under camelCase names such as `srcAddr` and `dstAddr`). For S3, use Athena to create a table over the flow log data. Look for patterns like repeated REJECTs from the same IP, missing expected traffic, or asymmetric flows. Flow logs do not record which rule rejected a packet, so compare the records for both ENIs and both directions to narrow down whether a security group or a network ACL is responsible. This step is critical for exam scenarios where you need to identify why an EC2 instance cannot connect to another server.
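For log files downloaded from S3, or any export of default-format records, the "repeated REJECTs from the same IP" pattern can also be checked locally. A minimal sketch, with `top_rejected_sources` as a hypothetical helper and documentation-range IP addresses as sample data:

```shell
# Hypothetical helper: counts REJECT records per source address in
# default-format records ($4=srcaddr, $13=action), busiest source first.
top_rejected_sources() {
  awk 'NF == 14 && $13 == "REJECT" { hits[$4]++ }
       END { for (ip in hits) print hits[ip], ip }' | sort -rn
}

printf '%s\n' \
  '2 123456789010 eni-1235b8ca123456789 198.51.100.7 172.31.16.139 40001 22 6 1 40 1418530010 1418530070 REJECT OK' \
  '2 123456789010 eni-1235b8ca123456789 198.51.100.7 172.31.16.139 40002 3389 6 1 40 1418530010 1418530070 REJECT OK' \
  '2 123456789010 eni-1235b8ca123456789 203.0.113.12 172.31.16.139 40003 22 6 1 40 1418530010 1418530070 REJECT OK' \
  '2 123456789010 eni-1235b8ca123456789 172.31.16.21 172.31.16.139 40004 443 6 9 900 1418530010 1418530070 ACCEPT OK' \
  | top_rejected_sources
```

Each output line is a count followed by a source address, so the first line identifies the address with the most rejected flows.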

What This Looks Like on the Job

Enterprise Scenario 1: Security Incident Investigation

A financial services company detects unusual outbound traffic from an EC2 instance in a production VPC. The security team suspects a compromised instance exfiltrating data. They enable VPC Flow Logs for the entire VPC (if not already enabled) with a 1-minute aggregation interval and a custom format including tcp-flags, pkt-srcaddr, and flow-direction. Within minutes, they see an egress flow from the instance's ENI to an unknown IP on port 443 with high byte counts. The tcp-flags values show SYN and FIN, meaning connections were opened and closed within the interval, and the action is ACCEPT. They correlate this with CloudTrail logs to see who launched the instance and what IAM roles it had. They isolate the instance, take a forensic snapshot, and use the flow log data as evidence for compliance. Without flow logs, they would have had no network-level record of the exfiltration.

Enterprise Scenario 2: Troubleshooting Connectivity Issues

A SaaS company deploys a new microservice in a private subnet that needs to communicate with an RDS database in another subnet. Users report timeouts. The network engineer checks the VPC Flow Logs for both ends of the connection. Records for the microservice's ENI show outbound traffic to the RDS endpoint (IP and port 3306) as ACCEPT, which means the microservice's security group and its subnet's network ACL allow it. Records for the database's ENI show the same traffic arriving as REJECT. The engineer reviews the RDS security group and finds that it does not allow inbound traffic on port 3306 from the microservice's security group. After updating the RDS security group, the flow logs show ACCEPT on both ENIs. This quick diagnosis saves hours of packet-level analysis.

Enterprise Scenario 3: Capacity Planning and Cost Allocation

A large e-commerce platform uses flow logs to monitor traffic patterns across hundreds of microservices. They store flow logs in S3 and use Athena to query them weekly. They generate reports on top talkers (by bytes and packets), identify which services communicate most frequently, and plan scaling accordingly. They also use flow logs to attribute network costs to specific business units by tagging ENIs and extracting the instance-id from logs. One misconfiguration: they initially used the default format and could not identify which instance generated traffic because the instance-id field is not in the default format. They had to recreate the flow logs with a custom format including ${instance-id}. They also encountered SKIPDATA records during a major sales event. Because SKIPDATA reflects an internal capacity constraint or error rather than a customer setting, they flagged the affected intervals as incomplete in their reports instead of treating them as periods of low traffic.

How SOA-C02 Actually Tests This

What SOA-C02 Tests on VPC Flow Logs

Objective 5.1: 'Implement and manage network features, including VPC Flow Logs.' The exam tests your ability to:

Create flow logs with correct parameters (resource type, destination, traffic type).

Understand the difference between CloudWatch Logs and S3 destinations.

Interpret flow log entries to diagnose connectivity issues.

Know the limitations: what traffic is NOT captured (DNS, DHCP, instance metadata, etc.).

Identify the correct IAM permissions required.

Understand the aggregation interval and its impact on log volume.

Common Wrong Answers and Why Candidates Choose Them

1.

Choosing 'REJECT' traffic type to see blocked traffic: Candidates often think they only need to see denied traffic for security. But to fully understand network behavior, you need ALL traffic. The exam will test that you know to choose ALL for comprehensive analysis.

2.

Thinking flow logs capture packet payloads: Many assume flow logs are like packet capture. The exam explicitly tests that flow logs only capture metadata (IP addresses, ports, protocols, action) and NOT the payload.

3.

Assuming flow logs cover only EC2 instance ENIs: Candidates might think interfaces managed by other services are invisible. Flow logs can be created for network interfaces that other AWS services create in your VPC (e.g., Elastic Load Balancing, RDS, Redshift, NAT gateways), and a VPC- or subnet-level flow log includes them automatically.

4.

Mixing up the IAM role for CloudWatch vs. S3: For CloudWatch, you need a service role with logs:PutLogEvents. For S3, you need a bucket policy. The exam presents scenarios where the wrong permission is used.

Specific Numbers and Terms

Aggregation interval: 10 minutes (default) or 1 minute.

Maximum flow logs per region: 500.

Log status: OK, NODATA, SKIPDATA.

Version: 2 (default format); versions 3, 4, and 5 add fields for custom formats (for example, version 5 adds flow-direction and traffic-path).

Traffic types: ACCEPT, REJECT, ALL.

Fields: version, account-id, interface-id, srcaddr, dstaddr, srcport, dstport, protocol, packets, bytes, start, end, action, log-status, vpc-id, subnet-id, instance-id, tcp-flags, pkt-srcaddr, pkt-dstaddr, region, az-id, sublocation-type, sublocation-id, pkt-account-id.

Edge Cases and Exceptions

Flow logs do not capture traffic to 169.254.169.254 (instance metadata) or 169.254.169.253 (Amazon DNS).

Traffic to the reserved IP address of the default VPC router is not logged.

If you have multiple flow logs on the same resource (e.g., one to CloudWatch and one to S3), both capture the same traffic.

In a VPC shared with you (via AWS RAM), you can create flow logs only for network interfaces that you own, not for the shared subnet or the VPC itself.

Flow logs capture both IPv4 and IPv6 traffic at every resource level (VPC, subnet, and ENI); the exam may test that IPv6 is supported.

How to Eliminate Wrong Answers

If a question asks about capturing denied traffic, remember that you need to set traffic type to ALL or REJECT, but also consider the aggregation interval.

If a question asks about real-time analysis, CloudWatch Logs is the destination (streaming), not S3 (batch).

If a question mentions missing logs, check for SKIPDATA or NODATA status, or verify IAM permissions.

If a question involves troubleshooting a connection between two instances, look at the flow log entries for both ENIs to see if traffic is one-way.

Key Takeaways

VPC Flow Logs capture metadata (not payload) of IP traffic at the VPC, subnet, or ENI level.

Default aggregation interval is 10 minutes; can be set to 1 minute for higher granularity.

Traffic types: ACCEPT, REJECT, or ALL – choose ALL for comprehensive visibility.

Log destinations: CloudWatch Logs (real-time) or S3 (batch, cheaper).

Flow logs do not capture traffic to Amazon DNS, DHCP traffic, instance metadata traffic, or Amazon Time Sync Service traffic.

No flow log field names the rule that rejected a packet; compare ACCEPT and REJECT records for both directions and both ENIs to locate the block.

You must have the correct IAM role (for CloudWatch) or bucket policy (for S3) before creating flow logs.

Maximum 500 flow logs per region.

Flow logs are not retroactive – they only capture traffic after creation.

Use `describe-flow-logs` to verify creation and status.

Common log statuses: OK, NODATA (no traffic), SKIPDATA (some data skipped).

Custom log format can include fields like instance-id, pkt-srcaddr, tcp-flags, and flow-direction.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

CloudWatch Logs Destination

Near-real-time delivery of log events (typically within a few minutes of the end of the aggregation interval).

Supports real-time search with CloudWatch Logs Insights.

Higher cost due to ingestion and storage fees.

Automatic log stream creation per ENI.

Can trigger Lambda functions for real-time processing.

S3 Bucket Destination

Batch delivery of log files, typically within several minutes.

Cheaper storage, especially with S3 lifecycle policies.

Ideal for long-term archival and compliance.

Can be queried using Amazon Athena with SQL.

No automatic log stream; data is stored as gzipped text files.

Watch Out for These

Mistake

VPC Flow Logs capture all network traffic, including traffic to AWS DNS and instance metadata.

Correct

Flow logs do not capture traffic to Amazon DNS (169.254.169.253), DHCP traffic, instance metadata traffic (169.254.169.254), Amazon Time Sync Service traffic (169.254.169.123), Windows license activation traffic, or mirrored traffic. Queries to a DNS server that you run yourself are logged.

Mistake

Flow logs capture the actual packet data (payload).

Correct

Flow logs only capture metadata about the IP traffic: source/destination IP, ports, protocol, number of packets/bytes, and action (ACCEPT/REJECT). They never capture the payload contents.

Mistake

You can change the traffic type or log format after creating a flow log.

Correct

Both the traffic type (ACCEPT, REJECT, ALL) and log format are immutable after creation. To change them, you must delete the flow log and create a new one.

Mistake

Flow logs can be created only for ENIs attached to EC2 instances, not for ENIs created by services such as RDS, Redshift, or NAT gateways.

Correct

You can create flow logs for network interfaces that other AWS services create in your VPC, including Elastic Load Balancing, Amazon RDS, Amazon ElastiCache, Amazon Redshift, Amazon WorkSpaces, and NAT gateways. A VPC- or subnet-level flow log covers these interfaces automatically.

Mistake

The default aggregation interval is 1 minute.

Correct

The default aggregation interval is 10 minutes. You can optionally set it to 1 minute, but this incurs additional costs and generates more log data.


Frequently Asked Questions

How do I create a VPC Flow Log in AWS?

You can create a flow log via the AWS Management Console, CLI, or SDK. In the console, navigate to VPC > Flow Logs > Create flow log. Specify the resource (VPC, subnet, or ENI), traffic type (ACCEPT, REJECT, ALL), destination (CloudWatch Logs or S3), and optional custom format. For CLI, use `aws ec2 create-flow-logs`. Ensure you have the required IAM role (for CloudWatch) or bucket policy (for S3) in place.

What is the difference between VPC Flow Logs and AWS CloudTrail?

VPC Flow Logs capture network traffic metadata (IP addresses, ports, protocols) aggregated per flow. CloudTrail records API calls made through the AWS Management Console, SDKs, and CLI. They serve different purposes: flow logs for network visibility, CloudTrail for auditing API activity. Both are important for security, but they capture different types of data.

Can I use VPC Flow Logs to capture traffic to a NAT gateway?

Yes. A NAT gateway has a network interface in your subnet, and you can create a flow log for that ENI directly or cover it with a subnet- or VPC-level flow log. In these records, srcaddr and dstaddr show the addresses of the interface handling each hop, so add pkt-srcaddr and pkt-dstaddr to a custom format to see the original source and final destination of packets passing through the gateway.

Why are my VPC Flow Logs showing 'NODATA'?

NODATA means there was no network traffic to or from the ENI during the aggregation interval. This is normal if the instance is idle or not connected to the network. If you expect traffic, check that the instance has an ENI attached and that there is active communication. Also verify that the flow log is correctly associated with the resource.

How can I query VPC Flow Logs stored in S3?

You can use Amazon Athena to query flow logs stored in S3. First, create a table in Athena using the AWS Glue Data Catalog or manually with the appropriate column mapping. Then run SQL queries. For example: `SELECT * FROM flow_logs WHERE action = 'REJECT' LIMIT 10`. Alternatively, you can use Amazon QuickSight for visualization.

What is the 'drop-cause' field in VPC Flow Logs?

VPC Flow Logs do not provide a `drop-cause` field, and no field identifies the specific security group or network ACL rule that rejected a packet. To find the cause of a REJECT, use the action field together with the direction of the flow and the records at both ends of the connection: security groups are stateful and network ACLs are stateless, so an accepted request followed by a rejected reply points to a network ACL, while an accepted outbound record at the source and a rejected inbound record at the destination points to the destination's security group or network ACL.

Can I have multiple flow logs for the same ENI?

Yes, you can have up to two flow logs per resource (ENI, subnet, or VPC). For example, you could have one flow log sending ACCEPT traffic to CloudWatch and another sending REJECT traffic to S3. However, you cannot have two flow logs with the same destination and traffic type for the same resource.
