This chapter covers two foundational AWS monitoring and logging services: Amazon CloudWatch and AWS CloudTrail. For the CLF-C02 exam, this objective falls under Domain 3: Cloud Technology and Services, the largest domain at 34% of scored content. Understanding the distinct purposes, features, and use cases of CloudWatch and CloudTrail is critical because the exam often tests your ability to choose the right service for a given scenario. By the end of this chapter, you'll know when to use each service and how they complement each other in a well-architected AWS environment.
Imagine you run a small retail store. You install security cameras (CloudWatch) that continuously record video, showing you real-time footage of customer traffic, employee activity, and stock levels. You can set up alerts: if the front door is opened after hours, you get a notification. The cameras give you operational visibility—they monitor the health of your store. Separately, you have a paper logbook (CloudTrail) at the entrance where every person who enters must sign their name, time, and purpose. This logbook doesn't show you live video; it records a chronological, tamper-evident record of who did what and when. If a theft occurs, you review the logbook to see exactly which employee entered the stockroom at 3 AM. The cameras help you understand store performance and issues; the logbook provides an audit trail for security and compliance. In AWS, CloudWatch monitors resources and applications (cameras), while CloudTrail records API activity (logbook). Both are essential but serve different purposes: one for performance and health, the other for auditing and security.
What Are CloudWatch and CloudTrail? The Problems They Solve
Amazon CloudWatch is a monitoring and observability service that provides data and actionable insights for AWS resources, applications, and on-premises servers. It answers questions like: Is my EC2 instance CPU utilization too high? Is my application responding slowly? Are my RDS connections maxing out? CloudWatch collects metrics (time-ordered data points), logs, and events, and allows you to set alarms and automate responses.
AWS CloudTrail is a governance, compliance, and audit service that records API activity in your AWS account. It answers questions like: Who launched an EC2 instance last night? Did someone delete an S3 bucket? Which IAM user made a change to a security group? CloudTrail provides a history of AWS API calls for your account, including calls made via the AWS Management Console, SDKs, CLI, and other AWS services. Management events are captured by default; data events must be enabled explicitly.
How CloudWatch Works: The Mechanism
CloudWatch operates on three core concepts: metrics, alarms, and logs.
Metrics are the fundamental building blocks. A metric is a time-ordered set of data points representing a variable you want to monitor. For example, the CPUUtilization metric for an EC2 instance is published every 5 minutes with basic monitoring (the default), or every minute if you enable detailed monitoring. Metrics are retained for up to 15 months, with granularity decreasing over time: data points with a period under 60 seconds (high-resolution custom metrics) are available for 3 hours, 1-minute data points for 15 days, 5-minute data points for 63 days, and 1-hour data points for 455 days (15 months). You can also publish custom metrics (e.g., application-level metrics like number of orders processed) using the PutMetricData API.
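The retention schedule is easier to remember as a lookup. The sketch below is only a study aid, not an AWS API: the function name and the tier labels are made up for illustration, and the sub-minute tier applies only to metrics that were published as high-resolution in the first place.

```python
from datetime import timedelta

# (maximum age, finest period still retained), per the CloudWatch retention schedule.
# The first tier only exists for high-resolution custom metrics.
RETENTION_TIERS = [
    (timedelta(hours=3), "sub-minute (high-resolution)"),
    (timedelta(days=15), "1 minute"),
    (timedelta(days=63), "5 minutes"),
    (timedelta(days=455), "1 hour"),
]


def finest_available_period(age: timedelta) -> str:
    """Return the finest granularity still retained for a data point of this age."""
    for max_age, period in RETENTION_TIERS:
        if age <= max_age:
            return period
    return "expired"  # older than 455 days: the data point is no longer available


print(finest_available_period(timedelta(hours=1)))   # sub-minute (high-resolution)
print(finest_available_period(timedelta(days=30)))   # 5 minutes
print(finest_available_period(timedelta(days=500)))  # expired
```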
Alarms watch a single metric or an expression based on multiple metrics. You set a threshold and an action. For example, if CPUUtilization > 80% for 5 consecutive minutes, send an SNS notification to your operations team. Alarms have three states: OK, ALARM, and INSUFFICIENT_DATA. You can also use composite alarms to combine multiple alarms into a single alert.
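To make the three alarm states concrete, here is a deliberately simplified model of how an alarm could be evaluated. It is an assumption-laden sketch, not CloudWatch's actual algorithm: real alarms also support "M out of N" datapoints and configurable handling of missing data, both of which this toy version ignores. The function name is hypothetical.

```python
from typing import Optional, Sequence


def evaluate_alarm(datapoints: Sequence[Optional[float]],
                   threshold: float,
                   evaluation_periods: int) -> str:
    """Toy model: inspect the most recent `evaluation_periods` datapoints.

    None stands for a missing datapoint.
    """
    recent = list(datapoints)[-evaluation_periods:]
    if len(recent) < evaluation_periods or any(d is None for d in recent):
        return "INSUFFICIENT_DATA"
    if all(d > threshold for d in recent):
        return "ALARM"
    return "OK"


# CPUUtilization sampled once per minute; alarm if > 80 for 5 consecutive minutes.
print(evaluate_alarm([50, 85, 90, 88, 91, 95], threshold=80, evaluation_periods=5))  # ALARM
print(evaluate_alarm([85, 90, 70, 91, 95], threshold=80, evaluation_periods=5))      # OK
```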
Logs – CloudWatch Logs lets you ingest, store, and analyze log files from AWS resources and applications. You can create log groups and log streams. For example, EC2 instances can send system logs to CloudWatch Logs via the CloudWatch agent. You can then search logs, create metric filters to count occurrences of specific terms (e.g., "ERROR"), and set alarms on those metrics. Logs are retained indefinitely by default, but you can set retention policies (1 day to 10 years).
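A metric filter essentially turns matching log lines into a number you can alarm on. The following sketch imitates that idea locally with plain Python; it does not call CloudWatch, and the log lines are invented sample data. Like a simple term filter, the match here is case-sensitive.

```python
def count_matches(log_lines, term):
    """Count log lines containing `term` (case-sensitive), like a simple term filter."""
    return sum(1 for line in log_lines if term in line)


# Hypothetical application log lines, for illustration only.
sample_lines = [
    "2024-01-15 10:00:01 INFO request served",
    "2024-01-15 10:00:02 ERROR database timeout",
    "2024-01-15 10:00:03 error in lowercase is not matched",
    "2024-01-15 10:00:04 ERROR upstream unavailable",
]

error_count = count_matches(sample_lines, "ERROR")
print(error_count)  # 2
```

In CloudWatch the resulting count would be published as a metric, and an alarm on that metric would fire when the count crosses your threshold.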
CloudWatch Dashboards allow you to create customizable views of metrics and alarms. You can share dashboards across accounts or with the public (with caution).
CloudWatch Events (now part of Amazon EventBridge) – CloudWatch Events delivers a near real-time stream of system events that describe changes in AWS resources. For example, an EC2 instance state change to "running" triggers an event. You can set rules to route events to targets like Lambda functions, SNS topics, or SQS queues. EventBridge is the newer, more feature-rich version.
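A rule is defined by an event pattern: a JSON document listing the field values an event must have. The sketch below shows the pattern for the EC2 "running" example and a minimal local matcher. The matcher is an illustration that handles only exact-value matching; the real service supports richer operators (prefix, numeric ranges, and so on). The instance ID is a placeholder.

```python
def matches(pattern: dict, event: dict) -> bool:
    """Exact-value matching only: each pattern field lists the allowed values."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern (e.g. "detail"): recurse into the nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Pattern: any EC2 instance entering the "running" state.
PATTERN = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["running"]},
}

# Trimmed-down sample event; the instance ID is a placeholder.
event = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123456789abcdef0", "state": "running"},
}

print(matches(PATTERN, event))  # True
```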
CloudWatch Agent – For monitoring EC2 instances and on-premises servers, you install the CloudWatch agent. It collects metrics (like memory utilization, disk space) and logs that are not available by default. The agent can also collect metrics from Windows or Linux systems.
How CloudTrail Works: The Mechanism
CloudTrail records API activity as events. CloudTrail is active in every AWS account by default, but that default covers only Event History (described below); it does not create a trail for you. To keep an ongoing record, you create a trail, which can apply to all regions (recommended, and the default when you create a trail in the console) or to a single region. A trail delivers events to an S3 bucket of your choice. You can also optionally deliver them to CloudWatch Logs for near real-time monitoring.
Management Events – These are operations performed on AWS resources, such as creating an EC2 instance (RunInstances), modifying a security group (AuthorizeSecurityGroupIngress), or deleting an S3 bucket (DeleteBucket). Management events are recorded by default. You can separate them into Read events (e.g., DescribeInstances) and Write events (e.g., RunInstances).
Data Events – These are operations performed on or within resources, such as S3 object-level API calls (GetObject, PutObject), Lambda function invocations, and DynamoDB item operations. Data events are not recorded by default because they can generate a high volume of logs. You must explicitly enable them.
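Enabling data events outside the console means attaching event selectors to a trail. The sketch below builds a selector that keeps management events and adds S3 object-level events for one bucket. The helper names, trail name, and bucket name are hypothetical; the `put_event_selectors` call is part of the boto3 CloudTrail client, but the code assumes boto3 is installed and credentials are configured, and it is a starting point rather than a complete setup.

```python
def build_event_selectors(bucket_name: str) -> list:
    """Selector: management events plus S3 object-level data events for one bucket."""
    return [{
        "ReadWriteType": "All",            # both read and write events
        "IncludeManagementEvents": True,   # keep the default management events
        "DataResources": [{
            "Type": "AWS::S3::Object",
            # Trailing slash = all objects in this bucket.
            "Values": [f"arn:aws:s3:::{bucket_name}/"],
        }],
    }]


def enable_s3_data_events(trail_name: str, bucket_name: str) -> None:
    """Apply the selectors to an existing trail (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it
    boto3.client("cloudtrail").put_event_selectors(
        TrailName=trail_name,
        EventSelectors=build_event_selectors(bucket_name),
    )


selectors = build_event_selectors("example-bucket")  # placeholder bucket name
print(selectors[0]["DataResources"][0]["Values"])
```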
CloudTrail Insights – An optional feature that analyzes management events to identify unusual API activity in your account, such as a spike in instance terminations or an abnormal rate of API errors. Insights compares call rates and error rates against a baseline of your account's normal activity. Insights events are delivered to a separate folder in the trail's S3 bucket.
Event History – By default, CloudTrail provides a viewable, searchable, downloadable record of the last 90 days of management events in the CloudTrail console. This is free and always enabled. For longer retention, you must create a trail that delivers to S3.
Log File Integrity Validation – CloudTrail can sign log files using SHA-256 hashing and digital signatures to ensure they have not been tampered with. This is critical for compliance (e.g., PCI DSS, SOC).
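The hashing half of this idea can be shown in a few lines. The sketch below only illustrates how a SHA-256 hash exposes a modified file; it is not how you validate real CloudTrail logs, which also involves verifying the digital signature on the digest files (the AWS CLI provides `aws cloudtrail validate-logs` for that). The log content and function names are invented for illustration.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 hash of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, recorded_hash: str) -> bool:
    """True if the data still hashes to the value recorded earlier."""
    return hmac.compare_digest(sha256_hex(data), recorded_hash)


# Hypothetical log content; a real log file would be a gzipped JSON document.
original = b'{"eventName": "DeleteBucket"}'
recorded = sha256_hex(original)            # hash stored when the log was delivered

tampered = b'{"eventName": "ListBuckets"}'
print(is_unmodified(original, recorded))   # True
print(is_unmodified(tampered, recorded))   # False
```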
Comparison to On-Premises or Competing Approaches
In an on-premises data center, monitoring often involves a combination of SNMP traps, syslog servers, and custom scripts. CloudWatch provides a fully managed, scalable alternative with no servers to maintain. Similarly, auditing on-premises might involve reviewing server logs or database logs. CloudTrail provides a centralized, tamper-evident audit trail for all AWS API calls, which is much harder to achieve manually.
Competing cloud monitoring services include Azure Monitor and Google Cloud Operations Suite. However, for AWS-centric environments, CloudWatch and CloudTrail are tightly integrated with all AWS services, making them the natural choice.
When to Use CloudWatch vs CloudTrail
Use CloudWatch for performance monitoring, operational health, and application troubleshooting. If you need to know how your resources are performing (CPU, memory, latency), set alarms, or view logs, use CloudWatch.
Use CloudTrail for security auditing, compliance, and incident investigation. If you need to know who did what and when in your AWS account, use CloudTrail.
They often work together: you can send CloudTrail logs to CloudWatch Logs to set alarms on specific API calls (e.g., alarm when an IAM user creates a new access key).
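As a rough sketch of that integration, the code below builds the parameters for a CloudWatch Logs metric filter that counts CreateAccessKey calls in a log group receiving CloudTrail events. The filter name, metric name, namespace, and log group name are all hypothetical choices; `put_metric_filter` is a boto3 CloudWatch Logs call, and applying it assumes boto3 and credentials are available. An alarm on the resulting metric would complete the picture.

```python
def build_access_key_filter(log_group: str) -> dict:
    """Parameters for a metric filter that counts CreateAccessKey API calls."""
    return {
        "logGroupName": log_group,
        "filterName": "CreateAccessKeyCalls",                    # hypothetical name
        "filterPattern": '{ $.eventName = "CreateAccessKey" }',  # JSON filter pattern
        "metricTransformations": [{
            "metricName": "CreateAccessKeyCount",                # hypothetical name
            "metricNamespace": "Custom/CloudTrail",              # hypothetical namespace
            "metricValue": "1",                                  # add 1 per matching event
        }],
    }


def apply_filter(log_group: str) -> None:
    """Create the metric filter (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it
    boto3.client("logs").put_metric_filter(**build_access_key_filter(log_group))


params = build_access_key_filter("example-cloudtrail-log-group")  # placeholder
print(params["filterPattern"])
```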
Enable CloudTrail for your account
Go to the CloudTrail console in AWS. Click 'Create trail'. Give it a name (e.g., 'ManagementTrail'). Choose whether to apply to all regions (recommended) or a single region. Select an S3 bucket to store log files (you can create a new bucket or use an existing one). Optionally, enable CloudWatch Logs integration to send events to a log group for near real-time monitoring. You can also enable log file validation for integrity. Click 'Create'. Behind the scenes, when the console creates a new bucket it also attaches a bucket policy that allows CloudTrail to write logs, and the trail starts recording API calls. Log files typically begin to appear in the bucket within several minutes. This trail will record management events by default. To record data events (e.g., S3 object-level operations), you need to enable them separately in the trail settings.
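The same setup can be scripted. The sketch below mirrors the console choices above (all regions, log file validation on) using the boto3 CloudTrail client's `create_trail` and `start_logging` calls. Two assumptions to note: the trail and bucket names are placeholders, and unlike the console flow, the bucket must already exist with a policy that lets CloudTrail write to it.

```python
def build_trail_request(trail_name: str, bucket_name: str) -> dict:
    """Parameters mirroring the console choices described above."""
    return {
        "Name": trail_name,
        "S3BucketName": bucket_name,       # bucket must already allow CloudTrail writes
        "IsMultiRegionTrail": True,        # apply to all regions
        "EnableLogFileValidation": True,   # produce digest files for integrity checks
    }


def create_and_start_trail(trail_name: str, bucket_name: str) -> None:
    """Create the trail and begin logging (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it
    cloudtrail = boto3.client("cloudtrail")
    cloudtrail.create_trail(**build_trail_request(trail_name, bucket_name))
    cloudtrail.start_logging(Name=trail_name)  # a trail created via the API starts stopped


request = build_trail_request("ManagementTrail", "example-trail-bucket")
print(request)
```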
View CloudTrail Event History
In the CloudTrail console, click 'Event history'. This shows the last 90 days of management events for your account. You can filter by resource type, user name, event name, event source, and time range. For example, filter by 'Event name = RunInstances' to see all EC2 instance launches. Click on any event to see details: who made the call (user identity), when (event time), source IP address, and the request parameters. This is useful for quick investigations. Note that Event History shows management events only; data events never appear there, even if a trail captures them. Event History is free and always enabled. For longer retention or for data events, you must create a trail and review its log files.
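Each event is a JSON record, and the "who, what, when, from where" details map onto a handful of fields. The sketch below pulls those fields out of a trimmed-down record. The field names (`userIdentity`, `eventName`, `eventTime`, `sourceIPAddress`, `requestParameters`) follow the CloudTrail record format; every value in the sample is invented.

```python
import json

# Trimmed-down, invented sample record; real records contain many more fields.
SAMPLE_RECORD = """
{
  "eventTime": "2024-01-15T03:12:45Z",
  "eventSource": "s3.amazonaws.com",
  "eventName": "DeleteBucket",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "userName": "example-user"},
  "requestParameters": {"bucketName": "example-bucket"}
}
"""


def summarize(record: dict) -> dict:
    """Extract the who / what / when / where of a CloudTrail event record."""
    identity = record.get("userIdentity", {})
    return {
        # Not every identity type has a userName (e.g. assumed roles), so fall back to the ARN.
        "who": identity.get("userName") or identity.get("arn"),
        "what": record.get("eventName"),
        "when": record.get("eventTime"),
        "source_ip": record.get("sourceIPAddress"),
        "parameters": record.get("requestParameters"),
    }


summary = summarize(json.loads(SAMPLE_RECORD))
print(summary["who"], summary["what"], summary["when"], summary["source_ip"])
```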
Create a CloudWatch Alarm for high CPU
Open the CloudWatch console. In the left navigation, click 'Alarms' then 'All alarms'. Click 'Create alarm'. Choose 'Select metric'. Browse or search for a metric, e.g., EC2 > Per-Instance Metrics > CPUUtilization. Select the instance you want to monitor. Set the statistic and period (e.g., Average over 5 minutes). Define the condition: e.g., 'Greater than 80' for 2 consecutive datapoints. Configure an action: send a notification to an SNS topic (you may need to create an SNS topic first). Optionally, you can also trigger an Auto Scaling action or an EC2 action (stop, terminate, reboot, or recover). Give the alarm a name and description. Click 'Create alarm'. Now, when the average CPU utilization exceeds 80% for two consecutive 5-minute periods, the alarm state changes to ALARM and the notification is sent.
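For comparison, here is roughly what the same alarm looks like when created through the boto3 CloudWatch client's `put_metric_alarm` call. The alarm name, instance ID, and SNS topic ARN are placeholders, and the code assumes boto3 and credentials are available when the alarm is actually created.

```python
def build_cpu_alarm(instance_id: str, sns_topic_arn: str) -> dict:
    """Alarm: average CPU > 80% for two consecutive 5-minute periods."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",   # hypothetical naming scheme
        "AlarmDescription": "Average CPU above 80% for 10 minutes",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,                 # seconds per datapoint (5 minutes)
        "EvaluationPeriods": 2,        # two consecutive datapoints
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],  # notify this SNS topic on ALARM
    }


def create_cpu_alarm(instance_id: str, sns_topic_arn: str) -> None:
    """Create the alarm (requires boto3 and AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder works without it
    boto3.client("cloudwatch").put_metric_alarm(**build_cpu_alarm(instance_id, sns_topic_arn))


alarm = build_cpu_alarm(
    "i-0123456789abcdef0",                              # placeholder instance ID
    "arn:aws:sns:us-east-1:111122223333:ops-alerts",    # placeholder topic ARN
)
print(alarm["AlarmName"], alarm["Period"] * alarm["EvaluationPeriods"], "seconds")
```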
Send EC2 logs to CloudWatch Logs
To send system logs from an EC2 instance to CloudWatch Logs, install the CloudWatch agent on the instance. For Amazon Linux 2, you can use the command: `sudo yum install amazon-cloudwatch-agent`. Create a configuration file (e.g., /opt/aws/amazon-cloudwatch-agent/bin/config.json) that specifies which log files to collect (e.g., /var/log/messages) and the log group name. Start the agent. The agent will automatically send log events to CloudWatch Logs. In the CloudWatch console, under 'Log groups', you'll see your log group. You can search logs, create metric filters, and set alarms. For example, create a metric filter that counts occurrences of 'ERROR' and then set an alarm when the count exceeds a threshold.
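The configuration file is plain JSON. The sketch below generates a minimal logs-only configuration for the /var/log/messages example. The log group name is a placeholder, and a real configuration usually carries more sections (for instance a `metrics` section for memory and disk), so treat this as a starting point rather than a complete file.

```python
import json


def build_agent_config(log_group_name: str) -> dict:
    """Minimal CloudWatch agent config that ships /var/log/messages to one log group."""
    return {
        "logs": {
            "logs_collected": {
                "files": {
                    "collect_list": [{
                        "file_path": "/var/log/messages",
                        "log_group_name": log_group_name,
                        # {instance_id} is a placeholder the agent fills in itself.
                        "log_stream_name": "{instance_id}",
                    }]
                }
            }
        }
    }


config_text = json.dumps(build_agent_config("example-system-logs"), indent=2)
print(config_text)  # save this output as the agent's config.json
```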
Use CloudTrail Insights to detect anomalies
In the CloudTrail console, select a trail that has CloudTrail Insights enabled (you can enable it when creating or editing a trail). Insights monitors management events and establishes a baseline of normal API activity for your account. If it detects unusual patterns, such as a sudden spike in TerminateInstances calls, it generates an Insights event. These events are stored in a separate folder in the S3 bucket (prefix: `AWSLogs/.../CloudTrail-Insight/`). You can also view Insights events in the CloudTrail console under 'Insights'. This helps you identify potential security threats or operational issues proactively. Note that Insights has an additional cost based on the number of events analyzed.
Scenario 1: E-commerce Platform Performance Monitoring
An e-commerce company runs a web application on EC2 instances behind an Application Load Balancer (ALB). They use CloudWatch to monitor key metrics: ALB request count, latency (p99), EC2 CPU utilization, and RDS connection count. They set CloudWatch Alarms to notify the DevOps team when latency exceeds 2 seconds or CPU is above 80% for 5 minutes. When an alarm triggers, an SNS notification sends an email to the on-call engineer, who can then investigate and scale out the Auto Scaling group. Additionally, they use CloudWatch Logs to collect application logs from EC2 instances. They create a metric filter to count "5xx" errors and set an alarm for high error rates. This setup ensures they can detect and respond to performance issues quickly. Cost: CloudWatch metrics and alarms are low cost; CloudWatch Logs ingestion and storage costs can be significant if logs are verbose. They mitigate this by setting log retention to 30 days and using metric filters to extract only what they need.
Scenario 2: Security Incident Investigation with CloudTrail
A financial services company needs to comply with SOC 2 and must audit all changes to their AWS environment. They enable CloudTrail in all regions with log file validation enabled. They also enable data events for S3 buckets containing sensitive customer data. When a security analyst notices an unauthorized S3 bucket deletion, they go to CloudTrail Event History and filter by 'DeleteBucket'. They find the event, which shows the IAM user who called the API, the source IP address, and the time. They discover that the user's credentials were compromised. They then use CloudTrail Insights to check for other unusual activity. The trail logs are stored in an S3 bucket with a lifecycle policy that moves older logs to Glacier after 90 days for long-term retention. They also send CloudTrail logs to CloudWatch Logs to set an alarm that triggers when a security group is modified (a critical control). This setup helps them detect and investigate security incidents quickly.
Scenario 3: Misconfiguration Leading to Oversight
A startup enables CloudTrail but forgets to enable data events for S3. Later, they suspect a data breach but cannot find any API calls for GetObject or PutObject in CloudTrail because those data events were not logged. They mistakenly think no access occurred. In reality, the logs are missing critical evidence. Also, they set CloudWatch alarms only on EC2 CPU, but not on memory or disk. When an instance runs out of memory, the application crashes, but no alarm triggers because memory metrics are not sent by default (they require the CloudWatch agent). They learn the hard way that monitoring must be comprehensive. This scenario highlights the importance of understanding default vs. optional features.
What CLF-C02 Tests on This Objective
Domain 3.5: Identify the purposes of monitoring and logging services (Amazon CloudWatch, AWS CloudTrail). The exam expects you to:
Differentiate between CloudWatch (monitoring performance, metrics, logs, alarms) and CloudTrail (auditing API calls).
Know that CloudTrail records management events by default, and data events are optional.
Understand that CloudWatch can monitor EC2, RDS, Lambda, and custom metrics.
Know that CloudTrail logs are stored in S3 and can be delivered to CloudWatch Logs.
Recognize that CloudWatch Alarms can trigger SNS, Auto Scaling, and EC2 actions.
Know that CloudTrail Event History is free and retains 90 days of management events.
Common Wrong Answers and Why Candidates Choose Them
*"CloudTrail monitors CPU utilization."* – Wrong because CloudTrail is for API auditing, not performance metrics. Candidates confuse the two services.
*"CloudWatch records who deleted an S3 bucket."* – Wrong because CloudWatch does not track user identity; CloudTrail does. Candidates think all monitoring is in CloudWatch.
*"CloudTrail is only for management events."* – Wrong. Management events are the only events recorded by default, but data events can be enabled on a trail. The exam might ask whether data events are included by default (they are not).
*"CloudWatch Logs can replace CloudTrail for auditing."* – Wrong because CloudWatch Logs stores log data but does not automatically capture API calls with user identity; CloudTrail is purpose-built for auditing.
Specific Terms and Values That Appear on the Exam
"90 days" – CloudTrail Event History retention.
"15 months" – CloudWatch metric retention.
"Management events" vs "Data events" – CloudTrail terminology.
"CloudWatch Agent" – needed for memory and disk metrics.
"SNS" – common alarm action.
"Log file validation" – CloudTrail integrity feature.
Tricky Distinctions
CloudWatch vs CloudTrail: If the scenario mentions "performance" or "metrics", think CloudWatch. If it mentions "audit", "who", or "API call", think CloudTrail.
CloudWatch Logs vs CloudTrail: CloudWatch Logs can ingest CloudTrail logs, but they are fundamentally different services. The exam may ask: "Which service records API calls?" Answer: CloudTrail.
CloudWatch Alarms vs AWS Config: Config is for resource configuration changes, not performance. Alarms are for metrics.
Decision Rule for Multiple Choice
If the question asks about "monitoring resource utilization" → CloudWatch.
If it asks about "tracking API calls for compliance" → CloudTrail.
If it asks about "real-time notification of high CPU" → CloudWatch Alarm.
If it asks about "retaining API logs for 7 years" → CloudTrail with S3 lifecycle.
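If it helps to see the decision rule as logic, here is a toy keyword matcher. It is purely a memory aid: the keyword lists are an arbitrary choice made for this sketch, real exam questions need to be read in full, and the function name is hypothetical.

```python
def pick_service(question: str) -> str:
    """Toy keyword version of the decision rule; audit-style cues are checked first."""
    q = question.lower()
    if any(k in q for k in ("api call", "api log", "audit", "who ")):
        return "CloudTrail"
    if any(k in q for k in ("notification", "alert", "alarm")):
        return "CloudWatch Alarm"
    if any(k in q for k in ("performance", "metric", "utilization", "cpu")):
        return "CloudWatch"
    return "unclear"


for q in (
    "monitoring resource utilization",
    "tracking API calls for compliance",
    "real-time notification of high CPU",
    "retaining API logs for 7 years",
):
    print(q, "->", pick_service(q))
```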
CloudWatch is for monitoring metrics, logs, and setting alarms; CloudTrail is for auditing API calls.
CloudTrail records management events by default; data events (e.g., S3 object-level) must be enabled separately.
CloudWatch Agent is required to collect memory and disk metrics from EC2 instances.
CloudTrail Event History provides 90 days of free management event visibility.
CloudWatch Alarms can trigger SNS, Auto Scaling, EC2 actions, and more.
CloudTrail logs can be delivered to CloudWatch Logs for real-time monitoring and alerting.
Log file validation in CloudTrail ensures log integrity using cryptographic hashing.
These come up on the exam all the time. Here's how to tell them apart.
Amazon CloudWatch
Purpose: Monitor performance and operational health of AWS resources and applications
Data collected: Metrics, logs, events (e.g., CPU utilization, error counts)
Default behavior: Collects basic metrics automatically for many services
Retention: Metrics stored for up to 15 months; logs retention configurable
Use cases: Performance troubleshooting, capacity planning, real-time alerts
AWS CloudTrail
Purpose: Audit API activity for governance, compliance, and security
Data collected: API call records (who, what, when, source IP)
Default behavior: Records management events automatically; data events opt-in
Retention: Event History 90 days free; S3 storage for longer (cost applies)
Use cases: Security investigations, compliance audits, change tracking
Mistake
CloudTrail records all API calls by default, including data events.
Correct
CloudTrail records management events by default. Data events (e.g., S3 GetObject, Lambda Invoke) are not recorded unless you explicitly enable them in a trail.
Mistake
CloudWatch can tell me who deleted an S3 bucket.
Correct
CloudWatch monitors metrics and logs, but it does not capture user identity for API calls. That is the function of CloudTrail. CloudWatch can ingest CloudTrail logs, but the user identity comes from CloudTrail.
Mistake
CloudTrail logs are stored indefinitely in Event History.
Correct
CloudTrail Event History retains management events for 90 days only. For longer retention, you must create a trail that delivers logs to an S3 bucket.
Mistake
CloudWatch automatically collects memory and disk metrics from EC2 instances.
Correct
By default, EC2 only sends hypervisor-level metrics like CPU, network, and disk I/O. Memory utilization and disk space utilization are measured inside the operating system, so they require the CloudWatch agent to be installed on the instance.
Mistake
CloudWatch Alarms can only send SNS notifications.
Correct
CloudWatch Alarms can trigger multiple actions: SNS notifications, Auto Scaling policies, EC2 actions (stop, terminate, reboot, recover), and Systems Manager actions. The exam may test this variety.
CloudWatch monitors the performance and operational health of your AWS resources (metrics, logs, alarms). CloudTrail records API calls made in your account for auditing and compliance. In short: CloudWatch tells you how your resources are performing; CloudTrail tells you who did what and when. For the CLF-C02 exam, remember that if a question mentions 'monitoring CPU' or 'setting alarms', it's CloudWatch. If it mentions 'auditing API calls' or 'tracking user activity', it's CloudTrail.
CloudTrail does not record every API call by default. It records management events (e.g., creating or deleting resources) by default. Data events (e.g., S3 GetObject, Lambda invocation) are not recorded unless you explicitly enable them in a trail. This is a common exam trap: the question may say 'CloudTrail records every API call' – that's false because data events are opt-in. Also, CloudTrail does not record all AWS service-to-service calls; it focuses on API calls made by users, roles, or services on your behalf.
CloudWatch metrics are stored for up to 15 months. The granularity decreases over time: data points with a period under 60 seconds (high-resolution custom metrics) are available for 3 hours, 1-minute data points for 15 days, 5-minute data points for 63 days, and 1-hour data points for 455 days (15 months). This is important for the exam – you may be asked about the maximum retention period. Custom metrics follow the same retention schedule.
You can monitor on-premises servers with CloudWatch by installing the CloudWatch agent. The agent collects metrics and logs and sends them to CloudWatch. This is a common exam scenario: a hybrid environment where you need a single monitoring solution for both AWS and on-premises. CloudWatch also supports collecting metrics from on-premises via the PutMetricData API, but the agent is the recommended approach.
CloudTrail Insights is an optional feature that detects unusual API activity in your account by comparing call rates and error rates against a baseline of normal activity. For example, it can identify a spike in instance terminations or an abnormal rate of API errors. Insights events are recorded and stored in a separate folder in your S3 bucket. This feature helps with proactive security and operational anomaly detection. Note that Insights has an additional cost.
Enable log file validation when creating or editing a trail. CloudTrail uses SHA-256 hashing and digital signatures (using RSA) to create a digest file that proves the log files have not been modified. You can use the AWS CLI or SDK to validate the integrity of log files. This is crucial for compliance frameworks like PCI DSS and SOC.
CloudWatch Logs is a service for storing, monitoring, and analyzing log files from various sources (EC2, Lambda, on-premises). CloudTrail is specifically for recording AWS API calls. You can send CloudTrail logs to CloudWatch Logs to set alarms on API activity, but they are separate services. The exam may test this: if you need to monitor application logs, use CloudWatch Logs; if you need to audit API calls, use CloudTrail.