DVA-C02 | Chapter 88 of 101 | Objective 4.2

CloudWatch Alarms: Composite and Metric Math

This chapter covers CloudWatch composite alarms and metric math, two advanced features that support more precise monitoring and fewer false alarms. Composite alarms combine the states of multiple alarms using Boolean logic, while metric math creates new time series from existing metrics using arithmetic and functions. These topics fall under Domain 4 (Troubleshooting and Optimization), task statement 4.2, of the DVA-C02 exam guide. Mastering them helps you design cost-effective, low-noise alerting and answer questions about aggregating metrics or alarms so that actions fire only when multiple conditions are met.

25 min read
Intermediate
Updated May 31, 2026

Composite Alarm as a Fire Alarm Panel

Imagine a large building with multiple fire sensors: smoke detectors on each floor, heat sensors in the kitchen, and manual pull stations in corridors. Each sensor sends a simple 'alarm' or 'ok' signal to a central fire alarm panel. The panel is programmed with logical rules: if any two smoke detectors on the same floor go off, or if a smoke detector AND a heat sensor trigger simultaneously, then sound the building-wide evacuation alarm. A composite alarm works the same way. Each individual sensor is a standard CloudWatch metric alarm, and the composite alarm is the panel that combines these inputs using AND/OR logic. Instead of sounding a siren, the composite alarm transitions to ALARM state, which can trigger an action such as an SNS notification. The key benefit: you avoid false alarms (a single piece of burnt toast won't evacuate the whole building) while still catching real fires quickly. In AWS terms, you define a composite alarm with a rule expression like 'ALARM("HighCPU") OR ALARM("HighMemory")', and CloudWatch re-evaluates it as the states of the referenced alarms change. The composite alarm has its own state (OK, ALARM, INSUFFICIENT_DATA) derived from the rule. This pattern reduces noise and ensures actions fire only when the combined condition holds.

How It Actually Works

What Are Composite Alarms and Why Do They Exist?

Composite alarms are a CloudWatch feature introduced to solve the problem of alarm noise and complex condition logic. Before composite alarms, you could only create simple threshold-based alarms on a single metric or a math expression. If you needed to trigger an action only when multiple conditions were true simultaneously (e.g., high CPU AND high memory), you had to create separate alarms and then use downstream logic in a Lambda function or SNS subscriber to evaluate the combined state. This was cumbersome and introduced latency. Composite alarms allow you to define a rule that references other alarms (called "child alarms") and evaluates their states using AND, OR, and NOT operators. The composite alarm itself has a state (OK, ALARM, INSUFFICIENT_DATA) that changes based on the rule. This simplifies architecture, reduces the number of actions, and provides a single point of truth.

How Composite Alarms Work Internally

A composite alarm monitors the state of its child alarms. Each child alarm is a standard CloudWatch alarm (metric alarm) that evaluates its own metric or expression. The composite alarm does not directly evaluate metrics; it only reads the state of child alarms. The state of a child alarm can be OK, ALARM, or INSUFFICIENT_DATA. The composite alarm rule is an expression that combines these states using:
- ALARM("alarm-name"): true if the named alarm is in ALARM state.
- OK("alarm-name"): true if the named alarm is in OK state.
- INSUFFICIENT_DATA("alarm-name"): true if the named alarm has insufficient data.
- AND, OR, NOT: logical operators.
- Parentheses for grouping.

A composite alarm has no period of its own. CloudWatch re-evaluates the rule as the states of the referenced child alarms change, reading each child's current state and computing the result. If the rule evaluates to true, the composite alarm transitions to ALARM; if false, to OK. If a child alarm is in INSUFFICIENT_DATA and the rule cannot be determined (e.g., an OR condition where one alarm is INSUFFICIENT_DATA and the other is OK), the composite alarm may go to INSUFFICIENT_DATA.
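To make the rule semantics concrete, here is a minimal Python sketch of a rule evaluator. It is an illustration, not CloudWatch's implementation: the names `tokenize` and `evaluate_rule` are hypothetical, the precedence (NOT, then AND, then OR) is an assumption, and the sketch only returns ALARM or OK, ignoring the INSUFFICIENT_DATA outcome described above.

```python
import re

# One token: a state function with a quoted alarm name, an operator, or a parenthesis.
TOKEN = re.compile(r'\s*(?:(ALARM|OK|INSUFFICIENT_DATA)\("([^"]+)"\)|(AND|OR|NOT)|([()]))')


def tokenize(rule):
    tokens, pos = [], 0
    while pos < len(rule):
        match = TOKEN.match(rule, pos)
        if match is None:
            if not rule[pos:].strip():
                break  # only trailing whitespace is left
            raise ValueError(f"unexpected text: {rule[pos:]!r}")
        func, name, operator, paren = match.groups()
        tokens.append(("STATE", func, name) if func else (operator or paren,))
        pos = match.end()
    return tokens


def evaluate_rule(rule, child_states):
    """Return "ALARM" if the rule is true for the given child states, else "OK"."""
    tokens = tokenize(rule)
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        value = parse_and()
        while peek() == "OR":
            pos += 1
            right = parse_and()  # parse first so the tokens are always consumed
            value = value or right
        return value

    def parse_and():
        nonlocal pos
        value = parse_not()
        while peek() == "AND":
            pos += 1
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        nonlocal pos
        if peek() == "NOT":
            pos += 1
            return not parse_not()
        return parse_atom()

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("rule ended unexpectedly")
        token = tokens[pos]
        pos += 1
        if token[0] == "STATE":
            _, func, name = token
            return child_states[name] == func  # e.g. ALARM("x") is true iff x is in ALARM
        if token[0] == "(":
            value = parse_or()
            if peek() != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        raise ValueError(f"unexpected token: {token[0]}")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return "ALARM" if result else "OK"
```

For example, with HighCPU in OK and both HighMemory and HighDisk in ALARM, the rule ALARM("HighCPU") OR (ALARM("HighMemory") AND ALARM("HighDisk")) evaluates to ALARM in this sketch. Writing explicit parentheses in real rules avoids depending on any precedence assumption.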

Key Components, Values, Defaults, and Timers

Child alarms: Up to 100 child alarms can be referenced in a single composite alarm rule. Each child alarm must be in the same Region and AWS account.

Rule syntax: The rule is a string of up to 10,240 characters. Example: ALARM("HighCPU") OR (ALARM("HighMemory") AND ALARM("HighDisk")).

Evaluation period: None. A composite alarm has no period or evaluation-periods setting; CloudWatch re-evaluates the rule as child alarm states change.

State transition: Composite alarms have the same three states as metric alarms: OK, ALARM, INSUFFICIENT_DATA. They transition based on the rule evaluation alone; there are no datapoints, so there is no missing-data treatment to configure.

Actions: You can configure actions for OK, ALARM, and INSUFFICIENT_DATA states. Supported actions include SNS topics, Lambda functions, and Systems Manager OpsItems. Unlike metric alarms, composite alarms cannot run EC2 actions (stop, terminate, reboot) or Auto Scaling policies.

Alarm name: Must be unique within the account and Region. Names can include up to 255 characters.

Cost: Composite alarms incur charges per composite alarm per month. Each child alarm is billed separately. Composite alarms do not incur additional metric costs.

Configuration and Verification

You can create composite alarms via the AWS Management Console, AWS CLI, or SDK. Using the AWS CLI:

aws cloudwatch put-composite-alarm \
    --alarm-name "MyCompositeAlarm" \
    --alarm-rule "ALARM(\"HighCPU\") OR ALARM(\"HighMemory\")" \
    --alarm-actions arn:aws:sns:us-east-1:123456789012:MyTopic \
    --ok-actions arn:aws:sns:us-east-1:123456789012:MyTopic \
    --insufficient-data-actions arn:aws:sns:us-east-1:123456789012:MyTopic

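To verify, describe the alarm. Include --alarm-types CompositeAlarm, because describe-alarms returns only metric alarms when that option is omitted:

aws cloudwatch describe-alarms --alarm-names "MyCompositeAlarm" --alarm-types CompositeAlarm

Look for StateValue and StateReason in the CompositeAlarms list of the response. You can also view the composite alarm in the CloudWatch console under Alarms > Composite alarms.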
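The same create-and-verify flow can be sketched in Python. The block below only builds and reads plain dictionaries shaped like the PutCompositeAlarm request and the DescribeAlarms response; the helper names `composite_alarm_params` and `find_composite_state` are hypothetical, and the sample response is made up for illustration. With an SDK client you would typically pass the first dictionary as keyword arguments to the put-composite-alarm call.

```python
def composite_alarm_params(name, rule, topic_arn):
    """Build keyword arguments shaped like a PutCompositeAlarm request."""
    if not rule.strip():
        raise ValueError("rule must not be empty")
    return {
        "AlarmName": name,
        "AlarmRule": rule,
        "ActionsEnabled": True,
        "AlarmActions": [topic_arn],
        "OKActions": [topic_arn],
        "InsufficientDataActions": [topic_arn],
    }


def find_composite_state(describe_response, name):
    """Return (StateValue, StateReason) for one composite alarm, or None."""
    for alarm in describe_response.get("CompositeAlarms", []):
        if alarm["AlarmName"] == name:
            return alarm["StateValue"], alarm.get("StateReason", "")
    return None


params = composite_alarm_params(
    "MyCompositeAlarm",
    'ALARM("HighCPU") OR ALARM("HighMemory")',
    "arn:aws:sns:us-east-1:123456789012:MyTopic",
)

# Hypothetical response, trimmed to the fields used above.
sample_response = {
    "CompositeAlarms": [
        {"AlarmName": "MyCompositeAlarm", "StateValue": "OK", "StateReason": "example reason"}
    ],
    "MetricAlarms": [],
}
state = find_composite_state(sample_response, "MyCompositeAlarm")
```

Building the rule as a single-quoted Python string avoids the backslash escaping that the shell version needs.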

Interaction with Related Technologies

Composite alarms integrate with:
- SNS: Most common action. Send notifications to email, SMS, Lambda, etc.
- Lambda: Invoke a function directly on a state change.
- Systems Manager OpsCenter: Create OpsItems automatically.
- CloudWatch Dashboards: Display composite alarm state on a dashboard widget.
- AWS Chatbot: Send notifications to Slack or Chime channels through an SNS topic.

Unlike metric alarms, composite alarms cannot run EC2 actions (stop, terminate, reboot, recover) or Auto Scaling policies directly. To scale or act on an instance from a composite condition, route the notification through SNS or Lambda.

Composite alarms are often used in conjunction with metric math alarms. For example, you might create a metric math expression that computes a custom metric (like error rate = errors / total requests), create an alarm on that expression, and then use that alarm as a child in a composite alarm with other conditions.

What Is Metric Math and Why Does It Exist?

Metric math allows you to query multiple CloudWatch metrics and apply mathematical expressions (arithmetic, functions, comparisons) to create new time series. This is useful when you need a metric that is not directly emitted, such as a ratio, sum, or custom aggregation. For example, you can average CPU utilization across several instances or compute an error rate from error and request counts. Metric math expressions can be used in graphs and dashboards, in alarms, and through the GetMetricData API.

How Metric Math Works Internally

When you define a metric math expression, you specify a list of metrics (each with its own ID, like "m1", "m2") and an expression that references those IDs. The expression can use:
- Arithmetic: +, -, *, /, ^
- Functions: SUM, AVG, MIN, MAX, STDDEV, METRIC_COUNT, DATAPOINT_COUNT, METRICS, FILL, RATE, DIFF, TIME_SERIES, IF, CEIL, FLOOR, ABS, LOG, etc. Percentiles are not functions; request them as a statistic (e.g., p95) on the metric.
- Comparison: <, >, <=, >=, ==, !=
- Logical: AND, OR
- Constants: Numbers like 100, 0.5

The result is a new time series with the same timestamps and period as the input metrics, but with the expression applied point-by-point. If metrics have different periods, CloudWatch will resample them to the specified period (default 60 seconds). Missing data points can be treated using FILL (e.g., FILL(m1, 0) fills missing values with 0).
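The point-by-point behavior can be illustrated with a small Python sketch. It models each metric as a dictionary from timestamp to value and mimics, rather than reproduces, what CloudWatch does; the function names `fill` and `error_rate` and the sample numbers are hypothetical.

```python
def fill(series, timestamps, value):
    """Mimic FILL(m, value): substitute `value` wherever a datapoint is missing."""
    return {t: series.get(t, value) for t in timestamps}


def error_rate(errors, requests, timestamps):
    """Apply (m1 / m2) * 100 at each timestamp where both inputs have a usable value."""
    result = {}
    for t in timestamps:
        if t in errors and t in requests and requests[t] != 0:
            result[t] = 100 * errors[t] / requests[t]
    return result


timestamps = [0, 60, 120]            # one datapoint per 60-second period
errors = {0: 5, 120: 1}              # no Errors datapoint at t=60
requests = {0: 100, 60: 50, 120: 20}

without_fill = error_rate(errors, requests, timestamps)
with_fill = error_rate(fill(errors, timestamps, 0), requests, timestamps)
```

Without FILL the output has a gap at t=60 because one input is missing there; with FILL(m1, 0) the same timestamp yields an error rate of 0.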

Key Components and Defaults for Metric Math

Expression syntax: Up to 2048 characters. Example: (m1 / m2) * 100 where m1 is errors, m2 is requests.

Metrics list: Each metric has an ID (e.g., "m1"), namespace, metric name, dimensions, and optionally a period and stat.

Period: The time interval over which each metric is aggregated, in seconds. Period is a required field of MetricStat in the API; the examples in this chapter use 60. Use the same period for every metric in an alarm expression.

Stat: The statistic to use (e.g., Sum, Average, SampleCount, or a percentile such as p95). Stat is also required in MetricStat.

Return data: ReturnData defaults to true for every entry. An alarm accepts exactly one entry that returns data (the expression it watches), so set ReturnData to false on every other metric and intermediate expression.

Label: Optional label for the resulting time series.

Configuration and Verification

You can use metric math in dashboards and alarms. For an alarm, you define the metric math in the Metrics array of the alarm configuration. Example using AWS CLI:

aws cloudwatch put-metric-alarm \
    --alarm-name "ErrorRateAlarm" \
    --alarm-description "Alarm when error rate exceeds 5%" \
    --metrics '[
        {"Id": "e1", "Expression": "(m1 / m2) * 100", "ReturnData": true},
        {"Id": "m1", "ReturnData": false, "MetricStat": {"Metric": {"Namespace": "MyApp", "MetricName": "Errors"}, "Period": 60, "Stat": "Sum"}},
        {"Id": "m2", "ReturnData": false, "MetricStat": {"Metric": {"Namespace": "MyApp", "MetricName": "Requests"}, "Period": 60, "Stat": "Sum"}}
    ]' \
    --evaluation-periods 2 \
    --threshold 5 \
    --comparison-operator GreaterThanThreshold

To verify, describe the alarm and inspect the Metrics array. You can also test expressions in the CloudWatch console by building a graph with metric math.
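When the Metrics array is generated in code, a quick local check can catch the most common mistakes before the request is sent. The following Python sketch builds the same error-rate queries and verifies that IDs are unique and that exactly one entry returns data. The helper names are hypothetical, and the check is a local convenience, not a substitute for the service's own validation.

```python
def error_rate_queries(namespace, period=60):
    """Build a Metrics array for an alarm on (Errors / Requests) * 100."""

    def metric(query_id, metric_name):
        return {
            "Id": query_id,
            "ReturnData": False,  # input only; the alarm watches e1
            "MetricStat": {
                "Metric": {"Namespace": namespace, "MetricName": metric_name},
                "Period": period,
                "Stat": "Sum",
            },
        }

    return [
        {"Id": "e1", "Expression": "(m1 / m2) * 100", "ReturnData": True},
        metric("m1", "Errors"),
        metric("m2", "Requests"),
    ]


def check_queries(queries):
    """Return the Id of the single entry that returns data, or raise ValueError."""
    ids = [q["Id"] for q in queries]
    if len(set(ids)) != len(ids):
        raise ValueError("every entry needs a unique Id")
    # ReturnData defaults to true when the key is omitted.
    returning = [q["Id"] for q in queries if q.get("ReturnData", True)]
    if len(returning) != 1:
        raise ValueError(f"exactly one entry must return data, found {returning}")
    return returning[0]


queries = error_rate_queries("MyApp")
watched = check_queries(queries)
```

Treating an omitted ReturnData as true in `check_queries` mirrors the default, which is why leaving the flag off the input metrics is flagged as an error.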

Interaction with Related Technologies

Metric math is used in:
- CloudWatch Dashboards: Create visualizations of derived metrics.
- CloudWatch Alarms: Set thresholds on derived metrics.
- GetMetricData API: Retrieve derived time series programmatically.
- CloudWatch Logs Insights: Not directly, but you can turn log data into metrics with metric filters and then use metric math on those metrics.

Common Patterns

Error rate: (m1 / m2) * 100 where m1 is error count and m2 is total requests.

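Aggregate across instances: SUM([m1, m2, m3]) or SUM(METRICS()), where each ID is the same metric for a different InstanceId, to get one total time series. SUM applied to a single time series returns one scalar total instead.

Percentile: request the p95 statistic on the metric itself (Stat: p95) for p95 latency. Percentiles are statistics, not metric math functions.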

Rate of change: RATE(m1) to compute per-second rate of a metric that is a count.

Fill missing data: FILL(m1, 0) to treat missing datapoints as zero.
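A few of these patterns are easy to picture with plain Python lists. The sketch below mimics DIFF, RATE, and SUM over an array of time series on lists of (timestamp in seconds, value) pairs. The function names and sample values are hypothetical, and edge cases such as missing datapoints are ignored.

```python
def diff(points):
    """Mimic DIFF(m): each value minus the previous value."""
    return [(t1, v1 - v0) for (_, v0), (t1, v1) in zip(points, points[1:])]


def rate(points):
    """Mimic RATE(m): change per second between consecutive datapoints."""
    return [(t1, (v1 - v0) / (t1 - t0)) for (t0, v0), (t1, v1) in zip(points, points[1:])]


def sum_across(series_list):
    """Mimic SUM([m1, m2, ...]): one total per timestamp across all series."""
    totals = {}
    for series in series_list:
        for timestamp, value in series:
            totals[timestamp] = totals.get(timestamp, 0) + value
    return sorted(totals.items())


counter = [(0, 100), (60, 160), (120, 130)]
instance_a = [(0, 1), (60, 2)]
instance_b = [(0, 10), (60, 20)]

deltas = diff(counter)
per_second = rate(counter)
fleet_total = sum_across([instance_a, instance_b])
```

Both `diff` and `rate` produce one fewer point than their input, because the first datapoint has no predecessor to compare against.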

Exam Tips

Composite alarms are for combining multiple alarm states, not metrics. Metric math is for combining metrics.

Composite alarms have no configurable period or evaluation periods; they are re-evaluated as child alarm states change.

Child alarms must exist before creating the composite alarm.

Metric math expressions can be used directly in alarms without needing a separate metric.

The ReturnData flag is crucial: in an alarm, exactly one entry returns data, so set it to false on every input metric and intermediate expression.

Be careful with period alignment: all metrics in an expression must have the same period, or CloudWatch will resample (which may cause unintended behavior).

Composite alarms can reference up to 100 child alarms.

The rule syntax uses ALARM("name"), OK("name"), INSUFFICIENT_DATA("name") functions.

Walk-Through

1. Define Child Metric Alarms

First, create the individual metric alarms that will serve as children. For example, create an alarm 'HighCPU' that triggers when CPU > 80% for 5 minutes, and 'HighMemory' when memory > 80%. These alarms must be in the same account and Region. Each alarm will evaluate its own metric and transition between OK, ALARM, and INSUFFICIENT_DATA states. They can have their own actions (e.g., SNS topics) but typically you rely on the composite alarm to trigger actions.
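As a rough sketch of this step, the Python below builds request parameters for the two child alarms using PutMetricAlarm field names. The helper `threshold_alarm_params` is hypothetical. The memory metric assumes the CloudWatch agent is publishing `mem_used_percent` in the `CWAgent` namespace, since EC2 does not emit memory metrics on its own. Dimensions are omitted for brevity; a real alarm would normally include them (for example an instance ID).

```python
def threshold_alarm_params(name, namespace, metric_name, threshold,
                           period=300, evaluation_periods=1):
    """Build keyword arguments shaped like a PutMetricAlarm request."""
    return {
        "AlarmName": name,
        "Namespace": namespace,
        "MetricName": metric_name,
        "Statistic": "Average",
        "Period": period,                          # seconds per datapoint
        "EvaluationPeriods": evaluation_periods,   # consecutive datapoints considered
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "missing",
    }


# CPU > 80% over one 5-minute period.
high_cpu = threshold_alarm_params("HighCPU", "AWS/EC2", "CPUUtilization", 80)

# Memory > 80%; namespace and metric name are assumptions (CloudWatch agent defaults).
high_memory = threshold_alarm_params("HighMemory", "CWAgent", "mem_used_percent", 80)

child_names = [high_cpu["AlarmName"], high_memory["AlarmName"]]
```

The "5 minutes" could equally be expressed as five 60-second periods; a single 300-second period reacts to the average over the window, while five short periods require each datapoint to breach.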

2. Write the Composite Alarm Rule

Define the Boolean expression using the alarm state functions. Example: `ALARM("HighCPU") AND ALARM("HighMemory")`. This rule will be true only when both child alarms are in ALARM state. You can use parentheses for grouping, e.g., `ALARM("HighCPU") OR (ALARM("HighMemory") AND ALARM("HighDisk"))`. The rule string must be <= 10,240 characters and can reference up to 100 unique child alarms.

3. Create the Composite Alarm

Use the AWS CLI, SDK, or Console to create the composite alarm. Specify the alarm name, the rule, and actions for OK, ALARM, and INSUFFICIENT_DATA states. The composite alarm starts in INSUFFICIENT_DATA, is then evaluated, and moves to the appropriate state. All child alarms must already exist; otherwise, creation will fail.

4. Composite Alarm Evaluates Rule

Whenever a referenced child alarm changes state, CloudWatch re-evaluates the composite alarm rule against the current state of every child. If the rule evaluates to true, the composite alarm transitions to ALARM; if false, to OK. If a child alarm is INSUFFICIENT_DATA and the rule cannot be determined (e.g., an AND where one child is INSUFFICIENT_DATA and the other is ALARM), the composite alarm may become INSUFFICIENT_DATA.

5. Trigger Actions on State Change

When the composite alarm changes state, it executes the configured actions for that state. For example, if it goes to ALARM, it can publish to an SNS topic, invoke a Lambda function, or create an OpsItem. Unlike metric alarms, composite alarms cannot run EC2 or Auto Scaling actions directly. Composite alarms do not evaluate metrics themselves; they only react to child alarm state changes. This reduces the number of notifications and ensures actions fire only when the composite condition is met.
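The noise reduction in this walk-through can be seen in a short simulation. The Python sketch below replays a hypothetical sequence of child state changes and records a notification only when the composite state itself changes. The rule is passed as a plain Python function for brevity, and all names and timings are illustrative.

```python
def simulate(rule, initial_states, events):
    """Replay child state changes; return (time, new_state) for each composite transition."""
    states = dict(initial_states)
    composite = "ALARM" if rule(states) else "OK"
    notifications = []
    for when, child, new_state in events:
        states[child] = new_state
        current = "ALARM" if rule(states) else "OK"
        if current != composite:  # actions fire only on a composite state change
            notifications.append((when, current))
            composite = current
    return notifications


def both_high(states):
    """Equivalent of ALARM("HighCPU") AND ALARM("HighMemory")."""
    return states["HighCPU"] == "ALARM" and states["HighMemory"] == "ALARM"


events = [
    (10, "HighCPU", "ALARM"),     # CPU spike alone: composite stays OK
    (20, "HighMemory", "ALARM"),  # both breaching: composite goes to ALARM
    (30, "HighCPU", "OK"),        # CPU recovers: composite returns to OK
    (40, "HighMemory", "OK"),     # no composite change
]
notifications = simulate(both_high, {"HighCPU": "OK", "HighMemory": "OK"}, events)
```

Four child transitions produce only two composite notifications here, which is the effect the walk-through describes.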

What This Looks Like on the Job

Enterprise Scenario 1: E-Commerce Platform Auto Scaling

A large e-commerce platform uses Auto Scaling groups to manage its web tier. They want to scale out only when CPU utilization AND request latency are both high, to avoid scaling on CPU spikes from batch jobs. They create two metric alarms: 'HighCPU' (CPU > 70% for 3 minutes) and 'HighLatency' (latency > 500ms for 3 minutes). Then they create a composite alarm with rule ALARM("HighCPU") AND ALARM("HighLatency"). Because composite alarms cannot invoke Auto Scaling policies directly, the composite alarm publishes to an SNS topic, and a subscribed Lambda function performs the scale-out. This prevents unnecessary scale-outs during CPU spikes caused by background tasks. In production, they monitor the composite alarm state and adjust thresholds. A common misconfiguration is evaluating the child alarms over too short a window, which causes flapping; the 3-minute window on each child alarm smooths out spikes.

Enterprise Scenario 2: Multi-Service Health Monitoring

A SaaS company monitors the health of its microservices. They have alarms for each service: 'AuthServiceDown', 'PaymentServiceDown', 'InventoryServiceDown', etc. They want a single page when any two services are down simultaneously (indicating a broader issue). For three services, the rule must cover every pair: (ALARM("AuthServiceDown") AND ALARM("PaymentServiceDown")) OR (ALARM("AuthServiceDown") AND ALARM("InventoryServiceDown")) OR (ALARM("PaymentServiceDown") AND ALARM("InventoryServiceDown")). This composite alarm pages the on-call engineer via SNS->Lambda->PagerDuty. The benefit: a single service down doesn't page (it's handled by the individual alarm's own action), but a correlated failure does. In production, they have 20+ child alarms, and because the composite alarm is re-evaluated as child states change, detection closely follows the child alarms. They also set an INSUFFICIENT_DATA action to alert if too many child alarms have missing data.
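Writing an "any two of N" rule by hand gets error-prone as the number of services grows, so it can help to generate it. The Python sketch below builds the rule string from a list of alarm names using `itertools.combinations`; the function name `any_k_in_alarm` is hypothetical.

```python
from itertools import combinations


def any_k_in_alarm(alarm_names, k=2):
    """Build a rule that is true when at least k of the named alarms are in ALARM."""
    groups = []
    for combo in combinations(alarm_names, k):
        clause = " AND ".join(f'ALARM("{name}")' for name in combo)
        groups.append(f"({clause})")
    return " OR ".join(groups)


services = ["AuthServiceDown", "PaymentServiceDown", "InventoryServiceDown"]
rule = any_k_in_alarm(services)

# With 20 services there are 190 pairs, so the rule grows quickly.
big_rule = any_k_in_alarm([f"Svc{i}Down" for i in range(20)])
pair_count = big_rule.count(" OR ") + 1
```

Because the number of pairs grows quadratically, it is worth checking the generated string against the rule length limit; with many services or long alarm names, grouping services into intermediate composite alarms may be a more manageable design.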

Scenario 3: Cost Optimization with Metric Math

A data analytics company wants to alarm on error rate per request. They emit two metrics: Errors (count) and Requests (count) from their application. They create a metric math alarm using expression (m1 / m2) * 100 where m1 is Errors (Sum) and m2 is Requests (Sum), both with a period of 60 seconds. They set threshold to 5 (5% error rate). This avoids emitting a separate error rate metric, saving costs. They also wrap the denominator as FILL(m2, 1) so that a missing Requests datapoint is treated as 1 rather than leaving a gap. FILL only replaces missing datapoints, so a published value of 0 needs a separate guard such as IF. In production, they monitor the alarm and adjust the period to 300 seconds for a smoother trend. A common pitfall is forgetting to set ReturnData to false on m1 and m2: an alarm accepts exactly one entry that returns data, so the request is rejected.

Performance and Scale Considerations

Composite alarms can reference up to 100 child alarms. Account-level alarm quotas also apply, so check the Service Quotas console for current values before creating alarms in bulk. A metric math alarm can include up to 10 metrics. High-resolution metrics can be used in metric math, and high-resolution alarms use sub-minute periods such as 10 or 30 seconds. A metric math alarm is billed for each metric it references, which is usually still cheaper than publishing and storing an additional custom metric.

How DVA-C02 Actually Tests This

What DVA-C02 Tests on Composite Alarms and Metric Math

Under Domain 4 (Troubleshooting and Optimization), task statement 4.2, the exam tests your ability to design alarm systems that reduce noise and accurately detect issues. You will see questions that ask you to choose between composite alarms, metric math alarms, and simple metric alarms. The exam also tests your understanding of the rule syntax, state transitions, and how to use metric math to compute derived metrics.

Common Wrong Answers and Why Candidates Choose Them

1. Using metric math instead of composite alarm for combining alarm states: Candidates often think they can use a metric math expression like (ALARM("HighCPU") AND ALARM("HighMemory")) in a single alarm. This is incorrect because metric math operates on metric values, not alarm states. The correct approach is to create separate metric alarms and then a composite alarm referencing them.

2. Setting evaluation periods on composite alarms: Many candidates try to set an evaluation period on a composite alarm, but composite alarms have no such setting. They are re-evaluated as child alarm states change. Candidates confuse them with metric alarms, which have a configurable period and number of evaluation periods.

3. Creating composite alarm before child alarms: The exam may present a scenario where a developer creates a composite alarm referencing alarms that don't exist yet. The composite alarm creation will fail. The correct order is to create child alarms first.

4. Using composite alarms for metric aggregation: Some questions ask how to aggregate metrics across instances (e.g., total CPU). Candidates might incorrectly choose a composite alarm, but composite alarms combine alarm states, not metrics. The correct answer is metric math with the SUM function applied across the per-instance metrics.

Specific Numbers, Values, and Terms That Appear on the Exam

Composite alarm evaluation period: none to configure; the rule is re-evaluated as child alarm states change.

Maximum child alarms per composite alarm: 100.

Rule syntax: ALARM("alarm-name"), OK("alarm-name"), INSUFFICIENT_DATA("alarm-name").

Metric math functions: SUM, AVG, MIN, MAX, METRIC_COUNT, FILL, RATE, DIFF. Percentiles (e.g., p95) are statistics, not functions.

ReturnData flag: set to false for intermediate expressions.

Composite alarm states: OK, ALARM, INSUFFICIENT_DATA.

Metric math expression length limit: 2048 characters.

Number of metrics in a metric math alarm: up to 10.

Edge Cases and Exceptions

CloudWatch does not let you delete an alarm that a composite alarm still references. Update the composite alarm's rule (or delete the composite alarm) first, then delete the child alarm.

Composite alarms can reference other composite alarms. Avoid creating cycles: CloudWatch allows a cycle to be created, and the alarms in a cycle cannot be deleted until one rule is changed to break it.

Metric math with FILL function: FILL(m1, 0) turns missing datapoints into 0, which causes division by zero if m1 is used as a denominator. Use FILL(m2, 1) for denominators.

When using metric math in alarms, the alarm evaluates the expression at each evaluation period. If the expression returns no data (e.g., all metrics missing), the alarm treats missing data according to the treatMissingData setting (default: missing).

How to Eliminate Wrong Answers

If the question involves combining multiple conditions (e.g., high CPU AND high memory), look for composite alarm as the answer. If it involves computing a new metric (e.g., error rate), look for metric math.

If the question mentions 'reduce alarm noise' or 'avoid false positives', composite alarm is likely the answer.

If the question asks for 'aggregate metrics across instances', metric math with SUM or AVG is correct.

Pay attention to the phrase 'alarm state' vs 'metric value'. Composite alarms work with states, metric math works with values.

Remember that composite alarms have no period to configure and follow their child alarms' states; metric alarms have configurable evaluation periods.

Key Takeaways

Composite alarms combine alarm states using Boolean logic; they have no configurable period and are re-evaluated as child alarm states change.

Metric math combines metric values using arithmetic and functions; it produces a new time series.

Composite alarms can reference up to 100 child alarms; a metric math alarm can include up to 10 metrics.

Use composite alarms to reduce noise and trigger actions only when multiple conditions are met.

Use metric math to compute custom metrics like error rate, aggregate across instances, or compute rates of change.

Child alarms must exist before creating a composite alarm that references them.

In a metric math alarm, set ReturnData to false on every input metric and intermediate expression so that exactly one entry returns data.

The rule syntax for composite alarms uses ALARM("name"), OK("name"), and INSUFFICIENT_DATA("name").

Metric math functions include SUM, AVG, MIN, MAX, METRIC_COUNT, FILL, RATE, DIFF, and many more. Percentiles are statistics, not functions.

Composite alarms and metric math alarms both support SNS actions; EC2 and Auto Scaling actions are available only on metric alarms.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Composite Alarm

Combines states of other alarms using Boolean logic (AND, OR, NOT).

Has no configurable period; re-evaluated as child alarm states change.

Can reference up to 100 child alarms.

Useful for reducing false positives by requiring multiple conditions.

Does not compute new metric values; only reads alarm states.

Metric Math Alarm

Combines metric values using arithmetic and functions (SUM, AVG, etc.).

Evaluates over a configurable period and number of evaluation periods.

Can include up to 10 metrics per alarm.

Useful for creating derived metrics like error rate or a total across instances.

Produces a new time series that can be used in alarms or dashboards.

Watch Out for These

Mistake

Composite alarms can combine metric values directly.

Correct

Composite alarms only combine the states of other alarms (OK, ALARM, INSUFFICIENT_DATA). They do not evaluate metric values. To combine metric values, use metric math.

Mistake

You can set an evaluation period on a composite alarm.

Correct

Composite alarms have no period or evaluation-periods setting. Their state follows the rule as child alarm states change. Only metric alarms have configurable evaluation periods.

Mistake

Metric math can be used to combine alarm states.

Correct

Metric math operates on numeric metric values, not alarm states. To combine alarm states, you must use a composite alarm.

Mistake

Composite alarms can be created before child alarms.

Correct

All child alarms referenced in the rule must already exist at the time of composite alarm creation. Otherwise, the creation will fail.

Mistake

Metric math expressions can reference the same metric multiple times without issue.

Correct

Each metric in the metrics array must have a unique ID. You can reference the same metric name with different dimensions or periods, but each instance needs a distinct ID.


Frequently Asked Questions

What is the difference between a composite alarm and a metric math alarm?

A composite alarm combines the states of other alarms (OK, ALARM, INSUFFICIENT_DATA) using Boolean logic (AND, OR, NOT). It is used to trigger actions when multiple conditions are met, reducing false positives. A metric math alarm, on the other hand, combines metric values using arithmetic and functions (e.g., SUM, AVG, division) to create a new derived metric, and then sets a threshold on that derived metric. Metric math is used when you need a custom metric that is not directly emitted, such as an error rate. In essence: composite alarms work with alarm states, metric math works with metric values.

How often does a composite alarm evaluate?

A composite alarm has no evaluation schedule for you to configure. CloudWatch re-evaluates the rule as the states of its child alarms change, so the composite alarm reacts soon after a child transitions. In contrast, metric alarms have a configurable period (e.g., 60 seconds, 300 seconds) and number of evaluation periods.

Can a composite alarm reference another composite alarm?

Yes, a composite alarm can reference another composite alarm as a child. Use nesting with care, because it can lead to complex dependencies. CloudWatch allows circular references to be created (e.g., alarm A references alarm B, and alarm B references alarm A), and alarms in such a cycle cannot be deleted until one rule is changed to break the cycle. If you need complex logic, consider using a single composite alarm with multiple child alarms rather than nesting.

What happens to a composite alarm if a child alarm is deleted?

CloudWatch blocks the deletion of an alarm that a composite alarm still depends on. To remove a child alarm, first update the composite alarm rule to drop the reference (or delete the composite alarm), and then delete the child.

How do I handle division by zero in metric math?

FILL replaces only missing datapoints, not zeros. For an expression such as `m1 / m2`, use `m1 / FILL(m2, 1)` to treat a missing m2 as 1, or `FILL(m1, 0) / FILL(m2, 1)` to cover gaps in both metrics. If m2 can be published as 0, guard the division with an IF expression that checks m2 and returns a default value instead.
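The difference between a missing denominator and a zero denominator is easy to see in a small Python sketch. The function below mimics the combination of FILL for gaps and an IF-style guard for zeros; the name `safe_error_rate`, the default value, and the sample data are hypothetical, and the exact behavior of the equivalent CloudWatch expression should be confirmed by graphing it in the console.

```python
def safe_error_rate(errors, requests, timestamps, default=0.0):
    """Error rate in percent, with gaps filled and zero denominators guarded."""
    result = {}
    for t in timestamps:
        error_count = errors.get(t, 0)      # like FILL(m1, 0): missing becomes 0
        request_count = requests.get(t, 0)  # like FILL(m2, 0): missing becomes 0
        if request_count > 0:               # IF-style guard against dividing by zero
            result[t] = 100 * error_count / request_count
        else:
            result[t] = default
    return result


timestamps = [0, 60, 120]
errors = {0: 2}
requests = {0: 40, 60: 0}  # published 0 at t=60, missing entirely at t=120

rates = safe_error_rate(errors, requests, timestamps)
```

Returning a default of 0 means idle periods count as healthy; whether that is the right choice depends on whether "no traffic" should itself raise an alarm.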

What is the maximum number of child alarms in a composite alarm?

A composite alarm can reference up to 100 unique child alarms. This limit is per composite alarm. If you need more, you can create multiple composite alarms or use a hierarchical approach, but that may increase complexity.

Can I use metric math in a composite alarm?

No. Composite alarms only accept references to other alarms (by name). They cannot include metric math expressions. To use metric math, you must create a metric alarm that uses a metric math expression, and then reference that alarm as a child in a composite alarm.
