
CloudWatch Container Insights for ECS and EKS

This chapter covers CloudWatch Container Insights, a CloudWatch feature for collecting, aggregating, and visualizing metrics and logs from containerized applications running on Amazon ECS, Amazon EKS, and Kubernetes clusters on AWS. For the SOA-C02 exam, Container Insights falls under Domain 1: Monitoring, Logging, and Remediation, specifically Objective 1.1 (implement metrics, alarms, and filters by using AWS monitoring and logging services). Expect 2–3 questions on Container Insights, focusing on its setup, data sources, dashboards, and integration with CloudWatch Logs and ServiceLens. Mastery of this topic is essential for SysOps administrators who must ensure observability of container workloads at scale.

25 min read
Intermediate
Updated May 31, 2026

Container Metrics Like a Building Management System

Imagine a large office campus where each building has many rooms, each containing servers and applications. A building management system (BMS) monitors temperature, power usage, and occupancy per room, per floor, per building, and for the whole campus. CloudWatch Container Insights is like that BMS, but for containers. It collects metrics and logs from each container (room), each pod or task (floor), each node (building), and the cluster (campus). The BMS uses sensors (agents) installed in each room that report data to a central dashboard. Similarly, Container Insights uses the CloudWatch agent or the AWS Distro for OpenTelemetry (ADOT) collector to gather CPU, memory, network, and disk metrics. The BMS can also alert if a room's temperature exceeds a threshold; Container Insights metrics can trigger CloudWatch alarms when they cross a threshold. The campus manager can view a heatmap to see which floors are overloaded; Container Insights provides performance dashboards at the cluster, node, pod, and container levels. Without the BMS, the manager would have to check each room manually; without Container Insights, you'd have to SSH into nodes or run kubectl top commands by hand. The BMS also logs historical data for capacity planning; Container Insights keeps metrics in CloudWatch Metrics and detailed performance events in CloudWatch Logs for analysis. In short, Container Insights is the centralized observability layer that makes container infrastructure visible and actionable, just as a BMS makes building operations manageable.

How It Actually Works

What is CloudWatch Container Insights?

CloudWatch Container Insights is a managed observability service that collects, aggregates, and summarizes metrics and logs from containerized applications and microservices. It supports Amazon ECS (including Fargate), Amazon EKS, and Kubernetes clusters on AWS (self-managed or managed node groups). It provides out-of-the-box dashboards for cluster, node, pod, task, and container-level performance metrics (CPU, memory, network, disk). It also collects diagnostic information, such as container restarts and failures, and integrates with CloudWatch Logs for log monitoring.

Why It Exists

Before Container Insights, customers had to manually install third-party monitoring agents (e.g., Prometheus, Datadog) or write custom scripts to scrape metrics from container runtimes. This was complex, inconsistent, and did not leverage CloudWatch's native alerting and dashboarding. Container Insights solves this by providing a native, integrated solution that automatically populates CloudWatch with structured metrics and logs from container environments.

How It Works Internally

Container Insights uses a CloudWatch agent (for EC2-based clusters) or an AWS Distro for OpenTelemetry (ADOT) collector to scrape metrics from the container runtime and Kubernetes API. On ECS, the agent runs as a daemon service on each EC2 instance in the cluster. On EKS, it runs as a DaemonSet (one pod per node). The agent collects:

- Resource utilization metrics: CPU, memory, network, and disk usage at the container, pod, task, node, and cluster levels.
- Performance metrics: such as CPU throttling, memory limits, and OOM kills.
- Network metrics: bytes sent/received, packets dropped, and connection counts.
- Diagnostic data: container restart counts, failed pod counts, and deployment errors.

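The agent sends this data to CloudWatch Logs as structured performance log events (JSON in embedded metric format), and CloudWatch extracts the metrics from those events. Metrics appear in the ECS/ContainerInsights namespace for ECS and in the ContainerInsights namespace for EKS and Kubernetes. The performance log group is /aws/ecs/containerinsights/ClusterName/performance for ECS and /aws/containerinsights/ClusterName/performance for EKS and Kubernetes. These logs contain detailed performance data that can be queried using CloudWatch Logs Insights.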
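Because the namespace and log group differ by platform, it can help to derive both from one place in scripts. The following is a minimal sketch; the function name `container_insights_names` and the platform labels are illustrative assumptions, not part of any AWS SDK.

```python
def container_insights_names(platform: str, cluster_name: str) -> dict:
    """Return the metric namespace and performance log group for a cluster.

    `platform` is a label chosen for this sketch: "ecs", "eks", or "kubernetes".
    """
    if platform == "ecs":
        return {
            "namespace": "ECS/ContainerInsights",
            "log_group": f"/aws/ecs/containerinsights/{cluster_name}/performance",
        }
    if platform in ("eks", "kubernetes"):
        return {
            "namespace": "ContainerInsights",
            "log_group": f"/aws/containerinsights/{cluster_name}/performance",
        }
    raise ValueError(f"unknown platform: {platform!r}")


if __name__ == "__main__":
    print(container_insights_names("eks", "my-cluster"))
```

The returned values can be passed to CLI calls such as `aws cloudwatch list-metrics --namespace ...` or used as the log group in a Logs Insights query.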

Key Components, Values, Defaults, and Timers

Metric Namespace: Container Insights metrics are published under the ECS/ContainerInsights namespace for ECS and the ContainerInsights namespace for EKS and Kubernetes.

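Metric Dimensions: Metrics are organized by dimensions that depend on the platform and collection mode. ECS metrics use dimensions such as ClusterName, ServiceName, and TaskDefinitionFamily; EKS and Kubernetes metrics use dimensions such as ClusterName, Namespace, NodeName, PodName, and Service. Finer-grained dimensions such as TaskId and ContainerName depend on the collection mode in use.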

Data Collection Interval: The agent collects metrics every 60 seconds by default. The interval is configurable, but intervals shorter than 15 seconds are not recommended because they increase API calls and the risk of throttling.

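Retention: Metrics follow standard CloudWatch retention, which depends on the data point period: periods shorter than 60 seconds are kept for 3 hours, 1-minute data points for 15 days, 5-minute data points for 63 days, and 1-hour data points for 455 days (15 months). Older data remains available only at the coarser resolutions.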
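The retention tiers can be expressed as a small lookup. This is a simplified sketch of the standard CloudWatch schedule; the function name is hypothetical, and periods that fall between the published tiers are mapped to the nearest finer tier as an assumption.

```python
def metric_retention(period_seconds: int) -> str:
    """Return how long CloudWatch keeps data points of a given period.

    Simplified lookup of the standard CloudWatch retention schedule.
    """
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    if period_seconds < 60:
        return "3 hours"        # high-resolution data points
    if period_seconds < 300:
        return "15 days"        # 1-minute data points
    if period_seconds < 3600:
        return "63 days"        # 5-minute data points
    return "455 days"           # 1-hour data points (about 15 months)


if __name__ == "__main__":
    for period in (10, 60, 300, 3600):
        print(period, "->", metric_retention(period))
```

With the default 60-second collection interval, per-minute detail for a cluster is therefore queryable for 15 days, after which only 5-minute and 1-hour aggregates remain.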

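Log Group: Performance logs are stored in /aws/containerinsights/ClusterName/performance for EKS and Kubernetes (/aws/ecs/containerinsights/ClusterName/performance for ECS). A log group without a retention policy keeps its events indefinitely, so check the retention setting on this log group and set it explicitly.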

Agent Image: The CloudWatch agent for Container Insights is available as amazon/cloudwatch-agent on Docker Hub. The AWS Distro for OpenTelemetry (ADOT) collector, an alternative collector for EKS, is published as amazon/aws-otel-collector.

IAM Permissions: The agent requires the CloudWatchAgentServerPolicy managed policy (or equivalent permissions) to put metrics and logs.

Configuration and Verification Commands

#### Setting up Container Insights on ECS (EC2 launch type)

1. Create an IAM role for the EC2 instances with the CloudWatchAgentServerPolicy.
2. Enable Container Insights on the cluster, then deploy the CloudWatch agent as a daemon service to collect instance-level metrics.
   - Use the AWS Management Console: enable Container Insights in the cluster's settings when you create or update the cluster.
   - Or use the AWS CLI to register a task definition for the agent and create a service that uses the DAEMON scheduling strategy.

Example task definition snippet (JSON) for the CloudWatch agent:

{
  "family": "cwagent-ecs",
  "taskRoleArn": "arn:aws:iam::account-id:role/ecsCWAgentRole",
  "executionRoleArn": "arn:aws:iam::account-id:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "cloudwatch-agent",
      "image": "amazon/cloudwatch-agent:latest",
      "essential": true,
      "environment": [
        {
          "name": "CW_CONFIG_CONTENT",
          "value": "{\"logs\": {\"metrics_collected\": {\"ecs\": {\"metrics_collection_interval\": 60}}}}"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/cwagent",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}
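The escaped JSON string in `CW_CONFIG_CONTENT` is easy to get wrong by hand. The sketch below generates it from a dictionary, assuming the agent's `ecs` section under `logs.metrics_collected`; the helper name `build_cw_config_content` is hypothetical.

```python
import json


def build_cw_config_content(interval_seconds: int = 60) -> str:
    """Serialize a minimal CloudWatch agent config for ECS instance metrics."""
    config = {
        "logs": {
            "metrics_collected": {
                # Assumed section name for ECS Container Insights collection.
                "ecs": {"metrics_collection_interval": interval_seconds}
            }
        }
    }
    # Compact separators keep the environment variable value short.
    return json.dumps(config, separators=(",", ":"))


if __name__ == "__main__":
    env_entry = {"name": "CW_CONFIG_CONTENT", "value": build_cw_config_content()}
    # json.dumps escapes the inner quotes, producing a value that can be
    # pasted into the task definition's "environment" list.
    print(json.dumps(env_entry, indent=2))
```

Generating the value this way guarantees that the inner document is valid JSON before it is embedded in the task definition.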

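#### Setting up Container Insights on EKS

1. Install a collector in the amazon-cloudwatch namespace. AWS's quick-start path is the Amazon CloudWatch Observability EKS add-on (aws eks create-addon --cluster-name my-cluster --addon-name amazon-cloudwatch-observability). The Helm commands below install the ADOT collector or the CloudWatch agent instead; chart and repository names vary between releases, so confirm them with helm search repo before running.
   - For ADOT: helm install aws-otel-collector eks/aws-otel-collector --namespace amazon-cloudwatch
   - For CloudWatch agent: helm install cwagent amazon-cloudwatch/amazon-cloudwatch-agent --namespace amazon-cloudwatch
2. Verify the agent pods are running: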

kubectl get pods -n amazon-cloudwatch

3. Check that metrics appear in CloudWatch:

aws cloudwatch list-metrics --namespace ContainerInsights --dimensions Name=ClusterName,Value=my-cluster

#### Verification Commands

The commands below use the EKS namespace and log group; for ECS clusters, substitute ECS/ContainerInsights and /aws/ecs/containerinsights/my-cluster/performance.

- List available metrics: aws cloudwatch list-metrics --namespace ContainerInsights
- Get metric statistics: aws cloudwatch get-metric-statistics --namespace ContainerInsights --metric-name pod_cpu_utilization --dimensions Name=ClusterName,Value=my-cluster --start-time 2023-01-01T00:00:00Z --end-time 2023-01-01T01:00:00Z --period 300 --statistics Average
- Query performance logs: Use CloudWatch Logs Insights with the log group /aws/containerinsights/my-cluster/performance. Example query:

fields @timestamp, PodName, pod_cpu_utilization, pod_memory_utilization
  | filter Type = "Pod" and PodName = "my-pod"
  | sort @timestamp desc
  | limit 20
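The query above narrows the performance events to a single pod and returns the newest 20. The same logic can be sketched locally to make the event structure concrete. The sample events here are invented for illustration and carry only a few of the fields a real event would contain; `latest_pod_events` is a hypothetical helper.

```python
import json

# Made-up performance log events; real events contain many more fields.
SAMPLE_EVENTS = [
    '{"Type":"Pod","ClusterName":"my-cluster","PodName":"my-pod","Timestamp":"1700000000000","pod_cpu_utilization":10.0}',
    '{"Type":"Node","ClusterName":"my-cluster","NodeName":"node-1","Timestamp":"1700000030000","node_cpu_utilization":40.0}',
    '{"Type":"Pod","ClusterName":"my-cluster","PodName":"other-pod","Timestamp":"1700000060000","pod_cpu_utilization":55.0}',
    '{"Type":"Pod","ClusterName":"my-cluster","PodName":"my-pod","Timestamp":"1700000060000","pod_cpu_utilization":12.5}',
]


def latest_pod_events(raw_events, pod_name, limit=20):
    """Mimic: filter Type = "Pod" and PodName = ..., sort desc, limit N."""
    parsed = [json.loads(raw) for raw in raw_events]
    matches = [
        event for event in parsed
        if event.get("Type") == "Pod" and event.get("PodName") == pod_name
    ]
    # Timestamps in this sketch are epoch milliseconds stored as strings.
    matches.sort(key=lambda event: int(event["Timestamp"]), reverse=True)
    return matches[:limit]


if __name__ == "__main__":
    for event in latest_pod_events(SAMPLE_EVENTS, "my-pod"):
        print(event["Timestamp"], event["pod_cpu_utilization"])
```

Filtering on parsed fields rather than on raw message text is what lets Logs Insights treat each JSON key as a queryable field.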

How It Interacts with Related Technologies

CloudWatch Alarms: You can set alarms on Container Insights metrics (e.g., pod_cpu_utilization > 80%) to trigger SNS notifications or Auto Scaling actions.

CloudWatch Dashboard: Container Insights provides automatic dashboards in the CloudWatch console (under Container Insights) that show cluster-level and resource-level performance. You can also add the same metrics to your own custom dashboards.

ServiceLens: ServiceLens combines Container Insights metrics and logs with AWS X-Ray traces to provide end-to-end tracing and service maps for your containerized applications.

AWS Distro for OpenTelemetry (ADOT): For EKS, ADOT is a supported alternative to the CloudWatch agent. It follows OpenTelemetry standards and can send metrics to multiple backends (CloudWatch, Prometheus, etc.), which makes it a good fit when you need more than one destination.

Prometheus: Container Insights can also scrape Prometheus metrics from your applications if you configure the agent with a Prometheus scrape configuration.

Important Settings and Limitations

Fargate Support: For ECS tasks on Fargate, you only need to enable Container Insights on the cluster (or as an account default). Task-level metrics are collected by the Fargate platform and sent to CloudWatch without a separate agent container. Instance-level metrics are not available because there are no EC2 instances for you to monitor.

Cost: Container Insights incurs charges for custom metrics (each unique combination of dimensions is a metric) and for logs ingested. The first 10 custom metrics per account per month are free, but container workloads can generate thousands of metrics quickly.

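Data Volume: Each node sends performance log events every 60 seconds. Even at one event per node per interval, a cluster with 10 nodes generates 10 × 1,440 = 14,400 log events per day, and real clusters emit more because pods, containers, and services produce their own events. Plan log retention accordingly.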

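Supported Regions: Container Insights is available in most AWS Regions, but availability differs by platform and feature. Check the current Region list in the CloudWatch documentation rather than assuming that a given Region is included or excluded.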
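The Data Volume estimate above is simple arithmetic, and a small helper makes it easy to rerun for other cluster sizes. This is a lower-bound sketch; the `events_per_node_per_interval` parameter is an assumption added to show how the count grows when each node emits more than one event per interval.

```python
SECONDS_PER_DAY = 86_400


def min_events_per_day(nodes: int,
                       interval_seconds: int = 60,
                       events_per_node_per_interval: int = 1) -> int:
    """Estimate performance log events per day for a cluster (lower bound)."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    intervals_per_day = SECONDS_PER_DAY // interval_seconds
    return nodes * intervals_per_day * events_per_node_per_interval


if __name__ == "__main__":
    # 10 nodes * 1,440 one-minute intervals * 1 event = 14,400 events.
    print(min_events_per_day(10))
    # Hypothetical: 25 events per node per interval (pods, containers, etc.).
    print(min_events_per_day(10, events_per_node_per_interval=25))
```

Multiplying the result by an average event size gives a rough ingestion volume to compare against the log retention policy.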

Troubleshooting Common Issues

No metrics appearing: Check that the agent pods are running and have network access to CloudWatch endpoints. Verify IAM roles have the correct permissions (cloudwatch:PutMetricData, logs:CreateLogGroup, logs:CreateLogStream, logs:PutLogEvents).

Metrics missing for certain containers: Ensure the container runtime is supported (Docker, containerd). On ECS with Fargate, confirm that Container Insights is enabled on the cluster itself; the enableExecuteCommand flag controls ECS Exec and has no effect on metric collection.

High metric costs: Use metric filters to aggregate data and reduce the number of unique metric combinations. For example, aggregate by service instead of by individual pod.

Summary of Key Exam Points

Container Insights is enabled per cluster (ECS or EKS).

Metrics are stored in the ECS/ContainerInsights namespace (ECS) or the ContainerInsights namespace (EKS and Kubernetes).

Performance logs are stored in /aws/ecs/containerinsights/ClusterName/performance (ECS) or /aws/containerinsights/ClusterName/performance (EKS and Kubernetes).

The agent can be deployed as a DaemonSet (EKS) or a daemon service (ECS).

For EKS, the collector is either the CloudWatch agent or AWS Distro for OpenTelemetry (ADOT).

Container Insights supports both EC2 and Fargate launch types for ECS.

Alarms and dashboards are fully supported.

Cost is based on custom metrics and log ingestion.

Walk-Through

1. Enable Container Insights on Cluster

Start by enabling Container Insights on your ECS or EKS cluster. For ECS, this can be done via the AWS Management Console by selecting the cluster and clicking 'Update Cluster' then checking 'Use Container Insights'. For EKS, you enable it by installing the CloudWatch agent or ADOT collector using Helm. The cluster must have an IAM role that allows the agent to publish metrics and logs. This step is a one-time setup; after enabling, the agent will begin collecting data.

2. Deploy CloudWatch Agent as DaemonSet

The CloudWatch agent runs as a DaemonSet on each node in the cluster. For EKS, you use Helm charts to deploy the agent into the `amazon-cloudwatch` namespace. The agent pod on each node scrapes metrics from the kubelet and container runtime (Docker or containerd). It collects CPU, memory, network, and disk metrics at the container, pod, and node levels. The agent also emits structured JSON logs to CloudWatch Logs.

3. Agent Scrapes Metrics Every 60 Seconds

By default, the agent collects metrics every 60 seconds. It queries the kubelet API for pod metrics and cAdvisor for container metrics. The data is aggregated per node and sent to CloudWatch, where it appears as metrics under the `ContainerInsights` namespace (`ECS/ContainerInsights` for ECS) with dimensions like `ClusterName`, `NodeName`, and `PodName`. This interval can be adjusted in the agent configuration file, but shorter intervals increase API calls and costs.

4. Metrics and Logs Sent to CloudWatch

The agent sends the scraped data to CloudWatch Logs as performance log events in embedded metric format, and CloudWatch extracts the metrics from those events. The log group is named `/aws/containerinsights/ClusterName/performance` for EKS and Kubernetes (`/aws/ecs/containerinsights/ClusterName/performance` for ECS). Each log event is a JSON object containing fields like `ClusterName`, `NodeName`, `PodName`, `ContainerName`, `CpuUtilized`, `MemoryUtilized`, etc. These logs can be queried with CloudWatch Logs Insights for detailed analysis.

5. Visualize Metrics in CloudWatch Dashboard

Once metrics and logs are flowing, the CloudWatch console shows automatic Container Insights dashboards with pre-built views for cluster, node, pod, task, and service-level metrics. You can also create custom dashboards using the Container Insights metrics. Additionally, you can set up CloudWatch Alarms on key metrics like high CPU or memory usage to trigger notifications or auto-scaling actions.

What This Looks Like on the Job

Scenario 1: E-Commerce Platform on ECS (Fargate)

A large e-commerce company runs its microservices on Amazon ECS with Fargate. They need to monitor over 200 services across multiple clusters. Before Container Insights, they relied on CloudWatch logs and custom scripts to track CPU and memory, which was error-prone and lacked task-level granularity. After enabling Container Insights, they immediately got visibility into per-task CPU and memory utilization. They set up alarms for when memory usage exceeds 80% to trigger task scale-out. They also use the performance logs to identify which services are causing network bottlenecks. One misconfiguration they encountered: Container Insights had not been enabled on every cluster, so some clusters showed no metrics. After enabling the cluster setting on those clusters, metrics appeared within minutes.

Scenario 2: Kubernetes Cluster for Data Analytics

A data analytics company runs a 50-node EKS cluster with GPU instances for machine learning workloads. They use Container Insights to monitor GPU utilization (via the NVIDIA DCGM exporter integrated with the CloudWatch agent). They configured the agent to scrape Prometheus metrics from their custom model servers. The pre-built dashboard helped them identify that several nodes were over-utilized, causing pod evictions. They set up a CloudWatch alarm on node_cpu_utilization to send SNS notifications when a node exceeds 90% for 5 minutes. A common mistake they made: they initially deployed the agent without setting the cluster_name in the configuration, leading to metrics appearing under a generic cluster name. They fixed it by updating the agent's ConfigMap.

Scenario 3: Hybrid Container Monitoring

A financial services firm runs both ECS and EKS clusters across multiple AWS accounts. They use Container Insights in each account and aggregate metrics into a central monitoring account using CloudWatch cross-account observability. They created a single dashboard that shows metrics from all clusters. They also use CloudWatch Logs Insights to query performance logs across accounts for troubleshooting. The challenge they faced was cost: with hundreds of pods, they were generating over 50,000 custom metrics per month. They reduced costs by using metric filters to aggregate pod-level metrics to service-level metrics, reducing the unique metric count by 80%.

How SOA-C02 Actually Tests This

What SOA-C02 Tests on Container Insights

- Objective 1.1: Implement metrics, alarms, and filters by using AWS monitoring and logging services. The exam expects you to know how to enable Container Insights, what metrics are collected, and how to access dashboards.
- Specific areas: Difference between ECS (EC2 vs Fargate) and EKS setup, default metrics collected, log group naming, and integration with CloudWatch Alarms and ServiceLens.
- Common wrong answers:
  1. "Container Insights requires a third-party agent": Wrong. Container Insights uses the native CloudWatch agent or the ADOT collector, not a third-party tool.
  2. "Metrics are stored in the /aws/containerinsights/ log group": Wrong. Metrics are stored in CloudWatch Metrics (namespace ContainerInsights for EKS and Kubernetes, ECS/ContainerInsights for ECS), while performance logs are stored in the log group.
  3. "Container Insights only works with ECS": Wrong. It also works with EKS and self-managed Kubernetes on AWS.
  4. "Fargate does not support Container Insights": Wrong. Fargate supports Container Insights once it is enabled on the cluster; no agent container is required.
- Specific numbers/values: Default collection interval is 60 seconds; metric namespace is ECS/ContainerInsights for ECS and ContainerInsights for EKS and Kubernetes; log group pattern is /aws/ecs/containerinsights/ClusterName/performance for ECS and /aws/containerinsights/ClusterName/performance for EKS and Kubernetes.
- Edge cases: If the cluster name contains special characters, the log group name may be truncated or fail to create. The exam may ask about troubleshooting missing metrics: check IAM permissions, agent pod status, and network connectivity.
- Eliminating wrong answers: Understand the mechanism: the agent runs on each node, scrapes metrics from kubelet/cAdvisor, and sends the data to CloudWatch. If an answer says metrics are collected from the control plane, it's wrong, because the control plane does not expose container metrics. If it says logs are stored in a different log group, it's wrong. Always verify the exact namespace and log group naming convention.

Key Takeaways

Container Insights is enabled per cluster and uses the CloudWatch agent or ADOT collector.

Metrics are published under the ECS/ContainerInsights namespace (ECS) or the ContainerInsights namespace (EKS and Kubernetes), with dimensions like ClusterName, ServiceName, NodeName, and PodName.

Performance logs are stored in the CloudWatch Logs group /aws/containerinsights/ClusterName/performance for EKS and Kubernetes, or /aws/ecs/containerinsights/ClusterName/performance for ECS.

Default data collection interval is 60 seconds.

Container Insights supports both ECS (EC2 and Fargate) and EKS.

IAM permissions required: CloudWatchAgentServerPolicy or equivalent for PutMetricData and logs actions.

Alarms can be set on any Container Insights metric to trigger notifications or auto scaling.

Cost is based on custom metrics and log ingestion; aggregate metrics to reduce costs.

For EKS, ADOT is a supported collector alongside the CloudWatch agent and is useful when you need OpenTelemetry and Prometheus support.

Container Insights provides automatic, pre-built dashboards in the CloudWatch console.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

CloudWatch Container Insights for ECS

Agent runs as a daemon service on each EC2 instance in the cluster.

For Fargate, no agent container needed; metrics collected by Fargate agent.

Metrics include task-level CPU, memory, network, and disk.

Easier setup via console wizard or CloudFormation.

Integrated with ECS service auto scaling.

CloudWatch Container Insights for EKS

Agent runs as a DaemonSet (one pod per node).

Collector is the CloudWatch agent or AWS Distro for OpenTelemetry (ADOT).

Metrics include pod-level CPU, memory, network, and disk.

Requires Helm installation or manual YAML deployment.

Can scrape custom Prometheus metrics from applications.

Watch Out for These

Mistake

Container Insights only works with Amazon ECS.

Correct

Container Insights supports Amazon ECS (both EC2 and Fargate), Amazon EKS, and self-managed Kubernetes clusters on AWS. It does not support on-premises Kubernetes directly.

Mistake

Container Insights stores metrics in a CloudWatch Logs group.

Correct

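Metrics are stored in CloudWatch Metrics under the `ContainerInsights` namespace for EKS and Kubernetes and the `ECS/ContainerInsights` namespace for ECS. Performance logs are stored in CloudWatch Logs under `/aws/containerinsights/ClusterName/performance` (or `/aws/ecs/containerinsights/ClusterName/performance` for ECS), but these are log data, not metrics.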

Mistake

You must install a third-party monitoring tool to see container metrics in CloudWatch.

Correct

Container Insights is a native AWS service. It uses the CloudWatch agent or AWS Distro for OpenTelemetry collector, which are AWS-provided, not third-party.

Mistake

Container Insights automatically collects application-level metrics like request latency.

Correct

Container Insights collects infrastructure-level metrics (CPU, memory, network, disk). For application-level metrics (e.g., request latency, error rates), you need to use CloudWatch embedded metric format or publish custom metrics via the agent.

Mistake

Fargate tasks cannot use Container Insights because there is no agent to install.

Correct

Fargate tasks support Container Insights without any agent that you install: the Fargate platform collects task-level metrics and sends them to CloudWatch. No separate container is needed; you just enable Container Insights on the cluster.
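For the application-level metrics mentioned above, the embedded metric format lets an application publish a metric simply by writing a structured JSON log line. The sketch below builds one such record; the namespace, metric name, and dimension values are illustrative assumptions, and `emf_record` is a hypothetical helper rather than an AWS library function.

```python
import json
import time


def emf_record(namespace, metric_name, value, unit, dimensions):
    """Build one embedded-metric-format log line as a JSON string."""
    record = {
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [
                {
                    "Namespace": namespace,
                    # One dimension set made of all provided dimension keys.
                    "Dimensions": [list(dimensions.keys())],
                    "Metrics": [{"Name": metric_name, "Unit": unit}],
                }
            ],
        },
        # The metric value is a top-level member named after the metric.
        metric_name: value,
    }
    # Dimension values are also top-level members of the record.
    record.update(dimensions)
    return json.dumps(record)


if __name__ == "__main__":
    line = emf_record(
        namespace="MyApp",                # hypothetical namespace
        metric_name="RequestLatency",     # hypothetical metric
        value=123.4,
        unit="Milliseconds",
        dimensions={"ServiceName": "checkout"},
    )
    print(line)  # written to stdout, which the log driver forwards
```

When a line like this reaches CloudWatch Logs, CloudWatch extracts the metric asynchronously, so the application needs no direct call to the metrics API.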


Frequently Asked Questions

How do I enable Container Insights on an existing ECS cluster?

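In the AWS Management Console, navigate to your ECS cluster, click 'Update Cluster', check the box for 'Use Container Insights', and click 'Update'. For CLI, use the `aws ecs update-cluster-settings` command with `--cluster <cluster-name> --settings name=containerInsights,value=enabled`. No extra IAM policy is needed for the cluster-, service-, and task-level metrics; the CloudWatchAgentServerPolicy is required on the instance role only if you also run the CloudWatch agent daemon for instance-level metrics. Metrics will appear within a few minutes.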
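The same change can be scripted. The sketch below only builds the request payload, so it runs without AWS credentials; it assumes the payload would be passed to the boto3 ECS client's `update_cluster_settings` call, and the helper name is hypothetical.

```python
def container_insights_settings_request(cluster: str, enabled: bool = True) -> dict:
    """Build keyword arguments for an ECS update-cluster-settings request."""
    return {
        "cluster": cluster,
        "settings": [
            {
                "name": "containerInsights",
                "value": "enabled" if enabled else "disabled",
            }
        ],
    }


if __name__ == "__main__":
    request = container_insights_settings_request("my-ecs-cluster")
    print(request)
    # Assumed usage with boto3 (not executed in this sketch):
    #   import boto3
    #   boto3.client("ecs").update_cluster_settings(**request)
```

Keeping the payload in a pure function makes it easy to apply the same setting across many clusters in a loop.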

What metrics does Container Insights collect?

Container Insights collects CPU utilization, memory utilization, network bytes sent/received, disk read/write bytes, and diagnostic metrics like container restart count, OOM kills, and pod status (pending, running, failed). These are at the container, pod, task, node, and cluster levels. For EKS, it also collects node-level metrics like node CPU and memory.

Can I use Container Insights with Fargate?

Yes. For ECS tasks on Fargate, enable Container Insights on the cluster. The Fargate platform automatically sends task-level metrics to CloudWatch. You do not need to run a separate monitoring container, and the awslogs log driver is needed only for your application logs, not for Container Insights metrics.

How do I create a CloudWatch alarm on a Container Insights metric?

In the CloudWatch console, go to Alarms > Create alarm. Select the metric source as 'ContainerInsights'. Choose a metric like `pod_cpu_utilization` and specify dimensions (e.g., ClusterName, PodName). Set the condition (e.g., > 80 for 3 consecutive periods). Configure actions (e.g., SNS topic) and create the alarm. You can also use the AWS CLI with `put-metric-alarm`.
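The console steps above map to a single `put-metric-alarm` request. The sketch below assembles the parameters for an EKS pod CPU alarm without calling AWS; the alarm name, the SNS topic ARN, and the helper name are illustrative assumptions, and the dictionary is intended for the boto3 CloudWatch client's `put_metric_alarm` call.

```python
def pod_cpu_alarm_params(cluster_name: str,
                         sns_topic_arn: str,
                         threshold: float = 80.0,
                         periods: int = 3) -> dict:
    """Build put_metric_alarm keyword arguments for a pod CPU alarm."""
    return {
        "AlarmName": f"{cluster_name}-pod-cpu-high",  # hypothetical naming scheme
        "Namespace": "ContainerInsights",             # EKS/Kubernetes namespace
        "MetricName": "pod_cpu_utilization",
        "Dimensions": [{"Name": "ClusterName", "Value": cluster_name}],
        "Statistic": "Average",
        "Period": 60,                                 # seconds per evaluation period
        "EvaluationPeriods": periods,                 # consecutive breaching periods
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    params = pod_cpu_alarm_params(
        "my-cluster",
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",  # placeholder ARN
    )
    print(params)
    # Assumed usage with boto3 (not executed in this sketch):
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

For an ECS cluster, the namespace would be `ECS/ContainerInsights` and the metric and dimensions would change accordingly.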

Why are my Container Insights metrics not showing up?

Common causes: 1) The CloudWatch agent is not deployed or is crashing. Check pod status with `kubectl get pods -n amazon-cloudwatch`. 2) IAM permissions missing – ensure the node or task role has CloudWatchAgentServerPolicy. 3) Network connectivity – the agent must be able to reach CloudWatch endpoints. 4) The cluster name is incorrect in the agent configuration. Verify the log group exists: `/aws/containerinsights/ClusterName/performance`.

What is the difference between CloudWatch Container Insights and CloudWatch Logs?

CloudWatch Container Insights is a service that collects both metrics and logs specifically for container environments. It publishes metrics to CloudWatch Metrics and structured performance logs to CloudWatch Logs. CloudWatch Logs is a broader service for storing and querying log data from any source. Container Insights uses CloudWatch Logs as the storage for its performance data, but it also provides pre-built dashboards and metric aggregation.

Can I customize the metrics collected by Container Insights?

Yes, you can customize the CloudWatch agent configuration to collect additional metrics, such as Prometheus metrics from your applications. You can also use the agent's `metrics_collected` section to specify which metrics to collect and at what intervals. However, the default set of container metrics is automatically collected and cannot be disabled individually.
