Practice DOP-C02 Monitoring and Logging questions with full explanations on every answer.
1. A company is running a critical web application on Amazon EC2 instances behind an Application Load Balancer (ALB). The DevOps team wants to monitor HTTP 5xx errors and receive alerts when the error rate exceeds 5% over a 5-minute period. Which combination of services and configurations should be used to meet these requirements?
2. A DevOps team is using Amazon CloudWatch Logs to collect application logs from multiple EC2 instances. They notice that some log entries are missing and that the CloudWatch agent is consuming high CPU. The log group has a retention policy of 30 days. Which action should the team take to reduce CPU usage without losing log data?
3. A company wants to monitor the number of messages in an Amazon SQS queue and send an alert if the queue depth exceeds 1000 for more than 5 minutes. Which AWS service should be used to create the alarm?
4. A company is using Amazon CloudWatch Synthetics canaries to monitor its web application endpoints. The canaries are deployed in multiple AWS regions. The team wants to aggregate the canary results into a single dashboard in the US East (N. Virginia) region. What is the MOST efficient way to achieve this?
5. A DevOps team is troubleshooting a slow application. They enabled AWS X-Ray tracing and see that one of the downstream services has a high average response time. However, the traces show that the service itself is fast; the delay is in the network call from the upstream service. Which X-Ray feature should the team use to identify the root cause?
6. A company needs to monitor the CPU utilization of its Amazon RDS for PostgreSQL instance. The metric should be available in Amazon CloudWatch with a granularity of 1 minute. Which action should the team take?
7. A company runs a containerized application on Amazon ECS Fargate. The DevOps team wants to collect custom application metrics (e.g., request count, error rate) and send them to Amazon CloudWatch. The team wants to minimize changes to the application code. Which solution should be used?
8. A DevOps engineer is designing a monitoring solution for a serverless application using AWS Lambda, Amazon API Gateway, and Amazon DynamoDB. The team needs to monitor for errors and latency. Which TWO actions should the engineer take to implement comprehensive monitoring? (Choose TWO.)
9. A company uses Amazon CloudWatch Logs to store logs from multiple applications. The security team requires that logs are encrypted at rest using a customer-managed KMS key. Additionally, logs must be retained for 7 years for compliance. Which THREE steps should the DevOps engineer take to meet these requirements? (Choose THREE.)
10. A DevOps engineer is monitoring an Amazon EC2 Auto Scaling group. The engineer wants to receive notifications when instances are launched or terminated. Which TWO AWS services can be used together to achieve this? (Choose TWO.)
11. A company runs a production web application on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer (ALB). The application is deployed across three Availability Zones. The DevOps team recently noticed that the application's error rate is spiking periodically, but they cannot correlate the spikes with any known deployments or changes. The team has enabled detailed CloudWatch metrics for the ALB and EC2, and they are using CloudWatch Logs for application logs. They also have AWS X-Ray enabled for tracing. The team observes that during error spikes, the ALB's 5XX count increases, but the EC2 instance-level CPU and memory metrics remain normal. The application logs show 'Connection timed out' errors. The team suspects the issue is related to network connectivity but is not sure. Which course of action should the DevOps team take to identify the root cause of the periodic error spikes?
12. A company runs a critical application on Amazon ECS with Fargate. The application experiences intermittent slow responses. The DevOps team enabled Container Insights and CloudWatch ServiceLens. However, traces from the application do not appear in ServiceLens. The application uses the AWS X-Ray SDK for tracing. What is the MOST likely cause?
13. A DevOps team has set up centralized logging for multiple AWS accounts using Amazon OpenSearch Service. The team uses CloudWatch cross-account observability to collect logs from various accounts into a monitoring account. Recently, logs from one source account stopped appearing in the monitoring account's OpenSearch dashboard. Other source accounts continue to send logs successfully. Which step should the team take to troubleshoot this issue?
14. A company uses AWS CloudFormation to deploy infrastructure. The DevOps team wants to receive notifications when a stack fails to create or update. What is the MOST efficient way to achieve this?
15. A DevOps engineer is troubleshooting a production AWS Lambda function that occasionally times out. The function has a timeout of 30 seconds and uses a synchronous invocation. The engineer wants to capture invocation logs to identify the cause. Which approach will provide the MOST detailed diagnostic information?
16. A company runs a web application on Amazon EC2 instances behind an Application Load Balancer. The DevOps team has enabled detailed CloudWatch metrics for the ALB and is using CloudWatch Logs for the EC2 instances. Recently, users report intermittent 503 errors. The team notices that the ALB's 'RequestCount' metric shows a sudden drop during error periods, while the 'ActiveConnectionCount' remains steady. Which TWO steps should the team take to diagnose the issue? (Choose TWO.)
17. A DevOps team is designing a monitoring strategy for a microservices application deployed on Amazon EKS. The application emits custom metrics, and the team needs to collect them with minimal latency and at high resolution. The team also needs to retain logs for 90 days for compliance. Which THREE steps should the team take to meet these requirements? (Choose THREE.)
18. Your company runs a multi-tier web application on AWS. The application consists of an Application Load Balancer (ALB) that distributes traffic to a fleet of Amazon EC2 instances running a web server. The web servers write access logs to a shared Amazon EFS filesystem. The operations team needs to monitor the web server logs in real time to detect and alert on 5xx error spikes. Currently, the team manually SSHes into instances to tail logs, which is inefficient and doesn't provide real-time alerting. The team wants a centralized, near-real-time logging solution with minimal operational overhead. They have asked you to design a solution that ingests logs from the EFS filesystem into a centralized log analytics platform. Which solution would you recommend?
19. A company is running a microservices application on Amazon ECS with AWS Fargate. The operations team needs to monitor application performance and troubleshoot slow API responses. They currently use Amazon CloudWatch Logs for container logs and have enabled Container Insights. However, they are unable to see detailed latency breakdowns per API endpoint. Which solution would provide the most granular visibility into API performance?
20. A company is using AWS CloudTrail to log API activity in their AWS account. They want to ensure that any modification to CloudTrail configuration itself is logged and that the logs are immutable. Which combination of actions should they take? (Choose TWO.)
21. Drag and drop the steps to configure an AWS Auto Scaling group with a launch template and scaling policies.
22. Drag and drop the steps to set up an AWS CodeBuild project to build a Docker image and push it to Amazon ECR.
23. Match each AWS monitoring or logging tool to its purpose.
24. Match each AWS service health or performance concept to its meaning.
25. A company uses AWS CloudTrail to log API activity across multiple accounts in AWS Organizations. The security team wants to receive near-real-time notifications for specific high-risk API calls, such as IAM policy changes or S3 bucket policy modifications. What is the MOST efficient and scalable solution?
26. A DevOps engineer manages a production environment with EC2 instances behind an Application Load Balancer (ALB). The application logs show intermittent 5xx errors from the ALB. The engineer needs to identify whether the errors originate from the targets or the ALB itself. Which CloudWatch metric should be examined to differentiate between these two sources?
27. A company uses Amazon RDS for PostgreSQL and wants to monitor database performance metrics such as CPU utilization, memory, and disk I/O. Which AWS service should be used to set up custom dashboards and alarms for these metrics?
28. A company runs a microservices application on Amazon ECS with Fargate. The application logs are sent to CloudWatch Logs. Recently, the operations team noticed that logs from one service are missing for certain time periods. The service is very chatty and produces a high volume of logs. The CloudWatch Logs agent is configured with default settings. What is the MOST likely cause of the missing logs?
29. A company wants to monitor network traffic to and from its VPC for security analysis. It needs to capture IP traffic information, including accepted and rejected connection attempts, and store the data in S3 for long-term analysis. Which AWS service should be used?
30. A DevOps team uses AWS Lambda functions to process events from an SQS queue. The Lambda function occasionally fails due to transient errors, and the team wants to capture and analyze the full error details, including stack traces, for debugging. The errors are not always related to invocation failures (e.g., timeouts) but include exceptions thrown within the function code. Which approach will capture the MOST comprehensive error information?
31. A company wants to be alerted when the root user signs in to the AWS Management Console. Which service should be used to create a monitoring rule for this event?
32. A company uses Amazon CloudWatch to monitor its EC2 instances. A DevOps engineer notices that some metrics (e.g., memory utilization) are not available in the CloudWatch console. The engineer wants to collect these metrics. What should the engineer do?
33. A company uses AWS CloudFormation to deploy infrastructure. The security team wants to be notified whenever a stack is created, updated, or deleted. They also want to track who made the change. Which combination of services should be used to achieve this?
34. A company uses Amazon CloudWatch Logs to store application logs. The DevOps team wants to search across multiple log groups for a specific error pattern. Which TWO options can be used to achieve this? (Choose TWO.)
35. A company runs a critical application on Amazon EKS. The operations team needs to monitor the health of the Kubernetes cluster and the applications running on it. Which THREE services can be used together to achieve comprehensive monitoring? (Choose THREE.)
36. A DevOps engineer needs to collect and analyze logs from multiple AWS services, including EC2, Lambda, and API Gateway. The logs must be stored in a central location for long-term retention and analyzed using SQL queries. Which TWO services should be combined to achieve this? (Choose TWO.)
37. A DevOps engineer needs to monitor the memory utilization of an Amazon EC2 instance running a critical application. Which AWS service should be used to collect and track this metric?
38. A company uses Amazon CloudWatch Logs to store application logs from multiple EC2 instances. The security team requires that logs be encrypted at rest using a customer-managed KMS key. Which configuration step should the engineer perform to meet this requirement?
39. An e-commerce application runs on Amazon ECS with Fargate. The operations team notices that the application's latency increases during peak hours. The engineer needs to correlate high CPU usage with increased request latency to identify the root cause. Which approach should be used?
40. A DevOps engineer is setting up an alarm to notify the team when the average CPU utilization of an EC2 instance exceeds 80% for 5 consecutive minutes. Which CloudWatch alarm configuration should be used?
41. A company uses AWS CloudTrail to log API activity across multiple accounts. The security team needs to ensure that all CloudTrail logs are delivered to a centralized S3 bucket in the audit account, and that any log file validation failures trigger an immediate notification. What should the engineer do to meet this requirement?
42. A company runs a microservices application on Amazon EKS. The DevOps team wants to collect and visualize metrics such as pod CPU and memory usage, and set up alerts. Which combination of AWS services should be used?
43. A DevOps engineer needs to centralize logs from multiple AWS accounts into a single CloudWatch Logs account. Which feature should be used?
44. A company uses Amazon RDS for MySQL. The database performance has degraded, and the engineer suspects that slow queries are the cause. Which service should be used to identify and analyze the slow queries?
45. A company is using AWS Lambda to process streaming data from Amazon Kinesis. The processing rate is slower than expected, and the engineer needs to monitor the number of records that are failing processing. Which metric should be used to create a CloudWatch alarm?
46. A DevOps engineer needs to set up centralized logging for an application running on multiple EC2 instances across different AWS accounts. The logs must be aggregated in a single S3 bucket and also be analyzed in near real-time. Which TWO services should be used together to achieve this?
47. A company uses Amazon CloudWatch to monitor a fleet of EC2 instances. The DevOps team wants to receive notifications when the CPU utilization exceeds 90% for 5 minutes and also when the status check fails. Which THREE steps should be taken to set up these alerts?
48. A company wants to ensure that all changes to its Amazon S3 bucket policies are logged for auditing purposes. Which TWO AWS services should be enabled to capture these changes?
49. A company is using Amazon CloudWatch Logs to monitor application logs from EC2 instances. The DevOps engineer notices that some log entries are missing. The CloudWatch agent is installed and configured. What is the most likely cause of the missing log entries?
50. A DevOps engineer is tasked with setting up a centralized logging solution for a multi-account AWS environment. Which service should be used to aggregate logs from multiple accounts?
51. A company runs a critical application on Amazon ECS with Fargate. The DevOps engineer wants to receive alerts when the application's error rate exceeds 5% over a 5-minute period. Which combination of services should be used?
52. A DevOps engineer is designing a monitoring solution for an application that runs on Amazon EC2 instances in an Auto Scaling group. The engineer needs to collect memory utilization metrics and visualize them in a dashboard. What should the engineer do?
53. A company uses AWS X-Ray to trace requests through its microservices application. The DevOps engineer notices that some traces are incomplete. What is a possible reason?
54. A company is using Amazon CloudWatch Logs to store application logs. The DevOps engineer needs to ensure that log data is encrypted at rest using a customer-managed KMS key. What step must be taken?
55. A DevOps engineer is troubleshooting a production issue where an application's response time has increased. The application is deployed on Amazon ECS with Fargate. The engineer wants to identify which microservice is causing the latency. Which AWS service should be used?
56. A company wants to receive notifications when an EC2 instance's CPU utilization exceeds 90% for 10 consecutive minutes. Which AWS service should be used?
57. A company runs a containerized application on Amazon EKS. The DevOps engineer needs to collect application metrics and make them available in Amazon CloudWatch. Which solution should be used?
58. A DevOps engineer needs to set up a monitoring solution that can detect and alert on unusual patterns in application metrics. Which TWO AWS services can be used together to achieve this? (Choose TWO.)
59. A company wants to centralize logging from multiple AWS accounts and regions. The logs should be stored in a central S3 bucket for compliance. Which THREE steps are required to achieve this? (Choose THREE.)
60. A DevOps engineer is investigating a performance issue in a serverless application using AWS Lambda. The engineer wants to view the duration of each invocation and identify cold starts. Which TWO AWS services should be used? (Choose TWO.)
61. A company's DevOps team notices that their Amazon RDS for PostgreSQL instance's CPU utilization spikes to 90% every day at 10:00 AM, causing application latency. They want to be notified when the CPU utilization exceeds 80% for more than 5 minutes to investigate the cause. Which solution should they implement?
62. A company is using Amazon CloudWatch Logs to collect logs from its containerized applications running on Amazon ECS Fargate. The DevOps engineer wants to centralize logs from multiple services into a single CloudWatch Logs log group. They currently have a log group per service. Which approach minimizes operational overhead and cost?
63. A DevOps engineer needs to monitor the number of 4xx and 5xx HTTP errors returned by an Application Load Balancer (ALB). They want to set up a dashboard that shows the error count over the last 24 hours. Which CloudWatch metrics should they use?
64. A company uses AWS Lambda functions to process incoming events. The DevOps team notices that some functions are timing out after 30 seconds, but the configured timeout is 1 minute. They want to capture the actual invocation duration for all invocations to analyze performance. What is the most efficient way to achieve this?
65. A company is running a production microservices architecture on Amazon ECS with Fargate. The operations team wants to set up centralized logging across all services, including the ability to search logs in near real-time and retain them for 3 years. The logs are currently sent to CloudWatch Logs. Which combination of services would meet these requirements with the least operational overhead?
66. A DevOps engineer is troubleshooting a slow API response. They suspect that the issue is related to database queries. The application runs on EC2 instances behind an ALB and uses Amazon RDS for MySQL. Which monitoring approach will provide the most granular insight into database query performance?
67. A company is using AWS CloudFormation to deploy infrastructure. They want to receive notifications when a stack operation fails, including the specific resource that caused the failure. Which approach should they use?
68. A company is migrating its on-premises applications to AWS and wants to maintain the same level of monitoring for its Linux-based EC2 instances. They currently use Nagios for monitoring. They want a managed AWS service that can monitor instance health, system metrics, and application logs. Which solution should they use?
69. A DevOps team is deploying a new web application on AWS Elastic Beanstalk. They want to monitor the application's health and receive notifications when the environment's health status changes to 'Degraded' or 'Severe'. What is the simplest way to achieve this?
70. A company wants to set up centralized logging for its multi-account AWS environment. The logs include CloudTrail, VPC Flow Logs, and Amazon Route 53 resolver query logs. Which TWO services should they use to achieve this with minimal operational overhead? (Select TWO.)
71. A company is using Amazon CloudWatch Synthetics canaries to monitor its web application endpoints. The canaries are failing intermittently with timeout errors. The DevOps team needs to troubleshoot the root cause. Which THREE actions should they take? (Select THREE.)
72. A DevOps engineer is setting up monitoring for an Amazon DynamoDB table that experiences high read traffic. They want to monitor the read capacity consumption and be alerted when the consumed read capacity exceeds 80% of the provisioned capacity for 5 consecutive minutes. Which TWO steps should they take? (Select TWO.)
73. A company uses Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. The operations team notices that some instances are failing health checks but are not being terminated by Auto Scaling. What should be investigated to resolve this issue?
74. A DevOps engineer is configuring a centralized logging solution using Amazon CloudWatch Logs. They need to ensure that logs from multiple AWS accounts are aggregated into a single CloudWatch Logs account. Which approach meets this requirement?
75. A company is using AWS CloudTrail to track API calls. They want to be notified immediately when an IAM user creates a new access key. Which combination of AWS services should be used?
76. A company is running a critical application on Amazon ECS with Fargate launch type. The application experiences periodic performance degradation. The DevOps team needs to set up monitoring to capture detailed metrics at a 1-second granularity. Which solution should be used?
77. A DevOps engineer needs to audit changes to IAM policies over the past 90 days. The engineer wants to see who made the change, what the change was, and when it occurred. Which AWS tool should be used?
78. A company is using Amazon RDS for MySQL and wants to monitor database connections. They need to set up an alarm when the number of connections exceeds 80% of the maximum connections for more than 5 minutes. Which CloudWatch metric and statistic should be used?
79. A company runs a web application on EC2 instances behind an Application Load Balancer. They use Amazon CloudFront for content delivery. The DevOps team notices that some requests are returning HTTP 503 errors intermittently. After checking the CloudFront and ALB logs, they find that the errors originate from the ALB. What is the most likely cause?
80. A company wants to collect and analyze logs from on-premises servers and send them to AWS for centralized monitoring. Which combination of AWS services should be used?
81. A DevOps engineer is troubleshooting a slow-running Lambda function. The function processes messages from an SQS queue. Which CloudWatch metric should be examined first to determine if the function is experiencing throttling?
82. A company is using Amazon CloudWatch Logs to store application logs. The security team requires that logs are encrypted at rest using a customer-managed KMS key. Which TWO steps must be taken to achieve this?
83. A DevOps team is troubleshooting a slow website that uses Amazon CloudFront with an Application Load Balancer as the origin. The team notices that cache hit ratio is low. Which THREE actions are most likely to improve the cache hit ratio?
84. A company is using Amazon CloudWatch Logs to collect application logs. They need to search and analyze the logs in near real-time. Which TWO AWS services can be used to achieve this?
85. A company is running a production application on Amazon ECS with AWS Fargate. The application has unpredictable traffic patterns and occasionally experiences increased latency. The DevOps team needs to configure scaling based on a custom metric that tracks the number of active user sessions in real time. Which solution will allow the team to scale the ECS service based on this custom metric?
86. A DevOps team is implementing a comprehensive logging strategy for a microservices architecture running on Amazon EKS. They need to collect logs from all containers and send them to a centralized log analytics platform. The solution must be agentless and support multi-line log events. Which approach should the team use?
87. A company uses AWS CloudTrail to log API activity in their AWS account. They need to ensure that any changes to CloudTrail configuration itself are detected and alerted upon in real time. Which service should they use?
88. A company is using Amazon RDS for PostgreSQL and wants to monitor the database for performance issues. They need to capture slow queries and analyze them over time. Which combination of AWS services should they use?
89. A company has a multi-account AWS environment using AWS Organizations. The security team needs to centrally monitor and analyze VPC Flow Logs from all accounts. The solution must be cost-effective and allow querying across accounts. Which approach should they take?
90. A DevOps engineer needs to set up a monitoring solution for an AWS Lambda function that processes messages from an Amazon SQS queue. The engineer wants to be alerted if the function fails to process a message (i.e., the message ends up in the dead-letter queue). Which approach should they use?
91. A company runs a web application behind an Application Load Balancer (ALB) in a production AWS account. The DevOps team needs to analyze HTTP request patterns and identify the top IP addresses generating errors. They want to store the data cost-effectively for querying with SQL. Which solution meets these requirements?
92. A company uses AWS Lambda with an Amazon DynamoDB table to process high-volume clickstream data. The Lambda function writes the data to DynamoDB. Recently, the function has been experiencing throttling and timeouts during peak traffic. The DevOps team needs to set up monitoring to identify the root cause. Which combination of metrics should they analyze?
93. A DevOps engineer is tasked with ensuring that all Amazon S3 buckets in the account have server access logging enabled. The engineer needs to be automatically notified when a new bucket is created without logging enabled. Which AWS service should they use?
94. A company is using Amazon CloudWatch Logs to collect logs from multiple EC2 instances. They need to filter logs in real time and send specific log events to a custom application for processing. Which TWO services can they use to achieve this?
95. A company is running a critical application on Amazon EC2 instances behind an Application Load Balancer (ALB). They need to implement a monitoring strategy that provides detailed visibility into application performance, including request-level latency and error codes. Which THREE actions should they take?
96. A DevOps engineer is setting up centralized logging for a multi-account environment using AWS Organizations. The engineer needs to aggregate logs from all accounts into a single Amazon S3 bucket. Which TWO steps are necessary?
97. A company uses Amazon CloudWatch Synthetics canaries to monitor endpoint availability. The canaries are failing intermittently with timeout errors. The DevOps team needs to diagnose the issue. Which THREE aspects should they investigate?
98. Refer to the exhibit. An IAM policy is attached to an EC2 instance role. The application on the instance is unable to send logs to CloudWatch Logs. The log group 'MyAppLogs' exists in the same account and region. What is the most likely reason for the failure?
99. Refer to the exhibit. A DevOps engineer runs the AWS CLI command to get the average TargetResponseTime for an ALB over a 1-hour period. The output shows only three datapoints. What is the most likely reason?
100. A company is running a production web application on Auto Scaling EC2 instances behind an ALB. They have enabled detailed CloudWatch metrics on the EC2 instances and enabled CloudTrail. Recently, users reported intermittent 503 errors. The operations team reviews CloudWatch dashboards but sees no spike in CPU or memory. What is the MOST likely cause of the 503 errors?
101. A DevOps engineer is tasked with centralizing logs from multiple AWS accounts into a single Amazon OpenSearch Service domain. The engineer sets up Amazon Kinesis Data Firehose to deliver logs from each account to the OpenSearch domain. However, some accounts show failed deliveries in the Firehose console. Which configuration is MOST likely causing the failures?
102. A company wants to monitor CPU utilization of its EC2 instances and receive an alert when utilization exceeds 80% for 5 consecutive minutes. Which AWS service should be used to create this alarm?
103. A DevOps engineer notices that an Amazon RDS for MySQL instance's CPU is consistently high during business hours. The engineer wants to identify the specific queries causing the high CPU. Which combination of services should be used to capture and analyze the queries? (Choose the best answer.)
104. A company runs a containerized microservices application on Amazon ECS with Fargate. The operations team wants to collect custom application metrics (e.g., request latency, error counts) and send them to CloudWatch. The team wants to avoid managing any servers or agents. Which solution meets these requirements?
105. An organization wants to ensure that all API calls made in their AWS account are logged for security analysis. Which AWS service should be enabled to meet this requirement?
106. A company uses Amazon CloudWatch Logs to store application logs. A DevOps engineer needs to create a real-time dashboard that displays the count of ERROR-level log entries across all instances. Which approach is the MOST efficient and cost-effective?
107. A company runs a fleet of EC2 instances behind an Auto Scaling group. The DevOps team wants to detect and respond to memory leaks in their application. They have configured CloudWatch agent to collect memory metrics. However, the metric shows unpredictable spikes. The team needs to correlate these spikes with application logs to identify the root cause. Which solution provides the BEST correlation?
108. A DevOps engineer needs to monitor the number of messages in an Amazon SQS queue and trigger an Auto Scaling policy to add more EC2 instances when the queue depth exceeds a threshold. Which CloudWatch metric should the alarm use?
109. A DevOps engineer executes the above CloudWatch Logs Insights query. What will the output contain?
110. A company is deploying a new microservice on AWS Lambda. The DevOps team needs to monitor the function for errors and performance issues. Which TWO steps should the team take to set up effective monitoring?
111A company runs a critical application on Amazon ECS with Fargate. The application emits structured logs in JSON format. The DevOps team wants to monitor for specific error codes and receive near-real-time alerts. The team also needs to retain logs for 5 years for compliance. Which TWO steps should the team implement?
112A DevOps engineer is designing a monitoring solution for a multi-tier web application hosted on AWS. The application consists of an Application Load Balancer (ALB), EC2 instances, and an RDS database. The engineer needs to capture and analyze HTTP request logs from the ALB to understand client behavior and troubleshoot errors. Which THREE steps are necessary to achieve this?
113A company runs a critical e-commerce platform on AWS. The architecture includes an Application Load Balancer (ALB) that distributes traffic to a fleet of EC2 instances in an Auto Scaling group. The EC2 instances run a custom Java application that uses an RDS for MySQL database and an ElastiCache Redis cluster for session caching. The DevOps team has set up CloudWatch alarms for CPU utilization, memory, and database connections. Recently, customers have been reporting slow page load times and occasional timeouts. The team notices that during peak hours, the ALB's TargetResponseTime metric spikes, and the number of healthy hosts in the target group fluctuates. The CPU and memory metrics on the EC2 instances remain within normal ranges. The database CPU is also normal. The team suspects the issue is related to the application's session management. Which course of action should the DevOps team take to identify the root cause?
114A company runs a serverless application using AWS Lambda, Amazon API Gateway, and Amazon DynamoDB. The application is used by thousands of users. Recently, the operations team noticed an increase in 5xx errors from API Gateway. The team has enabled CloudWatch Logs for the Lambda functions and API Gateway. They see the errors are sporadic and not correlated with high traffic. The Lambda function's error count in CloudWatch is also increasing. The team wants to identify the specific requests that are failing and understand the error details. Which solution should the team implement?
115A company is using Amazon RDS for MySQL and needs to monitor slow queries to optimize database performance. The team has enabled slow query logs and wants to centralize logging in Amazon CloudWatch Logs for real-time analysis and alerting. Which solution meets these requirements with minimal operational overhead?
116. A DevOps engineer is troubleshooting a production issue where an Application Load Balancer (ALB) is returning 503 errors. The ALB targets are EC2 instances in an Auto Scaling group. The engineer checks the ALB access logs in Amazon S3 and finds that the ALB is healthy. However, the 503 errors persist. Which configuration should the engineer check next?
117. A company runs a microservices architecture on Amazon ECS with Fargate. The operations team wants to collect custom application metrics (e.g., request latency per service) and visualize them in CloudWatch dashboards. The team also needs to set CloudWatch alarms based on these metrics. Which solution requires the LEAST amount of code changes and operational overhead?
118. A company uses AWS CloudTrail to log all API calls in their AWS account. They need to ensure that any changes to CloudTrail configuration (such as disabling the trail or modifying the log file validation) are immediately detected and trigger an automated response. Which solution should the DevOps engineer implement?
119. A company has a critical application running on Amazon EC2 instances behind an Application Load Balancer. The application is experiencing intermittent latency spikes. The DevOps team has enabled detailed monitoring on the EC2 instances and is using CloudWatch metrics. They notice that CPU utilization and network traffic are normal during the spikes. Which additional diagnostic step should the team take to identify the root cause?
120. A company is using AWS Lambda functions for data processing. The operations team needs to monitor the number of invocations, duration, and error counts for each function. They also want to set alarms when the error rate exceeds 5% in a 5-minute period. Which combination of AWS services should the team use to achieve this with minimal effort?
121. A DevOps engineer is setting up monitoring for an Amazon S3 bucket that stores sensitive data. The engineer needs to be notified whenever an object in the bucket is accessed by a user or application, including read and write operations. Which AWS service should the engineer use to capture these events and trigger notifications?
122. A company uses Amazon CloudWatch Logs to centralize logs from multiple EC2 instances running a web application. The DevOps team needs to create a metric filter that parses logs for HTTP status codes (e.g., 4xx and 5xx) and increments a metric. Additionally, they need to create a CloudWatch alarm on the error count. Which of the following are required to achieve this? (Select TWO.)
123. A DevOps engineer is designing a monitoring solution for a multi-tier web application hosted on AWS. The application consists of an Application Load Balancer (ALB), a fleet of EC2 instances in an Auto Scaling group, and an Amazon RDS database. The engineer needs to monitor the health of each component and receive alerts when any component becomes unhealthy. Which of the following CloudWatch metrics should the engineer monitor? (Select THREE.)
124. A company uses Amazon CloudWatch Logs to store application logs. They have a requirement to retain logs for 90 days for operational analysis and then archive them to Amazon S3 for compliance purposes for an additional 5 years. Which of the following steps are necessary to meet this requirement? (Select TWO.)
125. Refer to the exhibit. A DevOps engineer runs the above CloudWatch Logs Insights query on a log group containing application logs. The query returns an empty result set. The engineer knows that the application logs contain ERROR entries. Which of the following is the most likely cause?
126. A company runs a critical e-commerce application on AWS. The architecture includes an Application Load Balancer (ALB) in front of an Auto Scaling group of EC2 instances running a web server, and an Amazon RDS MySQL Multi-AZ database. The DevOps team has implemented CloudWatch dashboards to monitor key metrics. Recently, customers have reported that the website becomes unresponsive for a few minutes during peak traffic hours. The team reviews the CloudWatch metrics and observes that during the incidents, the ALB's 'TargetResponseTime' metric spikes, and the RDS 'ReadLatency' and 'WriteLatency' metrics also spike. However, the EC2 CPU utilization and memory usage remain normal. The ALB health check shows 'Healthy' for all targets. The team needs to identify the root cause. Which course of action should the team take?
127. A DevOps engineer is tasked with setting up monitoring for a serverless application that uses AWS Lambda, Amazon API Gateway, and Amazon DynamoDB. The engineer needs to create a centralized dashboard that displays the number of Lambda invocations, API Gateway request counts, and DynamoDB consumed read/write capacity units. The dashboard should be accessible to the operations team without requiring AWS Management Console login. The engineer also wants to set up email alerts when the DynamoDB consumed capacity exceeds 80% of the provisioned capacity. Which solution meets these requirements with the LEAST operational overhead?
128. A company has deployed a containerized application on Amazon ECS with Fargate. The application is fronted by an Application Load Balancer (ALB). The DevOps team is using CloudWatch Container Insights to monitor the ECS cluster. They notice that the 'MemoryUtilized' metric for the service is consistently above 80%, and the 'CPUUtilized' is around 50%. The ALB's 'TargetResponseTime' is increasing over time. The team wants to resolve the performance issue. Which action should the team take?
129. A company uses AWS CloudFormation to deploy infrastructure. The DevOps team wants to monitor the CloudFormation stack events to detect when stack creation or updates fail, and automatically send notifications to a Slack channel. The team has set up an Amazon SNS topic that sends messages to a Slack webhook via a Lambda function. Which solution should the team implement to trigger the SNS topic when a CloudFormation stack fails?
130. A company runs a critical e-commerce application on AWS. The application is deployed on Amazon EC2 instances behind an Application Load Balancer (ALB) in an Auto Scaling group. The instances store session data in an ElastiCache for Redis cluster. Recently, users have reported intermittent session timeouts during peak traffic hours. The operations team notices that CloudWatch alarms for the Redis cluster's CPUUtilization and Evictions metrics are frequently breaching thresholds. The team wants to resolve the issue without incurring unnecessary costs. Which solution should the team implement?
131. A DevOps engineer is responsible for monitoring a set of microservices running on Amazon ECS with Fargate. The services are fronted by an Application Load Balancer (ALB). The engineer needs to collect and analyze application logs centrally with minimal latency and operational overhead. The logs should be searchable and retainable for 90 days. Which solution meets these requirements?
132. A company runs a serverless application using AWS Lambda, Amazon API Gateway, and Amazon DynamoDB. The application processes financial transactions. The DevOps team needs to monitor for duplicate transactions that could occur due to retries. The team wants to set up an alert when the number of duplicate transaction attempts exceeds 10 in a 5-minute window. The application logs each transaction attempt with a unique transaction ID to CloudWatch Logs. What is the most efficient way to achieve this?
133. A media company runs a video transcoding pipeline on AWS. The pipeline uses AWS Step Functions to orchestrate multiple Lambda functions that transcode video files stored in Amazon S3. The company wants to implement a monitoring solution to track the progress of each workflow execution, including which step is currently running, the duration of each step, and any errors. The solution should provide near real-time visibility and allow the team to troubleshoot failed executions quickly. Which solution meets these requirements?
134. A company is using Amazon CloudWatch Logs to store application logs. The DevOps team wants to set up real-time monitoring for specific error patterns and trigger remediation actions. Which TWO services can process the log events in real time and invoke an AWS Lambda function for remediation? (Choose two.)
135. A DevOps team is designing a centralized logging solution for multiple AWS accounts. The team needs to collect logs from EC2 instances, Lambda functions, and VPC Flow Logs, and store them in a central account for analysis. The solution must be cost-effective and support near real-time log aggregation. Which THREE steps should the team take? (Choose three.)
136. A company runs a microservices architecture on Amazon EKS. The DevOps team wants to monitor application performance and detect anomalies in request latency. They need to collect metrics, logs, and traces from all services. Which THREE AWS services should the team use together to implement a complete observability solution? (Choose three.)
137. A company is using Amazon CloudWatch to monitor its production environment. The operations team receives alerts for the same underlying issue from multiple alarms, causing alert fatigue. The team wants to reduce noise and consolidate alerts into actionable notifications. Which TWO steps should the team take? (Choose two.)
138. A company is using a centralized logging solution with Amazon OpenSearch Service. The DevOps team notices that logs from some EC2 instances are missing. The CloudWatch agent is installed and configured on all instances. What should the team do to troubleshoot the issue?
139. A company is running a critical application on Amazon ECS with Fargate. The application generates custom metrics that are published to CloudWatch using the PutMetricData API. Recently, the metrics have been delayed by up to 5 minutes. The DevOps team needs to reduce the latency. What should the team do?
140. A DevOps engineer needs to set up an alert for when the CPU utilization of an EC2 instance exceeds 90% for 5 consecutive minutes. Which CloudWatch features should be used?
141. A company is using AWS X-Ray to trace requests through a microservices application. Some traces are incomplete, showing only the root segment without any subsegments. The application uses the X-Ray SDK for Java. What is the most likely cause?
142. A DevOps team needs to monitor failed API calls in their AWS account. They want to receive notifications when specific API actions, such as DeleteBucket, fail. Which service should they use?
143. A company wants to visualize the performance of their application running on EC2. They need to create a dashboard that shows CPU utilization, memory usage, and disk I/O. Which AWS service should they use?
144. A company is using Amazon RDS for MySQL and needs to monitor the number of slow queries. They have enabled slow query logs. How can they effectively monitor and alert on the number of slow queries per minute?
145. A DevOps engineer is setting up centralized logging for multiple AWS accounts. They need to collect VPC Flow Logs, CloudTrail logs, and application logs into a single Amazon S3 bucket. What is the most efficient approach?
146. A company wants to receive real-time notifications when their Auto Scaling group launches or terminates EC2 instances. Which AWS service should they use?
147. A company is using Amazon CloudWatch Logs to store application logs. The DevOps team needs to search and analyze logs from multiple EC2 instances in real time. Which TWO services can be used to achieve this? (Choose TWO.)
148. A company runs a containerized application on Amazon ECS with Fargate. They want to monitor the application logs and metrics. Which THREE steps should they take to collect and visualize this data? (Choose THREE.)
149. A DevOps engineer needs to monitor the health of a web application running on EC2 instances behind an Application Load Balancer (ALB). Which TWO ALB metrics should be monitored to detect application errors? (Choose TWO.)
150. A company is running a web application on Amazon EC2 instances behind an Application Load Balancer. The application is experiencing intermittent errors. The DevOps engineer needs to identify if the errors are caused by the application or the underlying infrastructure. Which solution provides the MOST detailed visibility into the application's behavior?
151. A DevOps team is using Amazon CloudWatch Logs to centralize logs from multiple EC2 instances running a custom application. The team notices that logs are missing from some instances intermittently. The CloudWatch agent configuration is identical across all instances. What is the MOST likely cause of the missing logs?
152. A company is using AWS CloudFormation to deploy a microservices architecture. The operations team wants to receive real-time notifications when any stack operation fails. Which TWO AWS services can be used together to achieve this?
153. A DevOps engineer is designing a monitoring solution for a multi-account AWS environment using AWS Organizations. The solution must collect logs from all accounts into a centralized Amazon S3 bucket for analysis. Which THREE steps are required to set up this centralized logging?
154. A company is running a critical application on Amazon ECS with Fargate launch type. The application writes logs to Amazon CloudWatch Logs. The DevOps team needs to set up an alert when the application generates more than 100 error logs in any 5-minute window. Which configuration should be used?
155. A DevOps engineer is troubleshooting an issue where an Amazon RDS for MySQL instance is experiencing high latency. The engineer wants to identify which queries are causing the problem. Which AWS service should be used?
156. A company uses AWS Lambda functions to process data from an Amazon SQS queue. The Lambda function sometimes fails due to timeouts. The DevOps team wants to monitor the number of function timeouts and receive alerts. What is the MOST efficient way to achieve this?
157. A DevOps team is using Amazon CloudWatch Synthetics canaries to monitor the availability of a web application. The canary is configured to run every 5 minutes. The team notices that the canary fails occasionally but the application is healthy. Which action will help identify if the failures are due to network issues between the canary and the application?
158. A company is using Amazon CloudWatch Logs to store application logs. The security team requires that logs are encrypted at rest using a customer-managed AWS KMS key. Which TWO steps are necessary to achieve this?
159. A DevOps engineer needs to monitor the number of messages in an Amazon SQS queue and trigger an auto scaling action when the queue depth exceeds a threshold. Which combination of services should be used?
160. A company is using AWS CloudFormation to manage infrastructure. The DevOps team wants to receive notifications when CloudFormation stack creation fails. Which AWS service should be used to capture the stack failure event and send a notification?
161. A company is running a production application on Amazon ECS with Fargate. The DevOps team needs to monitor the application's performance and set up alerts for high memory usage. Which THREE steps should the team take to achieve this?
162. A DevOps engineer is troubleshooting an application that runs on Amazon EC2 instances behind an Application Load Balancer. Users report intermittent 503 errors. CloudWatch metrics for the ALB show an increase in 'HTTPCode_ELB_5XX_Count' but the backend 'HealthyHostCount' remains stable. Which action should the engineer take to identify the root cause?
163. A company uses AWS CloudTrail to monitor API activity. The DevOps team needs to ensure that any deletion of an S3 bucket is detected in real time and triggers an automated response. Which combination of AWS services should be used to meet these requirements?
164. A company is running a critical application on Amazon RDS for PostgreSQL. The DevOps team needs to set up monitoring to detect when database connections exceed 80% of the maximum connections for more than 5 minutes. Which CloudWatch metric should be used to create an alarm?
165. A company uses Amazon ECS with Fargate for containerized applications. The DevOps team notices that some tasks are failing with 'OutOfMemoryError' but the CloudWatch metric 'MemoryUtilization' for the service shows values well below the task memory limit. What is the most likely cause of this discrepancy?
166. A DevOps engineer sets up a CloudWatch dashboard to monitor an application's performance. The application runs on EC2 instances in an Auto Scaling group. The engineer wants to display the average CPU utilization across all instances in the group. Which CloudWatch metric and statistic should be used?
167. A company uses AWS CloudFormation to deploy infrastructure. The DevOps team needs to receive notifications when stack creation fails. Which approach should be used to automate this monitoring?
168. An application running on Amazon EKS generates logs that need to be sent to CloudWatch Logs for central monitoring. The DevOps team deploys the CloudWatch agent as a DaemonSet in the cluster. However, logs from some pods are not appearing in CloudWatch. Which configuration issue is most likely causing this?
169. A DevOps engineer wants to receive an alert when the total number of error logs in an application exceeds 100 within a 5-minute period. The application writes logs to CloudWatch Logs. How can this be achieved?
170. A company uses AWS Lambda functions for data processing. The operations team notices that some functions are taking longer to execute than expected. They want to analyze the execution durations to identify functions that exceed the 75th percentile latency. Which CloudWatch feature should be used?
171. A DevOps team is designing a monitoring solution for a multi-tier web application running on AWS. The application consists of an Application Load Balancer, EC2 instances in an Auto Scaling group, and an RDS database. Which TWO approaches provide centralized logging and monitoring across all tiers?
172. A company uses AWS Organizations to manage multiple accounts. The DevOps team needs to monitor for any IAM user creation across all accounts in the organization. Which THREE steps should be taken to implement this centralized monitoring?
173. A DevOps engineer is troubleshooting a performance issue with an Amazon RDS for MySQL database. The engineer suspects that slow queries are causing high CPU utilization. Which TWO actions can the engineer take to identify the slow queries?
174. Refer to the exhibit. The IAM policy above is attached to an EC2 instance role. The CloudWatch agent on the instance is configured to send logs to the 'MyAppLogs' log group. However, logs are not appearing in CloudWatch. What is the most likely issue?
175. Refer to the exhibit. A DevOps engineer runs the AWS CLI command shown to retrieve the RequestCount metric for an ELB. The output shows datapoints with Sum values. What is the total number of requests received by the load balancer during the entire hour?
176. Refer to the exhibit. The IAM policy above is attached to a Lambda function's execution role. The Lambda function is supposed to publish custom metrics to CloudWatch using PutMetricData. However, the metrics are not appearing. What is the most likely reason?
177. A company runs a microservices application on Amazon ECS with Fargate. The operations team notices that some services are experiencing intermittent high latency, but CPU and memory metrics appear normal. They need to identify the root cause. Which approach should they use?
178. A DevOps team uses AWS CloudFormation to deploy a web application. They want to receive notifications when a stack update fails. Which combination of services should they use?
179. An application running on Amazon EC2 instances sends custom metrics to CloudWatch. The team notices that some metrics are not appearing. What is the most likely cause?
180. A company uses AWS CloudTrail to log API activity. The security team needs to be alerted when an IAM user creates a new access key. How can this be achieved with minimal overhead?
181. A company runs a critical application on Amazon RDS for PostgreSQL. The database experiences periodic slowdowns. The team wants to monitor the number of active connections and the query execution time. Which approach is most cost-effective?
182. A DevOps engineer needs to aggregate logs from multiple AWS accounts into a central account for analysis. Which service should they use?
183. An application running on AWS Lambda is experiencing cold starts. The team wants to monitor the cold start duration. What should they do?
184. A company uses Amazon S3 to store sensitive data. The security team wants to be notified when an S3 bucket policy is modified. Which approach is most efficient?
185. A DevOps team wants to monitor the disk space utilization on their EC2 instances. What is the simplest way to achieve this?
186. A company runs a web application on Amazon EC2 instances behind an Application Load Balancer (ALB). The application logs show that some requests are timing out. The team needs to identify the source of the issue. Which TWO steps should they take?
187. A DevOps engineer is designing a centralized logging solution for a multi-account AWS environment. The solution must be cost-effective and provide real-time log analysis. Which THREE services should they consider?
188. A company uses AWS Lambda for data processing. The operations team wants to be alerted when a function fails. Which TWO methods can they use?
189. A Lambda function is unable to write logs to CloudWatch Logs. The IAM policy attached to the function's execution role is shown above. What is the issue?
190. A DevOps engineer runs the command above to retrieve CPU utilization for an EC2 instance, but gets no data points. The instance is running and has basic monitoring enabled. What is the most likely reason?
191. A Lambda function is timing out. The log above shows a recent invocation. What is the most likely cause?
192. A company is using an Application Load Balancer (ALB) in front of an Auto Scaling group of EC2 instances. The operations team notices that the error rate on the ALB is increasing, but the CPU utilization on the EC2 instances remains low. Which CloudWatch metric should be examined to determine if the errors are due to a lack of healthy targets?
193. A DevOps engineer is configuring CloudWatch Logs for a Lambda function that processes streaming data from Kinesis. The function sometimes fails due to memory exhaustion. The engineer wants to ensure that logs from the function are shipped to CloudWatch Logs even when the function fails. Which configuration should be used?
194. A company uses Amazon CloudWatch to monitor its production environment. The DevOps team wants to receive an email notification whenever the average CPU utilization of any EC2 instance exceeds 90% for 5 consecutive minutes. Which steps should be taken to set up this notification?
195. A company runs a critical application on an Auto Scaling group of EC2 instances behind an Application Load Balancer (ALB). The DevOps team needs to implement a dashboard that shows real-time request latency, error rates, and the number of healthy hosts. Which AWS service should be used to create this dashboard?
196. A DevOps engineer is troubleshooting a slow web application. The application runs on EC2 instances behind an ALB. The engineer notices that the ALB's TargetResponseTime metric shows high p99 values, but the CPU and memory on the EC2 instances are well below thresholds. What is the most likely cause?
197. A company wants to monitor the number of messages that are published to an Amazon SNS topic. Which CloudWatch metric should be used?
198. A company has a microservices architecture with 50 services running on Amazon ECS. The DevOps team wants to collect and analyze logs from all services centrally. They need to query logs across services and set up alerts for error patterns. Which solution is the most scalable and cost-effective?
199. A DevOps engineer receives an alarm that an EC2 instance's StatusCheckFailed metric has been in ALARM state for 10 minutes. Which action should the engineer take first to investigate?
200. A company wants to receive a notification when an AWS IAM user creates a new access key. Which AWS service should be used to capture this event and trigger a notification?
201. A DevOps team is setting up centralized logging for a multi-account AWS environment. They want to aggregate logs from all accounts into a single S3 bucket. Which services should be used to achieve this? (Choose TWO.)
202. A company uses Amazon RDS for MySQL and wants to monitor slow queries to optimize performance. Which actions should the DevOps engineer take to capture and analyze slow query logs? (Choose THREE.)
203. A company is using AWS CloudFormation to deploy infrastructure. The DevOps team wants to receive notifications when a stack creation fails. Which services can be used together to send an email notification on stack failure? (Choose TWO.)
204. A DevOps engineer notices that a CloudWatch alarm for high CPU utilization on an EC2 instance is not triggering despite the CPU consistently above the threshold. The instance is in a VPC with a public subnet and has internet access. What is the most likely cause?
205. An application running on Amazon ECS Fargate writes logs to CloudWatch Logs. The logs include sensitive data such as credit card numbers, which must be masked before storage. What is the most cost-effective solution that requires the least operational overhead?
206. A company uses AWS CloudTrail to log all API calls across multiple accounts in AWS Organizations. The DevOps team wants to detect and alert on any IAM user who creates an access key and then uses it to make API calls within 24 hours, as this may indicate a compromised account. Which combination of actions should be taken to achieve this with minimal latency?
207. A DevOps engineer needs to monitor the memory utilization of an Amazon RDS for MySQL instance. Which AWS service should be used to collect and visualize this metric?
208. A company uses AWS Lambda functions behind an Amazon API Gateway REST API. The DevOps team wants to monitor the end-to-end latency of API requests, including the time spent in API Gateway and Lambda. Which approach provides the most granular breakdown?
209. A company has a production Amazon EKS cluster with multiple node groups. The DevOps team notices that some pods are frequently restarting due to OOMKilled errors, but the cluster-level metrics (CPU, memory) appear normal. Which CloudWatch Container Insights metric should be analyzed to identify the specific node or pod causing the issue?
210. A DevOps engineer needs to centrally collect and analyze logs from multiple AWS accounts and on-premises servers. Which AWS service should be used to aggregate logs in a single dashboard?
211. A company uses AWS CloudFormation to deploy infrastructure. The DevOps team wants to receive notifications when a stack creation fails due to a resource limit exceeded error. Which approach should be used?
212. A DevOps team is troubleshooting a performance issue where an Amazon RDS for PostgreSQL instance's CPU utilization spikes every hour. The team suspects a specific query from an application. Which combination of tools can identify the problematic query?
213. A DevOps engineer wants to monitor the health of an Auto Scaling group and receive notifications when instances are launched or terminated. Which TWO AWS services can be used together to achieve this?
214. A company is using Amazon CloudWatch Logs to store application logs. The DevOps team needs to search across multiple log groups and visualize trends. Which TWO services can be used together to achieve this?
215. A DevOps engineer is designing a centralized logging solution for 10 AWS accounts. Logs must be stored in a central S3 bucket with encryption and access logging. Which THREE services/resources are required to meet these requirements?
216. Refer to the exhibit. An IAM policy is attached to an EC2 instance role. The application on the instance fails to write logs to CloudWatch Logs in the log group 'MyAppLogs'. What is the most likely cause?
217. Refer to the exhibit. A DevOps engineer checks the CloudWatch alarm configuration and state. The alarm is in ALARM state for CPUUtilization averaging 90% over 5 minutes, but no notification was received. What is the most likely reason?
218. Refer to the exhibit. A CloudFormation template deploys a Lambda function with X-Ray tracing enabled. However, traces are not appearing in the X-Ray console. What is the most likely missing configuration?
219. A company is running a critical web application on Amazon EC2 instances behind an Application Load Balancer (ALB) with Auto Scaling. The operations team notices that the application's error rate spiked for 10 minutes last night, but no CloudWatch alarm was triggered. The team has a CloudWatch alarm on the ALB's 'HTTPCode_Target_5XX_Count' metric with a threshold of 100 over 5 consecutive periods of 1 minute. What is the MOST likely reason the alarm did not trigger?
220. A DevOps engineer is troubleshooting an AWS Lambda function that processes messages from an Amazon SQS queue. The function is configured with a reserved concurrency of 5 and a batch size of 10. The SQS queue has a visibility timeout of 30 seconds, and the Lambda function typically completes processing each batch in 10 seconds. Recently, the engineer noticed that messages are repeatedly processed, causing duplicates. The CloudWatch Logs show that the function is experiencing throttling errors. What is the MOST likely cause of the duplicate processing?
221. A company wants to centralize logging from multiple AWS accounts into a single Amazon S3 bucket for long-term storage and analysis. The logs include AWS CloudTrail, VPC Flow Logs, and Amazon RDS audit logs. Which solution is the MOST operationally efficient?
222. An e-commerce platform uses Amazon DynamoDB as its primary database. During a flash sale, the application experienced high read latency. The DevOps team wants to set up CloudWatch alarms to detect high read latency proactively. The team has enabled DynamoDB Accelerator (DAX) for caching. Which metric should the team use to create a CloudWatch alarm for read latency?
223. A company runs a microservices application on Amazon ECS with Fargate launch type. The application uses an Application Load Balancer (ALB) to distribute traffic. The DevOps team wants to monitor the number of HTTP 5xx errors returned by each service. They configure the ALB to send access logs to an S3 bucket and enable CloudWatch Container Insights. However, the team cannot view 5xx errors per service. What should the team do to achieve this?
224. A company is running a batch processing job on Amazon EMR that writes results to an Amazon S3 bucket. The job runs daily and takes about 2 hours. The DevOps team wants to be alerted if the job fails or takes longer than 3 hours. Which solution is the MOST cost-effective and operationally efficient?
225. A DevOps engineer is setting up monitoring for an Amazon RDS for PostgreSQL instance. The engineer wants to track the number of active database connections over time to plan for scaling. Which approach should the engineer use?
226. A company is using AWS CloudFormation to deploy infrastructure. The DevOps team wants to receive notifications when stack operations fail. They create an Amazon SNS topic and subscribe the team's email. Then they configure CloudFormation to send notifications to the SNS topic. However, no notifications are received when a stack creation fails. What is the MOST likely reason?
227. A company runs a web application on Amazon EC2 instances behind an Application Load Balancer (ALB). The application uses a custom health check endpoint '/health'. The DevOps team notices that the ALB is marking some instances as unhealthy even though the application is running fine. The team checks the security groups and network ACLs and confirms they allow traffic. What should the team check next?
228. A company is using Amazon CloudWatch Logs to collect logs from multiple applications. The DevOps team wants to create a metric filter to count the number of ERROR log entries and trigger an alarm when the count exceeds 10 in 5 minutes. Which TWO steps must the team take? (Choose TWO.)
229. A DevOps engineer is investigating a performance issue with an Amazon RDS for MySQL instance. The engineer has enabled Performance Insights and CloudWatch Enhanced Monitoring. Which THREE metrics should the engineer examine to identify whether the issue is due to a resource bottleneck? (Choose THREE.)
230. A company runs a critical application on Amazon ECS with Fargate. The DevOps team wants to set up a metric to track the number of tasks running. Which TWO steps are required to achieve this? (Choose TWO.)
231. A DevOps engineer is troubleshooting why an AWS Lambda function is not writing logs to the CloudWatch Logs log group 'MyAppLogs'. The Lambda function's execution role includes the IAM policy shown in the exhibit. What is the MOST likely reason the logs are not being written?
232A company runs a containerized application on Amazon ECS with Fargate launch type. The application consists of three microservices: frontend, backend, and database. The ECS cluster is in a VPC with public and private subnets. The frontend service is publicly accessible via an Application Load Balancer (ALB) in public subnets. The backend service communicates with the database service, which runs as a stateful service with persistent storage using Amazon EFS. The DevOps team is using CloudWatch Container Insights and has enabled Prometheus metrics for the ECS cluster. Recently, the team observed that the frontend service's response time has increased significantly, and some requests are timing out. The team checked the ALB metrics and saw an increase in 5xx errors. They also noticed that the backend service's CPU utilization is high, and the database service's disk I/O is high. The team suspects a bottleneck in the backend service. Which course of action should the team take FIRST to identify the root cause?
233A company uses AWS CloudFormation to deploy a three-tier web application. The stack includes an Application Load Balancer (ALB), an Auto Scaling group of EC2 instances, and an Amazon RDS Multi-AZ database. The DevOps team has configured the EC2 instances to send application logs to CloudWatch Logs using the CloudWatch agent. They also set up a CloudWatch alarm on the ALB's 5xx error count. During a recent deployment, the team noticed that the alarm did not trigger even though the application was returning 5xx errors. The team verified that the CloudWatch agent is running on the instances and logs are appearing in CloudWatch Logs. What should the team do to ensure the alarm triggers correctly?
234A company is running a critical application on Amazon EC2 instances behind an Application Load Balancer (ALB). The operations team notices that the application's error rate has increased significantly in the last 30 minutes, but they are unable to identify the root cause because the metrics are aggregated across all instances. Which solution would provide the MOST granular visibility into individual instance performance?
235A DevOps engineer is troubleshooting an AWS Lambda function that processes messages from an Amazon SQS queue. The function is invoked successfully, but it frequently times out after 15 seconds. The function's CloudWatch Logs show that the timeout occurs while the function is making an HTTP request to an external API. The function's reserved concurrency is set to 5, and the SQS queue has a visibility timeout of 30 seconds. Which change would MOST effectively reduce the number of timeouts?
236A company uses Amazon CloudWatch Logs to store application logs from EC2 instances. The security team requires that logs be retained for 5 years for compliance. Which action should be taken to meet this requirement cost-effectively?
237A company is using Amazon CloudWatch Synthetics to monitor the availability of a web application. The canary runs every 5 minutes from multiple locations. Recently, the canary has been failing intermittently with HTTP 503 errors, but the application team reports that the application is healthy. Which step should the DevOps engineer take to identify the cause of the false positives?
238Refer to the exhibit. An AWS Lambda function has the IAM policy shown. The function is intended to write logs to CloudWatch Logs and publish custom metrics to CloudWatch. However, the function is failing to publish custom metrics. What is the MOST likely cause?
239A DevOps engineer needs to set up a centralized logging solution for multiple AWS accounts. The logs must be stored in a central Amazon S3 bucket for long-term retention and analysis. Which combination of services should the engineer use?
240A company is using Amazon CloudWatch Logs Insights to analyze application logs. The DevOps team needs to create a metric filter that counts occurrences of the word 'ERROR' in the log events. Which CloudWatch Logs Insights query should be used to test the metric filter?
241A company is using Amazon CloudWatch to monitor a production environment. The DevOps team wants to receive notifications when the CPU utilization of an EC2 instance exceeds 90% for 5 consecutive minutes. Which TWO steps should the team take to achieve this? (Choose TWO.)
242A company is running a microservices application on Amazon ECS with AWS Fargate. The operations team wants to collect and visualize metrics such as CPU, memory, and network utilization at the task level. Which TWO services should the team use to achieve this? (Choose TWO.)
243A company uses Amazon CloudWatch Logs to store application logs. The security team requires that all logs be encrypted at rest using a customer-managed AWS KMS key. Which THREE steps are necessary to meet this requirement? (Choose THREE.)
244A company is using AWS CloudTrail to log API activity across multiple accounts. The security team wants to ensure that all CloudTrail logs are delivered to a central Amazon S3 bucket and that the logs are encrypted and cannot be deleted. Which THREE steps should the team take to meet these requirements? (Choose THREE.)
245A DevOps engineer needs to set up a monitoring solution for an application running on Amazon EKS. The application emits custom metrics that need to be stored in Amazon CloudWatch and visualized on a dashboard. Which THREE steps should the engineer take? (Choose THREE.)
246A company runs a containerized web application on Amazon ECS with AWS Fargate. The application is critical and requires high availability. The DevOps team has set up an Amazon CloudWatch alarm that triggers an auto scaling action when the average CPU utilization exceeds 75% for 5 minutes. However, during a recent traffic spike, the application became slow and some requests timed out, even though the CloudWatch alarm did not fire. The team checked the ECS service auto scaling configuration and found that the target tracking scaling policy based on average CPU utilization is set with a target value of 75%. The ECS service is configured with a minimum of 2 tasks and a maximum of 10 tasks. Upon investigation, they noticed that the CPU utilization metric for the service remained below 75% during the spike, but the memory utilization was high (over 90%). The application logs show that the tasks were running out of memory, causing garbage collection pauses and slow responses. Which course of action should the DevOps engineer take to prevent this issue in the future?
247. A company uses AWS Lambda functions to process streaming data from Amazon Kinesis Data Streams. The Lambda function processes records in batches and writes the results to an Amazon DynamoDB table. Recently, the operations team noticed that the Lambda function is experiencing a high number of throttling errors (HTTP 400) when writing to DynamoDB. The DynamoDB table has on-demand capacity mode enabled. The CloudWatch metrics show that the DynamoDB consumed write capacity is well below the table-level throughput limits, but the Lambda function's error rate is increasing. The Lambda function's reserved concurrency is set to 100, and the function's timeout is 1 minute. The Kinesis stream has 10 shards. What is the MOST likely cause of the throttling errors?
248. A DevOps engineer is responsible for monitoring an AWS environment that includes multiple EC2 instances running a web application. The engineer needs to set up a solution that sends an email alert when the average CPU utilization across all instances exceeds 80% for 10 consecutive minutes. The engineer has created a CloudWatch alarm with the metric `CPUUtilization` aggregated across all instances using the statistic `Average` and a period of 5 minutes. The alarm is set to trigger when the metric exceeds 80% for 2 consecutive periods (10 minutes). The alarm's action is configured to send a notification to an Amazon SNS topic that has an email subscription. However, the engineer is not receiving the email alerts. The engineer verified that the SNS topic exists and the email subscription is confirmed. The CloudWatch alarm shows that the metric value exceeded the threshold for 2 periods, but the alarm state is still 'OK'. What is the MOST likely reason for this?
249. A DevOps engineer notices that a critical Lambda function occasionally times out. The engineer wants to monitor the function's duration and log the timeout errors for analysis. Which TWO steps should the engineer take to achieve this? (Select TWO.)
250. A company runs a web application on Amazon EC2 instances behind an Application Load Balancer. The operations team wants to analyze application access logs and error rates. They need to identify the top IP addresses making requests, as well as the distribution of HTTP status codes over time. Which THREE steps should the team take to achieve this? (Select THREE.)
251. A company runs a production web application on Amazon EC2 instances that are part of an Auto Scaling group. The instances are behind an Application Load Balancer. The DevOps team has enabled detailed CloudWatch metrics and set up a CloudWatch dashboard to monitor the application. Recently, the team noticed that the CPU Utilization metric for the Auto Scaling group shows a spike every day at 2:00 PM, but the application performance remains normal. The team wants to investigate the cause of the CPU spike. What should the team do FIRST to identify the root cause?
252. A company has a microservices architecture running on Amazon ECS with Fargate. The operations team uses Amazon CloudWatch Container Insights to monitor the cluster. They notice that one of the services is experiencing high memory utilization, causing occasional task failures. The team wants to set up proactive monitoring to receive alerts when memory utilization exceeds 80% for more than 5 minutes. They also want to automate the response by replacing the failing tasks. The team has already created a CloudWatch alarm on the MemoryUtilized metric. Which additional steps should the team take to achieve the desired proactive monitoring and automated response?
253. A company runs a multi-region application on Amazon EC2 instances across us-east-1 and eu-west-1. The application uses an Amazon Aurora global database for writes in us-east-1 and reads in eu-west-1. The DevOps team wants to monitor the replication lag between the primary and secondary regions. They have set up a CloudWatch alarm on the AuroraReplicaLag metric in both regions. However, they notice that the alarm in eu-west-1 sometimes triggers false positives when the lag spikes briefly but then recovers. The team wants to reduce false alarms while still being alerted to sustained high lag that could impact read replicas. The team is already using a standard CloudWatch alarm with a period of 1 minute and evaluation periods of 1. What should the team change to reduce false positives?
254. A company runs a serverless application using AWS Lambda and Amazon API Gateway. The application processes user uploads to an S3 bucket. The operations team uses CloudWatch Logs for monitoring, but they are finding it difficult to correlate logs across multiple Lambda functions that handle different parts of the workflow. The team wants to trace requests as they flow through the application and identify bottlenecks or errors. The team has already enabled CloudWatch Logs for all Lambda functions. What should the team do to achieve end-to-end request tracing?
255. A company runs a critical application on Amazon EKS. The DevOps team uses Prometheus for monitoring and Grafana for visualization. The team has set up a Prometheus server on an EC2 instance to scrape metrics from the EKS cluster. However, they are experiencing high memory usage on the Prometheus server, and some metrics are being dropped because of the retention period. The team wants to implement a scalable and managed monitoring solution that can store metrics for longer durations without the operational overhead of managing the Prometheus server. The team also wants to retain the ability to use PromQL queries and Grafana dashboards. What should the team do?
256. A company runs a web application on an Auto Scaling group of EC2 instances. The operations team uses CloudWatch alarms to monitor the application. They have set up a CPUUtilization alarm that triggers when the average CPU exceeds 70% for 5 minutes. The alarm triggers a scaling policy to add instances. Recently, the team noticed that the alarm frequently triggers during the day, but the application performance is acceptable. They suspect the alarm is too sensitive and want to reduce the number of false alarms. The team wants to keep the alarm responsive to real CPU spikes but avoid triggering on short bursts. What should the team change in the alarm configuration?
257. Refer to the exhibit. A DevOps engineer runs this query to investigate a spike in errors. What is the most likely interpretation?
258. Refer to the exhibit. An alarm is configured as shown. The CPU utilization averages 85% for 10 minutes, then spikes to 95% for the next 5 minutes, and returns to 80%. How many times will the SNS topic receive a notification?
259. Refer to the exhibit. A security engineer runs this AWS Config query. What is the intended purpose?
260. Refer to the exhibit. A security team reviews this CloudTrail log entry. Which finding is most concerning?
261. Refer to the exhibit. A network engineer reviews VPC Flow Logs. Which statement about the traffic is correct?
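Several of the alarm questions above turn on how a threshold is evaluated over a number of consecutive periods. The sketch below is a simplified Python model of that idea, not CloudWatch's actual implementation: the function name `alarm_fires` and the sample datapoints are hypothetical, and real alarms also depend on settings such as "Datapoints to Alarm" and missing-data treatment, which this model ignores.

```python
from typing import Sequence


def alarm_fires(datapoints: Sequence[float], threshold: float,
                evaluation_periods: int) -> bool:
    """Simplified model: True once `evaluation_periods` consecutive
    datapoints are all strictly greater than `threshold`."""
    consecutive = 0
    for value in datapoints:
        # Extend the breaching run, or reset it on a non-breaching datapoint.
        consecutive = consecutive + 1 if value > threshold else 0
        if consecutive >= evaluation_periods:
            return True
    return False


# Hypothetical per-minute 5xx counts for a 10-minute error spike.
spiky_counts = [120, 130, 90, 140, 150, 95, 160, 170, 110, 80]
# Hypothetical sustained breach: five minutes in a row above 100.
sustained_counts = [120, 125, 130, 135, 140]

print(alarm_fires(spiky_counts, threshold=100, evaluation_periods=5))      # False
print(alarm_fires(sustained_counts, threshold=100, evaluation_periods=5))  # True
```

In this model, the spiky series never produces five breaching datapoints in a row even though most minutes exceed the threshold, while the sustained series does. That illustrates one way an error spike can go unnoticed by an alarm that requires every period in the window to breach.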
This page covers the Monitoring and Logging domain of the DOP-C02 exam blueprint published by Amazon Web Services. Courseiva provides free domain-focused practice, mock exams, missed-question review, and readiness tracking across all DOP-C02 domains, with no account required.
The Courseiva DOP-C02 question bank contains 261 questions in the Monitoring and Logging domain. Click any question to see the full explanation and answer breakdown.
Start with a 10-question focused session to identify your baseline accuracy in this domain. Read every explanation — even for questions you answer correctly — to understand the reasoning. Once you score consistently above 80%, move to a 20–30 question session to confirm depth before moving to the next domain.
The session launcher on this page draws questions exclusively from the Monitoring and Logging domain. Choose 10, 20, 30, or 50 questions for a focused session, or click individual questions to review them one by one.