CCNA Soa Monitoring Logging Questions — Page 3 of 5

151

MCQ, medium

A company uses AWS CloudFormation to deploy a multi-tier application. The stack includes an Application Load Balancer, Auto Scaling group, and RDS database. The SysOps administrator receives a notification that a stack update has failed. The administrator wants to investigate the failure and understand which resource caused the issue. The stack is in the UPDATE_ROLLBACK_IN_PROGRESS state. What should the administrator do to identify the failed resource?

A. Review the stack's template in the CloudFormation console to check for syntax errors.

B. Check the CloudWatch Logs for the EC2 instances in the Auto Scaling group.

C. Manually re-run the update with the same parameters to see if the error recurs.

D. View the stack events in the CloudFormation console to see which resource failed and the error message.

Answer: D

Stack events provide detailed information about each resource operation.

Why this answer

When a CloudFormation stack update fails and enters UPDATE_ROLLBACK_IN_PROGRESS, the most direct way to identify the failed resource is to view the stack events in the CloudFormation console. Each event includes a status reason field that contains the specific error message and the logical resource ID of the resource that caused the failure, allowing the administrator to pinpoint the issue without additional investigation.
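As a rough illustration of reading stack events, the sketch below picks the earliest failed event from a list, since later failures are often just side effects of the first one. The field names follow the shape of CloudFormation stack events (Timestamp, LogicalResourceId, ResourceStatus, ResourceStatusReason); the sample events, resource names, and error text are hypothetical.

```python
from datetime import datetime, timezone


def first_failure(events):
    """Return the earliest event whose status ends in _FAILED, or None."""
    failed = [e for e in events if e["ResourceStatus"].endswith("_FAILED")]
    return min(failed, key=lambda e: e["Timestamp"]) if failed else None


# Hypothetical sample events, newest first, as the console lists them.
sample_events = [
    {
        "Timestamp": datetime(2024, 1, 1, 12, 5, tzinfo=timezone.utc),
        "LogicalResourceId": "AppStack",
        "ResourceStatus": "UPDATE_ROLLBACK_IN_PROGRESS",
        "ResourceStatusReason": "The following resource(s) failed to update: [Database].",
    },
    {
        "Timestamp": datetime(2024, 1, 1, 12, 4, tzinfo=timezone.utc),
        "LogicalResourceId": "Database",
        "ResourceStatus": "UPDATE_FAILED",
        "ResourceStatusReason": "Example error text returned by the service",
    },
    {
        "Timestamp": datetime(2024, 1, 1, 12, 1, tzinfo=timezone.utc),
        "LogicalResourceId": "WebServerGroup",
        "ResourceStatus": "UPDATE_COMPLETE",
    },
]

culprit = first_failure(sample_events)
print(culprit["LogicalResourceId"], "-", culprit["ResourceStatusReason"])
```

The same filter could be applied to events fetched from the API or CLI instead of the console; only the source of the list changes.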

Exam trap

The trap here is that candidates may assume the failure is due to a template syntax error (Option A) or that application logs (Option B) would reveal the issue, when in fact CloudFormation events are the authoritative source for resource-level failure details during stack operations.

How to eliminate wrong answers

Option A is wrong because syntax errors in the template would typically cause the update to fail before it begins (e.g., during validation), not during the update process itself; the stack is already in UPDATE_ROLLBACK_IN_PROGRESS, meaning the template was valid enough to start the update. Option B is wrong because CloudWatch Logs for EC2 instances in the Auto Scaling group would only show application-level or OS-level logs, not CloudFormation resource provisioning failures; the failure is at the infrastructure layer, not within the instances. Option C is wrong because manually re-running the update with the same parameters is risky and inefficient; it could cause the same failure again or trigger additional rollbacks, and it does not leverage the existing event data that already contains the error details.

152

MCQ, easy

Refer to the exhibit. An application running on EC2 is using the AWS SDK to publish custom metrics to CloudWatch. The application fails to publish metrics. The IAM role attached to the EC2 instance has this policy. What is the issue?

A. The condition key 'cloudwatch:namespace' is misspelled.

B. The policy does not specify a specific resource ARN.

C. The application may be using a different namespace than 'MyApp'.

D. The action 'cloudwatch:PutMetricData' is not allowed for custom metrics.

Answer: C

The condition restricts to a specific namespace.

Why this answer

The IAM policy explicitly allows 'cloudwatch:PutMetricData' on the condition that the namespace is 'MyApp'. If the application's AWS SDK code publishes metrics under any other namespace (e.g., a custom namespace like 'MyOtherApp'), the condition fails and the API call is denied. This is the most likely cause of the failure, as the policy is otherwise correctly configured for the specified namespace.
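The exhibit itself is not reproduced here, so the following is only a toy model of the behavior described above. It assumes the policy statement uses a StringEquals condition on 'cloudwatch:namespace' with the value 'MyApp'; the function name and the statement dictionary are illustrative and do not reflect how IAM evaluates policies internally.

```python
def namespace_allowed(statement, action, namespace):
    """Toy check of a single Allow statement with a namespace condition."""
    actions = statement["Action"]
    if isinstance(actions, str):
        actions = [actions]
    if statement["Effect"] != "Allow" or action not in actions:
        return False
    required = (
        statement.get("Condition", {})
        .get("StringEquals", {})
        .get("cloudwatch:namespace")
    )
    # No condition means no namespace restriction; otherwise require an exact match.
    return required is None or namespace == required


# Assumed shape of the policy statement from the missing exhibit.
statement = {
    "Effect": "Allow",
    "Action": "cloudwatch:PutMetricData",
    "Resource": "*",
    "Condition": {"StringEquals": {"cloudwatch:namespace": "MyApp"}},
}

for ns in ("MyApp", "MyOtherApp"):
    print(ns, namespace_allowed(statement, "cloudwatch:PutMetricData", ns))
```

Under this assumption, only calls that name the namespace exactly as 'MyApp' pass the check.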

Exam trap

The trap here is that candidates often assume a policy with a condition key is always correct, overlooking that the application's actual namespace value must exactly match the condition value for the API call to succeed.

How to eliminate wrong answers

Option A is wrong because 'cloudwatch:namespace' is a valid condition key for CloudWatch PutMetricData; it is not misspelled. Option B is wrong because PutMetricData does not support resource-level permissions, so the policy must use 'Resource': '*' and rely on the condition key for any restriction. Option D is wrong because the action 'cloudwatch:PutMetricData' is explicitly allowed for custom metrics when the namespace condition is met; the issue is not that the action is disallowed entirely.

153

MCQ, hard

A SysOps admin is investigating why a CloudWatch alarm did not trigger an SNS notification when a metric breached the threshold. The alarm state is visible in the console as 'ALARM'. What is the most likely reason the notification was not sent?

A. The SNS topic's subscription is not confirmed

B. The alarm name contains special characters

C. The alarm's evaluation period is set to 1 minute

D. The metric has a resolution of 1 minute

Answer: A

Email subscriptions require confirmation; if not confirmed, notifications are not delivered.

Why this answer

The most likely reason the notification was not sent is that the SNS topic's subscription is not confirmed. When an SNS topic sends a notification to an endpoint such as email or HTTP/S, the subscription must first be confirmed by the subscriber. If the subscription remains in a 'Pending confirmation' state, SNS will not deliver messages to that endpoint, even if the CloudWatch alarm transitions to the ALARM state and publishes to the topic.
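A quick way to spot this condition is to look at a topic's subscription list, where an unconfirmed subscription carries the placeholder value "PendingConfirmation" instead of a real subscription ARN. The sketch below filters such entries from a list shaped like that output; the endpoints, account ID, and topic name are hypothetical.

```python
def pending_subscriptions(subscriptions):
    """Return the endpoints of subscriptions that were never confirmed."""
    return [
        s["Endpoint"]
        for s in subscriptions
        if s["SubscriptionArn"] == "PendingConfirmation"
    ]


# Hypothetical subscription list for an alarm topic.
subscriptions = [
    {
        "Protocol": "email",
        "Endpoint": "oncall@example.com",
        "SubscriptionArn": "PendingConfirmation",
    },
    {
        "Protocol": "email",
        "Endpoint": "ops@example.com",
        "SubscriptionArn": "arn:aws:sns:us-east-1:111122223333:alarms:11111111-2222-3333-4444-555555555555",
    },
]

print(pending_subscriptions(subscriptions))
```

Any endpoint this returns will receive nothing until its owner completes the confirmation step.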

Exam trap

The trap here is that candidates assume any alarm in ALARM state will automatically trigger its configured SNS action, overlooking the requirement that the SNS subscription must be in a confirmed state before messages can be delivered.

How to eliminate wrong answers

Option B is wrong because CloudWatch alarm names can contain special characters (e.g., hyphens, underscores, spaces) without affecting notification delivery; the alarm name is simply a label and does not impact SNS publishing. Option C is wrong because setting the evaluation period to 1 minute does not prevent notifications; it only affects how quickly the alarm evaluates metric data and transitions state. Option D is wrong because a metric resolution of 1 minute (standard resolution) is normal and does not interfere with alarm actions or SNS notifications; high-resolution metrics (1 second) are also supported without issue.

154

Multi-Select, easy

A SysOps administrator is creating a monitoring solution for a web application that uses an Application Load Balancer (ALB) and an Auto Scaling group of EC2 instances. The administrator wants to monitor the average request count per minute and the number of healthy hosts. Which TWO CloudWatch metrics should the administrator use? (Choose TWO.)

Select 2 answers

A. AWS/ApplicationELB Latency

B. AWS/ApplicationELB HealthyHostCount

C. AWS/EC2 CPUUtilization

D. AWS/AutoScaling GroupInServiceInstances

E. AWS/ApplicationELB RequestCount

Answers: B, E

HealthyHostCount indicates the number of healthy registered targets.

Why this answer

Option B, AWS/ApplicationELB HealthyHostCount, is correct because it directly reports the number of registered instances that are passing health checks, which is exactly what the administrator needs to monitor the number of healthy hosts behind the ALB. Option E, AWS/ApplicationELB RequestCount, is correct because it tracks the total number of requests handled by the ALB, and by dividing by the time period, the administrator can calculate the average request count per minute.
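The "divide by the time period" step can be made concrete with a small calculation. This sketch assumes RequestCount has been retrieved with the Sum statistic at a one-minute period, so each value is the number of requests in that minute; the sample values are made up.

```python
def average_requests_per_minute(per_minute_sums):
    """Average of one-minute RequestCount sums over the sampled window."""
    if not per_minute_sums:
        raise ValueError("need at least one datapoint")
    return sum(per_minute_sums) / len(per_minute_sums)


# Hypothetical one-minute Sum datapoints for RequestCount.
request_sums = [120, 80, 100, 140, 60]
print(average_requests_per_minute(request_sums))  # 500 requests over 5 minutes
```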

Exam trap

The trap here is that candidates often confuse Auto Scaling group metrics (like GroupInServiceInstances) with ALB health check metrics (HealthyHostCount), not realizing that an instance can be InService but still unhealthy to the ALB if it fails health checks.

155

MCQ, hard

An application running on EC2 instances sends custom metrics to CloudWatch using the PutMetricData API. The metrics are not appearing in the CloudWatch console. The IAM role attached to the instances has the following policy: { "Version": "2012-10-17", "Statement": [ { "Effect": "Allow", "Action": "cloudwatch:PutMetricData", "Resource": "*" } ] }. What is the most likely cause?

A. The metric timestamp is older than 14 days.

B. The metric namespace must start with 'AWS/'.

C. The metric data is being sent to a different AWS region.

D. The IAM policy does not allow the 'cloudwatch:PutMetricData' action.

Answer: C

If the region in the PutMetricData call differs from the console, metrics won't appear.

Why this answer

The most likely cause is that the metric data is being sent to a different AWS region than the one displayed in the CloudWatch console. The PutMetricData API call goes to a regional endpoint, and if the application on the EC2 instance is configured to send metrics to a region other than the one you are viewing, the metrics will not appear there. The IAM policy correctly allows the action, so authorization is not the issue.

Exam trap

The trap here is that candidates often assume the issue is a missing IAM permission or an invalid namespace, but the real problem is a region mismatch between where the data is sent and where it is viewed.

How to eliminate wrong answers

Option A is unlikely because PutMetricData accepts timestamps up to two weeks in the past and up to two hours in the future; an application publishing current metrics would not normally send timestamps that old, and nothing in the scenario suggests it does. Option B is wrong because custom metric namespaces do not need to start with 'AWS/'; that prefix is reserved for AWS services, so custom metrics should use a different namespace. Option D is wrong because the IAM policy explicitly allows 'cloudwatch:PutMetricData' on all resources, so there is no permission issue.

156

MCQ, medium

A SysOps administrator needs to monitor AWS CloudTrail logs for any calls to the 'CreateUser' API in AWS Identity and Access Management (IAM). When such an API call is detected, the administrator wants to receive a notification within a few minutes and also log the event to a central log group in Amazon CloudWatch Logs. The solution should use minimal custom code. Which combination of services should be used?

A. Configure AWS CloudTrail to deliver logs to Amazon CloudWatch Logs, create a metric filter for the 'CreateUser' API call, and set up a CloudWatch alarm that sends an Amazon SNS notification.

B. Use AWS CloudTrail with Amazon EventBridge by creating an event rule that matches the 'CreateUser' API call via the 'aws.cloudtrail' event source, and set the targets to an Amazon SNS topic and a CloudWatch Logs log group.

C. Write an AWS Lambda function that is triggered by Amazon S3 events when a new CloudTrail log is delivered to S3. The Lambda parses the log file for 'CreateUser' and if found, sends an SNS notification.

D. Enable AWS Config and create a custom rule that evaluates CloudTrail trail configurations for events.

Answer: B

Amazon EventBridge natively listens for AWS service events, including CloudTrail API calls. By creating a rule with a custom event pattern that matches the specific API call, you can directly send the event to multiple targets (SNS, CloudWatch Logs, Lambda, etc.) without needing metric filters or alarms. This is the recommended low-overhead solution.

Why this answer

Option B is correct because Amazon EventBridge can directly consume API call events recorded by CloudTrail in near-real time, allowing you to create a rule that matches the 'CreateUser' API call. In practice, such events carry the detail-type 'AWS API Call via CloudTrail' and the originating service as the source (for IAM, 'aws.iam'). This rule can then target both an Amazon SNS topic for immediate notification and a CloudWatch Logs log group for centralized logging, all without custom code.
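To make the rule concrete, here is a sketch of an event pattern for the 'CreateUser' call together with a deliberately simplified matcher. The option text names 'aws.cloudtrail' as the event source; this sketch instead assumes the shape that CloudTrail-recorded API events usually have in EventBridge, where the source is the calling service ('aws.iam') and the detail-type is 'AWS API Call via CloudTrail'. The `matches` helper and the sample event are illustrative only and cover just exact-value matching, not the full EventBridge pattern syntax.

```python
import json

# Assumed event pattern for IAM CreateUser calls recorded by CloudTrail.
event_pattern = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateUser"],
    },
}


def matches(pattern, event):
    """Simplified matcher: leaf lists hold allowed values, dicts recurse."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical, heavily trimmed event.
sample_event = {
    "source": "aws.iam",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "iam.amazonaws.com", "eventName": "CreateUser"},
}

print(json.dumps(event_pattern, indent=2))
print(matches(event_pattern, sample_event))
```

The printed JSON is the form a rule's event pattern would take; the SNS topic and log group would then be attached to the rule as targets.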

Exam trap

The trap here is that candidates often assume CloudTrail-to-CloudWatch Logs delivery is the fastest method, but they overlook the inherent delivery latency and the fact that EventBridge provides a more immediate, event-driven path for real-time monitoring.

How to eliminate wrong answers

Option A is wrong because it adds moving parts and delay: CloudTrail delivery to CloudWatch Logs can lag by several minutes, and the metric filter and alarm then act on the delivered logs rather than on the event stream, so the alarm's evaluation period adds to that delay. Option C is wrong because it requires custom Lambda code to parse S3-delivered CloudTrail logs, which violates the 'minimal custom code' requirement and introduces additional latency and complexity. Option D is wrong because AWS Config evaluates resource configurations, not real-time API call events; a custom Config rule cannot detect individual 'CreateUser' API calls as they occur.

157

MCQ, easy

A SysOps administrator needs to monitor application logs in Amazon CloudWatch Logs for the occurrence of the string 'ERROR'. The administrator wants to create a custom metric that counts the number of 'ERROR' occurrences per 5-minute window and trigger an Amazon CloudWatch alarm when the count exceeds 10. Which action should the administrator take to create the custom metric?

A. Create a CloudWatch Events rule that triggers on 'ERROR' and publishes a metric.

B. Create a metric filter on the CloudWatch Logs log group that matches the term 'ERROR'.

C. Create a CloudWatch dashboard that displays the log group and set an alarm on the dashboard.

D. Enable AWS CloudTrail on the log group and select the 'ERROR' pattern.

Answer: B

A metric filter is the correct way to define a pattern to look for in log events. CloudWatch Logs uses the filter to publish a numeric metric to CloudWatch, which can then be used for alarms.

Why this answer

Option B is correct because metric filters in CloudWatch Logs allow you to define a pattern (e.g., 'ERROR') that is evaluated against incoming log events. The filter counts occurrences and publishes a custom metric to CloudWatch, which can then be used to set an alarm with a period of 5 minutes and a threshold of 10.
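The counting behavior can be pictured with a small local simulation. This is not how CloudWatch Logs is implemented; it is a sketch that assumes each matching log event contributes 1 to the metric, buckets events into 5-minute windows by timestamp, and flags windows whose count exceeds 10. The function names and sample log lines are hypothetical.

```python
from collections import Counter

WINDOW_SECONDS = 300  # 5-minute alarm period
THRESHOLD = 10        # alarm when the count exceeds this value


def error_counts(log_events, term="ERROR"):
    """Count matching log events per 5-minute window (keyed by window start)."""
    counts = Counter()
    for timestamp, message in log_events:
        if term in message:  # case-sensitive term match
            window_start = timestamp // WINDOW_SECONDS * WINDOW_SECONDS
            counts[window_start] += 1
    return dict(counts)


def breaching_windows(counts):
    """Return the window starts whose count is above the threshold."""
    return sorted(w for w, c in counts.items() if c > THRESHOLD)


# Hypothetical (epoch_seconds, message) pairs.
log_events = (
    [(i * 10, "ERROR disk full") for i in range(11)]
    + [(300 + i, "ERROR timeout") for i in range(3)]
    + [(20, "INFO request ok")]
)

counts = error_counts(log_events)
print(counts)
print(breaching_windows(counts))
```

Only the first window, with 11 matching events, crosses the threshold; the second window has 3 and stays quiet.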

Exam trap

The trap here is that candidates confuse CloudWatch Logs metric filters with CloudWatch Events or CloudTrail, thinking those services can parse log content, when in fact only metric filters can extract and count patterns from log data.

How to eliminate wrong answers

Option A is wrong because CloudWatch Events (now Amazon EventBridge) is used to trigger actions based on events, not to parse log content and create custom metrics; it cannot count string occurrences in log streams. Option C is wrong because a CloudWatch dashboard is a visualization tool and cannot directly create a custom metric or trigger an alarm; alarms are set on metrics, not dashboards. Option D is wrong because AWS CloudTrail records API activity, not application log content; it cannot be enabled on a CloudWatch Logs log group or used to count 'ERROR' strings in application logs.

158

MCQ, medium

A company is using Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer. The operations team needs to receive an alert when the number of healthy hosts drops below 50% of the desired capacity for more than 5 minutes. Which CloudWatch metric and alarm configuration should be used?

A. Use the 'RequestCount' metric with a statistic of 'Sum' and a threshold of 50% of the average request count.

B. Use the 'HealthyHostCount' metric with a statistic of 'Sum' and a threshold of 0.5 * desired capacity.

C. Use the 'UnhealthyHostCount' metric with a threshold of 50% of desired capacity.

D. Use the 'TargetResponseTime' metric with a statistic of 'p90' and a threshold of 2 seconds.

Answer: B

This directly measures the number of healthy hosts and can be compared to half the desired capacity.

Why this answer

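The correct answer is B because the 'HealthyHostCount' metric from the Application Load Balancer (ALB) directly reports the number of registered targets that are passing health checks. With the threshold set to 0.5 * desired capacity (entered as a fixed number, since alarm thresholds are static) and the alarm evaluated over 5 minutes, the alarm triggers when the count of healthy hosts stays below 50% of the Auto Scaling group's desired capacity. Of the options, only B measures healthy hosts directly. Note that AWS documentation lists Average, Minimum, and Maximum as the most useful statistics for this metric, so the 'Sum' in the option should be checked against the reporting period in practice.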
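The alarm logic can be sketched as a plain function. This is a simplified model, assuming one HealthyHostCount datapoint per minute and treating "for more than 5 minutes" as five consecutive breaching datapoints; the function name and sample values are illustrative.

```python
def alarm_breached(healthy_counts, desired_capacity, minutes=5):
    """True if the last `minutes` datapoints are all below half of desired capacity."""
    threshold = 0.5 * desired_capacity
    recent = healthy_counts[-minutes:]
    return len(recent) == minutes and all(count < threshold for count in recent)


# Hypothetical one-minute datapoints for a group with desired capacity 10.
sustained_drop = [10, 9, 4, 4, 3, 4, 4]
brief_recovery = [10, 4, 4, 6, 4, 4]

print(alarm_breached(sustained_drop, desired_capacity=10))
print(alarm_breached(brief_recovery, desired_capacity=10))
```

Because a real alarm threshold is a fixed number, the value of 0.5 * desired capacity would need updating whenever the desired capacity changes.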

Exam trap

The trap here is that candidates often confuse 'UnhealthyHostCount' with 'HealthyHostCount', assuming that a threshold on unhealthy hosts (e.g., >50% of desired capacity) is equivalent, but this fails because the alarm would not trigger correctly when the total number of hosts changes or when both healthy and unhealthy counts shift simultaneously.

How to eliminate wrong answers

Option A is wrong because 'RequestCount' measures the total number of requests processed, not the health of hosts; it cannot determine the percentage of healthy hosts relative to desired capacity. Option C is wrong because 'UnhealthyHostCount' reports the number of unhealthy hosts, but the requirement is to alert when healthy hosts drop below 50% of desired capacity, which is not directly equivalent to a threshold on unhealthy count (e.g., if desired capacity changes, the unhealthy count threshold would not scale dynamically). Option D is wrong because 'TargetResponseTime' measures latency, not host health status; a p90 threshold of 2 seconds is unrelated to the percentage of healthy hosts.

159

Multi-Select, hard

A company has a centralized logging solution where CloudTrail logs from multiple accounts are delivered to a single S3 bucket. The security team needs to be alerted when an IAM user is created in any of the accounts. Which steps should be taken? (Choose THREE.)

Select 3 answers

A. Create a CloudWatch Logs metric filter for 'CreateUser' event.

B. Configure CloudTrail to deliver logs to CloudWatch Logs.

C. Create an AWS Config rule to detect IAM user creation.

D. Configure S3 event notification on the central bucket to trigger a Lambda function.

E. Create a CloudWatch alarm on the metric filter that publishes to an SNS topic.

Answers: A, B, E

Metric filter detects the API call.

Why this answer

Option A is correct because a CloudWatch Logs metric filter can parse CloudTrail logs delivered to CloudWatch Logs and match the 'CreateUser' event pattern. This filter creates a metric that can be used to trigger an alarm. Option B is correct because CloudTrail must be configured to deliver logs to CloudWatch Logs in each account so that the metric filter can be applied to the log group.

Option E is correct because a CloudWatch alarm on the metric filter can publish to an SNS topic, which sends notifications (e.g., email, SMS) to the security team when an IAM user is created.

Exam trap

The trap here is that candidates confuse AWS Config rules (which assess resource compliance) with CloudWatch metric filters (which monitor log events), leading them to select Config for event detection instead of the correct CloudWatch-based approach.

160

MCQ, medium

A SysOps administrator needs to create a custom Amazon CloudWatch metric to track the number of active user sessions from application logs. The administrator wants to publish this metric to CloudWatch and set an alarm when the count exceeds a threshold. Which solution should be used?

A. Use a CloudWatch Logs Metric Filter on the log group.

B. Use CloudWatch Contributor Insights to extract the metric from logs.

C. Use CloudWatch Synthetics Canary to simulate user sessions and publish metrics.

D. Use CloudWatch Embedded Metric Format to have the application publish metrics directly.

Answer: A

A metric filter scans log entries for a pattern and increments a metric each time the pattern appears. The resulting metric can be used to trigger an alarm. This is the correct and straightforward approach.

Why this answer

Option A is correct because CloudWatch Logs Metric Filters allow you to define a filter pattern that matches specific log events (e.g., 'User session started') and convert them into a custom metric. The metric is automatically published to CloudWatch, where you can set an alarm on the count of matching log entries. This is the standard, cost-effective approach for extracting metrics from application logs without modifying the application code.

Exam trap

The trap here is that candidates often confuse CloudWatch Contributor Insights (which analyzes log data for top contributors) with a simple metric filter, or they assume Embedded Metric Format is required even though the metric can be derived from the logs the application already writes.

How to eliminate wrong answers

Option B is wrong because CloudWatch Contributor Insights is designed to analyze high-cardinality log data to identify top contributors (e.g., top IP addresses), not to produce a simple count metric for alarm thresholds. Option C is wrong because CloudWatch Synthetics Canaries simulate user interactions to generate traffic and metrics, but they do not parse existing application logs to count active sessions; they create synthetic data, not real session counts. Option D is wrong because CloudWatch Embedded Metric Format requires the application to be modified to emit metrics in a specific JSON format, whereas a metric filter extracts the metric from the existing application logs without code changes.

161

MCQ, easy

A SysOps administrator needs to view a graph of the average CPU utilization of an Auto Scaling group over the past 24 hours. The administrator wants to share this graph with the team via a link that does not require AWS console login. Which AWS service should be used to create and share this graph?

A. Amazon CloudWatch Dashboard

B. AWS CloudTrail

C. Amazon QuickSight

D. AWS Trusted Advisor

Answer: A

CloudWatch Dashboards can be shared via a public URL or via IAM permissions, allowing viewing without console login if shared publicly.

Why this answer

Amazon CloudWatch Dashboards allow you to create customizable graphs of metrics like average CPU utilization from an Auto Scaling group. You can share a dashboard via a link that does not require AWS console login by using the 'Share' feature, which generates a public URL that grants read-only access to the dashboard without authentication.

Exam trap

The trap here is that candidates may confuse CloudWatch Dashboards with QuickSight for sharing visualizations, but QuickSight requires user provisioning and login, whereas CloudWatch Dashboards can be shared publicly without authentication.

How to eliminate wrong answers

Option B is wrong because AWS CloudTrail records API activity and governance logs, not metric graphs or CPU utilization data. Option C is wrong because Amazon QuickSight is a business analytics service for interactive dashboards and visualizations, but it requires a QuickSight account and login, and is not designed for sharing a simple metric graph without authentication. Option D is wrong because AWS Trusted Advisor provides best-practice checks and recommendations for cost, performance, and security, but does not generate or share metric graphs.

162

MCQ, medium

A SysOps administrator notices that an EC2 instance's status check fails intermittently. The instance is part of an Auto Scaling group. What is the most efficient way to automatically recover the instance?

A. Increase the Auto Scaling group's cooldown period.

B. Create a CloudWatch alarm on StatusCheckFailed and configure an EC2 recovery action.

C. Terminate the instance and wait for Auto Scaling to launch a new one.

D. Place the instance in a different Availability Zone.

Answer: B

A recovered instance keeps the same instance ID, private IP address, and metadata.

Why this answer

Option B is correct because a CloudWatch alarm with an EC2 recover action automatically moves the instance to a new underlying host when a system status check fails. The recover action is tied to the system status check (the StatusCheckFailed_System metric). A recovered instance keeps its instance ID, private IP addresses, Elastic IP addresses, and instance metadata, whereas Auto Scaling replacement launches a different instance.
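As a sketch of what such an alarm might look like, the function below builds the keyword arguments one could pass to the CloudWatch PutMetricAlarm operation (for example via an SDK). It assumes the alarm watches StatusCheckFailed_System and uses the EC2 recover action ARN; the alarm name, period, evaluation count, instance ID, and region are illustrative choices, not values from the question.

```python
def recovery_alarm_params(instance_id, region):
    """Build PutMetricAlarm-style parameters for an EC2 recover action."""
    return {
        "AlarmName": f"recover-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed_System",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,             # assumed: evaluate one-minute datapoints
        "EvaluationPeriods": 2,   # assumed: two consecutive failures
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:recover"],
    }


params = recovery_alarm_params("i-0123456789abcdef0", "us-east-1")
print(params["MetricName"], params["AlarmActions"][0])
```

Keeping the parameters in a plain dictionary makes them easy to review before anything is created in an account.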

Exam trap

The trap here is that candidates confuse instance recovery with Auto Scaling replacement, thinking termination and relaunch is the default or only option, but EC2 recovery is a separate, more efficient mechanism that preserves instance identity and is directly configurable via CloudWatch alarms.

How to eliminate wrong answers

Option A is wrong because increasing the Auto Scaling group's cooldown period delays the launch of new instances after scaling activities, but does not recover a failing instance or address the status check failure. Option C is wrong because terminating the instance and waiting for Auto Scaling to launch a new one is less efficient—it loses the instance's metadata, private IP, and Elastic IP, and incurs longer downtime compared to an automatic recovery. Option D is wrong because placing the instance in a different Availability Zone does not automatically recover the instance; it requires manual intervention or a new launch, and does not leverage the built-in EC2 recovery mechanism.

163

MCQ, easy

A SysOps administrator needs to ensure that all S3 buckets in the account are configured with server access logging. Which AWS service can evaluate the buckets and automatically remediate non-compliant buckets?

A. Amazon GuardDuty

B. AWS CloudTrail

C. AWS Config

D. AWS Trusted Advisor

Answer: C

AWS Config rules can evaluate resources and trigger remediation actions via Systems Manager Automation.

Why this answer

AWS Config is the correct service because it can continuously evaluate your S3 buckets against a managed rule (s3-bucket-server-access-logging-enabled) and automatically remediate non-compliant buckets using AWS Systems Manager Automation documents. This allows the SysOps administrator to enforce server access logging as a compliance requirement without manual intervention.

Exam trap

The trap here is that candidates often confuse AWS Config's evaluation and remediation capabilities with CloudTrail's logging of API calls, mistakenly thinking CloudTrail can enforce S3 bucket policies, when in fact CloudTrail only records events and cannot modify resource configurations.

How to eliminate wrong answers

Option A is wrong because Amazon GuardDuty is a threat detection service that monitors for malicious activity using DNS logs, VPC Flow Logs, and CloudTrail events; it does not evaluate S3 bucket configurations for compliance or perform remediation. Option B is wrong because AWS CloudTrail records API activity for auditing and governance but does not evaluate current resource configurations or automatically remediate non-compliant resources. Option D is wrong because AWS Trusted Advisor provides best-practice checks and recommendations, including S3 bucket logging checks, but it cannot automatically remediate non-compliant buckets; it only offers guidance.

164

MCQ, medium

Refer to the exhibit. A SysOps administrator deploys this CloudFormation stack. The EC2 instance launches and the web server starts. However, the CloudWatch alarm does not trigger even when CPU utilization exceeds 80%. What is the MOST likely reason?

A. The alarm action is missing a valid SNS topic ARN.

B. The alarm statistic should be 'Maximum' instead of 'Average' to catch CPU spikes that may not sustain the average above 80% for 5 minutes.

C. The alarm dimension is incorrect; it should use the instance's private IP.

D. The user data script fails to start the web server, causing the instance to be unhealthy.

Answer: B

Using Average over 5 minutes can mask short spikes; Maximum would trigger on any 5-minute period where the maximum is above 80%.

Why this answer

Option B is correct because the alarm is configured with the 'Average' statistic, which smooths out CPU utilization over the 5-minute period. If CPU utilization spikes above 80% but does not sustain an average above that threshold for the entire duration, the alarm will not trigger. Using the 'Maximum' statistic would catch any single data point exceeding 80% within the period, making it appropriate for detecting short-lived spikes.
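A short numeric example shows the difference between the two statistics. The five one-minute CPU readings below are invented for illustration; they contain one spike to 95% inside an otherwise quiet 5-minute period.

```python
THRESHOLD = 80  # percent, as in the alarm described above

# Hypothetical one-minute CPUUtilization readings within one 5-minute period.
samples = [20, 25, 95, 30, 20]

average = sum(samples) / len(samples)
maximum = max(samples)

print("Average:", average, "breaches:", average > THRESHOLD)
print("Maximum:", maximum, "breaches:", maximum > THRESHOLD)
```

With the Average statistic the period evaluates to 38%, well under the threshold, while the Maximum statistic reports 95% and would put the period in breach.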

Exam trap

The trap here is that candidates often assume any CPU utilization above the threshold will trigger an alarm, overlooking how the chosen statistic (Average vs. Maximum) and evaluation period affect whether a spike is detected.

How to eliminate wrong answers

Option A is wrong because the alarm action missing a valid SNS topic ARN would cause a different issue (e.g., failure to send notifications), but it would not prevent the alarm from triggering based on the metric threshold; the alarm state would still change. Option C is wrong because the alarm dimension should use the instance ID, not the private IP; CloudWatch metrics for EC2 are dimensioned by InstanceId, and using a private IP would cause the alarm to not match the metric data. Option D is wrong because the user data script failing to start the web server would affect the instance's health but has no bearing on the CloudWatch alarm's ability to trigger based on CPU utilization; the alarm monitors CPU, not web server status.

165

MCQ, medium

A SysOps admin notices that an EC2 instance's status check fails intermittently. The instance is part of an Auto Scaling group. What is the most appropriate first step to diagnose the issue?

A. Stop and start the instance

B. Terminate the instance and let Auto Scaling replace it

C. Reboot the instance

D. Review the instance's status check history in the EC2 console

Answer: D

Status checks reveal whether the issue is system-level or instance-level.

Why this answer

The most appropriate first step is to review the instance's status check history in the EC2 console (Option D). This allows the SysOps admin to determine whether the failures are due to system status checks (e.g., underlying hardware issues) or instance status checks (e.g., OS-level problems). Since the instance is part of an Auto Scaling group, understanding the root cause is critical before taking any corrective action, as premature termination or reboot could mask the issue or lead to unnecessary replacements.

Exam trap

The trap here is that candidates often jump to terminating or rebooting the instance immediately, but the SOA-C02 exam emphasizes a methodical troubleshooting approach where reviewing status check history is the first step to differentiate between recoverable and irrecoverable failures.

How to eliminate wrong answers

Option A is wrong because stopping and starting the instance would move it to new hardware, which is only appropriate if the issue is a system status check failure (hardware problem), but this action is not the first diagnostic step and could disrupt the instance without confirming the cause. Option B is wrong because terminating the instance and letting Auto Scaling replace it is a reactive measure that bypasses diagnosis; it could lead to repeated failures if the underlying issue (e.g., a misconfigured application) persists in the replacement instance. Option C is wrong because rebooting the instance only addresses transient software issues and does not resolve hardware-level failures; it also does not provide diagnostic information about the intermittent status check failures.

166

MCQ, hard

Refer to the exhibit. A SysOps administrator runs the AWS CLI command shown. The output shows that the CPUUtilization average over the period is 75%. However, the administrator knows that the instance was idle for the first 15 minutes of the hour. Which explanation best describes why the average might be misleading?

A. The period is too long; a shorter period like 60 seconds would show more granular data.

B. The average statistic over a period of 300 seconds can smooth out spikes, and the overall average of 75% may be due to a high spike after the idle period.

C. The command should include --unit Percent to get accurate data.

D. The --statistics parameter should be set to 'Sum' to capture total usage.

Answer: B

Averaging over 5-minute periods can mask short spikes, and the overall average over the hour can be misleading if the load is not constant.

Why this answer

Option B is correct because the average statistic over a 300-second period can smooth out brief but intense spikes in CPU utilization. In this scenario, the instance was idle for the first 15 minutes, so the average of 75% over the entire hour must be driven by a very high CPU spike later in the period. The period of 300 seconds aggregates data into 5-minute intervals, which can mask the idle period and make the overall average misleading.
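
The arithmetic behind this can be checked with a short sketch. It assumes "idle" means roughly 0% CPU and treats the hourly figure as a time-weighted mean; the function name is illustrative and not part of any AWS API.

```python
def required_busy_average(overall_avg, idle_minutes, total_minutes, idle_avg=0.0):
    """Average CPU needed outside the idle window to reach overall_avg."""
    busy_minutes = total_minutes - idle_minutes
    # Time-weighted mean: overall * total = idle_avg * idle + busy_avg * busy
    return (overall_avg * total_minutes - idle_avg * idle_minutes) / busy_minutes


# 75% over 60 minutes with the first 15 minutes idle (assumed 0%).
busy_avg = required_busy_average(overall_avg=75.0, idle_minutes=15, total_minutes=60)
print(busy_avg)  # 100.0
```

Under these assumptions the remaining 45 minutes would have to average about 100%, which is why a single hourly figure says little about how the load was distributed.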

Exam trap

The trap here is that candidates assume a high average always indicates consistent high usage, when in fact the 'Average' statistic over a long period can mask idle periods and be heavily skewed by short, intense spikes.

How to eliminate wrong answers

Option A is wrong because while a shorter period like 60 seconds would provide more granular data, it would not change the fact that the average over the hour is 75%—the issue is not the granularity but the smoothing effect of the average statistic over the chosen period. Option C is wrong because the --unit parameter is not required for CPUUtilization metrics; CloudWatch automatically reports CPUUtilization as a percentage, and omitting --unit does not cause inaccurate data. Option D is wrong because using the 'Sum' statistic would accumulate CPU utilization over each period, which is not meaningful for a percentage metric and would not help identify the misleading average caused by the idle period.

167

Multi-Selecthard

A SysOps administrator is tasked with setting up a solution that automatically terminates EC2 instances that have been running for more than 24 hours. Which steps should the administrator take? (Select THREE.)

Select 3 answers

A.Configure an Auto Scaling group lifecycle hook to terminate instances after 24 hours.

B.Create a CloudWatch alarm on the InstanceAge metric and set it to trigger the Lambda function.

C.Tag each EC2 instance with its launch time (e.g., key: LaunchTime, value: timestamp).

D.Create an Amazon EventBridge rule that triggers the Lambda function on a schedule (e.g., every hour).

E.Create an AWS Lambda function that uses the EC2 API to terminate instances older than 24 hours.

AnswersC, D, E

Tags allow the Lambda function to calculate age.

Why this answer

Options C, D, and E are correct. Option E creates a Lambda function that uses the EC2 API to terminate instances older than 24 hours. Option D creates an EventBridge rule that invokes the Lambda function on a schedule.

Option C tags instances with a launch time for the function to evaluate. Option B is wrong because CloudWatch alarms evaluate metric thresholds, not elapsed time, and EC2 does not publish an InstanceAge metric. Option A is wrong because Auto Scaling lifecycle hooks pause instances during launch or termination so that custom actions can run; they do not terminate instances on a timer.
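
The age check the Lambda function performs could look roughly like the sketch below. It assumes the `LaunchTime` tag holds an ISO 8601 timestamp with a UTC offset, and it operates on plain dictionaries shaped like the instance entries returned by the EC2 `describe_instances` call; the instance IDs are hypothetical. A real handler would fetch the instances with boto3 and pass the resulting IDs to `terminate_instances`, which is omitted here so the example runs without AWS access.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=24)


def instances_to_terminate(instances, now):
    """Return IDs of instances whose LaunchTime tag is older than MAX_AGE."""
    expired = []
    for inst in instances:
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        launch = tags.get("LaunchTime")
        if launch is None:
            continue  # untagged instances are skipped, not terminated
        launched_at = datetime.fromisoformat(launch)
        if now - launched_at > MAX_AGE:
            expired.append(inst["InstanceId"])
    return expired


# Hypothetical sample data for illustration.
sample = [
    {"InstanceId": "i-old", "Tags": [{"Key": "LaunchTime", "Value": "2024-01-01T00:00:00+00:00"}]},
    {"InstanceId": "i-new", "Tags": [{"Key": "LaunchTime", "Value": "2024-01-02T00:00:00+00:00"}]},
    {"InstanceId": "i-untagged", "Tags": []},
]
now = datetime(2024, 1, 2, 6, 0, tzinfo=timezone.utc)
print(instances_to_terminate(sample, now))  # ['i-old']
```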

168

MCQmedium

Refer to the exhibit. An IAM policy is attached to an EC2 instance role. The application on the instance attempts to write logs to a log group named 'MyAppLogs' in CloudWatch Logs but fails. What is the likely cause?

A.The EC2 instance does not have an internet gateway to reach CloudWatch Logs.

B.The log group name in the policy does not match the application's log group.

C.The policy does not grant permission to create the log group because the resource for CreateLogGroup is specified as the log stream ARN.

D.The policy lacks permission for 'logs:DescribeLogGroups'.

AnswerC

CreateLogGroup requires the resource to be the log group ARN, not the stream.

Why this answer

The policy grants `logs:CreateLogGroup` but specifies the resource as the log stream ARN (`arn:aws:logs:us-east-1:123456789012:log-group:MyAppLogs:log-stream:*`). CloudWatch Logs requires the resource for `CreateLogGroup` to be the log group ARN (`arn:aws:logs:us-east-1:123456789012:log-group:*` or a specific log group name), not a log stream. Since the application is attempting to write logs to a log group that does not yet exist, the `CreateLogGroup` call fails due to the incorrect resource ARN, causing the overall write operation to fail.
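
A corrected policy might be structured as in the following sketch, which builds the document as a Python dictionary. The account ID, Region, and log group name come from the explanation above; the `log-group:MyAppLogs:*` pattern for the stream-level actions is an assumption based on a commonly used form, not a reproduction of the exhibit.

```python
import json

ACCOUNT = "123456789012"
REGION = "us-east-1"
LOG_GROUP = "MyAppLogs"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            # CreateLogGroup is scoped to log group ARNs, not log streams.
            "Effect": "Allow",
            "Action": "logs:CreateLogGroup",
            "Resource": f"arn:aws:logs:{REGION}:{ACCOUNT}:log-group:*",
        },
        {
            # Stream-level actions are scoped to streams inside the group
            # (assumed pattern; adjust to the naming in use).
            "Effect": "Allow",
            "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": f"arn:aws:logs:{REGION}:{ACCOUNT}:log-group:{LOG_GROUP}:*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```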

Exam trap

The trap here is that candidates often overlook the resource ARN mismatch for `CreateLogGroup` and assume the failure is due to a missing permission or network issue, but the policy explicitly includes the action with an incorrectly scoped resource.

How to eliminate wrong answers

Option A is wrong because EC2 instances can reach CloudWatch Logs via a VPC endpoint or NAT gateway; an internet gateway is not strictly required, and the question does not indicate any network connectivity issue. Option B is wrong because the exhibit shows the log group name in the policy is 'MyAppLogs', which matches the application's log group, so a mismatch is not the cause. Option D is wrong because `logs:DescribeLogGroups` is not required to write logs; the necessary permissions for writing are `CreateLogGroup`, `CreateLogStream`, and `PutLogEvents`, and the failure is specifically due to the `CreateLogGroup` resource misconfiguration.

169

MCQmedium

An organization wants to ensure that all changes to an S3 bucket policy are logged and immediately trigger a notification to the security team. What is the most efficient way to achieve this?

A.Create a CloudWatch Events rule that matches PutBucketPolicy API call and triggers an SNS topic.

B.Create a CloudWatch Alarm that monitors the S3 bucket's policy.

C.Use AWS Config with a managed rule to detect policy changes.

D.Enable S3 event notifications on the bucket for 'PutBucketPolicy' events.

AnswerA

CloudWatch Events can respond to API calls in near-real-time.

Why this answer

Option A is correct because CloudWatch Events (now Amazon EventBridge) can match the PutBucketPolicy API call, which is delivered as an "AWS API Call via CloudTrail" event, and route it to an SNS topic for immediate notification. This approach is the most efficient because it reacts to the API call in near real time without polling or custom code, so the security team is alerted shortly after the policy changes.
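
An event pattern for such a rule might look like the sketch below. The pattern fields follow the usual shape of CloudTrail-sourced events; the `matches` helper is a deliberately simplified stand-in for EventBridge's matching logic (exact values only), and the sample event and bucket name are hypothetical.

```python
event_pattern = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutBucketPolicy"],
    },
}


def matches(pattern, event):
    """Simplified matcher: every pattern key must match; a list means 'any of'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Hypothetical event, trimmed to the fields the pattern inspects.
sample_event = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketPolicy",
        "requestParameters": {"bucketName": "example-bucket"},
    },
}
print(matches(event_pattern, sample_event))  # True
```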

Exam trap

The trap here is that candidates confuse S3 event notifications (which only cover object-level events) with CloudWatch Events (which can capture management API calls via CloudTrail), leading them to incorrectly select option D.

How to eliminate wrong answers

Option B is wrong because a CloudWatch Alarm monitors metric data (e.g., bucket size, request count) and cannot directly detect or react to S3 bucket policy changes; it lacks the ability to match specific API calls. Option C is wrong because AWS Config evaluates resource configurations against rules and can detect policy changes, but it operates on a periodic or configuration-change basis (typically minutes delay) and does not provide immediate, real-time notification via SNS. Option D is wrong because S3 event notifications support object-level events (e.g., s3:ObjectCreated, s3:ObjectRemoved) and not management API calls like PutBucketPolicy; S3 event notifications cannot be configured for bucket policy changes.

170

MCQhard

A SysOps administrator monitors a custom business metric published to Amazon CloudWatch. The metric exhibits irregular spikes that are not predictable. The administrator needs to be alerted when the metric deviates significantly from its normal pattern. Which CloudWatch feature should be used to set up the alarm with the least manual tuning?

A.CloudWatch Logs metric filter

B.CloudWatch Metric Math with standard deviation

C.CloudWatch Anomaly Detection

D.AWS CloudTrail Insights

AnswerC

Uses ML to create dynamic thresholds that adapt to the metric's normal behavior.

Why this answer

CloudWatch Anomaly Detection uses machine learning to automatically establish a baseline for a metric's normal pattern and create a band of expected values. When the metric deviates outside this band, it triggers an alarm without requiring manual threshold tuning, making it ideal for unpredictable, irregular spikes.
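
As a rough illustration, an anomaly detection alarm is defined against a band expression instead of a fixed number. The dictionary below sketches the arguments one might pass to the CloudWatch `PutMetricAlarm` API (for example through boto3's `put_metric_alarm`); the alarm name, namespace, metric name, and band width of 2 are hypothetical choices.

```python
anomaly_alarm = {
    "AlarmName": "orders-anomaly",  # hypothetical name
    "ComparisonOperator": "LessThanLowerOrGreaterThanUpperThreshold",
    "EvaluationPeriods": 3,
    # The threshold is the band expression below, not a static value.
    "ThresholdMetricId": "band",
    "Metrics": [
        {
            "Id": "m1",
            "ReturnData": True,
            "MetricStat": {
                "Metric": {"Namespace": "MyApp/Business", "MetricName": "OrdersPerMinute"},
                "Period": 300,
                "Stat": "Average",
            },
        },
        {
            "Id": "band",
            "Expression": "ANOMALY_DETECTION_BAND(m1, 2)",
            "Label": "Expected range",
            "ReturnData": True,
        },
    ],
}

band_ids = [m["Id"] for m in anomaly_alarm["Metrics"] if "Expression" in m]
print(band_ids)  # ['band']
```

No `Threshold` value appears in the definition, which is what "least manual tuning" amounts to in practice.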

Exam trap

The trap here is that candidates often confuse CloudWatch Metric Math with standard deviation (Option B) as a way to detect anomalies, but it requires manual formula creation and does not automatically adapt to pattern changes, unlike Anomaly Detection which learns and adjusts the baseline over time.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs metric filters extract metrics from log data, not from custom business metrics published directly to CloudWatch, and they require manual threshold configuration. Option B is wrong because CloudWatch Metric Math with standard deviation requires manual calculation and setup of the standard deviation formula, and it does not automatically adapt to changing patterns over time. Option D is wrong because AWS CloudTrail Insights analyzes API activity for unusual patterns in AWS management events, not custom business metrics published to CloudWatch.

171

MCQmedium

A company has an S3 bucket that stores sensitive data. The security team requires an alert whenever an object in the bucket is deleted. What is the MOST efficient way to achieve this?

A.Configure S3 access logs and stream them to CloudWatch Logs, then create a metric filter.

B.Use S3 Inventory to generate a daily report and check for deletes.

C.Enable AWS CloudTrail data events for the S3 bucket and create a CloudWatch metric filter.

D.Enable S3 event notifications and send them to Amazon EventBridge, then create a rule to publish to SNS.

AnswerD

S3 events can be sent to EventBridge with low overhead and trigger notifications.

Why this answer

Option D is correct because S3 event notifications can be sent directly to Amazon EventBridge, which allows you to create a rule that triggers an SNS topic for real-time alerts on object deletions. This approach is the most efficient as it avoids the overhead of log analysis or polling, providing immediate notification with minimal latency.

Exam trap

The trap here is that candidates often assume CloudTrail data events (Option C) are the best for monitoring S3 operations, but they overlook the latency and cost implications, whereas S3 event notifications via EventBridge provide the most efficient real-time alerting for object deletions.

How to eliminate wrong answers

Option A is wrong because S3 access logs are delivered on a best-effort basis, typically with a delay of several hours, making them unsuitable for real-time alerting on deletions. Option B is wrong because S3 Inventory generates daily or weekly CSV reports, which are not real-time and cannot trigger immediate alerts for individual delete events. Option C is wrong because while CloudTrail data events can capture S3 object-level operations, they are not the most efficient due to the overhead of enabling data events across all objects and the potential cost of CloudTrail logs; moreover, CloudTrail logs are typically delivered with a delay of up to 15 minutes, whereas EventBridge provides near-instantaneous notification.

172

MCQmedium

A company uses AWS Organizations with multiple accounts. The SysOps administrator needs to centralize the monitoring of all API calls made in any account for security analysis. The solution must collect logs from all accounts, both existing and future, and deliver them to a centralized S3 bucket in the management account. Which AWS service should the administrator use?

A.AWS Config aggregator

B.Amazon CloudWatch Logs with cross-account subscription

C.AWS CloudTrail organization trail

D.Amazon Detective

AnswerC

An organization trail automatically logs API calls for all accounts in the organization and can deliver to a centralized S3 bucket.

Why this answer

AWS CloudTrail organization trails allow you to log all API calls across all accounts in an AWS Organization from a single management account. When you create an organization trail, it automatically applies to all existing and future accounts, delivering logs to a centralized S3 bucket in the management account without requiring per-account configuration.

Exam trap

The trap here is that candidates confuse AWS Config aggregator (which centralizes configuration data) with CloudTrail (which centralizes API call logs), or assume cross-account CloudWatch Logs subscriptions are the simpler solution, missing the automatic future-account coverage of an organization trail.

How to eliminate wrong answers

Option A is wrong because AWS Config aggregator collects configuration snapshots and compliance data, not API call logs; it is designed for resource inventory and rule evaluation, not security analysis of API activity. Option B is wrong because Amazon CloudWatch Logs with cross-account subscription requires manual setup for each account and does not automatically include future accounts; it also does not natively capture all API calls (CloudTrail is the service for API logging). Option D is wrong because Amazon Detective analyzes and visualizes security data from existing logs (like VPC Flow Logs and CloudTrail), but it does not collect or centralize API call logs itself; it relies on other services to deliver the data.

173

Multi-Selectmedium

A SysOps administrator is configuring a CloudWatch dashboard to monitor an application. Which TWO of the following are valid widget types that can be added to a CloudWatch dashboard?

Select 2 answers

A.Line

B.Alarm

C.Bar chart

D.Number

E.Time Series

AnswersD, E

Number widget displays a single metric value.

Why this answer

Option D (Number) is correct because CloudWatch dashboards support the Number widget type, which displays a single metric value as a numeric statistic (e.g., sum, average, or sample count) and can optionally show a trend arrow. This widget is ideal for showing key performance indicators like current error count or latency.

Exam trap

The trap here is that candidates confuse the visual sub-types (like 'Line' or 'Bar chart') with the actual widget types, leading them to select those as valid options when only 'Time Series' and 'Number' are the correct top-level widget types in the CloudWatch dashboard widget JSON schema.

174

MCQmedium

A company is using Amazon RDS for MySQL. The SysOps administrator needs to monitor the number of database connections and set an alarm when connections exceed 80% of the maximum. Which CloudWatch metric and alarm threshold should be used?

A.Metric: DBConnections; Threshold: 80

B.Metric: DatabaseConnections; Threshold: 80

C.Metric: FreeableMemory; Threshold: 20%

D.Metric: DatabaseConnections; Threshold: 0.8 * max_connections (using a math expression)

AnswerD

This uses a metric math expression to set a dynamic threshold based on instance max connections.

Why this answer

Option D is correct because Amazon RDS for MySQL does not publish a CloudWatch metric that reports connections as a percentage of the maximum. Instead, you use the 'DatabaseConnections' metric (the number of client connections) and a CloudWatch math expression that compares it against the instance's 'max_connections' parameter (e.g., 0.8 * max_connections). The alarm then triggers when connections exceed 80% of the configured maximum, which measures connection utilization accurately regardless of instance size.
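
The difference between a static threshold of 80 and an 80% threshold can be seen in a small sketch. The `max_connections` values are hypothetical; in practice the value would be read from the instance's parameter group and placed in the math expression as a constant.

```python
def connection_threshold(max_connections, percent=80):
    """Connection count that corresponds to `percent` of max_connections."""
    return max_connections * percent / 100


def utilization_percent(database_connections, max_connections):
    """Value a math expression such as 100 * m1 / max_connections would yield."""
    return 100.0 * database_connections / max_connections


# Hypothetical instance allowing 150 connections: 80% is 120 connections.
print(connection_threshold(150))        # 120.0
# A static threshold of 80 on an instance allowing 1000 connections is only 8%.
print(utilization_percent(80, 1000))    # 8.0
```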

Exam trap

The trap here is that candidates assume a static threshold like 80 is sufficient, but the exam tests whether you understand that 'DatabaseConnections' must be compared against the dynamic 'max_connections' value using a math expression to accurately detect 80% utilization.

How to eliminate wrong answers

Option A is wrong because 'DBConnections' is not a valid CloudWatch metric name for RDS; the correct metric is 'DatabaseConnections'. Option B is wrong because while 'DatabaseConnections' is the correct metric name, setting a static threshold of 80 is meaningless—80 connections could be far below or above 80% of max_connections depending on the instance size and configuration. Option C is wrong because 'FreeableMemory' measures available memory, not database connections, and a threshold of 20% is unrelated to connection utilization; it monitors memory pressure, not connection limits.

175

Multi-Selecthard

An organization uses Amazon CloudWatch Synthetics canaries to monitor its web application endpoints. A SysOps administrator needs to be alerted when a canary run fails. Which THREE steps are required to set up this alerting?

Select 3 answers

A.Create a custom CloudWatch metric for canary failures.

B.Configure a CloudWatch alarm on the canary's `SuccessPercent` metric.

C.Create a canary in CloudWatch Synthetics.

D.Configure the alarm to send a notification to an SNS topic.

E.Enable detailed monitoring on the canary.

AnswersB, C, D

The alarm will trigger when success rate drops below threshold.

Why this answer

Option B is correct because CloudWatch Synthetics canaries automatically publish a `SuccessPercent` metric to CloudWatch. By configuring a CloudWatch alarm on this metric (e.g., when `SuccessPercent` drops below 100), the administrator can trigger an alert whenever a canary run fails. This is the standard method for monitoring canary health without needing custom metrics.
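
A sketch of such an alarm definition follows, written as the arguments one might pass to the CloudWatch `PutMetricAlarm` API. The canary name, alarm name, and SNS topic ARN are hypothetical; the namespace and dimension name are the ones Synthetics is generally documented to use, and should be confirmed in the console before relying on them.

```python
canary_alarm = {
    "AlarmName": "checkout-canary-failed",   # hypothetical
    "Namespace": "CloudWatchSynthetics",
    "MetricName": "SuccessPercent",
    "Dimensions": [{"Name": "CanaryName", "Value": "checkout-canary"}],  # hypothetical canary
    "Statistic": "Average",
    "Period": 300,
    "EvaluationPeriods": 1,
    "Threshold": 100,
    "ComparisonOperator": "LessThanThreshold",
    # Hypothetical SNS topic that notifies the administrator.
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:canary-alerts"],
}

print(canary_alarm["MetricName"], canary_alarm["ComparisonOperator"], canary_alarm["Threshold"])
# SuccessPercent LessThanThreshold 100
```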

Exam trap

The trap here is that candidates assume they must create a custom metric (Option A) or enable detailed monitoring (Option E) because they confuse Synthetics canaries with EC2 detailed monitoring, when in fact the built-in `SuccessPercent` metric is sufficient and automatically available.

176

Multi-Selecthard

A company has an application running on EC2 instances that sends logs to CloudWatch Logs. The operations team wants to receive a notification when a specific error pattern appears in the logs. Which combination of steps should the team take? (Select TWO.)

A.Create a CloudWatch alarm on the metric to trigger an SNS notification.

B.Enable CloudTrail to capture the log events and send to S3.

C.Create an SNS topic and subscribe the operations team's email addresses.

D.Create a subscription filter to send logs to Lambda for analysis.

E.Create a metric filter on the log group to count occurrences of the error pattern.

AnswersA, E

An alarm on the metric can send notifications via SNS.

Why this answer

Option A is correct because a CloudWatch alarm can be configured to trigger an SNS notification when a metric crosses a defined threshold. In this scenario, the metric is derived from a metric filter that counts occurrences of the specific error pattern in the log group. The alarm monitors that metric and sends an SNS notification to subscribed endpoints, such as the operations team's email addresses, providing the required alert.
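
The two pieces fit together as sketched below: the metric filter publishes a metric, and the alarm watches that same metric. The dictionaries mirror the arguments of the CloudWatch Logs `PutMetricFilter` and CloudWatch `PutMetricAlarm` APIs; the log group, names, filter pattern, and SNS topic ARN are all hypothetical.

```python
metric_filter = {
    "logGroupName": "/myapp/production",   # hypothetical log group
    "filterName": "AppErrorCount",
    "filterPattern": "ERROR",              # matches events containing the term ERROR
    "metricTransformations": [
        {
            "metricName": "AppErrors",
            "metricNamespace": "MyApp",
            "metricValue": "1",            # each matching event counts as 1
            "defaultValue": 0,
        }
    ],
}

transformation = metric_filter["metricTransformations"][0]

alarm = {
    "AlarmName": "AppErrorsDetected",
    # The alarm must reference the metric the filter emits.
    "Namespace": transformation["metricNamespace"],
    "MetricName": transformation["metricName"],
    "Statistic": "Sum",
    "Period": 60,
    "EvaluationPeriods": 1,
    "Threshold": 1,
    "ComparisonOperator": "GreaterThanOrEqualToThreshold",
    "AlarmActions": ["arn:aws:sns:us-east-1:123456789012:ops-alerts"],  # hypothetical topic
}

print(alarm["Namespace"], alarm["MetricName"])  # MyApp AppErrors
```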

Exam trap

The trap here is that candidates often select Option D (subscription filter to Lambda) thinking it is necessary for custom processing, but the combination of a metric filter and CloudWatch alarm directly provides the notification without additional compute resources.

How to eliminate wrong answers

Option B is wrong because CloudTrail captures API activity and management events, not application log data from EC2 instances; sending logs to S3 does not provide real-time notification for error patterns. Option C is wrong because creating an SNS topic and subscribing email addresses alone does not trigger a notification; the SNS topic must be integrated with a CloudWatch alarm or another event source to send messages. Option D is wrong because a subscription filter sending logs to Lambda for analysis is an indirect approach that requires custom Lambda code to parse logs and send notifications, whereas the question asks for a notification when a specific error pattern appears, which is more directly achieved with a metric filter and CloudWatch alarm.

177

Multi-Selectmedium

Which TWO actions should a SysOps administrator take to set up centralized logging from multiple Amazon EC2 instances running Amazon Linux 2 to Amazon CloudWatch Logs?

Select 2 answers

A.Attach an IAM role to each EC2 instance that includes permission for logs:PutLogEvents.

B.Create an S3 bucket and configure the EC2 instances to write logs directly to the bucket.

C.Install and configure the unified CloudWatch agent on each EC2 instance.

D.Create a VPC endpoint for CloudWatch Logs to allow private connectivity.

E.Export the logs from CloudWatch Logs to an Amazon S3 bucket for long-term retention.

AnswersA, C

The IAM role must allow the CloudWatch agent to call PutLogEvents to send log data to CloudWatch Logs.

Why this answer

Options A and C are correct. Installing and configuring the unified CloudWatch agent on each EC2 instance is required to send logs to CloudWatch Logs. The IAM role must grant the logs:PutLogEvents permission so the agent can write log events.

Option B is wrong because writing logs to S3 bypasses CloudWatch Logs entirely; S3 is a destination for log exports, not a source for real-time ingestion. Option D is wrong because a VPC endpoint is optional; instances with internet access can reach the public CloudWatch Logs endpoints. Option E is wrong because exporting logs to S3 is a retention measure that happens after the logs are already centralized; it is not part of the setup.
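
For reference, the `logs` section of a unified CloudWatch agent configuration might look like the sketch below, generated here from Python so it can be validated as JSON. The file path and log group name are hypothetical; `{instance_id}` is a placeholder the agent substitutes at runtime.

```python
import json

agent_config = {
    "logs": {
        "logs_collected": {
            "files": {
                "collect_list": [
                    {
                        "file_path": "/var/log/messages",        # hypothetical file to ship
                        "log_group_name": "central/messages",    # hypothetical shared log group
                        "log_stream_name": "{instance_id}",      # one stream per instance
                    }
                ]
            }
        }
    }
}

rendered = json.dumps(agent_config, indent=2)
print(rendered)
```

Using the same log group name on every instance, with a per-instance stream name, is what makes the logging centralized.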

178

Multi-Selecthard

A SysOps administrator is setting up a CloudWatch dashboard to monitor an application. The application runs on an Auto Scaling group of EC2 instances behind an Application Load Balancer. The administrator wants to track the number of healthy hosts and the request count per target group. Which two metrics should be used? (Choose TWO.)

Select 2 answers

A.HealthyHostCount (per TargetGroup)

B.RequestCount (per ALB)

C.ActiveConnectionCount

D.RequestCount (per TargetGroup)

E.HealthyHostCount (per ALB)

AnswersA, D

This metric shows healthy hosts per target group.

Why this answer

Option A is correct because `HealthyHostCount` (per TargetGroup) is a CloudWatch metric that reports the number of healthy EC2 instances in a specific target group, which directly indicates the health of the backend fleet. Option D is correct because `RequestCount` (per TargetGroup) tracks the number of requests routed to that target group, allowing the administrator to correlate traffic load with host health. Together, these two metrics provide the exact visibility needed for an Auto Scaling group behind an ALB.

Exam trap

The trap here is that candidates confuse ALB-level metrics (like `RequestCount` per ALB or `ActiveConnectionCount`) with target-group-level metrics, or assume `HealthyHostCount` exists at the ALB level, when in fact it is only available per target group.

179

MCQmedium

A company is using CloudWatch Logs to centralize logs from multiple EC2 instances. The operations team notices that some log entries are missing from CloudWatch Logs. The CloudWatch agent is installed and running on all instances. What is the most likely cause?

A.The CloudWatch agent is not configured to send logs to the correct log group.

B.The CloudWatch agent is sending logs to a different AWS Region.

C.The log group has been encrypted with a KMS key that the agent does not have access to.

D.The IAM role attached to the EC2 instance does not have the 'logs:PutLogEvents' permission.

AnswerD

Missing IAM permissions is a common cause of missing logs.

Why this answer

The most likely cause is that the IAM role attached to the EC2 instance lacks the 'logs:PutLogEvents' permission. Without this permission, the CloudWatch agent can authenticate and connect to CloudWatch Logs but cannot actually write log data to the log stream, resulting in missing entries. The agent may appear to be running and healthy, but API calls to PutLogEvents will fail silently or log errors, leading to gaps in the centralized logs.

Exam trap

The trap here is that candidates assume the agent's installation and running status guarantee log delivery, but the missing permission causes silent failures that are easy to overlook, especially when the agent reports no obvious errors.

How to eliminate wrong answers

Option A is wrong because an agent configured with the wrong log group would still deliver the entries, just to a different log group; the symptom would be misrouted logs, not missing ones. Option B is wrong for the same reason: logs sent to a different AWS Region would still be visible in that Region's CloudWatch Logs. Option C is wrong because encryption of a KMS-protected log group is performed by the CloudWatch Logs service, which is authorized through the key policy; the agent does not need its own KMS permissions to write log events, so this is a less likely cause than a missing PutLogEvents permission.

180

MCQeasy

An application is running on an EC2 instance and is experiencing intermittent connection timeouts. The SysOps administrator wants to capture network traffic to analyze the issue. Which AWS service should be used?

A.CloudWatch Logs

B.AWS CloudTrail

C.AWS Config

D.VPC Flow Logs

AnswerD

VPC Flow Logs capture network traffic metadata.

Why this answer

VPC Flow Logs capture IP traffic information for network interfaces in a VPC, including accepted and rejected connection attempts. This allows the SysOps administrator to analyze the source/destination IPs, ports, protocols, and whether the traffic was allowed or denied, which is essential for diagnosing intermittent connection timeouts.
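
To make the analysis concrete, the sketch below parses flow log records in the default (version 2) space-separated format and picks out rejected traffic. The two sample records are illustrative, with made-up account and interface IDs.

```python
from collections import namedtuple

# Field order of the default flow log format (version 2).
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]
FlowRecord = namedtuple("FlowRecord", FIELDS)


def parse_flow_record(line):
    """Split one default-format flow log line into named fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return FlowRecord(*parts)


def rejected(records):
    """Keep only records where the traffic was denied."""
    return [r for r in records if r.action == "REJECT"]


lines = [
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK",
    "2 123456789010 eni-1235b8ca123456789 172.31.9.69 172.31.9.12 49761 3389 6 20 4249 1418530010 1418530070 REJECT OK",
]
records = [parse_flow_record(line) for line in lines]
for r in rejected(records):
    print(r.srcaddr, "->", r.dstaddr, "port", r.dstport)  # 172.31.9.69 -> 172.31.9.12 port 3389
```

Flow logs record metadata only, so this kind of filtering shows which connections were denied but not the packet contents.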

Exam trap

The trap here is that candidates often confuse VPC Flow Logs (which capture network traffic metadata) with CloudWatch Logs (which capture application/system logs) or CloudTrail (which captures API activity), leading them to choose a logging service that cannot diagnose network-level issues.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs is a service for storing, monitoring, and accessing log files from AWS resources (e.g., application logs, system logs), but it does not capture raw network traffic or packet-level data. Option B is wrong because AWS CloudTrail records API calls and user activity for auditing and governance, not network traffic flows. Option C is wrong because AWS Config evaluates resource configurations against desired policies and tracks configuration changes, but it does not capture or analyze network traffic.

181

Multi-Selectmedium

A SysOps administrator is investigating a security incident where an unauthorized user accessed an S3 bucket. Which TWO AWS services can the administrator use to collect and analyze the relevant logs?

Select 2 answers

A.AWS WAF logs

B.Amazon VPC Flow Logs

C.AWS CloudTrail

D.Amazon Route 53 resolver logs

E.Amazon S3 server access logs

AnswersC, E

CloudTrail can log S3 data events if enabled.

Why this answer

AWS CloudTrail is correct because it records API calls made to S3, including who made the request, the source IP address, and the time of the action. This allows the administrator to trace the unauthorized access to a specific IAM user or role and identify the exact API operations performed, such as GetObject or PutObject.

Exam trap

The trap here is that candidates often confuse VPC Flow Logs with S3 access logs, thinking network-level logs can capture S3 API calls, but VPC Flow Logs only show IP traffic metadata and not the application-level S3 operations.

182

MCQhard

A SysOps administrator examines the output of the describe-alarms command for the 'HighCPU' alarm. The alarm is in ALARM state. What action will be taken automatically?

A.Recover the EC2 instance using EC2 instance recovery.

B.Launch a new EC2 instance in an Auto Scaling group.

C.Reboot the EC2 instance.

D.Send a notification to an SNS topic.

AnswerA

The alarm action is an EC2 recover action.

Why this answer

The 'HighCPU' alarm is in ALARM state, and the question specifies that the alarm is configured with an EC2 instance recovery action. EC2 instance recovery automatically restarts the instance on a new healthy host when a status check failure (such as impaired hardware or network connectivity) is detected, preserving the instance ID, private IP addresses, and Elastic IP addresses. This is a built-in CloudWatch alarm action that triggers recovery, not a reboot or scaling action.

Exam trap

The trap here is that candidates confuse EC2 instance recovery (which moves the instance to a new host) with a simple reboot or Auto Scaling replacement, but the question's context of 'HighCPU' alarm and ALARM state implies a status check failure recovery action, not a scaling or notification action.

How to eliminate wrong answers

Option B is wrong because launching a new EC2 instance in an Auto Scaling group is an Auto Scaling action triggered by a CloudWatch alarm configured with an Auto Scaling policy, not a direct EC2 recovery action. Option C is wrong because rebooting the EC2 instance is a separate CloudWatch alarm action (EC2 Reboot) that does not recover the instance from underlying hardware issues; it only restarts the OS. Option D is wrong because sending a notification to an SNS topic is a separate alarm action that does not perform any automated remediation; it only alerts via email, SMS, or other endpoints.

183

MCQhard

A company runs a critical web application on EC2 instances behind an Application Load Balancer (ALB). The SysOps administrator needs to be notified if the ALB's error rate exceeds 5% for 5 consecutive minutes. Which solution meets this requirement with the least operational overhead?

A.Enable VPC Flow Logs and analyze them with Amazon Athena to detect error rates.

B.Use a CloudWatch alarm on the ALB's 'HTTPCode_ELB_5XX_Count' metric with a math expression to calculate error rate.

C.Enable CloudTrail for the ALB and create a metric filter for 5xx errors.

D.Use AWS Config rules to monitor the ALB configuration and trigger a notification on changes.

AnswerB

CloudWatch can directly alarm on ALB metrics with math expressions.

Why this answer

Option B is correct because CloudWatch can directly monitor the ALB's 'HTTPCode_ELB_5XX_Count' metric and combine it with the 'RequestCount' metric using a math expression to calculate the error rate as a percentage. This approach requires no additional logging or external services, and a CloudWatch alarm can be configured to trigger an SNS notification when the error rate exceeds 5% for 5 consecutive minutes, minimizing operational overhead.
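
The calculation the alarm performs can be sketched as follows. It assumes one-minute datapoints, with `m1` standing for 'HTTPCode_ELB_5XX_Count' and `m2` for 'RequestCount' in a math expression such as `100 * m1 / m2`; the helper names and sample numbers are illustrative only.

```python
def error_rate_percent(elb_5xx, requests):
    """Equivalent of the math expression 100 * m1 / m2 for one datapoint."""
    if requests == 0:
        return 0.0  # no traffic, so no error rate to report
    return 100.0 * elb_5xx / requests


def alarm_breaching(datapoints, threshold=5.0, periods=5):
    """True when the last `periods` datapoints all exceed the threshold."""
    recent = datapoints[-periods:]
    if len(recent) < periods:
        return False
    return all(error_rate_percent(e, r) > threshold for e, r in recent)


# Each tuple is (5xx count, request count) for one minute.
sustained = [(10, 100)] * 5
recovered = [(10, 100)] * 4 + [(1, 100)]
print(alarm_breaching(sustained))  # True
print(alarm_breaching(recovered))  # False
```

In CloudWatch terms this corresponds to a 60-second period with five evaluation periods, all of which must breach.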

Exam trap

The trap here is that candidates may confuse CloudTrail (which logs API activity) with CloudWatch metrics (which track performance data), or assume VPC Flow Logs can provide HTTP-level error codes when they only capture network-layer information.

How to eliminate wrong answers

Option A is wrong because VPC Flow Logs capture network-level traffic metadata (IP addresses, ports, protocols) and do not include HTTP status codes, making them unsuitable for detecting application-layer 5xx errors; analyzing them with Athena adds unnecessary complexity and cost. Option C is wrong because CloudTrail records API calls to the ALB (e.g., configuration changes) and does not capture HTTP response codes from client requests; metric filters on CloudTrail logs cannot extract 5xx error rates from traffic. Option D is wrong because AWS Config rules evaluate resource configuration compliance (e.g., security group settings, deletion protection) and cannot monitor real-time traffic metrics like error rates; they are designed for drift detection, not performance monitoring.

184

MCQmedium

A company is using AWS CloudTrail to log API activity in their account. The security team needs to be alerted when an IAM user creates a new access key. Which solution meets this requirement with the least operational overhead?

A.Enable CloudTrail email notifications for management events.

B.Configure the IAM user's permissions to require MFA for access key creation.

C.Create an Amazon EventBridge rule that matches the CreateAccessKey event and targets an SNS topic.

D.Write a Lambda function that periodically scans CloudTrail logs in S3 and sends alerts.

AnswerC

EventBridge can react to CloudTrail events in real time.

Why this answer

Option C is correct because Amazon EventBridge can directly match the `CreateAccessKey` API call from CloudTrail and trigger an SNS topic to send an alert in real time, requiring no custom code or polling. This provides the least operational overhead by using a fully managed, event-driven rule.
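A minimal sketch of what such a rule might match: the dictionary below follows the usual shape of an EventBridge event pattern for a CloudTrail-recorded IAM call, and the `matches` helper is a hypothetical, simplified stand-in for EventBridge's own matching (exact values only, no advanced operators), included just to show how the pattern selects events.

```python
CREATE_ACCESS_KEY_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateAccessKey"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified exact-value matcher (hypothetical helper, not an AWS API)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: the event value must be one of the listed values.
            return False
    return True
```

The rule's target would then be the SNS topic, so no polling or custom Lambda code is involved.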

Exam trap

The trap here is that candidates may think CloudTrail itself can send email alerts (Option A) or that a security control like MFA (Option B) satisfies the alerting requirement, but neither provides the real-time notification specified in the question.

How to eliminate wrong answers

Option A is wrong because CloudTrail does not support email notifications for management events; it can only deliver log files to S3 or CloudWatch Logs, and email alerts require an additional service like SNS. Option B is wrong because requiring MFA for access key creation is a security control that prevents unauthorized creation but does not generate alerts when a key is created. Option D is wrong because writing a Lambda function to periodically scan CloudTrail logs in S3 introduces unnecessary complexity, latency, and operational overhead compared to a real-time EventBridge rule.

185

MCQhard

A company has a CloudWatch dashboard that displays custom metrics from an application. Some metrics are missing from the dashboard, but the application is publishing them to CloudWatch. The SysOps administrator confirms the metrics are present in the CloudWatch console under the correct namespace. What could be causing the metrics to not appear on the dashboard?

A.The IAM user viewing the dashboard does not have cloudwatch:GetMetricData permission.

B.The dashboard's time range is set to the last 1 hour, but the metrics were published 2 hours ago.

C.The dashboard widget is configured with a period that is longer than the interval at which metrics are published.

D.The namespace of the custom metrics is misspelled in the dashboard configuration.

AnswerC

A widget period that does not match the interval at which the metric is published changes how datapoints are aggregated and displayed, so recent or sparse datapoints can fail to appear on the widget even though the metric exists in CloudWatch.

Why this answer

Option C is correct because if the dashboard widget's period (e.g., 5 minutes) is longer than the metric publication interval (e.g., 1 minute), CloudWatch will aggregate the data points into the larger period. If the metric was published recently, the aggregation may not have completed, or the data points may fall outside the visible range of the widget's period, causing them to not appear even though the raw metrics exist in the namespace.
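To make the aggregation step concrete, here is a simplified model (not CloudWatch's actual implementation) of how one-minute datapoints roll up into a longer period with the Sum statistic. The `aggregate_sum` helper and the sample values are assumptions for illustration only.

```python
from collections import defaultdict


def aggregate_sum(datapoints, period_seconds):
    """Roll (epoch_seconds, value) datapoints up into period-aligned Sum buckets."""
    buckets = defaultdict(float)
    for timestamp, value in datapoints:
        # Align each datapoint to the start of the period that contains it.
        bucket_start = timestamp - (timestamp % period_seconds)
        buckets[bucket_start] += value
    return dict(sorted(buckets.items()))


# Six datapoints published one minute apart (timestamps in seconds).
one_minute_points = [(0, 2.0), (60, 3.0), (120, 1.0), (180, 4.0), (240, 5.0), (300, 6.0)]
```

In this model a 60-second period yields six points, while a 300-second period yields only two, which is why the widget period is worth checking when a graph looks emptier than the raw metric.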

Exam trap

The trap here is that candidates often assume missing metrics are due to permissions or namespace errors, but the question explicitly states the metrics are visible in the CloudWatch console, pointing to a widget configuration issue like period mismatch.

How to eliminate wrong answers

Option A is wrong because the administrator can already see the metrics in the CloudWatch console under the correct namespace, which indicates that read access to the metric data is in place. Option B is wrong because a time-range mismatch would only hide older datapoints, and widening the dashboard's time range would bring them back; the scenario describes metrics that are missing from the dashboard, not merely outside the current range. Option D is wrong because the administrator confirmed the metrics are present under the correct namespace in the CloudWatch console, so the scenario points away from a namespace error and toward the widget's own settings.

186

Multi-Selecthard

A company is using AWS Organizations to manage multiple accounts. The security team wants to ensure that all API calls across all accounts are logged and retained for at least 7 years. Which THREE steps should the SysOps administrator take to meet this requirement? (Choose THREE.)

Select 3 answers

A.Use AWS Config to record API calls and store them in an S3 bucket.

B.Enable AWS CloudTrail in all AWS accounts and all regions.

C.Store CloudTrail logs in Amazon CloudWatch Logs with a retention policy of 7 years.

D.Configure a CloudTrail trail to deliver log files to a centralized Amazon S3 bucket.

E.Configure an S3 lifecycle policy on the centralized bucket to transition logs to Amazon S3 Glacier after 7 years.

AnswersB, D, E

This ensures all API calls are logged across the organization.

Why this answer

AWS CloudTrail is the service designed to record all API calls made within an AWS account. Enabling CloudTrail in all accounts and all regions ensures comprehensive logging of management events across the entire AWS Organization. This is a foundational step for meeting audit and compliance requirements.

Exam trap

The trap here is confusing AWS Config (which tracks configuration changes) with CloudTrail (which records API calls), leading candidates to incorrectly select AWS Config as a logging solution for API activity.

187

MCQmedium

A SysOps administrator needs to audit all changes to security groups in an AWS account. Which AWS service should be used to capture these changes?

A.VPC Flow Logs

B.AWS CloudTrail

C.CloudWatch Logs

D.AWS Config

AnswerB

CloudTrail logs all API calls, including security group modifications.

Why this answer

AWS CloudTrail is the correct service because it records API calls made to the AWS environment, including all CreateSecurityGroup, AuthorizeSecurityGroupIngress, RevokeSecurityGroupEgress, and DeleteSecurityGroup API actions. By enabling CloudTrail trail logging, the SysOps administrator can capture a complete audit trail of who made changes, when, and from which source IP, which is essential for security group change auditing.
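As a sketch of how that audit trail could be read, the function below filters already-parsed CloudTrail records for the four security group actions named above and pulls out who made the change, when, and from which source IP. The function name and the summary keys are hypothetical; the record fields (`eventName`, `userIdentity`, `eventTime`, `sourceIPAddress`) follow the standard CloudTrail record layout.

```python
SECURITY_GROUP_EVENTS = {
    "CreateSecurityGroup",
    "AuthorizeSecurityGroupIngress",
    "RevokeSecurityGroupEgress",
    "DeleteSecurityGroup",
}


def security_group_changes(records):
    """Summarize who changed a security group, when, and from which IP."""
    changes = []
    for record in records:
        if record.get("eventName") in SECURITY_GROUP_EVENTS:
            changes.append({
                "action": record["eventName"],
                "who": record.get("userIdentity", {}).get("arn"),
                "when": record.get("eventTime"),
                "source_ip": record.get("sourceIPAddress"),
            })
    return changes
```

The set above covers only the actions listed in the explanation; a real audit would likely include the other security group APIs as well.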

Exam trap

The trap here is that candidates often confuse AWS Config (which tracks configuration state changes) with CloudTrail (which tracks API call provenance), leading them to choose Config for auditing changes when only CloudTrail provides the identity and source of the change.

How to eliminate wrong answers

Option A is wrong because VPC Flow Logs capture network traffic metadata (IP addresses, ports, protocols) at the network interface level, not API-level changes to security group configurations. Option C is wrong because CloudWatch Logs is a service for storing, monitoring, and accessing log files from various sources (e.g., application logs, system logs), but it does not natively capture AWS API calls; it can only store CloudTrail logs if they are streamed to it, but it is not the primary service for capturing the changes. Option D is wrong because AWS Config evaluates resource configurations against desired rules and tracks configuration changes over time, but it does not capture the API caller identity or the source of the change; it records the state of the security group after the change, not the who and how of the API call.

188

MCQeasy

Refer to the exhibit. The command returns no datapoints for CPUUtilization for the specified instance. What is the most likely reason?

A.The instance was stopped or did not emit metrics during the specified time range.

B.The metric name is incorrect.

C.The instance does not have detailed monitoring enabled.

D.The period of 300 seconds is too short.

AnswerA

If the instance is stopped, no metrics are emitted.

Why this answer

The most likely reason for no datapoints is that the instance was stopped or did not emit metrics during the specified time range. CloudWatch returns datapoints only for periods in which EC2 actually published CPUUtilization for the instance. A stopped instance publishes no metrics, so a GetMetricStatistics call over that time range returns an empty response.

Exam trap

The trap here is that candidates often assume missing datapoints are due to a configuration issue (like detailed monitoring not enabled or wrong period), when in fact the instance simply wasn't running during the queried time window.

How to eliminate wrong answers

Option B is wrong because CPUUtilization is the standard EC2 metric name in the AWS/EC2 namespace, so the metric name is not the problem. Option C is wrong because basic monitoring (5-minute granularity) still emits CPUUtilization datapoints; detailed monitoring only affects the frequency (1-minute granularity), not the existence of data. Option D is wrong because a period of 300 seconds is a valid and common value for basic monitoring (matching the default 5-minute interval) and does not cause missing datapoints.

189

MCQmedium

A SysOps administrator is troubleshooting an application that runs on AWS Lambda. The application occasionally fails with timeout errors. The administrator needs to identify the exact lines of code that are causing the delays. Which AWS service or feature should be used to gather this information?

A.Enable detailed CloudWatch Logs and search for 'timeout' strings.

B.Use AWS X-Ray to trace the Lambda function and view segment details.

C.Set a CloudWatch Metric Filter for 'Duration' and create an alarm.

D.Enable AWS CloudTrail data events for the Lambda function.

AnswerB

X-Ray traces function executions and can be instrumented to capture subsegments for individual function calls, helping identify which lines or API calls are slow.

Why this answer

AWS X-Ray provides end-to-end tracing for Lambda functions, capturing segment details and subsegments that pinpoint the exact lines of code causing delays. By analyzing the trace timeline and annotations, the administrator can identify which specific function calls or operations exceed the timeout threshold, unlike CloudWatch Logs which only show aggregate duration or error strings without code-level granularity.

Exam trap

The trap here is that candidates often confuse CloudWatch Logs or Metrics (which show aggregate data) with the code-level tracing capability of X-Ray, assuming that searching for 'timeout' strings or monitoring 'Duration' metrics will reveal the exact lines of code causing the delay.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs can show 'timeout' strings but cannot trace the exact lines of code causing delays; they only log output from the function, not internal execution flow. Option C is wrong because a CloudWatch Metric Filter for 'Duration' and an alarm only monitors the overall execution time, not the specific code segments or lines responsible for the timeout. Option D is wrong because AWS CloudTrail data events record API calls to the Lambda service (e.g., Invoke, UpdateFunctionCode) but do not capture the internal execution trace or code-level timing within the function.

190

Multi-Selectmedium

A SysOps administrator sets up an alarm to monitor the CPU utilization of an Auto Scaling group. The alarm should trigger a scaling policy when CPU exceeds 80% for 5 consecutive minutes. The alarm state remains 'INSUFFICIENT_DATA' even though instances are running. Which THREE are possible causes?

Select 3 answers

A.The EC2 instances are not sending CPU utilization metrics to CloudWatch.

B.The alarm is configured with the wrong metric namespace.

C.The Auto Scaling group does not have a health check grace period configured.

D.The alarm's period is shorter than the evaluation period.

E.The Auto Scaling group uses ELB health checks.

AnswersA, B, D

If the CloudWatch agent is not installed or configured, no metrics are sent.

Why this answer

Option A is correct because if no CPU utilization datapoints reach CloudWatch for the metric the alarm watches, the alarm stays in the INSUFFICIENT_DATA state. This can happen if the alarm depends on agent-published metrics and the CloudWatch agent is not installed or configured on the instances, or if the instances lack the IAM permissions needed to publish metrics. Without metric data, CloudWatch cannot evaluate the alarm condition.

Exam trap

The trap here is that candidates often assume INSUFFICIENT_DATA is always due to missing metrics, but they overlook misconfigured namespaces or mismatched period/evaluation windows as equally valid causes.

191

MCQmedium

A company needs to continuously scan Amazon EC2 instances for software vulnerabilities and unintended network exposure. Which AWS service should be used?

A.AWS Config

B.Amazon Inspector

C.AWS Trusted Advisor

D.Amazon GuardDuty

AnswerB

Amazon Inspector is designed specifically to assess vulnerabilities and network exposure on EC2 instances.

Why this answer

Amazon Inspector is the correct service because it is specifically designed to automatically scan Amazon EC2 instances for software vulnerabilities (CVEs) and unintended network exposure (network reachability). It uses a combination of a managed agent (for OS-level assessment) and network configuration analysis to produce a detailed findings report, directly meeting the requirement for continuous scanning.

Exam trap

The trap here is that candidates often confuse Amazon Inspector with Amazon GuardDuty, mistakenly thinking GuardDuty performs vulnerability scanning when it actually focuses on threat detection from network and account activity, not on scanning EC2 instances for CVEs or network exposure.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for evaluating and auditing the configuration of AWS resources against desired policies (e.g., compliance rules), not for scanning for software vulnerabilities or network exposure. Option C is wrong because AWS Trusted Advisor provides high-level best-practice recommendations across cost, performance, security, and fault tolerance, but it does not perform deep vulnerability scanning of EC2 instances. Option D is wrong because Amazon GuardDuty is a threat detection service that analyzes VPC Flow Logs, DNS logs, and CloudTrail events for malicious activity, not for scanning EC2 instances for software vulnerabilities or unintended network exposure.

192

MCQeasy

A company has an AWS Lambda function that processes files uploaded to an S3 bucket. The function fails intermittently with a timeout error. What should the SysOps administrator do to monitor and resolve this issue?

A.Place the S3 bucket in a different AWS region to reduce latency.

B.Enable provisioned concurrency for the Lambda function.

C.Increase the Lambda function timeout and review the function logs in CloudWatch Logs.

D.Configure EC2 Auto Scaling to launch more instances for the Lambda function.

AnswerC

Increasing timeout may fix the timeout error, and logs help identify the cause.

Why this answer

Option C is correct because increasing the Lambda function timeout directly addresses the intermittent timeout error by allowing the function more time to complete execution before Lambda terminates it. Reviewing CloudWatch Logs is essential to identify the root cause, such as slow downstream dependencies or inefficient code, enabling targeted optimization. This approach aligns with standard troubleshooting for Lambda timeout issues.

Exam trap

The trap here is that candidates may confuse Lambda's scaling behavior with EC2 Auto Scaling, or assume that provisioned concurrency fixes execution duration issues, when in fact it only addresses cold start latency.

How to eliminate wrong answers

Option A is wrong because moving the S3 bucket to a different region does not resolve Lambda timeout errors; it may increase latency due to cross-region data transfer and does not affect the function's execution duration. Option B is wrong because provisioned concurrency initializes execution environments to reduce cold starts, but it does not extend the maximum execution time allowed by the Lambda timeout setting. Option D is wrong because EC2 Auto Scaling manages EC2 instances, not Lambda functions; Lambda scales automatically based on incoming requests, and Auto Scaling has no role in Lambda execution.

193

Multi-Selecthard

A SysOps administrator is troubleshooting an issue where an EC2 instance has failed a status check. The instance is still running but is unresponsive. Which THREE actions should the administrator take to diagnose and resolve the issue? (Choose THREE.)

Select 3 answers

A.Reboot the instance.

B.Check the system status checks in the EC2 console.

C.Review the instance system log (console output).

D.Restore the instance from the latest AMI.

E.Stop and start the instance (recovery action).

AnswersB, C, E

System status checks indicate underlying hardware/network issues.

Why this answer

System status checks (Option B) monitor the underlying physical host for problems such as loss of network connectivity or power, while instance status checks detect OS-level problems. Reviewing the system log (Option C) helps identify kernel panics or boot failures. Stopping and starting the instance (Option E) typically moves it to a new physical host, which can resolve host-level impairments without losing the instance's configuration or the data on its EBS volumes.

Exam trap

The trap here is that candidates confuse 'reboot' with 'stop/start' — rebooting does not change the underlying host, while stopping and starting does, which is the key recovery action for host-level failures.

194

Multi-Selecteasy

A SysOps administrator is setting up monitoring for an Amazon RDS instance. Which TWO metrics are available by default in Amazon CloudWatch for RDS?

Select 2 answers

A.DatabaseConnections

B.CPUUtilization

C.FreeableMemory

D.FailedLoginAttempts

E.WriteLatency

AnswersA, B

Default metric for RDS.

Why this answer

Amazon RDS automatically publishes a set of metrics to CloudWatch at no additional cost. DatabaseConnections is one of these default metrics, as it tracks the number of database client connections to the RDS instance. This metric is essential for monitoring connection limits and application connectivity health.

Exam trap

AWS often tests the distinction between default CloudWatch metrics and Enhanced Monitoring metrics, trapping candidates who assume all performance-related metrics (like FreeableMemory or WriteLatency) are automatically available without additional configuration.

195

MCQmedium

A company has deployed a web application on EC2 instances with an Auto Scaling group. The SysOps administrator needs to automatically replace any instance that is in a 'failed' status as reported by the EC2 status checks. Which action should the administrator take?

A.Create an AWS Config rule to detect failed status checks and trigger a remediation action.

B.Create an AWS Lambda function that stops and starts the failed instance.

C.Configure the Auto Scaling group to use EC2 status checks for health checks.

D.Create a CloudWatch alarm on the StatusCheckFailed metric and trigger an SNS notification.

AnswerC

Auto Scaling can automatically replace instances based on health checks.

Why this answer

Option C is correct because an Auto Scaling group can be configured to use EC2 status checks (both system and instance) as the health check type. When the status check reports a failed status, the Auto Scaling group automatically terminates the unhealthy instance and launches a new one to replace it, ensuring self-healing without manual intervention.

Exam trap

The trap here is that candidates often confuse monitoring (CloudWatch alarms or SNS notifications) with automated remediation, forgetting that Auto Scaling groups have a built-in health check replacement feature that directly addresses the requirement to automatically replace failed instances.

How to eliminate wrong answers

Option A is wrong because AWS Config rules are designed for compliance and resource configuration auditing, not for real-time health monitoring or automatic replacement of failed instances; they lack the native ability to trigger instance replacement in an Auto Scaling group. Option B is wrong because stopping and starting a failed instance does not replace it; the instance retains its private IP and may still be unhealthy, and this approach bypasses the Auto Scaling group's lifecycle management, potentially causing state inconsistencies. Option D is wrong because a CloudWatch alarm on StatusCheckFailed with an SNS notification only sends an alert; it does not automatically replace the instance, requiring manual or additional automation to trigger the replacement.

196

MCQhard

A company uses Amazon CloudWatch Logs to collect logs from multiple EC2 instances. The SysOps administrator needs to create a metric filter that counts the number of ERROR-level log entries per hour and triggers an alarm when the count exceeds 100 in any 5-minute period. Which metric filter pattern should be used?

A.Use the pattern "ERROR" and set the metric value to 100.

B.Use the pattern "ERROR" and set the metric value to 1.

C.Use the pattern "ERROR *" to match any log entry starting with ERROR.

D.Use the pattern "[ERROR, 5]" to match 5 consecutive ERROR entries.

AnswerB

Each log entry containing ERROR increments the metric by 1, allowing accurate counting.

Why this answer

Option B is correct because a CloudWatch Logs metric filter counts each log event that matches the pattern. Setting the metric value to 1 ensures that each matching log entry increments the metric by 1, allowing the alarm to evaluate the sum over a 5-minute period against the threshold of 100. The pattern "ERROR" matches any log entry containing the string "ERROR" anywhere in the message.
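The counting logic above can be sketched as follows. `METRIC_FILTER` mirrors the parameter names of the CloudWatch Logs PutMetricFilter API, but the log group, filter, metric, and namespace names are hypothetical. The two helper functions are a simplified local model of how matched events add up within one 5-minute period, not the service's implementation.

```python
METRIC_FILTER = {
    "logGroupName": "/my-app/production",  # hypothetical log group
    "filterName": "error-count",           # hypothetical filter name
    "filterPattern": '"ERROR"',
    "metricTransformations": [{
        "metricName": "ErrorCount",        # hypothetical metric name
        "metricNamespace": "MyApp",        # hypothetical namespace
        "metricValue": "1",                # each matching event adds 1
    }],
}


def window_sum(messages, metric_value=1):
    """Sum emitted for the log lines that fall in a single 5-minute period."""
    return sum(metric_value for message in messages if "ERROR" in message)


def breaches(messages, threshold=100, metric_value=1):
    """True when the period's Sum exceeds the alarm threshold."""
    return window_sum(messages, metric_value) > threshold
```

Passing `metric_value=100` to the helpers shows why Option A miscounts: the sum then grows by 100 per matching line instead of by 1.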

Exam trap

The trap here is that candidates often think the metric value should match the alarm threshold (e.g., 100) or that wildcards or special syntax are needed, when in fact the metric value should be 1 and the threshold is set in the alarm definition.

How to eliminate wrong answers

Option A is wrong because setting the metric value to 100 would cause each matching log entry to add 100 to the metric, so the 5-minute sum would cross the threshold of 100 after only two ERROR entries rather than after more than 100. Option C is wrong because the pattern "ERROR *" uses a wildcard that is not valid in CloudWatch Logs metric filter syntax; CloudWatch Logs uses space-delimited token matching with brackets for positional patterns, not glob-style wildcards. Option D is wrong because the pattern "[ERROR, 5]" is not valid metric filter syntax; CloudWatch Logs does not support counting consecutive entries or specifying a count within the pattern itself.

197

MCQmedium

A company wants to receive alerts when an Auto Scaling group launches or terminates instances. They already have a CloudTrail trail enabled. What is the simplest way to achieve this?

A.Create a CloudWatch Events rule that matches Auto Scaling event patterns and sends notifications to an SNS topic.

B.Write a script on each EC2 instance to call the CloudWatch Logs API on launch/termination.

C.Configure the Auto Scaling group to publish lifecycle hooks and use Lambda to send notifications.

D.Enable CloudWatch detailed monitoring on the Auto Scaling group and create alarms.

AnswerA

Simplest approach using existing CloudTrail and CloudWatch Events.

Why this answer

Option A is correct because CloudWatch Events (now part of Amazon EventBridge) can automatically capture Auto Scaling group state changes (launch and terminate) via CloudTrail API calls. By creating a rule that matches the specific event pattern for Auto Scaling events (e.g., 'EC2 Instance Launch Successful' and 'EC2 Instance Terminate Successful'), you can directly route those events to an SNS topic, which then sends notifications (e.g., email or SMS). This requires no custom code, lifecycle hooks, or additional monitoring configuration, making it the simplest solution.
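For illustration, a rule for the two event types named above might use an event pattern like the one built here. The `event_pattern_json` helper and its optional group-name filter are hypothetical conveniences for this sketch; the code only produces the JSON text and does not create the rule.

```python
import json

AUTO_SCALING_PATTERN = {
    "source": ["aws.autoscaling"],
    "detail-type": [
        "EC2 Instance Launch Successful",
        "EC2 Instance Terminate Successful",
    ],
}


def event_pattern_json(group_name=None):
    """Return the event pattern as JSON, optionally limited to one group."""
    pattern = dict(AUTO_SCALING_PATTERN)
    if group_name:
        # Narrow the rule to a single Auto Scaling group.
        pattern["detail"] = {"AutoScalingGroupName": [group_name]}
    return json.dumps(pattern)
```

The rule's target would be the SNS topic, which handles email or SMS delivery to subscribers.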

Exam trap

The trap here is that candidates often overcomplicate the solution by choosing lifecycle hooks (Option C) or custom scripts (Option B), not realizing that CloudWatch Events can directly consume CloudTrail API events for Auto Scaling without additional infrastructure.

How to eliminate wrong answers

Option B is wrong because writing a script on each EC2 instance to call the CloudWatch Logs API on launch/termination is overly complex, requires custom code on every instance, and does not inherently trigger alerts; it only logs data, not send notifications. Option C is wrong because lifecycle hooks are designed for custom actions during scaling events (e.g., running a script before termination), but they add unnecessary complexity and cost (Lambda execution) when the goal is simply to receive alerts; CloudWatch Events can do this without hooks. Option D is wrong because CloudWatch detailed monitoring on an Auto Scaling group provides metrics like CPU utilization, not instance launch/termination events; alarms based on these metrics cannot detect scaling actions directly, and they would require threshold-based logic that is indirect and unreliable for this use case.

198

MCQeasy

A company wants to receive a notification when the root account is used to perform any action in the AWS account. Which service should be used to monitor this?

A.AWS Config

B.Amazon CloudWatch Logs

C.AWS CloudTrail with Amazon CloudWatch Events

D.AWS Trusted Advisor

AnswerC

CloudTrail records API calls; CloudWatch Events can trigger notifications on specific events.

Why this answer

AWS CloudTrail logs API activity in the account, including root user actions. With a CloudTrail trail in place, those events reach Amazon CloudWatch Events (now part of Amazon EventBridge), where a rule can match any API call whose `userIdentity.type` is `Root` and trigger a notification via SNS or Lambda. This combination provides near-real-time monitoring and alerting for root account usage.

Exam trap

The trap here is that candidates confuse AWS Config (which audits resource configurations) with CloudTrail (which audits API activity), or they think CloudWatch Logs alone can trigger alerts without CloudWatch Events, missing the requirement for event-driven notification.

How to eliminate wrong answers

Option A is wrong because AWS Config is designed for resource inventory, configuration history, and compliance rules (e.g., checking if S3 buckets are public), not for monitoring real-time API calls or user actions. Option B is wrong because Amazon CloudWatch Logs can store and query log data, but it cannot directly generate notifications from CloudTrail events without a CloudWatch Events rule or metric filter; the question requires a service that monitors root actions, and CloudWatch Logs alone lacks the event-driven alerting capability. Option D is wrong because AWS Trusted Advisor provides best-practice checks for cost optimization, performance, security, and fault tolerance, but it does not monitor or alert on specific user actions like root account usage.

199

MCQmedium

Refer to the exhibit. An EC2 instance is running the CloudWatch Logs agent and uses the IAM policy shown. The agent is configured to send logs to the log group 'MyAppLogGroup'. However, logs are not appearing. What is the MOST likely cause?

A.The log group name in the policy does not match the agent configuration.

B.The policy does not allow the 'logs:PutLogEvents' action.

C.The policy is missing permission to create the log group if it does not exist.

D.The Resource ARN incorrectly specifies a wildcard after the log group name.

AnswerC

The agent may need 'logs:CreateLogGroup' permission to create the log group.

Why this answer

Option C is correct because the CloudWatch Logs agent can create a log group only when the IAM policy grants the `logs:CreateLogGroup` action. Without this permission, if the log group 'MyAppLogGroup' does not already exist, the agent fails to send logs even though the `logs:PutLogEvents` action is allowed. The policy shown grants only `logs:PutLogEvents` and `logs:DescribeLogStreams`; it lacks the `logs:CreateLogGroup` and `logs:CreateLogStream` actions needed for initial setup.
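A policy that covers the initial setup might look like the sketch below. The exhibit itself is not reproduced here, so this is an assumed reconstruction: the function name is hypothetical, and listing the log group ARN both with and without the trailing `:*` is a cautious choice for the example rather than a statement of what the original policy contained.

```python
import json


def logs_agent_policy(region: str, account_id: str, log_group: str) -> dict:
    """Build an IAM policy document that lets the agent create and write logs."""
    group_arn = f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",     # needed if the group does not exist yet
                "logs:CreateLogStream",    # needed for each new log stream
                "logs:PutLogEvents",
                "logs:DescribeLogStreams",
            ],
            # The group itself plus everything under it (its log streams).
            "Resource": [group_arn, f"{group_arn}:*"],
        }],
    }


policy_json = json.dumps(
    logs_agent_policy("us-east-1", "123456789012", "MyAppLogGroup"), indent=2
)
```

Pre-creating the log group is an alternative that would let the policy omit `logs:CreateLogGroup`.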

Exam trap

The trap here is that candidates assume `logs:PutLogEvents` alone is sufficient for sending logs, overlooking the fact that the agent must first create the log group and log stream if they do not exist, which requires additional permissions.

How to eliminate wrong answers

Option A is wrong because the log group name in the policy ('MyAppLogGroup') matches the agent configuration, so there is no mismatch. Option B is wrong because the policy explicitly includes the `logs:PutLogEvents` action, so that permission is present. Option D is wrong because the Resource ARN `arn:aws:logs:us-east-1:123456789012:log-group:MyAppLogGroup:*` correctly uses a wildcard after the log group name to match all log streams within that group, which is standard practice.

200

MCQmedium

A SysOps administrator needs to automatically restart an Amazon RDS DB instance when the 'DatabaseConnections' metric exceeds a threshold of 200 for 5 consecutive minutes. The administrator wants a solution that uses minimal custom code and leverages AWS managed services. Which combination of services should be used?

A.Amazon CloudWatch alarm with an Auto Scaling policy.

B.Amazon CloudWatch alarm with an Amazon Simple Notification Service (SNS) topic that triggers an AWS Lambda function to restart the instance.

C.Amazon CloudWatch alarm with an AWS Systems Manager Automation action.

D.Amazon RDS event subscription that triggers an AWS Lambda function.

AnswerC

Systems Manager Automation has a pre-built runbook, 'AWS-RebootRdsInstance', that can be triggered from a CloudWatch alarm via the 'Systems Manager Automation' action. No custom code is needed.

Why this answer

Option C is correct because AWS Systems Manager Automation provides a pre-built runbook for restarting an RDS DB instance, so the restart can be triggered by a CloudWatch alarm action without custom code. This leverages a managed service to restart the RDS instance automatically when the 'DatabaseConnections' metric exceeds 200 for 5 consecutive minutes, meeting the minimal custom code requirement.

Exam trap

The trap here is that candidates often assume a Lambda function is always required for custom remediation actions, but AWS Systems Manager Automation provides a no-code alternative for many common operations like restarting RDS instances, which directly meets the 'minimal custom code' constraint.

How to eliminate wrong answers

Option A is wrong because Auto Scaling policies are designed to scale EC2 instances or other Auto Scaling group resources, not to restart RDS instances; they cannot directly trigger a database restart. Option B is wrong because while it uses a Lambda function to restart the instance, it introduces custom code (the Lambda function logic) which violates the 'minimal custom code' requirement. Option D is wrong because RDS event subscriptions are for notification of events like instance creation or failure, not for triggering automated remediation based on CloudWatch metric thresholds; they lack the direct integration with CloudWatch alarms needed for this metric-based condition.

201

MCQeasy

A company wants to receive an email notification when an EC2 instance's status check fails. What AWS service should be used?

A.AWS CloudTrail

B.Amazon CloudWatch Alarm

C.AWS Config

D.AWS Trusted Advisor

AnswerB

CloudWatch Alarms can monitor status check metrics and send SNS notifications.

Why this answer

Amazon CloudWatch Alarms can monitor EC2 instance status checks (both system and instance checks) and trigger an action, such as sending an email via Amazon SNS, when a status check fails. This is the correct service because it directly integrates with EC2 metrics and supports alarm-based notifications for status check failures.

Exam trap

The trap here is that candidates often confuse AWS Config or CloudTrail with monitoring services, but only CloudWatch Alarms can directly monitor EC2 status check metrics and trigger notifications.

How to eliminate wrong answers

Option A is wrong because AWS CloudTrail records API activity and management events, not instance-level health metrics or status checks; it cannot trigger email notifications for status check failures. Option C is wrong because AWS Config evaluates resource configurations against rules and tracks configuration changes, but it does not monitor real-time operational metrics like EC2 status checks or send direct email alerts. Option D is wrong because AWS Trusted Advisor provides best-practice recommendations and checks for cost optimization, security, and fault tolerance, but it does not monitor EC2 status checks or send notifications for status check failures.

202

Matchingmedium

Match each AWS service to its primary function.

Concepts

Matches

Monitoring and observability

API activity logging

Resource compliance and configuration tracking

Best practice recommendations

Operational management and automation

Why these pairings

These are core AWS management and governance services.

203

MCQhard

A company has an application running on Amazon EC2 instances behind an Application Load Balancer (ALB). The application logs show intermittent 503 errors. The ALB access logs show that the errors occur when the target response time exceeds 30 seconds. Which configuration change should the SysOps administrator make to reduce the number of 503 errors without affecting the application's behavior?

A.Increase the deregistration delay on the target group

B.Increase the idle timeout setting on the ALB

C.Enable cross-zone load balancing on the ALB

D.Decrease the health check interval on the target group

AnswerB

ALB idle timeout is the maximum time the load balancer waits for a response. Increasing it allows longer-running requests to complete.

Why this answer

The errors appear once the target response time exceeds 30 seconds, which points to an idle timeout of 30 seconds on this Application Load Balancer (the default is 60 seconds). By increasing the idle timeout setting on the ALB to a value higher than the application's maximum expected response time (e.g., 60 or 120 seconds), the ALB will wait longer before closing the connection, so requests that the backend is still processing are no longer cut off.
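
As a rough illustration of the change described above, the sketch below builds the request for the ELBv2 `modify_load_balancer_attributes` call, which sets the `idle_timeout.timeout_seconds` attribute. The helper name and the load balancer ARN are hypothetical, and the 120-second value is only an example.

```python
def idle_timeout_attributes(load_balancer_arn: str, seconds: int) -> dict:
    """Build keyword arguments for elbv2 modify_load_balancer_attributes."""
    # ALB idle timeout accepts values from 1 to 4000 seconds.
    if not 1 <= seconds <= 4000:
        raise ValueError("ALB idle timeout must be between 1 and 4000 seconds")
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Attributes": [
            # Attribute values are passed as strings.
            {"Key": "idle_timeout.timeout_seconds", "Value": str(seconds)},
        ],
    }


# Hypothetical ARN, used only for illustration.
EXAMPLE_ARN = (
    "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
    "loadbalancer/app/example/0123456789abcdef"
)
params = idle_timeout_attributes(EXAMPLE_ARN, 120)

# With boto3 installed and credentials configured, the call would be:
# boto3.client("elbv2").modify_load_balancer_attributes(**params)
```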

Exam trap

The trap here is that candidates often confuse the ALB idle timeout with the target group deregistration delay or health check settings, mistakenly thinking that adjusting health checks or deregistration will fix timeout-related 503 errors, when the root cause is the ALB's connection timeout parameter.

How to eliminate wrong answers

Option A is wrong because the deregistration delay controls how long the ALB waits for in-flight requests to complete before it finishes deregistering a target, which does not address the 30-second response timeout causing the errors. Option C is wrong because cross-zone load balancing distributes traffic evenly across targets in all enabled Availability Zones, which improves load distribution but does not affect the ALB's idle timeout or prevent errors from slow responses. Option D is wrong because decreasing the health check interval on the target group makes health checks more frequent, which can detect unhealthy targets faster but does not change the ALB's connection timeout behavior that is causing the errors.

204

Multi-Selecteasy

Which TWO AWS services can be used to monitor the performance of an Amazon RDS database and set alarms based on metrics?

Select 2 answers

A.AWS Lambda

B.Amazon RDS Performance Insights

C.Amazon S3

D.Amazon CloudWatch

E.AWS CloudTrail

AnswersB, D

Performance Insights provides database performance metrics and can trigger alarms via CloudWatch.

Why this answer

Amazon CloudWatch (Option D) is the primary monitoring service for AWS resources, including Amazon RDS. It collects metrics like CPU utilization, database connections, and read/write latency, and allows you to set CloudWatch Alarms that trigger actions (e.g., SNS notifications) when thresholds are breached. Amazon RDS Performance Insights (Option B) provides deeper database performance analysis by visualizing database load and identifying bottlenecks, and it can also publish metrics to CloudWatch for alarm purposes.

Exam trap

The trap here is that candidates often confuse AWS CloudTrail (audit logging) with CloudWatch (monitoring), or think Lambda can be used for monitoring when it is actually a compute trigger, not a monitoring service.

205

Multi-Selectmedium

A company uses Amazon CloudWatch to monitor its AWS infrastructure. The operations team wants to receive notifications when any EC2 instance's status check fails. Which TWO steps should be taken to achieve this?

Select 2 answers

A.Create a CloudWatch alarm on the CPUUtilization metric with a threshold of 90%.

B.Configure the alarm to send a notification to an Amazon SNS topic.

C.Enable AWS CloudTrail to capture instance state changes.

D.Install the CloudWatch agent on the instance and publish custom memory metrics.

E.Create a CloudWatch alarm on the StatusCheckFailed (or StatusCheckFailed_Instance) metric with a threshold of greater than 0.

AnswersB, E

SNS delivers notifications via email, SMS, etc.

Why this answer

Option B is correct because Amazon CloudWatch alarms can be configured to send notifications to an Amazon SNS topic when the alarm state changes. This allows the operations team to receive immediate notifications (e.g., via email, SMS, or HTTP) when an EC2 instance's status check fails. Option E is also correct because the StatusCheckFailed metric (or StatusCheckFailed_Instance) directly reflects the result of the EC2 status check; setting an alarm with a threshold of greater than 0 triggers when any status check fails.
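
The two steps can be sketched as a single CloudWatch `put_metric_alarm` request: the alarm watches `StatusCheckFailed` and lists the SNS topic as its alarm action. This is a minimal sketch; the helper name, instance ID, topic ARN, period, and evaluation count are assumptions for illustration.

```python
def status_check_alarm(instance_id: str, topic_arn: str) -> dict:
    """Build keyword arguments for cloudwatch put_metric_alarm."""
    return {
        "AlarmName": f"status-check-failed-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                  # seconds per evaluation period
        "EvaluationPeriods": 2,        # assumed: two failing periods in a row
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",  # any failure (> 0)
        "AlarmActions": [topic_arn],   # SNS topic that delivers the email
    }


# Hypothetical identifiers, used only for illustration.
alarm = status_check_alarm(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:111122223333:ops-alerts",
)

# With boto3 installed and credentials configured:
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```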

Exam trap

The trap here is that candidates may confuse CloudWatch metrics like CPUUtilization or custom metrics with the built-in StatusCheckFailed metric, or think that CloudTrail can be used for real-time monitoring and alerting, when in fact it is designed for auditing API activity, not for instance health checks.

206

MCQmedium

Refer to the exhibit. A SysOps administrator runs this CloudWatch Logs Insights query against an application log group. The query returns no results, even though the administrator knows that errors occurred in the last hour. What is the most likely cause?

A.The 'stats' command requires a 'by' clause with a field name, but 'bin(5m)' is invalid.

B.The log group contains too many log events, causing the query to time out.

C.The @message field is not a valid field in CloudWatch Logs Insights.

D.The log group's retention policy is set to 1 day and the data is older than the retention period.

AnswerD

Data outside retention is not queryable.

Why this answer

Option D is correct because CloudWatch Logs Insights queries only return results from log events that are still present in the log group. If the log group's retention policy is set to 1 day, any data older than that retention period is automatically deleted. The administrator knows errors occurred in the last hour, but if the query is incorrectly scoped to a time range that includes data beyond the retention period, or if the errors were logged exactly at the boundary and have since been purged, the query will return no results.

This is the most likely cause given that the query syntax appears valid and the administrator confirms errors exist.

Exam trap

The trap here is that candidates often assume a query returns no results due to syntax errors or data volume issues, overlooking the fact that the log data may have been deleted by a short retention policy, which is a subtle but critical operational detail in AWS monitoring.

How to eliminate wrong answers

Option A is wrong because the 'stats' command in CloudWatch Logs Insights does not require a 'by' clause, and 'bin(5m)' is a valid function that groups timestamps into 5-minute intervals, so the syntax is correct. Option B is wrong because a large log group makes a query slower rather than empty: Logs Insights returns up to 10,000 results by default and times out only after 60 minutes of runtime, and a timed-out query ends with a timeout status rather than an empty result set. Option C is wrong because @message is a reserved field in CloudWatch Logs Insights that contains the raw log event text, and it is always available for querying.
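
To make the `bin(5m)` grouping concrete, the sketch below mimics what a `stats count(*) by bin(5m)` aggregation does: each timestamp is floored to the start of its 5-minute bucket and the events per bucket are counted. The exhibit's query is not reproduced here, so the function names and sample timestamps are purely illustrative.

```python
from collections import Counter

BIN_SECONDS = 5 * 60  # width of one bin(5m) bucket


def bin_5m(epoch_seconds: int) -> int:
    """Floor a timestamp to the start of its 5-minute bucket."""
    return epoch_seconds - (epoch_seconds % BIN_SECONDS)


def count_by_bin(timestamps) -> dict:
    """Count events per 5-minute bucket, keyed by bucket start time."""
    return dict(Counter(bin_5m(t) for t in timestamps))


# Three events fall in the first bucket, one in each of the next two.
counts = count_by_bin([0, 10, 299, 300, 601])
print(counts)  # {0: 3, 300: 1, 600: 1}
```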

207

MCQmedium

An application logs user authentication attempts to Amazon CloudWatch Logs. The SysOps administrator needs to create a custom metric that counts the number of failed authentication attempts every 5 minutes and trigger an alarm when the count exceeds 5. Which combination of actions should the administrator take?

A.Use the PutMetricData API in the application to publish the number of failed attempts as a custom metric, then create an alarm.

B.Create a metric filter on the log group for the string 'FAILED_AUTH', set the metric value to 1, then create an alarm on the resulting metric.

C.Use AWS CloudTrail to track authentication events and create a metric filter on the CloudTrail log group.

D.Use Amazon Athena to query the logs every 5 minutes and publish results to a CloudWatch metric.

AnswerB

A metric filter counts occurrences of a pattern in log events and publishes a metric that can be alarmed on, without requiring application changes.

Why this answer

Option B is correct because CloudWatch Logs metric filters allow you to extract a numeric value from log events and publish it as a custom metric. By creating a filter that matches the string 'FAILED_AUTH' and setting the metric value to 1, each failed attempt increments the metric. You can then set the metric's period to 5 minutes and create an alarm that triggers when the sum exceeds 5, meeting the requirement without modifying the application code.
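
A minimal sketch of the two requests involved, assuming the log group name, filter name, metric name, and namespace shown here (all hypothetical): the first dictionary matches the parameters of the CloudWatch Logs `put_metric_filter` call, the second those of the CloudWatch `put_metric_alarm` call.

```python
NAMESPACE = "Example/Auth"        # hypothetical custom namespace
METRIC_NAME = "FailedAuthCount"   # hypothetical metric name


def failed_auth_filter(log_group: str) -> dict:
    """Build keyword arguments for logs put_metric_filter."""
    return {
        "logGroupName": log_group,
        "filterName": "failed-auth",
        "filterPattern": '"FAILED_AUTH"',   # match the literal term
        "metricTransformations": [{
            "metricName": METRIC_NAME,
            "metricNamespace": NAMESPACE,
            "metricValue": "1",             # each match adds 1
        }],
    }


def failed_auth_alarm() -> dict:
    """Build keyword arguments for cloudwatch put_metric_alarm."""
    return {
        "AlarmName": "failed-auth-over-5",
        "Namespace": NAMESPACE,
        "MetricName": METRIC_NAME,
        "Statistic": "Sum",                 # total matches in the period
        "Period": 300,                      # 5 minutes
        "EvaluationPeriods": 1,
        "Threshold": 5,
        "ComparisonOperator": "GreaterThanThreshold",
        # Assumed choice: periods with no matching events count as healthy.
        "TreatMissingData": "notBreaching",
    }


metric_filter = failed_auth_filter("/example/app/auth")
alarm = failed_auth_alarm()
```

The `Sum` statistic over a 300-second period is what turns per-event values of 1 into a 5-minute count.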

Exam trap

The trap here is that candidates often confuse CloudTrail (which logs AWS API calls) with application-level logging, leading them to choose Option C, or they assume the application must be modified to publish metrics (Option A), missing the serverless metric filter approach.

How to eliminate wrong answers

Option A is wrong because it requires modifying the application to call the PutMetricData API, which adds complexity and couples the application to CloudWatch, whereas the requirement can be met without application changes using a metric filter. Option C is wrong because CloudTrail logs management events, not application-level authentication logs; it tracks API calls to AWS services, not user authentication attempts within an application. Option D is wrong because Amazon Athena is an interactive query service for analyzing data in S3, not a real-time or scheduled metric publisher; it cannot automatically publish results to CloudWatch every 5 minutes without custom orchestration, and it introduces unnecessary latency and cost.

208

MCQmedium

A company is running a stateful web application on EC2 instances behind an Application Load Balancer (ALB). Users report intermittent errors. The SysOps admin notices that the ALB's healthy host count is fluctuating. The admin wants to improve the health check configuration to reduce false positives. Which configuration change is most likely to help?

A.Increase the unhealthy threshold count

B.Reduce the health check interval

C.Increase the health check interval

D.Lower the healthy threshold count

AnswerC

A longer interval gives the application more time to recover from transient issues.

Why this answer

Increasing the health check interval reduces the frequency of health checks, which helps prevent transient issues (e.g., brief CPU spikes or network jitter) from causing false positives. With a longer interval, the ALB waits longer between checks, giving the instance more time to recover before being marked unhealthy. This stabilizes the healthy host count and reduces unnecessary instance replacement.

Exam trap

The trap here is that candidates often assume reducing the interval (more frequent checks) improves accuracy, but in reality it increases sensitivity to transient issues, causing more false positives and fluctuating health status.

How to eliminate wrong answers

Option A is wrong because increasing the unhealthy threshold count would make the ALB require more consecutive failed checks before marking an instance unhealthy, which can delay detection of truly unhealthy instances and does not address the root cause of false positives from frequent checks. Option B is wrong because reducing the health check interval increases the frequency of checks, which amplifies the impact of transient issues and makes false positives more likely, worsening the fluctuating healthy host count. Option D is wrong because lowering the healthy threshold count makes it easier for an instance to be marked healthy after fewer successful checks, which can mask underlying instability and does not prevent false positives caused by frequent checks.

209

MCQmedium

A SysOps administrator needs to analyze application logs stored in Amazon CloudWatch Logs to find specific error patterns across multiple log groups. The administrator wants to run queries to filter and parse the logs. Which feature should the administrator use?

A.CloudWatch Logs subscriptions

B.CloudWatch Logs Insights

C.CloudWatch Metric Filters

D.CloudWatch Contributor Insights

AnswerB

CloudWatch Logs Insights lets you query logs with a purpose-built query language, filter and parse the results, and visualize findings across multiple log groups.

Why this answer

CloudWatch Logs Insights is the correct feature because it enables you to interactively search and analyze log data stored in CloudWatch Logs using a purpose-built query language. It allows you to run queries across multiple log groups, filter, parse, and aggregate logs to identify specific error patterns, making it ideal for ad-hoc log analysis and troubleshooting.

Exam trap

The trap here is that candidates often confuse CloudWatch Metric Filters (which can filter logs for metric extraction) with the interactive querying capability of CloudWatch Logs Insights, but Metric Filters cannot parse or analyze log content across multiple log groups in a query-like manner.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs subscriptions are used to stream log data in real time to other services like Lambda, Kinesis, or Amazon OpenSearch Service for processing or storage, not for running interactive queries to filter and parse logs. Option C is wrong because CloudWatch Metric Filters extract metric data from logs to create CloudWatch metrics and trigger alarms, but they do not support running queries to parse and analyze log content for error patterns across multiple log groups. Option D is wrong because CloudWatch Contributor Insights analyzes time-series data to identify top contributors and understand traffic patterns, but it is not designed for querying and parsing raw log messages to find specific error patterns.

210

MCQeasy

A company needs to retain API call logs for 7 years for compliance. Which AWS service should be used to store these logs?

A.AWS CloudTrail

B.AWS Config

C.Amazon CloudWatch Logs

D.Amazon VPC Flow Logs

AnswerA

CloudTrail records API activity and can deliver logs to S3 for long-term retention.

Why this answer

AWS CloudTrail is the correct service because it records API activity across your AWS infrastructure and can be configured to store logs in an S3 bucket with lifecycle policies that retain data for 7 years. CloudTrail is specifically designed for auditing and compliance, capturing management and data plane API calls, and supports long-term retention via S3 Object Lock or lifecycle rules.
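
The retention side can be sketched as an S3 lifecycle configuration for the bucket that receives the trail's log files, in the shape expected by the S3 `put_bucket_lifecycle_configuration` call. The rule ID, the 90-day archive transition, and the two-day leap-year margin are assumptions for illustration, not requirements from the question.

```python
# 7 years of days, plus a margin for up to two leap days in the window.
RETENTION_DAYS = 7 * 365 + 2


def trail_lifecycle(prefix: str = "AWSLogs/") -> dict:
    """Build a LifecycleConfiguration for the CloudTrail log bucket."""
    return {
        "Rules": [{
            "ID": "retain-cloudtrail-7-years",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": prefix},        # CloudTrail writes under AWSLogs/
            # Assumed cost optimization: archive after 90 days.
            "Transitions": [{"Days": 90, "StorageClass": "GLACIER"}],
            # Objects are deleted only after the retention window has passed.
            "Expiration": {"Days": RETENTION_DAYS},
        }]
    }


lifecycle = trail_lifecycle()

# With boto3 installed and credentials configured (bucket name hypothetical):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-trail-bucket", LifecycleConfiguration=lifecycle)
```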

Exam trap

The trap here is that candidates confuse CloudTrail (API auditing) with CloudWatch Logs (operational logs) or Config (configuration history), but only CloudTrail captures the specific API call logs required for compliance retention.

How to eliminate wrong answers

Option B (AWS Config) is wrong because it tracks resource configuration changes and compliance over time, not API call logs; it stores configuration history, not API activity. Option C (Amazon CloudWatch Logs) is wrong because it is a destination for operational logs from applications and services and does not itself record API calls; log groups keep data indefinitely by default and retention is configurable, but the API call records must first come from CloudTrail. Option D (Amazon VPC Flow Logs) is wrong because it captures IP traffic metadata (source/destination IPs, ports, protocols) for network analysis, not API call logs; it is not designed for auditing API-level actions.

211

Multi-Selectmedium

A company wants to receive real-time notifications when specific API calls are made in their AWS account. Which TWO services can be used together to achieve this? (Choose TWO.)

Select 2 answers

A.AWS CloudTrail

B.AWS Config

C.AWS Lambda

D.Amazon CloudWatch Logs

E.Amazon CloudWatch Events

AnswersA, E

CloudTrail delivers API call logs.

Why this answer

AWS CloudTrail records the API calls made in the AWS account. Amazon CloudWatch Events (now Amazon EventBridge) receives the API calls recorded by CloudTrail as events with the detail type 'AWS API Call via CloudTrail', so event rules that match specific API calls can trigger a target such as an SNS topic or a Lambda function in near real time. This combination provides notification without polling or custom code.
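
An event rule of this kind is defined by an event pattern. The sketch below builds a pattern for API calls recorded by CloudTrail and checks it against a sample event with a deliberately simplified matcher (exact values and nested objects only, not the full EventBridge pattern syntax). The chosen API call, `TerminateInstances`, and the sample event are hypothetical.

```python
def api_call_pattern(event_source: str, event_names) -> dict:
    """Build an event pattern for API calls recorded by CloudTrail."""
    return {
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": [event_source],
            "eventName": list(event_names),
        },
    }


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: lists of exact values and nested dicts only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


pattern = api_call_pattern("ec2.amazonaws.com", ["TerminateInstances"])

# Hypothetical event, trimmed to the fields the pattern inspects.
sample_event = {
    "detail-type": "AWS API Call via CloudTrail",
    "source": "aws.ec2",
    "detail": {"eventSource": "ec2.amazonaws.com", "eventName": "TerminateInstances"},
}
print(matches(pattern, sample_event))  # True
```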

Exam trap

The trap here is that candidates often pick AWS Lambda alone, thinking it can directly monitor API calls, but Lambda requires a triggering event source such as CloudWatch Events or S3, not raw CloudTrail logs.

212

MCQmedium

A SysOps administrator is troubleshooting an issue where an Amazon EC2 instance running Amazon Linux 2 is not sending logs to CloudWatch Logs. The CloudWatch agent is installed and configured. Which step should the administrator take FIRST to diagnose the issue?

A.Reinstall the CloudWatch agent from the AWS Systems Manager.

B.Update the IAM role attached to the instance to include CloudWatch Logs permissions.

C.Verify the EC2 instance status in the AWS Management Console.

D.Check the CloudWatch agent status and review its log files on the instance.

AnswerD

The agent logs often contain error messages about configuration or connectivity.

Why this answer

The CloudWatch agent is already installed and configured, so the first step is to check its operational status and review its own log file (on Linux, typically /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log). This directly reveals whether the agent is running, encountering configuration errors, or failing to connect to the CloudWatch Logs service endpoint, without making unnecessary changes.

Exam trap

The trap here is that candidates assume the IAM role is the most common cause of log delivery failure and jump to updating it, but the question specifies the agent is already installed and configured, so the first logical step is to check the agent's own logs to confirm the exact error before making any changes.

How to eliminate wrong answers

Option A is wrong because reinstalling the agent from Systems Manager is a disruptive step that should only be taken after verifying the agent is not functioning due to corruption or misconfiguration, not as a first diagnostic step. Option B is wrong because updating the IAM role is a valid fix if permissions are missing, but the question states the agent is installed and configured; checking the agent's logs first will confirm whether the issue is permission-related (e.g., an AccessDeniedException) before modifying the role. Option C is wrong because verifying the EC2 instance status in the console only confirms the instance is running and reachable, but does not provide any insight into the CloudWatch agent's internal state or log delivery failures.

213

MCQmedium

A company runs a web application on EC2 instances behind an Application Load Balancer. The SysOps administrator creates a CloudWatch alarm on the ALB's HTTPCode_ELB_5XX_Count metric to trigger an SNS notification when there are many 5xx errors. However, the alarm remains in INSUFFICIENT_DATA state. What is a likely cause?

A.The HTTPCode_ELB_5XX_Count metric is not available for ALBs.

B.The ALB and the CloudWatch alarm are in different AWS Regions.

C.The IAM role for CloudWatch does not have permission to read the ALB metrics.

D.The alarm's period is set to 1 minute, but the metric is published every 5 minutes.

AnswerB

CloudWatch alarms cannot monitor metrics from a different region unless using cross-region functionality.

Why this answer

CloudWatch alarms can only evaluate metrics from the same AWS Region in which the alarm is created. If the ALB and the CloudWatch alarm are in different Regions, the alarm will never receive metric data points, causing it to remain in INSUFFICIENT_DATA state. This is a common cross-Region limitation for CloudWatch metrics.

Exam trap

The trap here is that candidates often overlook the regional scope of CloudWatch metrics and alarms, assuming that metrics are globally accessible, when in fact they are strictly regional.

How to eliminate wrong answers

Option A is wrong because HTTPCode_ELB_5XX_Count is a standard metric emitted by Application Load Balancers and is fully available in CloudWatch. Option C is wrong because CloudWatch does not require a separate IAM role to read ALB metrics; metrics are published automatically by AWS services, and the alarm uses the service-linked role or the caller's permissions, which are not the cause of INSUFFICIENT_DATA. Option D is wrong because even if the alarm's period is shorter than the metric's publication interval, CloudWatch will still receive data points (just less frequently) and the alarm would eventually show OK or ALARM, not remain in INSUFFICIENT_DATA indefinitely.

214

MCQhard

A SysOps administrator is investigating a security incident where an EC2 instance was used to launch an attack. The administrator needs to determine the source IP addresses that were used to access the instance prior to the attack. Which AWS service and feature should be used to capture this information?

A.Amazon GuardDuty

B.AWS CloudTrail with data events enabled for EC2

C.VPC Flow Logs

D.AWS Config with recording of security groups

AnswerC

Flow Logs capture network traffic metadata, including source IPs.

Why this answer

VPC Flow Logs capture metadata about IP traffic flowing to and from network interfaces in a VPC, including the source IP address, destination IP address, port, protocol, and timestamps. This allows the administrator to identify the source IP addresses that accessed the EC2 instance prior to the attack, as the logs record all accepted and rejected traffic at the network interface level.
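
As an illustration of how the source addresses can be pulled out, the sketch below parses flow log records in the default (version 2) space-separated format and collects the source IPs of accepted traffic to one instance. The sample records, account ID, and IP addresses are made up for the example.

```python
# Field order of the default (version 2) flow log record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_record(line: str) -> dict:
    """Split one default-format record into named fields."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def accepted_sources(lines, instance_ip: str) -> set:
    """Return source IPs of accepted traffic destined for instance_ip."""
    sources = set()
    for line in lines:
        record = parse_flow_record(line)
        if record["dstaddr"] == instance_ip and record["action"] == "ACCEPT":
            sources.add(record["srcaddr"])
    return sources


# Hypothetical records: one accepted SSH flow, one rejected flow.
records = [
    "2 111122223333 eni-0a1b2c3d 203.0.113.12 10.0.1.5 49152 22 6 20 4249 1700000000 1700000060 ACCEPT OK",
    "2 111122223333 eni-0a1b2c3d 198.51.100.7 10.0.1.5 49153 3389 6 1 40 1700000000 1700000060 REJECT OK",
]
print(accepted_sources(records, "10.0.1.5"))  # {'203.0.113.12'}
```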

Exam trap

The trap here is that candidates confuse AWS CloudTrail (which logs API calls) with network traffic logging, mistakenly thinking CloudTrail captures IP-level traffic data, when in fact only VPC Flow Logs record the actual source IP addresses of network connections to an EC2 instance.

How to eliminate wrong answers

Option A is wrong because Amazon GuardDuty is a threat detection service that analyzes logs (like VPC Flow Logs, DNS logs, and CloudTrail events) to identify malicious activity, but it does not natively store or provide raw source IP address logs for historical analysis of traffic to an instance. Option B is wrong because AWS CloudTrail records API calls made to AWS services (e.g., RunInstances, DescribeInstances), not the network traffic reaching an instance, so it does not capture metadata such as the source IP addresses of packets. Option D is wrong because AWS Config records configuration changes to resources like security groups, but it does not log network traffic flows or the source IP addresses of connections to an instance.

215

MCQeasy

A SysOps administrator wants to receive a notification when an EC2 instance's status check fails. Which AWS service should the administrator use to set up an alarm based on the status check metric?

A.Amazon CloudWatch Alarms

B.Amazon EventBridge

C.AWS Config

D.AWS Systems Manager

AnswerA

CloudWatch Alarms can monitor StatusCheckFailed metric and send notifications.

Why this answer

Amazon CloudWatch Alarms is the correct service because EC2 instance status checks are published as CloudWatch metrics (e.g., StatusCheckFailed, StatusCheckFailed_Instance, StatusCheckFailed_System). A CloudWatch Alarm can be configured to monitor these metrics and trigger an action, such as sending a notification via Amazon SNS, when the alarm state changes to ALARM. This directly meets the requirement to receive a notification when a status check fails.

Exam trap

The trap here is that candidates may confuse EventBridge's ability to react to EC2 instance state changes (e.g., running, stopped) with the need to monitor a continuous metric like status check failures, which requires CloudWatch Alarms for threshold-based evaluation.

How to eliminate wrong answers

Option B (Amazon EventBridge) is wrong because EventBridge is a serverless event bus used to route events from AWS services or custom sources to targets like Lambda or SQS; it does not natively evaluate metric thresholds or trigger alarms based on status check metrics. Option C (AWS Config) is wrong because Config is used for resource inventory, configuration history, and compliance auditing, not for real-time monitoring of operational metrics like status checks. Option D (AWS Systems Manager) is wrong because Systems Manager provides operational management capabilities (e.g., patching, automation, Run Command) but does not offer metric-based alarm functionality for EC2 status checks.

216

MCQmedium

An application running on EC2 instances in an Auto Scaling group is experiencing intermittent errors. The errors correlate with periods of high memory usage. The SysOps administrator wants to set up a CloudWatch alarm to scale out when memory usage exceeds 80%. What should the administrator do to enable monitoring of memory usage?

A.Enable detailed monitoring on the Auto Scaling group.

B.Install the CloudWatch agent on the EC2 instances to publish memory metrics.

C.Use the default EC2 memory metric provided by CloudWatch.

D.Create a Lambda function to pull memory data from the EC2 instances.

AnswerB

The CloudWatch agent collects memory and other custom metrics.

Why this answer

The CloudWatch agent is required to publish custom memory metrics from EC2 instances because memory utilization is not a standard metric provided by default. By installing and configuring the agent, the administrator can collect memory usage data and create a CloudWatch alarm to trigger Auto Scaling actions when usage exceeds 80%.
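
A minimal sketch of the agent configuration involved, generated here from Python so it can be checked: the `mem` section collects `mem_used_percent`, and the Auto Scaling group name is appended as a dimension so an alarm can aggregate across the group. The 60-second interval and the decision to aggregate by group are assumptions for illustration.

```python
import json


def memory_agent_config() -> dict:
    """Build a CloudWatch agent config that publishes memory utilization."""
    return {
        "metrics": {
            # Tag each datapoint with the instance's Auto Scaling group.
            "append_dimensions": {
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
            },
            # Also publish a rollup across all instances in the group.
            "aggregation_dimensions": [["AutoScalingGroupName"]],
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": 60,  # assumed interval
                },
            },
        }
    }


config_json = json.dumps(memory_agent_config(), indent=2)
print(config_json)
```

The agent publishes to the `CWAgent` namespace by default, so a scaling alarm would reference `mem_used_percent` there with a threshold of 80.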

Exam trap

The trap here is that candidates often assume memory metrics are automatically available in CloudWatch like CPU utilization, but AWS intentionally excludes guest-level metrics (memory, disk, swap) by default, requiring the CloudWatch agent to be installed and configured.

How to eliminate wrong answers

Option A is wrong because enabling detailed monitoring on the Auto Scaling group only increases the frequency of standard EC2 metrics (e.g., CPU, network) to 1-minute intervals, but does not add memory metrics. Option C is wrong because CloudWatch does not provide a default EC2 memory metric; memory is a guest-level metric that must be explicitly reported by an agent. Option D is wrong because while a Lambda function could theoretically pull memory data, it is an unnecessarily complex and indirect approach compared to the straightforward, supported method of using the CloudWatch agent, and it would require custom code and permissions management.

217

MCQhard

The monitoring team needs to collect per-process CPU and memory utilization for a specific Java process (named 'app.jar') running on EC2 Linux instances. Standard EC2 metrics show aggregate CPU but not per-process details. Which CloudWatch agent configuration section enables this?

A.Add a procstat section under metrics_collected in the CloudWatch agent config, specifying process_name = 'app.jar' to collect per-process CPU and memory

B.Enable enhanced monitoring on the EC2 instance and select 'per-process metrics' from the console

C.Configure a CloudWatch Logs metric filter on the Java GC log output to derive CPU and memory figures

D.Use the aws ec2 describe-instance-status API on a schedule to pull process metrics from the instance's system status checks

AnswerA

The procstat plugin uses the Linux /proc filesystem to sample per-process resource usage. With a selector that matches the JVM running 'app.jar' (the agent accepts exe, pattern, or pid_file), the agent publishes metrics like procstat_cpu_usage and procstat_memory_rss to CloudWatch every collection interval. These metrics carry dimensions that identify the matched process.

Why this answer

The CloudWatch agent's `procstat` plugin is specifically designed to collect per-process metrics such as CPU and memory utilization. By adding a `procstat` section under `metrics_collected` in the agent configuration file and identifying the process, the agent will gather the required per-process metrics and send them to CloudWatch. In the agent's configuration the process is selected with `exe`, `pattern`, or `pid_file`; for a JVM launched with 'app.jar', a `pattern` that matches the command line is the usual choice, because the executable name is `java`. This is the CloudWatch agent's built-in way to achieve per-process monitoring on EC2 Linux instances.
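
A minimal sketch of such a section, generated from Python so it can be checked. It assumes the JVM is selected with procstat's `pattern` selector (a match against the full command line) rather than a `process_name` key, since the executable of a Java application is `java`; the two measurements listed are the ones relevant to the question.

```python
import json


def procstat_agent_config(pattern: str) -> dict:
    """Build a CloudWatch agent config with a procstat section."""
    return {
        "metrics": {
            "metrics_collected": {
                "procstat": [{
                    # Assumed selector: match the process command line.
                    "pattern": pattern,
                    # Published as procstat_cpu_usage and procstat_memory_rss.
                    "measurement": ["cpu_usage", "memory_rss"],
                }],
            },
        }
    }


config_json = json.dumps(procstat_agent_config("app.jar"), indent=2)
print(config_json)
```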

Exam trap

The trap here is that candidates may confuse 'enhanced monitoring' (an Amazon RDS feature) or EC2 detailed monitoring (hypervisor-level metrics at 1-minute frequency) with OS-level per-process monitoring, or incorrectly assume that CloudWatch Logs metric filters can derive CPU/memory metrics from application logs, when in fact the CloudWatch agent's `procstat` plugin is what collects actual OS-level per-process resource utilization.

How to eliminate wrong answers

Option B is wrong because Enhanced Monitoring is an Amazon RDS feature, not an EC2 setting; the closest EC2 option, detailed monitoring, only raises the frequency of hypervisor-level metrics (e.g., CPU, network) to 1-minute intervals and cannot see inside the guest OS, so there are no 'per-process metrics' to select in the console. Option C is wrong because CloudWatch Logs metric filters can parse log patterns and create numerical metrics from log data, but Java GC logs do not contain CPU or memory utilization figures for the process; they only contain garbage collection timing and heap usage, not OS-level resource consumption. Option D is wrong because the `aws ec2 describe-instance-status` API returns instance health and status checks (e.g., system reachability, instance status) and has no capability to retrieve per-process metrics from within the instance.

218

MCQ (hard)

A company has a VPC with public and private subnets. An EC2 instance in a private subnet needs to send logs to CloudWatch Logs. Which steps are necessary to allow this without traversing the internet? (Select TWO.)

A. Place the instance in a public subnet with a public IP.

B. Attach an IAM role to the instance with permissions to call PutLogEvents.

C. Create a VPC endpoint for CloudWatch Logs (com.amazonaws.region.logs).

D. Ensure the instance is in the default VPC.

E. Attach a NAT gateway to the private subnet's route table.

Answer: B, C

The instance needs IAM permissions to send logs to CloudWatch Logs.

Why this answer

Option B is correct because the EC2 instance must have an IAM role attached that includes permissions for `logs:PutLogEvents` to authenticate and authorize log delivery to CloudWatch Logs. Without this role, the instance cannot sign API requests, even if network connectivity exists. Option C is correct because a VPC endpoint for CloudWatch Logs (com.amazonaws.region.logs) provides private connectivity via AWS PrivateLink, allowing the instance to send logs without traversing the internet or a NAT gateway.
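The two pieces described above can be sketched as plain data. This is a minimal sketch, not a complete deployment: the account ID, region, and log group name are placeholders, the helper names are hypothetical, and `logs:CreateLogStream` is included on the assumption that the instance also creates its own log streams.

```python
import json


def logs_endpoint_service_name(region):
    """Return the interface endpoint service name for CloudWatch Logs."""
    return f"com.amazonaws.{region}.logs"


def log_delivery_policy(log_group_arn):
    """Build an IAM policy document allowing log delivery to one log group."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                # The trailing ":*" covers the log streams inside the group.
                "Resource": f"{log_group_arn}:*",
            }
        ],
    }


# Placeholder values for illustration only.
region = "us-east-1"
group_arn = f"arn:aws:logs:{region}:123456789012:log-group:/app/example"

service_name = logs_endpoint_service_name(region)
policy = log_delivery_policy(group_arn)
print(service_name)
print(json.dumps(policy, indent=2))
```

The policy would be attached to the instance's IAM role (Option B), and the service name is what an interface VPC endpoint would be created for (Option C).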

Exam trap

The trap here is that candidates often assume a NAT gateway is required for private subnet outbound traffic, but for AWS services like CloudWatch Logs, a VPC endpoint provides a more secure and direct path without internet traversal.

How to eliminate wrong answers

Option A is wrong because placing the instance in a public subnet with a public IP would expose it to the internet, which violates the requirement to avoid traversing the internet and introduces unnecessary security risks. Option D is wrong because being in the default VPC does not inherently provide private connectivity to CloudWatch Logs; the default VPC still requires either a NAT gateway or a VPC endpoint for outbound traffic to AWS services. Option E is wrong because a NAT gateway provides internet access for private subnets, but it forces traffic to traverse the internet, which contradicts the requirement to avoid internet traversal; a VPC endpoint is the correct solution for private connectivity.

219

MCQ (medium)

A SysOps administrator is investigating a security incident and needs to determine who deleted an S3 bucket. Which AWS service should be used to find this information?

A. AWS CloudTrail

B. AWS Trusted Advisor

C. S3 server access logs

D. CloudWatch Logs for the S3 service

Answer: A

CloudTrail records all management events, including DeleteBucket.

Why this answer

AWS CloudTrail is the correct service because it records API activity across AWS services, including S3 bucket deletion events. When a user or role deletes an S3 bucket, CloudTrail logs the `DeleteBucket` API call with details such as the IAM user or role, source IP address, and timestamp, enabling the administrator to identify the responsible entity.
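To make the investigation concrete, the sketch below filters CloudTrail-style records for `DeleteBucket` calls against one bucket and pulls out who made the call. The field names follow the CloudTrail record layout (`eventSource`, `eventName`, `userIdentity`, `sourceIPAddress`, `eventTime`, `requestParameters`); the sample records, ARNs, and bucket names are invented for illustration.

```python
def find_bucket_deletions(records, bucket_name):
    """Return caller details for every DeleteBucket event on bucket_name."""
    hits = []
    for record in records:
        if record.get("eventSource") != "s3.amazonaws.com":
            continue
        if record.get("eventName") != "DeleteBucket":
            continue
        params = record.get("requestParameters") or {}
        if params.get("bucketName") != bucket_name:
            continue
        hits.append(
            {
                "who": (record.get("userIdentity") or {}).get("arn"),
                "from_ip": record.get("sourceIPAddress"),
                "when": record.get("eventTime"),
            }
        )
    return hits


# Hypothetical records, shaped like entries in a CloudTrail log file.
sample_records = [
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "DeleteBucket",
        "eventTime": "2024-01-15T10:32:00Z",
        "sourceIPAddress": "203.0.113.10",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/example-user"},
        "requestParameters": {"bucketName": "example-bucket"},
    },
    {
        "eventSource": "s3.amazonaws.com",
        "eventName": "PutBucketPolicy",
        "eventTime": "2024-01-15T09:00:00Z",
        "sourceIPAddress": "203.0.113.11",
        "userIdentity": {"arn": "arn:aws:iam::123456789012:user/other-user"},
        "requestParameters": {"bucketName": "example-bucket"},
    },
]

deletions = find_bucket_deletions(sample_records, "example-bucket")
print(deletions)
```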

Exam trap

The trap here is that candidates often confuse S3 server access logs (best-effort records of requests made to a bucket) with CloudTrail (the authoritative audit log of API calls and the identity behind each call), leading them to incorrectly choose S3 server access logs for bucket deletion events.

How to eliminate wrong answers

Option B (AWS Trusted Advisor) is wrong because it provides best-practice recommendations for cost, performance, security, and fault tolerance, but does not record or log API calls like S3 bucket deletions. Option C (S3 server access logs) is wrong because these logs are request records delivered on a best-effort basis to a target bucket; they are not enabled by default and are not designed as an audit trail of who performed management operations such as `DeleteBucket`. Option D (CloudWatch Logs for the S3 service) is wrong because S3 does not natively emit management-plane API logs to CloudWatch Logs. CloudTrail is the source for such events; CloudWatch Logs can be configured as a destination for CloudTrail, but it does not capture `DeleteBucket` events on its own.

220

MCQ (medium)

A company runs a web application on Amazon EC2 instances. The SysOps administrator needs to monitor two metrics: high CPU utilization (greater than 90%) and high memory utilization (greater than 85%). An alarm should trigger when both conditions are true simultaneously for a period of 5 minutes. Which CloudWatch feature should the administrator use to create this alarm?

A. Metric math

B. Composite alarm

C. Anomaly detection

D. Logs Insights

Answer: B

Composite alarms can combine multiple underlying alarms with AND logic, triggering only when all conditions are met.

Why this answer

A composite alarm in Amazon CloudWatch allows you to create an alarm that evaluates multiple conditions using logical operators (AND, OR, NOT). In this scenario, the administrator needs the alarm to trigger only when both CPU utilization > 90% AND memory utilization > 85% are true simultaneously for 5 minutes. Composite alarms evaluate the state of underlying metric alarms (e.g., two separate simple alarms for CPU and memory) and combine them with an AND condition, making it the correct feature for this requirement.
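A small sketch of the AND logic described above: it builds a composite alarm rule expression from two child alarm names and mirrors the same logic locally. The alarm names are hypothetical, and the memory alarm assumes the CloudWatch agent is publishing a memory metric, since EC2 does not report memory by default.

```python
def and_rule(alarm_names):
    """Build a composite alarm rule that requires every child alarm to be in ALARM."""
    return " AND ".join(f'ALARM("{name}")' for name in alarm_names)


def composite_state(child_states):
    """Local model of the AND rule: ALARM only when all children are in ALARM."""
    return "ALARM" if all(state == "ALARM" for state in child_states) else "OK"


# Hypothetical child alarms, each evaluating its own metric over 5 minutes.
rule = and_rule(["cpu-above-90", "memory-above-85"])
print(rule)

print(composite_state(["ALARM", "ALARM"]))
print(composite_state(["ALARM", "OK"]))
```

The rule string is the kind of expression a composite alarm takes as its alarm rule; each child alarm carries its own threshold and 5-minute evaluation window.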

Exam trap

The trap here is that candidates often confuse Metric Math with composite alarms, thinking that arithmetic operations can simulate logical AND, but Metric Math cannot evaluate alarm states or combine them with logical operators—it only produces a new numeric metric series.

How to eliminate wrong answers

Option A is wrong because Metric Math is used to perform arithmetic operations on multiple metrics (e.g., sum, average, rate) to create a single time series, but it cannot evaluate logical conditions like AND across separate alarms; it only produces a new metric, not an alarm with combined states. Option C is wrong because Anomaly Detection uses machine learning to detect deviations from expected behavior based on historical patterns, but it does not support combining two separate conditions with an AND operator; it creates a single band for a metric. Option D is wrong because Logs Insights is a query engine for analyzing CloudWatch Logs data, not for creating alarms based on real-time metric thresholds; it cannot trigger alarms directly and does not support composite logic.

221

MCQ (hard)

A company has a CloudWatch alarm that monitors the CPU utilization of an EC2 instance. The alarm is set to trigger when CPU utilization exceeds 80% for 5 consecutive minutes. The alarm state is 'INSUFFICIENT_DATA'. What does this mean?

A. The CPU utilization is below 80% for 5 minutes.

B. The CPU utilization has exceeded 80% for 5 minutes.

C. The alarm does not have enough data to determine the state.

D. The alarm is missing data points for the past 5 minutes.

Answer: C

INSUFFICIENT_DATA indicates that the metric has not provided enough data points for evaluation.

Why this answer

The INSUFFICIENT_DATA state in CloudWatch indicates that the alarm has not received enough metric data points to evaluate whether the threshold (CPU utilization > 80% for 5 consecutive minutes) has been breached. This typically occurs when the alarm has just been created, the EC2 instance is newly launched or stopped, or the metric is otherwise not being published. It does not imply any conclusion about the CPU utilization level itself.
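The three states can be illustrated with a deliberately simplified model. This is a toy sketch, not CloudWatch's actual evaluation algorithm: real alarms also depend on the missing-data treatment setting and can look further back when some data points are absent. The function name `alarm_state` is hypothetical, and `None` stands for a period with no data point.

```python
def alarm_state(datapoints, threshold, periods):
    """Toy model of alarm evaluation over the most recent `periods` data points.

    Simplification: no data at all in the window means INSUFFICIENT_DATA,
    a full window of breaching values means ALARM, anything else means OK.
    """
    window = datapoints[-periods:]
    present = [value for value in window if value is not None]
    if not present:
        return "INSUFFICIENT_DATA"
    if len(present) == periods and all(value > threshold for value in present):
        return "ALARM"
    return "OK"


# One data point per minute, threshold 80%, 5 consecutive periods.
print(alarm_state([None, None, None, None, None], 80, 5))
print(alarm_state([85, 90, 95, 88, 91], 80, 5))
print(alarm_state([85, 70, 95, 88, 91], 80, 5))
```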

Exam trap

The trap here is that candidates confuse INSUFFICIENT_DATA with missing data points for the exact evaluation period, but the state actually means the alarm cannot determine whether the threshold is breached due to a lack of sufficient data across the configured evaluation periods, not just the last 5 minutes.

How to eliminate wrong answers

Option A is wrong because INSUFFICIENT_DATA does not mean the CPU utilization is below 80%; that condition corresponds to the OK state. Option B is wrong because exceeding 80% for 5 minutes would trigger the ALARM state, not INSUFFICIENT_DATA. Option D is wrong because, while missing data points can cause INSUFFICIENT_DATA, the state specifically means the alarm cannot determine whether the threshold is breached, not merely that data points are missing for the past 5 minutes; the alarm evaluates across its configured evaluation periods and may still reach a decision with partial data.

222

MCQ (medium)

A company uses CloudWatch Logs to store application logs from EC2 instances. The SysOps team needs to search for specific error patterns across all log groups. What is the most efficient way to perform this search?

A. Define a CloudWatch metric filter to count errors and view the metric.

B. Use CloudWatch Logs Insights to run a query across the log groups.

C. Create a subscription filter to stream logs to Amazon ES and use Kibana.

D. Export the logs to Amazon S3 and use S3 Select to search.

Answer: B

CloudWatch Logs Insights is designed for interactive querying across log groups.

Why this answer

CloudWatch Logs Insights is purpose-built for ad-hoc querying and analysis of log data across multiple log groups. It lets you run queries in its purpose-built query language to search for specific patterns, filter results, and aggregate data without setting up additional infrastructure. This is the most efficient method for the SysOps team's requirement because it provides immediate, interactive search directly within the AWS Management Console or via the AWS CLI.
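A minimal sketch of what such a query might look like, built as a string in Python. The pattern `ERROR` and the helper name `error_query` are illustrative assumptions; `@timestamp`, `@log`, and `@message` are fields Logs Insights generates automatically, with `@log` identifying the source log group.

```python
def error_query(pattern, limit=50):
    """Build a Logs Insights query that finds recent events matching a regex."""
    return "\n".join(
        [
            "fields @timestamp, @log, @message",
            f"| filter @message like /{pattern}/",
            "| sort @timestamp desc",
            f"| limit {limit}",
        ]
    )


query = error_query("ERROR", limit=20)
print(query)
```

The same query text can be pasted into the Logs Insights console with several log groups selected, or submitted programmatically together with a time range and a list of log group names.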

Exam trap

The trap here is that candidates often confuse metric filters (which only aggregate counts) with the ability to search actual log content, leading them to choose Option A, or they over-engineer the solution by selecting Option C or D, not realizing that CloudWatch Logs Insights provides a native, serverless, and cost-effective query capability for exactly this scenario.

How to eliminate wrong answers

Option A is wrong because a metric filter only counts occurrences of a pattern and stores the count as a CloudWatch metric; it does not allow you to view the actual log events or search for specific error messages across log groups. Option C is wrong because creating a subscription filter to stream logs to Amazon Elasticsearch Service (now Amazon OpenSearch Service) and using Kibana adds significant setup overhead, latency, and cost; it is overkill for a simple search task and not the most efficient approach for an ad-hoc search. Option D is wrong because exporting logs to S3 and using S3 Select is inefficient for this use case: S3 Select is designed for querying structured data in files (like CSV or JSON) and requires a full export of logs, which is time-consuming and not suitable for real-time or frequent searches across multiple log groups.

223

Multi-Select (medium)

Which THREE actions can be taken to remediate an EC2 instance that has been compromised according to security best practices? (Select THREE.)

Select 3 answers

A. Terminate the instance after taking a forensic snapshot.

B. Create an AMI of the instance for forensic analysis.

C. Reboot the instance to clear the compromise.

D. Isolate the instance by removing it from security groups.

E. Apply any pending patches and restart the application.

Answers: A, B, D

Termination stops the compromised instance after evidence is preserved.

Why this answer

Terminating the instance after taking a forensic snapshot (Option A) is correct because it removes the compromised instance from the environment, preventing further malicious activity, while the snapshot preserves the disk state for offline forensic analysis. Note that EBS snapshots capture volume contents only, not the instance's memory. This aligns with the AWS security best practice of 'preserve evidence, then contain and eradicate': take EBS snapshots of the root volume and any additional data volumes before termination.

Exam trap

The trap here is that candidates mistakenly believe that rebooting (Option C) or patching (Option E) can remediate a compromise, when in fact these actions can destroy evidence and fail to remove the attacker's foothold. Creating an AMI (Option B) is a valid way to preserve the instance for forensic analysis and is part of the correct answer alongside the snapshot in Option A.

224

Multi-Select (easy)

A SysOps administrator needs to track changes to security group rules in a VPC. Which AWS services can be used to monitor and log these changes? (Choose TWO.)

Select 2 answers

A. VPC Flow Logs.

B. AWS Trusted Advisor.

C. AWS CloudTrail.

D. AWS Config.

E. Amazon CloudWatch Logs.

Answers: C, D

CloudTrail logs API calls that modify security groups.

Why this answer

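AWS CloudTrail is correct because it records API calls made to the Amazon EC2 service, including AuthorizeSecurityGroupIngress, RevokeSecurityGroupEgress, and CreateSecurityGroup. These events capture the identity, source IP, and timestamp of each change to security group rules, providing an audit trail for compliance and troubleshooting. AWS Config (Option D) complements this by recording the configuration history of each security group, so the administrator can see what the rules looked like before and after a change.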

Exam trap

The trap here is that candidates confuse VPC Flow Logs (which monitor traffic flows) with CloudTrail (which monitors API calls), or assume CloudWatch Logs alone can capture changes without a source like CloudTrail or Config.

225

Multi-Select (hard)

A company runs a critical application on Amazon EC2 instances in an Auto Scaling group. The SysOps administrator needs to detect and automatically replace an instance that is not responding to health checks. Which TWO steps should the administrator take? (Choose TWO.)

Select 2 answers

A. Manually terminate the unhealthy instance after the alarm triggers.

B. Configure the CloudWatch alarm to take the action of terminating the instance.

C. Configure the CloudWatch alarm to send a notification to an SNS topic.

D. Configure the Auto Scaling group to use ELB health checks.

E. Create a CloudWatch alarm on the 'StatusCheckFailed' metric for each instance.

Answers: B, E

Termination triggers the Auto Scaling group to launch a replacement instance.

Why this answer

Option B is correct because a CloudWatch alarm on the 'StatusCheckFailed' metric can invoke the EC2 terminate action for the unhealthy instance. Because the instance belongs to an Auto Scaling group, the group then launches a replacement to maintain its desired capacity, meeting the requirement for automated detection and replacement without manual intervention.
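The alarm described in Options B and E can be sketched as a set of parameters. This is a sketch under assumptions: the instance ID, region, period, and evaluation count are placeholders, and the helper name is hypothetical. The keys mirror the parameters of the CloudWatch `PutMetricAlarm` API, and the alarm action uses the `arn:aws:automate:<region>:ec2:terminate` form for the EC2 terminate action.

```python
def status_check_alarm(instance_id, region):
    """Build PutMetricAlarm-style parameters that terminate a failing instance."""
    return {
        "AlarmName": f"status-check-failed-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds per evaluation period (assumed value)
        "EvaluationPeriods": 2,  # consecutive failing periods (assumed value)
        "Threshold": 1,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        # EC2 action: terminate the instance when the alarm enters ALARM.
        "AlarmActions": [f"arn:aws:automate:{region}:ec2:terminate"],
    }


# Placeholder instance ID and region for illustration only.
params = status_check_alarm("i-0123456789abcdef0", "us-east-1")
for key, value in params.items():
    print(f"{key}: {value}")
```

With an SDK such as boto3, a dictionary like this could be passed as keyword arguments to the CloudWatch client's `put_metric_alarm` call; once the instance is terminated, the Auto Scaling group launches the replacement.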

Exam trap

The trap here is that candidates often confuse sending an SNS notification (Option C) with automated remediation, but notifications alone do not trigger instance replacement; the action must be a direct termination or scaling action.
