CCNA Monitoring, Logging, and Remediation Questions — Page 2 of 5

MCQ (easy)

A SysOps administrator needs to create a custom metric to track the number of active connections to an EC2 instance. Which steps should be taken? (Select TWO.)

A. Enable detailed monitoring on the EC2 instance.

B. Use the AWS CLI to call put-metric-data and publish the custom metric.

C. Store the metric data in an S3 bucket and configure CloudWatch to read from it.

D. Use the EC2 console to enable custom metric collection.

E. Install and configure the Amazon CloudWatch agent on the EC2 instance.

Answer: B, E

You can publish custom metrics using the CLI.

Why this answer

Option B is correct because the AWS CLI `put-metric-data` command allows you to publish custom metrics directly to CloudWatch, which is the standard method for sending application-level or OS-level metrics that are not automatically provided by AWS. Option E is correct because the Amazon CloudWatch agent can collect custom metrics from the EC2 instance (e.g., active connection counts from netstat or a script) and publish them to CloudWatch, making it the recommended approach for in-guest metric collection.
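To make the `put-metric-data` path concrete, here is a minimal sketch of the request body for the underlying PutMetricData API. The namespace `Custom/App`, the metric name `ActiveConnections`, and the instance ID are illustrative assumptions, not values from the question; the sketch only builds and prints the parameters rather than calling AWS.

```python
import json


def build_put_metric_data_kwargs(instance_id: str, active_connections: int) -> dict:
    """Build keyword arguments for the CloudWatch PutMetricData API.

    The namespace and metric name are hypothetical; choose your own.
    """
    return {
        "Namespace": "Custom/App",  # hypothetical custom namespace
        "MetricData": [
            {
                "MetricName": "ActiveConnections",  # hypothetical metric name
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Value": float(active_connections),
                "Unit": "Count",
            }
        ],
    }


# Hypothetical instance ID and connection count, for illustration only.
kwargs = build_put_metric_data_kwargs("i-0123456789abcdef0", 42)
print(json.dumps(kwargs, indent=2))

# With boto3 installed and credentials configured, the same dict could be sent as:
#   boto3.client("cloudwatch").put_metric_data(**kwargs)
```

The connection count itself would come from something running on the instance (a script or the CloudWatch agent), which is why B and E go together.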

Exam trap

The trap here is that candidates often confuse 'detailed monitoring' (which only increases frequency of existing metrics) with the ability to create new custom metrics, leading them to select Option A incorrectly.

How to eliminate wrong answers

Option A is wrong because enabling detailed monitoring on an EC2 instance only increases the frequency of standard hypervisor-level metrics (CPU, disk, network) from 5 minutes to 1 minute; it does not enable collection of custom metrics like active connections. Option C is wrong because CloudWatch cannot directly read metric data from an S3 bucket; you would need to use a Lambda function or other service to ingest the data into CloudWatch via PutMetricData. Option D is wrong because the EC2 console does not have a feature to enable custom metric collection; custom metrics must be published programmatically via the CloudWatch API, CLI, or an agent.

MCQ (easy)

A company uses Amazon CloudWatch to monitor its Amazon EC2 instances. The SysOps administrator wants to receive an email notification when any EC2 instance's CPUUtilization metric exceeds 90% for 5 consecutive minutes. Which combination of services should be used to meet this requirement with the least operational overhead?

A. Create a CloudWatch Logs metric filter and a Lambda function that sends email via SES

B. Create a CloudWatch metric alarm that sends a notification to an Amazon SNS topic subscribed with email endpoints

C. Create a CloudWatch Events rule that matches EC2 instance state changes and sends to SQS with a Lambda consumer

D. Configure a CloudWatch dashboard that displays CPU utilization and share it with the team

Answer: B

This directly meets the requirement. The alarm monitors the metric and triggers SNS to send emails.

Why this answer

Option B is correct because it directly uses a CloudWatch metric alarm configured to trigger when CPUUtilization exceeds 90% for 5 consecutive minutes, which then publishes to an Amazon SNS topic with email endpoints. This combination requires no custom code, no Lambda functions, and no additional services, minimizing operational overhead while meeting the requirement precisely.
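As a rough sketch of what such an alarm looks like, the parameters below follow the PutMetricAlarm API. The alarm name, instance ID, and SNS topic ARN are hypothetical, and the 60-second period assumes detailed monitoring is enabled; with basic 5-minute monitoring, a single 300-second period would be the closer equivalent.

```python
import json


def build_cpu_alarm_kwargs(instance_id: str, topic_arn: str) -> dict:
    """Build keyword arguments for the CloudWatch PutMetricAlarm API."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 60,             # seconds; assumes detailed monitoring
        "EvaluationPeriods": 5,   # 5 x 60 s = 5 consecutive minutes
        "Threshold": 90.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic with email subscriptions
    }


alarm_kwargs = build_cpu_alarm_kwargs(
    "i-0123456789abcdef0",                           # hypothetical instance
    "arn:aws:sns:us-east-1:123456789012:ops-alerts", # hypothetical topic ARN
)
print(json.dumps(alarm_kwargs, indent=2))

# With boto3 and credentials available:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm_kwargs)
```

An alarm like this is scoped to one instance through its dimension, so covering "any instance" would mean one alarm per instance or a different aggregation.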

Exam trap

The trap here is that candidates may overcomplicate the solution by introducing Lambda, SQS, or SES, when a native CloudWatch alarm with SNS is the simplest and most operationally efficient approach for metric-based threshold notifications.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs metric filters are designed to parse log data, not to evaluate EC2 metrics like CPUUtilization, and adding a Lambda function with SES introduces unnecessary complexity and overhead. Option C is wrong because CloudWatch Events rules that match EC2 instance state changes (e.g., running, stopped) cannot evaluate CPUUtilization thresholds, and using SQS with a Lambda consumer adds complexity without benefit. Option D is wrong because a CloudWatch dashboard only visualizes metrics and does not trigger any notifications or actions when thresholds are breached.

MCQ (medium)

A company uses an Application Load Balancer (ALB) to distribute traffic to a fleet of EC2 instances. The administrator notices that the ALB returns 503 errors during peak traffic. The instances are healthy according to the ALB health checks. What is the MOST likely cause and what metric should the administrator check?

A. The SSL certificate has expired; check the TLS handshake errors.

B. The ALB is overloaded; check the ALB's CPUUtilization metric.

C. The targets are failing health checks; check the HealthyHostCount metric.

D. The ALB's surge queue is full; check the SurgeQueueLength metric.

Answer: D

The SurgeQueueLength metric indicates the number of pending requests waiting to be routed. A high value can cause 503 errors.

Why this answer

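The ALB returns 503 errors despite healthy targets, which points to requests being rejected at the load balancer rather than failing at the targets. In the question's model, the load balancer buffers requests in a surge queue when traffic exceeds its capacity, and once the queue is full, new requests are rejected with a 503. SurgeQueueLength measures that queue depth, which makes D the intended answer. Be aware that SurgeQueueLength (along with SpilloverCount) is published for Classic Load Balancers; on an actual ALB, the closest signals are HTTPCode_ELB_503_Count and RejectedConnectionCount.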

Exam trap

The trap here is that candidates confuse ALB overload (surge queue full) with target health issues, but the question explicitly states healthy targets, so the root cause is the ALB's own capacity limit, not the targets.

How to eliminate wrong answers

Option A is wrong because an expired SSL certificate causes TLS negotiation failures on the client side (visible in ClientTLSNegotiationErrorCount), not HTTP 503 responses, and it would not correlate with peak traffic. Option B is wrong because load balancers are managed services that do not expose a CPUUtilization metric; they scale automatically, and in this question overload is indicated by SurgeQueueLength, not CPU. Option C is wrong because the question explicitly states that instances are healthy according to health checks, so HealthyHostCount would be normal and would not explain the 503 errors.

MCQ (hard)

Refer to the exhibit. A SysOps administrator runs the command and sees the output. The administrator then creates a CloudWatch alarm on the CPUUtilization metric for this instance, but the alarm state remains 'INSUFFICIENT_DATA'. What is a likely cause?

A. Detailed monitoring is not enabled.

B. The EC2 instance is stopped or terminated.

C. The instance is in a different AWS region.

D. The metric name is misspelled.

Answer: B

No data is emitted when the instance is stopped.

Why this answer

The command output shows the instance state is 'stopped'. CloudWatch cannot retrieve metrics from a stopped or terminated EC2 instance because the hypervisor is no longer running the instance's operating system or collecting CPU utilization data. When no metric data points are received for the configured alarm period, the alarm transitions to 'INSUFFICIENT_DATA' state.

Exam trap

The trap here is that candidates assume INSUFFICIENT_DATA always means a configuration issue (like missing detailed monitoring or wrong region), when in fact it often indicates the resource itself is not running and therefore not emitting any metrics.

How to eliminate wrong answers

Option A is wrong because detailed monitoring (1-minute granularity) is not required for basic CPUUtilization metrics; standard 5-minute monitoring still provides data points. Option C is wrong because a standard metric alarm evaluates metrics from the region in which the alarm is created; the command output shows the instance is in us-east-1, and the alarm would be created in that same region. Option D is wrong because 'CPUUtilization' is the standard metric name in the AWS/EC2 namespace and is spelled correctly here. A misspelled name would not be rejected at alarm creation time (CloudWatch accepts alarms on metrics that have no data yet), so it could also leave an alarm in INSUFFICIENT_DATA, but nothing in the scenario points to a typo.

MCQ (easy)

A company wants to ensure that it receives notifications whenever any AWS Identity and Access Management (IAM) user in the account creates a new access key. Which AWS service should be used to achieve this?

A. AWS Config

B. AWS CloudTrail

C. AWS Trusted Advisor

D. Amazon CloudWatch Events

Answer: D

EventBridge can match CloudTrail events and trigger SNS notifications.

Why this answer

Amazon CloudWatch Events (now part of Amazon EventBridge) can capture API calls from AWS CloudTrail and trigger a notification (e.g., via SNS) when an IAM user creates a new access key. By setting up a rule that matches the `CreateAccessKey` API call, the company can receive real-time alerts for this specific action.
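A rule of this kind is driven by an event pattern. The sketch below shows a pattern that would match `CreateAccessKey` calls recorded by CloudTrail, plus a deliberately simplified local matcher (exact-value lists and nested objects only, not the full EventBridge matching syntax) to illustrate how the pattern selects events. The sample event is a trimmed, hypothetical payload.

```python
import json

# Event pattern for IAM CreateAccessKey calls delivered via CloudTrail.
CREATE_ACCESS_KEY_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateAccessKey"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: supports exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


# Trimmed, hypothetical event for illustration.
sample_event = {
    "source": "aws.iam",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "iam.amazonaws.com", "eventName": "CreateAccessKey"},
}

print(json.dumps(CREATE_ACCESS_KEY_PATTERN))
print(matches(CREATE_ACCESS_KEY_PATTERN, sample_event))  # True
```

Because IAM is a global service whose CloudTrail events are recorded in us-east-1, a rule like this would typically be created in that region, with an SNS topic as its target.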

Exam trap

The trap here is that candidates often choose AWS CloudTrail because it logs API calls, but they overlook that CloudTrail alone cannot send notifications—it requires an event-driven service like CloudWatch Events/EventBridge to trigger alerts.

How to eliminate wrong answers

Option A is wrong because AWS Config is used for evaluating resource configurations against desired policies (e.g., compliance rules), not for real-time event-driven notifications on API actions. Option B is wrong because AWS CloudTrail only logs API calls for auditing and does not natively send notifications; it requires an external service like CloudWatch Events to trigger alerts. Option C is wrong because AWS Trusted Advisor provides best-practice checks and recommendations (e.g., security, cost optimization), but it does not monitor or notify on specific IAM user actions like creating access keys.

Multi-Select (hard)

A company is using AWS CloudTrail to log API activity. The security team wants to be notified when an IAM user attempts to modify an S3 bucket policy. Which actions should be taken to meet this requirement? (Select THREE.)

Select 3 answers

A. Create a CloudWatch alarm on the number of PutBucketPolicy calls.

B. Enable CloudTrail data events for S3 to capture bucket policy changes.

C. Create an Amazon EventBridge rule that matches the PutBucketPolicy API call via CloudTrail.

D. Configure the EventBridge rule to send events to an SNS topic.

E. Ensure CloudTrail is logging management events for the S3 service.

Answers: C, D, E

EventBridge can filter CloudTrail events.

Why this answer

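Option C is correct because Amazon EventBridge can match specific API calls (like PutBucketPolicy) by using CloudTrail as an event source. This allows the security team to trigger a notification when an IAM user attempts to modify an S3 bucket policy, without needing to poll or set up custom monitoring. Option D is correct because the rule needs a target to deliver the alert, and an SNS topic is the simplest way to notify the team. Option E is correct because PutBucketPolicy is a bucket-level operation that CloudTrail records as a management event; S3 data events (Option B) cover object-level operations such as GetObject and PutObject, so they are not what captures policy changes.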

Exam trap

The trap here is that candidates may confuse CloudWatch alarms (which are metric-based) with EventBridge rules (which are event-driven), leading them to select Option A instead of understanding that EventBridge provides immediate, per-event notification for specific API calls.

MCQ (hard)

An organization has a CloudWatch dashboard that displays metrics for multiple AWS services. The dashboard is shared with the operations team. Recently, some team members reported that the dashboard is not loading for them. Which action should the SysOps administrator take to troubleshoot the issue?

A. Confirm that the team members have the necessary IAM permissions for cloudwatch:GetDashboard.

B. Verify that the team members have subscribed to the metric streams.

C. Ensure the CloudWatch agent is installed on the instances displaying the dashboard.

D. Check that the dashboard is in the same region as the resources.

Answer: A

Without GetDashboard permission, users cannot load the dashboard.

Why this answer

The most likely cause of the dashboard not loading is that the team members lack the required IAM permission to retrieve the dashboard definition. CloudWatch dashboards are stored as JSON objects, and the `cloudwatch:GetDashboard` action is necessary to fetch and render that data in the console. Without this permission, the request is denied and the console cannot display the dashboard.
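For reference, a read-only policy covering this case might look like the sketch below. The statement ID and the choice to bundle `cloudwatch:ListDashboards` and `cloudwatch:GetMetricData` alongside `cloudwatch:GetDashboard` are assumptions for illustration (listing dashboards and loading widget data generally need more than the dashboard definition alone); adjust to your own access model.

```python
import json

# Hypothetical read-only policy for viewing CloudWatch dashboards.
dashboard_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ViewDashboards",  # hypothetical statement ID
            "Effect": "Allow",
            "Action": [
                "cloudwatch:GetDashboard",    # fetch the dashboard definition
                "cloudwatch:ListDashboards",  # list dashboards in the console
                "cloudwatch:GetMetricData",   # load data for metric widgets
            ],
            "Resource": "*",
        }
    ],
}

print(json.dumps(dashboard_read_policy, indent=2))
```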

Exam trap

The trap here is that candidates often assume the issue is related to the CloudWatch agent or regional configuration, but the root cause is almost always an IAM permissions problem when a dashboard fails to load for users who previously had access.

How to eliminate wrong answers

Option B is wrong because metric streams continuously deliver CloudWatch metrics to a destination such as Amazon Kinesis Data Firehose; they are not something users subscribe to and are unrelated to viewing or loading a CloudWatch dashboard. Option C is wrong because the CloudWatch agent is installed on EC2 instances to collect custom metrics and logs, but it has no role in rendering or loading a dashboard in the AWS Management Console. Option D is wrong because CloudWatch dashboards can display metrics from multiple regions, and the dashboard itself is a global resource; the dashboard not loading is not caused by a region mismatch.

MCQ (easy)

A SysOps administrator is troubleshooting an application that intermittently fails to connect to an RDS database. The error logs show 'Too many connections'. What CloudWatch metric should the administrator monitor to proactively detect this issue?

A. CPUUtilization

B. DatabaseConnections

C. NetworkThroughput

D. FreeableMemory

Answer: B

Directly tracks the number of database connections.

Why this answer

The 'Too many connections' error indicates that the RDS database has reached its maximum allowed number of simultaneous client connections. The DatabaseConnections CloudWatch metric tracks the current number of connections to the DB instance, so monitoring this metric allows the administrator to set an alarm when connections approach the instance's max_connections limit, enabling proactive scaling or connection management before errors occur.

Exam trap

The trap here is that candidates may confuse performance metrics like CPU or memory with the specific connection limit error, overlooking that the 'Too many connections' error is directly tied to the DatabaseConnections metric and the max_connections configuration.

How to eliminate wrong answers

Option A (CPUUtilization) is wrong because high CPU usage does not directly cause connection limit errors; it may indicate query performance issues but not the specific 'Too many connections' error. Option C (NetworkThroughput) is wrong because network throughput measures data transfer volume, not the number of database connections, and a connection limit error is unrelated to bandwidth. Option D (FreeableMemory) is wrong because low freeable memory can affect performance but does not directly trigger a connection limit error; the error is explicitly tied to the connection count exceeding the configured max_connections parameter.

MCQ (medium)

A company has an application running on EC2 instances behind an Application Load Balancer. They want to receive an email notification when the average latency exceeds 2 seconds. Which combination of steps should the SysOps administrator take? (Select TWO.)

A. Enable AWS CloudTrail to capture API calls and monitor latency.

B. Use CloudWatch Logs to stream logs to Amazon ES for latency analysis.

C. Create a CloudWatch alarm on the EC2 instance's CPUUtilization metric.

D. Configure the alarm to send a notification to an Amazon SNS topic.

E. Create a CloudWatch alarm on the ALB's TargetResponseTime metric.

Answer: D, E

SNS can send email notifications.

Why this answer

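Option D is correct because Amazon CloudWatch alarms can be configured to send notifications to an Amazon SNS topic when a metric threshold is breached. By creating an SNS topic and subscribing an email endpoint to it, the SysOps administrator can receive email alerts when the alarm state changes, such as when the average latency exceeds 2 seconds. Option E is correct because TargetResponseTime is the ALB metric that measures, in seconds, the time from when a request leaves the load balancer until the target's response is received, so an alarm on its average with a threshold of 2 matches the requirement directly.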

Exam trap

The trap here is that candidates often confuse CloudTrail (audit logging) with CloudWatch (monitoring and alarming), or mistakenly think CPUUtilization is a proxy for latency, when the correct approach is to use the ALB's TargetResponseTime metric with a CloudWatch alarm and SNS notification.

How to eliminate wrong answers

Option A is wrong because AWS CloudTrail captures API calls for auditing and governance, not application-level latency metrics; it cannot monitor or alarm on latency. Option B is wrong because streaming CloudWatch Logs to Amazon ES provides log analysis and visualization, but it does not directly trigger email notifications for latency thresholds. Option C is wrong because the CPUUtilization metric on EC2 instances measures compute resource usage, not application latency; it is unrelated to the ALB's TargetResponseTime metric.

MCQ (easy)

A SysOps administrator needs to centrally collect operating system-level metrics from a fleet of Amazon EC2 instances running Amazon Linux 2. The metrics should include memory usage and disk I/O. Which solution should the administrator implement?

A. Install and configure the CloudWatch agent on the EC2 instances.

B. Enable detailed monitoring on the EC2 instances.

C. Use AWS CloudTrail to log OS-level metrics.

D. Use AWS Systems Manager Inventory to collect metrics.

Answer: A

The CloudWatch agent can collect custom metrics like memory and disk I/O.

Why this answer

The CloudWatch agent is the correct solution because it can collect custom OS-level metrics such as memory usage and disk I/O from EC2 instances running Amazon Linux 2. Unlike the default CloudWatch metrics, which only capture hypervisor-level metrics (e.g., CPU, network), the CloudWatch agent uses its mem and diskio plugins to gather these system metrics and publishes them to CloudWatch under a custom namespace (CWAgent by default).
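A minimal sketch of the metrics portion of an agent configuration that covers the two requirements, memory usage and disk I/O, is shown below. It uses the agent's `mem` and `diskio` sections; the particular measurements chosen are an illustrative subset, and the script only generates the JSON rather than installing anything.

```python
import json

# Sketch of a CloudWatch agent config: memory usage plus disk I/O counters.
agent_config = {
    "metrics": {
        "namespace": "CWAgent",  # the agent's default namespace
        "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
        "metrics_collected": {
            "mem": {"measurement": ["mem_used_percent"]},
            "diskio": {
                "resources": ["*"],  # all block devices
                "measurement": ["reads", "writes", "read_bytes", "write_bytes"],
            },
        },
    }
}

print(json.dumps(agent_config, indent=2))
```

For a fleet, the same JSON could be stored centrally (for example in Systems Manager Parameter Store) and pulled by each instance, which keeps the collection "central" as the question asks.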

Exam trap

The trap here is that candidates often confuse 'detailed monitoring' (which only increases frequency of existing hypervisor metrics) with the ability to collect new OS-level metrics, leading them to incorrectly select Option B.

How to eliminate wrong answers

Option B is wrong because enabling detailed monitoring on EC2 instances only increases the frequency of hypervisor-level metrics (e.g., CPU, network) from 5 minutes to 1 minute, but it does not collect OS-level metrics like memory usage or disk I/O. Option C is wrong because AWS CloudTrail is designed to log API calls and account activity, not OS-level metrics from EC2 instances. Option D is wrong because AWS Systems Manager Inventory collects software inventory and configuration data (e.g., installed applications, patches), not real-time performance metrics like memory or disk I/O.

MCQ (hard)

A company runs a critical web application on Amazon EC2 instances in an Auto Scaling group across three Availability Zones. The application uses an Application Load Balancer (ALB) for traffic distribution. The SysOps administrator has configured a CloudWatch alarm to monitor the ALB's `TargetResponseTime` metric, with a threshold of 5 seconds. The alarm triggers when the average response time exceeds 5 seconds for 2 consecutive periods. Recently, the alarm has been triggering frequently during peak hours, but the application team reports that the response time is acceptable and the application is performing normally. The administrator investigates and finds that a small number of requests are taking a very long time (over 30 seconds), skewing the average. The administrator needs to reduce the number of false alarms while still being alerted if the overall application performance degrades. Which course of action should the administrator take?

A. Change the statistic to p95 and keep the threshold at 5 seconds

B. Increase the threshold to 30 seconds

C. Decrease the period to 60 seconds and lower the threshold to 3 seconds

D. Increase the evaluation periods to 5 consecutive periods

Answer: A

p95 excludes the top 5% slowest requests, so it reflects the experience of the majority.

Why this answer

The correct answer is A because using the p95 (95th percentile) statistic instead of the average filters out the impact of the small number of outlier requests that take over 30 seconds. The p95 metric shows the response time below which 95% of requests fall, providing a more accurate representation of typical application performance. This reduces false alarms from skewed averages while still alerting if the majority of users experience degraded response times exceeding 5 seconds.
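The effect of outliers on the average versus p95 can be seen with a small worked example. The latency values below are made up for illustration (96 fast requests and 4 very slow ones), and the percentile uses the simple nearest-rank method, which may differ in detail from how CloudWatch computes percentiles.

```python
import math


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical sample: 96 requests at 1 s, 4 outliers at 120 s.
latencies = [1.0] * 96 + [120.0] * 4

average = sum(latencies) / len(latencies)
p95 = percentile_nearest_rank(latencies, 95)

print(f"average={average:.2f}s p95={p95:.2f}s")  # average=5.76s p95=1.00s
print("average breaches 5 s:", average > 5)      # True
print("p95 breaches 5 s:", p95 > 5)              # False
```

In the alarm definition, this corresponds to setting an extended statistic of `p95` in place of the `Average` statistic while leaving the 5-second threshold unchanged.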

Exam trap

The trap here is that candidates may think increasing the threshold or evaluation periods is the solution, but they fail to recognize that the average metric is inherently sensitive to outliers, and the correct fix is to change the statistic to a percentile like p95 or p99.

How to eliminate wrong answers

Option B is wrong because increasing the threshold to 30 seconds would mask genuine performance degradation for the majority of requests, as the alarm would only trigger when the average exceeds 30 seconds, which is far beyond acceptable performance. Option C is wrong because decreasing the period to 60 seconds and lowering the threshold to 3 seconds would make the alarm more sensitive, likely increasing false alarms due to short-term spikes or noise. Option D is wrong because increasing evaluation periods to 5 consecutive periods would delay the alarm response, potentially missing transient performance issues that affect users, and does not address the root cause of outliers skewing the average.

MCQ (hard)

Refer to the exhibit. A SysOps administrator reviews the CloudWatch alarm configuration. The alarm is in ALARM state. Which statement accurately describes the alarm's behavior?

A. The alarm evaluates CPU utilization every 5 minutes and requires 3 consecutive breaches to trigger.

B. The alarm will automatically resolve when CPU utilization drops below 80% for one period.

C. The alarm triggered because the average CPU utilization over 5 minutes exceeded 80% for one consecutive period.

D. The alarm sends a notification to the SNS topic every 5 minutes while in ALARM state.

Answer: C

Exactly correct based on the configuration.

Why this answer

Option C is correct because the alarm configuration shows 'Period: 5 minutes' and 'Statistic: Average' with 'Threshold: 80%' and 'Datapoints to alarm: 1 out of 1'. This means the alarm evaluates the average CPU utilization over a single 5-minute period, and if that average exceeds 80%, the alarm transitions to ALARM state immediately after one period's data point is available.
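The "M out of N" rule can be modeled in a few lines. This is a simplified sketch of the evaluation logic only: it assumes a greater-than comparison and ignores CloudWatch's missing-data handling, and the datapoint values are invented for illustration.

```python
def alarm_state(datapoints, threshold, datapoints_to_alarm, evaluation_periods):
    """Simplified 'M out of N' evaluation over the most recent N datapoints."""
    window = datapoints[-evaluation_periods:]
    if len(window) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# 5-minute average CPU values (hypothetical), evaluated as "1 out of 1".
print(alarm_state([70.0, 85.0], 80.0, 1, 1))  # ALARM: latest period exceeds 80
print(alarm_state([85.0, 70.0], 80.0, 1, 1))  # OK: latest period is below 80

# The same data under "3 out of 3" would not alarm on a single breach.
print(alarm_state([70.0, 75.0, 85.0], 80.0, 3, 3))  # OK
```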

Exam trap

The trap here is that candidates assume 'Datapoints to alarm' implies multiple consecutive breaches (like 3 out of 3) without reading the actual values, or they confuse the alarm's evaluation period with the notification frequency, leading them to pick Option A or D.

How to eliminate wrong answers

Option A is wrong because the alarm requires only 1 datapoint to alarm (not 3 consecutive breaches), as indicated by 'Datapoints to alarm: 1 out of 1'. Option B is the closest distractor: with 1 out of 1 datapoints, the alarm does return to OK once an evaluated 5-minute average is no longer above 80%, but that describes a possible future transition rather than the behavior shown in the exhibit, and a period with no data is handled by the alarm's missing-data setting rather than counted as recovery. Option D is wrong because the alarm sends a notification to the SNS topic only when the alarm state changes (e.g., from OK to ALARM or ALARM to OK), not every 5 minutes while in ALARM state; continuous notifications would require a custom solution like a Lambda function.

MCQ (medium)

A SysOps administrator notices that an Amazon RDS instance's CPU utilization is consistently above 90% during peak hours. The administrator needs to investigate which queries are consuming the most CPU. Which action should the administrator take?

A. Enable Performance Insights for the RDS instance and review the top SQL queries.

B. Use CloudWatch Logs Insights to query the database error log for slow queries.

C. Enable detailed CloudWatch metrics for the RDS instance and analyze the CPUUtilization metric.

D. Enable RDS Enhanced Monitoring and review the 'cpuCreditUsage' metric.

Answer: A

Performance Insights shows the top queries by CPU usage, enabling targeted optimization.

Why this answer

Performance Insights is the correct tool because it visualizes database load and breaks it down by dimensions such as SQL statement, wait event, host, and user. By enabling Performance Insights on the RDS instance, the administrator can directly view which queries are responsible for the high CPU utilization during peak hours, allowing targeted optimization.

Exam trap

The trap here is confusing aggregate metrics (CloudWatch CPUUtilization) or OS-level metrics (Enhanced Monitoring) with database-specific query performance analysis, leading candidates to choose options that show overall CPU usage but not the root-cause queries.

How to eliminate wrong answers

Option B is wrong because CloudWatch Logs Insights queries the database error log, which typically contains errors, warnings, and startup messages, not a real-time breakdown of query CPU consumption; slow query logs would need to be enabled separately and analyzed with a different tool. Option C is wrong because detailed CloudWatch metrics for CPUUtilization only show the aggregate CPU usage percentage, not which specific queries are causing the load. Option D is wrong because RDS Enhanced Monitoring provides OS-level metrics such as per-process CPU, memory, and disk activity; CPU credit usage is a CloudWatch metric for burstable instance classes, and neither identifies the top SQL queries consuming CPU.

MCQ (medium)

A SysOps administrator needs to ensure that an Amazon RDS instance automatically reboots if it becomes unavailable due to an operating system crash. The instance is a Multi-AZ deployment. What is the correct approach?

A. Use AWS Systems Manager Run Command to reboot the instance when a health check fails.

B. Create an Amazon EventBridge rule to trigger a Lambda function that reboots the instance.

C. Configure a CloudWatch Alarm on the DatabaseConnections metric to reboot the instance.

D. Enable Multi-AZ on the RDS instance to automatically failover to the standby.

Answer: D

Multi-AZ automatically handles failover without manual intervention.

Why this answer

Option D is correct because Multi-AZ deployments automatically handle failover to a standby replica in a different Availability Zone when the primary instance becomes unavailable due to an OS crash. This built-in mechanism ensures high availability without manual intervention, as the standby takes over with the same endpoint.

Exam trap

The trap here is that candidates may overcomplicate the solution by proposing custom automation (e.g., Lambda or Systems Manager) when the simplest and most robust answer is to leverage the native Multi-AZ failover feature, which is specifically designed for this scenario.

How to eliminate wrong answers

Option A is wrong because AWS Systems Manager Run Command is designed for ad-hoc or scheduled administrative tasks on EC2 instances, not for automating RDS instance reboots based on health checks; RDS is a managed service that does not support Run Command for rebooting. Option B is wrong because while an EventBridge rule can trigger a Lambda function, using a custom script to reboot an RDS instance is unnecessary and less reliable than the native Multi-AZ failover, which is the intended AWS solution for OS-level crashes. Option C is wrong because the DatabaseConnections metric measures active connections, not instance health or OS crashes; a low connection count does not indicate an OS crash, and rebooting based on this metric could cause unnecessary downtime.

MCQ (hard)

A company uses AWS CloudTrail to log API activity. A SysOps administrator discovers that some management events are not being logged. The administrator checks the CloudTrail configuration and confirms that management events are enabled and logging is working for most events. What is the most likely cause of the missing events?

A. The trail excludes specific management events based on read/write filtering

B. The trail is logging only data events for S3

C. The trail is configured to log events only for a single region, and the missing events occurred in a different region

D. The missing events are from unsupported services

Answer: C

A single-region trail records only events from its home region; the trail must apply to all regions to capture activity that occurs elsewhere.

Why this answer

Option C is correct because CloudTrail trails can be configured to log events for a single region or all regions. If the trail is set to log only one region, management events occurring in any other region will not be captured. Since the administrator confirmed management events are enabled and logging works for most events, the most likely cause is that the missing events originated from a region not covered by the trail.
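The region check can be expressed as a small helper. The sketch below assumes a trail description shaped like an entry returned by CloudTrail's DescribeTrails API (using its `IsMultiRegionTrail` and `HomeRegion` fields); the trail name and regions are hypothetical.

```python
def trail_covers_region(trail: dict, event_region: str) -> bool:
    """Return True if the trail would record events from event_region."""
    if trail.get("IsMultiRegionTrail"):
        return True
    return trail.get("HomeRegion") == event_region


# Hypothetical single-region trail, shaped like a DescribeTrails entry.
single_region_trail = {
    "Name": "audit-trail",
    "HomeRegion": "us-east-1",
    "IsMultiRegionTrail": False,
}

print(trail_covers_region(single_region_trail, "us-east-1"))  # True
print(trail_covers_region(single_region_trail, "eu-west-1"))  # False

# Converting the trail to multi-region closes the gap.
multi_region_trail = {**single_region_trail, "IsMultiRegionTrail": True}
print(trail_covers_region(multi_region_trail, "eu-west-1"))   # True
```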

Exam trap

The trap here is that candidates often overlook the region scope of CloudTrail and assume that enabling management events globally means all regions are covered, but a single-region trail only captures events from its designated region.

How to eliminate wrong answers

Option A is less likely: an event selector's read/write setting (read-only, write-only, or all) does apply to management events, but the question states that management events are enabled and gives no hint that only read calls or only write calls are missing. Option B is wrong because if the trail were logging only data events for S3, management events would not be logged at all, contradicting the statement that logging works for most events. Option D is wrong because CloudTrail records management events for nearly all AWS services, and the question indicates the missing events are from services that should be logged.

MCQ (medium)

A SysOps administrator manages an Amazon RDS for MySQL instance that handles a critical web application. During peak traffic, the number of database connections exceeds 500 for more than 15 minutes, leading to connection timeouts. The administrator wants to automatically increase the DB instance size when the connection count remains high, and decrease it when the load drops, to balance performance and cost. Which combination of AWS services should be used to achieve this automation with the least operational overhead?

A. Configure a CloudWatch alarm on DatabaseConnections that triggers an Amazon CloudWatch Events rule, which directly modifies the DB instance class using a CloudFormation custom resource.

B. Use an AWS Config rule to monitor DatabaseConnections and invoke an AWS Lambda function to scale the RDS instance when the threshold is breached.

C. Set up an Amazon CloudWatch alarm on the DatabaseConnections metric that triggers an AWS Lambda function to modify the DB instance class via the RDS API.

D. Use an AWS Systems Manager Automation runbook to periodically check the DatabaseConnections metric and adjust the RDS instance class if needed.

Answer: C

This is the correct approach. CloudWatch alarms can invoke Lambda actions. Lambda can use the RDS API (ModifyDBInstance) to change the instance class. This provides automated, event-driven scaling with minimal overhead.

Why this answer

Option C is correct because it uses a CloudWatch alarm to monitor the DatabaseConnections metric, which triggers an AWS Lambda function that directly calls the RDS ModifyDBInstance API to change the instance class. This approach provides the least operational overhead by leveraging native AWS services without additional infrastructure, custom resources, or periodic polling, and it enables real-time, event-driven scaling based on the specified threshold.
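The core of such a Lambda function might look like the following sketch. The instance-class ladder, the DB identifier, and the helper names are all hypothetical; the only AWS call involved is `modify_db_instance` (the boto3 method for ModifyDBInstance), and the client is passed in so the logic can be exercised here with a recording stub instead of a real RDS client.

```python
# Hypothetical ladder of instance classes, smallest to largest.
INSTANCE_CLASS_LADDER = ["db.m5.large", "db.m5.xlarge", "db.m5.2xlarge"]


def next_instance_class(current: str, scale_up: bool) -> str:
    """Pick the neighbouring class on the ladder, clamped at both ends."""
    index = INSTANCE_CLASS_LADDER.index(current)
    step = 1 if scale_up else -1
    index = min(max(index + step, 0), len(INSTANCE_CLASS_LADDER) - 1)
    return INSTANCE_CLASS_LADDER[index]


def resize(rds_client, db_instance_id: str, current: str, scale_up: bool) -> dict:
    """Build a ModifyDBInstance request and send it through rds_client."""
    request = {
        "DBInstanceIdentifier": db_instance_id,
        "DBInstanceClass": next_instance_class(current, scale_up),
        "ApplyImmediately": True,
    }
    rds_client.modify_db_instance(**request)
    return request


class RecordingClient:
    """Stand-in for boto3.client('rds') that only records calls (dry run)."""

    def __init__(self):
        self.calls = []

    def modify_db_instance(self, **kwargs):
        self.calls.append(kwargs)
        return {}


client = RecordingClient()
print(resize(client, "orders-db", "db.m5.large", scale_up=True))
print(resize(client, "orders-db", "db.m5.large", scale_up=False))
```

In a real function, one alarm (high connections) would drive `scale_up=True` and a second alarm (low connections) would drive the scale-down path. Changing the instance class interrupts the database briefly, so the thresholds would normally be set to avoid flapping.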

Exam trap

The trap here is that candidates often confuse AWS Config rules (designed for compliance) with CloudWatch alarms (designed for metric monitoring), leading them to choose Option B, or they overcomplicate the solution with CloudFormation custom resources (Option A) or Systems Manager runbooks (Option D) when a simple Lambda function triggered by a CloudWatch alarm is the most direct and low-overhead approach.

How to eliminate wrong answers

Option A is wrong because CloudFormation custom resources require a Lambda-backed provisioning function and are designed for infrastructure provisioning, not for real-time, event-driven scaling of an existing RDS instance; they introduce unnecessary complexity and latency. Option B is wrong because AWS Config rules are designed for compliance and resource configuration auditing, not for monitoring real-time CloudWatch metrics like DatabaseConnections, and they cannot directly invoke a Lambda function for metric-based scaling without additional setup. Option D is wrong because AWS Systems Manager Automation runbooks are intended for operational tasks and remediation workflows, but periodically checking metrics introduces polling overhead and latency, which is less efficient than event-driven triggers and increases operational complexity.

MCQmedium

A SysOps administrator notices that an RDS instance's CPU utilization is consistently above 80% during peak hours. The administrator wants to set up automated actions to scale the database and also notify the team. What should the administrator do?

A.Configure a scheduled scaling action to change the instance class during peak hours.

B.Add the RDS instance to an Auto Scaling group.

C.Create a CloudWatch alarm on CPU utilization that triggers a Lambda function to modify the RDS instance class to a larger size.

D.Enable RDS Auto Scaling for the instance.

AnswerC

This provides automated, alarm-driven scaling: the Lambda function modifies the instance class when CPU utilization stays high.

Why this answer

Option C is correct because it uses a CloudWatch alarm on CPU utilization to trigger a Lambda function, which can programmatically call the ModifyDBInstance API to scale the RDS instance class up during peak hours. This provides automated, event-driven scaling based on actual utilization, and the same alarm can be configured to send an SNS notification to the team. This approach is flexible and allows custom logic in Lambda, such as checking current metrics before scaling.

Exam trap

The trap here is that candidates often confuse RDS Auto Scaling (which only handles storage) with compute scaling, or they mistakenly think RDS can be added to an Auto Scaling group like EC2 instances, leading them to choose option B or D.

How to eliminate wrong answers

Option A is wrong because scheduled scaling actions are time-based and do not respond to real-time CPU utilization, so they cannot adapt to varying peak hour durations or unexpected spikes. Option B is wrong because RDS instances cannot be added to an Auto Scaling group; Auto Scaling groups are designed for EC2 instances, not managed database services. Option D is wrong because RDS Auto Scaling (for storage) only scales storage capacity automatically based on free space, not compute resources like CPU; it does not change the instance class to address high CPU utilization.

MCQhard

A SysOps administrator is managing an AWS environment with multiple VPCs connected via a transit gateway. The administrator needs to monitor network traffic between VPCs for security analysis. The administrator wants to capture metadata about IP traffic going through the transit gateway. The logs should be centralized in a single S3 bucket and retained for 90 days. Which solution should the administrator implement?

A.Enable VPC Flow Logs on each VPC and publish them to a central S3 bucket.

B.Use AWS Config to record traffic changes and store the configuration history in S3.

C.Enable flow logs on the transit gateway and publish them to a CloudWatch Logs log group, then export to S3 using a subscription filter.

D.Use Transit Gateway Network Manager to enable flow logs on the transit gateway attachments and publish them directly to an S3 bucket.

AnswerD

Transit Gateway Network Manager supports flow logs that can be delivered to S3.

Why this answer

Option D is correct because Transit Gateway Network Manager allows you to enable flow logs directly on transit gateway attachments, capturing IP traffic metadata (source/destination IP, ports, protocol, packet/byte counts) and publishing them to a centralized S3 bucket. This meets the requirement for centralized logging in a single S3 bucket with a 90-day retention period, as S3 lifecycle policies can be configured to expire objects after 90 days.
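
The 90-day retention part can be sketched as an S3 lifecycle configuration. The helper name and rule ID below are hypothetical; the dictionary shape follows what the S3 PutBucketLifecycleConfiguration API accepts, and an empty prefix is assumed so the rule covers every object in the bucket.

```python
import json


def expiration_rule(days: int, prefix: str = "") -> dict:
    """Build a lifecycle configuration that expires objects after `days` days."""
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix = whole bucket
                "Expiration": {"Days": days},
            }
        ]
    }


lifecycle = expiration_rule(90)
print(json.dumps(lifecycle, indent=2))
# With boto3 this could be applied roughly as:
#   s3.put_bucket_lifecycle_configuration(
#       Bucket="<log-bucket>", LifecycleConfiguration=lifecycle)
```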

Exam trap

The trap here is that candidates assume VPC Flow Logs (Option A) are sufficient for monitoring inter-VPC traffic, but they fail to recognize that VPC Flow Logs only capture traffic at the VPC, subnet, or network interface level and do not see traffic as it is routed through a transit gateway unless flow logs are specifically enabled on the transit gateway or its attachments.

How to eliminate wrong answers

Option A is wrong because VPC Flow Logs capture traffic at the VPC, subnet, or network interface level, not traffic traversing the transit gateway between VPCs; they would not give a transit-gateway view of inter-VPC traffic. Option B is wrong because AWS Config records configuration changes to AWS resources, not network traffic metadata; it is used for compliance and auditing of resource configurations, not IP traffic analysis. Option C is wrong because transit gateway flow logs can be published directly to CloudWatch Logs or S3, so exporting from CloudWatch Logs to S3 through a subscription filter adds unnecessary complexity and cost; transit gateway flow logs natively support direct delivery to S3 without needing CloudWatch Logs.

MCQeasy

A company uses Amazon CloudWatch to monitor its AWS resources. The operations team needs to receive email notifications when the root user performs any action in the AWS account. Which combination of services should the SysOps administrator use to meet this requirement?

A.Amazon CloudWatch Logs and Amazon Simple Notification Service (SNS).

B.AWS CloudTrail, Amazon CloudWatch Logs metric filter, and Amazon SNS.

C.AWS Trusted Advisor and Amazon Simple Email Service (SES).

D.AWS Config, Amazon CloudWatch Events, and Amazon SNS.

AnswerB

CloudTrail delivers logs to CloudWatch Logs, where a metric filter can detect root user events and trigger an alarm to SNS.

Why this answer

AWS CloudTrail logs API activity, including root user actions. By sending these logs to CloudWatch Logs, you can create a metric filter that matches root user events (e.g., `{ $.userIdentity.type = "Root" }`). A CloudWatch alarm on the resulting metric publishes a notification to an SNS topic, which sends an email to subscribers.

This combination provides near-real-time notification of root user actions.
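
As a rough illustration of what such a metric filter matches, the sketch below pairs a commonly used filter pattern for root activity with a Python predicate that mirrors it. The function name and sample events are hypothetical; the extra `invokedBy` and `eventType` conditions are an assumption borrowed from widely published security baselines, added to skip actions AWS services perform on the root user's behalf.

```python
# CloudWatch Logs filter pattern (JSON syntax) for root user activity.
ROOT_FILTER_PATTERN = (
    '{ $.userIdentity.type = "Root" '
    "&& $.userIdentity.invokedBy NOT EXISTS "
    '&& $.eventType != "AwsServiceEvent" }'
)


def is_root_action(event: dict) -> bool:
    """Mirror the filter pattern above against one CloudTrail record."""
    identity = event.get("userIdentity", {})
    return (
        identity.get("type") == "Root"
        and "invokedBy" not in identity
        and event.get("eventType") != "AwsServiceEvent"
    )


# Hypothetical, heavily trimmed CloudTrail records.
root_login = {"userIdentity": {"type": "Root"}, "eventType": "AwsConsoleSignIn"}
iam_user_call = {"userIdentity": {"type": "IAMUser"}, "eventType": "AwsApiCall"}
print(is_root_action(root_login), is_root_action(iam_user_call))
```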

Exam trap

The trap here is that candidates often confuse AWS Config (resource configuration tracking) with CloudTrail (API activity logging), or assume CloudWatch Logs alone can filter events without a metric filter and CloudTrail integration.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs alone cannot filter for root user actions; it requires a metric filter to detect specific log events, and without CloudTrail, there are no logs of root user API calls. Option C is wrong because AWS Trusted Advisor provides best-practice checks (e.g., cost optimization, security) but does not log or monitor real-time API actions like root user activity. Option D is wrong because AWS Config tracks resource configuration changes, not API actions; CloudWatch Events (now Amazon EventBridge) can trigger on API calls via CloudTrail, but without CloudTrail integration, Config alone cannot capture root user actions.

MCQmedium

A company is experiencing intermittent performance issues with an application running on an EC2 instance. The CloudWatch metrics show high CPU utilization but no correlation with the timing of the issue. The SysOps administrator needs to collect detailed performance data to identify the root cause. Which AWS service should the administrator use to capture network-level metrics and logs?

A.Configure a CloudWatch Logs agent on the instance to send application logs.

B.Enable VPC Flow Logs for the EC2 instance's subnet.

C.Use AWS CloudTrail to log all API calls made to the instance.

D.Enable AWS Config to track configuration changes to the instance.

AnswerB

VPC Flow Logs capture network traffic metadata, which can help identify network bottlenecks or anomalies.

Why this answer

VPC Flow Logs capture IP traffic metadata (source/destination IP, ports, protocol, packet and byte counts, accept/reject action) at the network interface level, which is useful for diagnosing network-related performance issues. Since the problem is intermittent and uncorrelated with CPU, flow records can reveal patterns such as rejected connections or unexpected traffic spikes that application logs or CPU metrics alone cannot. This directly addresses the need for detailed network-level data.

Exam trap

The trap here is that candidates confuse VPC Flow Logs (network traffic metadata) with CloudTrail (API activity) or CloudWatch Logs (application logs), assuming any 'log' service captures network-level data, but only VPC Flow Logs provide IP traffic flow records at the network interface level.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs agent sends application logs, not network-level metrics or logs; it cannot capture IP traffic metadata or network performance data. Option C is wrong because AWS CloudTrail logs API calls to the instance (e.g., StartInstances, DescribeInstances), not network traffic flowing through the instance's ENI; it provides no insight into packet-level performance. Option D is wrong because AWS Config tracks configuration changes (e.g., security group rules, instance type) but does not capture real-time network traffic or performance metrics.

MCQhard

A SysOps administrator manages Amazon EC2 instances in multiple AWS accounts. The administrator needs to collect and analyze network traffic logs to identify the top IP addresses generating the most traffic to the instances. The administrator must centralize this analysis in a single monitoring account that has cross-account access to the logs. Which combination of AWS services should the administrator use?

A.Enable VPC Flow Logs in each account, publish them to an Amazon S3 bucket, and use Amazon Athena to query for top IPs.

B.Use AWS Config rules across accounts to aggregate network traffic data and generate a report in Amazon QuickSight.

C.Set up CloudWatch Contributor Insights rules in the central monitoring account, with cross-account log ingestion from each account's VPC Flow Logs published to CloudWatch Logs.

D.Use Amazon CloudWatch Logs Insights with a saved query in the central account and schedule it to run every hour using EventBridge.

AnswerC

CloudWatch Contributor Insights can analyze log data from CloudWatch Logs across accounts using cross-account observability. It continuously identifies top contributors like source IP addresses, providing a dashboard without manual queries.

Why this answer

Option C is correct because CloudWatch Contributor Insights can analyze VPC Flow Logs to identify the top IP addresses generating traffic, and it supports cross-account log ingestion by subscribing to CloudWatch Logs from multiple accounts. This allows centralized analysis in the monitoring account without needing to copy logs to S3 or run complex queries manually.
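
To make the "top contributors" idea concrete, here is a small offline sketch of the aggregation a Contributor Insights rule performs on flow log records. It is not the service itself: the function name and sample records are made up, and the field order assumes the default (version 2) VPC Flow Logs format.

```python
from collections import Counter

# Field order of the default VPC Flow Logs record format (version 2).
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def top_sources(lines, n=3):
    """Return the n source IPs with the most bytes, largest first."""
    totals = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) != len(FIELDS):
            continue  # skip malformed or custom-format lines
        record = dict(zip(FIELDS, parts))
        if record["log_status"] != "OK":
            continue  # NODATA / SKIPDATA records carry "-" placeholders
        totals[record["srcaddr"]] += int(record["bytes"])
    return totals.most_common(n)


sample = [
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 49152 443 6 10 8400 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.7 10.0.2.9 49153 443 6 4 1200 1700000000 1700000060 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 49154 443 6 2 600 1700000060 1700000120 ACCEPT OK",
    "2 123456789012 eni-0a1b2c3d - - - - - - - 1700000060 1700000120 - NODATA",
]
print(top_sources(sample))
```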

Exam trap

The trap here is that candidates often assume S3 with Athena is the simplest centralized logging solution, but they overlook that Contributor Insights provides built-in top-N analysis and native cross-account log ingestion, which is more efficient for this specific use case than manual querying.

How to eliminate wrong answers

Option A is wrong because while VPC Flow Logs can be published to S3 and queried with Athena, this approach does not natively support cross-account centralized analysis without additional setup like S3 bucket policies and replication, and it lacks the built-in top-N contributor analysis that Contributor Insights provides. Option B is wrong because AWS Config rules are designed for resource compliance and configuration tracking, not for analyzing network traffic logs or identifying top IP addresses. Option D is wrong because CloudWatch Logs Insights requires logs to be in the same account and region to query; it does not natively support cross-account log ingestion without additional infrastructure like cross-account subscriptions or log replication.

MCQeasy

A company wants to receive a real-time notification whenever an IAM user creates a new access key. Which combination of AWS services should be used to achieve this?

A.Amazon GuardDuty and Amazon SQS

B.AWS CloudTrail and Amazon EventBridge

C.Amazon CloudWatch Logs and AWS Lambda

D.AWS Config and Amazon SNS

AnswerB

CloudTrail records API calls; EventBridge filters events and triggers notifications.

Why this answer

AWS CloudTrail captures IAM API calls, including CreateAccessKey, as management events. Amazon EventBridge can be configured with a rule that matches this specific API call pattern and triggers a real-time notification (e.g., via SNS or Lambda). This combination provides the exact event-driven monitoring required without polling or custom code.
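
An event pattern for this rule could look like the sketch below. The pattern keys follow the structure of CloudTrail-sourced events in EventBridge; the `matches` helper is a deliberately simplified stand-in for EventBridge matching (exact values only) and the sample events are hypothetical.

```python
import json

# Event pattern for IAM CreateAccessKey calls recorded by CloudTrail.
ACCESS_KEY_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateAccessKey"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must exist and hold a listed value."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


created = {
    "source": "aws.iam",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "iam.amazonaws.com", "eventName": "CreateAccessKey"},
}
deleted = {**created, "detail": {**created["detail"], "eventName": "DeleteAccessKey"}}
print(json.dumps(ACCESS_KEY_PATTERN))
print(matches(ACCESS_KEY_PATTERN, created), matches(ACCESS_KEY_PATTERN, deleted))
```

Because IAM is a global service, its CloudTrail events are generally delivered in the US East (N. Virginia) Region, so the rule would normally be created there.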

Exam trap

The trap here is that candidates confuse AWS Config (which tracks resource state changes) with CloudTrail (which tracks API calls), leading them to pick Option D, but Config does not provide real-time, event-driven notifications for individual API actions like CreateAccessKey.

How to eliminate wrong answers

Option A is wrong because Amazon GuardDuty is a threat detection service that analyzes VPC flow logs, DNS logs, and CloudTrail management events for malicious activity, but it does not provide a mechanism to trigger real-time notifications for specific IAM actions like creating access keys; SQS alone cannot filter or route events. Option C is wrong because Amazon CloudWatch Logs can store CloudTrail logs, but it does not natively support real-time event pattern matching for specific API calls; using Lambda to poll logs introduces latency and complexity, whereas EventBridge provides immediate, pattern-based routing. Option D is wrong because AWS Config is a resource compliance and configuration tracking service that records resource state changes (e.g., access key creation as a resource change), but it does not generate real-time notifications for API-level events; it evaluates rules on a periodic or configuration-change basis, not for every CreateAccessKey call.

MCQhard

A SysOps admin is troubleshooting an Auto Scaling group that fails to launch instances. The group uses a launch template with an Amazon Linux 2 AMI. The admin reviews the scaling activity history and sees: 'Launching a new EC2 instance. Status: Failed. Description: Your spot request price is lower than the minimum required Spot price.' Which change should the admin make to resolve the issue?

A.Increase the maximum price for the Spot request in the launch template

B.Modify the Auto Scaling group to use On-Demand instances instead of Spot

C.Change the Auto Scaling group to a different AWS Region

D.Increase the desired capacity of the Auto Scaling group

AnswerA

Raising the max price allows the request to meet the current Spot market price.

Why this answer

The error message indicates that the Spot Instance request failed because the maximum price specified in the launch template is below the current Spot market price. By increasing the maximum price in the launch template (Option A), you allow the Spot request to meet or exceed the minimum required Spot price, enabling the Auto Scaling group to successfully launch instances. This is the direct fix for the price-related failure.

Exam trap

The trap here is that candidates may think the error is about insufficient capacity or regional issues, rather than recognizing it as a direct price mismatch that requires raising the maximum price in the launch template.

How to eliminate wrong answers

Option B is wrong because switching to On-Demand instances would avoid Spot pricing issues entirely, but it is not the minimal change required; the question asks for the change to resolve the specific Spot price error, and increasing the maximum price is the targeted fix. Option C is wrong because changing the AWS Region does not address the Spot price constraint; the error is about the price in the current Region, not regional availability. Option D is wrong because increasing the desired capacity does not affect the Spot request price; it would only attempt to launch more instances, which would still fail with the same price error.

Multi-Selectmedium

A SysOps administrator is troubleshooting an Amazon EC2 instance that is unreachable. The instance passes the system status check but fails the instance status check. Which TWO of the following are likely causes of this issue? (Choose TWO.)

Select 2 answers

A.Network connectivity issues

B.Detached EBS root volume

C.Misconfigured firewall or iptables

D.Insufficient memory for applications

E.Corrupted file system

AnswersC, E

A misconfigured firewall can block all traffic, making the instance unreachable.

Why this answer

An instance status check failure indicates that the operating system or the instance itself is not functioning correctly, even though the underlying hardware (system status check) is healthy. A misconfigured firewall or iptables can block required network traffic, causing the instance to appear unreachable, while a corrupted file system can prevent the OS from booting or operating properly, both of which are detected by the instance status check.

Exam trap

The trap here is that candidates often confuse instance status checks with system status checks, incorrectly attributing network-level issues (like detached volumes or external connectivity) to instance status failures when they actually belong to system status failures.

100

MCQeasy

Refer to the exhibit. A SysOps administrator runs the 'list-metrics' command for CPUUtilization. Based on the output, what can the administrator conclude?

A.The CPUUtilization metric is only available for EC2 instances.

B.There are two EC2 instances that have reported CPUUtilization metrics at some point.

C.Both instances are actively publishing CPUUtilization metrics.

D.An alarm has been set on both instances for CPUUtilization.

AnswerB

list-metrics returns metrics that exist, regardless of current publishing.

Why this answer

The 'list-metrics' output shows two distinct dimensions (i-12345678 and i-87654321) for the CPUUtilization metric, indicating that two EC2 instances have reported this metric at some point. Option B is correct because the presence of two unique instance IDs in the metric data confirms that CPUUtilization has been recorded for both instances, regardless of whether they are currently active or have alarms configured.
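
The exhibit itself is not reproduced here, so the response below is a hypothetical reconstruction that reuses the two instance IDs from the explanation. The sketch shows how the distinct `InstanceId` dimension values could be pulled out of a `list-metrics`-shaped response; the helper name is illustrative.

```python
def instance_ids(response: dict) -> list:
    """Collect the distinct InstanceId dimension values from a ListMetrics response."""
    ids = set()
    for metric in response.get("Metrics", []):
        for dimension in metric.get("Dimensions", []):
            if dimension["Name"] == "InstanceId":
                ids.add(dimension["Value"])
    return sorted(ids)


# Hypothetical output of: aws cloudwatch list-metrics --metric-name CPUUtilization
response = {
    "Metrics": [
        {
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": "i-12345678"}],
        },
        {
            "Namespace": "AWS/EC2",
            "MetricName": "CPUUtilization",
            "Dimensions": [{"Name": "InstanceId", "Value": "i-87654321"}],
        },
    ]
}
print(instance_ids(response))
```

Nothing in this structure carries datapoints or alarm state, which is why the output alone cannot support options C or D.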

Exam trap

The trap here is that candidates assume 'list-metrics' shows only currently active resources or that it implies alarm configurations, when in fact it only reflects historical metric reporting and has no relation to current state or alarms.

How to eliminate wrong answers

Option A is wrong because CPUUtilization is not exclusive to EC2 instances; other namespaces such as AWS/RDS and AWS/ECS also publish a metric with this name, and the output of a single query cannot establish exclusivity in any case. Option C is wrong because the output only shows that metrics have been reported at some point; it does not indicate whether the instances are currently publishing metrics (e.g., they could be stopped or terminated). Option D is wrong because the 'list-metrics' command returns metric metadata, not alarm configurations; alarms are managed separately via the 'describe-alarms' API and are not visible in this output.

101

MCQeasy

A SysOps administrator needs to monitor the CPU utilization of an EC2 instance and receive an alert when it exceeds 80% for 10 consecutive minutes. Which AWS service should be used to set up this monitoring and alerting?

A.Amazon CloudWatch

B.AWS Trusted Advisor

C.AWS Config

D.AWS CloudTrail

AnswerA

CloudWatch monitors metrics and can trigger alarms.

Why this answer

Amazon CloudWatch is the correct service because it collects EC2 instance metrics such as CPU utilization and supports alarms that fire when a metric crosses a defined threshold (e.g., 80%) for a specified number of consecutive evaluation periods (e.g., two 5-minute periods with basic monitoring, or ten 1-minute periods with detailed monitoring enabled). This directly meets the requirement for monitoring and alerting on CPU utilization.
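
The "threshold breached for N consecutive periods" rule can be approximated in a few lines. This is a simplified model, not CloudWatch's actual evaluation logic: it ignores missing-data handling and "M out of N" settings, and the function name and sample values are illustrative.

```python
def alarm_state(datapoints, threshold=80.0, evaluation_periods=2):
    """Return ALARM only if the most recent periods all exceed the threshold."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Average CPU per 5-minute period, oldest first (hypothetical values).
print(alarm_state([70.0, 85.0, 90.0]))  # last two periods both above 80
print(alarm_state([85.0, 70.0, 90.0]))  # one of the last two periods is below 80
```

With 5-minute periods, two evaluation periods cover the 10 minutes in the question; a single spike does not trigger the alarm.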

Exam trap

The trap here is that candidates often confuse AWS CloudTrail (auditing API calls) or AWS Config (configuration compliance) with CloudWatch, thinking they can monitor performance metrics, but only CloudWatch provides metric collection and alarm-based alerting for EC2 CPU utilization.

How to eliminate wrong answers

Option B (AWS Trusted Advisor) is wrong because it provides best-practice recommendations for cost optimization, performance, security, and fault tolerance, but it does not monitor real-time EC2 CPU utilization or trigger alerts based on metric thresholds. Option C (AWS Config) is wrong because it evaluates and records configuration changes to AWS resources (e.g., security group rules, instance types) and can trigger rules-based remediation, but it does not monitor performance metrics like CPU utilization. Option D (AWS CloudTrail) is wrong because it records API activity and user actions for auditing and governance, not real-time performance monitoring or metric-based alerting.

102

MCQeasy

A company needs to monitor for unauthorized changes to critical IAM policies. The SysOps administrator must receive notifications within minutes of any change. Which combination of AWS services should the administrator use?

A.Use AWS CloudTrail to log IAM changes, and create a CloudWatch Events rule that triggers an SNS notification when specific API calls are made.

B.Use AWS Config rules to detect changes and send notifications via SNS.

C.Use CloudWatch Logs to monitor IAM activity and create a metric filter to trigger an alarm.

D.Use a Lambda function that periodically checks IAM policies and sends an SNS message if changes are detected.

AnswerA

CloudTrail logs API calls, and CloudWatch Events provides near real-time event streaming to trigger SNS.

Why this answer

Option A is correct because CloudTrail logs API calls to IAM, and a CloudWatch Events (now Amazon EventBridge) rule can trigger an SNS notification when specific API calls are made. Option B is incorrect because AWS Config evaluates resource configurations and is not the best fit for near-real-time notification of individual API calls. Option C is incorrect because CloudWatch Logs does not capture IAM API calls on its own; CloudTrail would have to deliver them first.

Option D is incorrect because a Lambda function that polls IAM policies periodically adds delay and custom code compared with an event-driven rule.

103

MCQeasy

A SysOps administrator needs to monitor the CPU utilization of an Amazon RDS for MySQL DB instance. The administrator wants to receive a notification when the average CPU utilization exceeds 80% for 10 consecutive minutes. Which steps should the administrator take to set up this monitoring?

A.Use CloudWatch Logs to monitor the database logs and create an alarm based on log patterns.

B.Enable Enhanced Monitoring and create an alarm on the 'CPUUtilization' metric in RDS console.

C.Create a CloudWatch alarm on the 'CPUUtilization' metric with a threshold of 80% and an SNS topic for notifications.

D.Enable CloudTrail and create a metric filter for CPU utilization.

AnswerC

This is the standard method for monitoring RDS CPU utilization.

Why this answer

Option C is correct because Amazon RDS automatically publishes the 'CPUUtilization' metric to CloudWatch, and a CloudWatch alarm can be configured with a threshold of 80% for the 'Average' statistic over a period of 10 consecutive minutes (e.g., 10 evaluation periods of 1 minute each). The alarm can then trigger an SNS topic to send notifications when the threshold is breached. This directly meets the requirement without additional services.

Exam trap

The trap here is that candidates confuse Enhanced Monitoring (which provides OS-level metrics like memory and disk I/O) with the standard CloudWatch metrics, leading them to incorrectly think Enhanced Monitoring is required for CPU utilization alarms.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs monitors database logs (e.g., error logs, slow query logs) for patterns, not CPU utilization metrics; CPU utilization is a numeric metric, not a log pattern. Option B is wrong because Enhanced Monitoring provides OS-level metrics (e.g., 'cpuUtilization' in the RDS console) but is not required for the basic 'CPUUtilization' metric already available in CloudWatch; creating an alarm on that metric does not require Enhanced Monitoring. Option D is wrong because CloudTrail records API calls (e.g., RDS instance modifications), not performance metrics; metric filters are a CloudWatch Logs feature, and CloudTrail events contain no CPU utilization data to filter on.

104

MCQhard

A SysOps team needs to monitor application logs in Amazon CloudWatch Logs for specific error codes and automatically invoke an AWS Lambda function for remediation within 5 minutes of an error occurring. Which solution involves the least operational overhead?

A.Create a CloudWatch Logs subscription filter to stream logs directly to an AWS Lambda function.

B.Create a CloudWatch metric filter on the log group, create a CloudWatch alarm on the metric, and configure the alarm to post to an SNS topic that triggers the Lambda function.

C.Use a third-party log aggregation tool that sends webhook notifications to an API Gateway endpoint to invoke the Lambda function.

D.Write a custom script that runs on an EC2 instance to poll CloudWatch Logs every minute and invoke the Lambda function.

AnswerB

Correct. This uses native CloudWatch features with minimal overhead, meeting the 5-minute requirement through alarm evaluation intervals.

Why this answer

Option B is correct because it uses CloudWatch metric filters and alarms to detect error codes in logs and trigger remediation via SNS and Lambda, all within a fully managed AWS pipeline. This approach requires no custom code or infrastructure to maintain, and the alarm can be configured to evaluate logs within a 1-minute period, easily meeting the 5-minute requirement with minimal operational overhead.
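
Conceptually, the metric filter turns matching log lines into a per-minute count that the alarm then evaluates. The sketch below imitates that step offline; plain substring matching stands in for real filter pattern syntax, and the error codes, timestamps, and function name are made up for illustration.

```python
from collections import Counter


def errors_per_minute(events, error_codes):
    """Count log events containing any error code, bucketed by minute."""
    counts = Counter()
    for timestamp_ms, message in events:
        if any(code in message for code in error_codes):
            counts[timestamp_ms // 60_000] += 1  # minute index since epoch
    return dict(counts)


# (timestamp in milliseconds, log message) pairs, hypothetical.
events = [
    (0, "ERROR E1001 payment service unavailable"),
    (30_000, "INFO request completed"),
    (61_000, "ERROR E2002 database timeout"),
    (62_000, "ERROR E1001 payment service unavailable"),
]
print(errors_per_minute(events, ["E1001", "E2002"]))
```

An alarm with a 1-minute period and a threshold of one or more matches would then go into ALARM in the first minute that contains an error.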

Exam trap

The trap here is that candidates often assume a subscription filter (Option A) is the simplest because it directly streams logs to Lambda, but they overlook that it lacks native filtering for specific error codes and requires the Lambda to process all log events, increasing complexity and cost compared to a metric filter and alarm.

How to eliminate wrong answers

Option A is wrong because CloudWatch Logs subscription filters stream logs in near real-time but do not provide a built-in mechanism to filter for specific error codes before invoking Lambda; the Lambda function would have to parse every log event, increasing cost and complexity, and there is no native alarm or retry logic for missed events. Option C is wrong because introducing a third-party log aggregation tool and an API Gateway endpoint adds significant operational overhead for setup, maintenance, and cost, and it violates the 'least operational overhead' requirement. Option D is wrong because writing a custom script on an EC2 instance to poll CloudWatch Logs every minute introduces unnecessary compute resources, potential single points of failure, and ongoing maintenance overhead, which is far from the least operational overhead solution.

105

MCQmedium

An application running on Amazon EC2 instances behind an Application Load Balancer (ALB) is experiencing intermittent 5xx errors. CloudWatch metrics show that the ALB's 'HTTPCode_ELB_5XX_Count' is elevated. What is the MOST likely cause?

A.The target instances are returning HTTP 503 errors.

B.The target instances have high latency but are still responding.

C.The load balancer is timing out waiting for a response from the target.

D.Client requests are malformed and being rejected by the load balancer.

AnswerC

ELB generates 5xx errors when it cannot establish a connection or the target fails to respond within the idle timeout.

Why this answer

When the ALB's 'HTTPCode_ELB_5XX_Count' is elevated, it indicates that the load balancer itself is generating the 5xx error, not the target. The most common cause is that the load balancer is timing out while waiting for a response from the target instances, which occurs when the target takes longer than the configured idle timeout (default 60 seconds) to respond. This results in the ALB returning a 504 Gateway Timeout error, which is counted in the ELB 5xx metric.

Exam trap

The trap here is that candidates confuse 'HTTPCode_ELB_5XX_Count' (errors generated by the load balancer) with 'HTTPCode_Target_5XX_Count' (errors generated by the target), leading them to incorrectly assume the target is returning 5xx errors when the actual issue is a load balancer timeout.

How to eliminate wrong answers

Option A is wrong because if target instances return HTTP 503 errors, those would be counted in the target group's 'HTTPCode_Target_5XX_Count' metric, not the ALB's 'HTTPCode_ELB_5XX_Count' — the ALB forwards the target's 503 response to the client without generating its own 5xx. Option B is wrong because high latency alone does not cause ELB 5xx errors unless the latency exceeds the idle timeout; if the target eventually responds, the ALB will forward the response successfully. Option D is wrong because malformed client requests are rejected by the ALB with a 400 Bad Request error, which is a 4xx error, not a 5xx error, and would be reflected in the 'HTTPCode_ELB_4XX_Count' metric.

106

MCQeasy

A SysOps administrator needs to monitor the memory utilization of an EC2 instance running a custom application. The instance is not using the default CloudWatch metrics for memory. What should the administrator do to collect memory metrics?

A.Enable detailed monitoring on the EC2 instance

B.Use AWS Trusted Advisor to check memory utilization

C.Use Amazon Inspector to monitor memory

D.Install and configure the CloudWatch agent on the instance

AnswerD

The CloudWatch agent can collect memory metrics and send them to CloudWatch.

Why this answer

The default CloudWatch metrics for EC2 include CPU, disk, and network utilization, but not memory utilization. To collect custom metrics like memory usage, you must install and configure the CloudWatch agent on the instance. The agent collects memory and disk metrics from the OS and sends them to CloudWatch as custom metrics.
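
A minimal agent configuration for memory collection might be generated like this. The helper name and the 60-second interval are assumptions; `mem_used_percent` is the Linux measurement name, and Windows instances use different counter names.

```python
import json


def memory_agent_config(interval_seconds: int = 60) -> dict:
    """Build a minimal CloudWatch agent config that reports memory usage."""
    return {
        "metrics": {
            # Tag each datapoint with the instance that produced it.
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
        }
    }


config = memory_agent_config()
print(json.dumps(config, indent=2))
```

The resulting JSON would be saved as the agent's configuration file on the instance, and the instance role needs permission to publish metrics to CloudWatch.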

Exam trap

The trap here is that candidates assume 'detailed monitoring' or other AWS services like Trusted Advisor or Inspector can capture OS-level metrics, but only the CloudWatch agent can collect memory and disk metrics from inside the instance.

How to eliminate wrong answers

Option A is wrong because enabling detailed monitoring increases the frequency of default metrics (e.g., CPU, disk I/O) from 5 minutes to 1 minute, but it does not add memory metrics. Option B is wrong because AWS Trusted Advisor checks for best practices (e.g., idle instances, security groups) and does not monitor memory utilization. Option C is wrong because Amazon Inspector is a vulnerability assessment service that scans for software vulnerabilities and network exposure, not for OS-level memory metrics.

107

Multi-Selecthard

A company uses CloudWatch Logs to collect application logs from EC2 instances. The logs are critical for troubleshooting. The operations team notices that some log entries are missing during peak hours. The CloudWatch Logs agent is configured with a batch size of 1 MB and a batch timeout of 10 seconds. Which TWO actions should the administrator take to reduce the chance of missing log events?

Select 2 answers

A.Use PutLogEvents directly from the application.

B.Reduce the retry count for failed requests.

C.Increase the batch size to 5 MB.

D.Increase the batch timeout to 30 seconds.

E.Enable log compression in the agent configuration.

AnswersC, E

Larger batches reduce the number of API calls, lowering the chance of hitting rate limits.

Why this answer

Option C is correct because increasing the batch size from 1 MB to 5 MB allows the CloudWatch Logs agent to buffer more log data before sending a request. During peak hours, log volume spikes can cause the agent to drop events if the batch fills up faster than it can be transmitted. A larger batch size reduces the frequency of API calls and helps ensure that all log entries are successfully delivered.

Exam trap

The trap here is that candidates often think increasing the batch timeout (Option D) will help, but in reality, a longer timeout increases the risk of buffer overflow during peak traffic, whereas increasing batch size and enabling compression directly address throughput and payload limits.

108

MCQhard

The security team requires that no S3 bucket in the account ever has public read or write ACLs enabled. They want non-compliant buckets automatically remediated within 5 minutes of detection without any manual intervention. What is the correct implementation?

A.Create an AWS Config rule for s3-bucket-public-read-prohibited; configure auto-remediation using the AWS-DisableS3BucketPublicReadWrite SSM Automation document

B.Create an EventBridge rule that matches S3 PutBucketAcl API calls and triggers a Lambda function to re-apply a private ACL

C.Enable S3 Block Public Access at the account level to prevent public ACLs from being set in the first place

D.Schedule a daily Lambda function that lists all buckets, checks ACLs, and removes public grants if found

AnswerA

Config evaluates the rule within seconds of a bucket ACL change. The auto-remediation action invokes the SSM document automatically when compliance status changes to NON_COMPLIANT. The SSM document calls PutBucketAcl to remove public grants. The entire cycle completes in 1-3 minutes under normal conditions.

Why this answer

Option A is correct because AWS Config can evaluate S3 bucket ACLs against the `s3-bucket-public-read-prohibited` managed rule and automatically trigger an AWS Systems Manager (SSM) Automation document (`AWS-DisableS3BucketPublicReadWrite`) as a remediation action. Because that rule covers only public read access, the companion managed rule `s3-bucket-public-write-prohibited` can be given the same remediation to cover write ACLs. This ensures non-compliant buckets are fixed within minutes without manual intervention, meeting the 5-minute requirement.

Exam trap

The trap here is that candidates often choose Option C (Block Public Access) thinking it prevents all public access, but it does not remediate existing non-compliant buckets, which is explicitly required by the question.

How to eliminate wrong answers

Option B is wrong because an EventBridge rule matching `PutBucketAcl` API calls reacts only to new ACL changes, not to buckets that already have public ACLs; it also misses public ACLs applied through other API calls (for example, an ACL supplied when the bucket is created) and leaves the remediation logic to custom Lambda code that must be written and maintained. Option C is wrong because S3 Block Public Access at the account level is a preventive control: it stops new public ACLs from being set but does not detect non-compliant buckets and remove their public grants, which is the detect-and-remediate workflow the question asks for. Option D is wrong because a daily Lambda function runs only once per day, which violates the 5-minute remediation requirement; it also relies on a custom script that may miss edge cases or fail to handle all ACL configurations.
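
To make the auto-remediation in Option A more concrete, here is a sketch of the remediation settings one might attach to the Config rule. The dictionary follows the request shape of the AWS Config `PutRemediationConfigurations` API; the role ARN, retry count, and retry interval are hypothetical values chosen for illustration.

```python
def build_remediation_config(rule_name: str, automation_role_arn: str) -> dict:
    """Build one remediation entry for the given Config rule.

    The role ARN is supplied by the caller; the retry settings are
    illustrative assumptions, not values from the question.
    """
    return {
        "ConfigRuleName": rule_name,
        "TargetType": "SSM_DOCUMENT",
        "TargetId": "AWS-DisableS3BucketPublicReadWrite",
        "Parameters": {
            # Role that Systems Manager Automation assumes to change the bucket.
            "AutomationAssumeRole": {"StaticValue": {"Values": [automation_role_arn]}},
            # RESOURCE_ID is replaced with the non-compliant bucket's name.
            "S3BucketName": {"ResourceValue": {"Value": "RESOURCE_ID"}},
        },
        "Automatic": True,  # remediate without manual approval
        "MaximumAutomaticAttempts": 3,
        "RetryAttemptSeconds": 60,
    }


remediation = build_remediation_config(
    "s3-bucket-public-read-prohibited",
    "arn:aws:iam::123456789012:role/HypotheticalRemediationRole",
)
print(remediation["TargetId"])
```

The `Automatic` flag is what removes the manual step; without it the same document would only be offered as a manual remediation action.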

109

MCQmedium

A web application publishes a custom metric 'FailedLoginAttempts' to Amazon CloudWatch. The SysOps administrator needs to be notified via Amazon SNS when the number of failed login attempts exceeds 100 within a 5-minute period. Which AWS service or feature should be used to create this notification?

A.Amazon CloudWatch Logs metric filter

B.Amazon CloudWatch alarm

C.Amazon CloudWatch dashboard

D.AWS Config rule

AnswerB

A CloudWatch alarm can monitor any CloudWatch metric (including custom ones) and trigger an action, such as sending a message to an SNS topic, when the metric crosses a defined threshold.

Why this answer

An Amazon CloudWatch alarm is the correct service because it monitors a specific CloudWatch metric (such as 'FailedLoginAttempts') and triggers an action (such as sending an SNS notification) when the metric crosses a defined threshold over a specified period. In this case, the alarm evaluates whether the sum of 'FailedLoginAttempts' exceeds 100 within a 5-minute period, and upon breaching, it publishes to the SNS topic to notify the SysOps administrator.

Exam trap

The trap here is that candidates often confuse CloudWatch Logs metric filters (which extract metrics from logs) with CloudWatch alarms (which evaluate metrics and trigger actions), leading them to choose Option A even though the custom metric is already published to CloudWatch and does not require log extraction.

How to eliminate wrong answers

Option A is wrong because Amazon CloudWatch Logs metric filters are used to extract metric data from log events (e.g., from CloudWatch Logs), not to monitor a custom metric that is already published directly to CloudWatch; they cannot directly trigger SNS notifications without an alarm. Option C is wrong because an Amazon CloudWatch dashboard is a visualization tool for displaying metrics and alarms, not a service that evaluates metric thresholds or triggers notifications. Option D is wrong because AWS Config rules evaluate resource configurations for compliance against desired policies, not real-time metric values like failed login attempts, and they cannot directly trigger SNS notifications based on metric thresholds.
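
The alarm described above might be defined with parameters along these lines. The keys follow the CloudWatch `PutMetricAlarm` API; the namespace, alarm name, and SNS topic ARN are hypothetical, since the question does not name them.

```python
def build_failed_login_alarm(namespace: str, topic_arn: str) -> dict:
    """Alarm when the 5-minute sum of FailedLoginAttempts exceeds 100."""
    return {
        "AlarmName": "FailedLoginAttempts-over-100",  # hypothetical name
        "Namespace": namespace,
        "MetricName": "FailedLoginAttempts",
        "Statistic": "Sum",          # total attempts in the period
        "Period": 300,               # one 5-minute window, in seconds
        "EvaluationPeriods": 1,
        "Threshold": 100,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no data means no failed logins
        "AlarmActions": [topic_arn],         # SNS topic that notifies the admin
    }


alarm = build_failed_login_alarm(
    "MyApp/Security",  # hypothetical custom namespace
    "arn:aws:sns:us-east-1:123456789012:hypothetical-ops-alerts",
)
print(alarm["Statistic"], alarm["Period"], alarm["Threshold"])
```

Using `Sum` rather than `Average` matters here: the requirement is about the number of attempts in the window, not the typical value of each data point.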

110

Multi-Selecthard

A company uses AWS CloudTrail to log API activity. The security team wants to be alerted when an IAM user creates a new access key. Which THREE steps should the SysOps administrator take to meet this requirement?

Select 3 answers

A.Configure the CloudTrail trail to deliver logs directly to an SNS topic.

B.Configure the Lambda function to publish a custom metric to CloudWatch.

C.Set a CloudWatch alarm on the custom metric to send an Amazon SNS notification when the metric exceeds a threshold.

D.Create a CloudWatch Logs subscription filter that sends matching log events to an AWS Lambda function.

E.Create an Amazon EventBridge rule that matches the CreateAccessKey event and triggers an SNS notification.

AnswersB, C, D

Publishing a custom metric allows setting an alarm on the count of access key creations.

Why this answer

Options B, C, and D are correct. CloudTrail records events as JSON, so once the trail delivers to CloudWatch Logs, a subscription filter can match the event name 'CreateAccessKey' and send matching events to Lambda for processing. The Lambda function then publishes a custom metric to CloudWatch, and a CloudWatch alarm on that metric sends the SNS notification.

Option A is wrong because a trail's SNS integration only announces log file deliveries and has no filtering logic, so it cannot single out CreateAccessKey events. Option E is not part of the answer because EventBridge matches the event directly, without a CloudWatch Logs subscription; on its own it is the simpler route, but it does not belong to the three-step metric and alarm path the question asks for.
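
The subscription filter to Lambda to custom metric path can be sketched as follows. CloudWatch Logs delivers subscription data to Lambda as base64-encoded, gzip-compressed JSON under `awslogs.data`; the function names are hypothetical, and the metric publication itself is left as a comment so the sketch runs without AWS credentials.

```python
import base64
import gzip
import json

# Filter pattern for the subscription filter on the CloudTrail log group.
FILTER_PATTERN = '{ $.eventName = "CreateAccessKey" }'


def count_access_key_creations(event: dict) -> int:
    """Decode a CloudWatch Logs subscription payload and count matching events."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    count = 0
    for log_event in payload.get("logEvents", []):
        record = json.loads(log_event["message"])  # one CloudTrail record
        if record.get("eventName") == "CreateAccessKey":
            count += 1
    return count


def handler(event, context):
    """Lambda entry point (hypothetical name)."""
    count = count_access_key_creations(event)
    # A deployed function would publish `count` here with the CloudWatch
    # PutMetricData API so that an alarm can watch the custom metric.
    return {"AccessKeyCreations": count}
```

The handler re-checks `eventName` even though the filter already matched, which keeps it safe if the filter pattern is later broadened.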

111

MCQhard

A SysOps administrator is troubleshooting a slow web application running on EC2 instances behind an ALB. The application uses an RDS MySQL database. The administrator checks CloudWatch metrics and sees that the ALB's latency is high, the RDS CPU is high, and the EC2 CPU is moderate. The application team reports that the database queries are slow. The administrator suspects that the database is the bottleneck. However, the RDS instance is already a db.r5.large and the administrator wants to avoid increasing instance size due to cost. What should the administrator do to improve performance without increasing instance size?

A.Increase the number of EC2 instances to reduce the load on the database.

B.Add an ElastiCache Redis cluster to cache database queries.

C.Create a Read Replica and offload read traffic to it.

D.Enable Performance Insights on the RDS instance to identify slow queries.

AnswerD

Performance Insights helps pinpoint the exact queries causing high load for optimization.

Why this answer

Option D is correct because enabling Performance Insights on the RDS instance allows the administrator to identify the specific slow queries causing the bottleneck. This diagnostic tool provides a database load analysis, showing which queries consume the most resources, enabling targeted optimization (e.g., adding indexes or rewriting queries) without increasing instance size. Since the EC2 CPU is moderate and the ALB latency is high due to slow database queries, resolving the query performance directly addresses the root cause.

Exam trap

The trap here is that candidates often assume scaling out (more EC2 instances) or adding caching/read replicas will solve a database performance issue, when the real problem is unoptimized queries that need to be identified and fixed first.

How to eliminate wrong answers

Option A is wrong because increasing the number of EC2 instances would not reduce the load on the database; it would increase the number of concurrent connections and queries, potentially worsening the database bottleneck. Option B is wrong because adding an ElastiCache Redis cluster caches only specific query results and requires application-level changes to implement caching logic; it does not fix the underlying slow queries that are already identified as the issue. Option C is wrong because creating a Read Replica offloads read traffic but does not improve the performance of the existing slow queries; the replica would execute the same slow queries, and the primary instance would still be impacted by write operations or unoptimized queries.

112

MCQeasy

A SysOps administrator is troubleshooting an Amazon RDS for MySQL instance that is experiencing high CPU utilization. The administrator wants to identify the specific queries consuming the most CPU. What is the MOST efficient way to achieve this?

A.Use CloudWatch metrics for RDS and create a dashboard for CPU utilization.

B.Enable Performance Insights for the RDS instance and view the top SQL queries.

C.Enable Enhanced Monitoring for the RDS instance and view the CPU metrics.

D.Enable CloudWatch Logs for the RDS instance and filter for slow query logs.

AnswerB

Performance Insights provides a visual representation of database load and identifies top SQL queries.

Why this answer

Performance Insights provides a built-in database load visualization and a dashboard that directly shows the top SQL queries consuming the most resources, including CPU. This is the most efficient method because it requires no additional configuration beyond enabling the feature and immediately surfaces the specific queries causing high CPU utilization.

Exam trap

The trap here is confusing Enhanced Monitoring (OS-level metrics) with Performance Insights (database-level query analysis), leading candidates to choose Enhanced Monitoring when it cannot identify specific queries.

How to eliminate wrong answers

Option A is wrong because CloudWatch metrics for RDS show aggregate CPU utilization but do not identify which specific queries are consuming the CPU. Option C is wrong because Enhanced Monitoring provides OS-level metrics (e.g., CPU, memory, disk I/O) but does not correlate those metrics to individual SQL queries. Option D is wrong because enabling CloudWatch Logs for slow query logs only captures queries that exceed a defined execution time threshold, not necessarily the queries consuming the most CPU, and it requires additional parsing to identify top CPU consumers.

113

MCQmedium

A company uses AWS CloudTrail to log API activity. The SysOps administrator needs to receive an email notification whenever a new IAM user is created. Which AWS services should be used together to meet this requirement with the least operational overhead?

A.CloudTrail, Amazon SNS, and AWS Lambda

B.CloudTrail, Amazon CloudWatch Logs, and a metric filter with an alarm

C.CloudTrail, Amazon EventBridge, and Amazon SNS

D.AWS Config and Amazon SNS

AnswerC

EventBridge can directly consume CloudTrail events and route them to SNS without custom code, providing the simplest solution.

Why this answer

Option C is correct because Amazon EventBridge can directly capture CloudTrail API events (such as CreateUser) and route them to an SNS topic for email notification without needing any custom code or additional infrastructure. This pattern minimizes operational overhead by using a fully managed event bus with built-in filtering and target routing, eliminating the need for Lambda functions or metric filter configurations.

Exam trap

The trap here is that candidates often overcomplicate the solution by adding Lambda or CloudWatch Logs, not realizing that EventBridge provides a direct, serverless integration between CloudTrail and SNS for real-time event-driven notifications.

How to eliminate wrong answers

Option A is wrong because while CloudTrail and SNS are used, adding AWS Lambda introduces unnecessary custom code and operational overhead when EventBridge can directly invoke SNS without a Lambda intermediary. Option B is wrong because using CloudWatch Logs with a metric filter and alarm requires sending CloudTrail logs to CloudWatch Logs, creating a metric filter, and setting an alarm — this adds complexity and latency compared to EventBridge's real-time event routing. Option D is wrong because AWS Config tracks resource configuration changes, not API-level events like IAM user creation; it would require additional rules and custom remediation to trigger SNS, making it less direct and more overhead than EventBridge.
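
For illustration, the EventBridge rule in Option C could use an event pattern like the one below. The `matches` helper is a deliberately simplified stand-in for EventBridge's matching (exact values and nested objects only) so the pattern can be exercised locally; it is not the service's actual implementation.

```python
# Event pattern for IAM CreateUser calls recorded by CloudTrail.
CREATE_USER_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateUser"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: lists of allowed values and nested objects only."""
    for key, expected in pattern.items():
        if isinstance(expected, dict):
            nested = event.get(key)
            if not isinstance(nested, dict) or not matches(expected, nested):
                return False
        elif event.get(key) not in expected:
            return False
    return True


sample_event = {
    "source": "aws.iam",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventSource": "iam.amazonaws.com", "eventName": "CreateUser"},
}
print(matches(CREATE_USER_PATTERN, sample_event))
```

With the SNS topic set as the rule's target, no Lambda function or metric filter is involved, which is the source of the lower operational overhead.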

114

MCQmedium

A SysOps administrator needs to monitor the CPU utilization of an Amazon RDS for PostgreSQL instance and receive an alert if the usage exceeds 80% for 5 consecutive minutes. The database is in a production environment. What is the MOST efficient way to achieve this?

A.Configure an Amazon Simple Notification Service (SNS) topic to subscribe to CloudWatch alarms for all RDS metrics and filter for CPUUtilization.

B.Create an AWS Lambda function that queries the RDS performance schema every minute and publishes a custom metric to CloudWatch, then set an alarm.

C.Create an Amazon CloudWatch alarm on the CPUUtilization metric with a threshold of 80 and an evaluation period of 5 minutes.

D.Use a third-party monitoring tool such as Datadog because CloudWatch cannot monitor RDS CPU utilization.

AnswerC

CloudWatch directly monitors RDS metrics and can trigger an alarm based on the metric's value over a specified period.

Why this answer

Option C is correct because Amazon RDS natively publishes the CPUUtilization metric to CloudWatch at 1-minute intervals; Enhanced Monitoring is a separate feature that reports OS-level metrics and is not needed here. Creating a CloudWatch alarm with a threshold of 80% that must be breached for five consecutive 1-minute periods directly meets the requirement without additional infrastructure. This is the most efficient approach because it uses built-in RDS monitoring with no custom code or third-party tools.

Exam trap

The trap here is that candidates may overcomplicate the solution by assuming CloudWatch cannot natively monitor RDS CPU utilization or that custom code is required, when in fact RDS automatically publishes CPUUtilization to CloudWatch and alarms can be configured directly.

How to eliminate wrong answers

Option A is wrong because subscribing an SNS topic to all CloudWatch alarms for RDS metrics would require filtering at the SNS level, which is inefficient and does not directly create the alarm; the alarm must be created first, and SNS is a notification target, not a monitoring configuration tool. Option B is wrong because querying the RDS performance schema every minute via Lambda is unnecessarily complex, introduces latency, and incurs additional cost; CloudWatch already provides the CPUUtilization metric natively for RDS without custom instrumentation. Option D is wrong because CloudWatch fully supports monitoring RDS CPU utilization; a third-party tool like Datadog adds cost and complexity without solving the stated requirement.
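
One way to express "above 80% for 5 consecutive minutes" is a 60-second period with five evaluation periods, all of which must breach. The sketch below builds such alarm parameters (keys follow the CloudWatch `PutMetricAlarm` API) and includes a small local check of the same logic; the instance identifier and topic ARN are hypothetical.

```python
def build_rds_cpu_alarm(db_instance_id: str, topic_arn: str) -> dict:
    """Alarm parameters for sustained high CPU on one RDS instance."""
    return {
        "AlarmName": f"{db_instance_id}-cpu-high",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 60,              # 1-minute data points
        "EvaluationPeriods": 5,    # look at the last five periods
        "DatapointsToAlarm": 5,    # all five must breach
        "Threshold": 80.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }


def would_alarm(datapoints, threshold=80.0, periods=5) -> bool:
    """True when the most recent `periods` data points all exceed the threshold."""
    recent = datapoints[-periods:]
    return len(recent) == periods and all(value > threshold for value in recent)


alarm = build_rds_cpu_alarm(
    "hypothetical-prod-db",
    "arn:aws:sns:us-east-1:123456789012:hypothetical-db-alerts",
)
print(would_alarm([70, 85, 90, 95, 88, 91]))
```

A single dip below the threshold inside the window resets the condition, which is what distinguishes "5 consecutive minutes" from a 5-minute average.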

115

MCQmedium

A company is using Amazon CloudWatch Logs to monitor application logs from EC2 instances. The operations team wants to receive a notification when a specific error pattern appears in the logs. Which solution requires the least operational overhead?

A.Install the CloudWatch agent on EC2 instances and configure it to stream logs to CloudWatch Logs. Create a metric filter for the error pattern and set a CloudWatch alarm that sends an SNS notification.

B.Use Amazon Kinesis Data Firehose to stream all logs to Amazon S3, then run an AWS Glue job to search for the error pattern and trigger an SNS notification.

C.Install the CloudWatch agent on EC2 instances, stream logs to CloudWatch Logs, and use a subscription filter to invoke an AWS Lambda function that publishes a message to an SNS topic.

D.Configure the application to write logs to a file, use the CloudWatch agent to send logs to CloudWatch Logs, set a metric filter, and use the filter to send data to Amazon EventBridge, which then triggers a Lambda function to send an SNS notification.

AnswerC

Subscription filters provide serverless, real-time processing without managing additional resources.

Why this answer

Option C is correct because it uses a CloudWatch Logs subscription filter to directly invoke a Lambda function when a log event matches the error pattern, which then publishes an SNS notification. This approach avoids the overhead of creating and managing metric filters and alarms, and it provides real-time, event-driven processing with minimal configuration.

Exam trap

The trap here is that candidates often assume metric filters and alarms are the simplest solution, but they overlook the real-time, event-driven nature of subscription filters, which actually require less overhead for immediate notification.

How to eliminate wrong answers

Option A is wrong because the notification depends on the alarm rather than on the log event itself: the metric filter turns matching events into data points as they are ingested, but the alarm acts only after evaluating a full period and changing state, which adds latency and a second resource to maintain. Option B is wrong because it introduces unnecessary complexity by streaming logs to S3 and running AWS Glue jobs, which are batch-oriented and not designed for real-time notification; this adds significant operational overhead and latency. Option D is wrong because it adds an unnecessary intermediate step by sending data to EventBridge before triggering Lambda; the subscription filter can directly invoke Lambda, making the EventBridge hop redundant and increasing overhead.

116

MCQeasy

A company hosts a web application on multiple EC2 instances behind an Application Load Balancer (ALB). The SysOps administrator receives a report that the application is experiencing intermittent 503 errors. The ALB target group health checks are configured to check the /health endpoint every 30 seconds with a healthy threshold of 2 and an unhealthy threshold of 2. The administrator checks the ALB metrics and notices that the number of healthy hosts occasionally drops to zero. The EC2 instances are normal and the application logs show no errors. What is the most likely cause and solution?

A.Increase the health check interval to 60 seconds.

B.Increase the health check timeout to 10 seconds.

C.Add more EC2 instances to the target group.

D.Decrease the healthy threshold to 1.

AnswerB

A longer timeout accommodates momentary slowdowns in the health check endpoint response.

Why this answer

The intermittent 503 errors and healthy hosts dropping to zero indicate that health checks are timing out before the application can respond. With a default health check timeout of 5 seconds and a 30-second interval, if the /health endpoint occasionally takes longer than 5 seconds (e.g., due to transient load), the ALB marks instances unhealthy after two consecutive failures (unhealthy threshold of 2). Increasing the timeout to 10 seconds gives the endpoint more time to respond, preventing false negatives without changing the check frequency.

Exam trap

The trap here is that candidates often confuse increasing the health check interval (Option A) with giving more time for the application to respond, when in fact the timeout parameter directly controls how long the ALB waits for a response before marking the check as failed.

How to eliminate wrong answers

Option A is wrong because increasing the health check interval to 60 seconds would reduce the frequency of checks, delaying detection of actual failures and not addressing the root cause of timeouts. Option C is wrong because adding more EC2 instances does not fix the underlying health check timeout issue; the new instances would also fail the same timeout-based health checks. Option D is wrong because the healthy threshold only controls how many consecutive successful checks return a target to service; lowering it to 1 speeds up recovery but does nothing to stop targets from being marked unhealthy when checks time out, so the healthy host count would still drop to zero.
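
The effect of the timeout can be seen by replaying a series of health check response times against the unhealthy threshold. The response times below are made-up sample values, and the function is a simplified model of the ALB's behavior rather than its actual algorithm.

```python
def ever_marked_unhealthy(response_times, timeout, unhealthy_threshold=2) -> bool:
    """Return True if consecutive failed checks ever reach the unhealthy threshold.

    A check is treated as failed when the endpoint does not answer
    within `timeout` seconds.
    """
    consecutive_failures = 0
    for seconds in response_times:
        if seconds >= timeout:
            consecutive_failures += 1
            if consecutive_failures >= unhealthy_threshold:
                return True
        else:
            consecutive_failures = 0  # a success resets the streak
    return False


# Hypothetical /health response times (seconds) during a transient slowdown.
observed = [0.4, 6.2, 7.5, 0.5]

print(ever_marked_unhealthy(observed, timeout=5))   # two checks exceed 5 s
print(ever_marked_unhealthy(observed, timeout=10))  # all checks fit in 10 s
```

With the same response times, only the 5-second timeout trips the threshold, which matches the reasoning for Option B.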

117

MCQhard

A SysOps administrator is troubleshooting an issue where an EC2 instance is not sending logs to CloudWatch Logs. The instance has the CloudWatch agent installed, but no logs appear in the log group. The IAM role assigned to the instance has the following policy: {"Version": "2012-10-17", "Statement": [{"Effect": "Allow", "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"], "Resource": "arn:aws:logs:us-east-1:123456789012:log-group:MyAppLogs:*"}]}. What is the most likely cause?

A.The log group name in the agent configuration does not match the policy resource.

B.The CloudWatch agent is not running on the instance.

C.The IAM role does not have permission to describe log groups.

D.The instance does not have outbound internet access to reach CloudWatch Logs.

AnswerA

The policy allows only actions on MyAppLogs; mismatch causes failure.

Why this answer

The IAM policy grants permissions only for the log group named 'MyAppLogs' (with a wildcard for streams). If the CloudWatch agent configuration specifies a different log group name, the agent's API calls to CreateLogGroup, CreateLogStream, or PutLogEvents will fail with an AccessDenied error because the resource ARN in the policy does not match the actual log group being targeted. This is the most common cause when the agent is installed and running but no logs appear.

Exam trap

The trap here is that candidates often assume the agent is not running or lacks internet access, but the real issue is a mismatch between the IAM policy resource and the log group name in the agent configuration, which causes an implicit deny.

How to eliminate wrong answers

Option B is wrong because the scenario only says that no logs appear; an agent that is running but denied by IAM produces the same symptom, and the narrowly scoped policy in the question points to a permissions problem rather than a stopped agent. Option C is wrong because the logs:DescribeLogGroups action is not required for sending logs; the agent only needs CreateLogGroup, CreateLogStream, and PutLogEvents to write logs. Option D is wrong because nothing in the scenario indicates a connectivity problem: the instance can reach CloudWatch Logs through the public endpoint (via an internet gateway or NAT) or privately through a VPC interface endpoint, so the policy mismatch is the more specific and likely cause.
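
The mismatch can be checked by comparing the ARN the agent would write to against the `Resource` in the policy. The sketch below approximates IAM's wildcard matching with `fnmatchcase`, which behaves the same for a simple trailing `*` like this one; the log group and stream names other than MyAppLogs are hypothetical.

```python
from fnmatch import fnmatchcase

# Resource from the policy in the question.
POLICY_RESOURCE = "arn:aws:logs:us-east-1:123456789012:log-group:MyAppLogs:*"


def log_stream_arn(log_group: str, log_stream: str,
                   region: str = "us-east-1", account: str = "123456789012") -> str:
    """Build the ARN of a log stream inside a log group."""
    return f"arn:aws:logs:{region}:{account}:log-group:{log_group}:log-stream:{log_stream}"


def is_allowed(resource_arn: str, policy_resource: str = POLICY_RESOURCE) -> bool:
    """Approximate IAM resource matching with shell-style wildcards."""
    return fnmatchcase(resource_arn, policy_resource)


print(is_allowed(log_stream_arn("MyAppLogs", "i-0abc123")))       # group named in the policy
print(is_allowed(log_stream_arn("MyAppLogs-prod", "i-0abc123")))  # different group name
```

Any log group name other than MyAppLogs, even one that merely extends it, falls outside the policy and is implicitly denied.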

118

MCQmedium

A company uses Amazon S3 to store sensitive data. The SysOps administrator needs to ensure that any attempt to upload an object with server-side encryption disabled is immediately detected and the administrator is notified. The administrator has enabled AWS CloudTrail and is logging S3 data events. Which approach should the administrator use to achieve this?

A.Enable S3 event notifications to send events to SNS for all PutObject operations.

B.Create a CloudWatch Events rule that matches PutObject API calls without the encryption header and triggers an SNS notification.

C.Create an AWS Config rule to detect objects without encryption.

D.Use S3 Inventory to generate a daily report of unencrypted objects.

AnswerB

CloudTrail logs contain the request parameters; CloudWatch Events can filter and act.

Why this answer

Option B is correct because CloudWatch Events (now Amazon EventBridge) can filter API calls captured by CloudTrail for PutObject operations that lack the x-amz-server-side-encryption header, and then trigger an SNS notification in near real-time. This ensures immediate detection and notification of any upload with server-side encryption disabled, meeting the requirement for instant alerting.

Exam trap

The trap here is that candidates may confuse S3 event notifications (which trigger on all PutObject operations without filtering) with CloudWatch Events (which can filter on API call details), leading them to choose Option A despite its inability to detect missing encryption headers.

How to eliminate wrong answers

Option A is wrong because S3 event notifications for PutObject operations do not inspect the encryption header; they trigger on all PutObject events regardless of encryption status, so they cannot distinguish between encrypted and unencrypted uploads. Option C is wrong because AWS Config rules evaluate resource configurations periodically or on configuration changes, not in real-time for each API call, and they would detect unencrypted objects at rest rather than the act of uploading without encryption. Option D is wrong because S3 Inventory generates a daily report of objects and their metadata, which is not immediate and cannot provide real-time detection or notification of the upload attempt.
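
A rule like the one in Option B might use an event pattern with the `exists` operator to match PutObject calls whose request parameters lack the encryption header. This is a sketch under the assumption that CloudTrail records the header under `requestParameters` with the key `x-amz-server-side-encryption`; the helper mirrors the same check locally so it can be tested without AWS.

```python
# Event pattern: PutObject data events with no server-side encryption header.
UNENCRYPTED_PUT_PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com"],
        "eventName": ["PutObject"],
        "requestParameters": {
            "x-amz-server-side-encryption": [{"exists": False}],
        },
    },
}


def is_unencrypted_put(detail: dict) -> bool:
    """Local equivalent of the pattern, applied to a CloudTrail record."""
    if detail.get("eventName") != "PutObject":
        return False
    params = detail.get("requestParameters") or {}
    return "x-amz-server-side-encryption" not in params


# Hypothetical record for an upload sent without the header.
record = {
    "eventName": "PutObject",
    "requestParameters": {"bucketName": "example-bucket", "key": "report.csv"},
}
print(is_unencrypted_put(record))
```

S3 event notifications carry no such request detail, which is why Option A cannot make this distinction.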

119

MCQmedium

A company hosts a static website on Amazon S3 and uses Amazon CloudFront for content delivery. The marketing team wants to know how many users visit the website each day, including the geographic distribution. Which solution requires the LEAST operational overhead?

A.Use CloudWatch metrics for CloudFront to view request counts and enable detailed metrics.

B.Enable CloudFront access logs and use Amazon Athena to query the logs in S3.

C.Enable AWS CloudTrail for CloudFront and query the event history.

D.Enable CloudFront real-time logs and send them to Amazon Kinesis Data Analytics for analysis.

AnswerB

Athena is serverless and can easily query S3 logs for daily counts and geography.

Why this answer

Option B is correct because enabling CloudFront access logs and querying them with Amazon Athena provides detailed user visit counts and geographic distribution with minimal operational overhead. Access logs contain client IP addresses and request details, which Athena can analyze using SQL without managing servers or complex pipelines. This approach is serverless and cost-effective for periodic analysis.

Exam trap

The trap here is that candidates may confuse CloudWatch metrics (which show request counts but lack geographic detail) with access logs (which provide the raw data needed for geographic analysis), or overcomplicate the solution by choosing real-time streaming when batch analysis is sufficient.

How to eliminate wrong answers

Option A is wrong because CloudWatch metrics for CloudFront provide aggregated request counts but do not include geographic distribution data, and enabling detailed metrics increases cost without solving the requirement. Option C is wrong because AWS CloudTrail records API calls made to CloudFront (e.g., configuration changes), not user requests to the website, so it cannot provide visitor counts or geographic distribution. Option D is wrong because CloudFront real-time logs with Kinesis Data Analytics introduce significant operational overhead for stream processing, which is unnecessary for daily, batch-oriented analysis of user visits.
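
To show the kind of aggregation an Athena query would perform, here is a local Python stand-in that counts distinct client IPs per day and edge location from CloudFront standard log lines. It assumes the standard tab-separated layout in which the first five fields are date, time, x-edge-location, sc-bytes, and c-ip; distinct IPs only approximate users, and the edge location only approximates geography. The sample lines are invented.

```python
from collections import defaultdict


def daily_visitors(log_lines):
    """Return {date: {edge_location: number of distinct client IPs}}."""
    seen = defaultdict(lambda: defaultdict(set))
    for line in log_lines:
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and header lines
            continue
        fields = line.split("\t")
        date, edge_location, client_ip = fields[0], fields[2], fields[4]
        seen[date][edge_location].add(client_ip)
    return {
        date: {location: len(ips) for location, ips in locations.items()}
        for date, locations in seen.items()
    }


# Invented sample lines, truncated after the fields used here.
sample_lines = [
    "#Version: 1.0",
    "2024-05-01\t12:00:00\tFRA56-P1\t1024\t203.0.113.10\tGET",
    "2024-05-01\t12:00:05\tFRA56-P1\t2048\t203.0.113.10\tGET",
    "2024-05-01\t12:01:00\tIAD89-C1\t512\t198.51.100.7\tGET",
]
print(daily_visitors(sample_lines))
```

In Athena the same result would come from a `GROUP BY` over the log table, with no servers or streaming pipeline to operate.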

120

Multi-Selectmedium

A company uses AWS CloudFormation to deploy infrastructure. The operations team wants to be notified when a stack enters a ROLLBACK_IN_PROGRESS state. Which TWO methods can achieve this?

Select 2 answers

A.Use AWS Config rules to evaluate the stack state.

B.Create an Amazon CloudWatch Events rule that matches the CloudFormation stack status change.

C.Configure a CloudWatch Logs subscription filter to detect the stack state.

D.Create a CloudTrail trail and monitor the UpdateStack API call.

E.Configure CloudFormation stack notifications to send events to an Amazon SNS topic.

AnswersB, E

CloudFormation emits events to CloudWatch Events for stack status changes.

Why this answer

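Option B is correct because Amazon CloudWatch Events (now Amazon EventBridge) can capture CloudFormation stack status changes, including ROLLBACK_IN_PROGRESS, by matching the 'CloudFormation Stack Status Change' event pattern. This allows you to trigger a notification action (e.g., via SNS or Lambda) in real time when the stack enters that state. Option E is correct because a stack can be configured with SNS topic ARNs as notification targets, and CloudFormation then publishes its stack events, including the move to ROLLBACK_IN_PROGRESS, to that topic.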

Exam trap

The trap here is that candidates may confuse CloudTrail API logging (which records the API call but not the asynchronous state transition) with event-driven notifications, or assume CloudWatch Logs subscription filters can parse CloudFormation events, when in fact CloudFormation does not write stack state changes to CloudWatch Logs.
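
An event pattern for the rule in Option B could look like the sketch below. The `status-details.status` field path reflects the CloudFormation stack status change event as I understand it and should be treated as an assumption; the helper applies the same check to a sample event locally.

```python
# Event pattern for stacks entering ROLLBACK_IN_PROGRESS.
ROLLBACK_PATTERN = {
    "source": ["aws.cloudformation"],
    "detail-type": ["CloudFormation Stack Status Change"],
    "detail": {"status-details": {"status": ["ROLLBACK_IN_PROGRESS"]}},
}


def is_rollback_event(event: dict) -> bool:
    """Local equivalent of the pattern for a single event."""
    status = event.get("detail", {}).get("status-details", {}).get("status")
    return (
        event.get("source") == "aws.cloudformation"
        and event.get("detail-type") == "CloudFormation Stack Status Change"
        and status == "ROLLBACK_IN_PROGRESS"
    )


# Hypothetical event for illustration.
sample_event = {
    "source": "aws.cloudformation",
    "detail-type": "CloudFormation Stack Status Change",
    "detail": {"status-details": {"status": "ROLLBACK_IN_PROGRESS"}},
}
print(is_rollback_event(sample_event))
```

Filtering on the status in the pattern keeps other transitions, such as CREATE_COMPLETE, from reaching the notification target.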

121

Drag & Dropmedium

Drag and drop the steps to migrate an on-premises application to AWS using AWS Application Migration Service (MGN) into the correct order.

Why this order

Install the agent, configure replication, sync, test, then cut over.

122

Multi-Selecthard

A SysOps administrator is investigating a performance issue where an Amazon RDS for MySQL instance's ReadIOPS metric is consistently high. The database is used by a web application. Which THREE actions should the administrator take to improve performance?

Select 3 answers

A.Increase the InnoDB buffer pool size to cache more data in memory.

B.Enable query caching in MySQL to avoid repeated reads of the same data.

C.Add read replicas to offload read queries from the primary instance.

D.Change the storage type to Provisioned IOPS for better performance.

E.Enable Multi-AZ deployment for high availability.

AnswersA, B, C

A larger buffer pool reduces the need to read from disk, lowering ReadIOPS.

Why this answer

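Increasing the InnoDB buffer pool size allows more data and indexes to be cached in memory, reducing the need for disk reads and thus lowering ReadIOPS. This is a direct and effective tuning action for MySQL on RDS when read I/O is the bottleneck. Caching repeated query results avoids re-reading the same data (on MySQL versions that still include the query cache, which was removed in MySQL 8.0), and read replicas move read queries off the primary instance.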

Exam trap

The trap here is that candidates often confuse Provisioned IOPS with reducing I/O demand, when it only improves I/O performance, and Multi-AZ with read scaling, when it is solely for failover and availability.

123

MCQmedium

A SysOps administrator needs to ensure that all S3 buckets in the account have server access logging enabled. The administrator wants to be notified if a bucket is created without logging. What is the most efficient solution?

A.Periodically run a script to list all buckets and check logging configuration.

B.Enable S3 event notifications on the bucket creation event and trigger a Lambda function.

C.Use CloudWatch Events to detect CreateBucket API calls and trigger a Lambda function.

D.Use AWS Config with a managed rule to evaluate S3 buckets for server access logging.

AnswerD

AWS Config continuously evaluates resources against rules and can send notifications on non-compliant resources.

Why this answer

AWS Config with the managed rule 's3-bucket-logging-enabled' continuously evaluates all S3 buckets against the desired configuration. When a bucket is created without server access logging, AWS Config automatically flags it as noncompliant and can trigger an SNS notification. This is the most efficient solution because it provides ongoing, automated compliance monitoring without requiring custom scripts or event-driven remediation.

Exam trap

The trap here is that candidates confuse event-driven detection (CloudWatch Events or S3 event notifications) with continuous compliance evaluation, mistakenly believing that a single trigger at creation time is sufficient to meet the requirement of being notified if a bucket is created without logging.

How to eliminate wrong answers

Option A is wrong because periodically running a script is reactive, not proactive; it introduces latency between bucket creation and detection, and requires manual maintenance. Option B is wrong because S3 event notifications are triggered by object-level events (e.g., PUT, POST) within a bucket, not by the CreateBucket API call itself, so they cannot detect bucket creation. Option C is wrong because CloudWatch Events (now Amazon EventBridge) can detect CreateBucket API calls, but this approach only triggers a Lambda function at creation time; it does not provide ongoing compliance evaluation if logging is later disabled, and it requires custom code to check the logging configuration.
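
To make the AWS Config approach more concrete, here is a minimal sketch of the rule definition as a Python dictionary. It assumes the managed rule identifier is `S3_BUCKET_LOGGING_ENABLED`. The rule name `s3-bucket-logging-check` is a hypothetical label, since the rule name is chosen by the administrator.

```python
import json


def build_logging_rule(rule_name="s3-bucket-logging-check"):
    """Return a ConfigRule structure for the managed S3 logging check.

    rule_name is a hypothetical label; the SourceIdentifier is assumed to
    be the managed rule that checks server access logging on S3 buckets.
    """
    return {
        "ConfigRuleName": rule_name,
        "Source": {"Owner": "AWS", "SourceIdentifier": "S3_BUCKET_LOGGING_ENABLED"},
        "Scope": {"ComplianceResourceTypes": ["AWS::S3::Bucket"]},
    }


rule = build_logging_rule()
# With boto3 installed and credentials configured, the rule could be created with:
#   boto3.client("config").put_config_rule(ConfigRule=rule)
print(json.dumps(rule, indent=2))
```

Notifications on noncompliance are a separate step, for example an EventBridge rule or the Config delivery channel's SNS topic, and are left out of this sketch.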

124

Multi-Selectmedium

A company wants to monitor AWS API calls for suspicious activity. Which TWO AWS services can be used together to achieve this?

Select 2 answers

A.VPC Flow Logs

B.Amazon CloudWatch Logs

C.Amazon Inspector

D.AWS Config

E.AWS CloudTrail

AnswersB, E

CloudWatch Logs can analyze CloudTrail logs for suspicious patterns.

Why this answer

Options B and E are correct. CloudTrail logs API calls, and CloudWatch Logs can ingest those logs and use metric filters to detect suspicious patterns. Option D (AWS Config) records resource configuration changes, not API calls.

Option A (VPC Flow Logs) captures network traffic. Option C (Amazon Inspector) is for security assessments.

125

MCQeasy

A SysOps administrator needs to monitor the memory utilization of an EC2 instance running Amazon Linux 2. Which of the following is required to publish memory metrics to CloudWatch?

A.Use the CloudWatch Logs agent to parse memory usage from system logs.

B.Install and configure the CloudWatch agent on the instance.

C.Enable detailed monitoring on the instance.

D.Install the AWS Systems Manager Agent (SSM Agent) and configure it to send metrics.

AnswerB

The CloudWatch agent can collect custom metrics like memory.

Why this answer

The CloudWatch agent is specifically designed to collect custom metrics, such as memory utilization, from EC2 instances and publish them to CloudWatch. Unlike the default EC2 monitoring, which only captures hypervisor-level metrics (CPU, network, disk), memory utilization requires an in-guest agent to read from the operating system's /proc/meminfo or similar interfaces. The CloudWatch agent can be configured via a JSON file to collect memory metrics and send them to CloudWatch using the PutMetricData API.

Exam trap

The trap here is that candidates confuse 'detailed monitoring' (which increases frequency of existing hypervisor metrics) with the ability to collect in-guest metrics, or they assume the SSM Agent or CloudWatch Logs agent can perform metric collection, when only the CloudWatch agent is designed for that purpose.

How to eliminate wrong answers

Option A is wrong because the CloudWatch Logs agent is used to send log data to CloudWatch Logs, not to parse and publish custom metrics like memory utilization to CloudWatch Metrics. Option C is wrong because enabling detailed monitoring only increases the frequency of standard EC2 metrics (e.g., CPU, disk I/O) from 5 minutes to 1 minute; it does not enable collection of in-guest metrics such as memory. Option D is wrong because the SSM Agent is used for Systems Manager features like Run Command, Patch Manager, and Inventory, not for publishing custom metrics to CloudWatch; it lacks the metric collection and publishing capabilities of the CloudWatch agent.
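
The JSON configuration mentioned above can be sketched as follows. This is a minimal example that generates an agent configuration collecting only `mem_used_percent`. The 60-second interval and the `InstanceId` dimension are illustrative choices, not requirements.

```python
import json


def build_agent_config(interval_seconds=60):
    """Build a minimal CloudWatch agent config that collects memory usage."""
    return {
        "agent": {"metrics_collection_interval": interval_seconds},
        "metrics": {
            # Tag each data point with the instance ID (illustrative choice).
            "append_dimensions": {"InstanceId": "${aws:InstanceId}"},
            "metrics_collected": {
                "mem": {"measurement": ["mem_used_percent"]},
            },
        },
    }


config_text = json.dumps(build_agent_config(), indent=2)
print(config_text)
```

The printed JSON would be saved as the agent's configuration file on the instance. The instance role also needs permission to call PutMetricData, which is typically granted through the `CloudWatchAgentServerPolicy` managed policy.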

126

Multi-Selecthard

Which TWO options are valid ways to send custom metrics to Amazon CloudWatch?

Select 2 answers

A.Use the CloudWatch agent to collect and publish custom metrics.

B.Use the PutMetricData API call.

C.Use Amazon SQS to send metric data to CloudWatch.

D.Use AWS CloudTrail to log custom metrics.

E.Use Amazon Kinesis Data Firehose to deliver metrics to CloudWatch.

AnswersA, B

The agent can collect custom metrics from the OS and applications.

Why this answer

Option A is correct because the CloudWatch agent can be installed on EC2 instances or on-premises servers to collect system-level metrics (like memory and disk usage) and custom application metrics, then publish them to CloudWatch. Option B is correct because the PutMetricData API call allows direct programmatic ingestion of custom metrics into CloudWatch, supporting up to 1,000 metrics per call with a maximum payload of 1 MB.

Exam trap

The trap here is that candidates may think SQS or Kinesis Data Firehose can natively push data to CloudWatch, but neither service has a direct integration for custom metric ingestion—only PutMetricData or the CloudWatch agent (which uses that API) are valid methods.
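
For reference, a PutMetricData request might be assembled roughly like this. The namespace `Custom/App`, the metric name `ActiveConnections`, and the instance ID are hypothetical values chosen for illustration. The sketch only builds the request and does not call AWS.

```python
from datetime import datetime, timezone


def build_metric_request(connection_count, instance_id):
    """Assemble keyword arguments for a PutMetricData call (names are hypothetical)."""
    return {
        "Namespace": "Custom/App",
        "MetricData": [
            {
                "MetricName": "ActiveConnections",
                "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
                "Timestamp": datetime.now(timezone.utc),
                "Value": float(connection_count),
                "Unit": "Count",
            }
        ],
    }


request = build_metric_request(42, "i-0123456789abcdef0")
# With boto3 installed and credentials configured:
#   boto3.client("cloudwatch").put_metric_data(**request)
print(request["MetricData"][0]["MetricName"], request["MetricData"][0]["Value"])
```

The CloudWatch agent ultimately sends data through the same API, so both correct options end up at PutMetricData.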

127

Multi-Selecthard

A SysOps administrator is designing a monitoring solution for a critical application running on EC2 instances. The application requires that all API calls to the environment are logged for security analysis. Which TWO services should the administrator use to meet this requirement?

Select 2 answers

A.Amazon GuardDuty

B.Amazon CloudWatch Logs

C.AWS CloudTrail

D.AWS Config

E.VPC Flow Logs

AnswersB, C

CloudWatch Logs can store and monitor CloudTrail log files.

Why this answer

AWS CloudTrail is the correct service because it records all API calls made to the AWS environment, including calls made via the AWS Management Console, AWS CLI, SDKs, and other services. CloudTrail logs provide the identity of the caller, the time of the call, the source IP address, and the request parameters, which are essential for security analysis. Option B (Amazon CloudWatch Logs) is also correct because CloudTrail logs can be delivered to CloudWatch Logs for centralized monitoring, alerting, and retention, enabling real-time analysis and integration with other AWS services.

Exam trap

The trap here is confusing AWS CloudTrail (which logs API calls) with VPC Flow Logs (which log network traffic) or GuardDuty (which detects threats but does not generate logs), leading candidates to select services that analyze logs rather than capture them.

128

MCQeasy

A company has an Amazon S3 bucket that stores critical data. The security team wants to be notified whenever an object in the bucket is deleted. Which solution should the SysOps administrator implement?

A.Configure an S3 event notification for 's3:ObjectRemoved:*' events to trigger an AWS Lambda function that sends an email.

B.Enable CloudTrail data events for the S3 bucket, create a CloudWatch Events rule for 'DeleteObject' API calls, and send notifications via SNS.

C.Use AWS Config to monitor S3 bucket resources and trigger an SNS notification on configuration changes.

D.Enable S3 server access logs and use Amazon Athena to query for delete events, then send notifications.

AnswerB

This setup captures all delete operations and sends real-time alerts.

Why this answer

Option B is correct because it uses CloudTrail data events to capture 'DeleteObject' API calls specifically for the S3 bucket, then routes those events via CloudWatch Events to an SNS topic for notification. This provides a reliable, real-time notification mechanism for object deletions without requiring custom code or post-hoc analysis.

Exam trap

The trap here is that candidates often assume S3 event notifications are sufficient for all object operations, but they do not capture all delete scenarios (e.g., versioned object deletions) and lack the integration flexibility of CloudTrail with CloudWatch Events for centralized monitoring and alerting.

How to eliminate wrong answers

Option A is wrong because S3 event notifications for 's3:ObjectRemoved:*' are triggered asynchronously and may not capture all delete scenarios (e.g., versioned object deletions or MFA delete failures), and they require a Lambda function to send email, adding complexity and potential failure points. Option C is wrong because AWS Config monitors configuration changes to the bucket itself (e.g., policy changes), not object-level operations like deletions, so it cannot detect object deletion events. Option D is wrong because S3 server access logs are delivered on a best-effort basis with potential delays (often hours), and querying with Athena is a reactive, post-hoc approach that does not provide real-time notifications.

129

MCQmedium

A company has a fleet of EC2 instances that are part of an Auto Scaling group. The SysOps team wants to automatically replace any instance that fails the status check for 2 consecutive minutes. Which configuration should be used?

A.Configure an EC2 Auto Scaling group to use EC2 status checks and set the health check grace period to 2 minutes.

B.Use AWS Systems Manager Automation to run a script that reboots the instance.

C.Configure an Amazon EventBridge rule to trigger an AWS Lambda function that terminates the instance.

D.Configure a CloudWatch Alarm on StatusCheckFailed metric to reboot the instance.

AnswerA

Auto Scaling will automatically terminate and replace instances that fail status checks.

Why this answer

Option A is correct because an Auto Scaling group can use EC2 status checks to determine instance health. By setting the health check grace period to 2 minutes, the Auto Scaling group will wait 2 minutes after an instance enters the InService state before starting health checks, and then if the instance fails status checks for 2 consecutive minutes, the Auto Scaling group will mark it as unhealthy and automatically terminate and replace it.

Exam trap

The trap here is that candidates often confuse the health check grace period with the time window for detecting failures, or they mistakenly think that rebooting or terminating via CloudWatch or Lambda is equivalent to replacing the instance within an Auto Scaling group.

How to eliminate wrong answers

Option B is wrong because AWS Systems Manager Automation can reboot an instance, but it does not automatically replace the instance; it only attempts to recover it, and the requirement is to replace the instance, not reboot it. Option C is wrong because an EventBridge rule triggering a Lambda function to terminate the instance would require custom code and does not integrate with the Auto Scaling group's lifecycle to automatically launch a replacement instance. Option D is wrong because a CloudWatch Alarm on the StatusCheckFailed metric can be configured to reboot the instance, but it does not replace the instance; the requirement is to automatically replace the instance, not reboot it.

130

MCQeasy

A company has an Auto Scaling group of EC2 instances behind an Application Load Balancer. The SysOps administrator notices that the healthy host count is lower than expected. The instances are in service, and security groups allow traffic. What is a likely cause?

A.The instances are not registered with the target group.

B.The security group for the load balancer does not allow inbound traffic.

C.The health check path is returning HTTP 503.

D.The target group is not associated with the load balancer.

AnswerC

A non-200 response causes the instance to be unhealthy.

Why this answer

The correct answer is C because a health check path returning HTTP 503 (Service Unavailable) indicates that the target instances are reachable but the application is failing to respond correctly. The Application Load Balancer (ALB) marks a target as unhealthy when the health check response does not match the target group's configured success codes (200 by default), which reduces the healthy host count even though the instances are in service and security groups are properly configured.

Exam trap

The trap here is that candidates often assume a low healthy host count is always due to network-level issues (security groups or registration), but the question explicitly states instances are 'in service' and security groups allow traffic, pointing to an application-level health check failure like a 503 response.

How to eliminate wrong answers

Option A is wrong because instances that are 'in service' in the Auto Scaling group are automatically registered with the target group when the group is associated with the ALB; if they were not registered, they would not appear as 'in service'. Option B is wrong because the security group for the load balancer controls inbound traffic from clients, not health check traffic from the load balancer to instances; health check traffic is governed by the instance security group allowing traffic from the load balancer's security group or CIDR. Option D is wrong because if the target group were not associated with the load balancer, the instances would not be receiving traffic from the ALB at all, and the healthy host count would be zero or the target group would not appear in the ALB configuration.
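
The health check decision described above can be modeled with a small sketch. It assumes the target group's success-code matcher uses forms such as `200`, `200,202`, or `200-299`. The helper names are hypothetical, and this is a local model of the comparison, not the load balancer's actual implementation.

```python
def parse_matcher(matcher):
    """Expand a success-code matcher such as "200", "200,202" or "200-299"."""
    codes = set()
    for part in matcher.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            codes.update(range(int(low), int(high) + 1))
        else:
            codes.add(int(part))
    return codes


def is_healthy(status_code, matcher="200"):
    """Return True when the response code is one of the accepted codes."""
    return status_code in parse_matcher(matcher)


print(is_healthy(200), is_healthy(503))
```

A 503 fails under any reasonable matcher, which is why the healthy host count drops even though the instances themselves are running.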

131

MCQhard

A SysOps administrator is troubleshooting a slow-running RDS MySQL instance. The administrator notices that the ReadIOPS metric is consistently high, but the WriteIOPS is low. The instance type is db.m5.large with 300 GB of General Purpose SSD (gp2). What is the most likely cause?

A.The instance type is too small for the workload.

B.The database is experiencing write contention.

C.The network bandwidth is insufficient.

D.The gp2 volume is experiencing I/O credit exhaustion.

AnswerD

Exceeding baseline IOPS depletes credits, causing throttling.

Why this answer

The correct answer is D because a db.m5.large instance with 300 GB of gp2 storage has a baseline IOPS of 900 (3 IOPS per GB) and a burst balance of 5.4 million I/O credits. With consistently high ReadIOPS and low WriteIOPS, the volume is likely exhausting its I/O credit balance, causing performance throttling. This is a classic symptom of gp2 I/O credit exhaustion, where read-heavy workloads deplete the burst bucket, leading to degraded performance.

Exam trap

The trap here is that candidates often assume a slow database is always due to an undersized instance type, overlooking that gp2 volumes have a burst credit mechanism that can be exhausted by sustained high read I/O even with low write activity.

How to eliminate wrong answers

Option A is wrong because the instance type (db.m5.large) is not the primary bottleneck; the issue is with the storage layer's I/O credits, not compute or memory. Option B is wrong because write contention would manifest as high WriteIOPS or increased latency on writes, but the metric shows low WriteIOPS, indicating writes are not the problem. Option C is wrong because network bandwidth is unrelated to storage I/O metrics; insufficient bandwidth would cause network latency or throughput issues, not high ReadIOPS on the EBS volume.
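
The arithmetic behind the explanation can be worked through in a short sketch. It uses the figures quoted above (3 IOPS per GB, a 5.4 million credit bucket) plus the commonly documented gp2 burst ceiling of 3,000 IOPS and baseline bounds of 100 and 16,000 IOPS. Treat these constants as assumptions to verify against current EBS documentation.

```python
GP2_IOPS_PER_GB = 3
GP2_MIN_BASELINE = 100        # assumed floor for small volumes
GP2_MAX_BASELINE = 16_000     # assumed cap for large volumes
GP2_BURST_IOPS = 3_000        # assumed burst ceiling
GP2_CREDIT_BUCKET = 5_400_000


def gp2_baseline_iops(size_gb):
    """Baseline IOPS for a gp2 volume of the given size."""
    return max(GP2_MIN_BASELINE, min(GP2_MAX_BASELINE, GP2_IOPS_PER_GB * size_gb))


def burst_seconds(size_gb, demand_iops=GP2_BURST_IOPS):
    """Seconds a full credit bucket lasts at a steady demand above baseline."""
    baseline = gp2_baseline_iops(size_gb)
    demand = min(demand_iops, GP2_BURST_IOPS)
    if demand <= baseline:
        return float("inf")  # credits are never drained
    return GP2_CREDIT_BUCKET / (demand - baseline)


print(gp2_baseline_iops(300))             # baseline for the 300 GB volume
print(round(burst_seconds(300) / 60, 1))  # minutes of full burst from a full bucket
```

For the 300 GB volume in the question, the baseline is 900 IOPS, and a full bucket sustains a 3,000 IOPS burst for roughly 43 minutes before throughput falls back to the baseline. The `BurstBalance` CloudWatch metric is the usual way to confirm this in practice.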

132

MCQeasy

A company uses AWS CloudFormation to deploy a VPC with public and private subnets. The stack creation fails with the error 'The maximum number of VPCs has been reached.' The SysOps administrator needs to deploy the stack as soon as possible. What should the administrator do?

A.Delete unused VPCs to free up capacity.

B.Modify the CloudFormation template to use an existing VPC.

C.Request a VPC limit increase from AWS Support.

D.Deploy the stack in a different AWS region.

AnswerC

Increasing the limit allows creation of additional VPCs.

Why this answer

Option C is correct because the error 'The maximum number of VPCs has been reached' indicates the AWS account has hit the default VPC limit (5 per region). Requesting a service limit increase from AWS Support is the fastest way to raise this soft limit without modifying existing infrastructure or templates, allowing the stack to deploy in the same region.

Exam trap

The trap here is that candidates may choose to delete unused VPCs (Option A) thinking it's faster, but AWS Support limit increases are often quicker and safer than auditing and deleting resources, especially in production environments.

How to eliminate wrong answers

Option A is wrong because deleting unused VPCs is an alternative but may not be the fastest solution if no VPCs are truly unused, and it requires identifying and safely removing resources, which could delay deployment. Option B is wrong because modifying the CloudFormation template to use an existing VPC changes the architecture and may not meet the requirement for deploying a new VPC as specified in the question. Option D is wrong because deploying in a different region may avoid the limit but introduces latency, compliance, or service availability issues, and is not the most direct fix for a soft limit that can be increased.

133

MCQeasy

A SysOps administrator configures AWS CloudTrail to log all management events in a company's AWS account. The administrator needs to ensure that CloudTrail logs are not deleted for at least 5 years to meet compliance requirements. Which configuration should the administrator apply?

A.Enable CloudTrail log file validation.

B.Enable CloudTrail data events for S3.

C.Apply an S3 bucket policy that prohibits deletion of log files.

D.Enable S3 Object Lock on the CloudTrail S3 bucket.

AnswerD

Correct. S3 Object Lock enforces a retention period, preventing object deletion or overwrite for the specified duration, meeting the 5-year compliance requirement.

Why this answer

Option D is correct because S3 Object Lock provides a Write-Once-Read-Many (WORM) model that prevents objects from being deleted or overwritten for a specified retention period. By enabling S3 Object Lock on the CloudTrail S3 bucket and setting a retention mode (e.g., Compliance or Governance) with a 5-year retention period, the administrator ensures that CloudTrail log files cannot be deleted, meeting the compliance requirement.

Exam trap

The trap here is that candidates confuse S3 bucket policies with immutable storage, not realizing that bucket policies can be overridden by IAM permissions or root user actions, whereas S3 Object Lock provides true WORM protection that even the root user cannot bypass in Compliance mode.

How to eliminate wrong answers

Option A is wrong because CloudTrail log file validation uses SHA-256 hashing to verify the integrity of log files, not to prevent deletion; it ensures logs have not been tampered with but does not enforce retention. Option B is wrong because enabling CloudTrail data events for S3 captures object-level API activity (e.g., GetObject, PutObject) but does not protect log files from deletion; it increases logging scope but does not enforce retention. Option C is wrong because an S3 bucket policy that prohibits deletion of log files can be bypassed by the root user or by an IAM policy that grants s3:DeleteObject permissions; bucket policies alone cannot enforce immutable retention against authorized users.
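
A default retention rule for Object Lock could be expressed roughly as below. The bucket name `example-cloudtrail-logs` is hypothetical, and the sketch assumes Object Lock (which requires versioning) is available on the bucket. It only builds the request and does not call AWS.

```python
def build_object_lock_request(bucket, years=5, mode="COMPLIANCE"):
    """Assemble keyword arguments for an Object Lock default-retention rule."""
    if mode not in ("COMPLIANCE", "GOVERNANCE"):
        raise ValueError("mode must be COMPLIANCE or GOVERNANCE")
    return {
        "Bucket": bucket,
        "ObjectLockConfiguration": {
            "ObjectLockEnabled": "Enabled",
            "Rule": {"DefaultRetention": {"Mode": mode, "Years": years}},
        },
    }


request = build_object_lock_request("example-cloudtrail-logs")
# With boto3 installed and credentials configured:
#   boto3.client("s3").put_object_lock_configuration(**request)
print(request["ObjectLockConfiguration"]["Rule"]["DefaultRetention"])
```

Compliance mode is used here because it matches the "cannot be deleted" wording of the requirement. Governance mode would allow specially privileged users to override the lock.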

134

Multi-Selecthard

A SysOps administrator needs to detect unauthorized changes to security groups and automatically notify the operations team. Which two AWS services should be part of the solution? (Choose 2.)

Select 2 answers

A.AWS CloudTrail.

B.Amazon EventBridge.

C.Amazon S3 Transfer Acceleration.

D.AWS Snowball Edge.

AnswersA, B

CloudTrail records API calls such as AuthorizeSecurityGroupIngress.

Why this answer

AWS CloudTrail is correct because it records API calls made to create, modify, or delete security groups, providing the audit trail needed to detect unauthorized changes. By enabling CloudTrail on the account and configuring a trail to deliver logs to Amazon S3, the administrator can monitor security group events such as AuthorizeSecurityGroupIngress or RevokeSecurityGroupEgress. This log data is essential for identifying when a change occurred, who made it, and from which source IP.

Exam trap

The trap here is that candidates often confuse Amazon S3 Transfer Acceleration with S3 event notifications or S3 server access logging, mistakenly thinking it can trigger alerts, when in fact it is solely a performance optimization for uploads.

135

MCQmedium

A company wants to ensure that all IAM user changes are logged and that an alert is sent when a new IAM user is created. Which services should be used together to achieve this? (Select THREE.)

A.Amazon CloudWatch Logs

B.AWS Config

C.Amazon CloudWatch Alarms

D.Amazon S3

E.AWS CloudTrail

AnswerA, C, E

CloudTrail logs can be streamed to CloudWatch Logs.

Why this answer

Amazon CloudWatch Logs is correct because it can receive and store log events from AWS CloudTrail, which records all IAM user changes including user creation. By sending CloudTrail logs to CloudWatch Logs, you can then create metric filters to detect specific API calls like 'CreateUser' and trigger CloudWatch Alarms to send notifications via SNS.

Exam trap

The trap here is that candidates often select AWS Config thinking it monitors API activity, but Config is designed for compliance and configuration auditing, not for real-time event logging and alerting on specific API calls.

How to eliminate wrong answers

Option B is wrong because AWS Config is a service for evaluating resource configurations against desired policies (e.g., checking if IAM users have MFA enabled), not for logging real-time API events or triggering alerts on user creation. Option D is wrong because Amazon S3 is an object storage service that can store CloudTrail logs but cannot natively filter logs or trigger alarms based on specific API events; it lacks the real-time monitoring and alerting capabilities needed for this use case.

136

MCQmedium

A company runs a REST API on Amazon EC2 instances behind an Application Load Balancer. The SysOps administrator needs to monitor the API endpoint from multiple geographic locations and receive an alarm if the p90 latency exceeds 2 seconds for two consecutive checks. The solution must use AWS managed services and not require custom code running on EC2. Which approach should the administrator use?

A.Set up Amazon CloudWatch Synthetics canaries to run from multiple AWS Regions and publish custom metrics. Create a CloudWatch alarm on the p90 latency metric.

B.Configure VPC Flow Logs on the Application Load Balancer and use Amazon CloudWatch Logs Insights to query for high-latency requests.

C.Enable Amazon CloudWatch RUM (Real User Monitoring) on the client side and create a CloudWatch alarm on the Duration metric.

D.Use AWS CloudTrail to log API calls and set a CloudWatch alarm on the event count for errors.

AnswerA

CloudWatch Synthetics canaries execute scripts in AWS-managed Lambda functions across Regions, measuring latency and success. They can publish metrics to CloudWatch, and an alarm can be created on a percentile statistic like p90.

Why this answer

Amazon CloudWatch Synthetics canaries are AWS-managed scripts (written in Node.js or Python) that run on a schedule to monitor endpoints and can be deployed in multiple AWS Regions, capturing metrics such as duration. Because canaries publish these metrics to CloudWatch, you can create a CloudWatch alarm on the p90 statistic that triggers when p90 exceeds 2 seconds for two consecutive data points, meeting all requirements without custom EC2 code.

Exam trap

The trap here is that candidates may confuse VPC Flow Logs or CloudTrail with application-layer monitoring, but neither provides request-level latency metrics; CloudWatch Synthetics is the only AWS-managed service that can synthetically test an HTTP endpoint from multiple geographic locations and publish percentile latency metrics without custom EC2 code.

How to eliminate wrong answers

Option B is wrong because VPC Flow Logs capture network-level metadata (IPs, ports, protocols) but do not measure application-layer latency like p90; they cannot be used to query for request duration or percentile latencies. Option C is wrong because Amazon CloudWatch RUM collects client-side performance data from actual user browsers, which introduces variability from network conditions and device performance, and it requires client-side JavaScript injection, not a pure AWS-managed service for synthetic monitoring from multiple geographic locations. Option D is wrong because AWS CloudTrail logs API calls to the AWS management plane (e.g., EC2 API calls), not the application-layer REST API requests; it cannot measure p90 latency or trigger alarms on performance metrics.
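
A percentile alarm of the kind described could be defined along these lines. The sketch assumes canaries publish a `Duration` metric in milliseconds to the `CloudWatchSynthetics` namespace with a `CanaryName` dimension. The canary name, topic ARN, and 5-minute period are hypothetical values.

```python
def build_p90_alarm(canary_name, sns_topic_arn, threshold_ms=2000):
    """Assemble keyword arguments for a p90 latency alarm on one canary."""
    return {
        "AlarmName": f"{canary_name}-p90-latency",
        "Namespace": "CloudWatchSynthetics",   # assumed canary namespace
        "MetricName": "Duration",              # assumed to be in milliseconds
        "Dimensions": [{"Name": "CanaryName", "Value": canary_name}],
        "ExtendedStatistic": "p90",            # percentiles use ExtendedStatistic
        "Period": 300,                         # hypothetical check interval
        "EvaluationPeriods": 2,                # two consecutive checks
        "DatapointsToAlarm": 2,
        "Threshold": float(threshold_ms),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


alarm = build_p90_alarm(
    "api-latency-check",
    "arn:aws:sns:us-east-1:111122223333:ops-alerts",
)
# With boto3 installed and credentials configured:
#   boto3.client("cloudwatch").put_metric_alarm(**alarm)
print(alarm["AlarmName"], alarm["ExtendedStatistic"], alarm["Threshold"])
```

Because CloudWatch metrics and alarms are regional, one such alarm would be needed in each Region where a canary runs.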

137

MCQmedium

A company uses AWS CloudTrail to record all API activity. The SysOps administrator needs to be alerted in real time when an IAM user creates a new access key. Which combination of AWS services should be used to create this alert?

A.CloudTrail + Amazon S3 + Amazon SNS

B.CloudTrail + Amazon CloudWatch Logs + Amazon SNS

C.CloudTrail + AWS Config + Amazon SNS

D.CloudTrail + Amazon EventBridge + Amazon SNS

AnswerD

EventBridge can directly consume CloudTrail events and apply rules to match specific API actions, triggering SNS notifications with low latency.

Why this answer

Option D is correct because Amazon EventBridge can directly consume CloudTrail events in real time and trigger an SNS notification when an IAM user creates a new access key. EventBridge provides a serverless event bus that matches specific API calls (e.g., CreateAccessKey) using event patterns, enabling immediate alerting without additional polling or log processing.

Exam trap

The trap here is that candidates often assume CloudWatch Logs is required for any CloudTrail-based alerting, but EventBridge provides a simpler, lower-latency, and more direct integration for real-time API event monitoring.

How to eliminate wrong answers

Option A is wrong because CloudTrail logs to Amazon S3 are delivered in batches (typically every 5 minutes), not in real time, so S3 events cannot trigger immediate alerts for access key creation. Option B is wrong because CloudTrail integration with CloudWatch Logs introduces latency (up to several minutes) and requires additional metric filters and alarms, which is not the most direct real-time approach. Option C is wrong because AWS Config is designed for resource configuration tracking and compliance evaluation, not for real-time API event alerting; it evaluates rules periodically or on configuration changes, not instantaneously for every API call.
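
The EventBridge rule from the correct answer might be set up roughly as follows. The rule name, target ID, and SNS topic ARN are hypothetical. The sketch only builds the two request payloads and does not call AWS.

```python
import json

# Pattern matching CreateAccessKey calls recorded by CloudTrail.
ACCESS_KEY_PATTERN = {
    "source": ["aws.iam"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["iam.amazonaws.com"],
        "eventName": ["CreateAccessKey"],
    },
}


def build_rule_requests(rule_name, sns_topic_arn):
    """Return (put_rule kwargs, put_targets kwargs) for the alerting rule."""
    put_rule = {
        "Name": rule_name,
        "EventPattern": json.dumps(ACCESS_KEY_PATTERN),
        "State": "ENABLED",
    }
    put_targets = {
        "Rule": rule_name,
        "Targets": [{"Id": "notify-sns", "Arn": sns_topic_arn}],
    }
    return put_rule, put_targets


rule_kwargs, target_kwargs = build_rule_requests(
    "alert-on-create-access-key",
    "arn:aws:sns:us-east-1:111122223333:security-alerts",
)
# With boto3 installed and credentials configured:
#   events = boto3.client("events", region_name="us-east-1")
#   events.put_rule(**rule_kwargs)
#   events.put_targets(**target_kwargs)
print(rule_kwargs["EventPattern"])
```

IAM is a global service whose CloudTrail events are delivered in US East (N. Virginia), so the rule is usually created in that Region. The SNS topic's access policy must also allow EventBridge to publish to it.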

138

MCQhard

A SysOps administrator is investigating a security breach. An IAM user 'Bob' is suspected of performing unauthorized actions. The administrator needs to determine the source IP addresses from which Bob's access keys were used in the last 30 days. Which AWS service or feature should be used?

A.AWS CloudTrail event history.

B.VPC Flow Logs.

C.Amazon CloudWatch Logs.

D.AWS IAM credential report.

AnswerA

CloudTrail records API calls with source IP.

Why this answer

AWS CloudTrail event history records the last 90 days of management events in a Region, including the IAM identity that made each call and the source IP address from which the request originated. By filtering the event history for the IAM user 'Bob' and the time range of the last 30 days, the administrator can identify the source IP addresses associated with each API call made using Bob's access keys. This directly meets the requirement to determine the source IP addresses of unauthorized actions.

Exam trap

The trap here is that candidates may confuse the IAM credential report (which shows credential metadata) with CloudTrail (which records actual API call details), leading them to choose the credential report for investigating source IPs when it only provides static credential status, not historical usage data.

How to eliminate wrong answers

Option B is wrong because VPC Flow Logs capture network traffic at the IP level (source/destination IPs, ports, protocols) but do not log IAM user identity or access key usage; they are used for analyzing network traffic patterns, not for tracking API calls by specific IAM users. Option C is wrong because Amazon CloudWatch Logs can store log data from various sources (e.g., application logs, system logs) but does not natively capture IAM user API call details or source IPs unless custom logging is configured; it is not the primary service for auditing IAM user activity. Option D is wrong because AWS IAM credential report provides information about the status of IAM user credentials (e.g., password last used, access key age, rotation status) but does not include source IP addresses or a history of API calls; it is used for credential auditing, not for investigating specific actions or source IPs.
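
As an illustration of the investigation step, the sketch below extracts distinct source IPs from records shaped like the output of the CloudTrail `lookup_events` API, where each record carries the full event as a JSON string under `CloudTrailEvent`. The sample records and IP addresses are made up for the example.

```python
import json


def source_ips(events):
    """Collect distinct source IPs from lookup_events-style records."""
    ips = set()
    for event in events:
        record = json.loads(event["CloudTrailEvent"])
        ip = record.get("sourceIPAddress")
        if ip:
            ips.add(ip)
    return sorted(ips)


# Made-up records; real ones would come from something like:
#   boto3.client("cloudtrail").lookup_events(
#       LookupAttributes=[{"AttributeKey": "Username", "AttributeValue": "Bob"}],
#       StartTime=start, EndTime=end)
sample_events = [
    {"Username": "Bob", "CloudTrailEvent": json.dumps(
        {"eventName": "ListBuckets", "sourceIPAddress": "198.51.100.7"})},
    {"Username": "Bob", "CloudTrailEvent": json.dumps(
        {"eventName": "RunInstances", "sourceIPAddress": "203.0.113.25"})},
    {"Username": "Bob", "CloudTrailEvent": json.dumps(
        {"eventName": "DescribeInstances", "sourceIPAddress": "198.51.100.7"})},
]

print(source_ips(sample_events))
```

A real lookup returns results in pages, so the call would need to follow `NextToken` (or use a paginator) to cover the full 30 days.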

139

MCQmedium

An application writes error logs to Amazon CloudWatch Logs. The SysOps administrator needs to monitor for the occurrence of the string 'ERROR' in the logs and trigger an Amazon SNS notification if more than 10 errors occur within a 5-minute window. The administrator also wants to visualize the error count over time. Which approach should be used to meet these requirements with the least operational overhead?

A.Create a CloudWatch Logs metric filter to count 'ERROR' entries, then create a CloudWatch alarm on that metric with a period of 5 minutes and a threshold of 10.

B.Use CloudWatch Logs Insights to run a query every 5 minutes and send notifications via a scheduled AWS Lambda function.

C.Create an AWS Lambda function that processes log events in real-time and publishes to Amazon SNS when the error count exceeds 10 in 5 minutes.

D.Use Amazon EventBridge to match log events with the pattern 'ERROR' and send them to an SNS topic.

AnswerA

Metric filters convert log data into CloudWatch metrics, enabling alarms and dashboards with minimal overhead.

Why this answer

Option A is correct because CloudWatch Logs metric filters can extract a count of 'ERROR' occurrences from incoming log events and emit a custom metric. A CloudWatch alarm on that metric with a period of 5 minutes and a threshold of 10 directly triggers an SNS notification when the error count exceeds 10 within the window, and the metric itself can be graphed in CloudWatch dashboards for visualization—all with minimal configuration and no custom code.

Exam trap

The trap here is that candidates may overcomplicate the solution by choosing Lambda or EventBridge, not realizing that CloudWatch Logs metric filters combined with CloudWatch alarms are the native, serverless, and lowest-overhead way to count substring occurrences and trigger alerts on aggregated thresholds.

How to eliminate wrong answers

Option B is wrong because running a CloudWatch Logs Insights query every 5 minutes via a scheduled Lambda function introduces unnecessary complexity, latency, and operational overhead compared to a real-time metric filter and alarm. Option C is wrong because creating a Lambda function to process log events in real-time adds custom code, scaling concerns, and maintenance burden when CloudWatch Logs metric filters and alarms natively provide the same functionality with zero code. Option D is wrong because Amazon EventBridge does not natively parse or count occurrences of a string like 'ERROR' within log events; it matches event patterns at the event level, not substring counts within log messages, and cannot aggregate counts over a time window.

140

MCQeasy

A SysOps administrator is troubleshooting an application that runs on an EC2 instance. The application is experiencing high latency, and the administrator suspects a memory leak. Which metrics should the administrator examine first?

A.Custom CloudWatch metrics published by the CloudWatch agent, such as mem_used_percent.

B.CloudWatch metrics from the Detailed Monitoring feature, such as DiskReadOps.

C.CloudWatch metrics for the instance's Elastic Network Interface.

D.CloudWatch default EC2 metrics, such as CPUUtilization and NetworkIn.

AnswerA

Memory metrics require the CloudWatch agent.

Why this answer

A memory leak causes the application to consume increasing amounts of memory over time, leading to high latency as the OS begins swapping or the kernel reclaims memory. The CloudWatch agent can publish custom metrics like `mem_used_percent`, which directly tracks memory usage percentage and is the most relevant metric to confirm a memory leak. Default EC2 metrics do not include memory utilization, so the administrator must rely on custom metrics from the CloudWatch agent.
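For illustration, a minimal CloudWatch agent configuration fragment that collects `mem_used_percent` could be generated as below. The 60-second collection interval is an assumed value, and a real configuration file would normally contain more sections than this sketch shows.

```python
import json

# Minimal agent configuration fragment that collects memory usage.
# The 60-second interval is an illustrative choice, not a requirement.
agent_config = {
    "metrics": {
        "metrics_collected": {
            "mem": {
                "measurement": ["mem_used_percent"],
                "metrics_collection_interval": 60,
            }
        }
    }
}

# Serialize to the JSON form that the agent reads from its config file.
config_json = json.dumps(agent_config, indent=2)
print(config_json)
```

By default the agent publishes these metrics under the `CWAgent` namespace, which is where the administrator would look for `mem_used_percent`.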

Exam trap

The trap here is that candidates assume default EC2 metrics include memory utilization, but AWS does not provide guest OS memory metrics by default; you must install the CloudWatch agent to capture them.

How to eliminate wrong answers

Option B is wrong because DiskReadOps measures disk I/O operations, not memory usage; it would not help identify a memory leak. Option C is wrong because Elastic Network Interface metrics track network throughput and packet counts, which are unrelated to memory consumption. Option D is wrong because default EC2 metrics like CPUUtilization and NetworkIn do not include memory metrics; EC2 does not expose guest OS memory usage without the CloudWatch agent.

141

Multi-Selecthard

A SysOps administrator needs to ensure that all API calls in the AWS account are logged for auditing purposes. The administrator also wants to receive notifications when specific API calls are made. Which THREE services should the administrator use together to achieve this? (Choose THREE.)

A.Amazon EventBridge

B.AWS Config

C.Amazon Simple Notification Service (SNS)

D.Amazon CloudWatch Logs

E.AWS CloudTrail

AnswersA, D, E

EventBridge can filter CloudTrail events and trigger actions like SNS notifications.

Why this answer

AWS CloudTrail (Option E) is the primary service for logging all API calls in an AWS account, capturing detailed event records. Amazon CloudWatch Logs (Option D) can receive these CloudTrail events for centralized log storage and monitoring. Amazon EventBridge (Option A) can then be used to create rules that match specific API calls and trigger notifications, such as sending messages to an SNS topic, enabling real-time alerting.
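To make the EventBridge step concrete, the sketch below shows an event pattern for API calls recorded by CloudTrail, plus a deliberately simplified local matcher. `RunInstances` is only an example API call, and the `matches` helper is a hypothetical illustration that handles exact-value lists and nested objects; it is not how EventBridge itself is implemented.

```python
# Event pattern for an EventBridge rule. RunInstances is an illustrative
# choice; any API call name recorded by CloudTrail could be listed.
event_pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["ec2.amazonaws.com"],
        "eventName": ["RunInstances"],
    },
}


def matches(pattern, event):
    """Simplified matcher: supports exact-value lists and nested objects only."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: the event field must be an object that matches too.
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf pattern: the event value must equal one of the listed values.
            return False
    return True


# Trimmed-down example of an event as EventBridge would deliver it.
sample_event = {
    "source": "aws.ec2",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {
        "eventSource": "ec2.amazonaws.com",
        "eventName": "RunInstances",
    },
}

print(matches(event_pattern, sample_event))
```

A rule with this pattern would list the SNS topic as its target, so that each matching API call produces a notification.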

Exam trap

The trap here is that candidates often confuse AWS Config with CloudTrail, thinking Config logs API calls, when in fact Config only tracks configuration changes and compliance, not the API calls that caused those changes.

142

MCQmedium

A SysOps administrator needs to monitor application logs stored in Amazon CloudWatch Logs for the term 'CRITICAL'. When more than 5 'CRITICAL' entries appear in a 5-minute window, the administrator wants to automatically restart the underlying Amazon EC2 instance. Which solution should the administrator implement?

A.Create a CloudWatch Logs metric filter, then a CloudWatch alarm that triggers an AWS Systems Manager Automation document to restart the instance.

B.Create a CloudWatch Logs metric filter, then a CloudWatch alarm that triggers an EC2 Reboot Instances action.

C.Create a CloudWatch Logs metric filter, then use Amazon CloudWatch Events (Amazon EventBridge) to trigger an AWS Lambda function that restarts the instance.

D.Use Amazon CloudWatch Synthetics canary to monitor the logs and automatically stop the instance.

AnswerB

CloudWatch alarms support EC2 actions including reboot, which is the simplest way to restart the instance based on a metric.

Why this answer

Option B is correct because CloudWatch Logs metric filters can count occurrences of the term 'CRITICAL' in log data, and a CloudWatch alarm can be configured to trigger an EC2 Reboot Instances action directly when the metric exceeds a threshold of 5 in a 5-minute period. This provides a native, simple, and fully managed solution without requiring additional services like Lambda or Systems Manager.
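A sketch of the alarm request for option B follows. The reboot action is referenced by an ARN of the form `arn:aws:automate:<region>:ec2:reboot`. The namespace, metric name, alarm name, and instance ID are hypothetical, and the sketch assumes the metric carries an `InstanceId` dimension, which EC2 alarm actions generally need in order to know which instance to act on.

```python
def reboot_action_arn(region: str) -> str:
    """Build the ARN of the built-in CloudWatch alarm action that reboots an instance."""
    return f"arn:aws:automate:{region}:ec2:reboot"


# Request for cloudwatch.put_metric_alarm. All names and the instance ID are
# placeholders; the InstanceId dimension is an assumption about how the
# metric filter was set up.
alarm = {
    "AlarmName": "critical-log-entries",
    "Namespace": "App/Logs",
    "MetricName": "CriticalCount",
    "Dimensions": [{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],
    "Statistic": "Sum",
    "Period": 300,
    "EvaluationPeriods": 1,
    "Threshold": 5,
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": [reboot_action_arn("us-east-1")],
}

print(alarm["AlarmActions"][0])
```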

Exam trap

The trap here is that candidates may overcomplicate the solution by choosing Lambda or Systems Manager, not realizing that CloudWatch alarms have a built-in EC2 action for reboot, stop, terminate, or recover, which is the simplest and most cost-effective method for this use case.

How to eliminate wrong answers

Option A is wrong because while a CloudWatch alarm can trigger an AWS Systems Manager Automation document, the EC2 Reboot Instances action is a direct alarm target and does not require Systems Manager Automation, which adds unnecessary complexity and potential latency. Option C is wrong because using CloudWatch Events (EventBridge) to invoke a Lambda function to restart the instance is an over-engineered approach; the EC2 Reboot Instances action is a built-in alarm target that eliminates the need for custom code. Option D is wrong because CloudWatch Synthetics canaries are designed for synthetic monitoring of endpoints and web applications, not for analyzing existing CloudWatch Logs for specific terms like 'CRITICAL'.

143

MCQmedium

Refer to the exhibit. The command returns no events for RunInstances during the specified time period. The administrator knows that instances were launched during that time. What is the most likely cause?

A.CloudTrail logs are being delivered to an S3 bucket, not to CloudWatch Logs.

B.The command is run in the wrong AWS Region.

C.CloudTrail is not configured to log management events.

D.The IAM user does not have permission to view CloudTrail events.

AnswerC

If management events are not logged, RunInstances won't appear.

Why this answer

Option C is correct because CloudTrail can be configured to log either management events, data events, or both. If only data events are logged, management events such as RunInstances will not appear in the CloudTrail event history. The command `aws cloudtrail lookup-events` queries the CloudTrail event history, which only contains events that CloudTrail is configured to record.

Since the administrator knows instances were launched but no events are returned, the most likely cause is that CloudTrail is not configured to log management events.

Exam trap

The trap here is that candidates assume CloudTrail always logs all API calls by default, but they overlook that CloudTrail can be configured to exclude management events, and the `lookup-events` command only returns events that CloudTrail is actually recording.

How to eliminate wrong answers

Option A is wrong because CloudTrail logs are delivered to an S3 bucket for long-term storage, but the `lookup-events` command queries the CloudTrail event history, which is a separate, queryable view of the last 90 days of events regardless of whether they are also delivered to S3 or CloudWatch Logs. Option B is wrong because if the command were run in the wrong AWS Region, it would return events from that region, but the administrator knows instances were launched in the region where the command is run; the issue is that no events are returned at all, not that events from a different region appear. Option D is wrong because the IAM user needs permission to call `cloudtrail:LookupEvents`, but if the user lacked that permission, the command would return an access denied error, not an empty result set.

144

MCQeasy

A SysOps administrator notices that an EC2 instance's CPU utilization has been above 90% for the past hour. The instance is part of an Auto Scaling group with a CPU utilization-based scaling policy. However, no new instances have been launched. What is the most likely cause?

A.The Auto Scaling cooldown period is preventing additional scaling activities.

B.The EC2 instance is in a private subnet and cannot communicate with the Auto Scaling service.

C.The CloudWatch alarm is publishing to an S3 bucket that is full.

D.The scaling policy is based on memory utilization, not CPU.

AnswerA

A cooldown period after a previous scaling event can prevent new scaling actions.

Why this answer

The most likely cause is that the Auto Scaling cooldown period is preventing additional scaling activities. When a scaling activity completes, a cooldown period (default 300 seconds) starts during which the Auto Scaling group ignores additional CloudWatch alarms to allow metrics to stabilize. If the instance has been above 90% CPU for an hour but no new instances launched, the cooldown period may have been triggered by a previous scaling event and is still active, blocking further scale-out actions despite sustained high utilization.

Exam trap

The trap here is that candidates often assume high CPU utilization always triggers immediate scaling, overlooking the cooldown period that can delay or block subsequent scaling activities even when alarms are in ALARM state.

How to eliminate wrong answers

Option B is wrong because EC2 instances in a private subnet can still communicate with the Auto Scaling service via a VPC endpoint or NAT gateway; the instance's subnet type does not prevent the Auto Scaling group from launching new instances. Option C is wrong because CloudWatch alarms publish to SNS topics, not S3 buckets; an S3 bucket being full has no impact on alarm delivery or scaling policy execution. Option D is wrong because the question explicitly states that the scaling policy is based on CPU utilization, so a memory-based policy is ruled out by the scenario itself.

145

MCQhard

An application running on EC2 instances behind an Application Load Balancer (ALB) sends custom metrics to CloudWatch. The team wants to set an alarm that triggers when the error rate exceeds 5% over a 5-minute period. The alarm must evaluate the metric every minute. Which configuration is required?

A.Period = 300 seconds, Statistic = Average, Evaluation Periods = 1, Datapoints to Alarm = 1

B.Period = 300 seconds, Statistic = Sum, Evaluation Periods = 1, Datapoints to Alarm = 1

C.Period = 60 seconds, Statistic = Average, Evaluation Periods = 5, Datapoints to Alarm = 5

D.Period = 60 seconds, Statistic = Sum, Evaluation Periods = 5, Datapoints to Alarm = 5

AnswerC

This checks that all 5 datapoints exceed 5% over 5 minutes.

Why this answer

Option C is correct because the alarm must evaluate the error rate every minute (period = 60 seconds) over a 5-minute window. With evaluation periods = 5 and datapoints to alarm = 5, the alarm requires all five 1-minute datapoints to exceed the 5% threshold, ensuring the error rate is sustained for the full 5-minute period. The Average statistic is appropriate because the error rate is a percentage metric that should be averaged over each period.
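The interaction between Evaluation Periods and Datapoints to Alarm can be illustrated with a small simulation. This is a simplified model written for this explanation (it ignores missing data and the INSUFFICIENT_DATA state), and the sample values are invented.

```python
def alarm_state(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Return "ALARM" if at least `datapoints_to_alarm` of the most recent
    `evaluation_periods` datapoints exceed `threshold`, otherwise "OK".

    Simplified model: missing data and INSUFFICIENT_DATA are not handled.
    """
    window = datapoints[-evaluation_periods:]
    breaching = sum(1 for value in window if value > threshold)
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"


# One datapoint per 60-second period; values are error-rate percentages.
sustained = [6.1, 7.0, 5.5, 8.2, 6.4]  # every minute is above 5%
spike = [1.0, 1.2, 9.0, 1.1, 0.9]      # only one minute is above 5%

print(alarm_state(sustained, 5.0, 5, 5))
print(alarm_state(spike, 5.0, 5, 5))
```

With 5 of 5 datapoints required, only the sustained series puts the alarm into ALARM; a single spike does not.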

Exam trap

The trap here is that candidates often confuse 'period' with the total evaluation window, selecting period = 300 seconds (option A or B) thinking it covers the 5-minute window, but this fails the requirement to evaluate every minute, and they may also incorrectly choose Sum instead of Average for a percentage metric.

How to eliminate wrong answers

Option A is wrong because period = 300 seconds means the metric is evaluated only once every 5 minutes, not every minute as required, and evaluation periods = 1 would trigger the alarm on a single 5-minute datapoint, not a sustained condition. Option B is wrong because period = 300 seconds again fails the 1-minute evaluation requirement, and using Sum for a percentage metric would incorrectly aggregate error counts rather than averaging the rate. Option D is wrong because while period = 60 seconds and evaluation periods = 5 are correct, using Sum instead of Average would sum the error rate values across datapoints, which is meaningless for a percentage metric and would not correctly reflect the 5% threshold.

146

MCQmedium

A SysOps administrator manages an application that runs on Amazon EC2 instances and stores critical data in Amazon Elastic Block Store (EBS) volumes. The administrator needs to monitor the EBS volumes for any performance bottlenecks. The key metric of interest is the average number of I/O operations per second (IOPS) that are waiting to be completed. Which Amazon CloudWatch metric should the administrator examine?

A.VolumeQueueLength

B.VolumeReadOps

C.VolumeIdleTime

D.VolumeTotalReadTime

AnswerA

This metric shows the number of pending I/O operations waiting to be serviced. A high value indicates a bottleneck.

Why this answer

The VolumeQueueLength metric measures the number of pending I/O requests waiting to be serviced by an EBS volume. A high value indicates that the volume is unable to keep up with the I/O demand, which is the direct indicator of a performance bottleneck related to IOPS waiting. This makes it the correct metric for the administrator's stated goal.

Exam trap

The trap here is that candidates confuse 'operations waiting' (VolumeQueueLength) with 'operations completed' (VolumeReadOps/VolumeWriteOps), assuming a high read count indicates a bottleneck when it actually indicates throughput.

How to eliminate wrong answers

Option B (VolumeReadOps) is wrong because it counts the total number of read I/O operations completed, not the number waiting; it measures throughput, not queue depth. Option C (VolumeIdleTime) is wrong because it indicates the time the volume had no pending I/O, which is the opposite of a bottleneck condition. Option D (VolumeTotalReadTime) is wrong because it aggregates the total time spent on read operations, not the number of operations waiting in the queue.

147

MCQmedium

A SysOps administrator is troubleshooting an issue where an EC2 instance running a web server is unreachable. The instance passes status checks and is in a healthy state. Security groups and network ACLs are configured correctly. CloudWatch metrics show CPU utilization is 5%. The administrator can SSH into the instance but cannot connect to the web server on port 443. What is the most likely cause?

A.The security group inbound rule for HTTPS is misconfigured.

B.The instance has an incorrect route table entry.

C.The web server service is not running or crashed.

D.The instance has insufficient CPU credits.

AnswerC

The application may have failed to start after boot or crashed, which would not be detected by EC2 status checks.

Why this answer

The instance passes both status checks and is healthy, and the administrator can SSH into it, confirming that the operating system and network stack are functional. Since the web server is unreachable on port 443 despite correct security group and network ACL configurations, and CPU utilization is low (5%), the most likely cause is that the web server service (e.g., Apache, Nginx) has stopped or crashed. This would prevent the instance from listening on port 443, even though the underlying infrastructure is sound.
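One quick way to confirm this diagnosis is to test whether anything accepts TCP connections on the port. The helper below is a minimal sketch using only the Python standard library; `port_is_open` is a name invented for this example.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Run from the instance itself over SSH, `port_is_open("127.0.0.1", 443)` returning `False` would suggest that no process is listening on port 443, which is consistent with a stopped or crashed web server rather than a network filtering problem.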

Exam trap

The trap here is that candidates often assume a reachability issue must be a network configuration problem (security group or route table), but the combination of successful SSH and failed HTTPS on a low-CPU, healthy instance points directly to the application service not running.

How to eliminate wrong answers

Option A is wrong because the security group inbound rule for HTTPS is explicitly stated to be configured correctly, and SSH (port 22) works, indicating no network-level filtering issue. Option B is wrong because an incorrect route table entry would affect all traffic to/from the instance, not just port 443, and SSH connectivity would also fail. Option D is wrong because CPU utilization is only 5%, which is well below the threshold for credit exhaustion, and the instance passes status checks, ruling out a performance-based bottleneck.

148

MCQeasy

A SysOps administrator needs to track changes to IAM policies in the AWS account for auditing purposes. Which service should be used?

A.IAM Access Analyzer

B.AWS Config

C.AWS CloudTrail

D.Amazon CloudWatch

AnswerC

CloudTrail logs all API calls for auditing.

Why this answer

AWS CloudTrail is the correct service because it records API calls made to IAM, including changes to IAM policies (e.g., CreatePolicy, PutRolePolicy, AttachUserPolicy). These logs are stored in a CloudTrail trail and can be delivered to Amazon S3 for long-term auditing. CloudTrail is specifically designed for auditing API activity, making it the appropriate choice for tracking policy modifications.

Exam trap

The trap here is that candidates often confuse AWS Config (which tracks resource configuration changes) with CloudTrail (which tracks API calls), leading them to choose Config for auditing IAM policy changes when Config only records the resulting state, not the action that caused it.

How to eliminate wrong answers

Option A is wrong because IAM Access Analyzer analyzes resource-based policies to identify unintended public or cross-account access, but it does not track changes to IAM policies over time. Option B is wrong because AWS Config evaluates resource configurations and compliance rules, but it does not record API-level changes to IAM policies; it tracks configuration state, not the history of policy modifications. Option D is wrong because Amazon CloudWatch monitors metrics and logs, but it is not designed to capture IAM API calls or policy changes; it can consume CloudTrail logs for alerting, but it is not the primary auditing service.

149

MCQhard

An application running on EC2 instances behind an Application Load Balancer (ALB) is experiencing increased latency. The SysOps administrator checks CloudWatch and sees that the ALB's TargetResponseTime is high, but the backend EC2 instance's CPUUtilization and MemoryUtilization are low. What is the most likely cause?

A.The ALB is experiencing connection queuing due to a high number of concurrent requests.

B.The application is CPU-bound on the EC2 instances.

C.The EC2 instances are throttled due to a burst balance limit.

D.The application is waiting on a database query that is slow.

AnswerA

High concurrency can cause the ALB to queue requests, increasing response time without backend load.

Why this answer

When TargetResponseTime is high but backend CPU and memory are low, the bottleneck is not the compute capacity of the instances but rather the load balancer's ability to forward requests. When the number of concurrent requests exceeds what can be forwarded at once, new requests are queued, which increases response time without stressing the backend. This is a classic sign of connection queuing at the load balancer level.

Exam trap

The trap here is that candidates assume high latency always means the backend is overloaded, but the question deliberately shows low CPU/memory to force you to consider load balancer-level queuing as the root cause.

How to eliminate wrong answers

Option B is wrong because if the application were CPU-bound, CPUUtilization would be high, not low. Option C is wrong because EC2 instances do not have a 'burst balance limit'; that concept applies to EBS volumes (burst balance for gp2) or T2/T3 instances (CPU credits), not to throttling of the instance itself. Option D is wrong because a slow database query would cause high CPU or memory on the EC2 instance while waiting (e.g., connection pool exhaustion or thread blocking), not low utilization.

150

MCQmedium

A company uses AWS CloudFormation to deploy a stack that includes an EC2 instance and an S3 bucket. The SysOps administrator needs to monitor the stack for any changes to the S3 bucket's bucket policy. Which AWS service should be used?

A.Amazon CloudWatch

B.AWS Config

C.AWS CloudTrail

D.AWS Trusted Advisor

AnswerB

AWS Config can track changes to S3 bucket policies and trigger rules.

Why this answer

Option B is correct because AWS Config can monitor changes to S3 bucket policies and trigger notifications. Option C is incorrect because CloudTrail logs the API calls that change the policy, but does not monitor the policy itself. Option A is incorrect because CloudWatch is for performance metrics.

Option D is incorrect because Trusted Advisor does not monitor bucket policies.
