Knowledge + Practice

CCNA Cloud Operations Support Questions

75 of 193 questions · Page 2/3 · Cloud Operations Support topic · Answers revealed


76

MCQ · easy

A cloud engineer notices that a virtual machine (VM) in a public cloud environment is consistently running at 90% CPU during business hours. The VM hosts a customer-facing web application. Which of the following is the BEST initial troubleshooting step?

A. Migrate the VM to a different availability zone.

B. Review the VM's performance metrics and application logs.

C. Reboot the VM to reset resource usage.

D. Scale up the VM to a larger instance size.

Answer: B

Reviewing metrics and logs is the standard first step in troubleshooting.

Why this answer

Option B is correct because the initial step in troubleshooting high CPU usage is to gather diagnostic data. Reviewing the VM's performance metrics (e.g., CPU utilization, memory, disk I/O) and application logs helps identify whether the issue is caused by a legitimate workload spike, a memory leak, or a misconfiguration. This aligns with the 'identify before act' principle in cloud operations, ensuring the engineer understands the root cause before making changes.

Exam trap

The trap here is that candidates often jump to a 'fix' like scaling up or rebooting, but the exam tests the foundational troubleshooting methodology of 'gather data first' to avoid unnecessary changes and ensure the solution is targeted and cost-effective.

How to eliminate wrong answers

Option A is wrong because migrating the VM to a different availability zone does not address high CPU usage; it only changes the physical location, which may introduce latency or availability issues without resolving the performance bottleneck. Option C is wrong because rebooting the VM is a disruptive action that only temporarily resets resource usage; it does not diagnose or fix the underlying cause, and it can lead to application downtime for a customer-facing web app. Option D is wrong because scaling up to a larger instance size is a reactive measure that may mask the problem without investigation; it increases costs and could be unnecessary if the issue is due to a software bug or misconfiguration.


77

MCQ · hard

A company uses a hybrid cloud model with an on-premises data center and a public cloud. The network team reports that traffic between the cloud and on-premises is experiencing high latency and packet loss. The cloud administrator verifies that the VPN connection is up. What is the most likely cause?

A. A firewall rule is blocking ICMP packets.

B. VMs are placed in different cloud regions.

C. The VPN tunnel has a mismatched MTU size.

D. The cloud provider is throttling bandwidth.

Answer: C

Mismatched MTU causes fragmentation and packet loss.

Why this answer

When a VPN tunnel is up but traffic experiences high latency and packet loss, a mismatched Maximum Transmission Unit (MTU) size is a common cause. This occurs because packets larger than the tunnel's MTU must be fragmented, and if fragmentation is not properly handled (e.g., due to the DF bit being set), packets are dropped, leading to retransmissions and increased latency. The symptoms align with MTU issues rather than simple connectivity or throttling problems.

Exam trap

The trap here is that candidates assume a 'VPN is up' means all traffic flows perfectly, but CompTIA often tests the subtlety that MTU mismatch causes performance degradation without breaking the tunnel itself, leading them to incorrectly blame firewall rules or bandwidth throttling.

How to eliminate wrong answers

Option A is wrong because ICMP packets are not required for VPN tunnel operation; blocking ICMP would cause ping failures but not necessarily high latency and packet loss on data traffic, and the VPN is already verified as up. Option B is wrong because VMs in different cloud regions would affect latency between those VMs, but the question specifies traffic between the cloud and on-premises data center, which is routed through the VPN tunnel regardless of VM placement. Option D is wrong because cloud providers typically throttle bandwidth based on usage limits or burst credits, which would manifest as reduced throughput rather than the combination of high latency and packet loss described.
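
The MTU behavior described above can be sketched as a small decision function. This is only an illustration: the `Packet` class, the `tunnel_outcome` function, and the 1400-byte tunnel MTU are assumptions made for the example, not values from the question.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    size: int            # total IP packet size in bytes
    dont_fragment: bool  # value of the IP DF bit


def tunnel_outcome(packet: Packet, tunnel_mtu: int) -> str:
    """Return what happens to a packet entering a tunnel with the given MTU."""
    if packet.size <= tunnel_mtu:
        return "forwarded"
    if packet.dont_fragment:
        # Oversized and not allowed to fragment: the packet is dropped.
        return "dropped"
    return "fragmented"


# Hypothetical numbers: 1500-byte packets meeting a 1400-byte tunnel MTU.
TUNNEL_MTU = 1400
small = tunnel_outcome(Packet(size=1200, dont_fragment=True), TUNNEL_MTU)
large_df = tunnel_outcome(Packet(size=1500, dont_fragment=True), TUNNEL_MTU)
large = tunnel_outcome(Packet(size=1500, dont_fragment=False), TUNNEL_MTU)
print(small, large_df, large)  # forwarded dropped fragmented
```

Small packets pass untouched, which is why the tunnel still reports as up while larger data packets are fragmented or dropped.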


78

MCQ · easy

Refer to the exhibit. A cloud administrator runs this command on a VM. Which of the following is most likely causing the high 'wa' value?

A. Disk I/O bottleneck

B. Network congestion

C. High CPU load

D. Insufficient memory

Answer: A

High 'wa' indicates CPU waiting for disk I/O, suggesting a bottleneck.

Why this answer

Option A is correct because a high 'wa' (I/O wait) value in vmstat indicates that the CPU is idle while waiting for I/O operations to complete, pointing to a disk I/O bottleneck. Option D (insufficient memory) would show swapping activity, with non-zero si/so values. Option C (high CPU load) would show high us or sy values. Option B (network congestion) is not directly indicated by this metric.
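
As a rough sketch of how the vmstat columns mentioned above map to a diagnosis, the following parses one sample row and applies simple triage rules. The sample row is made up (it is not the exhibit), and the function names and thresholds are assumptions chosen for illustration.

```python
def parse_vmstat(header: str, values: str) -> dict[str, int]:
    """Pair vmstat column names with the integers on a sample row."""
    return dict(zip(header.split(), map(int, values.split())))


def likely_bottleneck(sample: dict[str, int], wa_threshold: int = 20) -> str:
    """Rough triage from one vmstat sample; thresholds are assumptions."""
    if sample["si"] > 0 or sample["so"] > 0:
        return "memory pressure (swapping)"
    if sample["wa"] >= wa_threshold:
        return "disk I/O wait"
    if sample["us"] + sample["sy"] >= 80:
        return "CPU load"
    return "no obvious bottleneck"


# Made-up sample row, not the exhibit from the question.
HEADER = "r b swpd free buff cache si so bi bo in cs us sy id wa st"
VALUES = "1 6 0 812340 10240 402112 0 0 9120 8450 610 1250 5 3 32 60 0"

sample = parse_vmstat(HEADER, VALUES)
print(likely_bottleneck(sample))  # disk I/O wait
```

A single sample is only a hint; in practice the values would be watched over several intervals before drawing a conclusion.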


79

MCQ · easy

A company uses a cloud load balancer to distribute traffic to web servers. The load balancer health checks are failing for all instances. The instances are running and can be accessed directly via their private IPs from within the VPC. What is the most likely cause?

A. The load balancer's cross-zone load balancing is disabled.

B. The load balancer's listeners are configured on the wrong ports.

C. The security group of the instances is not allowing traffic from the load balancer.

D. The instances are not registered with the target group.

Answer: C

The load balancer sends health checks from a specific source; if the security group doesn't allow it, health checks fail.

Why this answer

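Option C is correct because the instances' security group must allow inbound health check traffic from the load balancer. Option A is wrong because cross-zone load balancing affects how traffic is distributed, not whether health checks pass. Option B is wrong because listeners govern client-facing traffic, while health checks are configured separately. Option D is wrong because instances that are not registered would not be health-checked at all.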


80

MCQ · easy

A company recently migrated its on-premises backup server to a cloud virtual machine running Windows Server with a dedicated data disk for backups. The backup software is configured to write to a folder on the data disk. After two weeks, the backup jobs start failing with 'disk full' errors. The cloud engineer logs into the VM and verifies that the data disk has 500 GB of total space and the backup folder shows only 300 GB used. However, the operating system reports the disk as 100% full. The engineer also notices that the recycle bin on the data disk appears to be empty. Which of the following is the MOST likely cause of the discrepancy?

A. Shadow copies (Volume Shadow Copies) are consuming space on the data disk

B. The cloud provider has imposed a quota on the disk's storage capacity that is lower than the provisioned size

C. The recycle bin on the data disk contains deleted backup files that are not counted

D. The backup software is compressing data, causing the disk to appear full due to index fragmentation

Answer: A

VSS snapshots can consume disk space without appearing in the backup folder or recycle bin.

Why this answer

Option A is correct: Volume Shadow Copy Service (VSS) snapshots are stored in a hidden system area of the volume, so they consume disk space without appearing in the backup folder. Option B is wrong: the operating system sees the full 500 GB disk, so a provider-side quota is not what is limiting it. Option C is wrong: the engineer already found the recycle bin empty, and the recycle bin is separate from VSS storage. Option D is wrong: compression would reduce used space, not increase it.


81

Drag & Drop · medium

Arrange the steps to configure auto-scaling for a group of virtual machines based on CPU utilization.


Why this order

First define the template, then the group, then the scaling policy, attach it, and test.


82

Multi-Select · medium

Which THREE of the following are common causes of application performance degradation in a cloud environment? (Choose three.)

Select 3 answers

A. Insufficient number of security groups

B. Network latency and bandwidth limitations

C. Application code that is poorly optimized

D. Resource exhaustion (CPU, memory, disk I/O)

E. Overprovisioned storage

Answers: B, C, D

High latency or low bandwidth can cause delays in data transmission.

Why this answer

Options B, C, and D are correct. Network latency and bandwidth limitations (B) slow down data transfer. Poorly optimized application code (C) causes inefficiencies. Resource exhaustion of CPU, memory, or disk I/O (D) directly impacts performance. Option A is wrong because the number of security groups does not affect performance; security groups define traffic flow rules. Option E is wrong because overprovisioned storage means excess capacity, which does not degrade performance; underprovisioned storage would.


83

MCQ · hard

An organization is deploying a multi-tier application in the cloud. The web tier uses auto scaling, and the database tier uses a managed database service. During a load test, the web tier scales up correctly, but the database performance degrades significantly, causing timeout errors. The administrator reviews the database metrics and finds that CPU and memory are normal, but the number of connections is high. Which of the following is the BEST action to resolve the issue?

A. Add read replicas to offload read traffic from the primary database.

B. Increase the maximum number of web servers in the auto scaling group.

C. Increase the compute size of the database instance.

D. Implement connection pooling on the web servers to limit database connections.

Answer: D

Connection pooling reduces the number of simultaneous database connections, improving performance.

Why this answer

The issue is that the database is overwhelmed by a high number of connections, not by CPU or memory pressure. Connection pooling on the web servers reuses a fixed set of database connections, reducing the connection overhead and preventing the database from hitting its maximum connection limit. This directly addresses the symptom of high connection count without changing the database's compute capacity or the web tier's scaling behavior.

Exam trap

The trap here is that candidates confuse high connection count with high CPU/memory load and choose to scale the database vertically (Option C), when the real issue is connection exhaustion that is solved by pooling.

How to eliminate wrong answers

Option A is wrong because read replicas only offload read queries, but the problem is connection count, not read load; the database is likely experiencing connection exhaustion from both reads and writes. Option B is wrong because increasing the maximum number of web servers would increase the number of concurrent database connections, worsening the problem. Option C is wrong because CPU and memory are normal, so increasing compute size does not address the connection limit; the bottleneck is the maximum number of allowed connections, not resource saturation.
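
To make the pooling idea above concrete, here is a minimal sketch of a fixed-size pool shared by many concurrent requests. `FakeConnection` is a hypothetical stand-in for a real database driver, and the pool size of 5 and the 100 requests are arbitrary; a production service would normally rely on the pooling built into its driver or framework.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""
    created = 0
    _lock = threading.Lock()

    def __init__(self) -> None:
        with FakeConnection._lock:
            FakeConnection.created += 1

    def execute(self, sql: str) -> str:
        return f"ok: {sql}"


class ConnectionPool:
    """Fixed-size pool: callers borrow a connection and hand it back."""

    def __init__(self, size: int) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(FakeConnection())

    def run(self, sql: str) -> str:
        conn = self._idle.get()       # blocks while all connections are busy
        try:
            return conn.execute(sql)
        finally:
            self._idle.put(conn)      # always return the connection


pool = ConnectionPool(size=5)
results = []
results_lock = threading.Lock()


def handle_request(i: int) -> None:
    out = pool.run(f"SELECT {i}")
    with results_lock:
        results.append(out)


threads = [threading.Thread(target=handle_request, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(results), "requests served over", FakeConnection.created, "connections")
```

However many web requests arrive, the database only ever sees the five pooled connections; excess requests wait briefly for a free one instead of opening new connections.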


84

MCQ · easy

An organization is migrating a legacy application to a public cloud. The application requires a static IP address that does not change after a VM is stopped. Which of the following should the cloud architect use to meet this requirement?

A. Reserved IP address

B. Elastic IP address

C. Ephemeral IP address

D. Floating IP address

Answer: A

Reserved IPs are static and remain allocated to the account even when the VM is off.

Why this answer

A reserved IP address is a static, persistent public IP that remains assigned to a VM even when the VM is stopped or deallocated. In public cloud platforms like Azure, a reserved IP (or static IP) is explicitly set to not change across VM lifecycle events, meeting the requirement for a fixed address that survives stop/deallocate operations.

Exam trap

CompTIA often tests the distinction between cloud-agnostic terminology and vendor-specific terms (e.g., 'Reserved IP' vs 'Elastic IP'), trapping candidates who pick a correct AWS concept but fail to recognize the generic term required by the exam.

How to eliminate wrong answers

Option B (Elastic IP address) is wrong because Elastic IP is an AWS-specific term for a static public IP that persists across stop/start, but it is not the generic cloud term used in the CV0-004 exam; the question asks for the general concept, and 'Reserved IP' is the correct cross-platform term. Option C (Ephemeral IP address) is wrong because an ephemeral IP is temporary and is released when the VM is stopped or deallocated, directly contradicting the requirement for a static IP. Option D (Floating IP address) is wrong because a floating IP is typically used in OpenStack or on-premises environments for dynamic remapping between instances, not for a persistent static IP that remains attached to a single VM across stop/start cycles.


85

MCQ · hard

A cloud operations team is troubleshooting a performance issue with a database that is running on a virtual machine. The database is experiencing high latency during peak hours. Metrics show that CPU and memory usage are below 50%, but disk I/O latency spikes. The database is hosted on a cloud provider's virtual machine with premium SSDs. Which of the following is the MOST likely cause of the disk I/O latency?

A. The OS is paging memory to disk due to insufficient RAM.

B. The virtual machine's network bandwidth is saturated.

C. The disk IOPS limit is being exceeded during peak loads.

D. The database application is CPU-bound.

Answer: C

Exceeding provisioned IOPS results in throttling, causing increased latency.

Why this answer

Option C is correct because exceeding the disk's provisioned IOPS limit causes throttling and latency; premium SSDs are fast, but they still have an IOPS cap. Option A is wrong because memory usage is below 50%, so the OS has no reason to page to disk. Option B is wrong because network bandwidth doesn't directly cause disk I/O latency. Option D is wrong because CPU usage is below 50%, indicating no processing bottleneck.
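
A toy model can show why exceeding an IOPS cap appears as latency rather than as high CPU or memory. The function below is a simplified sketch, assuming that requests above the per-second limit simply queue; the 5000 IOPS limit and the workload figures are hypothetical.

```python
def simulate_backlog(demand_per_sec: list[int], iops_limit: int) -> list[int]:
    """Return the queued I/O count at the end of each second.

    Requests above the per-second limit wait in a queue, which is what
    shows up as rising disk latency.
    """
    backlog = 0
    history = []
    for demand in demand_per_sec:
        pending = backlog + demand
        served = min(pending, iops_limit)
        backlog = pending - served
        history.append(backlog)
    return history


# Hypothetical workload: quiet, a three-second peak, then quiet again.
demand = [3000, 3000, 8000, 8000, 8000, 3000, 3000]
print(simulate_backlog(demand, iops_limit=5000))
# [0, 0, 3000, 6000, 9000, 7000, 5000]
```

The queue keeps draining for several seconds after the peak ends, so latency stays elevated even once demand has dropped back below the limit.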


86

Multi-Select · easy

Which TWO of the following are benefits of using a content delivery network (CDN) with cloud-hosted applications? (Choose two.)

Select 2 answers

A. Reduces load on the origin server

B. Simplifies application architecture

C. Eliminates the need for HTTPS encryption

D. Reduces latency for end users by caching content at edge locations

E. Lowers overall infrastructure cost

Answers: A, D

CDN serves cached requests, reducing origin traffic.

Why this answer

Options A and D are correct. A CDN reduces latency by caching content at edge locations close to users (D) and, because cached requests are served from the edge, reduces load on the origin server (A). Option B is wrong because a CDN typically adds complexity rather than simplifying the architecture. Option C is wrong because a CDN does not remove the need for HTTPS; traffic to the edge still has to be encrypted. Option E is wrong because a CDN may add cost.


87

MCQ · medium

A company has a cloud-based application that uses an auto-scaling group across multiple Availability Zones (AZs). The application experiences periodic spikes in traffic. The auto-scaling policy uses a step scaling policy based on CPU utilization. The operations team notices that during a traffic spike, new instances are launched but take over five minutes to become healthy and begin serving traffic. During this time, existing instances are overloaded and some requests fail. The team wants to reduce the time it takes for new instances to handle traffic. Which action would be most effective?

A. Increase the instance type to a larger size so each instance can handle more traffic.

B. Use a pre-warmed, customized AMI with the application pre-installed and caches preloaded.

C. Reduce the cooldown period in the scaling policy to launch instances faster.

D. Move all instances to a single AZ to avoid cross-AZ latency.

Answer: B

A pre-warmed AMI reduces startup time by eliminating installation and cache warming steps.

Why this answer

Using a pre-warmed AMI with the application stack already configured and caching enabled significantly reduces the time required for an instance to become healthy because it avoids the need to install dependencies and warm caches on each startup.


88

MCQ · medium

A cloud administrator runs the command shown in the exhibit on a storage node in a hyper-converged cluster. The node is experiencing intermittent I/O errors and degraded performance. Based on the SMART data, what is the most likely cause of the issue?

A. The storage network is experiencing high latency.

B. The filesystem is corrupted and requires a repair.

C. The disk has developed bad sectors and is likely failing.

D. A RAID array is rebuilding, causing performance degradation.

Answer: C

The elevated reallocated and pending sector counts indicate physical disk degradation.

Why this answer

The SMART data from the storage node shows attributes such as Reallocated_Sector_Count, Current_Pending_Sector, and Offline_Uncorrectable with non-zero values, which are definitive indicators of physical bad sectors on the disk. These bad sectors cause intermittent I/O errors and degraded performance because the disk must retry reads/writes or remap sectors, increasing latency. Option C is correct because this pattern directly points to a failing disk, not a network, filesystem, or RAID issue.

Exam trap

The trap here is that candidates confuse SMART disk failure indicators with filesystem corruption or network issues, because intermittent I/O errors can superficially resemble symptoms of a corrupt filesystem or a flapping network link, but the SMART data provides direct hardware-level evidence that eliminates those possibilities.

How to eliminate wrong answers

Option A is wrong because high storage network latency would manifest as consistent packet loss or jitter across all nodes, not as SMART attributes indicating physical disk errors; the command shown is a SMART query, not a network diagnostic. Option B is wrong because filesystem corruption typically produces errors like 'structure needs cleaning' or 'input/output error' on specific files, not the SMART counters for reallocated or pending sectors, which are hardware-level indicators. Option D is wrong because a RAID array rebuilding would show a degraded or rebuilding state in the RAID controller status (e.g., mdadm or megaraid output), not in SMART data, and performance would be consistently slow during rebuild, not intermittent.
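
The SMART check described above can be sketched as a small parser that flags the relevant counters when their raw value is non-zero. The sample text is a made-up excerpt in the style of `smartctl -A` output, not the exhibit from the question; smartctl usually abbreviates the first attribute as `Reallocated_Sector_Ct`, which is the name assumed here.

```python
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")


def failing_attributes(smart_output: str) -> dict[str, int]:
    """Return watched SMART attributes whose raw value is non-zero."""
    flagged = {}
    for line in smart_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID; the raw value is the 10th column.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = int(fields[9])
            if raw > 0:
                flagged[fields[1]] = raw
    return flagged


# Made-up excerpt in the style of `smartctl -A`; not the exhibit from the question.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   092   092   010    Pre-fail  Always       -       312
  9 Power_On_Hours          0x0032   071   071   000    Old_age   Always       -       25890
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       48
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       48
"""

print(failing_attributes(SAMPLE))
```

An empty result would point the investigation away from the disk and toward the network, filesystem, or RAID layer instead.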


89

MCQ · hard

An organization has a hybrid cloud environment with resources in both a private cloud and a public cloud. The operations team reports that the cloud management platform cannot collect monitoring data from the public cloud instances. The security team recently updated firewall rules. Which of the following is the MOST likely cause?

A. The load balancer in front of the management platform is misconfigured

B. The public cloud instances cannot reach the management platform's IP address due to firewall changes

C. The SNMP community string was modified on the instances

D. The management platform's NAT IP address was changed

Answer: B

Firewall updates may have blocked outbound traffic from instances to the management platform.

Why this answer

The security team's recent firewall rule update is the most likely cause because the cloud management platform typically uses specific ports and protocols (e.g., HTTPS on TCP 443, SSH on TCP 22, or WinRM on TCP 5985/5986) to collect monitoring data from public cloud instances. If the firewall rules block outbound traffic from the public cloud instances to the management platform's IP address, or block inbound traffic from those instances at the platform's side, the data collection will fail. This directly aligns with the reported symptom of the management platform being unable to collect monitoring data after a firewall change.

Exam trap

CompTIA often tests the candidate's ability to correlate a specific operational change (firewall rule update) with the most direct impact on network connectivity for monitoring, rather than distracting with unrelated configuration changes like SNMP strings or load balancer settings.

How to eliminate wrong answers

Option A is wrong because a misconfigured load balancer in front of the management platform would affect all traffic to the platform, not just monitoring data from public cloud instances, and the issue is specifically tied to the recent firewall rule update. Option C is wrong because while SNMP community string changes could disrupt monitoring, the question explicitly states the security team updated firewall rules, not SNMP configurations, and SNMP is not the only monitoring protocol used in hybrid cloud environments. Option D is wrong because changing the management platform's NAT IP address would require corresponding updates in routing and firewall rules; if this had occurred, the operations team would likely have reported a broader connectivity failure, not just a monitoring data collection issue, and the firewall rule update is the more immediate and likely cause.


90

MCQ · easy

A cloud engineer notices that a virtual machine running a critical application is experiencing high CPU usage. The engineer needs to resolve the issue without affecting other VMs on the same host. Which of the following actions should the engineer take first?

A. Restart the VM to clear the high CPU usage.

B. Increase the CPU allocation for the VM.

C. Migrate the VM to another host in the cluster.

D. Add another VM to the same host to distribute load.

Answer: C

Live migration moves the VM to a less loaded host, resolving the issue without downtime.

Why this answer

Migrating the VM to another host in the cluster (option C) is the correct first action because it immediately offloads the CPU pressure from the current host without impacting the VM's operation or other VMs. This leverages VMware vMotion or Microsoft Hyper-V Live Migration to move the running VM to a less-utilized host, isolating the performance issue while preserving uptime and avoiding resource contention.

Exam trap

CompTIA often tests the misconception that increasing a VM's resource allocation is the first troubleshooting step, when in fact it can cause resource starvation for other VMs; the correct first action is to migrate the VM to balance the load.

How to eliminate wrong answers

Option A is wrong because restarting the VM would cause application downtime and does not address the root cause of high CPU usage; it only temporarily clears the process queue. Option B is wrong because increasing the CPU allocation for the VM on the same host could starve other VMs of CPU resources, violating the requirement to not affect other VMs. Option D is wrong because adding another VM to the same host would increase overall CPU contention, worsening the problem for all VMs on that host.


91

MCQ · easy

A cloud administrator is designing a multi-tier application. The database tier must not be directly accessible from the internet, but the web tier must be able to connect to it. Which of the following should the administrator implement?

A. Implement an application load balancer in front of the database.

B. Place the database servers in a public subnet and restrict the security group.

C. Use a VPN connection for the web tier to access the database.

D. Place the database servers in a private subnet and configure the security group to allow inbound traffic from the web tier's security group.

Answer: D

This isolates the database from the internet and allows only the web tier to connect.

Why this answer

Option D is correct because placing the database servers in a private subnet ensures they have no direct internet route, meeting the security requirement. By configuring the security group to allow inbound traffic only from the web tier's security group, you enable the web tier to connect to the database while blocking all other traffic, including from the internet. This leverages AWS security group referencing (or similar cloud provider feature) to create a trusted, internal communication path.

Exam trap

The trap here is that candidates often confuse 'restricting the security group' with 'placing in a private subnet,' failing to realize that a public subnet inherently provides internet accessibility regardless of security group rules, which only filter traffic but do not remove the public route.

How to eliminate wrong answers

Option A is wrong because an application load balancer is designed to distribute traffic to web or application servers, not to provide secure database access; placing a load balancer in front of the database would expose it to the internet and add unnecessary complexity without preventing direct internet access. Option B is wrong because placing the database servers in a public subnet, even with a restrictive security group, still exposes them to potential internet-based attacks and violates the requirement that the database must not be directly accessible from the internet. Option C is wrong because a VPN connection is typically used for site-to-site or remote user access, not for internal web-to-database communication within the same cloud environment; it would add latency and overhead without addressing the core need for private subnet isolation.
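
Security group referencing, as described above, can be modeled as an allow-list keyed on the source group rather than on IP addresses. This is a simplified sketch, not a provider API: the `Rule` class, the group name `sg-web`, and port 5432 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    port: int
    source_group: str   # allow traffic only from members of this group


# Hypothetical inbound rules for the database tier's security group.
DB_INBOUND_RULES = [Rule(port=5432, source_group="sg-web")]


def is_allowed(rules: list[Rule], port: int, source_groups: set[str]) -> bool:
    """Security groups are allow-lists: no matching rule means the traffic is denied."""
    return any(r.port == port and r.source_group in source_groups for r in rules)


web_to_db = is_allowed(DB_INBOUND_RULES, 5432, {"sg-web"})
internet_to_db = is_allowed(DB_INBOUND_RULES, 5432, set())  # no group membership
print(web_to_db, internet_to_db)  # True False
```

Because the rule names a group instead of an address range, web servers added by auto scaling are covered automatically as long as they join `sg-web`.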


92

MCQ · medium

An organization's cloud environment has a policy that all administrative access must be logged and recorded. Which of the following is the best method to enforce this policy?

A. Require multifactor authentication.

B. Use a bastion host with session recording.

C. Implement a VPN connection for all administrators.

D. Configure syslog forwarding for all devices.

Answer: B

A bastion host can log and record all administrative actions.

Why this answer

Option B is correct because a bastion host can be configured to record all administrative sessions, providing both logging and recording. Option A enforces authentication but not session recording. Option C secures the connection but does not record it. Option D provides logging of events but not full session recording.


93

MCQ · medium

A cloud engineer deployed the infrastructure shown. The load balancer's health checks are failing for the EC2 instance. Which of the following is the MOST likely cause?

A. The web server is not running HTTP.

B. The health check endpoint /health does not exist on the web server.

C. The EC2 instance is in the wrong subnet.

D. The security group does not allow traffic from the load balancer.

Answer: B

The user_data script does not create a /health page; it only installs httpd.

Why this answer

Option B is correct because the user_data script only installs and starts httpd; it never creates the '/health' path that the health check requests. Option A is wrong because the user_data script starts httpd. Option C is wrong because the subnet is specified. Option D is wrong because the security group is attached.
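
For illustration, a web server passes a path-based health check only when that path returns a success status. The sketch below uses Python's standard `http.server` instead of httpd, and the `route` and `serve` helpers and port 8080 are assumptions; it is not the infrastructure from the exhibit.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(path: str) -> tuple[int, bytes]:
    """Map a request path to (status code, body)."""
    if path == "/health":
        return 200, b"ok"
    if path == "/":
        return 200, b"hello"
    return 404, b"not found"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = route(self.path)
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the server; the port is an arbitrary choice for local testing."""
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

Removing the `/health` branch from `route` reproduces the failure in the question: the server is up and answers on `/`, yet every health check receives a 404.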


94

MCQ · hard

A company is migrating a legacy application that requires static public IP addresses for licensing. The cloud provider assigns public IPs dynamically by default. Which solution should the administrator recommend while minimizing cost?

A. Use a NAT gateway with a static IP.

B. Assign elastic IP addresses (or static public IPs) to the instances.

C. Use static private IPs.

D. Use a VPN connection.

Answer: B

Elastic IPs are static public IPs that can be associated directly with instances, without additional infrastructure.

Why this answer

Option B is correct because Elastic IP addresses (or static public IPs) provide persistent public IPv4 addresses that can be associated with instances, meeting the licensing requirement for static public IPs. This is the most cost-effective solution because the only charge is for the addresses themselves (exact pricing varies by provider), whereas other options introduce additional infrastructure costs or fail to address the requirement.

Exam trap

The trap here is that candidates may choose a NAT gateway (Option A) thinking it provides a static public IP for the instance, but a NAT gateway only translates outbound traffic and does not assign a public IP to the instance for inbound licensing checks, while also incurring higher costs.

How to eliminate wrong answers

Option A is wrong because a NAT gateway with a static IP provides outbound internet access but does not assign a static public IP directly to the instance for inbound licensing validation; it also incurs hourly and data processing costs, increasing expenses unnecessarily. Option C is wrong because static private IPs are not publicly routable and cannot satisfy the licensing requirement for static public IP addresses. Option D is wrong because a VPN connection creates an encrypted tunnel to a remote network but does not provide a static public IP address for the instance; it adds complexity and cost without meeting the core requirement.


95

Multi-Select · hard

Which THREE are common causes of network latency in a cloud environment? (Choose three.)

Select 3 answers

A. Firewall rule misconfiguration

B. DNS resolution delays

C. Jumbo frame misconfiguration

D. High bandwidth utilization

E. Packet loss

Answers: A, C, E

Misconfigured firewall rules can cause dropped packets or added inspection delay.

Why this answer

Options A, C, and E are correct. Firewall rule misconfiguration (A) can cause additional processing or drops. Jumbo frame misconfiguration (C) causes fragmentation or an MTU mismatch, leading to latency. Packet loss (E) triggers retransmission delays. Option B (DNS resolution delays) affects initial connections but not ongoing latency. Option D (high bandwidth utilization) causes congestion only if capacity is exceeded, so it does not always add latency.


96

MCQ · hard

A cloud operations team is investigating why a batch processing job that runs nightly in a cloud environment has been failing intermittently. The job processes data from an external API and writes results to a database. The error logs show "Connection timed out" when calling the external API. However, manual calls from the same cloud environment succeed. What is the most likely cause?

A. The external API rate limit has been exceeded.

B. The database connection pool is exhausted.

C. The batch job's service account has been disabled.

D. The cloud firewall is blocking outbound traffic during the batch window.

Answer: A

Correct. Rate limiting causes intermittent timeouts when batch calls exceed the allowed threshold, while manual calls succeed.

Why this answer

Intermittent timeouts during batch execution suggest the external API is rate-limiting the high volume of automated calls, while manual calls succeed because they are infrequent.
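
The difference between batch and manual calls can be sketched with a simple fixed-window rate limiter. This is an assumed model of how the external API might behave, not its documented policy: the limit of 100 calls per 60 seconds and the class name are hypothetical, and a real API may stall or reject excess calls in its own way.

```python
class FixedWindowLimiter:
    """Allow at most `limit` calls per window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._window = -1   # index of the window currently being counted
        self._count = 0

    def allow(self, now: float) -> bool:
        window = int(now // self.window_seconds)
        if window != self._window:
            # A new window has started: reset the counter.
            self._window = window
            self._count = 0
        if self._count >= self.limit:
            return False
        self._count += 1
        return True


api = FixedWindowLimiter(limit=100, window_seconds=60)
batch = [api.allow(now=0) for _ in range(250)]   # burst from the nightly job
manual = api.allow(now=120)                      # one call made later by hand
print(batch.count(True), batch.count(False), manual)  # 100 150 True
```

The burst exhausts the window, so part of the batch fails, while a single later call lands in a fresh window and succeeds. Spacing the batch calls out or retrying with backoff would keep the job under the limit.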


97

MCQ · medium

A company is designing a disaster recovery (DR) plan for a critical application hosted in a public cloud. The application requires a recovery time objective (RTO) of 1 hour and a recovery point objective (RPO) of 15 minutes. Which of the following DR strategies BEST meets these requirements?

A. Backup and restore with daily backups.

B. Cold standby with nightly backups.

C. Pilot light with hourly snapshots.

D. Warm standby with continuous data replication.

Answer: D

Warm standby with continuous replication meets both RTO and RPO targets.

Why this answer

Warm standby with continuous data replication meets the RTO of 1 hour and RPO of 15 minutes because it maintains a partially scaled-down replica of the production environment that can be quickly scaled up, and continuous replication (e.g., using asynchronous replication or Change Block Tracking) ensures data loss is limited to seconds or minutes, well within the 15-minute RPO.

Exam trap

CompTIA often tests the distinction between 'pilot light' and 'warm standby' — the trap here is that candidates confuse hourly snapshots (pilot light) with continuous replication, failing to recognize that hourly snapshots cannot achieve a 15-minute RPO.

How to eliminate wrong answers

Option A is wrong because daily backups provide an RPO of up to 24 hours, far exceeding the required 15 minutes, and the restore process would take much longer than 1 hour, failing the RTO. Option B is wrong because cold standby involves no pre-provisioned resources, requiring manual provisioning and configuration that typically takes hours, exceeding the 1-hour RTO, and nightly backups provide an RPO of up to 24 hours. Option C is wrong because pilot light with hourly snapshots provides an RPO of up to 1 hour, which exceeds the 15-minute requirement, and the snapshots are not continuous, so data loss could be significant.
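The elimination above can be expressed as a simple comparison of each strategy against the two objectives. This is a rough sketch: the worst-case RPO values follow from the backup or snapshot interval, but the RTO figures are illustrative assumptions, not measured or vendor-published numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrStrategy:
    name: str
    worst_case_rpo_min: float  # longest gap between recoverable copies
    assumed_rto_min: float     # illustrative guess, not a measured value


STRATEGIES = [
    DrStrategy("Backup and restore, daily backups", 24 * 60, 8 * 60),
    DrStrategy("Cold standby, nightly backups", 24 * 60, 4 * 60),
    DrStrategy("Pilot light, hourly snapshots", 60, 45),
    DrStrategy("Warm standby, continuous replication", 1, 30),
]


def meets_objectives(strategy, rto_min, rpo_min):
    """A strategy qualifies only if it satisfies both the RTO and the RPO."""
    return (strategy.assumed_rto_min <= rto_min
            and strategy.worst_case_rpo_min <= rpo_min)


# Requirements from the question: RTO of 1 hour, RPO of 15 minutes.
qualifying = [s.name for s in STRATEGIES
              if meets_objectives(s, rto_min=60, rpo_min=15)]
```

With these assumed figures, only the warm standby entry ends up in `qualifying`; the pilot light option fails on RPO alone.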

98

Multi-Selectmedium

A company is experiencing intermittent connectivity issues between its on-premises data center and a public cloud environment over a VPN connection. Which TWO of the following should the administrator check to troubleshoot the problem?

Select 2 answers

A.Verify that the internet bandwidth is sufficient.

B.Validate the route tables on both sides of the VPN.

C.Ensure the cloud storage performance is adequate.

D.Check the DNS resolution of cloud endpoints.

E.Review the VPN logs and monitor packet loss.

AnswersB, E

Incorrect routes can cause traffic to be dropped or misrouted.

Why this answer

Route tables control the path that traffic takes between networks. If the route tables on either the on-premises VPN device or the cloud virtual network gateway do not have the correct entries (e.g., missing routes for the remote subnet or incorrect next-hop IPs), traffic can be dropped or misdirected, causing intermittent connectivity. Validating these routes ensures that packets destined for the cloud or on-premises are properly forwarded over the VPN tunnel.
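To illustrate the route check, here is a small sketch of a longest-prefix-match lookup using Python's standard `ipaddress` module. The CIDR ranges and the target names (`local`, `internet-gateway`, `vpn-gateway`) are hypothetical and only loosely resemble a cloud route table.

```python
import ipaddress


def find_route(route_table, destination):
    """Return the target of the most specific route covering `destination`."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in route_table.items():
        network = ipaddress.ip_network(cidr)
        if dest in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None  # no route at all: the packet is dropped
    # Longest prefix wins, as in ordinary IP routing.
    return max(matches)[1]


# Hypothetical cloud-side table with no route to the on-premises 192.168.0.0/16.
routes = {"10.0.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
before = find_route(routes, "192.168.10.5")  # falls through to the default route

routes["192.168.0.0/16"] = "vpn-gateway"     # add the missing VPN route
after = find_route(routes, "192.168.10.5")
```

Without the specific route, traffic for the on-premises subnet follows the default route instead of the tunnel, which is the kind of misdirection the explanation describes.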

Exam trap

CompTIA often tests the misconception that intermittent VPN issues are always bandwidth-related, but the real culprit is usually routing misconfiguration or tunnel instability, which is why route validation and log/packet-loss analysis are the correct pair.

99

MCQeasy

A cloud administrator is tasked with reducing costs for a development environment that runs 24/7. The environment consists of several virtual machines and a load balancer. Which action would most effectively reduce costs without affecting developer access during business hours?

A.Implement auto-scaling to run only one instance during off-hours.

B.Schedule the VMs to shut down during nights and weekends.

C.Use reserved instances for all VMs.

D.Change all VMs to burstable instance types.

AnswerB

Shutting down VMs eliminates compute costs during idle periods, directly reducing expenses.

Why this answer

Scheduling VMs to shut down during nights and weekends stops compute charges while not impacting developers who only work during business hours.
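The size of the saving can be estimated with simple arithmetic. The sketch below assumes a hypothetical schedule of 12 hours a day, 5 days a week, and counts only per-hour compute charges; storage and the load balancer typically continue to bill while VMs are stopped.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def weekly_compute_savings(hours_on_per_day, days_on_per_week):
    """Fraction of weekly compute hours saved compared with running 24/7."""
    hours_on = hours_on_per_day * days_on_per_week
    return 1 - hours_on / HOURS_PER_WEEK


# Assumed schedule: VMs run 12 hours on each of 5 weekdays (60 of 168 hours).
savings = weekly_compute_savings(hours_on_per_day=12, days_on_per_week=5)
```

Under that assumed schedule the VMs run 60 of 168 hours, so roughly 64% of the compute hours are avoided.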

100

MCQhard

A company runs a containerized application on a Kubernetes cluster. The application logs indicate occasional 'CrashLoopBackOff' errors. The developer says the application works fine locally. What is the most likely cause in the cloud environment?

A.The readiness probe is misconfigured and returning false.

B.The memory limit set in the pod spec is too low, causing the container to be OOMKilled.

C.The container image is not being pulled correctly from the registry.

D.The persistent volume claim is not bound to a storage class.

AnswerB

OOMKill causes container to restart, resulting in CrashLoopBackOff.

Why this answer

The 'CrashLoopBackOff' error indicates that the container is repeatedly crashing after startup. Since the application works locally, the issue is likely environmental. A memory limit set too low in the pod spec causes the container to exceed its allowed memory, at which point the node's Linux kernel OOM killer terminates it and Kubernetes reports the container as OOMKilled.

The container restarts, fails again, and enters CrashLoopBackOff, which is a common cloud-specific resource constraint not present in local development.

Exam trap

CompTIA often tests the distinction between pod statuses: candidates confuse 'CrashLoopBackOff' (container crashes after starting) with 'ImagePullBackOff' (image pull failure) or 'Pending' (resource unavailability), so they incorrectly attribute the error to image or volume issues rather than resource limits.

How to eliminate wrong answers

Option A is wrong because a misconfigured readiness probe returning false would cause the pod to be marked as not ready and removed from service endpoints, but it would not cause the container to crash or enter CrashLoopBackOff; the container would continue running. Option C is wrong because if the container image were not being pulled correctly, the pod would show an 'ImagePullBackOff' or 'ErrImagePull' status, not 'CrashLoopBackOff'. Option D is wrong because an unbound persistent volume claim would cause the pod to remain in 'Pending' state, not crash after starting; the container would never run to begin with.

101

MCQhard

A cloud administrator is trying to attach an EBS volume to an EC2 instance for a database migration. The attachment fails with the error shown in the exhibit. The volume contains critical data. Which of the following is the MOST appropriate action to take?

A.Create a new volume and copy the data from the existing volume.

B.Terminate instance 'i-0123ijkl' to release the volume.

C.Force-detach the volume from the other instance.

D.Detach the volume from instance 'i-0123ijkl' first, then attach to the new instance.

AnswerD

Properly detaching ensures data integrity.

Why this answer

The error indicates the EBS volume is already attached to another EC2 instance (i-0123ijkl). An EBS volume can only be attached to one instance at a time in a standard attachment. The correct procedure is to first detach the volume from instance i-0123ijkl, then attach it to the target instance.

This preserves the critical data without risk of corruption or data loss.

Exam trap

CompTIA often tests the misconception that force-detach is the quickest fix, but the trap here is that force-detach can corrupt a database volume with critical data, whereas a proper detach is the safest and most appropriate action.

How to eliminate wrong answers

Option A is wrong because creating a new volume and copying data is unnecessary and time-consuming; the existing volume is intact and can be reattached. Option B is wrong because terminating instance i-0123ijkl would destroy the instance and potentially cause data loss if the volume is not first detached; it is an extreme and destructive action. Option C is wrong because force-detaching the volume can cause data corruption or filesystem inconsistency, especially for a database volume that may have pending writes; it should only be used as a last resort when the instance is unresponsive.

102

MCQmedium

A cloud administrator receives an alert that the CPU utilization on a production web server has exceeded 90% for the past hour. The administrator checks the metrics and sees that the request rate has increased. Which of the following is the MOST appropriate action to resolve the issue in the short term?

A.Increase the memory allocation for the virtual machine.

B.Implement rate limiting to reduce incoming requests.

C.Increase the number of vCPUs or add additional instances behind a load balancer.

D.Optimize the application code to reduce CPU usage.

AnswerC

Scaling up or out quickly addresses the increased load.

Why this answer

Option C is correct because adding vCPUs or additional instances behind a load balancer gives the web tier the compute capacity to handle the increased request rate. Option A is wrong because more memory does not relieve a CPU-bound workload. Option B is wrong because rate limiting throttles legitimate requests and degrades the user experience. Option D is wrong because optimizing code is a long-term fix, not a short-term one.

103

Multi-Selecthard

Which THREE of the following are valid methods to automate the deployment of cloud resources?

Select 3 answers

A.Employing configuration management tools to enforce desired state

B.Using snapshots to clone instances

C.Using infrastructure as code templates

D.Implementing orchestration tools that chain API calls

E.Writing scripts that use cloud provider CLI commands

AnswersA, C, D

Configuration management automates server configuration.

Why this answer

Option A is correct because configuration management tools like Ansible, Puppet, or Chef enforce a desired state on cloud resources. They continuously ensure that the system configuration matches the defined state, automatically correcting any drift. This is a valid method for automating deployment and maintenance of cloud resources.

Exam trap

The trap here is that candidates may confuse basic scripting (Option E) with dedicated automation methods, or think snapshot cloning (Option B) is a form of automation, when in fact the exam expects recognition of configuration management, IaC, and orchestration as the three primary valid methods for automating cloud resource deployment.

104

MCQmedium

Refer to the exhibit. A cloud load balancer is returning 502 Bad Gateway errors to clients. What is the most likely cause?

A.The load balancer's SSL certificate is invalid.

B.The security group allows inbound traffic from the load balancer.

C.The DNS record points to the wrong IP.

D.The backend web servers are not responding correctly.

AnswerD

502 errors and health check failures indicate the backend servers are not serving correctly.

Why this answer

A 502 Bad Gateway error indicates that the load balancer (acting as a proxy or gateway) received an invalid or no response from the upstream backend web servers. This typically occurs when the backend servers are overloaded, have crashed, or are misconfigured (e.g., incorrect health check path, application pool failure). The load balancer successfully forwards the request but fails to get a valid HTTP response, triggering the 502 error.

Exam trap

CompTIA often tests the distinction between HTTP status codes (502 vs. 504 vs. 503) and their root causes, trapping candidates who confuse a backend response failure (502) with a timeout (504) or an overload condition (503).

How to eliminate wrong answers

Option A is wrong because an invalid SSL certificate on the load balancer would cause SSL/TLS handshake failures (e.g., 525 or 526 errors in Cloudflare, or certificate warnings), not a 502 Bad Gateway, which is a proxy-level error unrelated to certificate validity. Option B is wrong because allowing inbound traffic from the load balancer in the security group is actually a correct configuration; if it were blocked, the load balancer would receive connection timeouts or 504 errors, not 502 errors. Option C is wrong because a DNS record pointing to the wrong IP would cause clients to reach an incorrect server or no server at all, resulting in connection failures or 404 errors, not a 502 Bad Gateway from the load balancer.
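As a study aid for the 502 / 503 / 504 distinction, the sketch below pairs each code's standard reason phrase (from Python's `http.HTTPStatus`) with a typical cause. The cause strings are a simplification written for this example, not provider documentation.

```python
from http import HTTPStatus

# Typical causes only; real load balancers may differ in detail.
GATEWAY_ERRORS = {
    HTTPStatus.BAD_GATEWAY: "backend returned an invalid response or none at all",
    HTTPStatus.SERVICE_UNAVAILABLE: "service overloaded or no healthy backend available",
    HTTPStatus.GATEWAY_TIMEOUT: "backend did not respond before the proxy timed out",
}


def describe(code):
    """Return '<code> <reason phrase>: <typical cause>' for a status code."""
    status = HTTPStatus(code)
    cause = GATEWAY_ERRORS.get(status, "not a gateway-class error covered here")
    return f"{status.value} {status.phrase}: {cause}"
```

Calling `describe(502)` and `describe(504)` side by side makes the contrast explicit: an invalid response versus no response within the timeout.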

105

MCQmedium

A cloud engineer is tasked with setting up a disaster recovery (DR) plan for a critical application that runs on virtual machines in a private cloud. The DR site is a public cloud. The application requires low recovery time objective (RTO) of less than 15 minutes and recovery point objective (RPO) of less than 5 minutes. Which of the following replication strategies BEST meets these requirements?

A.Replicate virtual machine images to the DR site daily.

B.Use synchronous replication to keep data identical.

C.Configure agent-based backup to the DR site every 5 minutes.

D.Use continuous asynchronous replication to the DR site.

AnswerD

Continuous replication provides near-real-time RPO (seconds to minutes) and if VMs are pre-configured, can achieve RTO under 15 minutes.

Why this answer

Option D is correct because continuous asynchronous replication can achieve a low RPO (under 5 minutes) and a low RTO (under 15 minutes) if VMs can be started quickly from the replicated data. Option A is wrong because daily replication cannot achieve an RPO under 5 minutes. Option B is wrong because synchronous replication may introduce latency and can affect primary performance; it is typically used when strict consistency is required, and the RTO may still be higher if recovery involves starting VMs. Option C is wrong because a backup every 5 minutes may meet the RPO, but the RTO would be higher because of the time needed to restore from backup.

106

MCQmedium

A cloud engineer is designing a disaster recovery plan for a critical application. The application requires a Recovery Time Objective (RTO) of 15 minutes and a Recovery Point Objective (RPO) of 1 hour. Which replication strategy should be used?

A.Asynchronous replication with snapshots every 30 minutes.

B.Daily backups stored in a different region.

C.No replication; rely on infrastructure rebuild.

D.Synchronous replication between two regions.

AnswerA

Snapshots every 30 minutes meet RPO of 1 hour; RTO achieved by failover.

Why this answer

Asynchronous replication with snapshots every 30 minutes meets the RPO of 1 hour because data loss is limited to at most 30 minutes of changes, which is within the 1-hour window. The RTO of 15 minutes is achievable because the replicated data is readily available in the secondary region, allowing for rapid failover without the delay of restoring from backups.

Exam trap

CompTIA often tests the misconception that synchronous replication is always the best choice for disaster recovery, but the trap here is that synchronous replication over long distances introduces unacceptable latency, making it unsuitable for applications with strict performance requirements, while asynchronous replication with appropriate snapshot intervals can meet moderate RPO/RTO goals without performance degradation.

How to eliminate wrong answers

Option B is wrong because daily backups stored in a different region cannot meet the RPO of 1 hour (data loss could be up to 24 hours) or the RTO of 15 minutes (restoring from backups typically takes hours). Option C is wrong because relying on infrastructure rebuild would result in an RTO far exceeding 15 minutes and an RPO of potentially all data since the last backup, failing both objectives. Option D is wrong because synchronous replication between two regions introduces significant latency that can degrade application performance, and while it provides an RPO of near zero, it is overkill for a 1-hour RPO and may not be feasible over long distances due to network constraints.

107

MCQmedium

A company uses a cloud-based logging service to aggregate logs from multiple servers. Suddenly, the logging service stops receiving logs from several servers. The administrator checks the logging agent status on those servers and finds that the agents are running but not sending data. The network connectivity between the servers and the logging service is verified as working. Which of the following is the MOST likely cause?

A.The logging service has reached its storage quota.

B.The logging endpoint configuration on the agents is incorrect.

C.The logging agent is out of memory.

D.A firewall is blocking the outbound traffic from the servers.

AnswerB

Wrong endpoint will cause logs to be sent to an invalid destination.

Why this answer

The logging agents are running but not sending data, and network connectivity is verified as working. This points to a configuration issue where the agents are pointing to an incorrect logging endpoint (e.g., wrong URL, port, or API key). Even if the service is healthy, misconfigured agents will fail to transmit logs, which matches the symptom of agents running but idle.

Exam trap

The trap here is that candidates assume a running agent implies correct configuration, but agents can run idle if they cannot reach or authenticate to the configured endpoint, even when network connectivity is fine.

How to eliminate wrong answers

Option A is wrong because if the logging service had reached its storage quota, the service would typically reject new logs or return an error, but the agents would still attempt to send data (and likely log the rejection). Option C is wrong because an out-of-memory agent would likely crash, hang, or produce errors, not remain in a running state without sending data. Option D is wrong because network connectivity between the servers and the logging service is verified as working, which directly rules out a firewall blocking outbound traffic.

108

MCQmedium

A cloud engineer deploys the Kubernetes manifest shown in the exhibit. After deployment, the frontend pods are in CrashLoopBackOff state. The engineer checks the logs and finds 'OOMKilled' errors. Which of the following changes would resolve the issue?

A.Increase the memory limit to 1Gi.

B.Increase the number of replicas to 5.

C.Reduce the CPU limit to 250m.

D.Change the service type to ClusterIP.

AnswerA

Increasing memory limit allows the container to use more memory without being killed.

Why this answer

The 'OOMKilled' error indicates the container's memory usage exceeded its configured limit, causing the kernel's Out-Of-Memory (OOM) killer to terminate the process. Increasing the memory limit to 1Gi provides the container with more memory headroom, preventing the OOM kill and allowing the pod to run without crashing.
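The comparison behind an OOM kill can be sketched by converting Kubernetes-style memory quantities to bytes. The exhibit is not reproduced here, so the 512Mi limit and 700Mi peak usage below are hypothetical values; the parser handles only a few common suffixes.

```python
# Subset of Kubernetes quantity suffixes: binary (Ki, Mi, Gi) and decimal (k, M, G).
_UNITS = {
    "Ki": 1024, "Mi": 1024 ** 2, "Gi": 1024 ** 3,
    "k": 1000, "M": 1000 ** 2, "G": 1000 ** 3,
}


def parse_memory(quantity):
    """Convert a quantity such as '512Mi' or '1Gi' to a number of bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(float(quantity[:-len(suffix)]) * factor)
    return int(quantity)  # a plain number is already in bytes


def would_be_oom_killed(peak_usage, limit):
    """True if peak usage exceeds the container's memory limit."""
    return parse_memory(peak_usage) > parse_memory(limit)


# Hypothetical numbers: a 700Mi peak against a 512Mi limit, then against 1Gi.
killed_before = would_be_oom_killed("700Mi", "512Mi")
killed_after = would_be_oom_killed("700Mi", "1Gi")
```

Raising the limit only helps if the workload's real peak fits under the new value; a genuine memory leak would eventually exceed 1Gi as well.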

Exam trap

CompTIA often tests the distinction between resource limits (memory vs. CPU) and scaling strategies, trapping candidates who confuse horizontal scaling (replicas) with vertical scaling (resource limits).

How to eliminate wrong answers

Option B is wrong because increasing the number of replicas does not address the per-pod memory exhaustion; it only distributes traffic across more pods, each still subject to the same insufficient memory limit. Option C is wrong because reducing the CPU limit does not affect memory allocation; OOM kills are triggered by memory, not CPU, and lowering CPU could cause throttling but not resolve the memory shortage. Option D is wrong because changing the service type to ClusterIP only alters network exposure (internal vs. external) and has no impact on container resource limits or OOM behavior.

109

MCQhard

A company operates a multi-tier web application on AWS. The web tier runs on EC2 instances behind an Application Load Balancer. The application tier runs on EC2 instances that connect to an RDS MySQL database. Recently, users have reported slow page load times. The cloud administrator investigates and finds the following: CPU utilization on web and app tier instances is below 50%, memory usage is normal, but the RDS instance's CPU utilization is consistently above 80% and the number of database connections is at the maximum. The administrator also notices that the application code opens a new database connection for each HTTP request and does not close them properly. Which action should the administrator take to resolve the performance issue?

A.Scale the RDS instance vertically to a larger instance class.

B.Implement a connection pooling mechanism in the application tier.

C.Increase the timeout for idle database connections.

D.Increase the maximum number of database connections on RDS.

AnswerB

Reduces the number of concurrent connections and reuses them, lowering CPU load.

Why this answer

The performance issue is caused by the application opening a new database connection for each HTTP request and not closing them properly, which exhausts the maximum connections on RDS. Implementing a connection pooling mechanism in the application tier reuses existing connections, reduces connection overhead, and prevents connection exhaustion without requiring a larger instance or increasing the connection limit. This directly addresses the root cause—inefficient connection management—rather than treating symptoms like high CPU or connection limits.
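To make the idea concrete, here is a minimal connection-pool sketch. It uses in-memory `sqlite3` connections as a stand-in for MySQL, and `ConnectionPool` and `handle_request` are hypothetical names; a production application would normally rely on the pooling provided by its database driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and reused."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks (up to `timeout` seconds) when every connection is in use,
        # which caps concurrent connections at the pool size.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always returned, even if the request fails


def handle_request(pool):
    """Stand-in for an HTTP request handler that needs the database."""
    with pool.connection() as conn:
        return conn.execute("SELECT 1").fetchone()[0]


pool = ConnectionPool(size=2, connect=lambda: sqlite3.connect(":memory:"))
results = [handle_request(pool) for _ in range(10)]
```

Ten requests are served with only two connections, and the `finally` clause guarantees each one goes back to the pool, which is the behavior the leaking code in the question lacks.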

Exam trap

CompTIA often tests the misconception that scaling resources (vertical scaling or increasing limits) is the primary fix for performance issues, when the real problem is inefficient resource usage like connection leaks that require architectural changes such as connection pooling.

How to eliminate wrong answers

Option A is wrong because vertically scaling the RDS instance would increase CPU and memory capacity but does not fix the underlying connection leak; the application would still exhaust connections, leading to the same bottleneck. Option C is wrong because increasing the timeout for idle database connections would keep more connections open longer, exacerbating the connection exhaustion problem rather than resolving it. Option D is wrong because increasing the maximum number of database connections on RDS would allow more concurrent connections but does not address the application's failure to close connections, leading to eventual resource exhaustion and potential instability.

110

MCQeasy

A cloud administrator needs to ensure that application logs are retained for three years to comply with regulatory requirements. Which of the following is the MOST cost-effective solution?

A.Configure log rotation so only the last 30 days of logs are kept in the instance

B.Store all logs in block storage for three years

C.Use a lifecycle policy to transition logs to archive storage after 90 days and delete after three years

D.Compress old logs manually and store them in a separate volume

AnswerC

Lifecycle policies automate tiering to cost-effective storage.

Why this answer

Option C is correct because a lifecycle policy that transitions logs to archive (cold) storage after 90 days and deletes them after three years meets the retention requirement while reducing storage cost. Option A is wrong because keeping only the last 30 days of logs does not satisfy the three-year retention requirement. Option B is wrong because keeping all logs in block (hot) storage for three years is expensive. Option D is wrong because compressing old logs manually is less efficient than an automated lifecycle policy.
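The lifecycle rule in option C amounts to a decision based on object age. The sketch below models that decision in plain Python; the tier names are hypothetical, and a real policy would be configured in the storage service rather than coded by hand.

```python
ARCHIVE_AFTER_DAYS = 90
DELETE_AFTER_DAYS = 3 * 365  # three-year retention, ignoring leap days


def lifecycle_action(age_days):
    """Return what the policy does with a log object of the given age."""
    if age_days >= DELETE_AFTER_DAYS:
        return "delete"
    if age_days >= ARCHIVE_AFTER_DAYS:
        return "archive"
    return "standard"
```

For example, a 10-day-old log stays in standard storage, a 200-day-old log sits in the archive tier, and anything at or past 1,095 days is removed.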

111

MCQmedium

A company uses a cloud load balancer to distribute traffic to a group of web servers. After a recent update, some users report being redirected to a maintenance page when the application is actually available. What is the most likely cause?

A.The load balancer health check is misconfigured and marking healthy instances as unhealthy.

B.The load balancer's SSL certificate has expired.

C.The DNS record for the load balancer has a short TTL.

D.The web servers are not configured with the same security group.

AnswerA

A misconfigured health check can cause the load balancer to stop sending traffic to healthy instances, redirecting users to a maintenance page.

Why this answer

The most likely cause is that the load balancer's health check is misconfigured, causing it to incorrectly mark healthy web servers as unhealthy. When all instances are marked unhealthy, the load balancer has no available targets and may route traffic to a fallback maintenance page or return an error. This explains why users see a maintenance page despite the application being available.

Exam trap

CompTIA often tests the distinction between health check misconfiguration and other common issues like SSL or DNS, so candidates may mistakenly choose an expired SSL certificate because they associate 'maintenance page' with security errors, but the correct cause is the load balancer's health check marking healthy instances as unhealthy.

How to eliminate wrong answers

Option B is wrong because an expired SSL certificate would cause TLS handshake errors (e.g., certificate warnings or connection failures), not a redirect to a maintenance page. Option C is wrong because a short TTL on the DNS record affects how quickly DNS changes propagate, but it does not cause the load balancer to redirect traffic to a maintenance page. Option D is wrong because mismatched security groups would block traffic at the network level, resulting in timeouts or connection refused errors, not a maintenance page redirect.

112

Multi-Selectmedium

A cloud administrator is planning a migration of on-premises workloads to a public cloud. Which THREE factors should the administrator consider to ensure minimal downtime? (Choose three.)

Select 3 answers

A.Application compatibility

B.Data compression

C.Network bandwidth

D.Transfer time window

E.Cost of data transfer

AnswersA, C, D

Ensuring the application runs in the cloud avoids rollbacks and extended downtime.

Why this answer

Options A, C, and D are correct. Application compatibility (A) determines whether the workload will run in the cloud without issues. Network bandwidth (C) affects transfer speed. The transfer time window (D) is the scheduled period for migration, chosen to minimize business impact. Option B (data compression) can help but is not essential for minimal downtime. Option E (cost of data transfer) is important but does not directly affect downtime.

113

MCQeasy

A company uses a cloud-based backup solution for its virtual machines. The backup policy specifies daily backups with a retention of 30 days. The backup administrator notices that backups are failing with an error indicating insufficient storage space in the backup repository. Which of the following is the most likely reason for this issue?

A.The retention period is too long for the available storage.

B.The backup frequency is too low.

C.Compression is not enabled on the backup.

D.The backup window is too short.

AnswerA

Longer retention means more backups accumulate, consuming storage.

Why this answer

The error indicates insufficient storage space in the backup repository. With a daily backup policy and a 30-day retention period, the repository must accommodate at least 30 full backups (or the equivalent incremental/differential chain). If the repository's capacity is less than the total size of 30 backups, the retention period exceeds the available storage, causing failures.

This is the most direct cause of the error.
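The capacity reasoning can be checked with quick arithmetic. This sketch assumes the simplest case, a full backup every day with no deduplication or compression, and uses hypothetical sizes (200 GB per backup, a 5,000 GB repository).

```python
def required_capacity_gb(full_backup_gb, retention_days, backups_per_day=1):
    """Storage needed to hold every backup for the whole retention period."""
    return full_backup_gb * retention_days * backups_per_day


def max_retention_days(repo_gb, full_backup_gb, backups_per_day=1):
    """Longest retention (in whole days) that fits in the repository."""
    return int(repo_gb // (full_backup_gb * backups_per_day))


needed = required_capacity_gb(full_backup_gb=200, retention_days=30)  # 6000 GB
fits = max_retention_days(repo_gb=5000, full_backup_gb=200)           # 25 days
```

With these assumed numbers the repository fills after 25 days, so backups begin to fail before the 30-day retention period lets any old backup expire.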

Exam trap

The trap here is that candidates may confuse 'insufficient storage' with backup window or frequency issues, but the error is explicitly capacity-related, making retention period length the root cause when storage is fixed.

How to eliminate wrong answers

Option B is wrong because a lower backup frequency (e.g., weekly instead of daily) would reduce storage consumption, not cause an insufficient space error; the issue is high consumption relative to retention, not frequency. Option C is wrong because while compression reduces backup size, its absence would increase storage usage but not directly cause a failure unless the repository is already at capacity; the error message specifically cites insufficient space, not a configuration issue. Option D is wrong because a short backup window might cause timeouts or incomplete backups, but it would not produce an 'insufficient storage space' error; that error is purely capacity-related.

114

MCQhard

A cloud engineer is troubleshooting a web application that is not responding. The engineer examines the serial console output of the web-server instance and finds the error shown in the exhibit. What is the MOST likely cause of this issue?

A.The service account associated with the instance is missing the required permissions.

B.The instance is in a STOPPED state and cannot execute user data scripts.

C.The instance does not have a public IP address assigned.

D.A firewall rule is blocking traffic to the metadata server IP address 169.254.169.254.

AnswerD

The metadata server is accessed via link-local address; blocking this traffic prevents metadata retrieval.

Why this answer

The error shown in the serial console output indicates that the instance cannot reach the metadata server at 169.254.169.254. This IP address is a link-local address used by cloud providers (e.g., AWS, GCP, Azure) to serve instance metadata, including user data scripts. If a firewall rule blocks traffic to this IP, the instance cannot retrieve its user data, causing the web application to fail to start or respond.

Exam trap

The trap here is that candidates often associate connectivity issues with public IPs or firewall rules blocking external traffic, but the metadata server is an internal link-local address, so the firewall rule must be blocking internal traffic to 169.254.169.254 specifically.

How to eliminate wrong answers

Option A is wrong because the error is about network connectivity to the metadata server, not about IAM or service account permissions; missing permissions would cause API call failures, not a connection timeout to 169.254.169.254. Option B is wrong because if the instance were in a STOPPED state, there would be no serial console output or running processes to troubleshoot; the error implies the instance is running but cannot reach the metadata server. Option C is wrong because a public IP address is not required for an instance to access the metadata server; the metadata server is accessible via the link-local address 169.254.169.254 from within the instance regardless of public IP assignment.

115

MCQeasy

A company is experiencing latency issues when accessing a cloud-based application. The cloud administrator runs a traceroute and notices high latency at the ISP's edge router. Which of the following is the MOST likely cause?

A.An internet service provider (ISP) issue is degrading the WAN connection

B.A misconfigured firewall rule is dropping packets

C.The load balancer is sending traffic to unhealthy instances

D.The virtual machine hosting the application is under-provisioned

AnswerA

High latency at ISP edge router indicates WAN connection issue.

Why this answer

High latency at the ISP's edge router indicates that the bottleneck is occurring on the WAN link between the company's network and the cloud provider, which is under the ISP's control. This is a classic symptom of an ISP issue, such as congestion, routing problems, or a degraded physical link, directly impacting the WAN connection. The traceroute output localizes the latency to the ISP's infrastructure, not to the company's internal network or the cloud application's resources.

Exam trap

CompTIA often tests the distinction between latency caused by network infrastructure (ISP) versus application or server performance issues, and the trap here is that candidates may attribute high latency to internal misconfigurations (like firewalls or load balancers) when the traceroute clearly isolates the problem to an external hop.

How to eliminate wrong answers

Option B is wrong because a misconfigured firewall rule dropping packets would cause packet loss or connectivity failures, not consistently high latency at a specific hop; dropped packets would result in retransmissions and timeouts, not a steady latency increase. Option C is wrong because a load balancer sending traffic to unhealthy instances would cause application errors or timeouts, but the latency would appear at the application layer or after the load balancer, not at the ISP's edge router. Option D is wrong because an under-provisioned virtual machine would cause high CPU or memory utilization leading to application slowness, but the latency would be observed at the VM or within the cloud provider's network, not at the ISP's edge router.

116

MCQeasy

A cloud administrator needs to apply a critical security patch to a virtual machine that is part of a production application. The application must remain available during patching. Which of the following is the BEST approach?

A.Postpone the patch until the next scheduled update cycle

B.Remove the VM from the load balancer, apply the patch, then return it to service during a maintenance window

C.Patch all VMs simultaneously to minimize the time to full deployment

D.Apply the patch during peak usage hours to ensure immediate deployment

AnswerB

Rolling patching maintains availability.

Why this answer

Removing the VM from the load balancer ensures that no new traffic is sent to it while the patch is applied, maintaining application availability for users. After the patch is applied and the VM is verified as healthy, it can be returned to the load balancer pool. This approach aligns with a rolling update strategy, which is the standard method for applying patches to production VMs without downtime.
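The drain, patch, verify, and return sequence can be sketched as a loop over the pool. This is a simulation under stated assumptions: `rolling_patch`, the `patch` callback, and the `is_healthy` callback are hypothetical names, not a provider API.

```python
def rolling_patch(pool, patch, is_healthy):
    """Patch VMs one at a time so the rest of the pool keeps serving traffic.

    Returns the lowest number of VMs that were in service at any point.
    """
    in_service = set(pool)
    min_in_service = len(in_service)
    for vm in pool:
        in_service.discard(vm)  # drain: remove the VM from the load balancer
        min_in_service = min(min_in_service, len(in_service))
        patch(vm)
        if not is_healthy(vm):
            # Stop here so an unhealthy patch never reaches the remaining VMs.
            raise RuntimeError(f"{vm} failed its health check; halting rollout")
        in_service.add(vm)  # return the verified VM to service
    return min_in_service


patched = []
lowest = rolling_patch(["vm-a", "vm-b", "vm-c"], patched.append, lambda vm: True)
print(patched, lowest)
```

With three VMs, at least two stay in service throughout, which is why this approach satisfies the availability requirement and simultaneous patching does not.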

Exam trap

The trap here is that candidates may think patching all VMs simultaneously is faster and therefore better, but they overlook the critical requirement of maintaining application availability, which is explicitly stated in the question.

How to eliminate wrong answers

Option A is wrong because postponing a critical security patch leaves the application vulnerable to exploitation, which violates security best practices and compliance requirements. Option C is wrong because patching all VMs simultaneously would cause a complete application outage, as no VM would be available to serve traffic during the patching process. Option D is wrong because applying the patch during peak usage hours increases the risk of performance degradation or service disruption, and contradicts the standard practice of scheduling maintenance during low-traffic periods.

117

MCQhard

A cloud engineer is troubleshooting an issue where a virtual machine (VM) in a VPC cannot communicate with an on-premises database server through a site-to-site VPN. The VPN tunnel status shows 'UP' and the on-premises firewall logs show packets from the VM's public IP (but the VM is in a private subnet with no public IP). What is the MOST likely cause?

A.The on-premises firewall is blocking the VM's private IP address.

B.The VM's security group is blocking outbound traffic to the on-premises subnet.

C.The route table for the subnet is missing a route to the on-premises network via the VPN.

D.The VPN tunnel is misconfigured with mismatched pre-shared keys.

AnswerC

Without proper route, traffic may exit through an internet gateway or NAT, resulting in public IP source.

Why this answer

The VPN tunnel status is 'UP' and the on-premises firewall sees packets arriving from a public IP, even though the VM is in a private subnet with no public IP of its own. This indicates that the VM's traffic is being sent to the internet instead of through the VPN tunnel. The most likely cause is that the subnet's route table lacks a specific route directing traffic destined for the on-premises network to the virtual private gateway (VGW) or VPN connection. The traffic therefore follows the default route to a NAT device, which source-NATs it to a public IP before it leaves through the internet gateway (IGW).
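A route table picks the most specific matching route, so a missing on-premises route silently falls through to the default route. The sketch below illustrates that lookup; the CIDR blocks and target names are hypothetical.

```python
import ipaddress


def next_hop(route_table, destination):
    """Pick the most specific (longest-prefix) route that matches the destination."""
    dest = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dest in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Hypothetical addressing: VPC 10.0.0.0/16, on-premises 192.168.0.0/16.
broken = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}
fixed = {**broken, "192.168.0.0/16": "vpn-gateway"}

print(next_hop(broken, "192.168.10.5"))  # falls through to the default route
print(next_hop(fixed, "192.168.10.5"))   # the specific route wins
```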

Exam trap

The trap here is that candidates see 'VPN tunnel UP' and assume the VPN is fully functional, overlooking that routing (the route table) is a separate layer that must direct traffic into the tunnel; the tunnel being up does not guarantee traffic is being sent through it.

How to eliminate wrong answers

Option A is wrong because the on-premises firewall logs show packets from the VM's public IP, not its private IP, so the firewall is not blocking the private IP; the issue is that the private IP is not being used for the VPN traffic. Option B is wrong because security groups are stateful and, by default, allow all outbound traffic; even if outbound rules were restrictive, the traffic would still be dropped at the VM level, not appear at the on-premises firewall with a public IP. Option D is wrong because mismatched pre-shared keys would prevent the VPN tunnel from establishing, but the tunnel status is 'UP', indicating the Phase 1 and Phase 2 negotiations succeeded.

118

MCQmedium

Refer to the exhibit. A cloud administrator applies the bucket policy shown. After applying, users report that they can no longer access the prod-backup bucket using their applications. The applications use the AWS SDK with default configuration. What is the most likely reason?

A.The policy does not include an Allow statement.

B.The policy has a syntax error in the condition.

C.The applications are using HTTP instead of HTTPS.

D.The policy's principal "*" blocks all users.

AnswerC

Correct. The policy denies non-HTTPS requests, so if the SDK uses HTTP, access is denied.

Why this answer

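The policy denies any request that is not sent over HTTPS. Because an explicit Deny takes precedence over any Allow, applications whose SDK client sends requests to an HTTP endpoint are blocked; switching the client to HTTPS restores access.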
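The exhibit is not reproduced here, so the following is a hypothetical reconstruction of a typical "deny insecure transport" bucket policy, with a deliberately simplified check of how it treats HTTP and HTTPS requests. The `is_denied` helper is illustrative only and does not model full policy evaluation.

```python
# Hypothetical reconstruction; the actual exhibit is not shown in this section.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::prod-backup",
                "arn:aws:s3:::prod-backup/*",
            ],
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}


def is_denied(policy, uses_tls):
    """Simplified check: does any Deny statement match the request's transport?"""
    for statement in policy["Statement"]:
        condition = statement.get("Condition", {}).get("Bool", {})
        wanted = condition.get("aws:SecureTransport")
        if statement["Effect"] == "Deny" and wanted == str(uses_tls).lower():
            return True
    return False


print(is_denied(policy, uses_tls=False))  # plain HTTP request
print(is_denied(policy, uses_tls=True))   # HTTPS request
```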

119

Multi-Selectmedium

A cloud administrator is troubleshooting a connectivity issue between a web server and a database server in the same VPC but different subnets. The security group for the database server allows inbound traffic from the web server's security group. However, the web server cannot establish a TCP connection to the database. What are two possible causes? (Choose two.)

Select 2 answers

A.The security group of the web server does not allow outbound traffic to the database.

B.The network ACL of the web server subnet is blocking outbound traffic.

C.The network ACL of the database subnet is blocking inbound traffic.

D.The route table of the database subnet does not contain a route to the web server subnet.

E.The database server is not listening on the correct port.

AnswersB, C

The outbound NACL on the web server subnet may block the connection request.

Why this answer

Option B is correct because network ACLs are stateless and apply to subnet boundaries. Even if the web server's security group allows outbound traffic, the subnet's network ACL must explicitly allow outbound traffic to the database server's IP and port. If the outbound rule is missing or denies the traffic, the TCP SYN packet will be dropped before it leaves the subnet.

Option C is correct because the database subnet's network ACL must allow inbound traffic from the web server's IP and port; if it blocks the inbound SYN, the connection cannot be established.
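The difference between the two layers can be sketched as two small functions. This is a toy model under stated assumptions: each argument is a hypothetical boolean meaning "this rule set permits the traffic", not a real provider API.

```python
def sg_allows(web_outbound_ok, db_inbound_ok):
    """Stateful: once the request is allowed, the reply is permitted automatically."""
    return web_outbound_ok and db_inbound_ok


def nacl_allows(web_out, db_in, db_out, web_in):
    """Stateless: request and reply are each checked at every subnet boundary."""
    request_ok = web_out and db_in  # SYN leaving web subnet, entering DB subnet
    reply_ok = db_out and web_in    # SYN-ACK leaving DB subnet, entering web subnet
    return request_ok and reply_ok


# Security groups are fine, but the web subnet's NACL blocks outbound traffic.
print(sg_allows(True, True))
print(nacl_allows(web_out=False, db_in=True, db_out=True, web_in=True))
```

A single missing rule in any of the four NACL directions is enough to break the TCP handshake.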

Exam trap

The trap here is that candidates often assume security groups are the only firewall layer, forgetting that network ACLs at the subnet boundary are evaluated independently of security groups and, being stateless, need explicit rules for both directions.

120

Multi-Selecthard

A cloud administrator is investigating a performance issue with a cloud-based application. The application's response time has increased significantly. Monitoring shows low CPU and memory, but high network latency. Which two actions should the administrator take? (Choose two.)

Select 2 answers

A.Review the security group rules for any restrictive outbound rules.

B.Check the load balancer's connection draining settings.

C.Use a trace tool to identify network hops and bottlenecks.

D.Verify that the instances are in the same placement group.

E.Check for packet drops in the VPC flow logs.

AnswersC, E

Traceroute helps pinpoint where network delays occur.

Why this answer

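The correct answers are C and E. A trace tool (C) shows each network hop and where latency is introduced, and VPC flow logs (E) reveal dropped or rejected packets that can force retransmissions. Options A, B, and D are less relevant to high network latency when CPU and memory usage are low.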

121

MCQhard

A cloud administrator is managing a hybrid cloud environment where on-premises servers connect to a public cloud VPC via a site-to-site VPN. Users report intermittent connectivity issues to cloud resources. The administrator examines the VPN tunnel logs and sees 'Phase 2 negotiation failed' errors. Which of the following is the MOST likely cause?

A.Dead Peer Detection (DPD) is disabled on one side.

B.Incorrect pre-shared key used for the VPN tunnel.

C.Packet loss due to high latency on the internet link.

D.Mismatched encryption domain definitions (traffic selectors) between on-premises and cloud VPN gateways.

AnswerD

Phase 2 negotiates encryption domains; mismatched selectors cause failure.

Why this answer

Phase 2 negotiation failures in IPsec VPNs indicate that the two gateways cannot agree on the security associations (SAs) for encrypting data traffic. This is most commonly caused by mismatched encryption domain definitions (traffic selectors), such as differing local/remote subnets, protocols, or ports. When the on-premises and cloud VPN gateways define the allowed traffic differently, they cannot establish the Phase 2 SA, leading to intermittent connectivity.
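As a rough illustration, each gateway's local subnet should mirror the other side's remote subnet. The sketch below checks for an exact mirror; it is a simplification (some implementations can negotiate narrower selectors), and the subnets and the `selectors_match` helper are hypothetical.

```python
import ipaddress


def selectors_match(side_a, side_b):
    """Side A's local/remote subnets must mirror side B's remote/local subnets."""
    a_local = ipaddress.ip_network(side_a["local"])
    a_remote = ipaddress.ip_network(side_a["remote"])
    b_local = ipaddress.ip_network(side_b["local"])
    b_remote = ipaddress.ip_network(side_b["remote"])
    return a_local == b_remote and a_remote == b_local


# Hypothetical configuration: the cloud side expects a /16, on-premises offers a /24.
on_prem = {"local": "192.168.10.0/24", "remote": "10.0.0.0/16"}
cloud = {"local": "10.0.0.0/16", "remote": "192.168.0.0/16"}

print(selectors_match(on_prem, cloud))
```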

Exam trap

The trap here is that candidates often confuse Phase 1 and Phase 2 failures, incorrectly attributing the error to pre-shared key mismatches (Phase 1) instead of traffic selector mismatches (Phase 2).

How to eliminate wrong answers

Option A is wrong because Dead Peer Detection (DPD) is used to detect loss of a peer during Phase 1 or Phase 2, but disabling DPD does not cause Phase 2 negotiation failures; it only delays failure detection. Option B is wrong because an incorrect pre-shared key would cause Phase 1 (IKE) authentication to fail, not Phase 2 negotiation. Option C is wrong because packet loss or high latency can cause timeouts or retransmissions but does not directly cause a 'Phase 2 negotiation failed' error, which is a protocol-level mismatch.

122

MCQhard

A cloud administrator is troubleshooting an application that fails to connect to a database. The database server is in a private subnet, and the application server is in a public subnet. The security group for the database allows inbound traffic on port 3306 from the application's security group. Which of the following is the MOST likely reason the connection fails?

A.The network ACL for the database subnet is blocking inbound traffic on port 3306

B.The database server's operating system firewall is blocking the connection

C.The application's security group does not allow outbound traffic to the database

D.The route table for the public subnet does not have a route to the private subnet

AnswerA

NACLs are stateless and require explicit allow rules.

Why this answer

The network ACL (NACL) for the database subnet is the most likely cause because NACLs are stateless and apply to the entire subnet. Even if the database security group allows inbound traffic from the application's security group, the NACL must also allow inbound traffic on port 3306 (MySQL) from the application subnet. By default, custom NACLs deny all inbound traffic, so if the administrator did not explicitly add a rule for port 3306, the connection will be blocked at the subnet boundary.
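NACL rules are evaluated in ascending rule-number order, and traffic that matches no rule is denied. The sketch below models only port matching for inbound rules; the rule numbers and the `evaluate_nacl` helper are hypothetical.

```python
def evaluate_nacl(rules, port):
    """Return the action of the lowest-numbered matching rule, else implicit deny."""
    for _number, (low, high, action) in sorted(rules.items()):
        if low <= port <= high:
            return action
    return "deny"  # the implicit final rule


# A custom NACL that only allows HTTPS inbound.
custom_nacl = {100: (443, 443, "allow")}
print(evaluate_nacl(custom_nacl, 3306))

# Adding an explicit rule for MySQL fixes the inbound path.
custom_nacl[110] = (3306, 3306, "allow")
print(evaluate_nacl(custom_nacl, 3306))
```

Because NACLs are stateless, a matching outbound rule for the reply traffic (typically on ephemeral ports) is also needed.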

Exam trap

CompTIA often tests the difference between stateful security groups and stateless network ACLs, and the trap here is that candidates assume security group rules are sufficient for connectivity, forgetting that NACLs must also permit the traffic at the subnet boundary.

How to eliminate wrong answers

Option B is wrong because, although the operating system firewall on the database server could block the connection, nothing in the scenario points to a host firewall change, whereas a custom NACL denies traffic until a rule is added. Option C is wrong because security groups allow all outbound traffic by default and are stateful, so return traffic for an allowed connection is permitted automatically; an outbound block would require someone to have removed the default rule, which the scenario does not suggest. Option D is wrong because route tables control traffic routing between subnets, and by default, VPCs have an implicit local route that allows communication between all subnets within the VPC, so a missing route is not the problem here.

123

MCQhard

A cloud administrator is troubleshooting connectivity issues between two virtual networks in a public cloud. The networks are in the same region but different VPCs. Both VPCs have route tables and security groups configured. Instances in VPC A cannot ping instances in VPC B. Which of the following is the most likely cause?

A.VPC peering is not established between the two VPCs.

B.The instances are not assigned public IP addresses.

C.Security groups are blocking ICMP traffic.

D.Network ACLs are not configured to allow the traffic.

AnswerA

Without peering, traffic is isolated between VPCs.

Why this answer

VPC peering is a direct network connection between two VPCs that enables routing of traffic using private IPv4 or IPv6 addresses. Without an established VPC peering connection, instances in different VPCs cannot communicate, even if they are in the same region. Since the question states the VPCs are separate and no peering is mentioned, this is the most likely root cause of the connectivity failure.

Exam trap

The trap here is that candidates often focus on security groups or ACLs as the default answer for connectivity issues, but the fundamental prerequisite for cross-VPC communication is the existence of a VPC peering connection or a transit gateway, not just network access controls.

How to eliminate wrong answers

Option B is wrong because public IP addresses are not required for VPC-to-VPC communication; private IP routing via VPC peering or transit gateway is the standard method. Option C is wrong because, although a security group without an ICMP allow rule would block ping, the scenario gives no indication of that, and allowing ICMP would still not help while no path exists between the VPCs. Option D is wrong because network ACLs are stateless and must allow both inbound and outbound traffic, but they are not the primary enabler of cross-VPC connectivity; without VPC peering, no amount of ACL configuration will establish the link.

124

Multi-Selectmedium

A company is migrating to the cloud and needs to ensure high availability for a web application. The solution must tolerate the failure of an entire Availability Zone. Which three actions should the administrator take? (Choose three.)

Select 3 answers

A.Deploy the application across two or more Availability Zones.

B.Implement auto scaling with a minimum of one instance per AZ.

C.Use an Application Load Balancer with cross-zone load balancing enabled.

D.Use a single instance in a larger instance size.

E.Use a Multi-AZ RDS database.

AnswersA, B, E

This is fundamental to survive an AZ failure.

Why this answer

Option A is correct because deploying the application across two or more Availability Zones (AZs) ensures that if an entire AZ fails, the application remains available in the other AZ(s). This is a fundamental design pattern for achieving high availability in AWS, as AZs are physically separate data centers with independent power, cooling, and networking. By distributing resources across multiple AZs, the application can tolerate the failure of one AZ without service interruption.

Exam trap

The trap here is that candidates often confuse cross-zone load balancing (which optimizes traffic distribution) with the fundamental requirement of deploying resources across multiple AZs to achieve AZ failure tolerance, leading them to select option C as a necessary action when it is actually a default or optional feature that does not by itself provide AZ redundancy.

125

MCQeasy

A cloud operations team uses a configuration management tool to apply patches to hundreds of Linux servers. Recently, the automation script that applies security patches has been failing with an error: 'Package not found.' The administrator verifies that the patch repository URL is correct and that the servers have internet access. The script runs every Sunday at 2:00 AM and the failures started two weeks ago. The failed patches are all for the latest kernel update. What should the administrator check FIRST?

A.Verify that the package cache is being updated before the installation step.

B.Ensure the patch repository is reachable via DNS.

C.Check if the servers have sufficient disk space for the patch download.

D.Roll back to an earlier version of the patch script.

AnswerA

The package cache must be refreshed to recognize the latest kernel package; the script may have skipped that step.

Why this answer

The error 'Package not found' typically indicates that the repository metadata is outdated. On many Linux systems, the local package cache (e.g., apt or yum) needs to be updated before installing new packages. The script likely needs to run an update command before attempting to install the patch.
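One way to picture the fix is as an ordering constraint: refresh the metadata, then install. The sketch below only builds the command sequence and does not run anything; `patch_commands` and the package name are hypothetical, and the real script may use a different tool.

```python
def patch_commands(package, manager="apt"):
    """Return the commands to run, with the metadata refresh placed first."""
    if manager == "apt":
        return [
            ["apt-get", "update"],                  # refresh the package index
            ["apt-get", "install", "-y", package],  # then install the patch
        ]
    if manager == "yum":
        return [
            ["yum", "clean", "expire-cache"],       # force a metadata re-check
            ["yum", "install", "-y", package],
        ]
    raise ValueError(f"unsupported package manager: {manager}")


# Hypothetical package name for the kernel update.
for command in patch_commands("linux-image-generic"):
    print(" ".join(command))
```

If the installation step runs against a stale index, a package published after the last refresh (such as the latest kernel) is reported as not found.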

126

MCQeasy

A company is using a cloud provider's object storage for archival data. The data is rarely accessed but must be retained for 7 years for compliance. The current storage class is Standard, and the monthly bill is increasing. The cloud administrator wants to minimize costs while meeting compliance requirements. The data must be accessible within 24 hours if needed. Which of the following actions should the administrator take?

A.Compress the objects using server-side encryption.

B.Delete the objects after 7 years to save cost.

C.Move the data to a cold storage tier like Glacier or Archive.

D.Use a lifecycle policy to transition objects to a cheaper storage class after a period.

AnswerD

Lifecycle policies automate transition to lower-cost storage (e.g., after 30 days to Infrequent Access, then to Glacier), optimizing cost while retaining data.

Why this answer

Option D is correct because a lifecycle policy automates the transition of objects from a higher-cost storage class (e.g., Standard) to a cheaper class (e.g., S3 Glacier Deep Archive or Azure Archive) after a specified period, reducing costs while retaining data for the required 7 years. The data remains accessible within 24 hours via a restore request, meeting the compliance and retrieval time requirement.
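For illustration, a lifecycle rule of this kind might look like the configuration below, written in the shape that S3 lifecycle configurations use. The rule ID and the 30-day and 90-day thresholds are assumptions; choose real values from actual access patterns and pricing.

```python
import json

# Hypothetical thresholds and rule ID, shown only to illustrate the structure.
lifecycle = {
    "Rules": [
        {
            "ID": "archive-rarely-accessed-data",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix applies the rule to every object
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

print(json.dumps(lifecycle, indent=2))
```

Once the rule is attached to the bucket, objects move to the cheaper classes on schedule without manual intervention.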

Exam trap

The trap here is that candidates may think moving data to a cold storage tier (Option C) is the best immediate action, but they overlook that a lifecycle policy (Option D) automates the transition and avoids manual intervention, which is more efficient and cost-effective over time.

How to eliminate wrong answers

Option A is wrong because server-side encryption (e.g., AES-256) does not reduce storage costs; it only secures data at rest. Option B is wrong because deleting objects after 7 years does not address the immediate cost increase and violates the compliance requirement to retain data for the full 7 years. Option C is wrong because moving data directly to a cold storage tier like Glacier or Archive without a lifecycle policy is a manual, non-automated action that does not leverage cost-saving transitions over time; also, some cold tiers may have minimum storage durations that could increase costs if data is moved too early.

127

MCQhard

A cloud administrator receives an alert that a storage bucket containing sensitive customer data has been accessed from an unknown IP address at 3:00 AM. The bucket policy is configured to allow access only from the corporate VPN CIDR block (10.0.0.0/8). The administrator checks the access logs and sees that the request originated from 203.0.113.50, which is not within the allowed range. The bucket policy also includes a condition that restricts access to Secure Transport (SSL). What is the most likely reason the request succeeded despite the policy?

A.The access logs are spoofed; the request actually came from a corporate IP.

B.The policy has an Allow statement that permits all accesses using SSL, without restricting the source IP.

C.The unknown IP address is part of a misconfigured VPN client that still appears as the corporate CIDR.

D.The bucket policy is missing an explicit Deny statement for IP addresses outside the allowed range.

AnswerB

If the Allow statement only requires SSL but does not enforce the IP condition, then any SSL request would be allowed, bypassing the intended IP restriction.

Why this answer

The intended restriction needs both conditions to apply together: the source IP must fall inside the corporate CIDR block and the request must use Secure Transport (aws:SecureTransport). If the Allow statement checks only aws:SecureTransport and the source IP condition is missing or sits in a separate statement, any request made over SSL matches the Allow regardless of where it comes from. Option B identifies exactly this: the SSL-only Allow is too permissive, so the request from 203.0.113.50 succeeded.
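The gap can be shown with a simplified evaluator that treats all conditions in one statement as a logical AND. The two statements below are hypothetical, and `allowed` is an illustrative helper that models only these two condition keys, not full policy evaluation.

```python
import ipaddress


def allowed(statement, source_ip, uses_tls):
    """Simplified check of one Allow statement; all conditions must hold."""
    condition = statement.get("Condition", {})
    tls_required = condition.get("Bool", {}).get("aws:SecureTransport")
    if tls_required == "true" and not uses_tls:
        return False
    cidr = condition.get("IpAddress", {}).get("aws:SourceIp")
    if cidr and ipaddress.ip_address(source_ip) not in ipaddress.ip_network(cidr):
        return False
    return statement["Effect"] == "Allow"


# Hypothetical statements: one checks SSL only, the other checks SSL and source IP.
too_permissive = {
    "Effect": "Allow",
    "Condition": {"Bool": {"aws:SecureTransport": "true"}},
}
intended = {
    "Effect": "Allow",
    "Condition": {
        "Bool": {"aws:SecureTransport": "true"},
        "IpAddress": {"aws:SourceIp": "10.0.0.0/8"},
    },
}

print(allowed(too_permissive, "203.0.113.50", uses_tls=True))
print(allowed(intended, "203.0.113.50", uses_tls=True))
```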

128

MCQhard

A DevOps engineer is designing a CI/CD pipeline for a microservices application. The team wants to isolate each build job to avoid interference. Which cloud concept should be utilized?

A.Dedicated hosts

B.Containerization with orchestration

C.Virtual private cloud (VPC)

D.Serverless functions

AnswerB

Containers provide lightweight, isolated environments ideal for CI/CD jobs, and orchestration manages them.

Why this answer

Containerization with orchestration (e.g., Docker and Kubernetes) provides isolated runtime environments for each build job by packaging the application and its dependencies into lightweight containers. This ensures that build processes do not interfere with each other, as each container runs in its own isolated user space with dedicated resources, and orchestration manages scheduling, scaling, and lifecycle. This approach is ideal for CI/CD pipelines in microservices architectures where build isolation is critical.

Exam trap

CompTIA often tests the misconception that network-level isolation (VPC) or physical isolation (dedicated hosts) is required for build job isolation, when in fact containerization provides sufficient and more efficient isolation at the process level.

How to eliminate wrong answers

Option A is wrong because dedicated hosts provide physical server isolation but are overkill for build job isolation; they do not offer per-job isolation within the same host and incur higher cost and management overhead. Option C is wrong because a Virtual Private Cloud (VPC) is a network-level isolation construct for cloud resources, not a mechanism to isolate individual build jobs; it cannot prevent interference between processes running on the same compute instance. Option D is wrong because serverless functions (e.g., AWS Lambda) are stateless and ephemeral, but they are not designed for running CI/CD build jobs that require persistent storage, longer execution times, or custom runtime environments; they also lack the fine-grained resource isolation needed for concurrent builds.

129

Multi-Selecthard

Which THREE of the following are common causes of VM migration failures in a cloud environment? (Choose three.)

Select 3 answers

A.Expired software licenses on the target host

B.Incompatible CPU instruction sets or features between source and target hosts

C.Stale DNS records for the VM's hostname

D.Insufficient storage space on the target host

E.Network connectivity issues between the source and target hypervisors

AnswersB, D, E

Different CPU generations can prevent live migration.

Why this answer

Options B, D, and E are correct. Incompatible CPU instruction sets or features (B) between source and target hosts can prevent live migration, insufficient storage space (D) on the target host leaves no room for the VM's disks, and network connectivity issues (E) between the hypervisors interrupt the transfer. Option C is wrong: stale DNS records may cause name resolution problems but not migration failure.

Option A is wrong: license checks may block migration but are less common than resource issues.

130

MCQeasy

A user reports being unable to upload files to an S3 bucket named 'my-bucket'. The IAM policy attached to the user is shown in the exhibit. What is the most likely reason for the failure?

A.The policy requires a condition that is not met.

B.The policy does not include s3:PutObjectAcl, which is needed.

C.The policy has a typo in the Action field.

D.The bucket policy denies the upload.

AnswerB

Many upload operations require PutObjectAcl.

Why this answer

The IAM policy grants s3:PutObject, which allows uploading an object, but it does not include s3:PutObjectAcl. When uploading to an S3 bucket, if the bucket is configured to require bucket-owner-full-control ACLs (e.g., via a bucket policy or default settings), the upload will fail unless the user also has permission to set the ACL. The s3:PutObjectAcl action is necessary to specify the ACL during the PUT request, and its absence is the most likely cause of the failure.

Exam trap

CompTIA often tests the distinction between s3:PutObject and s3:PutObjectAcl, trapping candidates who assume that upload permission alone is sufficient when ACL requirements are enforced.

How to eliminate wrong answers

Option A is wrong because the exhibit does not show any condition block in the policy, so there is no condition to be unmet. Option C is wrong because 's3:PutObject' is a valid action with correct casing and syntax, so there is no typo. Option D is wrong because the bucket policy is not mentioned in the scenario; the failure is attributed to the user's IAM policy, and bucket policies are separate from IAM policies.

131

MCQeasy

A cloud administrator notices that a web application is experiencing intermittent latency spikes. The application runs on a load-balanced set of virtual machines in a public cloud. Which of the following should the administrator investigate FIRST?

A.Verify that the VMs are in the same availability zone.

B.Inspect the network ACLs for any recent changes.

C.Check the load balancer health checks to ensure all instances are healthy.

D.Review the application logs for errors or performance issues.

AnswerD

Application logs can reveal slow queries, resource contention, or code errors that cause intermittent latency.

Why this answer

Option D is correct because intermittent latency spikes in a load-balanced web application are most effectively diagnosed by first reviewing application logs for errors or performance issues. Application logs can reveal slow database queries, memory exhaustion, or code-level bottlenecks that cause latency, which is the most direct source of evidence before investigating network or infrastructure components.

Exam trap

CompTIA often tests the principle of starting with the most direct evidence (application logs) rather than jumping to network or infrastructure checks, trapping candidates who assume latency must be a network issue.

How to eliminate wrong answers

Option A is wrong because VMs in the same availability zone do not prevent latency spikes; in fact, placing all VMs in one zone increases risk of zone failure and does not address intermittent performance issues. Option B is wrong because network ACLs are stateless and changes would typically cause consistent connectivity failures or denials, not intermittent latency spikes. Option C is wrong because load balancer health checks only verify instance reachability and basic responsiveness, not application-level performance; healthy instances can still suffer from internal latency issues.

132

MCQmedium

An organization uses a cloud-based monitoring service to track CPU utilization across a fleet of virtual machines. The administrator notices that one VM consistently shows 100% CPU utilization at the same time each day. Which of the following should the administrator do NEXT?

A.Add the VM to an auto scaling group to distribute the load

B.Immediately increase the VM size to accommodate the peak

C.Check the VM's local task scheduler for any jobs running during the peak times

D.Scan the VM for malware that might be causing the activity

AnswerC

Scheduled tasks could be the cause of the recurring CPU spike.

Why this answer

Option C is correct because the consistent daily spike in CPU utilization at the same time strongly suggests a scheduled task or cron job is triggering the load. Checking the VM's local task scheduler (e.g., Task Scheduler on Windows or cron on Linux) is the logical first step to identify the specific process causing the spike before taking any remediation actions.
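On Linux, one quick way to run this check is to list crontab entries scheduled for the hour of the spike. The sketch below handles only plain numeric hour fields (no ranges, lists, or step values); the sample crontab and script paths are hypothetical.

```python
def jobs_at_hour(crontab_text, hour):
    """List commands whose hour field equals the given hour (simplified parser)."""
    found = []
    for line in crontab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split(None, 5)  # minute, hour, dom, month, dow, command
        if len(fields) < 6:
            continue  # skip environment assignments and malformed lines
        if fields[1] == str(hour):
            found.append(fields[5])
    return found


# Hypothetical crontab contents.
sample = """
# m h dom mon dow command
0 2 * * * /usr/local/bin/nightly-report.sh
30 14 * * 1 /usr/local/bin/weekly-cleanup.sh
"""

print(jobs_at_hour(sample, 2))
```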

Exam trap

The trap here is that candidates may jump to scaling or security responses (auto scaling, resizing, malware scan) without first performing basic troubleshooting to identify the predictable, recurring process causing the CPU spike.

How to eliminate wrong answers

Option A is wrong because adding the VM to an auto scaling group does not distribute the load within a single VM; auto scaling adds or removes instances horizontally, which would not address a local process consuming 100% CPU on one VM. Option B is wrong because immediately increasing the VM size (vertical scaling) is a reactive, costly approach that does not identify the root cause; the administrator should first investigate what is causing the spike. Option D is wrong because while malware can cause high CPU usage, the predictable daily pattern at the same time is more indicative of a scheduled job than malware, which typically exhibits random or persistent activity.

133

MCQhard

A cloud administrator runs the `iostat` command on a Linux VM experiencing slow performance. Based on the exhibit, what is the most likely bottleneck?

A.Disk I/O is saturated.

B.Network bandwidth is limited.

C.CPU is overloaded.

D.Memory is insufficient.

AnswerA

High %iowait and disk utilization indicate I/O bottleneck.

Why this answer

The `iostat` command reports CPU and I/O statistics. The exhibit shows high `%util` (e.g., 99.9%) and elevated `await` or `svctm` values, indicating that the disk device is operating at or near its maximum capacity. This means the disk I/O subsystem is saturated, causing requests to queue and slowing overall VM performance.
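Reading the device table can be automated by locating the `%util` column in the header and flagging devices near 100%. The sample below is hypothetical and trimmed to a few columns; real `iostat -x` output has more columns, and their names vary between versions.

```python
def saturated_devices(iostat_text, threshold=90.0):
    """Return device names whose %util is at or above the threshold."""
    rows = [line.split() for line in iostat_text.strip().splitlines() if line.strip()]
    header = rows[0]
    util_col = header.index("%util")  # locate the column by name, not position
    return [row[0] for row in rows[1:] if float(row[util_col]) >= threshold]


# Hypothetical, trimmed device table (not the exhibit).
sample = """
Device   r/s     w/s     await   %util
sda      850.0   420.0   95.3    99.9
sdb      2.0     1.0     0.8     0.4
"""

print(saturated_devices(sample))
```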

Exam trap

The trap here is that candidates may misinterpret high `%util` as a CPU bottleneck because `iostat` also displays CPU stats, but the question specifically asks about the bottleneck indicated by the exhibit, which clearly points to disk I/O saturation.

How to eliminate wrong answers

Option B is wrong because `iostat` does not measure network bandwidth; network issues would be diagnosed with tools like `netstat`, `ss`, or `iperf`. Option C is wrong because `iostat` shows CPU statistics (e.g., `%user`, `%system`), and if the CPU were overloaded, those values would be high while disk `%util` might remain low; the exhibit indicates disk saturation, not CPU exhaustion. Option D is wrong because insufficient memory would manifest as high swap usage or out-of-memory (OOM) events, not as high disk `%util`; memory issues are diagnosed with `free`, `vmstat`, or `top`.

134

Multi-Selectmedium

A cloud operations team is analyzing a security incident in which an unauthorized user accessed a storage bucket. The bucket was configured with public access. Which three best practices should the team implement to prevent such incidents in the future? (Select THREE).

Select 3 answers

A.Enable access logging and monitor logs.

B.Enable encryption on the storage bucket.

C.Implement least privilege access policies.

D.Enable versioning on the bucket.

E.Use bucket policies to deny all public access.

AnswersA, C, E

Correct. Logging provides visibility into access patterns and aids in detecting anomalies.

Why this answer

Implementing least privilege, denying public access, and enabling access logging help prevent and detect unauthorized access.

135

MCQeasy

A cloud administrator sets up a monitoring alarm to trigger when CPU utilization exceeds 90% for 5 minutes. The alarm uses a period of 5 minutes and an evaluation period of 1. The alarm does not trigger even though CPU spikes above 90% for several minutes. What is the most likely cause?

A.The monitoring service does not support CPU utilization.

B.The alarm's evaluation period is set to 2 periods.

C.The alarm's statistic is set to 'Average', which smooths out short spikes.

D.The alarm action is not configured.

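Answer: C

The 5-minute average might not exceed 90% if the spike is not sustained for most of the period.

Why this answer

Option C is correct because the Average statistic is computed over the whole 5-minute period, so a spike that lasts only part of the period is diluted by the lower readings around it and the period's value can stay below 90%. Option A is wrong because the alarm was created on CPU utilization, so the metric is supported. Option B is wrong because the scenario states that the evaluation period is 1. Option D is wrong because a missing action affects what happens after the alarm fires, not whether the threshold is breached.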
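The smoothing effect is easy to see with numbers. The sketch below uses five made-up one-minute CPU readings (hypothetical values, not taken from the question) and compares the Average and Maximum statistics against the 90% threshold.

```python
from statistics import mean

THRESHOLD = 90  # percent, from the alarm definition in the question

# Hypothetical one-minute CPU readings within a single 5-minute period:
# three minutes above the threshold, two minutes well below it.
samples = [95, 96, 94, 60, 55]

average = mean(samples)   # (95 + 96 + 94 + 60 + 55) / 5 = 80
maximum = max(samples)    # 96

average_breaches = average > THRESHOLD
maximum_breaches = maximum > THRESHOLD

print(average, average_breaches)  # 80 False
print(maximum, maximum_breaches)  # 96 True
```

With these readings an alarm on Average stays quiet, while the same alarm on Maximum would fire.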

136

MCQ · medium

A company uses a cloud-based object storage service to store backups. The backups must be retained for seven years to meet compliance requirements. Which storage tier should be used to minimize cost while meeting the retention requirement?

A.Archival storage with a minimum retention period of 90 days.

B.Cold storage with a retrieval time of several hours.

C.Infrequent access storage with a retrieval fee.

D.Standard storage with lifecycle policies to delete after seven years.

Answer: B

Correct. Cold storage offers low cost for long-term retention, and retrieval time is acceptable for backup restoration.

Why this answer

Cold storage (e.g., Amazon S3 Glacier) is designed for long-term retention of infrequently accessed data. It offers a low cost per GB, which suits a seven-year compliance retention requirement.

137

Matching · medium

Match each acronym to its definition.

Concepts and matches

SaaS: Software as a Service

PaaS: Platform as a Service

IaaS: Infrastructure as a Service

FaaS: Function as a Service

DaaS: Desktop as a Service

Why these pairings

These are common cloud service models tested in Cloud+.

138

MCQ · easy

A company has a policy that all cloud resources must be tagged with 'CostCenter' and 'Project' tags. The cloud operations team uses a monitoring tool to alert when untagged resources are created. The team receives an alert for a new EC2 instance that lacks the required tags. The instance was launched two hours ago by a DevOps engineer who is on leave. The instance is critical for production. What should the administrator do to resolve the compliance violation?

A.Terminate the instance immediately and launch a new one with proper tags.

B.Apply the required tags to the existing instance using the cloud provider's console or CLI.

C.Ignore the alert because the instance is critical and the engineer will fix it when back.

D.Modify the tag policy to exempt instances launched by senior engineers.

Answer: B

Tagging can be applied post-creation; this resolves the compliance issue without affecting operations.

Why this answer

The best practice is to apply the missing tags to the existing instance, because it is a critical production resource and tagging does not interrupt it. Terminating the instance (A) would cause downtime, ignoring the alert (C) leaves the violation in place, and exempting senior engineers (D) weakens the policy instead of fixing the violation.
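Before applying tags, it helps to know exactly which ones are missing. The following is a minimal sketch of that check; the required tag keys come from the question, while the function name and the sample instance tags are hypothetical.

```python
REQUIRED_TAGS = {"CostCenter", "Project"}  # from the policy in the question


def missing_tags(resource_tags):
    """Return the required tag keys that are absent or have an empty value."""
    present = {key for key, value in resource_tags.items() if value}
    return REQUIRED_TAGS - present


# Hypothetical tags on the flagged instance.
instance_tags = {"Name": "prod-web"}

to_add = sorted(missing_tags(instance_tags))
print(to_add)  # ['CostCenter', 'Project']
```

The result is the list of keys to add through the console or CLI; the instance itself keeps running throughout.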

139

MCQ · easy

An Azure administrator runs the command and gets the output shown. The virtual machine 'web-01' is not accessible over the network. Which of the following is the MOST likely reason?

A.The VM name is incorrect.

B.The VM is stopped and deallocated.

C.The VM failed to provision.

D.The VM is in the wrong region.

Answer: B

PowerState shows deallocated, so the VM is not running.

Why this answer

Option B is correct because the VM is deallocated, meaning it is not running and cannot be reached. Option A is wrong because the name is correct. Option C is wrong because the provisioning state is Succeeded, not Failed. Option D is wrong because the location is correct.
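The reasoning above amounts to reading two status fields in order. The exhibit is not reproduced here, so the sketch below uses a made-up stand-in dictionary; the field names `provisioningState` and `powerState` and the function name are assumptions for illustration.

```python
def diagnose(vm):
    """Classify a VM from its provisioning and power state fields."""
    if vm.get("provisioningState") != "Succeeded":
        return "provisioning problem"
    power = vm.get("powerState", "").lower()
    if "deallocated" in power:
        return "stopped and deallocated"
    if "running" in power:
        return "running"
    return "unknown"


# Hypothetical stand-in for the command output described in the question.
sample = {
    "name": "web-01",
    "provisioningState": "Succeeded",
    "powerState": "VM deallocated",
}

result = diagnose(sample)
print(result)  # stopped and deallocated
```

Provisioning succeeded, so the deallocated power state is the only field that explains why the VM is unreachable.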

140

MCQ · easy

A company's cloud environment has experienced a sudden spike in network traffic, causing a critical application to become unresponsive. Which of the following is the FIRST step the cloud administrator should take to address this issue?

A.Restart the application server to restore service.

B.Analyze the network traffic logs to identify the source of the spike.

C.Contact the cloud provider to report the issue.

D.Increase the bandwidth for the affected application.

Answer: B

Log analysis is the first step in troubleshooting to understand the cause of the spike.

Why this answer

Option B is correct because the first step in troubleshooting is to identify the problem, which involves analyzing traffic logs. Option A is wrong because restarting may resolve symptoms but not the root cause. Option C is wrong because contacting the provider should be done after internal diagnosis. Option D is wrong because increasing bandwidth without understanding the cause can waste resources.

141

MCQ · hard

An organization uses a cloud management platform (CMP) to orchestrate resources across multiple cloud providers. The CMP has a policy that automatically terminates any VM that exceeds 85% CPU utilization for more than 15 minutes. The operations team receives complaints that some VMs are being terminated while performing legitimate batch processing jobs. What should the operations team do to resolve this issue?

A.Create a separate VM group with a different policy that allows higher CPU for longer.

B.Disable the automatic termination policy.

C.Add an exclusion list for known batch processing VMs.

D.Increase the CPU threshold to 95% and extend the duration to 30 minutes.

Answer: C

Excluding specific VMs allows the policy to remain effective while protecting batch jobs.

Why this answer

Option C is correct because adding an exclusion list for known batch processing VMs prevents unnecessary termination without disabling the entire policy. Option A is wrong because creating a separate group is more complex than simply excluding specific VMs. Option B is wrong because disabling the policy removes an important automated safeguard for every other VM. Option D is wrong because raised thresholds may still terminate long-running batch jobs.
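The effect of an exclusion list can be sketched in a few lines. This is not how any particular CMP implements its policy engine; the class, its field names, and the VM names are hypothetical, and only the 85% and 15-minute values come from the question.

```python
from dataclasses import dataclass, field


@dataclass
class TerminationPolicy:
    cpu_threshold: float = 85.0   # percent, from the question
    max_minutes: int = 15         # minutes, from the question
    excluded: set = field(default_factory=set)

    def should_terminate(self, vm_name, cpu_percent, minutes_over):
        """Excluded VMs are never terminated; others follow the threshold."""
        if vm_name in self.excluded:
            return False
        return cpu_percent > self.cpu_threshold and minutes_over > self.max_minutes


# Hypothetical VM names.
policy = TerminationPolicy(excluded={"batch-etl-01"})

batch_result = policy.should_terminate("batch-etl-01", 97.0, 120)
web_result = policy.should_terminate("web-03", 97.0, 20)
print(batch_result, web_result)  # False True
```

The batch VM is spared even after two hours at 97% CPU, while the policy still applies unchanged to every other VM.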

142

Multi-Select · hard

A cloud security team is investigating a potential data breach. Which THREE actions should be taken immediately?

Select 3 answers

A.Delete all logs to prevent further evidence exposure

B.Isolate the affected systems from the network

C.Capture a forensic snapshot of the affected storage

D.Notify all users via email

E.Preserve logs and system state

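Answers: B, C, E

Isolation contains the breach and prevents lateral movement.

Why this answer

Isolating affected systems from the network (B) is a critical immediate action to contain a potential data breach. This prevents lateral movement by an attacker and stops further data exfiltration, aligning with incident response best practices such as NIST SP 800-61. In a cloud environment, this could involve applying a security group rule to deny all traffic or disconnecting a virtual network interface. Capturing a forensic snapshot (C) and preserving logs and system state (E) protect the evidence needed for the investigation. Deleting logs (A) destroys that evidence, and notifying all users (D) is premature before the scope of the breach is known.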

Exam trap

CompTIA often tests the misconception that deleting logs or notifying all users immediately is a valid first response, when in fact containment and evidence preservation are the top priorities.

143

MCQ · medium

A load balancer log entry shows the above for a request. What is the MOST likely cause of the 504 error?

A.The DNS resolution for the domain name has failed.

B.The backend server took too long to respond to the request.

C.The requested resource does not exist on the backend server.

D.The load balancer's health check is misconfigured.

Answer: B

The 30s response time exceeds typical timeouts, causing the gateway to time out.

Why this answer

A 504 Gateway Timeout error from a load balancer indicates that the load balancer sent the request to a backend server but did not receive a timely response. The load balancer has a configured timeout value (often 30-120 seconds), and if the backend server fails to respond within that window, the load balancer terminates the connection and returns a 504. This is the most common cause of 504 errors in load-balanced environments.
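The timeout behavior described above can be modeled as a simple decision. This is a simplified sketch, not any specific load balancer's logic; the function name and the 60-second timeout are hypothetical.

```python
def gateway_status(backend_seconds, backend_status, timeout_seconds):
    """Return the status code the client sees from the load balancer.

    If the backend takes longer than the configured timeout, the load
    balancer gives up and answers 504 itself; otherwise it forwards
    whatever status the backend returned.
    """
    if backend_seconds > timeout_seconds:
        return 504
    return backend_status


# Hypothetical 60-second timeout.
slow = gateway_status(backend_seconds=75.0, backend_status=200, timeout_seconds=60.0)
missing = gateway_status(backend_seconds=0.2, backend_status=404, timeout_seconds=60.0)
print(slow, missing)  # 504 404
```

The second call shows why a missing resource does not produce a 504: the backend answers quickly, so its 404 is passed through.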

Exam trap

CompTIA often tests the distinction between 502 (bad gateway, often DNS or upstream connection failure) and 504 (gateway timeout, upstream response delay), and candidates mistakenly attribute 504 errors to health check failures or DNS issues.

How to eliminate wrong answers

Option A is wrong because a DNS resolution failure would typically result in a 502 Bad Gateway error (the load balancer cannot resolve the backend server's hostname) or a 503 Service Unavailable, not a 504 timeout. Option C is wrong because a missing resource on the backend server would return a 404 Not Found response from the backend itself, which the load balancer would forward to the client; the load balancer does not generate a 504 for missing resources. Option D is wrong because a misconfigured health check would cause the load balancer to mark the backend as unhealthy and stop sending traffic to it, resulting in a 503 Service Unavailable error, not a 504 timeout.

144

MCQ · hard

Refer to the exhibit. A cloud administrator sees this log after a nightly backup job. Which of the following is the most likely cause of the timeout?

A.The volume has a high I/O load during the snapshot.

B.The volume is attached to an instance that is powered off.

C.The snapshot target region is unreachable.

D.The backup agent is not installed.

Answer: A

High I/O can slow snapshot creation and cause timeouts.

Why this answer

Option A is correct because high I/O load on the volume during the snapshot can cause the snapshot creation to exceed the timeout. Option B is incorrect because a powered-off instance does not affect snapshot creation. Option C is incorrect because the timeout is for creation, not transfer to another region. Option D is incorrect because a missing backup agent would generate a different error.

145

MCQ · hard

Refer to the exhibit. A cloud administrator deployed a Compute Engine instance in the default VPC. The instance has a public IP but cannot be accessed via SSH from the internet. The firewall rules allow ingress on port 22 from 0.0.0.0/0. What is the most likely cause?

A.The instance does not have a firewall tag matching the SSH rule.

B.The instance is not assigned to a service account.

C.The subnetwork does not have an external IP range.

D.The instance is running an incompatible OS.

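Answer: A

If the SSH rule is targeted to a specific network tag that the instance lacks, the rule does not apply.

Why this answer

Option A is correct. In GCP, firewall rules can target instances by network tags. If the SSH rule targets a tag that the instance does not have, the rule is not applied to that instance, and ingress on port 22 remains blocked. Option B is wrong because a service account controls the instance's identity for API calls, not inbound network access. Option C is wrong because the instance already has a public IP. Option D is wrong because nothing in the scenario points to an OS problem.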
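The tag-matching behavior can be sketched as a set intersection. This is a simplified model that ignores service-account targets and rule priorities; the function name and the tag values are hypothetical.

```python
def rule_applies(rule_target_tags, instance_tags):
    """Decide whether a firewall rule applies to an instance.

    A rule with no target tags applies to every instance in the network;
    a rule with target tags applies only to instances that carry at
    least one of them.
    """
    if not rule_target_tags:
        return True
    return bool(set(rule_target_tags) & set(instance_tags))


# Hypothetical SSH rule that targets the tag "allow-ssh".
ssh_rule_tags = ["allow-ssh"]

untagged = rule_applies(ssh_rule_tags, [])
tagged = rule_applies(ssh_rule_tags, ["allow-ssh", "web"])
print(untagged, tagged)  # False True
```

The rule allows port 22 from 0.0.0.0/0 in both cases, but only the tagged instance is covered by it.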

146

MCQ · hard

Refer to the exhibit. What action should the administrator take to bring the environment into compliance?

A.Modify the policy file to accept t2.large and t2.xlarge.

B.Ignore the compliance check because it is advisory.

C.Resize the non-compliant VMs to t2.medium.

D.Delete the non-compliant VMs.

Answer: C

Resizing brings VMs into compliance with the existing policy.

Why this answer

Option C is correct because the exhibit shows that the environment has a compliance policy requiring all VMs to be of type t2.medium, but two VMs are running as t2.large and t2.xlarge. Resizing the non-compliant VMs to t2.medium brings them into alignment with the policy, ensuring the environment meets the defined compliance standard. This action directly remediates the violation without unnecessary deletion or ignoring the check.
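The compliance check described here boils down to comparing each VM's type against the allowed set. The sketch below is one possible way to express it; the VM names and the function name are hypothetical, and only the instance types come from the explanation above.

```python
ALLOWED_TYPES = {"t2.medium"}  # the type required by the policy


def non_compliant(vms):
    """Return the VMs whose instance type is not allowed by the policy."""
    return {name: vm_type for name, vm_type in vms.items()
            if vm_type not in ALLOWED_TYPES}


# Hypothetical inventory with two violations.
inventory = {"vm-a": "t2.medium", "vm-b": "t2.large", "vm-c": "t2.xlarge"}

to_resize = non_compliant(inventory)
print(to_resize)  # {'vm-b': 't2.large', 'vm-c': 't2.xlarge'}
```

The output is the resize worklist; the policy itself stays untouched.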

Exam trap

CompTIA often tests the misconception that compliance checks are merely advisory warnings, but in reality, they are enforceable policies that require active remediation to avoid audit failures or security risks.

How to eliminate wrong answers

Option A is wrong because modifying the policy file to accept t2.large and t2.xlarge would change the compliance standard rather than enforcing it, which defeats the purpose of the policy and could lead to resource over-provisioning or cost issues. Option B is wrong because compliance checks are not advisory; they are mandatory enforcement mechanisms in cloud governance frameworks, and ignoring a non-compliant state can result in security vulnerabilities or audit failures. Option D is wrong because deleting the non-compliant VMs is an overly aggressive action that would cause data loss and service disruption when a simple resize operation can achieve compliance without destroying resources.

147

MCQ · easy

A cloud administrator notices that a cloud-based application is running slowly. The administrator checks the cloud monitoring dashboard and sees that CPU utilization is at 95% for the application server. Which of the following should the administrator do first?

A.Add an additional network interface.

B.Scale out by adding another instance.

C.Reboot the server.

D.Increase the memory allocation.

Answer: B

Adding instances distributes the CPU load, alleviating the bottleneck.

Why this answer

Option B is correct because scaling out by adding another instance distributes the load, reducing CPU pressure. Option A is wrong because it addresses network capacity, which is not indicated as the issue. Option C is wrong because a reboot provides only temporary relief. Option D is wrong because more memory may not help if the bottleneck is CPU.

148

MCQ · easy

A cloud administrator needs to automate the patching of operating systems on a fleet of EC2 instances. Which AWS service should be used?

A.AWS Update Manager

B.AWS Config

C.Amazon Inspector

D.AWS Systems Manager Patch Manager

Answer: D

Designed for automated patching.

Why this answer

AWS Systems Manager Patch Manager automates the process of patching managed nodes with both security-related and other types of updates. It uses a patch baseline to define which patches should be installed and can schedule patching across a fleet of EC2 instances, making it the correct service for this task.

Exam trap

The trap here is that candidates may confuse Amazon Inspector (which only identifies vulnerabilities) with a patching solution, or assume 'AWS Update Manager' is a real service because it sounds plausible, when in fact the correct service is AWS Systems Manager Patch Manager.

How to eliminate wrong answers

Option A is wrong because AWS Update Manager is not a real AWS service; the correct service for update management is AWS Systems Manager Patch Manager. Option B is wrong because AWS Config is a service for evaluating, auditing, and assessing the configurations of your resources against desired policies, not for automating patch installation. Option C is wrong because Amazon Inspector is a vulnerability management service that scans for software vulnerabilities and unintended network exposure, but it does not perform patching itself.

149

MCQ · medium

A cloud administrator is reviewing a security audit report that shows an instance has been sending outbound traffic to a known malicious IP address. The instance hosts a production application. Which action should the administrator take first?

A.Run an antivirus scan on the instance.

B.Immediately shut down the instance.

C.Isolate the instance by modifying the security group to deny all traffic.

D.Review the application logs to understand the traffic.

Answer: C

Containment is the first priority in incident response. Isolating via security group is quick and reversible.

Why this answer

Option C is correct because the first priority in a confirmed security incident is containment. Modifying the security group to deny all traffic immediately stops the outbound communication to the malicious IP, preventing data exfiltration or further compromise while preserving the instance for forensic analysis. This aligns with the incident response process, where isolation precedes investigation.

Exam trap

The trap here is that candidates may choose immediate shutdown (Option B) thinking it is the fastest containment method, but the exam emphasizes preserving forensic evidence and using network-level isolation (security groups) as the correct first step in incident response.

How to eliminate wrong answers

Option A is wrong because running an antivirus scan is a remediation step that should occur after containment; it does not stop ongoing malicious outbound traffic and may be ineffective against advanced threats. Option B is wrong because immediately shutting down the instance destroys volatile data (e.g., running processes, memory contents) that could be critical for forensic investigation, and it may cause unnecessary downtime for the production application. Option D is wrong because reviewing application logs is a post-containment investigative action; delaying containment to first review logs risks continued data exfiltration or lateral movement.

150

MCQ · easy

A cloud administrator for a large e-commerce company receives an alert that the response time of the production web application has increased by 300% in the last hour. The application is deployed on a Kubernetes cluster with three worker nodes, each running multiple pods. The administrator checks the Kubernetes dashboard and notices that one of the worker nodes has a CPU utilization of 95%, while the other two nodes are at 40%. The pods on the overloaded node are experiencing resource throttling. The application uses a deployment with a replica count of 6, and all pods are currently running. The administrator needs to resolve the performance issue quickly and prevent future occurrences. Which of the following actions should the administrator take FIRST?

A.Increase the replica count from 6 to 12 to distribute load across more pods.

B.Add a new worker node to the cluster to increase overall capacity.

C.Cordon and drain the overloaded node to redistribute its pods to other nodes.

D.Modify the deployment to set resource requests and limits for CPU on all pods.

Answer: C

This immediately relieves the overloaded node and redistributes pods, restoring performance.

Why this answer

Option C is correct because the immediate symptom is an overloaded node (95% CPU) causing resource throttling on its pods. Cordoning the node stops new pods from being scheduled onto it, and draining it safely evicts the existing pods so that the Kubernetes scheduler can place them on the underutilized nodes (40% CPU). This quickly resolves the performance bottleneck without adding unnecessary resources or changing the deployment configuration.
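For reference, the two steps correspond to the `kubectl cordon` and `kubectl drain` commands. The sketch below only builds and prints the command lines rather than running them; the node name `worker-2` and the helper function are hypothetical, and the `--ignore-daemonsets` flag is included on the assumption that the node runs DaemonSet-managed pods.

```python
import shlex


def cordon_and_drain_commands(node):
    """Build the kubectl command lines for taking load off a node."""
    return [
        # Mark the node unschedulable so no new pods land on it.
        ["kubectl", "cordon", node],
        # Evict the node's pods; DaemonSet-managed pods are skipped.
        ["kubectl", "drain", node, "--ignore-daemonsets"],
    ]


commands = cordon_and_drain_commands("worker-2")  # hypothetical node name
for argv in commands:
    print(shlex.join(argv))
```

`kubectl drain` also marks the node unschedulable, so the explicit cordon mainly makes the intent visible. Once the load is rebalanced, `kubectl uncordon` returns the node to service.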

Exam trap

The trap here is that candidates often choose to increase replica count or add nodes as a knee-jerk reaction to high CPU, but the correct first step is to isolate and redistribute the load from the overloaded node using cordon and drain, which directly addresses the immediate throttling without over-provisioning.

How to eliminate wrong answers

Option A is wrong because increasing the replica count from 6 to 12 would create more pods, but the overloaded node would still be at 95% CPU and the new pods might be scheduled onto the same node, worsening the throttling. Option B is wrong because adding a new worker node increases cluster capacity but is a slower, more disruptive action; the immediate issue can be resolved by redistributing existing pods to the two nodes with 40% CPU utilization. Option D is wrong because setting resource requests and limits is a preventive measure that should have been configured before deployment; it does not resolve the current performance issue and would require a rolling update, which takes time and does not immediately relieve the overloaded node.
