CCNA Troubleshooting Questions

75 of 99 questions · Page 1/2 · Troubleshooting topic · Answers revealed

1

MCQ · medium

A company uses a cloud-based load balancer to distribute traffic to web servers. Recently, a new security policy was applied that restricts traffic to certain geographic regions. Users from an allowed region report they cannot access the website. The load balancer status shows health checks are passing. What should the administrator check?

A. The DNS resolution for the website

B. The SSL certificate expiration

C. The web server logs for application errors

D. The load balancer's access control lists (ACLs)

Answer: D

ACLs enforce geographic restrictions and could be misconfigured, blocking allowed regions.

Why this answer

Option D is correct because the geographic restriction policy is most likely implemented through ACLs on the load balancer, and a misconfigured rule can block an allowed region. Option A is wrong because a DNS failure would affect all users, not just a specific region. Option B is wrong because SSL certificate issues would cause browser warnings, not complete inaccessibility.

Option C is wrong because passing health checks indicate the web servers are fine.

2

MCQ · medium

A cloud administrator is starting the nginx web server on a new cloud VM but it fails. According to the exhibit, what is the most likely cause of the failure?

A. Another service is already listening on port 80

B. The VM does not have network connectivity

C. SELinux is blocking nginx from binding to the port

D. The nginx configuration file has a syntax error

Answer: A

The error explicitly states address already in use.

Why this answer

The error 'bind() to 0.0.0.0:80 failed (98: Address already in use)' indicates that port 80 is already occupied by another process, so Option A is correct. Option B is wrong because the error concerns a local port binding, not network connectivity.

Option C is wrong because SELinux would give a permission denied error, not address in use. Option D is wrong because the error is about binding, not about configuration syntax.
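The "address already in use" condition can be reproduced with a small probe. The following is a minimal sketch, not part of the original exhibit; the helper name `port_in_use` is hypothetical, and it simply tries to bind to the port and checks for the same errno that nginx reported.

```python
import errno
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            # Anything else (for example a permission error on a
            # privileged port) is a different problem, so re-raise it.
            raise
        return False
```

Calling `port_in_use(8080)` on a host where another process already listens on 8080 should return `True`. Probing port 80 itself usually requires elevated privileges.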

3

Multi-Select · hard

Which THREE are common reasons why a cloud database instance may become unreachable?

Select 3 answers

A. Firewall rules blocking the database port

B. Incorrect connection string in the application

C. Storage volume is full on the database server

D. Database service not started

E. Hypervisor maintenance causing VM reboot

Answers: A, B, D

Security groups or firewalls can block inbound traffic.

Why this answer

Options A, B, and D are correct. Firewall rules block access, an incorrect connection string points the application at the wrong endpoint, and a database service that is not running cannot accept connections. Option C is incorrect because a full storage volume might cause write failures but does not always make the instance unreachable.

Option E is incorrect because hypervisor maintenance typically triggers live migration, not unreachability.

4

MCQ · hard

A company runs a critical e-commerce application on a private cloud using OpenStack. The application consists of web servers, application servers, and a MySQL database running on separate VMs. Recently, users have reported intermittent 502 Bad Gateway errors during peak hours. The operations team notices that the web server VMs show high CPU ready times and the application server VMs have increased network latency. Storage performance also shows high await times on the SSD-based Ceph cluster. The team suspects resource contention. Which of the following is the BEST course of action to diagnose and resolve the issue?

A. Migrate the web server VMs to a different compute host using live migration.

B. Increase the number of vCPUs for each web server VM to reduce CPU ready time.

C. Implement quality of service (QoS) policies on the Ceph cluster to guarantee IOPS for the database.

D. Review the hypervisor's CPU and memory allocation ratios and adjust overcommitment settings.

Answer: D

Reducing overcommitment alleviates overall resource contention.

Why this answer

Option D is correct because high CPU ready times and overall contention indicate overcommitment on the hypervisors. Adjusting overcommitment ratios reduces contention for CPU and memory across all VMs on the host. Option A is incorrect because live migration is a temporary fix and does not address the root cause.

Option B is incorrect because adding vCPUs may worsen contention. Option C only addresses storage and not CPU/network contention.

5

Drag & Drop · medium

Sequence the steps to set up a cloud storage bucket with versioning and lifecycle policies.

Why this order

Create bucket, enable versioning, add lifecycle rules for transitions and deletions, then test.

6

MCQ · medium

A cloud engineer receives an alert that the root filesystem (/) is at 93% usage. The /data volume has plenty of free space. The application stores logs in /var/log/app/ on the root filesystem. Which of the following is the BEST long-term solution?

A. Move the /var/log/app directory to the /data partition and create a symlink

B. Increase the size of the root filesystem

C. Delete the /data partition and merge it with root

D. Configure log rotation to delete logs more frequently

Answer: A

This frees root space and leverages the /data volume's capacity.

Why this answer

Moving the /var/log/app directory to the /data partition and creating a symlink is the best long-term solution because it permanently relocates the log data to a volume with ample free space without requiring application reconfiguration. The symlink (/var/log/app -> /data/app) makes the application continue to write to the same logical path, while the actual storage is on the /data filesystem. This resolves the root filesystem capacity issue without altering the application's logging behavior or risking data loss.

Exam trap

CompTIA often tests the misconception that increasing filesystem size or deleting partitions is a valid long-term fix, when in reality the correct approach is to relocate data to a separate volume using a symlink or bind mount.

How to eliminate wrong answers

Option B is wrong because increasing the size of the root filesystem only provides a temporary fix and does not address the underlying issue of log growth; it may also be impractical if the underlying disk or LVM has no free extents. Option C is wrong because deleting the /data partition and merging it with root is destructive, risks data loss on /data, and violates the principle of separating application data from the OS filesystem. Option D is wrong because configuring log rotation to delete logs more frequently reduces historical data needed for troubleshooting and compliance, and does not prevent future root filesystem exhaustion if log volume continues to grow.
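The move-and-symlink approach can be sketched in a few lines. This is an illustrative example under stated assumptions, not a production procedure: it runs in a throwaway temporary directory, the paths `var_log_app` and `data` merely stand in for /var/log/app and /data, and the helper name `relocate_with_symlink` is hypothetical. On a real system the application would normally be stopped while the directory is moved.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(src_dir: str, dest_dir: str) -> None:
    """Move src_dir to dest_dir and leave a symlink at the old path."""
    shutil.move(src_dir, dest_dir)
    os.symlink(dest_dir, src_dir, target_is_directory=True)


# Demo in a scratch directory instead of the real filesystem.
root = tempfile.mkdtemp()
log_dir = os.path.join(root, "var_log_app")  # stands in for /var/log/app
data_dir = os.path.join(root, "data")        # stands in for /data
os.makedirs(log_dir)
os.makedirs(data_dir)
with open(os.path.join(log_dir, "app.log"), "w") as fh:
    fh.write("first line\n")

relocate_with_symlink(log_dir, os.path.join(data_dir, "app"))

# The application keeps writing to the old path; the bytes land under data/.
with open(os.path.join(log_dir, "app.log"), "a") as fh:
    fh.write("second line\n")
```

After the move, the old path is a symlink and both log lines live in the file under the data directory, which is the behavior the explanation describes.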

7

MCQ · hard

A cloud database cluster is experiencing replication lag. The primary node shows high write activity, and the replicas are on different availability zones. Which of the following is the most likely cause?

A. Replication is configured as synchronous.

B. Network latency between the primary and replica zones is high.

C. The replica nodes have insufficient storage.

D. The primary node's vCPU is over-allocated.

Answer: B

Geographic distance increases latency, causing replication lag.

Why this answer

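Option B is correct because network latency between zones delays delivery of changes to the replicas, and high write activity amplifies the effect. Option A is incorrect because replication across zones is typically asynchronous, and lag is a symptom of asynchronous replication. Option C is incorrect because replica storage is not the immediate issue.

Option D is incorrect because nothing indicates the primary is underprovisioned.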

8

MCQ · hard

A cloud administrator is troubleshooting why a newly launched VM did not complete its initialization. According to the exhibit, what is the most likely cause?

A. Cloud-init is not installed on the VM

B. The package repository is not configured correctly

C. The cloud-init user data script contains a syntax error

D. The VM does not have internet access

Answer: B

The 'Unable to locate package' error typically means the repository list is outdated or missing.

Why this answer

The error 'E: Unable to locate package python3-pip' indicates that the package name is incorrect or the repository is not configured, so Option B is correct. Option A is wrong because cloud-init is running commands, which means it is installed.

Option C is wrong because the error is about locating a package, not about a syntax issue in cloud-init. Option D is wrong because there is no indication of network issues; the command ran but couldn't find the package.

9

MCQ · hard

A web application is deployed across multiple availability zones behind an application load balancer (ALB). The administrator notices that all traffic is being routed to instances in only one availability zone, causing performance issues. The ALB cross-zone load balancing is enabled. What is the most likely cause?

A. The security group for the ALB only allows traffic from one zone.

B. The instances in the other zones are marked as unhealthy due to failing health checks.

C. The route table for the subnets in the other zones is missing a default route.

D. The listener rules are configured to forward traffic to a single target group.

Answer: B

Instances that fail health checks stop receiving traffic, so requests are sent only to healthy instances in the remaining zone.

Why this answer

Option B is correct because the ALB only sends traffic to healthy targets, so instances that fail health checks receive nothing. Option A is wrong because security groups are not zone-specific. Option C is wrong because route tables affect outbound traffic from instances, not how the ALB distributes requests.

Option D is wrong because a single target group can contain instances from multiple zones.

10

MCQ · easy

A cloud administrator is troubleshooting a failed deployment of a new application version using a continuous integration/continuous deployment (CI/CD) pipeline. The pipeline fails at the 'test' stage. What is the first step the administrator should take?

A. Re-run the pipeline

B. Increase the timeout of the test stage

C. Roll back to the previous version

D. Check the test logs for specific errors

Answer: D

Logs reveal the exact cause, such as failed unit tests or configuration issues.

Why this answer

Option D is correct because reviewing test logs provides specific error details. Option A is wrong because re-running without investigation wastes time. Option B is wrong because increasing the timeout is a guess and does not address the root cause.

Option C is wrong because rolling back should only happen after understanding the failure.

11

MCQ · medium

A company is experiencing intermittent network connectivity issues between two cloud subnets. The cloud provider's monitoring shows no packet loss. Which troubleshooting step should be taken first?

A. Review the security group rules for both subnets

B. Replace the virtual routers

C. Increase the bandwidth between subnets

D. Check the physical cabling

Answer: A

Incorrect security group rules can block traffic between subnets intermittently based on timing or state.

Why this answer

Option A is correct because security group rules can cause intermittent connectivity issues by selectively dropping traffic. Option B is wrong because virtual routers are typically managed by the provider and are unlikely to cause issues without alerts. Option C is wrong because increasing bandwidth does not address the underlying cause of intermittent drops.

Option D is wrong because physical cabling is not relevant in the cloud.

12

MCQ · easy

A company is running a database server on a virtual machine in the cloud. The database team reports that write operations are taking longer than expected. The administrator checks the disk performance metrics and sees that the average disk queue length is consistently above 10. Which action would most likely resolve this issue?

A. Add more RAM to the virtual machine.

B. Upgrade to a higher IOPS tier for the disk.

C. Increase the size of the disk.

D. Enable compression on the database.

Answer: B

Upgrading IOPS tier increases throughput, reducing queue length.

Why this answer

Option B is correct because high disk queue length indicates insufficient I/O throughput; upgrading the IOPS tier increases throughput. Option A is wrong because adding RAM may reduce disk hits but doesn't directly fix queue length. Option C is wrong because increasing size does not guarantee higher IOPS.

Option D is wrong because compression may reduce I/O but does not directly address queue length.

13

MCQ · medium

A cloud administrator receives an alert that a virtual machine (VM) is unresponsive. The VM is hosted on a hypervisor that shows high CPU ready time. Which of the following is the most likely cause?

A. Insufficient memory allocated to the VM

B. Network latency between the VM and storage

C. Disk I/O contention from other VMs

D. Over-provisioning of vCPUs on the hypervisor

Answer: D

Correct; over-provisioned vCPUs cause contention and high ready time.

Why this answer

High CPU ready time indicates that the VM is ready to execute instructions but is waiting for the hypervisor to schedule physical CPU time. This is a classic symptom of over-provisioning vCPUs, where the total number of vCPUs assigned to all VMs exceeds the available physical cores, causing contention at the hypervisor scheduler level.

Exam trap

The trap here is that candidates confuse high CPU ready time with high CPU usage or memory pressure, but ready time is a hypervisor-level scheduling delay, not a guest OS metric, and is directly tied to vCPU over-provisioning.

How to eliminate wrong answers

Option A is wrong because insufficient memory would typically cause swapping or ballooning, not high CPU ready time, which is a CPU scheduling metric. Option B is wrong because network latency between the VM and storage affects storage I/O latency, not CPU scheduling, and would manifest as high disk latency or queue depth. Option C is wrong because disk I/O contention from other VMs would result in high disk queue length or latency, not CPU ready time, which is a measure of CPU starvation.
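The over-provisioning described above comes down to simple arithmetic: total vCPUs assigned across all VMs divided by the physical cores on the host. The sketch below is illustrative only; the function name and the VM sizes are made-up values, and what counts as an acceptable ratio depends on the workload and platform.

```python
def vcpu_overcommit_ratio(vcpus_per_vm: list, physical_cores: int) -> float:
    """Return total assigned vCPUs divided by physical cores on the host."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return sum(vcpus_per_vm) / physical_cores


# Hypothetical host: 16 physical cores, five VMs with 4, 4, 8, 8 and 8 vCPUs.
ratio = vcpu_overcommit_ratio([4, 4, 8, 8, 8], physical_cores=16)
print(f"vCPU:pCPU ratio = {ratio:.1f}:1")
```

A ratio above 1:1 means VMs can end up waiting for a physical core, which is the delay that CPU ready time measures.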

14

MCQ · hard

A cloud instance fails to initialize. The cloud-init log shows the error above. Which of the following is the most likely cause?

A. The filesystem on /dev/xvdb1 is not formatted with a recognized filesystem.

B. The disk is not attached to the instance at all.

C. The partition table on /dev/xvdb1 is corrupted.

D. The launch template specifies a block device mapping that is not attached to the instance.

Answer: D

The device /dev/xvdb1 is present in the mapping but not attached, causing mount failure.

Why this answer

Option D is correct because the block device mapping in the launch template probably references a device that does not exist. Option A is incorrect because the error says 'No such device', not 'wrong filesystem'. Option B is incorrect because if the disk were not attached, the device would not appear at all.

Option C is incorrect because a corrupted partition would give a different error like 'invalid partition table'.

15

MCQ · easy

A user reports that they cannot connect to an RDS database instance from their application. The security group for the RDS instance allows inbound traffic on port 3306 from the application server's security group. What should the administrator check NEXT?

A. IAM policy attached to the RDS instance

B. Network ACL rules for the RDS subnet

C. Route table entries for the RDS subnet

D. Outbound security group rules on the RDS instance

Answer: B

Network ACLs act as a firewall at subnet level and can block inbound traffic.

Why this answer

Option B is correct because network ACLs are stateless and may block traffic even if security groups allow it. Option A is wrong because IAM policies do not control network connectivity. Option C is less likely: route tables do apply, but a VPC's local route normally covers traffic inside the VPC, so the problem is more likely at the ACL layer.

Option D is wrong because security group outbound rules typically allow all traffic by default.

16

Multi-Select · hard

A cloud administrator is troubleshooting a network connectivity issue between two VPCs connected via a VPC peering connection. The administrator has verified that the route tables are correct and that the security groups allow traffic. However, instances in VPC A cannot ping instances in VPC B. Which TWO of the following could be causing the issue? (Choose TWO.)

Select 2 answers

A. Network ACLs in VPC B are blocking inbound ICMP

B. Security groups in VPC A are blocking inbound ICMP

C. Host-based firewall on the target instance is blocking ping

D. VPC peering connection does not support ICMP

E. Route tables are misconfigured

Answers: A, C

Network ACLs are stateless; they must explicitly allow both inbound and outbound ICMP.

Why this answer

Network ACLs are stateless and must allow both inbound and outbound traffic; if they block ICMP, ping fails. A host-based firewall within the OS can also block ping. Option B is wrong because security groups are stateful: the echo reply returning to VPC A is allowed automatically once the outbound request is permitted.

Option D is wrong because VPC peering carries ICMP like any other IP traffic. Option E is wrong because the route tables were already verified.

17

MCQ · hard

A cloud administrator deploys a new application that writes logs to a block storage volume attached to a virtual machine. The application's performance degrades after a few hours. Monitoring shows that the volume's read latency is low, but write latency spikes periodically. The administrator discovers that the volume type is standard HDD. What should the administrator do to improve write performance without changing the application?

A. Migrate to a volume type with provisioned IOPS (SSD).

B. Increase the volume size to gain higher baseline IOPS.

C. Move the logs to an object storage service.

D. Enable write caching on the volume.

Answer: A

Provisioned IOPS SSD provides consistent high IOPS, eliminating write spikes.

Why this answer

Option A is correct because standard HDD volumes have limited, burst-based IOPS, so sustained writes degrade; provisioned IOPS SSD provides consistent performance. Option B is wrong because increasing HDD size raises baseline IOPS but performance remains limited and burst-based. Option C is wrong because object storage is not block-level and may require application changes.

Option D is wrong because write caching on a data volume is not recommended and may cause data loss.

18

MCQ · hard

A company runs a critical e-commerce application on AWS. The architecture includes an Application Load Balancer (ALB) in front of an Auto Scaling group of EC2 instances across two Availability Zones. The instances are in a private subnet and use a NAT Gateway for outbound internet access. The application stores session data in an ElastiCache Redis cluster. During a flash sale, users report that the site is extremely slow and some requests time out. Monitoring shows the ALB's latency metric is high, and the number of healthy hosts fluctuates. The CPU utilization on the EC2 instances averages 60% and memory averages 70%. The Redis cluster's CPU utilization is 90%, and its memory usage is 95%. The NAT Gateway's metrics show high BytesOutToSource but no errors. Which of the following is the most likely cause of the performance issue?

A. The NAT Gateway is throttling traffic due to bandwidth limits

B. The ElastiCache Redis cluster is overloaded and becoming a bottleneck for session lookups

C. The Auto Scaling group is not scaling quickly enough due to cooldown periods

D. The ALB's idle timeout setting is too low, causing premature connection drops

Answer: B

Correct; high Redis CPU and memory cause slow responses and timeouts.

Why this answer

The ElastiCache Redis cluster is the most likely bottleneck because its CPU utilization is at 90% and memory usage at 95%, indicating it is near capacity. Since the application stores session data in Redis, high latency and timeouts during a flash sale are consistent with an overloaded session store that cannot keep up with request volume, causing the ALB to experience increased latency and healthy host fluctuations as sessions fail to be retrieved or written.

Exam trap

The trap here is that candidates may focus on the NAT Gateway or Auto Scaling group because they are common bottlenecks, but the key clue is the Redis cluster's high CPU and memory metrics, which directly correlate with session store performance issues in a stateful application.

How to eliminate wrong answers

Option A is wrong because the NAT Gateway shows high BytesOutToSource but no errors, and NAT Gateway bandwidth limits are typically high (up to 10 Gbps per AZ) and would cause packet drops or errors if throttled, not just high latency. Option C is wrong because the Auto Scaling group's cooldown periods could delay scaling, but the EC2 instances are only at 60% CPU and 70% memory, which are not saturated, so scaling is not the primary issue. Option D is wrong because the ALB's idle timeout setting (default 60 seconds) controls how long the ALB keeps a connection open without data; premature connection drops would manifest as immediate disconnects, not high latency and timeouts.

19

MCQ · easy

A company has a cloud-based application that uses a relational database. The database team performs daily backups to an on-premises storage system using a VPN connection. Recently, backups have been failing with timeout errors. The network team confirms the VPN is up and stable. Which of the following is the MOST likely cause?

A. The database service is not responding

B. The VPN bandwidth is insufficient for the backup data volume

C. The VPN tunnel is not properly configured

D. The on-premises firewall is blocking the backup port

Answer: B

Large backups can exceed VPN capacity, leading to timeouts.

Why this answer

The VPN connection is confirmed stable, so tunnel configuration and firewall issues are unlikely. Backup timeout errors with large data volumes typically indicate insufficient bandwidth, causing the transfer to exceed the timeout threshold. The database service itself is responding (backups are attempted), ruling out service unavailability.

Exam trap

The trap here is that candidates assume a stable VPN means the link has sufficient capacity, but CompTIA often tests the distinction between connectivity (layer 3) and throughput (layer 4/performance), where a stable tunnel can still be too slow for large data transfers.

How to eliminate wrong answers

Option A is wrong because if the database service were not responding, backups would fail immediately with a connection error, not a timeout after data transfer begins. Option C is wrong because the network team confirmed the VPN is up and stable, meaning the tunnel is properly configured and operational. Option D is wrong because a firewall block would cause a consistent failure (e.g., connection refused), not intermittent timeouts, and the VPN tunnel encrypts traffic, making port-specific blocking less likely.
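The bandwidth argument can be checked with back-of-the-envelope arithmetic. The sketch below is only an illustration: the backup size, link speed, efficiency factor, and timeout are all hypothetical numbers, since the question does not state them.

```python
def transfer_seconds(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate transfer time for size_gb (decimal GB) over a link_mbps link.

    efficiency is an assumed fraction of the nominal bandwidth that is
    actually usable after VPN and protocol overhead.
    """
    size_megabits = size_gb * 8 * 1000
    return size_megabits / (link_mbps * efficiency)


# Hypothetical values: 200 GB backup, 100 Mbps VPN, 4-hour job timeout.
needed = transfer_seconds(size_gb=200, link_mbps=100)
timeout_s = 4 * 3600
exceeds_timeout = needed > timeout_s
print(f"estimated {needed / 3600:.1f} h, timeout {timeout_s / 3600:.1f} h")
```

With these assumed numbers the transfer takes longer than the timeout even though the tunnel never drops, which matches the failure pattern in the question.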

20

MCQ · easy

A cloud administrator notices that a virtual machine (VM) is running slowly. The hypervisor shows high CPU ready time for that VM. Which of the following is the most likely cause?

A. High disk I/O latency on the datastore

B. Insufficient memory allocated to the VM

C. Overcommitted physical CPU resources on the host

D. Misconfigured virtual switch

Answer: C

Overcommitted CPU means the VM competes for physical cores, causing high ready time.

Why this answer

High CPU ready time indicates that the VM is ready to execute instructions but is waiting for the physical CPU to become available. This is a classic symptom of CPU overcommitment, where the host has more virtual CPUs (vCPUs) assigned to VMs than physical cores, causing contention. Option C correctly identifies this as the most likely cause.

Exam trap

CompTIA often tests the distinction between CPU ready time and other performance metrics, trapping candidates who confuse high CPU ready time with memory pressure or storage latency.

How to eliminate wrong answers

Option A is wrong because high disk I/O latency would manifest as high disk queue depth or high kernel latency, not as CPU ready time. Option B is wrong because insufficient memory would cause ballooning or swapping, not CPU ready time. Option D is wrong because a misconfigured virtual switch would cause network connectivity issues or packet loss, not CPU scheduling delays.

21

MCQ · medium

During a cloud migration, a database server is moved from on-premises to a cloud-managed database service. After migration, the application team reports that some queries are running slower than before. The database CPU utilization is low. What is the most likely cause?

A. The network latency between the application and the database has increased

B. The database is not indexed properly

C. The database connection pooling is misconfigured

D. The cloud database instance type has insufficient memory

Answer: A

Higher latency increases query response time without affecting CPU.

Why this answer

Option A is correct because increased network latency between the application and the cloud database can slow queries without increasing CPU usage. Option B is wrong because indexing issues would affect specific queries but would also cause higher CPU. Option C is wrong because pool misconfiguration would cause connection errors, not slow queries.

Option D is wrong because insufficient memory would cause swapping and higher CPU.

22

MCQ · medium

A company has a three-tier application in a cloud VPC: web servers in a public subnet, application servers in a private subnet, and database servers in a private subnet. The web servers can connect to the application servers, but the application servers cannot connect to the database servers. The security groups are configured as follows:

- Web SG: inbound HTTP from 0.0.0.0/0, outbound all
- App SG: inbound HTTP from Web SG, outbound all
- DB SG: inbound MySQL from App SG, outbound all

What is the most likely cause of the connectivity issue?

A. The database security group is missing an inbound rule for MySQL.

B. The application security group is missing an outbound rule for MySQL.

C. The network access control list (NACL) on the database subnet is blocking inbound traffic from the application subnet.

D. The web security group is blocking traffic to the database.

Answer: C

NACLs are stateless, so they must have explicit rules for traffic. A missing rule can block connectivity.

Why this answer

Option C is correct. The security groups are not the issue: the DB SG inbound rule allows MySQL from the App SG, so A is ruled out, and the App SG allows all outbound traffic, so B is ruled out. Security groups are also stateful, so return traffic is allowed automatically.

D is irrelevant because the web security group does not govern traffic between the application and database tiers. The remaining likely cause is a NACL on the database subnet blocking the traffic, as NACLs are stateless and need explicit rules.
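To make "stateless" concrete, here is a simplified model of NACL evaluation. It is a sketch under stated assumptions rather than any provider's implementation: rules are matched in ascending rule-number order with an implicit deny, source CIDR matching is left out for brevity, and the `Rule` class, rule numbers, and port values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    number: int
    protocol: str   # "tcp", "icmp", or "all"
    port_from: int
    port_to: int
    allow: bool


def nacl_allows(rules, protocol: str, port: int) -> bool:
    """First matching rule by ascending number wins; no match means deny."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.protocol in (protocol, "all") and rule.port_from <= port <= rule.port_to:
            return rule.allow
    return False


def round_trip_ok(inbound, outbound, protocol: str, dest_port: int, reply_port: int) -> bool:
    """Stateless check: the request must pass inbound AND the reply must pass outbound."""
    return nacl_allows(inbound, protocol, dest_port) and nacl_allows(outbound, protocol, reply_port)


# Database subnet NACL: MySQL is allowed in, but replies go to an ephemeral port.
db_inbound = [Rule(100, "tcp", 3306, 3306, True)]
db_outbound_missing = []                                   # no rule for replies
db_outbound_fixed = [Rule(100, "tcp", 1024, 65535, True)]  # ephemeral range

blocked = round_trip_ok(db_inbound, db_outbound_missing, "tcp", 3306, 50000)
allowed = round_trip_ok(db_inbound, db_outbound_fixed, "tcp", 3306, 50000)
```

In this model the connection fails when either direction lacks an explicit allow rule, whereas a stateful security group would permit the reply automatically.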

23

MCQ · medium

An administrator notices that a cloud-hosted database is experiencing high latency during peak usage. The CPU and memory utilization on the DB server are below 50%, but disk IOPS are consistently at the provisioned limit. What should the administrator check first?

A. Check network bandwidth between the DB and application servers.

B. Add additional storage volumes and stripe data.

C. Check if the storage volume has an IOPS cap.

D. Increase the number of vCPUs on the DB server.

Answer: C

Saturating the IOPS limit causes queuing and high latency.

Why this answer

Option C is correct because saturating the provisioned IOPS limit is a common cause of high latency. Option A is incorrect because network latency is not indicated. Option B is incorrect because striping across additional volumes would add IOPS but is not the first check; the existing volume's limit should be confirmed first.

Option D is incorrect because CPU is not maxed.

24

MCQ · hard

A cloud administrator is troubleshooting a database failover issue. The database is a managed service with a primary and standby replica in different availability zones. The application uses a read-write endpoint. During a recent maintenance event, the primary database failed over automatically, but the application experienced a 10-minute outage. The administrator checks the failover logs and sees that it completed within 2 minutes. What is the most likely cause of the extended outage?

A. The application's database connection pool does not retry DNS resolution

B. The application was not configured to use multiple availability zones

C. The standby replica was not in sync

D. The failover triggered a change in the endpoint DNS record

Answer: A

Stale connections continue to point to the old primary IP, causing failures until the pool refreshes.

Why this answer

Option A is correct because if the application's connection pool caches the IP address of the primary database, it will not automatically re-resolve the DNS name after failover, causing a prolonged outage. Option B is wrong because multi-AZ configuration is about the database, not the application's endpoint configuration. Option C is wrong because if the standby had been out of sync, failover would not have completed cleanly.

Option D is wrong because the DNS change itself is usually fast; the delay comes from the application not re-querying DNS.
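One way to avoid the stale-address problem is to drop the cached IP whenever a connection fails, so the next attempt resolves the endpoint again. The following is a minimal sketch of that idea, not the behavior of any particular connection pool; the class name `ReResolvingTarget`, the endpoint name, and the addresses in the usage example are hypothetical, and OS or runtime DNS caches may add their own delay.

```python
import socket


def resolve(hostname: str, port: int) -> str:
    """Look up the current IPv4 address for hostname (no caching here)."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    return infos[0][4][0]


class ReResolvingTarget:
    """Tracks a database endpoint and re-resolves it after a connection failure."""

    def __init__(self, hostname: str, port: int, resolver=resolve):
        self.hostname = hostname
        self.port = port
        self._resolver = resolver
        self._cached_ip = None

    def address(self) -> str:
        """Return the cached IP, resolving only when nothing is cached."""
        if self._cached_ip is None:
            self._cached_ip = self._resolver(self.hostname, self.port)
        return self._cached_ip

    def on_connection_error(self) -> None:
        """Drop the cached IP so the next address() call hits DNS again."""
        self._cached_ip = None
```

A usage example with a stand-in resolver that simulates the endpoint record changing during failover:

```python
answers = iter(["10.0.1.10", "10.0.2.20"])  # old primary, then new primary
target = ReResolvingTarget("db.example.internal", 3306,
                           resolver=lambda host, port: next(answers))
first = target.address()        # resolves once and caches
target.on_connection_error()    # failover detected: forget the old IP
second = target.address()       # resolves again and picks up the new IP
```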

25

MCQ · medium

A company migrated to a hybrid cloud and users report slow access to files stored in the cloud. The on-premises network is 100 Mbps. What troubleshooting step should be taken?

A. Enable compression on the cloud storage gateway

B. Check VPN bandwidth and latency

C. Increase cloud storage performance tier

D. Move files to on-premises storage

Answer: B

VPN bandwidth and latency directly affect file transfer speeds.

Why this answer

Option B is correct because slow file access over a hybrid connection is often due to limited VPN bandwidth or high latency. Option A is wrong because compression may help but isn't the first step to diagnose the issue. Option C is wrong because increasing the storage performance tier may not address network bottlenecks.

Option D is wrong because moving files back defeats the purpose of the hybrid cloud.

26

MCQ · easy

A cloud engineer is troubleshooting performance issues in a virtualized environment. Which of the following tools would BEST help identify CPU contention on a hypervisor?

A. iperf

B. esxtop

C. ping

D. nslookup

Answer: B

Shows CPU metrics like ready time and co-stop.

Why this answer

Option B is correct because 'esxtop' (on VMware) provides real-time performance data including CPU ready time. Option A is wrong because 'iperf' is for network throughput. Option C is wrong because 'ping' tests basic connectivity.

Option D is wrong because 'nslookup' is for DNS resolution.

27

MCQ · easy

Refer to the exhibit. An application running on an EC2 instance is failing to connect to an RDS database. What is the most likely issue?

A. The database instance is in a different VPC

B. The security group for the RDS instance does not allow inbound traffic from the EC2 instance

C. The database instance is stopped

D. The application is using the wrong database port

Answer: B

A security group denying inbound traffic causes a TCP reset, resulting in connection refused.

Why this answer

Option B is correct because a 'connection refused' error typically indicates that the database is reachable but the port is blocked, likely by a security group. Option A is wrong because if the database were in a different VPC, the error would be a timeout or no route. Option C is wrong because a stopped database would result in a 'no route to host' error or a timeout.

Option D is wrong because port 3306 is standard for MySQL and the error is not about wrong port.

28

MCQhard

A company is implementing a cloud governance strategy. They need to ensure that all resources are tagged with cost center and environment, and any untagged resources are automatically remediated. Which of the following best practices should be applied?

A.Implement role-based access control to restrict resource creation

B.Set up budget alerts to notify when costs exceed thresholds

C.Create a manual audit process to check tags weekly

D.Use policy-as-code to enforce tagging and automatically apply tags to untagged resources

AnswerD

Correct; policy-as-code can enforce and auto-remediate tagging.

Why this answer

Option D is correct because policy-as-code (e.g., Azure Policy, AWS Config Rules, or Open Policy Agent) allows you to define tagging requirements declaratively and automatically remediate non-compliant resources. This approach enforces governance in real-time without manual intervention, ensuring all resources are tagged with cost center and environment as specified.

Exam trap

The trap here is that candidates often confuse manual audit processes (Option C) with automated governance, failing to recognize that policy-as-code provides the required automatic remediation in real-time.

How to eliminate wrong answers

Option A is wrong because role-based access control (RBAC) restricts who can create resources but does not automatically tag or remediate untagged resources. Option B is wrong because budget alerts notify when costs exceed thresholds but do not enforce tagging or remediate untagged resources. Option C is wrong because a manual audit process is reactive, time-consuming, and does not provide automatic remediation, which is required by the question.
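
The enforce-and-remediate idea can be sketched in a few lines of Python. This is only an illustration: resources are modeled as plain dictionaries rather than real cloud API objects, and the `REQUIRED_TAGS` defaults and the `remediate_tags` helper are hypothetical names, not part of Azure Policy, AWS Config, or Open Policy Agent.

```python
# Hypothetical sketch: resources are plain dicts, not real cloud API objects.
# The placeholder values below are assumptions chosen for illustration.
REQUIRED_TAGS = {"cost_center": "unassigned", "environment": "unknown"}


def remediate_tags(resources, required=REQUIRED_TAGS):
    """Add any missing required tag with a placeholder value.

    Returns the names of the resources that had to be changed.
    """
    changed = []
    for resource in resources:
        tags = resource.setdefault("tags", {})
        missing = [key for key in required if key not in tags]
        for key in missing:
            tags[key] = required[key]
        if missing:
            changed.append(resource["name"])
    return changed


inventory = [
    {"name": "vm-app-01", "tags": {"cost_center": "cc-100", "environment": "prod"}},
    {"name": "vm-test-02", "tags": {"environment": "dev"}},
    {"name": "bucket-logs"},
]
print(remediate_tags(inventory))  # ['vm-test-02', 'bucket-logs']
```

A real policy engine would evaluate rules like this on every resource change instead of on a static list, which is what removes the need for a weekly manual audit.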

29

Multi-Selecteasy

A cloud engineer receives alerts that a storage volume is reaching capacity. Which three immediate actions should the engineer consider? (Choose three.)

Select 3 answers

A.Attach additional volumes

B.Create a snapshot before making changes

C.Enable compression on the volume

D.Delete unnecessary files

E.Increase the size of the storage volume

AnswersA, D, E

Adding new volumes expands total storage instantly.

Why this answer

Correct options are A, D, and E. Option A is correct because attaching additional volumes increases total capacity. Option D is correct because deleting unnecessary files frees up space immediately.

Option E is correct because increasing the volume size provides more space. Option B is wrong because creating a snapshot is a precaution, not an immediate remedy for capacity. Option C is wrong because enabling compression may not work on existing data and is not immediate.

30

MCQeasy

A cloud application returns HTTP 503 errors during high traffic. The application runs on VMs behind a load balancer. Which action is most likely to resolve the issue?

A.Restart the web server service on one VM.

B.Change the DNS TTL to a lower value.

C.Increase the health check interval on the load balancer.

D.Add additional VMs to the backend pool.

AnswerD

Scaling out increases capacity to handle traffic.

Why this answer

Option D is correct because scaling out adds capacity. Option A is incorrect because restarting the service affects only one host. Option B is incorrect because DNS changes don't add capacity.

Option C is incorrect because a longer health check interval does not add capacity either.

31

MCQmedium

An automated snapshot of a cloud VM is failing with the error 'Quota exceeded for resource snapshots'. What is the most likely cause?

A.The snapshot is being created during a backup window.

B.The maximum number of snapshots allowed has been reached.

C.The snapshot retention policy is set too high.

D.The VM's disk is too full to create a snapshot.

AnswerB

Quota exceeded means the limit on number of snapshots is hit.

Why this answer

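Option B is correct because the snapshot count limit has been reached. Option A is incorrect because running during a backup window does not produce a quota error. Option C is incorrect because the retention policy determines how long snapshots are kept; the error itself reports that the quota is exhausted.

Option D is incorrect because the VM's disk space is separate from the snapshot quota.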

32

MCQmedium

A KVM host has three VMs. The db-server VM is in a paused state. Which of the following is the most likely cause?

A.Storage I/O error on the VM disk

B.CPU overcommitment on the host

C.Insufficient memory on the host

D.Network interface is down

AnswerA

Hypervisors pause VMs when storage errors occur to prevent corruption.

Why this answer

Option A is correct because storage I/O errors can cause a VM to be paused. Option B is incorrect because CPU overcommitment leads to slowdowns, not pausing. Option C is incorrect because a memory shortage would cause performance issues but not necessarily a pause.

Option D is incorrect because network issues do not cause pausing.

33

MCQhard

A global company runs a SaaS application in multiple cloud regions. They use DNS-based global load balancing to route users to the nearest region. Recently, users in Asia are experiencing high latency and timeouts. The administrator checks the health of the Asian region's resources and finds everything operational. Latency measurements from a monitoring tool show that traffic from Asian users is being routed to the European region. What should the administrator investigate first?

A.The latency-based routing policy

B.The DNS TTL settings

C.The geo-location records in the DNS provider

D.The load balancer configuration in the Asian region

AnswerA

Misconfiguration in latency-based routing can send traffic to a farther region.

Why this answer

Option A is correct because the latency-based routing policy may be incorrectly configured or not properly measuring latency, causing traffic to be routed to a distant region. Option B is wrong because TTL settings affect caching, not routing decisions. Option C is wrong because geo-location records are used for geographic routing, but the problem is latency-based routing sending traffic to Europe.

Option D is wrong because the Asian load balancer is operational; the issue is at the DNS level.

34

MCQhard

Refer to the exhibit. A cloud engineer is using AzCopy to transfer files to Azure Blob Storage. The copy fails with the above error. Which of the following is the most likely cause?

A.The network throughput is insufficient

B.The SAS token used has expired

C.The storage account firewall is blocking the IP

D.The destination container does not exist

AnswerB

An expired SAS token causes the server to reject the request with this exact error.

Why this answer

Option B is correct because the error explicitly indicates an authentication failure, which is typically due to an expired or invalid SAS token. Option A is wrong because insufficient throughput would cause a timeout, not an authentication error. Option C is wrong because a firewall block would result in a different error (e.g., 403 Forbidden).

Option D is wrong because a non-existent container would return a 404 error.

35

Multi-Selecthard

A company is experiencing high latency in their cloud-based database. The database is provisioned with SSD storage. Which THREE factors should the administrator investigate? (Choose three.)

Select 3 answers

A.Network bandwidth between application and database

B.Database query optimization

C.Number of database replicas

D.Storage IOPS limits

E.Region latency

AnswersA, B, D

Network congestion or high latency affects database response.

Why this answer

Options A, B, and D are correct because network latency between tiers adds delay, unoptimized queries increase response time, and IOPS limits cause throttling. Option C is wrong because the number of replicas primarily affects read throughput and disaster recovery, not latency. Option E is wrong because region latency is infrastructure-level and less likely to change suddenly.

36

MCQeasy

A small business hosts a web application on a single cloud server. The server has 2 vCPUs and 4 GB RAM. Recently, the application crashes when the number of concurrent users exceeds 50. The administrator checks the system logs and finds out-of-memory (OOM) errors. What is the best course of action to resolve this issue without redesigning the application?

A.Add a load balancer and another server

B.Reduce the application's memory footprint by code optimization

C.Increase the server's RAM to 8 GB

D.Enable swap space on the server

AnswerC

Increasing memory directly resolves OOM errors without application changes.

Why this answer

Option C is correct because increasing RAM directly addresses the OOM errors and is the simplest fix. Option A is wrong because adding a load balancer and another server is more complex and may require application changes. Option B is wrong because code optimization is a redesign effort and not a quick fix.

Option D is wrong because enabling swap space is a temporary workaround that can degrade performance.

37

MCQmedium

A cloud administrator is troubleshooting an application that is experiencing intermittent timeouts. The application runs on a cloud VM and connects to a cloud database. The administrator sees no errors in the application logs but notices high network latency during peak hours. Which of the following is the MOST likely cause?

A.Insufficient provisioned IOPS on the database

B.Incorrect database schema

C.SSL certificate mismatch between app and database

D.Missing route table entry for the database subnet

AnswerA

Low IOPS leads to queueing and increased latency under load.

Why this answer

Option A is correct because insufficient provisioned IOPS can cause queue buildup and latency. Option B is wrong because database schema issues would cause query errors, not latency. Option C is wrong because an SSL misconfiguration would cause handshake failures.

Option D is wrong because a missing route would cause complete failure, not intermittent timeouts.

38

Multi-Selectmedium

A cloud architect is designing a highly available web application. Which THREE of the following components should be configured in at least two availability zones? (Choose THREE.)

Select 3 answers

A.Web server instances

B.Application load balancer

C.Database instance (primary and standby)

D.DNS service (e.g., Route 53)

E.Auto-scaling group

AnswersA, B, C

Instances should be deployed across AZs to handle requests if one AZ fails.

Why this answer

To achieve high availability across AZs, the load balancer, application servers, and database should be multi-AZ. Auto-scaling groups can launch instances across AZs, but they are not a component themselves; they are a management service. The DNS service is globally redundant by nature, not usually limited to AZs.

39

MCQmedium

The exhibit shows the health check status for targets in an application load balancer's target group. The target group has a health check on port 80. An administrator notices that one target is unhealthy on port 80 but healthy on port 443. What is the most likely cause?

A.The web server on the target is not listening on port 443.

B.The security group for the target is blocking port 80 from the load balancer.

C.The load balancer is in a different VPC.

D.The health check path is incorrect.

AnswerB

A security group blocking port 80 would cause the health check on port 80 to fail, while port 443 remains healthy.

Why this answer

Option B is correct because the target is healthy on port 443 but not on port 80, suggesting port 80 is blocked. Option A is wrong because the target is healthy on 443, so the server is listening on that port. Option C is wrong because the same load balancer already reaches the target on port 443, so VPC placement is not the problem.

Option D is wrong because if the path were wrong, both checks would fail.

40

MCQhard

A cloud administrator sees the output above when troubleshooting a virtual machine that is unresponsive. The VM is critical and must be restored quickly. What should the administrator do first?

A.Resume the VM using the virsh resume command.

B.Restart the libvirtd service on the host.

C.Increase the memory allocation for the host to free resources.

D.Migrate the VM to another host in the cluster.

AnswerA

This directly addresses the paused state and will restore the VM to a running state.

Why this answer

The output from `virsh list --all` shows the VM is in a 'paused' state, which means it is still resident in memory but not executing. The fastest way to restore a paused VM is to resume it with `virsh resume <vm-name>`, which immediately continues CPU execution without requiring a reboot or migration. This directly addresses the unresponsive behavior while preserving the VM's current memory state.

Exam trap

The trap here is that candidates assume a paused VM requires a full restart or host-level intervention, but the CV0-004 exam expects you to recognize that `virsh resume` is the immediate, low-risk recovery action for a paused domain.

How to eliminate wrong answers

Option B is wrong because restarting the libvirtd service would disrupt all VMs on the host and is unnecessary when only a single VM is paused; the issue is at the VM level, not the hypervisor daemon. Option C is wrong because increasing host memory allocation does not affect a paused VM—pausing is triggered by storage I/O errors, disk full conditions, or host memory overcommitment, not by insufficient host memory. Option D is wrong because migrating a paused VM requires resuming it first or using `virsh migrate --live` which cannot work on a paused domain; migration adds unnecessary complexity and downtime when a simple resume command will restore service immediately.
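
As a rough illustration of the recovery step, the sketch below scans text in the layout printed by `virsh list --all` and prints the matching `virsh resume` command for each paused domain. The sample output is invented for this example (the exhibit is not reproduced here), and `paused_domains` is a hypothetical helper, not part of libvirt.

```python
def paused_domains(virsh_output):
    """Return the names of domains whose state column reads 'paused'."""
    names = []
    for line in virsh_output.splitlines():
        parts = line.split()
        # Data rows look like: <id or '-'> <name> <state...>
        if len(parts) >= 3 and parts[-1] == "paused":
            names.append(parts[1])
    return names


# Illustrative output only; domain names and IDs are assumptions.
sample = """\
 Id   Name        State
---------------------------
 1    web-server  running
 2    db-server   paused
 -    app-server  shut off
"""

for name in paused_domains(sample):
    print(f"virsh resume {name}")  # prints: virsh resume db-server
```

The script only prints the command instead of running it, so an administrator can confirm why the domain was paused before resuming it.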

41

MCQeasy

A cloud administrator notices that a virtual machine is running but cannot be reached over the network. The administrator verifies that the VM is configured with the correct IP address and subnet mask. Which of the following is the MOST likely cause of this issue?

A.Cloud provider firewall blocking all traffic

B.Incorrect DNS server settings

C.Missing port forwarding rule

D.Misconfigured default gateway

AnswerD

Without a correct gateway, traffic cannot exit the subnet.

Why this answer

Option D is correct because a misconfigured default gateway is a common cause of network unreachability when the IP address and subnet mask are correct; without a working gateway, return traffic cannot leave the subnet. Option A is wrong because a provider firewall blocking all traffic is possible, but it is less likely than a gateway error given what the administrator has already verified. Option B is wrong because incorrect DNS settings would affect name resolution, not basic connectivity.

Option C is wrong because port forwarding is not typically used for general network connectivity.

42

MCQhard

A cloud administrator notices that a virtual machine is consuming excessive CPU resources with no apparent workload. Which of the following should the administrator investigate FIRST to determine the cause?

A.A misconfigured load balancer sending traffic to the VM

B.CPU hotplug settings on the hypervisor

C.A runaway process inside the VM

D.Memory overcommitment ratio

AnswerC

A runaway process (e.g., infinite loop) can consume 100% CPU even with no intended workload.

Why this answer

Option C is correct because a VM with no workload but high CPU is often due to a runaway process, such as a background service or malware. Option A is wrong because, although possible, a misconfigured load balancer is less likely than a process running inside the VM. Option B is wrong because CPU hotplug is not common and would not cause continuous high usage without workload.

Option D is wrong because memory overcommitment affects memory, not CPU.

43

MCQhard

A cloud engineer is troubleshooting an issue where an application running in a container on a Kubernetes cluster is unable to resolve DNS names. The cluster uses CoreDNS. The engineer checks the CoreDNS pod logs and sees no errors. Which of the following should the engineer check next?

A.The Kubernetes DNS service IP address

B.The container's /etc/resolv.conf file

C.The cloud provider's DNS resolver settings

D.The network policy for the namespace

AnswerB

If this file does not point to CoreDNS, DNS resolution fails.

Why this answer

Option B is correct because the container's /etc/resolv.conf should list the CoreDNS service IP; if it is misconfigured, DNS resolution fails. Option A is wrong because the DNS service IP is typically set automatically. Option C is wrong because the cloud provider's DNS is upstream; the issue is within the cluster.

Option D is wrong because network policies would block traffic entirely, not just DNS.
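
The /etc/resolv.conf check can be sketched as a small parser. The sample file contents and the cluster DNS address `10.96.0.10` are assumptions made for illustration; the real service IP depends on how the cluster was set up, and both function names are hypothetical.

```python
def nameservers(resolv_conf_text):
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def points_at_cluster_dns(resolv_conf_text, cluster_dns_ip):
    """True when the cluster DNS service IP is listed as a nameserver."""
    return cluster_dns_ip in nameservers(resolv_conf_text)


# Assumed example contents of a pod's /etc/resolv.conf.
sample = """\
search default.svc.cluster.local svc.cluster.local cluster.local
nameserver 8.8.8.8
options ndots:5
"""

print(nameservers(sample))                          # ['8.8.8.8']
print(points_at_cluster_dns(sample, "10.96.0.10"))  # False
```

A `False` result here matches the failure mode in the question: CoreDNS is healthy, but the container never sends its queries to it.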

44

Multi-Selecthard

A cloud engineer is troubleshooting a performance issue where a web server cluster experiences high latency during peak hours. The cluster uses an auto-scaling group behind a load balancer. Which THREE steps should the engineer take to identify the root cause?

Select 3 answers

A.Monitor CPU and memory utilization on the web servers

B.Analyze web server access logs for slow requests

C.Check the load balancer's backend instance health status

D.Reduce the number of instances in the auto-scaling group

E.Review security group rules for the load balancer

AnswersA, B, C

High resource usage can cause slow responses.

Why this answer

Option A is correct because high CPU or memory utilization on web servers directly indicates resource contention, which can cause increased request processing time and latency. Monitoring these metrics helps identify if the auto-scaling group is under-provisioned or if a specific instance is overloaded, guiding scaling policy adjustments.

Exam trap

The trap here is that candidates may think reducing instances (Option D) is a valid troubleshooting step, but it is a remediation action that can mask the root cause and potentially crash the application under load.

45

MCQhard

A cloud administrator is troubleshooting why web-server-01 is not receiving traffic from an internet-facing load balancer. The load balancer is in the same VPC and subnet. According to the exhibit, what is the most likely reason?

A.The security group attached to the instance does not allow traffic from the load balancer

B.The instance is in a stopped state

C.The instance is not in the same VPC as the load balancer

D.The instance does not have a public IP address

AnswerA

The security group must allow inbound HTTP/HTTPS from the load balancer's security group or CIDR.

Why this answer

Option A is correct because security group rules are the most common cause: the load balancer's security group must allow inbound HTTP, and the instance's security group must allow traffic from the load balancer. Without both, traffic is blocked. The instance has no public IP address in the exhibit, but that does not matter because the load balancer reaches its targets over their private IP addresses.

Option B is wrong because the instance is running. Option C is wrong because the question states that the load balancer is in the same VPC and subnet. Option D is wrong because the load balancer does not require a public IP on the instance.

46

MCQeasy

A cloud administrator is configuring a Linux VM as a router. The iptables rules are shown. The administrator can SSH into the VM from the network but cannot forward traffic between interfaces. What is the most likely cause?

A.The INPUT chain has a rule dropping invalid packets

B.The INPUT chain is missing a rule to allow forwarded traffic

C.The FORWARD chain's default policy is DROP and no rules allow forwarding

D.The NAT table is misconfigured

AnswerC

With default policy DROP and no FORWARD rules, all forwarded packets are dropped.

Why this answer

The FORWARD chain in iptables controls traffic that passes through the VM (i.e., traffic not destined for the VM itself). If its default policy is DROP and no explicit ACCEPT rules exist for forwarding, the kernel will drop all forwarded packets, preventing the VM from acting as a router. SSH access works because it uses the INPUT chain, which is separate from FORWARD.

Exam trap

The trap here is that candidates confuse the INPUT chain (for local traffic) with the FORWARD chain (for transit traffic), assuming that allowing SSH implies forwarding is also allowed, when in fact they are handled by completely separate chains.

How to eliminate wrong answers

Option A is wrong because the INPUT chain dropping invalid packets affects only traffic destined for the VM itself, not forwarded traffic; SSH connectivity proves INPUT is functional. Option B is wrong because forwarded traffic is governed by the FORWARD chain, not the INPUT chain; the INPUT chain has no role in forwarding decisions. Option D is wrong because the NAT table is used for source/destination NAT (e.g., masquerading) and does not control basic IP forwarding; even with correct NAT, packets will be dropped if the FORWARD chain blocks them.
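
The separation between the chains can be illustrated by scanning rules in the format printed by `iptables -S`. This is a simplified sketch: the rule set below is invented (the exhibit is not reproduced here), the hypothetical `forwarding_blocked` helper only recognizes rules that end in `-j ACCEPT`, and it ignores user-defined chains.

```python
def forwarding_blocked(rules_text):
    """True when FORWARD defaults to DROP and has no ACCEPT rule."""
    policy = None
    has_accept = False
    for line in rules_text.splitlines():
        fields = line.split()
        if fields[:2] == ["-P", "FORWARD"] and len(fields) >= 3:
            policy = fields[2]  # default policy of the FORWARD chain
        elif fields[:2] == ["-A", "FORWARD"] and fields[-2:] == ["-j", "ACCEPT"]:
            has_accept = True
    return policy == "DROP" and not has_accept


# Assumed rule set: SSH is allowed on INPUT, nothing is allowed on FORWARD.
sample = """\
-P INPUT ACCEPT
-P FORWARD DROP
-P OUTPUT ACCEPT
-A INPUT -p tcp -m tcp --dport 22 -j ACCEPT
"""

print(forwarding_blocked(sample))  # True
print(forwarding_blocked(sample + "-A FORWARD -i eth0 -o eth1 -j ACCEPT\n"))  # False
```

The INPUT rule for port 22 has no effect on the result, which mirrors why SSH works while transit traffic is dropped.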

47

MCQmedium

A cloud architect is designing a multi-tier application that must be resilient to the failure of an entire availability zone. Which of the following strategies BEST meets this requirement?

A.Place all instances behind a single load balancer in one zone

B.Implement auto-scaling within the same availability zone

C.Use larger instance types to handle more load

D.Deploy application instances across three availability zones with a load balancer

AnswerD

Multi-AZ deployment ensures continued operation if one zone fails.

Why this answer

Option D is correct because deploying across multiple availability zones provides high availability. Option A is wrong because a single load balancer in one zone is a single point of failure. Option B is wrong because auto-scaling within a single zone does not help if the zone fails.

Option C is wrong because vertical scaling does not address zone failure.

48

Multi-Selecthard

A company is migrating on-premises workloads to the cloud. They need to ensure high availability for a stateless web application across two availability zones. Which THREE components should be configured to meet this requirement?

Select 3 answers

A.An auto scaling group spanning both availability zones

B.A load balancer in front of the web tier

C.A read replica database in a different AZ

D.A single large EC2 instance to handle all traffic

E.Multiple subnets, each in a different availability zone

AnswersA, B, E

Correct; auto scaling maintains instance count across AZs.

Why this answer

An auto scaling group spanning both availability zones (Option A) ensures that the stateless web application can automatically replace failed instances and maintain the desired capacity across multiple AZs, which is essential for high availability. By distributing instances across AZs, the application can tolerate an entire AZ failure without losing all compute capacity.

Exam trap

The trap here is that candidates often confuse database-level high availability (like read replicas or Multi-AZ RDS) with application-tier high availability, leading them to select a database option (C) when the question explicitly targets the stateless web tier.

49

MCQhard

An administrator deployed a new web application using an auto-scaling group. Users report that the application becomes slow after a few hours. The administrator examines the scaling policy and notices that the CPU utilization threshold is set to 80% for scale-out and 20% for scale-in. What is the most likely issue?

A.The scale-in threshold is too low, causing instances to be terminated prematurely

B.The load balancer is misconfigured

C.The application has a memory leak

D.The scale-out threshold is too high, causing delayed scaling

AnswerD

At 80% CPU, scaling out only occurs after significant load, delaying additional capacity.

Why this answer

Option D is correct because a scale-out threshold of 80% delays the addition of new instances, causing performance degradation. Option A is wrong because a 20% scale-in threshold is low, but it is not the primary cause of slowness. Option B is wrong because load balancer misconfiguration would likely cause direct failures, not gradual slowdown.

Option C is wrong because the symptoms point to scaling configuration rather than a memory leak.
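
The effect of the thresholds can be seen with a toy decision function. It is a simplified sketch, not any provider's scaling algorithm: real policies also use evaluation periods and cooldowns, and the 60% alternative threshold is an assumption chosen only for comparison.

```python
def scaling_action(cpu_percent, scale_out_at=80, scale_in_at=20):
    """Return the action a simple threshold policy would take."""
    if cpu_percent >= scale_out_at:
        return "scale-out"
    if cpu_percent <= scale_in_at:
        return "scale-in"
    return "hold"


# Compare the policy from the question (80%) with an assumed lower threshold (60%).
for cpu in (45, 79, 85):
    print(cpu, scaling_action(cpu), scaling_action(cpu, scale_out_at=60))
# 45 hold hold
# 79 hold scale-out
# 85 scale-out scale-out
```

At 79% CPU the 80% policy still holds, so users feel the load well before any capacity is added.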

50

MCQeasy

A company uses a multi-cloud strategy with workloads on AWS and Azure. An application running on an Amazon EC2 instance in a VPC uses an Azure SQL Database as its backend via a site-to-site VPN. Recently, users reported intermittent timeouts when accessing the application. The EC2 instance passes health checks, and the VPN tunnel status shows as 'UP' from both sides. The application logs show 'Cannot open server 'azuresql.database.windows.net' requested by the login. The login failed.' Which of the following is the MOST likely cause of the issue?

A.The EC2 instance has exhausted its CPU credits, causing the application to become unresponsive.

B.The Azure SQL Database firewall does not allow traffic from the EC2 instance's IP address or the VPN gateway's IP.

C.The VPN tunnel is not properly routing traffic to Azure, causing intermittent connectivity.

D.The EC2 instance does not have sufficient IAM permissions to connect to Azure SQL Database.

AnswerB

Azure SQL has a firewall that must explicitly permit the source IP. The error 'login failed' often indicates the IP is blocked. The admin should add the VPN gateway's public IP to the allowed list.

Why this answer

The error message 'Cannot open server 'azuresql.database.windows.net' requested by the login. The login failed.' indicates that the Azure SQL Database server rejected the connection attempt. Since the VPN tunnel is 'UP' and the EC2 instance passes health checks, the most likely cause is that the Azure SQL Database firewall rules do not include the source IP address of the traffic coming from the EC2 instance — either the EC2 instance's private IP (if traffic is routed through the VPN) or the public IP of the VPN gateway.

Azure SQL Database uses server-level firewall rules to allow client IP addresses, and without an explicit rule, all connections are blocked.

Exam trap

The trap here is that candidates see a VPN tunnel status of 'UP' and assume connectivity is fully functional, overlooking that Azure SQL Database has its own separate firewall layer that must explicitly permit the source IP address of the connecting client.

How to eliminate wrong answers

Option A is wrong because CPU credit exhaustion would cause performance degradation or throttling, not a specific login failure error from Azure SQL Database; the application logs clearly show a database authentication error, not a timeout or resource exhaustion. Option C is wrong because the VPN tunnel status is 'UP' from both sides, and the error is a login failure from the database server, not a routing or connectivity issue; if routing were broken, the application would likely see a network timeout or unreachable host error, not a specific SQL login failure. Option D is wrong because IAM permissions are an AWS construct used for AWS services (e.g., S3, DynamoDB) and have no bearing on authenticating to an Azure SQL Database; Azure SQL uses SQL authentication or Azure AD authentication, not AWS IAM.
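
Server-level firewall rules of this kind are defined as start and end IP addresses, so the check amounts to a range comparison. The sketch below models that with the standard `ipaddress` module; the rule list and addresses are documentation-range examples, and `is_allowed` is a hypothetical helper, not an Azure API.

```python
import ipaddress


def is_allowed(source_ip, rules):
    """True when source_ip falls inside any (start, end) firewall range."""
    addr = ipaddress.ip_address(source_ip)
    return any(
        ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end)
        for start, end in rules
    )


# Assumed rule set: one allowed range, using documentation-only addresses.
firewall_rules = [("203.0.113.10", "203.0.113.20")]

print(is_allowed("203.0.113.15", firewall_rules))  # True
print(is_allowed("198.51.100.7", firewall_rules))  # False
```

If the address the database actually sees (for example, the VPN gateway's public IP) is outside every range, the connection is rejected even though the tunnel is up.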

51

Multi-Selectmedium

A cloud administrator is troubleshooting a failed backup job that was supposed to back up a database to a cloud storage bucket. The job fails with an access denied error. Which two likely causes should the administrator investigate? (Choose two.)

Select 2 answers

A.The database is offline during backup

B.The IAM role assigned to the backup service lacks write permissions to the bucket

C.The backup schedule is misconfigured

D.The storage bucket has been deleted

E.The backup software version is incompatible

AnswersB, D

Insufficient permissions directly cause access denied errors.

Why this answer

Correct options are B and D. Option B is likely because missing IAM permissions cause access denied errors. Option D is likely because a deleted bucket can also produce an access denied error.

Option A is wrong because an offline database would cause a connection error, not access denied. Option C is wrong because schedule misconfiguration would prevent the job from running, not cause access denied. Option E is wrong because version incompatibility would cause a different error.

52

Multi-Selectmedium

A cloud administrator is troubleshooting a connectivity issue between two VPCs in the same region. Which TWO actions should the administrator verify? (Choose two.)

Select 2 answers

A.VPC peering connection status

B.Route table entries

C.Security group rules

D.VPN tunnel configuration

E.Internet gateway attachment

AnswersA, B

The peering connection must be active.

Why this answer

Options A and B are correct because the VPC peering connection must be in the 'active' state and the route tables must have routes to the peering connection. Option C is wrong because security groups don't block traffic between peered VPCs unless explicitly configured, so they are not the primary check. Option D is wrong because a VPN tunnel is a different connectivity method.

Option E is wrong because an internet gateway is not required for VPC peering.
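
The route table check can be sketched with the standard `ipaddress` module. The route table is modeled as a list of (destination CIDR, target) pairs; the CIDR blocks, the `pcx-example` identifier, and the `has_route_to_peer` helper are all hypothetical values for illustration.

```python
import ipaddress


def has_route_to_peer(route_table, peer_cidr, peering_id):
    """True when some route sends the peer VPC's CIDR to the peering connection."""
    peer = ipaddress.ip_network(peer_cidr)
    for destination, target in route_table:
        if target == peering_id and peer.subnet_of(ipaddress.ip_network(destination)):
            return True
    return False


# Assumed route table for the local VPC (10.0.0.0/16); the peer is 10.1.0.0/16.
routes = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
]

print(has_route_to_peer(routes, "10.1.0.0/16", "pcx-example"))  # False
routes.append(("10.1.0.0/16", "pcx-example"))
print(has_route_to_peer(routes, "10.1.0.0/16", "pcx-example"))  # True
```

The same check has to pass in both VPCs, since each side needs a route back to the other.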

53

MCQeasy

A cloud user is unable to connect to a web server VM from the internet after a security group rule was modified. The VM is running and can be pinged from other VMs in the same subnet. What is the most likely cause?

A.The VM's local firewall is blocking the traffic.

B.The VM's routing table is missing a default gateway.

C.The inbound rule for HTTP/HTTPS was removed or misconfigured.

D.The VM's DNS settings are incorrect.

AnswerC

Security groups control inbound traffic; missing rule blocks internet access.

Why this answer

Option C is correct because the modified security group rule likely blocks inbound HTTP/HTTPS traffic. Option A is incorrect because the problem began after a security group change, and nothing indicates the OS firewall was altered. Option B is incorrect because the problem appeared right after the security group change and nothing indicates the VM's routing was altered.

Option D is incorrect because DNS resolution affects hostnames, not connectivity.

54

MCQeasy

A cloud administrator notices that a virtual machine is unresponsive. The VM is running on a hypervisor host that shows high CPU utilization. What should the administrator do first?

A.Reboot the hypervisor host

B.Increase the VM's vCPU count

C.Migrate the VM to another host

D.Check the VM console for OS-level issues

AnswerD

Checking the console allows direct assessment of the VM's OS state, such as a hung process or login prompt.

Why this answer

Option D is correct because the first step in troubleshooting an unresponsive VM is to check the VM console for OS-level issues. Option A is wrong because rebooting the host would affect all VMs and should be a last resort. Option B is wrong because increasing the vCPU count could exacerbate resource contention if the host is overloaded.

Option C is wrong because migrating the VM might be premature without diagnosing the root cause.

55

MCQmedium

An organization is using a hybrid cloud model with an on-premises data center connected to a public cloud via a dedicated Direct Connect circuit. Users on the corporate network report that access to cloud resources is intermittently failing. The cloud administrator pings the cloud gateway and sees packet loss. The on-premises network team confirms no issues on their side. The administrator reviews the cloud provider's status page and finds no outages. What should the administrator do next?

A.Submit a support ticket to the cloud provider

B.Reboot the on-premises router

C.Check the bandwidth utilization on the Direct Connect circuit

D.Failover to a backup VPN connection

AnswerC

High utilization can cause packet loss and intermittent failures.

Why this answer

Option C is correct because intermittent packet loss with no provider outage suggests bandwidth saturation on the Direct Connect circuit. Option D is wrong because failing over to a VPN may not address the root cause and could mask it. Option B is wrong because rebooting the on-premises router is disruptive and does not troubleshoot bandwidth.

Option A is wrong because a support ticket is unlikely to help if the provider has no issues and the circuit is simply congested.
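Checking circuit utilization comes down to simple arithmetic on interface byte counters. The sketch below is illustrative only: the function name and the sample figures are assumptions, not values from the question, and real counters would come from the router or the provider's monitoring.

```python
def utilization_percent(bytes_start: int, bytes_end: int,
                        seconds: float, link_bps: float) -> float:
    """Average link utilization (%) between two byte-counter samples."""
    bits_sent = (bytes_end - bytes_start) * 8
    return bits_sent / seconds / link_bps * 100


# Hypothetical samples: 6.75 GB transferred in 60 s on a 1 Gbps circuit.
print(f"{utilization_percent(0, 6_750_000_000, 60, 1_000_000_000):.1f}%")
```

Sustained values close to 100% during the failure windows would support the congestion theory; low values would point elsewhere.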

Practice this question →

56

MCQmedium

During a disaster recovery test, a cloud administrator discovers that the standby database in a different region is not synchronized with the primary. The primary database uses asynchronous replication. What is the MOST likely reason for the sync failure?

A.License expiration on the standby database

B.Network latency causing replication lag

C.Firewall rules blocking port 3306 between regions

D.Incorrect replication configuration using a read replica instead of a standby

AnswerB

Asynchronous replication allows lag; high latency can increase lag, causing data not to be current.

Why this answer

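Option B is correct because asynchronous replication tolerates lag, and high cross-region latency increases it, so the standby can fall behind the primary; if the primary fails before the standby catches up, data may be lost. Option A is wrong because nothing in the scenario points to licensing. Option D is wrong because cross-region replication to a standby is a typical configuration, and nothing indicates a read replica was used by mistake.

Option C is wrong because a firewall blocking port 3306 would prevent replication entirely rather than cause lag.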

Practice this question →

57

Multi-Selecteasy

A cloud administrator is investigating why a virtual machine is running slowly. The administrator checks the hypervisor performance metrics. Which TWO of the following metrics indicate CPU contention? (Choose TWO.)

Select 2 answers

A.High CPU ready time

B.High disk queue depth

C.High CPU co-stop time

D.High memory ballooning

E.High CPU usage percentage

AnswersA, C

CPU ready time directly measures wait time due to contention.

Why this answer

CPU ready time and co-stop time are both indicators of CPU contention. Ready time is the time a VM is ready to run but waiting for a physical CPU; co-stop time is the time a vCPU in a multi-vCPU VM is held back while waiting for its sibling vCPUs to be scheduled. Option E (CPU usage percentage) reflects normal utilization, not contention.

Option D (memory ballooning) is memory-related. Option B (disk queue depth) is storage-related.
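Ready time is often reported as a summation in milliseconds per sample interval, which is hard to judge at a glance. The sketch below converts it to a percentage, assuming a VMware-style summation counter and a 20-second sample interval; both are assumptions and the sample value is hypothetical.

```python
def ready_percent(ready_ms: float, interval_s: float = 20.0) -> float:
    """Convert a CPU ready summation (ms) to a percentage of the interval."""
    return ready_ms * 100 / (interval_s * 1000)


# Hypothetical sample: 2000 ms of ready time in a 20 s interval.
print(ready_percent(2000))  # 10.0
```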

Practice this question →

58

MCQeasy

A cloud administrator is troubleshooting a performance issue where a web application experiences intermittent slowdowns. The application is deployed on a public cloud IaaS with auto-scaling. What should the administrator check first?

A.Verify load balancer configuration

B.Check CPU utilization of all instances

C.Analyze database query performance

D.Review network latency between tiers

AnswerA

Load balancer misconfiguration can cause intermittent failures.

Why this answer

Option A is correct because a misconfigured load balancer can cause uneven traffic distribution, leading to intermittent slowdowns. Option B is wrong because CPU utilization spikes might be a symptom, not the root cause, and auto-scaling should handle them. Option D is wrong because network latency is less likely to be intermittent.

Option C is wrong because database queries are a deeper layer to investigate after the network and application tiers.

Practice this question →

59

MCQmedium

Refer to the exhibit. A cloud engineer is troubleshooting network connectivity to a server with IP 10.0.0.5. The server is on the same subnet. Based on the iptables rules shown, what is the most likely cause of the connectivity failure?

A.The FORWARD chain policy drops all traffic

B.The OUTPUT chain rejects all traffic to the server

C.The DROP rule in INPUT has zero packet count, so it is not effective

D.The INPUT chain drops all traffic destined to the 10.0.0.0/8 network

AnswerD

Correct; the DROP rule in INPUT blocks traffic to 10.0.0.0/8.

Why this answer

Option D is correct because the INPUT chain has a rule that drops all traffic destined to the 10.0.0.0/8 network, which includes the target server at 10.0.0.5. Since the server is on the same subnet, traffic to it must traverse the INPUT chain on the local system, and this DROP rule will match and discard packets before any other rule can accept them. The rule's position and destination match make it the most direct cause of the connectivity failure.

Exam trap

The trap here is that candidates often overlook the INPUT chain's destination match and assume the FORWARD chain is responsible for same-subnet traffic, or they misinterpret a zero packet count as an inactive rule, when in fact the rule is simply waiting for matching traffic.

How to eliminate wrong answers

Option A is wrong because the FORWARD chain only applies to traffic being routed through the system, not to traffic destined for the local system itself; since the server is on the same subnet, packets to 10.0.0.5 are not forwarded but processed locally via the INPUT chain. Option B is wrong because the OUTPUT chain controls traffic leaving the local system, not incoming traffic to the server; rejecting OUTPUT traffic would prevent the local system from sending packets, but the issue is about connectivity to the server, which is inbound. Option C is wrong because a DROP rule with a zero packet count simply means no packets have matched it yet; it is still present and effective in the ruleset, and once traffic matches, the count will increment—zero count does not imply the rule is inactive or ineffective.

Practice this question →

60

Multi-Selecthard

A cloud administrator is troubleshooting performance issues with a cloud object storage bucket that is used for storing large amounts of small files. The application reads and writes objects frequently. Which three actions could improve the performance? (Choose three.)

Select 3 answers

A.Use a multi-region bucket to reduce latency.

B.Increase the number of concurrent requests from the application.

C.Enable transfer acceleration using a CDN.

D.Enable versioning to avoid overwrites.

E.Use a prefix naming scheme that distributes objects across multiple partitions.

AnswersB, C, E

Increasing concurrency can improve throughput as long as the backend can handle it.

Why this answer

Options B, C, and E are correct. Option A is wrong because multi-region buckets increase write latency due to replication. Option D is wrong because versioning adds overhead.

Option E uses prefix naming to increase the achievable request rate, Option C uses transfer acceleration for faster uploads, and Option B increases throughput via concurrency.
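One common way to spread keys across partitions is to prepend a short hash of the object name. The following is a minimal sketch of that idea; the function name, the two-character prefix length, and the sample keys are assumptions, and whether a hashed prefix helps depends on how the specific storage service partitions keys.

```python
import hashlib


def prefixed_key(key: str, prefix_len: int = 2) -> str:
    """Prepend a short hex hash so sequential names land under varied prefixes."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{key}"


# Hypothetical sequential object names.
for name in ("img-0001.jpg", "img-0002.jpg", "img-0003.jpg"):
    print(prefixed_key(name))
```

The mapping is deterministic, so readers can recompute the prefix from the original name without a lookup table.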

Practice this question →

61

MCQeasy

A virtual machine in a cloud environment is experiencing high disk I/O latency. The administrator checks the performance metrics and sees that the disk queue length is consistently above 100. What is the best immediate action?

A.Attach an additional disk and stripe the data

B.Upgrade the VM's network bandwidth

C.Migrate the VM to a host with faster disks

D.Increase the VM's memory

AnswerA

Striping adds parallelism, reducing queue depth and improving latency.

Why this answer

Option A is correct because attaching additional disks and striping (e.g., RAID 0) distributes I/O, reducing queue length. Option D is wrong because memory does not affect disk I/O. Option C is wrong because migration to a different host may not change disk performance.

Option B is wrong because network bandwidth is unrelated.

Practice this question →

62

MCQhard

A cloud administrator is troubleshooting a web application hosted on a cloud virtual machine (VM) that is experiencing intermittent high latency during peak traffic hours. The application is deployed on a single VM instance with 4 vCPUs and 8 GB RAM, running a Linux OS. The VM is connected to a virtual network with a public IP. The administrator has verified that the application code is optimized and there are no memory leaks. CPU utilization remains below 50% during peaks, but network outbound traffic shows periodic spikes up to 500 Mbps. The VM's network interface is configured with a 1 Gbps bandwidth cap. The administrator suspects that the issue is related to network throttling or packet loss. Which of the following actions should the administrator take to resolve the issue?

A.Increase the VM's vCPU count to 8 to improve processing capacity.

B.Upgrade the VM to a larger instance size with higher network bandwidth cap (e.g., 2 Gbps).

C.Configure the firewall to allow all traffic to reduce processing overhead.

D.Enable DDoS protection on the public IP to filter malicious traffic.

AnswerB

This directly addresses the network bottleneck causing latency during traffic spikes.

Why this answer

Option B is correct because the VM's network bandwidth cap of 1 Gbps is being saturated during peak traffic (spikes up to 500 Mbps, but with overhead and burst behavior, the cap can cause throttling and packet loss). Upgrading to a larger instance size with a higher network bandwidth cap (e.g., 2 Gbps) directly addresses the bottleneck by providing more headroom for outbound traffic, reducing latency caused by queueing and drops. The administrator has already ruled out CPU and memory issues, so the network cap is the likely culprit.

Exam trap

The trap here is that candidates may assume CPU or memory is the bottleneck because latency is intermittent, but the question explicitly states CPU is below 50% and memory is fine, so the real issue is the network bandwidth cap, which is a common cloud-specific limitation tied to instance size.

How to eliminate wrong answers

Option A is wrong because increasing vCPUs does not increase network bandwidth capacity; the bottleneck is network throughput, not compute, and CPU utilization is already below 50%. Option C is wrong because configuring the firewall to allow all traffic would not reduce processing overhead in a meaningful way and could actually increase security risks; firewall processing overhead is negligible compared to the bandwidth cap limitation. Option D is wrong because DDoS protection is designed to filter malicious traffic, not to resolve throttling or packet loss caused by legitimate peak traffic exceeding the bandwidth cap.

Practice this question →

63

Multi-Selectmedium

A cloud administrator receives an alert that a VM's disk usage is at 95%. The VM is running a critical database. Which TWO actions should the administrator take to resolve the issue while minimizing downtime?

Select 2 answers

A.Increase the size of the existing disk

B.Shrink an existing partition to free space

C.Clear temporary files and logs

D.Restore the VM from a recent backup to a larger disk

E.Add a new disk and move data to it

AnswersA, C

Many cloud providers allow online disk resizing without rebooting.

Why this answer

Increasing the size of the existing disk (Option A) is correct because it allows the VM to gain additional storage capacity without requiring a reboot or migration, minimizing downtime. Modern hypervisors and cloud platforms support live resizing of virtual disks, and once the disk is expanded, the OS can extend the filesystem online using tools like `resize2fs` (Linux) or Disk Management (Windows). This directly addresses the 95% disk usage alert for the critical database with minimal service interruption. Clearing temporary files and logs (Option C) frees space immediately and requires no downtime.

Exam trap

The trap here is that candidates often choose Option E (add a new disk) thinking it is safer or more standard, but they overlook that expanding the existing disk is faster and causes less downtime for a critical database, and that adding a new disk introduces additional management overhead and potential service interruption.

Practice this question →

64

Matchingmedium

Match each disaster recovery term to its definition.

Concepts

Matches

Maximum time to restore services after outage

Maximum acceptable data loss in time

Automatic switch to standby system

Copy of data for restoration

Documented plan for disaster recovery

Why these pairings

Key metrics and concepts for business continuity.

Practice this question →

65

MCQhard

A company is designing a multi-cloud disaster recovery solution. They need to ensure RPO of 15 minutes and RTO of 1 hour for critical workloads. Which of the following should be implemented?

A.Asynchronous replication to a secondary cloud with a 30-minute delay

B.Synchronous replication to a standby environment in another cloud provider

C.Pilot light environment that is started manually during a disaster

D.Daily backups to object storage in a different region

AnswerB

Correct; synchronous replication provides low RPO and fast failover.

Why this answer

Synchronous replication ensures that data is written to both the primary and standby environments simultaneously, guaranteeing zero data loss and meeting the 15-minute RPO. With a pre-configured standby environment in another cloud provider, failover can occur within minutes, satisfying the 1-hour RTO. This approach provides the lowest possible RPO and RTO for critical workloads.

Exam trap

CompTIA often tests the distinction between synchronous and asynchronous replication, where candidates mistakenly choose asynchronous replication for low RPO requirements, not realizing that asynchronous replication inherently introduces a delay equal to the replication interval.

How to eliminate wrong answers

Option A is wrong because asynchronous replication with a 30-minute delay cannot achieve a 15-minute RPO, as data loss could be up to 30 minutes. Option C is wrong because a pilot light environment that is started manually during a disaster typically has an RTO of hours, not 1 hour, due to the time required to provision and configure resources. Option D is wrong because daily backups to object storage cannot meet a 15-minute RPO, as data loss could be up to 24 hours, and recovery from backups often takes longer than 1 hour.
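The elimination above amounts to comparing each option's worst-case data loss and recovery time against the targets. The sketch below makes that comparison explicit. All figures other than the 30-minute delay and the 24-hour backup interval are illustrative assumptions, not values given in the question.

```python
from dataclasses import dataclass


@dataclass
class Strategy:
    name: str
    worst_case_data_loss_min: float  # effective RPO
    estimated_recovery_min: float    # effective RTO (assumed figures)


def meets_targets(s: Strategy, rpo_min: float, rto_min: float) -> bool:
    """True when both the data-loss and recovery-time targets are satisfied."""
    return (s.worst_case_data_loss_min <= rpo_min
            and s.estimated_recovery_min <= rto_min)


# Recovery times below are hypothetical placeholders for illustration.
options = [
    Strategy("Async replication, 30-minute delay", 30, 30),
    Strategy("Synchronous replication to standby", 0, 15),
    Strategy("Pilot light, manual start", 15, 180),
    Strategy("Daily backups", 24 * 60, 240),
]

for s in options:
    verdict = "meets" if meets_targets(s, rpo_min=15, rto_min=60) else "misses"
    print(f"{s.name}: {verdict} RPO 15 min / RTO 60 min")
```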

Practice this question →

66

MCQeasy

The exhibit shows the output of the df command and an application error. What is the most likely cause of the error?

A.The /dev/shm partition is full.

B.The /var partition is full.

C.The filesystem is corrupt.

D.The inode usage on the root filesystem is exhausted.

AnswerB

The root partition is at 95% usage, and since /var is under /, it is likely full.

Why this answer

Option B is correct because the root partition is at 95% usage, and the application writes to /var/log, which is on the root filesystem. Option A is wrong because the tmpfs mounted at /dev/shm is empty. Option D is plausible, since inode exhaustion would produce the same error, but df shows space usage rather than inode usage, so a space shortage is the more likely cause.

Option C is wrong because nothing in the output indicates corruption.
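Because a full disk and exhausted inodes can produce the same application error, it helps to check both. The following is a rough sketch using Python's `os.statvfs` (Unix only); the function name is an assumption, and the percentages will differ slightly from `df`, which excludes blocks reserved for root.

```python
import os


def filesystem_usage(path: str = "/") -> dict:
    """Return approximate space and inode usage percentages for a mount."""
    st = os.statvfs(path)
    space_pct = 100 * (st.f_blocks - st.f_bfree) / st.f_blocks if st.f_blocks else 0.0
    # Some filesystems report zero total inodes; treat that as 0% used.
    inode_pct = 100 * (st.f_files - st.f_ffree) / st.f_files if st.f_files else 0.0
    return {"space_used_pct": space_pct, "inode_used_pct": inode_pct}


print(filesystem_usage("/"))
```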

Practice this question →

67

Multi-Selectmedium

A cloud engineer is troubleshooting a VM that is experiencing high latency. The VM is hosted on a hypervisor with other VMs. Which TWO metrics should the engineer review to identify if resource contention is occurring?

Select 2 answers

A.Memory ballooning

B.CPU ready time

C.Network packet drops

D.Swap usage

E.Disk queue length

AnswersA, B

Correct; memory ballooning indicates memory contention.

Why this answer

Memory ballooning (A) is a VMware mechanism where the hypervisor reclaims idle memory from a VM by inflating a balloon driver, forcing the VM to swap. High ballooning indicates memory overcommitment and contention, directly causing latency. CPU ready time (B) measures the time a VM is ready to run but waiting for a physical CPU core; elevated ready time signals CPU contention among VMs on the same hypervisor.

Exam trap

CompTIA often tests the distinction between guest-level metrics (swap usage, disk queue length) and hypervisor-level metrics (ballooning, ready time), and the trap here is that candidates confuse swap usage (guest OS paging) with memory ballooning (hypervisor reclaim), or assume network packet drops indicate VM contention rather than network issues.

Practice this question →

68

Multi-Selecteasy

A cloud administrator is troubleshooting a virtual machine that is experiencing high memory usage. The VM is running a web server. Which two metrics should the administrator monitor to determine if the VM needs additional memory? (Choose two.)

Select 2 answers

A.Swap usage

B.Disk I/O wait

C.Page fault rate

D.Available memory

E.CPU ready time

AnswersA, D

High swap usage indicates the OS is using disk as memory, a sign of insufficient physical memory.

Why this answer

Options A and D are correct. Available memory directly shows free memory, and swap usage indicates the OS is using disk as memory, which is a sign of insufficient RAM. Option E (CPU ready time) is a CPU metric.

Option B (disk I/O wait) is disk-related. Option C (page fault rate) indicates paging but is not as direct as swap usage.
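On a Linux guest, both metrics can be read from `/proc/meminfo`. The sketch below parses text in that format; the sample values are hypothetical, and the helper names are assumptions made for illustration.

```python
def parse_meminfo(text: str) -> dict:
    """Parse '/proc/meminfo'-style lines into {field: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])
    return values


def memory_pressure(info: dict) -> dict:
    """Summarize swap in use and the share of memory still available."""
    return {
        "swap_used_kb": info["SwapTotal"] - info["SwapFree"],
        "available_pct": 100 * info["MemAvailable"] / info["MemTotal"],
    }


# Hypothetical sample; on a real host, read the text from /proc/meminfo.
SAMPLE = """MemTotal:        8000000 kB
MemAvailable:     400000 kB
SwapTotal:       2000000 kB
SwapFree:         500000 kB"""

print(memory_pressure(parse_meminfo(SAMPLE)))
```

Heavy swap use combined with a low available percentage is the pattern that suggests the VM needs more memory.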

Practice this question →

69

MCQmedium

A cloud administrator is troubleshooting connectivity issues between two virtual networks in different regions. The VNets are peered, but instances cannot communicate. The administrator verifies that the peering status is 'Connected' and route tables appear correct. Which of the following should be checked next?

A.Network Security Group (NSG) rules on the instances and subnets

B.DNS resolution settings

C.Gateway subnet configuration

D.Service endpoint status

AnswerA

Correct; NSGs can block traffic even if VNet peering is established.

Why this answer

Even when VNet peering status shows 'Connected' and route tables are correct, Network Security Group (NSG) rules can still block traffic. NSGs act as a stateful firewall at the subnet or NIC level, and by default they deny all inbound traffic unless explicitly allowed. Since the administrator has already verified routing, the next logical step is to check NSG rules for any implicit deny or missing allow rules that could be dropping the inter-region traffic.

Exam trap

The trap here is that candidates assume a 'Connected' peering status guarantees traffic flow, but they overlook that NSGs can silently drop traffic even when peering and routing are correctly configured.

How to eliminate wrong answers

Option B is wrong because DNS resolution settings affect name resolution, not IP-level connectivity; if instances cannot communicate via IP, DNS is irrelevant. Option C is wrong because gateway subnets are only used for VPN or ExpressRoute gateways, not for VNet peering; peering does not require a gateway. Option D is wrong because service endpoints are used to secure Azure service access (e.g., Storage, SQL) from a VNet, not for traffic between peered VNets; they do not control inter-VNet communication.

Practice this question →

70

MCQmedium

A cloud load balancer is not distributing traffic evenly to backend servers. All servers pass health checks. Which of the following is the most likely cause?

A.The health check interval is set too long.

B.One of the backend servers has reached its connection limit.

C.Session persistence is enabled and directing traffic to specific servers.

D.The health check path is incorrect.

AnswerC

Sticky sessions bind clients to a server, causing imbalance.

Why this answer

Option C is correct because session persistence (sticky sessions) can cause uneven distribution. Option D is incorrect because health checks pass, so the health check path is valid. Option B is incorrect because nothing indicates a server has exhausted its capacity.

Option A is incorrect because the health check interval affects how quickly unhealthy servers are removed, not how traffic is distributed.

Practice this question →

71

MCQeasy

A cloud administrator runs a deployment script that creates multiple resources using Infrastructure as Code (IaC). The script fails with a "400 Bad Request" error when attempting to create a storage account. Which troubleshooting step should the administrator take first?

A.Check the network connectivity to the cloud API endpoint.

B.Increase the timeout value for the API call.

C.Review the error message details for a specific validation error.

D.Verify that the script has the correct region parameter.

AnswerC

The first step is to examine the error message to identify the invalid parameter.

Why this answer

Option C is correct because a 400 error indicates a client error, so the first step is to examine the error details to understand which parameter is invalid. Option A is wrong because a 400 response means the request reached the API; a network failure would surface as a connection error or timeout instead. Option D is plausible, since a bad region parameter is one possible cause, but it is wrong as a first step because the error message would specify the exact issue.

Option B is wrong because timeouts result in different errors.

Practice this question →

72

MCQeasy

Refer to the exhibit. An administrator runs the command shown and receives the output. The administrator wants to ensure the VM uses SSDs for the OS disk. Based on the output, what is the current storage type?

A.Premium SSD (Premium_LRS)

B.Ultra Disk (UltraSSD_LRS)

C.Standard HDD (Standard_LRS)

D.Standard SSD (StandardSSD_LRS)

AnswerA

Correct; Premium_LRS indicates premium SSD.

Why this answer

The output shows the OS disk is using `Premium_LRS` as the storage account type, which corresponds to Premium SSD. This is confirmed by the `storageAccountType` field in the JSON output, indicating the disk is backed by solid-state drives with premium performance characteristics.

Exam trap

CompTIA often tests the distinction between storage account type names and their corresponding hardware (e.g., confusing `StandardSSD_LRS` with `Premium_LRS`), leading candidates to overlook the exact string in the output and assume any SSD type is correct.

How to eliminate wrong answers

Option B is wrong because Ultra Disk (UltraSSD_LRS) is a separate storage type that offers higher IOPS and lower latency but is not shown in the output; the output explicitly lists `Premium_LRS`. Option C is wrong because Standard HDD (Standard_LRS) uses magnetic spinning disks, not SSDs, and would appear as `Standard_LRS` in the output. Option D is wrong because Standard SSD (StandardSSD_LRS) is a different tier that uses SSDs but with lower performance than Premium SSD; the output shows `Premium_LRS`, not `StandardSSD_LRS`.
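Reading the exact string from the output can also be done programmatically. The sketch below searches a JSON document for the `storageAccountType` field and maps it to the names used in the answer options. The sample JSON structure is hypothetical, since the exhibit is not reproduced here.

```python
import json

# Mapping taken from the answer options above.
DISK_TYPES = {
    "Premium_LRS": "Premium SSD",
    "UltraSSD_LRS": "Ultra Disk",
    "Standard_LRS": "Standard HDD",
    "StandardSSD_LRS": "Standard SSD",
}


def find_storage_type(node):
    """Depth-first search for the first storageAccountType value."""
    if isinstance(node, dict):
        if "storageAccountType" in node:
            return node["storageAccountType"]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_storage_type(child)
        if found is not None:
            return found
    return None


# Hypothetical output shape; only the field name comes from the explanation.
SAMPLE = '{"osDisk": {"managedDisk": {"storageAccountType": "Premium_LRS"}}}'
sku = find_storage_type(json.loads(SAMPLE))
print(sku, "->", DISK_TYPES.get(sku, "unknown"))
```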

Practice this question →

73

MCQmedium

A cloud engineer is troubleshooting an issue where users cannot connect to a web application hosted on a cloud VM. The VM's security group allows HTTP (port 80) from 0.0.0.0/0, and the VM's OS firewall is disabled. The engineer can ping the VM's public IP from the internet. What is the most likely cause of the issue?

A.OS firewall is blocking port 80

B.Incorrect routing table on the VM

C.Security group rule is applied to the wrong subnet

D.Web server service is not running on the VM

AnswerD

If the web server is not running, it won't respond on port 80, even though the network allows it.

Why this answer

Since the OS firewall is disabled and the security group allows HTTP from 0.0.0.0/0, the only remaining layer that could block connectivity is the application itself. If the web server service (e.g., Apache, Nginx, IIS) is not running on the VM, it will not listen on TCP port 80, so HTTP requests will be refused even though network-level access is permitted. The ability to ping the VM confirms IP-level reachability, isolating the issue to the application layer.

Exam trap

The trap here is that candidates assume a ping success implies all services are reachable, but ICMP (ping) operates at the network layer (Layer 3) and does not test TCP port availability, so a running web server is required for HTTP connectivity.

How to eliminate wrong answers

Option A is wrong because the OS firewall is explicitly stated as disabled, so it cannot be blocking port 80. Option B is wrong because routing tables on the VM control outbound traffic, not inbound connections to the VM; inbound traffic is handled by the cloud provider's virtual network and security groups. Option C is wrong because security groups are stateful and applied at the VM network interface level, not to subnets; even if the rule were misapplied, the VM's security group explicitly allows HTTP from 0.0.0.0/0, so this is not the cause.
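Since ping only proves Layer 3 reachability, a TCP connection attempt is the more telling test. The following is a minimal sketch using Python's standard `socket` module; the function name and the target address are placeholders, not details from the question.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder target; substitute the VM's public IP when testing.
    print(tcp_port_open("127.0.0.1", 80))
```

A refused connection with working ping is consistent with no service listening on the port, which matches the scenario in this question.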

Practice this question →

74

Multi-Selecteasy

Which TWO are common causes of virtual machine performance degradation in a cloud environment?

Select 2 answers

A.Misconfigured security groups

B.Insufficient memory allocation

C.Network latency to storage

D.Incorrect DNS settings

E.High CPU ready time

AnswersB, E

Insufficient memory causes swapping and slow performance.

Why this answer

Options B and E are correct. High CPU ready time indicates contention, and insufficient memory causes swapping. Option A is incorrect because security groups affect connectivity, not performance.

Option C is partially true, but network latency to storage is a less common cause than CPU or memory contention. Option D is incorrect because incorrect DNS settings cause resolution issues, not performance degradation.

Practice this question →

75

MCQeasy

A cloud engineer notices that an application is running slower than expected. Monitoring shows that the CPU utilization is consistently below 30%, but memory usage is at 95%. Which of the following is the most likely cause of the performance issue?

A.Insufficient disk space for application logs

B.Insufficient memory causing swapping to disk

C.Network bandwidth saturation

D.CPU contention due to overprovisioning

AnswerB

Correct; high memory usage leads to swapping, slowing performance.

Why this answer

When memory usage is at 95% and CPU utilization is low, the system is likely thrashing—the operating system is forced to page memory to disk (swap) to free RAM. Disk I/O is orders of magnitude slower than RAM, so even with idle CPU, the application stalls waiting for swap operations. This explains the performance degradation despite low CPU load.

Exam trap

The trap here is that candidates often associate performance issues solely with CPU or network bottlenecks, overlooking the severe impact of memory exhaustion and disk swapping, which can masquerade as a slow application with ample CPU headroom.

How to eliminate wrong answers

Option A is wrong because insufficient disk space for logs would cause write failures or application crashes, not a gradual slowdown with high memory and low CPU. Option C is wrong because network bandwidth saturation would manifest as high latency or packet loss, not as high memory usage with low CPU. Option D is wrong because CPU contention due to overprovisioning would show high CPU ready times or steal time, not consistently low CPU utilization; overprovisioning typically leads to CPU starvation, not memory exhaustion.

Practice this question →
