Knowledge + Practice

CCNA Monitoring and Troubleshooting Questions

75 of 300 questions · Page 4/4 · Monitoring and Troubleshooting · Answers revealed


226

MCQmedium

A developer reports that an application using Amazon DynamoDB is experiencing high latency during peak hours. The table has a provisioned capacity of 500 read capacity units (RCUs) and 500 write capacity units (WCUs). The application uses eventually consistent reads and the table is about 50 GB. The developer notices throttled write requests in CloudWatch. Which action would most effectively reduce write throttling?

A.Enable DynamoDB Accelerator (DAX) for the table.

B.Create a global secondary index on the table.

C.Increase the provisioned write capacity for the table.

D.Switch from eventually consistent reads to strongly consistent reads.

AnswerC

Increasing write capacity units reduces throttling for write requests.

Why this answer

The developer reports throttled write requests, which directly indicates that the provisioned write capacity (500 WCUs) is insufficient to handle the peak write traffic. Increasing the provisioned write capacity for the table is the most direct and effective action to eliminate write throttling, as it raises the limit on write operations per second. Option C is correct because it addresses the root cause—write capacity exhaustion—without introducing unnecessary components or changing read behavior.

Exam trap

The trap here is that candidates may confuse read performance solutions (DAX, consistency changes) with write throttling issues, or incorrectly assume that adding a GSI will offload write traffic, when in fact every base-table write that touches indexed attributes also consumes write capacity on the index.

How to eliminate wrong answers

Option A is wrong because DynamoDB Accelerator (DAX) is an in-memory cache that reduces read latency, not write throttling; it does not increase write capacity. Option B is wrong because creating a global secondary index (GSI) does not reduce write throttling on the base table; a GSI has its own provisioned write capacity, and if the index is under-provisioned, writes to the base table can be throttled as well, potentially worsening the problem. Option D is wrong because switching from eventually consistent reads to strongly consistent reads doubles the read capacity consumed per read request, which increases read throttling risk and has no effect on write throttling.
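To see why 500 WCUs can fall short at peak, it helps to estimate the write capacity a workload needs. The sketch below assumes standard (non-transactional) writes, where one WCU covers one write per second of an item up to 1 KB. The function name and the sample workload are hypothetical.

```python
import math


def wcus_needed(item_size_bytes: int, writes_per_second: float) -> int:
    """Estimate WCUs for standard writes: 1 WCU per 1 KB (rounded up) per write per second."""
    units_per_write = max(1, math.ceil(item_size_bytes / 1024))
    return math.ceil(units_per_write * writes_per_second)


# Hypothetical peak: 300 writes per second of 2.5 KB items.
# Each write rounds up to 3 WCUs, so the estimate exceeds 500 provisioned WCUs.
print(wcus_needed(2560, 300))
```

If the estimate is above the provisioned figure during peak hours, write throttling is the expected result.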


227

MCQmedium

An application running on Amazon EC2 is unable to connect to an Amazon RDS for SQL Server DB instance. The security group for the RDS instance allows inbound traffic from the security group of the EC2 instance on port 1433. The network ACLs allow all traffic. What is a likely cause of the connectivity issue?

A.The RDS instance is in a private subnet and does not have a public IP address

B.The network ACL is blocking the traffic

C.The security group inbound rule is incorrectly configured

D.The database port is not 1433

AnswerA

Without a public IP, the EC2 instance cannot reach it over the internet.

Why this answer

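Option A is correct because RDS instances are not accessible from the internet by default; to be reached over a public endpoint they must be launched in a public subnet with a public IP. This matters when the EC2 instance connects from outside the database's VPC; inside the same VPC, private addressing is enough. Option C is wrong because the security group rule is properly configured. Option B is wrong because the network ACLs allow all traffic. Option D is wrong because port 1433 is the default SQL Server port.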


228

Multi-Selecthard

Which THREE actions should be taken to troubleshoot a high number of ThrottlingExceptions from Amazon DynamoDB? (Choose 3.)

Select 3 answers

A.Examine the ConsumedWriteCapacity and ThrottledWriteCount metrics in CloudWatch

B.Enable DynamoDB Streams to offload writes

C.Implement exponential backoff in the application

D.Increase the write capacity units for the table

E.Change the read consistency to eventual

AnswersA, C, D

Helps identify if capacity is exceeded.

Why this answer

A, C, and D are correct. Examining CloudWatch metrics (published by DynamoDB as ConsumedWriteCapacityUnits and WriteThrottleEvents) helps identify throttling patterns. Implementing exponential backoff is a best practice for retrying throttled requests. Increasing provisioned capacity resolves throttling caused by insufficient capacity. B is wrong because enabling DynamoDB Streams does not affect throttling. E is wrong because changing the read consistency model does not affect write throttling.
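Exponential backoff can be sketched in a few lines. This is a minimal illustration, not the retry logic of any AWS SDK (the SDKs generally ship their own). `ThrottledError` and `call_with_backoff` are hypothetical names, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical stand-in for a throttling error raised by a database client."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.05,
                      max_delay=2.0, sleep=time.sleep):
    """Retry `operation` on ThrottledError, doubling the delay cap each attempt."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThrottledError:
            if attempt == max_attempts - 1:
                raise  # Out of attempts: surface the error to the caller.
            # Full jitter: wait a random time up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

The random jitter spreads retries from many clients over time, so they do not all hit the table again at the same instant.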


229

MCQeasy

A database administrator is troubleshooting an issue where an Amazon RDS for PostgreSQL DB instance is not allowing connections. The administrator checks the security group and network ACLs, and they are correctly configured. What is the next step to diagnose the issue?

A.Reboot the DB instance

B.Modify the DB instance's parameter group

C.Create a DB snapshot

D.Review the DB instance error logs in Amazon CloudWatch

AnswerD

Logs can show reason for connection failures.

Why this answer

Option D is correct because checking the DB instance logs can reveal connection issues such as authentication failures or the maximum number of connections being reached. Option A is wrong because rebooting may not resolve the underlying issue and reveals nothing about its cause. Option B is wrong because modifying the parameter group is not a diagnostic step. Option C is wrong because creating a snapshot is not a diagnostic step.


230

MCQhard

A company is using Amazon Redshift for data warehousing. The VACUUM operation is taking longer than expected, and the database administrator wants to identify the tables that require the most vacuuming effort. Which system table should be queried to find the percentage of deleted rows per table?

A.STL_QUERY

B.STV_TBL_PERM

C.PG_TABLE_DEF

D.SVV_TABLE_INFO

AnswerD

SVV_TABLE_INFO reports per-table figures such as the unsorted percentage and row counts with and without rows marked for deletion.

Why this answer

Option D is correct because SVV_TABLE_INFO contains the unsorted and deleted-row information needed. Option A is wrong because STL_QUERY contains query logs. Option B is wrong because STV_TBL_PERM contains permanent table information but not vacuum metrics. Option C is wrong because PG_TABLE_DEF contains table definitions.
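As a rough sketch of how the deleted-row percentage could be derived, the helper below assumes two values read from SVV_TABLE_INFO: `tbl_rows` (total rows, including rows marked for deletion) and `estimated_visible_rows` (rows excluding those marked for deletion). The table names and numbers are made up for illustration.

```python
def deleted_row_pct(tbl_rows: int, estimated_visible_rows: int) -> float:
    """Percentage of rows marked for deletion but not yet vacuumed."""
    if tbl_rows <= 0:
        return 0.0
    deleted = max(0, tbl_rows - estimated_visible_rows)
    return 100.0 * deleted / tbl_rows


# Hypothetical rows: (table, tbl_rows, estimated_visible_rows)
tables = [
    ("customers", 200_000, 198_000),
    ("orders", 1_000_000, 750_000),
]

# Rank tables so the ones needing the most vacuuming effort come first.
ranked = sorted(tables, key=lambda t: deleted_row_pct(t[1], t[2]), reverse=True)
for name, total, visible in ranked:
    print(name, deleted_row_pct(total, visible))
```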


231

Multi-Selectmedium

A company is using Amazon Redshift for analytics. The database administrator notices that some queries are slow and the system is running out of memory. Which THREE steps should the administrator take to improve performance?

Select 3 answers

A.Increase the node size (scale up) to get more memory per node

B.Optimize the table design by choosing appropriate distkeys and sortkeys

C.Add more nodes to the cluster to increase total memory

D.Run the VACUUM command to reclaim space from deleted rows

E.Configure workload management (WLM) to limit the number of concurrent queries

AnswersB, C, E

Better data distribution reduces memory usage during joins.

Why this answer

Option B is correct because distkey and sortkey optimization reduces data movement. Option C is correct because adding nodes increases total memory. Option E is correct because WLM queues can limit concurrency to reduce memory contention. Option A is incorrect because increasing the node count (scale out) is better than scaling up for memory. Option D is incorrect because VACUUM reclaims disk space, not memory.


232

Multi-Selecthard

A company is monitoring an Amazon RDS for Oracle instance. CloudWatch alarms show that FreeableMemory is consistently below 256 MB. The database has high read and write I/O. Which THREE steps should the database specialist take to diagnose the issue?

Select 3 answers

A.Check the MemoryPressure and LogFileSyncDuration metrics in CloudWatch.

B.Review the Oracle memory advisor (V$MEMORY_TARGET_ADVICE).

C.Enable storage auto scaling to increase allocated storage.

D.Increase the DB instance class to allocate more memory.

E.Query V$SGASTAT and V$PGASTAT to understand memory allocation.

AnswersA, B, E

These metrics indicate memory pressure and potential performance impact.

Why this answer

Options A, B, and E are correct. Checking the memory-related CloudWatch metrics (A), reviewing the Oracle memory advisor (B), and querying V$SGASTAT and V$PGASTAT to see how memory is allocated (E) are all appropriate diagnostic steps. Option C is wrong because storage auto scaling does not address memory. Option D is wrong because increasing the instance class might be a solution, but it is not a diagnostic step.


233

Multi-Selectmedium

Which TWO metrics should be monitored to detect a memory leak in an RDS for Oracle instance? (Choose 2.)

Select 2 answers

A.SwapUsage

B.FreeableMemory

C.DatabaseConnections

D.CPUUtilization

E.ReadIOPS

AnswersA, B

Increasing swap usage indicates the OS is paging memory to disk, sign of memory pressure.

Why this answer

Options A and B are correct because FreeableMemory shows available memory, and SwapUsage indicates when the OS uses swap due to memory pressure; a steady decline in the first and growth in the second are both signs of a memory leak. Option C is wrong because DatabaseConnections does not directly indicate a memory leak. Option D is wrong because CPUUtilization may be high but is not specific to a memory leak. Option E is wrong because ReadIOPS relates to I/O, not memory.


234

MCQhard

A financial services company runs a critical application on Amazon DynamoDB. Recently, they observed increased latency and throttled requests. Upon reviewing CloudWatch metrics, they see that WriteCapacityUnits consumed is consistently below the provisioned capacity, but ReadCapacityUnits consumed is frequently at 100%. The application performs an equal mix of strongly consistent reads and writes. What is the most likely cause of the throttling?

A.Hot partitions causing uneven read traffic distribution.

B.Adaptive capacity is automatically adjusting partition throughput.

C.Write capacity is insufficient for the workload.

D.Strongly consistent reads are consuming twice the RCU.

AnswerA

Hot partitions lead to throttling even if overall capacity is not exhausted.

Why this answer

Option A is correct because hot partitions concentrate read traffic on a few keys, which throttles requests to those partitions regardless of how capacity is spread across the rest of the table. Option B is wrong because adaptive capacity helps but does not eliminate hot partitions. Option C is wrong because writes are not throttled. Option D is wrong because, although strongly consistent reads consume twice the read capacity units of eventually consistent reads, the issue is hot partitions.
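The read-capacity arithmetic behind option D can be made concrete. The sketch assumes the standard rule that one RCU covers one strongly consistent read per second of an item up to 4 KB, and that an eventually consistent read costs half as much. The function name and sample figures are hypothetical.

```python
import math


def rcus_needed(item_size_bytes: int, reads_per_second: float,
                strongly_consistent: bool = True) -> int:
    """Estimate RCUs: 1 RCU per 4 KB (rounded up) per strongly consistent read per second."""
    units_per_read = max(1, math.ceil(item_size_bytes / 4096))
    if not strongly_consistent:
        units_per_read = units_per_read / 2  # Eventually consistent reads cost half.
    return math.ceil(units_per_read * reads_per_second)


# Hypothetical workload: 100 reads per second of 6 KB items.
print(rcus_needed(6144, 100, strongly_consistent=True))
print(rcus_needed(6144, 100, strongly_consistent=False))
```

This is a table-level estimate only; it says nothing about how reads are spread across partitions, which is the point of the hot-partition answer.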


235

MCQeasy

A database administrator notices that an Amazon RDS for MySQL DB instance's CPU utilization is consistently above 90% during peak hours. Which initial troubleshooting step should the administrator take?

A.Increase the DB instance size to handle the load.

B.Use Amazon RDS Performance Insights to identify the queries consuming CPU.

C.Enable Multi-AZ deployment to distribute the load.

D.Disable slow query logging to reduce CPU overhead.

AnswerB

Performance Insights helps pinpoint the source of high CPU usage.

Why this answer

Option B is correct because Performance Insights provides detailed database performance metrics and helps identify the root cause of high CPU usage. Option A is wrong because increasing instance size is a reactive measure, not a troubleshooting step. Option C is wrong because enabling Multi-AZ addresses high availability, not performance. Option D is wrong because disabling slow query logging would remove diagnostic data.


236

MCQhard

A company is running an Amazon DocumentDB cluster. The application is experiencing high write latency. The cluster has a single instance. What should be done to identify the cause of the latency?

A.Upgrade the instance to a larger size.

B.Enable Performance Insights and review the top wait events.

C.Add a replica to distribute the write load.

D.Change the storage type to Provisioned IOPS.

AnswerB

Performance Insights reveals database bottlenecks and wait events.

Why this answer

Option B is correct because enabling Performance Insights helps identify slow queries and the wait events behind them. Option A is wrong because increasing instance size is a reactive step, not a diagnosis. Option C is wrong because adding a replica does not diagnose latency, and replicas serve reads rather than writes. Option D is wrong because changing storage type may not address latency.


237

MCQhard

A team is using Amazon DynamoDB Accelerator (DAX) to improve read performance for a table. They notice that DAX is returning stale data even though the TTL is set to 5 minutes. The table is updated frequently by multiple writers. What is the most likely cause of the stale reads?

A.The DAX cluster is not large enough to cache all items, causing cache misses

B.DAX is configured with eventual consistency, which returns stale data by design

C.The DAX cluster is deployed in a different Availability Zone than the application

D.The TTL is too long, causing cached items to remain after updates

AnswerD

A long TTL means cached items are not invalidated quickly after updates.

Why this answer

Option D is correct because DAX is a write-through cache only for writes that pass through it; if any of the writers update the table directly, cached items stay unchanged until their TTL expires, so a long TTL means stale reads. Option A is incorrect because an undersized cluster causes cache misses, which are served from the table and are therefore fresh, not stale. Option B is incorrect because eventual consistency alone does not explain staleness that lasts as long as the TTL. Option C is incorrect because Availability Zone placement affects latency, not data freshness.


238

MCQhard

A company is using Amazon ElastiCache for Redis and notices that the cache hit ratio is low. The application is frequently reading data that is not in the cache. Which action would be most effective in improving the cache hit ratio?

A.Increase the number of replicas in the replication group.

B.Decrease the TTL of cached items to ensure freshness.

C.Pre-warm the cache by loading frequently accessed data from the database.

D.Enable Multi-AZ for automatic failover.

AnswerC

Pre-warming ensures that the most requested data is already in the cache, improving hit ratio.

Why this answer

Option C is correct: pre-warming the cache with frequently accessed data, combined with lazy loading on a miss, increases the hit ratio because the working set is already cached when requests arrive. Option A is wrong because adding replicas increases read throughput but does not add new keys to the cache. Option B is wrong because a shorter TTL makes items expire sooner, which tends to lower the hit ratio. Option D is wrong because Multi-AZ improves availability, not the hit ratio.
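Pre-warming and lazy loading can be illustrated with an in-memory stand-in for the cache. This is a toy sketch, not ElastiCache or Redis client code; `WarmableCache` and `load_from_db` are hypothetical names, and a plain dictionary plays the role of the cache.

```python
class WarmableCache:
    """Toy cache that supports pre-warming and lazy loading on a miss."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # Callable that fetches a value by key.
        self._store = {}
        self.hits = 0
        self.misses = 0

    def warm(self, keys):
        """Load frequently accessed keys before traffic arrives."""
        for key in keys:
            self._store[key] = self._load(key)

    def get(self, key):
        """Return the cached value, loading it from the database on a miss."""
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = self._load(key)
        self._store[key] = value
        return value
```

With the hot keys warmed up front, the first requests for them count as hits instead of misses, which is what raises the hit ratio.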


239

MCQmedium

A company is using Amazon DynamoDB with auto scaling enabled. The application is experiencing higher than expected write throttling. Which action should be taken to resolve this issue?

A.Increase the minimum provisioned capacity for the table.

B.Disable auto scaling and set a fixed provisioned capacity.

C.Decrease the maximum provisioned capacity to limit writes.

D.Switch the table to on-demand capacity mode.

AnswerA

Increasing the minimum capacity ensures that the table can handle baseline traffic and reduces the chance of throttling during spikes.

Why this answer

Auto scaling can lag behind sudden traffic spikes. Increasing the minimum provisioned capacity can help reduce throttling during bursts, so Option A is correct. Option B is incorrect because auto scaling already adjusts capacity, and a fixed setting removes that flexibility. Option C is incorrect because decreasing capacity would worsen throttling. Option D is incorrect because, although on-demand mode could solve throttling, it may be costly and the question asks for a resolution with auto scaling.


240

MCQeasy

A database administrator is monitoring an Amazon RDS for SQL Server DB instance and notices that the FreeableMemory metric is consistently below 200 MB. Which of the following actions is most appropriate to mitigate performance issues?

A.Modify the DB instance's maintenance window to off-peak hours

B.Disable the SQL Server Agent and error logging

C.Enable automatic backups with a shorter retention period

D.Scale up the DB instance to a larger instance class with more memory

AnswerD

More memory increases the freeable memory available to the database engine.

Why this answer

Option D is correct because low freeable memory can cause performance degradation; increasing the memory available to the instance (by scaling up) helps. Option A is wrong because modifying the maintenance window does not help. Option B is wrong because disabling the SQL Server Agent and error logging is not a standard practice. Option C is wrong because changing automatic backup settings does not affect memory.


241

MCQeasy

A database administrator notices that an Amazon RDS for SQL Server DB instance has been in the 'storage-optimization' state for several hours after modifying the storage type from gp2 to io1. What should the administrator do to resolve this?

A.Wait for the storage optimization to complete.

B.Restore from the latest snapshot and reapply the modification.

C.Cancel the modification by modifying the DB instance back to gp2.

D.Reboot the DB instance.

AnswerA

Storage optimization is automatic and takes time; no action is needed.

Why this answer

Option A is correct because 'storage-optimization' is a normal state that occurs when modifying the storage configuration, and no action is required. Option B is incorrect because there is no need to restore from a snapshot. Option C is incorrect because the modification is in progress, and there is no need to cancel it. Option D is incorrect because rebooting does not speed up the optimization.


242

MCQeasy

A company is using Amazon RDS for SQL Server with native backup and restore. The backup process is failing with an error indicating insufficient disk space for the backup file. The DB instance has 200 GB of allocated storage, and the backup file is 50 GB. What should the database administrator do to resolve this issue?

A.Change the storage type to Provisioned IOPS for better performance

B.Increase the allocated storage for the RDS instance

C.Grant the rds_backup user additional permissions to write to S3

D.Switch to automated backups instead of native backups

AnswerB

More storage space allows the backup file to be written.

Why this answer

Option B is correct because native backups are stored on the instance's attached storage; if the storage is full, backups fail. Increasing allocated storage provides more space. Option A is incorrect because changing storage type does not add space. Option C is incorrect because the error is about disk space, not permissions. Option D is incorrect because automated backups are stored in S3 and do not use instance storage, so switching to them does not fix the native backup failure.


243

Multi-Selecthard

A database engineer is troubleshooting slow query performance on an Amazon RDS for PostgreSQL instance. The instance is db.r5.large with 500 GB of General Purpose SSD (gp2) storage. CloudWatch metrics show high Read Latency and high Read IOPS, but low CPU utilization. Which TWO actions should the engineer take to improve performance?

Select 2 answers

A.Create a read replica and offload read queries to it.

B.Increase the DB instance class to a larger size, such as db.r5.2xlarge.

C.Enable Multi-AZ to use the standby for read traffic.

D.Optimize queries by adding appropriate indexes.

E.Switch from General Purpose SSD (gp2) to Provisioned IOPS SSD (io1) with a higher IOPS rate.

AnswersA, E

Read replicas reduce the read IOPS on the primary, which can lower latency on the primary.

Why this answer

A is correct because creating a read replica offloads read queries from the primary instance, reducing the read IOPS and read latency on the primary. E is correct because Provisioned IOPS storage with a higher IOPS rate raises the I/O ceiling that the gp2 volume is hitting. Both actions address the high Read Latency and high Read IOPS metrics directly; since CPU utilization is low, the bottleneck is I/O, not compute, so a larger instance class is not the first choice.

Exam trap

The trap here is that candidates often assume the Multi-AZ standby can serve read traffic, but in a Multi-AZ DB instance deployment of Amazon RDS for PostgreSQL the standby does not accept read-only queries; only read replicas can offload reads.
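The I/O ceiling in this scenario can be estimated with the general gp2 rule of 3 baseline IOPS per GiB, with a floor of 100 and a cap of 16,000. The sketch below applies that rule as a rough approximation; it ignores burst credits and any volume striping RDS may apply, so the effective figure on a real instance can differ. The function name is hypothetical.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Approximate gp2 baseline: 3 IOPS per GiB, floor 100, cap 16,000."""
    return max(100, min(3 * size_gib, 16_000))


# The 500 GB volume from the question.
print(gp2_baseline_iops(500))
```

A workload that needs sustained read IOPS well above this baseline is a candidate for io1 with an explicitly provisioned rate.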


244

Multi-Selecthard

A company is running a production Amazon Aurora MySQL-Compatible Edition database. The database has recently experienced several failovers due to replica lag. The DBA needs to implement monitoring to detect replica lag early. Which THREE metrics should be monitored to assess replication health? (Select THREE.)

Select 3 answers

A.DatabaseConnections

B.ActiveTransactions

C.ReplicaLag

D.BufferCacheHitRatio

E.AuroraReplicaLag

AnswersA, C, E

High connection count may indicate application retries due to failover.

Why this answer

Option E is correct because AuroraReplicaLag is the direct metric for Aurora replica lag. Option C is correct because ReplicaLag is the standard metric for MySQL replication lag. Option A is correct because DatabaseConnections can indirectly indicate issues if application retries increase. Option D is wrong because BufferCacheHitRatio is about cache efficiency, not replication. Option B is wrong because ActiveTransactions is about transactions, not replication.


245

MCQhard

You are managing an Amazon RDS for PostgreSQL Multi-AZ DB instance that handles a high-traffic e-commerce application. Recently, the database has been experiencing intermittent slowdowns during peak hours. You have enabled Enhanced Monitoring and Performance Insights. After reviewing the Performance Insights dashboard, you notice that the 'db.sql.queries.avg_latency' metric spikes during the slowdowns, and the top SQL queries are all simple SELECT statements on a frequently accessed 'orders' table. The table has over 10 million rows and is indexed on 'order_id', 'customer_id', and 'order_date'. The average query latency for these SELECT statements jumps from 5 ms to over 500 ms during the spikes. You also observe that the 'ReadIOPS' metric on the DB instance is consistently below the IOPS limit of the gp2 storage. The DB instance class is db.t3.medium with 4 GB memory. The 'DatabaseConnections' metric shows that the number of connections is well within the max_connections limit (set to 200). However, the 'CPUCreditBalance' metric for the instance drops to near zero during the spikes. The 'CPUUtilization' metric is below 50%. Which of the following is the MOST likely cause and the appropriate action to resolve the issue?

A.The issue is due to CPU credit exhaustion on the T3 instance. Change the DB instance class from db.t3.medium to a dedicated CPU instance type such as db.r5.large.

B.The issue is due to a memory bottleneck causing queries to spill to disk. Increase the instance memory by moving to a db.r5.xlarge instance.

C.The issue is caused by a connection pool exhaustion. Increase the max_connections parameter and use connection pooling.

D.The issue is caused by a missing index on the 'orders' table. Add a composite index on the columns used in the WHERE clauses.

AnswerA

The T3 instance uses CPU credits; when credits are exhausted, performance is throttled. Changing to a dedicated instance eliminates credit-based performance.

Why this answer

The correct answer is A. The 'CPUCreditBalance' dropping to near zero on a T3 instance indicates CPU credit exhaustion, which limits the instance to its baseline CPU performance and causes query latency spikes. Even though 'CPUUtilization' is below 50%, T3 instances rely on CPU credits to run above baseline; when credits are exhausted, the instance is limited to the baseline (20% per vCPU for t3.medium), causing severe latency increases even for simple SELECT statements. Moving to a dedicated CPU instance like db.r5.large eliminates credit-based throttling and provides consistent performance.

Exam trap

The trap here is that candidates see low CPU utilization and assume no CPU issue, but T3 instances can be throttled even at moderate utilization when credits are exhausted, and the 'CPUCreditBalance' metric is the key indicator.

How to eliminate wrong answers

Option B is wrong because the 'ReadIOPS' metric is below the provisioned gp2 limit, and the queries are simple SELECTs with indexes, so a memory bottleneck causing disk spill is unlikely; increasing memory would not resolve CPU credit exhaustion. Option C is wrong because 'DatabaseConnections' is well within the max_connections limit (200), and connection pool exhaustion would show high connection counts or wait events, not CPU credit depletion. Option D is wrong because the table is already indexed on the relevant columns ('order_id', 'customer_id', 'order_date'), and the issue is CPU credit exhaustion, not missing indexes.
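The credit arithmetic shows why utilization below 50% can still drain the balance. The sketch assumes the published t3.medium figures (2 vCPUs, 24 credits earned per hour) and the rule that one CPU credit equals one vCPU running at 100% for one minute. The function name and the sample balance are hypothetical.

```python
def hours_until_credits_exhausted(balance: float, utilization: float,
                                  vcpus: int = 2,
                                  credits_per_hour: float = 24.0):
    """Return hours until the credit balance reaches zero, or None if it never does.

    `utilization` is average CPU utilization across all vCPUs, from 0.0 to 1.0.
    """
    # One credit = one vCPU at 100% for one minute, so 60 credits per vCPU-hour.
    spend_per_hour = vcpus * utilization * 60
    net_burn = spend_per_hour - credits_per_hour
    if net_burn <= 0:
        return None  # At or below baseline: credits accumulate instead of draining.
    return balance / net_burn


# Hypothetical: 72 credits banked, sustained 50% utilization during peak hours.
print(hours_until_credits_exhausted(72, 0.5))
```

Any sustained utilization above the baseline drains credits, which is why 'CPUCreditBalance' rather than 'CPUUtilization' is the metric to watch on burstable classes.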


246

MCQmedium

A company is using Amazon RDS for MySQL with Multi-AZ deployment. The application team reports increased latency during peak hours. Which AWS service should the database specialist use to identify the root cause?

A.Enable Performance Insights for the RDS instance and analyze the database load.

B.Enable AWS Config to track configuration changes to the RDS instance.

C.Use the CloudWatch Metrics Dashboard to analyze database connections.

D.Run an Amazon Inspector assessment on the RDS instance.

AnswerA

Performance Insights provides detailed database performance analysis and helps identify bottlenecks.

Why this answer

Option A is correct because Performance Insights provides database performance metrics and helps identify bottlenecks. Option B is wrong because AWS Config is for resource configuration auditing. Option C is wrong because the CloudWatch Metrics Dashboard shows aggregated metrics but does not provide detailed database performance analysis. Option D is wrong because Amazon Inspector is for security assessments.


247

Multi-Selectmedium

A company is using Amazon DynamoDB with provisioned capacity. They notice an increase in throttled write requests. The workload consists of writes to a single partition key. Which TWO actions would help reduce throttling?

Select 2 answers

A.Add a global secondary index with a different partition key.

B.Increase the provisioned write capacity units.

C.Enable DynamoDB auto scaling with adaptive capacity.

D.Use DynamoDB Accelerator (DAX) for write caching.

E.Implement write sharding by adding a suffix to the partition key.

AnswersB, C

More capacity directly reduces throttling.

Why this answer

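Options B and C are correct. Increasing the provisioned write capacity units (B) directly addresses throttling. Auto scaling with adaptive capacity (C) helps but may not fully resolve a hot partition; it is still a valid action. Option A is wrong because a global secondary index does not reduce writes to the base table. Option D is wrong because DynamoDB Accelerator (DAX) reduces read load but does not help with writes. Option E (write sharding) does distribute writes across partitions, but the answer key treats B and C as the most direct actions. Note that a single partition supports at most 1,000 WCUs, so for a workload that writes to one partition key, sharding is often the more durable fix in practice.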
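Write sharding from option E can be sketched as follows. This is an illustration of the idea only: the `#` separator, the shard count of 10, and both function names are assumptions, not a DynamoDB API.

```python
import random


def sharded_key(base_key: str, shard_count: int = 10, rng=random) -> str:
    """Pick a random shard suffix so writes spread across several partition keys."""
    return f"{base_key}#{rng.randrange(shard_count)}"


def all_shard_keys(base_key: str, shard_count: int = 10) -> list:
    """List every shard key; readers must query all of them and merge the results."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


# Hypothetical hot key: every write lands on one of device-42#0 .. device-42#9.
print(sharded_key("device-42"))
print(all_shard_keys("device-42", shard_count=3))
```

The trade-off is on the read side: fetching everything for the original key now takes one query per shard.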


248

Multi-Selectmedium

A database specialist is troubleshooting an Amazon RDS for PostgreSQL instance that has high replication lag between the primary and a read replica. Which TWO metrics should the specialist review to identify the cause? (Select TWO.)

Select 2 answers

A.WriteIOPS on the primary

B.ReadIOPS on the replica

C.DatabaseConnections on the primary

D.ReplicaLag

E.NetworkThroughput between primary and replica

AnswersB, D

High read activity on the replica can cause lag.

Why this answer

Option D is correct because 'ReplicaLag' directly measures the lag. Option B is correct because 'ReadIOPS' can indicate heavy read activity on the replica that may cause lag. Option A is wrong because 'WriteIOPS' on the primary does not directly indicate replica lag. Option C is wrong because 'DatabaseConnections' is about connections, not replication. Option E is wrong because 'NetworkThroughput' is not a direct indicator of replication lag.


249

MCQhard

Refer to the exhibit. A DBA is troubleshooting an issue where an IAM user cannot view CloudWatch metrics for an RDS DB instance. The IAM policy attached to the user is shown above. What is the MOST likely reason the user cannot view the metrics?

A.The policy does not include cloudwatch:DescribeAlarms

B.The policy does not include rds:DescribeDBInstances

C.The policy uses a Resource element of '*' which is not allowed for CloudWatch

D.The policy does not include cloudwatch:GetMetricStatistics

AnswerA

The CloudWatch console often requires DescribeAlarms to view metrics.

Why this answer

Option A is correct. The policy in the exhibit allows rds:DescribeDBInstances, cloudwatch:GetMetricStatistics, and cloudwatch:ListMetrics with Resource '*', which is sufficient to retrieve metrics through the API. The most likely remaining gap is that the CloudWatch console requires additional permissions, such as cloudwatch:DescribeAlarms, for the Metrics page to load. Option B is wrong because the policy includes rds:DescribeDBInstances. Option C is wrong because a Resource of '*' is allowed and does not restrict access. Option D is wrong because the policy includes cloudwatch:GetMetricStatistics.


250

MCQhard

A company is using Amazon ElastiCache for Redis to cache frequently accessed data. Recently, the application has been experiencing increased latency. The database specialist suspects that the cache hit ratio has decreased. Which CloudWatch metric should the specialist analyze to confirm this suspicion?

A.Monitor the 'CurrConnections' metric to see if there are too many connections.

B.Monitor the 'Evictions' metric to see if keys are being evicted.

C.Monitor 'CacheHits' and 'CacheMisses' metrics to calculate the hit ratio.

D.Monitor the 'ReplicationLag' metric to check replication delay.

AnswerC

Cache hit ratio = CacheHits / (CacheHits + CacheMisses).

Why this answer

Option C is correct because the cache hit ratio is calculated as CacheHits / (CacheHits + CacheMisses). Option A is wrong because CurrConnections shows current connections. Option B is wrong because Evictions shows the number of evicted keys. Option D is wrong because ReplicationLag shows replication delay.
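The hit ratio formula translates directly into code. The sketch below assumes the two values are sums of the CacheHits and CacheMisses metrics over the same period; the function name and sample numbers are hypothetical.

```python
def cache_hit_ratio(cache_hits: int, cache_misses: int) -> float:
    """Compute CacheHits / (CacheHits + CacheMisses), guarding against no traffic."""
    total = cache_hits + cache_misses
    return cache_hits / total if total else 0.0


# Hypothetical one-hour sums from CloudWatch.
print(cache_hit_ratio(900, 100))
```

Comparing the ratio across periods, rather than reading a single value, is what confirms that it has actually decreased.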


251

MCQeasy

A developer reports that an Amazon ElastiCache for Redis cluster's memory usage is consistently above 90%. The application uses Redis for caching and session storage. Which configuration change would MOST effectively reduce memory pressure?

A.Enable eviction with volatile-LRU policy

B.Scale up to a larger node type

C.Enable AOF persistence to free memory

D.Disable replication to reduce memory overhead

AnswerA

Eviction removes less-used keys when memory is full.

Why this answer

Option A is correct because enabling eviction with a volatile-LRU policy frees memory by removing the least recently used keys among those that have a TTL set. Option B is wrong because scaling up increases memory but costs more. Option C is wrong because persistence writes data to disk and does not free memory. Option D is wrong because disabling replication does not reduce memory usage on the primary.


252

Multi-Selecthard

A team is using Amazon DynamoDB with auto scaling enabled. They notice that some requests are returning ProvisionedThroughputExceededException errors during a sudden traffic spike. The application uses strong consistent reads. Which two actions would help mitigate the throttling without over-provisioning capacity? (Choose two.)

Select 2 answers

A.Implement DynamoDB Accelerator (DAX) to cache read results.

B.Enable DynamoDB adaptive capacity.

C.Switch to eventually consistent reads for all queries.

D.Disable auto scaling and manually set higher capacity.

E.Use DynamoDB burst capacity for the spike.

AnswersA, B

Correct. DAX reduces the number of read requests to the table, lowering the consumed read capacity.

Why this answer

Options A and B are correct. Using DynamoDB Accelerator (DAX) reduces read load on the table, and adaptive capacity helps handle uneven access patterns. Option D is wrong because disabling auto scaling and manually raising capacity over-provisions the table, which the question rules out.

Option C is wrong because eventually consistent reads may not be acceptable for the application. Option E is wrong because burst capacity is limited and not guaranteed.


253

MCQmedium

A company uses Amazon RDS for PostgreSQL with Multi-AZ deployment. The primary instance fails, and a failover occurs. After the failover, the application is still unable to connect to the database endpoint. The database administrator checks the RDS console and sees that the new primary is in 'available' state. What should the administrator do next to diagnose the connectivity issue?

A.Verify that the security group for the RDS instance allows inbound traffic from the application

B.Check if the subnet group for the RDS instance is correctly configured

C.Check the DNS resolution of the RDS endpoint from the application server

D.Restart the RDS instance to force a new connection

AnswerC

The CNAME record should be updated; stale DNS could cause connection failures.

Why this answer

Option C is correct because the DNS CNAME of the RDS endpoint should have updated to point to the new primary. If the application is using the old IP or a cached DNS entry, it may not connect. Option A is incorrect because security group rules are usually unchanged.

Option B is incorrect because the subnet group is not the issue. Option D is incorrect because the primary is already in available state.


254

Multi-Selectmedium

A company runs a self-managed Redis cluster on Amazon EC2 for caching. The cluster has one primary and two replicas, each on c5.large instances. The application experiences high latency during peak hours. CloudWatch metrics show that the primary node's CPU utilization is consistently above 80% and the network bandwidth is near the instance limit. The replicas show moderate CPU usage. The team wants to reduce latency without increasing cost significantly. Which combination of actions should the team take? (Choose two.)

Select 2 answers

A.Enable Redis Cluster Mode and distribute data across multiple shards.

B.Add more EC2 instances as additional replicas to offload reads.

C.Configure the application to read from replica nodes.

D.Upgrade the primary to a c5.2xlarge instance type.

E.Migrate to Amazon ElastiCache for Redis with Cluster Mode enabled.

AnswersA, E

Sharding spreads the load, reducing CPU and network pressure on individual nodes.

Why this answer

Options A and E are correct. Option A: Enabling Cluster Mode shards data across multiple nodes, reducing the load on each node. Option E: Migrating to Amazon ElastiCache for Redis with Cluster Mode enabled provides managed scaling and reduces the operational burden.

Option B is incorrect because it increases cost (more instances) and does not relieve the primary, which is the bottleneck. Option C is incorrect because reading from replicas helps with read scaling, but the primary remains the bottleneck for writes and CPU. Option D is incorrect because a larger instance (vertical scaling) increases cost significantly and may still hit network limits.


255

Multi-Selecteasy

Which THREE actions should be taken to troubleshoot an Amazon RDS for PostgreSQL instance that is unresponsive? (Choose 3.)

Select 3 answers

A.Reboot the DB instance immediately

B.Modify the DB instance to a larger instance class

C.Verify that the security group allows inbound traffic on the database port

D.Check the database error logs in CloudWatch Logs

E.Review CloudWatch metrics for CPU, memory, and disk I/O

AnswersC, D, E

Network connectivity issues can make an instance appear unresponsive.

Why this answer

Options C, D, and E are correct because verifying security groups, checking logs, and reviewing metrics are standard initial steps. Option A is wrong because rebooting without diagnostics may lose evidence. Option B is wrong because modifying the instance class without a known cause may not help and could be disruptive.


256

MCQmedium

A company uses Amazon RDS for PostgreSQL with logical replication to a downstream system. The replication slot grows unbounded and causes storage full issues. Which action resolves this without data loss?

A.Disable logical replication and use DMS instead

B.Increase the allocated storage size of the RDS instance

C.Monitor the replication slot and advance it using pg_replication_slot_advance

D.Delete the replication slot and recreate it

AnswerC

Advancing the slot allows WAL cleanup.

Why this answer

Option C is correct because monitoring and advancing the replication slot prevents WAL accumulation. Option A is wrong because disabling replication causes data loss. Option B is wrong because increasing storage is a temporary fix, not a resolution.

Option D is wrong because dropping the replication slot discards the WAL it retains, so the downstream system loses any changes it has not yet consumed.


257

Multi-Selectmedium

A company is using Amazon DynamoDB with autoscaling enabled. The table has a partition key of 'order_id' and a sort key of 'order_date'. The application performs both point queries and range queries. Recently, the 'ConsumedReadCapacityUnits' metric shows that the table is consistently using 100% of the provisioned capacity. Which THREE factors should the database engineer investigate to determine the cause?

Select 3 answers

A.Whether autoscaling is configured correctly to add capacity.

B.Whether the application is using Scan operations instead of Query operations.

C.Whether the partition key is evenly distributed across partitions.

D.Whether a specific 'order_id' is being accessed frequently, creating a hot key.

E.Whether a global secondary index is being used for queries.

AnswersB, C, D

Scans consume more read capacity than queries.

Why this answer

Option B is correct because Scan operations read the entire table or index before applying filters, consuming far more read capacity than Query operations, which target specific partition and sort key values. If the application is using Scans instead of Queries, it would consistently consume 100% of provisioned capacity even for small result sets, leading to throttling and high utilization.

Exam trap

AWS often tests the misconception that autoscaling misconfiguration is the primary cause of high capacity utilization, when in reality the root cause is often inefficient access patterns (Scans) or uneven data distribution (hot keys) that autoscaling cannot fix.
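To make the Scan-versus-Query cost difference concrete, here is a rough Python sketch of read capacity consumption. It assumes the standard DynamoDB rule that one read capacity unit covers a strongly consistent read of up to 4 KB and that an eventually consistent read costs half as much; the function name and the example sizes are hypothetical, and per-page rounding is ignored for simplicity.

```python
import math


def read_capacity_units(bytes_read: int, strongly_consistent: bool = False) -> float:
    """Estimate RCUs for a Query or Scan that reads `bytes_read` bytes in total."""
    units = math.ceil(bytes_read / 4096)  # one unit per 4 KB read, rounded up
    return float(units) if strongly_consistent else units / 2


# A Query that reads ten 1 KB items from one partition key.
query_cost = read_capacity_units(10 * 1024)
# A Scan that reads 1 GiB of table data before any filter is applied.
scan_cost = read_capacity_units(1024 ** 3)

print(query_cost, scan_cost)
```

Because a filter expression is applied only after the data is read, the Scan is charged for everything it reads, even if only a few items are returned.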


258

MCQmedium

A company is using Amazon RDS for MySQL with Multi-AZ deployment. The application team reports that write latency has increased significantly over the past hour. CloudWatch shows elevated 'ReadLatency' and 'WriteLatency' metrics. The DB instance is a db.r5.large with 500 GB General Purpose SSD (gp2) storage. Which action is most likely to resolve the issue?

A.Enable Multi-AZ deployment

B.Create a read replica to offload read traffic

C.Increase the allocated storage size to 1,000 GB

D.Scale up the DB instance class to db.r5.xlarge

AnswerC

Larger gp2 volume provides higher baseline IOPS, resolving I/O credit exhaustion.

Why this answer

Option C is correct because elevated latency combined with gp2 storage suggests that the I/O credits are exhausted, causing throughput to drop to baseline. Increasing storage size increases baseline IOPS. Option A is wrong because Multi-AZ is already enabled.

Option B is wrong because read replicas help read scaling but do not reduce write latency. Option D is wrong because increasing the instance size improves compute but does not address storage I/O limits.
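The effect of volume size on gp2 baseline IOPS can be checked with a short calculation. This sketch assumes the general gp2 rule of 3 IOPS per GiB, with a floor of 100 and a ceiling of 16,000; the function name is hypothetical, and RDS may behave differently where it stripes storage across volumes.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, clamped to 100..16,000."""
    return min(max(3 * size_gib, 100), 16_000)


for size in (500, 1000):
    print(f"{size} GiB -> {gp2_baseline_iops(size)} baseline IOPS")
```

Under these assumptions, doubling the volume from 500 GB to 1,000 GB doubles the baseline from 1,500 to 3,000 IOPS, which is why resizing relieves credit exhaustion.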


259

MCQhard

A database specialist is troubleshooting a performance issue on a self-managed PostgreSQL database that they plan to migrate to Amazon RDS. The database has a high number of 'idle in transaction' connections. What is the impact of these connections on the database?

A.They increase CPU usage due to constant polling.

B.They hold locks and prevent cleanup of dead tuples, leading to bloat.

C.They prevent new connections from being established.

D.They cause increased disk I/O from write-ahead logging.

AnswerB

Idle transactions keep locks and prevent autovacuum from removing dead tuples.

Why this answer

Option B is correct because idle-in-transaction connections hold locks and can cause bloat, reducing performance. Option A is wrong because CPU usage is low for idle connections. Option C is wrong because they do not prevent new connections from being established.

Option D is wrong because they do not affect disk I/O directly.


260

MCQhard

A company is using Amazon Redshift for data warehousing. The data engineering team notices that queries are taking longer than expected. The cluster has two nodes of type dc2.large. The database specialist checks the system tables and finds that many queries are using the disk for temporary storage. Which action should the specialist take to improve query performance?

A.Add distribution keys to the tables to improve data distribution.

B.Enable concurrency scaling to offload queries to additional clusters.

C.Increase the number of nodes to three to distribute the workload.

D.Upgrade the cluster to a node type with more memory, such as ra3.xlplus.

AnswerD

More memory reduces the need for disk-based operations.

Why this answer

Option D is correct because the need for temporary storage on disk indicates insufficient memory. Upgrading to a node type with more memory (such as ra3 nodes) reduces disk spills. Option C is wrong because increasing the number of nodes does not increase per-node memory.

Option A is wrong because adding distribution keys may not reduce memory usage. Option B is wrong because enabling concurrency scaling does not address memory issues.


261

Multi-Selecthard

A company is migrating its on-premises Oracle database to Amazon RDS for Oracle. The database specialist needs to monitor the migration process and ensure data consistency. Which TWO AWS services should be used together to continuously monitor the replication lag and data integrity?

Select 2 answers

A.Amazon RDS Performance Insights

B.AWS Schema Conversion Tool (AWS SCT)

C.AWS Database Migration Service (AWS DMS)

D.AWS Glue

E.Amazon CloudWatch

AnswersC, E

DMS provides replication tasks and publishes latency metrics.

Why this answer

AWS DMS is the correct service because it is specifically designed for database migrations and provides built-in monitoring of replication lag via its CDC latency metrics (CDCLatencySource and CDCLatencyTarget). Amazon CloudWatch is the correct complementary service because it collects and visualizes DMS metrics, including replication lag and task status, and can trigger alarms if data integrity or latency thresholds are breached.

Exam trap

The trap here is that candidates may confuse AWS DMS with AWS Glue or SCT, thinking those services also handle continuous replication monitoring, but only DMS provides CDC metrics that CloudWatch can monitor for lag and integrity.


262

MCQmedium

A development team is using Amazon RDS for MySQL with read replicas to offload reporting queries. They notice that the read replica is consistently lagging behind the primary by several seconds. The primary handles 5000 writes per second. Which action would most likely reduce replica lag?

A.Increase the 'max_connections' parameter on the primary instance.

B.Increase the instance size of the read replica.

C.Disable binary logging on the primary instance to reduce I/O.

D.Convert the primary instance to a Multi-AZ deployment.

AnswerB

A larger replica can apply changes more quickly, reducing lag.

Why this answer

Option B is correct because increasing the replica instance size can improve its ability to apply changes faster. Option A is incorrect because the replica lag is due to the replica not keeping up, not the primary, so raising max_connections on the primary does not help. Option C is incorrect because disabling binary logging would break replication.

Option D is incorrect because Multi-AZ is for high availability, not for reducing replica lag.


263

MCQmedium

A database specialist is troubleshooting an Amazon RDS for SQL Server instance that is running out of storage. The instance has 500 GB of provisioned storage and is using General Purpose SSD (gp2). The specialist wants to set up an alarm to notify when free storage space drops below 50 GB. Which CloudWatch metric and threshold should be used?

A.Monitor the 'FreeStorageSpace' metric in percent and set a threshold of 10.

B.Monitor the 'DiskSpaceUtilization' metric and set a threshold of 90.

C.Monitor the 'FreeStorageSpace' metric in bytes and set a threshold of 53687091200.

D.Monitor the 'FreeStorageSpace' metric in gigabytes and set a threshold of 50.

AnswerC

FreeStorageSpace is in bytes; 50 GB = 53687091200 bytes.

Why this answer

Option C is correct because the 'FreeStorageSpace' metric is reported in bytes. 50 GB = 50 * 1024^3 bytes = 53687091200 bytes. Option A is wrong because 'FreeStorageSpace' is in bytes, not percent. Option B is wrong because 'FreeStorageSpace', not 'DiskSpaceUtilization', is the correct metric.

Option D is wrong because 'FreeStorageSpace' is in bytes, not gigabytes.
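The byte conversion is easy to verify in Python. The sketch below also shows how the threshold might sit in a set of alarm settings; the keys follow the CloudWatch PutMetricAlarm parameter names, while the alarm name, instance identifier, period, and statistic are hypothetical choices for illustration.

```python
GIB = 1024 ** 3  # bytes per GiB


def gib_to_bytes(gib: int) -> int:
    """Convert a size in GiB to bytes."""
    return gib * GIB


# Sketch of alarm settings; the alarm name and instance identifier are hypothetical.
alarm = {
    "AlarmName": "rds-low-free-storage",
    "Namespace": "AWS/RDS",
    "MetricName": "FreeStorageSpace",
    "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": "example-sqlserver"}],
    "Statistic": "Average",
    "Period": 300,
    "EvaluationPeriods": 1,
    "ComparisonOperator": "LessThanThreshold",
    "Threshold": gib_to_bytes(50),
}

print(alarm["Threshold"])  # 53687091200
```

The dictionary is only built here, not sent to AWS; passing it to an SDK call is left to the reader.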


264

Multi-Selectmedium

A company uses Amazon DynamoDB with global tables. During a regional outage, the application fails over to the secondary region. After recovery, the DBA notices that the data in the secondary region is not fully consistent with the primary. Which THREE steps should the DBA take to diagnose the issue? (Choose THREE.)

Select 3 answers

A.Verify that DynamoDB Streams is enabled on the table.

B.Disable and re-enable global tables to force resync.

C.Increase the write capacity on the secondary table.

D.Review the DynamoDB Streams error logs in CloudWatch Logs.

E.Check the ReplicationLatency metric in CloudWatch.

AnswersA, D, E

Streams are required for global tables to replicate changes.

Why this answer

Options A, D, and E are correct. Checking ReplicationLatency (E) identifies replication lag. Reviewing CloudWatch Logs for error logs (D) can reveal replication errors.

Checking DynamoDB Streams (A) ensures changes are being captured. Option B is wrong because disabling global tables is not a diagnostic step. Option C is wrong because increasing write capacity does not fix consistency issues.


265

MCQmedium

A company uses Amazon DynamoDB with on-demand capacity. Users report increased latency during peak hours. The application uses the DynamoDB API. Which monitoring metric should be examined first to identify throttling issues?

A.ThrottledRequests

B.SuccessfulRequestLatency

C.ReadThrottleEvents

D.ConsumedWriteCapacityUnits

AnswerA

ThrottledRequests directly indicates requests that were throttled.

Why this answer

Option A is correct because ThrottledRequests indicates requests that were rejected due to exceeding capacity. Option D is wrong because ConsumedWriteCapacityUnits might be lower than expected if throttling occurs, but it does not directly show throttled requests. Option B is wrong because SuccessfulRequestLatency is a latency metric, not a throttling indicator.

Option C is wrong because ReadThrottleEvents covers only throttled reads, so it is less direct than ThrottledRequests.


266

MCQeasy

A database administrator is troubleshooting a slow-running query on an Amazon RDS for MySQL DB instance. Which CloudWatch metric should be monitored to determine if the query is experiencing storage I/O bottlenecks?

A.ReadLatency and WriteLatency

B.CPUUtilization

C.NetworkThroughput

D.DatabaseConnections

AnswerA

High latency indicates I/O bottlenecks.

Why this answer

Option A is correct because ReadLatency and WriteLatency indicate I/O performance. Option B is wrong because CPUUtilization measures processor usage. Option C is wrong because NetworkThroughput measures network traffic.

Option D is wrong because DatabaseConnections measures connections.


267

MCQmedium

A database administrator notices that an Amazon RDS for MySQL instance's CPU utilization is consistently above 80% during peak hours. The DB instance is a db.r5.large with 16 GB memory and 500 GB gp2 storage. The application is a read-intensive web application. Which action is MOST effective to reduce CPU load without significant cost increase?

A.Increase the instance size to db.r5.xlarge

B.Create a read replica and redirect read traffic to it

C.Enable Performance Insights to identify slow queries and optimize them

D.Increase the allocated storage to 1000 GB to improve I/O performance

AnswerB

Offloads read queries, reducing CPU on the primary instance.

Why this answer

Option B is correct because creating a read replica offloads read traffic from the primary instance, reducing CPU utilization without requiring a larger instance or more storage. Option A increases cost significantly. Option C helps identify slow queries but does not by itself reduce CPU load.

Option D increases storage but does not directly reduce CPU.


268

MCQmedium

A company uses Amazon CloudWatch to monitor an RDS for Oracle instance. They want to receive an alert when the database connection count exceeds 90% of the maximum connections. Which CloudWatch metric should be used to create the alarm?

A.ActiveTransactions

B.DatabaseConnections

C.ReadIOPS

D.NetworkThroughput

AnswerB

Correct. This metric directly reflects the number of connections.

Why this answer

Option B is correct because DatabaseConnections tracks the number of client connections. Option A is wrong because ActiveTransactions measures transactions. Option C is wrong because ReadIOPS measures disk I/O.

Option D is wrong because NetworkThroughput measures network traffic.
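Because DatabaseConnections reports a count rather than a percentage, the alarm threshold has to be derived from the configured maximum. A minimal sketch, assuming a hypothetical maximum of 1,000 connections (the question does not state the actual limit):

```python
def connection_alarm_threshold(max_connections: int, percent: int = 90) -> int:
    """Return the connection count that corresponds to `percent` of the maximum."""
    return max_connections * percent // 100  # integer math avoids float rounding


# Hypothetical limit; read the real value from the instance's configuration.
threshold = connection_alarm_threshold(max_connections=1_000)
print(threshold)  # 900
```

The alarm would then fire when DatabaseConnections is greater than this value.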


269

MCQhard

A database specialist is troubleshooting an Amazon RDS for PostgreSQL instance that is experiencing intermittent connection timeouts. The application logs show errors like 'FATAL: remaining connection slots are reserved for non-replication superuser connections'. The max_connections parameter is set to 100. What should the specialist do to resolve this issue?

A.Modify the 'max_connections' parameter to a higher value and reboot the instance.

B.Increase the 'max_replication_slots' parameter to allow more replication connections.

C.Increase the value of the 'superuser_reserved_connections' parameter.

D.Enable RDS Proxy to manage database connections efficiently.

AnswerA

Increasing max_connections allows more concurrent connections and resolves the error.

Why this answer

Option A is correct because the error indicates that the available connection slots are exhausted. Increasing max_connections allows more concurrent connections. Option D is wrong because the error is about connection slots, not idle sessions.

Option C is wrong because reserving more slots for superusers does not address the root cause. Option B is wrong because the error is not due to replication slots.
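The error appears once ordinary connections have used every slot that is not reserved for superusers. The arithmetic can be sketched as follows, assuming the stock PostgreSQL default of 3 for superuser_reserved_connections; RDS and newer PostgreSQL versions may reserve additional slots, so treat the result as an upper bound. The function name is hypothetical.

```python
def regular_connection_slots(max_connections: int, superuser_reserved: int = 3) -> int:
    """Slots available to non-superuser connections before the FATAL error appears."""
    return max(max_connections - superuser_reserved, 0)


print(regular_connection_slots(100))  # 97
```

With max_connections set to 100, the application can hold at most 97 connections under these assumptions, which also shows why raising superuser_reserved_connections would make the problem worse.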


270

MCQeasy

A company runs a PostgreSQL database on an Amazon RDS DB instance (db.t3.medium) with 100 GB of General Purpose SSD (gp2) storage. The database is used by a web application that experiences occasional slowdowns. CloudWatch metrics show that the BurstBalance metric for the storage volume drops to 0% during peak usage and then recovers. The average IOPS during peak is 600, and the baseline IOPS for the volume is 300. The team needs a cost-effective solution to eliminate the performance issues. What should the team do?

A.Upgrade the DB instance to db.t3.large.

B.Increase the gp2 volume size to 200 GB.

C.Migrate the storage to gp3 with 3000 baseline IOPS.

D.Enable Performance Insights to monitor database load.

AnswerC

gp3 provides a consistent baseline of 3,000 IOPS without burst credits, eliminating the burst balance issue and providing headroom.

Why this answer

Option C is correct because migrating to gp3 provides a consistent baseline of 3,000 IOPS without burst credits, eliminating the burst balance problem. Option A is incorrect because increasing the instance size does not affect storage performance. Option B is incorrect because increasing the gp2 volume to 200 GB raises the baseline to 600 IOPS, which only matches the average peak load; the volume still relies on burst credits whenever the workload exceeds the baseline.

Option D is incorrect because enabling Performance Insights does not improve performance.
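The BurstBalance behavior in this scenario can be estimated with a short calculation. This sketch assumes the standard gp2 credit bucket of 5.4 million I/O credits that starts full and refills at the baseline rate; the function name is hypothetical, and real workloads fluctuate rather than holding a constant rate.

```python
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full gp2 burst bucket


def seconds_until_burst_exhausted(baseline_iops: int, workload_iops: int,
                                  credits: int = GP2_CREDIT_BUCKET):
    """Time for a steady workload to drain the bucket, or None if it never drains."""
    drain_rate = workload_iops - baseline_iops  # net credits spent per second
    if drain_rate <= 0:
        return None  # workload is at or below baseline, so the bucket refills
    return credits / drain_rate


seconds = seconds_until_burst_exhausted(baseline_iops=300, workload_iops=600)
print(f"{seconds / 3600:.1f} hours")  # 5.0 hours
```

At a sustained 600 IOPS against a 300 IOPS baseline, a full bucket lasts about five hours, after which the volume is held to its baseline until credits recover.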


271

MCQhard

A database engineer is troubleshooting a production Amazon Aurora MySQL DB cluster. The application is experiencing high latency on write operations. The engineer checks the Amazon CloudWatch metrics and sees that the 'AuroraBinlogReplicaLag' metric is high. What is the most likely cause of the write latency?

A.The DB cluster has insufficient storage capacity for the binlog files

B.The DB cluster has a high CPU utilization that is causing replication lag

C.The binlog replication to a downstream MySQL instance is falling behind

D.A recent failover event caused the binlog to be replayed from the last checkpoint

AnswerC

High binlog lag means the downstream replica cannot keep up, causing write delays if the source waits for acknowledgment.

Why this answer

Option C is correct because the 'AuroraBinlogReplicaLag' metric specifically measures the lag between the Aurora MySQL cluster and an external MySQL instance that is replicating from Aurora using binary log (binlog) replication. When this lag is high, it indicates that the downstream MySQL instance is falling behind in applying binlog events, which can cause write operations on the Aurora cluster to stall or slow down due to the synchronous nature of binlog generation and the need to retain binlogs until they are consumed by the replica.

Exam trap

The trap here is that candidates often confuse 'AuroraBinlogReplicaLag' with Aurora's internal replication lag (e.g., ReplicaLag for Aurora Replicas) or with general performance metrics like CPU or storage, leading them to select incorrect options that do not address the specific binlog replication context.

How to eliminate wrong answers

Option A is wrong because insufficient storage capacity for binlog files would cause binlog file retention issues or storage full errors, but it does not directly cause high binlog replication lag; the 'AuroraBinlogReplicaLag' metric is about replication delay, not storage capacity. Option B is wrong because high CPU utilization on the DB cluster can cause general performance degradation, but the 'AuroraBinlogReplicaLag' metric is specific to the lag of binlog replication to an external MySQL instance, not to internal replication or CPU-related delays. Option D is wrong because a failover event would cause a brief interruption and replay of binlog from the last checkpoint, but this would not result in a persistently high 'AuroraBinlogReplicaLag' metric; the lag would typically be transient and recover quickly.


272

Multi-Selecteasy

Which TWO tools can be used to monitor query performance in Amazon Aurora MySQL? (Choose 2.)

Select 2 answers

A.Amazon RDS Performance Insights

B.AWS Config

C.Amazon RDS Enhanced Monitoring

D.VPC Flow Logs

E.AWS CloudTrail

AnswersA, C

Shows database load and SQL queries.

Why this answer

A and C are correct. Performance Insights provides database performance analysis. Enhanced Monitoring provides OS-level metrics.

B is wrong because AWS Config tracks resource configuration. D is wrong because VPC Flow Logs show network traffic. E is wrong because CloudTrail logs API calls.


273

MCQmedium

A company is running a production Amazon RDS for PostgreSQL database. The database has experienced a sudden spike in CPU utilization, causing application timeouts. The monitoring team needs to identify the root cause. Which AWS service or feature should be used to analyze the database load and identify the specific queries causing the high CPU?

A.Enhanced Monitoring

B.Amazon CloudWatch Logs

C.Amazon RDS Performance Insights

D.AWS Trusted Advisor

AnswerC

Performance Insights provides database load analysis and helps identify the specific SQL queries that are causing high CPU.

Why this answer

Option C is correct because Performance Insights is a database performance tuning and monitoring feature that helps you quickly assess the load on your database and determine when and where to take action. Option A is wrong because Enhanced Monitoring provides OS-level metrics, not database query details. Option B is wrong because CloudWatch Logs captures database logs, not real-time performance data.

Option D is wrong because AWS Trusted Advisor provides best practice checks but not query-level analysis.


274

Multi-Selectmedium

A company is using Amazon RDS for PostgreSQL and needs to monitor the database for performance issues. Which TWO metrics in Amazon CloudWatch are most useful for identifying I/O bottlenecks?

Select 2 answers

A.ReadIOPS

B.FreeStorageSpace

C.DiskQueueDepth

D.DatabaseConnections

E.CPUUtilization

AnswersA, C

ReadIOPS and WriteIOPS show I/O operations.

Why this answer

Option A is correct because ReadIOPS and WriteIOPS show the actual I/O operations. Option C is correct because DiskQueueDepth indicates pending I/O requests, which suggests contention. Option E is wrong because CPU utilization is compute, not I/O.

Option B is wrong because FreeStorageSpace is capacity, not performance. Option D is wrong because DatabaseConnections is about connections, not I/O.


275

MCQmedium

A company runs a document management system using Amazon S3 and Amazon DynamoDB. The application writes document metadata to DynamoDB and stores the document in S3. Recently, users report that occasionally documents are saved in S3 but the corresponding metadata is missing in DynamoDB. The application writes to DynamoDB first, then to S3. If the S3 upload fails, the application retries. The database specialist suspects a transaction consistency issue. The application is running on multiple EC2 instances behind an Application Load Balancer. What should the specialist recommend to ensure that both the metadata and document are stored consistently?

A.Use DynamoDB transactions to write metadata and initiate S3 upload within the same transaction.

B.Use DynamoDB Streams to trigger an AWS Lambda function that performs the S3 upload.

C.Reverse the order: upload to S3 first, then write to DynamoDB.

D.Implement an idempotency key in the application to retry the entire operation.

AnswerA

Transactions provide atomicity across multiple items.

Why this answer

Option A is correct because using DynamoDB transactions ensures atomicity; if the S3 upload fails, the transaction can be rolled back. Option B is wrong because DynamoDB Streams with Lambda introduces eventual consistency and possible duplicates. Option C is wrong because writing to S3 first still leaves inconsistency if the DynamoDB write fails.

Option D is wrong because idempotency tokens help with duplicates but not atomicity.


276

MCQeasy

A developer is troubleshooting an application that uses Amazon DynamoDB. The application is experiencing throttled requests (ProvisionedThroughputExceededException). Which CloudWatch metric should be monitored to troubleshoot this issue?

A.ThrottledRequests

B.SuccessfulRequestLatency

C.UserErrors

D.ConsumedWriteCapacityUnits

AnswerD

Comparing consumed vs provisioned capacity identifies throttling.

Why this answer

Option D is correct because ConsumedWriteCapacityUnits shows actual usage; comparing it with ProvisionedWriteCapacityUnits helps identify throttling. Option C is wrong because UserErrors counts client-side errors. Option A is wrong because ThrottledRequests confirms that throttling occurred but does not show how consumption compares with provisioned capacity.

Option B is wrong because SuccessfulRequestLatency is about latency, not throttling.


277

Multi-Selectmedium

Which TWO metrics should be monitored together to detect a memory leak in an Amazon RDS for Oracle DB instance? (Choose TWO.)

Select 2 answers

A.FreeableMemory

B.SwapUsage

C.ReadIOPS

D.DatabaseConnections

E.NetworkThroughput

AnswersA, B

Declining freeable memory may indicate a memory leak.

Why this answer

Options A and B are correct because FreeableMemory shows available memory, and SwapUsage indicates swapping, which occurs when memory is exhausted. Option C (ReadIOPS) is I/O-related. Option D (DatabaseConnections) is connection-related.

Option E (NetworkThroughput) is not memory-related.


278

MCQhard

A data analytics company runs Amazon Redshift clusters. A user reports that a complex query is taking much longer than expected. The DBA uses the STL_QUERY view to check the query execution. Which column in STL_QUERY should the DBA examine to identify if the query is waiting for resources?

A.aborted

B.query

C.starttime

D.service_class

AnswerD

Service class indicates the WLM queue; a high queue time suggests waiting for resources.

Why this answer

Option D is correct because the 'service_class' column indicates the workload management (WLM) queue, which can show if the query is queued. Option B is wrong because 'query' is just the query ID. Option C is wrong because 'starttime' shows when the query started.

Option A is wrong because 'aborted' indicates if the query was canceled, not waiting.


279

MCQeasy

A database administrator notices that an Amazon RDS for MySQL DB instance is using more storage than expected. Which metric should be monitored to troubleshoot storage usage?

A.FreeStorageSpace

B.DatabaseConnections

C.ReadIOPS

D.NetworkThroughput

AnswerA

FreeStorageSpace shows remaining storage, helping identify usage trends.

Why this answer

Option A is correct because FreeStorageSpace indicates remaining capacity. Option B is wrong because DatabaseConnections is about connections. Option C is wrong because ReadIOPS measures I/O operations.

Option D is wrong because NetworkThroughput measures network traffic.


280

Multi-Selecthard

A database specialist is troubleshooting a slow-running query on an Amazon Aurora MySQL DB cluster. The query performs a large table scan. Which THREE actions would likely improve query performance?

Select 3 answers

A.Increase the size of the DB instance to provide more memory and CPU.

B.Change the transaction isolation level to SERIALIZABLE.

C.Enable the query cache feature to cache the results of the query.

D.Enable Aurora Parallel Query to parallelize the table scan.

E.Create an index on columns used in WHERE and JOIN clauses.

AnswersA, C, E

More resources can speed up query execution.

Why this answer

Creating appropriate indexes can speed up queries by avoiding table scans. Increasing the instance size provides more memory and CPU for query execution. Enabling query caching can store results of repeated queries.

Changing isolation level and enabling parallel query may not help in all cases and may have side effects.

Practice this question →

281

MCQmedium

A company uses Amazon DynamoDB with a table that has a partition key of 'user_id' (string) and sort key of 'timestamp' (number). The application queries for recent items for a specific user using the query API with KeyConditionExpression. The query returns items in descending order. Occasionally, the query returns items that are not the most recent. What is the most likely cause?

A.The query is using eventually consistent reads, which may not reflect the latest writes.

B.The query is not using the ScanIndexForward parameter set to false.

C.The query results are paginated and the application is not iterating through all pages.

D.The query is using a global secondary index (GSI) that has a different sort key.

AnswerA

Eventually consistent reads may return stale data.

Why this answer

DynamoDB stores items with the same partition key in sorted order by sort key, and a query returns them in that order. Option A is correct because an eventually consistent read may not yet reflect the most recent writes, so the newest items can be missing from the result.

Option B is wrong because the query already returns items in descending order, which means ScanIndexForward is set to false. Option C is wrong because in a descending query the most recent items are on the first page, so pagination is not the cause. Option D is wrong because a GSI does not affect queries against the base table.
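
As a rough illustration of the two query settings discussed here, the sketch below builds the keyword arguments for a DynamoDB Query that returns the newest items first and requests a strongly consistent read. The table name `UserEvents`, the user ID, and the helper name `build_recent_items_query` are hypothetical; the parameter names follow the DynamoDB Query API.

```python
from typing import Any, Dict


def build_recent_items_query(table_name: str, user_id: str, limit: int = 10) -> Dict[str, Any]:
    """Build keyword arguments for a DynamoDB Query on user_id, newest first."""
    return {
        "TableName": table_name,
        "KeyConditionExpression": "user_id = :uid",
        "ExpressionAttributeValues": {":uid": {"S": user_id}},
        "ScanIndexForward": False,  # descending sort-key order (largest timestamp first)
        "ConsistentRead": True,     # strongly consistent read instead of the default
        "Limit": limit,
    }


# Hypothetical table and user, for illustration only.
params = build_recent_items_query("UserEvents", "user-123", limit=5)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("dynamodb").query(**params)
```

Strongly consistent reads consume twice the read capacity of eventually consistent reads and are not available on global secondary indexes, so this setting is a trade-off rather than a default.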

Practice this question →

282

MCQeasy

A startup is using Amazon ElastiCache for Redis to cache session data. They deployed a single Redis node (cache.t3.micro) in us-west-2. The application reports high latency when reading session data. CloudWatch metrics show CPUUtilization at 90% and Evictions at 100 per minute. The cache hit ratio is 80%. The database specialist suspects the node is overloaded. What should the specialist do to improve performance?

A.Scale up to a larger node type, such as cache.m5.large.

B.Add a read replica to offload read traffic.

C.Enable cluster mode and add more shards.

D.Decrease the TTL for session keys to reduce memory usage.

AnswerA

More resources reduce CPU and evictions.

Why this answer

Option A is correct because a larger node type provides more CPU and memory, which reduces evictions and latency. Option B is wrong because a read replica helps with read scaling, but the overloaded node is the primary and would gain no CPU or memory. Option C is wrong because cluster mode requires multiple shards and adds complexity.

Option D is wrong because reducing the TTL might increase cache misses.

Practice this question →

283

MCQeasy

A company is running a MongoDB database on Amazon EC2. The database is experiencing high disk I/O latency. Which AWS service can be used to monitor the disk I/O metrics at the instance level?

A.Amazon RDS

B.Amazon CloudWatch

C.Amazon DynamoDB

D.Amazon S3

AnswerB

CloudWatch provides metrics like DiskReadBytes, DiskWriteBytes.

Why this answer

Option B is correct because CloudWatch provides disk I/O metrics for EC2 instances. Option A is wrong because RDS is a managed database service, not a monitoring service. Option C is wrong because DynamoDB is a NoSQL database.

Option D is wrong because S3 is object storage.

Practice this question →

284

Multi-Selectmedium

Which TWO actions can help reduce Amazon RDS for MySQL replication lag between a primary instance and a read replica? (Choose two.)

Select 2 answers

A.Increase the allocated storage for the read replica.

B.Reduce the number of write-heavy DML statements on the primary.

C.Enable Multi-AZ on the primary instance.

D.Increase the instance size of the read replica.

E.Disable binary logging on the primary instance.

AnswersB, D

Fewer changes to replicate means less lag.

Why this answer

Options B and D are correct. Reducing write-heavy DML on the primary reduces the volume of changes to replicate. Increasing the replica instance size gives it more resources to apply changes.

Option A is incorrect because increasing the replica's allocated storage does not directly reduce lag. Option C is incorrect because enabling Multi-AZ on the primary does not reduce lag. Option E is incorrect because binary logging is required for replication.

Practice this question →

285

MCQmedium

A company runs a critical application on Amazon RDS for PostgreSQL with a Multi-AZ deployment. The application experiences intermittent connection timeouts and slow query performance. The CloudWatch metrics show that the 'ReadLatency' and 'WriteLatency' metrics are elevated during peak hours. The 'CPUUtilization' is consistently below 30%, and 'DatabaseConnections' is within limits. The 'BurstBalance' for the gp2 storage is frequently dropping to 0%. The DB instance is a db.r5.large with 300 GB of gp2 storage. The company wants to resolve the latency issues without significant cost increase. Which solution should the company implement?

A.Add a read replica to offload read traffic.

B.Enable Performance Insights to identify the root cause.

C.Switch the storage type to io1 with 3000 provisioned IOPS.

D.Increase the allocated storage to 600 GB to increase baseline IOPS.

AnswerD

Larger gp2 volumes have higher baseline IOPS, reducing burst credit depletion.

Why this answer

Option D is correct: increasing storage to 600 GB raises the gp2 baseline from 900 to 1800 IOPS (3 IOPS per GB), reducing reliance on burst credits. This is cost-effective compared with switching to io1. Option C is wrong: switching to io1 with 3000 provisioned IOPS would be more expensive.

Option A is wrong: adding a read replica does not help with write latency. Option B is wrong: enabling Performance Insights only helps diagnose the problem, not resolve it.
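
The 900 and 1800 IOPS figures follow from the gp2 baseline rule of 3 IOPS per GiB, with a floor of 100 and a cap of 16,000 per volume. A minimal sketch of that arithmetic is shown below; the function name is illustrative, and the sketch ignores any volume striping RDS may apply at larger storage sizes.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS for a single gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000."""
    return max(100, min(16_000, 3 * size_gib))


for size in (300, 600):
    # Compare the current 300 GB volume with the proposed 600 GB volume.
    print(f"{size} GiB gp2 -> {gp2_baseline_iops(size)} baseline IOPS")
```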

Practice this question →

286

MCQhard

A company runs a critical application on Amazon RDS for PostgreSQL. The application team reports that the database occasionally becomes unresponsive for a few seconds. CloudWatch metrics show 'CPUSurplusCreditsCharged' and 'CPUSurplusCredits' are not 0. The instance is a db.t3.medium. What is the likely cause and how should it be fixed?

A.The instance is out of CPU credits and is being throttled; switch to a larger or non-burstable instance

B.Enable Enhanced Monitoring to diagnose the issue

C.Increase the allocated storage to improve I/O

D.The instance is experiencing a failover; enable Multi-AZ

AnswerA

Burstable instances rely on CPU credits; when exhausted, performance is limited, causing unresponsiveness.

Why this answer

Option A is correct because T3 instances are burstable and rely on CPU credits. When the earned credits are exhausted, the instance can spend surplus credits, which incur charges. If the workload stays consistently high, the instance may run out of credits and be throttled, causing unresponsiveness.

The fix is to enable T3 unlimited mode (which incurs charges) or to move to a non-burstable instance class (e.g., m5). Option B is wrong because enabling Enhanced Monitoring does not fix credit exhaustion. Option C is wrong because increasing the allocated storage does not affect CPU credits.

Option D is wrong because Multi-AZ does not affect CPU credits.
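
To make the credit mechanics concrete, the sketch below estimates how long a credit balance lasts under sustained load. It assumes the published EC2 T3 figures for a medium size (2 vCPUs, 24 credits earned per hour, a maximum balance of 576) and that one credit equals one vCPU at 100% for one minute; whether a given db.t3.medium matches these numbers exactly is an assumption, and the function names are illustrative.

```python
from typing import Optional


def net_credits_per_hour(utilization: float, vcpus: int = 2, earn_rate: float = 24.0) -> float:
    """Credits earned minus credits spent per hour at a given average CPU utilization (0..1)."""
    spend_rate = utilization * vcpus * 60  # one credit = one vCPU-minute at 100%
    return earn_rate - spend_rate


def hours_until_empty(balance: float, utilization: float,
                      vcpus: int = 2, earn_rate: float = 24.0) -> Optional[float]:
    """Hours until the balance reaches zero, or None if the balance is not shrinking."""
    net = net_credits_per_hour(utilization, vcpus, earn_rate)
    if net >= 0:
        return None
    return balance / -net


# Assumed starting point: a full balance of 576 credits and 50% sustained CPU.
print(hours_until_empty(576, 0.5))
```

Under these assumptions, any sustained utilization above the 20% baseline drains the balance, which is why a consistently busy workload is a poor fit for a burstable class.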

Practice this question →

287

Multi-Selecthard

A company's Amazon Aurora MySQL DB cluster is experiencing a failover event. Which THREE metrics in CloudWatch should be examined to understand the cause of the failover?

Select 3 answers

A.ACUUtilization

B.ReadLatency

C.BinLogDiskUsage

D.DatabaseConnections

E.FailoverCount

AnswersA, D, E

High ACU utilization can trigger failover.

Why this answer

Option A is correct because ACUUtilization can show resource exhaustion that may trigger a failover. Option D is correct because DatabaseConnections can indicate whether connections were dropped or maxed out. Option E is correct because FailoverCount shows how many failovers occurred.

Option B is wrong because ReadLatency is a symptom, not a direct cause of failover. Option C is wrong because BinLogDiskUsage tracks binary log storage, not the cause of a failover.

Practice this question →

288

MCQhard

A company uses Amazon RDS for SQL Server with Multi-AZ deployment. During a failover test, the application experienced a longer downtime than expected. Which monitoring metric should be reviewed to understand the failover duration?

A.FailoverTime

B.WriteLatency

C.DatabaseConnections

D.ReplicaLag

AnswerA

FailoverTime is a CloudWatch metric specific to Multi-AZ failover duration.

Why this answer

Option A is correct because the FailoverTime metric in CloudWatch measures the time taken for a Multi-AZ failover. Option B is wrong because WriteLatency measures write latency, not failover duration. Option C is wrong because DatabaseConnections measures the number of connections, not failover time.

Option D is wrong because ReplicaLag applies to read replicas, not Multi-AZ.

Practice this question →

289

Multi-Selecteasy

A database administrator is monitoring an Amazon RDS for PostgreSQL DB instance. The administrator notices that the DB instance is using more memory than expected. Which TWO metrics in Amazon CloudWatch can help diagnose memory usage?

Select 2 answers

A.NetworkReceiveThroughput

B.FreeableMemory

C.ReadIOPS

D.DatabaseConnections

E.SwapUsage

AnswersB, E

FreeableMemory shows the amount of available RAM, and SwapUsage shows how much swap space is in use.

Why this answer

Option B is correct because FreeableMemory shows available memory. Option E is correct because SwapUsage indicates swapping, which occurs when memory is low. Option A is wrong because NetworkReceiveThroughput is a network metric.

Option C is wrong because ReadIOPS is an I/O metric. Option D is wrong because DatabaseConnections shows connections, not memory.
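
As an illustration of how these two metrics might be pulled programmatically, the sketch below builds request parameters for the CloudWatch GetMetricStatistics API using the `AWS/RDS` namespace and the `DBInstanceIdentifier` dimension. The instance identifier `my-postgres-db`, the helper name, and the three-hour window are hypothetical choices.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, List


def build_memory_metric_requests(db_instance_id: str, hours: int = 3) -> List[Dict[str, Any]]:
    """Build one GetMetricStatistics request per memory-related RDS metric."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    return [
        {
            "Namespace": "AWS/RDS",
            "MetricName": name,
            "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
            "StartTime": start,
            "EndTime": end,
            "Period": 300,              # 5-minute datapoints
            "Statistics": ["Average"],
        }
        for name in ("FreeableMemory", "SwapUsage")
    ]


# Hypothetical instance identifier, for illustration only.
requests = build_memory_metric_requests("my-postgres-db")
# With boto3 installed and credentials configured, each request could be sent as:
#   boto3.client("cloudwatch").get_metric_statistics(**request)
```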

Practice this question →

290

MCQmedium

A company uses Amazon RDS for MySQL with Multi-AZ deployment. The database experiences intermittent write latency spikes. CloudWatch shows elevated 'WriteLatency' and 'WriteIOPS' but normal 'CPUUtilization'. Which is the MOST likely cause?

A.A parameter group change was applied without rebooting

B.The instance is exceeding the provisioned IOPS burst balance

C.A read replica is being used for write operations

D.Multi-AZ replication is causing synchronous writes to the standby

AnswerB

When EBS burst balance depletes, write latency spikes occur even with low CPU.

Why this answer

Option B is correct because write latency spikes with high IOPS but low CPU often indicate a storage performance limit, such as a depleted EBS burst balance. Option A is wrong because a parameter group change that requires a reboot would affect all operations, not cause intermittent spikes. Option C is wrong because a read replica cannot accept writes, and replica lag affects reads, not writes.

Option D is wrong because synchronous Multi-AZ replication adds a small, steady overhead to every write rather than intermittent spikes.

Practice this question →

291

MCQeasy

A database administrator notices that an Amazon RDS for Oracle DB instance's CPU utilization is consistently above 90% during peak hours. The application is read-heavy. Which action can reduce CPU load?

A.Disable Multi-AZ to free up resources

B.Increase the allocated storage

C.Enable Performance Insights to optimize queries

D.Create a read replica and direct read traffic to it

AnswerD

Offloading reads to a read replica reduces CPU load on the primary instance.

Why this answer

Option D is correct because a read replica offloads read traffic from the primary instance, reducing its CPU usage. Option A is wrong because disabling Multi-AZ reduces availability without reducing CPU load. Option B is wrong because increasing the allocated storage does not add CPU capacity.

Option C is wrong because Performance Insights helps identify expensive queries, but enabling it does not by itself reduce CPU load.

Practice this question →

292

MCQeasy

An administrator notices that the CloudWatch metric 'ReadLatency' for an Amazon RDS for SQL Server instance has increased significantly. Which of the following is the most likely cause?

A.The DB instance is experiencing high CPU utilization.

B.The DB instance is running out of memory.

C.The DB instance is using a burstable instance class that has exhausted its credits.

D.The DB instance does not have enough provisioned IOPS.

AnswerD

Insufficient IOPS can cause read operations to queue, increasing latency.

Why this answer

High ReadLatency indicates that read operations are taking longer. This can be due to high I/O wait, which can be caused by insufficient I/O capacity (provisioned IOPS). Option D is correct: insufficient IOPS causes read requests to queue, which raises latency.

Option A is incorrect: high CPU may slow query processing but does not directly raise storage read latency. Option B is incorrect: memory pressure can cause swapping, which affects latency but not as directly as IOPS. Option C is incorrect: exhausted CPU credits on a burstable instance class throttle CPU, not storage I/O.

Practice this question →

293

MCQmedium

A developer is troubleshooting an application that uses Amazon DynamoDB. The application sometimes receives ProvisionedThroughputExceededException errors. The table has on-demand capacity mode. The errors occur in short bursts. What is the most likely cause?

A.The table has a low read/write capacity mode limit that needs to be increased.

B.The global secondary index (GSI) has a different throughput limit.

C.The table has reached the maximum provisioned throughput.

D.The request rate exceeds the partition's throughput capacity in a short burst.

AnswerD

On-demand can throttle if a single partition's throughput is exceeded.

Why this answer

On-demand capacity mode instantly accommodates up to double the table's previous peak traffic. However, if traffic spikes suddenly, DynamoDB might throttle. Option D is correct because each partition has a maximum throughput, even in on-demand mode.

Option A is wrong because on-demand mode has no provisioned capacity limit to raise. Option B is wrong because in on-demand mode a GSI has no separately provisioned throughput limit. Option C is wrong because the table is on-demand, not provisioned.

Practice this question →

294

MCQmedium

A company is running a production Amazon RDS for MySQL DB instance. The application team reports intermittent high latency and connection timeouts. A quick check shows that the DB instance's CPU utilization is consistently above 90% during peak hours. The database size is 500 GB and the instance class is db.r5.large. Which combination of actions should a database specialist take to resolve the performance issue?

A.Increase the allocated storage to 1 TB and enable auto-scaling for storage.

B.Scale up the DB instance to db.r5.xlarge and review slow query logs to optimize poorly performing queries.

C.Enable Multi-AZ and increase the allocated storage to 1 TB to improve I/O performance.

D.Enable Performance Insights and create a CloudWatch alarm to notify when CPU exceeds 80%.

AnswerB

Scaling up provides more CPU and memory; slow query logs help identify and fix inefficient queries.

Why this answer

Option B is correct because scaling up the DB instance to db.r5.xlarge increases compute capacity, which directly addresses high CPU utilization. Additionally, reviewing slow query logs can identify inefficient queries causing CPU spikes. Option A is wrong because increasing the allocated storage does not affect CPU performance.

Option C is wrong because enabling Multi-AZ does not increase compute capacity; it only provides high availability. Option D is wrong because Performance Insights and a CloudWatch alarm help diagnose and notify but do not resolve the issue.

Practice this question →

295

Matchingmedium

Match each AWS database-related CLI command to its function.


Concepts

Matches

Creates a new RDS DB instance

Inserts or replaces an item in a DynamoDB table

Returns details about Redshift clusters

Creates an ElastiCache cache cluster

Lists manual and automated DB snapshots

Why these pairings

Common AWS CLI commands for database services.

Practice this question →

296

MCQhard

A company is running a MongoDB-compatible Amazon DocumentDB cluster with one writer and two readers. The application writes a large amount of data during batch processing, and after a batch completes, the writer's CPU is high, and the readers have significant replica lag. The team wants to reduce replica lag without affecting the batch performance. What should they do?

A.Change the storage type to Provisioned IOPS on all instances

B.Increase the instance size of the readers to improve apply throughput

C.Reduce the batch size to lower the write rate

D.Increase the instance size of the writer to handle the batch faster

AnswerB

Larger readers can apply oplog entries faster, reducing lag.

Why this answer

Option B is correct because the replica lag comes from the readers applying writes from the oplog; increasing the reader instance size gives them more CPU and memory to apply writes faster. Option A is incorrect because changing the storage type does not affect replica lag. Option C is incorrect because reducing the batch size lowers the write rate, but the team wants to maintain batch performance.

Option D is incorrect because increasing the writer size does not directly help the readers.

Practice this question →

297

MCQhard

A company runs an Amazon Aurora MySQL DB cluster with one writer and two readers. They notice that one reader instance is consistently showing higher than expected lag. The other reader is fine. What is the most likely cause?

A.The writer instance is experiencing high write activity

B.There is a network connectivity issue between the writer and that reader

C.The reader instance is being used for heavy analytical queries

D.The reader instance has a different DB parameter group

AnswerC

Heavy read workload on a reader can cause replication lag.

Why this answer

Option C is correct because an imbalanced workload on that reader can cause it to lag. Option A is wrong because writer load affects all replicas. Option B is wrong because network issues would affect both readers.

Option D is wrong because a parameter group change applied at the cluster level would affect both readers.

Practice this question →

298

Multi-Selecteasy

Which TWO actions should be taken to troubleshoot high memory usage on an Amazon ElastiCache for Redis node? (Choose two.)

Select 2 answers

A.Monitor the Evictions CloudWatch metric.

B.Increase the maxmemory parameter to allow more memory usage.

C.Monitor CPUUtilization CloudWatch metric.

D.Enable cluster mode to distribute memory across shards.

E.Enable Reserved Memory parameter group setting.

AnswersA, B

Evictions indicate memory pressure.

Why this answer

Options A and B are correct. The Evictions metric shows whether keys are being evicted under the maxmemory policy, which indicates memory pressure. Increasing the maxmemory parameter can alleviate memory pressure.

Option C (monitoring CPUUtilization) addresses CPU, not memory. Option D (enabling cluster mode) is a configuration change, not a troubleshooting step. Option E (enabling Reserved Memory) reserves memory for overhead and is not a troubleshooting step.

Practice this question →

299

MCQeasy

Refer to the exhibit. A DBA is troubleshooting a performance issue on an RDS for MySQL DB instance. The DBA runs the AWS CLI command shown. Based on the output, which of the following is a potential performance bottleneck?

A.The storage type is gp2, which may have limited IOPS performance.

B.The DB instance is not Multi-AZ, causing failover delay.

C.The DB instance status is 'available', meaning it is not accepting connections.

D.The engine version 8.0.27 has a known performance bug.

AnswerA

gp2 is burstable; under sustained load, performance may degrade.

Why this answer

The instance uses gp2 storage, which has burstable IOPS. If the burst balance is depleted, performance can degrade. The engine version is recent. Multi-AZ is not enabled, but that affects availability rather than performance.

The DB instance status is 'available', so the instance is accepting connections and that is not an issue.

Practice this question →

300

MCQhard

A company runs a critical MySQL database on Amazon RDS Single-AZ (db.m5.large) with 200 GB of Provisioned IOPS (io1) storage set to 3000 IOPS. The application team reports that write operations are occasionally slow. CloudWatch metrics show that the Write IOPS metric peaks at 3500 IOPS during the slowdowns, but the average is 2000 IOPS. The Read IOPS average is 500 IOPS. The queue depth metric occasionally spikes to 20. The storage configuration includes a 50 GB General Purpose SSD (gp2) log volume attached to the same RDS instance. Which change will MOST effectively resolve the write latency?

A.Change the storage type to gp3 with 3000 baseline IOPS.

B.Move the log volume to the same io1 volume to reduce I/O overhead.

C.Increase the provisioned IOPS on the io1 volume to 4000.

D.Enable Multi-AZ for failover protection.

AnswerC

Increasing provisioned IOPS to match peak demand (3500) gives headroom and reduces queue depth.

Why this answer

Option C is correct because the io1 volume is provisioned at 3000 IOPS, but the workload peaks at 3500 IOPS, causing the queue depth to spike. Increasing the provisioned IOPS to 4000 lets the volume absorb the peak without queuing. Option D is incorrect because enabling Multi-AZ provides high availability but does not increase IOPS capacity.

Option A is incorrect because gp3 at a 3000 IOPS baseline offers no more IOPS than the current io1 volume, so peaks of 3500 IOPS would still queue; the most direct fix is to increase the io1 provisioned IOPS. Option B is incorrect because the log volume is separate and is not the source of the write IOPS contention.
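
The sizing argument reduces to simple arithmetic on the figures in the question: a 3500 IOPS peak against 3000 provisioned leaves a shortfall, while 4000 provisioned leaves headroom. A minimal sketch, with an illustrative function name:

```python
def iops_headroom(provisioned_iops: int, peak_iops: int) -> int:
    """Positive result = spare IOPS at peak; negative result = shortfall that queues."""
    return provisioned_iops - peak_iops


PEAK_WRITE_IOPS = 3500  # peak Write IOPS reported in the question

for provisioned in (3000, 4000):
    # Compare the current setting with the proposed setting.
    print(provisioned, iops_headroom(provisioned, PEAK_WRITE_IOPS))
```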

Practice this question →
