Knowledge + Practice

CCNA Workload-Specific Database Design Questions

75 of 444 questions · Page 2/6 · Workload-Specific Database Design · Answers revealed

76

MCQeasy

A company is designing a database for an IoT application that ingests millions of time-series data points per second. The database must support high-throughput writes and efficient querying of recent data. Which AWS database service is MOST suitable?

A.Amazon RDS for PostgreSQL

B.Amazon Timestream

C.Amazon DynamoDB with TTL

D.Amazon Redshift

AnswerB

Timestream is purpose-built for time-series.

Why this answer

Amazon Timestream is purpose-built for time-series data, offering a serverless architecture that ingests millions of data points per second with automatic scaling. It provides efficient storage and querying of recent data through its memory store, while tiering older data to a cost-optimized magnetic store, making it the most suitable choice for high-throughput IoT time-series workloads.

Exam trap

AWS often tests the misconception that any high-throughput NoSQL database (like DynamoDB) is suitable for time-series workloads, but the key differentiator is the need for native time-series query capabilities and automatic data lifecycle management, which Timestream provides and DynamoDB lacks.

How to eliminate wrong answers

Option A is wrong because Amazon RDS for PostgreSQL is a relational database optimized for OLTP workloads with row-based storage, not designed for the high-ingest rates and time-series-specific query patterns (e.g., downsampling, interpolation) required by IoT data. Option C is wrong because Amazon DynamoDB with TTL supports high-throughput writes but lacks native time-series query optimizations such as time-based aggregation, window functions, or automatic data tiering; TTL only handles data expiration, not efficient querying of recent data across millions of points per second. Option D is wrong because Amazon Redshift is a columnar data warehouse optimized for complex analytical queries on large datasets, not for real-time, high-frequency writes of individual time-series data points; its ingestion latency and cost model are unsuitable for per-second write rates.

77

MCQmedium

A company uses Amazon DynamoDB to store user session data for a web application. The table has a partition key of 'user_id' and no sort key. Each item is about 5 KB. The application performs frequent GetItem and UpdateItem operations. Recently, the application has been experiencing higher than expected latency and some throttling. The table's read and write capacity are set to on-demand mode. The CloudWatch metrics show that the ConsumedWriteCapacityUnits are well below the provisioned limits (if they were provisioned), but there are occasional ThrottledWriteEvents. The application team also notices that the throttling occurs for specific users. What is the most likely cause and solution?

A.Create a global secondary index with a different partition key for the hot users.

B.Add a sort key to the table to improve data distribution.

C.Implement write sharding by appending a random suffix to the partition key for high-traffic users.

D.Switch to provisioned capacity mode and increase the write capacity units significantly.

AnswerC

Write sharding distributes writes across multiple partitions, avoiding hot partitions.

Why this answer

Option C is correct because throttling confined to specific users points to a hot partition. On-demand mode scales table-level capacity, but a single partition is still limited to roughly 3,000 RCU or 1,000 WCU, so requests concentrated on one 'user_id' value can be throttled even when total consumption is low. Appending a suffix to the partition key for high-traffic users spreads their writes across several partitions. Option D (switching to provisioned mode and raising capacity) does not help because the limit being hit is per partition, not per table.

Option B (adding a sort key) doesn't help because all items for a hot user would still share one partition key value. Option A (adding a GSI) does not relieve write pressure on the base table's hot partition.
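
The write-sharding idea can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the shard count, the '#' separator, and the helper names are assumptions made for the example, and a real application also has to decide how readers locate or merge the sharded items.

```python
import random

SHARD_COUNT = 10  # assumed number of shards for a hot user (hypothetical value)


def sharded_key(user_id: str, shard_count: int = SHARD_COUNT) -> str:
    """Return a partition key with a random shard suffix, e.g. 'user-42#7'."""
    return f"{user_id}#{random.randrange(shard_count)}"


def all_shard_keys(user_id: str, shard_count: int = SHARD_COUNT) -> list[str]:
    """Return every key a reader must fetch to reassemble one user's data."""
    return [f"{user_id}#{i}" for i in range(shard_count)]
```

Writers call sharded_key() so that consecutive writes for the same user land on different partition key values; readers use all_shard_keys() to fetch and combine the pieces. The cost of the pattern is that reads fan out across shards.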

78

Multi-Selectmedium

A company is migrating an on-premises Oracle database with complex stored procedures and triggers to AWS. They want to minimize code changes. Which two AWS database services should they consider? (Choose two.)

Select 2 answers

A.Amazon DynamoDB

B.Amazon RDS for Oracle

C.Amazon S3

D.Amazon Redshift

E.Amazon Aurora PostgreSQL with Babelfish

AnswersB, E

Directly supports Oracle PL/SQL with minimal changes.

Why this answer

Amazon RDS for Oracle is a direct migration target for on-premises Oracle databases, supporting the same Oracle Database engine. This minimizes code changes because existing stored procedures, triggers, and PL/SQL code can run with minimal or no modification, leveraging Oracle's native compatibility.

Exam trap

The trap here is that candidates may confuse Babelfish's T-SQL compatibility with Oracle PL/SQL support, or incorrectly assume that any 'Aurora' or 'PostgreSQL' service can handle Oracle stored procedures without modification.

79

MCQmedium

A company uses Amazon RDS for MySQL to store e-commerce order data. The orders table has millions of rows and is frequently queried by order_id. The company also runs periodic reports that aggregate data by order_date. The reports are slow. The database has a primary key on order_id. The company needs to improve report performance without affecting OLTP queries. Which design change should be made?

A.Create a secondary index on order_date.

B.Upgrade to a larger instance type.

C.Create a read replica and run reports on the replica.

D.Partition the table by order_date.

AnswerA

A secondary index on order_date speeds up date-based aggregations without impacting OLTP queries.

Why this answer

Creating a secondary index on order_date allows MySQL to quickly locate rows matching the report's date range without scanning the entire table, significantly improving aggregation performance. This index is separate from the primary key on order_id, so OLTP queries that filter by order_id remain unaffected. The index provides a balanced approach: it accelerates read-heavy reporting while adding minimal overhead to write operations.

Exam trap

The trap here is that candidates often assume a read replica (Option C) solves all performance issues, but without an appropriate index, the replica still performs full table scans, making the reports slow regardless of where they run.

How to eliminate wrong answers

Option B is wrong because upgrading to a larger instance type increases CPU, memory, and I/O capacity but does not address the root cause of slow reports—the lack of an efficient access path for date-based queries; it merely masks the performance issue with more resources. Option C is wrong because creating a read replica offloads reporting traffic from the primary instance, but the replica still lacks an index on order_date, so the reports will remain slow on the replica. Option D is wrong because partitioning the table by order_date can improve partition pruning for date-range queries, but it introduces complexity and may negatively impact OLTP queries that filter by order_id, as MySQL must search across multiple partitions; additionally, partitioning does not replace the need for an index on the partitioning key.
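
The effect of the index can be seen in a query plan. The sketch below uses Python's built-in sqlite3 module as a stand-in, since it needs no server; SQLite's planner is not MySQL's, and the table layout, index name, and sample rows are assumptions made for the illustration. On RDS for MySQL the equivalent check would be EXPLAIN on the report query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date, total) VALUES (?, ?)",
    [(f"2024-01-{day:02d}", 10.0 * day) for day in range(1, 29)],
)
# Secondary index on the reporting column (hypothetical index name).
conn.execute("CREATE INDEX idx_orders_order_date ON orders (order_date)")

report_sql = (
    "SELECT order_date, SUM(total) FROM orders "
    "WHERE order_date BETWEEN '2024-01-10' AND '2024-01-20' "
    "GROUP BY order_date"
)
rows = conn.execute(report_sql).fetchall()

# The fourth column of each EXPLAIN QUERY PLAN row is the plan description.
report_plan = " ".join(
    row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + report_sql)
)
lookup_plan = " ".join(
    row[3]
    for row in conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE order_id = 5"
    )
)
```

In this sketch the report plan references the new index, while the lookup by order_id still goes through the primary key, which mirrors the claim that the OLTP access path is unchanged.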

80

MCQhard

A company needs to migrate an on-premises Oracle database to AWS with minimal changes to the application code. The application uses complex stored procedures and has high availability requirements. Which database service should be used?

A.Amazon RDS for Oracle

B.Amazon DynamoDB

C.Amazon Redshift

D.Amazon Aurora PostgreSQL

AnswerA

RDS for Oracle provides full Oracle compatibility with Multi-AZ for high availability.

Why this answer

Amazon RDS for Oracle is the correct choice because it provides native Oracle compatibility, supporting complex stored procedures, PL/SQL, and existing application code with minimal changes. It also offers Multi-AZ deployments for high availability, meeting the requirement without requiring a complete rewrite.

Exam trap

The trap here is that candidates may choose Amazon Aurora PostgreSQL due to its high availability and performance, overlooking the fact that it does not support Oracle-specific stored procedures and PL/SQL, which would require costly application rewrites.

How to eliminate wrong answers

Option B is wrong because Amazon DynamoDB is a NoSQL key-value and document database that does not support complex stored procedures or Oracle PL/SQL, requiring significant application code changes. Option C is wrong because Amazon Redshift is a petabyte-scale data warehouse optimized for analytical queries, not for transactional workloads with complex stored procedures. Option D is wrong because Amazon Aurora PostgreSQL, while highly available, uses PostgreSQL syntax and does not natively support Oracle-specific stored procedures or PL/SQL, necessitating code modifications.

81

MCQhard

A company runs a multi-tenant SaaS platform on AWS. Each tenant has their own database schema within a shared PostgreSQL database on Amazon RDS. The platform has grown to thousands of tenants, and the single RDS instance is experiencing performance degradation due to resource contention. Queries from one tenant can impact others. The company needs a solution that isolates tenants, provides predictable performance, and allows easy scaling. They also want to minimize application changes. The application uses an ORM that dynamically constructs SQL queries based on the tenant ID. Which solution is BEST?

A.Migrate to Amazon Aurora PostgreSQL and use Aurora Auto Scaling to add reader nodes as needed.

B.Create separate RDS instances for each tenant and use RDS Proxy to pool connections per tenant. Modify the application to select the appropriate database instance based on tenant ID.

C.Implement Amazon RDS Proxy in front of the existing RDS instance to manage connections and reduce contention.

D.Migrate the application to use Amazon DynamoDB with tenant ID as the partition key, using global tables for scalability.

AnswerB

This provides full isolation and predictable performance. RDS Proxy reduces connection overhead. Application changes are limited to connection routing logic.

Why this answer

Option B is correct because a separate RDS instance per tenant is the only option that actually isolates tenants: one tenant's queries can no longer consume CPU, memory, or I/O that another tenant depends on, so performance becomes predictable and each instance can be sized and scaled independently. RDS Proxy pools connections per tenant so that a large tenant population does not exhaust database connections.

Application changes are limited to connection routing. The ORM already builds its queries from the tenant ID, so it can be configured to use a different connection string per tenant while the SQL stays the same. The trade-off is cost and operational overhead, which grow with the number of instances.

Option A is wrong because Aurora Auto Scaling adds reader nodes, which helps read scaling but leaves all tenants sharing the same cluster, so contention between tenants remains. Option C is wrong because RDS Proxy in front of the existing instance only manages connections; it does not isolate CPU, memory, or I/O between tenants. Option D is wrong because DynamoDB would require replacing the PostgreSQL data model and the ORM-generated SQL, which conflicts with the requirement to minimize application changes.
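
The connection routing that Option B requires can be sketched as a lookup from tenant ID to endpoint. This is a minimal sketch under stated assumptions: the registry contents, host names, and the dsn_for_tenant helper are hypothetical, and a real system would load the mapping from configuration or a catalog service and handle credentials separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantEndpoint:
    host: str       # e.g. the RDS Proxy endpoint for this tenant's instance
    database: str


# Hypothetical registry; real deployments would not hard-code this.
TENANT_ENDPOINTS = {
    "tenant-a": TenantEndpoint("tenant-a.proxy.example.internal", "app"),
    "tenant-b": TenantEndpoint("tenant-b.proxy.example.internal", "app"),
}


def dsn_for_tenant(tenant_id: str, registry=TENANT_ENDPOINTS) -> str:
    """Return the PostgreSQL connection URL for the given tenant."""
    try:
        endpoint = registry[tenant_id]
    except KeyError:
        raise LookupError(f"unknown tenant: {tenant_id}") from None
    return f"postgresql://{endpoint.host}:5432/{endpoint.database}"
```

The ORM would be handed the returned URL when it opens a session for a request, so the query-building code itself does not change.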

82

MCQmedium

A gaming company runs a global leaderboard on Amazon DynamoDB. The leaderboard is updated frequently and must return the top 100 scores in milliseconds. The current design uses a single table with a Global Secondary Index (GSI) on score. However, the query to retrieve top scores often throttles under load. Which design change would best improve performance?

A.Use a scan operation with a filter to retrieve top scores.

B.Implement a write shard pattern using a random suffix on the partition key and a GSI on score.

C.Add DynamoDB Accelerator (DAX) in front of the table.

D.Switch to strongly consistent reads for the leaderboard query.

AnswerB

Sharding distributes write load, and the GSI on score enables efficient range queries for top scores.

Why this answer

Option B is correct because the write shard pattern distributes high-frequency writes across multiple partition keys by appending a random suffix, preventing hot partitions. The GSI on score still allows efficient top-N queries by querying the index in descending order. This avoids throttling by spreading write capacity evenly, while reads against the GSI draw on the index's own read capacity rather than the base table's.

Exam trap

The trap here is that candidates often assume caching (DAX) or consistency changes will fix throttling, but the real issue is write-side hot partitions, which the write shard pattern directly addresses by distributing the write load.

How to eliminate wrong answers

Option A is wrong because a scan operation reads every item in the table, which is inefficient and costly, and filtering after a scan does not reduce the read capacity consumed, leading to even more throttling under load. Option C is wrong because DAX is an in-memory cache that accelerates reads but does not solve write-side throttling caused by hot partitions; it also adds latency for writes and does not help with the write-heavy leaderboard updates. Option D is wrong because strongly consistent reads consume twice the read capacity units of eventually consistent reads and do not address the root cause of write throttling; the leaderboard query is a read operation, but the bottleneck is write contention on hot partitions.
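
When keys are sharded, a top-N read has to combine the per-shard results. The sketch below shows only that merge step in plain Python; the (player, score) tuple layout and the top_n helper are assumptions made for the illustration, and fetching each shard's results from DynamoDB is left out.

```python
import heapq
from itertools import chain


def top_n(shard_results, n=100):
    """Merge per-shard (player, score) lists and keep the n highest scores."""
    return heapq.nlargest(
        n, chain.from_iterable(shard_results), key=lambda entry: entry[1]
    )


# Example: each inner list is what one shard returned.
shards = [
    [("ann", 90), ("bob", 70)],
    [("cy", 95)],
    [("dee", 80)],
]
leaders = top_n(shards, n=3)
```

If each shard is asked only for its own top n, the merged result is still the correct global top n, so the read cost grows with the number of shards, not with the size of the table.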

83

MCQmedium

Refer to the exhibit. A CloudFormation template creates a DynamoDB table. The application team needs to query orders by customer ID (which is not a key attribute). Which change to the template would enable efficient querying by customer ID?

A.Change the KeySchema to use CustomerID as the hash key

B.Add a LocalSecondaryIndex on CustomerID

C.Add a GlobalSecondaryIndex with CustomerID as the hash key and OrderDate as the range key

D.Enable DynamoDB Streams and use Lambda to populate a separate table

AnswerC

GSI allows efficient querying by CustomerID.

Why this answer

Option C is correct because a GlobalSecondaryIndex (GSI) allows querying on a non-key attribute (CustomerID) with a different key schema than the base table. By specifying CustomerID as the hash key and OrderDate as the range key, the application can efficiently query orders by CustomerID and optionally sort by OrderDate, without affecting the base table's primary key structure.

Exam trap

The trap here is that candidates often confuse LocalSecondaryIndexes (LSIs) with GlobalSecondaryIndexes (GSIs), incorrectly assuming an LSI can be created on any attribute, when in fact an LSI must share the same hash key as the base table and can only be added during table creation.

How to eliminate wrong answers

Option A is wrong because changing the KeySchema to use CustomerID as the hash key would break existing access patterns that rely on the original primary key (e.g., OrderID), and CustomerID is not guaranteed to be unique, leading to data overwrites. Option B is wrong because a LocalSecondaryIndex (LSI) can only be created on tables with a composite primary key (hash and range key) and must use the same hash key as the base table; since CustomerID is not the base table's hash key, an LSI cannot be defined on it. Option D is wrong because using DynamoDB Streams and Lambda to populate a separate table adds operational complexity, latency, and cost, and is not the simplest or most efficient solution for enabling querying by a non-key attribute when a GSI directly solves the requirement.

84

MCQhard

A company is using Amazon RDS for Oracle with a very large database (10 TB). They need to migrate to Amazon Aurora PostgreSQL with minimal downtime. The source database is heavily used with constant writes. Which migration strategy is most appropriate?

A.Export the Oracle database using expdp and import into Aurora PostgreSQL using pg_restore.

B.Use AWS Database Migration Service (DMS) with ongoing replication to migrate from Oracle to Aurora PostgreSQL.

C.Create a read replica of the RDS Oracle instance and promote it to an Aurora PostgreSQL instance.

D.Use Oracle GoldenGate to replicate data to an Aurora PostgreSQL instance.

AnswerB

DMS supports full load and CDC, minimizing downtime.

Why this answer

AWS DMS with ongoing replication (change data capture) is the most appropriate strategy for migrating a heavily written 10 TB Oracle database to Aurora PostgreSQL with minimal downtime. DMS can perform a full load of the existing data and then continuously replicate changes from Oracle's redo logs to Aurora PostgreSQL, allowing the source to remain fully operational until a brief cutover window. This approach minimizes downtime compared to offline export/import methods and is natively supported by AWS.

Exam trap

The trap here is that candidates may confuse read replicas (which are engine-specific and cannot change database engines) with DMS replication, or assume that Oracle GoldenGate is always the best choice for heterogeneous migrations without considering AWS-native alternatives like DMS.

How to eliminate wrong answers

Option A is wrong because expdp and pg_restore are incompatible tools (Oracle export/import vs. PostgreSQL restore), and this offline method would require significant downtime for a 10 TB database with constant writes, making minimal downtime impossible. Option C is wrong because RDS for Oracle does not support creating a read replica that can be promoted to a different database engine (Aurora PostgreSQL); read replicas are only for the same engine type. Option D is wrong because Oracle GoldenGate is a third-party tool that adds complexity and cost, and while it could technically work, AWS DMS is the recommended, fully managed service for heterogeneous migrations with ongoing replication, making it a more appropriate choice in the AWS ecosystem.

85

Multi-Selectmedium

Which TWO of the following are benefits of using Amazon DynamoDB Accelerator (DAX)? (Choose 2.)

Select 2 answers

A.Improves write throughput by caching write operations

B.Reduces storage costs by compressing data

C.Reduces read latency to microseconds for cached items

D.Automatically scales write capacity based on demand

E.Reduces the read capacity units consumed on the DynamoDB table

AnswersC, E

DAX provides microsecond read latency for cached data.

Why this answer

Option C is correct because Amazon DynamoDB Accelerator (DAX) is an in-memory cache that delivers up to 10x read performance improvement, reducing read latency to microseconds for cached items. It sits between your application and DynamoDB, intercepting read requests and serving them from its cluster's memory, which avoids the millisecond-level latency of reading from DynamoDB's SSD storage.

Exam trap

The trap here is confusing DAX's read caching with write optimization, leading candidates to incorrectly select that DAX improves write throughput or scales write capacity, when in fact DAX only accelerates reads and reduces read capacity consumption.

86

MCQmedium

A company uses Amazon DynamoDB for a high-traffic leaderboard application that updates scores in real-time. The table has partition key 'game_id' and sort key 'player_id'. Queries retrieve top 10 players by score for each game. Which secondary index design is most efficient?

A.Create a global secondary index (GSI) with partition key 'game_id' and sort key 'score'

B.Create a global secondary index (GSI) with partition key 'game_id' and sort key 'player_id'

C.Create a local secondary index (LSI) with sort key 'score'

D.Do not create any index; use the base table with a scan

AnswerA

This index allows querying by game and retrieving top scores efficiently.

Why this answer

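Option A is correct because a GSI with game_id as the partition key and score as the sort key supports efficient top-N queries: a Query on the index with ScanIndexForward=false and Limit=10 returns the highest scores first. Option B (GSI with sort key player_id) doesn't sort by score. Option C (LSI with sort key score) shares the base table's partition key, so it could serve the same query, but an LSI must be defined when the table is created, shares the table's throughput, and limits each item collection (here, each game_id) to 10 GB, which makes it a poor fit for a high-traffic leaderboard.

Option D (no index) would require a full table scan.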
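
The top-10 query against such an index can be sketched as the parameter set for a DynamoDB Query call. The table name and index name below are hypothetical, and the sketch only builds the request dictionary; with boto3 it would typically be passed to the low-level client's query method as keyword arguments.

```python
def top_players_query(game_id: str, limit: int = 10) -> dict:
    """Build Query parameters that return the highest scores for one game."""
    return {
        "TableName": "Leaderboard",              # hypothetical table name
        "IndexName": "game_id-score-index",      # hypothetical GSI name
        "KeyConditionExpression": "game_id = :g",
        "ExpressionAttributeValues": {":g": {"S": game_id}},
        "ScanIndexForward": False,               # descending by sort key (score)
        "Limit": limit,                          # stop after the top N items
    }


params = top_players_query("game-1")
```

Because the index is sorted by score within each game_id, the descending read with a limit touches only the items it returns.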

87

MCQmedium

A company is designing a multi-tenant SaaS application on Amazon RDS for PostgreSQL. Each tenant's data must be isolated for security and performance. The application has millions of tenants, with most tenants having small datasets (under 100 MB). Which database design pattern is MOST cost-effective and operationally efficient?

A.Use Amazon DynamoDB with a separate table per tenant.

B.Use a single RDS instance with a shared schema and implement Row-Level Security (RLS) policies based on tenant_id.

C.Use a single RDS instance with a separate schema per tenant.

D.Use a separate Amazon RDS for PostgreSQL instance per tenant.

AnswerB

RLS provides tenant isolation with minimal overhead, suitable for many small tenants.

Why this answer

Option B is correct because using a single RDS for PostgreSQL instance with Row-Level Security (RLS) allows you to isolate tenant data at the row level based on a tenant_id column, without the overhead of managing millions of separate schemas or tables. This design is both cost-effective (single instance, no per-tenant provisioning) and operationally efficient (simple schema management, no connection pooling issues), while still meeting security and performance isolation requirements for small datasets under 100 MB.

Exam trap

The trap here is that candidates often assume separate schemas per tenant (Option C) are the best balance of isolation and cost, but they overlook PostgreSQL's practical limits on the number of schemas and the severe performance degradation from catalog bloat when dealing with millions of tenants.

How to eliminate wrong answers

Option A is wrong because Amazon DynamoDB with a separate table per tenant would require creating millions of tables, which far exceeds the default DynamoDB table quota (2,500 per account per Region) and introduces significant operational overhead for table management, throughput provisioning, and cross-tenant queries. Option C is wrong because using a separate schema per tenant on a single RDS instance would require creating millions of schemas; PostgreSQL has no hard limit on schema count, but the system catalogs (starting with pg_namespace) would become bloated and performance would degrade due to excessive catalog lookups. Option D is wrong because using a separate RDS for PostgreSQL instance per tenant would be prohibitively expensive and operationally unmanageable for millions of tenants, as each instance incurs minimum billing costs and requires individual maintenance, backups, and monitoring.

88

MCQeasy

A company needs a fully managed graph database for a social networking application that requires real-time recommendations based on friend connections. Which AWS service should they use?

A.Amazon Neptune

B.Amazon DocumentDB

C.Amazon ElastiCache

D.Amazon DynamoDB

AnswerA

Neptune is a managed graph database suitable for social networking.

Why this answer

Amazon Neptune is the correct choice because it is a fully managed graph database service optimized for storing and querying highly connected data. It supports both property graph (Apache TinkerPop Gremlin) and RDF (SPARQL) models, making it ideal for social networking applications that require real-time friend-of-friend recommendations and traversal queries across complex relationships.

Exam trap

The trap here is that candidates often confuse Amazon DocumentDB or DynamoDB as suitable for graph workloads because they can store JSON with references, but they lack native graph traversal engines and query languages (Gremlin/SPARQL) required for efficient relationship queries.

How to eliminate wrong answers

Option B (Amazon DocumentDB) is wrong because it is a document database (MongoDB-compatible) designed for JSON document storage and indexing, not for graph traversal or relationship-heavy queries like friend connections. Option C (Amazon ElastiCache) is wrong because it is an in-memory caching service (Redis/Memcached) that does not natively support graph data models or traversal algorithms; it can accelerate queries but cannot replace a graph database. Option D (Amazon DynamoDB) is wrong because it is a key-value and document NoSQL database optimized for single-item access patterns and simple queries, lacking native graph traversal capabilities such as shortest-path or multi-hop relationship queries.

89

Multi-Selecteasy

A company is designing a disaster recovery strategy for an Amazon RDS for PostgreSQL database. The database is 2 TB in size. The company wants to recover to a different AWS Region with minimal data loss. Which TWO options meet these requirements?

Select 2 answers

A.Create a read replica in the other Region.

B.Use AWS Database Migration Service (DMS) with ongoing replication to a target in the other Region.

C.Take a manual snapshot and copy it to the other Region. Restore from the snapshot.

D.Enable automatic backups and copy automated snapshots to the other Region.

E.Use AWS Backup to schedule cross-Region backups.

AnswersA, D

Cross-Region read replicas provide low RPO and can be promoted to a standalone instance.

Why this answer

Option A is correct because Amazon RDS for PostgreSQL supports creating a cross-Region read replica, which uses PostgreSQL's native streaming replication to keep the replica nearly synchronized with the source database. This provides a Recovery Point Objective (RPO) of seconds to minutes, minimizing data loss in a disaster scenario. The replica can be promoted to a standalone primary in the other Region for failover.

Exam trap

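The trap here is that candidates may treat a one-off manual snapshot copy (Option C) as continuous protection. Its RPO is the age of the most recent snapshot, whereas a cross-Region read replica (Option A) replicates in near real time, and cross-Region automated backup replication (Option D) ships snapshots and transaction logs to the other Region so that a point-in-time restore is possible there.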

90

MCQeasy

A company needs to store JSON documents that require complex querying on nested attributes. The database must support ACID transactions and be fully managed. Which service should they use?

A.Amazon Aurora MySQL

B.Amazon DocumentDB (with MongoDB compatibility)

C.Amazon DynamoDB

D.Amazon Neptune

AnswerA

Supports JSON and ACID transactions.

Why this answer

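Option A is correct because Amazon Aurora MySQL-Compatible Edition is fully managed, supports the JSON data type with functions for querying nested attributes, and provides ACID transactions. Option B (DocumentDB) is MongoDB-compatible and handles nested documents well; it is rejected here on transaction support, although DocumentDB 4.0 and later do support multi-document ACID transactions, so the distinction is narrower than the answer suggests. Option C (DynamoDB) is NoSQL and does not support complex nested queries natively.

Option D (Neptune) is a graph database.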

91

MCQhard

A database specialist is analyzing an Aurora MySQL error log and finds the above deadlock error. The application performs an update on the orders table and then updates the inventory table within the same transaction. The deadlock occurs when two concurrent transactions try to update orders and inventory in different orders. Which design change should the database specialist recommend to reduce deadlocks?

A.Combine the orders and inventory tables into a single table to avoid multiple table locks

B.Ensure all transactions update tables in the same order (e.g., always update inventory first, then orders)

C.Use SELECT ... FOR UPDATE on both tables before updating

D.Change the transaction isolation level to READ UNCOMMITTED

AnswerB

Consistent lock ordering prevents circular wait conditions, reducing deadlocks.

Why this answer

Option B is correct because deadlocks in Aurora MySQL often occur when concurrent transactions acquire row-level locks on tables in different orders. By enforcing a consistent lock order (e.g., always updating inventory first, then orders), the database can avoid circular wait conditions, which are a necessary condition for deadlocks. This is a standard best practice for reducing deadlocks in InnoDB, which uses row-level locking and two-phase locking.

Exam trap

The trap here is that candidates may think combining tables or using SELECT ... FOR UPDATE will prevent deadlocks, but the root cause is inconsistent lock ordering, not the number of tables or the use of explicit locking.

How to eliminate wrong answers

Option A is wrong because combining tables into a single table does not eliminate the need for multiple row locks and can introduce data redundancy, normalization issues, and still allow deadlocks if rows are locked in different orders. Option C is wrong because using SELECT ... FOR UPDATE on both tables before updating does not guarantee a consistent lock order; if the SELECT ... FOR UPDATE statements acquire locks in different orders across transactions, deadlocks can still occur. Option D is wrong because changing the isolation level to READ UNCOMMITTED can lead to dirty reads, non-repeatable reads, and phantom reads, and it does not prevent deadlocks; deadlocks are caused by lock contention, not isolation level.
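
Consistent lock ordering can be illustrated outside the database with ordinary thread locks. The sketch below is an analogy, not Aurora MySQL code: the two locks stand in for row locks on the orders and inventory tables, and the run_transaction helper is hypothetical. Each "transaction" sorts the tables it needs before locking, so both always take inventory first and a circular wait cannot form.

```python
import threading

# Stand-ins for the locks a transaction takes on each table.
locks = {"inventory": threading.Lock(), "orders": threading.Lock()}
completed = []


def run_transaction(name, tables):
    """Acquire locks in one global order (alphabetical), do work, release."""
    ordered = sorted(tables)            # same order for every transaction
    for table in ordered:
        locks[table].acquire()
    try:
        completed.append(name)          # placeholder for the real updates
    finally:
        for table in reversed(ordered):
            locks[table].release()


# The two callers list the tables in opposite orders, as in the question.
threads = [
    threading.Thread(target=run_transaction, args=("t1", ["orders", "inventory"])),
    threading.Thread(target=run_transaction, args=("t2", ["inventory", "orders"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join(timeout=5)
```

Without the sorted() call, each thread could hold one lock while waiting for the other, which is the same circular wait that InnoDB reports as a deadlock.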

92

Multi-Selecthard

Which THREE of the following are key considerations when designing a time-series database using Amazon DynamoDB? (Select THREE.)

Select 3 answers

A.Always use strongly consistent reads for accurate time-series data

B.Enable Time to Live (TTL) to automatically expire old data

C.Use a composite primary key with a high-cardinality partition key and a sort key that includes a truncated timestamp

D.Use local secondary indexes for aggregating data across partitions

E.Design for adaptive capacity to handle uneven access patterns

AnswersB, C, E

Automatically deletes data after a specified time.

Why this answer

Option B is correct because Amazon DynamoDB's Time to Live (TTL) feature automatically deletes expired items without consuming write throughput, making it ideal for managing data retention in time-series workloads. This eliminates the need for custom cleanup scripts and reduces storage costs over time.

Exam trap

AWS often tests the misconception that strongly consistent reads are mandatory for time-series accuracy, when in fact eventually consistent reads are acceptable for most time-series patterns and provide better performance and cost efficiency.
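
A time-series item that follows options B and C might be built as in the sketch below. The attribute names (device_id, ts_minute, expires_at), the one-minute truncation, and the 30-day retention are assumptions made for the illustration. The one fixed requirement is that the TTL attribute holds a Number containing the expiry time in Unix epoch seconds.

```python
from datetime import datetime, timedelta, timezone


def build_item(device_id: str, reading: float, ts: datetime,
               retention_days: int = 30) -> dict:
    """Build an item with a truncated-timestamp sort key and a TTL attribute."""
    minute = ts.replace(second=0, microsecond=0)
    expires = ts + timedelta(days=retention_days)
    return {
        "device_id": device_id,                  # high-cardinality partition key
        "ts_minute": minute.isoformat(),         # sort key, truncated to the minute
        "reading": reading,
        "expires_at": int(expires.timestamp()),  # TTL attribute, epoch seconds
    }


sample_ts = datetime(2024, 1, 1, 12, 30, 45, tzinfo=timezone.utc)
item = build_item("sensor-17", 21.5, sample_ts)
```

Truncating the sort key groups readings into predictable ranges for time-window queries, while the TTL attribute lets DynamoDB expire old items without a cleanup job.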

93

MCQmedium

A company uses Amazon DynamoDB to store user profiles. The access pattern is mostly GetItem by user_id. They want to reduce costs. Which design change is most effective?

A.Use DynamoDB Standard-IA table class for the user profiles table.

B.Increase the read capacity units to reduce throttling.

C.Add a Global Secondary Index on an additional attribute.

D.Add DynamoDB Accelerator (DAX) for caching.

AnswerA

Standard-IA lowers storage cost for infrequently accessed data.

Why this answer

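Option A is correct because the DynamoDB Standard-IA table class lowers storage cost for data that is accessed infrequently. Standard-IA charges more per read and write than the Standard class, so it pays off when storage, not request volume, dominates the bill. Option B is wrong because increasing read capacity increases cost.

Option C is wrong because adding a GSI adds cost. Option D is wrong because DAX adds cost.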

94

MCQeasy

A company is migrating an on-premises MongoDB database to AWS. The application uses MongoDB's aggregation pipeline for real-time analytics. Which AWS database service is most compatible and provides the least application changes?

A.Amazon ElastiCache for Redis with RedisJSON module.

B.Amazon DocumentDB (with MongoDB compatibility).

C.Amazon DynamoDB with DynamoDB Streams and Lambda for aggregation.

D.Amazon Aurora with JSON data type.

AnswerB

DocumentDB is MongoDB-compatible and supports aggregation pipeline.

Why this answer

Amazon DocumentDB is designed to be MongoDB-compatible, supporting the MongoDB aggregation pipeline with minimal changes. This allows the company to migrate the existing MongoDB database and continue using the same aggregation pipeline for real-time analytics without rewriting application code, making it the most compatible option.

Exam trap

The trap here is that candidates may assume DynamoDB's flexibility or Aurora's JSON support can easily replace MongoDB's aggregation pipeline, overlooking the fundamental differences in query language and data model that necessitate significant application rewrites.

How to eliminate wrong answers

Option A is wrong because Amazon ElastiCache for Redis with RedisJSON module is an in-memory cache, not a document database, and does not support MongoDB's aggregation pipeline or provide persistent storage for the full dataset. Option C is wrong because Amazon DynamoDB is a key-value and document database that does not natively support MongoDB's aggregation pipeline; using DynamoDB Streams and Lambda would require significant application changes to reimplement aggregation logic. Option D is wrong because Amazon Aurora with JSON data type is a relational database that does not support MongoDB's aggregation pipeline or its query language, requiring a complete rewrite of application queries and logic.

95

Multi-Selectmedium

A company is building a real-time leaderboard for an online game using Amazon DynamoDB. The leaderboard must update scores within seconds and support queries for top 100 players. Which TWO design patterns should be used? (Choose TWO.)

Select 2 answers

A.Create a global secondary index on the score attribute for efficient range queries.

B.Use DynamoDB Streams to trigger a Lambda function that updates a separate leaderboard table.

C.Store the leaderboard in Amazon ElastiCache for Redis for low-latency reads.

D.Enable DynamoDB Accelerator (DAX) for faster reads of the leaderboard.

E.Set the sort key to the score attribute for natural ordering.

AnswersB, D

Streams and Lambda provide real-time processing.

Why this answer

Option B (DynamoDB Streams + Lambda) enables real-time updates. Option D (DAX) provides low-latency read caching for the leaderboard. Option A (GSI) could help but not for real-time top-N. Option C (ElastiCache) is an alternative but not DynamoDB-native. Option E (sort key on score) is a good design but not a separate pattern; it is part of table design.
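
The Streams-plus-Lambda pattern can be sketched roughly as follows. This is a minimal illustration, not a production handler: the attribute names `player_id` and `score` are assumptions, a plain dict stands in for the separate leaderboard table, and the stream is assumed to be configured with a view type that includes new images.

```python
import heapq


def scores_from_stream_event(event):
    """Extract (player_id, score) pairs from a DynamoDB Streams Lambda event."""
    pairs = []
    for record in event.get("Records", []):
        # REMOVE events carry no NewImage, so only inserts and updates are read.
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue
        image = record["dynamodb"]["NewImage"]
        # Stream images use DynamoDB's typed JSON: {"S": ...} and {"N": ...}.
        # "player_id" and "score" are hypothetical attribute names.
        pairs.append((image["player_id"]["S"], int(image["score"]["N"])))
    return pairs


def apply_updates(best_scores, pairs):
    """Keep each player's highest score (the dict stands in for a table)."""
    for player_id, score in pairs:
        if score > best_scores.get(player_id, -1):
            best_scores[player_id] = score
    return best_scores


def top_n(best_scores, n=100):
    """Return the n highest (player_id, score) pairs, best first."""
    return heapq.nlargest(n, best_scores.items(), key=lambda kv: kv[1])
```

In a real function the handler would write the updated entries to the leaderboard table instead of a dict; the extraction and ranking steps stay the same.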

96

MCQeasy

A company needs to store JSON documents that are frequently accessed by a web application. The documents have varying attributes and the query pattern includes filtering on multiple fields. Which AWS database service is most suitable?

A.Amazon Neptune

B.Amazon ElastiCache for Redis

C.Amazon DynamoDB

D.Amazon RDS for MySQL

AnswerC

NoSQL, supports JSON and flexible queries with GSIs.

Why this answer

Amazon DynamoDB is the most suitable choice because it is a fully managed NoSQL key-value and document database that natively supports JSON documents with varying attributes. Its flexible schema allows each item to have different attributes, and its support for secondary indexes (Local Secondary Indexes and Global Secondary Indexes) enables efficient filtering and querying on multiple fields without requiring predefined schemas or complex joins.

Exam trap

The trap here is that candidates often choose Amazon RDS for MySQL because they assume JSON support in relational databases is sufficient, but they overlook the performance and schema flexibility limitations when dealing with varying attributes and multi-field filtering at scale.

How to eliminate wrong answers

Option A is wrong because Amazon Neptune is a graph database designed for highly connected data (e.g., social networks, recommendation engines) and is not optimized for storing or querying JSON documents with varying attributes or multi-field filtering; it uses SPARQL or Gremlin, not simple key-value or document queries. Option B is wrong because Amazon ElastiCache for Redis is an in-memory cache, not a durable primary database; while it can store JSON via the RedisJSON module, it lacks persistent storage guarantees and is not designed for complex multi-field filtering or secondary indexes. Option D is wrong because Amazon RDS for MySQL is a relational database that requires a fixed schema, making it unsuitable for storing JSON documents with varying attributes; although MySQL supports JSON columns, querying multiple fields within JSON requires complex expressions and cannot leverage secondary indexes efficiently, leading to performance issues.

97

MCQhard

Refer to the exhibit. A developer reports that the RDS MySQL instance 'mydb' is experiencing high write latency. The storage is gp2 with 100 GB. What is the MOST likely cause of the write latency?

A.There is a read replica causing replication lag

B.The gp2 volume size is too small, resulting in insufficient baseline IOPS

C.The instance class db.r5.large does not provide enough memory

D.Multi-AZ is not enabled, causing synchronous replication overhead

AnswerB

gp2 baseline IOPS is 3 per GB, so 100 GB gives only 300 IOPS.

Why this answer

The gp2 volume's baseline IOPS are determined by the volume size at a ratio of 3 IOPS per GB, up to 16,000 IOPS. With a 100 GB gp2 volume, the baseline IOPS is only 300 (100 × 3). This is insufficient for write-heavy workloads, causing write latency as the volume exhausts its IOPS credit balance and enters a throttled state.

Burst credits can temporarily boost performance, but sustained high write throughput will deplete credits and lead to latency.
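
The baseline arithmetic can be checked with a short sketch. The 3 IOPS per GB ratio and the 16,000 ceiling come from the explanation above; the 100 IOPS floor for very small volumes is an additional documented gp2 limit, and the function name is purely illustrative.

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS of a gp2 volume: 3 per GiB, floor 100, ceiling 16,000."""
    return max(100, min(16_000, 3 * size_gib))


# The 100 GB volume from the question gets only 300 baseline IOPS.
print(gp2_baseline_iops(100))
```

Growing the volume raises the baseline linearly until the ceiling is reached, which is why resizing is a common fix for this symptom.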

Exam trap

The trap here is that candidates may overlook the gp2 IOPS-to-size ratio and assume any gp2 volume can burst indefinitely, or they may confuse storage performance issues with instance class or replication factors.

How to eliminate wrong answers

Option A is wrong because read replicas do not cause write latency on the source instance; replication lag affects the replicas, not the primary's write performance. Option C is wrong because db.r5.large provides ample memory (16 GiB) for typical workloads, and insufficient memory would manifest as swap usage or out-of-memory errors, not directly as write latency. Option D is wrong because it contradicts itself: synchronous replication to a standby happens only when Multi-AZ is enabled, so a deployment without Multi-AZ has no replication overhead on writes. Even with Multi-AZ enabled, the added commit latency is usually small and would not be the primary cause of high write latency.

98

MCQhard

A company is designing a database for a global IoT application that ingests millions of events per second. Each event includes a device ID, timestamp, and sensor readings. The requirement is to store data for historical analysis and to support queries that aggregate data by device ID over time ranges. The team needs a cost-effective solution that can scale write throughput. Which database design is most appropriate?

A.Use Amazon DynamoDB with a table keyed by device ID (partition) and timestamp (sort).

B.Use Amazon RDS for MySQL with Multi-AZ and auto-scaling storage.

C.Use Amazon Redshift with a schema optimized for time-series data.

D.Use Amazon ElastiCache for Redis with persistence enabled.

AnswerA

DynamoDB supports massive write throughput and efficient querying by device and time range.

Why this answer

Option A is correct because DynamoDB with a partition key of device ID and sort key of timestamp allows high write throughput and efficient time-range queries. Option B is wrong because RDS with auto-scaling storage cannot handle millions of events per second. Option C is wrong because Redshift is optimized for analytics but not for high-velocity writes. Option D is wrong because ElastiCache is in-memory and too expensive for historical storage.
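
A time-range query against the device ID and timestamp key design might look like the sketch below. It only builds the request parameters for the low-level `Query` operation; the table name, the attribute names `device_id` and `timestamp`, and the use of ISO-8601 UTC strings for the sort key are assumptions made for illustration.

```python
def build_device_range_query(table_name, device_id, start_iso, end_iso):
    """Build Query parameters for one device over a closed time range.

    Assumes ISO-8601 UTC timestamps in a fixed format, which sort
    lexicographically in chronological order.
    """
    return {
        "TableName": table_name,
        # "timestamp" is a DynamoDB reserved word, so it is aliased as #ts.
        "KeyConditionExpression": "#dev = :dev AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#dev": "device_id", "#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":dev": {"S": device_id},
            ":start": {"S": start_iso},
            ":end": {"S": end_iso},
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("dynamodb").query(**params)`. Because the partition key is fixed, the query reads a single item collection rather than scanning the table.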

99

Multi-Selecthard

Which TWO strategies can improve query performance in Amazon Aurora MySQL for a read-heavy workload? (Select TWO.)

Select 2 answers

A.Enable Aurora Auto Scaling for read replicas

B.Use Provisioned IOPS EBS volumes for the primary instance

C.Enable Multi-AZ to create a standby for read traffic

D.Create Aurora Replicas and distribute read traffic to them

E.Migrate the read-heavy queries to Amazon DynamoDB

AnswersA, D

Auto Scaling automatically adjusts the number of replicas based on load.

Why this answer

Aurora Replicas offload read traffic, and Auto Scaling adjusts the number of replicas based on load. Option B (Provisioned IOPS EBS volumes) is not applicable to Aurora, which stores data on a shared cluster volume rather than on EBS volumes attached to the instance. Option C (Multi-AZ) is for failover, not read scaling. Option E (DynamoDB) is a different service.

100

MCQeasy

A company needs to store and query a graph of relationships between users for a recommendation engine. The queries involve traversing multiple edges. Which AWS database service is most suitable?

A.Amazon DynamoDB with adjacency list design

B.Amazon Neptune

C.Amazon DocumentDB (with MongoDB compatibility)

D.Amazon RDS for PostgreSQL with recursive CTEs

AnswerB

Neptune is optimized for graph traversals and supports Gremlin and SPARQL.

Why this answer

Option B is correct because Neptune is a fully managed graph database designed for highly connected data. Option A (DynamoDB) can model graphs with an adjacency list but requires multiple queries per traversal. Option C (DocumentDB) is for documents. Option D (RDS for PostgreSQL) requires complex recursive joins.

101

MCQmedium

A gaming company uses Amazon DynamoDB to store player profiles and game state. The access patterns include: (1) lookup by player ID, (2) query by game ID for recent games, and (3) leaderboard queries sorted by score. The current single-table design is causing hot partitions on the leaderboard queries. What design change should the company implement to resolve hot partitions?

A.Increase the read capacity units (RCUs) on the base table to handle the load.

B.Enable DynamoDB Accelerator (DAX) to cache frequent leaderboard queries.

C.Create a GSI with the game ID as the partition key and a composite sort key of score and timestamp.

D.Shard the table by player ID and use application-level aggregation for leaderboards.

AnswerC

GSI distributes write activity and allows efficient sorted queries per game.

Why this answer

Option C is correct because creating a Global Secondary Index (GSI) with game ID as the partition key and a composite sort key of score and timestamp allows efficient leaderboard queries without hot partitions. This design distributes write activity across multiple partitions by game ID, while the composite sort key enables sorted queries by score and timestamp within each game, avoiding the hot partition issue caused by the original single-table design.
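
One way to build such a composite sort key is sketched below. The encoding (zero-padded score, a `#` separator, then an ISO-8601 timestamp) and the width of 10 digits are assumptions chosen for illustration; the point is that string sort keys compare lexicographically, so the score must be padded for numeric order to be preserved.

```python
SCORE_WIDTH = 10  # hypothetical upper bound on the number of score digits


def leaderboard_sort_key(score, iso_timestamp):
    """Encode score and timestamp so string order matches numeric order."""
    if not 0 <= score < 10 ** SCORE_WIDTH:
        raise ValueError("score out of range for the chosen width")
    # Zero-padding keeps "0000000950" below "0000010000" when compared as text.
    return f"{score:0{SCORE_WIDTH}d}#{iso_timestamp}"
```

Querying the index with `ScanIndexForward=False` and a `Limit` would then return the highest scores for a game first.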

Exam trap

The trap here is that candidates often confuse caching solutions (like DAX) with architectural fixes for hot partitions, failing to recognize that caching does not eliminate the underlying partition-level contention caused by a skewed access pattern.

How to eliminate wrong answers

Option A is wrong because increasing RCUs on the base table does not resolve hot partitions; it only increases throughput capacity, but the underlying partition key (likely player ID) still causes all leaderboard queries to hit the same partition, leading to throttling. Option B is wrong because DynamoDB Accelerator (DAX) caches query results but does not address the root cause of hot partitions; if the leaderboard queries are write-heavy or the cache misses, the hot partition still causes performance degradation. Option D is wrong because sharding by player ID and using application-level aggregation for leaderboards introduces complexity and latency, and does not leverage DynamoDB's native indexing capabilities; it also requires custom logic to maintain sorted leaderboards, which is inefficient compared to a GSI.

102

Multi-Selectmedium

Which TWO factors should be considered when designing a database for an IoT workload that ingests millions of sensor readings per second? (Choose 2.)

Select 2 answers

A.Ensure strong consistency for all reads

B.Enforce ACID transactions for all writes

C.Implement data retention and aggregation to reduce storage costs

D.Use a time-series database for efficient storage and querying

E.Use a graph database to model relationships between sensors

AnswersC, D

Storing raw data indefinitely is expensive; aggregation reduces volume.

Why this answer

Options C and D are correct: time-series data is best stored in a time-series database like Timestream, and data retention and aggregation policies are crucial to manage storage costs. Option A is wrong because strong consistency is not typically required for IoT reads. Option B is wrong because ACID compliance is usually unnecessary for sensor data. Option E is wrong because graph databases are for relationships.
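
Aggregation before long-term storage can be as simple as downsampling raw readings into per-minute averages. The sketch below assumes readings arrive as `(device_id, epoch_seconds, value)` tuples; that shape and the one-minute bucket size are illustrative choices, not something the question specifies.

```python
from collections import defaultdict


def downsample_per_minute(readings):
    """Average raw readings into one value per device per minute."""
    buckets = defaultdict(list)
    for device_id, epoch_seconds, value in readings:
        # Round the timestamp down to the start of its minute.
        minute_start = epoch_seconds - (epoch_seconds % 60)
        buckets[(device_id, minute_start)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}
```

Keeping only the averaged rows after a retention window expires cuts stored volume roughly by the number of raw readings per bucket.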

103

Multi-Selectmedium

A company is building a content management system that stores articles, images, and user comments. Articles are text-heavy and need full-text search. Images are binary files. Comments are relational with user IDs. Which TWO AWS services should be combined to best support this workload?

Select 2 answers

A.Amazon ElastiCache for Redis for caching

B.Amazon DynamoDB for articles and comments

C.Amazon OpenSearch Service for full-text search

D.Amazon RDS for MySQL for articles and comments

E.Amazon S3 for images

AnswersC, E

Provides powerful search capabilities.

Why this answer

Amazon S3 (Option E) is ideal for storing images (binary objects). Amazon OpenSearch Service (Option C) provides full-text search capabilities for articles. Option D (RDS) is wrong because while it can store text and images, it is not optimal for search or binary storage. Option B (DynamoDB) is wrong because it does not support full-text search natively. Option A (ElastiCache) is wrong because it is a cache, not a primary store.

104

MCQmedium

A company is building a document management system where each document can have multiple attributes (tags) that need to be queried efficiently. The workload is write-heavy with occasional reads. Which database is best suited?

A.Amazon QLDB

B.Amazon DynamoDB

C.Amazon ElastiCache for Redis

D.Amazon RDS for MySQL

AnswerB

DynamoDB allows flexible attributes and global secondary indexes for efficient queries.

Why this answer

Amazon DynamoDB is the best choice for a write-heavy, document management system with tag-based queries because it is a fully managed NoSQL key-value and document database that delivers single-digit millisecond performance at any scale. Its flexible schema allows each document to have multiple attributes (tags) without predefined schemas, and its secondary indexes (LSI/GSI) enable efficient querying on those tags. DynamoDB's auto-scaling and provisioned throughput are designed to handle high write volumes, while occasional reads benefit from its consistent low-latency access.

Exam trap

AWS often tests the misconception that a ledger database (QLDB) is suitable for general-purpose document storage because of its 'immutable' and 'verifiable' features, but candidates overlook that QLDB is not designed for high write throughput or flexible attribute queries, which DynamoDB handles natively.

How to eliminate wrong answers

Option A is wrong because Amazon QLDB is a ledger database optimized for immutable, cryptographically verifiable transaction logs, not for high-throughput write-heavy document storage with flexible tag queries; it lacks native support for secondary indexes on arbitrary attributes. Option C is wrong because Amazon ElastiCache for Redis is an in-memory cache designed for sub-millisecond read-heavy workloads and transient data, not for durable, write-heavy document persistence with complex query patterns. Option D is wrong because Amazon RDS for MySQL is a relational database with a fixed schema, which would require complex join tables or EAV (Entity-Attribute-Value) patterns to handle multiple tags, leading to performance degradation under write-heavy loads and poor scalability compared to DynamoDB's distributed architecture.

105

MCQeasy

A company uses Amazon DynamoDB to store session data for a web application. The table has a partition key of 'SessionId'. The company wants to automatically expire sessions after 1 hour. Which feature should be used?

A.DynamoDB Global Tables

B.AWS Lambda function that scans the table every hour and deletes old items.

C.DynamoDB Streams

D.DynamoDB Time to Live (TTL)

AnswerD

TTL automatically deletes expired items.

Why this answer

Option D is correct because DynamoDB TTL automatically expires items after a specified timestamp. Option A is wrong because Global Tables are for multi-region replication, not expiration. Option B is wrong because a Lambda function would need to scan the table and delete items, which is inefficient. Option C is wrong because DynamoDB Streams captures changes but does not expire items.
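
For a one-hour session lifetime, the item only needs a Number attribute holding the expiry time as Unix epoch seconds. In this sketch the attribute name `expires_at` is an assumption (TTL works with whichever attribute name is configured on the table), and the item is shown in DynamoDB's typed JSON form.

```python
import time

SESSION_LIFETIME_SECONDS = 3600  # 1 hour, as required in the question


def session_item(session_id, now=None):
    """Build a session item whose TTL attribute is one hour after creation."""
    created = int(time.time()) if now is None else int(now)
    return {
        "SessionId": {"S": session_id},
        # TTL expects a Number attribute containing epoch seconds.
        "expires_at": {"N": str(created + SESSION_LIFETIME_SECONDS)},
    }
```

TTL deletion runs as a background process and can lag the expiry time, so a reader that must never see a stale session should also compare `expires_at` with the current time.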

106

MCQmedium

A developer is trying to create a FULLTEXT index on a column in an RDS MySQL instance. The error log shows the index creation failed. What is the most likely cause?

A.The column 'description' has a length that exceeds the maximum allowed for FULLTEXT index.

B.The table size is too large for a FULLTEXT index to be created.

C.The table uses a character set that is not compatible with FULLTEXT indexes.

D.The InnoDB engine does not support FULLTEXT indexes.

AnswerA

The error states the column length is 4294967295, which is too large.

Why this answer

The explanation rests on the error message in the exhibit, which reports a column length of 4294967295 (the maximum length of a LONGTEXT column) and rejects the index because that length is too large. None of the other options matches that message: it does not mention table size, character set, or storage engine.

Exam trap

The trap here is that candidates often assume InnoDB does not support FULLTEXT indexes (a common misconception from older MySQL versions) or that table size is the issue, but the actual constraint is the column length limit.

How to eliminate wrong answers

Option B is wrong because table size does not prevent FULLTEXT index creation; large tables may take longer to index but will not cause a failure. Option C is wrong because MySQL FULLTEXT indexes support character sets like utf8, utf8mb4, latin1, etc., as long as they are compatible with the full-text parser; incompatible character sets are rare and would produce a different error. Option D is wrong because InnoDB has supported FULLTEXT indexes since MySQL 5.6, and RDS MySQL instances use InnoDB by default.

107

MCQeasy

A company needs to store and query time-series data from IoT devices. The data arrives in high volume and requires efficient range queries over time. Which database is most appropriate?

A.Amazon RDS for MySQL

B.Amazon Timestream

C.Amazon DynamoDB

D.Amazon Redshift

AnswerB

Timestream is a serverless time-series database designed for IoT and operational applications.

Why this answer

Amazon Timestream is a purpose-built time-series database that automatically scales to handle high-volume IoT data and is optimized for efficient range queries over time. It separates storage into a memory store for recent data and a magnetic store for historical data, enabling fast queries across time ranges with built-in time-series functions.

Exam trap

The trap here is that candidates often choose DynamoDB (Option C) because of its scalability, but they overlook that DynamoDB lacks native time-series optimization and requires complex workarounds for efficient range queries over time, making Timestream the correct purpose-built choice.

How to eliminate wrong answers

Option A is wrong because Amazon RDS for MySQL is a relational database not optimized for time-series workloads; it lacks automatic time-based partitioning and efficient range query performance at scale, and would require manual sharding and indexing. Option C is wrong because Amazon DynamoDB is a key-value and document database that does not natively support time-series range queries efficiently; it requires complex design patterns like composite sort keys and TTL for time-series data, and lacks built-in time-series functions. Option D is wrong because Amazon Redshift is a columnar data warehouse designed for OLAP and complex analytics on structured data, not for high-frequency time-series ingestion and real-time range queries; it incurs higher latency and cost for IoT workloads.

108

MCQmedium

A company is migrating a 5 TB Microsoft SQL Server database to Amazon RDS for SQL Server. The database has many stored procedures and triggers. The migration must have minimal downtime. Which approach should be used?

A.Use AWS SCT to convert the database schema and then use DMS for data load.

B.Use the SQL Server Import/Export wizard to copy data.

C.Use AWS DMS with full load and ongoing replication (CDC).

D.Take a native backup, copy to Amazon S3, and restore to RDS during a maintenance window.

AnswerC

CDC captures changes during migration, minimizing downtime.

Why this answer

Option C is correct because AWS DMS with full load and ongoing change data capture (CDC) continuously replicates changes from the source SQL Server to the target Amazon RDS for SQL Server, reducing downtime to a brief cutover window. DMS moves table data; stored procedures and triggers are not migrated by DMS and must be scripted and applied to the target separately, which is straightforward here because source and target use the same engine.

Exam trap

The trap here is that candidates often assume native backup and restore (Option D) is the simplest method for minimal downtime, but they overlook that it requires a maintenance window and does not support ongoing replication, whereas DMS with CDC is specifically designed for near-zero downtime migrations.

How to eliminate wrong answers

Option A is wrong because AWS SCT is meant for converting schemas between different engines, which a SQL Server to SQL Server migration does not need, and a DMS data load without CDC leaves the target stale unless writes are stopped for the whole load. Option B is wrong because the SQL Server Import/Export wizard is a one-time, bulk copy tool that does not support ongoing replication or minimal downtime, and it cannot handle large databases like 5 TB efficiently without extended outages. Option D is wrong because changes made after the native backup is taken are not carried to the target, so the application must stop writing while 5 TB is backed up, copied to S3, and restored; there is no ongoing replication to shorten the cutover.

109

MCQeasy

A startup needs a fully managed, serverless database for a new web application with unpredictable traffic. The application requires ACID transactions and SQL queries. Which AWS database service should they use?

A.Amazon Neptune

B.Amazon DynamoDB

C.Amazon Aurora Serverless v2

D.Amazon Redshift

AnswerC

Serverless, auto-scaling, MySQL/PostgreSQL compatible, ACID.

Why this answer

Amazon Aurora Serverless v2 is the correct choice because it provides a fully managed, serverless relational database that automatically scales capacity based on application demand, supports ACID transactions, and uses standard SQL queries. It is ideal for unpredictable traffic patterns as it can scale from zero to hundreds of thousands of transactions per minute without manual intervention.

Exam trap

The trap here is that candidates often pick DynamoDB because it offers transactions, overlooking that it is a NoSQL database without relational SQL (PartiQL covers only a limited subset), or they mistakenly think Neptune or Redshift can handle OLTP SQL workloads, when in fact they are specialized for graph and analytics respectively.

How to eliminate wrong answers

Option A is wrong because Amazon Neptune is a graph database designed for highly connected data (e.g., social networks, recommendation engines) and is queried with graph languages such as Gremlin and SPARQL rather than relational SQL. Option B is wrong because Amazon DynamoDB is a NoSQL key-value and document database; it does offer ACID transactions across multiple items, but it has no joins or relational schema, and its PartiQL support is a limited SQL-like subset. Option D is wrong because Amazon Redshift is a petabyte-scale data warehouse optimized for analytical queries (OLAP) on large datasets, not for transactional (OLTP) workloads requiring low-latency SQL queries.

110

MCQhard

A company runs a critical PostgreSQL database on Amazon RDS Multi-AZ. They need to perform a major version upgrade (e.g., from 12 to 13) with minimal downtime. Which approach should they take?

A.Take a snapshot, restore as a new instance with the upgraded engine version, and redirect traffic.

B.Initiate a major version upgrade directly on the Multi-AZ instance; the upgrade will be applied during the next maintenance window with minimal downtime.

C.Modify the DB instance to disable Multi-AZ, perform the upgrade, then re-enable Multi-AZ.

D.Create a read replica of the DB instance, perform the major version upgrade on the replica, then promote the replica to a new primary and update the connection string.

AnswerD

This approach reduces downtime because the upgrade is done on the replica while the original primary remains active.

Why this answer

RDS supports major version upgrades, but they require a brief downtime while the upgrade is applied. The recommended approach to minimize downtime is to create a read replica, upgrade the replica, promote it to a standalone instance, and then switch over. This way, the upgrade happens on the replica while the primary continues to serve traffic.

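Option D describes this correctly. Option A is wrong because an instance restored from a snapshot is missing every write made after the snapshot was taken, so the application must stop writing until cutover. Option B is the standard in-place upgrade, but it takes the instance offline while the upgrade runs, even with Multi-AZ. Option C is wrong because converting to Single-AZ, upgrading, and converting back to Multi-AZ incurs the same upgrade downtime and removes failover protection in the meantime.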

111

MCQhard

A financial services company needs to store trade data with strong consistency, high durability, and the ability to run complex SQL queries on the data. The data volume is 10 TB and grows by 1 GB per day. Queries must return results in less than 5 seconds. Which database solution best meets these requirements?

A.Amazon DynamoDB

B.Amazon DocumentDB

C.Amazon Aurora

D.Amazon Redshift

AnswerC

Aurora provides strong consistency, durability, and full SQL support.

Why this answer

Amazon Aurora is a MySQL/PostgreSQL-compatible relational database that provides strong consistency, high durability (6 copies across 3 AZs), and supports complex SQL queries. Option A (DynamoDB) is wrong because it does not natively support complex SQL queries (PartiQL exists, but it is far less powerful). Option B (DocumentDB) is wrong because it is MongoDB-compatible and is queried through the MongoDB API rather than SQL. Option D (Redshift) is wrong because it is optimized for data warehousing and may have higher latency for single-row operations.

112

MCQhard

A company is deploying a globally distributed application with users in the US, Europe, and Asia. The application requires sub-10ms read latency for user profiles stored in Amazon DynamoDB. Writes are less frequent. Which configuration meets the latency requirement while minimizing write conflicts?

A.Deploy Amazon RDS for MySQL with Multi-AZ and cross-Region read replicas.

B.Use Amazon ElastiCache for Redis Global Datastore with DynamoDB as backing store.

C.Deploy a single DynamoDB table in us-east-1 with DAX caches in each region.

D.Use DynamoDB global tables to replicate data to Regions close to users.

AnswerD

Global tables provide multi-region writes and reads with low latency.

Why this answer

DynamoDB global tables provide multi-region, multi-active replication with eventual consistency, enabling sub-10ms reads from local replicas while writes are replicated asynchronously. This minimizes write conflicts because DynamoDB uses last-writer-wins (LWW) conflict resolution, which is acceptable for user profiles where writes are infrequent and conflicts are rare.
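
The effect of last-writer-wins can be illustrated with a toy model. This is not how DynamoDB implements reconciliation internally; it is a simplified sketch in which each replica's version carries a hypothetical `written_at` timestamp and the latest one is kept.

```python
def resolve_last_writer_wins(versions):
    """Pick the version with the latest write time (toy model of LWW)."""
    # Each version is a dict with hypothetical keys:
    # "region", "written_at" (epoch seconds) and "item".
    return max(versions, key=lambda v: v["written_at"])
```

The earlier concurrent write is silently discarded, which is acceptable for rarely updated user profiles but would be a poor fit for counters or balances.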

Exam trap

The trap here is that candidates may confuse DynamoDB global tables with DAX caching, assuming that a local cache alone can solve global latency without addressing write replication and conflict resolution.

How to eliminate wrong answers

Option A is wrong because Amazon RDS for MySQL with Multi-AZ and cross-Region read replicas cannot achieve sub-10ms read latency globally due to cross-Region replication lag and does not natively handle write conflicts across regions. Option B is wrong because ElastiCache for Redis Global Datastore provides low-latency reads but requires DynamoDB as a backing store, adding operational complexity and potential write conflicts from dual-write patterns. Option C is wrong because a single DynamoDB table in us-east-1 with DAX caches in each region still requires cross-Region reads from the primary table, which cannot guarantee sub-10ms latency due to network distance, and DAX does not replicate writes, so write conflicts are not addressed.

113

MCQeasy

A startup needs a cost-effective database for a small application that handles both transactional and analytical workloads. They expect low traffic initially but want the database to automatically scale as the business grows. Which database solution is BEST suited?

A.Amazon Aurora Serverless v2

B.Amazon DynamoDB with on-demand capacity

C.Amazon RDS for MySQL with a Single-AZ deployment

D.Amazon Redshift Serverless

AnswerA

Automatically scales capacity and is cost-effective for variable workloads.

Why this answer

Amazon Aurora Serverless v2 is the best fit because it automatically scales compute and memory capacity in fine-grained increments (as small as 0.5 ACU) based on actual workload demand, supporting both transactional (OLTP) and analytical (OLAP) queries via the MySQL/PostgreSQL-compatible Aurora engine. It offers a pay-per-ACU model that is cost-effective for low-traffic startups while providing near-instant scaling to handle growth without manual intervention.

Exam trap

The trap here is that candidates often confuse 'serverless' with 'NoSQL' (DynamoDB) or assume that any 'serverless' database (Redshift Serverless) can handle mixed workloads, but the key differentiator is the need for relational SQL support for both transactional and analytical queries, which only Aurora Serverless v2 provides among the options.

How to eliminate wrong answers

Option B is wrong because Amazon DynamoDB with on-demand capacity is a NoSQL key-value/document database optimized for simple key-value lookups and high-throughput transactional workloads, but it lacks native support for complex analytical queries (e.g., joins, aggregations) that the application requires. Option C is wrong because Amazon RDS for MySQL with a Single-AZ deployment does not automatically scale compute capacity (only storage can autoscale); changing compute requires resizing the instance, and it cannot handle mixed transactional-analytical workloads efficiently without additional read replicas or separate analytics engines. Option D is wrong because Amazon Redshift Serverless is a petabyte-scale data warehouse designed for heavy analytical workloads and large-scale data warehousing, not for transactional (OLTP) workloads; it is oversized and cost-inefficient for a small application with mixed workloads.

114

MCQhard

A company is designing a database for an IoT application that ingests millions of sensor readings per second. Each reading includes device ID, timestamp, and measurement. The workload requires time-series analytics and data retention for 90 days. Which AWS database solution is MOST appropriate?

A.Amazon Redshift with auto-copy from S3

B.Amazon ElastiCache for Redis with time-series module

C.Amazon Timestream

D.Amazon DynamoDB with TTL

AnswerC

Timestream is purpose-built for time-series data, handles high ingestion, and includes built-in analytics.

Why this answer

Amazon Timestream is a purpose-built time-series database designed for IoT and operational applications that ingest millions of sensor readings per second. It automatically manages data retention policies (e.g., 90 days) by storing recent data in memory and historical data in a cost-optimized store, and it supports time-series analytics with built-in functions like interpolation and smoothing.
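
As a rough sketch of how the 90-day retention requirement might be expressed, the following builds the parameters for the Timestream `CreateTable` call. The database and table names are hypothetical, and the 24-hour memory-store window is an assumed value that the question does not specify.

```python
def timestream_table_params(database, table, memory_hours=24, magnetic_days=90):
    """Build keyword arguments for the timestream-write CreateTable API."""
    if memory_hours < 1 or magnetic_days < 1:
        raise ValueError("retention periods must be positive")
    return {
        "DatabaseName": database,   # hypothetical name
        "TableName": table,         # hypothetical name
        "RetentionProperties": {
            # Recent data stays in the memory store for fast queries (assumed 24 h).
            "MemoryStoreRetentionPeriodInHours": memory_hours,
            # Older data is kept in the magnetic store for 90 days, then dropped.
            "MagneticStoreRetentionPeriodInDays": magnetic_days,
        },
    }


params = timestream_table_params("iot_db", "sensor_readings")
# With boto3 installed and credentials configured, these could be passed as:
#   boto3.client("timestream-write").create_table(**params)
```

Keeping the parameters in a plain dictionary makes the retention settings easy to review before any call is made.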

Exam trap

The trap here is that candidates often choose DynamoDB with TTL because they associate it with time-series data and automatic expiration, but they overlook the lack of native time-series analytics and the performance challenges of range queries across high-cardinality device IDs.

How to eliminate wrong answers

Option A is wrong because Amazon Redshift is a data warehouse optimized for complex analytical queries on structured data, not for ingesting millions of high-velocity sensor writes per second; the auto-copy from S3 adds latency and is not designed for real-time streaming ingestion. Option B is wrong because Amazon ElastiCache for Redis with the time-series module is an in-memory cache that cannot efficiently retain 90 days of data at scale due to memory cost and lack of tiered storage, and it is not designed for long-term durable storage. Option D is wrong because Amazon DynamoDB with TTL is a key-value and document database that lacks native time-series analytics functions (e.g., downsampling, interpolation) and cannot efficiently query over time ranges across millions of devices without complex secondary index design and scan-heavy patterns.

115

MCQhard

A gaming company uses Amazon DynamoDB to store player profiles with partition key player_id. The access pattern is to retrieve profiles for multiple players in a single request. The application currently makes separate GetItem calls, causing high latency. Which design pattern reduces latency and cost?

A.Enable DynamoDB Accelerator (DAX)

B.Redesign to a single-table design

C.Create a global secondary index on player_id

D.Use BatchGetItem to retrieve multiple items in one request

AnswerD

BatchGetItem reduces I/O and latency.

Why this answer

BatchGetItem retrieves up to 100 items (16 MB in total) from one or more tables in a single API call, replacing many sequential network round trips with one. This directly addresses the high latency caused by separate GetItem calls. Read capacity is still charged per item, with each item's size rounded up individually, so the cost benefit comes mainly from fewer requests and less client-side overhead rather than from fewer read capacity units (RCUs).
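
The batching idea can be sketched as follows, assuming a hypothetical table named `PlayerProfiles` and string player IDs. The code only builds the request payloads; it does not call AWS.

```python
def chunked(keys, size=100):
    """Yield successive slices of at most `size` keys (the per-call item limit)."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]


def batch_get_payloads(table_name, player_ids):
    """Build one BatchGetItem RequestItems payload per chunk of player IDs."""
    keys = [{"player_id": {"S": pid}} for pid in player_ids]
    return [{"RequestItems": {table_name: {"Keys": chunk}}} for chunk in chunked(keys)]


# "PlayerProfiles" and the ID format are illustrative assumptions.
payloads = batch_get_payloads("PlayerProfiles", [f"player-{i}" for i in range(250)])
# 250 separate GetItem round trips become 3 BatchGetItem calls (100 + 100 + 50 keys).
# A real caller would also retry any keys returned under "UnprocessedKeys".
```

Each payload could be passed to the low-level client's `batch_get_item` call; handling of unprocessed keys is left out of this sketch.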

Exam trap

AWS often tests the misconception that caching (DAX) or indexing (GSI) can solve multi-item retrieval latency, when the actual solution is to reduce the number of API calls using BatchGetItem, which directly targets the root cause of high latency from sequential requests.

How to eliminate wrong answers

Option A is wrong because DynamoDB Accelerator (DAX) is an in-memory cache that speeds up individual GetItem queries but does not reduce the number of API calls; it still requires separate requests for each player_id, so it does not solve the latency issue of multiple sequential calls. Option B is wrong because the company already uses a single-table design with player_id as the partition key, and redesigning to another single-table design does not change the access pattern of needing multiple items; the problem is the number of API calls, not the table schema. Option C is wrong because a global secondary index on player_id is redundant—player_id is already the partition key of the base table, and creating an index on the same attribute does not enable batch retrieval or reduce latency; it would only add storage and write costs without addressing the multiple-request issue.

116

MCQmedium

A company has a high-traffic e-commerce application that uses Amazon RDS for MySQL. During flash sales, the database experiences high read load causing slow page loads. The application is read-heavy with occasional writes. Which design change would provide the most immediate performance improvement?

A.Add an Amazon ElastiCache layer

B.Create read replicas of the RDS instance

C.Enable Multi-AZ deployment

D.Upgrade to a larger instance type

AnswerB

Read replicas offload read traffic, improving performance.

Why this answer

Adding read replicas offloads read traffic from the primary instance, improving performance for read-heavy workloads. Option C (Multi-AZ) is wrong because it provides high availability, not read scaling. Option A (ElastiCache) is wrong because a cache only helps for data the application explicitly stores in it, which requires application changes, whereas read replicas reduce load on the primary right away.

Option D (instance upgrade) is wrong because it can help but is more expensive and less scalable than adding replicas.

117

Multi-Selecteasy

A company is selecting a database for a time-series application that collects sensor data from thousands of devices. The data is written at a high velocity (millions of data points per second). The application needs to query recent data (last hour) with sub-second latency and perform long-term analysis on months of data. Which TWO AWS database services best meet these requirements?

Select 2 answers

A.Amazon ElastiCache for Redis with Time Series module.

B.Amazon Timestream for both real-time and historical queries.

C.Amazon DynamoDB with TTL and export to S3 for historical analysis.

D.Amazon Redshift for real-time queries and historical analysis.

E.Amazon Quantum Ledger Database (QLDB) for immutable time-series records.

AnswersB, C

Timestream is optimized for time-series data and provides fast recent data queries and historical analysis.

Why this answer

Option B (Timestream) is built for time-series data: its memory store serves fast queries on recent data, and its magnetic store supports long-term analysis. Option C (DynamoDB with TTL) can handle high write throughput; TTL expires old items, and exporting the table to S3 keeps the data available for historical analysis. Option D is wrong because Redshift is not designed for sub-second real-time queries.

Option A is wrong because ElastiCache is primarily a cache. Option E is wrong because QLDB is for ledger data.

118

MCQhard

A financial services company needs a database to store transaction records with strong consistency and the ability to run complex analytical queries. The data volume is in the terabytes and is expected to grow. The company also needs point-in-time recovery. Which AWS database solution meets these requirements?

A.Amazon Redshift with automated snapshots

B.Amazon RDS for MySQL with read replicas

C.Amazon ElastiCache for Redis with AOF persistence

D.Amazon DynamoDB with on-demand backup

AnswerA

Redshift is built for analytics and supports point-in-time recovery via snapshots.

Why this answer

Amazon Redshift with automated snapshots provides point-in-time recovery and is optimized for analytical queries on large datasets. Option D (DynamoDB) does not support complex SQL analytics. Option B (RDS for MySQL) is built for OLTP and may not handle analytical queries over terabytes of data efficiently.

Option C (ElastiCache) is an in-memory cache, not a durable store for transaction records.

119

MCQeasy

A company's application is logging the error shown in the exhibit. The application is deployed on Amazon EC2 and connects to an Amazon RDS for MySQL Multi-AZ DB instance. Which configuration change is most likely to resolve this issue?

A.Add an additional standby instance in a third Availability Zone.

B.Increase the connection pool timeout in the application configuration.

C.Create a read replica and direct write traffic to it.

D.Increase the DB instance class to handle more concurrent connections.

AnswerD

A larger instance can handle more connections and reduce timeouts.

Why this answer

The error log indicates that the application is hitting the maximum number of connections allowed by the RDS DB instance. Increasing the DB instance class (Option D) provides more memory and CPU resources, which allows the instance to support a higher `max_connections` value (calculated as `DBInstanceClassMemory / 12582880` for MySQL). This directly resolves the connection limit issue without changing the application's connection pool behavior or architecture.
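
To make the formula concrete, here is a small sketch that evaluates it for a few memory sizes. It assumes the nominal instance memory as a stand-in for `DBInstanceClassMemory`; the real variable is somewhat lower because part of the memory is reserved for the operating system and RDS processes, so actual defaults will be smaller.

```python
BYTES_PER_CONNECTION = 12582880  # divisor from the formula quoted above

GIB = 1024 ** 3


def default_max_connections(db_instance_class_memory_bytes):
    """Evaluate DBInstanceClassMemory / 12582880, rounded down."""
    return db_instance_class_memory_bytes // BYTES_PER_CONNECTION


# Nominal memory sizes are an assumption used only for illustration.
for label, memory in [("8 GiB", 8 * GIB), ("16 GiB", 16 * GIB), ("32 GiB", 32 * GIB)]:
    print(label, default_max_connections(memory))
```

The result grows linearly with memory, which is why moving to a larger instance class raises the connection ceiling.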

Exam trap

The trap here is that candidates often confuse connection pool timeout adjustments (Option B) with connection limit increases, but timeout only affects how long a request waits, not the hard limit imposed by the database engine's `max_connections` parameter.

How to eliminate wrong answers

Option A is wrong because adding a third standby instance in a Multi-AZ deployment does not increase the connection limit; it only improves availability and failover capability. Option B is wrong because increasing the connection pool timeout does not reduce the number of concurrent connections; it only changes how long the application waits for a connection, which could actually worsen the backlog. Option C is wrong because a read replica cannot accept write traffic; directing writes to it would cause application errors, and it does not increase the write capacity or connection limit of the primary instance.

120

Multi-Selectmedium

A company is designing a database for an IoT application that ingests sensor data from thousands of devices. Each device sends a reading every minute. The data includes device_id, timestamp, temperature, humidity, and pressure. The application needs to store this data and support queries that retrieve all readings for a specific device within a time range. The company expects high write throughput and moderate read frequency. The data must be stored with high durability. Which TWO database designs are appropriate for this workload? (Choose TWO.)

Select 2 answers

A.Use Amazon DynamoDB with device_id as partition key and store all readings for a device as a list attribute in a single item, updating the list every minute.

B.Use Amazon S3 to store compressed JSON files per device per hour, and query using Amazon Athena.

C.Use Amazon DynamoDB with device_id as partition key and timestamp as sort key.

D.Use Amazon RDS for MySQL with a single table and index on device_id and timestamp.

E.Use Amazon Timestream, a time series database, with device_id as dimension and timestamp as time column.

AnswersC, E

DynamoDB can handle high write throughput and efficient queries by device and time range.

Why this answer

Option C is correct because DynamoDB's partition key (device_id) and sort key (timestamp) design allows efficient retrieval of all readings for a specific device within a time range using a Query operation with a KeyConditionExpression on the sort key. This schema supports high write throughput by distributing writes across partitions based on device_id, and DynamoDB's multi-AZ replication provides high durability.
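
The access pattern described above might look like the following sketch, which builds low-level Query parameters without calling AWS. The table name and the use of epoch seconds for the timestamp attribute are assumptions made for illustration.

```python
def device_range_query(table_name, device_id, start_ts, end_ts):
    """Build Query parameters for one device's readings in [start_ts, end_ts]."""
    return {
        "TableName": table_name,
        # "timestamp" is a DynamoDB reserved word, so it is aliased as #ts.
        "KeyConditionExpression": "device_id = :d AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":d": {"S": device_id},
            ":start": {"N": str(start_ts)},  # assumed: epoch seconds
            ":end": {"N": str(end_ts)},
        },
    }


# "SensorReadings" and "device-42" are hypothetical names.
query = device_range_query("SensorReadings", "device-42", 1700000000, 1700003600)
```

Because both conditions are on key attributes, DynamoDB reads only the matching slice of one partition instead of scanning the table.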

Exam trap

The trap here is that candidates often overlook DynamoDB's item size limit and write hotspot issues in Option A, or assume that any SQL database can handle high write throughput without considering single-writer bottlenecks in Option D.

121

Multi-Selectmedium

Which TWO database services are most suitable for workloads that require ACID transactions?

Select 2 answers

A.Amazon Neptune

B.Amazon Timestream

C.Amazon Aurora

D.Amazon RDS for MySQL

E.Amazon DynamoDB

AnswersC, D

Aurora is a relational database with full ACID support.

Why this answer

Amazon Aurora is correct because it is a MySQL- and PostgreSQL-compatible relational database engine that provides full ACID (Atomicity, Consistency, Isolation, Durability) transaction support, including multi-statement transactions with commit and rollback. Aurora uses a distributed, fault-tolerant storage subsystem that replicates data across three Availability Zones, ensuring durability and consistency for transactional workloads.

Exam trap

The trap here is that candidates often assume DynamoDB supports full ACID transactions because of its 'DynamoDB Transactions' feature, but those transactions are limited to a maximum of 100 items or 4 MB per transaction and do not provide the same isolation guarantees as a relational database, making it unsuitable for workloads requiring strict ACID compliance across many rows or tables.

122

MCQmedium

A company runs a MongoDB-compatible workload on Amazon DocumentDB. They notice that many read requests are returning stale data even though reads are directed to the primary instance. What is the MOST likely cause?

A.The application's session is pinned to a secondary replica despite requesting the primary.

B.The application is using a read preference that allows secondary reads.

C.The primary instance is experiencing high CPU utilization, causing delayed writes.

D.The storage volume is using the default eventually consistent configuration for primary reads.

AnswerB

If the read preference is set to 'secondaryPreferred' or similar, reads may go to secondary replicas which are eventually consistent.

Why this answer

DocumentDB uses a distributed storage volume with six copies across three Availability Zones. Reads from the primary are strongly consistent by default. However, if the application uses read preferences that allow secondary reads (e.g., 'secondaryPreferred'), then reads may go to replicas which are eventually consistent.

The question states that reads are directed to the primary, but the driver's read preference decides where reads actually go. With 'secondaryPreferred', reads are routed to replicas whenever one is available, and any read preference that does not require the primary can return stale data. Option B addresses this most common cause: a read preference set to allow secondary reads.
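
A minimal sketch of how the read preference appears in a MongoDB-style connection URI follows. The host name is hypothetical, and the check is a simplification: it inspects only the URI, not any per-query overrides set in application code.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ALLOWED = {"primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"}


def docdb_uri(host, read_preference="primary", port=27017):
    """Build a MongoDB-style connection URI with an explicit read preference."""
    if read_preference not in ALLOWED:
        raise ValueError(f"unknown read preference: {read_preference}")
    query = urlencode({"replicaSet": "rs0", "readPreference": read_preference})
    return f"mongodb://{host}:{port}/?{query}"


def may_return_stale_reads(uri):
    """True when the URI's read preference permits reads from replicas."""
    pref = parse_qs(urlsplit(uri).query).get("readPreference", ["primary"])[0]
    return pref != "primary"


# "docdb.example.internal" is a placeholder host, not a real endpoint.
strict = docdb_uri("docdb.example.internal")
relaxed = docdb_uri("docdb.example.internal", "secondaryPreferred")
```

A check like this could be run against application configuration to spot connection strings that allow replica reads.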

123

MCQhard

Based on the CLI output, what is true about this RDS instance?

A.The instance runs Amazon Aurora PostgreSQL

B.The instance is a Multi-AZ deployment

C.The instance is a Read Replica of another RDS instance

D.The instance uses Provisioned IOPS (io1) storage

AnswerC

ReadReplicaSourceDBInstanceIdentifier is set.

Why this answer

The CLI output shows `ReplicaLag` with a value of `0`, which is a field that only appears when the RDS instance is configured as a Read Replica. A Read Replica maintains asynchronous replication from a source DB instance, and the lag metric indicates how far behind the replica is. Since the output includes this field, the instance must be a Read Replica.

Exam trap

The trap here is that candidates see `ReplicaLag: 0` and assume it means no replication is happening or that it indicates a Multi-AZ setup, but in reality, a lag of 0 simply means the replica is fully caught up, and the presence of the field itself confirms it is a Read Replica, not a Multi-AZ standby.

How to eliminate wrong answers

Option A is wrong because the output does not show any Aurora-specific fields (e.g., `DBClusterIdentifier`, `AuroraReplicaLag`) and the engine would be listed as `aurora` or `aurora-postgresql`, not a standard RDS engine. Option B is wrong because a Multi-AZ deployment does not expose a `ReplicaLag` field; Multi-AZ uses synchronous replication and the replica is not directly accessible for reads. Option D is wrong because the output does not include `StorageType` set to `io1` or `ProvisionedIOPS`; without those fields, we cannot conclude the instance uses Provisioned IOPS storage.

124

Multi-Selectmedium

A company is designing a database for an IoT application that ingests millions of sensor readings per second. The data is time-series and is queried to generate reports on average temperature over the last hour. Which TWO database solutions are most suitable for this workload?

Select 2 answers

A.Amazon DynamoDB with time-series data model and TTL

B.Amazon Neptune

C.Amazon RDS for PostgreSQL

D.Amazon Timestream

E.Amazon ElastiCache for Redis

AnswersA, D

Can handle time-series with proper design.

Why this answer

Option D (Timestream) is a purpose-built time-series database for IoT data. Option A (DynamoDB with TTL and an optimized access pattern) can also handle time-series data if designed properly with a partition key and sort key. Option B is wrong because Neptune is for graph data.

Option E is wrong because ElastiCache is a cache, not a primary store for long-term data. Option C is wrong because RDS is not optimized for high-ingestion time-series.

125

MCQhard

An e-commerce application stores order data in Amazon RDS for MySQL. The database has grown to 1.5 TB and the company needs to retain data for 7 years for compliance. Current queries are becoming slow due to the large table size. The compliance requirement mandates that data older than 1 year must be retained but is rarely accessed. What strategy would reduce the active table size while maintaining compliance?

A.Create a read replica and run reports against it.

B.Partition the table by date and archive partitions older than 1 year to Amazon S3 using AWS DMS.

C.Delete data older than 1 year and use automated backups for compliance.

D.Vertically partition the table to separate frequently and infrequently accessed columns.

AnswerB

Removes old data from active table, retains in S3 for compliance.

Why this answer

Option B is correct because partitioning the table by date and archiving old partitions to S3 meets compliance and reduces active table size. Option D is wrong because vertical partitioning (splitting columns) doesn't address the row count issue. Option A is wrong because read replicas do not reduce the size of the active table.

Option C is wrong because deleting data violates compliance; automated backups are retained for at most 35 days, far short of 7 years.
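
One way to decide which partitions are eligible for archiving is sketched below. The monthly partition naming scheme (`p2023_01` and so on) is a hypothetical convention, and the code only selects partitions; exporting them to S3 and dropping them are separate steps.

```python
from datetime import date


def partitions_to_archive(partitions, today, keep_months=12):
    """Return names of monthly partitions that lie entirely outside the retention window.

    `partitions` maps a partition name to the (year, month) it covers.
    """
    # Month index of the oldest month that may still hold data newer than the cutoff.
    cutoff = today.year * 12 + (today.month - 1) - keep_months
    return sorted(
        name for name, (year, month) in partitions.items()
        if year * 12 + (month - 1) < cutoff
    )


# Hypothetical partition layout for the orders table.
parts = {
    "p2023_01": (2023, 1),
    "p2023_06": (2023, 6),
    "p2024_01": (2024, 1),
    "p2024_05": (2024, 5),
}
old = partitions_to_archive(parts, today=date(2024, 6, 15))
```

The June 2023 partition is kept in this example because part of it is still less than one year old on the evaluation date.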

126

MCQeasy

A company is designing a document management system using Amazon S3 and needs to store metadata such as document ID, owner, creation date, and tags. The metadata must be searchable with low latency, supporting queries like 'Find all documents owned by user X with tag Y created after date Z'. Which AWS database service is most suitable for storing and querying this metadata?

A.Amazon DynamoDB with a GSI on (owner, creation_date) and a filter on tags.

B.Amazon Redshift Spectrum querying metadata stored in S3 as CSV.

C.Amazon RDS for PostgreSQL with a normalized schema.

D.Amazon ElastiCache for Redis with sorted sets for tags.

AnswerA

DynamoDB provides fast queries and flexible indexing.

Why this answer

Amazon DynamoDB is the most suitable choice because it provides single-digit millisecond latency for queries at any scale, which meets the low-latency search requirement. By creating a Global Secondary Index (GSI) on (owner, creation_date), you can efficiently query documents by owner and date range, and then apply a filter expression on tags to narrow results. This schema avoids the overhead of joins and normalization, making it ideal for high-throughput metadata lookups.

Exam trap

The trap here is that candidates often choose a relational database like PostgreSQL because they think normalized schemas are required for complex queries, but DynamoDB's GSI and filter expressions can handle this access pattern more efficiently at scale without the overhead of joins.

How to eliminate wrong answers

Option B is wrong because Amazon Redshift Spectrum is designed for analytical queries on large datasets in S3, not for low-latency, point-query or filtered lookups on metadata; it incurs significant overhead for each query and does not support sub-second response times. Option C is wrong because Amazon RDS for PostgreSQL with a normalized schema would require complex joins and indexing to support the multi-condition query, and relational databases typically have higher latency and scaling limitations compared to DynamoDB for this access pattern. Option D is wrong because Amazon ElastiCache for Redis with sorted sets is an in-memory cache, not a durable database; it lacks native support for multi-attribute queries like filtering by owner, date, and tags simultaneously, and sorted sets are optimized for leaderboard-style range queries, not arbitrary metadata searches.

127

MCQeasy

A company is migrating a MySQL database from on-premises to Amazon RDS for MySQL. The current database has several stored procedures and triggers that use user-defined functions (UDFs) compiled as shared libraries. What is the best practice for handling these UDFs in RDS?

A.Use Amazon RDS Custom for MySQL to upload the UDF libraries.

B.Use AWS Lambda to replace the UDFs.

C.Migrate to Amazon Aurora MySQL, which supports custom UDFs.

D.Refactor the stored procedures to avoid using the custom UDFs.

AnswerD

RDS does not support custom compiled UDFs; the application must be refactored.

Why this answer

Amazon RDS for MySQL does not allow access to the underlying file system, so you cannot upload custom UDF shared libraries (.so files). The best practice is to refactor the stored procedures and triggers to remove dependencies on these UDFs, replacing their logic with native MySQL functions or application-level code. This ensures compatibility with the managed RDS environment without requiring custom binaries.

Exam trap

The trap here is that candidates assume RDS Custom or Aurora MySQL will support custom UDFs, but neither service allows loading arbitrary shared libraries, making refactoring the only viable option.

How to eliminate wrong answers

Option A is wrong because Amazon RDS Custom for MySQL still restricts custom UDFs; RDS Custom provides OS-level access for patching and configuration but does not support loading arbitrary shared libraries for UDFs. Option B is wrong because AWS Lambda is an event-driven compute service that cannot directly replace UDFs used inside stored procedures or triggers; it would require significant architectural changes and introduce latency. Option C is wrong because Amazon Aurora MySQL does not support custom UDFs compiled as shared libraries; it only supports a limited set of built-in functions and Lambda-based functions via the native function interface.

128

MCQeasy

A company runs a reporting application on Amazon Redshift. The application queries a large fact table that is distributed by a key. The report queries filter on a date column. The report performance is slow. The database has 10 nodes. The company wants to improve query performance by optimizing the table design. Which design change should be made?

A.Set the sort key to the date column.

B.Increase the number of nodes in the cluster.

C.Change the distribution style to ALL to avoid data redistribution.

D.Change the distribution style to KEY on the date column.

AnswerA

Sort keys enable efficient range filtering, improving query performance for date-based filters.

Why this answer

Setting the sort key to the date column improves query performance by enabling range-restricted scans. When queries filter on a date column, Redshift uses zone maps to skip blocks that do not contain relevant data, drastically reducing the number of rows scanned. This is the most direct and cost-effective optimization for filter-heavy workloads on large fact tables.
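
The effect of zone maps can be illustrated with a toy simulation. This is a simplified model, not Redshift's actual storage engine: the block size, the integer "dates", and the shuffling are all assumptions chosen to show the principle.

```python
import random


def build_zone_map(values, block_size):
    """Split values into fixed-size blocks and record each block's (min, max)."""
    blocks = [values[i:i + block_size] for i in range(0, len(values), block_size)]
    return [(min(block), max(block)) for block in blocks]


def blocks_scanned(zone_map, lo, hi):
    """Count blocks whose [min, max] range overlaps the filter range [lo, hi]."""
    return sum(1 for bmin, bmax in zone_map if bmax >= lo and bmin <= hi)


days = list(range(1000)) * 10        # 10 rows for each of 1000 "dates"
random.Random(0).shuffle(days)       # unsorted: dates scattered across blocks
unsorted_map = build_zone_map(days, block_size=100)
sorted_map = build_zone_map(sorted(days), block_size=100)

# Filter on the most recent 10 "dates".
print(blocks_scanned(unsorted_map, 990, 999), "of", len(unsorted_map), "blocks (unsorted)")
print(blocks_scanned(sorted_map, 990, 999), "of", len(sorted_map), "blocks (sorted by date)")
```

When rows are sorted by date, the matching values sit in one contiguous block, so every other block can be skipped from its min/max metadata alone.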

Exam trap

The trap here is that candidates often confuse the purpose of distribution keys (for join co-location) with sort keys (for filter pruning), leading them to choose distribution changes (options C or D) instead of the correct sort key optimization.

How to eliminate wrong answers

Option B is wrong because increasing the number of nodes adds compute and storage capacity but does not address the root cause of slow scans; it is a scale-out solution that incurs additional cost without optimizing data access patterns. Option C is wrong because changing the distribution style to ALL replicates the entire table to every node, which eliminates data redistribution for joins but does not improve the efficiency of range-restricted scans on the date column; it also wastes storage and can degrade load performance. Option D is wrong because changing the distribution style to KEY on the date column would distribute rows based on date values, which can cause data skew if rows are unevenly spread across dates (e.g., recent dates dominating), and it does not enable the zone-map (min/max) block pruning that a sort key provides.

129

Multi-Selecthard

A company is using Amazon DynamoDB for a gaming leaderboard that updates frequently. They need to maintain a sorted list of top 100 players by score. Which THREE design patterns can achieve this efficiently?

Select 3 answers

A.Use a Global Secondary Index (GSI) with score as the sort key and query with ScanIndexForward=false and Limit=100.

B.Use DynamoDB Accelerator (DAX) to cache query results.

C.Use DynamoDB Streams and AWS Lambda to maintain a separate leaderboard table with the top 100 scores.

D.Scan the entire table and sort the results in memory.

E.Use Amazon ElastiCache for Redis with sorted sets to maintain the leaderboard.

AnswersA, C, E

This retrieves the top 100 scores efficiently.

Why this answer

Option A is correct because a Global Secondary Index (GSI) with score as the sort key allows you to query items in descending order using ScanIndexForward=false and limit the result to the top 100 players. This pattern efficiently retrieves the highest scores without scanning the entire table, leveraging DynamoDB's index query capabilities.
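
A sketch of the GSI query follows. A GSI still needs a partition key, so this example assumes a hypothetical index named `ScoreIndex` with `game_id` as its partition key and `score` as its sort key; none of these names come from the question.

```python
def top_players_query(table_name, index_name, game_id, limit=100):
    """Build Query parameters that read the highest scores first from a GSI."""
    return {
        "TableName": table_name,
        "IndexName": index_name,
        # Assumed GSI layout: partition key game_id, sort key score.
        "KeyConditionExpression": "game_id = :g",
        "ExpressionAttributeValues": {":g": {"S": game_id}},
        "ScanIndexForward": False,  # descending sort-key order
        "Limit": limit,             # stop after the top N items
    }


# All names here are illustrative placeholders.
leaderboard_query = top_players_query("Leaderboard", "ScoreIndex", "game-1")
```

Putting every player under one partition key value can create a hot partition at scale, which is one reason the other two patterns (a maintained top-100 table, or Redis sorted sets) are also listed as correct.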

Exam trap

AWS often tests the misconception that DAX can perform sorting or ranking operations, but DAX is only a cache and cannot reorder data or maintain sorted sets.

130

MCQhard

A company runs a large-scale e-commerce platform using Amazon RDS for MySQL with a Multi-AZ deployment. The database has a table 'orders' with 200 million rows. Recently, they added a new index on the 'order_date' column to improve reporting queries. After adding the index, they noticed increased write latency and occasional replication lag. The application writes new orders continuously. The table experiences about 10,000 writes per second. The DB instance is db.r5.4xlarge. The index creation was done using the ALTER TABLE statement with a default algorithm. What is the most likely cause of the increased write latency and replication lag?

A.The index creation DDL statement is not replicated to the standby instance, causing inconsistency.

B.The instance size is insufficient for the write workload.

C.The index was created using the default algorithm (COPY), which locks the table and blocks writes, causing replication lag.

D.The new index is causing excessive overhead on write operations due to index maintenance.

AnswerC

In MySQL 5.6 and 5.7, ALTER TABLE uses COPY algorithm by default, which locks the table for writes during the operation.

Why this answer

Option C is correct because the default algorithm for ALTER TABLE in MySQL is COPY, which creates a new table, copies all rows, and rebuilds indexes. During this process, the table is locked with a write lock, blocking DML operations and causing increased write latency. In a Multi-AZ deployment, the DDL is replicated to the standby, but the lock on the primary delays writes, which can manifest as replication lag when the standby applies the same blocking DDL.

Exam trap

The trap here is that candidates often assume any index addition causes permanent write overhead (Option D), but the question describes a sudden latency spike immediately after the operation, which is characteristic of the blocking COPY algorithm, not ongoing maintenance.

How to eliminate wrong answers

Option A is wrong because DDL statements like ALTER TABLE are replicated to the standby instance via the binary log in MySQL Multi-AZ deployments; the index creation is not skipped, so inconsistency does not occur. Option B is wrong because the db.r5.4xlarge instance (16 vCPUs, 128 GB memory) is more than sufficient for 10,000 writes per second on a single table; the issue is not raw capacity but the blocking nature of the DDL operation. Option D is wrong because while index maintenance does add overhead to writes, the sudden increase in write latency and replication lag immediately after adding the index points to the blocking DDL operation itself, not the ongoing maintenance cost of the new index.

131

MCQmedium

A company runs a customer relationship management (CRM) application on Amazon RDS for PostgreSQL. The application stores customer data in a table with over 50 million rows. The company recently added a new query that searches for customers by their email domain (e.g., '@example.com'). The query uses a LIKE pattern: 'WHERE email LIKE ''%@example.com'''. The query takes over 30 seconds to complete. The DBA has already created a B-tree index on the email column, but it does not help. Which action should the database specialist recommend to improve query performance?

A.Create a hash index on the email column.

B.Increase the shared_buffers parameter to improve caching.

C.Create a B-tree index on the reversed email string.

D.Create a trigram index (using pg_trgm extension) on the email column.

AnswerD

Trigram indexes are designed for fast LIKE queries.

Why this answer

The query uses a leading wildcard LIKE pattern ('%@example.com'), which prevents a standard B-tree index from being used because the search string does not have a fixed prefix. A trigram index (a GIN or GiST index built with the pg_trgm extension's operator classes) breaks strings into three-character substrings (trigrams) and allows the database to efficiently match patterns with leading wildcards. This index type is specifically designed for fuzzy text matching and LIKE queries with wildcards, so the query no longer needs a full table scan and typically completes far faster than the current 30 seconds.
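
The idea behind trigram matching can be sketched in a few lines. This is a deliberately simplified model: pg_trgm itself pads words and ignores non-alphanumeric characters, so its real trigram sets differ from the ones computed here.

```python
def trigrams(text):
    """Return the set of 3-character substrings of `text` (lowercased)."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def candidate_rows(rows, fragment):
    """Keep rows whose trigram set contains every trigram of the search fragment."""
    needed = trigrams(fragment)
    return [row for row in rows if needed <= trigrams(row)]


# Sample data is invented for illustration.
emails = ["ann@example.com", "bob@example.org", "cy@sample.com"]
candidates = candidate_rows(emails, "@example.com")
# The index narrows the search; the original LIKE condition is then rechecked.
matches = [email for email in candidates if email.endswith("@example.com")]
```

The index lookup only has to find rows sharing the fragment's trigrams, which works regardless of where the fragment sits in the string, so a leading wildcard is no obstacle.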

Exam trap

The trap here is that candidates assume a B-tree index can handle all LIKE patterns, but AWS specifically tests the understanding that leading wildcards disable B-tree index scans, requiring a specialized index like pg_trgm for pattern-matching performance.

How to eliminate wrong answers

Option A is wrong because hash indexes in PostgreSQL only support equality comparisons (=), not pattern-matching operations like LIKE. Option B is wrong because increasing shared_buffers improves caching of data pages but does not change the query execution plan; the B-tree index is still not used for leading-wildcard searches, so the query remains a full table scan. Option C is wrong because creating a B-tree index on the reversed email string would only help if the query were rewritten to use a trailing wildcard (e.g., WHERE REVERSE(email) LIKE 'moc.elpmaxe@%'), which is not the given query pattern and adds complexity without addressing the leading wildcard issue.

132

MCQmedium

A company uses Amazon Aurora MySQL for its customer relationship management (CRM) system. The database has a table "contacts" with millions of rows. The application frequently searches for contacts by email address. The email column has a B-tree index. The DBA notices that queries are still slow, and the EXPLAIN plan shows index scans but not index-only scans. What is the most likely cause?

A.The query selects columns not included in the index, requiring table lookups.

B.The index is a composite index on (email, phone) and the query selects only email.

C.The index has low cardinality.

D.The index type is not suitable for equality searches.

AnswerA

If the query selects columns like phone not in the index, the database must access the table, preventing an index-only scan.

Why this answer

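Option A is correct because if the index does not include every column the query selects (phone, for example), the database must fetch the remaining columns from the table, so the index cannot serve as a covering index. Option B is wrong because a composite index on (email, phone) would fully cover a query that selects only email, so it would not prevent an index-only scan. Option C is wrong because an email column is normally high-cardinality, and cardinality does not determine whether a scan is index-only. Option D is wrong because B-tree indexes are well suited to equality searches.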
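The difference between a covered and a non-covered query can be observed with any engine that reports its plan. The sketch below uses Python's built-in SQLite as a stand-in (an assumption for illustration; Aurora MySQL reports the same situation differently in EXPLAIN). The index name and sample value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT, phone TEXT)")
conn.execute("CREATE INDEX idx_contacts_email ON contacts (email)")


def plan(sql: str) -> str:
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, ("a@example.com",)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the plan detail


# Every selected column lives in the index: no table lookup is needed.
covered = plan("SELECT email FROM contacts WHERE email = ?")

# phone is not in the index: the engine must visit the table for each match.
not_covered = plan("SELECT email, phone FROM contacts WHERE email = ?")
```

In SQLite the first plan mentions a covering index and the second does not, which mirrors the index-only versus index-plus-table-lookup distinction in the question.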

133

MCQeasy

A company wants to migrate their on-premises Oracle database to Amazon RDS for Oracle. They have a complex data loading process that uses Oracle Data Pump. Which migration approach is MOST efficient and minimizes downtime?

A.Use AWS Schema Conversion Tool (SCT) to convert the schema and then copy data files directly.

B.Use AWS Database Migration Service (DMS) with ongoing replication from the source Oracle database.

C.Take a physical backup of the on-premises database and restore to RDS.

D.Export data using Oracle Data Pump and import into RDS.

AnswerB

DMS supports full load + CDC, reducing downtime.

Why this answer

AWS DMS can perform a one-time full load followed by ongoing replication (change data capture) using Oracle LogMiner or Binary Reader. This minimizes downtime because, after the full load, DMS continuously applies changes from the source until cutover. Data Pump export/import requires downtime while the export and import run. SCT is used for schema conversion, not data movement. Backup and restore also requires downtime.

134

MCQeasy

A company is implementing fine-grained access control for a DynamoDB table named UserSessions. The table has a partition key of 'user_id'. The above IAM policy is attached to an IAM role assumed by the application. What does this policy achieve?

A.Allows the application to perform all operations on the UserSessions table without restrictions

B.Restricts the application to access only items where the partition key matches the user's AWS user ID

C.Allows the application to read but not write items in the UserSessions table

D.Allows the application to access only the UserSessions table but not other tables

AnswerB

The condition uses 'aws:userid' to limit access to items with the corresponding partition key.

Why this answer

The IAM policy uses a condition key `dynamodb:LeadingKeys` with a value of `${aws:userid}`. This restricts access to items in the DynamoDB table where the partition key (`user_id`) matches the unique identifier of the IAM user or role that is making the request. This implements fine-grained access control, ensuring the application can only read or write items belonging to the authenticated user.
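The exhibit itself is not reproduced here, so the following is only a sketch of what a policy of this kind commonly looks like, built as a Python dictionary. The account ID, region, and list of actions are assumptions for illustration; only the table name, the `dynamodb:LeadingKeys` condition key, and the `${aws:userid}` variable come from the explanation above.

```python
import json

# Hypothetical account ID and region; the table name comes from the question.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/UserSessions"

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Assumed action list; the real exhibit may allow a different set.
            "Action": [
                "dynamodb:GetItem",
                "dynamodb:Query",
                "dynamodb:PutItem",
                "dynamodb:UpdateItem",
                "dynamodb:DeleteItem",
            ],
            "Resource": TABLE_ARN,
            "Condition": {
                # Every partition key value touched by the request must equal
                # the caller's own identifier.
                "ForAllValues:StringEquals": {
                    "dynamodb:LeadingKeys": ["${aws:userid}"]
                }
            },
        }
    ],
}

policy_json = json.dumps(policy, indent=2)
```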

Exam trap

The trap here is that candidates often confuse `aws:userid` with the IAM user name or the partition key value, or they assume the policy grants full access (Option A) without noticing the condition that enforces row-level security.

How to eliminate wrong answers

Option A is wrong because the policy explicitly restricts access based on the partition key, so it does not allow all operations without restrictions. Option C is wrong because the policy does not specify any `Action` or `Effect` that limits operations to read-only; it allows all DynamoDB actions on the table, subject to the condition. Option D is wrong because, although the policy's `Resource` element is scoped to the `UserSessions` table ARN, the defining effect of the policy is the condition that restricts access to specific items within that table; access to other tables is already denied by default unless explicitly allowed.

135

MCQmedium

A financial services company is migrating an on-premises Oracle database to Amazon RDS for Oracle. The database has a high volume of write transactions and requires minimal downtime during migration. Which AWS service or feature should be used to replicate data continuously to the target RDS instance during the migration?

A.Amazon RDS Read Replica

B.Amazon RDS Multi-AZ deployment

C.AWS Database Migration Service (AWS DMS) with ongoing replication

D.AWS Schema Conversion Tool (AWS SCT)

AnswerC

AWS DMS can perform full load and then continuously replicate changes via CDC.

Why this answer

AWS Database Migration Service (AWS DMS) with ongoing replication (change data capture, CDC) is the correct choice because it continuously captures and applies changes from the source Oracle database to the target Amazon RDS for Oracle instance, enabling near-zero downtime during migration. This is achieved by using Oracle's redo logs to stream transactions in real time, which meets the high write volume and minimal downtime requirements.
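As a rough sketch, the parameters for a full-load-plus-CDC task could be assembled as below and then passed to boto3's `create_replication_task`. All ARNs, the task identifier, and the schema name `APP_SCHEMA` are hypothetical placeholders; no AWS call is made here.

```python
import json

# Selection rule: include every table in one (hypothetical) source schema.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-app-schema",
            "object-locator": {"schema-name": "APP_SCHEMA", "table-name": "%"},
            "rule-action": "include",
        }
    ]
}

task_params = {
    "ReplicationTaskIdentifier": "oracle-to-rds-oracle",  # hypothetical name
    # Placeholder ARNs; real values come from the DMS endpoints and instance.
    "SourceEndpointArn": "arn:aws:dms:us-east-1:123456789012:endpoint:SOURCE-EXAMPLE",
    "TargetEndpointArn": "arn:aws:dms:us-east-1:123456789012:endpoint:TARGET-EXAMPLE",
    "ReplicationInstanceArn": "arn:aws:dms:us-east-1:123456789012:rep:INSTANCE-EXAMPLE",
    # Full load first, then ongoing change data capture until cutover.
    "MigrationType": "full-load-and-cdc",
    "TableMappings": json.dumps(table_mappings),
}

# With credentials configured, this would be submitted roughly as:
#   boto3.client("dms").create_replication_task(**task_params)
```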

Exam trap

The trap here is that candidates confuse continuous replication with high-availability features like Multi-AZ or read replicas, but those services cannot ingest data from an on-premises source; only AWS DMS with CDC provides the necessary ongoing replication for a live migration with minimal downtime.

How to eliminate wrong answers

Option A is wrong because Amazon RDS Read Replica is designed for read scaling and asynchronous replication from an RDS source, not for migrating an on-premises Oracle database; it cannot connect to an external source. Option B is wrong because Amazon RDS Multi-AZ deployment provides high availability by synchronously replicating data to a standby instance in another Availability Zone, but it does not support continuous replication from an on-premises database. Option D is wrong because the AWS Schema Conversion Tool (AWS SCT) is used to convert database schemas and code for heterogeneous migrations, not for continuous data replication.

136

MCQeasy

A financial services company uses Amazon DynamoDB to store transaction records. Each transaction has a unique transaction_id as the partition key and a timestamp as the sort key. The application frequently queries all transactions for a given customer within a date range. However, customer_id is not an attribute indexed for querying. The company wants to optimize these queries without redesigning the entire table schema. Which action should the company take?

A.Change the table's partition key to customer_id and use a composite sort key.

B.Create a Local Secondary Index (LSI) on customer_id.

C.Create a Global Secondary Index (GSI) with customer_id as the partition key and timestamp as the sort key.

D.Use the Scan operation with a filter expression for customer_id and timestamp.

AnswerC

A GSI allows querying by customer_id and timestamp range without modifying the base table.

Why this answer

Option C is correct because creating a Global Secondary Index (GSI) with customer_id as the partition key and timestamp as the sort key allows efficient querying of all transactions for a given customer within a date range without redesigning the base table. The GSI provides a new access pattern with its own partition and sort keys, enabling the Query operation on customer_id and timestamp, which is far more efficient than a Scan. This approach preserves the existing table schema and supports the required query pattern with minimal overhead.
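A sketch of the two request shapes involved, written as plain Python dictionaries that could be passed to boto3's `update_table` and `query`. The table name `Transactions`, the index name, and the use of ISO-8601 strings for timestamps are assumptions; no AWS call is made.

```python
TABLE_NAME = "Transactions"                 # hypothetical table name
INDEX_NAME = "customer_id-timestamp-index"  # hypothetical index name

# Request to add the GSI to the existing table (on-demand table assumed;
# a provisioned table would also need ProvisionedThroughput in "Create").
create_gsi_params = {
    "TableName": TABLE_NAME,
    "AttributeDefinitions": [
        {"AttributeName": "customer_id", "AttributeType": "S"},
        {"AttributeName": "timestamp", "AttributeType": "S"},
    ],
    "GlobalSecondaryIndexUpdates": [
        {
            "Create": {
                "IndexName": INDEX_NAME,
                "KeySchema": [
                    {"AttributeName": "customer_id", "KeyType": "HASH"},
                    {"AttributeName": "timestamp", "KeyType": "RANGE"},
                ],
                "Projection": {"ProjectionType": "ALL"},
            }
        }
    ],
}


def customer_range_query(customer_id: str, start: str, end: str) -> dict:
    """Build a Query request for one customer's transactions in a date range."""
    return {
        "TableName": TABLE_NAME,
        "IndexName": INDEX_NAME,
        # "timestamp" is a DynamoDB reserved word, so it is aliased as #ts.
        "KeyConditionExpression": "customer_id = :cid AND #ts BETWEEN :start AND :end",
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":cid": {"S": customer_id},
            ":start": {"S": start},
            ":end": {"S": end},
        },
    }


january = customer_range_query("cust-42", "2024-01-01T00:00:00Z", "2024-01-31T23:59:59Z")
```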

Exam trap

The trap here is that candidates often confuse Local Secondary Indexes (LSIs) with Global Secondary Indexes (GSIs), assuming an LSI can be added later or can use a different partition key, when in fact LSIs must share the base table's partition key and can only be created at table creation time.

How to eliminate wrong answers

Option A is wrong because changing the table's partition key to customer_id would require a full table redesign, data migration, and application downtime, which contradicts the requirement to avoid redesigning the entire table schema. Option B is wrong because a Local Secondary Index (LSI) can only be created at table creation time and must use the same partition key as the base table (transaction_id), so it cannot index on customer_id as a partition key for range queries. Option D is wrong because using the Scan operation with a filter expression for customer_id and timestamp is inefficient, as it reads every item in the table and then filters, incurring high read capacity consumption and latency, especially for large tables.

137

MCQmedium

A company has a document database workload on Amazon DynamoDB that stores user session data. The application frequently updates session attributes (e.g., last activity timestamp). The current design stores the entire session as a single item and updates the entire item on each session activity. This is causing high write costs and throttling. Which design pattern would reduce write costs and improve performance?

A.Increase the write capacity units (WCUs) on the table.

B.Use UpdateItem with an update expression to modify only the changed attributes.

C.Implement DynamoDB Accelerator (DAX) to cache the session data.

D.Split the session item into multiple items, one per attribute.

AnswerB

An update expression changes only the named attributes in place, so the application no longer has to read and resend the entire session on every activity.

Why this answer

Option B is correct because using UpdateItem with an update expression allows you to modify only the specific attributes that changed (e.g., last activity timestamp) instead of rewriting the entire item. The application no longer needs to read the session and send the full item back on every activity, which removes the extra read and shrinks each request. Note that DynamoDB meters an UpdateItem write by the larger of the item's size before and after the update, so the write capacity consumed per update still depends on the total item size; keeping session items small compounds the saving. Of the options offered, this is the one that directly addresses the full-item overwrite pattern.
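A minimal sketch of such a request, built as a dictionary that could be passed to boto3's `update_item`. The table name `Sessions`, the key attribute `session_id`, and the attribute `last_activity` are hypothetical; no AWS call is made.

```python
def touch_session_request(session_id: str, now_iso: str) -> dict:
    """Build an UpdateItem request that changes one attribute of a session item."""
    return {
        "TableName": "Sessions",                    # hypothetical table name
        "Key": {"session_id": {"S": session_id}},   # hypothetical key attribute
        # Only last_activity is named; all other attributes are left untouched.
        "UpdateExpression": "SET last_activity = :ts",
        "ExpressionAttributeValues": {":ts": {"S": now_iso}},
        # Do not return the item, keeping the response small.
        "ReturnValues": "NONE",
    }


request = touch_session_request("sess-123", "2024-05-01T12:00:00Z")
```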

Exam trap

The trap here is that candidates often confuse scaling solutions (increasing WCUs or adding DAX) with optimization patterns, failing to recognize that the real issue is the write amplification caused by full-item updates rather than insufficient capacity or read performance.

How to eliminate wrong answers

Option A is wrong because increasing write capacity units (WCUs) only raises the throughput limit but does not reduce the cost per write; it would increase costs further and does not solve the root cause of writing the entire item. Option C is wrong because DynamoDB Accelerator (DAX) is an in-memory cache that improves read performance, not write cost or write throttling; it does not reduce the amount of data written per update. Option D is wrong because splitting a session item into one item per attribute would require multiple write operations for each session update, and every write is rounded up to at least one write capacity unit, so both cost and complexity increase.

138

MCQeasy

A company is migrating an on-premises Oracle OLTP workload to AWS. The database has complex stored procedures and requires minimal code changes. Which AWS database service is the most suitable target?

A.Amazon Redshift

B.Amazon DynamoDB

C.Amazon Aurora PostgreSQL

D.Amazon RDS for Oracle

AnswerD

Minimal code changes required.

Why this answer

Amazon RDS for Oracle requires minimal code changes because it runs the Oracle engine, so existing stored procedures carry over largely unchanged. Option A (Redshift) is for analytics, not OLTP. Option B (DynamoDB) is NoSQL. Option C (Aurora PostgreSQL) would require rewriting the procedures.

139

MCQeasy

A company runs a MySQL database on Amazon RDS for an e-commerce platform. The application performs frequent INSERT and UPDATE operations on the 'orders' table. The team notices an increase in disk I/O and CPU usage. They want to optimize the database for write-heavy workloads without changing the application. Which option is the MOST effective?

A.Enable Multi-AZ deployment for redundancy

B.Change the storage engine to MyISAM

C.Increase the InnoDB buffer pool size

D.Upgrade to Amazon Aurora MySQL

AnswerC

A larger buffer pool reduces disk I/O by caching data and indexes in memory.

Why this answer

RDS exposes InnoDB settings through DB parameter groups, so the buffer pool size can be raised without changing the application. For write-heavy workloads, using InnoDB with an appropriately sized buffer pool is key, because more data and index pages stay in memory and disk I/O drops. Option A is wrong because enabling Multi-AZ does not improve write performance (synchronous replication adds latency). Option B is wrong because switching to MyISAM is not recommended (no transaction support, table-level locking). Option D is wrong because moving to Aurora MySQL could improve performance but is more expensive and requires a migration.

140

MCQeasy

A startup is building a multi-tenant SaaS application where each tenant's data must be isolated. The data model is relational with complex joins. Which database deployment model is most appropriate?

A.Use a single Amazon DynamoDB table with a tenant_id partition key

B.Use a single Amazon Redshift cluster with tenant_id distribution key

C.Provision a separate Amazon RDS instance for each tenant

D.Use a single Amazon RDS database with a tenant_id column on every table

AnswerC

Separate instances ensure complete data isolation and independent scaling.

Why this answer

Option C is correct because a separate RDS instance per tenant provides strong isolation. Option D (a single RDS database with a tenant_id column on every table) risks noisy neighbors. Option A (a single DynamoDB table) is not relational. Option B (Redshift) is for analytics, not OLTP.

141

Multi-Selecteasy

Which TWO are valid use cases for Amazon ElastiCache for Redis? (Choose 2)

Select 2 answers

A.Storing graph data with relationships

B.Session management for web applications

C.Running complex analytical queries on large datasets

D.Caching frequently accessed database queries to reduce load on RDS

E.Persistent storage of relational data

AnswersB, D

Redis is often used for session storage due to low latency.

Why this answer

Amazon ElastiCache for Redis is an in-memory data store ideal for session management because it provides sub-millisecond latency for storing and retrieving session tokens, supports TTL-based key expiration to automatically clean up stale sessions, and offers atomic operations like SETEX for safe session creation. This makes it a perfect fit for stateless web applications that need to offload session state from the application server.

Exam trap

The trap here is that candidates often confuse caching (option D) with persistent storage (option E) or assume that Redis's data structures (like sorted sets) can handle graph relationships (option A), but Redis lacks the graph traversal and indexing capabilities of a dedicated graph database.

142

MCQhard

A company uses Amazon RDS for PostgreSQL to store sensor data. Each sensor sends a row every second. The table has grown to 500 GB and queries filtering on a timestamp column are slow even with an index. The team wants to improve query performance while keeping the data online. Which approach should they take?

A.Partition the table by time using PostgreSQL table partitioning

B.Migrate to Amazon Aurora PostgreSQL and enable parallel query

C.Add more indexes on the timestamp column

D.Create a read replica and direct queries to the replica

AnswerA

Partitioning by time allows partition pruning, significantly improving query performance on timestamp filters.

Why this answer

Partitioning the table by time (e.g., by day or month) allows PostgreSQL to prune partitions that fall outside the queried range, reducing the amount of data scanned. This is a common pattern for time-series data. Adding more indexes on the same column does not reduce the data scanned and can slow writes. Read replicas help with read scaling but do not make an individual query faster. Migrating to Aurora PostgreSQL with parallel query might help, but partitioning is the more direct solution.
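Partition pruning can be illustrated with a toy in-memory model: rows are bucketed by month, and a time-range query only touches the buckets that overlap the range. This is a conceptual sketch, not PostgreSQL's implementation; the class `MonthlyPartitionedTable` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


class MonthlyPartitionedTable:
    """Toy table whose rows are stored in one bucket per (year, month)."""

    def __init__(self) -> None:
        self.partitions: dict[tuple[int, int], list[tuple[datetime, float]]] = defaultdict(list)

    def insert(self, ts: datetime, value: float) -> None:
        self.partitions[(ts.year, ts.month)].append((ts, value))

    def query(self, start: datetime, end: datetime) -> tuple[list[tuple[datetime, float]], int]:
        """Return rows with start <= ts < end and the number of partitions scanned."""
        lo, hi = (start.year, start.month), (end.year, end.month)
        # Pruning step: skip every partition entirely outside the range.
        scanned = [key for key in self.partitions if lo <= key <= hi]
        rows = [
            row
            for key in scanned
            for row in self.partitions[key]
            if start <= row[0] < end
        ]
        return sorted(rows), len(scanned)


table = MonthlyPartitionedTable()
for month in (1, 2, 3):
    for day in (5, 15, 25):
        table.insert(datetime(2024, month, day), float(month * 100 + day))

rows, partitions_scanned = table.query(datetime(2024, 2, 10), datetime(2024, 2, 20))
```

Here three partitions exist, but the February query reads only one of them; the January and March rows are never examined.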

143

MCQmedium

A gaming company uses Amazon RDS for PostgreSQL to store player profiles and game state data. The database is currently 500 GB and grows by 10 GB per day. The company runs weekly reports that scan the entire database, causing high I/O and CPU usage. The application experiences read latency spikes during report generation. The team wants to minimize performance impact on the application while maintaining the ability to run reports. Which solution should the team implement?

A.Create a read replica and direct all report queries to the read replica.

B.Enable Multi-AZ deployment to provide a standby instance for failover and use it for reporting.

C.Scale up the RDS instance to a larger instance type to handle the additional load from reports.

D.Archive historical game state data to Amazon S3 and delete it from the database to reduce size.

AnswerA

A read replica offloads read-intensive workloads from the primary, reducing latency for the application.

Why this answer

Creating a read replica for Amazon RDS for PostgreSQL allows the team to offload all report queries to a separate read-only endpoint, eliminating the I/O and CPU contention on the primary database. This directly addresses the read latency spikes during report generation without requiring any application changes beyond redirecting the reporting queries. The read replica asynchronously replicates data from the primary instance, ensuring the reports see a near-real-time snapshot of the data while the primary remains dedicated to the application workload.

Exam trap

The trap here is that candidates often confuse Multi-AZ standby instances with read replicas, mistakenly believing the standby can be used for read traffic, but AWS explicitly prevents read access to the standby to maintain synchronous replication integrity.

How to eliminate wrong answers

Option B is wrong because a Multi-AZ standby instance is not accessible for read queries; it is a synchronous replica used solely for automatic failover and cannot serve traffic, so it would not offload the reporting workload. Option C is wrong because scaling up the RDS instance to a larger type only increases the capacity of the single instance, but the report queries would still compete with the application for the same I/O and CPU resources, failing to minimize the performance impact. Option D is wrong because archiving historical data to S3 reduces the database size but does not address the immediate I/O and CPU spikes caused by the weekly full-table scans; the reports would still scan the remaining data and cause latency issues.

144

MCQhard

A company runs a data warehouse on Amazon Redshift. The workload has frequent DELETE and UPDATE operations on a large fact table. Over time, query performance degrades. Which maintenance operation should be scheduled regularly to optimize performance?

A.Run VACUUM FULL during maintenance windows

B.Run VACUUM and ANALYZE commands regularly

C.Alter the table to use a different DISTKEY

D.Drop and recreate the table periodically

AnswerB

Reclaims space and updates statistics for better query plans.

Why this answer

B is correct because frequent DELETE and UPDATE operations in Redshift create ghost rows and cause table bloat, degrading query performance. Running VACUUM reclaims space and re-sorts rows, while ANALYZE updates table statistics for the query optimizer; together they restore performance without requiring a full table rebuild.

Exam trap

The trap here is that candidates pick VACUUM FULL because it sounds more thorough, or assume that changing the DISTKEY or recreating the table is a practical maintenance strategy. In Redshift, FULL is the default VACUUM mode, and VACUUM by itself does not refresh the statistics that ANALYZE maintains.

How to eliminate wrong answers

Option A is wrong because VACUUM FULL alone is incomplete: in Redshift, FULL is the default VACUUM mode (it reclaims space and re-sorts rows), but it does not update the table statistics the optimizer relies on, and limiting it to maintenance windows does not fit a workload with frequent DELETEs and UPDATEs. Option C is wrong because altering the DISTKEY is a schema design change that requires a table rebuild and does not address the immediate bloat and statistics issues caused by frequent DELETEs and UPDATEs. Option D is wrong because dropping and recreating the table is disruptive, causes downtime, and loses data unless carefully managed; it is not a regular maintenance operation and does not leverage Redshift's built-in VACUUM and ANALYZE capabilities.

145

MCQhard

A gaming company uses Amazon DynamoDB to store player profiles. Each profile is about 5 KB and is accessed frequently. The access pattern is mostly point reads by player ID. The company wants to reduce read costs while maintaining low latency. Currently, the table uses provisioned capacity with 3000 RCU. Which change would be MOST effective?

A.Use strongly consistent reads instead of eventually consistent reads.

B.Switch from provisioned capacity to on-demand capacity mode.

C.Decrease the provisioned RCU to 2000 and rely on adaptive capacity.

D.Use DynamoDB Accelerator (DAX) to cache frequently accessed items.

AnswerD

DAX reduces reads from the table, lowering RCU consumption.

Why this answer

Option D is correct because DynamoDB Accelerator (DAX) serves repeated reads from an in-memory cache, reducing RCU consumption on the table. Option A is wrong because strongly consistent reads consume more RCU than eventually consistent reads. Option B is wrong because switching to on-demand may increase costs if traffic is steady. Option C is wrong because decreasing RCU would cause throttling.
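The read-capacity arithmetic behind this comparison can be written out as a small helper. It assumes the standard DynamoDB metering rule of one RCU per strongly consistent read of up to 4 KB, and half that for an eventually consistent read; the function name is hypothetical.

```python
import math


def rcu_per_read(item_size_kb: float, strongly_consistent: bool) -> float:
    """Read capacity units consumed by one read of an item of the given size."""
    blocks = math.ceil(item_size_kb / 4)  # reads are metered in 4 KB blocks
    return blocks if strongly_consistent else blocks / 2


# The 5 KB profile from the question spans two 4 KB blocks.
strong = rcu_per_read(5, strongly_consistent=True)
eventual = rcu_per_read(5, strongly_consistent=False)
```

For the 5 KB profiles in this question, a strongly consistent read costs twice as much as an eventually consistent one, so switching to strong consistency would raise read costs rather than lower them.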

146

Multi-Selecthard

A company is migrating a large Oracle data warehouse to AWS. The warehouse contains 50 TB of data and runs complex analytical queries. The solution must support concurrency of up to 100 users and provide high performance for queries. Which THREE design decisions should the company make? (Choose three.)

Select 3 answers

A.Use distribution keys based on frequently joined columns

B.Design tables with columnar storage

C.Use Amazon RDS for Oracle with Multi-AZ

D.Use Amazon DynamoDB with global tables

E.Use Amazon Redshift as the database engine

AnswersA, B, E

Distribution keys enable parallel processing and reduce data movement.

Why this answer

Distribution keys based on frequently joined columns ensure that related data is co-located on the same compute nodes, minimizing data movement across the network during joins. This is critical for complex analytical queries on large datasets in Amazon Redshift, as it reduces shuffle overhead and improves query performance.
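Co-location can be sketched by hashing the join column of two tables onto the same set of slices. This is a conceptual model only: the slice count, the sample tables, and the use of CRC32 as the hash are assumptions, and Redshift's internal hash function is not shown here.

```python
import zlib

NUM_SLICES = 4  # hypothetical slice count


def slice_for(key: str) -> int:
    """Map a distribution key value to a slice (CRC32 stands in for the real hash)."""
    return zlib.crc32(key.encode()) % NUM_SLICES


def distribute(rows: list[dict], key_column: str) -> dict[int, list[dict]]:
    """Place each row on the slice chosen by its distribution key."""
    slices: dict[int, list[dict]] = {i: [] for i in range(NUM_SLICES)}
    for row in rows:
        slices[slice_for(str(row[key_column]))].append(row)
    return slices


orders = [{"order_id": i, "customer_id": f"c{i % 5}"} for i in range(20)]
customers = [{"customer_id": f"c{i}", "name": f"Customer {i}"} for i in range(5)]

orders_by_slice = distribute(orders, "customer_id")
customers_by_slice = distribute(customers, "customer_id")

# Each slice joins only its own rows; nothing is moved between slices.
joined = [
    (o["order_id"], c["name"])
    for s in range(NUM_SLICES)
    for o in orders_by_slice[s]
    for c in customers_by_slice[s]
    if o["customer_id"] == c["customer_id"]
]
```

Because both tables are distributed on `customer_id`, every order finds its customer on the same slice, and the slice-local joins together produce the complete result.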

Exam trap

The trap here is that candidates may confuse Amazon RDS for Oracle (an OLTP database) with a suitable data warehouse solution, overlooking that Redshift’s columnar storage and MPP architecture are specifically designed for large-scale analytical workloads.

147

MCQhard

A financial services company needs to enforce row-level security on a MySQL database hosted on Amazon RDS. They want to restrict access so that each application user can only see their own data. Which approach should they take?

A.Place each user's data in a separate database and use VPC endpoints to isolate access

B.Create separate database views for each user

C.Use a MySQL proxy that injects session context variables and enable row-level security in the application queries

D.Use IAM database authentication and define fine-grained access policies

AnswerC

Allows dynamic row filtering per user.

Why this answer

Using a MySQL proxy that injects session context variables allows row-level filtering based on user identity. Option A (separate databases with VPC endpoints) does not control row access. Option B (views per user) is not scalable. Option D (IAM database authentication) controls who can connect to the database and does not apply to database rows.

148

MCQmedium

A company needs to store and query time-series data from IoT sensors. The data is written continuously and queried by time range for dashboards. Which AWS database service is most cost-effective and scalable for this workload?

A.Amazon ElastiCache for Redis with time-series data structures.

B.Amazon DynamoDB with time-based partition keys.

C.Amazon Timestream.

D.Amazon RDS for PostgreSQL with time-based indexing.

AnswerC

Managed time-series database with built-in analytics.

Why this answer

Amazon Timestream is purpose-built for time-series data, offering automatic tiering between in-memory and magnetic stores for cost efficiency, and built-in functions for time-based aggregations and windowed queries. It is serverless and scales automatically to handle continuous writes from IoT sensors and low-latency dashboard queries by time range, making it the most cost-effective and scalable choice.

Exam trap

The trap here is that candidates often choose DynamoDB for its scalability, overlooking that time-series workloads with sequential timestamps cause hot partitions and lack native time-series query capabilities, while Timestream is the only service specifically designed for this use case with automatic tiering and cost optimization.

How to eliminate wrong answers

Option A is wrong because ElastiCache for Redis is an in-memory cache, not a durable, scalable database for continuous time-series ingestion; it requires manual management of data eviction and lacks built-in time-series query optimization for large historical datasets. Option B is wrong because DynamoDB with time-based partition keys can lead to hot partitions due to sequential writes, and it lacks native time-series functions like interpolation or smoothing, requiring complex application logic for dashboard queries. Option D is wrong because RDS for PostgreSQL with time-based indexing incurs high storage and compute costs for continuous writes, requires manual scaling and partitioning, and lacks the automatic data tiering and query optimization that Timestream provides for time-series workloads.

149

MCQeasy

A company is running an Amazon RDS for MySQL instance as shown in the exhibit. The application is experiencing high write latency. The instance has a high number of write operations and the storage queue depth is consistently above 100. Which change would most effectively reduce write latency?

A.Modify the storage type to Provisioned IOPS (io1) with 3000 IOPS.

B.Change the instance class to db.m5.xlarge.

C.Enable Multi-AZ to offload writes to a standby.

D.Increase the allocated storage to 200 GB.

AnswerA

Provisioned IOPS provides consistent, low-latency performance.

Why this answer

The correct answer is A because the instance is experiencing high write latency with a consistently high storage queue depth (above 100), which indicates that the current storage (likely gp2 or magnetic) cannot keep up with the write IOPS demand. Provisioned IOPS (io1) with 3000 IOPS guarantees a dedicated level of IOPS, reducing queue depth and write latency by ensuring the storage subsystem can handle the write workload without throttling.

Exam trap

The trap here is that candidates often treat Multi-AZ replication as a way to distribute write load. In reality, a Multi-AZ standby only provides failover and does not share the write load (read replicas, a separate feature, offload reads only), and increasing storage or instance size without addressing the IOPS bottleneck will not resolve high queue depth.

How to eliminate wrong answers

Option B is wrong because changing the instance class to db.m5.xlarge improves compute and memory resources but does not address the storage-level bottleneck causing high queue depth and write latency. Option C is wrong because enabling Multi-AZ provides synchronous replication to a standby for high availability and failover, but it does not offload writes; writes must still be committed to the primary instance's storage, so it does not reduce write latency. Option D is wrong because increasing allocated storage to 200 GB may improve baseline IOPS for gp2 (since gp2 IOPS scale with size) but does not guarantee the consistent, high IOPS needed to reduce queue depth; the queue depth above 100 indicates a need for Provisioned IOPS, not just more storage.
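The gp2 point can be checked with simple arithmetic. The helper below assumes the published gp2 baseline of 3 IOPS per GiB with a floor of 100 IOPS and a ceiling of 16,000 IOPS; the function name is hypothetical.

```python
PROVISIONED_IOPS = 3000  # the io1 setting offered in option A


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 per GiB, clamped to [100, 16000]."""
    return min(max(3 * size_gib, 100), 16_000)


after_resize = gp2_baseline_iops(200)          # option D: grow storage to 200 GB
shortfall = PROVISIONED_IOPS - after_resize    # how far below option A it remains
```

Under these assumptions, growing a gp2 volume to 200 GB yields a sustained baseline well below the 3000 IOPS that the provisioned option guarantees.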

150

MCQmedium

A social media startup is using Amazon ElastiCache for Redis to cache user profiles. The cache currently has a 24-hour TTL. The application experiences a sudden spike in traffic after a celebrity mentions the service, causing the cache to be flooded with requests for uncached profiles. This results in high latency and database load. Which design pattern should the company implement to prevent this in the future?

A.Use a read-through cache with a longer TTL (e.g., 48 hours).

B.Implement a local cache in each application instance to reduce load on the centralized Redis cluster.

C.Use a write-through cache with a longer TTL (e.g., 48 hours).

D.Use a write-through cache with a shorter TTL (e.g., 1 hour).

AnswerB

Local caching reduces the number of requests to Redis and the database, helping to mitigate cache stampedes.

Why this answer

Option B is correct because implementing a local cache (e.g., using a library like Caffeine or Guava) in each application instance reduces the number of requests hitting the centralized Redis cluster during a traffic spike. This pattern, often called a multi-tier or near-cache, absorbs repeated reads for the same uncached profiles locally, preventing cache flooding and database overload without relying solely on Redis TTL adjustments.
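The explanation names Java libraries; the same near-cache idea can be sketched in Python as a small per-process TTL cache placed in front of whatever loads a profile from Redis or the database. The class `LocalTTLCache`, the loader function, and the 30-second TTL are hypothetical, and this sketch is not thread-safe.

```python
import time
from typing import Callable


class LocalTTLCache:
    """Per-process cache that answers repeated reads without a remote call."""

    def __init__(
        self,
        loader: Callable[[str], dict],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # fetches from Redis or the database on a miss
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries: dict[str, tuple[float, dict]] = {}

    def get(self, key: str) -> dict:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # local hit: no remote request
        value = self._loader(key)  # miss or expired: one remote request
        self._entries[key] = (now + self._ttl, value)
        return value


remote_calls: list[str] = []


def load_profile(user_id: str) -> dict:
    """Stand-in for the remote lookup; records each call it receives."""
    remote_calls.append(user_id)
    return {"user_id": user_id}


fake_now = [0.0]
cache = LocalTTLCache(load_profile, ttl_seconds=30, clock=lambda: fake_now[0])
for _ in range(1000):              # a burst of requests for the same profile
    cache.get("celebrity")
```

A short local TTL keeps the data reasonably fresh while collapsing a burst of identical reads on one instance into a single remote lookup.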

Exam trap

The trap here is that candidates often assume extending TTL or changing cache write strategies (write-through vs. read-through) will solve a cache-miss storm, when in fact the core issue is the volume of concurrent misses, which only a local cache or similar request-reduction pattern can mitigate.

How to eliminate wrong answers

Option A is wrong because simply extending the TTL to 48 hours does not prevent the initial flood of requests for uncached profiles; it only keeps cached data longer once it is loaded, but the spike still causes a cache-miss storm. Option C is wrong because a write-through cache with a longer TTL focuses on write consistency and does not address read-side cache misses during a traffic spike; it would also increase write latency unnecessarily. Option D is wrong because a write-through cache with a shorter TTL would evict data faster, exacerbating cache misses and making the flood problem worse, not better.
