CCNA Design High-Performing Architectures Questions — Page 2 of 4

MCQ (medium)

A document portal requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable? The design must avoid adding custom operational scripts.

A. sc1 Cold HDD

B. Instance store only

C. Provisioned IOPS SSD such as io2

D. st1 Throughput Optimized HDD

Answer: C

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

Provisioned IOPS SSD volumes (io2) are designed for latency-sensitive transactional workloads that require consistent, high IOPS. They deliver a predictable, provisioned performance level and are designed for 99.999% durability, making them a good fit for a database on EC2 without custom scripts to manage performance.

Exam trap

The trap here is that candidates often confuse throughput-optimized HDD (st1) with IOPS-optimized SSD, failing to recognize that transactional databases require low-latency random I/O, not high sequential throughput.

How to eliminate wrong answers

Option A is wrong because sc1 Cold HDD volumes are optimized for large, sequential workloads with low cost, not for high IOPS or transactional databases. Option B is wrong because instance store volumes are ephemeral and data is lost on instance stop/termination, requiring custom operational scripts to manage data persistence. Option D is wrong because st1 Throughput Optimized HDD volumes are designed for high-throughput, sequential access patterns (e.g., big data, log processing) and cannot deliver the consistent low-latency IOPS required for transactional databases.


MCQ (medium)

A Lambda function behind an API needs consistent low latency. Traffic normally drops to near zero, then spikes several times per hour. During spikes, the p95 latency often exceeds 800 ms due to cold starts. The team wants to keep using Lambda (no containers) but minimize cold start impact during predictable spikes. What is the best AWS configuration to meet this goal?

A. Enable Lambda provisioned concurrency on a published function alias and set the minimum provisioned instances to the baseline expected during spikes.

B. Increase the function memory size to the maximum and rely on the larger memory to eliminate cold starts.

C. Configure an ALB with target group health checks to keep Lambda warm by sending periodic requests.

D. Turn on AWS CloudTrail data events to monitor cold start frequency and tune the runtime accordingly.

Answer: A

Provisioned concurrency pre-initializes Lambda execution environments for a specific alias, reducing cold start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, keeping them warm and ready to handle requests instantly. By setting the minimum provisioned instances to the baseline expected during spikes, the function avoids cold starts for those requests, ensuring p95 latency stays low even when traffic surges from near zero.

Exam trap

The trap here is that candidates may confuse provisioned concurrency with reserved concurrency, or assume that increasing memory or using health checks can eliminate cold starts, when only provisioned concurrency guarantees pre-warmed environments for predictable spikes.

How to eliminate wrong answers

Option B is wrong because increasing memory size can improve CPU performance but does not eliminate cold starts; cold starts still occur when a new execution environment is created. Option C is wrong because ALB health checks send requests to the Lambda function, but they do not guarantee that the function stays warm for all concurrent invocations during spikes, and the health check interval (e.g., every 30 seconds) is insufficient to prevent cold starts when traffic spikes from zero. Option D is wrong because CloudTrail data events log API calls but do not prevent cold starts; they only provide monitoring data, not a solution to reduce latency.
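As a minimal sketch of what configuring provisioned concurrency on an alias involves, the helper below builds the keyword arguments for Lambda's `PutProvisionedConcurrencyConfig` API. The function name `orders-api`, the alias `live`, and the value 25 are hypothetical; the helper only assembles parameters and does not call AWS.

```python
def provisioned_concurrency_request(function_name: str, alias: str, environments: int) -> dict:
    """Build keyword arguments for Lambda's PutProvisionedConcurrencyConfig API."""
    if alias == "$LATEST":
        # Provisioned concurrency needs a published version or an alias.
        raise ValueError("use a published version or alias, not $LATEST")
    if environments < 1:
        raise ValueError("environments must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": alias,
        "ProvisionedConcurrentExecutions": environments,
    }


# Hypothetical function name, alias, and spike baseline.
params = provisioned_concurrency_request("orders-api", "live", 25)
print(params)
```

With boto3 installed and credentials configured, these parameters could be passed as `boto3.client("lambda").put_provisioned_concurrency_config(**params)`.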


Drag & Drop (medium)

Arrange the steps to migrate an on-premises database to Amazon RDS using AWS DMS.

Why this order

Source preparation first, then DMS infrastructure, endpoints, task, and monitoring/cutover.


Multi-Select (hard)

A distributed analytics engine runs 12 EC2 instances in one Availability Zone. The nodes exchange thousands of tiny messages per second and must keep jitter as low as possible. The current design launches the instances across multiple placement groups and uses general-purpose burstable instances. Which two changes will most directly lower east-west network latency and variability? Select two.

Select 2 answers

A. Move all instances into a cluster placement group.

B. Use instance families that provide high network bandwidth and support enhanced networking.

C. Spread the instances across three Availability Zones for better fault tolerance.

D. Front the nodes with an Application Load Balancer to balance the internal messages.

E. Store the messages on EBS volumes so the nodes avoid network communication.

Answers: A, B

Cluster placement groups pack instances closely together in a single Availability Zone, which minimizes network distance and improves latency consistency. This is the best placement strategy when the workload is highly chatty and needs very low jitter between nodes. It directly targets east-west performance.

Why this answer

A cluster placement group packs instances close together inside a single Availability Zone to provide low-latency, high-throughput networking between them. This shortens the network path between nodes, directly reducing east-west latency and jitter for the thousands of tiny messages per second. Moving from burstable general-purpose instances to families with higher network bandwidth and enhanced networking raises packets-per-second capacity and makes inter-node latency more consistent.

Exam trap

The trap here is that candidates often confuse 'fault tolerance' (spreading across AZs) with 'performance' (cluster placement group), or they mistakenly think a load balancer can optimize internal node-to-node traffic, when in fact it adds latency and is designed for client-facing traffic.


MCQ (medium)

A media processing service runs ECS tasks in multiple Availability Zones. Each task must read and write the same shared filesystem with low latency because tasks stream intermediate artifacts to other tasks. The team currently mounts an EBS volume per task, and cross-AZ tasks frequently cannot see each other’s files. Which option best resolves the shared filesystem requirement while supporting high-performing access?

A. Keep using EBS, but attach the same EBS volume to tasks in multiple Availability Zones using EBS multi-attach so all tasks share the filesystem.

B. Use Amazon EFS with mount targets in each Availability Zone so all tasks mount a common NFS filesystem over the AWS network.

C. Use Amazon S3 for the intermediate artifacts and rely on S3 event notifications to emulate POSIX file operations.

D. Switch to instance store on each task and use SQS messages between tasks to copy intermediate artifacts.

Answer: B

EFS is designed for shared, NFS-like file storage that can be mounted concurrently from compute resources across multiple Availability Zones. By creating mount targets in each AZ used by the ECS tasks, you enable low-latency network access patterns so tasks can read and write the same shared filesystem reliably.

Why this answer

Amazon EFS provides a fully managed, shared NFS filesystem that can be mounted concurrently by ECS tasks across multiple Availability Zones with low latency. It supports POSIX file operations, making it ideal for streaming intermediate artifacts between tasks. EFS mount targets in each AZ ensure local access, meeting the requirement for high-performing shared storage.

Exam trap

The trap here is that candidates may assume EBS multi-attach works across Availability Zones, but it is strictly limited to a single AZ and requires specific instance types, making it unsuitable for multi-AZ shared filesystem requirements.

How to eliminate wrong answers

Option A is wrong because EBS multi-attach is limited to a single Availability Zone and supports only up to 16 Nitro-based instances, not ECS tasks across multiple AZs, and does not provide a shared filesystem for cross-AZ access. Option C is wrong because Amazon S3 is an object store, not a POSIX-compliant filesystem; it lacks low-latency file locking and streaming semantics required for intermediate artifact sharing between tasks. Option D is wrong because instance store is ephemeral and tied to a single EC2 instance, and using SQS for copying artifacts introduces latency and complexity, failing to provide a low-latency shared filesystem.


Multi-Select (medium)

A marketing site serves versioned JavaScript and CSS from an Amazon S3 origin through Amazon CloudFront. After each release, the cache hit ratio drops sharply because clients keep sending request headers and query strings that are not needed for asset retrieval. Which two changes should improve cache efficiency the most? Select two.

Select 2 answers

A. Create a CloudFront cache policy that excludes unnecessary headers, query strings, and cookies from the cache key.

B. Use versioned filenames or content hashes for static assets and apply long-lived immutable caching.

C. Move the S3 origin behind an Application Load Balancer so CloudFront can cache responses more effectively.

D. Store the objects in Amazon S3 Standard-IA so repeated requests are cheaper.

E. Lower the CloudFront TTL to zero so viewers always receive the newest content immediately.

Answers: A, B

CloudFront uses the cache key to decide whether two requests can share the same cached object. If irrelevant headers, query strings, or cookies are included, the same file is cached as many variants and the hit ratio drops.

Why this answer

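Option A is correct because CloudFront cache policies allow you to explicitly control which headers, query strings, and cookies are included in the cache key. By excluding unnecessary ones (e.g., User-Agent, random query parameters), you prevent cache fragmentation and ensure that identical assets served with different request metadata map to the same cached object, which raises the cache hit ratio. Option B is correct because versioned filenames or content hashes give each changed asset a new URL, so assets can be cached with long TTLs without invalidations, and files that did not change in a release keep their cached copies.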

Exam trap

The trap here is that candidates often confuse cache invalidation strategies (like lowering TTL) with cache efficiency improvements, not realizing that excluding unnecessary cache key components is the direct mechanism to reduce cache misses.
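The effect of the cache key on hit ratio can be illustrated with a simplified model. This is a sketch, not how CloudFront computes keys internally; the `cache_key` helper, the asset path, and the header values are all hypothetical.

```python
from urllib.parse import urlsplit


def cache_key(url, headers, include_query=False, header_allowlist=()):
    """Return a simplified cache key: path, plus optional query string and headers."""
    parts = urlsplit(url)
    key = [parts.path]
    if include_query:
        key.append(parts.query)
    for name in sorted(header_allowlist):
        key.append(f"{name}={headers.get(name, '')}")
    return tuple(key)


# Two hypothetical requests for the same hashed asset.
req_a = ("/assets/app.3f9c2a.js?utm_source=mail", {"user-agent": "Firefox"})
req_b = ("/assets/app.3f9c2a.js?utm_source=ads", {"user-agent": "Chrome"})
requests = (req_a, req_b)

# Broad key: query string and User-Agent included, so the requests never share an entry.
broad = {cache_key(u, h, include_query=True, header_allowlist=("user-agent",)) for u, h in requests}
# Narrow key: path only, so both requests map to one cached object.
narrow = {cache_key(u, h) for u, h in requests}

print(len(broad), len(narrow))  # prints: 2 1
```

In this model the broad key stores two variants of one file, while the narrow key stores one, which is the fragmentation the cache policy in option A removes.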


MCQ (hard)

Based on the exhibit, a DynamoDB-backed event processing system is throttling during a promotion. The table uses tenantId as the partition key and eventTime as the sort key. One tenant accounts for most of the write traffic, and the application must preserve fast lookups for that tenant without relying on a single hot partition. What change is the best fix?

A. Add a sharding suffix to the partition key, such as tenantId#shardId, and query across the tenant's shards.

B. Enable DynamoDB Streams so the table can process writes more quickly.

C. Switch the table to on-demand capacity mode and keep the same key design.

D. Add a global secondary index on eventTime and query the index instead of the base table.

Answer: A

Sharding the partition key spreads the hot tenant's traffic (ACME in the exhibit) across multiple partitions, which removes the hot key problem. Because the application still needs tenant-scoped time-range queries, it can fan out across the shard values and merge results.

Why this answer

Option A is correct because adding a sharding suffix (e.g., tenantId#shardId) to the partition key distributes write traffic for the hot tenant across multiple partitions, eliminating the single-partition bottleneck while preserving fast lookups by querying across all shards for that tenant. DynamoDB's partition key determines physical storage; without sharding, all writes for the hot tenant land on one partition, causing throttling even if the table has sufficient total capacity.

Exam trap

The trap here is that candidates often assume on-demand mode (Option C) eliminates all throttling, but it does not resolve the physical partition limit—a single hot partition still caps at 1,000 WCU/3,000 RCU, so throttling persists regardless of capacity mode.

How to eliminate wrong answers

Option B is wrong because enabling DynamoDB Streams does not increase write throughput; it captures item-level changes asynchronously and does not alleviate throttling caused by a hot partition. Option C is wrong because switching to on-demand capacity mode only removes the need to provision capacity manually, but it does not solve the underlying hot partition issue—DynamoDB still throttles if a single partition exceeds 1,000 WCU or 3,000 RCU, regardless of capacity mode. Option D is wrong because adding a GSI on eventTime does not distribute write load; the base table's partition key remains tenantId, so the hot tenant still causes throttling on the base table, and the GSI inherits the same write patterns.
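The write-sharding pattern from option A can be sketched as follows. The shard count of 8, the use of an event ID as the hashing input, and the helper names are assumptions for illustration; a real application would run one DynamoDB `Query` per shard key and then merge the results.

```python
import hashlib
import heapq

SHARD_COUNT = 8  # assumed; size it to the hot tenant's write rate


def sharded_partition_key(tenant_id, event_id, shard_count=SHARD_COUNT):
    """Pick a shard deterministically from the event ID and append it to the tenant ID."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{tenant_id}#{shard}"


def tenant_shard_keys(tenant_id, shard_count=SHARD_COUNT):
    """List every partition key value a tenant-scoped read must query."""
    return [f"{tenant_id}#{shard}" for shard in range(shard_count)]


def merge_by_event_time(per_shard_items):
    """Merge per-shard results, each already sorted by eventTime, into one ordered list."""
    return list(heapq.merge(*per_shard_items, key=lambda item: item["eventTime"]))


# Hypothetical per-shard query results, each sorted by the sort key.
merged = merge_by_event_time([
    [{"eventTime": "2024-05-01T10:00:00Z"}, {"eventTime": "2024-05-01T10:02:00Z"}],
    [{"eventTime": "2024-05-01T10:01:00Z"}],
])
print([item["eventTime"] for item in merged])
```

The trade-off is that every tenant-scoped read becomes `SHARD_COUNT` queries, so the shard count should be no larger than the write rate requires.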


MCQ (medium)

A media archive requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable? The architecture review board prefers a managed AWS-native control.

A. Provisioned IOPS SSD such as io2

B. st1 Throughput Optimized HDD

C. Instance store only

D. sc1 Cold HDD

Answer: A

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

The io2 Provisioned IOPS SSD volume type is designed for latency-sensitive transactional workloads that require consistent, high IOPS. It is designed for 99.999% durability and, as io2 Block Express, supports up to 256,000 IOPS per volume, making it a good fit for a database on EC2 that demands predictable performance. As a managed AWS-native EBS volume, it aligns with the architecture review board's preference for a fully AWS-controlled storage solution.

Exam trap

The trap here is that candidates often confuse throughput-optimized HDD (st1) with IOPS requirements, mistakenly thinking high throughput equals high IOPS, but transactional databases need random I/O performance, not sequential throughput.

How to eliminate wrong answers

Option B (st1 Throughput Optimized HDD) is wrong because it is optimized for sequential throughput, not random IOPS, and cannot deliver the consistent high IOPS required by a transactional database. Option C (Instance store only) is wrong because instance store volumes are ephemeral and data is lost on instance stop or termination, making them unsuitable for a persistent database. Option D (sc1 Cold HDD) is wrong because it is designed for infrequently accessed data with the lowest cost and lowest IOPS, far below the needs of a transactional database.


MCQ (easy)

A media company uses CloudFront in front of an S3 bucket origin for video thumbnails. They want to prevent users from bypassing CloudFront and accessing the S3 bucket directly, while still allowing CloudFront to fetch objects. What is the best option?

A. Keep the bucket public and rely on signed cookies for all thumbnail requests.

B. Use CloudFront Origin Access Control (OAC) or Origin Access Identity (OAI) and update the bucket policy to allow only CloudFront.

C. Enable S3 static website hosting so users access thumbnails directly from the S3 website endpoint.

D. Set S3 bucket permissions to allow all IAM users and block access only by using a WAF rule at CloudFront.

Answer: B

OAC/OAI ensures only CloudFront can access the bucket while keeping the bucket private.

Why this answer

CloudFront Origin Access Control (OAC) or Origin Access Identity (OAI) lets you keep the S3 bucket private and restrict direct access to it. With OAC, the bucket policy grants read permission to the CloudFront service principal, scoped to the specific distribution; with the legacy OAI, it grants read permission to the OAI itself. Either way, users can retrieve thumbnails only through CloudFront, with its caching and security features, and direct S3 requests are denied.

Exam trap

The trap here is that candidates often think signed cookies or URLs alone are sufficient to secure direct S3 access, but they forget that those mechanisms only control access through CloudFront and do not restrict the S3 bucket's public endpoint unless the bucket policy explicitly denies direct access.

How to eliminate wrong answers

Option A is wrong because making the bucket public and relying on signed cookies does not prevent users from bypassing CloudFront and accessing the S3 bucket directly via its public URL; signed cookies only control access through CloudFront, not direct S3 access. Option C is wrong because enabling S3 static website hosting exposes a separate website endpoint that users could access directly, defeating the purpose of restricting access to CloudFront. Option D is wrong because setting S3 bucket permissions to allow all IAM users does not block direct access; WAF rules at CloudFront only filter traffic reaching CloudFront, not requests made directly to the S3 bucket endpoint.
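For the OAC variant, the bucket policy typically allows `s3:GetObject` for the CloudFront service principal, conditioned on the distribution's ARN. The sketch below builds such a policy document; the bucket name, account ID, and distribution ID are placeholders, and the helper name is hypothetical.

```python
import json


def oac_bucket_policy(bucket, account_id, distribution_id):
    """Build an S3 bucket policy that lets one CloudFront distribution read objects."""
    distribution_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Without this condition, any distribution could read the bucket.
                "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
            }
        ],
    }


# Placeholder identifiers for illustration only.
policy = oac_bucket_policy("example-thumbnails", "111122223333", "EDFDVBD6EXAMPLE")
print(json.dumps(policy, indent=2))
```

The bucket's Block Public Access settings stay enabled; the policy grants access to CloudFront only, so requests sent straight to the S3 endpoint are refused.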


MCQ (easy)

Your application uses ElastiCache Redis as a cache for user profiles stored in DynamoDB. You must ensure that when a profile is updated, subsequent reads see the latest value quickly. Which cache strategy is generally the best fit for this requirement?

A. Write to DynamoDB only, and never update or invalidate the Redis cache.

B. Use a cache-aside approach with TTL plus explicit invalidation after writes.

C. Cache only for reads, and do not fetch from DynamoDB when a key is missing.

D. Rely on eventual consistency of Redis replication to propagate updates to all nodes.

Answer: B

A cache-aside (lazy loading) pattern reads from cache first; if missing/expired, it fetches from the source of truth. After an update, explicitly invalidating or updating the cached entry ensures subsequent reads quickly reflect changes. TTL provides protection against missed invalidations while invalidation accelerates correctness after writes.

Why this answer

Option B is correct because a cache-aside (lazy loading) strategy with TTL and explicit invalidation ensures that after a write to DynamoDB, the stale Redis entry is removed, forcing the next read to fetch the fresh profile from DynamoDB and repopulate the cache. This combination minimizes the window of stale reads while maintaining high read performance, which is critical for user profile caches where consistency matters.

Exam trap

The trap here is that candidates often confuse eventual consistency within Redis replication (which only applies to Redis-to-Redis sync) with the need to synchronize the cache with the authoritative data store (DynamoDB), leading them to pick option D, which does not address the core requirement of reflecting DynamoDB updates in the cache.

How to eliminate wrong answers

Option A is wrong because never updating or invalidating the Redis cache means stale data persists indefinitely, violating the requirement that subsequent reads see the latest value quickly. Option C is wrong because caching only for reads and not fetching from DynamoDB when a key is missing would result in cache misses returning no data, effectively breaking the application's ability to serve user profiles. Option D is wrong because relying on eventual consistency of Redis replication does not guarantee that updates to DynamoDB are reflected in the cache; Redis replication only synchronizes data between Redis nodes, not between DynamoDB and Redis, and does not address cache invalidation after writes.
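The cache-aside pattern with TTL and explicit invalidation can be sketched in a few lines. Here a plain dict stands in for DynamoDB and another for Redis; the `ProfileCache` class and its method names are illustrative assumptions, not an ElastiCache or DynamoDB API.

```python
import time


class ProfileCache:
    """Cache-aside with TTL and invalidate-on-write, using in-memory stand-ins."""

    def __init__(self, store, ttl_seconds=300.0, clock=time.monotonic):
        self._store = store        # stands in for the DynamoDB table
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # stands in for Redis: key -> (expires_at, value)

    def get(self, user_id):
        entry = self._entries.get(user_id)
        if entry is not None and entry[0] > self._clock():
            return entry[1]        # cache hit, still within TTL
        value = self._store[user_id]   # miss or expired: read the source of truth
        self._entries[user_id] = (self._clock() + self._ttl, value)
        return value

    def update(self, user_id, profile):
        self._store[user_id] = profile     # write to the source of truth first
        self._entries.pop(user_id, None)   # then invalidate the cached copy


store = {"u1": {"name": "Ann"}}
cache = ProfileCache(store)
cache.get("u1")                       # populates the cache
cache.update("u1", {"name": "Bea"})   # write plus invalidation
print(cache.get("u1"))                # next read reloads the fresh profile
```

The TTL only matters when a write bypasses `update`; in that case the stale entry survives until it expires, which is the safety net the answer describes.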


MCQ (easy)

An ECS service runs on EC2 capacity. During peak traffic, tasks frequently wait for available container instances. The team wants faster scale-out for the underlying EC2 capacity when tasks increase. What is the best first architectural step?

A. Tune the container health check settings so tasks stop failing and stay running.

B. Use an ECS capacity provider (or Auto Scaling integration) to scale the EC2 instances based on ECS demand.

C. Pin all tasks to a single Availability Zone to reduce placement overhead.

D. Switch the tasks to run only on Fargate so EC2 scaling is no longer relevant.

Answer: B

When ECS tasks need compute, capacity must scale at the EC2 layer so there are enough container instances to place tasks. Integrating ECS with an Auto Scaling capacity provider allows the cluster to scale out in response to pending tasks. This reduces waiting time and improves responsiveness under load.

Why this answer

Option B is correct because an ECS capacity provider (or Auto Scaling integration) directly links ECS task-level demand to EC2 instance scaling. When tasks are pending due to insufficient container instances, the capacity provider triggers a scale-out event on the Auto Scaling group, adding EC2 instances to accommodate the workload. This is the most efficient architectural step to reduce placement delays during peak traffic.

Exam trap

The trap here is that candidates may confuse task-level scaling (e.g., Service Auto Scaling) with infrastructure-level scaling, and incorrectly assume that tuning health checks or placement strategies will resolve a capacity shortage caused by insufficient EC2 instances.

How to eliminate wrong answers

Option A is wrong because tuning container health check settings addresses task failures, not the underlying shortage of EC2 container instances; tasks waiting for available instances is a capacity issue, not a health check issue. Option C is wrong because pinning all tasks to a single Availability Zone increases risk of failure and does not solve the capacity shortage; placement overhead is negligible compared to the lack of instances. Option D is wrong because switching to Fargate is a migration, not an architectural step to improve EC2 scaling; it avoids the EC2 scaling problem rather than solving it, and may not be feasible or cost-effective for all workloads.


MCQ (easy)

An application uses DynamoDB to store order status. Reads happen extremely frequently for the same few keys (for example, the most recent orders), and the team wants lower read latency without changing the table’s partition key design. Which AWS service best fits this requirement?

A. Amazon DAX (DynamoDB Accelerator) to cache frequently read items

B. Provision AWS WAF rules to reduce DynamoDB read latency caused by bots

C. Enable multi-region writes in DynamoDB Global Tables to speed up reads locally

D. Add more read capacity units to DynamoDB and avoid caching entirely

Answer: A

DAX is an in-memory caching layer specifically built for DynamoDB. It reduces read latency for hot keys by serving cached responses quickly while still reading from DynamoDB when a key is not cached (or when the cached entry expires). This avoids the need to redesign partition keys.

Why this answer

Amazon DAX (DynamoDB Accelerator) is an in-memory cache that sits between your application and DynamoDB, providing microsecond read latency for frequently accessed items. Because the workload involves extremely frequent reads of the same few keys (hot keys), DAX reduces the load on the DynamoDB table and delivers faster responses without requiring any changes to the partition key design.

Exam trap

The trap here is that candidates often confuse throughput scaling (adding RCUs) with latency optimization, or they mistakenly think Global Tables improve read latency within a single region, when in fact DAX is the only option that directly caches hot keys to reduce read latency without altering the table design.

How to eliminate wrong answers

Option B is wrong because AWS WAF is a web application firewall that protects against web exploits and bots at the HTTP/HTTPS layer, but it does not reduce DynamoDB read latency or cache data. Option C is wrong because DynamoDB Global Tables replicate data across regions for disaster recovery and local writes, but they do not reduce read latency for a single-region application and still require reads to go to the DynamoDB API without caching. Option D is wrong because simply adding more read capacity units increases throughput but does not lower latency for hot keys; the read requests still hit the DynamoDB service, and without caching, the same hot keys continue to experience the same base latency.
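A rough way to see why a cache helps hot keys while extra RCUs do not is to compute the expected read latency as a weighted average of hit and miss latency. The latency figures below are assumed for illustration, not measured DAX or DynamoDB numbers, and the model ignores the extra cache lookup on a miss.

```python
def expected_read_latency_ms(hit_ratio, cache_ms, table_ms):
    """Weighted average of cache-hit latency and table-read latency."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * table_ms


# Assumed figures: 0.5 ms for a cache hit, 5 ms for a table read.
for ratio in (0.0, 0.5, 0.9):
    print(ratio, round(expected_read_latency_ms(ratio, 0.5, 5.0), 3))
```

Adding read capacity leaves `table_ms` unchanged, so only a higher hit ratio moves the average, which is why hot-key workloads benefit most.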


MCQ (easy)

Based on the exhibit, which AWS feature should the team use to minimize network latency between EC2 instances that exchange messages very frequently?

A. Use a spread placement group to maximize instance separation across hardware.

B. Use a cluster placement group to place instances close together.

C. Use a partition placement group to distribute instances across many partitions.

D. Use multiple Auto Scaling groups to spread traffic across more subnets.

Answer: B

A cluster placement group is designed for workloads that need very low network latency and high packet-per-second performance between instances. The exhibit describes frequent small-message traffic and a need for the lowest possible latency, which makes a cluster placement group the right choice. It keeps instances physically close in the AWS network for faster communication.

Why this answer

A cluster placement group is the correct choice because it groups EC2 instances within a single Availability Zone on low-latency, high-throughput networking. This is ideal for applications that exchange messages very frequently, because it shortens the network path between instances and supports high packet-per-second rates.

Exam trap

The trap here is that candidates may confuse placement group types, incorrectly assuming a spread or partition group reduces latency when they actually prioritize fault isolation over network performance.

How to eliminate wrong answers

Option A is wrong because a spread placement group maximizes instance separation across distinct hardware to reduce correlated failures, which works against the proximity that high-frequency messaging needs. Option C is wrong because a partition placement group distributes instances across logical partitions to isolate failures in large distributed systems, but it does not optimize for low latency between instances. Option D is wrong because multiple Auto Scaling groups across more subnets do not control how close instances are placed to each other; spreading nodes over subnets in different Availability Zones tends to increase inter-node latency.


MCQ (easy)

A travel booking site uses EC2 instances behind an ALB. CPU is consistently high during peak traffic, and request latency rises. What should be configured? The design must avoid adding custom operational scripts.

A. A VPC endpoint for CloudWatch only

B. Auto Scaling policy based on an appropriate CloudWatch metric

C. S3 Object Lock

D. Disable health checks

Answer: B

Auto Scaling adds capacity when load increases and removes it when load falls.

Why this answer

An Auto Scaling policy based on a CloudWatch metric like CPUUtilization or request latency directly addresses the high CPU and rising latency by automatically adding EC2 instances during peak traffic. This eliminates the need for custom scripts and ensures the application scales horizontally to maintain performance.

Exam trap

The trap here is that candidates might think a VPC endpoint (Option A) is needed for CloudWatch metrics, but CloudWatch metrics are already available without a VPC endpoint, and scaling requires an Auto Scaling policy, not just metric access.

How to eliminate wrong answers

Option A is wrong because a VPC endpoint for CloudWatch only enables private connectivity to CloudWatch, but does not provide any scaling or performance improvement for EC2 instances. Option C is wrong because S3 Object Lock is used for data retention and compliance, not for scaling compute resources or reducing latency. Option D is wrong because disabling health checks would cause the ALB to route traffic to unhealthy instances, worsening latency and availability issues.
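One common form of such a policy is target tracking on average CPU. The sketch below builds the keyword arguments for the EC2 Auto Scaling `PutScalingPolicy` API; the group name `booking-web-asg`, the policy name, and the 50% target are assumptions, and nothing is sent to AWS.

```python
def cpu_target_tracking_policy(asg_name, target_cpu_percent=50.0):
    """Build keyword arguments for a target tracking policy on average CPU."""
    if not 0.0 < target_cpu_percent <= 100.0:
        raise ValueError("target_cpu_percent must be in (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu_percent,
        },
    }


# Hypothetical Auto Scaling group name and target.
policy_args = cpu_target_tracking_policy("booking-web-asg", 50.0)
print(policy_args)
```

With boto3 available, the dictionary could be passed as `boto3.client("autoscaling").put_scaling_policy(**policy_args)`; the service then adds or removes instances to hold the metric near the target, with no custom scripts.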


MCQ (easy)

A system uses multiple AWS Lambda functions behind different event sources. One Lambda occasionally spikes and causes other Lambdas to be throttled due to shared concurrency limits. Which setting best helps ensure the important Lambda keeps capacity during spikes?

A. Increase the function timeout so throttling is less likely.

B. Set Reserved Concurrency for the important Lambda function.

C. Enable Provisioned Concurrency for every Lambda in the account.

D. Reduce the number of IAM policies attached to the Lambda roles.

Answer: B

Reserved concurrency allocates a guaranteed amount of concurrent execution capacity to a specific Lambda. This prevents other functions from consuming all concurrency and throttling the important one. If the reserved limit is reached, only that function is throttled, isolating impact.

Why this answer

Reserved Concurrency guarantees a set number of concurrent executions for a specific Lambda function, isolating it from the account-level concurrency pool. This ensures that the important function always has capacity available, even when other functions spike and consume the shared pool. Without this setting, all functions compete for the same 1,000 concurrent executions (default regional limit), and a spike in one can throttle others.

Exam trap

The trap here is that candidates confuse Provisioned Concurrency (which reduces cold starts) with Reserved Concurrency (which guarantees capacity), leading them to choose Option C, even though Provisioned Concurrency does not protect against throttling from other functions.

How to eliminate wrong answers

Option A is wrong because increasing the function timeout does not affect concurrency limits; it only extends how long a single invocation can run, which could actually increase the chance of throttling by holding concurrency slots longer. Option C is wrong because Provisioned Concurrency is aimed at reducing cold starts, not at isolating one function from another; enabling it for every function in the account adds cost, still counts toward the account concurrency limit, and does not single out the important function for protection. Option D is wrong because reducing IAM policies affects permissions, not concurrency limits; it has no impact on Lambda's throttling behavior.
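The arithmetic behind reserved concurrency is simple: whatever is reserved for one function is subtracted from the pool the others share. The sketch below assumes the default limit of 1,000 and Lambda's requirement that at least 100 executions remain unreserved; the function name and the value 200 are hypothetical.

```python
ACCOUNT_LIMIT = 1000   # default regional concurrency limit
MIN_UNRESERVED = 100   # Lambda keeps at least this much unreserved


def unreserved_pool(reservations, account_limit=ACCOUNT_LIMIT):
    """Return the concurrency left for functions without a reservation."""
    remaining = account_limit - sum(reservations.values())
    if remaining < MIN_UNRESERVED:
        raise ValueError("reservations leave less than the minimum unreserved pool")
    return remaining


def reserved_concurrency_request(function_name, reserved):
    """Build keyword arguments for Lambda's PutFunctionConcurrency API."""
    return {"FunctionName": function_name, "ReservedConcurrentExecutions": reserved}


# Hypothetical reservation for the important function.
reservations = {"checkout-handler": 200}
print(unreserved_pool(reservations))                      # prints: 800
print(reserved_concurrency_request("checkout-handler", 200))
```

The reservation works in both directions: the important function always has 200 executions available, and it can never use more than 200.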


MCQ (hard)

A DynamoDB table for a retail API has a partition key based only on the current date. Write throttling occurs during business hours. What is the best design change? The design must avoid adding custom operational scripts.

A. Use a higher-cardinality partition key that distributes writes across partitions

B. Create a global secondary index with the same date key

C. Reduce the table's write capacity

D. Move the table to S3 Glacier Instant Retrieval

Answer: A

A low-cardinality hot partition causes throttling; a better key spreads writes more evenly.

Why this answer

Using only the current date as a partition key creates a 'hot partition' because all writes for the day target a single partition, exceeding its 1,000 WCU limit. A higher-cardinality partition key (e.g., combining date with user ID or order ID) distributes writes evenly across partitions, eliminating throttling without custom scripts.

Exam trap

The trap here is that candidates often confuse GSIs as a solution for write hot spots, but GSIs only help with read patterns and do not change the base table's write distribution.

How to eliminate wrong answers

Option B is wrong because a global secondary index (GSI) keyed on the same date does not change how writes land on the base table, and the index itself would have the same hot key; a throttled GSI can also cause writes to the base table to be throttled. Option C is wrong because reducing write capacity would worsen throttling, not solve it. Option D is wrong because S3 Glacier Instant Retrieval is for archival data with infrequent access, not for a DynamoDB table requiring low-latency writes for a retail API.
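The skew created by a date-only key can be made concrete by measuring what share of writes lands on the single busiest key value. This is a toy model: distinct key values do not map one-to-one onto physical partitions, and the order IDs and date are invented for illustration.

```python
from collections import Counter


def hottest_key_share(partition_keys):
    """Return the fraction of writes that target the most frequent key value."""
    counts = Counter(partition_keys)
    return max(counts.values()) / len(partition_keys)


# 1,000 hypothetical orders written on the same day.
orders = [("2024-05-01", f"order-{n}") for n in range(1000)]

date_only = [date for date, _ in orders]
date_and_order = [f"{date}#{order_id}" for date, order_id in orders]

print(hottest_key_share(date_only))       # prints: 1.0
print(hottest_key_share(date_and_order))  # prints: 0.001
```

With the date-only key every write targets one key value, while the composite key gives DynamoDB many values to spread across partitions.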

MCQmedium

A media processing pipeline uses EBS-backed storage for an application that performs sustained random I/O with low latency requirements. During peak processing windows, the team sees increased read latency and occasional timeouts at the application layer. They need predictable, high IOPS performance rather than best-effort throughput. Which EBS configuration choice is most appropriate?

A.Use gp2 volumes and rely on burst credits to handle peak random I/O latency requirements.

B.Use io1 or io2 EBS volumes configured with a high provisioned IOPS value, and attach them to EBS-optimized instances.

C.Use standard HDD (st1) volumes, because they provide high throughput and will reduce latency automatically.

D.Use S3 instead of EBS for random I/O latency reduction without changing the application.

AnswerB

io1/io2 are designed for predictable, low-latency IOPS for sustained I/O workloads. By provisioning a sufficient IOPS level, you improve consistency during peak windows. Using EBS-optimized instances ensures the instance-to-EBS bandwidth and I/O performance are adequate so the instance does not become the bottleneck before EBS can deliver the provisioned IOPS.

Why this answer

Option B is correct because io1 and io2 volumes are provisioned IOPS SSD volumes designed for sustained, predictable high IOPS performance, which directly addresses the application's need for low-latency random I/O during peak loads. Attaching them to EBS-optimized instances provides dedicated bandwidth for EBS traffic, so contention with other network traffic does not keep the volumes from delivering their provisioned IOPS.

Exam trap

The trap here is that candidates may choose gp2 (Option A) assuming burst credits will cover peak loads, but they fail to recognize that sustained peak I/O exhausts credits, leading to performance degradation, whereas provisioned IOPS volumes guarantee consistent performance regardless of duration.

How to eliminate wrong answers

Option A is wrong because gp2 volumes rely on burst credits that can be exhausted during sustained peak I/O, leading to throttled performance and increased latency, not predictable high IOPS. Option C is wrong because st1 volumes are HDD-based and optimized for sequential throughput, not random I/O; they cannot provide low latency or high IOPS for random access patterns. Option D is wrong because S3 is an object storage service with higher latency and no support for low-latency random I/O; it cannot replace EBS for block-level access without significant application changes.
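To see why burst credits do not cover a sustained peak, a back-of-the-envelope calculation helps. The sketch assumes the documented gp2 model (a baseline of 3 IOPS per GiB with a 100 IOPS floor, bursts up to 3,000 IOPS, and a bucket of 5.4 million I/O credits); the function names are illustrative only.

```python
GP2_CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket
GP2_BURST_IOPS = 3_000         # burst ceiling for smaller volumes
GP2_MAX_IOPS = 16_000          # per-volume maximum


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS: 3 per GiB, floored at 100 and capped at 16,000."""
    return min(max(100, 3 * size_gib), GP2_MAX_IOPS)


def gp2_burst_seconds(size_gib: int) -> float:
    """Seconds a full bucket sustains 3,000 IOPS before falling to baseline."""
    baseline = gp2_baseline_iops(size_gib)
    if baseline >= GP2_BURST_IOPS:
        return float("inf")  # baseline already meets the burst ceiling
    # Credits refill at the baseline rate while they drain at the burst rate.
    return GP2_CREDIT_BUCKET / (GP2_BURST_IOPS - baseline)


for size in (100, 500, 1000):
    print(size, gp2_baseline_iops(size), gp2_burst_seconds(size))
```

Under these assumptions a 100 GiB gp2 volume can hold 3,000 IOPS for about 33 minutes and then drops to 300 IOPS, which is why a long processing window calls for provisioned IOPS instead.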

MCQmedium

A media archive requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable?

A.Provisioned IOPS SSD such as io2

B.st1 Throughput Optimized HDD

C.Instance store only

D.sc1 Cold HDD

AnswerA

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

The scenario requires consistent high IOPS for a transactional database, which demands low-latency, predictable performance. Provisioned IOPS SSD volumes like io2 are designed specifically for such workloads, offering up to 256,000 IOPS per volume with 99.999% durability, making them the most suitable choice for consistent high IOPS.

Exam trap

The trap here is that candidates often confuse throughput-optimized HDD (st1) with IOPS-focused workloads, mistakenly thinking high throughput equals high IOPS, but IOPS measures random access operations while throughput measures sequential data transfer, and transactional databases require low-latency random I/O.

How to eliminate wrong answers

Option B (st1 Throughput Optimized HDD) is wrong because it is a throughput-optimized HDD volume designed for large, sequential workloads like big data and log processing, not for transactional databases requiring consistent high IOPS and low latency. Option C (Instance store only) is wrong because instance store volumes provide ephemeral storage that is not persistent; data is lost if the instance stops or terminates, making it unsuitable for a transactional database that requires data durability. Option D (sc1 Cold HDD) is wrong because it is a cold HDD volume optimized for infrequently accessed data with the lowest cost, offering very low IOPS and throughput, which cannot meet the consistent high IOPS demands of a transactional database.

MCQmedium

A high-volume telemetry pipeline writes streaming click events that must be processed by multiple independent consumers. Which service is most appropriate? The design must avoid adding custom operational scripts.

A.Amazon Kinesis Data Streams

B.AWS DataSync

C.Amazon EBS

D.Amazon Route 53

AnswerA

Kinesis Data Streams supports high-throughput event ingestion with multiple consumers reading from the stream.

Why this answer

Amazon Kinesis Data Streams is designed for real-time streaming of high-volume data, such as click events, and allows multiple independent consumers to process the same stream concurrently via enhanced fan-out or shared throughput. It provides durable, ordered data retention and integrates with AWS Lambda, Kinesis Data Analytics, and Kinesis Data Firehose without requiring custom operational scripts.
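The multi-consumer property can be pictured with a toy in-memory model. This is not the Kinesis API: `ToyStream`, its `read` method, and the consumer names are hypothetical, and the model only shows that each consumer keeps its own position while the records stay in the stream.

```python
from collections import defaultdict


class ToyStream:
    """In-memory stand-in for one shard of a stream."""

    def __init__(self) -> None:
        self._records: list[str] = []
        # Each consumer has its own read position, starting at 0.
        self._positions: dict[str, int] = defaultdict(int)

    def put(self, record: str) -> None:
        self._records.append(record)

    def read(self, consumer: str, limit: int = 100) -> list[str]:
        """Return the next batch for this consumer without removing records."""
        start = self._positions[consumer]
        batch = self._records[start:start + limit]
        self._positions[consumer] = start + len(batch)
        return batch


stream = ToyStream()
for i in range(5):
    stream.put(f"click-{i}")

analytics = stream.read("analytics")         # reads all five events
fraud_first = stream.read("fraud", limit=2)  # reads at its own pace
fraud_rest = stream.read("fraud")            # continues where it left off
```

Unlike a queue, reading does not delete anything, so adding another consumer does not take records away from the existing ones.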

Exam trap

The trap here is that candidates may confuse AWS DataSync or EBS as viable for streaming data, but DataSync is for batch file transfers and EBS is for block storage, neither supporting real-time, multi-consumer event processing.

How to eliminate wrong answers

Option B (AWS DataSync) is wrong because it is a data transfer service for moving large datasets between on-premises storage and AWS services (e.g., S3, EFS) over the internet or Direct Connect, not a real-time streaming pipeline for multiple consumers. Option C (Amazon EBS) is wrong because it provides block-level storage volumes for EC2 instances, not a streaming data ingestion or processing service, and cannot support multiple independent consumers reading the same event stream. Option D (Amazon Route 53) is wrong because it is a DNS web service for domain name resolution and traffic routing, not a data streaming or processing service.

Multi-Selectmedium

A media company is designing a high-performance architecture to serve video content to users worldwide. The solution must minimize latency for end users and reduce the load on the origin servers. The video files are stored in an Amazon S3 bucket. Which three options should be combined to meet these requirements? (Choose three.)

Select 3 answers

A.Use Amazon CloudFront as a content delivery network (CDN) with the S3 bucket as the origin.

B.Enable S3 Transfer Acceleration on the bucket to speed up uploads.

C.Configure CloudFront to use Regional Edge Caches to improve cache hit ratios for less popular content.

D.Use Amazon ElastiCache for Memcached to cache video metadata at the edge.

E.Enable S3 default encryption using AWS KMS to improve data transfer performance.

F.Implement origin shield in CloudFront to reduce the number of requests sent to the S3 origin.

Why this answer

Amazon CloudFront as a CDN with the S3 bucket as the origin minimizes latency by caching video content at edge locations worldwide, serving users from the nearest edge. This reduces load on the origin S3 bucket by handling requests at the edge. Regional Edge Caches further improve cache hit ratios for less popular content by caching it at regional locations, reducing the need to fetch from the origin.

Origin shield in CloudFront consolidates requests from multiple edge locations into a single request to the S3 origin, significantly reducing the number of direct requests and lowering origin load.

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration (which optimizes uploads) with CloudFront (which optimizes downloads), or think that ElastiCache can be used as a CDN for video content, when it is actually an in-memory cache for application data, not for serving static files at the edge.

Multi-Selecthard

A serverless checkout API uses AWS Lambda behind API Gateway. Every weekday at 09:00 UTC, marketing triggers a predictable surge. The first few minutes after each surge show cold-start latency, but traffic volume is forecastable and the business wants stable p95 latency. Which two changes should the team implement? Select two.

Select 2 answers

A.Publish a Lambda version and attach provisioned concurrency to an alias that points to that version.

B.Use Application Auto Scaling scheduled actions to raise provisioned concurrency before 09:00 UTC and lower it afterward.

C.Increase the Lambda timeout so the function has more time to initialize during the spike.

D.Double the memory size during the spike without changing the concurrency model.

E.Move the function into more Availability Zones so the platform can spread cold starts across regions.

AnswersA, B

Provisioned concurrency keeps execution environments initialized and ready to serve requests, which is the correct way to reduce cold starts. Using an alias tied to a published version is the standard deployment pattern for managing that setting safely. This directly improves p95 latency during predictable bursts.

Why this answer

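Provisioned concurrency keeps a specified number of Lambda execution environments initialized and ready to respond immediately, so requests served within the provisioned level avoid cold starts. By publishing a Lambda version and attaching provisioned concurrency to an alias pointing to that version (option A), the team has pre-warmed capacity in place for the 09:00 UTC surge, which stabilizes p95 latency. Application Auto Scaling scheduled actions (option B) raise the provisioned concurrency shortly before the surge and lower it afterward, so the team pays for pre-warmed environments only while the forecast traffic needs them.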
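A minimal sketch of how the scheduled scaling might be expressed as request parameters for the Application Auto Scaling API (for example, boto3's `register_scalable_target` and `put_scheduled_action` on the `application-autoscaling` client). The function name `checkout-api`, the alias `live`, the capacity values, and the 08:45 lead time are all assumptions; the sketch only builds dictionaries and makes no AWS calls.

```python
FUNCTION_NAME = "checkout-api"  # hypothetical function name
ALIAS = "live"                  # hypothetical alias pointing at a published version

RESOURCE_ID = f"function:{FUNCTION_NAME}:{ALIAS}"
DIMENSION = "lambda:function:ProvisionedConcurrency"


def scalable_target(min_capacity: int, max_capacity: int) -> dict:
    """Parameters that register the alias as a scalable target."""
    return {
        "ServiceNamespace": "lambda",
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": DIMENSION,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def scheduled_action(name: str, cron: str, capacity: int) -> dict:
    """Parameters for one scheduled change of provisioned concurrency."""
    return {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": name,
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": DIMENSION,
        "Schedule": cron,  # evaluated in UTC unless a Timezone is supplied
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }


target = scalable_target(min_capacity=5, max_capacity=200)
# Raise capacity 15 minutes before the 09:00 UTC surge on weekdays.
scale_up = scheduled_action("pre-warm", "cron(45 8 ? * MON-FRI *)", 200)
# Drop back to a small floor once the surge has passed.
scale_down = scheduled_action("cool-down", "cron(30 9 ? * MON-FRI *)", 5)
```

The lead time matters because provisioned environments take a while to initialize; how much is needed depends on the function and is best measured, not assumed.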

Exam trap

The trap here is that candidates often confuse increasing Lambda timeout or memory with solving cold-start latency, but these settings do not pre-warm execution environments; only provisioned concurrency (and optionally scheduled scaling) directly eliminates cold starts for predictable surges.

MCQeasy

A latency-sensitive trading workload runs on 6 EC2 instances. You must distribute the instances so they do NOT share the same underlying hardware rack, reducing the risk of correlated rack-level faults. Which EC2 placement group strategy best meets this requirement?

A.Cluster placement group

B.Spread placement group

C.Partition placement group

D.No placement group, rely on the default scheduler

AnswerB

Spread placement groups place instances across distinct underlying hardware, separating them onto different racks within a single Availability Zone. This reduces the chance that a rack-level issue impacts multiple instances simultaneously and directly matches the requirement.

Why this answer

A Spread placement group is the correct choice because it ensures each EC2 instance is placed on distinct underlying hardware (different racks), eliminating shared fault domains. This directly meets the requirement to avoid correlated rack-level failures for latency-sensitive trading workloads.

Exam trap

The trap here is that candidates often confuse Partition placement groups with Spread groups, assuming partitions guarantee rack-level isolation, but partitions only separate instances into logical groups that may still share racks within a partition.

How to eliminate wrong answers

Option A is wrong because a Cluster placement group packs instances close together inside one Availability Zone to minimize latency, which increases the risk that a single rack-level fault affects several instances. Option C is wrong because a Partition placement group spreads instances across logical partitions within an Availability Zone, but multiple instances can share the same rack within a partition, not guaranteeing isolation at the rack level. Option D is wrong because the default scheduler does not guarantee placement on separate hardware racks, leaving the workload vulnerable to correlated failures.
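Rack-level separation comes with a documented limit of seven running instances per Availability Zone per spread placement group, so the six instances in the scenario fit in one zone. The helpers below are a hypothetical sanity check, not an AWS API.

```python
import math

SPREAD_LIMIT_PER_AZ = 7  # running instances per AZ per spread placement group


def fits_in_spread_group(instance_count: int, az_count: int = 1) -> bool:
    """True if the instances fit a spread group that spans az_count zones."""
    return instance_count <= SPREAD_LIMIT_PER_AZ * az_count


def min_azs_for_spread(instance_count: int) -> int:
    """Smallest number of zones a spread group needs for this many instances."""
    return max(1, math.ceil(instance_count / SPREAD_LIMIT_PER_AZ))


print(fits_in_spread_group(6), min_azs_for_spread(15))
```

A fleet larger than the limit allows would need more zones or a partition placement group, which trades per-instance rack isolation for scale.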

MCQeasy

A retail analytics app uses Amazon RDS for PostgreSQL. Read traffic is growing, and the database CPU spikes mainly due to SELECT-heavy workloads. Writes are less frequent, and the app can tolerate eventually consistent reads for the reports. What is the most appropriate AWS-native way to improve read performance with minimal application changes?

A.Create an RDS read replica and point the reporting queries to the replica endpoint.

B.Switch the cluster to DynamoDB without redesigning the data model.

C.Enable S3 event notifications to trigger a Lambda function after each write to the database.

D.Replace the RDS instance class with a smaller size to reduce cost and improve performance.

AnswerA

Read replicas offload reads from the primary and can speed up SELECT-heavy workloads with minimal changes.

Why this answer

Creating an RDS read replica is the most appropriate AWS-native solution because it offloads SELECT-heavy read traffic from the primary database instance to a separate read-only replica, reducing CPU spikes on the primary. The application can tolerate eventually consistent reads for reports, which fits the asynchronous replication lag of RDS read replicas (often small, but not guaranteed and larger under heavy write load). The change to the application is minimal: only the reporting queries need to point to the replica endpoint. RDS for PostgreSQL builds its read replicas on PostgreSQL's native streaming replication.
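As a sketch of how small the application change can be, the snippet below routes lag-tolerant reporting reads to a replica endpoint. Both hostnames and the `endpoint_for` helper are hypothetical; a real application would more likely configure a second connection pool for its reporting module.

```python
PRIMARY_ENDPOINT = "retail-db.example.internal"          # hypothetical hostname
REPLICA_ENDPOINT = "retail-db-replica.example.internal"  # hypothetical hostname


def endpoint_for(sql: str, tolerate_stale: bool = False) -> str:
    """Send lag-tolerant SELECTs to the replica and everything else to the primary."""
    is_read = sql.lstrip().lower().startswith("select")
    if is_read and tolerate_stale:
        return REPLICA_ENDPOINT
    return PRIMARY_ENDPOINT


# Reports accept slightly stale data, so they can use the replica.
print(endpoint_for("SELECT region, SUM(total) FROM orders GROUP BY region", tolerate_stale=True))
# Writes, and reads that must be current, stay on the primary.
print(endpoint_for("UPDATE orders SET status = 'shipped' WHERE id = 42"))
```

Keeping the `tolerate_stale` decision explicit avoids sending a read that must see the latest write to a lagging replica.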

Exam trap

The trap here is that candidates may confuse read replicas with Multi-AZ deployments, thinking Multi-AZ improves read performance, but Multi-AZ only provides failover redundancy and does not offload read traffic—the standby is not accessible for reads.

How to eliminate wrong answers

Option B is wrong because switching to DynamoDB without redesigning the data model would require significant application changes (e.g., adapting from relational to NoSQL schema, handling partition keys, and losing SQL query capabilities), which contradicts the requirement for minimal application changes. Option C is wrong because enabling S3 event notifications to trigger a Lambda function after each write does not directly improve read performance on the database; it adds asynchronous processing overhead and does not offload SELECT queries from the RDS instance. Option D is wrong because replacing the RDS instance class with a smaller size would reduce CPU capacity, worsening performance under the existing SELECT-heavy workload, and does not address the root cause of CPU spikes.

Multi-Selecthard

A partner integration sends a custom binary TCP protocol to a service running on EC2 instances in private subnets. The partners require static endpoint IPs for allowlisting, and the application must see the original client source IP for rate limiting. Which two changes best fit the protocol and network requirements? Select two.

Select 2 answers

A.Replace the Application Load Balancer with a Network Load Balancer.

B.Use a TCP listener on the load balancer instead of an HTTP or HTTPS listener.

C.Put the service behind API Gateway REST API and use Lambda integration.

D.Use CloudFront to cache the binary packets at edge locations.

E.Terminate the traffic with an Amazon RDS proxy to stabilize the connections.

AnswersA, B

A Network Load Balancer is the right choice for TCP traffic and low-latency forwarding at layer 4. It also supports static IP behavior that is important for partner allowlisting. This directly matches the custom binary protocol and source-IP requirement.

Why this answer

A Network Load Balancer (NLB) is required because it handles TCP traffic natively at Layer 4, which a custom binary TCP protocol needs; an Application Load Balancer (ALB) works at Layer 7 and only understands HTTP-based protocols. A TCP listener (option B) forwards the byte stream unchanged instead of trying to parse it as HTTP or HTTPS. An NLB also preserves the original client source IP address by default when targets are registered by instance ID, meeting the requirement for rate limiting based on the client IP. One Elastic IP per Availability Zone can be assigned to an internet-facing NLB, satisfying the partners' need for static endpoint IPs for allowlisting.
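The two changes might translate into request parameters along these lines, shaped for the Elastic Load Balancing v2 API (boto3's `create_load_balancer` and `create_listener` on the `elbv2` client). The load balancer name, subnet IDs, allocation IDs, ARNs, and port are placeholders invented for illustration, and the code only builds dictionaries.

```python
def nlb_request(name: str, subnet_to_eip: dict[str, str]) -> dict:
    """Parameters for an internet-facing NLB with one Elastic IP per subnet."""
    return {
        "Name": name,
        "Type": "network",
        "Scheme": "internet-facing",
        "SubnetMappings": [
            {"SubnetId": subnet, "AllocationId": eip}
            for subnet, eip in subnet_to_eip.items()
        ],
    }


def tcp_listener_request(load_balancer_arn: str, target_group_arn: str, port: int) -> dict:
    """Parameters for a Layer 4 listener that forwards raw TCP."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "TCP",
        "Port": port,
        "DefaultActions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


# All identifiers below are made-up placeholders.
nlb = nlb_request(
    "partner-ingest",
    {"subnet-aaaa1111": "eipalloc-aaaa1111", "subnet-bbbb2222": "eipalloc-bbbb2222"},
)
listener = tcp_listener_request("arn:placeholder:nlb", "arn:placeholder:tg", 9000)
```

The Elastic IPs in `SubnetMappings` are what partners would allowlist; the EC2 targets themselves can stay in private subnets.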

Exam trap

The trap here is that candidates often assume an Application Load Balancer can handle any TCP traffic because HTTP itself runs over TCP, but an ALB only offers HTTP and HTTPS listeners at Layer 7 and cannot pass through custom binary protocols, while NLB is the correct choice for non-HTTP TCP traffic with static IP and client IP preservation requirements.

100

MCQmedium

A company serves the same public content to many users through Amazon CloudFront. The origin is experiencing increased fetches because CloudFront cache hit rate is dropping. Most requests include an Authorization header and a custom header that changes per user. The response content is identical regardless of these headers. What change should the solutions architect make to restore a high cache hit rate?

A.Create a custom cache policy that excludes the Authorization header and the per-user changing custom header from the cache key.

B.Lower the TTL to a few seconds so cached objects expire sooner and origin fetches decrease.

C.Disable caching for the affected paths so CloudFront always forwards all headers to the origin.

D.Force all requests to use query-string based caching and include all headers in the cache policy for correctness.

AnswerA

CloudFront cache keys determine how requests map to cached objects. If the response is identical regardless of certain headers, including those headers in the cache key causes cache fragmentation (many unique cache keys for what is effectively the same content). Excluding the Authorization header and the varying custom header from the cache key allows CloudFront to reuse cached responses across users, restoring hit rate and reducing origin fetches.

Why this answer

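Option A is correct because the cache key decides which requests share a cached object. CloudFront's default cache key contains only the distribution domain name and the URL path, so a falling hit rate like this one points to a cache policy (or legacy forwarded-header settings) that adds the Authorization header and the per-user custom header to the key, creating a separate cache entry for each user even though the content is identical. A custom cache policy that leaves these headers out of the cache key makes CloudFront treat requests with different header values as the same cached object, restoring a high cache hit rate and reducing origin fetches.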
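Cache fragmentation is easy to reproduce with a toy model of a cache key. The `cache_key` function, the `X-User-Tag` header name, and the request data are invented for illustration and do not mirror CloudFront's internal key format.

```python
def cache_key(path: str, headers: dict[str, str], key_headers: tuple[str, ...]) -> tuple:
    """Toy cache key: the path plus only the headers the cache policy lists."""
    selected = tuple((name, headers.get(name, "")) for name in sorted(key_headers))
    return (path, selected)


# 1,000 users request the same public object with per-user header values.
requests = [
    ("/catalog.json", {"Authorization": f"Bearer token-{u}", "X-User-Tag": f"user-{u}"})
    for u in range(1000)
]

# Headers in the key: every user gets a separate cache entry.
fragmented = {cache_key(p, h, ("Authorization", "X-User-Tag")) for p, h in requests}

# Headers left out of the key: all users share one entry.
shared = {cache_key(p, h, ()) for p, h in requests}

print(len(fragmented), len(shared))
```

With the headers in the key, each of the 1,000 requests is a cache miss that reaches the origin; without them, only the first one is.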

Exam trap

The trap here is that candidates may assume the Authorization header must always be included in the cache key for security, but for public content, it can be safely excluded to improve cache efficiency without compromising access control.

How to eliminate wrong answers

Option B is wrong because lowering the TTL causes cached objects to expire sooner, which increases origin fetches and further reduces the cache hit rate, the opposite of what is needed. Option C is wrong because disabling caching for the affected paths forces CloudFront to forward every request to the origin, eliminating cache hits entirely and defeating the purpose of using CloudFront. Option D is wrong because forcing query-string based caching and including all headers in the cache key would still create unique cache entries per user (since headers vary per user), and query strings are not relevant to the issue described.

101

MCQmedium

A telemetry pipeline uses an Application Load Balancer in one Region. Global users need lower network latency to the application without caching dynamic responses. What should be considered?

A.AWS Global Accelerator

B.S3 Cross-Region Replication

C.CloudFront only with long TTLs

D.AWS Backup cross-Region copy

AnswerA

Global Accelerator routes traffic over the AWS global network to improve performance for TCP/UDP applications without relying on caching.

Why this answer

AWS Global Accelerator provides static anycast IP addresses, so user traffic enters the AWS global network at a nearby edge location and is routed to the optimal endpoint, reducing latency and jitter for global users. It does not cache content, making it suitable for dynamic responses that cannot be cached, and it supports an Application Load Balancer in a single Region as an endpoint.

Exam trap

The trap here is that candidates often confuse CloudFront (a CDN with caching) with Global Accelerator (a non-caching network accelerator), assuming any edge service must cache content, but Global Accelerator is designed specifically for dynamic and uncacheable traffic.

How to eliminate wrong answers

Option B is wrong because S3 Cross-Region Replication is a storage feature for replicating objects across S3 buckets in different Regions; it does not reduce network latency for application traffic or handle dynamic HTTP responses. Option C is wrong because CloudFront with long TTLs caches responses at edge locations, which is unsuitable for dynamic content that must not be cached; long TTLs would serve stale data. Option D is wrong because AWS Backup cross-Region copy is a disaster recovery feature for backing up resources to another Region; it does not improve real-time network latency for users accessing the application.

102

MCQeasy

A team runs a latency-sensitive service on EC2 and needs consistent, low-latency block storage for a database. The application requires predictable performance and should be fast for random reads/writes. Which EBS volume type is the best choice?

A.EBS st1 (throughput optimized HDD)

B.EBS gp3 (general purpose SSD)

C.EBS sc1 (cold HDD)

D.EBS magnetic (legacy magnetic)

AnswerB

gp3 is designed for a broad range of general-purpose workloads with solid low-latency performance. It supports random I/O patterns and offers predictable performance for many latency-sensitive applications. It is a common best-fit choice when you need balanced performance without specialized throughput-focused characteristics.

Why this answer

B is correct because gp3 is a general-purpose SSD volume that provides consistent, low-latency performance for random read/write operations, making it ideal for latency-sensitive database workloads. It offers a baseline of 3,000 IOPS and 125 MB/s throughput, with the ability to independently scale up to 16,000 IOPS and 1,000 MB/s, ensuring predictable performance without the burst-bucket limitations of gp2.
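Because gp3 decouples IOPS and throughput from volume size, both can be set explicitly when the volume is created. The sketch below builds parameters in the shape of the EC2 `create_volume` call (boto3 `ec2` client) and validates them against the figures quoted above; the Availability Zone, size, and chosen values are assumptions, and no AWS call is made.

```python
GP3_IOPS_RANGE = (3_000, 16_000)     # baseline and maximum quoted above
GP3_THROUGHPUT_RANGE = (125, 1_000)  # MB/s, baseline and maximum quoted above


def gp3_volume_request(az: str, size_gib: int, iops: int = 3_000, throughput: int = 125) -> dict:
    """Parameters for a gp3 volume with explicitly provisioned performance."""
    if not GP3_IOPS_RANGE[0] <= iops <= GP3_IOPS_RANGE[1]:
        raise ValueError(f"iops must be within {GP3_IOPS_RANGE}")
    if not GP3_THROUGHPUT_RANGE[0] <= throughput <= GP3_THROUGHPUT_RANGE[1]:
        raise ValueError(f"throughput must be within {GP3_THROUGHPUT_RANGE}")
    return {
        "AvailabilityZone": az,
        "VolumeType": "gp3",
        "Size": size_gib,
        "Iops": iops,
        "Throughput": throughput,
    }


# Hypothetical database volume: 200 GiB with performance raised above baseline.
db_volume = gp3_volume_request("us-east-1a", 200, iops=8_000, throughput=500)
```

With gp2, the same 8,000 IOPS would have required a much larger volume, since its baseline grows only with size.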

Exam trap

The trap here is that candidates often confuse 'throughput optimized' (st1) with 'low-latency' because both sound performance-oriented, but st1 is designed for sequential throughput, not random I/O latency, making it a poor choice for databases.

How to eliminate wrong answers

Option A is wrong because st1 (throughput optimized HDD) is designed for large, sequential workloads like big data and log processing, not for low-latency random reads/writes, and its performance degrades significantly with random I/O. Option C is wrong because sc1 (cold HDD) is the lowest-cost HDD volume intended for infrequently accessed data, with very low IOPS (a maximum of about 250 per volume) and high latency, making it unsuitable for a latency-sensitive database. Option D is wrong because magnetic (legacy) volumes are a previous-generation type that offers inconsistent performance with high latency and low IOPS (on the order of 100 IOPS per volume), and they are not recommended for production database workloads.

103

MCQmedium

A distributed system needs extremely low network latency between a set of EC2 instances running the same workload. The team wants the instances to be placed as close together as AWS allows to reduce round-trip time. Which placement strategy should the architect use?

A.Use a Cluster placement group for the instances that must communicate frequently over low latency.

B.Use a Spread placement group across multiple Availability Zones to maximize fault tolerance.

C.Use the default placement strategy without specifying a placement group.

D.Use a placement group of type Partition to ensure independent failure of each instance.

AnswerA

Cluster placement groups are designed to place instances close together within a single Availability Zone to minimize network latency. They are the right choice when nodes require high intercommunication performance, such as distributed processing or tightly coupled systems. The scenario’s goal of minimizing round-trip time aligns with the Cluster placement group behavior. It’s also an EC2-native placement option focused on performance.

Why this answer

A Cluster placement group is the correct choice because it packs instances close together inside a single Availability Zone, on the same high-bandwidth segment of the network, providing the lowest network latency and the highest per-flow throughput (up to 10 Gbps for single-flow traffic) between instances. This is ideal for tightly coupled, latency-sensitive workloads like HPC or real-time distributed systems.

Exam trap

The trap here is that candidates often confuse the purpose of placement groups: Cluster is for low latency and high throughput, Spread is for fault tolerance across hardware, and Partition is for large distributed systems needing failure isolation, but only Cluster guarantees physical proximity.

How to eliminate wrong answers

Option B is wrong because a Spread placement group spreads instances across distinct hardware racks or Availability Zones, which increases latency and is designed for fault tolerance, not low latency. Option C is wrong because the default placement strategy does not guarantee proximity; instances may be placed on different racks or AZs, leading to higher latency. Option D is wrong because a Partition placement group spreads instances across multiple partitions (each with separate racks) to isolate failures, but does not minimize latency between instances within the same partition.

104

MCQmedium

A high-volume telemetry pipeline writes streaming click events that must be processed by multiple independent consumers. Which service is most appropriate? The architecture review board prefers a managed AWS-native control.

A.Amazon Kinesis Data Streams

B.AWS DataSync

C.Amazon EBS

D.Amazon Route 53

AnswerA

Kinesis Data Streams supports high-throughput event ingestion with multiple consumers reading from the stream.

Why this answer

Amazon Kinesis Data Streams is the correct choice because it is a fully managed, AWS-native service designed for real-time streaming data ingestion and processing. It supports multiple independent consumers via enhanced fan-out, which provides each consumer with a dedicated throughput of up to 2 MB/sec per shard, ensuring that high-volume click events can be processed concurrently without contention.
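The difference between shared throughput and enhanced fan-out comes down to simple arithmetic. The sketch assumes the 2 MB/sec per-shard read figure quoted above and, for the shared case, that the consumers split that budget evenly; the function name is hypothetical.

```python
SHARD_READ_MB_PER_SEC = 2.0  # per-shard read throughput quoted above


def per_consumer_read_mb(shards: int, consumers: int, enhanced_fan_out: bool) -> float:
    """Approximate read throughput available to each consumer, in MB/sec."""
    total = shards * SHARD_READ_MB_PER_SEC
    if enhanced_fan_out:
        # Each registered consumer gets its own dedicated pipe per shard.
        return total
    # Shared throughput: all consumers split the same per-shard budget.
    return total / consumers


# Four shards read by three independent consumers.
print(per_consumer_read_mb(4, 3, enhanced_fan_out=False))
print(per_consumer_read_mb(4, 3, enhanced_fan_out=True))
```

In the shared case, each added consumer shrinks every consumer's share; with enhanced fan-out, the per-consumer figure stays constant.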

Exam trap

The trap here is confusing batch data transfer services (DataSync) or storage services (EBS) with real-time streaming, leading candidates to overlook Kinesis Data Streams' native support for multiple independent consumers via enhanced fan-out.

How to eliminate wrong answers

Option B (AWS DataSync) is wrong because it is a data transfer service for moving large datasets between on-premises storage and AWS, not a real-time streaming pipeline. Option C (Amazon EBS) is wrong because it provides block-level storage volumes for EC2 instances, not a streaming data ingestion or processing capability. Option D (Amazon Route 53) is wrong because it is a DNS and domain name resolution service, completely unrelated to streaming telemetry data.

105

MCQmedium

Your team runs a tightly coupled distributed workload (for example, synchronous training nodes) across many EC2 instances placed within a single cluster environment. The instances need low-latency networking to reduce delays at synchronization barriers. Which EC2 placement strategy should you use to improve inter-node latency?

A.Create a placement group with the 'spread' strategy to separate instances across underlying hardware for fault tolerance.

B.Create a placement group with the 'cluster' strategy to place instances close together and reduce network latency.

C.Use the default placement strategy and rely on Auto Scaling to keep instances from drifting to different locations.

D.Avoid placement groups and instead use Amazon S3 for inter-node messaging to minimize direct network traffic between instances.

AnswerB

Cluster placement groups are intended to place instances in close proximity to provide high-bandwidth, low-latency networking. For tightly coupled workloads, this improves the likelihood of reduced latency and faster completion of synchronization barriers.

Why this answer

A cluster placement group is the correct choice because it groups instances in a single Availability Zone with low-latency, high-bandwidth networking, ideal for tightly coupled workloads like synchronous training nodes that require minimal delay at synchronization barriers. This strategy places instances physically close together within the same rack or cluster, reducing network round-trip time and maximizing throughput for inter-node communication.

Exam trap

The trap here is that candidates may confuse 'spread' with 'cluster' placement groups, assuming fault tolerance is always the priority, but for tightly coupled workloads requiring low latency, the cluster strategy is the correct choice despite its reduced fault tolerance.

How to eliminate wrong answers

Option A is wrong because the 'spread' strategy places instances on distinct hardware to maximize fault tolerance, which increases network latency due to physical separation, making it unsuitable for low-latency inter-node communication. Option C is wrong because the default placement strategy does not guarantee proximity; instances can be placed on different racks or hosts, leading to higher and unpredictable latency, and Auto Scaling does not control placement to reduce latency. Option D is wrong because using Amazon S3 for inter-node messaging introduces significant latency and throughput bottlenecks compared to direct network communication, and it is not designed for real-time, low-latency synchronization in tightly coupled workloads.

106

MCQeasy

A retail API uses EC2 instances behind an ALB. CPU is consistently high during peak traffic, and request latency rises. What should be configured?

A.Auto Scaling policy based on an appropriate CloudWatch metric

B.S3 Object Lock

C.A VPC endpoint for CloudWatch only

D.Disable health checks

AnswerA

Auto Scaling adds capacity when load increases and removes it when load falls.

Why this answer

An Auto Scaling policy based on an appropriate CloudWatch metric (such as CPUUtilization or ALBRequestCountPerTarget) dynamically adds or removes EC2 instances to match demand. This directly addresses the high CPU and rising latency by distributing the load across more instances, preventing performance degradation during peak traffic.
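A target tracking policy on average CPU could be described with parameters in the shape of the EC2 Auto Scaling `put_scaling_policy` call (boto3 `autoscaling` client), as in this sketch. The group name, policy name, and 50% target are assumptions, and `estimated_capacity` is a rough proportional approximation added for intuition, not the service's actual algorithm.

```python
import math


def cpu_target_tracking_policy(asg_name: str, target_cpu: float = 50.0) -> dict:
    """Parameters for a target tracking policy on the group's average CPU."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu,
        },
    }


def estimated_capacity(current_instances: int, current_cpu: float, target_cpu: float) -> int:
    """Rough estimate of the instance count that brings CPU back to the target."""
    return math.ceil(current_instances * current_cpu / target_cpu)


policy = cpu_target_tracking_policy("retail-api-asg")  # hypothetical group name
print(estimated_capacity(current_instances=4, current_cpu=90, target_cpu=50))
```

In this example, four instances at 90% CPU would need roughly eight instances to settle near a 50% target, assuming load spreads evenly across the group.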

Exam trap

The trap here is that candidates may confuse operational features (like S3 Object Lock or VPC endpoints) with scaling mechanisms, or mistakenly think disabling health checks improves performance, when in fact it degrades reliability and latency.

How to eliminate wrong answers

Option B is wrong because S3 Object Lock is a data protection feature for Amazon S3 objects (preventing deletion or overwriting) and has no relevance to scaling compute resources or reducing request latency. Option C is wrong because a VPC endpoint for CloudWatch only enables private connectivity to CloudWatch APIs (e.g., for publishing metrics or logs) but does not scale EC2 capacity or reduce latency. Option D is wrong because disabling health checks would cause the ALB to continue routing traffic to unhealthy instances, worsening latency and potentially causing failures; health checks are essential for maintaining a reliable target group.

107

MCQmedium

You run a web application on an EC2 Auto Scaling group behind an Application Load Balancer (ALB). During scheduled traffic spikes, new instances launch but customers occasionally see 5xx errors for the first few minutes after scale-out. Operational logs show instances need ~4 minutes to warm up (load caches and initialize dependencies). ALB target health becomes healthy only after this warm-up. Which change most directly improves performance during spikes by reducing the time to serve traffic after scaling?

A.Configure a larger ALB deregistration delay so that old targets remain longer before termination.

B.Use an Auto Scaling warm pool so instances are pre-initialized and ready to register quickly when the ASG scales out.

C.Increase the number of desired instances immediately without using scaling policies, and then rely on manual reconfiguration.

D.Switch from ALB to NLB so instances become reachable sooner without waiting for health checks.

AnswerB

With a warm pool, Auto Scaling launches instances ahead of time and keeps them in a pre-initialized state (already booted, with startup work completed). When scaling triggers, these instances move into service faster and register with the ALB sooner. Because the bottleneck is the ~4 minutes instances need to become truly ready, warming them ahead of time most directly reduces the gap between scale-out and customer-ready capacity, and therefore reduces 5xx errors while targets wait to pass ALB health checks.

Why this answer

B is correct because a warm pool keeps instances pre-initialized (for example, with caches loaded and dependencies started) before the Auto Scaling group needs them. When the ASG scales out, these pre-warmed instances move into service quickly, avoiding most of the ~4-minute warm-up delay and shrinking the window for 5xx errors.

Exam trap

The trap here is that candidates may think NLB bypasses health checks entirely, but in reality NLB still requires health checks to mark targets as healthy, and the application warm-up delay remains the bottleneck.

How to eliminate wrong answers

Option A is wrong because increasing the deregistration delay keeps old targets alive longer, which does not help new instances serve traffic faster; it only delays termination of existing instances. Option C is wrong because manually setting desired instances without scaling policies is not automated and does not address the root cause of warm-up latency during spikes. Option D is wrong because switching to NLB does not eliminate the need for health checks or application warm-up; NLB health checks are still required and instances still need time to become healthy, so 5xx errors would persist.
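
As a rough illustration of option B, a warm pool can be requested through the Auto Scaling `PutWarmPool` API (exposed in boto3 as `put_warm_pool`). The sketch below only builds the request parameters; the group name `web-asg` and the pool size are hypothetical, and the commented-out call is what would actually apply them.

```python
# Sketch: parameters for an Auto Scaling warm pool (all values are hypothetical).
def warm_pool_params(group_name: str, min_size: int, pool_state: str = "Stopped") -> dict:
    """Build keyword arguments for the Auto Scaling PutWarmPool API."""
    allowed = {"Stopped", "Running", "Hibernated"}
    if pool_state not in allowed:
        raise ValueError(f"pool_state must be one of {sorted(allowed)}")
    return {
        "AutoScalingGroupName": group_name,
        "MinSize": min_size,       # instances kept pre-initialized in the pool
        "PoolState": pool_state,   # state in which pooled instances wait
    }


params = warm_pool_params("web-asg", min_size=2)

# To apply (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("autoscaling").put_warm_pool(**params)
```

A `Stopped` pool is cheaper to hold but loses in-memory state such as loaded caches, so `Running` or `Hibernated` may suit a cache-heavy warm-up better.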

108

MCQmedium

A telemetry pipeline uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add?

A.Multi-AZ standby and route reads to the standby

B.RDS read replica and route reporting queries to it

C.S3 lifecycle policy

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

RDS Read Replicas are designed specifically to offload read-heavy workloads from the primary database. By creating a read replica and routing the reporting queries to it, the primary database is freed from processing these read-only queries, reducing contention and improving overall performance. This is the most cost-effective and architecturally appropriate solution for read scaling in RDS MySQL.

Exam trap

The trap here is confusing Multi-AZ standby (which is for failover, not read scaling) with a read replica, leading candidates to incorrectly choose Option A.

How to eliminate wrong answers

Option A is wrong because a Multi-AZ standby is for high availability and disaster recovery, not for read scaling; the standby does not accept read traffic unless a failover occurs. Option C is wrong because S3 lifecycle policies manage object storage tiers and expiration, which have no relevance to offloading database read queries. Option D is wrong because a larger NAT gateway increases outbound internet bandwidth for private subnets, but does not address database read performance or query offloading.

109

MCQeasy

A company serves public JavaScript and CSS files from S3 using CloudFront. After a frontend change, customers report a low CloudFront cache hit ratio. Requests now include an Authorization header, but these assets do not require authentication. The CloudFront distribution is configured such that Authorization is included in the cache key. Which change best maximizes cache reuse?

A.Include the Authorization header in the cache key so responses vary correctly

B.Use a CloudFront Cache Policy that excludes Authorization from the cache key

C.Disable caching and always fetch from S3

D.Forward all headers and cookies to the origin to improve correctness

AnswerB

Because the assets are public and do not depend on Authorization, excluding Authorization from the cache key allows all users to share the same cached objects. This reduces cache fragmentation and increases cache hit ratio.

Why this answer

Option B is correct because excluding the Authorization header from the cache key ensures that all users, regardless of their authentication token, receive the same cached object. Since the static assets (JavaScript/CSS) do not require authentication, including Authorization in the cache key creates a separate cache entry per token for the same file, drastically reducing the cache hit ratio. A CloudFront cache policy that omits Authorization from the cache key maximizes reuse, and because the S3 origin does not need the header to serve these public assets, nothing is lost by leaving it out.

Exam trap

The trap here is that candidates may assume including the Authorization header is necessary for correctness, but for public static assets, excluding it from the cache key is the correct way to maximize cache reuse without affecting delivery.

How to eliminate wrong answers

Option A is wrong because including the Authorization header in the cache key would cause CloudFront to cache separate copies for each unique token value, which is exactly the problem that reduces the cache hit ratio. Option C is wrong because disabling caching entirely would increase latency and origin load, violating the goal of maximizing cache reuse. Option D is wrong because forwarding all headers and cookies to the origin would not only include unnecessary Authorization values but also further fragment the cache, worsening the hit ratio and adding overhead.
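
The fragmentation effect can be seen in a toy model. The sketch below is not CloudFront code: it treats a cache key as the path plus whichever headers are included in the key, and replays four requests for the same public file with different (made-up) tokens.

```python
from typing import Iterable


def cache_key(path: str, headers: dict, key_headers: Iterable[str]) -> tuple:
    """Cache key = path plus the values of the headers included in the key."""
    wanted = sorted(h.lower() for h in key_headers)
    lowered = {k.lower(): v for k, v in headers.items()}
    return (path,) + tuple((h, lowered.get(h, "")) for h in wanted)


def hit_ratio(requests: list, key_headers: Iterable[str]) -> float:
    """Replay requests against an empty cache and return the hit ratio."""
    key_headers = list(key_headers)
    seen, hits = set(), 0
    for path, headers in requests:
        key = cache_key(path, headers, key_headers)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(requests)


# Four users with different tokens request the same public file.
requests = [("/app.js", {"Authorization": f"Bearer token-{i}"}) for i in range(4)]

with_auth = hit_ratio(requests, ["Authorization"])  # every request is a miss
without_auth = hit_ratio(requests, [])               # one miss, then three hits
```

With Authorization in the key, each token produces its own entry; without it, all four requests map to one object.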

110

MCQhard

A document portal needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable? The architecture review board prefers a managed AWS-native control.

A.Amazon OpenSearch Service

B.AWS Config

C.Amazon EFS

D.Amazon SQS

AnswerA

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is a managed service that provides low-latency full-text search and analytics capabilities, making it ideal for indexing and searching product descriptions and filtered attributes. It is AWS-native and supports features like inverted indices, fuzzy search, and faceted filtering, which directly address the requirement for a high-performance document portal.

Exam trap

The trap here is that candidates may confuse Amazon CloudSearch (another managed search service) with OpenSearch Service, but the question emphasizes 'AWS-native control' and OpenSearch Service is the more modern, feature-rich choice for full-text search with filtering.

How to eliminate wrong answers

Option B is wrong because AWS Config is a service for resource inventory, compliance auditing, and configuration change tracking, not a full-text search engine. Option C is wrong because Amazon EFS is a scalable file storage service for shared access to files, not a search or indexing service. Option D is wrong because Amazon SQS is a fully managed message queuing service for decoupling microservices, not a search or query engine.
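
To make "inverted index plus attribute filter" concrete, here is a toy in-memory version in plain Python. It is only a conceptual sketch, not how OpenSearch is implemented or queried; the documents and the `brand` attribute are invented for illustration.

```python
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Map each lowercase token to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for token in doc["description"].lower().split():
            index[token].add(doc_id)
    return index


def search(index: dict, docs: dict, query: str, **filters) -> list:
    """Return sorted IDs matching every query token and every attribute filter."""
    tokens = query.lower().split()
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(
        d for d in matches
        if all(docs[d].get(k) == v for k, v in filters.items())
    )


# Hypothetical product documents.
docs = {
    1: {"description": "red running shoes", "brand": "acme"},
    2: {"description": "blue running jacket", "brand": "acme"},
    3: {"description": "red running jacket", "brand": "other"},
}
index = build_index(docs)
```

For example, `search(index, docs, "running jacket", brand="acme")` intersects the posting sets for both tokens and then applies the attribute filter.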

111

Multi-Selectmedium

An order lookup API repeatedly reads the same few items from DynamoDB. The application can tolerate slightly stale data for a few seconds, and the team wants the lowest-latency design with minimal application changes. Which two changes should they make? Select two.

Select 2 answers

A.Put Amazon DynamoDB Accelerator (DAX) in front of the table.

B.Use eventually consistent reads where the application can tolerate slightly stale data.

C.Switch all access to strongly consistent reads for faster results.

D.Increase the item size so fewer requests are needed.

E.Replace the table with Amazon EBS volumes mounted on EC2 instances.

AnswersA, B

DAX is an in-memory cache for DynamoDB reads, so repeated lookups for the same keys can be served with much lower latency than direct table reads. It is especially effective for hot-item access patterns like order lookups, product metadata, and profile reads.

Why this answer

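Amazon DynamoDB Accelerator (DAX) is an in-memory, write-through cache for DynamoDB that provides microsecond read latency for cached items, which is ideal for repeated reads of the same few items. Because the application can tolerate slightly stale data, DAX's item cache (default TTL of 5 minutes, configurable) can serve these lookups with few code changes beyond switching to the DAX client. Option B complements it: DAX serves eventually consistent reads from its cache, whereas strongly consistent reads are passed through to the table, so requesting eventually consistent reads is what lets the cache absorb the hot-key traffic.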

Exam trap

The trap here is that candidates may think strongly consistent reads are always faster, but they actually have higher latency and cannot be cached by DAX, making them unsuitable for this low-latency, minimal-change requirement.
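
The trade-off between latency and staleness can be sketched with a small read-through cache. This is a toy model of the idea, not the DAX client; the `loader` callable stands in for a table read, and the injectable `clock` exists only to make the behavior easy to test.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a per-item TTL (not the DAX client)."""

    def __init__(self, loader, ttl_seconds: float, clock=time.monotonic):
        self._loader = loader      # function that reads from the backing table
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}           # key -> (value, expiry time)

    def get(self, key):
        now = self._clock()
        cached = self._items.get(key)
        if cached is not None and cached[1] > now:
            return cached[0]       # cache hit: possibly slightly stale
        value = self._loader(key)  # cache miss or expired: read the table
        self._items[key] = (value, now + self._ttl)
        return value
```

Repeated `get` calls for the same key within the TTL never reach the loader, which is why hot-item lookups benefit most, and the TTL bounds how stale a cached answer can be.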

112

MCQmedium

An analytics dashboard uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The architecture review board prefers a managed AWS-native control.

A.S3 lifecycle policy

B.RDS read replica and route reporting queries to it

C.Multi-AZ standby and route reads to the standby

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

B is correct because an RDS read replica is a fully managed, native AWS solution that offloads read-heavy reporting queries from the primary RDS MySQL instance. The read replica asynchronously replicates data using the MySQL binlog, allowing reporting traffic to be routed to it without impacting the primary database's write performance. This directly addresses the slowdown caused by many read-only queries while satisfying the architecture review board's preference for a managed AWS-native control.

Exam trap

The trap here is confusing the Multi-AZ standby (which is for failover only and cannot serve reads) with a read replica (which is specifically designed to offload read traffic), leading candidates to incorrectly select Option C as a managed solution for read scaling.

How to eliminate wrong answers

Option A is wrong because an S3 lifecycle policy manages object transitions and expirations in S3, not database read traffic; it cannot offload SQL queries from RDS. Option C is wrong because a Multi-AZ standby is designed for high availability and automatic failover, not for serving read traffic — it does not accept direct connections for reads, and any attempt to route reads to it would fail or require unsupported workarounds. Option D is wrong because a NAT gateway provides outbound internet access for private subnets and has no role in distributing database read queries; it cannot reduce load on an RDS primary instance.

113

MCQhard

Based on the exhibit, a static asset distribution site uses Amazon CloudFront with an S3 origin. The assets are versioned by filename, but the cache hit ratio remains low after each release. Which CloudFront change is the best way to improve cache reuse without changing the origin objects?

A.Keep the current cache key and increase the S3 bucket's storage class.

B.Remove Authorization and unnecessary query strings from the CloudFront cache key.

C.Disable the CloudFront cache so every request is served directly from S3.

D.Switch the origin from Amazon S3 to an Application Load Balancer.

AnswerB

Versioned static assets do not need Authorization in the cache key, and arbitrary query strings can destroy cache efficiency. Excluding those fields lets CloudFront reuse the same cached object across many viewers.

Why this answer

Option B is correct because removing Authorization headers and unnecessary query strings from the CloudFront cache key ensures that multiple requests for the same versioned asset (e.g., style.v2.css) share a single cached object, regardless of user-specific headers or irrelevant query parameters. This directly increases the cache hit ratio without modifying the origin objects, as CloudFront will serve the same cached response for identical cache keys.

Exam trap

The trap here is that candidates may think increasing storage class or switching to an ALB improves caching, but the real issue is the cache key composition—specifically, unnecessary headers or query strings fragmenting the cache—which is solved by adjusting the CloudFront cache key settings.

How to eliminate wrong answers

Option A is wrong because changing the S3 bucket's storage class (e.g., to S3 Standard-IA or Glacier) has no effect on CloudFront's cache key or cache hit ratio; it only affects storage cost and retrieval latency, not caching behavior. Option C is wrong because disabling the CloudFront cache would force every request to go directly to the S3 origin, eliminating all caching benefits and increasing latency and origin load, which is the opposite of improving cache reuse. Option D is wrong because switching the origin from S3 to an Application Load Balancer (ALB) introduces unnecessary complexity and does not address the cache key issue; the ALB would still require the same cache key optimization to improve cache hits, and it would not inherently improve cache reuse.
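
A cache policy along the lines of option B might look like the following sketch, which builds a `CachePolicyConfig` structure for CloudFront's `CreateCachePolicy` API with no headers, cookies, or query strings in the cache key. The policy name and TTL are hypothetical, and the boto3 call is left commented out.

```python
def static_asset_cache_policy(name: str, ttl_seconds: int) -> dict:
    """Build a CachePolicyConfig that keeps the cache key to the URL path only."""
    return {
        "Name": name,
        "MinTTL": 0,
        "DefaultTTL": ttl_seconds,
        "MaxTTL": ttl_seconds,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {"HeaderBehavior": "none"},       # no Authorization in key
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {"QueryStringBehavior": "none"},
        },
    }


policy = static_asset_cache_policy("static-assets", ttl_seconds=86400)

# To create it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudfront").create_cache_policy(CachePolicyConfig=policy)
```

If some query strings do select different content, a `whitelist` behavior listing only those parameters would be the narrower choice than `none`.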

114

MCQhard

Based on the exhibit, what is the best change to improve read performance without increasing write latency on the primary database?

A.Create an RDS read replica and direct the reporting queries to the replica endpoint.

B.Convert the DB instance to Multi-AZ so the primary can serve more reads.

C.Increase the primary instance class to a larger size and keep all traffic on one writer.

D.Migrate the reporting workload to DynamoDB to gain faster reads.

AnswerA

A read replica offloads the long-running read-only reports from the primary database, which preserves write performance and reduces read latency for the reporting workload. Because the business accepts slightly stale report data, the asynchronous replication delay is acceptable. This is the most direct and AWS-native way to separate read pressure from writes.

Why this answer

Creating an RDS read replica offloads read-heavy reporting queries from the primary database instance, improving read performance without adding any write latency to the primary. The replica operates asynchronously, so writes on the primary are not blocked or delayed by the replica's lag. This is the standard AWS solution for scaling read traffic on RDS.

Exam trap

The trap here is that candidates confuse Multi-AZ with read scaling, assuming the standby instance can serve reads, when in fact Multi-AZ only provides failover redundancy and the standby is not accessible for read operations.

How to eliminate wrong answers

Option B is wrong because Multi-AZ is designed for high availability and automatic failover, not for scaling read capacity; the standby instance cannot serve reads directly. Option C is wrong because a larger instance class adds headroom but keeps the reporting queries and the writes competing on the same instance, so heavy reporting load can still slow writes. Option D is wrong because migrating to DynamoDB is an architectural change that would require application rewrites and does not directly address improving read performance on the existing primary database without increasing write latency.

115

Multi-Selecthard

A nightly video rendering pipeline runs on Linux EC2 instances and is compatible with ARM64. The jobs are CPU-bound, checkpoint frequently, and can resume if interrupted. The business wants the best throughput per dollar for the batch window. Which two changes should the team make? Select two.

Select 2 answers

A.Use AWS Graviton-based instances for the render workers.

B.Run the workers in an Auto Scaling group with Spot Instances for interruption-tolerant capacity.

C.Use a single large x86 instance with On-Demand pricing to avoid interruptions.

D.Replace the batch workers with a Lambda function to eliminate instance management.

E.Move the workload to a spread placement group to increase cost efficiency.

AnswersA, B

Graviton instances are ARM-based and often deliver better price-performance than comparable x86 instances for CPU-bound workloads. Because the application is already compatible with ARM64, the team can adopt Graviton without rewriting the pipeline. That improves throughput per dollar while keeping the same batch-processing model.

Why this answer

AWS Graviton-based instances use the ARM64 architecture, which the video rendering pipeline already supports. AWS cites up to 40% better price-performance compared to comparable x86 instances, directly improving throughput per dollar for CPU-bound work, which makes option A correct. Option B is also correct: because the jobs checkpoint frequently and can resume after interruption, they suit Spot Instances, which are offered at discounts of up to 90% compared with On-Demand prices.

Exam trap

The trap here is that candidates may overlook the compatibility requirement with ARM64 and choose a single large x86 instance for simplicity, or mistakenly think Lambda can handle long-running CPU-bound tasks, missing the cost and throughput benefits of Graviton and Spot Instances.

116

MCQhard

Based on the exhibit, a web application runs on an Amazon EC2 Auto Scaling group behind an Application Load Balancer. During traffic surges, the average CPU utilization stays below 35%, but request latency increases sharply and the ALB access logs show far more requests per target than expected. Which change is the best way to improve scaling behavior?

A.Lower the CPU target tracking threshold so the Auto Scaling group launches more instances sooner.

B.Replace the Application Load Balancer with a Network Load Balancer to reduce request latency.

C.Configure target tracking scaling on ALB RequestCountPerTarget for the Auto Scaling group.

D.Increase the ALB idle timeout so requests can wait longer before timing out.

AnswerC

RequestCountPerTarget directly reflects how many requests each instance is serving, which matches the symptom in the exhibit. It scales the fleet based on actual per-target demand instead of CPU, so the group can add capacity before queueing and latency grow.

Why this answer

Option C is correct because latency rises sharply and the ALB logs show far more requests per target than expected while CPU stays low, so a CPU-based policy never triggers even though each instance is overloaded with requests. With target tracking on ALB RequestCountPerTarget, the Auto Scaling group launches new instances when the average number of requests per target exceeds the target value, directly addressing the high request volume per instance. Scaling is then driven by the actual request load rather than by CPU utilization, which remains low here, likely because the application is I/O-bound or network-bound rather than CPU-bound.

Exam trap

The trap here is that candidates often assume CPU utilization is the universal scaling metric, but the question explicitly states CPU stays low while latency spikes, indicating the bottleneck is request throughput, not compute, making RequestCountPerTarget the correct metric to scale on.

How to eliminate wrong answers

Option A is wrong because lowering the CPU target tracking threshold would not help when CPU utilization is already below 35% and the bottleneck is request latency, not CPU; this could lead to unnecessary scaling and increased costs without solving the latency issue. Option B is wrong because replacing the Application Load Balancer with a Network Load Balancer would not reduce request latency caused by high request volume per target; NLB operates at Layer 4 and does not inspect HTTP requests, so it cannot provide request-level metrics like RequestCountPerTarget for scaling decisions. Option D is wrong because increasing the ALB idle timeout only extends how long the load balancer keeps connections open without activity, which does not address the root cause of high request volume per target or the sharp increase in latency; it may mask the problem by allowing requests to wait longer before timing out.
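
As a sketch of option C, the snippet below builds the parameters for an Auto Scaling `PutScalingPolicy` request using the predefined `ALBRequestCountPerTarget` metric, and adds a simplified proportional estimate of the capacity needed to return to the target. The estimate is an approximation for intuition only, not the exact algorithm Auto Scaling uses; the group name, policy name, and resource label are placeholders.

```python
import math


def estimated_desired_capacity(current_instances: int,
                               requests_per_target: float,
                               target_value: float) -> int:
    """Proportional estimate of the capacity that brings the metric back to target."""
    return max(1, math.ceil(current_instances * requests_per_target / target_value))


def request_count_policy(group_name: str, resource_label: str, target_value: float) -> dict:
    """Build keyword arguments for the Auto Scaling PutScalingPolicy API."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "requests-per-target",   # hypothetical policy name
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                "ResourceLabel": resource_label,
            },
            "TargetValue": target_value,
        },
    }


# Placeholder label in the form app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>.
policy_args = request_count_policy(
    "web-asg", "app/my-alb/0123456789abcdef/targetgroup/my-tg/fedcba9876543210", 500.0
)
needed = estimated_desired_capacity(current_instances=4,
                                    requests_per_target=1500,
                                    target_value=500)
```

Here four instances each receiving three times the target request count suggests roughly twelve instances, regardless of what CPU utilization reports.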

117

MCQeasy

Multiple EC2 instances need a shared filesystem so they can concurrently read and write the same files (for example, user uploads and rendered assets). The instances are in different Availability Zones and must mount the filesystem using NFS. Which AWS storage service best fits?

A.Amazon EFS

B.Amazon S3

C.Amazon EBS gp3 volumes

D.Instance store on EC2

AnswerA

EFS provides a shared, NFS-compatible filesystem that supports mounting from multiple EC2 instances and AZs.

Why this answer

Amazon EFS is a fully managed NFS file system that supports concurrent read/write access from multiple EC2 instances across different Availability Zones. It is mounted over NFSv4 (4.0 or 4.1), making it the only option listed that provides a shared POSIX-compliant filesystem mountable via NFS for multi-AZ workloads.

Exam trap

The trap here is that candidates confuse Amazon S3's object storage model and HTTP-based access with a true shared filesystem, or assume EBS multi-attach (which is limited to a few instances in the same AZ) can replace a multi-AZ NFS solution.

How to eliminate wrong answers

Option B (Amazon S3) is wrong because S3 is an object storage service accessed via HTTP/HTTPS APIs, not a filesystem that can be mounted via NFS; it does not support POSIX file locking or concurrent write consistency required for shared filesystem use cases. Option C (Amazon EBS gp3 volumes) is wrong because EBS volumes are block storage that can only be attached to a single EC2 instance at a time (except for multi-attach EBS io1/io2, which is limited to a few instances in the same AZ and does not support NFS). Option D (Instance store on EC2) is wrong because instance store volumes are ephemeral, tied to a single EC2 instance, and cannot be shared across instances or Availability Zones.

118

MCQmedium

A company needs to replicate a DynamoDB table to three AWS regions so that users in each region can read and write to a local copy with the lowest possible latency. Changes must propagate to all regions within seconds. Which solution should a solutions architect implement?

A.Enable DynamoDB Streams and use Lambda functions to replicate changes to tables in the other two regions

B.Configure DynamoDB Global Tables with replica tables in each of the three regions

C.Create DynamoDB read replicas in each region and use the primary table for all writes

D.Use Amazon S3 cross-region replication to back up DynamoDB exports to each region

AnswerB

Global Tables provides managed multi-region multi-active replication with sub-second propagation and automatic conflict resolution. No custom code required.

Why this answer

DynamoDB Global Tables provide multi-region, multi-active (multi-master) replication. Each region maintains a full replica of the table, and applications can read and write to any region with local latency. Changes propagate to all other regions typically within one second.

Global Tables builds on DynamoDB Streams internally, but the replication itself is managed by DynamoDB rather than by customer-written Lambda functions. Building a custom Streams + Lambda pipeline adds significant operational complexity. Global Tables is the managed, purpose-built solution requiring no custom code.

Exam trap

DynamoDB Streams captures item-level changes and can be processed by Lambda to replicate to other regions — this is a valid DIY approach. But when the question asks for multi-region multi-active replication with minimal complexity, Global Tables is the correct answer. Streams is the mechanism; Global Tables is the managed service.

For SAA-C03, prefer the managed service over a DIY build when both meet the requirements.

Why the other options are wrong

Option A is wrong because custom DynamoDB Streams + Lambda replication works but requires significant development: Lambda functions per region, error handling, idempotency logic, and conflict resolution. Global Tables delivers the same outcome as a managed feature.

Option C is wrong because DynamoDB does not have 'read replicas' like RDS. Global Tables creates full replica tables that support both reads and writes in each region. There is no read-only replica concept in DynamoDB.

Option D is wrong because S3 cross-region replication copies S3 objects between buckets, and DynamoDB-to-S3 export is a data archival mechanism, not real-time database replication. Neither provides active database access with sub-second propagation.
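
The "automatic conflict resolution" mentioned above is documented by AWS as last-writer-wins. The sketch below models only that idea in plain Python; the record layout and the `written_at` field are hypothetical and do not reflect how DynamoDB stores or compares writes internally.

```python
def last_writer_wins(versions: list) -> dict:
    """Return the version with the latest write timestamp (on ties, the first seen)."""
    return max(versions, key=lambda v: v["written_at"])


# Two regions update the same order at nearly the same time (hypothetical data).
concurrent_writes = [
    {"region": "us-east-1", "status": "PACKED", "written_at": 1700000000.120},
    {"region": "eu-west-1", "status": "SHIPPED", "written_at": 1700000000.450},
]

winner = last_writer_wins(concurrent_writes)
```

Every replica converging on the same winner is what a custom Streams + Lambda pipeline would otherwise have to implement and test itself.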

119

MCQhard

Based on the exhibit, an Amazon Aurora MySQL application is read-heavy, but the database writer is nearing CPU limits while the reader instance is mostly idle. The application currently sends all queries to the writer endpoint. Which change should you make first to increase read throughput?

A.Keep using the writer endpoint so Aurora can route the reads automatically.

B.Change the application to send read-only queries to the Aurora reader endpoint.

C.Convert the cluster to a single-AZ deployment so network hops are reduced.

D.Add an Amazon DynamoDB Accelerator (DAX) cluster in front of Aurora.

AnswerB

The reader endpoint is designed to distribute read traffic across Aurora replica instances. Moving SELECT-heavy traffic off the writer immediately reduces writer CPU pressure and increases total read throughput.

Why this answer

The correct answer is B because the Aurora reader endpoint is specifically designed to distribute read-only traffic across all available reader instances, offloading the writer and increasing read throughput. Since the reader instance is idle, directing read queries to the reader endpoint immediately reduces CPU load on the writer without requiring any architectural changes.

Exam trap

The trap here is that candidates assume the writer endpoint automatically load-balances reads across all instances, but in Aurora the writer endpoint always points to the primary instance, and only the reader endpoint distributes read traffic.

How to eliminate wrong answers

Option A is wrong because the writer endpoint always routes queries to the writer instance, which is already near CPU limits; Aurora does not automatically redirect read queries to reader instances when using the writer endpoint. Option C is wrong because converting to a single-AZ deployment does nothing to move reads off the writer and reduces availability; removing the reader would also eliminate the capacity available for offloading reads. Option D is wrong because DAX is a caching layer for DynamoDB, not Aurora MySQL, and does not address the immediate need to offload read traffic from the writer instance.
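
At the application level, the change in option B amounts to choosing an endpoint per statement. The following is a deliberately naive sketch: the hostnames are placeholders rather than real Aurora endpoints, and the read/write detection is a simple prefix check that a real application would likely replace with separate connection pools or driver-level support.

```python
def choose_endpoint(sql: str, writer: str, reader: str) -> str:
    """Send plain SELECTs to the reader endpoint and everything else to the writer."""
    statement = sql.strip().lower()
    is_read = statement.startswith("select") and "for update" not in statement
    return reader if is_read else writer


WRITER = "writer.example.internal"   # placeholder for the cluster (writer) endpoint
READER = "reader.example.internal"   # placeholder for the reader endpoint

target = choose_endpoint("SELECT * FROM orders WHERE id = 7", WRITER, READER)
```

Because replicas can lag slightly behind the writer, reads that must observe a just-committed write are usually better left on the writer endpoint.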

120

MCQeasy

A team needs to distribute TCP traffic (not HTTP) across multiple services. The services must see the original client source IP for auditing. Which AWS load balancer is the best fit?

A.Application Load Balancer (ALB) using HTTP/HTTPS listeners with host-based routing

B.Network Load Balancer (NLB) using TCP listeners

C.Classic Load Balancer (CLB) configured for TCP health checks only

D.API Gateway with a VPC Link to forward raw TCP traffic

AnswerB

NLB is a Layer 4 load balancer that supports TCP and UDP. With client IP preservation in effect (the default for instance targets; for IP targets behind TCP listeners it is a target group attribute that may need to be enabled), targets see the original source IP and port at the networking layer, which supports IP-based auditing without requiring HTTP headers.

Why this answer

A Network Load Balancer (NLB) is the best fit because it operates at Layer 4 (TCP/UDP) and can preserve the original client source IP address (by default for instance targets), which is required for auditing. It can distribute raw TCP traffic across multiple services without inspecting application-layer headers, making it ideal for non-HTTP TCP workloads.

Exam trap

The trap here is that candidates often assume an Application Load Balancer can handle any TCP traffic because of its 'listener' terminology, but ALB strictly requires HTTP/HTTPS protocols and cannot forward raw TCP streams.

How to eliminate wrong answers

Option A is wrong because an Application Load Balancer (ALB) only supports HTTP/HTTPS and gRPC protocols, not raw TCP traffic, and it terminates the client connection, replacing the source IP with its own private IP unless X-Forwarded-For headers are used (which are HTTP-specific). Option C is wrong because the Classic Load Balancer (CLB) is a legacy service that does not natively preserve the client source IP for TCP listeners; it uses proxy protocol to forward the source IP, but this requires additional configuration on the backend services, and CLB is not recommended for new architectures. Option D is wrong because API Gateway is designed for HTTP/HTTPS and RESTful APIs, not raw TCP traffic; a VPC Link integrates with NLBs or ALBs for HTTP traffic, but API Gateway cannot forward raw TCP streams.
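
On the backend, "seeing the source IP" simply means reading the peer address of the accepted TCP connection. The loopback sketch below shows where that value comes from; it involves no load balancer, so it only illustrates the socket-level view a target would have when the client address is preserved. Behind a proxying load balancer, the same call would report the proxy's address instead.

```python
import socket


def observed_peer() -> tuple:
    """Accept one loopback TCP connection.

    Returns (peer address seen by the server, client's own local address).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        server.listen(1)
        with socket.create_connection(server.getsockname()) as client:
            conn, peer = server.accept()
            with conn:
                # `peer` is what a service would write to its audit log.
                return peer, client.getsockname()


seen_by_server, client_address = observed_peer()
```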

121

Multi-Selecthard

An event-ingestion application writes telemetry to DynamoDB with partition key tenantId and sort key eventTime. During a promotion, one tenant generates 10 times the normal traffic. Dashboards repeatedly query the most recent items for that tenant, and they can tolerate slightly stale data. Which changes would most effectively reduce throttling and improve responsiveness? Select three.

Select 3 answers

A.Introduce a sharded partition key for the hot tenant and query the small shard set when reading recent data.

B.Add a time bucket to the partition key, such as tenantId#YYYYMMDDHH, to spread bursty writes across more partitions.

C.Place DynamoDB DAX in front of the table for the repeated dashboard reads of recent items.

D.Increase only the sort-key cardinality while leaving the partition key unchanged.

E.Move the table to the Standard-IA table class because throttling is usually caused by storage class selection.

AnswersA, B, C

Correct because hot-partition problems are usually solved by spreading traffic across multiple partition key values. Sharding the hot tenant across several keys distributes write load and avoids a single overloaded partition.

Why this answer

Option A is correct because introducing a sharded partition key (e.g., tenantId#shardId) for the hot tenant spreads its write traffic across multiple physical partitions, reducing throttling. When querying recent data, the dashboard reads from a small, fixed set of shards (e.g., 10 shards) and merges the results in the application, a modest cost for removing the hot partition. Options B and C follow the same logic: a time bucket in the partition key spreads bursty writes across more key values, and DAX absorbs the repeated dashboard reads, which is acceptable because slightly stale data is tolerated.

Exam trap

The trap here is that candidates assume increasing sort-key cardinality (Option D) improves write throughput, but DynamoDB's partition key alone determines the physical partition, so only modifying the partition key or using write sharding can alleviate hot partition throttling.
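
A minimal sketch of the two key designs, assuming a hash-based shard suffix and an hourly bucket. The function names, the shard count, and the use of an event ID for hashing are illustrative choices, not something the question specifies; the merge step stands in for combining the results of one query per shard.

```python
import hashlib
from datetime import datetime


def sharded_key(tenant_id: str, event_id: str, shard_count: int) -> str:
    """Deterministically assign an event to one of the tenant's shards."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{tenant_id}#{shard}"


def all_shard_keys(tenant_id: str, shard_count: int) -> list:
    """Partition keys a reader must query to cover every shard."""
    return [f"{tenant_id}#{s}" for s in range(shard_count)]


def time_bucket_key(tenant_id: str, when: datetime) -> str:
    """Partition key with an hourly bucket, e.g. tenantId#YYYYMMDDHH."""
    return f"{tenant_id}#{when.strftime('%Y%m%d%H')}"


def merge_recent(per_shard_items: list, limit: int) -> list:
    """Merge per-shard query results, newest eventTime first."""
    merged = [item for items in per_shard_items for item in items]
    merged.sort(key=lambda item: item["eventTime"], reverse=True)
    return merged[:limit]
```

Writes pick one key via `sharded_key`, while dashboard reads fan out over `all_shard_keys` and combine the results with `merge_recent`, so the shard count is a trade-off between write spread and read fan-out.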

122

MCQeasy

A company serves mostly static images and JavaScript files from an origin in one AWS Region. They want to reduce origin load and improve global performance. Which change most directly increases cache-hit ratio for static assets while avoiding stale content?

A.Set Cache-Control headers on the origin to always be no-cache so clients revalidate frequently.

B.Use versioned file names (e.g., app.abc123.js) and configure a long TTL with appropriate revalidation behavior.

C.Disable query string forwarding so all URLs without query strings share one cached object even when content differs.

D.Forward all headers, including cookies, to maximize personalization in edge cached responses.

AnswerB

Versioned assets allow long caching with confidence, while new filenames trigger updates when code changes.

Why this answer

Option B is correct because using versioned file names (e.g., app.abc123.js) allows you to set a long Cache-Control max-age TTL (e.g., one year) without risking stale content. When the file changes, the versioned name changes, creating a new URL that forces a cache miss and fetches the fresh content. This directly increases the cache-hit ratio for static assets while ensuring clients never receive outdated files.
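One common way to produce versioned names is to embed a short content hash in the file name at build time. The sketch below assumes that approach; the helper name, the 8-character digest length, and the exact header value are illustrative choices rather than anything the question prescribes.

```python
import hashlib


def versioned_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present: append the hash instead
        return f"{filename}.{digest}"
    return f"{stem}.{digest}.{ext}"


# One year in seconds; "immutable" hints that the object never changes at this URL.
LONG_LIVED_CACHE_CONTROL = "public, max-age=31536000, immutable"
```

Because the name depends only on the content, an unchanged file keeps its URL (and its cached copies), while any change yields a new URL.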

Exam trap

The trap here is that candidates may think disabling query strings or forwarding all headers helps caching, but in reality, these actions either cause cache collisions or fragment the cache, reducing the cache-hit ratio.

How to eliminate wrong answers

Option A is wrong because setting Cache-Control: no-cache forces clients to revalidate with the origin on every request, which increases origin load and defeats the purpose of caching, reducing the cache-hit ratio. Option C is wrong because disabling query string forwarding can cause different content to be served from the same cached object if the content actually varies by query string, leading to stale or incorrect responses; it does not improve cache-hit ratio for static assets. Option D is wrong because forwarding all headers, including cookies, reduces cacheability since CloudFront (or any CDN) treats each unique set of headers as a separate cache key, fragmenting the cache and lowering the cache-hit ratio.

123

MCQeasy

A trading analytics system deploys multiple EC2 instances that exchange very frequent, low-latency, east-west messages. The application team wants the instances to be placed to minimize network latency and variability. Which AWS feature should they use?

A.EC2 Placement Groups with the "cluster" strategy

B.EC2 Placement Groups with the "spread" strategy

C.Auto Scaling cooldown adjustments only

D.Switching the instances to a larger instance size without any placement group

AnswerA

Cluster placement groups place instances close together within a single Availability Zone on underlying infrastructure intended to have low-latency, high-bandwidth networking between instances. This directly targets minimizing latency and jitter for inter-instance communication.

Why this answer

The cluster placement group is the correct choice because it places instances into a low-latency, high-bandwidth group within a single Availability Zone, which minimizes network latency and variability for east-west traffic. This strategy is specifically designed for applications that require very frequent, low-latency communication between instances, such as trading analytics systems.

Exam trap

The trap here is that candidates confuse the 'spread' placement group's high availability benefit with low-latency requirements, not realizing that spreading instances across racks increases network hops and latency.

How to eliminate wrong answers

Option B is wrong because the spread placement group distributes instances across distinct hardware racks to reduce correlated failures, which increases network latency and variability rather than minimizing it. Option C is wrong because Auto Scaling cooldown adjustments only control the rate of scaling activities and have no impact on network latency or placement of instances. Option D is wrong because switching to a larger instance size may improve compute or memory capacity but does not inherently reduce network latency or variability between instances without a placement group.

124

Multi-Selectmedium

A financial services application requires high-performance read access to a time-series dataset that is frequently updated with new records. The workload is write-heavy during market hours and read-heavy for reporting. The solution must support strong consistency and low-latency queries on a single key. Which three AWS services or features should be used together to meet these requirements? (Choose three.)

Select 3 answers

A.Amazon DynamoDB with DAX (DynamoDB Accelerator)

B.Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class

C.Amazon DynamoDB with Auto Scaling enabled for reads and writes

D.Amazon RDS for PostgreSQL with Multi-AZ deployment

E.Amazon DynamoDB global tables for multi-Region replication

F.DynamoDB strongly consistent reads configured on the table

Why this answer

Amazon DynamoDB with DAX (DynamoDB Accelerator) is correct because DAX provides an in-memory cache that reduces read latency from single-digit milliseconds to microseconds for cached, eventually consistent reads, directly addressing the high-performance read requirement for reporting on a time-series dataset; strongly consistent reads pass through DAX to the table and are not served from the cache. DynamoDB with Auto Scaling enabled for reads and writes is correct because it automatically adjusts throughput capacity based on traffic patterns, handling the write-heavy workload during market hours and read-heavy reporting without manual intervention. DynamoDB strongly consistent reads is correct because it ensures that read operations return the most up-to-date data, meeting the strong consistency requirement for financial applications where stale reads are unacceptable; in the DynamoDB API this is requested per read operation (the ConsistentRead parameter) rather than set once on the table.
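In the DynamoDB API, strong consistency is requested on the read call itself through the ConsistentRead parameter of GetItem. A minimal sketch, assuming `client` is a DynamoDB client such as the one returned by boto3.client("dynamodb"); the helper name and the table name in the usage note are hypothetical.

```python
def get_latest_item(client, table_name: str, key: dict) -> dict:
    """Read one item by primary key with a strongly consistent read.

    `client` is assumed to expose get_item like a boto3 DynamoDB client.
    """
    response = client.get_item(
        TableName=table_name,
        Key=key,
        ConsistentRead=True,  # the default is False (eventually consistent)
    )
    return response.get("Item", {})
```

For example, get_latest_item(client, "Ticks", {"pk": {"S": "AAPL#2024-05-01"}}) would read a single key from a hypothetical Ticks table.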

Exam trap

The trap here is that candidates often confuse eventually consistent reads with strongly consistent reads in DynamoDB, assuming that DAX or Auto Scaling alone can provide strong consistency, when in fact strongly consistent reads are not the default and must be explicitly requested on each read to guarantee the latest data.

125

MCQmedium

A read-heavy media archive repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load? The design must avoid adding custom operational scripts.

A.DynamoDB Accelerator (DAX)

B.Amazon Kinesis Data Firehose

C.AWS Glue Data Catalog

D.S3 Transfer Acceleration

AnswerA

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is an in-memory cache for DynamoDB that delivers microsecond read latency, reducing the load on the underlying DynamoDB tables by serving repeated queries from its cache. This directly addresses the read-heavy media archive's millisecond latency requirements without requiring custom operational scripts, as DAX is fully managed and integrates seamlessly with existing DynamoDB API calls.

Exam trap

The trap here is that candidates may confuse DAX with other caching services like ElastiCache, but DAX is purpose-built for DynamoDB and is API-compatible, so it needs only minimal application changes (using the DAX client in place of the DynamoDB client), whereas ElastiCache would need custom application logic to manage cache invalidation and data population.

How to eliminate wrong answers

Option B (Amazon Kinesis Data Firehose) is wrong because it is a streaming data ingestion service for loading data into data stores like S3 or Redshift, not a caching layer for reducing DynamoDB read latency or table load. Option C (AWS Glue Data Catalog) is wrong because it is a metadata repository for ETL jobs and data discovery, not a low-latency cache for DynamoDB queries. Option D (S3 Transfer Acceleration) is wrong because it speeds up uploads to S3 over long distances using edge locations, but it does not cache DynamoDB data or reduce read latency for repeated queries.

126

Multi-Selectmedium

A web application uses an Amazon Aurora DB cluster for a read-heavy workload. The team wants to increase read throughput without changing the database schema or rewriting application data access patterns. Which two changes should they make? Select two.

Select 2 answers

A.Add Aurora Replicas to scale out read traffic across multiple database instances.

B.Send read queries to the Aurora reader endpoint so they are distributed across the replicas.

C.Point all queries to the writer endpoint so Aurora can balance reads and writes internally.

D.Enable Multi-AZ standby for the cluster to increase the number of read-only connections.

E.Move the database to a single larger instance class instead of adding replicas.

AnswersA, B

Aurora Replicas are the primary horizontal scaling mechanism for read-heavy Aurora workloads. They add more database compute so the cluster can process more concurrent read queries.

Why this answer

Adding Aurora Replicas (Option A) directly increases read throughput by distributing read-only queries across multiple database instances, which is ideal for a read-heavy workload. Sending read queries to the Aurora reader endpoint (Option B) ensures that these queries are load-balanced across all available replicas, offloading the writer instance and improving overall performance without requiring schema or application changes.
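In practice, using the reader endpoint is usually a connection-configuration change rather than a query rewrite. The sketch below assumes the application already knows which connections are read-only; both host names are hypothetical placeholders in the general shape of Aurora cluster and reader endpoints.

```python
# Hypothetical endpoints; real values come from the Aurora cluster's details.
WRITER_ENDPOINT = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"


def endpoint_for(read_only: bool) -> str:
    """Return the reader endpoint for read-only work, else the writer endpoint."""
    return READER_ENDPOINT if read_only else WRITER_ENDPOINT
```

Connections opened against the reader endpoint are distributed across the available Aurora Replicas, so adding a replica adds read capacity without touching this code.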

Exam trap

The trap here is that candidates confuse Multi-AZ standby (which provides high availability but not read scaling) with Aurora Replicas (which provide both read scaling and high availability), leading them to select Option D incorrectly.

127

MCQmedium

A serverless API built with AWS Lambda serves latency-sensitive requests. The team observes intermittent slow responses during traffic ramp-ups and expects some users to hit the API immediately after a period of inactivity. Which configuration best reduces cold-start latency during these ramp-ups?

A.Enable Lambda provisioned concurrency on a published alias used by the API, and set a minimum provisioned concurrency greater than zero.

B.Increase the Lambda function’s memory setting; cold starts will always be eliminated regardless of traffic patterns.

C.Switch the Lambda runtime to a newer language version and remove any VPC configuration so the function never cold starts.

D.Set an API Gateway stage variable to "warm" the function at request time, which forces immediate initialization.

AnswerA

Provisioned concurrency keeps a defined number of Lambda execution environments initialized and ready behind a specific alias. When traffic ramps up—especially after inactivity—invocations can use pre-initialized environments, reducing or eliminating cold starts for those requests.

Why this answer

Lambda provisioned concurrency keeps a specified number of execution environments initialized and ready to respond immediately, eliminating cold starts for those invocations. By setting a minimum provisioned concurrency greater than zero on the alias used by API Gateway, the function remains warm even after periods of inactivity, ensuring consistent low latency during traffic ramp-ups.
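As a rough illustration, provisioned concurrency can be set on an alias with the Lambda PutProvisionedConcurrencyConfig operation. The helper below is a sketch that assumes `client` behaves like boto3.client("lambda"); the helper name and the function and alias names in the usage note are hypothetical.

```python
def enable_provisioned_concurrency(client, function_name: str, alias: str, count: int) -> dict:
    """Keep `count` execution environments initialized behind the given alias.

    `client` is assumed to expose put_provisioned_concurrency_config
    like a boto3 Lambda client.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,  # an alias or published version, not $LATEST
        ProvisionedConcurrentExecutions=count,
    )
```

For example, enable_provisioned_concurrency(client, "orders-api", "live", 5) would request five pre-initialized environments for a hypothetical orders-api function behind its live alias.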

Exam trap

The trap here is that candidates confuse provisioned concurrency with reserved concurrency, or assume that increasing memory or changing runtime settings can fully eliminate cold starts, when only provisioned concurrency guarantees pre-warmed execution environments for latency-sensitive workloads.

How to eliminate wrong answers

Option B is wrong because increasing memory reduces cold-start duration but does not eliminate cold starts; they still occur after inactivity. Option C is wrong because switching runtimes or removing VPC configuration does not prevent cold starts; VPC-enabled functions have additional cold-start overhead, but all Lambda functions can cold start regardless of runtime or VPC settings. Option D is wrong because API Gateway stage variables are static configuration values, not mechanisms to warm functions; they cannot force initialization at request time.

128

Multi-Selecthard

A latency-sensitive telemetry service uses a custom TCP protocol on EC2 instances in private subnets. The service must preserve the client source IP for rate limiting, avoid HTTP header inspection, and keep per-request overhead as low as possible. Which changes should the team make? Select three.

Select 3 answers

A.Use a Network Load Balancer in front of the service.

B.Use a TCP or TLS listener rather than an HTTP listener.

C.Register instance or IP targets so the service can receive the original client source IP for rate limiting.

D.Use an Application Load Balancer because path-based routing improves throughput for binary protocols.

E.Expose the service through API Gateway because it supports raw TCP and UDP pass-through.

AnswersA, B, C

Correct because NLB is built for high-throughput, low-latency TCP traffic. It avoids HTTP-layer processing and is the right load balancer for a custom binary protocol.

Why this answer

Option A is correct because a Network Load Balancer (NLB) operates at Layer 4 and preserves the client source IP by default when instances are registered as targets. This allows the telemetry service to use the original IP for rate limiting without requiring HTTP header inspection, which is critical for a custom TCP protocol. NLB also introduces minimal latency and low per-request overhead, making it ideal for latency-sensitive workloads.

Exam trap

The trap here is that candidates may assume an Application Load Balancer is always better for routing logic, but for non-HTTP protocols and latency-sensitive workloads, the Network Load Balancer is the correct choice because it operates at Layer 4 without protocol inspection.

129

MCQmedium

An analytics dashboard uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The team wants the control to be enforceable during normal operations.

A.S3 lifecycle policy

B.RDS read replica and route reporting queries to it

C.Multi-AZ standby and route reads to the standby

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

B is correct because an RDS read replica is designed to offload read-heavy workloads from the primary database instance. By routing reporting queries to the read replica, the primary database is freed from processing these read-only requests, improving overall performance. This solution is enforceable during normal operations as the read replica is always available for reads, unlike a Multi-AZ standby which is not accessible for reads.

Exam trap

The trap here is confusing a Multi-AZ standby (which is not readable) with a read replica (which is readable), leading candidates to incorrectly choose C thinking the standby can serve reads.

How to eliminate wrong answers

Option A is wrong because an S3 lifecycle policy manages object transitions and expirations in S3, not database query routing or read offloading. Option C is wrong because a Multi-AZ standby is a synchronous replica used only for failover and is not accessible for read queries during normal operations; routing reads to it would fail. Option D is wrong because a larger NAT gateway increases outbound internet capacity for private subnets, which does not address database read performance or query routing.

130

MCQeasy

A latency-sensitive API is implemented with AWS Lambda. During traffic ramp-ups, users sometimes experience slow responses due to cold starts. The team wants to ensure fast initialization for a baseline level of concurrent requests. Which AWS feature should they use?

A.Lambda provisioned concurrency

B.Increase reserved instances for EC2

C.Enable S3 event notifications for every request to the API

D.Decrease the function timeout to reduce execution variability

AnswerA

Provisioned concurrency pre-initializes a specified number of Lambda execution environments and keeps them ready for invocation. This reduces or eliminates cold starts for the configured baseline concurrency during traffic ramp-ups.

Why this answer

Lambda Provisioned Concurrency keeps a specified number of execution environments initialized and ready to respond immediately, eliminating cold starts for those concurrent requests. This directly addresses the latency-sensitive API requirement during traffic ramp-ups by ensuring fast initialization for a baseline level of concurrency.

Exam trap

The trap here is that candidates may confuse 'provisioned concurrency' with 'reserved concurrency' (which only caps concurrency, not pre-warms) or think that reducing the function timeout or adding S3 triggers can somehow mitigate cold starts.

How to eliminate wrong answers

Option B is wrong because reserved instances for EC2 apply to EC2 compute capacity, not to Lambda functions, and would not address Lambda cold starts. Option C is wrong because S3 event notifications are used to trigger Lambda functions on S3 object events, not to pre-warm Lambda execution environments, and adding them for every API request would introduce unnecessary complexity and latency. Option D is wrong because decreasing the function timeout does not reduce cold start latency; it only limits the maximum execution duration, and may actually increase execution variability by forcing premature terminations.

131

MCQmedium

A video platform uses Amazon Aurora. The workload has many short-lived database connections from Lambda functions, causing connection storms. What should be added? The design must avoid adding custom operational scripts.

A.S3 Select

B.An internet gateway

C.A larger Route 53 hosted zone

D.RDS Proxy

AnswerD

RDS Proxy pools and manages database connections, improving scalability for serverless and bursty workloads.

Why this answer

RDS Proxy is the correct choice because it sits between Lambda functions and the Aurora database, pooling and reusing database connections. This prevents connection storms by reducing the overhead of establishing new connections for each short-lived Lambda invocation, without requiring custom scripts or application changes.

Exam trap

The trap here is that candidates may think scaling the database (e.g., using Aurora Auto Scaling) or adding more compute resources solves connection storms, but the real bottleneck is the connection overhead itself, which RDS Proxy directly addresses without custom scripts.

How to eliminate wrong answers

Option A is wrong because S3 Select is a service for retrieving subsets of data from objects in Amazon S3 using SQL expressions; it does not manage database connections or address connection storms. Option B is wrong because an internet gateway enables VPC-to-internet communication for public subnets; it has no role in database connection pooling or reducing connection overhead. Option C is wrong because a larger Route 53 hosted zone increases the number of DNS records you can host but does not affect database connection management or mitigate connection storms.

132

MCQeasy

Based on the exhibit, which Amazon EFS performance mode is the best fit for this workload?

A.Use General Purpose performance mode for low-latency access.

B.Use Max I/O performance mode to optimize for the highest possible latency tolerance.

C.Use One Zone storage class to increase metadata speed.

D.Use Provisioned Throughput mode because it is the only performance mode available.

AnswerA

General Purpose is the best EFS performance mode when the priority is low latency for small file operations. The exhibit describes a moderate number of clients and latency-sensitive metadata access, which matches the strengths of General Purpose. It is the usual choice for most applications unless the workload specifically needs very large-scale parallel throughput.

Why this answer

The General Purpose performance mode is the best fit for this workload because it provides the lowest latency for file operations, which is critical for latency-sensitive applications such as web serving, content management, and development environments. EFS General Purpose mode is optimized for workloads where consistent low-latency access is required, making it the default and recommended choice for most use cases.

Exam trap

The trap here is that candidates confuse performance modes (General Purpose vs. Max I/O) with throughput modes (Bursting vs. Provisioned) or storage classes (Standard vs. One Zone), leading them to select options that address throughput or availability rather than latency requirements.

How to eliminate wrong answers

Option B is wrong because Max I/O performance mode is designed for workloads that require high throughput and parallel processing, but it introduces higher latency, making it unsuitable for low-latency access requirements. Option C is wrong because One Zone storage class is a storage class, not a performance mode; it reduces durability and availability by storing data in a single Availability Zone and does not affect metadata speed. Option D is wrong because Provisioned Throughput is a throughput mode, not a performance mode; EFS offers General Purpose and Max I/O as performance modes, and Provisioned Throughput can be used with either performance mode to set a specific throughput level.

133

MCQmedium

An analytics dashboard uses an Application Load Balancer in one Region. Global users need lower network latency to the application without caching dynamic responses. What should be considered?

A.AWS Global Accelerator

B.S3 Cross-Region Replication

C.AWS Backup cross-Region copy

D.CloudFront only with long TTLs

AnswerA

Global Accelerator routes traffic over the AWS global network to improve performance for TCP/UDP applications without relying on caching.

Why this answer

AWS Global Accelerator uses the AWS global network and Anycast IPs to route traffic to the optimal Regional endpoint, reducing latency for global users without caching dynamic responses. It does not cache content, so dynamic data is always fetched from the origin, meeting the requirement of no caching while improving network performance via the AWS backbone.

Exam trap

The trap here is that candidates often choose CloudFront for any global latency improvement, but the requirement of 'no caching dynamic responses' disqualifies CloudFront unless TTL=0 is used, which still incurs edge request overhead, whereas Global Accelerator is purpose-built for non-cached dynamic traffic.

How to eliminate wrong answers

Option B (S3 Cross-Region Replication) is wrong because it replicates static objects across S3 buckets in different Regions, which does not reduce latency for dynamic application responses served by an ALB. Option C (AWS Backup cross-Region copy) is wrong because it is a backup and disaster recovery feature for copying backup data across Regions, not a mechanism to lower network latency for live application traffic. Option D (CloudFront only with long TTLs) is wrong because CloudFront caches content at edge locations, and using long TTLs would serve stale cached responses, violating the requirement of no caching for dynamic responses.

134

MCQhard

A DynamoDB table for a retail API has a partition key based only on the current date. Write throttling occurs during business hours. What is the best design change? The architecture review board prefers a managed AWS-native control.

A.Use a higher-cardinality partition key that distributes writes across partitions

B.Create a global secondary index with the same date key

C.Reduce the table's write capacity

D.Move the table to S3 Glacier Instant Retrieval

AnswerA

A low-cardinality hot partition causes throttling; a better key spreads writes more evenly.

Why this answer

Using only the current date as a partition key creates a hot partition because all writes for the day target a single partition, leading to throttling. A higher-cardinality partition key, such as a composite key combining date with a unique attribute like user ID or order ID, distributes writes evenly across multiple partitions, fully utilizing DynamoDB's provisioned throughput. This is the best managed-native solution to resolve write throttling without changing the table's capacity or moving data.
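The effect of key cardinality can be seen in a small simulation. This is a toy model only: DynamoDB's internal hash function is not public, so MD5 stands in for it here, and the partition count of 8, the sample date, and the order IDs are all invented for illustration.

```python
import hashlib
from collections import Counter


def partition_for(key: str, partitions: int) -> int:
    """Map a partition key to a partition using a stand-in hash (not DynamoDB's)."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def spread(keys, partitions: int = 8) -> Counter:
    """Count how many writes land on each simulated partition."""
    return Counter(partition_for(k, partitions) for k in keys)


date_only = ["2024-05-01"] * 1000                            # every write shares one key
composite = [f"2024-05-01#order-{i}" for i in range(1000)]   # date plus a unique suffix

print(len(spread(date_only)), len(spread(composite)))
```

With the date-only key, all 1000 writes hash to a single partition; with the composite key, the same writes are distributed over several partitions.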

Exam trap

The trap here is that candidates often think adding a GSI or adjusting capacity solves throttling, but the root cause is the partition key's low cardinality, which only a higher-cardinality key can fix by distributing writes across partitions.

How to eliminate wrong answers

Option B is wrong because a global secondary index (GSI) with the same date key does not solve the hot partition issue; GSIs have their own throughput and inherit the same write distribution problem, potentially causing throttling on the index. Option C is wrong because reducing the table's write capacity would worsen throttling during business hours, not resolve the underlying hot partition caused by the poor key design. Option D is wrong because S3 Glacier Instant Retrieval is an object storage class for infrequently accessed data with millisecond retrieval, not a replacement for DynamoDB's low-latency, high-throughput key-value access, and moving the table would break the API's real-time requirements.

135

MCQeasy

A.Write to DynamoDB only, and never update or invalidate the Redis cache.

B.Use a cache-aside approach with TTL plus explicit invalidation after writes.

C.Cache only for reads, and do not fetch from DynamoDB when a key is missing.

D.Rely on eventual consistency of Redis replication to propagate updates to all nodes.

AnswerB

Why this answer

The cache-aside (lazy loading) pattern with TTL plus explicit invalidation ensures that after a write to DynamoDB, the stale Redis entry is removed, forcing the next read to fetch the updated profile from DynamoDB and repopulate the cache. This minimizes the window of inconsistency while keeping cache management simple and efficient for user profile workloads.
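The read and write paths of cache-aside with TTL and explicit invalidation can be sketched with plain dictionaries. In this sketch the `store` dict stands in for the DynamoDB table and the `cache` dict stands in for Redis; the class name, method names, and the 60-second TTL are assumptions made for illustration.

```python
import time


class CacheAsideProfiles:
    def __init__(self, store: dict, ttl_seconds: float = 60.0):
        self.store = store   # stands in for the DynamoDB table
        self.cache = {}      # stands in for Redis: key -> (value, expires_at)
        self.ttl = ttl_seconds

    def get(self, user_id):
        entry = self.cache.get(user_id)
        if entry is not None and entry[1] > time.monotonic():
            return entry[0]                  # cache hit, still within TTL
        value = self.store.get(user_id)      # miss or expired: read the source of truth
        if value is not None:
            self.cache[user_id] = (value, time.monotonic() + self.ttl)
        return value

    def put(self, user_id, profile):
        self.store[user_id] = profile        # write to the primary store first
        self.cache.pop(user_id, None)        # then invalidate the cached copy
```

The TTL bounds staleness if an invalidation is ever missed, while the explicit invalidation in put makes the next read after a write go back to the store.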

Exam trap

The trap here is that candidates may confuse cache-aside with write-through or write-behind patterns, or assume that Redis replication alone can solve cache consistency, when in fact explicit invalidation is required to ensure reads see the latest value after a write to the primary data store.

How to eliminate wrong answers

Option A is wrong because never updating or invalidating the cache means reads will serve stale data indefinitely, violating the requirement that subsequent reads see the latest value quickly. Option C is wrong because caching only for reads and not fetching from DynamoDB on a cache miss would result in cache misses returning no data, effectively breaking the application's ability to serve profiles. Option D is wrong because relying on eventual consistency of Redis replication does not address the core issue of cache staleness after a write; Redis replication propagates data between nodes but does not invalidate or update cached entries that were written before the DynamoDB update.

136

MCQhard

Based on the exhibit, which change will most improve the CloudFront cache hit ratio for the static assets while still serving the same files to all users?

A.Create a custom cache policy that includes only the v query string and excludes cookies.

B.Enable Origin Shield and keep the current cache behavior unchanged.

C.Move the static assets to individual presigned URLs for each viewer.

D.Increase the CloudFront default TTL to 24 hours while continuing to forward all cookies and query strings.

AnswerA

This removes unnecessary cache-key fragmentation. Since all users receive identical static files, forwarding user-specific cookies and irrelevant query strings destroys cache reuse. Keeping only the version parameter preserves correct object variation while allowing many more requests to hit the same cached object at the edge.

Why this answer

Option A is correct because static assets (e.g., images, CSS, JS) are served identically to all users, so including only the version query string 'v' in the cache key still lets CloudFront cache a single object per version. By excluding cookies and other query strings, you prevent cache fragmentation caused by irrelevant variations, directly improving the cache hit ratio. This custom cache policy ensures that requests for the same 'v' value are served from the edge cache rather than forwarded to the origin.
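Why an allow-list improves the hit ratio can be shown with a simplified model of a cache key. This is not CloudFront's actual key format; the function name and URLs are hypothetical, and the model only captures the idea that the path plus the allow-listed query strings identify a cached object while cookies are ignored.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url: str, allowed_params=("v",)) -> str:
    """Build a key from the path plus only the allow-listed query strings.

    Cookies and all other query strings are deliberately left out of the key.
    """
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in allowed_params)
    query = urlencode(kept)
    return f"{parts.path}?{query}" if query else parts.path
```

Under this model, /app.js?v=42&utm_source=mail and /app.js?session=abc&v=42 map to the same key, so the second request is a hit; a request with v=43 maps to a different key and fetches the new version.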

Exam trap

The trap here is that candidates assume increasing TTL or enabling Origin Shield will fix a low cache hit ratio, when the real issue is cache key fragmentation caused by forwarding all cookies and query strings.

How to eliminate wrong answers

Option B is wrong because enabling Origin Shield reduces load on the origin and improves cache fill efficiency, but it does not address the root cause of a low cache hit ratio—forwarding all cookies and query strings still fragments the cache at the edge. Option C is wrong because moving static assets to individual presigned URLs for each viewer would make every request unique, destroying any possibility of caching and drastically reducing the cache hit ratio. Option D is wrong because increasing the default TTL to 24 hours while continuing to forward all cookies and query strings does not solve cache fragmentation; CloudFront still treats requests with different cookies or query strings as separate cache objects, so the cache hit ratio remains low.

137

MCQmedium

A trading analytics system deploys 10 EC2 instances that exchange very frequent, low-latency messages over the network. The instances must be placed as close together as possible to minimize network hop count and inter-node jitter. Which deployment choice best matches this requirement?

A.Use a spread placement group to distribute instances across multiple underlying hardware to improve overall availability.

B.Use a cluster placement group so the instances are placed close together to reduce latency and jitter.

C.Use no placement group and rely on the Auto Scaling group to balance instance placement automatically.

D.Use a partition placement group so each instance is assigned to separate failure domains for low variance.

AnswerB

Cluster placement groups place instances close together (within a single Availability Zone when supported) to reduce network hop count and improve inter-instance network performance. This directly targets low-latency, jitter-sensitive communication between many nodes.

Why this answer

A cluster placement group is designed for low-latency, high-throughput scenarios: it packs instances close together inside a single Availability Zone, minimizing network hop count and inter-node jitter. This directly meets the requirement for very frequent, low-latency messaging between 10 EC2 instances.

Exam trap

The trap here is that candidates often confuse 'low latency' with 'high availability' and choose a spread placement group (Option A) thinking it reduces jitter, when in fact it increases network distance and latency by distributing instances across hardware.

How to eliminate wrong answers

Option A is wrong because a spread placement group distributes instances across distinct underlying hardware to maximize availability, which increases network distance and latency, opposite to the requirement. Option C is wrong because relying on an Auto Scaling group without a placement group does not guarantee close physical proximity; instances may be placed across different racks or AZs, increasing jitter. Option D is wrong because a partition placement group isolates instances into separate failure domains (partitions) to reduce correlated failures, but this increases network hops between partitions, not minimizing latency.

138

MCQmedium

An analytics dashboard uses an Application Load Balancer in one Region. Global users need lower network latency to the application without caching dynamic responses. What should be considered? The design must avoid adding custom operational scripts.

A.AWS Global Accelerator

B.S3 Cross-Region Replication

C.AWS Backup cross-Region copy

D.CloudFront only with long TTLs

AnswerA

Global Accelerator routes traffic over the AWS global network to improve performance for TCP/UDP applications without relying on caching.

Why this answer

AWS Global Accelerator uses the AWS global network to route traffic from edge locations to the optimal regional endpoint, reducing latency and jitter for global users. It does not cache content, making it ideal for dynamic responses that cannot be cached. The service requires no custom scripts: it provides static anycast IP addresses as a fixed entry point and accepts the Application Load Balancer as an endpoint.

Exam trap

The trap here is that candidates often confuse Global Accelerator with CloudFront, assuming both are for caching, but Global Accelerator does not cache content and is specifically designed for non-cacheable, dynamic traffic requiring low latency and fast failover.

How to eliminate wrong answers

Option B (S3 Cross-Region Replication) is wrong because it replicates objects between S3 buckets in different Regions for data redundancy and disaster recovery; it does not route live requests or reduce network latency for dynamic application traffic. Option C (AWS Backup cross-Region copy) is wrong because it copies backup data across Regions for compliance or disaster recovery and has no impact on live application latency or traffic routing. Option D (CloudFront only with long TTLs) is wrong because long TTLs make CloudFront cache responses at edge locations, which violates the requirement to avoid caching dynamic responses and would serve stale data to users.

139

MCQhard

Based on the exhibit, what is the best change to improve read performance without increasing write latency on the primary database?

A.Create an RDS read replica and direct the reporting queries to the replica endpoint.

B.Convert the DB instance to Multi-AZ so the primary can serve more reads.

C.Increase the primary instance class to a larger size and keep all traffic on one writer.

D.Migrate the reporting workload to DynamoDB to gain faster reads.

AnswerA

Why this answer

Creating an RDS read replica offloads read-heavy reporting queries from the primary database instance, improving read performance without increasing write latency on the primary. The replica operates asynchronously, so writes on the primary are not blocked or delayed by the reporting workload.

Exam trap

The trap here is confusing Multi-AZ (which only provides failover redundancy) with read replicas (which provide read scaling), leading candidates to incorrectly select Multi-AZ as a performance solution.

How to eliminate wrong answers

Option B is wrong because Multi-AZ provides high availability and automatic failover, not additional read capacity; in a standard Multi-AZ DB instance deployment the standby cannot serve reads. Option C is wrong because scaling up the instance class increases both read and write capacity but does not isolate reporting traffic, so write latency could still be impacted by heavy reads. Option D is wrong because migrating to DynamoDB is a full architectural change rather than a targeted fix for read load on the existing RDS database, and it introduces data synchronization complexity.

140

MCQhard

A DynamoDB table for a travel booking site has a partition key based only on the current date. Write throttling occurs during business hours. What is the best design change? The architecture review board prefers a managed AWS-native control.

A.Create a global secondary index with the same date key

B.Move the table to S3 Glacier Instant Retrieval

C.Reduce the table's write capacity

D.Use a higher-cardinality partition key that distributes writes across partitions

AnswerD

A low-cardinality hot partition causes throttling; a better key spreads writes more evenly.

Why this answer

Option D is correct because using a low-cardinality partition key like the current date causes all writes to land on a single partition, leading to throttling. By choosing a higher-cardinality partition key (e.g., combining date with a user ID or booking ID), writes are distributed evenly across multiple partitions, leveraging DynamoDB's internal partitioning to handle the throughput. This is a managed, AWS-native design change that resolves hot partition issues without additional services.
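One way to picture the higher-cardinality key is write sharding with a calculated suffix. The sketch below is illustrative only: the shard count, the `date#shard` key format, and the `booking-N` IDs are assumptions, not details from the question. It derives a stable suffix from a hash of the booking ID so one day's writes spread across several partition key values instead of one.

```python
import hashlib
from collections import Counter

SHARD_COUNT = 10  # hypothetical number of write shards per date


def sharded_partition_key(booking_date: str, booking_id: str) -> str:
    """Return a partition key of the form '<date>#<shard>'.

    The shard suffix comes from a stable hash of the booking ID, so the
    same booking always maps to the same key.
    """
    digest = hashlib.sha256(booking_id.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % SHARD_COUNT
    return f"{booking_date}#{shard}"


# Simulate one day's writes and count how many land on each key value.
keys = Counter(
    sharded_partition_key("2024-05-01", f"booking-{n}") for n in range(10_000)
)
print(len(keys))           # distinct partition key values in use
print(max(keys.values()))  # writes on the busiest key value
```

The trade-off of a sharded key is on the read side: fetching everything for one date means querying each suffix and merging the results.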

Exam trap

The trap here is that candidates often confuse a GSI as a solution for write performance, when in fact GSIs only help with read query patterns and do not alleviate write hot spots on the base table.

How to eliminate wrong answers

Option A is wrong because creating a global secondary index (GSI) with the same date key does not solve the write throttling; GSIs have their own write capacity and inherit the same hot partition problem from the base table's partition key. Option B is wrong because moving the table to S3 Glacier Instant Retrieval is not a managed AWS-native control for DynamoDB write throttling; S3 is a different storage service and cannot replace DynamoDB's real-time transactional write capabilities. Option C is wrong because reducing the table's write capacity would worsen throttling during business hours, as it lowers the maximum allowed writes per second, directly contradicting the need to handle high write demand.

141

Multi-Selecthard

A low-latency market-data engine runs 10 EC2 instances that exchange small messages thousands of times per second. The team wants the lowest possible network latency and jitter, and they can tolerate single-AZ placement for this tier because another layer handles disaster recovery. Which changes should they make? Select three.

Select 3 answers

A.Use a cluster placement group for the instances.

B.Use Nitro-based instances with enhanced networking support.

C.Launch all latency-sensitive nodes in one Availability Zone to fit the cluster placement group constraint.

D.Use a spread placement group to maximize low-latency communication across the fleet.

E.Distribute the instances across multiple Availability Zones to reduce intra-cluster latency.

AnswersA, B, C

Correct. Cluster placement groups place instances physically close together within an Availability Zone to minimize latency and maximize network throughput. This is the standard AWS design for tightly coupled workloads that depend on frequent east-west communication.

Why this answer

A cluster placement group is designed for low-latency, high-throughput scenarios by ensuring instances are in close proximity within a single Availability Zone, which minimizes network latency and jitter. This directly meets the requirement for the lowest possible network latency and jitter for the market-data engine.
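The single-AZ constraint in options A and C can be expressed as a small pre-launch check. This is a sketch under stated assumptions: the instance names and the shape of the `instance_azs` mapping are hypothetical, and the function only validates a plan; it does not call any AWS API.

```python
from typing import Dict


def validate_cluster_placement(instance_azs: Dict[str, str]) -> str:
    """Return the shared Availability Zone, or raise if the plan spans several.

    instance_azs maps a (hypothetical) instance name to its planned AZ.
    """
    zones = set(instance_azs.values())
    if len(zones) != 1:
        raise ValueError(f"cluster placement needs one AZ, got {sorted(zones)}")
    return zones.pop()


# Hypothetical plan for the 10-node market-data tier, all in one AZ.
plan = {f"md-node-{n}": "us-east-1a" for n in range(10)}
print(validate_cluster_placement(plan))
```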

Exam trap

The trap here is that candidates may confuse spread placement groups (designed for fault tolerance) with cluster placement groups (designed for low latency), or incorrectly think distributing across AZs reduces latency when it actually increases it.

142

MCQhard

A Lambda-based travel booking site has unpredictable traffic spikes and users see latency caused by cold starts. The function must respond consistently during expected campaign windows. What should be configured? The team wants the control to be enforceable during normal operations.

A.Provisioned concurrency during campaign windows

B.A larger deployment package

C.CloudTrail data events

D.Reserved concurrency only

AnswerA

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, eliminating cold starts for those instances. During campaign windows, this ensures consistent latency by keeping functions warm and ready to handle spikes immediately. The team can enforce this configuration only during expected high-traffic periods, leaving normal operations unaffected.
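To decide how many environments to provision for a campaign window, a common starting estimate is request rate multiplied by average duration. The sketch below shows only that arithmetic; the traffic figures are hypothetical and a real setting would normally include headroom on top.

```python
import math


def required_concurrency(requests_per_second: float, avg_duration_seconds: float) -> int:
    """Estimate concurrent executions as request rate x average duration."""
    return math.ceil(requests_per_second * avg_duration_seconds)


# Hypothetical campaign-window traffic: 400 requests/s at 250 ms each.
print(required_concurrency(400, 0.25))  # 100
```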

Exam trap

The trap here is confusing reserved concurrency (which only limits scaling) with provisioned concurrency (which pre-warms instances); candidates often pick reserved concurrency thinking it prevents cold starts, but it does not address initialization latency.

How to eliminate wrong answers

Option B is wrong because a larger deployment package increases the time to download and initialize the function code, worsening cold starts rather than solving them. Option C is wrong because CloudTrail data events record API activity for auditing and do not affect Lambda execution latency or concurrency. Option D is wrong because reserved concurrency only sets aside and caps the number of concurrent executions for a function; it does not pre-initialize execution environments, so cold starts remain.

143

MCQeasy

Your team hosts versioned static assets (for example, /static/app-<buildHash>.js). Each build hash never changes, but you release new files on new URLs. To maximize cache hit rate and reduce origin load using CloudFront, what should you do when generating HTTP responses for these assets?

A.Set Cache-Control: no-cache so CloudFront always revalidates with the origin

B.Set Cache-Control: public, max-age=31536000, immutable for the versioned assets

C.Set Cache-Control: max-age=0 and rely on CloudFront to cache by default

D.Disable CloudFront caching and forward all headers and query strings to the origin

AnswerB

For content-addressed/versioned URLs, a long max-age lets CloudFront treat the object as fresh for a long period. Adding the immutable directive tells clients not to revalidate while the max-age is still valid, supporting high cache hit rates and fewer origin fetches for repeat requests.

Why this answer

Option B is correct because setting `Cache-Control: public, max-age=31536000, immutable` tells CloudFront and browsers to cache the versioned asset for one year (31536000 seconds) and never revalidate, since the URL changes with each new build. The `immutable` directive (RFC 8246) signals that the content will never change on that URL, eliminating conditional revalidation requests and maximizing cache hits, which reduces origin load.
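A minimal sketch of how an origin might choose the header, assuming a hypothetical naming rule in which versioned files carry a hex build hash of at least eight characters before the extension. The `no-cache` fallback for everything else is also an assumption for illustration, not part of the question.

```python
import re

# Hypothetical naming rule, e.g. /static/app-3f9a1c2b.js
VERSIONED_ASSET = re.compile(r"^/static/.+-[0-9a-f]{8,}\.[a-z0-9]+$")


def cache_control_for(path: str) -> str:
    """Return the Cache-Control value to send for a request path."""
    if VERSIONED_ASSET.match(path):
        # Content at this URL never changes, so cache it for one year.
        return "public, max-age=31536000, immutable"
    # Assumed default for unversioned paths: cacheable, but revalidate each time.
    return "no-cache"


print(cache_control_for("/static/app-3f9a1c2b.js"))
print(cache_control_for("/index.html"))
```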

Exam trap

The trap here is that candidates confuse 'no-cache' (which still allows caching but forces revalidation) with 'no-store' (which forbids caching entirely), or they assume that `max-age=0` is acceptable for versioned assets, not realizing it forces revalidation and reduces cache efficiency.

How to eliminate wrong answers

Option A is wrong because `Cache-Control: no-cache` forces CloudFront to revalidate every request with the origin, which increases origin load and defeats the purpose of caching immutable assets. Option C is wrong because `max-age=0` tells CloudFront and browsers to treat the response as stale immediately, requiring revalidation on every request, which reduces cache hit rate and increases origin load. Option D is wrong because disabling CloudFront caching and forwarding all headers and query strings turns CloudFront into a pass-through: every request goes to the origin, which maximizes origin load and eliminates caching benefits.

144

Multi-Selectmedium

A single EC2 instance hosts a low-latency database cache that writes a large random working set to block storage. The application needs sustained high IOPS and low latency, and the storage must remain attached to the instance while it runs. Which two design choices best meet the requirement? Select two.

Select 2 answers

A.Use an io2 Block Express EBS volume for the highest sustained IOPS and low-latency performance.

B.Stripe multiple EBS volumes together with RAID 0 to increase aggregate IOPS and throughput.

C.Use an S3 bucket as the backing store because object storage scales automatically.

D.Choose a cold HDD-based volume so the cache has durable low-cost storage.

E.Use the root volume from a T-series instance because burst credits can absorb the write spikes.

AnswersA, B

io2 Block Express is designed for demanding block-storage workloads that need very high, consistent IOPS with low latency. It is a strong fit when the data must remain on attached EBS storage rather than on ephemeral instance store.

Why this answer

Option A is correct because io2 Block Express EBS volumes are designed for mission-critical workloads requiring sustained high IOPS and low latency. They offer up to 256,000 IOPS per volume with sub-millisecond latency, making them ideal for a low-latency database cache that writes a large random working set to block storage. The storage remains attached to the EC2 instance while it runs, meeting the requirement for persistent block-level storage.
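For option B, the arithmetic behind RAID 0 striping can be sketched as follows. The per-volume and per-instance figures are hypothetical placeholders; the point is that the stripe is paced by its slowest member and the total is still capped by whatever EBS limit the instance type imposes.

```python
from typing import Sequence


def striped_iops(volume_iops: Sequence[int], instance_iops_limit: int) -> int:
    """Approximate usable IOPS of a RAID 0 stripe across EBS volumes."""
    if not volume_iops:
        raise ValueError("at least one volume is required")
    # Every member serves an equal share, so the slowest volume sets the pace.
    stripe = min(volume_iops) * len(volume_iops)
    # The instance's own EBS limit caps the aggregate.
    return min(stripe, instance_iops_limit)


# Hypothetical figures: four 16,000-IOPS volumes on an instance limited to 40,000.
print(striped_iops([16_000] * 4, 40_000))  # 40000
```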

Exam trap

The trap here is that candidates may confuse burst credits (which apply to CPU performance on T-series instances) with storage performance, or assume that object storage like S3 can serve as a low-latency block device, when in fact only EBS volumes provide the required persistent, low-latency block storage attached to an EC2 instance.

145

MCQhard

Based on the exhibit, which storage design best supports the application servers' shared working directory requirement?

A.Mount Amazon EFS on every EC2 instance and use it as the shared workspace.

B.Attach one gp3 EBS volume to each instance and synchronize the files with cron jobs.

C.Store the artifacts in S3 and have each node read them directly from S3 as a filesystem.

D.Use instance store on each instance because it provides the fastest local file access.

AnswerA

EFS provides shared, persistent, POSIX-compliant file access across multiple EC2 instances and Availability Zones. That matches the requirement that all nodes see the same workspace immediately and that files survive instance replacement. It is the right choice when the application needs a common filesystem rather than an object store or local-only disk.

Why this answer

Amazon EFS provides a fully managed, NFS-based shared file system that can be mounted concurrently on multiple EC2 instances across multiple Availability Zones. This directly satisfies the requirement for a shared working directory where all application servers can read and write files simultaneously without additional synchronization overhead.

Exam trap

The trap here is that candidates often confuse object storage (S3) with shared file storage, assuming S3 can serve as a drop-in replacement for a POSIX filesystem, but S3 lacks file locking, atomic renames, and low-latency metadata operations required for a shared working directory.

How to eliminate wrong answers

Option B is wrong because attaching a separate gp3 EBS volume to each instance and synchronizing files with cron jobs adds complexity, sync delay, and potential data inconsistency, and does not provide a true shared filesystem; a gp3 volume attaches to only one instance at a time (Multi-Attach is limited to Provisioned IOPS io1/io2 volumes within a single Availability Zone). Option C is wrong because S3 is an object storage service, not a POSIX-compliant filesystem; using it as a filesystem through tools like s3fs adds performance overhead and lacks native file locking and atomic renames, making it unsuitable for a shared working directory with concurrent writes. Option D is wrong because instance store is ephemeral block storage physically attached to the host; data is lost on instance stop or termination, and it cannot be shared across instances, so it does not meet the shared working directory requirement.

146

MCQmedium

A web application uses an Amazon Aurora DB cluster for a read-heavy workload. The application team needs higher read throughput but cannot change the database schema. They want to avoid blocking writes and are willing to route read traffic separately. What is the most appropriate architecture change?

A.Create Aurora read replicas and route SELECT queries to an Aurora reader endpoint.

B.Scale up the writer instance storage only; read capacity will automatically increase without using a reader endpoint.

C.Move the Aurora cluster to Multi-AZ deployment mode only; read scaling is handled automatically without replicas.

D.Replace the cluster with a single RDS instance because it offers consistent performance for both reads and writes.

AnswerA

Read replicas increase read capacity, and using the Aurora reader endpoint sends read traffic to replicas.

Why this answer

Creating Aurora read replicas and routing SELECT queries to the Aurora reader endpoint is the most appropriate architecture change because Aurora's reader endpoint distributes read traffic across up to 15 low-latency read replicas, providing higher aggregate read throughput without blocking writes. This approach requires no schema changes and allows the application to separate read and write traffic, directly addressing the read-heavy workload requirement.
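Routing read traffic separately usually comes down to choosing a connection target per statement. The sketch below is a simplified illustration: both endpoint hostnames are hypothetical, and the first-keyword check is an assumption that ignores cases such as `SELECT ... FOR UPDATE` or reads that must see a just-committed write, which belong on the writer.

```python
# Hypothetical endpoints; real values come from the cluster's configuration.
WRITER_ENDPOINT = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"


def endpoint_for(sql: str) -> str:
    """Send plain SELECT statements to the reader endpoint, the rest to the writer."""
    words = sql.split(None, 1)
    first_keyword = words[0].upper() if words else ""
    return READER_ENDPOINT if first_keyword == "SELECT" else WRITER_ENDPOINT


print(endpoint_for("SELECT id, total FROM orders WHERE status = 'open'"))
print(endpoint_for("UPDATE orders SET status = 'closed' WHERE id = 42"))
```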

Exam trap

The trap here is that candidates often confuse Multi-AZ deployment (which provides failover only) with read replica scaling, or mistakenly believe that scaling storage or using a single instance can improve read throughput without schema changes.

How to eliminate wrong answers

Option B is wrong because scaling up the writer instance storage does not increase read throughput; Aurora's read capacity is tied to compute resources (e.g., instance class) and the number of replicas, not storage size. Option C is wrong because Multi-AZ deployment in Aurora is for high availability and failover, not for scaling read throughput; read scaling requires dedicated read replicas with a reader endpoint. Option D is wrong because replacing the cluster with a single RDS instance would eliminate read scaling capabilities and introduce a single point of failure, making it unsuitable for a read-heavy workload that needs higher throughput without blocking writes.

147

MCQhard

A DynamoDB table for a retail API has a partition key based only on the current date. Write throttling occurs during business hours. What is the best design change?

A.Use a higher-cardinality partition key that distributes writes across partitions

B.Create a global secondary index with the same date key

C.Reduce the table's write capacity

D.Move the table to S3 Glacier Instant Retrieval

AnswerA

A low-cardinality hot partition causes throttling; a better key spreads writes more evenly.

Why this answer

Using a partition key based solely on the current date creates a 'hot partition' because all writes for that day target the same partition, leading to throttling. A higher-cardinality partition key (e.g., combining date with a unique attribute like user ID or order ID) distributes write traffic evenly across multiple partitions, allowing DynamoDB to utilize its full throughput capacity and eliminating throttling.

Exam trap

The trap here is that candidates may think a GSI can solve write throttling, but GSIs only help with read patterns and do not redistribute write load on the base table.

How to eliminate wrong answers

Option B is wrong because creating a global secondary index (GSI) with the same date key does not change the base table's partition key; writes still target the same hot partition, so throttling persists. Option C is wrong because reducing the table's write capacity would worsen throttling, not solve it, as the issue is uneven distribution, not insufficient total capacity. Option D is wrong because S3 Glacier Instant Retrieval is an object storage service for archival data, not a transactional database; it cannot support DynamoDB's low-latency read/write operations or query patterns.

148

MCQmedium

A document portal requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable? The architecture review board prefers a managed AWS-native control.

A.sc1 Cold HDD

B.Instance store only

C.Provisioned IOPS SSD such as io2

D.st1 Throughput Optimized HDD

AnswerC

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

The scenario requires consistent high IOPS for a transactional database, which demands low-latency, predictable performance. Provisioned IOPS SSD (io2) lets you provision a specific IOPS rate and is designed to deliver it consistently, making it the best fit among the options for latency-sensitive transactional workloads. It is also a managed AWS-native service, satisfying the architecture review board's preference.

Exam trap

The trap here is that candidates may confuse 'high IOPS' with throughput-optimized HDDs (st1) or mistakenly think instance store offers managed persistence, but the key differentiator is the need for consistent, provisioned IOPS and managed durability, which among these options only io2 provides.

How to eliminate wrong answers

Option A is wrong because sc1 Cold HDD is designed for infrequently accessed, cold data with low cost, offering burstable throughput but very low IOPS, making it unsuitable for transactional databases requiring consistent high IOPS. Option B is wrong because instance store provides temporary, block-level storage that is physically attached to the host, but it is not managed (data is lost on instance stop/termination) and does not qualify as a managed AWS-native control. Option D is wrong because st1 Throughput Optimized HDD is optimized for large, sequential workloads like big data and log processing, not for random I/O patterns of transactional databases, and it cannot guarantee high IOPS.

149

MCQmedium

A media archive requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable? The team wants the control to be enforceable during normal operations.

A.Provisioned IOPS SSD such as io2

B.st1 Throughput Optimized HDD

C.Instance store only

D.sc1 Cold HDD

AnswerA

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

The io2 Provisioned IOPS SSD volume type is designed for latency-sensitive transactional database workloads that require consistent high IOPS. It allows you to specify a guaranteed IOPS rate (up to 256,000 IOPS for io2 Block Express) and provides 99.999% durability, making it ideal for enforcing performance control during normal operations.

Exam trap

The trap here is that candidates often confuse throughput-optimized HDD (st1) with IOPS-focused SSD, assuming 'high throughput' implies high IOPS, but st1 is designed for sequential access and cannot provide the low-latency random I/O that transactional databases require.

How to eliminate wrong answers

Option B (st1 Throughput Optimized HDD) is wrong because it is a throughput-optimized HDD volume designed for large, sequential workloads like big data and log processing, not for transactional databases requiring consistent low-latency IOPS. Option C (Instance store only) is wrong because instance store volumes are ephemeral and data is lost on instance stop or termination, making them unsuitable for persistent database storage. Option D (sc1 Cold HDD) is wrong because it is a cold HDD volume optimized for infrequently accessed data with the lowest cost, offering very low IOPS and throughput that cannot meet the demands of a transactional database.

150

MCQeasy

Based on the exhibit, the team wants to improve application performance without changing the code. Which EC2 instance family should they choose next?

A.Choose a compute-optimized instance family such as C6i to increase CPU performance.

B.Choose a memory-optimized instance family such as R6i to provide more RAM.

C.Choose a storage-optimized instance family such as I4i to improve block storage throughput.

D.Choose a burstable instance family such as T3 to reduce cost and improve performance.

AnswerB

Memory-optimized instances are the best fit when memory pressure is causing slowdowns. The exhibit shows CPU is low while memory is consistently near saturation, which strongly suggests the application needs more RAM rather than more compute. Moving to an R6i family should reduce paging and improve response times without changing the application design.

Why this answer

The exhibit shows a memory-constrained application (e.g., high cache miss rates or swap usage) where performance is bottlenecked by insufficient RAM. Choosing a memory-optimized instance family like R6i provides more RAM per vCPU, allowing the application to keep more data in memory and reduce disk I/O, directly improving performance without code changes.
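The reasoning from metrics to instance family can be sketched as a simple rule. The 80% threshold and the sample readings are hypothetical (the exhibit's actual values are not reproduced here); the function only illustrates the decision, not an AWS API.

```python
def suggest_family(cpu_percent: float, memory_percent: float, threshold: float = 80.0) -> str:
    """Map sustained utilization to an instance-family category."""
    cpu_hot = cpu_percent >= threshold
    memory_hot = memory_percent >= threshold
    if memory_hot and not cpu_hot:
        return "memory-optimized"
    if cpu_hot and not memory_hot:
        return "compute-optimized"
    if cpu_hot and memory_hot:
        return "larger size in the current family"
    return "no change indicated"


# Hypothetical readings: low CPU, memory near saturation.
print(suggest_family(cpu_percent=25.0, memory_percent=95.0))  # memory-optimized
```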

Exam trap

The trap here is that candidates assume 'improve performance' always means faster CPU, but the exhibit's memory pressure metric (e.g., high swap usage or cache miss rate) directly points to RAM as the bottleneck, making memory-optimized instances the correct choice.

How to eliminate wrong answers

Option A is wrong because compute-optimized instances (C6i) increase CPU performance, but the exhibit indicates the bottleneck is memory, not CPU; adding CPU power will not relieve memory pressure. Option C is wrong because storage-optimized instances (I4i) improve local storage throughput, which helps I/O-bound workloads, but the exhibit shows memory pressure, not storage latency or throughput issues. Option D is wrong because burstable instances (T3) are designed for variable workloads that run on CPU credits; they offer less consistent CPU performance and less memory per vCPU than a memory-optimized family, so they would not relieve a memory-constrained application.
