CCNA Design High-Performing Architectures Questions — Page 3 of 4

151

Multi-Select · hard

A serverless checkout API runs on AWS Lambda behind API Gateway. Traffic spikes are predictable every weekday at 09:00 UTC, and p95 latency jumps for the first few minutes after each deployment because execution environments are cold. The team wants to reduce this startup impact without changing the API contract. Which changes should they make? Select three.

Select 3 answers

A. Configure provisioned concurrency on the production Lambda alias during the busy windows.

B. Initialize SDK clients and other reusable objects outside the handler so they are created once per execution environment.

C. Reduce the deployment package size and remove unnecessary layers to shorten function initialization.

D. Replace provisioned concurrency with reserved concurrency because reserved concurrency keeps instances warm.

E. Increase the function timeout so the first request has more time to warm up.

Answers: A, B, C

Correct. Provisioned concurrency keeps a pool of pre-initialized execution environments ready to handle invocations, which directly reduces cold-start latency. Using an alias allows the team to manage production traffic separately from development or canary versions and to schedule capacity for the predictable weekday peak.

Why this answer

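Provisioned concurrency initializes a specified number of execution environments in advance, so when traffic spikes at 09:00 UTC, requests are served by already-initialized environments instead of paying cold-start latency. Options B and C shorten the initialization that still happens on any cold start: objects created outside the handler are built once per execution environment and reused by later invocations, and a smaller deployment package with fewer layers takes less time to load. None of these changes alters the API contract. Option E is wrong because a longer timeout only lets a slow request run longer; it does not make initialization faster.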

Exam trap

The trap here is confusing reserved concurrency (which only limits concurrency) with provisioned concurrency (which pre-warms instances), leading candidates to incorrectly select reserved concurrency as a solution for cold starts.
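
A minimal sketch of option B, assuming a Python runtime: objects created at module scope are built once per execution environment and reused by later invocations. `ExpensiveClient` and `INIT_COUNT` are hypothetical stand-ins for an SDK client and a counter, not part of any real library.

```python
import time

INIT_COUNT = 0  # counts how many times the expensive setup runs


class ExpensiveClient:
    """Hypothetical stand-in for an SDK client that is slow to construct."""

    def __init__(self):
        global INIT_COUNT
        INIT_COUNT += 1
        time.sleep(0.05)  # simulate connection setup or credential loading

    def charge(self, order_id):
        return {"order_id": order_id, "status": "charged"}


# Module scope: runs once, when the execution environment is initialized.
CLIENT = ExpensiveClient()


def handler(event, context):
    # Per-invocation work only; the client above is reused on warm invocations.
    return CLIENT.charge(event["order_id"])
```

Calling `handler` repeatedly in the same environment leaves `INIT_COUNT` at 1, which is the behavior the option relies on.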

152

MCQ · medium

A telemetry pipeline uses an Application Load Balancer in one Region. Global users need lower network latency to the application without caching dynamic responses, and the design must avoid adding custom operational scripts. Which service should be considered?

A. AWS Global Accelerator

B. S3 Cross-Region Replication

C. CloudFront only with long TTLs

D. AWS Backup cross-Region copy

Answer: A

Global Accelerator routes traffic over the AWS global network to improve performance for TCP/UDP applications without relying on caching.

Why this answer

AWS Global Accelerator uses the AWS global network to route traffic from global users to the Application Load Balancer, reducing latency and jitter by leveraging Anycast IP addresses and edge locations. It does not cache responses, making it ideal for dynamic content where low latency is required without custom operational scripts.

Exam trap

The trap here is confusing CloudFront with Global Accelerator. CloudFront can pass dynamic requests through uncached (for example, with the managed CachingDisabled cache policy), but option C as written relies on long TTLs, which means caching dynamic responses and contradicts the requirement. Global Accelerator improves the network path to the ALB without caching anything.

How to eliminate wrong answers

Option B (S3 Cross-Region Replication) is wrong because it replicates objects across S3 buckets in different Regions, which does not reduce network latency for dynamic application traffic and is unrelated to ALB routing. Option C (CloudFront only with long TTLs) is wrong because CloudFront caches responses at edge locations, and long TTLs would serve stale dynamic content, contradicting the requirement to avoid caching dynamic responses. Option D (AWS Backup cross-Region copy) is wrong because it is a backup and disaster recovery service that copies backup data across Regions, not a solution for reducing network latency to an application endpoint.

153

MCQ · hard

Based on the exhibit, an application runs on Amazon Aurora MySQL. The writer instance is frequently near 85% CPU while the reader instance is under 20% CPU. Application traces show that most of the database traffic is read-only SELECT queries, but the code currently sends all queries to the writer endpoint. What should the solutions architect recommend to improve performance with the smallest functional change?

A. Increase the writer instance size and keep all traffic on the writer endpoint.

B. Point read-only database traffic to the Aurora reader endpoint and keep writes on the writer endpoint.

C. Convert the cluster to a Multi-AZ RDS PostgreSQL deployment to get automatic failover and better read performance.

D. Enable cross-Region read replicas so SELECT queries are routed to a remote Region for improved performance.

Answer: B

This directly uses the cluster’s read scale-out capability. The reader endpoint distributes read traffic across replicas, reducing load on the writer and increasing read throughput without changing schema or database engine.

Why this answer

Option B is correct because the Aurora reader endpoint distributes read-only traffic across all available reader instances, offloading the writer instance and reducing its CPU utilization. Since the application traces show most traffic is read-only SELECT queries, this change requires only modifying the connection string for reads while keeping writes on the writer endpoint, making it the smallest functional change.

Exam trap

The trap here is that candidates may think increasing instance size (Option A) is the simplest fix, but they overlook the fact that Aurora's architecture is designed to offload reads to reader instances, which is a more cost-effective and scalable solution with minimal code change.

How to eliminate wrong answers

Option A is wrong because increasing the writer instance size does not address the root cause—the writer is overloaded with read traffic that could be handled by readers—and it incurs higher cost without leveraging Aurora's built-in read scaling. Option C is wrong because converting to RDS PostgreSQL Multi-AZ does not provide the same read scaling as Aurora readers; Multi-AZ only provides a standby for failover, not active read offloading, and it requires a full migration. Option D is wrong because cross-Region read replicas introduce significant latency for read queries and are intended for disaster recovery or global read scaling, not for reducing CPU on the local writer instance.
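
The read/write split in option B can be sketched as a small routing helper. This is an illustration only: the endpoint host names are hypothetical placeholders in the usual Aurora naming pattern, and `endpoint_for` is an assumed helper name, not part of any driver.

```python
# Hypothetical endpoint names; real values come from the cluster's configuration.
WRITER_ENDPOINT = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"


def endpoint_for(sql, in_transaction=False):
    """Return the endpoint a statement should use.

    Plain SELECTs go to the reader endpoint. Everything else, any statement
    inside a transaction, and locking reads stay on the writer endpoint.
    """
    words = sql.strip().split(None, 1)
    first_word = words[0].upper() if words else ""
    is_locking_read = "FOR UPDATE" in sql.upper()
    if first_word == "SELECT" and not in_transaction and not is_locking_read:
        return READER_ENDPOINT
    return WRITER_ENDPOINT
```

In practice the same effect is often achieved by configuring two connection strings and choosing between them in the data-access layer; the point is that only the read path changes.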

154

MCQ · medium

A DynamoDB-backed event processing system experiences throttling during a promotion. All events are written and read using the same partition key value (tenantId = "ACME"). The workload is time-ordered per tenant, and the application can tolerate slight reordering across partitions. Which design change will most directly increase throughput and reduce hot-partition throttling?

A. Increase the table's provisioned capacity (read/write units) to handle the promotion peak.

B. Change the partition key to include an additional sharding attribute derived from a hash of eventId.

C. Enable DAX caching for all reads but keep the same partition key and item layout.

D. Switch the table to eventually consistent reads for queries to lower read throttling.

Answer: B

When all traffic targets one partition key value, that partition becomes the bottleneck regardless of total table capacity. Adding a shard/salt attribute to the partition key (for example, tenantId + shardId where shardId = hash(eventId) mod N) spreads writes across multiple partition key values, increasing partition-level parallelism. Because the scenario allows slight reordering across partitions, losing strict single-partition time ordering is acceptable while improving throughput and reducing throttling.

Why this answer

Option B is correct because adding a sharding attribute derived from a hash of eventId allows writes and reads to be distributed across multiple partition keys, breaking the single hot partition caused by using tenantId='ACME' for all operations. DynamoDB's throughput is limited per partition, so distributing the load across many partitions directly reduces throttling without changing the application's tolerance for slight reordering.

Exam trap

The trap here is that candidates often assume increasing provisioned capacity (Option A) is the universal fix for throttling, but AWS specifically tests the understanding that DynamoDB's per-partition throughput limits require a sharding strategy to distribute load across partitions.

How to eliminate wrong answers

Option A is wrong because simply increasing provisioned capacity does not resolve the hot-partition issue; the single partition key (tenantId='ACME') still caps throughput at 3000 RCU/1000 WCU per partition, so throttling persists regardless of total table capacity. Option C is wrong because DAX caching only reduces read load on the table, but writes (which are the primary source of throttling during a promotion) still hit the same hot partition, and DAX does not help with write throttling. Option D is wrong because eventually consistent reads only reduce read costs and latency, but they do not address the root cause of throttling—the single partition bottleneck—and have no effect on write throttling.
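
The sharded key from option B (tenantId + shardId where shardId = hash(eventId) mod N) might look like the sketch below. The shard count of 8, the `#` separator, and the function names are assumptions chosen for illustration.

```python
import hashlib

SHARD_COUNT = 8  # assumed value; choose N from the expected peak throughput


def shard_for(event_id, shard_count=SHARD_COUNT):
    """Map an event ID to a stable shard number in [0, shard_count).

    A cryptographic digest is used instead of Python's built-in hash(),
    which is randomized per process and would not be stable across writers.
    """
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def partition_key(tenant_id, event_id, shard_count=SHARD_COUNT):
    """Build the sharded partition key value for one event."""
    return f"{tenant_id}#{shard_for(event_id, shard_count)}"


def all_partition_keys(tenant_id, shard_count=SHARD_COUNT):
    """Keys a reader must query (one Query per shard) to see every event."""
    return [f"{tenant_id}#{n}" for n in range(shard_count)]
```

The trade-off is on the read side: fetching all of a tenant's events now takes one query per shard followed by a merge on the timestamp, which is why the scenario's tolerance for slight reordering matters.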

155

MCQ · easy

A customer-facing application has a relational data model and needs frequent complex queries (joins and aggregations), but it also experiences a significant read-heavy workload. Which design choice best improves read performance while keeping relational features?

A. Use DynamoDB with a single partition key and avoid indexes to keep writes simple.

B. Add read replicas to an RDS or Aurora cluster and keep the primary for writes.

C. Store the data in S3 and query it directly from the application without a database.

D. Switch the database to DynamoDB but keep using the same relational SQL queries and joins.

Answer: B

Read replicas offload read operations from the primary database instance, improving read throughput and reducing contention with writes. RDS/Aurora preserve relational capabilities like joins and SQL queries. This is a common and practical way to scale performance for read-heavy workloads without completely changing the data model.

Why this answer

Adding read replicas to an RDS or Aurora cluster offloads read traffic from the primary instance, improving read performance for complex queries (joins and aggregations) while preserving the relational data model. Aurora Replicas read from the same shared cluster storage volume as the writer, a cluster supports up to 15 of them, and Aurora Auto Scaling can add or remove replicas as load changes, which makes this approach efficient for read-heavy workloads.

Exam trap

The trap here is that candidates often assume NoSQL databases like DynamoDB can handle relational queries if they simply 'switch' the database, ignoring that DynamoDB lacks native support for joins and complex aggregations, which are core to the relational data model described in the question.

How to eliminate wrong answers

Option A is wrong because DynamoDB with a single partition key and no indexes cannot efficiently support complex relational queries (joins and aggregations), and it sacrifices relational features. Option C is wrong because S3 is an object store, not a relational database; querying it directly for complex joins and aggregations is extremely slow and lacks transactional consistency. Option D is wrong because DynamoDB does not support SQL joins or complex relational queries; attempting to use the same relational SQL queries would fail or require significant application-level workarounds.

156

Multi-Select · hard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance? The architecture review board prefers a managed AWS-native control.

Select 2 answers

A. S3 Object Lock

B. S3 multipart upload

C. S3 Inventory

D. S3 Transfer Acceleration

Answers: B, D

Multipart upload parallelizes large object upload parts and improves reliability.

Why this answer

B is correct because S3 multipart upload allows a large file to be broken into smaller parts and uploaded in parallel, which significantly reduces the impact of network latency and improves throughput. This is especially beneficial for latency-sensitive applications where upload speed is critical.

Exam trap

Both B and D are correct, and they address different bottlenecks: multipart upload parallelizes the transfer of a large object, while S3 Transfer Acceleration routes the upload through a nearby AWS edge location and over the AWS network to the bucket. The trap is treating them as alternatives and selecting only one. Options A (Object Lock, a retention control) and C (Inventory, a reporting feature) have no effect on upload performance.
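
To make the multipart idea concrete, here is a rough sketch of how a client might split an object into parts that can be uploaded in parallel. It only plans byte ranges and does not call S3. The 8 MiB default part size and the `plan_parts` name are assumptions; the 5 MiB minimum part size (for every part except the last) and the 10,000-part maximum are S3's documented multipart limits.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_parts(object_size, part_size=8 * MIB):
    """Return (part_number, start_byte, length) tuples covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    part_size = max(part_size, MIN_PART_SIZE)
    # Grow the part size if the object would otherwise need too many parts.
    part_size = max(part_size, math.ceil(object_size / MAX_PARTS))
    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts
```

Each tuple can be handed to a separate worker, and a failed part can be retried without resending the rest of the object.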

157

MCQ · medium

A mobile game backend uses Amazon Aurora. The workload has many short-lived database connections from Lambda functions, causing connection storms. What should be added?

A. An internet gateway

B. S3 Select

C. RDS Proxy

D. A larger Route 53 hosted zone

Answer: C

RDS Proxy pools and manages database connections, improving scalability for serverless and bursty workloads.

Why this answer

RDS Proxy is the correct solution because it sits between Lambda functions and the Aurora database, pooling and reusing database connections. This prevents connection storms by reducing the overhead of establishing new connections for each short-lived Lambda invocation. RDS Proxy can also enforce IAM authentication and retrieve database credentials from AWS Secrets Manager, so the function code does not need to embed them.

Exam trap

The trap here is that candidates may think scaling the database (e.g., increasing instance size) is the answer, but the question specifically targets connection management, not compute or storage capacity, and RDS Proxy is the AWS-managed service designed exactly for this use case.

How to eliminate wrong answers

Option A is wrong because an internet gateway is used to enable VPC-to-internet communication, not to manage database connection pooling or reduce connection storms. Option B is wrong because S3 Select is a service for retrieving subsets of data from objects in S3 using SQL-like expressions, and it has no role in database connection management. Option D is wrong because a larger Route 53 hosted zone increases the number of DNS records you can host but does not affect database connection handling or reduce connection storms.
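
Connection pooling itself can be illustrated with a toy pool. This is a conceptual sketch of the technique, not how RDS Proxy is implemented or configured; `ConnectionPool` and its `connect` callback are hypothetical names.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Toy pool: a fixed set of connections shared by many callers."""

    def __init__(self, connect, size):
        self.opened = 0
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # open each connection once, up front
            self.opened += 1

    @contextmanager
    def connection(self, timeout=5.0):
        # Callers wait for an idle connection instead of opening a new one.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand the connection back for reuse
```

With a pool of size 2, a hundred short-lived callers still result in only two connections to the database, which is the effect the proxy provides for bursty Lambda traffic.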

158

Multi-Select · hard

A media company serves versioned JavaScript and CSS from an S3 origin through CloudFront. After a release, the cache hit ratio drops because the SPA sends an Authorization header and several tracking query strings on every request, even though the assets are public and identical for all users. Which changes would most improve cache efficiency without changing the content returned? Select three.

Select 3 answers

A. Create a CloudFront cache policy that excludes the Authorization header from the cache key when the assets do not require per-user authorization.

B. Use versioned object names for each release and apply long cache TTLs so viewers reuse the same objects until the content changes.

C. Use a cache policy that forwards only required query strings and ignores the tracking parameters that do not affect object content.

D. Place the S3 origin behind an Application Load Balancer so CloudFront can reuse more cached responses.

E. Enable S3 Transfer Acceleration to increase the cache hit ratio for repeated browser requests.

Answers: A, B, C

Correct because an unnecessary Authorization header fragments the cache into many unique variants. If the files are truly public and identical, CloudFront should not vary the cache key on that header.

Why this answer

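Option A is correct because any header included in the cache key produces a separate cache entry for each distinct value. With the Authorization header in the cache key, every user's token creates its own copy of the same public object. A cache policy that leaves the Authorization header out of the cache key lets CloudFront treat all requests for the same object as identical, which dramatically improves the cache hit ratio without affecting the content served. Options B and C follow the same logic: versioned object names make long TTLs safe, and ignoring tracking query strings removes another source of needless cache variants.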

Exam trap

The trap here is that candidates may think adding an ALB or enabling Transfer Acceleration improves caching performance, but these services address availability and speed, not cache key efficiency, which is the root cause of low hit ratios.
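
The effect of options A and C on the cache key can be modeled in a few lines. This is a simplified model of the idea, not CloudFront's implementation; the allowlists, the `v` parameter, and the example URLs are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumed allowlist: only parameters that change the returned object.
CACHE_QUERY_ALLOWLIST = {"v"}
# Assumed header allowlist: empty, so Authorization never splits the cache.
CACHE_HEADER_ALLOWLIST = set()


def cache_key(url, headers):
    """Build a normalized cache key from the path plus allowlisted inputs."""
    parts = urlsplit(url)
    kept_query = sorted(
        (k, v) for k, v in parse_qsl(parts.query) if k in CACHE_QUERY_ALLOWLIST
    )
    kept_headers = sorted(
        (k.lower(), v)
        for k, v in headers.items()
        if k.lower() in CACHE_HEADER_ALLOWLIST
    )
    return (parts.path, urlencode(kept_query), tuple(kept_headers))
```

Two requests for the same asset with different tokens and tracking parameters now map to one key, so the second request is a cache hit.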

159

MCQ · medium

A company is deploying a high-performance computing (HPC) cluster with 16 EC2 instances. The workload requires the lowest possible network latency and highest throughput between all nodes for tightly coupled parallel MPI computations. Which EC2 placement group type should a solutions architect recommend?

A. Cluster placement group

B. Partition placement group

C. Spread placement group

D. No placement group: use Auto Scaling across multiple AZs

Answer: A

Cluster PGs place instances physically close together in a single AZ for lowest latency and highest throughput. They support EFA for MPI-level performance — the standard HPC choice.

Why this answer

Cluster placement groups pack instances physically close together within a single Availability Zone, providing the lowest possible network latency and highest network throughput between instances. They support enhanced networking (SR-IOV) and Elastic Fabric Adapter (EFA) for inter-node MPI communication.

Tightly coupled parallel HPC workloads require all nodes to communicate frequently with minimal latency. Cluster placement groups are specifically designed for this use case. The trade-off is all instances are in one AZ — if the AZ fails, the entire cluster is affected.

Exam trap

Spread and Partition placement groups improve availability by distributing instances across racks or partitions, which places nodes farther apart on the network and tends to increase latency. For tightly coupled HPC, low inter-node latency trumps availability. Cluster PG = maximum performance in one AZ. Spread PG = maximum isolation across racks.

Why the other options are wrong

Partition PGs distribute instances across separate hardware racks to reduce rack-failure impact. Instances in different partitions have higher inter-node latency. Designed for large distributed and replicated workloads (Hadoop, Cassandra, Kafka), not tightly coupled HPC.

Spread PGs place each instance on a distinct hardware rack for maximum isolation. Instances are intentionally spread further apart, increasing latency — the opposite of what HPC requires.

Multi-AZ Auto Scaling distributes instances across AZs for availability. Cross-AZ networking has higher latency than within-AZ. For tightly coupled HPC, all instances should be in the same AZ within a Cluster PG.

160

Multi-Select · hard

A customer portal uses Amazon Aurora MySQL. The application currently sends all SELECT queries to the writer instance endpoint. During traffic spikes, read latency increases, and the team wants the cluster to survive a writer failover without manual endpoint changes for the application. Which changes should the team make? Select three.

Select 3 answers

A. Replace the hard-coded DB instance endpoint with the Aurora reader endpoint for read traffic.

B. Add additional Aurora Replicas and spread read queries across them.

C. Enable Aurora Auto Scaling for the replicas so the cluster can add readers during demand spikes.

D. Keep sending all traffic to the writer endpoint because it always has the freshest data.

E. Replace Aurora with a single larger RDS instance and continue using the same read pattern.

Answers: A, B, C

Correct. The reader endpoint is a cluster-level endpoint that automatically routes read connections to available Aurora Replicas. That removes the dependency on a specific DB instance endpoint and avoids application changes when the writer fails over or the writer instance identifier changes.

Why this answer

Option A is correct because the Aurora reader endpoint load-balances read-only connections across the available Aurora Replicas; it falls back to the writer instance only when the cluster has no replicas. Because the reader endpoint is a cluster-level name, it stays constant while the underlying instances change during a failover, so no manual endpoint changes are needed. Options B and C add the read capacity behind that endpoint: more replicas spread the SELECT load, and Aurora Auto Scaling adds readers during demand spikes.

Exam trap

The trap here is that candidates may think the writer endpoint is the only way to get fresh data, but Aurora Replicas read from the shared cluster storage volume and typically lag the writer by well under 100 ms, and the cluster-level endpoints keep working across a failover without application changes, making replicas the correct choice for read scaling and high availability.

161

MCQ · medium

A research team runs a latency-sensitive distributed training job on Amazon EC2. They deploy 80 identical nodes that exchange small messages frequently and need low network jitter. The job must run entirely within one Availability Zone. Which placement group strategy should a solutions architect use to maximize intra-cluster network performance?

A. Use a cluster placement group to keep all instances in close proximity within the same Availability Zone.

B. Use a spread placement group to distribute instances across distinct hardware to reduce jitter.

C. Use a partition placement group and place each node into its own partition for uniform latency.

D. Do not use a placement group; rely on the default EC2 scheduling to balance latency and availability.

Answer: A

A cluster placement group is optimized to place instances close together (for example, within the same rack/cluster) to reduce latency and jitter for traffic between the instances. Because the workload runs in a single Availability Zone, the cluster placement group aligns with the requirement for strong locality and low-jitter communication.

Why this answer

A cluster placement group is the correct choice because it places all 80 EC2 instances in close physical proximity within a single Availability Zone, ensuring low-latency, high-bandwidth network connections with minimal jitter. This placement group type is specifically designed for tightly coupled, latency-sensitive workloads like distributed training that require frequent, small message exchanges, as it leverages non-blocking, high-throughput networking between instances.

Exam trap

The trap here is that candidates often confuse spread placement groups (which limit correlated hardware failures by placing instances on distinct hardware) with cluster placement groups (which reduce latency and jitter by minimizing network distance), not realizing that jitter in this context comes from the network path between nodes, not from hardware faults.

How to eliminate wrong answers

Option B is wrong because a spread placement group distributes instances across distinct hardware to maximize fault tolerance, which increases network distance and jitter, making it unsuitable for latency-sensitive workloads. Option C is wrong because a partition placement group divides instances into logical partitions for fault isolation, but it does not guarantee the uniform low latency and close proximity needed for frequent small message exchanges. Option D is wrong because relying on default EC2 scheduling does not ensure instances are placed in close physical proximity, leading to higher network latency and jitter compared to a cluster placement group.

162

MCQ · medium

A game streaming service must use UDP for real-time gameplay traffic. For external firewall allowlisting, the service requires stable, static IP addresses. The TLS handshake must be handled end-to-end by the application servers (the load balancer must not terminate TLS). Which AWS load balancing option best fits these requirements?

A. Use a Network Load Balancer (NLB) with a UDP listener, configure the NLB to use Elastic IP addresses for static IPs, and use TCP listeners for TLS passthrough to the application servers.

B. Use an Application Load Balancer (ALB) with UDP listeners and configure TLS passthrough.

C. Use Amazon API Gateway with a WebSocket API and keepalive pings to provide UDP-like low-latency delivery.

D. Use a Classic Load Balancer and multiplex UDP over TCP to meet the UDP and low-latency requirements.

Answer: A

NLB supports UDP listeners and is designed for low-latency, high-performance networking. Associating Elastic IP addresses with the NLB provides stable public IP addresses for firewall allowlisting. For TLS passthrough, using a TCP listener keeps the TLS handshake and encryption between the client and the targets (no load balancer TLS termination).

Why this answer

A Network Load Balancer (NLB) supports UDP listeners, which are required for real-time gameplay traffic, and can be assigned Elastic IP addresses to provide stable, static IPs for firewall allowlisting. Additionally, NLB supports TCP listeners with TLS passthrough, meaning it forwards the encrypted traffic without terminating the TLS handshake, allowing the application servers to handle end-to-end encryption as required.

Exam trap

The trap here is that candidates may assume an ALB can handle UDP traffic because it supports WebSocket or HTTP/2, but ALB is strictly Layer 7 and only supports TCP-based protocols, while NLB is the correct choice for UDP and TLS passthrough with static IPs.

How to eliminate wrong answers

Option B is wrong because an Application Load Balancer (ALB) does not support UDP listeners; it only handles HTTP/HTTPS and WebSocket traffic over TCP. Option C is wrong because Amazon API Gateway with WebSocket API operates over TCP (not UDP) and does not provide static IP addresses for allowlisting; it also terminates TLS at the API Gateway endpoint, not at the application servers. Option D is wrong because a Classic Load Balancer does not support UDP listeners and cannot multiplex UDP over TCP in a way that meets low-latency requirements; it is a legacy option that lacks the necessary protocol support and static IP capabilities.

163

MCQ · easy

A new feature stores user events in DynamoDB. Each event must be fetched by user_id and sorted by event_time. The team expects many different users and wants to avoid a single hot partition. Which partition key design is best?

A. Use a constant partition key value (for example, partition_key='events') and store user_id as an attribute.

B. Use user_id as the partition key and event_time as the sort key.

C. Use event_time as the partition key and user_id as an attribute to query later.

D. Use a randomly generated UUID as the partition key and query by user_id using a full table scan.

Answer: B

Using user_id as the partition key spreads data across many partitions based on user distribution. event_time as the sort key supports efficient range queries and retrieving events in time order per user. This design matches the stated access pattern and reduces hot partition likelihood.

Why this answer

Option B is correct because using `user_id` as the partition key groups each user's events under one partition key value, and DynamoDB hashes those values to spread many users across partitions, which avoids a single hot partition as long as no one user dominates the traffic. Adding `event_time` as the sort key allows DynamoDB to efficiently retrieve events for a given user in sorted order using a Query operation, which is both fast and cost-effective.

Exam trap

The trap here is that candidates may choose a constant partition key (Option A) thinking it simplifies queries, but they overlook that DynamoDB's scalability depends on partition key cardinality, and a single partition key creates a bottleneck that defeats the purpose of a NoSQL database.

How to eliminate wrong answers

Option A is wrong because using a constant partition key value (e.g., `'events'`) forces all data into a single partition, creating a hot partition that throttles performance and defeats the purpose of DynamoDB's distributed architecture. Option C is wrong because using `event_time` as the partition key scatters events across partitions without grouping by user, so fetching all events for a specific user would require a costly full table scan or a Scan with a filter, which is inefficient and not sorted. Option D is wrong because a randomly generated UUID partition key distributes writes well but makes it impossible to query by `user_id` without a full table scan, as DynamoDB cannot query across partitions without knowing the exact partition key values.
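
The access pattern behind option B can be modeled with a toy in-memory table: items are grouped by partition key and kept in sort-key order, so a per-user query returns events already sorted by time. `EventTable` is a hypothetical illustration, not the DynamoDB API, and it assumes ISO 8601 timestamp strings, which sort chronologically as text.

```python
from collections import defaultdict


class EventTable:
    """Toy model: partition key -> items kept ordered by sort key."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def put(self, user_id, event_time, payload):
        items = self._partitions[user_id]
        items.append((event_time, payload))
        # Items that share a partition key are kept in sort-key order.
        items.sort(key=lambda item: item[0])

    def query(self, user_id, start=None, end=None):
        """Return one user's events in time order, optionally range-limited."""
        items = self._partitions.get(user_id, [])
        return [
            (t, p)
            for t, p in items
            if (start is None or t >= start) and (end is None or t <= end)
        ]
```

A query touches only the one user's items, which is what makes this design cheaper than the scans required by options C and D.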

164

MCQ · hard

Based on the exhibit, which design change is the best way to reduce the observed read latency for this DynamoDB-backed service?

A. Add a DynamoDB Accelerator (DAX) cluster in front of the table and send repeated read traffic through it.

B. Increase the on-demand table limits so DynamoDB can automatically absorb more traffic.

C. Create a global secondary index on tenantId to distribute the load across more partitions.

D. Move the dashboard data into S3 and use Lambda functions to read it on demand.

Answer: A

DAX is designed to accelerate repeated eventually consistent reads from DynamoDB by caching hot items in memory. The exhibit shows one tenant driving most of the reads and the same dashboard items being requested repeatedly within a short window, which is an excellent fit for DAX. It reduces latency and offloads the hot key without requiring a schema redesign.

Why this answer

Adding a DynamoDB Accelerator (DAX) cluster in front of the table reduces read latency by providing an in-memory cache for repeated read traffic. DAX delivers microsecond response times for eventually consistent reads, which directly addresses the observed latency issue without requiring application-level caching or table redesign.

Exam trap

The trap here is that candidates often assume increasing capacity limits (Option B) or adding indexes (Option C) will solve latency issues, but they fail to recognize that latency is a caching problem, not a throughput or partitioning problem, and that DAX is the AWS-native solution for DynamoDB read-heavy workloads with repeated access patterns.

How to eliminate wrong answers

Option B is wrong because increasing on-demand table limits does not reduce read latency; it only prevents throttling by allowing DynamoDB to absorb more traffic, but the underlying read latency from the storage layer remains unchanged. Option C is wrong because a global secondary index on tenantId does not reduce read latency: the dominant tenant would still map to a single index partition key value, reads would still go to the storage layer, and the index provides no caching. Option D is wrong because moving dashboard data to S3 and reading it through Lambda on demand adds Lambda cold starts and S3 request latency, which is typically higher than DynamoDB's, so read latency can increase rather than decrease, and it adds unnecessary complexity for a use case that benefits from DynamoDB's low-latency access.
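
The read-through caching pattern that DAX provides can be sketched with a small in-memory cache. This is a conceptual model only, not the DAX client; `ReadThroughCache`, the `load` callback, and the 300-second TTL are assumptions.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a per-item time to live (TTL)."""

    def __init__(self, load, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load          # function that reads from the backing table
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}           # key -> (expires_at, value)
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # hit: served from memory
        self.misses += 1           # miss or expired: go to the table
        value = self._load(key)
        self._items[key] = (now + self._ttl, value)
        return value
```

Repeated requests for the same dashboard item within the TTL reach the table only once, which is why a hot, frequently re-read key is a good fit for this approach. Cached values can be stale until they expire, so the pattern suits eventually consistent reads.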

165

MCQ · medium

A video platform uses Amazon Aurora. The workload has many short-lived database connections from Lambda functions, causing connection storms. What should be added?

A. S3 Select

B. An internet gateway

C. A larger Route 53 hosted zone

D. RDS Proxy

Answer: D

RDS Proxy pools and manages database connections, improving scalability for serverless and bursty workloads.

Why this answer

RDS Proxy sits between Lambda functions and the Aurora database, pooling and reusing database connections. This prevents the Lambda functions from overwhelming the database with many short-lived connections, which can cause connection storms and degrade performance. RDS Proxy also reduces the overhead of establishing new connections and improves scalability.

Exam trap

The trap here is that candidates may confuse connection pooling with network-level components (like internet gateways) or data retrieval services (like S3 Select), overlooking that RDS Proxy is the AWS-native solution for managing short-lived database connections from serverless or highly concurrent workloads.

How to eliminate wrong answers

Option A is wrong because S3 Select is used to retrieve subsets of data from objects in Amazon S3 using SQL expressions, not for managing database connections. Option B is wrong because an internet gateway enables VPC resources to communicate with the internet, not to manage or pool database connections. Option C is wrong because a larger Route 53 hosted zone increases the number of DNS records you can create, but does not address connection pooling or database connection storms.

166

MCQhard

Based on the exhibit, a low-latency analytics platform runs 10 EC2 instances in the same Availability Zone. The nodes exchange a very high volume of east-west messages and must experience the lowest possible network latency and jitter. A separate operations team also wants to reduce the risk that all nodes land on the same physical hardware rack. Which placement strategy should the solutions architect use?

A.Cluster placement group

B.Spread placement group

C.Partition placement group

D.Auto Scaling group with a mixed instances policy

AnswerA

Cluster placement groups place instances physically close together to maximize bandwidth and minimize latency. That is the best fit for a chatty, east-west workload where network performance matters more than fault isolation at the rack level.

Why this answer

A cluster placement group is the correct choice because it provides the lowest possible network latency and jitter by ensuring all 10 EC2 instances are placed in close proximity within a single Availability Zone, enabling non-blocking, high-bandwidth communication. This is ideal for high-volume east-west traffic, as it maximizes network performance for tightly coupled workloads like analytics platforms.

Exam trap

The trap here is that candidates may choose a Spread placement group to reduce hardware rack risk, overlooking that the primary requirement is lowest latency and jitter, which cluster groups provide, while Spread groups sacrifice performance for fault isolation.

How to eliminate wrong answers

Option B (Spread placement group) is wrong because it places each instance on distinct hardware to reduce the risk of simultaneous failure, which gives up the physical proximity the workload needs for the lowest latency and jitter; a spread placement group also supports at most seven running instances per Availability Zone, so it could not hold all 10 nodes in one AZ. Option C (Partition placement group) is wrong because it divides instances into logical partitions across racks to isolate failures, but it does not guarantee the lowest latency or jitter for east-west traffic, as instances in different partitions sit on separate racks. Option D (Auto Scaling group with a mixed instances policy) is wrong because it focuses on instance diversity and scaling, not on network placement; it does not control physical proximity or reduce latency and jitter, and mixing instance types may even increase variability in network performance.

167

MCQeasy

A team runs a stateless web app on Amazon EC2 behind an Application Load Balancer. During traffic spikes, new EC2 instances take several minutes to finish bootstrapping before they can receive traffic. Which Auto Scaling configuration most directly reduces the time until additional capacity is available?

A.Increase the ALB target group deregistration delay.

B.Use an Auto Scaling warm pool so pre-initialized instances are ready to enter service.

C.Reduce the Auto Scaling group minimum size to one instance.

D.Replace the Application Load Balancer with a Network Load Balancer.

AnswerB

Warm pools keep instances pre-launched and initialized, which reduces the time needed to add capacity during spikes.

Why this answer

Option B is correct because an Auto Scaling warm pool allows you to maintain a pool of pre-initialized instances that are fully bootstrapped and ready to enter service. When a scale-out event occurs, instances from the warm pool can be moved into the Auto Scaling group and start receiving traffic almost immediately, bypassing the several-minute bootstrapping delay.

Exam trap

The trap here is that candidates may confuse the deregistration delay (which controls how gracefully existing connections are drained) with a mechanism that speeds up new instance readiness, or they may think that changing the load balancer type or reducing the minimum size will somehow accelerate instance bootstrapping.

How to eliminate wrong answers

Option A is wrong because increasing the ALB target group deregistration delay only affects how long the ALB waits for in-flight requests to complete before it finishes deregistering a target; it does not reduce the time for new instances to become ready. Option C is wrong because reducing the Auto Scaling group minimum size to one instance lowers the baseline capacity, which makes the application more vulnerable to traffic spikes and does nothing about the bootstrapping delay. Option D is wrong because replacing the Application Load Balancer with a Network Load Balancer does not affect instance bootstrapping time; the NLB operates at Layer 4 and would drop ALB features such as path-based routing without making instances initialize any faster.

168

MCQeasy

A company runs an Amazon RDS for PostgreSQL database. The application performs frequent OLTP writes, but it also has a separate dashboard that runs heavy SELECT queries and is slowing down overall database performance. The writes must remain on the primary. What is the best approach to improve performance for the dashboard?

A.Create an RDS read replica and route the dashboard’s read-only queries to the replica endpoint

B.Increase instance storage throughput limits and disable synchronous replication to speed up all queries

C.Replace RDS with Amazon S3 because dashboards require SQL result caching

D.Move the primary database to a different AWS Region to reduce network latency

AnswerA

Read replicas offload read workloads from the primary. Since the dashboard performs read-only SELECTs, routing those queries to a replica reduces contention on the primary, allowing OLTP writes to continue with less interference.

Why this answer

Creating an RDS read replica allows you to offload the heavy SELECT queries from the primary database instance. The replica asynchronously replicates data from the primary using PostgreSQL's streaming replication, so the dashboard can query the replica without impacting the OLTP write performance on the primary. This directly addresses the requirement that writes remain on the primary while improving dashboard query performance.
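
A minimal sketch of the routing decision, assuming the application can tell dashboard traffic apart from OLTP traffic. The endpoint hostnames and the `choose_endpoint` helper are hypothetical; a real application would usually configure two connection strings rather than inspect SQL text.

```python
PRIMARY_ENDPOINT = "mydb.example.internal"          # hypothetical writer endpoint
REPLICA_ENDPOINT = "mydb-replica.example.internal"  # hypothetical read replica endpoint


def choose_endpoint(sql: str, read_only_client: bool = False) -> str:
    """Send the dashboard's SELECTs to the replica; everything else to the primary."""
    is_select = sql.lstrip().upper().startswith("SELECT")
    if read_only_client and is_select:
        return REPLICA_ENDPOINT
    return PRIMARY_ENDPOINT


# Dashboard query: goes to the replica.
print(choose_endpoint("SELECT region, SUM(total) FROM orders GROUP BY region", read_only_client=True))
# OLTP write: stays on the primary.
print(choose_endpoint("INSERT INTO orders (id, total) VALUES (1, 9.99)"))
```

Because replication to the replica is asynchronous, the dashboard may see data that is slightly behind the primary, which is usually acceptable for reporting.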

Exam trap

The trap here is that candidates might think increasing instance size or storage throughput is sufficient, but the core issue is workload isolation—offloading read-heavy queries to a read replica is the only scalable solution that preserves write performance on the primary.

How to eliminate wrong answers

Option B is wrong because increasing storage throughput limits does not reduce the impact of heavy SELECT queries on write performance, and disabling synchronous replication (which is not applicable to RDS for PostgreSQL in this context) would not isolate the dashboard workload. Option C is wrong because Amazon S3 is an object storage service, not a relational database; it cannot replace PostgreSQL for OLTP writes or support SQL queries without additional services like Athena or Redshift Spectrum, and it does not provide the transactional consistency required for the application. Option D is wrong because moving the primary database to a different AWS Region would increase network latency for the application's writes, not improve dashboard performance, and it does not separate the read workload from the write workload.

169

MCQhard

Based on the exhibit, a media rendering job runs on a single EC2 instance and writes a large working set of metadata to block storage. The workload performs sustained random reads and writes and must keep latency consistently low for the entire run. The instance may be stopped and started between jobs, and the data must persist. Which storage choice best meets the requirements?

A.Amazon S3 with multipart uploads because it provides durable object storage and high throughput.

B.Amazon EFS because it can be mounted by EC2 and supports persistent file access.

C.Provisioned IOPS SSD EBS volume (io2).

D.Amazon FSx for Windows File Server because it offers durable storage and low latency.

AnswerC

io2 is designed for sustained high IOPS with low and consistent latency on EC2 block storage. The workload is single-instance, random I/O intensive, and needs persistence across stop/start, which matches EBS block storage behavior well.

Why this answer

The workload requires sustained low-latency random reads and writes to block storage, and the data must persist across instance stop/start cycles. Provisioned IOPS SSD EBS volumes (io2) are block-level storage designed for high-performance, low-latency workloads with consistent IOPS, and they persist independently of the EC2 instance lifecycle.

Exam trap

The trap here is that candidates confuse file storage (EFS, FSx) or object storage (S3) with block storage, failing to recognize that sustained low-latency random reads and writes require a block-level device like EBS, not a network-mounted file system.

How to eliminate wrong answers

Option A is wrong because Amazon S3 is object storage, not block storage, and does not support the low-latency random read/write access required for a working set of metadata; multipart uploads improve throughput, not latency-sensitive random I/O. Option B is wrong because Amazon EFS is a file-level NFS service whose per-operation latency is higher and less consistent than a Provisioned IOPS EBS volume under sustained random I/O. Option D is wrong because Amazon FSx for Windows File Server is file-level storage accessed over SMB, optimized for Windows file shares rather than the low-latency random block I/O pattern described.

170

MCQmedium

A media platform runs a CPU-heavy thumbnail generation workload on an EC2 Auto Scaling group using t3.large instances. During peak traffic, p95 processing time increases significantly even though average CPU remains around 40–50%. CloudWatch also shows CPU credit depletion behavior. Which change will most directly improve performance predictability for this workload?

A.Increase the t3.large maximum CPU credits and keep the Auto Scaling group using the same burstable instance type.

B.Change the Auto Scaling group instance type to a compute-optimized family (for example, c7i) to provide steady CPU performance.

C.Add a placement group to the existing t3.large instances so they are packed close together for lower latency between nodes.

D.Switch the workload to run on Lambda with the same logic so invocations automatically scale without instance selection changes.

AnswerB

Compute-optimized instances are designed for consistently high CPU performance and do not rely on a burst-credit model. Switching to a steady-performance family removes the credit-depletion/throttling pattern that is driving the p95 latency spikes under sustained load.

Why this answer

The t3.large instances rely on CPU credits for burst performance, and when credits are exhausted, CPU performance is throttled to the baseline (e.g., 30% for t3.large). This causes unpredictable processing times during peak traffic, even if average CPU is moderate. Switching to a compute-optimized family like c7i provides dedicated, consistent CPU performance without credit-based throttling, directly improving predictability for CPU-heavy thumbnail generation.
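
The credit arithmetic behind this can be worked through as a rough sketch. It assumes standard (not unlimited) credit mode and uses the published t3.large figures of 2 vCPUs, 36 credits earned per hour, and an 864-credit maximum balance; the function names are made up for the example.

```python
# Published t3.large figures: 2 vCPUs, 30% baseline per vCPU,
# 36 credits earned per hour, 864 maximum accrued credits.
VCPUS = 2
CREDITS_EARNED_PER_HOUR = 36
MAX_CREDIT_BALANCE = 864


def credits_spent_per_hour(avg_cpu_percent: float) -> float:
    # One CPU credit = one vCPU running at 100% for one minute.
    return avg_cpu_percent * VCPUS * 60 / 100


def hours_until_depleted(avg_cpu_percent: float, balance: float = MAX_CREDIT_BALANCE):
    net_drain = credits_spent_per_hour(avg_cpu_percent) - CREDITS_EARNED_PER_HOUR
    if net_drain <= 0:
        return None  # at or below baseline: the balance never runs out
    return balance / net_drain


print(hours_until_depleted(45))  # 48.0
print(hours_until_depleted(30))  # None
```

At a steady 45% average the instance spends 54 credits per hour while earning 36, so even a full balance drains in about two days. That is why a 40–50% average can still end in throttling.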

Exam trap

The trap here is that candidates assume 'CPU credit depletion' can be fixed by increasing credits or scaling out, but the real issue is that burstable instances are fundamentally unsuitable for sustained CPU-heavy workloads, and only switching to a non-burstable instance type (e.g., compute-optimized) guarantees predictable performance.

How to eliminate wrong answers

Option A is wrong because the maximum CPU credit balance is not a configurable parameter (t3 instances have a fixed earning rate and balance limit), and even a larger balance would only delay throttling: the workload would still face unpredictable performance once credits are depleted. Option C is wrong because placement groups optimize network latency between instances (for example, for tightly coupled HPC workloads), but the issue here is CPU credit exhaustion, not network latency. Option D is wrong because moving to Lambda is a re-platforming effort rather than a direct fix: Lambda allocates CPU in proportion to configured memory, caps each invocation at 15 minutes, and adds cold start latency, so it does not by itself make CPU-heavy processing times predictable.

171

Multi-Selectmedium

A DevOps team is designing a high-performance CI/CD pipeline to build and test code changes. The pipeline needs to scale to handle hundreds of concurrent builds, with fast build times and minimal idle compute cost. The builds are containerized and require consistent, reproducible environments. Which three options should be used to meet these requirements? (Choose three.)

Select 3 answers

A.Use AWS CodeBuild with a large number of concurrent build projects.

B.Use self-managed Jenkins on EC2 Spot Instances to reduce costs.

C.Use AWS CodePipeline to orchestrate the build, test, and deploy stages.

D.Use AWS CodeBuild with pre-built Docker images cached in Amazon ECR.

E.Use Amazon EC2 Auto Scaling with a custom AMI for build agents.

F.Use Amazon S3 as a cache store for CodeBuild to speed up dependency download.

AnswersC, D, F

Why this answer

AWS CodePipeline is the correct orchestration service to define and manage the CI/CD pipeline stages (build, test, deploy) in a serverless, highly available manner. Pre-built Docker images cached in Amazon ECR ensure consistent, reproducible environments and drastically reduce build times by avoiding image rebuilds. Using Amazon S3 as a cache store for CodeBuild allows storing and retrieving dependency caches (e.g., Maven .m2, npm node_modules) across builds, minimizing download times and speeding up the pipeline.

Exam trap

The trap here is that candidates often overcomplicate the solution by choosing self-managed or auto-scaling options (like Jenkins or EC2 Auto Scaling) instead of recognizing that AWS managed services (CodePipeline, CodeBuild, ECR, S3) provide the required scalability, speed, and cost efficiency with far less operational overhead.

172

MCQeasy

Based on the exhibit, which EBS volume type should the team use to meet the performance need at lower cost than overprovisioning capacity?

A.Use gp3 and provision the needed IOPS independently of volume size.

B.Use sc1 because it is optimized for infrequent access and large objects.

C.Use st1 because it provides high throughput for streaming data.

D.Use standard magnetic storage because it is compatible with all EC2 instances.

AnswerA

gp3 is the best fit because it lets you provision IOPS and throughput separately from volume size. The exhibit shows the workload needs around 10,000 IOPS and experiences queue buildup on gp2. With gp3, the team can raise performance without unnecessarily increasing storage capacity, which is usually more cost-effective for this kind of database workload.

Why this answer

The gp3 volume type allows you to provision baseline performance of 3,000 IOPS and 125 MiB/s regardless of volume size, and you can independently increase IOPS up to 16,000 and throughput up to 1,000 MiB/s without needing to add more storage capacity. This decoupling of performance from size means you can meet the required IOPS at a lower cost compared to gp2, where performance scales with volume size and often forces overprovisioning of capacity to achieve the needed IOPS.
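
The sizing difference can be made concrete with a small calculation. This sketch assumes the gp2 baseline of 3 IOPS per GiB (minimum 100, maximum 16,000 per volume) and the gp3 baseline of 3,000 IOPS stated above; the helper names are hypothetical, and pricing is deliberately left out because it varies by Region.

```python
import math

GP2_IOPS_PER_GIB = 3       # gp2 baseline performance scales with volume size
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000
GP3_BASELINE_IOPS = 3_000  # included with every gp3 volume


def gp2_baseline_iops(size_gib: int) -> int:
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GIB * size_gib))


def gp2_size_needed(target_iops: int) -> int:
    """Smallest gp2 volume size (GiB) whose baseline reaches target_iops."""
    if target_iops > GP2_MAX_IOPS:
        raise ValueError("target exceeds the gp2 per-volume maximum")
    if target_iops <= GP2_MIN_IOPS:
        return 1
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


def gp3_extra_iops(target_iops: int) -> int:
    """IOPS to provision on top of the gp3 baseline, independent of size."""
    return max(0, target_iops - GP3_BASELINE_IOPS)


print(gp2_size_needed(10_000))  # 3334
print(gp3_extra_iops(10_000))   # 7000
```

For the 10,000 IOPS in the exhibit, gp2 would need a volume of about 3.3 TiB regardless of how much data is stored, while gp3 needs only 7,000 additional provisioned IOPS on a volume sized for the data.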

Exam trap

The trap here is that candidates assume all EBS volume types require overprovisioning capacity to achieve higher IOPS, overlooking gp3's ability to independently scale performance from storage size, which is a key differentiator tested on the SAA-C03 exam.

How to eliminate wrong answers

Option B is wrong because sc1 (Cold HDD) is designed for infrequently accessed, large sequential workloads, with a maximum throughput of 250 MiB/s and at most a few hundred IOPS per volume, making it unsuitable for a workload that needs around 10,000 IOPS. Option C is wrong because st1 (Throughput Optimized HDD) is optimized for high-throughput, sequential streaming data (for example, big data and log processing) and cannot provide the low-latency random IOPS that gp3 delivers. Option D is wrong because standard magnetic storage is a previous-generation volume type that averages roughly 100 IOPS per volume, far below what a performance-sensitive workload needs; broad instance compatibility does not make up for that.

173

MCQeasy

An application uses an Amazon Aurora cluster. The workload becomes read-heavy, but the team cannot change the database schema. They need higher read throughput while keeping writes on the primary. What should they do?

A.Create Aurora read replicas and use the reader endpoint for read traffic

B.Switch the cluster to a single-AZ Aurora configuration to reduce coordination overhead

C.Increase DynamoDB capacity units instead of modifying the database layer

D.Enable CloudFront caching for database queries to serve results from edge locations

AnswerA

Aurora read replicas (reader instances) scale read throughput without requiring schema changes. The cluster provides a reader endpoint to route read queries to replica instances while the writer endpoint continues to handle writes.

Why this answer

Aurora read replicas are designed to offload read traffic from the primary instance, and the reader endpoint automatically load-balances connections across all replicas. Since the workload is read-heavy and the schema cannot change, adding read replicas directly increases read throughput without modifying the application's database schema. The reader endpoint ensures that read queries are directed to the replicas while writes continue to hit the primary instance.

Exam trap

The trap here is that candidates may confuse Aurora read replicas with RDS read replicas, which have different replication mechanics and lag characteristics, or they may think that switching to a single-AZ configuration improves performance by reducing overhead, when in fact it only reduces availability.

How to eliminate wrong answers

Option B is wrong because switching to a single-AZ configuration does not increase read throughput: in Aurora, the replicas in other Availability Zones are reader instances that serve read traffic and act as failover targets, so removing them lowers both availability and read capacity. Option C is wrong because DynamoDB is a different database service with a different API and data model; the question explicitly states the application uses an Aurora cluster, and migrating to DynamoDB would require schema changes, which are not allowed. Option D is wrong because CloudFront caches HTTP responses at edge locations and cannot sit in front of a database connection; caching query results would require application-level caching logic, not an edge cache.

174

MCQeasy

Based on the exhibit, what change best reduces Lambda cold-start impact for a predictable user-upload workflow?

A.Set a reserved concurrency limit for the function to protect it from throttling.

B.Enable provisioned concurrency for the function.

C.Increase the function timeout to give more time for initialization.

D.Move the function to a larger memory setting only to eliminate all initialization time.

AnswerB

Provisioned concurrency keeps a pre-initialized pool of Lambda execution environments ready to respond immediately. The exhibit shows long init duration after inactivity, which is the classic symptom of cold starts affecting user experience. Because the traffic pattern is predictable during launches, provisioned concurrency is the most direct way to reduce startup latency and smooth response times.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, so when a user upload triggers the Lambda function, there is no cold-start delay. This directly addresses the predictable, user-upload workflow by ensuring warm containers are ready to handle requests immediately.
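
As a rough sketch, the configuration could be expressed as the keyword arguments that boto3's Lambda client accepts in `put_provisioned_concurrency_config` (`FunctionName`, `Qualifier`, `ProvisionedConcurrentExecutions`). The helper below only builds and validates that dictionary and does not call AWS; the function name `upload-handler`, the alias `live`, and the value 20 are hypothetical.

```python
def provisioned_concurrency_request(function_name: str, qualifier: str, executions: int) -> dict:
    """Build kwargs for a provisioned concurrency configuration (no AWS call is made)."""
    if qualifier == "$LATEST":
        # Provisioned concurrency applies to a published version or an alias.
        raise ValueError("provisioned concurrency needs a published version or alias")
    if executions < 1:
        raise ValueError("executions must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": executions,
    }


request = provisioned_concurrency_request("upload-handler", "live", 20)
print(request)
```

With boto3 installed and credentials configured, the dictionary could be passed as `boto3.client("lambda").put_provisioned_concurrency_config(**request)`.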

Exam trap

The trap here is confusing reserved concurrency (which limits concurrency to prevent throttling) with provisioned concurrency (which pre-warms instances to eliminate cold starts), leading candidates to choose a throttling protection mechanism instead of a cold-start mitigation solution.

How to eliminate wrong answers

Option A is wrong because reserved concurrency limits the maximum number of concurrent executions to protect downstream resources, but it does not pre-warm containers or reduce cold-start latency. Option C is wrong because increasing the function timeout only extends the maximum execution duration, not the initialization time; it does not eliminate the cold-start delay. Option D is wrong because larger memory settings can reduce initialization time by providing more CPU, but they do not eliminate all initialization time; the function still incurs a cold start when no pre-warmed instances exist.

175

MCQhard

Based on the exhibit, which storage design best supports the application servers' shared working directory requirement?

A.Mount Amazon EFS on every EC2 instance and use it as the shared workspace.

B.Attach one gp3 EBS volume to each instance and synchronize the files with cron jobs.

C.Store the artifacts in S3 and have each node read them directly from S3 as a filesystem.

D.Use instance store on each instance because it provides the fastest local file access.

AnswerA

EFS provides shared, persistent, POSIX-compliant file access across multiple EC2 instances and Availability Zones. That matches the requirement that all nodes see the same workspace immediately and that files survive instance replacement. It is the right choice when the application needs a common filesystem rather than an object store or local-only disk.

Why this answer

Amazon EFS provides a fully managed, NFS-based shared file system that can be mounted concurrently on multiple EC2 instances across multiple Availability Zones. This directly satisfies the requirement for a shared working directory where all application servers can read and write files simultaneously without additional synchronization logic.

Exam trap

The trap here is confusing shared file storage (EFS) with block storage (EBS) or object storage (S3), leading candidates to choose EBS with synchronization or S3 as a filesystem, both of which lack the native shared file system semantics required for concurrent read/write access across multiple instances.

How to eliminate wrong answers

Option B is wrong because attaching a separate gp3 EBS volume to each instance creates isolated file systems; synchronizing files via cron jobs introduces latency, complexity, and potential data inconsistency, failing to provide a true real-time shared workspace. Option C is wrong because Amazon S3 is an object storage service, not a POSIX-compliant file system; mounting S3 as a filesystem (e.g., via s3fs) incurs significant performance overhead, lacks file locking, and does not support concurrent read/write semantics required for a shared working directory. Option D is wrong because instance store volumes are ephemeral and tied to the lifecycle of the EC2 instance; data is lost on stop/termination, and instance stores cannot be shared across multiple instances, making them unsuitable for a persistent shared workspace.

176

MCQhard

Based on the exhibit, a distributed analytics workload runs on 12 EC2 instances in one Availability Zone. The nodes exchange thousands of small messages per second and require the lowest possible intra-cluster latency and jitter. Which EC2 placement strategy is the best fit?

A.Spread placement group, because it places each instance on distinct underlying hardware.

B.Partition placement group, because it isolates nodes across rack partitions.

C.Cluster placement group, because it places instances physically close together in one Availability Zone.

D.Move the workload behind an Application Load Balancer so node-to-node traffic is balanced more efficiently.

AnswerC

Cluster placement groups are designed for workloads that need very low network latency, low jitter, and high packet-per-second performance. Placing the instances physically close together within the same Availability Zone reduces network hop distance and is the best match for a message-heavy distributed analytics cluster.

Why this answer

A cluster placement group is the best choice because it places all 12 EC2 instances in a single Availability Zone within the same high-bandwidth, low-latency logical segment of the network. This minimizes the physical distance and network hops between nodes, achieving the lowest possible intra-cluster latency and jitter required for the thousands of small messages exchanged per second.

Exam trap

The trap here is that candidates confuse 'spread' or 'partition' placement groups as providing better performance due to isolation, but they fail to recognize that cluster placement groups are the only strategy designed specifically for the lowest latency and jitter within a single AZ.

How to eliminate wrong answers

Option A is wrong because a spread placement group places each instance on distinct underlying hardware, which gives up the physical proximity needed for high-frequency, low-latency messaging; it also supports at most seven running instances per Availability Zone, so 12 nodes in one AZ would not fit. Option B is wrong because a partition placement group isolates nodes across rack partitions to reduce correlated failures, but it does not guarantee the tight physical proximity needed for the lowest latency and jitter; it is designed for large distributed systems such as HDFS or Cassandra, where failure isolation matters more than node-to-node latency. Option D is wrong because an Application Load Balancer would add an extra hop, and therefore extra latency and jitter, to node-to-node traffic; ALBs are designed for client-to-server load balancing, not for optimizing internal cluster communication.

177

MCQeasy

An ECS service runs on EC2 capacity. During peak traffic, tasks frequently wait for available container instances. The team wants faster scale-out for the underlying EC2 capacity when tasks increase. What is the best first architectural step?

A.Tune the container health check settings so tasks stop failing and stay running.

B.Use an ECS capacity provider (or Auto Scaling integration) to scale the EC2 instances based on ECS demand.

C.Pin all tasks to a single Availability Zone to reduce placement overhead.

D.Switch the tasks to run only on Fargate so EC2 scaling is no longer relevant.

AnswerB

When ECS tasks need compute, capacity must scale at the EC2 layer so there are enough container instances to place tasks. Integrating ECS with an Auto Scaling capacity provider allows the cluster to scale out in response to pending tasks. This reduces waiting time and improves responsiveness under load.

Why this answer

Option B is correct because an ECS capacity provider (or Auto Scaling integration) directly links ECS task-level demand to EC2 instance scaling. When tasks are pending due to insufficient container instances, the capacity provider triggers a scale-out event on the Auto Scaling group, adding EC2 instances to accommodate the workload. This is the most direct and efficient architectural step to reduce the wait time for available container instances during peak traffic.
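
The scale-out signal can be sketched as follows, assuming the commonly described managed-scaling metric `CapacityProviderReservation = M / N * 100`, where M is the number of instances needed to place all tasks (running plus pending) and N is the number currently running. The function names and the default target of 100 are assumptions for illustration, and the sketch leaves out the empty-cluster case.

```python
def capacity_provider_reservation(instances_needed: int, instances_running: int) -> float:
    """M / N * 100 for a cluster that already has at least one instance."""
    if instances_running <= 0:
        raise ValueError("this sketch only covers groups with at least one instance")
    return instances_needed / instances_running * 100


def should_scale_out(instances_needed: int, instances_running: int, target_capacity: float = 100) -> bool:
    # A value above the target means pending tasks need more instances than exist.
    return capacity_provider_reservation(instances_needed, instances_running) > target_capacity


print(capacity_provider_reservation(6, 4))  # 150.0
print(should_scale_out(6, 4))               # True
print(should_scale_out(4, 4))               # False
```

In this model, pending tasks raise M above N, the metric rises above the target, and the Auto Scaling group adds instances until the two match again.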

Exam trap

The trap here is that candidates may think tuning health checks (Option A) or switching to Fargate (Option D) are simpler fixes, but the question specifically asks for the best first architectural step to scale EC2 capacity faster, which is directly addressed by the capacity provider integration.

How to eliminate wrong answers

Option A is wrong because tuning health check settings does not address the root cause of insufficient EC2 capacity; it only affects task lifecycle management, not the number of available container instances. Option C is wrong because pinning tasks to a single Availability Zone increases risk of failure and does not solve the capacity shortage; placement overhead is negligible compared to the lack of instances. Option D is wrong because switching to Fargate is a migration, not an architectural step for the existing EC2-based service, and it does not address the immediate need for faster scale-out of the underlying EC2 capacity.

178

MCQhard

A document portal needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable? The design must avoid adding custom operational scripts.

A.Amazon OpenSearch Service

B.AWS Config

C.Amazon EFS

D.Amazon SQS

AnswerA

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is the correct choice because it is a fully managed service that provides low-latency full-text search and filtering capabilities, ideal for indexing and searching product descriptions and attributes. It eliminates the need for custom operational scripts by handling cluster management, scaling, and backups automatically, aligning with the requirement to avoid custom operational overhead.
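
To illustrate what "full-text search plus filtered attributes" looks like in practice, the sketch below builds a query body in the OpenSearch query DSL: a `bool` query with a `match` clause for the text and `term` clauses for exact filters. The field names (`description`, `brand`, `in_stock`) and the helper name are hypothetical, and the `term` filters assume those fields are mapped as keyword or boolean types.

```python
import json


def product_search_query(text: str, filters: dict, size: int = 10) -> dict:
    """Build a search body: scored full-text match plus unscored exact filters."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [{"match": {"description": text}}],
                "filter": [{"term": {field: value}} for field, value in filters.items()],
            }
        },
    }


body = product_search_query("waterproof hiking boots", {"brand": "acme", "in_stock": True})
print(json.dumps(body, indent=2))
```

The body would be sent to the index's `_search` endpoint; filter clauses do not affect relevance scoring.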

Exam trap

The trap here is that candidates may confuse AWS Config (a compliance tool) with a search service due to its name, or mistakenly think Amazon EFS or SQS can be adapted for search with custom scripts, ignoring the requirement to avoid custom operational scripts.

How to eliminate wrong answers

Option B (AWS Config) is wrong because it is a service for auditing and evaluating resource configurations against compliance rules, not for full-text search or indexing. Option C (Amazon EFS) is wrong because it is a scalable file storage service for shared access to files, not a search engine or indexing solution. Option D (Amazon SQS) is wrong because it is a message queuing service for decoupling application components, not designed for search or querying of product data.

179

MCQmedium

A startup runs an HTTP/2 API that also supports WebSocket connections. They need path-based routing to separate microservices (for example, /api/* to Service A and /metrics/* to Service B) and want TLS terminated at the load balancer. Which AWS option best meets these requirements while maintaining high request performance?

A.Use an Amazon NLB and configure target groups with HTTP health checks and listener rules for path-based routing.

B.Use an Amazon ALB with HTTP/2 support, WebSocket upgrades enabled, and listener rules for host/path-based routing.

C.Use Amazon API Gateway with a single backend integration and rely on the client to route requests to different microservices.

D.Use Amazon CloudFront without an ALB, and route requests to microservices using only custom origin headers.

AnswerB

An ALB supports Layer 7 features needed here: it can terminate TLS on an HTTPS listener, evaluate HTTP host/path routing rules, and it supports WebSocket by allowing HTTP Upgrade behavior through the ALB to the targets. ALBs also support HTTP/2 on HTTPS listeners, which helps maintain high request performance.

Why this answer

An Application Load Balancer (ALB) natively supports HTTP/2, WebSocket upgrades, and path-based routing via listener rules. It terminates TLS at the load balancer, offloading encryption from backend services, and maintains high performance for both HTTP/2 and WebSocket traffic. This makes ALB the correct choice for the startup's requirements.
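
Path-based routing can be pictured as an ordered rule list evaluated by priority, with a default action when nothing matches. The sketch below imitates that evaluation in plain Python; the target names are hypothetical, and `fnmatchcase` only approximates ALB path patterns (it also accepts `[...]` character sets, which ALB patterns do not).

```python
from fnmatch import fnmatchcase

# (priority, path pattern, target group); names are hypothetical
RULES = [
    (10, "/api/*", "service-a"),
    (20, "/metrics/*", "service-b"),
]
DEFAULT_TARGET = "default-service"


def route(path: str) -> str:
    """Return the target of the lowest-priority-number rule whose pattern matches."""
    for _priority, pattern, target in sorted(RULES):
        if fnmatchcase(path, pattern):
            return target
    return DEFAULT_TARGET


print(route("/api/orders/42"))  # service-a
print(route("/metrics/cpu"))    # service-b
print(route("/health"))         # default-service
```

A Layer 4 load balancer never sees the request path, which is why this kind of rule evaluation needs a Layer 7 load balancer such as the ALB.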

Exam trap

The trap here is that candidates may confuse NLB's Layer 4 capabilities with ALB's Layer 7 features, incorrectly assuming an NLB can perform path-based routing, when in fact it cannot inspect application-layer data.

How to eliminate wrong answers

Option A is wrong because a Network Load Balancer (NLB) operates at Layer 4: it forwards TCP, UDP, and TLS connections without inspecting HTTP requests, so it has no listener rules for path-based routing. Option C is wrong because a single backend integration that relies on the client to pick the microservice does not provide the server-side path-based routing the scenario requires. Option D is wrong because custom origin headers alone do not direct requests to different microservices, and the option removes the load balancer at which TLS termination and routing are required.
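
As a rough illustration of the Layer 7 routing described above, the following sketch evaluates path patterns in priority order and falls back to a default target. It is a toy model, not the ALB API: the rule table, the target names, and the `route` function are hypothetical, and `fnmatchcase` only approximates ALB wildcard matching.

```python
from fnmatch import fnmatchcase

# Hypothetical rule table: (priority, path pattern, target group name).
# Lower priority numbers are evaluated first, as with listener rules.
RULES = [
    (10, "/api/*", "service-a"),
    (20, "/metrics/*", "service-b"),
]
DEFAULT_TARGET = "default-service"  # stands in for the listener's default action


def route(path: str) -> str:
    """Return the target whose pattern is the first to match the request path."""
    for _priority, pattern, target in sorted(RULES):
        if fnmatchcase(path, pattern):  # case-sensitive wildcard match
            return target
    return DEFAULT_TARGET


print(route("/api/orders/42"))  # service-a
print(route("/metrics/cpu"))    # service-b
print(route("/health"))         # default-service
```

A Layer 4 load balancer never sees the `path` argument at all, which is why this kind of rule cannot be expressed on an NLB.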

180

MCQmedium

A media archive requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable? The design must avoid adding custom operational scripts.

A.Provisioned IOPS SSD such as io2

B.st1 Throughput Optimized HDD

C.Instance store only

D.sc1 Cold HDD

AnswerA

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

A Provisioned IOPS SSD (io2) volume is the correct choice because it delivers the consistent, high IOPS required for transactional databases, is designed for 99.999% durability, and lets IOPS be provisioned independently of storage capacity. This avoids custom operational scripts by providing predictable performance natively through the EBS volume type.

Exam trap

The trap here is that candidates may choose instance store (Option C) thinking it provides the highest performance, but they overlook its lack of persistence and the requirement for custom scripts to manage data durability, which violates the 'no custom operational scripts' constraint.

How to eliminate wrong answers

Option B (st1 Throughput Optimized HDD) is wrong because it is designed for throughput-intensive workloads like big data and log processing, not for consistent high IOPS; its performance is burst-based and degrades under sustained small random I/O. Option C (Instance store only) is wrong because instance store volumes are ephemeral and data is lost on instance stop or termination, making them unsuitable for a persistent transactional database without custom backup scripts. Option D (sc1 Cold HDD) is wrong because it is optimized for infrequently accessed data with the lowest cost per GB, offering very low IOPS that cannot meet the demands of a transactional database.

181

MCQmedium

A team serves image files from S3 through CloudFront. During a performance review, they notice that CloudFront cache hit ratio is low and the S3 origin receives many repeated requests for the same images. Request URLs include a volatile query parameter called 'sessionId' that changes for each user, but the image content is identical regardless of 'sessionId'. What configuration change will most effectively increase cache hit ratio?

A.Update the CloudFront cache policy so that 'sessionId' is not included in the cache key (and only stable query parameters are used).

B.Enable origin request policy to forward all query strings to S3 so responses are always correct for every sessionId.

C.Set the CloudFront minimum TTL to 0 seconds so cached objects expire quickly and fetch fresh content more often.

D.Disable caching by using CloudFront managed caching disabled so that every request validates with the origin.

AnswerA

Removing volatile query parameters from the cache key prevents unique URLs from generating separate cache entries.

Why this answer

The low cache hit ratio is caused by the volatile 'sessionId' query parameter being included in the CloudFront cache key, which creates a unique cache entry for every user request even though the image content is identical. By updating the cache policy to exclude 'sessionId' from the cache key, CloudFront will treat all requests for the same image as the same cached object, dramatically increasing the cache hit ratio and reducing load on the S3 origin.

Exam trap

The trap here is that candidates may confuse the purpose of cache policies (which control the cache key) with origin request policies (which control what is forwarded to the origin), leading them to incorrectly choose Option B thinking that forwarding query strings will fix the issue, when in fact it does not affect the cache key.

How to eliminate wrong answers

Option B is wrong because an origin request policy controls what CloudFront forwards to the origin, not what goes into the cache key. The cache key is defined by the cache policy, so forwarding all query strings to S3 leaves the cache hit ratio low. Option C is wrong because a minimum TTL of 0 seconds only allows objects to expire sooner when the origin's caching headers permit it; it does not change the cache key and, if anything, lowers the cache hit ratio. Option D is wrong because disabling caching entirely would eliminate any cache hits, making every request go to the S3 origin, which is the opposite of increasing the cache hit ratio.
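
The effect of the cache key can be shown with a small simulation. This is a simplified model, not CloudFront's actual key computation: the `cache_key` function, the sample URLs, and the `size` parameter are hypothetical and exist only to count how many distinct cache entries a set of requests would create.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

IGNORED_PARAMS = frozenset({"sessionId"})  # assumed not to affect the response


def cache_key(url: str, ignored=IGNORED_PARAMS) -> str:
    """Simplified cache key: the path plus every query parameter not ignored."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in ignored]
    return parts.path + ("?" + urlencode(kept) if kept else "")


requests = [
    "/img/logo.png?sessionId=a1",
    "/img/logo.png?sessionId=b2",
    "/img/logo.png?sessionId=c3&size=large",
    "/img/logo.png?size=large&sessionId=d4",
]

keys_with_session = {cache_key(u, ignored=frozenset()) for u in requests}
keys_without_session = {cache_key(u) for u in requests}
print(len(keys_with_session), len(keys_without_session))  # 4 2
```

With `sessionId` in the key, four requests produce four cache entries; without it, the same requests collapse to two entries, one per distinct image variant.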

182

MCQeasy

A team runs a latency-sensitive service on EC2 and needs consistent, low-latency block storage for a database. The application requires predictable performance and should be fast for random reads/writes. Which EBS volume type is the best choice?

A.EBS st1 (throughput optimized HDD)

B.EBS gp3 (general purpose SSD)

C.EBS sc1 (cold HDD)

D.EBS magnetic (legacy magnetic)

AnswerB

gp3 is designed for a broad range of general-purpose workloads with solid low-latency performance. It supports random I/O patterns and offers predictable performance for many latency-sensitive applications. It is a common best-fit choice when you need balanced performance without specialized throughput-focused characteristics.

Why this answer

B is correct because gp3 is a general-purpose SSD that provides consistent, low-latency performance for random read/write operations, making it ideal for latency-sensitive databases. It offers a baseline of 3,000 IOPS and 125 MB/s throughput, with the ability to independently scale IOPS up to 16,000 and throughput up to 1,000 MB/s, ensuring predictable performance without the burst-bucket limitations of gp2.

Exam trap

The trap here is that candidates often confuse throughput-optimized HDDs (st1) with low-latency needs, mistakenly thinking 'throughput' implies fast performance, when in fact HDDs are unsuitable for random I/O and latency-sensitive workloads.

How to eliminate wrong answers

Option A is wrong because st1 is a throughput-optimized HDD designed for large, sequential workloads like big data and log processing, not for low-latency random reads/writes required by databases. Option C is wrong because sc1 is a cold HDD optimized for infrequently accessed data with the lowest cost, offering very low IOPS and high latency, unsuitable for latency-sensitive database workloads. Option D is wrong because magnetic (standard) is a legacy HDD volume type with no performance guarantees, high latency, and low IOPS, making it obsolete for modern database applications.

183

MCQmedium

A read-heavy document portal repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load? The team wants the control to be enforceable during normal operations.

A.Amazon Kinesis Data Firehose

B.S3 Transfer Acceleration

C.DynamoDB Accelerator (DAX)

D.AWS Glue Data Catalog

AnswerC

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is an in-memory cache specifically designed for DynamoDB that can reduce read latency from single-digit milliseconds to microseconds, while offloading read traffic from the underlying table. This directly addresses the read-heavy workload and millisecond latency requirements, and the team can enforce its use during normal operations by configuring the application to route reads through the DAX cluster endpoint.

Exam trap

The trap here is that candidates may confuse DAX with ElastiCache (which is a general-purpose cache but not DynamoDB-native) or assume that S3 Transfer Acceleration can improve DynamoDB read performance, when in fact DAX is the only AWS service purpose-built to cache DynamoDB reads with sub-millisecond latency.

How to eliminate wrong answers

Option A is wrong because Amazon Kinesis Data Firehose is a streaming data ingestion service for loading data into data lakes and analytics tools, not a caching layer for DynamoDB reads. Option B is wrong because S3 Transfer Acceleration speeds up uploads to S3 over long distances using AWS edge locations, but it does not cache DynamoDB query results or reduce table load. Option D is wrong because AWS Glue Data Catalog is a metadata repository for ETL jobs and data lake schemas, not a read cache for DynamoDB.
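
To make the "offloads reads from the table" point concrete, here is a toy read-through cache with a per-item TTL. It is not the DAX client API; the `ReadThroughCache` class, the TTL value, and the sample catalogue are hypothetical, and a plain dictionary stands in for the DynamoDB table.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a per-item TTL (not the DAX client API)."""

    def __init__(self, load_from_table, ttl_seconds=300.0, clock=time.monotonic):
        self._load = load_from_table
        self._ttl = ttl_seconds
        self._clock = clock
        self._items = {}  # key -> (expiry time, value)
        self.table_reads = 0

    def get(self, key):
        entry = self._items.get(key)
        now = self._clock()
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: the table is not touched
        self.table_reads += 1      # cache miss: fall through to the table
        value = self._load(key)
        self._items[key] = (now + self._ttl, value)
        return value


catalogue = {"sku-1": {"name": "Lamp"}, "sku-2": {"name": "Desk"}}
cache = ReadThroughCache(catalogue.get)

for _ in range(1000):
    cache.get("sku-1")
    cache.get("sku-2")

print(cache.table_reads)  # 2
```

Two thousand reads reach the backing store only twice, which is the pattern that makes a cache effective for repeatedly queried catalogue items.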

184

Multi-Selecthard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance? The design must avoid adding custom operational scripts.

Select 2 answers

A.S3 Object Lock

B.S3 multipart upload

C.S3 Inventory

D.S3 Transfer Acceleration

AnswersB, D

Multipart upload parallelizes large object upload parts and improves reliability.

Why this answer

S3 multipart upload is correct because it allows large files to be uploaded in parallel parts, significantly reducing the impact of latency and packet loss over long distances. This feature is built into S3 and requires no custom scripts, making it ideal for a latency-sensitive mobile game backend.
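
The splitting step can be sketched as follows. This only plans byte ranges and makes no S3 calls; the `plan_parts` function and the 8 MiB preferred part size are hypothetical choices, while the 5 MiB minimum part size and the 10,000-part limit are S3 multipart upload limits.

```python
import math

MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(object_size: int, preferred_part_size: int = 8 * MIB):
    """Return (part_number, start_byte, end_byte_inclusive) tuples for an object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size if the preferred size would need more than MAX_PARTS parts.
    part_size = max(preferred_part_size, MIN_PART_SIZE,
                    math.ceil(object_size / MAX_PARTS))
    parts = []
    for number, start in enumerate(range(0, object_size, part_size), start=1):
        end = min(start + part_size, object_size) - 1
        parts.append((number, start, end))
    return parts


parts = plan_parts(100 * MIB)
print(len(parts))  # 13
print(parts[0], parts[-1])
```

Each planned range can then be uploaded independently and retried on its own, which is where the throughput and reliability gains come from.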

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration with a feature that requires client-side modifications, but it is a simple bucket-level setting that uses AWS edge locations automatically, and they might overlook multipart upload as a performance booster because it is often associated with reliability rather than speed.

185

Multi-Selecthard

A static website stores assets in S3 and is delivered through CloudFront. Analytics show low cache hit ratio, many origin fetches for the same JavaScript bundles, and elevated S3 GET request costs. Most requests include unnecessary cookies, and the text assets are uncompressed. Which changes should the team make? Select three.

Select 3 answers

A.Configure a CloudFront cache policy that excludes unnecessary cookies and headers from the cache key.

B.Enable Origin Shield for the distribution to reduce duplicate requests reaching the S3 origin.

C.Enable compression for text-based objects such as JavaScript and CSS.

D.Switch the origin to an Application Load Balancer so CloudFront can cache the assets more effectively.

E.Disable caching so viewers always retrieve the newest version directly from S3.

AnswersA, B, C

Correct because cookies and headers that do not affect content create unnecessary cache variants. Removing them from the cache key makes CloudFront reuse the same object more often.

Why this answer

Option A is correct because a CloudFront cache policy that excludes unnecessary cookies and headers from the cache key prevents CloudFront from creating multiple cache entries for the same object based on varying cookie values. This increases the cache hit ratio by ensuring that identical JavaScript bundles are served from the edge cache rather than triggering separate origin fetches for each unique request, thereby reducing S3 GET request costs.

Exam trap

The trap here is that candidates may think an ALB origin improves caching for static assets (Option D) or that disabling caching solves freshness issues (Option E), when in fact both actions worsen performance and cost for a static website served through CloudFront.

186

Multi-Selectmedium

A multi-tenant event system writes and reads data in DynamoDB. One tenant generates most of the traffic, causing throttling on a single partition key value, and the dashboards repeatedly read the most recent items for that tenant. Which two changes should the team make to improve performance? Select two.

Select 2 answers

A.Shard the hot tenant’s writes across multiple partition key values so traffic is spread across partitions.

B.Use Amazon DAX to cache repetitive read requests for the same items with sub-millisecond latency.

C.Switch the table to on-demand capacity mode so DynamoDB automatically removes partition limits.

D.Use a larger sort key attribute to increase the maximum write throughput for the tenant.

E.Move the table to a single larger provisioned throughput setting and keep the same key design.

AnswersA, B

Write sharding reduces pressure on a single partition and is the standard fix for a hot key caused by one tenant.

Why this answer

Option A is correct because sharding the hot tenant's writes across multiple partition key values (e.g., by appending a random suffix or a timestamp-based suffix) distributes the write traffic across multiple physical partitions, avoiding throttling on a single partition. This is a common DynamoDB design pattern for mitigating hot keys, as each partition has its own throughput limits (up to 3,000 RCU and 1,000 WCU per partition). By spreading writes, the system can achieve higher aggregate throughput without hitting per-partition limits.
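
A minimal sketch of the random-suffix approach, assuming a hypothetical key format of `tenant#shard` and an arbitrary shard count of 10. The function names are illustrative only; no DynamoDB calls are made.

```python
import random

SHARD_COUNT = 10  # hypothetical; size it from the hot tenant's expected write rate


def sharded_partition_key(tenant_id: str, rng=random) -> str:
    """Pick one of SHARD_COUNT partition key values for a single write."""
    return f"{tenant_id}#{rng.randrange(SHARD_COUNT)}"


def all_shard_keys(tenant_id: str) -> list:
    """Every partition key a reader must query to see all of the tenant's items."""
    return [f"{tenant_id}#{shard}" for shard in range(SHARD_COUNT)]


writes = [sharded_partition_key("tenant-42") for _ in range(10_000)]
print(len(set(writes)))                # number of distinct key values used
print(all_shard_keys("tenant-42")[:3])
```

The trade-off is on the read side: a dashboard that wants the tenant's most recent items now has to query every shard key and merge the results, which is one reason caching the repeated reads is paired with this change.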

Exam trap

The trap here is that candidates often assume on-demand mode (Option C) or higher provisioned throughput (Option E) will solve hot key throttling, but they fail to recognize that DynamoDB's per-partition throughput limits are independent of the table's capacity mode or total provisioned capacity.

187

MCQmedium

A read-heavy media archive repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load?

A.DynamoDB Accelerator (DAX)

B.Amazon Kinesis Data Firehose

C.AWS Glue Data Catalog

D.S3 Transfer Acceleration

AnswerA

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for DynamoDB that delivers up to 10x read performance improvement by reducing response times from single-digit milliseconds to microseconds. For a read-heavy workload repeatedly querying the same product catalogue data, DAX caches the hot items, offloading read requests from the DynamoDB table and significantly reducing table read capacity consumption.

Exam trap

The trap here is that candidates may confuse DAX with other caching services like ElastiCache or think that S3 Transfer Acceleration can speed up DynamoDB reads, but DAX is the only AWS service purpose-built for in-memory caching of DynamoDB queries with automatic cache invalidation and write-through semantics.

How to eliminate wrong answers

Option B (Amazon Kinesis Data Firehose) is wrong because it is a real-time streaming data ingestion service for loading data into data lakes and analytics tools, not a caching or read acceleration service for DynamoDB queries. Option C (AWS Glue Data Catalog) is wrong because it is a metadata repository for ETL jobs and data discovery, not designed to cache or accelerate DynamoDB read operations. Option D (S3 Transfer Acceleration) is wrong because it speeds up uploads to S3 over long distances using AWS edge locations, but it does not cache DynamoDB query results or reduce read latency for DynamoDB operations.

188

MCQmedium

A web API runs on an Auto Scaling group (ASG) behind an Application Load Balancer (ALB). During traffic spikes, users experience request timeouts even though CPU stays below 40%. After investigation, you find the ASG often has too few healthy targets to handle the current request rate. Which change will best improve responsiveness during spikes?

A.Keep the ASG scaling policy based on CPU utilization, but increase the ASG min capacity by 50%.

B.Create a target tracking scaling policy using an ALB metric such as RequestCountPerTarget or TargetResponseTime.

C.Enable EC2 detailed monitoring for one-minute granularity and keep CPU scaling.

D.Switch to scaling based on the ASG network out bytes metric only, ignoring ALB response metrics.

AnswerB

Target tracking with an ALB performance metric scales based on the same layer where the problem is observed (requests/latency through the ALB). As traffic spikes, RequestCountPerTarget and/or TargetResponseTime increase; the scaling policy then increases the ASG desired capacity so the ALB has more healthy targets to distribute requests to. That reduces queuing/latency and helps prevent timeouts without waiting for CPU to rise.

Why this answer

Option B is correct because the issue is that the ASG has too few healthy targets to handle the request rate, even though CPU is low. A target tracking scaling policy based on RequestCountPerTarget or TargetResponseTime directly aligns scaling with the ALB's view of demand, ensuring the ASG adds instances when request rates spike, regardless of CPU utilization. This addresses the root cause—insufficient capacity to serve incoming requests—rather than relying on a metric (CPU) that does not reflect the bottleneck.

Exam trap

The trap here is that candidates assume CPU utilization is always the best scaling metric, but AWS explicitly tests that ALB-level metrics (RequestCountPerTarget, TargetResponseTime) are more appropriate when the bottleneck is request throughput rather than compute load.

How to eliminate wrong answers

Option A is wrong because increasing the ASG min capacity only raises the baseline number of instances, but does not make the scaling policy responsive to traffic spikes; the ASG will still scale based on CPU, which remains low, so it will not add instances during spikes. Option C is wrong because enabling detailed monitoring (1-minute granularity) improves the frequency of metric data but does not change the fact that CPU utilization is not the correct metric to trigger scaling for this request-rate bottleneck. Option D is wrong because switching to scaling based solely on ASG network out bytes ignores the ALB's request-level metrics, which are more directly correlated with the user-observed timeouts and healthy-target deficit.
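
The arithmetic behind scaling on requests per target can be approximated as below. This is a simplification for intuition, not how the Auto Scaling service is implemented (the real policy works through CloudWatch alarms and instance warm-up); the function name and all numbers are hypothetical.

```python
import math


def desired_capacity(total_requests_per_minute: float,
                     target_requests_per_target: float,
                     min_size: int, max_size: int) -> int:
    """Instances needed to keep requests per target at or below the target value."""
    needed = math.ceil(total_requests_per_minute / target_requests_per_target)
    return max(min_size, min(max_size, needed))


# Hypothetical: each instance is expected to handle 600 requests per minute.
print(desired_capacity(1_800, 600, min_size=2, max_size=20))   # 3
print(desired_capacity(9_000, 600, min_size=2, max_size=20))   # 15
print(desired_capacity(60_000, 600, min_size=2, max_size=20))  # 20 (capped at max_size)
```

CPU utilization does not appear anywhere in the calculation, so capacity follows the request rate even when CPU stays below 40%.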

189

MCQeasy

A compute workload uses temporary scratch space for intermediate results (reproducible), and it can tolerate data loss if the instance is terminated. The workload benefits from very high local I/O throughput. Which storage option is the best fit for the scratch data?

A.Amazon EBS General Purpose (gp3) volumes to persist intermediate results across reboots.

B.Amazon EFS for a shared file system between multiple instances.

C.Instance store for local temporary files that can be lost when the instance stops.

D.Amazon S3 for scratch data so it is always durable and accessible from anywhere.

AnswerC

Instance store is designed for temporary high-performance local storage and is acceptable when loss is tolerable.

Why this answer

Instance store volumes provide very high local I/O throughput because they are physically attached to the host server. Since the workload can tolerate data loss and the scratch data is reproducible, the ephemeral nature of instance store is acceptable, and it offers the best performance for temporary, high-throughput scratch space.

Exam trap

AWS often tests the distinction between persistent block storage (EBS) and ephemeral instance store, where candidates mistakenly choose EBS for its persistence even when the workload explicitly tolerates data loss and requires maximum local I/O throughput.

How to eliminate wrong answers

Option A is wrong because EBS gp3 volumes, while offering good performance, have lower maximum IOPS and throughput compared to instance store, and persisting intermediate results across reboots is unnecessary since the data is reproducible and can be lost. Option B is wrong because EFS is a network file system with higher latency and lower throughput than local storage, and a shared file system is not required for scratch data used by a single instance. Option D is wrong because Amazon S3 is object storage with high latency and lower throughput for random I/O, and its durability and accessibility features are overkill for temporary, reproducible scratch data that can be lost.

190

Multi-Selecthard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance?

Select 2 answers

A.S3 Object Lock

B.S3 multipart upload

C.S3 Inventory

D.S3 Transfer Acceleration

AnswersB, D

Multipart upload parallelizes large object upload parts and improves reliability.

Why this answer

S3 multipart upload is correct because it allows large files to be uploaded in parallel parts, which reduces the impact of network latency and improves throughput. For a latency-sensitive mobile game backend, this feature enables faster uploads by splitting the file into smaller chunks that can be uploaded concurrently, even over unstable connections.

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration with multipart upload, or incorrectly assume that S3 Object Lock or Inventory provide performance benefits, when in fact they serve entirely different purposes related to data protection and management.

191

Matchinghard

A media platform serves global users through Amazon CloudFront and an S3 origin. Match each requirement on the left to the CloudFront configuration or behavior on the right.

Concepts

Matches

Use CloudFront Origin Access Control and allow only the distribution in the bucket policy.

Use versioned object filenames or hashed asset names with a long TTL.

Exclude the tracking query string from the cache key with a cache policy.

Use CloudFront signed URLs or signed cookies.

Why these pairings

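Origin Access Control, combined with a bucket policy that allows only the distribution, prevents viewers from bypassing CloudFront and reading the S3 bucket directly. Versioned or hashed asset filenames let objects be cached with a long TTL, because each release is published under a new name. Excluding a tracking query string from the cache key stops identical objects from being cached as separate variants. Signed URLs or signed cookies restrict private content to authorized viewers.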

192

MCQhard

Based on the exhibit, which storage choice best matches the workload requirements?

A.Use io2 EBS volumes because they provide the highest durable block storage performance.

B.Use instance store NVMe for the temporary processing workspace.

C.Use Amazon EFS for the workspace so the temporary files survive instance replacement.

D.Use S3 as the working directory and read and write the intermediate files directly there.

AnswerB

Instance store fits a high-IOPS scratch workload where data can be lost safely and rebuilt from S3. The benchmark shows extremely low latency and very high random I/O performance, which is ideal for intermediate transcode files. Because the job can be retried from the source object, persistence is not needed on the local workspace.

Why this answer

Instance store NVMe volumes provide temporary, high-performance block storage directly attached to the EC2 host. For a temporary processing workspace where data does not need to persist beyond the instance lifecycle, instance store offers the lowest latency and highest throughput, making it the best match for the workload requirements.

Exam trap

The trap here is that candidates often choose io2 EBS volumes (Option A) because they associate 'highest durable block storage' with 'best performance,' failing to recognize that durability and persistence are unnecessary for temporary data, and that instance store provides superior raw performance for ephemeral workloads.

How to eliminate wrong answers

Option A is wrong because io2 EBS volumes are persistent block storage designed for 99.999% durability, which is unnecessary and cost-inefficient for temporary processing data that does not require persistence. Option C is wrong because Amazon EFS is a network file system that provides shared, persistent storage; using it for temporary files that should not survive instance replacement introduces unnecessary complexity, latency, and cost, and contradicts the requirement for a temporary workspace. Option D is wrong because using S3 as a working directory for intermediate files would incur high latency per operation, lack POSIX file system semantics (e.g., no file locking or atomic renames), and generate excessive API call costs, making it unsuitable for high-frequency read/write processing tasks.

193

Multi-Selecthard

A latency-sensitive video platform uploads large files to S3 from users around the world. Which two features can improve upload performance? The design must avoid adding custom operational scripts.

Select 2 answers

A.S3 Object Lock

B.S3 Transfer Acceleration

C.S3 multipart upload

D.S3 Inventory

AnswersB, C

Transfer Acceleration uses optimized edge paths into AWS for long-distance S3 transfers.

Why this answer

S3 Transfer Acceleration (B) uses AWS edge locations to route uploads over optimized network paths, reducing latency for users far from the destination bucket. S3 Multipart Upload (C) allows parallel uploads of file parts, improving throughput and resilience for large files. Both features enhance upload performance without requiring custom scripts.

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration with CloudFront’s content delivery features, or assume S3 Object Lock or Inventory could somehow improve upload speed, but neither addresses network latency or throughput for uploads.

194

MCQhard

A Lambda-based travel booking site has unpredictable traffic spikes and users see latency caused by cold starts. The function must respond consistently during expected campaign windows. What should be configured?

A.Provisioned concurrency during campaign windows

B.A larger deployment package

C.CloudTrail data events

D.Reserved concurrency only

AnswerA

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, so requests served by those environments skip cold-start initialization. During campaign windows, this keeps response times consistent because, up to the configured amount, pre-initialized environments are ready to handle requests immediately.

Exam trap

The trap here is that candidates confuse reserved concurrency (a limit) with provisioned concurrency (a pre‑warming mechanism), assuming any concurrency setting solves cold starts, when only provisioned concurrency actively eliminates them.

How to eliminate wrong answers

Option B is wrong because a larger deployment package increases the time needed to download and initialize the code, making cold starts worse, not better. Option C is wrong because CloudTrail data events record API activity for auditing and do not affect Lambda execution latency or concurrency. Option D is wrong because reserved concurrency only caps the maximum number of concurrent executions for a function to prevent it from consuming all available concurrency in an account; it does not pre-warm instances or reduce cold starts.
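
When sizing provisioned concurrency for a campaign window, a common starting point is that concurrency is roughly the request rate multiplied by the average duration. The sketch below applies that estimate with a safety margin; the function name, the 10% headroom, and the traffic figures are hypothetical and should be replaced with measured values.

```python
import math


def estimate_provisioned_concurrency(peak_requests_per_second: float,
                                     avg_duration_seconds: float,
                                     headroom_percent: int = 10) -> int:
    """Concurrency is about request rate x average duration, plus a margin."""
    base = peak_requests_per_second * avg_duration_seconds
    return math.ceil(base * (100 + headroom_percent) / 100)


# Hypothetical campaign peak: 400 requests/s at 250 ms average duration.
print(estimate_provisioned_concurrency(400, 0.250))  # 110
```

Requests beyond the provisioned amount are still served, but by on-demand environments that may incur cold starts, so the margin matters when spikes are hard to predict.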

195

MCQeasy

A backend API uses an AWS Lambda function behind API Gateway. The first requests after every weekly deployment experience cold starts, causing p95 latency spikes for a few minutes. Which configuration most directly prevents those cold starts for the published version?

A.Increase the Lambda memory size only, without changing how Lambda is invoked

B.Use Lambda provisioned concurrency for the version via an alias

C.Enable dead-letter queues (DLQ) to retry failed cold starts

D.Attach a CloudFront distribution to cache API Gateway responses for 5 minutes

AnswerB

Provisioned concurrency keeps Lambda execution environments initialized and ready for a specific published version. By attaching it to an alias (for example, pointing the alias used by API Gateway to the new version), you pre-warm environments so the first requests after deployment are served without cold-start initialization.

Why this answer

Provisioned concurrency initializes a specified number of Lambda execution environments ahead of time, so that when the published version is invoked via an alias, there are no cold starts. This directly addresses the latency spikes caused by cold starts after a deployment, as the function is kept warm and ready to handle requests immediately.

Exam trap

The trap here is that candidates may confuse provisioned concurrency with reserved concurrency, which only limits the maximum number of concurrent executions but does not prevent cold starts.

How to eliminate wrong answers

Option A is wrong because increasing memory size can shorten cold starts but does not prevent them from occurring. Option C is wrong because dead-letter queues capture events from asynchronous invocations that still fail after Lambda's retries; they act after a failure and do nothing to keep execution environments warm. Option D is wrong because CloudFront caching API Gateway responses reduces latency for cached responses but does not prevent cold starts on the Lambda function; the first request after a deployment still triggers a cold start, and caching does not warm the Lambda environment.

196

MCQmedium

A site serves static assets (JS/CSS) through CloudFront from an S3 origin. After a recent frontend change, CloudFront shows a cache hit ratio below 20%. In CloudFront access logs, requests to the same asset URL path differ by a query parameter named rnd (a random value appended by the app on every request). The origin content is identical regardless of rnd. What is the best CloudFront configuration change to restore effective caching?

A.Increase the origin response Cache-Control max-age header on S3 so CloudFront caches longer even with different rnd values.

B.Create a custom CloudFront Cache Policy that does not include the rnd query parameter in the cache key (whitelist only required parameters, or forward no query strings).

C.Disable compression on CloudFront so the response body is identical byte-for-byte and cache hits improve.

D.Switch the origin from S3 to an ALB so CloudFront can cache based on ALB target health checks instead of the query string.

AnswerB

CloudFront caching effectiveness depends on the cache key. Since rnd does not change the content returned by the S3 origin, excluding rnd from the cache key allows many requests for the “same” asset to map to the same cached object. This removes cache fragmentation and restores a higher hit ratio without changing application content correctness.

Why this answer

The rnd query parameter makes each request appear unique to CloudFront, causing a cache miss for every request even though the underlying content is identical. By creating a custom cache policy that either forwards no query strings or whitelists only required parameters, CloudFront will ignore the rnd parameter when computing the cache key, allowing it to serve cached responses and dramatically improve the cache hit ratio.

Exam trap

The trap here is that candidates often think increasing cache duration (Option A) or disabling compression (Option C) will fix cache misses, when the real issue is that the query parameter is being included in the cache key, making every request unique.

How to eliminate wrong answers

Option A is wrong because increasing Cache-Control max-age only tells the browser and edge how long to cache the response, but it does not change the cache key; CloudFront still treats URLs with different rnd values as distinct objects, so each request will be a cache miss. Option C is wrong because disabling compression does not affect the cache key; CloudFront already caches compressed and uncompressed versions separately based on the Accept-Encoding header, and the issue here is the query string, not compression. Option D is wrong because switching to an ALB does not solve the query-string-based cache key problem; CloudFront would still see different rnd values as different cache keys, and ALBs are not designed to improve CloudFront caching behavior.
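
The cache-key reasoning above can be sketched in a few lines of Python. This is a simplified model, not CloudFront's actual implementation: the `cache_key` and `hit_ratio` helpers and the `/static/app.js` path are hypothetical names chosen for illustration, and the model assumes the cache key is just the path plus whichever query parameters the policy keeps.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

def cache_key(url, allowed_params=None):
    """Simplified cache key: the path plus the query parameters kept in the key.

    allowed_params=None keeps every parameter (all query strings in the key);
    an empty set keeps none; otherwise only the named parameters are kept.
    """
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query)
    if allowed_params is not None:
        pairs = [(k, v) for k, v in pairs if k in allowed_params]
    query = urlencode(pairs)
    return f"{parts.path}?{query}" if query else parts.path

def hit_ratio(urls, allowed_params=None):
    """Replay requests against an initially empty cache and return hits / requests."""
    seen, hits = set(), 0
    for url in urls:
        key = cache_key(url, allowed_params)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(urls)

# 100 requests for the same asset, each with a different rnd value.
requests = [f"/static/app.js?rnd={i}" for i in range(100)]
fragmented = hit_ratio(requests)                      # rnd is part of the key
restored = hit_ratio(requests, allowed_params=set())  # rnd excluded from the key
print(fragmented, restored)
```

With `rnd` in the key, every request is a miss in this model; with `rnd` excluded, only the first request goes to the origin.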

197

Multi-Selectmedium

A distributed analytics platform runs on 12 EC2 instances in one Availability Zone. The nodes exchange a very high volume of east-west messages and the team wants the lowest possible network latency between instances. Which two changes should the architect make first? Select two.

Select 2 answers

A.Place the instances in a cluster placement group so AWS keeps them physically close together.

B.Use instance types that support enhanced networking with the Elastic Network Adapter (ENA).

C.Spread the instances across multiple Availability Zones to reduce the chance of correlated failure.

D.Use a spread placement group so each instance lands on different underlying hardware.

E.Move the workload to burstable T-series instances to absorb short traffic spikes economically.

AnswersA, B

Cluster placement groups are intended for tightly coupled workloads that need low network latency and high throughput between instances. AWS places the instances on hardware that is physically close within the AZ, which improves east-west communication.

Why this answer

A cluster placement group is the correct choice because it ensures EC2 instances are placed in a single Availability Zone and are physically close together, which minimizes network latency and maximizes throughput for high-volume east-west traffic. This is the lowest-latency placement group option available, as it groups instances within a single rack or cluster of racks, reducing the number of network hops.

Exam trap

The trap here is that candidates may confuse a spread placement group (which focuses on fault isolation) with a cluster placement group (which focuses on low latency), or they may think that spreading across Availability Zones improves performance when it actually increases latency.

198

MCQeasy

A.Use a constant partition key value (for example, partition_key='events') and store user_id as an attribute.

B.Use user_id as the partition key and event_time as the sort key.

C.Use event_time as the partition key and user_id as an attribute to query later.

D.Use a randomly generated UUID as the partition key and query by user_id using a full table scan.

AnswerB

Why this answer

Option B is correct because using user_id as the partition key evenly distributes writes across partitions, avoiding hot spots, while event_time as the sort key enables efficient retrieval of events for a specific user in chronological order. DynamoDB's query operation can then fetch all events for a given user_id sorted by event_time without scanning.

Exam trap

The trap here is that candidates may choose a constant partition key (Option A) thinking it simplifies queries, not realizing it creates a single hot partition that defeats DynamoDB's scalability.

How to eliminate wrong answers

Option A is wrong because a constant partition key value ('events') forces all items into a single partition, creating a hot partition that throttles writes and reads as the number of users grows. Option C is wrong because using event_time as the partition key scatters events for the same user across multiple partitions, requiring a costly scan or multiple queries to retrieve all events for a user, and it does not guarantee sorted results per user. Option D is wrong because a random UUID partition key prevents efficient retrieval by user_id without a full table scan, which is expensive and slow, and it does not provide sorted results.
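
As a rough illustration of why a `user_id` partition key with an `event_time` sort key supports per-user chronological reads, here is a minimal in-memory sketch. The `EventTable` class is a hypothetical stand-in, not the DynamoDB API, and it assumes `event_time` is stored as an ISO-8601 string so that string order matches chronological order.

```python
from bisect import insort
from collections import defaultdict

class EventTable:
    """In-memory stand-in for a table keyed by (user_id, event_time)."""

    def __init__(self):
        # One item collection per partition key value, kept sorted by sort key.
        self._collections = defaultdict(list)

    def put(self, user_id, event_time, payload):
        insort(self._collections[user_id], (event_time, payload))

    def query(self, user_id, since=""):
        """Return one user's events in event_time order; no other user is touched."""
        return [(t, p) for t, p in self._collections.get(user_id, []) if t >= since]

table = EventTable()
table.put("u1", "2024-05-01T09:05:00Z", "checkout")
table.put("u2", "2024-05-01T09:01:00Z", "login")
table.put("u1", "2024-05-01T09:00:00Z", "login")

u1_events = table.query("u1")
u1_recent = table.query("u1", since="2024-05-01T09:03:00Z")
print(u1_events)
```

The lookup reads only the requested user's item collection, which is the behavior a query on the partition key gives you and a scan does not.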

199

Multi-Selectmedium

A company is designing a high-performance web application that serves static and dynamic content to a global user base. The application runs on Amazon EC2 instances behind an Application Load Balancer (ALB). The static assets are stored in an S3 bucket. Which three architecture decisions will improve performance and reduce latency for users? (Choose three.)

Select 3 answers

A.Place the EC2 instances in a single Availability Zone to reduce network latency.

B.Use Amazon CloudFront to cache both static and dynamic content at edge locations.

C.Integrate the ALB with AWS Global Accelerator to route traffic over the AWS global network.

D.Use a larger EC2 instance type with higher network bandwidth, such as the c5n or m5n family.

E.Enable S3 Transfer Acceleration on the bucket for faster downloads.

F.Use an Amazon RDS Multi-AZ database for read replicas to offload read traffic.

Why this answer

Amazon CloudFront caches both static and dynamic content at edge locations, reducing latency by serving content from locations closer to users. AWS Global Accelerator improves performance by routing traffic over the AWS global network instead of the public internet, reducing jitter and latency. Larger EC2 instance types like c5n or m5n provide higher network bandwidth, which reduces network bottlenecks for high-traffic applications.

Exam trap

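The trap here is that candidates may reach for S3 Transfer Acceleration to speed up asset delivery, when it only speeds up long-distance transfers between a client and the bucket and does not cache content at the edge the way CloudFront does, or think a Multi-AZ RDS DB instance deployment provides read scaling, when its standby exists for failover only.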

200

MCQeasy

Based on the exhibit, which AWS feature should the team use to minimize network latency between EC2 instances that exchange messages very frequently?

A.Use a spread placement group to maximize instance separation across hardware.

B.Use a cluster placement group to place instances close together.

C.Use a partition placement group to distribute instances across many partitions.

D.Use multiple Auto Scaling groups to spread traffic across more subnets.

AnswerB

A cluster placement group is designed for workloads that need very low network latency and high packet-per-second performance between instances. The exhibit describes frequent small-message traffic and a need for the lowest possible latency, which makes a cluster placement group the right choice. It keeps instances physically close in the AWS network for faster communication.

Why this answer

A cluster placement group is the correct choice because it places EC2 instances in a low-latency, high-bandwidth network within a single Availability Zone. This minimizes network latency between instances that exchange messages very frequently, as the instances are physically close together and, on supported instance types, get a higher per-flow throughput limit of up to 10 Gbps for single-flow traffic.

Exam trap

The trap here is that candidates often confuse placement group types, assuming a spread placement group is for performance when it is actually designed for high availability and fault isolation, not low latency.

How to eliminate wrong answers

Option A is wrong because a spread placement group maximizes instance separation across distinct hardware to reduce the risk of simultaneous failures, which increases latency and is counterproductive for high-frequency messaging. Option C is wrong because a partition placement group distributes instances across logical partitions to isolate failures in large distributed systems, but it does not minimize latency between instances. Option D is wrong because using multiple Auto Scaling groups to spread traffic across more subnets can increase network hops and latency, not reduce it.

201

MCQhard

A media archive needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable?

A.AWS Config

B.Amazon OpenSearch Service

C.Amazon EFS

D.Amazon SQS

AnswerB

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is the correct choice because it provides a managed, scalable solution for full-text search and real-time analytics on large volumes of data. It supports low-latency queries across product descriptions and filtered attributes through its inverted index and query DSL, making it ideal for media archive search use cases.

Exam trap

The trap here is that candidates may confuse AWS Config's resource tracking or SQS's message handling with search capabilities, but neither provides the indexing and query engine required for full-text search.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for auditing and evaluating resource configurations against desired policies, not for full-text search or indexing of data. Option C is wrong because Amazon EFS is a scalable file storage service for Linux-based workloads, lacking any built-in search or indexing capabilities for text content. Option D is wrong because Amazon SQS is a fully managed message queuing service for decoupling application components, not designed for storing or searching data.

202

MCQhard

Based on the exhibit, a serverless checkout API is implemented in AWS Lambda and deployed in one Region. The function has a cold-start time of 700-900 ms on the first request after idle periods. Marketing launches a predictable traffic spike every weekday at 09:00 UTC, and the p95 latency target is under 150 ms during the first five minutes of the spike. What should the solutions architect do to meet the latency target while controlling cost?

A.Increase the Lambda memory size and leave concurrency at the default value.

B.Configure provisioned concurrency and scale it up before the predictable spike begins.

C.Put the Lambda function behind an Application Load Balancer so the load balancer absorbs the initialization delay.

D.Set reserved concurrency to the expected peak so Lambda will pre-create execution environments.

AnswerB

Provisioned concurrency keeps pre-initialized environments ready, which removes most cold-start latency. Because the spike is predictable, you can scale concurrency before 09:00 UTC and reduce it afterward to control cost.

Why this answer

Provisioned concurrency pre-warms a specified number of execution environments so that requests served by those environments skip function initialization and avoid cold-start latency. By scheduling the provisioned concurrency to scale up before the 09:00 UTC spike, the function can serve the first requests within the 150 ms p95 latency target, while the scheduled scaling down after the spike controls cost by releasing unused capacity. Requests beyond the provisioned number fall back to on-demand environments and can still cold start, so the provisioned value should cover the expected peak.

Exam trap

AWS often tests the distinction between provisioned concurrency (which pre-warms environments to eliminate cold starts) and reserved concurrency (which only caps the maximum concurrent executions without affecting cold-start behavior).

How to eliminate wrong answers

Option A is wrong because increasing memory size can reduce cold-start time but cannot eliminate it entirely, and the cold-start of 700-900 ms far exceeds the 150 ms target; default concurrency does not pre-warm environments. Option C is wrong because an Application Load Balancer does not absorb initialization delay—it only distributes requests to the Lambda function, which still experiences cold starts. Option D is wrong because reserved concurrency limits the maximum number of concurrent executions but does not pre-create execution environments; it prevents scaling beyond a limit but does not reduce cold-start latency.
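
One way to schedule the scale-up and scale-down is through Application Auto Scaling scheduled actions. The sketch below only builds the request parameters and does not call AWS. The field names follow the shape of the `put_scheduled_action` operation as I understand it, while the function name `checkout-api`, the alias `prod`, the action names, the times, and the capacities are all hypothetical. It also assumes the alias has already been registered as a scalable target.

```python
def scheduled_action(function_name, alias, action_name, cron, capacity):
    """Build keyword arguments shaped like Application Auto Scaling's put_scheduled_action."""
    return {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": action_name,
        # Provisioned concurrency is configured on a version or alias.
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        # Cron fields: minutes hours day-of-month month day-of-week year (UTC).
        "Schedule": f"cron({cron})",
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }

# Hypothetical values: warm up 15 minutes before the 09:00 UTC spike, shrink afterwards.
warm_up = scheduled_action("checkout-api", "prod", "weekday-warm-up", "45 8 ? * MON-FRI *", 200)
scale_down = scheduled_action("checkout-api", "prod", "weekday-scale-down", "30 9 ? * MON-FRI *", 10)
print(warm_up["Schedule"], scale_down["Schedule"])
```

Each dictionary could then be passed as keyword arguments to an `application-autoscaling` client; the right capacities depend on the measured peak concurrency.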

203

MCQmedium

A company needs to implement session management for a web application. Sessions must persist across multiple EC2 instances, survive EC2 failures, and be accessible with sub-millisecond latency. Sessions must also be sortable by last-access time to expire the oldest sessions first. Which caching solution should a solutions architect recommend?

A.Amazon ElastiCache for Memcached with session data stored as key-value pairs

B.Amazon DynamoDB with TTL enabled for session expiration

C.Amazon ElastiCache for Redis with sessions stored as sorted sets

D.ElastiCache for Redis with sticky sessions enabled on the Application Load Balancer

AnswerC

Redis provides sub-millisecond latency, sorted sets for ordering by last-access score, Multi-AZ replication, and external storage for cross-instance availability. All requirements are met.

Why this answer

Amazon ElastiCache for Redis satisfies all requirements: multi-instance session sharing (sessions stored externally), sub-millisecond latency, survival of EC2 failures (stored outside instances), and sorted sets (ZSET data structure) for ordering sessions by last-access score.

Memcached supports only simple key-value pairs — it cannot perform sorted set operations to order sessions by last-access time. Memcached also lacks replication, meaning a node failure loses all cached sessions.

Exam trap

Memcached and Redis are both ElastiCache engines, but they serve different needs. Any requirement involving sorted data, complex data structures, persistence, or replication eliminates Memcached. Redis sorted sets (ZSET) store members with numeric scores and support range queries — perfect for session expiry queues ordered by last-access timestamp.

Why the other options are wrong

Option A: Memcached supports only simple string key-value storage. It cannot perform sorted set operations to expire sessions by last-access time. Memcached also lacks replication, so a node failure loses all cached sessions.

Option B: DynamoDB achieves single-digit millisecond latency, not sub-millisecond. DynamoDB also does not natively support sorted set operations without additional query complexity.

Option D: ALB sticky sessions pin a client to a specific EC2 instance. If that instance fails, the session is lost. Sticky sessions do not make session data redundant across instances, which is the opposite of what is required.
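
To make the sorted-set idea concrete, here is a minimal in-memory sketch of ordering sessions by last-access score. `SessionIndex` is a hypothetical stand-in written for illustration, not the Redis client API; the comments name the Redis commands (ZADD, ZRANGE, ZPOPMIN) whose behavior each method imitates.

```python
class SessionIndex:
    """In-memory stand-in for a Redis sorted set scored by last-access time."""

    def __init__(self):
        self._scores = {}  # session_id -> last-access timestamp

    def touch(self, session_id, last_access):
        # Like ZADD: insert the member, or overwrite its score if it exists.
        self._scores[session_id] = last_access

    def oldest(self, count):
        # Like ZRANGE 0 count-1: ascending by score, ties broken by member name.
        ranked = sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]))
        return [session_id for session_id, _ in ranked[:count]]

    def expire_oldest(self, count):
        # Like ZPOPMIN: remove and return the lowest-scored members.
        victims = self.oldest(count)
        for session_id in victims:
            del self._scores[session_id]
        return victims

index = SessionIndex()
index.touch("s1", 100)
index.touch("s2", 50)
index.touch("s3", 75)
index.touch("s2", 120)  # s2 was used again, so it moves to the back of the queue

expired = index.expire_oldest(1)
remaining = index.oldest(10)
print(expired, remaining)
```

Because touching a session updates its score, the least recently used session is always first in line for expiry.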

204

Multi-Selecthard

A media company serves versioned JavaScript and CSS files from an Amazon S3 origin through CloudFront. After each release, origin requests spike even though the files are public. Browser requests include a tracking cookie, an Authorization header, and a cache-busting query string that the site no longer needs. Which three changes will most improve the CloudFront cache hit ratio without exposing private content? Select three.

Select 3 answers

A.Rename each static asset with a content hash or release version in the filename before publishing.

B.Create a CloudFront cache policy that excludes unnecessary query strings and cookies from the cache key.

C.Use an origin request policy that forwards only the headers and cookies the origin truly needs.

D.Enable CloudFront compression and configure the origin to return Cache-Control: no-store for all files.

E.Forward all viewer headers to the origin so CloudFront can personalize every request.

AnswersA, B, C

Versioned filenames let CloudFront cache each asset for a long time without worrying about stale content. When the file name changes on release, clients naturally fetch the new object, and old cached objects remain valid for older pages until they expire.

Why this answer

Option A is correct because renaming static assets with a content hash or version in the filename ensures that each new release creates a unique object key in S3. This allows CloudFront to treat the new file as a distinct object, avoiding cache invalidation issues and enabling long-term caching of the old version. Without this, even with cache-busting query strings, CloudFront might still serve stale content or require frequent invalidations, reducing the cache hit ratio.

Exam trap

The trap here is that candidates often confuse origin request policies (which control what is sent to the origin) with cache policies (which control the cache key), leading them to think forwarding headers or cookies to the origin will improve caching, when in fact it can harm the cache hit ratio if those values are included in the cache key.

205

Multi-Selecthard

An application uses Amazon Aurora MySQL. CloudWatch shows the writer instance near 85% CPU while the only reader instance averages 15% CPU. Trace logs show that all SELECT statements still target the writer endpoint. The workload is read-heavy, and the application already tolerates eventual consistency for reads. Which two changes will best increase total read throughput without a schema redesign? Select two.

Select 2 answers

A.Point read-only queries to the Aurora reader endpoint instead of the writer endpoint.

B.Add one or more additional Aurora Replicas and distribute read traffic across them.

C.Convert the cluster to a single-AZ RDS MySQL instance to reduce replication overhead.

D.Replace the writer endpoint with the instance endpoint of the primary node to speed up SELECT queries.

E.Add Amazon ElastiCache and move all database writes into the cache layer.

AnswersA, B

The reader endpoint is intended for read-only traffic and automatically distributes connections across Aurora Replicas. Redirecting SELECT statements away from the writer immediately reduces CPU pressure on the writer and uses the unused read capacity already available in the cluster. This is the fastest, lowest-risk way to improve read throughput without changing the schema or the application data model.

Why this answer

Option A is correct because the Aurora reader endpoint is designed to distribute read-only connections across all available Aurora Replicas, offloading SELECT queries from the writer instance. Currently, all SELECT statements target the writer endpoint, causing the writer's CPU to be at 85% while the reader instance is underutilized at 15%. By redirecting read traffic to the reader endpoint, the writer's CPU load decreases, and the existing reader instance can handle more read throughput without any schema changes.

Exam trap

The trap here is that candidates may think adding more reader instances alone solves the problem, but they must first redirect read traffic away from the writer endpoint—otherwise, the new replicas remain idle and the writer remains overloaded.
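
A minimal sketch of the redirect, assuming the application can choose a connection target per statement. The endpoint hostnames and the `endpoint_for` helper are hypothetical, and the statement classification is deliberately crude; a real application would more likely configure separate read and write connection pools.

```python
# Hypothetical cluster endpoints; real values come from the Aurora cluster.
WRITER = "mycluster.cluster-example.us-east-1.rds.amazonaws.com"
READER = "mycluster.cluster-ro-example.us-east-1.rds.amazonaws.com"

READ_VERBS = {"select", "show", "explain"}

def endpoint_for(sql):
    """Send plain reads to the reader endpoint and everything else to the writer."""
    words = sql.lower().split()
    if not words:
        return WRITER
    # Locking reads must run on the writer even though they start with SELECT.
    if words[0] in READ_VERBS and "for update" not in " ".join(words):
        return READER
    return WRITER

print(endpoint_for("SELECT * FROM orders WHERE id = 7"))
print(endpoint_for("UPDATE orders SET status = 'paid' WHERE id = 7"))
```

This only works for the scenario as stated, where the application already tolerates eventually consistent reads from replicas.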

206

MCQmedium

A mobile game backend uses Amazon Aurora. The workload has many short-lived database connections from Lambda functions, causing connection storms. What should be added? The design must avoid adding custom operational scripts.

A.An internet gateway

B.S3 Select

C.RDS Proxy

D.A larger Route 53 hosted zone

AnswerC

RDS Proxy pools and manages database connections, improving scalability for serverless and bursty workloads.

Why this answer

RDS Proxy is the correct choice because it pools and shares database connections, reducing the overhead of establishing new connections for each Lambda invocation. This prevents connection storms by maintaining a persistent pool of connections to Aurora, which is ideal for short-lived, high-frequency connections from serverless functions like Lambda.

Exam trap

The trap here is that candidates might think adding more network resources (like an internet gateway or larger DNS zone) solves connection storms, when the real issue is connection management at the database layer, not network capacity.

How to eliminate wrong answers

Option A is wrong because an internet gateway provides internet access to a VPC and does not manage database connections or connection pooling. Option B is wrong because S3 Select is used to retrieve subsets of data from objects in S3 using SQL expressions, not for managing database connections. Option D is wrong because a larger Route 53 hosted zone increases the number of DNS records you can host but does not affect database connection management or pooling.

207

MCQhard

A DynamoDB table for a travel booking site has a partition key based only on the current date. Write throttling occurs during business hours. What is the best design change?

A.Create a global secondary index with the same date key

B.Move the table to S3 Glacier Instant Retrieval

C.Reduce the table's write capacity

D.Use a higher-cardinality partition key that distributes writes across partitions

AnswerD

A low-cardinality hot partition causes throttling; a better key spreads writes more evenly.

Why this answer

Option D is correct because using a low-cardinality partition key like the current date concentrates all writes into a single partition, causing throttling when write demand exceeds that partition's 1,000 WCU limit. A higher-cardinality key (e.g., combining date with user ID or session ID) distributes writes evenly across multiple partitions, allowing the table to use its full provisioned write capacity without throttling.

Exam trap

The trap here is that candidates treat throttling as a total-capacity problem and look to capacity settings (Option C) or an extra index (Option A), when the real issue is a hot partition caused by a low-cardinality partition key.

How to eliminate wrong answers

Option A is wrong because a global secondary index (GSI) keyed on the same date value has the same low cardinality as the base table key; it does not redistribute base-table writes, and the index itself would become hot and could throttle writes to the table. Option B is wrong because S3 Glacier Instant Retrieval is an object storage class for archival data with retrieval latency in milliseconds, not a replacement for DynamoDB's low-latency read/write operations required by a travel booking site. Option C is wrong because reducing write capacity would lower the throttling threshold, making the problem worse; the issue is uneven distribution of writes, not insufficient total capacity.
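
The effect of key cardinality on write distribution can be sketched with a toy hash-partitioning model. DynamoDB's real hash function and partition count are internal, so the MD5-based `partition_for` helper, the eight partitions, and the `date#user` key format below are assumptions made only for illustration.

```python
import hashlib
from collections import Counter

def partition_for(key, partitions=8):
    """Map a partition key value to a partition with a stand-in hash."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % partitions

def load_per_partition(keys, partitions=8):
    """Count how many writes land on each partition."""
    return Counter(partition_for(k, partitions) for k in keys)

# 10,000 writes on the same day, keyed two different ways.
date_only = ["2024-05-01"] * 10_000
date_and_user = [f"2024-05-01#user-{i}" for i in range(10_000)]

hot = load_per_partition(date_only)
spread = load_per_partition(date_and_user)
print(len(hot), max(hot.values()))
print(len(spread), max(spread.values()))
```

With the date-only key, every write hashes to the same partition; with the composite key, the writes are spread across all partitions in this model.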

208

MCQmedium

A marketing team uses CloudFront with an S3 origin to serve a single-page web app. After a release, CloudFront cache hit ratio dropped sharply. The app requests the same static JS and CSS assets, but each request includes a unique tracking query parameter (for example, ?utm_source=campaign123, campaign456, etc.). You want CloudFront to cache those assets efficiently even when the tracking query parameter changes. What should you do?

A.Create a cache policy that forwards the query string to the origin and varies the cache key by all query parameters.

B.Update the CloudFront cache policy so the cache key ignores the tracking query parameter, while still using the path and other essential headers.

C.Enable S3 origin access control and keep the existing default cache policy, because origin access changes caching behavior automatically.

D.Set the CloudFront Time-to-Live (TTL) to 0 seconds to ensure the origin always serves the latest asset content.

AnswerB

CloudFront caching depends on the cache key (for example, path, selected headers, and selected query strings). If you configure a cache policy to exclude the tracking query parameter (or ignore specific query string parameters), CloudFront treats requests for the same asset as the same cached object. This prevents cache fragmentation caused by unique tracking values. Origin load decreases and cache hit ratio increases, while correctness is maintained because the excluded parameter does not affect the content of the static JS/CSS objects.

Why this answer

Option B is correct because CloudFront's cache key determines whether a request is served from the cache or forwarded to the origin. By configuring a cache policy that ignores the tracking query parameter (e.g., utm_source), CloudFront treats all requests for the same asset path as identical, regardless of the unique tracking parameter. This allows the same JS and CSS files to be cached once and served for all campaign variations, restoring the cache hit ratio.

Exam trap

The trap here is that candidates may think forwarding all query parameters (Option A) is necessary for dynamic content, but for static assets with irrelevant tracking parameters, ignoring them is the correct approach to maximize cache hits.

How to eliminate wrong answers

Option A is wrong because forwarding the query string and varying the cache key by all query parameters would create a separate cache entry for each unique utm_source value, which is exactly the problem causing the cache hit ratio to drop. Option C is wrong because enabling S3 origin access control (OAC) only secures the origin and does not affect CloudFront's caching behavior or cache key configuration. Option D is wrong because setting TTL to 0 seconds forces CloudFront to revalidate every request with the origin, eliminating caching entirely and worsening performance, not improving cache efficiency.
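
A cache policy of this kind might look roughly like the configuration below. The sketch only builds a dictionary; the field names follow the shape of CloudFront's `CachePolicyConfig` as I understand it, and the policy name, TTL values, and the `static_asset_cache_policy` helper are hypothetical.

```python
def static_asset_cache_policy(name, ignored_params):
    """Build a dict shaped like CloudFront's CachePolicyConfig for static assets."""
    return {
        "Name": name,
        "MinTTL": 0,
        "DefaultTTL": 86400,    # one day, an illustrative value
        "MaxTTL": 31536000,     # one year, an illustrative value
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {
                # Every query string is in the cache key except the listed ones.
                "QueryStringBehavior": "allExcept",
                "QueryStrings": {
                    "Quantity": len(ignored_params),
                    "Items": list(ignored_params),
                },
            },
        },
    }

policy = static_asset_cache_policy("static-assets", ["utm_source"])
query_config = policy["ParametersInCacheKeyAndForwardedToOrigin"]["QueryStringsConfig"]
print(query_config)
```

If no query parameter affects the asset content at all, setting the query string behavior to `none` is the simpler choice.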

209

MCQeasy

A travel booking site uses EC2 instances behind an ALB. CPU is consistently high during peak traffic, and request latency rises. What should be configured?

A.A VPC endpoint for CloudWatch only

B.Auto Scaling policy based on an appropriate CloudWatch metric

C.S3 Object Lock

D.Disable health checks

AnswerB

Auto Scaling adds capacity when load increases and removes it when load falls.

Why this answer

An Auto Scaling policy based on a CloudWatch metric like CPUUtilization or request latency directly addresses the root cause: rising CPU and latency under peak traffic. By automatically adding EC2 instances when the metric breaches a threshold, the ALB can distribute load across more resources, reducing CPU per instance and improving response times. This is the standard AWS solution for dynamic scaling to maintain performance.

Exam trap

The trap here is that candidates may confuse monitoring (VPC endpoints) or data protection (S3 Object Lock) with scaling solutions, or think disabling health checks reduces overhead, when the correct approach is to scale horizontally based on load metrics.

How to eliminate wrong answers

Option A is wrong because a VPC endpoint for CloudWatch only enables private connectivity to CloudWatch without an internet gateway; it does not add compute capacity or reduce CPU load or latency. Option C is wrong because S3 Object Lock prevents object deletion or overwrite for compliance, which is irrelevant to EC2 CPU and latency issues. Option D is wrong because disabling health checks would cause the ALB to route traffic to unhealthy instances, increasing failures and latency, not solving the performance problem.

210

MCQeasy

A company runs a stateless web API on Amazon EC2 behind an Application Load Balancer. The team notices that during business hours, the ALB starts queueing requests and the average request latency rises. They want to scale out quickly and reliably based on demand, not CPU alone. Which Auto Scaling approach best matches this requirement?

A.Use a fixed-size Auto Scaling group and increase capacity manually once per hour.

B.Use target tracking scaling based on ALB request count per target.

C.Scale based only on EC2 instance memory utilization, regardless of load.

D.Use step scaling with a single threshold on average network-in bytes.

AnswerB

Target tracking automatically adjusts capacity to hold an ALB load metric, such as request count per target, at a set value, so it reacts to demand rather than to CPU alone.

Why this answer

Target tracking scaling based on ALB request count per target directly aligns with the requirement to scale out based on demand (request queuing and latency) rather than CPU alone. This policy automatically adjusts the Auto Scaling group size to maintain a target value for the average number of requests per instance, which is a more reliable indicator of load for a stateless web API than CPU utilization.

Exam trap

The trap here is that candidates often assume CPU utilization is the best metric for all scaling scenarios, but for a stateless web API behind an ALB, request count per target is a more direct and reliable indicator of demand and latency issues.

How to eliminate wrong answers

Option A is wrong because manual scaling once per hour cannot react quickly to sudden spikes in demand during business hours, leading to request queuing and increased latency. Option C is wrong because scaling based solely on memory utilization ignores the actual load (request count) and may not trigger scaling when the API is CPU-bound or I/O-bound, failing to address the queuing issue. Option D is wrong because step scaling with a single threshold on average network-in bytes is not a direct measure of application demand; network bytes can be influenced by packet size and protocol overhead, and a single threshold lacks the granularity to handle variable traffic patterns, potentially causing either under- or over-scaling.
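
The proportional idea behind target tracking on request count per target can be sketched as follows. This is a simplification under stated assumptions: the real policy works through CloudWatch alarms with warm-up and cooldown behavior, and the `desired_capacity` helper, the target of 500 requests per instance, and the group size limits are hypothetical.

```python
import math

def desired_capacity(total_requests, target_per_instance, min_size=1, max_size=20):
    """Instances needed to keep requests per instance at or below the target."""
    needed = math.ceil(total_requests / target_per_instance)
    # The Auto Scaling group never goes below its minimum or above its maximum.
    return max(min_size, min(max_size, needed))

quiet = desired_capacity(100, 500)      # far below target, stays at the minimum
busy = desired_capacity(6_000, 500)     # business-hours load
capped = desired_capacity(50_000, 500)  # demand beyond the group's maximum
print(quiet, busy, capped)
```

Capacity follows the request volume directly, which is why this metric tracks demand for a stateless API more closely than CPU alone.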

211

Multi-Selectmedium

A CPU-bound batch rendering service runs on EC2. The application is Linux-based, compatible with ARM64, and the team wants the best throughput per dollar without changing the workload's architecture. Which two instance-family choices should the team consider first? Select two.

Select 2 answers

A.A compute-optimized family, because it is designed for workloads that spend most of their time on CPU.

B.A Graviton-based family, because compatible ARM instances often provide better price performance for many compute workloads.

C.A memory-optimized family, because extra RAM always increases compute throughput.

D.A storage-optimized family, because local storage bandwidth is the main factor for rendering performance.

E.A burstable family, because CPU credits make sustained rendering faster during long runs.

AnswersA, B

Compute-optimized families are the first place to look for sustained CPU-heavy jobs. They allocate more of the instance's resources to processor performance rather than memory or storage.

Why this answer

Option A is correct because compute-optimized families (e.g., C5, C6g) are designed for workloads that spend most of their time on CPU, such as batch rendering. Option B is correct because Graviton-based instances (e.g., C6g, M6g) use ARM64 architecture, which is compatible with the workload and often delivers better price-performance for compute-intensive tasks, maximizing throughput per dollar without architectural changes.

Exam trap

The trap here is that candidates may confuse 'CPU-bound' with 'memory-bound' or 'storage-bound,' leading them to select memory-optimized or storage-optimized families, or they may mistakenly think burstable instances can sustain high CPU performance over long periods.

212

MCQhard

A media archive needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable? The architecture review board prefers a managed AWS-native control.

A.AWS Config

B.Amazon OpenSearch Service

C.Amazon EFS

D.Amazon SQS

AnswerB

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is the correct choice because it is a managed, AWS-native service that provides low-latency full-text search and supports filtering on structured attributes. It is purpose-built for indexing and searching large volumes of text data, such as product descriptions, with sub-second response times. The architecture review board's preference for a managed AWS-native control is satisfied, as OpenSearch Service handles cluster management, scaling, and backups automatically.

Exam trap

The trap here is that candidates may confuse Amazon OpenSearch Service with a database or storage service, but it is specifically a search and analytics engine optimized for low-latency full-text queries, not for transactional storage or messaging.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for auditing and evaluating resource configurations against compliance rules, not for full-text search. Option C is wrong because Amazon EFS is a managed NFS file system for shared storage, not a search engine; it cannot perform full-text queries on file contents without additional software. Option D is wrong because Amazon SQS is a fully managed message queue for decoupling application components, not a search or indexing service.

Practice this question →

213

MCQhard

A.Provisioned concurrency during campaign windows

B.A larger deployment package

C.CloudTrail data events

D.Reserved concurrency only

AnswerA

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, eliminating cold starts for those instances. During campaign windows, this ensures consistent latency by keeping the function warm and ready to handle spikes without the delay of initializing new environments. It is a managed AWS-native control that directly addresses the unpredictable traffic pattern described.

Exam trap

The trap here is confusing reserved concurrency (which limits scaling but does not prevent cold starts) with provisioned concurrency (which pre-warms instances to eliminate cold starts), leading candidates to choose reserved concurrency as a simpler but incorrect solution.

How to eliminate wrong answers

Option B is wrong because a larger deployment package increases cold-start time through longer download and initialization, making latency worse, not better. Option C is wrong because CloudTrail data events record API activity for auditing and governance; they do not affect function initialization or latency. Option D is wrong because reserved concurrency only sets aside part of the account's concurrency for the function and caps the function at that number; it does not pre-initialize execution environments, so cold starts still occur whenever a new environment is created.
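One way to confine provisioned concurrency to campaign windows without custom scripts is scheduled scaling through Application Auto Scaling. The sketch below only builds the request parameters that would be passed to boto3's `application-autoscaling` client (`register_scalable_target` and `put_scheduled_action`); it makes no AWS calls. The function name, alias, capacities, and cron expressions are hypothetical placeholders, and an idle capacity of 0 is an assumption to check against current service behavior.

```python
from typing import Any, Dict


def provisioned_concurrency_schedule(
    function_name: str,
    alias: str,
    peak_capacity: int,
    start_cron: str,
    end_cron: str,
    idle_capacity: int = 0,  # assumption: release all capacity outside the window
) -> Dict[str, Any]:
    """Build Application Auto Scaling request parameters for a Lambda alias."""
    common = {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
    }
    target = {**common, "MinCapacity": idle_capacity, "MaxCapacity": peak_capacity}
    actions = [
        {
            **common,
            "ScheduledActionName": "campaign-start",
            "Schedule": start_cron,
            "ScalableTargetAction": {
                "MinCapacity": peak_capacity,
                "MaxCapacity": peak_capacity,
            },
        },
        {
            **common,
            "ScheduledActionName": "campaign-end",
            "Schedule": end_cron,
            "ScalableTargetAction": {
                "MinCapacity": idle_capacity,
                "MaxCapacity": idle_capacity,
            },
        },
    ]
    return {"register_scalable_target": target, "put_scheduled_action": actions}


# Hypothetical example: warm 50 environments on weekday mornings (UTC).
plan = provisioned_concurrency_schedule(
    "checkout-api", "prod", 50,
    start_cron="cron(45 8 ? * MON-FRI *)",
    end_cron="cron(0 11 ? * MON-FRI *)",
)
```

Each dictionary in `plan` could be unpacked into the corresponding client call; reserved concurrency never appears here because it does not pre-initialize anything.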

Practice this question →

214

MCQhard

Based on the exhibit, a single EC2 instance hosts a latency-sensitive cache that performs sustained random reads and writes to persistent block storage. The current EBS volume is a general-purpose SSD, but BurstBalance is repeatedly depleted and p95 I/O latency has risen above 20 ms. The workload needs more than 16,000 sustained IOPS. Which change is the best fix?

A.Move the data to Amazon S3 so the instance can read and write objects directly.

B.Replace the volume with an io2 EBS volume and provision the required IOPS.

C.Keep gp2 and increase the instance size to a compute-optimized family.

D.Enable Amazon EFS with bursting throughput mode for the cache data.

AnswerB

io2 is designed for mission-critical workloads that need sustained, predictable, low-latency random I/O. Unlike gp2, it does not depend on burst credits for performance. Provisioning the required IOPS directly addresses the exhausted BurstBalance and the sustained throughput requirement above 16,000 IOPS.

Why this answer

The workload requires more than 16,000 sustained IOPS with low latency, and the gp2 volume's burst credits are exhausted, causing high latency. An io2 Block Express or io2 volume can be provisioned with the exact IOPS needed (up to 256,000 IOPS) and provides consistent single-digit millisecond latency, making it the best fix for this latency-sensitive, sustained I/O workload.

Exam trap

The trap here is that candidates often assume increasing instance size (Option C) will improve EBS performance, but EBS IOPS and throughput are tied to the volume type and size, not the instance type (except for EBS-optimized bandwidth), so the gp2 burst credit exhaustion remains the root cause.

How to eliminate wrong answers

Option A is wrong because Amazon S3 is object storage accessed over HTTPS, not block storage, and its per-request latency makes it unsuitable for a latency-sensitive cache that performs sustained random reads and writes. Option C is wrong because moving to a compute-optimized instance does not change the gp2 volume's performance model: once BurstBalance is depleted the volume is throttled to its baseline of 3 IOPS per GiB, and gp2 tops out at 16,000 IOPS per volume regardless of size, so it cannot meet the requirement of more than 16,000 sustained IOPS. Option D is wrong because Amazon EFS is a shared file system with NFS protocol overhead, and its bursting throughput mode relies on burst credits that can be exhausted, leading to throttled throughput and higher latency; it is not suited to a sustained, high-IOPS, block-level cache workload.
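The gp2 ceiling can be checked with a little arithmetic. The sketch below assumes the published gp2 model of 3 IOPS per GiB with a floor of 100 and a cap of 16,000 IOPS per volume; the function names are illustrative only.

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 IOPS per GiB, floor 100, cap 16,000."""
    return max(100, min(3 * size_gib, 16_000))


def gp2_can_sustain(size_gib: int, required_iops: int) -> bool:
    """True if the gp2 baseline alone (no burst credits) meets the requirement."""
    return gp2_baseline_iops(size_gib) >= required_iops


small = gp2_baseline_iops(20)        # floor applies
medium = gp2_baseline_iops(500)      # 3 IOPS per GiB
largest = gp2_baseline_iops(16_384)  # 16 TiB volume hits the cap
```

Even the largest gp2 volume cannot sustain more than 16,000 IOPS, which is why resizing gp2 is not an alternative to a provisioned-IOPS volume in this scenario.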

Practice this question →

215

Multi-Selecthard

Multiple EC2 instances in different Availability Zones need concurrent read/write access to the same shared files. The files are actively modified by several application servers, and low-latency metadata operations matter more than extremely high aggregate throughput. Which two changes should the team make? Select two.

Select 2 answers

A.Use Amazon EFS instead of EBS or S3 for the shared file system.

B.Create EFS mount targets in every Availability Zone that hosts application instances.

C.Use a single EBS Multi-Attach volume mounted read/write by all instances across AZs.

D.Store the files in S3 and mount them directly through the console as a shared network filesystem.

E.Place the files on instance store volumes so each server has faster local access.

AnswersA, B

Amazon EFS is the managed AWS file service built for shared POSIX-style file access from multiple instances. It supports concurrent read/write access from many EC2 hosts and is a better fit than EBS, which is attached to a single instance, or S3, which provides object storage rather than a native shared filesystem. For an application that expects standard filesystem semantics, EFS is the correct storage layer.

Why this answer

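Amazon EFS provides a fully managed, POSIX-compliant, shared file system that can be mounted concurrently by multiple EC2 instances across different Availability Zones (AZs). It supports concurrent read/write access with strong consistency, and its metadata operations are optimized for low latency, making it a good fit for workloads where many application servers actively modify the same files. Creating a mount target in every AZ that hosts application instances lets each instance reach the file system through an endpoint in its own AZ, which avoids cross-AZ hops on every file operation and keeps access available if another AZ is impaired. EBS cannot be shared across AZs, and S3 lacks POSIX semantics and low-latency metadata operations.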

Exam trap

The trap here is that candidates often confuse EBS Multi-Attach with a cross-AZ shared storage solution, but Multi-Attach is strictly limited to a single AZ, to Provisioned IOPS volumes, and to a small number of instances, while EFS is the managed NFS file system designed to span AZs with concurrent read/write access.

Practice this question →

216

MCQeasy

A company serves mostly static images and JavaScript files from an origin in one AWS Region. They want to reduce origin load and improve global performance. Which change most directly increases cache-hit ratio for static assets while avoiding stale content?

A.Set Cache-Control headers on the origin to always be no-cache so clients revalidate frequently.

B.Use versioned file names (e.g., app.abc123.js) and configure a long TTL with appropriate revalidation behavior.

C.Disable query string forwarding so all URLs without query strings share one cached object even when content differs.

D.Forward all headers, including cookies, to maximize personalization in edge cached responses.

AnswerB

Versioned assets allow long caching with confidence, while new filenames trigger updates when code changes.

Why this answer

Option B is correct because using versioned file names (e.g., app.abc123.js) allows you to set a long Cache-Control max-age TTL (e.g., one year) without risking stale content. When the file changes, the new version gets a new URL, so clients and edge caches immediately fetch the fresh object, maximizing cache hits for unchanged assets while avoiding stale content.

Exam trap

The trap here is that candidates often confuse 'no-cache' with 'no-store' or think that disabling query strings universally improves caching, but they fail to recognize that versioned filenames with long TTLs are the standard pattern for maximizing cache hits while ensuring content freshness.

How to eliminate wrong answers

Option A is wrong because setting Cache-Control: no-cache forces clients to revalidate with the origin on every request, which increases origin load and reduces cache-hit ratio, directly contradicting the goal. Option C is wrong because disabling query string forwarding can cause different content to be served from the same cached object if the URL path is identical but query parameters differentiate the content, leading to stale or incorrect responses. Option D is wrong because forwarding all headers, including cookies, reduces cache-hit ratio by creating many unique cache keys for the same asset, defeating the purpose of caching static content.
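The versioned-filename pattern can be sketched as a build step that derives the version from the file's content. This is a minimal illustration, not any particular bundler's behavior; the helper name, the 8-character digest length, and the header value are assumptions.

```python
import hashlib
from pathlib import PurePosixPath

# Safe for fingerprinted assets: the URL changes whenever the content changes.
LONG_LIVED_CACHE_CONTROL = "public, max-age=31536000, immutable"


def versioned_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the extension, e.g. app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


old_url = versioned_name("js/app.js", b"console.log('v1');")
new_url = versioned_name("js/app.js", b"console.log('v2');")
```

Because `old_url` and `new_url` differ, edge caches and browsers treat the new build as a new object, while unchanged files keep their names and stay cached for the full TTL.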

Practice this question →

217

MCQeasy

A team wants to run containerized services with AWS-managed orchestration and autoscaling. They do NOT require Kubernetes compatibility. Which AWS service choice is most appropriate to meet these goals?

A.Amazon EKS

B.Amazon ECS

C.An EC2 Auto Scaling group only

D.Amazon SQS as the compute layer

AnswerB

Amazon ECS is an AWS-native container orchestration service. You can run containers without Kubernetes, and ECS integrates with AWS-native autoscaling (for example, ECS Service Auto Scaling with target tracking on CPU or memory utilization, or on request count per target when the service sits behind an ALB).

Why this answer

Amazon ECS is the most appropriate choice because it provides AWS-managed container orchestration and autoscaling without requiring Kubernetes compatibility. ECS integrates natively with AWS services like Application Auto Scaling and CloudWatch to automatically scale container tasks based on metrics such as CPU or memory utilization, meeting the team's requirements directly.

Exam trap

The trap here is that candidates often confuse Amazon ECS with Amazon EKS, assuming that Kubernetes compatibility is required for container orchestration, but ECS provides a simpler, AWS-native alternative without Kubernetes overhead.

How to eliminate wrong answers

Option A is wrong because Amazon EKS is a managed Kubernetes service; it brings Kubernetes APIs, tooling, and operational overhead that the team explicitly does not need. Option C is wrong because an EC2 Auto Scaling group only manages EC2 instances, not container scheduling or orchestration, so it cannot run containerized services without additional container management software. Option D is wrong because Amazon SQS is a message queuing service, not a compute layer; it cannot run containers or provide orchestration or autoscaling for containerized workloads.

Practice this question →

218

MCQeasy

A retail API uses EC2 instances behind an ALB. CPU is consistently high during peak traffic, and request latency rises. What should be configured? The design must avoid adding custom operational scripts.

A.Auto Scaling policy based on an appropriate CloudWatch metric

B.S3 Object Lock

C.A VPC endpoint for CloudWatch only

D.Disable health checks

AnswerA

Auto Scaling adds capacity when load increases and removes it when load falls.

Why this answer

An Auto Scaling policy based on a CloudWatch metric like CPUUtilization or ALB TargetResponseTime can dynamically add or remove EC2 instances to match demand. This directly addresses the high CPU and rising latency during peak traffic without requiring custom scripts, as the scaling actions are fully managed by AWS. The ALB distributes traffic across the scaled instances, reducing per-instance load and improving response times.

Exam trap

The trap here is that candidates may confuse VPC endpoints (which enable private connectivity) with actual scaling mechanisms, or assume that disabling health checks is a quick fix for latency, when in fact it degrades reliability and does not address the underlying capacity issue.

How to eliminate wrong answers

Option B is wrong because S3 Object Lock is a data protection feature for S3 objects (preventing deletion/overwrite) and has no role in scaling compute resources or reducing latency for an API behind an ALB. Option C is wrong because a VPC endpoint for CloudWatch only enables private connectivity to CloudWatch APIs (e.g., for publishing metrics or logs) but does not automatically trigger scaling or resolve CPU/latency issues; scaling still requires an Auto Scaling policy. Option D is wrong because disabling health checks would cause the ALB to route traffic to unhealthy instances, worsening latency and availability, and it does not address the root cause of high CPU.
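A target tracking policy is one managed way to express "keep average CPU near a set point". The sketch below only assembles the parameters that boto3's `autoscaling` client accepts for `put_scaling_policy`; the group name, policy name, and 50% target are hypothetical values chosen for illustration.

```python
from typing import Any, Dict


def cpu_target_tracking_policy(asg_name: str, target_cpu_percent: float = 50.0) -> Dict[str, Any]:
    """Build put_scaling_policy parameters that hold average CPU near a target."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-cpu-target-tracking",  # hypothetical naming scheme
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization"
            },
            "TargetValue": target_cpu_percent,
        },
    }


policy = cpu_target_tracking_policy("retail-api-asg")
```

With a policy like this attached, the Auto Scaling group adds instances when average CPU rises above the target and removes them when it falls, with no scripts to maintain.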

Practice this question →

219

MCQmedium

A telemetry pipeline uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The design must avoid adding custom operational scripts.

A.Multi-AZ standby and route reads to the standby

B.RDS read replica and route reporting queries to it

C.S3 lifecycle policy

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

RDS Read Replicas are designed specifically to offload read-heavy workloads from the primary database. By creating a read replica and routing reporting queries to it, the architect reduces load on the primary MySQL instance without custom scripts. This is a native, managed feature of RDS that supports asynchronous replication.

Exam trap

The trap here is confusing Multi-AZ standby (which is for failover only and cannot serve reads) with a read replica (which is for read scaling and can serve queries).

How to eliminate wrong answers

Option A is wrong because the standby in a Multi-AZ DB instance deployment is a synchronously replicated copy used for high availability and automatic failover; it does not accept connections and cannot serve read queries. Option C is wrong because an S3 lifecycle policy manages object storage transitions and expiration, not database query routing or read offloading. Option D is wrong because NAT gateways are not resized, and NAT capacity only affects outbound internet traffic from private subnets; it has no bearing on database read performance or query distribution.
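Routing reporting queries to the replica is usually a small decision in the application's connection logic. The sketch below is an illustration under stated assumptions: both endpoint hostnames are hypothetical, and the read check is a simple prefix test rather than a real SQL parser.

```python
# Hypothetical endpoints; real values come from the RDS console or API.
WRITER_ENDPOINT = "telemetry-primary.example.internal"
READER_ENDPOINT = "telemetry-replica.example.internal"

READ_PREFIXES = ("select", "show", "explain")


def endpoint_for(sql: str, reporting: bool = False) -> str:
    """Send read-only reporting statements to the replica, everything else to the primary."""
    is_read = sql.lstrip().lower().startswith(READ_PREFIXES)
    return READER_ENDPOINT if (reporting and is_read) else WRITER_ENDPOINT


report_target = endpoint_for("SELECT device_id, AVG(temp) FROM readings GROUP BY device_id", reporting=True)
write_target = endpoint_for("INSERT INTO readings VALUES (1, 20.5)", reporting=True)
```

Because replication to a read replica is asynchronous, reports may lag the primary slightly, which is normally acceptable for reporting workloads.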

Practice this question →

220

MCQhard

A.Provisioned concurrency during campaign windows

B.A larger deployment package

C.CloudTrail data events

D.Reserved concurrency only

AnswerA

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency keeps a specified number of Lambda execution environments initialized and ready to respond immediately, eliminating cold starts for requests served by those environments. By enabling it only during campaign windows, you get consistent latency for the travel booking site during traffic spikes without paying for provisioned capacity during off-peak periods. This also satisfies the requirement to avoid custom scripts: provisioned concurrency is a native AWS feature configured through the Lambda API or console, and it can be turned on and off on a schedule with Application Auto Scaling.

Exam trap

The trap here is that candidates confuse reserved concurrency (which caps concurrent executions) with provisioned concurrency (which pre-warms instances), leading them to choose reserved concurrency alone, which does not address cold starts.

How to eliminate wrong answers

Option B is wrong because a larger deployment package increases the time to download and initialize the function code, which actually worsens cold start latency rather than solving it. Option C is wrong because CloudTrail data events record API activity for auditing and governance, not for managing Lambda concurrency or cold starts. Option D is wrong because reserved concurrency only sets a maximum number of concurrent executions for a function to prevent it from consuming all available concurrency in the account; it does not pre-warm instances or reduce cold starts.

Practice this question →

221

MCQeasy

A media company uses CloudFront in front of an S3 bucket origin for video thumbnails. They want to prevent users from bypassing CloudFront and accessing the S3 bucket directly, while still allowing CloudFront to fetch objects. What is the best option?

A.Keep the bucket public and rely on signed cookies for all thumbnail requests.

B.Use CloudFront Origin Access Control (OAC) or Origin Access Identity (OAI) and update the bucket policy to allow only CloudFront.

C.Enable S3 static website hosting so users access thumbnails directly from the S3 website endpoint.

D.Set S3 bucket permissions to allow all IAM users and block access only by using a WAF rule at CloudFront.

AnswerB

OAC/OAI ensures only CloudFront can access the bucket while keeping the bucket private.

Why this answer

Option B is correct because Origin Access Control (OAC), or the legacy Origin Access Identity (OAI), gives CloudFront an identity that the bucket policy can trust. With OAC, the policy grants s3:GetObject to the CloudFront service principal, scoped to the specific distribution; with OAI, it grants s3:GetObject to the OAI itself. The bucket stays private, so only CloudFront can fetch objects, and direct requests to the S3 endpoint are denied, which prevents users from bypassing CloudFront.

Exam trap

The trap here is that candidates often think signed cookies or WAF rules can restrict direct S3 access, but they fail to realize that those mechanisms only apply at the CloudFront layer and do not affect the S3 bucket's own permissions.

How to eliminate wrong answers

Option A is wrong because keeping the bucket public and relying on signed cookies does not prevent direct access to the S3 bucket; signed cookies only restrict access through CloudFront, but the bucket itself remains publicly accessible. Option C is wrong because enabling S3 static website hosting exposes the bucket via the S3 website endpoint, which would allow users to bypass CloudFront and access thumbnails directly. Option D is wrong because setting S3 bucket permissions to allow all IAM users does not restrict direct access; a WAF rule at CloudFront cannot block direct S3 requests since WAF operates at the CloudFront edge, not on the S3 endpoint.
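For the OAC case, the bucket policy typically allows the CloudFront service principal and pins it to one distribution. The sketch below builds such a policy document as a Python dictionary; the bucket name, account ID, and distribution ID are placeholder values, not real resources.

```python
import json
from typing import Any, Dict


def oac_bucket_policy(bucket: str, account_id: str, distribution_id: str) -> Dict[str, Any]:
    """Build a bucket policy that lets only one CloudFront distribution read objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


# Placeholder identifiers for illustration only.
policy = oac_bucket_policy("thumbnails-example-bucket", "111122223333", "EDFDVBD6EXAMPLE")
policy_json = json.dumps(policy, indent=2)
```

Combined with S3 Block Public Access, a policy of this shape leaves no anonymous path to the objects, so signed cookies or WAF rules at the edge cannot be sidestepped by going to the bucket directly.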

Practice this question →

222

MCQhard

Based on the exhibit, a retail analytics service repeatedly reads the same DynamoDB items during an active campaign. The business can tolerate data that is a few seconds stale, but the application must minimize latency and reduce pressure on DynamoDB. A load test shows that 80% of reads target only 200 item keys. What should the solutions architect implement?

A.Add a DynamoDB Accelerator (DAX) cluster in front of the table and point the application to the DAX endpoint.

B.Switch the table to provisioned capacity with auto scaling so DynamoDB can handle the repeated reads more efficiently.

C.Create a global table in a second Region and read from the replica Region to lower latency.

D.Move the hot items into Amazon ElastiCache for Redis and keep the remaining data in DynamoDB.

AnswerA

DAX is purpose-built for DynamoDB read caching and can absorb repeated reads for the same keys with very low latency. Because the workload can tolerate slight staleness, DAX fits the requirement well and reduces pressure on the table during bursts.

Why this answer

DynamoDB Accelerator (DAX) is a fully managed, highly available, in-memory cache designed specifically for DynamoDB. It reduces read latency from single-digit milliseconds to microseconds and offloads repeated reads from the table, which directly addresses the requirement to minimize latency and reduce pressure on DynamoDB. Because the business tolerates only a few seconds of staleness, the DAX item cache TTL (5 minutes by default) should be lowered to match; with that setting, DAX is ideal for the 80% of reads hitting only 200 hot keys.

Exam trap

The trap here is that candidates may choose ElastiCache (Option D) because it is a general-purpose cache, but they overlook that DAX is purpose-built for DynamoDB and eliminates the need for custom cache invalidation and dual-write logic, making it the simpler and more efficient solution for this exact use case.

How to eliminate wrong answers

Option B is wrong because switching to provisioned capacity with auto scaling does not reduce latency or offload repeated reads; it only adjusts throughput capacity based on load, and the same read requests still hit the table, causing the same pressure. Option C is wrong because a global table replica in a second Region is farther from the application, so reading from it adds latency rather than removing it, and it does nothing to absorb repeated reads of the same hot keys. Option D is wrong because moving hot items into ElastiCache for Redis requires the application to implement cache population, invalidation, and synchronization between DynamoDB and Redis, adding complexity and potential inconsistency; DAX is API-compatible with DynamoDB and works as a read-through and write-through cache, so the application only needs to use the DAX client and endpoint.
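The effect of a read-through cache with a short TTL on a hot key set can be illustrated with a small in-process model. This is not DAX or its client API; it is a generic sketch in which the class name, the loader callback, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class ReadThroughCache:
    """Serve repeated reads from memory; reload from the backing store after a TTL."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1  # fresh entry: the table is not touched
            return entry[1]
        self.misses += 1  # absent or expired: read through to the table
        value = self._loader(key)
        self._items[key] = (now + self._ttl, value)
        return value
```

With a TTL of a few seconds, each hot key reaches the table at most once per TTL interval no matter how many times it is read, which is the behavior the scenario relies on.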

Practice this question →

223

MCQmedium

A DynamoDB table uses this schema: partition key = customerId, sort key = timestamp. During a marketing campaign, one customer generates extremely high read traffic and the application sees ProvisionedThroughputExceeded errors even though the table’s total capacity is sufficient. What change most directly improves read distribution across partitions?

A.Increase the table’s provisioned read capacity units while keeping partition key = customerId.

B.Add a salt component to the partition key by changing it to customerId#salt, where salt is derived from a hash of requestId so a single customer’s requests are spread across many partitions; keep the sort key as timestamp.

C.Remove the sort key and use timestamp as the partition key to increase cardinality.

D.Switch to on-demand capacity and rely on DynamoDB to automatically distribute reads across partitions.

AnswerB

Hot partition throttling usually occurs when too many requests target a single partition key value. Salting transforms the partition key so that one high-traffic customerId maps to multiple distinct partition keys (e.g., customerId#0, customerId#1, etc.), which increases the number of partitions that can serve that customer’s workload concurrently and reduces the probability that a single partition becomes overloaded.

Why this answer

Option B is correct because adding a salt to the partition key (e.g., customerId#hash(requestId)) distributes the read-heavy customer's data across multiple physical partitions. This prevents a single hot partition from throttling requests, even when the table's total provisioned capacity is sufficient. DynamoDB hashes the partition key value to decide where an item is stored, so increasing partition key cardinality directly improves read distribution. The trade-off is that a query for all of a customer's items must read every salted key and merge the results.

Exam trap

The trap here is that candidates confuse total table capacity with per-partition capacity, assuming that increasing RCUs or switching to on-demand will fix throttling caused by a hot key, when in reality the bottleneck is the single partition's throughput limit.

How to eliminate wrong answers

Option A is wrong because increasing provisioned read capacity units does not solve the hot partition problem; it only raises the total table capacity, while a single partition is still limited to 3,000 RCUs and 1,000 WCUs and will continue to throttle requests for the same customer. Option C is wrong because removing the sort key and using timestamp as the partition key would send all reads and writes for a given timestamp to one partition, creating a new hot partition and losing the ability to query by customerId. Option D is wrong because on-demand capacity does not redistribute reads for a single key; it scales total table capacity up or down, but the per-partition throughput limit is unchanged, so a hot key will still be throttled.
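The salting scheme can be sketched as two small helpers: one that derives the salted key when an item is written, and one that lists every salted key a reader must query to see all of a customer's items. The shard count of 10 and the function names are assumptions made for illustration.

```python
import hashlib
from typing import List


def salted_partition_key(customer_id: str, request_id: str, shards: int = 10) -> str:
    """Derive customerId#salt, where salt is a stable hash of requestId modulo shards."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return f"{customer_id}#{int(digest, 16) % shards}"


def all_partition_keys(customer_id: str, shards: int = 10) -> List[str]:
    """Every salted key for a customer; a full read queries each one and merges by timestamp."""
    return [f"{customer_id}#{i}" for i in range(shards)]


key = salted_partition_key("cust-42", "req-0001")
fan_out = all_partition_keys("cust-42")
```

The salt is deterministic, so the same requestId always maps to the same key, while different requests for one customer spread across up to `shards` partition key values.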

Practice this question →

224

MCQeasy

A web service runs on an Auto Scaling group (ASG). The team updates configuration (AMIs, environment variables) in a Launch Template and wants new instances created during scale-out to use the latest Launch Template version. What should the architect do?

A.Leave the ASG attached to the previous Launch Template version so scale-out is stable.

B.Set the ASG to use the latest Launch Template version and optionally start an instance refresh for existing instances.

C.Manually SSH into each new instance and reconfigure it after it launches.

D.Move the configuration changes into a security group rule so the ASG updates them automatically.

AnswerB

ASG scale-out uses the configured Launch Template version at instance launch time. Switching the ASG to the latest version ensures new instances are consistent. An instance refresh helps apply changes to running instances safely and predictably.

Why this answer

Option B is correct because the ASG can be configured to use the latest version of a Launch Template by specifying the `$Latest` version alias. This ensures that any new instances launched during scale-out automatically use the most recent template configuration (e.g., updated AMI, environment variables). Additionally, an Instance Refresh can be triggered to roll the update across existing instances, aligning them with the same latest template version without manual intervention.

Exam trap

The trap here is that candidates may think the ASG automatically updates existing instances when the Launch Template version is changed, but without an Instance Refresh, only new scale-out instances receive the update, leaving existing instances on the old configuration.

How to eliminate wrong answers

Option A is wrong because leaving the ASG attached to a previous Launch Template version means new scale-out instances will use outdated configurations, defeating the purpose of updating the template. Option C is wrong because manually SSHing into each new instance is not scalable, violates infrastructure-as-code principles, and introduces human error; the ASG should automate configuration via the Launch Template. Option D is wrong because security group rules control network traffic, not instance configuration (AMIs, environment variables); they cannot propagate or apply Launch Template changes.
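The two steps in the correct answer map onto two API calls. The sketch below only assembles the parameters that boto3's `autoscaling` client accepts for `update_auto_scaling_group` and `start_instance_refresh`; the group name, template name, and 90% healthy threshold are hypothetical values.

```python
from typing import Any, Dict


def launch_template_rollout(
    asg_name: str, template_name: str, min_healthy_percent: int = 90
) -> Dict[str, Dict[str, Any]]:
    """Build parameters to track $Latest and to roll existing instances."""
    update = {
        "AutoScalingGroupName": asg_name,
        "LaunchTemplate": {
            "LaunchTemplateName": template_name,
            "Version": "$Latest",  # new launches always use the newest version
        },
    }
    refresh = {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        "Preferences": {"MinHealthyPercentage": min_healthy_percent},
    }
    return {"update_auto_scaling_group": update, "start_instance_refresh": refresh}


rollout = launch_template_rollout("web-asg", "web-launch-template")
```

The update alone covers future scale-out; the instance refresh is the optional step that replaces instances already running the old configuration.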

Practice this question →

225

MCQmedium

A global mobile game backend serves mostly static images and JavaScript files from an S3 origin. Users in distant countries report slow load times. What should improve performance most?

A.RDS read replicas

B.Amazon CloudFront distribution with the S3 bucket as origin

C.A larger S3 bucket

D.An EC2 Auto Scaling group in one Region

AnswerB

CloudFront caches content at edge locations close to users, reducing latency.

Why this answer

Amazon CloudFront is a content delivery network (CDN) that caches static content (images, JavaScript files) at edge locations worldwide. By distributing content closer to users, it significantly reduces latency and improves load times for a global audience, making it the most effective solution for this use case.

Exam trap

The trap here is that candidates may confuse improving database read performance (RDS read replicas) with improving static content delivery, or assume that scaling compute resources (Auto Scaling) in a single Region can solve global latency issues.

How to eliminate wrong answers

Option A is wrong because RDS read replicas are designed to offload read traffic from a relational database, not to accelerate delivery of static files stored in S3. Option C is wrong because S3 buckets have no configurable size; storage scales automatically, and storage capacity has no effect on network latency between the bucket's Region and distant users. Option D is wrong because an EC2 Auto Scaling group in a single Region does not address global latency; it only adds compute capacity within that one Region, so distant users still pay the same network round-trip time.

Practice this question →