CCNA Design High-Performing Architectures Questions — Page 4 of 4

226

MCQ, medium

A document portal requires consistent high IOPS for a transactional database on EC2. Which EBS volume type is most suitable?

A. sc1 Cold HDD

B. Instance store only

C. Provisioned IOPS SSD such as io2

D. st1 Throughput Optimized HDD

Answer: C

io2 is designed for business-critical workloads requiring consistent high IOPS and durability.

Why this answer

Provisioned IOPS SSD (io2) is the correct choice because it delivers the consistent, high IOPS that transactional databases on EC2 require. io2 volumes are designed for 99.999% durability and can be provisioned for up to 256,000 IOPS per volume (io2 Block Express), making them well suited to latency-sensitive workloads such as OLTP databases.

Exam trap

The trap here is that candidates often confuse 'high IOPS' with 'high throughput' and select st1 or sc1, not realizing that transactional databases require low-latency random I/O, which only SSD-based volumes like io2 can consistently deliver.

How to eliminate wrong answers

Option A is wrong because sc1 Cold HDD is designed for infrequently accessed, throughput-oriented workloads with low cost, and cannot provide consistent high IOPS due to its burst-bucket model and high latency. Option B is wrong because instance store volumes are ephemeral and data is lost on instance stop/termination, making them unsuitable for persistent transactional databases that require durability and consistent IOPS. Option D is wrong because st1 Throughput Optimized HDD is optimized for large, sequential workloads like big data and log processing, not for random I/O patterns typical of transactional databases, and its performance is limited to a maximum of 500 IOPS per volume.

227

MCQ, medium

A global video platform serves mostly static images and JavaScript files from an S3 origin. Users in distant countries report slow load times. What should improve performance most?

A. A larger S3 bucket

B. Amazon CloudFront distribution with the S3 bucket as origin

C. RDS read replicas

D. An EC2 Auto Scaling group in one Region

Answer: B

CloudFront caches content at edge locations close to users, reducing latency.

Why this answer

Amazon CloudFront is a content delivery network (CDN) that caches static content (images, JavaScript) at edge locations worldwide. By distributing content closer to users, it reduces latency and improves load times significantly compared to serving directly from a single S3 origin. This is the most effective solution for a global user base accessing static assets.

Exam trap

The trap here is that candidates might confuse 'scaling' (Auto Scaling, larger buckets) with 'latency reduction' (CDN), or mistakenly think database read replicas can serve static web assets, when in fact they are only for relational database read offloading.

How to eliminate wrong answers

Option A is wrong because S3 bucket size has no impact on performance; S3 scales automatically to handle any amount of data, and a larger bucket does not reduce latency for distant users. Option C is wrong because RDS read replicas are designed to offload read traffic from a relational database, not to serve static files like images or JavaScript; they address database query performance, not content delivery. Option D is wrong because an EC2 Auto Scaling group in one Region only scales compute capacity within that single geographic area, failing to reduce latency for users in distant countries who still must traverse long network paths.

228

MCQ, hard

Based on the exhibit, a trading platform exposes a custom binary TCP protocol to partner systems. The service must preserve the original client source IP for rate limiting, support TLS pass-through to the application, and minimize network latency. The team also wants a simple architecture that can scale across multiple Availability Zones. What load balancing option should the solutions architect choose?

A. Application Load Balancer with path-based routing and HTTP/2 enabled.

B. Network Load Balancer with TCP listeners and target groups in the private subnets.

C. Amazon API Gateway REST API integrated directly with the EC2 instances.

D. CloudFront in front of the EC2 instances to cache and terminate the client connections.

Answer: B

NLB is designed for ultra-low-latency TCP/UDP workloads and preserves the client source IP to targets. It also supports multi-AZ scale-out and works well when the application is not HTTP-based.

Why this answer

A Network Load Balancer (NLB) with TCP listeners is the correct choice because it preserves the original client source IP address (via the Proxy Protocol header or direct preservation in the TCP flow), supports TLS pass-through (no decryption at the load balancer), and minimizes latency by operating at Layer 4. It also scales across multiple Availability Zones with simple target groups in private subnets, meeting all stated requirements.

Exam trap

The trap here is that candidates often choose ALB for its advanced routing features, forgetting that ALB accepts only HTTP-based listeners and therefore cannot carry a custom binary TCP protocol, and that it terminates TLS, which violates the TLS pass-through requirement.

How to eliminate wrong answers

Option A is wrong because an Application Load Balancer (ALB) operates at Layer 7 and accepts only HTTP and HTTPS listeners, so it cannot carry a custom binary TCP protocol; it also terminates TLS at the load balancer, breaking the pass-through requirement, and exposes the client IP to targets only through the X-Forwarded-For HTTP header. Option C is wrong because Amazon API Gateway is a RESTful HTTP/HTTPS service that cannot handle custom binary TCP protocols or provide TLS pass-through, and it introduces additional latency. Option D is wrong because CloudFront is an HTTP/HTTPS content delivery network that terminates client connections, cannot pass through raw TCP traffic, and would add latency and complexity without supporting the custom binary protocol.
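
The explanation above mentions that an NLB can hand the client address to targets in a Proxy Protocol header. As a rough sketch, assuming Proxy Protocol v2 is enabled on the target group and the connection is TCP over IPv4, a target could recover the source IP for rate limiting along these lines. The function name `parse_proxy_v2_ipv4` is hypothetical, and the sketch deliberately ignores IPv6, UDP, and TLV extensions.

```python
import socket
import struct

# 12-byte signature that starts every Proxy Protocol v2 header.
PP2_SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2_ipv4(data: bytes):
    """Return (src_ip, src_port, payload_offset) for a TCP/IPv4 PROXY v2 header.

    Hypothetical helper for illustration; raises ValueError for anything else.
    """
    if len(data) < 16 or data[:12] != PP2_SIGNATURE:
        raise ValueError("not a Proxy Protocol v2 header")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != 0x21:      # version 2, command PROXY
        raise ValueError("unsupported version or command")
    if fam_proto != 0x11:    # AF_INET + STREAM, i.e. TCP over IPv4
        raise ValueError("only TCP over IPv4 is handled in this sketch")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated header")
    # Address block: source IP, destination IP, source port, destination port.
    src, _dst, src_port, _dst_port = struct.unpack("!4s4sHH", data[16:28])
    # The application's own bytes begin right after the header.
    return socket.inet_ntoa(src), src_port, 16 + length
```

Without Proxy Protocol, an NLB with client IP preservation enabled lets the target read the address straight from the socket, so a parser like this is only needed when the header is switched on.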

229

MCQ, medium

A telemetry pipeline uses an Application Load Balancer in one Region. Global users need lower network latency to the application without caching dynamic responses. What should be considered? The architecture review board prefers a managed AWS-native control.

A. AWS Global Accelerator

B. S3 Cross-Region Replication

C. CloudFront only with long TTLs

D. AWS Backup cross-Region copy

Answer: A

Global Accelerator routes traffic over the AWS global network to improve performance for TCP/UDP applications without relying on caching.

Why this answer

AWS Global Accelerator is the correct choice because it uses the AWS global network and Anycast IP addresses to route user traffic to the optimal Application Load Balancer endpoint, reducing latency for global users without caching dynamic responses. Unlike CloudFront, Global Accelerator does not cache content; it simply optimizes the network path, making it ideal for dynamic or real-time applications where caching is not acceptable. It is a managed AWS-native service that aligns with the architecture review board's preference.

Exam trap

The trap here is that candidates often confuse CloudFront with Global Accelerator, assuming that any CDN-like service is the answer for latency reduction, but caching with long TTLs is unsuitable for dynamic responses that must not be cached.

How to eliminate wrong answers

Option B (S3 Cross-Region Replication) is wrong because it is designed for replicating objects in S3 buckets across regions for data durability or compliance, not for reducing network latency to an ALB-based application. Option C (CloudFront only with long TTLs) is wrong because CloudFront caches content at edge locations, which would cache dynamic responses—contradicting the requirement to avoid caching—and long TTLs would further exacerbate stale data issues. Option D (AWS Backup cross-Region copy) is wrong because it is a backup and disaster recovery service for creating copies of resources across regions, not a solution for improving application latency.

230

MCQ, hard

Based on the exhibit, an application repeatedly reads the same DynamoDB items with extremely low latency requirements. The business can tolerate data that is a few seconds stale. Which architecture change best improves read performance?

A. Add a DynamoDB Accelerator (DAX) cluster in front of the table.

B. Increase the table's sort key cardinality while keeping the same read pattern.

C. Switch the table to provisioned mode with auto scaling disabled.

D. Move the session data to Amazon EFS so the application can read it from shared files.

Answer: A

DAX is designed for repeated, read-heavy DynamoDB access patterns where a small amount of staleness is acceptable. It can dramatically reduce read latency and offload the table during peak demand.

Why this answer

Adding a DynamoDB Accelerator (DAX) cluster provides an in-memory cache that can reduce read latencies to microseconds for frequently accessed items. DAX serves eventually consistent reads from its cache, which fits a workload that tolerates a few seconds of staleness. It is API-compatible with DynamoDB, so the application only needs to switch to the DAX client and cluster endpoint, and cache hits offload read traffic from the table.

Exam trap

The trap here is that candidates may overlook DAX as a specialized caching layer for DynamoDB and instead consider increasing table capacity or changing data models, which do not directly address the need for extremely low latency on repeated reads of the same items.

How to eliminate wrong answers

Option B is wrong because increasing sort key cardinality does not improve read performance for repeated reads of the same items; it primarily helps with write distribution and query flexibility, not latency for individual GetItem operations. Option C is wrong because switching to provisioned mode with auto scaling disabled does not inherently improve read performance; it may lead to throttling if capacity is insufficient, and it does not address the need for sub-millisecond latency. Option D is wrong because moving session data to Amazon EFS introduces file system overhead and network latency that is significantly higher than DynamoDB's single-digit millisecond latency, and EFS is not designed for the same low-latency, high-throughput access pattern required for repeated reads of individual items.
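
To make the caching behaviour concrete, here is a minimal sketch of a read-through cache with a per-item TTL. It is a toy model of the idea, not the DAX client API: the class name `ReadThroughCache`, the `load_item` callable, and the 5-second TTL are all assumptions chosen for illustration.

```python
import time


class ReadThroughCache:
    """Toy read-through cache with a per-item TTL (not the DAX client API)."""

    def __init__(self, load_item, ttl_seconds=5.0, clock=time.monotonic):
        self._load_item = load_item   # function that reads from the backing table
        self._ttl = ttl_seconds       # how stale a cached item may become
        self._clock = clock           # injectable so the behaviour can be tested
        self._items = {}              # key -> (expires_at, value)
        self.table_reads = 0          # reads that actually reached the table

    def get(self, key):
        now = self._clock()
        entry = self._items.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]           # cache hit: the table is not touched
        value = self._load_item(key)  # miss or expired entry: read through
        self.table_reads += 1
        self._items[key] = (now + self._ttl, value)
        return value
```

Repeated `get` calls for the same key within the TTL leave `table_reads` unchanged, which is the offloading effect described above; the price is that a value may be up to one TTL out of date.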

231

MCQ, hard

Based on the exhibit, what change should the team make to achieve the lowest possible network latency for the distributed workload?

A. Place the instances in a spread placement group across multiple Availability Zones.

B. Move the workload into a cluster placement group in one Availability Zone.

C. Add an Application Load Balancer in front of the workers to reduce inter-node latency.

D. Increase the EC2 instance size while keeping the current multi-AZ layout.

Answer: B

Cluster placement groups place instances physically close together inside one Availability Zone, which is the best AWS option for workloads that need low-latency, high-bandwidth communication between many nodes. The exhibit explicitly says the workload can run in a single AZ if performance improves. That makes cluster placement groups the right fit.

Why this answer

A cluster placement group packs instances close together inside a single Availability Zone, which gives the lowest network latency and the highest per-flow throughput available between EC2 instances. This is ideal for tightly coupled, distributed workloads that require frequent inter-node communication, such as HPC or data analytics jobs.

Exam trap

The trap here is that candidates often assume multi-AZ is always better for high availability, but for latency-sensitive distributed workloads, a single-AZ cluster placement group is the correct choice to minimize inter-node latency, even though it sacrifices fault tolerance.

How to eliminate wrong answers

Option A is wrong because a spread placement group spreads instances across distinct hardware racks or even Availability Zones, which increases network latency and reduces throughput compared to a cluster placement group. Option C is wrong because an Application Load Balancer operates at Layer 7 and is designed for distributing incoming traffic, not for reducing inter-node latency between worker instances; it would add overhead and increase latency. Option D is wrong because increasing instance size does not fundamentally change the network topology or reduce the physical distance between instances; inter-node latency remains constrained by multi-AZ network hops.

232

Multi-Select, medium

A marketing site serves versioned JavaScript and CSS files from Amazon S3 through CloudFront. Origin bandwidth costs are rising because CloudFront keeps revalidating objects and fetching too much content from the bucket. Which two changes most directly improve cache hit ratio and reduce origin load? Select two.

Select 2 answers

A. Use versioned object names and long cache TTLs for immutable assets.

B. Forward all cookies and query strings so each request is treated as unique.

C. Configure a cache policy that excludes unnecessary cookies, query strings, and headers.

D. Switch the bucket to S3 Intelligent-Tiering to reduce CloudFront origin requests.

E. Add more NAT Gateways to improve the speed of CloudFront origin fetches.

Answers: A, C

Versioned file names let you cache content aggressively because each new build gets a new URL and does not overwrite the old one.

Why this answer

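Option A is correct because using versioned object names (e.g., app-v1.js, app-v2.js) combined with long cache TTLs (e.g., one year) lets CloudFront treat these assets as immutable. A cached object is served from the edge without revalidation until its TTL expires, which removes most origin requests for unchanged files and directly reduces origin bandwidth costs. Option C is correct because a cache policy that leaves unnecessary cookies, query strings, and headers out of the cache key lets more requests map to the same cached object, raising the cache hit ratio.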

Exam trap

The trap here is that candidates confuse S3 storage classes (like Intelligent-Tiering) with caching performance, or think that increasing network throughput (NAT Gateways) can fix a cache miss problem, when the real solution lies in optimizing cache keys and TTLs.
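
The effect of the cache key on hit ratio can be sketched with a small model. This is not CloudFront configuration: `cache_key` is a hypothetical function that imitates a cache policy by keeping only allow-listed query parameters, and the URLs are made-up examples.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, allowed_params=()):
    """Model a cache key: the path plus only the allow-listed query parameters."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in allowed_params)
    return parts.path + ("?" + urlencode(kept) if kept else "")


requests = [
    "https://example.com/static/app.3f9a1c.js?utm_source=mail",
    "https://example.com/static/app.3f9a1c.js?utm_source=ad&session=42",
    "https://example.com/static/app.3f9a1c.js",
]

# Forwarding everything: each distinct query string becomes its own cache entry.
keys_forward_all = {urlsplit(u).path + "?" + urlsplit(u).query for u in requests}

# Allow-listing nothing: all three requests share a single cache entry.
keys_minimal = {cache_key(u) for u in requests}
```

In this model the three requests produce three cache entries when everything is forwarded and one entry when nothing is, so the same object is fetched from the origin once instead of three times.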

233

MCQ, medium

A read-heavy document portal repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load? The design must avoid adding custom operational scripts.

A. Amazon Kinesis Data Firehose

B. S3 Transfer Acceleration

C. DynamoDB Accelerator (DAX)

D. AWS Glue Data Catalog

Answer: C

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is an in-memory cache for DynamoDB that delivers microsecond read latency on cache hits, reducing the number of read requests hitting the underlying table. As a managed service it needs no custom operational scripts: the application uses the DAX client against the cluster endpoint, and DAX automatically caches frequently accessed items, making it ideal for a read-heavy document portal with millisecond latency requirements.

Exam trap

The trap here is that candidates often confuse caching services like ElastiCache with DAX, but DAX is purpose-built for DynamoDB and requires little application change beyond using the DAX client and cluster endpoint, whereas ElastiCache would need custom caching and invalidation logic.

How to eliminate wrong answers

Option A is wrong because Amazon Kinesis Data Firehose is a streaming data ingestion service for loading data into data lakes or analytics tools, not a read cache for DynamoDB. Option B is wrong because S3 Transfer Acceleration speeds up uploads to S3 over long distances using edge locations, but does not reduce read latency or load on a DynamoDB table. Option D is wrong because AWS Glue Data Catalog is a metadata repository for ETL jobs and data discovery, not a caching layer for DynamoDB reads.

234

MCQ, hard

A Lambda-based retail API has unpredictable traffic spikes and users see latency caused by cold starts. The function must respond consistently during expected campaign windows. What should be configured?

A. A larger deployment package

B. Reserved concurrency only

C. Provisioned concurrency during campaign windows

D. CloudTrail data events

Answer: C

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, so requests served by those environments avoid cold starts during campaign windows. Sized to the expected peak, it keeps latency consistent; traffic beyond the provisioned amount falls back to on-demand environments and can still incur cold starts.

Exam trap

The trap here is confusing reserved concurrency (which limits scaling but does not prevent cold starts) with provisioned concurrency (which pre-warms environments to eliminate cold starts).

How to eliminate wrong answers

Option A is wrong because a larger deployment package tends to increase cold start time, making latency worse. Option B is wrong because reserved concurrency sets aside part of the account's concurrency for the function and caps its concurrent executions, but it does not pre-initialize environments, so cold starts still occur. Option D is wrong because CloudTrail data events record API activity for auditing, not for managing function initialization or latency.
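
As a hedged sketch of how provisioned concurrency might be switched on and off around a campaign window, the helpers below wrap the Lambda API operations `put_provisioned_concurrency_config` and `delete_provisioned_concurrency_config`. The helper names, the function name, the alias, and the environment count are assumptions for illustration; `lambda_client` is expected to be a boto3 Lambda client created by the caller.

```python
def warm_for_campaign(lambda_client, function_name, alias, environments):
    """Pre-initialize `environments` execution environments for an alias.

    Provisioned concurrency is configured on a published version or alias,
    so `alias` must name one of those rather than the unpublished function.
    """
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
        ProvisionedConcurrentExecutions=environments,
    )


def release_after_campaign(lambda_client, function_name, alias):
    """Remove the configuration so idle pre-warmed environments stop accruing cost."""
    return lambda_client.delete_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
    )
```

With boto3 installed, a call could look like `warm_for_campaign(boto3.client("lambda"), "retail-api", "live", 50)`, where `retail-api` and `live` are placeholder names. Running the two helpers from a scheduler at the start and end of each window is one option; Application Auto Scaling scheduled actions are another.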

235

MCQ, easy

Your team serves static JavaScript and CSS files from an S3 origin through CloudFront. After a release, the CloudFront cache hit ratio dropped because clients keep re-downloading the same assets. What is the best next change to improve caching performance?

A. Update origin responses to include long-lived Cache-Control headers (for example, max-age) so CloudFront can cache objects

B. Switch the S3 bucket to S3 Glacier so objects are not frequently accessed

C. Disable CloudFront compression to reduce CPU usage at the edge

D. Set CloudFront to forward all query strings to the origin to ensure the latest assets are returned

Answer: A

CloudFront will only reuse cached objects when the origin response is cacheable. Adding/adjusting Cache-Control (and related directives such as public and s-maxage where appropriate) to allow long-lived caching enables edge reuse and increases cache hit ratio.

Why this answer

Option A is correct because setting long-lived Cache-Control headers (e.g., max-age=31536000) on static assets tells CloudFront to cache them at edge locations for an extended period. This reduces the number of requests forwarded to the S3 origin, improving the cache hit ratio and preventing clients from re-downloading unchanged assets on every visit.

Exam trap

The trap here is that candidates may think forwarding query strings (Option D) ensures freshness, but it actually fragments the cache and reduces hit ratio, whereas the real solution is to use long-lived Cache-Control headers with versioned filenames to maximize caching.

How to eliminate wrong answers

Option B is wrong because moving the S3 bucket to Glacier would make objects inaccessible for real-time serving, breaking the static asset delivery entirely. Option C is wrong because disabling CloudFront compression does not affect caching behavior; it would only increase bandwidth and latency for clients, not improve cache hit ratio. Option D is wrong because forwarding all query strings to the origin forces CloudFront to treat each unique query string as a separate cache key, fragmenting the cache and reducing hit ratio, which is the opposite of what is needed.
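
One way to apply long-lived headers safely is to choose the Cache-Control value from the object name at upload time. The sketch below assumes a naming scheme in which versioned assets carry a hex content hash before the extension (for example `app.3f9a1c2b.js`); that scheme, the function name `cache_control_for`, and the `no-cache` fallback for everything else are illustrative assumptions, not something the question specifies.

```python
import re

ONE_YEAR = 31536000  # seconds; the max-age value used in the explanation above

# Assumed naming scheme: name.<8 or more hex chars>.js or .css
_VERSIONED = re.compile(r"\.[0-9a-f]{8,}\.(js|css)$")


def cache_control_for(object_key: str) -> str:
    """Pick a Cache-Control header value for an object about to be uploaded."""
    if _VERSIONED.search(object_key):
        # The URL changes on every release, so the content behind it never does.
        return f"public, max-age={ONE_YEAR}, immutable"
    # Unversioned entry points (such as index.html) are revalidated on each use.
    return "no-cache"
```

The returned string would be stored as the object's Cache-Control metadata in S3 so that the origin response carries it to CloudFront and to browsers.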

236

MCQ, medium

Your mobile app writes events to a single DynamoDB table with partition key = customerId and sort key = eventTime. During a promotional campaign, one tenant ("ACME") generates far more traffic than others. CloudWatch shows sustained throttling (ProvisionedThroughputExceeded) and elevated p99 latency only for that tenant. The workload pattern cannot be changed to a completely different schema, but you can change how items are partitioned. Which design change is most likely to reduce the hot-partition throttling while keeping efficient reads for ACME?

A. Use the same partition key (customerId), but increase the table’s provisioned capacity for that tenant.

B. Change the partition key to a salted key such as customerId + shard number, and include the eventTime ordering using the sort key.

C. Switch to on-demand capacity mode and keep the partition key unchanged.

D. Enable Global Tables so that reads are served from a nearby replica for ACME.

Answer: B

Hot-partition throttling happens when a single logical partition (one partition key value) receives more requests than it can serve. By salting the partition key (for example, customerId#shardId), ACME’s writes are spread across multiple physical partitions, reducing request rate per partition and lowering throttling. Efficient reads for ACME can be preserved by querying only the shard partitions that belong to ACME (for example, using a small, deterministic set of shardIds and issuing parallel queries per shard, then merging results). This avoids scanning the whole table and keeps access patterns predictable while improving tail latency.

Why this answer

Option B is correct because salting the partition key by appending a shard number (e.g., customerId + random digit) distributes ACME's writes across multiple partitions, eliminating the hot partition. The sort key still preserves eventTime ordering, so queries for a specific customer can be parallelized across shards and merged client-side or via a composite sort key pattern, maintaining efficient reads.

Exam trap

The trap here is that candidates assume increasing capacity or switching to on-demand alone solves hot partitions, but they overlook DynamoDB's fixed per-partition throughput limits that require key design changes to distribute load.

How to eliminate wrong answers

Option A is wrong because increasing provisioned capacity for a single tenant does not solve the hot-partition issue; DynamoDB distributes capacity across partitions, and a single partition's throughput is capped at 3,000 RCU and 1,000 WCU regardless of table-level settings. Option C is wrong because switching to on-demand capacity mode only handles traffic spikes at the table level, but a single hot partition still hits the same per-partition throughput limits (3,000 RCU and 1,000 WCU), causing throttling. Option D is wrong because Global Tables replicate data across Regions for low-latency reads and disaster recovery, but they do not redistribute write load within a single table; ACME's writes still target the same partition key in the source Region, so throttling persists.
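
A salted key can be sketched as follows. Instead of a random digit, this version derives the shard from a hash of a per-event identifier so that a retried write lands on the same key; that substitution, the `event_id` attribute, the shard count of 8, and both function names are assumptions made for illustration.

```python
import hashlib

SHARD_COUNT = 8  # assumed value; size it to the hottest tenant's write rate


def sharded_partition_key(customer_id: str, event_id: str, shards: int = SHARD_COUNT) -> str:
    """Build 'customerId#shard', choosing the shard from a hash of the event id."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    return f"{customer_id}#{shard}"


def all_partition_keys(customer_id: str, shards: int = SHARD_COUNT) -> list:
    """Every key a reader must query to see all of one customer's events."""
    return [f"{customer_id}#{s}" for s in range(shards)]
```

Writes for ACME now spread over eight partition key values, while a reader still touches only the small, fixed set returned by `all_partition_keys("ACME")` rather than scanning the table.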

237

MCQ, medium

An Aurora PostgreSQL cluster is experiencing high read latency because 85% of traffic consists of read-only queries. The write workload must stay on the writer instance, and the team wants to offload reads without changing the application’s core query patterns. What is the best architectural option?

A. Increase the writer instance size so it can handle more reads and writes simultaneously.

B. Add Aurora reader instances (read replicas) and route read queries to the reader endpoint while keeping writes on the writer endpoint.

C. Enable Multi-AZ failover only and rely on the standby to serve reads in normal operation.

D. Move the read workload to ElastiCache Redis while keeping DynamoDB as the SQL data source.

Answer: B

Aurora reader instances are designed for exactly this pattern: they provide dedicated compute capacity for read-only workloads. By sending read queries to the reader endpoint and keeping writes on the writer endpoint, the cluster can scale read performance without forcing reads to contend with write processing on the writer.

Why this answer

Adding Aurora reader instances (read replicas) and routing read queries to the reader endpoint offloads read traffic from the writer instance without altering application query patterns. The Aurora reader endpoint automatically distributes read-only connections across the replicas, reducing load on the writer while keeping writes on the writer instance. This directly addresses the 85% read-heavy workload; the only application change is pointing read connections at the reader endpoint.

Exam trap

The trap here is that candidates often confuse Multi-AZ standby instances (which are passive and cannot serve reads) with Aurora reader replicas (which are active and can serve reads), leading them to incorrectly select Option C.

How to eliminate wrong answers

Option A is wrong because increasing the writer instance size does not offload reads; it only adds more resources to a single node, which still handles all read and write traffic, and does not scale read capacity independently. Option C is wrong because Multi-AZ failover provides a standby for high availability, but the standby does not serve reads in normal operation (it is passive until failover), so it does not offload read traffic. Option D is wrong because ElastiCache Redis is a caching layer, not a SQL data source, and DynamoDB is a NoSQL database, not a SQL data source; this would require significant application changes and does not preserve the existing Aurora PostgreSQL query patterns.
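
The routing rule can be sketched as a small helper that picks an endpoint per statement. This is a simplified heuristic under stated assumptions: the function name `endpoint_for` and the endpoint hostnames are hypothetical, statements inside a transaction always go to the writer, and anything that is not a plain SELECT or SHOW is treated as a write.

```python
READ_ONLY_VERBS = ("select", "show")


def endpoint_for(sql, writer_endpoint, reader_endpoint, in_transaction=False):
    """Choose the writer or reader endpoint for one SQL statement."""
    words = sql.split()
    if in_transaction or not words:
        return writer_endpoint        # keep transactional work on the writer
    lowered = " ".join(words).lower()
    if "for update" in lowered or "for share" in lowered:
        return writer_endpoint        # row-locking SELECTs are not read-only
    if words[0].lower() in READ_ONLY_VERBS:
        return reader_endpoint        # plain reads go to the replicas
    return writer_endpoint            # default to the writer when unsure
```

Many applications avoid SQL inspection altogether and keep two connection pools, one per endpoint, choosing between them in the data-access layer. Replica lag also means a read that must observe a write made moments earlier may need to go to the writer.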

238

MCQ, medium

An event ingestion service writes to a DynamoDB table where the partition key is tenantId and the sort key is eventTime. During a campaign, one tenant generates a disproportionate share of traffic, causing write throttling and increased latency for that tenant’s writes. You can change the data model and application queries, but you must still efficiently retrieve events for a tenant for the last 10 minutes. Which change best improves write throughput by reducing hot partitions?

A. Keep tenantId as the partition key and rely on DynamoDB adaptive capacity to automatically remove all throttling.

B. Add a shard attribute to the partition key (partition key = tenantId#shard, where shard is randomly selected from a fixed range). Query all shards for the tenant for eventTime values in the last 10 minutes, then merge results in the application.

C. Change the sort key to eventTimeBucket (for example, eventTime rounded to 1-minute buckets) while keeping the partition key as tenantId.

D. Enable DAX and use it for write operations so throttled writes are served from cache instead of reaching DynamoDB.

Answer: B

This “write sharding” spreads a tenant’s traffic across multiple partition key values, which distributes the write load across multiple DynamoDB partitions (and thus multiple throughput slices). Reads for the last 10 minutes remain efficient because each shard still supports a sort-key range query on eventTime; the application merges results across shards.

Why this answer

Option B is correct because it distributes writes for a hot tenant across multiple partitions by appending a random shard suffix to the tenantId partition key. This eliminates a single hot partition, allowing DynamoDB to scale write capacity horizontally. The application can then query all shards for the last 10 minutes and merge results, satisfying the retrieval requirement.

Exam trap

The trap here is that candidates assume adaptive capacity or caching (DAX) can solve write throttling, but neither addresses the root cause—a single partition exceeding its write capacity—which requires redistributing the partition key across multiple physical partitions.

How to eliminate wrong answers

Option A is wrong because DynamoDB adaptive capacity can shift unused table throughput to a busy partition, but it cannot raise a partition above its hard limit of 1,000 WCU and 3,000 RCU; sustained heavy traffic to one partition key will still be throttled. Option C is wrong because changing the sort key to eventTimeBucket does nothing to distribute writes across partitions: the partition key remains tenantId, so all writes for that tenant still target the same partition, leaving the hot partition problem unsolved. Option D is wrong because DAX is a read-through, write-through cache: writes pass through DAX to the table and consume the same write capacity, so throttled writes are not absorbed by the cache and DAX cannot increase write throughput or reduce hot-partition contention.
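
The read side of write sharding, querying every shard and merging in the application, might look like the sketch below. `query_shard` is a hypothetical callable standing in for a DynamoDB Query on one partition key with a sort-key condition of eventTime >= since; it is assumed to return that shard's items already ordered by eventTime, with eventTime stored as an ISO 8601 UTC string.

```python
import heapq
from datetime import datetime, timedelta, timezone


def shard_keys(tenant_id, shards):
    """All partition key values that hold events for one tenant."""
    return [f"{tenant_id}#{s}" for s in range(shards)]


def recent_events(query_shard, tenant_id, shards, now, window=timedelta(minutes=10)):
    """Fetch the last `window` of events from every shard and merge them by time.

    query_shard(partition_key, since_iso) is a hypothetical callable that
    returns one shard's events sorted by their 'eventTime' string.
    """
    since = (now - window).isoformat()
    per_shard = [query_shard(pk, since) for pk in shard_keys(tenant_id, shards)]
    # Each input is already sorted, so a k-way merge preserves global order.
    return list(heapq.merge(*per_shard, key=lambda e: e["eventTime"]))
```

The per-shard queries are independent, so a real implementation could issue them in parallel; the merge step stays the same.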
