SAA-C03 Questions 526–600 | Page 8/14

526

MCQ · medium

A telemetry pipeline uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add?

A. Multi-AZ standby and route reads to the standby

B. RDS read replica and route reporting queries to it

C. S3 lifecycle policy

D. A larger NAT gateway

Answer: B

Read replicas offload read traffic from the primary instance.

Why this answer

RDS Read Replicas are designed specifically to offload read-heavy workloads from the primary database. By creating a read replica and routing the reporting queries to it, the primary database is freed from processing these read-only queries, reducing contention and improving overall performance. This is the most cost-effective and architecturally appropriate solution for read scaling in RDS MySQL.

Exam trap

The trap here is confusing Multi-AZ standby (which is for failover, not read scaling) with a read replica, leading candidates to incorrectly choose Option A.

How to eliminate wrong answers

Option A is wrong because a Multi-AZ standby is for high availability and disaster recovery, not for read scaling; the standby does not accept read traffic unless a failover occurs. Option C is wrong because S3 lifecycle policies manage object storage tiers and expiration, which have no relevance to offloading database read queries. Option D is wrong because a larger NAT gateway increases outbound internet bandwidth for private subnets, but does not address database read performance or query offloading.
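
The routing half of answer B happens in the application: reporting queries must be pointed at the replica's endpoint. The following is a minimal sketch of that idea, assuming two hypothetical endpoint names and a deliberately naive rule that treats statements starting with SELECT, SHOW, or EXPLAIN as read-only.

```python
# Hypothetical endpoints; real values come from the RDS console or API.
PRIMARY_ENDPOINT = "telemetry-primary.example.internal"
REPLICA_ENDPOINT = "telemetry-replica.example.internal"

# Naive classification rule, assumed for illustration only.
READ_ONLY_PREFIXES = ("select", "show", "explain")


def choose_endpoint(sql: str) -> str:
    """Send read-only statements to the replica, everything else to the primary."""
    first_word = sql.lstrip().split(None, 1)[0].lower() if sql.strip() else ""
    if first_word in READ_ONLY_PREFIXES:
        return REPLICA_ENDPOINT
    return PRIMARY_ENDPOINT


print(choose_endpoint("SELECT device_id, AVG(temp) FROM readings GROUP BY device_id"))
print(choose_endpoint("INSERT INTO readings VALUES (1, 20.5)"))
```

Because replication to a read replica is asynchronous, reports served this way may lag the primary slightly, which is usually acceptable for reporting workloads.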

527

MCQ · medium

A test environment has EC2 instances that are oversized based on CPU, memory, and network utilisation. Which AWS service should identify rightsizing recommendations? The design must avoid adding custom operational scripts.

A. AWS DataSync

B. AWS Shield

C. AWS Artifact

D. AWS Compute Optimizer

Answer: D

Compute Optimizer analyses utilisation metrics and recommends rightsizing for supported resources.

Why this answer

AWS Compute Optimizer uses machine learning to analyze historical utilization metrics and generates rightsizing recommendations for EC2 instances, including instance type changes and downsizing opportunities. It operates without custom scripts because it reads the CloudWatch metrics that EC2 already publishes (such as CPU and network); memory utilization is included only if the CloudWatch agent is installed on the instances to publish it.

Exam trap

The trap here is that candidates may confuse AWS Compute Optimizer with AWS Trusted Advisor, which also provides cost optimization checks but does not offer the same granular, ML-driven rightsizing recommendations for EC2 instance types.

How to eliminate wrong answers

Option A is wrong because AWS DataSync is a data transfer service for moving large datasets between on-premises storage and AWS services (e.g., S3, EFS), not for analyzing EC2 utilization or generating rightsizing recommendations. Option B is wrong because AWS Shield is a managed DDoS protection service that safeguards applications against distributed denial-of-service attacks, unrelated to cost optimization or instance sizing. Option C is wrong because AWS Artifact is a self-service portal for downloading AWS compliance reports and agreements (e.g., SOC, PCI), not a tool for resource optimization or rightsizing.

528

MCQ · easy

A company serves public JavaScript and CSS files from S3 using CloudFront. After a frontend change, customers report a low CloudFront cache hit ratio. Requests now include an Authorization header, but these assets do not require authentication. The CloudFront distribution is configured such that Authorization is included in the cache key. Which change best maximizes cache reuse?

A. Include the Authorization header in the cache key so responses vary correctly

B. Use a CloudFront Cache Policy that excludes Authorization from the cache key

C. Disable caching and always fetch from S3

D. Forward all headers and cookies to the origin to improve correctness

Answer: B

Because the assets are public and do not depend on Authorization, excluding Authorization from the cache key allows all users to share the same cached objects. This reduces cache fragmentation and increases cache hit ratio.

Why this answer

Option B is correct because excluding the Authorization header from the cache key ensures that all users, regardless of their authentication token, receive the same cached object. Since the static assets (JavaScript/CSS) do not require authentication, including Authorization in the cache key creates a separate cache entry per token for the same file, drastically reducing the cache hit ratio. A CloudFront cache policy that omits Authorization from the cache key maximizes reuse, and because the assets are public, the origin does not need the header.

Exam trap

The trap here is that candidates may assume including the Authorization header is necessary for correctness, but for public static assets, excluding it from the cache key is the correct way to maximize cache reuse without affecting delivery.

How to eliminate wrong answers

Option A is wrong because including the Authorization header in the cache key would cause CloudFront to cache separate copies for each unique token value, which is exactly the problem that reduces the cache hit ratio. Option C is wrong because disabling caching entirely would increase latency and origin load, violating the goal of maximizing cache reuse. Option D is wrong because forwarding all headers and cookies to the origin would not only include unnecessary Authorization values but also further fragment the cache, worsening the hit ratio and adding overhead.
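
The fragmentation effect can be seen in a toy model. This is a minimal sketch, not CloudFront's actual implementation: it assumes a cache key made of the request path plus whichever headers are selected, and replays four requests for the same file from viewers with different tokens.

```python
def cache_key(path: str, headers: dict, key_headers: tuple) -> tuple:
    """Build a simplified cache key: the path plus only the selected headers."""
    selected = tuple((name, headers.get(name)) for name in key_headers)
    return (path, selected)


def hit_ratio(requests: list, key_headers: tuple) -> float:
    """Replay requests against an empty cache and return hits / total."""
    seen, hits = set(), 0
    for path, headers in requests:
        key = cache_key(path, headers, key_headers)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(requests)


# Four viewers, each with a different token, request the same public file.
requests = [("/app.js", {"Authorization": f"Bearer token-{i}"}) for i in range(4)]

print(hit_ratio(requests, key_headers=("Authorization",)))  # 0.0
print(hit_ratio(requests, key_headers=()))                  # 0.75
```

With Authorization in the key every request is a miss; without it, only the first request reaches the origin.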

529

MCQ · hard

An order processing API uses Amazon RDS for PostgreSQL. Application credentials must not be stored on the EC2 instances, and authentication should use short-lived credentials. What should the architect recommend?

A. IAM database authentication for RDS with an EC2 instance role

B. Store the database password in user data

C. Use a security group rule that allows only application instances

D. Embed the database password in the AMI

Answer: A

IAM database authentication allows the application to use temporary AWS credentials instead of stored database passwords.

Why this answer

IAM database authentication for RDS allows EC2 instances to authenticate to PostgreSQL using a short-lived token generated via the AWS CLI or SDK, instead of a static password. By assigning an IAM instance role to the EC2 instance, the application can obtain the token without storing any credentials on the instance, meeting both security requirements. This approach uses the IAM role's temporary security credentials to generate a password token that is valid for 15 minutes, after which a new token must be obtained.

Exam trap

The trap here is that candidates often confuse network-level controls (security groups) with authentication mechanisms, or they assume that storing credentials in user data or AMIs is acceptable because it is 'hidden,' but the exam explicitly tests the requirement for short-lived, non-persistent credentials.

How to eliminate wrong answers

Option B is wrong because storing the database password in user data leaves it in plaintext on the instance metadata, which can be accessed by any process or user with instance metadata access, and it does not use short-lived credentials. Option C is wrong because a security group rule only controls network access at the transport layer; it does not provide authentication credentials or eliminate the need to store them on the instance. Option D is wrong because embedding the database password in the AMI hardcodes the credential into the image, which persists across instances and cannot be rotated without rebuilding the AMI, violating the requirement for short-lived credentials.
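
For the instance role to request tokens, its identity policy has to allow the `rds-db:connect` action on the database user. The sketch below only builds that policy document; the Region, account ID, DB resource ID, and user name are placeholders, not values from the question.

```python
import json


def rds_connect_policy(region: str, account_id: str, db_resource_id: str, db_user: str) -> dict:
    """Identity policy letting a role request IAM auth tokens for one database user."""
    resource = f"arn:aws:rds-db:{region}:{account_id}:dbuser:{db_resource_id}/{db_user}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "rds-db:connect", "Resource": resource}
        ],
    }


# All four arguments are placeholders chosen for illustration.
policy = rds_connect_policy("us-east-1", "111122223333", "db-EXAMPLERESOURCEID", "app_user")
print(json.dumps(policy, indent=2))
```

With this policy attached to the instance role, the application would request a token at connection time (for example through the SDK's RDS token generator) and pass it as the PostgreSQL password over TLS.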

530

MCQ · medium

Based on the exhibit, the company wants DNS traffic to fail over automatically from the primary Region to a secondary Region when the primary endpoint is unhealthy. Which Route 53 change is best?

A. Keep simple routing and lower the TTL to 10 seconds.

B. Use weighted routing with equal weights for both ALBs.

C. Use geolocation routing so users in each continent reach a closer ALB.

D. Create Route 53 failover records with health checks for the primary and secondary ALBs.

Answer: D

Failover routing is the Route 53 policy intended for this use case. Route 53 returns the primary record while its health check passes, and automatically serves the secondary record when the primary health check fails. That provides DNS-based Regional failover without manual intervention.

Why this answer

Route 53 failover routing with health checks is the only option listed that automatically directs DNS traffic away from an unhealthy primary endpoint to a healthy secondary endpoint. When the health check for the primary ALB fails, Route 53 answers DNS queries with the secondary record instead, providing automatic failover across Regions. The other options either have no health checks (simple routing as described) or do not express a primary/secondary relationship (equal weighted routing, geolocation routing).

Exam trap

The trap here is that candidates often confuse weighted routing with failover. Equal weights send roughly half of the traffic to each Region all the time, which is an active-active design; the option also mentions no health checks, so nothing removes an unhealthy endpoint from DNS responses.

How to eliminate wrong answers

Option A is wrong because simple routing as described has no health checks or failover behavior; lowering the TTL only shortens DNS caching and does not change which record is returned when the endpoint is unhealthy. Option B is wrong because equal weighted routing splits traffic between both ALBs even while the primary is healthy, and as described it has no health checks, so an unhealthy endpoint keeps receiving traffic; the requirement is a primary that fails over to a secondary. Option C is wrong because geolocation routing directs users based on their geographic location, not on a primary/secondary relationship; it does not move all traffic from an unhealthy primary Region to a secondary Region.
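
The decision that failover routing makes can be modeled in a few lines. This is a simplified sketch with hypothetical ALB names; it ignores details such as what Route 53 returns when both records are unhealthy.

```python
from dataclasses import dataclass


@dataclass
class FailoverRecord:
    role: str        # "PRIMARY" or "SECONDARY"
    target: str      # DNS name of the ALB (hypothetical)
    healthy: bool    # latest health check result


def resolve(records: list) -> str:
    """Return the primary target while it is healthy, otherwise the secondary."""
    by_role = {r.role: r for r in records}
    primary, secondary = by_role["PRIMARY"], by_role["SECONDARY"]
    return primary.target if primary.healthy else secondary.target


records = [
    FailoverRecord("PRIMARY", "alb-use1.example.com", healthy=True),
    FailoverRecord("SECONDARY", "alb-usw2.example.com", healthy=True),
]
print(resolve(records))   # alb-use1.example.com

records[0].healthy = False   # the primary health check starts failing
print(resolve(records))   # alb-usw2.example.com
```

Contrast this with equal weighted routing, where both targets would be returned about half the time even while the primary is healthy.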

531

MCQ · hard

A mobile banking backend uses Amazon RDS for PostgreSQL. Application credentials must not be stored on the EC2 instances, and authentication should use short-lived credentials. What should the architect recommend?

A. Store the database password in user data

B. IAM database authentication for RDS with an EC2 instance role

C. Use a security group rule that allows only application instances

D. Embed the database password in the AMI

Answer: B

IAM database authentication allows the application to use temporary AWS credentials instead of stored database passwords.

Why this answer

IAM database authentication for RDS with an EC2 instance role is the correct approach because it eliminates the need to store credentials on the instance. The application uses the instance role's temporary credentials to request an authentication token, for example with the AWS CLI's `generate-db-auth-token` command or the equivalent SDK call. The token is valid for 15 minutes and is used as the password for the PostgreSQL connection, so no static password is stored and each new connection can use a freshly generated token.

Exam trap

The trap here is that candidates often confuse network-level controls (security groups) with authentication mechanisms, or they assume that storing credentials in user data or AMIs is acceptable because they are 'hidden', but the exam strictly requires no static credentials on the instance and short-lived tokens.

How to eliminate wrong answers

Option A is wrong because storing the database password in user data leaves it in plaintext on the instance metadata, which is accessible to any process or user with access to the instance, violating the requirement to not store credentials on EC2. Option C is wrong because a security group rule only controls network access at the transport layer; it does not address authentication or credential storage, and the application would still need a static password to connect. Option D is wrong because embedding the database password in the AMI hardcodes the credential into the image, which persists across instances and violates the principle of not storing credentials on the instance, plus it cannot provide short-lived credentials.

532

MCQ · hard

An IoT ingestion API uses Amazon RDS for PostgreSQL. Application credentials must not be stored on the EC2 instances, and authentication should use short-lived credentials. What should the architect recommend?

A. Store the database password in user data

B. Embed the database password in the AMI

C. IAM database authentication for RDS with an EC2 instance role

D. Use a security group rule that allows only application instances

Answer: C

IAM database authentication allows the application to use temporary AWS credentials instead of stored database passwords.

Why this answer

Option C is correct because IAM database authentication for RDS allows EC2 instances to authenticate to PostgreSQL using short-lived credentials obtained via an IAM instance role, eliminating the need to store long-term credentials on the instance. The EC2 instance assumes the role, retrieves a temporary authentication token (valid for 15 minutes), and uses it to connect to the RDS database, meeting both security requirements.

Exam trap

The trap here is that candidates confuse network-level controls (security groups) with authentication mechanisms, assuming that restricting traffic alone satisfies credential security, while the real requirement is about eliminating stored long-term credentials entirely.

How to eliminate wrong answers

Option A is wrong because storing the database password in user data is insecure — user data is accessible from within the instance and can be retrieved by any process or user with access, and it does not provide short-lived credentials. Option B is wrong because embedding the database password in an AMI creates a static credential that persists across instances launched from that AMI, violating the requirement for short-lived credentials and increasing the risk of credential exposure. Option D is wrong because a security group rule controls network access at the transport layer but does not address authentication or credential management; it cannot provide short-lived credentials or eliminate the need to store passwords on the instance.

533

MCQ · medium

Your organization uses IAM permission boundaries to prevent privilege escalation. A deployment role was created with a permission boundary. After an incident, you discover that an operator was later able to remove or change the permission boundary (the operator has iam:PutRolePermissionsBoundary permissions). You need to ensure operators cannot remove or change the permission boundary after it is set. What is the best security control to add?

A. Grant operators iam:PutRolePermissionsBoundary so they can reapply the boundary if needed.

B. Add an explicit IAM Deny for operators on both iam:PutRolePermissionsBoundary and iam:DeleteRolePermissionsBoundary for all affected roles.

C. Rely only on the role’s trust policy so operators cannot assume the role.

D. Attach a more permissive permission boundary so the roles remain functional after changes.

Answer: B

An explicit Deny prevents permission boundary updates or removal, even if the operator has allow permissions elsewhere. This directly protects the permission boundary integrity and maintains the privilege-limiting guardrail.

Why this answer

Option B is correct because adding an explicit IAM Deny for both `iam:PutRolePermissionsBoundary` and `iam:DeleteRolePermissionsBoundary` on the affected roles prevents operators from removing or changing the permission boundary, even if they have the corresponding Allow permissions. This is a classic use of an explicit Deny, which overrides any Allow in AWS IAM policy evaluation logic, ensuring the boundary remains immutable after deployment.

Exam trap

The trap here is that candidates often assume that simply not granting the `iam:PutRolePermissionsBoundary` permission is sufficient, but they overlook that an operator with broad IAM privileges (e.g., a wildcard such as `iam:*`) is still allowed to change the boundary unless an explicit Deny is added.

How to eliminate wrong answers

Option A is wrong because granting `iam:PutRolePermissionsBoundary` would allow the operator to reapply a different boundary, which directly enables the privilege escalation they are trying to prevent. Option C is wrong because relying solely on the role’s trust policy controls who can assume the role, but does not restrict the operator’s ability to modify the permission boundary via IAM API calls (e.g., from their own user account). Option D is wrong because attaching a more permissive permission boundary would expand the role’s effective permissions, defeating the purpose of using boundaries to limit privilege escalation.
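
The explicit Deny from answer B can be written as a small policy document. The sketch below builds one for a list of role ARNs; the role name and the `Sid` are illustrative, and where the policy is attached (operator group, operator role, or an SCP) is left as an assumption.

```python
import json


def deny_boundary_changes(role_arns: list) -> dict:
    """Policy that explicitly denies changing or removing a role's permissions boundary."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ProtectPermissionsBoundary",
                "Effect": "Deny",
                "Action": [
                    "iam:PutRolePermissionsBoundary",
                    "iam:DeleteRolePermissionsBoundary",
                ],
                "Resource": role_arns,
            }
        ],
    }


# Hypothetical role ARN.
policy = deny_boundary_changes(["arn:aws:iam::111122223333:role/deployment-role"])
print(json.dumps(policy, indent=2))
```

Because an explicit Deny wins over any Allow during policy evaluation, this statement holds even if the operator is later granted broader IAM permissions.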

534

MCQ · medium

A microservice runs in private subnets and must read exactly one AWS Secrets Manager secret using its IAM task role: arn:aws:secretsmanager:us-east-1:111122223333:secret:prod/db-pass-AbCdEf Security requires that every Secrets Manager API call comes only through a specific Interface VPC Endpoint (vpce-0a1b2c3d4e5f6g7h), and must not be reachable over any other network path. Which IAM policy change best enforces this requirement?

A. In the task role policy statement for secretsmanager:GetSecretValue on the secret ARN, add a condition that allows the action only when aws:SourceVpce equals vpce-0a1b2c3d4e5f6g7h.

B. Add a condition that allows secretsmanager:GetSecretValue only when aws:SourceIp is within 10.0.0.0/8.

C. Require TLS by adding a condition on aws:SecureTransport for the Secrets Manager permission.

D. Add a KMS condition using kms:ViaService=secretsmanager.us-east-1.amazonaws.com instead of restricting Secrets Manager directly.

Answer: A

For Interface VPC endpoints, aws:SourceVpce can be used as a condition key so Secrets Manager API authorization succeeds only when the request arrives through the specified endpoint. Restricting the IAM permission to aws:SourceVpce=vpce-... directly matches the requirement that calls must not traverse other network paths (e.g., via NAT/egress).

Why this answer

Option A is correct because the condition `aws:SourceVpce` in the IAM policy restricts the `secretsmanager:GetSecretValue` API call to originate only from the specified VPC Endpoint (vpce-0a1b2c3d4e5f6g7h). This ensures that the secret can only be accessed via that specific Interface Endpoint, blocking any other network path (e.g., internet, NAT gateway, or other VPC endpoints). The task role is attached to the microservice, so the policy directly enforces the security requirement at the API level.

Exam trap

The trap here is that candidates often confuse `aws:SourceVpce` with `aws:SourceIp` or `aws:SourceVpc`, thinking any network-level condition will work, but only `aws:SourceVpce` uniquely identifies the specific Interface VPC Endpoint required for this strict enforcement.

How to eliminate wrong answers

Option B is wrong because the `aws:SourceIp` condition key is not available for requests made through a VPC endpoint (the private address is exposed through `aws:VpcSourceIp` instead), and an IP range would not identify one specific endpoint in any case. Option C is wrong because requiring TLS (`aws:SecureTransport`) only ensures encryption in transit, not that the API call comes through a specific VPC Endpoint; it does not restrict the network path. Option D is wrong because `kms:ViaService` restricts KMS key usage to a specific AWS service (Secrets Manager), but it does not control which network path (e.g., VPC Endpoint) the Secrets Manager API call uses; it addresses KMS authorization, not network-level restriction.
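
Written out, the statement from answer A looks roughly like this. The sketch reuses the secret ARN and endpoint ID from the question and only assembles the JSON; it does not call AWS.

```python
import json

# Values taken from the question text.
SECRET_ARN = "arn:aws:secretsmanager:us-east-1:111122223333:secret:prod/db-pass-AbCdEf"
ENDPOINT_ID = "vpce-0a1b2c3d4e5f6g7h"

task_role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "secretsmanager:GetSecretValue",
            "Resource": SECRET_ARN,
            # The Allow applies only to requests that arrive through this endpoint.
            "Condition": {"StringEquals": {"aws:SourceVpce": ENDPOINT_ID}},
        }
    ],
}

print(json.dumps(task_role_policy, indent=2))
```

This limits what the task role's own policy grants. If other principals must also be blocked, a matching Deny with `StringNotEquals` in the secret's resource policy is a common complement, though the question asks only about the task role.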

535

MCQ · medium

A batch analytics job has unpredictable DynamoDB traffic with long idle periods and occasional spikes. Which capacity mode should minimize operational overhead and avoid paying for idle provisioned capacity? The design must avoid adding custom operational scripts.

A. DynamoDB on-demand capacity mode

B. Reserved capacity for maximum daily traffic

C. Provisioned capacity set for peak traffic

D. Global tables in every Region

Answer: A

On-demand capacity is suitable for unpredictable workloads and charges per request without capacity planning.

Why this answer

DynamoDB on-demand capacity mode automatically scales to handle unpredictable traffic spikes and idle periods without requiring any capacity planning or management. It charges only for the reads and writes you perform, eliminating the cost of idle provisioned capacity and avoiding the need for custom scripts to adjust capacity.

Exam trap

The trap here is that candidates may confuse 'reserved capacity' (a pricing discount for provisioned capacity) with a capacity mode, or assume that provisioned capacity set for peak traffic is cost-effective, ignoring the cost of idle periods.

How to eliminate wrong answers

Option B is wrong because reserved capacity is a pricing model for provisioned capacity, not a capacity mode; it requires you to commit to a specific throughput level and does not eliminate idle costs. Option C is wrong because setting provisioned capacity for peak traffic would result in paying for unused capacity during long idle periods, increasing costs and requiring manual or scripted adjustments. Option D is wrong because global tables replicate data across Regions for disaster recovery or low-latency access, not for managing capacity or cost optimization; they add complexity and cost without addressing idle capacity.
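
In API terms, the difference between the two capacity modes is one field. The sketch below builds parameter dictionaries shaped like a DynamoDB CreateTable request; the table name, key name, and capacity numbers are hypothetical.

```python
def table_definition(name: str, on_demand: bool, rcu: int = 0, wcu: int = 0) -> dict:
    """Build CreateTable-style parameters for either capacity mode."""
    params = {
        "TableName": name,
        "AttributeDefinitions": [{"AttributeName": "pk", "AttributeType": "S"}],
        "KeySchema": [{"AttributeName": "pk", "KeyType": "HASH"}],
    }
    if on_demand:
        # On-demand: billed per request, no throughput to plan or adjust.
        params["BillingMode"] = "PAY_PER_REQUEST"
    else:
        # Provisioned: the stated capacity is billed even while the table is idle.
        params["BillingMode"] = "PROVISIONED"
        params["ProvisionedThroughput"] = {
            "ReadCapacityUnits": rcu,
            "WriteCapacityUnits": wcu,
        }
    return params


print(table_definition("batch-results", on_demand=True))
print(table_definition("batch-results", on_demand=False, rcu=500, wcu=500))
```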

536

MCQ · hard

A document portal needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable? The architecture review board prefers a managed AWS-native control.

A. Amazon OpenSearch Service

B. AWS Config

C. Amazon EFS

D. Amazon SQS

Answer: A

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is a managed service that provides low-latency full-text search and analytics capabilities, making it ideal for indexing and searching product descriptions and filtered attributes. It is AWS-native and supports features like inverted indices, fuzzy search, and faceted filtering, which directly address the requirement for a high-performance document portal.

Exam trap

The trap here is that candidates may confuse Amazon CloudSearch (another managed search service) with OpenSearch Service, but the question emphasizes 'AWS-native control' and OpenSearch Service is the more modern, feature-rich choice for full-text search with filtering.

How to eliminate wrong answers

Option B is wrong because AWS Config is a service for resource inventory, compliance auditing, and configuration change tracking, not a full-text search engine. Option C is wrong because Amazon EFS is a scalable file storage service for shared access to files, not a search or indexing service. Option D is wrong because Amazon SQS is a fully managed message queuing service for decoupling microservices, not a search or query engine.

537

MCQ · easy

Your company allows application teams to create IAM roles. Each team must be prevented from granting permissions beyond a defined per-role baseline, even if they attach overly permissive identity-based policies to the role. Which AWS feature best enforces this ceiling at the IAM role level?

A. Use an Organizations service control policy (SCP) to cap the maximum permissions for role creation in each account

B. Attach a permission boundary to every role that teams create so the boundary limits the role’s maximum effective permissions

C. Rely on KMS key policies to restrict permissions because IAM policies cannot override KMS restrictions

D. Require multi-factor authentication (MFA) for all role creation requests and deny any request without MFA

Answer: B

A permission boundary acts as a permissions ceiling for the role. Even if the team attaches an identity-based policy that grants broader permissions, the role’s effective permissions are only those allowed by both the identity policy and the permission boundary. This prevents privilege escalation by role policy changes while still allowing teams to manage which policies are attached, within the boundary.

Why this answer

Permission boundaries are an AWS IAM feature that sets the maximum permissions that an identity-based policy can grant to an IAM role. When a permission boundary is attached to a role, the effective permissions are the intersection of the boundary and the role's identity-based policy, ensuring that even if a team attaches an overly permissive policy, the role cannot exceed the boundary's defined limits. This directly enforces a per-role ceiling on permissions, making option B the correct choice.

Exam trap

The trap here is that candidates often confuse SCPs with permission boundaries, thinking SCPs can enforce per-role limits, but SCPs apply to all principals in an account and cannot be scoped to individual roles, whereas permission boundaries are specifically designed for that purpose.

How to eliminate wrong answers

Option A is wrong because SCPs apply at the AWS account or organizational unit level, not at the individual IAM role level, and they cannot enforce a per-role ceiling within an account; they cap permissions for all principals in the account but do not provide granular control over each role's maximum permissions. Option C is wrong because KMS key policies control access to KMS keys, not IAM role permissions, and they are unrelated to setting a ceiling on what actions a role can perform via IAM policies. Option D is wrong because requiring MFA for role creation requests is an authentication control that does not limit the permissions granted to the role once it is created; it prevents unauthorized creation but does not enforce a permissions ceiling.
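
The "intersection" behavior can be illustrated with plain sets. This is a deliberately simplified model that treats policies as flat sets of allowed actions and ignores wildcards, resources, conditions, explicit denies, and SCPs; the action lists are made up for the example.

```python
def effective_permissions(identity_policy: set, boundary: set) -> set:
    """A role can use only the actions allowed by both the policy and the boundary."""
    return identity_policy & boundary


# Hypothetical boundary set by the central team.
boundary = {"s3:GetObject", "s3:PutObject", "logs:PutLogEvents"}

# Overly permissive policy attached by an application team.
identity_policy = {"s3:GetObject", "iam:CreateUser", "ec2:TerminateInstances"}

print(sorted(effective_permissions(identity_policy, boundary)))  # ['s3:GetObject']
```

The team's policy names three actions, but only the one that also appears in the boundary is usable.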

538

MCQ · medium

A test environment stores logs in S3. Logs are queried for 30 days, rarely accessed for one year, and then retained for compliance. What should reduce storage cost? The architecture review board prefers a managed AWS-native control.

A. Keep all logs in S3 Standard indefinitely

B. Move all logs immediately to S3 Glacier Deep Archive

C. S3 lifecycle policy that transitions objects to lower-cost storage classes over time

D. Use EBS snapshots for the logs

Answer: C

Lifecycle rules automate transitions based on age, matching storage cost to access patterns.

Why this answer

Option C is correct because S3 Lifecycle policies allow you to automate the transition of objects from S3 Standard to lower-cost storage classes like S3 Standard-IA (after 30 days) and then to S3 Glacier Deep Archive (after one year) for long-term compliance. This matches the access pattern of frequent queries for 30 days, rare access for a year, and then retention-only, minimizing storage costs without manual intervention.

Exam trap

The trap here is that candidates may choose Option B (immediate move to Glacier Deep Archive) thinking it maximizes cost savings, but they overlook the requirement for 30 days of queryable access, which Glacier Deep Archive cannot support due to its multi-hour retrieval times.

How to eliminate wrong answers

Option A is wrong because keeping all logs in S3 Standard indefinitely incurs the highest storage cost, ignoring the infrequent access and long-term retention requirements. Option B is wrong because moving all logs immediately to S3 Glacier Deep Archive eliminates the ability to query them for 30 days, as retrieval times are hours and not suitable for active queries. Option D is wrong because EBS snapshots are designed for block-level backups of EBS volumes, not for storing log files; they are not a cost-effective or managed-native solution for S3 log storage and would introduce unnecessary complexity and cost.
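
A lifecycle rule matching this pattern could look like the following. The structure mirrors an S3 bucket lifecycle configuration; the rule ID and the `logs/` prefix are hypothetical, and the day counts (30 and 365) follow the explanation above rather than any stated requirement.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-logs-by-age",          # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},      # hypothetical prefix
            "Transitions": [
                # Rarely accessed after the 30-day query window.
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # Retention only after about a year.
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

Once a rule like this is applied to the bucket, S3 performs the transitions itself, so no scheduled job or script is involved.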

539

Multi-Select · medium

An order lookup API repeatedly reads the same few items from DynamoDB. The application can tolerate slightly stale data for a few seconds, and the team wants the lowest-latency design with minimal application changes. Which two changes should they make? Select two.

Select 2 answers

A. Put Amazon DynamoDB Accelerator (DAX) in front of the table.

B. Use eventually consistent reads where the application can tolerate slightly stale data.

C. Switch all access to strongly consistent reads for faster results.

D. Increase the item size so fewer requests are needed.

E. Replace the table with Amazon EBS volumes mounted on EC2 instances.

Answers: A, B

DAX is an in-memory cache for DynamoDB reads, so repeated lookups for the same keys can be served with much lower latency than direct table reads. It is especially effective for hot-item access patterns like order lookups, product metadata, and profile reads.

Why this answer

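Amazon DynamoDB Accelerator (DAX) is an in-memory cache for DynamoDB that provides microsecond read latency, which is ideal for repeated reads of the same few items. Since the application can tolerate slightly stale data, DAX's write-through caching with its default item TTL of 5 minutes (which can be lowered to match a tolerance of a few seconds) ensures low latency with little application change beyond switching to the DAX client. Eventually consistent reads complete the pairing: they are the reads DAX serves from its cache, whereas strongly consistent reads are passed through to the table.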

Exam trap

The trap here is that candidates may think strongly consistent reads are always faster, but they actually have higher latency and cannot be cached by DAX, making them unsuitable for this low-latency, minimal-change requirement.
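
The caching pattern behind answer A can be sketched as a read-through cache with a TTL. This is a generic illustration, not the DAX client: the dictionary stands in for the DynamoDB table, and the 5-second TTL is an assumption taken from the "few seconds" staleness tolerance in the question.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory until an entry is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds: float, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}          # key -> (value, stored_at)
        self.backend_reads = 0

    def get(self, key):
        entry = self._entries.get(key)
        now = self._clock()
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]         # cache hit: no backend call
        value = self._fetch(key)    # cache miss: read the table
        self.backend_reads += 1
        self._entries[key] = (value, now)
        return value


table = {"order-1001": {"status": "shipped"}}   # stand-in for the DynamoDB table
cache = ReadThroughCache(fetch=table.get, ttl_seconds=5)

for _ in range(100):
    cache.get("order-1001")
print(cache.backend_reads)   # 1
```

One hundred lookups of the same hot key cost a single backend read; the price is that a change to the item may not be visible until the entry expires.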

540

MCQ · medium

An analytics dashboard uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The architecture review board prefers a managed AWS-native control.

A. S3 lifecycle policy

B. RDS read replica and route reporting queries to it

C. Multi-AZ standby and route reads to the standby

D. A larger NAT gateway

Answer: B

Read replicas offload read traffic from the primary instance.

Why this answer

B is correct because an RDS read replica is a fully managed, native AWS solution that offloads read-heavy reporting queries from the primary RDS MySQL instance. The read replica asynchronously replicates data using the MySQL binlog, allowing reporting traffic to be routed to it without impacting the primary database's write performance. This directly addresses the slowdown caused by many read-only queries while satisfying the architecture review board's preference for a managed AWS-native control.

Exam trap

The trap here is confusing the Multi-AZ standby (which is for failover only and cannot serve reads) with a read replica (which is specifically designed to offload read traffic), leading candidates to incorrectly select Option C as a managed solution for read scaling.

How to eliminate wrong answers

Option A is wrong because an S3 lifecycle policy manages object transitions and expirations in S3, not database read traffic; it cannot offload SQL queries from RDS. Option C is wrong because a Multi-AZ standby is designed for high availability and automatic failover, not for serving read traffic — it does not accept direct connections for reads, and any attempt to route reads to it would fail or require unsupported workarounds. Option D is wrong because a NAT gateway provides outbound internet access for private subnets and has no role in distributing database read queries; it cannot reduce load on an RDS primary instance.

541

MCQ · hard

Based on the exhibit, a static asset distribution site uses Amazon CloudFront with an S3 origin. The assets are versioned by filename, but the cache hit ratio remains low after each release. Which CloudFront change is the best way to improve cache reuse without changing the origin objects?

A. Keep the current cache key and increase the S3 bucket's storage class.

B. Remove Authorization and unnecessary query strings from the CloudFront cache key.

C. Disable the CloudFront cache so every request is served directly from S3.

D. Switch the origin from Amazon S3 to an Application Load Balancer.

Answer: B

Versioned static assets do not need Authorization in the cache key, and arbitrary query strings can destroy cache efficiency. Excluding those fields lets CloudFront reuse the same cached object across many viewers.

Why this answer

Option B is correct because removing Authorization headers and unnecessary query strings from the CloudFront cache key ensures that multiple requests for the same versioned asset (e.g., style.v2.css) share a single cached object, regardless of user-specific headers or irrelevant query parameters. This directly increases the cache hit ratio without modifying the origin objects, as CloudFront will serve the same cached response for identical cache keys.

Exam trap

The trap here is that candidates may think increasing storage class or switching to an ALB improves caching, but the real issue is the cache key composition—specifically, unnecessary headers or query strings fragmenting the cache—which is solved by adjusting the CloudFront cache key settings.

How to eliminate wrong answers

Option A is wrong because changing the S3 bucket's storage class (e.g., to S3 Standard-IA or Glacier) has no effect on CloudFront's cache key or cache hit ratio; it only affects storage cost and retrieval latency, not caching behavior. Option C is wrong because disabling the CloudFront cache would force every request to go directly to the S3 origin, eliminating all caching benefits and increasing latency and origin load, which is the opposite of improving cache reuse. Option D is wrong because switching the origin from S3 to an Application Load Balancer (ALB) introduces unnecessary complexity and does not address the cache key issue; the ALB would still require the same cache key optimization to improve cache hits, and it would not inherently improve cache reuse.
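
The effect of cache key composition can be illustrated with a small simulation. This is a simplified sketch, not CloudFront's actual key algorithm: the `cache_key` function, the sample URLs, and the header values are all hypothetical, and it only shows how including a per-user header or an arbitrary query string splits one versioned asset into several cache entries.

```python
from urllib.parse import urlsplit, parse_qsl


def cache_key(url, headers, key_headers=(), key_query_params=()):
    """Build a simplified cache key from the path plus only the
    headers and query parameters that are configured to be part of it."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    kept_query = tuple(sorted((k, v) for k, v in query.items() if k in key_query_params))
    kept_headers = tuple(sorted((h, headers.get(h, "")) for h in key_headers))
    return (parts.path, kept_query, kept_headers)


# Three viewers request the same versioned file (hypothetical sample data).
requests = [
    ("/assets/style.v2.css?utm_source=mail", {"Authorization": "Bearer user-1"}),
    ("/assets/style.v2.css?utm_source=ads", {"Authorization": "Bearer user-2"}),
    ("/assets/style.v2.css", {"Authorization": "Bearer user-3"}),
]

# Key includes Authorization and the query string: one entry per viewer.
fragmented = {
    cache_key(u, h, key_headers=("Authorization",), key_query_params=("utm_source",))
    for u, h in requests
}

# Key is the path only: all viewers share one cached object.
shared = {cache_key(u, h) for u, h in requests}

print(len(fragmented), len(shared))
```

With the path-only key, every viewer maps to the same entry, which is the behaviour Option B is aiming for.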

Full explanation →

542

MCQhard

A warehouse integration service must use shared file storage across Linux EC2 instances in multiple Availability Zones. The storage must remain available during an AZ failure. Which service should be used?

A.Amazon EFS with mount targets in multiple Availability Zones

B.S3 mounted as a POSIX file system without a file gateway

C.Instance store volumes

D.An EBS volume attached to all instances

AnswerA

EFS is regional file storage and supports mount targets across AZs.

Why this answer

Amazon EFS provides a fully managed, scalable, and elastic NFS file system that can be mounted concurrently on multiple Linux EC2 instances across different Availability Zones. By configuring mount targets in each AZ, the file system remains accessible even if one AZ fails, because the other mount targets continue to serve traffic. This meets the requirement for shared, highly available file storage across AZs.

Exam trap

The trap here is that candidates may confuse EBS multi-attach (which has strict limitations and is not suitable for shared file systems across AZs) with a true distributed file system like EFS, or assume that S3 with a FUSE driver can replace a POSIX-compliant shared file system.

How to eliminate wrong answers

Option B is wrong because mounting S3 as a POSIX file system without a file gateway (e.g., using s3fs-fuse) does not provide true POSIX semantics (e.g., file locking, atomic operations) and introduces performance and consistency issues; it is not a native shared file system for Linux EC2 instances. Option C is wrong because instance store volumes are ephemeral and tied to a single EC2 instance; they are lost if the instance stops or fails, and cannot be shared across instances or survive an AZ failure. Option D is wrong because an EBS volume exists in a single Availability Zone and can normally be attached to only one EC2 instance at a time; Multi-Attach is limited to io1/io2 volumes, works only for instances in the same AZ as the volume, and is not designed for shared file system workloads across AZs.

Full explanation →

543

MCQhard

Based on the exhibit, what is the best change to improve read performance without increasing write latency on the primary database?

A.Create an RDS read replica and direct the reporting queries to the replica endpoint.

B.Convert the DB instance to Multi-AZ so the primary can serve more reads.

C.Increase the primary instance class to a larger size and keep all traffic on one writer.

D.Migrate the reporting workload to DynamoDB to gain faster reads.

AnswerA

A read replica offloads the long-running read-only reports from the primary database, which preserves write performance and reduces read latency for the reporting workload. Because the business accepts slightly stale report data, the asynchronous replication delay is acceptable. This is the most direct and AWS-native way to separate read pressure from writes.

Why this answer

Creating an RDS read replica offloads read-heavy reporting queries from the primary database instance, improving read performance without adding any write latency to the primary. The replica operates asynchronously, so writes on the primary are not blocked or delayed by the replica's lag. This is the standard AWS solution for scaling read traffic on RDS.

Exam trap

The trap here is that candidates confuse Multi-AZ with read scaling, assuming the standby instance can serve reads, when in fact Multi-AZ only provides failover redundancy and the standby is not accessible for read operations.

How to eliminate wrong answers

Option B is wrong because Multi-AZ is designed for high availability and automatic failover, not for scaling read capacity; the standby instance cannot serve reads directly. Option C is wrong because increasing the instance class would improve both read and write performance, but it does not isolate the reporting workload, so it could still increase write latency under heavy read load. Option D is wrong because migrating to DynamoDB is an architectural change that would require application rewrites and does not directly address improving read performance on the existing primary database without increasing write latency.
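
One way an application might separate the two kinds of traffic is sketched below. The endpoint hostnames and the `pick_endpoint` helper are hypothetical, and the statement check is deliberately naive; in practice the reporting tool is often simply configured with the replica endpoint.

```python
# Hypothetical endpoint names; real RDS endpoints come from the console or API.
WRITER_ENDPOINT = "orders-db.example.internal"
REPLICA_ENDPOINT = "orders-db-replica.example.internal"


def pick_endpoint(sql: str) -> str:
    """Send plain SELECT statements to the replica and everything else
    to the primary. Reports served this way may be slightly stale because
    replication to the replica is asynchronous."""
    words = sql.strip().split(None, 1)
    first_word = words[0].upper() if words else ""
    return REPLICA_ENDPOINT if first_word == "SELECT" else WRITER_ENDPOINT


print(pick_endpoint("SELECT region, SUM(total) FROM orders GROUP BY region"))
print(pick_endpoint("INSERT INTO orders (id, total) VALUES (1, 9.99)"))
```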

Full explanation →

544

MCQmedium

A web application runs on an Auto Scaling group (ASG) behind an Application Load Balancer (ALB). The ASG is currently attached to subnets in only two Availability Zones (AZs). During a planned maintenance window, one AZ becomes unavailable for about 25 minutes. Monitoring shows that targets in the remaining AZ remain healthy, and the ALB/target group health checks report normal. However, users still experience intermittent connection failures and slower responses during the AZ outage. What change will most directly improve resilience against an AZ loss while keeping the same ALB-based design?

A.Set the ASG min capacity to 0 so instances can be recreated faster when an AZ recovers.

B.Extend the ASG to use subnets in three AZs so there is placement redundancy during an AZ outage, while continuing to keep traffic behind the ALB.

C.Increase the ALB idle timeout to 120 seconds to reduce connection drops.

D.Disable health checks on the target group so instances are not deregistered during the maintenance window.

AnswerB

An AZ outage reduces the number of AZs where the ASG can place instances. With only two AZs, losing one significantly limits capacity and can cause temporary shortages and uneven load distribution, even if existing targets are marked healthy. Expanding the ASG to subnets in three (or more) AZs provides additional placement options so the ASG can maintain the desired number of instances across the remaining AZ(s). The ALB will continue routing only to healthy targets, and the system is more likely to sustain stable response times during the outage.

Why this answer

B is correct because deploying the ASG across three Availability Zones (AZs) ensures that when one AZ becomes unavailable, the remaining two AZs can handle the full traffic load without overloading the instances. This placement redundancy directly addresses the intermittent connection failures and slower responses, as the ALB can distribute traffic only to healthy targets in the remaining AZs, maintaining capacity and performance. The current two-AZ setup lacks sufficient buffer capacity, causing the single remaining AZ to become overwhelmed during the outage.

Exam trap

The trap here is that candidates may focus on connection-level settings (idle timeout) or health check behavior, missing the fundamental architectural need for multi-AZ redundancy to maintain capacity during an AZ outage.

How to eliminate wrong answers

Option A is wrong because setting the ASG min capacity to 0 does not help during an AZ outage; it would actually allow all instances to be terminated, making the application unavailable, and it does not address the lack of capacity in the remaining AZ. Option C is wrong because increasing the ALB idle timeout to 120 seconds only keeps idle connections open longer, which does not prevent connection failures or slow responses caused by insufficient capacity in the remaining AZ; it may even mask underlying issues. Option D is wrong because disabling health checks on the target group would prevent the ALB from deregistering unhealthy instances, causing traffic to be routed to failed instances in the unavailable AZ, leading to more connection failures and no improvement in resilience.
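
The capacity argument can be checked with a little arithmetic. The sketch below assumes instances are spread evenly across AZs, and the instance counts are made-up numbers for illustration only.

```python
import math


def remaining_capacity_fraction(az_count: int, failed_azs: int = 1) -> float:
    """Fraction of an evenly spread fleet that survives losing `failed_azs` AZs."""
    if az_count <= 0 or not 0 <= failed_azs <= az_count:
        raise ValueError("failed_azs must be between 0 and az_count")
    return (az_count - failed_azs) / az_count


def instances_per_az_to_survive_one_loss(required: int, az_count: int) -> int:
    """Instances to run in each AZ so that `required` instances
    are still running after any single AZ is lost."""
    if az_count < 2:
        raise ValueError("need at least two AZs to survive an AZ loss")
    return math.ceil(required / (az_count - 1))


for azs in (2, 3):
    per_az = instances_per_az_to_survive_one_loss(required=6, az_count=azs)
    print(azs, remaining_capacity_fraction(azs), per_az, per_az * azs)
```

For a workload that needs 6 instances, two AZs require 6 per AZ (12 in total) to ride out an outage, while three AZs require 3 per AZ (9 in total), and a fleet that is not over-provisioned keeps two thirds of its capacity instead of half.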

Full explanation →

545

MCQmedium

A solutions architect is designing an S3 bucket for a claims portal. The objects must never be publicly accessible, even if a developer later adds an overly broad bucket policy. What should the architect configure? The design must avoid adding custom operational scripts.

A.Enable S3 Block Public Access at the account or bucket level

B.Create an IAM policy that denies s3:GetObject to anonymous users

C.Enable server access logging on the bucket

D.Enable S3 Transfer Acceleration

AnswerA

S3 Block Public Access prevents public ACLs and public bucket policies from exposing the bucket.

Why this answer

Option A is correct because S3 Block Public Access provides a definitive override that prevents any public access to S3 objects, even if a bucket policy or ACL later grants public access. This setting can be applied at the account or bucket level and ensures that all access is denied to anonymous users, meeting the requirement without custom scripts.

Exam trap

The trap here is that candidates may think an IAM policy can block anonymous users, but IAM policies only apply to authenticated IAM principals, not to anonymous (unauthenticated) requests, making S3 Block Public Access the only effective solution.

How to eliminate wrong answers

Option B is wrong because an IAM policy that denies s3:GetObject to anonymous users is not effective; anonymous users are not IAM principals, so IAM policies do not apply to them. Option C is wrong because server access logging records requests but does not enforce access controls or prevent public access. Option D is wrong because S3 Transfer Acceleration speeds up uploads over long distances but has no effect on access permissions or public accessibility.
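
As a rough sketch, the four Block Public Access settings look like this when expressed as the configuration structure that the S3 API accepts. The bucket name is hypothetical, and the boto3 call is shown only as a comment so the snippet runs without AWS credentials.

```python
# All four settings enabled: new public ACLs and public bucket policies are
# rejected, and any existing public ACLs or policies are ignored or restricted.
public_access_block = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("s3").put_public_access_block(
#       Bucket="claims-portal-bucket",  # hypothetical bucket name
#       PublicAccessBlockConfiguration=public_access_block,
#   )

print(all(public_access_block.values()))
```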

Full explanation →

546

MCQhard

Based on the exhibit, a partner account uploads encrypted objects to a central S3 bucket and later reads them back. The S3 permissions are correct, but the requests still fail. What change is required so the partner workload can use the customer-managed KMS key safely?

A.Replace SSE-KMS with S3 object ACLs so the partner account can bypass KMS authorization.

B.Create a new bucket in the partner account and copy the objects there to avoid cross-account encryption.

C.Switch the bucket to SSE-S3 so the partner role no longer needs KMS permissions.

D.Update the CMK key policy, or add a tightly scoped grant, to allow the partner role the required KMS actions through S3.

AnswerD

Cross-account access to SSE-KMS encrypted objects requires KMS authorization in addition to S3 authorization. The key policy must trust the partner role, and the permissions should be limited to the needed KMS actions such as Decrypt, Encrypt, and GenerateDataKey with a service condition for S3. That is why the partner can have valid S3 permissions and still fail until the KMS policy is fixed.

Why this answer

Option D is correct because when using a customer-managed KMS key (CMK) for SSE-KMS in a cross-account scenario, the key policy must explicitly grant the partner account's IAM role the necessary KMS actions (kms:Decrypt, kms:GenerateDataKey) to allow S3 to perform the encryption/decryption on behalf of the partner. The partner role also needs an IAM policy in its own account that allows the same KMS actions on the key. Without the key policy update or a tightly scoped grant, the KMS service will deny the request even if S3 bucket policies are correctly configured.

Exam trap

The trap here is that candidates assume S3 bucket policies alone control all access, forgetting that SSE-KMS introduces a separate authorization layer at KMS that requires explicit cross-account permissions in the key policy.

How to eliminate wrong answers

Option A is wrong because S3 object ACLs cannot bypass KMS authorization; ACLs control access to the object itself, not the encryption key, and removing SSE-KMS would violate security requirements. Option B is wrong because copying objects to a new bucket in the partner account does not resolve the underlying KMS authorization issue; the partner still needs access to the CMK to decrypt the objects. Option C is wrong because switching to SSE-S3 would remove the use of the customer-managed key, which may be a compliance or security requirement, and does not address the need for cross-account access with a CMK.
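
A key policy statement along these lines might look like the sketch below. The account ID, role name, and Region are placeholders invented for illustration; the exhibit referenced by the question is not available here, so the exact actions and conditions it calls for are an assumption.

```python
import json


def partner_key_policy_statement(partner_role_arn: str, region: str) -> dict:
    """Key policy statement that lets one partner role use the key,
    but only for requests that arrive through S3 in the given Region."""
    return {
        "Sid": "AllowPartnerUseThroughS3",
        "Effect": "Allow",
        "Principal": {"AWS": partner_role_arn},
        "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
        "Resource": "*",  # in a key policy, "*" refers to the key itself
        "Condition": {
            "StringEquals": {"kms:ViaService": f"s3.{region}.amazonaws.com"}
        },
    }


statement = partner_key_policy_statement(
    "arn:aws:iam::111122223333:role/PartnerUploadRole",  # hypothetical ARN
    "us-east-1",  # hypothetical Region
)
print(json.dumps(statement, indent=2))
```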

Full explanation →

547

MCQmedium

A risk simulation workload uses CloudWatch Logs heavily. Retaining all debug logs forever is increasing costs. What should be configured?

A.CloudWatch Logs retention policies per log group

B.AWS Config aggregation

C.CloudWatch detailed monitoring on all instances

D.Route 53 health checks

AnswerA

Retention policies automatically delete older logs after the required period.

Why this answer

CloudWatch Logs retention policies allow you to set per-log-group expiration rules (e.g., 30 days, 90 days) to automatically delete old log events, directly reducing storage costs for debug logs that are no longer needed. This is the most cost-effective and targeted solution for managing log lifecycle without affecting other monitoring or configuration services.

Exam trap

The trap here is that candidates may confuse log retention with monitoring frequency or configuration management, mistakenly thinking that reducing metric collection (detailed monitoring) or using Config aggregation will lower log storage costs.

How to eliminate wrong answers

Option B is wrong because AWS Config aggregation is used to collect and centrally view configuration and compliance data from multiple accounts/regions, not to manage log retention or storage costs. Option C is wrong because CloudWatch detailed monitoring on all instances increases metric frequency (1-minute intervals) and incurs additional costs, doing nothing to control log retention or delete old debug logs. Option D is wrong because Route 53 health checks monitor endpoint availability and DNS routing, not log storage or retention policies.
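
A minimal sketch of how per-log-group retention could be expressed is shown below. The log group names and retention periods are hypothetical; the boto3 call appears only as a comment so the snippet runs without AWS access.

```python
# Hypothetical log groups mapped to the number of days their events are kept.
retention_days = {
    "/risk-sim/debug": 30,
    "/risk-sim/application": 90,
}

# Arguments for the CloudWatch Logs PutRetentionPolicy operation, one per group.
# retentionInDays must be one of the values the service accepts (30 and 90 are).
requests = [
    {"logGroupName": name, "retentionInDays": days}
    for name, days in retention_days.items()
]

# With boto3 installed and credentials configured, each could be applied as:
#   import boto3
#   logs = boto3.client("logs")
#   for kwargs in requests:
#       logs.put_retention_policy(**kwargs)

for kwargs in requests:
    print(kwargs)
```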

Full explanation →

548

MCQmedium

A patient portal receives bursts of orders that sometimes overwhelm a downstream fulfilment service. The architecture must absorb spikes and retry processing without losing requests. Which service should be placed between the web tier and fulfilment workers? The team wants the control to be enforceable during normal operations.

A.AWS WAF

B.Amazon CloudFront

C.Amazon SQS queue

D.Amazon Route 53 weighted routing

AnswerC

SQS decouples producers and consumers, buffers bursts, and supports retries through visibility timeout and dead-letter queues.

Why this answer

Amazon SQS is the correct choice because it acts as a durable buffer between the web tier and fulfilment workers, decoupling the producers from consumers. When bursts of orders arrive, SQS queues the messages and allows the fulfilment service to poll and process them at its own pace, absorbing spikes without data loss. The queue provides at-least-once delivery: a message that is not deleted becomes visible again after its visibility timeout and is retried, and a dead-letter queue captures messages that keep failing, ensuring no requests are lost even if processing fails.

Exam trap

The trap here is that candidates confuse buffering and decoupling (SQS) with traffic distribution (Route 53) or security filtering (WAF), or they mistakenly think a CDN (CloudFront) can handle asynchronous order processing, but none of those services provide durable message storage or retry logic.

How to eliminate wrong answers

Option A is wrong because AWS WAF is a web application firewall that filters HTTP/S traffic based on rules (e.g., SQL injection, XSS), not a message queue; it cannot buffer or retry order processing. Option B is wrong because Amazon CloudFront is a content delivery network (CDN) that caches and accelerates static/dynamic content delivery, not a queue for asynchronous message passing; it does not provide durable storage for order requests. Option D is wrong because Amazon Route 53 weighted routing distributes DNS traffic across multiple endpoints based on weights, but it does not buffer or retry requests; it only controls which server receives a request, and if the downstream service is overwhelmed, requests are still lost or fail.
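
The retry behaviour can be pictured with an in-memory simulation. This is a toy model, not the SQS API: `process_with_retries`, the handler, and the order names are hypothetical, and timing is ignored, so a failed message simply goes back on the queue as it would once its visibility timeout expired.

```python
from collections import deque


def process_with_retries(messages, handler, max_receive_count=3):
    """Process messages with retries; a message that has been received
    `max_receive_count` times without success goes to a dead-letter list."""
    queue = deque((body, 0) for body in messages)
    done, dead_letter = [], []
    while queue:
        body, receives = queue.popleft()
        receives += 1
        try:
            handler(body)
            done.append(body)  # a worker deletes the message after success
        except Exception:
            if receives >= max_receive_count:
                dead_letter.append(body)  # kept for inspection, not lost
            else:
                queue.append((body, receives))  # visible again for a retry
    return done, dead_letter


attempts = {}


def handler(body):
    """Hypothetical worker: order-2 fails once, order-3 always fails."""
    attempts[body] = attempts.get(body, 0) + 1
    if body == "order-3":
        raise RuntimeError("permanent failure")
    if body == "order-2" and attempts[body] == 1:
        raise RuntimeError("transient failure")


done, dead = process_with_retries(["order-1", "order-2", "order-3"], handler)
print(done, dead)
```

The transient failure succeeds on its second attempt, while the message that never succeeds ends up in the dead-letter list after three receives instead of disappearing.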

Full explanation →

549

MCQmedium

A warehouse integration service receives bursts of orders that sometimes overwhelm a downstream fulfilment service. The architecture must absorb spikes and retry processing without losing requests. Which service should be placed between the web tier and fulfilment workers? The architecture review board prefers a managed AWS-native control.

A.AWS WAF

B.Amazon Route 53 weighted routing

C.Amazon SQS queue

D.Amazon CloudFront

AnswerC

SQS decouples producers and consumers, buffers bursts, and supports retries through visibility timeout and dead-letter queues.

Why this answer

Amazon SQS is the correct choice because it acts as a fully managed message queue that decouples the web tier from the fulfilment workers, buffering incoming order bursts. It provides at-least-once delivery and allows workers to poll messages at their own pace, ensuring no requests are lost even during spikes. Messages that fail processing become visible again after the visibility timeout and are retried, and a dead-letter queue (DLQ) captures those that keep failing, meeting the requirement for resilient, managed AWS-native control.

Exam trap

The trap here is that candidates may confuse AWS WAF or CloudFront as tools for handling traffic spikes, but neither provides the decoupling, buffering, and retry capabilities of a queue; they are designed for security and content delivery, respectively, not for asynchronous processing.

How to eliminate wrong answers

Option A is wrong because AWS WAF is a web application firewall that filters HTTP/S traffic based on rules (e.g., SQL injection, XSS) and does not provide message buffering, queuing, or retry logic for downstream services. Option B is wrong because Amazon Route 53 weighted routing distributes DNS traffic across multiple endpoints based on weights, but it does not absorb spikes or provide retry mechanisms; it only controls which endpoint receives a request, and a failed request is lost unless the client retries. Option D is wrong because Amazon CloudFront is a content delivery network (CDN) that caches static and dynamic content at edge locations to reduce latency, but it cannot buffer or retry requests for a downstream fulfilment service; it is designed for accelerating content delivery, not for decoupling or absorbing processing spikes.

Full explanation →

550

Multi-Selecthard

A nightly video rendering pipeline runs on Linux EC2 instances and is compatible with ARM64. The jobs are CPU-bound, checkpoint frequently, and can resume if interrupted. The business wants the best throughput per dollar for the batch window. Which two changes should the team make? Select two.

Select 2 answers

A.Use AWS Graviton-based instances for the render workers.

B.Run the workers in an Auto Scaling group with Spot Instances for interruption-tolerant capacity.

C.Use a single large x86 instance with On-Demand pricing to avoid interruptions.

D.Replace the batch workers with a Lambda function to eliminate instance management.

E.Move the workload to a spread placement group to increase cost efficiency.

AnswersA, B

Graviton instances are ARM-based and often deliver better price-performance than comparable x86 instances for CPU-bound workloads. Because the application is already compatible with ARM64, the team can adopt Graviton without rewriting the pipeline. That improves throughput per dollar while keeping the same batch-processing model.

Why this answer

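AWS Graviton-based instances use ARM64 architecture, which is explicitly compatible with the video rendering pipeline. They offer up to 40% better price-performance compared to comparable x86 instances for CPU-bound workloads, directly improving throughput per dollar. This makes option A correct for maximizing cost efficiency. Option B is also correct: because the jobs checkpoint frequently and can resume after an interruption, the workers can run on Spot Instances, which are discounted by up to 90% compared with On-Demand pricing.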

Exam trap

The trap here is that candidates may overlook the compatibility requirement with ARM64 and choose a single large x86 instance for simplicity, or mistakenly think Lambda can handle long-running CPU-bound tasks, missing the cost and throughput benefits of Graviton and Spot Instances.

Full explanation →

551

MCQeasy

Company A must allow workloads in Company B to assume an IAM role in Company A (RoleInA). To mitigate confused-deputy attacks, a Security requirement is to use an External ID. Company A should restrict who can assume RoleInA. Which trust-policy configuration is the best choice?

A.In Company A role trust policy, allow sts:AssumeRole for principal "arn:aws:iam::<company-b-account-id>:root" with no sts:ExternalId condition.

B.In Company A role trust policy, allow sts:AssumeRole only for principal "arn:aws:iam::<company-b-account-id>:role/<specific-role-in-b>" and require a condition where sts:ExternalId equals the expected External ID value.

C.In the trust policy, allow iam:PassRole for the Company B principal and include an sts:ExternalId condition.

D.In Company A, grant Company B access using an IAM permissions policy attached to RoleInA instead of using a trust policy.

AnswerB

Restricting the principal to the specific intended role limits who can assume RoleInA. Requiring the correct sts:ExternalId in the trust policy mitigates confused-deputy attacks.

Why this answer

Option B is correct because it restricts the trust policy to a specific IAM role in Company B (using the principal ARN) and requires the `sts:ExternalId` condition to match a predefined value. This ensures only the intended role in Company B can assume RoleInA, and the External ID prevents a confused-deputy attack by requiring the third party to provide a unique identifier that only the legitimate service knows.

Exam trap

The trap here is that candidates often confuse `iam:PassRole` with `sts:AssumeRole` or think that a permissions policy can restrict who assumes a role, but only the trust policy defines the trusted principals and conditions for role assumption.

How to eliminate wrong answers

Option A is wrong because it allows the entire Company B account (root principal) to assume the role without any External ID condition, which violates the security requirement and leaves the role open to confused-deputy attacks. Option C is wrong because `iam:PassRole` is the permission for passing a role to an AWS service, not for assuming a role; the trust policy must allow `sts:AssumeRole`, so a statement that allows `iam:PassRole` does not let Company B assume RoleInA even with an External ID condition. Option D is wrong because an IAM permissions policy attached to RoleInA controls what the role can do after it is assumed, but it does not control who can assume the role; the trust policy is the only place to define the trusted principals and conditions for assuming the role.
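
The trust policy described in Option B can be sketched as follows. The account ID, role name, and External ID value are placeholders made up for illustration.

```python
import json


def trust_policy(trusted_role_arn: str, external_id: str) -> dict:
    """Trust policy that lets exactly one role assume RoleInA,
    and only when it supplies the agreed External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": trusted_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


policy = trust_policy(
    "arn:aws:iam::444455556666:role/CompanyBWorkloadRole",  # hypothetical ARN
    "example-external-id-123",  # hypothetical External ID
)
print(json.dumps(policy, indent=2))
```

Company B would then pass the same External ID when it calls AssumeRole; a request from any other principal, or one without the matching value, is not trusted by this policy.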

Full explanation →

552

Multi-Selecthard

A company is encrypting sensitive S3 data for a mobile banking backend with AWS KMS. Which two controls help prevent accidental use of the KMS key by unauthorized principals?

Select 2 answers

A.A larger KMS key rotation period

B.A key policy that limits key administrators and key users

C.S3 Transfer Acceleration

D.IAM policies that grant kms:Decrypt only to required application roles

AnswersB, D

The KMS key policy is the primary resource policy that controls who can administer or use the key.

Why this answer

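Option B is correct because a KMS key policy explicitly defines which principals (IAM users, roles, AWS accounts) can administer or use the key. By limiting key administrators and key users in the key policy, you prevent unauthorized principals from accidentally invoking KMS operations on that key, even if they have broad IAM permissions. This is a direct access control mechanism at the key level. Option D is also correct: where the key policy delegates access decisions to IAM, granting kms:Decrypt only to the application roles that need it keeps other principals in the account from using the key.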

Exam trap

The trap here is that candidates often think IAM policies alone are sufficient for KMS access control, but they forget that an IAM policy can grant access to a KMS key only if the key policy allows it, so the key policy can limit access regardless of how broad the IAM permissions are.

Full explanation →

553

MCQmedium

Based on the exhibit, which AWS service should the team use so the database password can rotate automatically every 30 days and the application can retrieve it securely at runtime?

A.AWS Systems Manager Parameter Store with a standard String parameter

B.AWS Secrets Manager

C.Amazon Cognito user pools

D.AWS Key Management Service customer managed keys

AnswerB

Secrets Manager is built for storing and rotating credentials such as database passwords. It supports secret versioning, fine-grained access control, and managed rotation workflows, making it the best fit for a 30-day automated rotation requirement. The application can retrieve the current secret at runtime without embedding the password in code or environment variables.

Why this answer

AWS Secrets Manager is the correct service because it natively supports automatic rotation of database passwords on a configurable schedule (e.g., every 30 days) and provides secure retrieval at runtime via the AWS SDK, CLI, or Secrets Manager API. Unlike Parameter Store, Secrets Manager is designed specifically for managing secrets with built-in rotation, encryption, and fine-grained access control.

Exam trap

The trap here is that candidates often confuse AWS Systems Manager Parameter Store (which can store encrypted parameters) with Secrets Manager, but Parameter Store lacks native automatic rotation and is not optimized for managing database credentials with scheduled rotation.

How to eliminate wrong answers

Option A is wrong because AWS Systems Manager Parameter Store with a standard String parameter does not support automatic rotation of secrets; it is intended for plaintext or encrypted configuration data, not for managing database credentials with lifecycle rotation. Option C is wrong because Amazon Cognito user pools are designed for user authentication and identity management, not for storing or rotating application database passwords. Option D is wrong because AWS KMS customer managed keys are used for encryption and decryption operations, not for storing secrets or managing their rotation; they can encrypt secrets but do not provide rotation or retrieval of the secret value itself.
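
A 30-day rotation schedule might be expressed as in the sketch below. The secret name and Lambda ARN are hypothetical, and the boto3 calls are shown only as comments so the snippet runs without AWS access.

```python
# Arguments for the Secrets Manager RotateSecret operation.
rotation_request = {
    "SecretId": "prod/app/db-password",  # hypothetical secret name
    "RotationLambdaARN": (
        "arn:aws:lambda:us-east-1:111122223333:function:rotate-db-password"
    ),  # hypothetical rotation function
    "RotationRules": {"AutomaticallyAfterDays": 30},
}

# With boto3 installed and credentials configured:
#   import boto3
#   sm = boto3.client("secretsmanager")
#   sm.rotate_secret(**rotation_request)
#   # At runtime the application fetches the current value instead of
#   # embedding the password in code or environment variables:
#   password = sm.get_secret_value(SecretId="prod/app/db-password")["SecretString"]

print(rotation_request["RotationRules"])
```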

Full explanation →

554

MCQmedium

A team stores application logs in an S3 bucket. They keep logs for 18 months for compliance. Access patterns: logs are heavily accessed during the first 30 days, rarely accessed between days 31 and 180, and almost never accessed after day 180. They currently store everything in S3 Standard and want to reduce storage cost without violating the 18-month retention requirement. What should they implement?

A.Leave logs in S3 Standard for 18 months and add a tag for internal reporting

B.Create an S3 lifecycle policy to transition logs to Standard-IA after 30 days and to Glacier Deep Archive after 180 days

C.Immediately move all logs to Glacier Instant Retrieval and expire after 18 months

D.Enable versioning and rely on object lifecycle expiration to reduce costs; do not change storage classes

AnswerB

Storage class transitions align cost with access frequency while still keeping objects for the full compliance period.

Why this answer

Option B is correct because an S3 lifecycle policy can automatically transition objects from S3 Standard to S3 Standard-IA after 30 days (matching the heavy-access period) and then to S3 Glacier Deep Archive after 180 days (matching the near-zero-access period). This minimizes storage costs while retaining logs for the required 18 months, as Glacier Deep Archive offers the lowest storage cost for long-term archival data.

Exam trap

The trap here is that candidates may choose Option C, mistakenly thinking Glacier Instant Retrieval is the cheapest archival class, but it costs more to store than Glacier Deep Archive for data that is almost never accessed, and an immediate transition adds retrieval charges while the logs are still heavily accessed during the first 30 days.

How to eliminate wrong answers

Option A is wrong because leaving logs in S3 Standard for 18 months incurs the highest storage cost, and adding a tag does not reduce cost or change the storage class. Option C is wrong because immediately moving all logs to S3 Glacier Instant Retrieval incurs per-GB retrieval charges during the heavily accessed first 30 days and does not align with the access pattern; also, Glacier Instant Retrieval is designed for data accessed about once a quarter, and Glacier Deep Archive is cheaper for data that is almost never accessed after 180 days. Option D is wrong because enabling versioning increases storage costs by retaining multiple versions of objects, and object lifecycle expiration alone does not change storage classes to lower-cost tiers; it only deletes objects, which would violate the 18-month retention requirement if set to expire earlier.
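
The lifecycle rule in Option B could be written roughly as below, in the structure the S3 lifecycle API accepts. The rule ID, the `logs/` prefix, and the bucket name are hypothetical, and treating 18 months as 548 days for the optional expiration is an assumption.

```python
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "tier-and-expire-app-logs",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},  # hypothetical key prefix
            "Transitions": [
                # Days 0-30 stay in S3 Standard while access is heavy.
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # After day 180 the logs are almost never read.
                {"Days": 180, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # Assumption: 18 months approximated as 548 days; objects are
            # deleted only after the retention period has passed.
            "Expiration": {"Days": 548},
        }
    ]
}

# With boto3 installed and credentials configured, this could be applied as:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="app-log-bucket",  # hypothetical bucket name
#       LifecycleConfiguration=lifecycle_configuration,
#   )

rule = lifecycle_configuration["Rules"][0]
print([(t["Days"], t["StorageClass"]) for t in rule["Transitions"]])
```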

Full explanation →

555

Multi-Selectmedium

A data analytics company stores large datasets in Amazon S3. The data is accessed frequently for the first 30 days, then accessed rarely but needs to be retrievable within 1 hour for compliance purposes for up to 3 years. After 3 years, the data must be archived for 7 years with retrieval times acceptable up to 12 hours. Which three of the following strategies would optimize storage costs? (Choose three.)

Select 4 answers

A.Use S3 Intelligent-Tiering for automatic cost savings based on access patterns.

B.Transition data to S3 Glacier Deep Archive immediately after upload.

C.Transition data to S3 Standard-Infrequent Access (S3 Standard-IA) after 30 days.

D.Use S3 Lifecycle policies to move objects from S3 Standard to S3 Glacier Flexible Retrieval after 3 years.

E.Store all data in S3 One Zone-IA for the first 30 days to save on storage costs.

F.Transition data to S3 Glacier Deep Archive for the final 7-year archive.

Why this answer

S3 Intelligent-Tiering is correct because it automatically moves objects between access tiers based on changing access patterns, optimizing costs without manual lifecycle rules. For data that is frequently accessed for 30 days and then rarely accessed, Intelligent-Tiering can cost-effectively handle the transition without upfront lifecycle configuration.

Exam trap

AWS often tests the distinction between S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive, where candidates mistakenly choose Deep Archive for the 3-year compliance period, ignoring the 1-hour retrieval requirement, which Deep Archive cannot meet because its standard retrievals take up to 12 hours.

Full explanation →

556

MCQhard

Based on the exhibit, a web application runs on an Amazon EC2 Auto Scaling group behind an Application Load Balancer. During traffic surges, the average CPU utilization stays below 35%, but request latency increases sharply and the ALB access logs show far more requests per target than expected. Which change is the best way to improve scaling behavior?

A.Lower the CPU target tracking threshold so the Auto Scaling group launches more instances sooner.

B.Replace the Application Load Balancer with a Network Load Balancer to reduce request latency.

C.Configure target tracking scaling on ALB RequestCountPerTarget for the Auto Scaling group.

D.Increase the ALB idle timeout so requests can wait longer before timing out.

AnswerC

RequestCountPerTarget directly reflects how many requests each instance is serving, which matches the symptom in the exhibit. It scales the fleet based on actual per-target demand instead of CPU, so the group can add capacity before queueing and latency grow.

Why this answer

Option C is correct because request latency increases sharply and the ALB logs show far more requests per target than expected, which indicates that the Auto Scaling group is not scaling on the actual load per instance. With target tracking scaling on ALB RequestCountPerTarget, the Auto Scaling group launches new instances when the average number of requests per target exceeds a defined target value, directly addressing the high request volume per instance. Scaling is then driven by the actual workload distribution rather than by CPU utilization, which stays low, most likely because the application is I/O-bound or network-bound.

Exam trap

The trap here is that candidates often assume CPU utilization is the universal scaling metric, but the question explicitly states CPU stays low while latency spikes, indicating the bottleneck is request throughput, not compute, making RequestCountPerTarget the correct metric to scale on.

How to eliminate wrong answers

Option A is wrong because lowering the CPU target tracking threshold would not help when CPU utilization is already below 35% and the bottleneck is request latency, not CPU; this could lead to unnecessary scaling and increased costs without solving the latency issue. Option B is wrong because replacing the Application Load Balancer with a Network Load Balancer would not reduce request latency caused by high request volume per target; NLB operates at Layer 4 and does not inspect HTTP requests, so it cannot provide request-level metrics like RequestCountPerTarget for scaling decisions. Option D is wrong because increasing the ALB idle timeout only extends how long the load balancer keeps connections open without activity, which does not address the root cause of high request volume per target or the sharp increase in latency; it may mask the problem by allowing requests to wait longer before timing out.
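As a rough illustration of option C, the sketch below builds the request parameters for a target tracking policy on the predefined `ALBRequestCountPerTarget` metric. The Auto Scaling group name, the resource label, and the target value of 1000 requests per target are hypothetical placeholders, not values from the exhibit; the function only builds a dictionary and does not call AWS.

```python
import json


def request_count_policy(asg_name, resource_label, target_requests):
    """Build keyword arguments for a target tracking policy on ALB request count."""
    if target_requests <= 0:
        raise ValueError("target_requests must be positive")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": f"{asg_name}-requests-per-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                # Format: app/<lb-name>/<lb-id>/targetgroup/<tg-name>/<tg-id>
                "ResourceLabel": resource_label,
            },
            "TargetValue": float(target_requests),
        },
    }


# Hypothetical names and IDs, for illustration only.
policy = request_count_policy(
    "web-asg",
    "app/web-alb/0123456789abcdef/targetgroup/web-tg/fedcba9876543210",
    1000,
)
print(json.dumps(policy, indent=2))

# With boto3 installed and credentials configured, the policy could be applied as:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

The target value would normally be chosen from load testing, as the number of requests one instance can serve while latency stays acceptable.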

Full explanation →

557

Multi-Selectmedium

A company is migrating its on-premises workloads to AWS and wants to optimize costs. Which three strategies should the company implement to achieve a cost-optimized architecture? (Choose three.)

Select 3 answers

.Use Reserved Instances or Savings Plans for predictable workloads to reduce costs compared to On-Demand pricing.

.Provision additional EC2 instances to handle peak load at all times, ensuring maximum performance.

.Implement auto scaling to match capacity with demand, avoiding over-provisioning and reducing waste.

.Use Spot Instances for fault-tolerant, flexible workloads to achieve significant cost savings.

.Store all data in Amazon S3 Standard storage class to avoid any data retrieval costs.

.Deploy all resources in a single Availability Zone to minimize data transfer costs.

Why this answer

Reserved Instances or Savings Plans provide significant discounts (up to 72%) over On-Demand pricing for predictable workloads by committing to a specific usage term (1 or 3 years). This directly reduces compute costs for steady-state applications, making it a core cost-optimization strategy.

Exam trap

The trap here is that candidates often confuse 'maximizing performance' with 'cost optimization' and select the option to provision extra instances for peak load, failing to recognize that auto scaling and right-sizing are the correct approaches to balance cost and performance.

Full explanation →

558

MCQeasy

Multiple EC2 instances need a shared filesystem so they can concurrently read and write the same files (for example, user uploads and rendered assets). The instances are in different Availability Zones and must mount the filesystem using NFS. Which AWS storage service best fits?

A.Amazon EFS

B.Amazon S3

C.Amazon EBS gp3 volumes

D.Instance store on EC2

AnswerA

EFS provides a shared, NFS-compatible filesystem that supports mounting from multiple EC2 instances and AZs.

Why this answer

Amazon EFS is a fully managed NFS file system that supports concurrent read/write access from multiple EC2 instances across different Availability Zones. It uses the NFSv4.1 protocol, making it the only option listed that provides a shared POSIX-compliant filesystem mountable via NFS for multi-AZ workloads.

Exam trap

The trap here is that candidates mistake Amazon S3's object-based, HTTP API access for a true shared filesystem, or assume EBS multi-attach (which is limited to a few instances in the same AZ) can replace a multi-AZ NFS solution.

How to eliminate wrong answers

Option B (Amazon S3) is wrong because S3 is an object storage service accessed via HTTP/HTTPS APIs, not a filesystem that can be mounted via NFS; it does not support POSIX file locking or concurrent write consistency required for shared filesystem use cases. Option C (Amazon EBS gp3 volumes) is wrong because EBS volumes are block storage that can only be attached to a single EC2 instance at a time (except for multi-attach EBS io1/io2, which is limited to a few instances in the same AZ and does not support NFS). Option D (Instance store on EC2) is wrong because instance store volumes are ephemeral, tied to a single EC2 instance, and cannot be shared across instances or Availability Zones.

Full explanation →

559

MCQmedium

A media processing workflow uses CloudWatch Logs heavily. Retaining all debug logs forever is increasing costs. What should be configured?

A.Route 53 health checks

B.CloudWatch Logs retention policies per log group

C.CloudWatch detailed monitoring on all instances

D.AWS Config aggregation

AnswerB

Retention policies automatically delete older logs after the required period.

Why this answer

CloudWatch Logs retention policies per log group allow you to set an expiration time (e.g., 30 days) after which log events are automatically deleted. This directly reduces storage costs by preventing debug logs from accumulating indefinitely, without affecting other monitoring or routing functions.

Exam trap

The trap here is that candidates may confuse cost optimization with monitoring frequency or compliance aggregation, but the question specifically targets log storage costs, which only retention policies directly address.

How to eliminate wrong answers

Option A is wrong because Route 53 health checks are used for DNS failover and endpoint monitoring, not for managing log retention or cost optimization. Option C is wrong because CloudWatch detailed monitoring increases metric frequency (1-minute intervals) and incurs additional costs, which does not address log retention or cost reduction. Option D is wrong because AWS Config aggregation centralizes configuration snapshots and compliance rules across accounts/regions, but it does not control log group retention or deletion.
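To make the retention setting concrete, here is a small sketch that builds the parameters for the CloudWatch Logs `put_retention_policy` call. The log group name and the 21-day requirement are hypothetical, and the list of accepted retention values is an assumption based on the values documented at the time of writing; check the current API reference before relying on it.

```python
# Retention periods (in days) accepted by CloudWatch Logs; assumed from the
# documented values at the time of writing.
ALLOWED_RETENTION_DAYS = [
    1, 3, 5, 7, 14, 30, 60, 90, 120, 150, 180, 365, 400, 545,
    731, 1096, 1827, 2192, 2557, 2922, 3288, 3653,
]


def pick_retention_days(required_days):
    """Return the shortest accepted retention that still covers required_days."""
    for days in ALLOWED_RETENTION_DAYS:
        if days >= required_days:
            return days
    raise ValueError("required_days exceeds the longest supported retention")


def retention_request(log_group_name, required_days):
    """Build keyword arguments for logs.put_retention_policy."""
    return {
        "logGroupName": log_group_name,
        "retentionInDays": pick_retention_days(required_days),
    }


# Hypothetical log group that must keep debug logs for at least 21 days.
request = retention_request("/media/workflow/debug", 21)
print(request)

# With boto3 installed and credentials configured:
#   boto3.client("logs").put_retention_policy(**request)
```

Because the policy is set per log group, verbose debug groups can use a short retention while audit-relevant groups keep a longer one.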

Full explanation →

560

MCQmedium

A company needs to replicate a DynamoDB table to three AWS regions so that users in each region can read and write to a local copy with the lowest possible latency. Changes must propagate to all regions within seconds. Which solution should a solutions architect implement?

A.Enable DynamoDB Streams and use Lambda functions to replicate changes to tables in the other two regions

B.Configure DynamoDB Global Tables with replica tables in each of the three regions

C.Create DynamoDB read replicas in each region and use the primary table for all writes

D.Use Amazon S3 cross-region replication to back up DynamoDB exports to each region

AnswerB

Global Tables provides managed multi-region multi-active replication with sub-second propagation and automatic conflict resolution. No custom code required.

Why this answer

DynamoDB Global Tables provide multi-region, multi-active (multi-master) replication. Each region maintains a full replica of the table, and applications can read and write to any region with local latency. Changes propagate to all other regions typically within one second.

Global Tables builds on DynamoDB Streams internally, but building a custom Streams + Lambda replication pipeline adds significant operational complexity. Global Tables is the managed, purpose-built solution requiring no custom code.

Exam trap

DynamoDB Streams captures item-level changes and can be processed by Lambda to replicate to other regions — this is a valid DIY approach. But when the question asks for multi-region multi-active replication with minimal complexity, Global Tables is the correct answer. Streams is the mechanism; Global Tables is the managed service.

Always choose the managed service over DIY for SAA-C03.

Why the other options are wrong

Custom DynamoDB Streams + Lambda replication works but requires significant development: Lambda functions per region, error handling, idempotency logic, and conflict resolution. Always choose the managed service (Global Tables) over custom Lambda pipelines.

DynamoDB does not have 'read replicas' like RDS. Global Tables creates full replica tables that support both reads and writes in each region. There is no read-only replica concept in DynamoDB.

S3 cross-region replication copies S3 objects between buckets. DynamoDB-to-S3 export is a data archival mechanism, not real-time database replication. Neither provides active database access with sub-second propagation.

Full explanation →

561

MCQhard

Based on the exhibit, an Amazon Aurora MySQL application is read-heavy, but the database writer is nearing CPU limits while the reader instance is mostly idle. The application currently sends all queries to the writer endpoint. Which change should you make first to increase read throughput?

A.Keep using the writer endpoint so Aurora can route the reads automatically.

B.Change the application to send read-only queries to the Aurora reader endpoint.

C.Convert the cluster to a single-AZ deployment so network hops are reduced.

D.Add an Amazon DynamoDB Accelerator (DAX) cluster in front of Aurora.

AnswerB

The reader endpoint is designed to distribute read traffic across Aurora replica instances. Moving SELECT-heavy traffic off the writer immediately reduces writer CPU pressure and increases total read throughput.

Why this answer

The correct answer is B because the Aurora reader endpoint is specifically designed to distribute read-only traffic across all available reader instances, offloading the writer and increasing read throughput. Since the reader instance is idle, directing read queries to the reader endpoint immediately reduces CPU load on the writer without requiring any architectural changes.

Exam trap

The trap here is that candidates assume the writer endpoint automatically load-balances reads across all instances, but in Aurora the writer endpoint always points to the primary instance, and only the reader endpoint distributes read traffic.

How to eliminate wrong answers

Option A is wrong because the writer endpoint always routes queries to the writer instance, which is already near CPU limits; Aurora does not automatically redirect read queries to reader instances when using the writer endpoint. Option C is wrong because converting to a single-AZ deployment removes the reader instance entirely, eliminating the ability to offload reads and reducing availability, not increasing read throughput. Option D is wrong because adding a DAX cluster in front of Aurora introduces a caching layer for DynamoDB, not Aurora MySQL, and does not address the immediate need to offload read traffic from the writer instance.

Full explanation →

562

MCQhard

A patient portal must process every event at least once, but duplicate processing is acceptable if the consumer handles idempotency. Which eventing approach is most suitable? The design must avoid adding custom operational scripts.

A.Use an in-memory queue on one EC2 instance

B.Use UDP messages sent directly to workers

C.Use Amazon SQS standard queue and design consumers to be idempotent

D.Use CloudFront signed URLs

AnswerC

SQS standard queues provide at-least-once delivery and high throughput; consumers must handle occasional duplicates.

Why this answer

Amazon SQS standard queues guarantee at-least-once delivery, which satisfies the requirement that every event is processed at least once. The design avoids custom operational scripts by leveraging a fully managed service, and the acceptance of duplicate processing is handled by making consumers idempotent. This combination provides a scalable, resilient, and cost-effective event-driven architecture without the need for custom infrastructure management.

Exam trap

The trap here is that candidates may confuse 'at-least-once' delivery with 'exactly-once' delivery and incorrectly choose a solution like a FIFO queue or a custom retry mechanism, but the question explicitly allows duplicate processing, making the standard queue the correct choice.

How to eliminate wrong answers

Option A is wrong because an in-memory queue on a single EC2 instance creates a single point of failure, lacks durability, and requires custom operational scripts for management and recovery, violating the 'avoid adding custom operational scripts' constraint. Option B is wrong because UDP is a connectionless, unreliable protocol that does not guarantee message delivery, so it cannot ensure at-least-once processing; it also requires custom application-level handling for reliability. Option D is wrong because CloudFront signed URLs are used for securing content delivery and controlling access to files, not for event processing or message queuing; they do not provide any event delivery guarantee or queue semantics.

Full explanation →

563

Matchinghard

Match each private-networking or content-delivery scenario to the AWS feature that most directly reduces cost while meeting the connectivity requirement.

Concepts

Matches

Gateway VPC endpoint

Interface VPC endpoint (AWS PrivateLink)

CloudFront with versioned objects and a long cache TTL

CloudFront Origin Shield

Why these pairings

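Each AWS feature directly reduces cost for its scenario: a gateway VPC endpoint keeps S3 or DynamoDB traffic off the NAT Gateway at no additional charge; an interface VPC endpoint (AWS PrivateLink) gives private access to other AWS or partner services without internet routing; CloudFront with versioned objects and a long cache TTL raises the cache-hit ratio so fewer requests reach the origin; and CloudFront Origin Shield adds a centralized caching layer that consolidates origin fetches.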

Full explanation →

564

MCQmedium

A company runs an application on EC2 instances in private subnets. The instances must access Amazon S3, and the team currently routes all outbound traffic to the internet through a NAT Gateway. Monthly NAT Gateway charges increased significantly, even though the application only needs to call S3 (not access other public internet services). Which change will most directly reduce NAT Gateway charges while keeping S3 access working?

A.Create a gateway VPC endpoint for S3 and update the private route tables so S3 traffic uses the endpoint instead of the NAT Gateway.

B.Enable S3 Transfer Acceleration on the bucket to reduce the number of S3 calls that go through the NAT Gateway.

C.Switch the EC2 instances to public subnets so S3 calls can use direct internet routing without NAT.

D.Increase the NAT Gateway TCP idle timeout so fewer connections are billed separately for S3 traffic.

AnswerA

A gateway VPC endpoint for S3 keeps S3 traffic within the AWS network. After you add the S3 gateway endpoint and update the private subnet route tables so the S3 prefix list targets the endpoint, S3 API calls from the private subnets no longer traverse the NAT Gateway. This directly reduces the NAT data-processing charges associated with S3 traffic; the NAT Gateway per-hour charge goes away only if the NAT Gateway itself is removed. If the application truly only needs S3, you can remove the NAT route (and the NAT Gateway) and rely on the endpoint for S3 connectivity.

Why this answer

A gateway VPC endpoint for S3 allows instances in private subnets to access S3 over the AWS network without traversing the internet. By updating the private route tables to direct S3 traffic to the endpoint, the NAT Gateway is bypassed, eliminating the per-GB NAT data processing charges for that traffic. Gateway endpoints themselves carry no additional charge, so this directly reduces costs while maintaining secure, private access to S3.

Exam trap

The trap here is that candidates may think S3 Transfer Acceleration or increasing NAT Gateway timeouts will reduce costs, but they fail to recognize that a gateway VPC endpoint eliminates the NAT Gateway entirely for S3 traffic, directly addressing the cost issue without compromising security.

How to eliminate wrong answers

Option B is wrong because S3 Transfer Acceleration speeds up transfers over long distances using AWS edge locations, but it does not reduce the amount of traffic going through the NAT Gateway; it adds a per-GB charge and still requires internet routing. Option C is wrong because moving the EC2 instances to public subnets exposes them directly to the internet, which abandons the private-subnet design and introduces security risk; it avoids the NAT Gateway only by redesigning the network, whereas a gateway endpoint removes the NAT charges for S3 traffic while the instances stay private. Option D is wrong because increasing the TCP idle timeout does not reduce NAT Gateway charges; NAT Gateway billing is based on hourly usage and data processed, not on the number of connections.
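The gateway endpoint in option A could be requested roughly as follows. This is a minimal sketch that only builds the parameters for the EC2 `create_vpc_endpoint` call; the VPC ID, Region, and route table IDs are hypothetical placeholders.

```python
def s3_gateway_endpoint_request(vpc_id, region, route_table_ids):
    """Build keyword arguments for ec2.create_vpc_endpoint (S3 gateway endpoint)."""
    if not route_table_ids:
        raise ValueError("at least one private route table is required")
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        # Each listed route table receives a route for the S3 prefix list
        # that targets the endpoint instead of the NAT Gateway.
        "RouteTableIds": list(route_table_ids),
    }


# Hypothetical identifiers, for illustration only.
endpoint_request = s3_gateway_endpoint_request(
    "vpc-0abc1234def567890",
    "us-east-1",
    ["rtb-0aaa1111bbb22222c", "rtb-0ddd3333eee44444f"],
)
print(endpoint_request)

# With boto3 installed and credentials configured:
#   boto3.client("ec2", region_name="us-east-1").create_vpc_endpoint(**endpoint_request)
```

Associating the endpoint with every private route table matters: a subnet whose route table is left out keeps sending S3 traffic through the NAT Gateway.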

Full explanation →

565

MCQmedium

An orders service publishes payment instructions to an Amazon SQS queue. The downstream consumer sometimes times out while processing a message. After the message becomes visible again, the consumer may process the same instruction more than once and occasionally creates duplicate orders. The team needs a resiliency-focused design that prevents duplicates from creating double-charges, even if the same message is processed multiple times. What is the best architectural change?

A.Rely on SQS to guarantee exactly-once delivery for standard queues and remove all duplicate-handling logic in the consumer.

B.Make the consumer idempotent by using an idempotency key from the payment instruction (for example, a unique transaction/payment ID) and storing processing results with conditional writes so repeated deliveries do not create a second order.

C.Increase the SQS visibility timeout to the maximum value so the consumer never retries the message.

D.Change the queue to SNS with a fan-out subscription so each consumer gets a separate copy, ensuring processing is sequential and duplicate-free.

AnswerB

Because standard SQS is at-least-once, duplicates are expected under failure scenarios. The resilient approach is to ensure the side effect is performed only once by implementing idempotency. Store a record keyed by a payment/instruction ID using conditional logic (for example, a database conditional put/update or a transaction with a uniqueness constraint). If the key already exists, the consumer should treat the message as already processed and avoid creating a duplicate order/charge.

Why this answer

Option B is correct because making the consumer idempotent ensures that processing the same payment instruction multiple times does not result in duplicate orders or double charges. By using a unique idempotency key (e.g., transaction ID) and conditional writes (e.g., DynamoDB conditional put or database INSERT ... ON CONFLICT DO NOTHING), the consumer can safely handle repeated message deliveries without side effects.

This directly addresses the resiliency requirement without relying on SQS guarantees, which standard queues do not provide for exactly-once delivery.

Exam trap

The trap here is that candidates assume SQS FIFO queues or increased visibility timeouts solve duplicates, but the question specifically tests the concept of idempotency as the correct resiliency pattern for at-least-once delivery systems.

How to eliminate wrong answers

Option A is wrong because standard SQS queues do not guarantee exactly-once delivery; they offer at-least-once delivery, meaning duplicates can occur. Option C is wrong because increasing the visibility timeout to the maximum value (12 hours) does not prevent the consumer from timing out or processing duplicates; it only delays retries, and the message will still become visible again after the timeout expires, leading to potential duplicate processing. Option D is wrong because switching to SNS with fan-out does not prevent duplicates; it sends the same message to multiple subscribers, which could increase duplicate processing, and SNS does not provide ordering or deduplication guarantees.
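The idempotency pattern from option B can be sketched with a uniqueness constraint. The example below uses an in-memory SQLite table as a stand-in for the real data store; the table layout, the `process_payment` helper, and the payment ID are all hypothetical. The same idea applies to a DynamoDB conditional put keyed on the payment ID.

```python
import sqlite3


def process_payment(conn, payment_id, amount_cents):
    """Create an order once per payment_id; return True only on first processing."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO orders (payment_id, amount_cents) VALUES (?, ?)",
            (payment_id, amount_cents),
        )
    # rowcount is 1 when the row was inserted and 0 when the key already existed.
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (payment_id TEXT PRIMARY KEY, amount_cents INTEGER NOT NULL)"
)

first = process_payment(conn, "pay-123", 4999)   # first delivery
second = process_payment(conn, "pay-123", 4999)  # simulated SQS redelivery
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(first, second, order_count)
```

The consumer would charge the customer only when the function reports a first-time insert, and would delete the SQS message in both cases so the duplicate is not retried again.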

Full explanation →

566

MCQeasy

A team needs to distribute TCP traffic (not HTTP) across multiple services. The services must see the original client source IP for auditing. Which AWS load balancer is the best fit?

A.Application Load Balancer (ALB) using HTTP/HTTPS listeners with host-based routing

B.Network Load Balancer (NLB) using TCP listeners

C.Classic Load Balancer (CLB) configured for TCP health checks only

D.API Gateway with a VPC Link to forward raw TCP traffic

AnswerB

NLB is a Layer 4 load balancer that supports TCP and UDP. With client IP preservation (enabled by default for instance targets and configurable for IP targets), the backend connection keeps the original source IP/port at the networking layer, which supports IP-based auditing without requiring HTTP headers.

Why this answer

A Network Load Balancer (NLB) is the best fit because it operates at Layer 4 (TCP/UDP) and can preserve the original client source IP address (by default for instance targets), which is required for auditing. It can distribute raw TCP traffic across multiple services without inspecting application-layer headers, making it ideal for non-HTTP TCP workloads.

Exam trap

The trap here is that candidates often assume an Application Load Balancer can handle any TCP traffic because of its 'listener' terminology, but ALB strictly requires HTTP/HTTPS protocols and cannot forward raw TCP streams.

How to eliminate wrong answers

Option A is wrong because an Application Load Balancer (ALB) only supports HTTP/HTTPS and gRPC protocols, not raw TCP traffic, and it terminates the client connection, replacing the source IP with its own private IP unless X-Forwarded-For headers are used (which are HTTP-specific). Option C is wrong because the Classic Load Balancer (CLB) is a legacy service that does not natively preserve the client source IP for TCP listeners; it uses proxy protocol to forward the source IP, but this requires additional configuration on the backend services, and CLB is not recommended for new architectures. Option D is wrong because API Gateway is designed for HTTP/HTTPS and RESTful APIs, not raw TCP traffic; a VPC Link integrates with NLBs or ALBs for HTTP traffic, but API Gateway cannot forward raw TCP streams.

Full explanation →

567

MCQmedium

A web application for a healthcare document service is behind an Application Load Balancer. The application must be protected from common SQL injection and cross-site scripting attacks with minimum operational overhead. What should the architect deploy? The design must avoid adding custom operational scripts.

A.Security groups on the application instances

B.AWS WAF associated with the Application Load Balancer

C.Network ACLs on the public subnets

D.AWS Shield Advanced only

AnswerB

AWS WAF can inspect HTTP requests and block common web exploits when associated with an ALB.

Why this answer

AWS WAF is a web application firewall that integrates directly with an Application Load Balancer to filter and monitor HTTP/HTTPS requests. It provides managed rules specifically designed to block common attack patterns like SQL injection and cross-site scripting (XSS) without requiring custom scripts or manual rule maintenance, thus meeting the requirement for minimum operational overhead.

Exam trap

The trap here is that candidates often confuse network-layer security controls (security groups, network ACLs, or Shield) with application-layer protection, assuming they can block SQL injection or XSS, when in fact only a web application firewall like AWS WAF can inspect and filter HTTP payloads for such attacks.

How to eliminate wrong answers

Option A is wrong because security groups act as a stateful virtual firewall at the instance level, filtering traffic based on IP addresses, ports, and protocols; they cannot inspect application-layer payloads to detect SQL injection or XSS patterns. Option C is wrong because network ACLs are stateless and operate at the subnet level, only filtering traffic based on IP, port, and protocol rules, with no capability to parse HTTP request bodies or headers for malicious content. Option D is wrong because AWS Shield Advanced focuses on DDoS protection; on its own it does not inspect HTTP requests for SQL injection or XSS patterns, which still requires AWS WAF rules.

Full explanation →

568

MCQmedium

An IoT ingestion API stores audit logs in S3. The compliance team requires that logs cannot be overwritten or deleted for seven years. What should be configured?

A.S3 lifecycle expiration after seven years

B.S3 versioning only

C.S3 server access logging

D.S3 Object Lock in compliance mode with an appropriate retention period

AnswerD

Object Lock compliance mode enforces write-once-read-many retention that even privileged users cannot bypass during the retention period.

Why this answer

Option D is correct because S3 Object Lock in compliance mode provides a write-once-read-many (WORM) model that prevents any user, including the root user, from overwriting or deleting protected object versions for the specified retention period. This meets the compliance requirement of a seven-year immutable audit log, because compliance mode enforces a retention period that cannot be shortened or removed by any user until it expires.

Exam trap

The trap here is that candidates often confuse versioning with immutability, thinking versioning alone prevents data loss, but it does not block overwrites or deletions of the current version, which is why Object Lock is required for true write-once-read-many protection.

How to eliminate wrong answers

Option A is wrong because S3 lifecycle expiration only automates deletion after a set period and does not prevent overwrites or deletions during that period; objects can still be deleted manually or overwritten before expiration. Option B is wrong because S3 versioning alone preserves previous versions but does not prevent overwrites or the deletion of specific versions; it retains old versions without enforcing immutability. Option C is wrong because S3 server access logging records access requests to the bucket but does not provide any data protection or prevent modification or deletion of the audit logs themselves.
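A default retention rule for option D might look like the sketch below. It only builds the configuration structure used by the S3 `put_object_lock_configuration` call; the bucket name in the comment is hypothetical, and the sketch assumes Object Lock (which requires versioning) is enabled on the bucket.

```python
def compliance_lock_config(years):
    """Build an Object Lock configuration with a default compliance-mode retention."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                # COMPLIANCE cannot be bypassed; GOVERNANCE can be, with extra permissions.
                "Mode": "COMPLIANCE",
                "Years": years,
            }
        },
    }


lock_config = compliance_lock_config(7)
print(lock_config)

# With boto3 installed and credentials configured (bucket name is hypothetical):
#   boto3.client("s3").put_object_lock_configuration(
#       Bucket="audit-logs-example", ObjectLockConfiguration=lock_config
#   )
```

A default retention applies to objects written after the rule is set, so the rule should be in place before the first audit log is stored.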

Full explanation →

569

Multi-Selectmedium

A transactional application uses Amazon RDS for MySQL in a single Availability Zone. The team wants the database to fail over automatically if the primary DB instance becomes unavailable, and they want the application to recover with minimal code changes. Which two actions should they take? Select two.

Select 2 answers

A.Convert the database to an RDS Multi-AZ deployment.

B.Have the application connect to the RDS endpoint by DNS name and reconnect after failures.

C.Add a read replica and promote it manually during an outage.

D.Store the current DB instance IP address in the application configuration file.

E.Rely on nightly snapshots because they provide automatic failover to another Availability Zone.

AnswersA, B

Correct. RDS Multi-AZ is the AWS-managed availability feature designed for automatic failover to a standby in another Availability Zone. It preserves the database endpoint and reduces recovery time without requiring the application to implement its own replica-selection logic.

Why this answer

Option A is correct because RDS Multi-AZ synchronously replicates data to a standby instance in a different Availability Zone. If the primary fails, RDS automatically fails over to the standby, typically within 60–120 seconds, with no manual intervention required. This meets the requirement for automatic failover with minimal code changes.

Exam trap

The trap here is that candidates often confuse read replicas with Multi-AZ failover, thinking a read replica can serve as a manual failover target, but AWS explicitly separates these features—read replicas are for read scaling, not high availability failover.

Full explanation →

570

Multi-Selecthard

A payments API requires point-in-time recovery and accidental-delete protection for a DynamoDB table. Which two settings should the architect enable? The design must avoid adding custom operational scripts.

Select 2 answers

A.Deletion protection or tightly controlled delete permissions

B.Point-in-time recovery

C.Global secondary indexes

D.DAX

AnswersA, B

Deletion protection and least-privilege controls reduce accidental table removal risk.

Why this answer

Deletion protection (option A) prevents accidental deletion of the DynamoDB table itself, which is critical for the accidental-delete protection requirement. Point-in-time recovery (option B) enables restoring the table to any point within the last 35 days, satisfying the point-in-time recovery requirement. Both features are native DynamoDB capabilities that require no custom scripts.

Exam trap

The trap here is that candidates may confuse deletion protection (which protects the table resource) with item-level delete prevention, or think that GSIs or DAX provide data durability or recovery features when they do not.

Full explanation →

571

Multi-Selecthard

An event-ingestion application writes telemetry to DynamoDB with partition key tenantId and sort key eventTime. During a promotion, one tenant generates 10 times the normal traffic. Dashboards repeatedly query the most recent items for that tenant, and they can tolerate slightly stale data. Which changes would most effectively reduce throttling and improve responsiveness? Select three.

Select 3 answers

A.Introduce a sharded partition key for the hot tenant and query the small shard set when reading recent data.

B.Add a time bucket to the partition key, such as tenantId#YYYYMMDDHH, to spread bursty writes across more partitions.

C.Place DynamoDB DAX in front of the table for the repeated dashboard reads of recent items.

D.Increase only the sort-key cardinality while leaving the partition key unchanged.

E.Move the table to the Standard-IA table class because throttling is usually caused by storage class selection.

AnswersA, B, C

Correct because hot-partition problems are usually solved by spreading traffic across multiple partition key values. Sharding the hot tenant across several keys distributes write load and avoids a single overloaded partition.

Why this answer

Option A is correct because introducing a sharded partition key (e.g., tenantId#shardId) for the hot tenant spreads its write traffic across multiple physical partitions, reducing throttling. When querying recent data, the dashboard can read from a small, fixed set of shards (e.g., 10 shards) and merge the results, which is efficient and acceptable because the dashboards tolerate slightly stale data. This pattern directly addresses the hot partition issue at the cost of a small merge step on the read path.

Exam trap

The trap here is that candidates assume increasing sort-key cardinality (Option D) improves write throughput, but DynamoDB's partition key alone determines the physical partition, so only modifying the partition key or using write sharding can alleviate hot partition throttling.

572

Multi-Selecthard

A B2B file exchange site uses CloudFront in front of an S3 origin. Which two settings help keep users from bypassing CloudFront and accessing the bucket directly?

Select 2 answers

A.Enable S3 static website hosting

B.Use an S3 bucket policy that allows access only from the CloudFront distribution

C.Configure Origin Access Control for the S3 origin

D.Enable CloudFront standard logging

AnswersB, C

The bucket policy should trust the CloudFront distribution and deny direct public access.

Why this answer

Option B is correct because an S3 bucket policy that grants access only to the CloudFront distribution ensures that direct requests to the S3 bucket are blocked. With Origin Access Control (OAC), the policy allows the `cloudfront.amazonaws.com` service principal and uses the `aws:SourceArn` condition key to restrict access to the specific distribution; with the legacy origin access identity (OAI), the policy names the OAI as the principal instead. Either way, users cannot bypass CloudFront by calling the bucket's S3 endpoint directly.
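As a rough illustration of such a bucket policy, the sketch below builds the policy document for the OAC case as a Python dictionary. The function name and the bucket name, account ID, and distribution ID passed to it are placeholders; the policy relies on the bucket having no other public grants, so that every request not matching this statement is implicitly denied.

```python
import json


def cloudfront_only_bucket_policy(bucket_name: str, account_id: str, distribution_id: str) -> dict:
    """Build a bucket policy that lets only one CloudFront distribution read objects."""
    distribution_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                # OAC requests are signed by the CloudFront service principal.
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                # Restrict the grant to a single distribution.
                "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(json.dumps(cloudfront_only_bucket_policy("example-bucket", "111122223333", "EDFDVBD6EXAMPLE"), indent=2))
```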

Exam trap

The trap here is that candidates often confuse enabling S3 static website hosting (Option A) as a security measure, when in fact it opens a separate direct access endpoint, or they overlook that Origin Access Control (Option C) alone is insufficient without a corresponding bucket policy to enforce the restriction.

573

MCQeasy

A company serves mostly static images and JavaScript files from an origin in one AWS Region. They want to reduce origin load and improve global performance. Which change most directly increases cache-hit ratio for static assets while avoiding stale content?

A.Set Cache-Control headers on the origin to always be no-cache so clients revalidate frequently.

B.Use versioned file names (e.g., app.abc123.js) and configure a long TTL with appropriate revalidation behavior.

C.Disable query string forwarding so all URLs without query strings share one cached object even when content differs.

D.Forward all headers, including cookies, to maximize personalization in edge cached responses.

AnswerB

Versioned assets allow long caching with confidence, while new filenames trigger updates when code changes.

Why this answer

Option B is correct because using versioned file names (e.g., app.abc123.js) allows you to set a long Cache-Control max-age TTL (e.g., one year) without risking stale content. When the file changes, the versioned name changes, creating a new URL that forces a cache miss and fetches the fresh content. This directly increases the cache-hit ratio for static assets while ensuring clients never receive outdated files.
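One common way to produce versioned file names is to embed a short content hash at build time. The sketch below assumes that approach; the function names, the 8-character digest length, and the exact Cache-Control value are illustrative choices rather than anything the question prescribes.

```python
import hashlib
from pathlib import PurePosixPath

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000


def versioned_name(filename: str, content: bytes, digest_chars: int = 8) -> str:
    """Insert a short content hash before the extension, e.g. app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:digest_chars]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def cache_control_for_versioned_asset() -> str:
    """Long-lived caching is safe because a content change produces a new URL."""
    return f"public, max-age={ONE_YEAR_SECONDS}, immutable"
```

Because the hash depends only on the file content, an unchanged file keeps its name (and its cached copies) across deployments, while any edit yields a new name that bypasses every cache.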

Exam trap

The trap here is that candidates may think disabling query strings or forwarding all headers helps caching, but in reality, these actions either cause cache collisions or fragment the cache, reducing the cache-hit ratio.

How to eliminate wrong answers

Option A is wrong because setting Cache-Control: no-cache forces clients to revalidate with the origin on every request, which increases origin load and defeats the purpose of caching, reducing the cache-hit ratio. Option C is wrong because disabling query string forwarding can cause different content to be served from the same cached object if the content actually varies by query string, leading to stale or incorrect responses; it does not improve cache-hit ratio for static assets. Option D is wrong because forwarding all headers, including cookies, reduces cacheability since CloudFront (or any CDN) treats each unique set of headers as a separate cache key, fragmenting the cache and lowering the cache-hit ratio.

574

MCQeasy

A trading analytics system deploys multiple EC2 instances that exchange very frequent, low-latency, east-west messages. The application team wants the instances to be placed to minimize network latency and variability. Which AWS feature should they use?

A.EC2 Placement Groups with the "cluster" strategy

B.EC2 Placement Groups with the "spread" strategy

C.Auto Scaling cooldown adjustments only

D.Switching the instances to a larger instance size without any placement group

AnswerA

Cluster placement groups place instances close together within a single Availability Zone on underlying infrastructure intended to have low-latency, high-bandwidth networking between instances. This directly targets minimizing latency and jitter for inter-instance communication.

Why this answer

The cluster placement group is the correct choice because it places instances into a low-latency, high-bandwidth group within a single Availability Zone, which minimizes network latency and variability for east-west traffic. This strategy is specifically designed for applications that require very frequent, low-latency communication between instances, such as trading analytics systems.

Exam trap

The trap here is that candidates confuse the 'spread' placement group's high availability benefit with low-latency requirements, not realizing that spreading instances across racks increases network hops and latency.

How to eliminate wrong answers

Option B is wrong because the spread placement group distributes instances across distinct hardware racks to reduce correlated failures, which increases network latency and variability rather than minimizing it. Option C is wrong because Auto Scaling cooldown adjustments only control the rate of scaling activities and have no impact on network latency or placement of instances. Option D is wrong because switching to a larger instance size may improve compute or memory capacity but does not inherently reduce network latency or variability between instances without a placement group.

575

MCQhard

A healthcare document service must ensure that only encrypted EBS volumes can be created in the account. What is the strongest preventive control?

A.Use an SCP that denies ec2:CreateVolume when the encrypted condition is false

B.Tag encrypted volumes after creation

C.Enable VPC Flow Logs

D.Run a daily Lambda function to encrypt unencrypted volumes

AnswerA

An SCP can prevent noncompliant volume creation across accounts in an organization.

Why this answer

Option A is correct because an SCP (Service Control Policy) is a preventive control that can deny the ec2:CreateVolume action when the encryption condition (ec2:Encrypted) is false. This ensures that no unencrypted EBS volumes can be created in the account, providing a strong, proactive guardrail that cannot be overridden by IAM policies within the account.
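A policy of the kind described above might look like the following, expressed here as a Python dictionary so it can be serialized to JSON. The function name and the Sid are arbitrary labels chosen for this sketch; the action and the ec2:Encrypted condition key are the ones named in the explanation.

```python
import json


def deny_unencrypted_volumes_scp() -> dict:
    """Build an SCP that denies ec2:CreateVolume when the volume would be unencrypted."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedEbsVolumes",
                "Effect": "Deny",
                "Action": "ec2:CreateVolume",
                "Resource": "*",
                # The Deny applies only when the request asks for an unencrypted volume.
                "Condition": {"Bool": {"ec2:Encrypted": "false"}},
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_unencrypted_volumes_scp(), indent=2))
```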

Exam trap

The trap here is that candidates often confuse detective or corrective controls (like Lambda remediation or tagging) with preventive controls, failing to recognize that only SCPs or IAM policies with Deny effects can block the action before it occurs.

How to eliminate wrong answers

Option B is wrong because tagging encrypted volumes after creation is a detective or corrective control, not a preventive one; it does not stop the creation of unencrypted volumes. Option C is wrong because VPC Flow Logs capture network traffic metadata and have no ability to enforce encryption policies on EBS volumes. Option D is wrong because running a daily Lambda function to encrypt unencrypted volumes is a reactive/corrective control that only fixes volumes after they have been created, leaving a window of non-compliance.

576

MCQmedium

A company hosts an internal API behind an Application Load Balancer (ALB) in two AWS Regions. They want Amazon Route 53 to automatically fail over to the secondary Region when the primary Region’s ALB is unhealthy. Health checks for the primary ALB are already configured, but the DNS record currently uses a latency-based routing policy. Which Route 53 configuration most directly provides automatic failover based on health status?

A.Keep latency-based routing, and set the weights so the secondary Region rarely receives traffic unless manual changes are made.

B.Use a Route 53 failover routing policy: configure two alias records for the ALBs where the primary record is marked PRIMARY, the secondary is marked SECONDARY, and each record has an associated health check.

C.Use an alias A record that returns both ALBs simultaneously so clients automatically load balance across Regions during outages.

D.Use geolocation routing to route users to the primary Region and rely on ALB health checks to shift requests between Regions.

AnswerB

Route 53 failover routing uses health checks to determine whether to return the PRIMARY or SECONDARY record. When the primary health check fails, Route 53 automatically switches resolution to the secondary ALB.

Why this answer

Option B is correct because Route 53 failover routing policy is specifically designed to automatically route traffic away from an unhealthy resource to a healthy one. By creating two alias records (one PRIMARY with an associated health check for the primary ALB, and one SECONDARY for the secondary ALB), Route 53 will automatically fail over to the secondary record when the primary health check fails. This directly meets the requirement for automatic failover based on health status, unlike latency-based routing which only optimizes for response time.
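The pair of failover records can be sketched as two change entries in the shape that the Route 53 ChangeResourceRecordSets API expects. The helper name, record name, ALB DNS names, hosted zone IDs, and health check ID used with it are placeholders assumed for illustration.

```python
from typing import Optional


def failover_alias_record(
    name: str,
    role: str,
    alb_dns_name: str,
    alb_hosted_zone_id: str,
    health_check_id: Optional[str] = None,
) -> dict:
    """Build one UPSERT change for a PRIMARY or SECONDARY failover alias record."""
    if role not in ("PRIMARY", "SECONDARY"):
        raise ValueError("role must be PRIMARY or SECONDARY")
    record = {
        "Name": name,
        "Type": "A",
        "SetIdentifier": f"{role.lower()}-alb",
        "Failover": role,
        "AliasTarget": {
            "HostedZoneId": alb_hosted_zone_id,  # the ALB's own hosted zone ID
            "DNSName": alb_dns_name,
            "EvaluateTargetHealth": True,
        },
    }
    if health_check_id is not None:
        record["HealthCheckId"] = health_check_id
    return {"Action": "UPSERT", "ResourceRecordSet": record}
```

The two resulting changes would go into the Changes list of a single change batch, so the primary and secondary records are created together.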

Exam trap

The trap here is that candidates often confuse latency-based routing with failover routing. Latency-based routing selects the Region with the lowest latency for each user; even with health checks attached, it does not define an explicit primary and secondary, which is what the failover routing policy provides.

How to eliminate wrong answers

Option A is wrong because latency-based routing has no weights (weights belong to the weighted routing policy), and the option depends on manual changes, which contradicts the 'automatic failover' requirement. Latency records can have health checks attached, but the policy picks the lowest-latency healthy Region rather than designating an explicit primary and secondary. Option C is wrong because an alias A record cannot return multiple ALBs simultaneously; Route 53 alias records point to a single AWS resource, and returning multiple IPs would require a non-alias record with multiple values, which still does not provide health-based failover. Option D is wrong because geolocation routing routes based on user location, not health; ALB health checks alone cannot shift requests between Regions because the DNS record itself does not change based on health status without a failover routing policy.

577

Multi-Selectmedium

A central security account stores encrypted log files in S3 using a customer managed AWS KMS key. A partner account already has S3 bucket access through an assumed role and now must also be able to encrypt and decrypt objects that use the same KMS key. Which two actions are required? Select two.

Select 2 answers

A.Update the KMS key policy to allow the partner role or account to use the key.

B.Enable automatic key rotation to solve the cross-account access requirement.

C.Attach IAM permissions in the partner account for kms:Encrypt, kms:Decrypt, and kms:GenerateDataKey on the CMK.

D.Replace the CMK with the AWS managed key alias/aws/s3.

E.Export the KMS key material and share it with the partner account.

AnswersA, C

KMS evaluates the key policy before permitting use of a customer managed key. Cross-account use requires the key policy to trust the external principal or a grant to that principal.

Why this answer

Option A is correct because the KMS key policy must explicitly grant the partner account or role permission to use the key for cryptographic operations. Without this cross-account policy statement, the key remains inaccessible to the partner account, even if the partner has S3 bucket access. This is a fundamental requirement for cross-account KMS key usage.

Exam trap

The trap here is that candidates often forget that cross-account KMS access requires both a key policy update in the central account AND IAM permissions in the partner account, not just one of them.

578

MCQmedium

An S3 bucket in account A uses default server-side encryption with an AWS KMS customer-managed key (CMK) in account A. A team created an IAM role in account B that is allowed by IAM policy to perform s3:GetObject on the bucket. When the account B role tries to read objects, it fails with: AccessDeniedException: 'User is not authorized to perform kms:Decrypt'. Which change is most likely to fix the issue?

A.Add kms:Decrypt permissions to the identity policy in account B only, without modifying the CMK key policy in account A.

B.Update the CMK key policy in account A to allow the account B role principal to call kms:Decrypt (and kms:DescribeKey if needed).

C.Disable SSE-KMS on the S3 bucket so objects use SSE-S3 instead, eliminating the need for KMS permissions.

D.Attach a broad permissions boundary to the account B role allowing all kms:* actions to override the key policy.

AnswerB

KMS customer-managed keys rely on key policies (especially for cross-account access). Granting kms:Decrypt to the exact account B role principal in the key policy enables successful decrypt operations for SSE-KMS objects.

Why this answer

When an S3 bucket uses SSE-KMS with a customer-managed key (CMK) in account A, the account B role must have explicit kms:Decrypt permission on that CMK. The key policy in account A controls access to the CMK, so adding the account B role principal to the key policy with kms:Decrypt (and kms:DescribeKey if needed) is required. Without this, even if the S3 bucket policy and IAM role allow s3:GetObject, the KMS decrypt call will fail.
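The key policy change can be illustrated with a single statement added to the CMK's key policy in account A. The sketch below builds that statement in Python; the function name, the Sid, and the role ARN are placeholders, and the statement would be appended to the existing key policy rather than replacing it.

```python
def cross_account_decrypt_statement(role_arn: str) -> dict:
    """Build a key policy statement letting an external role decrypt with this key."""
    return {
        "Sid": "AllowAccountBRoleToDecrypt",
        "Effect": "Allow",
        "Principal": {"AWS": role_arn},
        "Action": ["kms:Decrypt", "kms:DescribeKey"],
        # In a key policy, "*" refers to the key the policy is attached to.
        "Resource": "*",
    }
```

The role in account B still needs kms:Decrypt on the key's ARN in its own identity policy; the key policy statement and the identity policy work together for cross-account access.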

Exam trap

The trap here is that candidates assume IAM permissions in account B are sufficient for cross-account KMS operations, forgetting that the KMS key policy in the owning account must explicitly grant access to the external principal.

How to eliminate wrong answers

Option A is wrong because adding kms:Decrypt to the identity policy in account B alone is insufficient; the CMK key policy in account A must also grant access to the account B role, as KMS key policies act as a separate authorization layer. Option C is wrong because switching the bucket default to SSE-S3 weakens the encryption design and affects only newly written objects; existing objects stay encrypted with the CMK and would still require kms:Decrypt, so the error would persist for them. Option D is wrong because a permissions boundary on the account B role cannot override the CMK key policy in account A; the key policy is the ultimate authority for KMS key access, and a boundary only limits the role's maximum permissions within its own account.

579

Multi-Selectmedium

A financial services application requires high-performance read access to a time-series dataset that is frequently updated with new records. The workload is write-heavy during market hours and read-heavy for reporting. The solution must support strong consistency and low-latency queries on a single key. Which three AWS services or features should be used together to meet these requirements? (Choose three.)

Select 3 answers

A.Amazon DynamoDB with DAX (DynamoDB Accelerator)

B.Amazon S3 Standard-Infrequent Access (S3 Standard-IA) storage class

C.Amazon DynamoDB with Auto Scaling enabled for reads and writes

D.Amazon RDS for PostgreSQL with Multi-AZ deployment

E.Amazon DynamoDB global tables for multi-Region replication

F.DynamoDB strongly consistent reads configured on the table

Why this answer

Amazon DynamoDB with DAX (DynamoDB Accelerator) is correct because DAX provides an in-memory cache that reduces latency for repeated eventually consistent reads from single-digit milliseconds to microseconds, which suits the read-heavy reporting workload. Note that DAX passes strongly consistent reads through to the table rather than serving them from its cache. DynamoDB with Auto Scaling enabled for reads and writes is correct because it automatically adjusts throughput capacity based on traffic patterns, handling the write-heavy workload during market hours and read-heavy reporting without manual intervention. DynamoDB strongly consistent reads is correct because a strongly consistent read returns the most up-to-date data, meeting the strong consistency requirement for financial applications where stale reads are unacceptable; in the DynamoDB API this is requested per read with the ConsistentRead parameter rather than set once on the table.
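In the DynamoDB API, strong consistency is requested on each read through the ConsistentRead parameter. The sketch below builds the arguments for a GetItem call on a single key; the helper name, the table name, and the key attribute name `symbol` are assumptions for illustration, and the dictionary is in the low-level attribute-value format.

```python
def strongly_consistent_get_request(table_name: str, symbol: str) -> dict:
    """Build GetItem arguments that request a strongly consistent read of one key."""
    return {
        "TableName": table_name,
        "Key": {"symbol": {"S": symbol}},  # hypothetical partition key attribute
        "ConsistentRead": True,  # the default is False (eventually consistent)
    }
```

These arguments could be passed to a DynamoDB client's GetItem operation. Reads that must be strongly consistent go to the table itself, so they consume table read capacity even when a cache sits in front of it.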

Exam trap

The trap here is that candidates often confuse eventually consistent reads with strongly consistent reads in DynamoDB, assuming that DAX or Auto Scaling alone can provide strong consistency, when in fact strongly consistent reads must be explicitly requested (ConsistentRead set to true on the read) to guarantee the latest data.

580

MCQmedium

A read-heavy media archive repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load? The design must avoid adding custom operational scripts.

A.DynamoDB Accelerator (DAX)

B.Amazon Kinesis Data Firehose

C.AWS Glue Data Catalog

D.S3 Transfer Acceleration

AnswerA

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is an in-memory cache for DynamoDB that delivers microsecond read latency, reducing the load on the underlying DynamoDB tables by serving repeated queries from its cache. This directly addresses the read-heavy media archive's millisecond latency requirements without requiring custom operational scripts, as DAX is fully managed and integrates seamlessly with existing DynamoDB API calls.

Exam trap

The trap here is that candidates may confuse DAX with other caching services like ElastiCache. DAX is purpose-built for DynamoDB and API-compatible with it, so adopting it typically means swapping in the DAX client rather than rewriting data access logic, whereas ElastiCache would require the application to manage cache population and invalidation itself.

How to eliminate wrong answers

Option B (Amazon Kinesis Data Firehose) is wrong because it is a streaming data ingestion service for loading data into data stores like S3 or Redshift, not a caching layer for reducing DynamoDB read latency or table load. Option C (AWS Glue Data Catalog) is wrong because it is a metadata repository for ETL jobs and data discovery, not a low-latency cache for DynamoDB queries. Option D (S3 Transfer Acceleration) is wrong because it speeds up uploads to S3 over long distances using edge locations, but it does not cache DynamoDB data or reduce read latency for repeated queries.

581

Multi-Selecthard

A company is encrypting sensitive S3 data for a claims portal with AWS KMS. Which two controls help prevent accidental use of the KMS key by unauthorized principals?

Select 2 answers

A.A larger KMS key rotation period

B.IAM policies that grant kms:Decrypt only to required application roles

C.A key policy that limits key administrators and key users

D.S3 Transfer Acceleration

AnswersB, C

IAM permissions should grant least-privilege use of the KMS key to specific roles.

Why this answer

Option B is correct because IAM policies can be used to restrict the `kms:Decrypt` action to only the specific IAM roles that require it for the claims portal. This ensures that even if an unauthorized principal has access to the encrypted S3 object, they cannot decrypt it without the explicit IAM permission to use the KMS key. Option C is correct because a key policy that explicitly defines key administrators and key users limits who can manage or use the KMS key, preventing accidental use by unauthorized principals.

Exam trap

The trap here is that candidates often assume that IAM policies alone are sufficient to control KMS key access, but they forget that the key policy must also explicitly allow the IAM principal to use the key, as KMS requires both the key policy and IAM policy to grant access.

582

Multi-Selectmedium

A web application uses an Amazon Aurora DB cluster for a read-heavy workload. The team wants to increase read throughput without changing the database schema or rewriting application data access patterns. Which two changes should they make? Select two.

Select 2 answers

A.Add Aurora Replicas to scale out read traffic across multiple database instances.

B.Send read queries to the Aurora reader endpoint so they are distributed across the replicas.

C.Point all queries to the writer endpoint so Aurora can balance reads and writes internally.

D.Enable Multi-AZ standby for the cluster to increase the number of read-only connections.

E.Move the database to a single larger instance class instead of adding replicas.

AnswersA, B

Aurora Replicas are the primary horizontal scaling mechanism for read-heavy Aurora workloads. They add more database compute so the cluster can process more concurrent read queries.

Why this answer

Adding Aurora Replicas (Option A) directly increases read throughput by distributing read-only queries across multiple database instances, which is ideal for a read-heavy workload. Sending read queries to the Aurora reader endpoint (Option B) ensures that these queries are load-balanced across all available replicas, offloading the writer instance and improving overall performance without requiring schema or application changes.

Exam trap

The trap here is that candidates confuse Multi-AZ standby (which provides high availability but not read scaling) with Aurora Replicas (which provide both read scaling and high availability), leading them to select Option D incorrectly.

583

MCQhard

A company uses AWS Organizations and wants to prevent any account in the organization from launching resources in regions other than us-east-1 and eu-west-1. This restriction must apply even if an administrator in a member account grants full IAM permissions. Which approach should a solutions architect use?

A.Create IAM policies with Deny for disallowed regions and attach them to all IAM users and roles in each account

B.Enable AWS Config rules to detect resources launched in disallowed regions and trigger auto-remediation to delete them

C.Use AWS Control Tower guardrails to enforce region restriction for all accounts

D.Create an SCP with a Deny on all actions for regions outside us-east-1 and eu-west-1, attached to the Organization root

AnswerD

SCPs apply to all principals in all member accounts and cannot be overridden by account-level IAM. Attached to the Organization root, this SCP covers every member account. The Deny with StringNotEquals condition on aws:RequestedRegion blocks all other regions.

Why this answer

Service Control Policies (SCPs) in AWS Organizations provide a guardrail that applies to all principals in member accounts — including IAM users, roles, and even the account root. SCPs restrict the maximum permissions that can be granted within an account.

An SCP with Deny on all actions for all regions except us-east-1 and eu-west-1, attached to the organization root, prevents any account from launching resources in other regions regardless of account-level IAM permissions. IAM policies in member accounts cannot override SCPs.
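A minimal sketch of such a Region-restriction SCP follows, again as a Python dictionary. The function name and Sid are illustrative. The sketch denies every action, as the explanation describes; policies used in practice usually replace Action with a NotAction list that exempts global services such as IAM, which is omitted here for brevity.

```python
import json


def region_restriction_scp(allowed_regions: set) -> dict:
    """Build an SCP that denies requests made to any Region outside the allowed set."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideAllowedRegions",
                "Effect": "Deny",
                "Action": "*",
                "Resource": "*",
                # The Deny applies when the requested Region is not in the list.
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": sorted(allowed_regions)}
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(region_restriction_scp({"us-east-1", "eu-west-1"}), indent=2))
```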

Exam trap

A common misconception is that an IAM Administrator or root user in a member account can override organization-level controls. SCPs define the permission ceiling: even AdministratorAccess (Action: *, Resource: *) cannot exceed what the SCP allows. An action must be permitted by both the SCP and the account-level IAM policy, and an explicit Deny in an SCP always wins.

Why the other options are wrong

Option A: IAM policies must be attached individually to each user and role in each account, which does not scale across an Organization. Administrators in member accounts could also remove or bypass these policies by creating new roles without the restriction.

Option B: AWS Config rules detect non-compliant resources after they have been created, and auto-remediation adds latency. This is a detective control, not a preventive control; resources would exist temporarily before deletion.

Option C: AWS Control Tower implements its preventive guardrails as SCPs applied through Organizations, so the underlying mechanism is still an SCP. The direct answer for organizational prevention is an SCP.

584

MCQmedium

An orders service publishes payment instructions to an Amazon SQS Standard queue. A downstream consumer sometimes times out and retries the work, causing the consumer to process the same instruction more than once. Operationally, the team must ensure that duplicate processing does not create duplicate charges. The queue type cannot be changed. What is the most resilient application-side approach?

A.Rely on SQS Standard to provide exactly-once delivery for each message, since the consumer uses retries.

B.Implement idempotent processing using a persistent deduplication key (for example, paymentInstructionId) so repeated messages are ignored or safely merged.

C.Increase the queue’s visibility timeout to 24 hours so messages never reappear even if the consumer times out.

D.Delete and recreate the queue with a different name whenever duplicates are detected in production.

AnswerB

Because SQS Standard is at-least-once, the consumer must assume duplicates are possible. Persisting a record keyed by paymentInstructionId (or using a database unique constraint) lets the consumer detect that a given instruction was already processed successfully and safely skip the charge or merge results deterministically.

Why this answer

Option B is correct because implementing idempotent processing with a persistent deduplication key (e.g., paymentInstructionId) ensures that even if SQS Standard delivers the same message multiple times due to consumer timeouts and retries, the downstream logic will detect and ignore or safely merge duplicate charges. This is the most resilient application-side approach as it does not rely on queue configuration changes and works within the constraints of SQS Standard's at-least-once delivery model.
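The deduplication-key idea can be sketched with a unique constraint on paymentInstructionId. The example below uses an in-memory SQLite database purely as a stand-in for whatever durable store the consumer would really use; the table name, function names, and the `charge` callback are hypothetical.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the deduplication store; the primary key enforces one row per instruction."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed (payment_instruction_id TEXT PRIMARY KEY)"
    )
    conn.commit()
    return conn


def process_once(conn: sqlite3.Connection, payment_instruction_id: str, charge) -> bool:
    """Run charge() only if this instruction has not been recorded; return True if it ran."""
    try:
        with conn:  # commits on success, rolls back if anything inside raises
            conn.execute(
                "INSERT INTO processed (payment_instruction_id) VALUES (?)",
                (payment_instruction_id,),
            )
            charge()
    except sqlite3.IntegrityError:
        # The key already exists, so this delivery is a duplicate and is skipped.
        return False
    return True
```

If charge() fails, the insert is rolled back and a later redelivery can try again. When the charge is a call to an external payment system, that call is not part of the database transaction, so passing the same key to the payment provider as an idempotency token is a common complement to this pattern.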

Exam trap

The trap here is that candidates often assume SQS Standard can provide exactly-once delivery if retries are handled properly, but the exam tests the understanding that SQS Standard inherently allows duplicates and that idempotency is the only reliable application-side solution.

How to eliminate wrong answers

Option A is wrong because SQS Standard queues provide at-least-once delivery, not exactly-once delivery; retries and timeouts can cause duplicate messages, and relying on exactly-once is a misconception. Option C is wrong because the maximum SQS visibility timeout is 12 hours, so 24 hours is not a valid setting, and even a long timeout does not prevent duplicates: Standard queues can deliver a message more than once, and a consumer that times out can still repeat work it already performed. A very long timeout also delays redelivery after genuine failures. Option D is wrong because deleting and recreating the queue with a different name is a disruptive, manual, and non-scalable approach that does not address the root cause of duplicate processing and would cause data loss and operational chaos.

585

MCQmedium

A serverless API built with AWS Lambda serves latency-sensitive requests. The team observes intermittent slow responses during traffic ramp-ups and expects some users to hit the API immediately after a period of inactivity. Which configuration best reduces cold-start latency during these ramp-ups?

A.Enable Lambda provisioned concurrency on a published alias used by the API, and set a minimum provisioned concurrency greater than zero.

B.Increase the Lambda function’s memory setting; cold starts will always be eliminated regardless of traffic patterns.

C.Switch the Lambda runtime to a newer language version and remove any VPC configuration so the function never cold starts.

D.Set an API Gateway stage variable to "warm" the function at request time, which forces immediate initialization.

AnswerA

Provisioned concurrency keeps a defined number of Lambda execution environments initialized and ready behind a specific alias. When traffic ramps up—especially after inactivity—invocations can use pre-initialized environments, reducing or eliminating cold starts for those requests.

Why this answer

Lambda provisioned concurrency keeps a specified number of execution environments initialized and ready to respond immediately, eliminating cold starts for those invocations. By setting a minimum provisioned concurrency greater than zero on the alias used by API Gateway, the function remains warm even after periods of inactivity, ensuring consistent low latency during traffic ramp-ups.
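Provisioned concurrency is configured against a published version or alias, never against $LATEST. The sketch below builds the arguments for Lambda's PutProvisionedConcurrencyConfig operation; the helper name, function name, alias, and concurrency value are placeholders chosen for the example.

```python
def provisioned_concurrency_request(function_name: str, alias: str, executions: int) -> dict:
    """Build PutProvisionedConcurrencyConfig arguments for a published alias."""
    if alias == "$LATEST":
        raise ValueError("provisioned concurrency requires a published version or alias")
    if executions < 1:
        raise ValueError("executions must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": alias,
        "ProvisionedConcurrentExecutions": executions,
    }
```

The API Gateway integration then needs to invoke the function through that alias; requests that exceed the provisioned amount fall back to on-demand environments and can still cold start.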

Exam trap

The trap here is that candidates confuse provisioned concurrency with reserved concurrency, or assume that increasing memory or changing runtime settings can fully eliminate cold starts, when only provisioned concurrency guarantees pre-warmed execution environments for latency-sensitive workloads.

How to eliminate wrong answers

Option B is wrong because increasing memory can shorten cold-start duration but does not eliminate cold starts; they still occur after inactivity. Option C is wrong because switching runtimes or removing VPC configuration does not prevent cold starts; these choices can change how long initialization takes, but all Lambda functions can cold start regardless of runtime or VPC settings. Option D is wrong because API Gateway stage variables are static configuration values, not mechanisms to warm functions; they cannot force initialization at request time.

586

MCQmedium

A ticket booking system runs on EC2 instances behind an Application Load Balancer. The design must tolerate the failure of one Availability Zone. What should the Auto Scaling group configuration include? The architecture review board prefers a managed AWS-native control.

A.Subnets in at least two Availability Zones with health checks enabled

B.All instances in one larger subnet

C.A Network Load Balancer in one subnet

D.A single EC2 instance with detailed monitoring

AnswerA

An Auto Scaling group spanning multiple AZs can replace unhealthy instances and maintain capacity during an AZ failure.

Why this answer

Option A is correct because an Auto Scaling group configured with subnets in at least two Availability Zones ensures that if one AZ fails, the remaining AZ(s) can continue to serve traffic. Health checks on the EC2 instances allow the Auto Scaling group to detect and replace unhealthy instances, maintaining the desired capacity across the surviving AZs. This aligns with the requirement for a managed AWS-native control to tolerate an AZ failure.

Exam trap

The trap here is that candidates might think a single large subnet or a Network Load Balancer provides AZ resilience, but subnets are AZ-scoped and an NLB is a separate load-balancing component, not an Auto Scaling group configuration setting.

How to eliminate wrong answers

Option B is wrong because a subnet exists in exactly one Availability Zone, so placing all instances in one larger subnet keeps them in a single AZ; a failure of that AZ would take down all instances. Option C is wrong because a Network Load Balancer (NLB) is not a component of an Auto Scaling group configuration; the question asks what the Auto Scaling group should include, and an NLB is a separate resource, not a configuration setting within the group. Option D is wrong because a single EC2 instance, even with detailed monitoring, cannot tolerate the failure of one Availability Zone; if that instance resides in the failed AZ, the application becomes unavailable, and detailed monitoring does not provide redundancy.

587

MCQmedium

A solutions architect is designing an S3 bucket for an order processing API. The objects must never be publicly accessible, even if a developer later adds an overly broad bucket policy. What should the architect configure?

A.Enable S3 Block Public Access at the account or bucket level

B.Enable server access logging on the bucket

C.Create an IAM policy that denies s3:GetObject to anonymous users

D.Enable S3 Transfer Acceleration

AnswerA

S3 Block Public Access prevents public ACLs and public bucket policies from exposing the bucket.

Why this answer

S3 Block Public Access provides a definitive override that prevents any public access to objects, regardless of bucket policies or ACLs. By enabling this setting at the account or bucket level, the architect ensures that even if a developer later adds an overly broad bucket policy, the objects remain inaccessible to anonymous users. This is the only option that guarantees no public access can be inadvertently granted.

Exam trap

The trap here is that candidates may think an IAM policy denying anonymous access is sufficient, but IAM identity-based policies attach only to IAM users, groups, and roles, so they never apply to anonymous requests. A public bucket policy could still expose the objects, which makes S3 Block Public Access the only reliable safeguard.

How to eliminate wrong answers

Option B is wrong because server access logging only records requests made to the bucket; it does not enforce any access restrictions. Option C is wrong because an IAM policy cannot be attached to anonymous users: identity-based policies apply only to IAM principals, so a later bucket policy that grants public access would still expose the objects. Option D is wrong because S3 Transfer Acceleration is a performance feature that speeds up uploads over long distances; it has no effect on access control or public accessibility.
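
To make the setting concrete, the four Block Public Access flags can be written out as a plain request payload. This is a minimal sketch, assuming the payload would be passed to boto3's `put_public_access_block` call; the helper name `block_public_access_config` and the bucket name in the comment are hypothetical.

```python
import json


def block_public_access_config() -> dict:
    """Return the four S3 Block Public Access flags, all enabled."""
    return {
        "BlockPublicAcls": True,        # reject requests that add public ACLs
        "IgnorePublicAcls": True,       # ignore public ACLs that already exist
        "BlockPublicPolicy": True,      # reject bucket policies that grant public access
        "RestrictPublicBuckets": True,  # restrict access to buckets that have a public policy
    }


config = block_public_access_config()
print(json.dumps(config, indent=2))

# Assumption: with boto3 installed and credentials configured, the dict
# could be applied to a bucket like this (not executed here):
#   boto3.client("s3").put_public_access_block(
#       Bucket="example-bucket",  # hypothetical bucket name
#       PublicAccessBlockConfiguration=config,
#   )
```

Enabling all four flags is the strictest combination; applying them at the account level covers every bucket in the account rather than a single one.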

588

MCQmedium

A solutions architect is designing an S3 bucket for an IoT ingestion API. The objects must never be publicly accessible, even if a developer later adds an overly broad bucket policy. What should the architect configure? The design must avoid adding custom operational scripts.

A.Enable S3 Transfer Acceleration

B.Create an IAM policy that denies s3:GetObject to anonymous users

C.Enable S3 Block Public Access at the account or bucket level

D.Enable server access logging on the bucket

AnswerC

S3 Block Public Access prevents public ACLs and public bucket policies from exposing the bucket.

Why this answer

Option C is correct because S3 Block Public Access provides a definitive override that prevents any public access to objects, regardless of bucket policies or object ACLs. This setting, when enabled at the account or bucket level, ensures that even if a developer later attaches an overly permissive bucket policy, the public access is blocked. It meets the requirement of avoiding custom operational scripts by being a native, configurable S3 feature.

Exam trap

The trap here is that candidates may think an IAM policy denying s3:GetObject to anonymous users is sufficient, but anonymous users are not IAM principals, so such a policy has no effect on anonymous access granted by a bucket policy.

How to eliminate wrong answers

Option A is wrong because S3 Transfer Acceleration is a performance feature that speeds up uploads over long distances using edge locations; it does not control access permissions or prevent public access. Option B is wrong because an IAM policy that denies s3:GetObject to anonymous users only applies to IAM principals, not to anonymous requests; anonymous users are not IAM entities, so this policy would not block public access granted by a bucket policy. Option D is wrong because server access logging records requests to the bucket for auditing purposes but does not enforce any access restrictions or prevent public access.

589

Multi-Selecteasy

A service processes messages from an Amazon SQS queue. Sometimes the worker finishes the business logic but does not delete the message before the visibility timeout expires, so the message is delivered again. Which two changes improve resilience and reduce the impact of duplicate processing? Select two.

Select 2 answers

A.Make the message handler idempotent.

B.Set the SQS visibility timeout long enough for normal processing to complete.

C.Switch from SQS to Amazon SNS for reliable buffering.

D.Shorten the queue retention period so messages expire quickly.

E.Disable retries in the consumer application.

AnswersA, B

SQS provides at-least-once delivery, so the same message can be seen more than once. An idempotent handler ensures a repeated delivery does not create duplicate records, duplicate payments, or other repeated side effects.

Why this answer

Option A is correct because making the message handler idempotent ensures that even if a message is processed multiple times (due to visibility timeout expiry), the business outcome remains the same. Idempotency is a key design pattern for resilient architectures when using at-least-once delivery systems like SQS. Option B is correct because setting the visibility timeout long enough for normal processing prevents premature redelivery, reducing the chance of duplicate processing in the first place.

Exam trap

The trap here is that candidates often think disabling retries or switching to SNS will solve the duplicate processing issue, but they fail to recognize that SQS's at-least-once delivery model inherently requires idempotent consumers and proper visibility timeout configuration.
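
The idempotency idea can be sketched in a few lines. This is an illustration only, assuming an in-memory set stands in for a durable store (in practice something like a database table with a conditional write); `make_idempotent_handler` and the message shape are hypothetical.

```python
from typing import Callable, Dict, Set


def make_idempotent_handler(
    side_effect: Callable[[Dict[str, str]], None],
    processed_ids: Set[str],
) -> Callable[[Dict[str, str]], bool]:
    """Wrap side_effect so each message ID triggers it at most once."""

    def handle(message: Dict[str, str]) -> bool:
        message_id = message["MessageId"]
        if message_id in processed_ids:
            # Redelivery (for example after the visibility timeout expired):
            # skip the business logic but still allow the caller to delete it.
            return False
        side_effect(message)
        processed_ids.add(message_id)
        return True

    return handle


charges = []  # records every time the business side effect actually runs
seen: Set[str] = set()
handler = make_idempotent_handler(lambda m: charges.append(m["Body"]), seen)

msg = {"MessageId": "msg-1", "Body": "charge order 42"}
first = handler(msg)
second = handler(msg)  # simulated duplicate delivery of the same message
print(first, second, charges)
```

A real consumer would record the key and perform the side effect atomically, or use a business key such as an order ID, so that a crash between the two steps cannot cause a repeat.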

590

Multi-Selecthard

A latency-sensitive telemetry service uses a custom TCP protocol on EC2 instances in private subnets. The service must preserve the client source IP for rate limiting, avoid HTTP header inspection, and keep per-request overhead as low as possible. Which changes should the team make? Select three.

Select 3 answers

A.Use a Network Load Balancer in front of the service.

B.Use a TCP or TLS listener rather than an HTTP listener.

C.Register instance or IP targets so the service can receive the original client source IP for rate limiting.

D.Use an Application Load Balancer because path-based routing improves throughput for binary protocols.

E.Expose the service through API Gateway because it supports raw TCP and UDP pass-through.

AnswersA, B, C

Correct because NLB is built for high-throughput, low-latency TCP traffic. It avoids HTTP-layer processing and is the right load balancer for a custom binary protocol.

Why this answer

Option A is correct because a Network Load Balancer (NLB) operates at Layer 4 and preserves the client source IP by default when instances are registered as targets. This allows the telemetry service to use the original IP for rate limiting without requiring HTTP header inspection, which is critical for a custom TCP protocol. NLB also introduces minimal latency and low per-request overhead, making it ideal for latency-sensitive workloads.

Exam trap

The trap here is that candidates may assume an Application Load Balancer is always better for routing logic, but for non-HTTP protocols and latency-sensitive workloads, the Network Load Balancer is the correct choice because it operates at Layer 4 without protocol inspection.

591

Multi-Selectmedium

A company runs a production database on Amazon RDS for MySQL with Multi-AZ enabled. The database experiences a sudden increase in read traffic due to a marketing campaign. Which three strategies would help ensure the database remains resilient under heavy read traffic? (Choose three.)

Select 3 answers

A.Create additional read replicas in different Availability Zones to distribute read traffic.

B.Enable Multi-AZ on the read replicas to provide automatic failover for read operations.

C.Use an RDS Proxy between the application and the database to manage connection pooling.

D.Promote a read replica to a standalone DB instance to offload write traffic.

E.Configure the application to use the read replica endpoint for read queries and the primary endpoint for writes.

F.Increase the storage size of the primary DB instance to improve I/O throughput.

AnswersA, C, E

Why this answer

Creating additional read replicas in different Availability Zones distributes read traffic across multiple isolated locations, reducing load on the primary instance and improving read scalability. Using RDS Proxy between the application and the database manages connection pooling, which reduces the number of database connections and prevents resource exhaustion under heavy traffic. Configuring the application to use the read replica endpoint for read queries and the primary endpoint for writes offloads read traffic from the primary instance, preserving its capacity for write operations and maintaining write availability.

Exam trap

The trap here is that candidates often confuse Multi-AZ with read replicas. Multi-AZ adds a standby for failover, not extra read capacity, so enabling it on replicas does not help serve more reads. Promoting a replica does not offload writes either; it creates a separate, independent write target without reducing load on the original primary.

592

MCQmedium

An analytics dashboard uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The team wants the control to be enforceable during normal operations.

A.S3 lifecycle policy

B.RDS read replica and route reporting queries to it

C.Multi-AZ standby and route reads to the standby

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

B is correct because an RDS read replica is designed to offload read-heavy workloads from the primary database instance. By routing reporting queries to the read replica, the primary database is freed from processing these read-only requests, improving overall performance. This solution is enforceable during normal operations as the read replica is always available for reads, unlike a Multi-AZ standby which is not accessible for reads.

Exam trap

The trap here is confusing a Multi-AZ standby (which is not readable) with a read replica (which is readable), leading candidates to incorrectly choose C thinking the standby can serve reads.

How to eliminate wrong answers

Option A is wrong because an S3 lifecycle policy manages object transitions and expirations in S3, not database query routing or read offloading. Option C is wrong because a Multi-AZ standby is a synchronous replica used only for failover and is not accessible for read queries during normal operations; routing reads to it would fail. Option D is wrong because a larger NAT gateway increases outbound internet capacity for private subnets, which does not address database read performance or query routing.
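
Routing reporting queries to the replica is an application-side decision. The sketch below shows one simple way to split traffic, assuming two hypothetical endpoint names and a deliberately naive rule (statements starting with `SELECT` go to the replica); `choose_endpoint` is not part of any AWS SDK.

```python
PRIMARY_ENDPOINT = "primary.example.internal"  # hypothetical writer endpoint
REPLICA_ENDPOINT = "replica.example.internal"  # hypothetical read replica endpoint


def choose_endpoint(sql: str) -> str:
    """Send plain SELECT statements to the replica, everything else to the primary."""
    words = sql.split()
    first_word = words[0].upper() if words else ""
    # Naive rule for illustration: it does not detect SELECT ... FOR UPDATE
    # or reads that must see the latest write.
    if first_word == "SELECT":
        return REPLICA_ENDPOINT
    return PRIMARY_ENDPOINT


for statement in (
    "SELECT region, SUM(total) FROM orders GROUP BY region",
    "UPDATE orders SET status = 'shipped' WHERE id = 7",
):
    print(choose_endpoint(statement), "<-", statement)
```

Because RDS read replicas replicate asynchronously, queries that need the most recent write should still go to the primary.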

593

MCQmedium

Based on the exhibit, the team wants to stop poison messages from consuming worker capacity and also prevent duplicate side effects if the same message is delivered more than once. Which design change best meets the requirement?

A.Increase the SQS queue batch size so each worker processes more messages per request.

B.Replace SQS with Amazon SNS and let each worker subscribe directly to the topic.

C.Configure a dead-letter queue and make the handler idempotent by storing a durable processed-message key.

D.Disable retries and shorten the visibility timeout so failed messages disappear sooner.

AnswerC

A dead-letter queue isolates messages that repeatedly fail so they stop wasting worker capacity. Idempotency ensures a message processed more than once does not create duplicate side effects, which is essential when visibility timeouts expire or retries occur. Together, these controls address both poison-message handling and at-least-once delivery behavior.

Why this answer

Option C is correct because a dead-letter queue isolates poison messages that repeatedly fail processing, preventing them from consuming worker capacity. Making the handler idempotent by storing a durable processed-message key (e.g., using DynamoDB or a database) ensures that even if the same message is delivered more than once, duplicate side effects are avoided. This combination directly addresses both requirements: stopping poison messages from wasting resources and preventing duplicate processing.

Exam trap

The trap here is that candidates often think disabling retries or increasing batch size solves poison messages, but they fail to realize that only a dead-letter queue isolates problematic messages, and idempotency is required to handle duplicate deliveries inherent in SQS's at-least-once delivery model.

How to eliminate wrong answers

Option A is wrong because increasing the SQS batch size does not prevent poison messages from consuming worker capacity; it only makes each worker process more messages per request, which could actually increase the impact of poison messages. Option B is wrong because Amazon SNS is a push-based pub/sub service rather than a durable work queue: workers can no longer pull messages at their own pace, and switching services does nothing to isolate poison messages or to make the handler safe against repeated deliveries. Option D is wrong because shortening the visibility timeout makes in-flight messages visible again sooner, which increases redelivery and the risk of duplicate side effects, and disabling retries in the consumer does not isolate poison messages; without a dead-letter queue they either keep returning to the queue or are lost rather than handled.
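
The dead-letter queue half of the answer is configured through the source queue's `RedrivePolicy` attribute, which is a JSON string. A minimal sketch follows, assuming a hypothetical DLQ ARN and an illustrative threshold of five receives; the helper name `redrive_policy` is also hypothetical.

```python
import json


def redrive_policy(dlq_arn: str, max_receive_count: int = 5) -> str:
    """Build the JSON string used as the RedrivePolicy queue attribute."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return json.dumps(
        {
            "deadLetterTargetArn": dlq_arn,
            # After this many receives without a delete, SQS moves the
            # message to the dead-letter queue.
            "maxReceiveCount": max_receive_count,
        }
    )


# Hypothetical ARN, for illustration only.
policy = redrive_policy("arn:aws:sqs:us-east-1:123456789012:orders-dlq")
print(policy)

# Assumption: with boto3 this string could be applied with
#   sqs.set_queue_attributes(QueueUrl=..., Attributes={"RedrivePolicy": policy})
```

The receive threshold is a trade-off: too low and transient failures land in the DLQ, too high and a poison message wastes more worker attempts before it is isolated.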

594

MCQhard

Based on the exhibit, a company stores sensitive PDFs in S3 and serves them through CloudFront. Direct requests to the S3 object URL must fail, but CloudFront should still be able to fetch the files securely. Which solution best satisfies the requirement?

A.Leave the bucket public but require CloudFront signed cookies for all users.

B.Use an S3 access point and give it a public policy so CloudFront can reach the objects.

C.Configure CloudFront Origin Access Control for the S3 origin and update the bucket policy to allow only that distribution.

D.Use S3 object ACLs to grant read access only to users behind CloudFront.

AnswerC

OAC lets CloudFront sign origin requests to S3, while the bucket policy can deny all other principals and block direct URL access.

Why this answer

Option C is correct because CloudFront Origin Access Control (OAC) allows CloudFront to authenticate requests to an S3 origin using a specific identity, and the bucket policy can be configured to grant access only to that CloudFront distribution. This ensures that direct S3 object URL requests fail (since they lack the CloudFront signature), while CloudFront can still fetch the files securely using the OAC mechanism.

Exam trap

The trap here is that candidates often confuse signed URLs/cookies (which control user access) with origin access controls (which control how CloudFront fetches from S3), leading them to pick options that still allow direct S3 access.

How to eliminate wrong answers

Option A is wrong because leaving the bucket public would allow anyone with the S3 object URL to access the files directly, violating the requirement that direct requests must fail. Option B is wrong because an S3 access point with a public policy would still allow direct public access to the objects, bypassing CloudFront. Option D is wrong because S3 object ACLs cannot restrict access based on the requester being behind CloudFront; they only grant permissions to specific AWS accounts or canonical users, not to CloudFront distributions.
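
The bucket policy that pairs with OAC can be sketched as follows. It grants `s3:GetObject` to the CloudFront service principal only when the request comes from one specific distribution. The bucket name, account ID, and distribution ID below are hypothetical placeholders, as is the helper name `oac_bucket_policy`.

```python
import json


def oac_bucket_policy(bucket: str, account_id: str, distribution_id: str) -> dict:
    """Allow reads only from one CloudFront distribution via its service principal."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "StringEquals": {
                        # Restricts access to a single distribution.
                        "AWS:SourceArn": (
                            f"arn:aws:cloudfront::{account_id}:"
                            f"distribution/{distribution_id}"
                        )
                    }
                },
            }
        ],
    }


# All three values are made up for illustration.
policy = oac_bucket_policy("example-pdf-bucket", "123456789012", "EDFDVBD6EXAMPLE")
print(json.dumps(policy, indent=2))
```

With no other Allow statements and Block Public Access enabled, a direct request to the S3 object URL has no matching permission and is denied by default.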

595

MCQeasy

A retail platform needs disaster recovery across AWS Regions. The business requirement is: RTO up to 6 hours, RPO up to 1 hour, and they want the ability to start serving quickly during a Region outage but do not want to run full production capacity continuously. Which DR strategy best fits these requirements?

A.Backup and restore only, with no continuously running infrastructure in the secondary Region.

B.Pilot light, keeping only the minimum resources needed to bootstrap the environment.

C.Warm standby, keeping a reduced but ready-to-scale environment in the secondary Region.

D.Multi-site active-active, serving production traffic from both Regions at all times.

AnswerC

Warm standby maintains enough infrastructure to reduce recovery time, while not fully running production capacity continuously.

Why this answer

Warm standby is the best fit because it maintains a scaled-down but fully functional copy of the production environment in the secondary Region, allowing the RTO of 6 hours and RPO of 1 hour to be met without running full production capacity continuously. During a disaster, the standby environment can be scaled up quickly to serve traffic, balancing cost and recovery speed.

Exam trap

The trap here is that candidates confuse pilot light with warm standby, assuming that minimal resources (pilot light) can meet the 6-hour RTO, but pilot light requires manual provisioning of compute and scaling, which often exceeds the RTO, while warm standby provides a pre-provisioned, ready-to-scale environment that meets the requirement.

How to eliminate wrong answers

Option A is wrong because backup and restore only would result in an RTO significantly longer than 6 hours, as it requires provisioning infrastructure and restoring data from backups, which cannot meet the 1-hour RPO or 6-hour RTO. Option B is wrong because pilot light keeps only the minimal core resources (e.g., database, DNS) and requires manual provisioning of compute and scaling, which typically exceeds the 6-hour RTO and may not achieve the 1-hour RPO without additional automation. Option D is wrong because multi-site active-active runs full production capacity in both Regions at all times, which violates the requirement to not run full production capacity continuously and incurs unnecessary cost.

596

MCQeasy

A latency-sensitive API is implemented with AWS Lambda. During traffic ramp-ups, users sometimes experience slow responses due to cold starts. The team wants to ensure fast initialization for a baseline level of concurrent requests. Which AWS feature should they use?

A.Lambda provisioned concurrency

B.Increase reserved instances for EC2

C.Enable S3 event notifications for every request to the API

D.Decrease the function timeout to reduce execution variability

AnswerA

Provisioned concurrency pre-initializes a specified number of Lambda execution environments and keeps them ready for invocation. This reduces or eliminates cold starts for the configured baseline concurrency during traffic ramp-ups.

Why this answer

Lambda Provisioned Concurrency keeps a specified number of execution environments initialized and ready to respond immediately, eliminating cold starts for those concurrent requests. This directly addresses the latency-sensitive API requirement during traffic ramp-ups by ensuring fast initialization for a baseline level of concurrency.

Exam trap

The trap here is that candidates may confuse 'provisioned concurrency' with 'reserved concurrency' (which only caps concurrency, not pre-warms) or think that reducing the function timeout or adding S3 triggers can somehow mitigate cold starts.

How to eliminate wrong answers

Option B is wrong because reserved instances for EC2 apply to EC2 compute capacity, not to Lambda functions, and would not address Lambda cold starts. Option C is wrong because S3 event notifications are used to trigger Lambda functions on S3 object events, not to pre-warm Lambda execution environments, and adding them for every API request would introduce unnecessary complexity and latency. Option D is wrong because decreasing the function timeout does not reduce cold start latency; it only limits the maximum execution duration, and may actually increase execution variability by forcing premature terminations.
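
As a rough illustration, the request for provisioned concurrency needs a function name, a published version or alias, and the baseline number of environments. The sketch below only builds and validates that payload; the function name, alias, and helper `provisioned_concurrency_request` are hypothetical, and the boto3 call in the comment is not executed.

```python
def provisioned_concurrency_request(
    function_name: str, qualifier: str, baseline: int
) -> dict:
    """Build parameters for configuring provisioned concurrency on a version or alias."""
    if qualifier == "$LATEST":
        # Provisioned concurrency applies to a published version or an alias,
        # not to the unpublished $LATEST version.
        raise ValueError("qualifier must be a published version or alias")
    if baseline < 1:
        raise ValueError("baseline must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": baseline,
    }


# Hypothetical function and alias names.
request = provisioned_concurrency_request("orders-api", "live", 25)
print(request)

# Assumption: with boto3 this could be sent as
#   boto3.client("lambda").put_provisioned_concurrency_config(**request)
```

Requests above the configured baseline still run, but they use on-demand environments and can see cold starts.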

597

MCQmedium

A caching layer uses Amazon ElastiCache for Redis in front of a stateless web service. The service must continue to read cached responses during maintenance events and should automatically fail over to another node if one AZ becomes impaired. Which design change best satisfies this requirement?

A.Deploy a single-node Redis cluster and rely on application-level retries when cache misses occur.

B.Configure an ElastiCache Redis replication group with automatic failover across multiple Availability Zones.

C.Move the cache into the VPC but keep it in one Availability Zone to reduce network latency.

D.Use a Memcached cluster and configure only client-side connection pooling without failover support.

AnswerB

Multi-AZ replication groups provide redundant nodes and automatic failover, improving cache resilience during AZ events.

Why this answer

Option B is correct because an ElastiCache Redis replication group with automatic failover across multiple Availability Zones ensures that if the primary node or its AZ becomes impaired, a read-replica in another AZ is automatically promoted to primary. This allows the stateless web service to continue reading cached responses without interruption, satisfying both the maintenance and AZ impairment requirements.

Exam trap

The trap here is that candidates often confuse Memcached's simplicity with Redis's replication capabilities, assuming that client-side connection pooling alone can handle failover, when in fact Memcached lacks any built-in replication or automatic failover mechanism.

How to eliminate wrong answers

Option A is wrong because a single-node Redis cluster provides no redundancy; if the node fails or its AZ becomes impaired, the cache is completely unavailable, forcing the web service to fall back to the origin (cache miss) until the node is restored, which violates the requirement for automatic failover. Option C is wrong because keeping the cache in one Availability Zone does not protect against AZ impairment; a single-AZ deployment cannot automatically fail over to another node in a different AZ, so the service would lose cached responses during an AZ outage. Option D is wrong because Memcached does not support replication or automatic failover; it is a pure caching engine with no built-in mechanism to promote a standby node, and client-side connection pooling alone cannot provide failover if a node or AZ becomes impaired.
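
The settings behind option B can be sketched as a request payload for a replication group. This assumes the parameters would be passed to boto3's `create_replication_group`; the group ID, node type, and helper name `redis_replication_group_request` are hypothetical.

```python
def redis_replication_group_request(group_id: str, replicas: int) -> dict:
    """Build parameters for a Redis replication group with Multi-AZ automatic failover."""
    if replicas < 1:
        # Automatic failover needs at least one replica to promote.
        raise ValueError("at least one replica is required for failover")
    return {
        "ReplicationGroupId": group_id,
        "ReplicationGroupDescription": "Cache with automatic failover",
        "Engine": "redis",
        "CacheNodeType": "cache.t3.micro",  # hypothetical node size
        "NumCacheClusters": 1 + replicas,    # one primary plus the replicas
        "AutomaticFailoverEnabled": True,
        "MultiAZEnabled": True,
    }


request = redis_replication_group_request("web-cache", replicas=2)
print(request)
```

Reads can continue against the replicas while a failover promotes one of them to primary.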

598

MCQmedium

An ECS service runs on EC2 instances and is fronted by an ALB. The ALB spans two Availability Zones, and the ECS service desired count is 2 tasks. The underlying EC2 capacity uses an Auto Scaling group (ASG) with min size set to 1, and in practice the ASG is attached to only one subnet. What is the most effective change to meet the requirement that the service continues during a single-AZ instance loss?

A.Set the ECS deployment configuration to maximum percent 100 so tasks replace instances faster during rollouts.

B.Increase ASG min size to at least 2 and ensure the ASG uses subnets in at least two Availability Zones.

C.Enable ALB connection draining longer than expected so existing connections survive longer during an AZ event.

D.Reduce task memory reservations to pack both tasks onto a single EC2 instance.

AnswerB

Multi-AZ instance capacity ensures tasks have eligible compute in another AZ when one AZ loses instances.

Why this answer

Option B is correct because the current architecture has a single point of failure: the ASG spans only one subnet (one AZ), so if that AZ fails, all EC2 instances are lost, and the ECS service cannot run any tasks. By increasing the ASG min size to at least 2 and ensuring it uses subnets in at least two AZs, the ASG will maintain at least one healthy instance in each AZ, allowing the ECS service to survive a single-AZ outage. This aligns with the AWS Well-Architected Framework's principle of deploying across multiple AZs for high availability.

Exam trap

The trap here is that candidates may focus on the ALB's multi-AZ configuration and overlook that the compute layer (ASG/EC2) is the actual bottleneck, leading them to choose connection draining or deployment settings that do not address the fundamental lack of cross-AZ capacity.

How to eliminate wrong answers

Option A is wrong because setting the ECS deployment configuration maximum percent to 100 controls how many tasks are replaced during a rolling update, not the ability to survive an AZ failure; it does not address the underlying lack of EC2 capacity across AZs. Option C is wrong because ALB connection draining only gracefully terminates existing connections during deregistration or health check failures; it does not prevent service interruption when all EC2 instances in the single AZ become unavailable. Option D is wrong because reducing task memory reservations to pack both tasks onto a single EC2 instance actually increases risk—if that single instance or its AZ fails, both tasks are lost, violating the resilience requirement.
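
The two conditions in option B can be expressed as a small pre-deployment check. This is a simplified sketch that tests only the necessary conditions named in the answer (min size of at least 2 and subnets in at least two AZs); the function name and input shape are hypothetical.

```python
from typing import List


def meets_multi_az_baseline(min_size: int, subnet_azs: List[str]) -> bool:
    """Check the two conditions from option B: min size >= 2 and subnets in >= 2 AZs."""
    distinct_azs = set(subnet_azs)
    return min_size >= 2 and len(distinct_azs) >= 2


# The configuration described in the question: one instance, one AZ.
current = meets_multi_az_baseline(1, ["us-east-1a"])

# The configuration after applying option B.
proposed = meets_multi_az_baseline(2, ["us-east-1a", "us-east-1b"])

print(current, proposed)
```

Passing the check does not prove resilience on its own; ECS task placement also needs to spread the two tasks across the instances in different AZs.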

599

MCQmedium

A video platform uses Amazon Aurora. The workload has many short-lived database connections from Lambda functions, causing connection storms. What should be added? The design must avoid adding custom operational scripts.

A.S3 Select

B.An internet gateway

C.A larger Route 53 hosted zone

D.RDS Proxy

AnswerD

RDS Proxy pools and manages database connections, improving scalability for serverless and bursty workloads.

Why this answer

RDS Proxy is the correct choice because it sits between Lambda functions and the Aurora database, pooling and reusing database connections. This prevents connection storms by reducing the overhead of establishing new connections for each short-lived Lambda invocation, without requiring custom scripts or application changes.

Exam trap

The trap here is that candidates may think scaling the database (e.g., using Aurora Auto Scaling) or adding more compute resources solves connection storms, but the real bottleneck is the connection overhead itself, which RDS Proxy directly addresses without custom scripts.

How to eliminate wrong answers

Option A is wrong because S3 Select is a service for retrieving subsets of data from objects in Amazon S3 using SQL expressions; it does not manage database connections or address connection storms. Option B is wrong because an internet gateway enables VPC-to-internet communication for public subnets; it has no role in database connection pooling or reducing connection overhead. Option C is wrong because a larger Route 53 hosted zone increases the number of DNS records you can host but does not affect database connection management or mitigate connection storms.

600

MCQhard

A financial services company must store audit logs in S3 for 7 years and ensure that no one — including the AWS account root user — can delete or overwrite the logs during the retention period. Which S3 Object Lock configuration should a solutions architect use?

A.Object Lock in Compliance mode with a 7-year retention period

B.Object Lock in Governance mode with a 7-year retention period

C.S3 Versioning with a lifecycle rule to transition objects to Glacier after 7 years

D.A bucket policy with Deny for s3:DeleteObject applied to all principals including root

AnswerA

Compliance mode prevents ALL users including root from deleting or overwriting objects before retention expires. The period cannot be shortened, satisfying strict financial regulatory requirements.

Why this answer

S3 Object Lock in Compliance mode prevents ALL users — including the root account — from deleting or overwriting objects before the retention period expires. The retention period itself cannot be shortened once set in Compliance mode.

Governance mode also prevents most deletions, but users with s3:BypassGovernanceRetention permission (and the root account) can delete objects or shorten the retention period. For regulatory requirements where not even root can override, Compliance mode is mandatory.

Exam trap

Candidates choose Governance mode because 'governance' sounds strict. In AWS terminology, Governance is the LESS strict option — it can be bypassed by privileged users. Compliance mode is immutable: no one can remove the retention until the period expires.

This distinction is critical for financial regulations like SEC Rule 17a-4 and FINRA requirements.

Why the other options are wrong

Governance mode can be bypassed by the root account and users with s3:BypassGovernanceRetention permission. This does NOT meet the requirement that no one including root can delete the logs.

S3 Versioning prevents accidental deletion by keeping previous versions, but a privileged user can permanently delete all versions. Lifecycle rules manage storage class transitions — they do not prevent deletion. Compliance mode is required.

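A bucket policy Deny is not a durable control: the account root user can always edit or delete the bucket policy and then delete the objects, so the protection can be removed at any point during the 7 years. AWS Organizations SCPs can restrict the root user of member accounts, but Object Lock in Compliance mode is the S3 control that no principal, including root, can remove before the retention period expires.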
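
A default retention rule for the bucket can be sketched as the payload below. This assumes it would be passed to boto3's `put_object_lock_configuration` on a bucket that has Object Lock enabled; the helper name `compliance_lock_config` is hypothetical.

```python
def compliance_lock_config(years: int) -> dict:
    """Build a default Object Lock retention rule in Compliance mode."""
    if years < 1:
        raise ValueError("years must be at least 1")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                "Mode": "COMPLIANCE",  # the alternative value is "GOVERNANCE"
                "Years": years,
            }
        },
    }


config = compliance_lock_config(7)
print(config)

# Assumption: with boto3 this could be applied as
#   s3.put_object_lock_configuration(
#       Bucket="example-audit-logs",  # hypothetical bucket name
#       ObjectLockConfiguration=config,
#   )
```

The default rule applies to objects written after it is set; each object version then carries its own retain-until date.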
