SAA-C03 Questions 826–900 | Page 12/14

826

MCQ | easy

An EC2 workload runs in one region on a single instance type. For the last month, CloudWatch metrics show average CPU utilization of 12% and no sustained memory pressure. The team wants to reduce cost while maintaining the current performance level. What is the best first step?

A. Use AWS Compute Optimizer to get recommendations for instance type and size changes.

B. Increase the instance size to reduce the risk of performance regression.

C. Switch to Spot Instances immediately to reduce cost regardless of utilization.

D. Disable detailed monitoring to lower CloudWatch charges.

Answer: A

AWS Compute Optimizer analyzes historical metrics (such as CPU and memory utilization) and recommends instance type and size changes that lower cost without sacrificing performance. Given sustained low CPU and no sustained memory pressure, it is the most direct first step toward identifying a smaller, less overprovisioned instance configuration that can maintain performance.

Why this answer

AWS Compute Optimizer analyzes historical utilization metrics (CPU, network, and disk I/O, plus memory when the CloudWatch agent publishes it) and provides actionable right-sizing recommendations. Given average CPU utilization of only 12% and no memory pressure, Compute Optimizer will likely recommend a smaller instance size or a different instance family that matches the workload's actual resource needs, reducing cost without affecting performance.

Exam trap

The trap here is that candidates may think increasing instance size (Option B) is a safe 'performance buffer' move, but the question explicitly asks to reduce cost while maintaining current performance, making right-sizing via Compute Optimizer the logical first step.

How to eliminate wrong answers

Option B is wrong because increasing instance size raises cost when utilization is already low, which works against the goal of cost reduction. Option C is wrong because switching to Spot Instances without first confirming that the workload tolerates interruptions risks lost work and degraded availability; Spot capacity is cheaper, but it can be reclaimed at any time. Option D is wrong because disabling detailed monitoring (1-minute metrics) saves only a trivial amount, leaves the primary cost driver (compute instance charges) untouched, and removes the granular visibility needed for right-sizing decisions.

827

MCQ | medium

A security operations team wants continuous compliance checks for AWS resources. They need to know when an EBS volume becomes unencrypted or when a security group starts allowing SSH from 0.0.0.0/0. Which AWS service should they use?

A. AWS CloudTrail, because it records every API call made in the account.

B. AWS Config, because it evaluates resource configuration against compliance rules.

C. Amazon GuardDuty, because it automatically encrypts noncompliant resources.

D. Amazon Macie, because it manages encryption settings for all AWS resources.

Answer: B

AWS Config is the right service for continuous resource compliance monitoring. It tracks configuration changes over time and can evaluate rules that check for conditions such as encrypted EBS volumes or overly permissive security groups. This makes it ideal for governance and drift detection, especially when the team needs to know the current state of resources rather than only the history of API calls.

Why this answer

AWS Config is the correct service because it continuously monitors and evaluates the configuration of AWS resources against desired compliance rules. It can detect when an EBS volume is unencrypted or when a security group rule allows SSH (port 22) from 0.0.0.0/0, and trigger notifications or remediation actions via AWS Config rules.

Exam trap

The trap here is that candidates confuse AWS CloudTrail's API logging with AWS Config's configuration evaluation, assuming that recording API calls is sufficient for compliance checks, but CloudTrail does not assess the current state of resources against rules.

How to eliminate wrong answers

Option A is wrong because AWS CloudTrail records API calls (e.g., who made a change) but does not evaluate the current configuration state of resources against compliance rules; it lacks the ability to detect an unencrypted EBS volume or an overly permissive security group rule without additional analysis. Option C is wrong because Amazon GuardDuty is a threat detection service that analyzes network traffic and DNS logs for malicious activity, not a configuration compliance tool; it does not automatically encrypt resources or evaluate security group rules. Option D is wrong because Amazon Macie is a data security service that uses machine learning to discover and protect sensitive data in S3, not a service for managing encryption settings or evaluating resource configurations like EBS volumes or security groups.
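The two checks in the question can be sketched as plain evaluation logic. AWS Config ships managed rules such as `restricted-ssh` and `encrypted-volumes` for these conditions; the code below is not their implementation, only a minimal illustration of what "evaluating configuration against a rule" means. The function names are hypothetical, and the dictionary shapes are assumed to follow the EC2 `DescribeSecurityGroups` and `DescribeVolumes` responses.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def allows_open_ssh(security_group):
    """Return True if any inbound rule opens TCP port 22 to the whole internet."""
    for perm in security_group.get("IpPermissions", []):
        protocol = perm.get("IpProtocol")
        if protocol == "-1":
            # "-1" means all protocols and all ports
            covers_ssh = True
        elif protocol == "tcp":
            covers_ssh = perm.get("FromPort", 0) <= 22 <= perm.get("ToPort", 65535)
        else:
            covers_ssh = False
        if not covers_ssh:
            continue
        cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
        if any(cidr in OPEN_CIDRS for cidr in cidrs):
            return True
    return False


def evaluate(resource_type, configuration):
    """Map a resource's current configuration to a compliance result."""
    if resource_type == "AWS::EC2::SecurityGroup":
        return "NON_COMPLIANT" if allows_open_ssh(configuration) else "COMPLIANT"
    if resource_type == "AWS::EC2::Volume":
        return "COMPLIANT" if configuration.get("Encrypted") else "NON_COMPLIANT"
    return "NOT_APPLICABLE"
```

The point of the sketch is that the input is the resource's current state, not an API call record, which is the distinction between AWS Config and CloudTrail that the question tests.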

828

MCQ | easy

A media company runs a batch job that processes image thumbnails. The job can be restarted from checkpoints and does not have user-facing SLAs. The batch capacity can tolerate interruptions. Which EC2 purchasing option is the best cost optimization choice?

A. Use On-Demand Instances because interruptions are not allowed for production workloads.

B. Use EC2 Spot Instances, accepting the possibility of interruptions and using checkpoints to resume.

C. Purchase Reserved Instances because they provide a discount regardless of the workload timing.

D. Buy Savings Plans because they guarantee capacity and remove the risk of interruptions entirely.

Answer: B

Spot Instances are typically the cheapest option for workloads that can tolerate interruptions with recovery.

Why this answer

Spot Instances offer significant cost savings (up to 90% compared to On-Demand) and are ideal for fault-tolerant, stateless, or checkpointable workloads like batch image thumbnail processing. Since the job can resume from checkpoints and tolerates interruptions, Spot Instances provide the best cost optimization without compromising functionality.

Exam trap

The trap here is that candidates may assume every production workload requires On-Demand or Reserved Instances, as Option A implies. The question explicitly states that the job has no user-facing SLAs and tolerates interruptions, which makes Spot Instances the correct cost-optimized choice.

How to eliminate wrong answers

Option A is wrong because On-Demand Instances cost more than Spot, and the interruption-free operation they pay for is not needed by a job that can resume from checkpoints. Option C is wrong because Reserved Instances require a 1- or 3-year commitment and are designed for steady-state, predictable workloads, not for batch jobs that can be interrupted and resumed. Option D is wrong because Savings Plans offer discounted rates in exchange for a commitment to a consistent amount of compute usage (measured in $/hour), but they do not reserve capacity, and they do not apply to Spot usage at all; for an interruption-tolerant job, Spot pricing is typically lower than the Savings Plans rate.
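The checkpoint-and-resume behavior that makes this job a good fit for Spot can be sketched as follows. This is a minimal illustration, not a production pattern: the function name, the `interrupt_after` parameter (which simulates a Spot interruption), and the local checkpoint file are all assumptions. A real job would keep its checkpoint in durable storage outside the instance, such as S3.

```python
import json
import os


def process_batch(items, checkpoint_path, interrupt_after=None):
    """Process items in order, persisting progress after each one.

    interrupt_after simulates the instance being reclaimed after N items
    in this run. Returns the work completed during this run only.
    """
    start = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            start = json.load(fh)["next_index"]  # resume where the last run stopped

    done_this_run = []
    for index in range(start, len(items)):
        if interrupt_after is not None and len(done_this_run) >= interrupt_after:
            break  # simulated interruption; progress is already on disk
        done_this_run.append(f"thumb-{items[index]}")  # stand-in for real work
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as fh:
            json.dump({"next_index": index + 1}, fh)
        os.replace(tmp_path, checkpoint_path)  # atomic swap of the checkpoint
    return done_this_run
```

Running the function once with `interrupt_after=2` and then again without it processes every item exactly once across the two runs, which is the property that lets the job absorb an interruption.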

829

Multi-Select | medium

A workload runs in private subnets and must reach Amazon S3 and AWS Secrets Manager without using the internet or a NAT gateway. The team wants to keep the traffic on AWS private networking and avoid public IPs. Which two changes should the architect make? Select two.

Select 2 answers

A. Create an S3 gateway VPC endpoint and update the route tables for the private subnets.

B. Place a NAT gateway in the public subnet so the private instances can reach AWS services.

C. Create an interface VPC endpoint for AWS Secrets Manager and allow the workload security group to reach it.

D. Assign public IPv4 addresses to the instances and restrict them with security groups.

E. Use VPC peering to the AWS service endpoints instead of VPC endpoints.

Answers: A, C

An S3 gateway endpoint provides private access to S3 without sending traffic over the internet. It is the correct endpoint type for S3 and integrates through route tables.

Why this answer

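Option A is correct because an S3 gateway VPC endpoint lets instances in private subnets reach S3 over the AWS network without an internet gateway or NAT gateway. Gateway endpoints work through route table entries that send S3-bound traffic to the endpoint, so the instances need no public IPs. Option C is correct because Secrets Manager is reached through an interface VPC endpoint (AWS PrivateLink), which places an elastic network interface with a private IP address in the subnet; the endpoint's security group must allow HTTPS (port 443) from the workload.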

Exam trap

The trap here is that candidates often confuse gateway endpoints (for S3 and DynamoDB) with interface endpoints (for most other services) and may incorrectly assume a NAT gateway is needed for all AWS service access, ignoring that gateway endpoints provide a free, internet-free alternative for S3.
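The difference between the two endpoint types shows up in the parameters each one needs. The sketch below only builds the request parameters as dictionaries and makes no AWS calls; the keys follow the EC2 `CreateVpcEndpoint` API, while the function names and every ID passed in are hypothetical placeholders.

```python
def s3_gateway_endpoint(vpc_id, region, route_table_ids):
    """Gateway endpoint: attached to route tables, no network interface or security group."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }


def secrets_manager_interface_endpoint(vpc_id, region, subnet_ids, security_group_ids):
    """Interface endpoint: a network interface in each subnet, guarded by security groups."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.secretsmanager",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        # Private DNS lets the default service hostname resolve to the endpoint's private IPs
        "PrivateDnsEnabled": True,
    }
```

The gateway endpoint is wired in through `RouteTableIds`, which is why Option A mentions updating route tables; the interface endpoint is wired in through `SubnetIds` and `SecurityGroupIds`, which is why Option C mentions the security group.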

830

Multi-Select | hard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance? The design must avoid adding custom operational scripts.

Select 2 answers

A. S3 Object Lock

B. S3 multipart upload

C. S3 Inventory

D. S3 Transfer Acceleration

Answers: B, D

Multipart upload parallelizes large object upload parts and improves reliability.

Why this answer

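S3 multipart upload is correct because it splits a large file into parts that upload in parallel and can be retried individually, which reduces the impact of latency and packet loss over long distances. The AWS SDKs support it directly, so no custom scripts are needed. S3 Transfer Acceleration is correct because it routes each upload through the nearest AWS edge location and then over the AWS backbone to the bucket, shortening the path that distant users' traffic spends on the public internet.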

Exam trap

The trap here is that candidates may dismiss S3 Transfer Acceleration as requiring heavy client-side changes. In fact it is enabled with a bucket-level setting, and clients only need to send requests to the bucket's accelerate endpoint to use AWS edge locations. Candidates may also overlook multipart upload as a performance feature because it is often associated with reliability rather than speed.
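How a large object is divided for multipart upload can be sketched with simple arithmetic. The code below plans byte ranges only and uploads nothing; the function names and the 8 MiB default part size are assumptions. The two constants reflect documented S3 limits: parts other than the last must be at least 5 MiB, and an upload can have at most 10,000 parts.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB  # S3 minimum for every part except the last
MAX_PARTS = 10_000       # S3 maximum number of parts per upload


def plan_parts(object_size, part_size=8 * MIB):
    """Return (part_number, first_byte, last_byte) tuples covering the object."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum")
    # Grow the part size if the object would otherwise need more than MAX_PARTS parts
    smallest_allowed = -(-object_size // MAX_PARTS)  # ceiling division
    part_size = max(part_size, smallest_allowed)

    parts = []
    offset = 0
    number = 1
    while offset < object_size:
        last_byte = min(offset + part_size, object_size) - 1
        parts.append((number, offset, last_byte))
        offset = last_byte + 1
        number += 1
    return parts


def accelerate_endpoint(bucket):
    """Hostname clients use once Transfer Acceleration is enabled on the bucket."""
    return f"https://{bucket}.s3-accelerate.amazonaws.com"
```

Each planned range can be uploaded independently and in parallel, which is where the throughput gain comes from; in practice the SDK's transfer utilities do this planning for the caller.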

831

MCQ | medium

An application uses an Amazon Aurora DB cluster. The cluster performs an automatic failover from the writer instance to a standby instance. After failover completes, reads succeed, but all new writes fail with errors indicating the application is connecting to the old writer endpoint. Which change best fixes the resiliency issue after failover?

A. Update the application to use the Aurora cluster writer endpoint (or the cluster endpoint intended for writes) rather than an instance-specific endpoint.

B. Enable Multi-AZ on the individual writer instance settings so it can automatically create a new instance during failover.

C. Increase the failover timeout for Aurora to 60 minutes to ensure the app finishes reconnecting.

D. Switch the cluster to a single-AZ configuration to reduce connection retries after failover.

Answer: A

During Aurora failover, the writer role moves to a different underlying DB instance. The cluster writer endpoint is stable and always resolves to the current writer, even after failover. An instance-specific endpoint continues to point to the original (now non-writer) instance, so write operations fail if the application keeps using that stale endpoint.

Why this answer

The application is failing writes because it is connecting to the old writer instance's endpoint, which is no longer the writer after failover. The Aurora cluster writer endpoint is a DNS name that always points to the current primary (writer) instance, regardless of failovers. By using the cluster writer endpoint, the application automatically connects to the new writer after failover, eliminating the need to manually update connection strings.

Exam trap

The trap here is that candidates often confuse instance-specific endpoints with cluster endpoints and assume that failover updates every DNS record. Only the cluster-level endpoints follow role changes: the cluster (writer) endpoint is repointed to the new writer, while an instance endpoint always resolves to the same instance.

How to eliminate wrong answers

Option B is wrong because Aurora already replicates the cluster volume across three Availability Zones and fails over by promoting an existing Aurora Replica; changing instance-level settings does not alter failover behavior or fix the endpoint issue. Option C is wrong because increasing the failover timeout to 60 minutes does not address the root cause; the application will still connect to the old writer endpoint and fail writes indefinitely. Option D is wrong because switching to a single-AZ configuration would reduce resiliency and increase the risk of downtime, and it does not solve the problem of the application using the wrong endpoint.
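A quick way to spot this misconfiguration is to look at the shape of the hostname in the connection string. The sketch below classifies an Aurora endpoint by the documented naming pattern, in which cluster-level endpoints carry a `cluster-`, `cluster-ro-`, or `cluster-custom-` prefix in the second DNS label. The function name is hypothetical, and the check assumes the standard `rds.amazonaws.com` suffix.

```python
def classify_aurora_endpoint(hostname):
    """Classify an Aurora hostname as writer, reader, custom, instance, or unknown."""
    host = hostname.lower().rstrip(".")
    if not host.endswith(".rds.amazonaws.com"):
        return "unknown"
    labels = host.split(".")
    if len(labels) < 6:
        return "unknown"
    second = labels[1]
    if second.startswith("cluster-ro-"):
        return "reader"    # load-balances across replicas
    if second.startswith("cluster-custom-"):
        return "custom"    # user-defined group of instances
    if second.startswith("cluster-"):
        return "writer"    # follows the current writer across failovers
    return "instance"      # pinned to one instance, whatever its role
```

An application whose write path classifies as `instance` is exposed to exactly the failure in this question, because that hostname keeps resolving to the demoted instance after failover.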

832

MCQ | medium

A media platform stores originals in an S3 bucket. The application must: (1) prevent any public access to the bucket, (2) allow authenticated users to upload and download objects using presigned URLs, and (3) enforce that all requests use HTTPS and only touch objects under the user-specific prefix (for example, s3://media-originals/user-123/*). The bucket currently allows uploads but sometimes returns 403 AccessDenied for presigned URLs. Which change is the best fix while meeting the security requirements?

A. Disable S3 Block Public Access and add an ACL that grants READ and WRITE to the bucket owner only.

B. Keep Block Public Access enabled, remove any Allow statement to Principal="*", and use a bucket policy or access point policy that denies non-HTTPS requests and allows PutObject/GetObject only when the object key matches the authenticated user's session tag, such as arn:aws:s3:::media-originals/${aws:PrincipalTag/userId}/*.

C. Use bucket website hosting and allow public GET requests so presigned URLs are not needed for downloads.

D. Use ACLs to grant ObjectOwner full control and rely on the application to generate presigned URLs with longer expirations to avoid 403 errors.

Answer: B

Block Public Access ensures the bucket cannot become public. A policy that denies non-HTTPS traffic and scopes object ARNs to a session tag or equivalent identity attribute enforces user-specific access without relying on public principals.

Why this answer

Option B is correct because it keeps S3 Block Public Access enabled (preventing any public access), uses a bucket policy or access point policy with the `aws:PrincipalTag` policy variable to restrict `PutObject`/`GetObject` to the user-specific prefix (e.g., `arn:aws:s3:::media-originals/${aws:PrincipalTag/userId}/*`), and denies non-HTTPS requests via an `aws:SecureTransport` condition. Presigned URLs then work only for authenticated users with the correct session tag, while HTTPS is enforced and public access stays blocked.

Exam trap

The trap here is that candidates mistakenly think presigned URLs bypass bucket policies, but in reality, presigned URLs are still subject to the bucket policy—so a policy that denies access to anonymous principals or lacks conditions for user-specific prefixes will cause 403 errors even with valid presigned URLs.

How to eliminate wrong answers

Option A is wrong because disabling S3 Block Public Access removes the guardrail that requirement (1) depends on, and an ACL granting READ and WRITE to the bucket owner adds nothing useful: ACLs are a legacy mechanism that cannot enforce user-specific prefixes or HTTPS. Option C is wrong because enabling bucket website hosting and allowing public GET requests exposes the objects to everyone, which violates the requirement to prevent any public access. Option D is wrong because using ACLs to grant ObjectOwner full control does not enforce user-specific prefixes or HTTPS, and longer presigned URL expirations do not fix the 403 error; the 403 is likely caused by a missing or incorrect policy that does not authorize the signing identity for the requested prefix.
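The policy described in Option B can be sketched as a Python dictionary that serializes to policy JSON. This is an illustration under assumptions, not the question's actual policy: the role ARN and account ID are hypothetical placeholders, and the function name is invented. The condition key `aws:SecureTransport` and the policy variable `${aws:PrincipalTag/userId}` come from the question itself.

```python
import json

BUCKET = "media-originals"
APP_ROLE_ARN = "arn:aws:iam::111122223333:role/media-app-user"  # hypothetical role


def build_bucket_policy(bucket, principal_arn):
    """Deny plain-HTTP requests; allow Get/Put only under the caller's own prefix."""
    return {
        "Version": "2012-10-17",  # required for policy variables such as ${aws:PrincipalTag/...}
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            },
            {
                "Sid": "AllowOwnPrefixOnly",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": ["s3:GetObject", "s3:PutObject"],
                # Plain string, not an f-string, so the policy variable is kept literally
                "Resource": f"arn:aws:s3:::{bucket}/" + "${aws:PrincipalTag/userId}/*",
            },
        ],
    }


policy_json = json.dumps(build_bucket_policy(BUCKET, APP_ROLE_ARN), indent=2)
```

A presigned URL is evaluated with the permissions of the identity that signed it, so a URL signed by a session tagged `userId=user-123` is authorized only for keys under `user-123/`.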

833

Multi-Select | hard

A static website stores assets in S3 and is delivered through CloudFront. Analytics show low cache hit ratio, many origin fetches for the same JavaScript bundles, and elevated S3 GET request costs. Most requests include unnecessary cookies, and the text assets are uncompressed. Which changes should the team make? Select three.

Select 3 answers

A. Configure a CloudFront cache policy that excludes unnecessary cookies and headers from the cache key.

B. Enable Origin Shield for the distribution to reduce duplicate requests reaching the S3 origin.

C. Enable compression for text-based objects such as JavaScript and CSS.

D. Switch the origin to an Application Load Balancer so CloudFront can cache the assets more effectively.

E. Disable caching so viewers always retrieve the newest version directly from S3.

Answers: A, B, C

Correct because cookies and headers that do not affect content create unnecessary cache variants. Removing them from the cache key makes CloudFront reuse the same object more often.

Why this answer

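Option A is correct because a CloudFront cache policy that excludes unnecessary cookies and headers from the cache key prevents CloudFront from creating multiple cache entries for the same object based on varying cookie values. This increases the cache hit ratio by ensuring that identical JavaScript bundles are served from the edge cache rather than triggering separate origin fetches for each unique request, thereby reducing S3 GET request costs. Option B is correct because Origin Shield adds a centralized caching layer in front of the origin, so cache misses from many edge locations collapse into fewer requests to S3. Option C is correct because compressing text assets such as JavaScript and CSS (CloudFront supports Gzip and Brotli) reduces the bytes transferred to viewers.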

Exam trap

The trap here is that candidates may think an ALB origin improves caching for static assets (Option D) or that disabling caching solves freshness issues (Option E), when in fact both actions worsen performance and cost for a static website served through CloudFront.
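The effect of cookies in the cache key can be shown with a toy simulation. This is a deliberately simplified model (a single cache, no TTLs or eviction), and the function name and request data are hypothetical; it only illustrates why per-user cookies fragment the cache for an object that is identical for everyone.

```python
def origin_fetches(requests, include_cookies):
    """Count origin fetches for a request stream under a given cache key definition."""
    cache = set()
    fetches = 0
    for path, cookies in requests:
        if include_cookies:
            key = (path, tuple(sorted(cookies.items())))  # one variant per cookie value
        else:
            key = (path,)  # one variant per path
        if key not in cache:
            cache.add(key)
            fetches += 1  # cache miss: go to the origin
    return fetches


# 100 viewers request the same bundle, each with a different session cookie
sample_requests = [("/static/app.js", {"session": f"user-{n}"}) for n in range(100)]
with_cookies = origin_fetches(sample_requests, include_cookies=True)
without_cookies = origin_fetches(sample_requests, include_cookies=False)
```

With cookies in the key every request is a miss, so `with_cookies` is 100; with cookies excluded only the first request reaches the origin, so `without_cookies` is 1.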

834

MCQ | hard

A healthcare document service uses Amazon RDS for PostgreSQL. Application credentials must not be stored on the EC2 instances, and authentication should use short-lived credentials. What should the architect recommend?

A. Embed the database password in the AMI

B. Store the database password in user data

C. Use a security group rule that allows only application instances

D. IAM database authentication for RDS with an EC2 instance role

Answer: D

IAM database authentication allows the application to use temporary AWS credentials instead of stored database passwords.

Why this answer

IAM database authentication for RDS with an EC2 instance role is correct because it allows the EC2 instance to assume an IAM role and obtain a short-lived authentication token (valid for 15 minutes) to connect to the PostgreSQL database, eliminating the need to store any credentials on the instance. This approach meets both requirements: no credentials stored on EC2 and the use of short-lived credentials. The token is generated using the AWS CLI or SDK with the IAM role's temporary security credentials, and the RDS instance must be configured to accept IAM authentication.

Exam trap

The trap here is that candidates may think security groups alone solve credential management, but they only control network access, not authentication or credential lifecycle, so they fail to address the short-lived credential requirement.

How to eliminate wrong answers

Option A is wrong because embedding the database password in the AMI stores credentials persistently on the EC2 instance, violating the requirement that credentials must not be stored on the instance, and the password is long-lived, not short-lived. Option B is wrong because storing the database password in user data also places credentials on the EC2 instance (user data is readable through the instance metadata service), again violating the no-storage requirement and providing a long-lived credential. Option C is wrong because a security group rule only controls network access; it does not address authentication or credential management, and it does not provide short-lived credentials.
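The token flow described above can be sketched as follows. `generate_db_auth_token` is the boto3 RDS client call that produces the token; the two function names, the `token_provider` parameter, and the idea of passing the result to a PostgreSQL driver such as `psycopg2.connect` are assumptions made for illustration. The sketch also assumes the database user has been granted the `rds_iam` role and that IAM authentication is enabled on the DB instance.

```python
def rds_iam_token(host, port, user, region):
    """Request a short-lived auth token using the instance role's credentials."""
    import boto3  # imported lazily so the rest of the sketch runs without boto3 installed

    client = boto3.client("rds", region_name=region)
    return client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )


def connection_kwargs(host, port, user, dbname, region, token_provider=rds_iam_token):
    """Build keyword arguments for a PostgreSQL driver; nothing is read from disk."""
    return {
        "host": host,
        "port": port,
        "user": user,
        "dbname": dbname,
        "password": token_provider(host, port, user, region),  # token stands in for the password
        "sslmode": "require",  # IAM database authentication requires an SSL/TLS connection
    }
```

Because the token is generated at connect time from the role's temporary credentials, there is no password in the AMI, user data, or application configuration. A fresh token should be requested for each new connection, since tokens expire after 15 minutes.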

835

MCQ | medium

Based on the exhibit, which AWS service should the security team enable to continuously discover sensitive data stored inside Amazon S3 objects?

A. AWS CloudTrail

B. Amazon Macie

C. AWS Config

D. Amazon GuardDuty

Answer: B

Macie is the AWS service designed to discover and classify sensitive data in S3. It can continuously analyze buckets for personal data patterns and produce findings when sensitive information is detected. That matches the requirement for ongoing classification of object contents rather than audit logs or configuration checks.

Why this answer

Amazon Macie is a fully managed data security and data privacy service that uses machine learning and pattern matching to automatically discover, classify, and protect sensitive data such as personally identifiable information (PII) or financial data stored in Amazon S3. It provides continuous visibility into data security risks by generating findings when sensitive data is detected, making it the correct choice for this use case.

Exam trap

The trap here is that candidates often confuse Amazon Macie with Amazon GuardDuty, mistakenly thinking GuardDuty's threat detection includes scanning for sensitive data, when in fact GuardDuty focuses on security threats and anomalies, not data classification or content inspection.

How to eliminate wrong answers

Option A is wrong because AWS CloudTrail records API activity and governance events for auditing, but it does not inspect the content of S3 objects to discover sensitive data. Option C is wrong because AWS Config evaluates resource configurations against compliance rules and tracks configuration changes, but it cannot scan or analyze the data within S3 objects for sensitive content. Option D is wrong because Amazon GuardDuty is a threat detection service that monitors for malicious activity and unauthorized behavior using VPC Flow Logs, DNS logs, and CloudTrail events, but it does not perform content inspection of S3 objects to find sensitive data.

836

Multi-Select | medium

A multi-tenant event system writes and reads data in DynamoDB. One tenant generates most of the traffic, causing throttling on a single partition key value, and the dashboards repeatedly read the most recent items for that tenant. Which two changes should the team make to improve performance? Select two.

Select 2 answers

A. Shard the hot tenant’s writes across multiple partition key values so traffic is spread across partitions.

B. Use Amazon DAX to cache repetitive read requests for the same items with sub-millisecond latency.

C. Switch the table to on-demand capacity mode so DynamoDB automatically removes partition limits.

D. Use a larger sort key attribute to increase the maximum write throughput for the tenant.

E. Move the table to a single larger provisioned throughput setting and keep the same key design.

Answers: A, B

Write sharding reduces pressure on a single partition and is the standard fix for a hot key caused by one tenant.

Why this answer

Option A is correct because sharding the hot tenant's writes across multiple partition key values (e.g., by appending a random or calculated suffix) distributes the write traffic across multiple physical partitions, avoiding throttling on a single partition. This is a common DynamoDB design pattern for mitigating hot keys, as each partition has its own throughput limits (up to 3,000 RCU and 1,000 WCU per partition). By spreading writes, the system can achieve higher aggregate throughput without hitting per-partition limits. Option B is correct because DAX is an in-memory cache for DynamoDB that serves the dashboards' repeated reads of the same items from memory, which takes that read load off the hot partition.

Exam trap

The trap here is that candidates often assume on-demand mode (Option C) or higher provisioned throughput (Option E) will solve hot key throttling, but they fail to recognize that DynamoDB's per-partition throughput limits are independent of the table's capacity mode or total provisioned capacity.
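Write sharding with a calculated suffix can be sketched as below. The shard count, the `tenant#shard` key format, and the function names are all hypothetical choices for illustration; the technique itself is the suffix pattern described in the explanation.

```python
import hashlib

SHARD_COUNT = 8  # hypothetical; sized to the hot tenant's write rate


def sharded_partition_key(tenant_id, event_id, shard_count=SHARD_COUNT):
    """Derive a stable shard suffix from the event ID so writes spread across keys."""
    digest = hashlib.sha256(event_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shard_count
    return f"{tenant_id}#{shard}"


def all_partition_keys(tenant_id, shard_count=SHARD_COUNT):
    """Every key a reader must query to see all of the tenant's items."""
    return [f"{tenant_id}#{n}" for n in range(shard_count)]
```

The trade-off is on the read side: a query for "the tenant's most recent items" must now fan out across every key returned by `all_partition_keys` and merge the results, which is one reason a cache such as DAX pairs well with this design for repeated dashboard reads.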

837

MCQ | easy

A startup expects steady compute usage around the clock for the next year. They want to reduce costs compared to On-Demand pricing, without tightly planning specific instance types. Which option best matches their goal?

A. Purchase a Compute Savings Plan to receive discounted rates for a usage amount over a 1-year term.

B. Purchase a Reserved Instance that must be tied to exactly one specific instance size (no flexibility to switch instance families).

C. Only use Spot Instances and set the workload to stop immediately if capacity is interrupted.

D. Rely on On-Demand pricing and add more alarms to detect when costs spike.

Answer: A

Compute Savings Plans provide discounted EC2 usage (and related compute usage) versus On-Demand for a committed amount per hour. They are not limited to a single instance type, so the team can change instance families while staying within the committed usage.

Why this answer

A Compute Savings Plan offers discounted prices on compute usage (EC2, Fargate, and Lambda) in exchange for a commitment to a consistent amount of compute (measured in $/hour) over a 1-year or 3-year term. This matches the startup's steady, predictable usage and provides up to 66% savings over On-Demand, while allowing flexibility to change instance families, sizes, Regions, or even move to containers without changing the plan.

Exam trap

The trap here is that candidates often confuse Compute Savings Plans with Reserved Instances, assuming both lock you to a specific instance type, but Compute Savings Plans provide full flexibility across instance families, sizes, and even compute services.

How to eliminate wrong answers

Option B is wrong because the Reserved Instance it describes is locked to one instance size with no ability to switch instance families, which contradicts the goal of not tightly planning instance types. (Standard Reserved Instances are tied to an instance family; Convertible Reserved Instances can change family only through an explicit exchange.) Option C is wrong because Spot Instances can be interrupted with only a 2-minute warning, making them unsuitable for steady, around-the-clock compute workloads that cannot tolerate interruptions. Option D is wrong because relying solely on On-Demand pricing with alarms does not reduce costs; alarms only notify of cost spikes but do not provide any discount mechanism.
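The way an hourly commitment interacts with usage can be worked through with a small calculation. This is a simplified model that assumes one uniform discount rate across all usage, which real Savings Plans do not have (rates vary by instance family, Region, and term); the function name and the numbers in the usage note are hypothetical.

```python
def hourly_cost_with_savings_plan(on_demand_usage, commitment, discount):
    """Hourly bill under a Savings Plan.

    on_demand_usage: usage for the hour, valued at On-Demand rates ($/hour)
    commitment: committed spend at Savings Plans rates ($/hour), always billed
    discount: fractional discount versus On-Demand (assumed uniform)
    """
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    # The commitment covers this much usage when valued at On-Demand rates
    covered = min(on_demand_usage, commitment / (1 - discount))
    overflow = on_demand_usage - covered  # billed at On-Demand rates
    return commitment + overflow
```

With a hypothetical 25% discount and a $6/hour commitment, an hour with $10 of On-Demand-valued usage costs $8, while an idle hour with only $4 of usage still costs the full $6 commitment. That second case is why Savings Plans suit the steady, around-the-clock usage described in the question.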

838

MCQ | medium

Based on the exhibit, the application should continue serving requests if one Availability Zone fails. Which change best improves resilience with the least operational complexity?

A. Increase the desired capacity in AZ-a so more instances can absorb the failure of that same Availability Zone.

B. Add at least one subnet from a second Availability Zone to both the ALB and the Auto Scaling group.

C. Disable health checks so the ALB stops removing targets during brief infrastructure issues.

D. Move the application to a single larger instance type so the fleet has fewer moving parts.

Answer: B

A resilient design needs the load balancer and the Auto Scaling group to span multiple Availability Zones. If one AZ fails, the ALB can still route to healthy targets in the remaining AZs and the Auto Scaling group can replenish capacity there. This is the simplest and most common way to achieve AZ-level fault tolerance.

Why this answer

Option B is correct because adding subnets from a second Availability Zone to both the ALB and the Auto Scaling group distributes the application across multiple AZs. This ensures that if one AZ fails, the ALB can route traffic to healthy targets in the remaining AZ, and the Auto Scaling group can maintain capacity by launching instances in the surviving AZ. This approach directly addresses the requirement to continue serving requests during an AZ failure with minimal operational complexity.

Exam trap

The trap here is that candidates often think increasing capacity in a single AZ (Option A) provides resilience, but it actually concentrates risk in that AZ, while the correct answer requires distributing resources across multiple AZs to achieve true fault tolerance.

How to eliminate wrong answers

Option A is wrong because increasing the desired capacity in a single AZ does not provide resilience against the failure of that same AZ; all instances would be lost if the AZ fails. Option C is wrong because disabling health checks would prevent the ALB from detecting and removing unhealthy targets, causing traffic to be routed to failed instances and degrading application availability. Option D is wrong because moving to a single larger instance type creates a single point of failure; if that instance fails, the entire application becomes unavailable, and it does not address AZ-level failures.

839

MCQ | easy

A company hosts static images, CSS, and JavaScript files in an Amazon S3 bucket. Users around the world report slow page loads, and the origin receives many repeated requests for the same files. What should the team use to improve performance?

A. Amazon CloudFront

B. AWS Direct Connect

C. Amazon Route 53 health checks

D. Amazon EFS

Answer: A

CloudFront caches content at edge locations and reduces latency and origin traffic for global users.

Why this answer

Amazon CloudFront is a content delivery network (CDN) that caches static content (images, CSS, JavaScript) at edge locations worldwide. By serving cached copies from the edge closest to each user, CloudFront reduces latency, offloads repeated requests from the origin S3 bucket, and improves page load times for a global audience.

Exam trap

The trap here is that candidates may confuse a CDN (CloudFront) with a private network connection (Direct Connect) or a DNS routing service (Route 53), failing to recognize that caching at edge locations is the key to reducing latency and origin load for static content served globally.

How to eliminate wrong answers

Option B is wrong because AWS Direct Connect establishes a dedicated private network connection from on-premises to AWS, which does not cache content or reduce latency for global users accessing public S3 objects. Option C is wrong because Amazon Route 53 health checks monitor endpoint availability and route traffic away from unhealthy endpoints, but they do not cache content or reduce repeated requests to the origin. Option D is wrong because Amazon EFS is a scalable file system for EC2 instances, not a caching or content delivery service, and it would not improve latency for users accessing static files via the internet.

840

MCQ | medium

A batch analytics job has unpredictable DynamoDB traffic with long idle periods and occasional spikes. Which capacity mode should minimize operational overhead and avoid paying for idle provisioned capacity? The architecture review board prefers a managed AWS-native control.

A. DynamoDB on-demand capacity mode

B. Reserved capacity for maximum daily traffic

C. Provisioned capacity set for peak traffic

D. Global tables in every Region

Answer: A

On-demand capacity is suitable for unpredictable workloads and charges per request without capacity planning.

Why this answer

DynamoDB on-demand capacity mode automatically scales to handle unpredictable traffic spikes and idle periods, charging only for the reads/writes you perform. This eliminates the need to provision capacity, reducing operational overhead and avoiding costs for idle provisioned capacity, aligning with the architecture review board's preference for a managed AWS-native control.

Exam trap

The trap here is that candidates may confuse 'reserved capacity' or 'provisioned capacity' as cost-effective for spikes, but they fail to recognize that on-demand is the only mode that eliminates idle cost and operational overhead for unpredictable workloads.

How to eliminate wrong answers

Option B is wrong because reserved capacity requires upfront commitment to a specific traffic level, which doesn't suit unpredictable spikes and idle periods, and would still incur costs for unused capacity. Option C is wrong because provisioned capacity set for peak traffic would over-provision during idle periods, leading to paying for unused capacity and increased operational overhead to manage scaling. Option D is wrong because global tables are a replication feature for multi-Region data access, not a capacity mode, and they do not address cost optimization for unpredictable traffic or idle periods.

841

MCQeasy

A team runs an Amazon NLB in a VPC with targets registered in multiple Availability Zones (AZs). Their bill shows high inter-AZ data transfer charges. They want to reduce unnecessary cross-AZ traffic costs while still maintaining healthy targets per AZ. What change is most likely to reduce inter-AZ charges?

A.Disable cross-zone load balancing on the NLB so each client is routed to targets in the same AZ when possible.

B.Enable cross-zone load balancing so all targets receive traffic from every AZ.

C.Move the NLB to a different Region so traffic is always kept local.

D.Replace the NLB with a NAT gateway to reduce data charges between AZs.

AnswerA

Disabling cross-zone load balancing helps keep traffic within the same AZ, reducing inter-AZ data transfer charges.

Why this answer

Option A is correct because, with cross-zone load balancing disabled, each NLB node routes traffic only to healthy targets in its own Availability Zone. This removes the node-to-target hop across AZ boundaries, which is the source of the inter-AZ data transfer charges described in the question. The NLB still health-checks targets in every AZ and distributes traffic only among the healthy targets within each zone.

Exam trap

The trap here is that candidates often assume enabling cross-zone load balancing always reduces costs or improves performance, but for NLB it actually increases inter-AZ data transfer charges, and the question specifically asks for cost reduction, not high availability.

How to eliminate wrong answers

Option B is wrong because enabling cross-zone load balancing would cause traffic to be distributed across all AZs, increasing inter-AZ data transfer charges, not reducing them. Option C is wrong because moving the NLB to a different Region does not address inter-AZ traffic within the current VPC; it would introduce cross-Region data transfer costs and latency. Option D is wrong because a NAT gateway is used for outbound internet traffic from private subnets, not for load balancing inbound traffic, and it would not reduce inter-AZ data charges; in fact, NAT gateway charges include per-GB data processing fees and inter-AZ traffic costs if deployed across AZs.
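
To make the cost effect concrete, here is a small back-of-the-envelope model of how much node-to-target traffic crosses an AZ boundary. It is a simplification under stated assumptions, not AWS's routing algorithm: every NLB node is assumed to receive an equal share of client traffic, every AZ is assumed to have at least one healthy target, and a cross-zone node is assumed to spread traffic evenly over all targets. The function name is hypothetical.

```python
from fractions import Fraction


def cross_az_fraction(targets_per_az: dict, cross_zone: bool) -> Fraction:
    """Estimate the share of node-to-target traffic that crosses AZs."""
    if not cross_zone:
        # Each node only uses targets in its own AZ.
        return Fraction(0)
    total = sum(targets_per_az.values())
    zones = len(targets_per_az)
    # A node in a zone with n targets sends (total - n) / total of its
    # traffic to targets in other zones.
    crossing = sum(Fraction(total - n, total) for n in targets_per_az.values())
    return crossing / zones


targets = {"az-a": 2, "az-b": 2, "az-c": 2}
print(cross_az_fraction(targets, cross_zone=True))   # 2/3
print(cross_az_fraction(targets, cross_zone=False))  # 0
```

Under these assumptions, three evenly populated AZs send about two thirds of the traffic across zones when cross-zone load balancing is enabled, and none when it is disabled.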

842

MCQeasy

A web application behind an Application Load Balancer (ALB) currently allows client connections over HTTP (port 80). The security policy requires all client traffic to use HTTPS. What is the best ALB change to enforce this requirement?

A.Add an HTTP listener on port 80 with a redirect action to HTTPS on port 443, and configure an HTTPS listener using an ACM certificate

B.Enable TLS only on the target group so that traffic between the ALB and targets is encrypted, even if clients connect via HTTP

C.Turn on S3 server-side encryption to ensure data is encrypted in transit from clients to the ALB

D.Remove port 80 access by removing the port 80 listener and leave only a default target group

AnswerA

Redirecting all HTTP requests to HTTPS forces clients to use TLS when they access the application. Configuring an HTTPS listener with an ACM certificate ensures the ALB terminates TLS on port 443 using a valid certificate, directly enforcing encryption in transit for client-to-ALB traffic.

Why this answer

Option A is correct because it uses an ALB HTTP-to-HTTPS redirect action, which is the most efficient and AWS-native way to enforce HTTPS-only traffic. The HTTP listener on port 80 automatically redirects all client requests to the HTTPS listener on port 443, which terminates TLS using an ACM certificate. This approach requires no changes to client applications and ensures compliance with the security policy at the load balancer level.

Exam trap

The trap here is that candidates may think removing the HTTP listener or enabling TLS on the target group is sufficient, but the correct approach is to use a redirect action on the HTTP listener to enforce HTTPS without breaking client connectivity.

How to eliminate wrong answers

Option B is wrong because enabling TLS only on the target group does not enforce HTTPS for client-to-ALB traffic; clients can still connect over HTTP, and the ALB will forward requests to targets over TLS, but the initial connection remains unencrypted. Option C is wrong because S3 server-side encryption is for data at rest in Amazon S3, not for encrypting data in transit between clients and the ALB. Option D is wrong because simply removing the port 80 listener and leaving only a default target group would drop all HTTP traffic, but it does not provide a redirect to HTTPS, which is a better user experience and the recommended approach; clients would receive a connection error instead of being seamlessly redirected.
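
The redirect described in Option A can be sketched as the arguments for an ELBv2 CreateListener request. The load balancer ARN below is a made-up placeholder and the helper function is an assumption for this example; the HTTPS listener with its ACM certificate would be created separately.

```python
def https_redirect_listener(load_balancer_arn: str) -> dict:
    """Build CreateListener arguments for a port 80 listener that redirects to HTTPS."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": "HTTP",
        "Port": 80,
        "DefaultActions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",             # the API expects the port as a string
                    "StatusCode": "HTTP_301",  # permanent redirect
                },
            }
        ],
    }


# Placeholder ARN, for illustration only.
listener = https_redirect_listener(
    "arn:aws:elasticloadbalancing:us-east-1:111122223333:"
    "loadbalancer/app/example-alb/0123456789abcdef"
)

# With boto3 and credentials configured:
#   boto3.client("elbv2").create_listener(**listener)
```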

843

MCQmedium

A partner company needs read-only access to reports in an S3 bucket for an image sharing application. The partner has its own AWS account. What is the most secure scalable access pattern?

A.Make the objects public and rely on difficult-to-guess object names

B.Create an IAM user in the company account and share the access keys

C.Copy the objects to a public website bucket

D.Create a bucket policy that grants the partner role least-privilege access to the required prefix

AnswerD

A resource policy can grant cross-account access to a specific external role and prefix.

Why this answer

Option D is correct because it uses a resource-based bucket policy that grants the partner's AWS account (via its IAM role) least-privilege read-only access to a specific prefix. This avoids sharing long-term credentials, leverages AWS's cross-account trust mechanism, and ensures the partner's access is controlled through their own IAM roles, which is the most secure and scalable pattern for cross-account S3 access.

Exam trap

The trap here is that candidates often choose sharing IAM access keys (Option B) because it seems simpler, but the SAA-C03 exam emphasizes using IAM roles and resource-based policies for cross-account access to avoid long-term credential management and improve security.

How to eliminate wrong answers

Option A is wrong because making objects public with 'security through obscurity' (difficult-to-guess names) is not secure; anyone who discovers the URL can access the objects, and S3 does not enforce access control based on object name guessability. Option B is wrong because creating an IAM user in the company account and sharing access keys introduces a long-term credential that must be rotated, managed, and could be leaked; it violates the principle of least privilege and does not scale across multiple partner accounts. Option C is wrong because copying objects to a public website bucket (e.g., S3 static website hosting) makes them publicly accessible over HTTP/HTTPS with no authentication, which is insecure and does not provide read-only access control for a specific partner.
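
A minimal sketch of the bucket policy from Option D follows. The bucket name, prefix, account ID, and role name are all hypothetical, and the helper function is an assumption for this example.

```python
import json


def partner_read_policy(bucket: str, prefix: str, partner_role_arn: str) -> str:
    """Return a bucket policy granting one external role read access to one prefix."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PartnerReadReports",
                "Effect": "Allow",
                # Only the named role in the partner account is trusted.
                "Principal": {"AWS": partner_role_arn},
                # Read-only: no PutObject, DeleteObject, or bucket-level actions.
                "Action": "s3:GetObject",
                # Limited to objects under the given prefix.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


# Hypothetical values, for illustration only.
policy_json = partner_read_policy(
    bucket="example-reports-bucket",
    prefix="partner-reports",
    partner_role_arn="arn:aws:iam::444455556666:role/ExampleReportReader",
)
print(policy_json)
```

For cross-account access, the partner also needs an identity-based policy on its own role that allows s3:GetObject on the same resource; the bucket policy alone covers only the resource side.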

844

MCQmedium

A test environment stores logs in S3. Logs are queried for 30 days, rarely accessed for one year, and then retained for compliance. What should reduce storage cost? The design must avoid adding custom operational scripts.

A.Keep all logs in S3 Standard indefinitely

B.Move all logs immediately to S3 Glacier Deep Archive

C.S3 lifecycle policy that transitions objects to lower-cost storage classes over time

D.Use EBS snapshots for the logs

AnswerC

Lifecycle rules automate transitions based on age, matching storage cost to access patterns.

Why this answer

Option C is correct because S3 lifecycle policies automate the transition of objects between storage classes based on age, allowing logs to move from S3 Standard (for frequent querying) to S3 Standard-IA or S3 One Zone-IA (for rare access), and eventually to S3 Glacier Deep Archive (for long-term compliance retention). This reduces storage cost without custom scripts, aligning with the requirement to avoid operational overhead.

Exam trap

The trap here is that candidates may choose Option B (immediate move to Glacier Deep Archive) thinking it minimizes cost, but they overlook the 30-day query requirement, which makes S3 Standard necessary for fast retrieval, and fail to recognize that lifecycle policies provide a graduated, automated approach.

How to eliminate wrong answers

Option A is wrong because keeping all logs in S3 Standard indefinitely incurs the highest storage cost, ignoring the cost savings from transitioning to lower-cost classes for rarely accessed and compliance-retained data. Option B is wrong because moving all logs immediately to S3 Glacier Deep Archive conflicts with the 30-day query requirement: objects must be restored before they can be read, restores take hours, and retrieval costs are high for frequent access. Option D is wrong because EBS snapshots are block-level backups of EBS volumes, not designed for log storage in S3, and would introduce unnecessary complexity and cost without addressing the tiered access pattern.
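
The lifecycle approach in Option C might look like the configuration sketched below. The day thresholds (30 days, then 30 + 365 days), the prefix, and the rule ID are assumptions chosen to mirror the scenario, not values given by the question; no expiration is set because the compliance retention length is not stated.

```python
def log_lifecycle_configuration(prefix: str, ia_after_days: int,
                                archive_after_days: int) -> dict:
    """Build a lifecycle configuration that steps objects down by age."""
    return {
        "Rules": [
            {
                "ID": "tier-logs-by-age",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    # Rarely accessed after the active query window.
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    # Compliance retention at the lowest storage cost.
                    {"Days": archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


# Assumed thresholds: 30 days of querying, then one year of rare access.
lifecycle = log_lifecycle_configuration("logs/", 30, 30 + 365)

# With boto3 and credentials configured (bucket name is hypothetical):
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-log-bucket", LifecycleConfiguration=lifecycle)
```

Once the rule is in place, S3 applies the transitions on its own, which is what satisfies the "no custom operational scripts" constraint.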

845

Multi-Selectmedium

A company is designing a multi-tier web application on AWS. The application consists of an Application Load Balancer (ALB), an Amazon EC2 Auto Scaling group for the web tier, and an Amazon RDS for MySQL database. The security team requires that the web tier instances have no public IP addresses and that all outbound traffic to the internet is blocked, except for specific software updates from a trusted vendor. Which three steps should be taken to meet these requirements? (Choose three.)

Select 3 answers

A.Place the web tier instances in a private subnet and use a NAT gateway in a public subnet for outbound traffic to the trusted vendor.

B.Configure a VPC endpoint for the trusted vendor's software update service.

C.Use a security group on the web tier instances that denies all outbound traffic except to the trusted vendor's IP range.

D.Deploy the web tier instances in a public subnet and use a network ACL to block all inbound traffic from the internet.

E.Configure the ALB to be internet-facing and place the web tier instances in a public subnet.

F.Use a network ACL on the private subnet to block all outbound traffic except to the trusted vendor's IP range and to the NAT gateway.

Why this answer

Placing the web tier instances in a private subnet ensures they have no public IP addresses, and a NAT gateway in a public subnet gives them a controlled path for outbound traffic to the vendor. Restricting the outbound rules of the web tier security group to the trusted vendor's IP range provides a stateful, instance-level control; security groups contain only allow rules, so all other outbound internet traffic is implicitly denied. Additionally, a network ACL on the private subnet must allow outbound traffic to the trusted vendor's IP range and to the NAT gateway, and because the network ACL is a stateless subnet-level filter, it must also explicitly permit the return traffic.

Exam trap

The trap here is that candidates often confuse security groups (stateful) with network ACLs (stateless) and forget that a NAT gateway is required for private instances to reach the internet, but the NACL must explicitly allow traffic to the NAT gateway's IP range (the public subnet's CIDR) and the vendor's IP, while the security group handles the instance-level outbound restriction.

846

Multi-Selecthard

A log archive has old unattached EBS volumes and many stale snapshots. Which two actions reduce storage cost without affecting running instances? The design must avoid adding custom operational scripts.

Select 2 answers

A.Stop all EC2 instances in the account

B.Disable CloudTrail logging

C.Delete unattached EBS volumes after verifying they are no longer needed

D.Apply snapshot lifecycle policies to expire obsolete snapshots

AnswersC, D

Unattached volumes continue to incur charges until deleted.

Why this answer

Option C is correct because deleting unattached EBS volumes eliminates storage costs for volumes that are not in use, and since they are not attached to any running instance, this action does not affect running instances. Option D is correct because applying snapshot lifecycle policies automates the deletion of obsolete snapshots, reducing storage costs without requiring custom scripts or impacting running instances.

Exam trap

The trap here is that candidates may confuse stopping instances (which does not delete volumes or snapshots) with deleting resources, or think disabling CloudTrail reduces storage costs, when in fact CloudTrail logs are stored in S3, not EBS, and have separate cost implications.

847

MCQmedium

A read-heavy media archive repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load?

A.DynamoDB Accelerator (DAX)

B.Amazon Kinesis Data Firehose

C.AWS Glue Data Catalog

D.S3 Transfer Acceleration

AnswerA

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for DynamoDB that delivers up to 10x read performance improvement by reducing response times from single-digit milliseconds to microseconds. For a read-heavy workload repeatedly querying the same product catalogue data, DAX caches the hot items, offloading read requests from the DynamoDB table and significantly reducing table read capacity consumption.

Exam trap

The trap here is that candidates may confuse DAX with other caching services like ElastiCache or think that S3 Transfer Acceleration can speed up DynamoDB reads, but DAX is the only AWS service purpose-built for in-memory caching of DynamoDB queries with automatic cache invalidation and write-through semantics.

How to eliminate wrong answers

Option B (Amazon Kinesis Data Firehose) is wrong because it is a real-time streaming data ingestion service for loading data into data lakes and analytics tools, not a caching or read acceleration service for DynamoDB queries. Option C (AWS Glue Data Catalog) is wrong because it is a metadata repository for ETL jobs and data discovery, not designed to cache or accelerate DynamoDB read operations. Option D (S3 Transfer Acceleration) is wrong because it speeds up uploads to S3 over long distances using AWS edge locations, but it does not cache DynamoDB query results or reduce read latency for DynamoDB operations.
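
The effect of a read-through cache on table load can be shown with a toy model. This is not the DAX API or its implementation: it is an in-memory stand-in with no TTL or eviction, and the class and item names are hypothetical. It only illustrates why repeated reads of the same hot items stop reaching the table.

```python
class ReadThroughCache:
    """Toy read-through cache in front of a dict that stands in for a table."""

    def __init__(self, table: dict):
        self.table = table
        self.cache = {}
        self.table_reads = 0  # how many reads actually reached the table

    def get(self, key):
        if key in self.cache:
            return self.cache[key]  # cache hit: the table is not touched
        self.table_reads += 1       # cache miss: read from the table once
        value = self.table[key]
        self.cache[key] = value
        return value


catalogue = {"sku-1": {"title": "Example item"}}
cached = ReadThroughCache(catalogue)

for _ in range(1000):
    cached.get("sku-1")

print(cached.table_reads)  # 1
```

In this model a thousand requests for the same item cost one table read; the remaining requests are served from memory.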

848

MCQmedium

A mobile banking backend stores audit logs in S3. The compliance team requires that logs cannot be overwritten or deleted for seven years. What should be configured?

A.S3 server access logging

B.S3 lifecycle expiration after seven years

C.S3 versioning only

D.S3 Object Lock in compliance mode with an appropriate retention period

AnswerD

Object Lock compliance mode enforces write-once-read-many retention that even privileged users cannot bypass during the retention period.

Why this answer

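S3 Object Lock in compliance mode prevents any user, including the root user, from overwriting or deleting a protected object version until its retention period expires. This meets the compliance requirement of immutable audit logs for seven years, because in compliance mode the retention period cannot be shortened and the retention mode cannot be changed by any user. (A legal hold is a separate Object Lock feature with no expiry date and is not what enforces the seven-year period here.)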

Exam trap

The trap here is that candidates often confuse versioning with immutability, thinking versioning alone prevents data loss, but it does not block overwrites or deletes of the current version, which is required for compliance-grade write-once protection.

How to eliminate wrong answers

Option A is wrong because S3 server access logging only records requests made to the bucket; it does not prevent deletion or overwriting of existing objects. Option B is wrong because S3 lifecycle expiration after seven years would automatically delete objects after that period, but it does not prevent premature deletion or overwriting before the seven-year mark. Option C is wrong because S3 versioning alone preserves previous versions when an object is overwritten or deleted, but it does not block those writes or deletes, and a user with sufficient permissions can still permanently delete individual versions.
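
A bucket-level default retention rule for Option D could be sketched as follows. The helper name and the bucket name in the comment are hypothetical; Object Lock also requires versioning to be enabled on the bucket.

```python
def compliance_lock_configuration(years: int) -> dict:
    """Build an Object Lock configuration with a default compliance-mode retention."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {
            "DefaultRetention": {
                # COMPLIANCE: retention cannot be shortened or removed early.
                "Mode": "COMPLIANCE",
                "Years": years,
            }
        },
    }


lock_config = compliance_lock_configuration(7)

# With boto3 and credentials configured (bucket name is hypothetical):
#   boto3.client("s3").put_object_lock_configuration(
#       Bucket="example-audit-logs", ObjectLockConfiguration=lock_config)
```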

849

MCQmedium

A web API runs on an Auto Scaling group (ASG) behind an Application Load Balancer (ALB). During traffic spikes, users experience request timeouts even though CPU stays below 40%. After investigation, you find the ASG often has too few healthy targets to handle the current request rate. Which change will best improve responsiveness during spikes?

A.Keep the ASG scaling policy based on CPU utilization, but increase the ASG min capacity by 50%.

B.Create a target tracking scaling policy using an ALB metric such as RequestCountPerTarget or TargetResponseTime.

C.Enable EC2 detailed monitoring for one-minute granularity and keep CPU scaling.

D.Switch to scaling based on the ASG network out bytes metric only, ignoring ALB response metrics.

AnswerB

Target tracking with an ALB performance metric scales based on the same layer where the problem is observed (requests/latency through the ALB). As traffic spikes, RequestCountPerTarget and/or TargetResponseTime increase; the scaling policy then increases the ASG desired capacity so the ALB has more healthy targets to distribute requests to. That reduces queuing/latency and helps prevent timeouts without waiting for CPU to rise.

Why this answer

Option B is correct because the issue is that the ASG has too few healthy targets to handle the request rate, even though CPU is low. A target tracking scaling policy based on RequestCountPerTarget or TargetResponseTime directly aligns scaling with the ALB's view of demand, ensuring the ASG adds instances when request rates spike, regardless of CPU utilization. This addresses the root cause—insufficient capacity to serve incoming requests—rather than relying on a metric (CPU) that does not reflect the bottleneck.

Exam trap

The trap here is that candidates assume CPU utilization is always the best scaling metric, but AWS explicitly tests that ALB-level metrics (RequestCountPerTarget, TargetResponseTime) are more appropriate when the bottleneck is request throughput rather than compute load.

How to eliminate wrong answers

Option A is wrong because increasing the ASG min capacity only raises the baseline number of instances, but does not make the scaling policy responsive to traffic spikes; the ASG will still scale based on CPU, which remains low, so it will not add instances during spikes. Option C is wrong because enabling detailed monitoring (1-minute granularity) improves the frequency of metric data but does not change the fact that CPU utilization is not the correct metric to trigger scaling for this request-rate bottleneck. Option D is wrong because switching to scaling based solely on ASG network out bytes ignores the ALB's request-level metrics, which are more directly correlated with the user-observed timeouts and healthy-target deficit.
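
As a sketch of Option B, the arguments below describe a target tracking policy driven by the predefined ALBRequestCountPerTarget metric. The group name, policy name, resource label, and target value are hypothetical placeholders; an appropriate target value would come from load testing, not from this example.

```python
def request_count_policy(asg_name: str, resource_label: str,
                         requests_per_target: float) -> dict:
    """Build PutScalingPolicy arguments that track ALB requests per target."""
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": "track-alb-requests-per-target",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ALBRequestCountPerTarget",
                # Identifies the load balancer and target group to measure.
                "ResourceLabel": resource_label,
            },
            # The group scales out when the per-target request count
            # rises above this value, independent of CPU.
            "TargetValue": requests_per_target,
        },
    }


# Placeholder identifiers, for illustration only.
policy = request_count_policy(
    asg_name="example-web-api-asg",
    resource_label=(
        "app/example-alb/0123456789abcdef/"
        "targetgroup/example-tg/fedcba9876543210"
    ),
    requests_per_target=500.0,
)

# With boto3 and credentials configured:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```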

850

MCQeasy

A compute workload uses temporary scratch space for intermediate results (reproducible), and it can tolerate data loss if the instance is terminated. The workload benefits from very high local I/O throughput. Which storage option is the best fit for the scratch data?

A.Amazon EBS General Purpose (gp3) volumes to persist intermediate results across reboots.

B.Amazon EFS for a shared file system between multiple instances.

C.Instance store for local temporary files that can be lost when the instance stops.

D.Amazon S3 for scratch data so it is always durable and accessible from anywhere.

AnswerC

Instance store is designed for temporary high-performance local storage and is acceptable when loss is tolerable.

Why this answer

Instance store volumes provide very high local I/O throughput because they are physically attached to the host server. Since the workload can tolerate data loss and the scratch data is reproducible, the ephemeral nature of instance store is acceptable, and it offers the best performance for temporary, high-throughput scratch space.

Exam trap

AWS often tests the distinction between persistent block storage (EBS) and ephemeral instance store, where candidates mistakenly choose EBS for its persistence even when the workload explicitly tolerates data loss and requires maximum local I/O throughput.

How to eliminate wrong answers

Option A is wrong because EBS gp3 volumes, while offering good performance, have lower maximum IOPS and throughput compared to instance store, and persisting intermediate results across reboots is unnecessary since the data is reproducible and can be lost. Option B is wrong because EFS is a network file system with higher latency and lower throughput than local storage, and a shared file system is not required for scratch data used by a single instance. Option D is wrong because Amazon S3 is object storage with high latency and lower throughput for random I/O, and its durability and accessibility features are overkill for temporary, reproducible scratch data that can be lost.

851

MCQhard

A claims workflow uses Amazon SQS. Poison messages are repeatedly failing and blocking useful retries. What should the architect configure? The design must avoid adding custom operational scripts.

A.A FIFO queue without a redrive policy

B.Short polling instead of long polling

C.A dead-letter queue with an appropriate maxReceiveCount

D.A larger message retention period only

AnswerC

A DLQ isolates messages that fail repeatedly so they can be investigated without disrupting normal processing.

Why this answer

A dead-letter queue (DLQ) with an appropriate maxReceiveCount allows messages that repeatedly fail processing to be moved to a separate queue after a specified number of receive attempts. This prevents poison messages from blocking the main queue and consuming retry capacity, while avoiding custom operational scripts by using native SQS functionality.

Exam trap

The trap here is that candidates often confuse increasing message retention or changing polling behavior with solving poison message issues, but only a dead-letter queue with a maxReceiveCount directly removes repeatedly failing messages from the processing flow.

How to eliminate wrong answers

Option A is wrong because a FIFO queue without a redrive policy does not automatically handle poison messages; it still requires a DLQ configuration to move failing messages out of the main queue. Option B is wrong because the polling mode does not address poison messages; short polling samples only a subset of SQS servers and returns immediately, which can increase empty responses, but it does not prevent repeated failures. Option D is wrong because a larger message retention period only keeps messages longer in the queue; it does not stop poison messages from being repeatedly retried and blocking useful retries.
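
The redrive policy behind Option C is a small JSON document set as a queue attribute. The sketch below builds it; the DLQ ARN, the helper name, and the maxReceiveCount of 5 are assumptions for illustration, and the right count depends on how many retries a healthy message might need.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int) -> dict:
    """Build the queue attribute that sends repeatedly failing messages to a DLQ."""
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        # After this many receives without a delete, SQS moves the message.
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON string inside the attributes map.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


# Placeholder ARN, for illustration only.
attributes = redrive_attributes(
    "arn:aws:sqs:us-east-1:111122223333:example-claims-dlq", 5
)

# With boto3 and credentials configured (queue URL is hypothetical):
#   boto3.client("sqs").set_queue_attributes(
#       QueueUrl="https://sqs.us-east-1.amazonaws.com/111122223333/example-claims",
#       Attributes=attributes)
```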

852

MCQmedium

A media company uploads raw video thumbnails to an S3 bucket every hour. The application needs these thumbnails for active browsing for the first 7 days. After day 7, access becomes rare. Requirements: - Objects must remain available in S3 for at least 180 days total. - After day 7, the team can tolerate retrieval latency in the range of minutes to hours. - They want to minimize storage cost while keeping the ability to read objects (no application changes required). Which storage strategy is the most cost-optimized fit?

A.Use a bucket-level lifecycle rule to transition objects to S3 Standard-IA on day 7 and then expire them after day 180.

B.Use a lifecycle rule to transition objects to S3 Glacier Flexible Retrieval after day 7 and expire them after day 180.

C.Keep all objects in S3 Standard for 180 days, and enable S3 Intelligent-Tiering only if the bucket’s access frequency is above a threshold.

D.Use a lifecycle rule to transition objects to S3 Glacier Instant Retrieval after day 7 and expire them after day 180.

AnswerB

Glacier Flexible Retrieval is designed for rarely accessed data and supports restore times of minutes to hours, which fits the stated latency tolerance. Transitioning after day 7 reduces storage cost for the long period where access is rare, while expiring at day 180 satisfies the 180-day retention requirement. Reads still go through the standard S3 API, but an archived object must first be restored (a RestoreObject request) before GetObject succeeds, so retrieval takes longer than in the active tier.

Why this answer

Option B is correct because S3 Glacier Flexible Retrieval provides retrieval times from minutes to hours, which matches the tolerance for rare access after day 7, and offers the lowest storage cost among the options for data that is rarely accessed. A lifecycle rule transitions objects from S3 Standard (used for the first 7 days of active browsing) to Glacier Flexible Retrieval on day 7, then expires them after day 180, meeting the 180-day retention requirement without application changes.

Exam trap

The trap here is that candidates often choose S3 Glacier Instant Retrieval (Option D) because of the word 'Instant,' overlooking that the requirement explicitly tolerates minutes-to-hours latency, making the cheaper Glacier Flexible Retrieval the better cost-optimized choice.

How to eliminate wrong answers

Option A is wrong because S3 Standard-IA is designed for infrequent access but still incurs higher storage costs than Glacier Flexible Retrieval for data that is accessed rarely (minutes-to-hours latency is acceptable), and it does not provide the lowest cost for this use case. Option C is wrong because keeping all objects in S3 Standard for 180 days is significantly more expensive than transitioning to a colder storage class, and S3 Intelligent-Tiering is not cost-optimized for a predictable access pattern (active for 7 days, then rarely accessed) as it adds monitoring costs and may not move objects to the cheapest tier quickly enough. Option D is wrong because S3 Glacier Instant Retrieval is designed for millisecond retrieval, which is unnecessary and more expensive than Glacier Flexible Retrieval when minutes-to-hours latency is acceptable, thus not the most cost-optimized choice.

853

MCQmedium

Based on the exhibit, the payment worker sometimes processes the same SQS Standard message more than once after a timeout. What change best prevents duplicate charges while keeping the queue architecture?

A.Increase the SQS visibility timeout to 15 minutes and leave the worker unchanged.

B.Replace the Standard queue with a FIFO queue and rely only on message ordering.

C.Make the payment workflow idempotent by recording a unique order key before charging.

D.Add a second consumer so duplicate messages are processed faster.

AnswerC

SQS Standard queues are at-least-once delivery, so duplicate messages are always possible. The correct safeguard is idempotency: store a unique order or payment request key, check whether that key has already been processed, and only perform the charge the first time it is seen. Any later delivery is safely ignored.

Why this answer

Option C is correct because making the payment workflow idempotent ensures that even if the same SQS Standard message is processed more than once (due to a visibility timeout), the duplicate charge is prevented by checking a unique order key before processing. This is the most robust solution for handling at-least-once delivery semantics of Standard queues without changing the queue architecture.

Exam trap

The trap here is that candidates often think increasing the visibility timeout (Option A) or switching to a FIFO queue (Option B) will solve duplicate processing, but they overlook that the root cause is the worker's timeout behavior, which requires application-level idempotency to prevent duplicate charges.

How to eliminate wrong answers

Option A is wrong because increasing the visibility timeout to 15 minutes does not guarantee that the worker will finish processing within that time; if the worker still times out, the message becomes visible again and can be processed again, leading to duplicate charges. Option B is wrong because replacing the Standard queue with a FIFO queue ensures exactly-once processing but does not prevent duplicate charges if the worker itself processes the same message twice due to a timeout; FIFO queues eliminate duplicates at the queue level but not at the application level. Option D is wrong because adding a second consumer increases the likelihood of duplicate processing when messages become visible again after a timeout, as both consumers may pick up the same message, worsening the duplicate charge problem.
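
The idempotency check in Option C can be illustrated with a small in-memory model. The class, field names, and amounts are hypothetical, and the Python set stands in for a durable store; in a real system the check-and-record step would need to be atomic, for example a conditional write that fails when the key already exists.

```python
class PaymentWorker:
    """Toy worker that charges at most once per order key."""

    def __init__(self):
        self.processed_keys = set()  # stand-in for a durable, shared store
        self.charges = []            # stand-in for calls to a payment provider

    def handle(self, message: dict) -> bool:
        """Process one delivery; return True only if a charge was made."""
        key = message["order_id"]
        if key in self.processed_keys:
            return False  # duplicate delivery: safely ignored
        # Record the key before charging, as the option describes.
        self.processed_keys.add(key)
        self.charges.append((key, message["amount_cents"]))
        return True


worker = PaymentWorker()
message = {"order_id": "order-1001", "amount_cents": 2599}

first = worker.handle(message)   # first delivery
second = worker.handle(message)  # redelivery after a visibility timeout

print(first, second, len(worker.charges))  # True False 1
```

The queue may deliver the message twice, but the customer is charged once because the second delivery finds the key already recorded.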

854

Multi-Selecthard

Multiple teams share one AWS Organization. Finance wants chargeback by project, alerts before overspend, and monthly views by account without manually opening each account. Which three actions best fit? Select three.

Select 3 answers

A.Enforce cost allocation tags on resources and activate them for billing reports.

B.Use AWS Budgets to create alerts and budget actions for each project.

C.Use Cost Explorer or Cost and Usage Reports to analyze spend by account, tag, and service.

D.Put every team in a separate AWS account and ignore tagging.

E.Use CloudTrail trails to estimate spend by resource because it records API calls.

AnswersA, B, C

Correct. Cost allocation tags are the foundation for project-level chargeback. Once activated for billing, they let finance group spend by business unit, application, or environment.

Why this answer

Option A is correct because cost allocation tags, when activated for billing reports, allow you to tag resources with project-specific metadata (e.g., 'Project:Alpha'). AWS then includes these tags in the Cost and Usage Reports (CUR) and Cost Explorer, enabling Finance to filter and allocate costs by project without manual account inspection. This directly supports chargeback by project and monthly views by account and tag.

Exam trap

The trap here is that candidates may confuse CloudTrail (which records API calls) with AWS Cost Explorer or CUR (which provide actual cost data), leading them to incorrectly select option E for cost estimation.

855

Multi-Selecthard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance?

Select 2 answers

A.S3 Object Lock

B.S3 multipart upload

C.S3 Inventory

D.S3 Transfer Acceleration

AnswersB, D

Multipart upload parallelizes large object upload parts and improves reliability.

Why this answer

S3 multipart upload is correct because it allows large files to be uploaded in parallel parts, which reduces the impact of network latency and improves throughput. For a latency-sensitive mobile game backend, this feature enables faster uploads by splitting the file into smaller chunks that can be uploaded concurrently, even over unstable connections.

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration with multipart upload, or incorrectly assume that S3 Object Lock or Inventory provide performance benefits, when in fact they serve entirely different purposes related to data protection and management.
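
To show what "uploading in parts" means for multipart upload, the sketch below splits an object into byte ranges that could be uploaded concurrently. The function name and the 8 MiB default part size are assumptions for this example; the 5 MiB minimum part size (for every part except the last) and the 10,000-part maximum are S3 limits. The upload calls themselves are omitted.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int = 8 * 1024 * 1024) -> list:
    """Return (part_number, offset, length) tuples covering the whole object."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum part size")
    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < object_size:
        length = min(part_size, object_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


MIB = 1024 * 1024
parts = plan_parts(20 * MIB)
print(len(parts))  # 3
```

Each tuple is independent of the others, so parts can be sent in parallel and a failed part can be retried on its own without restarting the whole upload.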

856

MCQmedium

Developers for a customer analytics portal need temporary elevated access to production resources for troubleshooting. The security team wants approvals, expiry, and audit logging. Which approach is best?

A.Disable CloudTrail during troubleshooting

B.Attach AdministratorAccess permanently to every developer role

C.Use IAM Identity Center permission sets with time-bound access processes and CloudTrail auditing

D.Create shared administrator access keys for the team

AnswerC

Federated access with permission sets and audited temporary assignments reduces standing privilege.

Why this answer

Option C is correct because AWS IAM Identity Center (formerly AWS SSO) lets you define permission sets and assign them to developers as part of a time-bound approval process, so elevated access to production resources is granted for troubleshooting and then removed. Sessions use temporary credentials with a configurable duration, and AWS CloudTrail logs the resulting activity for audit, meeting the security team's requirements for approvals, expiry, and audit logging. This approach follows the principle of least privilege and ensures that elevated permissions are not permanent.

Exam trap

The trap here is that candidates often confuse IAM users with IAM Identity Center, or think that simply enabling CloudTrail (without a proper access control mechanism) is sufficient. The question specifically requires time-bound access and approvals, and among the options only the IAM Identity Center approach addresses them.

How to eliminate wrong answers

Option A is wrong because disabling CloudTrail during troubleshooting removes all audit logging, directly violating the security team's requirement for audit logging. Option B is wrong because permanently attaching AdministratorAccess to every developer role grants excessive, permanent privileges, violating the principle of least privilege and the requirement for temporary, time-bound access. Option D is wrong because creating shared administrator access keys eliminates individual accountability, breaks audit trails, and violates the security team's need for approvals and expiry.


857

MCQmedium

A production Amazon RDS database has automated backups enabled. At 10:00 UTC, an application deploy accidentally overwrote a subset of rows due to a faulty migration. The issue is detected at 10:45 UTC. The team confirms that the required retention window is still available. Which approach offers the most resilient and least disruptive way to recover the affected data close to the time of the event?

A.Perform a snapshot restore and attach the restored instance, then manually copy only the affected rows back into the current database.

B.Use point-in-time recovery to restore the database to a timestamp just before 10:00 UTC, then swap application connectivity to the recovered instance.

C.Rely on automated backups to roll forward automatically until the data becomes correct.

D.Disable automated backups going forward to prevent future corruption, then reindex the corrupted table.

AnswerB

Point-in-time recovery leverages automated backups to create a recovery point near the incident and supports restoring close to 10:00.

Why this answer

Option B is correct because point-in-time recovery (PITR) restores the database to a new DB instance at any second within the backup retention window, up to the latest restorable time, such as just before the faulty migration at 10:00 UTC. This produces a complete, consistent database state and avoids manual row-by-row recovery. Swapping application connectivity to the restored instance is the least disruptive approach, as it avoids complex manual data merging and reduces downtime.

Exam trap

The trap here is that candidates may choose snapshot restore (Option A) thinking it is faster or simpler, but they overlook that PITR provides a more precise, consistent recovery point without manual data extraction and reinsertion.

How to eliminate wrong answers

Option A is wrong because performing a snapshot restore and manually copying affected rows is error-prone, time-consuming, and risks data inconsistency, especially if the affected rows have dependencies. Option C is wrong because automated backups do not 'roll forward' to correct data corruption; they are used for restore operations, not automatic healing. Option D is wrong because disabling automated backups does not recover lost data and actually increases future risk; reindexing does not restore overwritten rows.
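To make the point-in-time restore concrete, the following sketch builds the request parameters for a restore just before the incident. The instance identifiers, the calendar date, and the one-minute safety margin are hypothetical values chosen for illustration; the parameter names follow the RDS `RestoreDBInstanceToPointInTime` API. The code only constructs the request and does not call AWS.

```python
from datetime import datetime, timedelta, timezone


def build_pitr_request(source_id: str, target_id: str, incident_time: datetime,
                       safety_margin: timedelta = timedelta(minutes=1)) -> dict:
    """Build parameters for a restore to shortly before the incident.

    PITR always creates a new DB instance (target_id); it never rewinds
    the source instance in place.
    """
    if incident_time.tzinfo is None:
        raise ValueError("incident_time must be timezone-aware (UTC)")
    restore_time = incident_time.astimezone(timezone.utc) - safety_margin
    return {
        "SourceDBInstanceIdentifier": source_id,
        "TargetDBInstanceIdentifier": target_id,
        "RestoreTime": restore_time,
    }


# Hypothetical date; the scenario only states 10:00 UTC.
incident = datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc)
request = build_pitr_request("orders-prod", "orders-prod-pitr", incident)
print(request["RestoreTime"].isoformat())

# With boto3 installed and credentials configured, the request could be sent as:
#   boto3.client("rds").restore_db_instance_to_point_in_time(**request)
```

Once the new instance is available, the application's connection string (or a DNS alias in front of it) is pointed at the restored instance.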


858

Matchinghard

A media platform serves global users through Amazon CloudFront and an S3 origin. Match each requirement on the left to the CloudFront configuration or behavior on the right.


Concepts

Matches

Use CloudFront Origin Access Control and allow only the distribution in the bucket policy.

Use versioned object filenames or hashed asset names with a long TTL.

Exclude the tracking query string from the cache key with a cache policy.

Use CloudFront signed URLs or signed cookies.

Why these pairings

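Origin Access Control, with a bucket policy that allows only the distribution, blocks direct access to the S3 origin; versioned or hashed asset names let objects be cached with a long TTL without serving stale content after a release; excluding a tracking query string from the cache key stops otherwise identical requests from being cached separately; and signed URLs or signed cookies restrict private content to authorized viewers.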


859

Multi-Selecthard

A media company stores raw project files in Amazon S3. Files are accessed heavily for the first 60 days, occasionally for legal review during the next six months, and must be retained for 7 years. Retrieval for the oldest files can take hours. Which three actions should the architect recommend? Select three.

Select 3 answers

A.Transition objects to S3 Standard-IA after 60 days.

B.Transition objects to S3 Glacier Deep Archive after the review period ends.

C.Expire objects after 7 years.

D.Store the same files in EBS snapshots instead of S3 to lower archive costs.

E.Replicate the bucket to another Region to reduce storage charges.

AnswersA, B, C

Standard-IA reduces storage cost for infrequently accessed objects while still keeping retrieval in minutes, which fits the post-production review period.

Why this answer

Option A is correct because S3 Standard-IA is designed for data that is accessed less frequently but still requires rapid access when needed. After 60 days of heavy access, transitioning to Standard-IA reduces storage costs while maintaining low-latency retrieval for occasional legal reviews.

Exam trap

The trap here is that candidates may confuse S3 replication (which increases costs) with a cost-saving mechanism, or incorrectly assume EBS snapshots are a cheaper alternative for long-term archival, when in fact S3 Glacier Deep Archive is the most cost-effective option for data that can tolerate hours of retrieval time.
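The three correct actions map onto a single S3 lifecycle rule. The sketch below expresses that rule in the shape accepted by the S3 `PutBucketLifecycleConfiguration` API. The rule ID is hypothetical, and two simplifications are assumptions: "six months" is taken as 180 days and "7 years" as 7 × 365 days.

```python
import json

HEAVY_ACCESS_DAYS = 60
REVIEW_DAYS = 180          # assumption: "six months" taken as 180 days
RETENTION_DAYS = 7 * 365   # assumption: leap days ignored

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "project-files-tiering",   # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},        # apply to every object in the bucket
            "Transitions": [
                # Action A: move to Standard-IA after the heavy-access period.
                {"Days": HEAVY_ACCESS_DAYS, "StorageClass": "STANDARD_IA"},
                # Action B: move to Glacier Deep Archive after the review period.
                {"Days": HEAVY_ACCESS_DAYS + REVIEW_DAYS, "StorageClass": "DEEP_ARCHIVE"},
            ],
            # Action C: delete objects once the retention period has passed.
            "Expiration": {"Days": RETENTION_DAYS},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```

All day counts are measured from object creation, so the Deep Archive transition is placed at day 240 rather than day 180.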


860

MCQmedium

Your ecommerce app runs behind an Application Load Balancer (ALB) and uses an RDS database for orders. During an AZ impairment in us-east-1, customers report that checkout takes several minutes to recover. The current design places EC2 instances only in private subnets of AZ-a, while the ALB spans multiple subnets. The RDS DB instance is Multi-AZ. Management wants automatic recovery within the same Region. Which change best addresses the issue with minimal operational overhead?

A.Move the EC2 instances into Auto Scaling Groups that span private subnets in at least two AZs, keeping the ALB spanning those subnets.

B.Switch from RDS Single-AZ to RDS Multi-AZ, keeping the EC2 instances in only AZ-a because failover will still reach them.

C.Terminate the ALB and use a Network Load Balancer (NLB) in front of the existing single-AZ EC2 instances.

D.Add more EC2 instances in AZ-a and increase the ALB health check thresholds to avoid unnecessary replacements during impairments.

AnswerA

An Auto Scaling Group across multiple AZs ensures healthy capacity exists when an AZ becomes impaired, and the ALB can route to instances in any available AZ.

Why this answer

The current design places EC2 instances only in AZ-a, so when that AZ becomes impaired, all compute capacity is lost, causing checkout to fail until the impairment ends or manual intervention occurs. By moving EC2 instances into Auto Scaling Groups spanning at least two AZs, the application gains automatic recovery within the same Region because the ALB can route traffic to healthy instances in the remaining AZs. This change minimizes operational overhead because Auto Scaling automatically replaces failed instances and maintains desired capacity across AZs, while the ALB’s health checks ensure traffic is only sent to healthy targets.

Exam trap

The trap here is that candidates assume Multi-AZ RDS alone guarantees full application resilience, overlooking that the compute layer (EC2) must also be distributed across AZs to survive an AZ impairment.

How to eliminate wrong answers

Option B is wrong because the RDS Multi-AZ deployment is already in place (the question states the RDS DB instance is Multi-AZ), so this change does nothing to address the single-AZ EC2 failure; the database remains reachable, but the compute layer is still unavailable during the AZ impairment. Option C is wrong because replacing the ALB with an NLB does not solve the single-AZ EC2 problem; an NLB also requires targets in multiple AZs for high availability, and the existing single-AZ EC2 instances would still be lost during the impairment. Option D is wrong because adding more EC2 instances in the same impaired AZ-a does not provide recovery when that AZ fails, and increasing health check thresholds would actually delay the detection of unhealthy instances, prolonging the recovery time.


861

MCQhard

Based on the exhibit, which storage choice best matches the workload requirements?

A.Use io2 EBS volumes because they provide the highest durable block storage performance.

B.Use instance store NVMe for the temporary processing workspace.

C.Use Amazon EFS for the workspace so the temporary files survive instance replacement.

D.Use S3 as the working directory and read and write the intermediate files directly there.

AnswerB

Instance store fits a high-IOPS scratch workload where data can be lost safely and rebuilt from S3. The benchmark shows extremely low latency and very high random I/O performance, which is ideal for intermediate transcode files. Because the job can be retried from the source object, persistence is not needed on the local workspace.

Why this answer

Instance store NVMe volumes provide temporary, high-performance block storage directly attached to the EC2 host. For a temporary processing workspace where data does not need to persist beyond the instance lifecycle, instance store offers the lowest latency and highest throughput, making it the best match for the workload requirements.

Exam trap

The trap here is that candidates often choose io2 EBS volumes (Option A) because they associate 'highest durable block storage' with 'best performance,' failing to recognize that durability and persistence are unnecessary for temporary data, and that instance store provides superior raw performance for ephemeral workloads.

How to eliminate wrong answers

Option A is wrong because io2 EBS volumes are persistent block storage designed for 99.999% durability, which is unnecessary and cost-inefficient for temporary processing data that does not require persistence. Option C is wrong because Amazon EFS is a network file system that provides shared, persistent storage; using it for temporary files that should not survive instance replacement introduces unnecessary complexity, latency, and cost, and contradicts the requirement for a temporary workspace. Option D is wrong because using S3 as a working directory for intermediate files would incur high latency per operation, lack POSIX file system semantics (e.g., no file locking or atomic renames), and generate excessive API call costs, making it unsuitable for high-frequency read/write processing tasks.


862

MCQmedium

A SaaS vendor needs temporary access to an S3 bucket in your AWS account to read customer exports. The vendor will assume an IAM role you created. During integration testing, the vendor reports that their AssumeRole requests succeed, but your security team is concerned about the possibility of confused-deputy attacks. Which trust policy approach most directly mitigates this risk?

A.Add an sts:ExternalId condition to the role trust policy that must match the unique external ID you provide to the vendor.

B.Require the vendor to use the same MFA device serial number as your internal administrators in the trust policy.

C.Remove the role’s permissions policy and rely only on the S3 bucket policy to validate the caller.

D.Allow sts:AssumeRole from the vendor account root principal without restricting to the vendor’s specific IAM role.

AnswerA

The sts:ExternalId condition is a common protection against confused-deputy scenarios in cross-account role assumption. It ensures that only principals who know the unique external ID can successfully assume the role. This mitigates a third party tricking the vendor’s identity into assuming your role, even if they can call AssumeRole.

Why this answer

Option A is correct because adding an `sts:ExternalId` condition to the role trust policy forces the vendor to include a unique external ID in their `AssumeRole` API call. This prevents a confused-deputy attack because the vendor passes the external ID associated with your account only when acting on your behalf, so another customer of the vendor cannot trick the vendor into assuming your role simply by supplying your role ARN.

Exam trap

The trap here is that candidates may think MFA (Option B) or bucket policies (Option C) are sufficient for cross-account access security, but they fail to address the specific confused-deputy vector that `sts:ExternalId` is designed to block.

How to eliminate wrong answers

Option B is wrong because requiring the vendor to use the same MFA device serial number as your internal administrators is impractical and insecure—it would require sharing a physical or virtual MFA device, which violates the principle of least privilege and does not prevent confused-deputy attacks. Option C is wrong because removing the role’s permissions policy and relying solely on the S3 bucket policy does not mitigate the confused-deputy risk; the trust policy still governs who can assume the role, and without an external ID condition, any principal in the vendor account could assume it. Option D is wrong because allowing `sts:AssumeRole` from the vendor account root principal without restricting to the vendor’s specific IAM role actually increases the attack surface—it permits any user or service in the vendor account to assume the role, making confused-deputy attacks easier, not harder.
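A trust policy with the external ID condition might look like the sketch below. The vendor role ARN (using the documentation-style account ID 111122223333) and the external ID value are placeholders invented for this example; `sts:ExternalId` and `sts:AssumeRole` are the real condition key and action.

```python
import json


def build_trust_policy(vendor_role_arn: str, external_id: str) -> dict:
    """Trust policy that lets one vendor role assume the role, but only
    when the AssumeRole call carries the agreed external ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Restrict to the vendor's specific role, not the account root.
                "Principal": {"AWS": vendor_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {
                    "StringEquals": {"sts:ExternalId": external_id}
                },
            }
        ],
    }


# Placeholder values for illustration only.
policy = build_trust_policy(
    "arn:aws:iam::111122223333:role/vendor-export-reader",
    "example-external-id-7f3c",
)
print(json.dumps(policy, indent=2))
```

Naming the vendor's specific role as the principal also addresses the weakness described for Option D.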


863

MCQmedium

A company runs an Amazon Aurora DB cluster with a Multi-AZ deployment. The application is configured with a hard-coded endpoint that points to the current writer *DB instance* (an instance-specific endpoint), rather than the Aurora cluster writer endpoint. During an unexpected AZ failure, Aurora promotes the standby to become the new writer. However, the application continues to fail to connect until an operator updates the hard-coded endpoint. What change most directly improves resiliency so the application automatically reconnects after failover?

A.Keep using the writer DB instance endpoint, but increase the client connection timeout.

B.Connect using the Aurora cluster writer endpoint so DNS resolves to the current writer after failover.

C.Disable Multi-AZ failover and rely on manual snapshot restore to bring the database back online.

D.Enable cross-Region read replicas and route application traffic to the replica during the outage.

AnswerB

Aurora cluster endpoints are designed to provide continuity across failovers. The Aurora cluster writer endpoint (writer endpoint for the cluster) updates so DNS resolves to the promoted writer. The application can reconnect without manual endpoint changes.

Why this answer

Option B is correct because the Aurora cluster writer endpoint is a DNS name that always resolves to the current writer instance in the cluster, even after a failover. By using this endpoint instead of a hard-coded instance-specific endpoint, the application automatically reconnects to the new writer without manual intervention, directly improving resiliency.

Exam trap

The trap here is that candidates may confuse the instance-specific endpoint with the cluster writer endpoint, or think that increasing timeouts or using read replicas can solve a writer failover issue, when the core problem is the hard-coded reference to a specific instance that is no longer the writer.

How to eliminate wrong answers

Option A is wrong because increasing the client connection timeout does not change the fact that the hard-coded endpoint points to a failed instance; the connection will still fail after the timeout expires. Option C is wrong because disabling Multi-AZ failover and relying on manual snapshot restore would cause significant downtime and data loss, directly contradicting the goal of improving resiliency. Option D is wrong because cross-Region read replicas are read-only and cannot accept writes; routing application traffic to a read replica during an outage would not allow the application to write data, and it does not address the failover of the writer instance.


864

Multi-Selecthard

A latency-sensitive video platform uploads large files to S3 from users around the world. Which two features can improve upload performance? The design must avoid adding custom operational scripts.

Select 2 answers

A.S3 Object Lock

B.S3 Transfer Acceleration

C.S3 multipart upload

D.S3 Inventory

AnswersB, C

Transfer Acceleration uses optimized edge paths into AWS for long-distance S3 transfers.

Why this answer

S3 Transfer Acceleration (B) uses AWS edge locations to route uploads over optimized network paths, reducing latency for users far from the destination bucket. S3 Multipart Upload (C) allows parallel uploads of file parts, improving throughput and resilience for large files. Both features enhance upload performance without requiring custom scripts.

Exam trap

The trap here is that candidates may confuse S3 Transfer Acceleration with CloudFront’s content delivery features, or assume S3 Object Lock or Inventory could somehow improve upload speed, but neither addresses network latency or throughput for uploads.


865

Multi-Selecthard

A retailer runs a reporting-heavy relational app on Amazon RDS MySQL. Peak dashboard traffic lasts only three hours each day, but the database is sized for the peak all day. The business wants lower cost without rewriting the application. Which three actions are best? Select three.

Select 3 answers

A.Right-size the writer based on actual utilization instead of peak guesses.

B.Add read replicas and direct dashboard traffic away from the writer.

C.Evaluate Aurora MySQL if the current replica-heavy design would be cheaper there.

D.Migrate to DynamoDB immediately because every relational workload is more expensive.

E.Increase provisioned IOPS permanently so the monthly bill drops.

AnswersA, B, C

Correct. Right-sizing removes waste from the always-on primary instance. If the writer is sized for real load rather than a worst-case assumption, the company pays for less unused compute.

Why this answer

Option A is correct because right-sizing the RDS instance based on actual utilization metrics (e.g., CPU, memory, connections) rather than peak guesses directly reduces compute and memory costs. The peak lasts only three hours, so an instance sized for a worst-case assumption sits mostly idle for the other 21 hours; once dashboard reads are served elsewhere, the writer can run on a smaller instance class. This is a fundamental cost-optimization strategy for RDS that requires no application rewrite.

Exam trap

The trap here is that candidates assume DynamoDB is always cheaper for any workload, ignoring the need for application rewrites and the relational reporting requirements, while also overlooking that increasing IOPS always raises costs rather than lowering them.


866

MCQhard

Based on the exhibit, duplicate payment charges occasionally occur when the worker times out after the charge is submitted but before the message is deleted. What change best prevents duplicate charges while keeping retry behavior?

A.Switch the queue to FIFO and rely on content-based deduplication to guarantee exactly-once processing.

B.Make the consumer idempotent by storing a processed payment key and rejecting repeat charges.

C.Reduce the visibility timeout so the message becomes available again sooner after a timeout.

D.Add a dead-letter queue and disable retries so the message is never processed twice.

AnswerB

The worker can still receive the same message more than once because SQS Standard is at-least-once delivery and the delete happened after the charge. Idempotency is the correct safety control because it prevents the payment from being applied twice even when the message is retried. A processed-payment record or conditional write lets retries remain possible without creating duplicate charges.

Why this answer

Option B is correct because making the consumer idempotent ensures that even if the same message is processed more than once (due to a timeout after the charge is submitted but before the message is deleted), the duplicate charge will be rejected. By storing a processed payment key (e.g., a unique transaction ID) and checking it before processing, the system can safely retry without causing duplicate payments. This approach preserves retry behavior while preventing duplicates, which is the core requirement.

Exam trap

The trap here is that candidates often assume FIFO queues with deduplication guarantee exactly-once processing, but deduplication only prevents the same message from being enqueued twice within the deduplication interval; it does not prevent reprocessing when the consumer times out after charging but before deleting the message.

How to eliminate wrong answers

Option A is wrong because switching to a FIFO queue with content-based deduplication does not guarantee exactly-once processing in this scenario. Deduplication only stops a producer from enqueuing the same message twice within the 5-minute deduplication interval; it does not stop redelivery. If the consumer times out after submitting the charge but before deleting the message, the message becomes visible again once the visibility timeout expires and is processed a second time. Option C is wrong because reducing the visibility timeout would make the message reappear sooner, increasing the likelihood of duplicate processing and not preventing the existing duplicate issue. Option D is wrong because adding a dead-letter queue and disabling retries would eliminate retry behavior entirely, which contradicts the requirement to keep retry behavior; it would also move the message to a DLQ after the first failure, potentially losing the payment charge.
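The idempotent-consumer pattern from Option B can be sketched as follows. This is a minimal in-memory simulation: the class name, the message fields `payment_id` and `amount_cents`, and the use of a Python set are assumptions for illustration. A real worker would record the key in a durable store (for example with a conditional write) so that the check survives restarts.

```python
class PaymentProcessor:
    """Hypothetical worker that charges at most once per payment key."""

    def __init__(self) -> None:
        self.processed_keys: set[str] = set()    # stand-in for a durable store
        self.charges: list[tuple[str, int]] = []  # stand-in for the payment API

    def handle(self, message: dict) -> bool:
        """Process one queue message; return True only if a charge was made."""
        key = message["payment_id"]
        if key in self.processed_keys:
            # Redelivered message: skip the charge, but the caller can still
            # delete the message so that it stops being retried.
            return False
        self.processed_keys.add(key)
        self.charges.append((key, message["amount_cents"]))
        return True


processor = PaymentProcessor()
message = {"payment_id": "order-1001", "amount_cents": 4999}

first = processor.handle(message)    # charge is submitted
# Simulate a timeout before the delete: the queue redelivers the message.
second = processor.handle(message)   # duplicate is rejected

print(first, second, len(processor.charges))  # True False 1
```

Retries remain enabled, which satisfies the requirement; only the side effect is made safe to repeat.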


867

MCQhard

A Lambda-based travel booking site has unpredictable traffic spikes and users see latency caused by cold starts. The function must respond consistently during expected campaign windows. What should be configured?

A.Provisioned concurrency during campaign windows

B.A larger deployment package

C.CloudTrail data events

D.Reserved concurrency only

AnswerA

Provisioned concurrency keeps execution environments initialized and reduces cold-start latency.

Why this answer

Provisioned concurrency initializes a specified number of execution environments in advance, eliminating cold starts for requests served by those environments. During campaign windows, this keeps latency consistently low because initialized environments are ready to handle requests immediately.

Exam trap

The trap here is that candidates confuse reserved concurrency (a limit) with provisioned concurrency (a pre-warming mechanism), assuming any concurrency setting solves cold starts, when only provisioned concurrency actively eliminates them.

How to eliminate wrong answers

Option B is wrong because a larger deployment package increases the time needed to download and initialize the code, making cold starts worse, not better. Option C is wrong because CloudTrail data events record API activity for auditing and do not affect Lambda execution latency or concurrency. Option D is wrong because reserved concurrency only caps the maximum number of concurrent executions for a function to prevent it from consuming all available concurrency in an account; it does not pre-warm instances or reduce cold starts.
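As a small illustration, the sketch below builds the parameters for enabling provisioned concurrency on an alias. The function name `booking-api`, the alias `live`, and the count of 50 environments are hypothetical; the parameter names follow the Lambda `PutProvisionedConcurrencyConfig` API, which requires a published version or alias rather than `$LATEST`. The code only constructs the request.

```python
def provisioned_concurrency_request(function_name: str, qualifier: str,
                                    environments: int) -> dict:
    """Build parameters for enabling provisioned concurrency.

    The qualifier must be a published version or an alias; provisioned
    concurrency cannot be configured on $LATEST.
    """
    if qualifier == "$LATEST":
        raise ValueError("use a published version or alias, not $LATEST")
    if environments < 1:
        raise ValueError("environments must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,
        "ProvisionedConcurrentExecutions": environments,
    }


# Hypothetical names and sizing for a campaign window.
request = provisioned_concurrency_request("booking-api", "live", 50)
print(request)

# With boto3 installed and credentials configured, this could be applied as:
#   boto3.client("lambda").put_provisioned_concurrency_config(**request)
```

Because provisioned concurrency is billed while it is configured, it is usually enabled shortly before a campaign window and removed afterwards.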


868

Matchingmedium

A team wants a web application to keep serving traffic if one Availability Zone fails. Match each architecture element to the resilience behavior it provides.


Concepts

Matches

Stop sending requests to unhealthy targets and keep only healthy instances in rotation.

Launch replacement instances in healthy AZs when capacity is lost.

Maintain a synchronous standby in another AZ and fail over automatically.

Allow instances to be replaced without losing user sessions that are stored elsewhere.

Why these pairings

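Load balancer health checks remove unhealthy targets from rotation; an Auto Scaling group that spans multiple AZs launches replacement instances where capacity is still available; an RDS Multi-AZ deployment keeps a synchronous standby in another AZ and fails over automatically; and storing session state outside the instances lets them be replaced without logging users out.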


869

MCQmedium

A ticket booking system uses Aurora MySQL. The company wants fast cross-Region disaster recovery with low RPO. Which architecture should be considered? The team wants the control to be enforceable during normal operations.

A.Aurora Global Database

B.A single-AZ Aurora cluster

C.An ElastiCache Redis replica

D.Manual snapshots copied monthly

AnswerA

Aurora Global Database replicates with low latency to secondary Regions and supports faster disaster recovery than snapshot-only approaches.

Why this answer

Aurora Global Database is designed for cross-Region disaster recovery: it replicates at the storage layer with typical lag of about one second or less, so the RPO is low and the impact on database performance is minimal. A secondary Region can be promoted quickly if the primary Region fails, while during normal operations only the primary cluster accepts writes, which meets the low RPO and enforceable control requirements.

Exam trap

The trap here is that candidates may confuse Aurora Global Database with cross-Region read replicas or manual snapshot copy strategies, underestimating the RPO and failover speed requirements for disaster recovery.

How to eliminate wrong answers

Option B is wrong because a single-AZ Aurora cluster lacks any cross-Region replication or failover capability, resulting in no DR protection and an RPO that depends on manual backups. Option C is wrong because ElastiCache Redis is an in-memory cache, not a persistent database, and its cross-Region replication (Global Datastore) does not provide the same transactional consistency or DR guarantees as Aurora Global Database for a ticket booking system. Option D is wrong because manual snapshots copied monthly would yield an RPO of up to 30 days, far exceeding the low RPO requirement, and they require manual intervention for recovery, which is not fast.


870

MCQeasy

A team stores application logs in Amazon CloudWatch Logs. They enabled long retention and detailed dashboards, resulting in higher-than-expected monthly spend. Compliance requires retaining logs for 90 days, but operations only needs aggregated views. Which change most directly reduces CloudWatch Logs cost while meeting the requirement?

A.Set the CloudWatch Logs log group retention period to 90 days for the relevant log groups.

B.Disable VPC flow logs so the applications stop producing logs automatically.

C.Increase the logging level to DEBUG to reduce the number of log events by batching them.

D.Turn off CloudWatch alarms so logs stop being ingested into CloudWatch Logs.

AnswerA

CloudWatch Logs storage charges are driven primarily by how much data you store and for how long. Reducing retention to the required 90 days decreases stored log volume over time.

Why this answer

Setting the CloudWatch Logs log group retention period to 90 days directly reduces storage costs by automatically expiring logs after the compliance-required duration. This eliminates the cost of storing logs beyond 90 days, which was the primary driver of the higher-than-expected spend, while still retaining the data for the mandated period and allowing aggregated views via dashboards.

Exam trap

The trap here is that candidates may confuse log retention settings with log ingestion controls, mistakenly thinking that disabling alarms or changing log levels will reduce costs, when in fact the most direct and compliant cost-saving measure is to adjust the retention period.

How to eliminate wrong answers

Option B is wrong because disabling VPC Flow Logs stops the production of network-level logs, but the question states the team stores 'application logs' in CloudWatch Logs, not VPC Flow Logs; this action would not address the cost of existing application log ingestion and retention, and it would break compliance if those logs are required. Option C is wrong because increasing the logging level to DEBUG actually generates more log events per operation, not fewer, and batching does not reduce the number of events; it would increase costs due to higher ingestion volume. Option D is wrong because turning off CloudWatch alarms does not stop log ingestion; alarms are separate from log data ingestion and retention, so logs would continue to be ingested and stored, incurring the same costs.


871

Multi-Selectmedium

A marketing site serves versioned JavaScript and CSS files from an Amazon S3 origin through Amazon CloudFront. After a frontend release, the CloudFront cache hit ratio dropped because browsers now send an Authorization header on every static asset request even though the assets are public and do not require authentication. The team wants to lower origin load and improve cache efficiency. Which two actions should it take? Select two.

Select 2 answers

A.Create a separate CloudFront behavior for static assets with a cache policy and origin request policy that exclude the Authorization header.

B.Use hashed or versioned object names and long Cache-Control max-age values for immutable assets.

C.Forward the Authorization header to the origin for all static asset requests.

D.Set the cache TTL to zero so browsers always revalidate content.

E.Store the static assets in Amazon EFS so CloudFront can cache them more effectively.

AnswersA, B

CloudFront cache efficiency depends on the cache key. If Authorization is included in the cache key or forwarded unnecessarily to the origin, each request can be treated as unique and the cache hit ratio drops. A dedicated behavior for immutable static content should use a cache policy that does not include Authorization and an origin request policy that does not forward it, so the same object can be reused across many viewers.

Why this answer

Option A is correct because creating a separate CloudFront behavior for static assets allows you to attach a cache policy and an origin request policy that explicitly exclude the Authorization header. By not forwarding the Authorization header to the S3 origin, CloudFront can treat all requests for the same asset as cache hits, regardless of the header value, which restores the cache hit ratio and reduces origin load.

Exam trap

The trap here is that candidates may think forwarding the Authorization header is necessary for security or that setting TTL to zero is a safe fallback, but both actions actually increase origin load and degrade cache performance for public static assets.
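The effect of the cache key on hit ratio can be shown with a small simulation. This is not CloudFront code: the functions `cache_key` and `hit_ratio`, the asset path, and the token values are invented for illustration. It models a cache whose key is the path plus whichever headers the cache policy includes.

```python
def cache_key(path: str, headers: dict, key_headers: list[str]) -> tuple:
    """Build a cache key from the path and the headers named in the policy."""
    selected = tuple(sorted((name.lower(), headers.get(name, "")) for name in key_headers))
    return (path, selected)


def hit_ratio(requests: list[tuple[str, dict]], key_headers: list[str]) -> float:
    """Replay requests against an empty cache and return the fraction of hits."""
    cache = set()
    hits = 0
    for path, headers in requests:
        key = cache_key(path, headers, key_headers)
        if key in cache:
            hits += 1
        else:
            cache.add(key)   # miss: fetch from origin, then cache
    return hits / len(requests)


# Ten viewers request the same public asset, each with a different token.
requests = [
    ("/assets/app.3f9a1c.js", {"Authorization": f"Bearer token-{user}"})
    for user in range(10)
]

with_auth = hit_ratio(requests, key_headers=["Authorization"])
without_auth = hit_ratio(requests, key_headers=[])

print(with_auth, without_auth)  # 0.0 0.9
```

With the header in the key, every request is a miss; without it, only the first request reaches the origin.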


872

MCQeasy

You have an EC2 instance in a private subnet with no NAT Gateway. The instance must access an Amazon S3 bucket (for example, to read configuration files) without sending traffic to the public internet. What VPC endpoint type should you use for S3?

A.Create a Gateway VPC endpoint for the S3 service

B.Create an Interface VPC endpoint (powered by PrivateLink) for S3

C.Use a Transit Gateway to route to S3 over the internet

D.Place a NAT Gateway and restrict security group egress to port 443 to reduce exposure

AnswerA

S3 supports gateway VPC endpoints. Gateway endpoints integrate with your VPC route tables so that traffic destined for S3 is routed privately over the AWS network, avoiding the need for a NAT Gateway and public internet egress for S3 access.

Why this answer

A Gateway VPC endpoint is the correct choice because it allows EC2 instances in a private subnet to access S3 without traversing the public internet. It uses prefix lists and route table entries to direct S3 traffic through AWS's internal network, and it does not require a NAT gateway, internet gateway, or public IP addresses. Gateway endpoints are free of charge and scale automatically, making them ideal for private subnet access to S3 and DynamoDB.

Exam trap

The trap here is that candidates often confuse Gateway VPC endpoints with Interface VPC endpoints, assuming S3 requires a private IP address like other AWS services, but S3 and DynamoDB are the only services that support Gateway endpoints, which are simpler and free.

How to eliminate wrong answers

Option B is wrong because, although S3 also supports Interface VPC endpoints (PrivateLink), they are billed per hour and per GB processed and are mainly useful when S3 must be reached from on-premises networks or from other VPCs; for an instance inside the VPC, the Gateway endpoint is free and simpler. Option C is wrong because a Transit Gateway is a network transit hub that connects VPCs, VPNs, and on-premises networks; it does not provide private access to S3, and routing to S3 over the internet violates the requirement. Option D is wrong because a NAT Gateway sends outbound traffic to the public internet, which the requirement explicitly forbids, and it adds cost and complexity; a Gateway VPC endpoint achieves the goal without any internet exposure.

873

MCQeasy

A backend API uses an AWS Lambda function behind API Gateway. The first requests after every weekly deployment experience cold starts, causing p95 latency spikes for a few minutes. Which configuration most directly prevents those cold starts for the published version?

A.Increase the Lambda memory size only, without changing how Lambda is invoked

B.Use Lambda provisioned concurrency for the version via an alias

C.Enable dead-letter queues (DLQ) to retry failed cold starts

D.Attach a CloudFront distribution to cache API Gateway responses for 5 minutes

AnswerB

Provisioned concurrency keeps Lambda execution environments initialized and ready for a specific published version. By attaching it to an alias (for example, pointing the alias used by API Gateway to the new version), you pre-warm environments so the first requests after deployment are served without cold-start initialization.

Why this answer

Provisioned concurrency initializes a specified number of Lambda execution environments ahead of time, so that when the published version is invoked via an alias, there are no cold starts. This directly addresses the latency spikes caused by cold starts after a deployment, as the function is kept warm and ready to handle requests immediately.
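A minimal sketch of that configuration, assuming a hypothetical function named orders-api with an alias named live, could build the parameters for Lambda's PutProvisionedConcurrencyConfig call like this:

```python
def provisioned_concurrency_request(function_name, alias, environments):
    """Build keyword arguments for Lambda's PutProvisionedConcurrencyConfig."""
    if environments < 1:
        raise ValueError("environments must be at least 1")
    return {
        "FunctionName": function_name,
        # Provisioned concurrency targets a published version or an alias,
        # never $LATEST.
        "Qualifier": alias,
        "ProvisionedConcurrentExecutions": environments,
    }


# Hypothetical function name, alias, and environment count.
request = provisioned_concurrency_request("orders-api", "live", 10)
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("lambda").put_provisioned_concurrency_config(**request)
```

The environment count is a sizing decision; it would normally be based on the concurrency observed during the first minutes after a deployment.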

Exam trap

The trap here is that candidates may confuse provisioned concurrency with reserved concurrency, which only limits the maximum number of concurrent executions but does not prevent cold starts.

How to eliminate wrong answers

Option A is wrong because increasing memory size can shorten cold start duration but does not prevent cold starts from occurring; it makes them slightly faster without eliminating them. Option C is wrong because dead-letter queues capture failed asynchronous invocations after they occur; they do not keep functions warm or prevent cold starts. Option D is wrong because caching API Gateway responses in CloudFront reduces latency only for cached responses; a request that reaches Lambda after a deployment still triggers a cold start, and caching does not warm the Lambda environment.

874

MCQmedium

A site serves static assets (JS/CSS) through CloudFront from an S3 origin. After a recent frontend change, CloudFront shows a cache hit ratio below 20%. In CloudFront access logs, requests to the same asset URL path differ by a query parameter named rnd (a random value appended by the app on every request). The origin content is identical regardless of rnd. What is the best CloudFront configuration change to restore effective caching?

A.Increase the origin response Cache-Control max-age header on S3 so CloudFront caches longer even with different rnd values.

B.Create a custom CloudFront Cache Policy that does not include the rnd query parameter in the cache key (whitelist only required parameters, or forward no query strings).

C.Disable compression on CloudFront so the response body is identical byte-for-byte and cache hits improve.

D.Switch the origin from S3 to an ALB so CloudFront can cache based on ALB target health checks instead of the query string.

AnswerB

CloudFront caching effectiveness depends on the cache key. Since rnd does not change the content returned by the S3 origin, excluding rnd from the cache key allows many requests for the “same” asset to map to the same cached object. This removes cache fragmentation and restores a higher hit ratio without changing application content correctness.

Why this answer

The rnd query parameter makes each request appear unique to CloudFront, causing a cache miss for every request even though the underlying content is identical. By creating a custom cache policy that either forwards no query strings or whitelists only required parameters, CloudFront will ignore the rnd parameter when computing the cache key, allowing it to serve cached responses and dramatically improve the cache hit ratio.
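The effect of the cache key on hit ratio can be illustrated with a toy model. This is a simplified sketch, not CloudFront's actual key algorithm: the cache_key and hit_ratio helpers are hypothetical, and the model assumes an empty cache with no expiry.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def cache_key(url, allowed_params=()):
    """Return a simplified cache key: the path plus only the allowed query parameters."""
    parts = urlsplit(url)
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k in allowed_params)
    return parts.path + ("?" + urlencode(kept) if kept else "")


def hit_ratio(urls, allowed_params=()):
    """Replay requests against an initially empty cache and return hits / requests."""
    seen = set()
    hits = 0
    for url in urls:
        key = cache_key(url, allowed_params)
        if key in seen:
            hits += 1
        else:
            seen.add(key)
    return hits / len(urls)


# Ten requests for the same asset, each with a different random rnd value.
requests = [f"/static/app.js?rnd={i}" for i in range(10)]
ratio_with_rnd = hit_ratio(requests, allowed_params=("rnd",))
ratio_without_rnd = hit_ratio(requests)
print(ratio_with_rnd, ratio_without_rnd)
```

With rnd in the key, every request maps to a new key and nothing is reused; with rnd excluded, only the first request misses.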

Exam trap

The trap here is that candidates often think increasing cache duration (Option A) or disabling compression (Option C) will fix cache misses, when the real issue is that the query parameter is being included in the cache key, making every request unique.

How to eliminate wrong answers

Option A is wrong because increasing Cache-Control max-age only tells the browser and edge how long to cache the response, but it does not change the cache key; CloudFront still treats URLs with different rnd values as distinct objects, so each request will be a cache miss. Option C is wrong because disabling compression does not affect the cache key; CloudFront already caches compressed and uncompressed versions separately based on the Accept-Encoding header, and the issue here is the query string, not compression. Option D is wrong because switching to an ALB does not solve the query-string-based cache key problem; CloudFront would still see different rnd values as different cache keys, and ALBs are not designed to improve CloudFront caching behavior.

875

Multi-Selectmedium

A financial services company is migrating sensitive customer data to Amazon S3. The data must be encrypted at rest using a customer-managed key stored in AWS KMS, with automatic rotation every 90 days. The company also needs to prevent any access to the data from outside the corporate network, except for approved AWS services. Which three steps should be taken to meet these requirements? (Choose three.)

Select 3 answers

.Enable default encryption on the S3 bucket with SSE-KMS using the customer-managed key.

.Configure an S3 bucket policy that denies access unless the request is made from the corporate IP range.

.Use an S3 bucket policy to require that all requests use HTTPS.

.Enable S3 Block Public Access at the account level.

.Attach an S3 VPC endpoint policy that only allows access from the corporate VPC.

.Use an S3 bucket policy that grants access only to the root user of the account.

Why this answer

Enable default encryption on the S3 bucket with SSE-KMS using the customer-managed key ensures data is encrypted at rest with a key the customer controls and can rotate automatically every 90 days. Configure an S3 bucket policy that denies access unless the request is made from the corporate IP range restricts access to the corporate network. Attach an S3 VPC endpoint policy that only allows access from the corporate VPC ensures that traffic to S3 stays within the AWS network and is subject to VPC controls, preventing exposure to the public internet.
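One way the network restriction could be expressed is a single Deny statement whose conditions must all hold. The sketch below is an assumption-laden example rather than a vetted policy: the bucket name, CIDR range, and endpoint ID are hypothetical, and a statement like this should be tested carefully because a mistake can lock out administrators.

```python
import json


def network_restricted_policy(bucket, corporate_cidrs, vpce_ids):
    """Deny S3 access unless the request comes from an approved path.

    All condition operators in one statement are ANDed, so the Deny applies
    only when the request is not from a corporate IP range, not through an
    approved VPC endpoint, and not made by an AWS service on the caller's behalf.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyOutsideCorporateNetwork",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
                "Condition": {
                    "NotIpAddress": {"aws:SourceIp": list(corporate_cidrs)},
                    "StringNotEquals": {"aws:SourceVpce": list(vpce_ids)},
                    "Bool": {"aws:ViaAWSService": "false"},
                },
            }
        ],
    }


# Hypothetical bucket, documentation-range CIDR, and endpoint ID.
policy = network_restricted_policy(
    "example-customer-data", ["203.0.113.0/24"], ["vpce-0123456789abcdef0"]
)
print(json.dumps(policy, indent=2))
```

Default encryption with the customer-managed KMS key is configured separately on the bucket; this policy covers only the network restriction.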

Exam trap

The trap here is that candidates often confuse encryption in transit (HTTPS) with encryption at rest (SSE-KMS), or they think that Block Public Access alone satisfies network restriction requirements, when in fact a combination of bucket policy IP restrictions and VPC endpoint policies is needed to meet the 'no access from outside the corporate network' requirement.

876

MCQhard

A warehouse integration service must use shared file storage across Linux EC2 instances in multiple Availability Zones. The storage must remain available during an AZ failure. Which service should be used? The team wants the control to be enforceable during normal operations.

A.Amazon EFS with mount targets in multiple Availability Zones

B.S3 mounted as a POSIX file system without a file gateway

C.Instance store volumes

D.An EBS volume attached to all instances

AnswerA

EFS is regional file storage and supports mount targets across AZs.

Why this answer

Amazon EFS provides a fully managed, POSIX-compliant NFS file system that can be mounted concurrently on multiple Linux EC2 instances across different Availability Zones. By creating mount targets in each AZ, the file system remains accessible even if one AZ fails, because the other mount targets continue to serve traffic. EFS also supports lifecycle policies and IAM enforcement to control access during normal operations, meeting the requirement for enforceable control.
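The "one mount target per Availability Zone" idea can be sketched as a list of parameters for the EFS CreateMountTarget call. The file system ID, subnet IDs, and security group ID below are hypothetical placeholders.

```python
def mount_target_requests(file_system_id, subnet_by_az, security_group_id):
    """Build one CreateMountTarget request per Availability Zone."""
    return [
        {
            "FileSystemId": file_system_id,
            # One subnet per AZ; instances mount through the target in their own AZ.
            "SubnetId": subnet_id,
            "SecurityGroups": [security_group_id],
        }
        for _az, subnet_id in sorted(subnet_by_az.items())
    ]


# Hypothetical identifiers, for illustration only.
subnets = {
    "us-east-1a": "subnet-0aaa1111",
    "us-east-1b": "subnet-0bbb2222",
    "us-east-1c": "subnet-0ccc3333",
}
requests = mount_target_requests("fs-0123456789abcdef0", subnets, "sg-0ddd4444")
print(len(requests))
# With boto3 installed and credentials configured, each could be sent as:
#   boto3.client("efs").create_mount_target(**req)
```

The security group on each mount target needs to allow inbound NFS traffic (TCP 2049) from the instances.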

Exam trap

The trap here is that candidates often confuse EBS Multi-Attach (which is limited to Provisioned IOPS volumes on Nitro-based instances within a single AZ) with the cross-AZ shared file system capability that EFS provides, or they mistakenly think S3 with a FUSE mount is a reliable POSIX file system for production workloads.

How to eliminate wrong answers

Option B is wrong because mounting S3 as a file system (e.g., using s3fs-fuse) does not provide full POSIX semantics (for example, no file locking and no atomic renames) and is not designed to serve as shared file storage for multiple instances. Option C is wrong because instance store volumes are ephemeral and tied to a single EC2 instance; data is lost if the instance stops or fails, and the volumes cannot be shared across instances or survive an AZ failure. Option D is wrong because a standard EBS volume attaches to one EC2 instance at a time and exists in a single Availability Zone; EBS Multi-Attach is limited to Provisioned IOPS (io1/io2) volumes on Nitro-based instances in the same AZ, so it cannot serve instances in other AZs or survive an AZ failure.

877

Multi-Selectmedium

A distributed analytics platform runs on 12 EC2 instances in one Availability Zone. The nodes exchange a very high volume of east-west messages and the team wants the lowest possible network latency between instances. Which two changes should the architect make first? Select two.

Select 2 answers

A.Place the instances in a cluster placement group so AWS keeps them physically close together.

B.Use instance types that support enhanced networking with the Elastic Network Adapter (ENA).

C.Spread the instances across multiple Availability Zones to reduce the chance of correlated failure.

D.Use a spread placement group so each instance lands on different underlying hardware.

E.Move the workload to burstable T-series instances to absorb short traffic spikes economically.

AnswersA, B

Cluster placement groups are intended for tightly coupled workloads that need low network latency and high throughput between instances. AWS places the instances on hardware that is physically close within the AZ, which improves east-west communication.

Why this answer

A cluster placement group is correct because it places EC2 instances physically close together within a single Availability Zone, which minimizes network latency and maximizes throughput for high-volume east-west traffic. It is the lowest-latency placement strategy available because it reduces the network distance between instances. Enhanced networking with the Elastic Network Adapter (ENA) is also correct because it provides higher packets-per-second performance and consistently lower inter-instance latency, so the instances can take full advantage of the placement group.
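As a sketch of how the two choices fit together, the helper below builds parameters for the EC2 CreatePlacementGroup and RunInstances calls. The group name, AMI ID, and instance type are hypothetical; the instance type is assumed to support ENA.

```python
def cluster_placement_requests(group_name, image_id, instance_type, count):
    """Build CreatePlacementGroup and RunInstances parameters for a cluster group."""
    create_group = {"GroupName": group_name, "Strategy": "cluster"}
    launch = {
        "ImageId": image_id,
        # Assumed to be an ENA-capable instance type.
        "InstanceType": instance_type,
        # MinCount == MaxCount requests all instances in a single launch.
        "MinCount": count,
        "MaxCount": count,
        "Placement": {"GroupName": group_name},
    }
    return create_group, launch


# Hypothetical names and IDs, for illustration only.
create_group, launch = cluster_placement_requests(
    "analytics-cluster", "ami-0123456789abcdef0", "c5n.9xlarge", 12
)
print(create_group, launch["Placement"])
# With boto3 installed and credentials configured:
#   ec2 = boto3.client("ec2")
#   ec2.create_placement_group(**create_group)
#   ec2.run_instances(**launch)
```

Launching all instances in one request with a single instance type is a common way to reduce the chance of an insufficient-capacity error in a cluster placement group.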

Exam trap

The trap here is that candidates may confuse a spread placement group (which focuses on fault isolation) with a cluster placement group (which focuses on low latency), or they may think that spreading across Availability Zones improves performance when it actually increases latency.

878

MCQmedium

A payments API uses an RDS MySQL database and must remain available during an Availability Zone failure with minimal application changes. What should the architect enable? The architecture review board prefers a managed AWS-native control.

A.S3 Cross-Region Replication

B.Multi-AZ deployment for the RDS DB instance

C.Read replicas only

D.EBS snapshots every hour

AnswerB

Multi-AZ provides synchronous standby replication and automatic failover within a Region.

Why this answer

Multi-AZ deployment for RDS MySQL provides synchronous standby replication to a different Availability Zone, with automatic failover, no loss of committed data (RPO=0), and minimal downtime (failover typically completes within one to two minutes) during an AZ failure. This is a managed AWS-native solution that requires no application changes: the DB instance endpoint stays the same, and RDS repoints its DNS record to the standby, so the application only needs to reconnect.
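Enabling Multi-AZ on an existing instance is a single modification. The sketch below builds parameters for the RDS ModifyDBInstance call; the instance identifier is a hypothetical placeholder.

```python
def enable_multi_az_request(db_instance_id, apply_immediately=False):
    """Build keyword arguments for the RDS ModifyDBInstance call."""
    return {
        "DBInstanceIdentifier": db_instance_id,
        "MultiAZ": True,
        # False defers the change to the next maintenance window.
        "ApplyImmediately": apply_immediately,
    }


# Hypothetical instance identifier.
request = enable_multi_az_request("payments-mysql")
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("rds").modify_db_instance(**request)
```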

Exam trap

The trap here is that candidates often confuse read replicas with Multi-AZ, assuming read replicas can provide high availability, but they lack automatic failover and synchronous replication, making them unsuitable for AZ failure scenarios requiring minimal application changes.

How to eliminate wrong answers

Option A is wrong because S3 Cross-Region Replication is for object storage redundancy across regions, not for RDS database availability within a region, and it does not provide automatic failover for a MySQL database. Option C is wrong because read replicas are asynchronous and do not support automatic failover; they are designed for read scaling, not for maintaining write availability during an AZ failure. Option D is wrong because EBS snapshots are point-in-time backups that require manual restoration and do not provide automatic failover or real-time replication, leading to significant downtime and potential data loss.

879

MCQmedium

A content publishing system uses Lambda functions that call an unreliable third-party API. Failed events must be retained for later investigation after retries are exhausted. What should be configured?

A.Lambda reserved concurrency set to zero

B.A larger deployment package

C.CloudFront error pages

D.A Lambda dead-letter queue or failure destination

AnswerD

A DLQ or asynchronous failure destination captures failed events after retry attempts.

Why this answer

A Lambda dead-letter queue (DLQ) or failure destination captures events that have exhausted all retry attempts, preserving them in Amazon SQS or SNS for later investigation. This ensures failed invocations from the unreliable third-party API are not lost and can be analyzed or replayed, meeting the requirement for retention after retries are exhausted.
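For asynchronous invocations, a failure destination can be sketched as parameters for Lambda's PutFunctionEventInvokeConfig call. The function name and SQS queue ARN are hypothetical, and the helper is an assumption made for this example.

```python
def failure_destination_request(function_name, destination_arn, retries=2):
    """Build keyword arguments for Lambda's PutFunctionEventInvokeConfig."""
    if not 0 <= retries <= 2:
        raise ValueError("asynchronous invocations allow 0 to 2 retries")
    return {
        "FunctionName": function_name,
        "MaximumRetryAttempts": retries,
        # Events that still fail after the retries are sent here.
        "DestinationConfig": {"OnFailure": {"Destination": destination_arn}},
    }


# Hypothetical function name and queue ARN.
request = failure_destination_request(
    "publish-content", "arn:aws:sqs:us-east-1:123456789012:failed-events"
)
print(request)
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("lambda").put_function_event_invoke_config(**request)
```

The function's execution role also needs permission to send to the destination, for example sqs:SendMessage on the queue.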

Exam trap

The trap here is that candidates may confuse a dead-letter queue with other error-handling mechanisms like reserved concurrency or CloudFront customizations, failing to recognize that DLQs specifically retain events after retries are exhausted for asynchronous Lambda invocations.

How to eliminate wrong answers

Option A is wrong because setting reserved concurrency to zero would prevent the Lambda function from executing at all, not handle failed events after retries. Option B is wrong because a larger deployment package does not affect error handling or retention of failed events; it only increases the function's code size, which can impact cold start times. Option C is wrong because CloudFront error pages are used to customize HTTP error responses for web content delivery, not to capture or retain Lambda invocation failures from asynchronous or event-driven processing.

880

Multi-Selectmedium

A line-of-business application runs on EC2 instances 24/7 with predictable usage for the next year. The application will stay in the same Region, and the team does not want to manage capacity interruptions. Which two purchase options can reduce cost compared with pure On-Demand pricing? Select two.

Select 2 answers

A.Buy Compute Savings Plans for the expected steady usage.

B.Purchase Standard Reserved Instances for the EC2 fleet.

C.Move the fleet to Spot Instances.

D.Use Dedicated Hosts to reserve physical servers for the application.

E.Stay entirely on On-Demand Instances because they are already the cheapest option.

AnswersA, B

Compute Savings Plans reduce the hourly cost of predictable usage while preserving flexibility across supported compute services. They are a strong fit when the workload is steady and the team wants savings without interruption risk.

Why this answer

Compute Savings Plans (A) offer a flexible discount (up to 66%) in exchange for a 1- or 3-year commitment to a consistent amount of compute usage (measured in $/hour), automatically applying to any EC2 instance family, region, or even AWS Fargate/Lambda. This reduces cost compared to On-Demand while avoiding capacity interruptions, as the commitment covers the predictable steady-state usage. Standard Reserved Instances (B) provide a similar discount (up to 72%) for a specific instance family in a specific region, also with a 1- or 3-year term, and guarantee capacity for the specified AZ if you choose a zonal reservation, ensuring no interruptions.
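The arithmetic behind a commitment discount can be sketched as follows. The hourly rate, fleet size, and 30% discount are made-up illustration values, not AWS prices; actual discounts depend on term, payment option, and instance type.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost(hourly_rate, instance_count, discount=0.0):
    """Annual cost of a 24/7 fleet at a given hourly rate and fractional discount."""
    if not 0.0 <= discount < 1.0:
        raise ValueError("discount must be in [0, 1)")
    return hourly_rate * instance_count * HOURS_PER_YEAR * (1.0 - discount)


# Hypothetical inputs: 4 instances at $0.10 per hour, 30% commitment discount.
on_demand = annual_cost(0.10, 4)
committed = annual_cost(0.10, 4, discount=0.30)
savings = on_demand - committed
print(f"On-Demand: ${on_demand:,.2f}  Committed: ${committed:,.2f}  Savings: ${savings:,.2f}")
```

Because the fleet runs continuously, every hour of the year is billed, which is why a percentage discount on steady usage translates directly into the same percentage off the annual bill.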

Exam trap

The trap here is that candidates may think Spot Instances are always cheaper and safe for steady workloads, but they forget the interruption risk, or they may confuse Dedicated Hosts with Reserved Instances as a cost-saving measure, when Dedicated Hosts actually increase cost for physical isolation.

881

MCQeasy

An internal web application must require encrypted client connections. The company currently has an ALB listener on port 80 (HTTP), and users can access the application over plain HTTP. What is the best change to ensure all client traffic uses HTTPS?

A.Configure an HTTPS (port 443) listener using an ACM certificate and update the port 80 listener to redirect to HTTPS (or to block plain HTTP requests).

B.Enable S3 default encryption so HTTP requests are automatically encrypted in transit.

C.Set the application to encrypt data only after it is received by the ALB.

D.Rely on WAF alone to encrypt HTTP traffic.

AnswerA

Client-to-ALB encryption is enforced by terminating TLS on an ALB HTTPS listener. Redirecting or blocking HTTP on port 80 ensures clients cannot successfully establish plaintext HTTP sessions, so all viable paths use HTTPS end-to-end between the client and the load balancer.

Why this answer

Option A is correct because it uses an HTTPS listener on port 443 with an ACM certificate to enforce encrypted client connections, and redirecting HTTP (port 80) traffic to HTTPS ensures all traffic is encrypted in transit. This is the standard AWS best practice for enforcing HTTPS on an ALB, as it directly controls the listener behavior at the load balancer level without requiring application changes.
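The HTTP-to-HTTPS redirect can be sketched as the default action applied to the existing port 80 listener through the Elastic Load Balancing ModifyListener call. The listener ARN below is a hypothetical placeholder.

```python
def https_redirect_request(http_listener_arn):
    """Build ModifyListener parameters that redirect all HTTP requests to HTTPS."""
    return {
        "ListenerArn": http_listener_arn,
        "DefaultActions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",
                    # Permanent redirect; host, path, and query are preserved by default.
                    "StatusCode": "HTTP_301",
                },
            }
        ],
    }


# Hypothetical listener ARN.
request = https_redirect_request(
    "arn:aws:elasticloadbalancing:us-east-1:123456789012:"
    "listener/app/internal-web/0123456789abcdef/0123456789abcdef"
)
print(request["DefaultActions"][0]["RedirectConfig"])
# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("elbv2").modify_listener(**request)
```

The HTTPS listener on port 443, with its ACM certificate, is created separately; this change only ensures that plain HTTP requests are never served.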

Exam trap

The trap here is that candidates may confuse encryption at rest (S3 default encryption) with encryption in transit, or assume that WAF or post-receipt encryption can secure the initial client connection, when only a properly configured HTTPS listener with a redirect from HTTP can enforce encrypted client connections.

How to eliminate wrong answers

Option B is wrong because S3 default encryption applies to data at rest in S3 buckets, not to data in transit over HTTP; it cannot encrypt client-to-ALB traffic. Option C is wrong because encrypting data after it is received by the ALB means the initial client-to-ALB leg remains in plaintext HTTP, failing the requirement for encrypted client connections. Option D is wrong because AWS WAF is a web application firewall that inspects HTTP/HTTPS traffic but does not perform encryption; it cannot encrypt plain HTTP traffic.

882

Multi-Selectmedium

A public API currently uses API Gateway REST APIs and Lambda. Traffic is low most of the day, but marketing runs a predictable traffic spike every weekday at 09:00 UTC. Users complain about cold starts during the first few minutes of the spike, and the team wants to avoid paying for provisioned concurrency all day. Which two changes should they make? Select two.

Select 2 answers

A.Switch from REST APIs to HTTP APIs if the feature set is sufficient.

B.Schedule Lambda provisioned concurrency shortly before the spike and scale it back afterward.

C.Keep provisioned concurrency at the maximum level 24/7.

D.Move the API to a single t3.nano EC2 instance.

E.Add an S3 gateway endpoint to reduce cold starts.

AnswersA, B

HTTP APIs are generally lower cost and lower latency than REST APIs for many simple API use cases. They reduce the recurring API Gateway cost without requiring a redesign.

Why this answer

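Option A is correct because HTTP APIs are designed to be faster and cheaper than REST APIs; AWS cites up to 60% lower latency and up to 71% lower cost. Switching does not remove Lambda cold starts, but it trims API Gateway overhead and lowers the recurring per-request bill. Option B is correct because provisioned concurrency can be scheduled through Application Auto Scaling: raise it shortly before 09:00 UTC on weekdays and scale it back after the spike, so initialized environments are ready when traffic arrives without paying for them all day.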
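The time-based approach in option B can be sketched as two Application Auto Scaling scheduled actions. The function name, alias, capacities, and the 08:45 and 10:00 UTC times are hypothetical, and the sketch assumes the alias has already been registered as a scalable target.

```python
def scheduled_concurrency_action(name, function_name, alias, cron, capacity):
    """Build PutScheduledAction parameters for Lambda provisioned concurrency."""
    return {
        "ServiceNamespace": "lambda",
        "ScheduledActionName": name,
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
        # Fields: minutes hours day-of-month month day-of-week year (UTC by default).
        "Schedule": cron,
        "ScalableTargetAction": {"MinCapacity": capacity, "MaxCapacity": capacity},
    }


# Hypothetical schedule: warm up at 08:45 UTC, release at 10:00 UTC, weekdays only.
scale_up = scheduled_concurrency_action(
    "weekday-warmup", "public-api", "live", "cron(45 8 ? * MON-FRI *)", 50
)
scale_down = scheduled_concurrency_action(
    "weekday-release", "public-api", "live", "cron(0 10 ? * MON-FRI *)", 0
)
print(scale_up["Schedule"], scale_down["Schedule"])
# With boto3 installed and credentials configured, each could be sent as:
#   boto3.client("application-autoscaling").put_scheduled_action(**action)
```

Starting the warm-up several minutes before the spike leaves time for the environments to initialize before the first requests arrive.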

Exam trap

The trap here is that candidates may think provisioned concurrency must be always-on to be effective, or that adding infrastructure like EC2 or S3 endpoints can solve cold starts, when in fact the correct approach is to combine a cheaper API type with time-based provisioned concurrency scheduling.

883

MCQmedium

A microservice runs in private subnets with no NAT gateway. It must retrieve a secret from AWS Secrets Manager. Security requires that traffic to Secrets Manager stays within AWS’s private network (no public internet egress). The IAM role already grants secretsmanager:GetSecretValue for the needed secret. What is the best network setup to meet the requirement?

A.Create an Interface VPC Endpoint for Secrets Manager (com.amazonaws.<region>.secretsmanager) and allow it via the endpoint security group; optionally enable private DNS.

B.Create an S3 Gateway VPC endpoint and use it for Secrets Manager requests because both services use HTTPS.

C.Assign a public IP address to the tasks so they can call Secrets Manager over the internet without NAT.

D.Change the route table to send all 0.0.0.0/0 traffic directly to an Internet Gateway.

AnswerA

Interface VPC Endpoints provide private IP connectivity from the VPC to the Secrets Manager service without routing through a NAT gateway or an Internet Gateway. The calls remain within AWS networking and still use standard TLS to the service endpoint.

Why this answer

An Interface VPC Endpoint (AWS PrivateLink) for Secrets Manager allows the microservice to access the secret privately without traversing the public internet. Since the subnet has no NAT Gateway and no public IP, this is the only way to keep traffic within the AWS network. Enabling private DNS ensures the standard Secrets Manager endpoint resolves to the private IP of the endpoint, eliminating the need for route table changes.

Exam trap

The trap here is that candidates often confuse Gateway Endpoints (which only work for S3 and DynamoDB) with Interface Endpoints (which are needed for Secrets Manager and most other AWS services), leading them to incorrectly select option B.

How to eliminate wrong answers

Option B is wrong because S3 Gateway VPC endpoints are specific to Amazon S3 and cannot carry Secrets Manager requests; Secrets Manager requires an Interface endpoint (powered by PrivateLink), not a Gateway endpoint. Option C is wrong because assigning a public IP address would route traffic over the public internet, violating the requirement that traffic stays within AWS’s private network. Option D is wrong because routing 0.0.0.0/0 to an Internet Gateway sends traffic toward the public internet, which is not allowed; instances without public IP addresses could not use that route anyway.

884

MCQeasy

A new feature stores user events in DynamoDB. Each event must be fetched by user_id and sorted by event_time. The team expects many different users and wants to avoid a single hot partition. Which partition key design is best?

A.Use a constant partition key value (for example, partition_key='events') and store user_id as an attribute.

B.Use user_id as the partition key and event_time as the sort key.

C.Use event_time as the partition key and user_id as an attribute to query later.

D.Use a randomly generated UUID as the partition key and query by user_id using a full table scan.

AnswerB

Using user_id as the partition key spreads data across many partitions based on user distribution. event_time as the sort key supports efficient range queries and retrieving events in time order per user. This design matches the stated access pattern and reduces hot partition likelihood.

Why this answer

Option B is correct because using user_id as the partition key evenly distributes writes across partitions, avoiding hot spots, while event_time as the sort key enables efficient retrieval of events for a specific user in chronological order. DynamoDB's query operation can then fetch all events for a given user_id sorted by event_time without scanning.
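The key design can be sketched as parameters for the DynamoDB CreateTable call, followed by a small in-memory stand-in for a per-user query. The table name, the string-typed timestamps, and the query_events helper are assumptions made for illustration.

```python
def events_table_request(table_name="user_events"):
    """Build CreateTable parameters: user_id partition key, event_time sort key."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": "user_id", "AttributeType": "S"},
            # ISO-8601 strings sort chronologically when compared as text.
            {"AttributeName": "event_time", "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": "user_id", "KeyType": "HASH"},
            {"AttributeName": "event_time", "KeyType": "RANGE"},
        ],
        "BillingMode": "PAY_PER_REQUEST",
    }


def query_events(items, user_id):
    """Local stand-in for a Query: one user's items, ordered by the sort key."""
    return sorted(
        (item for item in items if item["user_id"] == user_id),
        key=lambda item: item["event_time"],
    )


items = [
    {"user_id": "u1", "event_time": "2024-05-02T10:00:00Z"},
    {"user_id": "u2", "event_time": "2024-05-01T09:00:00Z"},
    {"user_id": "u1", "event_time": "2024-05-01T08:00:00Z"},
]
u1_events = query_events(items, "u1")
print([e["event_time"] for e in u1_events])
```

In DynamoDB itself the ordering comes for free: a Query on the partition key returns items sorted by the sort key, so no client-side sort is needed.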

Exam trap

The trap here is that candidates may choose a constant partition key (Option A) thinking it simplifies queries, not realizing it creates a single hot partition that defeats DynamoDB's scalability.

How to eliminate wrong answers

Option A is wrong because a constant partition key value ('events') forces all items into a single partition, creating a hot partition that throttles writes and reads as the number of users grows. Option C is wrong because using event_time as the partition key scatters events for the same user across multiple partitions, requiring a costly scan or multiple queries to retrieve all events for a user, and it does not guarantee sorted results per user. Option D is wrong because a random UUID partition key prevents efficient retrieval by user_id without a full table scan, which is expensive and slow, and it does not provide sorted results.

885

Multi-Selecthard

A log archive has old unattached EBS volumes and many stale snapshots. Which two actions reduce storage cost without affecting running instances? The architecture review board prefers a managed AWS-native control.

Select 2 answers

A.Stop all EC2 instances in the account

B.Disable CloudTrail logging

C.Delete unattached EBS volumes after verifying they are no longer needed

D.Apply snapshot lifecycle policies to expire obsolete snapshots

AnswersC, D

Unattached volumes continue to incur charges until deleted.

Why this answer

Option C is correct because deleting unattached EBS volumes eliminates storage costs for volumes that are not in use, and since they are not attached to any running instance, this action does not affect running instances. Option D is correct because applying snapshot lifecycle policies (e.g., using Amazon Data Lifecycle Manager) automates the expiration of obsolete snapshots, reducing storage costs without impacting running instances. Both actions are managed AWS-native controls, aligning with the architecture review board's preference.
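Finding deletion candidates can be sketched as a filter over volume records shaped like the output of the EC2 DescribeVolumes call. The sample records below are made up; the helper only identifies candidates and deletes nothing.

```python
def unattached_volume_ids(volumes):
    """Return IDs of volumes in the 'available' state with no attachments."""
    return [
        v["VolumeId"]
        for v in volumes
        if v["State"] == "available" and not v.get("Attachments")
    ]


# Made-up records using DescribeVolumes field names.
sample_volumes = [
    {"VolumeId": "vol-0aaa", "State": "in-use",
     "Attachments": [{"InstanceId": "i-0123", "State": "attached"}]},
    {"VolumeId": "vol-0bbb", "State": "available", "Attachments": []},
    {"VolumeId": "vol-0ccc", "State": "available", "Attachments": []},
]
candidates = unattached_volume_ids(sample_volumes)
print(candidates)
```

Each candidate still needs a human check (and possibly a final snapshot) before deletion, which matches the "after verifying they are no longer needed" wording in option C.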

Exam trap

The trap here is that candidates may confuse stopping instances (which does not delete volumes) with deleting unattached volumes, or they may think disabling CloudTrail reduces storage costs, but CloudTrail logs are stored in S3 and are unrelated to EBS volume or snapshot storage charges.

886

Multi-Selectmedium

A company is designing a high-performance web application that serves static and dynamic content to a global user base. The application runs on Amazon EC2 instances behind an Application Load Balancer (ALB). The static assets are stored in an S3 bucket. Which three architecture decisions will improve performance and reduce latency for users? (Choose three.)

Select 3 answers

.Place the EC2 instances in a single Availability Zone to reduce network latency.

.Use Amazon CloudFront to cache both static and dynamic content at edge locations.

.Integrate the ALB with AWS Global Accelerator to route traffic over the AWS global network.

.Use a larger EC2 instance type with higher network bandwidth, such as the c5n or m5n family.

.Enable S3 Transfer Acceleration on the bucket for faster downloads.

.Use an Amazon RDS Multi-AZ database for read replicas to offload read traffic.

Why this answer

Amazon CloudFront caches both static and dynamic content at edge locations, reducing latency by serving content from locations closer to users. AWS Global Accelerator improves performance by routing traffic over the AWS global network instead of the public internet, reducing jitter and latency. Larger EC2 instance types like c5n or m5n provide higher network bandwidth, which reduces network bottlenecks for high-traffic applications.

Exam trap

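The trap here is that candidates may treat S3 Transfer Acceleration as the way to speed up downloads for end users, when it is aimed at long-distance transfers directly to and from a bucket (most often uploads) and CloudFront is the service for delivering cached assets from edge locations, or think Multi-AZ RDS provides read scaling, when a Multi-AZ DB instance deployment is for failover only.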

887

Multi-Selectmedium

A company has a workload running on Amazon EC2 instances that need to securely communicate with an Amazon SQS queue and an Amazon DynamoDB table. The EC2 instances are in a private subnet without internet access. The security team wants to ensure that no traffic leaves the AWS network. Which three steps should be taken to meet these requirements? (Choose three.)

Select 3 answers

.Create a VPC endpoint for Amazon SQS and attach a policy that allows access from the EC2 instance's IAM role.

.Create a VPC endpoint for Amazon DynamoDB and attach a policy that allows access from the EC2 instance's IAM role.

.Configure a NAT gateway in a public subnet to route traffic to SQS and DynamoDB.

.Add a route in the private subnet's route table pointing to a NAT gateway for all traffic.

.Configure the security group for the EC2 instances to allow outbound traffic to the VPC endpoints.

.Enable VPC flow logs to monitor traffic to SQS and DynamoDB.

Why this answer

Creating VPC endpoints for Amazon SQS and DynamoDB allows EC2 instances in a private subnet to communicate with these services privately, without traversing the internet or requiring a NAT gateway. Attaching an endpoint policy that restricts access to the EC2 instance's IAM role ensures that only authorized requests pass through the endpoints. Configuring the EC2 security group to allow outbound HTTPS traffic to the endpoints is necessary because security groups control outbound connections. The two endpoints differ in type: DynamoDB uses a Gateway endpoint, reached through a route table entry that targets the service's prefix list, while SQS uses an Interface endpoint, reached through private IP addresses on elastic network interfaces in the subnets.
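The two endpoints are created with different parameters, which the sketch below tries to make concrete for the EC2 CreateVpcEndpoint call. All IDs and the Region are hypothetical, and the helper simply defaults S3 and DynamoDB to the Gateway type.

```python
GATEWAY_SERVICES = {"s3", "dynamodb"}


def endpoint_request(service, vpc_id, region, route_table_ids=(),
                     subnet_ids=(), security_group_ids=()):
    """Build CreateVpcEndpoint parameters, choosing the endpoint type by service."""
    base = {"VpcId": vpc_id, "ServiceName": f"com.amazonaws.{region}.{service}"}
    if service in GATEWAY_SERVICES:
        # Gateway endpoints are wired in through route tables.
        return {**base, "VpcEndpointType": "Gateway",
                "RouteTableIds": list(route_table_ids)}
    # Interface endpoints place ENIs with private IPs in the chosen subnets.
    return {**base, "VpcEndpointType": "Interface",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            "PrivateDnsEnabled": True}


# Hypothetical identifiers, for illustration only.
dynamodb_req = endpoint_request("dynamodb", "vpc-0123", "us-east-1",
                                route_table_ids=["rtb-0aaa"])
sqs_req = endpoint_request("sqs", "vpc-0123", "us-east-1",
                           subnet_ids=["subnet-0bbb"],
                           security_group_ids=["sg-0ccc"])
print(dynamodb_req["VpcEndpointType"], sqs_req["VpcEndpointType"])
```

The security group attached to the SQS endpoint must allow inbound HTTPS (TCP 443) from the instances for the private connection to work.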

Exam trap

The trap here is that candidates often assume a NAT gateway is required for private subnet resources to access AWS services, but VPC endpoints provide a more secure and cost-effective alternative that keeps traffic entirely within the AWS network.

888

MCQmedium

A production log archive runs continuously on EC2 with predictable usage for the next three years. The team wants a discount while retaining some instance-family flexibility. What should they buy?

A.S3 Intelligent-Tiering

B.Dedicated Instances

C.Compute Savings Plan

D.Spot Instances only

AnswerC

Compute Savings Plans provide discounts for a committed spend while allowing flexibility across instance families, sizes, Regions, and compute services.

Why this answer

The Compute Savings Plan (C) is correct because it offers a discount (up to 66%) in exchange for a commitment to a consistent amount of compute usage (measured in $/hour) for a 1- or 3-year term, while allowing changes to instance family, size, OS, tenancy, and Region; the commitment also applies to Fargate and Lambda usage. This matches the requirement of predictable usage for three years with instance-family flexibility, unlike Standard Reserved Instances, which lock to a specific instance family.

Exam trap

The trap here is that candidates often confuse Compute Savings Plans with Reserved Instances, assuming that any long-term discount requires locking into a specific instance family, but Compute Savings Plans provide both the discount and the flexibility to change instance families, which is the key differentiator tested in this question.

How to eliminate wrong answers

Option A is wrong because S3 Intelligent-Tiering is a storage class for objects in Amazon S3 that optimizes costs by moving data between access tiers based on changing access patterns; it has nothing to do with EC2 compute discounts or instance-family flexibility. Option B is wrong because Dedicated Instances are EC2 instances that run on hardware dedicated to a single customer, providing physical isolation but no discount or flexibility benefit; they are a tenancy option, not a discount program. Option D is wrong because relying only on Spot Instances offers significant discounts, but Spot capacity can be reclaimed with a 2-minute interruption notice, making it unsuitable for a production log archive that must run continuously for three years.
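
The billing mechanics behind the commitment can be sketched with a little arithmetic. This is a simplified model, not AWS's billing engine: it assumes a single hypothetical discount rate for all usage, and the function name and figures are illustrative only. The committed amount is paid every hour whether or not it is used, and usage beyond the commitment falls back to On-Demand rates.

```python
def hourly_cost(usage_at_sp_rates: float, commitment: float, discount: float) -> float:
    """Simplified hourly bill under a Savings Plan (illustrative model only).

    usage_at_sp_rates: eligible usage for the hour, priced at Savings Plan rates ($)
    commitment:        committed spend per hour ($), billed even if unused
    discount:          hypothetical discount versus On-Demand, 0 <= discount < 1
    """
    # Usage above the commitment is not covered by the plan.
    overflow_at_sp_rates = max(0.0, usage_at_sp_rates - commitment)
    # Convert the uncovered usage back to On-Demand pricing.
    overflow_on_demand = overflow_at_sp_rates / (1.0 - discount)
    return commitment + overflow_on_demand


# Under-used commitment: the full $10 is still billed.
print(hourly_cost(8.0, 10.0, 0.5))
# Over-used commitment: $2 of discounted usage becomes $4 at On-Demand rates.
print(hourly_cost(12.0, 10.0, 0.5))
```

The model shows why the commitment is normally sized to the steady baseline rather than the peak: unused commitment is wasted, while overflow simply costs the On-Demand price.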

889

MCQ | easy

Based on the exhibit, which AWS feature should the team use to minimize network latency between EC2 instances that exchange messages very frequently?

A.Use a spread placement group to maximize instance separation across hardware.

B.Use a cluster placement group to place instances close together.

C.Use a partition placement group to distribute instances across many partitions.

D.Use multiple Auto Scaling groups to spread traffic across more subnets.

Answer: B

A cluster placement group is designed for workloads that need very low network latency and high packet-per-second performance between instances. The exhibit describes frequent small-message traffic and a need for the lowest possible latency, which makes a cluster placement group the right choice. It keeps instances physically close in the AWS network for faster communication.

Why this answer

A cluster placement group is the correct choice because it places EC2 instances in a low-latency, high-bandwidth network within a single Availability Zone. This minimizes network latency between instances that exchange messages very frequently, as the instances are physically close together and can use up to 10 Gbps for a single traffic flow between instances in the group.

Exam trap

The trap here is that candidates often confuse placement group types, assuming a spread placement group is for performance when it is actually designed for high availability and fault isolation, not low latency.

How to eliminate wrong answers

Option A is wrong because a spread placement group maximizes instance separation across distinct hardware to reduce the risk of simultaneous failures, which increases latency and is counterproductive for high-frequency messaging. Option C is wrong because a partition placement group distributes instances across logical partitions to isolate failures in large distributed systems, but it does not minimize latency between instances. Option D is wrong because using multiple Auto Scaling groups to spread traffic across more subnets can increase network hops and latency, not reduce it.
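
As a rough sketch of what choosing the cluster strategy looks like in practice, the snippet below only builds the request parameters and does not call AWS. The keys follow the EC2 `CreatePlacementGroup` and `RunInstances` parameters as I understand them; the helper name and group name are hypothetical, and required launch fields such as `ImageId` and `InstanceType` are omitted.

```python
def cluster_placement_requests(group_name: str, instance_count: int) -> dict:
    """Build parameters for a cluster placement group and a launch into it."""
    create_group = {
        "GroupName": group_name,
        "Strategy": "cluster",  # other strategies: "spread", "partition"
    }
    run_instances = {
        # ImageId and InstanceType are deliberately left out of this sketch.
        "MinCount": instance_count,
        "MaxCount": instance_count,
        "Placement": {"GroupName": group_name},
    }
    return {"create_placement_group": create_group, "run_instances": run_instances}


requests = cluster_placement_requests("low-latency-msg", 4)
print(requests["create_placement_group"])
```

Launching all instances in one request is the usual suggestion for cluster groups, since it lowers the chance of an insufficient-capacity error part way through.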

890

MCQ | easy

A team uses an S3 bucket to store important customer-generated exports. They need protection against accidental overwrites and also want copies of the data in another AWS Region for disaster recovery. Which S3 configuration best satisfies both requirements?

A.Enable S3 lifecycle policies to automatically move objects to Glacier after 30 days only.

B.Enable S3 versioning and configure Cross-Region Replication to a destination bucket in another Region.

C.Disable all versioning and rely on AWS Backup to restore objects from a scheduled backup window.

D.Enable S3 Block Public Access and SSE-S3 encryption, without using versioning or replication.

Answer: B

Versioning preserves previous object states against overwrites and deletes, while replication provides an additional Region copy for recovery.

Why this answer

Option B is correct because enabling S3 versioning protects against accidental overwrites by preserving all object versions, allowing recovery of previous versions. Configuring Cross-Region Replication (CRR) automatically replicates objects to a destination bucket in another AWS Region, providing disaster recovery by maintaining a copy of the data in a separate geographic location.

Exam trap

The trap here is that candidates may think lifecycle policies or AWS Backup alone can handle both accidental overwrites and disaster recovery, but they fail to address the real-time protection and cross-region copy requirements that versioning and CRR specifically provide.

How to eliminate wrong answers

Option A is wrong because lifecycle policies to Glacier only manage storage tier transitions and do not protect against accidental overwrites or provide cross-region copies for disaster recovery. Option C is wrong because disabling versioning removes the ability to recover from accidental overwrites, and relying solely on AWS Backup for scheduled restores does not provide real-time protection against overwrites or continuous replication to another Region. Option D is wrong because enabling Block Public Access and SSE-S3 encryption addresses security and encryption but does not prevent accidental overwrites or create cross-region copies for disaster recovery.
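
A minimal sketch of the two settings in Option B, expressed as the parameter dictionaries that the S3 `PutBucketVersioning` and `PutBucketReplication` operations accept. Nothing is sent to AWS here; the bucket names, rule ID, and role ARN are placeholders, and the helper name is hypothetical.

```python
def versioning_and_crr(source_bucket: str, dest_bucket: str, role_arn: str) -> dict:
    """Build parameters for versioning plus Cross-Region Replication."""
    versioning = {
        "Bucket": source_bucket,
        "VersioningConfiguration": {"Status": "Enabled"},
    }
    replication = {
        "Bucket": source_bucket,
        "ReplicationConfiguration": {
            "Role": role_arn,  # IAM role that S3 assumes to replicate objects
            "Rules": [
                {
                    "ID": "dr-copy",  # placeholder rule name
                    "Priority": 1,
                    "Status": "Enabled",
                    "Filter": {},  # empty filter: replicate the whole bucket
                    "DeleteMarkerReplication": {"Status": "Disabled"},
                    "Destination": {"Bucket": f"arn:aws:s3:::{dest_bucket}"},
                }
            ],
        },
    }
    return {"put_bucket_versioning": versioning, "put_bucket_replication": replication}


params = versioning_and_crr(
    "exports-src", "exports-dr", "arn:aws:iam::111122223333:role/replication-role"
)
print(params["put_bucket_replication"]["ReplicationConfiguration"]["Rules"][0]["Destination"])
```

Replication requires versioning on both the source and the destination bucket, so the versioning call would be repeated for the destination before the replication rule is applied.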

891

Multi-Select | hard

A marketing portal serves private PDF files stored in Amazon S3 through CloudFront. Users authenticate to the portal first, and each download link must expire after one hour. The S3 origin must never be directly reachable from the internet. Which three actions should be used? Select three.

Select 3 answers

A.Use CloudFront signed URLs or signed cookies with a one-hour expiration window.

B.Configure an Origin Access Control for the S3 origin behind CloudFront.

C.Add an S3 bucket policy that allows only the CloudFront distribution, through its Origin Access Control, and denies public access.

D.Expose the S3 bucket through the static website endpoint and secure it with security group rules.

E.Use an AWS WAF web ACL attached to the S3 bucket instead of CloudFront.

Answers: A, B, C

CloudFront signed URLs or signed cookies enforce time-limited viewer authorization at the edge. For a one-hour access window, the distribution can issue a signature that CloudFront validates before it serves the object. Signed URLs are useful for a small number of object links, while signed cookies are better when the portal needs to grant access to multiple PDFs without generating a separate URL for each file.

Why this answer

Option A is correct because CloudFront signed URLs or signed cookies allow you to restrict access to content for a specific time window. By setting the expiration to one hour, you ensure that each download link becomes invalid after that period, meeting the requirement for expiring links. This approach also keeps the S3 bucket private, as users must authenticate through CloudFront rather than accessing S3 directly.

Exam trap

The trap here is that candidates often confuse CloudFront signed URLs with S3 presigned URLs. An S3 presigned URL is requested directly against the S3 endpoint and bypasses CloudFront, so it conflicts with the requirement that the origin be reachable only through the distribution; Origin Access Control and the bucket policy are what restrict the origin.
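
For Option C, the bucket policy that admits only one CloudFront distribution through Origin Access Control typically looks like the sketch below. The account ID, distribution ID, bucket name, and helper name are placeholders; the statement shape follows the pattern AWS documents for OAC.

```python
import json


def oac_bucket_policy(bucket: str, account_id: str, distribution_id: str) -> dict:
    """Bucket policy allowing reads only from one CloudFront distribution."""
    distribution_arn = f"arn:aws:cloudfront::{account_id}:distribution/{distribution_id}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowCloudFrontServicePrincipalReadOnly",
                "Effect": "Allow",
                "Principal": {"Service": "cloudfront.amazonaws.com"},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                # Only requests signed on behalf of this distribution match.
                "Condition": {"StringEquals": {"AWS:SourceArn": distribution_arn}},
            }
        ],
    }


print(json.dumps(oac_bucket_policy("portal-pdfs", "111122223333", "EDFDVBD6EXAMPLE"), indent=2))
```

Combined with S3 Block Public Access, this leaves no anonymous path to the bucket, while the one-hour expiry is enforced separately by the signed URL or cookie at the edge.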

892

MCQ | hard

A media archive needs low-latency full-text search across product descriptions and filtered attributes. Which managed service is most suitable?

A.AWS Config

B.Amazon OpenSearch Service

C.Amazon EFS

D.Amazon SQS

Answer: B

OpenSearch is designed for search and analytics over indexed text and structured fields.

Why this answer

Amazon OpenSearch Service is the correct choice because it provides a managed, scalable solution for full-text search and real-time analytics on large volumes of data. It supports low-latency queries across product descriptions and filtered attributes through its inverted index and query DSL, making it ideal for media archive search use cases.

Exam trap

The trap here is that candidates may confuse AWS Config's resource tracking or SQS's message handling with search capabilities, but neither provides the indexing and query engine required for full-text search.

How to eliminate wrong answers

Option A is wrong because AWS Config is a service for auditing and evaluating resource configurations against desired policies, not for full-text search or indexing of data. Option C is wrong because Amazon EFS is a scalable file storage service for Linux-based workloads, lacking any built-in search or indexing capabilities for text content. Option D is wrong because Amazon SQS is a fully managed message queuing service for decoupling application components, not designed for storing or searching data.
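
To make "full-text search plus filtered attributes" concrete, here is a small sketch of an OpenSearch query body that combines a scored text match with an exact-value filter. The field names (`description`, `category`) are assumptions about a hypothetical index mapping; only the `bool`, `match`, and `term` query types are standard query DSL.

```python
import json


def product_search_query(text: str, category: str, size: int = 10) -> dict:
    """Full-text match on a description field, filtered by an exact attribute."""
    return {
        "size": size,
        "query": {
            "bool": {
                # "must" clauses contribute to the relevance score.
                "must": [{"match": {"description": text}}],
                # "filter" clauses only include or exclude documents.
                "filter": [{"term": {"category": category}}],
            }
        },
    }


print(json.dumps(product_search_query("4k documentary footage", "video"), indent=2))
```

Putting attribute constraints in `filter` rather than `must` keeps them out of scoring and lets the engine cache them, which helps keep latency low.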

893

MCQ | medium

Based on the exhibit, what is the best way to let private EC2 instances reach Amazon S3 and AWS Systems Manager without sending traffic through the internet or a NAT gateway?

A.Create a gateway endpoint for S3 and interface endpoints for Systems Manager, EC2Messages, and SSMMessages.

B.Add a more permissive security group rule allowing outbound 0.0.0.0/0 on all ports.

C.Replace the NAT gateway with a network ACL that allows ephemeral ports to the internet.

D.Move the instances to public subnets so they can reach AWS services directly.

Answer: A

This keeps traffic on the AWS network and avoids NAT or internet traversal. S3 uses a gateway endpoint, while Systems Manager needs interface endpoints for the control and messaging services that Session Manager depends on. It directly addresses both the S3 download problem and the missing Session Manager connectivity in a private subnet design.

Why this answer

Gateway endpoints for S3 allow private EC2 instances to access S3 via AWS's private network without traversing the internet or a NAT gateway, using prefix lists and route table entries. Interface endpoints for Systems Manager, EC2Messages, and SSMMessages provide private connectivity to AWS Systems Manager via PrivateLink, enabling secure instance management without public IPs or NAT.

Exam trap

The trap here is that candidates often assume all AWS services can be accessed via a single endpoint type, but S3 requires a gateway endpoint (route table-based) while Systems Manager and its sub-services require interface endpoints (PrivateLink-based), and failing to create all three (including EC2Messages and SSMMessages) will break Systems Manager functionality.

How to eliminate wrong answers

Option B is wrong because adding a permissive security group rule for outbound 0.0.0.0/0 does not eliminate the need for a NAT gateway or internet gateway; it only permits traffic but still requires a route to the internet, which violates the requirement to avoid internet or NAT gateway traffic. Option C is wrong because replacing the NAT gateway with a network ACL that allows ephemeral ports to the internet does not provide private connectivity; network ACLs are stateless and cannot route traffic to AWS services privately, and they still require an internet gateway for outbound internet access. Option D is wrong because moving instances to public subnets would require public IP addresses and an internet gateway, exposing them to the internet and defeating the requirement to avoid internet traffic.
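
The endpoint set described in Option A can be listed with a short helper. This is a sketch that only generates the service names and endpoint types. The Region is a parameter, the helper name is hypothetical, and the other fields a real `CreateVpcEndpoint` call needs (VPC, subnets, route tables, security groups) are left out.

```python
def endpoint_specs(region: str) -> list:
    """Return the gateway and interface endpoints for private S3 and SSM access."""
    specs = [
        # S3 uses a gateway endpoint, attached through route table entries.
        {"ServiceName": f"com.amazonaws.{region}.s3", "VpcEndpointType": "Gateway"}
    ]
    # Systems Manager needs three PrivateLink interface endpoints.
    for service in ("ssm", "ec2messages", "ssmmessages"):
        specs.append(
            {
                "ServiceName": f"com.amazonaws.{region}.{service}",
                "VpcEndpointType": "Interface",
                "PrivateDnsEnabled": True,
            }
        )
    return specs


for spec in endpoint_specs("us-east-1"):
    print(spec["VpcEndpointType"], spec["ServiceName"])
```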

894

MCQ | medium

A media company runs a fleet of EC2 instances using Auto Scaling across multiple instance families (for example, m-series and c-series) in a single region. The business wants to commit to steady usage for one year to reduce cost, but the application team must retain flexibility to switch instance families and scale up/down as demand changes. They need the cost-reduction approach that best matches this flexibility. Which option is the best fit?

A.Purchase Standard Reserved Instances tied to a specific instance family and region, so the application can only run on the selected family.

B.Purchase Compute Savings Plans so the commitment applies regardless of instance family changes within the selected scope.

C.Purchase Spot Instances for all capacity and disable On-Demand fallback to guarantee the lowest cost.

D.Rely only on On-Demand and reduce cost by using a CloudFront-only approach for all dynamic content.

Answer: B

Compute Savings Plans provide discounted pricing in exchange for a 1-year or 3-year commitment, while allowing flexibility across instance families/attributes within the scope (for example, region/account and covered usage). This aligns with Auto Scaling that may shift between instance families while maintaining steady overall compute usage.

Why this answer

Compute Savings Plans provide the most flexibility because the discount applies to EC2 usage regardless of instance family (including m-series and c-series), size, or Region, so it automatically follows instance family changes and scaling. This matches the requirement to commit to steady usage for one year while retaining the ability to switch families and scale up/down, offering up to 66% savings over On-Demand without locking the application to a specific instance type.

Exam trap

The trap here is that candidates often confuse Reserved Instances (which lock to a specific family) with Savings Plans (which offer family flexibility), leading them to choose Option A despite the requirement for instance family switching.

How to eliminate wrong answers

Option A is wrong because Standard Reserved Instances are tied to a specific instance family (e.g., m5) and region, so usage that moves to a different instance family (e.g., c-series) would no longer receive the discount and would be billed at On-Demand rates. Option C is wrong because Spot Instances can be interrupted with a 2-minute warning, making them unsuitable as the sole capacity source for a production workload that requires reliability; disabling On-Demand fallback would risk application downtime during Spot reclamations. Option D is wrong because CloudFront is a content delivery network that caches static and dynamic content at edge locations, but it does not reduce the cost of running EC2 instances for compute workloads; relying solely on On-Demand without a commitment discount would not achieve the desired cost reduction.

895

Multi-Select | hard

A payments API requires point-in-time recovery and accidental-delete protection for a DynamoDB table. Which two settings should the architect enable?

Select 2 answers

A.Deletion protection or tightly controlled delete permissions

B.Point-in-time recovery

C.Global secondary indexes

D.DAX

Answers: A, B

Deletion protection and least-privilege controls reduce accidental table removal risk.

Why this answer

Deletion protection (Option A) prevents accidental table deletion by rejecting DeleteTable API calls until the setting is explicitly disabled, which is essential for protecting the payments table from human error or automated scripts. Point-in-time recovery (Option B) enables continuous backups with a recovery window of up to 35 days and per-second granularity, allowing restoration to any second within that window to recover from accidental writes or data corruption. Together, these settings satisfy both the point-in-time recovery and accidental-delete protection requirements for the DynamoDB table.

Exam trap

The trap here is that candidates often confuse operational features like GSIs or DAX with data protection mechanisms, mistakenly thinking they provide recovery or deletion safeguards when they only serve performance or query optimization roles.
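
The two settings map to two separate DynamoDB operations. The sketch below only assembles their parameters. The keys follow the `UpdateTable` and `UpdateContinuousBackups` request shapes as I understand them, and the table name and helper name are hypothetical.

```python
def table_protection_params(table_name: str) -> dict:
    """Parameters that turn on deletion protection and point-in-time recovery."""
    return {
        # UpdateTable: blocks table deletion until the flag is turned off again.
        "update_table": {
            "TableName": table_name,
            "DeletionProtectionEnabled": True,
        },
        # UpdateContinuousBackups: enables point-in-time recovery.
        "update_continuous_backups": {
            "TableName": table_name,
            "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
        },
    }


params = table_protection_params("payments")
print(sorted(params))
```

Deletion protection guards the table itself, while PITR guards the items inside it, which is why the question needs both.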

896

MCQ | hard

Based on the exhibit, a serverless checkout API is implemented in AWS Lambda and deployed in one Region. The function has a cold-start time of 700-900 ms on the first request after idle periods. Marketing launches a predictable traffic spike every weekday at 09:00 UTC, and the p95 latency target is under 150 ms during the first five minutes of the spike. What should the solutions architect do to meet the latency target while controlling cost?

A.Increase the Lambda memory size and leave concurrency at the default value.

B.Configure provisioned concurrency and scale it up before the predictable spike begins.

C.Put the Lambda function behind an Application Load Balancer so the load balancer absorbs the initialization delay.

D.Set reserved concurrency to the expected peak so Lambda will pre-create execution environments.

Answer: B

Provisioned concurrency keeps pre-initialized environments ready, which removes most cold-start latency. Because the spike is predictable, you can scale concurrency before 09:00 UTC and reduce it afterward to control cost.

Why this answer

Provisioned concurrency keeps a specified number of execution environments initialized, so requests served by those environments do not pay the cold-start penalty. By scheduling the provisioned concurrency to scale up before the 09:00 UTC spike, the function can serve the first requests within the 150 ms p95 latency target, while scaling back down after the spike controls cost by releasing unused capacity.

Exam trap

AWS often tests the distinction between provisioned concurrency (which pre-warms environments to eliminate cold starts) and reserved concurrency (which only caps the maximum concurrent executions without affecting cold-start behavior).

How to eliminate wrong answers

Option A is wrong because increasing memory size can reduce cold-start time but cannot eliminate it entirely, and the cold-start of 700-900 ms far exceeds the 150 ms target; default concurrency does not pre-warm environments. Option C is wrong because an Application Load Balancer does not absorb initialization delay—it only distributes requests to the Lambda function, which still experiences cold starts. Option D is wrong because reserved concurrency limits the maximum number of concurrent executions but does not pre-create execution environments; it prevents scaling beyond a limit but does not reduce cold-start latency.
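
One way to schedule the scale-up and scale-down is Application Auto Scaling scheduled actions on the function alias. The sketch below only builds the parameters for two `PutScheduledAction` calls. The function name, alias, action names, capacities, and exact times are assumptions chosen for illustration, and the scalable target would need to be registered first.

```python
def provisioned_concurrency_schedule(function_name: str, alias: str, peak: int) -> list:
    """Two weekday scheduled actions: warm up before 09:00 UTC, release after."""
    common = {
        "ServiceNamespace": "lambda",
        "ResourceId": f"function:{function_name}:{alias}",
        "ScalableDimension": "lambda:function:ProvisionedConcurrency",
    }
    scale_up = dict(
        common,
        ScheduledActionName="weekday-warm-up",
        Schedule="cron(50 8 ? * MON-FRI *)",  # 08:50 UTC, ahead of the spike
        ScalableTargetAction={"MinCapacity": peak, "MaxCapacity": peak},
    )
    scale_down = dict(
        common,
        ScheduledActionName="weekday-release",
        Schedule="cron(30 9 ? * MON-FRI *)",  # 09:30 UTC, after the spike
        ScalableTargetAction={"MinCapacity": 0, "MaxCapacity": 0},
    )
    return [scale_up, scale_down]


for action in provisioned_concurrency_schedule("checkout-api", "live", 200):
    print(action["ScheduledActionName"], action["Schedule"])
```

Starting the warm-up several minutes early leaves time for the environments to initialize before the first requests arrive.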

897

MCQ | medium

A dev sandbox runs for several hours each night and can be interrupted and restarted. Which EC2 purchasing option should minimize cost? The design must avoid adding custom operational scripts.

A.On-Demand Instances only

B.Spot Instances

C.Dedicated Hosts

D.Provisioned IOPS volumes

Answer: B

Spot Instances offer deep discounts for interruptible workloads.

Why this answer

Spot Instances (B) are ideal for fault-tolerant, interruptible workloads like a nightly dev sandbox because they offer significant cost savings (up to 90% off On-Demand) in exchange for being reclaimable by AWS with a 2-minute warning. Since the sandbox can be interrupted and restarted, and the design must avoid custom scripts, the team can rely on native Spot features, such as a persistent Spot request with the stop or hibernate interruption behavior, so that capacity is restored without manual intervention.

Exam trap

The trap here is that candidates may confuse Spot Instances with On-Demand or Reserved Instances, overlooking that Spot Instances are specifically designed for fault-tolerant, interruptible workloads and offer the lowest cost, while On-Demand is for steady-state workloads and Reserved Instances require a 1- or 3-year commitment.

How to eliminate wrong answers

Option A is wrong because On-Demand Instances are billed per second with no interruption risk, but they are the most expensive option and do not minimize cost for a workload that can tolerate interruptions. Option C is wrong because Dedicated Hosts provide physical server isolation for licensing or compliance needs, which is unnecessary and costly for a simple dev sandbox; they incur a per-host fee regardless of usage. Option D is wrong because Provisioned IOPS volumes are a storage type (EBS), not an EC2 purchasing option, and they add cost without addressing compute pricing; the question asks for an EC2 purchasing option to minimize cost.
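
Requesting Spot capacity without custom scripts can be done directly in the launch request. The sketch below builds the `InstanceMarketOptions` fragment of an EC2 `RunInstances` call. The helper name is hypothetical, and pairing a persistent request with the `stop` behavior is one reasonable choice for a restartable sandbox, not the only one.

```python
def spot_market_options(interruption_behavior: str = "stop") -> dict:
    """Build the InstanceMarketOptions block for a Spot launch."""
    if interruption_behavior not in ("stop", "hibernate", "terminate"):
        raise ValueError("unsupported interruption behavior")
    # Stop and hibernate need a persistent request; terminate works with one-time.
    request_type = "one-time" if interruption_behavior == "terminate" else "persistent"
    return {
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": request_type,
            "InstanceInterruptionBehavior": interruption_behavior,
        },
    }


print(spot_market_options())
```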

898

Multi-Select | medium

A company is designing a disaster recovery plan for a critical application hosted on AWS. The application runs on EC2 instances with data stored in Amazon EBS volumes and Amazon S3. The recovery time objective (RTO) is 15 minutes, and the recovery point objective (RPO) is 1 hour. Which three strategies would help meet these objectives? (Choose three.)

Select 3 answers

.Use AWS Backup to create hourly snapshots of EBS volumes and copy them to a different AWS Region.

.Pre-provision EC2 instances in the disaster recovery region and keep them running 24/7.

.Replicate critical data to S3 in the disaster recovery region using S3 Cross-Region Replication (CRR).

.Store Amazon Machine Images (AMIs) in the source region and use AWS Lambda to copy them after a disaster.

.Configure Amazon Route 53 with a failover routing policy and health checks to redirect traffic to the DR region.

.Set up an AWS Direct Connect link between the primary and DR regions for faster data transfer.

Why this answer

AWS Backup can create hourly snapshots of EBS volumes and copy them to a different AWS Region, meeting the 1-hour RPO by ensuring backups are taken every hour. S3 Cross-Region Replication (CRR) asynchronously replicates objects to a bucket in another region, keeping data synchronized within minutes and supporting the RPO. Amazon Route 53 with a failover routing policy and health checks can automatically redirect traffic to the DR region within seconds to minutes, enabling the 15-minute RTO by quickly failing over to pre-prepared infrastructure.

Exam trap

The trap here is that candidates may confuse operational readiness (like pre-provisioning instances) with a specific strategy that directly contributes to meeting RTO/RPO, or they may think Direct Connect is a disaster recovery strategy when it is merely a connectivity option that does not automate failover or data replication.
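
The Route 53 failover strategy amounts to a pair of records with the same name, one marked primary with a health check and one marked secondary. The sketch below builds those two record sets in the shape used by `ChangeResourceRecordSets`. The domain, IP addresses, health check ID, TTL, and helper name are placeholders.

```python
def failover_records(name: str, primary_ip: str, dr_ip: str, health_check_id: str) -> list:
    """Build a PRIMARY/SECONDARY failover record pair for one DNS name."""
    def record(identifier: str, role: str, ip: str) -> dict:
        return {
            "Name": name,
            "Type": "A",
            "SetIdentifier": identifier,
            "Failover": role,
            "TTL": 60,  # a short TTL keeps failover time low
            "ResourceRecords": [{"Value": ip}],
        }

    primary = record("primary", "PRIMARY", primary_ip)
    # Route 53 answers with the secondary when this check reports unhealthy.
    primary["HealthCheckId"] = health_check_id
    secondary = record("dr", "SECONDARY", dr_ip)
    return [primary, secondary]


for rr in failover_records("app.example.com.", "203.0.113.10", "198.51.100.20", "hc-placeholder"):
    print(rr["Failover"], rr["ResourceRecords"][0]["Value"])
```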

899

MCQ | medium

A company needs to implement session management for a web application. Sessions must persist across multiple EC2 instances, survive EC2 failures, and be accessible with sub-millisecond latency. Sessions must also be sortable by last-access time to expire the oldest sessions first. Which caching solution should a solutions architect recommend?

A.Amazon ElastiCache for Memcached with session data stored as key-value pairs

B.Amazon DynamoDB with TTL enabled for session expiration

C.Amazon ElastiCache for Redis with sessions stored as sorted sets

D.ElastiCache for Redis with sticky sessions enabled on the Application Load Balancer

Answer: C

Redis provides sub-millisecond latency, sorted sets for ordering by last-access score, Multi-AZ replication, and external storage for cross-instance availability. All requirements are met.

Why this answer

Amazon ElastiCache for Redis satisfies all requirements: multi-instance session sharing (sessions stored externally), sub-millisecond latency, survival of EC2 failures (stored outside instances), and sorted sets (ZSET data structure) for ordering sessions by last-access score.

Memcached supports only simple key-value pairs — it cannot perform sorted set operations to order sessions by last-access time. Memcached also lacks replication, meaning a node failure loses all cached sessions.

Exam trap

Memcached and Redis are both ElastiCache engines, but they serve different needs. Any requirement involving sorted data, complex data structures, persistence, or replication eliminates Memcached. Redis sorted sets (ZSET) store members with numeric scores and support range queries — perfect for session expiry queues ordered by last-access timestamp.

Why the other options are wrong

Memcached supports only simple string key-value storage. It cannot perform sorted set operations to expire sessions by last-access time. Memcached also lacks replication — a node failure loses all cached sessions.

DynamoDB achieves single-digit millisecond latency, not sub-millisecond. DynamoDB also does not natively support sorted set operations without additional query complexity.

ALB sticky sessions pin a client to a specific EC2 instance. If that instance fails, the session is lost. Sticky sessions do not make session data redundant across instances — the opposite of what is required.
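
To illustrate why sorted sets fit this requirement, the class below is a small in-memory stand-in that mimics the semantics of three Redis commands (ZADD, ZPOPMIN, ZREMRANGEBYSCORE). It is not a Redis client, and the class and method names are invented for this sketch.

```python
class SessionIndex:
    """In-memory model of a sorted set keyed by last-access timestamp."""

    def __init__(self) -> None:
        self._scores = {}  # session id -> last-access timestamp (the score)

    def touch(self, session_id: str, timestamp: float) -> None:
        # Like ZADD: insert the member or overwrite its score.
        self._scores[session_id] = timestamp

    def pop_oldest(self, count: int = 1) -> list:
        # Like ZPOPMIN: remove and return the lowest-scored members.
        oldest = sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]))[:count]
        for session_id, _ in oldest:
            del self._scores[session_id]
        return oldest

    def expire_before(self, cutoff: float) -> int:
        # Like ZREMRANGEBYSCORE -inf cutoff: drop everything at or below cutoff.
        expired = [sid for sid, ts in self._scores.items() if ts <= cutoff]
        for sid in expired:
            del self._scores[sid]
        return len(expired)


index = SessionIndex()
index.touch("a", 100)
index.touch("b", 50)
index.touch("c", 75)
index.touch("b", 120)  # session b was used again, so it is no longer the oldest
print(index.pop_oldest())  # [('c', 75)]
```

Because each access only updates one score, the "oldest first" ordering is maintained by the data structure itself instead of by scanning every session.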

900

MCQ | medium

A service consumes messages from an SQS queue. Recently, a new message format started failing validation in the consumer. The consumer catches the exception but cannot successfully process those messages without code changes. The team wants failed messages to be isolated for later investigation instead of being retried indefinitely. What should they configure?

A.Set the queue’s retention period to 1 minute and rely on messages expiring naturally.

B.Configure a dead-letter queue (DLQ) with a redrive policy and set maxReceiveCount so messages move after repeated failed receives.

C.Increase the visibility timeout to 7 days so failed messages cannot be retried.

D.Publish the same message again to SNS on every failure so a different subscriber might succeed.

Answer: B

A DLQ isolates “poison messages” that repeatedly fail processing. With a redrive policy, SQS tracks receives; once a message exceeds maxReceiveCount without successful processing, SQS moves it to the DLQ. This prevents infinite retries on the bad format while preserving the failed messages for debugging and code fixes.

Why this answer

A dead-letter queue (DLQ) with a redrive policy is the correct solution because it allows messages that repeatedly fail processing to be moved to a separate queue after exceeding the maxReceiveCount. This isolates problematic messages for later investigation without blocking the main queue or causing infinite retries. The consumer catches the exception, so the message is not deleted and is returned to the queue for redelivery; the DLQ ensures that after a configurable number of attempts, the message is redirected instead of being retried indefinitely.

Exam trap

The trap here is that candidates may think increasing the visibility timeout or relying on message expiration is sufficient, but they fail to understand that those approaches either affect all messages or only temporarily hide the message, whereas a DLQ provides a permanent, targeted isolation mechanism for repeatedly failing messages.

How to eliminate wrong answers

Option A is wrong because setting the retention period to 1 minute would cause all messages (including valid ones) to expire quickly, leading to data loss and not isolating only the failed messages. Option C is wrong because increasing the visibility timeout to 7 days would simply hide the message from consumers for that period, but after the timeout expires the message would become visible again and be retried, failing to isolate it permanently. Option D is wrong because publishing the same message to SNS on every failure would create an infinite loop of republishing, and SNS subscribers would also fail if they use the same validation logic, not solving the isolation requirement.
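
The redrive policy in Option B is a queue attribute whose value is a JSON string. A minimal sketch of building it follows. The DLQ ARN, the receive count of 5, and the helper name are placeholders, and the resulting dictionary is what would be passed as `Attributes` to the SQS `SetQueueAttributes` operation on the source queue.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int) -> dict:
    """Build the RedrivePolicy attribute for a source queue."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy serialized as a JSON string.
    return {"RedrivePolicy": json.dumps(policy)}


attrs = redrive_attributes("arn:aws:sqs:us-east-1:111122223333:orders-dlq", 5)
print(attrs["RedrivePolicy"])
```

A low count isolates poison messages quickly, while a higher one gives transient failures more chances to succeed before a message is moved.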
