SAA-C03 Questions 76–150 | Page 2/14

MCQeasy

A company runs EC2 instances in private subnets and needs to access Amazon S3 objects without using a NAT gateway. They want the traffic to stay within AWS private networking as much as possible (no internet egress). Which VPC endpoint type should they create for Amazon S3?

A.Create an Interface VPC endpoint for S3 and point the instances to it

B.Create a Gateway VPC endpoint for S3 and update the route tables to use it

C.Create a NAT gateway and allow outbound HTTPS to S3

D.Create a VPC endpoint service and manually register S3 as a provider endpoint

AnswerB

Gateway VPC endpoints for S3 are the supported way to send S3 traffic from private subnets without NAT. They add routes to the selected route tables that target the S3 prefix list, so requests to S3 travel over the AWS network instead of through a NAT gateway or internet gateway. This avoids internet egress and keeps the traffic on the AWS network.

Why this answer

A Gateway VPC endpoint for S3 is the correct choice because it uses prefix lists and route table entries to send S3 traffic directly through AWS's private network without leaving the AWS backbone or requiring a NAT gateway. This endpoint type supports S3 and DynamoDB only, and it does not incur hourly charges, making it cost-effective for private subnet instances to access S3 objects securely.

Exam trap

The trap here is that candidates often confuse Gateway endpoints (available only for S3 and DynamoDB) with Interface endpoints (PrivateLink, used for most other AWS services), or incorrectly assume that a NAT gateway is required for private subnet egress, missing that Gateway endpoints provide a free, private alternative for S3 access.

How to eliminate wrong answers

Option A is wrong because, although an Interface VPC endpoint for S3 also keeps traffic private (it places an elastic network interface with a private IP address in a subnet), it incurs hourly and per-GB data processing charges and is mainly needed when S3 must be reached from on-premises networks or from another VPC or Region; for instances in the same VPC, the Gateway endpoint is the recommended, simpler, no-cost option. Option B is the correct answer. Option C is wrong because the scenario explicitly rules out a NAT gateway, and a NAT gateway allows outbound internet traffic, which violates the requirement to keep traffic within AWS private networking and avoid internet egress.

Option D is wrong because a VPC endpoint service is used to expose your own services to other VPCs via AWS PrivateLink, not to access AWS services like S3; you cannot manually register S3 as a provider endpoint.
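
As a rough illustration of answer B, the sketch below builds the arguments one might pass to boto3's `ec2.create_vpc_endpoint` call for a Gateway endpoint. The VPC ID, route table ID, and Region are hypothetical placeholders, and the function only assembles the request so that it runs without AWS credentials.

```python
def gateway_endpoint_request(vpc_id, route_table_ids, region):
    """Build keyword arguments for ec2.create_vpc_endpoint (Gateway type)."""
    return {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "VpcEndpointType": "Gateway",
        # Gateway endpoints are attached to route tables, not to subnets or ENIs.
        "RouteTableIds": list(route_table_ids),
    }


# Hypothetical identifiers, for illustration only.
request = gateway_endpoint_request("vpc-0example", ["rtb-0example"], "us-east-1")
print(request)
```

The route tables listed here are the ones used by the private subnets; AWS then adds the S3 prefix-list route to each of them.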

MCQhard

Based on the exhibit, a media company serves versioned JavaScript and CSS files from an Amazon S3 origin through CloudFront. After a frontend release, the cache hit ratio dropped sharply even though the file names are versioned. The application team says the browser requests include the same Authorization header on every asset request because the frontend and API share one domain. What should the solutions architect do to improve CloudFront cache hit ratio without changing the application authentication model for the API?

A.Enable S3 Transfer Acceleration on the bucket so CloudFront fetches objects faster from the origin.

B.Create a CloudFront cache policy that excludes Authorization, cookies, and unnecessary query strings from the cache key.

C.Switch the origin from S3 to an Application Load Balancer so CloudFront can cache dynamic responses more effectively.

D.Configure CloudFront to forward every viewer header to the origin so the origin can decide whether the content is cacheable.

AnswerB

Excluding these values from the cache key reduces cache fragmentation because CloudFront can reuse the same cached object for many viewers. Since the assets are immutable and versioned, the Authorization header is not needed to vary the cache for these files. Keeping API authentication separate preserves the application model while improving hit ratio.

Why this answer

The sharp drop in cache hit ratio is caused by the Authorization header being included in the cache key, which makes each request unique even though the file names are versioned. By creating a CloudFront cache policy that excludes the Authorization header (and unnecessary cookies/query strings) from the cache key, CloudFront can serve cached responses to requests with different Authorization headers, restoring the cache hit ratio without altering the application's authentication model for the API.

Exam trap

The trap here is that candidates may think the Authorization header is required for caching or that forwarding all headers is safe, but in reality, including it in the cache key destroys cache efficiency for static assets, and the correct solution is to exclude it via a cache policy.

How to eliminate wrong answers

Option A is wrong because S3 Transfer Acceleration improves upload/download speed over long distances but does not affect CloudFront's cache key or hit ratio. Option C is wrong because switching to an Application Load Balancer would not solve the cache key issue; ALB is for dynamic content and would not improve caching for static versioned files served from S3. Option D is wrong because forwarding every viewer header to the origin would include the Authorization header in the cache key, making each request unique and further reducing the cache hit ratio, which is the opposite of what is needed.
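
The effect of the cache key can be sketched with a toy model. This is not CloudFront's actual key algorithm; the `cache_key` function, asset path, and tokens are hypothetical and only show how one header in the key multiplies the number of cached variants.

```python
def cache_key(path, headers, key_headers):
    """Build a cache key from the path plus only the headers named in key_headers."""
    selected = tuple(sorted((name, headers.get(name, "")) for name in key_headers))
    return (path, selected)


# Hypothetical viewer requests: the same versioned asset, a different token per viewer.
requests = [
    ("/static/app.3f9a.js", {"Authorization": f"Bearer token-{i}"})
    for i in range(100)
]

with_auth = {cache_key(path, hdrs, ["Authorization"]) for path, hdrs in requests}
without_auth = {cache_key(path, hdrs, []) for path, hdrs in requests}

print(len(with_auth), len(without_auth))
```

With the header in the key, every viewer gets a separate cache entry; without it, all viewers share one.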

MCQmedium

A marketing team runs a report-generation process that must execute once per day at 02:00 UTC. It usually completes in 10–15 minutes, but sometimes takes up to 45 minutes due to varying data volumes. They currently run the workload on an EC2 instance that is always on, which wastes money during off-hours. The team wants to minimize operational overhead and pay mainly for actual execution time. What is the best architecture choice?

A.Use a scheduled Amazon EC2 Auto Scaling group that keeps a minimum of one instance running at all times.

B.Use an EventBridge schedule to run the report as an Amazon ECS task on AWS Fargate and write results to S3.

C.Use AWS Lambda triggered by an EventBridge schedule at 02:00 UTC and write results to S3.

D.Use an EMR cluster provisioned daily with manual teardown to ensure the instance is always available before 02:00.

AnswerB

Fargate allows the containerized job to run only when scheduled, so the team pays for task runtime instead of keeping an EC2 instance always on.

Why this answer

Option B is correct because AWS Fargate allows the report-generation task to run as an Amazon ECS task triggered by an EventBridge schedule, eliminating the need for an always-on EC2 instance. Fargate charges only for the vCPU and memory resources consumed during the task's execution (typically 10–45 minutes), which aligns with the team's goal of minimizing operational overhead and paying mainly for actual execution time. Writing results to S3 provides durable, cost-effective storage without managing infrastructure.

Exam trap

The trap here is that candidates often choose Lambda for scheduled tasks without checking the execution duration limit, overlooking that Lambda's 15-minute timeout makes it unsuitable for processes that can take up to 45 minutes.

How to eliminate wrong answers

Option A is wrong because an Auto Scaling group with a minimum of one instance keeps an EC2 instance running 24/7, which still wastes money during off-hours and does not pay mainly for actual execution time. Option C is wrong because AWS Lambda has a maximum execution timeout of 15 minutes (900 seconds), but the report-generation process can take up to 45 minutes, making Lambda unsuitable for this workload. Option D is wrong because manually provisioning and tearing down an EMR cluster daily introduces significant operational overhead, contradicting the team's goal to minimize operational overhead, and EMR is designed for big data processing, not a simple report-generation task.
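
The duration check behind this elimination can be written down as a small sketch. The `pick_compute` rule is a deliberate simplification for this scenario, not a general sizing method; the 900-second limit and the EventBridge cron syntax are the only AWS facts it relies on.

```python
LAMBDA_MAX_SECONDS = 900  # Lambda's maximum timeout (15 minutes)


def fits_in_lambda(worst_case_minutes):
    """Return True if the worst-case runtime fits within Lambda's timeout."""
    return worst_case_minutes * 60 <= LAMBDA_MAX_SECONDS


def pick_compute(worst_case_minutes):
    """Simplified rule of thumb for a scheduled batch job."""
    return "lambda" if fits_in_lambda(worst_case_minutes) else "ecs-fargate"


# EventBridge cron expression for 02:00 UTC every day.
DAILY_0200_UTC = "cron(0 2 * * ? *)"

print(pick_compute(15), pick_compute(45), DAILY_0200_UTC)
```

The decision has to use the worst case (45 minutes), not the typical run time.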

MCQeasy

A DynamoDB-backed multi-tenant app experiences throttling during a promotion. Most writes and reads target tenant "ACME" and use the same partition key value, causing a hot partition. Which design change most directly improves performance?

A.Add a "shard" component to the partition key (for example, tenantId + hashed bucket) to spread traffic across partitions

B.Increase the table’s read capacity without changing the partition key

C.Switch all reads to strongly consistent reads to guarantee faster results

D.Store ACME data in S3 and query it directly to avoid DynamoDB throttling

AnswerA

DynamoDB throughput is distributed across physical partitions. If one partition key value receives most traffic, that partition throttles. Adding a shard component to the partition key increases the number of partition key values being used, spreading requests across more partitions and reducing hot-partition throttling.

Why this answer

Option A is correct because adding a shard component to the partition key (e.g., appending a random or hash-based suffix to the tenant ID) distributes writes and reads for the same tenant across multiple physical partitions. This directly alleviates the hot partition caused by all ACME traffic hitting a single partition key value, allowing DynamoDB to utilize its full provisioned throughput across partitions.

Exam trap

The trap here is that candidates may think increasing total table capacity (Option B) solves throttling, but they overlook that DynamoDB also enforces throughput limits per partition, so a single hot partition remains constrained regardless of total capacity.

How to eliminate wrong answers

Option B is wrong because increasing the table’s read capacity does not fix the hot partition issue: each partition has its own throughput limit, so a single partition key value can still throttle even if total table capacity is high, and extra read capacity does nothing for the throttled writes. Option C is wrong because strongly consistent reads do not improve performance; they consume twice the read capacity of eventually consistent reads, and they do not spread traffic across partitions. Option D is wrong because storing ACME data in S3 and querying it directly gives up DynamoDB’s low-latency key-based access and introduces additional complexity (e.g., no native item-level querying, higher request latency), making it an inefficient and indirect solution for a hot partition problem.
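
A minimal sketch of the sharded key from answer A follows, assuming a hash-based suffix and ten shards. The `#` separator, the shard count, and the function names are illustrative choices, not a prescribed DynamoDB convention.

```python
import hashlib


def sharded_partition_key(tenant_id, item_id, num_shards=10):
    """Derive a partition key such as 'ACME#7' from a stable hash of the item ID."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % num_shards
    return f"{tenant_id}#{shard}"


def all_shard_keys(tenant_id, num_shards=10):
    """Keys a reader must query (and merge) to see all of a tenant's items."""
    return [f"{tenant_id}#{n}" for n in range(num_shards)]


keys = {sharded_partition_key("ACME", f"order-{i}") for i in range(1000)}
print(sorted(keys))
```

The trade-off is on the read side: a query for everything belonging to ACME now fans out across every shard key and merges the results.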

MCQmedium

A production application writes to an Amazon Aurora PostgreSQL cluster. Users report that during business-hour reporting runs, write latency increases. The application team wants to keep the writer focused on OLTP writes while still providing low-latency reads for reporting queries. What architectural approach should the solutions architect recommend?

A.Create Aurora read replicas and direct reporting read-only connections to the cluster reader endpoint.

B.Resize the writer instance to a larger class so it can handle both writes and reads with fewer slowdowns.

C.Enable cross-region replication for the entire cluster so reporting always runs in the secondary Region.

D.Disable read replicas and use caching only in the application layer, keeping all queries connected to the writer endpoint.

AnswerA

Read replicas offload read workloads from the writer. Using the reader endpoint lets reporting queries use replicas, improving write responsiveness.

Why this answer

A is correct because creating Aurora read replicas and directing reporting read-only connections to the cluster reader endpoint offloads read traffic from the writer instance. This allows the writer to focus on OLTP writes, while the reader endpoint load-balances read-only queries across replicas, providing low-latency reads for reporting without impacting write performance.

Exam trap

The trap here is that candidates may think resizing the writer instance (Option B) is sufficient, but the exam tests the architectural principle of separating read and write workloads to avoid resource contention, not just scaling vertically.

How to eliminate wrong answers

Option B is wrong because resizing the writer instance to a larger class increases capacity for both writes and reads, but does not isolate reporting queries from the writer, so write latency can still spike during heavy read loads. Option C is wrong because cross-region replication adds significant latency for reads and does not address the immediate need for low-latency reads within the same region during business hours. Option D is wrong because disabling read replicas and relying solely on application-layer caching forces all queries to the writer endpoint, which increases contention and does not scale for reporting workloads that require fresh data.

MCQmedium

An S3 bucket stores user-uploaded images. Access patterns are unpredictable: some objects are never read again, while others are occasionally retrieved months later. The team wants to reduce storage cost without having to manually track access frequency or run periodic analyses. Which S3 storage and lifecycle approach is the best fit?

A.Enable S3 Intelligent-Tiering so objects can automatically move between access tiers based on observed access patterns.

B.Use S3 Glacier Instant Retrieval for all objects immediately to minimize storage cost.

C.Create a lifecycle rule that transitions objects to Standard-IA after a fixed 30 days, regardless of access.

D.Keep all objects in S3 Standard and reduce costs by enabling server access logging compression.

AnswerA

S3 Intelligent-Tiering is designed for unknown or changing access patterns. It monitors access and automatically moves objects between tiers (for example, between frequent-access and infrequent-access tiers) based on actual usage, which avoids the need to manually decide transition schedules. This directly meets the requirement to reduce storage cost while eliminating ongoing manual tracking or periodic analysis.

Why this answer

S3 Intelligent-Tiering is the best fit because it automatically moves objects between access tiers (Frequent Access, Infrequent Access, and Archive Instant Access, plus optional opt-in archive tiers) based on changing access patterns, eliminating the need for manual tracking or lifecycle rules. This optimizes storage costs for unpredictable access patterns without requiring you to define fixed time-based transitions or perform periodic analyses.

Exam trap

The trap here is that candidates often choose a fixed lifecycle rule (Option C) thinking it is simpler, but they overlook the retrieval fees and inefficiency of applying a rigid time-based policy to unpredictable access patterns, whereas Intelligent-Tiering adapts dynamically without manual tuning.

How to eliminate wrong answers

Option B is wrong because storing all objects immediately in S3 Glacier Instant Retrieval incurs higher retrieval costs and minimum storage charges (90 days) for objects that may never be accessed again, and it does not adapt to unpredictable patterns. Option C is wrong because a fixed 30-day transition to Standard-IA does not account for objects that are accessed frequently after 30 days, leading to retrieval fees, and it fails to optimize for objects that are never accessed again. Option D is wrong because enabling server access logging compression does not reduce storage costs for the objects themselves; it only reduces log storage size, and keeping all objects in S3 Standard is more expensive than using Intelligent-Tiering for unpredictable access.
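
The automatic behavior of Intelligent-Tiering can be approximated with a small model: objects idle for 30 consecutive days move to the Infrequent Access tier, and objects idle for 90 days move to Archive Instant Access. The function below is a simplified sketch that ignores the opt-in archive tiers and the per-object monitoring charge.

```python
def intelligent_tiering_tier(days_since_last_access):
    """Simplified model of the automatic access tiers (opt-in archive tiers omitted)."""
    if days_since_last_access >= 90:
        return "Archive Instant Access"
    if days_since_last_access >= 30:
        return "Infrequent Access"
    return "Frequent Access"


for idle_days in (0, 45, 200):
    print(idle_days, intelligent_tiering_tier(idle_days))
```

An object that is read again returns to the Frequent Access tier, which is what a fixed lifecycle rule cannot do.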

MCQhard

Based on the exhibit, the security team wants centralized detection and alerting for both successful and failed attempts to change S3 bucket policies and KMS key policies across multiple accounts. Which approach best meets the requirement?

A.Enable S3 server access logging on each bucket and archive the logs in the security account.

B.Use AWS Config rules only, because Config records every successful and failed API call automatically.

C.Create an organization CloudTrail trail for management events and add EventBridge rules in the security account to alert on PutBucketPolicy and PutKeyPolicy events, including failed calls.

D.Enable GuardDuty in every account and use its findings as the main source for policy change notifications.

AnswerC

An organization trail captures the API activity across accounts, and EventBridge can route both successful and failed management events to alerts centrally.

Why this answer

Option C is correct because an organization CloudTrail trail captures management events (including PutBucketPolicy and PutKeyPolicy) across all accounts in the organization, and EventBridge rules in the security account can filter for both successful and failed API calls (using the `errorCode` field) to trigger centralized alerts. This provides the required centralized detection and alerting for policy changes across multiple accounts.

Exam trap

The trap here is that candidates may confuse S3 server access logging (which logs object-level access) with CloudTrail (which logs management API calls), or assume AWS Config automatically records all API calls, when in fact Config only tracks configuration changes and not failed API attempts.

How to eliminate wrong answers

Option A is wrong because S3 server access logging logs object-level access requests, not management API calls like PutBucketPolicy, and it does not capture KMS key policy changes at all. Option B is wrong because AWS Config rules evaluate resource configurations and compliance, but they do not automatically record every API call; they rely on configuration changes and cannot directly alert on failed API calls. Option D is wrong because GuardDuty focuses on threat detection (e.g., anomalous behavior, compromised credentials) and does not natively provide detailed alerting for specific management API calls like PutBucketPolicy or PutKeyPolicy, especially for failed attempts.
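
One possible shape for the EventBridge rule in answer C is sketched below. The event pattern uses the standard CloudTrail-via-EventBridge fields; the `matches` helper is a heavily simplified stand-in for EventBridge's real matching logic and exists only to show that a pattern which does not filter on `errorCode` catches failed calls as well as successful ones.

```python
EVENT_PATTERN = {
    "source": ["aws.s3", "aws.kms"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {"eventName": ["PutBucketPolicy", "PutKeyPolicy"]},
}


def matches(pattern, event):
    """Simplified matcher: nested dicts recurse, lists mean 'any of these values'."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


def is_failed_call(event):
    """CloudTrail records an errorCode in the event detail when a call fails."""
    return "errorCode" in event.get("detail", {})


# Hypothetical event, trimmed to the fields the pattern inspects.
denied = {
    "source": "aws.s3",
    "detail-type": "AWS API Call via CloudTrail",
    "detail": {"eventName": "PutBucketPolicy", "errorCode": "AccessDenied"},
}
print(matches(EVENT_PATTERN, denied), is_failed_call(denied))
```

A downstream target can use the presence of `errorCode` to label the alert as a failed attempt.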

Multi-Selecthard

A claims workflow requires point-in-time recovery and accidental-delete protection for a DynamoDB table. Which two settings should the architect enable? The architecture review board prefers a managed AWS-native control.

Select 2 answers

A.Point-in-time recovery

B.DAX

C.Deletion protection or tightly controlled delete permissions

D.Global secondary indexes

AnswersA, C

PITR allows restoration to a specific second within the supported recovery window.

Why this answer

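Point-in-time recovery (PITR) for DynamoDB enables continuous backups with per-second granularity, allowing restoration to any second within the recovery window (up to 35 days). This satisfies the point-in-time recovery requirement with a managed AWS-native control that backs up table data automatically, without manual intervention. Deletion protection covers the second requirement: while it is enabled, the table cannot be deleted until the setting is explicitly turned off, which guards against accidental deletes.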

Exam trap

The trap here is that candidates often confuse DAX (a caching layer) with backup/recovery features, or assume that global secondary indexes provide some form of data redundancy or protection, when in fact they only support query flexibility.
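
As a sketch of the two settings, the functions below build the arguments for boto3's `dynamodb.update_continuous_backups` and `dynamodb.update_table` calls. The table name is hypothetical, and nothing is sent to AWS; the code only assembles the request dictionaries.

```python
def pitr_request(table_name):
    """Keyword arguments for dynamodb.update_continuous_backups."""
    return {
        "TableName": table_name,
        "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
    }


def deletion_protection_request(table_name):
    """Keyword arguments for dynamodb.update_table."""
    return {"TableName": table_name, "DeletionProtectionEnabled": True}


TABLE = "claims-workflow"  # hypothetical table name
print(pitr_request(TABLE))
print(deletion_protection_request(TABLE))
```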

Multi-Selectmedium

A retail API runs on Amazon EC2 instances behind an Application Load Balancer and stores orders in an Amazon RDS for PostgreSQL database. A test that stopped one Availability Zone caused the API to return errors because all application servers were in the same AZ and the database was single-AZ. Which two changes should the architect make to continue serving traffic during a single-AZ failure? Select two.

Select 2 answers

A.Increase the EC2 instance size and keep all application servers in the same subnet.

B.Configure the Auto Scaling group to launch instances across private subnets in at least two Availability Zones.

C.Replace the Application Load Balancer with a Network Load Balancer in a single Availability Zone.

D.Convert the RDS for PostgreSQL database to a Multi-AZ deployment.

E.Add an Amazon RDS read replica and point the application to the replica endpoint.

AnswersB, D

Spreading the application tier across multiple AZs preserves healthy capacity if one AZ fails and lets the load balancer keep serving requests.

Why this answer

Option B is correct because distributing EC2 instances across private subnets in at least two Availability Zones (AZs) ensures that if one AZ fails, the Auto Scaling group can continue serving traffic from instances in the remaining AZs. This eliminates the single point of failure for the application tier. Option D is correct because converting the RDS for PostgreSQL database to a Multi-AZ deployment automatically provisions a standby replica in a different AZ, enabling automatic failover during an AZ outage and preserving database availability.

Exam trap

The trap here is that candidates often think a read replica can serve as a high-availability solution for writes, but read replicas are asynchronous and do not support automatic failover for the primary database.

MCQeasy

A company runs EC2 workloads in one region with somewhat steady overall demand. Over time, the team frequently changes instance families (for performance/optimization) and sometimes changes instance size, but wants predictable cost discounts. Which purchase option provides the best balance of cost savings and flexibility?

A.Standard Reserved Instances for a specific instance family and size only.

B.Savings Plans (Compute Savings Plans), scoped for flexible EC2 usage in the region.

C.Spot Instances for all workloads, assuming interruptions will never happen.

D.On-Demand only, because it avoids the complexity of purchase option scopes.

AnswerB

Compute Savings Plans provide discounted pricing for steady usage while allowing flexibility across instance families, sizes, operating systems, tenancy, and Regions. That matches the scenario: demand is steady enough for discounts, but the underlying instance type choices change frequently to meet performance needs.

Why this answer

Compute Savings Plans offer the best balance of cost savings and flexibility because they provide up to a 66% discount in exchange for a commitment to a consistent amount of compute spend (measured in dollars per hour) over a one- or three-year term, and they automatically apply to EC2 usage regardless of instance family, size, OS, tenancy, or Region, as well as to AWS Fargate and Lambda. This matches the team's need to frequently change instance families and sizes while still getting predictable discounts, unlike Standard RIs, which lock you to a specific instance family.

Exam trap

The trap here is that candidates often confuse Standard Reserved Instances (which lock the instance family) with Convertible RIs (which allow family changes, but only through an exchange for RIs of equal or greater value), or they assume Savings Plans only apply to EC2, missing that Compute Savings Plans also cover Fargate and Lambda, making them the most flexible option for compute cost optimization.

How to eliminate wrong answers

Option A is wrong because Standard Reserved Instances are tied to a specific instance family (e.g., m5) and cannot be exchanged, so moving to a different family for performance optimization means the new instances lose the discount. Option C is wrong because Spot Instances can be interrupted with a 2-minute warning when AWS needs the capacity back, so a design that assumes interruptions never happen is unsound; Spot suits fault-tolerant or flexible workloads, not every workload. Option D is wrong because On-Demand pricing offers no discount (0% savings) and avoids complexity only by paying full price, which fails to meet the requirement for predictable cost savings.

Matchingmedium

Match the disaster recovery strategy to the recovery posture it best fits for a Regional outage.

Concepts

Matches

Lowest cost option where the environment is rebuilt from backups and hours of downtime are acceptable.

Keep only the critical core running in the secondary Region, then scale out after failover.

Run a scaled-down but functional environment in another Region for faster cutover.

Serve production traffic from more than one Region at the same time for the fastest recovery.

Why these pairings

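These pairs match disaster recovery strategies to their recovery postures. In the order listed, the descriptions correspond to the four AWS DR strategies: backup and restore, pilot light, warm standby, and multi-site active/active. Cost and complexity rise down the list, while recovery time (RTO) and potential data loss (RPO) fall.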

MCQmedium

A payments service receives payment orders by consuming messages from an Amazon SQS Standard queue. The downstream processor occasionally exceeds its processing timeout. As a result, some messages reappear in the queue and may be processed more than once. The team wants to prevent duplicate side effects (for example, double-charging) and also ensure poison messages do not repeatedly consume processing capacity. What approach best satisfies both goals?

A.Implement idempotent processing (for example, store processed payment IDs in DynamoDB) and configure an SQS dead-letter queue (DLQ) using a redrive policy with an appropriate maxReceiveCount.

B.Rely only on increasing the SQS visibility timeout so duplicates rarely occur, without adding idempotency checks or a DLQ.

C.Switch to a FIFO queue and delete messages immediately upon receipt to avoid duplicates.

D.Move the workload to SNS and use synchronous HTTP endpoints so the sender retries until the receiver confirms success.

AnswerA

With SQS Standard’s at-least-once delivery, duplicates can occur. Idempotency ensures repeated processing of the same payment ID does not create duplicate side effects. A DLQ with redrive policy isolates poison messages: after a message is received and fails processing more than maxReceiveCount times, SQS moves it to the DLQ instead of cycling it back to the main queue indefinitely.

Why this answer

Option A is correct because it addresses both requirements: idempotent processing (e.g., storing processed payment IDs in DynamoDB) ensures that even if a message is processed more than once, duplicate side effects like double-charging are prevented. Configuring an SQS dead-letter queue (DLQ) with a redrive policy and an appropriate maxReceiveCount (e.g., 3 or 5) automatically moves messages that exceed the maximum number of receives to the DLQ, preventing poison messages from repeatedly consuming processing capacity.

Exam trap

The trap here is that candidates often confuse 'exactly-once delivery' (FIFO queues) with 'exactly-once processing,' failing to realize that idempotency is still required to handle failures after message receipt, and that a DLQ is necessary to manage poison messages regardless of queue type.

How to eliminate wrong answers

Option B is wrong because simply increasing the SQS visibility timeout reduces the likelihood of duplicates but does not eliminate them entirely, and it fails to handle poison messages that may still cause repeated processing failures. Option C is wrong because switching to a FIFO queue and deleting messages immediately upon receipt does not prevent duplicate side effects if the downstream processor fails after deletion but before completing processing; FIFO queues guarantee exactly-once delivery but not exactly-once processing, and immediate deletion removes the ability to retry or handle failures. Option D is wrong because moving to SNS with synchronous HTTP endpoints shifts the retry responsibility to the sender, but it does not inherently prevent duplicate side effects (e.g., if the receiver processes the request but the acknowledgment is lost) and does not address poison messages that could repeatedly fail.
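
The combination in answer A can be illustrated with an in-memory simulation. It does not use SQS or DynamoDB: the set `processed_ids` stands in for a table of processed payment IDs (a real implementation would typically rely on a conditional write), and `MAX_RECEIVE_COUNT` is a hypothetical redrive setting.

```python
MAX_RECEIVE_COUNT = 3  # hypothetical redrive policy value


def run_consumer(messages, handler, max_receives=MAX_RECEIVE_COUNT):
    """Simulate at-least-once delivery with an idempotency check and a DLQ."""
    processed_ids = set()  # stands in for a DynamoDB table of processed payment IDs
    charges = []           # side effects actually applied
    dead_letters = []      # messages moved to the DLQ
    queue = [(message, 0) for message in messages]
    while queue:
        message, receives = queue.pop(0)
        receives += 1
        if receives > max_receives:
            dead_letters.append(message)  # redrive: stop retrying poison messages
            continue
        if message["payment_id"] in processed_ids:
            continue  # duplicate delivery: skip the side effect
        try:
            handler(message)
        except ValueError:
            queue.append((message, receives))  # not deleted, so it becomes visible again
            continue
        processed_ids.add(message["payment_id"])
        charges.append(message["payment_id"])
    return charges, dead_letters


def charge(message):
    """Hypothetical handler that fails on a malformed (poison) message."""
    if message.get("poison"):
        raise ValueError("cannot parse payment")


msgs = [{"payment_id": "p1"}, {"payment_id": "p1"}, {"payment_id": "p2", "poison": True}]
charges, dead = run_consumer(msgs, charge)
print(charges, [m["payment_id"] for m in dead])
```

The duplicate `p1` is charged once, and the poison message stops consuming capacity after three failed receives.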

MCQeasy

A CI/CD pipeline needs to deploy to your production environment. Security requires that the pipeline uses temporary credentials (not long-lived access keys) and only has permissions to read a specific set of parameters from AWS Systems Manager Parameter Store and write application logs to CloudWatch Logs. What is the best AWS approach?

A.Create an IAM user for the pipeline and store access keys in the CI system.

B.Create an IAM role in the production account, grant least-privilege policies, and let the CI assume it using STS AssumeRole.

C.Attach the required permissions to an IAM group and add the pipeline’s principal to that group directly.

D.Use AWS KMS to encrypt the pipeline’s access keys and store the ciphertext in the CI system.

AnswerB

IAM roles with STS provide temporary credentials and allow least-privilege permissions via attached policies.

Why this answer

Option B is correct because it uses an IAM role with least-privilege policies that the CI/CD pipeline can assume via AWS STS AssumeRole, generating temporary credentials that automatically expire. This eliminates the need for long-lived access keys and adheres to the security requirement of using temporary credentials. The role's policies can be scoped to exactly read specific parameters from Systems Manager Parameter Store and write logs to CloudWatch Logs.

Exam trap

The trap here is that candidates may choose Option A or D because they focus on credential storage rather than the fundamental requirement for temporary credentials, or they may confuse IAM groups with roles, thinking a group can be used for cross-account access without understanding that groups only apply to IAM users within the same account.

How to eliminate wrong answers

Option A is wrong because creating an IAM user with long-lived access keys violates the security requirement for temporary credentials and introduces a static credential risk if the keys are leaked. Option C is wrong because IAM groups are used to attach policies to IAM users, not to external principals like a CI/CD pipeline; the pipeline's principal cannot be added directly to an IAM group without first being an IAM user. Option D is wrong because encrypting access keys with KMS still results in long-lived credentials that must be decrypted and used, failing the temporary credentials requirement and adding unnecessary complexity without addressing the core security mandate.
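
A least-privilege permissions policy for the role in answer B might look like the sketch below. The account ID, Region, parameter path, and log group are hypothetical, and the exact actions a pipeline needs may differ; the point is that both statements are scoped to specific resources rather than `*`.

```python
import json


def pipeline_permissions_policy(parameter_arn, log_group_arn):
    """Least-privilege policy: read one parameter path, write logs to one log group."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["ssm:GetParameter", "ssm:GetParametersByPath"],
                "Resource": parameter_arn,
            },
            {
                "Effect": "Allow",
                "Action": ["logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": f"{log_group_arn}:*",
            },
        ],
    }


# Hypothetical account ID, parameter path, and log group.
policy = pipeline_permissions_policy(
    "arn:aws:ssm:us-east-1:111122223333:parameter/prod/app/*",
    "arn:aws:logs:us-east-1:111122223333:log-group:/prod/app",
)
print(json.dumps(policy, indent=2))
```

The CI system would then call STS AssumeRole on this role and receive credentials that expire on their own.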

MCQmedium

A telemetry pipeline uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The architecture review board prefers a managed AWS-native control.

A.Multi-AZ standby and route reads to the standby

B.RDS read replica and route reporting queries to it

C.S3 lifecycle policy

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

The correct answer is B because RDS read replicas are designed specifically to offload read-heavy workloads like reporting queries from the primary database. They provide an asynchronous read-only copy of the database that can handle SELECT statements without impacting the primary's write performance. This is a fully managed AWS-native solution that aligns with the architecture review board's preference.

Exam trap

The trap here is confusing Multi-AZ standby (which is for failover only) with read replicas (which are for read scaling), leading candidates to incorrectly choose Option A.

How to eliminate wrong answers

Option A is wrong because a Multi-AZ standby is a synchronous replica used for high availability and failover, not for read traffic; it cannot serve read queries directly. Option C is wrong because an S3 lifecycle policy manages object storage transitions and expiration, which is unrelated to offloading database read queries. Option D is wrong because a larger NAT gateway increases outbound internet capacity for private subnets, which does not address read query load on an RDS database.
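
To make answer B concrete, the sketch below builds the arguments for boto3's `rds.create_db_instance_read_replica` call and shows a trivial routing rule. The instance identifiers and endpoint hostnames are hypothetical, and the `workload` label is an assumption about how the application tags its connections.

```python
def read_replica_request(source_instance_id, replica_instance_id):
    """Keyword arguments for rds.create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": replica_instance_id,
        "SourceDBInstanceIdentifier": source_instance_id,
    }


def endpoint_for(workload, primary_endpoint, replica_endpoint):
    """Send reporting traffic to the replica; everything else stays on the primary."""
    return replica_endpoint if workload == "reporting" else primary_endpoint


# Hypothetical identifiers and endpoints.
request = read_replica_request("telemetry-db", "telemetry-db-reporting")
primary = "telemetry-db.example.internal"
replica = "telemetry-db-reporting.example.internal"
print(request)
print(endpoint_for("reporting", primary, replica), endpoint_for("ingest", primary, replica))
```

Because replication is asynchronous, reports served from the replica can lag slightly behind the primary.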

MCQmedium

An analytics dashboard uses RDS MySQL and receives many read-only reporting queries that slow down the primary database. What should the architect add? The design must avoid adding custom operational scripts.

A.S3 lifecycle policy

B.RDS read replica and route reporting queries to it

C.Multi-AZ standby and route reads to the standby

D.A larger NAT gateway

AnswerB

Read replicas offload read traffic from the primary instance.

Why this answer

RDS read replicas are designed specifically to offload read-heavy workloads from the primary DB instance. By routing reporting queries to a read replica, the primary database is freed from processing those queries, reducing contention and improving overall performance. This approach requires no custom scripts—AWS handles replication automatically.

Exam trap

The trap here is confusing a Multi-AZ standby with a read replica. Candidates often assume the standby can serve reads, but in a Multi-AZ DB instance deployment for RDS MySQL or PostgreSQL the standby accepts no connections until a failover promotes it.

How to eliminate wrong answers

Option A is wrong because S3 lifecycle policies manage object transitions and deletions in S3 buckets, not database query offloading. Option C is wrong because a Multi-AZ standby exists for high availability and failover, not for serving read traffic; in a Multi-AZ DB instance deployment the standby exposes no endpoint, so reads cannot be routed to it. Option D is wrong because a larger NAT gateway increases outbound internet capacity for private subnets, which has no effect on database query performance.

Multi-Selecthard

A photo studio stores original project archives in Amazon S3. Objects are read heavily for 14 days after upload, occasionally during the next 11 months, and almost never after one year. The team wants the lowest storage cost while keeping retrieval within minutes during the first year. Which three actions are best? Select three.

Select 3 answers

A.Keep new objects in S3 Standard for the first 14 days.

B.Transition objects to S3 Standard-IA after 14 days.

C.Transition objects to S3 Glacier Flexible Retrieval after 14 days.

D.Transition objects to S3 Glacier Deep Archive after one year.

E.Disable versioning to make the lifecycle rules work correctly.

AnswersA, B, D

Correct. Standard is appropriate for the initial hot-access period because the data is read frequently and needs immediate performance. Using a cheaper archive tier too early would increase retrieval latency and likely access costs.

Why this answer

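A is correct because S3 Standard is designed for frequently accessed data, with millisecond access and no retrieval fees, which suits the first 14 days when objects are read heavily. B is correct because S3 Standard-IA lowers the storage price for the occasionally read months while still returning objects immediately, so the within-minutes retrieval requirement holds for the whole first year. D is correct because objects that are almost never read after one year can move to S3 Glacier Deep Archive, the lowest-cost storage class. In practice, S3 Lifecycle rules require an object to spend at least 30 days in S3 Standard before a transition to Standard-IA, so the rule would be configured for day 30 or later rather than day 14.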

Exam trap

The trap here is that candidates might incorrectly choose S3 Glacier Flexible Retrieval for the 14-day transition, overlooking that its retrieval times (minutes to hours) do not meet the 'within minutes' requirement for the first year, or mistakenly think versioning must be disabled for lifecycle rules to function.
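
The tiering plan can be sketched as an S3 Lifecycle configuration in the dictionary shape that boto3's `put_bucket_lifecycle_configuration` accepts. This is a minimal sketch: the rule ID and the `build_lifecycle_configuration` helper are hypothetical, and the default of 30 days (rather than 14) is an assumption that reflects the S3 Lifecycle minimum of 30 days in S3 Standard before a transition to Standard-IA.

```python
def build_lifecycle_configuration(ia_after_days: int = 30,
                                  deep_archive_after_days: int = 365) -> dict:
    """Build a lifecycle configuration: Standard -> Standard-IA -> Deep Archive."""
    if ia_after_days < 30:
        # S3 Lifecycle rejects Standard -> Standard-IA transitions before day 30.
        raise ValueError("Standard-IA transition requires at least 30 days")
    if deep_archive_after_days <= ia_after_days:
        # Ordering sanity check for this sketch, not an S3 rule.
        raise ValueError("Deep Archive transition must come after Standard-IA")
    return {
        "Rules": [
            {
                "ID": "studio-archive-tiering",  # hypothetical rule name
                "Filter": {"Prefix": ""},        # apply to every object
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }
```

The returned dictionary would be passed as the `LifecycleConfiguration` argument; objects stay in S3 Standard until the first transition applies.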

MCQmedium

A public API for a customer analytics portal is deployed on API Gateway. Clients must authenticate with standards-based tokens issued by an external OpenID Connect provider. Which authorization mechanism should be used? The design must avoid adding custom operational scripts.

A.API keys only

B.JWT authorizer configured for the OpenID Connect issuer

C.IAM authorization for all internet users

D.A VPC endpoint policy

AnswerB

A JWT authorizer validates tokens from a trusted OIDC issuer with low operational overhead.

Why this answer

Option B is correct because a JWT authorizer in API Gateway can validate tokens issued by an external OpenID Connect (OIDC) provider without requiring custom code. The JWT authorizer automatically verifies the token's signature, expiry, and issuer against the OIDC provider's JWKS endpoint, meeting the requirement for standards-based authentication and avoiding custom operational scripts.

Exam trap

The trap here is that candidates often confuse API keys (which are for rate limiting and usage plans, not authentication) with token-based authorization, or mistakenly think IAM authorization can be used for external users without AWS credentials.

How to eliminate wrong answers

Option A is wrong because API keys only provide simple identification, not authentication or authorization; they do not validate token claims or integrate with an OpenID Connect provider. Option C is wrong because IAM authorization is designed for AWS principals (e.g., IAM users/roles) and requires AWS credentials, not standards-based tokens from an external OIDC provider; it also cannot be used for all internet users without custom signing logic. Option D is wrong because a VPC endpoint policy controls access to API Gateway via VPC endpoints, not authentication or token validation; it does not address client authentication with OIDC tokens.
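
To make the token checks concrete, here is a simplified sketch of the claim validation a JWT authorizer performs after it has verified the token signature. Signature verification against the issuer's JWKS is deliberately omitted, and the function name, issuer URL, and audience value are hypothetical; API Gateway does all of this for you once the authorizer is configured.

```python
import time


def claims_are_acceptable(claims: dict, issuer: str, audiences: list, now=None) -> bool:
    """Check issuer, audience, and expiry on already-decoded JWT claims."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return False
    aud = claims.get("aud")
    token_audiences = aud if isinstance(aud, list) else [aud]
    if not set(token_audiences) & set(audiences):
        return False  # token was issued for a different API or client
    exp = claims.get("exp")
    return isinstance(exp, (int, float)) and exp > now
```

For example, a token whose `iss` does not match the configured issuer is rejected regardless of its other claims.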

MCQhard

Based on the exhibit, a workload in Account B must assume a role in Account A. Security requires that only the specific role arn:aws:iam::444455556666:role/PipelineExecRole can assume it, and only when the caller supplies the external ID acct-b-prod-7788. Which change best satisfies the requirement with the least privilege?

A.Keep the root principal and add an aws:PrincipalTag condition in the trust policy to require the tag acct-b-prod-7788.

B.Replace the principal with arn:aws:iam::444455556666:role/PipelineExecRole and add a StringEquals condition on sts:ExternalId = acct-b-prod-7788.

C.Attach a permission boundary to the role in Account A so that only PipelineExecRole can use it.

D.Add an SCP in Account B that allows sts:AssumeRole only for PipelineExecRole.

AnswerB

This change directly restricts trust to one named role in Account B and adds a confused-deputy defense with the external ID. The role trust policy is the correct place to control who can assume the role, and the external ID ensures only the expected caller can complete the STS request.

Why this answer

Option B is correct because it explicitly restricts the trust policy principal to the specific IAM role ARN `arn:aws:iam::444455556666:role/PipelineExecRole` and adds a `StringEquals` condition on `sts:ExternalId` set to `acct-b-prod-7788`. This satisfies the security requirement by ensuring only that exact role can assume the role in Account A, and only when the correct external ID is provided, following the principle of least privilege.

Exam trap

The trap here is that candidates often confuse the trust policy's `Principal` element with permission boundaries or SCPs, mistakenly thinking those can restrict who can assume a role. The role's trust policy is what determines which principals may assume it (the caller also needs `sts:AssumeRole` permission in its own account), and the external ID condition adds the defense against confused deputy attacks.

How to eliminate wrong answers

Option A is wrong because a root principal trusts any IAM entity in Account B that has been granted `sts:AssumeRole`, and an `aws:PrincipalTag` condition does not restrict the caller to the specific role `PipelineExecRole`; tags can be modified or absent, and a tag is not an external ID. Option C is wrong because a permission boundary attached to the role in Account A limits the permissions of that role but does not control which external principal can assume it; the trust policy governs that. Option D is wrong because an SCP in Account B only caps what principals in Account B are allowed to do and grants nothing; it cannot enforce the external ID requirement, and the trust policy in Account A remains the authoritative mechanism.
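
A trust policy matching option B might look like the following, written as a Python dictionary that serializes to the JSON policy document. The role ARN and external ID come from the question; the `trust_policy_allows` function is a hypothetical, heavily simplified evaluator included only to show which inputs the policy checks, not how IAM evaluates policies internally.

```python
import json

TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::444455556666:role/PipelineExecRole"},
            "Action": "sts:AssumeRole",
            "Condition": {"StringEquals": {"sts:ExternalId": "acct-b-prod-7788"}},
        }
    ],
}


def trust_policy_allows(policy: dict, caller_arn: str, external_id) -> bool:
    """Simplified check: exact principal match plus exact external ID match."""
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow" or stmt["Action"] != "sts:AssumeRole":
            continue
        if stmt["Principal"].get("AWS") != caller_arn:
            continue
        required = stmt.get("Condition", {}).get("StringEquals", {}).get("sts:ExternalId")
        if required is None or required == external_id:
            return True
    return False


# JSON form, as it would be supplied for the role's trust policy document.
TRUST_POLICY_JSON = json.dumps(TRUST_POLICY, indent=2)
```

Any other role in Account B, or the right role without the external ID, fails the check.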

MCQmedium

A fintech company has a two-Region DR requirement: RPO must be within 15 minutes and RTO must be under 2 hours. To control cost, they do not want to run full production infrastructure in the secondary Region continuously. They plan to continuously replicate the database and keep the application infrastructure in the secondary Region prepared, but at reduced capacity. Which DR strategy best matches this requirement and accurately describes their plan?

A.Pilot light: keep only minimal components (for example, replicated storage and a small amount of core services), so the app scales up during a disaster.

B.Warm standby: keep the essential parts of the application running in the secondary Region at reduced capacity, while using database replication to meet the RPO.

C.Active-active: run the application fully in both Regions with synchronized writes and share traffic continuously.

D.Cold standby: store backups in the secondary Region and provision all infrastructure only during a disaster.

AnswerB

Warm standby aligns with both constraints: reduced-cost readiness is maintained in the secondary Region (so RTO is faster), and continuous replication is used to keep data lag within the 15-minute RPO target.

Why this answer

Warm standby is the correct strategy because it runs a scaled-down version of the production application in the secondary Region continuously, with database replication (e.g., Amazon RDS cross-Region read replicas or Aurora Global Database) meeting the 15-minute RPO. The reduced-capacity infrastructure can be scaled up within the 2-hour RTO during a disaster, balancing cost and recovery requirements.

Exam trap

The trap here is confusing pilot light with warm standby: candidates often think any pre-provisioned infrastructure qualifies as pilot light, but warm standby explicitly runs the application at reduced capacity, whereas pilot light keeps only core services and storage without running the application stack.

How to eliminate wrong answers

Option A is wrong because pilot light keeps only minimal core services and storage, not a running application at reduced capacity, and requires provisioning and scaling up compute resources during a disaster, which may not meet the 2-hour RTO if scaling takes significant time. Option C is wrong because active-active runs the application fully in both Regions with synchronized writes and continuous traffic sharing, which violates the cost control requirement of not running full production infrastructure continuously. Option D is wrong because cold standby stores only backups and provisions all infrastructure during a disaster, leading to RTOs that typically exceed 2 hours due to provisioning and data restoration delays.

MCQmedium

A company hosts a customer analytics portal on EC2. Administrators must connect without opening SSH or RDP ports to the internet. What should the architect use?

A.An internet gateway attached to the private subnet

B.A public Elastic IP address on each instance

C.AWS Systems Manager Session Manager with the required instance role

D.A bastion host with SSH open to 0.0.0.0/0

AnswerC

Session Manager provides audited shell access without inbound SSH/RDP exposure.

Why this answer

AWS Systems Manager Session Manager allows secure, auditable shell access to EC2 instances without opening inbound SSH (port 22) or RDP (port 3389) ports to the internet. It uses the AWS Systems Manager agent on the instance, which initiates an outbound connection to the AWS Systems Manager service over HTTPS (port 443), and the required IAM instance role grants permissions for this communication. This eliminates the need for a bastion host or public IP addresses, meeting the security requirement of no open inbound ports.

Exam trap

The trap here is that candidates often default to a bastion host (Option D) as the traditional solution for secure administrative access, but fail to recognize that Session Manager provides the same functionality without any inbound ports, which is the exact requirement stated in the question.

How to eliminate wrong answers

Option A is wrong because an internet gateway attaches to a VPC, not to a subnet, and routing a private subnet to it would turn that subnet into a public one; this increases exposure and still gives administrators no access path without open inbound ports. Option B is wrong because assigning a public Elastic IP address to each instance exposes the instances directly to the internet and requires open SSH or RDP ports to connect, which violates the requirement. Option D is wrong because a bastion host with SSH open to 0.0.0.0/0 exposes port 22 to the entire internet, which directly contradicts the 'without opening SSH or RDP ports to the internet' constraint.

MCQmedium

Your security team needs to detect and alert on any attempt to change sensitive policies, specifically S3 bucket policy changes and KMS key policy changes. The team wants alerts within minutes, and logs must be centrally retained for forensics. Which design best meets these detective control requirements using AWS-native services?

A.Enable CloudTrail management events and configure an EventBridge rule to send notifications for PutBucketPolicy and PutKeyPolicy API calls, while also delivering CloudTrail logs to a dedicated S3 bucket for retention.

B.Rely on AWS Config resource snapshots only; use the snapshots to infer policy changes and generate alerts from the daily compliance summary reports.

C.Enable S3 access logging on the affected buckets only; treat these logs as sufficient evidence for KMS key policy modifications.

D.Turn on CloudWatch Logs for the S3 bucket and KMS key; alert on any log line containing the word 'policy' to detect changes.

AnswerA

CloudTrail management events capture these policy-change API calls. EventBridge can create near-real-time alerts, and S3 provides durable central log retention for investigations.

Why this answer

Option A is correct because CloudTrail management events capture the API calls that change S3 bucket policies (PutBucketPolicy) and KMS key policies (PutKeyPolicy) by default, and EventBridge rules can trigger near-real-time alerts (within minutes) for these specific API calls. Additionally, delivering CloudTrail logs to a dedicated S3 bucket provides centralized, durable retention for forensic analysis, meeting both the alerting and retention requirements.

Exam trap

The trap here is that candidates often confuse S3 access logs (which record data-plane operations) with CloudTrail management events (which record control-plane operations), leading them to choose Option C, or they mistakenly think AWS Config snapshots provide real-time alerts, when in fact they are periodic and lack API-level detail.

How to eliminate wrong answers

Option B is wrong because AWS Config resource snapshots are taken periodically (e.g., every 1 hour or 6 hours), not within minutes, and they only show the state of resources at a point in time, not the specific API call that made the change, making them unsuitable for near-real-time alerting and forensic detail. Option C is wrong because S3 access logs record object-level access requests (e.g., GET, PUT, DELETE) on S3 buckets, not management events like bucket policy changes or KMS key policy modifications, and they cannot capture KMS key policy changes at all. Option D is wrong because CloudWatch Logs for S3 buckets and KMS keys do not exist as native log sources; CloudWatch Logs can ingest CloudTrail logs, but simply alerting on any log line containing the word 'policy' would generate excessive false positives (e.g., from normal operations like listing policies) and lacks the precision to detect only policy modification API calls.
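
The alerting half of option A can be sketched as an EventBridge event pattern, shown here as a Python dictionary. The pattern uses the standard `AWS API Call via CloudTrail` detail type and the two event names from the question; the `matches` function is a hypothetical, simplified stand-in for EventBridge's exact-value matching, included only so the pattern can be exercised locally.

```python
POLICY_CHANGE_PATTERN = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["s3.amazonaws.com", "kms.amazonaws.com"],
        "eventName": ["PutBucketPolicy", "PutKeyPolicy"],
    },
}


def matches(pattern: dict, event: dict) -> bool:
    """Return True if every pattern field lists the event's value (exact match only)."""
    for key, expected in pattern.items():
        actual = event.get(key)
        if isinstance(expected, dict):
            # Nested pattern: recurse into the corresponding event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True
```

A rule with this pattern would typically target an SNS topic for notification, while the trail itself delivers the full log files to the retention bucket.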

MCQmedium

In an AWS Organizations environment, developers create IAM roles using an automation tool. The security team wants to guarantee that even if a developer attaches an overly permissive inline policy, the role cannot exceed a fixed set of allowed actions. The team already uses permission boundaries on each role. The tool’s role-creation API call succeeds, but one developer’s new role can still delete production S3 buckets. What is the most likely reason, and what should be corrected?

A.Permission boundaries do not affect permissions for resources created with role chaining; enable role chaining instead to apply the boundary.

B.The boundary policy was not actually attached during role creation, or the automation tool attached the wrong boundary ARN; correct the role-creation request to set the intended permissions boundary.

C.KMS key policies override permission boundaries for S3, so deletion permission comes from the KMS policy; restrict the KMS key policy instead.

D.Permission boundaries apply only to managed policies, not to inline policies; move the overly permissive permissions to a managed policy type to keep it bounded.

AnswerB

Permission boundaries work by intersecting allowed actions from the role’s attached policies with the actions permitted by the boundary policy. If the automation tool fails to set the boundary ARN (or sets an incorrect one), then the role can use the developer’s attached policies without the intended restriction. Fixing the `PermissionsBoundary` parameter in the role creation call is the direct remedy.

Why this answer

Option B is correct because a permissions boundary takes effect only when it is explicitly set on the role, either at creation through the `PermissionsBoundary` parameter of `CreateRole` or afterwards with `PutRolePermissionsBoundary`. If the automation tool fails to attach the intended boundary policy or attaches the wrong ARN, the role has no effective boundary, and any inline policy can grant full access. The developer's role could then delete production S3 buckets because the boundary was never enforced.

Exam trap

The trap here is that candidates may assume permission boundaries are automatically inherited from the AWS Organizations policy or that they only affect managed policies, when in fact they must be explicitly attached and apply to all policy types.

How to eliminate wrong answers

Option A is wrong because permission boundaries do apply to roles used in role chaining; role chaining does not bypass boundaries, and enabling it would not fix the issue. Option C is wrong because KMS key policies control encryption operations, not S3 bucket deletion permissions; S3 delete actions are governed by S3 resource-based policies and IAM policies, not KMS policies. Option D is wrong because permission boundaries apply to both managed and inline policies equally; they limit the maximum permissions a role can have regardless of policy type.
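
The intersection behaviour can be illustrated with plain sets. This is a deliberately simplified model, assuming exact action names only; it ignores wildcards, explicit denies, resource policies, and SCPs, and the `effective_actions` helper is hypothetical.

```python
def effective_actions(identity_policy_actions, boundary_actions=None) -> set:
    """Effective permissions = identity policy actions intersected with the boundary.

    A boundary of None models a role created without any permissions boundary.
    """
    identity = set(identity_policy_actions)
    if boundary_actions is None:
        return identity  # nothing caps the inline or managed policies
    return identity & set(boundary_actions)


# Overly permissive inline policy attached by a developer.
inline = {"s3:GetObject", "s3:PutObject", "s3:DeleteBucket"}
# Intended boundary: read and write objects only.
boundary = {"s3:GetObject", "s3:PutObject"}

with_boundary = effective_actions(inline, boundary)
without_boundary = effective_actions(inline)
```

With the boundary set, `s3:DeleteBucket` drops out of the effective set; without it, the inline policy applies in full, which is the symptom described in the question.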

MCQmedium

A batch analytics job has unpredictable DynamoDB traffic with long idle periods and occasional spikes. Which capacity mode should minimize operational overhead and avoid paying for idle provisioned capacity?

A.DynamoDB on-demand capacity mode

B.Reserved capacity for maximum daily traffic

C.Provisioned capacity set for peak traffic

D.Global tables in every Region

AnswerA

On-demand capacity is suitable for unpredictable workloads and charges per request without capacity planning.

Why this answer

DynamoDB on-demand capacity mode automatically scales to handle unpredictable traffic spikes and idle periods, charging only for the reads and writes you perform. This eliminates the need to provision capacity for peak traffic, avoiding costs during long idle periods and reducing operational overhead from capacity management.

Exam trap

The trap here is that candidates may assume a reserved capacity commitment or provisioned capacity is always cheaper, ignoring the cost of idle throughput during the long quiet periods of an unpredictable workload.

How to eliminate wrong answers

Option B is wrong because DynamoDB reserved capacity is a one- or three-year commitment that applies to provisioned capacity mode; buying it for maximum daily traffic means paying for throughput through the long idle periods and still requires capacity planning. Option C is wrong because provisioned capacity set for peak traffic incurs costs during idle periods, as you pay for the provisioned capacity regardless of actual usage. Option D is wrong because global tables are a replication feature for multi-Region active-active setups, not a capacity mode; they do not address cost optimization for unpredictable traffic and add complexity and cost.
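
Selecting on-demand mode comes down to one setting at table creation. The sketch below builds the keyword arguments for boto3's DynamoDB `create_table` call; the helper name, table name, and key name are hypothetical, and the string partition key is an assumption.

```python
def on_demand_table_params(table_name: str, partition_key: str) -> dict:
    """Build create_table arguments for a table billed per request."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": partition_key, "AttributeType": "S"},  # string key assumed
        ],
        "KeySchema": [
            {"AttributeName": partition_key, "KeyType": "HASH"},
        ],
        # On-demand mode: no ProvisionedThroughput block is supplied.
        "BillingMode": "PAY_PER_REQUEST",
    }


params = on_demand_table_params("batch-analytics-results", "job_id")
```

These arguments would be passed as `client.create_table(**params)`; there are no read or write capacity units to size or adjust.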

MCQeasy

A team uses an S3 bucket to store important customer-generated exports. They need protection against accidental overwrites and also want copies of the data in another AWS Region for disaster recovery. Which S3 configuration best satisfies both requirements?

A.Enable S3 lifecycle policies to automatically move objects to Glacier after 30 days only.

B.Enable S3 versioning and configure Cross-Region Replication to a destination bucket in another Region.

C.Disable all versioning and rely on AWS Backup to restore objects from a scheduled backup window.

D.Enable S3 Block Public Access and SSE-S3 encryption, without using versioning or replication.

AnswerB

Versioning preserves previous object states against overwrites and deletes, while replication provides an additional Region copy for recovery.

Why this answer

Option B is correct because enabling S3 versioning protects against accidental overwrites by preserving previous versions of objects, and configuring Cross-Region Replication (CRR) automatically replicates objects to a destination bucket in another AWS Region, providing disaster recovery. This combination meets both requirements without manual intervention.

Exam trap

The trap here is that candidates may think AWS Backup alone covers both accidental overwrites and disaster recovery. Option C disables versioning, which removes per-object recovery from overwrites, and a scheduled backup window offers only point-in-time restores with no continuously replicated copy in a second Region.

How to eliminate wrong answers

Option A is wrong because lifecycle policies to Glacier only manage storage tier transitions and do not protect against accidental overwrites or provide cross-region replication for disaster recovery. Option C is wrong because disabling versioning removes the ability to recover an overwritten object in place, and AWS Backup for Amazon S3 itself requires versioning on the bucket; a scheduled backup also restores only to its backup points and does not continuously replicate objects to another Region. Option D is wrong because enabling Block Public Access and SSE-S3 encryption addresses security and encryption, but does not protect against accidental overwrites (no versioning) nor replicate data to another region (no replication).
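
The two settings in option B can be sketched as the dictionaries boto3 expects for `put_bucket_versioning` and `put_bucket_replication`. The helper name, rule ID, role ARN, and bucket ARN are hypothetical placeholders; versioning must be enabled on both the source and the destination bucket before replication can be configured.

```python
# Passed as VersioningConfiguration for both source and destination buckets.
VERSIONING_CONFIGURATION = {"Status": "Enabled"}


def replication_configuration(role_arn: str, destination_bucket_arn: str) -> dict:
    """Build a replication configuration that copies all new objects."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-exports",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {},  # empty filter: every object in the bucket
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


config = replication_configuration(
    "arn:aws:iam::111122223333:role/example-replication-role",
    "arn:aws:s3:::example-exports-dr",
)
```

Leaving delete marker replication disabled in this sketch means a delete in the source bucket does not hide the object in the DR copy.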

100

MCQmedium

A platform team wants application developers to create IAM roles for their ECS tasks, but security must guarantee that no role created by those developers can ever exceed a predefined permission set. The developers also should not be able to attach broader permissions to themselves later. What should the team implement?

A.Attach a customer-managed IAM policy to the developers and let them create roles freely.

B.Use an IAM permission boundary on the developer principals and require created roles to include the boundary.

C.Create a service-linked role for ECS and let developers reuse it for all workloads.

D.Add an S3 bucket policy that only allows tagged roles to be created.

AnswerB

A permission boundary sets the upper limit for permissions that an IAM principal or created role can receive. By combining developer role creation permissions with a required boundary, security can allow self-service role creation while preventing privilege escalation. Even if developers attach broader identity policies later, the effective permissions cannot exceed the boundary. This is the right control when you need delegated administration with a hard ceiling on privileges.

Why this answer

Option B is correct because IAM permission boundaries allow the platform team to define a maximum set of permissions that any role created by the developers can have. By attaching a permission boundary to the developers' IAM user or role, and requiring that any new role they create includes that boundary, the developers cannot grant permissions beyond the boundary—even if they attach a broader policy. This ensures that no role can ever exceed the predefined permission set, and developers cannot escalate their own privileges later.

Exam trap

The trap here is that candidates often confuse IAM permission boundaries with simple policy attachments, thinking that attaching a restrictive policy to the developer's user account is sufficient to control the roles they create. A permissions boundary that is required on every created role is the control that caps the effective permissions of roles created by delegated users.

How to eliminate wrong answers

Option A is wrong because simply attaching a customer-managed IAM policy to the developers does not prevent them from creating roles with broader permissions—they could attach additional policies to those roles that exceed the predefined set. Option C is wrong because a service-linked role for ECS is a predefined role that cannot be customized per workload; developers would be forced to reuse a single role, which violates the principle of least privilege and does not allow developers to create roles with specific permissions. Option D is wrong because an S3 bucket policy controls access to S3 resources, not IAM role creation or permission boundaries; it cannot restrict the permissions of IAM roles created by developers.
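
One common way to "require created roles to include the boundary" is a condition on the `iam:PermissionsBoundary` key in the developers' own policy. The sketch below builds such a policy as a Python dictionary; the helper name, statement IDs, and boundary ARN are hypothetical, and a production policy would also scope `Resource` and protect the boundary policy itself from edits.

```python
def delegated_role_creation_policy(boundary_arn: str) -> dict:
    """Allow role creation and policy attachment only when the boundary is set."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CreateAndEditRolesOnlyWithBoundary",
                "Effect": "Allow",
                "Action": [
                    "iam:CreateRole",
                    "iam:AttachRolePolicy",
                    "iam:PutRolePolicy",
                ],
                "Resource": "*",
                # The request (or target role) must carry exactly this boundary.
                "Condition": {
                    "StringEquals": {"iam:PermissionsBoundary": boundary_arn}
                },
            },
            {
                "Sid": "NeverRemoveBoundary",
                "Effect": "Deny",
                "Action": "iam:DeleteRolePermissionsBoundary",
                "Resource": "*",
            },
        ],
    }


policy = delegated_role_creation_policy(
    "arn:aws:iam::111122223333:policy/example-ecs-task-boundary"
)
```

A `CreateRole` call that omits the boundary, or names a different one, fails the condition and is denied by default.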

101

MCQhard

A payments API uses Amazon SQS. Poison messages are repeatedly failing and blocking useful retries. What should the architect configure?

A.A FIFO queue without a redrive policy

B.A dead-letter queue with an appropriate maxReceiveCount

C.A larger message retention period only

D.Short polling instead of long polling

AnswerB

A DLQ isolates messages that fail repeatedly so they can be investigated without disrupting normal processing.

Why this answer

B is correct because a dead-letter queue (DLQ) with an appropriate maxReceiveCount allows the payments API to isolate poison messages after a specified number of failed processing attempts. This prevents repeated failures from blocking useful retries, as the problematic messages are moved to the DLQ for manual inspection or separate handling, while the main queue continues processing valid messages.

Exam trap

The trap here is that candidates often confuse increasing the retention period or switching polling methods as solutions for poison messages, when the correct mechanism is a dead-letter queue with a maxReceiveCount to limit retries.

How to eliminate wrong answers

Option A is wrong because a FIFO queue without a redrive policy does not isolate poison messages; without a DLQ, a failing message keeps returning to the queue until the retention period expires, and in a FIFO queue it also holds up the messages behind it in the same message group. Option C is wrong because increasing the message retention period only extends how long messages stay in the queue, and does not address the repeated failure and blocking caused by poison messages. Option D is wrong because short polling (vs. long polling) only changes how receive requests wait for and sample messages, not the handling of poison messages or retry behavior.
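
The DLQ is wired to the source queue through the `RedrivePolicy` queue attribute, which is a JSON string. The sketch below builds that attribute in the form boto3's SQS `set_queue_attributes` accepts; the helper name, queue ARN, and the default of five receives are hypothetical choices.

```python
import json


def redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Build queue attributes that move a message to the DLQ after N receives."""
    if max_receive_count < 1:
        raise ValueError("maxReceiveCount must be at least 1")
    redrive_policy = {
        "deadLetterTargetArn": dlq_arn,
        "maxReceiveCount": str(max_receive_count),
    }
    # SQS expects the policy as a JSON-encoded string, not a nested object.
    return {"RedrivePolicy": json.dumps(redrive_policy)}


attributes = redrive_attributes("arn:aws:sqs:us-east-1:111122223333:payments-dlq", 3)
```

A lower `maxReceiveCount` isolates poison messages sooner; a higher one tolerates more transient failures before giving up on a message.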

102

MCQeasy

A content publishing system exposes a static website from S3 and CloudFront. Users should still receive cached pages if the S3 origin has a short outage. Which feature helps most?

A.IAM Access Analyzer

B.AWS Backup Vault Lock

C.CloudFront caching with appropriate TTLs

D.S3 Select

AnswerC

CloudFront can serve cached content from edge locations when the origin is temporarily unavailable.

Why this answer

CloudFront caches responses from the S3 origin based on configured TTLs (Cache-Control or Expires headers). If the S3 origin experiences a short outage, CloudFront keeps serving objects that are already cached while their TTL is valid, and when an expired object cannot be refreshed because the origin returns an error, CloudFront can continue to serve the stale copy. This is the most direct way to ensure users receive pages during transient origin failures.

Exam trap

The trap here is confusing data protection features (like Backup Vault Lock) or data retrieval features (like S3 Select) with caching mechanisms that directly improve availability during origin outages.

How to eliminate wrong answers

Option A is wrong because IAM Access Analyzer helps identify unintended access to resources but does not provide caching or origin failover capabilities. Option B is wrong because AWS Backup Vault Lock prevents deletion of backups but does not affect content delivery or caching behavior. Option D is wrong because S3 Select is a feature to retrieve subsets of object data using SQL queries, not a mechanism for caching or serving static content during outages.

103

MCQhard

Based on the exhibit, the team serves versioned JavaScript and CSS files from an S3 origin through CloudFront. After a release, the cache hit ratio dropped and origin fetches increased sharply. What change best reduces both CloudFront and S3 costs without changing the application’s public behavior?

A.Increase the CloudFront price class to include more edge locations.

B.Create a cache policy that excludes Authorization, cookies, and unnecessary query strings, and narrow the origin request policy to forward only the headers the S3 origin actually needs.

C.Disable CloudFront and serve the files directly from S3 to avoid cache invalidation overhead.

D.Use Lambda@Edge to rewrite every request into a unique path so that clients never receive stale files.

AnswerB

The hit ratio is low because CloudFront is varying the cache on request attributes that do not change versioned static files. Removing Authorization, cookies, and irrelevant query strings from the cache key allows CloudFront to reuse cached objects across users and sessions. Reducing the origin request policy avoids sending unnecessary viewer context to the origin. Because the filenames are already versioned, long TTLs can be used safely and will lower origin requests and S3 request costs.

Why this answer

B is correct because versioned JavaScript and CSS files are immutable, so CloudFront should cache them aggressively. By creating a cache policy that excludes unnecessary headers (like Authorization and cookies) and query strings, and narrowing the origin request policy to forward only required headers, you maximize cache hits and reduce origin fetches. This directly lowers the work CloudFront does against the origin and the S3 request costs (fewer GET requests reach the bucket), without altering the application's public behavior.

Exam trap

The trap here is that candidates often think increasing edge locations (Option A) or using Lambda@Edge (Option D) will improve performance, but for versioned static files, the real cost optimization comes from maximizing cache hits by properly configuring cache and origin request policies, not from adding more infrastructure or rewriting requests.

How to eliminate wrong answers

Option A is wrong because increasing the CloudFront price class to include more edge locations increases costs (more regional data transfer) and does not improve cache hit ratio for versioned static files—it may even reduce it by spreading requests across more locations. Option C is wrong because disabling CloudFront and serving files directly from S3 eliminates caching entirely, drastically increasing S3 request costs (GET, data transfer) and latency, while also losing CloudFront's edge caching benefits. Option D is wrong because using Lambda@Edge to rewrite every request into a unique path would bypass caching entirely (each unique path is a cache miss), increasing origin fetches and costs, and it does not solve the stale-file problem because versioned files are already immutable.
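
The effect of the cache key on hit ratio can be shown with a toy model. This is not CloudFront's implementation; the `cache_key` and `distinct_cache_entries` helpers, the file path, and the request values are all hypothetical, and the model simply counts how many separate cache entries a set of requests would create.

```python
def cache_key(request: dict, include_cookies: bool, include_query: bool) -> tuple:
    """Build a simplified cache key from the parts the cache policy includes."""
    parts = [request["path"]]
    if include_query:
        parts.append(request.get("query", ""))
    if include_cookies:
        parts.append(request.get("cookie", ""))
    return tuple(parts)


def distinct_cache_entries(requests, include_cookies: bool, include_query: bool) -> int:
    """Count cache entries; each distinct key costs one origin fetch."""
    return len({cache_key(r, include_cookies, include_query) for r in requests})


# Three users requesting the same versioned file with different viewer context.
REQUESTS = [
    {"path": "/static/app.3f9c1a.js", "cookie": "session=alice", "query": "utm=mail"},
    {"path": "/static/app.3f9c1a.js", "cookie": "session=bob", "query": "utm=ad"},
    {"path": "/static/app.3f9c1a.js", "cookie": "session=carol", "query": ""},
]

fragmented = distinct_cache_entries(REQUESTS, include_cookies=True, include_query=True)
consolidated = distinct_cache_entries(REQUESTS, include_cookies=False, include_query=False)
```

With cookies and query strings in the key, every user produces a separate entry and a separate origin fetch; with a path-only key, all three requests share one cached object.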

104

Multi-Selectmedium

A company is designing a high-performance architecture for a real-time analytics platform that ingests millions of events per second. The events must be processed with minimal latency and then stored for long-term analysis. Which three services should be combined to build this architecture? (Choose three.)

Select 3 answers

A.Amazon Kinesis Data Streams for real-time data ingestion

B.Amazon SQS (Simple Queue Service) for buffering events

C.AWS Lambda for real-time event processing

D.Amazon Redshift for long-term data storage and analytics

E.Amazon RDS for MySQL for storing processed results

F.Amazon CloudWatch Logs for event storage

Why this answer

Amazon Kinesis Data Streams is correct because it is designed for real-time data ingestion of millions of events per second with low latency, providing durable storage and ordered processing. AWS Lambda is correct because it can process events from Kinesis in near real-time with automatic scaling, making it ideal for minimal-latency processing. Amazon Redshift is correct because it is a petabyte-scale data warehouse optimized for long-term storage and complex analytics on large datasets, supporting high-performance queries.

Exam trap

The trap here is that candidates often confuse Amazon SQS with Kinesis for streaming ingestion, or assume Amazon RDS can handle long-term analytics storage, but the exam tests the specific use case of high-throughput, low-latency streaming and petabyte-scale analytics.

105

MCQmedium

An application runs on EC2 in us-east-1 and frequently reads objects from an S3 bucket that is physically located in us-west-2. The finance team reports unexpectedly high inter-Region data transfer charges because the application retrieves objects for many user requests. A constraint: the bucket in us-west-2 must remain the system of record for compliance, but the application can read from a replica in us-east-1. What should the solutions architect do to minimize network spend while meeting the compliance constraint?

A.Enable S3 Cross-Region Replication from the us-west-2 source bucket to a destination bucket in us-east-1, and update the app to read from the us-east-1 bucket.

B.Create an interface VPC endpoint for S3 in us-east-1 and keep all object reads pointing to the us-west-2 bucket.

C.Use VPC peering between two regions and route all requests to the us-west-2 bucket over the peering link.

D.Use Route 53 latency-based routing to send users to a us-west-2 web endpoint and keep the S3 bucket unchanged.

AnswerA

CRR keeps the us-west-2 bucket as the system of record while maintaining a copy in us-east-1, so application reads no longer cross Regions.

Why this answer

Option A is correct because S3 Cross-Region Replication (CRR) automatically replicates objects from the source bucket in us-west-2 to a destination bucket in us-east-1, satisfying the compliance requirement that the us-west-2 bucket remains the system of record. By updating the application to read from the us-east-1 bucket, all read traffic stays within the same region, eliminating inter-region data transfer charges for object retrievals. This approach directly addresses the cost issue while preserving the original bucket as the authoritative source.

Exam trap

The trap here is that candidates may assume VPC endpoints or peering can eliminate inter-region costs, but S3 data transfer charges are based on the bucket's physical region, not the network path, so only replicating the data locally avoids the charges.

How to eliminate wrong answers

Option B is wrong because an interface VPC endpoint for S3 in us-east-1 does not change the physical location of the bucket; the application would still read from the us-west-2 bucket, incurring inter-Region data transfer charges for each request. Option C is wrong because VPC peering connects VPCs to each other, not to S3; S3 is a Regional service reached through its public endpoints or through VPC endpoints, so a peering link provides no direct route to the us-west-2 bucket, and every read would still incur inter-Region data transfer charges. Option D is wrong because Route 53 latency-based routing directs user traffic to a web endpoint in us-west-2, but the S3 bucket remains unchanged in us-west-2, so the application in us-east-1 still reads from the remote bucket and incurs inter-Region transfer charges; it does not create a local replica.
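
The cost argument can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions: the per-GB rate, the data volume, and the read count are illustrative placeholders rather than quoted AWS prices, and the model ignores replication request charges and the storage cost of the second copy.

```python
def monthly_transfer_cost(object_gb, reads_per_month, rate_per_gb, replicated):
    """Inter-Region transfer cost for one month of traffic.

    Without replication, every read crosses Regions. With replication,
    each object crosses Regions once and reads are served locally.
    """
    transfers = 1 if replicated else reads_per_month
    return object_gb * transfers * rate_per_gb


ASSUMED_RATE = 0.02  # USD per GB, illustrative assumption only

# Hypothetical workload: 500 GB of objects, each read 40 times per month.
direct = monthly_transfer_cost(500, 40, ASSUMED_RATE, replicated=False)
with_crr = monthly_transfer_cost(500, 40, ASSUMED_RATE, replicated=True)
print(f"direct reads: ${direct:.2f}, with CRR: ${with_crr:.2f}")
```

The more often each object is read, the larger the gap becomes, which is why replication pays off for read-heavy access patterns like the one in the question.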

106

MCQmedium

A claims workflow uses an RDS MySQL database and must remain available during an Availability Zone failure with minimal application changes. What should the architect enable? The architecture review board prefers a managed AWS-native control.

A.S3 Cross-Region Replication

B.Multi-AZ deployment for the RDS DB instance

C.EBS snapshots every hour

D.Read replicas only

AnswerB

Multi-AZ provides synchronous standby replication and automatic failover within a Region.

Why this answer

Multi-AZ deployment for RDS MySQL provides synchronous standby replication to a different Availability Zone. In the event of an AZ failure, RDS automatically fails over to the standby, ensuring high availability with minimal application changes (the same endpoint is used). This is a managed AWS-native solution that meets the architecture review board's preference.

Exam trap

The trap here is that candidates often confuse Read replicas (which are for read scaling and not automatic failover) with Multi-AZ (which provides synchronous standby for high availability), or they mistakenly think EBS snapshots or S3 replication can provide the same level of automatic recovery with minimal downtime.

How to eliminate wrong answers

Option A is wrong because S3 Cross-Region Replication is for object storage replication across regions, not for RDS database availability within a region, and it does not address AZ failure for a MySQL database. Option C is wrong because EBS snapshots every hour provide point-in-time backups but do not enable automatic failover; recovery would require manual restoration, causing significant downtime. Option D is wrong because read replicas on their own are designed for read scaling and use asynchronous replication; they do not fail over automatically for write operations, so availability during an AZ failure would depend on manual promotion.

107

Multi-Selecthard

A private application in two private subnets must download objects from S3 and read parameters from Systems Manager Parameter Store without routing traffic through the public internet. Which two components should the architect use? The implementation must work across routine deployments without manual intervention.

Select 2 answers

A.Interface VPC endpoint for Systems Manager

B.Internet gateway attached to the VPC

C.NAT gateway in each Availability Zone

D.Gateway VPC endpoint for Amazon S3

AnswersA, D

Systems Manager/Parameter Store access uses interface endpoints powered by AWS PrivateLink.

Why this answer

Interface VPC endpoints (AWS PrivateLink) for Systems Manager allow private subnets to access Systems Manager Parameter Store without traversing the internet, using private IP addresses within the VPC. Gateway VPC endpoints for S3 provide a highly available, redundant path to S3 via route table entries, ensuring traffic stays within the AWS network. Together, they eliminate the need for internet gateways or NAT gateways, meeting the requirement for no public internet routing.

Exam trap

The trap here is that candidates often confuse gateway VPC endpoints (used for S3 and DynamoDB) with interface VPC endpoints (used for most other AWS services), leading them to incorrectly select NAT gateways or internet gateways for private subnet access.

108

MCQmedium

A partner company needs read-only access to reports in an S3 bucket for a B2B file exchange site. The partner has its own AWS account. What is the most secure scalable access pattern?

A.Make the objects public and rely on difficult-to-guess object names

B.Create an IAM user in the company account and share the access keys

C.Create a bucket policy that grants the partner role least-privilege access to the required prefix

D.Copy the objects to a public website bucket

AnswerC

A resource policy can grant cross-account access to a specific external role and prefix.

Why this answer

Option C is correct because it uses a resource-based bucket policy that grants the partner's IAM role (from their own AWS account) least-privilege read-only access to a specific prefix. This avoids sharing long-term credentials, follows the principle of cross-account access using IAM roles and bucket policies, and is fully scalable without managing external users.

Exam trap

The trap here is that candidates often choose Option B (sharing IAM user keys) because it seems simpler, but AWS recommends cross-account IAM roles for secure, temporary, and auditable access between accounts.

How to eliminate wrong answers

Option A is wrong because making objects public bypasses all access control and relies on security through obscurity (hard-to-guess object names), which is neither secure nor auditable. Option B is wrong because creating an IAM user in the company account and sharing access keys introduces long-term static credentials that must be rotated, distributed securely, and managed, which creates a security risk and does not scale as partners are added. Option D is wrong because copying objects to a public website bucket exposes them to the internet without any access control, and it adds unnecessary data duplication and cost.
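
A bucket policy along the lines of Option C might look like the sketch below, built as a Python dictionary and printed as JSON. The bucket name, prefix, account ID, and role name are hypothetical placeholders; the question does not give them. For cross-account access, the partner account would also need to allow the same action in the role's own identity policy.

```python
import json

# Hypothetical identifiers, for illustration only.
PARTNER_ROLE_ARN = "arn:aws:iam::111122223333:role/PartnerReportReader"
BUCKET = "example-reports-bucket"
PREFIX = "reports/partner-a/"

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PartnerReadOnlyReports",
            "Effect": "Allow",
            # Only the partner's specific role, not the whole partner account.
            "Principal": {"AWS": PARTNER_ROLE_ARN},
            # Read-only: no write, delete, or ACL actions.
            "Action": "s3:GetObject",
            # Scoped to the required prefix rather than the whole bucket.
            "Resource": f"arn:aws:s3:::{BUCKET}/{PREFIX}*",
        }
    ],
}

print(json.dumps(bucket_policy, indent=2))
```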

109

MCQmedium

A SaaS company uses an S3 bucket for database backups created daily. Backups are rarely restored; the company’s documented RTO is 24 hours, and the compliance policy requires backups be kept for 90 days. The team currently stores all backups in S3 Standard, which is costly. Which single lifecycle policy change is most cost-optimized while still meeting the 24-hour RTO and 90-day retention?

A.Add a lifecycle rule to transition backups older than 1 day to S3 Glacier Flexible Retrieval, and keep them until day 90.

B.Add a lifecycle rule to transition backups older than 1 day to S3 Glacier Instant Retrieval, and keep them until day 90.

C.Add a lifecycle rule to transition backups older than 1 day to S3 Glacier Deep Archive, and keep them until day 90 with no restore configuration.

D.Add a lifecycle rule to transition backups older than 1 day to S3 One Zone-IA, and delete them after 7 days.

AnswerA

Glacier Flexible Retrieval is intended for backups with infrequent access and supports restores within an RTO measured in hours.

Why this answer

Option A is correct because S3 Glacier Flexible Retrieval offers retrieval times ranging from minutes to hours, which comfortably meets the 24-hour RTO, while providing significant cost savings over S3 Standard for data that is rarely accessed. Transitioning backups older than 1 day to this storage class reduces costs without compromising the 90-day retention requirement.

Exam trap

The trap here is that candidates may choose S3 Glacier Deep Archive for maximum cost savings without considering that its standard retrievals take up to 12 hours (bulk retrievals up to 48 hours), that it offers no expedited retrieval tier to fall back on, and that its 180-day minimum storage duration is longer than the 90-day retention period.

How to eliminate wrong answers

Option B is wrong because S3 Glacier Instant Retrieval is designed for data accessed once a quarter with millisecond retrieval, but it is more expensive than S3 Glacier Flexible Retrieval and not the most cost-optimized choice for backups that are rarely restored. Option C is wrong because S3 Glacier Deep Archive restores take up to 12 hours with standard retrieval and up to 48 hours with bulk retrieval, which leaves little margin inside the 24-hour RTO, and there is no expedited tier to fall back on; its 180-day minimum storage duration also means backups deleted at day 90 are still billed for the full 180 days, and the option lacks a restore configuration, making it risky for meeting the RTO. Option D is wrong because S3 One Zone-IA does not provide the durability needed for compliance (data is lost if the single AZ fails), and deleting backups after 7 days violates the 90-day retention policy.
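
The lifecycle rule in Option A could be sketched as the configuration below, written as a Python dictionary in the shape used by the S3 lifecycle configuration API. The rule ID and the `backups/` prefix are hypothetical; `GLACIER` is the storage-class value that the API uses for S3 Glacier Flexible Retrieval.

```python
import json

lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-daily-backups",        # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": "backups/"},     # hypothetical key prefix
            # Move each backup to Glacier Flexible Retrieval after 1 day.
            "Transitions": [{"Days": 1, "StorageClass": "GLACIER"}],
            # Delete each backup 90 days after creation (retention limit).
            "Expiration": {"Days": 90},
        }
    ]
}

print(json.dumps(lifecycle_configuration, indent=2))
```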

110

MCQeasy

Based on the exhibit, which EBS volume type should the team use to meet the performance need at lower cost than overprovisioning capacity?

A.Use gp3 and provision the needed IOPS independently of volume size.

B.Use sc1 because it is optimized for infrequent access and large objects.

C.Use st1 because it provides high throughput for streaming data.

D.Use standard magnetic storage because it is compatible with all EC2 instances.

AnswerA

gp3 is the best fit because it lets you provision IOPS and throughput separately from volume size. The exhibit shows the workload needs around 10,000 IOPS and experiences queue buildup on gp2. With gp3, the team can raise performance without unnecessarily increasing storage capacity, which is usually more cost-effective for this kind of database workload.

Why this answer

The team needs to meet performance requirements at lower cost than overprovisioning capacity. gp3 allows you to provision baseline performance of 3,000 IOPS and 125 MiB/s throughput for any volume size, and you can independently increase IOPS up to 16,000 and throughput up to 1,000 MiB/s without needing to increase volume size. This avoids the cost of overprovisioning large gp2 volumes to achieve higher IOPS, which are tied to volume size (3 IOPS per GiB).
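
The sizing difference can be worked through with the figures quoted above. This is a minimal sketch: the helper names are hypothetical, and the constants are the gp2 ratio and gp3 baseline and ceiling stated in the explanation, not values read from an AWS API.

```python
import math

GP2_IOPS_PER_GIB = 3        # gp2 baseline ratio quoted above
GP3_BASELINE_IOPS = 3000    # included with every gp3 volume
GP3_MAX_IOPS = 16000        # gp3 ceiling quoted above


def gp2_size_for_iops(target_iops):
    """Smallest gp2 volume (GiB) whose baseline reaches target_iops."""
    return math.ceil(target_iops / GP2_IOPS_PER_GIB)


def gp3_extra_iops(target_iops):
    """IOPS to provision on gp3 beyond the included baseline."""
    if target_iops > GP3_MAX_IOPS:
        raise ValueError("target exceeds the gp3 ceiling used in this sketch")
    return max(0, target_iops - GP3_BASELINE_IOPS)


target = 10_000  # approximate need shown in the exhibit
print(gp2_size_for_iops(target), "GiB of gp2 needed for the IOPS alone")
print(gp3_extra_iops(target), "extra IOPS to provision on gp3")
```

On gp2 the IOPS target dictates the volume size, so the team pays for capacity it may not need; on gp3 the size can follow the data and only the additional IOPS are provisioned.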

Exam trap

The trap here is that candidates assume all EBS volume types require overprovisioning capacity to achieve higher IOPS, forgetting that gp3 decouples performance from size, making it the most cost-effective choice for workloads needing specific IOPS without large storage.

How to eliminate wrong answers

Option B is wrong because sc1 (Cold HDD) is designed for infrequently accessed, large sequential workloads with low cost per GB; its performance is defined by throughput (a baseline of 12 MiB/s per TiB under a burst model) and it tops out at 250 IOPS per volume, far below what an IOPS-sensitive workload needs. Option C is wrong because st1 (Throughput Optimized HDD) is optimized for high throughput for streaming, big data, and log processing; it does not support independent IOPS provisioning, its baseline is 40 MiB/s per TiB, and it tops out at 500 IOPS per volume, making it unsuitable for workloads needing predictable IOPS. Option D is wrong because standard magnetic storage (previous generation) is deprecated for most use cases, offers very low performance (average 100 IOPS), and is not cost-effective compared to gp3 for any performance requirement.

111

MCQmedium

Your company needs a high-throughput, low-latency TCP service using a custom binary protocol. Requirements: preserve the original client source IP for rate limiting, keep latency minimal, and use TCP health checks. The current setup uses an Application Load Balancer and performance is inconsistent. Which load balancer choice best meets these requirements?

A.Keep the Application Load Balancer (ALB), because ALBs also preserve client source IP for TCP protocols.

B.Use a Network Load Balancer (NLB) with TCP listeners so traffic stays at Layer 4 and the original source IP is preserved.

C.Use Amazon API Gateway because it preserves client source IP and provides TCP health checks for all protocols.

D.Use Amazon CloudFront with an S3 origin, because CloudFront reduces latency for TCP-based protocols.

AnswerB

NLB is designed for Layer 4 TCP/UDP traffic with very low latency and high throughput. It supports TCP health checks and preserves the original client source IP by default, which enables accurate client-IP-based rate limiting for a custom TCP protocol.

Why this answer

A Network Load Balancer (NLB) operates at Layer 4 and preserves the original client source IP by default, which is essential for accurate rate limiting. Its TCP listeners provide low-latency, high-throughput handling of custom binary protocols, and it supports TCP health checks natively. This directly addresses the performance inconsistency seen with the Application Load Balancer, which operates at Layer 7 and introduces additional processing overhead.

Exam trap

The trap here is that candidates often assume Application Load Balancers preserve client source IP for all protocols, but they only do so for HTTP/HTTPS traffic via the X-Forwarded-For header, not for raw TCP traffic, and they introduce higher latency due to Layer 7 processing.

How to eliminate wrong answers

Option A is wrong because an Application Load Balancer operates at Layer 7 (HTTP/HTTPS) and does not preserve the original client source IP for TCP traffic; it terminates the client connection and re-establishes a new one, so the source IP seen by the backend is the ALB's private IP. Option C is wrong because Amazon API Gateway is a fully managed service for creating RESTful and WebSocket APIs, not a load balancer; it does not support TCP listeners or TCP health checks, and it operates at Layer 7. Option D is wrong because Amazon CloudFront is a content delivery network (CDN) that caches content at edge locations, but it does not support TCP-based custom binary protocols (it works with HTTP/HTTPS and WebSocket) and cannot use an S3 origin for a TCP service; it also does not preserve the original client source IP for TCP traffic.

112

MCQmedium

A company runs an application behind an Application Load Balancer (ALB). An Auto Scaling group (ASG) is configured with desired capacity 2, but it is attached only to subnets in a single Availability Zone. The ALB is healthy because it is configured across multiple Availability Zones. When the Availability Zone that contains the ASG subnets experiences an outage, what change most directly improves resilience and allows capacity to be restored automatically?

A.Update the ASG to use subnet IDs that span at least two Availability Zones so it can launch replacement instances after an AZ outage.

B.Reduce the ALB health check interval to speed up detection of unhealthy targets.

C.Enable connection draining on the ALB so existing requests complete before targets are terminated.

D.Increase the ASG desired capacity from 2 to 6 to compensate for the missing subnets.

AnswerA

If the ASG is attached to subnets in multiple Availability Zones, when instances in the failed AZ become unhealthy/terminate, Auto Scaling can launch new instances in the remaining AZs to restore the desired capacity. This directly addresses the root cause: the ASG cannot create capacity outside the AZs it is configured for.

Why this answer

Option A is correct because an Auto Scaling group (ASG) can only launch instances into the subnets explicitly assigned to it. If those subnets reside in a single Availability Zone (AZ) and that AZ fails, the ASG has no capacity to launch replacement instances, even though the ALB is multi-AZ. By configuring the ASG with subnet IDs spanning at least two AZs, the ASG can automatically launch instances in a healthy AZ, restoring capacity and resilience.

Exam trap

The trap here is that candidates assume a multi-AZ ALB automatically makes the entire architecture resilient, overlooking that the ASG must also be configured with subnets in multiple AZs to launch replacement instances after an AZ failure.

How to eliminate wrong answers

Option B is wrong because reducing the ALB health check interval speeds up detection of unhealthy targets but does not address the root cause: the ASG has no subnets in a healthy AZ to launch replacement instances. Option C is wrong because connection draining ensures in-flight requests complete before targets are deregistered, but it does not help restore capacity after an AZ outage. Option D is wrong because increasing the desired capacity from 2 to 6 does not solve the problem; the ASG still cannot launch instances if its subnets are all in the failed AZ, so the extra capacity is unreachable.

113

Matchinghard

A company runs a stateless application tier behind an Application Load Balancer. Match each observed scaling pattern on the left to the best Auto Scaling strategy or metric on the right.

Concepts

Matches

Scale the Auto Scaling group on ALB RequestCountPerTarget.

Scale on SQS queue depth using a custom CloudWatch metric.

Use scheduled scaling to add capacity before the recurring surge.

Use target tracking on EC2 CPUUtilization.

Why these pairings

Each strategy fits a different scaling signal. ALB RequestCountPerTarget suits load that rises and falls with the request volume each instance receives; SQS queue depth, published as a custom CloudWatch metric, suits work that accumulates as a backlog; scheduled scaling suits a surge that recurs at a known time, because capacity can be added before it starts; and target tracking on EC2 CPUUtilization suits CPU-bound load that should be held near a target utilization.

114

MCQmedium

An administrator needs the ability to read and update infrastructure for a specific AWS account, but only when using MFA. The security team wants to eliminate long-lived administrator access keys and ensure that even if someone obtains temporary session credentials, actions are only allowed with MFA present. Which IAM design best meets these requirements?

A.Create an IAM user for administrators with AdministratorAccess and require MFA only at the IAM user login.

B.Create an IAM role for administration and use a permissions policy that allows only the required read/write actions. Add a condition to deny all allowed actions unless aws:MultiFactorAuthPresent is true.

C.Attach policies to an IAM user that allow read/write actions and enable MFA in the account, but do not use condition keys in IAM policies.

D.Use a role with the correct actions but enforce MFA only in the application by prompting users for an OTP before every API call.

AnswerB

A role-based approach removes long-lived keys and supports temporary credentials. Using a permissions-policy condition to require MFA presence enforces that the session must have MFA to perform actions, aligning with the “actions only allowed with MFA present” requirement.

Why this answer

Option B is correct because it uses an IAM role with a condition key `aws:MultiFactorAuthPresent` set to `true` to enforce MFA for all API calls made with temporary credentials. This eliminates long-lived access keys and ensures that even if temporary session credentials are compromised, actions are denied unless MFA was used during the session. The policy explicitly denies all allowed actions when MFA is not present, meeting the security team's requirement for MFA on every administrative action.

Exam trap

The trap here is that candidates often confuse requiring MFA at login (console) with enforcing MFA for all API calls, failing to realize that without a condition key in the IAM policy, access keys or temporary credentials can be used without MFA after the initial login.

How to eliminate wrong answers

Option A is wrong because it only requires MFA at login, not for subsequent API calls made with the user's access keys or temporary credentials, leaving a gap where long-lived access keys could be used without MFA. Option C is wrong because enabling MFA in the account without using condition keys in IAM policies does not enforce MFA for API calls; it only affects console login, and long-lived access keys remain active. Option D is wrong because enforcing MFA only in the application is not an IAM-level control; it can be bypassed if the application is compromised or if API calls are made directly via the AWS CLI or SDK without the application's OTP prompt.
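
One way to express the MFA requirement from Option B is sketched below as a Python dictionary printed as JSON. The allowed EC2 actions are hypothetical stand-ins for "the required read/write actions", which the question does not enumerate. The sketch uses the `BoolIfExists` operator so that requests where the MFA key is absent are denied as well as requests where it is false.

```python
import json

mfa_enforced_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowInfraReadWrite",
            "Effect": "Allow",
            # Hypothetical action list; substitute the real required actions.
            "Action": ["ec2:Describe*", "ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "*",
        },
        {
            "Sid": "DenyEverythingWithoutMfa",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            # Deny when MFA is absent or false; an explicit Deny overrides Allow.
            "Condition": {
                "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
            },
        },
    ],
}

print(json.dumps(mfa_enforced_policy, indent=2))
```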

115

MCQmedium

A SaaS vendor needs temporary access to an S3 bucket in your AWS account to read customer exports. The vendor will assume an IAM role you created. During integration testing, the vendor reports that their AssumeRole requests succeed, but your security team is concerned about the possibility of confused-deputy attacks. Which trust policy approach most directly mitigates this risk?

A.Add an sts:ExternalId condition to the role trust policy that must match the unique external ID you provide to the vendor.

B.Require the vendor to use the same MFA device serial number as your internal administrators in the trust policy.

C.Remove the role’s permissions policy and rely only on the S3 bucket policy to validate the caller.

D.Allow sts:AssumeRole from the vendor account root principal without restricting to the vendor’s specific IAM role.

AnswerA

The sts:ExternalId condition is a common protection against confused-deputy scenarios in cross-account role assumption. It ensures that only principals who know the unique external ID can successfully assume the role. This mitigates a third party tricking the vendor’s identity into assuming your role, even if they can call AssumeRole.

Why this answer

Option A is correct because the `sts:ExternalId` condition in the trust policy forces the vendor to include a unique external ID in their `AssumeRole` API call. This prevents a confused-deputy attack by ensuring that the role can only be assumed when the caller provides the exact external ID you have pre-shared, thereby verifying the intended purpose of the cross-account access.

Exam trap

The trap here is that candidates may think MFA or bucket policies are sufficient for cross-account security, but the confused-deputy attack is specifically mitigated by the `sts:ExternalId` condition, not by authentication factors or resource-based policies alone.

How to eliminate wrong answers

Option B is wrong because requiring the vendor to use the same MFA device serial number as your internal administrators is impractical and insecure—it would expose your MFA device to an external party and does not prevent confused-deputy attacks, as the vendor could still be tricked into assuming the role on behalf of another account. Option C is wrong because removing the role’s permissions policy and relying solely on the S3 bucket policy does not mitigate confused-deputy attacks; the bucket policy can still grant access to the role, and the attacker could still exploit the role’s trust relationship without an external ID check. Option D is wrong because allowing `sts:AssumeRole` from the vendor account root principal without restricting to a specific IAM role increases the attack surface—any principal in the vendor account (including compromised roles) could assume your role, and it does not address the confused-deputy risk because the external ID is not enforced.
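
A trust policy using the external ID from Option A might look like the following sketch, again built as a Python dictionary. The vendor account ID, role name, and external ID value are hypothetical placeholders; in practice the external ID would be a unique value agreed with the vendor for this specific customer relationship.

```python
import json

# Hypothetical values, for illustration only.
VENDOR_ROLE_ARN = "arn:aws:iam::111122223333:role/VendorExportReader"
EXTERNAL_ID = "example-external-id-7f3c"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            # Trust the vendor's specific role rather than the account root.
            "Principal": {"AWS": VENDOR_ROLE_ARN},
            "Action": "sts:AssumeRole",
            # AssumeRole succeeds only if the caller passes this external ID.
            "Condition": {"StringEquals": {"sts:ExternalId": EXTERNAL_ID}},
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

Naming the vendor's role as the principal also narrows the exposure that Option D would create by trusting the whole vendor account.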

116

MCQmedium

An engineering team runs application servers in private subnets. The instances must download patches and software packages from Amazon S3, but the company does not want the traffic to traverse the internet or a NAT gateway. Which design should they use?

A.Add an internet gateway to the VPC and route private subnet traffic through it.

B.Use an Amazon S3 gateway VPC endpoint in the route tables for the private subnets.

C.Use a security group rule that allows outbound traffic to the S3 public IP range.

D.Create a VPC peering connection to the S3 service VPC.

AnswerB

A gateway VPC endpoint for S3 keeps traffic between the VPC and S3 on the AWS network without using the public internet or a NAT gateway. This is the standard private-connectivity pattern for S3 access from private subnets. It also simplifies the architecture and reduces NAT-related cost while preserving access to the bucket from workloads that must remain nonpublic.

Why this answer

Option B is correct because an S3 Gateway VPC endpoint allows instances in private subnets to access Amazon S3 without traversing the internet or a NAT gateway. The endpoint uses AWS’s internal network and is added to the route table of the private subnets, directing S3 traffic through the endpoint prefix list. This design meets the requirement of keeping traffic off the internet while providing secure, low-latency access to S3.

Exam trap

The trap here is that candidates often confuse Gateway VPC endpoints with Interface VPC endpoints, or mistakenly think that a security group rule alone can bypass the need for a routing path to the internet, when in fact routing decisions are made at the subnet route table level, not by security groups.

How to eliminate wrong answers

Option A is wrong because routing the subnets' traffic through an internet gateway would send S3 requests over the public internet, which the company explicitly wants to avoid; it would also effectively turn the private subnets into public ones, and instances without public IP addresses would still need a NAT device to get out. Option C is wrong because a security group rule allowing outbound traffic to the S3 public IP range does not change the routing path; traffic would still need an internet gateway or NAT device to reach those public IPs, violating the no-internet requirement. Option D is wrong because VPC peering connections cannot be established with an AWS service such as S3; VPC peering only connects VPCs to each other, and S3 is accessed via VPC endpoints or its public endpoints, not through a peering connection.

117

MCQmedium

A global application experiences frequent writes and must survive a full Regional outage with near-zero data loss. The product team also requires that users can continue to write during the incident using the closest Region. Which approach is most aligned with these requirements?

A.Use an active/active design with multi-Region data replication (for example, global tables for the write-heavy datastore) and route traffic to multiple Regions based on health and latency.

B.Use warm standby with periodic backups of the primary write datastore every 24 hours.

C.Use pilot light where the secondary Region runs only infrastructure templates and starts data replication only after detecting failure.

D.Use a single-writer model in one Region and deploy read-only replicas in the other Region for continuity.

AnswerA

Active/active supports writing in multiple Regions and reduces the blast radius of a Regional failure while enabling continued operations.

Why this answer

Option A is correct because an active/active design with multi-Region data replication, such as Amazon DynamoDB global tables, allows writes to occur in any Region and replicates them to all other Regions with near-real-time latency (typically sub-second). This meets the requirement for near-zero data loss during a full Regional outage, as data is asynchronously replicated to multiple Regions, and users can continue writing to the closest healthy Region via Route 53 latency-based or geolocation routing.

Exam trap

The trap here is that candidates often confuse 'multi-Region replication' with 'read replicas only' (Option D) or assume that periodic backups (Option B) provide sufficient durability, failing to recognize that near-zero data loss requires continuous asynchronous replication, not batch-based or on-demand replication.

How to eliminate wrong answers

Option B is wrong because warm standby with periodic backups every 24 hours cannot achieve near-zero data loss; a 24-hour backup window means up to 24 hours of writes could be lost in a Regional failure. Option C is wrong because pilot light starts data replication only after detecting failure, which introduces a recovery time objective (RTO) and recovery point objective (RPO) that are too high for near-zero data loss, and it does not support continuous writes during the incident. Option D is wrong because a single-writer model with read-only replicas in another Region means writes cannot continue during a Regional outage of the primary Region, violating the requirement that users can write during the incident.

118

MCQmedium

A server assumes an IAM role and must read export objects only from this prefix in an S3 bucket: s3://customer-data/exports/acme/ . The application also needs to list the objects under that exact prefix so it can discover which export folders exist. The application performs ListBucket requests with Prefix set to exactly "exports/acme/". The current role policy allows s3:ListBucket on the bucket ARN without a prefix condition, and security reports the role can list other tenants’ export object keys. Which IAM policy change best enforces least privilege for both ListBucket and GetObject?

A.Keep s3:ListBucket allowed on arn:aws:s3:::customer-data, but restrict s3:GetObject to arn:aws:s3:::customer-data/exports/acme/*.

B.Allow s3:ListBucket on arn:aws:s3:::customer-data only when s3:prefix equals "exports/acme/" (for example, using a StringEquals condition on s3:prefix). Also allow s3:GetObject only on arn:aws:s3:::customer-data/exports/acme/*.

C.Allow s3:ListBucket only on arn:aws:s3:::customer-data/exports/acme/* and allow s3:GetObject on arn:aws:s3:::customer-data/*.

D.Add a Deny statement for s3:GetObject outside arn:aws:s3:::customer-data/exports/acme/*, but keep s3:ListBucket unrestricted on arn:aws:s3:::customer-data.

AnswerB

ListBucket must be authorized at the bucket ARN level, then scoped using a Condition on the request prefix (so only the approved listing prefix is allowed). GetObject is authorized at the object ARN level and is restricted to exports/acme/*, preventing reads outside the prefix.

Why this answer

Option B is correct because it uses an s3:prefix condition with StringEquals on the ListBucket action to restrict listing to exactly 'exports/acme/', preventing the role from enumerating other tenants' objects. It also restricts GetObject to the same prefix using a resource ARN of arn:aws:s3:::customer-data/exports/acme/*, ensuring least privilege for both read operations. This combination enforces the principle of least privilege by scoping both actions to the specific tenant prefix.

Exam trap

The trap here is that candidates often confuse bucket-level actions (like s3:ListBucket) with object-level actions (like s3:GetObject), incorrectly applying resource ARNs with key prefixes to ListBucket, or forgetting that a condition on s3:prefix is required to scope listing to a specific prefix.

How to eliminate wrong answers

Option A is wrong because it leaves s3:ListBucket unrestricted on the bucket ARN, which still allows the role to list objects under any prefix (e.g., other tenants' exports), violating least privilege. Option C is wrong because s3:ListBucket cannot be granted on a resource ARN with a key prefix (e.g., arn:aws:s3:::customer-data/exports/acme/*); ListBucket is a bucket-level action and must target the bucket ARN, not an object path. Option D is wrong because it keeps s3:ListBucket unrestricted, allowing the role to list all object keys in the bucket, and a Deny statement for GetObject outside the prefix does not prevent listing other tenants' export keys.
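
The policy shape described for Option B can be sketched as a small Python helper that builds the JSON document. This is a minimal sketch, not the exam's reference policy: the function name `build_tenant_policy` and the `Sid` values are assumptions made for illustration, while the bucket name and prefix come from the question.

```python
import json

BUCKET = "customer-data"   # bucket name from the question
PREFIX = "exports/acme/"   # tenant prefix from the question


def build_tenant_policy(bucket: str, prefix: str) -> dict:
    """Return an identity policy scoped to one prefix for listing and reading."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ListOnlyTenantPrefix",  # hypothetical Sid
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                # ListBucket is a bucket-level action: target the bucket ARN.
                "Resource": f"arn:aws:s3:::{bucket}",
                # Allow the call only when the requested prefix matches exactly.
                "Condition": {"StringEquals": {"s3:prefix": prefix}},
            },
            {
                "Sid": "ReadOnlyTenantObjects",  # hypothetical Sid
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # GetObject is an object-level action: target object ARNs.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = build_tenant_policy(BUCKET, PREFIX)
print(json.dumps(policy, indent=2))
```

Because the condition uses StringEquals, a ListBucket request with any other prefix (or with no prefix) is not allowed by this statement.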

119

MCQeasy

An internal web application is exposed through an Application Load Balancer (ALB). The ALB currently has only an HTTP listener on port 80. Security requires that all client traffic be encrypted in transit. What is the best next step?

A.Enable S3 bucket encryption for application files, since it ensures encryption in transit.

B.Configure an ALB HTTPS listener on port 443 using an ACM certificate, and redirect HTTP (80) to HTTPS (443).

C.Turn on default encryption for CloudFront origin access, which automatically encrypts all ALB traffic.

D.Add KMS permissions to the ALB role so TLS is enabled automatically.

AnswerB

An HTTPS listener terminates TLS at the ALB, encrypting traffic in transit. Redirecting HTTP to HTTPS ensures clients use TLS for all requests.

Why this answer

Option B is correct because the requirement to encrypt all client traffic in transit is met by adding an HTTPS listener on port 443 using an ACM certificate, which enables TLS encryption. Additionally, configuring a redirect from HTTP (port 80) to HTTPS (port 443) ensures that any client attempting to connect over unencrypted HTTP is automatically upgraded to HTTPS, enforcing encryption for all traffic.

Exam trap

The trap here is that candidates often confuse encryption at rest (e.g., S3 bucket encryption) with encryption in transit, or assume that enabling KMS or CloudFront settings automatically secures ALB traffic without explicit listener configuration.

How to eliminate wrong answers

Option A is wrong because S3 bucket encryption (e.g., SSE-S3 or SSE-KMS) protects data at rest, not data in transit, and does not affect ALB traffic encryption. Option C is wrong because CloudFront is not part of this architecture, and no CloudFront setting automatically encrypts traffic between clients and the ALB; encryption on that path requires an HTTPS listener on the ALB itself. Option D is wrong because an ALB does not use an IAM role or KMS permissions to enable TLS; the ALB must be explicitly configured with an HTTPS listener and a certificate (for example, from ACM).
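
The two listeners from Option B can be sketched as plain Python dictionaries shaped like the parameters of the Elastic Load Balancing CreateListener API. All ARNs below are hypothetical placeholders, and the 301 status code is an assumed choice; with boto3, each dictionary would typically be passed as keyword arguments to the `elbv2` client's `create_listener` call.

```python
# Hypothetical placeholder ARNs, for illustration only.
LOAD_BALANCER_ARN = "arn:aws:elasticloadbalancing:us-east-1:111122223333:loadbalancer/app/example/abc123"
TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:us-east-1:111122223333:targetgroup/example/def456"
CERTIFICATE_ARN = "arn:aws:acm:us-east-1:111122223333:certificate/example-id"

# HTTPS listener: terminates TLS at the ALB using the ACM certificate.
https_listener = {
    "LoadBalancerArn": LOAD_BALANCER_ARN,
    "Protocol": "HTTPS",
    "Port": 443,
    "Certificates": [{"CertificateArn": CERTIFICATE_ARN}],
    "DefaultActions": [{"Type": "forward", "TargetGroupArn": TARGET_GROUP_ARN}],
}

# HTTP listener: forwards nothing, only redirects clients to HTTPS.
http_redirect_listener = {
    "LoadBalancerArn": LOAD_BALANCER_ARN,
    "Protocol": "HTTP",
    "Port": 80,
    "DefaultActions": [
        {
            "Type": "redirect",
            "RedirectConfig": {
                "Protocol": "HTTPS",
                "Port": "443",
                "StatusCode": "HTTP_301",  # assumed: permanent redirect
            },
        }
    ],
}

print(https_listener["Protocol"], http_redirect_listener["DefaultActions"][0]["Type"])
```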

120

MCQmedium

A high-volume telemetry pipeline writes streaming click events that must be processed by multiple independent consumers. Which service is most appropriate?

A.Amazon Kinesis Data Streams

B.AWS DataSync

C.Amazon EBS

D.Amazon Route 53

AnswerA

Kinesis Data Streams supports high-throughput event ingestion with multiple consumers reading from the stream.

Why this answer

Amazon Kinesis Data Streams is the most appropriate service because it is designed for real-time streaming data ingestion and processing. It can capture and store terabytes of data per hour from hundreds of thousands of sources, such as click events, and allows multiple independent consumers to read and process the same stream concurrently using the Kinesis Client Library (KCL) or enhanced fan-out with dedicated throughput.

Exam trap

The trap here is that candidates may confuse Kinesis Data Streams with simpler messaging services like SQS or SNS, but the key differentiator is that Kinesis supports multiple independent consumers processing the same stream in real-time with replay capability, whereas SQS is designed for point-to-point message delivery and SNS for pub/sub with push-based fan-out.

How to eliminate wrong answers

Option B (AWS DataSync) is wrong because it is a data transfer service for moving large datasets between on-premises storage and AWS services, not for real-time streaming or multiple consumer processing. Option C (Amazon EBS) is wrong because it provides block-level storage volumes for EC2 instances, not a streaming data ingestion or processing capability. Option D (Amazon Route 53) is wrong because it is a DNS web service for domain name resolution and routing, not for handling streaming telemetry data.
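
The "multiple independent consumers" property can be illustrated with a toy in-memory model. This is not the Kinesis API: `ToyStream` and `ToyConsumer` are hypothetical classes that only show the idea that reading does not remove records and that each consumer keeps its own position.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStream:
    """Append-only log standing in for a single shard (not the Kinesis API)."""
    records: list = field(default_factory=list)

    def put(self, record: str) -> None:
        self.records.append(record)


@dataclass
class ToyConsumer:
    """Each consumer tracks its own read position (checkpoint)."""
    name: str
    position: int = 0

    def poll(self, stream: ToyStream, limit: int = 10) -> list:
        batch = stream.records[self.position:self.position + limit]
        self.position += len(batch)  # reading does not remove records
        return batch


stream = ToyStream()
for i in range(5):
    stream.put(f"click-{i}")

analytics = ToyConsumer("analytics")
fraud = ToyConsumer("fraud-detection")

analytics_batch = analytics.poll(stream)
fraud_batch = fraud.poll(stream, limit=2)
print(analytics_batch)
print(fraud_batch)
```

Both consumers see the same events from the start of the stream, which is the behavior that distinguishes a stream from a queue where a processed message is deleted.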

121

MCQmedium

A patient portal receives bursts of orders that sometimes overwhelm a downstream fulfilment service. The architecture must absorb spikes and retry processing without losing requests. Which service should be placed between the web tier and fulfilment workers?

A.AWS WAF

B.Amazon CloudFront

C.Amazon SQS queue

D.Amazon Route 53 weighted routing

AnswerC

SQS decouples producers and consumers, buffers bursts, and supports retries through visibility timeout and dead-letter queues.

Why this answer

Amazon SQS is the correct choice because it acts as a durable, highly available message buffer between the web tier and the fulfilment workers. It decouples the components, allowing the web tier to enqueue requests immediately without waiting for the downstream service, while the workers poll and process messages at their own pace. SQS retains messages for 4 days by default (configurable up to 14 days). A message that is not deleted before its visibility timeout expires becomes visible again and is retried, and a dead-letter queue can capture messages that fail repeatedly, so requests are not lost during spikes.

Exam trap

The trap here is that candidates may confuse a content delivery or DNS routing service (like CloudFront or Route 53) with a message queue, failing to recognize that only a queue provides durable, asynchronous decoupling and retry capability for request processing.

How to eliminate wrong answers

Option A is wrong because AWS WAF is a web application firewall that filters HTTP/S traffic based on rules (e.g., SQL injection, XSS) and does not provide message buffering, retry logic, or decoupling for asynchronous processing. Option B is wrong because Amazon CloudFront is a content delivery network (CDN) that caches and accelerates static and dynamic content at edge locations; it cannot buffer or persist requests for downstream workers to process asynchronously. Option D is wrong because Route 53 weighted routing distributes DNS traffic across multiple endpoints based on weights, but it operates at the DNS level and cannot absorb spikes or retry failed requests; it provides no queueing or persistence.

122

MCQeasy

A system uses multiple AWS Lambda functions behind different event sources. One Lambda occasionally spikes and causes other Lambdas to be throttled due to shared concurrency limits. Which setting best helps ensure the important Lambda keeps capacity during spikes?

A.Increase the function timeout so throttling is less likely.

B.Set Reserved Concurrency for the important Lambda function.

C.Enable Provisioned Concurrency for every Lambda in the account.

D.Reduce the number of IAM policies attached to the Lambda roles.

AnswerB

Reserved concurrency allocates a guaranteed amount of concurrent execution capacity to a specific Lambda. This prevents other functions from consuming all concurrency and throttling the important one. If the reserved limit is reached, only that function is throttled, isolating impact.

Why this answer

Reserved Concurrency guarantees that the important Lambda function always has a set number of concurrent executions available, preventing other functions from consuming the account-level concurrency pool and throttling it during spikes. This setting isolates the function's capacity from shared contention, ensuring its performance remains stable.

Exam trap

The trap here is confusing Provisioned Concurrency (which reduces cold starts) with Reserved Concurrency (which guarantees capacity), leading candidates to pick Option C even though it does not solve the throttling issue.

How to eliminate wrong answers

Option A is wrong because increasing the function timeout does not affect concurrency limits; it only extends the maximum execution duration, which could actually increase the chance of throttling by holding concurrency slots longer. Option C is wrong because Provisioned Concurrency exists to reduce cold-start latency by keeping execution environments initialized; enabling it for every function in the account adds cost, still counts toward the account concurrency quota, and does nothing to prioritize the important function over the others. Option D is wrong because reducing IAM policies affects permissions, not concurrency limits; it has no impact on Lambda's throttling behavior.
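
The effect of Reserved Concurrency on a shared pool can be sketched with simple arithmetic. This is a deliberately simplified toy model, not how Lambda schedules invocations: the account limit of 1000, the function names, and the first-come ordering are all assumptions for illustration.

```python
ACCOUNT_LIMIT = 1000  # hypothetical account-level concurrency limit


def allocate(requested: dict, reserved: dict, account_limit: int) -> dict:
    """Toy model: grant concurrency per function given reservations.

    Functions with a reservation draw only from their own reservation;
    the others share what is left, served in dict order until it runs out.
    """
    unreserved_pool = account_limit - sum(reserved.values())
    granted = {}
    for name, want in requested.items():
        if name in reserved:
            granted[name] = min(want, reserved[name])
        else:
            granted[name] = min(want, unreserved_pool)
            unreserved_pool -= granted[name]
    return granted


# The spiky function asks for more than the whole account limit.
requested = {"spiky": 1200, "important": 150}

without_reservation = allocate(requested, {}, ACCOUNT_LIMIT)
with_reservation = allocate(requested, {"important": 200}, ACCOUNT_LIMIT)
print(without_reservation)
print(with_reservation)
```

In this model the important function is starved without a reservation and fully served with one, while the spiky function is capped at the unreserved remainder.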

123

MCQeasy

A retail API uses EC2 instances behind an ALB. CPU is consistently high during peak traffic, and request latency rises. What should be configured? The architecture review board prefers a managed AWS-native control.

A.Auto Scaling policy based on an appropriate CloudWatch metric

B.S3 Object Lock

C.A VPC endpoint for CloudWatch only

D.Disable health checks

AnswerA

Auto Scaling adds capacity when load increases and removes it when load falls.

Why this answer

The correct answer is A because an Auto Scaling policy based on an appropriate CloudWatch metric (e.g., CPUUtilization or ALB RequestCountPerTarget) dynamically adds or removes EC2 instances to match demand, preventing sustained high CPU and rising latency. This is a managed, AWS-native control that aligns with the architecture review board's preference.

Exam trap

The trap here is that candidates may confuse operational controls (like health checks or VPC endpoints) with scaling mechanisms, or assume that disabling health checks somehow improves performance, when in fact it worsens reliability and latency.

How to eliminate wrong answers

Option B is wrong because S3 Object Lock is a data protection feature for Amazon S3 objects (write-once-read-many, WORM) and has no role in scaling compute resources or reducing request latency. Option C is wrong because a VPC endpoint for CloudWatch only provides private connectivity to CloudWatch APIs, not scaling or performance improvement for EC2 instances behind an ALB. Option D is wrong because disabling health checks would cause the ALB to continue routing traffic to unhealthy instances, increasing latency and potentially causing failures, which is the opposite of a high-performance architecture.
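
A target tracking policy on average CPU is one way to realize Option A. The dictionary below is shaped like the parameters of the EC2 Auto Scaling PutScalingPolicy API; the group name, policy name, and 50 percent target are assumptions. The helper `estimate_desired_capacity` is a rough proportional approximation added for intuition only, not the algorithm AWS actually runs.

```python
import math

scaling_policy = {
    "AutoScalingGroupName": "retail-api-asg",    # hypothetical name
    "PolicyName": "keep-average-cpu-near-50",    # hypothetical name
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization",
        },
        "TargetValue": 50.0,  # assumed target; tune per workload
    },
}


def estimate_desired_capacity(current_instances: int,
                              current_cpu: float,
                              target_cpu: float) -> int:
    """Rough estimate: scale instance count in proportion to the metric."""
    return math.ceil(current_instances * current_cpu / target_cpu)


target = scaling_policy["TargetTrackingConfiguration"]["TargetValue"]
desired = estimate_desired_capacity(current_instances=4, current_cpu=80.0, target_cpu=target)
print(desired)
```

With four instances at 80 percent average CPU and a 50 percent target, the estimate rounds up to seven instances.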

124

Multi-Selecthard

A latency-sensitive mobile game backend uploads large files to S3 from users around the world. Which two features can improve upload performance? The team wants the control to be enforceable during normal operations.

Select 2 answers

A.S3 Object Lock

B.S3 multipart upload

C.S3 Inventory

D.S3 Transfer Acceleration

AnswersB, D

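Multipart upload parallelizes large object upload parts and improves reliability. S3 Transfer Acceleration routes uploads through the nearest edge location and over the AWS network, which speeds up long-distance transfers.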

Why this answer

S3 multipart upload is correct because it allows large files to be uploaded as independent parts in parallel, improving throughput and letting a failed part be retried without restarting the whole upload. S3 Transfer Acceleration is correct because users around the world send data to a nearby edge location, and the data then travels to the bucket over the AWS network rather than the public internet. Together they address both large object size and long client-to-bucket distance for a latency-sensitive mobile game backend.

Exam trap

The trap here is that candidates might dismiss S3 Transfer Acceleration as requiring major client changes, when it only needs to be enabled on the bucket and addressed through the accelerate endpoint, or might pick S3 Object Lock or S3 Inventory, which provide retention (WORM) and object reporting respectively and have no effect on upload performance.
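
How a large object is split for multipart upload can be sketched as a part-planning function. This is a minimal sketch: `plan_parts` and the 8 MiB default part size are assumptions, while the 5 MiB minimum part size (for every part except the last) and the 10,000-part maximum are S3 limits. The actual upload calls are omitted.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per upload


def plan_parts(object_size: int, part_size: int = 8 * MIB) -> list:
    """Return (part_number, start_byte, end_byte_exclusive) tuples."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")
    parts = []
    start = 0
    number = 1
    while start < object_size:
        end = min(start + part_size, object_size)
        parts.append((number, start, end))
        start = end
        number += 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")
    return parts


parts = plan_parts(20 * MIB)
for part in parts:
    print(part)
```

Each tuple describes an independent byte range, so the parts can be sent concurrently and a failed part can be retried on its own.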

125

MCQeasy

An engineering team deploys a stateless web API on EC2 using an Auto Scaling group and an Application Load Balancer (ALB). During a recent test, they noticed that when one Availability Zone was unavailable, traffic failed until new instances were manually launched. Which change most directly improves automatic failover for the compute layer within a single Region?

A.Place the Auto Scaling group in only one subnet so instance launches are simpler.

B.Ensure the ALB and Auto Scaling group span multiple subnets in at least two Availability Zones.

C.Increase the target group deregistration delay to allow old instances to stay longer.

D.Use a Network Load Balancer, but keep all subnets in a single Availability Zone.

AnswerB

Spreading the ALB and Auto Scaling group across at least two AZs provides redundant capacity. If one AZ fails, the ALB continues routing to healthy targets in the other AZ.

Why this answer

Option B is correct because an Application Load Balancer (ALB) and Auto Scaling group must span multiple subnets in at least two Availability Zones (AZs) to provide automatic failover. When one AZ becomes unavailable, the ALB automatically reroutes traffic to healthy targets in the remaining AZs, and the Auto Scaling group can launch replacement instances in the surviving AZs. This architecture ensures that the compute layer remains available without manual intervention.

Exam trap

The trap here is that candidates often think a single-AZ deployment with a load balancer provides failover, but without multiple AZs, the load balancer itself becomes a single point of failure and cannot reroute traffic when the AZ goes down.

How to eliminate wrong answers

Option A is wrong because placing the Auto Scaling group in only one subnet (single AZ) eliminates redundancy; if that AZ fails, all instances become unreachable and no automatic failover is possible. Option C is wrong because increasing the target group deregistration delay only gives in-flight requests more time to complete on targets that are being deregistered; it does not provide failover when an AZ becomes unavailable. Option D is wrong because using a Network Load Balancer (NLB) in a single AZ still creates a single point of failure; the NLB cannot route traffic to healthy targets in other AZs if the only AZ is down, and it does not improve failover over an ALB in this scenario.

126

Multi-Selecthard

A payments API requires point-in-time recovery and accidental-delete protection for a DynamoDB table. Which two settings should the architect enable? The team wants the control to be enforceable during normal operations.

Select 2 answers

A.Deletion protection or tightly controlled delete permissions

B.Point-in-time recovery

C.Global secondary indexes

D.DAX

AnswersA, B

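Deletion protection and least-privilege controls reduce accidental table removal risk. Point-in-time recovery provides continuous backups so the table can be restored to an earlier point in time.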

Why this answer

Deletion protection (Option A) prevents accidental table deletion by rejecting DeleteTable requests while it is enabled, which is enforceable during normal operations. Point-in-time recovery (Option B) enables continuous backups, allowing restoration to any second within the recovery window of up to 35 days. Together, they satisfy the requirements for accidental-delete protection and point-in-time recovery.

Exam trap

The trap here is that candidates often confuse point-in-time recovery with backup solutions like AWS Backup or assume that GSIs or DAX provide data protection, when in fact they serve entirely different purposes (performance optimization and caching).
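
The two settings can be sketched as request parameters for the DynamoDB UpdateContinuousBackups and UpdateTable APIs. The table name is a hypothetical placeholder; with boto3, these dictionaries would typically be passed as keyword arguments to the DynamoDB client's `update_continuous_backups` and `update_table` calls.

```python
TABLE_NAME = "payments"  # hypothetical table name

# Parameters for UpdateContinuousBackups: turns on point-in-time recovery.
enable_pitr_request = {
    "TableName": TABLE_NAME,
    "PointInTimeRecoverySpecification": {"PointInTimeRecoveryEnabled": True},
}

# Parameters for UpdateTable: turns on deletion protection for the table.
enable_deletion_protection_request = {
    "TableName": TABLE_NAME,
    "DeletionProtectionEnabled": True,
}

print(enable_pitr_request)
print(enable_deletion_protection_request)
```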

127

MCQmedium

A CI/CD system creates an IAM role (CICDRole) used for deployments. Your organization uses IAM permission boundaries to prevent developers from granting themselves higher privileges. After an incident, you discover that CICDRole can perform unintended IAM actions because the role’s identity policy includes broad permissions. Which change most directly ensures permission boundaries continue to restrict CICDRole regardless of what is later added to the role’s identity policies?

A.Remove the permission boundary from CICDRole so that only the identity policy controls access.

B.Ensure CICDRole is created with the required permissions boundary ARN, and verify that the boundary policy does not allow the unintended IAM actions.

C.Add an identity-policy deny for iam:CreatePolicy and iam:UpdateRole on all resources.

D.Rely on CloudTrail alerts to stop deployments from performing IAM changes after the fact.

AnswerB

Permission boundaries cap the maximum effective permissions for the role by intersecting the identity policy and the permissions boundary at authorization time. Even if the identity policy later expands, the boundary still prevents actions not allowed by the boundary policy, providing deterministic enforcement against privilege escalation.

Why this answer

Option B is correct because an IAM permissions boundary defines the maximum permissions that an IAM role can have, regardless of what is later added to its identity-based policies. An action is allowed only if both the identity policy and the boundary allow it, so when CICDRole is created with a boundary that does not allow the unintended IAM actions, those actions stay blocked even if broad permissions are later added to the role's identity policy. This directly addresses the requirement to prevent privilege escalation through policy modifications.

Exam trap

The trap here is that candidates often think adding deny statements to the identity policy is sufficient. An explicit deny does override any allow, but it covers only the actions it lists, and it lives in the same identity policies that can be edited later; the permissions boundary caps whatever those policies grant.

How to eliminate wrong answers

Option A is wrong because removing the permission boundary eliminates the only mechanism that caps the role's maximum permissions, allowing any broad identity policy to grant unintended IAM actions without restriction. Option C is wrong because adding a deny for iam:CreatePolicy and iam:UpdateRole does not prevent the role from using other IAM actions like iam:PassRole or iam:AttachRolePolicy that could still lead to privilege escalation; it is an incomplete fix that does not address the root cause of broad permissions. Option D is wrong because relying on CloudTrail alerts is a detective control, not a preventive one; it only notifies after the fact, allowing unauthorized IAM actions to occur before any response can be taken.
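
The intersection behavior of a permissions boundary can be illustrated with set operations. This is a simplified sketch, not the full IAM evaluation logic: it ignores wildcards, resources, conditions, service control policies, resource-based policies, and session policies, and the action lists are hypothetical examples.

```python
def effective_permissions(identity_allow: set, boundary_allow: set,
                          explicit_deny: frozenset = frozenset()) -> set:
    """Allowed actions = identity policy AND boundary, minus explicit denies."""
    return (identity_allow & boundary_allow) - explicit_deny


# Hypothetical boundary for a deployment role: no IAM actions are allowed.
boundary = {"s3:PutObject", "cloudformation:CreateStack", "ecs:UpdateService"}

identity_before = {"s3:PutObject", "ecs:UpdateService"}
# Someone later widens the identity policy with IAM actions.
identity_after = identity_before | {"iam:CreatePolicy", "iam:AttachRolePolicy"}

effective_before = effective_permissions(identity_before, boundary)
effective_after = effective_permissions(identity_after, boundary)
print(sorted(effective_before))
print(sorted(effective_after))
```

The effective set is the same before and after the identity policy is widened, because the added IAM actions are outside the boundary.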

128

MCQeasy

An inventory service exposes a static website from S3 and CloudFront. Users should still receive cached pages if the S3 origin has a short outage. Which feature helps most? The team wants the control to be enforceable during normal operations.

A.CloudFront caching with appropriate TTLs

B.AWS Backup Vault Lock

C.IAM Access Analyzer

D.S3 Select

AnswerA

CloudFront can serve cached content from edge locations when the origin is temporarily unavailable.

Why this answer

CloudFront caching with appropriate TTLs allows the distribution to serve cached content from edge locations even when the S3 origin is temporarily unavailable. With a sufficiently long default and maximum TTL (e.g., 86400 seconds), objects remain fresh in the edge caches, so CloudFront continues to respond to user requests with previously cached objects during a short origin outage without contacting S3. This feature is enforceable during normal operations because the TTL settings are configured in the CloudFront distribution's cache behavior and are always active, not just during failures.

Exam trap

The trap here is that candidates may confuse CloudFront's caching with origin failover or think that features like AWS Backup Vault Lock or IAM Access Analyzer can somehow enforce availability, when in fact only proper TTL configuration ensures cached content is served during an outage.

How to eliminate wrong answers

Option B (AWS Backup Vault Lock) is wrong because it is a data protection feature that prevents deletion or modification of backup vaults, not a mechanism to serve cached content during an origin outage. Option C (IAM Access Analyzer) is wrong because it analyzes resource-based policies to identify unintended access, not to control caching or origin failover behavior. Option D (S3 Select) is wrong because it is a query-in-place feature for filtering data within S3 objects, not a caching or availability mechanism for static websites.

129

MCQmedium

A fintech startup uses AWS to run a web API and a PostgreSQL database. They must meet an RPO of 15 minutes and an RTO of 2 hours for a Region-wide disaster. Budget allows running a small, always-on set of infrastructure in a secondary Region, but not full production capacity. The team wants a DR approach that is regularly testable without large manual effort. Which disaster recovery strategy is the best fit?

A.Pilot light: replicate databases and store backups, keep only minimal infrastructure in the secondary Region, and scale up fully during failover.

B.Warm standby: keep a scaled-down application environment and database replication active in the secondary Region, using automated failover controls.

C.Backup and restore only: rely on daily automated backups and restore into the secondary Region during an incident.

D.Multi-site active-active: run both Regions at full capacity and route live traffic to both simultaneously.

AnswerB

Warm standby aligns with moderate RTO requirements by having ready-to-run resources plus continuous replication to meet the RPO target during failover.

Why this answer

Warm standby (B) is the best fit because it maintains a scaled-down but fully functional application environment in the secondary Region with active database replication, meeting the RPO of 15 minutes via continuous asynchronous replication (e.g., PostgreSQL streaming replication or AWS DMS with ongoing replication), provided replication lag stays under the target. Automated failover controls (e.g., Route 53 health checks and Lambda automation) can achieve the RTO of 2 hours by scaling up the standby environment, and the always-on infrastructure allows regular testing of the failover process with little manual effort.

Exam trap

The trap here is that candidates confuse 'pilot light' with 'warm standby' because both involve a secondary Region with minimal resources, but pilot light lacks pre-provisioned application servers and automated failover, making it unsuitable for the stated RTO and testability requirements.

How to eliminate wrong answers

Option A is wrong because pilot light keeps only minimal infrastructure (e.g., database replicas and no application servers) and requires manual or scripted scaling during failover, which risks exceeding the 2-hour RTO due to provisioning delays and lacks the automated failover controls needed for regular testing. Option C is wrong because backup and restore relies on daily backups, which cannot meet the 15-minute RPO (backups are typically taken every 24 hours) and restoring from backups into a secondary Region often takes longer than 2 hours due to data transfer and recovery time. Option D is wrong because multi-site active-active requires full production capacity in both Regions, which exceeds the budget constraint of running only a small, always-on set of infrastructure in the secondary Region.

130

MCQmedium

A read-heavy document portal repeatedly queries the same product catalogue data from DynamoDB with millisecond latency requirements. Which service can reduce read latency and table load?

A.Amazon Kinesis Data Firehose

B.S3 Transfer Acceleration

C.DynamoDB Accelerator (DAX)

D.AWS Glue Data Catalog

AnswerC

DAX is an in-memory cache for DynamoDB that reduces read latency for suitable access patterns.

Why this answer

DynamoDB Accelerator (DAX) is a fully managed, in-memory cache for Amazon DynamoDB that delivers up to 10x read performance improvement by caching frequently accessed data. For a read-heavy workload querying the same product catalogue data, DAX reduces read latency to microseconds and offloads read requests from the DynamoDB table, lowering consumed read capacity units and table load.

Exam trap

The trap here is that candidates often confuse caching services (DAX) with data transfer acceleration (S3 Transfer Acceleration) or data ingestion (Kinesis Data Firehose), failing to recognize that the core requirement is to reduce DynamoDB read latency and table load, which only a dedicated in-memory cache like DAX can achieve.

How to eliminate wrong answers

Option A is wrong because Amazon Kinesis Data Firehose is a streaming data ingestion service for loading data into data stores or analytics tools, not a caching layer for DynamoDB read operations. Option B is wrong because S3 Transfer Acceleration speeds up uploads and downloads to Amazon S3 over long distances using AWS edge locations, but it does not cache DynamoDB data or reduce read latency for DynamoDB queries. Option D is wrong because AWS Glue Data Catalog is a metadata repository for data assets used in ETL and analytics, not a cache for DynamoDB read requests.

131

Multi-Selectmedium

A company runs a customer portal on self-managed PostgreSQL on EC2, plus a self-managed RabbitMQ cluster for asynchronous work that only requires durable queueing and does not depend on RabbitMQ-specific exchange features. The operations team spends a lot of time patching, backing up, and scaling both systems. The business wants to reduce infrastructure management overhead and total cost of ownership. Which two changes are the best fit? Select two.

Select 2 answers

A.Migrate PostgreSQL to Amazon RDS for PostgreSQL.

B.Replace RabbitMQ with Amazon SQS for asynchronous message handling.

C.Move PostgreSQL to Amazon DynamoDB without redesigning the application.

D.Replace RabbitMQ with another EC2-based broker cluster for more control.

E.Keep the same design and increase instance sizes to simplify maintenance.

AnswersA, B

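Amazon RDS removes much of the undifferentiated heavy lifting for backups, patching, and high availability management. For a standard relational database workload, RDS usually lowers operational effort and total cost of ownership compared with a self-managed PostgreSQL deployment on EC2. Amazon SQS is a fully managed queue service, so replacing RabbitMQ with it removes broker patching, backup, and scaling work for a workload that only needs durable queueing.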

Why this answer

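Option A is correct because migrating PostgreSQL to Amazon RDS for PostgreSQL offloads patching, backups, and scaling to AWS, reducing operational overhead. RDS automates OS and database patching, provides automated backups with point-in-time recovery, and supports vertical scaling with minimal downtime. This directly addresses the team's time spent on maintenance while lowering TCO compared to self-managed EC2. Option B is correct because the asynchronous work only requires durable queueing and does not depend on RabbitMQ-specific exchange features, so Amazon SQS can replace the self-managed cluster with no servers to patch, back up, or scale.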

Exam trap

The trap here is that candidates may think DynamoDB can replace any database to reduce costs, ignoring that it requires application redesign and cannot directly substitute a relational database like PostgreSQL.

132

Multi-Selectmedium

A startup runs two EC2-based workloads in the same AWS Region. Its customer-facing API is always on, and its nightly video transcoding fleet can restart jobs from checkpoints if an instance is interrupted. The finance team wants the lowest monthly compute cost without changing the application design. Which two actions should the team take? Select two.

Select 2 answers

A.Purchase an All Upfront Reserved Instance for the transcoding fleet only.

B.Buy a Compute Savings Plan to cover the always-on API baseline usage.

C.Run the transcoding fleet on Spot Instances because interrupted jobs can resume from checkpoints.

D.Increase the API instance size so CPU utilization stays below 30 percent.

E.Move the API tier to Dedicated Hosts to improve isolation and lower spend.

AnswersB, C

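Savings Plans reduce cost for consistent compute usage and are well suited to the always-on API. Spot Instances offer the deepest discounts for interruption-tolerant work such as the checkpointed transcoding fleet.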

Why this answer

Option B is correct because a Compute Savings Plan offers the lowest cost for steady-state workloads like the always-on API by committing to a consistent amount of compute usage (measured in $/hour) across any EC2 instance family, size, or Region, providing up to 66% savings compared to On-Demand. Option C is correct because the transcoding fleet can tolerate interruptions (jobs restart from checkpoints), making it a good fit for Spot Instances at discounts of up to 90% compared to On-Demand, which directly minimizes compute cost without architectural changes.

Exam trap

The trap here is that candidates often choose Reserved Instances for the transcoding fleet (Option A) because RIs are a familiar way to cut EC2 cost, but the fleet runs only at night and tolerates interruption, so Spot Instances are far cheaper for it; the always-on API baseline is covered more flexibly by a Compute Savings Plan (Option B) than by an instance-specific RI.

133

MCQhard

A claims workflow uses Amazon SQS. Poison messages are repeatedly failing and blocking useful retries. What should the architect configure?

A.A FIFO queue without a redrive policy

B.Short polling instead of long polling

C.A dead-letter queue with an appropriate maxReceiveCount

D.A larger message retention period only

AnswerC

A DLQ isolates messages that fail repeatedly so they can be investigated without disrupting normal processing.

Why this answer

Option C is correct because a dead-letter queue (DLQ) with an appropriate maxReceiveCount allows messages that repeatedly fail processing to be moved out of the source queue after a specified number of receive attempts. This prevents poison messages from blocking the queue and consuming retry capacity, enabling the workflow to continue processing valid messages without interruption.

Exam trap

The trap here is that candidates often confuse increasing the retention period or changing polling behavior with solving poison message issues, when the correct solution is to use a dead-letter queue with a maxReceiveCount to isolate failing messages.

How to eliminate wrong answers

Option A is wrong because a FIFO queue without a redrive policy does not automatically handle poison messages; without a DLQ, failed messages remain in the queue and continue to block retries. Option B is wrong because short polling only changes how ReceiveMessage queries the queue: it returns immediately, often with empty responses, which can increase cost, and it does nothing to stop a message from failing repeatedly. Option D is wrong because increasing the message retention period only keeps messages in the queue longer; it does not remove or isolate poison messages, so they will continue to fail and block useful retries.
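
The redrive configuration behind Option C can be sketched as the `RedrivePolicy` queue attribute, which SQS expects as a JSON string. The DLQ ARN and the maxReceiveCount of 5 are hypothetical values, and `should_move_to_dlq` is an illustrative helper that mirrors the rule that a message is moved once its receive count exceeds maxReceiveCount.

```python
import json

DLQ_ARN = "arn:aws:sqs:us-east-1:111122223333:claims-dlq"  # hypothetical ARN
MAX_RECEIVE_COUNT = 5                                       # assumed threshold

redrive_policy = {
    "deadLetterTargetArn": DLQ_ARN,
    "maxReceiveCount": MAX_RECEIVE_COUNT,
}

# SQS queue attributes are strings, so the policy is serialized to JSON.
queue_attributes = {"RedrivePolicy": json.dumps(redrive_policy)}


def should_move_to_dlq(receive_count: int, max_receive_count: int) -> bool:
    """A message is redriven once it has been received more than the limit."""
    return receive_count > max_receive_count


print(queue_attributes["RedrivePolicy"])
print(should_move_to_dlq(6, MAX_RECEIVE_COUNT))
```

A lower maxReceiveCount isolates poison messages sooner, while a higher one gives transient failures more chances to succeed before the message is set aside.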

134

MCQmedium

An orders service publishes payment instructions to an Amazon SQS queue. After occasional processing timeouts, the downstream consumer sometimes processes the same instruction twice, resulting in duplicate payment attempts. The team currently uses an SQS Standard queue with a visibility timeout of 2 minutes and relies on the consumer to finish before the timeout expires. What approach best improves resilience against duplicate processing?

A.Decrease visibility timeout to 10 seconds so duplicates are less likely to occur.

B.Make the consumer idempotent using the order ID as a deduplication key, and set the visibility timeout longer than the worst-case processing time.

C.Use an EventBridge rule with a fixed retry policy that only retries when the payload matches exactly.

D.Enable a dead-letter queue (DLQ) only, without changing the queue type or consumer logic.

AnswerB

SQS Standard provides at-least-once delivery, so duplicates can still occur. The most resilient design is to make the payment handler idempotent so repeated deliveries do not create duplicate side effects, and to set the visibility timeout long enough to cover the worst-case processing time to reduce unnecessary re-delivery.

Why this answer

Option B is correct because making the consumer idempotent using the order ID as a deduplication key ensures that even if the same message is processed multiple times, the downstream system will only apply the payment once. Setting the visibility timeout longer than the worst-case processing time keeps the message from becoming visible again before the consumer finishes, which removes the most common cause of re-delivery in this scenario. Because a Standard queue can still deliver a message more than once, idempotency is what makes any remaining duplicates harmless.
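A minimal in-memory sketch of the idempotent consumer idea follows. The class and field names are hypothetical, and a real consumer would keep the processed IDs in a durable store with an atomic check-and-set rather than a Python set.

```python
from typing import Dict, Set

class PaymentHandler:
    """In-memory stand-in for an idempotent consumer (hypothetical names)."""

    def __init__(self) -> None:
        self.processed_order_ids: Set[str] = set()
        self.payments_made: int = 0

    def handle(self, message: Dict[str, str]) -> bool:
        """Apply the payment once per order ID; return True if work was done."""
        order_id = message["order_id"]
        if order_id in self.processed_order_ids:
            return False  # duplicate delivery: acknowledge without side effects
        self.payments_made += 1
        self.processed_order_ids.add(order_id)
        return True

handler = PaymentHandler()
first = handler.handle({"order_id": "order-42", "amount": "19.99"})
second = handler.handle({"order_id": "order-42", "amount": "19.99"})  # redelivery
```

The second call is treated as a redelivery and leaves the payment count unchanged, which is the behavior the answer relies on.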

Exam trap

The trap here is that candidates often think reducing the visibility timeout or adding a DLQ alone solves duplicates, but they overlook that Standard queues inherently allow at-least-once delivery, so idempotency is the only reliable solution.

How to eliminate wrong answers

Option A is wrong because decreasing the visibility timeout to 10 seconds would increase the likelihood of duplicates by making the message reappear sooner if the consumer takes longer than 10 seconds, exacerbating the timeout issue. Option C is wrong because an EventBridge rule with a fixed retry policy does not address duplicate processing; EventBridge is an event bus service, not a queue, and its retry policy cannot prevent duplicate delivery from SQS. Option D is wrong because enabling only a DLQ without changing the queue type or consumer logic does not prevent duplicates; a DLQ captures failed messages but does not make the consumer idempotent or adjust visibility timeout to avoid reprocessing.

135

Drag & Dropmedium

Order the steps for setting up an Application Load Balancer with an EC2 target group.

Why this order

ALB creation first, then target group, health checks, association, and testing.

136

MCQmedium

Based on the exhibit, why is the IAM role still receiving AccessDenied even though it has AdministratorAccess attached?

A.AdministratorAccess is always evaluated before SCPs, so the SCP is ignored in production accounts.

B.The SCP is acting as a maximum permission guardrail, so its explicit deny overrides the IAM allow.

C.The role needs a session duration of at least 12 hours before SCPs stop applying.

D.The account needs an AWS Config rule to approve the snapshot action before IAM can work.

AnswerB

SCPs set the outer boundary for permissions in an account or OU. They do not grant access, but they can block actions even when the IAM role has AdministratorAccess. The explicit deny in the SCP is therefore the reason CreateSnapshot fails. To allow the operation, the organization must change the SCP or move the account out of the restrictive scope.

Why this answer

B is correct because Service Control Policies (SCPs) act as a maximum permission guardrail in AWS Organizations. Even if an IAM role has the AdministratorAccess policy attached, an SCP with an explicit deny on the ec2:CreateSnapshot action will override that allow, resulting in an AccessDenied error. An explicit deny in any applicable policy, including an SCP, takes precedence over every allow, so no identity-based policy can override it.
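The interaction between the SCP and the identity policy can be sketched with a deliberately simplified model. This is not the full IAM evaluation logic: it matches exact action names only, ignores wildcards, conditions, and resource policies, and uses small sets as stand-ins for AdministratorAccess and the SCP allow list.

```python
from typing import Iterable

def is_allowed(action: str,
               identity_allows: Iterable[str],
               scp_allows: Iterable[str],
               scp_denies: Iterable[str]) -> bool:
    """Simplified model: exact action-name matching, no wildcards or conditions."""
    if action in set(scp_denies):
        return False      # an explicit deny always wins
    if action not in set(scp_allows):
        return False      # the SCP must also permit the action (guardrail)
    return action in set(identity_allows)

# Stand-ins for AdministratorAccess and a permissive SCP (assumptions).
admin_actions = {"ec2:CreateSnapshot", "ec2:DescribeSnapshots"}
scp_allows = set(admin_actions)
scp_denies = {"ec2:CreateSnapshot"}

snapshot_ok = is_allowed("ec2:CreateSnapshot", admin_actions, scp_allows, scp_denies)
describe_ok = is_allowed("ec2:DescribeSnapshots", admin_actions, scp_allows, scp_denies)
```

In this model the snapshot call is rejected despite the broad identity allow, while the describe call still succeeds.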

Exam trap

The trap here is that candidates often assume AdministratorAccess grants full permissions unconditionally, forgetting that SCPs can impose a higher-level deny that overrides any IAM allow, especially in AWS Organizations.

How to eliminate wrong answers

Option A is wrong because AdministratorAccess cannot bypass SCPs; a request must be permitted by both the SCP and the identity-based policy, and an explicit deny in an SCP overrides any IAM allow. Option C is wrong because session duration has no effect on SCP evaluation; SCPs apply regardless of session length, and there is no 12-hour threshold. Option D is wrong because AWS Config rules are used for compliance and resource auditing, not for approving API actions; they do not affect IAM authorization or SCP evaluation.

137

MCQhard

Based on the exhibit, which change best reduces latency during peak traffic without overprovisioning the fleet?

A.Replace the instances with a larger instance family so each server has more headroom.

B.Change the Auto Scaling policy to target tracking on ALB RequestCountPerTarget.

C.Use scheduled scaling to add instances only during the business hours peak window.

D.Replace the ALB with a Network Load Balancer to reduce request latency.

AnswerB

RequestCountPerTarget matches the actual demand reaching each instance and scales capacity before the thread pool saturates. Because CPU is still low, CPU-based scaling would react too late or not at all. Target tracking on request count helps keep queue depth and latency down while avoiding unnecessary overprovisioning during quieter periods.

Why this answer

Option B is correct because using a target tracking scaling policy on ALB RequestCountPerTarget dynamically adjusts the fleet size based on the actual load per instance, ensuring that capacity scales with demand during peak traffic without manual intervention or overprovisioning. This approach directly addresses latency caused by high request rates per instance by maintaining a target request count, which reduces response time without adding unnecessary instances during off-peak periods.
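To make the policy concrete, here is a hedged sketch of a target tracking configuration for the `ALBRequestCountPerTarget` predefined metric, plus a rough capacity estimate. The resource label and the target of 1,000 requests are made-up values, and the estimate is only an approximation of what the scaling service works toward.

```python
import math

def target_tracking_config(resource_label: str, target_requests: float) -> dict:
    """Build a TargetTrackingConfiguration for ALBRequestCountPerTarget."""
    return {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ALBRequestCountPerTarget",
            "ResourceLabel": resource_label,
        },
        "TargetValue": float(target_requests),
    }

def instances_needed(total_requests: float, target_requests: float) -> int:
    """Rough estimate: enough instances to keep each at or under the target."""
    return max(1, math.ceil(total_requests / target_requests))

# Hypothetical label in the app/<alb>/<id>/targetgroup/<tg>/<id> format.
label = "app/my-alb/0123456789abcdef/targetgroup/my-targets/fedcba9876543210"
config = target_tracking_config(label, 1000)
peak = instances_needed(12000, 1000)
quiet = instances_needed(2500, 1000)
```

The same target value yields a larger fleet at peak and a smaller one in quiet periods, which is how the policy avoids permanent overprovisioning.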

Exam trap

The trap here is that candidates confuse 'reducing latency' with 'improving network throughput' (Option D) or 'static capacity increases' (Option A), missing that dynamic scaling based on per-target request count directly addresses the latency caused by overloaded instances during peak traffic.

How to eliminate wrong answers

Option A is wrong because simply replacing instances with a larger family increases per-instance capacity but does not automatically scale the fleet; it leads to overprovisioning during low traffic and fails to adapt to variable peak loads, wasting cost and not reducing latency efficiently. Option C is wrong because scheduled scaling adds instances only during a fixed business hours window, which cannot handle unpredictable peak traffic spikes outside that window, leaving the fleet either under-provisioned or over-provisioned. Option D is wrong because replacing the ALB with a Network Load Balancer (NLB) reduces transport-layer latency but does not address the root cause of latency—high request load per target—and NLB lacks the application-layer metrics (like RequestCountPerTarget) needed for intelligent Auto Scaling based on request volume.

138

MCQmedium

A containerized web service on Amazon ECS reads a database password at startup. Today, the password is stored in a plain environment variable and updated manually. Auditors require that credentials: (1) are encrypted at rest using AWS-managed controls, (2) can be rotated without redeploying the task definition, and (3) are accessible only to the running task via least-privilege permissions. Which solution best meets these requirements?

A.Store the password in Systems Manager Parameter Store as a SecureString and grant the ECS task role GetParameter only for that parameter ARN. Have the application call GetParameter on each request or on a short refresh interval.

B.Store the password in AWS Secrets Manager. Configure rotation for the secret. Grant the ECS task role secretsmanager:GetSecretValue for only that secret ARN. Update the application to fetch the secret at runtime and cache it briefly.

C.Store the password in a local file within the container image and mount it as a Docker secret at build time to avoid environment variables.

D.Store the password in an S3 bucket with server-side encryption and allow all ECS tasks to read it using a broad IAM policy on the bucket prefix.

AnswerB

Secrets Manager provides encrypted-at-rest storage and supports managed rotation. ECS task roles provide least-privilege access without static keys. Fetching at runtime with brief caching supports rotation without redeploying the task definition.

Why this answer

Option B is correct because AWS Secrets Manager provides automatic rotation of secrets without redeploying the task definition, encryption at rest via AWS KMS, and fine-grained IAM permissions. By granting the ECS task role only `secretsmanager:GetSecretValue` for the specific secret ARN, the application can fetch the password at runtime, meeting all three audit requirements.
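The "fetch at runtime and cache briefly" part can be sketched as a small time-to-live cache. The class name, the five-minute default, and the injected `fetch` callable are assumptions; in a real task, `fetch` would wrap a Secrets Manager GetSecretValue call made with the task role.

```python
import time
from typing import Callable, Optional

class CachedSecret:
    """Cache a secret for a short TTL so a rotated value is picked up soon after."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()          # e.g. a GetSecretValue call
            self._expires_at = now + self._ttl
        return self._value
```

A short TTL bounds how long a task keeps using a rotated-out password without requiring a new task definition; the right TTL depends on the rotation schedule.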

Exam trap

The trap here is that candidates often choose Parameter Store (Option A) because it is cheaper and can store SecureStrings, but they overlook the explicit requirement for automatic rotation without custom infrastructure, which Secrets Manager natively supports.

How to eliminate wrong answers

Option A is wrong because Systems Manager Parameter Store SecureString does not support native automatic rotation; rotation would require custom automation or Lambda, and calling GetParameter on every request, as the option proposes, is inefficient and not a best practice. Option C is wrong because storing the password in a local file within the container image at build time violates the requirement to rotate without redeploying the task definition, and the password is not encrypted at rest using AWS-managed controls (it is baked into the image). Option D is wrong because allowing all ECS tasks to read the password using a broad IAM policy on an S3 bucket prefix violates the least-privilege requirement, and S3 does not provide native secret rotation capabilities.

139

MCQmedium

A backend service in AWS uses an IAM role to upload large files to an S3 bucket using multipart upload. The upload typically succeeds, but it intermittently fails during cleanup with this error:

"AccessDenied: User is not authorized to perform: s3:AbortMultipartUpload"

The role identity policy currently allows only:

- s3:PutObject on arn:aws:s3:::my-bucket/uploads/*
- s3:ListBucket on arn:aws:s3:::my-bucket with a prefix condition

What is the best least-privilege change to fix the cleanup failure?

A.Add s3:AbortMultipartUpload for arn:aws:s3:::my-bucket/uploads/*.

B.Add s3:AbortMultipartUpload for arn:aws:s3:::my-bucket/*.

C.Add s3:ListBucket for arn:aws:s3:::my-bucket/uploads/* so the service can find parts to abort.

D.Add kms:Decrypt permissions for the KMS key used to encrypt objects in the bucket.

AnswerA

For multipart uploads, S3 clients use s3:AbortMultipartUpload to stop/cleanup an in-progress multipart upload (for example, when an upload fails or the client cancels). Granting s3:AbortMultipartUpload only on the uploads prefix matches the denied API in the symptom and keeps the permission scoped to the exact objects the service uploads.

Why this answer

The error occurs because the IAM role lacks permission to abort multipart uploads. Multipart uploads in S3 require s3:AbortMultipartUpload to clean up incomplete upload parts after a failure or interruption. Option A grants this permission on the specific uploads/* prefix, which is the least-privilege fix because it scopes the action to the exact path where the service uploads files.
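For illustration, the added statement might look like the following. The helper name is hypothetical; the bucket and prefix come from the question.

```python
import json

def abort_multipart_statement(bucket: str, prefix: str) -> dict:
    """Allow aborting multipart uploads only for keys under the given prefix."""
    return {
        "Effect": "Allow",
        "Action": "s3:AbortMultipartUpload",
        "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
    }

statement = abort_multipart_statement("my-bucket", "uploads/")
policy = {"Version": "2012-10-17", "Statement": [statement]}
print(json.dumps(policy, indent=2))
```

The resource is an object ARN pattern, matching the scope of the existing s3:PutObject permission rather than the whole bucket.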

Exam trap

The trap here is that candidates may confuse the need for s3:AbortMultipartUpload with other permissions like s3:ListBucket or KMS actions, or they may over-scope the permission to the entire bucket instead of the specific prefix.

How to eliminate wrong answers

Option B is wrong because it grants s3:AbortMultipartUpload on the entire bucket (/*), which is broader than necessary and violates least-privilege principles. Option C is wrong because s3:ListBucket is already allowed with a prefix condition; adding it again does not grant the missing abort permission, and listing parts requires s3:ListMultipartUploadParts, not s3:ListBucket. Option D is wrong because the error is an S3 access denied, not a KMS permission issue; KMS permissions are needed for encrypting/decrypting objects, not for aborting multipart uploads.

140

Multi-Selectmedium

A data-processing application runs in private subnets and needs to read objects from Amazon S3 and write items to Amazon DynamoDB. The team currently routes all outbound traffic through a NAT Gateway, and monthly networking charges are rising. Which two changes will most directly reduce cost while keeping traffic on the AWS network? Select two.

Select 2 answers

A.Add an S3 gateway VPC endpoint to the private route tables.

B.Add a DynamoDB gateway VPC endpoint to the private route tables.

C.Add a second NAT Gateway in another Availability Zone.

D.Move the instances into public subnets so they can reach AWS services directly.

E.Send private subnet traffic to S3 and DynamoDB through an Internet Gateway route.

AnswersA, B

An S3 gateway endpoint lets the instances reach S3 without sending traffic through the NAT Gateway. Gateway endpoints are the standard low-cost pattern for S3 access from private subnets and keep traffic on the AWS network.

Why this answer

Adding an S3 gateway VPC endpoint allows instances in private subnets to access S3 directly over the AWS network without traversing the NAT Gateway, eliminating NAT data processing charges for S3 traffic. A DynamoDB gateway endpoint does the same for the DynamoDB writes. Gateway endpoints carry no additional charge, so both changes reduce monthly networking costs while keeping traffic within the AWS backbone.
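A back-of-the-envelope sketch of the saving is shown below. The traffic volumes and the per-GB rate are placeholders chosen for illustration, not quoted AWS prices; actual rates vary by Region.

```python
def monthly_nat_processing_cost(gb_per_month: float, rate_per_gb: float) -> float:
    """Data processing charge for traffic that passes through a NAT gateway."""
    return gb_per_month * rate_per_gb

def savings_from_gateway_endpoints(s3_gb: float, dynamodb_gb: float,
                                   rate_per_gb: float) -> float:
    """Traffic moved to gateway endpoints no longer incurs NAT processing fees."""
    return monthly_nat_processing_cost(s3_gb + dynamodb_gb, rate_per_gb)

# Hypothetical volumes and rate, for illustration only.
estimated_savings = savings_from_gateway_endpoints(
    s3_gb=8000, dynamodb_gb=2000, rate_per_gb=0.045)
```

The NAT Gateway's hourly charge remains if other outbound traffic still needs it; only the per-GB portion for S3 and DynamoDB traffic goes away.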

Exam trap

The trap here is that candidates often think DynamoDB requires an interface endpoint (which incurs hourly charges) rather than a gateway endpoint, but DynamoDB supports both, and the gateway endpoint is the cost-optimal choice for reducing NAT Gateway traffic.

141

Multi-Selecthard

A platform team lets project administrators create IAM roles for workloads in their own AWS accounts, but every role must stay inside a fixed security baseline. The organization also wants to block all member accounts from using AWS Regions outside us-east-1 and us-west-2. Which three controls should be used? Select three.

Select 3 answers

A.Attach a permissions boundary to each role created through the delegation process.

B.Require iam:PermissionsBoundary in the role creation policy so every new role must include the approved boundary.

C.Use an SCP to deny actions in all AWS Regions except us-east-1 and us-west-2.

D.Grant AdministratorAccess to the project administrators and rely on later audits for enforcement.

E.Use an AWS Config rule alone to stop role creation if the permissions are too broad.

AnswersA, B, C

A permissions boundary caps the maximum permissions a created role can ever receive, even if an administrator later attaches broader policies. This is the right mechanism for a fixed security baseline on delegated role creation.

Why this answer

Option A is correct because attaching a permissions boundary to each role created through delegation ensures that even if a project administrator grants excessive permissions, the effective permissions are limited by the boundary. This enforces the fixed security baseline without preventing administrators from creating roles within those constraints.
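The other two selected controls (requiring the boundary at role creation, and the Region restriction) can be sketched as policy documents. Both helpers are hypothetical and simplified; in particular, a production Region-deny SCP usually exempts global services, which is omitted here.

```python
def create_role_with_boundary_statement(boundary_arn: str) -> dict:
    """Allow iam:CreateRole only when the approved boundary is attached."""
    return {
        "Effect": "Allow",
        "Action": "iam:CreateRole",
        "Resource": "*",
        "Condition": {"StringEquals": {"iam:PermissionsBoundary": boundary_arn}},
    }

def region_deny_scp(allowed_regions: list) -> dict:
    """Deny requests made to any Region outside the allowed list (simplified)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:RequestedRegion": list(allowed_regions)}
            },
        }],
    }

# Hypothetical boundary policy ARN, for illustration only.
delegation = create_role_with_boundary_statement(
    "arn:aws:iam::111122223333:policy/ApprovedBoundary")
scp = region_deny_scp(["us-east-1", "us-west-2"])
```

The first statement is preventive at creation time, while the boundary itself caps what the role can do afterward; the SCP applies to every principal in the member accounts.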

Exam trap

The trap here is that candidates often think a detective control like AWS Config is sufficient for enforcement, but the question requires preventive controls that block non-compliant actions before they occur.

142

MCQmedium

A production internal reporting portal runs continuously on EC2 with predictable usage for the next three years. The team wants a discount while retaining some instance-family flexibility. What should they buy?

A.Spot Instances only

B.Dedicated Instances

C.Compute Savings Plan

D.S3 Intelligent-Tiering

AnswerC

Compute Savings Plans provide discounts for a committed spend while allowing flexibility across instance families, sizes, Regions, and compute services.

Why this answer

A Compute Savings Plan offers discounted EC2 pricing (up to 66% off On-Demand) in exchange for a 1- or 3-year spend commitment, and it automatically applies to any EC2 instance family in any Region, giving the flexibility the team needs. Since the workload runs continuously with predictable usage for three years, this plan is ideal for reducing costs while retaining the ability to change instance families if needed.

Exam trap

The trap here is that candidates often confuse Compute Savings Plans with EC2 Instance Savings Plans, assuming any Savings Plan locks you to a specific instance family, but Compute Savings Plans provide broader flexibility across families and services.

How to eliminate wrong answers

Option A is wrong because Spot Instances are designed for fault-tolerant, interruptible workloads and can be terminated with a 2-minute notice, making them unsuitable for a continuously running production reporting portal. Option B is wrong because Dedicated Instances are physically isolated at the host hardware level and billed per instance, which does not inherently provide a discount and lacks the instance-family flexibility of a Savings Plan. Option D is wrong because S3 Intelligent-Tiering is a storage class for objects with changing access patterns, not a compute pricing model, and cannot be applied to EC2 instances.

143

MCQmedium

An orders system sends payment instructions to an Amazon SQS queue. The consumer sometimes times out after it has already created the payment record but before it deletes the SQS message. As a result, the same instruction can be processed more than once. Which design best ensures the consumer remains resilient and does not create duplicate payments when the same instruction is delivered multiple times?

A.Assume the consumer will always delete the SQS message in the same execution path, and ignore the timeout case.

B.Use idempotency: store a deterministic payment request identifier in a DynamoDB table and only create a payment when a conditional write indicates it was not processed before.

C.Switch to SQS Standard because it provides exactly-once delivery, so duplicates cannot happen.

D.Increase the consumer timeout and reduce the number of retries so that duplicates rarely occur.

AnswerB

Idempotency based on a stable identifier prevents duplicates by making processing repeatable and safely detectable.

Why this answer

Option B is correct because it implements idempotency using a DynamoDB table with a conditional write. By storing a deterministic payment request identifier (e.g., a hash of the message body) and only creating the payment if the conditional write succeeds (i.e., the identifier does not already exist), the consumer can safely process the same SQS message multiple times without creating duplicate payments. This pattern ensures resilience against the at-least-once delivery semantics of SQS and consumer timeouts that prevent message deletion.
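A sketch of the conditional write is shown below as the parameters for a DynamoDB PutItem request. The table name, attribute names, and the choice of a SHA-256 hash of the message body as the identifier are assumptions made for illustration.

```python
import hashlib

def payment_request_id(message_body: str) -> str:
    """Derive a deterministic identifier from the message body."""
    return hashlib.sha256(message_body.encode("utf-8")).hexdigest()

def conditional_put_params(table_name: str, message_body: str) -> dict:
    """Parameters for a PutItem that succeeds only for a new request ID."""
    return {
        "TableName": table_name,
        "Item": {
            "request_id": {"S": payment_request_id(message_body)},
            "payment_status": {"S": "CREATED"},
        },
        # The write is rejected if an item with this key already exists.
        "ConditionExpression": "attribute_not_exists(request_id)",
    }

params = conditional_put_params("payment-requests", '{"order": 42, "amount": "19.99"}')
```

When the condition fails, DynamoDB rejects the write with a conditional check failure, and the consumer can treat the message as already processed and delete it.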

Exam trap

The trap here is that candidates assume SQS FIFO queues provide exactly-once delivery, but the question specifies an SQS queue (likely Standard), and even FIFO queues only guarantee exactly-once processing within a limited deduplication window, not absolute idempotency; the correct solution is to make the consumer itself idempotent.

How to eliminate wrong answers

Option A is wrong because ignoring the timeout case violates the principle of designing for failure; SQS guarantees at-least-once delivery, and timeouts are a real-world occurrence that must be handled explicitly. Option C is wrong because SQS Standard does not provide exactly-once delivery; it offers at-least-once delivery, and duplicates can still occur due to network retries or consumer failures. Option D is wrong because increasing the consumer timeout and reducing retries only reduces the probability of duplicates but does not eliminate them, and it does not address the fundamental issue of at-least-once delivery semantics.

144

Multi-Selectmedium

A company is running a stateful web application on EC2 instances that processes user uploads. The architecture currently uses a Multi-AZ deployment for high availability. Which three cost-optimization strategies can be applied without sacrificing high availability? (Choose three.)

Select 3 answers

A.Replace Multi-AZ EC2 with a single-AZ deployment to reduce data transfer costs.

B.Use an Amazon ElastiCache cluster in a single Availability Zone to reduce costs.

C.Store uploaded files in Amazon S3 and use S3 Transfer Acceleration for uploads, then remove the local storage from EC2 instances.

D.Use an Application Load Balancer with cross-zone load balancing disabled to reduce inter-AZ data transfer costs.

E.Implement a NAT Gateway in each Availability Zone for outbound traffic to reduce data transfer costs.

F.Use EC2 Auto Scaling with a mixed instances policy that includes Spot Instances for the stateless web tier, while using On-Demand for the stateful tier.

Why this answer

Storing uploaded files in Amazon S3 and using S3 Transfer Acceleration for uploads removes the need for local storage on EC2 instances, reducing storage costs and eliminating the cost of managing EBS volumes. This strategy does not affect high availability because S3 is inherently highly available and durable across multiple Availability Zones. Transfer Acceleration optimizes upload speed over the public internet without impacting the availability of the application.

Exam trap

The trap here is that candidates may think disabling cross-zone load balancing reduces availability, but it actually preserves high availability as long as each AZ has enough instances to handle failover, while reducing inter-AZ data transfer costs.

145

MCQmedium

A company runs an application in private subnets (no inbound internet). The application must access Amazon S3 and AWS Secrets Manager endpoints without routing through the public internet and without exposing the instances to NAT gateways due to cost. Security requirements also state that only the required VPC traffic should be allowed to reach AWS services. Which architecture best satisfies these requirements?

A.Place instances in private subnets but use NAT gateways so traffic to S3 and Secrets Manager goes through the internet; restrict security groups to instance-to-instance only.

B.Add a VPC gateway endpoint for S3 and an interface VPC endpoint for Secrets Manager; keep instances in private subnets and configure security group rules attached to the endpoints to allow inbound traffic only from the application subnets.

C.Use public subnets with instances that have no security group rules; rely on AWS services to reject unauthorized traffic.

D.Create an S3 bucket policy that allows requests from the application instances’ private IP addresses and enable public access to Secrets Manager via the default service endpoint.

AnswerB

Gateway endpoints provide private routing to S3, and interface endpoints provide private access to Secrets Manager without internet traversal. Security group controls on interface endpoints restrict traffic to only the application subnets, meeting segmentation and cost constraints.

Why this answer

Option B is correct because it uses a VPC gateway endpoint for S3 and an interface VPC endpoint for Secrets Manager, both of which allow private subnet instances to access these AWS services without traversing the public internet or requiring a NAT gateway. The security group rules attached to the interface endpoint restrict inbound traffic to only the application subnets, satisfying the security requirement of allowing only required VPC traffic. This architecture avoids NAT gateway costs and keeps instances isolated from inbound internet traffic.
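The two endpoints differ mainly in how they attach to the VPC, which a sketch of the creation parameters makes visible. The helper names and all resource IDs are hypothetical.

```python
def gateway_endpoint_params(vpc_id: str, region: str, route_table_ids: list) -> dict:
    """S3 gateway endpoint: attached through route tables, no security group."""
    return {
        "VpcEndpointType": "Gateway",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "RouteTableIds": list(route_table_ids),
    }

def interface_endpoint_params(vpc_id: str, region: str, subnet_ids: list,
                              security_group_ids: list) -> dict:
    """Secrets Manager interface endpoint: ENIs in subnets, guarded by security groups."""
    return {
        "VpcEndpointType": "Interface",
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.secretsmanager",
        "SubnetIds": list(subnet_ids),
        "SecurityGroupIds": list(security_group_ids),
        "PrivateDnsEnabled": True,
    }

# Hypothetical IDs, for illustration only.
s3_endpoint = gateway_endpoint_params("vpc-0abc", "us-east-1", ["rtb-0111"])
sm_endpoint = interface_endpoint_params(
    "vpc-0abc", "us-east-1", ["subnet-0aaa", "subnet-0bbb"], ["sg-0endpoint"])
```

The security group on the interface endpoint is where inbound HTTPS from the application subnets would be allowed; the gateway endpoint has no security group and is scoped with route tables and an optional endpoint policy instead.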

Exam trap

The trap here is that candidates may assume all AWS service endpoints require a NAT gateway or internet gateway for private subnet access, overlooking the cost-effective and secure alternative of VPC endpoints (gateway and interface) that keep traffic within the AWS network.

How to eliminate wrong answers

Option A is wrong because it introduces NAT gateways, which incur cost and violate the requirement to avoid them, and traffic still routes through the public internet, failing the no-internet requirement. Option C is wrong because using public subnets with no security group rules exposes instances to inbound internet traffic, violating the private subnet and security requirements, and does not leverage VPC endpoints. Option D is wrong because enabling public access to Secrets Manager via the default service endpoint routes traffic over the internet, violating the no-internet requirement, and S3 bucket policies keyed on the instances' private IP addresses are ineffective: aws:SourceIp conditions match public source addresses and do not apply to requests that arrive through a VPC endpoint.

146

MCQhard

Based on the exhibit, the company has one shared S3 bucket for many internal teams. Security wants each team to access only its own prefix, ACLs must remain disabled, and the current bucket policy has become too large and error-prone. What is the best redesign?

A.Re-enable object ACLs and manage access by setting object-level ACLs for each team's prefix.

B.Split the bucket into one bucket per team and keep using a single shared bucket policy for all of them.

C.Create one S3 access point per team and attach an access point policy that limits that team to its own prefix.

D.Make the bucket public and issue presigned URLs for team access so IAM policies are no longer needed.

AnswerC

S3 access points are designed for simplifying access management to shared buckets. A separate access point per team keeps the bucket private, avoids ACLs, and lets each team have a smaller, easier-to-review policy boundary. This reduces the blast radius of a policy mistake and scales far better than a single giant bucket policy with many prefix rules.

Why this answer

Option C is correct because S3 Access Points allow you to create separate access points for each team, each with its own policy that restricts access to a specific prefix (e.g., s3://shared-bucket/team-a/). This eliminates the need for a large, error-prone bucket policy while keeping ACLs disabled, meeting the security requirement for per-team prefix isolation without modifying the underlying bucket configuration.
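A per-team access point policy might be generated along these lines. The function name, account ID, access point name, and role ARN are hypothetical, and the allowed actions are only an example set.

```python
def access_point_policy(region: str, account_id: str, access_point_name: str,
                        team_prefix: str, team_role_arn: str) -> dict:
    """Limit one team's role to objects under its own prefix via its access point."""
    ap_arn = f"arn:aws:s3:{region}:{account_id}:accesspoint/{access_point_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": team_role_arn},
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": f"{ap_arn}/object/{team_prefix}*",
        }],
    }

# Hypothetical values, for illustration only.
team_a_policy = access_point_policy(
    "us-east-1", "111122223333", "team-a-ap", "team-a/",
    "arn:aws:iam::111122223333:role/team-a-role")
```

Each team gets a short policy of this shape, and the bucket policy can then be reduced to delegating access control to the account's access points.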

Exam trap

The trap here is that candidates may think splitting into multiple buckets (Option B) is simpler, but they overlook that a single bucket policy cannot efficiently manage multiple buckets, and the requirement is to keep a shared bucket while avoiding a large bucket policy.

How to eliminate wrong answers

Option A is wrong because re-enabling object ACLs violates the requirement that ACLs must remain disabled, and managing access via object-level ACLs for each prefix would be complex and error-prone at scale. Option B is wrong because splitting into one bucket per team and using a single shared bucket policy does not solve the problem—each bucket would still need its own policy, and a single policy cannot effectively manage access across multiple buckets without becoming large and error-prone. Option D is wrong because making the bucket public is a severe security risk, and presigned URLs are intended for temporary, delegated access, not for ongoing team access management; IAM policies would still be needed to control who can generate presigned URLs.

147

Multi-Selectmedium

A company is deploying a serverless application using AWS Lambda functions that process credit card transactions. The application stores data in Amazon DynamoDB and sends notifications through Amazon SNS. Compliance requirements mandate that all data in transit and at rest is encrypted, and that no AWS Lambda function can access resources in other AWS accounts. Which three steps should be taken to meet these requirements? (Choose three.)

Select 3 answers

A.Configure DynamoDB to use AWS owned keys for encryption at rest.

B.Attach an IAM policy to the Lambda execution role that denies access to resources outside the account using a condition on aws:SourceAccount.

C.Enable encryption in transit for the Lambda function by using an ENI in a private subnet with a VPC endpoint for DynamoDB and SNS.

D.Use a VPC endpoint for DynamoDB and SNS, and ensure the Lambda function is configured to use the VPC.

E.Configure the Lambda function to use an IAM role that allows access to all accounts.

F.Use a security group to block all outbound traffic from the Lambda function to the internet.

Why this answer

The correct options enforce account isolation and encryption in transit. Option B uses an IAM policy with a condition key like `aws:SourceAccount` to explicitly deny any action where the resource ARN belongs to a different AWS account, preventing cross-account access. Option C enables encryption in transit by routing Lambda traffic through an Elastic Network Interface (ENI) in a private subnet, using VPC endpoints for DynamoDB (HTTPS) and SNS (HTTPS) so data never traverses the public internet.

Option D ensures the Lambda function is attached to the VPC, which is necessary for the VPC endpoints to be used; without this, traffic would still go over the public internet, breaking encryption-in-transit compliance.

Exam trap

AWS often tests the misconception that VPC endpoints alone guarantee encryption in transit, but candidates forget that the Lambda function must actually be configured to use the VPC (via `VpcConfig`) for the endpoints to be effective; otherwise, traffic still goes over the public internet.

148

Multi-Selecthard

A startup has an HTTP API with highly unpredictable traffic from mobile devices. Each request performs lightweight validation, writes an event record, and triggers downstream notifications. The current EC2 fleet stays mostly idle, and the team wants to reduce infrastructure management and pay only for usage. Which two changes best fit the requirement? Select two.

Select 2 answers

A.Place Amazon API Gateway in front of AWS Lambda functions for request handling.

B.Keep the EC2 fleet and add more instances so the idle cost is less noticeable.

C.Use Amazon SQS to buffer notification work and decouple it from the request path.

D.Move the API to an Application Load Balancer only, without changing compute.

E.Store runtime secrets in user data on each instance.

AnswersA, C

API Gateway plus Lambda fits spiky request traffic well because it removes server management and charges are based on actual use.

Why this answer

Option A is correct because Amazon API Gateway can directly invoke AWS Lambda functions, eliminating the need to manage EC2 instances. This serverless architecture scales automatically with unpredictable traffic, and you pay only for the requests and compute time consumed, which aligns with the startup's goal of reducing idle costs and infrastructure management.
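A minimal sketch of such a function is shown below, assuming a proxy-style integration where the request body arrives as a string in `event["body"]`. The `device_id` field and the injected `enqueue` callable (which would wrap an SQS send in a real deployment) are assumptions made for illustration.

```python
import json

def handler(event, context, enqueue=None):
    """Validate the request body and hand notification work to a queue."""
    try:
        body = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return {"statusCode": 400, "body": json.dumps({"error": "invalid JSON"})}
    if not isinstance(body, dict) or "device_id" not in body:
        return {"statusCode": 400,
                "body": json.dumps({"error": "device_id is required"})}
    if enqueue is not None:
        enqueue(json.dumps(body))   # e.g. wraps an SQS SendMessage call
    return {"statusCode": 202, "body": json.dumps({"accepted": True})}
```

Returning 202 after enqueueing keeps the request path short, with notification fan-out handled by whatever consumes the queue.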

Exam trap

The trap here is that candidates might think adding more EC2 instances (Option B) or using an ALB (Option D) solves the idle cost problem, but these options still require managing servers and incur idle costs, whereas serverless options (A and C) eliminate idle costs entirely.

149

MCQeasy

CloudWatch metrics show your EC2 instances have average CPU utilization around 10% with stable performance over several weeks. The application does not require additional headroom right now. What is the most effective cost-optimization action?

A. Right-size the instances to a smaller size that matches the observed utilization

B. Increase the Auto Scaling desired capacity to add more instances

C. Switch to Spot Instances immediately even though interruptions would impact users

D. Disable detailed monitoring to reduce CPU usage from the monitoring agent

Answer: A

Right-sizing reduces cost by matching instance capacity to actual demand. If average CPU is consistently low (around 10%) and performance is stable, it strongly indicates overprovisioning. Moving to a smaller instance size (for example, the next size down within the same family) typically lowers hourly cost while maintaining sufficient capacity for the workload.

Why this answer

Right-sizing EC2 instances to match observed utilization is the most effective cost-optimization action because the current instances are over-provisioned (average CPU at 10%). By selecting a smaller instance type that aligns with the actual workload, you reduce hourly costs without impacting performance, as the application has stable behavior and no need for headroom.
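The reasoning above can be expressed as a small check in Python. This is a sketch only: the function name, the 20% average and 40% peak thresholds, and the sample values are assumptions chosen for illustration, not AWS guidance. In practice the samples would come from CloudWatch `CPUUtilization` datapoints collected over several weeks.

```python
from statistics import mean


def is_rightsizing_candidate(cpu_samples, avg_threshold=20.0, peak_threshold=40.0):
    """Return True when both average and peak CPU stay under the thresholds."""
    if not cpu_samples:
        return False  # no data, so no basis for a decision
    return mean(cpu_samples) < avg_threshold and max(cpu_samples) < peak_threshold


# Made-up daily CPU averages (percent) resembling the scenario in the question.
weekly_cpu = [9.5, 10.2, 11.0, 9.8, 10.4, 10.1, 9.9]
print(is_rightsizing_candidate(weekly_cpu))  # prints True
```

Checking the peak as well as the average guards against downsizing an instance that is idle most of the time but saturates during short bursts.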

Exam trap

The trap here is that candidates may think increasing capacity (Option B) or switching to Spot Instances (Option C) is always cost-effective, but they fail to recognize that right-sizing is the foundational first step before scaling or using Spot, especially when current utilization is low and stable.

How to eliminate wrong answers

Option B is wrong because increasing the Auto Scaling desired capacity would add more instances, increasing cost without any performance need, and does not address the over-provisioning issue. Option C is wrong because the option itself states that Spot interruptions would impact users, so the workload is not interruption-tolerant and the savings would come at the expense of availability. Option D is wrong because detailed monitoring only changes how often EC2 publishes metrics to CloudWatch (1-minute instead of 5-minute periods). It is not an agent running on the instance, so disabling it does not free up CPU, and it would reduce the metric granularity that right-sizing decisions rely on.

150

Multi-Select (medium)

A customer portal must recover from a regional outage within a few hours. The business wants lower ongoing cost than a fully active second Region and does not want to rebuild everything from scratch during the outage. Which two DR patterns best fit that goal? Select two.

A. Backup and restore

B. Pilot light

C. Warm standby

D. Multi-site active-active

E. Single-AZ deployment

Answers: B, C

Pilot light keeps only core components running in the secondary Region, which lowers cost while reducing recovery time.

Why this answer

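Pilot light is correct because it maintains a minimal core infrastructure (e.g., database, networking) in the secondary Region that can be quickly scaled up during a disaster, meeting the recovery time objective (RTO) of a few hours while keeping ongoing costs lower than a fully active second Region. It avoids rebuilding everything from scratch by having critical data and configurations already in place, allowing compute resources to be launched on demand. Warm standby is correct for the same reason: a scaled-down but fully functional copy of the environment already runs in the secondary Region, so recovery means scaling it up rather than deploying it, and it still costs less than running a second Region at full capacity.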
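The elimination logic for this question can be sketched in Python as a filter over the DR patterns. This is a deliberately simplified model: the two boolean attributes and all names are assumptions made for illustration, and they encode only the two constraints stated in the question (nothing rebuilt from scratch, no fully active second Region). Single-AZ deployment is left out because it is not a cross-Region DR pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrPattern:
    name: str
    prestaged_in_second_region: bool  # core infrastructure already exists there
    fully_active_second_region: bool  # second Region runs at full capacity


PATTERNS = [
    DrPattern("Backup and restore", False, False),
    DrPattern("Pilot light", True, False),
    DrPattern("Warm standby", True, False),
    DrPattern("Multi-site active-active", True, True),
]


def matching_patterns(patterns):
    """Keep patterns that avoid a full rebuild and avoid full active-active cost."""
    return [
        p.name
        for p in patterns
        if p.prestaged_in_second_region and not p.fully_active_second_region
    ]


print(matching_patterns(PATTERNS))  # prints ['Pilot light', 'Warm standby']
```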

Exam trap

AWS often tests the boundaries between adjacent DR patterns. The trap here is that candidates may confuse pilot light with backup and restore, not realizing that pilot light maintains a live, minimal environment (e.g., database replicas) rather than just backup files, which enables faster recovery without a full rebuild. Warm standby goes one step further by keeping a scaled-down copy of the whole workload running.
