Knowledge + Practice

AWS Certified DevOps Engineer Professional DOP-C02 (DOP-C02) — Questions 376–450

1740 questions total · 24 pages · All types, answers revealed


Page 6 of 24

376

Multi-Select · medium

A security engineer is designing a secure VPC architecture for a web application. The application must be isolated from the internet and only accessible through a load balancer. Which TWO actions should the engineer take?

Select 2 answers

A.Place the EC2 instances in a private subnet with no internet gateway attachment.

B.Attach an Internet Gateway to the VPC and route the private subnet to it.

C.Configure a network ACL on the private subnet to allow inbound traffic on all ephemeral ports.

D.Configure the security group for the EC2 instances to allow traffic only from the ALB's security group.

E.Set up an AWS Direct Connect connection for the instances to access the internet.

Answers: A, D

Private subnets prevent direct internet access to instances.

Why this answer

Option A is correct because placing the EC2 instances in a private subnet with no route to an internet gateway leaves them without a direct path to the internet, which meets the isolation requirement. Option D is correct because restricting the instances' security group to the ALB's security group makes the load balancer the only entry point for the application.

Exam trap

The trap here is Option C. Candidates often assume that a network ACL rule allowing inbound ephemeral ports is needed for traffic from the ALB. Network ACLs are stateless and need explicit inbound and outbound rules, but opening ephemeral ports is not what isolates the instances. The stateful security group rule that uses the ALB's security group as its source is the control that does.
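The security group rule from Option D can be sketched as the parameter set one might pass to an EC2 ingress-authorization call. This is a minimal sketch: the function name, the security group IDs, and the port are hypothetical and chosen only for illustration.

```python
def ingress_from_alb_only(instance_sg_id, alb_sg_id, port=80):
    """Build ingress parameters that admit traffic only from the ALB's security group."""
    return {
        "GroupId": instance_sg_id,
        "IpPermissions": [
            {
                "IpProtocol": "tcp",
                "FromPort": port,
                "ToPort": port,
                # The source is a security group, not a CIDR range, so only
                # resources in the ALB's group can reach the instances.
                "UserIdGroupPairs": [{"GroupId": alb_sg_id}],
            }
        ],
    }


if __name__ == "__main__":
    # Both IDs below are placeholders.
    print(ingress_from_alb_only("sg-0instances", "sg-0alb"))
```

Because the rule names a security group instead of an address range, it keeps working when the load balancer's IP addresses change.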


377

MCQ · hard

An organization wants to ensure that all objects stored in the S3 bucket are encrypted at rest using server-side encryption with S3 managed keys (SSE-S3). The bucket policy above is intended to enforce this. However, a user reported that they can still upload unencrypted objects. What is the MOST likely reason?

A.The condition is applied to the GetObject action, not the PutObject action.

B.The bucket policy is not attached to the bucket because of a circular dependency.

C.The bucket policy does not apply to objects uploaded by the root user.

D.The condition should use 's3:x-amz-server-side-encryption-aws-kms-key-id' instead.

Answer: A

To enforce encryption on uploads, the condition must be on s3:PutObject.

Why this answer

Option A is correct. The bucket policy condition checks the 'x-amz-server-side-encryption' header for the value 'AES256', but the condition is attached to the GetObject action, not PutObject, so uploads are never evaluated against it.

To enforce encryption on uploads, the condition must be on s3:PutObject, and it must require the encryption header rather than merely check it. Option B is incorrect because attaching a bucket policy is a valid way to enforce encryption and involves no circular dependency.

Option C is incorrect because the problem is the action the condition is attached to, not who uploads the object. Option D is incorrect because the 'x-amz-server-side-encryption' condition with 'AES256' is the correct syntax for SSE-S3.
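As a rough illustration of the fix, the sketch below builds a bucket policy statement that attaches the encryption condition to s3:PutObject. The exhibit's actual policy is not reproduced here; the bucket name `example-bucket`, the statement ID, and the helper name are hypothetical.

```python
import json


def build_sse_s3_policy(bucket_name):
    """Return a bucket policy that denies uploads lacking the SSE-S3 header."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyUnencryptedUploads",  # hypothetical statement ID
                "Effect": "Deny",
                "Principal": "*",
                # The condition has to sit on PutObject, not GetObject.
                "Action": "s3:PutObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
                "Condition": {
                    "StringNotEquals": {
                        "s3:x-amz-server-side-encryption": "AES256"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_sse_s3_policy("example-bucket"), indent=2))
```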


378

MCQ · hard

A company is designing a disaster recovery (DR) strategy for a stateless web application deployed on Amazon ECS with Fargate. The application is fronted by an Application Load Balancer (ALB) and uses Amazon ElastiCache for Redis for session state. The primary region is us-east-1. The DR plan requires a Recovery Point Objective (RPO) of 15 minutes and a Recovery Time Objective (RTO) of 30 minutes. Which solution meets these requirements with the LEAST operational overhead?

A.Deploy an ALB with a warm standby ECS service in us-west-2. Use Route 53 health checks to route traffic to the secondary region if primary fails. Use ElastiCache Global Datastore for Redis to replicate data across regions.

B.Deploy an Active-Active configuration across two AWS regions using Route 53 latency routing. Use ElastiCache for Redis Global Datastore with multi-region writes.

C.Deploy a Pilot Light environment in us-west-2 with a scaled-down ECS service and Redis cluster. Use Route 53 DNS failover. On disaster, scale up the ECS service and promote the Redis cluster.

D.Use Amazon ECS with Fargate in us-east-1 only, and schedule daily snapshots of ElastiCache for Redis. In case of disaster, restore the snapshot in a new region and update DNS.

Answer: A

The warm standby approach with automatic failover and cross-region replication meets RPO and RTO with low operational overhead.

Why this answer

Option A is correct because it uses ElastiCache Global Datastore for Redis, which provides cross-region replication with an RPO of seconds (well within 15 minutes) and automatic failover, minimizing operational overhead. The warm standby ECS service in us-west-2 with Route 53 health checks allows traffic to be redirected within the 30-minute RTO without manual intervention, as the ALB and ECS service are pre-provisioned.

Exam trap

The trap here is that candidates may confuse Pilot Light (Option C) as lower overhead, but it requires manual scaling and promotion steps, whereas a warm standby with Global Datastore automates failover, making it the least operational overhead for the given RPO/RTO.

How to eliminate wrong answers

Option B is wrong because an Active-Active configuration with multi-region writes for ElastiCache Global Datastore is not supported; Global Datastore only supports active-passive replication (a single primary cluster accepts writes, and secondary clusters are read-only) to avoid write conflicts. Option C is wrong because a Pilot Light approach requires manual scaling of the ECS service and promoting the Redis cluster on disaster, which adds operational overhead and risks exceeding the 30-minute RTO due to provisioning delays. Option D is wrong because daily snapshots of ElastiCache cannot achieve a 15-minute RPO (up to a day of data could be lost), and restoring a snapshot in a new region plus updating DNS would likely exceed the 30-minute RTO due to manual steps and data transfer time.


379

MCQ · medium

A company uses AWS CloudTrail to monitor API activity. The security team notices that an IAM user 'dev-user' deleted an S3 bucket. They need to quickly identify the source IP address of the delete request. Which CloudTrail feature should they use to find this information?

A.Use CloudTrail Lake to query the event and extract the IP address from the userIdentity field.

B.Check S3 server access logs for the bucket deletion event.

C.Enable CloudTrail Insights to analyze unusual activity.

D.Search the CloudTrail event history for the delete event and review the sourceIPAddress field.

Answer: D

CloudTrail event history includes the sourceIPAddress field in the event record.

Why this answer

Option D is correct because CloudTrail event history includes the sourceIPAddress field in the event record, which directly provides the IP address from which the API call was made. Option C is wrong because CloudTrail Insights detects unusual activity but does not provide the IP address for individual events. Option A is wrong because CloudTrail Lake is used for aggregating and querying events, and the source IP is found in the sourceIPAddress field of the event record, not in userIdentity.

Option B is wrong because S3 server access logs are separate from CloudTrail and are not a CloudTrail feature.
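To make the field lookup concrete, here is a minimal sketch that reads sourceIPAddress from a CloudTrail-style event record. The record is a trimmed, made-up sample (the IP address and bucket name are placeholders), and the helper name is hypothetical.

```python
import json

# Hypothetical, trimmed CloudTrail record for illustration only.
SAMPLE_EVENT = """
{
  "eventName": "DeleteBucket",
  "eventSource": "s3.amazonaws.com",
  "sourceIPAddress": "203.0.113.10",
  "userIdentity": {"type": "IAMUser", "userName": "dev-user"},
  "requestParameters": {"bucketName": "example-bucket"}
}
"""


def source_ip_for(event_json, user_name, event_name):
    """Return the caller's IP if the record matches the user and event name."""
    record = json.loads(event_json)
    identity = record.get("userIdentity", {})
    if identity.get("userName") == user_name and record.get("eventName") == event_name:
        # The IP is a top-level field of the record, not part of userIdentity.
        return record.get("sourceIPAddress")
    return None


if __name__ == "__main__":
    print(source_ip_for(SAMPLE_EVENT, "dev-user", "DeleteBucket"))
```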


380

MCQ · easy

A company uses Ansible for configuration management on EC2 instances. They want to ensure that only instances with a specific tag (Environment: Production) are targeted by their playbooks. What is the best way to achieve this?

A.Add a 'when' condition in the playbook to check the instance tag at runtime.

B.Maintain a static inventory file listing only Production instances.

C.Use the AWS EC2 dynamic inventory plugin to filter instances based on tags.

D.Use the ec2_tag module to assign the tag to instances.

Answer: C

Dynamic inventory can filter by tags, so playbooks run only on matching instances.

Why this answer

Option C is correct because Ansible dynamic inventory with the AWS EC2 plugin can filter instances by tags. Option D is wrong because the ec2_tag module assigns tags to instances; it does not select them. Option B is wrong because a static inventory file is less dynamic and must be maintained by hand.

Option A is wrong because it modifies the playbook to check tags after connecting, which is inefficient.


381

MCQ · medium

A DevOps engineer is troubleshooting a production AWS Lambda function that occasionally times out. The function has a timeout of 30 seconds and uses a synchronous invocation. The engineer wants to capture invocation logs to identify the cause. Which approach will provide the MOST detailed diagnostic information?

A.Enable AWS CloudTrail data events for Lambda.

B.Create a CloudWatch dashboard with function duration metrics.

C.Add more logging statements to the function code and check CloudWatch Logs.

D.Enable AWS X-Ray tracing on the Lambda function.

Answer: D

X-Ray provides detailed traces showing each subsegment's duration, helping identify bottlenecks.

Why this answer

Option D is correct because AWS X-Ray provides end-to-end tracing for Lambda functions, capturing detailed timing information for each invocation, including subsegments for downstream calls, function initialization, and execution phases. This allows the engineer to pinpoint exactly where time is being spent, which is essential for diagnosing intermittent timeouts in synchronous invocations.

Exam trap

The trap here is that candidates often confuse CloudWatch Logs (which show custom log output) with X-Ray tracing (which provides automatic, detailed timing of every subcomponent), leading them to choose option C instead of the more diagnostic X-Ray approach.

How to eliminate wrong answers

Option A is wrong because CloudTrail data events for Lambda only record API calls (e.g., Invoke, UpdateFunctionConfiguration) and do not capture function execution logs or timing details needed to diagnose timeouts. Option B is wrong because a CloudWatch dashboard with duration metrics shows aggregated statistics (e.g., average, p99) over time, but cannot reveal per-invocation breakdowns or pinpoint the specific phase causing a timeout. Option C is wrong because adding more logging statements to the function code and checking CloudWatch Logs provides only custom log output without automatic tracing of downstream calls or sub-millisecond timing, making it insufficient for identifying intermittent timeout causes.


382

MCQ · easy

A company uses Amazon CloudFront to distribute content from an S3 bucket origin. Some users report intermittent access errors. The DevOps team suspects the origin is overwhelmed. What is the MOST effective way to improve resilience?

A.Set up an origin failover with two S3 buckets behind an Application Load Balancer (ALB).

B.Reduce the CloudFront cache TTL to serve fresher content.

C.Increase the CloudFront cache TTL to reduce requests to the origin.

D.Configure CloudFront to perform health checks on the origin.

Answer: A

Origin failover provides high availability.

Why this answer

Option A is correct because using an ALB or multiple origins with failover provides redundancy and load distribution. Option C is wrong because increasing the cache TTL only reduces origin load but does not address origin failures. Option B is wrong because reducing the TTL increases origin load.

Option D is wrong because CloudFront does not have an origin health check feature; origin failover is triggered by error responses from the origin.


383

MCQ · hard

A company's application runs on Amazon EC2 instances in an Auto Scaling group. The application writes logs to local instance storage. The operations team needs to ensure logs are not lost during instance termination or scaling events. What should be done?

A.Increase the size of the instance store volumes.

B.Use an Amazon EFS file system and mount it to each instance for log storage.

C.Configure the Auto Scaling group to terminate instances after logs are copied to S3.

D.Install the CloudWatch Logs agent on each instance and stream logs to CloudWatch Logs.

Answer: D

Real-time streaming to CloudWatch prevents loss.

Why this answer

Configuring the CloudWatch Logs agent to stream logs to CloudWatch Logs in real time ensures logs are centralized and not lost.


384

MCQ · medium

A company runs a web application on EC2 instances behind an Application Load Balancer (ALB). The security team requires that all traffic to the ALB must be encrypted (HTTPS) and that the ALB must only accept traffic from CloudFront. The DevOps engineer has configured CloudFront with an origin pointing to the ALB, and the ALB has a listener on port 443 with a valid SSL certificate. The engineer also added a security group rule to the ALB that allows HTTPS traffic only from CloudFront's IP ranges. However, users are reporting intermittent 503 errors. The engineer checks CloudFront logs and sees that some requests are failing with 'Origin Connect Error'. What is the most likely cause?

A.The ALB has a Web Application Firewall (WAF) that is blocking requests from CloudFront.

B.The security group rule is using an outdated list of CloudFront IP ranges, and CloudFront has added new IP ranges that are being blocked.

C.The SSL certificate on the ALB is not trusted by CloudFront, causing handshake failures.

D.The ALB idle timeout is set too low, causing CloudFront to close connections prematurely.

Answer: B

CloudFront IP ranges change; using a static list is unreliable.

Why this answer

CloudFront uses a large set of IP addresses that change over time. Using a security group with a static list of CloudFront IPs is not recommended because the IPs change. Instead, the ALB should use a custom header that CloudFront adds.

Option B is correct: the security group blocks new CloudFront IPs that are not in the list. Option C (SSL certificate not trusted) would cause a different error. Option D (idle timeout) would produce more consistent failures.

Option A (WAF) would return 403, not 503.
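The drift between a static allow list and the current CloudFront ranges can be illustrated with a short sketch. It assumes input shaped like AWS's published IP ranges file (a `prefixes` list whose entries carry `ip_prefix` and `service`); the sample entries use documentation-only addresses and are not real CloudFront ranges, and the function names are hypothetical.

```python
import ipaddress

# Made-up sample in the shape of the published ranges file.
SAMPLE_RANGES = {
    "prefixes": [
        {"ip_prefix": "198.51.100.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
        {"ip_prefix": "203.0.113.0/24", "region": "us-east-1", "service": "EC2"},
        {"ip_prefix": "192.0.2.0/24", "region": "GLOBAL", "service": "CLOUDFRONT"},
    ]
}


def cloudfront_prefixes(ranges):
    """Return the sorted CIDR prefixes tagged with the CLOUDFRONT service."""
    return sorted(
        p["ip_prefix"] for p in ranges["prefixes"] if p["service"] == "CLOUDFRONT"
    )


def missing_from_allow_list(ranges, allowed):
    """Return CloudFront prefixes that a static allow list does not cover."""
    allowed_nets = [ipaddress.ip_network(a) for a in allowed]
    missing = []
    for prefix in cloudfront_prefixes(ranges):
        net = ipaddress.ip_network(prefix)
        if not any(net.subnet_of(a) for a in allowed_nets):
            missing.append(prefix)
    return missing


if __name__ == "__main__":
    # A stale allow list that only knows one of the two CloudFront ranges.
    print(missing_from_allow_list(SAMPLE_RANGES, ["198.51.100.0/24"]))
```

Any prefix reported as missing corresponds to edge traffic the security group would drop, which matches the intermittent failures described in the question.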


385

MCQ · medium

A company runs a stateless web application on a fleet of EC2 instances in an Auto Scaling group. The application stores session state in a shared ElastiCache Redis cluster. During traffic spikes, the application becomes slow. Monitoring shows that the Redis cluster has high CPU utilization. Which solution is MOST cost-effective and scalable?

A.Upgrade the Redis instance to a larger node type to handle more operations

B.Enable cluster mode on the ElastiCache Redis cluster and add more shards

C.Add read replicas to offload read traffic from the primary node

D.Migrate session state to DynamoDB with DAX for caching

Answer: B

Cluster mode distributes data across shards, improving performance and scalability.

Why this answer

Enabling cluster mode on ElastiCache Redis and adding more shards horizontally scales the cluster, distributing write and read operations across multiple nodes. This directly reduces CPU utilization on any single node and is more cost-effective than vertical scaling (upgrading to a larger node type) because it allows granular, pay-as-you-grow capacity. Cluster mode also supports automatic shard rebalancing and is ideal for stateless web applications with session state that can be partitioned by session key.
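The idea of partitioning by session key can be sketched with the hash-slot scheme Redis Cluster uses: CRC16 of the key modulo 16384. The sketch below ignores hash tags (`{...}`), and the even, contiguous mapping of slots to shards in `shard_for` is an assumption for illustration, since real slot assignment is configurable. The key names are hypothetical.

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key):
    """Return the hash slot for a key (CRC16/XMODEM mod 16384, no hash tags)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % HASH_SLOTS


def shard_for(key, shard_count):
    """Map a key to a shard index, assuming slots are split into equal ranges."""
    return key_slot(key) * shard_count // HASH_SLOTS


if __name__ == "__main__":
    for session_key in ("session:alice", "session:bob", "session:carol"):
        print(session_key, key_slot(session_key), shard_for(session_key, 3))
```

Because each key hashes to one slot, adding shards spreads both reads and writes for different sessions across more nodes.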

Exam trap

The trap here is that candidates often confuse read replicas with horizontal scaling for write-heavy workloads, not realizing that replicas only help with read scaling and cannot reduce CPU from write operations, while cluster mode directly addresses both read and write scaling by splitting the data set.

How to eliminate wrong answers

Option A is wrong because upgrading to a larger node type (vertical scaling) is less cost-effective and has an upper limit; it does not provide the linear scalability of horizontal sharding and can lead to over-provisioning during low traffic. Option C is wrong because adding read replicas offloads only read traffic, but session state in Redis involves both reads and writes, and the high CPU is likely from write-heavy operations (e.g., SET/GET) that replicas cannot offload; replicas also introduce eventual consistency issues for session data. Option D is wrong because migrating to DynamoDB with DAX introduces unnecessary complexity and cost for session state that is already well-served by Redis; DAX is a separate caching layer that adds latency and cost, and DynamoDB's throughput pricing can be less predictable than ElastiCache for bursty traffic.


386

MCQ · easy

A company uses AWS Organizations to manage multiple accounts. The security team wants to centrally enforce that S3 buckets in all accounts block public access. Which policy should be attached to the root organizational unit to achieve this?

A.Configure a bucket policy on each S3 bucket to deny public access.

B.Create an AWS Config rule to mark noncompliant buckets.

C.Attach an IAM policy to the root user of each account.

D.Attach a service control policy (SCP) to the root organizational unit.

Answer: D

SCPs can deny actions that make S3 buckets public across all accounts in the organization.

Why this answer

Service control policies (SCPs) can be applied to organizational units to restrict permissions across member accounts. Option D is correct because SCPs allow denial of actions that would make S3 buckets public. Option C is wrong because IAM policies are account-specific and cannot be applied centrally to all accounts.

Option B is wrong because AWS Config rules can detect but not enforce. Option A is wrong because a bucket policy is per-bucket and cannot be applied globally.
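One possible shape for such an SCP is sketched below. It assumes that S3 Block Public Access is already enabled at the account level and that the goal is to stop member accounts from changing it; the statement ID and function name are hypothetical, and a real policy would likely need more actions than this single one.

```python
import json


def build_block_public_access_scp():
    """Return an SCP that denies changes to account-level S3 Block Public Access."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyChangingS3PublicAccessBlock",  # hypothetical ID
                "Effect": "Deny",
                # SCPs have no Principal element; they apply to the accounts
                # under the root or OU they are attached to.
                "Action": ["s3:PutAccountPublicAccessBlock"],
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_block_public_access_scp(), indent=2))
```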


387

MCQ · easy

A company uses AWS CodeBuild to run unit tests and package a Node.js application. The buildspec.yml file includes commands to install dependencies using npm. The build is failing with the error: 'npm ERR! code EACCES'. How should a DevOps engineer resolve this issue?

A.Configure the CodeBuild project to use a custom VPC with a NAT gateway for internet access

B.Configure the buildspec to run npm install with sudo

C.Add a command to change the ownership of the node_modules directory to the current user

D.Use 'npm ci' instead of 'npm install' and ensure a package-lock.json is present

Answer: D

npm ci uses the lock file and avoids permission issues.

Why this answer

Option D is correct because the default CodeBuild user does not have write permissions to the default npm global directory. Using 'npm ci' does not require global write access and is also faster. Option B is wrong because running npm with sudo (as root) is not recommended and may cause other issues.

Option C is wrong because changing the ownership of node_modules is not necessary. Option A is wrong because CodeBuild environments have internet access by default.


388

MCQ · medium

A team is using AWS CodePipeline to deploy a serverless application using AWS Lambda and Amazon API Gateway. The pipeline has a source stage from CodeCommit, a build stage using CodeBuild (which runs unit tests and packages the Lambda code), and a deploy stage using AWS CloudFormation to update a stack that contains the Lambda function and API Gateway. The deployment stage uses a CloudFormation template that creates the Lambda function and API Gateway. Recently, the deployment stage started failing with the error: 'The API Gateway deployment already exists'. The team has not changed the template. What is the most likely cause?

A.The CloudFormation template uses a fixed physical name for the deployment resource, causing a conflict when CloudFormation tries to create a new deployment with the same name.

B.The Lambda function does not have permission to be invoked by API Gateway.

C.The account has reached the limit for API Gateway deployments.

D.The API Gateway stage name is being changed in the template.

Answer: A

Using a fixed name prevents CloudFormation from creating a new deployment; it should use a unique name or allow CloudFormation to generate one.

Why this answer

Option A is correct because the template gives the deployment resource a fixed physical name. When CloudFormation tries to create a new deployment, it collides with the existing deployment of the same name. Option D is wrong because the stage name is not changed. Option B is wrong because missing invoke permissions would cause a different error.

Option C is wrong because the API Gateway deployment limit is higher than this pipeline would reach.


389

Multi-Select · hard

A company uses AWS Organizations with multiple accounts. The security team needs to ensure that all CloudTrail trails across the organization are delivering events to a centralized S3 bucket in the management account. Currently, some member accounts have their own trails. Which THREE steps should the security team take to enforce this? (Choose three.)

Select 3 answers

A.Manually disable CloudTrail in each member account.

B.Create an organization trail in the management account that applies to all accounts.

C.Use AWS Config rules to detect non-compliant trails and trigger automatic remediation.

D.Enable CloudTrail on the centralized S3 bucket to log access.

E.Use a service control policy (SCP) to deny the 'cloudtrail:CreateTrail' and 'cloudtrail:UpdateTrail' actions.

Answers: B, D, E

Organization trails automatically apply to all accounts in the organization.

Why this answer

Option B is correct because an organization trail created in the management account automatically applies to all accounts. Option D is correct because enabling CloudTrail on the centralized S3 bucket ensures that access to the log bucket is itself logged. Option E is correct because an SCP can deny the ability to create or modify trails, preventing member accounts from creating separate trails.

Option A is wrong because disabling CloudTrail in member accounts is manual and not scalable. Option C is wrong because Config rules can detect but not prevent.


390

MCQ · easy

A developer is setting up AWS CodeBuild to compile a Go application. The build fails with the error: 'go: command not found'. What is the MOST likely cause?

A.The environment variables in buildspec.yml are incorrectly set

B.The build project does not have enough memory to compile Go code

C.The build environment image does not have the Go runtime installed

D.The CodeBuild service role does not have permission to access the S3 bucket for artifacts

Answer: C

CodeBuild images are version-specific; e.g., aws/codebuild/amazonlinux2-x86_64-standard:3.0 does not include Go by default.

Why this answer

The error 'go: command not found' indicates that the Go executable is not available in the build environment's PATH. CodeBuild uses a managed or custom build environment image (e.g., Ubuntu, Amazon Linux 2) to run build commands. If the image does not include the Go runtime, the shell cannot locate the 'go' binary, causing the build to fail.

The most direct fix is to select a build environment image that has Go pre-installed or to install Go in the install phase of the buildspec.

Exam trap

The trap here is that candidates often blame environment variables (Option A) or permissions (Option D) because they are common build failures, but the root cause is the missing runtime in the build environment image — a fundamental prerequisite that CodeBuild does not automatically provide.

How to eliminate wrong answers

Option A is wrong because environment variables in buildspec.yml control runtime behavior (e.g., GOPATH, GO111MODULE) but do not cause a 'command not found' error; the shell would still find the 'go' binary if it exists in PATH. Option B is wrong because insufficient memory leads to out-of-memory (OOM) kills or build timeouts, not a 'command not found' error; the Go compiler itself would be invoked before memory limits become an issue. Option D is wrong because S3 bucket permissions affect artifact uploads or cache retrieval, not the execution of build commands; the 'go' binary would still be found and run regardless of S3 access.


391

MCQ · easy

A company uses AWS CodeBuild to run unit tests as part of a CI pipeline. The buildspec.yaml file is located in the root of the source repository. The build takes 30 minutes to complete. The team wants to speed up the build by caching dependencies. Which approach should they take?

A.Download dependencies from the internet each time the build runs.

B.Mount an Amazon EFS file system to the build container and store dependencies there.

C.Configure the buildspec.yaml to enable local caching and specify the paths to cache.

D.Store dependencies in an AWS CodeCommit repository and clone it during the build.

Answer: C

Caching keeps downloaded dependencies (locally on the build host or in S3) and reuses them across builds, reducing build time.

Why this answer

Option C is correct because CodeBuild supports caching: the buildspec lists the paths to cache, and the cache is kept either locally on the build host or in an S3 bucket, which speeds up builds by reusing downloaded dependencies. Option B is wrong because mounting an EFS file system adds infrastructure to manage when built-in caching already covers the need. Option D is wrong because CodeCommit does not provide caching.

Option A is wrong because re-downloading dependencies on every build defeats the purpose.


392

MCQ · hard

A DevOps team applies the above IAM policy to a group. A developer in this group tries to upload an object to the S3 bucket using the AWS CLI without specifying any encryption. The upload fails with an AccessDenied error. Why does the upload fail?

A.The Allow statement's condition is satisfied, but the Deny statement is evaluated first and denies the request.

B.The Allow statement requires encryption to be AES256, but the CLI defaults to SSE-S3, which is not AES256.

C.The Deny statement explicitly denies PutObject when encryption is not AES256, overriding the Allow.

D.The Deny statement's condition is not met because the request does not include encryption headers.

Answer: C

Explicit Deny always overrides Allow.

Why this answer

Option C is correct because the Deny statement explicitly denies PutObject if encryption is not AES256. The Allow statement only allows the upload if AES256 is specified. Since no encryption is specified, the condition in the Deny statement (StringNotEquals AES256) is true, causing a deny.

Option A is wrong because the Allow statement's condition is not met, so it does not grant permission, and the outcome does not depend on evaluation order. Option D is wrong because the Deny statement's condition is met when the header is absent. Option B is wrong because SSE-S3 is AES256; the request fails because it carries no encryption header, and the explicit Deny overrides any Allow.
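The evaluation described above can be mimicked with a small simulation. The exhibit's policy is not reproduced in this text, so the sketch assumes the shape implied by the explanation: one Allow with StringEquals AES256 and one Deny with StringNotEquals AES256. All function names are hypothetical, and this is not how IAM is implemented internally.

```python
ENCRYPTION_KEY = "s3:x-amz-server-side-encryption"


def string_equals(context, key, expected):
    """StringEquals: true only when the key is present and matches."""
    return context.get(key) == expected


def string_not_equals(context, key, expected):
    """StringNotEquals: true when the key is absent or differs."""
    return context.get(key) != expected


def evaluate_put_object(request_context):
    """Return the decision for PutObject under the assumed two-statement policy."""
    explicit_deny = string_not_equals(request_context, ENCRYPTION_KEY, "AES256")
    allow = string_equals(request_context, ENCRYPTION_KEY, "AES256")
    if explicit_deny:
        return "Deny"  # an explicit Deny wins regardless of any Allow
    if allow:
        return "Allow"
    return "ImplicitDeny"


if __name__ == "__main__":
    print(evaluate_put_object({}))  # upload without an encryption header
    print(evaluate_put_object({ENCRYPTION_KEY: "AES256"}))
```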


393

MCQ · easy

A DevOps engineer wants to automate the creation and cleanup of temporary development environments on AWS. Each environment consists of an Amazon EC2 instance and an Amazon RDS database. The environments should be isolated and cost-effective. Which AWS service is best suited for this?

A.AWS Elastic Beanstalk

B.AWS CloudFormation

C.AWS OpsWorks

D.Amazon ECS

Answer: B

CloudFormation can provision and tear down stacks easily.

Why this answer

Option B is correct because AWS CloudFormation allows you to define infrastructure as code and easily create and delete stacks. Option C is wrong because AWS OpsWorks is a configuration management service and is not ideal for temporary environments. Option A is wrong because AWS Elastic Beanstalk is a PaaS service that manages the environment but is not as flexible for cleanup.

Option D is wrong because Amazon ECS is for containers and does not provision EC2 instances and RDS databases directly.


394

MCQ · easy

Refer to the exhibit. The above IAM policy is attached to an IAM role used by a CI/CD pipeline. Which action is this policy allowing?

A.Start builds for any CodeBuild project in the account.

B.View details of any build in the account.

C.Start and view builds for the specified CodeBuild project.

D.Create and manage CodeBuild projects.

Answer: C

The policy allows StartBuild and BatchGetBuilds.

Why this answer

Option C is correct because the policy allows StartBuild and BatchGetBuilds on a specific CodeBuild project. Options A and B are wrong because the resource is the specified project, not every project or build in the account.

Option D is wrong because CreateProject and UpdateProject are not allowed.


395

MCQ · medium

A company hosts a static website on Amazon S3 with a CloudFront distribution. The website is critical for business operations and must be available even if the primary AWS Region fails. Currently, the S3 bucket is in us-east-1, and CloudFront uses that bucket as the origin. The company has a secondary bucket in us-west-2 with a replica of the data. The company wants to use CloudFront to automatically fail over to the secondary bucket if the primary becomes unavailable. The DevOps engineer needs to implement a solution that requires minimal operational overhead. What should the engineer do?

A.Use an Application Load Balancer in front of both S3 buckets and point CloudFront to the ALB.

B.Create a second CloudFront distribution pointing to the secondary bucket and use Route 53 failover routing between the two distributions.

C.Modify the application to switch the CloudFront origin URL using Lambda@Edge when health checks fail.

D.Configure CloudFront Origin Failover by adding both buckets as origins, with the primary in us-east-1 and secondary in us-west-2.

Answer: D

CloudFront natively supports origin failover with minimal configuration.

Why this answer

CloudFront Origin Failover allows you to set up a primary and secondary origin. If the primary returns an error (e.g., 503), CloudFront automatically routes requests to the secondary origin.
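The failover behavior can be pictured with a toy simulation. This is not CloudFront's implementation: the origin callables, the response tuples, and the set of status codes that trigger failover are all assumptions made for illustration.

```python
# Assumed set of status codes that trigger failover in this sketch.
FAILOVER_STATUS_CODES = frozenset({500, 502, 503, 504})


def fetch_with_failover(path, primary, secondary, failover_codes=FAILOVER_STATUS_CODES):
    """Try the primary origin; on a failover status code, retry on the secondary."""
    status, body = primary(path)
    if status in failover_codes:
        return secondary(path)
    return status, body


def failing_primary(path):
    """Stand-in for an unavailable us-east-1 bucket."""
    return 503, ""


def healthy_secondary(path):
    """Stand-in for the us-west-2 replica bucket."""
    return 200, f"content for {path} from us-west-2"


if __name__ == "__main__":
    print(fetch_with_failover("/index.html", failing_primary, healthy_secondary))
```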


396

Multi-Select · medium

A DevOps team is using AWS Elastic Beanstalk to deploy a web application. They need to customize the software configuration on the EC2 instances that are part of the Elastic Beanstalk environment. Which THREE methods can they use? (Choose THREE.)

Select 3 answers

A.Use saved configurations to create reusable environment templates

B.Create a custom AMI with the desired configuration

C.Use OpsWorks to manage the instances

D.Use .ebextensions configuration files

E.Use platform hooks to run custom scripts during deployment

Answers: A, D, E

Saved configurations capture environment settings and can be applied to new environments.

Why this answer

Elastic Beanstalk provides several ways to customize the environment: .ebextensions for config files, saved configurations for environment settings, and platform hooks for custom scripts. Options A, D, and E are correct. Option C is incorrect because OpsWorks is a separate service.

Option B is incorrect because custom AMIs are not recommended for Elastic Beanstalk; you should use platform hooks instead.


397

MCQ · hard

A company is using AWS CodePipeline with multiple stages that include source, build, and deploy. The pipeline uses an Amazon S3 bucket as the source action. The team notices that the pipeline is not automatically starting when new files are uploaded to the S3 bucket. The S3 bucket has versioning enabled. What is the most likely reason?

A.The S3 bucket is the same bucket used for the deploy action.

B.The pipeline is configured to detect changes based on object key, but the uploaded file uses the same key as an existing object.

C.The S3 bucket is not configured to send Amazon SQS notifications to CodePipeline.

D.The S3 bucket does not have versioning enabled.

Answer: B

CodePipeline triggers only when the object key changes or a new version is created; overwriting with same key may not trigger if versioning is not combined with proper event filtering.

Why this answer

Option B is correct because the S3 source action in CodePipeline only triggers on PUT operations that create new objects or versions; if the object key does not change, it may not trigger. Option D is wrong because versioning is already enabled on the bucket, and it is in fact required for S3 source actions. Option A is wrong because the pipeline can use the same bucket for the source and deploy actions.

Option C is wrong because there is no specific SQS requirement for S3 source triggers.


398

MCQ · easy

A company uses AWS CloudFormation to deploy infrastructure. A stack update fails with the error 'UPDATE_ROLLBACK_FAILED'. What should the engineer do to resolve this?

A.Retry the stack update with the same parameters.

B.Delete the stack and recreate it.

C.Ignore the error and continue using the stack.

D.Use the 'ContinueUpdateRollback' operation to fix the resource that caused the failure.

AnswerD

This operation allows CloudFormation to complete the rollback.

Why this answer

Option D is correct because a rollback failure indicates that CloudFormation could not undo some changes. Fixing the resource that blocked the rollback and then running ContinueUpdateRollback lets the stack reach a consistent state. Option B is wrong because deleting the stack may cause data loss.

Option C is wrong because ignoring the error leaves the stack in an inconsistent state. Option A is wrong because a stack in the UPDATE_ROLLBACK_FAILED state cannot be updated until the rollback completes.
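As a rough sketch of the recovery step, the helper below only builds the argument list for the `aws cloudformation continue-update-rollback` CLI command; it does not call AWS. The function name and the stack and resource names are hypothetical, chosen for illustration.

```python
from typing import List, Optional


def continue_rollback_command(stack_name: str,
                              resources_to_skip: Optional[List[str]] = None) -> List[str]:
    """Build the argv for `aws cloudformation continue-update-rollback`."""
    argv = ["aws", "cloudformation", "continue-update-rollback",
            "--stack-name", stack_name]
    if resources_to_skip:
        # Skipped resources are marked UPDATE_COMPLETE without being rolled back,
        # so list only resources that cannot be fixed any other way.
        argv += ["--resources-to-skip", *resources_to_skip]
    return argv


if __name__ == "__main__":
    # "prod-network" and "BrokenQueue" are hypothetical names.
    print(" ".join(continue_rollback_command("prod-network", ["BrokenQueue"])))
```

Skipping resources is a last resort: the usual path is to fix the underlying cause first and then continue the rollback without skipping anything.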

Full explanation →

399

MCQmedium

An application running on Amazon ECS Fargate is experiencing intermittent 'CannotPullContainerError' errors. The task definition references a Docker image in a private Amazon ECR repository. The task execution role has the 'AmazonECSTaskExecutionRolePolicy' policy attached. What is the most likely cause?

A.The Fargate task is in a private subnet without a NAT gateway or VPC endpoint

B.The task execution role does not have sufficient permissions

C.The ECS service is not configured with Auto Scaling

D.The ECR repository is not in the same region as the ECS cluster

AnswerA

Fargate tasks need internet access or VPC endpoints to pull images from ECR.

Why this answer

Option A is correct because a Fargate task must reach Amazon ECR over the network to pull its image. In a private subnet with no NAT gateway and no VPC endpoints for ECR, the pull fails with 'CannotPullContainerError'. Option B is wrong because 'AmazonECSTaskExecutionRolePolicy' already includes the permissions needed to pull images, such as ecr:GetAuthorizationToken and ecr:BatchCheckLayerAvailability.

Option C is wrong because Auto Scaling does not affect image pulling. Option D is wrong because ECS can pull images from an ECR repository in another Region, and a Region mismatch would not explain intermittent failures.

Full explanation →

400

Multi-Selecthard

Which THREE actions can be performed using the AWS CLI for CodeDeploy? (Choose three.)

Select 3 answers

A.create-deployment

B.get-deployment

C.push-revision

D.list-deployment-groups

E.register-instance

AnswersA, B, D

The CLI can create a deployment.

Why this answer

The AWS CLI for CodeDeploy supports creating deployments (create-deployment), getting deployment details (get-deployment), and listing deployment groups (list-deployment-groups). 'push-revision' is not a CodeDeploy CLI command; a revision is bundled and uploaded to S3 with 'aws deploy push'. 'register-instance' is not a command either; on-premises instances are registered with register-on-premises-instance.

The three correct actions are create-deployment, get-deployment, and list-deployment-groups.

Full explanation →

401

MCQeasy

A DevOps engineer needs to monitor the number of 4xx and 5xx HTTP errors returned by an Application Load Balancer (ALB). They want to set up a dashboard that shows the error count over the last 24 hours. Which CloudWatch metrics should they use?

A.Use the 'HTTPCode_Target_4XX_Count' and 'HTTPCode_Target_5XX_Count' metrics.

B.Use the 'RequestCount' metric with a statistic of 'ErrorCount'.

C.Use the 'HTTPCode_ELB_4XX_Count' and 'HTTPCode_ELB_5XX_Count' metrics.

D.Use the 'TargetResponseTime' metric and count the number of responses above 4 seconds.

AnswerA

These metrics track the HTTP error codes returned by the targets.

Why this answer

Option A is correct because the ALB emits the 'HTTPCode_Target_4XX_Count' and 'HTTPCode_Target_5XX_Count' metrics for responses returned by the targets. Option C is wrong because the 'HTTPCode_ELB_*' metrics count errors generated by the load balancer itself, not by the targets. Option B is wrong because 'ErrorCount' is not a statistic available for the 'RequestCount' metric.

Option D is wrong because 'TargetResponseTime' measures latency, not HTTP error codes.
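To make the dashboard requirement concrete, here is a minimal sketch that builds a CloudWatch dashboard body charting the two target-level metrics over the last 24 hours. The function name, the widget title, and the load balancer identifier are assumptions for illustration; the resulting JSON would be supplied as the dashboard body when creating the dashboard.

```python
import json


def alb_error_dashboard(load_balancer: str, region: str) -> str:
    """Return a dashboard body (JSON string) with one widget for target 4xx/5xx counts."""
    namespace = "AWS/ApplicationELB"
    widget = {
        "type": "metric",
        "width": 12,
        "height": 6,
        "properties": {
            "title": "ALB target errors",  # hypothetical title
            "region": region,
            "stat": "Sum",    # error counts are summed per period, not averaged
            "period": 300,    # 5-minute buckets
            "metrics": [
                [namespace, "HTTPCode_Target_4XX_Count", "LoadBalancer", load_balancer],
                [namespace, "HTTPCode_Target_5XX_Count", "LoadBalancer", load_balancer],
            ],
        },
    }
    # "-PT24H" is an ISO 8601 duration meaning "the last 24 hours".
    return json.dumps({"start": "-PT24H", "widgets": [widget]}, indent=2)


if __name__ == "__main__":
    # "app/demo-alb/0123456789abcdef" is a placeholder LoadBalancer dimension value.
    print(alb_error_dashboard("app/demo-alb/0123456789abcdef", "us-east-1"))
```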

Full explanation →

402

MCQhard

An organization uses AWS CodeBuild to compile a Java application. The buildspec.yml includes a pre_build phase that runs unit tests. Recently, the build started failing with 'NoClassDefFoundError' for certain test dependencies, even though the pom.xml includes them. The build environment uses an Amazon Linux 2 Docker image. What is the MOST likely cause?

A.The CodeBuild project has a cache that is corrupted or out of sync. Clear the build cache.

B.The S3 bucket for artifacts has incorrect permissions. Update the bucket policy.

C.The CodeCommit repository is not pulling the latest code. Add a webhook to trigger builds on push.

D.The build environment does not have Maven installed. Install Maven in the buildspec.

AnswerA

Cached dependencies may be stale or incomplete, leading to NoClassDefFoundError.

Why this answer

Option A is correct because CodeBuild caches dependency directories to speed up builds; a corrupted or stale cache can cause missing classes. Option C is wrong because CodeCommit pull frequency is unrelated to dependency resolution. Option B is wrong because S3 bucket permissions affect artifact upload, not dependency resolution.

Option D is wrong because the build worked until recently, so Maven is available; a missing Maven installation would fail the whole build rather than produce a NoClassDefFoundError for specific classes.

Full explanation →

403

MCQmedium

A DevOps engineer is troubleshooting a CloudFormation stack creation failure. The stack includes an EC2 instance with a UserData script that installs software. The stack creation fails with the error: 'The following resource(s) failed: EC2Instance (AWS::EC2::Instance) – Resource creation cancelled'. What is the most likely cause?

A.The EC2 instance type is not supported in the region

B.The IAM role attached to the instance lacks permissions

C.The stack creation timed out or was manually cancelled

D.The UserData script failed due to a syntax error

AnswerC

Resource creation cancelled typically indicates a timeout or manual cancellation.

Why this answer

Option C is correct because 'Resource creation cancelled' indicates that the stack creation timed out or was manually cancelled. Option D is incorrect because a UserData failure does not affect the CloudFormation resource status unless the instance is created with a CreationPolicy or WaitCondition.

Option A is incorrect because an unsupported instance type would produce an explicit error about the instance type, not a cancellation. Option B is incorrect because missing IAM permissions would cause an 'AccessDenied' error, not a cancellation.

Full explanation →

404

Multi-Selectmedium

A company is using Amazon CloudWatch Logs to store application logs. The security team requires that logs are encrypted at rest using a customer-managed KMS key. Which TWO steps must be taken to achieve this?

Select 2 answers

A.Recreate the log group after associating the key.

B.Add a statement to the KMS key policy that allows CloudWatch Logs to use the key.

C.Create a KMS grant to allow CloudWatch Logs to use the key.

D.Specify the KMS key ARN when creating each log stream.

E.Use the put-log-group-encryption API to associate the KMS key with the log group.

AnswersB, E

The key policy must grant the CloudWatch Logs service principal permissions to encrypt/decrypt.

Why this answer

B and E are correct: You must associate the KMS key with the log group (the CloudWatch Logs API for this is associate-kms-key), and the CloudWatch Logs service must have permission to use the KMS key through the key policy. C is wrong because CloudWatch Logs does not require a grant; it uses key policies. D is wrong because you do not specify the key ARN on the log stream.

A is wrong because the key can be associated with a log group that already exists; the log group does not need to be recreated.
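The key policy requirement can be sketched as follows. This builds one policy statement that lets the regional CloudWatch Logs service principal use the key, scoped to a single log group through the encryption context. The function name, the Sid, and the sample account and log group are illustrative assumptions; the statement would be merged into the key's existing policy before running `aws logs associate-kms-key`.

```python
import json


def logs_key_policy_statement(region: str, account_id: str, log_group: str) -> dict:
    """Key policy statement allowing CloudWatch Logs to use a customer-managed key."""
    return {
        "Sid": "AllowCloudWatchLogs",  # hypothetical statement ID
        "Effect": "Allow",
        "Principal": {"Service": f"logs.{region}.amazonaws.com"},
        "Action": [
            "kms:Encrypt*",
            "kms:Decrypt*",
            "kms:ReEncrypt*",
            "kms:GenerateDataKey*",
            "kms:Describe*",
        ],
        "Resource": "*",
        # Restrict use of the key to one log group via the encryption context.
        "Condition": {
            "ArnEquals": {
                "kms:EncryptionContext:aws:logs:arn":
                    f"arn:aws:logs:{region}:{account_id}:log-group:{log_group}"
            }
        },
    }


if __name__ == "__main__":
    # Placeholder account ID and log group name.
    print(json.dumps(logs_key_policy_statement("us-east-1", "111122223333", "/app/web"), indent=2))
```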

Full explanation →

405

MCQeasy

A company uses AWS OpsWorks for configuration management. The DevOps team wants to automate the patching of operating system updates on a set of EC2 instances managed by OpsWorks. Which OpsWorks feature should be used?

A.Weekly Auto Update

B.Auto Scaling

C.Chef Automate

D.Lifecycle events (Setup, Configure, Deploy, etc.)

AnswerA

OpsWorks can automatically install updates on a weekly schedule.

Why this answer

Option A is correct because OpsWorks Stacks has a built-in 'Weekly Auto Update' feature that patches managed instances on a schedule. Option B (Auto Scaling) handles capacity, not patching. Option D (Lifecycle events) can be used to trigger custom recipes for patching, but the question asks for a specific feature.

Option C (Chef Automate) is a separate product, not natively integrated with OpsWorks Stacks.

Full explanation →

406

Multi-Selecteasy

A DevOps engineer is designing an incident response plan for a serverless application using AWS Lambda, API Gateway, and DynamoDB. Which TWO services should be used to monitor and alert on errors and latency?

Select 2 answers

A.Amazon GuardDuty

B.Amazon CloudWatch

C.AWS X-Ray

D.AWS Config

E.AWS CloudTrail

AnswersB, C

CloudWatch provides metrics, logs, and alarms for Lambda and API Gateway.

Why this answer

Option B (CloudWatch) provides metrics and alarms. Option C (X-Ray) provides tracing for latency. Option E is wrong because CloudTrail records API calls but not latency.

Option D is wrong because Config monitors configuration changes. Option A is wrong because GuardDuty is for security threats.

Full explanation →

407

MCQhard

A company runs a multi-tier web application on AWS. The application consists of an Application Load Balancer (ALB), an EC2 Auto Scaling group (ASG) for web servers, and an Amazon RDS Multi-AZ DB instance. The ASG uses a launch template with Amazon Linux 2 and a user data script that installs the web application and connects to the RDS database using a static password stored in the user data. Recently, the security team discovered that the user data script is exposed in the EC2 console and could be viewed by anyone with EC2 describe-instances permissions. The team wants to remediate this immediately without causing downtime. The ASG is configured with a min size of 2, max size of 6, and desired capacity of 4. The application is currently under load. Which option describes the best course of action?

A.Create a new launch template version that retrieves the password from AWS Secrets Manager. Update the ASG to use the new template version and perform an instance refresh with a minimum healthy percentage of 100%.

B.Immediately modify the user data on each running EC2 instance to remove the password, then update the launch template to reference AWS Secrets Manager.

C.Update the existing launch template to use AWS Secrets Manager for the database password. The ASG will automatically apply the change to existing instances.

D.Delete the existing launch template and create a new one with secrets from AWS Secrets Manager. Then terminate all running instances and let the ASG launch new ones.

AnswerA

A is correct because it removes the password from the template, uses Secrets Manager for security, and performs a rolling update without downtime.

Why this answer

Option A is correct because it uses an instance refresh with a minimum healthy percentage of 100% to replace instances without downtime, while the new launch template version retrieves the password from AWS Secrets Manager, eliminating the static password exposure. This approach ensures that the security vulnerability is remediated immediately without disrupting the running application under load.

Exam trap

The trap here is that candidates assume updating the launch template automatically propagates to existing instances, but in reality, the ASG only applies the launch template to new instances, so an instance refresh or manual replacement is required to remediate existing instances.

How to eliminate wrong answers

Option B is wrong because modifying user data on running instances does not change the launch template, so any new instances launched by the ASG will still use the exposed static password; also, manually editing instances is not scalable and risks configuration drift. Option C is wrong because updating the launch template does not automatically apply changes to existing instances; the ASG only uses the launch template for new instances, so existing instances remain vulnerable until replaced. Option D is wrong because terminating all running instances at once would cause downtime, violating the requirement to avoid disruption, and the ASG would launch replacements based on the new template, but the immediate termination is not safe under load.
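The instance refresh described in the correct option can be sketched as a request payload. The dictionary below follows the parameter names of the Auto Scaling StartInstanceRefresh API and could be passed as keyword arguments to boto3's `start_instance_refresh`; the function name, group name, and launch template ID are hypothetical, and nothing here calls AWS.

```python
def instance_refresh_request(asg_name: str, launch_template_id: str,
                             version: str = "$Latest") -> dict:
    """Build StartInstanceRefresh parameters for a rolling replacement."""
    return {
        "AutoScalingGroupName": asg_name,
        "Strategy": "Rolling",
        # Roll the group onto the launch template version that reads the
        # password from Secrets Manager instead of user data.
        "DesiredConfiguration": {
            "LaunchTemplate": {
                "LaunchTemplateId": launch_template_id,
                "Version": version,
            }
        },
        # Minimum healthy percentage from the scenario's correct option.
        "Preferences": {"MinHealthyPercentage": 100},
    }


if __name__ == "__main__":
    # "web-asg" and "lt-0123456789abcdef0" are placeholder identifiers.
    print(instance_refresh_request("web-asg", "lt-0123456789abcdef0"))
```

Rotating the exposed database password is still worth doing afterwards, since the old value was visible in user data.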

Full explanation →

408

MCQhard

A company uses AWS CloudFormation to manage a production environment. The DevOps team wants to implement a change management process where any changes to the stack must be reviewed before execution. Which feature should the team use?

A.CloudFormation StackSets

B.CloudFormation Stack Policies

C.CloudFormation Change Sets

D.CloudFormation Drift Detection

AnswerC

Change Sets allow review of changes before execution.

Why this answer

CloudFormation Change Sets allow the DevOps team to preview how proposed changes to a stack will impact running resources before they are executed. This enables a review-and-approval workflow because the change set can be created, reviewed, and then executed only after approval, satisfying the requirement for a change management process.

Exam trap

The trap here is that candidates often confuse Stack Policies (which control update permissions) with Change Sets (which provide a preview), or they assume Drift Detection can be used to review proposed changes when it only detects unmanaged changes after they occur.

How to eliminate wrong answers

Option A is wrong because CloudFormation StackSets are used to deploy stacks across multiple accounts and regions, not to review or approve changes to a single stack. Option B is wrong because Stack Policies define resource-level update protections (e.g., prevent replacement of a database) but do not provide a preview or review step before changes are applied. Option D is wrong because Drift Detection identifies whether a stack's actual resources have diverged from the template, but it does not control or review proposed changes before execution.

Full explanation →

409

MCQhard

A DevOps team is troubleshooting a performance issue where an Amazon RDS for PostgreSQL instance's CPU utilization spikes every hour. The team suspects a specific query from an application. Which combination of tools can identify the problematic query?

A.CloudWatch Logs Insights and CloudWatch metrics.

B.Amazon RDS Performance Insights and Enhanced Monitoring.

C.VPC Flow Logs and Lambda.

D.CloudTrail and CloudWatch alarms.

AnswerB

Performance Insights identifies top queries; Enhanced Monitoring shows resource usage.

Why this answer

Option B is correct because Performance Insights shows the top SQL queries by load, and RDS Enhanced Monitoring provides OS-level metrics. Option A is wrong because CloudWatch does not show individual queries. Option D is wrong because CloudTrail captures API calls, not database queries.

Option C is wrong because VPC Flow Logs capture network traffic, not queries.

Full explanation →

410

Multi-Selectmedium

Which TWO actions are effective ways to protect an AWS account root user? (Choose 2)

Select 2 answers

A.Use a strong, complex password and change it every 90 days.

B.Use the root user for everyday administrative tasks.

C.Enable multi-factor authentication (MFA) on the root user.

D.Rotate the root user password every 30 days.

E.Delete or disable the root user access keys.

AnswersC, E

MFA adds an extra layer of security.

Why this answer

Option C (MFA) and Option E (no access keys) are correct. Option A is wrong because a strong password alone is not sufficient; MFA is critical. Option B is wrong because using the root user regularly increases risk.

Option D is wrong because rotating the password on a schedule is not, by itself, a protection mechanism.

Full explanation →

411

MCQmedium

A company runs a critical web application on EC2 instances behind an Application Load Balancer. The application stores session state in an in-memory cache on each instance. During deployment of a new version, users experience session timeouts and errors. Which design change will MOST effectively improve resilience and avoid session loss during deployments?

A.Enable sticky sessions (session affinity) on the ALB.

B.Migrate session state to ElastiCache for Redis.

C.Increase the ALB idle timeout to 600 seconds.

D.Increase the EC2 instance size to handle higher memory.

AnswerB

Offloading session state to ElastiCache makes sessions durable across instance replacements.

Why this answer

Option B is correct because migrating session state from in-memory EC2 instance storage to ElastiCache for Redis decouples session data from individual instances. This ensures that when a new deployment replaces instances, sessions persist independently, preventing timeouts and errors. ElastiCache provides a centralized, highly available session store that survives instance termination and scaling events.

Exam trap

The trap here is that candidates often confuse sticky sessions (which only route traffic consistently) with session persistence (which requires external storage), leading them to choose option A despite it not preserving session data across instance replacements.

How to eliminate wrong answers

Option A is wrong because enabling sticky sessions (session affinity) on the ALB would lock users to a specific instance, but during deployment that instance is terminated and replaced, causing session loss regardless of stickiness. Option C is wrong because increasing the ALB idle timeout to 600 seconds only extends how long the ALB keeps a connection open without data transfer; it does not preserve session state stored in the instance's memory when the instance is replaced. Option D is wrong because increasing the EC2 instance size to handle higher memory does not solve the fundamental problem of session state being ephemeral and lost during instance replacement in a deployment.

Full explanation →

412

MCQmedium

A company is designing a serverless application using AWS Lambda, Amazon API Gateway, and Amazon DynamoDB. The application must tolerate a Regional failure. Which design provides the most resilience?

A.Use Lambda@Edge to run functions at AWS edge locations

B.Use DynamoDB auto-scaling and run Lambda in a single Region

C.Use DynamoDB global tables with Lambda functions deployed in multiple Regions and Route 53 multi-Region routing

D.Use DynamoDB Accelerator (DAX) to cache data across Regions

AnswerC

Global tables and multi-Region deployment provide resilience.

Why this answer

Option C is correct because DynamoDB global tables provide multi-Region, fully replicated tables with automatic conflict resolution, ensuring data availability during a Regional outage. Deploying Lambda functions in multiple Regions with Route 53 multi-Region routing (using health checks and latency-based or weighted routing) allows traffic to fail over to a healthy Region, making the entire serverless stack resilient to a Regional failure.

Exam trap

The trap here is that candidates often confuse caching (DAX) or edge computing (Lambda@Edge) with true multi-Region replication and failover, assuming they provide Regional resilience when they do not.

How to eliminate wrong answers

Option A is wrong because Lambda@Edge runs at CloudFront edge locations, not in multiple AWS Regions, and is designed for lightweight request/response modification, not for hosting a full serverless application backend; it does not provide Regional failover for DynamoDB or API Gateway. Option B is wrong because DynamoDB auto-scaling only adjusts throughput within a single Region and does not replicate data across Regions, so a Regional failure would still cause complete data unavailability; running Lambda in a single Region creates a single point of failure for compute. Option D is wrong because DAX is a caching layer that operates within a single Region and does not replicate data across Regions; it cannot provide data durability or availability during a Regional outage, and it is not designed for cross-Region failover.

Full explanation →

413

MCQeasy

A DevOps team uses AWS CodePipeline to deploy a static website to Amazon S3. The pipeline has a source stage from CodeCommit, a build stage using CodeBuild that generates the website files, and a deploy stage that copies files to an S3 bucket. The team wants to add a manual approval step before the deploy stage. What should the engineer do?

A.Add an approval action in the pipeline stage before deploy

B.Use Amazon SNS to send a notification and rely on a Lambda function to resume

C.Add a CodeBuild action that waits for an SNS confirmation

D.Configure the S3 bucket to send an event to the pipeline after upload

AnswerA

Approval action pauses pipeline for manual sign-off.

Why this answer

Option A is correct because an approval action in CodePipeline pauses the pipeline until it is manually approved. Option C is incorrect because CodeBuild does not support manual approval. Option D is incorrect because S3 events cannot pause a pipeline.

Option B is incorrect because the pipeline needs to pause, not just notify.

Full explanation →

414

MCQhard

A company uses AWS CodeCommit, CodeBuild, and CodePipeline to manage a multi-module Java application. The pipeline has a single build stage that runs tests and packages the application into a JAR file. Recently, the team split the application into multiple microservices, each in its own CodeCommit repository. They want to create a single pipeline that can build all microservices in parallel and then deploy them together to an Amazon ECS cluster. The pipeline should trigger when any of the repositories receives a push. Currently, the pipeline is configured with a single source stage pointing to one repository. The build stage uses a single build project. The team wants to minimize changes to the existing pipeline structure. What should a DevOps engineer do to achieve this?

A.Create a separate pipeline for each microservice and use a webhook to trigger all pipelines on any push

B.Create a single source action that uses an S3 bucket as source, and have each repository push to the S3 bucket via a Lambda function

C.Create a single source action in CodePipeline that uses a Lambda function to pull from all repositories

D.Add multiple source actions in the source stage, each pointing to a different CodeCommit repository. Configure the build stage to use multiple input artifacts and update the build project to build all modules

AnswerD

Allows parallel source retrieval and single pipeline execution.

Why this answer

Option D is correct because CodePipeline supports multiple source actions in the same stage, and each source can have its own CodeCommit repository. Additionally, CodeBuild can be configured to use multiple input artifacts, allowing the build project to access all repositories. The pipeline trigger can be set to any source change.

Option C is wrong because a single source action cannot point to multiple repositories. Option A is wrong because it creates a separate pipeline for each microservice, which does not meet the single-pipeline requirement. Option B is wrong because routing every push through a Lambda function and an S3 bucket adds complexity and is not the simplest approach.

Full explanation →

415

MCQeasy

A DevOps team wants to monitor the disk space utilization on their EC2 instances. What is the simplest way to achieve this?

A.Use AWS Systems Manager Inventory to collect disk space data.

B.Install the CloudWatch agent on the EC2 instances and configure the disk metric.

C.Enable EC2 detailed monitoring in CloudWatch.

D.Use EC2 basic monitoring in CloudWatch.

AnswerB

The CloudWatch agent collects disk space metrics and sends them to CloudWatch.

Why this answer

Option B is correct because the CloudWatch agent can collect disk metrics from EC2 instances. Option D is wrong because basic monitoring does not include disk metrics. Option C is wrong because detailed monitoring also does not include disk metrics.

Option A is wrong because Systems Manager Inventory is for software inventory, not disk space.
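For illustration, a minimal CloudWatch agent configuration that collects disk usage might be generated like this. The function name and the chosen mount point are assumptions; the `disk` section with `measurement` and `resources` follows the agent's configuration file format.

```python
import json


def disk_metrics_config(mount_points, interval_seconds=60):
    """Build a CloudWatch agent config that reports used disk space per mount point."""
    return {
        "metrics": {
            "metrics_collected": {
                "disk": {
                    "measurement": ["used_percent"],
                    "resources": list(mount_points),
                    "metrics_collection_interval": interval_seconds,
                }
            }
        }
    }


if __name__ == "__main__":
    # Monitors only the root volume; add other mount points as needed.
    print(json.dumps(disk_metrics_config(["/"]), indent=2))
```

The JSON output would be saved as the agent's configuration file and loaded when the agent starts.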

Full explanation →

416

MCQeasy

A DevOps engineer needs to automate the creation of a CI/CD pipeline using infrastructure as code. Which AWS service is BEST suited to define and provision the pipeline resources?

A.AWS Elastic Beanstalk

B.AWS CodePipeline

C.AWS CloudFormation

D.AWS Service Catalog

AnswerC

CloudFormation allows you to define resources including pipelines in code.

Why this answer

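Option C is correct because CloudFormation is an infrastructure as code (IaC) service that can define and provision the pipeline resources. Option B is wrong because CodePipeline is the pipeline itself, not an IaC tool. Option A is wrong because Elastic Beanstalk is a platform as a service (PaaS).

Option D is wrong because Service Catalog is for product portfolios.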

Full explanation →

417

MCQmedium

A company is using a centralized logging solution with Amazon OpenSearch Service. The DevOps team notices that logs from some EC2 instances are missing. The CloudWatch agent is installed and configured on all instances. What should the team do to troubleshoot the issue?

A.Check the CloudWatch agent status using the CloudWatch agent status command.

B.Configure a Lambda function to poll the CloudWatch agent for logs.

C.Verify that the EC2 instances have an SQS queue configured for log delivery.

D.Check the CloudWatch agent log file located at /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log.

AnswerD

This log file contains errors and warnings that help troubleshoot the issue.

Why this answer

The CloudWatch agent writes detailed operational logs to /var/log/amazon/amazon-cloudwatch-agent/amazon-cloudwatch-agent.log. This file contains errors, warnings, and debug messages that can reveal why logs from specific EC2 instances are not being delivered to Amazon OpenSearch Service. Checking this log is the first and most direct troubleshooting step because it captures agent-level issues such as configuration errors, network connectivity failures, or permission problems.

Exam trap

The trap here is that candidates may assume a 'status' command exists for the CloudWatch agent (Option A) because many other AWS services have such commands, but the agent uses a control script instead, and the real diagnostic starting point is the agent's own log file.

How to eliminate wrong answers

Option A is wrong because the CloudWatch agent does not have a 'status' command; the correct command to check the agent's operational state is 'sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a status'. Option B is wrong because polling the CloudWatch agent with a Lambda function is unnecessary and inefficient; the agent already pushes logs to CloudWatch Logs, and the issue is about missing logs, not about needing to pull them. Option C is wrong because SQS queues are not used for log delivery from the CloudWatch agent; logs are sent directly to CloudWatch Logs via the HTTPS API, and SQS is unrelated to this data path.

Full explanation →

418

MCQeasy

A development team uses AWS CodeStar to set up a continuous delivery pipeline for a web application. The application is deployed to an Elastic Beanstalk environment. After a successful deployment, the team wants to automatically run integration tests against the deployed application. What is the SIMPLEST way to achieve this?

A.Configure the Elastic Beanstalk environment to run integration tests after deployment via a custom platform hook.

B.Use Amazon CloudWatch alarms to trigger a Lambda function that runs the tests.

C.Add a post-deployment script in the CodeDeploy appspec file to run tests.

D.Add a test stage in CodePipeline after the deploy stage that uses CodeBuild to run integration tests.

AnswerD

Adding a test stage in the pipeline is straightforward and automated.

Why this answer

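Option D is correct because adding a test stage in CodePipeline after the deploy stage is the simplest approach. Option A is incorrect because a platform hook runs on each instance as part of the deployment, which couples the tests to the environment configuration and is more work than a pipeline stage. Option B is incorrect because CloudWatch alarms monitor metrics; they do not run tests.

Option C is incorrect because the application is deployed with Elastic Beanstalk, which does not use a CodeDeploy appspec file.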

Full explanation →

419

MCQmedium

Refer to the exhibit. A CloudWatch alarm is configured for an EC2 instance. The CPU utilization exceeds 80% for two consecutive minutes. What action will occur?

A.No action will be taken because the period is 60 seconds.

B.The instance will be recovered.

C.The instance will be rebooted.

D.The instance will be stopped.

AnswerB

Alarm triggers instance recovery.

Why this answer

Option B is correct because the alarm action is 'ec2:recover', which initiates instance recovery. Option C is wrong because the action is recover, not reboot. Option D is wrong because the alarm does not stop the instance.

Option A is wrong because a 60-second period does not prevent the action; the alarm evaluates each 60-second datapoint, and two consecutive breaching datapoints trigger it.

Full explanation →

420

MCQhard

A critical application is deployed on Amazon EKS. The DevOps team notices that pods are failing with 'CrashLoopBackOff' status. The team needs to capture the application logs before the pod restarts to debug the issue. Which approach should the team use?

A.Use 'kubectl logs' command immediately after the crash

B.Configure a sidecar container to stream logs to Amazon CloudWatch Logs

C.Store logs in a ConfigMap

D.Use 'kubectl exec' to access the container and check logs

AnswerB

Sidecar streams logs continuously, even if main container crashes.

Why this answer

Option B is correct because configuring a sidecar container to stream logs to Amazon CloudWatch Logs ensures that logs are persisted and available for debugging even if the pod crashes and restarts. This approach decouples log collection from the pod's lifecycle, allowing the DevOps team to analyze logs from the crash without needing to capture them in real-time. It aligns with the incident response best practice of centralized logging for ephemeral environments like EKS.

Exam trap

The trap here is that candidates assume 'kubectl logs' can always capture logs from a crashed pod, but they overlook that CrashLoopBackOff causes the container to restart, overwriting previous logs in the default Kubernetes logging setup (which only retains logs for the current container instance).

How to eliminate wrong answers

Option A is wrong because 'kubectl logs' returns output from the current container instance by default; the --previous flag recovers output only from the most recently terminated instance, and nothing survives once the pod is deleted or rescheduled, so it is not a dependable way to capture crash logs in a CrashLoopBackOff scenario. Option C is wrong because ConfigMaps are designed for storing configuration data (e.g., environment variables, configuration files), not for dynamic application logs, and they have a size limit of 1 MiB, making them impractical for log storage. Option D is wrong because 'kubectl exec' requires a running container to execute commands, and in a CrashLoopBackOff state the container is usually not running long enough for exec to work.

Full explanation →

421

MCQeasy

A company uses AWS CodeBuild to run unit tests and package a Java application. The build process takes 15 minutes. The team wants to reduce build time by caching dependencies. Which approach should the engineer recommend?

A.Store the compiled dependencies in a separate CodeCommit repository and clone it during the build

B.Mount an Amazon EFS file system to the build container and persist the cache across builds

C.Use an Application Load Balancer in front of a private artifact repository

D.Configure CodeBuild to use Amazon S3 for cache storage and specify the cache directory in buildspec.yml

AnswerD

S3 cache can store Maven local repository and other dependencies, reducing download time on subsequent builds.

Why this answer

Option D is correct because CodeBuild natively supports Amazon S3 for cache storage, allowing you to persist dependency directories across builds. By specifying the cache type as S3 and the path to the dependency cache (e.g., /root/.m2 for Maven) in the buildspec.yml, subsequent builds can reuse previously downloaded dependencies, significantly reducing build time without additional infrastructure.

Exam trap

The trap here is that candidates may confuse CodeBuild's lack of persistent local storage with the ability to mount external file systems like EFS, or they may think that cloning a repository is an efficient caching mechanism, when in fact CodeBuild's native S3 cache is the simplest and most effective solution for dependency caching.

How to eliminate wrong answers

Option A is wrong because storing compiled dependencies in a separate CodeCommit repository and cloning them during each build adds network transfer and checkout overhead, which does not reduce build time and may even increase it. Option B is wrong because, although CodeBuild can mount an Amazon EFS file system, doing so requires running the build in a VPC and provisioning and managing a separate file system, which adds infrastructure and cost compared with CodeBuild's native S3 cache. Option C is wrong because an Application Load Balancer in front of a private artifact repository addresses high availability and load distribution, not caching of dependencies within the build process; it does not reduce the time to download dependencies for each build.
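The S3 cache setup described above has two parts: a cache setting on the CodeBuild project and a cache section in buildspec.yml. The following is a minimal Python sketch that only generates those two fragments; the bucket name and prefix are hypothetical placeholders, and nothing here calls AWS.

```python
import json

# Hypothetical bucket/prefix; replace with a real cache location.
CACHE_LOCATION = "example-build-cache-bucket/maven"


def project_cache_settings(location):
    """Return the 'cache' block of a CodeBuild project definition (S3 type)."""
    return {"type": "S3", "location": location}


def buildspec_cache_section(paths):
    """Render the 'cache' section of buildspec.yml for the given directories."""
    lines = ["cache:", "  paths:"]
    lines += ["    - '{}'".format(p) for p in paths]
    return "\n".join(lines)


if __name__ == "__main__":
    print(json.dumps(project_cache_settings(CACHE_LOCATION), indent=2))
    # /root/.m2 is the Maven local repository mentioned in the explanation.
    print(buildspec_cache_section(["/root/.m2/**/*"]))
```

The project-level setting tells CodeBuild where to store the cache, while the buildspec paths tell it which directories to save and restore.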

422

MCQhard

A DevOps team is troubleshooting a CloudFormation stack creation failure. The stack uses a service role with the trust policy shown in the exhibit. The error message states: 'Insufficient permissions to create the resource'. Which action should the team take to resolve this issue?

A.Modify the CloudFormation template to use the user's IAM role instead of a service role.

B.Create a new stack policy that allows the required actions.

C.Add the user's IAM role to the trust policy.

D.Attach IAM policies to the service role that grant permissions to create the resources.

AnswerD

The service role needs permission to create resources on behalf of CloudFormation.

Why this answer

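Option D is correct because the service role's trust policy already allows CloudFormation to assume it, but the role itself needs IAM permissions to create the resources in the template. Option A is wrong because CloudFormation does not need to assume the user's role; the service role is the right mechanism and only lacks permissions. Option B is wrong because a stack policy controls which stack resources may be updated; it does not grant IAM permissions. Option C is wrong because the trust policy is already correct; adding the user's role to it would not give the service role permission to create resources.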
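The distinction between a role's trust policy (who may assume it) and its permissions policy (what it may do) can be sketched as two separate JSON documents. The Python below is an illustrative sketch only; the listed actions are example assumptions, not the permissions required by the stack in the question.

```python
import json


def trust_policy():
    """Who may assume the role: here, the CloudFormation service."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "cloudformation.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    }


def permissions_policy(actions):
    """What the role may do once assumed (illustrative actions only)."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": sorted(actions),
            "Resource": "*",
        }],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy(), indent=2))
    # Example actions are assumptions; scope them to what the template creates.
    print(json.dumps(permissions_policy(["s3:CreateBucket", "sqs:CreateQueue"]), indent=2))
```

A role with only the first document can be assumed by CloudFormation but cannot create anything, which matches the "Insufficient permissions" symptom.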

423

MCQmedium

A DevOps engineer is designing a monitoring solution for an application that runs on Amazon EC2 instances in an Auto Scaling group. The engineer needs to collect memory utilization metrics and visualize them in a dashboard. What should the engineer do?

A.Create a custom CloudWatch metric namespace and publish memory data using the AWS CLI.

B.Install the Amazon CloudWatch agent on the EC2 instances to collect memory metrics and publish them to CloudWatch.

C.Enable detailed monitoring on the EC2 instances to collect memory metrics.

D.Use AWS CloudTrail to capture memory utilization events from the EC2 instances.

AnswerB

The CloudWatch agent collects OS-level metrics like memory and publishes as custom metrics.

Why this answer

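Memory utilization is not available by default in CloudWatch because it is an OS-level metric that the hypervisor cannot see. The CloudWatch agent must be installed on the instances to collect it and publish it as a custom metric, so Option B is correct.

Option A is incorrect because a custom namespace alone collects nothing; something on the instance still has to gather the memory data, and scripting this with the AWS CLI is more work than using the agent. Option C is incorrect because detailed monitoring only increases the frequency of the standard EC2 metrics and does not add memory metrics. Option D is incorrect because AWS CloudTrail records API activity and does not capture memory metrics.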
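As an illustration of the agent-based approach, the Python sketch below generates a minimal CloudWatch agent configuration that collects mem_used_percent. The collection interval and the choice to append the Auto Scaling group dimension are assumptions for this example.

```python
import json


def memory_agent_config(interval_seconds=60):
    """Build a minimal CloudWatch agent config that collects memory usage."""
    return {
        "metrics": {
            # Tag each data point with the Auto Scaling group so a dashboard
            # can show the group as a whole.
            "append_dimensions": {
                "AutoScalingGroupName": "${aws:AutoScalingGroupName}"
            },
            "metrics_collected": {
                "mem": {
                    "measurement": ["mem_used_percent"],
                    "metrics_collection_interval": interval_seconds,
                }
            },
        }
    }


if __name__ == "__main__":
    print(json.dumps(memory_agent_config(), indent=2))
```

The resulting JSON would be supplied to the agent on each instance, for example through the launch template's user data or Systems Manager.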

424

MCQmedium

A company's production database on Amazon RDS Multi-AZ DB instance experienced a failover. The application experienced a brief outage. How can the company reduce the failover time?

A.Switch to a Single-AZ deployment

B.Increase the DB instance size

C.Use Amazon RDS Proxy

D.Enable Enhanced Monitoring

AnswerC

RDS Proxy reduces failover time by pooling connections and rerouting them quickly.

Why this answer

Using RDS Proxy reduces failover time by maintaining connections and routing them to the new primary instance quickly.

425

MCQeasy

A company uses AWS X-Ray to trace requests through its microservices application. The DevOps engineer notices that some traces are incomplete. What is a possible reason?

A.The X-Ray daemon is not running on the application servers.

B.X-Ray cannot trace requests that cross multiple AWS services.

C.The X-Ray SDK sampling rate is configured too low, causing many requests to be skipped.

D.X-Ray requires the CloudWatch agent to be installed on all EC2 instances.

AnswerC

Low sampling rate means fewer traces are recorded.

Why this answer

Option C is correct because the X-Ray SDK uses a sampling rate to decide which requests to record. If the sampling rate is set too low, a large percentage of requests are skipped, leading to incomplete traces. The DevOps engineer would observe missing segments for requests that were not sampled, even though the daemon and SDK are functioning correctly.

Exam trap

The trap here is that candidates often assume incomplete traces are due to infrastructure issues (daemon not running) rather than a configuration parameter (sampling rate), which is a subtle but common cause in distributed tracing.

How to eliminate wrong answers

Option A is wrong because if the X-Ray daemon were not running, the engineer would likely see no traces at all or errors in the SDK logs, not just incomplete traces. Option B is wrong because X-Ray is specifically designed to trace requests across multiple AWS services (e.g., API Gateway, Lambda, DynamoDB) using trace headers and service maps. Option D is wrong because X-Ray does not require the CloudWatch agent; it uses its own daemon and SDK to send trace data directly to the X-Ray API.
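To see how strongly the sampling configuration affects the number of recorded requests, here is a simplified Python simulation of a reservoir plus fixed-rate rule. It is a rough model rather than the X-Ray SDK's actual implementation; the default values mirror the commonly documented default of one request per second plus 5% of additional requests.

```python
import random


def sampled_count(requests_per_second, seconds, fixed_target=1, rate=0.05, seed=0):
    """Count requests recorded under a reservoir + fixed-rate sampling rule."""
    rng = random.Random(seed)
    sampled = 0
    for _ in range(seconds):
        for i in range(requests_per_second):
            # The first `fixed_target` requests each second are always kept;
            # the rest are kept with probability `rate`.
            if i < fixed_target or rng.random() < rate:
                sampled += 1
    return sampled


if __name__ == "__main__":
    total = 100 * 60
    kept = sampled_count(requests_per_second=100, seconds=60)
    print("recorded {} of {} requests".format(kept, total))
```

Raising the rate or the reservoir size increases coverage at the cost of more trace data.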

426

MCQhard

A company has an AWS Lambda function that processes sensitive data. The function needs to access an RDS database with credentials stored in Secrets Manager. What is the MOST secure way to grant the Lambda function access to the secret?

A.Use AWS KMS to encrypt the credentials and pass them as parameters.

B.Attach an IAM role to the Lambda function with permissions to read the secret and retrieve it at runtime.

C.Store the credentials directly in the Lambda function's environment variables.

D.Use Lambda environment variables with encryption enabled.

AnswerB

Securely grants access without embedding secrets.

Why this answer

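Option B is correct because attaching an IAM role with permission to read the secret and retrieving it at runtime is standard practice; no credentials are embedded in code or configuration. Option A is wrong because KMS is for encryption, not for storing and delivering secrets, and passing credentials as parameters still exposes them. Option C is wrong because storing credentials directly in environment variables is insecure.

Option D is wrong because encrypted environment variables still keep the secret in the function configuration, where it is visible to anyone allowed to read that configuration and can leak into logs.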
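Retrieving the secret at runtime can be sketched as below. To keep the example runnable without AWS access, the client is passed in as a parameter and a fake client stands in for it; in a real function one would pass boto3.client("secretsmanager"). The secret name and the JSON shape of the secret are assumptions.

```python
import json


def get_db_credentials(secrets_client, secret_id):
    """Fetch and decode a JSON secret at runtime via the given client."""
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])


class FakeSecretsClient:
    """Stand-in for boto3.client('secretsmanager'), for local runs only."""

    def get_secret_value(self, SecretId):
        # Hypothetical secret payload; real secrets come from Secrets Manager.
        return {"SecretString": json.dumps({"username": "app", "password": "example-only"})}


if __name__ == "__main__":
    creds = get_db_credentials(FakeSecretsClient(), "prod/db/credentials")
    print(creds["username"])
```

The function's execution role would need secretsmanager:GetSecretValue on the specific secret, and nothing sensitive is stored in the function's configuration.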

427

MCQmedium

A company runs a production web application on Amazon EC2 instances in an Auto Scaling group behind an Application Load Balancer (ALB). The application is deployed across three Availability Zones. The DevOps team recently noticed that the application's error rate is spiking periodically, but they cannot correlate the spikes with any known deployments or changes. The team has enabled detailed CloudWatch metrics for the ALB and EC2, and they are using CloudWatch Logs for application logs. They also have AWS X-Ray enabled for tracing. The team observes that during error spikes, the ALB's 5XX count increases, but the EC2 instance-level CPU and memory metrics remain normal. The application logs show 'Connection timed out' errors. The team suspects the issue is related to network connectivity but is not sure. Which course of action should the DevOps team take to identify the root cause of the periodic error spikes?

A.Enable VPC Flow Logs for the subnets and analyze the logs to identify dropped connections during the error spikes.

B.Increase the EC2 instance size to handle higher traffic and reduce timeouts.

C.Configure a step scaling policy for the Auto Scaling group based on ALB 5XX count.

D.Enable ALB access logs and analyze the 5xx response patterns.

AnswerA

Correct: VPC Flow Logs capture network traffic metadata and can show blocked or rejected connections.

Why this answer

VPC Flow Logs capture metadata about IP traffic going to and from network interfaces in a VPC, including whether the traffic was accepted or rejected. Since the application logs show 'Connection timed out' errors and instance-level metrics are normal, the issue likely lies in the network path (e.g., security groups, NACLs, or subnet routing) rather than the application or compute layer. Analyzing VPC Flow Logs during the error spikes will reveal if connections are being dropped or rejected, pinpointing the root cause of the timeouts.

Exam trap

The trap here is that candidates often jump to scaling or access logs (options C or D) because they focus on the 5XX error symptom, but the question specifically points to network-level timeouts, making VPC Flow Logs the only diagnostic tool that can reveal dropped or rejected packets at the network layer.

How to eliminate wrong answers

Option B is wrong because increasing EC2 instance size addresses compute resource constraints (CPU/memory), but the metrics show those are normal, so the timeouts are not due to resource exhaustion. Option C is wrong because configuring a step scaling policy based on ALB 5XX count would only react to the symptom (error rate) by adding instances, but it does not diagnose the underlying network connectivity issue causing the timeouts. Option D is wrong because ALB access logs record HTTP request/response details (e.g., status codes, timestamps) but do not capture network-level drops or rejections; they would show 5xx errors but not explain why connections are timing out at the network layer.
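Once flow logs are enabled, finding rejected connections during a spike is a filtering exercise. The Python sketch below parses records in the default flow log format and keeps REJECT entries inside a time window; the sample records and timestamps are made up for illustration.

```python
# Field order of the default (version 2) VPC Flow Logs record format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_record(line):
    """Split one default-format record into a dict, or None if malformed."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def rejected_in_window(lines, window_start, window_end):
    """Return REJECT records whose start time falls inside the window."""
    matches = []
    for line in lines:
        record = parse_record(line)
        if record is None or record["action"] != "REJECT":
            continue
        if window_start <= int(record["start"]) <= window_end:
            matches.append(record)
    return matches


if __name__ == "__main__":
    sample = [
        "2 123456789012 eni-0abc 10.0.1.5 10.0.2.9 49152 5432 6 3 180 1700000000 1700000060 REJECT OK",
        "2 123456789012 eni-0abc 10.0.1.5 10.0.2.9 49153 443 6 10 900 1700000000 1700000060 ACCEPT OK",
    ]
    for rec in rejected_in_window(sample, 1699999990, 1700000100):
        print(rec["srcaddr"], "->", rec["dstaddr"], "port", rec["dstport"])
```

At scale the same filter is usually expressed as a CloudWatch Logs Insights or Athena query rather than a script.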

428

MCQmedium

A DevOps engineer is setting up monitoring for an Amazon S3 bucket that stores sensitive data. The engineer needs to be notified whenever an object in the bucket is accessed by a user or application, including read and write operations. Which AWS service should the engineer use to capture these events and trigger notifications?

A.Configure S3 event notifications to send events to an SNS topic for object-level operations.

B.Enable AWS CloudTrail data events for the S3 bucket and configure CloudWatch alarms on the log group.

C.Use AWS Config to record S3 resource changes and trigger an SNS notification.

D.Use Amazon CloudWatch metrics for the S3 bucket and set an alarm on the NumberOfObjects metric.

AnswerA

S3 event notifications provide real-time alerts for specific operations.

Why this answer

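Option A is correct because Amazon S3 can send event notifications to SNS, SQS, or Lambda for object-level operations such as object creation and removal (e.g., s3:ObjectCreated:* and s3:ObjectRemoved:*), which is the most direct way to trigger a notification. Note that S3 event notifications do not include read operations such as GetObject; recording reads requires CloudTrail data events. Option B is wrong because CloudTrail logs API calls but does not by itself send notifications; it needs metric filters and alarms built on top of the log group.

Option C is wrong because AWS Config records configuration changes to the bucket, not access to objects. Option D is wrong because the NumberOfObjects metric tracks how many objects the bucket holds, not per-object access.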

429

MCQeasy

The AWS Config rule 's3-bucket-ssl-requests-only' returns NON_COMPLIANT for the bucket 'my-bucket'. What does this mean?

A.The bucket's policy does not deny requests that are not using SSL.

B.The bucket is publicly accessible.

C.The bucket does not have server access logging enabled.

D.The bucket does not have default encryption enabled.

AnswerA

Correct interpretation of the rule.

Why this answer

Option A is correct because the rule checks that the bucket policy explicitly denies requests that do not use SSL/TLS (plain HTTP). Option B is wrong because the rule does not check public access. Option C is wrong because the rule checks for SSL, not server access logging.

Option D is wrong because the rule checks the bucket policy, not default encryption.
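A bucket policy statement that denies non-SSL requests typically uses the aws:SecureTransport condition key. The Python sketch below builds such a statement and includes a simplified checker; the checker is only an approximation for illustration, not the Config rule's actual evaluation logic.

```python
import json


def deny_insecure_transport_statement(bucket_name):
    """Build a statement that denies any request not made over SSL/TLS."""
    return {
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::{}".format(bucket_name),
            "arn:aws:s3:::{}/*".format(bucket_name),
        ],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }


def denies_insecure_transport(policy):
    """Simplified check: is there a Deny conditioned on SecureTransport=false?"""
    for stmt in policy.get("Statement", []):
        condition = stmt.get("Condition", {}).get("Bool", {})
        if stmt.get("Effect") == "Deny" and condition.get("aws:SecureTransport") == "false":
            return True
    return False


if __name__ == "__main__":
    policy = {"Version": "2012-10-17",
              "Statement": [deny_insecure_transport_statement("my-bucket")]}
    print(json.dumps(policy, indent=2))
    print(denies_insecure_transport(policy))
```

Adding a statement like this to the policy of 'my-bucket' is the usual remediation for the NON_COMPLIANT result.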

430

MCQmedium

A company uses AWS OpsWorks for configuration management. They have a stack with a layer that includes several EC2 instances. The DevOps engineer needs to deploy a custom configuration file to all instances in the layer. What is the recommended approach?

A.Use custom JSON in the stack settings to specify the file content

B.Create a custom Chef recipe and assign it to the layer's lifecycle events

C.Use AWS Systems Manager Run Command to copy the file

D.Add the file to the instance's user data script

AnswerB

Custom recipes are the standard way to manage configuration in OpsWorks.

Why this answer

Option B is correct because OpsWorks allows custom Chef recipes to be associated with a layer's lifecycle events, and a recipe can deploy the configuration file to every instance in the layer. Option A is wrong because custom JSON is for overriding stack settings and attributes, not for deploying files. Option D is wrong because user data only runs at launch.

Option C is wrong because Run Command is for ad-hoc commands, not for ongoing configuration management.

431

MCQeasy

A company wants to centralize audit logs from multiple AWS accounts into a single S3 bucket. The logs must be encrypted at rest using a KMS key. Which solution is the MOST secure and scalable?

A.Create an IAM role in each account and manually copy logs to a central bucket

B.Configure each account's CloudTrail to send logs to a central S3 bucket with a bucket policy that grants cross-account permissions

C.Use Amazon Kinesis Data Firehose to stream logs to S3

D.Use AWS Config rules to aggregate logs into a central bucket

AnswerB

This is the standard approach; CloudTrail can deliver to a central bucket, and KMS encryption can be applied.

Why this answer

Using AWS CloudTrail with an organization trail to deliver logs to a central S3 bucket is the recommended approach for multi-account logging. KMS encryption can be enabled on the bucket. Cross-account IAM roles can be complex and less centralized.

432

MCQeasy

A company is using AWS Elastic Beanstalk to deploy a web application. The development team wants to ensure that environment variables are set consistently across all environments (development, staging, production) without manual intervention. Which AWS service or feature should be used to manage these environment variables?

A.AWS CloudFormation templates with parameters

B.Elastic Beanstalk environment properties

C.AWS Systems Manager Parameter Store

D.AWS CodeDeploy environment variables

AnswerC

Parameter Store can store configuration data and Elastic Beanstalk can retrieve it via instance profile or configuration files.

Why this answer

Option C is correct because AWS Systems Manager Parameter Store can store configuration data securely in one place, and Elastic Beanstalk can retrieve it during environment creation or update. Option A is incorrect because CloudFormation is for infrastructure provisioning, not runtime configuration management. Option B is incorrect because Elastic Beanstalk environment properties are set per environment and are not shared across environments.

Option D is incorrect because CodeDeploy is for deployment automation, not configuration management.

433

MCQhard

A company uses AWS Lambda functions to process events from Amazon SQS. The Lambda function sometimes fails due to timeouts. The team wants to preserve the event for reprocessing. How should they configure the integration?

A.Set up a DLQ on the SQS queue that receives the events

B.Use Lambda reserved concurrency

C.Enable Lambda function DLQ with SNS topic

D.Increase Lambda timeout to maximum

AnswerA

SQS DLQ stores messages that failed processing, allowing reprocessing.

Why this answer

By configuring a dead-letter queue (DLQ) on the SQS queue, failed messages are preserved for later reprocessing.
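On the source queue, a dead-letter queue is configured through the RedrivePolicy attribute, whose value is itself a JSON string. The Python sketch below builds that attribute; the ARN and the receive count of 5 are illustrative assumptions.

```python
import json


def redrive_policy_attributes(dlq_arn, max_receive_count=5):
    """Build the queue attributes that attach a DLQ to a source queue."""
    policy = {
        "deadLetterTargetArn": dlq_arn,
        # After this many failed receives, SQS moves the message to the DLQ.
        "maxReceiveCount": max_receive_count,
    }
    # SQS expects the RedrivePolicy attribute value as a JSON-encoded string.
    return {"RedrivePolicy": json.dumps(policy)}


if __name__ == "__main__":
    # Hypothetical ARN for illustration.
    attrs = redrive_policy_attributes("arn:aws:sqs:us-east-1:123456789012:events-dlq")
    print(attrs["RedrivePolicy"])
```

When the Lambda function times out, the message becomes visible again on the source queue; once it has been received maxReceiveCount times without being deleted, it lands in the DLQ for later reprocessing.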

434

MCQhard

Refer to the exhibit. An AWS Lambda function has the IAM policy shown. The function is intended to write logs to CloudWatch Logs and publish custom metrics to CloudWatch. However, the function is failing to publish custom metrics. What is the MOST likely cause?

A.The function does not have permission to perform logs:PutLogEvents for the specific log stream.

B.The function does not have permission to perform cloudwatch:PutMetricData.

C.The function is trying to put metrics to a CloudWatch namespace that is not allowed by the resource constraint.

D.The function's execution role is missing the necessary trust policy to allow Lambda to assume the role.

AnswerD

Without a trust policy, the Lambda service cannot assume the role, causing all actions to fail.

Why this answer

Option D is correct. The exhibit shows only the role's permissions policy, not its trust policy. That policy allows `cloudwatch:PutMetricData` on all resources, which is the correct form because `PutMetricData` does not support resource-level permissions and must be granted with `Resource: "*"`. The permissions policy is therefore not the problem, which leaves the trust policy: if the role does not trust the Lambda service, Lambda cannot assume it and none of the role's permissions can be used.

Option A is wrong because `logs:PutLogEvents` affects log delivery, not metric publishing, and the policy allows it for the log group shown. Option B is wrong because the policy does allow `cloudwatch:PutMetricData`. Option C is wrong because the policy applies to all resources and places no restriction on the CloudWatch namespace.

435

Multi-Selecteasy

A company uses AWS CodePipeline to deploy a static website to Amazon S3 and CloudFront. The pipeline currently uses CodeBuild to run tests and then deploys to an S3 bucket. The team wants to add a stage that invalidates the CloudFront cache after deployment. Which TWO actions achieve this?

Select 2 answers

A.Configure CodePipeline to directly invalidate CloudFront using a built-in action.

B.Create a CodePipeline stage with a deploy action to CloudFront.

C.Use CloudFront to automatically invalidate based on S3 events.

D.Use the AWS CLI command 'aws cloudfront create-invalidation' in a CodeBuild build step.

E.Add a Lambda function as a custom action in CodePipeline that calls the CloudFront invalidation API.

AnswersD, E

This can be run as a build action.

Why this answer

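Option D is correct because a CodeBuild build step can run 'aws cloudfront create-invalidation' after the deployment. Option E is correct because CodePipeline can invoke a Lambda function as a stage action, and the function can call the CloudFront invalidation API. Option A is wrong because CodePipeline does not provide a built-in action that invalidates CloudFront.

Option B is wrong because a CloudFront deploy action is not a native CodePipeline integration. Option C is wrong because CloudFront does not invalidate its cache automatically in response to S3 events.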
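For the Lambda-based approach, the core of the function is a single CreateInvalidation call. The Python sketch below builds the request and takes the CloudFront client as a parameter so it can run without AWS access; in a real function one would pass boto3.client("cloudfront"). The distribution ID and the "/*" path are assumptions.

```python
import time


def build_invalidation_batch(paths, caller_reference=None):
    """Build the InvalidationBatch structure for CreateInvalidation."""
    items = list(paths)
    return {
        "Paths": {"Quantity": len(items), "Items": items},
        # CallerReference must be unique per request; a timestamp is one option.
        "CallerReference": caller_reference or str(time.time()),
    }


def invalidate(cloudfront_client, distribution_id, paths=("/*",)):
    """Request an invalidation of the given paths on one distribution."""
    return cloudfront_client.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch=build_invalidation_batch(paths),
    )


if __name__ == "__main__":
    print(build_invalidation_batch(["/index.html", "/css/*"], "deploy-001"))
```

When run as a CodePipeline action, the function also has to report success or failure back to the pipeline, which is omitted here for brevity.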

436

Multi-Selecthard

A DevOps team uses AWS CloudFormation with nested stacks. They are experiencing stack update failures because changes to a nested stack cause resource conflicts. Which THREE best practices should they follow to manage nested stack updates? (Choose THREE.)

Select 3 answers

A.Set the stack update to disable rollback to allow debugging of failures.

B.Apply a stack policy to prevent updates to critical resources in the nested stacks.

C.Use the resource import feature to bring existing resources under CloudFormation management.

D.Use AWS CloudFormation change sets to review changes before executing updates.

E.Use DependsOn to ensure nested stacks are updated in a specific order.

AnswersB, C, D

Stack policies can protect resources from unintentional updates.

Why this answer

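Options B, C, and D are correct. Stack policies protect critical resources from unintended updates; resource import brings existing resources under CloudFormation management and can avoid replacement; change sets preview changes before they are executed. Option E is wrong because using DependsOn can cause circular dependencies.

Option A is wrong because disabling rollback only leaves the failed stack in place for debugging; it does not prevent the resource conflicts.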

437

MCQhard

A company uses AWS CodePipeline with multiple stages: source, build, test, and deploy. The test stage takes 45 minutes to complete. Developers complain that the pipeline takes too long to provide feedback. The team wants to run tests in parallel across multiple environments. Which approach should be taken to reduce the pipeline execution time?

A.Increase the compute capacity of the test environment

B.Configure the test stage with parallel actions in CodePipeline

C.Use AWS CodeBuild batch builds with a single buildspec

D.Create multiple pipelines, each running a subset of tests

AnswerB

Parallel actions run simultaneously, reducing total time

Why this answer

Running tests in parallel across multiple environments can be achieved by configuring parallel actions in the CodePipeline test stage, which executes multiple test actions simultaneously. This reduces the stage time from the sequential 45 minutes to the duration of the longest single test action. Option B is correct.

Option A is wrong because increasing compute capacity may speed up individual tests but does not add parallelism. Option C is wrong because a CodeBuild batch build fans out builds within a single CodeBuild project; it does not by itself run separate test actions against multiple environments in the pipeline. Option D is wrong because creating multiple pipelines increases complexity and resource usage.
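In CodePipeline, actions in a stage that share the same runOrder value run in parallel, and groups with different values run one after another. The Python sketch below models the resulting stage duration; the action names and the 15-minute timings are hypothetical.

```python
from collections import defaultdict


def stage_duration(actions):
    """Estimate stage time from (name, run_order, minutes) tuples.

    Actions sharing a run_order run in parallel, so each group costs as
    much as its slowest action; groups then run in sequence.
    """
    groups = defaultdict(list)
    for _name, run_order, minutes in actions:
        groups[run_order].append(minutes)
    return sum(max(durations) for durations in groups.values())


if __name__ == "__main__":
    sequential = [("unit", 1, 15), ("integration", 2, 15), ("ui", 3, 15)]
    parallel = [("unit", 1, 15), ("integration", 1, 15), ("ui", 1, 15)]
    print(stage_duration(sequential))
    print(stage_duration(parallel))
```

With the hypothetical timings above, giving all three actions the same runOrder reduces the modeled stage time from the sum of the durations to the longest single duration.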

438

MCQeasy

A company is running a batch processing job on Amazon EMR that writes results to an Amazon S3 bucket. The job runs daily and takes about 2 hours. The DevOps team wants to be alerted if the job fails or takes longer than 3 hours. Which solution is the MOST cost-effective and operationally efficient?

A.Configure Amazon Simple Notification Service (SNS) directly from the EMR job to send notifications on completion.

B.Use Amazon CloudWatch Events to trigger an AWS Lambda function when the EMR cluster changes to 'TERMINATED' state, then check the job duration and send an alert if it exceeded 3 hours.

C.Create a CloudWatch alarm on the EMR cluster's EC2 instance CPUUtilization metric to detect abnormal runtime.

D.Use Amazon CloudWatch Logs to monitor the job's log stream and create a metric filter for 'FAILED' messages.

AnswerB

Cost-effective and event-driven.

Why this answer

Option B is correct because it uses CloudWatch Events to detect the EMR cluster's 'TERMINATED' state, which triggers a Lambda function that can check the job duration against the 3-hour threshold and send an alert via SNS if needed. This approach is cost-effective (no polling, event-driven) and operationally efficient, as it decouples monitoring from the job itself and handles both failure and timeout scenarios without modifying the EMR job code.

Exam trap

The trap here is that candidates often assume CloudWatch Logs metric filters (Option D) are the simplest way to detect failures, but they miss the timeout requirement and require log-based failure patterns, whereas event-driven state monitoring (Option B) inherently captures both failure and duration scenarios without custom logging.

How to eliminate wrong answers

Option A is wrong because configuring SNS directly from the EMR job requires modifying the job code to publish notifications, which is not operationally efficient and does not natively handle the 'takes longer than 3 hours' timeout condition—it only sends a completion notification, not an alert for excessive duration. Option C is wrong because CPUUtilization metrics are not a reliable indicator of job runtime or failure; a job can fail or run long without abnormal CPU usage, and this approach would require complex threshold tuning and still miss job-specific failures. Option D is wrong because using CloudWatch Logs metric filters for 'FAILED' messages only detects explicit failure log entries, not the timeout condition (job running >3 hours), and it requires the job to write specific log messages, which may not be present for all failure modes.

439

Multi-Selecthard

A security team is investigating a potential data exfiltration from an S3 bucket. They need to identify which IAM user accessed a specific object and whether the access was from a known IP address. Which THREE AWS services or features should they use together to conduct this investigation?

Select 3 answers

A.AWS Config

B.VPC Flow Logs

C.AWS CloudTrail

D.S3 server access logs

E.Amazon Athena

AnswersC, D, E

Logs all S3 API calls including GetObject.

Why this answer

Options C, D, and E are correct. C is correct because AWS CloudTrail (with S3 data events enabled) logs S3 API calls including GetObject, along with the calling IAM identity. D is correct because S3 server access logs provide detailed object-level access records including requester, IP address, and object key.

E is correct because Amazon Athena can query both CloudTrail logs and S3 access logs efficiently. Option A is wrong because AWS Config records resource configuration, not data access. Option B is wrong because VPC Flow Logs capture network traffic but not S3 object-level access.

440

MCQhard

A company uses AWS Elastic Beanstalk to deploy a web application. The deployment fails with a '502 Bad Gateway' error. The developer checks the logs and sees that the application is running but returns errors. The environment uses a load balancer. What is the MOST likely cause?

A.The application source bundle is missing a required file.

B.The security group of the environment does not allow inbound HTTP traffic.

C.The environment's environment variables are misconfigured.

D.The application is not binding to the correct port or is crashing under load.

AnswerD

If the application does not respond, the load balancer returns 502.

Why this answer

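Option D is correct because 502 errors from a load balancer often indicate that the target is not responding properly or is unhealthy, for example because the application is listening on the wrong port or crashing under load. Option A is wrong because a missing file in the source bundle would typically fail the deployment or the application start rather than produce a 502 from a running application. Option B is wrong because a security group misconfiguration would cause connection timeouts rather than a 502.

Option C is wrong because misconfigured environment variables cause application errors, not necessarily a 502.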

441

MCQmedium

A company stores sensitive data in Amazon S3. A security audit reveals that several S3 buckets are publicly accessible. The DevOps engineer needs to implement a solution that automatically detects and alerts on any S3 bucket that becomes public. Which AWS service should the engineer use?

A.Amazon Macie

B.S3 Block Public Access

C.AWS Trusted Advisor

D.AWS Config

AnswerA

Macie can detect and alert on public buckets.

Why this answer

Option A is correct because Amazon Macie uses machine learning to discover, classify, and protect sensitive data in S3, and it continuously evaluates buckets and can alert when one becomes public. Option C is wrong because AWS Trusted Advisor checks for public buckets but does not provide real-time alerts. Option B is wrong because S3 Block Public Access is a preventive control, not a detective one.

Option D is wrong because AWS Config can track bucket policies but needs rules and notification targets configured before it alerts on public access.

442

MCQhard

A company runs a stateful web application on EC2 instances in an Auto Scaling group. The application uses an Application Load Balancer (ALB) and an Amazon ElastiCache Redis cluster. Users report that after a scaling event, they are logged out and lose session data. What is the most likely cause?

A.The ALB health check interval is too short, causing healthy instances to be marked unhealthy

B.The ElastiCache cluster is configured with in-transit encryption, causing session tokens to be invalidated

C.The Auto Scaling group is using a termination policy that terminates the oldest instance first, which holds active sessions

D.The ElastiCache cluster is not configured for Multi-AZ and a node failure caused all sessions to be lost

AnswerD

Without Multi-AZ, a node failure can cause data loss.

Why this answer

Option D is correct because the scenario describes a stateful web application that relies on ElastiCache Redis for session storage. If the ElastiCache cluster is not configured for Multi-AZ, a node failure can cause all cached session data to be lost, logging users out. This is the most likely cause of session loss after a scaling event, as scaling events do not directly affect ElastiCache data persistence.

Exam trap

The trap here is that candidates may focus on the Auto Scaling group's termination policy or ALB health checks, overlooking that the session data is stored externally in ElastiCache and that its lack of high availability is the root cause of session loss.

How to eliminate wrong answers

Option A is wrong because a short health check interval would cause instances to be marked unhealthy and replaced, but it would not directly cause session data loss; sessions are stored in ElastiCache, not on the instances. Option B is wrong because in-transit encryption on ElastiCache protects data during transmission and does not invalidate session tokens; it is unrelated to session persistence. Option C is wrong because terminating the oldest instance first is a common termination policy that does not cause session loss if sessions are stored externally in ElastiCache; the issue is with the session store itself, not the instance termination order.


443

MCQhard

A DevOps team uses AWS CodePipeline with an S3 source action and CodeBuild as a build provider. The pipeline has a manual approval step before deployment. Recently, the team noticed that the pipeline automatically starts when a new object is uploaded to the S3 bucket, even if the object is not the source code. They want to ensure that the pipeline only triggers on changes to the source code directory. What is the MOST efficient solution?

A.Use Amazon CloudWatch Events to create a custom rule that matches the source code path and triggers the pipeline.

B.Enable versioning on the S3 bucket and configure the pipeline to use the latest version.

C.Configure the S3 event notification to use a prefix filter that matches the source code directory.

D.Disable the S3 trigger and manually start the pipeline after each code commit.

AnswerC

S3 event notifications support prefix and suffix filtering, allowing precise triggers.

Why this answer

Option C is correct because S3 event notifications can be filtered by prefix and suffix, so setting a prefix filter to the source code directory ensures only changes to that directory trigger the pipeline. Option A is wrong because CloudWatch Events can filter but adds complexity and cost. Option B is wrong because enabling versioning alone doesn't filter events.

Option D is wrong because disabling automatic triggers would require manual intervention, which is not efficient.
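As a rough sketch of the prefix filter described above, the snippet below builds a dictionary in the shape that boto3's `put_bucket_notification_configuration` accepts as `NotificationConfiguration`. The `src/` prefix, the Lambda target, and its ARN are hypothetical placeholders; the question does not say which target relays the event to the pipeline. `key_matches` is only a local stand-in for the filtering S3 performs, not an AWS API.

```python
def build_notification_config(prefix: str, target_arn: str) -> dict:
    """Return a NotificationConfiguration dict with a single prefix filter."""
    return {
        "LambdaFunctionConfigurations": [
            {
                # Hypothetical target; the source does not name one.
                "LambdaFunctionArn": target_arn,
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}
                },
            }
        ]
    }


def key_matches(config: dict, object_key: str) -> bool:
    """Local approximation of S3 filtering: True if any entry's rules all match."""
    for entry in config["LambdaFunctionConfigurations"]:
        matched = True
        for rule in entry["Filter"]["Key"]["FilterRules"]:
            if rule["Name"] == "prefix" and not object_key.startswith(rule["Value"]):
                matched = False
            if rule["Name"] == "suffix" and not object_key.endswith(rule["Value"]):
                matched = False
        if matched:
            return True
    return False


config = build_notification_config(
    "src/", "arn:aws:lambda:us-east-1:123456789012:function:start-pipeline"
)
```

With this configuration, an upload to `src/app.zip` would match the filter, while an upload to `docs/readme.txt` would not.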


444

Multi-Selecteasy

A company is using Amazon CloudWatch Logs to collect application logs. They need to search and analyze the logs in near real-time. Which TWO AWS services can be used to achieve this?

Select 2 answers

A.Amazon CloudWatch Logs Insights

B.Amazon CloudWatch Synthetics

C.Amazon Kinesis Data Analytics

D.Amazon Athena

E.Amazon OpenSearch Service

AnswersA, E

CloudWatch Logs Insights enables interactive querying of log data stored in CloudWatch Logs.

Why this answer

A and E are correct: CloudWatch Logs Insights allows querying logs directly, and Amazon OpenSearch Service can ingest logs via a subscription filter for analysis. D is wrong because Athena queries data in S3, not directly from CloudWatch Logs. C is wrong because Kinesis Data Analytics processes streaming data, but not directly from CloudWatch Logs.

B is wrong because CloudWatch Synthetics is for canaries, not log analysis.
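To make the CloudWatch Logs Insights option more concrete, here is a minimal sketch that assembles the keyword arguments one might pass to boto3's `logs.start_query`. The log group name, the 15-minute window, and the `ERROR` filter are illustrative assumptions, and nothing here calls AWS.

```python
import time
from typing import Optional


def build_insights_query(log_group: str, minutes: int, now: Optional[float] = None) -> dict:
    """Build start_query arguments covering the last `minutes` minutes."""
    end = int(now if now is not None else time.time())
    return {
        "logGroupName": log_group,
        "startTime": end - minutes * 60,  # epoch seconds
        "endTime": end,
        # Illustrative query: the 20 most recent messages containing ERROR.
        "queryString": (
            "fields @timestamp, @message "
            "| filter @message like /ERROR/ "
            "| sort @timestamp desc "
            "| limit 20"
        ),
    }


# "/app/web" is a hypothetical log group name.
query_args = build_insights_query("/app/web", minutes=15)
```

The resulting dictionary could be passed as `client.start_query(**query_args)`, with results then polled through `get_query_results`.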


445

MCQhard

A company is using Amazon CloudWatch Logs to collect logs from its containerized applications running on Amazon ECS Fargate. The DevOps engineer wants to centralize logs from multiple services into a single CloudWatch Logs log group. They currently have a log group per service. Which approach minimizes operational overhead and cost?

A.Use Amazon Kinesis Data Firehose to stream logs from each log group to Amazon S3 and then to a central CloudWatch Logs group.

B.Create a CloudWatch Logs subscription filter on each service log group to stream matching log events to a central log group.

C.Export each log group to Amazon S3 and use Amazon Athena to query across all exported logs.

D.Modify the application logging configuration in each container to send logs to a single log group and log stream per container.

AnswerB

Subscription filters can forward logs to a destination log group within the same account, allowing centralization without code changes.

Why this answer

Option B is correct because CloudWatch Logs subscription filters allow you to stream log events from multiple source log groups directly into a single destination log group in real time, without any intermediate storage or additional services. This minimizes operational overhead by using a native CloudWatch feature and avoids the cost of running Kinesis, S3, or Athena for this specific use case.

Exam trap

The trap here is that candidates may overcomplicate the solution by choosing a multi-service pipeline (Kinesis, S3, Athena) when a native CloudWatch Logs feature (subscription filter) directly solves the requirement with minimal overhead and cost.

How to eliminate wrong answers

Option A is wrong because using Kinesis Data Firehose to stream logs to S3 and then back into CloudWatch Logs introduces unnecessary complexity, latency, and cost (Kinesis, S3, and Lambda if needed), and CloudWatch Logs cannot directly ingest from S3 without additional processing. Option C is wrong because exporting logs to S3 and querying with Athena does not centralize logs into a single CloudWatch Logs group; it only provides a query layer over exported files, losing real-time log streaming and the ability to use CloudWatch Logs features like metric filters and alarms. Option D is wrong because modifying the application logging configuration to send all logs to a single log group and a single log stream per container would cause log streams to be shared across services, violating the intended isolation and making it impossible to distinguish logs from different services; it also requires code changes in every container, increasing operational overhead.


446

MCQhard

A company runs a multi-tier web application on EC2 instances behind an Application Load Balancer. The application experiences intermittent 503 errors during peak traffic. The Auto Scaling group is configured with a step scaling policy based on CPU utilization. CloudWatch metrics show that CPU utilization never exceeds 70%, but the ALB target group reports that some targets are unhealthy. What is the MOST likely cause?

A.The application health check endpoint is returning HTTP 5xx or timing out.

B.The ALB is misconfigured with an incorrect security group blocking traffic to the targets.

C.The ALB connection draining settings are too short, causing in-flight requests to fail.

D.The step scaling policy is too aggressive and is terminating instances prematurely.

AnswerA

Unhealthy targets cause the ALB to stop routing traffic, leading to 503 errors for users.

Why this answer

Option A is correct because health checks are failing, causing the ALB to stop sending traffic to those instances, which results in 503 errors. The CPU utilization might be low because the unhealthy instances are not receiving traffic. Option D is wrong because if the scaling policy were too aggressive, you'd see more instances and possibly lower CPU, not 503 errors.

Option C is wrong because connection draining does not cause health check failures. Option B is wrong because the application itself is failing health checks, not the ALB configuration.
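The way consecutive failed health checks take a target out of rotation can be sketched with a small simulation. This is a simplified model, not the ALB's actual implementation: it assumes the target starts healthy (real targets begin in an initial state), and the threshold values are illustrative.

```python
def target_states(results, healthy_threshold=5, unhealthy_threshold=2):
    """Return the target state after each health check result.

    results: iterable of bools, True meaning the check passed
    (for example HTTP 200 within the timeout).
    """
    state = "healthy"  # simplifying assumption: start in the healthy state
    passes = failures = 0
    states = []
    for passed in results:
        if passed:
            passes += 1
            failures = 0
            if state == "unhealthy" and passes >= healthy_threshold:
                state = "healthy"
        else:
            failures += 1
            passes = 0
            if state == "healthy" and failures >= unhealthy_threshold:
                state = "unhealthy"
        states.append(state)
    return states


# Two consecutive failures (5xx or timeout) flip the target to unhealthy.
history = target_states([True, False, False, True, True],
                        healthy_threshold=2, unhealthy_threshold=2)
```

In this model the target stops receiving traffic after the second consecutive failure and returns only after two consecutive passes, regardless of CPU utilization.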


447

MCQeasy

A DevOps engineer needs to set up a centralized logging solution for multiple AWS accounts. The logs must be stored in a central Amazon S3 bucket for long-term retention and analysis. Which combination of services should the engineer use?

A.Use AWS CloudTrail to deliver logs to the central S3 bucket.

B.Use Amazon Athena and Amazon QuickSight to query logs across accounts.

C.Use Amazon CloudWatch Logs and Amazon Kinesis Data Firehose to deliver logs to the central S3 bucket.

D.Use Amazon VPC Flow Logs to send logs to the central S3 bucket.

AnswerC

CloudWatch Logs can export logs to S3, and Kinesis Firehose can stream logs to S3 for centralized storage.

Why this answer

Option C is correct because Amazon CloudWatch Logs can deliver log data to Amazon S3 via export tasks or subscription filters, and Amazon Kinesis Data Firehose can also stream logs to S3. Together they enable centralized logging. Option A is wrong because CloudTrail alone does not capture application logs.

Option D is wrong because VPC Flow Logs only capture network traffic. Option B is wrong because Amazon Athena and Amazon QuickSight are analysis tools, not ingestion services.


448

MCQhard

A company runs a containerized microservices architecture on Amazon ECS with Fargate. The services communicate via an internal Application Load Balancer. Recently, a new deployment of Service A caused its health checks to fail. The DevOps engineer notices that the old tasks remain running and the service is unavailable. What configuration change would prevent this issue in future deployments?

A.Set the deployment minimum healthy percent to 50 and maximum percent to 100 with a health check grace period

B.Set the deployment circuit breaker to rollback on deployment failure and disable rollback

C.Change the deployment controller from ECS to CodeDeploy for blue/green deployments

D.Set the deployment minimum healthy percent to 0 and maximum percent to 200

AnswerA

This configuration ensures old tasks remain until new tasks pass health checks.

Why this answer

Setting the deployment minimum healthy percent to 50 and maximum percent to 100 ensures that at least 50% of tasks stay healthy during a deployment. If health checks fail, the deployment can roll back because the old tasks are not replaced until new tasks are healthy. Option D is wrong because it allows replacing all tasks before health checks pass. Option C is wrong because it is a deployment controller, not a configuration to prevent failure.

Option B is wrong because it removes the ability to roll back.
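The arithmetic behind the two percentages can be sketched as follows. The helper name and the desired count of 4 are hypothetical; the rounding (minimum rounded up, maximum rounded down) follows the ECS documentation as I understand it, so treat it as an assumption to verify.

```python
def deployment_bounds(desired_count: int,
                      minimum_healthy_percent: int,
                      maximum_percent: int) -> tuple:
    """Return (min_running, max_running) task counts during a rolling deployment."""
    # Ceiling division for the lower bound, floor division for the upper bound.
    min_running = -(-desired_count * minimum_healthy_percent // 100)
    max_running = desired_count * maximum_percent // 100
    return min_running, max_running


# Option A with a hypothetical desired count of 4 tasks.
option_a = deployment_bounds(4, 50, 100)  # (2, 4)
# Option D with the same desired count.
option_d = deployment_bounds(4, 0, 200)   # (0, 8)
```

Under option A the scheduler may stop at most two of the four old tasks at a time, whereas option D allows the running count to drop to zero.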


449

MCQhard

A company runs a stateless web application on Amazon ECS with Fargate launch type. The application experiences intermittent traffic spikes. The company wants to ensure that the application can scale automatically and remain resilient to underlying infrastructure failures. Which combination of actions should the DevOps engineer take?

A.Configure a scheduled scaling policy for the Amazon ECS service to add tasks during known peak hours.

B.Launch tasks in a single Availability Zone and use an Application Auto Scaling target tracking policy based on CPU utilization.

C.Configure a step scaling policy for the Amazon ECS service and increase the task memory size.

D.Configure an Application Auto Scaling target tracking policy based on memory utilization and enable Amazon ECS service auto-recovery.

AnswerD

Target tracking scales based on demand; service auto-recovery replaces failed tasks, ensuring resilience.

Why this answer

Option D is correct because it combines Application Auto Scaling target tracking based on memory utilization, which is a relevant metric for a stateless web application to handle traffic spikes, with Amazon ECS service auto-recovery, which automatically replaces unhealthy tasks to ensure resilience against underlying infrastructure failures. This approach provides both automatic scaling and fault tolerance without manual intervention.

Exam trap

The trap here is that candidates often assume CPU utilization is the only valid scaling metric for web applications, but memory utilization can be more appropriate for stateless workloads, and they may overlook the critical need for service auto-recovery to handle infrastructure failures in Fargate.

How to eliminate wrong answers

Option A is wrong because scheduled scaling only adds capacity at known peak hours and cannot respond to intermittent, unpredictable traffic spikes; it also does not address resilience to infrastructure failures. Option B is wrong because launching tasks in a single Availability Zone creates a single point of failure, violating resilience best practices, and while target tracking based on CPU utilization can scale, it does not provide auto-recovery for failed tasks. Option C is wrong because step scaling policies can be effective, but increasing task memory size does not directly improve scaling or resilience; it may reduce the need for scaling but does not automate recovery from failures.
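For reference, a target tracking policy on memory utilization could be expressed roughly like this. The sketch only builds the arguments for boto3's Application Auto Scaling `put_scaling_policy` call; the cluster name, service name, 70% target, and cooldown values are hypothetical, and the scalable target would need to be registered separately.

```python
def build_target_tracking_policy(cluster: str, service: str, target_percent: float) -> dict:
    """Build put_scaling_policy arguments for an ECS service."""
    return {
        "PolicyName": f"{service}-memory-target-tracking",  # hypothetical name
        "ServiceNamespace": "ecs",
        "ResourceId": f"service/{cluster}/{service}",
        "ScalableDimension": "ecs:service:DesiredCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_percent),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ECSServiceAverageMemoryUtilization"
            },
            # Illustrative cooldowns in seconds.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 120,
        },
    }


policy_args = build_target_tracking_policy("web-cluster", "web", 70)
```

Swapping the metric type to `ECSServiceAverageCPUUtilization` would give the CPU-based variant mentioned in option B.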


450

MCQhard

A company uses AWS CloudFormation to manage its infrastructure. The stack creation recently failed because the AWS Lambda function was created before the IAM role that it depends on. The template has no DependsOn clauses. What is the most likely reason for this failure and how can it be fixed?

A.Add a DependsOn clause to the Lambda function resource referencing the IAM role

B.Use AWS Systems Manager Automation to create the resources sequentially

C.Use a ChangeSet to roll back the stack and modify the template

D.Split the template into two separate stacks and use nested stacks

AnswerA

Explicitly ordering resource creation solves the parallel creation failure.

Why this answer

The most likely reason for the failure is that CloudFormation, by default, parallelizes the creation of resources that do not have explicit dependencies. Since the IAM role and Lambda function have no DependsOn clause, CloudFormation may attempt to create the Lambda function before the IAM role is fully created and its permissions are propagated. Adding a DependsOn clause to the Lambda function resource referencing the IAM role ensures that CloudFormation creates the IAM role first, resolving the dependency and preventing the failure.

Exam trap

The trap here is that candidates may assume CloudFormation automatically detects all dependencies via Ref or Fn::GetAtt, but it does not infer dependencies from resource attributes like IAM role ARNs used in Lambda function configurations unless explicitly referenced in the template.

How to eliminate wrong answers

Option B is wrong because AWS Systems Manager Automation is used for operational tasks like patching or runbooks, not for managing CloudFormation resource creation order; it does not address the missing dependency in the template. Option C is wrong because a ChangeSet is used to preview changes before updating a stack, not to roll back a failed creation or modify the template to fix dependency ordering; rolling back and modifying the template would require a new stack creation, not a ChangeSet. Option D is wrong because splitting the template into two separate stacks and using nested stacks does not inherently solve the dependency ordering issue; the same parallel creation problem could occur across nested stacks unless explicit DependsOn or cross-stack references are used, making it an unnecessarily complex solution.
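The effect of an explicit DependsOn can be illustrated with a trimmed template held as a Python dictionary and a small ordering helper. The resource names, the hard-coded role ARN, and `creation_order` are hypothetical; the helper models only explicit DependsOn edges and is not how CloudFormation itself is implemented. The role's properties are omitted, so this is not a deployable template.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Trimmed, hypothetical template: the function names the role by a literal ARN,
# so there is no Ref or Fn::GetAtt from which an ordering could be inferred.
template = {
    "Resources": {
        "AppFunction": {
            "Type": "AWS::Lambda::Function",
            "DependsOn": "AppRole",  # explicit ordering, as in option A
            "Properties": {"Role": "arn:aws:iam::123456789012:role/app-role"},
        },
        "AppRole": {"Type": "AWS::IAM::Role", "Properties": {}},
    }
}


def creation_order(tpl: dict) -> list:
    """Order resources so that every DependsOn target comes first."""
    graph = {}
    for name, resource in tpl["Resources"].items():
        deps = resource.get("DependsOn", [])
        if isinstance(deps, str):  # DependsOn may be a string or a list
            deps = [deps]
        graph[name] = set(deps)
    return list(TopologicalSorter(graph).static_order())


order = creation_order(template)  # the role precedes the function
```

Without the DependsOn entry the two resources would have no ordering constraint in this model, which mirrors the parallel creation described in the explanation.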

