Amazon Web Services · Free Practice Questions · Last reviewed May 2026
36 real exam-style questions organised by domain, each with the correct answer highlighted and a plain-English explanation of why it's right, and why the others are wrong.
A company uses AWS CloudFormation to deploy a multi-tier web application. The template includes a nested stack for the database layer. When updating the stack, the database stack fails with a 'CREATE_FAILED' status, but the parent stack continues updating other resources. What is the most likely cause and best practice to prevent this?
The parent stack's update policy is set to 'CONTINUE' by default. To prevent this, set 'OnFailure' to 'ROLLBACK' in the stack update options.
Setting 'OnFailure' to 'ROLLBACK' during update ensures the entire stack rolls back if any resource fails, maintaining consistency.
The parent stack was created without the '--capabilities' parameter, so it cannot roll back.
The nested stack failure automatically triggers a rollback of the parent stack, but the rollback also failed.
The parent stack is configured with 'OnFailure' set to 'DO_NOTHING'. Change it to 'DELETE'.
A DevOps engineer manages infrastructure using Terraform. The team needs to store secrets such as database passwords in a secure manner and reference them in Terraform configurations. They have configured AWS Secrets Manager. What is the recommended approach to reference secrets in Terraform without exposing them in state files?
Store the secret ARN in a Terraform variable and use 'var.secret_arn' in the resource.
Store the secret in AWS Systems Manager Parameter Store and reference it using 'data.aws_ssm_parameter'.
Pass the secret as an environment variable to Terraform and reference it with 'var.secret_value'.
Use the 'data.aws_secretsmanager_secret_version' data source and mark the attribute as 'sensitive = true' in the output.
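The data source retrieves the secret at plan/apply time, and marking outputs as sensitive keeps the value out of CLI output and logs. Note that 'sensitive = true' does not keep the value out of the state file, so the state backend should still be encrypted and access-restricted.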
A company uses AWS OpsWorks to manage a set of EC2 instances. They need to ensure that a custom recipe runs on all instances during the 'Configure' lifecycle event. What is the correct way to achieve this?
Modify the stack's CloudFormation template to include the recipe.
Upload the recipe to a custom cookbook repository and assign it to the 'Configure' lifecycle event in the stack settings.
This is the standard way to run custom recipes on OpsWorks lifecycle events.
Add the recipe commands to the instance's user data script.
Use AWS CodeDeploy to trigger the recipe during the Configure event.
A DevOps team uses AWS CodePipeline to automate deployments. The pipeline has a Deploy stage that uses AWS CloudFormation to create or update a stack. Recently, a stack update failed because the template referenced an AMI that was deprecated. The team wants to automatically roll back the stack to the last known good state if a deployment fails. What should they do?
Configure the CloudFormation deployment action in CodePipeline with 'ActionMode' set to 'CREATE_UPDATE' and check the 'Rollback on failure' option.
CodePipeline's CloudFormation action supports automatic rollback on failure.
Use the CodePipeline console to enable 'Automatic rollback' for the Deploy stage.
Set the stack's 'DisableRollback' parameter to 'true' in the template.
Add a stack policy to the CloudFormation stack that denies updates to the AMI parameter.
An organization uses AWS Elastic Beanstalk for application deployments. They want to implement immutable updates to minimize downtime and ensure that if the new environment fails health checks, the old environment remains intact. Which deployment policy should they choose?
Traffic splitting.
Immutable update.
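Immutable updates launch a fresh set of instances in a temporary Auto Scaling group alongside the existing ones and only put them into service once they pass health checks. If they fail, the new instances are terminated and the original instances are left untouched.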
All at once.
Rolling update based on health.
A developer wants to use AWS CloudFormation to create an Amazon RDS DB instance. The template includes a DB instance resource. Which property is required for the DB instance to be created successfully?
DBInstanceClass and Engine
These are required properties for the DB instance resource.
AllocatedStorage
DBInstanceIdentifier
MasterUsername and MasterUserPassword
A company runs a critical web application on EC2 instances behind an Application Load Balancer (ALB) with Auto Scaling. During a recent traffic spike, the application became unavailable for 10 minutes. Analysis shows that the ALB's healthy host count dropped to zero because the instances failed health checks due to high CPU load. What is the MOST effective design change to improve resilience during future traffic spikes?
Use predictive scaling with a scheduled scaling policy for known peak times.
Predictive scaling anticipates demand and scales out in advance, preventing overload.
Increase the instance size to handle higher load.
Configure step scaling policies based on CPU utilization.
Set a higher CPU threshold for health checks.
A company uses DynamoDB global tables in two AWS Regions with strong consistency reads. They observe occasional write conflicts that are not being resolved automatically. The application uses DynamoDBMapper with optimistic locking. What should the DevOps engineer do to ensure conflict resolution?
Implement a custom conflict resolution using DynamoDB Streams and AWS Lambda.
Switch to eventual consistency reads to reduce conflicts.
Add a third global table region to increase redundancy.
Use conditional writes with a version number attribute to ensure updates are applied only to the latest version.
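Conditional writes with a version attribute implement optimistic locking: an update succeeds only if the item still carries the version the writer read. This complements the last-writer-wins (LWW) reconciliation that global tables apply between Regions.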
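The version check described above can be sketched without any AWS dependency. This is a minimal in-memory illustration, not DynamoDBMapper itself: `VersionedTable`, `put_if_version`, and `VersionConflict` are hypothetical names invented for the example, and a real table would enforce the same rule with a condition expression on the version attribute.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one the writer read."""


class VersionedTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        # Return a copy so callers cannot mutate stored state directly.
        item = self._items.get(key)
        return dict(item) if item is not None else None

    def put_if_version(self, key, new_value, expected_version):
        """Write only if the stored version equals expected_version.

        expected_version=None means "the item must not exist yet".
        """
        current = self._items.get(key)
        current_version = current["version"] if current is not None else None
        if current_version != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {current_version}"
            )
        next_version = 1 if expected_version is None else expected_version + 1
        self._items[key] = {"value": new_value, "version": next_version}
        return next_version


table = VersionedTable()
table.put_if_version("order-1", "created", expected_version=None)

# Two writers read the same version, then both try to update.
seen_by_a = table.get("order-1")["version"]
seen_by_b = table.get("order-1")["version"]

table.put_if_version("order-1", "paid", expected_version=seen_by_a)
try:
    table.put_if_version("order-1", "cancelled", expected_version=seen_by_b)
    stale_write_rejected = False
except VersionConflict:
    stale_write_rejected = True
```

The second writer is rejected because the version it read is no longer current; it would need to re-read the item and retry.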
A company's application runs on EC2 instances in a single Availability Zone. The operations team wants to improve resilience without redesigning the application. Which action is the MOST effective?
Use a larger instance type to handle more traffic.
Enable EC2 Auto Recovery to automatically restart the instance if it fails.
Deploy EC2 instances across multiple Availability Zones using an Auto Scaling group.
Multi-AZ deployment ensures application availability even if one AZ fails.
Place the instance in a placement group to ensure low latency.
A company uses a third-party backup solution to back up its EC2 instances daily. The backups are stored in an S3 bucket with default settings. The company wants to ensure that backups are protected from accidental deletion and are available for at least one year. Which combination of S3 features should the DevOps engineer implement?
Enable MFA Delete and set a lifecycle policy to transition to S3 Glacier after 30 days.
Enable versioning and set a lifecycle policy to expire noncurrent versions after 365 days.
Enable cross-Region replication to a bucket with versioning enabled.
Enable S3 Object Lock with Governance mode and a retention period of 365 days, and set a lifecycle policy to transition to S3 Glacier Deep Archive after 30 days.
Object Lock prevents deletion during the retention period, and lifecycle transition reduces costs.
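As a rough sketch, the two settings in the correct option could be expressed as the request payloads below. The dictionary shapes follow the S3 Object Lock and lifecycle configuration structures; the rule ID is a made-up placeholder, and the bucket is assumed to have been created with Object Lock enabled.

```python
# Default retention: every new object version is locked for 365 days.
OBJECT_LOCK_CONFIGURATION = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "GOVERNANCE", "Days": 365}},
}

# Lifecycle rule: move objects to Glacier Deep Archive after 30 days.
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "archive-backups-after-30-days",  # hypothetical rule name
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # empty prefix applies to all objects
            "Transitions": [{"Days": 30, "StorageClass": "DEEP_ARCHIVE"}],
        }
    ]
}


def retention_outlasts_transition(lock_config, lifecycle_config):
    """Check that objects stay locked after they have been archived."""
    retention_days = lock_config["Rule"]["DefaultRetention"]["Days"]
    transition_days = lifecycle_config["Rules"][0]["Transitions"][0]["Days"]
    return retention_days > transition_days
```

Transitioning a locked object to another storage class does not remove the lock, which is why the two settings can be combined.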
A company runs a stateful web application on EC2 instances behind a Network Load Balancer (NLB) in a single Availability Zone. The application stores session state locally on the instance. The company wants to achieve high availability across multiple AZs with minimal application changes. What should the DevOps engineer do?
Add more AZs and configure the NLB with cross-zone load balancing.
Replace the NLB with an ALB and use ElastiCache for session storage.
Use a Multi-AZ RDS instance to store session state.
Replace the NLB with an ALB and enable sticky sessions (session affinity) using the ALB's cookie.
Sticky sessions ensure that requests from the same client are routed to the same instance, preserving local session state.
A company's DevOps team is designing a disaster recovery plan for a critical application. The application runs on EC2 instances with an RDS MySQL database. The Recovery Time Objective (RTO) is 15 minutes, and the Recovery Point Objective (RPO) is 1 hour. Which approach BEST meets these requirements?
Use backup and restore with daily snapshots stored in S3 and cross-Region replication.
Use a multi-Region application with Route 53 latency-based routing and RDS read replicas in the DR Region.
Use a warm standby strategy with a scaled-down copy of the production environment in the DR Region, and replicate data using RDS Multi-AZ with synchronous replication.
Warm standby keeps a scaled-down environment running, so failover fits within the 15-minute RTO; synchronous replication keeps potential data loss well inside the 1-hour RPO.
Use a pilot light strategy with EC2 instances stopped and RDS snapshots copied to the DR Region.
A company is running a critical web application on Amazon EC2 instances behind an Application Load Balancer (ALB). The DevOps team wants to monitor HTTP 5xx errors and receive alerts when the error rate exceeds 5% over a 5-minute period. Which combination of services and configurations should be used to meet these requirements?
Enable CloudWatch Logs for the ALB and use CloudWatch Logs Insights to query 5xx logs, then create a metric filter and alarm.
Configure AWS Config rules to check ALB 5xx error counts and trigger alarms.
Use CloudWatch ALB metrics (HTTPCode_ELB_5XX_Count) and create a CloudWatch Alarm on the Sum statistic with a threshold based on total request count.
Correct: ALB publishes HTTP 5xx metrics to CloudWatch, and alarms can be set on these metrics.
Use AWS X-Ray to trace requests and create a CloudWatch alarm based on X-Ray error rate.
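One possible way to turn the 5xx count into a rate is to divide it by the request count for the same 5-minute period. The sketch below shows that arithmetic plus the general shape of a metric-math alarm definition. It assumes `RequestCount` as the denominator, and the load balancer dimension value is a placeholder rather than a real resource.

```python
def error_rate_percent(error_count, request_count):
    """Return 5xx errors as a percentage of all requests in one period."""
    if request_count == 0:
        return 0.0  # no traffic, so there is nothing to alarm on
    return 100.0 * error_count / request_count


def breaches(error_count, request_count, threshold_percent=5.0):
    """True when the period's error rate is strictly above the threshold."""
    return error_rate_percent(error_count, request_count) > threshold_percent


# Placeholder dimension value; substitute the real load balancer identifier.
DIMENSIONS = [{"Name": "LoadBalancer", "Value": "app/example-alb/0123456789abcdef"}]


def metric_query(query_id, metric_name):
    """Build one input metric: a 5-minute Sum that is not itself alarmed on."""
    return {
        "Id": query_id,
        "MetricStat": {
            "Metric": {
                "Namespace": "AWS/ApplicationELB",
                "MetricName": metric_name,
                "Dimensions": DIMENSIONS,
            },
            "Period": 300,
            "Stat": "Sum",
        },
        "ReturnData": False,
    }


ALARM_METRICS = [
    metric_query("errors", "HTTPCode_ELB_5XX_Count"),
    metric_query("requests", "RequestCount"),
    # The expression is the only series the alarm evaluates.
    {"Id": "rate", "Expression": "100 * errors / requests",
     "Label": "5xx rate (%)", "ReturnData": True},
]
```

For example, 60 errors out of 1,000 requests is 6% and breaches the threshold, while 50 out of 1,000 is exactly 5% and does not.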
A DevOps team is using Amazon CloudWatch Logs to collect application logs from multiple EC2 instances. They notice that some log entries are missing and that the CloudWatch agent is consuming high CPU. The log group has a retention policy of 30 days. Which action should the team take to reduce CPU usage without losing log data?
Increase the batch size in the CloudWatch agent configuration.
Correct: Larger batch size reduces API calls and CPU usage.
Use JSON format for logs instead of plain text.
Set the agent's timezone to UTC.
Change the log group retention policy to 7 days.
A company wants to monitor the number of messages in an Amazon SQS queue and send an alert if the queue depth exceeds 1000 for more than 5 minutes. Which AWS service should be used to create the alarm?
Amazon EventBridge
Amazon CloudWatch Alarms
Correct: CloudWatch Alarms monitor metrics and trigger actions.
AWS X-Ray
Amazon CloudWatch Logs
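The "exceeds 1000 for more than 5 minutes" condition can be pictured as five consecutive 1-minute datapoints above the threshold. The function below is a deliberately simplified model of that evaluation, assuming 1-minute periods on the queue-depth metric; it ignores CloudWatch options such as missing-data treatment and "M out of N" datapoints.

```python
def alarm_state(datapoints, threshold=1000, evaluation_periods=5):
    """Return the alarm state for a list of per-minute queue depths.

    The alarm fires only when each of the most recent
    `evaluation_periods` datapoints is above `threshold`.
    """
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


sustained = alarm_state([1200, 1300, 1250, 1400, 1500])
brief_dip = alarm_state([1200, 1200, 900, 1200, 1200])
too_short = alarm_state([1200, 1200, 1200])
```

A single datapoint at or below the threshold inside the window is enough to keep the alarm in the OK state.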
A company is using Amazon CloudWatch Synthetics canaries to monitor its web application endpoints. The canaries are deployed in multiple AWS regions. The team wants to aggregate the canary results into a single dashboard in the US East (N. Virginia) region. What is the MOST efficient way to achieve this?
Replicate the canaries to US East (N. Virginia) and run them from there.
Create a cross-region CloudWatch dashboard and add metrics from each region using metric math.
Correct: Cross-region dashboards natively support displaying metrics from different regions.
Set up a Lambda function in each region to push canary results to a central S3 bucket, then create a dashboard from S3.
Create a CloudWatch Logs Insights query across all regions and visualize results.
A DevOps team is troubleshooting a slow application. They enabled AWS X-Ray tracing and see that one of the downstream services has a high average response time. However, the traces show that the service itself is fast; the delay is in the network call from the upstream service. Which X-Ray feature should the team use to identify the root cause?
Examine the trace map to see the connection between services.
Correct: The trace map visualizes service connections and latency.
Add annotations to the traces for better filtering.
View the raw segments of the upstream service.
Adjust the sampling rules to capture more traces.
A company needs to monitor the CPU utilization of its Amazon RDS for PostgreSQL instance. The metric should be available in Amazon CloudWatch with a granularity of 1 minute. Which action should the team take?
Install the CloudWatch agent on the RDS instance.
Enable Enhanced Monitoring for the RDS instance.
No additional configuration is needed; RDS automatically sends metrics to CloudWatch.
Correct: RDS publishes CPUUtilization to CloudWatch at 1-minute intervals by default.
Enable Performance Insights for the RDS instance.
A company uses an Auto Scaling group with a dynamic scaling policy based on a custom CloudWatch metric. After a recent deployment, the metric spikes unexpectedly, causing the Auto Scaling group to launch several EC2 instances. The operations team wants to quickly determine whether the spike was caused by a real load increase or a deployment issue. What is the MOST efficient way to investigate this?
Check the SNS topic that the scaling policy publishes to for notifications.
Use CloudWatch Logs Insights to query application logs for error patterns or deployment markers that coincide with the metric spike.
CloudWatch Logs Insights allows querying logs to find patterns related to the spike.
Use AWS CloudTrail to review API calls that modified the scaling policy.
Temporarily disable the scaling policy and manually increase the desired capacity to handle the load.
A company runs a critical application on Amazon ECS with Fargate launch type. The application uses an Application Load Balancer (ALB) in front. During a load test, the team notices a sudden increase in 5xx errors from the ALB, and some tasks become unhealthy. The task logs show occasional 'OutOfMemoryError' exceptions. The task definition currently has 512 CPU units and 1024 MiB memory. What should the team do to mitigate the issue while maintaining a cost-effective approach?
Increase the task definition CPU to 1024 units and memory to 2048 MiB.
Increase the task definition memory to 2048 MiB while keeping CPU at 512 units.
This directly addresses the memory error without wasting resources on extra CPU.
Configure the ECS service to use a rolling update with a longer health check grace period.
Decrease the task definition memory to 512 MiB to force garbage collection more frequently.
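Fargate only accepts certain CPU and memory pairings, which is worth checking before changing a task definition. The lookup below is a small sketch covering the commonly used sizes up to 4096 CPU units (larger sizes are omitted); `is_valid_fargate_size` is a helper name made up for this example.

```python
# Memory values (MiB) accepted for each Fargate CPU value.
FARGATE_MEMORY_MIB = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
}


def is_valid_fargate_size(cpu_units, memory_mib):
    """Return True if the CPU/memory pair is in the lookup table above."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, [])


current_size_ok = is_valid_fargate_size(512, 1024)
memory_only_bump_ok = is_valid_fargate_size(512, 2048)
shrink_to_512_ok = is_valid_fargate_size(512, 512)
```

Under this table, raising memory to 2048 MiB at 512 CPU units is a valid pairing, while dropping memory to 512 MiB at 512 CPU units is not.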
A DevOps engineer is investigating an incident where an EC2 instance became unreachable. The engineer checks the AWS Management Console and finds the instance is running, the status check shows '2/2 checks passed', and the system log shows no errors. What should the engineer do NEXT to diagnose the connectivity issue?
Review the CloudWatch metrics for CPU utilization and network throughput.
Reboot the instance to reset the network interface.
Stop and start the instance to move it to new underlying hardware.
Check the security group and network ACL rules to ensure inbound traffic is allowed.
Connectivity issues often stem from network permissions.
A company has an AWS Lambda function that processes S3 events. The function is invoked multiple times for the same S3 object, causing duplicate processing. The engineer suspects the issue is related to retries from the S3 event notification or Lambda's built-in retry behavior. What is the MOST effective way to ensure idempotent processing?
Modify the S3 bucket event notification configuration to use a prefix filter that excludes duplicate objects.
Use a DynamoDB table to store a record of processed S3 object keys and check for existence before processing.
This pattern ensures idempotency by tracking processed objects.
Set the Lambda function's ReservedConcurrency to 1 to prevent concurrent executions.
Use an Amazon SQS FIFO queue as the event source and enable content-based deduplication.
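The tracking-table pattern can be sketched as follows. `ProcessedKeys` is a hypothetical in-memory stand-in for the DynamoDB table, and `handle_event` is an illustrative handler rather than a full Lambda function; with DynamoDB the "claim" step would be a single conditional put so that the existence check and the write happen atomically.

```python
class ProcessedKeys:
    """In-memory stand-in for a table keyed by S3 object key."""

    def __init__(self):
        self._keys = set()

    def claim(self, object_key):
        """Record the key; return False if it was already recorded."""
        if object_key in self._keys:
            return False
        self._keys.add(object_key)
        return True


def handle_event(event, processed, process):
    """Process each S3 record at most once and return the keys handled."""
    handled = []
    for record in event["Records"]:
        key = record["s3"]["object"]["key"]
        if processed.claim(key):
            process(key)
            handled.append(key)
    return handled


store = ProcessedKeys()
seen = []
event = {"Records": [{"s3": {"object": {"key": "reports/a.csv"}}}]}

first = handle_event(event, store, seen.append)
second = handle_event(event, store, seen.append)  # simulated retry
```

The retry finds the key already claimed and does nothing. One trade-off of claiming before processing is that a crash mid-processing leaves the key marked as done, so some designs record a status and update it on completion instead.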
An organization uses AWS CloudFormation to manage infrastructure. During an incident, a stack update fails with 'UPDATE_ROLLBACK_FAILED' status. The engineer needs to bring the stack to a consistent state without losing data. What is the BEST approach?
Use the 'ContinueUpdateRollback' API to skip the resource that caused the failure.
This is the designed method to resolve rollback failures.
Create a new stack from the same template and migrate resources.
Manually correct the resource configuration that caused the failure, then perform a stack update.
Delete the stack and then recreate it from the same template.
A company uses Amazon RDS for MySQL with Multi-AZ deployment. The database instance fails and AWS automatically fails over to the standby. After the failover, the application cannot connect to the database. The engineer checks the RDS console and sees that the instance status is Available. What is the MOST likely cause of the connectivity issue?
The security group for the RDS instance has changed during failover.
The application is using the database's DNS endpoint for the old primary, which is no longer the writer.
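After a Multi-AZ failover, the endpoint's DNS name stays the same but resolves to the new primary. An application that caches the old IP address (for example, through a long client-side DNS TTL) keeps trying to reach the failed instance until it re-resolves the endpoint.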
The DNS record for the RDS endpoint has not propagated to the application's DNS resolver.
The database instance is still in the process of failover and is not yet accepting connections.
A company is using AWS Organizations with multiple accounts. The Security team wants to centrally manage IAM roles that can be assumed by users in member accounts. Which solution should be used to enforce that only specific roles can be assumed across accounts, while ensuring that the policy updates are automatically applied to all accounts?
Create an IAM role in each member account with a trust policy that allows the Security account, and use AWS CloudFormation StackSets to deploy the roles.
Use AWS Single Sign-On (SSO) to assign permissions to users across accounts.
Create an IAM role in the Security account with a trust policy that references a service control policy (SCP) in AWS Organizations.
SCPs can restrict IAM actions across accounts, and the trust policy can reference the SCP to enforce central control.
Create a resource-based policy on each IAM role in the member accounts that allows the Security account to assume the role.
A company is running a critical application on an Amazon EC2 instance that needs to access an S3 bucket. The application must use temporary credentials that automatically rotate. The DevOps engineer must ensure that the credentials are never stored on disk. Which approach meets these requirements?
Store the credentials in AWS Secrets Manager and retrieve them at application startup.
Attach an IAM role to the EC2 instance and use the instance profile to obtain temporary credentials from the instance metadata service.
Instance profiles provide temporary credentials that are automatically rotated and never stored on disk.
Use AWS Systems Manager Parameter Store to store the credentials and retrieve them using the EC2 instance's IAM role.
Generate an access key and secret key for an IAM user and store them in a configuration file on the EC2 instance.
A DevOps engineer needs to ensure that all API calls made to AWS are recorded for auditing purposes. Which AWS service should be used?
AWS CloudTrail
CloudTrail records all AWS API calls for auditing.
AWS Config
Amazon CloudWatch Logs
Amazon VPC Flow Logs
A company uses AWS Key Management Service (KMS) to encrypt data at rest in Amazon S3. The security team wants to ensure that only users with a specific attribute in their SAML assertion can decrypt the data. Which KMS key policy should be used?
Create an S3 bucket policy that denies kms:Decrypt unless the request includes a specific tag.
Modify the KMS key policy to include a condition that allows kms:Decrypt only if the SAML assertion contains the specific attribute.
KMS key policies can use conditions based on SAML attributes to control decryption.
Attach a resource-based policy to the S3 bucket that allows decryption only for users with the specific attribute.
Use an IAM policy that grants kms:Decrypt only if the user has the specific attribute.
A company has a requirement to rotate database credentials every 30 days for an Amazon RDS for MySQL instance. The credentials are currently stored in AWS Secrets Manager. The DevOps engineer needs to implement automatic rotation without modifying the application code. Which solution should be used?
Create a scheduled job that runs every 30 days to update the secret in Secrets Manager with a new password.
Store the credentials in AWS Systems Manager Parameter Store and configure automatic rotation using a Lambda function.
Use the AWS RDS automatic password rotation feature, which automatically updates the password every 30 days.
Configure Secrets Manager to automatically rotate the secret every 30 days using a Lambda rotation function, and have the application retrieve the secret using the Secrets Manager API.
Secrets Manager provides built-in rotation for RDS with a Lambda function, and the application can retrieve credentials on-the-fly.
A company uses AWS Organizations to manage multiple accounts. The Security team wants to prevent member accounts from disabling AWS CloudTrail or deleting CloudTrail log files. Which TWO actions should the Security team take in the organization's management account? (Choose TWO.)
Create an SCP to deny cloudtrail:UpdateTrail.
Create an IAM policy in each member account to deny cloudtrail:StopLogging.
Create an SCP to deny s3:DeleteObject on the CloudTrail log bucket.
This prevents deletion of log files.
Enable AWS CloudTrail from the management account with organization trail.
Create an SCP to deny cloudtrail:StopLogging and cloudtrail:DeleteTrail.
This prevents disabling or deleting the trail.
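The two SCP statements named in the correct options might look like the policy document built below. This is a sketch under assumptions: the bucket name is a placeholder, the `Sid` values are invented for readability, and a production SCP would usually add a condition exempting a designated administration role.

```python
import json


def cloudtrail_protection_scp(log_bucket_name):
    """Build an SCP that blocks trail tampering and log file deletion."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyTrailTampering",
                "Effect": "Deny",
                "Action": ["cloudtrail:StopLogging", "cloudtrail:DeleteTrail"],
                "Resource": "*",
            },
            {
                "Sid": "DenyLogDeletion",
                "Effect": "Deny",
                "Action": "s3:DeleteObject",
                "Resource": f"arn:aws:s3:::{log_bucket_name}/*",
            },
        ],
    }


# "example-cloudtrail-logs" is a hypothetical bucket name.
scp = cloudtrail_protection_scp("example-cloudtrail-logs")
scp_json = json.dumps(scp, indent=2)
```

Because SCPs set the maximum permissions for member accounts, an explicit deny here cannot be overridden by IAM policies inside those accounts.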
A company uses AWS CodePipeline with a multi-branch strategy. A new feature branch triggers a pipeline that runs unit tests and deploys to a test environment. The deployment step uses AWS CodeDeploy with a deployment group configured for in-place deployment to Amazon EC2 instances. The deployment fails intermittently with the error 'The overall deployment failed because too many individual instances failed deployment, too few healthy instances are available for deployment, or some instances in your deployment group are experiencing problems.' The instances are healthy and pass health checks. What is the most likely cause?
The pipeline has a failed execution that is blocking subsequent executions.
The CodeDeploy agent on the instances is not running, causing the deployment to fail.
The pipeline is configured with a high frequency of changes, causing throttling from CodePipeline.
A previous deployment is still in progress or frozen in the CodeDeploy deployment group.
CodeDeploy limits concurrent deployments per deployment group; a frozen deployment prevents new ones.
A development team uses AWS CodeBuild to compile a Java application and run unit tests. The build takes 30 minutes, but the team wants to reduce build time. The codebase has not changed significantly, and dependencies are stable. Which action would be MOST effective in reducing build time?
Configure CodeBuild to cache dependencies in an Amazon S3 bucket.
Caching avoids re-fetching dependencies every build.
Move the build process to a local developer machine to avoid CodeBuild overhead.
Reduce the number of unit tests executed in the build phase.
Increase the compute type of the build environment to a larger instance.
A company uses AWS CodePipeline with multiple stages: Source (Amazon S3), Build (AWS CodeBuild), and Deploy (AWS CodeDeploy). The build stage runs a series of tests, and if they pass, the pipeline proceeds to deploy. Recently, a developer committed a change that passed all tests but caused a production outage. The team wants to add an approval step before the deploy stage, but they also want to ensure that only changes from specific branches can be deployed. What is the MOST secure and maintainable way to enforce this?
Use a Lambda function in the pipeline to check the branch name and fail if not allowed.
Add a manual approval step in the pipeline and rely on the approver to verify the branch.
Create a separate pipeline for each allowed branch, with the approval step only in the production pipeline.
Isolating pipelines prevents direct deployment from unauthorized branches.
Tag the source artifacts with the branch name and use a condition in CodePipeline to allow only specific tags.
A company uses AWS CodeCommit for source control. Developers frequently push large binary files (e.g., compiled JARs) to the repository, causing the repository size to grow rapidly and slowing down clone operations. The team wants to enforce a policy to reject pushes that contain files larger than 50 MB. Which approach should be used?
Configure a CodeCommit trigger that invokes an AWS Lambda function to validate file sizes and reject the push.
CodeCommit triggers allow custom validation before accepting a push.
Set up an Amazon CloudWatch Events rule to monitor repository size and alert when it exceeds a threshold.
Create an IAM policy that denies the `codecommit:GitPush` action if the file size exceeds 50 MB.
Use a pre-receive hook in the repository to reject large files by generating an S3 pre-signed URL.
An organization uses AWS CodePipeline to orchestrate deployments to multiple environments (dev, test, prod). Each environment uses a different AWS account. The pipeline uses cross-account actions with IAM roles. Recently, the pipeline failed at the deploy stage for the prod account with the error 'Access Denied' when assuming the cross-account role. The role ARN is correct and the trust policy allows the pipeline's service role. What is the MOST likely cause?
The EC2 instances in the prod account do not have an appropriate instance profile.
The pipeline's service role lacks the `sts:AssumeRole` permission for the cross-account role.
The service role needs explicit permission to assume the cross-account role.
The cross-account role's permissions boundary denies the deploy action.
The pipeline's service role does not have permission to perform the deploy action in the prod account.
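Cross-account role assumption needs permission on both sides: the trust policy on the target role and an identity policy on the caller. The sketch below builds the caller-side statement that the correct option says is missing. The account ID and role name are placeholders, not values from the scenario.

```python
import json


def assume_role_policy(role_arn):
    """Identity policy letting the attached principal assume one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": role_arn,
            }
        ],
    }


# Hypothetical account ID and role name, used for illustration only.
PROD_ROLE_ARN = "arn:aws:iam::111122223333:role/ExampleProdDeployRole"

policy = assume_role_policy(PROD_ROLE_ARN)
policy_json = json.dumps(policy, indent=2)
```

Attaching a policy of this shape to the pipeline's service role, scoped to the specific role ARN, avoids granting a blanket `sts:AssumeRole` on all resources.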
A team uses AWS CodeDeploy to deploy a web application to an Auto Scaling group. The deployment strategy is Blue/Green. During a recent deployment, the new instances passed all health checks, but traffic was not routed to them. What is the most likely reason?
The target group associated with the Auto Scaling group is not properly configured to route traffic.
The target group must be correctly set up to forward traffic to the new instances.
The deployment group is not configured to use a load balancer.
The Auto Scaling group's lifecycle hook failed to signal readiness.
The CodeDeploy agent on the new instances is not installed.
The DOP-C02 exam has 75 questions and must be completed in 180 minutes. The passing score is 750/1000.
Scenario-based questions covering exam objectives with detailed answer explanations.
The exam covers 6 domains: Configuration Management and IaC, Resilient Cloud Solutions, Monitoring and Logging, Incident and Event Response, Security and Compliance, SDLC Automation. Questions are weighted by domain — higher-weight domains appear more on your actual exam.
No. These are original exam-style practice questions written against the official Amazon Web Services DOP-C02 exam objectives. They are not copied from the real exam. Courseiva focuses on genuine understanding, not memorisation of braindumps.
Courseiva tracks your accuracy per domain and routes you toward weak areas automatically. Free, no account required.