A company uses AWS CloudFormation to deploy a multi-tier application. The stack includes an Application Load Balancer, Auto Scaling group, and RDS database. The SysOps administrator receives a notification that a stack update has failed. The administrator wants to investigate the failure and understand which resource caused the issue. The stack is in the UPDATE_ROLLBACK_IN_PROGRESS state. What should the administrator do to identify the failed resource?
Stack events provide detailed information about each resource operation.
Why this answer
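When a CloudFormation stack update fails and the stack enters UPDATE_ROLLBACK_IN_PROGRESS, the most direct way to identify the failed resource is to view the stack events, either in the Events tab of the CloudFormation console or with the describe-stack-events CLI command. Each event records the logical resource ID, the resource status, and a status reason. The resource that caused the failure is the earliest one in the current update with a failed status such as UPDATE_FAILED or CREATE_FAILED, and its status reason holds the specific error message. Later failure events often report only that an operation was cancelled because another resource had already failed, so the earliest failure is the one to read first.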
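The same lookup can be scripted. The following is a minimal sketch, assuming boto3 is installed and credentials are configured; the helper names `fetch_stack_events` and `failed_events_for_latest_update` and the stack name in the usage comment are hypothetical. It relies on DescribeStackEvents returning events newest first, and it treats the stack-level UPDATE_IN_PROGRESS event as the start of the current update.

```python
def failed_events_for_latest_update(events):
    """Return the failed events of the most recent update, oldest first.

    `events` must be ordered newest first, which is the order
    DescribeStackEvents returns them in.
    """
    current_update = []
    for event in events:
        current_update.append(event)
        # The stack's own UPDATE_IN_PROGRESS event marks where the current
        # update began; anything older belongs to earlier operations.
        is_stack_event = (
            event.get("ResourceType") == "AWS::CloudFormation::Stack"
            and event.get("LogicalResourceId") == event.get("StackName")
        )
        if is_stack_event and event.get("ResourceStatus") == "UPDATE_IN_PROGRESS":
            break

    failed = [
        e for e in current_update
        if e.get("ResourceStatus", "").endswith("_FAILED")
    ]
    failed.reverse()  # oldest first, so failed[0] is the likely root cause
    return failed


def fetch_stack_events(stack_name):
    """Collect all events for a stack, newest first (requires AWS access)."""
    import boto3  # imported here so the filter above works without boto3

    client = boto3.client("cloudformation")
    paginator = client.get_paginator("describe_stack_events")
    events = []
    for page in paginator.paginate(StackName=stack_name):
        events.extend(page["StackEvents"])
    return events


# Example usage (hypothetical stack name):
# for e in failed_events_for_latest_update(fetch_stack_events("my-app-stack")):
#     print(e["LogicalResourceId"], e["ResourceStatus"], e.get("ResourceStatusReason"))
```

The first entry in the result is usually the resource to investigate; entries after it are often cancellations triggered by that first failure.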
Exam trap
Candidates may assume the failure stems from a template syntax error (Option A) or that application logs (Option B) will reveal the issue. In fact, CloudFormation stack events are the authoritative source for resource-level failure details during stack operations.
How to eliminate wrong answers
Option A is wrong because syntax errors in the template would typically cause the update to be rejected before it begins (e.g., during validation), not partway through the update. The stack is already in UPDATE_ROLLBACK_IN_PROGRESS, which means the template passed validation and the update started.
Option B is wrong because CloudWatch Logs for the EC2 instances in the Auto Scaling group would show only application-level or OS-level logs, not CloudFormation resource provisioning failures. The failure is at the infrastructure layer, not within the instances.
Option C is wrong because manually re-running the update with the same parameters is risky and inefficient. CloudFormation does not accept a new update while the rollback is still in progress, a retry with unchanged inputs is likely to fail the same way and trigger another rollback, and it ignores the existing event data that already contains the error details.