AWS CloudFormation is an Infrastructure as Code (IaC) service that enables you to model and provision AWS resources using declarative templates. For the SOA-C02 exam, CloudFormation is a core topic in Domain 3: Deployment, Provisioning, and Automation, and you can expect approximately 8–12% of questions to involve CloudFormation concepts, template structure, stack operations, and troubleshooting. This chapter covers what a SysOps administrator needs to know for the exam: template anatomy, stack creation and updates, change sets, drift detection, termination protection, stack sets, and common troubleshooting scenarios. Mastering CloudFormation is essential for automating deployments and maintaining consistent environments.
CloudFormation is like a construction blueprint for building a house. The blueprint (template) specifies every detail: the foundation (VPC), walls (subnets), doors (security groups), windows (route tables), plumbing (RDS), and electrical (EC2). The architect writes the blueprint in a standard format (JSON or YAML). When the owner (SysOps admin) hands the blueprint to the general contractor (CloudFormation service), the contractor reads each instruction and directs specialized workers (AWS APIs) to build each component in the correct order. For example, the foundation must be poured before walls can be erected. If any step fails (e.g., a door frame is warped), the contractor stops work and rolls back everything to a safe state, leaving no half-built structure. The owner can also update the blueprint later (e.g., add a garage), and the contractor will intelligently modify only what changed, without demolishing the whole house. This prevents manual mistakes, ensures consistency across multiple builds (dev, test, prod), and provides a complete audit trail of every nail and screw. Without the blueprint, each worker would act independently, leading to chaos and costly rework.
What is AWS CloudFormation and Why It Exists
AWS CloudFormation is a service that allows you to define your entire AWS infrastructure as a text file (template) in JSON or YAML format. Instead of manually clicking through the AWS Management Console or running dozens of CLI commands, you can provision a collection of resources as a single unit called a stack. This approach reduces configuration drift and human error, and it lets you keep your infrastructure under version control.
Template Anatomy
A CloudFormation template is structured into several sections. The only required section is Resources; all others are optional but commonly used.
AWSTemplateFormatVersion: The version of the template format. Currently, the valid value is "2010-09-09".
Description: A text string describing the template. Must follow the AWSTemplateFormatVersion if present.
Metadata: Additional information about the template, such as template author or dependencies.
Parameters: Input values you can supply when you create or update a stack. You reference a parameter with the Ref intrinsic function.
Rules: Validate a parameter or a combination of parameters during stack creation or update. Useful for cross-parameter constraints, such as allowing only certain instance types when the environment parameter is prod.
Mappings: A lookup table of key-value pairs. Often used to map region-specific AMI IDs or instance types.
Conditions: Define whether certain resources are created or not based on parameter values or other conditions.
Transform: For including snippets of reusable code (e.g., AWS SAM or macros).
Resources: The core of the template – declares the AWS resources to create. Each resource has a logical ID (alphanumeric) and a type (e.g., AWS::EC2::Instance).
Outputs: Values you can view after the stack is created, such as the public IP of an EC2 instance or the endpoint of an RDS database.
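The sections above can be sketched in one small template. This is a minimal illustration, not a production template: the logical IDs, the EnvType parameter, and the AMI IDs are placeholders invented for the example.

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Description: Minimal illustrative template (all names are hypothetical)

Parameters:
  EnvType:
    Type: String
    AllowedValues: [dev, prod]
    Default: dev

Mappings:
  RegionMap:
    us-east-1:
      AMI: ami-00000000000000000   # placeholder, not a real AMI ID
    eu-west-1:
      AMI: ami-11111111111111111   # placeholder, not a real AMI ID

Conditions:
  IsProd: !Equals [!Ref EnvType, prod]

Resources:
  AppInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !FindInMap [RegionMap, !Ref "AWS::Region", AMI]
      InstanceType: !If [IsProd, t3.large, t3.micro]

Outputs:
  InstanceId:
    Description: Physical ID of the instance
    Value: !Ref AppInstance
```

Only Resources is mandatory; the other sections are shown to make their relationships visible (the condition reads the parameter, and the resource reads the mapping and the condition).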
Intrinsic Functions
CloudFormation provides intrinsic functions to assign values to properties at runtime. The most commonly tested ones:
Ref: Returns the value of a parameter or the physical ID of a resource. For example, !Ref MyVPC returns the VPC ID.
Fn::GetAtt: Returns an attribute of a resource. For example, !GetAtt MyInstance.PublicIp.
Fn::Join: Concatenates strings. Example: !Join ["-", ["stack", !Ref "AWS::StackName"]].
Fn::Select: Picks an element from a list by index.
Fn::Sub: Substitutes variables in a string. Example: !Sub "arn:aws:ec2:${AWS::Region}:${AWS::AccountId}:instance/*".
Fn::If: Returns one value if a condition is true, another if false.
Fn::Equals, Fn::Not, Fn::And, Fn::Or: Condition functions.
Fn::Base64, Fn::Cidr, Fn::FindInMap, etc.
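To see several of these functions working together, here is a small sketch. The parameter names and logical IDs are assumptions made up for the example.

```yaml
Parameters:
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>
  ImageId:
    Type: AWS::EC2::Image::Id

Resources:
  WebInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref ImageId                   # Ref on a parameter returns its value
      InstanceType: t3.micro
      SubnetId: !Select [0, !Ref SubnetIds]   # first element of the list parameter
      Tags:
        - Key: Name
          Value: !Join ["-", [!Ref "AWS::StackName", web]]
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          echo "Running in ${AWS::Region}" > /tmp/region.txt

Outputs:
  InstanceId:
    Value: !Ref WebInstance                   # Ref on a resource returns its physical ID
  PublicIp:
    Value: !GetAtt WebInstance.PublicIp       # GetAtt returns a named attribute
```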
Pseudo Parameters
CloudFormation provides pseudo parameters that are resolved automatically:
- AWS::Region – the region where the stack is created.
- AWS::StackId – the unique stack ID.
- AWS::StackName – the stack name.
- AWS::AccountId – the 12-digit account ID.
- AWS::NotificationARNs – the list of SNS topics for stack events.
- AWS::NoValue – used to remove a property when a condition is false.
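As a rough illustration of pseudo parameters, the sketch below uses AWS::NoValue with Fn::If to drop a property and builds an ARN from AWS::Region and AWS::AccountId. The SnapshotId parameter and the DataVolume logical ID are hypothetical.

```yaml
Parameters:
  SnapshotId:
    Type: String
    Default: ""

Conditions:
  UseSnapshot: !Not [!Equals [!Ref SnapshotId, ""]]

Resources:
  DataVolume:
    Type: AWS::EC2::Volume
    Properties:
      AvailabilityZone: !Select [0, !GetAZs ""]
      Size: 20
      # When no snapshot is supplied, the property is omitted entirely
      SnapshotId: !If [UseSnapshot, !Ref SnapshotId, !Ref "AWS::NoValue"]
      Tags:
        - Key: Stack
          Value: !Ref "AWS::StackName"

Outputs:
  VolumeArn:
    Value: !Sub "arn:aws:ec2:${AWS::Region}:${AWS::AccountId}:volume/${DataVolume}"
```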
Stack Operations
Creating a Stack:
When you create a stack, CloudFormation parses the template, validates it, and then calls the AWS APIs in the correct order based on resource dependencies (inferred from references between resources or declared with the DependsOn attribute). For example, an EC2 instance depends on a subnet, which depends on a VPC. CloudFormation automatically determines this order from the template properties. If a resource fails to create, the default action is to roll back by deleting the resources created so far, leaving no orphaned resources. You can disable rollback on failure using the --disable-rollback flag (CLI) or the console option.
Updating a Stack: You can update a stack by providing a new template or new parameter values. CloudFormation determines what changes are needed – it can update, replace, or delete resources. Updating can cause downtime if resources are replaced. To control the update behavior, you can use:
- Change Sets: Preview the changes before applying them.
- Stack Policy: Protect critical resources from accidental updates or deletion.
Deleting a Stack: Deleting a stack removes all resources that were created by the stack, in reverse order of creation. To prevent accidental deletion, you can enable termination protection on the stack. Even with termination protection, you can delete the stack by first disabling the protection.
Change Sets
A change set is a summary of proposed changes to a stack. It shows which resources will be added, modified, or deleted, and whether a modification requires replacement. You can create a change set, review it, and then execute it; nothing changes in the stack until you execute. This is useful for confirming the impact of an update before you apply it to a production stack.
Stack Sets
Stack sets allow you to deploy stacks across multiple accounts and regions from a single template. You define a stack set with a template and specify target accounts and regions. CloudFormation creates a stack (a stack instance) in each target account/region. Stack sets can be updated centrally, and you can run drift detection on a stack set to check all of its stack instances, although you have to start it yourself. This is commonly used for governance (e.g., enabling AWS Config rules across all accounts in an organization).
Drift Detection
Drift detection compares the actual state of resources in a stack with the expected state defined in the template. If someone manually modifies a resource (e.g., changes a security group rule via the console), drift detection will flag it. You can run drift detection on a stack or on individual resources. The possible stack drift statuses are:
- DRIFTED: The actual state of at least one resource differs from the template.
- IN_SYNC: No drift.
- UNKNOWN: Reserved in the API for future use.
- NOT_CHECKED: Drift detection has not been run on the stack.
Individual resources report IN_SYNC, MODIFIED, DELETED, or NOT_CHECKED (not yet checked, or the resource type does not support drift detection).
Resource Dependencies
CloudFormation infers most dependencies from references between resources (e.g., a subnet that references a VPC ID is created after the VPC). Use the DependsOn attribute when you need an ordering that no reference expresses. For example, a route through an internet gateway must wait until the gateway is attached to the VPC, but nothing in the route's properties references the AWS::EC2::VPCGatewayAttachment resource, so you set DependsOn on the route to the attachment.
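A sketch of both kinds of dependency, with hypothetical logical IDs. The subnet and route table are ordered implicitly through Ref, while the route needs an explicit DependsOn because none of its properties mention the gateway attachment.

```yaml
Resources:
  Vpc:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16

  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref Vpc              # implicit dependency: waits for the VPC
      CidrBlock: 10.0.1.0/24

  InternetGateway:
    Type: AWS::EC2::InternetGateway

  GatewayAttachment:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref Vpc
      InternetGatewayId: !Ref InternetGateway

  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref Vpc

  DefaultRoute:
    Type: AWS::EC2::Route
    DependsOn: GatewayAttachment   # explicit: no property references the attachment
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
```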
CreationPolicy and UpdatePolicy
CreationPolicy: Used with Auto Scaling groups or EC2 instances to wait for a signal (e.g., cfn-signal) before considering the resource created. The policy includes a timeout and a count of expected signals.
UpdatePolicy: Controls how updates are performed on Auto Scaling groups (e.g., rolling update, batch size, pause time).
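Both attributes sit next to Type and Properties on the resource. The following is a minimal sketch on an Auto Scaling group; the launch template and subnet parameters are assumed to exist and are named only for illustration.

```yaml
Parameters:
  LaunchTemplateId:
    Type: String
  LaunchTemplateVersion:
    Type: String
  SubnetIds:
    Type: List<AWS::EC2::Subnet::Id>

Resources:
  WebServerGroup:
    Type: AWS::AutoScaling::AutoScalingGroup
    CreationPolicy:
      ResourceSignal:
        Count: 2          # wait for two success signals
        Timeout: PT15M    # ISO 8601 duration: 15 minutes
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: 1
        MaxBatchSize: 1
        PauseTime: PT10M
        WaitOnResourceSignals: true
    Properties:
      MinSize: "2"
      MaxSize: "4"
      DesiredCapacity: "2"
      VPCZoneIdentifier: !Ref SubnetIds
      LaunchTemplate:
        LaunchTemplateId: !Ref LaunchTemplateId
        Version: !Ref LaunchTemplateVersion
```

With WaitOnResourceSignals set to true, PauseTime acts as the time limit for each batch of replaced instances to send their signals.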
AWS::CloudFormation::Init and cfn-init
The AWS::CloudFormation::Init metadata on an EC2 instance defines configuration tasks (install packages, create files, start services). The cfn-init helper script runs on the instance and processes this metadata. This is easier to maintain than a long user data script because the configuration is declarative, and when the cfn-hup daemon is running on the instance, metadata changes can be applied without replacing the instance.
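A minimal sketch of the pattern, assuming an Amazon Linux AMI where the helper scripts live under /opt/aws/bin. The WebServer logical ID, the package, and the file content are illustrative choices.

```yaml
Parameters:
  ImageId:
    Type: AWS::EC2::Image::Id   # assumed: an AMI with the helper scripts installed

Resources:
  WebServer:
    Type: AWS::EC2::Instance
    CreationPolicy:
      ResourceSignal:
        Timeout: PT10M
    Metadata:
      AWS::CloudFormation::Init:
        config:
          packages:
            yum:
              httpd: []
          files:
            /var/www/html/index.html:
              content: "Hello from CloudFormation\n"
              mode: "000644"
              owner: root
              group: root
          services:
            sysvinit:
              httpd:
                enabled: "true"
                ensureRunning: "true"
    Properties:
      ImageId: !Ref ImageId
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash -x
          # Apply the metadata above, then report the exit code to CloudFormation
          /opt/aws/bin/cfn-init -v --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
          /opt/aws/bin/cfn-signal -e $? --stack ${AWS::StackName} --resource WebServer --region ${AWS::Region}
```

User data still starts the process; it just hands the actual configuration work to cfn-init and reports the outcome with cfn-signal.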
Nested Stacks
Nested stacks allow you to break a large template into smaller, reusable components. You use the AWS::CloudFormation::Stack resource type to reference another template. Nested stacks are useful for separating concerns (e.g., network layer, application layer) and for reusing common patterns (e.g., a standard VPC template).
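A parent template might look roughly like this. The TemplateBucketUrl parameter, the child template file names, and the child parameter and output names (VpcCidr, VpcId) are assumptions for the sketch; the child templates would have to define them.

```yaml
Parameters:
  TemplateBucketUrl:
    Type: String   # hypothetical: HTTPS URL prefix of the S3 location holding child templates

Resources:
  NetworkStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub "${TemplateBucketUrl}/network.yaml"
      Parameters:
        VpcCidr: 10.0.0.0/16

  AppStack:
    Type: AWS::CloudFormation::Stack
    Properties:
      TemplateURL: !Sub "${TemplateBucketUrl}/app.yaml"
      Parameters:
        # A nested stack's outputs are read as <LogicalId>.Outputs.<OutputName>
        VpcId: !GetAtt NetworkStack.Outputs.VpcId
```

Because AppStack reads an output of NetworkStack, CloudFormation creates the network stack first without any DependsOn.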
Stack Policies
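A stack policy is a JSON document that defines which resources can be updated or deleted during a stack update. With no stack policy, all resources can be updated. Once you set a policy, every resource is protected by default, so the policy needs an explicit Allow for the updates you want to permit and a Deny for the resources you want to protect (e.g., database instances). Each statement uses Effect, Action (Update:Modify, Update:Replace, Update:Delete, or Update:*), Principal, and Resource (LogicalResourceId/ followed by the logical ID, or a wildcard).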
Rollback Configuration
When a stack creation or update fails, CloudFormation automatically rolls back to the last known good state. You can configure:
- Rollback triggers: Monitor CloudWatch alarms and automatically roll back if an alarm goes into ALARM state during stack creation or update, or during an optional monitoring period afterwards.
- On failure: For stack creation, you can specify DO_NOTHING, ROLLBACK, or DELETE.
Resource Signals and Wait Conditions
CreationPolicy and WaitCondition are used to pause stack creation until a signal is received. WaitCondition is a separate resource that can be used with any external signal. CreationPolicy is simpler and directly attached to a resource.
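For comparison with CreationPolicy, here is a sketch of the WaitCondition pattern. The logical IDs are hypothetical, and the AMI is assumed to include cfn-signal under /opt/aws/bin.

```yaml
Parameters:
  ImageId:
    Type: AWS::EC2::Image::Id

Resources:
  SetupHandle:
    Type: AWS::CloudFormation::WaitConditionHandle

  AppServer:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref ImageId
      InstanceType: t3.micro
      UserData:
        Fn::Base64: !Sub |
          #!/bin/bash
          # ... application setup would go here ...
          # Ref on the handle resolves to a presigned URL that accepts the signal
          /opt/aws/bin/cfn-signal -e $? "${SetupHandle}"

  SetupComplete:
    Type: AWS::CloudFormation::WaitCondition
    DependsOn: AppServer
    Properties:
      Handle: !Ref SetupHandle
      Count: 1
      Timeout: "1800"   # seconds
```

Any process that can make an HTTP request to the presigned URL can send the signal, which is why WaitCondition suits signals that originate outside the stack's own instances.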
AWS CloudFormation Designer
A visual tool to create and modify templates. You can drag and drop resources and see the template JSON/YAML update in real time. It helps visualize dependencies but is not required for the exam.
Best Practices for SysOps
Use parameters for environment-specific values (e.g., instance type, VPC CIDR).
Use mappings for region-specific AMIs.
Use conditions to handle different environments (e.g., create a bastion host only in production).
Tag all resources for cost tracking and management.
Use AWS::CloudFormation::Init for instance configuration instead of inline user data.
Enable termination protection on production stacks.
Use change sets for critical updates.
Run drift detection regularly.
Store templates in version control (e.g., Git) and use CI/CD pipelines to deploy.
Common Troubleshooting
Stack creation fails: Check the events tab for the specific error. Common issues: insufficient IAM permissions, resource limits exceeded, missing parameters, or invalid template syntax.
Stack update fails: Often due to resource replacement constraints (e.g., changing an RDS DB instance identifier requires replacement). Use change sets to preview.
Drift detected: Investigate manual changes and decide whether to update the template or revert the manual change.
cfn-init failures: Check instance logs (/var/log/cfn-init.log) for errors. Ensure the IAM role has the necessary permissions for the metadata actions.
Exam-Specific Details
The maximum template body size is 51,200 bytes for JSON/YAML. If larger, you must upload to S3 and specify the URL.
Stack names must be unique within an account and region.
CloudFormation is a regional service – stacks are regional.
You can export stack outputs using Export field in Outputs and import them in other stacks using Fn::ImportValue.
Fn::GetAtt returns specific attributes; not all resources support all attributes. Check documentation.
Ref on a parameter returns the parameter value; on a resource, returns the physical ID (e.g., EC2 instance ID, VPC ID).
AWS::NoValue is used to conditionally omit a property. For example, if a condition is false, you can set a property to AWS::NoValue to leave it unspecified.
Stack sets require a service-managed or self-managed permission model. Service-managed uses AWS Organizations. Self-managed requires you to create IAM roles in target accounts.
The WaitCondition resource takes a Timeout in seconds, up to a maximum of 43200 (12 hours). The CreationPolicy timeout is also configurable and defaults to 5 minutes (PT5M).
cfn-signal and cfn-init are part of the AWS CloudFormation helper scripts, which must be installed on the AMI (Amazon Linux includes them).
The AWS::CloudFormation::Interface metadata group can be used to control how parameters are grouped in the console.
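The export and import item in the list above can be sketched as two separate templates, shown here as two YAML documents divided by `---`. The export name pattern and the NetworkStackName parameter are assumptions chosen for the example.

```yaml
# Template A (exporting stack): a hypothetical network stack
Resources:
  Vpc:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16
Outputs:
  VpcId:
    Value: !Ref Vpc
    Export:
      Name: !Sub "${AWS::StackName}-VpcId"
---
# Template B (importing stack): deployed separately, in the same region
Parameters:
  NetworkStackName:
    Type: String
Resources:
  AppSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: App tier
      VpcId:
        Fn::ImportValue: !Sub "${NetworkStackName}-VpcId"
```

While another stack imports the value, CloudFormation will not let you delete the exporting stack or change that export.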
Advanced: Custom Resources
Custom resources allow you to implement custom provisioning logic using AWS Lambda. When CloudFormation creates, updates, or deletes a custom resource, it invokes a Lambda function that you provide. The function must send a response to the pre-signed URL included in the request, with a status (SUCCESS or FAILED) and optional data; if no response arrives, the stack operation waits until it times out. This is useful for integrating with third-party APIs or performing tasks not natively supported.
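On the template side, a custom resource is declared like any other resource. In this sketch the function ARN parameter, the Custom::SeedData type name, the TableName property, and the Result attribute are all hypothetical; the Lambda function itself is assumed to exist already.

```yaml
Parameters:
  ProvisionerFunctionArn:
    Type: String   # ARN of an existing Lambda function (hypothetical)

Resources:
  SeedData:
    Type: Custom::SeedData                        # any name under the Custom:: prefix
    Properties:
      ServiceToken: !Ref ProvisionerFunctionArn   # where CloudFormation sends the request
      TableName: example-table                    # extra properties are passed to the function

Outputs:
  SeedResult:
    # Attributes come from the Data object in the function's response
    Value: !GetAtt SeedData.Result
```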
Summary of Intrinsic Functions
| Function | Purpose | Example |
|----------|---------|---------|
| Ref | Returns parameter value or resource physical ID | !Ref MyVPC |
| Fn::GetAtt | Returns resource attribute | !GetAtt MyInstance.PublicIp |
| Fn::Join | Concatenates strings | !Join ["-", ["a", "b"]] → "a-b" |
| Fn::Select | Selects element from list | !Select [0, !Ref AZList] |
| Fn::Sub | Substitutes variables | !Sub "${AWS::Region}" |
| Fn::If | Conditional value | !If [IsProd, "t2.large", "t2.micro"] |
| Fn::Equals | Equality check | !Equals [!Ref Env, "prod"] |
| Fn::FindInMap | Looks up mapping | !FindInMap [RegionMap, !Ref "AWS::Region", "AMI"] |
Key Numbers for the Exam
Template body max: 51,200 bytes (50 KB).
Parameters max: 200.
Resources max per stack: 500.
Outputs max: 200.
Mappings max: 200.
Stack name max: 128 characters.
Stack sets: max 2000 stack instances per stack set.
Drift detection can be run every 1 minute on a single stack.
WaitCondition timeout maximum: 43200 seconds (12 hours); you set the value explicitly.
CreationPolicy timeout default: PT5M (5 minutes); maximum 12 hours.
Permissions and IAM
CloudFormation requires permissions to create, describe, update, and delete resources. By default it uses the permissions of the IAM user or role that initiates the stack operation; alternatively, you can pass a service role (--role-arn in the CLI) that CloudFormation assumes for operations on that stack. For cross-account stacks, you need appropriate trust policies. For stack sets, you need permissions in the target accounts. An AWS::IAM::Role can be created in the template to grant permissions to instances (e.g., for cfn-init).
Resource Replacement
When a stack update requires replacing a resource (e.g., changing an EC2 instance's ImageId), CloudFormation creates the new resource first, then deletes the old one during cleanup. Both exist briefly, and the replacement gets a new physical ID. For stateful resources like RDS, replacement can lead to data loss unless you have snapshots.
DeletionPolicy
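The DeletionPolicy attribute on a resource controls what happens when the resource is deleted, either because the stack is deleted or because the resource is removed from the template in an update. Options:
- Delete (default for most resources): The resource is deleted. AWS::RDS::DBCluster resources, and AWS::RDS::DBInstance resources that are not part of a cluster, default to Snapshot instead.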
- Retain: The resource is retained (e.g., to keep an S3 bucket with data).
- Snapshot: For supported resources (e.g., RDS, EC2 volume, ElastiCache), a snapshot is taken before deletion.
UpdateReplacePolicy
Similar to DeletionPolicy but applies when a resource is replaced during an update. Options: Delete, Retain, Snapshot.
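Both attributes are written at the resource level, as siblings of Type and Properties. A minimal sketch with hypothetical logical IDs and a NoEcho password parameter:

```yaml
Parameters:
  DbPassword:
    Type: String
    NoEcho: true

Resources:
  AppDatabase:
    Type: AWS::RDS::DBInstance
    DeletionPolicy: Snapshot        # stack deletion, or removal from the template
    UpdateReplacePolicy: Snapshot   # replacement during a stack update
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: "20"
      MasterUsername: admin
      MasterUserPassword: !Ref DbPassword

  LogBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain          # bucket and its objects outlive the stack
    UpdateReplacePolicy: Retain
```

Setting only DeletionPolicy leaves the old physical resource unprotected when an update replaces it, which is why the two are usually set together on stateful resources.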
Resource Attributes
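Common attributes you can set on a resource, alongside Type and Properties (CreationPolicy and UpdatePolicy apply only to specific resource types):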
- Condition: Reference a condition from the Conditions section.
- DependsOn: Explicit dependency.
- CreationPolicy: Wait for signals.
- UpdatePolicy: Update behavior for Auto Scaling groups.
- DeletionPolicy: What to do on delete.
- UpdateReplacePolicy: What to do on replace.
- Metadata: Arbitrary data.
Template Validation
You can validate a template using the aws cloudformation validate-template command. It checks syntax and resource types but does not verify that the resources will be created successfully (e.g., it won't check for insufficient IP addresses).
Stack Events
Every stack operation generates events. You can view them in the console or via CLI (aws cloudformation describe-stack-events). Events include resource creation, update, and deletion statuses, along with timestamps and logical IDs.
Resource Signals with cfn-signal
When you use a CreationPolicy on an Auto Scaling group or EC2 instance, you must run cfn-signal on the instance after configuration is complete. The command sends a success or failure signal. Example:
cfn-signal --stack MyStack --resource MyAutoScalingGroup --region us-east-1
If the signal is not received within the timeout, stack creation fails.
Using AWS CLI with CloudFormation
Common commands:
- aws cloudformation create-stack --stack-name my-stack --template-body file://template.yaml --parameters ParameterKey=KeyName,ParameterValue=mykey
- aws cloudformation update-stack --stack-name my-stack --template-body file://updated.yaml
- aws cloudformation describe-stacks --stack-name my-stack
- aws cloudformation delete-stack --stack-name my-stack
- aws cloudformation create-change-set --stack-name my-stack --template-body file://new.yaml --change-set-name my-change-set
- aws cloudformation execute-change-set --change-set-name my-change-set --stack-name my-stack
- aws cloudformation detect-stack-drift --stack-name my-stack
Stack Drift Detection Output
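The detect-stack-drift command returns only a StackDriftDetectionId. Pass that ID to aws cloudformation describe-stack-drift-detection-status to see the results, which include:
- StackDriftStatus: DRIFTED, IN_SYNC, UNKNOWN, NOT_CHECKED.
- DriftedStackResourceCount: Number of drifted resources.
- Timestamp: When the detection operation was initiated.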
Exam Traps
Trap: Confusing Ref and Fn::GetAtt. Ref returns the physical ID (e.g., instance ID), while Fn::GetAtt returns a specific attribute (e.g., PublicIp).
Trap: Thinking DependsOn is always required. CloudFormation automatically infers dependencies from references. Only use DependsOn when you need an explicit ordering that cannot be inferred (e.g., a custom resource that must run after another).
Trap: Assuming DeletionPolicy: Retain protects against stack deletion. It only protects the resource from being deleted when the stack is deleted; the resource still exists outside the stack.
Trap: Believing that stack sets automatically handle drift across all accounts. Drift detection is never automatic: you start it on the stack set (or on an individual stack instance), and it only reports drift without fixing it.
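Trap: Forgetting the IAM side of AWS::CloudFormation::Init. cfn-init can read the metadata of the instance's own stack without extra credentials, but the instance needs an IAM role (instance profile) with appropriate permissions when the metadata pulls files or sources from protected locations such as a private S3 bucket.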
Integration with Other Services
AWS CodePipeline: Can deploy CloudFormation stacks as part of a CI/CD pipeline.
AWS Config: Can detect drift by evaluating resource configurations.
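AWS Systems Manager: Templates can read Parameter Store values through SSM parameter types (e.g., AWS::SSM::Parameter::Value<String>) or dynamic references, and can create parameters with the AWS::SSM::Parameter resource.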
AWS Lambda: Custom resources can invoke Lambda functions.
Amazon S3: Templates can be stored in S3 and referenced by URL.
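As one example of these integrations, the sketch below reads an AMI ID from Parameter Store and writes a value back. The default path is the public Amazon Linux 2 parameter; the /example/app/instance-id name and the logical IDs are hypothetical.

```yaml
Parameters:
  LatestAmiId:
    # Resolved from Parameter Store whenever the stack is created or updated
    Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>
    Default: /aws/service/ami-amazon-linux-latest/amzn2-ami-hvm-x86_64-gp2

Resources:
  AppInstance:
    Type: AWS::EC2::Instance
    Properties:
      ImageId: !Ref LatestAmiId    # Ref returns the resolved AMI ID, not the path
      InstanceType: t3.micro

  InstanceIdParameter:
    Type: AWS::SSM::Parameter
    Properties:
      Name: /example/app/instance-id   # hypothetical parameter name
      Type: String
      Value: !Ref AppInstance
```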
CloudFormation vs. Terraform
While Terraform is not on the SOA-C02 exam, understanding the differences can help. CloudFormation is AWS-native and deeply integrated with IAM and other services. It uses declarative JSON/YAML. Terraform uses HCL and supports multiple providers. For the exam, focus on CloudFormation-specific features like change sets, stack sets, drift detection, and helper scripts.
Design the Template
Define the desired AWS resources in a JSON or YAML file. Start with the Resources section. For each resource, specify the type and properties. Use Parameters for configurable inputs (e.g., instance type, VPC CIDR). Use Mappings for region-specific values (e.g., AMI IDs). Use Conditions to create resources only in certain environments. Use Outputs to export useful information like endpoint URLs. Ensure the template is valid by running `aws cloudformation validate-template`.
Create the Stack
Use the AWS Management Console, CLI, or SDK to create a stack from the template. Provide stack name, parameters, and IAM role if needed. CloudFormation will parse the template, resolve intrinsic functions, and start creating resources in dependency order. Resources that do not depend on each other are created in parallel; a resource waits only for the resources it depends on. If a resource fails, the stack rolls back by default, deleting any resources created so far.
Monitor Stack Events
During creation, you can view stack events in the console or via CLI (`describe-stack-events`). Each event shows logical resource ID, resource type, status (CREATE_IN_PROGRESS, CREATE_COMPLETE, CREATE_FAILED), and timestamp. If creation fails, the status reason provides the error message (e.g., "The security group 'sg-xxx' does not exist"). Monitoring helps identify the exact point of failure.
Update the Stack
To update, you can modify the template or parameters. CloudFormation compares the new template with the current stack and determines what changes are needed. It creates a change set if requested. During update, resources may be added, modified, or replaced. Replacement is disruptive (e.g., an EC2 instance is terminated and a new one launched). Use a stack policy to protect critical resources. If an update fails, CloudFormation rolls back to the previous state.
Detect and Handle Drift
Run drift detection on the stack to check if any resources were manually modified. Use `aws cloudformation detect-stack-drift`. The output shows drift status. If drift is detected, you can either update the template to match the actual state or revert the manual changes. Drift detection is important for compliance and to ensure that the infrastructure matches the IaC definition.
Delete the Stack
When the stack is no longer needed, delete it. CloudFormation will delete all resources in reverse order of creation. If termination protection is enabled, you must disable it first. Use `DeletionPolicy` on resources you want to retain (e.g., S3 buckets with data). After deletion, the stack is removed from the list. You can also delete a stack while keeping some resources if they have `DeletionPolicy: Retain`.
Enterprise Scenario 1: Multi-Account Governance with Stack Sets
A large enterprise uses AWS Organizations to manage 50 accounts across multiple business units. The security team wants to enforce a baseline of security groups, AWS Config rules, and CloudTrail trails in every account. They create a CloudFormation stack set with a template that defines these resources. The stack set uses service-managed permissions with automatic deployment enabled and targets the organization, so stacks are deployed to every member account. When a new account is added, CloudFormation automatically deploys the stack to it. If a security group is manually modified in one account, drift detection flags it. The security team runs drift detection weekly and remediates any drifted stacks by updating the stack set, which re-deploys the correct configuration. This ensures consistent security posture across all accounts without manual intervention.
Enterprise Scenario 2: CI/CD Pipeline for Microservices
A SaaS company runs hundreds of microservices on ECS. Each service has its own CloudFormation template that defines ECS tasks, services, load balancers, and auto scaling. The team uses AWS CodePipeline to automate deployments. When a developer pushes code to a Git repository, CodePipeline triggers a build, runs tests, and then uses CloudFormation to update the stack for that service. The template uses parameters for the Docker image tag, so each deployment updates the task definition. Change sets are used to preview the update before applying. If the update fails (e.g., due to a misconfigured health check), CloudFormation rolls back, and the pipeline fails, alerting the team. This approach allows safe, automated deployments with minimal downtime.
Enterprise Scenario 3: Disaster Recovery with Nested Stacks
A financial institution needs to replicate its production environment in a secondary region for disaster recovery. They create a parent (root) template that calls nested stacks for each layer: networking, compute, database, and application. The nested stacks are parameterized by environment (prod, dr). The DR stack is created in the secondary region with different parameter values (e.g., smaller instance types). The parent template passes outputs between nested stacks with Fn::GetAtt (e.g., the VPC ID output of the networking stack is supplied as a parameter to the compute stack); Fn::ImportValue is reserved for references between independent stacks. When a DR test is required, the team updates the DR stack with the latest production parameters, runs drift detection to ensure consistency, and then tests failover. This modular approach allows independent updates to each layer and reusability across regions.
Common Pitfalls in Production
Resource limits: Exceeding default limits (e.g., 500 resources per stack) requires splitting into multiple stacks.
IAM permissions: Forgetting to grant CloudFormation permissions to create resources, leading to stack failures.
Manual changes: Developers manually modifying resources via console, causing drift and breaking automation.
Stateful resources: Replacing or deleting an RDS instance without a snapshot leads to data loss. Use UpdateReplacePolicy: Snapshot to cover replacement and DeletionPolicy: Snapshot to cover deletion.
Template size: Large templates exceed the 51,200-byte limit. Upload to S3 and use template URL.
Cross-region references: Outputs cannot be directly exported across regions; use Parameter Store or DynamoDB for cross-region values.
What the SOA-C02 Exam Tests
CloudFormation falls under SOA-C02 task statement 3.1 (Provision and maintain cloud resources), which covers:
Creating and managing stacks via console, CLI, and SDK.
Understanding template structure (Resources, Parameters, Outputs, Conditions, Mappings).
Using intrinsic functions (Ref, GetAtt, Join, Sub, If, Equals, FindInMap).
Implementing stack updates, change sets, and stack policies.
Understanding drift detection and rollback configuration.
Using nested stacks and stack sets for multi-account/region deployments.
Configuring AWS::CloudFormation::Init and helper scripts (cfn-init, cfn-signal).
Troubleshooting stack failures (events, logs, permissions).
Common Wrong Answers and Why Candidates Choose Them
Confusing `Ref` and `Fn::GetAtt`: Candidates often think Ref returns an attribute like PublicIp. Actually, Ref returns the physical ID (e.g., instance ID). To get PublicIp, use Fn::GetAtt. The exam may ask: "How do you get the public IP of an EC2 instance?". The correct answer is !GetAtt MyInstance.PublicIp.
Assuming `DependsOn` is required for all dependencies: Many believe you must explicitly declare DependsOn for every resource relationship. CloudFormation automatically infers dependencies from resource references (e.g., a subnet referencing a VPC ID). Only use DependsOn when the dependency cannot be inferred (e.g., a custom resource that must run after another).
Thinking `DeletionPolicy: Retain` prevents stack deletion: It only retains the resource when the stack is deleted. The resource still exists and must be managed separately. Candidates may think it protects the stack itself from deletion.
Believing that change sets are required for all updates: Change sets are optional but recommended for critical updates. You can update a stack directly without a change set.
Mixing up `WaitCondition` and `CreationPolicy`: Both wait for signals, but CreationPolicy is attached to a resource (e.g., Auto Scaling group) and is simpler. WaitCondition is a separate resource that can be used with any signal source.
Specific Numbers and Values to Memorize
Maximum template body size: 51,200 bytes (50 KB).
Maximum resources per stack: 500.
Maximum parameters: 200.
Maximum outputs: 200.
Maximum mappings: 200.
Stack name length: up to 128 characters.
WaitCondition timeout maximum: 43200 seconds (12 hours); you set the value explicitly.
CreationPolicy timeout default: PT5M (5 minutes); maximum 12 hours.
Drift detection can be run every 1 minute per stack.
Stack set maximum instances: 2000.
Edge Cases and Exceptions
Cross-account stack sets: You must create IAM roles in target accounts (self-managed) or use AWS Organizations (service-managed).
Resource replacement: Changing certain properties (e.g., an EC2 instance's ImageId or an RDS DB instance identifier) requires replacement, which creates a new resource and deletes the old one. This can cause data loss for stateful resources.
Nested stacks vs. stack sets: Nested stacks are for modular templates within a single account/region. Stack sets are for multi-account/region deployments.
Export names must be unique within a region: You cannot have two stacks exporting the same name.
`Fn::ImportValue` can only be used within the same region: Cross-region imports are not supported directly; use Parameter Store or custom solutions.
How to Eliminate Wrong Answers
If a question asks about returning a resource attribute, look for Fn::GetAtt in the options. If it asks for a parameter value or resource ID, look for Ref.
If the question involves multi-account deployment, the answer likely involves stack sets.
If the question involves waiting for configuration completion, look for CreationPolicy or cfn-signal.
If the question involves protecting a resource from deletion, look for DeletionPolicy: Retain.
If the question involves previewing changes, look for change sets.
CloudFormation is a declarative IaC service that provisions AWS resources as a single stack.
Templates are written in JSON or YAML with sections: AWSTemplateFormatVersion, Description, Metadata, Parameters, Rules, Mappings, Conditions, Transform, Resources, Outputs.
Key intrinsic functions: Ref (returns physical ID or parameter value), Fn::GetAtt (returns resource attribute), Fn::Join, Fn::Sub, Fn::If, Fn::FindInMap.
Stack operations: create, update, delete. Updates can be previewed with change sets.
Stack sets deploy stacks across multiple accounts and regions; nested stacks modularize templates.
Drift detection compares actual resource state to template; statuses: DRIFTED, IN_SYNC, UNKNOWN, NOT_CHECKED.
DeletionPolicy controls resource behavior on stack deletion: Delete (default), Retain, Snapshot.
Helper scripts (cfn-init, cfn-signal) configure EC2 instances via AWS::CloudFormation::Init metadata.
Maximum template body size: 51,200 bytes; max resources per stack: 500; max parameters: 200.
Stack policies protect resources from accidental updates; termination protection prevents stack deletion.
Common exam traps: confusing Ref and GetAtt, overusing DependsOn, misunderstanding DeletionPolicy.
Troubleshoot stack failures via stack events, logs (cfn-init.log), and IAM permissions.
These come up on the exam all the time. Here's how to tell them apart.
Change Sets
Provides a preview of changes before applying them.
Allows review and approval workflow.
Does not modify the stack until executed.
Can be created from a new template or parameters.
Useful for validating updates in non-production environments.
Direct Update
Applies changes immediately without preview.
Faster for simple changes.
No built-in approval step.
Cannot be undone without rollback.
Suitable for automated CI/CD pipelines with validation.
Mistake
CloudFormation templates must be JSON, not YAML.
Correct
CloudFormation supports both JSON and YAML formats. YAML is often preferred for readability and supports comments.
Mistake
The `Ref` function returns the Amazon Resource Name (ARN) of a resource.
Correct
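`Ref` returns the physical ID for most resources (e.g., EC2 instance ID, VPC ID), not the ARN. A few resource types, such as `AWS::SNS::Topic`, do return an ARN from `Ref`, so check the resource's documentation. To get the ARN otherwise, use `Fn::GetAtt` with the appropriate attribute or construct it using `Fn::Sub`.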
Mistake
You must explicitly define all resource dependencies using `DependsOn`.
Correct
CloudFormation automatically infers dependencies from resource references (e.g., a subnet referencing a VPC ID). `DependsOn` is only needed when the dependency cannot be inferred or you need to enforce an order that is not based on references.
Mistake
Setting `DeletionPolicy: Retain` prevents the stack from being deleted.
Correct
`DeletionPolicy: Retain` only prevents the specific resource from being deleted when the stack is deleted. The stack itself can still be deleted, and the retained resource will exist independently.
Mistake
Drift detection automatically corrects any drifted resources.
Correct
Drift detection only reports differences between the actual state and the template. It does not automatically fix drift. You must manually update the stack or revert the manual changes.
`Ref` returns the physical ID of a resource (e.g., EC2 instance ID `i-123abc`) or the value of a parameter. `Fn::GetAtt` returns a specific attribute of a resource, such as `PublicIp` or `Arn`. For example, `!Ref MyInstance` gives the instance ID, while `!GetAtt MyInstance.PublicIp` gives the public IP address. Use `Ref` when you need the identifier, and `GetAtt` when you need a particular property.
If your template body exceeds the 51,200-byte limit, you must upload it to an S3 bucket and then specify the S3 URL when creating or updating the stack. Use the `--template-url` parameter in the CLI or the S3 URL option in the console. The maximum template size is 1 MB when using an S3 URL.
Use CloudFormation Stack Sets. Create a stack set with a template and specify target accounts and regions. You can use service-managed permissions (via AWS Organizations) or self-managed permissions (create IAM roles in each target account). Stack sets automatically create stacks in each target account/region. You can update the stack set centrally, and it will update all stack instances.
Drift detection compares the actual state of resources in a stack with the expected state defined in the template. It detects manual changes made outside CloudFormation. You can run drift detection on a stack or individual resources using the console or CLI (`aws cloudformation detect-stack-drift`). The results show which resources have drifted. To fix drift, either update the template to match the actual state or revert the manual changes.
Both are used to pause stack creation until a signal is received. A `WaitCondition` is a separate resource that can be used with any external signal (e.g., from an application). A `CreationPolicy` is an attribute attached directly to a resource (e.g., an Auto Scaling group) and is simpler to configure. `CreationPolicy` is typically used with Auto Scaling groups and EC2 instances that use `cfn-signal`. `WaitCondition` is more flexible but requires creating an additional resource.
Set the `DeletionPolicy` attribute of the RDS resource to `Snapshot` or `Retain`. `Snapshot` takes a final snapshot before deletion, preserving the data. `Retain` keeps the database running even after the stack is deleted. `DeletionPolicy` is a resource attribute, so it sits at the same level as `Type` and `Properties` under the resource's logical ID (e.g., `MyDB`), not inside `Properties`.