This chapter covers AWS Cloud Development Kit (CDK), an Infrastructure as Code (IaC) tool that lets you define AWS infrastructure using familiar programming languages. For the SOA-C02 exam, CDK matters because it touches your ability to automate deployments, manage stacks, and apply best practices. Where CDK appears in the Deployment, Provisioning, and Automation domain, questions tend to center on synthesis, bootstrapping, and construct levels. This chapter gives you the mechanistic understanding needed to answer those questions confidently.
Imagine a construction company that builds skyscrapers. Traditionally, they manually draw blueprints for every building, specifying every beam, pipe, and wire. This is like writing CloudFormation templates manually. Now, they adopt a new approach: they use a high-level programming language (like Python) to define a 'Skyscraper' class that, when instantiated, automatically generates the detailed blueprints (YAML/JSON). The class encapsulates best practices: it knows that every floor needs fire sprinklers, that the foundation depth depends on soil type, and that elevators must meet safety codes. When the company builds a new skyscraper, they just create an instance with parameters like 'height=50 floors' and 'location=seismic zone 3'. The class then outputs the complete set of blueprints, which are then handed to the construction crew (CloudFormation) to execute. If they need to change the elevator brand, they update the class definition and regenerate all blueprints — ensuring consistency across all buildings. This is exactly how AWS CDK works: you write infrastructure in a familiar programming language, and the CDK 'synthesizes' that code into CloudFormation templates, which are then deployed. Just as the class enforces building codes, CDK constructs enforce AWS best practices and security defaults. The construction crew never sees the high-level code; they only see the generated blueprints. Similarly, CloudFormation only sees the generated templates, not the CDK code.
What is AWS CDK and Why Does It Exist?
AWS Cloud Development Kit (CDK) is an open-source software development framework for defining cloud infrastructure in code and provisioning it through AWS CloudFormation. Unlike writing raw YAML or JSON templates, CDK lets you use TypeScript, JavaScript, Python, Java, C#, or Go to define resources. The primary benefit is that you can use object-oriented programming, loops, conditions, and variables to create reusable infrastructure components called constructs. CDK synthesizes your code into CloudFormation templates, which are then deployed by CloudFormation. This abstraction reduces manual errors and accelerates development.
How CDK Works Internally – The Mechanism
CDK operates in three main phases: Construct, Synthesize, and Deploy.
1. Construct Phase: You write code using CDK constructs. Constructs are the basic building blocks of CDK apps. They represent AWS resources or groups of resources. There are three levels of constructs:
- L1 (Low-Level): These are direct representations of CloudFormation resources, prefixed with Cfn. For example, CfnBucket maps directly to AWS::S3::Bucket. They require you to specify every property manually.
- L2 (Curated/High-Level): These are curated, AWS-authored constructs that provide sensible defaults and convenience methods. For example, Bucket from aws-s3 accepts simple properties such as versioned and encryption, defaults to a RETAIN removal policy, and exposes methods like addLifecycleRule() and grantRead(). Versioning and lifecycle rules are not enabled unless you ask for them.
- L3 (Patterns): These represent complete architectural patterns, like ApplicationLoadBalancedFargateService from aws-ecs-patterns, which creates an ECS Fargate service behind an Application Load Balancer. Auto scaling can then be configured on the resulting service.
2. Synthesize Phase: When you run cdk synth, CDK traverses your app's construct tree and generates a CloudFormation template for each stack. The synthesis process resolves references, generates logical IDs, and produces the final templates. They are written as JSON to the cdk.out directory by default (for a single stack, cdk synth also prints the template as YAML to the terminal). You can inspect these templates for debugging.
3. Deploy Phase: Running cdk deploy first synthesizes, then uploads the templates and file assets to the staging S3 bucket created during bootstrapping (Docker image assets go to the bootstrap ECR repository), and finally asks CloudFormation to create or update the stack. By default CDK does this by creating and executing a CloudFormation change set. You can also use cdk diff to see what will change before deploying.
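The first two phases can be illustrated without touching AWS at all. The following is a toy model in plain Python, not the CDK library: the classes `Construct`, `App`, `Stack`, and `Resource` are simplified stand-ins written for this sketch, and the logical ID logic is deliberately cruder than what CDK really does.

```python
import json
import tempfile
from pathlib import Path


class Construct:
    """Toy stand-in for a CDK construct: a named node in a tree."""

    def __init__(self, scope, construct_id):
        self.scope = scope
        self.construct_id = construct_id
        self.children = []
        if scope is not None:
            scope.children.append(self)


class Resource(Construct):
    def __init__(self, scope, construct_id, cfn_type, **properties):
        super().__init__(scope, construct_id)
        self.cfn_type = cfn_type
        self.properties = properties

    @property
    def logical_id(self):
        # Simplified: real CDK also appends a hash of the construct path.
        return "".join(ch for ch in self.construct_id if ch.isalnum())

    def to_cfn(self):
        return {"Type": self.cfn_type, "Properties": self.properties}


class Stack(Construct):
    def to_template(self):
        return {"Resources": {r.logical_id: r.to_cfn() for r in self.children}}


class App(Construct):
    def __init__(self):
        super().__init__(None, "")

    def synth(self, outdir):
        # Synthesize phase: one template file per stack; nothing is deployed.
        written = []
        for stack in self.children:
            target = Path(outdir) / f"{stack.construct_id}.template.json"
            target.write_text(json.dumps(stack.to_template(), indent=2))
            written.append(target)
        return written


# Construct phase: instantiating objects builds the tree.
app = App()
stack = Stack(app, "MyStack")
Resource(stack, "Logs", "AWS::S3::Bucket",
         VersioningConfiguration={"Status": "Enabled"})

# Synthesize phase: the tree becomes template files on disk.
files = app.synth(tempfile.mkdtemp())
template = json.loads(files[0].read_text())
print(files[0].name)
print(template["Resources"]["Logs"]["Type"])
```

The deploy phase is intentionally absent: in this model, as in CDK, everything up to this point is local file generation, and only deployment hands the result to CloudFormation.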
Key Components, Values, Defaults, and Timers
App: The root construct that represents the entire CDK application. It holds all stacks.
Stack: A unit of deployment that maps to a CloudFormation stack. Each stack is deployed independently.
Construct: Any building block that extends the Construct class. Constructs are organized in a tree structure.
Environment: An AWS account and region where stacks are deployed. You can specify env when creating a stack: { account: '123456789012', region: 'us-east-1' }.
Bootstrap: Before deploying CDK apps, you must run cdk bootstrap in each account/region. This creates a CloudFormation stack (CDKToolkit) that includes an S3 bucket for storing templates and file assets, an Amazon ECR repository for Docker image assets, and IAM roles for deployment. The bucket name is cdk-<qualifier>-assets-<account>-<region>, where the qualifier defaults to hnb659fds.
Synthesis Output: By default, cdk synth outputs templates to cdk.out/. Each stack gets a file named {stackName}.template.json.
Context: A mechanism to pass values at synthesis time. Context can come from cdk.json, command line (-c key=value), or AWS account information. Common uses include environment configurations.
Assets: Local files or Docker images that are bundled and uploaded to the staging bucket during deployment. CDK automatically generates an S3 asset manifest.
Permissions: CDK automatically generates IAM policies based on resource usage. For example, if you use bucket.grantRead(lambda), CDK adds an IAM policy statement allowing the Lambda to read from the bucket.
Removal Policy: Determines what happens when a resource is removed. Common values: RemovalPolicy.DESTROY (deletes resource) and RemovalPolicy.RETAIN (retains resource, e.g., for S3 buckets). Default is RETAIN for stateful resources like S3 buckets, and DESTROY for stateless ones like Lambda functions.
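To make the removal policy entry concrete, here is a small sketch of how a removal policy ends up in the synthesized template. It is plain Python rather than the CDK library; the `RemovalPolicy` enum and `apply_removal_policy` function are written for illustration and mirror the CloudFormation `DeletionPolicy` and `UpdateReplacePolicy` resource attributes.

```python
from enum import Enum


class RemovalPolicy(Enum):
    DESTROY = "destroy"
    RETAIN = "retain"
    SNAPSHOT = "snapshot"


# CloudFormation attribute value that corresponds to each policy.
_CFN_VALUE = {
    RemovalPolicy.DESTROY: "Delete",
    RemovalPolicy.RETAIN: "Retain",
    RemovalPolicy.SNAPSHOT: "Snapshot",
}


def apply_removal_policy(resource: dict, policy: RemovalPolicy) -> dict:
    """Return a copy of a template resource with the policy attributes set."""
    value = _CFN_VALUE[policy]
    updated = dict(resource)
    updated["DeletionPolicy"] = value
    updated["UpdateReplacePolicy"] = value
    return updated


bucket = {"Type": "AWS::S3::Bucket", "Properties": {}}
print(apply_removal_policy(bucket, RemovalPolicy.RETAIN))
```

When a stack deletion leaves a bucket behind, checking the template for `"DeletionPolicy": "Retain"` is usually the quickest way to confirm that this is expected behavior.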
Configuration and Verification Commands
cdk init app --language python – Initializes a new CDK app in Python.
cdk synth – Synthesizes CloudFormation templates. Use -o <directory> to change output directory.
cdk deploy [StackName] – Deploys the stack. Use --require-approval never to skip approval prompts.
cdk diff [StackName] – Shows the difference between the template synthesized from your local code and the stack currently deployed.
cdk destroy [StackName] – Deletes the stack and its resources.
cdk list – Lists all stacks in the app.
cdk bootstrap – Bootstraps the environment. Use --cloudformation-execution-policies to specify IAM policy ARN for deployment.
cdk doctor – Checks your CDK environment for issues.
cdk context – Manages context values.
How CDK Interacts with Related Technologies
CDK is tightly integrated with CloudFormation. When you deploy, CDK calls CloudFormation APIs. The generated templates are standard CloudFormation templates, so you can even deploy them manually via the console (templates that reference assets need those assets published first). CDK also integrates with:
- AWS Identity and Access Management (IAM): CDK automatically creates IAM roles and policies based on resource interactions.
- AWS Lambda: CDK can bundle Lambda code from local directories or inline code. It handles uploading the code to S3.
- Amazon S3: CDK uses S3 for storing assets and templates. The bootstrap bucket stores file-based deployment artifacts.
- AWS CodePipeline: CDK can generate pipeline resources to implement CI/CD for the infrastructure itself.
- AWS CloudFormation StackSets: CDK can define stack sets through the L1 CfnStackSet resource. More commonly, multi-account and multi-region deployments are done by instantiating a stack once per target environment.
CDK also supports aspects, which are like middleware that can modify constructs. For example, you can apply an aspect that tags all resources with a specific tag.
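The visitor idea behind aspects can be sketched in a few lines of plain Python. This is a model of the pattern only: `Node`, `TagAspect`, and `apply_aspect` are hypothetical names for this sketch, and the choice to keep an existing tag value is arbitrary, not a statement about CDK's tag precedence.

```python
class Node:
    """Toy construct: a name, some tags, and child nodes."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []
        self.tags = {}


class TagAspect:
    """Visitor that adds one tag to every node it is shown."""

    def __init__(self, key, value):
        self.key = key
        self.value = value

    def visit(self, node):
        # Keep a value that is already present (a choice made for this sketch).
        node.tags.setdefault(self.key, self.value)


def apply_aspect(root, aspect):
    """Walk the tree depth-first and let the aspect visit every node."""
    aspect.visit(root)
    for child in root.children:
        apply_aspect(child, aspect)


bucket = Node("Bucket")
function = Node("Function")
stack = Node("Stack", [bucket, function])
bucket.tags["Environment"] = "Dev"

apply_aspect(stack, TagAspect("Environment", "Prod"))
print(stack.tags, bucket.tags, function.tags)
```

A validation aspect has the same shape: instead of mutating the node, `visit` would record or raise an error when a required property is missing.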
Important Exam Details
The SOA-C02 exam may ask about the three construct levels (L1, L2, L3) and their characteristics.
You should know the purpose of cdk bootstrap and what resources it creates.
Understand the difference between cdk synth and cdk deploy.
Be aware that CDK supports CloudFormation parameters via CfnParameter construct, but best practice is to use CDK context instead.
CDK permissions boundaries can be applied to all IAM roles in a stack using PermissionsBoundary.
The cdk watch command automatically deploys changes when files are modified.
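CDK escape hatches allow you to access underlying CloudFormation properties when L2 constructs don't expose something. Reach the L1 resource through node.defaultChild, then use addPropertyOverride (or addOverride) for properties and cfnOptions for resource-level options such as conditions and metadata.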
The exam may test that CDK generates logical IDs that are deterministic but can change if the construct path changes.
CDK nag is a tool for rule-based linting of CDK apps to enforce best practices.
Common Exam Scenarios
Bootstrapping: You must bootstrap an environment before deploying to it. The bootstrap stack creates an S3 bucket, IAM roles, and an Amazon ECR repository (for Docker assets).
Synthesis: Running cdk synth produces CloudFormation templates. The exam may ask what happens during synthesis (e.g., assets are uploaded? No, that happens during deploy).
Removal Policy: An S3 bucket with RemovalPolicy.DESTROY will be deleted when the stack is deleted. But if the bucket contains objects, the deletion fails unless autoDeleteObjects is set to true.
Cross-Stack References: CDK automatically creates CloudFormation exports and imports when stacks reference each other's resources.
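The export/import pair that CDK generates for a cross-stack reference looks roughly like the following. This sketch builds the two template fragments by hand in plain Python; the helper `export_ref`, the output key, and the export name `NetworkStack:VpcId` are hypothetical, since CDK generates its own names.

```python
import json


def export_ref(producer_template, logical_id, export_name):
    """Add an exported Output to the producer and return the import expression."""
    outputs = producer_template.setdefault("Outputs", {})
    outputs[f"Export{logical_id}"] = {
        "Value": {"Ref": logical_id},
        "Export": {"Name": export_name},
    }
    # The consuming stack refers to the value by export name.
    return {"Fn::ImportValue": export_name}


producer = {
    "Resources": {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": "10.0.0.0/16"}}
    }
}

consumer = {
    "Resources": {
        "AppSecurityGroup": {
            "Type": "AWS::EC2::SecurityGroup",
            "Properties": {
                "GroupDescription": "app",
                "VpcId": export_ref(producer, "Vpc", "NetworkStack:VpcId"),
            },
        }
    }
}

print(json.dumps(producer["Outputs"], indent=2))
print(json.dumps(consumer["Resources"]["AppSecurityGroup"]["Properties"]["VpcId"]))
```

One operational consequence follows from this shape: CloudFormation refuses to delete or change an export while another stack still imports it, so the consuming stack has to stop using the value first.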
Code Example: Simple S3 Bucket
# CDK v2 imports (CDK v1 used `from aws_cdk import core as cdk`)
import aws_cdk as cdk
from aws_cdk import aws_s3 as s3
from constructs import Construct


class MyStack(cdk.Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        # L2 bucket with versioning and S3-managed encryption
        bucket = s3.Bucket(
            self, "MyBucket",
            versioned=True,
            encryption=s3.BucketEncryption.S3_MANAGED,
            # Delete the bucket with the stack (buckets default to RETAIN)
            removal_policy=cdk.RemovalPolicy.DESTROY,
            # Empty the bucket first so the delete can succeed
            auto_delete_objects=True,
        )

        # Output the bucket name
        cdk.CfnOutput(self, "BucketName", value=bucket.bucket_name)


app = cdk.App()
MyStack(app, "MyStack")
app.synth()

When you run cdk synth, this generates a CloudFormation template with an AWS::S3::Bucket resource, including properties for VersioningConfiguration, BucketEncryption, and a deletion policy. The auto_delete_objects setting adds a custom resource backed by a Lambda function that empties the bucket before deletion.
Advanced Concepts
Custom Resources: CDK can create custom resources backed by Lambda functions for tasks not natively supported by CloudFormation.
Cloud Assembly: The output of synthesis is a cloud assembly, which is a set of files and assets ready for deployment.
CDK Pipelines: A construct that builds a self-mutating CI/CD pipeline for your CDK app.
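Multi-account deployment: Give each stack an Environment (account and region) and instantiate it once per target, for example as stages in CDK Pipelines. StackSets are available through the L1 CfnStackSet resource.
Testing CDK apps: Use the assertions module (aws-cdk-lib/assertions in CDK v2; @aws-cdk/assert was its v1 predecessor) to write unit tests for your infrastructure code.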
Initialize CDK Application
Run `cdk init app --language python` in an empty directory. This creates a standard project structure including `app.py` (entry point), `requirements.txt`, and `cdk.json` (older templates also generated `setup.py`). The `cdk.json` file tells the CDK how to run your app (e.g., the command to execute). The `app.py` file creates an instance of the `App` and instantiates stacks. This step sets up the environment for writing constructs.
Define Infrastructure Using Constructs
In the stack file (e.g., `my_stack.py`), import CDK modules and define resources by creating instances of constructs. For example, `s3.Bucket(self, 'MyBucket', versioned=True)`. Each construct takes a scope (usually `self`), an id (unique within scope), and props. Constructs automatically generate logical IDs and handle dependencies. You can use methods like `bucket.grant_read(lambda_func)` to add permissions. The construct tree is built during initialization.
Synthesize CloudFormation Template
Run `cdk synth` to compile the construct tree into a CloudFormation template. CDK executes the app code (e.g., `app.synth()`), which traverses all constructs and generates the template. The template is written to `cdk.out/StackName.template.json`. During synthesis, CDK resolves references (e.g., `bucket.bucket_arn`) by generating CloudFormation intrinsic functions like `Fn::GetAtt`. Synthesis does not create or change any AWS resources. The only AWS calls it may make are read-only context lookups (for example, looking up an existing VPC), and their results are cached in `cdk.context.json`.
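Because the synthesized template is ordinary JSON, it can be inspected with a few lines of Python. The sketch below assumes a template file laid out like the one `cdk synth` writes; the helper name `summarize_template` and the sample template are made up for illustration, and a temporary directory stands in for `cdk.out`.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_template(path):
    """Count the resources in a CloudFormation template file by type."""
    template = json.loads(Path(path).read_text())
    resources = template.get("Resources", {})
    return Counter(resource["Type"] for resource in resources.values())


# Stand-in for cdk.out/MyStack.template.json so the sketch runs anywhere.
sample = {
    "Resources": {
        "MyBucketF68F3FF0": {"Type": "AWS::S3::Bucket"},
        "LogsBucket9C4D8843": {"Type": "AWS::S3::Bucket"},
        "HandlerServiceRole": {"Type": "AWS::IAM::Role"},
    }
}
template_path = Path(tempfile.mkdtemp()) / "MyStack.template.json"
template_path.write_text(json.dumps(sample))

summary = summarize_template(template_path)
for cfn_type, count in sorted(summary.items()):
    print(f"{cfn_type}: {count}")
```

Against a real project, you would point the function at a file under `cdk.out/` after running `cdk synth`.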
Bootstrap the Environment (First Time Only)
Run `cdk bootstrap aws://ACCOUNT/REGION` to create the bootstrap stack. This creates an S3 bucket for storing templates and file assets, an Amazon ECR repository for Docker-based assets, and IAM roles for deployment, including the role CloudFormation assumes to execute changes. The bucket is versioned and has a lifecycle policy to expire old versions. Bootstrapping is required once per account/region combination. If you skip this, `cdk deploy` fails with an error saying the environment has not been bootstrapped.
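The bootstrap resources follow a predictable naming scheme, which helps when you need to find them in the console or in IAM policies. The sketch below encodes the names as I understand the default bootstrap template to produce them; treat the exact patterns as an assumption, since a customized bootstrap template or qualifier changes them. The function name is hypothetical.

```python
DEFAULT_QUALIFIER = "hnb659fds"  # default qualifier used by `cdk bootstrap`


def bootstrap_resource_names(account, region, qualifier=DEFAULT_QUALIFIER):
    """Return the expected default names of the main bootstrap resources."""
    suffix = f"{account}-{region}"
    return {
        "staging_bucket": f"cdk-{qualifier}-assets-{suffix}",
        "ecr_repository": f"cdk-{qualifier}-container-assets-{suffix}",
        "deploy_role": f"cdk-{qualifier}-deploy-role-{suffix}",
        "cfn_execution_role": f"cdk-{qualifier}-cfn-exec-role-{suffix}",
        "version_parameter": f"/cdk-bootstrap/{qualifier}/version",
    }


names = bootstrap_resource_names("123456789012", "us-east-1")
for key, value in names.items():
    print(f"{key}: {value}")
```

The account ID and region appear in every name, which is why bootstrapping has to be repeated for each account/region pair you deploy to.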
Deploy Stack with cdk deploy
Run `cdk deploy MyStack`. CDK first synthesizes (if not already done), then uploads the template and any assets to the bootstrap resources. It then asks CloudFormation to create or update the stack, by default by creating and executing a change set. If the deployment broadens IAM permissions or security group rules, CDK shows those changes and asks for approval (unless `--require-approval never`). CloudFormation provisions resources in dependency order. If a resource fails, CDK shows the error and CloudFormation rolls back by default. After deployment, CDK outputs any `CfnOutput` values.
Verify and Iterate with cdk diff
After making changes to your code, run `cdk diff` to see what will change. CDK compares the current stack's template with the one generated from the new code. It shows additions, modifications, and deletions. This is useful for catching unintended changes. Then run `cdk deploy` again to apply changes. CDK uses CloudFormation change sets to update resources with minimal disruption.
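The comparison that `cdk diff` performs can be approximated at the resource level with set operations. The following plain-Python sketch is a simplification for illustration: the function `diff_templates` is hypothetical and only looks at the `Resources` section, whereas the real command also reports property-level, IAM, and output changes.

```python
def diff_templates(deployed, synthesized):
    """Compare the Resources sections of two templates by logical ID."""
    old = deployed.get("Resources", {})
    new = synthesized.get("Resources", {})
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


deployed = {
    "Resources": {
        "Bucket": {"Type": "AWS::S3::Bucket", "Properties": {}},
        "Queue": {"Type": "AWS::SQS::Queue", "Properties": {}},
    }
}
synthesized = {
    "Resources": {
        "Bucket": {
            "Type": "AWS::S3::Bucket",
            "Properties": {"VersioningConfiguration": {"Status": "Enabled"}},
        },
        "Topic": {"Type": "AWS::SNS::Topic", "Properties": {}},
    }
}

result = diff_templates(deployed, synthesized)
print(result)
```

A logical ID that shows up under both "removed" and "added" with a new name is the pattern to watch for, because it means CloudFormation will replace the resource rather than update it.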
Enterprise Scenario 1: Multi-Account Microservices Platform
A financial services company manages over 50 microservices across three AWS accounts (dev, staging, prod). Each microservice requires an ALB, ECS Fargate cluster, RDS database, and IAM roles. Using raw CloudFormation templates, the team struggled with duplication and drift. They adopted CDK with TypeScript, creating L3 constructs like MicroserviceStack that encapsulate the entire pattern. The construct accepts parameters like serviceName, cpu, memory, and dbInstanceClass. It automatically creates a VPC with public and private subnets, security groups, and a load balancer. The team uses CDK Pipelines to deploy changes. The pipeline itself is defined in CDK and self-mutates. Each account has its own bootstrap stack. One common issue was that the autoDeleteObjects property on S3 buckets was not set initially, causing stack deletion to fail when buckets contained logs. They fixed this by adding autoDeleteObjects: true and removalPolicy: RemovalPolicy.DESTROY. Performance considerations: CDK synthesis time increased with many resources, but they mitigated by using nested stacks for independent components.
Scenario 2: Compliance-Driven Infrastructure
A healthcare startup must enforce encryption at rest and in transit for all services to meet HIPAA. They use CDK with Python and apply a custom Aspect that checks every resource for required encryption properties. The Aspect walks the construct tree and adds tags or modifies properties. For example, it ensures every S3 bucket has encryption: BucketEncryption.KMS and every RDS instance has storageEncrypted: true. They also use CDK Nag to run rules before deployment. One challenge was that some L2 constructs, like Bucket, default to BucketEncryption.S3_MANAGED, which is acceptable, but the Aspect flagged it because they required KMS. They overrode the default by passing encryption: BucketEncryption.KMS. They also used PermissionsBoundary to restrict IAM roles. The cdk diff command is used in CI/CD to prevent unauthorized changes.
Scenario 3: Serverless Data Pipeline
An e-commerce company processes clickstream data using Lambda, Kinesis Firehose, and S3. They use CDK to define the pipeline. Each stage is a construct: ClickstreamSource, StreamProcessor, and DataLake. The StreamProcessor construct creates a Lambda function with the code stored as an asset. CDK automatically uploads the code to S3 and sets the Lambda's code property. They use cdk watch during development to automatically redeploy on code changes. A common misconfiguration was forgetting to set reservedConcurrentExecutions on the Lambda, causing throttling under load. They added reserved_concurrent_executions: 100. Also, they used CfnOutput to export the Firehose delivery stream name for other teams. The pipeline processes over 10 TB of data daily with no issues.
SOA-C02 Exam Focus on AWS CDK
The SOA-C02 exam covers infrastructure automation under Domain 3: Deployment, Provisioning, and Automation, whose task statements are to provision and maintain cloud resources and to automate manual or repeatable processes. For CDK, you should understand the workflow, construct levels, bootstrapping, and common commands. Here are the critical points:
1. Construct Levels (L1, L2, L3): You must distinguish between them. L1 (Cfn*) are direct CloudFormation resource mappings; L2 are higher-level with defaults; L3 are patterns. The exam may present a scenario and ask which level to use. For example, if you need a full ECS service with ALB, use an L3 pattern. If you need to set a specific CloudFormation property not exposed by L2, use an L1 escape hatch.
2. Bootstrapping: The exam loves to test what cdk bootstrap does. Common wrong answer: 'It creates a CloudFormation stack that deploys your application.' Reality: It creates the CDKToolkit stack with an S3 bucket and IAM roles, not the application itself. Another trap: 'Bootstrapping is required every time you deploy.' No, only once per environment.
3. Synthesis vs. Deploy: Candidates often confuse cdk synth and cdk deploy. The exam may ask: 'What happens when you run cdk synth?' Correct: It generates CloudFormation templates and writes them to disk. Wrong: It deploys resources. Also, note that cdk deploy automatically runs synth first.
4. Removal Policy and Auto-Delete: Many questions involve S3 bucket deletion. Default removal policy for stateful resources is RETAIN. If you want the bucket to be deleted when the stack is deleted, you must explicitly set removalPolicy: RemovalPolicy.DESTROY and autoDeleteObjects: true (otherwise deletion fails if bucket is non-empty). The exam may present a stack deletion failure and ask why. Answer: The bucket contains objects and autoDeleteObjects was not set.
5. Cross-Stack References: CDK automatically creates CloudFormation exports and imports. The exam may ask how to share a resource between stacks. Correct: Use stack.exportValue() or pass a reference directly. Wrong: Manually create Fn::ImportValue.
6. IAM Permissions: CDK automatically generates IAM policies based on resource interactions. The exam may test that you don't need to manually create IAM roles for Lambda to write to S3 if you use bucket.grantWrite(lambda).
7. Context and Parameters: CDK prefers context over CloudFormation parameters. The exam may test that cdk deploy -c key=value passes context, and you retrieve it via self.node.try_get_context('key').
8. CDK Pipelines: Understand that CDK Pipelines creates a self-mutating pipeline. The pipeline stack itself is deployed via cdk deploy. The exam may ask about the source stage (e.g., CodeCommit, GitHub).
9. Asset Management: Assets are uploaded to the bootstrap bucket during deploy, not synth. The exam may ask when the S3 bucket for assets is created: during bootstrap.
10. Common Wrong Answers on Exams:
- 'CDK replaces CloudFormation.' (False. CDK uses CloudFormation.)
- 'You can use CDK without bootstrapping.' (False for most deployments.)
- 'CDK synthesizes directly to CloudFormation API calls.' (False. It generates templates.)
- 'CDK does not support IAM roles.' (False. It creates roles and policies automatically.)
Edge Cases:
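- If you change the logical ID of a resource (for example by renaming the construct id or moving the construct in the tree), CloudFormation will delete the old resource and create a new one. The exam may ask how to avoid this: call override_logical_id('OldLogicalId') on the underlying L1 resource (for an L2 construct, reach it through node.default_child).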
- CDK supports multiple languages, but the exam focuses on TypeScript and Python.
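Why a rename changes the logical ID becomes clearer with a simplified model of how the ID is derived from the construct path. The function below is a sketch and not CDK's exact algorithm: it joins the alphanumeric characters of the path and appends a short hash of the full path, which is the general idea, but the hash function and trimming rules here are assumptions.

```python
import hashlib


def logical_id(construct_path):
    """Derive a deterministic ID from a construct path such as 'Storage/Bucket'."""
    readable = "".join(ch for ch in construct_path if ch.isalnum())
    digest = hashlib.sha256(construct_path.encode("utf-8")).hexdigest()[:8].upper()
    return readable + digest


original = logical_id("Storage/Bucket")
same_again = logical_id("Storage/Bucket")
renamed = logical_id("Storage/DataBucket")

print(original == same_again)  # same path, same ID
print(original == renamed)     # different path, different ID
```

Because the ID depends only on the path, re-running synthesis is safe, while refactoring that moves or renames a construct is what triggers replacement.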
How to Eliminate Wrong Answers:
- Understand the mechanism: CDK is a code generator for CloudFormation. Any answer that suggests CDK directly calls AWS APIs without CloudFormation is wrong.
- Know the order: bootstrap -> synth -> deploy. If a question says 'deploy without synth', it's wrong because deploy includes synth.
- For resource deletion, always check removal policy. If a bucket is not deleted, suspect autoDeleteObjects missing.
CDK synthesizes code into CloudFormation templates; it does not replace CloudFormation.
Bootstrapping (cdk bootstrap) is required once per environment and creates the CDKToolkit stack with an S3 bucket and IAM roles.
Constructs have three levels: L1 (Cfn*), L2 (curated), and L3 (patterns).
RemovalPolicy.DESTROY with autoDeleteObjects: true is needed to fully delete S3 buckets on stack deletion.
cdk synth generates templates to cdk.out/; cdk deploy runs synth and then deploys via CloudFormation.
CDK automatically generates IAM policies for resource interactions (e.g., grantRead).
Context values (passed via -c key=value) are preferred over CloudFormation parameters.
Assets are uploaded during deploy, not synth.
cdk diff shows changes between the current and new template.
CDK Pipelines creates a self-mutating CI/CD pipeline for infrastructure.
These come up on the exam all the time. Here's how to tell them apart.
AWS CDK
Uses familiar programming languages (Python, TypeScript, etc.).
Automatically generates logical IDs and dependencies.
Provides high-level constructs with sensible defaults.
Supports loops, conditions, and reusable components via classes.
Automatically handles asset uploads and IAM permissions.
AWS CloudFormation (Raw Templates)
Uses YAML or JSON templates.
Logical IDs are written by hand, and DependsOn must be added wherever a dependency is not implied by Ref or Fn::GetAtt.
All properties must be explicitly defined.
Limited to CloudFormation intrinsic functions and pseudo-parameters.
Requires manual S3 uploads and IAM policy writing.
Mistake
CDK is a replacement for CloudFormation.
Correct
CDK is an abstraction layer that generates CloudFormation templates. It does not replace CloudFormation; it uses it for deployment. The generated templates are standard CloudFormation templates.
Mistake
You must run cdk synth before every cdk deploy.
Correct
The `cdk deploy` command automatically runs synthesis. You only need to run `cdk synth` separately if you want to inspect the generated templates or debug.
Mistake
Bootstrapping deploys your application.
Correct
Bootstrapping creates the CDKToolkit stack that includes an S3 bucket for assets and IAM roles. It does not deploy your application stacks. You must run `cdk deploy` separately.
Mistake
CDK automatically deletes S3 buckets when the stack is deleted.
Correct
By default, S3 buckets have a removal policy of `RETAIN`, meaning they are not deleted when the stack is deleted. To delete the bucket, you must set `removalPolicy: RemovalPolicy.DESTROY` and `autoDeleteObjects: true`.
Mistake
CDK does not support IAM roles; you must create them manually.
Correct
CDK automatically generates IAM roles and policies based on resource interactions. For example, using `bucket.grantRead(lambda)` adds the necessary IAM policy to the Lambda's role.
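What a grant method produces can be pictured as a policy statement appended to the grantee's role. The sketch below is plain Python, not the CDK library; `grant_read` is a hypothetical helper, and the action list reflects my understanding of what the L2 bucket's read grant currently emits, so verify it against a synthesized template before relying on it.

```python
import json


def grant_read(bucket_arn, role_statements):
    """Append a read-only S3 statement for the bucket to a role's statements."""
    statement = {
        "Effect": "Allow",
        "Action": ["s3:GetObject*", "s3:GetBucket*", "s3:List*"],
        # Bucket-level actions need the bucket ARN, object-level ones need /*.
        "Resource": [bucket_arn, f"{bucket_arn}/*"],
    }
    role_statements.append(statement)
    return statement


lambda_role_statements = []
grant_read("arn:aws:s3:::my-example-bucket", lambda_role_statements)

policy_document = {"Version": "2012-10-17", "Statement": lambda_role_statements}
print(json.dumps(policy_document, indent=2))
```

The statement is scoped to one bucket, which is the main benefit over a hand-written policy that uses a wildcard resource.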
`cdk synth` generates CloudFormation templates from your CDK code and writes them to the `cdk.out` directory. It does not create or change any AWS resources (at most it performs read-only context lookups). `cdk deploy` first runs synthesis (unless you point `--app` at an already synthesized cloud assembly such as `cdk.out`), then uploads the templates and assets to the bootstrap resources, and finally has CloudFormation create or update the stack, by default through a change set. In summary, synth is local code generation and deploy is the actual deployment.
No. Bootstrapping is a one-time setup per account and region. It creates the CDKToolkit stack with an S3 bucket for storing templates and assets, and IAM roles for deployment. You only need to re-bootstrap if you change the bootstrap stack's configuration (e.g., permissions boundary). The exam often tests that bootstrapping is required before first deployment but not every time.
CDK uses 'grant' methods on constructs to add IAM permissions. For example, `bucket.grantRead(lambda)` adds a policy statement to the Lambda's execution role allowing read actions such as `s3:GetObject*`, `s3:GetBucket*`, and `s3:List*` on the bucket and its objects. CDK tracks these grants and generates the appropriate IAM policy documents in the CloudFormation template. This reduces manual IAM policy writing and helps keep permissions scoped to the specific resource.
An aspect is a way to apply a transformation or validation to all constructs in the construct tree. You create a class that implements `IAspect` and its `visit` method, then attach it with `Aspects.of(scope).add(...)`. For example, `Aspects.of(app).add(new TagAspect('Environment', 'Prod'))`, where `TagAspect` is a class you write yourself. For plain tagging, the built-in `Tags.of(app).add('Environment', 'Prod')` does the same job and is itself implemented with aspects. Aspects are invoked during synthesis.
No. CDK is designed to generate CloudFormation templates and deploy them via CloudFormation. It does not have its own provisioning engine. All resource management is done through CloudFormation. If you need to manage resources outside of CloudFormation, you would need to use custom resources or other tools.
You can pass a reference to the resource from one stack to another. CDK automatically creates CloudFormation exports and imports. For example, if StackA creates a VPC and StackB needs its ID, you can define the VPC in StackA and pass it as a property to StackB. CDK will generate an export from StackA and an import in StackB. Alternatively, you can use `stack.exportValue(vpc.vpcId)`.
Changing the construct ID changes the logical ID of the resource in the CloudFormation template. CloudFormation treats this as a new resource and will delete the old one and create a new one, which could cause downtime or data loss. To avoid this, keep the old logical ID by calling `override_logical_id('OldLogicalId')` on the underlying L1 resource (for an L2 construct, available as `node.default_child`).