DVA-C02 · Chapter 13 of 101 · Objective 3.3

CodeCommit and CodeBuild

This chapter covers AWS CodeCommit and AWS CodeBuild, two core services in the AWS CI/CD pipeline. For the DVA-C02 exam, these topics appear in Domain 3 (Deployment) under Objective 3.3: 'Configure and troubleshoot CI/CD pipelines.' The Deployment domain accounts for 24% of scored content, and these two services show up regularly in scenarios involving source control integration, build specifications, and artifact management. Mastering CodeCommit and CodeBuild is essential for understanding how to automate code deployment and maintain secure, scalable repositories.

25 min read
Intermediate
Updated May 31, 2026

CodeCommit & CodeBuild: Git + Assembly Line

Imagine a software company with a central library (CodeCommit) that stores all blueprints (source code). Every engineer must check out a blueprint, make changes, and check it back in. The library enforces strict rules: only authorized engineers can access, every change is logged with a revision number, and you can revert to any previous version. Now, when a blueprint is checked in, a manager (CodeBuild) immediately takes that revision and runs it through an automated assembly line. The line has stations: compile the design, run unit tests, package the product, and produce a final artifact. The manager spins up a clean, temporary factory floor (build environment) for each run, uses tools specified in a Buildfile (buildspec.yml), and discards the floor afterward. If any station fails, the manager stops the line, marks the build as failed, and notifies the team. This ensures every blueprint that passes the line is verified and ready for shipment, without relying on any engineer's personal workstation.

How It Actually Works

What is AWS CodeCommit?

AWS CodeCommit is a fully managed source control service that hosts secure Git-based repositories. It eliminates the need to operate your own Git server, providing high availability, durability, and scalability. CodeCommit supports standard Git commands (clone, push, pull, commit, branch, merge) and integrates with AWS Identity and Access Management (IAM) for access control. It automatically encrypts repositories at rest using AWS KMS and in transit using HTTPS or SSH.

Why CodeCommit over alternatives?

The exam focuses on scenarios where you need a private Git repository within AWS that integrates natively with other AWS CI/CD services (CodeBuild, CodePipeline, CodeDeploy). Unlike GitHub or Bitbucket, CodeCommit does not require managing external credentials or plugins: IAM roles and policies handle authentication and authorization. Key differentiators include:

- IAM integration: Use IAM users, roles, or federated identities to grant repository access.
- Repository size: Repositories can be up to 10 GB each (soft limit, can be increased).
- Automatic scaling: No need to provision servers.
- Event notifications: Trigger actions via Amazon SNS or AWS Lambda on repository events (push, pull request, comment).
- Pull request approval rule templates: Enforce approval rules before merging.

How CodeCommit Works Internally

CodeCommit stores Git objects in Amazon S3 buckets (not directly accessible to users) and uses a metadata database to track references. When you clone a repository, CodeCommit authenticates your IAM credentials (via HTTPS with Git credentials or SSH with uploaded public key) and streams the objects from S3. Each push updates the references and triggers any configured event notifications. The service replicates data across multiple Availability Zones for durability.

Key CodeCommit Configuration and Commands

- Create a repository:

aws codecommit create-repository --repository-name MyRepo --repository-description "My first repo"

- Clone using HTTPS:

git clone https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyRepo

- Set up IAM policy for CodeCommit:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "codecommit:GitPull",
                "codecommit:GitPush"
            ],
            "Resource": "arn:aws:codecommit:us-east-1:123456789012:MyRepo"
        }
    ]
}

Git credentials: IAM users can generate a static HTTPS username and password under the "HTTPS Git credentials for AWS CodeCommit" section in the IAM console.
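The clone URL and the policy's Resource ARN above follow fixed patterns, so they can be derived from the Region, account ID, and repository name. The following is a minimal sketch of that derivation; the function name codecommit_endpoints is hypothetical, and the values are the placeholder ones already used in this chapter.

```python
def codecommit_endpoints(region: str, repo: str, account_id: str) -> dict:
    """Return the HTTPS clone URL, SSH clone URL, and ARN for a repository."""
    host = f"git-codecommit.{region}.amazonaws.com"
    return {
        "https": f"https://{host}/v1/repos/{repo}",
        "ssh": f"ssh://{host}/v1/repos/{repo}",
        # This ARN is the value that goes in the "Resource" field of an IAM policy.
        "arn": f"arn:aws:codecommit:{region}:{account_id}:{repo}",
    }


if __name__ == "__main__":
    endpoints = codecommit_endpoints("us-east-1", "MyRepo", "123456789012")
    for kind, value in endpoints.items():
        print(f"{kind}: {value}")
```

Deriving the ARN this way helps avoid a common policy mistake: a typo in the Region or account ID silently produces a policy that matches no repository.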

What is AWS CodeBuild?

AWS CodeBuild is a fully managed build service that compiles source code, runs tests, and produces deployable artifacts. It scales automatically and only charges for the compute time used. CodeBuild integrates with CodeCommit, S3, GitHub, Bitbucket, and CodePipeline. The exam tests your ability to configure build projects, interpret buildspec.yml files, and troubleshoot build failures.

How CodeBuild Works Internally

When a build is triggered (manually, via webhook, or as part of a pipeline), CodeBuild:

1. Provision: Launches a new build environment (container) based on the chosen build image (e.g., aws/codebuild/standard:5.0 for Ubuntu with Java, Node.js, Python, etc.).
2. Download source: Fetches the source code from the configured repository (e.g., CodeCommit, S3).
3. Execute buildspec: Runs the commands defined in buildspec.yml (or in a buildspec entered inline in the project).
4. Upload artifacts: Stores the build output (e.g., JAR, ZIP) to an S3 bucket (if configured).
5. Clean up: Terminates the environment. Build logs are streamed to Amazon CloudWatch Logs while the build runs.

Buildspec.yml Structure

The buildspec.yml file defines the build phases. By default, CodeBuild looks for it in the root of the source code; a different name or path can be set in the project configuration. Key sections:

- version: Currently 0.2.
- env: Environment variables (plaintext, Parameter Store, or Secrets Manager).
- phases:
  - install: Install dependencies (e.g., npm install).
  - pre_build: Commands before build (e.g., sign in to ECR).
  - build: The actual build command (e.g., mvn package).
  - post_build: Commands after build (e.g., push to ECR).
- artifacts: Files to upload to S3 (e.g., target/*.jar).
- cache: Files to cache to speed up future builds (e.g., ~/.m2/repository).

Example buildspec.yml:

version: 0.2
phases:
  install:
    runtime-versions:
      java: corretto11
    commands:
      - echo Installing...
  pre_build:
    commands:
      - echo Logging in to Amazon ECR...
      - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
  build:
    commands:
      - echo Build started on `date`
      - mvn package
      # Build and tag the image in the build phase; post_build only pushes it
      - docker build -t my-app .
      - docker tag my-app:latest $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/my-app:latest
  post_build:
    commands:
      - echo Build completed on `date`
      # AWS_ACCOUNT_ID is a custom variable set on the project, not one CodeBuild provides
      - docker push $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/my-app:latest
artifacts:
  files:
    - target/*.jar
  discard-paths: yes
cache:
  paths:
    - '/root/.m2/**/*'
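Structural mistakes in a buildspec (a misspelled phase, an empty commands list) only surface once a build runs, so a quick local check can save a round trip. The sketch below assumes the YAML has already been parsed into a dictionary (for example with a third-party YAML parser, which is not shown). The function name lint_buildspec and its rules are this sketch's own simplifications, not CodeBuild's actual validation.

```python
KNOWN_PHASES = ("install", "pre_build", "build", "post_build")


def lint_buildspec(spec: dict) -> list:
    """Return a list of human-readable problems found in a parsed buildspec."""
    problems = []
    # YAML parses `version: 0.2` as a float, so compare on the string form.
    if str(spec.get("version")) != "0.2":
        problems.append(f"version should be 0.2, found {spec.get('version')!r}")

    phases = spec.get("phases")
    if not isinstance(phases, dict) or not phases:
        problems.append("phases section is missing or empty")
        phases = {}

    for name, body in phases.items():
        body = body or {}
        if name not in KNOWN_PHASES:
            problems.append(f"unknown phase: {name}")
            continue
        # An install phase may declare only runtime-versions and no commands.
        has_runtimes = name == "install" and bool(body.get("runtime-versions"))
        if not body.get("commands") and not has_runtimes:
            problems.append(f"phase {name} defines no commands")

    artifacts = spec.get("artifacts")
    if artifacts is not None and not (artifacts or {}).get("files"):
        problems.append("artifacts section lists no files")
    return problems


if __name__ == "__main__":
    sample = {"version": 0.2, "phases": {"compile": {"commands": ["make"]}}}
    for problem in lint_buildspec(sample):
        print(problem)
```

A check like this fits naturally into a pre-commit hook, which keeps obviously broken buildspecs out of the repository.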

CodeBuild Environment Types

Managed images: Provided by AWS (e.g., Ubuntu, Windows Server, with runtimes like Java, Python, Node.js, Go, .NET Core).

Custom images: You can provide a Docker image stored in Amazon ECR or in another registry such as Docker Hub.

Compute types: build.general1.small, build.general1.medium, build.general1.large (varying CPU and memory).

Environment variables: Can be defined in the console or buildspec as plaintext, or referenced from AWS Systems Manager Parameter Store or AWS Secrets Manager (for sensitive values).
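In the CodeBuild API, each project environment variable is an object with a name, a value, and a type. The sketch below builds such entries. The helper name env_var and the parameter name /myapp/db-password are hypothetical, and the three type strings are the ones the CodeBuild API accepts to the best of this guide's knowledge.

```python
VALID_TYPES = {"PLAINTEXT", "PARAMETER_STORE", "SECRETS_MANAGER"}


def env_var(name: str, value: str, var_type: str = "PLAINTEXT") -> dict:
    """Build one entry for a project's environmentVariables list.

    For PARAMETER_STORE and SECRETS_MANAGER, `value` is the name of the
    parameter or secret, not the sensitive value itself.
    """
    if var_type not in VALID_TYPES:
        raise ValueError(f"unsupported variable type: {var_type}")
    return {"name": name, "value": value, "type": var_type}


if __name__ == "__main__":
    variables = [
        env_var("AWS_ACCOUNT_ID", "123456789012"),
        # Hypothetical parameter name, for illustration only.
        env_var("DB_PASSWORD", "/myapp/db-password", "PARAMETER_STORE"),
    ]
    for variable in variables:
        print(variable)
```

Keeping secrets as references means the sensitive value never appears in the project definition or in plaintext in the console.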

CodeBuild Integration with Other Services

CodePipeline: CodeBuild can be used as a build action in a pipeline.

Amazon S3: Source from S3 or store artifacts.

Amazon ECR: Push Docker images.

AWS KMS: Encrypt artifacts.

Amazon CloudWatch: Logs and metrics.

Amazon SNS: Notifications on build state changes.

AWS Lambda: Invoke custom actions.

Build Badges and Reports

CodeBuild can generate build badges (SVG) that show the status of a project's latest build at a publicly accessible URL. It also supports test reports (e.g., JUnit XML) that can be viewed in the CodeBuild console.

Security and Networking

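By default, CodeBuild runs outside your VPC in an AWS-managed network and cannot reach private resources. To access resources in your VPC (e.g., an RDS database or an internal service), you must configure a VPC connection in the build project. This assigns an elastic network interface to the build container. Once a VPC is configured, the build needs a NAT gateway or VPC endpoints to reach the internet and AWS service endpoints. CodeCommit itself is reached through its service endpoint and does not require VPC configuration.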
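A project's VPC connection is expressed as a small structure naming the VPC, its subnets, and its security groups. The sketch below builds that structure with basic checks. The helper name vpc_config and all resource IDs are hypothetical placeholders; the key names follow the CodeBuild API's vpcConfig shape.

```python
def vpc_config(vpc_id: str, subnet_ids: list, security_group_ids: list) -> dict:
    """Build the vpcConfig structure for a CodeBuild project."""
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    if not security_group_ids:
        raise ValueError("at least one security group is required")
    return {
        "vpcId": vpc_id,
        # Private subnets with a route to a NAT gateway are the usual choice,
        # because the build still needs to reach AWS service endpoints.
        "subnets": list(subnet_ids),
        "securityGroupIds": list(security_group_ids),
    }


if __name__ == "__main__":
    # Placeholder IDs, for illustration only.
    print(vpc_config("vpc-0abc", ["subnet-0aaa", "subnet-0bbb"], ["sg-0ccc"]))
```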

Pricing Model

CodeCommit: Pay per active user per month (first 5 users free). CodeBuild: Pay per minute of build time (free tier: 100 build minutes per month).

Exam-Relevant Details

CodeCommit repository size limit: 10 GB (default).

CodeBuild timeout: Default 1 hour, max 8 hours (can be set in project configuration).

By default, the buildspec file must be named 'buildspec.yml' (case-sensitive) and placed in the root of the source; a different name or path must be set explicitly in the project configuration.

Artifacts are uploaded to S3 after successful build only.

CodeBuild can cache dependencies to reduce build time (e.g., Maven local repository).

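Custom Docker images can come from Amazon ECR or another registry such as Docker Hub; private registries other than ECR need credentials stored in AWS Secrets Manager.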

CodeCommit supports up to 1,000 repositories per AWS account (soft limit).

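Git push events in CodeCommit can start a CodeBuild build through an Amazon EventBridge rule or CodePipeline; CodeBuild webhooks cover sources such as GitHub and Bitbucket, not CodeCommit.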

Walk-Through

1. Create CodeCommit Repository

Begin by creating a repository in AWS CodeCommit via AWS CLI or Console. Use the AWS CLI command: aws codecommit create-repository --repository-name MyAppRepo --repository-description "Source code for MyApp". This creates a Git repository with a unique ARN. The repository is private by default. IAM policies must grant access to the repository for developers. The repository appears in the CodeCommit console with an HTTPS and SSH clone URL.

2. Clone Repository and Push Code

Developers clone the repository using Git with HTTPS (using Git credentials) or SSH (using uploaded SSH keys). They add source code, commit, and push. CodeCommit validates the IAM permissions for each operation. The push updates the remote references. CodeCommit has no branch protection setting as such; to keep direct pushes off certain branches (like main), attach an IAM policy that denies codecommit:GitPush for those references, and use approval rules so that changes arrive through approved pull requests.
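Restricting direct pushes to a branch in CodeCommit is done with an IAM Deny statement conditioned on the Git reference. The sketch below generates such a policy, modeled on the pattern in the CodeCommit documentation. The function name is hypothetical, and the action list is an assumption you should tailor: it deliberately leaves the pull request merge actions allowed.

```python
import json


def deny_direct_push_policy(repo_arn: str, branches=("main",)) -> dict:
    """Build an IAM policy that blocks direct changes to the given branches."""
    refs = [f"refs/heads/{branch}" for branch in branches]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Deny",
                "Action": [
                    "codecommit:GitPush",
                    "codecommit:DeleteBranch",
                    "codecommit:PutFile",
                    "codecommit:MergeBranchesByFastForward",
                    "codecommit:MergeBranchesBySquash",
                    "codecommit:MergeBranchesByThreeWay",
                ],
                "Resource": repo_arn,
                "Condition": {
                    # Apply the Deny only when the request names a protected ref.
                    "StringEqualsIfExists": {"codecommit:References": refs},
                    "Null": {"codecommit:References": "false"},
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = deny_direct_push_policy(
        "arn:aws:codecommit:us-east-1:123456789012:MyAppRepo"
    )
    print(json.dumps(policy, indent=4))
```

Because an explicit Deny overrides any Allow, this policy can be attached alongside a broad developer policy without rewriting it.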

3. Create CodeBuild Project

Create a CodeBuild project in the AWS Console or via CLI. Specify the source provider (CodeCommit), repository, and branch. Choose a managed image (e.g., aws/codebuild/standard:5.0) and compute type (e.g., build.general1.medium). Optionally, provide a VPC configuration to access resources inside your VPC. Set environment variables, such as AWS_ACCOUNT_ID, and choose a service role with permissions to access CodeCommit, S3, and CloudWatch Logs.
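The same project settings can be captured as a request structure for the CodeBuild CreateProject API. The sketch below only builds that structure; it makes no AWS call. The function name, bucket name, and role ARN are hypothetical placeholders, and the key names follow the CreateProject request shape as this guide understands it (with boto3 the dictionary would be passed as keyword arguments to the client's create_project method).

```python
def codebuild_project_request(
    name: str,
    repo_clone_url: str,
    artifact_bucket: str,
    service_role_arn: str,
    image: str = "aws/codebuild/standard:5.0",
    compute_type: str = "BUILD_GENERAL1_MEDIUM",
    timeout_minutes: int = 60,
    env: dict = None,
    branch: str = "main",
) -> dict:
    """Assemble a CreateProject request for a CodeCommit-sourced build."""
    variables = [
        {"name": key, "value": value, "type": "PLAINTEXT"}
        for key, value in (env or {}).items()
    ]
    return {
        "name": name,
        "source": {"type": "CODECOMMIT", "location": repo_clone_url},
        "sourceVersion": f"refs/heads/{branch}",
        "artifacts": {"type": "S3", "location": artifact_bucket},
        "environment": {
            "type": "LINUX_CONTAINER",
            "image": image,
            "computeType": compute_type,
            "environmentVariables": variables,
        },
        "serviceRole": service_role_arn,
        "timeoutInMinutes": timeout_minutes,
    }


if __name__ == "__main__":
    request = codebuild_project_request(
        name="MyProject",
        repo_clone_url="https://git-codecommit.us-east-1.amazonaws.com/v1/repos/MyAppRepo",
        artifact_bucket="my-artifact-bucket",  # placeholder bucket name
        service_role_arn="arn:aws:iam::123456789012:role/MyCodeBuildRole",  # placeholder
        env={"AWS_ACCOUNT_ID": "123456789012"},
    )
    print(request["environment"])
```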

4. Define Buildspec.yml

In the root of the source code, create a buildspec.yml file. This file defines build phases: install (dependencies), pre_build (setup), build (compile), and post_build (packaging, pushing). Specify artifacts (files to output) and cache paths (e.g., Maven repository). The buildspec must be valid YAML. CodeBuild reads this file at build time. If no buildspec is found and none is defined inline in the project, the build fails.

5. Trigger Build and Monitor

Trigger the build manually from the console, via CLI (aws codebuild start-build --project-name MyProject), or automatically on a push to CodeCommit (through an Amazon EventBridge rule or a CodePipeline source stage). CodeBuild provisions a container, downloads source, executes build phases in order, uploads artifacts to S3, and streams logs to CloudWatch. Monitor build status in the console. If any phase fails, the build stops and reports failure. Successful builds produce artifacts in the specified S3 bucket.
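When monitoring from a script rather than the console, the first question after a failure is which phase failed and why. The sketch below pulls that answer out of a build record shaped like one entry in the BatchGetBuilds response (buildStatus plus a list of phases). The function name is hypothetical and the sample record is illustrative data, not real output.

```python
FAILURE_STATES = {"FAILED", "FAULT", "TIMED_OUT", "STOPPED"}


def summarize_build(build: dict) -> str:
    """Return the build status, plus the first failing phase if there is one."""
    status = build.get("buildStatus", "UNKNOWN")
    for phase in build.get("phases", []):
        if phase.get("phaseStatus") in FAILURE_STATES:
            # Each phase may carry context entries with a status code and message.
            context = (phase.get("contexts") or [{}])[0]
            code = context.get("statusCode", "")
            message = context.get("message", "")
            return f"{status}: {phase.get('phaseType')} ({code}) {message}".strip()
    return status


if __name__ == "__main__":
    # Illustrative record for a build whose buildspec could not be found.
    sample = {
        "buildStatus": "FAILED",
        "phases": [
            {"phaseType": "PROVISIONING", "phaseStatus": "SUCCEEDED"},
            {
                "phaseType": "DOWNLOAD_SOURCE",
                "phaseStatus": "FAILED",
                "contexts": [
                    {"statusCode": "YAML_FILE_ERROR", "message": "YAML file does not exist"}
                ],
            },
        ],
    }
    print(summarize_build(sample))
```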

What This Looks Like on the Job

Enterprise Scenario 1: Microservices CI/CD Pipeline

A large e-commerce company uses CodeCommit to host source code for 50 microservices, each in its own repository. They use CodeBuild to compile and test each service upon every push. Buildspec.yml files for Java services include Maven commands, while Node.js services use npm. Artifacts (JARs and zipped bundles) are stored in an S3 bucket. A central CodePipeline orchestrates the build, test, and deploy phases. The team blocks direct pushes to the main branch with an IAM policy and requires at least two approvals via CodeCommit's pull request approval rules. Common issues: developers forgetting to include buildspec.yml, or specifying incorrect artifact paths, which causes deployment failures. The team mitigates this by using pre-commit hooks to validate buildspec syntax.

Enterprise Scenario 2: Docker Image Build and Push

A SaaS startup uses CodeCommit to store Dockerfiles and application code. CodeBuild builds Docker images and pushes them to Amazon ECR. The buildspec.yml includes commands to authenticate to ECR, build the image, tag it with the Git commit hash, and push it. The team uses custom Docker images stored in ECR as build environments to ensure consistency. They also use the cache feature to speed up builds: the Maven local repository (~/.m2) is cached across builds. Performance considerations: large images can cause builds to time out (default 1 hour); they increase the timeout to 2 hours. Misconfiguration: if the service role lacks ecr:GetAuthorizationToken or the push-related permissions (such as ecr:PutImage and the layer upload actions), the build fails at the login or push step.
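Tagging an image with the commit hash amounts to string assembly around the commit ID that CodeBuild exposes in the CODEBUILD_RESOLVED_SOURCE_VERSION environment variable. The sketch below shows one way to derive the full ECR image URI. The function name, the seven-character tag length, and the fallback to "latest" are assumptions of this example, not something the scenario specifies.

```python
import os


def ecr_image_uri(account_id: str, region: str, repository: str,
                  commit_id: str, length: int = 7) -> str:
    """Build an ECR image URI tagged with a shortened commit ID."""
    # Fall back to "latest" when no commit ID is available (e.g., local runs).
    tag = commit_id[:length] or "latest"
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


if __name__ == "__main__":
    commit = os.environ.get("CODEBUILD_RESOLVED_SOURCE_VERSION", "")
    print(ecr_image_uri("123456789012", "us-east-1", "my-app", commit))
```

Commit-based tags make every pushed image traceable to the exact source revision, which a mutable "latest" tag cannot do.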

Enterprise Scenario 3: Cross-Account Builds

A consulting firm manages multiple AWS accounts (dev, test, prod). They use CodeCommit in a central tools account and CodeBuild in each environment account. Builds in the environment accounts reach the central repository through cross-account IAM roles, not through any VPC setting. Artifacts are stored in an S3 bucket in the same account. The firm uses AWS KMS customer-managed keys to encrypt artifacts and source. They learned the hard way that the KMS key policy must grant the CodeBuild service role decrypt permissions. Common mistake: forgetting to attach the IAM policy for S3 bucket access to the CodeBuild service role, leading to 'Access Denied' errors when uploading artifacts.

How DVA-C02 Actually Tests This

DVA-C02 Exam Focus on CodeCommit and CodeBuild

The exam tests Objective 3.3: 'Configure and troubleshoot CI/CD pipelines.' Specific sub-objectives include:

Setting up a CodeCommit repository and managing access with IAM.

Configuring CodeBuild projects, including environment variables, compute types, and VPC settings.

Writing buildspec.yml files with correct phases and artifact definitions.

Integrating CodeBuild with CodePipeline and other AWS services.

Troubleshooting build failures (e.g., missing buildspec, permission errors, timeouts).

Common Wrong Answers and Why Candidates Choose Them

1. Wrong answer: 'CodeCommit automatically triggers a build in CodeBuild when code is pushed.' Why chosen: Many assume CodeCommit and CodeBuild are natively connected. In reality, you must configure an Amazon EventBridge rule that starts the build on repository events, or use CodePipeline. CodeBuild webhooks apply to sources such as GitHub and Bitbucket, and CodeCommit itself does not trigger builds.

2. Wrong answer: 'CodeBuild can only use managed images provided by AWS.' Why chosen: The default option uses managed images, but custom Docker images from ECR are fully supported. Candidates overlook the 'environment image' setting.

3. Wrong answer: 'Artifacts are stored in CodeCommit after a successful build.' Why chosen: Candidates confuse source storage with artifact storage. Artifacts go to S3, not back to CodeCommit.

4. Wrong answer: 'CodeBuild will pick up a buildspec file regardless of its name or location.' Why chosen: Candidates know the name can be customized and forget that a non-default name or path must be set explicitly in the project configuration. With no override, CodeBuild looks only for a file named 'buildspec.yml' in the root of the source.

Specific Numbers and Terms to Memorize

CodeCommit default repository size: 10 GB.

CodeBuild default build timeout: 1 hour, maximum: 8 hours.

Buildspec version: 0.2.

Artifact upload location: S3 bucket (specified in project).

Cache paths: e.g., '/root/.m2/**/*' for Maven.

Compute types: build.general1.small, .medium, .large.

Managed image identifiers: aws/codebuild/standard:5.0, aws/codebuild/amazonlinux2-x86_64-standard:4.0.

Edge Cases and Exceptions

No buildspec found: CodeBuild fails immediately. Ensure the file is in the root and named correctly.

VPC required: If the build needs to access resources in your VPC (e.g., an RDS database), you must configure VPC settings in the build project. Otherwise, CodeBuild runs outside your VPC and cannot reach those private resources. CodeCommit itself is reached through its service endpoint and needs no VPC configuration.

KMS encryption: If you use a customer-managed KMS key for S3 artifacts, ensure the CodeBuild service role has kms:GenerateDataKey and kms:Decrypt permissions.

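Build badge: The badge image is served from a publicly accessible URL, so anyone who has the URL can see the latest build status; enable it only where that is acceptable.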

Multiple buildspecs: You can have different buildspec files for different branches by specifying the buildspec name in the project configuration.
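Several of these edge cases come down to what the CodeBuild service role is allowed to do. The sketch below assembles a service role policy covering logs, source, artifacts, and the optional KMS key. The function name and bucket name are hypothetical, and the wildcard resource on the logs statement is a simplification that a real policy would narrow.

```python
def service_role_policy(repo_arn: str, artifact_bucket: str,
                        kms_key_arn: str = None) -> dict:
    """Build an IAM policy for a CodeBuild service role."""
    statements = [
        {
            "Effect": "Allow",
            "Action": [
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents",
            ],
            "Resource": "*",  # simplification: scope to the project's log group
        },
        {"Effect": "Allow", "Action": ["codecommit:GitPull"], "Resource": repo_arn},
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject", "s3:GetObject"],
            "Resource": f"arn:aws:s3:::{artifact_bucket}/*",
        },
    ]
    if kms_key_arn:
        # Needed only when artifacts use a customer-managed KMS key.
        statements.append(
            {
                "Effect": "Allow",
                "Action": ["kms:GenerateDataKey", "kms:Decrypt"],
                "Resource": kms_key_arn,
            }
        )
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    policy = service_role_policy(
        "arn:aws:codecommit:us-east-1:123456789012:MyAppRepo",
        "my-artifact-bucket",  # placeholder bucket name
    )
    for statement in policy["Statement"]:
        print(statement["Action"])
```

With a customer-managed key, the key policy must also permit the role; the IAM statement alone is not sufficient.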

How to Eliminate Wrong Answers

Focus on the underlying mechanism: CodeBuild is a stateless build server that spins up a container, runs commands, and produces artifacts. It does not store source or artifacts permanently. CodeCommit is a Git repository that stores code. If a question mentions 'automatic build on push,' look for the trigger mechanism (an EventBridge rule or CodePipeline for CodeCommit; a webhook for GitHub or Bitbucket). For permissions, always check the IAM role/service role. If the question involves custom images, think of ECR first, though other registries such as Docker Hub also work. If artifacts are mentioned, S3 is the destination.

Key Takeaways

CodeCommit is a fully managed Git repository service integrated with IAM; no need to manage servers.

CodeBuild is a fully managed build service that runs commands defined in a buildspec.yml file.

By default, the buildspec file must be named exactly 'buildspec.yml' and placed in the root of the source code; any other name or path must be set in the project configuration.

CodeBuild supports custom Docker images from Amazon ECR or another registry such as Docker Hub.

Default build timeout is 1 hour; maximum is 8 hours.

Artifacts from a successful build are uploaded to an S3 bucket specified in the project.

To trigger a CodeBuild build automatically on a push to CodeCommit, use an Amazon EventBridge rule that targets the build project, or use CodePipeline.

CodeCommit does not start builds on its own, and CodeBuild webhooks do not cover CodeCommit sources; an EventBridge rule or a pipeline is required.

CodeBuild can be configured to run inside your VPC to access private resources.

Cache paths in buildspec.yml can speed up builds by reusing downloaded dependencies.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

CodeCommit

Fully managed Git service within AWS, no external dependencies.

IAM-based authentication and authorization.

Priced per active user per month (first 5 active users free).

Supports pull request approval rule templates.

Native integration with CodeBuild and CodePipeline.

GitHub (with AWS Integration)

Third-party service with its own authentication (OAuth, personal tokens).

Requires a webhook or an AWS CodeConnections connection (formerly AWS CodeStar Connections) to integrate with AWS services.

Free for public repositories; private repos require paid plan.

Rich code review features and community integrations.

More widely used; larger ecosystem of third-party tools.

Watch Out for These

Mistake

CodeCommit is a drop-in substitute for GitHub with the same pull request features.

Correct

CodeCommit does support pull requests natively, including inline comments and approval rules in its console. It is not a drop-in replacement for GitHub, though: its review tooling and third-party ecosystem are narrower.

Mistake

CodeBuild can build any source code without a buildspec file.

Correct

CodeBuild needs build commands from a buildspec. By default it reads buildspec.yml from the root of the source code; otherwise the project must point to another file or define the buildspec inline. With none of these, the build fails.

Mistake

CodeBuild automatically caches dependencies between builds.

Correct

Caching is optional and must be explicitly configured: enable a cache type (Amazon S3 or local) on the project and list the paths to cache under the 'cache' section of buildspec.yml. By default, no caching occurs.

Mistake

CodeCommit repositories are automatically backed up to S3.

Correct

CodeCommit stores data in S3 internally, but users cannot access the underlying S3 buckets. Backups must be performed manually using git clone or AWS CLI commands.

Mistake

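Custom CodeBuild environment images must be stored in Amazon ECR.

Correct

CodeBuild accepts custom images from Amazon ECR or from another registry such as Docker Hub. Images in a private registry other than ECR require credentials stored in AWS Secrets Manager. ECR remains the common choice inside AWS because access is controlled with IAM.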


Frequently Asked Questions

How do I trigger a CodeBuild build automatically when code is pushed to CodeCommit?

CodeBuild webhooks apply to sources such as GitHub and Bitbucket, not CodeCommit. For a CodeCommit repository, create an Amazon EventBridge rule that matches the repository's 'CodeCommit Repository State Change' events (for example, an update to the main branch) and set the CodeBuild project as the rule's target. The role used by the rule needs codebuild:StartBuild permission on the project. Alternatively, use CodePipeline with CodeCommit as the source and CodeBuild as the build action.
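One way to start a build on every push, assuming an Amazon EventBridge rule is used, is to match CodeCommit's repository state change events. The sketch below builds such an event pattern and includes a deliberately simplified matcher to show which events it would select. Both function names are hypothetical, and the matcher covers only exact-value matching, a small subset of real EventBridge pattern semantics.

```python
def push_event_pattern(repo_arn: str, branch: str = "main") -> dict:
    """Build an EventBridge event pattern for pushes to one branch."""
    return {
        "source": ["aws.codecommit"],
        "detail-type": ["CodeCommit Repository State Change"],
        "resources": [repo_arn],
        "detail": {
            "event": ["referenceCreated", "referenceUpdated"],
            "referenceType": ["branch"],
            "referenceName": [branch],
        },
    }


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must match a value in the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        else:
            values = actual if isinstance(actual, list) else [actual]
            if not any(value in expected for value in values):
                return False
    return True


if __name__ == "__main__":
    arn = "arn:aws:codecommit:us-east-1:123456789012:MyAppRepo"
    event = {
        "source": "aws.codecommit",
        "detail-type": "CodeCommit Repository State Change",
        "resources": [arn],
        "detail": {
            "event": "referenceUpdated",
            "referenceType": "branch",
            "referenceName": "main",
        },
    }
    print(matches(push_event_pattern(arn), event))
```

Filtering on referenceName keeps pushes to feature branches from starting builds intended only for main.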

What is the maximum size of a CodeCommit repository?

The default maximum size for a CodeCommit repository is 10 GB. If you need a larger repository, you can request a limit increase via AWS Support. However, best practices suggest keeping repositories under 1 GB for optimal performance. CodeCommit uses S3 for storage, so there is no hard limit, but performance may degrade with very large repositories.

Can I use CodeBuild to build a Docker image and push it to ECR?

Yes. In the buildspec.yml, you include commands to authenticate to ECR (aws ecr get-login-password), build the Docker image, tag it, and push it. The CodeBuild service role must have permissions for ecr:GetAuthorizationToken, ecr:BatchCheckLayerAvailability, ecr:InitiateLayerUpload, ecr:UploadLayerPart, ecr:CompleteLayerUpload, and ecr:PutImage. The project must also have privileged mode enabled so that Docker commands can run inside the build container. Build the Docker image in the build phase and push it in post_build.

How do I access a private CodeCommit repository from CodeBuild?

If the CodeBuild project uses CodeCommit as the source, it can access the repository using the service role attached to the project. The role must have codecommit:GitPull permissions. No additional configuration is needed. However, if the repository is in a different AWS account, access is granted through cross-account IAM roles (commonly with CodePipeline assuming a role in the repository's account); a VPC configuration is not what provides that access.

What happens if the buildspec.yml file is missing from the source code?

CodeBuild fails early, in the DOWNLOAD_SOURCE phase, with an error such as YAML_FILE_ERROR and the message 'YAML file does not exist'. To resolve this, ensure the file is named 'buildspec.yml' (case-sensitive) and placed in the root directory of the source. Alternatively, you can specify a different path or inline build commands in the CodeBuild project configuration.

Can I use environment variables from AWS Systems Manager Parameter Store in CodeBuild?

Yes, you can reference parameters from Parameter Store in the build project's environment variables. In the console, choose 'Parameter' as the value type and enter the parameter name. The CodeBuild service role must have ssm:GetParameters permission. For secure strings, the role also needs kms:Decrypt if the parameter is encrypted with a KMS key.

What is the difference between a CodeBuild build and a CodePipeline?

CodeBuild is a single build service that compiles code and runs tests. CodePipeline is a CI/CD orchestration service that automates the entire release process, including source, build, test, and deploy stages. CodePipeline can use CodeBuild as a build action, but it also integrates with other services like CodeDeploy, Elastic Beanstalk, and Lambda. Think of CodeBuild as a single step, and CodePipeline as the entire workflow.
