GCDL, Chapter 13 of 101, Objective 4.3

DevOps on Google Cloud

This chapter covers DevOps practices on Google Cloud, focusing on how Cloud Build, Container Registry, and Cloud Operations enable continuous integration, delivery, and monitoring. For the GCDL exam, this topic appears in roughly 10-15% of questions, particularly around the purpose and benefits of each service and how they integrate. Understanding these concepts is critical for digital leaders who need to orchestrate development and operations teams to deliver software faster and more reliably.

Updated May 31, 2026

DevOps Assembly Line for Car Manufacturing

Imagine a car factory that builds custom vehicles. Traditionally, each department (design, parts, assembly, testing) works in a silo, handing off blueprints and parts with long delays. When a defect is found in final testing, it takes weeks to trace back and fix.

Now consider a DevOps factory: every change to a car design is immediately built into a digital twin and tested in a virtual crash simulator. If it passes, the new part design is automatically ordered and the assembly line robot is reprogrammed, all within minutes. The factory has a single shared pipeline (like a CI/CD pipeline) that connects all departments. Developers (designers) commit code (blueprints) to a version control system (Git). An automated build system (Cloud Build) compiles the design into a digital model, runs unit tests (simulations), packages the components (container images), and deploys them to a staging environment (test track). Operations (assembly line workers) monitor the real-time performance of cars on the road (production) using dashboards and logs (Cloud Monitoring, Cloud Logging). When an issue arises, they can roll back to a previous known-good version (deployment rollback) or push a hotfix through the same pipeline.

This tight feedback loop means the factory can release new features weekly instead of yearly, with higher quality and less waste. The key enabler is infrastructure as code (IaC): the factory floor layout and robot configurations are defined in declarative configuration files, version-controlled, and automatically provisioned (Terraform, Deployment Manager). The analogy mirrors how Google Cloud services like Cloud Build, Container Registry, GKE, and Cloud Operations work together to enable continuous integration, delivery, and monitoring.

How It Actually Works

What is DevOps and Why It Exists

DevOps is a set of practices that combines software development (Dev) and IT operations (Ops). Its goal is to shorten the development lifecycle and provide continuous delivery with high software quality. On Google Cloud, DevOps is implemented through a suite of tools that automate the build, test, deployment, and monitoring of applications. The core philosophy is to break down silos between teams, automate repetitive tasks, and establish feedback loops.

How DevOps Works on Google Cloud: The CI/CD Pipeline

A continuous integration and continuous delivery (CI/CD) pipeline is the backbone of DevOps. In Google Cloud, this pipeline typically consists of:

Cloud Source Repositories: A Git repository for storing source code.

Cloud Build: A managed service that executes builds in containers. It can pull source from Cloud Source Repositories, GitHub, or Bitbucket. Cloud Build uses a cloudbuild.yaml configuration file that defines steps, each running in a separate container. Steps can include compiling code, running tests, and producing artifacts.

Container Registry (superseded by Artifact Registry): A private registry for storing Docker container images. Cloud Build can push images here after a successful build.

Google Kubernetes Engine (GKE) or Compute Engine: The deployment target. Cloud Build can deploy to GKE using kubectl commands.

Cloud Operations (formerly Stackdriver, now branded Google Cloud Observability): A suite for logging, monitoring, tracing, and error reporting. It provides observability into the running application.

Key Components, Values, and Defaults

Cloud Build
- Build triggers: Can be set to run on push to a branch, tag creation, or pull request.
- Timeout: Current Cloud Build documentation lists a default of 60 minutes (older material cites 10 minutes); the maximum is 24 hours.
- Machine types: Builds run on a small default machine unless a larger one is requested. E2_HIGHCPU_8 (8 vCPUs, 8 GB memory), E2_HIGHCPU_32, and N1_HIGHCPU_32 are optional larger types.
- Logs: Stored in Cloud Logging; can be viewed in the Console or with gcloud builds log.
- Build artifacts: Can be stored in a Cloud Storage bucket.

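Container Registry / Artifact Registry
- Image format: Docker V2 and OCI.
- Locations: Container Registry hostnames (gcr.io, us.gcr.io, eu.gcr.io, asia.gcr.io) are all multi-regional; Artifact Registry supports both regional and multi-regional repositories.
- Vulnerability scanning: Runs automatically on newly pushed images once the scanning API is enabled for the project; it is provided by Artifact Analysis (formerly Container Analysis).
- Retention: No automatic deletion by default; Artifact Registry cleanup policies must be configured to remove old images.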

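Cloud Operations
- Cloud Monitoring: Collects metrics from GCP services, custom metrics, and agent-based metrics. Retention depends on the metric type; many Google Cloud service metrics are kept for about 6 weeks, while custom and agent metrics are kept longer.
- Cloud Logging: Keeps logs in the _Default bucket for 30 days and in the _Required bucket (audit logs) for 400 days. Retention on the _Default bucket and on user-defined buckets is configurable, with extra charges beyond the default period.
- Cloud Trace: Samples a fraction of requests rather than all of them; the rate depends on the platform and client library and can be adjusted.
- Error Reporting: Groups similar errors automatically.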

Configuration and Verification Commands

Building with Cloud Build

# Submit a build using cloudbuild.yaml
gcloud builds submit --config cloudbuild.yaml .

# View build logs
gcloud builds log BUILD_ID

# List builds
gcloud builds list

Example cloudbuild.yaml:

steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-image', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/my-image']
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['set', 'image', 'deployment/my-deployment', 'my-container=gcr.io/$PROJECT_ID/my-image']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
  - 'CLOUDSDK_CONTAINER_CLUSTER=my-cluster'
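Cloud Build also accepts its build config in JSON, so the same pipeline can be generated from a script. The following Python sketch rebuilds the example above as a dictionary and serializes it. The helper name build_config and its parameters are illustrative assumptions, not part of any Google library.

```python
import json


def build_config(image: str, deployment: str, container: str,
                 zone: str, cluster: str) -> dict:
    """Return a build config that mirrors the YAML example (hypothetical helper)."""
    return {
        "steps": [
            # Step 1: build the container image from the current directory.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["build", "-t", image, "."]},
            # Step 2: push the image to the registry.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["push", image]},
            # Step 3: point the GKE deployment at the new image.
            {"name": "gcr.io/cloud-builders/kubectl",
             "args": ["set", "image", f"deployment/{deployment}",
                      f"{container}={image}"],
             "env": [f"CLOUDSDK_COMPUTE_ZONE={zone}",
                     f"CLOUDSDK_CONTAINER_CLUSTER={cluster}"]},
        ]
    }


config = build_config(
    image="gcr.io/$PROJECT_ID/my-image",
    deployment="my-deployment",
    container="my-container",
    zone="us-central1-a",
    cluster="my-cluster",
)
print(json.dumps(config, indent=2))
```

Saving the printed output as cloudbuild.json and passing that file to gcloud builds submit --config should behave like the YAML version, since both describe the same three steps.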

Working with Container Registry

# Tag and push an image
docker tag my-image gcr.io/my-project/my-image:latest
docker push gcr.io/my-project/my-image:latest

# Pull an image
docker pull gcr.io/my-project/my-image:latest

# List images in registry
gcloud container images list --repository=gcr.io/my-project

Monitoring with Cloud Operations

# Create an alerting policy from a file (alpha command)
gcloud alpha monitoring policies create --policy-from-file policy.yaml

# View logs
gcloud logging read "resource.type=gae_app AND severity>=ERROR" --limit 10

How It Interacts with Related Technologies

DevOps on Google Cloud integrates tightly with:

Cloud IAM: Controls who can trigger builds, push images, and view logs. Service accounts are used for pipeline automation.

Secret Manager: Stores sensitive data like API keys used in builds. Cloud Build can access secrets via availableSecrets in cloudbuild.yaml.

Cloud KMS: For encrypting artifacts and images.

Cloud Deploy: A managed continuous delivery service that can promote releases across targets (dev, staging, prod) with rollout strategies like canary and blue-green.

Cloud Run: A serverless container platform that can be a deployment target. Cloud Build can deploy directly to Cloud Run.
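To make the Secret Manager integration above more concrete, here is a minimal Python sketch of a build config that declares a secret under availableSecrets and exposes it to one step through secretEnv. The project name my-project, the secret name api-key, the variable API_KEY, and the helper undeclared_secrets are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical project, secret, and variable names, used only for illustration.
config = {
    "steps": [
        {
            "name": "gcr.io/cloud-builders/gcloud",
            "entrypoint": "bash",
            # "$$" keeps Cloud Build from treating API_KEY as a substitution,
            # so the shell expands it when the step runs.
            "args": ["-c", 'test -n "$$API_KEY" && echo "secret is available"'],
            "secretEnv": ["API_KEY"],
        }
    ],
    "availableSecrets": {
        "secretManager": [
            {
                "versionName": "projects/my-project/secrets/api-key/versions/latest",
                "env": "API_KEY",
            }
        ]
    },
}


def undeclared_secrets(build_config: dict) -> list:
    """Return secretEnv names used by steps but missing from availableSecrets."""
    declared = {
        entry["env"]
        for entry in build_config.get("availableSecrets", {}).get("secretManager", [])
    }
    used = {
        name
        for step in build_config.get("steps", [])
        for name in step.get("secretEnv", [])
    }
    return sorted(used - declared)


print(undeclared_secrets(config))
print(json.dumps(config, indent=2))
```

The small check is a local sanity test, not a Cloud Build feature: it catches a step that asks for a secret the config never declares before the build is submitted.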

Step-by-Step CI/CD Pipeline Flow

1. Developer pushes code to Cloud Source Repositories.

2. A Cloud Build trigger fires, initiating a build.

3. Cloud Build executes steps defined in cloudbuild.yaml:

- Fetches dependencies.
- Runs unit tests.
- Builds a Docker image.
- Pushes the image to Container Registry.

4. Cloud Build then deploys the image to GKE by updating a deployment.

5. After deployment, Cloud Operations begins monitoring application metrics and logs.

6. If an error is detected, Error Reporting sends an alert via Cloud Monitoring.

7. The developer can roll back to a previous image using kubectl rollout undo or by triggering a previous build.

Infrastructure as Code (IaC)

DevOps relies on IaC to manage environments consistently. Google Cloud Deployment Manager and Terraform are used to define infrastructure in declarative templates. These templates are version-controlled and applied through CI/CD pipelines, ensuring that staging and production environments are identical and reproducible.
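The declarative idea behind IaC can be illustrated with a toy model: describe the desired resources, compare them with what exists, and derive the changes. The Python sketch below is only a conceptual illustration with made-up resource names; it is not how Terraform or Deployment Manager are implemented.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Compare desired and actual resources and list the changes needed."""
    return {
        # Declared in the template but not yet provisioned.
        "create": sorted(desired.keys() - actual.keys()),
        # Provisioned but no longer declared.
        "delete": sorted(actual.keys() - desired.keys()),
        # Present in both, but with different settings.
        "update": sorted(
            name for name in desired.keys() & actual.keys()
            if desired[name] != actual[name]
        ),
    }


# Hypothetical resources: the template is the version-controlled source of truth.
desired = {
    "web-vm": {"machine_type": "e2-medium", "zone": "us-central1-a"},
    "app-bucket": {"location": "US"},
}
actual = {
    "web-vm": {"machine_type": "e2-small", "zone": "us-central1-a"},
    "old-vm": {"machine_type": "e2-small", "zone": "us-central1-a"},
}

changes = plan(desired, actual)
print(changes)
```

Because the plan is computed from the template rather than from manual steps, applying the same template to staging and production converges both on the same configuration.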

Continuous Monitoring and Feedback

Cloud Operations provides a unified view of application health. Uptime checks can monitor external endpoints. Custom dashboards can display key metrics like request latency, error rate, and CPU usage. Alerts can be configured to notify on-call engineers via email, SMS, or PagerDuty. This feedback loop allows teams to detect and fix issues quickly, completing the DevOps cycle.
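An alert on error rate boils down to a threshold evaluated over a recent window of data. The Python sketch below models that logic in isolation, assuming a hypothetical 5% threshold and a three-sample window; real alerting policies are configured in Cloud Monitoring rather than written by hand like this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sample:
    """Request and error counts for one measurement interval."""
    requests: int
    errors: int


def error_rate(samples: List[Sample]) -> float:
    """Return errors divided by requests across the given samples."""
    total = sum(s.requests for s in samples)
    if total == 0:
        return 0.0
    return sum(s.errors for s in samples) / total


def should_alert(samples: List[Sample], threshold: float = 0.05,
                 window: int = 3) -> bool:
    """Alert when the error rate over the last `window` samples exceeds the threshold."""
    recent = samples[-window:]
    # Require a full window so a single noisy interval does not page anyone.
    return len(recent) == window and error_rate(recent) > threshold


healthy = [Sample(1000, 5), Sample(1000, 4), Sample(1000, 6)]
spike = [Sample(1000, 5), Sample(1000, 80), Sample(1000, 120), Sample(1000, 90)]

print(should_alert(healthy))
print(should_alert(spike))
```

Evaluating over a window instead of a single data point is the design choice that keeps brief blips from waking the on-call engineer.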

Walk-Through

1. Developer commits code to repository

A developer writes code and pushes it to a Git repository hosted on Cloud Source Repositories, GitHub, or Bitbucket. The push triggers a webhook that Cloud Build receives. The webhook includes branch name, commit SHA, and repository URL. Cloud Build checks if the branch matches a trigger pattern (e.g., 'main' or 'release/*'). If it matches, a new build is initiated. The build is assigned a unique ID and queued. Cloud Build logs the event to Cloud Logging.
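The branch check can be sketched in a few lines of Python. This assumes trigger patterns are regular expressions that must match the whole branch name, so the glob-style 'release/*' is written here as 'release/.*'; Python's re module stands in for the regex engine Cloud Build uses, and the function name is hypothetical.

```python
import re

# Hypothetical trigger patterns; 'release/.*' is the regex form of 'release/*'.
TRIGGER_PATTERNS = ["main", "release/.*"]


def matching_triggers(branch: str, patterns=TRIGGER_PATTERNS) -> list:
    """Return the patterns that match the whole branch name."""
    return [p for p in patterns if re.fullmatch(p, branch)]


for branch in ["main", "release/1.4", "feature/login"]:
    matched = matching_triggers(branch)
    print(branch, "->", "build queued" if matched else "ignored")
```

Matching the whole name matters: with a partial match, a branch such as maintenance would fire the main trigger by accident.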

2. Cloud Build executes build steps

Cloud Build reads the `cloudbuild.yaml` file from the repository. Each step runs in a separate Docker container. The first step might use the `gcr.io/cloud-builders/docker` image to build a Docker image. The second step runs tests using a testing framework. The third step pushes the image to Container Registry. Each step has access to the workspace (shared volume) and can pass artifacts. If a step fails (non-zero exit code), the build stops and logs the error.
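The stop-on-first-failure behavior described above can be modeled with a short Python sketch. Each step is represented by a hypothetical callable that returns an exit code; this is a simplified stand-in for containers, not Cloud Build's actual executor.

```python
from typing import Callable, List, Tuple

# A step is a name plus a callable returning an exit code (0 means success).
Step = Tuple[str, Callable[[], int]]


def run_build(steps: List[Step]) -> Tuple[str, List[str]]:
    """Run steps in order and stop at the first non-zero exit code."""
    executed = []
    for name, action in steps:
        executed.append(name)
        if action() != 0:
            # Later steps never run once a step fails.
            return "FAILURE", executed
    return "SUCCESS", executed


steps = [
    ("build image", lambda: 0),
    ("run tests", lambda: 1),   # simulated test failure
    ("push image", lambda: 0),
]

status, executed = run_build(steps)
print(status, executed)
```

Because the push step is skipped when tests fail, a broken image never reaches the registry.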

3. Image stored in Container Registry

After a successful build, the Docker image is pushed to Container Registry (or Artifact Registry). The image is tagged with the build's commit SHA and optionally 'latest'. Container Registry stores the image in a Google Cloud Storage bucket. If vulnerability scanning is enabled for the project, a scan is triggered automatically through Artifact Analysis (formerly Container Analysis). The scan checks for known CVEs in the base images and dependencies. Results are available in the Console and can be used to block deployments if critical vulnerabilities are found.
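Blocking a deployment on scan results is, at its core, a severity filter. The Python sketch below shows one possible gate, assuming scan findings have already been fetched into a list of dictionaries; the finding IDs are placeholders, and the function names are hypothetical rather than part of any Google client library.

```python
# Severity levels from least to most serious.
SEVERITIES = ["MINIMAL", "LOW", "MEDIUM", "HIGH", "CRITICAL"]


def blocking_findings(findings: list, fail_on: str = "CRITICAL") -> list:
    """Return findings at or above the `fail_on` severity."""
    limit = SEVERITIES.index(fail_on)
    return [f for f in findings if SEVERITIES.index(f["severity"]) >= limit]


def may_deploy(findings: list, fail_on: str = "CRITICAL") -> bool:
    """Allow deployment only when no finding reaches the blocking severity."""
    return not blocking_findings(findings, fail_on)


# Placeholder findings, not real scan output.
scan = [
    {"id": "EXAMPLE-1", "severity": "LOW"},
    {"id": "EXAMPLE-2", "severity": "CRITICAL"},
]

print(may_deploy(scan))
print(may_deploy(scan[:1]))
```

A pipeline step could call a gate like this and exit with a non-zero code when it returns False, which stops the build before the deploy step runs.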

4. Deployment to staging environment

Cloud Build then deploys the image to a staging environment, typically a GKE cluster or Cloud Run service. For GKE, Cloud Build uses `kubectl` to update the deployment's image. For Cloud Run, it uses `gcloud run deploy`. The deployment is rolled out gradually if a rollout strategy is configured (e.g., canary). Cloud Operations begins collecting metrics and logs from the new version. Automated smoke tests can be run to verify the deployment.

5. Promotion to production via Cloud Deploy

After staging tests pass, the release is promoted to production using Cloud Deploy. Cloud Deploy uses a delivery pipeline defined in a YAML file. It can automatically promote releases across targets (e.g., dev -> staging -> prod) or require manual approval. Each promotion can use a rollout strategy like blue-green or canary. Cloud Deploy manages the release process, including creating necessary Kubernetes resources and monitoring rollout progress.
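The promotion sequence with an approval gate can be sketched as a small state machine. The Python model below uses hypothetical target names and a require_approval flag of its own; it illustrates the idea only and does not reproduce Cloud Deploy's API or file format.

```python
from typing import Optional

# Hypothetical pipeline: ordered targets, with approval required before prod.
PIPELINE = [
    {"target": "dev", "require_approval": False},
    {"target": "staging", "require_approval": False},
    {"target": "prod", "require_approval": True},
]


def next_stage(current: str) -> Optional[dict]:
    """Return the stage after `current`, or None at the end of the pipeline."""
    names = [stage["target"] for stage in PIPELINE]
    index = names.index(current)
    return PIPELINE[index + 1] if index + 1 < len(PIPELINE) else None


def promote(current: str, approved: bool = False) -> str:
    """Return the target a release sits on after a promotion attempt."""
    stage = next_stage(current)
    if stage is None:
        return current  # already on the final target
    if stage["require_approval"] and not approved:
        return current  # promotion waits for manual approval
    return stage["target"]


print(promote("dev"))
print(promote("staging"))
print(promote("staging", approved=True))
```

Keeping the order and the approval rule in data rather than in people's heads is what makes each promotion repeatable.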

6. Monitoring and feedback loop

Once in production, Cloud Operations continuously monitors the application. Cloud Monitoring collects metrics like CPU usage, memory, request latency, and error rates. Cloud Logging ingests application logs. Cloud Trace provides latency analysis for requests. If an anomaly is detected, an alert fires and notifies the on-call engineer. The engineer can view dashboards and logs to diagnose the issue. If a rollback is needed, they can use `kubectl rollout undo` or trigger a previous build to redeploy the last known-good image.

What This Looks Like on the Job

Enterprise Scenario 1: E-commerce Platform with Frequent Releases

A large e-commerce company uses Google Cloud to run its website. The development team releases updates every two weeks. They use Cloud Source Repositories for code, Cloud Build for CI/CD, and GKE for deployment. The pipeline builds a Docker image, runs unit and integration tests, and deploys to a staging cluster. After manual approval, Cloud Deploy promotes the release to production using a blue-green strategy. Cloud Operations monitors the site; if error rates spike after a release, an alert triggers an automatic rollback. This setup reduced release time from weeks to days and improved uptime.

Enterprise Scenario 2: Financial Services with Compliance Needs

A bank uses Artifact Registry with vulnerability scanning to ensure no images with critical CVEs are deployed. Their Cloud Build pipeline includes a step that checks the vulnerability scan results; if any critical vulnerability is found, the build fails. Images are signed using Binary Authorization, ensuring only signed images are deployed to GKE. Cloud Operations logs are retained for 365 days to meet audit requirements. This approach helps the bank maintain security and compliance while still deploying frequently.

Common Pitfalls

Misconfigured IAM: If the Cloud Build service account lacks permissions to push to the registry or update GKE deployments, the pipeline fails. The solution is to grant the Cloud Build service account the necessary roles (e.g., Artifact Registry Writer, or Storage Object Admin for Container Registry, plus Kubernetes Engine Developer).

Build Timeout: The default timeout (60 minutes in current documentation, 10 minutes in older material) may be too short for large projects. Increase it in cloudbuild.yaml using the timeout field.

Ignoring Vulnerability Scans: Teams often push images without reviewing scan results, leading to security issues. Automating blocking of builds with critical vulnerabilities is a best practice.

Not Using Cloud Deploy: Manually promoting releases increases risk of human error. Cloud Deploy automates promotion with approval gates, reducing errors.
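For the build timeout pitfall above, the timeout field takes a duration written in seconds with an "s" suffix. The Python sketch below shows a hypothetical helper that formats a value in minutes and rejects anything beyond the 24-hour maximum mentioned earlier in this chapter.

```python
MAX_TIMEOUT_SECONDS = 24 * 60 * 60  # 24-hour maximum stated earlier in this chapter


def timeout_field(minutes: int) -> str:
    """Format a timeout in minutes as a seconds string such as '1800s'."""
    seconds = minutes * 60
    if seconds <= 0 or seconds > MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout must be between 1 minute and 24 hours, got {minutes} min"
        )
    return f"{seconds}s"


# A 30-minute timeout, ready to place at the top level of a build config.
print({"timeout": timeout_field(30)})
```

Validating the value before submitting avoids finding out about a rejected or too-short timeout only after a long build has already been cut off.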

How GCDL Actually Tests This

What the GCDL Exam Tests

Objective 4.3 focuses on understanding the purpose and benefits of DevOps tools on Google Cloud, not deep configuration. The exam expects you to know:

What Cloud Build does (build, test, deploy)

What Container Registry/Artifact Registry does (store and manage container images)

What Cloud Operations includes (monitoring, logging, tracing, error reporting)

How these services integrate to form a CI/CD pipeline

The concept of infrastructure as code

Common Wrong Answers and Why Candidates Choose Them

1. "Cloud Build is used for storing container images." This is wrong because Cloud Build is a CI/CD service; Container Registry stores images. Candidates confuse the two because they are often used together.

2. "Cloud Operations is only for monitoring." This is incomplete; Cloud Operations includes logging, tracing, error reporting, and profiling, not just monitoring. Candidates may think it's just the former Stackdriver Monitoring.

3. "Cloud Deploy is the same as Cloud Build." Cloud Deploy is a continuous delivery service that manages releases, while Cloud Build is for building and testing. They complement each other but are distinct.

4. "DevOps on Google Cloud requires using all Google Cloud services." The exam tests that DevOps can be implemented with a subset; e.g., you can use Cloud Build with GitHub and Compute Engine. Candidates may think you must use Cloud Source Repositories and GKE.

Specific Numbers and Terms

Default Cloud Build timeout: 60 minutes in current documentation (10 minutes in older material); maximum 24 hours

Log retention: 30 days by default (_Default bucket); longer retention, such as 365 days, is configurable at extra cost

Vulnerability scanning: automatic for Artifact Registry images once the scanning API is enabled

Cloud Deploy rollout strategies: canary, blue-green

Binary Authorization: ensures only signed images are deployed

Edge Cases and Exceptions

Cloud Build can also build non-container artifacts (e.g., Java JARs) and store them in Cloud Storage.

Artifact Registry has replaced Container Registry; the exam may refer to either.

Cloud Operations can monitor on-premises and other clouds via agents.

Cloud Build can run in a VPC network using a private pool.

How to Eliminate Wrong Answers

For questions about CI/CD, identify the service that builds (Cloud Build) vs. stores (Artifact Registry) vs. deploys (Cloud Deploy).

For monitoring questions, look for options that include logging, tracing, and error reporting — those are all part of Cloud Operations.

If an answer says "only" or "exclusively," it's likely wrong because Google Cloud services are designed to be integrated with third-party tools.

Key Takeaways

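Cloud Build is a managed CI/CD service that executes build steps in containers; builds have a default timeout (60 minutes in current documentation, 10 minutes in older material) that can be raised in cloudbuild.yaml.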

Artifact Registry stores container images and other artifacts; vulnerability scanning runs automatically once the scanning API is enabled.

Cloud Operations includes Cloud Monitoring, Cloud Logging, Cloud Trace, and Error Reporting for observability.

Cloud Deploy provides continuous delivery with canary and blue-green rollout strategies.

Infrastructure as Code (IaC) using Deployment Manager or Terraform enables reproducible environments.

DevOps on Google Cloud is tool-agnostic; you can integrate with third-party CI/CD and monitoring tools.

Binary Authorization ensures only signed container images are deployed, enhancing security.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Cloud Build (Google-managed CI/CD)

Fully managed; no infrastructure to maintain

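Priced per build-minute, with a free allowance (2,500 build-minutes per month on the default machine type at the time of writing; older material cites 120 build-minutes per day)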

Native integration with Google Cloud services (GKE, Cloud Run, Artifact Registry)

Supports any language using custom build steps

Scales automatically; no need to manage build agents

Jenkins (self-managed CI/CD)

Self-managed; you must provision and maintain the Jenkins controller (formerly called the master) and agents

Open source; free but requires infrastructure costs

Extensive plugin ecosystem; can integrate with many tools

Requires manual configuration for Google Cloud integration

Requires scaling of agents; can be complex to manage

Watch Out for These

Mistake

Cloud Build only builds Docker containers.

Correct

Cloud Build can build any artifact. It runs steps in containers, which can compile code, run tests, and produce any output (e.g., JARs, binaries, static files). The image used in steps can be any Docker image, not just Google-provided ones.

Mistake

Container Registry automatically deletes old images.

Correct

Container Registry does not automatically delete images. In Artifact Registry, you must configure cleanup policies to delete images automatically based on age or count. Without such policies, images accumulate indefinitely.

Mistake

Cloud Operations requires installing an agent on all VMs.

Correct

Cloud Operations can collect metrics from Google Cloud services without agents (e.g., GKE, Cloud SQL, Load Balancers). For deeper telemetry from Compute Engine VMs, such as memory and process metrics or application logs, install the Ops Agent, which replaces the separate legacy Monitoring and Logging agents.

Mistake

DevOps on Google Cloud means you must use all Google Cloud services.

Correct

Google Cloud DevOps tools integrate with third-party tools. For example, you can use GitHub for source code, Jenkins for CI/CD, and Elasticsearch for logging. Google Cloud services are designed to be open and interoperable.

Mistake

Cloud Deploy is required for any deployment.

Correct

Cloud Deploy is optional. You can deploy directly from Cloud Build to GKE, Cloud Run, or Compute Engine using `kubectl` or `gcloud` commands. Cloud Deploy adds advanced release management features like canary deployments and approval gates.


Frequently Asked Questions

What is the difference between Cloud Build and Cloud Deploy?

Cloud Build is a CI/CD service that builds, tests, and deploys artifacts. It focuses on the build and test phases. Cloud Deploy is a continuous delivery service that manages the promotion of releases across environments (e.g., dev, staging, prod) with rollout strategies like canary and blue-green. They complement each other: Cloud Build can trigger a Cloud Deploy pipeline after a successful build.

Does Cloud Build support building non-Docker artifacts?

Yes. Cloud Build's build steps run in Docker containers, but the steps themselves can compile code into any artifact (e.g., JAR files, binaries, static HTML). The output can be stored in Cloud Storage or Artifact Registry. For example, you can use a Maven container to build a Java JAR and then upload it to Cloud Storage.

How do I automatically trigger a build when code is pushed?

Create a Cloud Build trigger in the Cloud Console or via gcloud. Specify the repository (Cloud Source Repositories, GitHub, or Bitbucket) and the branch pattern (e.g., 'main' or '.*'). When a push matches the pattern, Cloud Build automatically starts a build. You can also trigger on pull requests or tags.

What is the default retention period for Cloud Logging logs?

The default retention period for Cloud Logging logs is 30 days in the _Default log bucket. Audit logs in the _Required bucket are kept for 400 days. You can configure a longer retention period on a log bucket, such as 365 days, at additional cost. You can also export logs to Cloud Storage or BigQuery for longer retention.

Can I use Cloud Operations to monitor applications not running on Google Cloud?

Yes. Cloud Operations can monitor on-premises and other cloud environments using monitoring and logging agents. You can also use the Cloud Monitoring API (formerly the Stackdriver API) to send custom metrics. However, some features like automatic resource discovery are limited to Google Cloud resources.

What is Binary Authorization and how does it relate to DevOps?

Binary Authorization is a security service that ensures only trusted container images are deployed to GKE or Cloud Run. It enforces that images must be signed by an approved authority (e.g., using Cloud KMS). In a DevOps pipeline, you can integrate image signing into Cloud Build steps. Binary Authorization then blocks deployments of unsigned or unauthorized images.

How does Cloud Deploy handle rollbacks?

Cloud Deploy supports rollbacks by reverting to a previous release. You can manually initiate a rollback in the Cloud Console or via gcloud. Cloud Deploy will redeploy the previous version's artifacts and update the rollout state. It also keeps a history of releases for audit purposes.
