GCDL · Chapter 69 of 101 · Objective 4.3

GitOps and Infrastructure as Code

Managing cloud infrastructure declaratively and automating deployments relies on two foundational practices: GitOps and Infrastructure as Code (IaC). For the GCDL exam, this topic appears in approximately 5-8% of questions, focusing on understanding the principles, benefits, and Google Cloud-specific tools like Config Sync and Cloud Build. Mastering these concepts is essential for demonstrating how organizations achieve consistency, reliability, and auditability in their cloud operations.

25 min read

Intermediate

Updated Jul 20, 2026

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security


GitOps as a Restaurant Recipe System

Picture a high-end restaurant kitchen that runs entirely on a recipe system, with fifty recipes kept in one central binder. Each recipe in the binder (the Git repository) specifies exact ingredients, quantities, and cooking steps. The head chef (the CI/CD pipeline) reads a recipe, prepares the dish, and places it on the pass (the production environment). A food inspector (the GitOps agent) constantly checks that the dish on the pass matches the recipe. If the dish is altered (e.g., a line cook adds extra salt), the inspector flags it and either reverts the change or triggers a new cook cycle to redeploy the correct dish. The restaurant never allows manual modifications at the pass; every change must go through the recipe binder. This ensures consistency, traceability, and rapid recovery.

In GitOps terms, the Git repository is the single source of truth. The desired state (infrastructure config) is declared in Git. An operator (like Argo CD or Config Sync) continuously compares the live state to the desired state and reconciles any drift. All changes are made via pull requests, which are reviewed and merged, triggering automated deployment. This model eliminates configuration drift and provides a complete audit trail.

How It Actually Works

What is Infrastructure as Code (IaC)?

Infrastructure as Code (IaC) is the practice of managing and provisioning infrastructure through machine-readable definition files, rather than manual processes or interactive configuration tools. The core idea is to treat infrastructure the same way developers treat application code: store it in version control, review it, test it, and deploy it automatically. IaC enables repeatable, consistent environments and eliminates configuration drift.

There are two main approaches to IaC: declarative and imperative. Declarative IaC specifies the desired end state (e.g., 'I want three VMs with 4 vCPUs each'), and the tool figures out how to achieve that state. Imperative IaC specifies the exact steps to reach the state (e.g., 'create VM, then install software, then configure network'). Declarative is preferred for cloud environments because it is idempotent and easier to reason about. Google Cloud Deployment Manager and Terraform are declarative tools; Ansible is primarily imperative.
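The difference shows up most clearly when the same operation is run twice. The sketch below is a toy model, not a real provisioning tool: "infrastructure" is just a dictionary of VM names to vCPU counts, and the function names `imperative_add_vms` and `declarative_apply` are hypothetical.

```python
# Toy model: infrastructure is a dict of VM name -> vCPU count.

def imperative_add_vms(infra, count, vcpus):
    """Imperative step: 'create `count` more VMs'. Re-running adds more VMs."""
    start = len(infra)
    for i in range(start, start + count):
        infra[f"vm-{i}"] = vcpus
    return infra


def declarative_apply(infra, desired):
    """Declarative apply: make `infra` match `desired`, whatever it held before."""
    for name in list(infra):
        if name not in desired:
            del infra[name]           # remove anything that is not declared
    for name, vcpus in desired.items():
        infra[name] = vcpus           # create or correct each declared VM
    return infra


desired = {"vm-0": 4, "vm-1": 4, "vm-2": 4}

imperative = {}
imperative_add_vms(imperative, 3, 4)
imperative_add_vms(imperative, 3, 4)      # second run: six VMs now exist

declarative = {}
declarative_apply(declarative, desired)
declarative_apply(declarative, desired)   # second run: still three VMs

print(len(imperative), len(declarative))
```

Running the imperative step twice changes the outcome, while applying the same declaration twice does not. That property is what "idempotent" refers to above.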

What is GitOps?

GitOps extends IaC by using Git as the single source of truth for declarative infrastructure and applications. The Git repository contains the entire desired state of the system. An automated operator (agent) continuously compares the live state in the cluster or cloud environment with the desired state in Git. If they diverge, the operator takes corrective action, either by alerting or automatically reconciling. GitOps is particularly popular in Kubernetes environments but is applicable to any cloud infrastructure.

Key principles of GitOps:

- Declarative description: The entire system is described declaratively.
- Version controlled and immutable: Git is the single source of truth; all changes are made via pull requests.
- Automated delivery: Changes merged to Git are automatically applied to the environment.
- Continuous reconciliation: An operator ensures the live state matches the desired state.

How GitOps Works Internally

Let's step through the GitOps workflow using Google Cloud's Config Sync as an example:

Repository setup: An organization stores Kubernetes manifests (YAML files) in a Git repository. The root directory contains a cluster directory with namespaces, roles, deployments, etc.

Config Sync installation: Config Sync is installed on a GKE cluster using gcloud or via the GKE UI. It requires a service account with appropriate permissions to read from the Git repository.

Initial sync: On startup, Config Sync clones the repository and applies all resources to the cluster. It records the commit SHA and resource versions.

Reconciliation loop: Config Sync runs a continuous reconciliation loop; the sync period defaults to 15 seconds. It compares the live state (via the Kubernetes API) to the desired state from Git. If a resource is missing or different, it creates or updates it.

Drift detection: If someone manually edits a resource (e.g., kubectl edit deployment nginx), Config Sync will detect the change on the next reconciliation and revert it to match Git. This prevents configuration drift.

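Pull-based deployment: When a developer merges a pull request to the main branch, nothing is pushed to the cluster. Config Sync picks up the new commit on its next poll of the repository and applies the new manifests. (Some other operators, such as Argo CD, can also be notified by a webhook so that they sync sooner.)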
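The pull-based part of this workflow can be sketched as a poll that compares the branch head with the last commit the operator applied. This is a simplified model under stated assumptions, not Config Sync's implementation: `sync_once`, the `state` dictionary, and the commit SHAs are all hypothetical.

```python
def sync_once(repo_head, repo_manifests, state):
    """One poll: re-apply only when the branch head differs from the last synced commit."""
    if repo_head == state["synced_sha"]:
        return False                           # nothing new in Git
    state["cluster"] = dict(repo_manifests)    # stand-in for applying the manifests
    state["synced_sha"] = repo_head            # record which commit was applied
    return True


state = {"synced_sha": None, "cluster": {}}

first = sync_once("a1b2c3", {"nginx": {"replicas": 2}}, state)   # initial sync
second = sync_once("a1b2c3", {"nginx": {"replicas": 2}}, state)  # same commit, no-op
third = sync_once("d4e5f6", {"nginx": {"replicas": 3}}, state)   # merged PR moved the head

print(first, second, third, state["synced_sha"])
```

The cluster never receives a push from the pipeline; it only changes when the operator notices that the commit it tracks has moved.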

Key Components and Defaults

Config Sync: Google's GitOps operator for GKE. It supports multiple source types: Git, OCI, and Helm. Default sync frequency: 15 seconds. It can be configured via the ConfigManagement custom resource.

Argo CD: A popular open-source GitOps tool that works with any Kubernetes cluster, including GKE. It provides a UI, CLI, and API. Sync interval defaults to 3 minutes but can be customized.

Cloud Build: Google's CI/CD platform that can integrate with GitOps. For example, you can set up a Cloud Build trigger that runs when a pull request is merged, builds a container image, and updates the Kubernetes manifest in Git. Config Sync then picks up the updated manifest and deploys it.

Source Repositories: Google's managed Git service that integrates with IAM, Cloud Build, and other GCP services.

Configuration and Verification

To install Config Sync on a GKE cluster that is registered to a fleet:

gcloud beta container fleet config-management enable
gcloud beta container fleet config-management apply --membership=MEMBERSHIP_NAME --config=apply-spec.yaml

Or via the YAML manifest:

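# PROJECT and REPO are placeholders
apiVersion: configmanagement.gke.io/v1
kind: ConfigManagement
metadata:
  name: config-management
spec:
  git:
    syncRepo: https://source.developers.google.com/p/PROJECT/r/REPO
    syncBranch: main
    # An HTTPS URL cannot use an SSH key; gcenode authenticates with the nodes' service account
    secretType: gcenode
    # Directory in the repository to sync from
    policyDir: /cluster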

To verify sync status:

nomos status

Interaction with Related Technologies

GitOps works hand-in-hand with CI/CD pipelines. A typical flow:

1. Developer commits code to the application repo.
2. CI pipeline (Cloud Build) tests and builds a container image.
3. CD pipeline updates the deployment manifest in the config repo (e.g., changes the image tag).
4. GitOps operator detects the change and rolls out the new image to the cluster.
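Step 3 of this flow is usually a small scripted edit to a manifest followed by a commit. The sketch below shows only the text edit; the function name `set_image_tag`, the image path `gcr.io/example-project/shop`, and the tags are hypothetical, and a real pipeline might use a dedicated tool instead of a regular expression.

```python
import re


def set_image_tag(manifest_text, image, new_tag):
    """Return manifest text with every `image: <image>:<tag>` entry set to new_tag."""
    pattern = re.compile(r"(image:\s*" + re.escape(image) + r"):[\w.\-]+")
    return pattern.sub(lambda m: f"{m.group(1)}:{new_tag}", manifest_text)


manifest = """\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: shop
spec:
  template:
    spec:
      containers:
        - name: shop
          image: gcr.io/example-project/shop:1.4.0
"""

updated = set_image_tag(manifest, "gcr.io/example-project/shop", "1.5.0")
print(updated.splitlines()[-1].strip())
```

The pipeline would then commit `updated` to the config repository. The deployment itself is left to the GitOps operator.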

GitOps also integrates with monitoring (e.g., Cloud Monitoring alerts on sync failures) and security (e.g., Binary Authorization ensures only signed images are deployed).

Benefits and Challenges

Benefits:

- Audit trail: Every change is recorded in Git with author, timestamp, and diff.
- Rollback: Reverting a change is as simple as reverting a Git commit.
- Consistency: All environments (dev, staging, prod) are built from the same reviewed configuration.
- Disaster recovery: A cluster can be rebuilt from Git in minutes.

Challenges:

- Secret management: Git is not designed for secrets. Either keep secrets out of Git in a service such as Google Cloud Secret Manager and reference them from the workload, or encrypt them before committing with a tool such as Sealed Secrets.
- Learning curve: Teams must adopt a Git workflow and declarative configuration.
- Performance: Large repositories with many resources can slow down reconciliation.
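One way to act on the secret-management challenge is a pre-merge check that rejects manifests containing an ordinary Kubernetes Secret with inline values. The sketch below is a rough text-based check, assuming multi-document YAML separated by `---` lines; the function name `find_plaintext_secrets` and the sample manifests are hypothetical, and a production check would parse the YAML properly.

```python
import re


def find_plaintext_secrets(manifest_text):
    """Return indexes of YAML documents that declare `kind: Secret` with inline data."""
    flagged = []
    documents = re.split(r"^---\s*$", manifest_text, flags=re.MULTILINE)
    for index, doc in enumerate(documents):
        is_secret = re.search(r"^kind:\s*Secret\s*$", doc, flags=re.MULTILINE)
        # base64 under `data:` is an encoding, not encryption, so it counts too
        has_values = re.search(r"^(data|stringData):", doc, flags=re.MULTILINE)
        if is_secret and has_values:
            flagged.append(index)
    return flagged


manifests = """\
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  mode: prod
---
apiVersion: v1
kind: Secret
metadata:
  name: db-password
stringData:
  password: example-only
"""

print(find_plaintext_secrets(manifests))
```

A check like this could run as a Cloud Build step on each pull request, so that a plaintext secret is caught in review rather than after it has entered Git history.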

Walk-Through

Define Desired State in Git

The process begins with defining the desired state of your infrastructure and applications in a Git repository. This includes Kubernetes manifests (Deployments, Services, ConfigMaps), Terraform files, or Deployment Manager templates. The repository is structured with clear directories for environments (dev, staging, prod) and components. All files are written declaratively, specifying exactly what the end state should look like. The repository is the single source of truth; no manual changes are allowed in production.

Install GitOps Operator

A GitOps operator, such as Config Sync or Argo CD, is installed on the target environment (e.g., a GKE cluster). The operator requires credentials to access the Git repository, typically an SSH key or a service account with read access. The operator is configured with the repository URL, branch, and path to the configuration files. It also defines the sync policy: automated (apply changes automatically) or manual (require approval). The operator runs as a pod in the cluster.

Initial Synchronization

Upon startup, the operator clones the Git repository and applies all defined resources to the cluster. It creates namespaces, deployments, services, and any other objects. The operator records the current commit SHA and stores resource versions. This initial sync ensures the live state matches the desired state from the first moment. If there are existing resources that conflict, the operator may overwrite them or fail, depending on the configuration.

Continuous Reconciliation Loop

The operator enters a reconciliation loop, periodically comparing the live state with the desired state from Git. For Config Sync, this loop runs every 15 seconds by default. It queries the Kubernetes API for the current state of each resource and compares it with the manifest in Git. If a resource exists but differs (e.g., different replica count), the operator updates it. If a resource is missing, it creates it. If a resource exists that is not in Git, it may delete it (pruning) or leave it, depending on the pruning policy.
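The create, update, and prune decisions described above can be sketched as a pure function that turns two states into a list of actions. This is an illustrative model only: `plan_reconcile` and the resource names are hypothetical, and the `managed` set is an assumption standing in for however a given operator tracks which objects it previously applied.

```python
def plan_reconcile(desired, live, managed, prune=True):
    """Compare desired (Git) and live (cluster) state; return the actions needed."""
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in live:
            actions.append(("create", name))
        elif live[name] != spec:
            actions.append(("update", name))
    if prune:
        for name in sorted(live):
            # Assumption: only objects this operator applied earlier are pruned.
            if name not in desired and name in managed:
                actions.append(("delete", name))
    return actions


desired = {"web": {"replicas": 3}, "api": {"replicas": 2}}
live = {
    "web": {"replicas": 5},        # differs from Git
    "old-job": {"replicas": 1},    # removed from Git, previously managed
    "kube-dns": {"replicas": 2},   # never managed by the operator
}
managed = {"web", "old-job"}

print(plan_reconcile(desired, live, managed))
```

With `prune=False` the same call would leave `old-job` in place, which corresponds to the "leave it" branch of the pruning policy.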

Drift Detection and Correction

If someone manually modifies a resource in the cluster (e.g., using `kubectl edit`), the operator detects the drift during the next reconciliation. It reverts the change to match the Git state. This ensures that the only way to make persistent changes is through Git. Drift can also occur due to external factors like auto-scaling or cluster autoscaler; the operator may revert those as well if they are not part of the desired state. Some operators allow exceptions via annotations.
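Drift detection with an exception list can be sketched as a field-by-field comparison that skips fields owned by something else, such as an autoscaler. The names here (`find_drift`, `ignore_fields`, the image tags) are hypothetical; real operators express such exceptions through their own annotations or settings rather than a function argument.

```python
def find_drift(desired, live, ignore_fields=()):
    """Return {field: (desired_value, live_value)} for declared fields that differ."""
    drift = {}
    for field, want in desired.items():
        if field in ignore_fields:
            continue                  # e.g. replicas owned by an autoscaler
        have = live.get(field)
        if have != want:
            drift[field] = (want, have)
    return drift


desired = {"image": "shop:1.4.0", "replicas": 3}
live = {"image": "shop:1.4.0-hotfix", "replicas": 8}

print(find_drift(desired, live))
print(find_drift(desired, live, ignore_fields=("replicas",)))
```

In the first call both the manual image change and the replica count would be reverted. In the second, only the image change counts as drift.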

Change via Pull Request

To make a change, a developer updates the configuration files in a new branch, commits, and opens a pull request. Team members review the changes. After approval, the pull request is merged into the main branch. This triggers a webhook or the next poll cycle to notify the operator. The operator then syncs the new desired state, applying the changes to the cluster. This workflow ensures all changes are reviewed, tested, and auditable.

What This Looks Like on the Job

Enterprise Scenario 1: Financial Services Compliance

A large bank adopts GitOps to meet regulatory requirements for change management. They store all Kubernetes manifests in a private Git repository with strict branch protection. Every change requires a pull request approved by two senior engineers. Config Sync runs on their GKE clusters across three regions (us-central1, europe-west1, asia-east1). The sync interval is set to 30 seconds. They use Cloud Build to run policy checks (e.g., ensuring no privileged containers) before merging. In production, a developer once manually scaled up a deployment to handle a traffic spike. Config Sync detected the drift within 30 seconds and reverted the replica count to the value declared in Git, causing a brief performance issue. The team learned to use a HorizontalPodAutoscaler instead, which is defined in Git and allowed to adjust replicas dynamically.

Enterprise Scenario 2: E-commerce Platform with Multi-Environment Deployments

An e-commerce company uses GitOps to manage dev, staging, and prod environments. They maintain separate directories in a single repository: envs/dev, envs/staging, envs/prod. The GitOps operator is Argo CD, chosen for its rich UI and sync waves, and each environment has its own Argo CD application pointing to the respective directory. They use Kustomize to overlay environment-specific settings. A common issue is that developers sometimes forget to update the image tag in the staging directory after testing, causing staging to fall behind. They implement a CI pipeline that automatically updates the staging manifest after a successful build. When a misconfigured Ingress caused a prod outage, they rolled back by reverting the Git commit, and Argo CD synced the previous state within 3 minutes.
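The overlay idea in this scenario (one shared base, small per-environment patches) can be modelled as a recursive dictionary merge. This is a conceptual sketch, not Kustomize's actual patching semantics; `apply_overlay` and all values are hypothetical.

```python
import copy


def apply_overlay(base, overlay):
    """Recursively merge `overlay` into a copy of `base`; overlay values win."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_overlay(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = {
    "image": "shop:1.4.0",
    "replicas": 1,
    "resources": {"cpu": "250m", "memory": "256Mi"},
}
overlays = {
    "dev": {},
    "staging": {"replicas": 2},
    "prod": {"replicas": 6, "resources": {"cpu": "1"}},
}

rendered = {env: apply_overlay(base, patch) for env, patch in overlays.items()}
print(rendered["prod"])
```

Because every environment renders from the same base, a change to the base image tag reaches all three directories, while settings such as replica counts stay environment-specific.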

Scenario 3: Hybrid Cloud with Anthos

A multinational corporation uses Anthos to manage workloads across on-premises and multiple clouds. They deploy Config Sync on all Anthos clusters, both on GKE and on-premises. The Git repository stores cluster-specific configurations in separate folders. They use the policyDir field to point each cluster to its folder. A challenge is that on-prem clusters have limited network connectivity; they set up a mirror of the Git repository using Cloud Source Repositories with a proxy. They also use the unstructured source format (`sourceFormat: unstructured`), which lets them organize the repository freely instead of following the hierarchical layout. Config Sync applies Kubernetes-style manifests, so artifacts such as Terraform state are kept outside the synced directories. When a cluster goes down, they can spin up a new one and point it to the same Git source, fully restoring the state within minutes.

How GCDL Actually Tests This

What GCDL Tests on This Topic (Objective 4.3)

The GCDL exam expects you to understand the concepts and benefits of GitOps and IaC, not the detailed configuration. Key areas:

- Definition and benefits: Be able to explain what GitOps is and why organizations use it (consistency, auditability, rollback).
- Tools: Know that Config Sync is Google's GitOps product for GKE, and that Cloud Build integrates with GitOps workflows.
- Declarative vs. imperative: Understand the difference and why declarative is preferred for cloud.
- Single source of truth: Git is the source of truth; no manual changes.
- Drift detection: The operator continuously reconciles and reverts drift.

Common Wrong Answers and Why

'GitOps means using Git for version control of application code only.' This is too narrow. GitOps uses Git for infrastructure and application configuration, not just code.

'IaC eliminates the need for any manual configuration.' While IaC reduces manual work, some initial setup (like bootstrapping the GitOps operator) may be manual. Also, IaC does not cover all aspects (e.g., secrets).

'Config Sync can only sync from Cloud Source Repositories.' Config Sync supports any Git repository, including GitHub, GitLab, and Bitbucket.

'GitOps requires a CI/CD pipeline.' GitOps can work without a separate CI/CD pipeline; the operator itself applies changes from Git. However, CI/CD is often used for building and testing.

Specific Values and Terms

Sync interval: Config Sync default is 15 seconds. Argo CD default is 3 minutes.

Pruning: When a resource that Config Sync manages is removed from Git, Config Sync deletes it from the cluster (pruning). This is the default behavior but can be prevented for individual resources.

`nomos status`: Command to check Config Sync status.

`ConfigManagement`: Custom resource for configuring Config Sync.

Edge Cases

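Large clusters: With thousands of resources, reconciliation can be slow. Config Sync also has limits on how many resources a single sync can manage; exceeding them can cause sync failures, so very large configurations are typically split across multiple sync sources.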

Secrets: Git is not secure for plaintext secrets. The exam may ask about Secret Manager or Sealed Secrets as solutions.

Multiple clusters: Config Sync can manage multiple clusters from a single repository using different directories or branches.

How to Eliminate Wrong Answers

If a question asks about the primary benefit of GitOps, think about consistency and audit trail. If an answer mentions 'faster application performance,' it's likely wrong because GitOps is about configuration management, not performance. If an answer says 'GitOps replaces CI/CD,' it's wrong; GitOps complements CI/CD.

Key Takeaways

GitOps uses Git as the single source of truth for declarative infrastructure and applications.

Config Sync is Google's GitOps agent for GKE, with a default sync interval of 15 seconds.

IaC can be declarative (desired state) or imperative (step-by-step); declarative is preferred for cloud.

GitOps operators continuously reconcile live state with desired state, reverting any drift.

Changes are made via pull requests, ensuring review and audit trail.

GitOps complements CI/CD; CI builds artifacts, GitOps deploys them.

Config Sync supports pruning: managed resources that are removed from Git are deleted from the cluster.

Secrets should not be stored plaintext in Git; use Secret Manager or Sealed Secrets.

Easy to Mix Up

These come up on the exam all the time. Here's how to tell them apart.

Config Sync (Google Cloud)

Native to GKE, integrated with Google Cloud Console.

Default sync interval of 15 seconds for fast drift detection.

Supports Git, OCI, and Helm sources.

Installed and upgraded through Google Cloud tooling (console or gcloud) rather than deployed and maintained by hand.

Limited to GKE and Anthos clusters.

Argo CD (Open Source)

Works with any Kubernetes cluster, including on-prem and multi-cloud.

Default sync interval of 3 minutes but configurable.

Rich UI, CLI, and API with sync waves and hooks.

Community-driven with extensive plugin ecosystem.

Requires separate installation and maintenance.

Watch Out for These

Mistake

GitOps is only for Kubernetes.

Correct

While GitOps is most popular with Kubernetes, it can be applied to any infrastructure that can be managed declaratively, such as Google Cloud resources via Config Controller or Terraform.

Mistake

Config Sync automatically updates the Git repository when the cluster changes.

Correct

Config Sync only reads from Git and applies to the cluster. It does not write back to Git. If you want to capture changes made in the cluster, you need a separate step, such as exporting the changed resources and committing them through a pull request.

Mistake

IaC means you never have to touch the cloud console.

Correct

IaC reduces the need for manual console use, but some operations (like troubleshooting connectivity issues) may still require console access. Also, initial setup often requires some console steps.

Mistake

GitOps and IaC are the same thing.

Correct

IaC is the practice of managing infrastructure with code. GitOps is a specific implementation of IaC that uses Git as the source of truth and an operator to reconcile state. All GitOps is IaC, but not all IaC is GitOps.

Mistake

You cannot use GitOps with existing manually managed clusters.

Correct

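You can adopt GitOps on an existing cluster by first exporting the current configuration into Git (for example with `kubectl get <kind> -o yaml` for each resource type you want to manage, then removing server-populated fields such as `status`), and then installing the operator. Note that the old `--export` flag has been removed from kubectl, and `kubectl get all` does not return every resource type. The operator will then manage the cluster from that point onward.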
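Objects read back from a live cluster carry fields that the API server populates, and these should not be committed as desired state. The sketch below strips a few of them from an object already loaded as a dictionary. The helper name `clean_for_git` and the sample object are hypothetical, and the field list is a starting point rather than an exhaustive one.

```python
import copy

# Metadata fields set by the Kubernetes API server rather than by the author.
SERVER_SET_METADATA = (
    "uid",
    "resourceVersion",
    "creationTimestamp",
    "generation",
    "managedFields",
)


def clean_for_git(obj):
    """Return a copy of a live object with server-populated fields removed."""
    cleaned = copy.deepcopy(obj)
    cleaned.pop("status", None)               # observed state, not desired state
    metadata = cleaned.get("metadata", {})
    for field in SERVER_SET_METADATA:
        metadata.pop(field, None)
    return cleaned


live_object = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {
        "name": "nginx",
        "namespace": "web",
        "uid": "0000-example",
        "resourceVersion": "12345",
        "creationTimestamp": "2024-01-01T00:00:00Z",
    },
    "spec": {"replicas": 2},
    "status": {"readyReplicas": 2},
}

print(clean_for_git(live_object))
```

The cleaned object keeps only what a person would have written by hand: the API version, kind, identifying metadata, and spec.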


Frequently Asked Questions

What is the difference between GitOps and Infrastructure as Code?

Infrastructure as Code (IaC) is the broader practice of managing infrastructure with machine-readable definition files. GitOps is a subset of IaC that specifically uses Git as the single source of truth and employs an automated operator to reconcile the live state with the desired state in Git. In short, all GitOps is IaC, but not all IaC is GitOps. For example, you can use Terraform in a CI/CD pipeline without GitOps, but if you add a GitOps operator that continuously syncs from Git, it becomes GitOps.

How does Config Sync handle secrets?

Config Sync does not manage secret values itself. One option is to keep secrets in Google Cloud Secret Manager and have workloads read them at runtime, for example by mounting them as volumes through the Secrets Store CSI driver (`secrets-store.csi.k8s.io`). Another is to encrypt secrets before storing them in Git with a tool such as Sealed Secrets or Helm secrets. In that case Config Sync does not decrypt anything; it applies the encrypted manifests as-is, and with Sealed Secrets a controller in the cluster decrypts them.

Can I use GitOps for non-Kubernetes resources on Google Cloud?

Yes, with Config Controller (a managed version of Config Connector) or with Terraform. Config Controller allows you to manage Google Cloud resources (like Cloud SQL, VPCs, etc.) using Kubernetes-style YAML manifests. You can store these manifests in Git and use Config Sync to apply them to Config Controller, which then creates the cloud resources. This extends GitOps to the entire Google Cloud infrastructure.

What happens if the Git repository is unavailable?

The GitOps operator keeps working from the last successfully synced commit and cannot pick up new changes from Git until the repository is reachable again. Whether drift is corrected during the outage depends on the operator: one that caches the last-synced desired state can keep reconciling against it, while one that must re-fetch the source on every cycle cannot correct drift until the next successful sync. For high availability, you can mirror the repository or use a managed Git service like Cloud Source Repositories.

How do I roll back a change with GitOps?

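To roll back, you revert the commit in Git and push the change. Prefer `git revert`, which adds a new commit that undoes the change and keeps the audit trail intact; `git reset` rewrites history and requires a force push, so it is a poor fit for a shared main branch. The GitOps operator will detect the new commit and apply the previous state. This is much simpler than traditional rollback methods because you are just reverting a Git commit. You can also use Git tags or branches to mark stable versions.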
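Why a revert preserves the audit trail can be shown with a toy commit history in which each commit records the full desired state. This is a conceptual model, not how Git stores data; `revert_last` and the commit messages are hypothetical.

```python
def head_state(history):
    """The desired state is whatever the newest commit declares."""
    return history[-1]["state"]


def revert_last(history):
    """Append a new commit that restores the state from before the newest commit."""
    previous_state = history[-2]["state"]
    reverted = history[-1]
    history.append({
        "message": 'Revert "' + reverted["message"] + '"',
        "state": dict(previous_state),
    })


history = [
    {"message": "Deploy shop 1.4.0", "state": {"image": "shop:1.4.0"}},
    {"message": "Deploy shop 1.5.0", "state": {"image": "shop:1.5.0"}},
]

revert_last(history)
print(head_state(history), len(history))
```

The history grows to three entries: the faulty commit is still visible, and the operator simply sees a new head whose desired state equals the earlier one.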

What is the default sync interval for Config Sync?

The default sync interval for Config Sync is 15 seconds: it checks the Git repository and reconciles the cluster every 15 seconds. The interval is configurable, and the field name depends on the API you use (for example `syncWaitSecs` in the gcloud apply spec, or `spec.git.period` on a RootSync object). For Argo CD, the default is 3 minutes.

Does GitOps replace CI/CD pipelines?

No, GitOps does not replace CI/CD; it complements it. CI pipelines build and test application code, producing artifacts (e.g., container images). CD pipelines can update the Git repository with new image tags, and then GitOps deploys the changes. GitOps is the deployment mechanism, while CI/CD is the build and test mechanism. They work together.
