This chapter covers Kubernetes concepts tailored for non-technical leaders preparing for the Google Cloud Digital Leader (GCDL) exam. Kubernetes is a critical topic under Infrastructure Objective 2.2, and it appears in approximately 10-15% of exam questions. You will learn what Kubernetes is, its core components, how it works, and why it matters for cloud strategy—without needing to write YAML files or deploy clusters. The focus is on understanding the architecture, benefits, and key terminology so you can make informed decisions about container orchestration.
Imagine you are the manager of a global shipping fleet. You have hundreds of identical cargo containers (the counterpart of software containers) that need to be loaded onto ships, sent to ports, and delivered to customers. You don't care which specific container goes where; you just need the right number of containers at each destination. Kubernetes is like an automated fleet management system. You specify: 'I need 3 containers of product A at port New York, and 5 containers of product B at port London.' The system then groups the containers into shipments (pods), decides which ship (node) to load them onto, and ensures they stay there. If a container falls overboard (crashes), the system automatically replaces it with a new one from the warehouse (container image registry). It also balances the load: new containers go to ships with spare capacity rather than onto one that is already full. The system constantly monitors the health of containers and ships, and it can roll out new product versions (updates) gradually, replacing old containers with new ones without downtime. You never touch a container directly. You only interact with the management system through its API, typically with the kubectl command-line tool. This abstraction allows you to scale your operations from a single port to a global network without changing your workflow.
What is Kubernetes and Why Does It Exist?
Kubernetes (often abbreviated as K8s) is an open-source platform for automating the deployment, scaling, and management of containerized applications. Containers package an application with its dependencies, making it portable across environments. However, running containers in production at scale introduces challenges: how do you ensure high availability, load balance traffic, roll out updates without downtime, and recover from failures? Kubernetes solves these problems by providing a declarative model where you specify the desired state (e.g., 'run 3 copies of this app') and Kubernetes continuously works to maintain that state.
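The declarative model can be pictured as a loop that compares desired state with actual state and closes the gap. The following is a minimal Python sketch of that idea, not how Kubernetes controllers are implemented; the function name `reconcile` and the pod names `app-0`, `app-1`, and so on are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Return the pod list after one reconciliation pass (illustrative only)."""
    pods = list(running_pods)
    next_id = 0
    # Too few pods: start replacements until the desired count is reached.
    while len(pods) < desired_replicas:
        name = f"app-{next_id}"  # hypothetical naming scheme
        if name not in pods:
            pods.append(name)
        next_id += 1
    # Too many pods: drop the surplus so the count matches the desired state.
    return pods[:desired_replicas]


# One pod has crashed: the desired state is 3, but only 2 are running.
healed = reconcile(3, ["app-0", "app-2"])
```

Calling `reconcile` again on `healed` changes nothing, which is the point of the model: you state the target once, and the system keeps converging on it.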
Core Architecture: Control Plane and Nodes
A Kubernetes cluster consists of two main planes: the control plane (master) and the compute plane (nodes). The control plane makes global decisions about the cluster (e.g., scheduling, responding to events) and exposes the API. It includes several components:
- kube-apiserver: The front-end for the Kubernetes API. All communication, internal and external, goes through the API server. It validates and processes RESTful requests.
- etcd: A consistent and highly-available key-value store used as Kubernetes' backing store for all cluster data. Only the API server talks to etcd directly.
- kube-scheduler: Watches for newly created pods with no assigned node and selects a node for them to run on based on resource requirements, constraints, and policies.
- kube-controller-manager: Runs controller processes that handle routine tasks such as replication (ensuring the correct number of pod replicas), node management (detecting node failures), and endpoint management.
Nodes are the worker machines (VMs or physical servers) that run your applications. Each node contains:
- kubelet: An agent that ensures containers are running in a pod as expected. It communicates with the API server to receive pod specifications and reports back node and pod status.
- kube-proxy: A network proxy that maintains network rules on nodes, enabling communication to pods from inside or outside the cluster.
- Container runtime: The software that runs containers (e.g., containerd or CRI-O; since Kubernetes 1.24, Docker Engine needs the cri-dockerd adapter).
Pods: The Smallest Deployable Units
A pod is the smallest and simplest unit in the Kubernetes object model. It represents a single instance of a running process in the cluster. A pod can contain one or more containers that are tightly coupled and share the same network namespace (IP address and port space), storage volumes, and lifecycle. In practice, most pods run a single container. Pods are ephemeral: they are created, assigned to a node, and run until termination or failure. The kubelet can restart a crashed container inside a pod, but a pod that is deleted or lost along with its node is not recreated unless a higher-level controller manages it.
Workload Resources: Deployments, StatefulSets, and DaemonSets
Pods are almost never created directly. Instead, you use workload resources that manage pods:
- Deployment: The most common resource for stateless applications. It defines a desired state (e.g., 3 replicas of a pod template) and a strategy for updating pods (e.g., rolling update with max surge and max unavailable). Deployments ensure that the actual state matches the desired state, replacing failed or unhealthy pods automatically.
- StatefulSet: Used for stateful applications that require stable, unique network identifiers, persistent storage, and ordered deployment/scaling. Examples include databases like MySQL or Cassandra. Each pod in a StatefulSet has a sticky identity (e.g., pod-0, pod-1) and its own persistent volume.
- DaemonSet: Ensures that all (or some) nodes run a copy of a pod. Commonly used for cluster-wide services like logging agents (e.g., Fluentd), monitoring agents (e.g., Prometheus node exporter), or network plugins.
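To make "desired state" concrete, here is a rough sketch of what a Deployment manifest contains, written as a Python dictionary and serialized to JSON (the API server accepts JSON as well as YAML). The name `web`, the label `app: web`, and the image `example.com/web:1.0` are placeholders chosen for illustration.

```python
import json

# A Deployment asking for 3 replicas of one container.
# All names and the image reference below are hypothetical.
deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 3,
        # The selector must match the labels in the pod template below.
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {"name": "web", "image": "example.com/web:1.0"}
                ]
            },
        },
    },
}

manifest_json = json.dumps(deployment, indent=2)
```

You will not be asked to write this on the exam; the takeaway is that the manifest describes what should exist, not the steps to create it.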
Services and Networking
Pods are ephemeral and their IP addresses change when they are recreated. Services provide a stable endpoint to access a set of pods. A Service selects pods based on labels and load-balances traffic to them. Types of Services:
- ClusterIP (default): Exposes the Service on an internal IP in the cluster. Only reachable within the cluster.
- NodePort: Exposes the Service on each Node's IP at a static port (30000-32767). You can access the Service from outside the cluster by requesting <NodeIP>:<NodePort>.
- LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. On Google Kubernetes Engine (GKE), this provisions a Google Cloud TCP/UDP load balancer.
- ExternalName: Maps a Service to a DNS name (e.g., my-service returns a CNAME record).
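Label-based selection is the mechanism that lets a Service keep a stable address while the pods behind it come and go. The sketch below shows the matching rule only, under the assumption that pods are plain dictionaries; `select_pods` and the sample pod names and IPs are hypothetical.

```python
def select_pods(selector, pods):
    """Return the IPs of pods whose labels include every selector key/value."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods; IPs change whenever a pod is recreated.
pods = [
    {"name": "web-1", "ip": "10.0.0.4", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "ip": "10.0.1.7", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "ip": "10.0.2.9", "labels": {"app": "db"}},
]

backends = select_pods({"app": "web"}, pods)
```

If `web-1` is replaced by a pod with a new IP but the same labels, the selection picks it up automatically, and clients of the Service never notice.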
Storage: Volumes and PersistentVolumeClaims
A container's filesystem is ephemeral: files written inside the container are lost when it restarts. Volumes provide storage that survives container restarts, and persistent volume types also outlive the pod itself. Kubernetes supports many volume types (e.g., emptyDir, which lasts only as long as the pod; hostPath; and cloud-specific types such as Google Persistent Disk). For stateful workloads, you use a PersistentVolumeClaim (PVC) to request storage, and the cluster binds it to a PersistentVolume (PV) that has been provisioned by an administrator or dynamically provisioned through a StorageClass.
Configuration and Secrets
To decouple configuration from application code, Kubernetes provides ConfigMaps and Secrets:
- ConfigMap: Stores non-sensitive configuration data as key-value pairs. Pods can consume ConfigMaps as environment variables, command-line arguments, or mounted as files.
- Secret: Similar to ConfigMap but designed for sensitive data (e.g., passwords, API keys). Secret values are base64-encoded, which is an encoding rather than encryption, and can additionally be encrypted at rest. Pods access Secrets similarly to ConfigMaps.
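A common misunderstanding is that base64 encoding protects a Secret. The short Python sketch below shows why it does not: anyone who can read the encoded value can reverse it with no key. The helper names and the sample value `s3cr3t` are made up for illustration.

```python
import base64


def encode_secret_value(plaintext):
    """Encode a value the way it appears in a Secret's data field."""
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded):
    """Reverse the encoding; no key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("s3cr3t")      # hypothetical password
recovered = decode_secret_value(encoded)     # trivially recovered
```

This is why access controls and encryption at rest matter for Secrets: the encoding alone offers no confidentiality.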
Scaling and Autoscaling
Kubernetes supports manual scaling (updating replica count) and autoscaling:
- Horizontal Pod Autoscaler (HPA): Automatically scales the number of pod replicas based on observed CPU/memory utilization or custom metrics. The HPA periodically queries metrics (via Metrics Server) and adjusts the desired replica count of a Deployment or StatefulSet.
- Vertical Pod Autoscaler (VPA): Adjusts resource requests and limits for containers based on usage, but requires pod restart.
- Cluster Autoscaler: Automatically adjusts the number of nodes in the cluster when pods fail to schedule due to resource shortages or when nodes are underutilized.
Kubernetes on Google Cloud: GKE
Google Kubernetes Engine (GKE) is a managed Kubernetes service on Google Cloud. GKE automates cluster creation, upgrades, and node management, and it integrates with other Google Cloud services (Cloud Logging, Cloud Monitoring, Cloud Build, etc.). Key GKE features:
- Autopilot: A mode where Google manages the entire cluster infrastructure, including node provisioning, scaling, and maintenance. You are billed for the resources your pods request rather than for nodes.
- Standard: You manage the node pools and have more control over the underlying VMs.
- Workload Identity: Allows pods to authenticate to Google Cloud APIs using IAM service accounts.
- GKE Ingress: Provides HTTP(S) load balancing using Google Cloud Load Balancer.
Kubernetes and the GCDL Exam
For the GCDL, you do not need to know how to write a Deployment YAML or run kubectl commands. Instead, focus on:
The purpose and benefits of Kubernetes (scalability, resilience, declarative management)
Key concepts: pods, deployments, services, namespaces, and persistent storage
How Kubernetes fits into a cloud strategy (hybrid/multi-cloud, microservices, DevOps)
The difference between Kubernetes and other orchestration tools (e.g., Docker Swarm, Nomad)
GKE as a managed Kubernetes service and its advantages (autopilot, integrated logging/monitoring)
Common exam scenarios include identifying when to use Kubernetes vs. other compute options (Compute Engine, App Engine, Cloud Run), understanding how Kubernetes enables rolling updates and self-healing, and recognizing the role of etcd in cluster state management.
Submit a Deployment manifest
A user or CI/CD pipeline submits a Deployment manifest (YAML or JSON) to the Kubernetes API server using kubectl or a client library. The manifest specifies the desired state: container image, replicas (e.g., 3), resource requests/limits, and update strategy (e.g., rolling update with maxSurge=25% and maxUnavailable=25%). The API server validates the request, stores the object in etcd, and returns a success response.
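The percentages in a rolling update strategy translate into concrete pod counts. As a rough sketch: maxSurge is rounded up and maxUnavailable is rounded down when converted from a percentage. The function name `rolling_update_bounds` is hypothetical, and the sketch ignores the special case where both values resolve to zero.

```python
def rolling_update_bounds(replicas, max_surge_pct=25, max_unavailable_pct=25):
    """Translate rolling-update percentages into pod-count limits."""
    # maxSurge rounds up: ceiling division using integers only.
    max_surge = -(-replicas * max_surge_pct // 100)
    # maxUnavailable rounds down: floor division.
    max_unavailable = replicas * max_unavailable_pct // 100
    return {
        "max_total_pods": replicas + max_surge,
        "min_available_pods": replicas - max_unavailable,
    }


small = rolling_update_bounds(3)    # 3 replicas with the 25%/25% defaults
large = rolling_update_bounds(10)   # 10 replicas with the 25%/25% defaults
```

With 3 replicas, the update may briefly run 4 pods and never drops below 3 available, which is how a rolling update avoids downtime on a small Deployment.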
Scheduler assigns pods to nodes
The Deployment controller creates a ReplicaSet object, which in turn creates pods from the Deployment's pod template. The kube-scheduler watches for unscheduled pods (those with no node assigned). It filters nodes based on resource requirements (CPU, memory), node affinity/anti-affinity, and other constraints. It then scores remaining nodes using a priority function (e.g., least resource utilization). The highest-scoring node is selected, and the scheduler writes the binding to the API server, updating the pod's nodeName field.
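The filter-then-score idea can be sketched in a few lines of Python. This is a simplified illustration, not the real kube-scheduler: it considers only CPU and memory, and its scoring rule (prefer the node with the largest share of resources left free after placement) is one assumed stand-in for the scheduler's priority functions. The function and field names are hypothetical.

```python
def schedule(pod, nodes):
    """Pick a node name for the pod, or None if no node has room."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n["cpu_free"] >= pod["cpu"] and n["mem_free"] >= pod["mem"]
    ]
    if not feasible:
        return None  # the pod stays unscheduled (Pending)

    # Score: average fraction of CPU and memory still free after placement.
    def score(n):
        cpu_left = (n["cpu_free"] - pod["cpu"]) / n["cpu_total"]
        mem_left = (n["mem_free"] - pod["mem"]) / n["mem_total"]
        return (cpu_left + mem_left) / 2

    return max(feasible, key=score)["name"]


# Hypothetical cluster: CPU in cores, memory in GB.
nodes = [
    {"name": "node-a", "cpu_total": 4, "cpu_free": 0.5, "mem_total": 16, "mem_free": 10},
    {"name": "node-b", "cpu_total": 4, "cpu_free": 2, "mem_total": 16, "mem_free": 8},
    {"name": "node-c", "cpu_total": 4, "cpu_free": 3, "mem_total": 16, "mem_free": 12},
]
chosen = schedule({"cpu": 1, "mem": 2}, nodes)
```

Here node-a is filtered out for lack of CPU, and node-c wins the scoring because it would remain the least utilized.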
Kubelet launches containers
On the assigned node, the kubelet agent sees the new pod binding. It pulls the container image from the registry (e.g., Artifact Registry, Docker Hub) if not already cached. It then uses the container runtime to create and start the containers within the pod. The kubelet reports pod status back to the API server (e.g., the Pending and Running phases and the Ready condition). It also executes liveness and readiness probes periodically to check container health.
Service routes traffic to pods
A Service object with a label selector (e.g., app: myapp) is created. The endpoint controller watches for pods matching the selector and updates the Endpoints object with the pod IPs and ports. kube-proxy on each node programs iptables or IPVS rules to forward traffic from the Service's ClusterIP to the selected pods. For a LoadBalancer Service, GKE provisions a Google Cloud load balancer and updates its backend with the node IPs and NodePort.
Autoscaling adjusts replicas
The Horizontal Pod Autoscaler (HPA) periodically queries the Metrics Server (or custom metrics API) to get CPU/memory utilization of pods. If the current utilization exceeds the target (e.g., 80% CPU), the HPA calculates the desired replica count (ceil(currentUtilization/targetUtilization * currentReplicas)). It then updates the Deployment's replica count via the API server. The Deployment controller then creates or removes pods accordingly.
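The replica calculation above is simple enough to check by hand or in a few lines of Python. The sketch below applies the same formula and then clamps the result to a minimum and maximum replica count; the function name and the default bounds of 1 and 10 are assumptions for illustration, and the real HPA also applies a tolerance and stabilization behavior that this omits.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     min_replicas=1, max_replicas=10):
    """Apply ceil(currentUtilization / targetUtilization * currentReplicas)."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    # Keep the result within the configured bounds (hypothetical defaults).
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(3, 120, 80)    # pods at 120% CPU against an 80% target
scale_down = desired_replicas(4, 40, 80)   # pods at 40% CPU against an 80% target
```

In the first case 3 replicas become 5 (4.5 rounded up); in the second, 4 replicas shrink to 2.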
Enterprise Scenario 1: Microservices Migration
A large financial services company with a monolithic application running on VMs wants to migrate to microservices on GKE. They have 50+ microservices, each requiring independent scaling, deployment, and health management. Using Kubernetes, they define each microservice as a Deployment with 3 replicas. They use a Service of type LoadBalancer for external-facing APIs and ClusterIP for internal communication. They set up HPA based on CPU utilization (target: 70%) to handle traffic spikes during market hours. They use ConfigMaps for configuration and Secrets for database credentials (encrypted at rest). In production, they run a cluster with 20 nodes (n1-standard-4) and observe 99.95% uptime. Misconfiguration: initially, they set resource requests too low, so the scheduler packed too many pods onto each node and the services competed for CPU, causing latency; after adjusting requests to match actual usage, performance stabilized.
Enterprise Scenario 2: CI/CD Pipeline for a SaaS Platform
A SaaS startup uses GitLab CI to build container images and push them to Artifact Registry. On every merge to main, GitLab triggers a deployment to a staging GKE cluster. The deployment uses a rolling update strategy with maxSurge=1 and maxUnavailable=0 (zero downtime). They use readiness probes to ensure new pods accept traffic before old pods are terminated. In production, they have a multi-cluster setup: one cluster per region (US, EU, Asia) behind a global HTTP(S) load balancer. They use GKE Autopilot to avoid managing nodes. Common issue: a readiness probe that starts checking too early or times out too quickly (e.g., a 1-second timeout) marks slow-starting pods as not ready, leading to deployment failures; they increased initialDelaySeconds to 30 seconds.
Enterprise Scenario 3: Stateful Application - Cassandra Database
A media company runs a Cassandra cluster on GKE using StatefulSets. Each pod has a persistent volume (SSD persistent disk) with ReadWriteOnce access mode. They use headless Service for DNS-based pod discovery (each pod gets a stable hostname like cassandra-0.cassandra.default.svc.cluster.local). They scale by adding pods manually (Cassandra handles rebalancing). Performance: with 10 nodes, each with 4 vCPUs and 15GB memory, they handle 50k writes/sec. Misconfiguration: setting podManagementPolicy to Parallel instead of OrderedReady caused race conditions during cluster bootstrap; they switched to OrderedReady to ensure pods start sequentially.
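The stable identities in this scenario follow a predictable pattern, which is what makes DNS-based discovery work. The sketch below generates the hostnames for a StatefulSet behind a headless Service and the order in which pods are removed on scale-down (highest ordinal first). The function names are hypothetical, and `cluster.local` is assumed as the cluster domain.

```python
def stateful_pod_hostnames(statefulset, service, namespace, replicas,
                           cluster_domain="cluster.local"):
    """Build the stable DNS name of each pod in a StatefulSet."""
    return [
        f"{statefulset}-{i}.{service}.{namespace}.svc.{cluster_domain}"
        for i in range(replicas)
    ]


def scale_down_order(statefulset, replicas, target):
    """List the pods removed when scaling down, highest ordinal first."""
    return [f"{statefulset}-{i}" for i in range(replicas - 1, target - 1, -1)]


hosts = stateful_pod_hostnames("cassandra", "cassandra", "default", 3)
removed = scale_down_order("cassandra", 5, 3)
```

Because `cassandra-0` always maps to the same name and the same volume, a replacement pod rejoins the database ring as the same member rather than as a stranger.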
What GCDL Tests on Kubernetes (Objective 2.2)
The GCDL exam focuses on high-level understanding of Kubernetes concepts and their business value. Specific tested areas:
- Purpose of Kubernetes: Container orchestration, automated deployment, scaling, and management.
- Pods vs. Deployments: Pods are the smallest unit; Deployments manage replica sets and provide rolling updates.
- Services: How they provide stable networking to ephemeral pods (ClusterIP, NodePort, LoadBalancer).
- GKE: Managed Kubernetes on Google Cloud; Autopilot vs. Standard modes.
- Benefits: Self-healing, scalability, portability across clouds, declarative configuration.
- Use cases: Microservices, batch processing, hybrid cloud, CI/CD.
Common Wrong Answers and Why
Choosing 'Kubernetes is a container runtime': Kubernetes is an orchestrator, not a runtime. Docker/containerd are runtimes. Candidates confuse the layers.
Selecting 'Pods are the same as containers': A pod can contain multiple containers, but typically one. Pods are the atomic unit of scheduling.
Thinking Kubernetes replaces VMs: Kubernetes runs on top of VMs (or bare metal) and manages containers. It does not replace the underlying infrastructure.
Assuming GKE requires manual node management: GKE Autopilot is fully managed; Standard gives more control but requires node pool management.
Specific Numbers and Terms to Memorize
NodePort range: 30000-32767.
Default replica count: 1 (if not specified).
Rolling update defaults: maxSurge=25% (rounded up), maxUnavailable=25% (rounded down).
HPA default metrics: CPU utilization (target percentage).
etcd: Uses port 2379 for client communication, 2380 for peer communication.
kubelet: Port 10250 (HTTPS).
Edge Cases and Exceptions
DaemonSet vs. Deployment: DaemonSets run one pod per node, useful for logging agents; Deployments run a specified number of replicas regardless of nodes.
StatefulSet scaling: Not as flexible as Deployments; scaling down deletes pods in reverse order.
PersistentVolume reclaim policy: Can be Retain, Delete, or the deprecated Recycle; exam may test that Retain preserves data after PVC deletion.
Resource quotas: Namespace-level limits on total CPU/memory and object counts.
How to Eliminate Wrong Answers
If an answer mentions 'Kubernetes manages virtual machines', it's wrong—Kubernetes manages containers.
If an answer says 'Pods are automatically recreated by default', it's wrong—only controllers like Deployment do that.
If an answer suggests 'GKE Autopilot requires you to manage nodes', it's wrong—Autopilot abstracts nodes.
Look for keywords: 'declarative', 'desired state', 'self-healing', 'orchestration'—these are hallmarks of Kubernetes.
Kubernetes is an open-source container orchestration platform that automates deployment, scaling, and management of containerized applications.
A pod is the smallest deployable unit; it runs one or more containers and is ephemeral.
Deployments manage replica sets and provide declarative updates and self-healing for pods.
Services provide stable networking to pods, with types ClusterIP, NodePort, LoadBalancer, and ExternalName.
GKE is Google's managed Kubernetes service; Autopilot mode fully manages the cluster infrastructure.
Kubernetes uses a declarative model: you specify desired state, and controllers continuously work to achieve it.
The control plane components include API server, etcd, scheduler, and controller manager; nodes run kubelet, kube-proxy, and container runtime.
Common exam terms: pod, deployment, service, namespace, ConfigMap, Secret, HPA, StatefulSet, DaemonSet.
These come up on the exam all the time. Here's how to tell them apart.
Kubernetes
More complex setup but highly extensible and widely adopted.
Supports advanced features like autoscaling, rolling updates, and self-healing.
Declarative configuration via YAML/JSON manifests.
Strong ecosystem and community support.
Multi-cloud and hybrid cloud capable.
Docker Swarm
Simpler to set up and use, tightly integrated with Docker.
Limited features compared to Kubernetes; no built-in autoscaling.
Imperative commands primarily; less flexible.
Smaller ecosystem and declining adoption.
Primarily used in single-cloud or small deployments.
Mistake
Kubernetes is a container runtime like Docker.
Correct
Kubernetes is a container orchestration platform. It uses a container runtime (e.g., Docker, containerd) to actually run containers. The runtime is a lower-level component that Kubernetes manages.
Mistake
A pod is the same as a container.
Correct
A pod is the smallest deployable unit in Kubernetes and can contain one or more containers that share the same network namespace and storage. Most pods run a single container, but they are not synonymous.
Mistake
Kubernetes automatically persists data across pod restarts.
Correct
Pods are ephemeral; their local storage is lost on restart. To persist data, you must use volumes (e.g., PersistentVolumeClaims) that survive pod restarts.
Mistake
GKE Standard mode is completely managed—you don't need to manage nodes.
Correct
GKE Standard requires you to manage node pools (e.g., upgrade, resize). GKE Autopilot is the fully managed mode where Google handles node infrastructure.
Mistake
Kubernetes only runs on Google Cloud.
Correct
Kubernetes is open-source and runs on any cloud or on-premises. GKE is Google's managed Kubernetes service, but Kubernetes itself is cloud-agnostic.
A pod is the smallest unit in Kubernetes, representing one or more containers. A deployment is a higher-level resource that manages pods, ensuring a specified number of replicas are running and supporting rolling updates and rollbacks. You rarely create pods directly; instead, you use deployments to manage them.
Kubernetes uses rolling updates, where new pods are created gradually while old pods are terminated. The deployment controller ensures that at most a specified number of pods are unavailable (maxUnavailable) and at most a specified number of extra pods are created (maxSurge). Readiness probes ensure new pods are ready to serve traffic before old ones are removed.
etcd is a distributed key-value store that stores all cluster data, including configuration, state, and metadata. Only the API server communicates directly with etcd. It is critical for cluster consistency and recovery; if etcd is unavailable, workloads that are already running keep running, but the control plane cannot record or act on changes until etcd is restored.
Use Autopilot if you want fully managed infrastructure—Google handles node provisioning, scaling, and maintenance. You pay per pod. Use Standard if you need fine-grained control over node configurations, such as custom machine types, GPU nodes, or specific node taints/tolerations. Standard requires you to manage node pools.
Kubernetes supports manual scaling (updating replica count) and autoscaling via Horizontal Pod Autoscaler (HPA), which adjusts replicas based on CPU/memory or custom metrics. Cluster Autoscaler can add/remove nodes to accommodate pod scheduling demands.
Namespaces provide a way to partition a single cluster into multiple virtual clusters. They are used to organize resources, apply resource quotas, and enforce access controls. Default namespace is used if not specified. Common use cases: separating environments (dev, prod) or teams.
Kubernetes uses volumes that are mounted into pods. For persistent storage, you create a PersistentVolumeClaim (PVC) that requests storage from a PersistentVolume (PV) or dynamic provisioner. The PVC outlives the pod, so data persists across pod restarts. StatefulSets use this for stateful applications.