Which command shows CPU and memory usage of nodes in the cluster?
Displays CPU and memory usage for nodes.
Why this answer
kubectl top nodes displays resource usage for nodes, provided the Metrics Server is installed.
Which THREE of the following are valid taint effects that can be applied to a node? (Select 3)
Valid taint effect.
Why this answer
Taint effects in Kubernetes control pod scheduling onto nodes. `NoSchedule` (C) prevents pods that do not tolerate the taint from being scheduled on the node, making it a valid taint effect.
Exam trap
CNCF often tests the exact spelling of taint effects, and candidates confuse `PreferNoSchedule` with `PreferSchedule` or invent effects like `NeverSchedule` that do not exist in the Kubernetes API.
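The spelling check described above can be sketched in a few lines. This is only an illustration: the helper name `is_valid_taint_effect` is hypothetical, and the set simply lists the three effects the Kubernetes API defines (`NoSchedule`, `PreferNoSchedule`, `NoExecute`).

```python
# The three taint effects defined by the Kubernetes API (case-sensitive).
VALID_TAINT_EFFECTS = {"NoSchedule", "PreferNoSchedule", "NoExecute"}


def is_valid_taint_effect(effect: str) -> bool:
    """Hypothetical helper: True only for an exact, case-sensitive match."""
    return effect in VALID_TAINT_EFFECTS


# Two real effects followed by two look-alikes mentioned in the exam trap.
for candidate in ["NoSchedule", "PreferNoSchedule", "PreferSchedule", "NeverSchedule"]:
    print(candidate, is_valid_taint_effect(candidate))
```

Only the first two candidates pass; the invented names fail the exact match.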
An administrator wants to expose an application running in a pod on port 3000 using a Service of type LoadBalancer. Which command creates the Service?
Correct command.
Why this answer
The correct command is 'kubectl expose pod my-pod --type=LoadBalancer --port=80 --target-port=3000'. This creates a LoadBalancer Service mapping port 80 to the pod's port 3000.
What is the purpose of the kube-proxy component in a Kubernetes cluster?
kube-proxy implements the service abstraction by maintaining iptables or IPVS rules.
Why this answer
Option A is correct because kube-proxy is the component responsible for implementing the Kubernetes Service concept by maintaining network rules on each node. It forwards connections and load-balances across Service endpoints using iptables or IPVS rules (the legacy userspace mode was removed in Kubernetes 1.26), ensuring that traffic destined for a Service's ClusterIP is routed to ready Pods.
Exam trap
The trap here is that candidates confuse kube-proxy with the kubelet or kube-scheduler. The kubelet and kube-proxy both run on every node, while kube-scheduler is a control-plane component, but kube-proxy's sole purpose is network proxying and Service load balancing, not pod management or scheduling.
How to eliminate wrong answers
Option B is wrong because scheduling pods to nodes is the responsibility of the kube-scheduler, not kube-proxy. Option C is wrong because managing the lifecycle of pods (creation, monitoring, restart) is handled by the kubelet, not kube-proxy. Option D is wrong because storing cluster configuration data is the function of etcd, a distributed key-value store; kube-proxy does not persist any state.
A pod is in ImagePullBackOff. Which TWO of the following are possible causes? (Select 2)
Without credentials, pulling from a private registry fails.
Why this answer
Common causes: invalid image tag (typo) and authentication failure when the image is in a private registry.
You want to configure NetworkPolicy to allow ingress traffic only from pods with label 'role: frontend' in the same namespace. Which podSelector should be in the ingress rule?
The ingress rule's from section uses podSelector to select allowed source pods.
Why this answer
The from section's podSelector selects pods that are allowed as sources. So podSelector: matchLabels: role: frontend means only pods with that label can access the destination pods.
You have a pod that is in 'CrashLoopBackOff' state. Which command should you use to view the logs from the previous instance of the container?
The --previous flag retrieves logs from the previous (crashed) container.
Why this answer
Option C is correct. The '--previous' flag shows logs from the previous (crashed) container instance. Option A shows current logs. Option B shows events. Option D describes the pod.
Which of the following is a required field when defining a PersistentVolume?
capacity is required to define the size of the PV.
Why this answer
A PV must specify its capacity (the storage amount). It is not the only required field (accessModes is required too), but it is the one this question asks for.
You have a PVC that is bound to a PV with a filesystem volume mode. You want to use the volume as a block device in a pod. What should you do?
You need to create a new PVC with Block mode and a corresponding PV.
Why this answer
To use a volume as a block device, the PV and PVC must be created with volumeMode: Block. You cannot change the mode after creation; you must create a new PVC with the correct mode.
Which of the following Service types exposes a Service on a static port on each node's IP address?
NodePort exposes the Service on each node's IP at a static port.
Why this answer
NodePort exposes the Service on a static port (30000-32767 by default) on each node's IP address. ClusterIP is internal only, LoadBalancer provides an external load balancer, and ExternalName maps to an external DNS name.
Which of the following is true about IngressClass resources?
The IngressClass resource is the preferred way to specify the ingress controller.
Why this answer
IngressClass is a cluster-scoped resource that defines which ingress controller should implement an Ingress. It is referenced by the `ingressClassName` field in the Ingress spec.
You are troubleshooting a node that is 'NotReady'. Which THREE of the following are possible causes? (Choose three.)
If the kubelet loses connectivity to the API server, it cannot report its status, and the node will eventually become NotReady.
Why this answer
Options B, C, and E are correct. A node becomes NotReady when the kubelet stops reporting a healthy status. Option B: If the kubelet service is stopped, the node will become NotReady. Option C: If the network plugin (e.g., Calico, Flannel) is down, the node may become NotReady because the kubelet reports the node's pod network as not ready. Option E: If the kubelet cannot contact the API server (e.g., due to network issues), it can't report its status, leading to NotReady. Option A (disk pressure) sets a node condition such as DiskPressure but does not directly cause NotReady; the node can still be Ready with pressure conditions. Option D (a pod using too much memory) does not make the node NotReady; it might lead to an OOMKilled container or resource pressure, but the node status does not change to NotReady.
A pod is stuck in Pending state. You run 'kubectl describe pod my-pod' and see the event: '0/4 nodes are available: 1 node(s) had taint {node.kubernetes.io/not-ready: }, 3 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate'. What is the most likely cause?
The event explicitly states the pod didn't tolerate the control-plane taint.
Why this answer
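The event shows that one node is NotReady (taint node.kubernetes.io/not-ready) and the other three nodes carry the control-plane taint, which the pod does not tolerate, so no node is eligible. Bringing the NotReady node back to Ready, or adding a toleration for the control-plane taint, would let the pod schedule.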
You are a platform engineer managing a multi-tenant Kubernetes cluster. A development team deploys a StatefulSet for a database with the following configuration: 3 replicas, headless service 'db-headless' for DNS-based discovery, and a regular ClusterIP service 'db' for read/write operations. The cluster uses Calico CNI with default NetworkPolicy enforcement. The team reports that applications in the same namespace can connect to the ClusterIP service but cannot connect to individual pod DNS names (e.g., db-0.db-headless.namespace.svc.cluster.local). You verify that the DNS resolution works (nslookup returns the pod IP). However, a curl to the pod IP on the database port (5432) times out. You check the endpoints and they are correct. Which action should you take to resolve the connectivity issue?
This policy would explicitly allow direct pod-to-pod traffic, which is currently blocked by a default-deny policy.
Why this answer
The issue is that DNS resolution works (pod IPs are returned) but direct pod IP connectivity fails, which points to a network policy blocking traffic. Since Calico CNI enforces NetworkPolicies by default, and no ingress rule allows traffic to the database pods on port 5432, the connection times out. Creating a NetworkPolicy that permits ingress from all pods in the namespace to the database pods on port 5432 resolves the connectivity problem.
Exam trap
The trap here is that candidates assume DNS resolution success implies full connectivity, overlooking that NetworkPolicy can block traffic at the pod IP level even when DNS works correctly.
How to eliminate wrong answers
Option A is wrong because CoreDNS is functioning correctly (DNS resolution returns pod IPs), so changing the DNS policy would not fix the connectivity issue at the network layer. Option C is wrong because removing the headless service would break DNS-based pod discovery, which is required for the StatefulSet's stable network identities, and the problem is not with service type but with network policy blocking traffic. Option D is wrong because changing the ClusterIP service to NodePort does not affect pod-to-pod connectivity; it only exposes the service externally on node ports, and the issue is internal pod IP connectivity blocked by network policy.
A pod has resource requests: cpu: 250m, memory: 128Mi. The node has 2 CPU cores and 4Gi memory. What is the maximum number of such pods that can fit on this node based solely on CPU requests?
8 pods * 250m = 2000m, exactly filling the CPU.
Why this answer
CPU request per pod is 250m = 0.25 CPU. Node has 2 CPUs = 2000m. 2000m / 250m = 8 pods.
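The arithmetic above can be checked with a short sketch. It assumes, as the question does, that the node's full 2 CPUs are available to pods (on a real node the allocatable CPU is usually somewhat lower), and the helper names are hypothetical.

```python
def cpu_to_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as '250m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def max_pods_by_cpu(node_cpu: str, pod_cpu_request: str) -> int:
    """How many pods fit when only CPU requests are considered."""
    return cpu_to_millicores(node_cpu) // cpu_to_millicores(pod_cpu_request)


print(max_pods_by_cpu("2", "250m"))  # 8
```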
Your team is deploying a new application that consists of a web frontend and a backend API. The frontend must be accessible from outside the cluster, and the backend should only be accessible from within the cluster. The cluster has multiple namespaces: 'frontend' and 'backend'. You have been asked to design the deployment. The frontend Deployment should have 5 replicas, and the backend Deployment should have 3 replicas. Additionally, you need to ensure that the frontend pods can communicate with the backend pods using a stable DNS name. You also want to isolate the backend from other namespaces. Which set of resources should you create?
LoadBalancer exposes frontend externally, ClusterIP for internal backend, and NetworkPolicy restricts backend access.
Why this answer
Option D is correct because it uses a LoadBalancer Service for the frontend to provide external access, a ClusterIP Service for the backend to restrict access to within the cluster, and a NetworkPolicy that allows ingress traffic from the frontend namespace to the backend while denying all other ingress, thus isolating the backend. This ensures the frontend pods can reach the backend via a stable DNS name (the ClusterIP Service's DNS name) and meets the requirement of backend isolation from other namespaces.
Exam trap
The trap here is that candidates often forget that a ClusterIP Service cannot be accessed from outside the cluster, and they may incorrectly choose a LoadBalancer or NodePort for the backend, or omit the NetworkPolicy needed to enforce isolation.
How to eliminate wrong answers
Option A is wrong because it uses a NodePort Service for the frontend, which exposes the frontend on a high port on every node but does not provide a stable external endpoint like a LoadBalancer, and it lacks a NetworkPolicy to isolate the backend. Option B is wrong because it uses a ClusterIP Service for the frontend, which does not make the frontend accessible from outside the cluster, violating the requirement. Option C is wrong because it uses a LoadBalancer Service for the backend, which exposes the backend externally, contradicting the requirement that the backend should only be accessible from within the cluster.
You need to implement a PriorityClass named 'high-priority' with value 1000 and mark it as non-preempting. Which YAML field should you set?
Setting preemptionPolicy to Never prevents pods using this PriorityClass from preempting other pods.
Why this answer
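Option A is correct because setting `preemptionPolicy: Never` in a PriorityClass definition marks it as non-preempting, meaning pods with this priority will not preempt lower-priority pods even if they have a higher priority value. This field is the only one that controls preemption behavior for a PriorityClass. Note that PriorityClass has no `spec` block: `preemptionPolicy`, `value`, `globalDefault`, and `description` are top-level fields, even though the options write them with a `spec.` prefix.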
Exam trap
CNCF often tests the distinction between `spec.value` (which sets priority) and `spec.preemptionPolicy` (which controls preemption behavior), leading candidates to mistakenly think that a high priority value alone implies preemption or that `spec.globalDefault` affects preemption.
How to eliminate wrong answers
Option B is wrong because `spec.description` is a free-text field for human-readable notes and has no effect on preemption behavior. Option C is wrong because `spec.globalDefault: true` sets this PriorityClass as the default for all pods that do not specify a priorityClassName, but it does not affect preemption. Option D is wrong because `spec.value: 1000` sets the priority value (higher numbers indicate higher priority) but does not control preemption; preemption is governed by the `preemptionPolicy` field.
An administrator runs 'kubectl drain node01 --ignore-daemonsets --force' to prepare node01 for maintenance. However, a pod running a critical application is evicted and becomes unschedulable. Which flag could prevent eviction of that specific pod?
Setting --delete-local-data=false prevents eviction of pods with local storage, protecting critical pods that use local data.
Why this answer
Option C is the intended answer. With `--delete-local-data=false` (the default; recent kubectl versions deprecate this flag in favor of `--delete-emptydir-data`), `kubectl drain` refuses to evict pods that use emptyDir volumes and aborts with an error instead of deleting their data. `--ignore-daemonsets` only skips DaemonSet-managed pods, and `--force` only permits deletion of pods that are not managed by a controller; neither overrides the local-storage check. Keeping this flag false therefore keeps a pod with local data (like the critical application) from being evicted.
Exam trap
The trap here is that candidates often read `--delete-local-data` as a flag that deletes data on the node, when in fact it controls whether pods using local storage (like emptyDir) may be evicted. Its default is false, and `--force` does not change that.
How to eliminate wrong answers
Option A is wrong because `--grace-period=0` forces immediate termination of pods without a graceful shutdown period, which does not prevent eviction; it makes eviction more aggressive. Option B is wrong because `--pod-selector='app=critical'` limits the drain to pods matching that label, so it would target the critical pod rather than exclude it. Option D is wrong because `--evict-unscheduled-pods` is not a valid flag for `kubectl drain`, and evicting pods that have not been scheduled to a node would be irrelevant to protecting a running critical pod.
Which TWO statements about Headless Services are correct?
Correct. Setting clusterIP: None makes the service headless.
Why this answer
Options B and C are correct. A Headless Service (clusterIP: None) does not have a cluster IP; DNS returns the IPs of the selected pods directly, which lets clients discover all pod IPs. Option A is false: DNS returns all pod IPs, and a client can still balance across them if it chooses to. Option D is false: Headless Services still normally use a selector to define endpoints. Option E is false: They can be used for StatefulSets, but they do not provide a stable network identity; StatefulSet provides that.
A pod is stuck in Pending state. 'kubectl describe pod' shows '0/2 nodes are available: 1 node(s) had taint that the pod didn't tolerate, 1 node(s) didn't match pod anti-affinity rules'. What should you check?
Why this answer
The events clearly indicate taint toleration and anti-affinity issues; checking tolerations and affinity rules is the right step.
A developer created a ServiceAccount named 'app-sa' in the 'dev' namespace. They want a pod to use this ServiceAccount. Which field in the pod spec should be set?
Correct field.
Why this answer
Option C is correct because the `spec.serviceAccountName` field in a Pod spec is the standard way to assign a specific ServiceAccount to a Pod. When this field is set, the Pod's containers will use the token of that ServiceAccount for API authentication. If omitted, the Pod defaults to the `default` ServiceAccount in its namespace.
Exam trap
The trap here is that candidates may confuse the deprecated `spec.serviceAccount` field (a legacy alias that the API still accepts) with the correct `spec.serviceAccountName`, or invent a non-existent field like `spec.accountName` due to similarity with other Kubernetes resource specs.
How to eliminate wrong answers
Option A is wrong because `spec.serviceAccount` is a deprecated alias for `spec.serviceAccountName` and should not be used in new manifests. Option B is wrong because `spec.authentication.serviceAccount` is not a valid Kubernetes Pod spec field; authentication is handled via the ServiceAccount token, not a nested `authentication` object. Option D is wrong because `spec.accountName` is not a recognized field in the Pod spec; the correct field name is `serviceAccountName`.
A CKA candidate runs 'kubectl get nodes' and sees that a worker node is in the 'NotReady' state. Which command should be used to diagnose the node's kubelet health?
Why this answer
The kubelet is the primary node agent that registers the node with the cluster and reports its status via periodic heartbeats. When a node is NotReady, the most direct way to diagnose kubelet health is to inspect its systemd unit logs using 'journalctl -u kubelet', which shows startup errors, certificate issues, or resource exhaustion that prevent the kubelet from functioning correctly.
Exam trap
CNCF often tests the misconception that the kubelet runs as a Kubernetes pod (like kube-apiserver) and can be debugged with 'kubectl logs', when in reality it is a systemd service on the node, requiring OS-level commands like 'journalctl' or 'systemctl'.
How to eliminate wrong answers
Option A is wrong because 'systemctl status docker' checks the Docker container runtime, not the kubelet; while a runtime failure can cause node issues, the question specifically asks for diagnosing kubelet health. Option B is wrong because 'kubectl get events --all-namespaces' shows cluster-wide events (e.g., pod scheduling failures) but does not provide the kubelet's own log output or system-level errors. Option D is wrong because 'kubectl logs -n kube-system kubelet-<node>' attempts to access a pod named 'kubelet-<node>', but the kubelet does not run as a pod in the kube-system namespace; it runs as a systemd service on the node, so this command would fail with a 'not found' error.
Which TWO of the following volume types are typically used for sharing data between containers in the same pod?
hostPath can be used for sharing data between containers on the same node, but is less common.
Why this answer
emptyDir and hostPath can be used for sharing data between containers in a pod. ConfigMap and Secret are for configuration, not general data sharing. NFS is a network volume typically used across pods.
A node has a taint 'gpu=true:NoSchedule'. A pod has a toleration 'key: gpu, operator: Exists, effect: NoSchedule'. Will the pod be scheduled on the node?
Toleration matches the taint, so pod can be scheduled.
Why this answer
The toleration matches the taint (key gpu, effect NoSchedule); with operator Exists, the taint's value is not compared. The pod tolerates the taint, so the node is eligible and the pod can be scheduled on it.
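The matching rule can be sketched as a small function. This is a simplified model rather than the scheduler's actual code: the dictionary shapes and the name `tolerates` are hypothetical, and API validation details (for example, an empty key being valid only with `Exists`) are not enforced.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Simplified check of whether one toleration matches one taint."""
    # An empty effect in the toleration matches every taint effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    # An empty key matches every taint key (meaningful with operator Exists).
    if toleration.get("key") and toleration["key"] != taint["key"]:
        return False
    operator = toleration.get("operator", "Equal")  # Equal is the default
    if operator == "Exists":
        return True  # the taint's value is ignored
    if operator == "Equal":
        return toleration.get("value", "") == taint.get("value", "")
    return False


taint = {"key": "gpu", "value": "true", "effect": "NoSchedule"}
toleration = {"key": "gpu", "operator": "Exists", "effect": "NoSchedule"}
print(tolerates(toleration, taint))  # True
```

With `operator: Equal`, the same toleration would also need `value: "true"` to match this taint.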
A Job named 'data-processor' completes successfully. You want to run it again with the same configuration. What is the correct way to rerun the Job?
Jobs are immutable; to rerun, you must delete and recreate.
Why this answer
Option B is correct because a completed Job in Kubernetes is immutable and cannot be rerun by editing or restarting it. The only way to execute the same Job again is to delete the existing Job and recreate it with the same configuration, as Jobs are designed to run to completion and are not intended to be restarted like Deployments.
Exam trap
The trap here is that candidates confuse Jobs with Deployments or other controllers that support rolling updates and restarts, leading them to incorrectly apply commands like 'kubectl rollout restart' or assume editing the template will trigger a new run.
How to eliminate wrong answers
Option A is wrong because editing a completed Job's template does not trigger a new run; the Job's pod template is immutable after creation, and changes require deletion and recreation. Option C is wrong because 'kubectl rollout restart' is a command for Deployments, DaemonSets, and StatefulSets, not for Jobs, which do not support rolling updates or restarts. Option D is wrong because 'kubectl rerun' is not a valid kubectl command; Kubernetes does not provide a built-in command to rerun a Job.
Which TWO of the following are valid access modes for a PersistentVolume in Kubernetes? (Select two.)
RWO allows a single node to mount the volume in read-write mode.
Why this answer
ReadWriteOnce (RWO) and ReadWriteMany (RWX) are standard access modes. ReadWriteOncePod (RWOP) is also valid but not listed here. The other options are not valid access modes.
You have a pod that is CrashLoopBackOff. The logs show 'error: dial tcp: lookup service.default.svc.cluster.local: no such host'. What is the most likely cause?
Why this answer
The error message 'no such host' indicates that the DNS lookup for 'service.default.svc.cluster.local' failed because the hostname does not exist. In Kubernetes, this FQDN resolves only if a Service named 'service' exists in the 'default' namespace. Since the lookup fails with 'no such host', the most likely cause is that the Service does not exist, not a DNS infrastructure issue.
Exam trap
The trap here is that candidates often assume any DNS error means CoreDNS is down, but the specific 'no such host' message points to a missing DNS record, not a DNS service failure.
Why the other options are wrong
While CoreDNS being down could cause this, the error is about a specific service name not found, not a generic DNS failure. More likely the service doesn't exist.
If DNS policy was None, the pod would not even attempt cluster DNS; the error shows it tried but failed.
Network policies block traffic, but a blocked DNS query would most likely time out or be refused rather than return 'no such host'.
A pod's YAML specifies 'restartPolicy: Never' and the container exits with code 0. What state will the pod be in?
A completed pod with exit code 0 is Succeeded.
Why this answer
When a pod's restartPolicy is set to 'Never' and its container exits with code 0 (indicating a successful termination), the pod transitions to the 'Succeeded' phase. This is because Kubernetes treats a zero exit code as a successful completion, and with restartPolicy: Never, no restart is attempted, leaving the pod in a terminal Succeeded state.
Exam trap
The trap here is that candidates confuse the 'Completed' status from kubectl get pods output (which is a human-readable shorthand) with the actual pod phase 'Succeeded', or they assume any exit means 'Failed' regardless of the exit code.
How to eliminate wrong answers
Option A is wrong because 'Completed' is not a valid Kubernetes pod phase; the correct phase for a successful termination is 'Succeeded'. Option B is wrong because 'Failed' applies only when the container exits with a non-zero exit code, not code 0. Option D is wrong because 'Running' indicates the container is still executing, but here the container has already exited.
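The interaction between restartPolicy and the exit code discussed above can be modeled as a small decision function. This is a sketch for a single-container pod only; the function name `terminal_phase` is hypothetical, and `None` stands for "the kubelet restarts the container, so the pod does not reach a terminal phase".

```python
from typing import Optional


def terminal_phase(restart_policy: str, exit_code: int) -> Optional[str]:
    """Terminal phase of a single-container pod after its container exits."""
    if restart_policy == "Always":
        return None  # restarted regardless of exit code
    if exit_code == 0:
        return "Succeeded"  # applies to both Never and OnFailure
    if restart_policy == "OnFailure":
        return None  # a non-zero exit triggers a restart
    return "Failed"  # Never with a non-zero exit code


print(terminal_phase("Never", 0))  # Succeeded
```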
You are troubleshooting a pod that cannot start. Running 'kubectl describe pod' shows the event: 'Failed to pull image "myregistry.io/myapp:1.0": rpc error: code = Unknown desc = Error response from daemon: manifest for myregistry.io/myapp:1.0 not found'. What is the MOST likely cause?
The specific error 'manifest not found' indicates the tag does not exist.
Why this answer
Correct answer is B. The error indicates the image tag does not exist in the registry. Option A would show authentication errors. Option C would show connection errors. Option D would show digest mismatch errors.
Which TWO of the following are valid ways to expose a set of pods as a network service in Kubernetes?
Ingress provides HTTP/HTTPS routing to Services, thus exposing them externally.
Why this answer
Ingress is correct because it provides HTTP/HTTPS routing to services based on hostnames or paths, exposing a set of pods externally via a single entry point. It relies on an Ingress controller (e.g., NGINX, Traefik) to implement the rules defined in the Ingress resource, making it a valid way to expose pods as a network service.
Exam trap
The trap here is that candidates often confuse a Deployment with a Service, thinking a Deployment alone can expose pods externally, but a Deployment only manages pod replicas and requires a Service or Ingress for network exposure.
What is the default kube-proxy mode in Kubernetes v1.29?
iptables is the default mode.
Why this answer
The default kube-proxy mode is iptables, though ipvs is also available.
Which Service type exposes a Service externally via a cloud provider's load balancer?
LoadBalancer creates an external load balancer (e.g., ELB) that routes to the Service.
Which of the following volume types provides ephemeral storage that shares the pod's lifecycle and is initially empty?
emptyDir is ephemeral and starts empty.
Why this answer
emptyDir volumes are created when a pod is assigned to a node and exist as long as the pod runs. They start empty.
A pod is stuck in 'Pending' state. You run 'kubectl describe pod mypod' and see the event: '0/3 nodes are available: 1 node(s) had taint {node.kubernetes.io/unreachable: }, that the pod didn't tolerate, 2 Insufficient memory.' Which issue is causing the pod to be pending?
The event indicates both a taint tolerance issue and insufficient memory. Both must be resolved for the pod to schedule.
Why this answer
Option C is correct. The event clearly states two issues: a taint that the pod doesn't tolerate and insufficient memory. Multiple reasons are given, so the pod is pending due to both. Option A is partially correct but incomplete. Option B is not mentioned. Option D is incorrect because the event mentions both issues.
A node is 'NotReady'. Which THREE steps should you take to troubleshoot?
Shows kubelet log entries.
Why this answer
Checking kubelet status (A), kubelet logs (B), and node conditions (D) are direct steps. Restarting the node (C) is drastic. Checking API server logs (E) is not the first step for a node issue.
You need to expose multiple HTTP services on a single IP address with path-based routing. Which resource should you use?
Ingress provides path-based routing to multiple Services.
Why this answer
Ingress provides HTTP/HTTPS routing to Services based on rules, including path-based routing. Services alone do not support path-based routing; they only provide load balancing.
A company is migrating a stateful application to Kubernetes. The application requires persistent storage that is 'zone-aware' to survive a single zone failure and must provide the highest possible I/O performance. Which storage solution best meets these requirements?
Regional PDs provide zone redundancy and high performance, meeting both requirements.
Why this answer
Option C is correct because regional Persistent Disks replicate data synchronously across two zones, providing zone-level fault tolerance while maintaining high I/O performance due to direct block storage access. This meets the requirement for surviving a single zone failure without the overhead of network filesystem protocols or application-level replication.
Exam trap
The trap here is that candidates often confuse 'zone-aware' with simply scheduling pods across zones (Option B) or assume that any replicated storage (Option A) meets the requirement, but fail to recognize that only synchronous block-level replication across zones provides both fault tolerance and high I/O performance.
How to eliminate wrong answers
Option A is wrong because a single-pod NFS server creates a single point of failure and introduces network filesystem latency, which cannot survive a zone failure or provide the highest possible I/O performance. Option B is wrong because WaitForFirstConsumer binding mode only delays volume provisioning until a pod is scheduled; it does not provide zone-level replication or fault tolerance. Option D is wrong because local SSDs are node-bound and cannot survive a zone failure without external replication, and a DaemonSet managing replication adds complexity and performance overhead that does not guarantee synchronous replication across zones.
You update a NetworkPolicy to add an egress rule. After applying, pods affected by the policy can no longer reach external IPs. What is the most likely reason?
Adding an egress rule enables default deny for egress; external IPs must be allowed explicitly.
Why this answer
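Once a NetworkPolicy with an egress section selects a pod, that pod is isolated for egress: traffic to any destination not matched by an egress rule in some policy selecting the pod is denied. Nothing outside the rules is allowed implicitly; every desired destination, external IPs included, must be listed explicitly.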
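The default-deny behavior can be illustrated with a deliberately simplified model. The policy shape here (a dict with `policyTypes` and a flat `egress_cidrs` list) is hypothetical and ignores ports, pod and namespace selectors, and `except` blocks; it only shows that an IP outside every allowed CIDR is dropped once the pod is isolated for egress.

```python
import ipaddress


def egress_allowed(policies: list, dest_ip: str) -> bool:
    """policies: simplified NetworkPolicies that already select the pod."""
    egress_policies = [p for p in policies if "Egress" in p["policyTypes"]]
    if not egress_policies:
        return True  # pod is not isolated for egress: all traffic allowed
    dest = ipaddress.ip_address(dest_ip)
    # Policies are additive: one matching rule in any policy is enough.
    for policy in egress_policies:
        for cidr in policy["egress_cidrs"]:
            if dest in ipaddress.ip_network(cidr):
                return True
    return False  # isolated and no rule matched: denied


policy = {"policyTypes": ["Ingress", "Egress"], "egress_cidrs": ["10.0.0.0/8"]}
print(egress_allowed([], "8.8.8.8"))        # True
print(egress_allowed([policy], "8.8.8.8"))  # False
```

Adding the external range to `egress_cidrs` (the counterpart of an `ipBlock` rule in a real policy) would make the second call return True.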
A team is designing a storage solution for a Cassandra cluster on Kubernetes. Each pod must have its own dedicated storage, and the cluster must be able to scale up and down dynamically. Which Kubernetes resource should be used to manage the storage?
This creates a unique PVC for each pod, providing dedicated storage.
Why this answer
StatefulSet is the correct choice because it provides stable, unique network identities and dedicated storage for each pod via a volumeClaimTemplate. This ensures each Cassandra pod gets its own PersistentVolume, which is essential for stateful applications that require data persistence and ordered scaling. The volumeClaimTemplate automatically provisions a unique PersistentVolumeClaim for each replica, enabling dynamic scaling up and down while preserving data integrity.
Exam trap
The trap here is that candidates often choose Deployment with a shared PersistentVolume, mistakenly thinking it simplifies management, but they overlook the need for dedicated, persistent storage per pod and the ordered scaling guarantees that only StatefulSet provides.
How to eliminate wrong answers
Option A is wrong because emptyDir volumes are ephemeral and tied to the pod's lifecycle; data is lost when the pod is deleted, making it unsuitable for a Cassandra cluster that requires persistent storage. Option B is wrong because hostPath volumes bind to a specific node's filesystem, which prevents dynamic scaling and can cause data inconsistency if pods are rescheduled to different nodes; DaemonSets also run one pod per node, not suitable for a scalable Cassandra cluster. Option D is wrong because a single PersistentVolume shared by all pods would create a single point of failure and contention, violating the requirement for each pod to have its own dedicated storage; Deployments also do not guarantee stable pod identities or ordered scaling needed for stateful applications.
An administrator needs to upgrade the kube-apiserver on a control plane node from version 1.22.0 to 1.23.0. Which of the following is the correct order of steps?
Draining first ensures no workloads are disrupted.
Why this answer
Option B is correct because when upgrading the kube-apiserver, the standard workflow is to first drain the node to evict pods, then upgrade kubeadm (which manages the control plane components), then upgrade kubelet (which runs on the node), and finally uncordon the node to make it schedulable again. This sequence ensures that the node is safely taken out of service before any changes are made, and that the upgrade tools are updated before the components they manage.
Exam trap
The trap here is that candidates often confuse the upgrade order for control plane components with the order for worker nodes, where draining is done after upgrading kubeadm but before upgrading kubelet, but for the control plane, draining must come first to avoid service disruption.
How to eliminate wrong answers
Option A is wrong because it starts with upgrading kubelet before kubeadm, but kubeadm must be upgraded first to ensure it can handle the new version of the control plane components. Option C is wrong because it drains the node after upgrading kubeadm, which risks disrupting running workloads if the upgrade process fails or requires a restart. Option D is wrong because it upgrades both kubeadm and kubelet before draining the node, leaving workloads running during the upgrade and potentially causing downtime or data loss.
Match each Kubernetes resource to its primary function.
Pod: Smallest deployable unit, runs containers
Service: Stable network endpoint for a set of Pods
Ingress: HTTP/HTTPS routing to Services
ConfigMap: Non-sensitive configuration data
PersistentVolume: Storage resource provisioned by an administrator
Why these pairings
These are fundamental Kubernetes objects with distinct roles.
You are a cluster administrator managing a production Kubernetes cluster that hosts a stateful application using StatefulSets with PersistentVolumeClaims (PVCs) backed by a cloud provider's persistent disk. A developer reports that a new pod in the StatefulSet is stuck in 'Pending' state. You describe the StatefulSet and see that it has 3 replicas. Two pods are Running, but the third pod (pod-2) is Pending. You check the PVC for pod-2 and see it is 'Pending'. The StorageClass uses 'WaitForFirstConsumer' volume binding mode. The node where pod-2 should run has sufficient resources. Other PVCs in the same namespace bound successfully. What is the most likely cause of the pending PVC and pod?
With WaitForFirstConsumer, after scheduling, a PV is provisioned or selected; if its nodeAffinity doesn't match the node, the pod remains pending.
Why this answer
The correct answer is A because with 'WaitForFirstConsumer' volume binding mode, the PVC binding is deferred until a pod using it is scheduled. The PV that should bind to the PVC has a nodeAffinity that does not match any available node, preventing the scheduler from binding the PVC and scheduling the pod. This results in both the PVC and pod remaining in 'Pending' state, even though the node has sufficient resources.
Exam trap
The trap here is that candidates often assume a Pending PVC is always due to insufficient storage capacity or quota, ignoring the impact of volume binding modes and nodeAffinity constraints on scheduling.
How to eliminate wrong answers
Option B is wrong because, if the CSI driver were missing on the scheduled node, the pod would fail with a different error (e.g., 'FailedMount') rather than remain Pending because of an unbound PVC. Option C is wrong because if the requested storage size exceeded the cloud provider's quota, the PVC would likely fail with a specific error (e.g., 'ProvisioningFailed') rather than remain Pending, and other PVCs in the same namespace bound successfully, indicating quota is not the issue. Option D is wrong because ReadWriteOnce is the default access mode for most cloud persistent disks and is compatible with StatefulSet pods; ReadWriteMany would be required only if multiple pods needed to write simultaneously to the same volume, which is not the case here.
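A minimal sketch of the two objects involved in this scenario may help. The names, the provisioner, the volume handle, and the zone value below are all illustrative assumptions, not taken from the question. The StorageClass delays binding until a pod is scheduled, and the PersistentVolume carries a nodeAffinity that restricts which nodes can use it.

```yaml
# Hypothetical StorageClass: binding waits for the first consuming pod.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-wffc                 # assumed name
provisioner: ebs.csi.aws.com         # assumed CSI driver; substitute your provider's
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
---
# Hypothetical PersistentVolume restricted to one zone via nodeAffinity.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv                   # assumed name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: example-wffc
  csi:
    driver: ebs.csi.aws.com          # assumed driver
    volumeHandle: vol-0123456789abcdef0   # placeholder volume ID
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-east-1a         # pods can use this PV only on nodes in this zone
```

If no schedulable node carries a matching zone label, both the claim and the pod stay Pending.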
You run 'kubectl get pods' and see that a pod named 'web' is in 'ImagePullBackOff' state. Which command would help you see the reason for the image pull failure?
The describe command shows events including image pull errors.
Why this answer
'kubectl describe pod web' provides detailed information including events that show the exact error from the image pull attempt.
A NetworkPolicy named 'deny-all' is created with an empty podSelector and no rules. What does this policy accomplish?
An empty podSelector selects all pods in the namespace. With no ingress or egress rules, all traffic in the policy's listed policyTypes is denied.
Why this answer
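An empty podSelector selects all pods in the namespace. With no ingress rules, the policy denies all ingress traffic to those pods. Egress is denied in the same way only when policyTypes includes Egress; if policyTypes is omitted and there are no egress rules, the policy applies to ingress only.
With both Ingress and Egress listed in policyTypes, this creates a default deny for both directions.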
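As a sketch of such a policy, the manifest below lists both policy types explicitly, which is what makes it cover egress as well as ingress. The policy name comes from the question; the namespace is left to whatever context the manifest is applied in.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-all
spec:
  podSelector: {}        # empty selector: every pod in the namespace
  policyTypes:           # no ingress/egress rules follow, so nothing is allowed
    - Ingress
    - Egress
```

Enforcement depends on the cluster's network plugin supporting NetworkPolicy.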
Which TWO of the following are valid ways to isolate a set of pods from all ingress traffic except from monitoring pods?
This allows ingress from monitoring pods.
Why this answer
To isolate pods from all ingress except monitoring, you can define a NetworkPolicy that denies all ingress (default) and then allows ingress from monitoring pods. Option A has no rules, which denies all. Option B allows from monitoring pods.
Option C uses namespaceSelector incorrectly. Option D allows all. Option E allows from specific pods, but only one rule.
The correct combination is A to deny all and B to allow monitoring.
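A possible shape for the "allow from monitoring" policy is sketched below. The labels `app: web` and `role: monitoring` and the policy name are hypothetical, chosen only to illustrate the structure; the question does not specify them.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring-ingress   # assumed name
spec:
  podSelector:
    matchLabels:
      app: web                     # assumed label on the pods being isolated
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              role: monitoring     # assumed label on the monitoring pods
```

Because the selected pods are now covered by an Ingress policy, any ingress not matched by the `from` rule is dropped.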
Which command creates a Job that runs a single pod to execute the command 'echo Hello'?
This is the correct command to create a Job.
Why this answer
kubectl create job hello --image=busybox -- echo Hello creates a Job named 'hello' that runs the command 'echo Hello' in a busybox container.
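For comparison, a declarative manifest roughly equivalent to that command might look like the following. This is a sketch; the exact fields kubectl generates can vary by version.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: hello
spec:
  template:
    spec:
      containers:
        - name: hello
          image: busybox
          command: ["echo", "Hello"]   # the arguments given after "--"
      restartPolicy: Never             # Jobs require Never or OnFailure
```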
A pod is in ImagePullBackOff state. Which command is MOST useful to diagnose the issue?
Why this answer
kubectl describe pod shows events with the exact image pull error (e.g., unauthorized, not found).
You run 'kubectl get pods' and see a pod with status 'ImagePullBackOff'. Which of the following is a possible cause?
A misspelled image name will cause the kubelet to fail to pull the image, resulting in ImagePullBackOff.
Why this answer
Option A is correct. ImagePullBackOff occurs when the kubelet cannot pull the container image, often due to a wrong image name, tag, or authentication issue. Options B, C, and D cause other pod states.
You try to run 'kubectl logs mypod' and get the error: 'Error from server (BadRequest): container "myapp" in pod "mypod" is waiting to start: PodInitializing'. What does this mean?
PodInitializing means init containers or container startup is in progress.
Why this answer
The pod is still initializing (e.g., pulling images, running init containers). Logs are not available until the main container starts.
A pod in the 'production' namespace is in CrashLoopBackOff state. Running 'kubectl describe pod web-app -n production' shows the event 'OOMKilled'. What is the most appropriate action to resolve this issue?
The container was killed due to memory limit; increasing it allows more memory.
Why this answer
OOMKilled means the container exceeded its memory limit. Increasing the memory limit is the correct fix.
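A sketch of where the limit lives in the pod spec is shown below. The container name, image, and memory values are placeholders, not figures from the question; the right limit depends on the application's actual usage.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-app
  namespace: production
spec:
  containers:
    - name: web-app                              # assumed container name
      image: registry.example.com/web-app:1.0    # placeholder image
      resources:
        requests:
          memory: "256Mi"    # assumed value
        limits:
          memory: "512Mi"    # raise this if the container is OOMKilled at the old limit
```

If the pod is managed by a Deployment, the change belongs in the Deployment's pod template rather than in the pod itself.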
You need to check the memory usage of all pods in the 'production' namespace. Which command fulfills this requirement?
Correct command to show resource usage of pods.
Why this answer
kubectl top pod --namespace=production shows CPU and memory usage for pods in that namespace.
Which TWO of the following are valid IngressClass annotations or fields?
spec.ingressClassName is a field in Ingress to specify the IngressClass name.
Why this answer
The IngressClass resource uses spec.controller and spec.parameters. The annotation kubernetes.io/ingress.class is deprecated but still used. The field spec.ingressClassName is used in Ingress to reference an IngressClass.
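The relationship between the two objects can be sketched as follows. All names, the host, and the backend Service are hypothetical, and the controller string is shown as one example value; use whatever your ingress controller documents.

```yaml
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: example-class              # assumed name
spec:
  controller: k8s.io/ingress-nginx # example controller identifier
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress            # assumed name
spec:
  ingressClassName: example-class  # references the IngressClass above
  rules:
    - host: app.example.com        # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-svc  # assumed Service
                port:
                  number: 80
```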
A pod cannot resolve a service DNS name. The cluster uses CoreDNS. Which of the following is the most likely cause if the pod's /etc/resolv.conf contains 'nameserver 10.96.0.10' and the CoreDNS pod is running?
If CoreDNS is not configured with the cluster domain, it cannot resolve service names.
Why this answer
The nameserver IP 10.96.0.10 is the default ClusterIP of the kube-dns service. If CoreDNS is running but not serving correctly, a common issue is that the CoreDNS ConfigMap is missing the cluster domain or has incorrect forwarders.
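For reference, a trimmed sketch of a CoreDNS ConfigMap similar to the one kubeadm installs is shown below. Treat it as illustrative rather than a copy of any particular cluster's configuration. The `kubernetes` line is where the cluster domain is declared, and `forward` controls upstream resolution.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health
        # Serves Service and Pod records for the cluster domain.
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
        }
        # Names outside the cluster domain go to the node's resolvers.
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
```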
Which command is used to take a snapshot of etcd using etcdctl?
This is the correct command to save a snapshot.
Why this answer
Option C is correct because `etcdctl snapshot save` is the official command to create a point-in-time snapshot of an etcd datastore, which is essential for backup and disaster recovery in Kubernetes clusters. This command writes the snapshot to a specified file path, preserving all keys and metadata for later restoration via `etcdctl snapshot restore`.
Exam trap
The trap here is that candidates may confuse the `snapshot save` command with the non-existent `snapshot create` or the deprecated `backup` command from etcd v2, leading them to pick a plausible-sounding but incorrect option.
How to eliminate wrong answers
Option A is wrong because `etcdctl dump` is not a valid etcdctl subcommand; it may be confused with `etcdctl get` or `etcdctl watch` but does not exist for snapshotting. Option B is wrong because `etcdctl snapshot create` is not a valid command; the correct subcommand is `snapshot save`, not `create`. Option D is wrong because `etcdctl backup` belongs to the older etcd v2 tooling and is not valid with the v3 API; it has been replaced by `snapshot save` in etcd v3, which is the version used in modern CKA environments.
Which TWO of the following kubectl commands can be used to view the logs of a container in a pod? (Choose two.)
This shows the current logs of the first container in the pod.
Why this answer
Options A and C are correct. 'kubectl logs pod-name' prints the current logs of the default container (add -f to stream them). 'kubectl logs pod-name --previous' shows logs from the previous instance of the container (useful for crash loops). Option B 'kubectl describe pod pod-name' shows pod details, not logs. Option D 'kubectl exec pod-name -- logs' is not a standard command; 'logs' is not an executable inside the container.
Option E 'kubectl get pod pod-name -o yaml' shows the pod manifest, not logs.
A Pod is in 'CrashLoopBackOff' state. You run 'kubectl logs <pod> --previous' and see an error about a missing environment variable. The Pod spec defines the environment variable in a ConfigMap. What is the best next step to diagnose the issue?
This directly checks if the ConfigMap exists.
Why this answer
Checking whether the ConfigMap exists and is correctly referenced will identify if the variable is missing due to misconfiguration. Option A is correct. Option B would not help if the variable is missing.
Option C is irrelevant. Option D might show logs but doesn't address the ConfigMap.
Which component is responsible for maintaining network rules on each node?
kube-proxy maintains network rules on each node to implement Kubernetes Services.
Why this answer
C is correct because kube-proxy is the component responsible for maintaining network rules on each node. It watches the Kubernetes API server for changes to Services and EndpointSlices, then updates iptables, IPVS, or other rules to route traffic to the appropriate Pods. This implements Service routing and load balancing at the node level. NetworkPolicy enforcement is a separate concern handled by the network plugin, not by kube-proxy.
Exam trap
The trap here is confusing kubelet with kube-proxy, as both run on each node, but kubelet manages Pod lifecycle while kube-proxy manages network rules and service routing.
How to eliminate wrong answers
Option A is wrong because kube-controller-manager runs controller loops (e.g., ReplicaSet, Deployment, Node) but does not manage per-node network rules. Option B is wrong because etcd is a distributed key-value store used for cluster state, not for maintaining network rules on nodes. Option D is wrong because kubelet is the primary node agent that manages Pods and containers, but it does not handle network rule enforcement (that is kube-proxy's role).
Which TWO commands can be used to interact with etcd snapshot operations? (Select TWO.)
Restores from a snapshot.
Why this answer
The `etcdctl snapshot save` command creates a point-in-time snapshot of the etcd datastore, which is essential for backing up the Kubernetes cluster state. The `etcdctl snapshot restore` command restores a cluster from a previously saved snapshot, typically used for disaster recovery or migrating etcd to a new node. Both commands are part of the official etcdctl snapshot subcommand group.
Exam trap
The trap here is that candidates may confuse `etcdctl snapshot save` with the non-existent `etcdctl backup` command, or think `etcdctl migrate` is related to snapshot operations when it is actually for version migration.
A user needs to deploy a pod that requires access to the Kubernetes API server from within the pod. Which resource should be used to provide authentication credentials automatically?
ServiceAccounts are automatically mounted as volumes in pods, providing a token for API authentication.
Why this answer
A ServiceAccount is the correct resource because Kubernetes automatically mounts a projected volume containing a JWT token into pods that use the default or a specified ServiceAccount. This token is used by the pod to authenticate against the Kubernetes API server, enabling secure in-cluster communication without manual credential management.
Exam trap
The trap here is that candidates often confuse authorization resources like ClusterRoleBinding with authentication mechanisms, or think that a generic Secret or ConfigMap can serve as an automatic credential provider, when in fact only a ServiceAccount provides the automated token injection and rotation required for in-cluster API access.
How to eliminate wrong answers
Option B is wrong because a Secret is a generic resource for storing sensitive data like passwords or tokens, but it does not automatically provide authentication credentials to a pod; you must explicitly mount or reference it, and it lacks the automatic token rotation and API server integration of a ServiceAccount. Option C is wrong because a ConfigMap is designed for non-sensitive configuration data (e.g., environment variables or config files) and cannot store or provide authentication credentials. Option D is wrong because a ClusterRoleBinding grants RBAC permissions to a subject (like a ServiceAccount or user) but does not itself provide authentication credentials; it is an authorization resource, not an authentication mechanism.
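A minimal sketch of a pod that uses a dedicated ServiceAccount follows. The names `api-reader` and `api-client` and the busybox command are hypothetical. The token is mounted automatically under /var/run/secrets/kubernetes.io/serviceaccount/ unless automounting is disabled.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: api-reader              # assumed name
---
apiVersion: v1
kind: Pod
metadata:
  name: api-client              # assumed name
spec:
  serviceAccountName: api-reader  # omit to fall back to the namespace's "default" ServiceAccount
  containers:
    - name: client
      image: busybox
      command: ["sleep", "3600"]  # placeholder workload
```

The ServiceAccount only authenticates the pod; what it may do still depends on RBAC bindings.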
You are troubleshooting a Service connectivity issue. A pod in namespace 'frontend' cannot reach a Service in namespace 'backend' by its DNS name. CoreDNS is running. Which statements are true? (Select TWO.)
Pods in a different namespace must qualify the Service name with its namespace (for example, by using the full FQDN) to resolve it.
Why this answer
Option A is correct: Services are exposed via DNS in the form <service>.<namespace>.svc.cluster.local. Option C is correct: pods can resolve cross-namespace services by using the FQDN. Option B is false because pods typically rely on CoreDNS for service discovery, not env variables by default in recent versions.
Option D is false: NetworkPolicy can block traffic even if DNS resolves. Option E is false: the default cluster domain is cluster.local, not just 'cluster'.
Correct. DNS short name only works within the same namespace.
Why this answer
The short name resolves only from within the Service's own namespace; from another namespace, use <service>.<namespace> or the full name <service>.<namespace>.svc.cluster.local.
A node in your cluster is reporting 'NotReady' status. You log into the node and run 'systemctl status kubelet'. The kubelet service is not running. Which command should you use to start the kubelet and enable it to start on boot?
This enables the service on boot and starts it immediately.
Why this answer
The correct command is 'systemctl enable --now kubelet' which enables the service to start on boot and starts it immediately. Option A does not start it immediately. Option B is for restarting, not starting from stopped.
Option D is incorrect syntax.
A node in the cluster has been cordoned. Which of the following is true about the node?
Cordoning sets the node status to unschedulable, so no new pods are placed, but existing pods remain.
Why this answer
When a node is cordoned using `kubectl cordon`, it is marked as unschedulable by setting the `node.Spec.Unschedulable` field to true. This prevents new pods from being scheduled onto the node, but existing pods continue to run normally. The node remains part of the cluster and is not removed or drained automatically.
Exam trap
The trap here is that candidates confuse cordoning with draining, assuming cordoning also evicts existing pods or removes the node, when in fact it only prevents new scheduling and leaves running pods untouched.
How to eliminate wrong answers
Option A is wrong because cordoning does not remove the node from the cluster; the node remains a member and can be uncordoned later. Option B is wrong because `kubectl drain` is not automatically performed; draining is a separate, explicit operation that evicts pods, whereas cordon only prevents new scheduling. Option D is wrong because existing pods are not immediately evicted; they continue running until they are terminated or the node is drained manually.
Which TWO statements about PersistentVolume (PV) and PersistentVolumeClaim (PVC) binding are correct?
Without a matching PV or dynamic provisioning, the PVC cannot be bound.
Why this answer
Option C is correct because a PersistentVolumeClaim (PVC) will remain in the Pending state if no PersistentVolume (PV) matches its storage request and no StorageClass is defined to dynamically provision a volume. Without a matching PV or a StorageClass, the control plane's PersistentVolume controller cannot bind or create a volume for the claim, leaving it pending indefinitely until a suitable PV becomes available or a StorageClass is added.
Exam trap
CNCF often tests the misconception that a PVC can bind to a PV with a different access mode if the PV supports multiple modes, but in reality, the access mode must match exactly between the PVC's request and one of the PV's supported modes for the binding to succeed.
You need to upgrade a Kubernetes cluster from v1.28 to v1.29 using kubeadm. Which TWO steps are REQUIRED as part of the upgrade process? (Choose TWO.)
Control plane must be upgraded before worker nodes.
Why this answer
Option B is correct because the Kubernetes control plane must be upgraded first to ensure the API server, controller manager, and scheduler are running the new version before worker nodes. This prevents version skew issues where newer node components might rely on API features not yet available on an older control plane. The kubeadm upgrade process explicitly requires upgrading the control plane node(s) before any worker nodes.
Exam trap
The trap here is that candidates often think 'kubeadm upgrade plan' is mandatory because it appears in many tutorials, but the CKA exam tests that only draining the node and upgrading control plane first are required steps, while the plan command is optional.
Which THREE of the following are valid commands to troubleshoot network connectivity between pods? (Select 3)
Tests DNS resolution from within a pod.
Why this answer
kubectl exec can run networking tools inside a pod. curl is a common tool. nslookup tests DNS. ping tests basic connectivity.
A cluster administrator notices that nodes are not joining the cluster after a kubeadm init. The kubelet logs show: 'failed to run Kubelet: could not init service: open /var/lib/kubelet/config.yaml: permission denied'. What is the most likely cause?
Permission denied indicates the kubelet cannot read the config file.
Why this answer
The error message 'open /var/lib/kubelet/config.yaml: permission denied' indicates that the kubelet process does not have the necessary read permissions to access its configuration file. This is typically caused by incorrect file ownership or overly restrictive file permissions, so that the file is not readable by the user the kubelet runs as. Since the kubelet runs as a systemd service, it requires appropriate access to this file to initialize properly.
Exam trap
The trap here is that candidates often confuse 'permission denied' with network connectivity issues or resource exhaustion, but the specific file path in the error message directly points to a filesystem permission problem.
How to eliminate wrong answers
Option A is wrong because disk space issues would produce errors like 'no space left on device' or 'disk quota exceeded', not a permission denied error on a specific file. Option B is wrong because inability to reach the API server would manifest as connection timeout or refused errors in the kubelet logs, not a file permission error during initialization. Option C is wrong because a missing kubelet binary would result in a 'command not found' or 'executable file not found' error when systemd tries to start the service, not a permission denied error on a configuration file.
After deploying a new Deployment, you run 'kubectl get events' and see 'FailedScheduling' events. What is a possible cause?
If no node matches the node selector, the scheduler cannot place the pod.
Why this answer
FailedScheduling indicates the scheduler cannot place the pod on any node, for example because no node matches the pod's nodeSelector, because of untolerated taints, or because of insufficient resources.
Match each troubleshooting scenario to its likely cause.
Pod stuck in Pending: Insufficient resources or unschedulable Node
CrashLoopBackOff: Application crashes repeatedly
Service has no endpoints: Missing or incorrect Endpoint selector labels
Node NotReady: Kubelet not reporting or network issue
ImagePullBackOff: Invalid image name or registry authentication failure
Why these pairings
Common cluster issues require systematic debugging.
Which TWO of the following are control plane components?
The cloud controller manager runs controllers that interact with cloud provider APIs.
Why this answer
The cloud-controller-manager is a control plane component that integrates with cloud provider APIs to manage cloud-specific resources such as load balancers, storage volumes, and nodes. It runs as a daemon on the control plane and interacts with the Kubernetes API server to reconcile cloud resources with cluster state.
Exam trap
CNCF often tests the distinction between control plane and node components, and the trap here is that candidates mistakenly classify kube-proxy or kubelet as control plane components because they are essential for cluster operation, but they are not part of the control plane's core management layer.
Which TWO of the following are valid methods to view the logs of a container that has terminated?
Shows last state and exit code, but not full logs.
Why this answer
kubectl logs --previous retrieves logs of the last terminated container. kubectl describe pod shows the last termination state and exit code. The other options are not valid.
A Pod is stuck in 'Pending' state. Upon inspection, you find that the PVC it references is also 'Pending'. Which of the following is NOT a common cause for a PVC to remain in Pending state?
The pod's volume mount does not affect the PVC binding. The PVC can be bound independently of the pod.
Why this answer
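Option D is correct. A pod using the PVC does not affect the PVC's binding process. With the default Immediate volume binding mode, the PVC is bound before any pod uses it; WaitForFirstConsumer is the exception, deferring binding until a pod is scheduled.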
Common causes are A (no storage class), B (no provisioner), and C (quota exceeded).
Which TWO of the following are valid resources for granting permissions in RBAC?
A Role defines permissions within a namespace.
Why this answer
A is correct because a Role defines a set of permissions (rules) within a specific namespace, making it a valid resource for granting RBAC permissions. It specifies which API operations (verbs like get, list, create) are allowed on which resources (e.g., pods, services) within that namespace.
Exam trap
The trap here is that candidates often confuse binding resources (RoleBinding, ClusterRoleBinding) with permission-defining resources (Role, ClusterRole), leading them to select bindings as valid answers for granting permissions.
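A minimal Role sketch illustrates the "permission-defining" side of that distinction. The name `pod-reader` and the `default` namespace are assumptions for illustration.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: default      # assumed namespace; a Role is always namespaced
  name: pod-reader        # assumed name
rules:
  - apiGroups: [""]       # "" is the core API group
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```

On its own this Role grants nothing to anyone; a RoleBinding is what attaches it to a user, group, or ServiceAccount.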
Which TWO statements about DaemonSets are correct? (Select 2)
Common use cases for DaemonSets.
Why this answer
Option C is correct because DaemonSets are specifically designed to run cluster-wide services such as log collectors (e.g., Fluentd) or monitoring agents (e.g., Prometheus Node Exporter). They automatically deploy a pod on every node (or a subset of nodes based on node selectors), ensuring that each node has a copy of the daemon pod for tasks like log aggregation or metrics collection.
Exam trap
The trap here is that candidates often confuse DaemonSets with Deployments, mistakenly thinking DaemonSets have a replica count or are managed by a Deployment, or that they lack update strategies, when in fact DaemonSets are independent and fully support rolling updates.
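A short DaemonSet sketch shows both points from the trap above: there is no replica count, and an update strategy can be set. The name, labels, and image choice are illustrative assumptions.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter            # assumed name
spec:
  selector:
    matchLabels:
      app: node-exporter
  updateStrategy:
    type: RollingUpdate          # DaemonSets support rolling updates
  template:
    metadata:
      labels:
        app: node-exporter       # must match spec.selector
    spec:
      containers:
        - name: node-exporter
          image: prom/node-exporter   # example monitoring agent image
```

Note the absence of a `replicas` field: the number of pods follows the number of eligible nodes.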
You run 'kubectl top nodes' and get an error: 'error: metrics not available yet'. Which of the following is the MOST likely cause?
'kubectl top' requires Metrics Server to be installed; otherwise it returns this error.
Why this answer
Option D is correct. The 'metrics not available yet' error indicates that the Metrics Server is not installed or not functioning. Option A (node not ready) would show a different error.
Option B (kubelet not running) would cause node NotReady. Option C (API server down) would cause connection refused.