Knowledge + Practice

Certified Kubernetes Administrator (CKA): Questions 151–225

1005 questions total · 14 pages · All types, answers revealed



151

Multi-Selectmedium

Which of the following are valid methods to troubleshoot a Node that is 'NotReady'? (Select all that apply)

Select 3 answers

A.Check the kubelet logs on the node via journalctl

B.Verify that the kubelet certificate is valid and not expired

C.Restart the node's container runtime (e.g., containerd)

D.Delete the node object and re-create it

E.Reinstall the entire Kubernetes cluster

AnswersA, B, C

Why this answer

Option A is correct because the kubelet is the primary node agent that communicates with the control plane. When a node is 'NotReady', checking the kubelet logs via `journalctl -u kubelet` can reveal errors such as network connectivity issues, certificate problems, or resource exhaustion. Option B is correct because an expired or invalid kubelet certificate will cause TLS authentication failures with the API server, leading to the node being marked 'NotReady'.

Option C is correct because the container runtime (e.g., containerd) is responsible for managing containers; if it is down or misconfigured, the kubelet cannot start pods, and restarting it can resolve transient failures.

Exam trap

CNCF often tests the misconception that deleting and re-creating a Node object is a valid troubleshooting step, when in reality it only removes the API server's representation and does not fix the underlying cause of the 'NotReady' state.

Why the other options are wrong

D

Deleting the node does not fix the node condition; it only removes it from the cluster.

E

Overkill; a single node issue doesn't require cluster reinstall.


152

MCQhard

A pod is unable to communicate with a Service in the same namespace. The administrator checks kube-proxy logs and finds no errors. Which command would help diagnose whether the iptables rules for the Service are correctly programmed?

A.ss -tuln

B.kubectl describe svc

C.kubectl exec -n kube-system <kube-proxy-pod> -- iptables -L

D.iptables-save | grep <service-name>

AnswerD

Shows iptables rules containing the Service name.

Why this answer

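Running 'iptables-save | grep <service-name>' on a node lists the iptables rules that reference the Service. kube-proxy tags the rules it writes with a comment containing the Service's namespace and name, so a match confirms that the expected rules were programmed, and no match suggests kube-proxy has not synced that Service. Option C is weaker because 'iptables -L' lists only the filter table by default, while the Service rules live mainly in the nat table.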


153

Multi-Selecthard

You need to prepare a worker node for maintenance. Which TWO actions should you perform? (Choose TWO.)

Select 2 answers

A.kubectl delete node <node>

B.kubectl uncordon <node>

C.kubectl drain <node> --ignore-daemonsets

D.kubectl cordon <node>

E.kubectl taint nodes <node> key=value:NoSchedule

AnswersC, D

Draining evicts pods from the node, with --ignore-daemonsets to leave DaemonSet pods.

Why this answer

Option C is correct because `kubectl drain` safely evicts the pods from a node before maintenance. The `--ignore-daemonsets` flag is necessary because drain otherwise refuses to proceed when DaemonSet-managed pods are present; those pods are managed by the DaemonSet controller, which would recreate them on the node immediately if they were deleted. Option D is correct because `kubectl cordon` marks the node as unschedulable, preventing new pods from being scheduled onto it. `kubectl drain` cordons the node itself as its first step, so an explicit cordon beforehand is optional rather than a strict prerequisite.

Exam trap

The trap here is that candidates often think `kubectl cordon` alone is sufficient for maintenance, but it only prevents new scheduling—it does not evict existing pods, so you must also drain the node to safely move workloads off.


154

MCQeasy

What is the default DNS name for a service named 'my-svc' in namespace 'default'?

A.my-svc.default.cluster.local

B.my-svc.svc.cluster.local

C.my-svc.default.svc.cluster.local

D.my-svc.cluster.local

AnswerC

Correct format.

Why this answer

The default DNS name for a service is <service>.<namespace>.svc.cluster.local.
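
As a small illustration, the naming rule can be written as a helper. This is a sketch rather than part of any Kubernetes client library: the function name `service_fqdn` is hypothetical, and `cluster.local` is assumed as the cluster domain (it is the usual default but is configurable per cluster).

```python
def service_fqdn(service: str, namespace: str, cluster_domain: str = "cluster.local") -> str:
    """Build the in-cluster DNS name <service>.<namespace>.svc.<cluster-domain>."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# The Service from the question: 'my-svc' in namespace 'default'.
print(service_fqdn("my-svc", "default"))
```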


155

Multi-Selecteasy

Which THREE components run on every worker node in a Kubernetes cluster? (Choose THREE.)

Select 3 answers

A.etcd

B.kube-proxy

C.kubelet

D.kube-apiserver

E.container runtime

AnswersB, C, E

kube-proxy runs on every node and handles service networking.

Why this answer

B (kube-proxy) is correct because it runs on every worker node to maintain network rules and handle service-to-pod traffic routing via iptables or IPVS. It is a core component for Kubernetes networking, ensuring that service endpoints are reachable from within the cluster.

Exam trap

CNCF often tests the misconception that etcd or kube-apiserver are distributed across all nodes, but in a standard Kubernetes cluster, they are strictly control plane components and never run on worker nodes.


156

MCQmedium

A cluster administrator applies the following NetworkPolicy. What is the effect on pods matching the podSelector?

A.All ingress traffic is denied, but egress is allowed

B.All ingress and egress traffic is denied

C.All ingress and egress traffic is allowed

D.Only traffic from pods in the same namespace is allowed

AnswerB

Empty rules mean no traffic is allowed, so all ingress and egress is denied.

Why this answer

A NetworkPolicy with empty podSelector matches all pods in the namespace. With policyTypes Ingress and Egress and no rules, it defaults to denying all ingress and egress traffic to/from those pods.


157

MCQmedium

An administrator creates a Service of type ClusterIP named 'my-svc' in the namespace 'default'. A pod in the same namespace tries to resolve the hostname 'my-svc' but fails. The pod's resolv.conf shows 'search default.svc.cluster.local svc.cluster.local cluster.local'. What is the most likely cause?

A.The pod's DNS policy is set to 'None'

B.The pod is using hostNetwork and bypasses CoreDNS

C.The Service does not exist or is in a different namespace

D.The Service type is ExternalName

AnswerC

If the Service doesn't exist, DNS will not return an A record. The pod search domains include 'default.svc.cluster.local', so 'my-svc' should resolve if the Service exists in the same namespace.

Why this answer

The fully qualified domain name for the Service is 'my-svc.default.svc.cluster.local'. The pod's search domains should allow resolving 'my-svc' to that FQDN. If resolution fails, the most common cause is that the Service does not exist or CoreDNS is not running.
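
To see why the short name should resolve, the following sketch models how a resolver expands a name using the search list from resolv.conf. It is a simplified model, not the actual resolver code: the function name `candidate_names` is hypothetical, and `ndots=5` is assumed because that is the value Kubernetes normally writes into a pod's resolv.conf.

```python
def candidate_names(name, search_domains, ndots=5):
    """Return the names a resolver would try, in order, for one query."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no expansion.
        return [name]
    expanded = [f"{name}.{domain}" for domain in search_domains]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [name] + expanded
    # Too few dots: walk the search list first, then try the bare name.
    return expanded + [name]


search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]
for candidate in candidate_names("my-svc", search):
    print(candidate)
```

The first candidate is the Service's FQDN in the pod's own namespace, so a lookup of 'my-svc' only fails at that step if no such Service exists there or the cluster DNS cannot answer.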


158

MCQmedium

What is the default kube-proxy mode in modern Kubernetes clusters?

A.kernelspace

B.iptables

C.userspace

D.ipvs

AnswerB

iptables is the default kube-proxy mode in most clusters.

Why this answer

On Linux, kube-proxy defaults to iptables mode; ipvs is used only when it is explicitly configured. The userspace mode is legacy and was removed in Kubernetes 1.26, and kernelspace is the mode used on Windows nodes.


159

MCQhard

A node in your cluster is reporting 'NotReady'. You SSH into the node and run 'systemctl status kubelet'. The output shows 'Active: inactive (dead)'. Which command should you run FIRST to attempt to resolve this?

A.journalctl -u kubelet

B.systemctl status kubelet

C.reboot

D.systemctl start kubelet

AnswerD

The kubelet is inactive (dead), so starting it is the first step toward returning the node to Ready.

Why this answer

Option D is correct. Because the kubelet service is inactive (dead), the first remediation step is 'systemctl start kubelet'. Option A ('journalctl -u kubelet') shows the logs, which is useful if the service fails to stay up, but it does not start the service.

Option B only repeats the status check that has already been run. Option C reboots the node, which is too aggressive as a first step.


160

MCQmedium

A Deployment 'web' has replicas=3 and update strategy RollingUpdate with maxSurge=50% and maxUnavailable=0. You update the container image. During the rollout, what is the maximum number of pods that can be running simultaneously?

A.4

B.3

C.5

D.6

AnswerC

maxSurge=50% of 3 is 1.5, rounded up to 2, so up to 5 pods can run at once.

Why this answer

With maxSurge=50% and replicas=3, the surge is 1.5, which Kubernetes rounds up to 2. With maxUnavailable=0, none of the 3 desired pods may be taken down before replacements are ready, so the Deployment can run up to 2 extra pods above the desired 3 during the rolling update. The maximum number of pods running simultaneously is therefore 3 (desired) + 2 (surge) = 5.

Exam trap

The trap here is that candidates often forget that a percentage maxSurge is rounded up (ceil) while a percentage maxUnavailable is rounded down (floor), leading them to round 1.5 down to 1 and pick 4 instead of 5.

How to eliminate wrong answers

Option A (4) is wrong because it rounds the surge of 1.5 down to 1; maxSurge percentages are rounded up. Option B (3) is wrong because it ignores the maxSurge setting, which allows extra pods to be created during the rollout, so the count can exceed the desired replicas. Option D (6) is wrong because it doubles the desired replicas, which would only occur with maxSurge=100%.
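
The rounding rules can be checked with a short calculation. This is a minimal sketch, assuming the documented behaviour that a percentage maxSurge is rounded up and a percentage maxUnavailable is rounded down; the helper names `resolve` and `rollout_bounds` are hypothetical.

```python
def resolve(value, replicas, round_up):
    """Turn an int or an 'N%' string into an absolute pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = replicas * int(value[:-1])
        # Integer ceil or floor of scaled / 100, avoiding float rounding.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (maximum total pods, minimum available pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


# The Deployment from the question: replicas=3, maxSurge=50%, maxUnavailable=0.
print(rollout_bounds(3, "50%", 0))  # (5, 3)
```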


161

MCQmedium

An administrator runs 'kubectl get pvc' and sees a PVC with status 'Lost'. What does this status indicate?

A.The PVC requested more storage than the available PV can provide

B.The pod using the PVC has been deleted

C.The PVC's storage class does not match any available PV

D.The PV that was bound to the PVC has been deleted or is no longer available

AnswerD

'Lost' status occurs when the underlying PV is removed or becomes unavailable.

Why this answer

Option D is correct: 'Lost' means the PV that was bound to the PVC no longer exists, leaving the PVC orphaned. Option A is incorrect because 'Lost' does not indicate that the requested capacity was exceeded; a PVC that cannot find a large enough PV stays Pending. Option C is incorrect because a storage class mismatch also leaves the PVC Pending, not Lost.

Option B is incorrect because 'Lost' is about the PV, not the pod.


162

Multi-Selectmedium

You have a ClusterRole named 'deployer' that allows creating Deployments and Services. You want to grant a ServiceAccount 'ci-cd' in namespace 'app' the permissions defined in this ClusterRole. Which TWO resources are needed? (Choose TWO.)

Select 2 answers

A.The existing ClusterRole 'deployer'

B.Create a new ClusterRole with the same rules

C.RoleBinding in namespace 'app' referencing ClusterRole 'deployer' and ServiceAccount 'ci-cd'

D.ClusterRoleBinding with subject ServiceAccount 'ci-cd' in namespace 'app'

E.A Secret for the ServiceAccount

AnswersA, C

The ClusterRole defines the permissions.

Why this answer

Option A is correct because the ClusterRole 'deployer' already contains the necessary rules to allow creating Deployments and Services. You do not need to create a new ClusterRole; you can reuse the existing one. Option C is correct because a RoleBinding in namespace 'app' can reference a ClusterRole and bind it to a ServiceAccount within that namespace, granting the permissions only in that namespace.

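This is the standard way to reuse the rules of a ClusterRole while limiting the grant to a single namespace. A ClusterRoleBinding (option D) would also grant the permissions, but in every namespace, which is broader than required.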

Exam trap

The trap here is that candidates often confuse RoleBinding and ClusterRoleBinding, thinking a ClusterRole must always be bound with a ClusterRoleBinding, but a RoleBinding can bind a ClusterRole to grant permissions only in a specific namespace.
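
For illustration, the binding described here could look like the manifest below, built as a Python dictionary and printed as JSON (which `kubectl apply -f` accepts as well as YAML). It is a sketch under assumptions: the binding name `ci-cd-deployer` is hypothetical, while the ClusterRole, ServiceAccount, and namespace names come from the question.

```python
import json

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    # The binding lives in 'app', so the grant applies only in that namespace.
    "metadata": {"name": "ci-cd-deployer", "namespace": "app"},  # hypothetical name
    "subjects": [
        {"kind": "ServiceAccount", "name": "ci-cd", "namespace": "app"},
    ],
    # roleRef points at the existing cluster-scoped ClusterRole.
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "deployer",
    },
}

print(json.dumps(role_binding, indent=2))
```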


163

MCQmedium

You are unable to resolve a Service DNS name from within a pod. To test DNS resolution, which command should you run inside the pod?

A.kubectl describe svc <service-name>

B.kubectl logs <pod>

C.kubectl attach <pod>

D.kubectl exec <pod> -- nslookup <service-name>

AnswerD

This runs nslookup inside the pod to test DNS resolution.

Why this answer

The correct command is 'kubectl exec <pod> -- nslookup <service-name>'. nslookup or dig are common DNS troubleshooting tools.


164

MCQhard

A Kubernetes cluster uses Calico as the CNI plugin. Two pods on different nodes cannot communicate, but pods on the same node can. Network policies are not enforced. What is the most likely cause?

A.Calico is not configured with an overlay network.

B.A NetworkPolicy is blocking inter-node traffic.

C.The pods are using different Service types.

D.The nodes' firewalls are blocking required ports for Calico (e.g., BGP port 179 or VXLAN port 4789).

AnswerD

Calico needs inter-node communication; firewall blocking can prevent pod-to-pod across nodes.

Why this answer

Option D is correct because Calico relies on specific ports for inter-node communication. When using BGP routing, TCP port 179 must be open between nodes; when using the VXLAN overlay, UDP port 4789 is required. If node firewalls block these ports, Calico cannot establish routes or encapsulate traffic between nodes, causing cross-node pod communication to fail while same-node communication, which is routed locally and never leaves the node, remains unaffected.

Exam trap

The trap here is that candidates often assume Calico always uses an overlay (like Flannel) and pick Option A, missing the fact that Calico's default BGP mode is a direct routing approach that requires open ports, not an overlay.

How to eliminate wrong answers

Option A is wrong because Calico does not require an overlay network by default; it uses BGP for direct routing, and even when VXLAN is used, the issue is port blocking, not the absence of an overlay. Option B is wrong because the question explicitly states that Network Policies are not enforced, so no NetworkPolicy can be blocking traffic. Option C is wrong because Service types (ClusterIP, NodePort, etc.) affect how services are exposed, not the underlying pod-to-pod communication at the CNI level.


165

MCQmedium

A cluster administrator creates a PersistentVolume with the following YAML:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /data/pv

A user creates a PersistentVolumeClaim requesting 500Mi with access mode ReadWriteOnce and no storage class. The PVC remains in Pending state. What is the most likely cause?

A.The PV's reclaim policy is Retain, so it is in Released state and cannot be bound to a new PVC.

B.The cluster has a default StorageClass defined, causing the PVC to attempt dynamic provisioning instead of binding to this static PV.

C.The hostPath path /data/pv does not exist on any node in the cluster.

D.The PVC requests 500Mi, but the PV has 1Gi capacity, which is larger than requested, so the PVC cannot bind to it.

AnswerB

When a default StorageClass exists, a PVC without a storageClassName will use that class for dynamic provisioning, ignoring static PVs without a matching class.

Why this answer

The PV and PVC are compatible on capacity (1Gi is at least 500Mi) and access mode (both ReadWriteOnce), so neither of those blocks binding. The deciding factor is the storage class. The PV omits storageClassName, so it has no class. The PVC also omits storageClassName, but that is not the same as setting it to an empty string: when the cluster has a default StorageClass, a PVC that omits the field is assigned that default class. The PVC then only matches PVs of the default class, ignores the classless static PV, and waits for dynamic provisioning. If the provisioner cannot satisfy the request, or the class delays binding until a pod uses the claim, the PVC stays Pending.

If no default StorageClass existed, both objects would be classless and the PVC would bind to pv-example. Setting storageClassName: "" on the PVC forces that behaviour even when a default class is present.

How to eliminate wrong answers

Option A is wrong because a newly created PV is Available; a Retain PV only becomes Released after a PVC that was bound to it is deleted, and nothing in the question indicates that. Option C is wrong because binding does not check whether the hostPath directory exists; a missing path would surface when a pod tries to mount the volume, not as a Pending PVC. Option D is wrong because a PV only needs capacity greater than or equal to the request, and 1Gi is at least 500Mi.


166

MCQhard

You have a Deployment with 3 replicas. One pod is in CrashLoopBackOff. The other two are Running. You run 'kubectl get events' and see 'Liveness probe failed: HTTP probe failed with statuscode: 503'. What should you do?

A.Increase the initialDelaySeconds of the liveness probe

B.Scale down the deployment

C.Check the application health endpoint and fix it to return 200

D.Remove the liveness probe

AnswerC

The app's health endpoint returns 503; fix the app to respond correctly.

Why this answer

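A failed liveness probe causes the kubelet to restart the container, and repeated failures produce CrashLoopBackOff. A 503 response means the endpoint is reachable but the application is reporting itself unhealthy, and the other two replicas pass the same probe, so the probe configuration is unlikely to be the problem. Increasing initialDelaySeconds only helps an application that is slow to start, and removing the probe hides the failure rather than fixing it.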


167

Multi-Selecthard

Which THREE of the following are true about HorizontalPodAutoscaler (HPA)?

Select 3 answers

A.HPA can use custom metrics from the Kubernetes Metrics Server.

B.HPA supports in-place pod resizing.

C.HPA cannot scale based on memory utilization.

D.HPA can be configured with target average CPU utilization.

E.HPA can scale Deployments and StatefulSets.

AnswersA, D, E

HPA can use custom metrics via the custom.metrics.k8s.io API.

Why this answer

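Option A is correct in the sense that the HorizontalPodAutoscaler (HPA) can scale on custom metrics such as requests per second or queue length, in addition to CPU and memory. Strictly speaking, the Metrics Server only serves resource metrics (CPU and memory) through the `metrics.k8s.io` API; custom metrics reach the HPA through the `custom.metrics.k8s.io` API, which is served by a separate metrics adapter. Option D is correct because a target average CPU utilization is the most common HPA configuration. Option E is correct because the HPA can target any workload that exposes the scale subresource, including Deployments and StatefulSets.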

Exam trap

The trap here is that candidates often assume HPA only supports CPU metrics, but it also supports memory and custom metrics, and they confuse horizontal scaling (replicas) with vertical scaling (in-place resizing), which is not supported by HPA.
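
To make the idea of a target average utilization concrete, the sketch below applies the scaling formula from the Kubernetes documentation, desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). It is a simplification: the function name and the `min_replicas` / `max_replicas` defaults are hypothetical, and the real controller also applies a tolerance band and stabilization windows that are ignored here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Apply the HPA ratio formula, then clamp to the configured bounds."""
    raw = math.ceil(current_replicas * current_value / target_value)
    return max(min_replicas, min(max_replicas, raw))


# 3 replicas averaging 90% CPU against a 60% target scale out.
print(desired_replicas(3, 90, 60))
# 4 replicas averaging 30% CPU against a 60% target scale in.
print(desired_replicas(4, 30, 60))
```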


168

Multi-Selectmedium

Which two commands can be used to view the logs of a container that has crashed? (Choose two.)

Select 2 answers

A.journalctl -u kubelet

B.kubectl logs pod-name --previous

C.kubectl describe pod pod-name

D.systemctl status kubelet

E.kubectl logs pod-name

AnswersB, E

`kubectl logs` reads the container's stdout and stderr, for either the current or the previous instance.

Why this answer

Option B is correct because `kubectl logs pod-name --previous` returns the logs of the previous, terminated instance of the container, which is the reliable way to see what a container printed before it crashed and was restarted. Option E is correct because `kubectl logs pod-name` returns the logs of the most recent instance; while a crashed container is waiting to be restarted (for example in CrashLoopBackOff), that most recent instance is the crashed one.

Option A is wrong because `journalctl -u kubelet` shows the kubelet's own log, which records events such as probe failures and restart back-off but not the container's output. Option C is wrong because `kubectl describe pod` shows state, exit codes, and events rather than logs. Option D is wrong because `systemctl status kubelet` reports only the state of the kubelet service. On the node itself, `crictl logs` is another way to read a container's logs, but it is not among the options.


169

MCQeasy

Which kubectl command is used to view the logs of a container that has previously crashed in a pod?

A.kubectl logs pod-name -c container-name --tail=100

B.kubectl logs pod-name --all-containers

C.kubectl logs pod-name --previous

D.kubectl logs pod-name

AnswerC

Correct. The --previous flag retrieves logs from the previous instance of the container.

Why this answer

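The correct answer is C. 'kubectl logs pod-name --previous' retrieves the logs from the previous instance of a container. Option A limits output to the last 100 lines of the named container's current instance. Option B shows the current logs of every container in the pod. Option D shows the current instance's logs only. All of these flags are valid, but none of them selects the previous instance.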


170

Multi-Selectmedium

A pod is in 'Pending' state. 'kubectl describe pod' shows: '0/3 nodes are available: 1 Insufficient memory, 2 node(s) had taint {node-role.kubernetes.io/control-plane: }, that the pod didn't tolerate.' Which TWO actions would resolve the issue? (Choose two)

Select 2 answers

A.Remove the taint from the control-plane nodes.

B.Decrease the memory request of the pod.

C.Increase the CPU request of the pod.

D.Add a toleration to the pod for the control-plane taint.

E.Add a node selector to the pod that matches the control-plane nodes.

AnswersA, D

Removing the taint also allows the pod to be scheduled on those nodes.

Why this answer

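The message shows two separate blockers: one node lacks memory for the pod's request, and the two control-plane nodes carry a taint the pod does not tolerate. Options A and D both deal with the taint, which makes the two control-plane nodes schedulable for this pod provided they have enough free memory. Option B would address the memory blocker on the remaining node instead and is often the safer fix in practice, since it keeps workloads off the control plane. Option C raises the pod's requirements and cannot help. Option E does not help because a node selector restricts placement but does not bypass taints.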
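
The taint check the scheduler performs can be sketched as a matching function. This is an illustrative model, not scheduler code: the function name `tolerates` is hypothetical, and the taint's effect is assumed to be NoSchedule because the event message in the question does not show it.

```python
def tolerates(toleration, taint):
    """Return True if a single toleration matches a single taint."""
    # An empty or missing effect on the toleration matches every effect.
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    key = toleration.get("key", "")
    if operator == "Exists":
        # An empty key with Exists matches every taint.
        return key == "" or key == taint["key"]
    # Operator Equal: key and value must both match.
    return key == taint["key"] and toleration.get("value", "") == taint.get("value", "")


# Assumed effect: NoSchedule (not shown in the question's event message).
taint = {"key": "node-role.kubernetes.io/control-plane", "value": "", "effect": "NoSchedule"}
toleration = {
    "key": "node-role.kubernetes.io/control-plane",
    "operator": "Exists",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))
```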


171

MCQhard

A Kubernetes cluster has three control plane nodes and five worker nodes. The kube-apiserver is failing to start on one control plane node with the error 'etcdserver: request timed out'. The etcd cluster is healthy with three members. Which of the following is the most likely cause?

A.A firewall is blocking traffic on port 2379 between the control plane and etcd

B.The etcd cluster has a leader election issue

C.The kube-apiserver is using the wrong etcd client port (2380 instead of 2379)

D.The etcd client certificate is expired

AnswerA

Port 2379 is used for etcd client requests; blocking it would cause timeouts.

Why this answer

The error 'etcdserver: request timed out' indicates that the kube-apiserver cannot establish a TCP connection to the etcd cluster within the timeout period. Since the etcd cluster is healthy with three members and leader election is functioning, the most likely cause is a firewall blocking port 2379 (the etcd client port) between the control plane node and the etcd members. This prevents the kube-apiserver from communicating with etcd, even though the etcd cluster itself is operational.

Exam trap

The trap here is that candidates often assume a timeout error implies an etcd cluster problem (like leader election or node failure), but the question explicitly states the etcd cluster is healthy, shifting the focus to network connectivity between the apiserver and etcd.

How to eliminate wrong answers

Option B is wrong because a leader election issue would cause etcd to be unavailable or return errors like 'etcdserver: no leader', not a simple timeout; the question states the etcd cluster is healthy with three members, implying leader election is working. Option C is wrong because port 2380 is the etcd peer port; a client pointed at it would typically fail with a TLS or protocol error rather than the 'request timed out' error shown, and kubeadm configures the kube-apiserver with port 2379 by default. Option D is wrong because an expired etcd client certificate would cause a TLS handshake failure with an error like 'x509: certificate has expired or is not yet valid', not a generic timeout.


172

MCQeasy

A Pod with a restartPolicy of 'OnFailure' exits with code 0. What will happen?

A.The container will restart immediately.

B.The Pod will be terminated.

C.The Pod will remain in Running state.

D.The container will not restart, and the Pod will be in Succeeded phase.

AnswerD

Exit code 0 indicates success; no restart.

Why this answer

When a Pod has a restartPolicy of 'OnFailure' and its container exits with code 0 (indicating successful completion), the container will not be restarted. Instead, the Pod transitions to the Succeeded phase, as defined by Kubernetes Pod lifecycle semantics. This is because 'OnFailure' only triggers a restart on a non-zero exit code, which signifies a failure.

Exam trap

The trap here is that candidates often confuse 'OnFailure' with 'Always', assuming any exit triggers a restart, or they mistakenly think a Pod is 'terminated' (deleted) when it actually enters a terminal phase like Succeeded.

How to eliminate wrong answers

Option A is wrong because the container will restart only if the exit code is non-zero; exit code 0 indicates success, so no restart occurs. Option B is wrong because the Pod is not terminated; it enters the Succeeded phase, which is a terminal phase but not a termination of the Pod object itself. Option C is wrong because the Pod cannot remain in the Running state after the container exits; it must transition to a terminal phase (Succeeded or Failed) based on the exit code and restartPolicy.
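
The restart decision can be summarised in a few lines. This is a simplified model assuming a single-container Pod; the function names are hypothetical and the kubelet's back-off timing is ignored.

```python
def should_restart(restart_policy: str, exit_code: int) -> bool:
    """Decide whether the container is restarted after it exits."""
    if restart_policy == "Always":
        return True
    if restart_policy == "OnFailure":
        # Only a non-zero exit code counts as a failure.
        return exit_code != 0
    if restart_policy == "Never":
        return False
    raise ValueError(f"unknown restartPolicy: {restart_policy}")


def pod_phase_after_exit(restart_policy: str, exit_code: int) -> str:
    """Phase of a single-container Pod once its container has exited."""
    if should_restart(restart_policy, exit_code):
        return "Running"
    return "Succeeded" if exit_code == 0 else "Failed"


# The case from the question: OnFailure with exit code 0.
print(pod_phase_after_exit("OnFailure", 0))
```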


173

MCQmedium

An administrator applies the following YAML:

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: ""

What does setting storageClassName to an empty string achieve?

A.It disables storage for the PVC

B.It ensures the PVC binds only to a PV with no storage class defined

C.It triggers dynamic provisioning using the default StorageClass

D.It assigns the default StorageClass automatically

AnswerB

An empty string means the PVC will only bind to PVs that also have storageClassName set to empty string.

Why this answer

Option B is correct: setting storageClassName to an empty string explicitly restricts the PVC to PVs without a storage class (i.e., static PVs with no storageClassName). It does not use the default StorageClass. Option C is incorrect because an empty string does not trigger dynamic provisioning.

Option A is incorrect because it does not disable storage entirely. Option D is incorrect because an empty string opts out of the default StorageClass rather than selecting it; only a PVC that omits the field is assigned the default.
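
The difference between omitting storageClassName and setting it to an empty string can be modelled as follows. This is a simplified sketch that looks only at the class name and ignores capacity, access modes, and selectors; the function names are hypothetical, `None` stands for an omitted field, and `standard` is an assumed name for the default class.

```python
from typing import Optional


def effective_class(pvc_class: Optional[str], default_class: Optional[str]) -> str:
    """Class a PVC ends up with: None means omitted, '' means explicitly none."""
    if pvc_class is None:
        # An omitted field is filled in with the default class, if one exists.
        return default_class or ""
    return pvc_class


def can_bind_static(pvc_class: Optional[str], pv_class: Optional[str],
                    default_class: Optional[str]) -> bool:
    """True if the PVC's effective class equals the PV's class."""
    return effective_class(pvc_class, default_class) == (pv_class or "")


# PVC omits the field, PV has no class, cluster has a default class: no match.
print(can_bind_static(None, None, "standard"))
# PVC sets storageClassName to "", PV has no class: match.
print(can_bind_static("", None, "standard"))
```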


174

MCQeasy

Which command shows all events in the cluster sorted by timestamp?

A.kubectl describe events

B.kubectl get events

C.kubectl logs --all-namespaces

D.kubectl get events --sort-by=.metadata.creationTimestamp

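AnswerD

`--sort-by` orders the events by the given field; without it the order is not guaranteed.

Why this answer

`kubectl get events` does not guarantee chronological output, so the listing has to be sorted explicitly with `--sort-by=.metadata.creationTimestamp`. Add `-A` (`--all-namespaces`) to include events from every namespace rather than only the current one. Option A is wrong because `kubectl describe events` does not sort the events, and Option C is wrong because `kubectl logs` reads container logs, not events.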


175

MCQmedium

You have a CronJob that runs a batch job every 5 minutes. The job takes about 2 minutes to complete. However, if a job takes longer than 5 minutes, you want to prevent a new job from starting until the previous one finishes. Which CronJob field should you configure?

A.successfulJobsHistoryLimit

B.concurrencyPolicy: Forbid

C.suspend: true

D.startingDeadlineSeconds

AnswerB

Setting concurrencyPolicy to Forbid ensures only one job is running at a time; new jobs are skipped if the previous hasn't completed.

Why this answer

The `concurrencyPolicy` field in a CronJob spec controls how the controller handles overlapping job executions. Setting it to `Forbid` ensures that if a previous job is still running when the next scheduled time arrives, the new job is skipped, preventing concurrent runs. This directly addresses the requirement to block a new job from starting until the previous one finishes.

Exam trap

The trap here is that candidates often confuse `concurrencyPolicy` with `suspend` or `startingDeadlineSeconds`, mistakenly thinking pausing or delaying the job will solve the overlap issue, when only `Forbid` explicitly prevents concurrent runs.

How to eliminate wrong answers

Option A is wrong because `successfulJobsHistoryLimit` controls how many completed jobs are retained for inspection, not the concurrency behavior of running jobs. Option C is wrong because `suspend: true` pauses the entire CronJob, preventing any new jobs from being created at all, rather than conditionally blocking only when a previous job is still running. Option D is wrong because `startingDeadlineSeconds` sets a time window for starting a missed job if the CronJob controller is down, but it does not affect concurrency control.
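
The effect of `Forbid` can be shown with a small simulation. This is a toy model, not the CronJob controller: the function name `simulate_forbid` and the run durations are hypothetical, and time is measured in whole minutes.

```python
def simulate_forbid(schedule_every, durations, ticks):
    """Return (started, skipped) start times under concurrencyPolicy: Forbid.

    durations lists run lengths in minutes, consumed in order by each run
    that actually starts.
    """
    started, skipped = [], []
    busy_until = 0
    runs = iter(durations)
    for i in range(ticks):
        t = i * schedule_every
        if t < busy_until:
            # Previous job is still running: this scheduled run is skipped.
            skipped.append(t)
            continue
        busy_until = t + next(runs)
        started.append(t)
    return started, skipped


# Every 5 minutes; the second run takes 7 minutes and overlaps the next slot.
started, skipped = simulate_forbid(5, [2, 7, 2, 2], ticks=4)
print(started, skipped)
```

In this run the jobs scheduled at minutes 0, 5, and 15 start, while the one scheduled at minute 10 is skipped because the 7-minute job that began at minute 5 is still running.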


176

MCQmedium

You have a service account named 'my-sa' in the 'default' namespace. You want to mount its token into a pod automatically. Which field in the pod spec achieves this?

A.spec.serviceAccountName

B.spec.serviceAccount

C.spec.containers[].env[].valueFrom.secretKeyRef

D.spec.automountServiceAccountToken

AnswerA

Setting this field to 'my-sa' will use that service account and automatically mount its token.

Why this answer

Option A is correct because setting `spec.serviceAccountName` to 'my-sa' in the pod spec automatically mounts the service account token as a volume at `/var/run/secrets/kubernetes.io/serviceaccount/`. This is the standard way to associate a service account with a pod, and Kubernetes automatically handles token projection and mounting for that service account.

Exam trap

The trap here is that candidates confuse `spec.serviceAccountName` with the deprecated `spec.serviceAccount` field, or think that `spec.automountServiceAccountToken` selects a service account, when it only turns token mounting on or off for whichever service account the pod uses.

How to eliminate wrong answers

Option B is wrong because `spec.serviceAccount` is a deprecated alias for `spec.serviceAccountName`; it is still accepted by the API but should not be used in new manifests. Option C is wrong because `spec.containers[].env[].valueFrom.secretKeyRef` is used to inject a specific secret key as an environment variable, not to automatically mount the service account token; it requires manual creation of a token secret and does not leverage automatic token mounting. Option D is wrong because `spec.automountServiceAccountToken` is a boolean field that controls whether the pod's service account token is automatically mounted (defaults to true), but it does not specify which service account to use; it only enables or disables the automatic mounting behavior.

177

MCQhard

You apply the following NetworkPolicy to namespace 'ns1':

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-ingress
spec:
  podSelector: {}
  policyTypes:
  - Ingress
  ingress: []

What effect does this policy have?

A.Denies all egress traffic as well.

B.Allows ingress traffic only from pods in the same namespace.

C.Denies all ingress traffic to all pods in namespace ns1.

D.Allows all ingress traffic because no explicit deny rules are defined.

AnswerC

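With no ingress rules listed, every pod selected by the policy has all inbound traffic denied.

Why this answer

'podSelector: {}' selects every pod in the namespace, and listing Ingress in 'policyTypes' with an empty 'ingress: []' list means no inbound traffic is allowed to those pods. This is the standard default-deny ingress policy. Egress is unaffected because Egress is not listed in 'policyTypes'.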
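A default-deny policy like the one in this question is usually paired with narrower allow policies, because NetworkPolicies are additive. The sketch below shows what such a companion policy might look like; the labels 'app: backend' and 'app: frontend' and port 8080 are assumptions made up for illustration.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-backend   # hypothetical policy name
  namespace: ns1
spec:
  podSelector:
    matchLabels:
      app: backend                  # hypothetical label on the protected pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend         # hypothetical label on the allowed clients
      ports:
        - protocol: TCP
          port: 8080                # hypothetical application port
```

Pods labelled 'app: backend' would then accept TCP 8080 from 'app: frontend' pods in the same namespace, while all other ingress in 'ns1' stays denied.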

178

MCQhard

You need to check the logs of a container that previously crashed. The pod is currently running, but the previous instance of the container exited with an error. Which command will show you the logs from the crashed container?

A.kubectl logs pod-name --previous

B.kubectl exec pod-name -- cat /var/log/crash.log

C.kubectl attach pod-name

D.kubectl logs pod-name -c container-name

AnswerA

'kubectl logs --previous' retrieves logs from the previous terminated container, exactly what is needed to see the crash output.

Why this answer

The '--previous' flag shows logs from the previous instance of a container in a pod, which is useful for debugging crash loops.

179

MCQmedium

An administrator needs to upgrade a Kubernetes cluster from v1.28 to v1.29 using kubeadm. Which of the following steps is performed FIRST?

A.Run 'kubeadm upgrade plan' on the first control plane node

B.Drain all worker nodes

C.Run 'kubectl cordon' on all nodes

D.Upgrade kubelet on the first control plane node

AnswerA

Why this answer

Before any upgrade operations, kubeadm requires an assessment of the cluster's upgrade path and potential issues. 'kubeadm upgrade plan' on the first control plane node checks the current and target versions, validates the upgrade feasibility, and displays the upgrade steps and any manual interventions needed. This must be done first to ensure the upgrade is safe and to identify any version skew or configuration problems before proceeding.

Exam trap

The trap here is that candidates often assume draining or cordoning nodes is the first step, but the CKA exam emphasizes that the upgrade plan must be run first to validate the upgrade path and avoid irreversible errors.

How to eliminate wrong answers

Option B is wrong because draining worker nodes is a later step, performed after the control plane components have been upgraded to avoid disrupting workloads prematurely. Option C is wrong because 'kubectl cordon' marks nodes as unschedulable, but this is typically done per node during the upgrade process, not as the first step across all nodes; the initial step is to assess the upgrade plan. Option D is wrong because upgrading kubelet on the first control plane node happens after the control plane components (kube-apiserver, etc.) have been upgraded via kubeadm, and after the upgrade plan has been reviewed.

180

MCQhard

A DevOps engineer notices that the kubelet on a node is unable to register with the Kubernetes API server. The kubelet logs show 'Failed to get bootstrap CA certificate' and the node is not yet part of the cluster. What is the most likely cause?

A.The kubelet configuration file has incorrect node IP.

B.The node's RBAC permissions are misconfigured.

C.The API server is not running.

D.The bootstrap token used for TLS bootstrapping has expired.

AnswerD

Expired token prevents CA certificate retrieval.

Why this answer

The bootstrap token used for TLS bootstrapping has expired. During the TLS bootstrap process, the kubelet uses a limited-time bootstrap token to authenticate with the API server and request a client certificate. If the token expires before the kubelet completes registration, the kubelet will fail to obtain the bootstrap CA certificate and cannot join the cluster, as indicated by the error 'Failed to get bootstrap CA certificate'.

Exam trap

CNCF often tests the distinction between authentication failures (expired token) and authorization failures (RBAC), leading candidates to incorrectly select RBAC misconfiguration when the actual issue is token expiry.

How to eliminate wrong answers

Option A is wrong because an incorrect node IP would cause connectivity or identity issues, but the specific error 'Failed to get bootstrap CA certificate' points to a TLS bootstrap authentication failure, not an IP misconfiguration. Option B is wrong because RBAC permissions are enforced after authentication; the kubelet cannot even authenticate with an expired token, so RBAC misconfiguration is not the root cause. Option C is wrong because if the API server were not running, the kubelet would likely report a connection refused or timeout error, not a bootstrap CA certificate retrieval failure.

181

MCQeasy

Which command can be used to scale a Deployment named 'webapp' to 5 replicas?

A.kubectl patch deployment webapp -p '{"spec":{"replicas":5}}'

B.kubectl edit deployment webapp --replicas=5

C.kubectl scale deployment webapp --replicas=5

D.kubectl update deployment webapp --replicas=5

AnswerC

'kubectl scale' directly changes the replica count.

Why this answer

Option C is correct. 'kubectl scale' is the command to scale resources. Option D is incorrect because 'kubectl update' does not exist. Option B is incorrect because 'kubectl edit' opens an editor and has no '--replicas' flag.

Option A is not the best answer because 'kubectl patch' can change the replica count but is more complex than necessary.

182

Multi-Selectmedium

Which THREE are valid methods to debug DNS resolution inside a pod? (Select 3)

Select 3 answers

A.kubectl exec <pod> -- ping kubernetes.default.svc.cluster.local

B.kubectl exec <pod> -- nslookup kubernetes.default.svc.cluster.local

C.kubectl logs <pod>

D.kubectl exec <pod> -- dig kubernetes.default.svc.cluster.local

E.kubectl exec <pod> -- curl http://kubernetes.default

AnswersB, D, E

nslookup queries DNS.

Why this answer

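nslookup and dig query the cluster DNS service directly, and curl to a Service name must resolve that name before it can connect, so all three exercise DNS resolution. ping also resolves the name first, but Service ClusterIPs commonly do not answer ICMP, so a failed ping cannot tell you whether DNS is at fault. kubectl logs only shows the container's output and does not test resolution.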

183

MCQmedium

A pod in the 'production' namespace is in a CrashLoopBackOff state. The pod has been running successfully for several days. You run 'kubectl describe pod app-pod -n production' and see the message: 'OOMKilled'. What is the MOST appropriate action to resolve this issue?

A.Increase the CPU request for the container

B.Increase the memory limit in the pod's container resource specification

C.Delete and recreate the pod to clear the crash loop

D.Delete the namespace and redeploy all workloads

AnswerB

OOMKilled indicates the container exceeded its configured memory limit. Increasing the memory limit allows the container to use more memory and prevents the OOM kill.

Why this answer

Option B is correct. OOMKilled means the container exceeded its memory limit and was killed by the kernel OOM killer. The solution is to increase the memory limit in the container's resource specification.

Option A addresses CPU, not memory. Option C would not help: recreating the pod without addressing the root cause will result in the same failure. Option D (deleting the namespace) is destructive and unnecessary.
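For illustration, the memory limit lives under each container's 'resources' block. The sketch below assumes placeholder values (256Mi request, 512Mi limit) and a placeholder image; in practice the change is usually made on the owning Deployment or other controller rather than on a bare Pod.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod
  namespace: production
spec:
  containers:
    - name: app                                # placeholder container name
      image: registry.example.com/app:1.0      # placeholder image
      resources:
        requests:
          memory: "256Mi"                      # assumed value for illustration
        limits:
          memory: "512Mi"                      # raise this if the container is OOMKilled
```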

184

Multi-Selectmedium

Which TWO of the following are valid methods to restore an etcd cluster from a snapshot? (Select 2)

Select 2 answers

A.Restarting etcd pod

B.Using etcd operator with a backup CR

C.etcdctl snapshot restore

D.kubeadm reset

E.kubectl apply -f etcd-backup.yaml

AnswersB, C

The etcd operator can restore from backups.

Why this answer

Option B is correct because the etcd operator, commonly used in Kubernetes environments managed by operators, provides a custom resource (CR) for backup and restore. When a backup CR is applied, the operator automates the process of restoring an etcd cluster from a previously taken snapshot, handling the necessary steps like stopping etcd, restoring data, and restarting the cluster. Option C is correct because `etcdctl snapshot restore` is the native etcd command-line operation that rebuilds an etcd data directory from a snapshot file, which is the standard method for manual restoration.

Exam trap

The trap here is that candidates often confuse restarting a pod (which only reloads the current data) with restoring from a backup, or they mistakenly think `kubectl apply` can directly restore etcd because it's a common Kubernetes command, but etcd restoration requires specific snapshot handling tools.

185

MCQhard

A cluster administrator needs to provide storage to a pod that must read and write files, but the data does not need to persist beyond the pod's lifecycle. Which volume type should be used?

A.hostPath

B.emptyDir

C.configMap

D.PersistentVolumeClaim

AnswerB

emptyDir provides temporary storage that is deleted when the pod terminates.

Why this answer

B is correct because emptyDir creates an empty volume that is provisioned when a pod is assigned to a node and exists as long as the pod is running. It allows both reading and writing files, and its contents are deleted when the pod is removed, matching the requirement that data does not need to persist beyond the pod's lifecycle.

Exam trap

The trap here is that candidates often confuse emptyDir with hostPath, thinking both provide temporary storage, but hostPath persists on the node and can cause data leakage or node-specific issues, while emptyDir is truly ephemeral and tied to the pod's lifecycle.

How to eliminate wrong answers

Option A is wrong because hostPath mounts a file or directory from the host node's filesystem into the pod, and data persists on the node even after the pod is deleted, which violates the requirement that data does not need to persist beyond the pod's lifecycle. Option C is wrong because configMap is designed to inject configuration data as files or environment variables, and while it can be mounted as a volume, it is read-only by default and not intended for read/write file storage. Option D is wrong because PersistentVolumeClaim is used to request persistent storage that outlives the pod, which directly contradicts the requirement that data does not need to persist beyond the pod's lifecycle.
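A minimal sketch of an emptyDir volume follows. The pod name, image, command, and mount path are assumptions chosen only to show a container writing to pod-scoped scratch space.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scratch-demo              # hypothetical pod name
spec:
  containers:
    - name: app
      image: busybox              # placeholder image
      # Writes a file into the scratch volume, then idles.
      command: ["sh", "-c", "echo hello > /scratch/out.txt && sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch     # hypothetical mount path
  volumes:
    - name: scratch
      emptyDir: {}                # created empty on the node; removed with the pod
```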

186

MCQmedium

Which of the following commands can be used to check the endpoints of a service named 'my-service'?

A.kubectl get endpoints my-service

B.kubectl get pod my-service

C.kubectl get service my-service -o yaml

D.kubectl get svc my-service --show-endpoints

AnswerA

Shows the endpoints of the service.

Why this answer

The command 'kubectl get endpoints my-service' or 'kubectl describe svc my-service' can show endpoints.

187

MCQeasy

Which command creates a PersistentVolume named 'pv-data' that uses a hostPath volume located at '/mnt/data' with a storage capacity of 10Gi and access mode ReadWriteOnce?

A.kubectl create pv pv-data --capacity=10Gi --access-modes=ReadWriteOnce --hostpath=/mnt/data

B.kubectl apply -f pv.yaml

C.kubectl run pv-data --image=nginx --hostpath=/mnt/data

D.kubectl create -f pv.yaml

AnswerB

This command applies a YAML manifest that defines the PV with the required specifications.

Why this answer

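The keyed answer applies a YAML manifest with 'kubectl apply -f pv.yaml'. A manifest is the standard way to create a PersistentVolume, because kubectl has no imperative subcommand for PVs: option A is not a real command, and option C targets pods, not PVs. 'kubectl create -f pv.yaml' (option D) would also create the object from the same file; the key prefers 'apply' as the declarative form.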
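The question does not show the contents of 'pv.yaml'. A manifest matching the stated requirements might look like the sketch below; the reclaim policy line is an assumption added for clarity and is not required by the question.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # assumption; not stated in the question
  hostPath:
    path: /mnt/data
```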

188

MCQeasy

You run 'kubectl top nodes' and it returns an error: 'error: metrics not available yet'. What does this indicate?

A.The kubelet is not running on the nodes.

B.The cluster is using an older version of Kubernetes.

C.The nodes are under heavy load.

D.The Metrics Server is not installed or not functioning.

AnswerD

Metrics Server must be deployed to collect and expose resource metrics.

Why this answer

The 'top' command relies on the metrics server to provide resource usage data. The error means the metrics server is not deployed or not ready.

189

MCQmedium

A pod is in CrashLoopBackOff. You run 'kubectl logs pod-name' and see nothing. You suspect the app is failing due to a missing environment variable. Which command can you use to verify environment variables inside the container?

A.kubectl exec pod-name -- env

B.kubectl get pod pod-name -o yaml

C.kubectl logs pod-name --previous

D.kubectl describe pod pod-name

AnswerA

Runs 'env' in the container, showing actual env vars.

Why this answer

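kubectl exec runs a command inside a running container, and 'env' lists the environment variables the process actually sees. This only works while the container is up; if it exits too quickly to exec into, the variables declared in the pod spec can be read with 'kubectl get pod pod-name -o yaml' instead.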

190

MCQmedium

A ServiceAccount named 'my-sa' exists in the 'default' namespace. Which command creates a token for this ServiceAccount and stores it in a secret?

A.kubectl get secret my-sa-token-xxxxx -o jsonpath='{.data.token}' | base64 -d

B.kubectl apply -f token.yaml

C.kubectl create serviceaccount my-sa --token

D.kubectl create token my-sa

AnswerD

This creates a token for the ServiceAccount.

Why this answer

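Option D is correct because `kubectl create token my-sa` is the dedicated command in Kubernetes 1.24+ to request a time-bound, signed token for a ServiceAccount through the TokenRequest API. The token is printed to standard output; despite the question's wording, it is not persisted as a Secret object, but of the options listed this is the only one that actually creates a token. This command replaces reliance on the legacy automatic token Secret. A long-lived token held in a Secret has to be created explicitly, as a Secret of type `kubernetes.io/service-account-token` annotated with the ServiceAccount name.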

Exam trap

The trap here is that candidates may confuse the legacy automatic token Secret creation (which was removed in Kubernetes 1.24) with the new explicit `kubectl create token` command, or mistakenly think `kubectl create serviceaccount` with a flag can generate a token.

How to eliminate wrong answers

Option A is wrong because it retrieves and decodes an existing token from a Secret, but does not create a new token for the ServiceAccount. Option B is wrong because it applies a YAML file named 'token.yaml', which is a generic operation that could create any resource, not a specific command to create a ServiceAccount token. Option C is wrong because `kubectl create serviceaccount my-sa --token` is not a valid kubectl syntax; the `--token` flag is not supported for the `create serviceaccount` subcommand.
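When a long-lived token held in a Secret is genuinely needed, the Secret has to be created explicitly. The following is a sketch of such a manifest; the Secret name is hypothetical, while the type and annotation key are the ones Kubernetes uses for ServiceAccount token Secrets.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-sa-token                             # hypothetical Secret name
  namespace: default
  annotations:
    kubernetes.io/service-account.name: my-sa   # binds the Secret to the ServiceAccount
type: kubernetes.io/service-account-token
```

Once applied, the control plane should populate the Secret's 'token' field for 'my-sa'.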

191

Multi-Selectmedium

Which TWO statements about emptyDir volumes are correct?

Select 2 answers

A.An emptyDir volume is created empty when a Pod is assigned to a node.

B.An emptyDir volume can be shared between Pods on different nodes.

C.An emptyDir volume persists across pod restarts.

D.An emptyDir volume is deleted when the Pod is removed from the node.

E.An emptyDir volume requires a PersistentVolume.

AnswersA, D

emptyDir starts as an empty directory on the node.

Why this answer

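emptyDir starts empty when the Pod is assigned to a node (A) and is deleted when the Pod is removed from that node (D). It lives on a single node, so it cannot be shared between Pods on different nodes (B), and it does not require a PV or PVC (E).

Option C is keyed as false: the data survives container restarts inside the same Pod, but it is lost when the Pod itself is deleted or rescheduled.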

192

MCQhard

An admin attempts to restore an etcd snapshot using 'etcdctl snapshot restore' but encounters an error. Which environment variable must be set for etcdctl to work with v3 API?

A.ETCD_API=3

B.ETCDCTL_API=v3

C.ETCDCTL_API=3

D.ETCDCTL_VERSION=3

AnswerC

This variable enables the v3 API.

Why this answer

Option C is correct because `ETCDCTL_API=3` selects the v3 API in etcdctl. Older etcdctl releases (before v3.4) default to the v2 API, under which `snapshot restore` and the other v3 commands are not available. Newer releases already default to v3, but setting the variable explicitly is harmless and is the form shown in the Kubernetes documentation.

Exam trap

The trap here is that candidates often confuse the variable name (`ETCDCTL_API` vs `ETCD_API`) or the value format (`3` vs `v3`), leading them to pick a syntactically similar but incorrect option.

How to eliminate wrong answers

Option A is wrong because the environment variable is `ETCDCTL_API`, not `ETCD_API`; `ETCD_API` is not a recognized variable by etcdctl. Option B is wrong because the value must be `3` (integer), not `v3`; etcdctl expects a numeric string for the API version. Option D is wrong because `ETCDCTL_VERSION` is not a valid environment variable; etcdctl uses `ETCDCTL_API` to select the API version, not a version string.

193

MCQmedium

A DevOps engineer is designing a Kubernetes cluster for a production environment. Which of the following is a best practice for etcd deployment?

A.Deploy etcd on exactly 2 nodes for simplicity.

B.Deploy etcd on all worker nodes to maximize redundancy.

C.Deploy etcd on dedicated nodes with SSD storage.

D.Deploy etcd on the same nodes as GPU-accelerated workloads.

AnswerC

Dedicated nodes with fast storage are recommended for etcd performance and stability.

Why this answer

Option C is correct because etcd is the Kubernetes cluster's primary data store, and its performance directly impacts the entire cluster's stability and responsiveness. Dedicated nodes prevent resource contention from other workloads, while SSD storage provides the low-latency, high-IOPS performance required for etcd's frequent, fsync-heavy writes to its write-ahead log. This isolation is a recommended best practice in the official Kubernetes documentation for production clusters.

Exam trap

The trap here is that candidates often assume 'more nodes = more redundancy' (Option B) or that 'simplicity is better' (Option A), without understanding that etcd's Raft consensus needs a majority quorum (so an odd number of members is recommended) and that dedicated, fast storage is essential for production reliability.

How to eliminate wrong answers

Option A is wrong because etcd should run with an odd number of members (typically 3, 5, or 7) so that the Raft consensus algorithm can keep a quorum through failures; a quorum is more than N/2 members, so a 2-member cluster needs both members up and tolerates no failures at all, which is no better than a single member. Option B is wrong because deploying etcd on all worker nodes introduces severe performance risks due to resource contention with application pods, and it violates the principle of separating the control plane from data plane components. Option D is wrong because GPU-accelerated workloads are typically compute-intensive and can cause unpredictable I/O and CPU spikes, which would degrade etcd's latency-sensitive operations and risk cluster instability.

194

MCQmedium

A Pod is configured with a volume that uses a ConfigMap. The ConfigMap is updated after the Pod is running. How can the Pod access the updated data without restarting?

A.The ConfigMap must be deleted and recreated.

B.The Node must be rebooted.

C.The Pod will automatically see the updated data within minutes.

D.The Pod must be deleted and recreated.

AnswerC

Kubelet periodically syncs ConfigMap data; mounted volumes are updated automatically (without subPath).

Why this answer

When a ConfigMap is mounted as a whole volume (without subPath), the kubelet refreshes the projected files automatically after its sync period, so the Pod sees the new data without a restart, usually within minutes. Files mounted with subPath are not refreshed; in that case the Pod must be restarted to pick up changes.

The question describes a plain volume mount and does not mention subPath, so the default behaviour applies and no action is needed. Restarting the Pod would only be a way to force the update immediately. Note that the application still has to re-read the files for the new values to take effect.
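The difference between a whole-volume mount and a subPath mount can be sketched as follows. The ConfigMap name 'app-config', the file name 'app.conf', the mount path, and the image are assumptions for illustration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: config-demo               # hypothetical pod name
spec:
  containers:
    - name: app
      image: nginx                # placeholder image
      volumeMounts:
        - name: config
          mountPath: /etc/app     # whole-volume mount: files refresh after the kubelet sync
          # Adding "subPath: app.conf" here would stop updates from propagating.
  volumes:
    - name: config
      configMap:
        name: app-config          # hypothetical ConfigMap name
```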

195

MCQhard

You have a multi-node cluster. One node shows 'NotReady'. You run 'journalctl -u kubelet' on that node and see 'network plugin is not ready'. What is the most likely cause?

A.The CNI plugin pod (e.g., Calico) is not running on that node

B.The container runtime (e.g., containerd) is down

C.The kubelet service is not running

D.The node's IP address has changed

AnswerA

The kubelet reports network plugin not ready when the CNI plugin is not operational.

Why this answer

The kubelet's network plugin (e.g., CNI) is not ready, often because the pod network add-on is not deployed or misconfigured.

196

MCQhard

A StatefulSet named 'db' has 3 replicas. You need to update the pod template to change the resource limits. After applying the change, you run 'kubectl rollout status sts db' and it hangs. What is the most likely reason?

A.The update strategy is set to OnDelete, and you need to delete pods manually.

B.The StatefulSet's pod management policy is OrderedReady, and the first pod to update (db-2) is not becoming Ready.

C.The maxSurge setting is preventing the update from starting.

D.The StatefulSet's service name is incorrect, causing DNS resolution failures.

AnswerB

StatefulSet updates pods in reverse ordinal order, waiting for each to become Ready before proceeding. If db-2 fails to become Ready, the rollout hangs.

Why this answer

Option B is correct because a StatefulSet using the default RollingUpdate strategy updates its pods one at a time in reverse ordinal order (from highest ordinal to lowest) and waits for each updated pod to become Running and Ready before moving on. When `kubectl rollout status sts db` hangs, the update is stuck waiting for the first pod in the sequence (db-2) to become Ready. If db-2 fails to become Ready with the new resource limits (e.g., insufficient cluster resources or misconfigured limits), the controller never proceeds to db-1 and db-0, so the command waits indefinitely.

Exam trap

The trap here is that candidates confuse StatefulSet update behavior with Deployment behavior, assuming that a maxSurge setting controls the rollout, when in fact StatefulSets have no maxSurge field and update pods one by one in ordinal order.

How to eliminate wrong answers

Option A is wrong because the OnDelete update strategy requires manual pod deletion to trigger updates, but the question states that the rollout status command hangs, implying the update was applied and is waiting for pods to become Ready, not that pods are untouched. Option C is wrong because StatefulSets do not support a maxSurge setting; maxSurge belongs to the Deployment rolling update strategy, while a StatefulSet rolling update is controlled by its ordinal ordering and the optional partition field. Option D is wrong because an incorrect service name would cause DNS resolution failures for pod-to-pod communication, but it would not prevent the StatefulSet controller from updating pods or cause the rollout status to hang; the controller would still proceed with the update regardless of DNS issues.
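The fields that govern this behaviour can be sketched in a manifest like the one below. Only the name 'db' and the replica count come from the question; the labels, image, limit values, and the explicit defaults are assumptions added for illustration.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db                       # assumed headless Service name
  replicas: 3
  podManagementPolicy: OrderedReady     # the default, shown explicitly
  updateStrategy:
    type: RollingUpdate                 # the default; pods update db-2, then db-1, then db-0
    rollingUpdate:
      partition: 0                      # only ordinals >= partition are updated
  selector:
    matchLabels:
      app: db                           # hypothetical label
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: registry.example.com/db:1.0   # placeholder image
          resources:
            limits:
              memory: "1Gi"             # assumed value
              cpu: "500m"               # assumed value
```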

197

Multi-Selectmedium

Which TWO of the following are common causes of CrashLoopBackOff? (Choose two)

Select 2 answers

A.Node resource pressure (CPU/memory)

B.Misconfigured startup probe

C.NetworkPolicy blocking egress

D.Application initialization failure

E.PersistentVolume not mounted

AnswersB, D

Why this answer

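CrashLoopBackOff occurs when a container repeatedly starts and then exits or is killed, so the kubelet backs off between restarts. Application initialization failures (D) and misconfigured startup probes that cause the kubelet to kill the container (B) both lead to this state. Node resource pressure usually shows up as Pending or evicted pods, not CrashLoopBackOff.

A PersistentVolume that cannot be mounted normally leaves the pod stuck in ContainerCreating with FailedMount events, because the container never starts; it leads to CrashLoopBackOff only indirectly, if the application starts and then fails without its data. A NetworkPolicy does not by itself cause a container to crash.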
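To make the startup probe case concrete, here is a sketch of a probe that gives a slow-starting container enough time. The path, port, image, and timing values are assumptions; a probe with too small a budget would have the kubelet kill the container before it finishes starting.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: slow-start-demo                      # hypothetical pod name
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0    # placeholder image
      ports:
        - containerPort: 8080
      startupProbe:
        httpGet:
          path: /healthz                     # hypothetical health endpoint
          port: 8080
        periodSeconds: 10
        failureThreshold: 30                 # up to 30 x 10s = 300s allowed for startup
```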

198

Multi-Selectmedium

Which two statements are true about EndpointSlices? (Choose two.)

Select 2 answers

A.EndpointSlices are only used by headless Services

B.EndpointSlices can contain multiple endpoints per slice

C.EndpointSlices replace the deprecated Endpoints resource

D.EndpointSlices cannot be created manually

E.EndpointSlices are namespaced resources

AnswersB, E

EndpointSlices can contain up to 100 endpoints by default.

Why this answer

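EndpointSlices are namespaced objects (E), and each slice groups multiple endpoints (B), up to 100 per slice by default. They were introduced as a more scalable alternative to Endpoints and carry topology information. Option A is wrong because they back every Service with a selector, not only headless Services, and option D is wrong because they can also be created and managed manually, for example for Services without selectors. Option C is keyed as incorrect because Endpoints objects continue to exist alongside EndpointSlices.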

199

MCQmedium

A pod is in 'Pending' state. After running 'kubectl describe pod', you see the event: '0/3 nodes are available: 3 PersistentVolumeClaim is not bound'. What is the most likely cause?

A.The PersistentVolumeClaim referenced by the pod does not exist or is not bound

B.The storage class used by the PVC does not match any PV

C.The scheduler is not configured to handle persistent volumes

D.The nodes do not have enough CPU or memory

AnswerA

The event explicitly states the PVC is not bound.

Why this answer

The PVC referenced by the pod is not bound to a PV. Option A correctly identifies this. Option B (different storage class) is a possible reason but not directly indicated.

Option C (scheduler conflict) is irrelevant. Option D (node resources) would produce a different event.

200

MCQmedium

You need to investigate why a service is not reachable from within the cluster. Which of the following is the first step?

A.Check kube-proxy logs on the nodes

B.Check DNS resolution

C.Check if the service has endpoints

D.Restart the service

AnswerC

Why this answer

Option C is correct because the most fundamental check when a service is unreachable from within the cluster is to verify whether the service has any endpoints. A service without endpoints means no pods are matching its selector, so traffic cannot be forwarded regardless of DNS or kube-proxy status. The `kubectl get endpoints <service>` command directly reveals this, making it the logical first step before deeper diagnostics.

Exam trap

The trap here is that candidates often jump to DNS or kube-proxy issues because those are common networking topics, but the CKA exam expects you to follow a logical troubleshooting hierarchy, starting with the simplest check—whether the service has any backing pods.

Why the other options are wrong

A

More advanced step; start with endpoints.

B

DNS is for service discovery, not connectivity.

D

Restarting is not troubleshooting.

201

Matchingmedium

Match each scheduling concept to its description.

Concepts and matches

nodeSelector: Simple label-based constraint

Node affinity: Expressive scheduling rules using labels

Taint: Repels Pods unless they tolerate the taint

Toleration: Allows a Pod to be scheduled on a tainted Node

Pod priority: Determines scheduling precedence based on priority class

Why these pairings

Scheduling controls where Pods run; these mechanisms enforce placement policies.
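As a rough illustration of how these mechanisms appear in a Pod spec, the sketch below combines a nodeSelector, a node affinity rule, a toleration, and a priority class. Every label, key, value, and class name here is hypothetical; taints themselves are set on Nodes rather than in the Pod manifest.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: scheduling-demo                # hypothetical pod name
spec:
  priorityClassName: high-priority     # hypothetical PriorityClass; must already exist
  nodeSelector:
    disktype: ssd                      # simple label match (hypothetical label)
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: topology.kubernetes.io/zone
                operator: In
                values: ["zone-a", "zone-b"]   # hypothetical zone names
  tolerations:
    - key: dedicated                   # tolerates a hypothetical taint dedicated=batch:NoSchedule
      operator: Equal
      value: batch
      effect: NoSchedule
  containers:
    - name: app
      image: nginx                     # placeholder image
```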

202

Matchingmedium

Match each etcd operation to its description.

Concepts and matches

etcdctl put: Store a key-value pair

etcdctl get: Retrieve value for a key

etcdctl snapshot save: Create a backup of the datastore

etcdctl member list: List all etcd cluster members

etcdctl endpoint health: Check health of etcd endpoints

Why these pairings

etcd is the key-value store for cluster state; these operations are vital for backup and recovery.

203

Multi-Selectmedium

Which TWO of the following are valid reclaim policies for a PersistentVolume?

Select 2 answers

A.Snapshot

B.Reuse

C.Retain

D.Delete

E.Archive

AnswersC, D

Retain keeps the PV and its data after PVC deletion.

Why this answer

The valid reclaim policies are Retain, Delete, and Recycle (deprecated). Retain retains the PV after PVC deletion, Delete deletes the underlying storage asset, and Recycle (deprecated) performs a basic scrub. So the correct two are Retain and Delete.

204

MCQmedium

A StatefulSet named 'web' uses volumeClaimTemplates to dynamically provision PersistentVolumeClaims for each replica. The cluster runs on AWS with the EBS CSI driver. How many PersistentVolumeClaims will be created if the StatefulSet has 3 replicas and the volumeClaimTemplates specify a single template?

A.9

B.1

C.Depends on the storage class

D.3

AnswerD

Each replica gets its own PVC from the template, so 3 PVCs are created.

Why this answer

volumeClaimTemplates create one PVC per replica, using the template. With 3 replicas, 3 PVCs are created, each bound to its own dynamically provisioned volume.
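A sketch of such a StatefulSet is shown below. The template name 'data', the labels, image, mount path, storage size, and the StorageClass name 'ebs-sc' are assumptions; only the StatefulSet name 'web', the replica count, and the single template come from the question.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web                     # assumed headless Service name
  replicas: 3
  selector:
    matchLabels:
      app: web                         # hypothetical label
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx                 # placeholder image
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
    - metadata:
        name: data                     # hypothetical template name
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-sc       # hypothetical StorageClass backed by the EBS CSI driver
        resources:
          requests:
            storage: 1Gi               # assumed size
```

PVCs are named after the template and the pod, so with these names the three claims would be 'data-web-0', 'data-web-1', and 'data-web-2'.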

205

MCQeasy

Which Service type is used to expose a service externally with a cloud provider's load balancer?

A.NodePort

B.ClusterIP

C.LoadBalancer

D.ExternalName

AnswerC

Correct. LoadBalancer creates an external load balancer from the cloud provider.

Why this answer

Option C is correct. The LoadBalancer type provisions a cloud load balancer. ClusterIP is internal, NodePort exposes the Service on each node's IP, and ExternalName maps to a DNS name.
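For reference, a minimal LoadBalancer Service might look like the following sketch. The Service name, selector label, and port numbers are assumptions for illustration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-lb            # hypothetical Service name
spec:
  type: LoadBalancer      # asks the cloud provider for an external load balancer
  selector:
    app: web              # hypothetical pod label
  ports:
    - port: 80            # port exposed by the load balancer
      targetPort: 8080    # assumed container port
```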

206

MCQhard

An administrator runs 'kubeadm certs check-expiration' and sees that the kubelet client certificate expires in 7 days. What is the correct way to renew it?

A.Run 'kubeadm init phase certs kubelet-client'

B.Renew all certificates using 'kubeadm certs renew all'

C.Run 'kubeadm certs renew kubelet-client'

D.Delete the old certificate and restart kubelet

AnswerC

Correct: renews the kubelet client certificate.

Why this answer

The keyed answer is a targeted renewal: `kubeadm certs renew <name>` renews one kubeadm-managed certificate using the existing CA, instead of renewing everything at once. Be aware that real kubeadm releases do not list a certificate literally named `kubelet-client`; the kubeadm-managed certificate is `apiserver-kubelet-client` (the API server's client certificate for talking to kubelets), while the kubelet's own client certificate is normally rotated automatically by the kubelet. After renewing control plane certificates, the affected control plane pods must be restarted to pick up the new files.

Exam trap

The trap here is that candidates may think deleting the old certificate and restarting the kubelet (Option D) will force renewal, but kubeadm does not auto-renew certificates on file deletion; the kubelet will simply fail to start without a valid certificate.

How to eliminate wrong answers

Option A is wrong because `kubeadm init phase certs kubelet-client` is used during initial cluster setup, not for renewal; it would attempt to generate a new certificate from scratch, potentially overwriting the existing CA configuration. Option B is wrong because `kubeadm certs renew all` renews all certificates managed by kubeadm, which is overkill and may cause unnecessary restarts of control-plane components when only the kubelet client certificate needs renewal. Option D is wrong because simply deleting the old certificate and restarting kubelet does not trigger certificate renewal; the kubelet would fail to start without a valid certificate, and the CA must sign a new certificate explicitly via `kubeadm certs renew` or manual CSR.

207

MCQeasy

You need to check the CPU and memory usage of all pods in the 'production' namespace. Which command should you use?

A.kubectl top pods -n production

B.kubectl get pods -n production --show-labels

C.kubectl get resourceusage pods -n production

D.kubectl top nodes -n production

AnswerA

This shows CPU and memory usage for pods in the namespace.

Why this answer

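'kubectl top pods -n production' (option A) shows CPU and memory usage for the pods in the namespace. Option B only lists pods with their labels, not resource usage. Option C is a made-up command.

Option D reports usage for nodes, not pods; nodes are not namespaced, so the '-n production' flag has no effect there.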

208

Multi-Selectmedium

Which TWO of the following are control plane components? (Select 2)

Select 2 answers

A.kube-proxy

B.kube-controller-manager

C.kube-scheduler

D.kubelet

E.etcd

AnswersB, C

Part of the control plane.

Why this answer

The kube-controller-manager (B) and kube-scheduler (C) are core control plane components that run on the control plane nodes. The controller-manager runs controller processes (e.g., Node Controller, Replication Controller) to regulate cluster state, while the scheduler assigns pods to nodes based on resource constraints and policies. Both are essential for cluster management and are listed in the official Kubernetes control plane architecture.

Exam trap

CNCF often tests the distinction between control plane and node-level components, trapping candidates who confuse kube-proxy or kubelet (which run on every node) with control plane components. Note that the official Kubernetes documentation also lists etcd, the cluster's distributed key-value store, among the control plane components; the answer key here counts only B and C.

Full explanation →

209

Drag & Dropmedium

Drag and drop the steps to upgrade a Kubernetes cluster using kubeadm into the correct order.

Why this order

First drain the node, upgrade kubeadm, then kubelet/kubectl, restart kubelet, uncordon, then repeat for workers.

Full explanation →

210

MCQmedium

A pod in the 'production' namespace is in a CrashLoopBackOff state. The pod has been running successfully for several days. You run 'kubectl describe pod app-pod -n production' and see the message: 'OOMKilled'. What is the MOST appropriate action to resolve this issue?

A.Increase the CPU request for the container

B.Delete and recreate the pod to clear the crash loop

C.Delete the namespace and redeploy all workloads

D.Increase the memory limit in the pod's container resource specification

AnswerD

OOMKilled indicates the container exceeded its configured memory limit. Increasing the memory limit allows the container to use more memory and prevents the OOM kill.

Why this answer

The 'OOMKilled' status indicates that the container was terminated because it exceeded its memory limit. Since the pod ran successfully for days before crashing, the most likely cause is a memory leak or increased workload demand. Increasing the memory limit in the container's resource specification allows the pod to use more memory without being killed, directly addressing the root cause.

Exam trap

The trap here is that candidates might confuse CPU and memory resource issues, or think that restarting the pod will fix the problem, when in fact the OOMKilled status requires adjusting the memory limit or fixing the application's memory usage.

How to eliminate wrong answers

Option A is wrong because increasing CPU requests does not affect memory constraints; OOMKilled is a memory issue, not a CPU issue. Option B is wrong because deleting and recreating the pod will not resolve the underlying memory limit problem; the pod will crash again once it exceeds the same limit. Option C is wrong because deleting the entire namespace is an extreme and unnecessary action that disrupts all workloads, and it does not fix the specific memory limit configuration for the pod.

Full explanation →

211

Multi-Selectmedium

A node is in 'NotReady' state. Which TWO of the following are common causes? (Select 2)

Select 2 answers

A.A pod on the node is in ImagePullBackOff

B.The node has insufficient memory

C.The kubelet service has stopped on the node

D.The Kubernetes API server is down

E.The network plugin (e.g., Calico, Flannel) is not functioning

AnswersC, E

If kubelet is not running, the node cannot report its status and becomes NotReady.

Why this answer

Options C and E are correct. A stopped kubelet and network plugin issues are common reasons for a node to be NotReady. Option B (insufficient memory) would cause pods to be evicted, but the node remains Ready.

Option D (API server down) affects the entire cluster, but nodes may still report Ready if the kubelet is healthy. Option A (image pull) is a pod-level issue.

Full explanation →

212

MCQmedium

A pod is failing to start with the error 'CrashLoopBackOff'. You check the logs with 'kubectl logs pod' and see nothing. What is the most likely reason?

A.The pod's log path is misconfigured

B.The container crashed before generating any log output

C.The kubelet has logging disabled

D.The pod is using a sidecar container for logging

AnswerB

If the container fails early, no logs may be produced.

Why this answer

If the container crashes before writing anything to stdout/stderr, logs will be empty. The crash may happen early in the startup process.

Full explanation →

213

Multi-Selecthard

A Kubernetes cluster uses a NetworkPolicy to restrict traffic to a set of pods labeled 'app: db'. Which TWO statements about the following NetworkPolicy are correct?

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: db-policy
    spec:
      podSelector:
        matchLabels:
          app: db
      policyTypes:
      - Ingress
      ingress:
      - from:
        - podSelector:
            matchLabels:
              app: api
        ports:
        - port: 5432

Select 2 answers

A.The database pods can accept traffic on any port from pods with label 'app: api'.

B.Pods in other namespaces with label 'app: api' cannot reach the database pods.

C.The database pods can initiate outbound connections to any destination.

D.Pods from the same namespace but without matching labels can still access the database pods.

E.Pods with label 'app: api' can connect to the database pods on TCP port 5432.

AnswersC, E

Since only Ingress is specified, egress is allowed by default.

Why this answer

Option C is correct because a NetworkPolicy restricts only the directions listed in `policyTypes`. This policy lists only `Ingress`, so it places no restriction on egress, and the database pods can initiate connections to any destination (unless another policy selecting the same pods restricts egress). Option E is correct because the single ingress rule allows traffic from pods labeled `app: api` on port 5432, and the protocol defaults to TCP when it is omitted.

Exam trap

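The trap here is that candidates often forget that a NetworkPolicy with an ingress rule implicitly denies all other ingress traffic to the selected pods, and that a `podSelector` without a `namespaceSelector` matches only pods in the policy's own namespace. Pods labeled 'app: api' in other namespaces are therefore not matched by the rule and are blocked rather than allowed to bypass the policy, so the statement in option B also holds for this policy even though the answer key lists only C and E.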
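To make the namespace scoping concrete, the following is a minimal sketch of a variant of the policy in the question that would also admit 'app: api' pods from one other namespace. The policy name `db-policy-cross-ns` and the namespace `frontend` are hypothetical; the `kubernetes.io/metadata.name` label is the one Kubernetes sets automatically on namespaces in current releases.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: db-policy-cross-ns   # hypothetical name, not from the question
spec:
  podSelector:
    matchLabels:
      app: db
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in the SAME list item are ANDed:
        # only 'app: api' pods inside the selected namespace match.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: frontend   # hypothetical namespace
          podSelector:
            matchLabels:
              app: api
      ports:
        - protocol: TCP
          port: 5432
```

Writing the two selectors as separate list items (each with its own leading dash) would instead OR them, which admits every pod in the selected namespace plus every 'app: api' pod in the policy's own namespace.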

Full explanation →

214

MCQmedium

You need to expose a Service externally using an Ingress. The Ingress controller requires a specific IngressClass. How do you specify the IngressClass in the Ingress resource?

A.Set spec.class: nginx

B.Set spec.ingressClassName: nginx

C.Add an annotation: kubernetes.io/ingress.class: nginx

D.Create an IngressClass resource and reference it via spec.ingressClassRef

AnswerB

This is the correct field in the Ingress spec.

Why this answer

In Kubernetes 1.19+, you specify the IngressClass via the 'ingressClassName' field in the Ingress spec. The older annotation 'kubernetes.io/ingress.class' is deprecated.
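As an illustration, a minimal Ingress that sets the field might look like the sketch below. The names `example-ingress` and `example-service` are hypothetical, and it assumes an IngressClass named `nginx` already exists in the cluster.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress          # hypothetical name
spec:
  ingressClassName: nginx        # must match the name of an existing IngressClass
  rules:
    - http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: example-service   # hypothetical backend Service
                port:
                  number: 80
```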

Full explanation →

215

MCQmedium

You want to mount a Secret named 'db-secret' as a volume in a pod. Which volume type should you use in the pod spec?

A.hostPath

B.configMap

C.emptyDir

D.secret

AnswerD

The secret volume type mounts a Kubernetes Secret as a read-only volume.

Why this answer

Option D is correct because Kubernetes provides a dedicated 'secret' volume type specifically designed to mount Secret objects (like 'db-secret') as files into a pod. The kubelet retrieves the Secret, base64-decodes its values, and writes each key as a file on a tmpfs-backed volume, so the sensitive data does not have to appear in the pod definition or in environment variables.

Exam trap

The trap here is that candidates confuse ConfigMap and Secret volume types, thinking ConfigMap can mount Secrets because both store key-value data, but Kubernetes enforces strict separation: Secrets require the 'secret' volume type for security and proper handling of sensitive data.

How to eliminate wrong answers

Option A is wrong because hostPath mounts a file or directory from the host node's filesystem into the pod, not a Kubernetes Secret object; it bypasses Secret management entirely and is used for node-level access. Option B is wrong because the configMap volume type mounts ConfigMap objects (which store non-sensitive configuration data), not Secrets; a configMap volume references a ConfigMap by name and cannot point at a Secret. Option C is wrong because emptyDir creates an initially empty temporary directory that is shared between containers in the pod; it cannot mount a Secret's contents without manual population.
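A minimal sketch of a pod that mounts 'db-secret' through a secret volume is shown below. The pod name, container name, image, volume name, and mount path are all hypothetical; only the Secret name comes from the question.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: db-client                # hypothetical pod name
spec:
  containers:
    - name: app                  # hypothetical container
      image: nginx
      volumeMounts:
        - name: db-secret-vol
          mountPath: /etc/db-secret   # each Secret key appears as a file here
          readOnly: true
  volumes:
    - name: db-secret-vol
      secret:
        secretName: db-secret    # the Secret named in the question
```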

Full explanation →

216

Multi-Selectmedium

Which TWO of the following are possible reasons for a node being in 'NotReady' state?

Select 2 answers

A.The node is running an outdated kernel version

B.The kubelet service has stopped on the node

C.The node's disk is full

D.The container network plugin (e.g., Calico, Flannel) is not functioning

E.The node has insufficient CPU resources

AnswersB, D

If kubelet stops, the node status becomes NotReady.

Why this answer

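The correct answers are B and D. A stopped kubelet and network plugin issues are common causes. E (insufficient CPU) would cause scheduling issues, but the node would still be Ready.

A (an outdated kernel) is not a direct cause. C (a full disk) is not a direct cause either; it is normally reported through the DiskPressure condition.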

Full explanation →

217

MCQhard

A user reports that they cannot access a service running in the cluster from within another pod. They run 'kubectl exec -it pod-a -- curl http://service-b:8080' and get a connection timeout. What is the first thing you should check?

A.Check if the service has a ClusterIP assigned.

B.Check if the kube-dns pod is running.

C.Check if the pod-a has network connectivity by pinging the node IP.

D.Check if the target service's pods have the correct labels.

AnswerD

If the selector doesn't match any pods, the service will have no endpoints and connections will timeout.

Why this answer

Since curl from within the pod times out, the issue might be that the service does not have any endpoints (no pods matching the selector). Checking endpoints is a quick way to verify if the service is backed by running pods.

Full explanation →

218

MCQmedium

A pod in the same namespace tries to reach 'my-service' on port 80, but gets 'Connection refused'. The pod's labels are 'app: my-app'. What is the most likely cause?

A.The Service type is ClusterIP, which cannot be accessed from within the cluster.

B.No pods match the selector or the matching pods are not ready.

C.The port name 'http' is invalid.

D.The targetPort is 8080, but the container port is not 8080.

AnswerB

Endpoints are empty, so no ready pods are available.

Why this answer

A 'Connection refused' error indicates that the TCP connection request reached the target IP and port, but no process was listening there. For a Kubernetes Service, this most commonly occurs when the Service's selector does not match any pod labels, or the matching pods are not in a Ready state, so the endpoints controller does not populate the Service's endpoints list, and kube-proxy has no backends to forward traffic to.

Exam trap

The trap here is that candidates often confuse 'Connection refused' with 'Connection timeout' — the former indicates the Service IP was reached but no backend is listening, while the latter would suggest network-level blocking or incorrect DNS resolution.

How to eliminate wrong answers

Option A is wrong because ClusterIP Services are specifically designed to be accessed from within the cluster, and they work correctly when endpoints exist. Option C is wrong because a port name has no effect on connectivity; names only let a Service's targetPort refer to a container port by name and distinguish ports on a multi-port Service. Option D is wrong because the question gives no targetPort or container port values; a targetPort that points at a port the container is not listening on can also produce 'Connection refused', but nothing in the scenario indicates that, whereas an empty endpoints list is the more common cause.
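The relationship between selector, labels, and ports can be sketched as follows. The selector value and the targetPort are assumptions made for illustration, since the question does not show the Service manifest.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service
spec:
  type: ClusterIP
  selector:
    app: my-app          # assumed; must match the labels of Ready backend pods
  ports:
    - name: http
      port: 80           # port clients connect to on the Service
      targetPort: 8080   # assumed; the container must listen on this port
```

Running `kubectl get endpoints my-service` shows whether any pod IPs were selected; an empty list means no Ready pod matches the selector.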

Full explanation →

219

MCQhard

You need to grant a ServiceAccount named 'jenkins' in the 'ci' namespace the ability to list pods in the 'production' namespace. Which RBAC resources should you create?

A.Create a ClusterRole in the 'production' namespace and a RoleBinding in the 'ci' namespace.

B.Create a Role in the 'production' namespace and a RoleBinding in the 'ci' namespace referencing the Role.

C.Create a Role in the 'ci' namespace and a RoleBinding binding the ServiceAccount to the Role.

D.Create a ClusterRole and a ClusterRoleBinding binding the ServiceAccount to the ClusterRole.

AnswerD

A ClusterRole can define permissions for pods in any namespace, and a ClusterRoleBinding grants those permissions cluster-wide, including to the ServiceAccount.

Why this answer

Option D is the only listed option that works: a ServiceAccount in one namespace ('ci') needs to list pods in another namespace ('production'). A ClusterRole defines the permission without being tied to a namespace, and a ClusterRoleBinding grants it to the ServiceAccount in every namespace, which includes 'production'. A Role or RoleBinding only grants access within its own namespace, and none of the Role-based options places the binding in 'production'.

Exam trap

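The trap here is that candidates often think a RoleBinding can bind a Role from another namespace, but a RoleBinding can only reference a Role in its own namespace (or a ClusterRole). Among the listed options, that leaves the ClusterRole and ClusterRoleBinding. Outside the exam options, a narrower alternative is a Role and RoleBinding created in 'production' whose subject is the 'jenkins' ServiceAccount in 'ci', since RoleBinding subjects may come from other namespaces.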

How to eliminate wrong answers

Option A is wrong because a ClusterRole cannot be created inside a namespace; ClusterRoles are cluster-scoped resources. Option B is wrong because a Role in the 'production' namespace is namespace-scoped, and a RoleBinding in the 'ci' namespace cannot reference a Role from a different namespace; RoleBindings must reference a Role in the same namespace. Option C is wrong because a Role in the 'ci' namespace only grants permissions within that namespace, not in the 'production' namespace.
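A minimal sketch of the resources described in option D is shown below. The object names `pod-lister` and `jenkins-pod-lister` are hypothetical; the ServiceAccount name and namespace come from the question. Note that this grants list access to pods in every namespace, not only 'production'.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: pod-lister               # hypothetical name
rules:
  - apiGroups: [""]              # "" is the core API group, where pods live
    resources: ["pods"]
    verbs: ["list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: jenkins-pod-lister       # hypothetical name
subjects:
  - kind: ServiceAccount
    name: jenkins
    namespace: ci
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: pod-lister
```

The result can be checked with `kubectl auth can-i list pods -n production --as=system:serviceaccount:ci:jenkins`.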

Full explanation →

220

MCQhard

You have a Deployment with the following resource limits for containers: memory: 256Mi. The pod is repeatedly killed with OOMKilled. You need to change the limit to 512Mi. Which field should you modify in the Deployment YAML?

A.spec.template.spec.containers[].resources.limits.memory

B.spec.template.spec.containers[].resources.requests.memory

C.spec.template.spec.containers[].resources.requests.cpu

D.spec.template.spec.containers[].resources.limits.cpu

AnswerA

Increasing the memory limit allows the container to use more memory and avoids OOMKilled.

Why this answer

Option A is correct. The memory limit is specified under resources.limits.memory. Option B is for requests, not limits.

Option C is for CPU. Option D is for CPU.
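For orientation, the field path sits in a Deployment manifest as in the sketch below. The Deployment name, labels, container name, and image are hypothetical; only the 512Mi memory limit comes from the question.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app              # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: app              # hypothetical container
          image: nginx
          resources:
            limits:
              memory: 512Mi      # spec.template.spec.containers[].resources.limits.memory
```

Because the change is inside the pod template, applying it triggers a rollout that replaces the existing pods.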

Full explanation →

221

Multi-Selectmedium

Which TWO statements about PersistentVolume (PV) reclaim policies are correct?

Select 2 answers

A.Retain: The PV remains in the cluster and must be manually reclaimed.

B.Retain: The underlying storage asset is automatically deleted.

C.Recycle: The PV is automatically cleaned and made available for a new claim.

D.Delete: The PV must be manually deleted by the administrator.

E.Delete: The PV and the associated storage asset are automatically deleted.

AnswersA, E

Retain keeps the PV; admin must manually delete or reuse it.

Why this answer

Option A is correct because the Retain reclaim policy leaves the PersistentVolume (PV) in the cluster in a 'Released' state after the PersistentVolumeClaim (PVC) is deleted. The underlying storage asset (e.g., an EBS volume or NFS export) is not touched by Kubernetes, and the administrator must manually delete the PV object and then handle the storage asset (e.g., reuse or delete it) outside of Kubernetes. Option E is correct because the Delete policy removes both the PV object and the associated storage asset in the external infrastructure once the claim is deleted.
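The policy is set per PV through a single field, as in the minimal sketch below. The PV name, capacity, and hostPath backing are hypothetical and chosen only to keep the manifest complete.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv               # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain   # alternative value: Delete
  hostPath:
    path: /mnt/data              # hypothetical backing path for illustration
```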

Exam trap

The trap here is that candidates confuse Retain with automatic cleanup or think Recycle is still a recommended policy, when in fact it is deprecated in favor of dynamic provisioning.

Full explanation →

222

MCQeasy

You want to see the last 50 lines of logs from a pod named 'api-pod' for the container 'api-container'. Which command accomplishes this?

A.kubectl logs -l app=api --tail=50

B.kubectl logs api-pod --tail=50

C.kubectl logs api-pod -c api-container --tail=50

D.kubectl logs api-container -p api-pod --tail 50

AnswerC

The --tail flag limits output to the last N lines.

Why this answer

'kubectl logs api-pod -c api-container --tail=50' shows the last 50 lines for the specified container.

Full explanation →

223

Multi-Selectmedium

Which TWO components are part of the Kubernetes control plane? (Choose TWO.)

Select 2 answers

A.container runtime

B.kube-apiserver

C.kube-scheduler

D.kubelet

E.kube-proxy

AnswersB, C

The API server is a core control plane component.

Why this answer

The Kubernetes control plane manages the cluster's overall state and scheduling decisions. The kube-apiserver (B) is the front-end for the control plane, exposing the Kubernetes API, while the kube-scheduler (C) is responsible for assigning newly created pods to nodes based on resource availability and constraints. Both are essential control plane components that run on the master node(s).

Exam trap

The trap here is that candidates often confuse node-level components (kubelet, kube-proxy, container runtime) with control plane components, especially because kubelet and kube-proxy are critical for cluster operation but run on every node, not just the control plane.

Full explanation →

224

MCQmedium

A Kubernetes cluster was upgraded from v1.28 to v1.29. After the upgrade, nodes report NotReady. You check kubelet logs and see: 'error: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"'. What is the most likely cause?

A.The container runtime version is incompatible with Kubernetes v1.29

B.The kubelet cannot connect to the API server

C.The kubelet was not restarted after the upgrade

D.The kubelet configuration has a different cgroup driver than the container runtime

AnswerD

The kubelet and container runtime must use the same cgroup driver. Here kubelet uses systemd but docker uses cgroupfs.

Why this answer

Option D is correct because the error message explicitly states that the kubelet's cgroup driver (systemd) differs from the container runtime's cgroup driver (cgroupfs). In Kubernetes, the kubelet and the container runtime must use the same cgroup driver to manage resource limits correctly. After upgrading from v1.28 to v1.29, the kubelet configuration may have been reset or changed, causing this mismatch, which prevents the kubelet from starting and the node from becoming Ready.
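On the kubelet side, the driver is set in the kubelet configuration file, which on kubeadm clusters is typically /var/lib/kubelet/config.yaml. The fragment below is a sketch of the relevant field only; whether to change the kubelet or the container runtime depends on the node, and the runtime must be configured to the same value.

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd   # must equal the container runtime's cgroup driver
```

After aligning the two settings, the kubelet needs a restart (for example with `systemctl restart kubelet`) to pick up the change.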

Exam trap

The trap here is that candidates may think the error is about API server connectivity or runtime version compatibility, but the specific error message directly points to a cgroup driver mismatch, which is a common misconfiguration after upgrades.

How to eliminate wrong answers

Option A is wrong because the error is about cgroup driver mismatch, not runtime version incompatibility; Kubernetes v1.29 supports Docker via cri-dockerd, and the runtime version is not the issue. Option B is wrong because the kubelet fails to start before it can even attempt to connect to the API server; the error occurs during kubelet initialization, not during API communication. Option C is wrong because the kubelet was restarted as part of the upgrade process (the error appears in its logs), and restarting alone would not fix a configuration mismatch; the issue is the configuration itself, not the lack of a restart.

Full explanation →

225

MCQmedium

A developer runs `kubectl run nginx --image=nginx --port=80` and then `kubectl expose pod nginx --port=80 --target-port=80 --type=NodePort`. What is the name of the created Service?

A.nginx

B.expose-nginx

C.nginx-service

D.nginx-pod

AnswerA

The Service is named after the pod by default.

Why this answer

By default, `kubectl expose` uses the name of the resource being exposed. Since the pod is named 'nginx', the Service will be named 'nginx'.

Full explanation →
