CNCF · Free Practice Questions · Last reviewed May 2026
48 real exam-style questions organised by domain, each with the correct answer highlighted and a plain-English explanation of why it's right and why the others are wrong.
Which control plane component is responsible for storing the cluster state and configuration?
etcd
etcd is the key-value store that holds all cluster state and configuration.
kube-controller-manager
kube-apiserver
kube-scheduler
An administrator runs 'kubectl drain node01 --ignore-daemonsets --force' to prepare node01 for maintenance. However, a pod running a critical application is evicted and becomes unschedulable. Which flag could prevent eviction of that specific pod?
--grace-period=0
--pod-selector='app=critical'
--delete-local-data=false
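With --delete-local-data=false (the default), drain aborts rather than evicting pods that use emptyDir volumes, which protects a critical pod that relies on local data. Newer kubectl releases name this flag --delete-emptydir-data.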
--evict-unscheduled-pods
A cluster was upgraded from v1.28 to v1.29 using kubeadm. After upgrading the control plane, nodes remain at v1.28. What is the correct next step to upgrade a worker node?
Drain the node, then run 'kubeadm upgrade node' on the worker node.
SSH into the worker node and run 'kubeadm upgrade node', then upgrade kubelet and kubectl, then restart kubelet.
On a worker node, 'kubeadm upgrade node' updates the local kubelet configuration; the kubelet and kubectl packages are then upgraded and the kubelet is restarted.
Upgrade kubelet on the worker node using the package manager and restart kubelet.
Run 'kubeadm upgrade apply' on the worker node.
An administrator creates a ServiceAccount named 'monitor' in the 'default' namespace. They want any pod using this ServiceAccount to be able to list pods cluster-wide. Which RBAC resource should be created and bound to this ServiceAccount?
ClusterRole and RoleBinding in the default namespace
ClusterRole and RoleBinding in kube-system
ClusterRole and ClusterRoleBinding
ClusterRoleBinding grants permissions cluster-wide regardless of namespace.
Role and RoleBinding in the default namespace
You want to upgrade the control plane from v1.28.0 to v1.29.0 using kubeadm. After upgrading kubeadm on the control plane node, which command should you run first?
kubeadm upgrade plan
kubeadm upgrade plan checks whether the cluster can be upgraded and lists the versions you can upgrade to, without changing anything.
kubeadm upgrade apply v1.29.0
kubeadm upgrade node
kubeadm upgrade diff
Which component runs on every node in a Kubernetes cluster and ensures containers are running in a pod?
kubelet
kubelet is the primary node agent that manages pods.
kube-scheduler
container runtime
kube-proxy
Which of the following service types exposes a service on a static port on each node's IP address?
ExternalName
NodePort
NodePort exposes the service on a static port on each node's IP address.
LoadBalancer
ClusterIP
You run 'kubectl get svc my-service -o yaml' and see 'type: ClusterIP'. The service has no endpoints. What is the most likely cause?
The service type is ClusterIP, which does not support endpoints.
The service's port does not match the container port.
The service is misconfigured and needs to be deleted and recreated.
No pods with labels matching the service selector are running and ready.
Endpoints are created from pods that match the selector and are in the Ready state.
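The rule in this explanation can be sketched in a few lines of Python. This is a simplified model, not the endpoints controller itself: the pod dictionaries, their field names, and the `ready_endpoints` helper are hypothetical, and only equality-based selectors are considered.

```python
def ready_endpoints(pods, selector):
    """Return the IPs of pods that match every selector label and are Ready."""
    return [
        pod["ip"]
        for pod in pods
        if pod["ready"]
        and all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods: one matching and Ready, one matching but not Ready,
# and one Ready pod whose label does not match the selector.
pods = [
    {"ip": "10.0.0.1", "labels": {"app": "web"}, "ready": True},
    {"ip": "10.0.0.2", "labels": {"app": "web"}, "ready": False},
    {"ip": "10.0.0.3", "labels": {"app": "api"}, "ready": True},
]

print(ready_endpoints(pods, {"app": "web"}))   # ['10.0.0.1']
print(ready_endpoints(pods, {"app": "db"}))    # []
```

An empty result corresponds to a Service with no endpoints: either no pod carries the selector's labels, or the matching pods are not Ready.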
An administrator runs 'kubectl run nginx --image=nginx --port=80' and then 'kubectl expose pod nginx --port=80 --type=NodePort'. Later, they run 'kubectl get svc nginx' and see that the NodePort is set to 0. What is the most likely reason?
The pod was not ready when the service was created, so NodePort assignment was delayed.
The nodePort field was explicitly set to 0 in the service YAML, but the administrator used a flag that was ignored.
The cluster has a mutating webhook that converted the service type to ClusterIP because NodePort is disabled.
A cluster-level policy may disallow NodePort services, causing the type to be overridden.
The pod was created by a Deployment, so its labels do not match the service selector.
You have a Service named 'my-service' in namespace 'ns1'. Another pod in namespace 'ns2' needs to resolve 'my-service' using DNS. What FQDN should the pod use?
my-service.svc.cluster.local
my-service.cluster.local
my-service.ns1.svc.cluster.local
The FQDN format is <service>.<namespace>.svc.cluster.local.
my-service.ns2.svc.cluster.local
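As a small illustration of the FQDN format above, the name can be assembled from its parts. The `service_fqdn` helper is hypothetical, and the sketch assumes the default cluster domain `cluster.local`, which a cluster may override.

```python
def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build <service>.<namespace>.svc.<cluster-domain> for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


# The namespace is the one the Service lives in (ns1), not the caller's (ns2).
print(service_fqdn("my-service", "ns1"))  # my-service.ns1.svc.cluster.local
```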
An Ingress resource is created with the following spec:
spec:
  rules:
  - host: example.com
    http:
      paths:
      - path: /api
        pathType: Prefix
        backend:
          service:
            name: api-service
            port:
              number: 80
The backend service 'api-service' is in the same namespace as the Ingress. What must be true for the Ingress to route traffic to the service?
The Ingress controller must be configured to use the NodePort of the service.
The service 'api-service' must be of type NodePort.
The service 'api-service' must have a valid ClusterIP and at least one endpoint.
The Ingress controller forwards traffic to the service's ClusterIP, and endpoints must exist for the service to forward to pods.
The Ingress must have an IngressClass annotation.
A cluster has a NetworkPolicy that denies all ingress traffic by default. An administrator wants to allow TCP traffic on port 8080 from pods with label 'app: web' in the same namespace. Which NetworkPolicy rule is needed?
An ingress rule with podSelector matching 'app: web' and ports.
To allow incoming traffic from specific pods, you need an ingress rule with the appropriate podSelector.
An ingress rule with namespaceSelector matching the same namespace.
An egress rule with namespaceSelector and podSelector.
An egress rule with podSelector matching 'app: web' and ports.
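One way to picture the ingress rule from the correct answer is to build the manifest as a Python dictionary and print it as JSON, which kubectl accepts alongside YAML. This is a sketch under assumptions: the policy name `allow-web-8080` is hypothetical, and the empty top-level `podSelector` (all pods in the namespace) is a guess, since the question does not say which pods receive the traffic.

```python
import json

# Hypothetical NetworkPolicy allowing TCP 8080 from pods labelled app=web.
policy = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-web-8080"},  # assumed name
    "spec": {
        "podSelector": {},  # assumption: applies to every pod in the namespace
        "policyTypes": ["Ingress"],
        "ingress": [
            {
                # A podSelector without a namespaceSelector matches pods
                # in the policy's own namespace.
                "from": [{"podSelector": {"matchLabels": {"app": "web"}}}],
                "ports": [{"protocol": "TCP", "port": 8080}],
            }
        ],
    },
}

print(json.dumps(policy, indent=2))
```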
A pod in the 'production' namespace is in a CrashLoopBackOff state. The pod has been running successfully for several days. You run 'kubectl describe pod app-pod -n production' and see the message: 'OOMKilled'. What is the MOST appropriate action to resolve this issue?
Increase the memory limit in the pod's container resource specification
OOMKilled indicates the container exceeded its configured memory limit. Increasing the memory limit allows the container to use more memory and prevents the OOM kill.
Delete the namespace and redeploy all workloads
Delete and recreate the pod to clear the crash loop
Increase the CPU request for the container
You need to update a Deployment's image from nginx:1.20 to nginx:1.21 using a rolling update strategy, but you want to ensure that during the update, at most 2 pods above the desired replicas (10) are running, and at least 8 pods are available at all times. Which strategy configuration should you apply?
maxSurge: 3, maxUnavailable: 2
maxSurge: 3, maxUnavailable: 3
maxSurge: 2, maxUnavailable: 2
maxSurge=2 allows at most 12 pods (10+2). maxUnavailable=2 ensures at least 8 pods are available (10-2).
maxSurge: '20%', maxUnavailable: '20%'
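The arithmetic behind the rolling-update options can be checked with a short sketch. The helper names are hypothetical; the rounding follows the documented behaviour that a percentage maxSurge rounds up and a percentage maxUnavailable rounds down.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute number or a 'N%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        percent = int(value[:-1])
        if round_up:
            return -(-percent * replicas // 100)  # integer ceiling
        return percent * replicas // 100          # integer floor
    return int(value)


def rollout_bounds(replicas, max_surge, max_unavailable):
    """Return (maximum total pods, minimum available pods) during a rollout."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return replicas + surge, replicas - unavailable


print(rollout_bounds(10, 2, 2))  # (12, 8)
print(rollout_bounds(10, 3, 2))  # (13, 8)
print(rollout_bounds(10, 3, 3))  # (13, 7)
```

With 10 replicas, '20%' resolves to the same bounds in this sketch; the absolute values state the requirement independently of the replica count.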
Which kubectl command will show the rollout history of a Deployment named 'web-app'?
kubectl describe deployment web-app
kubectl rollout status deployment web-app
kubectl rollout history deployment web-app
'kubectl rollout history' lists the Deployment's recorded revisions; 'kubectl rollout status' only reports the progress of the current rollout.
kubectl get deployment web-app -o yaml
You have a DaemonSet that is supposed to run on all nodes, but you notice it is not running on a node with a taint 'dedicated=monitoring:NoSchedule'. What must be added to the DaemonSet's pod template to make it run on that node?
Add the annotation 'scheduler.alpha.kubernetes.io/tolerations'
A nodeSelector with key 'dedicated' and value 'monitoring'
Set the priorityClassName to 'system-node-critical'
A toleration with key 'dedicated', value 'monitoring', effect 'NoSchedule'
Adding this toleration allows the pod to schedule on nodes with the matching taint.
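How a toleration is compared with a taint can be sketched as follows. This is a simplified model of the matching rules, not scheduler code; the `tolerates` helper and the dictionary layout are hypothetical.

```python
def tolerates(toleration, taint):
    """Return True if the toleration matches the taint (simplified model)."""
    # An empty or missing effect on the toleration matches any taint effect.
    effect = toleration.get("effect")
    if effect and effect != taint["effect"]:
        return False
    operator = toleration.get("operator", "Equal")
    if operator == "Exists":
        # Exists ignores the value; an empty key matches every taint key.
        return not toleration.get("key") or toleration["key"] == taint["key"]
    # Equal (the default operator) requires the same key and value.
    return (
        toleration.get("key") == taint["key"]
        and toleration.get("value", "") == taint.get("value", "")
    )


taint = {"key": "dedicated", "value": "monitoring", "effect": "NoSchedule"}
toleration = {
    "key": "dedicated",
    "operator": "Equal",
    "value": "monitoring",
    "effect": "NoSchedule",
}
print(tolerates(toleration, taint))  # True
```

A nodeSelector would not help here: it restricts where a pod may run, but does not let the pod past a NoSchedule taint.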
You have a Deployment 'db' with 3 replicas. Each pod writes to a PersistentVolumeClaim (PVC). A StatefulSet is required for stable network identities and ordered pod management. Which of the following is a key characteristic that differentiates a StatefulSet from a Deployment?
StatefulSets support rolling updates but not canary deployments
StatefulSets automatically create a Service for each pod
StatefulSets cannot use PersistentVolumeClaims
StatefulSets maintain a sticky identity for each pod, including stable hostnames and persistent storage
Each pod in a StatefulSet gets a unique ordinal index and stable hostname, and retains its storage across rescheduling.
You have a CronJob that runs a batch job every 5 minutes. The job takes about 2 minutes to complete. However, if a job takes longer than 5 minutes, you want to prevent a new job from starting until the previous one finishes. Which CronJob field should you configure?
successfulJobsHistoryLimit
concurrencyPolicy: Forbid
Setting concurrencyPolicy to Forbid ensures only one job is running at a time; new jobs are skipped if the previous hasn't completed.
suspend: true
startingDeadlineSeconds
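The effect of concurrencyPolicy: Forbid can be illustrated with a toy timeline. This is a simplified simulation, not the CronJob controller: the `started_runs` helper is hypothetical, times are in minutes, and each list entry is the duration a run would have if it started at that tick.

```python
def started_runs(durations, interval=5):
    """Return the indexes of scheduled ticks that actually start a Job."""
    started = []
    busy_until = 0
    for tick, duration in enumerate(durations):
        now = tick * interval
        if now < busy_until:
            continue  # previous Job still running, so this run is skipped
        started.append(tick)
        busy_until = now + duration
    return started


# The second run takes 7 minutes, so the run scheduled at minute 10 is skipped.
print(started_runs([2, 7, 2, 2]))  # [0, 1, 3]
```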
A DevOps team needs to provide persistent storage to a set of pods that all require read-write access to the same data simultaneously. Which volume type should they use?
PersistentVolumeClaim with ReadWriteMany
ReadWriteMany allows multiple pods to read and write simultaneously, which is required here.
hostPath
emptyDir
PersistentVolumeClaim with ReadWriteOnce
A company is migrating a stateful application to Kubernetes. The application requires persistent storage that is 'zone-aware' to survive a single zone failure and must provide the highest possible I/O performance. Which storage solution best meets these requirements?
Use a network filesystem (NFS) server running as a single pod with a PersistentVolume backed by a regional Persistent Disk
Create a StorageClass with volumeBindingMode: WaitForFirstConsumer
Use a StorageClass that provisions regional Persistent Disks with replication across two zones
Regional PDs provide zone redundancy and high performance, meeting both requirements.
Deploy a StatefulSet with a local SSD on each node and use a DaemonSet to manage replication
A pod is unable to start because the PersistentVolumeClaim it references is still in 'Pending' state. What is the most likely cause?
The PersistentVolumeClaim's storage class does not exist or cannot provision a volume
If the StorageClass is missing or misconfigured, the PVC will stay Pending.
The pod's YAML has a syntax error
The pod is using a hostPath volume
The node has insufficient CPU resources
A cluster administrator needs to provide storage to a pod that must read and write files, but the data does not need to persist beyond the pod's lifecycle. Which volume type should be used?
hostPath
emptyDir
emptyDir provides temporary storage that is deleted when the pod terminates.
configMap
PersistentVolumeClaim
A team is designing a storage solution for a Cassandra cluster on Kubernetes. Each pod must have its own dedicated storage, and the cluster must be able to scale up and down dynamically. Which Kubernetes resource should be used to manage the storage?
ReplicaSet with emptyDir volumes
DaemonSet with hostPath volumes
StatefulSet with a volumeClaimTemplate
This creates a unique PVC for each pod, providing dedicated storage.
Deployment with a single PersistentVolume shared by all pods
Which TWO statements about PersistentVolume (PV) and PersistentVolumeClaim (PVC) binding are correct?
A PVC can be bound to a PV that has been released and is pending reclamation
A PV can be bound to multiple PVCs simultaneously if it has ReadWriteMany access mode
A PVC will remain in Pending state if no PV matches its storage request and no StorageClass is defined
Without a matching PV or dynamic provisioning, the PVC cannot be bound.
A PV can only be bound to a PVC that requests exactly the same amount of storage
A PVC can be bound to a PV with a different access mode if the PV supports multiple modes
A pod named 'web-frontend' is in CrashLoopBackOff. You run 'kubectl logs web-frontend' and see: 'Error: listen tcp :8080: bind: address already in use'. What is the most likely cause and how should you fix it?
The NodePort is conflicting; change the service type to ClusterIP.
The container is missing an environment variable required for startup; add it via ConfigMap.
The container process is not terminating gracefully; add a preStop hook or use a proper init system to release the port.
The error shows port already in use, indicating the old process didn't release it.
The pod has insufficient memory; increase memory limits in the deployment.
A user reports that their application cannot resolve DNS names for services in the cluster. The application runs in a pod with dnsPolicy: ClusterFirst. What is the most likely cause?
The CoreDNS deployment has 0 ready replicas.
CoreDNS is the cluster DNS provider; if down, in-cluster DNS fails.
The pod's dnsPolicy is set to Default instead of ClusterFirst.
The node's network plugin is misconfigured, blocking UDP port 53.
The pod's /etc/resolv.conf contains incorrect nameserver entries.
Which TWO of the following are valid methods to troubleshoot a pod that is stuck in 'Pending' state?
Run 'kubectl describe pod <pod-name>' and check the Events section.
Events show scheduling failures.
Run 'kubectl logs <pod-name>' to view application logs.
Run 'kubectl exec -it <pod-name> -- /bin/sh' to inspect the container.
Run 'kubectl top pod <pod-name>' to check resource usage.
Run 'kubectl get events --sort-by=.metadata.creationTimestamp' to see recent cluster events.
Events include pod scheduling failures.
Based on the exhibit, the pod is in CrashLoopBackOff. Which command should you run NEXT to identify the root cause?
kubectl describe node node-1
kubectl top pod api-6f4d7b9d4c-abcde -n production
kubectl get deployment api -n production -o yaml
kubectl logs api-6f4d7b9d4c-abcde -n production --previous
Shows logs from the crashed container instance.
You are a CKA managing a production cluster with 5 worker nodes. A developer reports that a new deployment 'payment-service' is not accessible from other pods via its Service 'payment-svc' in the 'default' namespace. The Service is of type ClusterIP with selector 'app: payment'. The deployment has 3 replicas, all showing 'Running' status. From a test pod, you run 'curl http://payment-svc:8080' and get 'Connection refused'. You verify that the pods are listening on port 8080 and the container's readiness probe passes. 'kubectl get endpoints payment-svc' shows no endpoints. 'kubectl describe svc payment-svc' shows the selector 'app=payment'. What is the most likely cause?
A NetworkPolicy is blocking traffic from the test pod to the service IP.
The service type should be NodePort to allow in-cluster access.
The readiness probe is failing on all pods, causing them to be removed from service endpoints.
The pods have label 'app: payment-service' instead of 'app: payment', so the service selector does not match.
Selector mismatch is the classic cause of empty endpoints.
A developer reports that a newly deployed Deployment named 'web-app' is not serving traffic. The Deployment has 3 replicas, a Service of type ClusterIP, and an Ingress. Which TWO commands should you run first to diagnose the issue?
kubectl describe svc web-app
Shows endpoints and selector matching.
kubectl logs deployment/web-app
kubectl get pods -l app=web-app
Shows if pods are running and ready.
kubectl get events --sort-by='.lastTimestamp'
kubectl describe ingress web-app
A company wants to install Kubernetes on a set of bare-metal servers with no existing orchestration tools. They need a solution that supports high availability for the control plane and uses etcd operators for cluster management. Which tool should they use?
Kubespray
kubeadm
kubeadm can bootstrap HA clusters and integrates with etcd operators.
minikube
kops
A DevOps engineer notices that the kubelet on a node is unable to register with the Kubernetes API server. The kubelet logs show 'Failed to get bootstrap CA certificate' and the node is not yet part of the cluster. What is the most likely cause?
The kubelet configuration file has incorrect node IP.
The node's RBAC permissions are misconfigured.
The API server is not running.
The bootstrap token used for TLS bootstrapping has expired.
Expired token prevents CA certificate retrieval.
An administrator needs to upgrade the kube-apiserver on a control plane node from version 1.22.0 to 1.23.0. Which of the following is the correct order of steps?
Upgrade kubelet, upgrade kubeadm, drain node, uncordon node.
Drain node, upgrade kubeadm, upgrade kubelet, uncordon node.
Draining first ensures no workloads are disrupted.
Upgrade kubeadm, drain node, upgrade kubelet, uncordon node.
Upgrade kubeadm, upgrade kubelet, drain node, uncordon node.
A Kubernetes cluster has been running for months. Recently, some pods are reporting 'FailedScheduling' due to insufficient memory. The administrator wants to add a new node with 32GB RAM. However, after joining the node, the new node shows 'NotReady' and the kubelet logs indicate 'Failed to update node status: context deadline exceeded'. What is the most likely cause?
The kubelet is not configured with the correct node IP.
The new node does not have enough disk space for container images.
There is a network connectivity issue between the new node and the control plane.
Context deadline exceeded indicates timeout reaching the API server.
The API server is overloaded and cannot handle the node update request.
A cluster administrator has configured a PodSecurityPolicy (PSP) that requires all pods to run with read-only root filesystem. However, a newly deployed pod is failing to start with the error 'container has runAsNonRoot and image will run as root'. The PSP is designed to prevent running as root. What is the most likely cause?
The PodSecurityPolicy admission controller is not enabled.
The PSP is not set to enforce read-only root filesystem.
The container image is configured to run as root user.
The PSP requires runAsNonRoot, but the image runs as root.
The PSP is not being applied to the pod's service account.
An administrator is tasked with setting up a new Kubernetes cluster using kubeadm. They have two nodes: one control plane and one worker. After initializing the control plane with 'kubeadm init', the worker node fails to join with the error 'error execution phase preflight: [preflight] Some fatal errors occurred: [ERROR CRI]: container runtime is not running'. What should the administrator check first?
Ensure that containerd is installed and running on the worker node.
The CRI error indicates the runtime is not running.
Verify that the control plane node is healthy.
Check if the join token has expired.
Install a network plugin like Calico on the control plane.
A DevOps team wants to ensure that a critical web application pod runs on a dedicated set of nodes with SSDs. Which Kubernetes feature should they use to achieve this?
Pod priority and preemption
Taints and tolerations
Node affinity
Node affinity allows a pod to express preferences or requirements for node selection based on labels.
Resource quotas
A Kubernetes cluster has a deployment with 3 replicas. After a node failure, you notice that only 2 pods are running, and the deployment has not rescheduled the missing pod. What is the most likely cause?
The deployment has a resource quota that prevents new pods
The pod's terminationGracePeriodSeconds is set to 0
The node controller has not yet evicted the pod
The node controller waits for a default of 5 minutes before evicting pods from a failed node.
The deployment's replicas field is set to 2
You have a StatefulSet with 5 pods, each requiring a unique stable network identity. The StatefulSet is scaled down from 5 to 3. Which pods will be terminated?
Random pods
Pods with the highest ordinals (4 and 3)
StatefulSet deletes pods in reverse ordinal order when scaling down.
Pods with the lowest ordinals (0 and 1)
Pods with the highest resource usage
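The reverse-ordinal rule can be sketched as a one-line calculation. The `pods_removed_on_scale_down` helper and the StatefulSet name 'web' are hypothetical; the sketch assumes the default naming scheme of <statefulset-name>-<ordinal> with ordinals starting at 0.

```python
def pods_removed_on_scale_down(name, current, target):
    """List pods in the order they are terminated, highest ordinal first."""
    return [f"{name}-{ordinal}" for ordinal in range(current - 1, target - 1, -1)]


# Scaling from 5 replicas to 3 removes ordinal 4 first, then ordinal 3.
print(pods_removed_on_scale_down("web", 5, 3))  # ['web-4', 'web-3']
```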
An application requires that a pod runs on a node that has a GPU. The cluster has nodes with and without GPUs labeled as 'gpu=true' and 'gpu=false'. Which scheduling method should be used?
Taint on non-GPU nodes and toleration on the pod
Pod affinity to prefer GPU nodes
Node affinity with a requiredDuringSchedulingIgnoredDuringExecution rule for gpu=true
nodeSelector with gpu=true
nodeSelector directly matches the label gpu=true.
A cluster administrator wants to ensure that no pods are scheduled on the master node(s). Which approach is the best practice?
Add a taint to the master node
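In kubeadm clusters, control plane nodes already carry a NoSchedule taint (node-role.kubernetes.io/control-plane) by default, so ordinary pods are not scheduled there unless they tolerate it.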
Delete the master node from the cluster
Use a resource quota on the master namespace
Set nodeSelector on the master node
A pod is stuck in 'Pending' state. The 'kubectl describe pod' output shows the event: '0/4 nodes are available: 3 node(s) had taint {node.kubernetes.io/unreachable: }, and 1 node(s) had taint {node.kubernetes.io/not-ready: }.' What is the most likely reason?
The pod has resource requests that exceed available capacity
The pod does not have tolerations for the node taints
The events explicitly mention taints, indicating missing tolerations.
The nodes are cordoned
The kube-scheduler is not running
A developer created a Deployment with 3 replicas and a ClusterIP Service named 'app-service' on port 80 targeting port 8080 on the pods. Pod logs show that the container is listening on 8080, but curl from another pod in the same namespace to http://app-service:80 fails with 'Connection refused'. What is the most likely cause?
The Service selector does not match the pod labels.
If labels don't match, the Service has no endpoints, causing connection refused.
The container port is 8080 but the Service targetPort is 80.
The Service type should be NodePort for inter-pod communication.
The DNS resolution for 'app-service' is failing.
An administrator needs to expose a set of pods running a stateful application that require stable network identities. The pods must be reachable from outside the cluster via a DNS name that resolves to individual pod IPs. Which Service type should be used?
ExternalName Service
NodePort Service
ClusterIP with a regular Service
Headless Service (ClusterIP: None)
Headless Service returns individual pod IPs via DNS, suitable for stateful apps.
A cluster has multiple namespaces: 'frontend', 'backend', and 'monitoring'. A pod in the 'frontend' namespace needs to reach a Service named 'db-service' in the 'backend' namespace. The 'db-service' Service is of type ClusterIP. Which DNS name should the pod use?
db-service.svc.cluster.local
db-service
db-service.backend.cluster.local
db-service.backend.svc.cluster.local
The FQDN follows <service>.<namespace>.svc.cluster.local, so a pod in another namespace must include the Service's namespace, 'backend'.
A pod is running with the default DNS policy. The cluster DNS service is at 10.96.0.10. The node's /etc/resolv.conf has nameserver 8.8.8.8. When the pod tries to resolve an external hostname like 'example.com', which DNS server will it query first?
The node's DNS server (8.8.8.8)
There is no DNS resolution; the pod cannot resolve external names by default
The cluster DNS service (10.96.0.10)
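When dnsPolicy is not set, the pod uses ClusterFirst, which sends queries to the cluster DNS service first; names outside the cluster domain are then forwarded upstream.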
The pod's own /etc/resolv.conf which contains the node's DNS
An administrator notices that traffic to a Service is not being forwarded to any pod. The Service has selector 'app: web' and there are pods with that label. However, 'kubectl get endpoints' shows no endpoints. What is the most likely cause?
The Service port name does not match the container port name.
The Service type is ClusterIP.
The Service targetPort is not specified.
The pods are not in Ready state (e.g., failing readiness probes).
Only Ready pods are included as endpoints.
A Kubernetes cluster uses Calico as the CNI plugin. Two pods on different nodes cannot communicate, but pods on the same node can. Network policies are not enforced. What is the most likely cause?
Calico is not configured with an overlay network.
A NetworkPolicy is blocking inter-node traffic.
The pods are using different Service types.
The nodes' firewalls are blocking required ports for Calico (e.g., BGP port 179 or VXLAN port 4789).
Calico needs inter-node communication; firewall blocking can prevent pod-to-pod across nodes.
The CKA exam is performance-based; there are no multiple-choice questions. It is a hands-on lab exam completed within 120 minutes. You complete practical tasks in a live or simulated environment. Courseiva practice questions cover the underlying concepts.
Hands-on labs and command-line tasks in a live Kubernetes cluster. Courseiva provides concept checks and scenario questions to support lab preparation.
The exam covers 5 domains: Cluster Architecture, Installation and Configuration; Services and Networking; Workloads and Scheduling; Storage; and Troubleshooting. Questions are weighted by domain, so higher-weight domains appear more on your actual exam.
No. These are original exam-style practice questions written against the official CNCF CKA exam objectives. They are not copied from the real exam. Courseiva focuses on genuine understanding, not memorisation of braindumps.
Courseiva tracks your accuracy per domain and routes you toward weak areas automatically. Free, no account required.