
What Does Persistent Volume Mean?

Also known as: Persistent Volume, Kubernetes storage, CKA exam, PersistentVolumeClaim, PVC

Reviewed by Johnson Ajibi · Senior Network & Security Engineer · MSc IT Security

Quick Definition

A Persistent Volume is like a reserved parking space for your data in a Kubernetes cluster. It is a piece of storage that stays around even after the pods (the small units that run your application's containers) are deleted or moved. This allows your important data to remain safe and accessible no matter which pod needs it.

Must Know for Exams

The CKA exam places significant emphasis on storage concepts, including Persistent Volumes, PersistentVolumeClaims, and StorageClasses. The exam objectives explicitly state that candidates must understand and be able to configure storage for stateful workloads. You can expect tasks that require you to create a PVC, bind it to a PV, and then mount it in a pod. You may also need to troubleshoot why a PVC is not binding or why a pod cannot access its volume.

In the CKA exam, you will not be asked to write a full dissertation on PV theory. Instead, you will be given hands-on scenarios. For example, you might be told that a pod needs to use a specific PV that already exists. You must create the PVC with the correct size and access mode so that it binds to that PV. Then you must modify the pod's YAML to mount that PVC at a specific path. This tests your understanding of the YAML structure and the relationship between PVs and PVCs.

Another common scenario is dynamic provisioning. You might be asked to create a StorageClass and then create a PVC that uses it. The exam environment will automatically provision a PV when the PVC is created. You must then create a pod that uses that PVC. This tests your knowledge of StorageClass parameters and how they influence the behavior of the underlying storage.

The exam also tests reclaim policies. You may be given a scenario where a PVC is deleted, and you need to determine what happens to the PV. You might be asked to change the reclaim policy of a PV from Delete to Retain to preserve data for forensic analysis. Understanding these policies is crucial for correctly answering questions about data lifecycle.

Finally, the exam may include troubleshooting questions. For instance, you might have a pod that is stuck in a Pending state because its PVC cannot find a matching PV. You must inspect the PV and PVC objects, check their status, and identify the mismatch in size or access mode. This requires both theoretical knowledge and practical debugging skills.

Simple Meaning

Imagine you are working in a large office building with many desks. Each desk is a temporary workspace assigned to a different person each day. You might have a desk today but a different one tomorrow. Now suppose you need to keep a large box of files that you access every day. It would be very inconvenient to move that box to a different desk every time your desk assignment changes. Instead, you put that box in a secure, shared locker room that everyone can access from any desk. That locker room is like a Persistent Volume in Kubernetes.

A Persistent Volume is a storage resource that exists separately from the pods that use it. Pods in Kubernetes are temporary things. They might get created, destroyed, or moved to different machines for many reasons. If your application saves important data inside a pod, that data disappears when the pod is destroyed. That would be a disaster for any application that needs to keep track of user information, uploaded files, database records, or logs.

Persistent Volumes solve this problem by providing storage that lives on its own. An administrator sets up a pool of storage, which could be anything from a hard drive on a network to a cloud-based block storage service. Then, when a pod needs to store data permanently, it requests a piece of that storage. Kubernetes connects the pod to the Persistent Volume, and the pod can read and write data just like it would to a local folder. But when the pod is destroyed, the volume is not destroyed with it. The data remains available for the next pod that needs it.

This separation of storage from compute is a core principle in modern infrastructure. It allows applications to be resilient and scalable. If one pod fails, another pod can take its place and access the same data without missing a beat. It also means that storage can be managed by specialists who understand hardware and backup policies, while developers focus on writing code.

Full Technical Definition

In Kubernetes, a Persistent Volume (PV) is a cluster-wide resource that represents a piece of storage, most often network-attached. It is a cluster resource in the same way that a node is a resource providing CPU and memory. Application developers usually do not create PVs directly; they are provisioned by administrators or dynamically through StorageClasses. They have a lifecycle independent of any individual Pod. This is a fundamental difference from ephemeral storage, which is tied to a Pod's lifecycle.

A PV is defined by a YAML manifest that specifies its capacity, access modes (ReadWriteOnce, ReadOnlyMany, ReadWriteMany, and the newer ReadWriteOncePod), storage class, reclaim policy (Retain, Delete, or the deprecated Recycle), and the actual storage backend. Common backends include NFS, iSCSI, cloud provider disks (like AWS EBS, Azure Disk, GCE Persistent Disk), and distributed filesystems like Ceph or GlusterFS. The PV's status can be Available, Bound, Released, or Failed.
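The fields listed above map onto a manifest roughly as follows. This is a minimal sketch for a single-node lab cluster, not a production layout: the name example-pv, the class name manual, and the host path /mnt/data are all illustrative assumptions.

```yaml
# Hypothetical PV for a single-node test cluster.
# The name, class name, and path are placeholders, not values from a real cluster.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv                        # PVs are cluster-scoped, so no namespace
spec:
  capacity:
    storage: 10Gi                         # size advertised to claims
  accessModes:
    - ReadWriteOnce                       # read-write from a single node
  persistentVolumeReclaimPolicy: Retain   # keep the data when the claim is deleted
  storageClassName: manual                # claims must name the same class to bind
  hostPath:                               # storage backend; hostPath suits only lab use
    path: /mnt/data
```

After kubectl apply, kubectl get pv should list the volume with status Available until a claim binds to it.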

To use a PV, a user creates a PersistentVolumeClaim (PVC). A PVC is a request for storage by a user. It specifies the desired size and access modes. Kubernetes then attempts to bind the PVC to a PV that matches the request. This binding is a one-to-one mapping. Once bound, the PVC can be mounted into a Pod as a volume. The Pod specification references the PVC by name.
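A claim and a Pod that consumes it might look like the sketch below. It assumes an Available PV with class manual, ReadWriteOnce access, and at least 5Gi already exists; the names example-claim and example-pod and the nginx image are placeholders.

```yaml
# Hypothetical claim: asks for 5Gi, ReadWriteOnce, from the (assumed) class "manual".
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
  namespace: default            # PVCs are namespaced, unlike PVs
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: manual
  resources:
    requests:
      storage: 5Gi
---
# Hypothetical Pod in the same namespace that references the claim by name.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
  namespace: default
spec:
  containers:
    - name: app
      image: nginx
      volumeMounts:
        - name: data                       # must match the volume name below
          mountPath: /usr/share/nginx/html
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-claim           # the Pod names the PVC, never the PV
```

The Pod and the PVC must live in the same namespace; kubectl get pvc shows the claim as Bound once a matching PV is found.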

StorageClasses provide a way to describe different classes of storage. For example, you might have a StorageClass for fast SSD storage and another for slower HDD storage. When a PVC specifies a StorageClass, Kubernetes can dynamically provision a PV that matches that class. This is dynamic provisioning and is the most common way PVs are used in production environments.
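Dynamic provisioning can be sketched with a StorageClass and a claim that names it. The example assumes the cluster runs the AWS EBS CSI driver, whose provisioner name is ebs.csi.aws.com; on another platform the provisioner and parameters would differ. The names fast-ssd and fast-claim are illustrative.

```yaml
# Hypothetical StorageClass; assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: ebs.csi.aws.com
parameters:
  type: gp3                               # driver-specific parameter (SSD volume type)
reclaimPolicy: Delete                     # dynamically created PVs inherit this policy
volumeBindingMode: WaitForFirstConsumer   # provision only once a Pod uses the claim
---
# Claim that triggers provisioning; no PV is created by hand.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: fast-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 20Gi
```

With WaitForFirstConsumer the claim stays Pending until a Pod that references it is scheduled, which is expected behavior rather than a binding failure.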

The reclaim policy determines what happens to a PV when the PVC bound to it is deleted. With the Retain policy, the PV remains and its data is preserved, but it cannot be reused by a new PVC until the administrator manually reclaims it. With the Delete policy, both the PV and the underlying storage asset are deleted. The Recycle policy is deprecated, but it would scrub the volume and make it available again.

Persistent Volumes are crucial for stateful applications like databases, message queues, and file storage. They enable stateful workloads to run reliably in a containerized environment. Understanding PVs, PVCs, and StorageClasses is essential for the CNCF Certified Kubernetes Administrator (CKA) exam.

Real-Life Example

Think of a large public library. The library building itself is the Kubernetes cluster. Inside the library, there are reading rooms where people can work. These reading rooms are like pods. People come and go, and a reading room might be used by different people throughout the day. However, the books in the library must stay on the shelves permanently. The shelves are the Persistent Volumes.

A library patron (a pod) comes in and wants to read a specific book. The patron cannot just take the book off the shelf and put it on a desk permanently. Instead, the patron goes to the checkout desk (the PersistentVolumeClaim) and requests a specific book. The librarian (Kubernetes) finds the book (a specific PV) on the shelf and records that this book is now checked out to this patron. The book is now bound to the patron for a period of time. The patron can read the book at any desk in the library.

If the patron leaves the library (the pod is destroyed), the book is not thrown away. It stays on the shelf, and the record shows it is still checked out. Another patron can come along and, if given the same checkout claim, can access that same book from a different desk. The library shelves themselves are the physical storage backend (like an NFS server or a cloud disk). Each shelf might have different properties. One shelf might hold rare books that cannot be removed (ReadOnlyMany), while another shelf holds regular books that one person can borrow at a time (ReadWriteOnce).

This system allows many patrons to use different books independently, and the books remain safe and organized. The separation of the book collection from the reading rooms makes the library efficient and resilient. Even if one reading room is closed for renovation (a node failure), the books are still accessible from other rooms using the same checkout process.

Why This Term Matters

Persistent Volumes matter because most real-world applications are stateful. They need to remember data between sessions, across restarts, and during scaling events. Without persistent storage, every time a pod is replaced or moves to a different node, any data written to the pod's own filesystem is lost. That would make it impossible to run databases, content management systems, user authentication services, or any application that stores files or records.

In a production Kubernetes environment, applications are often updated, scaled up or down, and moved between nodes for load balancing or node maintenance. Persistent Volumes ensure that data travels with the application, not with the pod. This decoupling of compute and storage is a key advantage of container orchestration. It allows operations teams to manage storage independently from application deployments. Storage can be backed up, replicated, or migrated without affecting running applications.

Persistent Volumes also enable multi-tenant environments. Different teams or applications can have their own dedicated storage volumes, isolated from each other, even though they run on the same cluster. This is critical for security and performance. Administrators can set quotas and policies on storage consumption per namespace or per project.

In cloud environments, Persistent Volumes directly map to cloud storage services. For example, a Kubernetes cluster running on AWS can automatically provision an EBS volume for a PV. This integration allows dynamic scaling of storage. If an application needs more space, the PVC can be updated to request a larger size, and, where the StorageClass allows volume expansion and the driver supports online resizing, the underlying volume can be expanded without downtime.

Finally, understanding Persistent Volumes is a requirement for the CKA exam and for any role that involves managing Kubernetes in production. It is not just a theoretical concept; it is something you configure, troubleshoot, and optimize on a daily basis. Without a solid grasp of PVs, you cannot effectively run stateful workloads in Kubernetes.

How It Appears in Exam Questions

In the CKA exam, questions about Persistent Volumes appear in several distinct patterns.

Configuration questions: You will be asked to create a PVC that binds to a given PV. The question will provide the PV's YAML definition or its attributes. You must create the PVC YAML with compatible values for storage size and access mode. For example, if the PV has a capacity of 10Gi and access mode ReadWriteOnce, your PVC must request ReadWriteOnce and no more than 10Gi. If the PV has a storageClassName defined, your PVC must use the same class.

Scenario questions: You might be told that an application needs to store logs persistently. You must create a PVC and a pod that mounts that PVC at /var/log/app. The pod might need to write data to that location. You must ensure the mount path is correct and that the pod has the necessary permissions.
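A sketch of the log scenario follows. It assumes the cluster has a default StorageClass (the claim names none), and it uses fsGroup as one way to give a non-root container write access; fsGroup is only honored by volume types that support ownership changes, so treat it as an assumption to verify. The names app-logs and log-writer are hypothetical.

```yaml
# Hypothetical claim for application logs; relies on a default StorageClass.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-logs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
# Hypothetical Pod that appends a timestamp to a log file every 5 seconds.
apiVersion: v1
kind: Pod
metadata:
  name: log-writer
spec:
  securityContext:
    runAsUser: 1000      # run as a non-root user
    fsGroup: 2000        # ask Kubernetes to make the volume group-writable for GID 2000
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
      volumeMounts:
        - name: logs
          mountPath: /var/log/app   # the path required by the scenario
  volumes:
    - name: logs
      persistentVolumeClaim:
        claimName: app-logs
```

Running kubectl exec log-writer -- tail /var/log/app/app.log is a quick way to confirm that the container can write to the mount.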

Troubleshooting questions: You might be given a cluster where a pod is not starting because its PVC is unbound. You must examine the PVC and PV objects using kubectl describe and identify the issue. Common issues include a size mismatch, an access mode mismatch, or the PV being in a Released or Failed state. You must then fix the PVC or PV to resolve the binding.

Architecture questions: You might be asked to explain how to make a deployed web application stateful. The answer would involve creating a PV and PVC for the application's data directory, such as /var/www/html for a web application's uploaded files. You might also need to consider the access mode. If multiple replicas of the web server need to read the same files, you would need a ReadOnlyMany or ReadWriteMany PV.

Dynamic provisioning questions: You might be asked to create a StorageClass that uses a specific provisioner, such as ebs.csi.aws.com (the AWS EBS CSI driver; the older in-tree name kubernetes.io/aws-ebs is deprecated). Then you must create a PVC that uses that StorageClass. The exam environment will automatically create a PV. You do not need to create the PV yourself. This tests your understanding of StorageClass YAML and how it interacts with PVCs.

Reclaim policy questions: You might be given a scenario where a PVC is deleted, and you are asked what happens to the PV. If the PV has a Retain policy, it remains in the Released state. If it has a Delete policy, it gets deleted along with the underlying storage. You might be asked to change the reclaim policy of an existing PV.

Example Scenario

You are a system administrator for a company that runs a customer feedback portal on Kubernetes. The portal allows customers to upload screenshots of errors they encounter. These screenshots must be saved and accessible even if the pod running the web server crashes and gets restarted. Currently, the application is storing these files in a local directory inside the pod, and every time the pod restarts, the uploaded files are lost.

To fix this, you decide to use a Persistent Volume. You check what storage is available in your cluster and find that there is already a Persistent Volume named feedback-storage that uses an NFS server. This PV has a capacity of 50Gi and supports ReadWriteOnce. You create a PersistentVolumeClaim named feedback-claim that requests 10Gi of storage with ReadWriteOnce. Kubernetes binds your claim to the feedback-storage PV. Although the claim asks for only 10Gi, it receives the whole 50Gi volume, because a PV is bound to exactly one claim. Then you edit the deployment YAML for the web server to add a volume that uses the feedback-claim PVC. You mount this volume at /app/uploads. Now whenever a customer uploads a screenshot, the file is saved to the PV, not to the pod's local filesystem. If the pod crashes and a new one is created, the new pod mounts the same PVC and can still read all previously uploaded files. The data persists.
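The scenario could be written out roughly as below. Several details are assumptions, not given in the scenario: the PV is assumed to have no storage class (hence storageClassName: ""), the claim pins the PV with volumeName to avoid binding to a different volume, and the Deployment name, labels, and image are placeholders.

```yaml
# Claim from the scenario: 10Gi, ReadWriteOnce, bound to the existing PV.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: feedback-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ""           # assumption: feedback-storage has no class set
  volumeName: feedback-storage   # optional: pin the claim to this specific PV
  resources:
    requests:
      storage: 10Gi
---
# Hypothetical Deployment for the web server; only the volume wiring matters here.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: feedback-portal
spec:
  replicas: 1
  selector:
    matchLabels:
      app: feedback-portal
  template:
    metadata:
      labels:
        app: feedback-portal
    spec:
      containers:
        - name: web
          image: example.com/feedback-portal:1.0   # placeholder image
          volumeMounts:
            - name: uploads
              mountPath: /app/uploads              # uploads now land on the PV
      volumes:
        - name: uploads
          persistentVolumeClaim:
            claimName: feedback-claim
```

If the PV does carry a storage class, the claim's storageClassName must be set to that class instead of the empty string.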

Common Mistakes

Thinking a Persistent Volume is a type of volume that automatically survives pod restarts without any configuration.

A Persistent Volume is a separate resource that must be explicitly created (or dynamically provisioned) and claimed. Simply using a regular emptyDir volume inside a pod will not persist data once the pod is deleted or rescheduled; it only survives container restarts within the same pod. The volume must be a PV bound via a PVC.

Always remember that persistence requires a PV and a PVC. Create a PVC in your manifest and mount it in the pod. Do not rely on default local storage.

Assuming a PVC must request exactly the PV's capacity, or that a larger PV will not bind.

A PVC binds to any Available PV whose capacity is equal to or greater than the request, provided the access modes and StorageClass also match. When several PVs qualify, Kubernetes prefers the smallest one that satisfies the claim. The claim then owns the whole volume: a 5Gi claim bound to a 10Gi PV reports 10Gi of capacity, and the remainder cannot be handed to another PVC. A claim stays Pending on size grounds only when it asks for more than any matching PV offers.

Request no more than the target PV's capacity, and check the access mode and storageClassName as well as the size. Use kubectl get pv or kubectl describe pv to see the exact capacity of available PVs.

Forgetting to specify the access mode, or specifying a different access mode than what the PV supports.

The access mode of the PVC must be compatible with the PV. If the PV only supports ReadWriteOnce and the PVC asks for ReadOnlyMany, the binding will fail. The PVC will remain in a Pending state.

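Always check the access modes of the target PV using kubectl get pv, which shows them in abbreviated form (RWO, ROX, RWX, RWOP). Then create the PVC with access modes that the PV supports. The accessModes field is a list, so a PVC may name more than one mode, but every mode it lists must be offered by the PV.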

Assuming that deleting a PVC automatically deletes the bound PV.

The behavior depends on the reclaim policy of the PV. If the reclaim policy is Retain, the PV remains after the PVC is deleted. If the reclaim policy is Delete, the PV is deleted. Many beginners do not check the reclaim policy and are surprised when data is lost or when PVs accumulate.

Before deleting a PVC, check the reclaim policy of the bound PV. Use kubectl get pv to see it. If you want to keep the data, change the reclaim policy to Retain before deleting the PVC.
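One way to switch an existing PV to Retain is a small patch. The sketch below stores the patch in a file; the file name retain-patch.yaml is hypothetical, and <pv-name> stands for whatever kubectl get pv reports for the bound volume.

```yaml
# retain-patch.yaml (hypothetical file name)
# Apply with:
#   kubectl patch pv <pv-name> --patch-file retain-patch.yaml
# or inline:
#   kubectl patch pv <pv-name> -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
spec:
  persistentVolumeReclaimPolicy: Retain   # keep the PV and its data after the PVC is deleted
```

Afterwards, the RECLAIM POLICY column of kubectl get pv should show Retain for that volume.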

Using the same PVC in multiple pods that all need to write data at the same time, without checking if the PV supports ReadWriteMany.

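Many PV backends only support ReadWriteOnce, meaning the volume can be mounted read-write by a single node at a time. Pods on that same node can share it, but a pod scheduled to a different node will typically fail to attach the volume and remain in ContainerCreating. Where the backend does not enforce the mode, concurrent writers can corrupt data. Beginners often assume all PVs are multi-writer.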

Check the access mode of the PV. If you need multiple pods to write simultaneously, use a PV that supports ReadWriteMany, such as NFS or CephFS. If only one pod needs to write, use ReadWriteOnce.

Exam Trap — Don't Get Fooled

The exam presents a PVC that is stuck in Pending state because the cluster only has PVs with a different access mode. The candidate is asked to fix the binding. Remember that you cannot change the access modes of a PVC once it is created, and its requested size can only be increased after it is bound, and only when its StorageClass allows volume expansion.

The correct approach is to delete the existing PVC and create a new one whose access mode matches an available PV. Alternatively, if you are allowed to change the storage side, create a new PV that offers the access mode the PVC requests. In the exam, always check which fields are immutable before attempting edits.

Commonly Confused With

Persistent Volumes vs emptyDir Volume

An emptyDir volume is created when a pod starts and is destroyed when the pod is deleted. It is useful for temporary data that does not need to survive pod restarts. A Persistent Volume, by contrast, exists independently of any single pod and persists data across pod lifecycles. emptyDir volumes do not require a PV or PVC, while Persistent Volumes do.

If you need a scratch pad for data that only a single pod uses temporarily, use emptyDir. If you need to store a database file that must survive pod crashes, use a Persistent Volume.
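The contrast can be seen in a single Pod that uses both kinds of volume. This is a sketch only: the Pod name, the busybox image, the mount paths, and the claim name db-data (assumed to exist already) are all illustrative.

```yaml
# Hypothetical Pod with one scratch volume and one persistent volume.
apiVersion: v1
kind: Pod
metadata:
  name: scratch-and-data
spec:
  containers:
    - name: app
      image: busybox
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: scratch
          mountPath: /tmp/cache       # contents vanish when the Pod is deleted
        - name: data
          mountPath: /var/lib/data    # contents live on the PV behind the claim
  volumes:
    - name: scratch
      emptyDir: {}                    # no PV or PVC involved
    - name: data
      persistentVolumeClaim:
        claimName: db-data            # assumed to be an existing, Bound PVC
```

Deleting and recreating this Pod leaves /var/lib/data intact while /tmp/cache starts empty again.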

Persistent Volumes vs ConfigMap

A ConfigMap stores non-sensitive configuration data as key-value pairs or files. It is mounted into a pod as a volume or used as environment variables. ConfigMaps are not designed for dynamic data like application logs or user uploads. Persistent Volumes are for storing actual persistent data that grows over time. ConfigMaps are for static configuration.

Use a ConfigMap to store the URL of an external API. Use a Persistent Volume to store the user profile pictures that users upload to your application.

Persistent Volumes vs hostPath Volume

A hostPath volume mounts a file or directory from the host node's filesystem into a pod. It is tied to a specific node, so a pod rescheduled to another node sees different data or none at all, and its lifecycle is not managed by Kubernetes. Persistent Volumes are cluster-wide resources that, with a network backend, can be mounted by pods on any node, and their lifecycle is managed independently. hostPath volumes are generally discouraged in production because of security and portability issues.

Use HostPath only for testing or for accessing host-specific files like docker.sock. For production data that must be available across the cluster, use a Persistent Volume with a proper backend like NFS or cloud storage.

Step-by-Step Breakdown

1. Administrator provisions a Persistent Volume

A cluster administrator creates a PV object in Kubernetes. This involves defining the storage backend, capacity, access modes, reclaim policy, and optionally a StorageClass. The PV is a cluster-level resource, not tied to any namespace. It represents the actual storage available, such as an NFS share or a cloud disk volume.

2. User creates a PersistentVolumeClaim

A user creates a PVC in a specific namespace. The PVC specifies the desired storage size, access modes, and possibly a StorageClass. This is a request for storage. The PVC does not directly reference a specific PV; it describes requirements. Kubernetes will attempt to find a PV that satisfies these requirements.

3. Kubernetes binds the PVC to a matching PV

Kubernetes continuously scans for PVs that satisfy the PVC's requirements. When it finds a matching PV (same or larger capacity, compatible access mode, same StorageClass if specified, and the PV is in Available status), it binds the PVC to that PV. The PVC's status changes to Bound. This binding is one-to-one. Once bound, that PV cannot be bound to any other PVC until it is released.

4. User creates a Pod that uses the PVC

In the Pod specification, the user adds a volume that references the PVC by name. This is done in the volumes section of the pod spec. Then, in the container spec, the user mounts that volume to a specific mount path inside the container. When the pod starts, Kubernetes mounts the underlying PV to that path, making the persistent data accessible to the application.

5. Pod reads and writes data to the PV

The application running inside the container can now read and write files to the mount path, just like it would to any local directory. The data is actually being stored on the underlying storage backend. The pod can continue to access this data even if it is restarted or rescheduled to a different node, because the PV remains available.

6. Pod is deleted or rescheduled

When the pod is deleted, the PV is not automatically deleted. The PVC remains, and its bound PV remains. When a new pod is created that uses the same PVC, it will mount the same PV and have access to all the same data. The PVC acts as a persistent reference to the storage resource.

7. User deletes the PVC

When the user deletes the PVC, the PV's reclaim policy determines the outcome. If the policy is Retain, the PV remains but moves to a Released state and cannot be reused by a new PVC until an administrator manually clears it. If the policy is Delete, the PV and the underlying storage asset are also deleted, potentially losing all data. The Recycle policy would scrub the volume and make it available again, but this policy is deprecated.

Practical Mini-Lesson

Let us walk through a practical example of configuring Persistent Volumes for a WordPress site running on Kubernetes. This will illustrate the full workflow from administrator provisioning to developer deployment.

First, as an administrator, you need to decide on a storage backend. For this lesson, assume you have an NFS server running at 192.168.1.100 with a shared directory at /srv/wordpress. You create a PersistentVolume YAML file named pv-wordpress.yaml. In it, you specify the capacity as 20Gi, access mode as ReadWriteMany (so that several WordPress replicas on different nodes can write to the same files), a storageClassName of standard, and the NFS server details. You also set the reclaim policy to Retain to protect against accidental data loss. After applying this YAML with kubectl apply -f pv-wordpress.yaml, the PV is available in the cluster.
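The file described above might look like this. The object name pv-wordpress is an assumption (the lesson only names the file); the other values come from the description.

```yaml
# pv-wordpress.yaml: a sketch of the NFS-backed PV described in the lesson.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-wordpress             # assumed object name, matching the file name
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany              # NFS can be mounted read-write from many nodes
  persistentVolumeReclaimPolicy: Retain
  storageClassName: standard
  nfs:
    server: 192.168.1.100        # NFS server from the lesson
    path: /srv/wordpress         # exported directory
```

If the cluster already has a dynamic StorageClass called standard, choosing a different class name for this hand-made PV avoids confusion between the two.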

Next, you create a PersistentVolumeClaim for WordPress itself. In a file named pvc-wordpress.yaml, you request 10Gi of storage with access mode ReadWriteMany and storageClassName standard. You apply it with kubectl apply -f pvc-wordpress.yaml. Kubernetes will bind your PVC to the PV you created, because the PV satisfies the request: it has the same access mode and storage class, and its 20Gi capacity covers the 10Gi asked for. You can verify the binding with kubectl get pvc.
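A sketch of that claim file follows. The claim name pvc-wordpress matches the claimName used in the deployment step; the namespace is left at the default as an assumption.

```yaml
# pvc-wordpress.yaml: a sketch of the claim described in the lesson.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pvc-wordpress
spec:
  accessModes:
    - ReadWriteMany              # must be offered by the PV
  storageClassName: standard     # must equal the PV's class
  resources:
    requests:
      storage: 10Gi              # any PV of 10Gi or more in this class can satisfy it
```

Once bound, kubectl get pvc reports the capacity of the PV (20Gi in this lesson), not the 10Gi that was requested.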

Now you deploy WordPress. In your WordPress deployment YAML, you add a volumes section that references the PVC: volumes: - name: wordpress-storage persistentVolumeClaim: claimName: pvc-wordpress. Then, in the container spec, you add a volumeMount: mountPath: /var/www/html name: wordpress-storage. This path is where WordPress stores its uploaded media, themes, and plugin files.
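Put together, the volume wiring in the Deployment could look like the sketch below. The Deployment name, labels, and the unpinned wordpress image are assumptions, and the database connection settings that WordPress needs are omitted to keep the focus on storage.

```yaml
# Hypothetical WordPress Deployment; only the storage wiring is shown in full.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
  template:
    metadata:
      labels:
        app: wordpress
    spec:
      containers:
        - name: wordpress
          image: wordpress                 # database env vars omitted for brevity
          ports:
            - containerPort: 80
          volumeMounts:
            - name: wordpress-storage      # refers to the volume declared below
              mountPath: /var/www/html     # media, themes, and plugins live here
      volumes:
        - name: wordpress-storage
          persistentVolumeClaim:
            claimName: pvc-wordpress       # the claim created in the previous step
```

Note that volumeMounts sits under the container while volumes sits under the Pod spec; mixing up the two indentation levels is a common cause of validation errors.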

You also need a database for WordPress. You create a separate PVC for the MySQL database with its own PV. This demonstrates that each stateful component can have its own persistent storage. The MySQL pod mounts its PVC at /var/lib/mysql.

Now test the setup. Upload a media file to WordPress. Delete the WordPress pod. Wait for Kubernetes to recreate the pod automatically (if using a Deployment). When the new pod starts, it mounts the same PVC, and your uploaded file is still there. The site is fully persistent.

Common issues in this scenario: The PVC might remain Pending if the PV's access mode or storage class does not match. Use kubectl describe pvc to see events. Another issue is that the NFS server must be reachable from all nodes. Firewall rules or incorrect NFS export options can prevent the pod from mounting. Always verify the underlying storage is functional before blaming Kubernetes. Finally, remember that if you ever need to delete and recreate the PVC, you must ensure the PV's reclaim policy is set to Retain if you want to keep the data. Otherwise, you might lose all your WordPress uploads. With Retain, the PV moves to Released once the PVC is deleted, so an administrator must clear its old claim reference before a new PVC can bind to it.

Memory Tip

Persistent Volumes are like a library's permanent book collection, not a temporary reading desk. The PV is the book on the shelf. The PVC is the checkout request. The pod is the person reading at a desk.

Frequently Asked Questions

Can a single Persistent Volume be used by multiple pods simultaneously?

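Yes, if the PV and its backend support it. ReadWriteMany lets pods on many nodes mount the volume read-write, and ReadOnlyMany lets them mount it read-only. Even a ReadWriteOnce volume can be shared by several pods, as long as they all run on the same node. However, not all storage backends support multi-node access. For example, NFS supports ReadWriteMany, but a typical AWS EBS volume does not.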

What happens if I create a PVC but no PV matches it?

The PVC will remain in a Pending state. Kubernetes will keep trying to bind it to a PV, but if none match, it will stay Pending. You need to either create a matching PV or use a StorageClass that can dynamically provision one.

What is the difference between static and dynamic provisioning of Persistent Volumes?

Static provisioning means an administrator manually creates PVs ahead of time. Dynamic provisioning means a StorageClass is used, and a PV is automatically created when a PVC requests it. Dynamic provisioning is more common in cloud environments because it scales automatically.

Can I change the size of a Persistent Volume after it is created?

It depends on the storage backend and the CSI driver. You do not edit the PV itself. The StorageClass that provisioned the volume must have allowVolumeExpansion set to true, and the driver must support expansion. You can then edit the PVC to request a larger size; shrinking a volume is not supported.
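A sketch of an expandable setup is shown below. It assumes the AWS EBS CSI driver (provisioner ebs.csi.aws.com) as one driver that supports expansion; the class name expandable and claim name data-claim are hypothetical.

```yaml
# Hypothetical StorageClass that permits resizing of its volumes.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: ebs.csi.aws.com     # assumption: this driver is installed in the cluster
allowVolumeExpansion: true       # without this, PVC size increases are rejected
---
# Existing claim, re-applied with a larger request to trigger expansion.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: expandable
  resources:
    requests:
      storage: 20Gi              # raised from an assumed original 10Gi
```

Progress can be followed with kubectl describe pvc data-claim, which lists resize events and any conditions that are still pending.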

What is the reclaim policy and why does it matter?

The reclaim policy determines what happens to a PV when its bound PVC is deleted. The options are Retain, Delete, and Recycle. Retain keeps the PV and its data but requires manual cleanup. Delete removes both the PV and the underlying storage. Recycle is deprecated but would scrub the volume. Choosing the wrong policy can lead to accidental data loss or orphaned volumes.

Is a Persistent Volume tied to a specific node?

Not by default. A PV is a cluster resource and can be mounted by a pod on any node that has access to the underlying storage. However, some volume types like hostPath and local are tied to a specific node, and cloud block storage is usually limited to nodes in the same availability zone as the disk. For production, use network-based storage like NFS, or cloud block storage that is reachable from the nodes where the pod can run.

Can I use a Persistent Volume in multiple namespaces?

Not directly. A PV is cluster-scoped, but it can only be bound to one PVC at a time, and that PVC lives in a single namespace. If you need to share the same underlying storage across namespaces, you would need to create a separate PV and PVC pair for each namespace, with each PV pointing at the same backend (for example, the same NFS export), use a ReadWriteMany-capable volume type, and manage access externally.

Summary

Persistent Volumes are a fundamental Kubernetes resource that decouples storage from the lifecycle of pods. They allow stateful applications to run reliably in a containerized environment. A PV is a cluster-wide resource provisioned by an administrator or created dynamically through a StorageClass, while a PVC is a user's request for storage.

Kubernetes binds the two together, enabling pods to mount and use persistent storage. This separation is crucial for production workloads that need to preserve data across pod restarts, scaling events, and node failures. For the CKA exam, you must be comfortable creating and binding PVs and PVCs, configuring StorageClasses for dynamic provisioning, and troubleshooting binding issues.

Common mistakes include mismatching access modes, forgetting to check reclaim policies, and confusing PVs with other volume types. By mastering Persistent Volumes, you gain the ability to deploy and manage stateful applications like databases, file storage, and content management systems in Kubernetes with confidence.