Kubernetes Introduction
CHAPTER 09 Intermediate

Persistent Volumes and Storage

Updated: May 15, 2026
25 min read

1. Introduction

The Golden Rule of Kubernetes is that Pods are ephemeral. When a Pod is deleted or replaced, everything written to its containers' own filesystems is lost. That is fine for a stateless frontend web server, but it is a disaster for a database. If you run MySQL in a Pod with no external storage and the Pod is replaced, every customer record goes with it. To run stateful applications, Kubernetes decouples storage from Pods through Persistent Volumes (PV) and Persistent Volume Claims (PVC).

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Understand why local Pod storage is insufficient for databases.
  • Define a Persistent Volume (PV).
  • Define a Persistent Volume Claim (PVC).
  • Understand the relationship between the Cluster Admin and the Developer regarding storage.
  • Mount a PVC into a Pod to guarantee data persistence.

3. Beginner-Friendly Explanation

Imagine checking into a hotel (The Cluster).
  • Pod Storage (Ephemeral): You use the mini-fridge in your specific room. When you check out (the Pod dies), the cleaning staff throws away everything in the fridge.
  • Persistent Volume (PV): The hotel basement has 10 locked storage lockers. These represent real storage (AWS EBS volumes, NFS shares). They exist independently of any hotel room.
  • Persistent Volume Claim (PVC): You (the Developer) walk to the front desk and say, "I need a storage locker that holds at least 10 Gigabytes." (You make a Claim.) The front desk finds a basement locker that satisfies your request, gives you the key, and you take the key up to your room.

If you switch rooms (the Pod is recreated, possibly on a different node), you still have the key to your basement locker, and your data is still there.

4. Persistent Volume (PV)

A PV is a cluster-level object that represents a piece of real storage. It is either created ahead of time by a Cluster Administrator or provisioned on demand through a StorageClass. The storage behind it could be an Amazon EBS volume, a Google Persistent Disk, an NFS share, or a local disk on a node. Crucial Concept: PVs exist independently of Pods. If all Pods are deleted, the PV and its data remain untouched.
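
To make the administrator's side concrete, here is a minimal sketch of a statically created PV. The name demo-pv, the class label manual, and the path /mnt/data are placeholders chosen for illustration, and hostPath is only suitable for single-node test clusters.

```yaml
# Illustrative static PV; every name and path here is a placeholder.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: demo-pv                          # hypothetical name (PVs are cluster-scoped, so no namespace)
spec:
  capacity:
    storage: 10Gi                        # size advertised to claims
  accessModes:
    - ReadWriteOnce                      # mountable read-write by a single node
  persistentVolumeReclaimPolicy: Retain  # keep the volume and its data after the claim is deleted
  storageClassName: manual               # hypothetical class label that claims can ask for
  hostPath:
    path: /mnt/data                      # hypothetical directory on the node; test clusters only
```

With the Retain policy, deleting the claim leaves the PV and its data in place until an administrator cleans it up by hand.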

5. Persistent Volume Claim (PVC)

Developers usually do not know (or care) if the cluster is running on AWS or Google Cloud. They just know they need storage. A PVC is a *request* for storage by a user. When you submit a PVC for "10Gi of storage", Kubernetes looks for an available PV with at least that capacity, a compatible access mode, and the same storage class, and binds the two together. If no PV matches and the storage class has a dynamic provisioner, a new PV is created for the claim.
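
As a rough illustration of the matching rules, the claim below asks for 5Gi from a class called manual. The claim name and the class name are assumptions, and the sketch presumes an administrator has already created an unbound PV of at least 5Gi in that class.

```yaml
# Illustrative claim; names are placeholders, and a matching PV is assumed to exist.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-claim             # hypothetical name
spec:
  storageClassName: manual     # must equal the PV's storageClassName for the two to bind
  accessModes:
    - ReadWriteOnce            # the PV must offer this access mode
  resources:
    requests:
      storage: 5Gi             # a PV with 5Gi or more satisfies this request
```

If the only matching PV is larger than the request (say 10Gi), the claim still binds to it and receives the whole volume, because a PV is never split between claims.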

6. Anatomy of Storage YAMLs

The PVC (What the Developer writes):

yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pvc
spec:
  accessModes:
    - ReadWriteOnce # Only one node can mount this drive at a time
  resources:
    requests:
      storage: 5Gi

The Pod (Mounting the Claim):

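yaml
# A complete Pod manifest that mounts the claim.
# The Pod name and the password value are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
spec:
  volumes:
    - name: mysql-storage # Define a volume backed by the PVC
      persistentVolumeClaim:
        claimName: mysql-pvc
  containers:
    - name: mysql
      image: mysql:8.0
      env:
        - name: MYSQL_ROOT_PASSWORD # The mysql image refuses to start without a root password setting
          value: change-me
      volumeMounts:
        - mountPath: /var/lib/mysql # Where MySQL stores its data files
          name: mysql-storage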

7. Mini Project: Persist Database Data

Let's prove that data can survive Pod destruction.

Step-by-Step Tutorial: *(Note: Minikube utilizes a dynamic provisioner, meaning if you create a PVC, Minikube automatically creates the underlying PV for you!)*

  1. Save the PVC YAML from Section 6 into pvc.yaml and apply it:

bash
kubectl apply -f pvc.yaml
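  2. Verify the PVC is "Bound" (on clusters whose StorageClass uses the WaitForFirstConsumer binding mode, the claim stays "Pending" until a Pod uses it):
bash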
kubectl get pvc
  3. Create a simple Pod that appends a message and a timestamp to a file on the volume (writer-pod.yaml):
yaml
apiVersion: v1
kind: Pod
metadata:
  name: writer-pod
spec:
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: mysql-pvc
  containers:
    - name: alpine
      image: alpine
      command: ["/bin/sh", "-c"]
      # Append (>>) instead of overwriting (>), so every Pod start adds to the file
      args: ["echo 'THIS DATA SURVIVES' >> /data/message.txt; date >> /data/message.txt; sleep 3600"]
      volumeMounts:
        - mountPath: /data
          name: my-storage
  4. Apply the Pod (kubectl apply -f writer-pod.yaml).
  5. The Destruction: Delete the Pod with kubectl delete pod writer-pod. The Pod is gone, but the PVC is not.
  6. The Recovery: Apply the *exact same* writer-pod.yaml again to create a new Pod.
  7. Once the new Pod is Running, read the file from it:
bash
kubectl exec -it writer-pod -- cat /data/message.txt

*(You will see "THIS DATA SURVIVES" twice, each followed by a different timestamp. The first entry was written by the Pod you deleted, which shows that the data lived on in the volume while no Pod existed.)*
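
Another way to check the volume, sketched here under the assumption that writer-pod has been deleted first, is a separate Pod that only reads the file. The file name reader-pod.yaml and the Pod name reader-pod are hypothetical.

```yaml
# reader-pod.yaml: hypothetical companion Pod that never writes to the volume.
apiVersion: v1
kind: Pod
metadata:
  name: reader-pod             # hypothetical name
spec:
  restartPolicy: Never         # run once and stay Completed so the logs remain available
  volumes:
    - name: my-storage
      persistentVolumeClaim:
        claimName: mysql-pvc   # the same claim the writer Pod used
  containers:
    - name: alpine
      image: alpine
      command: ["/bin/sh", "-c"]
      args: ["cat /data/message.txt"]   # print whatever earlier Pods left behind
      volumeMounts:
        - mountPath: /data
          name: my-storage
          readOnly: true       # this container cannot modify the data
```

After kubectl apply -f reader-pod.yaml, running kubectl logs reader-pod should print the file contents. Because this Pod never writes, anything it prints must have come from an earlier Pod.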

8. Real-World Scenarios

In AWS EKS, you define a StorageClass (e.g., gp3 SSDs). When a developer creates a PVC requesting 100Gi of gp3 storage, the EBS CSI driver running in the cluster calls the AWS API, provisions a 100 GiB Elastic Block Store (EBS) volume, attaches it to the Worker Node where the Pod is scheduled, and mounts it into the Pod. All of this happens automatically.
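
The two objects involved might look like the sketch below. It assumes the AWS EBS CSI driver is installed in the cluster; the class name gp3 and the claim name orders-db-data are placeholders.

```yaml
# Sketch of dynamic provisioning on EKS; assumes the AWS EBS CSI driver is installed.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3                                # hypothetical class name
provisioner: ebs.csi.aws.com               # the EBS CSI driver
parameters:
  type: gp3                                # EBS volume type
reclaimPolicy: Delete                      # delete the EBS volume when the claim is deleted
volumeBindingMode: WaitForFirstConsumer    # provision only once a Pod using the claim is scheduled
allowVolumeExpansion: true
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orders-db-data                     # hypothetical claim name
spec:
  storageClassName: gp3                    # refers to the StorageClass above
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
```

WaitForFirstConsumer delays provisioning until the scheduler has picked a node, so the volume is created in the same availability zone as the Pod. Until then the claim shows as Pending.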

9. Best Practices

  • StatefulSets for Databases: In this chapter, we attached a PVC to a basic Pod. In a true production environment, you deploy clustered databases (like a 3-node MongoDB cluster) using a StatefulSet controller instead of a Deployment. A StatefulSet gives each replica a stable identity, creates and removes replicas in a fixed order, and uses volumeClaimTemplates to give every replica its own dedicated PVC.
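
A minimal sketch of that pattern follows. The names, the image tag, and the 10Gi size are assumptions, and the manifest only starts three independent mongod processes; turning them into a real MongoDB replica set needs extra configuration that is out of scope here.

```yaml
# Illustrative StatefulSet; names, image tag, and sizes are placeholders.
apiVersion: v1
kind: Service
metadata:
  name: mongo
spec:
  clusterIP: None              # headless Service: gives each replica a stable DNS name
  selector:
    app: mongo
  ports:
    - port: 27017
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo           # must match the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:7.0     # assumed tag
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: data
              mountPath: /data/db   # data directory used by the mongo image
  volumeClaimTemplates:        # one PVC is created from this template per replica
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
```

The resulting claims are named data-mongo-0, data-mongo-1, and data-mongo-2. If the Pod mongo-1 is rescheduled, it reattaches to data-mongo-1 rather than getting a fresh volume.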

10. Common Mistakes

  • ReadWriteMany vs ReadWriteOnce: If you create a PVC with ReadWriteOnce (RWO) and try to mount that same PVC into two Pods running on two different Worker Nodes, the second Pod will get stuck in ContainerCreating, typically with a Multi-Attach error in its events. A standard cloud block volume can only be attached to one Node at a time. (Pods on the same node can share an RWO volume.) If multiple Pods on different nodes need to read and write the same files, you must use a storage backend that supports ReadWriteMany (RWX), such as a Network File System (NFS) share.
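
For comparison, a claim for shared storage could look like this sketch. The class name shared-nfs is hypothetical and must point at a provisioner whose backend actually supports RWX; with a block-storage class the claim would typically stay Pending or fail to provision.

```yaml
# Illustrative RWX claim; the storage class name is a placeholder.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-files               # hypothetical name
spec:
  storageClassName: shared-nfs     # hypothetical class backed by an RWX-capable file system
  accessModes:
    - ReadWriteMany                # many nodes may mount the volume read-write at once
  resources:
    requests:
      storage: 20Gi
```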

11. Exercises

  1. Explain the decoupling benefit of having separate PV and PVC objects, rather than allowing Pods to define physical hard drives directly in their manifests.
  2. If a Pod is deleted, what happens to the Persistent Volume Claim (PVC) attached to it?

12. FAQs

Q: Are Kubernetes databases fast enough for production? A: Storage latency in Kubernetes was historically a concern, but with modern NVMe-backed cloud storage and Container Storage Interface (CSI) drivers, database performance can come close to bare-metal speeds. However, managing backups and disaster recovery for stateful clusters is highly complex, which is why many companies still prefer managed services like Amazon RDS over running databases inside Kubernetes.

13. Interview Questions

  • Q: Describe the architectural relationship between a StorageClass, a PersistentVolume (PV), and a PersistentVolumeClaim (PVC) in a dynamic provisioning environment like AWS EKS.
  • Q: A developer attempts to scale a Deployment attached to a standard AWS EBS volume (ReadWriteOnce) from 1 replica to 3 replicas. Explain the storage failure that will occur and how it prevents the new Pods from entering the Running state.

14. Summary

In Chapter 9, we tackled the ephemeral nature of Pods. We learned how to run stateful applications by decoupling storage from compute resources. We explored how Cluster Administrators (or a dynamic provisioner) expose real storage as Persistent Volumes (PVs), and how developers request that storage using Persistent Volume Claims (PVCs). Finally, we showed that data written to a mounted PVC survives the destruction and recreation of the Pod that uses it.

15. Next Chapter Recommendation

Our cluster is filling up with dozens of Deployments, Services, and PVCs. It is getting messy. How do we organize and separate them? Proceed to Chapter 10: Kubernetes Namespaces.
