CHAPTER 15 Intermediate

Deploying Databases in Kubernetes

Updated: May 15, 2026
30 min read

1. Introduction

Historically, many engineers argued that databases should *never* run inside Kubernetes. Pods are ephemeral, and that worried database administrators. As Kubernetes matured and gained controllers built specifically for stateful workloads, the industry's position shifted. In this chapter, we address the ephemeral-storage problem by deploying a persistent database with the StatefulSet controller.

2. Learning Objectives

By the end of this chapter, you will be able to:
  • Understand the unique challenges of running Databases in Kubernetes.
  • Contrast a standard Deployment with a StatefulSet.
  • Understand stable network identities (Headless Services).
  • Deploy a MySQL database using a StatefulSet and PVC.
  • Discuss the merits of Managed Cloud Databases (RDS) vs. Kubernetes databases.

3. Beginner-Friendly Explanation

Imagine organizing a fleet of delivery trucks.
  • Deployment (Stateless): You manage a fleet of generic white vans. If Van #4 breaks down, you buy a new white van. You do not care about the license plate or the name of the van, as long as you have 10 vans driving. (This is perfect for web servers.)
  • StatefulSet (Stateful): You manage a fleet of armored bank trucks. Truck #1 strictly carries gold. Truck #2 strictly carries diamonds. If Truck #1 breaks down, you cannot just replace it with a generic van. You need a specific replacement truck that inherits the identity, the security clearance, and the cargo (the PVC) of the original Truck #1.

A StatefulSet guarantees strict identity and strict storage mapping.

4. The StatefulSet Controller

A StatefulSet is the sibling of the Deployment controller, designed for stateful workloads such as databases (MySQL, MongoDB, Cassandra). Unlike a Deployment, a StatefulSet maintains a sticky identity for each of its Pods.
  • Ordered Naming: If you deploy a StatefulSet named mysql with 3 replicas, it doesn't create Pods with random suffixes like mysql-8f7b5. It creates exactly: mysql-0, mysql-1, and mysql-2.
  • Ordered Creation/Deletion: By default, it starts Pods in ordinal order and waits for mysql-0 to be Running and Ready before starting mysql-1. When scaling down, it removes Pods in reverse order, highest ordinal first.
  • Sticky Storage: This is the most critical feature. It creates a dedicated Persistent Volume Claim (PVC) for *each* Pod. If mysql-1 crashes, Kubernetes creates a new mysql-1 and reattaches the *same PVC*, and therefore the same underlying volume, to it.
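
The naming rules above are mechanical enough to sketch in a few lines of shell. The helper below is a hypothetical illustration, not part of Kubernetes or kubectl: it only prints the Pod and PVC names a StatefulSet would derive. The claim template name data is an assumption that matches the manifest used later in this chapter.

```bash
# Sketch: print the Pod and PVC names a StatefulSet derives from its own name.
#   Pods: <statefulset>-<ordinal>
#   PVCs: <claim-template>-<statefulset>-<ordinal>
statefulset_names() {
  local sts="$1" template="$2" replicas="$3" i=0
  while [ "$i" -lt "$replicas" ]; do
    printf '%s-%d  %s-%s-%d\n' "$sts" "$i" "$template" "$sts" "$i"
    i=$((i + 1))
  done
}

statefulset_names mysql data 3
# mysql-0  data-mysql-0
# mysql-1  data-mysql-1
# mysql-2  data-mysql-2
```

Because the names are deterministic, a replacement Pod can be matched to its existing PVC by name alone.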

5. Headless Services

A standard ClusterIP Service load-balances traffic across all matching Pods, which is not what you want when addressing individual members of a StatefulSet. (You rarely want to send a WRITE request randomly to a read-only database replica!) Instead, you create a Headless Service by setting clusterIP: None in the Service YAML. DNS then returns the Pod IPs directly, and each Pod gets its own stable DNS name (e.g., mysql-0.mysql-service.default.svc.cluster.local), so clients can talk to a specific Pod.
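
As a rough illustration of that DNS format, the sketch below assembles the per-Pod name from its parts. The function names pod_fqdn and resolve_in_cluster, the throwaway Pod name dns-test, and the busybox:1.28 image are assumptions made for this example. The cluster domain cluster.local is the common default, but a cluster can be configured differently.

```bash
# Sketch: build the DNS name a headless Service gives each StatefulSet Pod.
# Format: <pod>.<service>.<namespace>.svc.<cluster-domain>
pod_fqdn() {
  local sts="$1" ordinal="$2" service="$3" namespace="${4:-default}"
  local domain="${5:-cluster.local}"   # assumption: default cluster domain
  printf '%s-%s.%s.%s.svc.%s\n' "$sts" "$ordinal" "$service" "$namespace" "$domain"
}

pod_fqdn mysql 0 mysql-service
# mysql-0.mysql-service.default.svc.cluster.local

# Optional: resolve the name from inside the cluster using a throwaway Pod.
# Needs kubectl and a running StatefulSet, so it is defined but not called here.
resolve_in_cluster() {
  kubectl run dns-test --rm -i --restart=Never --image=busybox:1.28 -- \
    nslookup "$(pod_fqdn "$@")"
}
```

Once a StatefulSet and its Headless Service are running, calling resolve_in_cluster mysql 0 mysql-service should return the IP of that one Pod rather than a shared virtual IP.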

6. Anatomy of a StatefulSet YAML

Notice the volumeClaimTemplates block. This tells Kubernetes: "For every Pod in this set, create a 10Gi PVC dedicated to it, and reuse that same PVC whenever the Pod is recreated."
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mysql
spec:
  serviceName: "mysql-service" # Required: Points to the Headless Service
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql # Must match spec.selector.matchLabels
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
          # Demo only: in production, load the password from a Secret
          - name: MYSQL_ROOT_PASSWORD
            value: "supersecret"
        volumeMounts:
        - name: data # Must match the volumeClaimTemplates name below
          mountPath: /var/lib/mysql # MySQL data directory
  volumeClaimTemplates:
  - metadata:
      name: data # PVCs are named data-<pod-name>, e.g. data-mysql-0
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
```

7. Mini Project: Deploy a Persistent MySQL DB

Let's launch an armored bank truck.

Step-by-Step Tutorial:

  1. Save the YAML from Section 6 into stateful-mysql.yaml.
  2. Create the Headless Service that the StatefulSet's serviceName refers to. Save this as mysql-svc.yaml:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
spec:
  clusterIP: None # Headless: DNS returns Pod IPs instead of a virtual IP
  selector:
    app: mysql
```

  3. Apply both files:

```bash
kubectl apply -f mysql-svc.yaml
kubectl apply -f stateful-mysql.yaml
```

  4. Watch the Pod creation: kubectl get pods -w. You will notice it boots with the strict name mysql-0.
  5. Verify the storage creation: kubectl get pvc. You will see that Kubernetes automatically generated a PVC named data-mysql-0. This relies on the cluster having a default StorageClass. Without one, the PVC stays Pending.
  6. The Test: Delete the Pod! kubectl delete pod mysql-0.
  7. Watch kubectl get pods -w again. Kubernetes recreates the Pod, names it mysql-0 again, and remounts the existing data-mysql-0 PVC, so anything written to /var/lib/mysql survives the restart.
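
Deleting a Pod that holds no data of your own does not demonstrate much, so it can help to write a marker before the deletion and look for it afterwards. The following is a minimal sketch under the assumptions of this chapter (Pod mysql-0, StatefulSet mysql, root password supersecret). The helper names run_sql, wait_for_mysql, and persistence_check and the database name demo are hypothetical.

```bash
# Sketch: confirm that data written to mysql-0 survives a Pod deletion.
# Assumes the StatefulSet from Section 6 and a working kubectl context.

# Run one SQL statement inside the mysql-0 Pod.
run_sql() {
  kubectl exec mysql-0 -- mysql -uroot -psupersecret -e "$1"
}

# Poll until mysqld accepts connections; a Running Pod may still be starting up.
wait_for_mysql() {
  until run_sql "SELECT 1;" >/dev/null 2>&1; do
    sleep 2
  done
}

persistence_check() {
  wait_for_mysql
  run_sql "CREATE DATABASE IF NOT EXISTS demo;"   # marker stored on the PVC
  kubectl delete pod mysql-0                      # simulate a crash
  # Wait for the StatefulSet to report its replicas ready again.
  kubectl rollout status statefulset/mysql --timeout=180s
  wait_for_mysql
  run_sql "SHOW DATABASES LIKE 'demo';"           # demo should still be listed
}
```

Run persistence_check once mysql-0 is up. If the final query lists demo, the recreated Pod is reading the same volume as the original.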

8. Real-World Scenarios

The Great Debate: Should you run databases in Kubernetes in production? If you are a startup, running PostgreSQL inside your AWS EKS cluster can save money because you don't pay for a separate Amazon RDS instance. However, you then own database backups, disaster recovery, and primary-replica replication yourself, and doing that well inside Kubernetes takes significant operational effort. Many enterprises prefer Managed Services (like AWS RDS or Google Cloud SQL) for their databases, reserving Kubernetes for their stateless Node.js/PHP web applications.

9. Best Practices

  • The Operator Pattern: Manually managing a 3-node PostgreSQL cluster (handling leader election, failovers, and backups) using raw StatefulSets is agonizing. In production, strongly consider the Operator Pattern instead. You install a third-party operator (like the Zalando Postgres Operator) into your cluster. The Operator acts as a robot Database Administrator, automating much of the complex lifecycle of the database cluster.

10. Common Mistakes

  • Scaling Down a StatefulSet: If you scale a StatefulSet down from 3 replicas to 2, Kubernetes deletes the mysql-2 Pod. Crucially, by default it DOES NOT delete the PVC (data-mysql-2)! This is a safety mechanism to prevent accidental data loss, and scaling back up to 3 reattaches the same data. If you no longer need that data, you must manually run kubectl delete pvc data-mysql-2 to stop paying your cloud provider for the underlying disk.
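
To make the leftover claims concrete, here is a small sketch that lists which PVC names remain after a scale-down and, optionally, removes them. The helpers orphaned_pvcs and scale_down_and_clean are hypothetical names for this example. The claim template name data comes from the manifest in Section 6.

```bash
# Sketch: list the PVCs left behind after scaling a StatefulSet down.
# PVC names follow <claim-template>-<statefulset>-<ordinal>.
orphaned_pvcs() {
  local template="$1" sts="$2" old="$3" new="$4"
  local i="$new"
  while [ "$i" -lt "$old" ]; do
    printf '%s-%s-%d\n' "$template" "$sts" "$i"
    i=$((i + 1))
  done
}

orphaned_pvcs data mysql 3 2
# data-mysql-2

# Scale down, then delete the leftover claims.
# Destructive: with a reclaim policy of Delete, the underlying disk is removed too.
scale_down_and_clean() {
  kubectl scale statefulset "$2" --replicas="$4"
  orphaned_pvcs "$@" | while read -r pvc; do
    kubectl delete pvc "$pvc"
  done
}
```

Keeping the listing separate from the deletion lets you review the names with kubectl get pvc before anything is removed.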

11. Exercises

  1. Contrast the naming conventions and creation ordering of Pods managed by a Deployment versus Pods managed by a StatefulSet.
  2. Explain the function of a Headless Service. Why is random load balancing detrimental to a primary-replica database architecture?

12. FAQs

Q: Can I use a Deployment for a database if I only want 1 replica?
A: Technically, yes. If you only ever have replicas: 1, a Deployment with a PVC will work. However, a StatefulSet is the conventional choice: it gives the Pod a stable name and its own PVC, and it keeps the architecture ready in case you ever need to scale to a multi-node cluster.

13. Interview Questions

  • Q: Detail the architectural necessity of the volumeClaimTemplates specification within a StatefulSet manifest. How does this differ from manually defining a static PVC in a Deployment?
  • Q: A CTO asks for your architectural recommendation: Deploy the primary transactional PostgreSQL database inside the company's existing Kubernetes cluster, or utilize a managed cloud service like AWS RDS. Present a balanced argument highlighting the operational overhead of both approaches.

14. Summary

In Chapter 15, we bridged the gap between ephemeral compute and permanent data. We recognized that standard Deployments do not preserve the stable identity that clustered databases require. We introduced the StatefulSet controller and its ability to provide ordered startup and shutdown, sticky network identities via Headless Services, and per-Pod persistent storage through volumeClaimTemplates. Running databases in Kubernetes carries real operational overhead, but the mini project showed that a Pod can be destroyed and recreated without losing its identity or its volume.

15. Next Chapter Recommendation

Deploying a complex database might require applying 10 different YAML files manually. This is tedious. What if we could package all those YAMLs into a single, installable application? Proceed to Chapter 16: Helm Charts Basics.
