Imagine you’re running an application that lets users store text, but every time the container crashes or restarts, their data is lost. Frustrating, right? That’s exactly the problem Kubernetes solves with volumes—ensuring data survives beyond container lifecycles. In this blog, we’ll walk through a practical example that demonstrates how Kubernetes can persist data effectively, even in failure scenarios.
The Problem: Where Did My Data Go?
Containers are lightweight, flexible, and easy to restart, but they come with a catch: a container's filesystem is ephemeral by default. Anything written inside the container is discarded when that container is removed and replaced, for example after a crash in Kubernetes or a redeployment.
Consider a simple application where users submit text, and the text is stored in a file inside a container. Here's how you can run it using Docker Compose:
docker compose up --build
Once the container is running, test it using curl:
- Retrieve stored text:
curl --location 'localhost/story'
- Add new text:
curl --location 'localhost/story' \
--header 'Content-Type: application/json' \
--data '{
"text": "My text!"
}'
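Now, if the container is removed and recreated (after a redeploy, or when Kubernetes replaces a crashed container), all previously submitted text is gone! This happens because every new container starts from the image's filesystem and does not inherit the files written by the previous one.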
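To see the data loss locally, the steps above can be wrapped in a small helper. This is only a sketch: the function name `reproduce_data_loss` and the five-second wait are made up for illustration, and it assumes the Compose file does not already mount a volume for the story file.

```shell
# Hypothetical helper: store some text, recreate the container, read the text back.
reproduce_data_loss() {
  # 1. Store a line of text through the API.
  curl --location 'localhost/story' \
    --header 'Content-Type: application/json' \
    --data '{ "text": "My text!" }'

  # 2. Remove the container, then create a fresh one from the image.
  docker compose down
  docker compose up --build -d

  # 3. Give the app a moment to start (assumed delay), then read the stored text.
  sleep 5
  curl --location 'localhost/story'
}
```

Call `reproduce_data_loss` from the project directory. If the response to the last request no longer contains the submitted text, the file lived only in the old container's writable layer.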
So, how do we persist the data beyond container failures? Kubernetes volumes provide the answer.
Solution: Adding Volumes in Kubernetes
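Kubernetes allows containers to mount volumes: storage that is declared at the pod level and outlives any single container in that pod. Compared to Docker volumes, Kubernetes volumes come in many types (emptyDir, hostPath, persistentVolumeClaim, and others), and the chosen type determines how long the data lives and from where it can be reached.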
Setting Up Kubernetes Deployment
First, push the Docker image to a registry:
docker tag kub-data-01-starting-setup-stories mayankcse1/kub-data-01-starting-set
docker push mayankcse1/kub-data-01-starting-setup-stories
Then, create a deployment.yaml file to define how Kubernetes should manage our container:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: story-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: story
  template:
    metadata:
      labels:
        app: story
    spec:
      containers:
        - name: story
          image: mayankcse1/kub-data-01-starting-setup-stories
          ports:
            - containerPort: 3000
          volumeMounts:
            - mountPath: /app/story
              name: stories-volume
      volumes:
        - name: stories-volume
          persistentVolumeClaim:
            claimName: stories-pvc
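The Deployment above refers to a claim named stories-pvc, so a PersistentVolumeClaim with that name has to exist in the same namespace or the pod will stay in Pending. The original setup does not show one, so the following is a minimal sketch: the file name, the 1Gi size, and the ReadWriteOnce access mode are assumptions, and it relies on the cluster having a default StorageClass (Minikube normally ships one).

```shell
# Write a minimal PersistentVolumeClaim manifest to stories-pvc.yaml.
# Size and access mode are assumed values, not taken from the original project.
cat > stories-pvc.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stories-pvc          # must match claimName in deployment.yaml
spec:
  accessModes:
    - ReadWriteOnce          # mounted read-write by a single node
  resources:
    requests:
      storage: 1Gi           # assumed size; adjust as needed
EOF
```

Apply it with `kubectl apply -f=stories-pvc.yaml` before (or together with) the Deployment, then check `kubectl get pvc` to see whether the claim reaches the Bound state.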
Creating a Service to Expose the App
Now, define a Kubernetes service to allow external access to the application:
apiVersion: v1
kind: Service
metadata:
  name: story-service
spec:
  selector:
    app: story
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer
Deploy and Test the Application in Kubernetes
Start Minikube if it's not running:
minikube status
minikube start --driver=docker
Then, apply the Kubernetes configuration:
kubectl apply -f=service.yaml -f=deployment.yaml
kubectl get pods
kubectl get deployments
minikube service story-service
Test the service:
curl --location 'http://<Host Address>/story'
curl --location 'http://<Host Address>/story' \
--header 'Content-Type: application/json' \
--data '{
"text": "mayank"
}'
At this point, the service is running—but our data can still disappear when the pod itself is removed. To solve this, let’s improve our volume strategy.
Ensuring Data Persistence with Volumes
Kubernetes supports various volume types, but a simple way to persist data within a pod is using emptyDir:
volumes:
  - name: stories-volume
    emptyDir: {}
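An emptyDir volume is created when the pod is assigned to a node and survives container crashes and restarts inside that pod, so the data remains available as long as the pod is alive. However, if the pod is deleted or rescheduled, the directory and all data in it are removed.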
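One way to check how a given volume type behaves is to delete the pod and read the stored text again. The helper below is a sketch under assumptions: the function name is hypothetical, it expects the service's base URL as its first argument, and it relies on the app=story label and the story-deployment name from the manifests in this post.

```shell
# Hypothetical helper: delete the pod, wait for its replacement, read /story again.
check_story_after_pod_delete() {
  base_url="$1"   # e.g. the URL printed by: minikube service story-service --url

  # Delete the running pod; the Deployment creates a replacement.
  kubectl delete pod -l app=story

  # Wait until the Deployment reports its pod as available again.
  kubectl rollout status deployment/story-deployment

  # Read the stored text from the new pod.
  curl --location "$base_url/story"
}
```

With an emptyDir volume, the text submitted earlier should be missing from the response, because the new pod starts with a fresh, empty directory.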
Handling Multiple Pods & Node Failures
If you scale your application to multiple pods, data consistency becomes tricky. Each pod gets its own emptyDir, so if one pod crashes and traffic is redirected to another pod, that pod won't have the data written to the first one!
To store data across multiple pods running on the same node, we can use hostPath:
volumes:
  - name: stories-volume
    hostPath:
      path: /data
      type: DirectoryOrCreate
However, this only works for pods on the same node—if a pod is scheduled on a different node, it won’t have access to the previous data.
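For a more robust solution across multiple nodes, consider Persistent Volumes (PVs): cluster-level storage resources that exist independently of any pod or node and are typically backed by network or cloud storage. Pods request them through a PersistentVolumeClaim (PVC). Alternatively, the application can keep its data in an external database.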
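As a rough illustration of node-independent storage, the sketch below writes a statically provisioned PV backed by NFS together with a claim that binds to it. Everything specific here is hypothetical: the NFS server nfs.example.internal, the export path /exports/stories, the object names stories-pv and stories-shared-pvc, the file name, and the 1Gi size. Cloud providers usually offer dynamically provisioned storage classes instead, which this example does not cover.

```shell
# Write a PersistentVolume and a matching PersistentVolumeClaim to one file.
# Server, path, names, and sizes are placeholders for illustration only.
cat > stories-shared-storage.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: stories-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteMany                  # NFS can be mounted by pods on several nodes
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal     # hypothetical NFS server
    path: /exports/stories           # hypothetical export path
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: stories-shared-pvc
spec:
  storageClassName: ""               # empty string: skip dynamic provisioning
  volumeName: stories-pv             # bind explicitly to the PV above
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
EOF
```

To use it, apply the file with `kubectl apply -f=stories-shared-storage.yaml` and point the Deployment's claimName at stories-shared-pvc. Because the data lives on the NFS server rather than on a node, pods scheduled on different nodes would see the same files.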
Final Thoughts: Why Kubernetes Volumes Matter
Kubernetes volumes solve critical data loss issues in containerized applications. By implementing persistent storage solutions, developers ensure that user data survives container crashes, pod restarts, and even scaling across multiple nodes.
To explore all available volume storage options in Kubernetes, visit the official documentation:
https://kubernetes.io/docs/concepts/storage/volumes/
Understanding how stateful applications work within Kubernetes is essential for building scalable, resilient infrastructure. Whether deploying a simple text storage app or a large-scale distributed system, managing volumes effectively ensures reliable data persistence.