DEV Community

Cover image for Kubernetes Volume Deep Dive
Panchanan Panigrahi
Panchanan Panigrahi

Posted on

Kubernetes Volume Deep Dive

Introduction:

Welcome to "Kubernetes Persistence Volume Deep Dive." In this concise blog, we explore the core elements of data persistence in Kubernetes — Persistent Volumes (PVs), Persistent Volume Claims (PVCs), and Storage Classes. Unravel the intricacies, master essential configurations, and discover real-world applications to enhance your proficiency in managing storage within containerized environments. Join us on a journey into the heart of Kubernetes persistence, unlocking the full potential of your data.

Persistent Storage


Importance of Persistent Storage in Containerized Environments

1. Data Durability:

  • Ensures critical data persists beyond container lifecycles, providing availability and durability.

2. Stateful Applications:

  • Enables stateful applications (e.g., databases) by maintaining data continuity across container instances.

3. Scaling and Load Balancing:

  • Supports horizontal scaling, ensuring consistent data access for load balancing in dynamic environments.

4. Zero Downtime Updates:

  • Facilitates rolling updates without service interruptions, maintaining continuous data access during deployments.

5. Compliance and Governance:

  • Meets regulatory standards by securely storing critical data, crucial for sectors like finance and healthcare.

For your reference you can checkout GitHub repo specifically designed for this blog.

GitHub Link: https://github.com/panchanandevops/kubernetes-volumes/tree/main/Volumes


Persistent Volumes (PVs) in Kubernetes

Persistent Volumes (PVs) in Kubernetes are a fundamental resource that allows decoupling of storage configuration from pod specification. Essentially, a Persistent Volume is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using a StorageClass.


Lets understand Persistent Volumes (PVs) through our example yaml manifestos.



apiVersion: v1
kind: PersistentVolume
metadata:
  name: mongo-pv
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  local:
    path: /storage/data
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - minikube


Enter fullscreen mode Exit fullscreen mode

Persistent Volume (PV) Configuration:

Details:

  • Name: mongo-pv

    • Unique identifier for this Persistent Volume in the Kubernetes cluster.
  • Capacity: 5Gi

    • Allocates 5 gigabytes of storage capacity, ensuring ample space for data storage.
  • Access Modes: ReadWriteMany

    • Enables multiple pods to concurrently read from and write to the volume, fostering collaborative data access across diverse nodes within the cluster.
  • Local Storage Path: /storage/data

    • Designates the local path on the node where the storage physically resides, set specifically to "/storage/data."
  • Node Affinity Requirement:

    • Ensures the PV has a node affinity requirement, indicating its association with particular nodes based on defined criteria.
    • Node Selector Criteria:
    • Based on the node's hostname (kubernetes.io/hostname).
    • Operator: In
      • Permits the node's hostname to match any of the specified values.
    • Allowed Values: minikube
      • Restricts the selection to nodes with the hostname "minikube."

This configuration for the Persistent Volume mongo-pv incorporates these critical specifications, guaranteeing effective storage capacity, access modes, local path definition, and targeted node affinity within the Kubernetes environment.

IMPORTANT NOTE: When utilizing a StorageClass, there's no need for manual creation of Persistent Volumes (PVs). The StorageClass seamlessly handles dynamic PV creation on-demand.


Key Characteristics of Persistent Volumes (PVs)

Administrator-Provisioned Resources:

Kubernetes administrators hold responsibility for creating and configuring PVs. This includes defining the storage type (e.g., NFS, iSCSI), capacity, access modes (read-only, read/write once, etc.), and potentially associating the PV with a specific StorageClass for tailored performance or service levels. This ensures PVs align with specific application requirements.

Storage Infrastructure Abstraction:

PVs act as an abstraction layer, decoupling the underlying storage infrastructure (e.g., cloud storage, SAN) from the applications themselves. This streamlines storage management for administrators and application development by eliminating the need for application-specific storage configuration. Developers can focus on application logic without needing in-depth knowledge of the underlying storage technology.

Guaranteed Data Persistence:

A critical characteristic of PVs is their ability to guarantee data persistence beyond the lifecycle of individual Pods. This ensures that critical application data, such as databases or configuration files, survives Pod restarts, terminations, or rescheduling across Nodes within the cluster. This persistence is essential for stateful applications that rely on maintaining operational state across Pod lifecycles.

Dynamic Provisioning with PVCs:

PVs work in conjunction with Persistent Volume Claims (PVCs) to enable dynamic storage provisioning. Applications specify their storage needs via PVCs, outlining required capacity and access modes. The Kubernetes system then dynamically matches PVCs with suitable PVs based on defined criteria. This approach fosters efficient resource utilization and simplifies application management by automating storage allocation.


Persistent Volume Claims (PVCs) in Kubernetes

Within a Kubernetes cluster, Persistent Volume Claims (PVCs) serve as a core mechanism for requesting persistent storage for Pods. They function as an abstraction layer, separating the storage specifications required by a Pod from the actual Persistent Volume (PV) that fulfills those needs. They offer a declarative approach for applications to specify their storage requirements, facilitating resource management and decoupling storage concerns from application logic.


Lets understand PVC with MongoDB deployment and mongo-pvc through the following yaml manifestos.

PVC.yaml



apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: "demo-storage"


Enter fullscreen mode Exit fullscreen mode

Persistent Volume Claim (PVC) Specifications:

  • Access Modes:

    • accessModes:
    • ReadWriteMany: Specifies that multiple pods can concurrently read from and write to the volume, enabling collaborative access across different nodes within the cluster.
  • Resource Requests:

    • resources:
    • requests:
      • storage: 5Gi: Requests 5 gigabytes of storage from the underlying storage provider to fulfill the needs of the PVC. This ensures that the PVC is allocated with the specified amount of storage capacity.
  • Storage Class:

    • storageClassName: "demo-storage"
    • Associates the PVC with a specific Storage Class named "demo-storage." The Storage Class defines the type of storage, provisioning characteristics, and additional attributes such as performance and availability parameters.

Deployment.yaml



apiVersion: apps/v1
kind: Deployment
metadata:
  name: mongo
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - image: mongo
          name: mongo
          args: ["--dbpath", "/data/db"]
          env:
            - name: MONGO_INITDB_ROOT_USERNAME
              value: "admin"
            - name: MONGO_INITDB_ROOT_PASSWORD
              value: "password"
          volumeMounts:
            - mountPath: /data/db
              name: mongo-volume
      volumes:
        - name: mongo-volume
          persistentVolumeClaim:
            claimName: mongo-pvc


Enter fullscreen mode Exit fullscreen mode

Volume Configuration in Deployment:

Within our Deployment YAML, the volume configuration is pivotal for ensuring MongoDB's seamless integration with persistent storage. Let's explore this critical section:

Volume Mount Configuration:

  • Volume Mounts:
    • volumeMounts:: Describes the configuration for mounting volumes into the container.
    • - mountPath: /data/db: Specifies the path within the container where the volume will be mounted.
    • - name: mongo-volume: Associates the volume mount with the previously defined volume named "mongo-volume."

Volume Configuration:

  • Volumes:
    • volumes:: Describes the volumes to be mounted into the pod.
    • - name: mongo-volume: Assigns the name "mongo-volume" to the volume.
      • persistentVolumeClaim:: Signals the use of a Persistent Volume Claim (PVC) for dynamic provisioning.
      • claimName: mongo-pvc: Associates the volume with the PVC named "mongo-pvc."

By naming the volume "mongo-volume" and associating it with the PVC "mongo-pvc" we ensure seamless access to persistent storage for our MongoDB container, fostering resilience and efficiency within our Kubernetes environment.


Key characteristics of PVCs:

1. User-Driven Requests:

Unlike Persistent Volumes (PVs) provisioned by administrators, PVCs are created by Pod users or applications. This empowers developers to define their storage needs without delving into the underlying storage infrastructure.

2. Storage Specification:

A PVC encapsulates the storage requirements of a Pod, including:

  • Capacity: The desired storage size for the Pod.
  • Access Modes: Specifying how the Pod can interact with the storage (e.g., read-only, read/write once, read/write many).
  • Storage Class (Optional): Referencing a pre-defined StorageClass that embodies specific storage performance or service level characteristics.

3. Dynamic Provisioning:

Kubernetes leverages PVCs to facilitate dynamic storage provisioning. When a Pod with a PVC is created, the Kubernetes system searches for a suitable PV that fulfills the PVC's specifications. This matching process considers factors like storage class, capacity, and access modes.

4. Binding to PVs:

Upon finding a matching PV, Kubernetes establishes a binding between the PVC and the PV. This essentially creates a connection between the Pod and the persistent storage resource, allowing the Pod to mount the volume and utilize it for persistent data storage.

IMPORTANT NOTE: PVC is in active use by a Pod when a Pod object exists that is using the PVC. If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any Pods.


Storage Classes in Kubernetes:

Within Kubernetes, Storage Classes (StorageClasses) serve as a cornerstone mechanism for administering persistent storage. They offer a sophisticated approach to storage management, empowering administrators to define and expose diverse storage options tailored to application requirements.

Let understand Storage Classes through our example yaml manifestos.

SC.yaml



apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: demo-storage
provisioner: k8s.io/minikube-hostpath
volumeBindingMode: Immediate
reclaimPolicy: Delete


Enter fullscreen mode Exit fullscreen mode

StorageClass Configuration:

Specifications:

  • Provisioner: k8s.io/minikube-hostpath

    • Specifies the provisioner responsible for dynamically provisioning storage. Here, we utilize Minikube's built-in hostPath provisioner for streamlined storage management.
  • Volume Binding Mode: Immediate

    • Sets the volume binding mode to "Immediate," ensuring rapid binding of Persistent Volumes (PVs) to Persistent Volume Claims (PVCs) immediately after PVC creation. This enhances efficiency in storage allocation.
  • Reclaim Policy: Delete

    • Defines the reclaim policy for associated Persistent Volumes (PVs). With a "Delete" policy, when a PVC is removed, the corresponding PV and its data are automatically deleted, promoting a clean and resource-efficient storage lifecycle.

This StorageClass, named demo-storage, encapsulates these configurations to optimize storage provisioning, binding, and cleanup within the Kubernetes cluster.


Key characteristics of Storage Classes:

1. Storage Type Abstraction:

Storage Classes act as an abstraction layer, decoupling the specific storage technology (e.g., cloud storage, SAN) from the applications themselves. This simplifies storage management by enabling administrators to define storage options independent of the underlying infrastructure. Developers can focus on application logic without needing in-depth storage knowledge.

2. Customizable Storage Profiles:

Administrators possess fine-grained control over storage class configuration. This encompasses defining parameters such as:

  • Provisioning Type: Selection of the provisioner plugin (e.g., kubernetes.io/nfs, kubernetes.io/aws-ebs) that determines the underlying storage technology used.
  • Performance Profiles: Specifying storage performance characteristics such as HDD or SSD to optimize storage allocation based on application needs (e.g., high IOPS for databases vs. cost-effective cold storage).
  • Durability Options: Configuring parameters like replication factor for data redundancy, catering to application requirements for data availability and resilience.

3. Dynamic Provisioning with PVCs:

Storage Classes collaborate seamlessly with Persistent Volume Claims (PVCs) to facilitate dynamic storage provisioning. When a Pod submits a PVC requesting storage, Kubernetes searches for a suitable Persistent Volume (PV) that aligns with the PVC's specifications and the characteristics defined within the requested Storage Class. This automated process streamlines storage allocation and application deployment.

4. Optional Default Storage Class:

Administrators can establish a default Storage Class. In scenarios where a PVC doesn't explicitly request a specific class, the default class will be employed for provisioning the corresponding PV. This simplifies storage configuration for common use cases and expedites application deployments.


Detail Explanation of Access Modes:

Access Modes define the level of access that a Pod has to a Persistent Volume (PV) through a Persistent Volume Claim (PVC). In Kubernetes, there are four primary access modes:

  1. ReadWriteOnce (RWO):

    • Description: Allows the volume to be mounted as read-write by a single node at a time.
    • Use Case: Suitable for scenarios where a Pod needs exclusive read-write access to the volume. For example, in a database setup where only one instance should write data to the storage at any given time.
  2. ReadOnlyMany (ROX):

    • Description: Allows the volume to be mounted as read-only by multiple nodes simultaneously.
    • Use Case: Ideal for scenarios where multiple Pods across different nodes require read-only access to shared data. For example, when serving static assets that multiple web server Pods need to read.
  3. ReadWriteMany (RWX):

    • Description: Allows the volume to be mounted as read-write by multiple nodes simultaneously.
    • Use Case: Useful for scenarios where multiple Pods, possibly on different nodes, need both read and write access to shared data. For example, in a file-sharing scenario where multiple Pods in a distributed system need read-write access to a common storage pool.
  4. ReadWriteOncePod (RWOP):

    • Description: ReadWriteOncePod is a Kubernetes Persistent Volume (PV) access mode introduced in v1.22 that grants exclusive read-write access to a single Pod on a single Node.
    • Use Case: Use ReadWriteOncePod access mode if you want to ensure that only one pod across the whole cluster can read that PVC or write to it.

Detail Explanation of Reclaim Policy:

The Reclaim Policy defines the behavior of a Persistent Volume (PV) after its associated Persistent Volume Claim (PVC) is deleted or released. It governs what action should be taken with the storage resources associated with the PV. There are three standard reclaim policies:

  1. Retain:

    • Description: With the "Retain" policy, the PV retains the storage resources even after the associated PVC is deleted or released. The PV's data remains intact and accessible.
    • Use Case: This policy is suitable for scenarios where you want to preserve the data on the storage volume for potential future use, even if the associated PVC is no longer in use.
  2. Delete:

    • Description: The "Delete" policy instructs Kubernetes to delete both the PV and its associated storage resources when the associated PVC is deleted or released.
    • Use Case: Typically used in environments where the data stored in the PV is disposable or transient, and there's no need to retain it after the PVC is no longer in use.
  3. Recycle (Deprecated):

    • Description: The "Recycle" policy, though deprecated, was used to delete the contents of the PV when the associated PVC was deleted. It would attempt to clean up the contents of the volume by deleting all files within it.
    • Use Case: Previously used in environments where simple cleanup of the PV's contents was sufficient, but it's no longer recommended due to security and data privacy concerns.

volumeBindingMode in Kubernetes StorageClass:

The volumeBindingMode is a critical parameter within a Kubernetes StorageClass configuration. It determines how Persistent Volumes (PVs) are bound to Persistent Volume Claims (PVCs) when dynamic provisioning is involved. This setting influences the timing and mechanism by which the system allocates storage resources based on PVC requirements.

1. Immediate:

  • Description: With the Immediate binding mode, the system dynamically binds a PV to a PVC as soon as the PVC is created. This means that the storage resource is provisioned immediately upon the creation of the claim.
  • Use Case: Suitable for scenarios where rapid and on-demand provisioning of storage resources is preferred, allowing for quick allocation and utilization.

2. WaitForFirstConsumer:

  • Description: In the WaitForFirstConsumer binding mode, the PV binding is delayed until the first Pod using the PVC is scheduled. The PV is only provisioned when a Pod referencing the PVC is scheduled onto a node.
  • Use Case: This mode is useful when you want to avoid unnecessary resource allocation until an actual workload requiring storage is scheduled.

Conclusion:

In wrapping up our exploration of Kubernetes Persistence Volume essentials, we've delved into key elements—Persistent Volumes, Volume Claims, and Storage Classes. These components facilitate effective data storage in containerized environments, ensuring durability, scalability, and resource efficiency.

Thanks to our readers for joining this informative journey, and enhancing understanding of Kubernetes storage complexities. Happy orchestrating!

Top comments (0)