Kubernetes Persistent Volumes for Dataset Storage
Creating persistent storage in Kubernetes involves defining Persistent Volume (PV) and Persistent Volume Claim (PVC) resources. A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. A PVC is a request for storage by a user. It is analogous to a pod: pods consume node resources, and PVCs consume PV resources.
Here's a breakdown of the process:
- Define a Persistent Volume (PV): This declares a volume that is available for use. It can be hosted on a variety of supported storage backends and configured with specific properties such as capacity and access modes.
- Create a Persistent Volume Claim (PVC): This requests storage on behalf of a user. A PVC can ask for a specific size and specific access modes (for example, read-only or read-write).
- Mount the PVC in a Pod: Once the PVC is bound to a PV, it can be mounted into a pod for use by an application.
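To make the three steps concrete, here is a rough sketch of the corresponding Kubernetes objects written as plain Python dictionaries that mirror the manifest fields. The helper name `build_manifests`, the object names, the mount path, and the server address are all hypothetical placeholders chosen for illustration.

```python
import json


def build_manifests(server: str, path: str, size: str) -> list:
    """Return PV, PVC, and Pod manifests as plain dictionaries (illustrative only)."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": "example-pv"},
        "spec": {
            "capacity": {"storage": size},
            "accessModes": ["ReadWriteMany"],
            "nfs": {"server": server, "path": path},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": "example-pvc"},
        "spec": {
            "accessModes": ["ReadWriteMany"],
            "resources": {"requests": {"storage": size}},
        },
    }
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": "example-pod"},
        "spec": {
            # The pod refers to the claim by name, never to the PV directly.
            "volumes": [
                {
                    "name": "data",
                    "persistentVolumeClaim": {"claimName": pvc["metadata"]["name"]},
                }
            ],
            "containers": [
                {
                    "name": "app",
                    "image": "alpine:latest",
                    "command": ["sleep", "3600"],
                    "volumeMounts": [{"name": "data", "mountPath": "/mnt/data"}],
                }
            ],
        },
    }
    return [pv, pvc, pod]


# Hypothetical server address and export path, used only to print the structure.
manifests = build_manifests("nfs.example.internal", "/exports/data", "1Gi")
print(json.dumps(manifests, indent=2))
```

The pod only knows the claim's name; which volume actually backs it is decided when the claim is bound.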
For example, let's create a Kubernetes Persistent Volume backed by NFS (Network File System) and a Persistent Volume Claim that claims that storage. We'll also define a pod to show how the PVC is used. This example assumes an existing NFS server at the IP address `10.10.10.20` that exports the path `/dataset`.
Let's write the Pulumi program in Python to create the above resources:
```python
import pulumi
import pulumi_kubernetes as k8s

# Create a Kubernetes Persistent Volume backed by the NFS export
nfs_persistent_volume = k8s.core.v1.PersistentVolume(
    "nfsPersistentVolume",
    metadata=k8s.meta.v1.ObjectMetaArgs(
        name="nfs-pv",
    ),
    spec=k8s.core.v1.PersistentVolumeSpecArgs(
        capacity={"storage": "5Gi"},
        access_modes=["ReadWriteMany"],
        nfs=k8s.core.v1.NFSVolumeSourceArgs(
            server="10.10.10.20",
            path="/dataset",
        ),
    ),
)

# Create a Persistent Volume Claim
nfs_persistent_volume_claim = k8s.core.v1.PersistentVolumeClaim(
    "nfsPersistentVolumeClaim",
    metadata=k8s.meta.v1.ObjectMetaArgs(
        name="nfs-pvc",
    ),
    spec=k8s.core.v1.PersistentVolumeClaimSpecArgs(
        access_modes=["ReadWriteMany"],
        # An empty class keeps a cluster-default StorageClass from being applied,
        # so the claim can bind to the statically provisioned PV above.
        storage_class_name="",
        resources=k8s.core.v1.ResourceRequirementsArgs(
            requests={"storage": "5Gi"},
        ),
    ),
)

# Create a Pod that uses the PVC
nfs_pod = k8s.core.v1.Pod(
    "nfsPod",
    metadata=k8s.meta.v1.ObjectMetaArgs(
        name="nfs-pod",
    ),
    spec=k8s.core.v1.PodSpecArgs(
        volumes=[
            k8s.core.v1.VolumeArgs(
                name="nfs-storage",
                persistent_volume_claim=k8s.core.v1.PersistentVolumeClaimVolumeSourceArgs(
                    claim_name=nfs_persistent_volume_claim.metadata.name,
                ),
            )
        ],
        containers=[
            k8s.core.v1.ContainerArgs(
                name="alpine",
                image="alpine:latest",
                command=["sh", "-c"],
                args=["while true; do echo hello; sleep 10; done"],
                # Mount the claimed storage inside the container
                volume_mounts=[
                    k8s.core.v1.VolumeMountArgs(
                        name="nfs-storage",
                        mount_path="/mnt/dataset",
                    )
                ],
            )
        ],
    ),
)

# Export the name of the PVC
pulumi.export("persistent_volume_claim_name", nfs_persistent_volume_claim.metadata.name)
```
In this example:
- We define a `PersistentVolume` with the NFS details, where `capacity` is `5Gi` and `access_modes` is set to `ReadWriteMany`, which allows the volume to be mounted read-write by many nodes at the same time.
- We create a `PersistentVolumeClaim`, which looks for a `PersistentVolume` that satisfies its requirements, such as `access_modes` and the requested `storage` size.
- We declare a simple pod with a single container that mounts the storage from our Persistent Volume Claim at the path `/mnt/dataset` within the container.
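The matching of a claim to a volume can be sketched in plain Python. This is a simplified model and not the actual Kubernetes controller logic: it only checks storage class, access modes, and capacity, and it ignores volume mode, label selectors, and the fact that a claim omitting its class may be assigned a cluster-default StorageClass. The names `Volume`, `Claim`, `parse_quantity`, and `can_bind` are hypothetical.

```python
from dataclasses import dataclass

# Binary suffixes used by Kubernetes quantities such as "5Gi".
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_quantity(quantity: str) -> int:
    """Convert a quantity like "5Gi" to bytes (binary suffixes and plain integers only)."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


@dataclass
class Volume:
    name: str
    capacity: str
    access_modes: frozenset
    storage_class: str = ""


@dataclass
class Claim:
    name: str
    request: str
    access_modes: frozenset
    storage_class: str = ""


def can_bind(volume: Volume, claim: Claim) -> bool:
    """Simplified check: same class, requested modes covered, enough capacity."""
    return (
        volume.storage_class == claim.storage_class
        and claim.access_modes <= volume.access_modes
        and parse_quantity(volume.capacity) >= parse_quantity(claim.request)
    )


pv = Volume("nfs-pv", "5Gi", frozenset({"ReadWriteMany"}))
pvc = Claim("nfs-pvc", "5Gi", frozenset({"ReadWriteMany"}))
print(can_bind(pv, pvc))  # True
```

Under this model, a claim requesting `10Gi` or a different storage class would not match `nfs-pv`, and in a real cluster such a claim would stay unbound until a suitable volume exists.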
When the `nfs-pod` pod is running, any data written to `/mnt/dataset` is stored on the NFS-backed Persistent Volume.
Make sure your Pulumi and Kubernetes configurations are set up (i.e., your context points to your Kubernetes cluster) before running this Pulumi program. You can then run it by executing `pulumi up` on the command line, in the directory where this file is saved. This provisions the resources in your Kubernetes cluster.