Kubernetes is a popular and successful orchestration framework. By using Kubernetes one can deploy and manage large numbers of containers within an enterprise. These containers of course contain applications that execute whatever is needed. Kubernetes does a great job of managing those containers, but does not provide for persistence of data within those containers.
When a container runs in Kubernetes, it can ask the Kubernetes framework for volumes or persistent volumes. In the simplest case those volumes are stored on the local host. The problem is that if the container is moved to another host, the stored data does not move with it. Additionally, two containers running on different hosts cannot see data that is stored locally on one host. To enable shared and reliable persistent storage, Kubernetes allows for customizable persistent storage. Kubernetes applications need a persistent store that scales well, is reliable and fast, and can be accessed remotely by any container. VAST is one such store.
In this article we'll explain how to use VAST as a persistent store for applications running in Kubernetes. We show two techniques. In the first part we use VAST simply as an NFS file server, which Kubernetes natively supports. In the second part we show how to leverage the VAST CSI plugin for deeper Kubernetes integration.
In Kubernetes, volumes allow a container (via its YAML specification) to state what persistent storage it wishes to use. Kubernetes supports a variety of volume types, including an NFS volume. Conveniently, VAST supports NFS, so one can easily reference an existing VAST cluster using standard Kubernetes constructs.
Here's an example of a trivial pod that uses an existing VAST cluster as persistent storage. In advance we did the following:
- Created a DNS round robin entry (not in the Kubernetes DNS server as the NFS Kubernetes clients cannot see the Kubernetes DNS, only the host DNS).
- Created a VAST export for use with Kubernetes - named /k8s.
The application prints a date to a file within VAST, demonstrating proper connectivity. This is the complete YAML file for that application.
# Create a pod that reads and writes to the
# VAST NFS server via an NFS volume.
# Add the server as an NFS volume for the pod
- name: nfs-volume
# URL for the NFS server. The DNS name must be defined at the OS level, not within K8S DNS.
# In this container, we'll mount the NFS volume
# and write the date to a file inside it.
- name: app
# Mount the NFS volume in the container
- name: nfs-volume
# Write to a file inside our NFS
args: ["-c", "while true; do date >> /var/nfs/nfsdirect-dates.txt; sleep 5; done"]
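Put together, a complete manifest along these lines might look like the following sketch. The pod name, the alpine image, and the vast.example.com hostname are assumptions made for illustration; substitute the DNS round robin name created for your VAST cluster.

```yaml
# Sketch of a complete pod-using-nfs.yaml.
# Assumptions: metadata.name, the image, and the server hostname.
apiVersion: v1
kind: Pod
metadata:
  name: pod-using-nfs
spec:
  # Add the server as an NFS volume for the pod
  volumes:
    - name: nfs-volume
      nfs:
        # Placeholder hostname; it must resolve via the host (OS-level) DNS.
        server: vast.example.com
        # The VAST export created for Kubernetes
        path: /k8s
  containers:
    - name: app
      image: alpine
      # Mount the NFS volume in the container
      volumeMounts:
        - name: nfs-volume
          mountPath: /var/nfs
      # Append the date to a file inside the NFS volume every 5 seconds
      command: ["/bin/sh"]
      args: ["-c", "while true; do date >> /var/nfs/nfsdirect-dates.txt; sleep 5; done"]
```

The mountPath must match the directory used in the args line, since that is where the shell loop writes.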
To deploy this using kubectl, save the YAML above as pod-using-nfs.yaml and then run:
kubectl apply -f pod-using-nfs.yaml
That's more than sufficient for many applications, but in some Kubernetes environments the administrators prefer to better abstract storage from the applications by using Persistent Volumes and Persistent Volume Claims. As you likely noticed, the YAML file above for the pod included very specific environmental details - the DNS name and mount point for VAST. It is advantageous for applications to simply state "I need storage" and let Kubernetes provide it as needed. That is done via the higher-level abstraction of Persistent Volumes and Persistent Volume Claims. Essentially, an administrator creates a Persistent Volume in advance (the information looks a lot like the information above) and then applications express their need by using a Persistent Volume Claim. We aren't going to cover that in detail as it's a rather trivial extension of the previous example.
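For readers who want a starting point anyway, a statically created Persistent Volume and a matching claim for the /k8s export might look like the sketch below. The object names, the 10Gi size, and the vast.example.com hostname are hypothetical values chosen for illustration.

```yaml
# Sketch only: names, size, and hostname are assumptions.
# The administrator creates the Persistent Volume in advance.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: vast-nfs-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  # Empty class: this volume is not tied to any dynamic provisioner.
  storageClassName: ""
  nfs:
    server: vast.example.com
    path: /k8s
---
# The application side only states its need via a claim.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vast-nfs-claim
spec:
  accessModes:
    - ReadWriteMany
  # Empty class disables dynamic provisioning for this claim.
  storageClassName: ""
  # Bind explicitly to the volume defined above.
  volumeName: vast-nfs-pv
  resources:
    requests:
      storage: 10Gi
```

A pod would then reference the claim with a persistentVolumeClaim volume (claimName: vast-nfs-claim) instead of the nfs block, which keeps the DNS name and export path out of the application's YAML.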
A much more interesting enhancement is to move from statically creating Persistent Volumes to Dynamic Provisioning. This frees the administrator from creating all of those Persistent Volumes.
With dynamic provisioning, applications still specify Persistent Volume Claims, but instead of the Kubernetes administrator creating the volumes, the administrator deploys in advance the elements needed for Dynamic Provisioning, specifically these key components:
- StorageClass - a YAML class definition that specifies the dynamic provisioner as well as some volume related options
- dynamic provisioner - in this case, a Container Storage Interface (CSI) aware provisioner
The linkage between the StorageClass and the provisioner is what enables dynamic provisioning. When a Persistent Volume Claim is made, the Kubernetes framework uses the provisioner to dynamically create a Persistent Volume. Later, when a container (via its YAML) references that Persistent Volume Claim, Kubernetes provides the volume to the container. This is a bit of a simplification: Persistent Volumes have lifecycle rules, but those behaviors are controlled by the Kubernetes framework and are outside the scope of this article.
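As a rough illustration of that linkage, the sketch below pairs a StorageClass with a claim that references it by name. The class name, claim name, and provisioner string are placeholders, not the values VAST's CSI actually uses; the real ones come from the deployment file generated later in this article.

```yaml
# Sketch only: all names below are hypothetical placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-storage-class
# The provisioner field names the dynamic provisioner (for CSI, the driver name).
provisioner: csi.example.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim
spec:
  # Referencing the class is what triggers dynamic provisioning.
  storageClassName: example-storage-class
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
```

With reclaimPolicy set to Delete, removing the claim also removes the dynamically created volume; Retain keeps it for manual cleanup.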
Kubernetes has a few built-in provisioners but also provides for CSI-based provisioning, which is the focus of the remainder of this article.
Dynamically Provisioning Volumes Using VAST's CSI
With the introduction of CSI, containers can access storage exposed by VAST systems. VAST's CSI allows container orchestrators (COs) such as Kubernetes to provision storage volumes from a VAST system. This section explains how to install the plugin on a Kubernetes cluster.
The plugin is currently distributed as a Docker image. It contains both the plugin code itself and the code for generating the Kubernetes deployment YAML.
Once the Docker image is downloaded, it can be exported as a tar file and loaded into a private Docker registry for clusters that are not connected to the internet.
The image is publicly available here: https://hub.docker.com/r/vastdataorg/csi
The installation assumes the following are in place:
- A Kubernetes cluster is up and running.
- A VAST cluster is up and running.
- All nodes of the Kubernetes cluster are networked with the VAST cluster and can mount it via NFS.
- At least one node can communicate with VAST's management VIP.
- A host with Docker installed, preferably on the same network as the VMS, is available for generating and validating the plugin deployment YAML.
- A virtual IP pool has been created for use by the Kubernetes nodes. Please note that at present virtual IP pools are named vippool-N according to their creation order. This will be addressed in a later release.
- An export has been created for the Kubernetes volumes. Within this single export, many volumes (Persistent Volume Claims) will be created by Kubernetes for the various applications that need them.
- A user has been created on the VAST cluster for the CSI driver to connect with. This user requires a minimum of the Logical and Security roles. In the future, only Logical will be required.
Create an Export for Kubernetes
Kubernetes can use any VAST export. However, in order to ensure that Kubernetes volumes are in one dedicated place not at the root of the VAST file system, we recommend you create a dedicated export and directory for that purpose. In this example we will create a directory /k8s and export it as /k8s. Here's an example from the VAST VMS web interface:
Find an Existing VIP Pool Name
If you have more than one VIP pool, you may need to find the corresponding vippool-x name to configure the CSI driver. The x is the ID listed in the output below. If you only have one VIP pool, it will always be vippool-1.
# SSH to C-node1 in the VAST cluster, then run the following command to find the VMS
$ clush -g cnodes -b 'docker ps -q --filter label=role=vms'
# SSH to the IP address listed in the output above, in this case 172.16.3.2
$ ssh 172.16.3.2
# Start the vcli using your username and password
$ /vast/data/vms.sh vcli -u admin -p 123456
Using host 10.100.201.201
Welcome to VAST Management CLI!!!
Type help or ? to see a list of commands
vcli: admin> vippool list
| ID | Start-IP | End-IP | Subnet-CIDR | GW-IP | CNodes | Cluster |
| 1 | 10.101.201.41 | 10.101.201.56 | 16 | |  | se-demo |
| 2 | 10.101.201.57 | 10.101.201.58 | 16 | |  | se-demo |
This VIP pool information is also available in the management interface. The IDs are not listed there, but the pools are shown in ascending ID order starting at 1.
Download the CSI
Download VAST's CSI image from Docker Hub. Please check that the version is the latest.
docker pull vastdataorg/csi:latest
Alternatively, pull a specific version:
docker pull vastdataorg/csi:$VERSION
If your cluster has no internet access, you can save the image to a tar file, load it on a host that can reach your private registry, and then tag and push it there:
docker save vastdataorg/csi:$VERSION -o csi-vast.tar
docker load -i csi-vast.tar
docker tag vastdataorg/csi:$VERSION example-registry/vast-csi:$VERSION
docker push example-registry/vast-csi:$VERSION
Determine the VAST Management VIP
The VAST management VIP is the same IP address or DNS name you use to connect to VMS in your web browser. The IP automatically points to whatever node VMS is currently using and thus is always available. The CSI provisioner will need that.
Create a VMS User with Needed Authority
For the VAST CSI you can use the default 'admin' user that VAST creates during initial installation, or you can create a new user with more limited authority.
To create a limited user, first create a CSI role that has 'Logical' and 'Security' selected with Create, View and Edit privileges, as below.
Then create a new manager called CSI that uses the above CSI role. You will need to enter a username and password, select CSI from the roles, and remove 'ReadOnly', as below.
Generating the YAML
Run the following command to interactively generate a deployment YAML:
docker run -it --net=host -v `pwd`:/out $IMAGE_NAME template --image $IMAGE_NAME
# As an example if you have downloaded the CSI from the internet
docker run -it --net=host -v `pwd`:/out vastdataorg/csi:$VERSION template
The command takes these two parameters of interest:
- The `pwd`:/out volume mapping means the template will be generated in the current directory.
- Name of this Docker image (--image): This name will be used by k8s to distribute the image within the cluster. It would normally be vastdataorg/csi:v0.1.3 (or whatever the current version is), unless the cluster is at a location without internet access. In that case, follow the steps above for downloading into your own registry and specify the image name as it appears in the private registry (example-registry/vast-csi:v0.1.3).
Once the command runs (see below) you will be asked the following questions:
- Image Pull Policy (Never|Always): Whether Kubernetes should pull the image from the remote registry, or use the existing image in its local registry.
- VAST Management hostname: The DNS name or IP for VAST's management endpoint.
- VAST Management username/password: Credentials for communicating with VAST's management. These will be recorded as Kubernetes secrets. Note - the tool will attempt to connect to the VMS. A failure is reported to the user, but the user is allowed to continue.
- Virtual IP Pool Name: The name of the VAST virtual IP pool. Recall these are named vippool-<number>. The VAST CSI will spread load across the VIPs in the pool.
- NFS Export Name: The name of the export.
- Additional Mount Options: Any custom NFS mount options, for example proto=rdma,port=20049.
This command can also be used non-interactively - append --help at the end for details
Below is an example of using the tool to generate the YAML file.
- A file will be created: vast-csi-deployment.yaml. You may inspect and edit the file if needed.
- Be sure to delete it when done, as it contains credentials to the VMS
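To illustrate why the file is sensitive: a Kubernetes Secret manifest holds its values either in plain text (stringData) or base64-encoded (data), and base64 is an encoding, not encryption. The following is a generic sketch of such an object, not the actual contents of vast-csi-deployment.yaml; the Secret name and key names are hypothetical.

```yaml
# Generic illustration of a Secret manifest; names and keys are hypothetical
# and do not reflect what the VAST tool actually generates.
apiVersion: v1
kind: Secret
metadata:
  name: example-vast-mgmt
  namespace: kube-system
type: Opaque
stringData:
  # Anyone who can read this file can read these values.
  username: csi
  password: "<your-password>"
```

Once the manifest has been applied, the credentials live in the cluster, so the local copy of the file is no longer needed.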
To deploy, run:
kubectl apply -f vast-csi-deployment.yaml
If successful, you will see one csi-vast-controller-0 pod plus one csi-vast-node-xxxxx pod per worker node. If they all show a status of Running, the CSI has been deployed successfully.
VAST's deployment has a singleton Controller pod responsible for creating/deleting the volumes (quotas), and a Node pod responsible for creating the mount on the node (host) it is running on.
The node pod contains the node-driver-registrar sidecar container (provided by k8s) and our vast-csi plugin. The sidecar is the glue between k8s and our CSI plugin endpoint. The plugin runs in "node" mode and invokes "mount/umount" commands. It does not communicate with the VMS.
kubectl get pods --all-namespaces -o wide
Launching an Application Using a Persistent Volume Claim
The following YAML can be used to test the CSI plugin. It creates a PVC by provisioning a 1Gi volume from VAST, plus a set of 5 pods running an application whose containers mount that volume. In this example the application is trivial, a simple shell program, but any real application is essentially the same in terms of the PVC reference.
Create a YAML file with the following content (test.csi.yaml)
- name: my-frontend
- mountPath: "/shared"
command: [ '/bin/sh', '-c', 'while true; do date -Iseconds >> /shared/$HOSTNAME; sleep 1; done' ]
- name: my-shared-volume
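A complete test.csi.yaml consistent with the fragments and description above might look like the following sketch. The claim name shared-claim matches the kubectl output shown later in this article; the storage class name, the use of a Deployment, its name and labels, and the busybox image are assumptions. Replace the storage class name with the one reported by kubectl get sc on your cluster.

```yaml
# Sketch of test.csi.yaml. Assumptions: storageClassName, the Deployment
# kind, its name and labels, and the busybox image.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-claim
spec:
  # Placeholder: use the class created by the VAST CSI deployment.
  storageClassName: example-storage-class
  accessModes:
    # The volume is shared by pods that may land on different nodes.
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test-app
spec:
  replicas: 5
  selector:
    matchLabels:
      app: test-app
  template:
    metadata:
      labels:
        app: test-app
    spec:
      containers:
        - name: my-frontend
          image: busybox
          volumeMounts:
            - mountPath: "/shared"
              name: my-shared-volume
          # Each pod appends timestamps to a file named after its hostname.
          command: [ '/bin/sh', '-c', 'while true; do date -Iseconds >> /shared/$HOSTNAME; sleep 1; done' ]
      volumes:
        - name: my-shared-volume
          persistentVolumeClaim:
            claimName: shared-claim
```

Because every pod writes to /shared/$HOSTNAME, the five replicas do not overwrite each other's output even though they share one claim.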
Apply the YAML file
kubectl apply -f test.csi.yaml
Once the application is operational, you can monitor the /k8s export. You will see a PVC per container, in this example 5, and in each a date is appended to a text file.
Let's discuss what happened here. Essentially when the application started the following things happened related to persistence:
- You created a Persistent Volume Claim which indirectly then used the StorageClass which in turn referenced the dynamic provisioner for VAST via CSI.
- The CSI received the request to create a persistent volume. Using the already configured NFS information, it created a directory (named for the Persistent Volume) and informed Kubernetes that the volume was created.
- Kubernetes allocated the logical Persistent Volume which references the NFS directory. At this point you can see the directory for it by listing the k8s directory.
- Time passes.
- An application is launched (in this case a pod but it could be a more complex deployment).
- Kubernetes began the launch sequence and recognized that the application referenced a Persistent Volume Claim.
- Kubernetes asked the vast-csi controller to publish the volume to the node where the application is to run. The vast-csi controller received the API request and allocated a VIP in round-robin fashion. If the application happens to fail over, it may get a new VIP. Note that each volume gets a single VIP on each node, meaning that multiple applications running on the same node and sharing the same volume will use the same VIP. Multiple volumes on the same node, or the same volume on different nodes, will likely get different VIPs, which spreads load well across volumes and nodes.
- Kubernetes then asked the vast-csi plugin on the node to mount the volume, making it available for the application.
- Kubernetes started the application within the container (in this case a simple shell script that prints to a file).
- The application creates a file in what appears to be a local directory within its container, but actually points to a directory in a VAST export (the directory being the Persistent Volume).
- The application writes to the file.
Here are a few handy Kubernetes commands relevant to this article.
# get all pods
kubectl get pods --all-namespaces -o wide
# get logs from the controller
kubectl logs pods/csi-vast-controller-0 --namespace kube-system -c csi-vast-plugin | less
# get logs from the node
kubectl logs pods/csi-vast-node-<NODE_ID> --namespace kube-system -c csi-vast-plugin
To see the Persistent Volumes and Persistent Volume Claims:
$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
shared-claim Bound pvc-41e3cd02-1a80-4dfb-bb50-6d272a9e649e 1Mi RWO managed-nfs-storage 12d
$ kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-41e3cd02-1a80-4dfb-bb50-6d272a9e649e 1Mi RWO Delete Bound default/shared-claim managed-nfs-storage 12d
Notice the name for the volume. That's the name of the directory we looked at earlier - with default-shared-claim prefixed to it.
To see the defined Storage Classes:
$ kubectl get sc
NAME PROVISIONER AGE
managed-nfs-storage nfs-client 13d
All of the above commands have corresponding delete and describe options as well - just insert 'delete' or 'describe' after kubectl and before the object type. Describe gives you a lot more information about the object.
You've seen how VAST can be used easily with Kubernetes to provide dynamically provisioned Persistent Volumes. Since VAST provides a highly scalable, fast, and remotely accessible data store, it is a great choice for use with Kubernetes.
We are very interested in feedback on our Kubernetes support. Please contact your VAST representative to let us know how well this works for you and what enhancements would be useful for a productized version.