OpenEBS takes volume provisioning to the next level thanks to its dynamic nature and its replication capabilities. From time to time, however, maintenance work on the cluster may require taking a node out. This has to be done with great care so that no data is corrupted or lost.
Reducing replicas gracefully
I use OpenEBS cStor as the default storage engine. The CStorPoolCluster has been configured with three CStorPoolInstances, one on each worker node. Now it's time to temporarily reduce the number of those instances to two. The following command lists them.
❯ kubectl get cstorpoolinstances.cstor.openebs.io
NAME                           HOSTNAME   FREE     CAPACITY   READONLY   PROVISIONEDREPLICAS   HEALTHYREPLICAS   STATUS   AGE
openebs-cstor-disk-pool-d5rg   vm0102     87400M   90100M     false      4                     4                 ONLINE   34d
openebs-cstor-disk-pool-pxgl   vm0302     87400M   90100M     false      4                     4                 ONLINE   34d
openebs-cstor-disk-pool-rvl2   vm0202     87400M   90100M     false      4                     4                 ONLINE   33d
Due to the removal of a node, I have to decommission the openebs-cstor-disk-pool-pxgl pool. The pool is used by multiple PVCs, so I have to edit the CStorVolumeConfig of each PVC to remove the pool from it.
❯ kubectl get cstorvolumeconfigs.cstor.openebs.io
NAMESPACE   NAME                                       CAPACITY   STATUS   AGE
openebs     pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c   4Gi        Bound    33d
openebs     pvc-5f57319d-a744-44fa-9aa1-33cfcadb649f   8Gi        Bound    33d
openebs     pvc-863aa630-5368-4400-9c01-e965d17c5aeb   4Gi        Bound    33d
openebs     pvc-8d8cb8cb-9e13-4a3a-8f12-d5a19b4afb8b   4Gi        Bound    33d
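The volume configs are named after their PersistentVolumes rather than the PVCs. If it is unclear which PVC a given config belongs to, its openebs.io/persistent-volume-claim annotation (visible in the YAML below) spells it out, or the mapping can be read off the VOLUME column of the claims themselves:
❯ kubectl get pvc --all-namespaces
The VOLUME column shows the PV name, which matches the CStorVolumeConfig name.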
Editing a volume config gives output similar to the following.
❯ kubectl edit cstorvolumeconfigs.cstor.openebs.io pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c
apiVersion: cstor.openebs.io/v1
kind: CStorVolumeConfig
metadata:
  annotations:
    openebs.io/persistent-volume-claim: datadir-mysql-innodbcluster-0
    openebs.io/volume-policy: ""
    openebs.io/volumeID: pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c
  creationTimestamp: "2023-01-14T20:37:31Z"
  finalizers:
  - cvc.openebs.io/finalizer
  generation: 6
  labels:
    cstor.openebs.io/template-hash: "251293062"
    openebs.io/cstor-pool-cluster: openebs-cstor-disk-pool
    openebs.io/pod-disruption-budget: openebs-cstor-disk-poolrb5qk
  name: pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c
  namespace: openebs
  resourceVersion: "538165"
  uid: e441ef25-4fd1-4fcf-9e96-597f79f812db
publish:
  nodeId: vm0202
spec:
  capacity:
    storage: 4Gi
  cstorVolumeRef:
    apiVersion: cstor.openebs.io/v1
    kind: CStorVolume
    name: pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c
    namespace: openebs
    resourceVersion: "537566"
    uid: 3037d679-b74f-4fdf-8279-7c31d4ff59e2
  policy:
    provision:
      replicaAffinity: false
    replica: {}
    replicaPoolInfo:
    - poolName: openebs-cstor-disk-pool-d5rg
    - poolName: openebs-cstor-disk-pool-pxgl
    - poolName: openebs-cstor-disk-pool-rvl2
    target:
      auxResources:
        limits:
          cpu: "0"
          memory: "0"
        requests:
          cpu: "0"
          memory: "0"
The crucial part here is to remove openebs-cstor-disk-pool-pxgl from the replicaPoolInfo list. This has to be done for each CStorVolumeConfig; a patch-based alternative is sketched below.
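If editing every volume config by hand feels error-prone, the same change can be expressed as a JSON patch. A minimal sketch, assuming replicaPoolInfo lives under spec.policy as in the YAML above and that openebs-cstor-disk-pool-pxgl is the second entry (index 1); the index must be adjusted per volume config:
❯ kubectl patch cstorvolumeconfigs.cstor.openebs.io pvc-48b1eaaa-8874-4d56-9c3a-1240e30d861c \
    -n openebs --type=json \
    -p '[{"op": "remove", "path": "/spec/policy/replicaPoolInfo/1"}]'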
Once every volume config has been updated, there should be no replicas left in the pool.
❯ kubectl get cstorpoolinstances.cstor.openebs.io
NAME                           HOSTNAME   FREE     CAPACITY   READONLY   PROVISIONEDREPLICAS   HEALTHYREPLICAS   STATUS   AGE
openebs-cstor-disk-pool-d5rg   vm0102     87400M   90100M     false      4                     4                 ONLINE   34d
openebs-cstor-disk-pool-pxgl   vm0302     90G      90117M     false      0                     0                 ONLINE   34d
openebs-cstor-disk-pool-rvl2   vm0202     87400M   90100M     false      4                     4                 ONLINE   33d
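Before deleting the pool instance, it is worth double-checking that no CStorVolumeReplicas are still placed on it. cStor labels each replica with the name of its pool instance (the cstorpoolinstance.openebs.io/name label, visible later in this post), so a simple label selector does the job; an empty result confirms the pool no longer hosts any data.
❯ kubectl get cstorvolumereplicas.cstor.openebs.io -n openebs \
    -l cstorpoolinstance.openebs.io/name=openebs-cstor-disk-pool-pxgl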
We are ready to delete it.
❯ kubectl delete cstorpoolinstances.cstor.openebs.io openebs-cstor-disk-pool-pxgl
cstorpoolinstance.cstor.openebs.io "openebs-cstor-disk-pool-pxgl" deleted
Now there should be only two disk pools. Listing the CStorPoolClusters shows the number of healthy and desired instances: 2 and 3, respectively.
❯ kubectl get cstorpoolclusters.cstor.openebs.io
NAME                      HEALTHYINSTANCES   PROVISIONEDINSTANCES   DESIREDINSTANCES   AGE
openebs-cstor-disk-pool   2                  2                      3                  34d
It's time to update the CStorPoolCluster.
❯ kubectl edit cstorpoolclusters.cstor.openebs.io openebs-cstor-disk-pool
apiVersion: cstor.openebs.io/v1
kind: CStorPoolCluster
metadata:
  creationTimestamp: "2023-01-13T21:31:16Z"
  finalizers:
  - cstorpoolcluster.openebs.io/finalizer
  generation: 21
  name: openebs-cstor-disk-pool
  namespace: openebs
  resourceVersion: "18193885"
  uid: b9a7dfe4-5f2f-43b5-aeef-af03af3d87f1
spec:
  pools:
  - dataRaidGroups:
    - blockDevices:
      - blockDeviceName: blockdevice-a251ba13122b4b5f8c2ce9471cf4b03e
    nodeSelector:
      kubernetes.io/hostname: vm0102
    poolConfig:
      dataRaidGroupType: stripe
  - dataRaidGroups:
    - blockDevices:
      - blockDeviceName: blockdevice-19d2d2fdc1c0e274aa3ba199d8fba897
    nodeSelector:
      kubernetes.io/hostname: vm0302
    poolConfig:
      dataRaidGroupType: stripe
  - dataRaidGroups:
    - blockDevices:
      - blockDeviceName: blockdevice-2806021afad58e5ef20c5c82b78fd943
    nodeSelector:
      kubernetes.io/hostname: vm0202
    poolConfig:
      dataRaidGroupType: stripe
We have to get rid of the pool entry containing the hostname vm0302 and block device blockdevice-19d2d2fdc1c0e274aa3ba199d8fba897 from the pools list; a patch-based alternative is sketched below.
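The same edit expressed as a JSON patch, a sketch assuming the vm0302 entry is the second item (index 1) of spec.pools, as in the YAML above:
❯ kubectl patch cstorpoolclusters.cstor.openebs.io openebs-cstor-disk-pool -n openebs \
    --type=json -p '[{"op": "remove", "path": "/spec/pools/1"}]'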
❗ If the node is already lost, OpenEBS won't let us remove the hostname and device from the list. Additional steps are required; see Node is lost before disk pool removal.
The block device should now be Unclaimed.
❯ kubectl get bd
NAME                                           NODENAME   SIZE          CLAIMSTATE   STATUS   AGE
blockdevice-19d2d2fdc1c0e274aa3ba199d8fba897   vm0302     99998934528   Unclaimed    Active   34d
blockdevice-2806021afad58e5ef20c5c82b78fd943   vm0202     99998934528   Claimed      Active   34d
blockdevice-a251ba13122b4b5f8c2ce9471cf4b03e   vm0102     99998934528   Claimed      Active   34d
Now it can be removed.
❯ kubectl delete bd blockdevice-19d2d2fdc1c0e274aa3ba199d8fba897
blockdevice.openebs.io "blockdevice-19d2d2fdc1c0e274aa3ba199d8fba897" deleted
We are ready to drain the node.
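For completeness, a sketch of the drain itself, assuming vm0302 is the node being taken out; the flags are the usual ones for nodes running DaemonSets and emptyDir volumes, and the final delete is only needed when the node leaves the cluster for good.
❯ kubectl drain vm0302 --ignore-daemonsets --delete-emptydir-data
❯ kubectl delete node vm0302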
Node is lost before disk pool removal
Planned maintenance is the most desirable way of keeping things healthy. However, random unplanned events, such as disasters, do happen from time to time. In such cases there is no way to perform a graceful removal, because the node is already lost.
OpenEBS uses finalizers to perform clean-up actions once a resource is removed. However, when a finalizer cannot run its action, the deletion gets stuck and prevents the removal of the disk pool from the cluster. In such a case, each cStor volume replica related to the lost pool has to be edited and its finalizer removed manually.
❯ kubectl get cstorvolumereplicas.cstor.openebs.io
NAME                                                                     ALLOCATED   USED    STATUS    AGE
pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-pxgl   873M        4.79G   Offline   4d12h
pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-d5rg   873M        4.79G   Healthy   4d12h
pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-rvl2   873M        4.79G   Healthy   4d12h
pvc-5f57319d-a744-44fa-9aa1-33cfcadb649f-openebs-cstor-disk-pool-pxgl   226M        758M    Offline   14d
pvc-5f57319d-a744-44fa-9aa1-33cfcadb649f-openebs-cstor-disk-pool-d5rg   227M        758M    Healthy   48d
pvc-5f57319d-a744-44fa-9aa1-33cfcadb649f-openebs-cstor-disk-pool-rvl2   227M        758M    Healthy   48d
pvc-a274c159-eeb4-4d7c-9c22-bc8df66e9ae9-openebs-cstor-disk-pool-pxgl   189M        512M    Offline   4d12h
pvc-a274c159-eeb4-4d7c-9c22-bc8df66e9ae9-openebs-cstor-disk-pool-d5rg   189M        512M    Healthy   4d12h
pvc-a274c159-eeb4-4d7c-9c22-bc8df66e9ae9-openebs-cstor-disk-pool-rvl2   189M        511M    Healthy   4d12h
Here, the following volume replicas need treatment:
pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-pxgl,
pvc-5f57319d-a744-44fa-9aa1-33cfcadb649f-openebs-cstor-disk-pool-pxgl, and
pvc-a274c159-eeb4-4d7c-9c22-bc8df66e9ae9-openebs-cstor-disk-pool-pxgl.
When editing each resource, we have to find the finalizers list.
❯ kubectl edit cstorvolumereplicas.cstor.openebs.io pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-pxgl
apiVersion: cstor.openebs.io/v1
kind: CStorVolumeReplica
metadata:
  annotations:
    cstorpoolinstance.openebs.io/hostname: vm0302
  creationTimestamp: "2023-02-27T20:13:52Z"
  finalizers:
  - cstorvolumereplica.openebs.io/finalizer
  generation: 13086
  labels:
    cstorpoolinstance.openebs.io/name: openebs-cstor-disk-pool-pxgl
And remove the following line:
- cstorvolumereplica.openebs.io/finalizer
This will allow the CStorVolume to scale down.
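The finalizer can also be cleared without opening an editor. A sketch using a merge patch that empties the finalizers list (here it holds only that single entry); repeat for each offline replica of the lost pool:
❯ kubectl patch cstorvolumereplicas.cstor.openebs.io \
    pvc-5ea36f92-daec-4ba8-a650-456b1b97b17a-openebs-cstor-disk-pool-pxgl \
    -n openebs --type=merge -p '{"metadata":{"finalizers":[]}}'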
Now we can update the CStorPoolCluster and remove the unreachable resources.
Conclusion
Installing OpenEBS on a cluster and configuring it is an easy task. Maintenance, however, can be quite difficult and requires a lot of insight. I strongly encourage you to consult the troubleshooting guides before going down the rabbit hole.
Troubleshooting OpenEBS - cStor | OpenEBS Docs