
Mounting issues with encrypted volumes #41

Open

teknologista opened this issue Oct 20, 2021 · 4 comments

Labels
bug Something isn't working

@teknologista
Describe the bug
Volumes get into an unmountable state after trying to restart a pod that uses an encrypted PV.

To Reproduce
  • Set up the Scaleway CSI and create an encrypted StorageClass as outlined in the docs.
  • Deploy a StatefulSet, for example a 3-replica MongoDB.
  • Wait for the workload to come up; PVs are provisioned and everything is fine.
  • Kill one pod and wait for it to be recreated by Kubernetes (see the command sketch after this list).
  • Just after the scheduler schedules the pod onto a node, it errors because it cannot mount the previously created, existing PV.
  • See the errors from the kube logs below.
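
For reference, a minimal command sketch of the kill-and-restart step (assuming the StatefulSet is named mongo, so pod names such as mongo-1 are illustrative):

kubectl delete pod mongo-1       # kill one replica
kubectl get pods -w              # watch it being recreated
kubectl describe pod mongo-1     # the FailedMount warnings show up under Events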

Expected behavior
The PV should be attached to the new node where the new pod is scheduled, and the pod should start.

Details (please complete the following information):

  • Scaleway CSI version: 0.1.7
  • Platform: Rancher RKE2 v2.5.9
  • Orchestrator and its version: Kubernetes v1.20.11+rke2r2

Additional context

Errors shown

Warning FailedMount MountVolume.MountDevice failed for volume "pvc-3030ae10-3579-494a-a215-0017aea58332" : rpc error: code = Internal desc = error encrypting/opening volume with ID aeffa5d1-d5c3-406c-a728-d5d2c856aed9: luksStatus returned ok, but device scw-luks-aeffa5d1-d5c3-406c-a728-d5d2c856aed9 is not active

and

MountVolume.WaitForAttach failed for volume "pvc-83cf34a9-d36d-46e5-bbf2-199c426f518c" : volume fr-par-2/cbe3eca8-f623-4bbe-bc76-450eceb391b2 has GET error for volume attachment csi-879b1d2e5fa7ca784f356b823505c5506b57891aa56966b59c8ebfdae3497320: volumeattachments.storage.k8s.io "csi-879b1d2e5fa7ca784f356b823505c5506b57891aa56966b59c8ebfdae3497320" is forbidden: User "system:node:node-5" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node 'node-5' and this object

Again, that only seems to happen for encrypted PVs.
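
For reference, the attachment side of the second error can be inspected with something like the following (the VolumeAttachment and node names are taken from the log above; the impersonation check requires the right to impersonate that node user):

kubectl get volumeattachments | grep pvc-83cf34a9
kubectl describe volumeattachment csi-879b1d2e5fa7ca784f356b823505c5506b57891aa56966b59c8ebfdae3497320
kubectl auth can-i get volumeattachments.storage.k8s.io --as=system:node:node-5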

teknologista added the bug label Oct 20, 2021
@teknologista
Author

teknologista commented Oct 20, 2021

By the way, I have hardened RKE2 clusters available to test a potential fix or help debug this issue.

@Sh4d1
Contributor

Sh4d1 commented Oct 20, 2021

Couldn't reproduce with this:

allowVolumeExpansion: false # not yet supported
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: "scw-bssd-enc"
provisioner: csi.scaleway.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
parameters:
  encrypted: "true"
  csi.storage.k8s.io/node-stage-secret-name: "enc-secret"
  csi.storage.k8s.io/node-stage-secret-namespace: "default"
---
apiVersion: v1
kind: Secret
metadata:
  name: enc-secret
  namespace: default
type: Opaque
data:
  encryptionPassphrase: bXlhd2Vzb21lcGFzc3BocmFzZQ==
---
apiVersion: v1
kind: Service
metadata:
  name: mongo
  labels:
    name: mongo
spec:
  ports:
    - port: 27017
      targetPort: 27017
  clusterIP: None
  selector:
    role: mongo
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  selector:
    matchLabels:
      role: mongo
      environment: test
  serviceName: "mongo"
  replicas: 3
  template:
    metadata:
      labels:
        role: mongo
        environment: test
    spec:
      terminationGracePeriodSeconds: 10
      containers:
      - name: mongo
        image: mongo
        command:
          - mongod
          - "--replSet"
          - rs0
        ports:
          - containerPort: 27017
        volumeMounts:
          - name: mongo-persistent-storage
            mountPath: /data/db
      - name: mongo-sidecar
        image: cvallance/mongo-k8s-sidecar
        env:
          - name: MONGO_SIDECAR_POD_LABELS
            value: "role=mongo,environment=test"
  volumeClaimTemplates:
    - metadata:
        name: mongo-persistent-storage
      spec:
        storageClassName: "scw-bssd-enc"
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 5Gi
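
For completeness, the passphrase in the Secret above is plain base64, and the manifest can be applied as usual (the file name below is just an example):

echo -n "myawesomepassphrase" | base64     # -> bXlhd2Vzb21lcGFzc3BocmFzZQ==
kubectl apply -f mongo-enc.yaml
kubectl rollout status statefulset/mongo   # wait for the 3 replicas to come up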

Used this script:

#!/bin/bash

# Wait until every pod is Running, then delete one of the three mongo replicas at random.
while true; do
    while kubectl get pods --no-headers | grep -v Running ; do
        sleep 2
    done

    kubectl delete pods mongo-$(($RANDOM % 3))
done

I let it run for some time, with no issue. Tested on Kapsule with k8s 1.20.11.

Could you get the output of cryptsetup status /dev/mapper/scw-luks-<id> when it's stuck?

@teknologist

Hi Patrik,

Thanks for looking at this.

It may then be related to the fact that the Kubernetes cluster is RKE2 Government with a hardened Pod Security Policy enforced.

I will try again tomorrow and let you know the outcome.

@teknologista
Author

teknologista commented Dec 3, 2021

Hi @Sh4d1 ,

We are stuck with this issue again today while doing a rolling upgrade of a Kubernetes cluster between two 1.20 patch versions.

This is what happened:

  • We drained (forced) and then cordoned a node
  • The workload was then relaunched on a new node
  • It is now stuck with:

MountVolume.MountDevice failed for volume "pvc-7da69745-cd8b-4e4e-b236-ebcb6c76c328" : rpc error: code = Internal desc = failed to format and mount device from ("/dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3") to ("/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-7da69745-cd8b-4e4e-b236-ebcb6c76c328/globalmount") with fstype ("ext4") and options ([]): exit status 1

As per your request, this is the output:

 ~ sudo cryptsetup status /dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3
/dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3 is active.
  type:    n/a
  cipher:  aes-xts-plain64
  keysize: 256 bits
  key location: keyring
  device:  (null)
  sector size:  512
  offset:  32768 sectors
  size:    188710912 sectors
  mode:    read/write

On the Scaleway web console I can see the volume attached to the right node, though.

On the other hand, I logged onto the node and successfully ran a full cycle of:

 - cryptsetup luksClose on the device mapper created by the CSI
 - cryptsetup luksOpen on the device /dev/sda
 - fsck -fy /dev/mapper/the_mapper-device

fsck did fix a few minor errors; nothing crazy, but it did modify the filesystem.

Then I did a cryptsetup luksClose, and the volume was successfully auto-mounted by the CSI without me doing anything.
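
Roughly, that cycle was something like the following (the mapper name is from this incident, and /dev/sda happened to be the attached block device on that node, so adjust to your case):

sudo cryptsetup luksClose scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3
sudo cryptsetup luksOpen /dev/sda scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3   # prompts for the passphrase from the StorageClass secret
sudo fsck -fy /dev/mapper/scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3           # fixed a few minor errors
sudo cryptsetup luksClose scw-luks-cd5543ac-4300-4bb7-882a-6f19ca0149c3           # the CSI then re-mounted the volume on its own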

There might be a reason why an ext4 volume sometimes ends up corrupted after being disconnected from its workload by a sudden kill of the pod using it.
It then might need an fsck run before the CSI can mount it again.

I don't know if this helps, and I may be wrong, but this is the result of my investigation.

Anyway, is there anything we can do about this, as it kills the auto-healing behaviour of a Kubernetes cluster (maybe run an automated fsck -fy prior to mounting in the pod)? :-(

Many thanks.
