topolvm-controller provides a CSI controller service.
It also works as a custom Kubernetes controller for additional tasks.
topolvm-controller implements the following optional features:
- CREATE_DELETE_VOLUME to support dynamic volume provisioning
- GET_CAPACITY
- EXPAND_VOLUME
topolvm-controller implements two webhooks:
The Pod mutating webhook adds capacity.topolvm.io/<device-class> annotations to new Pods and a topolvm.io/capacity resource request to the first container of each Pod.
These annotations and the resource request are used by topolvm-scheduler to filter and score Nodes.
This hook handles two classes of pods. First, pods that have at least one unbound PersistentVolumeClaim (PVC) for TopoLVM and no bound PVC for TopoLVM. Second, pods that have at least one generic ephemeral volume that specifies a TopoLVM StorageClass.
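The selection rule above can be sketched as a small predicate. This is illustrative Python, not TopoLVM's code: the `Volume` record and its fields are hypothetical, and the function follows a literal reading of the two classes described above.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical, simplified view of one pod volume."""
    kind: str            # "pvc" or "ephemeral" (labels chosen for this sketch)
    topolvm: bool        # True if the volume uses a TopoLVM StorageClass
    bound: bool = False  # only meaningful for PVCs


def needs_mutation(volumes):
    """Return True if the pod falls into either class handled by the hook."""
    topolvm_pvcs = [v for v in volumes if v.kind == "pvc" and v.topolvm]
    has_unbound = any(not v.bound for v in topolvm_pvcs)
    has_bound = any(v.bound for v in topolvm_pvcs)
    has_ephemeral = any(v.kind == "ephemeral" and v.topolvm for v in volumes)
    # Class 1: at least one unbound TopoLVM PVC and no bound TopoLVM PVC.
    # Class 2: at least one generic ephemeral volume on a TopoLVM StorageClass.
    return (has_unbound and not has_bound) or has_ephemeral
```

For example, a pod with one unbound and one bound TopoLVM PVC is not in the first class, so the predicate returns False unless the pod also has a TopoLVM generic ephemeral volume.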
For both PVCs and generic ephemeral volumes, the requested storage size for the volume is calculated as follows:
- If the volume has no storage request, the size is treated as 1 GiB.
- If the volume has a storage request, that size is used as is.
The value of the resource request is the sum of storage sizes of unbound PVCs for TopoLVM.
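The size rule can be written down in a few lines. The sketch below is illustrative Python with hypothetical function names; it assumes storage requests have already been converted to bytes and that a missing request is represented as `None`.

```python
GIB = 1 << 30  # 1 GiB in bytes


def requested_size_bytes(storage_request):
    """Apply the defaulting rule: no storage request means 1 GiB."""
    return GIB if storage_request is None else storage_request


def total_requested_bytes(storage_requests):
    """Sum the effective sizes of the volumes passed in.

    The caller is expected to pass only the volumes that count,
    i.e. unbound PVCs for TopoLVM.
    """
    return sum(requested_size_bytes(r) for r in storage_requests)
```

With one PVC that has no storage request and one that requests 2 GiB, `total_requested_bytes([None, 2 * GIB])` evaluates to 3 GiB (3221225472 bytes).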
The following manifest exemplifies usage of TopoLVM PVCs:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: topolvm
provisioner: topolvm.io # topolvm-scheduler works only for StorageClass with this provisioner.
parameters:
  "csi.storage.k8s.io/fstype": "xfs"
  "topolvm.io/device-class": "ssd"
volumeBindingMode: WaitForFirstConsumer
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: local-pvc1
  namespace: hook-test
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: topolvm # reference the above StorageClass
---
apiVersion: v1
kind: Pod
metadata:
  name: pause
  namespace: hook-test
  labels:
    app.kubernetes.io/name: pause
spec:
  containers:
    - name: pause
      image: registry.k8s.io/pause
      volumeMounts:
        - mountPath: /test1
          name: my-volume1
  volumes:
    - name: my-volume1
      persistentVolumeClaim:
        claimName: local-pvc1 # have the above PVC
The hook inserts capacity.topolvm.io/<device-class> to the annotations and topolvm.io/capacity to the first container as follows:
metadata:
  annotations:
    capacity.topolvm.io/ssd: "1073741824"
spec:
  containers:
    - name: pause
      resources:
        limits:
          topolvm.io/capacity: "1"
        requests:
          topolvm.io/capacity: "1"
If the specified StorageClass does not have the topolvm.io/device-class parameter, the pod will be annotated with capacity.topolvm.io/00default.
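The way the annotation key follows from the StorageClass parameters can be sketched as below. This is illustrative Python with a hypothetical function name; it takes the `parameters` map of a StorageClass as a plain dict and uses only the key names that appear in this document.

```python
ANNOTATION_PREFIX = "capacity.topolvm.io/"
DEFAULT_DEVICE_CLASS = "00default"


def capacity_annotation_key(storage_class_parameters):
    """Build the pod annotation key for a StorageClass parameter map."""
    device_class = storage_class_parameters.get(
        "topolvm.io/device-class", DEFAULT_DEVICE_CLASS
    )
    return ANNOTATION_PREFIX + device_class
```

For the StorageClass in the manifest above, the key is `capacity.topolvm.io/ssd`. A StorageClass that only sets `csi.storage.k8s.io/fstype` yields `capacity.topolvm.io/00default`.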
Below is an example for TopoLVM generic ephemeral volumes:
apiVersion: v1
kind: Pod
metadata:
  name: pause
  labels:
    app.kubernetes.io/name: pause
spec:
  containers:
    - name: pause
      image: registry.k8s.io/pause
      volumeMounts:
        - mountPath: /test1
          name: my-volume
  volumes:
    - name: my-volume
      ephemeral:
        volumeClaimTemplate:
          spec:
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 1Gi
            storageClassName: topolvm # reference the above StorageClass
The hook inserts capacity.topolvm.io/<device-class> to the annotations and topolvm.io/capacity to the first container as follows:
metadata:
  annotations:
    capacity.topolvm.io/ssd: "1073741824"
spec:
  containers:
    - name: pause
      resources:
        limits:
          topolvm.io/capacity: "1"
        requests:
          topolvm.io/capacity: "1"
The PVC mutating webhook adds the topolvm.io/pvc finalizer to new PVCs. This finalizer is required to delete a pod in the following scenario.
1. A StatefulSet pod is deleted by kubectl drain. The PVC remains.
2. A pod is recreated by the StatefulSet controller but is not scheduled for some reason.
3. The Node resource on which the pod was running is deleted.
4. The PVC related to the node is deleted by the TopoLVM controller.
At step 4, the StatefulSet pod is not deleted if the PVC finalizer does not exist.
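The effect of this webhook on a PVC object can be sketched as follows. This is illustrative Python operating on a manifest loaded as a plain dict; the function name is hypothetical, and the sketch does not model which PVCs the real webhook selects.

```python
import copy

PVC_FINALIZER = "topolvm.io/pvc"


def add_pvc_finalizer(pvc):
    """Return a copy of the PVC manifest with the finalizer present once."""
    mutated = copy.deepcopy(pvc)
    metadata = mutated.setdefault("metadata", {})
    finalizers = list(metadata.get("finalizers") or [])
    if PVC_FINALIZER not in finalizers:
        finalizers.append(PVC_FINALIZER)
    metadata["finalizers"] = finalizers
    return mutated
```

The function is idempotent: applying it to an already mutated PVC returns an equal manifest, so any finalizers set by other components are kept and the TopoLVM one is not duplicated.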
The controller cleans up all PVCs and LogicalVolumes associated with a Node that is being deleted.
It adds the topolvm.io/node finalizer to run the cleanup task.
This node finalize procedure may be skipped with the --skip-node-finalize flag.
When this flag is true, the PVCs and the LogicalVolume CRs from a deleted node must be deleted manually by a cluster administrator.
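The cleanup decision can be sketched as a pure function. This is illustrative Python under loud assumptions: the record shape (`{"name": ..., "node": ...}`) and the function name are hypothetical, and how the real controller associates a PVC or LogicalVolume with a Node is not modeled here.

```python
def plan_node_cleanup(node_name, pvcs, logical_volumes, skip_node_finalize=False):
    """Return the names of objects to delete when a Node is being deleted.

    pvcs and logical_volumes are lists of hypothetical records of the form
    {"name": <object name>, "node": <node the object belongs to>}.
    """
    if skip_node_finalize:
        # Nothing is cleaned up automatically; an administrator must do it.
        return {"pvcs": [], "logical_volumes": []}
    return {
        "pvcs": [p["name"] for p in pvcs if p["node"] == node_name],
        "logical_volumes": [
            lv["name"] for lv in logical_volumes if lv["node"] == node_name
        ],
    }
```

With `skip_node_finalize=True` the plan is empty, which corresponds to the manual cleanup case described above.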
The controller accomplishes two tasks.
- Delete Pods using PVCs under deletion. When a PVC for TopoLVM is being deleted, the controller deletes pods referencing the PVC, if any. This is repeated until all other finalizers have completed. Once the controller's finalizer is the last one remaining, the controller removes it so that the PVC is deleted immediately.
- Speed up resizing a PVC filesystem by nudging the kubelet. The kubelet resizes the filesystem by periodically watching Pods rather than PVCs, so the resize may be delayed. To avoid this, the controller notifies the kubelet by setting the topolvm.io/last-resizefs-requested-at annotation on the Pod to the current time.
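Both tasks can be sketched as small functions. This is illustrative Python with hypothetical function names. It assumes the controller's own PVC finalizer is topolvm.io/pvc, and it assumes an ISO 8601 timestamp for the annotation value, since the format is not stated here.

```python
from datetime import datetime, timezone

PVC_FINALIZER = "topolvm.io/pvc"
RESIZE_ANNOTATION = "topolvm.io/last-resizefs-requested-at"


def deletion_step(pvc_finalizers, referencing_pods):
    """One pass over a PVC that is being deleted.

    Returns (pods_to_delete, remove_finalizer). Pods referencing the PVC are
    deleted on every pass; the finalizer is removed only once it is the last
    one left on the PVC.
    """
    others = [f for f in pvc_finalizers if f != PVC_FINALIZER]
    remove_finalizer = PVC_FINALIZER in pvc_finalizers and not others
    return list(referencing_pods), remove_finalizer


def resize_nudge_patch(now):
    """Build a Pod metadata patch that sets the resize annotation to `now`."""
    # Assumption: ISO 8601 formatting is used only for illustration.
    return {"metadata": {"annotations": {RESIZE_ANNOTATION: now.isoformat()}}}
```

For example, `resize_nudge_patch(datetime.now(timezone.utc))` yields a patch whose only content is the annotation. Changing the Pod object in this way is what gives the kubelet a reason to look at the Pod again.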
Name | Type | Default | Description |
---|---|---|---|
cert-dir | string | /tmp/k8s-webhook-server/serving-certs | Directory for tls.crt and tls.key files. |
csi-socket | string | /run/topolvm/csi-topolvm.sock | UNIX domain socket of topolvm-controller. |
metrics-bind-address | string | :8080 | Listen address for Prometheus metrics. |
secure-metrics-server | bool | false | Secures the metrics server. |
leader-election-id | string | topolvm | ID for leader election by controller-runtime. |
webhook-addr | string | :9443 | Listen address for the webhook endpoint. |
skip-node-finalize | bool | false | When true, skips automatic cleanup of PersistentVolumeClaims on Node deletion. |