Static EBS Volume Allocations on CSINode CRDs #2153
Comments
We've also run into this problem and have been considering what could be done to address it. One alternative to dynamically updating the allocation calculation is to assume that every potential ENI is already in use when computing the allocatable count. Pros:
Cons:
For our use cases, maximizing the number of supported volumes isn't critical. Following your example, losing a handful of potential volumes on a node that can hold 25 is acceptable, and k8s should provision additional nodes to satisfy pods that require volume mounts without issue.
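The worst-case approach described above can be sketched as follows. This is only an illustration of the idea, not the driver's implementation: the function name and the `max_enis` parameter are hypothetical, and the value 5 for the maximum ENI count is an assumption chosen to match the example in the issue (1 ENI at startup plus 4 allocated later by cilium).

```python
def worst_case_allocatable(total_slots: int, max_enis: int, host_volumes: int) -> int:
    """Allocatable EBS volumes if every potential ENI is assumed to be attached.

    Hypothetical helper: ENIs and EBS volumes are assumed to share one pool
    of attachment slots, as in the example from the issue.
    """
    return max(total_slots - host_volumes - max_enis, 0)


# Numbers from the issue: 28 slots and 2 host volumes.
# max_enis=5 is an assumption (1 ENI at startup + 4 added by cilium).
static_estimate = worst_case_allocatable(total_slots=28, max_enis=5, host_volumes=2)
print(static_estimate)
```

With these assumed numbers the node advertises 21 volumes instead of 25, and the value stays the same no matter when the driver starts or how many ENIs are attached at that moment.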
A general update on this: we are actively working with the relevant Kubernetes SIGs to better account for the dynamic nature of volume limits on EC2. There is a KEP which aims to address this pain point and is available for review here: kubernetes/enhancements#4875. Please feel free to leave any feedback directly on the PR. In the meantime, we recommend adopting one or more of the solutions below if you are experiencing volume limit issues caused by the limit changing after startup:
@wmgroot Thank you for this feedback, it's very helpful! Please take a look at the solutions listed above; one or more of these may be suitable for your specific use case and could help alleviate the pain in the short to medium term while we work on the Kubernetes enhancement. We're aware that this specific issue is incredibly disruptive and are striving to resolve it.
That absolutely makes sense, especially as the existing behavior leads to stuck volume attachments when the driver attempts to maximize the attachment slots. As you can imagine, changing this default behavior would be incredibly difficult because there are certainly a large number of users who care deeply about maximizing the attachment slots (even if it's done on a best-effort basis). That said, PTAL at the
No worries. I definitely wasn't hoping that the default behavior would change, and definitely understand that there are probably users out there who care a lot about maximizing the number of pods with EBS volumes they can fit onto their nodes. I would expect that my suggestion was simply an option not enabled by default that could be toggled on if it made sense for each user. Regarding Running multiple daemonsets to handle different instance type requirements in the strategy above also seems like it could easily spiral out of control. I'll take a look at the prefix delegation option you've linked; I'm not sure my team is aware of that recommendation or if it would help us in this case.
@wmgroot Makes sense, thank you for the additional context. With regard to prefix delegation, it's generally considered a best practice, so it's definitely worth exploring. There are some good recommendations, more helpful background, and instructions for enabling this mode in the
/kind bug
What happened?
Persistent Volumes failed to attach to a host due to EBS Volume Limits not adjusting for dynamically allocated ENIs.
What you expected to happen?
The EBS Volume allocation field on the CSINode should update at some point after the pod starts up.
While it might not be ideal to continuously update this field, there should be some kind of logic to recalculate these values rather than only computing them on startup. For example, could the recalculation be triggered when the CSI driver times out waiting for EBS volumes to attach to a node?
How to reproduce it (as minimally and precisely as possible)?
Start up a host in a k8s cluster.
Check the corresponding CSINode allocatable number of EBS volumes.
After the aws-ebs-csi-driver has started, attach an ENI to the host.
Observe the CSINode allocatable number of EBS volumes does not change.
Schedule pods onto the host that together attach as many EBS volumes as the CSINode allocatable count.
Verify that the last volume cannot be attached to the host and that the CSINode allocatable field is not updated.
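To compare the allocatable count before and after attaching the ENI, the value can be read from the CSINode object. The following is a minimal sketch that parses the output of `kubectl get csinode <node-name> -o json`; the helper name and the sample document are hypothetical, and the count of 25 is taken from the example further down.

```python
import json

EBS_DRIVER_NAME = "ebs.csi.aws.com"


def ebs_allocatable_count(csinode_json: str, driver_name: str = EBS_DRIVER_NAME):
    """Return spec.drivers[].allocatable.count for the given driver, or None."""
    csinode = json.loads(csinode_json)
    for driver in csinode.get("spec", {}).get("drivers") or []:
        if driver.get("name") == driver_name:
            return (driver.get("allocatable") or {}).get("count")
    return None


# Hypothetical sample shaped like a CSINode object; the node name and
# instance ID are placeholders.
sample = json.dumps({
    "apiVersion": "storage.k8s.io/v1",
    "kind": "CSINode",
    "metadata": {"name": "example-node"},
    "spec": {
        "drivers": [
            {
                "name": EBS_DRIVER_NAME,
                "nodeID": "i-0123456789abcdef0",
                "allocatable": {"count": 25},
            }
        ]
    },
})

print(ebs_allocatable_count(sample))
```

Running this against the JSON captured at each step makes it easy to confirm that the count stays the same after the ENI is attached.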
Anything else we need to know?:
For our use case, cilium is dynamically provisioning ENIs which scale based on the number of pods scheduled to the host. Because of the dynamic nature here, it is not ideal to allocate a number of --reserved-volume-attachments. In the event of a container restart, the new allocation calculation will count the attached ENIs and the allocatable EBS volumes will be less than what is actually available.
Example:
We start with 28 EBS volume slots, 2 host EBS volumes, 1 ENI. The calculated allocatable field will be 25.
If we reserve 7 slots we are down to 20 (Total - reserved - attached ENIs) and we ensure cilium has 4 ENI slots to dynamically allocate.
If the aws-ebs-csi-driver restarts after cilium is using 4 ENI slots, we will be down to 16 (Total - reserved - 5 ENIs) allocatable EBS volumes and 4 will be unused.
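The arithmetic in this example can be reproduced with a short sketch. The formulas are taken from the numbers above (Total - host volumes - ENIs without a reservation, Total - reserved - attached ENIs with one); the function name and parameters are hypothetical, and this is not claimed to be the driver's actual code.

```python
from typing import Optional


def allocatable_volumes(total_slots: int, attached_enis: int,
                        host_volumes: int = 0,
                        reserved: Optional[int] = None) -> int:
    """Allocatable EBS volumes as computed once, at driver startup."""
    if reserved is not None:
        # With a reservation: Total - reserved - attached ENIs
        used = reserved + attached_enis
    else:
        # Without a reservation: Total - host volumes - attached ENIs
        used = host_volumes + attached_enis
    return max(total_slots - used, 0)


# Startup with 28 slots, 2 host volumes, 1 ENI and no reservation.
no_reservation = allocatable_volumes(28, attached_enis=1, host_volumes=2)

# Same node with 7 reserved slots and 1 ENI attached at startup.
first_start = allocatable_volumes(28, attached_enis=1, reserved=7)

# Driver restarts after cilium has attached 4 more ENIs (5 in total).
after_restart = allocatable_volumes(28, attached_enis=5, reserved=7)

print(no_reservation, first_start, after_restart)
print("slots lost after restart:", first_start - after_restart)
```

The difference between the two reserved cases is the 4 slots that end up unused, because the ENIs are subtracted in addition to the reservation that was meant to cover them.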
Environment
Kubernetes version (use kubectl version): 1.28