A Solution to Fixing Containers when lxcfs Crashes #583
Comments
Hi @deleriux, that's a good idea. As I said before, we are currently working on an internal lxcfs mechanism to recover from crashes, but this is a good solution for cases where rebooting all containers is problematic.
We'll need to be very, very careful when doing something like that, as root in the container can mess with the mount namespace. That's the reason we never invested too much effort into injecting LXCFS mounts into an existing instance. I certainly feel a lot better about the current plan from @mihalicyn to allow recovering from a lxcfs crash by re-attaching to the existing FUSE mounts.
@mihalicyn we can re-use this one to track the FUSE re-attach work.
@stgraber so, would this fix be added to the next release?
I would say it won't be addressed in the next release. We need to make some changes in the Linux kernel as part of this work. But it will definitely be implemented in LXCFS. Do you have any issues with LXCFS right now?
@mihalicyn any update?
Hello all,
I briefly mentioned last week that I had a solution for `Transport endpoint not connected` errors in containers when lxcfs crashes, without having to restart every container that came up. I've uploaded the code I have as-is, and here it is:
https://github.com/deleriux/lxcfs-reattach
I've tested on Ubuntu 22 and Ubuntu 16 (with updated kernel).
The way this works is by using system calls, introduced in Linux 5.2, that split the mount process into multiple steps; see:
https://lwn.net/Articles/759499/
You can leverage this step-by-step approach to take the source path in the host namespace, then switch to the container namespace and mount the target path in the container namespace.
The algorithm is basically as follows:

1. In the host namespace, take the lxcfs source path, for example `/var/lib/lxcfs/proc/meminfo`.
2. Call `open_tree()` on the path to obtain a mount_fd representing this mount point.
3. Switch to the container's mount namespace.
4. Call `umount()` on the container's path to `/proc/meminfo`.
5. Call `move_mount()` against `/proc/meminfo` to reattach this mountpoint to the container's VFS.

The code for this part is kept in https://github.com/deleriux/lxcfs-reattach/blob/main/container.c#L145 .
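For readers who don't want to open the repository, here is a minimal sketch of that sequence. It is not the code from `container.c`: the helper name `reattach_one`, the use of `setns()` on `/proc/<pid>/ns/mnt` to switch namespaces, and the `MNT_DETACH` flag are my assumptions. The syscall numbers and flag values are fallbacks for older headers.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks for toolchains whose headers predate the new mount API. */
#ifndef SYS_open_tree
#define SYS_open_tree 428
#endif
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef OPEN_TREE_CLOEXEC
#define OPEN_TREE_CLOEXEC O_CLOEXEC
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif

/*
 * Hypothetical helper (not from lxcfs-reattach): re-attach one lxcfs file,
 * e.g. host_src = "/var/lib/lxcfs/proc/meminfo", target = "/proc/meminfo".
 * Returns 0 on success, -1 on failure. It changes the mount namespace of the
 * calling process, so a real tool would run it in a single-threaded child.
 */
static int reattach_one(const char *host_src, pid_t container_pid,
                        const char *target)
{
    char ns_path[64];
    int mount_fd, ns_fd, rc = -1;

    /* Host namespace: clone the working mount into a detached mount fd. */
    mount_fd = (int)syscall(SYS_open_tree, AT_FDCWD, host_src,
                            OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
    if (mount_fd < 0)
        return -1;

    /* Switch to the container's mount namespace. */
    snprintf(ns_path, sizeof(ns_path), "/proc/%ld/ns/mnt", (long)container_pid);
    ns_fd = open(ns_path, O_RDONLY | O_CLOEXEC);
    if (ns_fd < 0)
        goto out;
    if (setns(ns_fd, CLONE_NEWNS) < 0)
        goto out_ns;

    /* Drop the stale mount. The result is ignored here (assumption): if
     * nothing was mounted, the new mount is simply placed on the target. */
    umount2(target, MNT_DETACH);

    /* Attach the detached mount at the target path inside the container. */
    if (syscall(SYS_move_mount, mount_fd, "", AT_FDCWD, target,
                MOVE_MOUNT_F_EMPTY_PATH) == 0)
        rc = 0;

out_ns:
    close(ns_fd);
out:
    close(mount_fd);
    return rc;
}
```

This needs root on the host. I've left out error reporting and any handling of user namespaces, so treat it as an illustration of the call order and nothing more.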
The remaining code is mostly dedicated to heuristics in finding containers to mount and mountpoints to monitor.
I'm pretty sure it's littered with stupid bugs, but it works.
The process supports a monitor mode that uses `epoll()` against all discovered `/proc/pid/mounts` to watch mounts come and go. If a qualifying mountpoint is unmounted then remounted (such as if `lxcfs` gets restarted), the process detects it and issues a request to test and then rebind mountpoints that no longer work.

If lxcfs crashes and is not restarted, then it can't help there, but as soon as a new instance comes up it should rebind the mountpoints pretty quickly.
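To illustrate the monitoring idea, here is a rough sketch of waiting for a mount table change on a single mounts file. The kernel reports a change on `/proc/<pid>/mounts` as `EPOLLPRI | EPOLLERR`. The function name `wait_for_mount_change` is made up for this example, and the real monitor keeps many files in one epoll set; this sketch opens just one.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to timeout_ms for the mount table behind
 * mounts_path (e.g. "/proc/1234/mounts") to change.
 * Returns 1 if a change was signalled, 0 on timeout, -1 on error.
 */
static int wait_for_mount_change(const char *mounts_path, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLPRI | EPOLLERR };
    struct epoll_event fired;
    int fd, epfd, n, rc = -1;

    fd = open(mounts_path, O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;

    epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0)
        goto out_fd;

    /* Only priority/error events are requested; the file is always
     * readable, so asking for EPOLLIN would wake up immediately. */
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0)
        goto out_ep;

    n = epoll_wait(epfd, &fired, 1, timeout_ms);
    if (n >= 0)
        rc = (n > 0);

out_ep:
    close(epfd);
out_fd:
    close(fd);
    return rc;
}
```

After a wake-up the caller would re-read the mounts file to see what actually changed, then decide whether any lxcfs mountpoint needs testing.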
My code doesn't (and can't) distinguish which lxcfs process to use when rebinding mountpoints; it merely selects the 'best/first' working one and runs with it. This is particularly relevant for LXD in snaps, which tends to run its own lxcfs alongside the system's lxcfs, which can also be running.
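As a sketch of what "first working one" can mean in practice: probe each candidate source by reading a few bytes from it and take the first that answers. A file on a dead FUSE mount fails with `ENOTCONN` (the `Transport endpoint not connected` error), so it is skipped. The names `source_is_alive` and `first_working_source` are hypothetical, and the actual heuristics in the repository may differ.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if at least one byte can be read from path, 0 otherwise. */
static int source_is_alive(const char *path)
{
    char buf[64];
    ssize_t n;
    int fd = open(path, O_RDONLY | O_CLOEXEC);

    if (fd < 0)
        return 0; /* missing, or the FUSE endpoint is gone */
    n = read(fd, buf, sizeof(buf));
    close(fd);
    return n > 0;
}

/* Returns the index of the first candidate that responds, or -1 if none. */
static int first_working_source(const char *const *candidates, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (source_is_alive(candidates[i]))
            return (int)i;
    }
    return -1;
}
```

The caller would pass one candidate per lxcfs instance it has discovered, for example the `proc/meminfo` file under each instance's mount directory.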
I'm not suggesting this is the best or only solution to this problem (or that my code is suitable for this project in its current form), but the algorithm to fix running containers is pretty straightforward and tends to work flawlessly without being too disruptive.