-
Notifications
You must be signed in to change notification settings - Fork 52
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unix socket proxy support added #77
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: protosam The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@baude PTAL |
Hold off on approving this for the moment. Adding in unix-to-unix (over SSH) support in the forwarder service. |
Added unix-to-unix/over-ssh feature. Ready for review now. Example UsageProxy local unix socket to remote tcp port.
Proxy local unix socket to remote unix socket over ssh proxy. (These are podman non-root and root sockets as example)
Testing DisclosureTo be upfront about this, I did not test this as thoroughly as I would have liked due to limited time today. The code was quickly put together and my testing was limited to checking connectivity with a |
pkg/services/forwarder/ports.go
Outdated
} | ||
|
||
// build an ssh client using sshConn | ||
sshClient := ssh.NewClient(sshConn, chans, reqs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First off thank you for contributing! I have some comments, but keep in mind I am just a contributor like you and not an official reviewer/merger
It looks like this approach is creating a new ssh connection per client. One of the benefits of the existing implementation is that, just like OpenSSH's forwarding behavior, the ssh connection is kept alive and reused for multiple connections on the forwarded unix socket. This allows us to avoid paying key exchange costs with every client connect, fitting expectations of existing docker api clients coded against a unix socket.
On this topic, in addition to your proposed dynamic registration addition, we have plans to add a ssh-forwarding only variant of gvproxy for windows that does not involve vsock (I hope to take a look at this sometime this week). Relatedly we've previously discussed it being a good idea to share the same ssh client code in the long term between podman and gvproxy to ensure consistent behavior and share bug fixes etc). Ideally, everything uses the same code. Can this be updated to reuse the existing (ssh forwarder) static impl (then you get stuff like the above for free)?
As a general note, IMO the static preconfigured ssh forwarding is desirable in the podman machine use-case (aside from the non-vsock windows need) since it can be established immediately without blocking on subsequent round trips (similar to the preconfigured routes capability).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @protosam and the review @n1hility!
I think both commits are useful. Can you add 2 end-to-end tests for them? You can find some examples in test/
.
It looks like this approach is creating a new ssh connection per client.
Yes we can evolve it in the future and have a single or a pool of connections?
As a general note, IMO the static preconfigured ssh forwarding is desirable in the podman machine use-case (aside from the non-vsock windows need) since it can be established immediately without blocking on subsequent round trips (similar to the preconfigured routes capability).
gvproxy
is really for podman right now, so we optimize for it. If you have other uses of this binary, @protosam, perhaps we can change it to have a configuration file or create a more generic binary?
I am happy to merge anything that can help podman machine (or crc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will update this further after I figure out how the tests should work. I'm working on fixing the multiple SSH connections behavior and the tests this week when I have some time.
Thanks for the code review @n1hility
I'm going to patch it a single ssh connection is used per forwarder in this PR. The behavior as noted is actually a bug, the connection should be scoped to the forwarder expose function and dial function should be only opening a connection if it detects it was closed.
Reusing the existing ssh forwarder implementation looks like it adds unnecessary overhead for an implementation that lives as a networking service this close to the networking stack. (Disclaimer, I might not know what I'm even talking about here, after reading the source for sshclient it's existence doesn't make sense to me)
It should be possible to create forwarders to re-create the same effect that we already have with the ssh-forwarder today.
@guillaumerose I have been borrowing a lot of code from this repository to build a version of the virtual network that acts more like a configurable service and the API is written to be more consumable. When I finish the ipv6 support I'll be pushing it to github. If you're interested I can shoot you an email to see what you think. What I need it for is a cross-platform container VM solution that I'm building, but it should drive more than just my own hobby project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For IPv6, I have this PR opened with everything working (#59). I don't merge it because I don't like one hack I had to write :)
If you're interested I can shoot you an email to see what you think.
Yes, good idea! Thanks 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think my last push resolves this thread.
A single SSH connection is used and it will reconnect if EOF error is received from a read of 0 bytes on the underlying tcp connection.
Also added tests for unix2unix and unix2tcp forwarders.
@guillaumerose tool I'm working on pretty much just borrows your IPv6 WIP code. It looked mostly fine to me, aside from some hard coded instances of "fe80::1" in places. Once I have more work done on DHCP for v6, that PR is on my giveback list this year.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sshClient connection has been moved into it's own anonymous function so the locking would only apply to the relevant tasks. Being surgical with the lock in any other way seems like it would get racy. The lock shouldn't be necessary for sshClient.Dial.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry just had a chance to take a look I agree @protosam that the nested stream dial should be safe since the ssh client does have a locking model around internal message handling. Although regrading the heartbeat point @guillaumerose brings up, I do think you want reconnection logic here since long running ssh connections can easily timeout from either ssh, networking, netfilter policy, or cycling sshd. This is where locking becomes important for this type of dispatch model since you need a happens-before edge around changes to the sshClient handle. The role of the bastion code in sshclient is to wrap the connection over multiple reconnects and handle changes . If dialing fails you can test the connection by sending a heartbeat (see the code in ssh_forwarder to see what I mean), and then tear down and rebuild the connection. Reestablishing (and the first establishment) might be rocky, so I recommend doing a similar backoff retry loop as well. One other thing you might want to look into when it comes to passing the dialed unix socket back to gvisor, is if the gvisor proxying logic manages half-close. The nested ssh streaming relies on a CloseWrite to gracefully terminate the client. I suspect gvisor does do this, but might be worth checking since otherwise you can get abrupt or delayed closes on interactive sessions. I have a local branch with the windows related changes I was referring to earlier, so when I finish that off tomorrow I can double check the gvisor thing if you dont get to it first.
Sorry about the branch conflict. I went to purge and redo my last commit because |
@protosam can you rebase so we can merge? |
#77 extract of unix-to-tcp part
I rebased the PR but I am wondering if we should perhaps introduce a real struct for port forwarding instead of relying on URL parsing.
Protocol is in fact the type of listener on the host. Perhaps we should have this:
It is not really convenient when using curl :/ WDYT? The most annyoing thing in the current proposition is that for unix protocol, the remote must be an URL. |
@guillaumerose seems reasonable to use a struct |
Been offline for a few days, sorry about the late reply. In my hobby project, my forwarder heavily relies on Edit to expand a bit on this.The forwarder seems like it needs to be flexible and may evolve over time, I don't really see a necessity to have struct or object definition for everything it does, but I do believe it should be possible to track everything it does. Right now gvproxy uses
In gvproxy, I'm not really sure if this is worth-while if the tool is just for podman, but it's the direction I'm going in my own endeavors. Something else I've been tinkering with is reverse forwarding (not sure what to really call it), but the idea is that if the "local" value isn't the host, a listener should get setup on the guest to the resource. |
Ok, let's go with URL. |
2 things
|
I took a look at podman code and they are not using the unix protocol so I think we are fine with this PR right now. |
IMO the only loose ends are the ssh reconnect and availability looping. An example outcome of the former is that If the ssh connection times out the port will have to be removed and re-added to resolve the issue. Some slight adaptation in the sshclient package would enable the existing logic here to be reused (basically just introducing a simpler dial variant of the AcceptAndTunnel routine). If you are swamped @protosam I could suggest a small patch on top of this PR to add those bits. Up to you @guillaumerose if you would rather merge and follow-up |
If there is no emergency, let's wait for the small patch. 👍 @baude I already merged part of the feature set: unix to tcp forwarding. Is it enough for you right now? Do you need this PR also? |
@n1hility I haven't caught up enough to circle back to this and adjacent efforts yet. Right now my belief is that that reconnect will passively happen with how it's written, but if I'm wrong, someone would need to show me the flaw in what I wrote. Feel free to make suggestions or changes. It looks like the most important part of this PR was added in by @guillaumerose . |
containers/gvisor-tap-vsock#77 have a way to use forward api to expose the unix socket also with this it is easy to export the docker unix socket from instance to host. ``` ✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/docker.sock","remote":"ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa"}' ✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/all | jq . % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 315 100 315 0 0 521k 0 --:--:-- --:--:-- --:--:-- 307k [ { "local": "/tmp/docker.sock", "remote": "ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa", "protocol": "unix" }, { "local": "127.0.0.1:2222", "remote": "192.168.127.2:22", "protocol": "tcp" }, { "local": "127.0.0.1:9090", "remote": "192.168.127.2:9090", "protocol": "tcp" } ] ➜ ~ export DOCKER_HOST=unix:///tmp/docker.sock ➜ ~ docker images REPOSITORY TAG IMAGE ID CREATED SIZE quay.io/crcont/gvisor-tap-vsock 3231aba53905468c22e394493a0debc1a6cc6392 f42b05e6b25e 5 months ago 6.65MB ```
I tested this PR with crc on mac ( unix socket expose part) and it works without any issue, I even stop the instance and started it again even restart the sshd daemon to try to see if reconnect happen seamlessly and it does. Testing notes:
|
@guillaumerose Is there still stuff blocking this PR? |
Let's merge this if it helps crc. We can still open follow-up PRs after. |
Resolves issue #41
Testing
This is the testing I've done for this PR.
Environment Notes
The host tested on is
Mac OS (Intel) / Big Sur
. The repo is located at~/repos/gvisor-tap-vsock
with branch set tofeature/unix-socket-tcp-proxy
. For simplicity sake, a qemu VM produced bypodman machine init
was used to test against.Setup Notes
Built
gvproxy
, backed up the productiongvproxy
, and replaced it to be used by the podman qemu VM.Started the podman VM.
Testing Notes
From the existing leases,
192.168.127.2
was the podman VM, this was confirmed from accessing the VM withpodman machine ssh
and viewingip a
output.Below illustrates creating the unix-to-tcp proxy.
Below illustrates disabling the unix-to-tcp proxy.