Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

unix socket proxy support added #77

Merged
merged 7 commits into from Feb 11, 2022
Merged

unix socket proxy support added #77

merged 7 commits into from Feb 11, 2022

Conversation

ghost
Copy link

@ghost ghost commented Dec 27, 2021

Resolves issue #41

Testing

This is the testing I've done for this PR.

Environment Notes

The host tested on is Mac OS (Intel) / Big Sur. The repo is located at ~/repos/gvisor-tap-vsock with branch set to feature/unix-socket-tcp-proxy. For simplicity sake, a qemu VM produced by podman machine init was used to test against.

Setup Notes

Built gvproxy, backed up the production gvproxy, and replaced it to be used by the podman qemu VM.

$ cd ~/repos/gvisor-tap-vsock
$ go mod tidy
$ make gvproxy
$ podman machine init
$ cp /usr/local/opt/podman/libexec/gvproxy /usr/local/opt/podman/libexec/gvproxy.backup
$ cp ~/repos/gvisor-tap-vsock/bin/gvproxy /usr/local/opt/podman/libexec/gvproxy
$ ls -lah /usr/local/opt/podman/libexec/gvproxy*

Started the podman VM.

$ podman machine start

Testing Notes

From the existing leases, 192.168.127.2 was the podman VM, this was confirmed from accessing the VM with podman machine ssh and viewing ip a output.

$ curl -s http://127.0.0.1:7777/leases | jq

Below illustrates creating the unix-to-tcp proxy.

$ curl http://127.0.0.1:7777/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/podman-machine-ssh.sock","remote":"192.168.127.2:22"}'
$ curl -s http://127.0.0.1:7777/services/forwarder/all | jq
[
  {
    "local": "/tmp/podman-machine-ssh.sock",
    "remote": "192.168.127.2:22",
    "protocol": "unix"
  },
  {
    "local": ":55488",
    "remote": "192.168.127.2:22",
    "protocol": "tcp"
  }
]

$ ssh -i ~/.ssh/podman-machine-default -o "ProxyCommand socat - UNIX-CLIENT:/tmp/podman-machine-ssh.sock" core@podman-machine
Warning: Permanently added 'podman-machine' (ECDSA) to the list of known hosts.
Fedora CoreOS 35.20211215.2.0
Tracker: https://github.com/coreos/fedora-coreos-tracker
Discuss: https://discussion.fedoraproject.org/c/server/coreos/

Last login: Mon Dec 27 21:25:47 2021 from 192.168.127.1
[core@localhost ~]$ 

Below illustrates disabling the unix-to-tcp proxy.

$ curl http://127.0.0.1:7777/services/forwarder/unexpose -X POST -d '{"protocol":"unix","local":"/tmp/podman-machine-ssh.sock"}'
$ curl -s http://127.0.0.1:7777/services/forwarder/all | jq
[
  {
    "local": ":55488",
    "remote": "192.168.127.2:22",
    "protocol": "tcp"
  }
]

$ ls -lah /tmp/podman-machine-ssh.sock
ls: /tmp/podman-machine-ssh.sock: No such file or directory

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 27, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: protosam

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rhatdan
Copy link
Member

rhatdan commented Dec 29, 2021

@baude PTAL

@ghost
Copy link
Author

ghost commented Dec 29, 2021

Hold off on approving this for the moment. Adding in unix-to-unix (over SSH) support in the forwarder service.

@ghost
Copy link
Author

ghost commented Jan 1, 2022

Added unix-to-unix/over-ssh feature. Ready for review now.

Example Usage

Proxy local unix socket to remote tcp port.

$ curl http://127.0.0.1:7777/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/podman-machine-ssh.sock","remote":"tcp://192.168.127.2:22"}'

Proxy local unix socket to remote unix socket over ssh proxy. (These are podman non-root and root sockets as example)

$ curl http://127.0.0.1:7777/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/podman-user.sock","remote":"ssh-tunnel://192.168.127.2/run/user/1000/podman/podman.sock?user=core&key=/your-home-dir/.ssh/podman-machine-default"}'
$ curl http://127.0.0.1:7777/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/podman-root.sock","remote":"ssh-tunnel://192.168.127.2/run/podman/podman.sock?user=root&key=/your-home-dir/.ssh/podman-machine-default"}'

Testing Disclosure

To be upfront about this, I did not test this as thoroughly as I would have liked due to limited time today. The code was quickly put together and my testing was limited to checking connectivity with a podman ps after updating ~/.config/containers/containers.conf . The unix-to-tcp test was just a simple ssh connection.

@ghost ghost changed the title unix socket to tcp proxy support added unix socket proxy support added Jan 1, 2022
}

// build an ssh client using sshConn
sshClient := ssh.NewClient(sshConn, chans, reqs)
Copy link
Member

@n1hility n1hility Jan 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First off thank you for contributing! I have some comments, but keep in mind I am just a contributor like you and not an official reviewer/merger

It looks like this approach is creating a new ssh connection per client. One of the benefits of the existing implementation is that, just like OpenSSH's forwarding behavior, the ssh connection is kept alive and reused for multiple connections on the forwarded unix socket. This allows us to avoid paying key exchange costs with every client connect, fitting expectations of existing docker api clients coded against a unix socket.

On this topic, in addition to your proposed dynamic registration addition, we have plans to add a ssh-forwarding only variant of gvproxy for windows that does not involve vsock (I hope to take a look at this sometime this week). Relatedly we've previously discussed it being a good idea to share the same ssh client code in the long term between podman and gvproxy to ensure consistent behavior and share bug fixes etc). Ideally, everything uses the same code. Can this be updated to reuse the existing (ssh forwarder) static impl (then you get stuff like the above for free)?

As a general note, IMO the static preconfigured ssh forwarding is desirable in the podman machine use-case (aside from the non-vsock windows need) since it can be established immediately without blocking on subsequent round trips (similar to the preconfigured routes capability).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @protosam and the review @n1hility!
I think both commits are useful. Can you add 2 end-to-end tests for them? You can find some examples in test/.

It looks like this approach is creating a new ssh connection per client.

Yes we can evolve it in the future and have a single or a pool of connections?

As a general note, IMO the static preconfigured ssh forwarding is desirable in the podman machine use-case (aside from the non-vsock windows need) since it can be established immediately without blocking on subsequent round trips (similar to the preconfigured routes capability).

gvproxy is really for podman right now, so we optimize for it. If you have other uses of this binary, @protosam, perhaps we can change it to have a configuration file or create a more generic binary?

I am happy to merge anything that can help podman machine (or crc).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update this further after I figure out how the tests should work. I'm working on fixing the multiple SSH connections behavior and the tests this week when I have some time.

Thanks for the code review @n1hility

I'm going to patch it a single ssh connection is used per forwarder in this PR. The behavior as noted is actually a bug, the connection should be scoped to the forwarder expose function and dial function should be only opening a connection if it detects it was closed.

Reusing the existing ssh forwarder implementation looks like it adds unnecessary overhead for an implementation that lives as a networking service this close to the networking stack. (Disclaimer, I might not know what I'm even talking about here, after reading the source for sshclient it's existence doesn't make sense to me)

It should be possible to create forwarders to re-create the same effect that we already have with the ssh-forwarder today.

@guillaumerose I have been borrowing a lot of code from this repository to build a version of the virtual network that acts more like a configurable service and the API is written to be more consumable. When I finish the ipv6 support I'll be pushing it to github. If you're interested I can shoot you an email to see what you think. What I need it for is a cross-platform container VM solution that I'm building, but it should drive more than just my own hobby project.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For IPv6, I have this PR opened with everything working (#59). I don't merge it because I don't like one hack I had to write :)

If you're interested I can shoot you an email to see what you think.

Yes, good idea! Thanks 👍

Copy link
Author

@ghost ghost Jan 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think my last push resolves this thread.

A single SSH connection is used and it will reconnect if EOF error is received from a read of 0 bytes on the underlying tcp connection.

Also added tests for unix2unix and unix2tcp forwarders.

@guillaumerose tool I'm working on pretty much just borrows your IPv6 WIP code. It looked mostly fine to me, aside from some hard coded instances of "fe80::1" in places. Once I have more work done on DHCP for v6, that PR is on my giveback list this year.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the new commits. A part from the lock, which I am not sure, it looks good.

@n1hility has more experience with ssh connections and keepalive. Can you take a look @n1hility ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sshClient connection has been moved into it's own anonymous function so the locking would only apply to the relevant tasks. Being surgical with the lock in any other way seems like it would get racy. The lock shouldn't be necessary for sshClient.Dial.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry just had a chance to take a look I agree @protosam that the nested stream dial should be safe since the ssh client does have a locking model around internal message handling. Although regrading the heartbeat point @guillaumerose brings up, I do think you want reconnection logic here since long running ssh connections can easily timeout from either ssh, networking, netfilter policy, or cycling sshd. This is where locking becomes important for this type of dispatch model since you need a happens-before edge around changes to the sshClient handle. The role of the bastion code in sshclient is to wrap the connection over multiple reconnects and handle changes . If dialing fails you can test the connection by sending a heartbeat (see the code in ssh_forwarder to see what I mean), and then tear down and rebuild the connection. Reestablishing (and the first establishment) might be rocky, so I recommend doing a similar backoff retry loop as well. One other thing you might want to look into when it comes to passing the dialed unix socket back to gvisor, is if the gvisor proxying logic manages half-close. The nested ssh streaming relies on a CloseWrite to gracefully terminate the client. I suspect gvisor does do this, but might be worth checking since otherwise you can get abrupt or delayed closes on interactive sessions. I have a local branch with the windows related changes I was referring to earlier, so when I finish that off tomorrow I can double check the gvisor thing if you dont get to it first.

@ghost
Copy link
Author

ghost commented Jan 8, 2022

Sorry about the branch conflict. I went to purge and redo my last commit because test/qcon.log managed to make it's way into the commit history in my feature branch.

@baude
Copy link
Member

baude commented Jan 13, 2022

@protosam can you rebase so we can merge?

guillaumerose added a commit that referenced this pull request Jan 14, 2022
#77 extract of unix-to-tcp part
@guillaumerose
Copy link
Contributor

guillaumerose commented Jan 17, 2022

I rebased the PR but I am wondering if we should perhaps introduce a real struct for port forwarding instead of relying on URL parsing.

type ExposeRequest struct {
	Local    string            `json:"local"`
	Remote   string            `json:"remote"`
	Protocol TransportProtocol `json:"protocol"`
}

Protocol is in fact the type of listener on the host. Perhaps we should have this:

type ExposeRequest struct {
	Local  Endpoint `json:"local"`
	Remote Endpoint `json:"remote"`
}

type Endpoint struct {
	TCP  *TCPEndpoint
	UDP  *UDPEndpoint
	Unix *UnixEndpoint
	SSH  *SSHEndpoint
}

type TCPEndpoint struct {
	IP   string
	Port string
}

type UDPEndpoint struct {
	IP   string
	Port string
}

type UnixEndpoint struct {
	Path string
}

type SSHEndpoint struct {
	IP         string
	Port       string
	User       string
	PrivateKey string
	PassPhrase string
	Path       string
}

type UnexposeRequest struct {
	Local Endpoint `json:"local"`
}

It is not really convenient when using curl :/ WDYT?

The most annyoing thing in the current proposition is that for unix protocol, the remote must be an URL.
For tcp and udp, it must be an IP+port.

@baude
Copy link
Member

baude commented Jan 17, 2022

@guillaumerose seems reasonable to use a struct

@ghost
Copy link
Author

ghost commented Jan 18, 2022

Been offline for a few days, sorry about the late reply.

In my hobby project, my forwarder heavily relies on url.URL{} from package net/url. It provides a lot of useful bits already that can be used for generic uri stuff.

Edit to expand a bit on this.

The forwarder seems like it needs to be flexible and may evolve over time, I don't really see a necessity to have struct or object definition for everything it does, but I do believe it should be possible to track everything it does.

Right now gvproxy uses protocol, local, and remote to handle forwarding. Using URIs in local and remote opens up the possibility of mixing protocols with some supporting code.

unix:///path/to/socket/on/local.sock (local unix socket listener)
tcp://ip.add.rr.ess:port (can be a listener or forwarded port)
udp://ip.add.rr.ess:port (can be a listener or forwarded port)
ssh-unix://192.168.127.X:22/path/to/socket/on/guest.sock  (proxied forwarding)
ssh-tcp://192.168.127.X:22?port=80 (proxied forwarding)
ssh-udp://192.168.127.X:22?port=53 (proxied forwarding)

In gvproxy, I'm not really sure if this is worth-while if the tool is just for podman, but it's the direction I'm going in my own endeavors.

Something else I've been tinkering with is reverse forwarding (not sure what to really call it), but the idea is that if the "local" value isn't the host, a listener should get setup on the guest to the resource.

@guillaumerose
Copy link
Contributor

Ok, let's go with URL.
We can add code for supporting old format (eg. Protocol + IP + Port) in case it is not a valid URL. We will be able to focus on 2 params then: remote and local.

@ghost
Copy link
Author

ghost commented Jan 19, 2022

2 things

  1. Should the code changes for the forwarder be handled in a separate PR?

  2. Just for awareness, I was on vacation working on stuff out of self interest. I'm back to my day-to-day stuff so I'll be likely committing time over weekends when I can to give back.

@guillaumerose
Copy link
Contributor

I took a look at podman code and they are not using the unix protocol so I think we are fine with this PR right now.
@n1hility should I merge? Any thoughts?

@n1hility
Copy link
Member

n1hility commented Jan 25, 2022

I took a look at podman code and they are not using the unix protocol so I think we are fine with this PR right now.
@n1hility should I merge? Any thoughts?

IMO the only loose ends are the ssh reconnect and availability looping. An example outcome of the former is that If the ssh connection times out the port will have to be removed and re-added to resolve the issue.

Some slight adaptation in the sshclient package would enable the existing logic here to be reused (basically just introducing a simpler dial variant of the AcceptAndTunnel routine).

If you are swamped @protosam I could suggest a small patch on top of this PR to add those bits.

Up to you @guillaumerose if you would rather merge and follow-up

@guillaumerose
Copy link
Contributor

If there is no emergency, let's wait for the small patch. 👍

@baude I already merged part of the feature set: unix to tcp forwarding. Is it enough for you right now? Do you need this PR also?

@ghost
Copy link
Author

ghost commented Jan 25, 2022

@n1hility I haven't caught up enough to circle back to this and adjacent efforts yet. Right now my belief is that that reconnect will passively happen with how it's written, but if I'm wrong, someone would need to show me the flaw in what I wrote.

Feel free to make suggestions or changes. It looks like the most important part of this PR was added in by @guillaumerose .

praveenkumar added a commit to praveenkumar/crc that referenced this pull request Feb 11, 2022
containers/gvisor-tap-vsock#77 have a way to use forward
api to expose the unix socket also with this it is easy to export the docker unix
socket from instance to host.
```
✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/docker.sock","remote":"ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa"}'
✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/all | jq .
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   315  100   315    0     0   521k      0 --:--:-- --:--:-- --:--:--  307k
[
  {
    "local": "/tmp/docker.sock",
    "remote": "ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa",
    "protocol": "unix"
  },
  {
    "local": "127.0.0.1:2222",
    "remote": "192.168.127.2:22",
    "protocol": "tcp"
  },
  {
    "local": "127.0.0.1:9090",
    "remote": "192.168.127.2:9090",
    "protocol": "tcp"
  }
]
➜  ~ export DOCKER_HOST=unix:///tmp/docker.sock
➜  ~ docker images
REPOSITORY                        TAG                                        IMAGE ID       CREATED        SIZE
quay.io/crcont/gvisor-tap-vsock   3231aba53905468c22e394493a0debc1a6cc6392   f42b05e6b25e   5 months ago   6.65MB
```
@praveenkumar
Copy link
Contributor

I tested this PR with crc on mac ( unix socket expose part) and it works without any issue, I even stop the instance and started it again even restart the sshd daemon to try to see if reconnect happen seamlessly and it does.

Testing notes:

✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/expose -X POST -d '{"protocol":"unix","local":"/tmp/docker.sock","remote":"ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa"}'
✗ curl --unix-socket ~/.crc/crc-http.sock http:/unix/network/services/forwarder/all | jq .
[
  {
    "local": "/tmp/docker.sock",
    "remote": "ssh-tunnel://192.168.127.2//var/run/docker.sock?user=core&key=/Users/prkumar/.crc/machines/crc/id_ecdsa",
    "protocol": "unix"
  },
  {
    "local": "127.0.0.1:2222",
    "remote": "192.168.127.2:22",
    "protocol": "tcp"
  },
  {
    "local": "127.0.0.1:9090",
    "remote": "192.168.127.2:9090",
    "protocol": "tcp"
  }
]
➜  ~ export DOCKER_HOST=unix:///tmp/docker.sock
➜  ~ docker images
REPOSITORY                        TAG                                        IMAGE ID       CREATED        SIZE
quay.io/crcont/gvisor-tap-vsock   3231aba53905468c22e394493a0debc1a6cc6392   f42b05e6b25e   5 months ago   6.65MB

@rhatdan
Copy link
Member

rhatdan commented Feb 11, 2022

@guillaumerose Is there still stuff blocking this PR?

@guillaumerose
Copy link
Contributor

Let's merge this if it helps crc. We can still open follow-up PRs after.

@guillaumerose guillaumerose merged commit 3e44a1c into containers:main Feb 11, 2022
@ghost ghost deleted the feature/unix-socket-tcp-proxy branch February 11, 2022 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants