Blind tunneling breaks with Google Chrome and TLS Kyber cipher #11758
Comments
I will add more detailed reproduction instructions. |
These are relevant debug logs that show what happens. caesars-with-kyber-fail.txt In the In Here one can see that after |
https://github.com/fsecure-kilkanen/kyber-handshaketest/blob/main/kyber-handshaketest.py is a simple Python script that sends a previously recorded TLS Client Hello packet with Kyber cipher to a chosen destination. It sends the client hello packet in two ways:
Both cases succeed when running directly against a web server. With ATS, the first case succeeds or fails depending on network conditions. The second case always breaks, because ATS always receives the Client Hello in two separate socket read calls. |
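For readers who want to reproduce the split delivery without the linked script, here is a minimal sketch. It assumes the two ways are "one write" versus "two writes separated by a pause", which is what the description of the second case suggests; the function names, the split offset and the delay are illustrative and are not taken from kyber-handshaketest.py.

```python
import socket
import time


def send_whole(sock: socket.socket, payload: bytes) -> None:
    """Send the recorded Client Hello with a single write."""
    sock.sendall(payload)


def send_split(sock: socket.socket, payload: bytes, split_at: int,
               delay_s: float = 0.5) -> None:
    """Send the recorded Client Hello as two writes separated by a pause.

    The pause is meant to make the receiver see the two parts in
    separate socket reads (hypothetical default of 0.5 s).
    """
    sock.sendall(payload[:split_at])
    time.sleep(delay_s)
    sock.sendall(payload[split_at:])
```

Against a real proxy one would open the connection with socket.create_connection((host, port)) and pass in the recorded Client Hello bytes; host, port and the recorded bytes are placeholders here.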
https://tldr.fail/ is a website that covers these classes of bugs. |
@bryancall Are there any plans to fix this bug? |
Hello, Tero. Thanks for the report. I recently started looking at PQTLS and tested ATS with tldr_fail_test.py. If I run ATS as a reverse proxy, there is no issue with a large Client Hello or a split Client Hello. However, as you pointed out, the Blind Tunnel case does not work with a split Client Hello.
I agree that we should read all packets that contain the Client Hello and forward them to the origin server as a tunnel.
There are several test cases around this code. If you open a PR, they should run automatically :) |
I have some questions about the implementation.
What is the reason for switching OpenSSL to read from the socket after reading the first packet?
My plan is to check the length of the handshake from the TLS Client Hello header, and then compare that to the amount of data that was read. If the whole handshake was not read, then a further read of the handshake needs to be triggered. How do I properly trigger a second read at this point using the async event system in ATS?
The handshake data is read into an IOBufferBlock: IOBufferBlock *b = this->handShakeBuffer->first_write_block(); If I do a second read into this IOBufferBlock, will the data be stored in a contiguous memory block? So that when I get
Thanks for the assistance. |
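The length check described in this comment can be sketched outside ATS as follows. This is only an illustration of the framing arithmetic, not ATS code: it assumes the Client Hello fits in a single TLS record (5-byte record header, then a handshake message whose first byte is the message type), and the function name is made up for the example.

```python
import struct
from typing import Optional

TLS_RECORD_HEADER_LEN = 5           # content type (1) + version (2) + length (2)
CONTENT_TYPE_HANDSHAKE = 0x16
HANDSHAKE_TYPE_CLIENT_HELLO = 0x01


def client_hello_missing_bytes(buf: bytes) -> Optional[int]:
    """Return how many bytes of the first TLS record are still missing.

    Returns None if even the record header is incomplete, and raises
    ValueError if the data does not look like a Client Hello record.
    Assumption: the whole Client Hello is carried in one record.
    """
    if len(buf) < TLS_RECORD_HEADER_LEN:
        return None
    content_type, _version, record_len = struct.unpack(
        "!BHH", buf[:TLS_RECORD_HEADER_LEN])
    if content_type != CONTENT_TYPE_HANDSHAKE:
        raise ValueError("not a TLS handshake record")
    if (len(buf) > TLS_RECORD_HEADER_LEN
            and buf[TLS_RECORD_HEADER_LEN] != HANDSHAKE_TYPE_CLIENT_HELLO):
        raise ValueError("handshake message is not a Client Hello")
    return max(0, TLS_RECORD_HEADER_LEN + record_len - len(buf))
```

A result greater than zero after the first read would be the signal that another read into the handshake buffer is needed before deciding how to handle the connection.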
Hi @shinrich I see that you have worked with this code path, could you give some input on my questions above? |
Executive summary
When a TLS Client Hello is split across two TCP segments and the segments reach TrafficServer in separate socket reads, the blind tunneling flow breaks.
Details
Use case
Trafficserver runs as a transparent forward proxy for TLS traffic. A custom plugin uses SSL_CERT_HOOK to determine destination hostname of the TLS connection.
If the destination is accepted, the connection is converted into a blind tunnel by calling TSVConnTunnel(sslvc) and TSVConnReenable(sslvc) in the hook callback function.
Affected versions
Reproduced with TrafficServer 8.x branch and TrafficServer 9.2.4. Issue should also exist in 10.x branch because affected code paths have not changed.
The issue
When a connection is started with the Kyber key exchange, the TLS Client Hello carries more key material than with earlier key exchange methods. With Kyber, the TLS Client Hello is 1700-2100 bytes in size.
Since the network MTU is usually 1500 bytes, the TLS Client Hello is split into two TCP segments. On a slow client connection, the gap between the arrival of the first and second segment of the Client Hello can be long enough to trigger the bug.
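The segment count follows from simple arithmetic, sketched below. The header sizes are assumptions (IPv4 and TCP headers of 20 bytes each, no options); with TCP options or IPv6 the usable payload per segment is smaller, which does not change the conclusion.

```python
import math


def tcp_segments_needed(payload_len: int, mtu: int = 1500,
                        ip_header: int = 20, tcp_header: int = 20) -> int:
    """Number of TCP segments needed to carry payload_len bytes.

    Assumes fixed 20-byte IPv4 and TCP headers without options.
    """
    mss = mtu - ip_header - tcp_header   # 1460 bytes with the defaults
    return math.ceil(payload_len / mss)


for size in (1700, 2100):
    print(size, tcp_segments_needed(size))
```

Both ends of the 1700-2100 byte range exceed a single 1460-byte segment and fit in two.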
https://github.com/apache/trafficserver/blob/master/src/iocore/net/SSLNetVConnection.cc#L423 is the place where the TLS Client Hello is read. This is a call that reads data from the socket of the incoming TCP connection. There is no guarantee that this call will return both segments of the Client Hello. If the data for the complete Client Hello is received here, the blind tunnel operation is successful.
However, if only the first segment of the Client Hello is received, then the following sequence of events happens:
https://github.com/apache/trafficserver/blob/master/src/iocore/net/SSLNetVConnection.cc#L1365 OpenSSL ssl_accept() is called. It returns SSL_ERROR_WANT_READ, which is returned as SSL_HANDSHAKE_WANT_READ.
https://github.com/apache/trafficserver/blob/master/src/iocore/net/SSLNetVConnection.cc#L661 is the branch taken in this case. Processing goes to update_rbio at https://github.com/apache/trafficserver/blob/master/src/iocore/net/SSLNetVConnection.cc#L529 with move_to_socket set to true.
In update_rbio, the OpenSSL BIO is set to the file descriptor of the connection, and the existing handshake buffers are deallocated.
Now, when the second segment of the Client Hello arrives, the blind tunnel branch https://github.com/apache/trafficserver/blob/master/src/iocore/net/SSLNetVConnection.cc#L624 is not taken, because this->handShakeReader is nullptr after it was deallocated previously in update_rbio.
To fix the issue, the case of SSL_HANDSHAKE_WANT_READ should be handled so that the second TCP segment of the handshake is read into the existing SSL handshake buffer, and then ssl_accept() is called with that buffer.
I have tried a couple of workarounds to do this. However, since I don't fully understand why the current implementation is like this, I am not confident about making big changes, because they might affect other use cases.