Stolon proxy breaks connection even when --stop-listening=false #859

Closed
wildermesser opened this issue Jan 14, 2022 · 5 comments

Comments

@wildermesser

wildermesser commented Jan 14, 2022

What happened:
Recently I reproduced issue #674. I am using a slow etcd, so requests to the Kubernetes API server sometimes exceed the 10s limit. When that happens, stolon-proxy drops client connections, as expected. In the logs I saw "check timeout timer fired" followed by "Stopping listening".
To avoid this, I added --stop-listening=false to the stolon-proxy args. Now, when requests are slow, I get:

2022-01-13T05:13:54.881Z	INFO	cmd/proxy.go:346	check function error	{"error": "cannot get cluster data: failed to get latest version of configmap: Get \"https://10.233.0.1:443/api/v1/namespaces/ptaf-stolon/configmaps/stolon-cluster-kube-stolon?timeout=10s\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)"}
2022-01-13T05:13:55.119Z	INFO	cmd/proxy.go:304	check timeout timer fired

This time there is no "Stopping listening" message. But at that moment my Python application still gets a connection reset:

connection was closed in the middle of operation
ConnectionResetError: [Errno 104] Connection reset by peer
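
As an aside, a client can be made tolerant of resets like the one above. The following is a minimal sketch, not something taken from the stolon project: `run_with_retry` and the `operation` callable are hypothetical names, and the sketch assumes `operation` opens a fresh connection through stolon-proxy each time it is called. Retrying is only safe for idempotent work, since a statement interrupted mid-flight may or may not have been applied.

```python
import time


def run_with_retry(operation, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a dropped connection, wait and try again.

    `operation` is a hypothetical callable that opens a new connection
    through stolon-proxy and runs one idempotent unit of work.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionResetError, ConnectionAbortedError):
            if attempt == attempts:
                raise  # out of retries: surface the original error
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter exists only so the backoff can be replaced in tests; in normal use the default `time.sleep` applies.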

What you expected to happen:
When --stop-listening=false is passed as an argument to stolon-proxy, connections stay alive despite store backend issues.

Environment:

  • Stolon version:
stolon@stolon-proxy-779f67f5d8-smzbh:/$ stolon-proxy --version
stolon-proxy version 3bb7499f815f77140551eb762b200cf4557f57d3
  • Stolon running environment: k8s 1.20.11, stolon store backend - kubernetes configmap
@sgotti
Member

sgotti commented Jan 14, 2022

@wildermesser This is the right behavior to maintain data consistency. If the proxy cannot know the cluster state for a specified interval it'll close connections (see https://github.com/sorintlab/stolon/blob/master/doc/faq.md#why-clients-should-use-the-stolon-proxy).

The right solutions are:

> What you expected to happen:
> When --stop-listening=false is passed as an argument to stolon-proxy, connections stay alive despite store backend issues.

With or without --stop-listening, the proxy always closes connections when it cannot determine the cluster state.
--stop-listening only changes the "listening" behavior: either keep the TCP listener open (while still dropping connections) or close it as well. This option is useful only for accommodating different load balancer behaviors.
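
The distinction described above can be illustrated with a toy model. This is only a sketch of the behavior as explained in this comment, not stolon's actual implementation: the class `ProxyModel`, its attributes, and the 15-second value used in the usage note are all made up for illustration. The model also assumes that a later successful check restores normal operation, which is the intended behavior rather than a guarantee about any particular stolon version.

```python
class ProxyModel:
    """Toy model of the behaviour described in this thread; not stolon's code."""

    def __init__(self, proxy_timeout, stop_listening=True):
        self.proxy_timeout = proxy_timeout    # seconds allowed without a successful check
        self.stop_listening = stop_listening  # mirrors the --stop-listening flag
        self.last_ok = 0.0                    # time of the last successful check
        self.listening = True                 # is the TCP listener open?
        self.proxying = True                  # False means client connections are dropped

    def on_check(self, now, ok):
        """Record one cluster-data check performed at time `now` (seconds)."""
        if ok:
            # Assumption: a successful check restores normal operation.
            self.last_ok = now
            self.listening = True
            self.proxying = True
        elif now - self.last_ok >= self.proxy_timeout:
            # Timeout fired: connections are closed regardless of the flag.
            self.proxying = False
            # Only the fate of the listener depends on the flag.
            if self.stop_listening:
                self.listening = False
```

For example, with `ProxyModel(proxy_timeout=15, stop_listening=False)`, a failed check at `now=20` leaves `listening` set to `True` but sets `proxying` to `False`, which matches the connection resets reported above.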

@sgotti sgotti closed this as completed Jan 14, 2022
@sgotti sgotti removed the bug label Jan 14, 2022
@wildermesser
Author

@sgotti thank you for explaining!

@nh2
Contributor

nh2 commented Sep 12, 2022

> Increase the proxy timeout.

@sgotti which setting is that? Is it --tcp-keepalive-count?

Also, what happens after check timeout timer fired + Stopping listening occurred, should the proxy ever reconnect/recover? If yes, does that depend on what --stop-listening is set to?

@nh2
Contributor

nh2 commented Sep 12, 2022

> Increase the proxy timeout.
>
> @sgotti which setting is that? Is it --tcp-keepalive-count?

Ah, it's probably proxyTimeout from the Cluster Specification.

So only the other recovery question remains.
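
If proxyTimeout is indeed the relevant setting, raising it would mean patching the cluster specification. The sketch below only builds such a patch and prints a possible command line. Everything specific here is an assumption: the "30s" value is arbitrary, the cluster name kube-stolon is inferred from the configmap name in the log above, the store backend flags a real invocation would need are omitted, and the `stolonctl update --patch` form should be checked against the stolon documentation for the version in use.

```python
import json
import shlex


def proxy_timeout_patch(timeout="30s"):
    """Return a JSON patch that sets proxyTimeout in the cluster specification."""
    return json.dumps({"proxyTimeout": timeout})


# Assumed invocation; the cluster name is a guess and store flags are omitted.
command = [
    "stolonctl",
    "--cluster-name=kube-stolon",
    "update",
    "--patch",
    proxy_timeout_patch("30s"),
]
print(shlex.join(command))
```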

@nh2
Contributor

nh2 commented Apr 13, 2023

> Also, what happens after check timeout timer fired + Stopping listening occurred, should the proxy ever reconnect/recover?

I think I found the answer: "It should recover, but does not, due to this bug": #888 (comment)

PR to work around the issue: #907
