Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add connect_timeout inside set_keyspace_blocking #259

Closed
wants to merge 1 commit into from

Conversation

fruch
Copy link

@fruch fruch commented Oct 11, 2023

set_keyspace_blocking is called from places holding a lock, and in case that the connection is closed from the server side, it might hang forever.

using the connect_timeout on it to make sure it
won't hang forever.

Ref: #261

@fruch fruch requested a review from Lorak-mmk October 11, 2023 12:01
@fruch
Copy link
Author

fruch commented Oct 11, 2023

@Lorak-mmk I need you review on this one, at least it's pass the integration test.

And I think it's relatively harmless change

@Lorak-mmk
Copy link

It would be good to have a tests that executes the timeout path - but if it's too hard to write then I think we can skip it.

@fruch
Copy link
Author

fruch commented Oct 11, 2023

It would be good to have a tests that executes the timeout path - but if it's too hard to write then I think we can skip it.

we might be able to copy the dtest casing it, but it wasn't 100% of the time...

@fruch
Copy link
Author

fruch commented Oct 17, 2023

It would be good to have a tests that executes the timeout path - but if it's too hard to write then I think we can skip it.

it might be a bit tricky, and we are still trying to find the root cause of that issue,
so once fixed, that test won't be covering it anymore.

and simulated the situation which we don't know it's exact root case, it kind of futile.

`set_keyspace_blocking` is called from places holding
a lock, and in case that the connection is closed from
the server side, it might hang forever.

using the `connect_timeout` on it to make sure it
won't hang forever.

Ref: https://github.com/scylladb/scylladb/issues/15661
@fruch fruch force-pushed the fix_shard_aware_race_condition branch from 6c8bd51 to b756779 Compare November 9, 2023 23:39
@Lorak-mmk
Copy link

Lorak-mmk commented Jan 2, 2024

This is actually a backwards-incompatible change, the behavior of the function is different and there might be users depending on current behavior.
I didn't look at the code, but wouldn't it be possible to solve this issue in wait_for_response? Intuitively, I think it should throw an exception when connection is closed.

@fruch
Copy link
Author

fruch commented Jan 3, 2024

This is actually a backwards-incompatible change, the behavior of the function is different and there might be users depending on current behavior.
I didn't look at the code, but wouldn't it be possible to solve this issue in wait_for_response? Intuitively, I think it should throw an exception when connection is closed.

That's what one would expect, but some when it works with TLS behaves differently and no part identifies the connection is closed, when TLS is the lower version it doesn't hang there.

We can have higher number, but having queries with no timeout at all seems like a recipe for trouble, especially when there's a lock involved

@dkropachev
Copy link
Collaborator

Closed in favor of #362

@dkropachev dkropachev closed this Aug 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants