fix: panic on cluster redirection to a node with stale role #695

Merged: rueian merged 2 commits into main from fix-cluster-redirection-panic-on-stale-role on Dec 12, 2024

@rueian (Collaborator) commented Dec 12, 2024

Fixes #694

The panic occurred when a cluster node had been promoted to primary and the client followed a MOVED redirection to it while still treating the node as a replica. The client then stored the connection in rslots, which was not yet initialized. This PR changes two things:

  1. Remove the connrole.replica field because, in the future, a node can have mixed roles. See valkey-io/valkey#1372, "[NEW] Primary replica role at the slot level".
  2. Trust the MOVED redirection: it always points to the primary of the slot, so the client should store the connection in pslots directly.
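
The two changes above can be illustrated with a minimal sketch. It is not the actual code in cluster.go: the `cluster` and `conn` types, the `parseMoved` helper, and both `redirect*` functions are hypothetical stand-ins, and the only assumption carried over from the description is that rslots may still be unallocated when a MOVED reply arrives.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// conn is a hypothetical stand-in for a connection to one cluster node.
type conn struct{ addr string }

// cluster is a simplified, hypothetical slot table.
// pslots always exists; rslots is assumed to stay nil until it is initialized.
type cluster struct {
	pslots [16384]*conn
	rslots []*conn
}

// parseMoved extracts the slot and address from a "MOVED <slot> <host:port>" error.
func parseMoved(msg string) (uint16, string, error) {
	parts := strings.Fields(msg)
	if len(parts) != 3 || parts[0] != "MOVED" {
		return 0, "", fmt.Errorf("not a MOVED reply: %q", msg)
	}
	slot, err := strconv.ParseUint(parts[1], 10, 16)
	if err != nil {
		return 0, "", err
	}
	return uint16(slot), parts[2], nil
}

// redirectOld mimics the pre-fix idea: consult a cached (possibly stale) role.
func (c *cluster) redirectOld(slot uint16, cc *conn, cachedReplica bool) {
	if cachedReplica {
		c.rslots[slot] = cc // index out of range when rslots is nil
		return
	}
	c.pslots[slot] = cc
}

// redirectNew mimics the fix: MOVED names the slot's primary, so use pslots.
func (c *cluster) redirectNew(slot uint16, cc *conn) {
	c.pslots[slot] = cc
}

// panicsOld reports whether redirectOld panics with a stale replica role.
func panicsOld(c *cluster, slot uint16, cc *conn) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	c.redirectOld(slot, cc, true)
	return false
}

func main() {
	slot, addr, err := parseMoved("MOVED 3999 127.0.0.1:6381")
	if err != nil {
		panic(err)
	}
	c := &cluster{}
	cc := &conn{addr: addr}
	fmt.Println("old path panicked:", panicsOld(c, slot, cc))
	c.redirectNew(slot, cc)
	fmt.Println("slot", slot, "->", c.pslots[slot].addr)
}
```

Dropping the cached role removes the only branch that could touch rslots during a redirection, which is why the stale-role case can no longer panic in this sketch.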

Hi @proost, could you help review this?

@rueian rueian marked this pull request as ready for review December 12, 2024 02:56
@rueian rueian force-pushed the fix-cluster-redirection-panic-on-stale-role branch 3 times, most recently from c42a54a to 084671c Compare December 12, 2024 04:38
@rueian rueian force-pushed the fix-cluster-redirection-panic-on-stale-role branch from 084671c to d29cbfd Compare December 12, 2024 04:41
@proost (Contributor) left a comment

Thank you!

  1. Mixed roles are quite surprising to me.
  2. We should trust MOVED, and setting the conn in pslots is okay because of the lazy refresh.

cluster.go: review thread resolved
@rueian rueian merged commit a31e941 into main Dec 12, 2024
60 checks passed
Linked issue: Observed panics during cluster topology change