-
Notifications
You must be signed in to change notification settings - Fork 982
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(search): Multishard cutoffs #1924
base: main
Are you sure you want to change the base?
Conversation
24b2025
to
fe7559a
Compare
My condolences to the one reviewing it 🙂 |
I actually wrote Vlad - Kostas 1-1 only to not submit the comment because it would have been noise. Anyway, when you wrote this I could not resist not to write this msg 😛 I would argue mine was simpler because it was mostly boilerplate (I haven't looked yours TBH so it could be the same case) so it's not really 1-1 and probably you are getting the 👑 👨🍳 |
fe7559a
to
c94d9ad
Compare
c94d9ad
to
30ea6b8
Compare
30ea6b8
to
883d326
Compare
Please take a quick look at this. There is still some polishment left but the general "structure" is ready. I'd wish for this to be merged before we announce search because it's important for fast multishard queries PS: Yes there is lots of code and it's complicated 🙂 I'd start looking from MultiShardSearch::Run because it's the main entry point w everything else being tailored to fit it's needs |
883d326
to
7755b92
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I skimmed through this PR, it looks good.
What is missing, except for that TODO around how many items to retrieve?
Also see some minor questions/nits
size_t avg_shard_min = hits * intlog2(hits) / (12 + shard_set->size() / 10); | ||
avg_shard_min -= min(avg_shard_min, min(hits, size_t(5))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the rationale here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experimentally from #1892 😄 Those formulas might not be final
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd link to that, at least until you change it :)
// If all shards have at least avg min, keep the bare minimum needed to cover the request | ||
size_t limit = requested / shards + 1; | ||
|
||
// Aggregations like SORTBY and KNN reorder the result and thus introduce some variance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we have to fetch all matching for SORTBY?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No/Yes 🤓 We have to fetch all, but not serialize all. That is what the doc reference is for. If we find out when building the reply that some entries should have been included but are not serialized, we'll refill (see BuildSortedOrder())
8fe00de
to
52856f4
Compare
Signed-off-by: Vladislav Oleshko <[email protected]>
Signed-off-by: Vladislav Oleshko <[email protected]>
* combine replication tests and reg tests in one flow * allow replication tests to run on arm
Signed-off-by: Roman Gershman <[email protected]>
…method (dragonflydb#3191) fixes dragonflydb#3190 Signed-off-by: Stepan Bagritsevich <[email protected]>
…ragonflydb#3173) * fix(cluster): Support `FLUSHALL` while slot migration is in progress Fixes dragonflydb#3132 Also do a small refactor to move cancellation logic into `RestoreStreamer`.
…db#3171) * fix: fix RegisterOnChange methods for journal and db_slice. Call db_slice and journal callbacks atomically. Made a hack to avoid deadlock during SAVE
…pes (dragonflydb#3181) * fix(generic_family): fix RenameGeneric command for non-string data types fixes dragonflydb#3107, fixes dragonflydb#3113, fixes dragonflydb#307 Signed-off-by: Stepan Bagritsevich <[email protected]> * refactor: address comments Signed-off-by: Stepan Bagritsevich <[email protected]> * refactor: address comments 2 Signed-off-by: Stepan Bagritsevich <[email protected]> --------- Signed-off-by: Stepan Bagritsevich <[email protected]>
…rt` (dragonflydb#3193) That was a misleading name, as the logic was the exact opposite (oops 🤦) This PR introduces a new name for the same flag: break_replication_on_master_restart We're keeping the previous flag for now, to make transition easier. We'll remove it in a later Dragonfly version (>= 1.22) Fixes dragonflydb#3192
fix unit tests Signed-off-by: adi_holden <[email protected]>
* add support for multiple passwords * add support for deleting passwords * add support for resetpass * add tests * always prefix passwords with hashtag when printed
* chore(tiering): More advanced tiering tests * fix: fixes
* add replica-priority flag * add it on info replication command * add test
Signed-off-by: Vladislav Oleshko <[email protected]>
Signed-off-by: Vladislav Oleshko <[email protected]>
52856f4
to
d4ee35d
Compare
Signed-off-by: Vladislav Oleshko <[email protected]>
No description provided.