Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BlockJoinBulkScorer could check for parent deletions (not children) #14067

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Dec 15, 2024

This change ensures that BlockJoinBulkScorer verifies deletions at the parent level.

This change ensures that BlockJoinBulkScorer verifies deletions at the parent level.
@jimczi jimczi requested a review from jpountz December 15, 2024 09:08
@jpountz
Copy link
Contributor

jpountz commented Dec 15, 2024

Block-joins require deletes to be consistent across parent and child documents. So I wouldn't call it a bug, the implementation should be free to check deletes on child documents or parent documents depending on whichever is more convenient/efficient.

@jimczi jimczi changed the title BlockJoinBulkScorer must check for parent deletions (not children) BlockJoinBulkScorer could check for parent deletions (not children) Dec 16, 2024
@ChrisHegarty
Copy link
Contributor

Block-joins require deletes to be consistent across parent and child documents.

This seems a little fragile, and place where bugs could hide. I wonder if we should assert this constraint, which would at least catch some of these potential issue usage issues, while not hurt run time performance (too much).

@jpountz
Copy link
Contributor

jpountz commented Dec 16, 2024

Indeed. We have a few such assertions already, e.g. to check that docs returned by the child query don't match the parent bitset. Adding a few more would be good if it can be done in a reasonable fashion (e.g. not forcing nextDoc() / advance() to be called just for the sake of validation).

The block-join module also has CheckJoinIndex, which is expected to be used in a more offline fashion and will catch such bugs.

for (int child = prevParentDoc + 1; child != parentDoc; child++) {
final boolean childIsLive = liveDocs.get(child);
if (parentIsLive != childIsLive) {
if (parentIsLive) {
throw new IllegalStateException(
"Parent doc "
+ parentDoc
+ " of segment "
+ context.reader()
+ " is live but has a deleted child document "
+ child);
} else {
throw new IllegalStateException(
"Parent doc "
+ parentDoc
+ " of segment "
+ context.reader()
+ " is deleted but has a live child document "
+ child);
}
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants