Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: TopNRowNumber::getOutputFromMemory loop #11440

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

aditi-pandit
Copy link
Collaborator

@aditi-pandit aditi-pandit commented Nov 5, 2024

The getOutputFromMemory loop first output the remainder of rows (if any) from the current partition. If there was still space on the output buffer, then it went into a loop trying to add as many partitions (some with all rows in entirety and a last partial one possibly). The code is simplified to just a loop that carries forward from the current partition to fill output rows.

The priority queue in the TopNRowNumber partitions pops rows in reverse order of row number. The old code maintained a remainingRowsInPartition_ variable that was used in some odd ways to compute the 'start' rowNumber for each output block from a partition. This is not needed. The partition.rows.size() can be used simply to get the correct row numbers.

Remove the un-necessary currentPartition() function as a result.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 5, 2024
Copy link

netlify bot commented Nov 5, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 9e32bd0
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/6776d74472096200085ec84a

@aditi-pandit aditi-pandit force-pushed the topn_output branch 3 times, most recently from 6d65173 to af1d2b1 Compare November 5, 2024 18:47
@aditi-pandit aditi-pandit changed the title Refactor TopNRowNumber::getOutputFromMemory Refactor TopNRowNumber::getOutputFromMemory loop Nov 5, 2024
@aditi-pandit aditi-pandit requested a review from pedroerp November 6, 2024 04:28
@pedroerp
Copy link
Contributor

It would be nice to get @mbasmanova or @xiaoxmeng review on this (I'm not very familiar with this operator).

@aditi-pandit aditi-pandit changed the title Refactor TopNRowNumber::getOutputFromMemory loop [refactor] TopNRowNumber::getOutputFromMemory loop Nov 15, 2024
@aditi-pandit aditi-pandit changed the title [refactor] TopNRowNumber::getOutputFromMemory loop refactor: TopNRowNumber::getOutputFromMemory loop Nov 15, 2024
@aditi-pandit
Copy link
Collaborator Author

@xiaoxmeng : Hi Meng, Pinging for review. Please let me know if you have any questions about this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants