Implement multi-threaded FastQ compression for umi-transfer #11

Merged: 19 commits, May 6, 2024

@MatthiasZepper MatthiasZepper commented Apr 14, 2024

umi-transfer 1.0.0 does its job, but as a single-threaded tool its performance is poor. I/O should be asynchronous to avoid blocking the per-core threads, and this pull request implements that for the output files using the gzp crate. Inspiration for how to incorporate it into the actual code was drawn from Crabz and Ouch, both of which are MIT-licensed tools as well. Thread pinning, although supported by gzp, has not been enabled because of the additional complexity involved.
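
For illustration only, the general idea of moving output work off the processing thread can be sketched with the Rust standard library alone. This is not the code from this pull request and it does not use gzp; `spawn_writer` and its `capacity` parameter are hypothetical names chosen for the example. gzp additionally compresses blocks on several threads, which this sketch leaves out.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Hypothetical helper (not part of umi-transfer or gzp): moves `sink` to a
/// background thread and returns a sender for byte chunks plus a join handle
/// that hands the sink back once all senders have been dropped.
fn spawn_writer<W: Write + Send + 'static>(
    mut sink: W,
    capacity: usize,
) -> (mpsc::SyncSender<Vec<u8>>, thread::JoinHandle<io::Result<W>>) {
    // A bounded channel applies back-pressure: the producer blocks only when
    // `capacity` chunks are already queued and waiting to be written.
    let (tx, rx) = mpsc::sync_channel::<Vec<u8>>(capacity);
    let handle = thread::spawn(move || {
        // The loop ends when every sender has been dropped.
        for chunk in rx {
            sink.write_all(&chunk)?;
        }
        sink.flush()?;
        Ok(sink)
    });
    (tx, handle)
}
```

A caller would send each formatted FastQ record through the returned sender, drop the sender when finished, and then join the handle to surface any write error. The processing thread then waits on output only when the queue is full.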

In a later PR, the input file logic will be slightly improved as well: not with multi-threading, but by switching from flate2 decompression to gzp, so that two libraries are not used for the same purpose.

PS: 😟 It seems I have to fix the CI first... that Action uses steps that no longer exist, which is being addressed in #12.

@alneberg alneberg left a comment

Seems to work for me and I think it looks good! Would be fun to do some benchmarking at some point as well.

@MatthiasZepper MatthiasZepper changed the base branch from main to dev April 29, 2024 16:52
@MatthiasZepper MatthiasZepper force-pushed the Improved_compression branch 2 times, most recently from 16e9988 to 62128f0 Compare May 2, 2024 20:06
@MatthiasZepper MatthiasZepper force-pushed the Improved_compression branch 2 times, most recently from 56c8c84 to b33ad60 Compare May 6, 2024 16:49
@MatthiasZepper MatthiasZepper force-pushed the Improved_compression branch from b33ad60 to 1a0df50 Compare May 6, 2024 16:58

codecov bot commented May 6, 2024

Welcome to Codecov 🎉

Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests.

Thanks for integrating Codecov - We've got you covered ☂️

@MatthiasZepper MatthiasZepper merged commit 0ca36af into SciLifeLab:dev May 6, 2024
5 checks passed
@MatthiasZepper MatthiasZepper deleted the Improved_compression branch May 15, 2024 14:27