Background
I'm using Demultiplexer() to demultiplex nanopore reads.
This works well, but when allowing more errors in the barcodes, the time to generate the demultiplexer grows very fast.
Current Behavior
Each additional allowed error increases the construction time and the number of allocations by more than a factor of 10 (roughly 12x to 26x in time in the runs below).
Desired Behavior
It would be great if constructing the demultiplexer were faster for larger values of n_max_errors.
Steps to reproduce
julia> using BioSequences
julia> @time Demultiplexer(LongDNASeq.(["GGAGAAGAAGAAGAA"]), n_max_errors=1, distance=:hamming)
0.000388 seconds (1.56 k allocations: 162.047 KiB)
Demultiplexer{LongSequence{DNAAlphabet{4}}}:
distance: hamming
number of barcodes: 1
number of correctable errors: 1
julia> @time Demultiplexer(LongDNASeq.(["GGAGAAGAAGAAGAA"]), n_max_errors=2, distance=:hamming)
0.010063 seconds (50.47 k allocations: 3.590 MiB)
Demultiplexer{LongSequence{DNAAlphabet{4}}}:
distance: hamming
number of barcodes: 1
number of correctable errors: 2
julia> @time Demultiplexer(LongDNASeq.(["GGAGAAGAAGAAGAA"]), n_max_errors=3, distance=:hamming)
0.193055 seconds (1.08 M allocations: 58.884 MiB)
Demultiplexer{LongSequence{DNAAlphabet{4}}}:
distance: hamming
number of barcodes: 1
number of correctable errors: 3
julia> @time Demultiplexer(LongDNASeq.(["GGAGAAGAAGAAGAA"]), n_max_errors=4, distance=:hamming)
3.394650 seconds (15.94 M allocations: 734.229 MiB, 10.49% gc time)
Demultiplexer{LongSequence{DNAAlphabet{4}}}:
distance: hamming
number of barcodes: 1
number of correctable errors: 4
julia> @time Demultiplexer(LongDNASeq.(["GGAGAAGAAGAAGAA"]), n_max_errors=5, distance=:hamming)
39.984839 seconds (169.53 M allocations: 7.118 GiB, 9.05% gc time)
Demultiplexer{LongSequence{DNAAlphabet{4}}}:
distance: hamming
number of barcodes: 1
number of correctable errors: 5
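For context, the timings above grow at roughly the same rate as the number of sequences within the allowed Hamming distance of the barcode. The sketch below only counts those sequences. It assumes, without my having confirmed it in the source, that the constructor precomputes every erroneous variant of each barcode. hamming_ball_size is a hypothetical helper written for this illustration and is not part of BioSequences.

```julia
# Hypothetical helper (not a BioSequences function): number of sequences
# within Hamming distance m of a barcode of length n over a 4-letter
# alphabet. Choose k mismatch positions, each with 3 alternative bases.
hamming_ball_size(n, m) = sum(binomial(n, k) * 3^k for k in 0:m)

# The barcode in this report has 15 bases.
for m in 1:5
    println("n_max_errors = ", m, ": ", hamming_ball_size(15, m), " sequences")
end
```

For a 15-base barcode this gives 46, 991, 13276, 123841 and 853570 sequences for 1 to 5 errors, so each extra error multiplies the count by roughly 7 to 22, which is in the same range as the measured slowdowns.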
My Environment
julia> versioninfo()
Julia Version 1.4.0
Commit b8e9a9ecc6 (2020-03-21 16:36 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-8.0.1 (ORCJIT, sandybridge)
julia> Pkg.status("BioSequences")
Status `~/.julia/environments/v1.4/Project.toml`
[7e6ae17a] BioSequences v2.0.1