Intronic/'upstream' last exons - expand quantification regions to upstream penultimate exon/ shared region & add decoy transcripts #36

SamBryce-Smith · 2022-10-13T15:37:18Z

For gene-body internal last exons, only the uniquely exonic region of each last exon is currently passed to Salmon for quantification. E.g. for a bleedthrough event, the annotated internal exon is subtracted from the last exon and only the region unique to the last exon is quantified.
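
To make the current behaviour concrete, here is a minimal sketch of the subtraction step. The coordinates, the half-open interval convention and the function name `subtract_interval` are illustrative assumptions, not the pipeline's actual code.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome/strand


def subtract_interval(region: Interval, mask: Interval) -> List[Interval]:
    """Return the parts of `region` that are not covered by `mask`."""
    r_start, r_end = region
    m_start, m_end = mask
    # No overlap: the region is returned untouched.
    if m_end <= r_start or m_start >= r_end:
        return [region]
    pieces = []
    # Keep whatever lies to the left of the mask...
    if m_start > r_start:
        pieces.append((r_start, m_start))
    # ...and whatever lies to the right of it.
    if m_end < r_end:
        pieces.append((m_end, r_end))
    return pieces


# Hypothetical plus-strand bleedthrough event: the last exon starts at the
# annotated internal exon and extends past its 3' end.
last_exon = (1000, 1600)
internal_exon = (1000, 1200)
unique_region = subtract_interval(last_exon, internal_exon)
print(unique_region)  # [(1200, 1600)]
```

Only the 400 nt unique stretch would be quantified in this example, even though the last exon spans 600 nt.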

Whilst simple, it has a few drawbacks:

It shortens the region over which reads are counted. This will reduce power to detect DU of short last exons.
It doesn't take into account other processing decisions that can happen at that region, e.g. intron retention. All reads aligned to that region will be assigned to the last exon, which in some cases will be erroneous.
Salmon requires both reads of a fragment/pair to be compatible with the transcript in order for them to be used for quantification. This means rightmost (+ strand) / leftmost (- strand) alignments of mates which fall at the start of the last exon will not be considered for quantification, because their mate will not align to the last exon. See the screenshot below: the first few blue alignments are rightmost in their pairs and just overlap the start of the last exon.
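
The third drawback can be illustrated with a toy containment check. This is a deliberate simplification, not Salmon's actual mapping logic: it assumes a fragment only counts if both mates lie fully inside the quantified region, and all coordinates and function names are made up for illustration.

```python
from typing import Tuple

Interval = Tuple[int, int]  # half-open [start, end), plus strand assumed


def contained(read: Interval, region: Interval) -> bool:
    """True if the read alignment lies entirely within the region."""
    return region[0] <= read[0] and read[1] <= region[1]


def fragment_compatible(mate1: Interval, mate2: Interval, region: Interval) -> bool:
    """Toy rule: both mates must be contained in the quantified region."""
    return contained(mate1, region) and contained(mate2, region)


# Hypothetical bleedthrough event: only the unique part is quantified today.
unique_region = (1200, 1600)
# Same event, with the region expanded to include the upstream exon.
expanded_region = (900, 1600)

# Leftmost mate sits in the upstream exon; rightmost mate just overlaps
# the start of the unique region.
left_mate = (1050, 1125)
right_mate = (1180, 1255)

print(fragment_compatible(left_mate, right_mate, unique_region))    # False
print(fragment_compatible(left_mate, right_mate, expanded_region))  # True
```

Under this toy rule the fragment is lost when only the unique region is quantified, and kept once the region also covers the upstream exon.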

One way to get around this (and to avoid intron retention being mis-assigned to the last exon) is to:

Expand the last exons to the upstream annotated exon (allowing more fragments to be aligned to the tx).
Add 'decoy transcripts' which include all permutations of processing decisions that do not involve using the last exon. E.g. for the CNPY example above, that would be splicing out from the shared last exon to the downstream exon, and retention of the intron.
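
The two steps above could be sketched roughly as follows. This assumes a plus-strand event described by three intervals (upstream exon, last exon, downstream exon); the function name `build_models` and the model labels are hypothetical, and real events may need more decoy permutations than the two shown.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), plus strand assumed


def merge_touching(blocks: List[Interval]) -> List[Interval]:
    """Merge sorted blocks that overlap or touch into single blocks."""
    merged = [blocks[0]]
    for start, end in blocks[1:]:
        prev_start, prev_end = merged[-1]
        if start <= prev_end:
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def build_models(
    upstream_exon: Interval, last_exon: Interval, downstream_exon: Interval
) -> Dict[str, List[Interval]]:
    """Return exon blocks for the expanded last-exon model and two decoys."""
    return {
        # Step 1: last exon expanded to include the upstream annotated exon.
        "last_exon_expanded": merge_touching([upstream_exon, last_exon]),
        # Step 2a: decoy where the intron is spliced out, skipping the last exon.
        "decoy_spliced": [upstream_exon, downstream_exon],
        # Step 2b: decoy where the whole intron is retained.
        "decoy_intron_retention": [(upstream_exon[0], downstream_exon[1])],
    }


# Hypothetical bleedthrough event: the last exon overlaps the upstream exon.
models = build_models((900, 1200), (1000, 1600), (3000, 3200))
for name, blocks in models.items():
    print(name, blocks)
```

For a spliced intronic last exon (no overlap with the upstream exon) the expanded model simply keeps two blocks, e.g. `build_models((900, 1000), (1500, 1700), (3000, 3200))["last_exon_expanded"]` gives `[(900, 1000), (1500, 1700)]`.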

SamBryce-Smith · 2022-10-13T15:52:08Z

Take a look at the --recoverOrphans option: if it uses the genome decoys to search upstream of the last exon, it may be able to recover some of these alignments?

Even if so, it would still be useful to implement this feature to handle alternative processing decisions, especially intron retention.

SamBryce-Smith added the "enhancement" (New feature or request) and "high priority" (This should be worked asap) labels · Oct 13, 2022
