Releases: nf-core/rnaseq
nf-core/rnaseq version 1.4 "Gray Crocus Dachshund"
Major novel changes include:
- Support for Salmon as an alternative method to STAR and HISAT2
- Several improvements in `featureCounts` handling of types other than `exon`. It is possible now to handle nuclear RNA-seq data. Nuclear RNA has un-spliced RNA, and the whole transcript, including the introns, needs to be counted, e.g. by specifying `--fc_count_type transcript`.
- Support for outputting unaligned data to results folders.
- Added options to skip several steps
  - Skip trimming using `--skipTrimming`
  - Skip BiotypeQC using `--skipBiotypeQC`
  - Skip Alignment using `--skipAlignment` to only use pseudo-alignment using Salmon
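
To illustrate why the counted feature type matters for nuclear RNA-seq data, here is a minimal Python sketch. It is not how featureCounts works internally: the toy coordinates, the reads, and the helper names `overlaps` and `count_reads` are all assumptions invented for illustration. It only shows that reads falling inside an intron are missed when counting over exons but captured when counting over the whole transcript span.

```python
# Toy model: one transcript with two exons separated by an intron.
# All coordinates and reads below are invented for illustration.
EXONS = [(100, 200), (400, 500)]   # half-open [start, end) intervals
TRANSCRIPT = [(100, 500)]          # full span, introns included

# Each read is a (start, end) interval on the same chromosome.
READS = [
    (110, 160),  # inside exon 1
    (250, 300),  # inside the intron (expected for unspliced nuclear RNA)
    (320, 370),  # inside the intron
    (420, 470),  # inside exon 2
    (600, 650),  # outside the gene
]


def overlaps(read, feature):
    """Return True if two half-open intervals share at least one base."""
    return read[0] < feature[1] and feature[0] < read[1]


def count_reads(reads, features):
    """Count reads overlapping any of the given features (each read once)."""
    return sum(any(overlaps(r, f) for f in features) for r in reads)


exon_count = count_reads(READS, EXONS)
transcript_count = count_reads(READS, TRANSCRIPT)
print(f"exon-level count: {exon_count}")
print(f"transcript-level count: {transcript_count}")
```

In this toy case the two intronic reads are only counted at the transcript level, which is the situation the `--fc_count_type transcript` setting mentioned above is meant to address.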
Documentation updates
- Adjust wording of skipped samples in pipeline output
- Fixed link to guidelines #203
- Add `Citation` and `Quick Start` sections to `README.md`
- Add documentation of the `--gff` parameter
Reporting Updates
- Generate MultiQC plots in the results directory #200
- Get MultiQC to save plots as standalone files
- Get MultiQC to write out the software versions in a `.csv` file #185
- Use `file` instead of `new File` to create `pipeline_report.{html,txt}` files, and properly create subfolders
Pipeline enhancements & fixes
- Restore `SummarizedExperiment` object creation in the salmon_merge process without memory usage increasing with sample size.
- Fix sample names in feature counts and dupRadar to remove suffixes added in other processes
- Removed `genebody_coverage` process #195
- Implemented Pearson's correlation instead of Euclidean distance #146
- Add `--stringTieIgnoreGTF` parameter #206
- Removed unused `stringtie` channels for `MultiQC`
- Integrate changes in `nf-core/tools v1.6` template which resolved #90
- Moved process `convertGFFtoGTF` before `makeSTARindex` #215
- Change all boolean parameters from `snake_case` to `camelCase` and vice versa for value parameters
- Add SM ReadGroup info for QualiMap compatibility #238
- Obtain edgeR + dupRadar version information #198 and #112
- Add `--gencode` option for compatibility of Salmon and featureCounts biotypes with GENCODE gene annotations
- Added functionality to accept compressed reference data in the pipeline
- Check that gtf features are on chromosomes that exist in the genome fasta file #274
- Maintain all gff features upon gtf conversion (keeps `gene_biotype` or `gene_type` to make `featureCounts` happy)
- Add SortMeRNA as an optional step to allow rRNA removal #280
- Minimal adjustment of memory and CPU constraints for clusters with locked memory / CPU relation
- Cleaned up usage, `parameters.settings.json` and the `nextflow.config`
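
The switch from Euclidean distance to Pearson's correlation listed above can be illustrated with a small Python sketch. The changelog does not state the motivation, so the following is an assumption: one common reason is that Pearson's correlation compares the shape of two expression profiles and is unaffected by overall scale (such as sequencing depth), while Euclidean distance is not. The sample values and function names are hypothetical and are not taken from the pipeline code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical per-gene counts for three samples.
sample_a = [10.0, 20.0, 30.0, 40.0]
sample_b = [20.0, 40.0, 60.0, 80.0]  # same profile as a, twice the depth
sample_c = [40.0, 30.0, 20.0, 10.0]  # reversed profile, same depth as a

for name, other in (("b", sample_b), ("c", sample_c)):
    print(
        f"a vs {name}: pearson={pearson(sample_a, other):.2f}, "
        f"euclidean={euclidean(sample_a, other):.2f}"
    )
```

With these toy values, Euclidean distance places the reversed profile `sample_c` closer to `sample_a` than the proportional profile `sample_b`, whereas Pearson's correlation treats `sample_b` as perfectly correlated with `sample_a`.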
Dependency Updates
- Dependency list is now sorted appropriately
- Force matplotlib=3.0.3
Updated Packages
- Picard 2.20.0 -> 2.21.1
- bioconductor-dupradar 1.12.1 -> 1.14.0
- bioconductor-edger 3.24.3 -> 3.26.5
- gffread 0.9.12 -> 0.11.4
- trim-galore 0.6.1 -> 0.6.4
- rseqc 3.0.0 -> 3.0.1
- R-Base 3.5 -> 3.6.1
Added / Removed Packages
nf-core/rnaseq version 1.3
Pipeline Updates
- Added configurable options to specify group attributes for featureCounts #144
- Added support for RSeqC 3.0 #148
- Added a `parameters.settings.json` file for use with the new `nf-core launch` helper tool.
- Centralized all configuration profiles using nf-core/configs
- Fixed all centralized configs for offline usage
- Hide %dup in multiqc report
Bug fixes
- Fixing HISAT2 Index Building for large reference genomes #153
- Fixing HISAT2 BAM sorting using more memory than available on the system
- Fixing MarkDuplicates memory consumption issues following #179
Dependency Updates
- RSeQC 2.6.4 -> 3.0.0
- Picard 2.18.15 -> 2.18.23
- r-data.table 1.11.4 -> 1.12.0
- r-markdown 0.8 -> 0.9
- csvtk 0.15.0 -> 0.17.0
- stringtie 1.3.4 -> 1.3.5
- subread 1.6.2 -> 1.6.4
- gffread 0.9.9 -> 0.9.12
- multiqc 1.6 -> 1.7
nf-core/rnaseq version 1.2
Pipeline updates
- Removed some outdated documentation about non-existent features
- Config refactoring and code cleaning
- Added a `--fcExtraAttributes` option to specify more than ENSEMBL gene names in `featureCounts`
- Remove legacy rseqc `strandRule` config code. #119
- Added STRINGTIE ballgown output to results folder #125
- HiSAT index build now requests `200GB` memory, enough to use the exons / splice junction option for building.
  - Added documentation about the `--hisatBuildMemory` option.
- BAM indices are stored and re-used between processes #71
Bug Fixes
nf-core/rnaseq version 1.1
Pipeline updates
- Wrote docs and made minor tweaks to the `--skip_qc` and associated options
- Removed the deprecated `uppmax-modules` config profile
- Updated the `hebbe` config profile to use the new `withName` syntax too
- Use new `workflow.manifest` variables in the pipeline script
- Updated minimum nextflow version to `0.32.0`
Software updates
- FastQC `0.11.7` > `0.11.8`
- STAR `2.6.1a` > `2.6.1b`
- Picard `2.18.11` > `2.18.14`
- Deeptools `3.1.1` > `3.1.4`
Bug Fixes
nf-core/rnaseq version 1.0
Initial release of nf-core/rnaseq! 🎉
This release marks the point where the pipeline was moved from SciLifeLab/NGI-RNAseq
over to the new nf-core community, at nf-core/rnaseq.
You can view the previous changelog at SciLifeLab/NGI-RNAseq/CHANGELOG.md
In addition to porting to the new nf-core community, the pipeline has had a number of major changes in this version. There have been 157 commits by 16 different contributors covering 70 different files in the pipeline: 7,357 additions and 8,236 deletions!
In summary, the main changes are:
- Rebranding and renaming throughout the pipeline to nf-core
- Updating many parts of the pipeline config and style to meet nf-core standards
- Support for GFF files in addition to GTF files
  - Just use `--gff` instead of `--gtf` when specifying a file path
- New command line options to skip various quality control steps
- More safety checks when launching a pipeline
- Several new sanity checks - for example, that the specified reference genome exists
- Improved performance with memory usage (especially STAR and Picard)
- New BigWig file outputs for plotting coverage across the genome
- Refactored gene body coverage calculation, now much faster and using much less memory
- Bugfixes in the MultiQC process to avoid edge cases where it wouldn't run
- MultiQC report now automatically attached to the email sent when the pipeline completes
- New testing method, with data on GitHub
  - Now run pipeline with `-profile test` instead of using bash scripts
- Rewritten continuous integration tests with Travis CI
- New explicit support for Singularity containers
- Improved MultiQC support for DupRadar and featureCounts
- Now works for all users instead of just NGI Stockholm
- New configuration for use on AWS batch
- Updated config syntax to support latest versions of Nextflow
- Built-in support for a number of new local HPC systems
  - CCGA, GIS, UCT HEX, updates to UPPMAX, CFC, BINAC, Hebbe, c3se
- Slightly improved documentation (more updates to come)
- Updated software packages
...and many more minor tweaks.
Thanks to everyone who has worked on this release!