Orciraptor_agilis_2021

Read processing and filtering, de novo transcriptome assembly (rnaSPAdes) of Orciraptor agilis RNA-seq data

Module 1 + 2: See repository "Orciraptor_agilis_2021_Trinity"

Module 3: De novo transcriptome assembly, decontamination, ORF prediction

Run assembly.sh to assemble the transcriptome from the processed reads of all libraries. The output is the de novo transcriptome assembly of Orciraptor agilis in FASTA format.
Filter the transcriptome for contigs longer than 200 nt with seqkit_length.sh
Run blastn search with this transcriptome (nt database v5 updated on 2021-03-10): blastn.sh
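The length filter in the second step can be illustrated with a minimal Python sketch. It is not the content of seqkit_length.sh; the function names are hypothetical, and treating the 200 nt cutoff as strict ("longer than") is an assumption taken from the wording above.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records: Iterable[Tuple[str, str]],
                     min_len: int = 200) -> List[Tuple[str, str]]:
    """Keep records whose sequence is longer than min_len.

    Assumption: the boundary is exclusive, following the README wording.
    """
    return [(h, s) for h, s in records if len(s) > min_len]
```

Usage would be along the lines of `filter_by_length(parse_fasta(open("transcripts.fasta")))`, where the file name is a placeholder.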

Check contigs with hits of > 95% identity over a length of at least 100 nt and save the contig identifiers of all bacterial, viral, ribosomal and algal contigs in contaminants.txt
Remove these sequences from the transcriptome with seqkit.sh
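The threshold part of this check could be sketched as follows. The sketch assumes the blastn results are in the standard 12-column tabular format (query ID, subject ID, percent identity, alignment length, ...); whether blastn.sh writes that format is an assumption, and the function name is hypothetical. Deciding which candidates are bacterial, viral, ribosomal or algal is not covered here.

```python
from typing import Iterable, List


def candidate_contigs(blast_lines: Iterable[str],
                      min_identity: float = 95.0,
                      min_length: int = 100) -> List[str]:
    """Return sorted query IDs with a hit above the identity threshold
    over at least min_length aligned positions.

    Assumes tabular columns: qseqid, sseqid, pident, length, ...
    """
    hits = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid = fields[0]
        pident = float(fields[2])
        length = int(fields[3])
        if pident > min_identity and length >= min_length:
            hits.add(qseqid)
    return sorted(hits)
```

The returned identifiers are only candidates for manual inspection before anything is written to contaminants.txt.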

ORF prediction with transdecoder.sh
Run blastp search of Orciraptor ORFs against Mougeotia predicted ORFs: blastp.sh

Remove ORFs with > 95% identity over a length of 150 aa from the predicted Orciraptor proteome: seqkit_NA.sh

Rename ORFs to the pattern "gx_iy.pz" (gene, isoform, peptide) with rename_transdecoder.py. The output is "orciraptor_transdecoder.pep_renamed.fasta". Usage in folder Module_3/transdecoder:

python rename_transdecoder.py orciraptor_200_filtered2.fasta.transdecoder.pep
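As an illustration of the "gx_iy.pz" pattern, the renaming could look roughly like the sketch below. This is not the code of rename_transdecoder.py. It assumes the ORF headers end in an rnaSPAdes-style gene and isoform suffix followed by the TransDecoder peptide suffix (for example "..._g0_i0.p1"); the function name and the example header are hypothetical.

```python
import re

# Assumed header shape: <contig name>_g<gene>_i<isoform>.p<peptide> <description>
_ORF_ID = re.compile(r"_(g\d+)_(i\d+)\.(p\d+)")


def rename_header(header: str) -> str:
    """Reduce a FASTA header (without '>') to the form gx_iy.pz.

    Headers that do not match the assumed shape are returned unchanged.
    """
    match = _ORF_ID.search(header)
    if match is None:
        return header
    gene, isoform, peptide = match.groups()
    return f"{gene}_{isoform}.{peptide}"
```

For example, a header such as `NODE_1_length_2500_cov_12.3_g0_i0.p1 type:complete` would become `g0_i0.p1` under these assumptions.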

Module 5: Generate gene_trans_map

Generate the gene_trans_map file for Lace: gene_trans_map.R
Map the processed reads back to the newly generated transcriptome with bowtie2 and quantify them with salmon in alignment-based mode (bowtie2.sh).
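A gene-to-transcript map is a two-column, tab-separated table. The sketch below is in Python rather than R and is not a port of gene_trans_map.R. It assumes transcript identifiers end in "_g<number>_i<number>" and writes the gene column first, as in Trinity's gene_trans_map convention; the column order that Lace expects should be checked against its documentation. All names are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed identifier shape: anything ending in _g<gene>_i<isoform>
_GENE_SUFFIX = re.compile(r"(?:^|_)(g\d+)_i\d+$")


def gene_trans_map(transcript_ids: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (gene_id, transcript_id) pairs for identifiers that match."""
    rows = []
    for tid in transcript_ids:
        match = _GENE_SUFFIX.search(tid)
        if match is not None:
            rows.append((match.group(1), tid))
    return rows


def format_map(rows: Iterable[Tuple[str, str]]) -> str:
    """Render the pairs as tab-separated lines, one transcript per line."""
    return "\n".join(f"{gene}\t{tid}" for gene, tid in rows)
```

Isoforms of the same gene share the first column, which is the property that lets a downstream tool group transcripts by gene.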

Module 6: Assembly summary statistics

Number and length statistics of contigs were calculated with the TrinityStats.pl script from the Trinity toolkit
Number, completeness and orientation of ORFs are summarised with transdecoder_count.sh
ExN50 statistic is calculated with ExN50.sh
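For orientation, the two contig statistics can be sketched in a few lines. N50 is the length of the contig at which the length-sorted cumulative sum reaches half of the total assembly length. ExN50 is the N50 of the most highly expressed transcripts that together account for x% of the total expression. The code is a simplified sketch under these definitions, not the logic of TrinityStats.pl or ExN50.sh, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple


def n50(lengths: Iterable[int]) -> int:
    """Length at which the descending cumulative sum reaches half the total."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return ordered[-1]  # not reached for positive lengths


def exn50(transcripts: List[Tuple[int, float]], x: float) -> int:
    """N50 of the top-expressed transcripts covering x% of total expression.

    transcripts: (length, expression) pairs; x: percentage between 0 and 100.
    """
    ranked = sorted(transcripts, key=lambda t: t[1], reverse=True)
    target = sum(expr for _, expr in ranked) * x / 100
    running, selected = 0.0, []
    for length, expr in ranked:
        selected.append(length)
        running += expr
        if running >= target:
            break
    return n50(selected)
```

Because lowly expressed transcripts are excluded at smaller x, the ExN50 value can differ considerably from the plain N50 of the same assembly.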

Module 7: Supertranscripts

Run Lace to generate the supertranscript FASTA: lace.sh
Generate a genome index of the supertranscriptome for STAR mapping (star_genome.sh), map the processed reads with STAR (star_mapping.sh) and index the BAM files (index.sh)
Run stringtie and merge the GTF files (stringtie.sh)
Convert the GTF to FASTA and predict ORFs (stringtie_fasta.sh and stringtie_transdecoder.sh)


jenny-gerbracht/Orciraptor_agilis_2021

Folders and files

Latest commit

History

Repository files navigation

Orciraptor_agilis_2021

Module 1 + 2: See repository "Orciraptor_agilis_2021_Trinity"

Module 3: De novo transcriptome assembly, decontamination, ORF prediction

Module 5: Generate gene_trans_map

Module 6: Assembly summary statistics

Module 7: Supertranscripts

About

Resources

Stars

Watchers

Forks

Releases 1

Packages 0

Languages

Packages