
Table 1 Overview of existing pipelines for RNA-seq data analysis

From: transXpress: a Snakemake pipeline for streamlined de novo transcriptome assembly and annotation

| Pipeline | Platform | Preprocessing | Assembly | Read mapping | Expression analysis | Functional annotation |
| --- | --- | --- | --- | --- | --- | --- |
| transXpress | Snakemake | trimmomatic, FastQC, MultiQC | Trinity, rnaSPAdes | bowtie2 (optional) | kallisto, edgeR | BLAST, TargetP, SignalP, TMHMM, BUSCO |
| Pincho [8] | Bash, python3 | trimmomatic, Rcorrector, TransRate, CD-HIT | Trinity, rnaSPAdes, BinPacker, IDBA-tran, Velvet-Oases, Shannon, Trans-AbySS, TransLig | HISAT2 | kallisto, RSEM | BLAST, BUSCO, TransRate |
| RNAflow [9] | Nextflow | FastQC, MultiQC, fastp, SortMeRNA | Trinity | HISAT2 | DESeq2 | BUSCO, dammit |
| Rnnotator (unavailable) [6] | Unknown | | Velvet, AMOS | | | |
| themira (unavailable) [7] | Unknown | FastXtoolkit, FastX, CAP3 | Velvet-Oases | | | Blast2GO |
| nf-core/rnaseq [10] | Nextflow | FastQC, TrimGalore | (None) | STAR, HISAT2 | RSEM, Salmon, DESeq2 | |
| Pipeliner [11] | Nextflow | FastQC, MultiQC, TrimGalore | (None) | STAR, HISAT2 | StringTie, HTSeq, featureCounts | |
| VIPER [12] | Snakemake | RSeQC | (None) | STAR | Picard, Cufflinks, RSeQC, ComBat, DESeq2, PCA | VarScan, Gostats, GAGE, Pathview, ClusterProfiler, STAR-fusion, TRUST, TIMER, virus contamination detection |
| RASflow [13] | Snakemake | TrimGalore, FastQC, MultiQC | (None) | Salmon, HiSAT2 | featureCounts or htseq-count, Qualimap, edgeR, DESeq2 | |
| hppRNA [14] | Snakemake | cutadapt, FastQC, PRINSEQ, FASTX-toolkit | (None) | Tophat, bowtie, subread, STAR, HiSAT | Cufflinks, featureCounts, RSEM, eXpress, kallisto, StringTie, ngs.plot, Cuffdiff, DESeq2, EBSeq, edgeR, sleuth, Ballgown | GATK, FusionCatcher |
| TRAPLINE [15] | Galaxy | FastxClipper, FastQC, FASTQ, FASTX-toolkit | (None) | Tophat, bowtie | Picard, Cufflinks, Cuffdiff | DAVID, miRanda, BioGRID |
| QuickRNAseq [16] | bash, Perl, R | RSeQC | (None) | STAR | featureCounts, RSeQC, edgeR | VarScan |
| ARMOR [17] | Snakemake | TrimGalore, FastQC + MultiQC | (None) | Salmon, STAR | edgeR, DRIMSeq | |
| BISR-RNAseq [18] | PBS, bash, shiny, R | FastQC + MultiQC | (None) | HiSAT2 | Picard, featureCounts, RSeQC, limma, edgeR | |
| RNAseq123 [19] | Bioconductor | | (None) | | edgeR, limma, glimma | |

The table summarizes the architecture of each pipeline and the individual tools it uses for the main data processing steps. Five of the pipelines (transXpress, Pincho, RNAflow, themira, Rnnotator) include a de novo transcriptome assembly step, while the others require a reference genome or transcriptome.