A comparison of methods for differential expression analysis of RNA-seq data

Soneson, Charlotte; Delorenzi, Mauro

doi:10.1186/1471-2105-14-91

BMC Bioinformatics

Table 2. Summary of the main observations

DESeq	- Conservative with default settings. Becomes more conservative when outliers are introduced.
	- Generally low TPR.
	- Poor FDR control with 2 samples/condition, good FDR control for larger sample sizes, also with outliers.
	- Medium computational time requirement, increases slightly with sample size.
edgeR	- Slightly liberal for small sample sizes with default settings. Becomes more liberal when outliers are introduced.
	- Generally high TPR.
	- Poor FDR control in many cases, worse with outliers.
	- Medium computational time requirement, largely independent of sample size.
NBPSeq	- Liberal for all sample sizes. Becomes more liberal when outliers are introduced.
	- Medium TPR.
	- Poor FDR control, worse with outliers. Often truly non-DE genes are among those with smallest p-values.
	- Medium computational time requirement, increases slightly with sample size.
TSPM	- Overall highly sample-size dependent performance.
	- Liberal for small sample sizes, largely unaffected by outliers.
	- Very poor FDR control for small sample sizes, improves rapidly with increasing sample size. Largely unaffected by outliers.
	- When all genes are overdispersed, many truly non-DE genes are among the ones with smallest p-values. Remedied when the counts for some genes are Poisson distributed.
	- Medium computational time requirement, largely independent of sample size.
voom / vst	- Good type I error control, becomes more conservative when outliers are introduced.
	- Low power for small sample sizes. Medium TPR for larger sample sizes.
	- Good FDR control except for simulation study $B_{0}^{4000}$. Largely unaffected by introduction of outliers.
	- Computationally fast.
baySeq	- Highly variable results when all DE genes are regulated in the same direction. Less variability when the DE genes are regulated in different directions.
	- Low TPR. Largely unaffected by outliers.
	- Poor FDR control with 2 samples/condition, good for larger sample sizes in the absence of outliers. Poor FDR control in the presence of outliers.
	- Computationally slow, but allows parallelization.
EBSeq	- TPR relatively independent of sample size and presence of outliers.
	- Poor FDR control in most situations, relatively unaffected by outliers.
	- Medium computational time requirement, increases slightly with sample size.
NOISeq	- Not clear how to set the threshold for $q_{NOISeq}$ to correspond to a given FDR threshold.
	- Performs well, in terms of false discovery curves, when the dispersion is different between the conditions (see supplementary material).
	- Computational time requirement highly dependent on sample size.
SAMseq	- Low power for small sample sizes. High TPR for large enough sample sizes.
	- Performs well also for simulation study $B_{0}^{4000}$.
	- Largely unaffected by introduction of outliers.
	- Computational time requirement highly dependent on sample size.
ShrinkSeq	- Often poor FDR control, but allows the user to also apply a fold change threshold in the inference procedure.
	- High TPR.
	- Computationally slow, but allows parallelization.

The table summarizes the present study by means of the main observations and characteristic features for each of the evaluated methods. We have grouped voom+limma and vst+limma together since they performed overall very similarly.
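
The table's terms "TPR" and "FDR control" can be made concrete with a small sketch. It assumes that genes are called DE by thresholding Benjamini-Hochberg adjusted p-values, which is one common choice; the table itself does not state the procedure. The function names `bh_adjust` and `tpr_and_fdr` and the toy p-values are hypothetical and are not taken from the study.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def tpr_and_fdr(pvalues, is_de, threshold=0.05):
    """Observed TPR and FDR when the true DE status of each gene is known.

    This is only possible for simulated data, where `is_de` is the ground truth.
    """
    adjusted = bh_adjust(pvalues)
    called = [q <= threshold for q in adjusted]
    true_pos = sum(c and t for c, t in zip(called, is_de))
    false_pos = sum(c and not t for c, t in zip(called, is_de))
    n_de = sum(is_de)
    n_called = true_pos + false_pos
    tpr = true_pos / n_de if n_de else 0.0         # fraction of truly DE genes found
    fdr = false_pos / n_called if n_called else 0.0  # fraction of calls that are wrong
    return tpr, fdr


# Toy data (hypothetical): six genes, three of them truly DE.
toy_pvalues = [0.001, 0.004, 0.03, 0.2, 0.6, 0.9]
toy_is_de = [True, True, False, True, False, False]
tpr, fdr = tpr_and_fdr(toy_pvalues, toy_is_de, threshold=0.05)
print(f"TPR = {tpr:.2f}, observed FDR = {fdr:.2f}")
```

In this reading, a method has "good FDR control" when the observed FDR stays at or below the nominal threshold, and a "high TPR" when most truly DE genes are among those called.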

ISSN: 1471-2105
