
Table 2 Cleaning data with high sequencing depth [23]

From: Modeling and cleaning RNA-seq data significantly improve detection of differentially expressed genes

| Cleaning method | #DEGs* (DEGs30&) | Single-organism process (#DEGs/p-value) | Biological regulation (#DEGs/p-value) | Regulation of biological process (#DEGs/p-value) | Multicellular organism process (#DEGs/p-value) | Response to stimulus (#DEGs/p-value) |
|---|---|---|---|---|---|---|
| Author’s$ | 130 (29) | 27/9.3e-3 | 24/2.3e-2 | 21/1.6e-1 | 21/1.9e-3 | 21/8.2e-3 |
| Raw data | 135 (96) | 65/3.9e-2 | 52/3.4e-1 | 47/6.0e-1 | 7/9.4e-1 | 40/3.3e-1 |
| RNAdeNoise | 336 (328) | 193/1.7e-4 | 164/5.4e-3 | 155/1.6e-2 | 33/3.1e-1 | 134/6.3e-4 |
| HTSFilter | 27 (27) | 20/1.2e-1 | 14/7.4e-1 | 10/9.8e-1 | 13/1.8e-1 | 10/7.9e-1 |
| counts > 3 | 184 (96) | 65/3.9e-2 | 52/3.4e-1 | 47/6.0e-1 | 38/2.1e-1 | 40/3.3e-1 |
| counts > 5 | 250 (98) | 67/3.1e-2 | 54/2.8e-1 | 49/5.3e-1 | 40/1.5e-1 | 42/2.5e-1 |
| counts > 10 | 572 (128) | 76/5.1e-2 | 63/2.2e-1 | 58/4.0e-1 | 11/7.4e-1 | 52/6.8e-2 |
| FPKM > 0.3 | 550 (166) | 108/2.7e-2 | 91/1.1e-1 | 85/2.2e-1 | 15/8.0e-1 | 69/1.8e-1 |

  1. This dataset illustrates a “step” phenomenon that appears after threshold-based filters are applied, when one of the values under the threshold is zeroed. As a result, a DEG-detection program preferentially finds genes with very low counts (numbers in brackets are DEGs with counts ≥ 30). RNAdeNoise does not introduce such a bias and increases both the number and the statistical significance of functional DEGs (the distribution of counts in DEGs is shown in Additional file 1: Fig. S3)
  2. $Genes from Tables 1, 2, 3 and 4 [23]
  3. *Criteria for DEGs: |log2(FoldChange)|>1.0, p-value < 0.0001
  4. &DEGs with counts ≥ 30 at least in one sample
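
The criteria in the notes above, and the “step” effect of a hard threshold, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the pseudocount of 1, the toy counts, and the assumption that a threshold filter sets every count at or below the threshold to zero are all hypothetical choices made for illustration. Only the cut-offs (|log2(FoldChange)| > 1.0, p-value < 0.0001, counts ≥ 30) come from the notes.

```python
import math

# Cut-offs taken from the table notes.
LOG2FC_MIN = 1.0
P_MAX = 1e-4
HIGH_COUNT = 30


def is_deg(log2_fc, p_value):
    """DEG criterion from the notes: |log2(FoldChange)| > 1.0 and p-value < 0.0001."""
    return abs(log2_fc) > LOG2FC_MIN and p_value < P_MAX


def is_deg30(counts):
    """DEGs30 criterion from the notes: counts >= 30 in at least one sample."""
    return any(c >= HIGH_COUNT for c in counts)


def threshold_filter(counts, threshold):
    """Hypothetical hard filter: zero every count that is not above the threshold."""
    return [c if c > threshold else 0 for c in counts]


def log2_fold_change(a, b, pseudocount=1.0):
    """log2 ratio of two counts; the pseudocount is an assumption to avoid log(0)."""
    return math.log2((a + pseudocount) / (b + pseudocount))


# Toy gene (invented values) whose two counts straddle a threshold of 10.
control, treated = 9, 12
before = log2_fold_change(treated, control)

filtered_control, filtered_treated = threshold_filter([control, treated], threshold=10)
after = log2_fold_change(filtered_treated, filtered_control)

print(f"log2FC before filtering: {before:.2f}")
print(f"log2FC after filtering:  {after:.2f}")
print("counted among DEGs30:", is_deg30([control, treated]))
```

In this toy case the fold change crosses the |log2(FoldChange)| > 1.0 cut-off only after one of the two counts is zeroed, and the gene does not meet the counts ≥ 30 condition. That is the kind of low-count gene the first note describes a threshold filter as favouring.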