
Table 1 Simulation strategies. More details for each setting can be found in Additional file 1.

From: Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments

Simulation

Description and data sources

NB

The count data are assumed to follow a negative binomial (NB) distribution; the dispersion and mean parameters are fixed and identical across all genes under \(H_0\) and across all genes under \(H_1\), respectively.

NB with distributed parameters

Read counts follow an NB distribution; dispersion and mean parameters vary across genes and are based on real RNA-seq data sets according to [2] (real data sets Kidney [6], Bottomly [7], and Sultan [8]; see Table 3).

SimSeq [9]

Counts are based on real read counts, adjusted by a correction factor to generate differential expression; dependence between genes is imitated from the real data sets Bottomly [7], Kidney [6], and mouse [10].

PROPER [11]

Read counts follow an NB distribution; dispersion and mean parameters vary across genes and are based on a real RNA-seq data set (Cheung [12]). Additional noise is introduced by zero baseline expression in the original data, which leads to many genes with zero counts only.

PROPER with fixed sequencing depth [11]

As PROPER, except that the empirical average expressions sampled from the Cheung data are standardised to reach a fixed sequencing depth.
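
The NB-based settings in the table can be illustrated with a minimal sketch. It is not the simulation code used in the study. It assumes the common RNA-seq parameterisation in which the variance equals mean + dispersion × mean², and all numeric values (number of genes, means, fold change, dispersions, target depth) are hypothetical. The helper names `nb_counts` and `standardise_depth` are introduced here for illustration only, and the randomly drawn gene-wise parameters merely stand in for estimates that would come from a real data set.

```python
import numpy as np


def nb_counts(mean, dispersion, n_samples, rng):
    """Draw NB counts, one row per gene and one column per sample.

    Assumed parameterisation: variance = mean + dispersion * mean**2.
    """
    mean = np.atleast_1d(np.asarray(mean, dtype=float))
    dispersion = np.broadcast_to(np.asarray(dispersion, dtype=float), mean.shape)
    size = 1.0 / dispersion        # NumPy's "n" parameter
    prob = size / (size + mean)    # NumPy's "p" parameter
    return rng.negative_binomial(
        size[:, None], prob[:, None], size=(mean.size, n_samples)
    )


def standardise_depth(means, depth):
    """Rescale gene-wise means so that they sum to a fixed sequencing depth."""
    means = np.asarray(means, dtype=float)
    return means * (depth / means.sum())


rng = np.random.default_rng(0)
n_genes, n_h1, n_samples = 1000, 100, 5   # hypothetical sizes

# "NB" setting: one mean and one dispersion shared by all genes under H0;
# genes under H1 share a different mean in the second condition.
mean_a = np.full(n_genes, 100.0)
mean_b = mean_a.copy()
mean_b[:n_h1] *= 2.0                       # hypothetical fold change for H1 genes
counts_a = nb_counts(mean_a, 0.1, n_samples, rng)
counts_b = nb_counts(mean_b, 0.1, n_samples, rng)

# "NB with distributed parameters": gene-wise means and dispersions.
# Random draws stand in for estimates from a real data set.
gene_means = rng.lognormal(mean=4.0, sigma=1.5, size=n_genes)
gene_disp = rng.gamma(shape=2.0, scale=0.1, size=n_genes)
counts_var = nb_counts(gene_means, gene_disp, n_samples, rng)

# Fixed sequencing depth: rescale the means before sampling.
scaled_means = standardise_depth(gene_means, depth=1e6)
counts_fixed = nb_counts(scaled_means, gene_disp, n_samples, rng)
```

Rescaling the means fixes only the expected library size; the realised column sums of `counts_fixed` still vary around the target because each count is sampled independently.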