| Simulation study | $\left|G_{\mathit{DE}}^{\mathit{up}}\right|$ | $\left|G_{\mathit{DE}}^{\mathit{down}}\right|$ | $\left|\{g;\ \phi_g = 0\}\right|$ | ‘Single’ outlier fraction | ‘Random’ outlier fraction |
|---|---|---|---|---|---|
| $B_{0}^{0}$ | 0 | 0 | 0 | 0 | 0 |
| $B_{0}^{1250}$ | 1,250 | 0 | 0 | 0 | 0 |
| $B_{625}^{625}$ | 625 | 625 | 0 | 0 | 0 |
| $B_{0}^{4000}$ | 4,000 | 0 | 0 | 0 | 0 |
| $B_{2000}^{2000}$ | 2,000 | 2,000 | 0 | 0 | 0 |
| $P_{0}^{0}$ | 0 | 0 | 6,250 | 0 | 0 |
| $P_{625}^{625}$ | 625 | 625 | 6,250 | 0 | 0 |
| $S_{0}^{0}$ | 0 | 0 | 0 | 10% | 0 |
| $S_{625}^{625}$ | 625 | 625 | 0 | 10% | 0 |
| $R_{0}^{0}$ | 0 | 0 | 0 | 0 | 5% |
| $R_{625}^{625}$ | 625 | 625 | 0 | 0 | 5% |

In all synthetic data sets, the observations were distributed between two conditions (denoted $S_1$ and $S_2$), with the same number of observations (2, 5 or 10) in each condition. We let $\left|G_{\mathit{DE}}^{\mathit{up}}\right|$ and $\left|G_{\mathit{DE}}^{\mathit{down}}\right|$ denote, respectively, the number of genes that were up- and downregulated in condition $S_2$ compared to $S_1$. The number of genes whose counts were drawn from a Poisson distribution (i.e., with the dispersion parameter equal to zero) is given by $\left|\{g;\ \phi_g = 0\}\right|$. The ‘single’ outlier fraction denotes the fraction of the genes for which we selected a single sample and multiplied the corresponding count by a factor between 5 and 10. The ‘random’ outlier fraction denotes the fraction of counts that were selected randomly (among all counts) and multiplied by a factor between 5 and 10. The notation for the simulation studies (leftmost column) summarizes the type of simulation (B = ‘baseline’, P = ‘Poisson’, S = ‘single outlier’, R = ‘random outlier’), the number of DE genes that are upregulated in $S_2$ (i.e., $\left|G_{\mathit{DE}}^{\mathit{up}}\right|$, in the superscript) and the number of DE genes that are downregulated in $S_2$ (i.e., $\left|G_{\mathit{DE}}^{\mathit{down}}\right|$, in the subscript).
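The two outlier mechanisms described above can be illustrated with a minimal NumPy sketch. This is not the code used in the study. The function names, the uniform draw of the multiplication factor on [5, 10), and the rounding of the inflated counts to integers are all assumptions made for illustration; the text only states that the factor lies between 5 and 10.

```python
import numpy as np


def add_single_outliers(counts, fraction, rng, low=5.0, high=10.0):
    """'Single' outliers: for a fraction of the genes (rows), pick one
    sample (column) and multiply that count by a factor in [low, high).
    Assumption: the factor is drawn uniformly; the result is rounded."""
    out = counts.astype(float)
    n_genes, n_samples = out.shape
    n_affected = int(round(fraction * n_genes))
    # Distinct genes, one randomly chosen sample per affected gene.
    genes = rng.choice(n_genes, size=n_affected, replace=False)
    samples = rng.integers(0, n_samples, size=n_affected)
    factors = rng.uniform(low, high, size=n_affected)
    out[genes, samples] *= factors
    return np.rint(out).astype(np.int64)


def add_random_outliers(counts, fraction, rng, low=5.0, high=10.0):
    """'Random' outliers: pick a fraction of all counts, regardless of
    gene or sample, and multiply each by a factor in [low, high).
    Assumption: the factor is drawn uniformly; the result is rounded."""
    out = counts.astype(float)
    n_affected = int(round(fraction * out.size))
    # Distinct cells chosen among all gene-by-sample entries.
    flat_idx = rng.choice(out.size, size=n_affected, replace=False)
    rows, cols = np.unravel_index(flat_idx, out.shape)
    factors = rng.uniform(low, high, size=n_affected)
    out[rows, cols] *= factors
    return np.rint(out).astype(np.int64)


# Toy count matrix: 200 genes, 5 samples per condition (10 columns).
rng = np.random.default_rng(0)
toy_counts = rng.poisson(100, size=(200, 10))
single = add_single_outliers(toy_counts, 0.10, rng)  # as in the S studies
random_ = add_random_outliers(toy_counts, 0.05, rng)  # as in the R studies
```

The difference between the two settings is the unit of selection: the ‘single’ setting selects genes and alters at most one count per selected gene, whereas the ‘random’ setting selects individual counts, so a gene may receive zero, one, or several outliers.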