Table 2 Probability of declaring equivalence (pr(Rej) for the normal test, \(pr_B(Rej)\) for the bootstrap test) when the simulated dissimilarity equals the equivalence limit, \(d_S = d_0\). \(n = 1000\) is the total number of GO terms, and \(p_{ij}\) are the probabilities of enrichment.

From: An equivalence test between features lists, based on the Sorensen–Dice index and the joint frequencies of GO term enrichment

| nSim | \(d_S\) | \(d_0\) | \(p_{11}\) | \(p_{01}\) | \(p_{10}\) | pr(Rej) | \(pr_B(Rej)\) | \(E(\nu )\) |
|---|---|---|---|---|---|---|---|---|
| 99433 | 0.2857 | 0.2857 | 0.01250 | 0.005 | 0.005 | 0.0807 | 0.0251 | 22.50 |
| 99994 | 0.2857 | 0.2857 | 0.01875 | 0.005 | 0.010 | 0.0741 | 0.0371 | 33.75 |
| 100000 | 0.2857 | 0.2857 | 0.02500 | 0.010 | 0.010 | 0.0706 | 0.0417 | 45.00 |
| 100000 | 0.2857 | 0.2857 | 0.06875 | 0.005 | 0.050 | 0.0617 | 0.0485 | 123.75 |
| 100000 | 0.2857 | 0.2857 | 0.07500 | 0.010 | 0.050 | 0.0614 | 0.0483 | 135.00 |
| 100000 | 0.2857 | 0.2857 | 0.12500 | 0.050 | 0.050 | 0.0591 | 0.0500 | 225.00 |
| 100000 | 0.2857 | 0.2857 | 0.13125 | 0.005 | 0.100 | 0.0584 | 0.0493 | 236.25 |
| 100000 | 0.2857 | 0.2857 | 0.13750 | 0.010 | 0.100 | 0.0583 | 0.0496 | 247.50 |
| 100000 | 0.2857 | 0.2857 | 0.18750 | 0.050 | 0.100 | 0.0568 | 0.0499 | 337.50 |
| 100000 | 0.2857 | 0.2857 | 0.25000 | 0.100 | 0.100 | 0.0558 | 0.0497 | 450.00 |
| 100000 | 0.2857 | 0.2857 | 0.25625 | 0.005 | 0.200 | 0.0558 | 0.0498 | 461.25 |
| 100000 | 0.2857 | 0.2857 | 0.26250 | 0.010 | 0.200 | 0.0559 | 0.0497 | 472.50 |
| 100000 | 0.2857 | 0.2857 | 0.31250 | 0.050 | 0.200 | 0.0548 | 0.0494 | 562.50 |
| 100000 | 0.2857 | 0.2857 | 0.37500 | 0.100 | 0.200 | 0.0541 | 0.0490 | 675.00 |
| 100000 | 0.2857 | 0.2857 | 0.50000 | 0.200 | 0.200 | 0.0538 | 0.0495 | 900.00 |

  1. \(E(\nu )\) is the expected total number of enriched terms. nSim is the number of effective simulation replicates (out of an initial \(10^5\)) used to obtain \(pr_B(Rej)\), which required \(nSim \times B\) test computations with \(B = 10000\); pr(Rej) was based on an initial \(10^6\) simulation replicates. In some scenarios with low \(p_{ij}\), the generated tables contained zeros that made the Sorensen–Dice computations impossible, so the effective number of simulation replicates was lower than initially planned.
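
The \(d_S\) and \(E(\nu )\) columns can be checked against the enrichment probabilities of each scenario. The sketch below assumes the usual Sorensen–Dice dissimilarity written in terms of joint probabilities, \(d_S = (p_{10} + p_{01}) / (2p_{11} + p_{10} + p_{01})\), and \(E(\nu ) = n\,(p_{11} + p_{10} + p_{01})\); neither formula is stated in the table itself, and the function names are hypothetical, chosen only for illustration.

```python
# Illustrative check of two table columns; not the authors' code.
# Assumption: d_S is the Sorensen-Dice dissimilarity expressed with
# joint enrichment probabilities, and E(nu) counts terms enriched in
# at least one of the two lists.

def sorensen_dissimilarity(p11: float, p01: float, p10: float) -> float:
    """Return (p10 + p01) / (2*p11 + p10 + p01)."""
    return (p10 + p01) / (2.0 * p11 + p10 + p01)


def expected_enriched(n: int, p11: float, p01: float, p10: float) -> float:
    """Return the expected number of terms enriched in at least one list."""
    return n * (p11 + p01 + p10)


N_TERMS = 1000  # total number of GO terms, from the table caption

# (p11, p01, p10) for the first, fourth and last scenarios of the table
scenarios = [
    (0.01250, 0.005, 0.005),
    (0.06875, 0.005, 0.050),
    (0.50000, 0.200, 0.200),
]

for p11, p01, p10 in scenarios:
    d_s = sorensen_dissimilarity(p11, p01, p10)
    e_nu = expected_enriched(N_TERMS, p11, p01, p10)
    print(f"p11={p11:.5f} p01={p01:.3f} p10={p10:.3f} "
          f"d_S={d_s:.4f} E(nu)={e_nu:.2f}")
```

Under these assumptions every scenario gives \(d_S = 2/7 \approx 0.2857\), which is consistent with the table holding the dissimilarity at the equivalence limit while only the expected number of enriched terms varies.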