Figure 1 | BMC Bioinformatics

Figure 1

From: EPEPT: A web service for enhanced P-value estimation in permutation tests


The P-value of a permutation test as a function of the test statistic x0. Pperm is the correct P-value of the permutation test based on all possible label permutations (Nall = 10^5 in this example). The Nall permutation values are visualized as gray crosses on the x-axis. Pecdf is the standard empirical estimator of the P-value based on a limited set of N permutation values (N = 10^3 in this example). These are visualized as blue plus signs on the x-axis. Pgpd is the P-value estimator described in [1], which is also based on the N permutation values. It uses the 'extreme' permutation values, that is, those that exceed a particular threshold t. These Nexc permutation values are called the exceedances and are visualized by the red circles added to the blue plus signs. In this example t = 5. The exceedances are used to estimate the tail of the distribution of permutation values as a generalized Pareto distribution (GPD). The GPD is represented by the function F in the Pgpd equation. From this figure it is clear that Pecdf is a poor estimator of small P-values, the minimum obtainable P-value being 1/N. In general, Pecdf requires 10/P permutations for a good estimate, P being the correct P-value. Pgpd, on the other hand, provides an accurate estimate of the correct P-value, even for P-values smaller than 1/N.
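
The contrast between the two estimators can be illustrated with a minimal Python sketch. Several things here are assumptions rather than part of the caption: the tail formula Pgpd = (Nexc/N) * (1 - F(x0 - t)), which is one common way to write such an estimator; the method-of-moments GPD fit, which is swapped in for simplicity and is not claimed to be the fitting procedure of [1]; the function names p_ecdf, fit_gpd_moments, gpd_sf and p_gpd; and the toy data, which are deterministic quantiles of an exponential distribution so that the true tail probability exp(-x0) is known.

```python
import math


def p_ecdf(perm_values, x0):
    """Empirical P-value: fraction of permutation values >= x0."""
    return sum(v >= x0 for v in perm_values) / len(perm_values)


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (an assumption of this sketch).

    Returns (k, sigma) for F(z) = 1 - (1 + k*z/sigma)**(-1/k).
    """
    n = len(excesses)
    mean = sum(excesses) / n
    var = sum((z - mean) ** 2 for z in excesses) / (n - 1)
    ratio = mean ** 2 / var
    k = 0.5 * (1.0 - ratio)
    sigma = 0.5 * mean * (ratio + 1.0)
    return k, sigma


def gpd_sf(z, k, sigma):
    """Tail probability 1 - F(z) of the GPD for z >= 0."""
    if abs(k) < 1e-12:
        return math.exp(-z / sigma)  # exponential limit as k -> 0
    base = 1.0 + k * z / sigma
    if base <= 0.0:
        return 0.0  # z lies beyond the finite upper endpoint (k < 0)
    return base ** (-1.0 / k)


def p_gpd(perm_values, x0, t):
    """Assumed tail estimator: (Nexc / N) * (1 - F(x0 - t)), for x0 > t."""
    excesses = [v - t for v in perm_values if v > t]
    k, sigma = fit_gpd_moments(excesses)
    return len(excesses) / len(perm_values) * gpd_sf(x0 - t, k, sigma)


# Toy data (hypothetical): N deterministic exponential quantiles.
N = 1000
perm_values = [-math.log(1.0 - (i + 0.5) / N) for i in range(N)]
t = perm_values[749]   # threshold leaving the 250 largest values as exceedances
x0 = 8.0               # larger than every permutation value in the toy data

print("P_ecdf:", p_ecdf(perm_values, x0))
print("P_gpd :", p_gpd(perm_values, x0, t))
print("true  :", math.exp(-x0))
```

In this toy setting x0 lies beyond all N permutation values, so the empirical estimator has no resolution there, while the fitted tail still yields a positive estimate below 1/N. The threshold choice (top 250 values) is arbitrary here; in practice the quality of the GPD fit depends on it.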
