Permutation testing with 1000 permutations of a large case-control genome-wide association study with 5000 individuals genotyped for 500,000 markers can be performed using PRESTO in approximately one hour of computing time (Table 1). With PRESTO, the costs of permutation testing (in terms of time and computing resources) are extremely low for many common study designs, and these costs compare very favourably to the costs associated with data generation (e.g. performing genotype assays, calling genotypes, and performing data quality control filtering).
There has been some debate regarding the number of permutations required. When performing N permutations, the smallest multiple-testing adjusted P-value one can observe is 1/(N+1). Thus, 1000 permutations can provide multiple-testing adjusted P-values as low as approximately 0.001, which provide strong evidence of association. In the analysis of Wellcome Trust Case Control Consortium (WTCCC) data described in Table 1, multiple-testing adjusted P-values of 0.05, 0.01, and 0.005 correspond to nominal P-values of 7.5 × 10⁻⁸, 1.5 × 10⁻⁸, and 6.2 × 10⁻⁹ respectively. If additional permutations are desired, 10⁴ or 10⁵ permutations are easily performed on a large genome-wide data set like the WTCCC data set in Table 1, and even larger numbers of permutations can be easily performed for smaller studies (computation time is linear in the number of permutations).
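The relationship between the number of permutations and the smallest attainable adjusted P-value can be illustrated with a minimal sketch of a max-statistic permutation test. This is not PRESTO's implementation: the function names (`trend_stat`, `max_stat`, `adjusted_p_value`) are hypothetical, the per-marker statistic is assumed to be a simple trend statistic (N times the squared correlation between allele count and case status), and the observed data are counted together with the N permutations so that the result can never fall below 1/(N+1).

```python
import random


def trend_stat(genotypes, status):
    """Return N * r^2 between allele counts (0/1/2) and case status (0/1)."""
    n = len(status)
    mean_g = sum(genotypes) / n
    mean_s = sum(status) / n
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_s = sum((s - mean_s) ** 2 for s in status)
    if var_g == 0 or var_s == 0:
        return 0.0  # a monomorphic marker carries no information
    cov = sum((g - mean_g) * (s - mean_s) for g, s in zip(genotypes, status))
    return n * cov * cov / (var_g * var_s)


def max_stat(markers, status):
    """Largest single-marker statistic across all markers."""
    return max(trend_stat(g, status) for g in markers)


def adjusted_p_value(markers, status, n_perm=1000, seed=0):
    """Multiple-testing adjusted P-value for the most significant marker.

    Case-control labels are shuffled n_perm times; the adjusted P-value is
    the proportion of data sets (observed plus permuted) whose maximum
    statistic is at least as large as the observed maximum.
    """
    rng = random.Random(seed)
    observed = max_stat(markers, status)
    permuted = list(status)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(permuted)
        if max_stat(markers, permuted) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

The cost of this loop grows linearly with `n_perm`, in line with the linear scaling noted above; a production tool would use far more efficient data structures than the plain Python lists assumed here.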
Permutation testing is particularly appealing because of its simplicity. Recently, several more complex alternatives to permutation testing have been proposed [11–14]. These methods can be useful, more computationally efficient alternatives to permutation testing in some situations.
Some methods for computing adjusted P-values exploit the fact that for many common statistical tests, the correlated tests have an asymptotic multivariate normal distribution under the null hypothesis of no trait-marker correlation. Seaman and Müller-Myhsok have proposed estimating the asymptotic distribution and sampling directly from it, and Conneely and Boehnke have proposed estimating the asymptotic distribution and calculating probabilities under this distribution using numerical integration. Either approach can be used to estimate the probability of observing a minimum P-value smaller than the observed minimum P-value. Both approaches are particularly well-suited to situations where covariate data are available or multiple quantitative phenotypes are tested. When the asymptotic distribution is accurately estimated, these methods are shown to give accurate results (compared to permutation as the gold standard) for candidate gene studies.
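The sampling idea can be sketched as follows, under strong simplifying assumptions. This is an illustration in the spirit of the sampling approach, not the published method: the correlation matrix of the test statistics is assumed to be already estimated and positive definite, the tests are assumed to be two-sided z-tests (so the minimum P-value corresponds to the maximum absolute z-score), and the names `cholesky` and `mvn_adjusted_p` are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def mvn_adjusted_p(corr, observed_max_abs_z, n_samples=10000, seed=0):
    """Estimate P(max |Z_i| >= observed_max_abs_z) for Z ~ N(0, corr).

    Correlated z-scores are generated as L @ e, where L is the Cholesky
    factor of corr and e is a vector of independent standard normals.
    """
    rng = random.Random(seed)
    lower = cholesky(corr)
    n = len(corr)
    exceed = 0
    for _ in range(n_samples):
        e = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = [sum(lower[i][k] * e[k] for k in range(i + 1)) for i in range(n)]
        if max(abs(v) for v in z) >= observed_max_abs_z:
            exceed += 1
    return exceed / n_samples
```

For two independent tests and an observed maximum |z| of 1.96, the estimate should be close to 1 − 0.95² ≈ 0.0975; for strongly correlated tests it moves toward the single-test value of 0.05, which is why accounting for correlation gives a less conservative adjustment than a Bonferroni correction.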
These approaches, which estimate the asymptotic multivariate normal distribution of the test statistics, have some limitations. They do not estimate significance levels for two-stage genotyping designs. A more severe restriction is that they are typically limited to several hundred correlated tests. Seaman and Müller-Myhsok and Conneely and Boehnke suggest that the number of samples should be at least 10 times the number of tests performed in order to accurately estimate the asymptotic multivariate normal distribution [11, 14]. Consequently, these methods cannot be directly applied to the hundreds of thousands of single marker tests in a genome-wide association study.
Other alternatives to permutation testing are based on importance sampling. Kimmel and Shamir have proposed a method that uses importance sampling to accurately estimate extremely small multiple-testing adjusted P-values, and Kimmel and colleagues have modified this method to work with data from a stratified population. Decay of linkage disequilibrium with increasing genomic distance is exploited to further improve the computational efficiency of these methods.
These importance sampling methods lack some of the features that are found in PRESTO. The methods do not calculate significance for two-stage genotyping designs, and they do not calculate adjusted P-values for general order statistics. In the extension to stratified data, the association test statistic used in Kimmel et al. will have suboptimal power because it ignores the population structure of the data (the population structure is incorporated in the importance sampling, but not in the test statistic). The method of Kimmel and colleagues can be modified to use a test statistic for stratified data (such as those used in PRESTO), but this would dramatically increase the computation time because their method loops through all possible contingency tables for each sampled permutation, and the number of contingency tables consistent with a permutation increases exponentially with the number of population strata.
Methods for computing multiple-testing adjusted P-values that are based on asymptotic multivariate normal distributions or importance sampling are more complex than permutation testing and require the asymptotic approximations to be accurate. In addition, when testing a single binary trait, these alternative methods provide little or no decrease in computational time relative to permutation testing with PRESTO, unless one is performing more than 1000 permutations.