Figure 1 | BMC Bioinformatics

Figure 1

From: Detecting purely epistatic multi-locus interactions by an omnibus permutation test on ensembles of two-locus analyses

Outline of 2LOmb. In this example, the algorithm takes a balanced case-control data set consisting of 400 samples and 1,000 SNPs. Each genotype is represented by an integer: 0 denotes a homozygous wild-type genotype, 1 denotes a heterozygous genotype and 2 denotes a homozygous variant (mutant) genotype. In two-locus analysis, a χ² contingency table is constructed for each pair of SNPs, which gives a total of C(1,000, 2) = 1,000 × 999 / 2 = 499,500 two-locus analyses. The Bonferroni-corrected χ² p-value for each two-locus analysis is therefore the lower of 499,500 × its uncorrected p-value and one. Within one ensemble, the Bonferroni-corrected χ² p-values from multiple two-locus analyses are combined via Fisher's combining function, which yields a Fisher's test statistic. The raw p-value for the ensemble is obtained through a permutation test consisting of 10,000 randomised permutation replicates. Since multiple ensembles may be tried while identifying the best association explanation, a global p-value is calculated to account for multiple hypothesis testing. The global p-value is estimated through the same permutation test that gives the raw p-value for each ensemble. The progressive search for the best association explanation proceeds by incrementally adding a two-SNP unit to the current best ensemble. The condition for search termination is based on both the raw p-value for the explored ensemble and the global p-value. In this example, the search is terminated after the fourth ensemble is explored because its raw p-value increases. The best SNP set for association explanation therefore contains SNP1, SNP2 and SNP3, with a global p-value, accounting for the testing of four hypotheses, of p < 0.0001.
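
The arithmetic in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the three uncorrected p-values are made-up inputs, Fisher's combining function is assumed to take its standard form (minus two times the sum of the log p-values), and the raw p-value is assumed to be the fraction of permutation replicates whose statistic is at least as large as the observed one.

```python
import math

def num_pairs(n_snps):
    """Number of distinct SNP pairs, i.e. two-locus analyses."""
    return math.comb(n_snps, 2)

def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value: the lower of n_tests * p and one."""
    return min(1.0, n_tests * p)

def fisher_statistic(p_values):
    """Standard Fisher's combining function: -2 * sum(ln p).
    Assumes every p-value is strictly positive."""
    return -2.0 * sum(math.log(p) for p in p_values)

def permutation_p_value(observed, permuted):
    """Fraction of permutation replicates whose statistic is >= observed
    (assumed estimator; the source does not spell out the formula)."""
    exceed = sum(1 for s in permuted if s >= observed)
    return exceed / len(permuted)

# 1,000 SNPs give 499,500 two-locus analyses, as in the caption.
n_tests = num_pairs(1000)

# Hypothetical uncorrected p-values for three two-locus analyses in one ensemble.
corrected = [bonferroni(p, n_tests) for p in (1e-9, 2e-8, 0.3)]
stat = fisher_statistic(corrected)
```

A corrected p-value that is capped at one contributes nothing to the statistic, since ln 1 = 0. In a full permutation test, the list passed to `permutation_p_value` would hold the statistic recomputed on each of the 10,000 replicates with case-control labels shuffled; an estimate of zero would then be reported as p < 0.0001.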