Figure 4 | BMC Bioinformatics

Figure 4

From: The choice of null distributions for detecting gene-gene interactions in genome-wide association studies


Null distributions of testing on the independent data set. We generate 500 null data sets, each with 2000 samples and 2000 SNPs. We divide each data set into three subsets of nearly equal size: the first is used for screening, the second for modeling, and the third for hypothesis testing. Upper panel: logistic regression (LR) is used in modeling. The degrees of freedom of the theoretical null distributions are df = 8, 26, and 80 for the 2-, 3-, and 4-way interaction models, respectively. The null distributions of LR match the theoretical null distributions well for the 2- and 3-way interaction models. The resulting null distribution of the 4-way interaction model follows a χ² distribution with df = 73.18. Here df = 73.18 is smaller than the theoretical value (df = 80) because only about 666 samples are available for hypothesis testing, which is too few to accurately estimate the large number of degrees of freedom of the theoretical null distribution (df = 80). Lower panel: MDR is used in modeling. The obtained null distributions are roughly the same as those shown in the upper panel of Figure 2.
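
The degrees of freedom and subset sizes quoted in the caption can be checked with a short sketch. It assumes, as is standard for a saturated genotype model but not stated in the caption, that each SNP has three genotypes, so a k-way model has 3^k - 1 degrees of freedom. The function names `full_model_df` and `split_three` are hypothetical, and the random shuffle is an illustrative choice, not necessarily how the authors partitioned their data.

```python
import random


def full_model_df(k, genotypes_per_snp=3):
    """Degrees of freedom of a saturated k-way genotype model (assumption:
    3 genotypes per SNP, giving 3**k cells minus one for the null model)."""
    return genotypes_per_snp ** k - 1


def split_three(n_samples, seed=0):
    """Shuffle sample indices and cut them into three nearly equal subsets
    (hypothetical helper; the paper's exact partitioning is not given)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    base, extra = divmod(n_samples, 3)
    parts, start = [], 0
    for i in range(3):
        # The first `extra` subsets receive one additional sample.
        size = base + (1 if i < extra else 0)
        parts.append(idx[start:start + size])
        start += size
    return parts


# df for the 2-, 3-, and 4-way interaction models
dfs = {k: full_model_df(k) for k in (2, 3, 4)}

# screening, modeling, and hypothesis-testing subsets of 2000 samples
screen_idx, model_idx, test_idx = split_three(2000)
sizes = [len(screen_idx), len(model_idx), len(test_idx)]
```

Under these assumptions `dfs` reproduces the values 8, 26, and 80, and the testing subset holds 666 of the 2000 samples, which is the sample size the caption cites as too small for the 4-way model.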
