Replication of epistatic DNA loci in two case-control GWAS studies using OPE algorithm

Goudey, Benjamin; Wang, Qiao; Rawlinson, Dave; Zarnegar, Armita; Kikianty, Eder; Markham, John; Macintyre, Geoff; Abraham, Gad; Stern, Linda; Inouye, Michael; Haviv, Izhak; Kowalczyk, Adam

doi:10.1186/1471-2105-12-S11-A5

Volume 12 Supplement 11

Highlights from the Seventh International Society for Computational Biology (ISCB) Student Council Symposium 2011

Oral presentation
Open access
Published: 21 November 2011

BMC Bioinformatics volume 12, Article number: A5 (2011)

Background

A limiting factor of current genome-wide association studies (GWAS) is that existing methods cannot comprehensively examine SNP interactions in a reasonably sized dataset. This limitation is hypothesised to be one reason that GWAS have not had a greater impact [1, 2]. Many current methods for handling interactions are computationally expensive and do not scale to entire studies. Those that do scale often achieve this by pruning the dataset, commonly by considering only SNPs that show strong marginal effects, even though a strongly interacting pair may consist of SNPs whose individual effects are weak.
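The last point can be illustrated with a toy example. The sketch below is not taken from the abstract: the counts are invented, and the two markers are simplified to carrier/non-carrier status. It shows a pair whose joint effect is strong while each marker on its own has an odds ratio of exactly 1, so pruning on marginal effects would discard both.

```python
# Hypothetical toy data: two binary markers A and B (1 = carrier, 0 = non-carrier).
# Each joint genotype maps to (cases, controls). All counts are invented.
counts = {
    (0, 0): (10, 40),
    (0, 1): (40, 10),
    (1, 0): (40, 10),
    (1, 1): (10, 40),
}


def odds_ratio(case_in, case_out, ctrl_in, ctrl_out):
    """Odds of being in the group for cases, divided by the same odds for controls."""
    return (case_in * ctrl_out) / (case_out * ctrl_in)


def marginal_or(counts, index):
    """Odds ratio for carrying one marker, ignoring the other marker."""
    case_in = sum(c for g, (c, _) in counts.items() if g[index] == 1)
    case_out = sum(c for g, (c, _) in counts.items() if g[index] == 0)
    ctrl_in = sum(k for g, (_, k) in counts.items() if g[index] == 1)
    ctrl_out = sum(k for g, (_, k) in counts.items() if g[index] == 0)
    return odds_ratio(case_in, case_out, ctrl_in, ctrl_out)


def joint_or(counts, risk_genotypes):
    """Odds ratio for carrying any joint genotype listed in risk_genotypes."""
    case_in = sum(c for g, (c, _) in counts.items() if g in risk_genotypes)
    case_out = sum(c for g, (c, _) in counts.items() if g not in risk_genotypes)
    ctrl_in = sum(k for g, (_, k) in counts.items() if g in risk_genotypes)
    ctrl_out = sum(k for g, (_, k) in counts.items() if g not in risk_genotypes)
    return odds_ratio(case_in, case_out, ctrl_in, ctrl_out)


print(marginal_or(counts, 0))                 # marker A alone
print(marginal_or(counts, 1))                 # marker B alone
print(joint_or(counts, {(0, 1), (1, 0)}))     # exactly one of A, B carried
```

With these invented counts, each marginal odds ratio is 1 while the joint odds ratio is 16, which is the situation an exhaustive pairwise search is meant to catch.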

Material and methods

In this presentation, we validate the robustness of a novel algorithm, Optimal Pairwise Epistasis (OPE), for exhaustively examining all pairwise interactions in GWAS data. The method is based on the systematic evaluation of “binary genotype pairs” (BG-pairs), i.e. pairs of complementary binary classifications of the genotype calls for an individual SNP or for a pair of SNPs. The discrimination potential of a BG-pair is quantified using a family of statistics based on odds ratios.
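One possible reading of the BG-pair idea is sketched below. It is an illustration under stated assumptions, not the OPE implementation: genotype calls are assumed to be coded 0/1/2, a BG-pair is taken to be a split of the genotype cells into a group and its complement, and the statistic is a plain odds ratio with a 0.5 continuity correction. The function names `bg_pairs`, `odds_ratio` and `strongest_split` are hypothetical.

```python
from itertools import combinations, product
from math import log

GENOTYPES = (0, 1, 2)  # assumed coding of genotype calls for one SNP


def bg_pairs(cells):
    """Yield each split of `cells` into (group, complement) exactly once."""
    cells = tuple(cells)
    anchor, rest = cells[0], cells[1:]
    # The first cell is always placed in `group`, so a split and its mirror
    # image are not both produced. Sizes stop short of len(rest) so that the
    # complement is never empty.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            group = frozenset((anchor,) + extra)
            yield group, frozenset(cells) - group


def odds_ratio(group, case_counts, control_counts, correction=0.5):
    """Odds ratio of falling in `group`, cases versus controls.

    `correction` is added to every cell to avoid division by zero; this
    choice is an assumption of the sketch, not something the abstract states.
    """
    case_in = sum(case_counts.get(g, 0) for g in group)
    case_out = sum(case_counts.values()) - case_in
    ctrl_in = sum(control_counts.get(g, 0) for g in group)
    ctrl_out = sum(control_counts.values()) - ctrl_in
    return ((case_in + correction) * (ctrl_out + correction)) / (
        (case_out + correction) * (ctrl_in + correction)
    )


def strongest_split(case_counts, control_counts, cells):
    """Return (odds ratio, group) for the split furthest from an odds ratio of 1."""
    scored = ((odds_ratio(g, case_counts, control_counts), g) for g, _ in bg_pairs(cells))
    return max(scored, key=lambda item: abs(log(item[0])))


single_snp_splits = list(bg_pairs(GENOTYPES))
snp_pair_splits = list(bg_pairs(product(GENOTYPES, repeat=2)))
print(len(single_snp_splits), len(snp_pair_splits))
```

Under these assumptions a single SNP has 3 such splits and a pair of SNPs (9 joint genotype cells) has 255, so scoring every split for every SNP pair remains a fixed, small amount of work per pair.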

Results and conclusion

The approach is computationally efficient: the dataset reported here as Study 1 (~310K SNPs and 2200 samples [3]) takes 12 hours to process on a single CPU, compared with 149 hours for the recent BOOST algorithm [4]. The method is highly parallelisable: a recent GPU implementation reduces the processing time to less than 15 minutes.
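To put these timings in context, the following back-of-the-envelope calculation uses only the figures quoted above. The SNP count is the approximate "~310K" value, so the derived numbers are rough estimates rather than reported results.

```python
from math import comb

n_snps = 310_000             # approximate SNP count quoted for Study 1
n_pairs = comb(n_snps, 2)    # unordered SNP pairs in an exhaustive search

cpu_hours_ope = 12           # single-CPU time quoted for OPE
cpu_hours_boost = 149        # time quoted for BOOST on the same dataset
gpu_minutes_upper = 15       # quoted upper bound for the GPU implementation

pairs_per_second_cpu = n_pairs / (cpu_hours_ope * 3600)
speedup_vs_boost = cpu_hours_boost / cpu_hours_ope
min_gpu_speedup = cpu_hours_ope * 60 / gpu_minutes_upper

print(f"SNP pairs examined:       {n_pairs:,}")            # 48,049,845,000
print(f"pairs per second (CPU):   {pairs_per_second_cpu:,.0f}")
print(f"speed-up over BOOST:      {speedup_vs_boost:.1f}x")
print(f"GPU speed-up over CPU:    at least {min_gpu_speedup:.0f}x")  # at least 48x
```

Roughly 48 billion pairs in 12 hours corresponds to a little over one million SNP pairs per second on a single CPU.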

We have tested our approach on two independent GWAS of celiac disease: Study 1 ([3], mentioned above) with 778 cases and 1422 controls, and Study 2 ([5]) with 1849 cases and 4936 controls. Each point in Figure 1 shows the observed frequency of BG carriers in the case and control subpopulations, in blue for a pair of SNPs and in yellow for an individual SNP. Every BG-pair can be read against two sets of axis labels: purple labels for the protective BG and black labels for the risk BG. The figure shows the two studies related by symmetry about the main diagonal, which indicates replication of results across studies. To emphasise the replicability of our approach, the same subset of SNP pairs is shown in green in both studies. Red contours mark p-values, and solid black and purple diagonal lines indicate different odds ratios.
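The quantities plotted for each point can be sketched as follows. The case and control totals are those quoted above; the carrier counts are invented for illustration. The p-value here is a Pearson chi-square test on the 2x2 carrier table (1 degree of freedom, no continuity correction), which is an assumption of this sketch and may differ from the statistic behind the contours in the figure. The replication rule in `replicates` is likewise hypothetical.

```python
from math import erfc, sqrt


def carrier_stats(case_carriers, n_cases, control_carriers, n_controls):
    """Carrier frequencies, odds ratio and chi-square p-value for one BG.

    Assumes every cell of the 2x2 table is non-zero.
    """
    a, b = case_carriers, n_cases - case_carriers            # cases: carrier / non-carrier
    c, d = control_carriers, n_controls - control_carriers   # controls: carrier / non-carrier
    odds = (a * d) / (b * c)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))  # chi-square survival function for 1 degree of freedom
    return a / n_cases, c / n_controls, odds, p_value


def replicates(stats_1, stats_2, alpha=0.05):
    """Hypothetical rule: same direction of effect and p < alpha in both studies."""
    same_direction = (stats_1[2] > 1) == (stats_2[2] > 1)
    return same_direction and stats_1[3] < alpha and stats_2[3] < alpha


# Study sizes from the text; carrier counts are invented.
study_1 = carrier_stats(case_carriers=300, n_cases=778, control_carriers=350, n_controls=1422)
study_2 = carrier_stats(case_carriers=700, n_cases=1849, control_carriers=1200, n_controls=4936)

for name, (f_case, f_ctrl, odds, p) in (("Study 1", study_1), ("Study 2", study_2)):
    print(f"{name}: case freq {f_case:.3f}, control freq {f_ctrl:.3f}, OR {odds:.2f}, p {p:.2e}")
print("replicated:", replicates(study_1, study_2))
```

The first two values returned by `carrier_stats` are the coordinates of a point in a plot of case frequency against control frequency, so a BG that lands in a similar position in both studies is one that replicates.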

References

1. Cordell HJ: Detecting gene-gene interactions that underlie human diseases. Nat Rev Genet 2009, 10: 392–404.
2. Moore JH, Asselbergs FW, Williams SM: Bioinformatics challenges for genome-wide association studies. Bioinformatics 2010, 26: 445–455. 10.1093/bioinformatics/btp713
3. van Heel DA, Franke L, Hunt KA, Gwilliam R, Zhernakova A, Inouye M, Wapenaar MC, Barnardo MC, Bethel G, Holmes GK, et al.: A genome-wide association study for celiac disease identifies risk variants in the region harboring IL2 and IL21. Nat Genet 2007, 39: 827–829. 10.1038/ng2058
4. Wan X, Yang C, Yang Q, Xue H, Fan X, Tang NL, Yu W: BOOST: a fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet 2010, 87: 325–340. 10.1016/j.ajhg.2010.07.021
5. Dubois PC, Trynka G, Franke L, Hunt KA, Romanos J, Curtotti A, Zhernakova A, Heap GA, Adány R, Aromaa A, et al.: Multiple common variants for celiac disease influencing immune gene expression. Nat Genet 2010, 42: 295–302. 10.1038/ng.543

Author information

Authors and Affiliations

Department of Software Engineering and Computer Science, The University of Melbourne, Parkville, Victoria, 3010, Australia
Benjamin Goudey, Geoff Macintyre, Gad Abraham & Linda Stern
National ICT Australia (NICTA) Victoria Research Laboratories, The University of Melbourne, Parkville, Victoria, 3010, Australia
Benjamin Goudey, Qiao Wang, Dave Rawlinson, Armita Zarnegar, Eder Kikianty, John Markham, Geoff Macintyre, Gad Abraham, Izhak Haviv & Adam Kowalczyk
The Walter and Eliza Hall Institute of Medical Research, Parkville, Victoria, 3050, Australia
Michael Inouye
Baker IDI Heart and Diabetes Institute, Melbourne, Victoria, 3004, Australia
Izhak Haviv
Department of Medical Biology, University of Melbourne, Parkville, Victoria, 3010, Australia
Michael Inouye

Corresponding author

Correspondence to Benjamin Goudey.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Goudey, B., Wang, Q., Rawlinson, D. et al. Replication of epistatic DNA loci in two case-control GWAS studies using OPE algorithm. BMC Bioinformatics 12 (Suppl 11), A5 (2011). https://doi.org/10.1186/1471-2105-12-S11-A5
