Efficient branch-and-bound techniques for two-locus association mapping

Klotzbücher, Karin; Kobayashi, Yasushi; Shervashidze, Nino; Stegle, Oliver; Müller-Myhsok, Bertram; Weigel, Detlef; Borgwardt, Karsten

doi:10.1186/1471-2105-12-S11-A3

Volume 12 Supplement 11

Highlights from the Seventh International Society for Computational Biology (ISCB) Student Council Symposium 2011

Oral presentation
Open access
Published: 21 November 2011

Karin Klotzbücher¹,⁴,
Yasushi Kobayashi²,
Nino Shervashidze¹,
Oliver Stegle¹,
Bertram Müller-Myhsok³,
Detlef Weigel² &
Karsten Borgwardt¹

BMC Bioinformatics volume 12, Article number: A3 (2011)

Background

In this project we aim to identify pairs of single nucleotide polymorphisms (SNPs) that have a statistically significant effect on phenotypic variation in the flowering time of Arabidopsis thaliana.

Material and methods

For a large-scale dataset of over 200,000 SNPs from about 200 individuals together with several phenotypes, published by Atwell et al. [1], we develop efficient methods to find pairs of SNPs that are strongly associated with the phenotype. Because an exhaustive search over all possible combinations of interacting SNPs is often infeasible, even when only pairs of SNPs are considered, the challenge is to find methods that avoid an exhaustive search but are still guaranteed to find the causal pair. We propose two distinct approaches to efficiently determine the t top-scoring pairs of SNPs.
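To give a sense of scale, the number of unordered SNP pairs can be computed directly. The short sketch below uses 200,000 SNPs as a lower bound taken from the dataset description above; the function name count_pairs is ours and purely illustrative.

```python
from math import comb


def count_pairs(n_snps):
    """Number of unordered SNP pairs an exhaustive two-locus scan must score."""
    return comb(n_snps, 2)


# 200,000 is a lower bound: the dataset is described as "over 200,000 SNPs".
print(f"{count_pairs(200_000):,}")  # 19,999,900,000
```

Even at this lower bound, an exhaustive scan has to score roughly 2 × 10^10 pairs for each phenotype.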

Results and conclusions

In the first approach we employ a branch-and-bound strategy to reduce the search space by pruning insignificant pairs of SNPs. Based on this strategy we develop two methods, fastHSIC and COAT, which use the Hilbert-Schmidt Independence Criterion (HSIC) [2] and Pearson's correlation coefficient, respectively, as association measures. The key idea is that, for both methods, the association score of a pair of SNPs can be bounded using only the association score of one SNP of the pair.
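The following is a minimal sketch of this pruning idea, not the fastHSIC or COAT implementation, whose actual bounds are not given in this abstract. It assumes a hypothetical additive pair score (the sum of the two single-SNP scores), for which a pair can never score more than twice its higher-ranked SNP. The function names and the scoring rule are assumptions made for illustration only.

```python
import heapq
from itertools import combinations


def top_pairs_exhaustive(single, t):
    """Score every pair; pair score = sum of single-SNP scores (assumption)."""
    scored = ((single[i] + single[j], i, j)
              for i, j in combinations(range(len(single)), 2))
    return heapq.nlargest(t, scored)


def top_pairs_bounded(single, t):
    """Return the same t best scores, pruning with a one-SNP upper bound."""
    # Visit SNPs from the highest to the lowest single-SNP score.
    order = sorted(range(len(single)), key=lambda k: single[k], reverse=True)
    best = []       # min-heap of the t best (score, i, j) found so far
    evaluated = 0   # number of pairs actually scored
    for a, i in enumerate(order[:-1]):
        # Every partner ranked after i scores at most single[i], so no pair
        # led by i can exceed 2 * single[i]: a bound that uses SNP i alone.
        if len(best) == t and 2 * single[i] <= best[0][0]:
            break   # all later SNPs have even smaller bounds
        for j in order[a + 1:]:
            score = single[i] + single[j]
            evaluated += 1
            if len(best) < t:
                heapq.heappush(best, (score, i, j))
            elif score > best[0][0]:
                heapq.heapreplace(best, (score, i, j))
            else:
                break  # remaining partners of i cannot score higher
    return sorted(best, reverse=True), evaluated
```

Both functions return the same t best scores. The bounded variant also reports how many pairs it actually scored, which can be compared with the n(n-1)/2 pairs visited by the exhaustive search.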

In our second approach we use prior biological knowledge to select a much smaller subset of candidate genes which, according to other findings, affect the flowering time of Arabidopsis thaliana. These candidate genes and the interactions between them form a network of 1,452 nodes (genes) and 938 edges (gene-gene interactions), which allows us to select the subset of SNPs that lie within or in close proximity to the genes of the network.
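The SNP selection step might be sketched as follows. The data layout (tuples of chromosome and position for SNPs, and of chromosome, start and end for genes), the function name and the window parameter are all assumptions for illustration; the abstract does not state how "close proximity" was defined.

```python
from bisect import bisect_left, bisect_right


def select_snps_near_genes(snps, genes, window=0):
    """Return indices of SNPs inside a gene or within `window` bp of one.

    snps  : list of (chrom, pos) tuples          (hypothetical layout)
    genes : list of (chrom, start, end) tuples   (hypothetical layout)
    """
    # Group SNPs by chromosome and sort them by position for binary search.
    index = {}
    for idx, (chrom, pos) in enumerate(snps):
        index.setdefault(chrom, []).append((pos, idx))
    index = {c: sorted(v) for c, v in index.items()}
    positions = {c: [p for p, _ in v] for c, v in index.items()}

    selected = set()
    for chrom, start, end in genes:
        if chrom not in index:
            continue
        lo = bisect_left(positions[chrom], start - window)
        hi = bisect_right(positions[chrom], end + window)
        selected.update(i for _, i in index[chrom][lo:hi])
    return sorted(selected)
```

Restricting the pair search to the returned indices shrinks the number of candidate pairs quadratically with the size of the selected subset.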

Empirical evaluation of our own as well as traditional methods on the original and the reduced dataset shows that both our approaches can greatly reduce the runtime.

References

1. Atwell S, Huang YS, Vilhjalmsson BJ, Willems G, Horton M, Li Y, Meng D, Platt A, Tarone AM, Hu TT, et al.: Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature 2010, 465(7298):627–631. 10.1038/nature08800
2. Gretton A, Bousquet O, Smola A, Schölkopf B: Measuring statistical dependence with Hilbert-Schmidt Norms. Proceedings of the International Conference on Algorithmic Learning Theory 2005:63–78.

Author information

Authors and Affiliations

Machine Learning & Computational Biology Research Group, MPIs Tübingen, Tübingen, Germany
Karin Klotzbücher, Nino Shervashidze, Oliver Stegle & Karsten Borgwardt
Max Planck Institute for Developmental Biology, Tübingen, Germany
Yasushi Kobayashi & Detlef Weigel
Max Planck Institute for Psychiatry, Munich, Germany
Bertram Müller-Myhsok
Zentrum für Bioinformatik, Universität Tübingen, Tübingen, Germany
Karin Klotzbücher

Corresponding author

Correspondence to Karin Klotzbücher.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Klotzbücher, K., Kobayashi, Y., Shervashidze, N. et al. Efficient branch-and-bound techniques for two-locus association mapping. BMC Bioinformatics 12 (Suppl 11), A3 (2011). https://doi.org/10.1186/1471-2105-12-S11-A3
