alleHap: an efficient algorithm to reconstruct zero-recombinant haplotypes from parent-offspring pedigrees

Medina-Rodríguez, Nathan; Santana, Angelo; Wägner, Ana M; Quinteiro, José M

doi:10.1186/1471-2105-15-S3-A6

Volume 15 Supplement 3

Highlights from the Ninth International Society for Computational Biology (ISCB) Student Council Symposium 2013

Meeting abstract
Open access
Published: 11 February 2014

BMC Bioinformatics volume 15, Article number: A6 (2014)

Background

Haplotype inference is an essential stage in genetic linkage analysis, and estimation methods are also frequently used to reconstruct haplotypes in current genetic association studies. Most of the latter focus on phasing haplotypes from recombinant DNA regions of unrelated individuals and use likelihood-based methods to infer the alleles present at several loci, relying on very time-consuming probabilistic algorithms.

So far, the literature does not analyze haplotypes using deterministic techniques, and there are hardly any alternative methods for constructing haplotypes from non-recombinant DNA regions, even though computational inference by probabilistic models may produce a large number of incorrect inferences.

Description and results

We have developed an algorithm called alleHap, which imputes alleles in parent-offspring pedigree databases with missing family members and then constructs the corresponding unambiguous haplotypes.

The alleHap algorithm is based on a preliminary analysis of all possible combinations that may exist in the genotyping of a family, considering that each member, due to meiosis, should unequivocally have two alleles, one from each parent. The analysis is founded on the differentiation of seven cases, as described in [1], some of which are divided into up to three variants, each representing a different combination of alleles among the family members (Table 1).
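
The constraint that every member carries one allele from each parent can be illustrated with a minimal Python sketch. This is not the alleHap implementation, and it does not reproduce the seven cases of [1]; the function names `offspring_genotypes` and `is_mendelian_consistent` are hypothetical and only show how the possible allelic combinations at a single locus can be enumerated.

```python
from itertools import product


def offspring_genotypes(father, mother):
    """Return all unordered child genotypes compatible with two parents.

    Each parent is a pair of alleles at one locus; a child receives
    exactly one allele from the father and one from the mother.
    """
    return {tuple(sorted((a, b))) for a, b in product(father, mother)}


def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from the parental genotypes."""
    return tuple(sorted(child)) in offspring_genotypes(father, mother)


father = ("A", "C")
mother = ("A", "G")
print(sorted(offspring_genotypes(father, mother)))
# [('A', 'A'), ('A', 'C'), ('A', 'G'), ('C', 'G')]
print(is_mendelian_consistent(father, mother, ("G", "C")))  # True
print(is_mendelian_consistent(father, mother, ("G", "G")))  # False
```

A genotype that fails this check points to a genotyping inconsistency in the family rather than to a new combination.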

Table 1 Possible allelic combinations in a parent-offspring pedigree

The classification by cases and variants allows the algorithm to impute missing values efficiently in the loaded database and then to proceed to the construction of the corresponding unambiguous haplotypes. Furthermore, the algorithm places no limit on the number of SNPs, i.e. it enables the construction of haplotypes of more than two SNPs.
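
As an illustration of deterministic imputation, the following Python sketch recovers the alleles that a missing parent must carry at one locus, given the other parent and the genotyped children. It is a simplified assumption of how such a rule could look, not the case-and-variant classification used by alleHap; `from_missing_parent` and `impute_missing_parent` are hypothetical names.

```python
from itertools import product


def from_missing_parent(child, known_parent):
    """Alleles the missing parent may have transmitted to this child."""
    x, y = child
    candidates = set()
    if x in known_parent:  # x came from the known parent, so y did not
        candidates.add(y)
    if y in known_parent:  # y came from the known parent, so x did not
        candidates.add(x)
    return candidates


def impute_missing_parent(known_parent, children):
    """Return the sorted alleles (0, 1 or 2) the missing parent must carry.

    Raises ValueError when the family data are Mendelian-inconsistent.
    """
    certain = set()
    for child in children:
        candidates = from_missing_parent(child, known_parent)
        if not candidates:
            raise ValueError(f"child {child} is incompatible with {known_parent}")
        if len(candidates) == 1:  # only unambiguous transmissions are kept
            certain |= candidates
    if len(certain) > 2:
        raise ValueError("children require more than two alleles from one parent")
    if len(certain) == 2:
        # Both alleles known: every child must fit the completed parent pair.
        possible = {tuple(sorted(p)) for p in product(known_parent, certain)}
        for child in children:
            if tuple(sorted(child)) not in possible:
                raise ValueError(f"child {child} contradicts the imputed parent")
    return sorted(certain)


mother = ("A", "C")
print(impute_missing_parent(mother, [("A", "G"), ("C", "T")]))  # ['G', 'T']
print(impute_missing_parent(mother, [("A", "G"), ("A", "C")]))  # ['G']
```

In the second call the child with genotype A/C is ambiguous, so only one allele of the missing parent is imputed and the other is left missing instead of being guessed.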

An analysis of all possible combinations of a parent-offspring pedigree in which parents may be missing shows that, as long as one child has been genotyped, an unequivocal imputation of three possible parent haplotypes is theoretically possible in 92.3% of cases even when one parent is missing. When neither parent has been genotyped, at least two haplotypes can be constructed in 36.4% of cases. Regarding offspring allele imputation with both parents fully genotyped, a minimum of one haplotype for each child may be successfully reconstructed in 6.1% of possible cases.
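
To make the idea of reconstructing a child's haplotypes from fully genotyped parents concrete, here is a hedged Python sketch. It phases each SNP by asking which allele can only have come from the father and which from the mother, and marks a locus with `?` when both assignments are possible. The functions `phase_locus` and `phase_child` are hypothetical, and the sketch does not reproduce the percentages quoted above.

```python
def phase_locus(father, mother, child):
    """Return (paternal_allele, maternal_allele), or None if ambiguous."""
    x, y = child
    options = set()
    if x in father and y in mother:
        options.add((x, y))
    if y in father and x in mother:
        options.add((y, x))
    if not options:
        raise ValueError(f"genotype {child} is Mendelian-inconsistent")
    return options.pop() if len(options) == 1 else None


def phase_child(father, mother, child):
    """Phase a child across several SNPs.

    Each argument is a list of genotypes (allele pairs), one per SNP.
    Returns the paternal and maternal haplotypes as strings, with '?'
    at loci where the origin of the alleles cannot be decided.
    """
    paternal, maternal = [], []
    for f, m, c in zip(father, mother, child):
        p, q = phase_locus(f, m, c) or ("?", "?")
        paternal.append(p)
        maternal.append(q)
    return "".join(paternal), "".join(maternal)


father = [("A", "A"), ("C", "T"), ("G", "G")]
mother = [("A", "G"), ("C", "C"), ("G", "T")]
child = [("A", "G"), ("T", "C"), ("G", "T")]
print(phase_child(father, mother, child))  # ('ATG', 'GCT')
```

A locus at which father, mother and child are all heterozygous for the same two alleles stays ambiguous in this sketch; resolving it would need information from siblings under the zero-recombination assumption.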

Evaluation of the results (Figure 1) reveals an optimum performance of the alleHap computational tasks, namely Simulation, Imputation and Reconstruction. Their execution times are quite low even for a large number of families (≤ 2000) and SNPs (≤ 50).

Figure 2 shows that our algorithm achieves high allele imputation rates (about 65%) even when the probability of missing parents in each family is high (>50%). Haplotype reconstruction rates show an almost linear relationship with the number of missing individuals per family. This is because alleHap relies mainly on the information carried by the offspring, so the more children are missing, the more difficult it is to reconstruct the family haplotypes.
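
The Simulation task can be pictured with a small Python sketch of a zero-recombinant family. It is an assumption about what such a simulation might involve, not the procedure used to produce Figure 1: the function name `simulate_family`, the parameter `p_missing_parent` and the choice of biallelic SNPs drawn from `ACGT` are all hypothetical.

```python
import random


def simulate_family(n_snps, n_children, p_missing_parent=0.0, rng=None):
    """Simulate genotypes for one family without recombination.

    Each parent carries two haplotypes; each child inherits one whole
    haplotype from the father and one from the mother. A parent is
    replaced by None (not genotyped) with probability p_missing_parent.
    """
    rng = rng or random.Random()
    # Each SNP is biallelic: draw two distinct nucleotides per locus.
    snp_alleles = [rng.sample("ACGT", 2) for _ in range(n_snps)]

    def random_haplotype():
        return [rng.choice(pair) for pair in snp_alleles]

    def genotype(hap_a, hap_b):
        # Unordered allele pair per SNP, as a genotyping platform reports it.
        return [tuple(sorted(pair)) for pair in zip(hap_a, hap_b)]

    father = (random_haplotype(), random_haplotype())
    mother = (random_haplotype(), random_haplotype())
    # Zero recombination: a child receives entire parental haplotypes.
    children = [(rng.choice(father), rng.choice(mother)) for _ in range(n_children)]

    family = {
        "father": genotype(*father),
        "mother": genotype(*mother),
        "children": [genotype(*haps) for haps in children],
    }
    for parent in ("father", "mother"):
        if rng.random() < p_missing_parent:
            family[parent] = None
    return family


family = simulate_family(n_snps=4, n_children=2, p_missing_parent=0.5,
                         rng=random.Random(42))
print(family)
```

Passing a seeded `random.Random` instance keeps a run reproducible, which is convenient when imputation and reconstruction rates are compared across settings.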

Conclusions

alleHap has been tested with simulations and with the Type 1 Diabetes Genetics Consortium database [2]. The algorithm is very robust against inconsistencies within the genotypic data and consumes very little time, even when handling large amounts of data. Imputing the missing data may improve results in numerous epidemiological and/or genetic linkage studies.

Our algorithm could be a useful instrument for information retrieval and knowledge discovery in genetics, since it would allow epidemiological specialists to discover new intergenic patterns by studying zero-recombinant haplotypes with a larger number of SNPs from family-based databases.

References

1. Berger-Wolf TY: Reconstructing sibling relationships in wild populations. Bioinformatics. 2007, 23: i49-i56. 10.1093/bioinformatics/btm219.
2. Rich SS: The Type 1 Diabetes Genetics Consortium. Ann N Y Acad Sci. 2006, 1079: 1-8. 10.1196/annals.1375.001.

Author information

Authors and Affiliations

Department of Mathematics, Universidad de Las Palmas de Gran Canaria, Campus de Tafira, 35017, Las Palmas, Spain
Nathan Medina-Rodríguez & Angelo Santana
IUMA - Information and Communication Systems, Universidad de Las Palmas de Gran Canaria, Campus de Tafira, 35017, Las Palmas, Spain
Nathan Medina-Rodríguez & José M Quinteiro
Department of Medical and Surgical Sciences, Universidad de Las Palmas de Gran Canaria, Campus de Tafira, 35017, Las Palmas, Spain
Ana M Wägner

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Medina-Rodríguez, N., Santana, A., Wägner, A.M. et al. alleHap: an efficient algorithm to reconstruct zero-recombinant haplotypes from parent-offspring pedigrees. BMC Bioinformatics 15 (Suppl 3), A6 (2014). https://doi.org/10.1186/1471-2105-15-S3-A6
