GENOMEPOP: A program to simulate genomes in populations
© Carvajal-Rodríguez; licensee BioMed Central Ltd. 2008
Received: 05 February 2008
Accepted: 30 April 2008
Published: 30 April 2008
There are several situations in population biology research where simulating DNA sequences is useful. Simulation of biological populations under different evolutionary genetic models can be undertaken using backward or forward strategies. Backward simulations, also called coalescent-based simulations, are computationally efficient. The reason is that they are based on the history of lineages with surviving offspring in the current population. On the contrary, forward simulations are less efficient because the entire population is simulated from past to present. However, the coalescent framework imposes some limitations that forward simulation does not. Hence, there is an increasing interest in forward population genetic simulation and efficient new tools have been developed recently. Software tools that allow efficient simulation of large DNA fragments under complex evolutionary models will be very helpful when trying to better understand the trace left on the DNA by the different interacting evolutionary forces. Here I will introduce GenomePop, a forward simulation program that fulfills the above requirements. The use of the program is demonstrated by studying the impact of intracodon recombination on global and site-specific dN/dS estimation.
I have developed algorithms and written software to efficiently simulate, forward in time, different Markovian nucleotide or codon models of DNA mutation. Such models can be combined with recombination, at inter and intra codon levels, fitness-based selection and complex demographic scenarios.
GenomePop has many useful characteristics for simulating SNPs or DNA sequences under complex evolutionary and demographic models, and their combination makes it unique with respect to other simulation tools. These include forward simulation under General Time Reversible (GTR) mutation or GTR×MG94 codon models with intra-codon recombination; arbitrary, user-defined migration patterns; diploid or haploid models; and constant or variable population sizes. It also allows simulation of fitness-based selection under different distributions of mutational effects. Under the 2-allele model it allows the simulation of recombination hot-spots and the definition of different allele frequencies in different populations. GenomePop can also manage large DNA fragments. In addition, it has a scaling option to save computation time when simulating large sequences and population sizes under complex demographic and evolutionary situations. These and many other features are detailed on its web page.
There are several situations in population biology research where simulation of DNA sequences is useful. Simulations have been used for hypothesis testing [2–4], to study the impact of differing demographic scenarios on patterns of human diversity, or to simulate the evolution of complex diseases in human populations [6, 7]. In addition, population simulation of genetic datasets is also used to estimate population parameters [8–10].
One of the most exciting research areas in the current context of population genetics is the HapMap project. Knowledge about patterns of linkage disequilibrium (LD) in humans is very important from a genomic point of view. The existence of linkage or haplotype blocks or, at least, networks of SNPs in high LD, will facilitate the assembly of human genome haplotype maps [13–15] that will enormously improve, among other things, the efficiency of disease gene mapping. It seems that these blocks are mainly defined by recombination hot spots [16, 17], but haplotype blocks can also be generated by genetic drift in regions of uniform recombination if recombination rates are low enough. We now have growing empirical knowledge about haplotype block and tagSNP diversity, but less is known about the effect of population demographic history. Though important work has been undertaken in the application of population genetics to LD mapping [19–22] and its relevance to human populations [23–25], we still have an incomplete understanding of how the combined effect of genetic drift, mutation, recombination and migration affects LD and tagSNP patterns, although it is known that it does. Moreover, recombination is an important evolutionary process for understanding how genetic diversity is generated and maintained in populations. Jointly with positive selection, recombination allows for very high rates of evolution. However, the impact of recombination is dependent on other forces, such as selection and demography. Developing tools that allow simultaneous simulation of natural selection, recombination and complex demographic patterns will be of great help in trying to better understand the trace left on the DNA by the different interacting evolutionary forces.
Simulation of biological populations under different evolutionary genetic models can be done following backward or forward strategies. Backward simulations, also called coalescent-based simulations, are computationally very efficient because they are based on the history of lineages with surviving offspring in the current population and ignore all individuals that are not ancestral to the present-day population. Hence, the coalescent is a sample-based theory relevant to the study of population samples and DNA sequence data. From its beginnings, the basic coalescent has been extended in several useful ways, for example to include structured population models [28–32], changing population size [33–35], recombination [36, 37] and selection [38–43].
In contrast, forward simulations are less efficient because the entire population is simulated from past to present. However, the coalescent framework imposes some limitations that forward simulation does not. The first of these is the same feature that makes it efficient: the coalescent does not keep track of the complete ancestral information, i.e. it only takes into account ancestries that survived to form the present-day sample. Thus, if the interest is focused on the evolutionary process itself, rather than on its outcome, forward simulations should be preferred. Second, even simple genetic forces such as selection complicate coalescent simulations, and although different evolutionary scenarios have been incorporated (see above) it is still difficult to implement models that combine selection, variable population size, recombination, complex mating schemes, and so on. In fact, we can only simulate limited forms of recombination and selection under the coalescent. It is known that recombination has a major impact on the detection of positive natural selection [45, 46]. Shriner et al. studied the impact of recombination under a neutral model. Anisimova et al. studied the effect of recombination under a coalescent codon-based model, i.e. the unit of change was the codon instead of the nucleotide. In the latter case, recombination was not simulated at the intracodon level. Therefore, the importance of intracodon recombination under a given codon-based model remains unknown. Moreover, coalescent methods cannot yet simulate realistic samples of complex human diseases. Indeed, when simulating non-neutral scenarios and/or complex models under the coalescent, much of its computational efficiency is lost (however, see recent work by Marjoram and Liang). Furthermore, the coalescent model is based on specific limiting values and relationships between some important parameters.
Hence, there is increasing interest in forward population genetic simulation, and new efficient tools have recently been developed [50–52]. A program that allows the forward-in-time simulation of different Markovian nucleotide or codon models of DNA mutation, combined with recombination at inter and intra codon levels, fitness-based selection and complex demographic scenarios, will therefore be of great interest. Here I introduce GenomePop, a program that fulfills these requirements.
GenomePop uses a simple and efficient algorithm to perform forward simulation of populations and/or genomes. The basic idea is to describe an individual by the differences (mutations) between that individual and a reference or consensus genotype. Thus, each individual is no longer represented by its complete sequence or genotype but by the mutations it carries with respect to the consensus. A more detailed explanation of the algorithm is provided at the program web page. Taking advantage of the efficiency of this approach, GenomePop can simulate, forward in time, DNA sequences under specific Markov models. The program allows the simulation of recombination under both nucleotide and codon models of evolution, providing a way to simulate recombination at inter and intracodon levels under codon models. It also permits arbitrary migration models, simulation of SNPs, recombination hot-spots, fitness-based selection and many other features that are detailed in the program web page. GenomePop supports different output formats, such as GenePop for SNPs and Phylip or Nexus for DNA sequences.
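The difference-based representation described above can be illustrated with a minimal sketch. It is not GenomePop's actual data structure: the function names, the dictionary layout and the example sequence are hypothetical and chosen only to show why storing mutations relative to a consensus is economical.

```python
# Hypothetical sketch: an individual is stored as a {position: base} map of
# differences from a shared consensus sequence, not as a full sequence.
# Names and layout are illustrative assumptions, not GenomePop's internals.

def materialize(consensus, mutations):
    """Rebuild the full sequence from the consensus and the mutation map."""
    seq = list(consensus)
    for pos, base in mutations.items():
        seq[pos] = base
    return "".join(seq)


def mutate(consensus, mutations, pos, new_base):
    """Return a new mutation map after setting site `pos` to `new_base`.

    A mutation back to the consensus base removes the entry, so the map
    always holds only true differences from the consensus.
    """
    updated = dict(mutations)
    if new_base == consensus[pos]:
        updated.pop(pos, None)
    else:
        updated[pos] = new_base
    return updated


consensus = "ACGTACGTAC"
individual = {}                                   # identical to the consensus
individual = mutate(consensus, individual, 3, "A")
individual = mutate(consensus, individual, 7, "C")
print(materialize(consensus, individual))         # ACGAACGCAC
reverted = mutate(consensus, individual, 3, "T")  # back mutation at site 3
print(reverted)                                   # {7: 'C'}
```

Under this layout the memory needed per individual grows with the number of mutations it carries rather than with the sequence length, which is presumably what makes large fragments tractable.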
Markov models of DNA mutation
Markov processes are used in molecular evolution to describe the change between nucleotides, amino acids or codons over evolutionary time. Usually, time is measured as the number of substitutions because molecular sequence data do not allow the separate estimation of the rate and the time, but only of their product. In the context of forward simulation we are not interested in the transition after an arbitrary time t (branch length) but just in the transition from one nucleotide or codon to another, given that a mutation occurs. An advantage of this approach is that we need to compute the transition matrix just once, at the beginning of the evolutionary process. Consider a given instantaneous substitution rate matrix Q, which allows for a complete definition of any Markovian substitution model. The matrix $M = -qQ + I$ is the conditional transition matrix to go from i to j provided that a substitution occurs, where $q = \mathrm{diag}(1/q_i)$, with $q_i$ the i-th diagonal element of Q, and I is the identity matrix. With this definition each row of M sums to one and its diagonal is zero. Then, given an instantaneous substitution matrix Q, estimated for example using the PAUP or HyPhy programs, we can obtain the corresponding transition matrix M that can be used to produce the necessary mutation process in a forward-in-time evolutionary model.
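As a worked illustration of the conditional transition matrix, the following sketch computes M from a rate matrix Q element by element: the off-diagonal entry for i to j is the rate divided by the total rate of leaving i, and the diagonal is zero. The numerical values of Q, the base ordering A, C, G, T and the function names are assumptions made for the example; they are not taken from GenomePop.

```python
import random


def conditional_transition_matrix(Q):
    """Return M with M[i][j] = -Q[i][j] / Q[i][i] for i != j and M[i][i] = 0.

    This is the element-wise form of M = -qQ + I with q = diag(1 / Q[i][i]).
    """
    n = len(Q)
    M = []
    for i in range(n):
        q_ii = Q[i][i]
        if q_ii >= 0:
            raise ValueError("diagonal entries of Q must be negative")
        M.append([0.0 if i == j else -Q[i][j] / q_ii for j in range(n)])
    return M


def mutate_base(base, M, rng, alphabet="ACGT"):
    """Draw the new base, given that a substitution occurs at a site holding `base`."""
    i = alphabet.index(base)
    return rng.choices(alphabet, weights=M[i])[0]


# Illustrative rate matrix (each row sums to zero); the values are made up.
Q = [
    [-1.0, 0.2, 0.6, 0.2],
    [0.1, -0.8, 0.1, 0.6],
    [0.6, 0.2, -1.0, 0.2],
    [0.1, 0.6, 0.1, -0.8],
]
M = conditional_transition_matrix(Q)  # computed once, before the simulation starts
rng = random.Random(1)
new_base = mutate_base("A", M, rng)   # never "A": the diagonal of M is zero
```

Because M does not depend on elapsed time, it can be computed once and reused every time a mutation event is drawn during the forward simulation.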
There are two basic biological models implemented in GenomePop, namely "viral" and "non-viral". The only difference between them is that in the viral model the initial sequences differ between populations, as different viruses infect different individuals. Thus, the user can define a viral model by indicating the percentage of sequence identity (0–100) between the sequences of the distinct populations. By default the sequence identity is zero, i.e. the sequences of each population are set at random. In the non-viral model the initial sequence is the same for every population (identity of 100%).
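One possible reading of the identity parameter is sketched below. It assumes that each site of a population's initial sequence keeps the reference base with probability equal to the requested identity and is otherwise drawn uniformly at random; this per-site rule, the function name and the seed are assumptions for illustration, and the program may define identity differently.

```python
import random


def initial_sequence(reference, identity_percent, rng, alphabet="ACGT"):
    """Build a population's initial sequence from a reference sequence.

    Assumption (not taken from GenomePop): each site keeps the reference base
    with probability identity_percent / 100, otherwise it is drawn uniformly
    from the alphabet.
    """
    if not 0 <= identity_percent <= 100:
        raise ValueError("identity_percent must be between 0 and 100")
    keep = identity_percent / 100.0
    return "".join(
        base if rng.random() < keep else rng.choice(alphabet)
        for base in reference
    )


rng = random.Random(42)
reference = "ACGTACGTACGTACGTACGT"
non_viral = initial_sequence(reference, 100, rng)  # identical in every population
viral_default = initial_sequence(reference, 0, rng)  # fully random sequence
```

Note that under this rule a randomly drawn base can coincide with the reference by chance, so the realized identity between two sequences is on average higher than the requested value.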
DNA models, recombination and selection
Table: GenomePop DNA models. The table body was not recovered; the only surviving entry is MG94 × (JC/GTR).
Clearly, the more complex the model defined, the slower the simulation. To avoid high computation times, GenomePop incorporates a scaling option based on the fact that, under neutral models, we can scale the population size N and the time t, provided the consequent correction to the mutation (μ), migration (m) and recombination (r) rates holds the corresponding compound products Nμ, Nr, Nm, etc., constant.
Thus, the input in Figure 2 generates 100 datasets under a GTR model with substitution rates typical for HIV. Both recurrent and back mutation are allowed. The system will evolve 1 chromosome of 1 Kb under the given model over 20,000 generations. As can be seen in Figure 2, a scaling of 10 was used, which implies that both the population size and the number of generations were divided by 10 and the mutation rate was multiplied by the same factor. A more exhaustive explanation of the input facilities of GenomePop is provided at the program web page.
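The arithmetic behind the scaling option can be sketched as follows. The function and parameter names are hypothetical, and the population size and rates in the example are illustrative values (only the 20,000 generations and the scaling factor of 10 come from the text); the sketch simply checks that the compound products Nμ, Nr and Nm are unchanged after scaling.

```python
import math


def scale_parameters(N, generations, mu, r, m, factor):
    """Divide N and the number of generations by `factor` and multiply the
    mutation, recombination and migration rates by it, so that the products
    N*mu, N*r and N*m stay constant."""
    if N % factor or generations % factor:
        raise ValueError("N and generations should be divisible by the factor")
    return {
        "N": N // factor,
        "generations": generations // factor,
        "mu": mu * factor,
        "r": r * factor,
        "m": m * factor,
    }


# Illustrative values; N and the rates are assumptions, not the Figure 2 input.
unscaled = {"N": 1000, "generations": 20000, "mu": 1e-5, "r": 1e-6, "m": 1e-3}
scaled = scale_parameters(factor=10, **unscaled)

for rate in ("mu", "r", "m"):
    before = unscaled["N"] * unscaled[rate]
    after = scaled["N"] * scaled[rate]
    assert math.isclose(before, after)

print(scaled["N"], scaled["generations"])  # 100 2000
```

As the text notes, this equivalence is expected under neutral models; the sketch does not address whether it holds when selection is simulated.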
Example and validation of the Markov mutation method
For each dataset obtained from the input in Figure 2, the best-fit model of nucleotide substitution under the Akaike information criterion (AIC) was estimated with Modeltest v3.6, using maximum likelihood (ML) estimates from PAUP*. The correct model (GTR) was identified in 97% of the datasets, although about 29% of them were also assigned invariable sites or rate heterogeneity among sites. The substitution pattern and equilibrium frequencies were correctly estimated.
Examples and validation of other general features
We ran this example over 200 generations and then analyzed the output with the GenePop 4.0 program. As expected, the SNPs were detected as independent. We then set the recombination value to 0 ('Rec' = 0), and GenePop 4.0 reported the 10 SNPs as linked, again as expected. The program offers many possibilities for studying SNPs under complex evolutionary situations: any number of populations can be defined under any user-defined migration model, any number of SNPs can be set with the desired linkage relationships, and the SNPs can be set at distinct initial frequencies in the different populations. For example, 'SNPfreqs' at 1.0 and 0.0 defines the first population with allele 1 fixed and the second with allele 2 fixed.
Impact of recombination on estimation of positive selection
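Table: Impact of recombination on dN/dS estimation under a Jukes-Cantor model. Row and column labels were not recovered; the surviving cell values, in their original order, are: 1.02 ± 0.03; 0.1 ± 0.05; 1.06 ± 0.04; 9.9 ± 0.56; 1.01 ± 0.03; 8.8 ± 0.49; 0.3 ± 0.07; 13.1 ± 0.77; 12.7 ± 0.65.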
GenomePop has useful characteristics for simulating SNPs or DNA sequences under complex models of evolution and demography, and their combination makes it unique with respect to other simulation tools. These include forward simulation under GTR mutation or GTR × MG94 codon models with intra-codon recombination, simulation of any user-defined migration pattern, diploid or haploid models, constant or variable population sizes, and fitness-based selection. Under the 2-allele model it allows the simulation of recombination hot-spots and the definition of different allele frequencies in different populations. GenomePop can also manage large DNA fragments and has a scaling option to save computation time when simulating large sequences or population sizes under complex demographic and evolutionary situations. It has many other features that are detailed on the web page.
Availability and requirements
Project name: GenomePop v. 1.0
Project home page: http://webs.uvigo.es/acraaj/GenomePop.htm
Operating system(s): Windows and Linux (the source will be provided to compile for Mac)
Programming language: C++
License: GNU GPL.
I am grateful to A. Caballero, H. Quesada, S.T. Rodríguez-Ramilo and two anonymous reviewers for discussion and comments on the manuscript. I also want to thank Sergei L Kosakovsky Pond for his help with HYPHY. This work was supported by grant CPE03-004-C2 from Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA) and from Dirección Xeral de Investigación e Desenvolvemento from Xunta de Galicia. AC-R is currently funded by an Isidro Parga Pondal research fellowship from Xunta de Galicia (Spain).
- Carvajal-Rodríguez A: GenomePop: software to simulate the evolution of genomes and populations. [http://webs.uvigo.es/acraaj/GenomePop.htm]
- Liu Y, Nickle DC, Shriner D, Jensen MA, Gerald H, Learn J, Mittler JE, Mullins JI: Molecular clock-like evolution of human immunodeficiency virus type 1. Virology 2004, 329: 101–108. 10.1016/j.virol.2004.08.014
- Liu Y, Mullins JI, Mittler JE: Waiting times for the appearance of cytotoxic T-lymphocyte escape mutants in chronic HIV-1 infection. Virology 2006, 347(1):140–146. 10.1016/j.virol.2005.11.036
- Caballero A, Cusi E, Garcia C, Garcia-Dorado A: Accumulation of deleterious mutations: Additional Drosophila melanogaster estimates and a simulation of the effects of selection. Evolution 2002, 56(6):1150–1159.
- Carvajal-Rodriguez A, Rolan-Alvarez E, Caballero A: Quantitative variation as a tool for detecting human-induced impacts on genetic diversity. Biological Conservation 2005, 124(1):1–13. 10.1016/j.biocon.2004.12.008
- Peng B, Amos CI, Kimmel M: Forward-Time Simulations of Human Populations with Complex Diseases. PLoS Genet 2007, 3(3):e47. 10.1371/journal.pgen.0030047
- Peng B, Kimmel M: Simulations provide support for the common disease-common variant hypothesis. Genetics 2007, 175(2):763–776. 10.1534/genetics.106.058164
- Keightley PD: Inference of genome-wide mutation rates and distributions of mutation effects for fitness traits: a simulation study. Genetics 1998, 150(3):1283–1293.
- Wakeley J: Nonequilibrium migration in human history. Genetics 1999, 153(4):1863–1871.
- Wakeley J: The coalescent in an island model of population subdivision with variation among demes. Theor Popul Biol 2001, 59(2):133–144. 10.1006/tpbi.2000.1495
- Goldstein DB: Islands of linkage disequilibrium. Nat Genet 2001, 29: 109–111. 10.1038/ng1001-109
- Nothnagel M, Rohde K: The effect of single-nucleotide polymorphism marker selection on patterns of haplotype blocks and haplotype frequency estimates. Am J Hum Genet 2005, 77(6):988–998. 10.1086/498175
- International-HapMap-Consortium: The International HapMap Project. Nature 2003, 426(6968):789–796. 10.1038/nature02168
- International-HapMap-Consortium: A haplotype map of the human genome. Nature 2005, 437(7063):1299–1320. 10.1038/nature04226
- International-HapMap-Consortium: A second generation human haplotype map of over 3.1 million SNPs. Nature 2007, 449(7164):851–861. 10.1038/nature06258
- Jeffreys AJ, Holloway JK, Kauppi L, May CA, Neumann R, Slingsby MT, Webb AJ: Meiotic recombination hot spots and human DNA diversity. Philos Trans R Soc Lond B Biol Sci 2004, 359(1441):141–152. 10.1098/rstb.2003.1372
- Greenawalt DM, Cui X, Wu Y, Lin Y, Wang HY, Luo M, Tereshchenko IV, Hu G, Li JY, Chu Y, Azaro MA, Decoste CJ, Chimge NO, Gao R, Shen L, Shih WJ, Lange K, Li H: Strong correlation between meiotic crossovers and haplotype structure in a 2.5-Mb region on the long arm of chromosome 21. Genome Res 2006, 16(2):208–214. 10.1101/gr.4641706
- Liu N, Sawyer SL, Mukherjee N, Pakstis AJ, Kidd JR, Kidd KK, Brookes AJ, Zhao H: Haplotype block structures show significant variation among populations. Genet Epidemiol 2004, 27(4):385–400. 10.1002/gepi.20026
- Nordborg M, Tavare S: Linkage disequilibrium: what history has to tell us. Trends Genet 2002, 18(2):83–90. 10.1016/S0168-9525(02)02557-X
- Rosenberg NA, Nordborg M: Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat Rev Genet 2002, 3(5):380–390. 10.1038/nrg795
- Stumpf MPH, McVean GAT: Estimating recombination rates from population-genetic data. Nature Reviews Genetics 2003, 4: 959–968. 10.1038/nrg1227
- Hein J, Wiuf C, Schierup MH: Gene genealogies, variation and evolution: a primer in coalescent theory. Oxford, Oxford University Press; 2005:XIII, 276 s.
- Kruglyak L: Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat Genet 1999, 22: 139–144. 10.1038/9642
- Pritchard JK, Przeworski M: Linkage disequilibrium in humans: models and data. Am J Hum Genet 2001, 69(1):1–14. 10.1086/321275
- McVean GA, Myers SR, Hunt S, Deloukas P, Bentley DR, Donnelly P: The fine-scale structure of recombination rate variation in the human genome. Science 2004, 304(5670):581–584. 10.1126/science.1092500
- Gu S, Pakstis AJ, Li H, Speed WC, Kidd JR, Kidd KK: Significant variation in haplotype block structure but conservation in tagSNP patterns among global populations. Eur J Hum Genet 2007, 15(3):302–312. 10.1038/sj.ejhg.5201751
- Marais G, Charlesworth B: Genome evolution: recombination speeds up adaptive evolution. Curr Biol 2003, 13(2):R68–70. 10.1016/S0960-9822(02)01432-X
- Bahlo M, Griffiths RC: Coalescence time for two genes from a subdivided population. J Math Biol 2001, 43(5):397–410. 10.1007/s002850100104
- Bahlo M, Griffiths RC: Inference from gene trees in a subdivided population. Theor Popul Biol 2000, 57(2):79–95. 10.1006/tpbi.1999.1447
- Beerli P, Felsenstein J: Maximum likelihood estimation of a migration matrix and effective population sizes in n subpopulations by using a coalescent approach. Proceedings of the National Academy of Sciences, USA 2001, 98(8):4563–4568. 10.1073/pnas.081068098
- Notohara M: The coalescent and the genealogical process in geographically structured population. J Math Biol 1990, 29: 59–75. 10.1007/BF00173909
- Wilkinson-Herbots HM: Genealogy and subpopulation differentiation under various models of population structure. J Math Biol 1998, 37(6):535–585. 10.1007/s002850050140
- Griffiths RC, Tavare S: Sampling theory for neutral alleles in a varying environment. Philosophical Transactions of the Royal Society of London, Series B 1994, 344: 403–410. 10.1098/rstb.1994.0079
- Mohle M, Sagitov S: A classification of coalescent processes for haploid exchangeable population models. Annals of Probability 2001, 29(4):1547–1562. 10.1214/aop/1015345761
- Tajima F: The effect of change in population size on DNA polymorphism. Genetics 1989, 123: 597–601.
- Hey J, Wakeley J: A coalescent estimator of the population recombination rate. Genetics 1997, 145: 833–846.
- Hudson RR, Kaplan NL: The coalescent process in models with selection and recombination. Genetics 1988, 120: 831–840.
- Kaplan NL, Darden T, Hudson RR: The coalescent process in models with selection. Genetics 1988, 120: 819–829.
- Krone SM, Neuhauser C: Ancestral processes with selection. Theor Popul Biol 1997, 51(3):210–237. 10.1006/tpbi.1997.1299
- Neuhauser C, Krone SM: The genealogy of samples in models with selection. Genetics 1997, 145: 519–534.
- Donnelly P, Nordborg M, Joyce P: Likelihoods and simulation methods for a class of nonneutral population genetics models. Genetics 2001, 159(2):853–867.
- Barton NH, Etheridge AM, Sturm AK: Coalescence in a random background. Annals of Applied Probability 2004, 14(2):754–785. 10.1214/105051604000000099
- Fearnhead P: Perfect simulation from nonneutral population genetic models: Variable population size and population subdivision. Genetics 2006, 174(3):1397–1406. 10.1534/genetics.106.060681
- Calafell F, Grigorenko EL, Chikanian AA, Kidd KK: Haplotype evolution and linkage disequilibrium: A simulation study. Hum Hered 2001, 51(1–2):85–96. 10.1159/000022963
- Anisimova M, Nielsen R, Yang Z: Effect of Recombination on the Accuracy of the Likelihood Method for Detecting Positive Selection at Amino Acid Sites. Genetics 2003, 164(3):1229–1236.
- Shriner D, Nickle DC, Jensen MA, Mullins JI: Potential impact of recombination on sitewise approaches for detecting positive natural selection. Genet Res 2003, 81: 115–121. 10.1017/S0016672303006128
- Marjoram P, Wall JD: Fast "coalescent" simulation. BMC Genet 2006, 7: 16. 10.1186/1471-2156-7-16
- Liang L, Zollner S, Abecasis GR: GENOME: a rapid coalescent-based whole genome simulator. Bioinformatics 2007, 23(12):1565–1567. 10.1093/bioinformatics/btm138
- Wakeley J: The limits of theoretical population genetics. Genetics 2005, 169(1):1–7.
- Balloux F: EASYPOP (version 1.7): a computer program for population genetics simulations. J Hered 2001, 92(3):301–302. 10.1093/jhered/92.3.301
- Peng B, Kimmel M: simuPOP: a forward-time population genetics simulation environment. Bioinformatics 2005, 21(18):3686–3687. 10.1093/bioinformatics/bti584
- Guillaume F, Rougemont J: Nemo: an evolutionary and population genetics programming framework. Bioinformatics 2006, 22(20):2556–2557. 10.1093/bioinformatics/btl415
- Yang Z, Balding D, Bishop M, Cannings: Adaptive Molecular Evolution. In Handbook of Statistical Genetics. Wiley J. and Sons Ltd.; 2003.
- Karlin S, Taylor HM: A second course in stochastic processes. New York, Academic Press; 1981:XVIII, 542 s.
- Swofford DL: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). 4th edition. Sunderland, Massachusetts, Sinauer Associates; 2002.
- Kosakovsky Pond SL, Frost SDW, Muse SV: HyPhy: hypothesis testing using phylogenies. Bioinformatics 2005, 21(5):676–679. 10.1093/bioinformatics/bti079
- Muse SV, Gaut BS: A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol 1994, 11(5):715–724.
- Carvajal-Rodriguez A, Crandall KA, Posada D: Recombination Estimation under Complex Evolutionary Models with the Coalescent Composite Likelihood Method. Mol Biol Evol 2006, 23(4):817–827. 10.1093/molbev/msj102
- Posada D, Crandall KA: Modeltest: testing the model of DNA substitution. Bioinformatics 1998, 14(9):817–818. 10.1093/bioinformatics/14.9.817
- McVean GAT, Awadalla P, Fearnhead P: A coalescent based-method for detecting and estimating recombination from gene sequences. Genetics 2002, 160: 1231–1241.
- Raymond M, Rousset F: GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. J Heredity 1995, 86: 248–249.
- Kosakovsky Pond SL, Frost SD: Not so different after all: a comparison of methods for detecting amino acid sites under selection. Mol Biol Evol 2005, 22(5):1208–1222. 10.1093/molbev/msi105
- Rodríguez F, Oliver JF, Marín A, Medina JR: The general stochastic model of nucleotide substitution. J Theor Biol 1990, 142: 485–501. 10.1016/S0022-5193(05)80104-3
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.