A quantitative genetic and epigenetic model of complex traits
© Wang et al.; licensee BioMed Central Ltd. 2012
Received: 24 May 2012
Accepted: 1 October 2012
Published: 26 October 2012
Despite our increasing recognition of the mechanisms that specify and propagate epigenetic states of gene expression, the pattern of how epigenetic modifications contribute to the overall genetic variation of a phenotypic trait remains largely elusive.
We construct a quantitative model to explore the effect of epigenetic modifications that occur at specific rates on the genome. This model, derived from but extending beyond traditional quantitative genetic theory founded on Mendel’s laws, allows questions concerning the prevalence and importance of epigenetic variation to be formulated and addressed.
It provides a new avenue for bringing chromatin inheritance into the realm of complex traits, facilitating our understanding of the means by which phenotypic variation is generated.
Systematic or stochastic changes in chromatin states, such as DNA methylation, chromatin remodeling, histone modification and RNA interference, have been thought to provide an additional driving force for phenotypic variation in complex traits and diseases [1–9]. Different chromatin states, called epialleles, that occur in the same sequence allele cannot be captured by an analysis based on DNA sequence alone. With the increasing availability of epigenome technologies, there is an unprecedented opportunity to understand the role of epiallelic variants in maintaining and inducing functional variation that helps organisms buffer against environmental perturbations. This calls for quantitative models that can estimate the amount and pattern of quantitative variation determined by epialleles. When integrated with linkage or association mapping strategies, these models can recover epigenetic variation that cannot currently be estimated [10–13].
There have been several publications on methodological development for epigenetic detection [14–17]. Johannes and Colome-Tatche proposed an experimental approach for estimating epigenetic variation in experimental crosses derived from epigenomically perturbed isogenic lines. This approach can model the effects of epiallelic instability, recombination, parent-of-origin effects, and transgressive segregation on phenotypic variation across generations. Tal et al. derived an expression for covariances between relatives due to epigenetic transmissibility. A statistical model based on multiple testing procedures has been developed to identify genomic regions of epigenetic variability among individuals from genome-wide DNA methylation data. These model developments, in combination with empirical studies, can be used to test the hypothesis that epigenetic variation arising directly or indirectly from chromatin modifications of DNA is an important contributor to the missing heritability [17, 19].
Despite these advances, it remains unclear how much of the phenotypic variation is contributed by epigenetic modifications and, more importantly, by what means epialleles exert their effects on phenotypic values. The motivation of this article is to develop a quantitative model for estimating and testing the contribution of epigenetic variants to quantitative trait variation. The model predicts how much genetic variation is produced by a change in the rate of occurrence of epigenetic mutation and in the effect of epigenetic factors in a natural population. We particularly discuss how the epigenetic effect interacts with other genetic effects, such as additive and dominance effects, to affect phenotypic traits. When implemented in genome-wide association studies, the proposed model provides useful guidance for designing efficient and effective molecular experiments to characterize a comprehensive picture of the epigenetic variation of complex traits or diseases in different organisms.
Occurrence rate of methylation
Consider an epigenetic study population of $n$ individuals randomly drawn from a natural population, in which a nucleotide site with two alleles, $A_1$ and $A_2$, is thought to affect a phenotypic trait. Let $p$ and $q$ ($p + q = 1$) denote the frequencies of alleles $A_1$ and $A_2$, respectively, in the natural population at Hardy-Weinberg equilibrium (HWE). The genotypic frequencies of $A_1A_1$, $A_1A_2$, and $A_2A_2$ at the nucleotide site studied are then $p^2$, $2pq$, and $q^2$, respectively [20, 21].
Here $D_{12}$, $D_{1e}$, and $D_{2e}$ are the coefficients of Hardy-Weinberg disequilibrium (HWD) due to a non-random association between alleles $A_1$ and $A_2$, between allele $A_1$ and epiallele $A_e$, and between allele $A_2$ and epiallele $A_e$, respectively. DNA methylation may violate the previous equilibrium of the population, leading to the HWD quantified by $D_{12}$, $D_{1e}$, and $D_{2e}$. Thus, the genotype and epigenotype frequencies may be determined by the allele and epiallele frequencies together with the HWD coefficients.
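As an illustration of how allele frequencies, the epiallele frequency, and the three HWD coefficients could jointly fix the six genotype and epigenotype frequencies, the following minimal sketch uses a common multi-allele disequilibrium convention (each heterozygote frequency is reduced by twice its coefficient). The function name, the sign convention, and the choice to methylate both alleles at the same rate `u` are assumptions made for this example and may differ from the article's own formulation.

```python
def genotype_frequencies(f1, f2, fe, d12=0.0, d1e=0.0, d2e=0.0):
    """Return the frequencies of the six (epi)genotypes at one site.

    f1, f2, fe: frequencies of A1, A2 and the epiallele Ae (must sum to 1).
    d12, d1e, d2e: Hardy-Weinberg disequilibrium coefficients.
    ASSUMPTION: heterozygote frequency = 2*fi*fj - 2*Dij, a common
    multi-allele convention; the article's sign convention may differ.
    """
    if abs(f1 + f2 + fe - 1.0) > 1e-9:
        raise ValueError("allele and epiallele frequencies must sum to 1")
    freqs = {
        "A1A1": f1 * f1 + d12 + d1e,
        "A2A2": f2 * f2 + d12 + d2e,
        "AeAe": fe * fe + d1e + d2e,
        "A1A2": 2 * f1 * f2 - 2 * d12,
        "A1Ae": 2 * f1 * fe - 2 * d1e,
        "A2Ae": 2 * f2 * fe - 2 * d2e,
    }
    if min(freqs.values()) < 0:
        raise ValueError("disequilibrium coefficients give a negative frequency")
    return freqs


# Hypothetical values: p = 0.6 and q = 0.4 before methylation, and an
# epiallele arising at rate u = 0.1 from either allele (an assumption).
p, q, u = 0.6, 0.4, 0.1
freqs = genotype_frequencies(p * (1 - u), q * (1 - u), u, d1e=0.01)
```

Under this convention the six frequencies always sum to one and the marginal allele frequencies are unchanged by the disequilibrium terms; setting `fe` and all coefficients to zero recovers the usual HWE proportions.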
When a parameter is tested at the boundary of its parameter space, the likelihood ratio (LR) test statistic may not follow a standard chi-square distribution. Self and Liang showed that the null distribution of the LR test statistic is a mixture of projections of chi-square variables onto surfaces, with mixture weights that can be derived analytically only in special cases. By establishing the asymptotic null and alternative distributions of the quasi-likelihood ratio, rescaled quasi-likelihood ratio, Wald, and score tests, Andrews suggested the use of these test statistics to test the boundary value of a model parameter. While the first three test statistics are easy to compute, the score test is more demanding because it requires the first- and second-order derivatives of the alternative log-likelihood.
Similar tests can be performed for the individual HWD coefficients $D_{1e}$, $D_{2e}$, or $D_{12}$, or for their combinations, by formulating the corresponding null hypotheses. Under the alternative hypothesis $H_1$ associated with each null hypothesis, the likelihood is calculated. The resulting LR value is assumed to be asymptotically chi-square distributed, with degrees of freedom equal to the difference in the number of parameters estimated under the alternative and null hypotheses.
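The likelihood ratio tests described above can be sketched in a few lines. The example below computes a chi-square p-value from two maximized log-likelihoods and, optionally, applies the simplest boundary case, in which a single parameter tested at its boundary gives a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom. The function names and the log-likelihood values are hypothetical, and whether the simple mixture applies to a given parameter of this model is an assumption that would need checking.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 1:
        a, sf = 0.5, math.erfc(math.sqrt(half))  # df = 1
    else:
        a, sf = 1.0, math.exp(-half)             # df = 2
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        sf += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return sf


def lr_test(loglik_null, loglik_alt, df, boundary=False):
    """Return (LR statistic, p-value) for nested hypotheses.

    boundary=True uses the 0.5*chi2_0 + 0.5*chi2_1 mixture, which is valid
    only when one parameter is tested at its boundary (an assumption here).
    """
    lr = max(0.0, 2.0 * (loglik_alt - loglik_null))
    if boundary:
        if df != 1:
            raise ValueError("the simple mixture applies to one parameter only")
        p_value = 0.5 * chi2_sf(lr, 1) if lr > 0 else 1.0
    else:
        p_value = chi2_sf(lr, df)
    return lr, p_value


# Hypothetical maximized log-likelihoods under H0 and H1.
lr, p_regular = lr_test(-105.0, -102.0, df=1)
_, p_boundary = lr_test(-105.0, -102.0, df=1, boundary=True)
```

For a positive LR value the boundary-corrected p-value is half the regular one, so ignoring the boundary makes the test conservative.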
Genetic and epigenetic effect
Here the genotypic value of the trait is decomposed into different components, i.e., the overall mean ($\mu$), the additive effects due to the substitution of allele $A_1$ ($a_1$) and of epiallele $A_e$ ($a_e$) by allele $A_2$, and the dominance effects due to the interaction between allele $A_1$ and epiallele $A_e$ ($d_{1e}$), between allele $A_1$ and allele $A_2$ ($d_{12}$), and between allele $A_2$ and epiallele $A_e$ ($d_{2e}$).
Each of the effects in equations (10)–(14) can be tested by the log-likelihood ratio approach. For an epigenetic study, we are more interested in testing the epigenetic effect of the nucleotide site, $a_e$, and the dominance effects due to the interactions between the alleles and the epiallele, $d_{1e}$ and $d_{2e}$. The log-likelihood ratio test statistic for each hypothesis test is assumed to be asymptotically chi-square distributed, with degrees of freedom equal to the difference in the number of parameters estimated under the alternative and null hypotheses.
Genetic and epigenetic variation
Here $\sigma_a^2 = 2pq\alpha^2$ is the additive genetic variance, which depends on both $a$ and $d$, and $\sigma_d^2 = (2pqd)^2$ is the dominance genetic variance, which depends only on $d$. Both variances are affected by the relative magnitudes of the allele frequencies $p$ and $q$. The dominance variance reaches its maximum when the two alternative alleles $A_1$ and $A_2$ occur at the same frequency, as does the additive variance when dominance is absent ($d = 0$).
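The two variance components above can be checked numerically. The sketch below assumes the standard single-locus coding from quantitative genetics textbooks, in which the genotypic values of the three genotypes are a, d, and −a and the average effect of allele substitution is α = a + (q − p)d; the function names and numerical inputs are chosen for illustration only.

```python
def variance_components(p, a, d):
    """Additive and dominance variances at a biallelic locus under HWE."""
    q = 1.0 - p
    alpha = a + (q - p) * d           # average effect of substitution (assumed form)
    var_additive = 2 * p * q * alpha ** 2
    var_dominance = (2 * p * q * d) ** 2
    return var_additive, var_dominance


def total_genetic_variance(p, a, d):
    """Genetic variance computed directly from genotypic values a, d, -a."""
    q = 1.0 - p
    freqs = (p * p, 2 * p * q, q * q)
    values = (a, d, -a)
    mean = sum(f * g for f, g in zip(freqs, values))
    return sum(f * (g - mean) ** 2 for f, g in zip(freqs, values))


# Hypothetical parameter values.
var_a, var_d = variance_components(p=0.3, a=1.0, d=0.5)
total = total_genetic_variance(p=0.3, a=1.0, d=0.5)
```

The sum of the two components matches the variance computed directly from the genotype distribution. Scanning `p` with `variance_components` also shows how each component responds to allele frequency; with non-zero `d` the additive component need not peak exactly at p = 0.5.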
These two parameters can be used to assess the contribution of DNA methylation to the total phenotypic variation of a quantitative trait.
In this section, we performed numerical analyses to investigate how epigenetic marks contribute to the heritability of a complex trait. The occurrence of epigenetic marks is described by population genetic parameters, including the occurrence rate of the epiallele and its Hardy-Weinberg disequilibria with unmarked alleles. The effect of epigenetic marks can be specified by quantitative genetic parameters, including the epigenetic effect of the epiallele and its interactions with other effects. As analyzed above, population genetic parameters ($p$, $q$, $u$, $D_{1e}$, $D_{2e}$, $D_{12}$) and quantitative genetic parameters ($a_1$, $a_e$, $d_{1e}$, $d_{2e}$, $d_{12}$) contribute to the genetic variance in a complex way (16). We will analyze the contribution of epigenetic marks by separately investigating how these population and quantitative genetic parameters affect $R_e^2$.
Population genetic effect
Quantitative genetic effect
Table: MLEs of population and quantitative genetic parameters from simulated data with different heritabilities ($H^2$ = 0.05, 0.1, 0.2) and sample sizes ($n$)
Table: The power of epigenetic-effect detection by the epigenetic model and its false positive rates (FPR) under different sample sizes ($n$) and heritabilities ($H^2$ = 0.05, 0.1, 0.2), with $a_e = d_{1e} = d_{2e}$
Implementing the epigenetic model into GWAS
where $\xi_{i1}, \ldots, \xi_{i5}$ are the indicator variables for subject $i$, each corresponding to a specific genetic or epigenetic effect at a methylated site; $u_{ir}$ ($r = 1, \ldots, R$) is the value of the $r$th continuous covariate, such as age or BMI, for subject $i$; $\alpha_r$ is the effect of the $r$th continuous covariate; $v_{sl}$ ($l = 1, \ldots, L_s$; $s = 1, \ldots, S$) is the effect of the $l$th level of the $s$th discrete covariate, such as race, gender, or treatment, with $\sum_{l=1}^{L_s} v_{sl} = 0$, where $L_s$ is the number of levels of the $s$th discrete covariate; $x_{isl}$ is an indicator variable for whether subject $i$ receives the $l$th level of the $s$th discrete covariate; and $e_i$ is a random error.
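The symbol definitions above are consistent with a linear model of the following general form. This is a sketch assembled from those definitions rather than a verbatim statement of model (19); the phenotype symbol $y_i$, the generic coefficients $\beta_k$, and the order in which they map onto the five genetic and epigenetic effects are assumptions.

```latex
y_i = \mu + \sum_{k=1}^{5} \xi_{ik}\,\beta_k
      + \sum_{r=1}^{R} \alpha_r\, u_{ir}
      + \sum_{s=1}^{S} \sum_{l=1}^{L_s} v_{sl}\, x_{isl}
      + e_i,
\qquad
(\beta_1, \ldots, \beta_5) = (a_1, a_e, d_{1e}, d_{2e}, d_{12}),
\qquad
\sum_{l=1}^{L_s} v_{sl} = 0 .
```

Written this way, every genetic, epigenetic, and covariate effect enters linearly, so ordinary least squares applies once the indicator variables have been coded.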
A standard multiple linear regression approach can be used to estimate all the effects described in model (19). If the test is made individually for each of the methylated sites, the significance of each effect should be adjusted for multiple comparisons, for example by Bonferroni correction or by false discovery rate (FDR) control.
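The two adjustments named above can be sketched as follows, assuming one p-value per methylated site has already been obtained from the regression. The FDR procedure shown is the Benjamini-Hochberg step-up method, chosen here as one common option; the function names and the p-values are hypothetical.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up FDR control)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-site p-values for the epigenetic effect.
site_pvalues = [0.001, 0.02, 0.03, 0.5]
p_bonf = bonferroni(site_pvalues)
p_fdr = benjamini_hochberg(site_pvalues)
significant_sites = [i for i, p in enumerate(p_fdr) if p < 0.05]
```

Bonferroni controls the family-wise error rate and becomes very strict as the number of sites grows, whereas the FDR adjustment tolerates a controlled proportion of false discoveries and typically retains more sites.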
Analyzing a single methylated site at a time limits statistical inference about the overall genetic and epigenetic architecture of complex phenotypes. Such a picture is best obtained by analyzing all sites simultaneously. Li et al. proposed a new approach that incorporates the least absolute shrinkage and selection operator (lasso) to simultaneously analyze a large number of variables using a much smaller sample size. A detailed algorithm for the Bayesian lasso has been derived and can be readily applied to GWAS that aim to identify epigenetic variants.
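To show the shrinkage-and-selection idea behind the lasso, the sketch below fits an ordinary (frequentist) lasso by cyclic coordinate descent. It is not the Bayesian lasso algorithm of Li et al.; it only illustrates how an L1 penalty sets weak effects exactly to zero. The function names, the toy design matrix, and the penalty value are assumptions made for this example.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent.

    X is a list of rows (n subjects x m predictors), y a list of n phenotypes.
    Columns are assumed to be centred; no intercept is fitted.
    """
    n, m = len(y), len(X[0])
    beta = [0.0] * m
    col_ss = [sum(row[j] ** 2 for row in X) for j in range(m)]
    for _ in range(n_iter):
        for j in range(m):
            if col_ss[j] == 0:
                continue
            # Inner product of predictor j with the residual that ignores predictor j.
            rho = 0.0
            for row, yi in zip(X, y):
                fitted_others = sum(row[k] * beta[k] for k in range(m) if k != j)
                rho += row[j] * (yi - fitted_others)
            beta[j] = soft_threshold(rho, n * lam) / col_ss[j]
    return beta


# Toy data (hypothetical): two orthogonal, centred predictors,
# the first with a strong effect and the second with a weak one.
X = [[1, 1], [1, -1], [-1, 1], [-1, -1]]
y = [2.1, 1.9, -1.9, -2.1]
beta_lasso = lasso_coordinate_descent(X, y, lam=0.5)
beta_ols = lasso_coordinate_descent(X, y, lam=0.0)
```

With the penalty switched off the routine returns the least-squares estimates; with `lam=0.5` the strong effect is shrunk and the weak effect is removed from the model, which is the variable-selection behaviour that makes lasso-type methods attractive when sites far outnumber subjects.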
Epigenetic alterations have been increasingly recognized to play an important role in generating and maintaining quantitative genetic variation for complex phenotypes underlying physiology and disease [6, 7, 9, 26–28]. Preliminary estimates in plants suggest that epigenetic variation can account for up to 30% of the variation in commonly studied phenotypes such as height and flowering time. Many theoretical models are available to analyze the contributions of epigenetic marks to missing heritability in genome-wide association studies (GWAS) [14–18]. In this article, we extended Mendelian inheritance-based genetic principles to derive a quantitative framework for analyzing how DNA methylation contributes to overall genetic variance. By defining several epigenetic effect parameters, the analytical framework allows the mechanistic characterization of epigenetic actions within the quantitative genetic context.
Numerical analysis shows that a small incidence of DNA methylation, as well as a small effect due to methylation alterations, could lead to a substantial increase in genetic variance, suggesting that epigenetic marks may be an important cause of genetic diversity in nature. Given this finding, the neglect of epigenetic variants in many current GWAS may partly explain the problem of missing heritability. Simulation studies suggest that the model can provide reasonable estimates of epigenetic effect parameters with a sample size of 200–400, even when the trait studied has a small heritability. It should be pointed out, however, that this conclusion is based on a well-controlled study with little background noise. For GWAS in humans, the estimated genetic variation is likely to be confounded by many factors, such as population structure, heterogeneous genetic background, demographic complexity, and highly noisy phenotypic measurements, among others. To remove these confounding effects from genetic and epigenetic analysis, a considerably larger sample size may be needed.
The model considers only a single methylated site. However, there is no technical difficulty in extending the model to explore two or more sites at the same time, which may interact with each other to produce a complex network of epistasis. For two methylated sites, a total of 25 interaction parameters are formed between the parameter sets ($a_1$, $a_e$, $d_{1e}$, $d_{2e}$, $d_{12}$) of the two sites. In this case, an exponentially increasing sample size and more precise phenotypic measurement (aimed at increasing the trait’s heritability) are needed. For the methylated population, the originally existing HWE assumption may be violated, in which case it is not possible to use gametic linkage disequilibria to specify the association between the two sites. Wu et al. proposed a robust approach to analyze the marker-marker association by deriving a so-called zygotic linkage disequilibrium model. Wu et al.’s approach can be incorporated to identify the contribution of epigenetic marks at two sites to the overall genetic variance.
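The count of 25 interaction parameters for two sites follows from pairing each of the five effects at one site with each of the five effects at the other. A minimal sketch of that enumeration is given below; the labels and the function name are illustrative only.

```python
from itertools import product

# Effect parameters at one methylated site, as named in the text.
EFFECTS = ("a1", "ae", "d1e", "d2e", "d12")


def pairwise_interactions(effects=EFFECTS):
    """Pair every effect at site 1 with every effect at site 2."""
    return [(f"{x}@site1", f"{y}@site2") for x, y in product(effects, repeat=2)]


interactions = pairwise_interactions()
print(len(interactions))  # 25
```

Each pair corresponds to one epistatic term, for example the interaction between the epigenetic effect at the first site and the additive effect at the second.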
Epigenetic changes may be an adaptation to environmental perturbations [5, 17, 28]. Thus, it is crucial to incorporate the epigenetic model into a genotype-environment interaction study. By doing so, we can identify which epigenetic effects interact with the environment, and how, to determine final phenotypes, so that the genetic etiology of quantitative variation can be better elucidated. In addition, there is a considerable body of evidence that epigenetic effects may be transmitted from one generation to the next [31, 32], although other studies found reprogramming of epigenetic effects during meiosis [5, 33, 34]. By embedding our epigenetic model into a family-based design, we can develop a powerful approach to test the relative importance of these two phenomena in trait control [35–37]. Traditional models analyze the inheritance of quantitative traits based on Mendel’s laws and therefore cannot assess the contribution of epigenetic modifications. Moreover, many GWAS are based on a case–control design in which genotype frequencies are compared between two groups. To study the association between epigenetic effects and a particular disease, such as cancer, we can incorporate the quantitative epigenetic models described by equations (10)–(14) into a case–control framework, allowing each effect to be tested. The integration of general quantitative genetic models and a case–control design has been discussed, and its statistical properties have been investigated through analytical derivations and computer simulations [38–40]. With these extensions, the new model proposed in this article, which integrates traditional quantitative genetic theory with the latest discoveries of epigenetic effects, will allow geneticists to chart a more comprehensive picture of the genetic landscape for complex phenotypes underlying agricultural production, physiology and human diseases.
This work is partially supported by NSF/IOS-0923975, NIH/UL1RR0330184 and the Nantong “Jianghai Elites” program.
- Rutherford SL, Henikoff S: Quantitative epigenetics. Nat Genet 2003, 33: 6–8. doi:10.1038/ng0103-6
- Richards EJ: Inherited epigenetic variation–revisiting soft inheritance. Nat Rev Genet 2006, 7: 395–401.
- Richards EJ: Quantitative epigenetics: DNA sequence variation need not apply. Genes Dev 2009, 23: 1601–1605. doi:10.1101/gad.1824909
- Richards EJ: Natural epigenetic variation in plant species: a view from the field. Curr Opin Plant Biol 2011, 14: 204–209. doi:10.1016/j.pbi.2011.03.009
- Richards CL, Bossdorf O, Pigliucci M: What role does heritable epigenetic variation play in phenotypic evolution? Bioscience 2010, 60: 232–237. doi:10.1525/bio.2010.60.3.9
- Feinberg AP: Phenotypic plasticity and the epigenetics of human disease. Nature 2007, 447: 433–440. doi:10.1038/nature05919
- Feinberg AP, Irizarry RA: Stochastic epigenetic variation as a driving force of development, evolutionary adaptation, and disease. Proc Natl Acad Sci USA 2010, 107: 1757–1764. doi:10.1073/pnas.0906183107
- Johannes F, Porcher E, Teixeira FK, Saliba-Colombani V, Simon M, Agier N, Bulski A, Albuisson J, Heredia F, Audigier P, Bouchez D, Dillmann C, Guerche P, Hospital F, Colot V: Assessing the impact of transgenerational epigenetic variation on complex traits. PLoS Genet 2009, 5: e1000530. doi:10.1371/journal.pgen.1000530
- Eichten SR, Swanson-Wagner RA, Schnable JC, Waters AJ, Hermanson PJ, Liu S, Yeh CT, Jia Y, Gendler K, Freeling M, Schnable PS, Vaughn MW, Springer NM: Heritable epigenetic variation among maize inbreds. PLoS Genet 2011, 7(11):e1002372. doi:10.1371/journal.pgen.1002372
- Johannes F, Colot V, Jansen RC: Epigenome dynamics: a quantitative genetics perspective. Nat Rev Genet 2008, 9: 883–890. doi:10.1038/nrg2467
- Maher B: Personal genomes: the case of the missing heritability. Nature 2008, 456: 18–21.
- Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, Cho JH, Guttmacher AE, Kong A, Kruglyak L, Mardis E, Rotimi CN, Slatkin M, Valle D, Whittemore AS, Boehnke M, Clark AG, Eichler EE, Gibson G, Haines JL, Mackay TF, McCarroll SA, Visscher PM: Finding the missing heritability of complex diseases. Nature 2009, 461: 747–753. doi:10.1038/nature08494
- Eichler E, Flint J, Gibson G, Kong A, Leal S, Moore JH, Nadeau JH: Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet 2010, 11: 446–450. doi:10.1038/nrg2809
- Slatkin M: Epigenetic inheritance and the missing heritability problem. Genetics 2009, 182: 845–850. doi:10.1534/genetics.109.102798
- Tal O, Kisdi E, Jablonka E: Epigenetic contribution to covariance between relatives. Genetics 2010, 184: 1037–1050. doi:10.1534/genetics.109.112466
- Johannes F, Colome-Tatche M: Quantitative epigenetics through epigenomic perturbation of isogenic lines. Genetics 2011, 188: 215–227. doi:10.1534/genetics.111.127118
- Furrow RE, Christiansen FB, Feldman MW: Environment-sensitive epigenetics and the heritability of complex diseases. Genetics 2011, 189: 1377–1387. doi:10.1534/genetics.111.131912
- Jaffe AE, Feinberg AP, Irizarry RA, Leek JT: Significance analysis and statistical dissection of variably methylated regions. Biostatistics 2012, 13: 166–178. doi:10.1093/biostatistics/kxr013
- Roux F, Colome-Tatche M, Edelist C, Warenaar R, Guerche P, Hospital F, Colot V, Jansen RC, Johannes F: Genome-wide epigenetic perturbation jump-starts patterns of heritable variation found in nature. Genetics 2011, 188: 1015–1017. doi:10.1534/genetics.111.128744
- Falconer DS, Mackay TFC: Introduction to Quantitative Genetics. London: Longman; 1996.
- Lynch M, Walsh B: Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer Associates; 1998.
- Self SG, Liang KY: Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard conditions. J Am Stat Assoc 1987, 82: 605–610. doi:10.1080/01621459.1987.10478472
- Andrews DWK: Testing when a parameter is on the boundary of the maintained hypothesis. Econometrica 2001, 69: 683–734. doi:10.1111/1468-0262.00210
- Tibshirani R: Regression shrinkage and selection via the lasso. J R Stat Soc Ser B 1996, 58: 267–288.
- Li JH, Das K, Fu GF, Li RZ, Wu RL: The Bayesian lasso for genome-wide association studies. Bioinformatics 2011, 27: 516–523. doi:10.1093/bioinformatics/btq688
- Feinberg AP, Tycko B: The history of cancer epigenetics. Nat Rev Cancer 2004, 4: 143–153.
- Feinberg AP, Irizarry RA, Fradin D, Aryee MJ, Murakami P, et al.: Personalized epigenomic signatures that are stable over time and covary with body mass index. Sci Transl Med 2011, 3(65):65er1.
- Petronis A: Epigenetics as a unifying principle in the aetiology of complex traits and diseases. Nature 2010, 465: 721–727. doi:10.1038/nature09230
- Smith LM, Weigel D: On epigenetics and epistasis: hybrids and their non-additive interactions. EMBO J 2012, 31: 249–250.
- Wu S, Yang J, Wu RL: Genetic mapping of quantitative trait loci in a non-equilibrium population. Stat Appl Mol Genet Biol 2010, 9(1):32.
- Reik W: The Wellcome Prize Lecture. Genetic imprinting: the battle of the sexes rages on. Exp Physiol 1996, 81: 161–172.
- Reik W, Dean W, Walter J: Epigenetic reprogramming in mammalian development. Science 2001, 293: 1089–1093. doi:10.1126/science.1063443
- Youngson NA, Whitelaw E: Transgenerational epigenetic effects. Annu Rev Genomics Hum Genet 2008, 9: 233–257. doi:10.1146/annurev.genom.9.081307.164445
- Whitelaw NC, Whitelaw E: Transgenerational epigenetic inheritance in health and disease. Curr Opin Genet Dev 2008, 18: 273–279. doi:10.1016/j.gde.2008.07.001
- Wang C, Wang Z, Luo J, Li Q, Li Y, Ahn K, Prows DR, Wu R: A model for transgenerational imprinting variation in complex traits. PLoS One 2010, 5(7):e11396. doi:10.1371/journal.pone.0011396
- Wang CG, Wang Z, Prows DR, Wu RL: A computational framework for the inheritance of genomic imprinting for complex traits. Brief Bioinform 2012, 13: 34–45. doi:10.1093/bib/bbr023
- Li Y, Guo YQ, Hou W, Chang M, Liao LP, Wu RL: A statistical design for testing transgenerational genomic imprinting in natural human populations. PLoS One 2011, 6(2):e16858. doi:10.1371/journal.pone.0016858
- Wang Z, Liu T, Lin Z, Hegarty J, Koltun WA, Wu R: A general model for multilocus epistatic interactions in case–control studies. PLoS One 2010, 5(8):e11384. doi:10.1371/journal.pone.0011384
- Liu T, Thalamuthu A, Liu JJ, Chen C, Wang Z, Wu R: Asymptotic distribution for epistatic tests in case–control studies. Genomics 2011, 98: 145–151. doi:10.1016/j.ygeno.2011.05.001
- Zhang L, Liu R, Wang Z, Culver DA, Wu R: Modeling haplotype-haplotype interactions in case–control genetic association studies. Front Genet 2012, 3: 2.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.