 Methodology
 Open Access
 Published:
Fully moderated tstatistic in linear modeling of mixed effects for differential expression analysis
BMC Bioinformatics volume 20, Article number: 675 (2019)
Abstract
Background
Gene expression profiling experiments with few replicates lead to great variability in the estimates of gene variances. Toward this end, several moderated ttest methods have been developed to reduce this variability and to increase power for testing differential expression. Most of these moderated methods are based on linear models with fixed effects where residual variances are smoothed under a hierarchical Bayes framework. However, they are inadequate for designs with complex correlation structures, therefore application of moderated methods to linear models with mixed effects are needed for differential expression analysis.
Results
We demonstrated the implementation of the fully moderated tstatistic method for linear models with mixed effects, where both residual variances and variance estimates of random effects are smoothed under a hierarchical Bayes framework. We compared the proposed method with two current moderated methods and show that the proposed method can control the expected number of false positives at the nominal level, while the two current moderated methods fail.
Conclusions
We proposed an approach for testing differential expression under complex correlation structures while providing variance shrinkage. The proposed method is able to improve power by moderation and controls the expected number of false positives properly at the nominal level.
Background
To understand the dynamics of gene function, many functional genomic studies have profiled transcriptomic data of individuals with multiple samples of different origins [1, 2], diverse cell types [3, 4], and various time points [5–7]. These measurements of gene expression levels from multiple samples of the same individual are correlated in nature. To analyze the correlated gene expression levels under complex structures, linear mixedeffects models for genes are needed. Due to the constrain of availability, financial resources, or even practical considerations, small number of individuals are commonly used for many studies, which can lead to undesired low power for detecting differential expression between conditions when genes are analyzed separately using linear mixedeffects models. Therefore, variance shrinkage over the range of genes’ expression levels is necessary in order to improve power. However, currently there is a lack of methods that use variance shrinkage techniques for linear mixedeffects models.
Several variance shrinkage methods through moderation under hierarchical Bayes framework have been proposed for detecting differential expression with microarray or RNASeq data [8–11]. These methods are based on variance shrinkage of residual errors of linear models with only fixed effects, but the power gain relative to other test methods without variance shrinkage is significant. However, use of these models in designs with correlation structures will lead to biased estimation of gene variances, thus will result in inflated false positive and false negative rates.
There are two variance shrinkage methods that try to address the correlation issue. Limma method [12] incorporates correlation estimates of replicates in linear modeling, but it enforces a common correlation for all genes, which works well for withinarray replicates as originally planned, but is risky for betweenarray replicates. Dream method [13] tries to solve this issue in the setting of linear mixedeffects models. This method shrinks residual variances only, while variance estimates of random effects are scaled to residual variances before and after shrinkage. The limitation of this approach is that variance shrinkage of random effects is ignored by using the same shrinkage for both random and residual variances over the range of genes’ expression levels.
Toward this end, we implemented a novel procedure that allows us to shrink both types of variation independently through moderated tmethods under linear mixedeffects models. Our method assumes that variances of both residual errors and random effects have different mean functional forms over the range of genes’ expression levels. Then testing for differential expression between conditions is based on the combined variance shrinkage estimators. Simulations were performed to evaluate the proposed procedure. A real gene expression experiment on alopecia areata treatment on mouse skin samples was used as a case study.
Methods
Linear mixedeffects model
For each gene, Y denotes expression levels of all samples, which are normally distributed. The linear mixedeffects model (LMM) is formulated as
where β are fixedeffects, γ are randomeffects, and ε are residual errors with Var(γ)=G and Var(ε)=R. γ and ε are independent. Y has mean Xβ and variance ZGZ^{′}+R.
The hypothesis for testing fixedeffects is
Then the tstatistic for the hypothesis test is
where \(\hat {\beta }=(X^{\prime }\hat {V}^{1}X)^{}X^{\prime }\hat {V}^{1}Y\), \(C=(X^{\prime }\hat {V}^{1}X)^{}\).
Hierarchical model
Following our previously published fully moderated tstatistic (FMT) method [10], a hierarchical Bayes model is assumed for residual variances or variance estimates of a random effect from a linear mixedeffects model of log transformed expression data.
Residual variance for a gene g, \(s_{e_{g}}^{2}\), is assumed to follow a scaled Chisquare distribution with \(d_{e_{g}}\) degrees of freedom:
where \(\sigma _{e_{g}}^{2}\) is the variance of residual error, which is assigned a scaled inverse Chisquare prior:
\(d_{0e_{g}}\) and \(s_{0e_{g}}^{2}\) are the prior degrees of freedom and location, respectively. Under this model, the posterior mean of \(\sigma _{e_{g}}^{2}\), given \(s_{e_{g}}^{2}\), is
Similarly, variance estimate of a random effect for a gene g, \(s_{r_{g}}^{2}\), is assumed to follow a scaled Chisquare distribution with \(d_{r_{g}}\) degrees of freedom:
where \(\sigma _{r_{g}}^{2}\) is the variance of a random effect, which is also assigned a scaled inverse Chisquare prior:
\(d_{0r_{g}}\) and \(s_{0r_{g}}^{2}\) are the prior degrees of freedom and location, respectively. Under this model, the posterior mean of \(\sigma _{r_{g}}^{2}\), given \(s_{r_{g}}^{2}\), is
Then the denominator of the moderated tstatistic is
Variance components method [14] or WelchSatterthwaite method [15] is used to calculate an approximation to the degrees of freedom for the linear combinations of independent posterior variance estimates \(\stackrel {\sim }{s}_{e_{g}}^{2}\) and \(\stackrel {\sim }{s}_{r_{g}}^{2}\).
Estimation of hyperparameters
Hyperparameter estimation procedures for \(\sigma _{e_{g}}^{2}\) and \(\sigma _{r_{g}}^{2}\) are performed separately. The hyperparameters \(d_{0e_{g}}\) and \(s_{0e_{g}}^{2}\) as well as \(d_{0r_{g}}\) and \(s_{0r_{g}}^{2}\) are estimated by an empirical Bayes approach implemented in our FMT method [10]. The accuracy of hyperparameter estimation (in terms of bias and variance) can be approximated by using the empirical distribution of hyperparameters’ estimates generated from multiple simulation runs, which is demonstrated in our FMT method paper.
False positive control
With thousands of genes available in genomewide profiling studies, multiplicity adjustment is needed for false positive error control. While the Bonferroni method controls the familywise error rate (probability of one or more false rejections among all comparisons), it is wellknown to be a conservative method for genomewide profiling studies. A less conservative procedure (the extended interpretation of the Bonferroni method) that controls the expected mean number of false positives was used for multiplicity adjustment throughout this paper [16]. This procedure controls the per family error rate (PFER). It is as powerful as the BenjamimiHochberg (BH) FDR control procedure, but has better stability compared to the BH FDR control procedure. In simulations, the nominal level for the false positive error control is set as the PFER.
Simulation procedures

1: For 12,000 genes (g=1,2,⋯,12,000), simulate average log expression, α_{g}, from a three parameter lognormal distribution: ln(α_{g}−3.75)∼N(1.35,0.35).

2: For \(\sigma _{e_{g}}^{2}\) of gene g, calculate the genespecific prior location \(s_{0e_{g}}^{2}\) and prior degrees of freedom \(d_{0e_{g}}\) as a function of the average log expression values (α_{g}) obtained from Step 1. The prior location \(s_{0e_{g}}^{2}\) and prior degrees of freedom \(d_{0e_{g}}\) are modeled as:
$${{}{\begin{aligned}s_{0e_{g}}^{2} & = 0.2 e^{1.2(\alpha_{g}4.6)}+0.05 \\ d_{0e_{g}} & = 1.5 \sqrt{1.1e^{0.1(\alpha_{g}6)} + 0.6(\alpha_{g}7)^{2} + 20} + e^{0.25(\alpha_{g}14)}\! + 2 \end{aligned}}} $$For \(\sigma _{r_{g}}^{2}\) of gene g, calculate the genespecific prior location \(s_{0r_{g}}^{2}\) and prior degrees of freedom \(d_{0r_{g}}\) as a function of the average log expression values (α_{g}) obtained from Step 1. The prior location \(s_{0r_{g}}^{2}\) and prior degrees of freedom \(d_{0r_{g}}\) are modeled as:
$${{}{\begin{aligned} s_{0r_{g}}^{2} & = 0.5 e^{0.8(\alpha_{g}5.25)}+0.1 \\ d_{0r_{g}} & = 1.2 \sqrt{1.1e^{0.1(\alpha_{g}6)} + 0.6(\alpha_{g}7)^{2} + 20} + e^{0.25(\alpha_{g}14)}\! + 2 \end{aligned}}} $$ 
3: For gene g, simulate \(\sigma _{e_{g}}^{2}\) and \(\sigma _{r_{g}}^{2}\) from an inverse Chisquare distribution (Eqs. 1 and 2) using the hyperparameters from Step 2.

4: For gene g, simulate random effect S_{i} from \(N(0,\sigma _{r_{g}}^{2})\).

5: Among the 12,000 expressed genes, 500 genes are simulated to be differentially expressed using the following steps:
 a)
Simulate mean logdifferences μ_{g} from N(0,2) for 500 differentially expressed genes.
 b)
Simulate expression data for each replicate in group 1 from \(N(\alpha _{g}+\mu _{g}/2+S_{i},\sigma _{e_{g}}^{2})\) and for each replicate in group 2 from \(N(\alpha _{g}\mu _{g}/2+S_{i},\sigma _{e_{g}}^{2})\).
 a)

6: For the remaining 11,500 expressed genes, simulate expression data for each replicate from \(N(\alpha _{g}+S_{i},\sigma _{e_{g}}^{2})\).
Results
Simulation
Parameter setting
For simulation, we chose a simple design for gene expression experiment without loss of generality. We assumed two cohort groups with each cohort having n_{s} biological replicates (subjects) and each subject having n_{r} technical replicates. Our main interest is to compare gene expression differences between two cohort groups for each gene. We simulated gene expression data under linear mixedeffects models, where subject is a random effect and variation due to technical replicates was included in the residual errors.
Expression levels of 12,000 genes were simulated independently, among which 500 genes were simulated with differential expression between two cohort groups (see above Simulation Procedures). Simulation models were based on model assumptions of our published FMT method, which allows hyperparameters to vary as a function of genes’ expression levels. All parameters were set to resemble real gene expression arrays. Under this setting, we generated data sets for 13 different biological replicates (320 in each cohort group) and 6 different technical replicates (211) within each biological replicate. One hundred simulation runs were conducted for each simulation scenario.
Methods comparison
We compared the proposed method (FMT in LMM) to Limma [12] and Dream methods [13]. The Limma method calculates a common correlation coefficient between technical replicates for all genes to account for the correlation of repeated measurements within subjects in a linear fixedeffects model setting. The Dream method uses a linear mixedeffects model, where variance estimates of random subject effect is standardized to residual variances, then residual variances were modeled as a function of average of expression levels. In our proposed method, we used two ways to estimate the degrees of freedom of the moderated tstatistic in Eq. 3. The first approach is the variance components method (VC), which divides the residual degrees of freedom into betweensubject and withinsubject portions based on variance decomposition and assigns the betweensubject degrees of freedom to the fixed effects if subjects are nested in the fixed effects. The second approach is the WelchSatterthwaite method (Sat), which uses an approximation of the effective degrees of freedom of a linear combination of independent variance estimates. We also compared our method to the ordinary ttest method (OT) in LMM, which does not perform variance shrinkage to the variance estimates.
Power and false positives
Power averaged over 100 simulation runs are shown in Fig. 1, where power is plotted against numbers of biological replicates (320) for 2 technical replicates and 5 expected mean number of false positives. The Limma method has the best power except at the small number of biological replicates and OTVC has the lowest power. The power difference between all methods goes down when the number of biological replicates increases. For both FMT and OT methods, the WelchSatterthwaite method results in better power than the variance components method. Fig. 2 shows the actual number of false positives over numbers of biological replicates for 2 technical replicates and 5 expected mean number of false positives. The Dream method has the largest actual number of false positives, and it spikes more than 100 fold higher than the controlled level when the number of biological replicates is relatively small and drops dramatically when the number of biological replicates increases. The Limma method has inflated actual number of false positives as well, but it is relatively stable over numbers of biological replicates. Only FMTVC and OTVC maintain the nominal level of false positive error control at each biological replicate level. Similar patterns are observed for higher numbers of technical replicates.
Power averaged over 100 simulation runs is plotted against numbers of technical replicates (211) for 3 biological replicate and 5 expected number of false positives as shown in Fig. 3. The Limma and Dream methods have higher power than other methods. The power differences between methods seem stable across the range of technical replicates. Figure 4 shows the actual number of false positives by technical replicates for 3 biological replicates and 5 expected number of false positives. Similarly, FMTVC and OTVC maintain the desired false positive error control at each technical replicate number. When the number of technical replicates is bigger than 4, the FMTSat method maintains the desired false positive error control, while the other three methods still fail to maintain false positives at the nominal level. For both Limma and Dream methods, the actual number of false positives increases with larger numbers of technical replicates. Similar patterns are observed for higher numbers of biological replicates. We believe that the ultrally inflated actual number of false positives for both Limma and Dream methods contributes to the higher power compared to both FMT and OT methods.
Gene correlation
To understand effects of gene correlation on the tests’ performance, we also performed simulations using a blockdiagonal gene correlation structure. We randomly assigned genes into blocks of size 10, where gene correlations inside blocks are fixed at 0.5 while gene correlations between blocks are set to 0. Results on power and actual number of false positives under the gene correlation structure are highly similar to those under gene independence. Supplemental plots are provided in Additional file 1.
A case study
To demonstrate the performance of the proposed method for detecting differential expression under linear mixedeffects models, we applied it to a real microarray study published in Nature Medicine in 2014 by Xing et al. [17]. Microarray data were deposited in the GEO database with an access number GSE45514. This study used Affymetrix Mouse Genome 430 2.0 Array on mouse skin samples taken at different time points (weeks 6 and 12) from C3H/HeJ mice treated with topical JAK inhibitors ruxolitinib (Jak1), tofacitinib (Jak3) or control (PBS). Three skin samples were taken at each time point from 3 mice for each treatment. We focused on the comparison between two treatments (J3 vs. PBS), and we analyzed the data using our FMTVC linear mixedeffects model where mouse was treated as a random effect. We compared the FMTVC method with other methods as done in simulation studies. Figures 5 and 6 show the fitted values of residual variances and variance estimates of random mouse effect. As shown in the venndiagram in Fig. 7, lists of significant genes of week 12 at the PFER cutoff value of 5 expected mean number of false positives are compared among methods. There are several genes detected only by the Limma and Dream methods, but these genes are most likely to be false positives since our simulation studies demonstrate the ultrally inflated false positive rate for both methods especially when the number of biological replicates is small. The few significant genes for OTVC method are mainly due to low power of that method.
Discussion
Differential expression analysis through moderated ttests has become common practice in genomic studies for the past decade. But most moderated ttests are limited to simple designs with the usage of linear fixedeffects models. Recent efforts have been made to accommodate designs associated with complex correlation structures using LMM. In this paper, we proposed a novel approach to address the need of applying moderated methods in LMM and compared our method with two other current methods. Our proposed method is very flexible because we allow different shrinkage over the range of expression levels for each variance component. However the Limma and Dream methods can only do shrinkage of residual variances over the range of expression levels.
We have employed two approaches for approximating degrees of freedom of moderated tstatistic (the variance components method and the WelchSatterthwaite method) for both FMT and OT methods in LMM for testing differential expression. The WelchSatterthwaite method uses all variance estimates to approximate degrees of freedom, but the VC method only accounts for betweensubject variation and ignores withinsubject variation from the total degrees of freedom when fixed effects change across subjects. The WelchSatterthwaite method inflates type I error and fails to maintain false positive error control at the nominal level, but the VC method performs proper control for false positive errors. In addition, FMTVC is more powerful than OTVC mainly due to moderation.
The performance of the proposed method was demonstrated through a case study with a simple design. However it can be applied to comprehensive designs including multifactors or multilevels in linear mixedeffects models. The proposed method can also be applied to data sets that measure gene expression by other technologies, e.g. RNASeq, protein expression, or quantities of small molecules.
Conclusions
Current moderatedt methods are limited to shrinking only the residual variance of linear models. In this paper, we proposed an approach to allow variance shrinkage of both residual error and random effects using moderated tstatistic under LMM. The gain in power through moderation and proper control of false positive errors was demonstrated by simulation studies. We also applied the proposed method to a real gene expression data set.
Availability of data and materials
The microarray data set used in the case study is from publicly available repositories. R package fmt used in the analysis is available from CRAN. R code for the case study is available from the corresponding author.
Abbreviations
 BH:

BenjamimiHochberg
 FDR:

False discovery rate
 FMT:

Fully moderated tstatistic
 LMM:

Linar mixedeffects models
 PFER:

Per family error rate
References
Carithers LJ, Ardlie K, Barcus M, Branton PA, Britton A, Buia SA, Compton CC, DeLuca DS, PeterDemchok J, Gelfand ET, Guan P, Korzeniewski GE, Lockhart NC, Rabiner CA, Rao AK, Robinson KL, Roche NV, Sawyer SJ, Segrè AV, Shive CE, Smith AM, Sobin LH, Undale AH, Valentino KM, Vaught J, Young TR, Moore HM, GTEx Consortium. A Novel Approach to HighQuality Postmortem Tissue Procurement: The GTEx Project. Biopreserv Biobank. 2015; 13(5):311–9.
Yu YP, Landsittel D, Jing L, Nelson J, Ren B, Liu L, McDonald C, Thomas R, Dhir R, Finkelstein S, Michalopoulos G, Becich M, Luo JH. Gene expression alterations in prostate cancer predicting tumor aggression and preceding development of malignancy. J Clin Oncol. 2004; 22(14):2790–9.
Liu H, Dowdle JA, Khurshid S, Sullivan NJ, Bertos N, Rambani K, Mair M, Daniel P, Wheeler E, Tang X, Toth K, Lause M, Harrigan ME, Eiring K, Sullivan C, Sullivan MJ, Chang SW, Srivastava S, Conway JS, Kladney R, McElroy J, Bae S, Lu Y, Tofigh A, Saleh SMI, Fernandez SA, Parvin JD, Coppola V, Macrae ER, Majumder S, Shapiro CL, Yee LD, Ramaswamy B, Hallett M, Ostrowski MC, Park M, Chamberlin HM, Leone G. Discovery of Stromal Regulatory Networks that Suppress RasSensitized Epithelial Cell Proliferation. Dev Cell. 2017; 41(4):392–407.
Shi L, Zhang Y, Feng L, Wang L, Rong W, Wu F, Wu J, Zhang K, Cheng S. Multiomics study revealing the complexity and spatial heterogeneity of tumorinfiltrating lymphocytes in primary liver carcinoma. Oncotarget. 2017; 8(21):34844–57.
Kupfer DM, White VL, Strayer DL, Crouch DJ, Burian D. Microarray characterization of gene expression changes in blood during acute ethanol exposure. BMC Med Genomics. 2013; 6:26.
Golumbeanu M, Desfarges S, Hernandez C, Quadroni M, Rato S, Mohammadi P, Telenti A, Beerenwinkel N, Ciuffi A. ProteoTranscriptomic Dynamics of Cellular Response to HIV1 Infection. Sci Rep. 2019; 9(1):213.
Muscedere J, Maslove DM, Boyd JG, O’Callaghan N, Sibley S, Reynolds S, Albert M, Hall R, Jiang X, Day AG, Jones G, Lamontagne F. Prevention of Nosocomial Infections in Critically Ill Patients With Lactoferrin: A Randomized, DoubleBlind, PlaceboControlled Study. Crit Care Med. 2018; 46(9):1450–6.
Smyth GK. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004; 3:Article3.
Sartor MA, Tomlinson CR, Wesselkamper SC, Sivaganesan S, Leikauf GD, Medvedovic M. Intensitybased hierarchical Bayes method improves testing for differentially expressed genes in microarray experiments. BMC Bioinformatics. 2006; 7:538.
Yu L, Gulati P, Fernandez S, Pennell M, Kirschner L, Jarjoura D. Fully moderated Tstatistic for small sample size gene expression arrays. Stat Appl Genet Mol Biol. 2011; 10(1):Article 42.
Law CW, Chen Y, Shi W, Smyth GK. voom: Precision weights unlock linear model analysis tools for RNAseq read counts. Genome Biol. 2014; 15(2):R29.
Smyth GK, Michaud J, Scott HS. Use of withinarray replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005; 21(9):2067–75.
Hoffman GE, Roussos P. Dream: Powerful differential expression analysis for repeated measures designs. biorxiv. 2018. https://doi.org/10.1101/432567.
Schluchter MD, Elashoff JD. Smallsample adjustments to tests with unbalanced repeated measures assuming several covariance structures. J Stat Comput Simul. 1990; 37:69–87.
Satterthwaite FE. An approximate distribution of estimates of variance components. Biom Bull. 1946; 2:110–4.
Gordon A, Glazko G, Qiu X, Yakovlev A. Control of the mean number of false discoveries, Bonferroni and stability of multiple testing. Annals Appl Stat. 2007; 1:179–90.
Xing L, Dai Z, Jabbari A, Cerise JE, Higgins CA, Gong W, de Jong A, Harel S, DeStefano GM, Rothman L, Singh P, Petukhova L, MackayWiggan J, Christiano AM, Clynes R. Alopecia areata is driven by cytotoxic T lymphocytes and is reversed by JAK inhibition. Nat Med. 2014; 20(9):1043–9.
Funding
This research was partially supported by NIH grants 2P30CA01605840. Publication costs are covered by internal fund of the corresponding author’s department.
Author information
Authors and Affiliations
Contributions
LY generated the original idea and drafted the manuscript. LY and JZ performed the simulations. GB and SF guided the research and revised the manuscript. All authors read and approved the final version of this manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Additional file 1
This pdf file contains all supplementary figures referenced in results section.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Yu, L., Zhang, J., Brock, G. et al. Fully moderated tstatistic in linear modeling of mixed effects for differential expression analysis. BMC Bioinformatics 20 (Suppl 24), 675 (2019). https://doi.org/10.1186/s1285901932489
Published:
DOI: https://doi.org/10.1186/s1285901932489
Keywords
 Fully moderated Tstatistic
 Linear mixedeffects model
 Variance shrinkage
 Expected number of false positives