 Research article
 Open Access
 Published:
Generalized shrinkage Flike statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity
BMC Bioinformatics volumeÂ 12, ArticleÂ number:Â 427 (2011)
Abstract
Background
Many analyses of gene expression data involve hypothesis tests of an interaction term between two fixed effects, typically tested using a residual variance. In expression studies, the issue of variance heteroscedasticity has received much attention, and previous work has focused on either betweengene or withingene heteroscedasticity. However, in a single experiment, heteroscedasticity may exist both within and between genes. Here we develop flexible shrinkage error estimators considering both betweengene and withingene heteroscedasticity and use them to construct Flike test statistics for testing interactions, with cutoff values obtained by permutation. These permutation tests are complicated, and several permutation tests are investigated here.
Results
Our proposed test statistics are compared with other existing shrinkagetype test statistics through extensive simulation studies and a real data example. The results show that the choice of permutation procedures has dramatically more influence on detection power than the choice of F or Flike test statistics. When both types of gene heteroscedasticity exist, our proposed test statistics can control preselected typeI errors and are more powerful. Raw data permutation is not valid in this setting. Whether unrestricted or restricted residual permutation should be used depends on the specific type of test statistic.
Conclusions
The Flike test statistic that uses the proposed flexible shrinkage error estimator considering both types of gene heteroscedasticity and unrestricted residual permutation can provide a statistically valid and powerful test. Therefore, we recommended that it should always applied in the analysis of real gene expression data analysis to test an interaction term.
Background
The regulation of gene expression starts when a cell's DNA is transcribed into mRNA. The simultaneous expression profiles of many genes under different circumstances can provide insight into physiological processes. Using modern technologies in gene expression experiments such as oligonucleotide arrays [1], and cDNA spotted arrays [2], many scientists have made novel discoveries about complex biological processes of yeast [3, 4], drosophila [5], mice [6], humans [7], and other species. Recently one such study also included RNAseq [8]. Statistical methodologies and issues involved in microarray data analysis have been widely reviewed [9â€“12], and it is expected that many of the same issues will need to be addressed with RNAseq.
The analysis of variance (ANOVA) model is a popular statistical modeling method for the analysis of microarrays. Since its introduction by Kerr et al. [13], it has been extensively examined for use in this setting [14â€“21]. Kerr et al. constructed an ANOVA model that included the gene effect as a fixed effect. This model assumes identically and independently distributed residual errors across genes. The advantage of this model is that the large number of genes involved in a microarray experiment results in huge degrees of freedom for the error estimate, which can lead to a very powerful test. However, the common assumption of homoscedasticity may not hold true in this setting [22]. One alternative is to use an ANOVA model for each gene, but the resulting test statistics from genespecific models may have limited power because the biological sample size for each gene in a microarray experiment is usually small.
To address this problem of limited power, researchers have proposed other methods for obtaining more information across genes, ranging from a simple equalweighted average of a genespecific error estimate and the global average of all genespecific error estimates (F_{2} statistic proposed by Wu et al. [19] to empirical Bayesian modeling of all genespecific errors [23â€“26]. Other variations [27â€“29] used different variance modeling strategies to address the heteroscedasticity problem, but no clear winner has emerged [30]. Huang and Liu [31] extended the test statistics proposed by Cui et al. [28] by assuming a normal distribution on the mean and then deriving an empirical Bayes likelihood ratio test. The resulting test statistic shrinks both the mean and variances.
In addition to the problem of betweengene heteroscedasticity, we must also be concerned with withingene heteroscedasticity. For example, in the study of simple differential gene expression between a treatment group and a control group, the variance in the treatment arm may differ from that in the control arm. Some approaches to this problem include a general Bayesian framework to model heteroscedastic error in a single generalized linear mixed model setting [32] and a structural model placed on the error variances specific to each gene and treatment combination [33].
As gene expression studies become more popular, the complexity of the experiment increases. Instead of only simple treatment and control experiments, two or more factor experiments are being conducted. This increase in experiment complexity has led to many scientific questions involving the hypothesis testing of an interaction between two factors. For example, testing a probe by genotype interaction can result in inferences about polymorphism in the probe, such as single nucleotide polymorphism (SNP) and insertiondeletion (indel) [34â€“37]; testing a probe by sex can imply that alternative splicing occurs between male and female subjects [38]; and in pharmacogenomic studies, testing the genotypedrug/treatment or genotypedisease interaction may be of interest [39]. Thus far, all the development of ANOVA methods for microarray studies has focused on tests of main effects.
Here, a generalized shrinkage estimator incorporating both within and betweengene heteroscedasticities is developed (see Lehmann and Cesella [40] for a review of shrinkage estimation). In any given experiment, both withingene and betweengene heteroscedasticity may exist; thus, taking these possibilities into account should lead to an improved test statistic. Moreover, given the increasing complexity of recent studies and the burgeoning interest in hypotheses that involve interactions, we focus on an improved shrinkagebased Ftest for interaction terms.
Methods
Here we develop new shrinkage estimates for the error term and show how to use these estimates to construct Flike statistics. We then estimate the null distribution of these statistics by using permutation tests.
Shrinkage error estimators
Shrinkage error estimators pull individual error estimates toward shrinkage targets, with the amount of shrinkage depending on the variability of individual error estimates [28, 40]. Let the genespecific error estimates for all genes i and subgroups k be
i = 1,...,I, k = 1,..., K, and let {\mathrm{\xcf\u0192}}_{i,k}^{2} be the true variance of gene i in group k. When the experimental design is balanced, {\widehat{\mathrm{\xcf\u0192}}}_{i,k}^{2} is the residual mean square for gene i in group k and \mathrm{\xce\xbd}{\widehat{\mathrm{\xcf\u0192}}}_{i,k}^{2}\xe2\u02c6\u2022{\mathrm{\xcf\u0192}}_{i,k}^{2}~{\mathrm{\xcf\u2021}}_{\mathrm{\xce\xbd}}^{2}, where Î½ represents the degrees of freedom for the error estimates.
The choices of shrinkage targets in microarray data include the following:

1.
Specific values for each genegroup combination

2.
Genespecific values that are the same across all other groups

3.
Groupspecific values that are the same across genes but different across groups

4.
A single point representing the underlying common error
Correspondingly, these targets are correct when (1) there are both withingene and betweengene heteroscedasticity; (2) there is only betweengene heteroscedasticity; (3) there is only withingene heteroscedasticity; and (4) all error variances are identical. We now develop a generalized shrinkage error estimator using these four shrinkage targets.
Let {X}_{i,k}\xe2\u2030\xa1log{\widehat{\mathrm{\xcf\u0192}}}_{i,k}^{2}m~log{\mathrm{\xcf\u0192}}_{i,k}^{2}+log{\mathrm{\xcf\u2021}}_{\mathrm{\xce\xbd}}^{2}\xe2\u02c6\u2022\mathrm{\xce\xbd}m, where m is the mean of log log{\mathrm{\xcf\u2021}}_{\mathrm{\xce\xbd}}^{2}\xe2\u02c6\u2022\mathrm{\xce\xbd}. Then using asymptotic normal approximation of X_{i,k}, the distribution of X_{i,k}s with different shrinkage targets for different gene i and group k combinations is
where \stackrel{\xcc\u0192}{\mathrm{\xce\xb8}}=\left({\mathrm{\xce\xb8}}_{1,1},...,{\mathrm{\xce\xb8}}_{1,K,}...,{\mathrm{\xce\xb8}}_{I,1},...,{\mathrm{\xce\xb8}}_{I,K}\right),\phantom{\rule{2.77695pt}{0ex}}\phantom{\rule{0.3em}{0ex}}\stackrel{\xcc\u0192}{\mathrm{\xce\pm}}=\left({\mathrm{\xce\pm}}_{1},...,{\mathrm{\xce\pm}}_{I}\right) represents the genespecific mean differences, and \stackrel{\xcc\u0192}{\mathrm{\xce\xb2}}=\left({\mathrm{\xce\xb2}}_{1},...,{\mathrm{\xce\xb2}}_{K}\right) models different means with respect to different classes of the subgroups.
If Ïƒ^{2}and Ï„^{2} are known, then the Bayes estimator of Î¸_{i,k}under the squared error loss is [39]:
Here, Ïƒ^{2}is the variance of log {\mathrm{\xcf\u2021}}_{\mathrm{\xce\xbd}}^{2}\xe2\u02c6\u2022\mathrm{\xce\xbd} and is known [28, 40], but Ï„^{2} is not known. However, the marginal distribution of X_{i,k}can be used to create an empirical Bayes estimator of Ï„^{2} and hence of Î¸_{i,k}. Marginally, X_{i,k}~ N(Î¼ + Î±_{ i }+ Î²_{ k }, Ïƒ^{2} + Ï„^{2}),i = 1,..., I, k = 1, ...K, and, from this model, the least square estimates of \mathrm{\xce\xbc},\stackrel{\xcc\u0192}{\mathrm{\xce\pm}},\stackrel{\xcc\u0192}{\mathrm{\xce\xb2}},\widehat{\mathrm{\xce\xbc}},\stackrel{\xcc\u0192}{\mathrm{\xce\pm}},\widehat{\mathrm{\xce\xb2}}, are the uniformly minimumvariance and unbiased estimators. Using the fact that
the empirical Bayes estimator for Ï„^{2} is \mathrm{\xce\pounds}{\left({X}_{i,k}\widehat{\mathrm{\xce\xbc}}{\widehat{\mathrm{\xce\pm}}}_{i}{\widehat{\mathrm{\xce\xb2}}}_{k}\right)}^{2}\xe2\u02c6\u2022\left[IK\left(I+K1\right)2\right]{\mathrm{\xcf\u0192}}^{2}.
Then, we can construct the positivepart empirical Bayes estimator [40]:
where(x)_{+} = max(x, 0). The generalized shrinkage error estimate for Ïƒ_{i,k}can be obtained through exponentiating {\mathrm{\xce\xb8}}_{i,k}^{EB+} as follows:
Using a similar argument, the generalized shrinkage error estimator with the shrinkage target at each gene is
with the shrinkage target at each group is
and with the shrinkage target at the common error, we have
The shrinkage error estimator proposed by Cui et al. [28] shrinks the genespecific error estimators toward their common corrected geometric mean. Specifically, the estimator for {\mathrm{\xcf\u0192}}_{i}^{2} is calculated as
where X_{ i }is the residual variance estimate from a genespecific model, and m and Ïƒ^{2} are the mean and variance of log \frac{{\mathrm{\xcf\u2021}}^{2}K\mathrm{\xce\xbd}}{K\mathrm{\xce\xbd}}. The underlying assumption for this estimator is that there is no betweengene heteroscedasticity, as this estimator shrinks every genespecific error estimator toward one target. Therefore, it will overshrink the genespecific error estimates when gene heteroscedasticity exists. In comparison, generalized shrinkage error estimators are flexible in terms of incorporating a different type of heteroscedasticity. Some degrees of freedom are used for incorporating the heteroscedasticity. However, the gain is that the error estimator is then closer to the underlying distribution and should lead to better performance of the resultant Flike test statistics as shown in the results section.
In formulas (2), (3), (5), and (6), m is the mean and Ïƒ^{2} is the variance of a logtransformed chisquare random variable. The simulationbased approximate values of m and Ïƒ^{2} can be found from Table 1 in work of Cui et al. [28]. Pounds [41] gave analytical expressions for these parameters and developed R code for the exact calculation. Here, the simulationbased approximate values were used.
Shrinkage Flike statistics
To construct a statistic for the hypothesis test of no interaction between two fixed effects, the traditional Ftest is simply the ratio of the mean square of the interaction term (MSI) and the mean square of residuals (MSE). This Ftest, referred to as F_{1} [42], is {F}_{1}=\frac{MSI}{MSE}=\frac{MSI}{{\widehat{\mathrm{\xcf\u0192}}}^{2}}. The F_{1} test corresponding to a specific gene i is denoted by
The error variance estimator in this test uses data from only gene i. In oligonucleotide microarray models, the degrees of freedom for the error estimate can be small because the sample size of RNA is usually small, and hence the power of F_{1} can be limited.
Following the method of constructing an Ftest statistic given by Neter et al. [42], the genespecific shrinkage Flike statistics for testing an interaction between two fixed effects can be obtained as
When the homoscedastic error assumption is true, the pooled variance estimator, {\widehat{\mathrm{\xcf\u0192}}}_{pool}^{2}, can be used to construct an Flike statistic. For a balanced design, the pooled variance estimate is the average of all genespecific error estimates. This statistic is denoted by F_{3} using the same notation used by Cui and Churchill [22], who also introduced another shrinkagetype F statistic, F_{2}, which can also borrow information across genes when estimating the residual variances. The statistic F_{2} uses an equalweighted average of a genespecific error estimator {\widehat{\mathrm{\xcf\u0192}}}^{2} and {\widehat{\mathrm{\xcf\u0192}}}_{pool}^{2}. The definitions of F_{2,i}and F_{3,i}are
Permutation tests
For the proposed generalized shrinkage Flike test statistics, the null distributions are not known named distributions. Therefore, an empirical approach such as a permutation test can be used to estimate the null distributions. The permutation test for interaction is complicated, because there is no exact permutation test for such a purpose [43]. We therefore must consider an approximate permutation method for testing an interaction term in a crossed fixed/mixed model [44, 45].
Permutation approaches developed previously focused on a single ANOVA model. In the typical gene expression study, thousands of ANOVA models are considered simultaneously. The additional complexity of the shrinkage Flike statistics indicates that Monte Carlo studies are needed to investigate the performance of residual permutation and raw data permutation, with restrictions or not, in a geneexpression analysis. The choice of permutation procedures is critical for assessing the performance of a test statistic.
For all the modified Flike statistics presented in the previous section, the null distributions can only be approximated empirically, but permutation procedures can be used to find the approximate null distribution of all the F and Flike statistics. The important issues in performing a permutation analysis include the choice of the exchangeable units under the null hypothesis, the choice of using restricted permutation or not, and the choice of residual permutation or raw data permutation. These choices influence the power of a test statistic.
Residual permutation using residuals from a reduced model and unrestricted raw data permutation can be used to approximate the null distribution of a statistic for testing an interaction term [44]. When using F_{1} to test an interaction term in a single ANOVA model, the residual permutation leads to a more powerful test than unrestricted raw data permutation [44]. However, in gene expression analysis, thousands of genespecific ANOVA models are simultaneously considered, and for a particular genespecific ANOVA model, information from other genespecific ANOVA models is used to construct the shrinkage error estimate. Hence, both residual permutation and raw data permutation were investigated. Furthermore, both restricted and unrestricted permutations were studied, because the permutation units are exchangeable only within each particular group when withingene heteroscedasticity is present across those subgroups.
Results
The properties of this shrinkage estimator are compared with those of other existing F and Flike statistics that have been proposed and described in the "Shrinkage Flike statistics" section.
Simulation studies
The purpose of these simulation studies was to compare the performances of F_{1}, F_{2}, F_{3}, F_{ Cui }, F_{ Gen }, F_{ Gengene }, and F_{ Gengrp }in terms of type I error and power and to compare the results of a particular Flike statistic using four different permutation strategies: restricted/unrestricted residual permutation and restricted/unrestricted raw data permutation.
In these simulation studies, 100 genes with two probes for each gene and three replicates from each of two lines were simulated to mimic a splitplot design in a general oligonucleotide microarray experiment. The genespecific ANOVA model in which data were generated from the model, y_{ plr }= P_{ p }+ L_{ l }+ RL_{ rl }+ PL_{ pl }+ Ïµ_{ plr }, wp = 1, 2, l = 1,2, r = 1,2,3, where P, L, RL, and PL represent probe, line, replicates from a particular line, and the interaction between probe and line, respectively.
Replicates were nested within each line, and RL is usually treated as a random effect during the modelfitting procedure, which results in a correlation between probes from the same biological sample. In the simulated data sets, the correlation between genes was 0. As many as 900 simulation runs were carried out to compare the performances of F_{1}, F_{2}, F_{3}, F_{ Cui }, F_{ Gen }, F_{ Gengene }, and F_{ Gengrp }based on different permutation procedures. The four permutations tested were unrestricted residual permutation, restricted residual permutation with respect to each line, unrestricted raw data permutation, and restricted raw data permutation with respect to each line. The residuals permuted were from a reduced fixed model with fixed effects for only line and probe.
Two types of data were simulated: null cases and cases with a probe by line interaction at a range of degrees. Null cases included: nullce, all probelevel expression values were simulated from the standard normal distribution; nullgh, the genespecific error variances were simulated from the lognormal distribution with mean log at 0 and standard deviation at 2, mimicking the general heteroscedastic error distribution in typical datasets; nullwgh, all genes had the same error structures and the residual error variance of line 1 was 100 times that of line 2; nullbgh, simulated data were modified from nullgh, with the variance of line 1 multiplied by 100. Correspondingly, ce, gh, wgh, and bgh in Figures 1 and 2 were simulated by adding interaction terms to nullce, nullgh, nullwgh, and nullbgh. Quantitative interaction was assumed and the differences in the opposite direction were set to make the detection powers for an interaction term based on traditional Fstatistics and tabled pvalues range from 0.05 to 0.95.
Tables 1 and 2 show the results from 900 simulation runs using raw data permutation and residual permutation, respectively. Data in Table 1 suggest that when both types of gene heteroscedasticity exist, the unrestricted raw data permutation had a greater average comparisonwise error rate (CWER) than residual permutation. Raw data permutation with restriction can control prespecified CWER im all cases. In Table 2, for the common error cases, all test statistics had the prespecified CWER from both restricted and unrestricted residual permutation. When withingene heteroscedasticity existed, F_{1} and F_{ Cui }had inflated CWER from both two residual permutation tests. Restricted residual permutation reduces, but does not solve, this problem. For F_{2} and F_{3}, only the restricted residual permutation could control the prespecified CWER. For F_{ Gen }, F_{ Gengene }, and F_{ Gengrp }, restricted residual permutation gave conservative results in terms of having CWER smaller than the prespecified level. When the shrinkage target is correctly set, unrestricted residual permutation controls the nominal CWER. As expected, only F_{ Gen }coupled with unrestricted residual permutation could be used for all cases, because the CWER was always less than the nominal level.
Further simulations to compare the rejection rates were conducted. Only results from residual permutation are shown because it was found that raw data permutation was less powerful than residual permutation. This is consistent with the findings of Anderson and Ter Braak [44]. Figure 1 shows the estimated average null hypothesis rejection rate curves from all Flike statistics and both restricted and unrestricted residual permutation procedures. The xaxis represents the average null hypothesis rejection rate using F_{1} and the tabulated pvalues. The solid line shows that the corresponding statistic controls the prespecified CWER, and the dashed line shows that the corresponding CWER was inflated. In general, restricted residual permutation is less powerful than unrestricted residual permutation. For example, the power of all statistics from unrestricted residual permutation almost doubled in some cases where heteroscedasticity existed.
When the common error assumption is valid, F_{3} is obviously the most powerful test and the prespecified CWER is controlled. All other Flike statistics performed very similarly in this case. When the shrinkage target was correctly set, the resultant test statistic was the most powerful one. For example, when there was only withingene heteroscedasticity, F_{ Gengrp }was more powerful than F_{ Gen }and F_{ Gengene }based on either restricted or unrestricted residual permutation. The rejection rate comparison of statistically valid test statistics is further illustrated in Figure 2, where the xaxis is the average rejection rate from using F_{ Gen }and unrestricted residual permutation. Figure 2 clearly shows that unrestricted residual permutation is more favorable in terms of power. F_{ Gengrp }appears to be more powerful than F_{ Gen }, but when both types of gene heteroscedasticities occur, F_{ Gen grp }has inflated CWER.
Drosophila data
The data used in this study are from a gene expression comparison study between D. melanogaster and D. simulans [46]. Expression of 10 genotypes of each species was measured in male flies. In D. simulans, each genotype was measured separately, and in D. melanogaster, a pool of 10 genotypes was measured. All genotypes (individual or pooled) were independently isolated and hybridized three times. The goal of the original study was to provide a genomewide approach to identifying candidate genes potentially responsible for adaptation and speciation in D. simulans and D. melanogaster. In this study, we focus on identifying sequence differences between genotypes in D. simulans based on hybridization profiles. Withingene heteroscedasticity is expected because the genotypes come from different lines. The proposed generalized shrinkage Flike test statistics F_{ Gen }, F_{ Gengene }, and F_{ Gengrp }were compared with F_{2}, F_{3} with restricted residual permutation, which could control prespecified CWER for any variance structure in simulation studies. Furthermore, Smyth's moderated Ftest statistic [25] without multiple testing adjustment and controlling the false discovery rate (FDR) at 5% were used for comparison. As the main interest is in sequence difference, the focus is on the test of interaction between line and probe. The split plot model described above is used. SAS program codes are included in the additional files (additional file 1 and additional file 2).
The Drosophila genome has been fully sequenced and both SNPs and indels can cause a significant interaction term. Thus, the false positive rate and detection power based on SNP/indel sequence information can be calculated for a subset of the data. In the data set, there were 10 lines from D. simulans and three replicates from each line. Each probe set had 14 probes. The 1,285 probesets containing all "good" probes were selected. A "bad" probe's sequence satisfies one or more of the following criteria: it matches the D. simulans genome multiple times; it cannot be mapped to the flybase 4.2.1 genome; or, it has no information, such as hitting outside an exon, hitting a poorly aligned region, or hitting a region lacking a sequence. SNP or indel information could be determined in 777 probesets. For this data set, there was a high degree of withingene heteroscedasticity: about 22.3% of the probe sets had a difference in linespecific residual variance estimates as large as or more than a 10fold change. Therefore, as suggested by the conclusions from simulation studies, unrestricted residual permutation and restricted residual permutation were used for generalized shrinkage Flike test statistics (F_{ Gen }, F_{ Gengene }, F_{ Gengrp }) and restricted residual permutation was used for statistics (F_{2}, F_{3}). The results are shown in Table 3. Consistent with the findings from the simulation studies, F_{ Gen }had about 30% more detecting power by valuing the withingene heteroscedasticity than the other Flike test statistics (F_{2}, F_{3}). The false discovery rate of F_{ Gen }was slightly higher than that of F_{2}, F_{3}. F_{ Gengene }and F_{ Gengrp }performed similarly to F_{ Gen }. Both of Smyth's moderated Ftest statistic without multiple testing adjustment and with FDR set at 5% for multiple testing adjustment detected more SNPs and indels but at the expense of a greater FDR than F_{ Gen }.
Discussion
For gene expression analysis, ANOVA models have been a popular modeling technique. Based on ANOVA models, flexible shrinkage Flike test statistics were developed to account for both the withingene and betweengene heteroscedasticities. The emphasis here is on testing an interaction term, as this case is of increasing interest to biologists, and there is no clear existing theory on the most powerful, valid approach for such statistics. For all Flike statistics studied here, their null distributions were approximated empirically through permutations. Four different permutation procedures were investigated for eight different Flike statistical tests of the interaction term.
As expected, we found that when an error estimator overshrinks, the resulting Flike statistic cannot control the prespecified CWER. For example, F_{ Gengene }is an overshrinkage error estimator when there is withingene heteroscedasticity. As a result, compared with generalized shrinkage Flike statistics, it is not valid when withingene heteroscedasticity exists. Undershrinkage is also important, as it will lead to a conservative test and lower power. This is clearly demonstrated when the common error can be assumed and the most powerful valid test is F_{ Gengrp }.
The most striking result was the impact of the permutation procedures. Although this was not completely unexpected [43â€“45], the effect of the permutation procedures is dramatic and worthy of special attention. Unrestricted raw data permutation could not control prespecified CWER when there was withingene heteroscedasticity. Restricted raw data permutation could be used, but it was less powerful than residual permutation. Also consistent with findings from Anderson and Ter Braak [44], restricted permutations are less powerful than unrestricted permutations. However, unrestricted permutations are valid only for a common error and when betweengene heteroscedasticity exists for our proposed shrinkage statistics; they are not valid in combination with F_{2}, F_{3}, or F_{ Cui }. For F_{ Gengrp }, the unrestricted permutation can also be used in cases having withingene heteroscedasticity, while only F_{ Gen }is valid with unrestricted permutation in all cases in terms of controlling prespecified CWER. Interestingly, the power gain from using the correct shrinkage target F_{ Gengrp }rather than F_{ Gen }is far less than that of using unrestricted permutation. The result is that F_{3} is never the most powerful choice when testing an interaction term.
The correct shrinkage target can lead to the most powerful test statistic. As one of the reviewers suggested, a statistical test may be applied to help pick the best shrinkage target before obtaining shrinkage error estimates. However, this extra testing step may inflate the CWER of the test statistic when there is gene heteroscedasticity. For example, when there are both types of gene heteroscedasticities, it is possible that the above test suggests only withingene heteroscedasticities exist, and F_{ Gengrp }is shown to inflate the CWER. There is minimal penalty to using the shrinkage estimator we propose, so we recommend setting the shrinkage target in the full space spanned by group and gene and using unrestricted permutation to compensate for the possible power loss in fewer degrees of freedom left for estimating the errors.
Conclusions
The proposed generalized shrinkage Flike statistic with shrinkage targets located in a space spanned by gene and another group, F_{ Gen }, with unrestricted residual permutation is always valid in terms of having a prespecified CWER. This statistic has reasonable power in most cases; thus, it is generally recommended to be applied to test an interaction term in the analysis of real gene expression data.
Abbreviations
 CWER:

comparisonwise error rate
 FDR:

false discovery rate
 indel:

insertion and deletion
 SNP:

single nucleotide polymorphism
References
Fodor SPA: Massively parallel genomics. Science 1997, 277: 393â€“395. 10.1126/science.277.5324.393
Schena M, Shalon D, Heller R, Chai A, Brown PO, Davis RW: Parallel human genome analysis: microarraybased expression monitoring of 1000 genes. Proceedings of National Academy Science 1996, 93: 10614â€“10619. 10.1073/pnas.93.20.10614
Galitski T, Saldanha AJ, Styles CA, Lander ES, Fink GR: Ploidy regulation of gene expression in yeast. Science 1999, 285: 251â€“254. 10.1126/science.285.5425.251
Tu BP, Kudlicki A, Rowicka M, McKnight SL: Logic of the yeast metabolic cycle: Temporal compartmentalization of cellular processes. Science 2005, 310: 1152â€“1158. 10.1126/science.1120499
White KP, Rifkin SA, Hurban P, Hogness DS: Microarray analysis of Drosophila development during metamorphosis. Science 1999, 286: 2179â€“2184. 10.1126/science.286.5447.2179
Chabas D, Baranzini SE, Mitchell D, Bernard CCA, Rittling SR, Denhardt DT, Sobel RA, Lock C, Karpuj M, Pedotti R, Heller R, Oksenberg JR, Steinman L: The influence of the proinflammatory cytokine, Osteopontin, on autoimmune demyelinating disease. Science 2001, 294: 1731â€“1735. 10.1126/science.1062960
Sebat J, Lakshmi B, Troge J, Alexander J, Young J, Lundin P, Maner S, Massa H, Walker M, Chi MY, Navin N, Lucito R, Healy J, Hicks J, Ye K, Reiner A, Gilliam TC, Trask B, Patterson N, Zetterberg A, Wigler M: Largescale copy number polymorphism in the human genome. Science 2005, 305: 525â€“528.
Blekhman R, Marioni JC, Zumbo P, Stephens M, Gilad Y: Sexspecific and lineagespecific alternative splicing in primates. Genome Research 2010, 20(2):180â€“189. 10.1101/gr.099226.109
Butte A: The use and analysis of microarray data. Nature Reviews 2002, 1: 951â€“960. 10.1038/nrd961
Churchill GA: Fundamentals of experimental design for cDNA microarrays. Nature Genetics 2002, 32: 490â€“495. 10.1038/ng1031
Craig BA, Black MA, Doerge RW: Gene expression data: The technology and statistical analysis. Journal of Agricultural, Biological, and Environmental Statistics 2003, 8(1):1â€“28. 10.1198/1085711031256
Allison DA, Cui X, Page GP, Sabripour M: Microarray data analysis: from disarray to consolidation and consensus. Nature Reviews Genetics 2006, 7: 55â€“65. 10.1038/nrg1749
Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. Journal of Computational Biology 2000, 7: 819â€“837. 10.1089/10665270050514954
Kerr MK, Churchill GA: Statistical design and the analysis of gene expression microarrays. Genetical Research 2001, 77: 123â€“128.
Kerr MK, Churchill GA: Experimental design for gene expression microarrays. Biostatistics 2001, 2: 183â€“201. 10.1093/biostatistics/2.2.183
Pritchard CC, Hsu L, Delrow J, Nelson PS: Project normal: Defining normal variation in mouse gene expression. Proceedings of the National Academy of SciencesUSA 2001, 98: 13266â€“13271. 10.1073/pnas.221465998
Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H, Bushel P, Ashfari C, Paules RS: Assessing gene significance from cDNA microarray expression data via mixed models. Journal of Computational Biology 2001, 8(6):625â€“637. 10.1089/106652701753307520
Kerr MK, Afshari CA, Bennett L, Bushel P, Martinez J, Walker NJ, Churchill GA: Statistical analysis of a gene expression microarray experiment with replication. Statistica Sinica 2002, 12: 203â€“217.
Wu H, Kerr MK, Cui XQ, Churchill GA: MAANOVA: A Software package for the analysis of spotted cDNA microarray experiments, In. In The analysis of gene expression data: methods and software. Springer; 2002:313â€“341.
Chu T, Weir B, Wolfinger R: A systematic statistical linear modeling approach to oligonucleotide array experiments. Mathematical Biosciences 2002, 176: 35â€“51. 10.1016/S00255564(01)001079
Wayne ML, Pan YJ, Nuzhdin SV, McIntyre LM: Additivity and transacting effects on gene expression in male Drosophila simulans . Genetics 2004, 168: 1413â€“1420. 10.1534/genetics.104.030973
Cui X, Churchill GA: Statistical tests for differential expression in cDNA microarray experiments. Genome Biology 2003, 4(4):201.
Baldi P, Long AD: A Bayesian framework for the analysis of microarray expression data: Regularized ttest and statistical inferences of gene changes. Bioinformatics 2001, 17: 509â€“519. 10.1093/bioinformatics/17.6.509
LÃ¶nnstedt I, Speed T: Replicated microarray data. Statistica Sinca 2002, 12: 31â€“46.
Smyth GK: Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 2004., 3: No. 1, Article 3 No. 1, Article 3
Tong TJ, Wang YD: Optimal shrinkage estimation of variances with applications to microarray data analysis. Journal of the American Statistical Association 2007, 102: 113â€“122. 10.1198/016214506000001266
Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to ionizing radiation response. The Preceedings of National Academy Science 2001, 989: 5116â€“5121.
Cui X, Hwang JTG, Qiu J, Blades NJ, Churchill GA: Improved statistical tests for differential gene expression by shrinking variance components estimates. Biostatistics 2005, 6: 59â€“75. 10.1093/biostatistics/kxh018
Feng S, Wolfinger RD, Chu TM, Gibson GC, McGraw LA: Empirical Bayes analysis of variance component models for microarray data. Journal of Agricultural, Biological & Environmental Statistics 2006, 1113: 197â€“190.
Kim SY, Lee JW, Sohn IS: Comparison of various statistical methods for identifying differential gene expression in replicated microarray data. Statistical Methods in Medical Research 2006, 15: 3â€“20. 10.1191/0962280206sm423oa
Hwang JTG, Liu P: Optimal tests shrinking both means and variances applicable to microarray data analysis. In preprint 2007â€“03. Department of Statistics, Iowa State University, IA; 2007.
Kizilkaya K, Tempelman RJ: A general approach to mixed effects modeling of residual variances in generalized linear mixed models. Genetics Selection Evolution 2005, 37: 31â€“56. 10.1186/1297968637131
Jaffrezic F, Marot G, Degrelle S, Isabelle H, Foulley JL: A structural mixed model for variances in differential gene expression studies. Genetical Research 2007, 89(1):19â€“25. 10.1017/S0016672307008646
Rostoks N, Borevitz JO, Hedley PE, Russell J, Mudie S, Morris J, Cardle L, Marshall DF, Waugh R: Singlefeature polymorphism discovery in the Barley transcriptome. Genome Biology 2005, 6: R54. 10.1186/gb200566r54
Kirst M, Caldo R, Casati P, Tanimoto G, Walbot V, Wise RP, Buckler ES: Genetic iversity contribution to errors in short oligonucleotide microarray analysis. Plant Biotechnology Journal 2006, 4: 489â€“498.
Zhang X, Shiu SH, Cal A, Borevitz JO: Global analysis of genetic, epigenetic and transcriptional polymorphisms in Arabidopsis Thaliana Using whole genome tiling arrays. PLoS Genetics 2008, 4(3):e1000032. 10.1371/journal.pgen.1000032
Zhang X, Borevitz JO: Global Analysis of Allelespecific Expression in Arabidopsis Thaliana. Genetics 2009, 182(4):943â€“954. 10.1534/genetics.109.103499
McIntyre LM, Bono LM, Genissel A, Westerman R, Junk D, TelonisScott M, Harshman L, Wayne ML, Kopp A, Nuzhdin SV: Sex specific expression of alternative transcripts in Drosophila. Genome Biology 2006, 7: R79. 10.1186/gb200678r79
Kelly P, Zhou YH, Whitehead J, Stallard N, Bowman C: Sequentially testing for a genedrug interaction in a genomewide analysis. Statistics in Medicine 2008, 27: 2022â€“2034. 10.1002/sim.3059
Lehmann EL, Casella G: Theory of Point Estimation. 2nd edition. New York: SpringerVerlag; 1998.
Pounds S: Computational enhancement of a shrinkagebased analysis of variance Ftest proposed for differential gene expression analysis. Biostatistics 2007, 83: 505â€“506.
Neter J, Wasserman W, Kutner MH: Applied Linear Statistical Models: Regression, Analysis of Variance, and Experimental Designs. 3rd edition. Irwin, Inc; 1990.
Edgington ES: Randomization Tests. 3rd edition. Marcel Dekker, New York; 1995. (1995) (1995)
Anderson MJ, Ter Braak CJF: Permutation tests for multifactorial analysis of anova. Journal of Statistical Computation and Simulation 2003, 732: 85â€“113.
Churchill GA, Doerge RW: Naive application of permutation testing leads to nflated type I error rates. Genetics 2008, 178: 609â€“610. 10.1534/genetics.107.074609
Nuzhdin SV, Wayne ML, Harmon KL, McIntyre LM: Common pattern of evolution of gene expression level and protein sequence in drosophila. Molecular Biology and Evolution 2004, 21: 1308â€“1317. 10.1093/molbev/msh128
Acknowledgements
We thank Brandon Walts for identifying true SNP positions; Angela J. McArthur and David R. Galloway for their help in scientific editing; associate editor and three reviewers for their constructive comments that much improved this manuscript. This research was supported by NIH 1R01GM077618 (McIntyre), NIH 1R01GM081704 (Casella).
Author information
Authors and Affiliations
Corresponding author
Additional information
Authors' contributions
All authors contributed to the design of the overall strategy. JY carried out all the analysis and drafted the manuscript. LMM and GC helped to draft and finalize the manuscript. All authors read and approved the final manuscript.
Electronic supplementary material
12859_2010_4887_MOESM1_ESM.SAS
Additional file 1: SAS program code 1. SAS program code for analyzing the real data set using residual permutation without restriction. (SAS 11 KB)
12859_2010_4887_MOESM2_ESM.SAS
Additional file 2: SAS program code 2. SAS program code for analyzing the real data set using residual permutation with restriction. (SAS 11 KB)
Authorsâ€™ original submitted files for images
Below are the links to the authorsâ€™ original submitted files for images.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Yang, J., Casella, G. & McIntyre, L.M. Generalized shrinkage Flike statistics for testing an interaction term in gene expression analysis in the presence of heteroscedasticity. BMC Bioinformatics 12, 427 (2011). https://doi.org/10.1186/1471210512427
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/1471210512427
Keywords
 Null Distribution
 Permutation Procedure
 Shrinkage Estimator
 Propose Test Statistic
 Multiple Testing Adjustment