 Research
 Open access
 Published:
A statistical method for the conservative adjustment of false discovery rate (qvalue)
BMC Bioinformatics volumeÂ 18, ArticleÂ number:Â 69 (2017)
Abstract
Background
qvalue is a widely used statistical method for estimating false discovery rate (FDR), which is a conventional significance measure in the analysis of genomewide expression data. qvalue is a random variable and it may underestimate FDR in practice. An underestimated FDR can lead to unexpected false discoveries in the followup validation experiments. This issue has not been well addressed in literature, especially in the situation when the permutation procedure is necessary for pvalue calculation.
Results
We proposed a statistical method for the conservative adjustment of qvalue. In practice, it is usually necessary to calculate pvalue by a permutation procedure. This was also considered in our adjustment method. We used simulation data as well as experimental microarray or sequencing data to illustrate the usefulness of our method.
Conclusions
The conservativeness of our approach has been mathematically confirmed in this study. We have demonstrated the importance of conservative adjustment of qvalue, particularly in the situation that the proportion of differentially expressed genes is small or the overall differential expression signal is weak.
Background
Microarray and sequencing technologies have been widely used in genomewide expression experimental for biological and medical studies [1â€“5]. After screening a large number of genes simultaneously, we expect to achieve new biological discoveries. In these situations, an important statistical concept is multiple hypothesis testing, in which many statistical tests are conducted at the same time. Then, a detection of gene with relatively small pvalue may be actually a false discovery. Since the introduction of microarray technology, the concept of false discovery rate (FDR) and its related statistical methods have been well developed [6, 7]. qvalue is a statistical method for the estimation of FDRs [8]. It has been widely used in the analysis of microarray and sequencing data.
Since qvalue is an estimation method, it is possible that it underestimates true FDRs. Then, an undesirable consequence is that many genes detected with low qvalue cannot be validated in a followup experiment. Therefore, it is important to avoid underestimations of FDRs. However, there is a lack of statistical method to address this important issue. Furthermore, in many situations, qvalues are calculated based on pvalues that are evaluated based on a permutation procedure (due to unknown data distributions or nontraditional test statistics). Then, these pvalues are also estimated. This increases the complexity of FDR underestimations. It is necessary to adjust the underestimation of FDRs in a comprehensive approach.
In this study, we proposed a statistical method for the conservative adjustment of qvalue, which is one of the most frequently used procedure for estimating FDRs. We first reviewed the concepts of multiple hypothesis testing, FDR and qvalue. Then, we introduced a concept of conservative adjustment. Based on the theory of rankbased quantiles, we described a nonparametric approach and we conducted simulation and application studies to illustrate its usefulness.
Methods
Multiple hypothesis testing and false discovery rate
When a large number of variables are simultaneously screened, it is usually a situation that a mixture of true and false null hypotheses is presented. (There is a hypothesis to test for each variable but the underlying true/false nature is unknown). Statistically, this is a situation of multiple hypothesis testing (MHT). An illustrative summary is shown in Table 1. After certain statistical tests for m total hypotheses, with a threshold for declaring significance, we have R hypotheses rejected (claimed positives). If we know the underlying nature of each variable, then its related hypothesis can be classified as either a true null or a false null (termed gold standard). If this information is provided, then we know the numbers U, V, W and S in Table 1. U, V, W and S represent the numbers of true negatives, false positives, false negatives and true positives, respectively. However, in practice, the gold standard information (or the underlying nature) is usually unknown. Therefore, it is statistically interesting to evaluate U, V, W or S (or their combinations).
The traditional familywise error rate (FWER) provides a strong control on V [9]. Since FWER is too conservative (for example, requiring extremely small pvalue threshold), it is usually difficult to claim statistical significance. The false discovery rate (FDR) has been proposed to empirically evaluate the proportion of false positives among the claimed positives: V/R [6]. The concept of FDR and its related estimations have been widely used in the analysis of genomewide expression data collected by microarray or RNA sequencing technologies. Particularly, qvalue [8] is one of the most widely used method for FDR estimation.
qValue
Storey and Tibshirani [8] proposed the qvalue method that is a statistical procedure for FDR evaluation. Suppose T is the test statistic and genes with test scores greater than or equal to t will be claimed significant. Let Î± be the pvalue at t, and let f(t) and s(t) be the expected numbers of false positives and significant genes, respectively. Storey and Tibshirani [8] proposed that the FDR for T=t could be approximated by the proportion of false positives:
where m is the total number of genes and Ï€ _{0} is the proportion of true null hypotheses (i.e. Ï€ _{0}=m _{0}/m in Table 1). With a proper estimator \(\hat {\pi }_{0}\) for Ï€ _{0}, they proposed the qvalue as a FDR estimator:
where (#{Tâ‰¥t}) estimates m Pr(Tâ‰¥t).
Conservative adjustment of qvalue
In practice, it can be difficult to obtain the theoretical pvalue of t. Therefore, a permutation based pvalue \(\hat {\alpha }\) has been widely used to estimate Î±. Then, the permutation version of qvalue is
When the permutation method [10] is used to evaluate pvalues, it is possible to obtain underestimated results. We have proposed a conservative adjustment of permutation pvalues to address this issue [11].
Similarly, the above FDR can be underestimated since a qvalue is actually a combination of three estimates:

\(\hat {\pi }_{0}\) for Ï€ _{0},

\(\hat {\alpha }\) for Î±,

(#{Tâ‰¥t}) for m Pr(Tâ‰¥t).
From above, notice that m Pr(Tâ‰¥t) must still be empirically estimated even when Î± can be theoretically determined. To address the underestimation of FDR (from qvalue), our solution is to consider a conservative adjustment of q(t).
According to Eq. (1), the theoretical qvalue for T=t can be defined as:
where Î±(t)=Pr(Tâ‰¥tH _{0}) and Î³(t)=Pr(Tâ‰¥t).
We define the 100(1âˆ’a)% conservative adjustment of q(t) as an estimator \(\hat {q}_{c}(t) \in [0,1]\) such that:
Our solution for \(\hat {q}_{c}(t)\) is to find \(\hat {\pi }_{0c}\), \(\hat {\alpha }_{c}(t)\) and \(\hat {\gamma }_{c}(t)\) such that:
If (1âˆ’a _{0})(1âˆ’a _{1})(1âˆ’a _{2})â‰¥1âˆ’a, then we claim that \(\hat {\pi }_{0c} \hat {\alpha }_{c}(t) / \hat {\gamma }_{c}(t)\) satisfies the requirement for \(\hat {q}_{c}(t)\), given Ï€ _{0},Î±(t)>0 and \(\mathbf {Pr}[\hat {\gamma }_{c}(t)>0]=1\). The mathematical proof is given as an Appendix 1.
Remark
\(\hat {q}_{c}(t)\) is actually an upper confidence limit of \(\tilde {q}(t)\). However, due to the difficulty in the accurate estimation of Ï€ _{0} (Ï€ _{0} is usually conservatively estimated), it is difficult to calculate a lower confidence limit of \(\tilde {q}(t)\) in practice.
Conservative estimation of Ï€ _{0}
Due to the identifiability issue, Ï€ _{0} (the proportion of true null hypotheses) is usually conservatively estimated in practice [12]. Many statistical methods have been proposed for the estimation of Ï€ _{0} [13]. Among them, the convest [14] is a wellrecognized method that conservatively estimates Ï€ _{0}. (Instead of estimating the true Ï€ _{0}, it estimates a close upper bound of Ï€ _{0}. Therefore, the estimation is conservative since the proportion of nondifferentially expressed genes is usually overestimated in a differential expression analysis). According to our experience, it is reasonable to assume that
When \(\hat {\pi }_{0c}\) is the convest method (or other similar methods). Then, we discuss \(\hat {\alpha }_{c}(t)\) and \(\hat {\gamma }_{c}(t)\), which are closely related to some rankbased quantiles.
Conservative adjustment of rankbased quantiles
In a differential expression analysis, only qvalues associated with observed test scores can be evaluated, and it is difficult to accurately evaluate the qvalues for unobserved test scores. Here are the mathematical details. With a test statistic T, we can obtain m observed test scores {T _{1},T _{2},â€¦,T _{ m }} for m variables (genes). An empirical estimate of Î³(t)=Pr(Tâ‰¥t) is
where the indicator function Î´(TRUE)=1 and Î´(FALSE)=0. If we sort these m test scores in an increasing order: T _{(1)}â‰¤T _{(2)}â‰¤â€¦â‰¤T _{(m)}, then we have
which is a rankbased quantile estimator.
If the theoretical distribution of T is unknown, then we need to use the permutation procedure [10] for evaluating pvalues. In the permutation procedure, we permute sample labels and recalculate test scores. Since the observed test scores (scores calculated based the original data) can also be considered as the results from a particular permutation, they are generally included in the pool of permuted test scores. Based on r permuted test scores: \(\{ T^{0}_{1}, T^{0}_{2}, \ldots, T^{0}_{r} \}\), an empirical estimate of Î±(t)=Pr(Tâ‰¥tH _{0}) is
If we sort these r permuted test scores in an increasing order: \(T^{0}_{(1)} \le T^{0}_{(2)} \le \ldots \le T^{0}_{(r)}\), then we have
Note that, if the observed test scores are not included in the pool of permuted test scores, then the permutation pvalue of T=t will be \(\hat {\alpha }(T^{0}_{(k)})=(rk+1)/r\), where \(T^{0}_{(k)} \ge t\) is the closest order statistic to t. Therefore, the permutation pvalue of T=t can also be considered as a rankbased quantile estimator.
Since \(\hat {\gamma }\) and \(\hat {\alpha }\) are both random variables, it is possible for them to underestimate their target parameters. The above discussion shows that, in practice, \(\hat {\gamma }\) and \(\hat {\alpha }\) are actually both rankbased quantile estimators. Based on the theory of order statistics [15], we have proposed a conservative adjustment for this type of estimator [11]. Such an adjustment requires no parametric assumption on the distribution of test statistic and the solution can be expressed by a normalised incomplete beta function. Therefore, based on this adjustment, we can otain \(\hat {\alpha }_{c}(t)\) and \(\hat {\gamma }_{c}(t)\) such that \(\mathbf{Pr}[\hat {\alpha }_{c}(t) \ge \alpha (t)] \ge 1a_{1}\) and \(\mathbf{Pr}[\hat {\gamma }_{c}(t) \le \gamma (t)] \ge 1a_{2}\).
Results
A simulation study
To understand how likely the qvalue method underestimates the true false discovery rate, we conducted a simulation study. We choose the normal distributions for simulating expression data and the Studentâ€™s t for differential expression. In this way, we could evaluate the true false discovery rate theoretically.
Ten thousand genes were simulated for two sample groups with sample size 5 for each group (10 for total sample size). For nondifferentially expressed genes, the expression data were simulated from N(0,1) for both groups. Then, we considered four scenarios. For the first simulation scenario, the proportion of differentially expressed genes was 10%; the expression data for differentially expressed genes were simulated from N(0,1) and N(1,1) for the first and the second sample groups, respectively (Î”=1). For the second simulation scenario, the proportion of differentially expressed genes was 10%; the expression data for differentially expressed genes were simulated from N(0,1) and N(2,1) for the first and the second sample groups, respectively (Î”=2). For the third simulation scenario, the proportion of differentially expressed genes was 20%; the expression data for differentially expressed genes were simulated from N(0,1) and N(1,1) for the first and the second sample groups, respectively (Î”=1). For the last simulation scenario, the proportion of differentially expressed genes was 20%; the expression data for differentially expressed genes were simulated from N(0,1) and N(2,1) for the first and the second sample groups, respectively (Î”=2).
For all different scenarios, the Studentâ€™s ttest was used for differential expression analysis. To evaluate pvalues, we performed all possible (126) permutations for each simulated gene and pooled all 1,260,000 permuted test scores together as the empirical null distribution. (In practice, the underlying data distributions are unknown and the permutation procedure is widely used.) For each scenario, we conducted 100 repetitions to understand the variations in the simulation results.
The simulation results are shown in Fig. 1. For each scenario, the theoretical true FDRs is compared to their related estimates (qvalues). In summary, when the proportion of differentially expressed genes (1âˆ’Ï€ _{0}) becomes smaller (from 20 to 10%), it is more likely to obtain underestimated FDRs; when the signal of differential expression (Î”) becomes weaker, it is more likely to obtain underestimated FDRs. Figure 2 gives a scenario with a moderate proportion of differentially expressed genes (1âˆ’Ï€ _{0}=15%) and overall moderate differential expression signals (Î”=1.5). Each curve is a comparison between the empirical FDRs (qvalues) vs. theoretical FDRs (based on one repetition of simulation). Below 0.05 theoretical FDR, some curves can be observed under the diagonal line, which indicate that these empirical FDRs (qvalues) are underestimated. Furthermore, among 100 repetitions, there is a considerable portion of empirical FDRs underestimated (when the theoretical FDR below 0.05).
An artificial illustration
A conservative adjustment of false discovery rate (FDR) can be useful in practice, especially before the experimental validation of genes identified from a genomewide expression study. For example, based on a microarray or RNAseq study, one may want to validate a few genes with qvalue less than 10%. However, it may be surprising that much less genes can be confirmed after the RTPCR validation. (The validation result is beyond our expectation based on 5% estimated FDR.) This hypothetical situation would be an example of underestimation of FDR.
To demonstrate the above situation, we conducted a simple simulation study. 10,000 genes were simulated for two sample groups. The sample size was 5 for each group. The proportion of differentially expressed genes was 10%. The expression profiles of differentially expression genes were simulated from N(0,1) and N(1,1) for the first and the second sample groups, respectively. The expression profiles of nondifferentially expressed genes were simulated from N(0,1) for both groups. The Studentâ€™s ttest was used for differential expression analysis. To evaluate pvalues, we performed all possible (126) permutations for each simulated gene and pooled all 1,260,000 permuted test scores together as the empirical null distribution. (In practice, the underlying data distributions are unknown and the permutation procedure is widely used.) The convest method proposed by Langaas et al. [14] was used to obtain an estimated Ï€ _{0}, which was used in the qvalue estimation procedure [8]. In this simulation it is feasible to calculate the true FDR theoretically. Figure 3 shows that the low values of true FDR can be seriously underevaluated by qvalue. Then, we considered a conservative adjustment. We set a _{0}=0 (since Ï€ _{0} is usually conservatively estimated), \(a_{1}=1\sqrt {0.95}\) and \(a_{2}=1\sqrt {0.95}\) (then a=0.05). Figure 3 shows the conservatively adjusted qvalues.
Three applications
We applied our method to two genomewide expression data sets. The first one was a microarray data set and it was collected for a diabetes study. It is wellknown that differential expression signals are usually weak in diabetes studies. When the sample size is not relatively large, it is usually difficult to detect true differentially expressed genes. (Due to the inflated false positive rates from the multiple hypothesis testing for a large number of genes, genes with seemingly small FDRs are likely noise genes). It is interesting to understand the adjustment effects from our method for this scenario. The second one was a RNA sequencing (RNAseq) data set and it was collected for a prostate cancer study. It is also wellknown that differential expression signals are usually strong in cancer studies. Genomewide expression data for different types of cancer have been increasingly collected in The Cancer Genome Atlas project [3]. The current sample sizes in many TCGA cancer studies are relatively large. Then, it is also interesting to understand the adjustment effects from our method for such as scenario.
In practice, it is not feasible to calculate the theoretical true FDRs. The curve of estimated FDR vs. number of identified genes is widely used to for a summary of differential expression analysis. (In this curve, the yvalue is a specific FDR and the xvalue is the related number of genes with the specific FDR). Our application results can also be summarized in term of this type of curve.
For the first microarray genomewide expression data set for a type 2 diabetes study [16], there were 17 normal subjects and 18 diabetic subjects. After the procedure of gene filtering [16], there are 10,983 genes. Based on 1,000 permutations and the related Studentâ€™s ttest calculations, there were 10,983,000 permuted test scores as our empirical null distribution for pvalue evaluations. (Since it is difficult to enumerate all possible permutations, we performed 1,000 of them). We set a _{0}=0 (since Ï€ _{0} is already conservatively estimated by the convest method [14]), \(a_{1}=1\sqrt {0.95}\) and \(a_{2}=1\sqrt {0.95}\) (then a=0.05). Figure 4 a shows that the qvalues (estimated FDRs) can only be as low as slightly less than 0.2. There were only several genes with qvalues around 0.2. However, after our conservative adjustment, Fig. 4 a shows that all the conservatively adjusted qvalues are greater than 0.8. This comparison implies that most genes were very likely noise genes and the detections of differentially expressed genes by low qvalues could be very unreliable. The only gene with qvalue less than 0.2 is a mRNA for CD20like precursor. However, no literature was found to support its association with diabetes diseases.
For the second RNAseq genomewide expression data set for a prostate cancer study [3], there were 52 normal subjects and 445 tumor subjects (at the time of data download for this analysis). There are 20,531 genes. Since the sample size was large and the RNAseq expression profiles were counttype data, we used the nonparametric Wilcoxon rank sum test with its theoretical pvalues calculation. Therefore, there was no need for an adjustment of pvalues (i.e. a _{1}=0). Then, we set a _{0}=0 (since Ï€ _{0} is already conservatively estimated by the convest method [14]) and a _{2}=1âˆ’0.95 (then a=0.05). Figure 4 b shows that the qvalues (estimated FDRs) can be extremely low and lots of genes can be detected. After our conservative adjustment, Fig. 4 b shows that the curve of adjusted qvalues almost overlaps with the curve of original qvalues. This comparison implies that many genes were truly differentially expressed genes and the detection of these genes by low qvalues could be highly confident. After checking literature for top ranked genes, many of them have been studied to be either directly or indirectly related to prostate cancer or general cancer diseases (details not shown).
For the third microarray genomewide expression data set for a pancreaticislet study [17], there were 7 normal patients and 5 type 2 diabetic patients. There are 44,928 genes and ESTs. Based on all possible permutations and the related Studentâ€™s ttest calculations. We set a _{0}=0 (since Ï€ _{0} is already conservatively estimated by the convest method [14]), \(a_{1}=1\sqrt {0.95}\) and \(a_{2}=1\sqrt {0.95}\) (then a=0.05). Figure 4 c shows that the qvalues (estimated FDRs) can only be as low as approximately 0.4 (about 20 genes). However, after our conservative adjustment, Fig. 4 c shows that all the conservatively adjusted qvalues are greater than 0.5. This comparison implies that more top ranked genes were likely noise genes. The only two genes with qvalue less than 0.4 are mRNAs for ARNT and APCDD1. Although ARNT has been widely studied for its association with diabetes diseases, no literature was found to support the association between APCDD1 and diabetes diseases.
Discussion
For our method, there are three components that can be adjusted separately. The first component is on Ï€ _{0} estimation. Since the estimators for Ï€ _{0} are usually conservative (especially for the convest method [14]), we do not suggest any adjustment for this component according to our experience. The last component is on the number of identified genes. It can be adjusted based on the theory of order statistic. The second component is on pvalues. When theoretical pvalue can be obtained, it is not necessary to adjust this component. For permutation pvalues, an adjustment can be also performed based on the theory of order statistic. Notice that the number of permutations is also important. In practice, we need to determine it before data analysis. When the sample size is relatively small, we can enumerate all possible permutations. When the sample size is relatively large, we can set the number of permutations as large as possible according to the computing power.
A clear advantage of our approach is that there are rigorous mathematical theories to support it. Furthermore, no distribution assumptions are required for our conservative adjustment. However, the adjustment may be overconservative. If we could further improve the control of upper/lower bounds (as shown in our mathematical proof), then less conservative adjustment of qvalue could be achieved. Furthermore, an independence assumption is required. It is wellknown that genes work with each other during molecular and cellular processes. It would be an interesting future research topic to address the dependence among genes. Therefore, it will be our future research topics to investigate possible better upper/lower bounds for the conservative adjustment of qvalue, as well as the impact of dependence on the conservative adjustment of qvalue.
Conclusions
In this study, we proposed a statistical method for the conservative adjustment of qvalue, which is widely used to estimate false discovery rate (FDR) in practice. We provided a mathematical proof to confirm the conservativeness of our approach. We conducted simulation studies to understand how likely the qvalue method would underestimate FDRs. From our simulation results, both the proportion of differentially expressed genes and the overall differential expression signal were two important factors. When both of them were relatively small/weak, it was likely to identify genes with underestimated FDRs. Our first application was based on a microarray diabetes study data set with relatively small sample size (and weak differential expression signals). Our third application was based on a microarray pancreatic islet study data set with relatively small sample size (and also weak differential expression signals). The results were consistent with the conclusion from our simulation studies. Our second application was based on a RNAseq prostate cancer study data set with relatively large sample size (and strong differential expression signals). According to the results, the conservatively adjusted qvalues were close to the originally unadjusted qvalues.
Appendix 1
Mathematical Proof:
Remark
For any random variable X, Y and a constant c, Pr(X Yâ‰¥cXâ‰¥1)â‰¥Pr(Yâ‰¥c) since Yâ‰¥câ‡’X Yâ‰¥c given Xâ‰¥1 ({Yâ‰¥c} is a subset of {X Yâ‰¥c} when Xâ‰¥1).
Appendix 2
Rfunctions for calculating conservatively adjusted qvalues.
Notice that pi0est is a conservative estimate of Ï€ _{0}; pvals is a list of permutation pvalues based on r permutations; m is the number of permutation pvalues (also the number of genes).
References
Schena M, Shalon D. Davis RW and Brown PO. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995; 270:467â€“70.
Lockhart D, Dong H, Byrne M, Follettie M, Gallo M, Chee M, Mittmann M, Wang C, Kobayashi M. Horton H, and Brown E. Expression monitoring by hybridization to highdensity oligonuleotide arrays. Nat Biotechnol. 1996; 14:1675â€“80.
The Cancer Genome Atlas Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008; 455:1061â€“8.
Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M, Snyder M. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 1344; 320.
Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, Goodhead I, Penkett CJ, Rogers J, Bahler J. Dynamic repertoire of a eukaryotic transcriptome surveyed at singlenucleotide resolution. Nature. 2008; 453:1239â€“43.
Benjamini Y, Hochberg Y. Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc Series B. 1995; 57:289â€“300.
Ghosh D. Incorporating the empirical null hypothesis into the BenjaminiHochberg procedure. Stat Appl Genet Mol Biol. 2012; 11(Online):1544â€“6115.
Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A. 2003; 100:9440â€“5.
Wright SP. Adjusted Pvalues for simultaneous inference. Biometrics. 1992; 48:1005â€“13.
Dudoit S, Shaffer JP, Boldrick JC. Multiple hypothesis testing in microarray experiments. Stat Sci. 2003; 18:71â€“103.
Lai Y. Conservative adjustment of permutation pvalues when the number of permutations is limited. Int J Bioinforma Res Appl. 2007; 3:536â€“46.
Markitsis A, Lai Y. A censored beta mixture model for the estimation of the proportion of nondifferentially expressed genes. Bioinformatics. 2010; 26:640â€“6.
Cheng Y, Gao D, Tong T. Bias and variance reduction in estimating the proportion of truenull hypotheses. Biostatistics. 2015; 16:189â€“204.
Langaas M, Ferkingstad E, Lindqvist B. Estimating the proportion of true null hypotheses, with application to DNA microarray data. J R Stat Soc Series B. 2005; 67:555â€“72.
Balakrishnan N, Cohen AC. Order statistics and inference. San Diego: Academic Press, INC.; 1991.
Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J, Puigserver P, Carlsson E, Ridderstrale M, Laurila E, Houstis N, Daly MJ, Patterson N, Mesirov JP, Golub TR, Tamayo P, Spiegelman B, Lander ES, Hirschhorn JN, Altshuler D, Groop L. PGC1 Î±response genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Genet. 2003; 34:267â€“73.
Gunton JE, Kulkarni RN, Yim SH, Okada T, Hawthorne WJ, Tseng YH, Roberson RS, Ricordi C, Oâ€™Connell PJ, Gonzales FJ, Kahn CR. Loss of ARNT/HIF1 Î² mediates altered gene expression and pancreaticislet dysfunction in human type 2 diabetes. Cell. 2005; 122:337â€“49.
Acknowledgements
Not applicable.
Declarations
This article has been published as part of BMC Bioinformatics Volume 18 Supplement 3, 2017. Selected articles from the 15th Asia Pacific Bioinformatics Conference (APBC 2017): bioinformatics. The full contents of the supplement are available online https://bmcbioinformatics.biomedcentral.com/articles/supplements/volume18supplement3.
Funding
This work was partially supported by the NIH grant GM092963 (Y.Lai). The publication costs were funded by the Department of Statistics at The George Washington University.
Availability of data and material
Rfunctions for calculating conservatively adjusted qvalues are provided in an Appendix 1.
Authorsâ€™ contributions
YL conceived of the study, developed the methods, performed the statistical analysis, and drafted the manuscript.
Competing interests
The author declares that they have no competing interests.
Consent for publication
The author agrees the consent for publication.
Ethics approval and consent to participate
Not applicable.
Author information
Authors and Affiliations
Corresponding author
Additional information
From The Fifteenth Asia Pacific Bioinformatics Conference Shenzhen, China. 1618 January 2017
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Lai, Y. A statistical method for the conservative adjustment of false discovery rate (qvalue). BMC Bioinformatics 18 (Suppl 3), 69 (2017). https://doi.org/10.1186/s1285901714746
Published:
DOI: https://doi.org/10.1186/s1285901714746