Statistical analysis of a Bayesian classifier based on the expression of miRNAs

Background During the last decade, many scientific works have concerned the possible use of miRNA levels as diagnostic and prognostic tools for different kinds of cancer. The development of reliable classifiers requires tackling several crucial aspects, some of which have been widely overlooked in the scientific literature: the distribution of the measured miRNA expressions and the statistical uncertainty that affects the parameters that characterize a classifier. In this paper, these topics are analysed in detail by discussing a model problem, i.e. the development of a Bayesian classifier that, on the basis of the expression of miR-205, miR-21 and snRNA U6, discriminates samples into two classes of pulmonary tumors: adenocarcinomas and squamous cell carcinomas. Results We proved that the variance of miRNA expression triplicates is well described by a normal distribution and that triplicate averages also follow normal distributions. We provide a method to enhance a classifiers’ performance by exploiting the correlations between the class-discriminating miRNA and the expression of an additional normalized miRNA. Conclusions By exploiting the normal behavior of triplicate variances and averages, invalid samples (outliers) can be identified by checking their variability via chi-square test or their displacement by the respective population mean via Student’s t-test. Finally, the normal behavior allows to optimally set the Bayesian classifier and to determine its performance and the related uncertainty. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0715-9) contains supplementary material, which is available to authorized users.


Background
MicroRNAs (miRNAs or miRs) are small non-coding single-stranded RNAs, 19-25 nucleotides in length, acting as negative regulators of gene expression at the posttranscriptional level. More than 1000 miRNAs are transcribed from miRNA genes in the human genome. A single miRNA is able to modulate hundreds of downstream genes by recognizing complementary sequences in the 3 untranslated regions (UTRs) of their target messenger RNAs. It has been estimated that in humans about 30 % of messenger RNAs are under miRNA regulation. The biological functions of miRNAs are diverse and include several key cellular processes, such as differentiation, proliferation, cellular development, cell death and metabolism.
In the last decade, evidences have accumulated to indicate that miRNAs play a role in the onset and progression of several human cancers [1]. The transcription or processing of some miRNAs is altered in neoplastic tissues, with respect to their normal counterparts. miRNAs whose levels increase in tumors are referred to as oncogenic miRNAs (onco-miRs), sometimes even if there is no evidence for their causative role in tumorigenesis. On the other hand, miRNAs down-regulated in cancer are considered tumor suppressors.
In parallel to these studies, the effectiveness of miR-NAs as markers for tracing the tissue of origin of cancers of unknown primary origin was demonstrated by several authors, and the utility of miRNAs levels as diagnostic and prognostic markers became clear (reviewed by [2]). The main advantage of the use of miRNAs as markers resides in the ease of their detection and in their extreme specificity. miRNAs are stable molecules well preserved in formalin fixed, paraffin embedded tissues (FFPE) as well as in fresh snap-frozen specimens, unlike larger RNA molecules as messenger RNAs [3]. The finding that miR-NAs have an exceptional stability in several tissues suggested that these tiny molecules were also preserved, detectable and quantifiable in plasma and in other biofluids, such as urine, saliva, cerebro-spinal fluid and amniotic fluid [4]. Circulating miRNAs are attracting attention as markers not only for cancer but also for neurodegenerative diseases (reviewed by [5]) as they have some important features: non-invasivity, specificity, early detection, sensitivity and ease of translatability from model systems to humans.
Methods based on next generation sequencing (NGS), microarray and quantitative reverse-transcription polymerase chain reaction (qRT-PCR) are currently used for miRNA profiling and for the identification of miRNAs differently expressed in tumor samples and in matched, healthy tissue. The majority of miRNA profiling studies have been so far carried out by using microarrays. These studies have provided signatures consisting in few to several (5-30) distinct miRNAs [6]. However, the large amount of data obtained by microarray and NGS profiling needs to be transposed into clinical trials by developing an easily performed, cost-effective and serviceable assay that can analyze the cancer-specific miRNAs for cancer diagnosis and prognosis. Such an analysis has been so far relying on qRT-PCR assays, performed by measuring the levels of a restricted number of miRNAs (see review by [7]).
The realization of classifiers based on the expression of miRNAs is widely discussed in the scientific literature. Within the context of lung cancers (see, for example, [4,[8][9][10][11][12]) the work by [13] describes a classifier that distinguishes squamous from nonsquamous non-small-cell lung carcinomas, by using miR-205 as a specific marker and miR-21, snRNA (small nuclear RNA) U6 as normalizers. The approach followed is essentially machine learning: the classifier relies on a sample score and a threshold. A more elaborated support vector machine, which uses the combination of 5 miRNAs for lung squamous cell carcinoma diagnosis, is described in [14]. A receiver operating characteristic curve analysis to evaluate the possibility of diagnosing the histologic subtype of pulmonary neuroendocrine tumors via altered expression of miR-21, miR-155, let-7a is discussed in [15] (a similar statistical approach is described in [16]).
These classifiers are generally declared to be efficient. For example, [13] report a sensitivity of 96 % and a specificity of 90 %. However, at least two aspects are widely overlooked in the scientific literature: first, the distribution of the measured miRNA expressions; second, the statistical uncertainty that unavoidably affects the parameters that characterize a classifier and its performance. Both aspects are crucial in order to assess the reproducibility, and thus the reliability, of a classifier. The goal of the present paper is to close these gaps. Our analysis concerns a Bayesian classifier based on the expression of a single class-discriminating miRNA, with additional miRNAs that are used either as normalizers or as performance-enhancer via noise-reduction.
In the present paper the following issues are discussed: normal distribution of the triplicate variance and identification of outliers; improvement of accuracy via normalizers; class-discriminating measures and their distributions; identification of "bias" outliers; assessment of a classifier's performance; finally, improvement of a classifier's performance by exploiting correlations.
As a prototypical case we discuss throughout the paper the development of a classifier that assigns samples either to adenocarcinomas (ADC) or to squamous cell carcinomas (SQC). The two classes ADC, SQC are henceforth referred to as the target and the versus class, respectively. The miRNAs used are miR-205, miR-21 and snRNA U6.

Distribution of triplicates: normally-distributed variance and outlier identification
Given a sample stemming from a patient, a set of miRNAs is measured in triplicate by using qRT-PCR. For each miRNA, the sample mean x of the corresponding triplicate and the related sample standard deviation s are calculated. To provide an a priori knowledge on the samples, each one was classified via immunohistochemical analysis and gene profiling into one of different categories of lung tumors. We use data based on lung carcinoma biopsies retrieved from the archives of the Unit of Surgical Pathology of the S. Chiara Hospital in Trento, Italy. The research project had been approved by the Ethical Committee of the Trentino Public Health System (Azienda Provinciale per i Servizi Sanitari). Most of the data analyzed here were previously published by our research group [17]. All datasets are available as Supplementary material in the Additional file 1.
For each miRNA, the distribution of the variances s 2 can be described by a normal chi-square distribution with number of degrees of freedom ν equal to 2: s 2 /σ 2 ∼ χ 2 ν=2 . To prove this behavior, we fitted the cumulative distribution F χ 2 ν=2 ; ν = 2 of the chi-square with ν = 2 to the cumulative distribution of the measured variances: knowing that F s 2 /σ 2 ; 2 = 1 − exp −s 2 /(2σ 2 ) , the fit was carried out by searching the population variance σ 2 that minimizes the Kolmogorov-Smirnov (K-S) statistic D. We used the K-S test because it is less sensitive to outliers than other statistical tests and does not require any assumption on the distributions. On the contrary, for example, the estimation of the population variance σ 2 out of the set of sample variances is valid only if the sample distribution is already known to be normal. The p-values The population variance σ results from the Kolmogorov-Smirnov test analysis; the related p-value is reported. The threshold σ max (bold), above which a value is deemed to be an outlier, is set to 2.448 · σ . The two rightmost columns provide s 2 1/2 and [ s 4 − s 2 2 ] 1/4 (see main text) resulting from the fit (see Table 1) show that the variance distributions are indeed compatible with chi-square distributions.
As a result, an outlier can be identified by checking its variability via the chi-square test: we assume a triplicate of a given miRNA to be an outlier if its sample standard deviation s exceeds the critical value σ max corresponding to the significance level α = 0.05. The critical value σ max is given by [ −2 log α] 1/2 σ 2.448 σ , where σ is the population standard deviation assessed for that miRNA. The σ max values are reported in Table 1. The outlier definition used here implies that the significance level α corresponds to the rate of statistical false alarms (type I errors), i.e. the rate of valid triplicates that are falsely deemed to be outliers. 1 According to this procedure, the triplicate sets of both miR-205 and miR-21 contained no outliers, whereas the snRNA U6 contained 4 outliers. These samples were excluded from the following analysis. Table 1 also shows the statistics of the standard deviation for the snRNA U6 data set devoid of outliers.
We note that, for snRNA U6, the two values of σ are very similar. This fact reflects, as mentioned above, the robustness to outliers of the K-S approach. In addition, for all data sets devoid of outliers, the population standard deviation σ is approximately equal to both the root-meansquare sample standard deviation s 2 1/2 and the fourth root of the variance of variances [ s 4 − s 2 2 ] 1/4 . Because ν = 2, this behavior provides further evidence to the null hypothesis that samples are drawn from the same normal distribution.
For each sample, i.e. patient, of the set devoid of outliers, we henceforth use the following notation for the sample mean x of the available triplicates: x U6 for the snRNA U6 triplicate, x 21 for the miR-21 triplicate, and x 205 for the miR-205 triplicate. In addition, x 205 , x 21 , x U6 will be referred to as measures.

Distribution of triplicates: accuracy and normalization
A main issue to cope with towards the development of a reliable classifier is accuracy. The question is whether the values of the sample means of the triplicates are constant over different experimental sessions -i.e. measurements taken at different times and/or with different set-upsor, rather, have to be normalized in order to remove experimental bias. Table 2 shows the results of a statistical analysis, in terms of sample mean X and sample standard deviation S, carried out for x 21 , x U6 and their difference x 21 − x U6 on data gathered in two different experimental sessions. For both single-miRNA values x 21 and x U6 , and both for the target and the versus class, the sample means significantly differ between the two sessions. However, this is not the case for the sample means of x 21 − x U6 : the difference of the sample mean of x 21 − x U6 between session II and session I is 0.4 ± 0.4 for the target class, and −0.2 ± 0.8 for the versus class; in both cases the difference is less than twice the respective sample standard deviation (p > 0.05). Remarkably, the sample standard deviations, and consequently the uncertainties on the sample means, do not significantly differ between the two sessions.
The necessity to improve accuracy by suitably normalizing an oncomir has been extensively discussed in For each session, measure and class, the two symbols X and S correspond to sample mean and sample standard deviation of the x values, respectively the scientific literature (see for example [18]). Thus, in accordance with many previous works, we use snRNA U6 to normalize x 21 and x 205 . The following notation is henceforth used:

MiRNA statistics
In this section, the statistics of x 205 , x 21 , snRNA U6 is discussed.
As stated above, each sample was classified into one of the two classes ADC and SQC via immunohistochemical analysis and gene profiling. Figure 1 shows the histograms of x 205 , x 21 , x U6 for samples belonging either to the target class ADC or to the versus class SQC.
By means of the Shapiro-Wilk test of normality, each histogram was shown to be consistent with a normal parent population. Consequently, we assume that, for each of the two classes, the measures x 205 , x 21 , x U6 are normally distributed with a mean and a standard deviation that are respectively estimated by the sample mean X and the sample standard deviation S of the x values (triplicates). The results are reported in Table 3. The proof that the measures of interest are compatible with normal distributions makes up a crucial step towards the optimization of the Bayesian classifier and the determination of its performance, inclusively of the related uncertainty (see below).
With regard to x 205 , the histograms of the target class ADC and the versus class SQC are well-separated: Student's t-test provides p < 10 −7 (see Table 3). Conversely, both for x 21 and x U6 , the overlapping of the histograms of the two classes is confirmed by Student's ttest, which provides p = 0.90 and p = 0.14, respectively. Therefore, only the measure x 205 is a good candidate to classify samples into ADC or SQC.  Table 3. Each histogram is normalized to the respective set size Table 3 Statistics of x 205 , x 21 , x U6 , y DV (see Eq. (2)), y opt (see Eq. (10) where χ is a fixed threshold. For example, in the works by [13,17] the linear combination was used and a threshold χ = 2.5 was set. The discriminator approach described in Eq. (1) requires tackling three main issues: finding a suitable linear combination y; finding a suitable value for χ; analyzing the performance of the classifier.
Given a linear combination y (that may also coincide with one of the three measures), the threshold χ can be determined by calculating the value that, according to Eq. (1), maximizes the accuracy (or rate of correct responses) p c : In this equation, p T and p V correspond to the prior presentation probabilities of the target class and the versus class, respectively, whereas H and C respectively correspond to the sensitivity and the specificity of the classifier, provided that the condition positive is taken to correspond to the target class [19]. Under the assumption that the val-ues of y are normally distributed, sensitivity and specificity are given by the following expressions: where (x) is the standard normal cumulative distribution. The optimal position of χ is given by one of the roots (the most appropriate one!) of the following seconddegree equation: where The uncertainties on the three coefficients η, β, γ , and thus on the threshold χ, are computed by means of standard error propagation. We remind that μ T , σ T , μ V , σ V are evaluated as sample means and sample standard deviations, and are therefore uncertainty-affected: for example, the errors on μ T , σ T are σ T / √ N T , σ T / √ 2(N T − 1), respectively, where N T is the number of triplicates belonging to the target class.
The threshold χ depends on the ratio p T /p V of the prior occurrence probabilities of the two classes of tumors. These probabilities are typically inferred from epidemiological studies. If prior probabilities are balanced, the ratio p T /p V is unitary and the related term in Eq. (5) vanishes. Table 4 reports the values of χ and the respective uncertainties for each of the measures dealt with in this paper, in the case of balanced prior probabilities. For x U6 the normal curve for the target class lays on the left of the normal curve for the versus class (i.e., X ADC < X SQC ; see Fig. 1 and Table 3). So, in the case of this measure, The digits in parentheses correspond to the uncertainty on the respective least significant digits the application of the rule of Eq. (1) requires the linear combination y being set to −x U6 .

Estimated classifier performance
A new sample, i.e. a new value of the measure y whose costituent triplicates were checked not to be outliers, can now be assigned to the target or the versus class according to the rule of Eq. (1). An estimate of the reliability of this assignment is provided by Bayes' theorem: the odds that the new sample, given its y, belongs to the target class are given by the likelihood ratio where p T and p V are the two prior probabilities (p T + p V = 1), and η, β, γ are given by Eq. (6). From Eq. (5) it follows that the odds at y = χ are 50:50. By means of Eq. (7), the thresholds χ 10:90 , χ 90:10 for the odds 10:90 and 90:10 can be determined. The values of these thresholds in the case of balanced prior probabilities are reported in Table 4.

Bias outliers
In the previous sections, we have addressed the issue of outlier identification by using the triplicate variability. The normality of the "target" and "versus" distributions of the measure of interest y allows for the identification of a second kind of outliers: given a value y, one can promptly evaluate -via Student's t distribution -the one-tailed probability of obtaining a more extreme value, i.e. a value more displaced by the population mean than the value y. If such p-value is less than a given significance level (for example, 1 %), the value y can be deemed to be a "bias" outlier, i.e. an outlier due to a bias in the triplicate estimates.

Improvement of a classifier's performance
Looking at Student's t statistic provides two possible strategies to improve the performance of a classifier. First, one can enhance the difference at the numerator of Student's t, namely μ T − μ V ; this solution requires the linear combination of the available class-discriminating measures (in the present case x 205 ) with new, additional measures that also reliably discriminate between the two classes. Such linear combination has to be optimized by means of methods like, for example, principal component analysis or support vector machines. The discussion of these methods goes beyond the goals of the present paper. The second strategy to improve the performance of a classifier consists in reducing the denominator in the expression of Student's t by linearly combining available measures with new ones. These new measures are not required to be class-discriminating. In the following section the second strategy is analyzed in detail.

Analysis of correlation
Let y a , y b be two measures: y a is supposed to discriminate between two classes (according to Student's t-test), whereas y b is not. In the present paper, we can have y a = x 205 and y b = x 21 or y b = x U6 . Let y be a linear combination of y a , y b as follows: where c is a coefficient to be determined. Taking the average of the last equation provides Because measure y b does not discriminate between the two classes, we have μ b,T ≈ μ b,V (the means of the distributions of y b for the target and the versus class are not significantly different). Consequently, μ T − μ V ≈ μ a,T − μ a,V , i.e. nothing significant can be expected with regard to the numerator of Student's t statistic when μ a is replaced by μ.
With regard to standard deviations, from Eq. (8) it follows: where r is the linear correlation coefficient between y a and y b . If r = 0, adding the new measure y b is definitely detrimental for the sake of discrimination because the variance is increased by a term c 2 σ 2 b . However, in the case of a significant correlation, setting reduces the standard deviation as follows: Consequently, the additional measure y b can be deemed to be a sort of noise-reducer for the class-discriminating measure y a .
The previous argument still holds if, rather than regarding the whole data set, the correlation appears only on the data subset corresponding to one of the two classes. Therefore, if c is set according to Eq. (9), one of the two standard deviations σ T , σ V is reduced whereas the other is enhanced. The net result can be still an increase of the classifier's performance. The optimal value of c can be assessed by standard analytical and numerical techniques. . Dots corresponding to samples of the target class ADC and the versus class SQC are colored in blue and red, respectively. In each plot, the black, bold line represents χ , wheres the two black, dashed lines correspond to χ ± dχ. Similarly, the three red lines and the three blue lines represent χ 10,90 , χ 10,90 ± dχ 10,90 and χ 90,10 , χ 90,10 ± dχ 90,10 , respectively (see Table 4)

A classifier for ADC vs. SQC
to the class the sample was assigned to via immunohistochemical analysis and gene profiling (diagnosis). The plots contains four different regions, bounded by the three thresholds and corresponding to different outcomes of the classifier: orange ⇒ versus class with odds larger than 90:10; yellow ⇒ versus class with odds between 50:50 and 90:10; light green ⇒ target class with odds between 50:50 and 90:10; green ⇒ target class with odds larger than 90:10. The reliability of the classifiers based on each of the three available measures can be inferred by considering the confusion matrix reported in Table 5. The accuracy is equal to 97 % for x 205 , 60 % for x 21 , and 71 % for −x U6 . Due to overfitting, the performance of the Table 5 Confusion matrix for classifiers of ADC vs. SQC relying on x 205 , x 21 , x U6 as well as on y DV (see Eq. (2)) and y opt (see Eq. (10) classifiers based on the last two measures are apparently satisfactory, though non competitive with that of the classifier based on x 205 . However, if only high-reliability responses are considered, namely those with odds at least 90:10, the accuracies drop to 64 %, 29 %, 9 %, respectively: while the accuracy of the classifier based on x 205 is still satisfactory, the other two measures do not provide reliable outcomes. This behavior is linked to the t-statistic regarding the separation of the distributions corresponding to the two classes with respect to the widths of the distributions (see Fig. 1).
In the case of the classifier based on x 205 , by relying on Eqs. (3, 4) a maximum accuracy of 91.4 % ± 3.9 % can be predicted. Table 6 reports the correlation coefficient r for the pair (x 205 , x U6 ), i.e. directly between miR-205 and snRNA U6, and for the pair ( x 205 , x 21 ). The p-value refers to the null hypothesis that the data from the two pair elements are uncorrelated. Significant p-values (< 0.05) are marked in bold

Improved classifier for ADC vs. SQC
We first note that no correlation between miR-205 and snRNA U6 can be significantly inferred. Consequently, despite being a normalizing miRNA -i.e. a bias-reducerfor miR-205, snRNA U6 is useless as a noise-reducer for miR-205. On the contrary, the class-discriminating measure x 205 has a significant correlation with x 21 , which can be therefore used to improve the classification performance of x 205 . Given an overall correlation coefficient r of 0.022, Eq. (9) provides a value of c approximately equal to -0.8 so that the optimal linear combination y opt is: Tables 3, 4 and 5 report the statistics of the measure y opt , and the thresholds and confusion matrix of the classifier relying on this measure, respectively. For the sake of comparison, the same tables also show the data of the classifier relying on the measure y DV defined in Eq. (2), which was the topic of previous works [13,17].
Testing the same-parent-distribution null hypothesis via Student's t statistic provides p = 2.6 · 10 −11 , half of the value obtained by testing t on the histograms generated by using the linear combination y DV . Figure 3 shows the histograms of y opt for samples belonging either to the target class ADC or to the versus class SQC. The Shapiro-Wilk test of normality yielded p-values of 0.78 (target class) and 0.24 (versus class).
The reliability of the classifier based on y opt can be inferred by considering the confusion matrix of Table 5 (see also the scatter plot in Fig. 4). The accuracy is 94 %, similar to that provided by the classifier relying on x 205 only, and equal to that provided by the classifier relying on y DV . However, if only high-reliability responses are considered, namely those with odds at least 90:10, the accuracy of the classifier based on y opt is still 91 %, slightly better than 88 % provided by y DV and definitely larger than 64 % given by the classifier relying on x 205 only. The improvement with regard to a classifier based on this last measure is pointed out by the ROC curves that are also shown in Fig. 4.
In the case of the classifier based on y opt , by relying on Eqs. (3,4) a maximum accuracy of 96.1 % ± 2.4 % can be predicted. By comparison, this last parameter is 95.6 % ± 2.6 % in the case of the classifier based on y DV . Figure 5 shows the results of the application on a set of 9 additional samples of the classifier based on y opt and using the population mean, population standard deviation, and thresholds expressed in Tables 3 and 4. With the exception of one single case, all values of the triplicate standard deviations comply with the respective σ max requirements explained above. The single outlier is a miR-21 triplicate whose standard deviation of 0.52 slightly a b Overlapping regions are in magenta. The bin width is equal to 1. Each histogram is normalized to the respective set size. The green bold line represents the discrimination threshold χ = 3.6, whereas the green dashed lines represent the threshold displaced by its uncertainty, i.e. χ ± dχ, with dχ = 0.4 (see Table 4). ROC curves (right) of the classifier based on x 205 (green line) and of the classifier based on y opt (red line) [20]. The increase of the AUC (area under the curve) from 0.9815 to 0.9926, respectively, is another marker of the improvement of the classifier exceeds the maximum value of 0.46 (see Table 1 Table 4 Fig. 5 Scatter plot of y opt = x 205 − 0.8 · x 21 applied to an independent set of data. See Section "A classifier for ADC vs. SQC" and the caption of Fig. 2 for the color code of dots, lines and shaded areas. The values of the thresholds are reported in Table 4. The empty, red dot and the square, blue dot refer to a standard variability outlier and a "bias" outlier, respectively (see main text) any outlier according to the variability of its triplicates, its value of y opt appears to be extremely low: according to the statistics of y opt (see Fig. 3), the probability of getting a more extreme value is p < 5 · 10 −4 . This hints, rather than to a misclassification of the sample, to a case of "bias" outlier, i.e. to a possibly wrong assessment of the triplicates, as discussed above. For the sake of comparison, for all other 8 samples, as well as for all 33 samples considered in the previous sections, p > 0.02

Discussion and conclusion
We introduced a Bayesian classifier that, on the basis of the expression of miR-205, miR-21 and snRNA U6, discriminates samples into two different classes of pulmonary tumors, normally classified by immunohistochemical approaches: adenocarcinomas and squamous cell carcinomas. The advantage to use miRNAs is due to the ease of their detection and quantification by qRT-PCR, as well as in their extreme specificity. miR-NAs are stable molecules well preserved in formalin fixed, paraffin embedded tissues (FFPE) as well as in fresh snap-frozen specimens, unlike larger RNA molecules as messenger RNAs [3].
Our approach is based on a method that employs the quantification of snRNA U6 as a normalizer, miR-21 as a performance enhancer via noise reduction, and miR-205 as a class discriminator. First, we determined that the variance of miRNA quantification triplicates follows a normal chi-square distribution. Thereupon, we designed a procedure to recognize invalid measures (outliers) and remove them from the analysis. The proof that the measures of interest are compatible with normal distributions makes up a crucial step towards the optimization of the Bayesian classifier, the determination of its performance, inclusively of the related uncertainty, and the identification of "bias" outliers. We then proceeded to optimally set our Bayesian classifier and to determine its performance as well as the related uncertainty. Results are displayed in Fig. 2: the classifier based on miR-205 and normalized on snRNA U6 has the best performance.
A main feature of the Bayesian approach described here is the possibility, also in presence of a limited size of the available data sets, of estimating the reliability of a classifier's performance. This possibility relies on the verification of the normality of the different distributions of interest. Other powerful methods to set up a classifier, such as support vector machines (SVM) and decision trees, though quite versatile to optimize the decisional parameters on a training set, are less suited than a probabilistic approach to provide an immediate quantification of the reliability in the case of application to new sets of data. For example, in the cases discussed above, a SVM approach would likely result in an optimized classifier that also exploits the "apparent" classification capability of x 21 and even x U6 (see, for example, Figs. 1 and 2). However, according to our analysis based on Student's t, there is no evidence of such capability, so that such a SVM classifier would also possibly have a larger generalization error than a Bayesian classifier of the kind discussed in this paper.
Finally, we provided a method to enhance a classifiers' performance by exploiting the correlation between the tumor-discriminating miRNA miR-205 and the expression of miR-21, used as a noise reduction factor. The method essentially consists in exploiting the nonzero covariance of two miRNAs, where the first one acts a classifier and the second one is used to abate the variability of the first one. Figure 4 shows the result of an improved classifier, indicating that only 2 samples lay within the uncertainty region, much less than the 12 samples in the case of the non-improved classifier shown in Fig. 2. Results obtained on an independent data set are also satisfactory.
In conclusion, the proposed method introduces a robust tool for determining the cases in which miRNA quantification can be applied in discriminating inter-and intra-tumoral heterogeneity.
Endnote 1 Because f α, ν, ∞ = χ 2 α, ν /ν, an alternative outlier definition relying on the F-test would produce the same result as that discussed in this section.

Additional file
Additional file 1: The datasets are available as Supplementary Material. (TXT 12.8 KB)