Volume 10 Supplement 1
Selected papers from the Seventh Asia-Pacific Bioinformatics Conference (APBC 2009)
Using random forest for reliable classification and cost-sensitive learning for medical diagnosis
 Fan Yang†^{1},
 Huazhen Wang†^{1},
 Hong Mi^{1},
 Chengde Lin^{1} and
 Weiwen Cai^{2}
DOI: 10.1186/1471-2105-10-S1-S22
© Yang et al; licensee BioMed Central Ltd. 2009
Published: 30 January 2009
Abstract
Background
Most machine-learning classifiers output label predictions for new instances without indicating how reliable those predictions are. The applicability of these classifiers is limited in critical domains where incorrect predictions have serious consequences, such as medical diagnosis. Moreover, the default assumption of equal misclassification costs is most likely violated in medical diagnosis.
Results
In this paper, we present a modified random forest classifier incorporated into the conformal predictor scheme. A conformal predictor is a transductive learning scheme that uses Kolmogorov complexity to test the randomness of a particular sample with respect to the training set. Our method has the well-calibrated property: the performance can be set prior to classification, and the accuracy is exactly equal to the predefined confidence level. Further, to address the cost-sensitive problem, we extend our method to a label-conditional predictor, which takes into account different costs for misclassifications in different classes and allows a different confidence level to be specified for each class. Extensive experiments on benchmark datasets and real-world applications show that the resulting classifier is well calibrated and able to control the specific risk of each class.
Conclusion
Using the RF outlier measure to design a nonconformity measure benefits the resulting predictor. Further, a label-conditional classifier is developed and turns out to be an alternative approach to the cost-sensitive learning problem, relying on label-wise predefined confidence levels. The goal of minimizing the risk of misclassification is achieved by specifying a different confidence level for each class.
Background
Most machine-learning classifiers output predictions for new instances without indicating how reliable the predictions are. The application of these classifiers is limited in domains where incorrect predictions have serious consequences. Medical practitioners need a reliable assessment of the risk of error for individual cases [1]. Given a prediction paired with a corresponding confidence value, a system can decide whether it is safe to classify. The recently introduced Conformal Predictor (CP) [2–5] is a promising framework that produces predictions coupled with confidence estimates. Its developers established a formal relationship among Kolmogorov complexity, universal Turing machines and strict minimum message length (MML). They treated transductive prediction as a randomness test which returns nonconformity scores closely associated with the i.i.d. (independent and identically distributed) assumption governing all of the examples. When classifying a new instance, CP assigns a p-value to each candidate label to approximate the confidence level of the prediction. CP is more than a reliable classifier: its most novel and valuable feature is hedged prediction, i.e., the performance can be set prior to classification, and the prediction is well calibrated in that the accuracy is exactly equal to the predefined confidence level. Its superiority over the Bayesian approach, which often relies on strong underlying assumptions, is impressive. In this paper, we use the random forest outlier measure to design the nonconformity score and develop a modified random forest classifier.
Since reports from both academia and practice indicate that the default assumption of equal misclassification costs is most likely violated [6], the natural desideratum is to extend CP to a label-wise CP, which takes into account different costs for misclassification errors of different classes and allows a different confidence level to be specified for each possible classification of an instance. In this paper, we investigate how to extend CP to a label-conditional CP, which can handle nonuniform costs of classification errors.
Consider a classification problem E: reality outputs examples Z^{(n-1)} = {(x_{1}, y_{1}),..., (x_{n-1}, y_{n-1})} ∈ X × Y and an unlabeled test instance x_{n}, where X denotes a measurable space of possible instances, x_{i} ∈ X, i = 1, 2,..., n - 1,...; Y denotes a measurable space of possible labels, y_{i} ∈ Y, i = 1, 2,..., n - 1,...; the example space is Z = X × Y. We assume that each example is generated by the same unknown probability distribution P over Z, which satisfies the exchangeability assumption.
Conformal predictor (CP)
CP is designed to add confidence estimation to machine learning algorithms. It generalizes its framework from the i.i.d. assumption to exchangeability, which discards the information about the order of the examples.
To construct a prediction set for an unlabeled instance x_{n}, CP operates in a transductive manner and an online setting. Each possible label is tried as a label for x_{n}. In each try we form an artificial sequence (x_{1}, y_{1}),..., (x_{n}, y), and then measure how likely it is that the resulting sequence was generated by the unknown distribution P, i.e., how nonconforming x_{n} is with respect to the other available examples.
Given the classification problem E, the function A_{n}: Z × Z^{(n-1)} → R is a nonconformity measure if, for any n ∈ N,

α_{i} := A_{n}(z_{i}, ⟦z_{1},..., z_{i-1}, z_{i+1},..., z_{n}⟧), i = 1,..., n - 1

α_{n} := A_{n}(z_{n}, ⟦z_{1},..., z_{n-1}⟧)

where ⟦·⟧ is a "bag", in which the order of the elements is irrelevant. The symbol α denotes the nonconformity score of a sample: the larger α_{i} is, the stranger z_{i} is with respect to the distribution. In short, a nonconformity measure is a measurable mapping into R whose value α_{i} is independent of the position of z_{i} in the sequence.
For each candidate label y, the smoothed p-value is defined as

p_{y} = ( #{i = 1,..., n: α_{i} > α_{n}} + τ_{n} · #{i = 1,..., n: α_{i} = α_{n}} ) / n     (3)

where y is a possible label for x_{n}; p_{y}, the p-value, is the randomness level of z_{n} = (x_{n}, y) and also the confidence level of y being the true label; τ_{n}, n ∈ N, are random variables distributed uniformly in [0, 1]. Smoothed CP is a more powerful version of CP, which benefits from the p-values being distributed uniformly in [0, 1].
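To make the computation concrete, the smoothed p-value just described can be sketched in a few lines. This is only an illustration: the nonconformity scores below are arbitrary numbers, not the RF outlier scores used later in the paper.

```python
import random

def smoothed_p_value(alphas, tau=None):
    """Smoothed conformal p-value of the last example in `alphas`.

    alphas: nonconformity scores alpha_1, ..., alpha_n, where alpha_n
            belongs to the test example completed with a tentative label.
    tau:    tie-breaking random variable, uniform on [0, 1].
    """
    if tau is None:
        tau = random.random()
    a_n = alphas[-1]
    n = len(alphas)
    greater = sum(1 for a in alphas if a > a_n)   # strictly stranger examples
    equal = sum(1 for a in alphas if a == a_n)    # ties, incl. the example itself
    return (greater + tau * equal) / n

# The test example's score 5.0 is the strangest of the four, so with
# tau = 0.5 the p-value is (0 + 0.5 * 1) / 4 = 0.125.
p = smoothed_p_value([1.0, 2.0, 3.0, 5.0], tau=0.5)
```

A small p-value means the tentative label makes the test example look nonconforming, so that label is unlikely to be included in the prediction region.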
Let Γ^{ε} = {y ∈ Y: p_{y} > ε}, and denote the true label of x_{n} by y_{n}:
If |Γ^{ε}| = 1, we call it a certain prediction.
If |Γ^{ε}| > 1, it is an uncertain prediction.
If Γ^{ε} = ∅, it is an empty prediction.
If y_{n} ∈ Γ^{ε}, we call it a correct prediction at confidence level 1 - ε; otherwise, it is an error.
When forced to make a point prediction, CP selects the label with the maximum p-value.
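The prediction region Γ^ε and the forced point prediction follow directly from the per-label p-values; a minimal sketch with illustrative labels and p-values:

```python
def region_prediction(p_values, epsilon):
    """Gamma^epsilon: all labels whose p-value exceeds the significance level."""
    return {y for y, p in p_values.items() if p > epsilon}

def forced_prediction(p_values):
    """Forced point prediction: the label with the maximum p-value."""
    return max(p_values, key=p_values.get)

p_values = {"healthy": 0.42, "diseased": 0.03}
certain = region_prediction(p_values, epsilon=0.05)    # one label: certain
uncertain = region_prediction(p_values, epsilon=0.01)  # both labels: uncertain
forced = forced_prediction(p_values)
```

Lowering ε (raising the confidence level 1 - ε) can only enlarge the region, which is why high confidence levels produce more uncertain predictions.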
Smoothed CP is well calibrated in the online setting:

lim_{n→∞} Err_{n}^{ε} / n = ε

with Err_{n}^{ε} the number of erroneous predictions among the first n examples at confidence level 1 - ε (see [7] for a detailed proof). Extensive experiments have demonstrated that CP is also applicable to offline learning, which enlarges its range of applications.
Different nonconformity measures have been developed from existing algorithms such as SVM and KNN [9–11]. All CPs have the calibration property, but the efficiency of a CP largely depends on the design of its nonconformity measure [8]. Efficiency refers to the proportion of certain and empty predictions among all predictions. Certain predictions are favourable because they are more informative than uncertain ones. CP has successfully been used to hedge these popular machine learning methods, and this paper shows that CPRF is more efficient than the others.
Random forest (RF)
(1) First, real-world data is noisy and contains many missing values, and some of the attributes are categorical or semi-continuous.
(2) Furthermore, different data sources often need to be integrated, which raises the issue of how to weight them.
(3) RF shows high predictive accuracy and is applicable to high-dimensional problems with highly correlated features, a situation that often occurs in bioinformatics, as in medical diagnosis.
In this paper, the random forest outlier measure is used to design a nonconformity measure in order to incorporate random forest into the CP and label conditional CP scheme. Our method can be used in both online and offline settings.
Costsensitive learning problem
In medical diagnosis, the default assumption of equal misclassification costs underlying machine learning techniques is most likely violated. A false negative prediction may have more serious consequences than a false positive one. To address this problem, cost-sensitive classification has been developed, which considers the varying costs of different misclassification types [16]. Usually a cost matrix is defined or learned to reflect the penalty of classifying samples from one class as another, and a cost-sensitive classification method takes this matrix into account during model building [17]. However, how to obtain a proper cost matrix remains an open question [18]: its definition or learning is quite subjective. In this paper, we extend our method to label conditional CP to address the cost-sensitive problem, so that the risk of misclassification for each class is well controlled.
Results
Experiments setup
The experiments are divided into two parts. First, to show the calibration property and efficiency of our method, we demonstrate CPRF on 8 benchmark datasets and a real-world gene expression dataset. Second, to cope with the cost-sensitive problem, we extend CPRF to label conditional CPRF and test its performance on two public application datasets.
Part I Performance of CPRF
Datasets used in the experiments
Dataset  n (examples)  c (classes)  a (attributes)  num (numeric)  nom (nominal)

liver  345  2  7  7  0 
pima  768  2  8  8  0 
sonar  208  2  60  60  0 
house votes  435  2  16  0  16 
satellite  6435  6  60  60  0 
isolet  300  26  618  618  0 
soybean  683  19  35  0  35 
covertype  500  3  54  10  44 
We run CPRF in 10-fold cross-validation in an online fashion, report the average performance, and compare it with TCMSVM and TCMKNN. We use the following key indices at each predefined significance level: (1) percentage of certain predictions; (2) percentage of uncertain predictions; (3) percentage of empty predictions; (4) percentage of correct predictions. These indices differ from the traditional accuracy rate reported by RF, SVM and other traditional classifiers.
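These four indices can be computed directly from the region predictions; a small sketch with toy prediction sets (not data from our experiments):

```python
def prediction_stats(regions, true_labels):
    """Fractions of certain / uncertain / empty / correct region predictions."""
    n = len(regions)
    certain = sum(1 for r in regions if len(r) == 1) / n
    uncertain = sum(1 for r in regions if len(r) > 1) / n
    empty = sum(1 for r in regions if len(r) == 0) / n
    # a region prediction is correct when it contains the true label
    correct = sum(1 for r, y in zip(regions, true_labels) if y in r) / n
    return {"certain": certain, "uncertain": uncertain,
            "empty": empty, "correct": correct}

# Four toy region predictions against their true labels
stats = prediction_stats(
    [{"a"}, {"a", "b"}, set(), {"b"}],
    ["a", "a", "b", "a"],
)
```

Note that certain, uncertain and empty fractions always sum to one, while correctness is measured independently against the true labels.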
Correct predictions at four confidence levels
Confidence level  liver  pima  sonar  vote 

0.95  94%  96%  96%  98% 
0.90  80%  87%  93%  90% 
0.85  78%  81%  82%  84% 
0.80  66%  82%  76%  80% 
Confidence level  satellite  isolet  soybean  covertype 
0.95  92%  94%  96%  96% 
0.90  90%  88%  92%  82% 
0.85  84%  85%  88%  78% 
0.80  81%  79%  79%  76% 
Table 2 demonstrates that CPRF maintains relatively high accuracy while keeping the risk of error low. In many domains it is important to measure the risk of misclassification and, if possible, to keep it low.
Comparison of certain predictions

Dataset  Confidence level  CPRF  TCMKNN  TCMSVM

Sonar  99%  53.15%  44.23%  23.07%
  95%  77.89%  73.07%  48.07%
  90%  86.74%  80.76%  71.15%
Liver  99%  36.28%  17.56%  25.31%
  95%  67.21%  20.56%  31.45%
  90%  79.77%  36.08%  58.01%
Pima  99%  42.36%  39.46%  24.05%
  95%  69.56%  44.56%  28.35%
  90%  74.71%  61.97%  45.12%
Vote  99%  87.76%  83.49%  80.21%
  95%  92.87%  91.75%  90.72%
  90%  94.89%  92.56%  91.05%
Satellite  99%  73.07%  64.47%  68.07%
  95%  87.92%  85.97%  88.41%
  90%  92.28%  90.72%  91.76%
Isolet  99%  57.95%  54.87%  56.95%
  95%  74.72%  70.37%  72.07%
  90%  87.51%  81.75%  82.40%
Soybean  99%  73.85%  54.57%  63.97%
  95%  79.32%  73.45%  78.83%
  90%  90.49%  80.29%  81.72%
Covertype  99%  52.74%  47.82%  50.42%
  95%  68.73%  63.77%  65.07%
  90%  76.45%  72.16%  73.75%
Comparison of accuracy

Model  liver  pima  sonar  vote

CPRF  66%  86%  84%  95%
TCMKNN  61%  85%  83%  91%
TCMSVM  51%  77%  96%  84%

Model  satellite  isolet  soybean  covertype

CPRF  84%  82%  93%  83%
TCMKNN  82%  70%  89%  74%
TCMSVM  74%  89%  77%  67%
It is clear that CPRF performs well on most of the datasets, especially those with categorical and mixed variables. In particular, CPRF outperforms TCMKNN on the high-dimensional dataset (isolet) and outperforms TCMSVM on noisy data (covertype).
Characteristics of the ALL data
Group (Class)  Size of training set  Size of testing set 

(1) BCR-ABL  9  6
(2) E2A-PBX1  18  9
(3) Hyperdiploid>50  42  22
(4) MLL  14  6
(5) T-ALL  28  15
(6) TEL-AML1  52  27
(7)Others  52  27 
Total  215  112 
Confusion matrix of CPRF
Real\Predicted  1  2  3  4  5  6  7 

1  4  0  1  0  0  1  0 
2  0  9  0  0  0  0  0 
3  0  0  22  0  0  0  0 
4  0  0  0  6  0  0  0 
5  0  0  0  0  15  0  0 
6  0  0  0  0  0  27  0 
7  0  0  0  0  0  0  27 
Comparison of accuracy per class
Subgroups  CPRF  SVM* 

1  66.7%  100% 
2  100%  100% 
3  100%  99% 
4  100%  97% 
5  100%  100% 
6  100%  96% 
Tables 6 and 7 show that CPRF outperforms SVM in subgroups 3, 4 and 6, and the two are well matched in subgroups 2 and 5. Both misclassifications occur in subgroup 1; because this subgroup contains only six cases, the error rate appears large.
Correct and certain predictions at five confidence levels
level  Corrective prediction  Certain prediction 

99%  97.64%  100% 
95%  93.24%  98.41% 
90%  88.05%  98.32% 
85%  83.90%  87.64% 
80%  77.65%  82.94% 
Part II: Performance of label conditional CPRF
Datasets used in the experiments
Name of class  Index  Size 

1. Thyroid dataset  
primary hypothyroid  1  166 
compensated hypothyroid  2  368 
normal  3  6666 
2. Chronic gastritis dataset  
incoordination between liver and stomach  1  240 
dampnessheat of spleen and stomach  2  77 
deficiency of spleen and stomach  3  151 
blood stasis in stomach  4  84 
yin deficiency of stomach  5  157 
ID of the symptoms of chronic gastritis
ID  Symptom  ID  Symptom  ID  Symptom  ID  Symptom 

1  distending pain  2  hunger pain  3  dull pain of stomach  4  stabbing pain of stomach 
5  burning pain of stomach  6  abdominal distention  7  aggravated after eating  8  likeness of being warmed and pressed 
9  aggravated in the night  10  distention and fullness  11  poor appetite  12  nausea 
13  vomiting  14  vomiting  15  vomiting of water  16  belching 
17  gastric upset  18  acid regurgitation  19  heartburn  20  blockage in deglutition 
21  emaciation  22  dysphoria  23  sallow complexion  24  dim complexion 
25  less lustrous complexion  26  cold limbs  27  dizziness  28  weakness 
29  spontaneous sweating  30  night sweating  31  insomnia  32  dry mouth, 
33  bitter taste in mouth  34  halitosis  35  loose stool  36  constipation 
37  alternate dry and loose stool  38  hemafecia  39  yellowish urine  40  pale tongue 
41  pink tongue  42  red tongue  43  purplish tongue  44  fissured tongue 
45  teethprint tongue  46  ecchymosis on tongue  47  thinwhite fur  48  white and greasy fur 
49  yellow and greasy fur  50  little fur  51  thready and unsmooth pulse  52  stringy and thready pulse 
53  stringy and slippery pulse  54  deep and weak pulse  55  stringy and slippery pulse 
When constructing the RF, we set the number of trees to 1000 and the number of variables to split on at each node to $\lfloor \sqrt{55}\rfloor $ = 7 (a parameter sensitivity analysis of CPRF is given in the next section). For the experiments on the Thyroid disease dataset, the original dataset is randomly divided into a training set (3772 samples) and a test set (3428 samples). For the Chronic Gastritis dataset, we run our method in 10-fold cross-validation. Average performances are reported.
It is noticeable that the percentage of certain predictions and the certain-and-correct ratio increase monotonically with the significance level. How fast the proportion of uncertain predictions declines to zero also depends on the quality of the p-value calculation.
Label conditional empirical correct predictions at five confidence levels within each class on the thyroid data
class  99(%)  95(%)  90(%)  85(%)  80(%) 

1  100  97.26  90.41  87.67  80.82 
2  99.44  96.79  91.13  85.49  81.04 
3  98.87  95.37  90.12  85.06  80.49 
Label conditional empirical correct predictions at five confidence levels within each class on the chronic gastritis data
class  99(%)  95(%)  90(%)  85(%)  80(%) 

1  99.14  95.35  89.77  87.44  82.33 
2  100  93.67  88.72  82.78  77.83 
3  100  96.32  89.71  83.64  77.94 
4  100  96.05  92.11  86.84  81.58 
5  100  95.12  89.92  85.34  80.94 
Discussion
Part I: CPRF
Parameter sensitivity analysis
A common way to validate an approach is to ensure robustness: the approach should produce consistent results regardless of the initial parameter settings. Empirical studies show that parameter adjustments have a great impact on CPs. Normalization of examples affects TCMKNN greatly; for TCMSVM, not only the normalization but also the type and parameters of the kernel function are important. Such empirical, non-theoretical tuning hints at potential instability.
To demonstrate the parameter insensitivity of CPRF, we ran CPRF with different parameters, ntrees = 500, 1000, 5000 and ntry = 1,..., $\sqrt{\text{a}}$ (where a is the number of attributes). The mean and standard deviation of the forced-prediction accuracy on sonar are reported.
Comparison of Parameter Sensitivity
TCMKNN  Without normalization  Attributes normalization  Examples normalization 

accuracy  82.69%  88.46%  86.54% 
TCMSVM  Simple dot product  Radial basis function  Binomial coefficient polynomial 
accuracy  63.46%  48.08%  96.15% 
CPRF  Mean  standard deviation  
accuracy  84.92%  3.52% 
Feature selection
The problem of feature selection is open in many applications; our method uses none. Take gene expression analysis as an example: gene selection is a crucial problem and remains unsolved. In Yeoh's study, gene expression profiling could accurately identify the known prognostically important leukemia subtypes, using SVM, KNN and ANN classifiers on various selected gene sets. Unfortunately, classification was performed after a process of discriminating gene selection by correlation-based feature selection. This process is labor intensive and requires experiential knowledge; ideally, automated classification should come with a level of confidence. Moreover, due to the low sample size, although their research yielded high predictive accuracies comparable with or better than traditional clinical techniques, it remains uncertain how well the selected genes will extrapolate to practice in the future [25]. CPRF is especially suitable for this situation: it uses all of the genes without discriminating gene selection, which may meet the need for automated classification, and no selection bias is introduced.
Part II: Label conditional CPRF
The experiments in Part I show that although CPRF is well calibrated globally, i.e., the proportion of erroneous predictions matches the predefined significance level over the whole test set, it cannot guarantee the reliability of classification for each class, especially on unbalanced datasets. Unlike CPRF, label conditional CPRF is calibrated label-wise, whereas the former may violate the calibration property within some classes. Because label conditional CPRF uses only part of the data when computing each p-value, its computational efficiency is also better.
Conclusion
Most state-of-the-art machine learning algorithms cannot provide a reliable measure of confidence in their classifications and predictions. This paper addresses the importance of reliability and confidence for classification and presents a novel method based on a combination of random forest and conformal predictor. The new algorithm hedges the predictions of RF and gives a well-calibrated region prediction by using the proximity matrix generated by RF to build a nonconformity measure for examples. For medical diagnosis, the most important advantage of CPRF is its calibration: the risk of error can be well controlled. The new method takes advantage of RF and possesses a more precise and stable nonconformity measure. It can deal with redundant and noisy data with mixed types of variables, and is less sensitive to parameter settings. Furthermore, we extend CPRF to a label conditional version, so that it can control the risk of prediction for each class independently rather than globally; this modified version provides an alternative way to do cost-sensitive learning. Experiments on benchmark datasets and real-world applications show the usability and superiority of our method.
Methods
CPRF algorithm
CP, executed by transductive inference, is able to hedge the predictions of any popular machine learning method for which a nonconformity measure can be constructed [3, 4]. It is a remarkable fact that error calibration is guaranteed regardless of the particular classifier plugged into CP and the nonconformity measure constructed. However, the quality of the region predictions, and hence CP's efficiency, depends on the nonconformity measure. This issue has been discussed, and several types of classifiers have been used, such as the support vector machine, k-nearest neighbours, nearest centroid, kernel perceptron, naive Bayes and linear discriminant analysis [9–11]. The implementations of these methods are determined by the nature of the underlying classifiers: TCMSVM and TCMKP mainly target binary classification tasks, TCMKNN and TCMKNC are the simplest mathematical realizations, and TCMNB and TCMLDC are suitable for transductive regression. These methods have demonstrated their applicability and advantages over inductive learning, but limitations remain. Nonlinear datasets are especially challenging for TCMLDC; TCMKNN and TCMNC have difficulties with dispersed datasets; TCMSVM is so computationally intensive that it struggles with large datasets; TCMKP is practicable only on relatively noise-free data. In short, there are many restrictions on data quality when applying them to real-world data. The difficulty lies essentially in the nonconformity measure, whose design remains an open question.
The proximity matrix is built as follows: if instances i and j land in the same terminal node of a tree, the proximity between i and j is increased by one. Averaged over all trees, this forms an N × N matrix ⟦prox(i, j)⟧_{N × N}, which is symmetric, positive definite and bounded above by 1, with diagonal elements equal to 1, where N is the total number of cases [26]. After a random forest has been constructed, the proximity matrix of the training set and a given test example remains the same regardless of the order of the input sequence, so the random forest outlier measure can be used as a nonconformity measure.
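As a concrete illustration, the proximity matrix and a raw outlier score can be computed from a scikit-learn forest. This is only a sketch: the helper names `rf_proximity` and `rf_outlier` are ours, and the raw Breiman-style outlier score below (N divided by the sum of squared proximities to same-class examples) stands in for the paper's Eq. (6), which may normalize the score further.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_proximity(forest, X):
    """Proximity matrix: fraction of trees in which two samples
    fall into the same terminal node."""
    leaves = forest.apply(X)              # (n_samples, n_trees) leaf indices
    n_samples, n_trees = leaves.shape
    prox = np.zeros((n_samples, n_samples))
    for t in range(n_trees):
        same_leaf = leaves[:, t][:, None] == leaves[:, t][None, :]
        prox += same_leaf
    return prox / n_trees

def rf_outlier(prox, y):
    """Raw RF outlier score: N over the summed squared proximities
    to examples of the same class (Breiman's raw definition)."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        same_class = (y == y[i])
        same_class[i] = False             # exclude the sample itself
        denom = np.sum(prox[i, same_class] ** 2)
        out[i] = n / denom if denom > 0 else np.inf
    return out

# Two well-separated one-dimensional classes as toy data
X = np.array([[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]])
y = np.array([0, 0, 0, 1, 1, 1])
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
prox = rf_proximity(forest, X)            # symmetric, unit diagonal
out = rf_outlier(prox, y)                 # larger score = stranger sample
```

Because a sample always shares its own leaf and proximities are averaged over trees, the matrix is symmetric with unit diagonal and values in [0, 1], matching the properties stated above.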
In our method CPRF, we define a new nonconformity measure α_{i} = out(i), and then predict each test example using Eq. (3). The detailed CPRF algorithm is summarized in pseudocode below.
Algorithm: CPRF

Input: Training set T = {(x_{1}, y_{1}),..., (x_{l}, y_{l})} and a new unlabeled example x_{l+1}.

1. for i = 1 to m do
2.   Assign label i to x_{l+1};
3.   Construct an RF classifier with training set T, put the test example x_{l+1} into the forest, and output the sample proximity matrix ⟦prox(i, j)⟧_{(l+1) × (l+1)};
4.   Compute the nonconformity scores α_{1}^{i},..., α_{l}^{i}, α_{l+1}^{i} of all examples using Eq. (6) (α_{l+1}^{i} is the nonconformity score of x_{l+1} when assigned label i);
5.   Compute the p-value p_{l+1}^{i} of x_{l+1} with Eq. (3);
6. end for
Label conditional CPRF algorithm
Given a significance level ε > 0, the goal is to compute predictive regions, ideally consisting of just one label, that contain the true label with probability 1 - ε. In some situations, however, predictions are well calibrated globally but not within each class. In cost-sensitive learning we must allow different significance levels to be specified for each possible classification of an object, because the penalty of misclassification is not the same for all classes [27, 28]. This problem can be viewed as conditional inference. We extend our method to label conditional CP to address it, which can also be seen as a version of the Mondrian CP (MCP) [3, 29].
An important aspect of MCP is how the p-values are calculated. In standard CP, the nonconformity score of a new example is compared against the nonconformity scores of all examples observed so far. In contrast, label conditional CP compares the nonconformity score of a new example only with those of the previously observed examples in the same class. In detail, the method applies a function called a Mondrian taxonomy to partition the example space Z into rectangular groups. Given a division of the Cartesian product N × Z into categories, a function κ: N × Z → K maps each pair (n, z) (where z is an example and n is the ordinal number of this example in the data sequence z_{1}, z_{2},...) to its category. A label conditional nonconformity measure based on κ is then defined as

A_{n}: K^{n-1} × (Z^{(*)})^{K} × K × Z → R

with α_{i} denoting a nonconformity score.
In label conditional CPRF, α_{i} = out(i); the only difference from CPRF is that the p-values are computed over part of the training examples, using Eq. (7), which also yields higher computational efficiency. Owing to space limits, the detailed label conditional CPRF algorithm is omitted here.
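A sketch of the label conditional p-value, assuming the standard smoothed form restricted to examples of the tentative class (the scores and labels are illustrative; the paper's exact Eq. (7) is not reproduced here):

```python
import random

def label_conditional_p_value(alphas, labels, test_label, tau=None):
    """p-value of the last example in `alphas`, compared only against
    the training examples carrying the same (tentative) label, as in
    label conditional CP.

    alphas:     nonconformity scores; the last one belongs to the test example.
    labels:     labels of the training examples (one fewer than `alphas`).
    test_label: the label tentatively assigned to the test example.
    """
    if tau is None:
        tau = random.random()
    a_n = alphas[-1]
    # keep only same-class training scores, plus the test example itself
    same = [a for a, y in zip(alphas[:-1], labels) if y == test_label]
    same.append(a_n)
    greater = sum(1 for a in same if a > a_n)
    equal = sum(1 for a in same if a == a_n)
    return (greater + tau * equal) / len(same)

# Same-class comparison set is [1.0, 4.0, 3.0], so with tau = 0.5
# the p-value is (1 + 0.5 * 1) / 3 = 0.5.
p = label_conditional_p_value([1.0, 4.0, 2.0, 3.0],
                              ["pos", "pos", "neg"],
                              test_label="pos", tau=0.5)
```

Thresholding these per-class p-values with class-specific significance levels ε_k is what allows the risk of each class to be controlled independently.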
Availability
The chronic gastritis dataset, the core source codes of CPRF and label conditional CPRF are available at http://59.77.15.238/APBC_paper or http://www.penna.cn/cprf
Notes
List of abbreviations used
CP: Conformal predictor
RF: Random forest
KNN: K-nearest-neighbour classifier
SVM: Support vector machine
KP: Kernel perceptron
NB: Naïve Bayes
NC: Nearest centroid
LDC: Linear discriminant classifier
KNC: Kernel nearest centroid
ANN: Artificial neural network
Declarations
Acknowledgements
The authors are grateful to Prof. Vladimir Vovk and Dima Devetyarov for valuable suggestions and essential help on conformal predictors. The authors would also like to thank Prof. Changle Zhou for assistance in data collection and support. This work was supported by the 985 Innovation Project on Information Technique of Xiamen University under Grant No.0000x07204 and the National High Technology Research and Development Program of China (863 Program) under Grant No.2006AA01Z129.
This article has been published as part of BMC Bioinformatics Volume 10 Supplement 1, 2009: Proceedings of The Seventh Asia Pacific Bioinformatics Conference (APBC) 2009. The full contents of the supplement are available online at http://www.biomedcentral.com/14712105/10?issue=S1
References
 1. Pirooznia M, Yang JY, Yang MQ, Deng YP: A comparative study of different machine learning methods on microarray gene expression data. BMC Genomics. 2008, 9 (Suppl 1): S13.
 2. Gammerman A, Vovk V: Prediction algorithms and confidence measures based on algorithmic randomness theory. Theoretical Computer Science. 2002, 287: 209-217.
 3. Vovk V, Gammerman A, Shafer G: Algorithmic Learning in a Random World. 2005, Springer, New York.
 4. Gammerman A, Vovk V: Hedging predictions in machine learning. Computer Journal. 2007, 50: 151-177.
 5. Shafer G, Vovk V: A tutorial on conformal prediction. J Mach Learn Res. 2008, 9: 371-421.
 6. Elkan C: The foundations of cost-sensitive learning. Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence. 2001, Morgan Kaufmann, Seattle, Washington, 973-978.
 7. Vovk V: A universal well-calibrated algorithm for on-line classification. J Mach Learn Res. 2004, 5: 575-604.
 8. Vanderlooy S, van der Maaten L, Sprinkhuizen-Kuyper I: Off-line learning with transductive confidence machines: an empirical evaluation. Proceedings of the 5th International Conference on Machine Learning and Data Mining in Pattern Recognition. Edited by: Perner P. LNAI 4571. 2007, Leipzig, Germany, Springer, 310-323.
 9. Bellotti T, Luo Z, Gammerman A, van Delft F, Saha V: Qualified predictions for microarray and proteomics pattern diagnostics with confidence machines. International Journal of Neural Systems. 2005, 15 (4): 247-258.
 10. Bellotti T, Luo Z, Gammerman A: Reliable classification of childhood acute leukaemia from gene expression data using confidence machines. Proceedings of the IEEE International Conference on Granular Computing, Atlanta, USA. 2006, 148-153.
 11. Proedrou K, Nouretdinov I, Vovk V, Gammerman A: Transductive confidence machines for pattern recognition. Proceedings of the 13th European Conference on Machine Learning. 2002, 381-390.
 12. Breiman L: Bagging predictors. Mach Learn. 1996, 24 (2): 123-140.
 13. Breiman L: Random forests. Mach Learn. 2001, 45 (1): 5-32.
 14. Diaz-Uriarte R, Alvarez de Andres S: Gene selection and classification of microarray data using random forest. BMC Bioinformatics. 2006, 7: 3.
 15. Strobl C, Boulesteix AL, Kneib T, Augustin T, Zeileis A: Conditional variable importance for random forests. BMC Bioinformatics. 2008, 9: 307.
 16. Turney P: Types of cost in inductive concept learning. Workshop on Cost-Sensitive Learning at ICML. 2000, Stanford University, California, 15-21.
 17. Zhou ZH, Liu XY: On multi-class cost-sensitive learning. Proceedings of the 21st National Conference on Artificial Intelligence, Boston, MA. 2006, 567-572.
 18. Zadrozny B, Elkan C: Learning and making decisions when costs and probabilities are both unknown. Proceedings of the Seventh International Conference on Knowledge Discovery and Data Mining. 2001, ACM Press, 204-213.
 19. UCI Machine Learning Repository. [http://archive.ics.uci.edu/ml/]
 20. Yeoh EJ, Ross ME, Shurtleff SA: Classification, subtype discovery, and prediction of outcome in pediatric acute lymphoblastic leukemia by gene expression profiling. Cancer Cell. 2002, 1 (2): 133-143.
 21. Draghici S: Data Analysis Tools for DNA Microarrays. 2003, Chapman & Hall/CRC, London.
 22. Thyroid Disease Database. [ftp://ftp.ics.uci.edu/pub/machine-learning-databases/thyroid-disease/]
 23. Chronic Gastritis Dataset. [http://59.77.15.238/APBC_paper]
 24. Niu HZ, Wang RX, Lan SM, Xu WL: Thinking and approaches on treatment of chronic gastritis with integration of traditional Chinese and western medicine. Shandong Journal of Traditional Chinese Medicine. 2001, 20 (3): 70-72.
 25. Boulesteix AL, Strobl C, Augustin T, Daumer M: Evaluating microarray-based classifiers: an overview. Cancer Informatics. 2008, 6: 77-97.
 26. Qi Y, Klein-Seetharaman J, Bar-Joseph Z: Random forest similarity for protein-protein interaction prediction from multiple sources. Pacific Symposium on Biocomputing. 2005, 10: 531-542.
 27. Domingos P: MetaCost: a general method for making classifiers cost-sensitive. Proceedings of the Fifth International Conference on Knowledge Discovery and Data Mining. 1999, New York, ACM Press, 155-164.
 28. Drummond C, Holte RC: Cost curves: an improved method for visualizing classifier performance. Machine Learning. 2006, 65 (1): 95-130.
 29. Vovk V, Lindsay D, Nouretdinov I, Gammerman A: Mondrian Confidence Machine. Technical Report, Computer Learning Research Centre, Royal Holloway, University of London.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.