
Table 3 Prediction accuracies of the ensemble SVM(Biased) and SVM(Unbiased) classifiers

From: Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

| Dataset | Ensemble mode | # Individual classifiers | Acc. % (Biased) | Acc. % (Unbiased) |
|---|---|---|---|---|
| Leukemia | Top 300 gene subsets | 147 | 92.31 | 84.62 |
| | 10-Fold > 98* | 47 | 96.15 | 88.46 |
| | 10-Fold = 100 and Full-fold ≥ 99 | 5 | 88.46 | 86.54 |
| DLBCL | Top 300 gene subsets | 61 | 95.24 | 85.71 |
| | 10-Fold = 100 | 143** | 95.24 | 85.71 |
| | 10-Fold = 100 and Full-fold = 100 | 29** | 95.24 | 85.71 |
| Prostate | Top 300 gene subsets | 300 | 97.06 | 88.24 |
| | Full-fold > 98 | 290 | 97.06 | 88.24 |
| | Full-fold > 99 | 139 | 97.06 | 88.24 |
| SRBCT | Top 300 gene subsets | 300 | 90 | 80 |
| | Full-fold > 98 | 114 | 95 | 85 |
| | Full-fold > 98 and 10-Fold = 100 | 8 | 100 | 90 |
| ALL | Top 300 gene subsets | 300 | 96 | 96 |
| | 10-Fold = 100 | 59 | 97 | 96 |
| | 10-Fold = 100 and Full-fold ≥ 99 | 42 | 95 | 95 |
| Colon | Top 300 gene subsets | 300 | 90 | 70 |
| | 10-Fold = 100 | 62 | 85 | 65 |
| | 10-Fold = 100 and Full-fold ≥ 98 | 59 | 85 | 65 |
  1. * The item 10-Fold > 98 means that the gene subsets with 10-fold CV accuracy greater than 98% are selected from the 300 top-ranked gene subsets; only 47 of them are shared between the Leukemia72 training set and the Leukemia52 test set. The final ensemble classifier therefore consists of the 47 individual classifiers constructed from these 47 gene subsets. The corresponding prediction accuracies (Biased and Unbiased) are obtained on the Leukemia52 test set by the ensemble classifiers constructed with SVM(Biased) and SVM(Unbiased), respectively.
  2. ** The individual classifiers are constructed from gene subsets selected from all nodes in the last layer, not only the 300 top-ranked nodes in that layer, because more than 300 gene subsets obtain 100% 10-fold CV accuracy on DLBCL.
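
The selection-and-ensemble step described in the first footnote can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the table: the record fields (`genes`, `cv10`, `in_test`), the helper names, the toy threshold classifiers standing in for SVMs, and the use of simple majority voting to combine individual predictions (the table does not state the combination rule).

```python
from collections import Counter

def select_subsets(candidates, cv_threshold=98.0):
    """Keep gene subsets whose 10-fold CV accuracy exceeds the threshold
    and whose genes are also present in the test set.
    The field names 'cv10' and 'in_test' are hypothetical."""
    return [c for c in candidates if c["cv10"] > cv_threshold and c["in_test"]]

def majority_vote(classifiers, sample):
    """Combine individual predictions by simple majority voting (an assumed
    rule). Ties go to the label that was voted for first."""
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]

def make_classifier(gene, cutoff=0.5):
    """Toy stand-in for one trained individual classifier: thresholds a
    single expression value. A real pipeline would train an SVM here."""
    return lambda sample: "A" if sample[gene] > cutoff else "B"

# Hypothetical candidate gene subsets with their 10-fold CV accuracies (%).
candidates = [
    {"genes": ("g1",), "cv10": 100.0, "in_test": True},
    {"genes": ("g2",), "cv10": 98.6, "in_test": True},
    {"genes": ("g3",), "cv10": 97.2, "in_test": True},    # below the threshold
    {"genes": ("g4",), "cv10": 100.0, "in_test": False},  # genes absent from test set
    {"genes": ("g5",), "cv10": 99.1, "in_test": True},
]

selected = select_subsets(candidates)
classifiers = [make_classifier(s["genes"][0]) for s in selected]

# One hypothetical test sample: expression values keyed by gene name.
sample = {"g1": 0.9, "g2": 0.2, "g5": 0.7}
prediction = majority_vote(classifiers, sample)
print(len(selected), prediction)
```

In this toy run three subsets pass both filters, and two of the three stand-in classifiers vote for the same label, which becomes the ensemble prediction.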