Table 3 Prediction accuracies of the ensemble SVM(Biased) and SVM(Unbiased) classifiers

From: Finding minimum gene subsets with heuristic breadth-first search algorithm for robust tumor classification

| Dataset | Ensemble mode | No. of individual classifiers | Accuracy % (Biased) | Accuracy % (Unbiased) |
|---|---|---|---|---|
| Leukemia | Top 300 gene subsets | 147 | 92.31 | 84.62 |
| | 10-Fold > 98* | 47 | 96.15 | 88.46 |
| | 10-Fold = 100 and Full-fold ≥ 99 | 5 | 88.46 | 86.54 |
| DLBCL | Top 300 gene subsets | 61 | 95.24 | 85.71 |
| | 10-Fold = 100 | 143** | 95.24 | 85.71 |
| | 10-Fold = 100 and Full-fold = 100 | 29** | 95.24 | 85.71 |
| Prostate | Top 300 gene subsets | 300 | 97.06 | 88.24 |
| | Full-fold > 98 | 290 | 97.06 | 88.24 |
| | Full-fold > 99 | 139 | 97.06 | 88.24 |
| SRBCT | Top 300 gene subsets | 300 | 90 | 80 |
| | Full-fold > 98 | 114 | 95 | 85 |
| | Full-fold > 98 and 10-Fold = 100 | 8 | 100 | 90 |
| ALL | Top 300 gene subsets | 300 | 96 | 96 |
| | 10-Fold = 100 | 59 | 97 | 96 |
| | 10-Fold = 100 and Full-fold ≥ 99 | 42 | 95 | 95 |
| Colon | Top 300 gene subsets | 300 | 90 | 70 |
| | 10-Fold = 100 | 62 | 85 | 65 |
| | 10-Fold = 100 and Full-fold ≥ 98 | 59 | 85 | 65 |

  1. * The item 10-Fold > 98 means that gene subsets with 10-fold CV accuracy greater than 98% are selected from the 300 top-ranked gene subsets, among which only 47 gene subsets are shared between the Leukemia72 training set and the Leukemia52 test set. The final ensemble classifier therefore consists of the 47 individual classifiers built from these 47 gene subsets. The Biased and Unbiased prediction accuracies are those of the ensemble classifiers built with SVM(Biased) and SVM(Unbiased), respectively, evaluated on the Leukemia52 test set.
  2. ** The individual classifiers are built from gene subsets selected from all nodes in the last layer, not only the 300 top-ranked nodes in that layer, because more than 300 gene subsets reach 100% 10-fold CV accuracy on DLBCL.
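
The selection step described in the footnotes (keep only gene subsets whose cross-validation accuracy passes a threshold, then combine the classifiers built from them) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The combination rule (simple majority voting), the function names, the gene labels and all numbers in the toy data are assumptions made for the example; the table itself does not state how the individual classifiers are combined.

```python
from collections import Counter

def select_subsets(candidates, min_cv_acc):
    """Keep candidates whose 10-fold CV accuracy (in %) exceeds the threshold."""
    return [c for c in candidates if c["cv10"] > min_cv_acc]

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample.

    Assumption: plain majority voting; on a tie, the label counted first wins.
    """
    n_samples = len(predictions[0])
    combined = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        combined.append(votes.most_common(1)[0][0])
    return combined

def accuracy(predicted, actual):
    """Percentage of samples whose predicted label matches the true label."""
    return 100.0 * sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical toy data: each entry stands for one gene subset, its 10-fold CV
# accuracy on the training set, and the test-set predictions of the classifier
# trained on that subset. None of these values come from the paper.
candidates = [
    {"genes": ("g1", "g2"), "cv10": 100.0, "preds": ["ALL", "AML", "ALL", "AML"]},
    {"genes": ("g3", "g4"), "cv10": 98.6, "preds": ["ALL", "AML", "AML", "AML"]},
    {"genes": ("g5",), "cv10": 97.2, "preds": ["AML", "AML", "AML", "ALL"]},
    {"genes": ("g2", "g6"), "cv10": 99.1, "preds": ["ALL", "ALL", "ALL", "AML"]},
]
truth = ["ALL", "AML", "ALL", "AML"]

# Mirrors the "10-Fold > 98" ensemble mode: filter, then vote.
selected = select_subsets(candidates, 98.0)
ensemble_preds = majority_vote([c["preds"] for c in selected])
ensemble_acc = accuracy(ensemble_preds, truth)
```

In this toy run three of the four candidates pass the threshold, and the vote corrects the single errors made by two of the selected classifiers.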