Table 2 Error rates (estimated using the 0.632+ bootstrap method with 200 bootstrap samples) for the microarray data sets using different methods. The results shown for variable selection with random forest used ntree = 2000, fraction.dropped = 0.2, mtryFactor = 1. Note that the OOB error used for variable selection is not the error reported in this table; the error rate reported is obtained using bootstrap on the complete variable selection process. The column "no info" denotes the minimal error we can make if we use no information from the genes (i.e., we always bet on the most frequent class).

From: Gene selection and classification of microarray data using random forest

Data set      no info  SVM    KNN    DLDA   SC.l   SC.s   NN.vs  random forest  random forest var.sel.
                                                                                s.e. 0  s.e. 1
Leukemia      0.289    0.014  0.029  0.020  0.025  0.062  0.056  0.051          0.087   0.075
Breast 2 cl.  0.429    0.325  0.337  0.331  0.324  0.326  0.337  0.342          0.337   0.332
Breast 3 cl.  0.537    0.380  0.449  0.370  0.396  0.401  0.424  0.351          0.346   0.364
NCI 60        0.852    0.256  0.317  0.286  0.256  0.246  0.237  0.252          0.327   0.353
Adenocar.     0.158    0.203  0.174  0.194  0.177  0.179  0.181  0.125          0.185   0.207
Brain         0.762    0.138  0.174  0.183  0.163  0.159  0.194  0.154          0.216   0.216
Colon         0.355    0.147  0.152  0.137  0.123  0.122  0.158  0.127          0.159   0.177
Lymphoma      0.323    0.010  0.008  0.021  0.028  0.033  0.04   0.009          0.047   0.042
Prostate      0.490    0.064  0.100  0.149  0.088  0.089  0.081  0.077          0.061   0.064
Srbct         0.635    0.017  0.023  0.011  0.012  0.025  0.031  0.021          0.039   0.038
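
The "no info" column described in the caption can be illustrated with a short sketch. The function name no_info_error and the class labels below are assumptions made for illustration; they are not taken from the paper or its software. The idea is only that always betting on the most frequent class gives an error equal to one minus that class's share of the samples.

```python
from collections import Counter


def no_info_error(labels):
    """Error rate of always predicting the most frequent class."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    # Number of samples in the largest class.
    most_common_count = counts.most_common(1)[0][1]
    return 1.0 - most_common_count / len(labels)


# Illustrative labels (an assumption, not data from the table):
# 27 samples of one class and 11 of another.
labels = ["class_a"] * 27 + ["class_b"] * 11
baseline = no_info_error(labels)
print(f"no-info error: {baseline:.3f}")
```

A method is informative on a data set only when its error falls below this baseline. In the Adenocar. row, for example, the "no info" error is 0.158 and only random forest without variable selection (0.125) falls below it.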