Table 1 Performance of the classifiers under the alternative hypothesis with large class imbalance (k_1 = 0.9) and moderate differences between classes (μ_2 = 1)

From: Improved shrunken centroid classifiers for high-dimensional class-imbalanced data

| Method | λ^a | #, % info | # non-info [%] | FDR | PA_1 (n_1 = 90) | PA_2 (n_2 = 10) | g-means | AUC |
|---|---|---|---|---|---|---|---|---|
| PAM | 0.05 | 99.94 | 9184.1 [92.77] | 0.99 | 0.95 | 0.11 | 0.31 | 0.6 |
| | (0.08) | (0.87) | (1136.28) | (0.00) | (0.02) | (0.05) | (0.07) | (0.04) |
| GM-PAM | 1.29 | 58.81 | 602.1 [6.08] | 0.62 | 0.63 | 0.64 | 0.62 | 0.69 |
| | (0.51) | (43.6) | (1004.73) | (0.38) | (0.1) | (0.16) | (0.08) | (0.1) |
| ALP | 0.07 | 99.96 | 9004.2 [90.95] | 0.99 | 0.95 | 0.11 | 0.3 | 0.61 |
| | (0.19) | (0.54) | (2232.18) | (0.01) | (0.03) | (0.07) | (0.08) | (0.04) |
| GM-ALP | 3.76 | 63.08 | 408.8 [4.13] | 0.59 | 0.67 | 0.61 | 0.63 | 0.68 |
| | (2.21) | (41.09) | (816.3) | (0.36) | (0.1) | (0.17) | (0.09) | (0.09) |
| AHP | 0.36 | 96.89 | 6438.9 [65.04] | 0.95 | 0.94 | 0.14 | 0.34 | 0.62 |
| | (1.57) | (12.84) | (4235.52) | (0.11) | (0.04) | (0.1) | (0.1) | (0.05) |
| GM-AHP | 5.29 | 37.45 | 266.5 [2.69] | 0.42 | 0.78 | 0.5 | 0.6 | 0.69 |
| | (3.94) | (37.96) | (1275.07) | (0.4) | (0.08) | (0.18) | (0.12) | (0.1) |
The table reports the estimated optimal threshold (λ); the number of active informative variables (#, % info), selected out of 100 informative variables, so that the count equals the percentage and equals 100(1 - false negative rate); the number [%] of active non-informative variables (# non-info [%]), selected out of 9,900 non-informative variables; the false discovery rate (FDR = # non-info / # active); the class-specific predictive accuracies (PA_1, PA_2); g-means; and AUC. All values are averaged over 500 repetitions, with standard deviations reported in parentheses on the line below each mean. The simulation settings are the same as in Figure 2.
^a For AHP and GM-AHP, only λ_θ was optimized, while λ_γ was set to zero.
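
The selection and accuracy summaries defined in the note can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `selection_summary` and `g_means` are hypothetical, and g-means is assumed to be the usual geometric mean of the two class-specific predictive accuracies, which the note itself does not spell out. Because the table reports averages over 500 repetitions, applying these formulas to the tabulated means reproduces the reported values only approximately.

```python
import math


def selection_summary(n_info_active, n_noninfo_active,
                      n_info_total=100, n_noninfo_total=9900):
    """Summarize variable selection for one fitted classifier.

    The totals default to the simulation design stated in the table note
    (100 informative and 9,900 non-informative variables).
    """
    n_active = n_info_active + n_noninfo_active
    # FDR = # non-info / # active; set to 0 when nothing is selected
    # (a convention assumed here, not taken from the source).
    fdr = n_noninfo_active / n_active if n_active > 0 else 0.0
    return {
        # Equals 100 * (1 - false negative rate).
        "pct_info": 100.0 * n_info_active / n_info_total,
        "pct_noninfo": 100.0 * n_noninfo_active / n_noninfo_total,
        "fdr": fdr,
    }


def g_means(pa1, pa2):
    """Geometric mean of the class-specific predictive accuracies (assumed definition)."""
    return math.sqrt(pa1 * pa2)


if __name__ == "__main__":
    # Mean counts from the PAM row of the table, used purely as example input.
    print(selection_summary(99.94, 9184.1))
    print(g_means(0.95, 0.11))
```

With the PAM row's mean counts as input, the sketch gives an FDR of about 0.99 and 92.77% of non-informative variables active, in line with the table. The g-means computed from the averaged accuracies (about 0.32) differs slightly from the reported 0.31, as expected when a nonlinear summary is averaged over repetitions rather than computed from averages.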