Table 1 Performance of the classifiers under the alternative hypothesis with large class-imbalance (k_1 = 0.9) and moderate differences between classes (μ_2 = 1)

From: Improved shrunken centroid classifiers for high-dimensional class-imbalanced data

| Method | λ* (a) | #, % info | # non-info [%] | FDR | PA_1 (n_1 = 90) | PA_2 (n_2 = 10) | g-means | AUC |
|---|---|---|---|---|---|---|---|---|
| PAM | 0.05 | 99.94 | 9184.1 [92.77] | 0.99 | 0.95 | 0.11 | 0.31 | 0.6 |
| | (0.08) | (0.87) | (1136.28) | (0.00) | (0.02) | (0.05) | (0.07) | (0.04) |
| GM-PAM | 1.29 | 58.81 | 602.1 [6.08] | 0.62 | 0.63 | 0.64 | 0.62 | 0.69 |
| | (0.51) | (43.6) | (1004.73) | (0.38) | (0.1) | (0.16) | (0.08) | (0.1) |
| ALP | 0.07 | 99.96 | 9004.2 [90.95] | 0.99 | 0.95 | 0.11 | 0.3 | 0.61 |
| | (0.19) | (0.54) | (2232.18) | (0.01) | (0.03) | (0.07) | (0.08) | (0.04) |
| GM-ALP | 3.76 | 63.08 | 408.8 [4.13] | 0.59 | 0.67 | 0.61 | 0.63 | 0.68 |
| | (2.21) | (41.09) | (816.3) | (0.36) | (0.1) | (0.17) | (0.09) | (0.09) |
| AHP | 0.36 | 96.89 | 6438.9 [65.04] | 0.95 | 0.94 | 0.14 | 0.34 | 0.62 |
| | (1.57) | (12.84) | (4235.52) | (0.11) | (0.04) | (0.1) | (0.1) | (0.05) |
| GM-AHP | 5.29 | 37.45 | 266.5 [2.69] | 0.42 | 0.78 | 0.5 | 0.6 | 0.69 |
| | (3.94) | (37.96) | (1275.07) | (0.4) | (0.08) | (0.18) | (0.12) | (0.1) |

The table reports the estimated optimal threshold (λ*), the number (equal to the percentage) of active informative variables (#, % info; selected out of 100 informative variables, equal to 100(1 − false negative rate)), the number [%] of active non-informative variables (# non-info [%]; selected out of 9,900 non-informative variables), the false discovery rate (FDR = # non-info / # active), the class-specific predictive accuracies (PA_1, PA_2), g-means and AUC, averaged over 500 repetitions; standard deviations are reported in parentheses below each mean. The simulation settings are the same as in Figure 2.

(a) For AHP and GM-AHP only λ_θ was optimized while λ_γ was set to zero.
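
As a rough illustration of how the variable-selection columns relate to one another, the following minimal Python sketch computes the quantities defined in the note for a single repetition. The function names, the default totals (100 informative and 9,900 non-informative variables, taken from the note) and the convention of returning FDR = 0 when no variable is active are assumptions made here for illustration. The g-means is taken to be the geometric mean of the two class-specific accuracies, its usual definition, which the note does not spell out.

```python
import math


def selection_metrics(n_info_active, n_noninfo_active,
                      n_info_total=100, n_noninfo_total=9900):
    """Variable-selection summaries for one repetition (hypothetical helper)."""
    n_active = n_info_active + n_noninfo_active
    # Assumed convention: FDR is 0 when no variable is active.
    fdr = n_noninfo_active / n_active if n_active > 0 else 0.0
    return {
        "pct_info": 100.0 * n_info_active / n_info_total,
        "pct_noninfo": 100.0 * n_noninfo_active / n_noninfo_total,
        "fdr": fdr,
    }


def g_means(pa1, pa2):
    """Geometric mean of the two class-specific predictive accuracies."""
    return math.sqrt(pa1 * pa2)


# Plugging in the averaged PAM counts from the table.
pam = selection_metrics(99.94, 9184.1)
print(round(pam["pct_noninfo"], 2), round(pam["fdr"], 2))
```

For PAM the averaged counts give back the reported percentage of non-informative variables and an FDR close to the reported value. In general the table averages each metric over the 500 repetitions, so applying these formulas to the averaged inputs need not reproduce the averaged FDR or g-means, particularly for methods whose counts vary strongly between repetitions.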