
Table 5 Performance comparison of the LHR with 12 standard feature selection schemes (FSSs)

From: Feature weight estimation for gene selection: a local hyperlinear learning approach

Classifier FSS DLBCL Prostate1 GCM Prostate2 CNS Leukemia Prostate3 Colon Lung Avg.
- No. genes 27 22 6 24 7 23 6 18 5 -
SVMη IG 93.5 96.1 81.8 84.1 88.2 97.2 100 87.1 99.4 91.9
SVMη TR 94.8 97.1 82.1 84.1 85.3 97.2 100 91.9 99.4 92.4
SVMη Gini 93.5 95.1 80.7 81.8 88.2 95.8 100 83.9 99.4 90.9
SVMη SumM 94.8 93.1 81.8 70.5 85.3 98.6 100 88.7 98.9 90.2
SVMη MaxM 94.8 97.1 82.1 84.1 85.3 97.2 100 91.9 99.4 92.4
SVMη SumV 94.8 97.1 82.1 84.1 85.3 97.2 100 91.9 99.4 92.4
SVMη t-stat 92.2 91.2 83.2 81.8 82.4 95.8 100 87.1 98.9 90.3
SVMη OSVM 98.7 93.1 80.7 73.9 85.3 95.8 72.7 85.5 98.3 87.1
SVMη MIDξ 75.3 75.3 75.3 75.3 73.5 65.3 72.7 64.5 82.9 73.4
SVMη MIQξ 75.3 75.3 75.3 75.3 73.5 65.3 72.7 64.5 82.9 73.4
SVMη I-RELIEF 92.2 88.2 81.2 83.0 83.4 94.4 81.2 75.8 84.0 84.8
SVMη LHR 94.8 96.1 100 95.5 100 98.6 100 87.1 100 96.9
SVMη LOGO 100 100 88.9 92.0 97.1 100 100 91.9 100 96.7
LDA IG 88.2 90.1 82.9 79.6 88.3 92.0 100 80.5 97.3 88.8
LDA TR 88.0 90.1 82.9 80.3 86.7 93.0 100 82.6 97.8 89.0
LDA Gini 79.5 90.2 83.6 79.9 88.3 94.5 100 82.6 98.3 88.5
LDA SumM 86.1 92.1 82.5 71.9 89.2 91.6 100 79.3 98.3 87.9
LDA MaxM 90.7 94.2 82.9 80.7 84.2 94.3 100 80.7 97.8 89.5
LDA SumV 90.9 91.2 82.1 84.0 88.3 92.9 100 74.0 97.3 89.0
LDA t-stat 77.5 89.4 83.9 84.2 82.5 91.6 97.5 80.5 93.4 86.7
LDA OSVM 97.4 92.3 79.6 84.4 85.0 92.9 40.0 82.4 98.9 83.7
LDA MIDξ 75.5 83.9 81.6 83.9 90.8 78.9 84.2 77.6 96.7 83.7
LDA MIQξ 76.3 79.3 78.2 72.9 73.3 85.9 83.3 78.8 95.6 80.4
LDA I-RELIEF 89.5 80.6 80.7 87.5 81.2 92.9 80.2 74.0 80.1 83.0
LDA LHR 95.0 97.1 99.4 95.4 99.5 98.8 99.5 87.4 99.5 96.8
LDA LOGO 98.6 98.0 90.0 95.6 86.7 100 100 86.9 100 95.1
NB IG 88.3 93.1 80.0 84.1 91.2 95.8 100 87.1 98.9 90.9
NB TR 89.6 93.1 80.0 84.1 88.2 95.8 100 88.7 98.9 90.9
NB Gini 89.6 92.2 80.4 83.0 91.2 95.8 100 88.7 98.9 91.1
NB SumM 89.6 92.2 80.7 73.9 91.2 95.8 100 88.7 98.9 90.1
NB MaxM 89.6 93.1 80.0 84.1 88.2 95.8 100 88.7 98.9 90.9
NB SumV 89.6 93.1 80.0 84.1 88.2 95.8 100 88.7 98.9 90.9
NB t-stat 89.6 94.1 82.5 83.0 91.2 98.6 100 79.0 98.3 90.7
NB OSVM 90.9 94.1 81.1 81.8 91.2 95.8 100 83.9 98.3 90.8
NB MIDξ 76.6 76.6 80.5 75.3 88.2 84.7 84.8 80.6 97.8 82.8
NB MIQξ 80.5 83.1 77.9 79.2 73.5 94.4 84.8 74.2 97.2 82.8
NB I-RELIEF 84.4 73.5 87.3 81.8 85.1 91.7 87.3 67.7 86.7 82.8
NB LHR 92.2 98.0 97.2 89.8 97.8 98.6 97.8 90.3 97.2 95.4
NB LOGO 98.7 93.1 84.3 94.3 97.1 100 100 90.3 100 95.3
KNNη IG 92.2 96.1 85.7 84.1 91.2 98.6 100 88.7 98.9 92.8
KNNη TR 90.9 98.0 84.6 84.1 88.2 98.6 100 87.1 98.9 92.3
KNNη Gini 90.9 92.2 86.1 84.1 88.2 98.6 100 85.5 98.9 91.6
KNNη SumM 93.5 92.2 84.3 86.4 94.1 98.6 100 87.1 98.9 92.8
KNNη MaxM 90.9 98.0 84.6 84.1 88.2 98.6 100 87.1 98.9 92.3
KNNη SumV 90.9 98.0 84.6 84.1 88.2 98.6 100 87.1 98.9 92.3
KNNη t-stat 93.5 94.1 86.8 86.4 91.2 97.2 100 88.7 99.4 93.0
KNNη OSVM 90.9 93.1 87.9 80.7 91.2 94.4 84.8 85.5 98.9 89.7
KNNη MIDξ 88.3 89.6 90.9 87.0 85.3 90.3 93.9 77.4 91.2 88.2
KNNη MIQξ 93.5 87.0 87.0 89.6 85.3 91.7 93.9 79.0 91.2 88.7
KNNη I-RELIEF 96.1 91.2 87.8 86.4 88.4 94.4 87.8 82.3 88.4 89.2
KNNη LHR 96.1 99.0 100 94.3 99.4 100 99.4 91.9 100 97.8
KNNη LOGO 100 99.0 94.6 96.6 94.1 100 100 91.9 100 97.4
HKNNη IG 90.9 97.1 83.9 85.2 91.2 95.8 100 83.9 98.9 91.9
HKNNη TR 90.9 95.1 82.5 85.2 88.2 97.2 100 87.1 98.9 91.7
HKNNη Gini 90.9 93.1 84.3 84.1 94.1 98.6 100 87.1 98.9 92.3
HKNNη SumM 92.2 91.2 83.9 84.1 91.2 98.6 100 87.1 100 92.0
HKNNη MaxM 90.9 95.1 82.5 85.2 88.2 97.2 100 87.1 98.9 91.7
HKNNη SumV 90.9 95.1 82.5 85.2 88.2 97.2 100 87.1 98.9 91.7
HKNNη t-stat 89.6 91.2 81.4 81.8 94.1 97.2 100 83.9 99.4 91.0
HKNNη OSVM 89.6 92.2 83.9 79.5 91.2 97.2 87.9 87.1 99.4 89.8
HKNNη MIDξ 80.5 81.8 87.0 83.1 79.4 84.7 90.9 79.0 95.0 84.6
HKNNη MIQξ 88.3 83.1 80.5 89.6 82.4 91.7 90.9 75.8 93.9 86.2
HKNNη I-RELIEF 96.1 85.3 84.0 77.3 84.0 95.8 84.0 77.4 86.2 85.6
HKNNη LHR 97.4 97.1 100 94.3 100 100 100 90.3 100 97.7
HKNNη LOGO 100 99.0 96.8 96.6 97.1 100 100 91.9 100 97.9
  1. ξ The data are preprocessed via a t-test with a confidence level of 0.01 to reduce the computational burden of estimating mutual information.
  2. η Hyper-parameters are estimated via 5-fold cross-validation.
  3. The number of genes is determined by LHR and used for all other FSSs. The LOOCV criterion is used to evaluate the performance of the FSSs, coupled with five classification models. The optimal and suboptimal accuracies (columnwise) on each tested dataset are highlighted in bold and italic, respectively.
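
The LOOCV evaluation mentioned in note 3 can be illustrated with a minimal sketch. It is not the authors' code: the function name `loocv_accuracy`, the plain 1-nearest-neighbour classifier, and the toy data are all assumptions made for illustration, and the gene subset is simply passed in as a fixed list of column indices rather than being produced by LHR or any other FSS.

```python
from math import dist

def loocv_accuracy(samples, labels, selected):
    """Leave-one-out accuracy (%) of a 1-nearest-neighbour classifier
    restricted to the feature (gene) indices in `selected`.
    Hypothetical helper for illustration only."""
    # Keep only the selected features of every sample.
    reduced = [[row[j] for j in selected] for row in samples]
    correct = 0
    for i, held_out in enumerate(reduced):
        # Hold out sample i and find its nearest neighbour among the rest.
        nearest = min(
            (k for k in range(len(reduced)) if k != i),
            key=lambda k: dist(held_out, reduced[k]),
        )
        # The held-out sample is predicted to have its neighbour's label.
        correct += labels[nearest] == labels[i]
    return 100.0 * correct / len(reduced)

# Toy data (made up): 6 samples, 3 "genes"; only gene 0 separates the classes.
X = [
    [0.1, 5.0, 0.0],
    [0.2, 1.0, 9.0],
    [0.3, 9.0, 4.0],
    [0.9, 5.1, 0.1],
    [1.0, 1.1, 9.1],
    [1.1, 9.1, 4.1],
]
y = [0, 0, 0, 1, 1, 1]

acc_selected = loocv_accuracy(X, y, selected=[0])
acc_all = loocv_accuracy(X, y, selected=[0, 1, 2])
print(f"gene 0 only: {acc_selected:.1f}%  all genes: {acc_all:.1f}%")
```

In this toy setting the uninformative genes dominate the distance, so restricting the classifier to the single informative gene changes the leave-one-out accuracy substantially. Each accuracy in the table is the same kind of quantity, a percentage of held-out samples classified correctly, computed on real data with the classifiers listed.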