Table 3 Performance of the classifiers on the Miller data set without feature selection

From: SMOTE for high-dimensional class-imbalanced data

Method        Measure ER 1-NN  ER 3-NN  ER 5-NN          Grade 1-NN  Grade 3-NN  Grade 5-NN
NC (CUT-OFF)  PA      0.838    0.862    0.874 (0.777)    0.779       0.839       0.835 (0.835)
              PA1     0.925    0.953    0.972 (0.789)    0.897       0.954       0.949 (0.897)
              PA2     0.294    0.294    0.265 (0.706)    0.352       0.426       0.426 (0.611)
              AUC     0.610    0.692    0.772 (0.772)    0.625       0.769       0.816 (0.816)
              G-mean  0.522    0.529    0.507 (0.746)    0.562       0.637       0.636 (0.741)
SMOTE         PA      0.271    0.249    0.249            0.364       0.373       0.384
                      (0.012)  (0.013)  (0.012)          (0.014)     (0.015)     (0.016)
              PA1     0.156    0.130    0.132            0.194       0.209       0.223
                      (0.014)  (0.014)  (0.013)          (0.018)     (0.020)     (0.020)
              PA2     0.996    0.992    0.984            0.979       0.966       0.966
                      (0.010)  (0.015)  (0.017)          (0.012)     (0.013)     (0.011)
              AUC     0.576    0.632    0.671            0.586       0.680       0.736
                      (0.009)  (0.014)  (0.013)          (0.011)     (0.013)     (0.010)
              G-mean  0.393    0.359    0.360            0.435       0.449       0.464
                      (0.018)  (0.020)  (0.019)          (0.020)     (0.021)     (0.021)
UNDER         PA      0.625    0.685    0.691            0.766       0.836       0.840
                      (0.065)  (0.056)  (0.049)          (0.017)     (0.012)     (0.012)
              PA1     0.742    0.841    0.863            0.798       0.871       0.878
                      (0.017)  (0.013)  (0.012)          (0.016)     (0.011)     (0.012)
              PA2     0.761    0.866    0.890            0.649       0.709       0.700
                      (0.017)  (0.013)  (0.010)          (0.051)     (0.039)     (0.028)
              AUC     0.693    0.822    0.861            0.723       0.833       0.850
                      (0.033)  (0.021)  (0.021)          (0.027)     (0.015)     (0.008)
              G-mean  0.689    0.770    0.784            0.719       0.786       0.784
                      (0.036)  (0.031)  (0.029)          (0.029)     (0.022)     (0.017)
Overall predictive accuracy (PA), predictive accuracy for Class 1 (PA1), predictive accuracy for Class 2 (PA2), area under the ROC curve (AUC) and G-mean for 1-NN, 3-NN and 5-NN on the Miller data set, under three methods of training set manipulation: no correction (NC), SMOTE and undersampling (UNDER). In the NC rows, the values in brackets are the results obtained by adjusting the threshold for 5-NN (CUT-OFF). The two prediction tasks are estrogen receptor status (ER) and grade of the tumor (Grade). All variables were considered when training the classifiers.
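As a reading aid, the G-mean is commonly defined as the geometric mean of the two class-specific accuracies, sqrt(PA1 × PA2). The caption does not spell out the formula, so treating the table's G-mean this way is an assumption, although the NC rows are consistent with it to within rounding. The function name `g_mean` below is illustrative only, and the inputs are the rounded ER values for 5-NN from the NC rows.

```python
import math


def g_mean(pa1: float, pa2: float) -> float:
    """Geometric mean of the class-specific accuracies (assumed definition)."""
    return math.sqrt(pa1 * pa2)


# ER, 5-NN, no correction: PA1 = 0.972, PA2 = 0.265
uncorrected = g_mean(0.972, 0.265)

# ER, 5-NN, adjusted threshold (CUT-OFF): PA1 = 0.789, PA2 = 0.706
cut_off = g_mean(0.789, 0.706)

print(f"uncorrected: {uncorrected:.3f}, cut-off: {cut_off:.3f}")
```

Because the inputs are already rounded to three decimals, the recomputed values can differ from the tabulated G-mean in the last digit. The comparison still shows why G-mean penalizes a classifier whose accuracy is high on one class and low on the other.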