Table 7 Statistical comparison of EBD and FI discretization methods

From: Application of an efficient Bayesian discretization method to biomedical data

| Evaluation Measure [range] | Method | Mean (SEM) | Difference of Means | Z statistic (p-value) |
|---|---|---|---|---|
| C4.5 Accuracy | EBD | 73.49% (2.07) | 2.01 | 2.219 |
| [0%, 100%] | FI | 71.48% (2.12) | | (**0.026**) |
| C4.5 AUC | EBD | 73.22% (1.89) | 1.07 | 2.732 |
| [50%, 100%] | FI | 72.15% (1.77) | | (**0.007**) |
| C4.5 Robustness | EBD | 72.55% (2.81) | -0.26 | -0.261 |
| [0%, ∞] | FI | 72.81% (2.76) | | (0.794) |
| NB Accuracy | EBD | 77.55% (2.65) | 0.76 | 2.080 |
| [0%, 100%] | FI | 76.79% (2.32) | | (**0.038**) |
| NB AUC | EBD | 74.83% (1.43) | 1.11 | 2.711 |
| [0%, 100%] | FI | 73.71% (1.24) | | (**0.007**) |
| NB Robustness | EBD | 81.72% (2.92) | -0.68 | -0.016 |
| [50%, ∞] | FI | 82.40% (2.59) | | (0.987) |
| Stability | EBD | 0.74 (0.025) | 0.02 | 1.972 |
| [0, 1] | FI | 0.72 (0.029) | | (**0.049**) |
| Mean # of intervals per predictor | EBD | 1.27 (0.074) | 0.11 | 1.686 |
| [1, n] | FI | 1.16 (0.038) | | (0.092) |
  1. In the first column, the range of each measure is given in square brackets, where n is the number of instances in the dataset. In the last column, the upper number is the Z statistic and the lower number, in parentheses, is the corresponding p-value. On all performance measures except the mean number of intervals per predictor, the Z statistic is positive when EBD performs better than FI. Two-tailed p-values of 0.05 or smaller are in bold, indicating that EBD performed statistically significantly better at that level.
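
The relationship between each Z statistic and its two-tailed p-value can be checked with a short sketch. It assumes the Z statistic is referred to a standard normal distribution, which the table does not state explicitly; the function name `two_tailed_p` and the `rows` list are illustrative only, and the values are copied from the table.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a Z statistic, assuming a standard normal reference."""
    # For Z ~ N(0, 1): P(|Z| >= |z|) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# (measure, Z statistic, p-value as reported in the table)
rows = [
    ("C4.5 Accuracy", 2.219, 0.026),
    ("C4.5 Robustness", -0.261, 0.794),
    ("NB Accuracy", 2.080, 0.038),
    ("Stability", 1.972, 0.049),
    ("Mean # of intervals per predictor", 1.686, 0.092),
]

for measure, z, reported in rows:
    p = two_tailed_p(z)
    # The footnote's threshold: two-tailed p-values of 0.05 or smaller are flagged
    flag = "p <= 0.05" if p <= 0.05 else "p > 0.05"
    print(f"{measure}: Z = {z:+.3f}, p = {p:.3f} (reported {reported:.3f}), {flag}")
```

Under this assumption the recomputed p-values should agree with the reported ones up to rounding in the last digit; the sign of Z does not affect a two-tailed p-value, only the direction of the difference.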