
Table 7 Statistical comparison of EBD and FI discretization methods

From: Application of an efficient Bayesian discretization method to biomedical data

| Evaluation Measure | Method | Mean (SEM) | Difference of Means | Z statistic (p-value) |
|---|---|---|---|---|
| C4.5 Accuracy [0%, 100%] | EBD | 73.49% (2.07) | 2.01 | 2.219 (**0.026**) |
| | FI | 71.48% (2.12) | | |
| C4.5 AUC [50%, 100%] | EBD | 73.22% (1.89) | 1.07 | 2.732 (**0.007**) |
| | FI | 72.15% (1.77) | | |
| C4.5 Robustness [0%, ∞] | EBD | 72.55% (2.81) | -0.26 | -0.261 (0.794) |
| | FI | 72.81% (2.76) | | |
| NB Accuracy [0%, 100%] | EBD | 77.55% (2.65) | 0.76 | 2.080 (**0.038**) |
| | FI | 76.79% (2.32) | | |
| NB AUC [0%, 100%] | EBD | 74.83% (1.43) | 1.11 | 2.711 (**0.007**) |
| | FI | 73.71% (1.24) | | |
| NB Robustness [50%, ∞] | EBD | 81.72% (2.92) | -0.68 | -0.016 (0.987) |
| | FI | 82.40% (2.59) | | |
| Stability [0, 1] | EBD | 0.74 (0.025) | 0.02 | 1.972 (**0.049**) |
| | FI | 0.72 (0.029) | | |
| Mean # of intervals per predictor [1, n] | EBD | 1.27 (0.074) | 0.11 | 1.686 (0.092) |
| | FI | 1.16 (0.038) | | |

  1. In the first column, the range of each measure is given in square brackets, where n is the number of instances in the dataset. In the last column, the Z statistic is followed by the corresponding p-value in parentheses. On all performance measures except the mean number of intervals per predictor, the Z statistic is positive when EBD performs better than FI. Two-tailed p-values of 0.05 or smaller are in bold, indicating that EBD performed statistically significantly better at that level.
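
As an illustration of how a two-tailed p-value relates to a Z statistic, the following minimal sketch converts Z to p under the assumption of a standard normal reference distribution. The function names `two_tailed_p` and `is_significant` are hypothetical, and the test procedure actually used to produce the table is not stated in this excerpt, so agreement with the tabulated values should be expected only approximately.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a Z statistic.

    Assumption: Z follows a standard normal distribution under the null
    hypothesis. Then P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    """
    return math.erfc(abs(z) / math.sqrt(2.0))


def is_significant(z: float, alpha: float = 0.05) -> bool:
    """True when the two-tailed p-value is alpha or smaller."""
    return two_tailed_p(z) <= alpha


# Z statistic from the C4.5 Accuracy row of the table.
print(round(two_tailed_p(2.219), 3))  # prints 0.026

# Z statistic from the "Mean # of intervals per predictor" row.
print(is_significant(1.686))  # prints False
```

The sign of Z is discarded because the test is two-tailed; the direction of the difference (EBD better or worse than FI) has to be read from the sign of Z itself.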