Table 3. The χ² p-values for the fit to the diagonal in the reliability diagram, number of calibrated points, and difference between the maximum and minimum calibrated probabilities (range) for the k-means classifier presented in Fig. 4

From: A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support

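| Data set | BBQ: χ² p-value | BBQ: calibrated points | BBQ: range | Proposed method: χ² p-value | Proposed method: calibrated points | Proposed method: range |
|---|---|---|---|---|---|---|
| Lung Cancer | 0.087 | 2 | 0.27 | 0.038 | 3 | 0.62 |
| SPECT | <0.001 | 4 | 0.75 | <0.001 | 5 | 0.79 |
| Parkinsons | 0.544 | 2 | 0.11 | 0.006 | 3 | 0.28 |
| Arcene | 0.032 | 3 | 0.61 | 0.623 | 5 | 0.60 |
| Suicide | 0.497 | 2 | 0.05 | 0.724 | 4 | 0.34 |
| Arrhythmia | 0.389 | 2 | 0.26 | 0.012 | 4 | 0.43 |
| Breast Cancer | <0.001 | 3 | 0.96 | <0.001 | 8 | 0.98 |
| Contraception | 0.867 | 1 | 0.003 | 0.380 | 4 | 0.52 |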
  1. The data sets with large overlaps in the score distributions are emphasized in boldface. The proposed method achieves a larger number of calibrated points on every data set and a wider range of calibrated probabilities on all but Arcene (0.60 versus 0.61 for BBQ). Note that under BBQ the Contraception data set has only one calibration point on the reliability diagram but a nonzero range (0.003). This is because the number of calibration points is counted from the (binned) points in the reliability diagram, so distinct calibrated probabilities that fall into the same bin count as a single point
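
The footnote's remark about a single calibration point with a nonzero range can be illustrated with a minimal sketch. It assumes a reliability diagram with 10 equal-width bins on [0, 1]; the bin count, the function name `calibration_summary`, and the example probabilities are all hypothetical and are not taken from the paper.

```python
import numpy as np

def calibration_summary(calibrated_probs, n_bins=10):
    """Count occupied reliability-diagram bins and the probability range.

    Assumption: equal-width bins on [0, 1]; the paper's actual binning
    scheme is not stated in this table.
    """
    probs = np.asarray(calibrated_probs, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Use interior edges only, so indices run 0..n_bins-1 and 1.0 lands in the last bin.
    bin_idx = np.digitize(probs, edges[1:-1])
    n_points = np.unique(bin_idx).size        # number of (binned) calibrated points
    prob_range = probs.max() - probs.min()    # max minus min calibrated probability
    return n_points, prob_range

# Hypothetical calibrated outputs that differ slightly but share one bin.
n_points, prob_range = calibration_summary([0.412, 0.413, 0.415, 0.414])
print(n_points, round(prob_range, 3))
```

With these made-up values, all probabilities fall into the same bin, so the count of calibrated points is 1 while the range is still about 0.003, the same pattern as the BBQ entry for Contraception.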