Table 3 The χ² p-values for the fit to the diagonal in the reliability diagram, number of calibrated points, and difference between the maximum and minimum calibrated probabilities (range) for the k-means classifier presented in Fig. 4

From: A nonparametric Bayesian method of translating machine learning scores to probabilities in clinical decision support

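| Data set | BBQ: χ² p-value | BBQ: calibrated points | BBQ: range | Proposed method: χ² p-value | Proposed method: calibrated points | Proposed method: range |
|---|---|---|---|---|---|---|
| Lung Cancer | 0.087 | 2 | 0.27 | 0.038 | 3 | 0.62 |
| SPECT | <0.001 | 4 | 0.75 | <0.001 | 5 | 0.79 |
| Parkinsons | 0.544 | 2 | 0.11 | 0.006 | 3 | 0.28 |
| Arcene | 0.032 | 3 | 0.61 | 0.623 | 5 | 0.60 |
| Suicide | 0.497 | 2 | 0.05 | 0.724 | 4 | 0.34 |
| Arrhythmia | 0.389 | 2 | 0.26 | 0.012 | 4 | 0.43 |
| Breast Cancer | <0.001 | 3 | 0.96 | <0.001 | 8 | 0.98 |
| Contraception | 0.867 | 1 | 0.003 | 0.380 | 4 | 0.52 |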

The data sets with large overlaps in the score distributions are emphasized in boldface. The proposed method achieves a larger number of calibrated points for every data set and a wider range of calibrated probabilities for all but Arcene (0.60 versus 0.61 for BBQ). Note that for BBQ the Contraception data set has a single calibrated point on the reliability diagram but a nonzero range. This is because the number of calibrated points is counted from the (binned) points in the reliability diagram, so calibrated probabilities that fall into the same bin count as a single point.
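The difference between the binned point count and the range described in the table note can be illustrated with a minimal sketch. The binning scheme used in the paper is not stated in this table, so the equal-width bins, the `n_bins=10` default, the function name `summarize_calibration`, and the input values below are all assumptions chosen for illustration only; they are not the paper's data or method.

```python
def summarize_calibration(probs, n_bins=10):
    """Return (occupied_bins, prob_range) for a list of calibrated probabilities.

    Assumes equal-width bins on [0, 1]; this is an illustrative choice,
    not necessarily the binning used in the paper.
    """
    if not probs:
        raise ValueError("probs must be non-empty")
    # Map each probability to a bin index; p == 1.0 is placed in the last bin.
    occupied = {min(int(p * n_bins), n_bins - 1) for p in probs}
    # Range is the spread of the raw (unbinned) calibrated probabilities.
    return len(occupied), max(probs) - min(probs)


# Hypothetical values: three distinct probabilities that share one bin.
points, prob_range = summarize_calibration([0.420, 0.421, 0.423])
print(points, round(prob_range, 3))  # prints "1 0.003"
```

Because the point count is taken after binning while the range is taken from the raw probabilities, a single binned point can coexist with a small nonzero range.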