Table 2 Cross validation statistics.

From: MScanner: a classifier for retrieving Medline citations

Statistic           PG07     Radiology  AIDSBio  Control
# Relevant          1663     67         10727    10000
# Irrelevant        99986    100000     99927    99955
Prevalence          0.01636  0.00067    0.09702  0.09095
ROC Area            0.9754   0.9923     0.9913   0.4975
ROC Std Error       0.0020   0.0047     0.0004   0.0030
Averaged Precision  0.693    0.711      0.924    0.090
Break-Even          0.652    0.642      0.884    0.089
Summary of the cross validation training sets and performance metrics. Prevalence is the fraction of the data that is relevant, and break-even is the point at which cross validation precision equals recall.
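
The two definitions in the caption can be illustrated with a short Python sketch. It is not taken from MScanner: the function names and the toy scores below are hypothetical, and the break-even is computed in one common way, as precision at rank R, where R is the number of relevant items (at that rank precision and recall are equal by construction). The article's own calculation may differ in detail.

```python
def prevalence(n_relevant, n_irrelevant):
    """Fraction of the data that is relevant."""
    return n_relevant / (n_relevant + n_irrelevant)


def break_even(scores, labels):
    """Precision at the rank where precision equals recall.

    Assumption: items are ranked by descending score and the cut-off is
    placed at R = number of relevant items, so that
    precision = hits / R = recall.
    labels are 1 for relevant, 0 for irrelevant.
    """
    n_relevant = sum(labels)
    if n_relevant == 0:
        raise ValueError("break-even is undefined without relevant items")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_relevant])
    return hits / n_relevant


# Prevalence from the PG07 column of the table.
pg07_prevalence = prevalence(1663, 99986)

# Toy ranking (hypothetical scores, not data from the article).
toy_break_even = break_even([0.9, 0.8, 0.7, 0.6, 0.5], [1, 0, 1, 1, 0])
```

Rounded to five decimal places, pg07_prevalence matches the 0.01636 reported in the PG07 column. In the toy ranking, two of the top three items are relevant, so precision and recall are both 2/3 at the cut-off.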