Table 2 Cross validation statistics.

From: MScanner: a classifier for retrieving Medline citations

| Statistic | PG07 | Radiology | AIDSBio | Control |
|---|---|---|---|---|
| # Relevant | 1663 | 67 | 10727 | 10000 |
| # Irrelevant | 99986 | 100000 | 99927 | 99955 |
| Prevalence | 0.01636 | 0.00067 | 0.09702 | 0.09095 |
| ROC Area | 0.9754 | 0.9923 | 0.9913 | 0.4975 |
| ROC Std Error | 0.0020 | 0.0047 | 0.0004 | 0.0030 |
| Averaged Precision | 0.693 | 0.711 | 0.924 | 0.090 |
| Break-Even | 0.652 | 0.642 | 0.884 | 0.089 |

  1. Summary of the cross-validation training sets and performance metrics. Prevalence is the fraction of the data that is relevant, and break-even is the point where cross-validation precision equals recall.
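
The two definitions in the caption can be illustrated with a minimal Python sketch. The function names `prevalence` and `break_even` are hypothetical and are not taken from MScanner. The break-even calculation assumes one common approach: with R relevant items in total, precision and recall are equal at rank R of the score-ordered list, so the value is the fraction of the top R items that are relevant. The paper may compute it differently.

```python
def prevalence(n_relevant, n_irrelevant):
    """Fraction of the data that is relevant."""
    return n_relevant / (n_relevant + n_irrelevant)


def break_even(labels_ranked):
    """Precision (= recall) at rank R, where R is the number of relevant items.

    labels_ranked: list of booleans ordered by descending classifier score,
    True meaning the item is relevant. This is an assumed input layout.
    """
    n_relevant = sum(labels_ranked)
    if n_relevant == 0:
        raise ValueError("break-even is undefined without relevant items")
    # At rank R both precision and recall equal (true positives in top R) / R.
    return sum(labels_ranked[:n_relevant]) / n_relevant


if __name__ == "__main__":
    # Counts from the PG07 column of the table.
    print(round(prevalence(1663, 99986), 5))
    # Toy ranking, invented purely for illustration.
    print(break_even([True, False, True, True, False, False]))
```

Applied to the PG07 counts (1663 relevant, 99986 irrelevant), `prevalence` reproduces the tabulated value when rounded to five decimal places.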