Table 9 Runs on the test set (after code correction)

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

| Run | Precision | Recall | F-Score | MCC | AUC iP/R | Total Docs Evaluated |
|---|---|---|---|---|---|---|
| All | 2.50% | 93.17% | 0.0487 | 0.0908 | 0.1852 | 222 |
| Top 40 | 4.83% | 82.92% | 0.0913 | 0.1604 | 0.1583 | 222 |
| RScore ≥6 | 26.61% | 50.58% | 0.3488 | 0.3535 | 0.1522 | 214 |
| RScore ≥7 | 28.44% | 48.62% | 0.3589 | 0.3591 | 0.1524 | 210 |

  1. The table shows the results of running our corrected program on the BC 3 test set. The measures reported are precision, recall, F-score, Matthews Correlation Coefficient (MCC), area under the interpolated precision/recall curve (AUC iP/R), and the total number of articles evaluated by our program.
  2. The rows reflect four different runs. The first (All) is based on pattern-matching of methods to the text alone. The second (Top 40) scores the sentence-method associations and reports the 40 top-scoring methods. The third (RScore ≥6) reports the top-scoring methods whose raw score was at least 6, and the last (RScore ≥7) reports the top-scoring methods whose raw score was at least 7.
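
As a consistency check on the reported figures, the F-score of each run can be recomputed from its precision and recall. The sketch below assumes that the F-score in the table is the balanced F-score (the harmonic mean of precision and recall); the function and variable names are illustrative and are not taken from the authors' program.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) for each run, copied from the table.
# Precision and recall are given as fractions rather than percentages.
RUNS = {
    "All": (0.0250, 0.9317, 0.0487),
    "Top 40": (0.0483, 0.8292, 0.0913),
    "RScore >= 6": (0.2661, 0.5058, 0.3488),
    "RScore >= 7": (0.2844, 0.4862, 0.3589),
}


def recomputed_f_scores() -> dict:
    """Recompute the F-score of every run from its precision and recall."""
    return {name: f_score(p, r) for name, (p, r, _) in RUNS.items()}


if __name__ == "__main__":
    for name, value in recomputed_f_scores().items():
        reported = RUNS[name][2]
        print(f"{name}: recomputed {value:.4f}, reported {reported:.4f}")
```

Because the precision and recall in the table are themselves rounded, a recomputed value may differ from the reported one in the fourth decimal place.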