
Table 8 IMT Runs on the training set (after code correction)

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

| Run | Precision | Recall | F-Score | MCC | AUC iP/R | Total Docs Evaluated |
|---|---|---|---|---|---|---|
| All | 2.38% | 94.80% | 0.0465 | 0.0937 | 0.2032 | 2002 |
| Top 40 | 4.54% | 85.16% | 0.0864 | 0.1598 | 0.2063 | 2002 |
| RScore ≥6 | 26.30% | 58.72% | 0.3633 | 0.3806 | 0.1997 | 1947 |
| RScore ≥7 | 29.14% | 50.25% | 0.3689 | 0.3711 | 0.1816 | 1871 |

  1. The table shows the results of running our corrected program on the BC 3 training set. The measures reported are precision, recall, F-score, Matthews correlation coefficient (MCC), area under the interpolated precision/recall curve (AUC iP/R), and the total number of articles evaluated by our program.
  2. The rows correspond to four runs. The first (All) is based solely on pattern-matching of methods to the text; the second (Top 40) scores the sentence-method associations and reports the 40 top-scoring methods; the third (RScore ≥6) reports the top-scoring methods whose raw score was at least 6; and the fourth (RScore ≥7) reports the top-scoring methods whose raw score was at least 7.
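
As a quick consistency check on the table, the F-score column can be recomputed from the precision and recall columns. The sketch below assumes the balanced F-score (the harmonic mean of precision and recall); the table itself does not state which F-score variant was used, and the names `f_score`, `ROWS`, and `compare_rows` are illustrative only. Under that assumption the recomputed values agree with the reported ones to within about 0.0002.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall.

    Assumption: the table's F-Score column is the balanced (F1) variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score), copied from the table rows,
# with percentages written as fractions.
ROWS = {
    "All": (0.0238, 0.9480, 0.0465),
    "Top 40": (0.0454, 0.8516, 0.0864),
    "RScore ≥6": (0.2630, 0.5872, 0.3633),
    "RScore ≥7": (0.2914, 0.5025, 0.3689),
}


def compare_rows(rows):
    """Map each run name to (recomputed F-score rounded to 4 places, reported F-score)."""
    return {
        name: (round(f_score(p, r), 4), reported)
        for name, (p, r, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, (recomputed, reported) in compare_rows(ROWS).items():
        print(f"{name}: recomputed {recomputed:.4f}, reported {reported:.4f}")
```

MCC and AUC iP/R cannot be recomputed this way, since they depend on the underlying confusion counts and the ranked result lists, which the table does not include.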