Table 9 Runs on the test set (after code correction)

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

| Run | Precision | Recall | F-Score | MCC | AUC iP/R | Total Docs Evaluated |
|---|---|---|---|---|---|---|
| All | 2.50% | 93.17% | 0.0487 | 0.0908 | 0.1852 | 222 |
| Top 40 | 4.83% | 82.92% | 0.0913 | 0.1604 | 0.1583 | 222 |
| RScore ≥6 | 26.61% | 50.58% | 0.3488 | 0.3535 | 0.1522 | 214 |
| RScore ≥7 | 28.44% | 48.62% | 0.3589 | 0.3591 | 0.1524 | 210 |
  1. The table shows the results of running our corrected program on the BC 3 test set. The measures shown are precision, recall, F-score, Matthews Correlation Coefficient (MCC), area under the interpolated precision/recall curve (AUC iP/R), and the total number of documents evaluated by our program.
  2. The rows reflect four different runs. The first (All) is based on pattern-matching of methods to the text alone; the second (Top 40) scores the sentence-method associations and reports the 40 top-scoring methods; the third (RScore ≥6) reports the top-scoring methods whose raw score was at least 6; and the last (RScore ≥7) reports the top-scoring methods whose raw score was at least 7.
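
As a reading aid, the sketch below illustrates two ideas from the notes above: how an F-score relates to precision and recall, and how the Top 40 and RScore runs could select methods from a scored list. It assumes the reported F-score is the balanced harmonic mean of the reported precision and recall, which the table's figures are consistent with to within rounding. The function names, the `scores` dictionary, and the method labels are hypothetical and are not taken from the authors' program.

```python
from typing import Dict, List, Optional


def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def select_methods(
    scores: Dict[str, float],
    top_k: Optional[int] = None,
    min_score: Optional[float] = None,
) -> List[str]:
    """Rank candidate methods by raw score, then filter.

    Hypothetical helper: `top_k` mimics a "Top 40"-style cut-off and
    `min_score` mimics an "RScore >= 6"-style threshold.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if min_score is not None:
        ranked = [(name, s) for name, s in ranked if s >= min_score]
    if top_k is not None:
        ranked = ranked[:top_k]
    return [name for name, _ in ranked]


# Precision and recall of the "All" run, taken from the table as fractions.
f_all = f_score(0.0250, 0.9317)

# Illustrative raw scores for made-up method labels (assumed values).
example_scores = {"method_a": 7.5, "method_b": 6.2, "method_c": 3.0}
kept_at_6 = select_methods(example_scores, min_score=6)
kept_at_7 = select_methods(example_scores, min_score=7)
```

Raising the threshold from 6 to 7 drops `method_b` in this toy example, which mirrors the trade-off visible in the table: stricter selection raises precision at the cost of recall.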