
Table 8 IMT Runs on the training set (after code correction)

From: A linear classifier based on entity recognition tools and a statistical approach to method extraction in the protein-protein interaction literature

| Run | Precision | Recall | F-Score | MCC | AUC iP/R | Total Docs Evaluated |
|---|---|---|---|---|---|---|
| All | 2.38% | 94.80% | 0.0465 | 0.0937 | 0.2032 | 2002 |
| Top 40 | 4.54% | 85.16% | 0.0864 | 0.1598 | 0.2063 | 2002 |
| RScore ≥6 | 26.30% | 58.72% | 0.3633 | 0.3806 | 0.1997 | 1947 |
| RScore ≥7 | 29.14% | 50.25% | 0.3689 | 0.3711 | 0.1816 | 1871 |

  1. The table shows the results of running our (corrected) program on the BC 3 training set. The measurements shown are precision, recall, F-score, Matthews correlation coefficient (MCC), area under the interpolated precision/recall curve (AUC iP/R), and the total number of articles evaluated by our program.
  2. The rows reflect four different runs: the first (All) is based on pattern-matching of methods to the text alone; the second (Top 40) scores the sentence-method associations and reports the top 40 scoring methods; the third (RScore ≥6) reports the top-scoring methods whose raw score was at least 6; and the fourth (RScore ≥7) reports the top-scoring methods whose raw score was at least 7.
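
The F-score column can be checked against the precision and recall columns. The sketch below assumes the F-score is the balanced F1 measure (the harmonic mean of precision and recall), which the table notes do not state explicitly. The function name `f_score` and the `rows` dictionary are illustrative, and the values are copied from the table. Under that assumption the computed values agree with the reported ones to within rounding of the published percentages.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall.

    Assumption: the table's F-Score column uses this definition.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) copied from the table rows,
# with percentages converted to fractions.
rows = {
    "All": (0.0238, 0.9480, 0.0465),
    "Top 40": (0.0454, 0.8516, 0.0864),
    "RScore >=6": (0.2630, 0.5872, 0.3633),
    "RScore >=7": (0.2914, 0.5025, 0.3689),
}

for run, (p, r, reported) in rows.items():
    computed = f_score(p, r)
    print(f"{run:10s} computed={computed:.4f} reported={reported:.4f}")
```

The small differences in the first two rows are consistent with the precision values having been rounded to two decimal places before publication.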