Table 1 Protein name recognition performance

From: How to make the most of NE dictionaries in statistical NER

| Approach | Configuration | Boundary | R | P | F |
|---|---|---|---|---|---|
| Tagging | (a) POS/PROTEIN tagging | Full | 52.91 | 43.85 | 47.96 |
| | | Left | 61.48 | 50.95 | 55.72 |
| | | Right | 61.38 | 50.87 | 55.63 |
| Sequential labelling | (b) Word feature | Full | 63.23 | 70.39 | 66.62 |
| | | Left | 68.15 | 75.86 | 71.80 |
| | | Right | 69.88 | 77.79 | 73.63 |
| | (c) (b) + orthographic feature | Full | 77.17 | 67.52 | 72.02 |
| | | Left | 82.51 | 72.20 | 77.01 |
| | | Right | 84.29 | 73.75 | 78.67 |
| | (d) (c) + POS feature | Full | 76.46 | 68.41 | 72.21 |
| | | Left | 81.94 | 73.32 | 77.39 |
| | | Right | 83.54 | 74.75 | 78.90 |
| | (e) (d) + PROTEIN feature | Full | 77.58 | 69.18 | 73.14 |
| | | Left | 82.69 | 73.74 | 77.96 |
| | | Right | 84.37 | 75.24 | 79.54 |
| | (f) (e) after adding protein names in the training set to the lexicon | Full | 79.85 | 68.58 | 73.78 |
| | | Left | 84.82 | 72.85 | 78.38 |
| | | Right | 86.60 | 74.37 | 80.02 |
Protein name recognition performance of the proposed method, evaluated by recall (R), precision (P), and F-measure (F). Recognition performance was measured for the left boundary (Left), the right boundary (Right), and both boundaries (Full). (a) POS/PROTEIN tagging. (b) Sequential labelling using the word feature only. (c) Sequential labelling using the word and orthographic features. (d) Sequential labelling using the word, orthographic, and POS features. (e) Sequential labelling using the word, orthographic, POS, and PROTEIN name features. (f) Sequential labelling with the features used in (e), after adding protein names appearing in the training set to the lexicon. NB: no retraining was conducted.
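
The table reports the F-measure without stating how it is computed. The sketch below assumes it is the balanced F-measure (F1), the harmonic mean of precision and recall. The function name `f_measure` is hypothetical and is not taken from the article. Under that assumption, the reported F values for rows (a) and (b) (Full) are reproduced to two decimal places. Other rows may differ in the last digit because R and P are themselves rounded.

```python
def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the table uses the balanced F-measure; the source does not
    state the formula explicitly. Inputs and output are percentages.
    """
    if recall + precision == 0:
        return 0.0  # avoid division by zero when nothing is recognised
    return 2 * precision * recall / (precision + recall)


# (R, P) pairs copied from the "Full" rows (a) and (b) of Table 1.
rows = {
    "(a) POS/PROTEIN tagging": (52.91, 43.85),
    "(b) Word feature": (63.23, 70.39),
}

for name, (r, p) in rows.items():
    print(f"{name}: F = {f_measure(r, p):.2f}")
# (a) POS/PROTEIN tagging: F = 47.96
# (b) Word feature: F = 66.62
```

Because F1 is a harmonic mean, it is pulled towards the lower of the two scores. This is why row (f), despite the highest recall in the table, gains only modestly in F over row (e): its precision drops slightly.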