Table 1 Protein name recognition performance

From: How to make the most of NE dictionaries in statistical NER

| Method | Boundary | R | P | F |
|---|---|---|---|---|
| Tagging | | | | |
| (a) POS/PROTEIN tagging | Full | 52.91 | 43.85 | 47.96 |
| | Left | 61.48 | 50.95 | 55.72 |
| | Right | 61.38 | 50.87 | 55.63 |
| Sequential Labelling | | | | |
| (b) Word feature | Full | 63.23 | 70.39 | 66.62 |
| | Left | 68.15 | 75.86 | 71.80 |
| | Right | 69.88 | 77.79 | 73.63 |
| (c) (b) + orthographic feature | Full | 77.17 | 67.52 | 72.02 |
| | Left | 82.51 | 72.20 | 77.01 |
| | Right | 84.29 | 73.75 | 78.67 |
| (d) (c) + POS feature | Full | 76.46 | 68.41 | 72.21 |
| | Left | 81.94 | 73.32 | 77.39 |
| | Right | 83.54 | 74.75 | 78.90 |
| (e) (d) + PROTEIN feature | Full | 77.58 | 69.18 | 73.14 |
| | Left | 82.69 | 73.74 | 77.96 |
| | Right | 84.37 | 75.24 | 79.54 |
| (f) (e) after adding protein names in the training set to the lexicon | Full | 79.85 | 68.58 | 73.78 |
| | Left | 84.82 | 72.85 | 78.38 |
| | Right | 86.60 | 74.37 | 80.02 |

Protein name recognition performance of the proposed method, evaluated by recall (R), precision (P), and F-measure (F). Recognition performance was measured for the left boundary (Left), the right boundary (Right), and both boundaries (Full). (a) Performance of POS/PROTEIN tagging. (b) Performance of sequential labelling using the word feature only. (c) Performance of sequential labelling using the word and orthographic features. (d) Performance of sequential labelling using the word, orthographic, and POS features. (e) Performance of sequential labelling using the word, orthographic, POS, and PROTEIN name features. (f) Performance of sequential labelling with the features used in (e) after adding protein names appearing in the training set to the lexicon. NB: no retraining was conducted.
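
The caption names the three scores but does not spell out how F is obtained from R and P. The sketch below assumes F is the balanced F-measure (the harmonic mean of recall and precision), which the caption does not state explicitly. The function name `f_measure` is illustrative and not from the paper. A few (R, P, F) triples are copied from the table so that the computed and reported values can be compared.

```python
def f_measure(recall: float, precision: float) -> float:
    """Balanced F-measure: harmonic mean of recall and precision.

    Assumption: the table's F column is the balanced (beta = 1) F-measure.
    """
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


# (R, P, F) triples copied from Table 1; the labels are for display only.
reported = {
    "(a) Full": (52.91, 43.85, 47.96),
    "(a) Left": (61.48, 50.95, 55.72),
    "(b) Full": (63.23, 70.39, 66.62),
    "(f) Right": (86.60, 74.37, 80.02),
}

for label, (r, p, f) in reported.items():
    print(f"{label}: computed F = {f_measure(r, p):.2f}, reported F = {f:.2f}")
```

For these rows the computed values agree with the reported F column to two decimal places, which is consistent with the balanced F-measure assumption. Because R and P in the table are themselves rounded, a difference of 0.01 on other rows would not be surprising.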