
Table 11 Effect of the trained POS tagger on biomedical literature. POS(A) denotes the original, newswire-trained POS tagger. POS(B) denotes the POS tagger trained on GENIA corpus 3.02p. "full(A)" denotes all features with the original, newswire-trained POS tagger. "full(B)" denotes all features with the POS tagger trained on GENIA corpus 3.02p. The parenthesized values are p-values. We compare two cases: "word+POS(A)+pc." vs. "word+POS(B)+pc." and "full(A)" vs. "full(B)". The values in bold differ significantly in the comparison. A difference is labeled statistically significant when the p-value is less than 0.05 on the Wilcoxon signed-rank test (two-sided).

From: Gene/protein name recognition based on support vector machine using dictionary as features

 

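| | word+POS(A)+pc. | (p-value) | word+POS(B)+pc. | full(A) | (p-value) | full(B) |
|---|---|---|---|---|---|---|
| Precision | 0.7813 | (0.105) | 0.7867 | 0.8189 | (0.557) | 0.8177 |
| Recall | 0.6423 | (0.002) | 0.6147 | 0.7661 | (0.322) | 0.7640 |
| Balanced f-score | 0.7118 | (0.002) | 0.6900 | 0.7916 | (0.432) | 0.7899 |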
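The significance rule in the caption (two-sided Wilcoxon signed-rank test, threshold 0.05) can be illustrated with a minimal sketch. The code below is not the authors' implementation. The function name `wilcoxon_signed_rank`, the per-run recall values, and the choice of ten paired runs are all hypothetical, made up for illustration. The sketch assumes no zero differences and no tied absolute differences, and computes the exact p-value by enumerating every sign assignment.

```python
from itertools import product


def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Sketch only: assumes no zero differences and no tied absolute
    differences, so the ranks are simply 1..n.
    Returns (statistic, p_value) with statistic = min(W+, W-).
    """
    diffs = [x - y for x, y in zip(a, b)]
    if any(d == 0 for d in diffs):
        raise ValueError("zero differences are not handled in this sketch")

    # Rank the absolute differences from smallest (1) to largest (n).
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank

    n = len(diffs)
    total = n * (n + 1) // 2
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    stat = min(w_plus, total - w_plus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2, so enumerate all 2**n sign assignments.
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(range(1, n + 1), signs) if s)
        if min(w, total - w) <= stat:
            extreme += 1
    return stat, extreme / 2 ** n


# Hypothetical per-run recall for two configurations (NOT data from the paper).
recall_a = [0.65, 0.64, 0.66, 0.63, 0.65, 0.64, 0.66, 0.62, 0.64, 0.65]
drops = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.10]
recall_b = [r - d for r, d in zip(recall_a, drops)]

stat, p = wilcoxon_signed_rank(recall_a, recall_b)
print(stat, p, p < 0.05)  # 0 0.001953125 True
```

With ten pairs, the smallest two-sided p-value this test can produce is 2/1024, about 0.002, which occurs when every paired difference has the same sign. For real data with ties or zero differences, a library routine such as `scipy.stats.wilcoxon` is a better choice than this enumeration.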