Table 18 Part of speech tagging results on the CRAFT public release data (70% set)

From: A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools

| POS Tagger               | Precision   | Recall      | F-measure   |
|--------------------------|-------------|-------------|-------------|
| LingPipe (Brown model)   | 0.59 (0.90) | 0.58 (0.84) | 0.59 (0.87) |
| LingPipe (MedPost model) | 0.47 (0.88) | 0.46 (0.83) | 0.46 (0.85) |
| LingPipe (Genia model)   | 0.79 (0.88) | 0.76 (0.85) | 0.77 (0.87) |
| OpenNLP                  | 0.82 (0.86) | 0.74 (0.77) | 0.78 (0.81) |

  1. Numbers in parentheses indicate the upper-bound performance potential of the tools, calculated by removing occurrences of tags that did not align to the gold-standard tagset.
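
The relationship between the three columns can be checked with a short sketch. It assumes that the F-measure reported here is the balanced F1 score, that is, the harmonic mean of precision and recall; the table itself does not state which variant was used. The names `f_measure`, `TABLE_18`, and `max_deviation`, and the row labels "as reported" and "upper bound", are illustrative and do not come from the source.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta=1.0 gives the harmonic mean of precision and recall.

    Assumption: the table's F-measure is the balanced F1 (beta=1.0).
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# (precision, recall, reported F-measure) copied from Table 18.
# First tuple: main figures; second tuple: figures in parentheses (upper bound).
TABLE_18 = {
    "LingPipe (Brown model)": ((0.59, 0.58, 0.59), (0.90, 0.84, 0.87)),
    "LingPipe (MedPost model)": ((0.47, 0.46, 0.46), (0.88, 0.83, 0.85)),
    "LingPipe (Genia model)": ((0.79, 0.76, 0.77), (0.88, 0.85, 0.87)),
    "OpenNLP": ((0.82, 0.74, 0.78), (0.86, 0.77, 0.81)),
}


def max_deviation(table=TABLE_18) -> float:
    """Largest gap between the recomputed F1 and the reported F-measure."""
    return max(
        abs(f_measure(p, r) - reported)
        for rows in table.values()
        for (p, r, reported) in rows
    )


if __name__ == "__main__":
    for tagger, rows in TABLE_18.items():
        for label, (p, r, reported) in zip(("as reported", "upper bound"), rows):
            print(f"{tagger:26s} {label:12s} "
                  f"computed={f_measure(p, r):.3f} reported={reported:.2f}")
    print(f"max deviation: {max_deviation():.4f}")
```

Because the inputs are the two-decimal values printed in the table, the recomputed scores can differ from the published ones in the last digit. For the LingPipe (Brown model) row, for example, the rounded precision and recall give roughly 0.585 against a reported 0.59, which is plausibly the result of the published figure being computed from unrounded values.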