Model | GENIA | CRAFT |
---|---|---|
MarMoT | 98.61 | 97.07 |
jPTDP-v1 | 98.66 | 97.24 |
NLP4J-POS | 98.80 | 97.43 |
BiLSTM-CRF | 98.44 | 97.25 |
+ CNN-char | 98.89 | 97.51 |
+ LSTM-char | 98.85 | 97.56 |
Stanford tagger [⋆] | 98.37 | _ |
GENIA tagger [⋆] | 98.49 | _ |
- [⋆] denotes a result obtained with a pre-trained POS tagger. We do not report accuracies of the pre-trained POS taggers on CRAFT because CRAFT uses an extended PTB POS tag set (i.e., CRAFT contains POS tags that are not defined in the original PTB POS tag set). Corpus-level accuracy differences of at least 0.17% on GENIA and 0.26% on CRAFT between two POS tagging models are significant at p ≤ 0.05. To measure significance, we compute sentence-level accuracies for each model and then apply a paired t-test.
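
The significance procedure described in the note above (sentence-level accuracies followed by a paired t-test) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names `sentence_accuracies` and `paired_t_test` are hypothetical, the toy tag sequences are invented for the example, and the two-sided p-value uses a normal approximation to the t distribution, which is reasonable only when the number of sentences is large.

```python
import math
from statistics import NormalDist, mean, stdev


def sentence_accuracies(gold, pred):
    """Per-sentence tagging accuracy.

    `gold` and `pred` are lists of sentences; each sentence is a non-empty
    list of POS tags, and the two are assumed to be aligned token by token.
    """
    return [
        sum(g == p for g, p in zip(gold_tags, pred_tags)) / len(gold_tags)
        for gold_tags, pred_tags in zip(gold, pred)
    ]


def paired_t_test(acc_a, acc_b):
    """Paired t-test on two aligned lists of sentence-level accuracies.

    Returns (t, p), where p is a two-sided p-value computed with a normal
    approximation (an assumption of this sketch, valid only for large n).
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    n = len(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p


# Toy data (invented): gold tags and the outputs of two hypothetical taggers.
gold = [["DT", "NN"], ["NN", "VBZ", "JJ", "NN"], ["PRP", "VBD"], ["IN", "DT", "NN", "CC"]]
model_a = [["DT", "NN"], ["NN", "VBZ", "NN", "VBZ"], ["PRP", "VBD"], ["IN", "DT", "NN", "NN"]]
model_b = [["DT", "VB"], ["NN", "VBZ", "NN", "VBZ"], ["PRP", "VBD"], ["IN", "NN", "NN", "NN"]]

acc_a = sentence_accuracies(gold, model_a)
acc_b = sentence_accuracies(gold, model_b)
t_stat, p_value = paired_t_test(acc_a, acc_b)
```

With only a handful of sentences, as in the toy data, the normal approximation is too optimistic; an exact t-distribution p-value (with n − 1 degrees of freedom) would be the appropriate choice there.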