From: NERBio: using selected word conjunctions, term normalization, and global patterns to improve biomedical named entity recognition
| | protein | DNA | RNA | cell type | cell line | All |
|---|---|---|---|---|---|---|
| Training Set | 30,269 (15.1) | 9,533 (4.8) | 951 (0.5) | 6,713 (3.4) | 3,830 (1.9) | 51,301 (25.7) |
| Test Set | 5,067 (12.5) | 1,056 (2.6) | 118 (0.3) | 1,921 | 500 (1.2) | 8,662 (21.4) |