From: A neural network multi-task learning approach to biomedical named entity recognition
Dataset | Contents | Entity counts |
---|---|---|
AnatEM [38] | Anatomy NE | 13,701 |
BC2GM [2] | Gene/Protein NE | 24,583 |
BC4CHEMD [3] | Chemical NE | 84,310 |
BC5CDR [5] | Chemical, Disease NEs | Chemical: 15,935; Disease: 12,852 |
BioNLP09 [52] | Gene/Protein NE | 14,963 |
BioNLP11EPI [53] | Gene/Protein NE | 15,811 |
BioNLP11ID [53] | 4 NEs | Gene/Protein: 6551; Organism: 3471; |
 |  | Chemical: 973; Regulon-operon: 87 |
BioNLP13CG [54] | 16 NEs | Gene/Protein: 7908; Cell: 3492; Cancer: 2582; |
 |  | Chemical: 2270; Organism: 1715; Multi-tissue structure: 857; |
 |  | Tissue: 587; Cellular component: 569; Organ: 421; |
 |  | Organism substance: 283; Pathological formation: 228; Amino acid: 135; |
 |  | Immaterial anatomical entity: 102; Organism subdivision: 98; |
 |  | Anatomical system: 41; Developing anatomical structure: 35 |
BioNLP13GE [55] | Gene/Protein NE | 12,057 |
BioNLP13PC [56] | 4 NEs | Gene/Protein: 10,891; Chemical: 2487; |
 |  | Complex: 1502; Cellular component: 1013 |
CRAFT [57] | 6 NEs | SO: 18,974; Gene/Protein: 16,064; |
 |  | Taxonomy: 6868; Chemical: 6053; CL: 5495; GO-CC: 4180 |
Ex-PTM [58] | Gene/Protein NE | 4698 |
JNLPBA [44] | 5 NEs | Gene/Protein: 35,336; DNA: 10,589; Cell Type: 8639; |
 |  | Cell Line: 4330; RNA: 1069 |
Linnaeus [4] | Species NE | 4263 |
NCBI-Disease [6] | Disease NE | 6881 |
GENIA-PoS [59] | PoS-Tagging | N/A |