From: DTranNER: biomedical named entity recognition with deep learning-based label-label transition model
Dataset | Number of Sentences | Entity Type | Entity Count | Max Entity Length | Average Entity Length |
---|---|---|---|---|---|
BC2GM [35] | 20128 | Gene/Protein | 24583 | 26 tokens | 2.44 tokens |
BC4CHEMD [36] | 87682 | Chemical/Drug | 84310 | 137 tokens | 2.19 tokens |
BC5CDR-Chemical [37] | 13935 | Chemical/Drug | 15935 | 56 tokens | 1.33 tokens |
BC5CDR-Disease [37] | 13935 | Disease | 12852 | 19 tokens | 1.65 tokens |
NCBI-Disease [38] | 7284 | Disease | 6881 | 22 tokens | 2.21 tokens |