Table 1 Descriptions of datasets

From: CollaboNet: collaboration of deep neural networks for biomedical named entity recognition

| Dataset | Entity type | # of sentences | # of annotations | Data size |
|---|---|---|---|---|
| NCBI-Disease (Dogan et al., 2014) | Disease | 7639 | 6881 | 793 abstracts |
| JNLPBA (Kim et al., 2004) | Gene/Proteins | 22,562 | 35,336 | 2404 abstracts |
| BC5CDR (Li et al., 2016) | Chemicals | 14,228 | 15,935 | 1500 articles |
| BC5CDR (Li et al., 2016) | Diseases | 14,228 | 12,852 | 1500 articles |
| BC4CHEMD (Krallinger et al., 2015a) | Chemicals | 86,679 | 84,310 | 10,000 abstracts |
| BC2GM (Akhondi et al., 2014) | Gene/Proteins | 20,510 | 24,583 | 20,000 sentences |