Table 1 Descriptions of datasets

From: CollaboNet: collaboration of deep neural networks for biomedical named entity recognition

| Dataset | Entity type | # of sentences | # of annotations | Data size |
|---|---|---|---|---|
| NCBI-Disease (Dogan et al., 2014) | Disease | 7639 | 6881 | 793 abstracts |
| JNLPBA (Kim et al., 2004) | Gene/Proteins | 22,562 | 35,336 | 2404 abstracts |
| BC5CDR (Li et al., 2016) | Chemicals | 14,228 | 15,935 | 1500 articles |
| BC5CDR (Li et al., 2016) | Diseases | 14,228 | 12,852 | 1500 articles |
| BC4CHEMD (Krallinger et al., 2015a) | Chemicals | 86,679 | 84,310 | 10,000 abstracts |
| BC2GM (Akhondi et al., 2014) | Gene/Proteins | 20,510 | 24,583 | 20,000 sentences |