Table 1 Statistics of the NCBI, GM, and CDR corpora

From: Biomedical named entity recognition using deep neural networks with contextual information

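| Corpus | Entity | Unit | Training | Development | Test | Total (unit) |
| --- | --- | --- | --- | --- | --- | --- |
| NCBI | Disease | Abstracts | 592 | 100 | 100 | 792 (abstracts) |
| GM | Gene | Sentences | 15000 | - | 5000 | 20000 (sentences) |
| CDR | Disease, Chemicals | Abstracts | 500 | 500 | 500 | 1500 (abstracts) |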