Table 1 The statistical information of the PharmaCoNER corpus

From: Deep learning with language models improves named entity recognition for PharmaCoNER

Set                 Training   Development   Test   Total
Documents                500           250    250    1000
Sentences               7003          3454   3403   13860
NORMALIZABLES           2304          1121    973    4398
NO_NORMALIZABLES          24            16     10      50
PROTEINAS               1405           745    859    3009
UNCLEAR                   89            44     34     167