Table 2 Statistics of the employed datasets

From: JCBIE: a joint continual learning neural network for biomedical information extraction

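| Corpus | Sent. count | Entity mention counts: Ch./Dr. | Entity mention counts: Ph./Di. | Entity mention counts: Pr./Ge. | Relation counts: ADE | Relation counts: DDI | Relation counts: CPR |
|---|---|---|---|---|---|---|---|
| Training set | | | | | | | |
| \({\text{ADE}}_{1}\) | 800 | 969 | 1144 | – | 1171 | – | – |
| \({\text{ADE}}_{2}\) | 3418 | 4063 | 4585 | – | 5422 | – | – |
| DDI | 5002 | 13,276 | – | – | – | 3607 | – |
| CPR | 8471 | 11,369 | – | 12,572 | – | – | 6044 |
| Total | 17,691 | 29,677 | 5729 | 12,572 | 6593 | 3607 | 6044 |
| Validation set | | | | | | | |
| \({\text{ADE}}_{1}\) | 100 | 124 | 129 | – | 140 | – | – |
| \({\text{ADE}}_{2}\) | 427 | 493 | 592 | – | 667 | – | – |
| DDI | 557 | 1487 | – | – | – | 413 | – |
| CPR | 1022 | 1490 | – | 1385 | – | – | 694 |
| Total | 2106 | 3594 | 721 | 1385 | 807 | 413 | 694 |
| Testing set | | | | | | | |
| \({\text{ADE}}_{1}\) | 100 | 115 | 144 | – | 142 | – | – |
| \({\text{ADE}}_{2}\) | 427 | 506 | 597 | – | 732 | – | – |
| DDI | 543 | 1480 | – | – | – | 475 | – |
| CPR | 1117 | 1715 | – | 1520 | – | – | 1016 |
| Total | 2187 | 3816 | 741 | 1520 | 874 | 475 | 1016 |
| Corpus-adaption evaluation | | | | | | | |
| \({\text{ADE}}_{3}\) | 4638 | 9517 | 2334 | – | 4767 | – | – |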

  1. Ch./Dr., chemical or drug; Ph./Di., phenotype or disease; Pr./Ge., protein or gene
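The Total rows of Table 2 can be checked against the per-corpus rows by summing each column and skipping the cells marked "–". The sketch below is an illustrative sanity check only: the counts are copied from the table, while the names `COLUMNS`, `TABLE`, `REPORTED_TOTALS`, `column_totals`, and `check_all` are hypothetical and not part of the paper.

```python
# Column order follows Table 2: sentence count, three entity-mention
# columns, three relation columns. None stands for a "–" cell.
COLUMNS = ["sent", "ch_dr", "ph_di", "pr_ge", "ade", "ddi", "cpr"]

# Per-corpus counts copied from Table 2 (hypothetical container names).
TABLE = {
    "training": {
        "ADE_1": [800, 969, 1144, None, 1171, None, None],
        "ADE_2": [3418, 4063, 4585, None, 5422, None, None],
        "DDI": [5002, 13276, None, None, None, 3607, None],
        "CPR": [8471, 11369, None, 12572, None, None, 6044],
    },
    "validation": {
        "ADE_1": [100, 124, 129, None, 140, None, None],
        "ADE_2": [427, 493, 592, None, 667, None, None],
        "DDI": [557, 1487, None, None, None, 413, None],
        "CPR": [1022, 1490, None, 1385, None, None, 694],
    },
    "testing": {
        "ADE_1": [100, 115, 144, None, 142, None, None],
        "ADE_2": [427, 506, 597, None, 732, None, None],
        "DDI": [543, 1480, None, None, None, 475, None],
        "CPR": [1117, 1715, None, 1520, None, None, 1016],
    },
}

# Total rows as reported in Table 2.
REPORTED_TOTALS = {
    "training": [17691, 29677, 5729, 12572, 6593, 3607, 6044],
    "validation": [2106, 3594, 721, 1385, 807, 413, 694],
    "testing": [2187, 3816, 741, 1520, 874, 475, 1016],
}


def column_totals(rows):
    """Sum each column over the corpora of one split, ignoring None cells."""
    totals = []
    for i in range(len(COLUMNS)):
        totals.append(sum(row[i] for row in rows.values() if row[i] is not None))
    return totals


def check_all():
    """Map each split to True if its recomputed totals match the reported row."""
    return {
        split: column_totals(rows) == REPORTED_TOTALS[split]
        for split, rows in TABLE.items()
    }


if __name__ == "__main__":
    for split, ok in check_all().items():
        print(split, "totals match" if ok else "totals differ")
```

The corpus-adaption row (\({\text{ADE}}_{3}\)) is left out of the check because the table reports no Total row for it.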