Table 2 Global statistics comparison between TBGA, EU-ADR [13], CoMAGC [15], PolySearch [14], GAD [27], and GDAE [28] datasets

From: TBGA: a large-scale Gene-Disease Association dataset for Biomedical Relation Extraction

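| Dataset | Annotation | Instances | Publications | Instances/pub. | Genes | Diseases | Relations |
|---|---|---|---|---|---|---|---|
| CoMAGC | Manual | 821 | 408 | 2.01 | 538 | 3 | 15 |
| EU-ADR | Manual | 355 | 65 | 5.46 | 221 | 118 | 4 |
| PolySearch | Manual | 522 | 374 | 1.40 | 245 | 10 | 2 |
| GAD | Weak | 5329 | 4112 | 1.30 | 1139 | 535 | 3 |
| GDAE | Weak | 8000 | 5875 | 1.36 | 3635 | 1904 | 2 |
| TBGA | Weak | 218,973 | 134,059 | 1.63 | 11,784 | 9199 | 4 |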

  1. Columns report, from left to right: the dataset, the annotation type (manual or weak), the total numbers of instances and publications, the average number of instances per publication, and the total numbers of genes, diseases, and relations.
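
The instances-per-publication column can be recomputed from the instance and publication counts. The sketch below assumes the reported value is simply instances divided by publications, rounded to two decimals. The names `TABLE_2` and `instances_per_publication` are illustrative and do not come from the paper.

```python
# Counts copied from Table 2: dataset -> (instances, publications, reported ratio).
TABLE_2 = {
    "CoMAGC": (821, 408, 2.01),
    "EU-ADR": (355, 65, 5.46),
    "PolySearch": (522, 374, 1.40),
    "GAD": (5329, 4112, 1.30),
    "GDAE": (8000, 5875, 1.36),
    "TBGA": (218973, 134059, 1.63),
}


def instances_per_publication(instances: int, publications: int) -> float:
    """Average number of instances per publication, rounded to two decimals.

    Assumption: the table's ratio column is a plain quotient of the two counts.
    """
    return round(instances / publications, 2)


# Recompute the ratio for every dataset in the table.
computed = {
    name: instances_per_publication(instances, publications)
    for name, (instances, publications, _reported) in TABLE_2.items()
}

# Print the recomputed value next to the value reported in the table.
for name, (_instances, _publications, reported) in TABLE_2.items():
    print(f"{name:<10} computed={computed[name]:.2f} reported={reported:.2f}")
```

Under this assumption the recomputed ratios can be compared directly with the reported column, which is a quick way to catch transcription errors when copying the counts.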