Table 1 Per-relation statistics for TBGA

From: TBGA: a large-scale Gene-Disease Association dataset for Biomedical Relation Extraction

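| Granularity | Split | Therapeutic | Biomarker | Genomic alterations | NA |
| --- | --- | --- | --- | --- | --- |
| Sentence-level | Train | 3139 | 20,145 | 32,831 | 122,149 |
| Sentence-level | Validation | 402 | 2279 | 2306 | 15,206 |
| Sentence-level | Test | 384 | 2315 | 2209 | 15,608 |
| Bag-level | Train | 2218 | 13,372 | 12,759 | 56,698 |
| Bag-level | Validation | 331 | 2019 | 1147 | 6994 |
| Bag-level | Test | 308 | 2068 | 1122 | 6996 |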

  1. Statistics are reported separately for each data split. Columns give, from left to right, the granularity level, the data split, and the number of instances (sentence-level) or bags (bag-level) associated with the Therapeutic, Biomarker, Genomic alterations, and NA relations.
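
As a rough illustration of how the counts in Table 1 can be used, the following Python sketch computes the total size of each split and the share of NA examples in it. The counts are transcribed from the table. The names `TABLE_1`, `row_total`, and `na_share` are hypothetical helpers introduced here for illustration. They are not part of the TBGA release, and the derived totals and shares are not reported in the table itself.

```python
# Counts transcribed from Table 1. Tuple order follows the table columns:
# Therapeutic, Biomarker, Genomic alterations, NA.
RELATIONS = ("Therapeutic", "Biomarker", "Genomic alterations", "NA")

# Hypothetical container for the table; keys are (granularity, split).
TABLE_1 = {
    ("Sentence-level", "Train"): (3139, 20145, 32831, 122149),
    ("Sentence-level", "Validation"): (402, 2279, 2306, 15206),
    ("Sentence-level", "Test"): (384, 2315, 2209, 15608),
    ("Bag-level", "Train"): (2218, 13372, 12759, 56698),
    ("Bag-level", "Validation"): (331, 2019, 1147, 6994),
    ("Bag-level", "Test"): (308, 2068, 1122, 6996),
}


def row_total(granularity: str, split: str) -> int:
    """Sum the four relation counts of one table row."""
    return sum(TABLE_1[(granularity, split)])


def na_share(granularity: str, split: str) -> float:
    """Fraction of a row's examples that carry the NA relation."""
    counts = TABLE_1[(granularity, split)]
    return counts[RELATIONS.index("NA")] / sum(counts)


# Print one summary line per table row.
for (granularity, split) in TABLE_1:
    total = row_total(granularity, split)
    share = na_share(granularity, split)
    print(f"{granularity:<15} {split:<11} total={total:>7} NA share={share:.3f}")
```

Under these assumptions, a call such as `na_share("Bag-level", "Test")` gives a quick view of how strongly the NA relation dominates a given split, which may matter when choosing evaluation metrics.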