Table 1 Datasets statistics for PPI, DDI, and ChemProt

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

| Dataset | Instance # | Train | Dev | Test |
|---|---|---|---|---|
| PPI (AIMed) | 5,834 | – | – | – |
| DDI | 33,508 | 22,233 | 5,559 | 5,716 |
| ChemProt | 45,048 | 18,035 | 11,268 | 15,745 |

  1. The AIMed dataset for PPI has only two labels: Positive and Negative. The ChemProt corpus is labeled with five positive classes (CPR:3, CPR:4, CPR:5, CPR:6, CPR:9) and one negative class. Similarly, the DDI dataset contains four positive labels (ADVICE, EFFECT, INT, MECHANISM) and one negative label.
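
As a rough illustration of the label schema and split sizes above, the following minimal Python sketch encodes them as plain dictionaries and checks that the train, dev, and test counts add up to the reported totals. The names `LABELS`, `COUNTS`, `label_to_id`, and `splits_consistent` are hypothetical, and the string `"NEGATIVE"` used for the negative class of DDI and ChemProt is an assumption, since the source does not name that class.

```python
# Label sets as described in the table footnote. The "NEGATIVE" string for the
# negative class of DDI and ChemProt is an assumption; the source does not name it.
LABELS = {
    "PPI(AIMed)": ["Negative", "Positive"],
    "DDI": ["NEGATIVE", "ADVICE", "EFFECT", "INT", "MECHANISM"],
    "ChemProt": ["NEGATIVE", "CPR:3", "CPR:4", "CPR:5", "CPR:6", "CPR:9"],
}

# Instance counts from Table 1 as (total, train, dev, test).
# None marks a cell that the table leaves as a dash.
COUNTS = {
    "PPI(AIMed)": (5834, None, None, None),
    "DDI": (33508, 22233, 5559, 5716),
    "ChemProt": (45048, 18035, 11268, 15745),
}


def label_to_id(dataset):
    """Map each label of a dataset to an integer id, in list order."""
    return {label: i for i, label in enumerate(LABELS[dataset])}


def splits_consistent(dataset):
    """Return True if train + dev + test equals the total, None if no split is given."""
    total, *splits = COUNTS[dataset]
    if any(s is None for s in splits):
        return None
    return sum(splits) == total


if __name__ == "__main__":
    for name in COUNTS:
        print(name, len(LABELS[name]), "labels, splits consistent:", splits_consistent(name))
```

A mapping of this kind is what a classification head would typically consume, with the number of output classes taken from `len(LABELS[name])`.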