Table 1 Dataset statistics for PPI, DDI, and ChemProt

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

Dataset       Instances   Train    Dev      Test
PPI (AIMed)   5,834       -        -        -
DDI           33,508      22,233   5,559    5,716
ChemProt      45,048      18,035   11,268   15,745
  1. The AIMed dataset for PPI has only two labels: Positive and Negative. The ChemProt corpus is labeled with five positive classes (CPR:3, CPR:4, CPR:5, CPR:6, CPR:9) and one negative class. Similarly, the DDI dataset contains four positive labels (ADVICE, EFFECT, INT, MECHANISM) and one negative label.