Table 4 BERT performance after pre-training with sub-domain data

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

| Model | PPI P | PPI R | PPI F | DDI P | DDI R | DDI F | ChemProt P | ChemProt R | ChemProt F |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BioBERT | 79.0 | 83.3 | 81.0 | 79.9 | 78.1 | 79.0 | 74.3 | 76.3 | 75.3 |
| BioBERT (+P/G) | \(\mathbf{82.5}\) | \(\mathbf{83.7}\) | \(\mathbf{83.0}\) | 76.1 | 77.6 | 76.9 | 76.5 | 74.2 | 75.3 |
| BioBERT (+D) | 81.5 | 80.9 | 81.2 | \(\mathbf{81.9}\) | 78.4 | \(\mathbf{80.1}\) | 76.7 | 74.4 | 75.6 |
| BioBERT (+CP) | 81.3 | 83.7 | 82.4 | 78.7 | \(\mathbf{79.0}\) | 78.8 | 76.6 | 76.1 | \(\mathbf{76.4}\) |
| PubMedBERT | 80.1 | 84.3 | 82.1 | 82.6 | 81.9 | 82.3 | 78.8 | 75.9 | 77.3 |
| PubMedBERT (+P/G) | \(\mathbf{81.2}\) | \(\mathbf{85.5}\) | \(\mathbf{83.3}\) | 83.7 | 80.5 | 82.0 | 80.5 | 75.5 | 77.9 |
| PubMedBERT (+D) | 79.1 | 85.3 | 82.0 | \(\mathbf{84.1}\) | 81.7 | \(\mathbf{82.9}\) | 80.4 | 74.6 | 77.4 |
| PubMedBERT (+CP) | 79.6 | 84.7 | 82.0 | 81.1 | 82.7 | 81.9 | 79.4 | \(\mathbf{77.5}\) | \(\mathbf{78.4}\) |
  1. Bold values indicate better results
  2. P: Precision; R: Recall; F: F1 Score; +P/G: add Protein/Gene-related PubMed abstracts as sub-domain data; +D: add Drug-related PubMed abstracts as sub-domain data; +CP: add protein-related and chemical-related PubMed abstracts as sub-domain data
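
The notes above name the three metrics but do not spell out how F relates to P and R. As a minimal sketch, assuming the standard definition of F1 as the harmonic mean of precision and recall (the table itself does not state how F was computed or aggregated), the helper below recomputes F from one row. The function name `f1_score` is hypothetical and chosen only for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Assumes the standard F1 definition; inputs and output share the
    same scale (here percentages, as in the table).
    """
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# BioBERT baseline on ChemProt, values copied from the table: P = 74.3, R = 76.3
chemprot_f1 = f1_score(74.3, 76.3)
print(round(chemprot_f1, 1))  # 75.3, matching the F column for that row
```

Recomputed values need not match every reported F to the last digit, for example if the scores were averaged over several runs or folds, which the table does not specify.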