Table 7 Model performance without using the [CLS] token in the last layer

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

| Model | PPI P | PPI R | PPI F | DDI P | DDI R | DDI F | ChemProt P | ChemProt R | ChemProt F |
|---|---|---|---|---|---|---|---|---|---|
| BioBERT | 79.0 | 83.3 | 81.0 | 79.9 | 78.1 | 79.0 | 74.3 | 76.3 | 75.3 |
| BioBERT_SLL_Att | 80.7 | 84.4 | 82.5 | 81.3 | 80.1 | 80.7 | 76.5 | 77.1 | 76.8 |
| BioBERT_SLL_Att* | 82.3 | 83.5 | 82.8 | 79.7 | 77.6 | 78.6 | 76.4 | 74.5 | 75.4 |
| PubMedBERT | 80.1 | 84.3 | 82.1 | 82.6 | 81.9 | 82.3 | 78.8 | 75.9 | 77.3 |
| PubMedBERT_SLL_Att | 81.3 | 85.0 | 83.1 | 84.3 | 82.7 | 83.5 | 78.3 | 77.6 | 77.9 |
| PubMedBERT_SLL_Att* | 80.0 | 85.2 | 82.4 | 82.5 | 80.9 | 81.7 | 75.7 | 77.7 | 76.7 |
  1. Bold values indicate better results
  2. P: Precision; R: Recall; F: F1 score; BERT_SLL_Att*: models fine-tuned using only the summarized information from the attention mechanism (without the [CLS] token)
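
The *_SLL_Att* variants above replace the [CLS] token with an attention-based summary of the last hidden layer. The following is a minimal sketch of that idea, not the authors' code: model names, layer sizes, and the single-query attention pooling are illustrative assumptions.

```python
# Hedged sketch: classify a relation from an attention-weighted summary of the
# encoder's last hidden layer instead of the [CLS] token. Illustrative only.
import torch
import torch.nn as nn
from transformers import AutoModel

class AttentionSummaryClassifier(nn.Module):
    def __init__(self, model_name="dmis-lab/biobert-base-cased-v1.1", num_labels=2):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        # Learned scoring of each token in the last layer (single query vector).
        self.attn = nn.Linear(hidden, 1)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, input_ids, attention_mask):
        # Last-layer hidden states: (batch, seq_len, hidden)
        states = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Per-token attention scores; padding positions are masked out.
        scores = self.attn(states).squeeze(-1)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        # Summarized sequence representation (no [CLS] token used).
        summary = (weights * states).sum(dim=1)
        return self.classifier(summary)
```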