Table 7 Model performance without using the [CLS] token in the last layer

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

| Model | PPI P | PPI R | PPI F | DDI P | DDI R | DDI F | ChemProt P | ChemProt R | ChemProt F |
|---|---|---|---|---|---|---|---|---|---|
| BioBERT | 79.0 | 83.3 | 81.0 | 79.9 | 78.1 | 79.0 | 74.3 | 76.3 | 75.3 |
| BioBERT_SLL_Att | 80.7 | 84.4 | 82.5 | 81.3 | 80.1 | 80.7 | 76.5 | 77.1 | 76.8 |
| BioBERT_SLL_Att* | 82.3 | 83.5 | 82.8 | 79.7 | 77.6 | 78.6 | 76.4 | 74.5 | 75.4 |
| PubMedBERT | 80.1 | 84.3 | 82.1 | 82.6 | 81.9 | 82.3 | 78.8 | 75.9 | 77.3 |
| PubMedBERT_SLL_Att | 81.3 | 85.0 | 83.1 | 84.3 | 82.7 | 83.5 | 78.3 | 77.6 | 77.9 |
| PubMedBERT_SLL_Att* | 80.0 | 85.2 | 82.4 | 82.5 | 80.9 | 81.7 | 75.7 | 77.7 | 76.7 |

  1. Bold values indicate better results
  2. P: Precision; R: Recall; F: F1 score; BERT_SLL_Att*: models fine-tuned using only the summarized information from the attention mechanism (without the [CLS] token)