Table 4 BERT performance after pre-training with sub-domain data

From: Investigation of improving the pre-training and fine-tuning of BERT model for biomedical relation extraction

| Model | PPI P | PPI R | PPI F | DDI P | DDI R | DDI F | ChemProt P | ChemProt R | ChemProt F |
|---|---|---|---|---|---|---|---|---|---|
| BioBERT | 79.0 | 83.3 | 81.0 | 79.9 | 78.1 | 79.0 | 74.3 | 76.3 | 75.3 |
| BioBERT (+P/G) | \(\mathbf{82.5}\) | \(\mathbf{83.7}\) | \(\mathbf{83.0}\) | 76.1 | 77.6 | 76.9 | 76.5 | 74.2 | 75.3 |
| BioBERT (+D) | 81.5 | 80.9 | 81.2 | \(\mathbf{81.9}\) | 78.4 | \(\mathbf{80.1}\) | 76.7 | 74.4 | 75.6 |
| BioBERT (+CP) | 81.3 | 83.7 | 82.4 | 78.7 | \(\mathbf{79.0}\) | 78.8 | 76.6 | 76.1 | \(\mathbf{76.4}\) |
| PubMedBERT | 80.1 | 84.3 | 82.1 | 82.6 | 81.9 | 82.3 | 78.8 | 75.9 | 77.3 |
| PubMedBERT (+P/G) | \(\mathbf{81.2}\) | \(\mathbf{85.5}\) | \(\mathbf{83.3}\) | 83.7 | 80.5 | 82.0 | 80.5 | 75.5 | 77.9 |
| PubMedBERT (+D) | 79.1 | 85.3 | 82.0 | \(\mathbf{84.1}\) | 81.7 | \(\mathbf{82.9}\) | 80.4 | 74.6 | 77.4 |
| PubMedBERT (+CP) | 79.6 | 84.7 | 82.0 | 81.1 | 82.7 | 81.9 | 79.4 | \(\mathbf{77.5}\) | \(\mathbf{78.4}\) |

  1. Bold values indicate better results
  2. P: Precision; R: Recall; F: F1 Score; +P/G: add Protein/Gene-related PubMed abstracts as sub-domain data; +D: add Drug-related PubMed abstracts as sub-domain data; +CP: add protein-related and chemical-related PubMed abstracts as sub-domain data
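
For readers who want to relate the F column to the P and R columns, the following is a minimal sketch of the standard F1 definition (the harmonic mean of precision and recall). The function name `f1_score` and the dictionary `baseline` are illustrative and do not come from the article; the input numbers are copied from the BioBERT baseline row for DDI and ChemProt. A reported F value need not match this formula exactly, for example if scores were averaged over several runs or folds before rounding (an assumption; the table does not say how the averages were formed).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, on the same scale as the inputs."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# (P, R) pairs copied from the BioBERT baseline row of the table.
baseline = {"DDI": (79.9, 78.1), "ChemProt": (74.3, 76.3)}

for task, (p, r) in baseline.items():
    print(task, round(f1_score(p, r), 1))
# prints:
# DDI 79.0
# ChemProt 75.3
```

Both values agree with the F entries reported for that row.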