Table 3 Comparison of BiLSTM-CRF model results trained on CDWC and CDWA with different re-correction times

From: Improving the recall of biomedical named entity recognition with label re-correction and knowledge distillation

| Dataset | P (%) | R (%) | F (%) | Dataset | P (%) | R (%) | F (%) |
|---|---|---|---|---|---|---|---|
| CDR | 91.42 | 83.59 | 87.86 | CDR | 91.42 | 83.59 | 87.86 |
| CDR + CDWC | 90.17 | 84.49 | 87.24 | CDR + CDWA | 94.02 | 71.02 | 80.92 |
| CDWC | 89.72 | 83.65 | 86.58 | CDWA | 94.75 | 67.27 | 78.68 |
| CDWC1 | 89.84 | 89.32 | 89.58 | CDWA1 | 90.16 | 88.94 | 89.55 |
| CDWC2 | 90.00 | 89.35 | 89.67 | CDWA2 (CDRA) | 91.03 | 88.31 | 89.65 |
| CDWC3 (CDRC) | 89.80 | 89.82 | 89.81 | CDWA3 | 90.28 | 89.03 | 89.65 |
| CDWC4 | 89.90 | 89.70 | 89.80 | | | | |

  1. The highest scores are highlighted in bold
  2. All results are evaluated on the CDR test set. The first two rows are the baselines. For the last five rows, each dataset is constructed by the correction model trained on the dataset directly above it. The superscript, rendered here as a trailing digit, is the number of re-correction rounds; for example, CDWC1 is the dataset constructed by the correction model trained on CDWC. The datasets in the third row are the weakly labeled datasets without re-correction. CDWC3 is also referred to as CDRC, and CDWA2 as CDRA
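
The table does not state how the F column is derived from P and R. A minimal sketch, assuming F is the usual balanced F-score (the harmonic mean of precision and recall), shows how a row can be checked. The function name `f_score` is hypothetical and is not taken from the paper.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (assumed definition of F).

    Inputs and output are percentages, as in the table.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Check the CDWC row: P = 89.72, R = 83.65, reported F = 86.58.
cdwc_f = round(f_score(89.72, 83.65), 2)
print(cdwc_f)
```

Under this assumption the CDWC row reproduces the reported value to two decimal places. Small differences in other rows could come from rounding of P and R before publication.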