Improving the recall of biomedical named entity recognition with label re-correction and knowledge distillation

BMC Bioinformatics

Table 5 Ablation study results

Model	P (%)	R (%)	F (%)
Our best (BiLSTM-CRF)	90.71	89.99	90.35
w/o label re-correction	91.34	80.76	85.73**
w/o CDRC	90.48	89.14	89.81*
w/o CDRA	90.17	89.55	89.86**

The highest scores are highlighted in bold
w/o label re-correction: we train the teachers on the two weakly labeled datasets CDWC and CDWA rather than CDRC and CDRA
w/o CDRC: we train a single teacher without CDRC (i.e. only with CDRA)
w/o CDRA: we train a single teacher without CDRA (i.e. only with CDRC)
The markers * and ** denote P value < 0.05 and P value < 0.01, respectively, using a paired t-test against our best (BiLSTM-CRF). The paired t statistic is the sum of the pairwise differences divided by the square root of the following quantity: n times the sum of the squared differences, minus the square of the sum of the differences, all over n − 1. That is, $t = \sum d \,/\, \sqrt{\left(n \sum d^2 - \left(\sum d\right)^2\right) / (n - 1)}$, where d is the difference within each pair and n is the number of pairs. We use a two-tailed test, in which the critical region lies in both tails of the distribution, so a difference in either direction (greater than or less than) can be significant
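The paired t statistic described in the note above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `paired_t_statistic` and the sample scores are hypothetical, and the table does not say which paired measurements were compared.

```python
import math


def paired_t_statistic(a, b):
    """Paired t statistic: sum(d) / sqrt((n * sum(d^2) - sum(d)^2) / (n - 1)).

    a and b are equal-length sequences of paired measurements
    (hypothetical inputs; the source does not specify what was paired).
    """
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    n = len(a)
    if n < 2:
        raise ValueError("at least two pairs are required")

    # Difference within each pair.
    d = [x - y for x, y in zip(a, b)]
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)

    # Denominator of the formula in the note above.
    denom = math.sqrt((n * sum_d2 - sum_d ** 2) / (n - 1))
    if denom == 0:
        raise ValueError("all differences are identical; t is undefined")
    return sum_d / denom


# Hypothetical paired scores, chosen only to exercise the formula.
t = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(round(t, 4))
```

The two-tailed P value is then read from Student's t distribution with n − 1 degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel(a, b)` returns both the statistic and a two-sided P value by default.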

ISSN: 1471-2105