Table 4 Results of the syntactic information ensemble on BC5CDR-chemical dataset

From: Improving biomedical named entity recognition with syntactic information

| Ensemble strategy | PL | SC | DR | BioBERT-Base F1 | BioBERT-Base \(\sigma\) | BioBERT-Large F1 | BioBERT-Large \(\sigma\) |
|---|---|---|---|---|---|---|---|
| Baseline | \(\times\) | \(\times\) | \(\times\) | 93.50 | 0.10 | 93.90 | 0.31 |
| Sum | \(\surd\) | \(\surd\) | \(\times\) | 93.66 | 0.17 | 94.20 | 0.15 |
| Sum | \(\surd\) | \(\times\) | \(\surd\) | 93.76 | 0.16 | 94.10 | 0.15 |
| Sum | \(\times\) | \(\surd\) | \(\surd\) | 93.81 | 0.15 | 94.12 | 0.14 |
| Sum | \(\surd\) | \(\surd\) | \(\surd\) | 93.78 | 0.25 | 94.26 | 0.16 |
| Concatenation | \(\surd\) | \(\surd\) | \(\times\) | 93.75 | 0.23 | 94.25 | 0.12 |
| Concatenation | \(\surd\) | \(\times\) | \(\surd\) | 93.80 | 0.26 | 94.22 | 0.16 |
| Concatenation | \(\times\) | \(\surd\) | \(\surd\) | 93.83 | 0.20 | 94.31 | 0.08 |
| Concatenation | \(\surd\) | \(\surd\) | \(\surd\) | \(\mathit{93.88}\) | 0.26 | \(\mathit{94.36}\) | 0.25 |

  1. The three types of syntactic information used for the ensemble are POS labels (PL), syntactic constituents (SC), and dependency relations (DR). Results are reported as the average F1 score and its standard deviation (\(\sigma\)). Sum and concatenation are the two ensemble strategies applied to our method.
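
To make the difference between the two ensemble strategies concrete, here is a minimal sketch of combining per-type syntactic vectors by sum and by concatenation. It is an illustration only and not the article's implementation: the function names `ensemble_sum` and `ensemble_concat`, the toy vectors, and the assumption that each syntactic type (PL, SC, DR) produces one fixed-size vector per token are all hypothetical.

```python
def ensemble_sum(vectors):
    """Combine vectors by element-wise sum; the output keeps the input dimension."""
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("sum requires all vectors to have the same dimension")
    return [sum(values) for values in zip(*vectors)]


def ensemble_concat(vectors):
    """Combine vectors by concatenation; the output dimension is the sum of input dimensions."""
    return [x for v in vectors for x in v]


# Hypothetical per-token vectors, one per syntactic information type.
pl = [1, 2]  # POS labels (PL)
sc = [3, 4]  # syntactic constituents (SC)
dr = [5, 6]  # dependency relations (DR)

summed = ensemble_sum([pl, sc, dr])           # dimension stays 2
concatenated = ensemble_concat([pl, sc, dr])  # dimension grows to 6
```

Under these assumptions, the sum keeps the representation size fixed regardless of how many syntactic types are combined, while concatenation keeps each type's contribution in separate coordinates at the cost of a larger vector.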