Table 4 Robustness detection experiment results, reported as mean ± standard deviation of each evaluation metric over three random seeds (12345, 24, 488)

From: External features enriched model for biomedical question answering

| Test set | Model | SAcc | LAcc | MRR |
|---|---|---|---|---|
| 6b Factoid QA | BioBERT (main baseline) | 0.4048 ± 0.0107 | **0.6278 ± 0.0061** | 0.4927 ± 0.0102 |
| 6b Factoid QA | Our Model (BioBERT+POS+NER+FF) | **0.4325 ± 0.0167** | 0.6200 ± 0.0138 | **0.5063 ± 0.0137** |
| 7b Factoid QA | BioBERT (main baseline) | **0.4362 ± 0.0087** | 0.6146 ± 0.0121 | 0.5059 ± 0.0045 |
| 7b Factoid QA | Our Model (BioBERT+POS+NER+FF) | 0.4359 ± 0.0078 | **0.6379 ± 0.0035** | **0.5122 ± 0.0037** |
| 8b Factoid QA | BioBERT (main baseline) | 0.3859 ± 0.0087 | 0.5566 ± 0.0061 | 0.4509 ± 0.0065 |
| 8b Factoid QA | Our Model (BioBERT+POS+NER+FF) | **0.3916 ± 0.0033** | **0.5898 ± 0.0156** | **0.4652 ± 0.0040** |

Bold values represent the highest results.
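
Each cell in the table is a mean ± standard deviation over the three seeds. A minimal sketch of that aggregation is given here for illustration. The table does not state whether the sample (n − 1) or population (n) standard deviation was used, so the sample form is an assumption. The helper names `mean_std` and `format_cell` and the per-seed scores are hypothetical and are not taken from the paper.

```python
import statistics


def mean_std(scores):
    """Return (mean, sample standard deviation) of the per-seed scores.

    Assumption: sample standard deviation (n - 1 denominator).
    Swap in statistics.pstdev for the population form.
    """
    return statistics.mean(scores), statistics.stdev(scores)


def format_cell(scores, digits=4):
    """Format per-seed scores as 'mean ± std', as in the table cells."""
    mean, std = mean_std(scores)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"


# Hypothetical per-seed scores keyed by seed; NOT values from the paper.
per_seed = {12345: 0.40, 24: 0.42, 488: 0.44}

cell = format_cell(list(per_seed.values()))
print(cell)
```

With only three seeds, the sample and population forms differ noticeably (by a factor of about 1.22), so the choice matters when comparing against reported values.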