Table 10 Average scores for each system and each batch of phase B of Task 1b for the “ideal” answers

From: An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition

System        Batch 1   Batch 2   Batch 3
Wishart-S1    3.95      4.23      -
Wishart-S2    3.95      -         -
Wishart-S3    3.95      -         -
Baseline1     2.86      3.02      3.19
Baseline2     2.73      2.87      3.17
main system   3.35      3.39      3.13
system 2      -         3.34      3.07
system 3      -         3.34      2.98
system 4      -         3.34      -
  1. The final score of a system is the average of its individual scores on the different evaluation criteria. A hyphen (-) indicates that the system did not participate in the corresponding batch. The scores were assigned by experts who read and evaluated the “ideal” answers; they range from 1 to 5, with 5 being the best.
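
The averaging described in the note can be sketched as follows. This is a minimal illustration, not the organizers' evaluation code: the function names `final_score` and `format_cell` are hypothetical, the number of criteria is left open, and the sample scores are invented for the example rather than taken from the table.

```python
from typing import Optional, Sequence


def final_score(criterion_scores: Optional[Sequence[float]]) -> Optional[float]:
    """Average a system's per-criterion scores for one batch.

    Returns None when the system did not participate (no scores given).
    Each score is assumed to lie on the 1-5 expert scale.
    """
    if not criterion_scores:
        return None
    for score in criterion_scores:
        if not 1 <= score <= 5:
            raise ValueError(f"score {score} is outside the 1-5 scale")
    return sum(criterion_scores) / len(criterion_scores)


def format_cell(score: Optional[float]) -> str:
    """Render a table cell: two decimals, or a hyphen for non-participation."""
    return "-" if score is None else f"{score:.2f}"


# Hypothetical expert scores for one system in one batch (illustrative only).
example_scores = [4, 4, 3, 5]
print(format_cell(final_score(example_scores)))  # participated in the batch
print(format_cell(final_score(None)))            # did not participate
```

Representing non-participation as `None` keeps a missing batch distinct from a low score, so it is never averaged in as a zero.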