Table 3 Results for our current BioBERT system, the best system reported in the shared-task paper [46], and the official baseline

From: Parallel sequence tagging for concept recognition

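| Annot. set | System | Proper SER | Proper F-score | Extended SER | Extended F-score |
|---|---|---|---|---|---|
| CHEBI | Baseline | 0.44 | 0.72 | 0.29 | 0.80 |
| CHEBI | Shared-task | 0.3388 | 0.7700 | 0.2571 | 0.8209 |
| CHEBI | Current | 0.2492 | 0.8528 | 0.2289 | 0.8459 |
| CL | Baseline | 0.53 | 0.61 | 0.33 | 0.73 |
| CL | Shared-task | 0.4862 | 0.6657 | 0.3361 | 0.7484 |
| CL | Current | 0.4013 | 0.7526 | 0.2777 | 0.7926 |
| GO_BP | Baseline | 0.39 | 0.72 | 0.29 | 0.79 |
| GO_BP | Shared-task | 0.3047 | 0.8037 | 0.2786 | 0.8138 |
| GO_BP | Current | 0.2587 | 0.8297 | 0.2015 | 0.8506 |
| GO_CC | Baseline | 0.44 | 0.71 | 0.20 | 0.88 |
| GO_CC | Shared-task | 0.3788 | 0.7645 | 0.1678 | 0.8936 |
| GO_CC | Current | 0.2817 | 0.8219 | 0.1486 | 0.9073 |
| GO_MF | Baseline | 0.07 | 0.95 | 0.45 | 0.66 |
| GO_MF | Shared-task | 0.0319 | 0.9838 | 0.3881 | 0.7438 |
| GO_MF | Current | 0.0149 | 0.9904 | 0.4135 | 0.7139 |
| MOP | Baseline | 0.43 | 0.75 | 0.36 | 0.79 |
| MOP | Shared-task | 0.2684 | 0.8705 | 0.3080 | 0.8437 |
| MOP | Current | 0.1567 | 0.9188 | 0.1713 | 0.9082 |
| NCBITaxon | Baseline | 0.07 | 0.96 | 0.07 | 0.96 |
| NCBITaxon | Shared-task | 0.0537 | 0.9694 | 0.0466 | 0.9722 |
| NCBITaxon | Current | 0.0436 | 0.9744 | 0.0460 | 0.9704 |
| PR | Baseline | 0.69 | 0.48 | 0.62 | 0.52 |
| PR | Shared-task | 0.3052 | 0.8026 | 0.3030 | 0.8011 |
| PR | Current | 0.3068 | 0.8041 | 0.3130 | 0.7951 |
| SO | Baseline | 0.21 | 0.86 | 0.18 | 0.89 |
| SO | Shared-task | 0.1593 | 0.9027 | 0.1230 | 0.9187 |
| SO | Current | 0.1206 | 0.9223 | 0.0899 | 0.9419 |
| UBERON | Baseline | 0.41 | 0.70 | 0.36 | 0.75 |
| UBERON | Shared-task | 0.3752 | 0.7488 | 0.3371 | 0.7714 |
| UBERON | Current | 0.2790 | 0.8177 | 0.2537 | 0.8315 |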

  1. For the shared-task systems, the results were selected independently for SER and F-score, i.e., the two scores for a given annotation set do not necessarily come from the same system. For the baseline and the current BioBERT system, however, only one system was evaluated for each annotation set.
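
The independent selection described in the note can be illustrated with a minimal sketch. It assumes per-system scores are available as a dictionary. The system names (`system_A`, `system_B`), the function `best_per_metric`, and all numbers are hypothetical and are not taken from the table or from the shared-task paper. The sketch shows only how the best SER (lower is better) and the best F-score (higher is better) for one annotation set can end up coming from different systems.

```python
# Hypothetical per-system scores for a single annotation set.
# These values are illustrative only and do NOT come from the table
# or from the shared-task paper.
scores = {
    "system_A": {"ser": 0.30, "f_score": 0.78},
    "system_B": {"ser": 0.32, "f_score": 0.81},
}


def best_per_metric(scores):
    """Select the best SER and the best F-score independently of each other."""
    # SER is an error rate, so the lowest value wins.
    best_ser_system = min(scores, key=lambda name: scores[name]["ser"])
    # F-score is an accuracy-style measure, so the highest value wins.
    best_f_system = max(scores, key=lambda name: scores[name]["f_score"])
    return {
        "ser": (best_ser_system, scores[best_ser_system]["ser"]),
        "f_score": (best_f_system, scores[best_f_system]["f_score"]),
    }


best = best_per_metric(scores)
print(best)
```

With these made-up inputs, the reported SER comes from `system_A` and the reported F-score from `system_B`, so a "Shared-task" row built this way does not describe any single system. The "Baseline" and "Current" rows each describe one system.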