
Table 4 The performance of model-term and model-code on different modelling tasks, compared with two baselines, reported as Precision (P), Recall (R), and \(F_{1}\) grounded at the evidence level for each (in)consistency type, together with Micro-Precision (\(P^*\)) and Micro-Recall (\(R^*\)) averaged over all predicted instances in the test set

From: Automatic consistency assurance for literature-based gene ontology annotation

 

Model-term

| System | Consistent P | Consistent R | Consistent F1 | (A) P | (A) R | (A) F1 | (B) P | (B) R | (B) F1 | (C) P | (C) R | (C) F1 | Overall P* | Overall R* | Overall F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Basic system | 0.54 | 0.70 | 0.61 | 0.48 | 0.29 | 0.36 | 0.79 | 0.48 | 0.60 | 0.65 | 0.96 | 0.78 | 0.64 | 0.64 | 0.64 |
| +Training Opt | 0.74 | 0.71 | 0.72 | 0.54 | 0.33 | 0.41 | 0.76 | 0.57 | 0.65 | 0.61 | 0.93 | 0.73 | 0.69 | 0.69 | 0.69 |
| +SectionInfo | 0.69 | 0.65 | 0.67 | 0.46 | 0.35 | 0.40 | 0.77 | 0.52 | 0.62 | 0.52 | 0.96 | 0.68 | 0.65 | 0.65 | 0.65 |
| +Opt & SectionInfo | 0.69 | 0.64 | 0.66 | 0.45 | 0.31 | 0.37 | 0.75 | 0.51 | 0.61 | 0.50 | 0.96 | 0.66 | 0.63 | 0.63 | 0.63 |
| prior-biased classifier (baseline, first modelling task) | 0.48 | 0.35 | 0.41 | 0.05 | 0.02 | 0.03 | 0.36 | 0.28 | 0.31 | 0.10 | 0.33 | 0.15 | 0.30 | 0.30 | 0.30 |
| rule-based model (baseline, first modelling task) | 0.53 | 0.38 | 0.44 | 0.09 | 0.04 | 0.06 | 0.41 | 0.29 | 0.34 | 0.18 | 0.99 | 0.30 | 0.36 | 0.36 | 0.36 |

Model-code

| System | Consistent P | Consistent R | Consistent F1 | (D) P | (D) R | (D) F1 | Overall P* | Overall R* | Overall F1 |
|---|---|---|---|---|---|---|---|---|---|
| Basic system | 0.75 | 0.50 | 0.60 | 0.31 | 0.58 | 0.41 | 0.52 | 0.52 | 0.52 |
| +Training Opt | – | – | – | – | – | – | – | – | – |
| +SectionInfo | 0.82 | 0.48 | 0.61 | 0.21 | 0.56 | 0.31 | 0.50 | 0.50 | 0.50 |
| +Opt & SectionInfo | – | – | – | – | – | – | – | – | – |
| prior-biased classifier (baseline, second modelling task) | 0.60 | 0.64 | 0.62 | 0.41 | 0.37 | 0.39 | 0.53 | 0.53 | 0.53 |
| rule-based model (baseline, second modelling task) | 0.60 | 0.64 | 0.62 | 0.41 | 0.37 | 0.39 | 0.53 | 0.53 | 0.53 |

  1. The highest metric scores for the identification of each (in)consistency type are bolded
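As the caption notes, P, R, and \(F_{1}\) are computed per (in)consistency type at the evidence level, while \(P^*\) and \(R^*\) are micro-averaged over all predicted instances. The sketch below is a minimal illustration of the conventional way these quantities are computed, not the authors' code; the helper names and example labels are assumptions. It also shows why \(P^*\) equals \(R^*\) in every row of the table: with exactly one predicted label per instance, micro-precision and micro-recall both reduce to accuracy.

```python
def per_class_prf(gold, pred, label):
    """Evidence-level precision, recall, and F1 for one (in)consistency class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if p == label and g != label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


def micro_pr(gold, pred):
    """Micro-averaged precision (P*) and recall (R*) over all predicted instances.

    In single-label multi-class prediction, summed true positives over summed
    predictions equals summed true positives over summed gold labels, so
    P* == R* (both are accuracy), matching the identical P*/R* columns above.
    """
    correct = sum(1 for g, p in zip(gold, pred) if g == p)
    return correct / len(pred), correct / len(gold)


# Illustrative labels only; the real classes are Consistent and the
# (in)consistency types (A)-(D) defined in the paper.
gold = ["Consistent", "A", "B", "Consistent", "C"]
pred = ["Consistent", "A", "Consistent", "B", "C"]
print(per_class_prf(gold, pred, "Consistent"))  # (0.5, 0.5, 0.5)
print(micro_pr(gold, pred))                     # (0.6, 0.6)
```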