Table 6 Evaluation using gene/protein name snippets from MEDLINE abstracts

From: Normalizing biomedical terms by minimizing ambiguity and variability

| Iter. | Dictionary: Ambiguity | Dictionary: Variability | Rule | Lookup performance: Precision | Lookup performance: Recall |
|---|---|---|---|---|---|
| 0 | 5.797 | 12.479 | (convert capital letters to lower case) | 0.782 | 0.582 |
| 1 | 5.807 | 12.161 | ‘-’ → ‘’ | 0.766 | 0.603 |
| 2 | 5.811 | 12.025 | ‘ precursor’ → ‘’ | 0.767 | 0.611 |
| 3 | 5.812 | 11.941 | ‘,’ → ‘’ | 0.767 | 0.611 |
| 4 | 5.812 | 11.907 | ‘inc finger protein’ → ‘nf’ | 0.767 | 0.611 |
| 5 | 5.812 | 11.868 | ‘ isoform 1’ → ‘’ | 0.767 | 0.611 |
| 6 | 5.813 | 11.832 | ‘ isoform 2’ → ‘’ | 0.766 | 0.611 |
| 7 | 5.813 | 11.806 | ‘ isoform a’ → ‘’ | 0.766 | 0.611 |
| 8 | 5.813 | 11.781 | ‘ isoform b’ → ‘’ | 0.766 | 0.611 |
| 9 | 5.813 | 11.748 | ‘ containing protein’ → ‘containing’ | 0.766 | 0.611 |
| 10 | 5.813 | 11.730 | ‘ variant’ → ‘’ | 0.766 | 0.611 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 21 | 5.815 | 11.597 | ‘nterleukin’ → ‘l’ | 0.767 | 0.613 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 24 | 5.816 | 11.566 | ‘specific’ → ‘’ | 0.767 | 0.615 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 33 | 5.816 | 11.450 | ‘protein’ → ‘gene’ | 0.765 | 0.616 |
| 34 | 5.828 | 11.056 | ‘ gene’ → ‘’ | 0.765 | 0.619 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 38 | 5.829 | 11.016 | ‘ recepto’ → ‘’ | 0.767 | 0.623 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 44 | 5.830 | 10.970 | ‘ alph’ → ‘’ | 0.765 | 0.625 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 75 | 5.831 | 10.838 | ‘ i’ → ‘1’ | 0.766 | 0.626 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 84 | 5.831 | 10.790 | ‘ lpha’ → ‘’ | 0.766 | 0.627 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 86 | 5.831 | 10.782 | ‘ beta’ → ‘b’ | 0.767 | 0.630 |
| ⋮ | ⋮ | ⋮ | ⋮ | ⋮ | ⋮ |
| 100 | 5.832 | 10.732 | ‘ type’ → ‘’ | 0.767 | 0.633 |
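
The rules in this table can be read as ordered substring replacements applied after lowercasing. The following is a minimal sketch of that reading, not the authors' implementation: it assumes each rule replaces every occurrence of its left-hand string, in iteration order. The names `RULES` and `normalize` and the sample terms are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: rules 1-5 from the table, in iteration order.
# Rule 0 (lowercasing) is applied separately inside normalize().
RULES = [
    ("-", ""),
    (" precursor", ""),
    (",", ""),
    ("inc finger protein", "nf"),
    (" isoform 1", ""),
]


def normalize(term, rules=RULES):
    """Lowercase the term, then apply each replacement rule in order.

    Assumption: every occurrence of a rule's left-hand side is replaced.
    """
    normalized = term.lower()
    for source, target in rules:
        normalized = normalized.replace(source, target)
    return normalized


# Illustrative terms (not taken from the evaluation data).
for name in ["IL-2 precursor", "IL2", "Zinc finger protein 217, isoform 1"]:
    print(repr(name), "->", repr(normalize(name)))
```

Under these assumptions, spelling variants such as "IL-2 precursor" and "IL2" collapse to the same normalized string, which is the effect the falling variability column reflects.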