
Table 1 Review of the existing annotation techniques.

From: Semantic annotation of morphological descriptions: an overall strategy

| Methods | Handmade prerequisites and their reusability | Annotation level | Results and their reusability | Scope of evaluation | Performance (*) |
| --- | --- | --- | --- | --- | --- |
| Syntactic parsing: 1. Abascal & Sánchez (1999); 2. Taylor (1995) | Lexicon & grammar rules: Not good for another taxon group/collection. | 1. Paragraph; 2. Character | 1. Style clues: Less reusable. 2. Organ names & character states: Reusable. | 1. FNA v. 19; 2. Flora of New South Wales, Flora of Australia. | 1. Not reported; 2. Roughly estimated recall: 60%-80% |
| Supervised machine learning (text classification): Cui & al. (2002) | Training examples: Not good for another taxon group. | Paragraph | Classification models: Less reusable. | 1500+ descriptions from FNA | Recall: 94%; Precision: 97% |
| Ontology-based extraction: 1. Diederich, Fortuner & Milton (1999); 2. Wood & al. (2003) | Dictionaries, ontology, & checklists: Not good for another taxon group. | Character | Organ names & character states: Reusable. | 1. 16 descriptions; 2. 18 species descriptions from six Floras. | 1. Accuracy on 1 sample: 76%; 2. Recall: 66%, Precision: 74% |
| Supervised machine learning (extraction patterns): Tang & Heidorn (2007) | Extraction template & training examples: Not good for another taxon group. | Character, limited to these character states: leaf shape, size, color; fruit type. | Extraction patterns: Sensitive to text variations, less reusable. Character states: Reusable. | 1600 FNA species descriptions. | Recall: 33%-80%; Precision: 75%-100% |
| Supervised machine learning (association rules): Cui (2008a) | Annotation template & training examples: Not good for another taxon group. | Clause | Association rules: Reusable only within the same taxon group. | 16,000 descriptions from FNA, FOC, and FNCT | Recall and precision: 80%-95% |
| Unsupervised learning: Cui (2008b) | No prerequisites | 1. Clause; 2. Character | Organ names & character states: Reusable. | FNA, FOC, & Treatises Part H | Precision: 88%-95%; Recall: 50%-75% |

(*) Precision is the proportion of the computer's decisions that are correct. Recall is the proportion of all targets that the computer correctly discovers.
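The two measures defined in the footnote can be illustrated with a minimal sketch. It assumes that the computer's decisions and the true targets are both available as sets of labels; the function name `precision_recall` and the example labels are hypothetical and do not come from any of the reviewed systems.

```python
def precision_recall(predicted: set, targets: set) -> tuple[float, float]:
    """Return (precision, recall) for a set of decisions against a set of targets.

    Assumption: an empty denominator yields 0.0 rather than raising an error.
    """
    correct = predicted & targets  # decisions that match a true target
    precision = len(correct) / len(predicted) if predicted else 0.0
    recall = len(correct) / len(targets) if targets else 0.0
    return precision, recall


# Hypothetical example: organ names marked up by a system vs. a hand-made key.
predicted = {"leaf", "stem", "petal", "ovate"}
targets = {"leaf", "stem", "petal", "sepal", "fruit", "root"}

precision, recall = precision_recall(predicted, targets)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.75 recall=0.50
```

In this example three of the four decisions are correct (precision 75%), but only three of the six targets are found (recall 50%), which mirrors the pattern of high precision and lower recall reported for several methods in the table.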