
Table 1 Review of the existing annotation techniques.

From: Semantic annotation of morphological descriptions: an overall strategy

| Methods | Handmade prerequisites and their reusability | Annotation level | Results and their reusability | Scope of evaluation | Performance (*) |
|---|---|---|---|---|---|
| Syntactic parsing: (1) Abascal & Sanchenz (1999); (2) Taylor (1995) | Lexicon & grammar rules: not good for another taxon group/collection. | (1) Paragraph; (2) Character | (1) Style clues: less reusable. (2) Organ names & character states: reusable. | (1) FNA v. 19; (2) Flora of New South Wales, Flora of Australia | (1) Not reported; (2) Roughly estimated recall: 60%-80% |
| Supervised machine learning (text classification): Cui & al. (2002) | Training examples: not good for another taxon group. | Paragraph | Classification models: less reusable. | 1500+ descriptions from FNA | Recall: 94%; Precision: 97% |
| Ontology-based extraction: (1) Diederich, Fortuner & Milton (1999); (2) Wood & al. (2003) | Dictionaries, ontology, & checklists: not good for another taxon group. | Character | Organ names & character states: reusable. | (1) 16 descriptions; (2) 18 species descriptions from six Floras | (1) Accuracy on 1 sample: 76%; (2) Recall: 66%, Precision: 74% |
| Supervised machine learning (extraction patterns): Tang & Heidorn (2007) | Extraction template & training examples: not good for another taxon group. | Character, limited to these character states: leaf shape, size, color; fruit type | Extraction patterns: sensitive to text variations, less reusable. Character states: reusable. | 1600 FNA species descriptions | Recall: 33%-80%; Precision: 75%-100% |
| Supervised machine learning (association rules): Cui (2008a) | Annotation template & training examples: not good for another taxon group. | Clause | Association rules: reusable only within the same taxon group. | 16,000 descriptions from FNA, FOC, and FNCT | Recall and precision: 80%-95% |
| Unsupervised learning: Cui (2008b) | No prerequisites | (1) Clause; (2) Character | Organ names & character states: reusable. | FNA, FOC, & Treatises Part H | Precision: 88%-95%; Recall: 50%-75% |
(*) Precision is the proportion of the computer's decisions that are correct. Recall is the proportion of all targets that the computer correctly discovers.
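
The two measures defined in the footnote can be written as a short calculation. The following is a minimal sketch, not taken from any of the reviewed systems: the function name `precision_recall` and the counts in the usage lines are invented for illustration. It assumes that a "decision" is any annotation the computer produced and a "target" is any annotation present in the gold standard.

```python
def precision_recall(correct, incorrect, missed):
    """Compute precision and recall from raw annotation counts.

    correct   -- decisions made by the computer that match a target
    incorrect -- decisions made by the computer that match no target
    missed    -- targets the computer failed to discover
    (All three parameter names are hypothetical, chosen for this sketch.)
    """
    decisions = correct + incorrect   # everything the computer decided
    targets = correct + missed        # everything it should have found
    # Guard against empty denominators rather than raising an error.
    precision = correct / decisions if decisions else 0.0
    recall = correct / targets if targets else 0.0
    return precision, recall


# Illustrative counts only, not data from the table:
# 8 correct decisions, 2 incorrect decisions, 8 targets missed.
precision, recall = precision_recall(correct=8, incorrect=2, missed=8)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With these invented counts the computer is right in 8 of its 10 decisions but finds only 8 of the 16 targets, which is the pattern of high precision with lower recall.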