From: Structured learning for spatial information extraction from biomedical text: bacteria biotopes
Type | Feature name | Description |
---|---|---|
Word features | Lexical-form | Word surface that appears in the text |
 | Bio-lemma | Word lemma produced by a biomedical-domain lemmatizer that uses additional lexical resources [24] |
 | POS-tag | Part-of-speech tag of the word, which provides syntactic information for training |
 | Dprl | Dependency relation of the word to its syntactic head, which gives clues to semantic relationships |
 | Cocoa | Word tag assigned by Cocoa, an external resource of biological concepts |
 | Capital | Whether the word starts with a capital letter |
 | Stop-word | Whether the word belongs to a list of stop words |
Phrase features | Head-features | The features of the word that is the syntactic head of the phrase |
 | nHead-features | The features of the other (non-head) words in the phrase |
 | Lexical-surface | Concatenation of the lexical forms of the words in the phrase |
 | Phrasal-POS | The phrasal part of speech tag: the parse tree tag of the common parent of the words in a phrase |
 | NCBI-sim | Similarity between the phrase and the bacterium names listed in NCBI |
 | Ontobio-sim | Similarity between the phrase and the habitat classes in OntoBiotope |
Phrase-pair features | Same-par | Whether the two phrases occur in the same paragraph |
 | Same-sen | Whether the two phrases occur in the same sentence |
 | inTitle | Whether the bacterium candidate occurs in the title |
 | Verb | The verb between the two phrases, if they are in the same sentence |
 | Preposition | The preposition between the two phrases, if they are in the same sentence |
 | Parse-Dis | The distance between the two phrases in the parse tree |
 | Parse-Path | The path between the two phrases using the parse tree |
 | Heads-Lem | The concatenation of the lemmas of the two heads |
 | Heads-POS | The concatenation of the POS tags of the two heads |
 | Dep-Path | The dependency path between the two heads |
Relation-pair features | Same-B | Whether the two relations have exactly the same bacterium candidate |
 | Sim-BH | Similarity of two relations based on the similarity of their bacterium and habitat candidates |
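
To make a few of the simpler binary features in the table concrete, the sketch below computes Capital and Stop-word for a word, Same-par, Same-sen and inTitle for a phrase pair, and Same-B for a relation pair. It is an illustration only, not the authors' implementation: the `Phrase` schema, the function names, the tiny stop-word list, the convention that paragraph 0 is the title, and the example phrases are all assumptions introduced here.

```python
from dataclasses import dataclass

# Hypothetical stop-word list; the table does not say which list was used.
STOP_WORDS = {"the", "a", "an", "of", "in", "and", "is", "was", "from"}


@dataclass(frozen=True)
class Phrase:
    """A candidate phrase with its position in the document (illustrative schema)."""
    words: tuple
    paragraph: int  # paragraph index; 0 is assumed to be the title
    sentence: int   # sentence index counted over the whole document


def word_features(word):
    """Lexical-form, Capital and Stop-word features for a single word."""
    return {
        "lexical_form": word,
        "capital": word[:1].isupper(),
        "stop_word": word.lower() in STOP_WORDS,
    }


def phrase_pair_features(bacterium, habitat):
    """Same-par, Same-sen and inTitle features for a (bacterium, habitat) pair."""
    return {
        "same_par": bacterium.paragraph == habitat.paragraph,
        "same_sen": bacterium.sentence == habitat.sentence,
        "in_title": bacterium.paragraph == 0,
    }


def same_b(relation_1, relation_2):
    """Same-B: both relations share exactly the same bacterium candidate.

    Each relation is assumed to be a (bacterium, habitat) tuple of Phrase objects.
    """
    return relation_1[0] == relation_2[0]


# Made-up example phrases, used only to exercise the functions.
bacterium = Phrase(("Bifidobacterium", "longum"), paragraph=1, sentence=3)
habitat_1 = Phrase(("human", "gut"), paragraph=1, sentence=3)
habitat_2 = Phrase(("fermented", "milk"), paragraph=2, sentence=7)

print(word_features("Bifidobacterium"))
print(phrase_pair_features(bacterium, habitat_1))
print(same_b((bacterium, habitat_1), (bacterium, habitat_2)))
```

The remaining features (lemmas, POS tags, dependency and parse-tree paths, Cocoa tags, and the NCBI and OntoBiotope comparisons) depend on external tools and resources and are not sketched here.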