Table 15 The features used in the baseline argument classification model

From: BIOSMILE: A semantic role labeling system for biomedical verbs using a maximum-entropy model with automatically generated template features

Predicate – The predicate lemma
Path – The syntactic path through the parsing tree from the constituent being classified to the predicate
Constituent type
Position – Whether the phrase is located before or after the predicate
Voice – Passive if the predicate has the POS tag VBN, and its chunk is not a VP, or it is preceded by a form of "to be" or "to get" within its chunk; otherwise, active
Head word – Calculated using the head word table described by Collins (1999)
Head POS – The POS tag of the head word
Sub-categorization – The phrase structure rule that expands the predicate's parent node in the parsing tree
First and last words and their POS tags
Level – The level in the parsing tree
Predicate's verb class
Predicate POS tag
Predicate frequency
Predicate's context POS
Number of predicates
Parent, left sibling, and right sibling paths, constituent types, positions, head words, and head POS tags
Head of Prepositional Phrase (PP) parent – If the parent is a PP, then the head of this PP is also used as a feature
Predicate distance combination
Predicate phrase type combination
Head word and predicate combination
Voice position combination
Syntactic frame of predicate/NP
Headword suffixes of lengths 2, 3, and 4
Number of words in the phrase
Context words & POS tags
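
Three of the rows above (Voice, Position, and Headword suffixes) are simple enough to sketch in code. The Python below is a minimal illustration, not the BIOSMILE implementation. The token representation, the function names, and the list of "to be"/"to get" forms are all hypothetical. The voice rule as worded in the table is ambiguous, so the sketch assumes one reading: the predicate must be tagged VBN, and then either its chunk is not a VP or a form of "to be"/"to get" precedes it within the chunk. Skipping suffixes longer than the head word is likewise an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical token representation: (word, POS tag).
Token = Tuple[str, str]

# Assumed inventory of "to be" / "to get" forms; the table does not list them.
BE_GET_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "get", "gets", "got", "gotten", "getting",
}


def voice(chunk: List[Token], chunk_type: str, pred_index: int) -> str:
    """Voice feature under the assumed reading:
    VBN and (chunk is not a VP, or a be/get form precedes it in the chunk)."""
    _, pos = chunk[pred_index]
    if pos != "VBN":
        return "active"
    if chunk_type != "VP":
        return "passive"
    preceding = [word.lower() for word, _ in chunk[:pred_index]]
    return "passive" if any(w in BE_GET_FORMS for w in preceding) else "active"


def position(constituent_start: int, predicate_index: int) -> str:
    """Position feature: is the phrase before or after the predicate?"""
    return "before" if constituent_start < predicate_index else "after"


def head_suffixes(head_word: str, lengths=(2, 3, 4)) -> Dict[int, str]:
    """Headword suffixes of lengths 2, 3, and 4.
    Assumption: lengths longer than the word itself are skipped."""
    return {n: head_word[-n:] for n in lengths if len(head_word) >= n}
```

For example, `voice([("was", "VBD"), ("phosphorylated", "VBN")], "VP", 1)` yields `"passive"` under this reading, and `head_suffixes("kinase")` yields the suffixes `"se"`, `"ase"`, and `"nase"`.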