
Table 1 The features used in BIOSMILE

From: A resource-saving collective approach to biomedical semantic role labeling

Basic features

• Verb predicate – The verb predicate lemma

• Path – The syntactic path through the parse tree from the constituent being classified to the verb predicate

• Constituent type (CT)

• Position – Whether the phrase is located before or after the verb predicate

• Voice – passive if the verb predicate has a POS tag VBN, and its chunk is not a VP, or it is preceded by a form of "to be" or "to get" within its chunk; otherwise, it is active

• Head word – Calculated using the head word table described by Collins (1999)

• Head POS – The POS tag of the head word

• Sub-categorization – The phrase structure rule that expands the predicate's parent node in the parse tree

• First and last word (FW and LW) and their POS tags

• Level – The level in the parse tree
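The Voice and Position rules above can be sketched in code. This is a minimal illustration, not the BIOSMILE implementation: the token layout `(word, pos, chunk_id, chunk_type)`, the `BE_GET_FORMS` word list, and the function names are all assumptions made for this example. The Voice rule is read here as "VBN and (chunk is not a VP, or preceded by a form of to be/to get within the chunk)", which is one plausible reading of the table entry.

```python
# Hypothetical word list; the table does not enumerate the forms it uses.
BE_GET_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "get", "gets", "got", "gotten", "getting",
}


def voice(tokens, pred_index):
    """Return "passive" or "active" for the predicate at pred_index.

    tokens is a list of (word, pos, chunk_id, chunk_type) tuples;
    this layout is an assumption of the sketch.
    """
    _, pos, chunk_id, chunk_type = tokens[pred_index]
    if pos != "VBN":
        return "active"
    if chunk_type != "VP":
        return "passive"
    # Scan leftwards, staying inside the predicate's own chunk.
    for word, _, cid, _ in reversed(tokens[:pred_index]):
        if cid != chunk_id:
            break
        if word.lower() in BE_GET_FORMS:
            return "passive"
    return "active"


def position(phrase_start, pred_index):
    """Return whether the phrase starts before or after the predicate."""
    return "before" if phrase_start < pred_index else "after"


passive_sent = [
    ("The", "DT", 0, "NP"), ("gene", "NN", 0, "NP"),
    ("was", "VBD", 1, "VP"), ("expressed", "VBN", 1, "VP"),
]
perfect_sent = [
    ("Cells", "NNS", 0, "NP"),
    ("have", "VBP", 1, "VP"), ("expressed", "VBN", 1, "VP"),
]
print(voice(passive_sent, 3), voice(perfect_sent, 2), position(0, 3))
```

In the second sentence the VBN sits in a VP chunk with no preceding form of "to be" or "to get", so the sketch labels it active.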

Verb predicate features

• Verb predicate's verb class

• Verb predicate POS tag

• Verb predicate frequency

• Verb predicate's context POS

• Number of verb predicates

Full parsing features

• Parent, left sibling, and right sibling paths, constituent types, positions, head words, and head POS tags

• Head of Prepositional Phrase (PP) parent – If the parent is a PP, then the head of this PP is also used as a feature
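Path-style features, such as the Path entry under the basic features and the parent and sibling paths listed here, can be computed by walking up from the constituent to the lowest common ancestor and then down to the verb predicate. The sketch below assumes a hypothetical `Node` class with parent pointers and uses up and down arrows as separators; neither the class nor the output notation is taken from the table.

```python
UP, DOWN = "\u2191", "\u2193"  # arrow separators (notation assumed)


class Node:
    """Minimal parse-tree node for illustration (hypothetical)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Return the chain [node, parent, ..., root]."""
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain


def tree_path(constituent, predicate):
    """Labels from constituent up to the common ancestor, then down."""
    up = ancestors(constituent)
    down = ancestors(predicate)
    down_ids = {id(n) for n in down}
    # First node on the upward chain that also dominates the predicate.
    for i, node in enumerate(up):
        if id(node) in down_ids:
            break
    else:
        raise ValueError("nodes are not in the same tree")
    j = next(k for k, n in enumerate(down) if n is up[i])
    up_labels = [n.label for n in up[: i + 1]]
    down_labels = [n.label for n in reversed(down[:j])]
    path = UP.join(up_labels)
    if down_labels:
        path += DOWN + DOWN.join(down_labels)
    return path


# (S (NP (NN)) (VP (VBD) (NP)))
subj = Node("NP", [Node("NN")])
verb = Node("VBD")
obj = Node("NP")
root = Node("S", [subj, Node("VP", [verb, obj])])
print(tree_path(subj, verb))
print(tree_path(obj, verb))
```

The same helper applied to a constituent's parent or siblings would yield the parent and sibling path variants.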

Combination features

• Verb predicate distance combination

• Verb predicate phrase type combination

• Head word and verb predicate combination

• Voice position combination

Others

• Syntactic frame of verb predicate/NP

• Headword suffixes of lengths 2, 3, and 4

• Number of words in the phrase

• Context words and their POS tags
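As an illustration of the simpler surface features in this group, the following sketch extracts head-word suffixes of lengths 2, 3, and 4 and the phrase length. The function names, the feature keys, the lowercasing, and the choice to skip suffixes longer than the word are assumptions of this example, not details given in the table.

```python
def suffix_features(head_word, lengths=(2, 3, 4)):
    """Map each requested length to the head word's suffix of that length.

    Lowercasing and skipping lengths longer than the word are assumptions.
    """
    word = head_word.lower()
    return {f"suffix{n}": word[-n:] for n in lengths if len(word) >= n}


def phrase_length(phrase_tokens):
    """Number of words in the phrase."""
    return len(phrase_tokens)


print(suffix_features("phosphorylation"))
print(suffix_features("RNA"))
print(phrase_length(["the", "mutant", "protein"]))
```

Suffixes such as "tion" or "ase" can hint at the semantic type of a biomedical head word, which is presumably why they are included alongside the head word itself.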