
Table 1 The rule specifications

From: A corpus for plant-chemical relationships in the biomedical domain

| Rule specification type | Rule structure | Trigger word form | Constraint | Example (PMID) |
| --- | --- | --- | --- | --- |
| Verbal trigger rule | NP0 Vtr NP1 | Transitive verb (active form) | | [Pomegranate derived from the tree Punica granatum] [contains] [anthocyanins]. (PMID: 15493960) |
| | NP1 Vtr PP NP0 | Transitive verb (passive form) | Any preposition between Vtr and NP0 | [About 450 mg of FB1] [were obtained] [from] [800g cultured corn]. (PMID: 23605447) |
| | NP0 Vtr PP NP1 | Intransitive verb | Any preposition between Vtr and NP1 | [The volatile oil (2-3 %) of ginger] [consists] [of] mainly [mono and sesquiterpenes]. (PMID: 17637489) |
| Preposition trigger rule | NP0 PPtr NP1 | | | [switchgrass] [as] [a sole carbon (C) source]. (PMID: 22354956) |
| | NP1 PPtr NP0 | | | [Saponins] [from] [the flowers of Panax notoginseng]. (PMID: 20518315) |
| Relative trigger rule | NP1 Rtr PP NP0 | Past participle form | Any preposition between Rtr and NP0 | [Anthocyanins] [isolated] [from] [black soybean seed coat]. (PMID: 16457818) |
| | NP0 Rtr (PP) NP1 | Gerund form | When the trigger word (Rtr) is “consisting,” it must be followed by the preposition (PP) “of.” | With [thermally degraded Feverfew powder] [containing] [less contents of parthenolide] no built-up antiserotonergic responses were observed after one month. (PMID: 11603284) |
| Apposition trigger rule | NP0 APtr NP1 | Apposition form (e.g. comma) | The token distance between NP0 and NP1 should be within ten. | Whereas that in PD is [soybean oil][,] [a source of unsaturated fatty acids]. (PMID: 19932903) |
| | NP1 APtr NP0 | Apposition form (e.g. comma) | The token distance between NP0 and NP1 should be within ten. | [Delta9-tetrahydrocannabinol (THC)][,] [the major active component of marijuana]. (PMID: 9129126) |
| Copula trigger rule | NP0 Ctr NP1 | Be verb form | The token distance between NP0 and NP1 should be within ten. | [Haematococcus pluvialis] [is] [one of the potent organisms for production of astaxanthin]. (PMID: 23605447) |
| | NP1 Ctr NP0 | Be verb form | The token distance between NP0 and NP1 should be within ten. | [The calcium contents] [were] the highest in [the papaya]. (PMID: 21695915) |
| Compound noun trigger rule | NP0 CNtr NP1 | White space | | To study the protective effect of [panax notoginseng] [saponins (PNS)]. (PMID: 19317166) |

  1. The rule-based model consists of six types of rules. The first column shows the specification name. Each specification has one or more rule structures, shown in the second column. In a rule structure, NP0 denotes the noun phrase containing a plant name, and NP1 denotes the noun phrase in which a chemical name appears. The component marked with “tr” is the trigger word, whose form is described in the third column. We also defined several constraints where necessary.
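
To make the rule structures and the token-distance constraint concrete, the following Python sketch checks whether a set of labelled spans matches one rule structure from Table 1. It is an illustration only, not the authors' implementation: the `Span` class, the `token_distance` and `matches_rule` helpers, the whitespace tokenization, and the reading of "token distance" as the number of tokens strictly between the two noun phrases are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical representation: a labelled, half-open token span [start, end).
@dataclass(frozen=True)
class Span:
    label: str  # "NP0" (plant), "NP1" (chemical), or a trigger label such as "Ctr"
    start: int  # index of the first token
    end: int    # index one past the last token

def token_distance(a: Span, b: Span) -> int:
    """Tokens strictly between two spans (assumed meaning of 'token distance')."""
    first, second = sorted((a, b), key=lambda s: s.start)
    return max(0, second.start - first.end)

def matches_rule(spans, structure, max_distance=None):
    """Return True if the spans, read left to right, follow `structure`.

    `structure` is a list of labels such as ["NP0", "Ctr", "NP1"].
    If `max_distance` is given, NP0 and NP1 must also lie within that
    many tokens of each other.
    """
    ordered = sorted(spans, key=lambda s: s.start)
    if [s.label for s in ordered] != list(structure):
        return False
    if max_distance is not None:
        noun_phrases = [s for s in ordered if s.label in ("NP0", "NP1")]
        if len(noun_phrases) != 2:
            return False
        return token_distance(noun_phrases[0], noun_phrases[1]) <= max_distance
    return True

# Copula example from Table 1, split on whitespace (an assumed tokenization).
tokens = ("Haematococcus pluvialis is one of the potent organisms "
          "for production of astaxanthin .").split()
copula_spans = [
    Span("NP0", 0, 2),   # "Haematococcus pluvialis"
    Span("Ctr", 2, 3),   # "is"
    Span("NP1", 3, 12),  # "one of the potent organisms for production of astaxanthin"
]

copula_ok = matches_rule(copula_spans, ["NP0", "Ctr", "NP1"], max_distance=10)
reversed_ok = matches_rule(copula_spans, ["NP1", "Ctr", "NP0"], max_distance=10)
```

Here `copula_ok` is true because the spans appear in the order NP0, trigger, NP1 and the trigger is the only token between the two noun phrases, while `reversed_ok` is false because the order does not match the reversed structure. A real system would also need to find the noun phrases and trigger words in the first place, which this sketch takes as given.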