Table 6 Most important features as determined by the scikit-learn ExtraTreesClassifier.

From: TEES 2.2: Biomedical Event Extraction for Diverse Corpora

| step | # | weight | feature | group |
|---|---|---|---|---|
| trigger | 1 | 0.0087 | POS VB | token |
| | 2 | 0.0078 | linear_3_txt_I | linear |
| | 3 | 0.0066 | stem_induct | subtoken |
| | 4 | 0.006 | dt_on | subtoken |
| | 5 | 0.0054 | linear_3_txt_we | linear |
| | 6 | 0.0054 | linear_3_txt_was | linear |
| | 7 | 0.0041 | linear_-1_txt_inhibits | linear |
| | 8 | 0.0041 | dt_si | subtoken |
| | 9 | 0.0038 | dist_3_annType_Protein | dependencies |
| | 10 | 0.0034 | dt_xp | subtoken |
| edge | 1 | 0.009 | e2_txt_Id1 | entity |
| | 2 | 0.0042 | tok_FFtxt_phosphorylation | path |
| | 3 | 0.0039 | dep_Reverse_dobj | path |
| | 4 | 0.0036 | tokenPath_Positive_regulation_e1_Positive_regulation_ | path |
| | 5 | 0.0035 | GENIA_target_protein | entity |
| | 6 | 0.0034 | POS_VBZ | path |
| | 7 | 0.0034 | tok_RFFFtxt_mRNA | path |
| | 8 | 0.0028 | tok_RFFtxt_phosphorylation | path |
| | 9 | 0.0025 | tok_RRtxt_Id2 | path |
| | 10 | 0.0025 | txt_block | path |
| unmerging | 1 | 0.0064 | argTheme_dep_Reverse_prep_of | args |
| | 2 | 0.0062 | argTheme_POS_NN | args |
| | 3 | 0.006 | argTheme_txt_expression | args |
| | 4 | 0.0048 | trg_dt_up | trigger |
| | 5 | 0.0047 | trg_chain_dist_dist_2-rev_appos-rev_punct | trigger |
| | 6 | 0.0045 | trg_dt_xp | trigger |
| | 7 | 0.0043 | trg_tt_ssi | trigger |
| | 8 | 0.0041 | argTheme_txt_affected | args |
| | 9 | 0.0041 | trg_dt_ex | trigger |
| | 10 | 0.0041 | argThemetrg_dep_dist_dist_3dep | args |
| modifier | 1 | 0.013 | t1HOut_neg_RB | dependencies |
| | 2 | 0.013 | t1HOut_neg | dependencies |
| | 3 | 0.011 | t1HOut_nsubjpass_NAMED_ENT | dependencies |
| | 4 | 0.0089 | dep_dist_dist_3neg | dependencies |
| | 5 | 0.0074 | t1HOut_not | dependencies |
| | 6 | 0.0072 | dist_3_txt_not | dependencies |
| | 7 | 0.0053 | dist_3_txt_significantly | dependencies |
| | 8 | 0.0048 | chain_dist_dist_1-rev_nsubjpass-frw_conj_and-rev_dep | dependencies |
| | 9 | 0.0044 | linear_-2_txt_was | linear |
| | 10 | 0.0032 | t1HOut_advmod | dependencies |

  1. See Figure 5 for feature group details.
  2. The weights are relative within each classification step.
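
The table does not show how the rankings were computed, so the following is only a minimal sketch of one way to obtain a per-step top-10 list with scikit-learn. It assumes each example is a dictionary mapping feature names to values, vectorized with `DictVectorizer`, and that the weights correspond to the classifier's `feature_importances_` attribute. The function name `top_features`, the toy examples, and the hyperparameters (`n_estimators=100`, `random_state=0`) are hypothetical and are not taken from the source. The feature names in the toy data are borrowed from the modifier rows purely for illustration.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction import DictVectorizer


def top_features(examples, labels, k=10, seed=0):
    """Return the k highest-importance (weight, feature name) pairs.

    examples: list of dicts mapping feature name -> numeric value
    labels:   list of class labels, one per example
    """
    # Turn the sparse name -> value dicts into a dense feature matrix.
    vectorizer = DictVectorizer(sparse=False)
    X = vectorizer.fit_transform(examples)

    # Hyperparameters here are illustrative assumptions, not the paper's.
    clf = ExtraTreesClassifier(n_estimators=100, random_state=seed)
    clf.fit(X, labels)

    # feature_importances_ is normalized to sum to 1 over all features
    # of this one fitted model.
    importances = clf.feature_importances_
    order = np.argsort(importances)[::-1][:k]
    names = vectorizer.feature_names_
    return [(float(importances[i]), names[i]) for i in order]


# Hypothetical toy data: the negation feature alone separates the classes.
toy_examples = [
    {"t1HOut_neg": 1, "linear_-2_txt_was": 1},
    {"t1HOut_neg": 1},
    {"t1HOut_neg": 1, "dist_3_txt_significantly": 1},
    {"t1HOut_neg": 1, "linear_-2_txt_was": 1},
    {"linear_-2_txt_was": 1},
    {"dist_3_txt_significantly": 1},
    {"linear_-2_txt_was": 1, "dist_3_txt_significantly": 1},
    {},
]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

for rank, (weight, name) in enumerate(top_features(toy_examples, toy_labels, k=3), 1):
    print(rank, round(weight, 4), name)
```

Because scikit-learn normalizes the importances of a fitted model to sum to 1, weights produced this way are only meaningful relative to other features of the same model. Fitting one model per classification step, as sketched here, would yield weights that cannot be compared across steps.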