
Table 4 Comparison of feature set quality using a t-test on stratified 10-fold CV subsets from the development set at the best-performing percentiles (per) of class 1.

From: Detection of interaction articles and experimental methods in biomedical literature

f1 @ per      CI                f2 @ per    CI                p        EI
S @ 19        [0.415, 0.480]    P @ 35      [0.397, 0.454]    0.0379   0.022
B @ 13        [0.460, 0.511]    M @ 13      [0.370, 0.461]    0.0082   0.070
B @ 13        [0.460, 0.511]    S @ 19      [0.415, 0.480]    0.0273   0.038
W @ 20        [0.491, 0.535]    B @ 13      [0.460, 0.511]    0.0116   0.028
WS @ 22       [0.510, 0.568]    W @ 20      [0.491, 0.535]    0.0043   0.026
WMS @ 20      [0.546, 0.577]    WS @ 22     [0.510, 0.568]    0.0414   0.023
WMBS @ 17     [0.549, 0.586]    WS @ 22     [0.510, 0.568]    0.0179   0.029
WMPBS @ 17    [0.558, 0.595]    WS @ 22     [0.510, 0.568]    0.0122   0.038

  1. Features considered are W (bag of words), M (MeSH), P (PPIscore), B (bigrams) and S (syntactic); for a detailed description see page 3. Read each row as in the following example (row 4): feature set W has a 95% confidence interval (CI) of [0.491, 0.535], and feature set B has a CI of [0.460, 0.511]. According to a t-test for dependent samples, feature set W is significantly better than feature set B (df = 9; p = 0.0116). The expected improvement (EI) in the MCC measure is at least 0.028 (95% confidence level). Note that the feature sets PBMSW and BMSW (listed in the table as WMPBS and WMBS) are not significantly better than MSW (listed as WMS). Among combinations of 2, 3 or 4 different feature sets, only the best-performing ones are shown in this table.
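
The row interpretation above rests on a t-test for dependent (paired) samples over 10 cross-validation folds, which gives df = 9. The following is a minimal sketch of that kind of comparison, not the authors' code. The per-fold MCC values are invented purely for illustration, the names `paired_t`, `mcc_a` and `mcc_b` are hypothetical, and the constant 2.262 is the standard two-sided 95% critical value of Student's t for 9 degrees of freedom. How the table's EI column was derived is not stated here, so the sketch only reports the mean paired difference and a confidence interval for it.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for df = 9 (standard table value).
T_CRIT_DF9 = 2.262


def paired_t(scores_a, scores_b):
    """Paired t-test statistic for two score lists measured on the same folds.

    Returns (mean_diff, standard_error, t_statistic, degrees_of_freedom).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both feature sets must be scored on the same folds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the per-fold differences (n - 1 denominator).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, std_err, mean_diff / std_err, n - 1


# Hypothetical per-fold MCC scores for two feature sets (illustrative only).
mcc_a = [0.52, 0.50, 0.53, 0.49, 0.51, 0.54, 0.50, 0.52, 0.51, 0.53]
mcc_b = [0.49, 0.48, 0.50, 0.47, 0.50, 0.50, 0.46, 0.49, 0.48, 0.51]

mean_diff, std_err, t_stat, df = paired_t(mcc_a, mcc_b)

# 95% confidence interval for the mean per-fold difference.
ci_low = mean_diff - T_CRIT_DF9 * std_err
ci_high = mean_diff + T_CRIT_DF9 * std_err

# The difference is significant at the 5% level if |t| exceeds the critical value.
significant = abs(t_stat) > T_CRIT_DF9

print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.2f}, df = {df}")
print(f"95% CI for the difference = [{ci_low:.3f}, {ci_high:.3f}]")
print(f"significant at 5% level: {significant}")
```

Pairing the scores fold by fold matters because both feature sets are evaluated on the same CV subsets: the test operates on the per-fold differences, so variation shared between folds cancels out. An exact p-value would additionally need the t distribution's CDF, which a statistics library can supply.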