
Table 4 Comparison of feature set quality using a t-test on stratified 10-fold CV subsets of the development set, at the best-performing percentiles (per) of class 1.

From: Detection of interaction articles and experimental methods in biomedical literature

f1 @ per      CI                f2 @ per    CI                p        EI
S @ 19        [0.415, 0.480]    P @ 35      [0.397, 0.454]    0.0379   0.022
B @ 13        [0.460, 0.511]    M @ 13      [0.370, 0.461]    0.0082   0.070
B @ 13        [0.460, 0.511]    S @ 19      [0.415, 0.480]    0.0273   0.038
W @ 20        [0.491, 0.535]    B @ 13      [0.460, 0.511]    0.0116   0.028
WS @ 22       [0.510, 0.568]    W @ 20      [0.491, 0.535]    0.0043   0.026
WMS @ 20      [0.546, 0.577]    WS @ 22     [0.510, 0.568]    0.0414   0.023
WMBS @ 17     [0.549, 0.586]    WS @ 22     [0.510, 0.568]    0.0179   0.029
WMPBS @ 17    [0.558, 0.595]    WS @ 22     [0.510, 0.568]    0.0122   0.038
  1. Features considered are W (bag of words), M (MeSH), P (PPIscore), B (bigrams), and S (syntactic); for a detailed description see page 3. Interpret the rows as follows (e.g. row 4): feature set W has a 95% confidence interval (CI) of [0.491, 0.535], and feature set B has one of [0.460, 0.511]. According to a t-test for dependent samples, feature set W is significantly better than feature set B (df = 9; p = 0.0116). The expected improvement (EI) of the MCC measure is at least 0.028 (95% confidence level). Note that feature sets PBMSW and BMSW (listed in the table as WMPBS and WMBS) are not significantly better than MSW (listed as WMS). For combinations of 2, 3, or 4 different feature sets, only the best-performing ones are shown in this table.
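
The row interpretation above can be illustrated with a minimal sketch of a t-test for dependent samples over 10 cross-validation folds. The per-fold MCC values below are hypothetical placeholders, since the table does not report per-fold scores, and the function names are illustrative only. The critical values are hard-coded for df = 9, and whether the reported p-values are one-sided or two-sided is not stated in the footnote, so the sketch only compares the t statistic against both thresholds. As an observation on the rows above, the tabulated EI appears to coincide with the difference between the midpoints of the two CIs (row 4: 0.513 - 0.4855 ≈ 0.028), which is what `mean_diff` computes here.

```python
import math
import statistics

# Critical values of Student's t distribution for df = 9 (10 folds).
# These constants are only valid for exactly 10 paired observations.
T_CRIT_TWO_SIDED_95 = 2.262
T_CRIT_ONE_SIDED_95 = 1.833


def confidence_interval(scores, t_crit=T_CRIT_TWO_SIDED_95):
    """Two-sided 95% CI of the mean score across folds."""
    mean = statistics.mean(scores)
    half_width = t_crit * statistics.stdev(scores) / math.sqrt(len(scores))
    return mean - half_width, mean + half_width


def paired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom, mean difference a - b)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1, mean_diff


# Hypothetical per-fold MCC scores (NOT taken from the paper).
mcc_w = [0.52, 0.49, 0.55, 0.50, 0.53, 0.48, 0.54, 0.51, 0.50, 0.52]
mcc_b = [0.49, 0.47, 0.51, 0.48, 0.50, 0.46, 0.52, 0.47, 0.48, 0.49]

t_stat, df, mean_diff = paired_t(mcc_w, mcc_b)
ci_w = confidence_interval(mcc_w)
ci_b = confidence_interval(mcc_b)

print(f"CI(W) = [{ci_w[0]:.3f}, {ci_w[1]:.3f}]")
print(f"CI(B) = [{ci_b[0]:.3f}, {ci_b[1]:.3f}]")
print(f"t = {t_stat:.2f}, df = {df}, mean improvement = {mean_diff:.3f}")
print("significant (two-sided, 5%):", abs(t_stat) > T_CRIT_TWO_SIDED_95)
print("significant (one-sided, 5%):", t_stat > T_CRIT_ONE_SIDED_95)
```

Because both feature sets are scored on the same folds, the test operates on the per-fold differences rather than on the two score lists independently. This is why two feature sets with overlapping CIs, as in several rows above, can still differ significantly.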