Table 2 Feature comparison over all results

From: Feature engineering for MEDLINE citation categorization with MeSH

| Feature               | SVMLight | SVM-perf | AdaBoostM1 | Ada Over |
|-----------------------|----------|----------|------------|----------|
| Unigram               | 0.418    | 0.492 †  | 0.420      | 0.471 †  |
| Bigram                | 0.406    | 0.513* † | 0.420      | 0.477* † |
| Argumentative         | 0.403    | 0.479 †  | 0.415      | 0.464 †  |
| Noun phrases          | 0.222    | 0.329 †  | 0.222      | 0.271 †  |
| Concepts              | 0.409    | 0.497* † | 0.427      | 0.480* † |
| CUIs                  | 0.398    | 0.496    | 0.422      | 0.475 †  |
| MTI predictions       | 0.513*   | 0.531* † | 0.478*     | 0.501* † |
| MTI MMI               | 0.398    | 0.454 †  | 0.367      | 0.382 †  |
| MTI PRC               | 0.481*   | 0.502 †  | 0.430      | 0.453 †  |
| First level taxonomy  | 0.300    | 0.456 †  | 0.351      | 0.429 †  |
| Second level taxonomy | 0.222    | 0.424 †  | 0.329      | 0.393 †  |
| Third level taxonomy  | 0.173    | 0.383    | 0.285      | 0.341 †  |
| Journal               | 0.115    | 0.193 †  | 0.126      | 0.208 †  |
| Affiliation           | 0.046    | 0.064    | 0.045      | 0.044 †  |
| Author                | 0.062    | 0.137 †  | 0.081      | 0.084 †  |

  1. Results are reported as F-measure, using a binary representation of features. The learning algorithms are SVMLight, SVM-perf, AdaBoostM1, and AdaBoostM1 with oversampling of positive instances (Ada Over). For each column, results significantly better than unigram (p < 0.05) are indicated with *. For each pair of methods (SVMLight/SVM-perf and AdaBoostM1/Ada Over), statistically significant differences are marked with †.
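
The three ingredients named in the table note (binary feature representation, oversampling of positive instances, and F-measure) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names are hypothetical, the oversampling here simply duplicates random positives until the two classes are the same size (the note does not state the ratio actually used), and F-measure is assumed to be the balanced F1.

```python
import random


def binary_features(tokens, vocabulary):
    """Return a 0/1 vector: 1 if the vocabulary term occurs in the citation, else 0."""
    present = set(tokens)
    return [1 if term in present else 0 for term in vocabulary]


def oversample_positives(examples, labels, seed=0):
    """Duplicate randomly chosen positives until both classes have equal size.

    Balancing to a 1:1 ratio is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    positives = [i for i, y in enumerate(labels) if y == 1]
    negatives = [i for i, y in enumerate(labels) if y == 0]
    if not positives or len(positives) >= len(negatives):
        return list(examples), list(labels)
    extra = [rng.choice(positives) for _ in range(len(negatives) - len(positives))]
    indices = list(range(len(labels))) + extra
    return [examples[i] for i in indices], [labels[i] for i in indices]


def f_measure(gold, predicted):
    """Balanced F-measure (F1) for binary labels, with 1 as the positive class."""
    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With a binary representation, a term that appears several times in a citation contributes the same value as a term that appears once, so only presence or absence is encoded.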