Table 6 Feature combination results

From: Feature engineering for MEDLINE citation categorization with MeSH

  

| Feature combination | SVM-perf Prec | SVM-perf Rec | SVM-perf F1 | Ada over Prec | Ada over Rec | Ada over F1 |
|---|---|---|---|---|---|---|
| Unigram | 0.395 | 0.654 | 0.492 | 0.528 | 0.425 | 0.471 |
| Unigram+CUI | 0.409 | 0.657 | 0.504* | 0.529 | 0.437 | 0.479* |
| Unigram+Meta | 0.387 | 0.672 | 0.491 | 0.550 | 0.405 | 0.466 |
| Unigram+NP | 0.382 | 0.701 | 0.495* | 0.535 | 0.424 | 0.473 |
| Unigram+Taxo | 0.403 | 0.660 | 0.500* | 0.531 | 0.432 | 0.477 |
| Unigram+mti | 0.448 | 0.679 | 0.540* | 0.586 | 0.477 | 0.526* |
| Unigram+mmi+prc | 0.445 | 0.677 | 0.537* | 0.583 | 0.474 | 0.523* |
| Unigram+all | 0.452 | 0.689 | 0.546* | 0.600 | 0.476 | 0.531* |
| TIAB+bigram | 0.408 | 0.685 | 0.512 | 0.556 | 0.421 | 0.479 |
| TIAB+bigram+CUI | 0.439 | 0.688 | 0.536 | 0.556 | 0.435 | 0.488* |
| TIAB+bigram+Meta | 0.408 | 0.689 | 0.513 | 0.581 | 0.406 | 0.478 |
| TIAB+bigram+NP | 0.417 | 0.686 | 0.518* | 0.560 | 0.422 | 0.481 |
| TIAB+bigram+Taxo | 0.418 | 0.679 | 0.518* | 0.554 | 0.412 | 0.472 |
| TIAB+bigram+mti | 0.451 | 0.701 | 0.549* | 0.604 | 0.475 | 0.532* |
| TIAB+bigram+mmi+prc | 0.448 | 0.699 | 0.546* | 0.607 | 0.466 | 0.528* |
| TIAB+bigram+all | 0.470 | 0.682 | 0.557* | 0.629 | 0.380 | 0.474 |

  1. Results are reported as precision (Prec), recall (Rec) and F-measure (F1). Unigrams and bigrams with feature source (either title or abstract; TIAB+bigram) are combined with concept identifiers (+CUI), meta-data (+Meta), noun phrases (+NP), hypernyms (+Taxo), MTI predictions (+mti), MTI components (+mmi+prc) and all the features (+all). For each column, results significantly better (p < 0.05) than those obtained with the unigram or TIAB+bigram baseline are marked with *.
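
As a reading aid, the relation between the three reported measures can be checked with a short sketch. It assumes that F1 is the balanced F-measure, that is, the harmonic mean of the reported precision and recall. The table note does not state how the values were averaged, so this is an assumption, and the function name `f1_score` is illustrative only. The rows are SVM-perf values copied from the table.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the SVM-perf columns of the table
rows = {
    "Unigram": (0.395, 0.654, 0.492),
    "Unigram+mti": (0.448, 0.679, 0.540),
    "TIAB+bigram+all": (0.470, 0.682, 0.557),
}

for name, (prec, rec, reported) in rows.items():
    computed = f1_score(prec, rec)
    print(f"{name}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

Under this assumption the recomputed values agree with the reported ones to within rounding of the three-decimal inputs.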