Table 2 Best-performing Boolean features, ordered by information gain

From: Machine learning methods for metabolic pathway prediction

| Feature | ACC | SN | SP | FM | PR | RC | IG |
| --- | --- | --- | --- | --- | --- | --- | --- |
| has-enzymes | 0.821 | 0.914 | 0.796 | 0.681 | 0.543 | 0.914 | 0.188 |
| has-reactions-present | 0.797 | 0.919 | 0.765 | 0.655 | 0.509 | 0.919 | 0.173 |
| majority-of-reactions-present | 0.872 | 0.707 | 0.916 | 0.699 | 0.69 | 0.707 | 0.165 |
| some-initial-reactions-present | 0.84 | 0.724 | 0.87 | 0.654 | 0.597 | 0.724 | 0.138 |
| some-initial-and-final-reactions-present | 0.864 | 0.605 | 0.933 | 0.651 | 0.706 | 0.605 | 0.136 |
| mostly-absent-not-unique | 0.215 | 0.163 | 0.229 | 0.08 | 0.053 | 0.163 | 0.133 |
| all-initial-reactions-present | 0.825 | 0.747 | 0.845 | 0.641 | 0.561 | 0.747 | 0.133 |
| every-reaction-present | 0.871 | 0.508 | 0.968 | 0.623 | 0.807 | 0.508 | 0.132 |
| every-reaction-present-or-orphaned | 0.871 | 0.508 | 0.968 | 0.623 | 0.807 | 0.508 | 0.132 |
| taxonomic-range-includes-target | 0.795 | 0.813 | 0.79 | 0.624 | 0.506 | 0.813 | 0.131 |

  1. See the section "Feature Extraction and Processing" and Section 1 of Additional file 2 for descriptions of the features.
  2. Columns 2 through 8 correspond to various performance measures: ACC = accuracy; SN = sensitivity; SP = specificity; FM = F-measure; PR = precision; RC = recall; IG = information gain.
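
The performance measures listed in the notes above can be sketched in code for a single Boolean feature used directly as a predictor. This is a minimal illustration that assumes the standard confusion-matrix definitions, with the F-measure taken as the harmonic mean of precision and recall; the source does not state its exact formulas here. The function names, the confusion counts in the usage note, the zero-denominator convention, and the use of the natural logarithm for information gain are all assumptions made for illustration, not details taken from the paper.

```python
import math


def _safe_div(num, den):
    # Assumed convention: report 0.0 when a denominator is zero.
    return num / den if den else 0.0


def f_measure(precision, recall):
    """F-measure as the harmonic mean of precision and recall (assumed definition)."""
    return _safe_div(2 * precision * recall, precision + recall)


def binary_metrics(tp, fp, tn, fn):
    """Confusion-matrix measures for one Boolean feature treated as a predictor."""
    sn = _safe_div(tp, tp + fn)   # sensitivity, which is the same quantity as recall
    sp = _safe_div(tn, tn + fp)   # specificity
    pr = _safe_div(tp, tp + fp)   # precision
    acc = _safe_div(tp + tn, tp + fp + tn + fn)
    return {"ACC": acc, "SN": sn, "SP": sp,
            "FM": f_measure(pr, sn), "PR": pr, "RC": sn}


def entropy(probs):
    """Shannon entropy using the natural logarithm (log base is an assumption)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def information_gain(tp, fp, tn, fn):
    """Class entropy minus class entropy conditioned on the Boolean feature."""
    total = tp + fp + tn + fn
    h_class = entropy([(tp + fn) / total, (fp + tn) / total])
    h_cond = 0.0
    # First pair: instances where the feature is true; second: feature is false.
    for pos, neg in ((tp, fp), (fn, tn)):
        n = pos + neg
        if n:
            h_cond += (n / total) * entropy([pos / n, neg / n])
    return h_class - h_cond
```

For example, with hypothetical counts `tp=40, fp=10, tn=40, fn=10`, `binary_metrics` gives 0.8 for every measure, and `round(f_measure(0.543, 0.914), 3)` returns `0.681`, which agrees with the FM value in the has-enzymes row. Because SN and RC are the same quantity under these definitions, the two columns carry identical values in every row.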