Table 2 CGP performance on peptidoglycan-related genes (Escherichia coli K-12, 4131 genes).

From: In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles

                        Validation sets
Methods     C (8 genes)         B (28 genes)        M (51 genes)
            AUC (/η max)        AUC (/η max)        AUC (/η max)
Statistical CGP (scoring functions)
  sens      0.913 (2.5/10.6)    0.891 (2.3/6.0)     0.818 (1.9/4.2)
  spec      0.321 (0.4/1.4)     0.310 (0.4/1.2)     0.418 (0.8/2.0)
  ppv       0.405 (0.8/5.2)     0.423 (1.2/18.4)    0.553 (1.7/28.6)
  npv       0.974 (3.9/42.0)    0.956 (3.5/20.9)    0.891 (2.8/13.2)
  amss      0.989 (4.8/110.)    0.966 (4.1/53.7)    0.911 (3.5/44.7)
  hmss      0.989 (4.9/113.)    0.969 (4.2/55.3)    0.909 (3.5/45.6)
  OR        0.403 (0.8/5.2)     0.424 (1.2/18.4)    0.552 (1.7/28.6)
  chisq     0.984 (4.7/73.8)    0.963 (3.9/35.9)    0.902 (3.2/27.0)
  bchisq    0.984 (4.7/73.8)    0.963 (3.9/35.9)    0.903 (3.2/27.0)
  F         0.965 (4.0/45.8)    0.921 (3.2/22.5)    0.838 (2.5/15.1)
Inductive CGP (machine learning algorithms)
  NB        0.930               0.889               0.820
  LR        0.882               0.935               0.828
  ADTree    0.976               0.981               0.925
  IBk       0.998               0.929               0.946
  J48       0.935               0.828               0.752
  SMO/Poly  0.997               0.876               0.933
  SMO/RBF   0.963               0.932               0.964
This table lists the performance of statistical and inductive CGP in prioritising peptidoglycan-related genes in Escherichia coli K-12. Abbreviations: sens: sensitivity; spec: specificity; ppv: positive predictive value; npv: negative predictive value; amss: arithmetic mean of sensitivity and specificity; hmss: harmonic mean of sensitivity and specificity; OR: odds ratio; chisq: chi-square; bchisq: signed chi-square; F: F-measure; NB: naïve Bayes classifier; LR: logistic regression; ADTree: alternating decision tree; IBk: k-nearest neighbour classifier; J48: J48 decision tree; SMO: support vector machine trained by sequential minimal optimisation algorithm; Poly: polynomial kernel; RBF: radial basis function kernel.
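
As an illustration of how several of the abbreviated scoring functions relate to one another, the following minimal Python sketch computes sens, spec, ppv, npv, amss and hmss from a 2×2 contingency table using their standard textbook definitions. The function name `scoring_functions`, the helper `_ratio`, and the example counts are hypothetical and are not taken from the article; how the article derives its contingency counts, and how it handles empty denominators, is not stated here and is an assumption of this sketch.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero.

    Returning 0.0 for an empty denominator is an assumption of this sketch.
    """
    return numerator / denominator if denominator else 0.0


def scoring_functions(tp: int, fp: int, fn: int, tn: int) -> dict[str, float]:
    """Compute six scores from 2x2 contingency counts (standard definitions).

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    sens = _ratio(tp, tp + fn)  # sensitivity
    spec = _ratio(tn, tn + fp)  # specificity
    ppv = _ratio(tp, tp + fp)   # positive predictive value
    npv = _ratio(tn, tn + fn)   # negative predictive value
    amss = (sens + spec) / 2    # arithmetic mean of sens and spec
    hmss = _ratio(2 * sens * spec, sens + spec)  # harmonic mean of sens and spec
    return {
        "sens": sens,
        "spec": spec,
        "ppv": ppv,
        "npv": npv,
        "amss": amss,
        "hmss": hmss,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    for name, value in scoring_functions(tp=9, fp=3, fn=1, tn=7).items():
        print(f"{name}: {value:.4f}")
```

The harmonic mean is never larger than the arithmetic mean and is pulled toward the smaller of the two inputs, so hmss penalises an imbalance between sensitivity and specificity more strongly than amss does.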