Table 1 CGP performance on peptidoglycan-related genes (Streptococcus agalactiae 2603 V/R, 2124 genes).

From: In silico prioritisation of candidate genes for prokaryotic gene function discovery: an application of phylogenetic profiles

| Methods | Validation set C (9 genes): AUC (/η max) | Validation set B (18 genes): AUC (/η max) | Validation set M (25 genes): AUC (/η max) |
|---|---|---|---|
| Statistical CGP (scoring functions) | | | |
| sens | 0.858 (2.0/5.4) | 0.853 (1.9/4.3) | 0.830 (1.8/3.8) |
| spec | 0.396 (0.5/1.5) | 0.427 (0.7/2.6) | 0.506 (1.1/5.2) |
| ppv | 0.420 (0.6/1.57) | 0.504 (1.3/29.5) | 0.590 (2.1/85.0) |
| npv | 0.966 (3.6/30.1) | 0.964 (3.5/21.7) | 0.978 (3.2/17.3) |
| amss | 0.985 (4.6/88.5) | 0.980 (4.4/59) | 0.970 (4.4/85.0) |
| hmss | 0.986 (4.8/88.5) | 0.980 (4.5/64) | 0.969 (4.5/85.0) |
| OR | 0.415 (0.5/1.57) | 0.509 (1.3/29.5) | 0.592 (2.1/85.0) |
| chisq | 0.978 (4.2/59.0) | 0.975 (3.9/34.7) | 0.959 (3.7/28.3) |
| bchisq | 0.978 (4.2/59.0) | 0.975 (3.9/34.7) | 0.960 (3.7/28.3) |
| F | 0.932 (3.3/32.9) | 0.915 (3.1/23.1) | 0.881 (2.8/18.5) |
| Inductive CGP (machine learning algorithms) | | | |
| NB | 0.901 | 0.879 | 0.843 |
| LR | 0.980 | 0.905 | 0.887 |
| ADTree | 0.996 | 0.944 | 0.975 |
| IBk | 0.948 | 0.950 | 0.974 |
| J48 | 0.885 | 0.832 | 0.752 |
| SMO/Poly | 0.999 | 0.948 | 0.879 |
| SMO/RBF | 0.998 | 0.991 | 0.909 |
This table lists the performance of statistical and inductive CGPs in prioritising peptidoglycan-related genes in Streptococcus agalactiae 2603 V/R. Abbreviations: sens: sensitivity; spec: specificity; ppv: positive predictive value; npv: negative predictive value; amss: arithmetic mean of sensitivity and specificity; hmss: harmonic mean of sensitivity and specificity; OR: odds ratio; chisq: chi-square; bchisq: signed chi-square; F: F-measure; NB: naïve Bayes classifier; LR: logistic regression; ADTree: alternating decision tree; IBk: k-nearest neighbour classifier; J48: J48 decision tree; SMO: support vector machine trained by sequential minimal optimisation algorithm; Poly: polynomial kernel; RBF: radial basis function kernel.
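
For readers unfamiliar with the abbreviations, the following is a minimal sketch of how several of the scoring functions and the AUC could be computed. It uses the standard textbook definitions on a 2×2 contingency table and a rank-based AUC. The exact formulas used in the paper, and how its contingency tables are built, are not given in this table, so the names `Confusion`, `scores` and `auc` and the example counts are hypothetical. The signed chi-square (bchisq) is omitted because its definition is not stated here.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Hypothetical 2x2 contingency table (counts are illustrative only)."""
    tp: int  # true positives
    fp: int  # false positives
    fn: int  # false negatives
    tn: int  # true negatives


def scores(c: Confusion) -> dict:
    """Textbook versions of the scoring functions abbreviated in the table.

    Assumes no marginal total or denominator is zero; a real
    implementation would need to handle empty cells.
    """
    sens = c.tp / (c.tp + c.fn)
    spec = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    n = c.tp + c.fp + c.fn + c.tn
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chisq = n * (c.tp * c.tn - c.fp * c.fn) ** 2 / (
        (c.tp + c.fp) * (c.fn + c.tn) * (c.tp + c.fn) * (c.fp + c.tn)
    )
    return {
        "sens": sens,
        "spec": spec,
        "ppv": ppv,
        "npv": npv,
        "amss": (sens + spec) / 2,                # arithmetic mean
        "hmss": 2 * sens * spec / (sens + spec),  # harmonic mean
        "OR": (c.tp * c.tn) / (c.fp * c.fn),      # odds ratio
        "chisq": chisq,
        "F": 2 * ppv * sens / (ppv + sens),       # F-measure (F1)
    }


def auc(pos_scores, neg_scores) -> float:
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive scores higher, with ties counted as half a win."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos_scores
        for q in neg_scores
    )
    return wins / (len(pos_scores) * len(neg_scores))
```

With this reading, an AUC near 1.0 means that genes in the validation set are ranked almost entirely above the remaining genes, while a value near 0.5 corresponds to a ranking no better than chance.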