From: Predicting protein linkages in bacteria: Which method is best depends on task
Predictor method | Data Types | Applied species | Operon Data | Sensitivity (%) | Specificity (%) | Accuracy |
Probability Current study, from Bowers et al., 2004 | Intergenic distance | E. coli K12 | 2684 TU E. coli K12 Jun07 | 78 | _ | _ |
 |  | B. subtilis | (831 multiprotein operons) |  |  | (precision = PPV = 82% E. coli K12) |
 |  |  | 1823 pairs in same operon |  |  |  |
 |  |  | 770 TU E. coli K12 Jan06 | 75 | _ | _ |
 |  |  | (356 multiprotein operons) |  |  | (precision = 47%) |
 |  |  | 905 pairs in same operon | 94 | _ | _ |
 |  |  | 1115 TU B. subtilis |  |  | (precision = 45% B. subtilis) |
 |  |  | (419 multiprotein operons) |  |  |  |
 |  |  | 972 pairs in same operon |  |  |  |
HMM Yada et al., 1999 | Sequence information | E. coli K12 | 390 TU | 59 | _ | _ |
Naïve Bayes Craven et al., 2000 | Sequence information | E. coli K12 | 365 TU | 75 | 91 | 83%* |
Log likelihood Salgado et al., 2000 | Intergenic distance, functional classes | E. coli K12 | 361 TU (237 multi) | 88 | 88 | 82%* Distance |
 |  |  | 572 pairs in same operon |  |  | 88%* Both |
 |  |  | 346 pairs at TU border |  |  |  |
Probability Ermolaeva et al., 2001 | Conserved gene clusters across 34 genomes | E. coli K12 | 389 TU | 48 | 92 | 70%* |
 |  |  | 541 pairs in same operon |  |  |  |
 |  |  | 263 pairs at TU border |  |  |  |
 |  |  | (pair if ≤ 200 bp apart) |  |  |  |
HMM Tjaden et al., 2002 | Expression data | E. coli K12 | 463 pairs in same operon | 63 | 99 | 81%* |
Graph analysis Zheng et al., 2002 | Metabolic pathway information | E. coli K12 (also applied to 42 other genomes) | 128 TU metabolism related | 89 | 87 | 88%* |
Log likelihood Moreno-Hagelsieb et al., 2002 | Intergenic distance | B. subtilis (trained on E. coli K12, applied to 68 genomes) | 100 TU B. subtilis | 88 | 88 | 88%* B. subtilis |
 |  |  | 310 pairs in same operon |  |  | 82%* E. coli K12 |
 |  |  | 123 pairs at TU border |  |  |  |
Bayesian posterior probability Sabatti et al., 2002 | Intergenic distance, co-expression | E. coli K12 | 257 TU | 82 | 70 | 76%* Co-expr |
 |  |  | 604 pairs in same operon | 84 | 82 | 83%* Distance |
 |  |  | 151 pairs at TU border | 88 | 88 | 88%* Both |
Bayesian network Bockhorst et al., 2003 | Intergenic distance, sequence information, expression data | E. coli K12 | 365 TU | 78 | 90 | 84%* |
Machine learning Romero et al., 2004 | Intergenic distance, functional information | B. subtilis (trained on E. coli K12) | 100 TU B. subtilis; 446 TU E. coli K12 | 81 | 48 | 65%* B. subtilis |
 |  |  |  | 87 | 86 | 87%* E. coli K12 |
 |  |  |  | (91) | (87) | (89%* if all information on E. coli is used) |
Bayesian classifier De Hoon et al., 2004 | Intergenic distance, operon length, gene expression | B. subtilis | 635 TU | 82¶ | 89¶ | 83¶ distance |
 |  |  | 582 pairs in same operon | 80¶ | 79¶ | 80¶ expression |
 |  |  | 91 pairs at TU border | 88¶ | 88¶ | 89¶ all |
Machine learning without extensive training data Westover et al., 2005 | Intergenic distance, functional classes, conserved gene clusters | E. coli K12 (validated by known operons) | E. coli K12: | 88 | 80 | 84%* E. coli K12 |
 |  | B. theta (validated by co-expression) | 797 pairs in same operon |  |  |  |
 |  |  | 294 pairs at TU border |  |  |  |
 |  |  | B. theta: | 73 | 80 | 76.5%* B. theta |
 |  |  | 936 concordant pairs |  |  |  |
 |  |  | 106 discordant pairs |  |  |  |
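
The columns above can be related to counts of gene pairs with a short sketch. It assumes that "pairs in same operon" are the positive class and "pairs at TU border" are the negative class, which is the usual convention but is not stated in this table. The footnote defining the asterisk is not reproduced here; the starred accuracy values are numerically consistent with the mean of sensitivity and specificity (for example, 75 and 91 give 83; 73 and 80 give 76.5), so the sketch treats them that way as an assumption. The names `PairCounts` and `mean_sens_spec` and the example counts are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass
class PairCounts:
    """Confusion counts for adjacent gene pairs (hypothetical container)."""
    tp: int  # same-operon pairs predicted as same-operon
    fn: int  # same-operon pairs predicted as TU border
    tn: int  # TU-border pairs predicted as TU border
    fp: int  # TU-border pairs predicted as same-operon


def sensitivity(c: PairCounts) -> float:
    """Percentage of true same-operon pairs that were recovered."""
    return 100.0 * c.tp / (c.tp + c.fn)


def specificity(c: PairCounts) -> float:
    """Percentage of true TU-border pairs that were recovered."""
    return 100.0 * c.tn / (c.tn + c.fp)


def precision(c: PairCounts) -> float:
    """Positive predictive value (PPV): correct share of same-operon calls."""
    return 100.0 * c.tp / (c.tp + c.fp)


def mean_sens_spec(sens: float, spec: float) -> float:
    """Assumed definition of the starred accuracy: mean of the two rates."""
    return (sens + spec) / 2.0


if __name__ == "__main__":
    # Illustrative counts only; not data from any study in the table.
    counts = PairCounts(tp=80, fn=20, tn=45, fp=5)
    sens, spec = sensitivity(counts), specificity(counts)
    print(f"sensitivity = {sens:.1f}%")
    print(f"specificity = {spec:.1f}%")
    print(f"precision   = {precision(counts):.1f}%")
    print(f"mean of sensitivity and specificity = {mean_sens_spec(sens, spec):.1f}%")
```

Averaging the two rates weights the positive and negative classes equally regardless of how many pairs each contains, which matters when a benchmark has far more same-operon pairs than TU-border pairs.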