Table 4 Table of previous operon prediction methods.

From: Predicting protein linkages in bacteria: Which method is best depends on task

Predictor method

Data Types

Applied species

Operon Data

Sen (%)

Spe (%)

Accuracy

Probability

Current study, from Bowers et al., 2004

Intergenic distance

E. coli K12

2684 TU E. coli K12 Jun07

78

_

_

  

B. subtilis

(831 multiprotein operons)

  

(precision = PPV = 82% E. coli K12)

   

1823 pairs in same operon

   
   

770 TU E. coli K12 Jan06

75

_

_

   

(356 multiprotein operons)

  

(precision = 47%)

   

905 pairs in same operon

94

_

_

   

1115 TU B. subtilis

  

(precision = 45% B. subtilis)

   

(419 multiprotein operons)

   
   

972 pairs in same operon

   

HMM

Yada et al., 1999

Sequence information

E. coli K12

390 TU

59

_

_

Naïve Bayes

Craven et al., 2000

Sequence information

E. coli K12

365 TU

75

91

83%*

Log likelihood

Salgado et al., 2000

Intergenic distance, functional classes

E. coli K12

361 TU (237 multi)

88

88

82%* Distance

   

572 pairs in same operon

  

88%* Both

   

346 pairs at TU border

   

Probability

Ermolaeva et al., 2001

Conserved gene clusters across 34 genomes

E. coli K12

389 TU

48

92

70%*

   

541 pairs in same operon

   
   

263 pairs at TU border

   
   

(pair if ≤ 200 bp apart)

   

HMM

Tjaden et al., 2002

Expression data

E. coli K12

463 pairs in same operon

63

99

81%*

Graph analysis

Zheng et al., 2002

Metabolic pathway information

E. coli K12 (also applied to 42 other genomes)

128 TU metabolism related

89

87

88%*

Log likelihood

Moreno-Hagelsieb et al., 2002

Intergenic distance

B. subtilis (trained on E. coli K12, applied to 68 genomes)

100 TU B. subtilis

88

88

88%* B. subtilis

   

310 pairs in same operon

  

82%* E. coli K12

   

123 pairs at TU border

   

Bayesian posterior probability

Sabatti et al., 2002

Intergenic distance, co-expression

E. coli K12

257 TU

82

70

76%* Co-expr

   

604 pairs in same operon

84

82

83%* Distance

   

151 pairs at TU border

88

88

88%* Both

Bayesian network

Bockhorst et al., 2003

Intergenic distance, sequence information, expression data

E. coli K12

365 TU

78

90

84%*

Machine learning

Romero et al., 2004

Intergenic distance, functional information

B. subtilis

(trained on E. coli K12)

100 TU B. subtilis

446 TU E. coli K12

81

48

65%* B. subtilis

    

87

86

87%* E. coli K12

    

(91)

(87)

(89%* if use all info on E. coli)

Bayesian classifier

De Hoon et al., 2004

Intergenic distance, operon length, gene expression

B. subtilis

635 TU

82¶

89¶

83¶ distance

   

582 pairs in same operon

80¶

79¶

80¶ expression

   

91 pairs at TU border

88¶

88¶

89¶ all

Machine learning without extensive training data

Westover et al., 2005

Intergenic distance, functional classes, conserved gene clusters

E. coli K12

(validated by known operons)

E. coli K12:

88

80

84%* E. coli K12

  

B. theta

(validated by co-expression)

797 pairs in same operon

   
   

294 pairs at TU border

   
   

B. theta:

73

80

76.5%* B. theta

   

936 concordant pairs

   
   

106 discordant pairs

   
  1. Annotations: * value estimated as the average of sensitivity and specificity; ¶ value based on leave-one-out analysis as reported by the authors; _ value not reported.
  2. Abbreviations: transcriptional unit (TU), base pair (bp), sensitivity (Sen), specificity (Spe), positive predictive value (PPV).
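
The starred accuracy values can be reproduced from the sensitivity and specificity columns. The following is a minimal sketch of that averaging, not code from the paper; the function name `estimated_accuracy` is a hypothetical helper introduced here, and the input pairs are copied from the corresponding table rows.

```python
def estimated_accuracy(sensitivity_pct, specificity_pct):
    """Mean of sensitivity and specificity, both given in percent.

    Hypothetical helper: mirrors the '*' annotation of the table,
    where accuracy is estimated as the average of the two values.
    """
    return (sensitivity_pct + specificity_pct) / 2


# (sensitivity, specificity) pairs copied from rows of the table.
rows = {
    "Craven et al., 2000": (75, 91),
    "Ermolaeva et al., 2001": (48, 92),
    "Tjaden et al., 2002": (63, 99),
    "Westover et al., 2005 (B. theta)": (73, 80),
}

for study, (sen, spe) in rows.items():
    print(f"{study}: {estimated_accuracy(sen, spe):.1f}%")
```

For these four rows the averages are 83.0, 70.0, 81.0 and 76.5, which agree with the starred entries in the Accuracy column. Some rows in the table appear to round a half value to the nearest whole percent, so a direct average may differ from the printed entry by 0.5.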