Table 2 Specificity, sensitivity and precision estimates for different gene finders in E. coli.

From: EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance

| Data set | EasyGene | Glim | rbs-Glim | Orpheus | Gm24 | GmS | Gmhmm | Frame |
|---|---|---|---|---|---|---|---|---|
| A'-% found | 98.4 | 98.9/98.9 | 98.9 | 98.0/95.3 | 91.5 | 97.2 | 98.1 | 97.0 |
| A'-% exact | 93.8 | 98.9/95.3 | 84.1 | 95.1/92.4 | 41.6 | 88.0 | 85.7 | 93.2 |
| B'-% found | 98.4 | 98.5/98.6 | 98.6 | 95.9/96.5 | 90.2 | 96.6 | 97.2 | 96.4 |
| T-% found | 98.1(98.0) | 98.3/98.4 | 98.4 | 96.5/95.6 | 89.8 | 96.3 | 97.1 | 96.1 |
| Genome | 4145 | 6827/5756 | 5756 | 9333/7543 | 3552 | 4064 | 4230 | 4064 |
| zero order | 7 | 169/211 | 211 | 6761/5430 | 6 | 153 | 1459 | 0 |
| first order | 7 | 545/723 | 723 | 6836/4804 | 13 | 241 | 830 | 0 |
| third order | 1 | 2423/2694 | 2694 | 6582/4817 | 43 | 659 | 866 | 1 |
| shadows | 0 | 19/21 | 21 | 22/9 | 1 | 0 | 2 | 0 |
  1. The upper part of the table shows, for different gene finders and sets of high-confidence genes in E. coli, the percentage of genes found exactly (both 5' and 3' ends) and partially (only the 3' end exact). For Glimmer and Orpheus, the numbers before the "/" are based exclusively on their ORF scores and recommended thresholds, whereas the numbers after the "/" are based on their post-processing procedures. The "Genome" row gives the number of genes predicted in the whole genome, which should be compared to the 4288 annotated genes in E. coli. The lower part of the table shows the number of false positives predicted in random sequences generated by Markov chains of order 0, 1 and 3, and the last row shows the number of false predictions in the shadows of the high-confidence genes in data set A. All values listed for EasyGene are based on an R-value threshold of R = 2.
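
The "% found" and "% exact" rows can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The `Gene` record and the `found_and_exact` function are hypothetical names introduced here, and the sketch assumes that a gene counts as "found" when a prediction on the same strand shares its 3' end (so exact matches are included), and as "exact" when the 5' end matches as well.

```python
from typing import Iterable, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical gene record; coordinates are illustrative only."""
    start: int   # position of the 5' end (start codon)
    stop: int    # position of the 3' end (stop codon)
    strand: str  # "+" or "-"


def found_and_exact(reference: Iterable[Gene],
                    predicted: Iterable[Gene]) -> Tuple[float, float]:
    """Return (% found, % exact) of the reference genes.

    Assumption: "found" means some prediction has the same 3' end and
    strand; "exact" means the 5' end agrees as well.
    """
    reference = list(reference)
    if not reference:
        raise ValueError("reference set is empty")
    predicted = list(predicted)

    # Index predictions by 3' end and by full coordinates for fast lookup.
    stops = {(g.stop, g.strand) for g in predicted}
    exact_keys = {(g.start, g.stop, g.strand) for g in predicted}

    n_found = sum((g.stop, g.strand) in stops for g in reference)
    n_exact = sum((g.start, g.stop, g.strand) in exact_keys for g in reference)

    total = len(reference)
    return 100.0 * n_found / total, 100.0 * n_exact / total


# Toy data, not taken from the table.
reference_genes = [
    Gene(100, 400, "+"),
    Gene(900, 500, "-"),
    Gene(1000, 1300, "+"),
    Gene(2000, 2600, "+"),
]
predictions = [
    Gene(100, 400, "+"),    # both ends match: found and exact
    Gene(850, 500, "-"),    # only the 3' end matches: found, not exact
    Gene(1500, 1800, "+"),  # matches no reference gene
]
pct_found, pct_exact = found_and_exact(reference_genes, predictions)
```

Under these assumptions "% exact" can never exceed "% found" for the same predictions.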