
Table 1. Automated gene prediction (a) in Leishmania major

From: Importing statistical measures into Artemis enhances gene identification in the Leishmania genome project

        Annotated CDS(b)  GLIMMER       GENESCAN      TESTCODE      CODON USAGE
                          FP(c)  FN(d)  FP     FN     FP     FN     FP     FN
Chr1    79                131    0      61     1      68     33     75     4
Chr3    94(1)             116    1      57     5      119    51     108    8
Chr4    123               328    1      97     6      130    56     139    9
Total   295               575    2      215    12     317    180    322    21
EDR(e)                    1.96          0.77          1.68          1.16
(a) All possible ORFs (i.e. starting with an ATG and ending with TAA, TAG or TGA) of >300 bp in the three chromosome sequences were scored by each of the programs. GLIMMER predictions (for ORFs > 100 amino acids, with default settings) were taken straight from the trained software. For GENESCAN and TESTCODE, ORFs were considered positive if the average score for the ORF exceeded a threshold of 4.0 and 9.7, respectively. For overlapping ORFs on the same strand, the one with the highest score was chosen. For CODON USAGE, ORFs were predicted as coding when the average in-frame score was higher than the two out-of-frame scores.
(b) The number of CDS of more than 300 bp in GenBank Accession numbers AE001274 (chr1), AC125735 (chr3), AL389894 and AL139794 (chr4). The number of annotated CDS of <300 bp is shown in parentheses.
(c) False positives.
(d) False negatives.
(e) Error Discovery Rate (EDR) = (FN+FP)/(CDS).
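
As a quick check on the EDR row, the short Python sketch below recomputes EDR = (FN+FP)/(CDS) from the values in the "Total" row of Table 1. The function and variable names are illustrative only and are not taken from the paper or from any of the listed programs.

```python
# Values copied from the "Total" row of Table 1.
ANNOTATED_CDS = 295

# program name -> (false positives, false negatives)
TOTALS = {
    "GLIMMER": (575, 2),
    "GENESCAN": (215, 12),
    "TESTCODE": (317, 180),
    "CODON USAGE": (322, 21),
}


def error_discovery_rate(fp, fn, n_cds):
    """EDR as defined in the table footnote: (FN + FP) / CDS."""
    return (fn + fp) / n_cds


# Hypothetical helper dictionary holding the recomputed EDR per program.
edr = {
    name: error_discovery_rate(fp, fn, ANNOTATED_CDS)
    for name, (fp, fn) in TOTALS.items()
}

for name, value in edr.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the recomputed values agree with the EDR row of the table (1.96, 0.77, 1.68 and 1.16).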