Table 1. Automated gene prediction (a) in Leishmania major

From: Importing statistical measures into Artemis enhances gene identification in the Leishmania genome project

 

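|  | Annotated CDS (b) | GLIMMER FP (c) | GLIMMER FN (d) | GENESCAN FP | GENESCAN FN | TESTCODE FP | TESTCODE FN | CODON USAGE FP | CODON USAGE FN |
|---|---|---|---|---|---|---|---|---|---|
| Chr1 | 79 | 131 | 0 | 61 | 1 | 68 | 33 | 75 | 4 |
| Chr3 | 94 (1) | 116 | 1 | 57 | 5 | 119 | 51 | 108 | 8 |
| Chr4 | 123 | 328 | 1 | 97 | 6 | 130 | 56 | 139 | 9 |
| Total | 295 | 575 | 2 | 215 | 12 | 317 | 180 | 322 | 21 |

EDR (e): GLIMMER 1.96; GENESCAN 0.77; TESTCODE 1.68; CODON USAGE 1.16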

a. All possible ORFs (i.e. starting with an ATG and ending with TAA, TAG or TGA) of >300 bp in the three chromosome sequences were scored by each of the programs. GLIMMER predictions (for ORFs > 100 amino acids, with default settings) were taken directly from the trained software. For GENESCAN and TESTCODE, ORFs were considered positive if the average score for the ORF exceeded a threshold of 4.0 and 9.7, respectively. For overlapping ORFs on the same strand, the one with the highest score was chosen. For CODON USAGE, ORFs were predicted as coding when the average in-frame score was higher than both out-of-frame scores.

b. The number of CDS of more than 300 bp in GenBank accession numbers AE001274 (chr1), AC125735 (chr3), AL389894 and AL139794 (chr4). The number of annotated CDS of <300 bp is shown in parentheses.

c. False positives.

d. False negatives.

e. Error Discovery Rate (EDR) = (FN + FP)/(CDS).
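
The EDR definition in footnote e can be checked against the Total row with a few lines of Python. This is only an illustrative sketch and not part of the original analysis: the names `TOTALS`, `ANNOTATED_CDS` and `error_discovery_rate` are hypothetical, and the numbers are copied from the Total row as printed rather than re-summed from the per-chromosome rows.

```python
# Totals row of Table 1, as printed: (false positives, false negatives) per program.
# Dictionary and function names are illustrative, not from the original paper.
TOTALS = {
    "GLIMMER": (575, 2),
    "GENESCAN": (215, 12),
    "TESTCODE": (317, 180),
    "CODON USAGE": (322, 21),
}
ANNOTATED_CDS = 295  # total annotated CDS of more than 300 bp (Total row)


def error_discovery_rate(fp: int, fn: int, cds: int) -> float:
    """Return EDR = (FN + FP) / CDS, following footnote e."""
    if cds <= 0:
        raise ValueError("cds must be a positive count")
    return (fn + fp) / cds


# Compute the EDR for each program from the printed totals.
edr = {
    name: error_discovery_rate(fp, fn, ANNOTATED_CDS)
    for name, (fp, fn) in TOTALS.items()
}

for name, value in edr.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the computed values agree with the EDR row of the table (1.96, 0.77, 1.68 and 1.16). Under this definition a lower EDR means fewer combined errors per annotated CDS.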