Table 3. ORFs in majority-annotated mixed COGs that do not appear to represent missed genes

From: Similarity-based gene detection: using COGs to find evolutionarily-conserved ORFs

| ORF COG id (a) | Organism | Genomic coordinates (b) | Annotated gene(s) present in COG (c) | ORF COG id (a) | Organism | Genomic coordinates (b) | Annotated gene(s) present in COG (c) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Existing annotation of pseudogene | | | | Frameshift 3' fragment (d) | | | |
| 876 | EcoK12 | 1488620-1487985 | gap | 1036 | Bbur | 21098-20445 | queA |
| 2340 | Sent | 4738725-4740071 | dhaS/aldA | 1750 | Bhal | 984866-983856 | celB |
| 2433 | Sent | 4745051-4743573 | hsdB | 1188 | Bhal | 1359362-1360555 | recD |
| 1895 | Sent | 3243737-3244861 | fadH | 2257 | Bhal | 3182850-3181696 | ilvI/poxB/alsS |
| 1399 | Sent | 461578-461874 | | 88 | Bsub | 3671944-3672555 | gtaB |
| 653 | Sent | 2505700-2506824 | ptsP | 641 | Hinf | 1525427-1524561 | thiI |
| 1058 | Sent | 3413535-3416306 | acrD/mdtC/mdtB | 2031 | Hinf | 1719924-1718821 | tldD |
| 3088 | Sent | 4084807-4083605 | | 2473 | Mgal | 431452-431778 | fldA |
| 815 | Sent | 1360931-1362226 | rhlE | 2309 | Mgen | 416785-416336 | acpD |
| 3104 | Sent | 4009730-4009993 | | 975 | Mmyc | 57011-56760 | recR |
| 569 | Sent | 1969437-1970648 | penA | 686 | Mmyc | 690895-690356 | rpsB |
| | | | | 1319 | Nmen | 107757-109406 | msbA |
| Annotated in GenomeReviews but with different stop | | | | 842 | Nmen | 1995043-1994876 | |
| 928 | Bsub | 2500726-2499347 | bfmBC | 556 | VchoI | 553588-552383 | dnaG |
| 157 | Cper | 2751593-2751051 | rplD | 745 | VchoI | 555313-556182 | gcp |
| 999 | EcoO157 | 3613249-3610595 | alaS | 106 | VchoI | 1087924-1089819 | uvrB |
| 107 | Hinf | 655042-654365 | metI | 2435 | VchoI | 2612949-2611972 | |
| 589 | Mpne | 329463-331229 | lepA | 2807 | VchoII | 1060889-1060107 | qseC |
| 210 | Mpul | 150772-151668 | grpE | | | | |
| 743 | Sent | 2492196-2490763 | gltX | Frameshift 5' fragment (d) | | | |
| 534 | Tpal | 478406-478777 | | 697 | Bhal | 3580443-3579682 | csd |
| 166 | Upar | 279005-279949 | rplV | 2029 | Bsub | 2304436-2305248 | metA |
| | | | | 2049 | Bsub | 3032201-3032512 | |
| Fragments around stop codons (nonsense) (d) | | | | | | | |
| 2 | Cpne | 383405-384037 | recF | 928 | Bsub | 2500726-2499347 | bfmBC |
| 462 | Cpne | 1088259-1088711 | ispE | 157 | Cper | 2751593-2751051 | rplD |
| 2769 | EcoK12 | 3814680-3813886 | rph | 999 | EcoO157 | 3613249-3610595 | alaS |
| 2257 | EcoK12 | 3948538-3949566 | ilvI/poxB/alsS | 107 | Hinf | 655042-654365 | metI |
| 2433 | Hinf | 232074-232991 | | 589 | Mpne | 329463-331229 | lepA |
| 3066 | Hinf | 1377365-1378063 | dgt | 210 | Mpul | 150772-151668 | grpE |
| 1075 | Hinf | 1477189-1476557 | pstB | 743 | Sent | 2492196-2490763 | |
| 641 | Hinf | 1526028-1525285 | thiI | | | | |
| 2571 | Nmen | 292645-294051 | | | | | |
| 220 | Tpal | 220772-221749 | dnaJ | | | | |
| 556 | VchoI | 554244-553561 | dnaG | | | | |
| 1826 | VchoI | 637551-638246 | amt | | | | |
| 42 | VchoI | 851189-849954 | oadA | | | | |
| 1082 | VchoII | 690599-690273 | glpF | | | | |

  1. (a) The identifiers for COGs are local to this study and do not correspond to numbers in the NCBI COG database.
  2. (b) Coordinates in which the first number is greater than the second indicate that the ORF is on the minus strand.
  3. (c) A named annotated putative ortholog in another organism, or a paralog within the same organism, of the ORF listed.
  4. (d) These categories represent probable pseudogenes or sequencing errors.
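
The strand convention in footnote b can be applied mechanically when reading the coordinate column. The following is a minimal sketch, not part of the original study: the names `OrfSpan`, `parse_orf_coords`, and `span_length` are hypothetical, and the length calculation assumes 1-based inclusive coordinates, which the table does not state explicitly.

```python
from typing import NamedTuple


class OrfSpan(NamedTuple):
    """Hypothetical container for one entry of the coordinate column."""
    start: int   # lower genomic coordinate
    end: int     # higher genomic coordinate
    strand: str  # "+" or "-"


def parse_orf_coords(text: str) -> OrfSpan:
    """Parse a 'first-second' coordinate string as written in the table."""
    first_s, second_s = text.strip().split("-")
    first, second = int(first_s), int(second_s)
    # Footnote b: first number greater than second means minus strand.
    strand = "-" if first > second else "+"
    return OrfSpan(min(first, second), max(first, second), strand)


def span_length(span: OrfSpan) -> int:
    """Length in nucleotides, ASSUMING 1-based inclusive coordinates."""
    return span.end - span.start + 1


if __name__ == "__main__":
    for coords in ("1488620-1487985", "4738725-4740071"):
        span = parse_orf_coords(coords)
        print(coords, span.strand, span_length(span))
```

Normalizing to `start <= end` with an explicit strand field keeps later comparisons (for example, checking whether two listed ORFs overlap) independent of the order in which the coordinates were printed.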