Table 1 Predicted clusters of readthrough proteins

From: Mining prokaryotic genomes for unknown amino acids: a stop-codon-based approach

| Cluster description | Codon | Size | Example organism (locus) |
| --- | --- | --- | --- |
| Selenocysteine | | | |
| Formate dehydrogenase α subunit | TGA | 45 | Escherichia coli (b1474) |
| Selenide water dikinase | TGA | 12 | Haemophilus influenzae (HI0200m) |
| Glycine reductase complex selenoprotein A | TGA | 6 | Treponema denticola (TDE0745) |
| Glycine reductase complex selenoprotein B | TGA | 6 | Treponema denticola (TDE0078) |
| Heterodisulfide reductase subunit A | TGA | 6 | Methanococcus jannaschii (MJ1190m) |
| Coenzyme F420-reducing hydrogenase δ subunit | TGA | 5 | Methanococcus jannaschii (MJ1190a) |
| Formylmethanofuran dehydrogenase subunit B | TGA | 4 | Methanococcus jannaschii (MJ1194m) |
| Glutaredoxin-like | TGA | 3 | Carboxydothermus hydrogenoformans (CHY_0740) |
| Thioredoxin | TGA | 3 | Geobacter sulfurreducens (GSU3446) |
| Coenzyme F420-reducing hydrogenase α subunit | TGA | 3 | Methanococcus jannaschii (MJ0029) |
| HesB family | TGA | 3 | Desulfovibrio vulgaris (DVU_1382) |
| HesB family | TGA | 2 | Methanococcus maripaludis (MMP0252 + upstream) |
| Fe-S oxidoreductase | TGA | 2 | Desulfotalea psychrophila (DP1009) |
| DsbA-like | TGA | 2 | Desulfovibrio desulfuricans (Dde_1263 + upstream) |
| Periplasmic [NiFeSe] hydrogenase large subunit | TGA | 2 | Desulfovibrio vulgaris (DVU_1918) |
| Pyrrolysine | | | |
| Monomethylamine methyltransferase | TAG | 7 | Methanosarcina acetivorans (MA0144) |
| Dimethylamine methyltransferase | TAG | 7 | Methanosarcina acetivorans (MA0532) |
| Trimethylamine methyltransferase | TAG | 6 | Methanosarcina acetivorans (MA0528) |
| Transcriptional regulator, TetR family | TAG | 2 | Methanosarcina acetivorans (MA2902) |
| Unknown | | | |
| Cytochrome c family protein | TGA | 2 | Geobacter sulfurreducens (GSU2937 + GSU2936) |
| Hypothetical protein | TAG | 2 | Geobacter sulfurreducens (GSU2293 + downstream) |

  1. A plus sign in a locus indicates that the genomic coordinates of the iORF are described by the concatenation of two genes or regions. For example, "GSU2293 + downstream" means that the iORF consists of the gene GSU2293 and its downstream sequence. The HesB family members were not grouped into a single cluster because their sequences were too short and too divergent.
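The plus-sign notation in the footnote can be read mechanically. The following is a minimal sketch, assuming locus strings are written exactly as in the table; the function name `parse_locus` is hypothetical and is not part of the original work.

```python
from typing import List


def parse_locus(locus: str) -> List[str]:
    """Split a locus string from the table into its concatenated parts.

    Assumption: parts are separated by a plus sign, as described in the
    table footnote. A locus without a plus sign yields a single part.
    """
    return [part.strip() for part in locus.split("+")]


# A locus spanning a gene and its downstream sequence has two parts.
parts = parse_locus("GSU2293 + downstream")
print(parts)
```

A part such as "upstream" or "downstream" names a flanking region rather than a gene, so any downstream use would need to resolve it against genome coordinates, which this sketch does not attempt.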