
Table 1 Genetic repertoire of loci characterized by atypical tetranucleotide usage patterns and extreme OUV (section III in Fig. 4) identified in bacterial chromosomes

From: Differentiation of regions with atypical oligonucleotide composition in bacterial genomes

| Genome | Genes and the encoded protein | Start* | Length (bp) | ΔD | ΔOUV |
| --- | --- | --- | --- | --- | --- |
| Acinetobacter sp. | putative hemagglutinin/hemolysin-related protein | 923,008 | 11,136 | 3.11 | 4.13 |
| | non-coding multiple repeats TTTAGAAA | 2,448,000 | 5,600 | 2.24 | 17.33 |
| Bordetella bronchiseptica RB50 | BB1186: putative hemolysin | 1,268,967 | 10,041 | 5.13 | 4.12 |
| Bradyrhizobium japonicum USDA110 | blr325: unknown | 3,592,327 | 17,058 | 3.17 | 4.65 |
| | bll356: unknown | 3,930,196 | 10,326 | 6.23 | 5.02 |
| | bll371: unknown | 4,106,955 | 12,387 | 4.39 | 4.95 |
| | bll547: unknown | 6,017,600 | 12,633 | 5.04 | 6.16 |
| Corynebacterium efficiens YS-314 | fasA: fatty-acid synthase I | 962,711 | 8,919 | 2.85 | 3.85 |
| | fasB: fatty-acid synthase II | 2,541,750 | 9,069 | 2.88 | 5.42 |
| Deinococcus radiodurans R1 chromosome 1 | DR1461-1462: hypothetical proteins | 1,465,188 | 10,000 | 2.19 | 8.27 |
| | non-coding tandem repeats CCCGCCC | 519,833 | 8,415 | 7.06 | 8.42 |
| E. coli O157:H7 | Z0609, Z0615: RTX family exoproteins | 581,356 | 20,160 | 1.82 | 9.43 |
| Mycobacterium tuberculosis H37Rv | Rv0272c-Rv0279c: hypothetical Gly-, Ala-rich proteins | 328,573 | 10,499 | 1.52 | 9.15 |
| | Rv0297-Rv0304c: hypothetical Gly-, Ala-, Asn-rich proteins | 361,332 | 11,431 | 8.79 | 7.91 |
| | Rv0355c: Asn-rich protein | 424,775 | 9,903 | 8.31 | 10.91 |
| | Rv0573c-Rv0578c: hypothetical Gly-rich proteins | 665,849 | 10,066 | 0.60 | 4.72 |
| | Rv0742-Rv0747: hypothetical Gly-rich proteins | 832,979 | 7,876 | 1.24 | 3.97 |
| | Rv1060-Rv1068c: hypothetical Gly-, Ala-rich proteins | 1,183,506 | 8,641 | 1.04 | 5.54 |
| | Rv1084-Rv1092c: hypothetical proteins | 1,207,634 | 11,395 | 2.19 | 6.44 |
| | multiple repeats CCGCCGCCA | 1,630,636 | 7,592 | 2.33 | 8.84 |
| | Rv2490c-Rv2494: hypothetical Gly-rich proteins | 2,801,252 | 7,482 | 2.60 | 5.50 |
| Pseudomonas aeruginosa PAO1 | PA1874: hypothetical protein | 2,036,441 | 7,407 | 2.61 | 5.61 |
| P. putida KT2440 | PP0168: Thr-rich surface adhesion protein | 194,494 | 26,046 | 2.58 | 6.97 |
| | PP0806: surface adhesion protein | 926,690 | 18,930 | 1.17 | 4.39 |
| P. syringae DC3000 | PSPTO3229: filamentous hemagglutinin | 3,629,677 | 18,825 | 2.34 | 7.87 |
| Rhodopirellula baltica 1 | RB3077: putative cyclic nucleotide binding protein | 1,588,083 | 18,024 | 1.62 | 6.19 |
| | RB4375: large polymorphic membrane protein, probable extracellular nuclease | 2,242,933 | 9,171 | 3.23 | 7.09 |
| | RB11769: probable aggregation factor core protein MAFp3 | 6,335,006 | 24,522 | 5.25 | 6.31 |
| Rhodopseudomonas palustris CGA009 | conserved hypothetical protein | 1,459,664 | 9,891 | 2.61 | 3.38 |
| | conserved hypothetical protein | 1,475,303 | 13,008 | 2.89 | 4.18 |
| Sulfolobus solfataricus P2 | non-coding tandem repeats GAATTGAAAG | 1,228,221 | 12,238 | 1.94 | 15.25 |
| | | 1,253,000 | 5,000 | 1.50 | 8.67 |
| | | 1,305,242 | 5,000 | 1.89 | 12.39 |
| Staphylococcus aureus N315 | ebhA – ebhB: large surface anchored proteins | 1,437,928 | 20,142 | 4.04 | 10.07 |
| | SA2447: similar to streptococcal hemagglutinin | 2,755,253 | 6,816 | 3.03 | 9.29 |
| Streptomyces coelicolor A3(2) | SC8F4.01c: Ala/Glu-rich protein | 586,509 | 3,981 | 2.16 | 5.40 |
| | SC2H4.02: hypothetical protein | 6,836,057 | 6,552 | 2.86 | 4.80 |
| Xanthomonas campestris ATCC33913 | yapH: putative autotransporter adhesin | 2,374,740 | 11,886 | 3.22 | 6.61 |
| Xylella fastidiosa Temecula 1 | non-coding sequence, multiple repeats (GGT)n | 1,183,606 | 11,095 | 1.31 | 9.81 |
| | | 1,447,312 | 11,139 | 1.37 | 10.91 |
| | pspA1: hemagglutinin | 2,082,143 | 10,134 | 1.06 | 9.78 |
| | pspA2: hemagglutinin | 2,501,956 | 10,374 | 1.41 | 11.79 |
| Yersinia pestis KIM | irp1-2: yersiniabactin peptide/polyketide synthetase | 2,654,642 | 15,867 | 4.27 | 6.05 |
| | yapH: putative autotransporter adhesin | 3,747,888 | 11,133 | 2.66 | 8.60 |
| | y3579: putative filamentous hemagglutinin | 3,961,333 | 9,888 | 3.31 | 4.32 |

  1. * Start: left coordinate of the locus in the chromosomal sequence.
  2. ΔD: deviation of the D:n0_4 mer value calculated for the locus from the mean genomic D:n0_4 mer, in standard deviations.
  3. ΔOUV: deviation of the OUV:n1_4 mer value calculated for the locus from the mean genomic OUV:n1_4 mer, in standard deviations.
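
The ΔD and ΔOUV columns express how far a locus lies from the genomic mean, measured in standard deviations. A minimal sketch of that arithmetic follows. It assumes the genomic mean and standard deviation are estimated from a list of per-window values and that the population standard deviation is used; the table does not state either detail. The function name `deviation_in_sd` and the sample numbers are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def deviation_in_sd(locus_value, genome_values):
    """Return (locus_value - genomic mean) / genomic standard deviation.

    genome_values: parameter values (e.g. D or OUV) for windows across the
    chromosome. Using the population SD is an assumption of this sketch.
    """
    mu = mean(genome_values)
    sigma = pstdev(genome_values)
    if sigma == 0:
        raise ValueError("genomic values have zero variance")
    return (locus_value - mu) / sigma


# Hypothetical per-window values, invented for illustration only.
window_values = [10.0, 12.0, 11.0, 9.0, 13.0]
print(deviation_in_sd(17.0, window_values))
```

Read this way, a ΔOUV of 17.33 means the locus value lies 17.33 standard deviations from the mean genomic value.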