Protein evolution driven by symmetric structural repeats

Background

Duplications play a major role in the evolution of genomes by creating and modifying molecular functions. Repeats are created at the DNA level, but when they are intragenic they also affect protein sequence and structure. Moreover, because sequences evolve faster than 3D structures, some ancient repeats can no longer be detected at the sequence level. Understanding the evolutionary and functional role of repeats therefore requires tracking them at all three levels: DNA sequence, protein sequence and protein structure.

Materials and methods

For this task we developed a program (SWELFE) to find repeats in DNA sequences, amino acid sequences and 3D protein structures (available at http://bioserv.rpbs.jussieu.fr/swelfe) [1]. Repeats were searched independently at each of the three levels using dynamic programming (SIM algorithm), with scoring matrices adapted to each level. 3D structures were encoded as strings of α angles (the dihedral angle defined by four consecutive Cα atoms), so that the same algorithm could be used at all three levels. We verified the validity of this approach by superimposing the 3D repeats, and we evaluated the repeats found statistically.
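The encoding and search described above can be illustrated with a minimal sketch. It is not the SWELFE implementation: it computes α angles from a list of Cα coordinates, then looks for the best internal repeat with a simplified Smith-Waterman-style self-comparison rather than the SIM algorithm. The function names, the angle tolerance `tol` and the gap penalty `gap` are assumptions chosen for illustration, and the scoring is a plain angular difference rather than the adapted scoring matrices mentioned in the text.

```python
import math

def _sub(a, b):
    return [a[k] - b[k] for k in range(3)]

def _dot(a, b):
    return sum(a[k] * b[k] for k in range(3))

def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = [c / norm for c in b1]
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = [b0[k] - _dot(b0, b1) * b1[k] for k in range(3)]
    w = [b2[k] - _dot(b2, b1) * b1[k] for k in range(3)]
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

def alpha_angles(ca):
    """One alpha angle per window of four consecutive C-alpha positions."""
    return [dihedral(*ca[i:i + 4]) for i in range(len(ca) - 3)]

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def best_internal_repeat(angles, tol=30.0, gap=-2.0):
    """Local alignment of the angle string against itself, upper triangle only.

    Returns (score, i, j): the best score and the 1-based end positions of the
    two copies. `tol` and `gap` are illustrative values, not published ones.
    This sketch does not prevent the two copies from overlapping.
    """
    n = len(angles)
    H = [[0.0] * (n + 1) for _ in range(n + 1)]
    best = (0.0, 0, 0)
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):  # j > i excludes the trivial self-match
            s = 1.0 - angle_diff(angles[i - 1], angles[j - 1]) / tol
            H[i][j] = max(0.0, H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best[0]:
                best = (H[i][j], i, j)
    return best
```

Restricting the comparison to cells with `j > i` is what turns an ordinary local alignment into a search for internal repeats. A fuller treatment would also report several non-intersecting alignments, which is what SIM is designed for.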

As there is currently no direct link between 3D structures and the corresponding DNA sequences, we built a databank in which each PDB structure is associated with its best match among the genes in TREMBL. The databank has 85 845 entries, corresponding to ~85% of the unique chains of the PDB databank after removing entries with very small peptides or many undefined residues. This allowed us to compare directly the results of repeat identification at the three levels.

Results

In our analyses we were surprised to find that many of the large structural repeats are symmetrical (2-fold symmetry): one copy of the repeat can be superimposed on the other by a simple rotation of 180° around an axis (some repeats have 3-fold or higher symmetry). An example of a symmetrical repeat is shown in Figure 1. Structures containing such repeats thus resemble homodimers combined into a single unit.
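One way to make the 2-fold criterion concrete is to inspect the rotation that superimposes one copy of a repeat on the other. The sketch below assumes that a 3×3 rotation matrix `R` has already been obtained from a least-squares superposition (not shown here) and recovers the rotation angle from its trace. The function names and the tolerance `tol_deg` are hypothetical and are not taken from the study.

```python
import math

def rotation_angle_deg(R):
    """Rotation angle (degrees, 0 to 180) of a 3x3 rotation matrix."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # trace = 1 + 2*cos(angle); clamp to guard against rounding error.
    c = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(c))

def looks_two_fold(R, tol_deg=15.0):
    """True if R is within tol_deg of a 180-degree rotation (assumed cutoff)."""
    return abs(rotation_angle_deg(R) - 180.0) <= tol_deg

# A 180-degree rotation about the z axis, as expected for a 2-fold repeat.
half_turn = [[-1.0, 0.0, 0.0],
             [0.0, -1.0, 0.0],
             [0.0, 0.0, 1.0]]
print(looks_two_fold(half_turn))
```

By the same reasoning, adjacent copies in a repeat with 3-fold symmetry would give an angle close to 120°. The angle alone does not show that the superposition is a pure rotation, so the translation along the axis would also need to be checked.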

Figure 1

Example of a repeat found in the 3D structure of the TATA-box binding protein (TBP) of Sulfolobus acidocaldarius (PDB entry 1MP9). Repeats are shown in light grey and non-repeated regions in black. The repeat is 83 amino acids long.

References

  1. Abraham A-L, Rocha EPC, Pothier J: Swelfe: a detector of internal repeats in sequences and structures. Bioinformatics 2008. doi: 10.1093/bioinformatics/btn234

Acknowledgements

This work was supported by grants from Region Ile-de-France to ALA, ACI IMPBIO to EvolRep and ANR-06-CIS to project PROTEUS.

Author information

Corresponding author

Correspondence to Anne-Laure Abraham.

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Abraham, AL., Pothier, J. & Rocha, E.P. Protein evolution driven by symmetric structural repeats. BMC Bioinformatics 9, P3 (2008). https://doi.org/10.1186/1471-2105-9-S10-P3

Keywords

  • Dynamic Programming
  • Dihedral Angle
  • Combinatorial Library
  • Small Peptide
  • Protein Evolution