BMC Bioinformatics

Open Access

Using ConTemplate and the PDB to explore conformational space: on the detection of rare protein conformations

  • Aya Narunsky (1),
  • Haim Ashkenazy (2),
  • Rachel Kolodny (3) and
  • Nir Ben-Tal (1)
BMC Bioinformatics 2015, 16(Suppl 3):A3

https://doi.org/10.1186/1471-2105-16-S3-A3

Published: 13 February 2015

Keywords

Protein Data Bank; Calcium Binding; Query Protein; S100 Family; S100A12 Protein

Background

Conformational changes mediate important protein functions, such as the opening and closing of channel gates and the activation and inactivation of enzymes. The entire conformational repertoire of a given query protein may not be known; however, it may be possible to infer unknown conformations from other proteins. We developed the ConTemplate method to exploit the richness of the Protein Data Bank (PDB) [1] for this purpose. ConTemplate uses a three-step process to suggest alternative conformations for a query protein with one known conformation [2]. First, ConTemplate uses GESAMT to scan the PDB for proteins that share structural similarity with the query [3]. Next, for each of the collected proteins, additional known conformations are detected using BLAST [4] and clustered into a predefined number of clusters [5]. Finally, MODELLER [6] builds models of the query in various conformations, each representative of a cluster.
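The three steps can be read as a simple data flow. The sketch below is only a skeleton under stated assumptions: the four callables stand in for GESAMT, BLAST, the clustering step, and MODELLER, none of which is invoked here, and every function and parameter name is hypothetical. Using the first member of each cluster as its representative is a simplification made for brevity.

```python
from typing import Callable, List, Sequence


def propose_conformations(
    query: str,
    structural_search: Callable[[str], Sequence[str]],      # stand-in for the GESAMT scan
    other_conformations: Callable[[str], Sequence[str]],    # stand-in for the BLAST step
    cluster: Callable[[Sequence[str], int], List[List[str]]],
    build_model: Callable[[str, str], str],                 # stand-in for MODELLER
    n_clusters: int,
) -> List[str]:
    """Return one model of the query per cluster of candidate templates."""
    # Step 1: collect structures that are structurally similar to the query.
    hits = structural_search(query)

    # Step 2: pool the additional conformations of every hit, keeping
    # first-seen order and dropping duplicates, then cluster the pool.
    pool: List[str] = []
    for hit in hits:
        for conformation in other_conformations(hit):
            if conformation not in pool:
                pool.append(conformation)
    if not pool:
        return []
    groups = cluster(pool, min(n_clusters, len(pool)))

    # Step 3: build one model per non-empty cluster, here from its first
    # member (an assumption; a real representative would be chosen with care).
    return [build_model(query, group[0]) for group in groups if group]
```

Passing the tools in as callables keeps the sketch runnable with simple stubs and makes clear that the number of clusters is the one parameter the user sets in advance.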

Results

We demonstrate the application of ConTemplate with S100A6, a member of the S100 family of Ca2+-binding proteins. The vast majority of proteins in this family bind Ca2+ through helix-loop-helix EF-hand motifs. The structure of the protein includes four helices connected by three loops. Calcium binding is coupled to a conformational change, in which helix 3 changes its orientation with respect to helix 4 (Figure 1A and 1B) [7]. Helix 2 also changes its position with respect to the rest of the protein upon calcium binding, but the change is not as dramatic. The RMSD between the Ca2+-bound and Ca2+-free conformations is 4.46 Å. The EF-hand motif is found in many PDB entries, yet known structures of the Ca2+-free conformation are relatively rare. These features make the protein an interesting example for examining how the performance of ConTemplate is affected by the distribution of conformations in the PDB: the highly abundant Ca2+-bound conformation may populate a very large cluster, which could mask the Ca2+-free conformation. Thus, finding the latter conformation could be challenging.
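RMSD values such as the one quoted above compare matched atoms of two structures. The following is a minimal sketch, assuming the two conformations have already been optimally superposed and reduced to equal-length lists of matched atom coordinates (for example Cα atoms); the function name and any coordinates passed to it are illustrative and not part of ConTemplate.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(coords_a: Sequence[Point], coords_b: Sequence[Point]) -> float:
    """Root-mean-square deviation between two pre-superposed coordinate sets.

    Assumes atom i of coords_a corresponds to atom i of coords_b and that
    the optimal superposition was done beforehand (it is NOT done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

Because the superposition step is left out, the value returned for raw, unaligned coordinates would overestimate the structural difference.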
Figure 1. ConTemplate results demonstrated using the S100A6 Ca2+-binding protein. The Ca2+-free (A) and Ca2+-bound (B) conformations are shown in the upper panels; helix 3 is marked in red, and the calcium ions in magenta. (C) Reproducing the Ca2+-bound conformation, starting from the Ca2+-free conformation as a query. The maximal RMSD between the query and similar proteins is set to 1.2 Å, the minimal Q-score to 0.4, and the number of clusters to 2. (D) Reproducing the Ca2+-free conformation, starting from the Ca2+-bound conformation as a query. The similarity cutoffs are the same as in C, and the number of clusters is set to 17.
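The similarity cutoffs in the caption can be read as a filter over structural-alignment hits. The sketch below assumes the Q-score definition usually given for GESAMT, Q = N_align^2 / ((1 + (RMSD/R0)^2) * N1 * N2) with R0 = 3 Å; this formula is an assumption, as it is not stated in this abstract. The `Hit` record, its field names, and the default thresholds as function arguments are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List

R0 = 3.0  # Angstrom; constant in the assumed Q-score definition


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one structural-alignment hit."""
    name: str
    n_aligned: int  # number of aligned residues
    n_query: int    # residues in the query
    n_target: int   # residues in the hit
    rmsd: float     # RMSD over aligned residues, in Angstrom


def q_score(hit: Hit) -> float:
    """Assumed Q-score: rewards long alignments, penalises high RMSD."""
    return hit.n_aligned ** 2 / (
        (1.0 + (hit.rmsd / R0) ** 2) * hit.n_query * hit.n_target
    )


def filter_hits(hits: Iterable[Hit], max_rmsd: float = 1.2, min_q: float = 0.4) -> List[str]:
    """Names of hits that satisfy both cutoffs (defaults follow the caption)."""
    return [h.name for h in hits if h.rmsd <= max_rmsd and q_score(h) >= min_q]
```

A hit must pass both thresholds: a short but precise alignment fails on Q-score, while a long but loose alignment fails on RMSD.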

Starting from the Ca2+-free conformation as a query, it is sufficient to set the number of clusters to 2 to retrieve both the Ca2+-bound and Ca2+-free conformations. ConTemplate reproduces the Ca2+-bound conformation with an RMSD of 1.6 Å (Figure 1C). This is based on the query's structural similarity to the Ca2+-free conformation of another member of the family, the S100A2 protein [8], and the bound conformation of this protein [9]. The sequence identity between the query and S100A2 is 47%. When the number of clusters is set to be larger than 2, each cluster represents either the Ca2+-bound or the Ca2+-free conformation. In contrast, when the abundant Ca2+-bound conformation is used as a query, even with up to three clusters the process retrieves only variants of the (initial) bound conformation. Only when the number of clusters is four or larger do we obtain at least one cluster representing the Ca2+-free conformation. In general, the ability to predict the other conformation improves as the number of clusters increases. For example, with 17 clusters, 4 clusters represent the rare conformation, and ConTemplate reproduces the Ca2+-free conformation with an RMSD of 2.43 Å (Figure 1D). This is based on the query's structural similarity to the bound conformation of another member of the family, the S100A12 protein [10], and the known free conformation of this protein [11]. The sequence identity between the query and the template is 42%.
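The masking of a rare conformation by an abundant one can be illustrated with a toy example. The sketch below assumes single-linkage agglomerative clustering on a pairwise distance matrix with the medoid as cluster representative; it does not reproduce ConTemplate's actual clustering procedure [5]. The one-dimensional "conformation coordinate" values and their labels are invented for illustration and are not derived from S100 structures.

```python
from typing import List, Sequence


def single_linkage(dist: Sequence[Sequence[float]], k: int) -> List[List[int]]:
    """Merge the two closest clusters until only k remain (single linkage)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


def medoid(dist: Sequence[Sequence[float]], members: Sequence[int]) -> int:
    """Member with the smallest summed distance to the rest of its cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def representative_labels(values: Sequence[float], labels: Sequence[str], k: int) -> List[str]:
    """Labels of the k cluster representatives, sorted alphabetically."""
    dist = [[abs(x - y) for y in values] for x in values]
    return sorted(labels[medoid(dist, c)] for c in single_linkage(dist, k))


# Invented toy data: nine "bound"-like structures spread over a wide range
# and two "free"-like structures lying close to one edge of that range.
values = [0, 1, 2, 11, 12, 13, 20, 21, 22, 26, 27]
labels = ["bound"] * 9 + ["free"] * 2

for k in (2, 3, 4):
    print(k, representative_labels(values, labels, k))
```

In this toy data set a "free" representative appears only at k = 4; with fewer clusters the two rare structures are absorbed into a cluster whose medoid is an abundant structure. The threshold of four is a property of the invented values, not a reproduction of the S100A6 result.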

Conclusions

ConTemplate suggests putative conformations for a query protein with at least one known structure, based on the query's structural similarity to other proteins. In principle, the clustering method enables the detection of distinct conformations, including local conformational changes. However, it may be necessary to adjust ConTemplate's parameters to reveal such changes, especially when looking for rare conformations. When ConTemplate suggests models that are similar to the query, and the clusters are very large, this may indicate that less common conformations of the query are masked by highly abundant conformations. Increasing the number of clusters may enable the rarer conformations to be detected. When the additional conformation is not known, it is not trivial to identify the "correct" conformation among the suggested models. A careful examination of the similar proteins and their conformational changes can help in selecting the most probable conformations for the query. In addition, if the number of clusters is large enough, a pathway between the query conformation and a putative conformation may be found, with other models serving as intermediates. Identification of such a pathway could provide insight into the physiological relevance of a newly detected conformation.
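One way to look for such a pathway is to treat the models as nodes of a graph and connect two models when they are structurally close. The sketch below is an illustration under stated assumptions, not ConTemplate's procedure: it assumes a precomputed pairwise RMSD matrix between models, a hypothetical `max_step` threshold that defines which models count as neighbours, and a breadth-first search for the path with the fewest intermediates.

```python
from collections import deque
from typing import List, Optional, Sequence


def find_pathway(
    dist: Sequence[Sequence[float]],
    start: int,
    goal: int,
    max_step: float,
) -> Optional[List[int]]:
    """Shortest chain of models from start to goal.

    Two models are linked when their pairwise distance is at most max_step.
    Returns the list of model indices along the path, or None if the goal
    cannot be reached with steps of that size.
    """
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = parent[current]
            return path[::-1]
        for nxt in range(len(dist)):
            if nxt not in parent and dist[current][nxt] <= max_step:
                parent[nxt] = current
                queue.append(nxt)
    return None
```

With few models the gaps between them may exceed `max_step` and no path is returned; adding intermediate models, for instance by increasing the number of clusters, can close those gaps.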

Declarations

Acknowledgements

A.N. and H.A. are funded in part by the Edmond J. Safra Center for Bioinformatics at Tel Aviv University.

Authors’ Affiliations

(1)
Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
(2)
The Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
(3)
Department of Computer Science, University of Haifa, Mount Carmel, Haifa, Israel

References

  1. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res. 2000, 28 (1): 235-242. 10.1093/nar/28.1.235.
  2. Narunsky A, Ben-Tal N: ConTemplate: exploiting the protein databank to propose ensemble of conformations of a query protein of known structure. BMC Bioinformatics. 2014, 15 (Suppl 3): A5. 10.1186/1471-2105-15-S3-A5.
  3. Krissinel E: Enhanced fold recognition using efficient short fragment clustering. J Mol Biochem. 2012, 1 (2): 76-85.
  4. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215 (3): 403-410. 10.1016/S0022-2836(05)80360-2.
  5. Choi IG, Kwon J, Kim SH: Local feature frequency profile: a method to measure structural similarity in proteins. Proc Natl Acad Sci USA. 2004, 101 (11): 3797-3802. 10.1073/pnas.0308656100.
  6. Sali A, Blundell TL: Comparative protein modelling by satisfaction of spatial restraints. J Mol Biol. 1993, 234 (3): 779-815. 10.1006/jmbi.1993.1626.
  7. Otterbein LR, Kordowska J, Witte-Hoffmann C, Wang CL, Dominguez R: Crystal structures of S100A6 in the Ca(2+)-free and Ca(2+)-bound states: the calcium sensor mechanism of S100 proteins revealed at atomic resolution. Structure. 2002, 10 (4): 557-567. 10.1016/S0969-2126(02)00740-2.
  8. Koch M, Diez J, Fritz G: Crystal structure of Ca2+-free S100A2 at 1.6-A resolution. J Mol Biol. 2008, 378 (4): 933-942. 10.1016/j.jmb.2008.03.019.
  9. Koch M, Fritz G: The structure of Ca2+-loaded S100A2 at 1.3-A resolution. FEBS J. 2012, 279 (10): 1799-1810. 10.1111/j.1742-4658.2012.08556.x.
  10. Moroz OV, Antson AA, Grist SJ, Maitland NJ, Dodson GG, Wilson KS, Lukanidin E, Bronstein IB: Structure of the human S100A12-copper complex: implications for host-parasite defence. Acta Crystallogr D Biol Crystallogr. 2003, 59 (Pt 5): 859-867.
  11. Moroz OV, Blagova EV, Wilkinson AJ, Wilson KS, Bronstein IB: The crystal structures of human S100A12 in apo form and in complex with zinc: new insights into S100A12 oligomerisation. J Mol Biol. 2009, 391 (3): 536-551. 10.1016/j.jmb.2009.06.004.

Copyright

© Narunsky et al.; licensee BioMed Central Ltd. 2015

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
