Correlation analysis of the side-chains conformational distribution in bound and unbound proteins
© Kirys et al.; licensee BioMed Central Ltd. 2012
Received: 23 January 2012
Accepted: 11 September 2012
Published: 17 September 2012
Protein interactions play a key role in life processes. Characterization of conformational properties of protein-protein interactions is important for understanding the mechanisms of protein association. The rapidly increasing number of experimentally determined structures of proteins and protein-protein complexes provides a foundation for research on protein interactions and complex formation. Knowledge of the conformations of the surface side chains is essential for modeling protein complexes. The purpose of this study was to analyze and compare dihedral angle distribution functions of the side chains at the interface and non-interface areas in bound and unbound proteins.
To calculate the dihedral angle distribution functions, the configuration space was divided into grid cells. Statistical analysis showed that the similarity between bound and unbound distributions, at both the interface and the non-interface surface, depends on the amino acid type and the grid resolution. The correlation coefficients between the distribution functions increased with increasing grid spacing for all amino acid types. The Manhattan distance, which measures the degree of dissimilarity between the distribution functions, decreased accordingly. Short residues with one or two dihedral angles had higher correlations and smaller Manhattan distances than the longer residues. Met and Arg had the slowest growth of the correlation coefficient with increasing grid spacing. The correlations between the interface and non-interface distribution functions had a similar dependence on the grid resolution in both bound and unbound states. The differences between bound and unbound distribution functions, caused by biological protein-protein interactions or crystal contacts, disappeared at the 70° grid spacing for interfaces and at 30° for the non-interface surface, which agrees with the average span of the side-chain rotamers.
The two-fold difference in the critical grid spacing indicates larger conformational changes upon binding at the interface than at the rest of the surface. At the same time, transitions between rotamers induced by interactions across the interface or the crystal packing are rare, with most side chains having local readjustments that do not change the rotameric state. The analysis is important for better understanding of protein interactions and development of flexible docking approaches.
Protein-protein interactions play a key role in life processes. Characterization of conformational changes in proteins upon binding is important for understanding the mechanisms of protein association and for our ability to model it. The dependence of the side-chain dihedral angle distribution on the conformation of the backbone has been investigated in earlier studies [1–5]. The side-chain dihedral angles are not evenly distributed, but for the most part are tightly clustered. A number of unbound rotamer libraries have been described previously [1–14] (see [15] for a review). Dunbrack and Cohen [1] used Bayesian statistics to estimate populations and dihedral angles for all amino acid rotamers at all φ and ψ values. A backbone-dependent rotamer library was obtained by dividing φ and ψ dihedral space into 10° × 10° bins, χ angles into 120° bins, and calculating frequencies and average values of rotamers for each amino acid. A backbone-independent rotamer library was generated in a similar way. In a recent study [16], a new version of the backbone-dependent rotamer library was developed. It consists of rotamer frequencies, mean dihedral angles, and variances as a function of the backbone dihedral angles. In one of the latest backbone-independent rotamer libraries, the “Penultimate rotamer library” [5] by Lovell, Richardson and colleagues, the dihedral angle space was clustered and rotamer positions were defined as the distribution mode.
Comparison of the side-chain distribution in the core and on the surface [3], conducted on 19 protein structures available in 1978, revealed a small variation of the χ1 rotamer distribution. A later study on a set of 50 non-homologous proteins showed that for all side chains, except Asp, Asn and Glu, the distributions of χ1 rotamers on the surface and in the core are not significantly different.
Comparison of the χ1 and χ2 distributions at the interface and non-interface surface was performed by Guharoy et al. [18]. Distributions were divided into bins as in Dunbrack’s backbone-independent rotamer library. Empirical free energies of inter-rotamer transitions were calculated and compared for the interface and non-interface areas. The rotamer free energies were different at the interface and non-interface, whereas bound and unbound free energies were essentially the same.
Conformations of surface residues in protein structures determined by crystallography are affected by the crystal packing. The area of the protein surface involved in the crystal contacts is generally smaller than in biological interfaces, and the interface packing is looser. Studies of the crystal packing effect on the surface side chains [21–23] showed that ~20% of the exposed side chains change conformation, and that the change increases with the side-chain solvent accessibility. The large polar or charged residues Arg, Lys, Glu and Gln, as well as Ser, were found to be the most flexible.
The purpose of this study was to analyze and compare dihedral angle distribution functions of the side chains at the interface and non-interface areas in bound and unbound proteins. Such analysis is important for better understanding of protein interactions and for the development of flexible docking approaches. The dihedral-angle distribution functions (DADFs) were calculated on a cubic grid dividing the dihedral space into cells for each residue type, at the interface and the non-interface surface, in bound and unbound structures. The correlation coefficients between bound and unbound, and between interface and non-interface DADFs were calculated, along with the Manhattan distance as a measure of dissimilarity between the DADFs. All the correlation coefficients depended on the amino acid type and the grid resolution. The correlation coefficients always increased with increasing grid spacing, whereas the Manhattan distances decreased accordingly. Short residues with one or two dihedral angles had higher correlations and smaller Manhattan distances at small grid spacing than the longer residues. The correlation between the interface and non-interface DADFs showed a similar dependence on the grid resolution in both bound and unbound states. The differences between bound and unbound DADFs induced by biological protein-protein interactions or crystal contacts disappeared at the 70° grid spacing for interfaces and at 30° for the non-interface surface. The two-fold difference in the critical grid spacing indicates larger changes at the interface than on the rest of the surface. While the earlier studies [18, 24, 25] observed this trend for the side-chain rotamers, this study validates it by a more general approach based on the DADFs.
The analysis was performed on the non-redundant Dockground Benchmark 3 set of bound and corresponding unbound protein structures [26]. The set consists of 233 complexes, with the unbound structures of both interacting proteins for 99 complexes, and the unbound structure of one interacting protein for 134 complexes. The following criteria were used for generating the set: sequence identity between bound and unbound structures > 97%; sequence identity between complexes < 30%; and homomultimers, crystal packing, and obligate complexes excluded.
Number of surface residues in bound and unbound proteins
Side-chain conformations were represented by dihedral angles, calculated by Dangle [28]. All dihedral angles varied from −180° to 180°, with the exception of the last dihedral angle in Phe, Tyr, Asp and Glu, which varied from 0° to 180° due to the symmetry of the terminal aromatic and charged groups. To calculate the distribution functions, the configuration space was divided into cells by a cubic grid.
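The gridding step can be sketched as follows. This is a minimal illustration, assuming the χ angles of one residue type are already available as an array with one row per residue. The function name `dadf`, its `ranges` parameter, and the use of NumPy's `histogramdd` are assumptions made for illustration and are not taken from the original implementation.

```python
import numpy as np

def dadf(chi_deg, spacing_deg, ranges=None):
    """Normalized histogram of side-chain dihedral angles on a cubic grid.

    chi_deg:     array of shape (n_residues, n_chi), angles in degrees.
    spacing_deg: edge length of one grid cell, in degrees.
    ranges:      optional (low, high) pair per dihedral; the default is
                 (-180, 180) for every angle. A symmetric terminal angle
                 can be given (0, 180) instead.
    """
    chi = np.asarray(chi_deg, dtype=float)
    if chi.ndim == 1:                       # residues with a single chi angle
        chi = chi.reshape(-1, 1)
    n_chi = chi.shape[1]
    if ranges is None:
        ranges = [(-180.0, 180.0)] * n_chi
    edges = []
    for low, high in ranges:
        # If the spacing does not divide the range, the last cell
        # simply extends past the upper bound.
        n_cells = int(np.ceil((high - low) / spacing_deg))
        edges.append(low + spacing_deg * np.arange(n_cells + 1))
    counts, _ = np.histogramdd(chi, bins=edges)
    return counts / counts.sum()

# Hypothetical chi1/chi2 values for four residues of one type.
example = [[-60, 180], [-65, 175], [60, -60], [-170, 65]]
grid = dadf(example, 120.0)
print(grid.shape)
```

Treating the last cell as extending past the upper bound is only one possible convention for spacings that do not divide the angular range evenly; the source does not state how such cases were handled.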
The Manhattan distance equals 0 for two identical DADFs, and increases up to 1 with the decrease of the DADFs similarity (higher similarity between the DADFs corresponds to lower values of the Manhattan distance).
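As an illustration of the two similarity measures, the sketch below computes a Pearson correlation coefficient and a normalized Manhattan distance between two gridded distributions. The factor of 1/2 in `manhattan` is an assumption chosen so that the distance spans 0 to 1 as described above; the function names and the sample values are hypothetical.

```python
import numpy as np

def correlation(p, q):
    """Pearson correlation coefficient between two gridded distributions."""
    return float(np.corrcoef(np.ravel(p), np.ravel(q))[0, 1])

def manhattan(p, q):
    """Half the sum of absolute cell-wise differences.

    For distributions that each sum to 1 this is 0 when they are
    identical and 1 when they do not overlap at all. The 1/2 factor
    is an assumed normalization.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return 0.5 * float(np.abs(p - q).sum())

# Hypothetical bound and unbound distributions over four grid cells.
bound = np.array([0.5, 0.3, 0.2, 0.0])
unbound = np.array([0.4, 0.3, 0.2, 0.1])
print(correlation(bound, unbound), manhattan(bound, unbound))
```

The two measures respond differently to the same change: the correlation is dominated by the heavily populated cells, whereas the Manhattan distance weights every cell-wise difference equally.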
Results and discussion
Table 2. The minimal grid spacing corresponding to a correlation coefficient of 0.7 between bound and unbound interface/non-interface dihedral angle distributions
[Table: Correlation between interface bound and unbound distributions for 30° grid spacing. Columns: covariance (numerator in Equation 1); product of standard deviations (denominator in Equation 1); standard deviation of the unbound DADF; standard deviation of the bound DADF.]
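The quantities named in the column headings above are the terms of a Pearson correlation coefficient. The sketch below splits the coefficient into those terms for two gridded distributions. It assumes population statistics (division by the number of cells); the function name, dictionary keys, and sample values are hypothetical.

```python
import numpy as np

def correlation_terms(bound, unbound):
    """Split the Pearson coefficient into covariance and standard deviations."""
    b = np.ravel(np.asarray(bound, dtype=float))
    u = np.ravel(np.asarray(unbound, dtype=float))
    covariance = float(np.mean((b - b.mean()) * (u - u.mean())))
    std_bound = float(b.std())      # population standard deviation (ddof=0)
    std_unbound = float(u.std())
    std_product = std_bound * std_unbound
    return {
        "covariance": covariance,
        "std_product": std_product,
        "std_unbound": std_unbound,
        "std_bound": std_bound,
        "correlation": covariance / std_product,
    }

# Hypothetical bound and unbound distributions over four grid cells.
bound = np.array([0.5, 0.3, 0.2, 0.0])
unbound = np.array([0.4, 0.3, 0.2, 0.1])
terms = correlation_terms(bound, unbound)
print(terms)
```

Reporting the terms separately shows whether a low coefficient comes from a small covariance or from a large spread in one of the two distributions.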
The dihedral-angle distribution functions were calculated for each amino acid type for interface and non-interface surface residues, in bound and unbound protein structures. To generate the distribution functions, the configuration space was divided into cells by a cubic grid. Correlation coefficients between bound and unbound interface and non-interface distribution functions were calculated. The similarity between the distributions was also quantified by the Manhattan distance. The results showed that all the correlation coefficients depend on amino acid type and the grid resolution. For all amino acid types, the correlation coefficients increased with the increase of the grid spacing. The Manhattan distances between the distribution functions decreased accordingly. Short residues with one or two dihedral angles had higher correlations and smaller Manhattan distances than the longer residues. Met and Arg had the lowest correlation coefficients at any grid spacing. The correlations between the interface and non-interface distribution functions had a similar dependence on the grid resolution in both bound and unbound states. The interface and non-interface difference between bound and unbound distribution functions, induced by biological protein-protein interactions or crystal contacts, disappeared at the 70° grid spacing for interfaces and 30° for non-interface surface, in agreement with an average span of a side-chain rotamer. The two-fold difference in the critical grid spacing indicates larger conformational changes upon binding at the interface than at the rest of the surface. At the same time, transitions between rotamers induced by interactions across the interface or the crystal packing are rare, with most side chains having local readjustments that do not change the rotameric state.
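The distinction between a rotamer transition and a local readjustment can be illustrated with a short sketch. It assumes the 120° χ bins mentioned in the introduction as a simplified rotamer definition, which does not hold for every dihedral angle (for example, symmetric terminal angles). All function names and angle values are hypothetical, and this classification is not the DADF analysis used in the study.

```python
def rotamer_bin(chi_deg):
    """Index 0, 1 or 2 of the 120-degree bin: [0,120), [120,240), [240,360)."""
    return int((chi_deg % 360.0) // 120.0)

def angular_change(a_deg, b_deg):
    """Smallest absolute difference between two angles, in [0, 180] degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)

def classify(unbound_chi, bound_chi):
    """Label a side chain by whether any chi angle changes its 120-degree bin."""
    same = all(rotamer_bin(u) == rotamer_bin(b)
               for u, b in zip(unbound_chi, bound_chi))
    return "local readjustment" if same else "rotamer transition"

# Hypothetical chi1/chi2 pairs (degrees) for one residue, unbound then bound.
print(classify((-65.0, 175.0), (-50.0, 170.0)))
print(classify((-65.0, 175.0), (60.0, 175.0)))
print(angular_change(-170.0, 175.0))
```

A change that crosses a bin boundary counts as a transition here even when the angular change is small, which is one reason a grid-based comparison of whole distributions is more general than a rotamer count.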
Conformational sampling based on the side-chain dihedral angle distributions may optimize flexible docking protocols by reflecting conformational preferences of the bound proteins. The results suggest that a site-specific (interface vs. non-interface) and residue-specific grid spacing smaller than the critical values should be used in the sampling. The minimal grid spacing (Table 2) reflects local intra-rotamer readjustments of the amino acids upon binding. Thus, using such steps in conformational sampling may accelerate the flexible docking search by matching the size of these readjustments.
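A minimal grid spacing of the kind reported in Table 2 could be located by scanning candidate spacings until the bound/unbound correlation reaches the chosen threshold. The sketch below assumes synthetic single-χ data, candidate spacings in 10° steps, and hypothetical function names; only the 0.7 threshold comes from the table caption.

```python
import numpy as np

def gridded(chi_deg, spacing_deg):
    """Flattened, normalized histogram on a cubic grid over [-180, 180]."""
    chi = np.asarray(chi_deg, dtype=float)
    chi = chi.reshape(len(chi), -1)          # rows = residues, columns = chi angles
    n_cells = int(np.ceil(360.0 / spacing_deg))
    edges = -180.0 + spacing_deg * np.arange(n_cells + 1)
    counts, _ = np.histogramdd(chi, bins=[edges] * chi.shape[1])
    return counts.ravel() / counts.sum()

def minimal_spacing(bound, unbound, spacings, threshold=0.7):
    """Smallest candidate spacing whose bound/unbound correlation reaches the threshold."""
    for spacing in sorted(spacings):
        r = np.corrcoef(gridded(bound, spacing), gridded(unbound, spacing))[0, 1]
        if r >= threshold:
            return spacing
    return None                              # threshold not reached by any candidate

# Synthetic data: unbound chi1 values near -60 degrees, and bound values
# obtained by a small random local readjustment (assumed, not measured).
rng = np.random.default_rng(0)
unbound = rng.normal(-60.0, 15.0, size=(500, 1))
bound = unbound + rng.normal(0.0, 10.0, size=(500, 1))
spacings = list(range(10, 130, 10))
print(minimal_spacing(bound, unbound, spacings))
```

Returning the first spacing that meets the threshold relies on the observation that the correlation grows with the grid spacing; for noisy data a stricter rule, such as requiring all larger spacings to pass as well, may be preferable.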
TK is a PhD student at the United Institute of Informatics Problems, National Academy of Sciences of Belarus and a Research Assistant at the Center for Bioinformatics, The University of Kansas; AMR is an Assistant Research Professor at the Center for Bioinformatics, The University of Kansas; AVT is the General Director of the United Institute of Informatics Problems, National Academy of Sciences of Belarus; and IAV is the Director of the Center for Bioinformatics and Professor of Bioinformatics and Molecular Biosciences at The University of Kansas.
This study was supported by grant R01GM074255 from the NIH.
1. Dunbrack RL, Cohen FE: Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci 1997, 6: 1661–1681. doi:10.1002/pro.5560060807
2. Dunbrack RL, Karplus M: Backbone-dependent rotamer library for proteins: application to side-chain prediction. J Mol Biol 1993, 230: 543–574. doi:10.1006/jmbi.1993.1170
3. Janin J, Wodak S: Conformation of amino acid side-chains in proteins. J Mol Biol 1978, 125: 357–386. doi:10.1016/0022-2836(78)90408-4
4. McGregor MJ, Islam SA, Sternberg MJE: Analysis of the relationship between side-chain conformation and secondary structure in globular proteins. J Mol Biol 1987, 198: 295–310. doi:10.1016/0022-2836(87)90314-7
5. Lovell SC, Word JM, Richardson JS, Richardson DC: The penultimate rotamer library. Proteins 2000, 40: 389–408. doi:10.1002/1097-0134(20000815)40:3<389::AID-PROT50>3.0.CO;2-2
6. Tuffery P, Etchebest C, Hazout S, Lavery R: A new approach to the rapid determination of protein side-chain conformations. J Biomol Struct Dyn 1991, 8: 1267–1289. doi:10.1080/07391102.1991.10507882
7. Benedetti E, Morelli G, Nemethy G, Scheraga HA: Statistical and energetic analysis of side-chain conformations in oligopeptides. Int J Pept Prot Res 1983, 22: 1–15.
8. Chandrasekaran R, Ramachandran GN: Studies on the conformation of amino acids. XI. Analysis of the observed side group conformation in proteins. Int J Protein Res 1970, 2: 223–233.
9. Bhat TN, Sasisekharan V, Vijayan M: Analysis of side-chain conformation in proteins. Int J Pept Prot Res 1979, 13: 170–184.
10. Ponder JW, Richards FM: Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J Mol Biol 1987, 193: 775–791. doi:10.1016/0022-2836(87)90358-5
11. Schrauber H, Eisenhaber F, Argos P: Rotamers: to be or not to be? An analysis of amino acid side-chain conformations in globular proteins. J Mol Biol 1993, 230: 592–612. doi:10.1006/jmbi.1993.1172
12. Kono H, Doi J: A new method for side-chain conformation prediction using a Hopfield network and reproduced rotamers. J Comput Chem 1996, 17: 1667–1683.
13. DeMaeyer M, Desmet J, Lasters I: All in one: a highly detailed rotamer library improves both accuracy and speed in the modelling of sidechains by dead-end elimination. Fold Des 1997, 2: 53–66. doi:10.1016/S1359-0278(97)00006-0
14. Beglov D, Hall D, Brenke R, Shapovalov MV, Dunbrack RL, Kozakov D, Vajda S: Minimal ensembles of side chain conformers for modeling protein-protein interactions. Proteins 2011, 80: 591–601.
15. Dunbrack RL: Rotamer libraries in the 21st century. Curr Opin Struct Biol 2002, 12: 431–440. doi:10.1016/S0959-440X(02)00344-5
16. Shapovalov MV, Dunbrack RL: A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions. Structure 2011, 19: 844–858. doi:10.1016/j.str.2011.03.019
17. Pickett SD, Sternberg MJE: Empirical scale of side-chain conformational entropy in protein folding. J Mol Biol 1993, 231: 825–839. doi:10.1006/jmbi.1993.1329
18. Guharoy M, Janin J, Robert CH: Side-chain rotamer transitions at protein–protein interfaces. Proteins 2010, 78: 3219–3225. doi:10.1002/prot.22821
19. Carugo O, Argos P: Protein-protein crystal-packing contacts. Protein Sci 1997, 6: 2261–2263.
20. Janin J, Bahadur RP, Chakrabarti P: Protein-protein interaction and quaternary structure. Quart Rev Biophys 2008, 41: 133–180.
21. Zhao S, Goodsell DS, Olson AJ: Analysis of a data set of paired uncomplexed protein structures: new metrics for side-chain flexibility and model evaluation. Proteins 2001, 43: 271–279. doi:10.1002/prot.1038
22. Jacobson MP, Friesner RA, Xiang Z, Honig B: On the role of the crystal environment in determining protein side-chain conformations. J Mol Biol 2002, 320: 597–608. doi:10.1016/S0022-2836(02)00470-9
23. Eyal E, Gerzon S, Potapov V, Edelman M, Sobolev V: The limit of accuracy of protein modeling: influence of crystal packing on protein structure. J Mol Biol 2005, 351: 431–442. doi:10.1016/j.jmb.2005.05.066
24. Ruvinsky AM, Kirys T, Tuzikov AV, Vakser IA: Side-chain conformational changes upon protein-protein association. J Mol Biol 2011, 408: 356–365. doi:10.1016/j.jmb.2011.02.030
25. Kirys T, Ruvinsky A, Tuzikov AV, Vakser IA: Rotamer libraries and probabilities of transition between rotamers for the side chains in protein-protein binding. Proteins 2012, 80: 2089–2098.
26. Gao Y, Douguet D, Tovchigrechko A, Vakser IA: DOCKGROUND system of databases for protein recognition studies: unbound structures for docking. Proteins 2007, 69: 845–851. doi:10.1002/prot.21714
27. Hubbard SJ, Thornton JM: NACCESS, computer program, Department of Biochemistry and Molecular Biology. University College London; 1993.
28. 'Dang', Computer Program. http://kinemage.biochem.duke.edu
29. Rodgers JL, Nicewander WA: Thirteen ways to look at the correlation coefficient. The American Statistician 1988, 42: 59–66.
30. Krause EF: Taxicab Geometry: an adventure in non-Euclidean geometry. New York: Dover Publications; 1987.
31. Rahman NA: A course in theoretical statistics. Charles Griffin & Co; 1968.
32. Dall'Acqua W, Goldman ER, Lin W, Teng C, Tsuchiya D, Li H, Ysern X, Braden BC, Li Y, Smith-Gill SJ: A mutational analysis of binding interactions in an antigen-antibody protein-protein complex. Biochemistry 1998, 37: 7981–799. doi:10.1021/bi980148j
33. Bhat TN, Bentley GA, Boulot G, Greene MI, Tello D, Dall'Acqua W, Souchon H, Schwarz FP, Mariuzza RA, Poljak RJ: Bound water molecules and conformational stabilization help mediate an antigen-antibody association. Proc Natl Acad Sci USA 1994, 91: 1089–1093. doi:10.1073/pnas.91.3.1089
34. Frigerio F, Coda A, Pugliese L, Lionetti C, Menegatti E, Amiconi G, Schnebli HP, Ascenzi P, Bolognesi M: Crystal and molecular structure of the bovine alpha-chymotrypsin-eglin c complex at 2.0 Å resolution. J Mol Biol 1992, 225: 107–123. doi:10.1016/0022-2836(92)91029-O
35. Dixon MM, Matthews BW: Is gamma-chymotrypsin a tetrapeptide acyl-enzyme adduct of alpha-chymotrypsin? Biochemistry 1989, 28: 7033–7038. doi:10.1021/bi00443a038
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.