Oligomeric protein structure networks: insights into protein-protein interactions
© Brinda and Vishveshwara; licensee BioMed Central Ltd. 2005
Received: 26 July 2005
Accepted: 10 December 2005
Published: 10 December 2005
Protein-protein association is essential for a variety of cellular processes, and hence a large number of investigations are being carried out to understand the principles of protein-protein interactions. In this study, oligomeric protein structures are viewed from a network perspective to obtain new insights into protein association. Structure graphs of proteins have been constructed from a non-redundant set of protein oligomer crystal structures by considering amino acid residues as nodes and defining edges on the basis of the strength of the non-covalent interactions between the residues. The analysis of such networks has been carried out in terms of amino acid clusters and hubs (highly connected residues), with special emphasis on protein interfaces.
A variety of interactions that occur at the interfaces, such as hydrogen bonds, salt bridges, aromatic and hydrophobic interactions, are identified in a consolidated manner as amino acid clusters at the interface. Moreover, the characterization of the highly connected hub-forming residues at the interfaces and their comparison with the hubs from the non-interface regions and the non-hubs in the interface regions show that there is a predominance of charged interactions at the interfaces. Further, strong and weak interfaces are identified on the basis of the interaction strength between amino acid residues and the sizes of the interface clusters, which also show that many protein interfaces are stronger than their monomeric protein cores. The interface strengths evaluated from the interface clusters and hubs also correlate well with experimentally determined dissociation constants for known complexes. Finally, the interface hubs identified using the present method correlate very well with experimentally determined hot spots in the interfaces of protein complexes obtained from the Alanine Scanning Energetics database (ASEdb). A few predictions of interface hot spots have also been made based on the results of this analysis, and these await experimental verification.
The construction and analysis of oligomeric protein structure networks and their comparison with monomeric protein structure networks provide insights into protein association. Further, the interface hubs identified using the present method can be effective targets for interface de-stabilizing mutations. We believe this analysis will significantly enhance our knowledge of the principles behind protein association and also aid in protein design.
It is well known that a vast majority of cellular functions are mediated through protein-protein and protein-DNA interactions. Protein association is implicated in cellular signal transduction, antigen-antibody binding, the regulation of gene expression and the functioning of a huge variety of other constitutive multimers, where the multimeric state is the biologically active state. Hence, extensive research has been carried out to identify and understand the underlying principles of protein association and interactions. Some insights into such interactions at the atomic level have emerged from the analysis of a large number of high-resolution crystal structures. Such investigations involve the characterization of the geometrical, chemical and energetic features of the interfaces, as explained in various reviews [1–6]. Specific studies include obtaining residue preferences at the interfaces, calculations of geometric parameters and shape complementarities between the interacting protein chains [8–11], calculations of the loss in accessible surface upon multimerization [12–15], elucidation of the role of hydrogen bonds, salt bridges and hydrophobic and polar interactions at protein interfaces [16–21] and the analysis of conservation of residues at protein interfaces [22–26]. Various investigators have identified and analyzed energetic hot spots in protein interfaces using varied approaches [26–29]. Haliloglu et al. have compared protein folding and protein binding using vibrational motions of interface hot spots and conserved residues and conclude that both processes involve similar packing of amino acid residues. They also provide a method for identifying hot spots at binding interfaces. Further, Ofran and Rost have classified and analyzed the differences between six interface types, including obligatory and transient homo- and hetero-oligomers.
De et al. have also distinguished obligatory and non-obligatory interfaces using differences in the amino acid contacts and interaction patterns between the two interface types. Bahadur et al. have distinguished biological oligomers from non-specific oligomers that arise from crystal packing. There has also been speculation about whether folding and binding are completely decoupled from each other or whether they occur simultaneously, one coupled with the other. Wolynes and co-workers show through simulations that even if the monomers involved in binding are stable separately, binding might preferably occur through unfolded intermediates, thus implying that folding and binding may be coupled in vivo and driven by the native state topology of the functional protein. Further, a community-wide evaluation of the significance and success of different methods used in the prediction of protein-protein interactions and protein docking has been carried out (CAPRI) and has been hugely successful. However, although there have been significant advances in methods of protein docking, the methods generally used for identifying binding sites on monomer surfaces and for predicting protein-protein interaction sites are far from satisfactory. Hence, newer approaches are required to gain more insight into the factors contributing to protein-protein interactions.
We have earlier carried out an analysis on a limited set of twenty homodimers to understand the principles of protein-protein interactions from a graph perspective. This analysis was directed towards identifying clusters of amino acid residues with strong interactions at the protein interfaces, the nature of the residues involved in these interface clusters and the accessibility and conservation of these interface cluster forming residues. We had also proposed a simple and straightforward method to identify interacting surfaces on protein monomers, which was highly successful in that dataset. The present study focuses on the network of amino acid interactions across protein interfaces and has been carried out on a larger dataset of protein homo- as well as hetero-multimers. Recently, Del Sol and co-workers have investigated protein-protein complexes from the small-world network perspective using parameters like clustering coefficients and betweenness, where the central residues identified at the interfaces have been found to correlate with the experimentally determined hot spots. Further, the same group also proposes the rewiring of the small-world networks at protein interfaces to form clusters of central residues at the interfaces. The current analysis also considers the protein structure in its multimeric form as a network of non-covalently interacting amino acids. However, we use a definition of nodes and edges different from the one used by Del Sol and co-workers [37, 38], and have also incorporated an interaction strength term in the network construction and in the analysis of different parameters to understand the network topology of protein multimers. Since protein-protein interactions are mainly mediated through non-covalent interactions, the connections (edges) between amino acids (nodes) are defined on the basis of the strength of the non-covalent interactions, as evaluated from the normalized number of contacts between them.
The results are analyzed in terms of network properties such as the hubs (nodes with a greater number of edges) and clusters of amino acid residues in the protein complex at a given interaction strength, with particular focus on the protein-protein interface. Such an approach gives a global perspective of the interactions across the interface, which is difficult to obtain from pair-wise interaction or loss of accessible surface area analysis. For example, our earlier analysis of the clusters of interacting residues at the protein interface has given insights regarding the sequence signatures responsible for the different types of quaternary association in legume lectins and has also helped in the identification of hot spots in the α-α dimeric interface of Escherichia coli RNA polymerase. The network representation presented here has also been used earlier to identify structural domains and domain interface residues in multi-domain proteins using a graph spectral method. However, in this analysis, we focus on the identification and analysis of amino acid clusters and hubs at protein-protein interfaces based on a generic network approach.
Interesting observations made from the present analysis on protein multimers include the fact that the strength of interfaces evaluated using the interface clusters and hubs identified by the present method correlates well with the kinetic and thermodynamic parameters of complex formation evaluated experimentally. Further, the interface hubs identified here also correlate well with the hot spots identified experimentally on the basis of binding free energy. This result indicates that hot spots can be associated with interface hubs, the identification of which can be useful in rationally designing interface de-stabilizing mutants. Further, a comparison of the interface hubs with the hubs within the protein monomer and with the non-hubs at the interface shows significant differences in the interface hub properties, such as the contribution of the charged interactions being considerably higher at the interfaces. The analysis of the interface clusters has also shown that the protein interfaces are as strong as or stronger than the protein cores in more than half the protein complexes considered in the dataset. Thus, the present algorithm has given a new perspective on analyzing protein structures in general and protein complexes in particular, and has shed light on some of the factors involved in protein association.
Results and discussion
The concept of networks in biology has been explored in the areas of protein interaction networks, metabolic networks, etc. The idea of considering protein structures as a network of amino acid connections is relatively new and has provided insights into protein structure, stability and folding. For instance, Vendruscolo et al. and Dokholyan et al. [43–45] have used a similar approach to understand protein folding, whereas Atilgan et al. and Green and Higman have represented protein structures as amino acid networks to analyze residue fluctuations and stability of the protein structures. Del Sol and O'Meara have analyzed protein complexes as small-world networks where the central residues in the interfaces correlate with experimental hot spots [37, 38]. We have previously used a similar network representation to understand the factors affecting protein stability, where the amino acid residues are the nodes in the protein structure network and the strength of the non-covalent interactions between them is used as the edge-determining criterion. In the present work, this approach has been extended from protein tertiary structures to protein quaternary structures so as to understand the factors responsible for protein association. We have extracted the interface cluster (a set of connected residues) and hub (a highly connected residue) information from the network representation of protein multimers as explained in the methods section. This has given insights into the role of specific amino acid residues in stabilizing inter-subunit interfaces. The hubs in many real-world networks are known to provide robustness to the networks against random attack. However, targeted attacks on these hubs are known to destabilize the networks. In the multimeric protein structure networks, the interface hubs can be considered as the centers providing stability to these networks due to their extensive interactions and their presence at the oligomeric interface.
Hence, the mutation of a hub can lead to the destabilization of the interface. Therefore, the hubs can be identified as hot spots at protein interfaces that can be targeted for interface de-stabilizing mutations.
A non-redundant set of 455 protein oligomers is used in this study. The oligomeric protein structures as a whole are represented as graphs, with each amino acid as a node and the strength of non-covalent interactions (I, evaluated as given in the methods section) between them determining the edges. Those amino acid pairs with interaction strength greater than a user-defined cutoff (Imin) are connected by edges. Such graphs, generated at various Imin values, are analyzed in this section to understand the details of protein-protein interfaces at the network level. Specifically, (1) an analysis of the interface clusters (defined as distinct clusters of amino acid residues with contributions from more than one chain of the protein oligomer) and interface hubs (defined as amino acid residues interacting with five or more residues, at least one of which belongs to a different chain) is presented. (2) The strength of interface interaction, as measured from the clusters and hubs identified at different Imin values, is compared with the experimentally determined dissociation constants for known complexes. Finally, (3) the relevance of interface hubs to the stability of the oligomer is pointed out by comparing some of the identified interface hubs with experimental results.
Analysis of interface clusters
Correlation of interface clusters with loss of accessible surface area and composition of interface clusters
Interface clusters have been identified and analyzed for the loss of accessible surface area, the interface cluster composition and the strength of the interface clusters based on Imin and the number of residues participating in interface clusters. The results of these investigations are summarized in the two figures in the additional material (Additional file 1, Figures A1 and A2). The comparison of the residues that formed interface clusters at Imin = 6% with those that lost accessible surface area on oligomerization (δASA) showed a very good correlation (correlation coefficient = 0.83, Figure A1), indicating that the clusters identified at Imin = 6% are a good representation of the oligomeric interfaces. Hence, all generic cluster analyses are carried out at this Imin. This correlation decreases when Imin is either increased or decreased: higher Imins give specific strong clusters that fail to represent the complete interface, and at lower Imins the monomeric protein core also becomes a part of the interface cluster. The interface cluster composition at Imin = 6% also correlated very well with the residue composition obtained from δASA calculations (Figure A2), with preference for residues like Arginine, Histidine, Tryptophan, Tyrosine and Phenylalanine, though other residues are not left out. Such preferences have also been observed in several earlier interface analyses [7, 15, 33, 36]. In addition, the present investigation has provided information regarding the size and strength of oligomeric protein interfaces, through parameters such as the number of interface clusters, the number of residues constituting the interface clusters and the size of the largest interface cluster. This is discussed in detail in a later section where experimental dissociation constants are compared with the amino acid cluster and hub results from our analysis.
Largest cluster analysis
Identification of interface patches
Analysis of interface hubs
Hub composition at interfaces
It is evident from Figure 3 that Arginine, Tryptophan, Tyrosine, Phenylalanine, Histidine and Methionine are highly preferred as hubs at the protein interfaces at both higher and lower Imins, making them the strong interface hubs. The interface hub preferences of hydrophobic Leucine, Isoleucine and Valine are seen at lower Imins, making them the weak interface hubs. This overall profile is similar to the non-interface hub preference profile (Figure 3). However, there are some differences between the interface and non-interface hub preferences, as can be seen from Figure 3. These include the fact that the interface hub preferences for the hydrophobic and aromatic residues are much lower than those in the non-interface regions at the same Imin. Further, the interface hub preferences of the charged residues are comparable to their non-interface hub preferences, though the non-interface regions are much larger than the interface regions. The differences between the preferences in the interface and the non-interface hubs become pronounced at higher Imins, with the predominance of Arginine and other charged amino acids in the interface hubs, whereas the aromatic residues predominate in the non-interface hubs at higher Imins. The percentage of charged hubs is much higher in the interface regions than in the non-interface regions, and the percentage of aromatic and hydrophobic hubs is higher in the non-interface regions than in the interface regions. Further, Arginine seems to contribute more at the interface at both high and low Imins, ahead of the aromatic and hydrophobic amino acids (except for a slight preference for Tyrosine and Tryptophan over Arginine at Imin = 0%), unlike the non-interface hubs, where either the hydrophobic or aromatic residues or both are preferred ahead of Arginine at any Imin. This shows that the protein interfaces have major contributions from the charged amino acid residues.
The preference of Arginine and charged interactions at the protein interfaces has also been shown by a few previous analyses [7, 15, 17, 33, 36]. The present analysis also confirms this aspect with the charged interactions dominating the interfaces to a large extent.
Another important observation that can be made by comparing Figure A2 (see Additional file 1) and Figure 3 is that although residues like Leucine, Isoleucine, Valine and Lysine are found significantly in the interface clusters even at higher Imins, they are not preferred as interface hubs at these Imins. Hence, there is a marked difference between the residue preferences in interface clusters and the hub preferences at the interfaces.
Preferences of hub-forming residues to interact with other residue types
Table 1. Preferences of interface hubs to interact with other residues at Imin = 4%
It is to be noted that a similar 20 × 20 matrix for the non-interface hubs at Imin = 4% shows a different profile (see Additional file 1, Table A2), where the Arg-Arg, His-Asn, Leu-Arg, Tyr-charged and Tyr-polar interaction preferences are much lower than those observed for the interface hubs shown in Table 1. In the non-interface hubs, the Tyr-aromatic and Tyr-hydrophobic interactions are preferred over Tyr-charged or Tyr-polar interactions. Similarly, in the non-interface hubs, Arg-aromatic interactions are preferred over Arg-Arg interactions, and Leu-Leu and Leu-Phe are preferred over Leu-Arg.
Interactions of interface hubs
We have seen from Figure 2 and Table 1 that Arginine, Histidine and Tyrosine form some of the important hubs in the protein interfaces with some interesting interacting partners. We will discuss the interactions of some of these interface hubs in this section.
(a) Arginine hubs
(b) Tyrosine hubs
One of the significant contributions to the interface hubs comes from the Tyrosine hubs, which make extensive interactions with charged and polar residues like Arginine, Aspartate, Asparagine, Glutamate and Glutamine, apart from the expected interactions with the other aromatic residues and with Tyrosine itself, as can be seen from Table 1. The interactions of Tyrosine with charged and polar residues are generally due to hydrogen bonding or cation-π interactions. Figure 4b shows an example of an interface Tyrosine hub (Tyr 275) making different kinds of interactions, including a short hydrogen bond involving the hydroxyl group (with Arg 282, donor-acceptor distance = 2.52 Å) at Imin = 4%. (Tyrosine is also known to contribute to the stability of protein tertiary structure by means of short hydrogen bonds.) This Tyrosine residue also interacts with a Serine (279), Valine (141), Glutamine (278) and Asparagine (114), with Asparagine and Valine being from the other chain. Thus, we find that the Tyrosine residue is also versatile in its interactions due to its planar de-localized side chain and the hydroxyl group.
Statistics of hub versus non-hub interactions at the interface
Table 2. Statistics of interface interactions at Imin = 4% and Imin = 0%. Categories reported at each Imin: charged+polar interactions (salt bridges) and hydrophobic interactions (aromatic-aromatic interactions).
Correlation with experiments
Correlation of interface clusters and hubs with dissociation constants
Table 3. Comparison of interface clusters and hubs with experimental dissociation constants of oligomeric proteins. Quantities reported: number of interface hubs and size of the largest interface cluster. Complexes listed include the Elongation Factor EF-Tu/EF-Ts complex, Rac-ExoS GAP domain and Growth hormone-Receptor.
Correlation of interface hubs with ΔΔG
Table 4. Hot spot predictions from interface hub analysis on protein complexes at Imin = 4%. Columns: predicted mutations in monomer 1; predicted mutations in monomer 2.
35L, 40K, 95R, 117Q
RNase I 1
35D, 63R, 346Q, 349N
57H, 99L, 190S
280N, 368D, 370E, 469R
35Nh5, 45Lh, 50Mh, 103Lh, 106Wh, 36Yl5, 89Ql, 91Fl, 96Rl
47Wh, 99Yh, 103Wh, 36Yl, 92Nl
35Nh, 45Lh, 40Mh, 103Lh, 106Wh, 36Yl, 44Pl, 89Ql, 91Fl, 96Rl
128Fh, 164Mh, 208Yh, 230Rh, 95Nl, 101Yl, 118Yl
19F, 74R, 96N, 100F, 147F
33Yh, 39Kh, 50Yh, 98Wh, 103Wh, 166Fh, 32Nl, 50Yl, 94Wl, 96Yl, 121Sl, 123El, 135Fl
8R, 9L, 12N, 16R, 41K
150H, 152D, 197V, 200Y, 217R, 218N
72N, 75N, 84S, 86F, 97K
43L, 101Y, 108F
152I, 169K, 171N, 190Q, 192V, 201K
32Yl, 36Yl, 50Yl, 91Hl, 135Fl, 137Nl, 33Yh, 35Hh, 45Lh, 50Lh, 52Dh, 59Ih, 102Yh, 103Yh, 104Fh, 147Kh, 170Fh
Of the residues that have already been mutated, ten are found to be hubs even at Imin = 6%. Five of these have ΔΔG ≥ 4 kcal/mol and the other five have ΔΔG between 1 and 4 kcal/mol; none has ΔΔG < 1 kcal/mol. One of the residues in the Trypsin-BPTI complex (Lysine 15) is known to have a ΔΔG ≈ 10 kcal/mol, and this residue remains a hub even at Imin = 8%. It is the only residue with such a high ΔΔG value and also the only one to remain a hub at Imin = 8%.
Surprisingly, a large number of the interface hubs identified by the present method in these complexes have not been mutated (not shown in figure). These include quite a few strong hubs identified at Imin = 4% (84 in number). These are listed in Table 4 and are potential hot spots in these protein complexes, which can be mutated to destabilize the protein interface. It would be interesting to verify these predictions experimentally, which would then establish this as a rational method for the design of mutants that disrupt protein-protein interfaces.
The oligomeric protein structures have been represented as networks, with amino acid residues as nodes and with edges constructed on the basis of non-covalent interaction strength (ranging from a cutoff of 0% to 6%) between amino acids. The analysis is focused on characterizing the interface clusters and hubs.
The interfaces have been characterized as strong if the largest cluster in the protein appears at the interface at high (6%) interaction strength. Interestingly, more than 50% of the complexes in the dataset exhibit such strong interfaces. The interface clusters identified and their amino acid composition correlate with those identified in previous studies as well as from δASA calculations.
The composition and the connections of the highly connected interface hubs have been evaluated at varying interaction strengths and compared with those of the non-interface hubs. The interfaces show an increase in Arginine hubs and a decrease in hydrophobic hubs when compared to the non-interface hubs. The hydrophobic residues, though present in the interface clusters, do not contribute to the interface hubs. Further, the interface hubs make the usual interactions such as salt bridges, stacking interactions and hydrogen bonds as well as unusual interactions such as Arginine-Arginine interactions. The hub and non-hub interactions at the interfaces also show specific profiles, with the hub interactions being dominated by hydrophobic interactions at lower interaction cutoffs and by charged interactions at higher interaction cutoffs, whereas the non-hub interactions are dominated by charged interactions at all cutoffs. More importantly, the cluster and hub identification procedure picks up all types of interactions in a consolidated way, giving a global view of the interactions at the interface.
The interface clusters and hubs identified correlate well with the experimentally determined dissociation constants for known complexes, indicating that the method is robust in identifying the strength of oligomeric protein interfaces. Finally, the hubs at high interaction strength have been identified as hot spots by comparison with the ΔΔG values from alanine scanning mutagenesis experiments. Several strong hubs that have not been mutated have been predicted to be hot spots and await confirmation from future experiments.
Materials and methods
The dataset consists of a non-redundant set of 455 protein multimer structures with resolution better than 2 Å, obtained from the Protein Data Bank. The dataset list is provided in Table A1 in Additional file 1. The sequence identity of the selected proteins is less than 25%. In the cases where the full multimer coordinates were not provided, they were generated from the rotation matrices and translation vectors. The dataset includes dimers and multimers of all types, such as homo, hetero, functional as well as crystallographic multimers. Of the 455 oligomers, 44 (<10%) are crystal dimers, as obtained from the BIOLOGICAL_UNIT record of the PDB file and the protein quaternary structure server. These proteins are indicated in Table A1 (see Additional file 1). The size of the monomers varies from 50 to 1000 residues and that of the multimers varies from 100 to 2500 residues.
Accessible surface area
The loss of accessible surface area upon dimerization/multimerization was calculated from the residue-wise accessible surface area of the multimeric proteins and that of their respective monomers, which were obtained from NACCESS. The multimer values were normalized to those of the dimers. The residues that lose more than 1% of their accessible surface area upon dimerization were identified as those contributing to the interface from δASA calculations.
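The δASA criterion above can be sketched as a simple per-residue threshold. This is a minimal illustration, not the authors' program: the function name, the dictionary inputs keyed by (chain, residue number) and the choice to express the loss as a percentage of the monomer value are all assumptions made for the example, and the ASA values themselves would come from a tool such as NACCESS.

```python
def interface_residues_by_asa(asa_monomer, asa_multimer, min_loss_percent=1.0):
    """Return residues whose accessible surface area drops by more than
    min_loss_percent on going from the isolated monomer to the multimer.

    Both arguments map a residue key, here (chain, residue_number), to an
    ASA value in square angstroms (hypothetical input format).
    """
    interface = []
    for residue, asa_free in asa_monomer.items():
        if asa_free <= 0.0:
            # A fully buried residue has no surface to lose.
            continue
        loss_percent = 100.0 * (asa_free - asa_multimer[residue]) / asa_free
        if loss_percent > min_loss_percent:
            interface.append(residue)
    return interface


# Made-up ASA values for three residues of chain A.
free = {("A", 10): 100.0, ("A", 11): 50.0, ("A", 12): 0.0}
bound = {("A", 10): 60.0, ("A", 11): 50.0, ("A", 12): 0.0}
selected = interface_residues_by_asa(free, bound)
```

In this toy input only residue A10 loses surface area (40%), so it is the only residue selected.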
Protein structures have been considered as a network of interactions amongst amino acid residues. Each residue in a protein complex is considered as a node in the graph and the connections between these nodes are the edges. A group of interconnected nodes is defined as a cluster, and a cluster with at least one residue belonging to a different protein chain in the multimer is denoted as an interface cluster. Contact number is defined as the number of edges made by a node, and those nodes with a contact number greater than 4 (unless otherwise specified) have been identified as hubs. A hub connected to at least one residue belonging to a different protein chain in the multimer is denoted as an interface hub.
Evaluation of non-covalent interaction
The non-covalent interactions between side chain atoms of amino acid residues (with the exception of Glycine, where the Cα atom is taken) are considered. The interactions between sequence neighbors, however, have been ignored. The interaction between two residues i and j has been quantified as defined by Kannan and Vishveshwara:
Iij = (nij/N) × 100
where nij is the number of atom pairs belonging to the side chains of i and j that come within a distance of 4.5 Å of each other, and N is the normalization value for the amino acid type, which has been evaluated previously from a non-redundant set of proteins and also correlates with the size of the residue. The choice of N depends on the analysis. For cluster identification, the graph must be symmetric, so the normalization values of both residues i and j enter the evaluation of Iij. We have tried different combinations of the normalization values in this case, such as sqrt(Ni × Nj), (Ni + Nj)/2 and min(Ni, Nj). Since they give qualitatively very similar results, we use the lesser of the two values, min(Ni, Nj), for cluster identification. For hub detection there is no such constraint, and hence we use the normalization value Ni of the residue i whose hub character is being evaluated.
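The evaluation of Iij described above can be illustrated with a short sketch. It assumes side-chain atoms are given as (x, y, z) tuples in Å and that a contact means a distance of at most 4.5 Å; the function names, the `mode` argument and the normalization values used in the example are hypothetical and are not the values evaluated by Kannan and Vishveshwara.

```python
import math


def count_contacts(atoms_i, atoms_j, cutoff=4.5):
    """Count side-chain atom pairs (one atom from each residue) that lie
    within `cutoff` angstroms of each other."""
    return sum(
        1
        for a in atoms_i
        for b in atoms_j
        if math.dist(a, b) <= cutoff
    )


def interaction_strength(n_ij, norm_i, norm_j, mode="cluster"):
    """Interaction strength in percent, I_ij = (n_ij / N) * 100.

    mode="cluster": N = min(N_i, N_j), giving a symmetric value.
    mode="hub":     N = N_i, the residue whose hub character is evaluated.
    """
    if mode == "cluster":
        norm = min(norm_i, norm_j)
    elif mode == "hub":
        norm = norm_i
    else:
        raise ValueError("mode must be 'cluster' or 'hub'")
    return 100.0 * n_ij / norm


# Toy coordinates: two atoms per residue, two pairs within 4.5 angstroms.
side_chain_i = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
side_chain_j = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
n_ij = count_contacts(side_chain_i, side_chain_j)

# Hypothetical normalization values N_i = 50 and N_j = 80.
i_cluster = interaction_strength(4, 50.0, 80.0, mode="cluster")
i_hub_for_j = interaction_strength(4, 80.0, 50.0, mode="hub")
```

With the hub-mode normalization the value is generally not symmetric: the same four contacts give a different strength depending on which residue is being tested as a hub.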
Contact criterion on the basis of interaction strength
We choose an interaction cutoff, referred to as Imin, and any non-sequential residue pair ij with an Iij value greater than the chosen Imin is connected by an edge in the graph. Such a graph is referred to as a protein structure graph for a given interaction strength Imin. The protein structure graphs are generated for all the multimers in the dataset using Imin values ranging from 0 to 10%. Physically, a higher Imin retains only strong interactions between the connected residues, whereas a lower Imin includes the weakly interacting residues as well. For instance, at Imin = 0% even a single atom-atom contact between the side chains of two residues is sufficient to connect them by an edge in the protein structure graph, and more contacts are required for connections at higher Imins. The interface clusters and hubs were identified and analyzed in these protein structure graphs at varying Imins. Finally, an Imin of 6% was chosen for interface cluster analyses because of its better correlation with the results from δASA, and Imin values of 0% to 4% were chosen for interface hub analyses so as to obtain a statistically significant number of hubs.
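The contact criterion above amounts to thresholding a table of pairwise interaction strengths. The following sketch assumes residues are identified by (chain, residue number) and that "sequence neighbors" means adjacent positions on the same chain; both the data layout and the function names are illustrative assumptions.

```python
def is_sequence_neighbor(res_a, res_b):
    """True when two residues sit at adjacent positions on the same chain."""
    return res_a[0] == res_b[0] and abs(res_a[1] - res_b[1]) == 1


def build_edges(strengths, imin):
    """Return the set of edges (residue pairs) whose interaction strength
    is strictly greater than imin, skipping sequence neighbors.

    `strengths` maps a pair of residue keys to an I_ij value in percent.
    """
    edges = set()
    for (res_a, res_b), i_ab in strengths.items():
        if is_sequence_neighbor(res_a, res_b):
            continue
        if i_ab > imin:
            # Store each edge in a canonical order so (a, b) == (b, a).
            edges.add(tuple(sorted((res_a, res_b))))
    return edges


# Made-up interaction strengths for a two-chain toy complex.
strengths = {
    (("A", 1), ("A", 2)): 9.0,   # sequence neighbors, always ignored
    (("A", 1), ("B", 5)): 6.5,
    (("A", 3), ("B", 5)): 4.0,
    (("A", 1), ("A", 7)): 0.0,   # no contacts at all
}
edges_strong = build_edges(strengths, imin=6.0)
edges_all = build_edges(strengths, imin=0.0)
```

Because the comparison is strict, a pair with no contacts (Iij = 0) is not connected at Imin = 0%, while a pair with a single contact is.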
Cluster and hub analysis
The protein structure graphs have been represented as an adjacency matrix, which is an N × N matrix, where N is the number of residues in the protein structure. Each ijth element in the matrix is either 1 or 0 depending on whether or not the two nodes (residues) are connected (interacting) at the chosen Imin. The diagonal elements are set to 0 since self-connections are not considered. The amino acid residues forming disjoint clusters (with a minimum of three residues in each) are identified from the adjacency matrix using a standard graph algorithm, depth first search (DFS). This gives the clusters of all the interacting residues in the protein structure, from which the interface clusters are selected.
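The cluster identification step can be sketched with an iterative depth first search over the graph. This is an illustrative implementation under assumptions, not the DFS program used in the study: it stores the graph as neighbor sets rather than an explicit N × N matrix, identifies residues by (chain, residue number), and uses hypothetical function names.

```python
def find_clusters(nodes, edges, min_size=3):
    """Return connected components with at least min_size residues,
    found by an iterative depth first search."""
    neighbors = {node: set() for node in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    visited = set()
    clusters = []
    for start in nodes:
        if start in visited:
            continue
        stack = [start]
        visited.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for other in neighbors[node]:
                if other not in visited:
                    visited.add(other)
                    stack.append(other)
        if len(component) >= min_size:
            clusters.append(sorted(component))
    return clusters


def interface_clusters(clusters):
    """Keep the clusters that contain residues from more than one chain."""
    return [c for c in clusters if len({chain for chain, _ in c}) > 1]


# Toy graph: one cross-chain cluster, one single-chain cluster, one pair.
nodes = [("A", 1), ("A", 5), ("B", 3), ("A", 20), ("A", 25), ("A", 30),
         ("B", 40), ("B", 44)]
edges = [(("A", 1), ("A", 5)), (("A", 5), ("B", 3)),
         (("A", 20), ("A", 25)), (("A", 25), ("A", 30)),
         (("B", 40), ("B", 44))]
clusters = find_clusters(nodes, edges)
at_interface = interface_clusters(clusters)
```

The two-residue component (B40, B44) is discarded by the minimum size of three, and only the cluster spanning chains A and B is kept as an interface cluster.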
Similarly, the residues with a contact number greater than 4 are detected as hubs, from which the interface hubs are identified. The hub definition is relaxed to a contact number equal to or greater than 4 when investigating the interface hubs of single multimeric complexes in detail, as given in Tables 3 and 4, in order to obtain a statistically significant number of hubs for analysis. The interfacial hub preferences of amino acid residues and the preferences of the residues with which these hubs interact are obtained and compared with similar properties of the non-interface hubs and of the non-hubs at interfaces, identified from the same dataset.
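Hub detection as described above reduces to counting the connections of each residue and checking whether any partner lies on another chain. The sketch below assumes the connections of each residue have already been determined (with Iij normalized by the value of the residue under evaluation) and are supplied as a dictionary of partner sets; the names and data layout are hypothetical.

```python
def find_hubs(partners_by_residue, min_contacts=5):
    """Classify residues with at least min_contacts connections as hubs.

    min_contacts=5 corresponds to a contact number greater than 4;
    min_contacts=4 corresponds to the relaxed definition.
    Returns a dict mapping each hub to "interface" or "non-interface".
    """
    hubs = {}
    for residue, partners in partners_by_residue.items():
        if len(partners) < min_contacts:
            continue
        chain = residue[0]
        crosses_interface = any(p[0] != chain for p in partners)
        hubs[residue] = "interface" if crosses_interface else "non-interface"
    return hubs


# Toy connectivity for residues identified as (chain, residue_number).
partners = {
    ("A", 10): {("A", 2), ("A", 5), ("A", 30), ("B", 7), ("B", 9)},
    ("A", 50): {("A", 40), ("A", 44), ("A", 60), ("A", 63), ("A", 70)},
    ("B", 7): {("A", 10), ("B", 1), ("B", 3), ("B", 12)},
}
hubs_default = find_hubs(partners)
hubs_relaxed = find_hubs(partners, min_contacts=4)
```

Under the default definition B7, with four connections, is not a hub; under the relaxed definition it becomes an interface hub because one of its partners is on chain A.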
Size of the largest cluster
One of the most common parameters used in the analysis of complex networks is the size of the largest cluster. Here, we have used this parameter to analyze the structure networks of protein oligomers. At each Imin, the clusters in the protein oligomers are obtained using DFS, and the size of the largest cluster is measured as the number of residues constituting it. This size has been found to be a function of protein size, and hence it is normalized with respect to the protein size and plotted as a function of Imin. The largest cluster size decreases as the Imin increases, and the largest cluster obtained at a higher Imin may or may not be present at the oligomeric interface. All the proteins in the data set are analyzed to find out whether or not the largest cluster is at the interface at Imin = 6%. This provides an idea of the strength of the oligomeric interface with respect to its monomeric protein core.
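The normalized largest cluster size as a function of Imin can be sketched as below. This is an illustration under simplifying assumptions: the Iij values are supplied as a precomputed symmetric matrix, sequence neighbours are not excluded, and no minimum cluster size is imposed, so an isolated residue counts as a cluster of size one. The function name and the toy values are hypothetical.

```python
def largest_cluster_fraction(interaction, imin):
    """Size of the largest connected cluster divided by the number of residues."""
    n = len(interaction)
    seen = [False] * n
    largest = 0
    for start in range(n):
        if seen[start]:
            continue
        # Depth first search over pairs whose Iij exceeds Imin.
        seen[start] = True
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for other in range(n):
                if (other != node and not seen[other]
                        and interaction[node][other] > imin):
                    seen[other] = True
                    stack.append(other)
        largest = max(largest, size)
    return largest / n


# Hypothetical toy interaction matrix for five residues (values in percent).
interaction = [[0.0] * 5 for _ in range(5)]
for i, j, value in [(0, 1, 8.0), (1, 2, 7.0), (2, 3, 3.0), (3, 4, 5.0)]:
    interaction[i][j] = interaction[j][i] = value

imins = [0, 2, 4, 6, 8, 10]
profile = [largest_cluster_fraction(interaction, imin) for imin in imins]
```

For these toy values the normalized size never increases with Imin, in line with the behaviour described above.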
Acknowledgements
We acknowledge the Computational Genomics Initiative at the Indian Institute of Science, funded by the Department of Biotechnology (DBT), India, for support. KVB would like to thank the Council of Scientific and Industrial Research (CSIR), India for the award of a fellowship. We also acknowledge Rakesh Kumar Pandey for providing the DFS program and generating the oligomer dataset.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.