Research | Open | Published:
HORI: a web server to compute Higher Order Residue Interactions in protein structures
BMC Bioinformaticsvolume 11, Article number: S24 (2010)
Folding of a protein into its three dimensional structure is influenced by both local and global interactions within a protein. Higher order residue interactions, like pairwise, triplet and quadruplet ones, play a vital role in attaining the stable conformation of the protein structure. It is generally agreed that higher order interactions make significant contribution to the potential energy landscape of folded proteins and therefore it is important to identify them to estimate their contributions to overall stability of a protein structure.
We developed HORI [Higher order residue interactions in proteins], a web server for the calculation of global and local higher order interactions in protein structures. The basic algorithm of HORI is designed based on the classical concept of four-body nearest-neighbour propensities of amino-acid residues. It has been proved that higher order residue interactions up to the level of quadruple interactions plays a major role in the three-dimensional structure of proteins and is an important feature that can be used in protein structure analysis.
HORI server will be a useful resource for the structural bioinformatics community to perform analysis on protein structures based on higher order residue interactions. HORI server is a highly interactive web server designed in three modules that enables the user to analyse higher order residue interactions in protein structures. HORI server is available from the URL: http://caps.ncbs.res.in/hori
Derivation of a three-dimensional structure of a protein from its primary sequence is controlled by a complex and largely unknown set of principles called as folding code. Folding principles of a three-dimensional structure is largely under the influence of both global and local interactions. For example, pairwise, triplet and quadruple based higher order residue interactions play a crucial role to attain the stable conformation of the protein structures. Higher order residue interactions also contribute to the potential energy landscape of proteins and hence it is important to understand such interactions mediated in the level of active site residues to whole structure [1–7]. In the current era of high-throughput sequencing, due to huge lacunae in the sequence to structure ratio, computational approaches are playing a significant role in understanding the design principles and functional aspects of protein structures [8–15]. In this paper, we describe the availability of a web server called HORI (Higher Order Residue Interactions in proteins) developed for the calculation of generic and specific higher order residue interaction patterns in protein structure. The basic algorithm of HORI is designed based on the classical concept of four-body nearest-neighbour propensities of amino acid residues. It has been proved that higher order residue interactions, up to the level of quadruple interactions, will play a major role in the three-dimensional structure of proteins. According to the earlier studies, if we approximate each residue as a sphere centred on its location, it is possible for a maximum of four closely packed spheres to make mutual contact, thus giving rise to pair wise, triplet and quadruple interactions. Just as no more than four same-sized spheres can be in mutual contact in 3D space, higher order interaction beyond quadruple interactions are generally not considered [3, 16]. The concept of higher order interactions has been introduced and successfully employed in structure analysis and fold recognition by different groups [17, 18]. Earlier work also reported that the higher order interactions can be used to improve accuracy of fold recognition and generic structure analysis [6, 19]. As HORI server can be used to compute higher order interactions in different levels from single residue to whole structure, analysis of higher order interactions mediated by residues in the functional or active sights will provide better insights to understand the structural interactions contributed by these important residues. We envisage that availability of a server to compute higher order interactions will enable the users to perform the computation of higher order interactions in easy steps. In this manuscript, we explain various feature of HORI server along with different example scenarios where the general application of the higher order interaction and the server is useful.
HORI server implementation: description and features
HORI - Global
HORI - Global provides set of programs for the computation of higher order interactions at the complete structural level. Using options in HORI - Global module, user can compute pairwise, triplet and quadruple interactions among the residues in a given protein structure. Here, the complete set of all possible interactions in each category will be computed. The higher and lower-cut distance cut-off for identifying probable higher order interactions are provided as a user-defined option. Preferred range for higher order residue interactions are 1 - 7 Angstroms. Using efficient utilization of different parameters, like atom-type for distance calculation, interaction type and distance cut-off, and user can derive interesting information on higher order interactions from the protein structure of interest. HORI - Global is the computationally intensive module available in the HORI server. This module is designed as an email based server due to computation intensive nature of the programs.
HORI - Lite & HORI - Cluster
Hori - Lite, the second module available from HORI server offers a set of programs for the computation of higher order interactions in a structure, based on specific residue numbers and specific distance. The third module, HORI - Cluster, offers a set of programs for computation of higher order interactions of different types of residues in a structure. Both HORI - Lite and HORI - Cluster provide nine different programs under the category of three different classes of higher order interactions. Programs are available in pairwise interactions class to compute pairwise distances for any two residues in PDB file and pairwise distance around any one residue. Triplet interaction class provides programs to compute triplet distance for any three residues, triplet distance for around any two residues and triplet distance around any one residue. Quadruplet interaction class provides options to compute quadruple Distance for any four, three, two or one residue in a PDB file.
All the three modules of HORI server require three dimensional co-ordinates of protein structures in PDB format. User can submit structures from PDB files and modelled structure files in PDB format. User can supply chain of interest from the various NMR structures. In the current version, HORI server can be used to analyse both single chain and multi-chain proteins. Due to computational-intensive nature of HORI-Global programs, currently, the server allows only two chains in the multi-chain based higher order interaction calculations. User of HORI - Global module should also submit a valid, non-commercial email address to the server to receive the notification about the availability of results. HORI server will send the result URL to the email address. In comparison to HORI - Global, both HORI - Lite and HORI - Clusters offer specific and faster computation of the higher order interactions in a protein structure. General parameters in different modules of HORI server are atom type, interaction type and range of distance to calculate interactions. Apart from these general parameters, user can also mention the range of residues (option available in HORI - Global), exact residue numbers (available as a parameter in HORI - Lite) and amino acid type (available as a parameter in HORI - Cluster).
HORI server computes pairwise, triplet and quadruple interactions within a protein structure based on different parameters provided by the user. Among the three modules available in HORI server, HORI - Global provides more detailed output. User can generate customised output based on pairwise, triplet and quadruple computations using HORI - Global. HORI - Global module also provides a tab-delimited text file of the results for the customised analysis of higher order interactions by the user. User can also visualize the interaction, either using pre-installed Rasmol  on local machines or on browser using Jmol plug-in . All three modules in HORI server provides output in html, tab-delimited text files for parsing and further analysis of higher-order interactions and visualization options to see individual interactions.
Results and discussion
Bioinformatics tools are widely used in the study of protein structures to understand structural, functional and interaction aspects of protein structures. Several tools are also available for the calculation of interaction, interface, bonding patterns, disulphide connectivity. In PDBWiki  various tools are listed to define or select interacting residues. For example, Protein Interaction Calculator  can be used to calculate several interaction parameters like intra-protein interactions, solvent accessibility, protein-protein interactions and depth calculations. Other tools like SCOPPI  can be used for analysis of protein-protein interface, LPC/CSU [25, 26] can be used for ligand-protein contacts & contacts of structural units. Irrespective of such wide array of structure tools for protein structure analysis, according to the best of our knowledge, HORI server is a primary attempt to provide a web server for the computation of higher order residue interactions in proteins in a whole structure as well residue-specific level.
Applications of HORI Server in protein structure analysis
Higher order interactions calculated using set of computationally intensive algorithms available in HORI server will be useful in fold prediction, protein modelling, protein-protein interaction, active site identification and to understand higher order interaction characteristics of active site residues within specified distance shells. Knowledge about the higher order interactions will be of great importance in structural biology due to its wide range of applications in fold recognition, structural analysis, protein engineering, protein-protein interactions, active site identification and to understand mechanism of action of enzymes [6, 17, 18]. In order to illustrate the usefulness of higher order interactions, we enumerate four different examples, in protein structure analysis contexts, where HORI server is used to analyse set of different single chain and multi-chain crystal structures from PDB.
Analysis of Higher order interactions in structures from TIM fold and Rossmann fold
Prior knowledge of spatial and higher order interactions will add more value to the interpretation of HORI results and to discriminate between folds of roughly similar secondary structural topology but dissimilar overall shape. To demonstrate where triplet and quadruplet interactions might be crucial, we take the example of two α/β folds from SCOP database [27, 28] with strikingly similar secondary structural topology but distinct overall spatial arrangements: triose phosphate isomerase (TIM) fold  and the doubly-wound dehydrogenase or Rossmann-fold . Despite the similar repeats of super secondary structural elements (βαβ), the TIM barrel is a closed structure with the first and eighth β-strand in hydrogen bonding, whereas the doubly-wound Rossmann folds have open structures with two distinct domains holding the entire polypeptide fold. Due to the similar secondary structural arrangements, most fold recognition methods have failed to distinguish the correct fold. When the pairwise interactions between the residues in the first two β strands in 1WYI (TIM fold)  and 1G5Q (Rossman fold)  are compared, they lie in the same range. The pairwise interaction between the Cβ atoms of residues 8 and 39 that lie in adjacent parallel strands in 1WYI (TIM fold) is found to be 5.17 Å while that between residues 7 and 34 in 1G5Q (Rossman fold) is 4.49 Å. While the topology appears to be similar, the higher order interactions exhibit distinct profiles for the two proteins. While the triplet interactions between Cβ atoms of residues 7, 21, 34 in 1G5Q lie within 10 Å, it is not so in the case of the triplet formed between Cα atoms of residues 8, 28, 39 in 1WYI. In the case of 4MDH, which is a Rossman fold protein, there are no triplet interactions observed between adjacent β strands and the joining α helix. For example, the only triplet interaction observed for Cα atom of residue 8 is with residue 38 of another β strand and residue 76 of α helix. On the other hand, in 1WYI, residue 8 is seen forming triplet interactions with adjacent β strands on either side, forming the triplet residues-8, 231, 244 and residues-8, 28, 39. Graphical view of the triplet interactions in 1WYI, 1G5Q and 4MDH is provided in Figure 3. HORI server can be used in the analysis of distance based interactions within proteins and proves to be an effective means of distinguishing protein folds through its map of higher order interactions.
Identification of intramolecular higher order interaction mediated by a cysteine residue in Crambin
Crambin  is a member of Crambin-like superfamily plant seed protein with three disulphide bonds. We have used HORI Server to compute quadruple distances around 1 cysteine residue in Crambin structure (PDB ID: 1CRN) with in the distance range of 1-7 Angstroms. This option will provide an insight in to all intramolecular quadruple interactions mediated by the six cysteine residues in crambin . A detailed screenshot of HORI server with results of HORI-Global runs on crambin (PDB ID: 1CRN) is provided in Figure 4.
Identification of alternate active site residues and suitability of residues for mutational studies based on higher order residue interaction
Cutinase  is a serine esterase containing the classical Ser, His, Asp triad of serine hydrolases . Catalytic Site Atlas  reports five active site residues based on homologous entries, found by PSI-BLAST  alignment to one of the PDB entries (PDB ID: 1AGY). We used two of these residues to identify potential third interacting residue (using the compute triplet distances option) around any two residues in PDB file (PDB ID: 1CEX). Approaches like this can be useful to identify potential alternate active site residues and residues suitable for mutational studies, based on number of intermolecular interactions contributed by residues in active site regions.
Analysis of higher order interaction in a multi-chain protein involved in 3D domain swapping
CD2 is shown to have ability to fold in two ways as a monomer or as a swapped dimer [40, 41]. We have performed HORI-Global based computation of higher order residue interactions using the two chains of CD2 structure (A and B chains of PDB ID: 1A64). The higher order interactions within the cut-off of 0-8 Å clearly indicate that the swapped structure is stabilised by several higher order interactions between the residues in chains A and B .
HORI server provides a landscape of all possible higher order residue interactions in protein structures. The information provided by HORI server will be important to understand the role of higher order residue interaction in stability, to recognise alternate patches of functionally important residues, structural integrity and folding properties of modelled and experimentally solved protein structures. Availability of HORI server in the public domain will enable the structural bioinformatics community to analyze and study higher order interaction patterns from protein structure data in easier way and gain better insight about the structure. This can also aid the design of mutation experiments for biochemists and biologists. By providing various options in three different modules, HORI server offers a complete computing platform online for higher order residue interactions and for the analysis of protein structures.
Levinthal C: Are there pathways for protein folding? J Chim Phys 1968, 65: 44.
Anfinsen CB: The formation and stabilization of protein structure. Biochem J 1972, 128(4):737–749.
Betancourt MR, Thirumalai D: Pair potentials for protein folding: choice of reference states and sensitivity of predicted native states to variations in the interaction schemes. Protein Sci 1999, 8(2):361–369.
Munson PJ, Singh RK: Statistical significance of hierarchical multi-body potentials based on Delaunay tessellation and their application in sequence-structure alignment. Protein Sci 1997, 6(7):1467–1481. 10.1002/pro.5560060711
Pace CN, Shirley BA, McNutt M, Gajiwala K: Forces contributing to the conformational stability of proteins. FASEB J 1996, 10(1):75–83.
Sippl MJ: Knowledge-based potentials for proteins. Curr Opin Struct Biol 1995, 5(2):229–235. 10.1016/0959-440X(95)80081-6
Rojnuckarin A, Subramaniam S: Knowledge-based interaction potentials for proteins. Proteins 1999, 36(1):54–67. 10.1002/(SICI)1097-0134(19990701)36:1<54::AID-PROT5>3.0.CO;2-B
Johnson MS, Srinivasan N, Sowdhamini R, Blundell TL: Knowledge-based protein modeling. Crit Rev Biochem Mol Biol 1994, 29(1):1–68. 10.3109/10409239409086797
Laskowski RA, Thornton JM: Understanding the molecular machinery of genetics through 3D structures. Nat Rev Genet 2008, 9(2):141–151. 10.1038/nrg2273
Lee D, Redfern O, Orengo C: Predicting protein function from sequence and structure. Nat Rev Mol Cell Biol 2007, 8(12):995–1005. 10.1038/nrm2281
Kristensen DM, Ward RM, Lisewski AM, Erdin S, Chen BY, Fofanov VY, Kimmel M, Kavraki LE, Lichtarge O: Prediction of enzyme function based on 3D templates of evolutionarily important amino acids. BMC Bioinformatics 2008, 9: 17. 10.1186/1471-2105-9-17
Russ WP, Ranganathan R: Knowledge-based potential functions in protein design. Curr Opin Struct Biol 2002, 12(4):447–452. 10.1016/S0959-440X(02)00346-9
Poole AM, Ranganathan R: Knowledge-based potentials in protein design. Curr Opin Struct Biol 2006, 16(4):508–513. 10.1016/j.sbi.2006.06.013
Thomas J, Ramakrishnan N, Bailey-Kellogg C: Graphical models of protein-protein interaction specificity from correlated mutations and interaction data. Proteins 2009, 76(4):911–929. 10.1002/prot.22398
Thomas J, Ramakrishnan N, Bailey-Kellogg C: Graphical models of residue coupling in protein families. IEEE/ACM Trans Comput Biol Bioinform 2008, 5(2):183–197. 10.1109/TCBB.2007.70225
Singh RK, Tropsha A, Vaisman II: Delaunay tessellation of proteins: four body nearest-neighbor propensities of amino acid residues. J Comput Biol 1996, 3(2):213–221. 10.1089/cmb.1996.3.213
Godzik A, Kolinski A, Skolnick J: Topology fingerprint approach to the inverse protein folding problem. J Mol Biol 1992, 227(1):227–238. 10.1016/0022-2836(92)90693-E
Xu J, Li M, Kim D, Xu Y: RAPTOR: optimal protein threading by linear programming. J Bioinform Comput Biol 2003, 1(1):95–117. 10.1142/S0219720003000186
Krishnamoorthy B, Tropsha A: Development of a four-body statistical pseudo-potential to discriminate native from non-native protein conformations. Bioinformatics 2003, 19(12):1540–1548. 10.1093/bioinformatics/btg186
Herráez A: Biomolecules in the Computer: Jmol to the rescue. Biochem Educ 2006, 34: 7.
Sayle RA, Milner-White EJ: RASMOL: biomolecular graphics for all. Trends Biochem Sci 1995, 20(9):374. 10.1016/S0968-0004(00)89080-5
Tina KG, Bhadra R, Srinivasan N: PIC: Protein Interactions Calculator. Nucleic Acids Res 2007, (35 Web Server):W473–476. 10.1093/nar/gkm423
Winter C, Henschel A, Kim WK, Schroeder M: SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res 2006, (34 Database):D310–314. 10.1093/nar/gkj099
Sobolev V, Eyal E, Gerzon S, Potapov V, Babor M, Prilusky J, Edelman M: SPACE: a suite of tools for protein structure prediction and analysis based on complementarity and environment. Nucleic Acids Res 2005, (33 Web Server):W39–43. 10.1093/nar/gki398
Sobolev V, Sorokine A, Prilusky J, Abola EE, Edelman M: Automated analysis of interatomic contacts in proteins. Bioinformatics 1999, 15(4):327–332. 10.1093/bioinformatics/15.4.327
Murzin AG, Brenner SE, Hubbard T, Chothia C: SCOP: a structural classification of proteins database for the investigation of sequences and structures. J Mol Biol 1995, 247(4):536–540.
Andreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: Data growth and its impact on the SCOP database: new developments. Nucleic Acids Res 2008, (36 Database):D419–425.
Wierenga RK: The TIM-barrel fold: a versatile framework for efficient enzymes. FEBS Lett 2001, 492(3):193–198. 10.1016/S0014-5793(01)02236-0
Rao ST, Rossmann MG: Comparison of super-secondary structures in proteins. J Mol Biol 1973, 76(2):241–256. 10.1016/0022-2836(73)90388-4
Kinoshita T, Maruki R, Warizaya M, Nakajima H, Nishimura S: Structure of a high-resolution crystal form of human triosephosphate isomerase: improvement of crystals using the gel-tube method. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005, 61(Pt 4):346–349. 10.1107/S1744309105008341
Blaesse M, Kupke T, Huber R, Steinbacher S: Crystal structure of the peptidyl-cysteine decarboxylase EpiD complexed with a pentapeptide substrate. EMBO J 2000, 19(23):6299–6310. 10.1093/emboj/19.23.6299
Birktoft JJ, Rhodes G, Banaszak LJ: Refined crystal structure of cytoplasmic malate dehydrogenase at 2.5-A resolution. Biochemistry 1989, 28(14):6065–6081. 10.1021/bi00440a051
Teeter MM: Water structure of a hydrophobic protein at atomic resolution: Pentagon rings of water molecules in crystals of crambin. Proc Natl Acad Sci USA 1984, 81(19):6014–6018. 10.1073/pnas.81.19.6014
HORI Results for 1CRN[http://caps.ncbs.res.in/hori/hori_results/er_hori_SatDec13185917IST2008.html]
Longhi S, Czjzek M, Lamzin V, Nicolas A, Cambillau C: Atomic resolution (1.0 A) crystal structure of Fusarium solani cutinase: stereochemical analysis. J Mol Biol 1997, 268(4):779–799. 10.1006/jmbi.1997.1000
Martinez C, De Geus P, Lauwereys M, Matthyssens G, Cambillau C: Fusarium solani cutinase is a lipolytic enzyme with a catalytic serine accessible to solvent. Nature 1992, 356(6370):615–618. 10.1038/356615a0
Porter CT, Bartlett GJ, Thornton JM: The Catalytic Site Atlas: a resource of catalytic sites and residues identified in enzymes using structural data. Nucleic Acids Res 2004, (32 Database):D129–133. 10.1093/nar/gkh028
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389–3402. 10.1093/nar/25.17.3389
Murray AJ, Head JG, Barker JJ, Brady RL: Engineering an intertwined form of CD2 for stability and assembly. Nat Struct Biol 1998, 5(9):778–782. 10.1038/1816
Liu Y, Eisenberg D: 3D domain swapping: as domains continue to swap. Protein Sci 2002, 11(6):1285–1299. 10.1110/ps.0201402
HORI Results for 1A64[http://caps.ncbs.res.in/hori/hori_results/er_hori_SatDec13182345IST2008.html]
R.S. was a Senior Research Fellow funded by the Wellcome Trust, U.K. We would like to thank Department of Biotechnology, India for partial financial support. We also thank National Centre for Biological Sciences (TIFR) for financial and infrastructural support.
This article has been published as part of BMC Bioinformatics Volume 11 Supplement 1, 2010: Selected articles from the Eighth Asia-Pacific Bioinformatics Conference (APBC 2010). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/11?issue=S1.
The authors declare that they have no competing interests.
S.G and R.S. designed the research components and applications of this project. The project was conceived and developed by P.S. R. Sreenivasan coded the core calculations. K. S. contributed to the coding, developed and compiled the scripts for automation and generation of the webserver. P. S, K. S and R. Sreenivasan wrote the first draft of the manuscript. S. G and R. S provided critical inputs to improve the manuscript.
Khader Shameer, Raashi Sreenivasan contributed equally to this work.