GIANT: pattern analysis of molecular interactions in 3D structures of protein–small ligand complexes
© Kasahara and Kinoshita; licensee BioMed Central Ltd. 2014
Received: 10 October 2013
Accepted: 10 January 2014
Published: 14 January 2014
Interpretation of binding modes of protein–small ligand complexes from 3D structure data is essential for understanding selective ligand recognition by proteins. It is often performed by visual inspection and sometimes largely depends on a priori knowledge about typical interactions such as hydrogen bonds and π-π stacking. Because it can introduce some biases due to scientists’ subjective perspectives, more objective viewpoints considering a wide range of interactions are required.
In this paper, we present a web server for analyzing protein–small ligand interactions on the basis of patterns of atomic contacts, or “interaction patterns” obtained from the statistical analyses of 3D structures of protein–ligand complexes in our previous study. This server can guide visual inspection by providing information about interaction patterns for each atomic contact in 3D structures. Users can visually investigate what atomic contacts in user-specified 3D structures of protein–small ligand complexes are statistically overrepresented. This server consists of two main components: “Complex Analyzer”, and “Pattern Viewer”. The former provides a 3D structure viewer with annotations of interacting amino acid residues, ligand atoms, and interacting pairs of these. In the annotations of interacting pairs, assignment to an interaction pattern of each contact and statistical preferences of the patterns are presented. The “Pattern Viewer” provides details of each interaction pattern. Users can see visual representations of probability density functions of interactions, and a list of protein–ligand complexes showing similar interactions.
With our new web server, GIANT, users can interactively analyze protein–small ligand binding modes using statistically determined interaction patterns rather than relying on their own a priori knowledge. GIANT is freely available at http://giant.hgc.jp/.
Keywords: Molecular recognition; Ligand binding site; Protein–ligand interactions; Protein structure; Protein function; Pattern recognition; Database; Web-server
Elucidating the molecular mechanisms of the selective recognition of small molecules (or ligands) by proteins is a central issue in biology. Structure data of protein–ligand complexes deposited in the Protein Data Bank (PDB) are a very informative resource because they contain direct information on molecular interactions between proteins and ligands at the atomistic scale. Structural biologists and medicinal chemists can gain insights and knowledge through visual inspection of 3D structures of protein–ligand complexes. However, visual inspection by scientists is subjective and may focus on only a few well-known interactions, e.g. hydrogen bonds. A more objective and comprehensive view is required for the interpretation of molecular interactions from 3D structure data.
Toward more objective analyses, a promising strategy is to take advantage of the statistics of molecular interactions in the PDB. Owing to the recent rapid increase in 3D structure data, this strategy has become more attractive, and the statistics of protein–ligand interactions have been extensively studied. Many secondary databases of the PDB focusing on protein–ligand complexes with various annotations have been constructed [3–7]. Relibase, CREDO and PLI particularly focus on atomic contacts between proteins and ligands. They store information on protein–ligand interactions at the atomistic level and provide catalogues of many types of interactions, such as hydrogen bonds, hydrophobic contacts and interactions of π-systems. While they provide useful information about protein–ligand interactions, analyses with these existing databases are limited to well-known, predefined atomic contacts. However, selective molecular recognition is considered to be accomplished by combinations of not only such typical interactions but also a huge variety of other atomic contacts. Comprehensive knowledge about various kinds of atomic contacts is required.
Previously, we reported a comprehensive classification of spatial arrangements of ligand atoms around molecular fragments of proteins, where each fragment was defined as three covalently linked atoms. We analyzed statistically preferred geometries of the atomic contacts, or interaction patterns, as mixtures of Gaussian functions. These interaction patterns were obtained from every atomic contact observed in the PDB using an unsupervised pattern recognition approach. We found 13,512 interaction patterns in the PDB, and interactions in these patterns were more enriched in native complex structures than in docking decoys.
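To make the idea of an interaction pattern as one component of a Gaussian mixture more concrete, the following minimal sketch evaluates a mixture density at a ligand atom position and assigns the position to the most responsible component. It is an illustration only, not the implementation used for GIANT: the function names, the restriction to diagonal covariance matrices and the component parameters are all assumptions made for this example.

```python
import math

def gaussian_pdf_diag(x, mean, var):
    """Density of a 3D Gaussian with a diagonal covariance matrix.

    The diagonal covariance is a simplification chosen for this sketch.
    """
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * math.log(2.0 * math.pi * vi) - 0.5 * (xi - mi) ** 2 / vi
    return math.exp(log_p)

def mixture_density(x, components):
    """Density of a mixture; components is a list of (weight, mean, var)."""
    return sum(w * gaussian_pdf_diag(x, m, v) for w, m, v in components)

def assign_pattern(x, components):
    """Return (index of the most responsible component, responsibilities).

    Returns (None, []) when the density underflows to zero everywhere.
    """
    weighted = [w * gaussian_pdf_diag(x, m, v) for w, m, v in components]
    total = sum(weighted)
    if total == 0.0:
        return None, []
    resp = [p / total for p in weighted]
    best = max(range(len(resp)), key=resp.__getitem__)
    return best, resp

# Hypothetical two-component mixture (weights, means in Å, variances in Å^2).
toy_components = [
    (0.6, (3.0, 0.0, 0.0), (0.25, 0.25, 0.25)),
    (0.4, (0.0, 3.5, 0.0), (0.50, 0.50, 0.50)),
]
toy_point = (2.8, 0.2, 0.1)
toy_best, toy_resp = assign_pattern(toy_point, toy_components)
```

In this toy setting, each mixture component plays the role of one interaction pattern, and the mixture density at a contact position is the kind of quantity that measures how typical the contact is.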
On the basis of the classification of interactions, we present a new web server for analyzing molecular interactions in the 3D structure of protein–ligand complexes, named “GIANT”, which stands for “Gaussian mixture model-based Interaction ANalyzer focusing on Three-atom fragments”. GIANT provides a web browser-based user interface for visual inspection of protein–ligand interactions in 3D structures. Users can investigate how statistically overrepresented each atomic contact is in the PDB, and which protein–ligand complexes use similar interactions, for any kind of atomic contact rather than only well-known predefined types of interactions.
Construction and content
On the basis of the analyses, we constructed a web-based application called GIANT, which consists of two main components: “Complex Analyzer” and “Pattern Viewer”. The former provides functionality to perform visual inspection of protein–ligand interactions on the basis of annotations of interaction patterns, and the latter shows a summary of each interaction pattern.
Utility and discussion
We show two examples of analyses of protein–ligand interactions. In the first example, a brief analysis of the binding mode of the dihydrofolate reductase (DHFR) and methotrexate (MTX) complex [PDB:3dfr] is described in a tutorial style that introduces the basic usage of GIANT. The second is a practical case that compares the interactions of two similar inhibitors recognized in the same binding mode by an identical cyclin-dependent kinase (CDK) [PDB:2r3j, 2r3k].
Tutorial with a DHFR–methotrexate complex
We show a tutorial-style example for analyzing interactions of a DHFR–methotrexate complex. In this example, it is assumed that users want to know the molecular mechanisms of methotrexate recognition by DHFR. Using GIANT, we aim to study what kinds of statistically preferred interactions contribute to the recognition of the pteridine ring and which other complexes use similar interactions.
On the right side of the window, there are three tables: a list of interacting amino acid residues (Figure 2C), a list of interacting ligand atoms (Figure 2D) and a list of interacting pairs of a protein fragment and a ligand atom (Figure 2E). Users can view interactions in the Jmol viewer by clicking check boxes in the tables. For example, they can focus on interactions of the 2′ amino group in the pteridine ring of MTX by ticking the corresponding check box (Atom-ID = 3) in the table of ligand atoms. Amino acid residues recognizing the specified atom will then appear, and interactions will be depicted as lines between atoms (Figure 2B). Furthermore, the table of interactions shows the list of interactions of the specified atom. In this case, the nitrogen atom of the amino group is recognized by three residues: Ala6, Asp26 and Thr116. The interaction with Asp26 is a bifurcated hydrogen bond. This interaction pattern (Pattern-ID = 18713) was widely observed in the dataset, i.e., 952 interactions in 63 protein families (defined by single-linkage clustering of amino acid sequences within 25% sequence identity with ≥50% sequence coverage) were assigned to this pattern. These values are given in the columns “Freq.” and “Family” in the table of interactions. Users can also see how well this interaction fits the probability distribution by checking the value in the column “Prob.”, which denotes the probability density of this data point in the Gaussian mixture distribution. In contrast to the interaction pattern of Asp26 with the nitrogen atom, which is a common feature across a wide range of protein families, the interaction with Thr116 was observed in only five protein families. This interaction pattern is almost specific to the DHFR family. (This residue recognizes the ligand via an intermediate water molecule. Although GIANT does not have information about water molecules, it shows such interactions as direct contacts provided that the distances between contacting atoms are below a threshold, calculated as the sum of the van der Waals radii plus 1.0 Å.)
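The distance criterion mentioned above (sum of van der Waals radii plus 1.0 Å) can be sketched as follows. This is a minimal illustration under stated assumptions: the radii are the commonly tabulated Bondi values, which may differ from the radii actually used by GIANT, and the names `VDW_RADII`, `contact_cutoff` and `is_contact` are hypothetical.

```python
import math

# Bondi van der Waals radii in Å (assumed for illustration; the radii used
# by GIANT are not stated in this text).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
MARGIN = 1.0  # extra tolerance in Å, as described in the text

def contact_cutoff(elem_a, elem_b, margin=MARGIN):
    """Distance threshold: sum of the two van der Waals radii plus a margin."""
    return VDW_RADII[elem_a] + VDW_RADII[elem_b] + margin

def is_contact(elem_a, xyz_a, elem_b, xyz_b, margin=MARGIN):
    """True if the interatomic distance is below the threshold."""
    return math.dist(xyz_a, xyz_b) < contact_cutoff(elem_a, elem_b, margin)

# A nitrogen and an oxygen 4.0 Å apart fall under the N/O threshold used here,
# so a pair bridged by a water molecule can still be reported as a contact.
example_is_contact = is_contact("N", (0.0, 0.0, 0.0), "O", (4.0, 0.0, 0.0))
```

With these assumed radii the N/O threshold is about 4.07 Å, which is long enough to include some water-mediated pairs, consistent with the Thr116 case described above.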
While “Complex Analyzer” provides information about assignments of atomic contacts to interaction patterns for user-specified complexes, it is still difficult to interpret the nature of each interaction pattern using only this component. The other component, “Pattern Viewer”, helps the analysis by providing graphical information about the 3D spatial probability distribution of each interaction pattern. Users can jump to this component by clicking “Pattern-ID” in any row of the table of interactions. To see the interaction patterns with Asp26, click “18713” (the seventh column) in the row with Interaction-ID = 101 (the first column) in the table of interactions. The “Pattern Viewer” page will open and show the spatial distribution of each interaction pattern as 3D meshes (Figure 2F). The three covalently linked atoms centered in the viewer represent a protein fragment. The regions inside the meshes are statistically preferred positions of ligand atoms for interaction with the fragment. The contour of the pattern specified in the “Complex Analyzer” is highlighted in red. Clicking “C” in the right table (Figure 2G) provides a list of ligands (Figure 2H) and a list of protein–ligand complexes (Figure 2I) with the same interaction patterns. This information may be useful for finding complexes with interactions similar to those of the query.
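Displaying ligand atom positions around a centered three-atom fragment requires expressing each ligand atom in a coordinate frame attached to the fragment. The sketch below shows one way to build such a frame; the axis convention (origin at the middle atom, x-axis toward the first atom, z-axis normal to the fragment plane) and all function names are assumptions for illustration, and the convention actually used by GIANT is not given in this text.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(u):
    norm = math.sqrt(dot(u, u))
    if norm == 0.0:
        raise ValueError("degenerate fragment: atoms coincide or are collinear")
    return tuple(a / norm for a in u)

def fragment_frame(a, b, c):
    """Right-handed frame for a fragment of covalently linked atoms a-b-c.

    Assumed convention: origin at b, x along b->a, z normal to the a-b-c plane.
    """
    x = unit(sub(a, b))
    z = unit(cross(x, sub(c, b)))
    y = cross(z, x)
    return b, (x, y, z)

def to_fragment_coords(point, frame):
    """Coordinates of a ligand atom relative to the fragment frame."""
    origin, axes = frame
    v = sub(point, origin)
    return tuple(dot(v, axis) for axis in axes)

# Toy fragment and ligand atom (coordinates in Å, hypothetical values).
toy_frame = fragment_frame((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
toy_local = to_fragment_coords((1.0, 2.0, 3.0), toy_frame)
```

Because the frame moves with the fragment, contacts taken from differently oriented complexes map to comparable local coordinates, which is what allows their spatial distribution to be drawn around a single centered fragment.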
Comparing interactions of two CDK inhibitors
While one of the two altered parts of the ligands showed similar interactions in 2r3j and 2r3k, the other showed distinct interactions. Position I, which is an aromatic nitrogen atom in 2r3j and an aromatic carbon atom in 2r3k, interacted with the Leu134 residue through a CH–π interaction. The spatial distributions of aromatic nitrogen and carbon atoms interacting with the Leu Cδ–Cγ–Cβ fragment were similar (Figure 3C and D, respectively), and both patterns used by these atoms (shown as red contours) were widely shared among protein families (89 and 174 families share the patterns for aromatic nitrogen and aromatic carbon atoms, respectively). On the other hand, position II, which is an aromatic carbon atom in 2r3j and an aromatic nitrogen atom in 2r3k, contacted the Ile10 side chain and the His84 main chain. While the interaction between Ile10 and position II was assigned to a pattern in both complexes, that between His84 and position II was assigned to a pattern only in the complex 2r3j (where position II is an aromatic carbon atom), even though there were no significant structural changes (the interatomic distances between the His84 backbone oxygen atom and the contacting ligand atom were 3.7 Å and 3.6 Å in 2r3j and 2r3k, respectively). In contrast to the spatial distribution of aromatic carbon atoms around the His O–C–N fragment (Figure 3E), that of aromatic nitrogen atoms preferred only one configuration of interactions (Figure 3F). This result implies that the loss of the statistically preferred interaction with the His84 main chain causes the 10-fold increase in the IC50 value of the 2r3k complex relative to 2r3j, and that position II should be a carbon atom rather than a nitrogen atom for higher affinity. This should be helpful information for medicinal chemists.
Although the scope of GIANT is limited to direct contacts between proteins and small molecules in the current version, the basic concept of GIANT is applicable to various other kinds of molecular interactions, such as water-mediated interactions. In future developments, we plan to take statistics of interactions with metals and water molecules, which play important roles in molecular recognition. In addition, the interaction patterns defined in GIANT focus on the relative positions of a protein fragment and a ligand atom, and do not consider combinations of these elemental interactions (or the “environment” around the contacting pair). Because information about the environment should be an important factor in ligand recognition, we will take statistics of co-occurrences of the interaction patterns in future work.
The web server GIANT shows the statistical preferences of each atomic contact in user-specified 3D structures of protein–small ligand complexes. This function provides an objective perspective for visual inspection of binding modes on the basis of the survey of interactions reported in the previous paper, and offers many insights for structural biologists and medicinal chemists. For example, when medicinal chemists perform lead optimization with 3D structure data of protein–compound complexes, GIANT suggests the parts of compounds where chemical moieties without statistically overrepresented interaction patterns should be replaced to gain binding affinity. Although this process has usually been performed by experts using their a priori knowledge and intuition, GIANT supports it with statistical, objective information and assists in realizing the concept of so-called rational drug design.
Availability and requirement
GIANT is freely available at the following URL: http://giant.hgc.jp/.
This work was supported by the ‘HD-Physiology’ Grant-in-Aid for Scientific Research on Innovative Areas (22136005). The super-computing resource was provided by the Human Genome Center (University of Tokyo). A tool for visualization of density maps of Gaussian mixture distributions was provided by Dr. Takeshi Kawabata.
- Berman H, Henrick K, Nakamura H, Markley JL: The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Res. 2007, 35: D301-D303. 10.1093/nar/gkl971.
- Bissantz C, Kuhn B, Stahl M: A medicinal chemist’s guide to molecular interactions. J Med Chem. 2010, 53: 5061-5084. 10.1021/jm100112j.
- Chalk AJ, Worth CL, Overington JP, Chan AWE: PDBLIG: classification of small molecular protein binding in the Protein Data Bank. J Med Chem. 2004, 47: 3807-3816. 10.1021/jm040804f.
- Yamaguchi A: Het-PDB Navi.: a database for protein-small molecule interactions. J Biochem. 2004, 135: 79-84. 10.1093/jb/mvh009.
- Feng Z, Chen L, Maddula H, Akcan O, Oughtred R, Berman HM, Westbrook J: Ligand Depot: a data warehouse for ligands bound to macromolecules. Bioinformatics. 2004, 20: 2153-2155. 10.1093/bioinformatics/bth214.
- Meslamani J, Rognan D, Kellenberger E: sc-PDB: a database for identifying variations and multiplicity of “druggable” binding sites in proteins. Bioinformatics. 2011, 27: 1324-1326. 10.1093/bioinformatics/btr120.
- Reddy AS, Amarnath HSD, Bapi RS, Sastry GM, Sastry GN: Protein ligand interaction database (PLID). Comput Biol Chem. 2008, 32: 387-390. 10.1016/j.compbiolchem.2008.03.017.
- Hendlich M, Bergner A, Günther J, Klebe G: Relibase: design and development of a database for comprehensive analysis of protein–ligand interactions. J Mol Biol. 2003, 326: 607-620. 10.1016/S0022-2836(02)01408-0.
- Schreyer A, Blundell T: CREDO: a protein-ligand interaction database for drug discovery. Chem Biol Drug Des. 2009, 73: 157-167. 10.1111/j.1747-0285.2008.00762.x.
- Gallina AM, Bisignano P, Bergamino M, Bordo D: PLI: a web-based tool for the comparison of protein-ligand interactions observed on PDB structures. Bioinformatics. 2013, 29: 395-397. 10.1093/bioinformatics/bts691.
- Kasahara K, Shirota M, Kinoshita K: Comprehensive classification and diversity assessment of atomic contacts in protein–small ligand interactions. J Chem Inf Model. 2013, 53: 241-248. 10.1021/ci300377f.
- Attias H: Inferring parameters and structure of latent variable models by variational bayes. UAI’99 Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence, Stockholm, Sweden, July 30-Aug 1. 1999, 21-30.
- Clark M, Cramer RD, Van Opdenbosch N: Validation of the general purpose tripos 5.2 force field. J Comput Chem. 1989, 10: 982-1012. 10.1002/jcc.540100804.
- Bolin JT, Filman DJ, Matthews DA, Hamlin RC, Kraut J: Crystal structures of Escherichia coli and Lactobacillus casei dihydrofolate reductase refined at 1.7 A resolution. I. General features and binding of methotrexate. J Biol Chem. 1982, 257: 13650-13662.
- Fischmann TO, Hruza A, Duca JS, Ramanathan L, Mayhood T, Windsor WT, Le HV, Guzi TJ, Dwyer MP, Paruch K, Doll RJ, Lees E, Parry D, Seghezzi W, Madison V: Structure-guided discovery of cyclin-dependent kinase inhibitors. Biopolymers. 2008, 89: 372-379. 10.1002/bip.20868.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.