Protein docking prediction using predicted protein-protein interface
© Li and Kihara; licensee BioMed Central Ltd. 2012
Received: 26 August 2011
Accepted: 10 January 2012
Published: 10 January 2012
Many important cellular processes are carried out by protein complexes. To provide physical pictures of interacting proteins, many computational protein-protein prediction methods have been developed in the past. However, it is still difficult to identify the correct docking complex structure within top ranks among alternative conformations.
We present a novel protein docking algorithm that utilizes imperfect protein-protein binding interface prediction for guiding protein docking. Since the accuracy of protein binding site prediction varies from case to case, the challenge is to develop a method that does not deteriorate but improves docking results when using a binding site prediction that may not be 100% accurate. The algorithm, named PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), is based on a pairwise protein docking prediction algorithm, LZerD, which we developed earlier. PI-LZerD starts by performing docking prediction using the provided protein-protein binding interface prediction as constraints, followed by a second round of docking with updated docking interface information to further improve the docking conformation. Benchmark results on bound and unbound cases show that PI-LZerD consistently improves the docking prediction accuracy compared with docking without binding site prediction or with binding site prediction used for post-filtering.
We have developed PI-LZerD, a pairwise docking algorithm, which uses imperfect protein-protein binding interface prediction to improve docking accuracy. PI-LZerD consistently showed better prediction accuracy over alternative methods in the series of benchmark experiments including docking using actual docking interface site predictions as well as unbound docking cases.
Many important cellular processes, such as gene expression regulation and transport, are carried out by protein complexes [1–3]. The importance and abundance of protein interactions and complexes have recently been further highlighted by large-scale protein-protein interaction maps revealed for many organisms [4–7]. The tertiary structure of proteins is necessary for understanding the underlying molecular mechanism of protein interaction; however, it is often difficult to obtain complex structures by experimental methods such as X-ray crystallography or NMR. Thus, experimentally solved protein complex structures account for only a small fraction of the known protein complexes confirmed by biochemical experiments. Therefore, an important task in bioinformatics is to develop efficient and accurate computational methods for predicting protein-protein docking conformations.
Many protein-protein docking methods have been developed in the past employing various ideas and techniques [8–20]. Typically, a docking prediction for a pair of proteins produces a few thousand docking conformations (docking decoys), which are then ranked using a scoring function. Conformational search algorithms employed include the Fast Fourier Transform (FFT) [16, 17, 21], geometric hashing [18, 22], Monte Carlo algorithms, genetic algorithms [23, 24], and Langevin dynamics. For scoring a docking decoy, several terms are usually combined, including physics-based scores and terms concerning geometrical shape complementarity [18, 27, 28]. Clustering of docking decoys has also been shown to be effective in selecting near-native conformations [29–31]. Some recent docking algorithms have more elaborate procedures, for example, considering alternative conformations of flexible protein chains or applying post-docking optimization steps [14, 33]. Nevertheless, despite significant efforts in method development, it is still difficult to identify the correct conformations and rank them at the top among hundreds of decoys [18, 27, 34], as is also evidenced by results from the recent Critical Assessment of Prediction of Interactions (CAPRI), a community-wide experiment on the comparative evaluation of protein-protein docking methods.
The accuracy of docking prediction can improve when some, even if not all, of the protein-protein interface (PPI) residues are known. PPI residues for a pair of interacting proteins can be identified by experiments including point mutation such as alanine scanning [35–38], chemical modification of residues [39, 40], NMR, hydrogen/deuterium exchange, and disulfide cross-linking. If several PPI residues are known, they can simply be used for filtering, i.e. to select docking decoys that have the known PPI residues at their docking interface [44, 45]. Alternatively, known PPI residues from interacting proteins can be incorporated as distance constraints. However, experimental methods are time consuming. This is particularly true if the whole PPI region of an interacting protein pair is to be identified or if many interacting proteins in a network are to be investigated.
PPI residues can also be predicted by computational methods, which capture sequence and structural features of PPI regions. A number of PPI site prediction methods have been developed. Sequence features used for PPI site prediction include amino acid residue propensity [46–52], sequence conservation [53–57], and correlated mutation [58–60]. Structural information used includes hydrophobic patches, secondary structure propensity, atom group propensity, relative accessible surface area, geometrical surface shape, the crystallographic B-factor, and energetic characteristics of PPI residues [62, 63]. Current protein interface prediction methods choose one or a combination of these features to construct scoring functions for machine learning techniques [51, 55, 56, 64–67]. Recent developments in PPI site prediction methods are surveyed in review articles [68, 69]. The obvious advantage of computational methods over experimental methods is that the former are much faster than the latter. However, computational prediction methods are not always accurate. For example, the meta-PPISP method, one of the state-of-the-art methods, predicts PPI residues on average with a precision of 50% at a coverage of 50% for enzyme-inhibitor complexes. Moreover, the prediction accuracy varies depending on the target protein, and thus it is difficult to estimate the accuracy for individual cases. Therefore, computational PPI residue prediction cannot be reliably used for simple post-filtering of docking decoys. A naive use of PPI residue prediction for post-filtering may actually decrease the prediction accuracy, as we will show in Results.
Here, we present a novel protein docking algorithm, PI-LZerD (using Predicted Interface with Local 3D Zernike descriptor-based Docking algorithm), which utilizes imperfect PPI residue prediction for guiding protein-protein docking. PI-LZerD iteratively improves docking results starting from an initial docking run that uses potentially inaccurate PPI prediction as restraints. The underlying docking algorithm is LZerD (Local 3D Zernike descriptor-based Docking algorithm), which we developed previously. The idea of using additional predicted information to aid protein docking has been explored in a few previous works. In those works, PPI information is used for post-filtering docking decoys [16, 71–73] or incorporated as an additional scoring term [14, 45, 74, 75]. Compared to these related works, the current work differs significantly in its design and in some important aspects. First, we have developed a novel algorithm that is specifically designed to utilize imperfect PPI prediction. Thus, we do not use PPI information simply for post-filtering. Second, we thoroughly investigate how the accuracy of PPI prediction affects the docking prediction accuracy. PI-LZerD is shown to consistently improve docking predictions when actual PPI predictions are used for unbound docking cases. The datasets used and the developed PI-LZerD program are made freely available to the academic community.
Pairwise protein docking algorithm, LZerD: the original algorithm
We start with a brief explanation of the original LZerD pairwise protein-protein docking algorithm. As will be explained in the next section, PI-LZerD makes iterative use of a modified version of LZerD. LZerD takes two protein tertiary structures (Protein Data Bank, PDB, files) as input (termed a ligand and a receptor protein) and outputs 30,000 to 50,000 docking decoys ranked by a scoring function. The geometric hashing algorithm is used for the docking conformational search.
The scoring function is a weighted sum of the following terms: a van der Waals term, in which the repulsive and attractive parts are considered separately; an electrostatics term, which considers repulsive/attractive and short-range/long-range contributions separately; a hydrogen and disulfide bond term; two solvation terms [80, 81]; and a knowledge-based atom contact term. Weighting factors for the linear combination of the terms were trained on two datasets: the protein-protein docking benchmark 2.0, which contains 84 pairwise unbound-unbound and bound-unbound docking structures, and 851 protein-protein dimeric complexes compiled by Huang and Zou. The combination of weight values was determined by logistic regression, with the interface root mean square deviation (iRMSD) between predicted decoys and the native structure as the target function to be optimized.
Modified LZerD to incorporate PPI prediction
We modified the LZerD algorithm so that additional information on a PPI region can be used to restrict the docking search space. Figure 1B illustrates the two methods of restricting the conformational search space in geometric hashing. Given a set of (predicted) PPI residues in a ligand or a receptor protein, each surface point is classified as either PPI (points within the gray ellipsoid in Figure 1B) or non-PPI, depending on whether the atom closest to the point belongs to a PPI residue. In geometric hashing, two base points (two crosses) are selected to define a reference coordinate system, into which the other local points are transformed. Base points are selected only from the PPI surface points for both the ligand and receptor proteins. Then, in the voting stage, matching points between the ligand and receptor are counted either only from the PPI surface points (i.e. matches are considered only within the predicted PPI regions; triangles in the gray region in Figure 1B) or from all the surface points (triangles and squares), including non-PPI points. The former seeks geometrical complementarity of the two proteins only at the predicted PPI regions, while the latter explores a wider surface area to identify well-fitting docking conformations in the vicinity of the predicted PPI regions. PI-LZerD uses these two search areas in different stages of the docking conformation search: the more permissive search area is used for the initial LZerD run, and the more restrictive search is performed in the subsequent iterations.
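The surface point classification and the two voting modes described above can be illustrated with a minimal sketch. This is not the PI-LZerD implementation; the function names, the `(residue_id, coordinates)` atom representation, and the `restrictive` flag are assumptions introduced here for illustration only.

```python
import math

def classify_surface_points(surface_points, atoms, ppi_residues):
    """Label each surface point True (PPI) or False (non-PPI).

    atoms is a list of (residue_id, (x, y, z)) tuples (hypothetical layout).
    A point is PPI if its closest atom belongs to a PPI residue.
    """
    ppi = set(ppi_residues)
    labels = []
    for point in surface_points:
        nearest_residue, _ = min(atoms, key=lambda a: math.dist(point, a[1]))
        labels.append(nearest_residue in ppi)
    return labels

def base_point_indices(labels):
    """Base points are drawn only from PPI surface points."""
    return [i for i, is_ppi in enumerate(labels) if is_ppi]

def voting_point_indices(labels, restrictive):
    """Restrictive search votes with PPI points only; permissive uses all points."""
    if restrictive:
        return base_point_indices(labels)
    return list(range(len(labels)))
```

In this sketch the permissive mode corresponds to the initial run and the restrictive mode to the later runs, following the description in the text.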
The 1000 decoys are clustered according to the similarity of their docking interface regions. For a given pair of docking decoys, the atoms common to the PPI regions of the two decoys are selected. The RMSD is then computed over the common atoms, but only when they account for more than 60% of all interface atoms of both PPI sites (otherwise, the two decoys are not clustered together). We call this the common interface RMSD (ciRMSD) of two docking decoys. The ciRMSD is more suitable for the PI-LZerD algorithm than the conventional coordinate RMSD or the ligand RMSD, since it focuses on capturing the similarity of docking interface regions.
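As a minimal sketch of the ciRMSD, assume each decoy's interface is given as a dictionary mapping an atom identifier to its coordinates in a common reference frame (a hypothetical data layout). The 60% rule is interpreted here as requiring the common atoms to exceed 60% of the interface atoms of each decoy; this reading, the function name, and the use of `None` for "not comparable" are assumptions, not details taken from the PI-LZerD code.

```python
import math

def ci_rmsd(interface_a, interface_b, min_shared=0.6):
    """Common interface RMSD of two decoys, or None if they share too few atoms.

    interface_a and interface_b map atom identifiers to (x, y, z) coordinates
    of the interface atoms of each decoy.
    """
    common = interface_a.keys() & interface_b.keys()
    # Require the common atoms to exceed the threshold for both interfaces.
    if (len(common) <= min_shared * len(interface_a)
            or len(common) <= min_shared * len(interface_b)):
        return None
    squared = sum(math.dist(interface_a[k], interface_b[k]) ** 2 for k in common)
    return math.sqrt(squared / len(common))
```

Two decoys for which the function returns `None` would simply never be placed in the same cluster.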
Once the ciRMSD is computed for all pairs of decoys, 60 decoys are selected by considering the physics-based score and the cluster size of the decoys. First, the decoy with the lowest score (the lower, the better) is selected, and close decoys with a ciRMSD ≤ 4.0 Å are discarded from the pool. This process is repeated until 30 decoys are identified. Next, an additional 30 decoys are selected based on cluster size. For each decoy, the number of other decoys within 4.0 Å ciRMSD is computed. Then, the largest cluster (i.e. the center decoy with the largest number of close decoys) is selected. If several clusters of the same size are found, the one whose center decoy has the lowest physics-based score is selected. All the decoys in the cluster are removed, and the process is repeated until 30 representative decoys are selected. Consequently, 30 decoys are selected based on the lowest energy and 30 more based on cluster size. Combining the energy value and the cluster size was shown to find more hits than using either metric alone (Additional file 1, Figure S1).
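The two-stage selection can be sketched as follows. The sketch makes several assumptions that the text does not specify: `scores` is a dictionary from decoy identifier to physics-based score, `ci_rmsd` is any callable returning the ciRMSD of two decoys (or `None` when undefined), the cluster-size stage starts from all decoys not already chosen by energy, and neighbor counts are recomputed after each removal. All function names are hypothetical.

```python
def is_close(ci_rmsd, a, b, cutoff):
    """True if the pair has a defined ciRMSD no larger than the cutoff."""
    value = ci_rmsd(a, b)
    return value is not None and value <= cutoff

def select_by_energy(scores, ci_rmsd, n_select=30, cutoff=4.0):
    """Repeatedly take the lowest-score decoy and discard its close neighbors."""
    pool = sorted(scores, key=scores.get)  # lowest (best) score first
    chosen = []
    while pool and len(chosen) < n_select:
        best = pool[0]
        chosen.append(best)
        pool = [d for d in pool[1:] if not is_close(ci_rmsd, best, d, cutoff)]
    return chosen

def select_by_cluster_size(scores, ci_rmsd, n_select=30, cutoff=4.0, exclude=()):
    """Repeatedly take the center of the largest cluster; ties go to the lower score."""
    excluded = set(exclude)
    pool = [d for d in scores if d not in excluded]
    chosen = []
    while pool and len(chosen) < n_select:
        neighbors = {
            c: [d for d in pool if d != c and is_close(ci_rmsd, c, d, cutoff)]
            for c in pool
        }
        center = max(pool, key=lambda c: (len(neighbors[c]), -scores[c]))
        chosen.append(center)
        removed = set(neighbors[center]) | {center}
        pool = [d for d in pool if d not in removed]
    return chosen

def select_decoys(scores, ci_rmsd, n_each=30, cutoff=4.0):
    """Combine both criteria: n_each by energy, then n_each by cluster size."""
    by_energy = select_by_energy(scores, ci_rmsd, n_each, cutoff)
    by_cluster = select_by_cluster_size(scores, ci_rmsd, n_each, cutoff,
                                        exclude=by_energy)
    return by_energy + by_cluster
```

With the default `n_each=30`, `select_decoys` returns up to 60 decoy identifiers, matching the counts given in the text.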
The selected 60 decoys are passed to the subsequent process. For each of the 60 docking decoys, PPI residues are extracted. PPI residues are defined as those that have a heavy atom closer than 5.0 Å to any atom of the docking partner. The decoys do not necessarily have the same PPI region as the initially provided PPI information, because the modified LZerD has explored the vicinity of the input PPI in the docking conformation search. Using the identified PPI residues as the updated constraint, the modified LZerD is run for the second time. In this round, only the PPI surface points are considered at the voting stage of geometric hashing (the restrictive search). From the resulting docking decoys, the 1000 lowest-energy decoys are clustered based on ciRMSD, and the cluster centers are sorted by the physics-based score. Since the modified LZerD is run for each of the 60 decoys, a total of 60 LZerD runs are performed.
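The 5.0 Å definition of PPI residues can be written down directly. In the sketch below, the input layout (a list of `(residue_id, coordinates)` heavy atoms for one chain and a list of atom coordinates for the partner) and the function name are assumptions made for illustration.

```python
import math

def interface_residues(chain_heavy_atoms, partner_atom_coords, cutoff=5.0):
    """Residues with at least one heavy atom closer than cutoff to the partner.

    chain_heavy_atoms: list of (residue_id, (x, y, z)) for heavy atoms only.
    partner_atom_coords: list of (x, y, z) for all atoms of the docking partner.
    """
    residues = set()
    for residue_id, coord in chain_heavy_atoms:
        if residue_id in residues:
            continue  # already known to be at the interface
        if any(math.dist(coord, other) < cutoff for other in partner_atom_coords):
            residues.add(residue_id)
    return sorted(residues)
```

Applying the function once with the receptor as the chain and once with the ligand as the chain gives the updated PPI residue sets for both proteins of a decoy.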
In addition to the 60 runs of the modified LZerD, we run the original LZerD without predicted PPI information, followed by post-filtering with the predicted PPI residues (the naive-filtering method; the left branch of Figure 2). In the naive-filtering method, docking decoys are sorted not by the physics-based score but by the agreement of the docking interface residues with the predicted PPI residues. The overall procedure therefore produces 61 docking runs, i.e. 61 ranked lists of docking decoys. To make the final ranking, the top-ranked decoys from each of the 61 lists are first ranked by the physics-based score, and then the decoys at each subsequent rank in the 61 lists are ranked in the same way. Thus, the decoys from all the lists are sorted first by their rank within each list and then by the physics-based score. If identical decoys appear, the one ranked lower in the final list is removed (it is uncommon but possible for identical docking decoys to appear in different LZerD runs).
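The merging of the ranked lists can be sketched as a rank-by-rank interleaving. The sketch assumes that decoys are represented by hashable identifiers, that identical decoys share the same identifier (in practice a pose comparison would be needed), and that `scores` maps each identifier to its physics-based score; the function name is hypothetical.

```python
from itertools import zip_longest

def merge_ranked_lists(ranked_lists, scores):
    """Merge ranked lists tier by tier: all rank-1 decoys first, then rank-2, etc.

    Within a tier, decoys are ordered by physics-based score (lower is better).
    A decoy that already appeared earlier in the merged list is skipped, so
    only its higher-ranked occurrence is kept.
    """
    merged = []
    seen = set()
    for tier in zip_longest(*ranked_lists):
        tier_decoys = [d for d in tier if d is not None]
        for decoy in sorted(tier_decoys, key=scores.__getitem__):
            if decoy not in seen:
                seen.add(decoy)
                merged.append(decoy)
    return merged
```

The same function applies unchanged to the 61 lists described in the text, since the lists may have different lengths.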
Dataset of protein complexes and PPI information
The first dataset we use for benchmarking PI-LZerD is the protein-protein docking benchmark version 3.0, with 124 bound cases. The average length of the proteins is 256 residues, and the number of docking interface residues per protein ranges from 10 to 70, with an average of 25.
To investigate how the accuracy of PPI prediction affects the docking prediction, we first use "simulated" PPI predictions as input. The actual PPI regions of the ligand and receptor proteins are shifted by 5, 10, 12, and 15 residues in two opposite directions on the protein surface along the major axis of the PPI region. To shift a PPI region on the surface, n PPI residues (n = 5, 10, 12, 15) at one end of the PPI site along the axis are removed, and the same number of residues are added on the opposite side of the PPI site. Thus, the PPI regions are shifted geometrically rather than along the protein sequence (Additional file 1, Figure S2). By combining the shifted PPI regions of the ligand and the receptor protein, four test cases are made for each protein complex (because the PPI region on each protein is shifted in two opposite directions). A protein complex is removed from the dataset if one of its proteins has a PPI region smaller than the number of shifted residues. The numbers of tested protein complexes with shifts of 5, 10, 12, and 15 PPI residues are 124 (124 × 4 = 496 test cases), 122 (488 cases), 118 (472 cases), and 104 (416 cases), respectively.
We also test PI-LZerD using actual PPI predictions from a state-of-the-art PPI prediction method, meta-PPISP. Meta-PPISP is a meta server that combines predictions by three methods: Promate, PINUP, and cons-PPISP. The benchmark dataset is selected from the iPFAM database, a subset of the PFAM database, which provides multiple sequence alignments (MSAs) of interacting proteins. We used iPFAM because meta-PPISP needs an MSA as input. The iPFAM entries were pruned using the following criteria: (1) PFAM families with 20 to 100 seed sequences were selected. (2) PFAM families consisting of local domain sequences were replaced with their corresponding full-length sequences from UniProt. A representative PDB structure was then selected from each PFAM family given by the association in iPFAM. (3) Protein structures that do not have any observable interacting partners in their PDB files were removed. (4) Proteins whose PDB entries contain non-standard amino acids, and obsolete PDB files, were filtered out. (5) PDB structures with antibody-antigen and protein-DNA/RNA interactions were removed. (6) Protein complexes with more than two chains were removed. (7) Complexes were eliminated if they are classified as monomers bound by crystal contacts in the PQS definition. (8) Proteins with a size between 75 and 300 amino acids were selected. (9) In the final dataset, PFAM families with redundant representative structures with ≥35% sequence identity were filtered out. Given that MSAs in PFAM may not include the PDB structure as part of the alignment, we employed MUSCLE (ver. 3.6) with default parameters to compute MSAs from the PFAM unaligned sequences and one sequence from the selected PDB structure. The final dataset includes 127 protein complexes. From the output of the meta-PPISP server, residues with a meta-PPISP score of 0.1 or higher are identified as PPI residues.
Availability and requirements
The executable program of PI-LZerD for Linux is freely available to academic institutions at our website, http://kiharalab.org/PI-LZerD. The datasets used in this study are also available at the same webpage. The program requires a computer running Linux with at least 1.5 GB of RAM. The combined docking and scoring time averages about 1-2 hours for small proteins (about 400 points on the receptor and ligand) and may be longer for larger proteins. This timing was measured on a computer with a dual-core 2.1 GHz processor and 8 GB of RAM. In addition, the pairwise docking program LZerD, on which PI-LZerD is based, is also made available at http://kiharalab.org/proteindocking.
Results and Discussion
Naive post-filtering method
An obvious way to use predicted PPI information for protein docking prediction is to select docking decoys whose PPI site agrees well with the provided PPI information. This approach, termed the naive post-filtering method, was tested on datasets with five different accuracy levels of PPI prediction. In addition to the set of accurate PPI information, we used PPI sites shifted by 5, 10, 12, and 15 residues. For each protein complex with PPI information, we ran the original LZerD to produce the top 1000 scoring docking decoys. Then, for each docking decoy, the fraction of overlap between the residues in the provided PPI information and the PPI region of the docking decoy is computed for both the ligand and the receptor proteins, and the average of the two is used for sorting decoys.
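A minimal sketch of the naive post-filtering score follows. The text does not state the denominator of the overlap fraction; the sketch assumes it is the number of provided PPI residues (i.e. the fraction of the provided residues that appear in the decoy interface). This choice, the decoy tuple layout, and the function names are assumptions.

```python
def overlap_fraction(provided, observed):
    """Fraction of the provided PPI residues found in the decoy's interface."""
    provided, observed = set(provided), set(observed)
    if not provided:
        return 0.0
    return len(provided & observed) / len(provided)

def naive_filter_rank(decoys, provided_receptor, provided_ligand):
    """Sort decoys by the average receptor/ligand agreement, best first.

    decoys: list of (decoy_id, receptor_interface, ligand_interface), where the
    interfaces are collections of residue identifiers.
    """
    def agreement(decoy):
        _, receptor_interface, ligand_interface = decoy
        return 0.5 * (overlap_fraction(provided_receptor, receptor_interface)
                      + overlap_fraction(provided_ligand, ligand_interface))
    return [d[0] for d in sorted(decoys, key=agreement, reverse=True)]
```

Because the ordering depends only on agreement with the provided residues, a shifted or wrong PPI prediction pulls the ranking toward the wrong site, which is the failure mode examined in the following sections.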
PI-LZerD with simulated PPI predictions
Next, we examine the performance of PI-LZerD on the dataset of simulated PPI predictions. This experiment is for understanding the effect of various levels of inaccuracy in PPI predictions on the docking results. In later sections we discuss the results using actual PPI predictions on bound and unbound docking cases. The full implementation of PI-LZerD (Figure 2, PI-LZerD-2) was compared with four other variations of LZerD, namely, the original LZerD without PPI information (the base LZerD), the original LZerD followed by post-clustering without PPI information, LZerD with naive post-filtering using the PPI information, and PI-LZerD using PPI information with only one iteration of the modified LZerD (PI-LZerD-1). PI-LZerD-1 clusters the output docking decoys using the ciRMSD.
As the accuracy of the PPI information starts to deteriorate, the docking prediction accuracy of the naive post-filtering quickly drops relative to the others. When the 5-residue-shifted PPI information was used, the post-filtering method still showed the highest number of successful cases up to rank 100 (Figures 5C & 5D). When PPI regions were further shifted by 10 residues, PI-LZerD clearly outperformed the post-filtering method. The performance of the post-filtering method fell to the level of the base LZerD, which did not use the PPI information. PI-LZerD-2 also performed better than PI-LZerD-1.
Figures 5G & 5H show that when the 12-residue-shifted PPI regions were used, the naive filtering method performed even worse than the base LZerD. In contrast, PI-LZerD-2 managed to use the inaccurate PPI information successfully, showing a higher accuracy than the base LZerD. The accuracy of PI-LZerD-1 is now comparable to that of the base LZerD when the 2.5 Å iRMSD threshold is used (Figure 5G) but better with the 4.0 Å iRMSD threshold (Figure 5H). Finally, with 15-residue-shifted PPI regions (Figures 5I & 5J), PI-LZerD-2 still remained superior to the base LZerD, while the accuracy of the naive post-filtering dropped further. It is worth mentioning that the prediction accuracy of PI-LZerD-2 stays almost the same with PPI regions shifted by 5, 10, 12, and 15 residues. Importantly, this stability was observed only for PI-LZerD-2 and not for PI-LZerD-1. This indicates that the two iterations of the modified LZerD are necessary to effectively explore the vicinity of the specified PPI region and find the lowest energy conformation.
In Additional file 1, Figures S3 and S4, we analyzed the same results by classifying the shifted PPI sites by their accuracy. In Additional file 1, Figure S3, the protein complexes are classified by the average sensitivity of the shifted PPI sites of the receptor and the ligand proteins, while in Additional file 1, Figure S4, they are classified based on the fnat of the shifted PPI sites of the receptor and the ligand proteins. Essentially the same trend was observed in Additional file 1, Figures S3 & S4 as in Figure 5. With the naive post-filtering, near perfect prediction accuracy can be achieved only when the correct PPI information is provided, and its results quickly deteriorate as the accuracy of the PPI site information drops. In contrast, PI-LZerD can take advantage of PPI information even when it is not very accurate. Over the range of PPI site information accuracy tested, PI-LZerD always performed better than the base LZerD without PPI information. It is very important that employing additional information (in this case PPI site prediction) does not deteriorate prediction results even if the quality of the information is not high, and PI-LZerD achieves this.
Docking Prediction using actual PPI site prediction
On this dataset, PI-LZerD-2 consistently performed the best at every rank cutoff (x-axis) with both the 2.5 Å and 4.0 Å iRMSD thresholds (Figures 6C & 6D). Within the top 10 predictions, PI-LZerD-2 made at least one hit for 51.2% of the cases, while the base LZerD and the naive post-filtering obtained hits for 42.5% and 31.5% of the cases, respectively, with the 2.5 Å iRMSD cutoff (Figure 6C). Within the rank of 100, the successful cases for the three methods increased to 72.4%, 55.1%, and 38.6%, respectively. Thus, PI-LZerD-2 improved the success rate over the base LZerD by 8.7 and 17.3 percentage points within the ranks of 10 and 100, respectively. When 4.0 Å is used as the iRMSD cutoff (Figure 6D), PI-LZerD-2 obtained at least one hit for 33.1/59.8/85.0/95.3% of the cases within the top 1/10/100/1000 predictions, respectively. The naive post-filtering performed consistently worse than the base LZerD. An important conclusion from these results is that blind PPI site predictions cannot be used to improve docking prediction with the post-filtering procedure; on average, it will only deteriorate prediction accuracy.
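The success rates reported here count a target as successful at a given rank cutoff when at least one decoy within that rank is a hit, i.e. has an iRMSD within the threshold. A minimal sketch of this bookkeeping is shown below; the input layout (one list of iRMSD values per target, in rank order) and the function names are assumptions for illustration.

```python
def first_hit_rank(irmsds_in_rank_order, threshold):
    """1-based rank of the first decoy with iRMSD <= threshold, or None."""
    for rank, irmsd in enumerate(irmsds_in_rank_order, start=1):
        if irmsd <= threshold:
            return rank
    return None

def success_rate(targets, rank_cutoff, threshold=2.5):
    """Percentage of targets with at least one hit within the top rank_cutoff."""
    hits = 0
    for irmsds in targets:
        rank = first_hit_rank(irmsds, threshold)
        if rank is not None and rank <= rank_cutoff:
            hits += 1
    return 100.0 * hits / len(targets)
```

Evaluating `success_rate` over a range of rank cutoffs for each method yields curves of the kind plotted against the x-axis in the figures.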
Unbound protein docking using actual PPI site prediction
We again observe the same trend as in the previous experiments: PI-LZerD-2 showed a consistently better success rate than the base LZerD at each rank cutoff (Figures 7C & 7D). At the rank cutoffs of 10, 100, and 1000, PI-LZerD-2 made successful predictions within 2.5 Å iRMSD (Figure 7C) for 9.32%, 23.73%, and 44.92% of the cases, while the success rates of the base LZerD were 7.63%, 20.34%, and 38.98%. With the 4.0 Å iRMSD cutoff (Figure 7D), the success rates of PI-LZerD-2/the base LZerD were 16.95%/11.86%, 39.83%/29.66%, and 61.02%/53.39% at ranks 10, 100, and 1000. The naive post-filtering again performed worse than the base LZerD at most of the rank cutoff values.
Using this test set, we also examined the effect of using a different number of decoys in the second round of LZerD runs in PI-LZerD. As shown in the illustration of the PI-LZerD algorithm (Figure 2), we use the 30 lowest energy decoys and another 30 decoys with the largest clusters, thus 60 decoys, as the sources of updated PPI sites. We compared prediction results using 50 (i.e. 25 lowest energy decoys and 25 largest cluster decoys), 80, and 100 decoys in Additional file 1, Figure S5. The results show that using 60 docking decoys performs best overall among the settings tested when the cutoff of 2.5 Å is used. When the cutoff of 4.0 Å is used to define near-native decoys, all settings showed similar performance.
Examples of docking prediction by PI-LZerD
The first example is the human cdk2 kinase complex with the cell cycle-regulatory protein ckshs1 (PDB ID: 1BUH). The best predictions within the top 50 using PI-LZerD/naive post-filtering/LZerD were 1.03 Å (8)/9.09 Å (24)/9.91 Å (17) iRMSD, respectively. The ranks of the decoys are shown in parentheses. The second example (Figure 9B) is the monoclonal antibody fab d44.1 complexed with lysozyme (1MLC). The best predictions using PI-LZerD-2/naive post-filtering/LZerD were 0.89 Å (34)/8.37 Å (9)/14.35 Å (22) iRMSD, respectively. The ligand protein position predicted by the naive post-filtering method (shown in red) indicates where the shifted PPI site information pointed. Thus, PI-LZerD managed to find the near-native docking pose (green) starting from the originally provided wrong PPI site information. The near-native pose (iRMSD ≤ 4.0 Å) was not found among the 50 decoys with the lowest energy scores.
The next two examples are taken from the iPFAM dataset, where actual PPI predictions by meta-PPISP were used (Figure 6). Figure 9C is a complex of adenovirus single-stranded DNA-binding proteins (1ADU). The PPI site prediction by meta-PPISP is reasonable for one protein (sensitivity: 0.77) but completely missed the correct PPI site for the other protein (sensitivity and specificity of 0.0). PI-LZerD-2 managed to identify a 1.04 Å iRMSD conformation (blue), while the naive post-filtering method made a substantially wrong prediction (red). The LZerD energy function failed to identify the near-native conformation within the top 50 ranks (yellow). Figure 9D is a complex of methionine synthase (1BMT). The best PI-LZerD-2 prediction is at 2.31 Å iRMSD, while the post-filtering method and the base LZerD predictions are at iRMSDs of 14.4 Å and 13.0 Å, respectively. The PPI predictions for both chains are much worse than average.
The last two examples are from unbound docking experiments using meta-PPISP predictions. The first is the prediction for the α-1-antitrypsin precursor and trypsinogen complex (1OPH). The best iRMSD predictions by PI-LZerD, the post-filtering, and the base LZerD were 3.76 Å, 5.71 Å, and 10.28 Å, respectively. In the last example, the complex of human factor VIII and human monoclonal BO2C11 Fab (1IQD), PI-LZerD-2 again identified a near-native pose (an iRMSD of 2.91 Å) (Figure 9E). The base LZerD found lower energy decoys at a very different position, with an iRMSD of 10.28 Å.
Comparison with an existing method
The performance of docking prediction with CPORT and PI-LZerD is compared in Figures 10C & 10D. Overall, for both iRMSD thresholds of 2.5 Å (Figure 10C) and 4.0 Å (Figure 10D), PI-LZerD-2 showed a higher success rate at each rank cutoff (x-axis). For example, PI-LZerD-2 obtained 14 successful cases out of 57 complexes (24.6%) within 2.5 Å when the top 100 scoring decoys are considered, while CPORT had 9 successful cases (15.8%) at the same cutoff (Figure 9A). Using a 4.0 Å iRMSD threshold, PI-LZerD-2 and CPORT obtained 23 (40.4%) and 21 (36.8%) successful cases within the top 100 decoys, respectively.
We have developed PI-LZerD, a pairwise docking algorithm that uses imperfect PPI prediction to improve docking accuracy. In the series of experiments, we showed that PI-LZerD successfully improved docking results even when the accuracy of the PPI information is low. Unlike post-filtering, whose success largely depends on the accuracy of the provided PPI information, PI-LZerD can use imperfect PPI prediction to improve prediction by exploring docking poses in the neighborhood of the provided PPI prediction. PI-LZerD identifies matches of two proteins at local surface regions that only partially overlap with the provided PPI prediction. In addition, employing two iterations of docking searches (PI-LZerD-2) is shown to be more effective than one round of docking (PI-LZerD-1), because the two iterations enable exploration further from the provided PPI site prediction. Improvement of the average docking accuracy by PI-LZerD over LZerD was observed consistently in the series of benchmark experiments, including docking using actual PPI site predictions as well as unbound docking cases.
While this work focused on pairwise docking, the same procedure can be applied to multiple protein-protein docking algorithms [94–100]. As protein interactions and their networks have become a very important research focus in systems biology, the procedure developed here will be valuable for providing a physical picture of such interactions.
The authors gratefully acknowledge David La for helping to prepare the benchmark dataset from the iPFAM database. We also thank Vishwesh Venkatraman and Yifeng D. Yang for providing the physics-based scoring function. We have used in part the Moffett clusters at the Purdue University Rosen Center for Advanced Computing. This work has been supported by grants from the National Institutes of Health (R01GM075004, R01GM097528). DK also acknowledges grants from the National Science Foundation (DMS0800568, EF0850009, IIS0915801).
- Aloy P, Russell RB: Ten thousand interactions for the molecular biologist. Nat Biotechnol 2004, 22: 1317–1321. 10.1038/nbt1018
- Russell RB, Alber F, Aloy P, Davis FP, Korkin D, Pichaud M, Topf M, Sali A: A structural perspective on protein-protein interactions. Curr Opin Struct Biol 2004, 14: 313–324. 10.1016/j.sbi.2004.04.006
- Szilagyi A, Grimm V, Arakaki AK, Skolnick J: Prediction of physical protein-protein interactions. Phys Biol 2005, 2: S1–16. 10.1088/1478-3975/2/2/S01
- Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, Vijayadamodar G, Pochart P, Machineni H, Welsh M, Kong Y, Zerhusen B, Malcolm R, Varrone Z, Collis A, Minto M, Burgess S, McDaniel L, Stimpson E, Spriggs F, Williams J, Neurath K, Ioime N, Agee M, Voss E, Furtak K, Renzulli R, Aanensen N, Carrolla S, Bickelhaupt E, Lazovatsky Y, DaSilva A, Zhong J, Stanyon CA, Finley RL, White KP, Braverman M, Jarvie T, Gold S, Leach M, Knight J, Shimkets RA, McKenna MP, Chant J, Rothberg JM: A protein interaction map of Drosophila melanogaster. Science 2003, 302: 1727–1736. 10.1126/science.1090289
- Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P, Qureshi-Emili A, Li Y, Godwin B, Conover D, Kalbfleisch T, Vijayadamodar G, Yang M, Johnston M, Fields S, Rothberg JM: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403: 623–627. 10.1038/35001009
- Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci USA 2001, 98: 4569–4574. 10.1073/pnas.061034498
- Collura V, Boissy G: From protein-protein complexes to interactomics. Subcell Biochem 2007, 43: 135–183. 10.1007/978-1-4020-5943-8_8
- Halperin I, Ma B, Wolfson H, Nussinov R: Principles of docking: An overview of search algorithms and a guide to scoring functions. Proteins 2002, 47: 409–443. 10.1002/prot.10115
- Ritchie DW: Recent progress and future directions in protein-protein docking. Curr Protein Pept Sci 2008, 9: 1–15. 10.2174/138920308783565741
- Lensink MF, Wodak SJ: Docking and scoring protein interactions: CAPRI 2009. Proteins 2010, 78: 3073–3084. 10.1002/prot.22818
- Gabb HA, Jackson RM, Sternberg MJ: Modelling protein docking using shape complementarity, electrostatics and biochemical information. J Mol Biol 1997, 272: 106–120. 10.1006/jmbi.1997.1203
- Tovchigrechko A, Wells CA, Vakser IA: Docking of protein models. Protein Sci 2002, 11: 1888–1896. 10.1110/ps.4730102
- Gray JJ, Moughon S, Wang C, Schueler-Furman O, Kuhlman B, Rohl CA, Baker D: Protein-protein docking with simultaneous optimization of rigid-body displacement and side-chain conformations. J Mol Biol 2003, 331: 281–299. 10.1016/S0022-2836(03)00670-3
- Dominguez C, Boelens R, Bonvin AM: HADDOCK: a protein-protein docking approach based on biochemical or biophysical information. J Am Chem Soc 2003, 125: 1731–1737. 10.1021/ja026939x
- Jiang F, Kim SH: "Soft docking": matching of molecular surface cubes. J Mol Biol 1991, 219: 79–102. 10.1016/0022-2836(91)90859-5
- Chen R, Li L, Weng Z: ZDOCK: an initial-stage protein-docking algorithm. Proteins 2003, 52: 80–87. 10.1002/prot.10389
- Ritchie DW, Kemp GJ: Protein docking using spherical polar Fourier correlations. Proteins 2000, 39: 178–194. 10.1002/(SICI)1097-0134(20000501)39:2<178::AID-PROT8>3.0.CO;2-6
- Venkatraman V, Yang YD, Sael L, Kihara D: Protein-protein docking using region-based 3D Zernike descriptors. BMC Bioinformatics 2009, 10: 407. 10.1186/1471-2105-10-407
- Garzon JI, Lopez-Blanco JR, Pons C, Kovacs J, Abagyan R, Fernandez-Recio J, Chacon P: FRODOCK: a new approach for fast rotational protein-protein docking. Bioinformatics 2009, 25: 2544–2551. 10.1093/bioinformatics/btp447
- de Vries SJ, Bonvin AM: How proteins get in touch: interface prediction in the study of biomolecular complexes. Curr Protein Pept Sci 2008, 9: 394–406. 10.2174/138920308785132712
- Kozakov D, Brenke R, Comeau SR, Vajda S: PIPER: an FFT-based protein docking program with pairwise potentials. Proteins 2006, 65: 392–406. 10.1002/prot.21117
- Fischer D, Lin SL, Wolfson HL, Nussinov R: A geometry-based suite of molecular docking processes. J Mol Biol 1995, 248: 459–477.
- Gardiner EJ, Willett P, Artymiuk PJ: GAPDOCK: a Genetic Algorithm Approach to Protein Docking in CAPRI round 1. Proteins 2003, 52: 10–14. 10.1002/prot.10386
- Gardiner EJ, Willett P, Artymiuk PJ: Protein docking using a genetic algorithm. Proteins 2001, 44: 44–56. 10.1002/prot.1070
- Li X, Moal IH, Bates PA: Detection and refinement of encounter complexes for protein-protein docking: taking account of macromolecular crowding. Proteins 2010, 78: 3189–3196. 10.1002/prot.22770
- Schueler-Furman O, Wang C, Baker D: Progress in protein-protein docking: atomic resolution predictions in the CAPRI experiment using RosettaDock with an improved treatment of side-chain flexibility. Proteins 2005, 60: 187–194. 10.1002/prot.20556
- Shentu Z, Al HM, Bystroff C, Zaki MJ: Context shapes: Efficient complementary shape matching for protein-protein docking. Proteins 2008, 70: 1056–1073.
- Schneidman-Duhovny D, Inbar Y, Nussinov R, Wolfson HJ: PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Res 2005, 33: W363-W367. 10.1093/nar/gki481
- Comeau SR, Gatchell DW, Vajda S, Camacho CJ: ClusPro: an automated docking and discrimination method for the prediction of protein complexes. Bioinformatics 2004, 20: 45–50. 10.1093/bioinformatics/btg371
- Kozakov D, Clodfelter KH, Vajda S, Camacho CJ: Optimal clustering for detecting near-native conformations in protein docking. Biophys J 2005, 89: 867–875. 10.1529/biophysj.104.058768
- Tong W, Weng Z: Clustering protein-protein docking predictions. Conf Proc IEEE Eng Med Biol Soc 2004, 4: 2999–3002.
- Das R, Andre I, Shen Y, Wu Y, Lemak A, Bansal S, Arrowsmith CH, Szyperski T, Baker D: Simultaneous prediction of protein folding and docking at high resolution. Proc Natl Acad Sci USA 2009, 106: 18978–18983. 10.1073/pnas.0904407106
- Shen Y, Paschalidis IC, Vakili P, Vajda S: Protein docking by the underestimation of free energy funnels in the space of encounter complexes. PLoS Comput Biol 2008, 4: e1000191. 10.1371/journal.pcbi.1000191
- Pierce B, Weng Z: ZRANK: reranking protein docking predictions with an optimized energy function. Proteins 2007, 67: 1078–1086. 10.1002/prot.21373
- Hutchinson CL, Lowe PN, McLaughlin SH, Mott HR, Owen D: Mutational Analysis Reveals a Single Binding Interface between RhoA and Its Effector, PRK1. Biochemistry 2011.
- Bradshaw RT, Patel BH, Tate EW, Leatherbarrow RJ, Gould IR: Comparing experimental and computational alanine scanning techniques for probing a prototypical protein-protein interaction. Protein Eng Des Sel 2011, 24: 197–207. 10.1093/protein/gzq047
- Bogan AA, Thorn KS: Anatomy of hot spots in protein interfaces. J Mol Biol 1998, 280: 1–9. 10.1006/jmbi.1998.1843
- Delano WL: Unraveling hot spots in binding interfaces: progress and challenges. Curr Opin Struct Biol 2002, 12: 14–20. 10.1016/S0959-440X(02)00283-X
- Dhungana S, Fessler MB, Tomer KB: Epitope mapping by differential chemical modification of antigens. Methods Mol Biol 2009, 524: 119–134. 10.1007/978-1-59745-450-6_9
- Speck SH, Koppenol WH, Dethmers JK, Osheroff N, Margoliash E, Rajagopalan KV: Definition of cytochrome c binding domains by chemical modification. Interaction of horse cytochrome c with beef sulfite oxidase and analysis of steady state kinetics. J Biol Chem 1981, 256: 7394–7400.
- Bonvin AM, Boelens R, Kaptein R: NMR analysis of protein interactions. Curr Opin Chem Biol 2005, 9: 501–508. 10.1016/j.cbpa.2005.08.011
- Anand GS, Law D, Mandell JG, Snead AN, Tsigelny I, Taylor SS, Ten Eyck LF, Komives EA: Identification of the protein kinase A regulatory RIalpha-catalytic subunit interface by amide H/2H exchange and protein docking. Proc Natl Acad Sci USA 2003, 100: 13264–13269. 10.1073/pnas.2232255100
- Meenan NA, Sharma A, Fleishman SJ, Macdonald CJ, Morel B, Boetzel R, Moore GR, Baker D, Kleanthous C: The structural and energetic basis for high selectivity in a high-affinity protein-protein interaction. Proc Natl Acad Sci USA 2010, 107: 10080–10085. 10.1073/pnas.0910756107
- Wiehe K, Pierce B, Tong WW, Hwang H, Mintseris J, Weng Z: The performance of ZDOCK and ZRANK in rounds 6–11 of CAPRI. Proteins 2007, 69: 719–725. 10.1002/prot.21747
- Chelliah V, Blundell TL, Fernandez-Recio J: Efficient restraints for protein-protein docking by comparison of observed amino acid substitution patterns with those predicted from local environment. J Mol Biol 2006, 357: 1669–1682. 10.1016/j.jmb.2006.01.001
- Pletneva EV, Laederach AT, Fulton DB, Kostic NM: The role of cation-pi interactions in biomolecular association. Design of peptides favoring interactions between cationic and aromatic amino acid side chains. J Am Chem Soc 2001, 123: 6232–6245. 10.1021/ja010401u
- Jones S, Thornton JM: Analysis of protein-protein interaction sites using surface patches. J Mol Biol 1997, 272: 121–132. 10.1006/jmbi.1997.1234
- Lo CL, Chothia C, Janin J: The atomic structure of protein-protein recognition sites. J Mol Biol 1999, 285: 2177–2198. 10.1006/jmbi.1998.2439
- Liang S, Zhang C, Liu S, Zhou Y: Protein binding site prediction using an empirical scoring function. Nucleic Acids Res 2006, 34: 3698–3707. 10.1093/nar/gkl454
- Jones S, Thornton JM: Principles of protein-protein interactions. Proc Natl Acad Sci USA 1996, 93: 13–20. 10.1073/pnas.93.1.13
- Neuvirth H, Raz R, Schreiber G: ProMate: a structure based prediction program to identify the location of protein-protein binding sites. J Mol Biol 2004, 338: 181–199. 10.1016/j.jmb.2004.02.040
- Negi SS, Braun W: Statistical analysis of physical-chemical properties and prediction of protein-protein interfaces. J Mol Model 2007, 13: 1157–1167. 10.1007/s00894-007-0237-0
- Mihalek I, Res I, Yao H, Lichtarge O: Combining inference from evolution and geometric probability in protein structure evaluation. J Mol Biol 2003, 331: 263–279. 10.1016/S0022-2836(03)00663-6
- Zhou HX, Shan Y: Prediction of protein interaction sites from sequence profile and residue neighbor list. Proteins 2001, 44: 336–343. 10.1002/prot.1099
- Tjong H, Qin S, Zhou HX: PI2PE: protein interface/interior prediction engine. Nucleic Acids Res 2007, 35: W357-W362. 10.1093/nar/gkm231
- Porollo A, Meller J: Prediction-based fingerprints of protein-protein interactions. Proteins 2007, 66: 630–645.
- Caffrey DR, Somaroo S, Hughes JD, Mintseris J, Huang ES: Are protein-protein interfaces more conserved in sequence than the rest of the protein surface? Protein Sci 2004, 13: 190–202. 10.1110/ps.03323604
- Halperin I, Wolfson H, Nussinov R: Correlated mutations: advances and limitations. A study on fusion proteins and on the Cohesin-Dockerin families. Proteins 2006, 63: 832–845. 10.1002/prot.20933
- Pazos F, Helmer-Citterich M, Ausiello G, Valencia A: Correlated mutations contain information about protein-protein interaction. J Mol Biol 1997, 271: 511–523. 10.1006/jmbi.1997.1198
- Pazos F, Valencia A: In silico two-hybrid system for the selection of physically interacting protein pairs. Proteins 2002, 47: 219–227. 10.1002/prot.10074
- Kufareva I, Budagyan L, Raush E, Totrov M, Abagyan R: PIER: protein interface recognition for structural proteomics. Proteins 2007, 67: 400–417. 10.1002/prot.21233
- Burgoyne NJ, Jackson RM: Predicting protein interaction sites: binding hot-spots in protein-protein and protein-ligand interfaces. Bioinformatics 2006, 22: 1335–1342. 10.1093/bioinformatics/btl079
- Liang S, Zhang J, Zhang S, Guo H: Prediction of the interaction site on the surface of an isolated protein structure by analysis of side chain energy scores. Proteins 2004, 57: 548–557. 10.1002/prot.20238
- Bradford JR, Westhead DR: Improved prediction of protein-protein binding sites using a support vector machines approach. Bioinformatics 2005, 21: 1487–1494. 10.1093/bioinformatics/bti242
- Res I, Mihalek I, Lichtarge O: An evolution based classifier for prediction of protein interfaces without using protein structures. Bioinformatics 2005, 21: 2496–2501. 10.1093/bioinformatics/bti340
- Pettit FK, Bare E, Tsai A, Bowie JU: HotPatch: a statistical approach to finding biologically relevant features on protein surfaces. J Mol Biol 2007, 369: 863–879. 10.1016/j.jmb.2007.03.036
- Li MH, Lin L, Wang XL, Liu T: Protein-protein interaction site prediction based on conditional random fields. Bioinformatics 2007, 23: 597–604. 10.1093/bioinformatics/btl660
- La D, Kihara D: Predicting binding interfaces of protein-protein interactions. In Biological Data Mining in Protein Interaction Networks. Edited by: Li XL, Ng SK. Philadelphia: IGI Global; 2010:64–79.
- Ezkurdia I, Bartoli L, Fariselli P, Casadio R, Valencia A, Tress ML: Progress and challenges in predicting protein-protein interaction sites. Brief Bioinform 2009, 10: 233–246.
- Qin S, Zhou HX: meta-PPISP: a meta web server for protein-protein interaction site prediction. Bioinformatics 2007, 23: 3386–3387. 10.1093/bioinformatics/btm434
- Zhou HX, Qin S: Interaction-site prediction for protein complexes: a critical assessment. Bioinformatics 2007, 23: 2203–2209. 10.1093/bioinformatics/btm323
- Heuser P, Bau D, Benkert P, Schomburg D: Refinement of unbound protein docking studies using biological knowledge. Proteins 2005, 61: 1059–1067. 10.1002/prot.20634
- Tress M, de JD, Grana O, Gomez MJ, Gomez-Puertas P, Gonzalez JM, Lopez G, Valencia A: Scoring docking models with evolutionary information. Proteins 2005, 60: 275–280. 10.1002/prot.20570
- de Vries SJ, Bonvin AM: CPORT: A Consensus Interface Predictor and Its Performance in Prediction-Driven Docking with HADDOCK. PLoS ONE 2011, 6: e17695. 10.1371/journal.pone.0017695
- Huang B, Schroeder M: Using protein binding site prediction to improve protein docking. Gene 2008, 422: 14–21. 10.1016/j.gene.2008.06.014
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res 2000, 28: 235–242. 10.1093/nar/28.1.235
- Wolfson H, Rigoutsos I: Geometric hashing: an overview. IEEE Computational Science Engineering 1997, 4: 10–21.
- Andrusier N, Mashiach E, Nussinov R, Wolfson HJ: Principles of flexible protein-protein docking. Proteins 2008, 73: 271–289. 10.1002/prot.22170
- Meyer M, Wilson P, Schomburg D: Hydrogen bonding and molecular surface shape complementarity as a basis for protein docking. J Mol Biol 1996, 264: 199–210. 10.1006/jmbi.1996.0634
- Lazaridis T, Karplus M: Effective energy functions for protein structure prediction. Curr Opin Struct Biol 2000, 10: 139–145. 10.1016/S0959-440X(00)00063-4
- Eisenberg D, McLachlan AD: Solvation energy in protein folding and binding. Nature 1986, 319: 199–203. 10.1038/319199a0
- Zhang C, Vasmatzis G, Cornette JL, DeLisi C: Determination of atomic desolvation energies from the structures of crystallized proteins. J Mol Biol 1997, 267: 707–726. 10.1006/jmbi.1996.0859
- Mintseris J, Wiehe K, Pierce B, Anderson R, Chen R, Janin J, Weng Z: Protein-Protein Docking Benchmark 2.0: an update. Proteins 2005, 60: 214–216. 10.1002/prot.20560
- Huang SY, Zou X: An iterative knowledge-based scoring function for protein-protein recognition. Proteins 2008, 72: 557–579. 10.1002/prot.21949
- Kabsch W: A discussion of the solution for the best rotation to relate two sets of vectors. Acta Cryst 1978, A34: 827–828.
- Janin J, Henrick K, Moult J, Eyck LT, Sternberg MJ, Vajda S, Vakser I, Wodak SJ: CAPRI: a Critical Assessment of PRedicted Interactions. Proteins 2003, 52: 2–9. 10.1002/prot.10381
- Hwang H, Pierce B, Mintseris J, Janin J, Weng Z: Protein-protein docking benchmark version 3.0. Proteins 2008, 73: 705–709. 10.1002/prot.22106
- Mendez R, Leplae R, Lensink MF, Wodak SJ: Assessment of CAPRI predictions in rounds 3–5 shows progress in docking procedures. Proteins 2005, 60: 150–169. 10.1002/prot.20551
- Finn RD, Marshall M, Bateman A: iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions. Bioinformatics 2005, 21: 410–412. 10.1093/bioinformatics/bti011
- Finn RD, Mistry J, Tate J, Coggill P, Heger A, Pollington JE, Gavin OL, Gunasekaran P, Ceric G, Forslund K, Holm L, Sonnhammer EL, Eddy SR, Bateman A: The Pfam protein families database. Nucleic Acids Res 2010, 38: D211-D222. 10.1093/nar/gkp985
- UniProt Consortium: The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 2010, 38: D142-D148.
- Henrick K, Thornton JM: PQS: a protein quaternary structure file server. Trends Biochem Sci 1998, 23: 358–361. 10.1016/S0968-0004(98)01253-5
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res 2004, 32: 1792–1797. 10.1093/nar/gkh340
- Esquivel-Rodriguez J, Yang YD, Kihara D: Multi-LZerD: multiple protein docking for asymmetric complexes. 3DSIG 2011: Structural Bioinformatics and Computational Biophysics 2011.
- Karaca E, Melquiond AS, de Vries SJ, Kastritis PL, Bonvin AM: Building macromolecular assemblies by information-driven docking: introducing the HADDOCK multibody docking server. Mol Cell Proteomics 2010, 9: 1784–1794. 10.1074/mcp.M000051-MCP201
- Comeau SR, Camacho CJ: Predicting oligomeric assemblies: N-mers a primer. J Struct Biol 2005, 150: 233–244. 10.1016/j.jsb.2005.03.006
- Berchanski A, Eisenstein M: Construction of molecular assemblies via docking: modeling of tetramers with D2 symmetry. Proteins 2003, 53: 817–829. 10.1002/prot.10480
- Andre I, Bradley P, Wang C, Baker D: Prediction of the structure of symmetrical protein assemblies. Proc Natl Acad Sci USA 2007, 104: 17656–17661. 10.1073/pnas.0702626104
- Inbar Y, Benyamini H, Nussinov R, Wolfson HJ: Combinatorial docking approach for structure prediction of large proteins and multi-molecular assemblies. Phys Biol 2005, 2: S156-S165. 10.1088/1478-3975/2/4/S10
- Esquivel-Rodriguez J, Kihara D: Evaluation of multiple protein docking structures using correctly predicted pairwise subunits. BMC Bioinformatics 2011, in press.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.