Scoring docking conformations using predicted protein interfaces
© Esmaielbeiki and Nebel; licensee BioMed Central Ltd. 2014
Received: 11 December 2012
Accepted: 29 May 2014
Published: 6 June 2014
Since proteins function by interacting with other molecules, analysis of protein-protein interactions is essential for comprehending biological processes. Whereas understanding of atomic interactions within a complex is especially useful for drug design, limitations of experimental techniques have restricted their practical use. Despite progress in docking predictions, there is still room for improvement. In this study, we contribute to this topic by proposing T-PioDock, a framework for detection of a native-like docked complex 3D structure. T-PioDock supports the identification of near-native conformations among the 3D models produced by docking software by scoring those models using binding interfaces predicted by the interface predictor T-PIP (Template based Protein Interface Prediction).
First, exhaustive evaluation of interface predictors demonstrates that T-PIP, whose predictions are customised to target complexity, is a state-of-the-art method. Second, comparative study between T-PioDock and other state-of-the-art scoring methods establishes T-PioDock as the best performing approach. Moreover, there is good correlation between T-PioDock performance and quality of docking models, which suggests that progress in docking will lead to even better results at recognising near-native conformations.
Accurate identification of near-native conformations remains a challenging task. Although availability of 3D complexes will benefit template-based methods such as T-PioDock, we have identified specific limitations which need to be addressed. First, docking software is still not able to produce native-like models for every target. Second, current interface predictors do not explicitly consider pairwise residue interactions between proteins and their interacting partners, which leaves ambiguity when assessing the quality of complex conformations.
Keywords: Protein-protein interaction; Interface prediction; Homology modelling; Docking; Model scoring; Model ranking
Since proteins function by interacting with other molecules, analysis of protein-protein interactions is essential for comprehending biological processes. Given that alteration in those interactions can result in diseases, their identification is key information for drug design. For example, discovery that the Von Hippel-Lindau syndrome (VHL), a disorder characterised by the formation of tumours and cysts, is caused by a single mutation in the VHL protein which perturbs binding to the hypoxia-inducible factor has led to the manufacture of novel cancer drugs [1–3]. Experimental techniques such as Y2H, phage display and affinity purification have played an important role in deciphering protein interaction networks. Despite these efforts, only 10% of the human interactome has been experimentally determined. Moreover, elucidation of biological processes often requires an understanding of atomic interactions within a complex. Although such information may be generated by X-ray crystallography or nuclear magnetic resonance, high costs in time and resources, and technical limitations have prevented their widespread usage. Since approximately 40,000 protein complexes are available in the Protein Data Bank (PDB) and PQS, they can be used for computational modelling of interactions: docking intends to predict a complex 3D structure from the structures of its components. Using energy-based cost functions, it explores the space of possible conformations and generates a list of plausible models. Although this list often contains near-native conformations, additional knowledge, such as binding site location or interacting residues, is required to identify them. As a consequence, accurate prediction of protein interfaces has become an important component of a docking framework [11–14]. After a review of interface predictors, we explore how they have been used as constraints to evaluate docking conformations.
Protein-protein interface prediction
Computational methods which have been proposed for identifying interface residues of proteins can be broadly divided into two non-exclusive categories based on their use of protein information. The first approach is based on the specific features of sequences and/or structures, while the second one explores proteins which are either sequentially or structurally related to the query protein (QP).
A large variety of intrinsic features have been used for interface prediction. They include composition and propensity of interface residues, physico-chemical properties [16, 17], predicted structural characteristics, secondary structure, solvent-accessible surface area [19, 20], geometrical shape of the protein surface and crystallographic B-factor [18, 21]. One of the first studies was conducted by Ofran and Rost, who used amino-acid composition to predict interfaces. Since they had previously shown that residues at interfaces have a totally different composition from others, this information was used to train a Neural Network (NN). They further improved their approach by introducing ISIS [16, 23], which uses both evolutionary profiles and predicted structural features for NN training. Better performance, especially in terms of sensitivity, demonstrates the value of integrating predicted structural information. ISIS prediction of a few residues (low sensitivity) with high accuracy suggests the importance of these residues in binding; they have been referred to as hot-spot residues. Other studies have confirmed the intuitive assumption that inclusion of structural information improves performance since non-surface residues can be trivially eliminated [19, 24]. A popular approach has been to exploit that information, either predicted or actual, using machine learning methods. Whereas Cons-PPISP relied on consensus predictions from multiple neural networks, ProMate adopted an approach using a Bayesian network involving 13 different features. Eventually, the usage of additional structural information in the form of side chain energy estimation allowed PINUP to perform better than both Cons-PPISP and ProMate. Finally, a meta predictor, meta-PPISP, which combines the scores of PINUP, Cons-PPISP and ProMate, was shown to outperform each of these individual methods [24, 28].
An alternative research line has exploited the fact that structurally similar proteins (or structural neighbours) share similar interaction sites even if they are unrelated [10, 29, 30]. PredUs extracts structural neighbours of the QP, maps interface residues onto the QP and scores these residues using an SVM based classifier according to their intrinsic features [29, 30]. PrISE follows a similar approach but uses only local structural similarity from a repository of structural elements. Experiments show that PredUs and PrISE perform similarly and outperform meta-PPISP [31, 32]. Despite homology requirements potentially reducing the scope of usability of prediction methods, many approaches have exploited available homologous structures and/or sequences [33, 34]. These methods use Multiple Sequence Alignment (MSA) and/or phylogenetic trees to detect homologues and extract evolutionary information. HomPPI divides the homologues of a QP into three zones according to interface conservation: Safe, Twilight and Dark Zones. Interfaces are then predicted using an MSA of homologues from the most reliable available zone. Performance was significantly improved by using Structure-based-MSA (S-MSA) in IBIS. IBIS combines sequence and structure conservation scores to detect potential binding sites. IBIS structurally aligns the QP with its homologues, creating an S-MSA which highlights the interface residues of homologues. Then, using the S-MSA, a binding site similarity matrix is generated by comparing the structure and sequence of each homologue against all other homologues. Using the matrix, similar binding sites are clustered into groups which are ranked according to a weighted combination of sequence similarity score and conservation score. The inferred binding site of the best ranked group is then mapped onto the QP.
Recently, we introduced a novel template based approach, WePIP, which goes further than any other method in exploiting homology: not only is continuous scoring used to express homology closeness to the QP, but the nature of interaction partners is also considered. Initial evaluation has suggested that WePIP outperforms competitors in terms of precision and accuracy.
Scoring protein-protein docking conformations
Protein-protein docking aims to computationally predict the 3D structure of a protein complex using the unbound structures of its components (useful reviews can be found in [36–39]). Docking algorithms can be divided into two groups, i.e. template-based [10, 41, 42] and template-free docking. With the increase in the number of 3D structures, template-based docking, which uses experimentally determined structures as templates to generate new complexes, has become particularly popular. Template-based docking is particularly attractive since, unlike template-free docking, its low computational cost makes it suitable for interactome scale predictions. Template-free docking still remains highly important since not all proteins can be modelled using templates. In addition, free-docking approaches with their refinement stage have made it possible to generate high resolution structures, which are important for understanding the molecular mechanism of protein contacts. In this paper we simply refer to template-free docking as ‘docking’.
Performances of docking algorithms are compared biannually in the CAPRI (Critical Assessment of PRedicted Interactions) competition and are evaluated against larger protein docking benchmarks [45–47]. Those algorithms explore thousands of docking orientations (sampling) that are assessed using an energy-based scoring function involving, in the case of ZDOCK, measures of shape complementarity, desolvation and electrostatics. In order to introduce flexibility, ensembles of conformations have also been used to generate docking conformations. These ensembles are taken from X-ray or NMR structures or generated using computational methods such as molecular dynamics (MD) simulations, normal modes and loop modelling. One way of docking ensembles is to dock their members one by one (cross-docking), but since this is computationally expensive, methods such as the mean-field approach have been used. Two studies, by Smith et al. and Grünberg et al., investigated ensemble docking by using MD simulations along with the 3D-DOCK and HEX docking methods. Although they discovered an increase in the number of native-like solutions in the pool of docked conformations, scoring became more difficult since wrong solutions were given higher ranks. In order to introduce flexibility and to reduce the size of the sampling space, some methods have adopted energy minimization (EM) techniques such as MD [13, 52] or Monte Carlo [53–56] simulated annealing.
These methods still produce a large number of solutions which require post-processing to detect native-like conformations. One should also be aware that since the present techniques neglect the presence of water during docking, the assembly of models can differ from the actual targets within a soluble environment. In order to refine the list of putative docking models, an additional step may be performed by applying energy minimisation, clustering or knowledge generated from available 3D structures. Typically, Cluspro [58–60], a state-of-the-art method, clusters the top 1000 models in terms of energy to generate a shorter set (hundreds) of model representatives. Although these models are associated with scores, these scores have been shown to be unreliable for identifying near-native configurations.
Since docking software produces hundreds to thousands of putative models, their exploitation requires the ability to score them accurately [62–64]. Intuitively, physics-based scoring functions are particularly attractive since they can be applied to any model by exploiting physicochemical features of the atoms. ZRANK relies on a combination of three atom-based terms, i.e. van der Waals, electrostatics and desolvation. In order to handle conformational changes upon binding, an extension of ZRANK, IRAD, integrated residue and atom based potentials. Experiments showed that it outperforms ZRANK when dealing with complexes of medium docking difficulty.
Since comparative studies have shown that energy-based scoring functions are error-prone [65, 66], machine learning and knowledge-based statistical methods seem to be more promising approaches. Zhao et al. proposed a supervised (SVM) and a semi-supervised (TSVM) feature-based learning method trained using 3D interface features generated from interaction interfaces of protein complexes. Experiments revealed that both approaches can distinguish between native and non-native structures with accuracy around 80%. More recently, Othersen et al. conducted a similar experiment using mutual information to select discriminative structural features. They identified 11 such features, which led to good identification of near-native models.
Knowledge of interface residues has proved particularly successful and has been applied to either constrain the initial search space of docking software [13, 14] or score docking conformations by calculating the similarity between the interfaces of the docked models and the predicted ones [69, 70]. Since evaluating interfaces can be applied to models generated by any docking software and can be combined with other scoring functions, the scoring strategy has proved more popular and practical. Experiments aiming at gaining insight into the value of using interface knowledge showed that knowledge of at least 40% of interface residues is sufficient to significantly improve ZDOCK rankings. As a consequence, standard interface prediction approaches, such as cons-PPISP, ProMate and HomPPI, were extended to evaluate the fit of docked proteins against their predicted binding sites [69, 70]. By combining five interface predictors, i.e. ProMate, PPI-Pred, PPISP, PINUP and a predictor based on NN, into one framework called metaPPI, success rates were improved by 15% in comparison to the best individual predictors. Finally, instead of representing interacting interfaces as a two-patch system, SPIDER evaluates multi-residue interactions using a library of contacts containing graph representations of common interfacial patterns. Although SPIDER has been claimed to outperform ZRANK, its usage is limited by the requirement of accurate and high resolution interfaces.
As highlighted in the latest edition of CAPRI, despite progress in docking predictions, there is still room for improvement. In this study, we contribute to this topic by proposing T-PioDock (Template based Protein Interface prediction and protein interface Overlap for Docking model scoring), a framework for detection of a native-like docked complex 3D structure. T-PioDock aims at supporting the identification of near-native conformations from 3D models produced by docking software by scoring those models. As supported by the review in the “Scoring protein-protein docking conformations” section, T-PioDock exploits template based predictions of complexes’ binding interfaces to evaluate docking configurations.
In this paper, we first conduct an exhaustive evaluation of interface predictors on a set of standard benchmark datasets and demonstrate that the T-PIP methodology whose predictions are customised to target complexity performs best. Second, we provide a comparative study between T-PioDock and other state-of-the-art scoring methods on the most complete docking benchmark dataset. This establishes T-PioDock as the best performing approach. Then, we discuss those results in the context of identification of the best conformations. Finally, we present the methodology behind T-PIP and T-PioDock.
Datasets and tools
Interface predictors and docking model ranking approaches are evaluated using three standard benchmark datasets: Ds56unbound, Docking Benchmark 3.0 (DBMK3.0) and Docking Benchmark 4.0 (DBMK4.0). These datasets contain high-resolution protein structures both in their unbound and bound forms.
Ds56unbound comprises 56 unbound chains generated from 27 CAPRI targets, T01 to T27. In total, it contains 12173 residues including 2112 interacting ones. This dataset is used to perform evaluation of all interface prediction methods of interest.
In this study, initial docking predictions were produced using the ClusPro 2.0 docking server, which performed best at CAPRI 2009. For a pair of proteins, Cluspro generates hundreds of conformational models usually containing at least one near-native model. These models are generated by minimising their energy and are then clustered. Clusters are ranked based on their size. Unfortunately, these rankings have proved unable to detect the near-native models [61, 70].
Evaluation of interface prediction methods
In the first set of experiments, the performance of state-of-the-art methods was evaluated using the Ds56unbound dataset. According to T-PIP, 27 chains were classified as ‘trivial’, 24 as ‘homologous’ and 5 as ‘unknown’ based on homologue availability in the PDB (see ‘T-PIP: Template based Protein Interface Prediction’ in the Methods section). In addition, to evaluate interface prediction without knowledge of the QP structure, we also produced results where the QP sequence, instead of its structure, was aligned with a Structure-based-MSA (S-MSA) of its homologues. Those results are labelled as T-PIPQPseq+S-MSA. Since the IBIS server may provide several interfaces for a given protein, performance is calculated here by selecting the interface with the highest score. Note that two targets could not have their interface predicted using IBIS (5HMG-A and 5HMG-B). It should be stressed that, although T-PIPQPseq+S-MSA does not require the actual QP structure, it relies on the availability of the 3D structures of QP homologues.
Table 1. Evaluation of interface prediction methods using the Ds56unbound dataset [table body not recovered]
Table 2. Comparison of interface predictors’ performance on DS120 and DS236 [table body not recovered]
Moreover, in Table 1, T-PIP displays either best or second best results, competing with PrISE and PredUs [29, 30] depending on the metric considered. Comparison between standard T-PIP and T-PIPQPseq+S-MSA suggests that availability of the QP structure only marginally increases performance and is, therefore, not required for interface prediction. Nevertheless, standard T-PIP is used in all remaining experiments.
Further tests were conducted on the best performing approaches, i.e. IBIS, PrISE, PredUs and T-PIP, using the DS120 and DS236 datasets. Note that for DS120 PredUs and IBIS failed to process 1 (1ZK0-B) and 9 proteins, respectively. For DS236, while IBIS failed to make predictions for 32 proteins which did not have ‘close’ homologues, i.e. at least 30% sequence similarity to the QP and 75% binding site overlap with the QP structure, T-PIP, which investigates remote homologues, was only unable to process 2 proteins (1H20-A and 1QFD-A) that do not have any structural neighbour. Since PredUs used DS120 chains for training, its performance on an independent dataset is likely to be lower (results on DS236 were not available). When using the PrISE server, query chains were removed from the database used for computation of similar structures.
Table 3. T-PIP performance on DS120 and DS236 according to DBMK categories. Categories for DS120: Rigid Body (86 chains), Medium-Difficulty (18 chains), Difficult (16 chains); for DS236: Rigid Body (156 chains), Medium-Difficulty (44 chains), Difficult (34 chains) [table body not recovered]
Table 4. T-PIP performance on DS120 and DS236 according to target complexity [table body not recovered]
Table 5. WePIP performance compared to PredUs on DS48 [table body not recovered]
Processing of ‘homologous’ targets by WePIP relies on extracting the relevant interacting residues from the interfaces of homologous proteins. In order to evaluate this process, for each protein from the 93 ‘homologous’ targets defined in Table 2 (DS93, see Additional file 4), the precision that would have been obtained using simply the interface of a homologue is computed. This shows how representative the interface of a given homologue complex is of the solution binding site. In addition, for a given target, the average of its homologues’ precisions and its T-PIP precision are calculated. Figure 3 shows the quality of T-PIP predictions with respect to target homologues. Note that query proteins are identified by their association with their target using the following notation: ABCD:WXYZ-E, where ABCD is the PDB code of the complex target and WXYZ-E is the query protein PDB code and chain, e.g. 1ZM4:1XK9-A.
Ranking Docking Conformations
Table 6. Performance of docking model rankings according to ground truth criterion (DS93 dataset) based on average normalized χ² [table body not recovered]
Table 7. Performance of docking model rankings according to ground truth criterion (DS128 dataset) based on average normalized χ² [table body not recovered]
In a first set of experiments, T-PioDock was compared to other state-of-the-art methods using DS93. In addition, we evaluated the PioDock module by applying it to the ground truth interfaces of the target complexes (Interfaces + PioDock) instead of their T-PIP predictions. We used two different metrics to perform this comparison: (i) normalized χ² and (ii) the mean log rank metric (MLR) (for details see the Methods section). Table 6 displays the average normalized χ² between the GT and the rankings produced by each method. First, although Interfaces + PioDock is based not on interface prediction but on actual interfaces, its normalized χ² is worse than the reference scores (here, it is double). This can be explained by the fact that docking interfaces are treated as two sets of interface residues without any pairwise knowledge (patches), which is the output of current interface predictors: they could perfectly overlap even if the position of a binding partner was rotated around the centre of the patches. Second, we investigated the usage of other interface predictors (here PrISE and IBIS) along with PioDock (shown as PrISE + PioDock and IBIS + PioDock, respectively) in ranking docking conformations. As demonstrated in the table, PioDock based rankings are superior to all other methods whatever the criterion used to generate the GT rankings. PioDock produces very similar results with the 3 state-of-the-art template based interface predictors (large standard deviations show that differences are not significant). Although those methods generate interfaces with different amino acid compositions, this does not affect PioDock much since it relies implicitly on comparing interface ‘patches’ to see if a complex is compatible or not. These results highlight the robustness of PioDock to small variations in interface predictions. Moreover, relative performances between other methods are in agreement with previously reported results [63, 64, 67, 74].
Table 8. Performance of docking model rankings according to ground truth criterion (DS93 dataset) based on mean log rank metric [table body not recovered]
In a second set of experiments, T-PioDock is evaluated on DS128, see Table 7. As expected, better interface predictions for this ‘trivial’ dataset lead to better quality of rankings for T-PioDock compared to DS93.
Since the quality of the best docking model varies widely between targets, it is interesting to quantify how it affects model ranking by T-PioDock. In order to study this, best models from the ‘homologous’ target set were clustered using K-means clustering into three groups, i.e. good, average and bad, after normalisation. In Table 9, the average normalized χ² per group shows that T-PioDock produces significantly better rankings when a better quality model is available. This suggests that progress in docking would lead to better performance by T-PioDock.
Table 9. T-PioDock ranking performance (average normalized χ²) based on the quality of the best model [table body not recovered]
This study has confirmed that despite sustained activity in the field, the prediction of a complex 3D structure remains a challenge. First, docking software may not be able to produce any near native conformation among the generated set of putative models. Second, identification of the best conformations remains a difficult task. In this work, we have contributed to this topic by offering a pipeline, T-PioDock, for scoring docking models according to the overlap of their components’ predicted interfaces.
Experiments evaluating the proposed scoring process, PioDock, on actual interfaces (Interfaces + PioDock system) showed that the treatment of docking interfaces as patches instead of sets of residue interactions affects the quality of the model selection process: two patches can perfectly overlap even if all binary residue interactions are incorrect. Unfortunately, there is currently no promising alternative since the current state of the art in interface prediction is not able to work at such a level of detail, even if this has started to be explored [61, 74]. Although this is an important issue, the study has revealed that the main source of scoring inaccuracy resides with the quality of predicted interfaces, see Table 7. Exhaustive evaluation of interface prediction methods demonstrated that T-PIP is a state-of-the-art method; moreover, comprehensive comparison of state-of-the-art methods for ranking docking models supported its integration within the T-PioDock framework. However, as Tables 1 and 2 showed, performance of interface predictions remains unsatisfactory: most metrics return values within the 40-60% range, with the notable exception of ‘accuracy’, ~85%, which benefits from the low ratio between interface and non-interface residues. Although there is no doubt that the sustained growth of the PDB will benefit template based methods and T-PIP in particular, Figure 3 also highlighted that T-PIP prediction could not outperform the best available homologue interface. This may be explained by the fact that residues are selected independently without considering pairwise interactions, whereas homologues present interfaces where residues belong to a consistent interaction network. While experiments reported in Table 1 have demonstrated the superiority of template based methods over feature based ones, one would expect that analysis of local features could complement initial template based prediction by bringing local contextual information.
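The effect of class imbalance on ‘accuracy’ can be illustrated with the Ds56unbound figures given in the Datasets and tools section (12173 residues, 2112 of them interacting). The short calculation below is only an illustration, not part of the reported experiments: it considers a hypothetical predictor that labels every residue as non-interface and therefore finds no interface residue at all.

```python
# Residue counts for Ds56unbound as reported in the dataset description.
total_residues = 12173
interface_residues = 2112

# Hypothetical "all non-interface" predictor: no positives are predicted,
# so every non-interface residue is a true negative and TP = FP = 0.
true_negatives = total_residues - interface_residues
true_positives = 0

accuracy = (true_positives + true_negatives) / total_residues
recall = true_positives / interface_residues

print(f"accuracy = {accuracy:.3f}, recall = {recall:.3f}")
```

Such a predictor would reach an accuracy of roughly 83% with zero recall, which is why accuracy values around 85% should be read alongside the other metrics.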
In this study, we have presented a novel framework, T-PioDock, for prediction of a complex 3D configuration from the structures of its components. It aims to support the identification of near-native conformations by scoring models produced by any docking software. This is achieved by exploiting predictions of complexes’ binding interfaces.
Exhaustive evaluation of interface predictors on standard benchmark datasets has confirmed the superiority of template based approaches and has shown that the T-PIP methodology is a state-of-the-art method. Moreover, comparison between PioDock and other state-of-the-art scoring methods has revealed that the proposed approach outperforms all its competitors.
Despite the fact that detection of native-like models is an active field of research, accurate identification of near-native conformations remains a challenging task. Although availability of 3D complexes will be of benefit to template based methods such as T-PioDock, we have identified specific limitations which need to be addressed. First, docking software is still not able to produce native-like models for every target. Second, current interface predictors do not explicitly refer to binary residue interactions, which leaves ambiguity when assessing the quality of complex conformations.
T-PIP: Template based Protein Interface Prediction
As described in Figure 1, the T-PIP module, first, evaluates the complexity of a protein target in terms of availability of 3D structures of homologous proteins and, second, applies the most relevant template based interface predictor. In this study, an interface residue is defined according to CAPRI’s definition, i.e. an amino acid whose heavy atoms are within 5 Å of those of a residue in a separate chain.
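The interface residue definition can be sketched in a few lines. This is a minimal illustration rather than the T-PIP implementation: the data layout (a dictionary mapping a residue identifier to a list of heavy-atom coordinates), the function name and the inclusive treatment of the 5 Å boundary are all assumptions made for the example.

```python
from math import dist

CUTOFF = 5.0  # Å, cutoff taken from the definition in the text


def interface_residues(chain_a, chain_b, cutoff=CUTOFF):
    """Return ids of residues in chain_a having a heavy atom within cutoff of chain_b.

    Each chain is a dict: residue id -> list of (x, y, z) heavy-atom coordinates
    (hypothetical layout chosen for this sketch).
    """
    atoms_b = [atom for atoms in chain_b.values() for atom in atoms]
    return {
        res_id
        for res_id, atoms in chain_a.items()
        if any(dist(a, b) <= cutoff for a in atoms for b in atoms_b)
    }


# Toy coordinates: residue A10 lies about 4.2 Å from B5, A11 is far away.
chain_a = {"A10": [(0.0, 0.0, 0.0)], "A11": [(20.0, 0.0, 0.0)]}
chain_b = {"B5": [(3.0, 3.0, 0.0)]}
print(interface_residues(chain_a, chain_b))
```

The brute-force pairwise comparison is quadratic in the number of atoms; for full-size complexes a spatial index would normally be preferred.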
Initially, protein targets are classified into three categories: ‘trivial’, ‘homologous’ and ‘unknown’. This is achieved by, first, searching for homologues of the query proteins in the PDB using BLAST. Since the aim is to learn from the interaction pattern of these homologues, only those involved in a complex are further considered. The original target complex under study is purposely removed from the homologue list. In this study, proteins are defined as homologous if their sequence similarity is associated with an E-value ≤ 10⁻². Since predictions are not limited to close homologues, as they are in IBIS, interfaces of more targets can be predicted. If, among their homologous complexes, both QPs share at least one complex, the target is considered to be ‘trivial’, since at least a homologue of the complete complex is available. If each QP possesses a set of homologous complexes, but none of them belongs to both sets, the target is classified as ‘homologous’. Finally, if no homologous complex is found for at least one of the QPs, the target is judged to be ‘unknown’.
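The three-way categorisation reduces to simple set operations once the homologous complexes of each query protein are known. The sketch below assumes that each QP has already been associated with the set of identifiers of its homologous complexes (E-value filtering and removal of the target complex done beforehand); the function name and the identifiers are hypothetical.

```python
def classify_target(complexes_a, complexes_b):
    """Classify a target from the homologous complexes of its two query proteins.

    complexes_a, complexes_b: sets of identifiers of complexes containing a
    homologue of query protein A and B, respectively (hypothetical inputs).
    """
    if not complexes_a or not complexes_b:
        return "unknown"      # at least one QP has no homologous complex
    if complexes_a & complexes_b:
        return "trivial"      # a homologue of the complete complex exists
    return "homologous"       # both QPs have homologues, but none in common


print(classify_target({"cplx1", "cplx2"}, {"cplx2", "cplx3"}))
print(classify_target({"cplx1"}, {"cplx3"}))
print(classify_target({"cplx1"}, set()))
```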
Interfaces of ‘trivial’ targets are simply inferred by aligning the sequence of each QP with the corresponding chain from the ‘best’ common homologous complex and mapping their interface residues on the query chains. In order to select the ‘best’ template among all common complexes, we score them by multiplying the E-values of both components according to their respective QP. The common complex with the lowest score is selected as the template from which interfaces are inferred.
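Template selection for ‘trivial’ targets can be sketched as follows. The input layout (one dictionary per QP mapping a complex identifier to the E-value of its matching chain), the function name and the example values are assumptions for illustration only.

```python
def best_template(evalues_a, evalues_b):
    """Pick the common complex whose product of E-values is lowest.

    evalues_a, evalues_b: dicts mapping a complex id to the E-value of the
    chain homologous to query protein A and B (hypothetical layout).
    Returns None when the two query proteins share no complex.
    """
    common = evalues_a.keys() & evalues_b.keys()
    if not common:
        return None
    return min(common, key=lambda c: evalues_a[c] * evalues_b[c])


evalues_a = {"cplx1": 1e-30, "cplx2": 1e-50, "cplx3": 1e-5}
evalues_b = {"cplx1": 1e-40, "cplx2": 1e-10}
print(best_template(evalues_a, evalues_b))
```

Here cplx1 (product 1e-70) is preferred over cplx2 (product 1e-60), while cplx3 is ignored because it is not shared. The sketch does not handle ties, which may occur when very small E-values are reported as zero.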
Although WePIP was initially designed for predicting interface residues for query proteins whose 3D structure is known, it can also be applied when only the sequence of the query is available. In this case, an initial S-MSA is created using only homologous complexes of the QP. Then, the QP sequence is integrated into that S-MSA using the ClustalW Profile Alignment command  to create a complete MSA.
PioDock: Protein Interface Overlap for Docking model scoring
In the overlap formula, interface_A^Docked and interface_A^T-PIP in the numerator represent, respectively, the set of residues in the interface of the docked model and the set of residues in the interface predicted by T-PIP, while in the denominator they represent the number of residues in the interface of the docked model and in the predicted interface, respectively.
complexOverlap scores of native complexes should equal 1, whereas completely incorrect interfaces should be assigned a value of zero. In this study, the complexOverlap score was used to rank all conformational models generated by docking software for a given complex. When experiments were conducted to evaluate the PioDock module on its own, actual target interfaces were used instead of their predictions.
Note that when no interface prediction could be performed for one of the two docking partners, the overlap score for that protein is equal to zero and complexOverlap score is calculated using only the overlap score of the other protein.
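The scoring step can be sketched as below. The overlap formula itself is not reproduced in this text, so the code rests on an explicit assumption: the per-protein overlap is taken as the size of the intersection of the docked and predicted interfaces divided by the sum of their sizes, and complexOverlap as the sum of the two per-protein overlaps. This form was chosen only because it is consistent with the stated properties (1 for a native complex, 0 for completely incorrect interfaces, and a score based on a single partner when the other has no prediction); the actual T-PioDock formula may differ.

```python
def overlap(docked, predicted):
    """Assumed per-protein score: |docked ∩ predicted| / (|docked| + |predicted|).

    docked, predicted: sets of interface residue ids. An empty prediction
    yields 0, matching the rule for partners without interface prediction.
    """
    total = len(docked) + len(predicted)
    if total == 0:
        return 0.0
    return len(docked & predicted) / total


def complex_overlap(docked_a, predicted_a, docked_b, predicted_b):
    """Assumed complex score: sum of the two per-protein overlaps (max 1)."""
    return overlap(docked_a, predicted_a) + overlap(docked_b, predicted_b)


def rank_models(models, predicted_a, predicted_b):
    """Sort docking models (name -> (interface_a, interface_b)) by decreasing score."""
    return sorted(
        models,
        key=lambda m: complex_overlap(models[m][0], predicted_a,
                                      models[m][1], predicted_b),
        reverse=True,
    )


models = {
    "model_1": ({1, 2, 3}, {10, 11}),   # identical to the predicted interfaces
    "model_2": ({3, 4, 5}, {11, 12}),   # partial overlap
    "model_3": ({7, 8}, {20}),          # no overlap
}
print(rank_models(models, {1, 2, 3}, {10, 11}))
```

Because the score only compares residue sets, it can be applied to the output of any docking software, as long as the interface residues of each model are extracted with the same definition as the predictions.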
Evaluation of docking model scorings
In order to allow any evaluation it is necessary to have some gold standard or ground truth. However, comparison of two docked models is far from being a straightforward task since CAPRI uses three differences measures to assess the docking model quality : l-rmsd measures the RMSD between the backbones of the two complexes ligands, i-rmsd restricts its evaluation to interface residues, whereas f nat is the fraction of native contacts within the interface. Since f nat can only discriminate between relatively good configurations – all models failing to predict a single interface residue receive a score of 0, only i-rmsd and l-rmsd are used in this study.
The normalized χ2 represents the similarity between two ranking lists by giving higher weights to the models that are ranked higher based on the gold standard: correct ranking is more important for top-ranking models than for lower-ranking ones. A perfect ranking would return a value of 0.
where $N_c$ is the number of targets and $Rank_i$ is the rank of the ‘hit’ for target i. In the best case, if the ‘hit’ is placed at rank 1 for all targets, then MLR equals 1.
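As an illustration of how such a measure behaves, the sketch below assumes that MLR is the exponential of the mean natural logarithm of the hit ranks, i.e. their geometric mean. This form is an assumption chosen because it equals 1 when every hit is ranked first, as stated above; the function name is hypothetical.

```python
import math
from typing import Sequence


def mean_log_rank(hit_ranks: Sequence[int]) -> float:
    """Assumed MLR: exp of the mean log rank over all targets (geometric mean).

    hit_ranks holds Rank_i, the rank of the 'hit' for each of the N_c targets.
    """
    if len(hit_ranks) == 0 or any(rank < 1 for rank in hit_ranks):
        raise ValueError("ranks must be a non-empty list of integers >= 1")
    mean_log = sum(math.log(rank) for rank in hit_ranks) / len(hit_ranks)
    return math.exp(mean_log)
```

Under this assumption, a single badly ranked target inflates the score less than it would with an arithmetic mean: ranks of 1 and 100 give a value close to 10 rather than 50.5.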
Interface prediction evaluation
In order to compare the performance of interface predictors, their numbers of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN) need to be calculated. Predictions are judged correct or incorrect with respect to the ground truth (GT), which is defined as the X-ray structure of the target protein in its complex form. Several metrics have been proposed to summarise these four figures into a single performance measure. Below we describe the measures we use in this paper:
MCC has been shown to be effective, especially for predictors that are biased because of imbalances in their training set.
Although receiver operating characteristic (ROC) plots have also been widely used to evaluate classification predictors, they have not been used in this study because very few competing methods report them in their publications.
Since the above mentioned metrics capture different aspects of a predictor’s performance, all of them are required for evaluation.
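The measures above can be sketched for a single protein as follows. The MCC formula is the standard one; precision, recall, accuracy and F1 are shown as commonly used summaries of the four counts and are an assumption about the measure set rather than a statement of what was used in this study. Residue identifiers and function names are hypothetical, and predicted residues are assumed to belong to the protein's residue set.

```python
import math
from typing import Dict, Set


def confusion_counts(predicted: Set[str], actual: Set[str],
                     all_residues: Set[str]) -> Dict[str, int]:
    """Count TP, FP, TN and FN for predicted versus ground-truth interface residues."""
    return {
        "tp": len(predicted & actual),
        "fp": len(predicted - actual),
        "fn": len(actual - predicted),
        "tn": len(all_residues - predicted - actual),
    }


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews Correlation Coefficient; returns 0.0 when the denominator is zero."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def summary(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Common single-figure summaries of the four counts (assumed measure set)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc(tp, fp, tn, fn)}
```

Because interface residues are usually a minority of a protein's residues, accuracy can look high for a predictor that labels almost nothing as interface, while MCC uses all four counts and stays near zero in that case.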
Availability and requirements
T-PIP and T-PioDock software are available from http://manorey.net/bioinformatics/wepip/.
Abbreviations
PDB: Protein Data Bank
MCC: Matthews Correlation Coefficient
PQS: Protein Quaternary Structure
MSA: Multiple Sequence Alignment
RMSD: Root Mean Square Deviation
This work was supported in part by grant 6435/B/T02/2011/40 of the Polish National Centre for Science. The authors would like to express their gratitude to David W. Gatchell for generating ClusPro docking models, Marc Lensink for scoring them according to CAPRI criteria, and Nan Zhao and Martin Eberhardt for ranking them using SVM/TSVM and MI-based approaches, respectively. The authors would also like to thank Qiangfeng Cliff Zhang and Lei Deng for providing PredUs data and Anna Panchenko, Raed Saeed Khashan, Thome Vreven, Brian Pierce and Rafael A Jordan for their helpful assistance in using their tools.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited.