Volume 13 Supplement 17
Eleventh International Conference on Bioinformatics (InCoB2012): Bioinformatics
B-cell epitope prediction through a graph model
 Liang Zhao^{1},
 Limsoon Wong^{2},
 Lanyuan Lu^{3},
 Steven CH Hoi^{1} (corresponding author) and
 Jinyan Li^{4} (corresponding author)
DOI: 10.1186/1471-2105-13-S17-S20
© Zhao et al.; licensee BioMed Central Ltd. 2012
Published: 13 December 2012
Abstract
Background
Prediction of B-cell epitopes from antigens is useful for understanding the immune basis of antibody-antigen recognition, and is helpful in vaccine design and drug development. Tremendous efforts have been devoted to this long-studied problem; however, existing methods have at least two common limitations. One is that they favor the prediction of epitopes with protrusive conformations but perform poorly on planar epitopes. The other is that they predict all of the antigenic residues of an antigen as belonging to a single epitope, even when multiple non-overlapping epitopes exist.
Results
In this paper, we propose to divide an antigen surface graph into subgraphs by using a Markov Clustering algorithm, and then construct a classifier to label these subgraphs as epitope or non-epitope subgraphs. This classifier is then used to predict epitopes for a test antigen. On a large data set comprising 92 antigen-antibody PDB complexes, our method significantly outperforms the state-of-the-art epitope prediction methods, achieving a 24.7% higher averaged f-score than the best existing models. In particular, our method can successfully identify epitopes whose non-planarity is too small to be handled by the other models. Our method can also detect multiple epitopes whenever they exist.
Conclusions
Various protrusive and planar patches on the surface of antigens can be distinguished by using graphical models combined with unsupervised clustering and supervised learning ideas. The difficult problem of identifying multiple epitopes in an antigen can be made easier by using our subgraph approach. The outstanding residue combinations found by the supervised learning will be useful for forming new hypotheses in future studies.
Background
A B-cell epitope is a set of spatially proximate residues in an antigen that can be recognized by antibodies to activate an immune response [1]. B-cell epitopes are of two types: about 10% of them are linear B-cell epitopes and about 90% are conformational B-cell epitopes [2–4]. Linear epitopes differ from conformational epitopes in the continuity of their residues in the primary sequence: the residues of a linear epitope are contiguous in the primary sequence, while the residues of a conformational epitope are not. B-cell epitope prediction is a long-studied problem of high complexity, which aims to identify those residues in an antigen that form one or multiple epitopes.
This problem has attracted tremendous efforts over the last two decades because of its significance in prophylactic and therapeutic biomedical applications [5]. Various approaches have been proposed to identify conformational epitopes, for example, by clustering accessible surface area (ASA) [6], by combining residues' ASA and their spatial contact [7], by grouping surface residues under their protrusion index [8], by aggregating epitope-favorable triangular patches [9], or by using a naïve Bayesian classifier on residues' physicochemical and geometrical properties [10]. Far more approaches have been developed for predicting linear epitopes. Some of these methods use just a single residue feature (such as hydrophobicity, polarity, or flexibility) and detect the crests or troughs of propensity values as epitopes [11, 12]. Other methods take more complicated machine learning approaches, including artificial neural networks, Bayesian networks, and kernel methods, to tackle this problem [13–19]. With these efforts, this field of research has advanced significantly, and the best AUC performance has reached 0.644 [9]. However, existing methods still have many limitations, and there is considerable room for performance improvement.
The use of a graph model is motivated by the following biological observations. First, the tight packing of residues at each protein surface can be effectively represented by a graph. Second, epitope and non-epitope residues form separate patches on antigen surfaces, displaying distinct subgraphs with their own characteristics. As shown in Figure 1, the binding site is shaped like a hydrophilic island (a hydrophilic subgraph) containing a hydrophobic core (a hydrophobic subgraph). It can also be seen that this island subgraph is surrounded by hydrophobic non-epitope residues, which form a non-epitope subgraph. Third, the distinction between protrusive and planar epitopes can be manifested by the change of weights in the connections between residues.
Our graph-based prediction method consists of three main steps: construct a weighted graph to represent an antigen surface, cluster the nodes of this weighted graph, and learn a label (epitope or non-epitope) for each cluster. Specifically, we take the idea of Delaunay tessellation, using Qhull [21] as its implementation, to construct a protein surface graph. The weights of the edges in this graph are determined by $\chi^{2}$ test statistics combined with a log odds ratio of each edge type. An edge type is determined by the amino acid types of the interacting residue pair. Then, a Markov CLustering algorithm (MCL) [22] is used to partition the entire graph into subgraphs based on the weights of the edges and the graph topology. MCL simulates flows in a network with two operations: expansion and inflation. Expansion increases the homogeneity of nodes within one subgraph, while inflation evaporates flow between different subgraphs and amplifies flow within subgraphs. These ideas mimic the properties of residues connecting within an epitope, within a non-epitope, or between an epitope and a non-epitope. Thus, we can divide the weighted antigen surface graph into a good set of subgraphs, which the subsequent learning algorithms can then label accurately as epitopes or non-epitopes.
Experimental results on a set of 92 non-redundant antibody-antigen complexes compiled from the Protein Data Bank (PDB) [23] show that our proposed graph model improves the performance of B-cell epitope prediction significantly, and that it is also able to identify multiple epitopes as well as predict epitopes with various geometrical formations. For ease of reference, we refer to the proposed B-cell epiTope prediction model as BeTop. Our data and web server for B-cell epitope prediction are available at http://sunim1.sce.ntu.edu.sg/~s080011/betop/index.php.
Materials and methods
Collection of antigen protein data
Protein complexes satisfying the following criteria were retrieved from the PDB on May 14th, 2011: (i) the macromolecular type is protein only, with no DNA, RNA, or hybrid complexes; (ii) the number of protein chains in an asymmetric unit of a complex is larger than two; (iii) the length of every chain is larger than or equal to 30; (iv) the X-ray resolution of the complex is less than 3Å; and (v) the structure title contains at least one of the following terms: antibody, Fab, Fv, or VHH. We obtained 622 antibody-antigen complexes. As transformed and redundant chains in the raw PDB complexes may introduce noise into the results, we removed all of the transformed chains and duplicate chains. An antigen chain is considered a duplicate if its pairwise similarity to another chain in the data set is larger than 80%, a threshold widely used to remove redundant antigens [24]. Removing duplicate chains by pairwise chain similarity may filter out multi-epitope antigens, but it substantially reduces noise, because the number of non-epitope residues is far larger than the number of epitope residues in an antigen. Asymmetric units in each complex that show no structural difference were also excluded from our consideration. Finally, a non-redundant data set containing 92 antibody-antigen complexes was collected for model training and testing. Epitope residues on antigen surfaces were determined by using a Euclidean distance of 4Å for every antigen-antibody PDB complex, following the traditional method for determining epitope residues [7].
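The 4Å criterion for epitope residues can be illustrated with a minimal sketch. It assumes that the distance is measured between atoms and that a residue counts as an epitope residue as soon as any of its atoms lies within the cutoff of any antibody atom; the function name, the input layout and the inclusive comparison are illustrative assumptions, not details given in the text.

```python
from math import dist  # Euclidean distance, Python 3.8+


def epitope_residues(antigen_atoms, antibody_atoms, cutoff=4.0):
    """Return the sorted ids of antigen residues near the antibody.

    antigen_atoms:  iterable of (residue_id, (x, y, z)) tuples (hypothetical layout)
    antibody_atoms: iterable of (x, y, z) tuples
    cutoff:         distance threshold in angstroms (4.0 in the text)
    """
    antibody_atoms = list(antibody_atoms)
    hits = set()
    for res_id, xyz in antigen_atoms:
        if res_id in hits:
            continue  # residue already accepted through another atom
        # Assumption: "within 4 angstroms" is read as distance <= cutoff.
        if any(dist(xyz, ab) <= cutoff for ab in antibody_atoms):
            hits.add(res_id)
    return sorted(hits)


# Toy coordinates, not taken from any PDB entry.
antigen = [(1, (0.0, 0.0, 0.0)), (2, (10.0, 0.0, 0.0)), (3, (3.0, 0.0, 0.0))]
antibody = [(0.0, 0.0, 3.5)]
print(epitope_residues(antigen, antibody))
```

This brute-force scan is quadratic in the number of atoms; for whole complexes a spatial index would be the usual replacement.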
Construction of epitope prediction model
The training phase of our prediction method has the following steps: (i) antigen surface triangulation, (ii) weight calculation for edges, (iii) clustering of the nodes of the graphs, and (iv) supervised learning for distinguishing between epitope subgraphs and non-epitope subgraphs. The details of each step are presented below.
Triangulation of an antigen surface
Weight calculation for edges
where N_{xy} is the number of residue pairs xy in a cluster, i.e., the number of edges connecting two nodes with one node labeled as x and the other as y. Q_{xy} is calculated in the same way as P_{xy}.
The weight calculation for boundary edges is a distinctive part of our method. A boundary edge is an edge connecting an epitope residue and a non-epitope residue. We group all of the boundary edges (e.g., dashed black lines in Figure 3) in our graph database as one class, and take all epitope edges (e.g., solid blue lines in Figure 3) as the other class. Then, we apply Equations (1a) and (1b) to calculate the weights ${W}_{xy}^{\prime}$ for the boundary edges by setting c ∈ {boundary_class, epitope_class}. The same process is applied to the boundary class and the non-epitope class (e.g., edges with solid orange lines in Figure 3) to determine weights ${W}_{xy}^{\prime \prime}$ for the boundary edges. In other words, ${W}_{xy}^{\prime}$ and ${W}_{xy}^{\prime \prime}$ are determined by exactly the same equations used to compute W_{xy}, with substitution of the relevant class label c. The weight of a boundary edge xy is finally set to the larger of ${W}_{xy}^{\prime}$ and ${W}_{xy}^{\prime \prime}$. Boundary edges with heavy weights (larger than a threshold W_{0}) are treated as characteristic boundaries between epitope and non-epitope subgraphs, and we remove them to sharpen the distinction in the later clustering and supervised learning steps. The set of boundary edges might change when different computational methods are used to define epitope residues, such as an accessible surface area larger than 1Å^{2} upon binding with an antibody [6, 27] or a distance threshold of 4Å [7, 28], 5Å [29] or 6Å [30]. However, Ponomarenko et al. have shown that epitope residues do not differ significantly when various criteria are used to capture them [24].
where θ and γ are both optimized as 3 in this study.
Since there are only 20 different standard residue types, the total number of different weights between two residue types is $210\ \left(= C_{2}^{20} + C_{1}^{20}\right)$.
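The boundary-edge rule described above (take the larger of the two class-specific weights, then drop boundary edges heavier than W_{0}) can be sketched as follows. The weight values, the threshold and the edge tuple layout are hypothetical placeholders; the actual weights come from Equations (1a) and (1b), which are not reproduced here.

```python
def boundary_weights(w_prime, w_double_prime):
    """Boundary weight per residue-pair type: the larger of W' and W''."""
    shared = w_prime.keys() & w_double_prime.keys()
    return {t: max(w_prime[t], w_double_prime[t]) for t in shared}


def remove_heavy_boundary_edges(edges, weights, w0):
    """Drop boundary edges whose residue-pair weight exceeds w0.

    edges: list of (node_u, node_v, type_u, type_v, is_boundary) tuples,
           a hypothetical layout chosen for this sketch.
    """
    kept = []
    for u, v, type_u, type_v, is_boundary in edges:
        edge_type = tuple(sorted((type_u, type_v)))  # unordered residue pair
        if is_boundary and weights.get(edge_type, float("-inf")) > w0:
            continue  # heavy boundary edge: removed before clustering
        kept.append((u, v, type_u, type_v, is_boundary))
    return kept


# Made-up weights for two residue-pair types.
w1 = {("A", "K"): 2.0, ("D", "S"): 0.5}
w2 = {("A", "K"): 3.5, ("D", "S"): 0.2}
weights = boundary_weights(w1, w2)
edges = [
    (1, 2, "K", "A", True),   # heavy boundary edge
    (2, 3, "D", "S", True),   # light boundary edge
    (3, 4, "A", "K", False),  # same type, but not a boundary edge
]
kept = remove_heavy_boundary_edges(edges, weights, w0=3.0)
print(kept)
```

Keying the weights by a sorted residue pair is what keeps the table at 210 entries: xy and yx map to the same key.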
Clustering on nodes in an antigen surface graph
Antigen surface graphs are constructed by Qhull, with weights W on edges determined by the procedure above. We then use mcl [22] (an implementation of the MCL algorithm, with inflation coefficient r of 1.8) to cluster the nodes and edges of every antigen graph into subgraphs. In the MCL algorithm, the graph of an antigen's surface residues is represented by a square matrix M, where each row/column represents a surface residue and the value of each entry is the weight assigned to the residue types of the two residues. In the expansion stage of MCL, M is replaced by the ordinary matrix product of M with itself; during the inflation stage, the Hadamard (entry-wise) power with coefficient r is applied to M, followed by normalization. These two steps are iterated until an equilibrium state is reached, i.e., until expansion and inflation no longer alter the matrix.
The subgraphs of an antigen surface clustered by MCL are not always clean, and some subgraphs may contain a mixture of epitope residues and non-epitope residues. To clean up the training data, we consider a subgraph an epitope subgraph if the number of epitope residues in it is larger than the number of non-epitope residues, and a non-epitope subgraph if no or very few (say, at most two) epitope residues show up. Subgraphs in any other situation are considered noise and are ignored during model training. We adopt this strategy because of the small number of epitope residues. We note that this approach is tolerant to false positives, but sensitive to false negatives.
Supervised learning for distinguishing epitope subgraphs and non-epitope subgraphs
By using mcl, each antigen surface graph is clustered into a number of subgraphs. To distinguish between epitope subgraphs and non-epitope subgraphs, we design a feature vector to represent each of the subgraphs in our training data. Each subgraph is transformed into a feature vector with 1770 dimensions, which comprises $20\ \left(= C_{1}^{20}\right)$ dimensions of single residues, $210\ \left(= C_{1}^{20} + C_{2}^{20}\right)$ dimensions of residue pairs, and $1540\ \left(= C_{1}^{20} + C_{2}^{20} \cdot C_{1}^{2} + C_{3}^{20}\right)$ dimensions of residue triangles. A single-residue feature takes the weighted summation of the $\chi^{2}$ test and the log odds ratio on the frequencies of the residue type between epitope clusters and non-epitope clusters, which is similar to the calculation of the weight of a pair of residue types shown in Equation (1). A residue-pair feature takes the weight of this edge in the subgraph as its value, and a triplet feature takes the average weight of the three edges in the subgraph as its value.
The number of nodes in a subgraph is very small (15 on average), but the dimension of each vector is very large (1770). Therefore, each vector is very sparse, and some features have no power to discriminate between epitope subgraphs and non-epitope subgraphs. Hence, feature selection is conducted to maximize classification performance. Feature selection was done by using the LIBSVM [31] feature-selection module, targeting maximal classification f-score. As a result, 144 of the 1770 features are chosen for distinguishing epitope subgraphs from non-epitope subgraphs.
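The expansion and inflation loop can be sketched in a few lines of plain Python. This is an illustrative re-implementation, not the mcl program used in the study: adding self-loops, column normalization of the initial matrix, the convergence tolerance and the way clusters are read off the limit matrix are all assumptions borrowed from common descriptions of MCL.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: ordinary matrix product of M with itself."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: Hadamard (entry-wise) power r, then column normalization."""
    return normalize_columns([[v ** r for v in row] for row in m])


def mcl(weights, r=1.8, max_iter=100, tol=1e-9):
    """Cluster a symmetric weight matrix; returns sorted lists of node indices."""
    n = len(weights)
    # Assumption: add self-loops before normalizing, as MCL tools commonly do.
    m = normalize_columns(
        [[weights[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(max_iter):
        nxt = inflate(expand(m), r)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:  # equilibrium: the matrix no longer changes
            break
    # Rows that keep non-zero mass belong to attractors; the columns they
    # attract form one cluster. Identical clusters are reported once.
    clusters = []
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return sorted(sorted(c) for c in clusters)


# Toy graph: two tightly connected triangles joined by one weak edge.
w = [[0.0] * 6 for _ in range(6)]
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    w[a][b] = w[b][a] = 1.0
w[2][3] = w[3][2] = 0.1
print(mcl(w))
```

On this toy input the weak edge carries too little flow to survive inflation, so the two triangles end up as separate subgraphs.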
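The cleaning rule for training subgraphs can be written as a small function. The order in which the two conditions are checked (majority first, then the "at most two" rule) and the label strings are assumptions made for this sketch; the text does not say how a subgraph satisfying both conditions is treated.

```python
def label_training_subgraph(n_epitope, n_non_epitope, max_stray=2):
    """Assign a training label to one MCL subgraph.

    n_epitope / n_non_epitope: residue counts inside the subgraph
    max_stray: largest number of epitope residues still tolerated in a
               non-epitope subgraph ("at most two" in the text)
    """
    if n_epitope > n_non_epitope:
        return "epitope"
    if n_epitope <= max_stray:
        return "non-epitope"
    return "noise"  # mixed subgraph, ignored during training


for counts in [(8, 3), (1, 12), (5, 9)]:
    print(counts, label_training_subgraph(*counts))
```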
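The dimension counts above can be checked by enumerating unordered residue combinations with repetition. The sketch below only reproduces the counting and builds a feature index; the one-letter residue alphabet and the layout of the index are illustrative choices, not the feature order used by BeTop.

```python
from itertools import combinations_with_replacement
from math import comb

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residue types

singles = [(a,) for a in AMINO_ACIDS]
pairs = list(combinations_with_replacement(AMINO_ACIDS, 2))
triangles = list(combinations_with_replacement(AMINO_ACIDS, 3))

# Position of every feature in the 1770-dimensional vector (hypothetical order).
feature_index = {f: i for i, f in enumerate(singles + pairs + triangles)}


def feature_key(residues):
    """Order-independent key for a residue, a residue pair or a triangle."""
    return tuple(sorted(residues))


print(len(singles), len(pairs), len(triangles), len(feature_index))
print(feature_index[feature_key("SLS")] == feature_index[feature_key("SSL")])
```

The triangle count follows from three cases: all three residues identical (C(20,1)), exactly two identical (C(20,2) pairs of types times 2 choices of the repeated type), and all distinct (C(20,3)).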
and the symbol annotations are as follows:

y: epitope/non-epitope label for a sample predicted by the model;

w_{i}: weight of classifier i, computed from its performance;

f(x_{i}): label for a sample x determined by classifier i in the first level;

${p}_{{x}_{i}}^{0}$: probability with which classifier i predicts sample x as non-epitope;

δ_{i}: determinant of classifier i; δ_{i} is 0 when classifier i is dubious and other confident classifiers exist.
θ_{0} is a threshold to filter out non-epitope residues, and τ_{0} controls the extent to which we trust the classifier.
Prediction of epitopes for an unknown antigen
Given an antigen with 3D coordinate information, the following steps are taken to identify one or multiple epitopes for this antigen: (i) calculate each atom's ASA by using NACCESS, and filter out those atoms with ASA less than 10Å^{2}; (ii) construct an atom-level graph by using Qhull and upgrade it to a residue-level graph; (iii) assign weights to all edges of this residue graph, where the weights are those determined during training; (iv) cluster this undirected, weighted graph into mutually exclusive subgraphs using mcl; and (v) transform every subgraph into a feature vector, and predict its label with the trained two-stage classification model. Epitope residues are the residues within those subgraphs that are predicted as epitope. Two epitope subgraphs can be merged if they are connected in the original surface graph.
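The final merging step can be sketched with a small union-find over the predicted epitope subgraphs. The input layout (sets of residue ids plus a list of surface-graph edges) is an assumption for illustration, and merging every connected pair is only one possible reading of "can be merged".

```python
def merge_epitope_subgraphs(subgraphs, edges):
    """Merge predicted epitope subgraphs that share an edge in the surface graph.

    subgraphs: list of disjoint sets of residue ids predicted as epitope
    edges:     iterable of (u, v) residue-id pairs from the original graph
    """
    parent = list(range(len(subgraphs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    owner = {res: idx for idx, sg in enumerate(subgraphs) for res in sg}
    for u, v in edges:
        if u in owner and v in owner:  # both ends lie in epitope subgraphs
            root_u, root_v = find(owner[u]), find(owner[v])
            if root_u != root_v:
                parent[root_v] = root_u

    merged = {}
    for idx, sg in enumerate(subgraphs):
        merged.setdefault(find(idx), set()).update(sg)
    return sorted(sorted(group) for group in merged.values())


# Residue 5 is not predicted as epitope, so it does not link {3, 4} and {7, 8}.
predicted = [{1, 2}, {3, 4}, {7, 8}]
surface_edges = [(2, 3), (4, 5), (5, 7)]
print(merge_epitope_subgraphs(predicted, surface_edges))
```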
Results and discussions
Our graph-based method BeTop made a remarkable improvement in B-cell epitope prediction in comparison to the state-of-the-art methods. First, BeTop shows a significant improvement in overall prediction accuracy. Second, BeTop is capable of predicting epitopes located at both protrusive and planar surface areas. Third, BeTop is able to identify multiple epitopes if an antigen contains them. The detailed results are presented below, together with highlights of those features that distinguish epitope subgraphs from non-epitope subgraphs.
Significant improvement of prediction accuracy
Four performance metrics are adopted to evaluate model performance, viz., sensitivity (sen), specificity (spe), f-score, and accuracy (acc). They are defined as sen = TP/(TP + FN), spe = TN/(TN + FP), f-score = 2*pre*sen/(pre + sen), and acc = (TP + TN)/(TP + FP + TN + FN), where pre = TP/(TP + FP) is the precision, and TP, TN, FP, and FN represent the numbers of true positive, true negative, false positive and false negative predictions, respectively. Because non-epitope residues greatly outnumber epitope residues in an antigen, accuracy alone is not adequate for measuring model performance. Instead, f-score is more appropriate for evaluating BeTop's performance and for comparing it with other models.
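As a worked example of these definitions, the sketch below computes the four metrics from confusion counts. The counts TP = 16, FP = 1 and FN = 3 correspond to the 1AR1 case discussed later (19 epitope residues, 16 true positives, 1 false positive); the TN value is a made-up placeholder, since the number of surface residues is not stated. Returning 0.0 for an undefined ratio is a convention chosen here, not one taken from the text.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, precision, f-score and accuracy from counts."""
    sen = tp / (tp + fn) if tp + fn else 0.0
    spe = tn / (tn + fp) if tn + fp else 0.0
    pre = tp / (tp + fp) if tp + fp else 0.0
    f_score = 2 * pre * sen / (pre + sen) if pre + sen else 0.0
    total = tp + tn + fp + fn
    acc = (tp + tn) / total if total else 0.0
    return {"sen": sen, "spe": spe, "pre": pre, "f_score": f_score, "acc": acc}


# tn=100 is a hypothetical value used only to make the example complete.
metrics = confusion_metrics(tp=16, tn=100, fp=1, fn=3)
print({name: round(value, 3) for name, value in metrics.items()})
```

With these counts the f-score is 32/36, about 0.889, in line with the value of 0.88 reported for BeTop on 1AR1; the f-score does not depend on TN, which is why it is less affected by the class imbalance than accuracy.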
F-score t-test p-values between BeTop, DiscoTope, SEPPA and ElliPro.
Method | DiscoTope (0.22 ± 0.14) | SEPPA (0.25 ± 0.16) | ElliPro (0.36 ± 0.20) | BeTop (0.45 ± 0.16)
DiscoTope | - | 1.6e-1 | 7.9e-8 | 1.8e-17
SEPPA | - | - | 2.3e-5 | 7.3e-15
ElliPro | - | - | - | 2.0e-3
Averaged performance comparison between BeTop, DiscoTope, SEPPA and ElliPro on sensitivity, specificity, accuracy and AUC.
Metric | DiscoTope | SEPPA | ElliPro | BeTop
sensitivity | 0.377 ± 0.278 | 0.526 ± 0.345 | 0.501 ± 0.290 | 0.665 ± 0.239
specificity | 0.686 ± 0.168 | 0.665 ± 0.255 | 0.849 ± 0.137 | 0.809 ± 0.162
accuracy | 0.631 ± 0.133 | 0.659 ± 0.193 | 0.798 ± 0.126 | 0.802 ± 0.134
AUC | 0.531 ± 0.127 | 0.595 ± 0.157 | 0.675 ± 0.140 | 0.737 ± 0.107
One of the novel ideas used in this study is reducing the weight of boundary edges to distinguish epitope from non-epitope subgraphs. We therefore compare the performance of BeTop with and without suppressing the weights of boundary edges. The averaged f-scores are 0.45 and 0.41 for the two situations, respectively, an increment of 8.9%. The t-test p-value between the two sets of f-scores is 0.11, so the improvement from decreasing the weights of edges enriched in the boundary class is suggestive but not statistically significant at the conventional 0.05 level.
Locating epitopes with planar formations
Existing conformational epitope prediction methods such as [7, 8, 10] rely heavily on spatial structure information and the non-planarity properties of antigens. They usually perform well on epitopes that have a protrusive surface, and poorly otherwise. To understand the effect of the non-planarity of epitopes on epitope prediction, we divide all of the epitopes in our database into groups based on a non-planarity index. The non-planarity of a residue cluster is measured by the root-mean-square deviation of all the surface atoms of this cluster of residues. It is expected that those structure-based prediction models favor epitopes with large non-planarity but not flat epitopes.
Taking PDB entry 1AR1 as an example again (Figure 1), its epitope consists of 19 residues, and the non-planarity of this epitope is as small as 1.08Å, indicating a very flat surface area. The f-score achieved by BeTop is 0.88 (with 16 true positives and 1 false positive). In contrast, ElliPro, SEPPA and DiscoTope achieved f-scores of 0.273 (with 7 true positives and 22 false positives), 0.000, and 0.000, respectively. As another example, the f-scores of BeTop, ElliPro, SEPPA and DiscoTope on the epitope residues of PDB entry 1N8Z are 0.667, 0.194, 0.198 and 0.07, respectively. This epitope also has a very planar surface, with a non-planarity of 1.88Å.
For epitopes with a large non-planarity (greater than or equal to 3Å), BeTop also performs better than the other models. The f-score is improved by 65.6%, 55.7% and 11.8% over DiscoTope, SEPPA and ElliPro, respectively. In particular, BeTop still achieves a better performance than ElliPro, which detects twisted epitopes based on residues' protrusion index.
In summary, the f-scores of the three existing methods become poor when epitopes become flat, i.e., when their non-planarity is small. BeTop, however, performs equally well under both protrusive and planar conditions, demonstrating that our proposed graph model is robust to changes in epitope non-planarity.
Identifying multiple epitopes from an antigen
Although BeTop is trained on single-epitope antigen-antibody complexes, the framework places no limitation on the number of predicted epitopes. To evaluate BeTop's capability in identifying multiple epitopes in an antigen, we tested it on a data set of epitopes that are comprehensively explored in [20].
The multiple epitopes of these antigens are determined by the following steps: (i) determine the epitope residues for each complex by using the 4Å Euclidean distance criterion between the antigen and antibody; (ii) calculate a pairwise epitope similarity between two complexes X and Y of the same antigen by using S_{XY} = |X ∩ Y| / min(|X|, |Y|); (iii) cluster epitopes based on their similarities for each antigen; (iv) select a representative epitope with the best resolution for each cluster, and map all representative epitopes to the one with the finest resolution. Finally, 9 antigens with a total of 20 epitopes are obtained.
As expected, BeTop can identify multiple epitopes when they exist on an antigen. For instance, there are four epitopes on the antigen hen egg white lysozyme, and BeTop detects all four of them with an average f-score and accuracy of 0.376 and 0.849, respectively. These experimental results show that the multiple epitopes predicted by BeTop are not false positives, and that BeTop does not mix up multiple epitopes either.
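The pairwise epitope similarity in step (ii) is an overlap coefficient between two residue sets. A minimal sketch follows; treating an empty epitope as having similarity 0 is an assumption of this example, and the residue numbers are invented.

```python
def epitope_similarity(x, y):
    """Overlap coefficient |X ∩ Y| / min(|X|, |Y|) between two residue sets."""
    x, y = set(x), set(y)
    if not x or not y:
        return 0.0  # assumption: undefined case mapped to zero similarity
    return len(x & y) / min(len(x), len(y))


# Hypothetical epitope residue ids from two complexes of the same antigen.
epitope_x = {18, 19, 22, 23, 27}
epitope_y = {22, 23, 27, 116, 117, 118}
print(epitope_similarity(epitope_x, epitope_y))
```

Dividing by the smaller set means that an epitope fully contained in a larger one gets similarity 1, so two complexes probing the same site are grouped together even when their contact sets differ in size.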
Graphical triplet patterns for epitopes
Top ten features that are in favor of the epitope class, and those that are enriched in the non-epitope class, in terms of G-test.
Epitope feature | G-test | Non-epitope feature | G-test
SSL | 9.12 | YF | 5.34
GGI | 4.92 | SAA | 4.69
DDW | 4.85 | KQ | 4.19
NGG | 4.53 | AC | 4.04
RRT | 4.23 | A | 3.20
DDF | 4.21 | EVV | 2.74
STT | 3.65 | EA | 2.59
FLL | 3.26 | FFV | 2.54
QQS | 3.13 | HC | 2.35
HF | 2.89 | N | 2.35
Conclusion
Epitope prediction is an important way of understanding the immune basis of antibody-antigen interactions and is beneficial to prophylactic and therapeutic solutions. In this study, we proposed a novel graph-based model ("BeTop") to predict B-cell epitopes by incorporating statistical ideas, graph clustering algorithms and supervised learning approaches. Our experimental results on two data sets of non-redundant antigen-antibody complexes show that BeTop makes great improvements in identifying planar epitopes and in distinguishing multiple epitopes in an antigen. We have also presented interesting features and triplet feature patterns for the epitopes, which will be useful for forming new hypotheses in future studies.
Declarations
Acknowledgements
We thank Mr. Zhenhua Li for helping us develop the web site. This work was supported by Nanyang Technological University [RG66/07].
This article has been published as part of BMC Bioinformatics Volume 13 Supplement 17, 2012: Eleventh International Conference on Bioinformatics (InCoB2012): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/13/S17.
Authors’ Affiliations
References
1. Abbas AK, Lichtman AH, Pillai S: Cellular and Molecular Immunology. 2009, W.B. Saunders Company, 6
2. Atassi M: Antigenic structure of myoglobin: The complete immunochemical anatomy of a protein and conclusions relating to antigenic structures of proteins. Immunochemistry. 1975, 12 (5): 423-438. 10.1016/0019-2791(75)90010-5.
3. Benjamin DC, Berzofsky JA, East IJ, Gurd FRN, Hannum C, Leach SJ, Margoliash E, Michaels JG, Miller A, Prager EM, Reichlin M, Sercarz EE, Smith-Gill SJ, Todd PE, Wilson A: The antigenic structure of proteins: a reappraisal. Annu Rev Immunol. 1984, 2: 67-101. 10.1146/annurev.iy.02.040184.000435.
4. Pellequer JL, Westhof E, Van Regenmortel MHV: Predicting location of continuous epitopes in proteins from their primary structures. Molecular Design and Modeling: Concepts and Applications Part B: Antibodies and Antigens, Nucleic Acids, Polysaccharides, and Drugs, Volume 203 of Methods in Enzymology. Edited by: Langone JJ. 1991, Academic Press, 176-201.
5. Irving MB, Pan O, Scott JK: Random-peptide libraries and antigen-fragment libraries for epitope mapping and the development of vaccines and diagnostics. Curr Opin Chem Biol. 2001, 5 (3): 314-324. 10.1016/S1367-5931(00)00208-8.
6. Kulkarni-Kale U, Bhosle S, Kolaskar AS: CEP: a conformational epitope prediction server. Nucleic Acids Res. 2005, 33: 168-171.
7. Andersen PH, Morten N, Ole L: Prediction of residues in discontinuous B-cell epitopes using protein 3D structures. Protein Sci. 2006, 15 (11): 2558-2567. 10.1110/ps.062405906.
8. Ponomarenko J, Bui HHH, Li W, Fusseder N, Bourne PE, Sette A, Peters B: ElliPro: a new structure-based tool for the prediction of antibody epitopes. BMC Bioinf. 2008, 9: 514. 10.1186/1471-2105-9-514.
9. Sun J, Wu D, Xu T, Wang X, Xu X, Tao L, Li YX, Cao ZW: SEPPA: a computational server for spatial epitope prediction of protein antigens. Nucleic Acids Res. 2009, 37 (suppl 2): W612-W616.
10. Rubinstein N, Mayrose I, Martz E, Pupko T: Epitopia: a web-server for predicting B-cell epitopes. BMC Bioinformatics. 2009, 10: 287. 10.1186/1471-2105-10-287.
11. Hopp TP, Woods KR: Prediction of protein antigenic determinants from amino acid sequences. Proc Natl Acad Sci USA. 1981, 78 (6): 3824-3828. 10.1073/pnas.78.6.3824.
12. Karplus P, Schulz G: Prediction of chain flexibility in proteins: a tool for the selection of peptide antigen. Naturwissenschaften. 1985, 72 (4): 212-213. 10.1007/BF01195768.
13. Larsen JE, Lund O, Nielsen M: Improved method for predicting linear B-cell epitopes. Immunome Res. 2006, 2 (2).
14. Söllner J, Mayer B: Machine learning approaches for prediction of linear B-cell epitopes on proteins. J Mol Recognit. 2006, 19: 200-208. 10.1002/jmr.771.
15. Saha S, Raghava GPS: Prediction of continuous B-cell epitopes in an antigen using recurrent neural network. Proteins: Struct, Funct, Bioinf. 2006, 65: 40-48. 10.1002/prot.21078.
16. El-Manzalawy Y, Dobbs D, Honavar V: Predicting linear B-cell epitopes using string kernels. J Mol Recognit. 2008, 21 (4): 243-55. 10.1002/jmr.893.
17. Rubinstein ND, Mayrose I, Pupko T: A machine-learning approach for predicting B-cell epitopes. Mol Immunol. 2008, 46 (5): 840-847.
18. Reimer U: Prediction of linear B-cell epitopes. Methods Mol Biol. 2009, 524: 335-44. 10.1007/978-1-59745-450-6_24.
19. Sweredoski MJ, Baldi P: COBEpro: a novel system for predicting continuous B-cell epitopes. Protein Eng Des Sel. 2009, 22 (3): 113-120.
20. Zhao L, Wong L, Li J: Antibody-Specified B-Cell Epitope Prediction in Line with the Principle of Context-Awareness. IEEE/ACM Trans Comput Biol Bioinf. 2011, 8 (6): 1483-1494.
21. Barber CB, Dobkin DP, Huhdanpaa H: The Quickhull algorithm for convex hulls. ACM T. Math. Software. 1996, 22 (4): 469-483. 10.1145/235815.235821.
22. van Dongen S: Graph Clustering by Flow Simulation. PhD thesis, University of Utrecht. 2000
23. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE: The Protein Data Bank. Nucleic Acids Res. 2000, 28: 235-242. 10.1093/nar/28.1.235.
24. Ponomarenko JV, Bourne PE: Antibody-protein interactions: benchmark datasets and prediction tools evaluation. BMC Struct Biol. 2007, 7: 64. 10.1186/1472-6807-7-64.
25. Hubbard SJ, Thornton JM: Naccess V2.1.1 - Solvent accessible area calculations. 1992, [http://www.bioinf.manchester.ac.uk/naccess/]
26. Huan J, Wang W, Bandyopadhyay D, Snoeyink J, Prins J, Tropsha A: Mining Protein Family Specific Residue Packing Patterns from Protein Structure. Eighth Annual International Conference on Research in Computational Molecular Biology (RECOMB). 2004, 308-315.
27. Rapberger R, Lukas A, Mayer B: Identification of discontinuous antigenic determinants on proteins based on shape complementarities. J Mol Recognit. 2007, 20 (2): 113-121. 10.1002/jmr.819.
28. Sweredoski MJ, Baldi P: PEPITO: improved discontinuous B-cell epitope prediction using multiple distance thresholds and half sphere exposure. Bioinformatics. 2008, 24 (12): 1459-1460. 10.1093/bioinformatics/btn199.
29. Chen H, Zhou HX: Prediction of interface residues in protein-protein complexes by a consensus neural network method: Test against NMR data. Proteins: Struct, Funct, Bioinf. 2005, 61: 21-35. 10.1002/prot.20514.
30. Schlessinger A, Ofran Y, Yachdav G, Rost B: Epitome: database of structure-inferred antigenic epitopes. Nucleic Acids Res. 2006, 34: 777-780. 10.1093/nar/gkj053.
31. Chang CC, Lin CJ: LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011, 2: 27:1-27:27.
32. Sokal RR, Rohlf FJ: Biometry: The Principles and Practices of Statistics in Biological Research. 1994, W. H. Freeman, third
33. Pettersen EF, Goddard TD, Huang CC, Couch GS, Greenblatt DM, Meng EC, Ferrin TE: UCSF Chimera - a visualization system for exploratory research and analysis. J Comput Chem. 2004, 25 (13): 1605-12. 10.1002/jcc.20084.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.