Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research

Abstract

Background

Initiation and regulation of immune responses in humans involves recognition of peptides presented by human leukocyte antigen class II (HLA-II) molecules. These peptides (HLA-II T-cell epitopes) are increasingly important as research targets for the development of vaccines and immunotherapies. HLA-II peptide binding studies involve multiple overlapping peptides spanning individual antigens, as well as complete viral proteomes. Antigen variation in pathogens and tumor antigens, and extensive polymorphism of HLA molecules increase the number of targets for screening studies. Experimental screening methods are expensive and time consuming and reagents are not readily available for many of the HLA class II molecules. Computational prediction methods complement experimental studies, minimize the number of validation experiments, and significantly speed up the epitope mapping process. We collected test data from four independent studies that involved 721 peptide binding assays. Full overlapping studies of four antigens identified binding affinity of 103 peptides to seven common HLA-DR molecules (DRB1*0101, 0301, 0401, 0701, 1101, 1301, and 1501). We used these data to analyze performance of 21 HLA-II binding prediction servers accessible through the WWW.

Results

Because not all servers have predictors for all tested HLA-II molecules, we assessed a total of 113 predictors. The length of test peptides ranged from 15 to 19 amino acids. We tried three prediction strategies: the best 9-mer within the longer peptide, the average of the best three 9-mer predictions, and the average of all 9-mer predictions within the longer peptide. The best strategy was the identification of a single best 9-mer within the longer peptide. Overall, measured by the area under the receiver operating characteristic curve (AROC), 17 predictors showed good (AROC ≥ 0.8), 41 showed marginal (0.7 ≤ AROC < 0.8), and 55 showed poor performance (AROC < 0.7). Good performance predictors included those for HLA-DRB1*0101 (seven), 1101 (six), 0401 (three), and 0701 (one). The best individual predictor was NETMHCIIPAN, closely followed by PROPRED, IEDB (Consensus), and MULTIPRED (SVM). None of the individual predictors was shown to be suitable for prediction of promiscuous peptides. Current predictive capabilities allow prediction of only 50% of actual T-cell epitopes using practical thresholds.

Conclusion

The available HLA-II servers do not match prediction capabilities of HLA-I predictors. Currently available HLA-II prediction servers offer only a limited prediction accuracy and the development of improved predictors is needed for large-scale studies, such as proteome-wide epitope mapping. The requirements for accuracy of HLA-II binding predictions are stringent because of the substantial effect of false positives.

Introduction

Vaccines are the most effective means of fighting infectious diseases [1]. They are emerging as promising therapies for cancer [2], allergy [3], and autoimmunity [4]. The goal of vaccination is to induce immunity against pathogens and cancer cells by stimulating antigen-specific cytotoxic T lymphocytes (CTLs) or B cells. CTLs recognize peptide antigens presented by major histocompatibility complex class I (MHC-I) molecules on infected cells or cancer cells and kill them. B cells produce antibodies that specifically recognize pathogen- or cancer-related molecules. Both these processes are initiated and regulated by T-helper (Th) cells that recognize antigenic peptides presented by MHC class II (MHC-II) molecules. MHC-II molecules present antigenic peptides internalized by professional antigen presenting cells, such as macrophages, dendritic cells, or B lymphocytes. A vaccine must at minimum contain two antigenic epitopes: one to induce specific B-cell or CTL responses and another to induce specific Th cells that regulate (initiate, enhance, or suppress) immune responses [5]. Peptides presented by MHC-I molecules are mainly of intracellular origin, and those presented by MHC-II molecules originate mainly from extracellular proteins. A distinct characteristic of MHC molecules of either class is a groove that binds peptides in a highly promiscuous manner.

The peptide-binding groove of an MHC molecule consists of a β-sheet and two α-helices. A peptide binds through a network of hydrogen bonds between its backbone and the binding groove, and through interactions between the peptide side chains and pockets inside the binding groove [6, 7]. Most MHC-I binding peptides are 8–11 amino acids long [8]. MHC-II molecules bind nested sets of peptides, most of which are 14–18 amino acids long [9], but some can extend beyond 30 amino acids. MHC-I molecules accommodate the whole length of the binding peptide inside their grooves, which are closed [6]. Binding grooves of MHC-II molecules have open ends; they accommodate the 9-mer binding core of the peptide inside while the peptide termini protrude outside of the groove [7].

The ability of the immune system to respond to a particular antigen differs between individuals because they display different patterns of MHC genes. Human MHC molecules are known as human leukocyte antigens. Each human individual expresses up to six HLA-I molecules and up to a dozen HLA-II molecules. HLA genes show extensive polymorphism. As of August 2008, more than 3000 HLA alleles have been identified and sequenced including 2215 HLA-I and 986 HLA-II sequences [10]. The diversity of HLA molecules increases the probability that any foreign antigen will contain HLA-binding peptides suitable as vaccine targets. The amino acids within the binding groove determine the specificity of peptide binding to a given HLA molecule. Across multiple HLA molecules, the polymorphic residues that form the binding groove determine the repertoire of binding peptides to a particular HLA molecule. Tens of thousands of allele-specific and promiscuous MHC binders and T-cell epitopes have been identified in humans and mice while smaller numbers have been identified in other model animals, such as monkeys and rats [11, 12].

Identification of HLA-binding peptides is, therefore, important both for understanding the basic molecular function of the immune system and for vaccine development. However, systematic T-cell epitope mapping is costly and time-consuming because it involves synthesis and testing of overlapping peptides spanning the full length of target antigens. For short antigens such as the tumor antigen survivin (BIRC5), which is 142 amino acids long, full overlapping studies of both HLA-I and -II binders were performed for several HLA molecules [13, 14]. However, systematic studies are prohibitively expensive for long antigens, such as the autoantigen thyroglobulin (2768 amino acids long), where computational predictions were used to preselect suitable targets followed by experimental validation [15, 16]. This problem is particularly pronounced in studies of whole pathogen proteomes, even in small viruses such as influenza [17] or dengue [18].

Computational prediction of peptide binding to MHC molecules has been a topic of vigorous research and development activity [19–22]. Computational methods for prediction of HLA-I binding have reached a high level of sophistication and accuracy and represent significant research resources [23]. Computational predictions of HLA-II binding were useful in the study of infectious disease [24, 25], cancer [26, 27], and autoimmunity [15, 16]. However, recent reports have indicated that computational predictions of HLA-II binding are of much lower accuracy than for their HLA-I counterparts [28, 29], and even that these predictions may cause more confusion than conclusion [30]. The methods used for assessment of predictors of HLA-II binding have suffered from inadequately defined test sets and testing strategies. Several critical issues need to be addressed to rectify these failings.

  • Only a small fraction of peptides in a given pathogen or tumor-specific proteome are able to bind to a specific MHC molecule [31]. Tens of thousands of protein variants have been characterized in viruses [17, 18]. Several hundred tumor-related antigens and their variants have been reported [32, 33]. The extensive variability of target antigens significantly increases the number of testable targets, making each individual binding peptide a representative of a large family of related peptides [34].

  • The comparison studies performed to date have been based on assessing predictive performance using pre-defined sets of peptides, rather than well-defined standardized full-overlapping studies of complete antigens. This introduces biases and the reported performances are likely to be overestimates.

  • HLA-II peptide binding is mediated through a 9-mer binding core, but longer peptides are used for experimental measurement of binding. We thus predict one element (the 9-mer binding core) and experimentally test another (15-mer or longer peptides). This makes reducing the false positive rate an important issue in prediction of HLA-II binding, and it requires sophisticated statistical and machine learning approaches (see [28, 29, 34]).

  • Both ends of the peptide binding grooves in HLA-II molecules are open, allowing the peptides to be more variable in length (typically 14–18 amino acids) and flanking residues are known to selectively affect binding [9]. This effect is not considered in most of the HLA-II prediction methods.

  • Some longer peptides bind MHC-II through multiple overlapping 9-mer registers [34, 35] adding further complexity to the selection of actual binding cores. The simpler question of identification of the location of 9-mer binding is extended to identification of multiple binding cores and their locations within the same peptide.

  • Experimental measurements of HLA-II binding show variation depending on the conditions of the experiment, even for the control peptides.

  • Sufficient quantities of HLA-II binding data are available only for some HLA-DR molecules while, notwithstanding notable exceptions [35], HLA-DQ and -DP molecules have been understudied.

  • Presentation of HLA-II binding peptides depends on antigen processing steps including editing by HLA-DM and other accessory molecules. DM editing affects the density and preference for particular peptide species [36]. These effects have not yet been included in the prediction approaches.
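The mismatch between 9-mer binding cores and the longer peptides used in binding assays, noted in the list above, can be made concrete by enumerating the candidate binding registers of a peptide. The following is a minimal sketch only: the function name `nine_mer_windows` and the example sequence are illustrative assumptions and are not taken from any of the evaluated servers.

```python
def nine_mer_windows(peptide, core_len=9):
    """Return every contiguous window of length core_len within peptide.

    Each window is one candidate binding register (core) of the peptide.
    """
    if len(peptide) < core_len:
        return []
    return [peptide[i:i + core_len]
            for i in range(len(peptide) - core_len + 1)]


# Hypothetical 15-residue sequence, used only to illustrate the counting.
example = "ACDEFGHIKLMNPQR"
windows = nine_mer_windows(example)
print(len(windows))   # 7 candidate cores in a 15-mer
print(windows[0])     # ACDEFGHIK
print(windows[-1])    # HIKLMNPQR
```

A peptide of length L contains L − 9 + 1 overlapping 9-mers, so a 15-mer offers 7 candidate cores and a 19-mer offers 11. Every one of them is a potential source of a false positive call for the longer peptide.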

HLA-II binding predictions are thus more complex than HLA-I predictions [23, 37, 38]. Various prediction algorithms have been developed to facilitate the identification of HLA-II binding peptides within protein antigens. They made computational pre-screening of antigens for HLA-II epitopes a standard approach in epitope-mapping studies; more than twenty prediction servers have been developed to facilitate the identification of MHC-II binding peptides. The performance of six prediction methods has been compared in each of the three recent studies [28–30]. The overall conclusions of these studies were similar, indicating a relatively low prediction accuracy of HLA-II binding predictors. Large quantities of HLA-DR binding peptides with precise measurements have recently become available [28, 29], yet contemporary methods have shown little, if any, improvement when compared to the older TEPITOPE method.

This study extends the assessment of predictive power to include a much larger number of servers that predict HLA-II binding. This study was limited to seven common HLA-DR molecules that have sufficient amount and quality of peptide binding data. We compiled and established standardized test data sets that are more representative of the experimental reality, and defined a uniform scaling scheme to use data from different studies. Finally we assessed the practical applicability of HLA-II binding predictions to identification of HLA-II T-cell epitopes. Our study identified several key issues that need to be addressed for the development of improved prediction systems of HLA-II binding.

Results

Classification

While not all the servers were designed specifically for peptide binding predictions, all of them have implemented modules for this step. Some servers also have advanced options, for example, MHCPred enables users to specify anchor positions. For this analysis we used the simplest prediction method available at each server. The numbers of the servers for individual HLA-DR alleles we studied were: HLA-DRB1*0101 – 19, HLA-DRB1*0301 – 15, HLA-DRB1*0401 – 20, HLA-DRB1*0701 – 16, HLA-DRB1*1101 – 17, HLA-DRB1*1301 – 9, and HLA-DRB1*1501 – 17.

In total, 113 individual predictors were tested, of which 17 showed good, 41 marginal, and 55 poor performance using the single maximum 9-mer prediction scheme. Using the average prediction for all 9-mers within the test peptide, 8 showed good, 30 marginal, and 75 poor performance. Using the average of the best three 9-mer predictions, 12 showed good, 37 marginal, and 64 poor performance. The AROC values of these predictions are shown in Figure 1. An important finding from this analysis is that, for the best prediction scheme (a single best 9-mer), about half of the predictors (55 of 113) are not predictive, while only 15% (17 of 113) show good performance. The other prediction schemes show even lower predictive performance.

Figure 1

AROC values of predictions by the 21 servers using the combined test set (103 peptides from the four antigens) based on the three mapping methods: black bars for maximum 9-mer scores, grey bars for average scores of all overlapping 9-mers, and white bars for the average of the top three 9-mer scores. Vertical axis shows the AROC values while horizontal axis shows individual servers, as designated in Table 2. Best performing predictors for each allele are marked by asterisks.

Comparing the prediction performance across HLA-DR alleles, the best predictors are those for HLA-DRB1*0101, where seven predictors showed good classification accuracy, while six DRB1*1101 predictors, three DRB1*0401 predictors, and only one DRB1*0701 predictor showed good classification accuracy. None of the predictors for DRB1*0301, DRB1*1301, and DRB1*1501 showed good classification performance. Importantly, only four HLA-DRB1*0101 predictors showed performance approaching AROC = 0.9, while the other "good" predictors are close to the lower borderline, leaving ample room for improvement.

The best prediction server across all HLA molecules evaluated in this study is NETMHCIIPAN, closely followed by PROPRED, IEDB_SAT, and MULTI_SVM. The best predictors we recommend for each allele are marked by asterisks in Figure 1.

Prediction of promiscuous peptides

Promiscuous peptides are able to bind to multiple MHC molecules. They are therefore promising targets for vaccine design because they are likely to cover a larger population of patients [39]. We analyzed the prediction of promiscuous peptides by assigning each peptide a score that indicated the number of HLA-DR molecules it binds. The AROC was then calculated and the results are shown in Figure 2. None of the predictors showed good performance, while MHCPRED, RANKPEP, PROPRED, IEDB_SAT, and MULTI_HMM reached AROC values higher than 0.775. The DR4_ANN and DR4_SVM predictors were excluded from this analysis since they predict peptide binding to a single MHC-II allele (HLA-DRB1*0401). To enable the comparison of predictions that include multiple HLA alleles, we developed a common scaling scheme for the seven HLA-DRB1 alleles. Binding scores used in this scheme range from 0 to 100 and the threshold for binding is 50. The scaled data are accessible at DFRMLI [42].

Figure 2

AROC values for prediction of promiscuous peptides. Vertical axis shows the AROC values while horizontal axis shows numbers designating individual servers, as shown in Table 2. The first two servers were excluded from the analysis because they predicted peptide binding to a single DR molecule.

Prediction of T-cell epitopes

We also assessed the performance of prediction servers in identification of tumor antigen T-cell epitopes. For each server we predicted the binding affinity of all T-cell epitopes and determined the thresholds at which approximately 80% and 50% of tested T-cell epitopes were predicted as binders. The number of false positives (FPs) at the thresholds was calculated for the four antigens and representative results are shown in Table 1.

Table 1 Prediction performance of selected representative servers at two scenarios: a) thresholds that correctly predict ~80% of T-cell epitopes; b) thresholds that correctly predict ~50% of T-cell epitopes.

To identify 80% of T-cell epitopes, the threshold for each predictor had to be set low, which resulted in a large number of false positives. This problem was pronounced for predictors such as MHCPRED, MULTI_ANN, and RANKPEP, for which the number of false positives even exceeded that of true negatives. At this threshold, IEDB (consensus), MULTIPRED (SVM), and PROPRED showed the best performance. On the other hand, the thresholds for predicting ~50% of known T-cell epitopes were much more stringent, significantly lowering the rate of false positives relative to the 80% threshold. At this threshold, NetMHCIIpan, IEDB (consensus), and PROPRED showed the best performance.
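The trade-off between epitope recovery and false positives can be illustrated with a short sketch. It assumes that a higher score means stronger predicted binding (servers that report IC50 values would need the comparison reversed), that "approximately 80%" is read as "at least 80%", and that all function names and scores are hypothetical rather than taken from this study.

```python
import math


def threshold_for_sensitivity(epitope_scores, target_fraction):
    """Highest cut-off at which at least target_fraction of the
    epitopes score at or above the cut-off (higher score = binder)."""
    ranked = sorted(epitope_scores, reverse=True)
    needed = max(1, math.ceil(target_fraction * len(ranked)))
    return ranked[needed - 1]


def count_false_positives(non_epitope_scores, threshold):
    """Number of non-epitope peptides predicted as binders."""
    return sum(1 for s in non_epitope_scores if s >= threshold)


# Hypothetical scores on a 0-100 scale, for illustration only.
epitopes = [90, 75, 60, 40, 20]
non_epitopes = [85, 55, 45, 30, 10, 5]

for fraction in (0.8, 0.5):
    cutoff = threshold_for_sensitivity(epitopes, fraction)
    fps = count_false_positives(non_epitopes, cutoff)
    print(fraction, cutoff, fps)
# 0.8 -> cut-off 40, 3 false positives
# 0.5 -> cut-off 60, 1 false positive
```

In this toy example, relaxing the cut-off to recover 80% of the epitopes triples the number of false positives, which mirrors the qualitative pattern reported above.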

Conclusion and discussion

In this study we evaluated the performance of 21 prediction servers for HLA-II binding peptides. Seven DRB1*0101 predictors, six DRB1*1101 predictors, three DRB1*0401 predictors, and one DRB1*0701 predictor showed good performance in identification of binders and non-binders. None of the predictors for DRB1*0301, DRB1*1301, and DRB1*1501 performed well, indicating that much room for improvement still exists for MHC-II prediction.

The results suggest that some of the current predictors are useful for pre-screening Th epitopes, although a relatively large number of false positives (at lower thresholds) and false negatives (at higher thresholds) would also be produced. Predictions using lower thresholds are useful for screening out true negatives, while predictions at higher thresholds help cheaply identify a subset of the T-cell epitopes. Unlike for MHC-I predictions, we have no evidence that nonlinear methods perform better than linear methods. One possible reason is that nonlinear methods, such as ANN or SVM, generally require larger amounts of data for model development than linear methods. However, the amount of high-quality MHC-II binding data is still far from sufficient, which limits the capability of nonlinear methods to recognize characteristics underlying MHC-peptide interaction.

In summary, the prediction accuracy for HLA-II binding peptides is inferior to that for HLA-I binding peptides. Several factors appear to account for this disparity. Insufficient or low-quality training data have been a problem for developers of prediction methods for HLA-II binding peptides. Another problem with HLA-II predictions is the difficulty of identifying 9-mer binding cores within the longer peptides used for training, as well as the lack of consideration of the influence of flanking residues. Amino acids flanking the binding core contribute to MHC-peptide interactions and also to antigen processing preferences [34, 40]. A further reason for the poor performance of MHC-II prediction is that the binding groove of HLA-II molecules is relatively permissive for peptide binding, which limits the stringency of specific binding motifs. We propose that, with new large datasets available [29, 37, 41, 42], new methods that implement knowledge-based strategies and computational search techniques need to be developed. Examples of approaches that can improve HLA-binding prediction systems include the use of advanced search algorithms [28, 29, 43], advanced statistical and machine learning approaches [44–47], combination approaches [28, 38, 48, 49], novel scoring functions [50], improved use of structural predictions [51, 52], and application of knowledge-based approaches [53–57]. Future studies developing HLA-DR predictors should, at minimum, use standardized data sets, provide improved definition of binding cores, minimize the number of false positives, and consider the effects of flanking residues.

The results of this study will help researchers determine the most appropriate servers for pre-screening of HLA class II binding peptides. In addition, this study has defined basic criteria for the selection of prediction thresholds for choosing peptides that are most likely to be HLA-II epitopes. It also provides server developers with guidelines for testing and with test data. This knowledge, together with standardized test data sets, should empower them to produce better solutions and improve prediction performance. The normalization and standardization methods that we introduced in this study enable annotation and integration of heterogeneous data into a uniform format, which facilitates the development of advanced algorithms. Future advances in high-throughput measurement of binding affinities are expected to significantly improve the prediction performance for MHC-II binding peptides.

Materials and methods

We evaluated 21 servers for prediction of HLA class II binding peptides that have been developed by 12 groups (Table 2). These servers were accessible over the Internet as of July 2008. Predictive algorithms used in these servers include: binding matrices, partial least square function, artificial neural networks (ANN), hidden Markov models (HMM), and support vector machines (SVM). Our study involved five consecutive steps: a) Construct test data sets by collecting independent experimental data; b) Retrieve prediction results from the 21 servers; c) Assess the classification accuracy (binders vs. non-binders); d) Assess the prediction accuracy of promiscuous binding affinities; e) Assess the performance for predicting T cell epitopes.

Table 2 List of prediction servers of HLA class II binding peptides, their URLs (as of December 2007), and name abbreviations.

Data sets

In this study our test data sets consisted of 103 peptides derived from four protein antigens: the allergens bee venom phospholipase A2 (API m1) [58] and dog lipocalin (Can f 1) [59], the tumor antigen LAGE-1 [60], and the viral antigen HIV NEF [61]. Although these studies were done by different groups, they were performed using comparable protocols and the same control peptides. The lengths of the studied peptides were in the range of 15 to 19 amino acids (Table 3). Binding capability of these peptides to the corresponding HLA molecules was measured as the concentration of peptide that prevented binding of 50% of the labeled reference peptide. These studies reported binding data for seven HLA-DR molecules (DRB1*0101, 0301, 0401, 0701, 1101, 1301, and 1501). The test data sets used in this study were extracted from the original references and rescaled to a common scale. The data used in this study are accessible at the Dana-Farber Machine Learning Repository for Immunology (DFRMLI) [42].

Table 3 Summary of the four testing protein antigens

Predictions and comparisons

Each protein sequence was submitted to the prediction servers and the results were recorded. Most servers predict binding affinities of 9-mer peptides while the experiments were conducted on longer peptides ranging from 15 aa to 19 aa. Three mapping methods were explored to map the 9-mer predictions to experimental results. First, the highest prediction score of the overlapping 9-mer peptides spanning the length of a longer peptide was used as the predicted binding of the longer peptide. Second, the average score of the overlapping 9-mers was used as the predicted binding. Finally, the average of the top three predicted 9-mer scores of the overlapping peptides was used as the prediction score.
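The three mapping methods can be sketched as follows. This is an illustrative sketch, not the code used in the study: the function name `map_scores`, the method labels, and the example scores are assumptions, and higher scores are assumed to indicate stronger predicted binding.

```python
def map_scores(nine_mer_scores, method="max"):
    """Collapse the 9-mer scores of one longer peptide into one score.

    method: "max"  - highest 9-mer score
            "mean" - average of all overlapping 9-mer scores
            "top3" - average of the three highest 9-mer scores
    """
    if not nine_mer_scores:
        raise ValueError("at least one 9-mer score is required")
    if method == "max":
        return max(nine_mer_scores)
    if method == "mean":
        return sum(nine_mer_scores) / len(nine_mer_scores)
    if method == "top3":
        top = sorted(nine_mer_scores, reverse=True)[:3]
        return sum(top) / len(top)
    raise ValueError("unknown method: " + method)


# Hypothetical scores for the 7 overlapping 9-mers of one 15-mer.
scores = [1.0, 4.0, 2.0, 8.0, 3.0, 6.0, 4.0]
print(map_scores(scores, "max"))   # 8.0
print(map_scores(scores, "mean"))  # 4.0
print(map_scores(scores, "top3"))  # 6.0
```

Averaging over all registers dilutes a single strong core with the scores of non-binding registers, which is one plausible reason why the maximum-score mapping performed best in this evaluation.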

Prediction accuracy is measured in terms of the area under the receiver operating characteristic curve (AROC) [62]. The ROC curve is a plot of the true positive rate TP/(TP+FN) on the vertical axis vs. false positive rate FP/(TN+FP) on the horizontal axis for the full range of the decision thresholds. The values AROC ≥ 0.9 indicate excellent, 0.9 > AROC ≥ 0.8 good, 0.8 > AROC ≥ 0.7 marginal and 0.7 > AROC poor predictions [62].
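As an illustration of this measure, the sketch below computes the AROC using the pairwise-ranking (Mann-Whitney) formulation, which yields the same area as integrating the ROC curve over all decision thresholds. The source does not state how the area was computed, so this choice, the function names, and the example scores are assumptions.

```python
def aroc(binder_scores, non_binder_scores):
    """Area under the ROC curve as the fraction of (binder, non-binder)
    pairs in which the binder receives the higher score; ties count 0.5."""
    wins = 0.0
    for b in binder_scores:
        for n in non_binder_scores:
            if b > n:
                wins += 1.0
            elif b == n:
                wins += 0.5
    return wins / (len(binder_scores) * len(non_binder_scores))


def rate(aroc_value):
    """Performance category using the cut-offs given in the text."""
    if aroc_value >= 0.9:
        return "excellent"
    if aroc_value >= 0.8:
        return "good"
    if aroc_value >= 0.7:
        return "marginal"
    return "poor"


# Hypothetical prediction scores, for illustration only.
binders = [0.9, 0.8, 0.4]
non_binders = [0.7, 0.3, 0.2, 0.1]
value = aroc(binders, non_binders)
print(round(value, 3), rate(value))  # 0.917 excellent
```

Here 11 of the 12 binder/non-binder pairs are ranked correctly, giving an AROC of 11/12.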

In this study we defined promiscuous peptides as those peptides from the test set that bound four or more of the seven studied alleles. Binding was defined as half maximal inhibitory concentration (IC50) lower than 100 nM (for DRB1*0101, 0401, 0701, and 1501), or lower than 1000 nM (for DRB1*0301, 1101, and 1301).
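The definition above can be expressed as a short sketch. The allele-specific IC50 cut-offs and the four-allele criterion come from the text; the function names and the example IC50 values are hypothetical.

```python
# IC50 cut-offs (nM) below which a peptide is counted as a binder.
IC50_THRESHOLD_NM = {
    "DRB1*0101": 100, "DRB1*0401": 100, "DRB1*0701": 100, "DRB1*1501": 100,
    "DRB1*0301": 1000, "DRB1*1101": 1000, "DRB1*1301": 1000,
}


def count_alleles_bound(ic50_by_allele):
    """Number of alleles for which the measured IC50 is below the cut-off."""
    return sum(1 for allele, ic50 in ic50_by_allele.items()
               if ic50 < IC50_THRESHOLD_NM[allele])


def is_promiscuous(ic50_by_allele, min_alleles=4):
    """True if the peptide binds at least min_alleles of the tested alleles."""
    return count_alleles_bound(ic50_by_allele) >= min_alleles


# Hypothetical IC50 measurements (nM) for a single peptide.
peptide_ic50 = {
    "DRB1*0101": 20, "DRB1*0301": 800, "DRB1*0401": 150, "DRB1*0701": 50,
    "DRB1*1101": 1200, "DRB1*1301": 900, "DRB1*1501": 400,
}
print(count_alleles_bound(peptide_ic50))  # 4
print(is_promiscuous(peptide_ic50))       # True
```

The count returned by `count_alleles_bound` corresponds to the per-peptide promiscuity score (number of HLA-DR molecules bound) used in the promiscuity analysis.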

Scaling

To enable visual inspection for comparisons of predictions, both experimental measurements and predictions have been scaled to a common scale from 0 to 100 by linear transformation of the value ranges using the formula for each individual peptide:

\[ y_i^S = \frac{y_i - y_{\min}}{y_{\max} - y_{\min}} \times 100 \]

where \(y_i^S\) is the scaled score, \(y_{\min}\) is the minimum and \(y_{\max}\) is the maximum score. The experimental binding affinity was corrected for variation in binding affinity of control peptides between different experiments then scaled. All values are accessible at DFRMLI site.
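The linear rescaling described above can be sketched in a few lines. The function name `scale_to_100` and the example values are illustrative assumptions, and the correction for control-peptide variation that was applied to the experimental data before scaling is not shown.

```python
def scale_to_100(values):
    """Min-max rescale a list of scores to the range 0-100."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; scale is undefined")
    return [(v - lo) / (hi - lo) * 100 for v in values]


# Hypothetical raw scores from one predictor.
raw = [2.0, 4.0, 6.0, 10.0]
print(scale_to_100(raw))  # [0.0, 25.0, 50.0, 100.0]
```

Because the transformation is linear and increasing, it preserves the rank order of peptides, so AROC values computed from scaled scores equal those computed from raw scores.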

References

  1. Ehreth J: The value of vaccination: a global perspective. Vaccine 2003,21(27–30):4105–4117.

  2. Voutsas IF, Gritzapis AD, Mahaira LG, Salagianni M, von Hofe E, Kallinteris NL, Baxevanis CN: Induction of potent CD4+ T cell-mediated antitumor responses by a helper HER-2/neu peptide linked to the Ii-Key moiety of the invariant chain. International journal of cancer 2007,121(9):2031–2041.

  3. Rhyner C, Kundig T, Akdis CA, Crameri R: Targeting the MHC II presentation pathway in allergy vaccine development. Biochem Soc Trans 2007,35(Pt 4):833–834.

  4. Kong YC, Flynn JC, Banga JP, David CS: Application of HLA class II transgenic mice to study autoimmune regulation. Thyroid 2007,17(10):995–1003.

  5. Purcell AW, McCluskey J, Rossjohn J: More than one reason to rethink the use of peptides in vaccine design. Nat Rev Drug Discov 2007,6(5):404–414.

  6. Madden DR, Garboczi DN, Wiley DC: The antigenic identity of peptide-MHC complexes: a comparison of the conformations of five viral peptides presented by HLA-A2. Cell 1993,75(4):693–708.

  7. Stern LJ, Brown JH, Jardetzky TS, Gorga JC, Urban RG, Strominger JL, Wiley DC: Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide. Nature 1994,368(6468):215–221.

  8. Rammensee HG: Chemistry of peptides associated with MHC class I and class II molecules. Curr Opin Immunol 1995,7(1):85–96.

  9. Lippolis JD, White FM, Marto JA, Luckey CJ, Bullock TN, Shabanowitz J, Hunt DF, Engelhard VH: Analysis of MHC class II antigen processing by quantitation of peptides that constitute nested sets. J Immunol 2002,169(9):5089–5097.

  10. Robinson J, Marsh SG: The IMGT/HLA database. Methods Mol Biol 2007, 409: 43–60.

  11. Peters B, Sidney J, Bourne P, Bui HH, Buus S, Doh G, Fleri W, Kronenberg M, Kubo R, Lund O, et al.: The immune epitope database and analysis resource: from vision to blueprint. PLoS Biol 2005,3(3):e91.

  12. Rammensee H, Bachmann J, Emmerich NP, Bachor OA, Stevanovic S: SYFPEITHI: database for MHC ligands and peptide motifs. Immunogenetics 1999,50(3–4):213–219.

  13. Bachinsky MM, Guillen DE, Patel SR, Singleton J, Chen C, Soltis DA, Tussey LG: Mapping and binding analysis of peptides derived from the tumor-associated antigen survivin for eight HLA alleles. Cancer Immun 2005, 5: 6.

  14. Wang XF, Kerzerho J, Adotevi O, Nuyttens H, Badoual C, Munier G, Oudard S, Tu S, Tartour E, Maillere B: Comprehensive analysis of HLA-DR- and HLA-DP4-restricted CD4+ T cell response specific for the tumor-shared antigen survivin in healthy donors and cancer patients. J Immunol 2008,181(1):431–439.

  15. Flynn JC, McCormick DJ, Brusic V, Wan Q, Panos JC, Giraldo AA, David CS, Kong YC: Pathogenic human thyroglobulin peptides in HLA-DR3 transgenic mouse model of autoimmune thyroiditis. Cellular immunology 2004,229(2):79–85.

  16. Muixi L, Carrascal M, Alvarez I, Daura X, Marti M, Armengol MP, Pinilla C, Abian J, Pujol-Borrell R, Jaraquemada D: Thyroglobulin peptides associate in vivo to HLA-DR in autoimmune thyroid glands. J Immunol 2008,181(1):795–807.

  17. Heiny AT, Miotto O, Srinivasan KN, Khan AM, Zhang GL, Brusic V, Tan TW, August JT: Evolutionarily conserved protein sequences of influenza a viruses, avian and human, as vaccine targets. PLoS ONE 2007,2(11):e1190.

  18. Khan A, Miotto O, Nascimento E, Srinivasan K, Heiny A, Zhang G, Salmon J, Marques E, Tan T, Brusic V, et al.: Identification and characterization of conserved sequences of dengue virus proteins: implications for vaccine design. PLoS Neglected Tropical Diseases 2008,2(8):e272.

  19. Tong JC, Tan TW, Ranganathan S: Methods and protocols for prediction of immunogenic epitopes. Brief Bioinform 2007,8(2):96–108.

  20. Brusic V, Bajic VB, Petrovsky N: Computational methods for prediction of T-cell epitopes – a framework for modelling, testing, and applications. Methods 2004,34(4):436–443.

  21. Davies MN, Flower DR: Harnessing bioinformatics to discover new vaccines. Drug Discov Today 2007,12(9–10):389–395.

  22. Lundegaard C, Lund O, Kesmir C, Brunak S, Nielsen M: Modeling the adaptive immune system: predictions and simulations. Bioinformatics 2007,23(24):3265–3275.

  23. Lin HH, Ray S, Tongchusak S, Reinherz EL, Brusic V: Evaluation of MHC class I peptide binding prediction servers: applications for vaccine research. BMC immunology 2008, 9: 8.

  24. Fonseca SG, Coutinho-Silva A, Fonseca LA, Segurado AC, Moraes SL, Rodrigues H, Hammer J, Kallas EG, Sidney J, Sette A, et al.: Identification of novel consensus CD4 T-cell epitopes from clade B HIV-1 whole genome that are frequently recognized by HIV-1 infected patients. Aids 2006,20(18):2263–2273.

  25. Calvo-Calle JM, Strug I, Nastke MD, Baker SP, Stern LJ: Human CD4+ T cell epitopes from vaccinia virus induced by vaccination or infection. PLoS pathogens 2007,3(10):1511–1529.

  26. Depil S, Morales O, Castelli FA, Delhem N, Francois V, Georges B, Dufosse F, Morschhauser F, Hammer J, Maillere B, et al.: Determination of a HLA II promiscuous peptide cocktail as potential vaccine against EBV latency II malignancies. J Immunother 2007,30(2):215–226.

  27. Tatsumi T, Kierstead LS, Ranieri E, Gesualdo L, Schena FP, Finke JH, Bukowski RM, Brusic V, Sidney J, Sette A, et al.: MAGE-6 encodes HLA-DRbeta1*0401-presented epitopes recognized by CD4+ T cells from patients with melanoma or renal cell carcinoma. Clin Cancer Res 2003,9(3):947–954.

  28. Wang P, Sidney J, Dow C, Mothe B, Sette A, Peters B: A systematic assessment of MHC class II peptide binding predictions and evaluation of a consensus approach. PLoS Comput Biol 2008,4(4):e1000048.

  29. Nielsen M, Lundegaard C, Blicher T, Peters B, Sette A, Justesen S, Buus S, Lund O: Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan. PLoS Comput Biol 2008,4(7):e1000107.

  30. Gowthaman U, Agrewala JN: In silico tools for predicting peptides binding to HLA-class II molecules: more confusion than conclusion. J Proteome Res 2008,7(1):154–163.

  31. Larsen MV, Lundegaard C, Lamberth K, Buus S, Brunak S, Lund O, Nielsen M: An integrative approach to CTL epitope prediction: a combined algorithm integrating MHC class I binding, TAP transport efficiency, and proteasomal cleavage predictions. European journal of immunology 2005,35(8):2295–2303.

  32. Bruggen P, Zhang Y, Chaux P, Stroobant V, Panichelli C, Schultz ES, Chapiro J, Eynde BJ, Brasseur F, Boon T: Tumor-specific shared antigenic peptides recognized by human T cells. Immunol Rev 2002, 188: 51–64.

  33. Parmiani G, De Filippo A, Novellino L, Castelli C: Unique human tumor antigens: immunobiology and use in clinical trials. J Immunol 2007,178(4):1975–1979.

  34. Suri A, Lovitch SB, Unanue ER: The wide diversity and complexity of peptides bound to class II MHC molecules. Curr Opin Immunol 2006,18(1):70–77.

  35. Tong JC, Zhang GL, Tan TW, August JT, Brusic V, Ranganathan S: Prediction of HLA-DQ3.2beta ligands: evidence of multiple registers in class II binding peptides. Bioinformatics 2006,22(10):1232–1238.

  36. Sant AJ, Chaves FA, Jenks SA, Richards KA, Menges P, Weaver JM, Lazarski CA: The relationship between immunodominance, DM editing, and the kinetic stability of MHC class II:peptide complexes. Immunol Rev 2005, 207: 261–278.

  37. Peters B, Bui HH, Frankild S, Nielsen M, Lundegaard C, Kostem E, Basch D, Lamberth K, Harndahl M, Fleri W, et al.: A community resource benchmarking predictions of peptide binding to MHC-I molecules. PLoS Comput Biol 2006,2(6):e65.

  38. Trost B, Bickis M, Kusalik A: Strength in numbers: achieving greater accuracy in MHC-I binding prediction by combining the results from multiple prediction tools. Immunome Res 2007,3(1):5.

  39. Zhang GL, Khan AM, Srinivasan KN, August JT, Brusic V: MULTIPRED: a computational system for prediction of promiscuous HLA binding peptides. Nucleic Acids Res 2005, (33 Web Server):W172–179.

  40. Godkin AJ, Smith KJ, Willis A, Tejada-Simon MV, Zhang J, Elliott T, Hill AV: Naturally processed HLA class II peptides reveal highly conserved immunogenic flanking region sequence preferences that reflect antigen processing rather than peptide-MHC interactions. J Immunol 2001,166(11):6720–6727.

  41. Nielsen M, Lundegaard C, Lund O: Prediction of MHC class II binding affinity using SMM-align, a novel stabilization matrix alignment method. BMC Bioinformatics 2007, 8: 238.

  42. DFRMLI[http://bio.dfci.harvard.edu/DFRMLI/]

  43. Rajapakse M, Wyse L, Schmidt B, Brusic V: Deriving matrix of peptide-MHC interactions in diabetic mouse by genetic algorithm. Lect Notes Comp Sci 2005, 3578: 440–447.

  44. Zhang W, Liu J, Niu YQ, Wang L, Hu X: A Bayesian regression approach to the prediction of MHC-II binding affinity. Computer methods and programs in biomedicine 2008.

  45. Zhang C, Bickis MG, Wu FX, Kusalik AJ: Optimally-connected hidden markov models for predicting MHC-binding peptides. J Bioinform Comput Biol 2006,4(5):959–980.

  46. Handoko SD, Kwoh CK, Ong YS, Zhang GL, Brusic V: Extreme learning machine for predicting HLA-peptide binding. Lecture Notes in Computer Science 2006, 3973: 716–721.

  47. Nanni L: Machine learning algorithms for T-cell epitopes prediction. 2006,69(7–9):866–868.

  48. Cho Y, Kim H, Oh H: Prediction Rule Generation of MHC Class I Binding Peptides Using ANN and GA. Lecture Notes in Computer Science 2005, 3610: 1009–1016.

  49. Karpenko O, Huang L, Dai Y: A probabilistic meta-predictor for the MHC class II binding peptides. Immunogenetics 2008,60(1):25–36.

  50. Hertz T, Yanover C: Identifying HLA supertypes by learning distance functions. Bioinformatics 2007,23(2):e148–155.

  51. Tong JC, Zhang ZH, August JT, Brusic V, Tan TW, Ranganathan S: In silico characterization of immunogenic epitopes presented by HLA-Cw*0401. Immunome Res 2007, 3: 7.

  52. Antes I, Siu SW, Lengauer T: DynaPred: a structure and sequence based method for the prediction of MHC class I binding peptide sequences and conformations. Bioinformatics 2006,22(14):e16–24.

  53. Kangueane P, Sakharkar MK, Lim KS, Hao H, Lin K, Chee RE, Kolatkar PR: Knowledge-based grouping of modeled HLA peptide complexes. Hum Immunol 2000,61(5):460–466.

  54. Salomon J, Flower DR: Predicting Class II MHC-Peptide binding: a kernel based approach using similarity scores. BMC Bioinformatics 2006, 7: 501.

  55. Heckerman D, Kadie C, Listgarten J: Leveraging information across HLA alleles/supertypes improves epitope prediction. J Comput Biol 2007,14(6):736–746.

  56. Sidney J, Assarsson E, Moore C, Ngo S, Pinilla C, Sette A, Peters B: Quantitative peptide binding motifs for 19 human and mouse MHC class I molecules derived using positional scanning combinatorial peptide libraries. Immunome Res 2008, 4: 2.

  57. DeLuca DS, Blasczyk R: Implementing the modular MHC model for predicting peptide binding. Methods Mol Biol 2007, 409: 261–271.

  58. Texier C, Pouvelle S, Busson M, Herve M, Charron D, Menez A, Maillere B: HLA-DR restricted peptide candidates for bee venom immunotherapy. J Immunol 2000,164(6):3177–3184.

  59. Immonen A, Farci S, Taivainen A, Partanen J, Pouvelle-Moratille S, Narvanen A, Kinnunen T, Saarelainen S, Rytkonen-Nissinen M, Maillere B, et al.: T cell epitope-containing peptides of the major dog allergen Can f 1 as candidates for allergen immunotherapy. J Immunol 2005,175(6):3614–3620.

  60. Mandic M, Almunia C, Vicel S, Gillet D, Janjic B, Coval K, Maillere B, Kirkwood JM, Zarour HM: The alternative open reading frame of LAGE-1 gives rise to multiple promiscuous HLA-DR-restricted epitopes recognized by T-helper 1-type tumor-reactive CD4+ T cells. Cancer research 2003,63(19):6506–6515.

  61. Gahery H, Figueiredo S, Texier C, Pouvelle-Moratille S, Ourth L, Igea C, Surenaud M, Guillet JG, Maillere B: HLA-DR-restricted peptides identified in the Nef protein can induce HIV type 1-specific IL-2/IFN-gamma-secreting CD4+ and CD4+/CD8+ T cells in humans after lipopeptide vaccination. AIDS research and human retroviruses 2007,23(3):427–437.

  62. Swets JA: Measuring the accuracy of diagnostic systems. Science 1988,240(4857):1285–1293.

  63. HLA-DR4Pred[http://www.imtech.res.in/raghava/hladr4pred/index.html]

  64. Bhasin M, Raghava GP: SVM based method for predicting HLA-DRB1*0401 binding peptides in an antigen sequence. Bioinformatics 2004,20(3):421–423.

  65. IEDB[http://tools.immuneepitope.org/analyze/html/mhc_II_binding.html]

  66. Bui HH, Sidney J, Peters B, Sathiamurthy M, Sinichi A, Purton KA, Mothe BR, Chisari FV, Watkins DI, Sette A: Automated generation and evaluation of specific MHC binding predictive tools: ARB matrix applications. Immunogenetics 2005,57(5):304–314.

  67. Sturniolo T, Bono E, Ding J, Raddrizzani L, Tuereci O, Sahin U, Braxenthaler M, Gallazzi F, Protti MP, Sinigaglia F, et al.: Generation of tissue-specific and promiscuous HLA ligand databases using DNA microarrays and virtual HLA class II matrices. Nat Biotechnol 1999,17(6):555–561.

  68. MHC BP[http://www.vaccinedesign.com]

  69. MHC2Pred[http://www.imtech.res.in/raghava/mhc2pred]

  70. MHC-BPS[http://bidd.cz3.nus.edu.sg/mhc]

  71. Cui J, Han LY, Lin HH, Tang ZQ, Jiang L, Cao ZW, Chen YZ: MHC-BPS: MHC-binder prediction server for identifying peptides of flexible lengths from sequence-derived physicochemical properties. Immunogenetics 2006,58(8):607–613.

  72. MHCPred[http://www.jenner.ac.uk/MHCPred]

  73. Guan P, Hattotuwagama CK, Doytchinova IA, Flower DR: MHCPred 2.0: an updated quantitative T-cell epitope prediction server. Appl Bioinformatics 2006,5(1):55–61.

  74. MULTIPRED1[http://antigen.i2r.a-star.edu.sg/multipred1]

  75. Zhang GL, Bozic I, Kwoh CK, August JT, Brusic V: Prediction of supertype-specific HLA class I binding peptides using support vector machines. J Immunol Methods 2007,320(1–2):143–154.

  76. NetMHCII[http://www.cbs.dtu.dk/services/NetMHCII]

  77. NetMHCIIpan[http://www.cbs.dtu.dk/services/NetMHCIIpan]

  78. PeptideCheck[http://www.peptidecheck.org]

  79. DeLuca DS, Khattab B, Blasczyk R: A modular concept of HLA for comprehensive peptide binding prediction. Immunogenetics 2007,59(1):25–35.

  80. ProPred[http://www.imtech.res.in/raghava/propred]

  81. Singh H, Raghava GP: ProPred: prediction of HLA-DR binding sites. Bioinformatics 2001,17(12):1236–1237.

  82. Rankpep[http://bio.dfci.harvard.edu/Tools/rankpep.html]

  83. Reche PA, Glutting JP, Reinherz EL: Prediction of MHC class I binding peptides using profile motifs. Hum Immunol 2002,63(9):701–709.

  84. SVMHC[http://www-bs.informatik.uni-tuebingen.de/SVMHC/index_html]

  85. Donnes P, Kohlbacher O: SVMHC: a server for prediction of MHC-binding peptides. Nucleic Acids Res 2006, (34 Web Server):W194–197.

  86. SVRMHC[http://SVRMHC.umn.edu/SVRMHCdb]

  87. Wan J, Liu W, Xu Q, Ren Y, Flower DR, Li T: SVRMHC prediction server for MHC-binding peptides. BMC Bioinformatics 2006, 7: 463.

  88. SYFPEITHI[http://www.syfpeithi.de/Scripts/MHCServer.dll/EpitopePrediction.htm]

Acknowledgements

This work was supported by the ImmunoGrid project, under EC contract FP6-2004-IST-4, No. 028069, and NIH grant U19 A157330.

This article has been published as part of BMC Bioinformatics Volume 9 Supplement 12, 2008: Asia Pacific Bioinformatics Network (APBioNet) Seventh International Conference on Bioinformatics (InCoB2008). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/9?issue=S12.

Author information

Corresponding author

Correspondence to Vladimir Brusic.

Additional information

Competing interests

The authors declare that they have no competing interests. Previously, HHL co-developed MHC-BPS, GLZ and VB co-developed MULTIPRED, and ELR co-developed Rankpep.

Authors' contributions

VB and ELR designed the study; HHL performed the analysis; GLZ and ST collected and prepared the data. HHL and VB drafted the article, and all authors participated in preparing the manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lin, H.H., Zhang, G.L., Tongchusak, S. et al. Evaluation of MHC-II peptide binding prediction servers: applications for vaccine research. BMC Bioinformatics 9 (Suppl 12), S22 (2008). https://doi.org/10.1186/1471-2105-9-S12-S22

  • DOI: https://doi.org/10.1186/1471-2105-9-S12-S22

Keywords