Genome-wide prediction of vaccine targets for human herpes simplex viruses using Vaxign reverse vaccinology
© Xiang and He; licensee BioMed Central Ltd. 2013
Published: 8 March 2013
Herpes simplex virus (HSV) types 1 and 2 (HSV-1 and HSV-2) are among the most common infectious agents of humans. No safe and effective HSV vaccine has been licensed. Reverse vaccinology is an emerging vaccine development strategy that starts with the prediction of vaccine targets by informatics analysis of genome sequences. Vaxign (http://www.violinet.org/vaxign) is the first web-based vaccine design program based on reverse vaccinology. In this study, we used Vaxign to analyze 52 herpesvirus genomes, including 3 HSV-1 genomes, one HSV-2 genome, 8 other human herpesvirus genomes, and 40 non-human herpesvirus genomes. The HSV-1 strain 17 genome, which contains 77 proteins, was used as the seed genome. These 77 proteins are conserved in two other HSV-1 strains (strain F and strain H129). Two envelope glycoproteins, gJ and gG, do not have orthologs in HSV-2 or in the 8 other human herpesvirus genomes. Seven HSV-1 proteins (including gJ and gG) do not have orthologs in any of the 40 non-human herpesviruses. Nineteen proteins are conserved in all human herpesviruses, including capsid scaffold protein UL26.5 (NP_044628.1). As the only HSV-1 protein predicted to be an adhesin, UL26.5 is a promising vaccine target. MHC Class I and II epitopes were predicted by the Vaxign Vaxitop prediction program and by the IEDB prediction programs recently installed and incorporated in Vaxign. Our comparative analysis found that the two programs identified largely the same top epitopes, but some results predicted as positive by one program were not positive in the other. Overall, our Vaxign computational prediction provides many promising candidates for rational HSV vaccine development. The method is generic and can also be used to predict other viral vaccine targets.
The Herpesviridae are a family of DNA viruses that cause diseases in humans and various animals. Herpesviruses are the members of the Herpesviridae family. All herpesviruses share a similar virion structure: a linear, double-stranded DNA molecule densely packaged into an icosahedral protein cage called the capsid. The capsid is surrounded by an amorphous layer called the tegument, which contains both viral proteins and viral mRNAs, and by a lipid bilayer membrane (the envelope). Infectious virions are spherical. All herpesviruses are species-specific. Human herpesviruses (HHVs) include eight members: herpes simplex virus (HSV) types 1 and 2 (HSV-1 and HSV-2), varicella zoster virus (VZV; HHV-3), Epstein-Barr virus (EBV; HHV-4), human cytomegalovirus (CMV; HHV-5), human herpesvirus-6 and -7 (HHV-6 and HHV-7), and Kaposi's sarcoma associated herpesvirus (KSHV; HHV-8). Herpesviruses typically cause latent, lytic, and recurring infections. HSV-1 and HSV-2 are two human pathogens that cause a variety of recurrent immunopathologic diseases, ranging from mild skin diseases, including herpes labialis and herpes genitalis, to life-threatening diseases, including neonatal herpes and adult herpes encephalitis [1, 2]. For example, HSV-1 can cause epithelial lesions on the lip or face. After establishment of productive infection, HSV-1 causes latent infection of the trigeminal ganglia. Despite fairly widespread use of antiviral drugs, HSV-1 and HSV-2 remain among the most common infectious agents of humans. In the US, the seroprevalence of HSV-1 and HSV-2 in adults is 68% and 21%, respectively, and approximately 700-2000 cases of neonatal HSV infection occur per year.
Although many acute infections can be controlled by vaccination, the development of prophylactic and therapeutic vaccines against persistent herpesviruses remains challenging. There are currently no US FDA-approved HSV vaccines available. The development of an effective vaccine against HSV is complicated by many unique characteristics of herpesviruses, including the complexity of the virus replication cycle (i.e., primary, latent and recurrent phases of infection), their sophisticated immunoevasion strategies, and the high number of candidate proteins encoded by the large and complex herpes genome. Although antibodies generated following HSV-1 and HSV-2 immunizations do not protect against virus entry, antibodies against envelope glycoproteins gB, gC, gD, and gE provide passive protection against lethal viral challenges. T helper cell type 1 (Th1) responses and cytotoxic T lymphocyte (CTL) activities are also critical to host protection. Many HSV proteins, including the two major protective antigens gB and gD, have been evaluated for vaccine development [5, 6]. Although animal studies showed induced protection, human clinical trials with vaccines using these two proteins (gB and gD) have not generated ideal results [5, 6]. Therefore, to develop safe and effective human HSV vaccines, it is necessary to identify and evaluate more protective antigens in HSVs.
As an emerging vaccine development approach, reverse vaccinology starts with the prediction of vaccine protein targets by bioinformatics analysis of genome sequences. Reverse vaccinology was first applied to the development of a vaccine against serogroup B Neisseria meningitidis (MenB). With this method, it took less than 18 months to identify more protective vaccine targets in MenB than had been discovered during the previous 40 years by conventional methods. Since then, this technology has been successfully applied to many other pathogens such as Bacillus anthracis, Streptococcus pneumoniae, Mycobacterium tuberculosis, and Cryptosporidium hominis. Vaxign is the first web-based vaccine design software that uses the reverse vaccinology strategy [13, 14]. In reverse vaccinology, predicted proteins are selected based on defined desirable attributes. Predicted features in the Vaxign pipeline include protein subcellular location, transmembrane helices, adhesin probability, conservation among pathogenic strains, sequence exclusion from genomes of nonpathogenic strains, sequence similarity to host proteins, and epitope binding to major histocompatibility complex (MHC) class I and class II. Vaxign has successfully predicted verified and potential vaccine targets for Brucella spp. [14, 15] and uropathogenic E. coli. Over 200 genomes have been pre-computed using the Vaxign algorithm and are available for query on the Vaxign website. Vaxign also allows for dynamic vaccine target prediction based on users' input sequences. Since the previous publication, Vaxign has added several new features. For example, after logging in, a user can save Vaxign analysis projects for continuous updates and result sharing. While Vaxign includes its own MHC class I and II epitope prediction tool, it has now also incorporated the tools implemented in the Immune Epitope Database (IEDB; http://tools.immuneepitope.org/main/html/tcell_tools.html). Both sets of epitope prediction results can then be compared in parallel.
The Vaxign reverse vaccinology approach is generic and can be used for analyses of vaccines against various pathogens and infectious diseases. However, there has not been a report of how to use Vaxign to predict vaccine targets for a viral disease. In this study, we have used Vaxign to predict HSV vaccine targets.
Table 1. Fifty-two herpesvirus genomes used in the current Vaxign analysis.
Genome
Alcelaphine herpesvirus 1
Anatid herpesvirus 1
Anguillid herpesvirus 1
Ateline herpesvirus 3
Bovine herpesvirus 1
Bovine herpesvirus 4
Bovine herpesvirus 5
Callitrichine herpesvirus 3
Caviid herpesvirus 2
Cercopithecine herpesvirus 2
Cercopithecine herpesvirus 9
Cyprinid herpesvirus 3
Equid herpesvirus 2
Equid herpesvirus 4
Equid herpesvirus 9
Felid herpesvirus 1
Gallid herpesvirus 1
Gallid herpesvirus 2
Gallid herpesvirus 3
Human herpesvirus 1 strain 17
Human herpesvirus 1 strain F
Human herpesvirus 1 strain H129
Human herpesvirus 2 strain HG52
Human herpesvirus 3
Human herpesvirus 4
Human herpesvirus 4 type 1
Human herpesvirus 5
Human herpesvirus 6A
Human herpesvirus 6B
Human herpesvirus 7
Human herpesvirus 8
Ictalurid herpesvirus 1
Macacine herpesvirus 1
Macacine herpesvirus 3
Macacine herpesvirus 4
Macacine herpesvirus 5
Meleagrid herpesvirus 1
Murid herpesvirus 1
Murid herpesvirus 2
Murid herpesvirus 4
Ostreid herpesvirus 1
Ovine herpesvirus 2
Panine herpesvirus 2
Papiine herpesvirus 2
Psittacid herpesvirus 1
Ranid herpesvirus 1
Ranid herpesvirus 2
Rodent herpesvirus Peru
Saimiriine herpesvirus 1
Saimiriine herpesvirus 2
Suid herpesvirus 1
Tupaiid herpesvirus 1
Sequence conservation among different genomes, e.g., among HSV-1 genomes or among all human herpesviruses.
Sequence exclusion from specific genomes, e.g., those HSV-1 proteins that do not have orthologous proteins in non-human herpesviruses.
Adhesin probability. Adhesins are critical for a virus to invade a host cell, so adhesins tend to be good vaccine targets.
The number of transmembrane helices. The presence of more than one transmembrane helix in a protein often leads to failure of recombinant protein isolation and purification. Therefore, a user can choose to exclude proteins with many transmembrane helices from the list of possible vaccine targets.
MHC Class I and II epitopes. A protein with many T cell epitopes is a preferred vaccine target. Also, prediction of MHC Class I and II epitopes is critical for those who plan to develop epitope vaccines.
Since the different module software programs used in Vaxign are independent of each other, a user can choose whether or not to use any specific criteria and programs. Such a module-based software pipeline is designed based on the observation that vaccine researchers and developers often have different preferences in terms of which criteria to use and how to use them for their specific vaccine design applications. Below we present our predictions based on the schemes considered appropriate for development of HSV vaccines.
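The module-based selection described above can be illustrated with a minimal Python sketch. It is not the Vaxign implementation: the `Protein` record, the criterion names, and the `select_candidates` helper are hypothetical, and apart from the UL26.5 adhesin probability (0.675) and the 0.51 cutoff reported in this article, all values are made-up placeholders.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Protein:
    name: str
    adhesin_probability: float   # e.g. a SPAAN-style score in [0, 1]
    transmembrane_helices: int   # e.g. an HMMTOP-style helix count
    conserved: bool              # conserved in all genomes of interest


# Each criterion is an independent predicate, so any subset can be enabled.
# The criterion names are hypothetical labels chosen for this sketch.
CRITERIA: Dict[str, Callable[[Protein], bool]] = {
    "adhesin": lambda p: p.adhesin_probability > 0.51,
    "max_one_tm_helix": lambda p: p.transmembrane_helices <= 1,
    "conserved": lambda p: p.conserved,
}


def select_candidates(proteins: List[Protein], enabled: List[str]) -> List[Protein]:
    """Keep the proteins that pass every enabled criterion."""
    return [p for p in proteins if all(CRITERIA[c](p) for c in enabled)]


# Illustrative records: only the UL26.5 adhesin probability (0.675) comes
# from the article; every other value is a made-up placeholder.
proteins = [
    Protein("UL26.5", 0.675, 0, True),
    Protein("hypothetical_protein_A", 0.30, 1, True),
    Protein("hypothetical_protein_B", 0.20, 8, False),
]

print([p.name for p in select_candidates(proteins, ["adhesin"])])
print([p.name for p in select_candidates(proteins, ["max_one_tm_helix", "conserved"])])
```

Keeping each criterion as a separate predicate mirrors the idea that a user may switch any filter on or off without affecting the others.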
After the conservation analysis among all three HSV-1 genomes, we performed different Vaxign analyses using different schemes. First, our analysis identified two HSV-1 proteins that are conserved in all HSV-1 genomes but absent in all 9 genomes from the seven other human herpesvirus types. These two proteins are envelope glycoprotein gJ (NP_044667.1) and envelope glycoprotein gG (NP_044666.1). The HSV-2 genome (NC_001798.1) is very similar to that of HSV-1. However, gJ and gG do not have orthologs in HSV-2. Therefore, gJ and gG are likely critical for differentiating HSV-1 from HSV-2 and other human herpesviruses.
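The combination of conservation and exclusion used in this analysis amounts to a pair of set tests on an ortholog table. The following Python sketch assumes a hypothetical `orthologs` mapping (protein name to the set of genomes holding an ortholog), which in practice would be derived from OrthoMCL output. The toy table only encodes what the text states about gJ, gG, and gB, and the genome labels are shorthand chosen for this example.

```python
from typing import Dict, List, Set


def filter_by_orthologs(
    seed_proteins: List[str],
    orthologs: Dict[str, Set[str]],
    must_have: Set[str],
    must_lack: Set[str],
) -> List[str]:
    """Keep seed proteins with an ortholog in every `must_have` genome
    and in none of the `must_lack` genomes."""
    selected = []
    for protein in seed_proteins:
        genomes = orthologs.get(protein, set())
        if must_have <= genomes and not (must_lack & genomes):
            selected.append(protein)
    return selected


# Shorthand genome labels (assumed for this sketch).
OTHER_HSV1 = {"HSV-1 F", "HSV-1 H129"}
OTHER_HUMAN = {
    "HSV-2 HG52", "HHV-3", "HHV-4", "HHV-4 type 1", "HHV-5",
    "HHV-6A", "HHV-6B", "HHV-7", "HHV-8",
}

# Toy ortholog table for three proteins of the seed genome (HSV-1 strain 17).
orthologs = {
    "gJ": set(OTHER_HSV1),
    "gG": set(OTHER_HSV1),
    "gB": OTHER_HSV1 | OTHER_HUMAN,
}

hsv1_specific = filter_by_orthologs(["gJ", "gG", "gB"], orthologs, OTHER_HSV1, OTHER_HUMAN)
pan_human = filter_by_orthologs(["gJ", "gG", "gB"], orthologs, OTHER_HSV1 | OTHER_HUMAN, set())
print(hsv1_specific, pan_human)
```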
Table 2. Seven HSV-1 proteins that do not have orthologs in any of the 40 non-human herpesviruses.
TAP transporter inhibitor ICP47
tegument protein US11
envelope glycoprotein gJ
envelope glycoprotein gG
membrane protein UL56
neurovirulence protein ICP34.5
neurovirulence protein ICP34.5
Table 3. Nineteen HSV-1 proteins that are also conserved in other human herpesviruses.
helicase-primase helicase subunit
helicase-primase primase subunit
capsid portal protein
major capsid protein
capsid maturation protease (UL26)
capsid scaffold protein UL26.5
envelope glycoprotein gM
envelope glycoprotein gB
DNA packaging terminase subunit 1
DNA packaging terminase subunit 2
DNA packaging tegument protein UL25
DNA packaging protein UL32
DNA packaging protein UL33
nuclear protein UL24
single-stranded DNA-binding protein
DNA polymerase catalytic subunit
ribonucleotide reductase subunit 1
Vaxitop is a Vaxign program that predicts MHC Class I and II binding epitopes based on position specific scoring matrices (PSSM). Many software programs for predicting T cell MHC Class I and II epitopes are currently available. One unique feature of Vaxitop is that it reports a statistical P-value, whereas other MHC Class I and II prediction tools typically report a percentage or a top number. Combining different programs has been recognized as a way to increase the specificity of T cell epitope prediction. Therefore, we have installed the default IEDB MHC Class I and II prediction tools (http://tools.immuneepitope.org/main/html/tcell_tools.html) in Vaxign. A Vaxign user can calculate and compare immune epitopes using both the Vaxitop and IEDB programs.
The addition of epitope prediction allows further analysis of potential HSV vaccine targets. Our analysis found that the HSV-1 UL26.5 capsid scaffold protein (NP_044628.1) is particularly interesting. The adhesin probability of this protein is 0.675, and it is the only protein with an adhesin probability above 0.51, the default cutoff value for defining a predicted adhesin. We have thus focused here on the immune epitope predictions for this capsid scaffold protein.
Similarly, overlapping results were found for MHC Class II epitope prediction between Vaxitop and the IEDB method (data not shown). The results of our receiver operating characteristic (ROC) curve analyses, using positive and negative training data obtained from the IEDB database, were comparable to the top results from existing MHC Class II prediction tools surveyed by Wang et al. The results of the ROC analyses are provided on the Vaxign web page (http://www.violinet.org/vaxign/docs/aucs.php).
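For readers unfamiliar with ROC analysis, the area under the ROC curve (AUC) can be computed directly from the scores assigned to known positive and negative peptides. The Python sketch below uses the standard rank-based definition (the probability that a random positive scores higher than a random negative, with ties counted as one half). The article does not describe how its AUC values were calculated, so this is only an illustration, and the scores are made-up placeholders.

```python
from typing import Sequence


def roc_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Higher scores are assumed to indicate a stronger predicted binder.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Placeholder scores for known binders and known non-binders.
auc = roc_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
print(round(auc, 3))
```

An AUC of 1.0 means the predictor ranks every positive above every negative, while 0.5 corresponds to random ranking.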
One major bottleneck in developing an effective and safe human HSV vaccine is the identification of protective antigens that are conserved among all HSV genomes and are able to induce a protective immune response. Our current study is the first to use a reverse vaccinology strategy to analyze various herpesvirus genomes and identify possible HSV vaccine targets based on genome sequence analyses. Our Vaxign reverse vaccinology approach proved to be an efficient method for predicting candidate vaccine targets that are conserved in HSV genomes and have desired characteristics.
The current study provides many candidate targets for HSV vaccine development, including seven HSV-1 proteins that do not have orthologs in any of the tested non-human herpesviruses (Table 2). Among them are envelope glycoproteins gJ and gG. Antibodies against gG have been found in serum samples from HSV-1-infected individuals. HSV-1 gJ plays an important role in neuron-to-neuron transmission through synaptically linked neuronal pathways. The membrane protein UL56 is likely involved in vesicular trafficking in HSV-infected cells. The HSV-1 ICP34.5 protein is a neurovirulence factor that plays critical roles in viral replication and anti-host responses. HSV-1 ICP47, one of the seven proteins, is an early expressed protein that blocks the MHC class I antigen presentation pathway by binding to the TAP transporter. These HSV proteins that do not have orthologs in non-human herpesviruses may be valuable human vaccine targets.
We have also found 19 HSV-1 proteins that are conserved in other human herpesviruses (Table 3). This list includes two envelope glycoproteins (gM and gB), five DNA packaging-related proteins, and four capsid-related proteins. Our analysis identified capsid scaffold protein UL26.5 as a promising vaccine target. The primary reason is that UL26.5 is the only protein among all 77 HSV-1 proteins that has an adhesin probability of >0.51. The UL26.5 capsid scaffold protein is known to be critical for virus capsid formation. During the assembly of an HSV capsid, the major capsid protein VP5 interacts with the C-terminal residues of the scaffold proteins UL26.5 and UL26 (also one of the 19 proteins in Table 3). After capsid assembly, the scaffold proteins are cleaved at the maturation site by a serine protease also encoded by UL26, thereby allowing the scaffold proteins to be released from the capsid [25, 26]. Cleavage of the UL26.5 protein releases the major scaffold protein VP22a. It is likely that the other cleaved segment plays a critical role in making UL26.5 a possible adhesin. An HSV-1 mutant with a deletion of UL26.5 amino acids 143-150 is unable to produce infectious virus. This suggests that UL26.5 is a virulence factor critical for viral pathogenesis. Because UL26.5 is one of the 19 HSV-1 proteins that are also conserved in other human herpesviruses (Table 3), it could potentially be used to develop a vaccine against all human herpesviruses. Experimental study is required to verify the value of UL26.5, or part of UL26.5, as a protective antigen for HSV vaccine development.
Various T cell MHC Class I and II epitope prediction algorithms have been developed, and they use different prediction approaches. In general, T cell epitope algorithms have now achieved a high degree of prediction accuracy. Unlike other T cell epitope methods, Vaxitop uses a statistical P-value, which is more understandable to many biologists. Our comparative analysis found that the MHC Class I epitopes predicted by Vaxitop overlap with those from the IEDB prediction method, and that Vaxitop usually predicts fewer positive hits than the IEDB prediction (Figure 2). Since many epitopes are often identified, a conservative strategy is to experimentally test those epitopes that are positive in both predictions. The incorporation of the IEDB MHC Class I and II methods also gives Vaxign users more options for immune epitope analysis.
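Selecting the epitopes that are positive in both predictions is a simple set intersection. The Python sketch below assumes that Vaxitop results are available as P-values and IEDB results as percentile ranks (lower is better in both cases). The function name, the percentile cutoff of 1.0, and the peptide labels are assumptions made for illustration only; the 0.05 P-value cutoff is the one discussed for Vaxitop in this article.

```python
from typing import Dict, List


def consensus_epitopes(
    vaxitop_p_values: Dict[str, float],
    iedb_percentiles: Dict[str, float],
    p_cutoff: float = 0.05,
    percentile_cutoff: float = 1.0,  # assumed cutoff, not taken from the article
) -> List[str]:
    """Return peptides called positive by both prediction methods."""
    vaxitop_hits = {pep for pep, p in vaxitop_p_values.items() if p <= p_cutoff}
    iedb_hits = {pep for pep, rank in iedb_percentiles.items() if rank <= percentile_cutoff}
    return sorted(vaxitop_hits & iedb_hits)


# Placeholder predictions; the peptide labels and scores are made up.
vaxitop = {"peptide_1": 0.01, "peptide_2": 0.03, "peptide_3": 0.20}
iedb = {"peptide_1": 0.5, "peptide_2": 2.5, "peptide_3": 0.8, "peptide_4": 0.1}

shared = consensus_epitopes(vaxitop, iedb)
print(shared)
```

Requiring agreement trades sensitivity for specificity: peptides found by only one program (here `peptide_2`, `peptide_3`, and `peptide_4`) are dropped from the shortlist.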
The Vaxign vaccine design program can also be improved with additional features. For example, Vaxign lacks a program to predict the location of a viral protein inside the virion particle and its subcellular location inside host cells. Currently Vaxign includes PSORTb, a program that is designed for bacterial subcellular location prediction. Another program for viral protein location would be needed, especially for viruses with large genomes. Since different criteria are provided in Vaxign, it is often left to the user to balance the criteria for vaccine candidate selection. To make the selection more balanced, we are currently designing a comprehensive score that ranks predicted proteins by integrating different criteria. We are also in the process of incorporating gene expression profiles of microbial genes under different experimental conditions into our rational vaccine design. As an integrated component of the web-based VIOLIN database and analysis system (http://www.violinet.org), Vaxign can also be improved by interaction with other VIOLIN programs. For example, VIOLIN Protegen is a web-based database that contains information on over 600 protective antigens. These experimentally verified protective antigens can be used for identifying specific patterns in protective antigens and for computationally predicting protective antigens. Many of the protective antigens in Protegen come from viruses, so they can be used for training the Vaxign program and verifying predicted results. The vaccine adjuvant database Vaxjo in VIOLIN may provide training data for Vaxign to include a specific component for rational vaccine adjuvant design. The community-based Vaccine Ontology (VO; http://www.violinet.org/vaccineontology) is developed to support vaccine data standardization, integration, and computer-assisted reasoning. VO has been found to be valuable in ontology-based natural language processing and literature mining, which can facilitate advanced vaccine design. Currently, we are exploring how VO-based literature mining can improve Vaxign vaccine design.
For each herpesvirus protein, the Vaxign pipeline was used to calculate various criteria using module software programs described below:
Subcellular localization. This feature is implemented in Vaxign using optimized PSORTb 3.0. PSORTb has been reported to be the most precise bacterial protein subcellular localization predictor. To use this program, Vaxign first runs a script that generates standard input data for PSORTb. After the PSORTb execution, Vaxign automatically parses the PSORTb output and stores the results in the Vaxign MySQL database. Such a process allows seamless generation of PSORTb input, execution, and automatic processing and storage of PSORTb output in Vaxign. Similar strategies are used for the other module programs.
Number of transmembrane helices. The transmembrane helix topology analysis is conducted using optimized HMMTOP.
Minimum adhesin probability (0-1.0). The optimized SPAAN program is used for calculating adhesin probability. A probability greater than the default cutoff of 0.51 indicates that a tested protein is likely an adhesin or has adhesin-like characteristics. With this cutoff, the performance of the SPAAN program is optimized, with the highest Matthews correlation coefficient.
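Choosing a probability cutoff by maximizing the Matthews correlation coefficient (MCC) can be sketched as follows. This Python example is not the SPAAN or Vaxign code: the helper names are hypothetical and the scores and labels are made-up placeholders. It only shows how a cutoff such as 0.51 could be selected from labeled training data.

```python
import math
from typing import Sequence, Tuple


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def best_cutoff(
    scores: Sequence[float], labels: Sequence[int], cutoffs: Sequence[float]
) -> Tuple[float, float]:
    """Return (cutoff, MCC) for the candidate cutoff with the highest MCC.

    A protein is predicted positive when its score is greater than the cutoff.
    Ties are resolved in favor of the first cutoff tried.
    """
    best_c, best_m = cutoffs[0], -2.0
    for c in cutoffs:
        tp = sum(1 for s, y in zip(scores, labels) if s > c and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > c and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s <= c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s <= c and y == 0)
        m = mcc(tp, tn, fp, fn)
        if m > best_m:
            best_c, best_m = c, m
    return best_c, best_m


# Placeholder training data: 1 = known adhesin, 0 = known non-adhesin.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
cutoff, score = best_cutoff(scores, labels, [0.2, 0.5, 0.7])
print(cutoff, score)
```

MCC is a reasonable criterion here because it stays informative when the positive and negative classes are of very different sizes.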
Microbial sequence conservation by ortholog analysis. OrthoMCL is applied for finding conserved proteins among a selected list of strains. An E-value of 10^-5 is set as the default value for OrthoMCL processing.
Exclusion of proteins having orthologs in selected genome(s). Similarly, OrthoMCL is applied for excluding proteins that also exist in a non-pathogenic strain(s).
No similarity to host proteins. Selecting this option excludes vaccine targets that also exist in a host, such as human, mouse, or pig.
Vaxign includes an internally developed program called Vaxitop for prediction of MHC Class I and II epitopes. Vaxitop predicts immune epitopes based on position specific scoring matrices (PSSM). Unlike other existing epitope prediction algorithms, Vaxitop calculates a statistical P-value (instead of a percentage or top number) as the cutoff. A P-value of 0.05 provides a cutoff with high and balanced sensitivity and specificity. We have also used Vaxitop to predict MHC Class I and II epitopes for each herpesvirus protein.
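To make the idea of PSSM scoring with a P-value cutoff concrete, the Python sketch below scores a peptide against a toy matrix and estimates an empirical P-value from randomly generated background peptides. The article does not specify how Vaxitop derives its P-values, so the Monte Carlo background, the uniform amino acid frequencies, the function names, and the toy matrix are all assumptions made for illustration.

```python
import random
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Pssm = List[Dict[str, float]]  # one {amino acid: score} column per position


def pssm_score(peptide: str, pssm: Pssm) -> float:
    """Sum the position-specific scores of each residue in the peptide."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length must match PSSM length")
    return sum(pssm[i][aa] for i, aa in enumerate(peptide))


def empirical_p_value(peptide: str, pssm: Pssm, n_background: int = 2000, seed: int = 0) -> float:
    """Fraction of random peptides scoring at least as high as `peptide`.

    Background peptides use uniform residue frequencies (an assumption);
    the +1 terms keep the estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = pssm_score(peptide, pssm)
    hits = 0
    for _ in range(n_background):
        background = "".join(rng.choice(AMINO_ACIDS) for _ in range(len(pssm)))
        if pssm_score(background, pssm) >= observed:
            hits += 1
    return (hits + 1) / (n_background + 1)


# Toy 3-position matrix: one favored residue per position, all others score 0.
toy_pssm: Pssm = [{aa: 0.0 for aa in AMINO_ACIDS} for _ in range(3)]
toy_pssm[0]["L"] = 2.0
toy_pssm[1]["M"] = 2.0
toy_pssm[2]["V"] = 2.0

for candidate in ("LMV", "AAA"):
    p = empirical_p_value(candidate, toy_pssm)
    print(candidate, pssm_score(candidate, toy_pssm), "positive" if p <= 0.05 else "negative")
```

A real MHC binding matrix would have one column per peptide position (for example nine for a 9-mer) with allele-specific scores; the P-value then allows the same 0.05 cutoff to be applied across alleles whose raw score scales differ.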
As a new Vaxign feature, the IEDB MHC Class I and II epitope prediction programs have been downloaded from the IEDB website (http://tools.immuneepitope.org/main/html/tcell_tools.html). For each queried protein, these IEDB tools can be used to dynamically predict immune epitopes. The predicted results can be directly compared with the results output by Vaxitop.
After all proteins from 52 herpesvirus genomes were pre-computed, the results were made available for automatic query and deep analysis using the Vaxign web interface (http://www.violinet.org/vaxign).
HSV: Herpes simplex virus
IEDB: Immune Epitope Database
MHC: Major histocompatibility complex
MenB: Serogroup B Neisseria meningitidis
NCBI: National Center for Biotechnology Information
PSSM: Position specific scoring matrices
TAP: Transporter associated with antigen presentation
FDA: The United States Food and Drug Administration
VIOLIN: Vaccine Investigation and Online Information Network
This manuscript was supported by the NIH-NIAID grant R01AI081062 to YH.
The funding for publication of this article is provided by the NIH-NIAID grant R01AI081062 to YH.
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 4, 2013: Special Issue on Computational Vaccinology. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S4
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.