Classification of viral zoonosis through receptor pattern analysis
© Bae and Son; licensee BioMed Central Ltd. 2011
Received: 7 September 2010
Accepted: 13 April 2011
Published: 13 April 2011
Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission from vertebrates to humans, but also through intermediate reservoirs or other environmental factors. Viruses can be categorized according to genome type (ssDNA, dsDNA, ssRNA and dsRNA viruses). Among them, the RNA viruses exhibit particularly high mutation rates and are especially problematic for this reason. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species. In this study, we sought to predict zoonotic propensity through the analysis of receptor characteristics. We hypothesized that the major barrier to interspecies virus transmission is that receptor sequences vary among species; in other words, that the specific amino acid sequence of the receptor determines the ability of the viral envelope protein to attach to the cell.
We analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species and used a statistical discriminant analysis to predict the likelihood of transmission among species.
This study is an attempt to predict zoonosis through simple computational analysis of receptor sequence differences. Our method may be useful in predicting the zoonotic potential of newly discovered viral strains.
Viral zoonosis, the transmission of a virus from its primary vertebrate reservoir species to humans, requires ubiquitous cellular proteins known as receptor proteins. Zoonosis can occur not only through direct transmission, but also through intermediate reservoirs or other environmental factors [2–4]. Zoonotic viruses can be categorized according to genome type; of the various classes of viruses, the RNA viruses exhibit the highest mutation rates. Most zoonotic viruses are RNA viruses that change their envelope proteins to facilitate binding to various receptors of host species [6, 7]. The high mutation rate of envelope proteins hinders the development of effective vaccines, as does the marked ability of the RNA viruses to infect host species and exploit host proteins for viral reproduction.
Lacking the ability to self-replicate, viruses must utilize the replication apparatus of their host cells. Viral infection of a cell begins with attachment of the virus to the cell surface [6, 10, 11]. During attachment to the cell membrane, the viral envelope protein (a structural protein) interacts with the host-cell receptor protein(s). In non-enveloped viruses, the capsid plays this role. The cell receptors that play a major role in viral attachment are predominantly membrane proteins of the immunoglobulin superfamily [13–15]. The identification of virus-binding cellular receptors accelerated rapidly in the late 1980s owing to developments in the use of monoclonal antibodies and molecular cloning techniques. The various receptors that have been found are surface matrix structures containing carbohydrate, lipid, and protein moieties [1, 16, 17]. In some cases, viral attachment also exploits co-receptors. For example, HIV, which uses the CD4 molecule as its receptor, uses the CXCR4 and CCR5 co-receptors to increase the efficiency of infection [1, 14, 18, 19]. Similarly, hepatitis C virus utilizes CD81 as a receptor and LDLR as a co-receptor.
Table 1. Similarity scores of host receptor pairs. (Table body not recovered. Surviving fragments: receptors NANA-synthase, Integrin alpha V, and Alpha (V) beta(3) integrin; host species Mustela putorius furo and Canis lupus familiaris; "#" marks the primary reservoir.)
We hypothesized that the major barrier to the transmission of viruses between species is the difference in cellular receptor sequences. In other words, the specific amino acid sequence of the receptor should be the major determinant of the ability of the viral envelope protein to attach to the cell. An ordinary sequence alignment reports overall sequence similarity, which we considered useful but insufficient: most receptors are membrane proteins, and membrane proteins consist of distinct hydrophobic and hydrophilic parts. Therefore, we analysed host-cell receptor sequences for their hydrophobicity/hydrophilicity characteristics. We then analysed these properties for similarities among receptors of different species to predict the likelihood of transmission across species, including humans. To the best of our knowledge, this study is the first attempt to predict zoonosis through a simple analysis of receptor sequence similarities and differences. This method may be useful in predicting the zoonotic potential of newly discovered viral strains.
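The idea of reducing a receptor sequence to its hydrophobic/hydrophilic pattern can be illustrated with a minimal Python sketch. The residue partition used here (residues with a positive Kyte-Doolittle hydropathy value counted as hydrophobic) and the names `HYDROPHOBIC`, `STANDARD`, and `hydropathy_pattern` are assumptions made for illustration; the partition actually used in the study is not stated in this text.

```python
# Assumed partition (not taken from the study): residues with a positive
# Kyte-Doolittle hydropathy value are treated as hydrophobic, and every
# other standard residue as hydrophilic.
HYDROPHOBIC = set("AVLIMFC")
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")


def hydropathy_pattern(sequence: str) -> str:
    """Map each residue to 'o' (hydrophobic), 'i' (hydrophilic), or '-' (gap/unknown)."""
    pattern = []
    for residue in sequence.upper():
        if residue in HYDROPHOBIC:
            pattern.append("o")
        elif residue in STANDARD:
            pattern.append("i")
        else:
            pattern.append("-")  # gaps, X, or other undetermined symbols
    return "".join(pattern)


if __name__ == "__main__":
    # Toy fragment for illustration, not a real receptor sequence.
    print(hydropathy_pattern("MKV-LAXD"))
```

Two receptors with similar overall identity can still differ in this pattern, which is the distinction the hydrophobic and hydrophilic scores are meant to capture.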
Results and Discussion
The pair-wise receptor sequence similarities (gSi,1, gSi,2, and gSi,3) between host-species pairs for each virus family are shown in Table 1. To allow meaningful comparisons, each virus has at least one infected host (the primary reservoir, designated as "#" in Table 1). As shown in Table 1, the similarity scores for the infected group (g = 1) were high, ranging from 0.790 to 0.988 for 1Si,1, from 0.841 to 0.996 for 1Si,2, and from 0.794 to 0.962 for 1Si,3. All pair-wise comparisons in group 1 (human vs. primary reservoir, primary reservoir vs. host, and human vs. host) yielded high similarity scores, indicating a high similarity among receptor sequences. The similarity scores were comparatively low in the non-infection group (g = 2), ranging from 0.092 to 0.440 for 2Si,1, from 0.108 to 0.432 for 2Si,2, and from 0.130 to 0.416 for 2Si,3. For group 2, both the primary host species and non-infected species are listed to illustrate the differences in similarity. In pair-wise comparisons, all the non-infection cases yielded low similarity values, i.e., the receptor sequences differed substantially from each other.
We assume that a low similarity in receptor sequences disfavors infection despite the existence of a common receptor. For example, enterovirus infects only Sus scrofa (pig); it does not infect Rattus norvegicus (rat) or Homo sapiens (human) because of the high transmission barrier. Similarly, for leukovirus, only Gallus gallus (chicken) is infected as a primary reservoir; because of the high transmission barrier, R. norvegicus and H. sapiens are not infected. These results imply that for non-infection cases, species barriers exist, and the propensity to cross the barrier is determined by the sequence similarity between the potential and primary host receptors.
Similarity scores for rabies virus were low between Canis lupus familiaris (domestic dog) and Bos taurus (domestic cow) (2Si,1 = 0.280, 2Si,2 = 0.373, and 2Si,3 = 0.366) and also between B. taurus and H. sapiens (2Si,1 = 0.267, 2Si,2 = 0.371, and 2Si,3 = 0.416) but were high between C. l. familiaris and H. sapiens (1Si,1 = 0.947, 1Si,2 = 0.985, and 1Si,3 = 0.962). Here, C. l. familiaris is the primary reservoir, and the results suggest that transmission of the disease to H. sapiens is possible because of the high human/dog receptor similarity. Thus, for particular viruses, transmission of disease may be species-selective, although common receptors exist among species. Furthermore, infection specificity may be determined by the species barrier, which results from receptor differences.
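As a quick consistency check on the numbers quoted above, the rabies scores can be compared against the score ranges reported for the infection and non-infection groups. The sketch below uses only values stated in the text; the names `REPORTED_RANGES`, `RABIES_SCORES`, and `groups_containing` are hypothetical, and this simple range test is not the discriminant analysis used in the study.

```python
# Score ranges reported in the text for the infection (1) and
# non-infection (2) groups, as (min, max) for S1, S2, S3.
REPORTED_RANGES = {
    1: [(0.790, 0.988), (0.841, 0.996), (0.794, 0.962)],
    2: [(0.092, 0.440), (0.108, 0.432), (0.130, 0.416)],
}

# Rabies virus scores quoted in the text, as (S1, S2, S3) per species pair.
RABIES_SCORES = {
    ("C. l. familiaris", "B. taurus"): (0.280, 0.373, 0.366),
    ("B. taurus", "H. sapiens"): (0.267, 0.371, 0.416),
    ("C. l. familiaris", "H. sapiens"): (0.947, 0.985, 0.962),
}


def groups_containing(scores):
    """Return the groups whose reported ranges contain all three scores."""
    return [
        group
        for group, ranges in REPORTED_RANGES.items()
        if all(low <= s <= high for s, (low, high) in zip(scores, ranges))
    ]


if __name__ == "__main__":
    for pair, scores in RABIES_SCORES.items():
        print(pair, scores, groups_containing(scores))
```

The dog/human pair falls inside the group 1 ranges and the two pairs involving B. taurus fall inside the group 2 ranges, with a wide gap between the two sets of ranges.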
Our analysis revealed marked differences in receptor similarity between infection and non-infection cases. The similarity values and the experimentally determined group categories were fed into a statistical discriminant analysis to predict infection (or zoonosis, in the case of human infection). As described in the Materials and Methods section, the values D_i^2 (i = 1, 2, 3) were calculated from the data in Table 1 to yield the results of the discriminant analysis.
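Table 2. Virus group prediction. (Table body not recovered; per the text, it lists the similarity scores S1, S2, and S3 together with the predicted group (G).)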
In Table 2, the hydrophilic similarity scores (S1) are less consistent with the predicted groups (G) than the hydrophobic scores (S2) are. This suggests that the hydrophobic characteristics of the receptor sequence might be the key contributor to the prediction. However, this observation should be interpreted with care because the variables (S1, S2, S3) are complementary in the statistical process.
Our analysis of viral receptor sequences shows that the likelihood of viral infection correlates with the similarity in sequence of the primary and host receptors. This result is not surprising, because viral infection also inversely correlates with the inhibition of viral coat protein binding to the receptors. Importantly, we were able to establish this relationship at the amino acid sequence level, allowing for the prediction of possible human infection at an early stage of a viral outbreak, before the structures of viral coat proteins and receptors are known. Therefore, once the receptor sequences of the primary reservoir and the potential host are known, the likelihood of viral infection can be predicted, provided the virus does not mutate too abruptly. Our simple approach needs further refinement because it largely ignores the complex processes underlying viral host tropism. For example, the host immune response could be included for better prediction of zoonosis. Although further refinements of our methods and analyses of larger databases are needed, this simple conceptual approach may be useful, even now, as a basic tool for classifying the zoonotic potential of new viral species.
Viral infection requires the insertion of viral genes into host cells. Such a process begins with the binding of coat proteins to host receptors and, in some cases, co-receptors. Ten RNA viruses (seven zoonotic viruses and three non-zoonotic viruses) were investigated. Viruses that use co-receptors were excluded from the study. Receptor sequence data for each virus were collected from the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/), and the research literature was examined to determine the specific species tropism of each virus [http://www.ictvonline.org/]. The viruses, host species, receptors, receptor sequences, and infection information for each host are shown in Table 1. We selected viruses that each represent a different family, with different primary reservoirs. Viruses with unknown or poorly defined host receptors (particularly human receptors) were excluded from the study. Orthologues of the human receptor sequences for the non-zoonotic viruses were collected to allow for clear comparison with zoonosis cases.
Discriminant analysis for data analysis
Here, Ntot is the total number of amino acids in one sequence string; ntot is the total number of matched amino acids in the sequence; Nphi and Npho are the numbers of hydrophilic and hydrophobic amino acids in the sequence, respectively; Nothers is the number of deleted amino acids (gaps/insertions in the sequence) plus the number of amino acids with undetermined properties; nphi and npho are the numbers of matched hydrophilic and hydrophobic amino acids, respectively; and gSi,1 is the similarity score for hydrophilic residues of the ith row of infection group g. The interspecies infection information was classified into three infection states: group 1 (g = 1) represents infection; group 2 (g = 2), non-infection; and group 3 (g = 3), near-infection. By definition, if a group 1 species pair includes humans, then the infection is zoonotic. Grouping decisions were made on the basis of experimental and epidemiological studies reported in the literature [4, 30–33].
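The quantities defined above suggest how the three scores could be computed from a pair of aligned receptor sequences. The sketch below is one plausible reading, not the study's implementation: it assumes each score is the fraction of matched residues within its class (S1 = nphi / Nphi, S2 = npho / Npho, S3 = ntot / Ntot), that counts are taken over the first (reference) sequence, that a "match" means an identical residue in the same alignment column, and that hydrophobic residues are those with a positive Kyte-Doolittle value. The function name `similarity_scores` is hypothetical.

```python
HYDROPHOBIC = set("AVLIMFC")  # assumed partition, not taken from the study
STANDARD = set("ACDEFGHIKLMNPQRSTVWY")
HYDROPHILIC = STANDARD - HYDROPHOBIC


def _ratio(matched: int, total: int) -> float:
    """Fraction of matched residues; 0.0 when the class is empty."""
    return matched / total if total else 0.0


def similarity_scores(reference: str, other: str):
    """Return (S1, S2, S3) for two aligned, equal-length sequences.

    Assumed definitions: S1 = n_phi / N_phi, S2 = n_pho / N_pho,
    S3 = n_tot / N_tot, with N_tot = N_phi + N_pho + N_others.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    N_phi = N_pho = N_others = n_phi = n_pho = 0
    for a, b in zip(reference.upper(), other.upper()):
        if a in HYDROPHILIC:
            N_phi += 1
            n_phi += int(a == b)
        elif a in HYDROPHOBIC:
            N_pho += 1
            n_pho += int(a == b)
        else:
            N_others += 1  # gaps and residues with undetermined properties
    N_tot = N_phi + N_pho + N_others
    n_tot = n_phi + n_pho
    return _ratio(n_phi, N_phi), _ratio(n_pho, N_pho), _ratio(n_tot, N_tot)


if __name__ == "__main__":
    # Toy aligned fragments for illustration, not real receptor sequences.
    print(similarity_scores("MKVLD-", "MRVLDA"))
```

In the toy pair, one of two hydrophilic positions and all three hydrophobic positions match, and the gap column counts toward Ntot only.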
The averages of the three similarity scores (S1, S2, and S3) for group 2 and for group 3 were calculated similarly.
We next found the inverse matrix I, where I = P^(-1). Because there were three groups in our study, we predicted the likelihood of infection for a virus of unknown infection condition by calculating the Mahalanobis distance to each group (generally D^2 = d^T × C^(-1) × d, where d is the vector of differences from the group mean and C is the covariance matrix).
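The three distances referred to here presumably take the standard Mahalanobis form. The following is a reconstruction under that assumption rather than the original typeset equations; the symbols \bar{S}_{g,k} (the group-g average of score k) and \mathbf{d}_g are notation introduced here, and I is the inverse matrix defined in the text.

```latex
\[
D_g^2 = \mathbf{d}_g^{\mathsf{T}}\, I\, \mathbf{d}_g,
\qquad
\mathbf{d}_g =
\begin{pmatrix}
S_1 - \bar{S}_{g,1} \\
S_2 - \bar{S}_{g,2} \\
S_3 - \bar{S}_{g,3}
\end{pmatrix},
\qquad g = 1, 2, 3,
\]
```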
where S1, S2, and S3 are the input variables; here, they are the similarity scores of a virus whose infection group is unknown.
For example, if D_1^2 is the minimum among the three values from the above set of three equations, then G = 1; i.e., "group 1" is the group classification. To automate the mathematical process described above, we developed a Java computer program named ZOO. To evaluate the accuracy of our method and software, we analysed a test data set (described in the Results and Discussion section).
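The classification rule described above (compute one squared distance per group, then take the smallest) can be sketched in a few lines of Python. This is not the ZOO program: it assumes that P is the pooled within-group covariance matrix of the three scores, and the names `pooled_covariance`, `invert`, `classify`, and `TOY_GROUPS` are hypothetical. The training scores are made up for illustration and are not data from the study.

```python
# Made-up (S1, S2, S3) training rows per group; NOT data from the study.
TOY_GROUPS = {
    1: [(0.95, 0.98, 0.96), (0.80, 0.85, 0.82), (0.90, 0.99, 0.93), (0.88, 0.90, 0.95)],
    2: [(0.28, 0.37, 0.36), (0.10, 0.12, 0.15), (0.40, 0.30, 0.41), (0.20, 0.43, 0.25)],
    3: [(0.60, 0.65, 0.62), (0.50, 0.58, 0.55), (0.70, 0.62, 0.68), (0.55, 0.70, 0.57)],
}


def mean_vector(rows):
    """Column-wise mean of a list of equal-length score rows."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups):
    """Pooled within-group covariance matrix (assumed to play the role of P)."""
    dim = len(next(iter(groups.values()))[0])
    pooled = [[0.0] * dim for _ in range(dim)]
    dof = 0
    for rows in groups.values():
        mu = mean_vector(rows)
        for row in rows:
            dev = [x - m for x, m in zip(row, mu)]
            for i in range(dim):
                for j in range(dim):
                    pooled[i][j] += dev[i] * dev[j]
        dof += len(rows) - 1
    return [[v / dof for v in line] for line in pooled]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def classify(groups, scores):
    """Return (G, distances), where G is the group with the smallest D^2."""
    inverse = invert(pooled_covariance(groups))
    distances = {}
    for g, rows in groups.items():
        d = [s - m for s, m in zip(scores, mean_vector(rows))]
        distances[g] = sum(
            d[i] * inverse[i][j] * d[j] for i in range(len(d)) for j in range(len(d))
        )
    return min(distances, key=distances.get), distances


if __name__ == "__main__":
    group, distances = classify(TOY_GROUPS, (0.89, 0.93, 0.92))
    print("predicted group G =", group)
    print(distances)
```

A shared covariance matrix keeps the decision rule linear in the scores, which matches the single inverse matrix described in the text; per-group covariance matrices would be an alternative design.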
We acknowledge the invaluable contribution of the researchers who have made their data publicly available. We thank K.T. No (Yonsei University) for his support. This work was partly supported by the Brain Korea 21 project.
- Baranowski E, Ruiz-Jarabo CM, Domingo E: Evolution of Cell Recognition by Viruses. Science 2001, 292: 1102–1105. 10.1126/science.1058613
- Schwabe CW: Veterinary medicine and human health. Baltimore, Williams & Wilkins; 1984.
- Webber R: Communicable disease epidemiology and control. Am J Epidemiol 1998, 147: 791–792.
- Hugh-Jones ME, Hubbert WT, Hagstad HV: Zoonoses-recognition, control and prevention. Iowa: Iowa State University Press; 2008.
- Schneider-Schaulies J: Cellular receptors for viruses: links to tropism and pathogenesis. J Gen Virol 2000, 81: 1413–1429.
- Dimmock NJ: Initial Stages in infection with Animal viruses. J Gen Virol 1982, 59: 1–22. 10.1099/0022-1317-59-1-1
- Wiley DC, Wilson IA, Skehel JJ: Structural identification of the antibody-binding sites of Hong Kong influenza haemagglutinin and their involvement in antigenic variation. Nature 1981, 289: 373–378. 10.1038/289373a0
- Duffy S, Shackelton LA, Holmes EC: Rates of evolutionary change in viruses: patterns and determinants. Nat Rev Genet 2008, 1–10.
- Horsfall FL Jr, Hardy PH Jr, Davenport FM: The significance of combinations between viruses and host cells. Bull N Y Acad 1948, 24: 470–475.
- Dales S: Early Events in Cell-Animal Virus Interactions. Bacteriol Rev 1973, 37: 103–135.
- Lentz TL: The recognition event between virus and host cell receptor: a target for antiviral agents. J Gen Virol 1990, 71: 751–766. 10.1099/0022-1317-71-4-751
- Vrublevskaya VV, Korney AN, Smirnow SV, Morenkov OS: Cell-binding properties of glycoprotein B of Aujeszky's disease virus. Virus Res 2002, 86: 7–19. 10.1016/S0168-1702(02)00032-1
- Myszka DG, Sweet RW, Hensley P, Brigham-Burke M, Kwong PD, Hendrickson WA, Wyatt R, Sodroski J, Doyle ML: Energetics of the HIV gp120-CD4 binding reaction. PNAS 2000, 97: 9026–9031. 10.1073/pnas.97.16.9026
- Wu L, Gerard NP, Wyatt R, Choe H, Parolin C, Ruffing N, Borsetti A, Cardoso AA, Desjardin E, Newman W, Gerard C, Sodroski J: CD4-induced interaction of primary HIV-1 gp120 glycoproteins with the chemokine receptor CCR-5. Nature 1996, 384: 179–183. 10.1038/384179a0
- Hyypiä T: Virus Host Cell Receptors. Encyclopedia of life science 2006, 1–8.
- Wang JH: Protein recognition by cell surface receptors: physiological receptors versus virus interactions. Trends Biochem Sci 2002, 27: 122–126. 10.1016/S0968-0004(01)02038-2
- Wimmer E: Cellular receptors for animal viruses. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY; 1994.
- Haywood AM: Virus Receptors: Binding, Adhesion Strengthening, and Changes in Viral Structure. J Virol 1994, 68: 1–5.
- Reeves JD, Gallo SA, Ahmad N, Miamidian JL, Harvey PE, Sharron M, Pohlmann S, Sfakianos JN, Derdeyn CA, Blumenthal R, Hunter E, Doms RW: Sensitivity of HIV-1 to entry inhibitors correlates with envelope/coreceptor affinity, receptor density, and fusion kinetics. Proc Natl Acad Sci 2002, 99: 16249–16254. 10.1073/pnas.252469399
- Pileri P, Uematsu Y, Campagnoli S, Galli G, Falugi F, Petracca R, Weiner AJ, Houghton M, Rosa D, Grandi G, Abrignani S: Binding of Hepatitis C Virus to CD81. Science 1998, 282: 938–941. 10.1126/science.282.5390.938
- Gareth MJ, Andrew R, Pybus OG, Holmes EC: Rates of Molecular Evolution in RNA Viruses: A Quantitative Phylogenetic Analysis. J Mol Evol 2002, 54: 156–165. 10.1007/s00239-001-0064-3
- Faure E: Could FIV zoonosis responsible of the breakdown of the pathocenosis which has reduced the European CCR5-Delta32 allele frequencies? Virol J 2008, 5: 119. 10.1186/1743-422X-5-119
- VandeWoude S, Apeterei C: Going Wild: Lessons from Naturally Occurring T-Lymphotropic Lentiviruses. Clin Microbiol Rev 2006, 19: 728–762. 10.1128/CMR.00009-06
- Berger EA, Murphy PM, Farber JM: Chemokine receptors as HIV-1 coreceptors; roles in viral entry, tropism, and disease. Annu Rev Immunol 1999, 17: 657–700. 10.1146/annurev.immunol.17.1.657
- Fauguet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA: 8th Reports of the international committee on Taxonomy of viruses. Academic Press;
- Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 1997, 25: 4876–4882. 10.1093/nar/25.24.4876
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res 2003, 31: 3497–3500. 10.1093/nar/gkg500
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. 10.1093/nar/25.17.3389
- Löytynoja A, Goldman N: An algorithm for progressive multiple alignment of sequences with insertions. PNAS 2005, 102: 10557–10562.
- Greger M: The Human/Animal Interface: emergence and resurgence of zoonotic infectious diseases. Crit Rev Microbiol 2007, 33: 243–299. 10.1080/10408410701647594
- Ryou WS: Virology. Life Science publishing; 2007.
- Woolhouse MEJ, Gowtage-Sequeria S: Host Range and Emerging and Reemerging Pathogens. Emerg Infect Dis 2005, 11: 1842–1847.
- Baltimore D: Expression of Animal Virus Genomes. Bacteriol Rev 1971, 35: 235–241.
- Fisher RA: The use of multiple measurements in taxonomic problems. Ann Eugenics 2 1936, 179–188. 10.1111/j.1469-1809.1936.tb02137.x
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.