The oligodeoxynucleotide sequences corresponding to never-expressed peptide motifs are mainly located in the non-coding strand
© Capone et al; licensee BioMed Central Ltd. 2010
Received: 3 June 2010
Accepted: 20 July 2010
Published: 20 July 2010
We study the usage of specific peptide platforms in protein composition. Using the pentapeptide as a unit of length, we find that in the universal proteome many pentapeptides are heavily repeated (even thousands of times), whereas some are quite rare, and a small number do not appear at all. To understand the physico-chemical-biological basis underlying peptide usage at the proteomic level, in this study we analyse the energetic costs for the synthesis of rare and never-expressed versus frequent pentapeptides. In addition, we explore residue bulkiness, hydrophobicity, and codon number as factors able to modulate specific peptide frequencies. Then, the possible influence of amino acid composition is investigated in zero- and high-frequency pentapeptide sets by analysing the frequencies of the corresponding inverse-sequence pentapeptides. As a final step, we analyse the pentadecamer oligodeoxynucleotide sequences corresponding to the never-expressed pentapeptides.
We find that only DNA context-dependent constraints (such as oligodeoxynucleotide sequence location in the minus strand, introns, pseudogenes, frameshifts, etc.) provide a coherent mechanistic platform to explain the occurrence of never-expressed versus frequent pentapeptides in the protein world.
This study is of importance in cell biology. Indeed, the rarity (or lack of expression) of specific 5-mer peptide modules implies the rarity (or lack of expression) of the corresponding n-mer peptide sequences (with n > 5), thus possibly modulating protein compositional trends. Moreover, the data might further our understanding of the role exerted by rare pentapeptide modules as critical biological effectors in protein-protein interactions.
Proteins comprise subsets of all plausible amino acid sequences, i.e. peptide motifs that occur in different quantitative percentages and with different qualitative significance at the proteomic level. To understand the correspondence between structure and function, we must understand the rules dictating the modular arrangement of proteins. We chose the pentapeptide as a basic structural/functional unit to analyse the compositional distribution of peptide sequences. Indeed, pentapeptides appear to be minimal biological units exerting a central role in fundamental cellular processes such as inhibition/stimulation of cell growth, hormone activity, regulation of transcript expression, enzyme activity, and immune recognition. Following a robust set of experimental protein analyses [2–9], we determined that, as a rule, amino acid stretches with low/no proteomic redundancy alternate with portions of high proteomic redundancy along protein primary structures, independently of the protein length [3, 4], of whether the protein is derived from microbial or mammalian organisms [3–9], and of the proteome under analysis [5–9]. Before any evolutionary/functional/physio-pathological considerations, the data prompt a fundamental question: what makes one pentapeptide occur more frequently than another in the protein world? In this paper, we undertake a large-scale analysis of the physico-(bio)chemical factors that theoretically might account for the modular peptide composition of proteins, and examine a total of 20991 pentapeptides, divided into eleven sets characterized by frequencies ranging from zero to 2500.
The complete UniRef100, UniRef90 and UniRef50 databases (http://www.uniprot.org/downloads) were downloaded as single proteomes and analysed for internal peptide redundancy using 5-mers sequentially overlapping by four residues. The scans were performed using standard UNIX/LINUX commands and custom programs written in Perl.
The proteins were manipulated and analysed as follows. All the protein sequences were decomposed in silico into a set of 5-mers (including all duplicates). Any 5-mers containing ambiguous amino acids (i.e., denoted by the letters B, Z, or X, which respectively represent ambiguity between N and D, ambiguity between Q and E, and an unknown amino acid) or non-standard amino acid codes (i.e., -, U, *, O, denoting gaps, selenocysteine residues, stop codons, etc.) were eliminated. Since there are only 3200000 (20^5) possible 5-mers, a simple linear scan was used to determine the counts of occurrences and the 5-mers that do not occur. That is, for each pentamer, the UniRef100 (or UniRef90 or UniRef50) proteome was searched for instances of that pentamer. Any such occurrence was termed a match. The number of matches defines the proteomic frequency of each pentapeptide.
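The decomposition and counting step can be illustrated with a minimal Python sketch. It is not the Perl code used in the study: the function names `count_pentamers` and `absent_pentamers` and the toy sequences are hypothetical, and the sketch assumes the proteome is available as a list of plain one-letter sequences.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids; any other letter (B, Z, X, U, O, *, -)
# disqualifies the 5-mer that contains it.
STANDARD = "ACDEFGHIKLMNPQRSTVWY"


def count_pentamers(proteins, k=5):
    """Count every overlapping k-mer made only of standard residues."""
    allowed = set(STANDARD)
    counts = Counter()
    for seq in proteins:
        # Windows overlap by k - 1 residues (four residues for 5-mers).
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= allowed:
                counts[kmer] += 1
    return counts


def absent_pentamers(counts, k=5):
    """Lazily yield the k-mers with zero matches (20**5 candidates for k=5)."""
    for letters in product(STANDARD, repeat=k):
        kmer = "".join(letters)
        if kmer not in counts:
            yield kmer


# Hypothetical toy "proteome": the second sequence contains an ambiguous X.
toy = ["MKTAYIAKQR", "AYIAKXQRMK"]
counts = count_pentamers(toy)
print(counts["AYIAK"])       # 2: one match in each sequence
print(sum(counts.values()))  # 7: the five windows containing X are dropped
```

A dictionary-based count such as this visits each window once; for a database the size of UniRef100 the same idea would need streaming input rather than an in-memory list.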
Eleven peptide sets with zero, low, medium and high frequencies (i.e., from zero to 2500 matches) were selected from UniRef100 (hereafter called the "universal proteome") for physico-(bio)chemical analyses. Specifically, the frequencies defining the eleven sets were: 0, 1, 4, 5, 50, 100, 341, 500, 1000, 1368 and 2500. The pentapeptide sets were screened by starting with the UniRef100 database and then using the Perfect Peptide Match program at the Protein Information Resource (PIR) website (http://pir.georgetown.edu/pirwww) to eliminate repeated sequences and fragments. The protein entries containing the 5-mer under analysis were further filtered using the UniProtKB resources (http://www.uniprot.org) to eliminate obsolete entries.
Analysis of the energetics was carried out for each pentapeptide using Spartan'06 software (from Wavefunction Inc, Irvine, CA) and applying the semi-empirical method. The peptide bulkiness degree was measured using the ProtScale program available at http://www.expasy.ch/tools. The hydrophobicity level was determined using the scale described by Takano and Yutani. The codon number per pentapeptide was calculated by summing the number of codons of each amino acid forming the 5-mer. One-way analysis of variance (ANOVA, F-test) was used to derive a p-value indicating whether the means of the measurements for the different sets were all equal.
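As an illustration of the codon number calculation only, the following Python sketch sums the per-residue codon degeneracy of the standard genetic code. The table `CODONS_PER_RESIDUE` and the function `codon_number` are names introduced here for illustration and are not taken from the original programs.

```python
# Number of sense codons per amino acid in the standard genetic code
# (the 20 values add up to the 61 sense codons).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def codon_number(peptide):
    """Sum the codon counts of the residues forming the peptide."""
    return sum(CODONS_PER_RESIDUE[aa] for aa in peptide)


print(codon_number("MWHMC"))  # 7  (1 + 1 + 2 + 1 + 2)
print(codon_number("LSPAG"))  # 24 (6 + 6 + 4 + 4 + 4)
```

Here MWHMC is one of the never-expressed pentapeptides discussed in this paper, whereas LSPAG is an arbitrary string of multicodonic residues chosen only for contrast.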
To analyse DNA constraints, we analysed the oligodeoxynucleotide coding sequences corresponding to the pentameric amino acid sequences. The Sequence Manipulation Suite Reverse Translate program (http://www.bioinformatics.org/sms2/) was used to generate a DNA sequence representing the most likely, optimized coding sequence. Additionally, Reverse Translate a Protein (http://www.vivo.colostate.edu/molkit/rtranslate/index.html), a program that uses the standard genetic code and does not consider differences in codon usage, was used in order to obtain all the possible degenerate oligodeoxynucleotide coding frames for each pentapeptide under analysis.
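The enumeration of all degenerate coding frames can be sketched in a few lines of Python. This is an illustrative stand-in for the web tools cited above, not their implementation; it assumes the standard genetic code, ignores codon usage, and the function name `coding_sequences` is hypothetical.

```python
from itertools import product

# Standard genetic code written in the conventional TCAG order:
# the i-th codon produced by product("TCAG", repeat=3) encodes AMINO[i].
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each amino acid (and "*" for stop) to the list of its codons.
CODONS = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO):
    CODONS.setdefault(aa, []).append(codon)


def coding_sequences(peptide):
    """Return every DNA sequence that translates to the given peptide."""
    return ["".join(c) for c in product(*(CODONS[aa] for aa in peptide))]


seqs = coding_sequences("MWHMC")
print(len(seqs))                  # 4 (1 * 1 * 2 * 1 * 2)
print("ATGTGGCATATGTGC" in seqs)  # True
```

The number of coding 15-mers equals the product of the per-residue codon counts, so pentapeptides rich in Met and Trp have very few coding frames to search for.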
The pentadecameric oligodeoxynucleotide sequences so obtained were subjected to nucleotide-nucleotide BLAST (blastn) analysis at NCBI (http://blast.ncbi.nlm.nih.gov) to find and localize regions of 100% similarity (i.e. with no gaps allowed) in the entire nucleotide collection (nr/nt), which comprises genomic and transcript sequences.
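What an exact, ungapped match on either strand means can be shown with a small Python sketch. It is not a substitute for blastn and does not reproduce its algorithm; the function `find_exact` and the toy sequence `genome` are hypothetical, and the sketch assumes an unambiguous A/C/G/T sequence given in plus-strand orientation.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an unambiguous DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_exact(oligo, plus_strand):
    """Return (strand, 0-based start on the plus strand) for each exact match.

    A hit on the minus strand appears in the plus-strand text as the
    reverse complement of the oligodeoxynucleotide.
    """
    hits = []
    for strand, query in (("plus", oligo), ("minus", reverse_complement(oligo))):
        start = plus_strand.find(query)
        while start != -1:
            hits.append((strand, start))
            start = plus_strand.find(query, start + 1)
    return hits


oligo = "ATGTGGCATATGTGC"  # one coding frame for MWHMC
# Hypothetical toy sequence carrying the oligo on the minus strand only.
genome = "TTT" + reverse_complement(oligo) + "AAA"
print(find_exact(oligo, genome))  # [('minus', 3)]
```

Whether a hit reported this way lies in a coding region, an intron, or a pseudogene still has to be read from the annotation of the matching database entry.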
Pentapeptide redundancy and ΔG°
The biosynthesis of the peptide bond from amino acids involves an increase in free energy and must therefore depend on energy-yielding reactions. We reasoned that, if a substantial fraction of energy is needed to convert starting amino acids into peptides, then the pentapeptide composition of proteins expressed in the proteomes should be biased toward less energetically costly pentapeptides. Theoretically, the extent to which pentapeptide composition is biased to reduce metabolic costs should positively correlate with the pentapeptide redundancy at the proteomic level.
Therefore, we reasoned that the same ΔG° variability would apply even more strongly to longer peptide units. Based on this rationale, we calculated the heats of formation for pentapeptide sets with different frequencies in the universal proteome (i.e., from zero to 2500 occurrences). As a universal proteome database, we used UniRef100, which represents one of the most comprehensive non-redundant protein sequence datasets available ([16–18], see also http://www.ebi.ac.uk/uniref/). To control for existing bias and redundancies in the UniRef100 database, the protein entries containing the 5-mers under analysis were filtered for repeated sequences, fragments, and obsolete entries.
The relationship between pentapeptide redundancy and hydrophobicity, bulkiness, and codon number
Pentapeptide redundancy and amino acid composition
Figures 4 and 5 indicate almost no relationship between pentapeptide frequencies and physico-chemical factors such as hydrophobicity and bulkiness. On the other hand, the analyses reported in Figure 3 suggest that rare pentapeptides are formed primarily by Trp, Tyr, and Met, i.e. by essential low-concentration amino acids endowed with high values of hydrophobicity and residue bulkiness. This raises the question: might amino acid frequencies affect pentapeptide frequency?
Taken together, these data indicate that amino acid composition appears to modulate to some extent, but does not dictate, the pentapeptide composition of the universal proteome.
Analysing the never-expressed pentapeptides at the DNA level
After obtaining the results above, we postulated that the lack of occurrence of the pentapeptides never found in the universal proteome could be ascribed to a lack of the corresponding pentadecameric oligodeoxynucleotides in the DNA coding sequence. Therefore, a search was conducted for occurrences of the oligodeoxynucleotide sequences coding for the pentapeptides never expressed in the universal proteome using the standard nucleotide-nucleotide BLAST (blastn) program as described under Methods.
The oligodeoxynucleotide sequences corresponding to never-expressed peptide motifs are mainly located in the non-coding strand
Organisms hosting the ATGTGGCATATGTGC oligodeoxynucleotide coding for the MWHMC pentapeptide:
Location of the oligodeoxynucleotide: DNA minus strand
Alkaliphilus metalliredigens (1)
Anoxybacillus flavithermus (1)
Chlorobium phaeovibrioides (1)
Ciona intestinalis (1)
Cryptococcus bacillisporus (1)
Danio rerio (1)
Dictyostelium discoideum (1)
Drosophila erecta (1)
Drosophila melanogaster (1)
Drosophila sechellia (1)
Drosophila simulans (1)
Drosophila yakuba (2)
Gorilla gorilla gorilla (1)
Homo sapiens (4)
Macaca mulatta (1)
Mus musculus (2)
Oryza sativa Japonica (3)
Streptococcus thermophilus (2)
Sus scrofa (1)
Teredinibacter turnerae (1)
Thalassiosira pseudonana (1)
Organisms hosting the TGGTTTCAGTGCATG oligodeoxynucleotide coding for the WFQCM pentapeptide:
Location of the oligodeoxynucleotide: DNA minus strand
Bacillus pumilus (1)
Brassica napus (1)
Chitinophaga pineni (1)
Cynops pyrrhogaster (1)
Danio rerio (3)
Felis catus (1)
Gasterosteus aculeatus (1)
Haemophilus ducreyi (1)
Homo sapiens (7)
Kluyveromyces lactis (2)
Macaca mulatta (1)
Methanosarcina barkeri (1)
Mus musculus (5)
Nicotiana plumbaginifolia (1)
Pan troglodytes (1)
Penicillium chrysogenum (2)
Ricinus communis (1)
Vitis vinifera (2)
Xenopus tropicalis (1)
From this we conclude that DNA context-dependent constraints (e.g., oligodeoxynucleotide sequence location in the minus strand, introns, splicing-dependent frameshifts, etc.) are the main factors limiting/preventing the expression of the corresponding amino acid sequences in the universal proteome.
The factors acting on the amino acid composition of proteins have been thoroughly investigated with particular attention to the habitat of the organisms (e.g., growth temperature and salinity) [19–22], sub-cellular localization (e.g., cytosolic, membrane or nuclear), physical properties (e.g., mass and charge), translational constraints, and the metabolic costs of amino acid biosynthesis. In contrast, less attention has been dedicated to the structural and functional constraints acting on the peptide composition of proteins. Clearly, the empirical distribution of pentapeptide frequencies has, one way or another, an impact upon protein expression as well as on function/structure, and it is important to understand and define the physico-chemical-biological factors that correlate with pentapeptide frequencies in the protein world.
We have already reported preliminary data showing that certain short sequences of amino acids (i.e. pentapeptides) are very common, whereas some are quite rare, and a small number do not appear at all in the collection of all known proteins. Here we report the results of a comprehensive study of the influence of physico-(bio)chemical parameters (energetic cost, bulkiness, hydrophobicity and amino acid codon number), amino acid composition, and DNA constraints on pentapeptide expression in the protein world.
First, we observe a definite (although not determining) role of, in descending order of importance, amino acid codon number, hydrophobicity and bulkiness in modulating pentapeptide frequency in the universal proteome. On the other hand, we find that ΔG° has little influence in defining the pentapeptide composition of the universal proteome. This result is relevant and deserves to be emphasized. We explored in detail whether variations in the energetic cost of the peptide bond might explain the extent of the pentapeptide compositional bias in the universal proteome based on the following rationale. The data reported for protein amino acid composition indicate increases in the abundance of less energetically costly amino acids in highly expressed proteins. Accordingly, and further supported by the correlation existing between dipeptide redundancy (Figure 1A) and ΔG° level (Figure 1B), we expected that energetically costly pentapeptides would be rare, whereas more frequent pentapeptides would have a low energetic cost. In conflict with this theoretical expectation, the experimental data obtained in this study and reported in Figures 3 and 4 clearly demonstrate that there is no such correlation at the pentapeptide level. Surprisingly, we found that high energies of formation are associated with moderately or highly frequent pentapeptides.
A second unexpected finding is that amino acid composition is a marginal factor in determining pentapeptide rarity: although enriched in hydropathic, rare amino acids such as Trp, Tyr, and Met, the inverse sequences of never-expressed pentapeptides are indeed expressed in the universal proteome.
Third, and as a logical consequence of the previous two points, we show that the constraints acting on pentapeptide expression mainly lie at the nucleotide sequence level. Once we excluded possible limitations due to Trp, Met, and Tyr rarity (see Figure 6), we had to suppose that other constraints are active in defining the proteomic pentapeptide frequencies. Indeed, as demonstrated in Table 1 (see also Additional file 1, Table S1), we found that never-expressed pentapeptides correspond to untranslatable, frameshifted or mistranslated oligodeoxynucleotide sequences. In other words, allocation of the coding oligodeoxynucleotide in pseudogenes/minus strand/untranslated regions/introns as well as the shift of the reading frame are the main factors determining the distribution of pentapeptide frequencies throughout the protein world.
The results above are of importance in both the biochemical and the functional cellular context. Indeed, as already described [1, 3–10, 29], it seems that rare pentapeptides are basic to control functions, whereas frequent modules are possibly preferentially involved in structure definition. In this regard, it is worth noting that multicodonic Leu, Ser, Pro, Ala, and Gly residues are the most common ones in high-frequency, low-complexity peptides whose function, in many cases, is the spacing of structural/functional domains. Conversely, the mono/di-codonic amino acids Asn, Cys, Tyr, Met, Phe, and Trp are relatively rare in highly frequent, low-complexity peptides and characterize functionally critical proteins such as proto-oncogenes. In this case, specific usage of mono/di-codonic amino acids would allow the control of the proto-oncogene product at the transcriptional level. Moreover, during the last decade one of us proposed and demonstrated the association between rare pentapeptides and immunogenic potential [32–39]. Hence, understanding the mechanisms by which peptide platforms are used in the protein world is not only of biochemical interest but also of practical importance for biotechnology, e.g. vaccines, expression vectors and peptide therapy approaches, with the relevant advantage of effectiveness without adverse side-effects [42–45].
Partially funded by the Natural Sciences and Engineering Research Council of Canada (BT and AK) and the Ministry of University, Italy (DK). GN and CF are PhD students of the course "Analytical Morphometry and Models of Molecular Medicine".
- Lucchese G, Stufano A, Trost B, Kusalik A, Kanduc D: Peptidology: short amino acid modules in cell biology and immunology. Amino Acids 2007, 33: 703–707. doi:10.1007/s00726-006-0458-z
- Kanduc D, Capone G, Delfino VP, Losa G: The fractal dimension of protein information. Adv Stud Biol 2010, 2: 53–62.
- Kanduc D, Lucchese A, Mittelman A: Individuation of monoclonal anti-HPV16 E7 antibody linear peptide epitope by computational biology. Peptides 2001, 22: 1981–1985. doi:10.1016/S0196-9781(01)00539-3
- Mittelman A, Tiwari R, Lucchese G, Willers J, Dummer R, Kanduc D: Identification of monoclonal anti-HMW-MAA antibody linear peptide epitope by proteomic database mining. J Invest Dermatol 2004, 123: 670–675. doi:10.1111/j.0022-202X.2004.23417.x
- Mittelman A, Lucchese A, Sinha AA, Kanduc D: Monoclonal and polyclonal humoral immune response to EC HER-2/NEU peptides with low similarity to the host's proteome. Int J Cancer 2002, 98: 741–747. doi:10.1002/ijc.10259
- Lucchese A, Mittelman A, Lin MS, Kanduc D, Sinha AA: Epitope definition by proteomic similarity analysis: identification of the linear determinant of the anti-Dsg3 MAb 5H10. J Transl Med 2004, 2: 43. doi:10.1186/1479-5876-2-43
- Lucchese A, Willers J, Mittelman A, Kanduc D, Dummer R: Proteomic scan for tyrosinase peptide antigenic pattern in vitiligo and melanoma: role of sequence similarity and HLA-DR1 affinity. J Immunol 2005, 175: 7009–7020.
- Willers J, Lucchese A, Mittelman A, Dummer R, Kanduc D: Definition of anti-tyrosinase MAb T311 linear determinant by proteome-based similarity analysis. Exp Dermatol 2005, 14: 543–550. doi:10.1111/j.0906-6705.2005.00327.x
- Stufano A, Kanduc D: Proteome-based epitopic peptide scanning along PSA. Exp Mol Pathol 2009, 86: 36–40. doi:10.1016/j.yexmp.2008.11.009
- Gusfield D: Algorithms on strings, trees, and sequences: computer science and computational biology. Cambridge: Cambridge University Press; 1997.
- Wu CH, Huang H, Arminski L, Castro-Alvear J, Chen Y, Hu ZZ, Ledley RS, Lewis KC, Mewes HW, Orcutt BC, Suzek BE, Tsugita A, Vinayaka CR, Yeh LS, Zhang J, Barker WC: The Protein Information Resource: an integrated public resource of functional annotation of proteins. Nucleic Acids Res 2002, 30: 35–37. doi:10.1093/nar/30.1.35
- Zimmerman JM, Eliezer N, Simha R: The characterization of amino acid sequences in proteins by statistical methods. J Theor Biol 1968, 21: 170–201. doi:10.1016/0022-5193(68)90069-6
- Takano K, Yutani K: A new scale for side-chain contribution to protein stability based on the empirical stability analysis of mutant proteins. Protein Eng 2001, 14: 525–528. doi:10.1093/protein/14.8.525
- Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25: 3389–3402. doi:10.1093/nar/25.17.3389
- Huang H, Shukla HD, Wu C, Saxena S: Challenges and Solutions in Proteomics. Curr Genomics 2007, 8: 21–28. doi:10.2174/138920207780076910
- The UniProt Consortium: The Universal Protein Resource (UniProt) 2009. Nucleic Acids Res 2009, 37: D169–174. doi:10.1093/nar/gkn664
- The UniProt Consortium: The Universal Protein Resource (UniProt) in 2010. Nucleic Acids Res 2010, 38: D142–148. doi:10.1093/nar/gkp846
- Kreil DP, Ouzounis CA: Identification of thermophilic species by the amino acid compositions deduced from their genomes. Nucleic Acids Res 2001, 29: 1608–1615. doi:10.1093/nar/29.7.1608
- Slesarev AI, Mezhevaya KV, Makarova KS, Polushin NN, Shcherbinina OV, Shakhova VV, Belova GI, Aravind L, Natale DA, Rogozin IB, Tatusov RL, Wolf YI, Stetter KO, Malykh AG, Koonin EV, Kozyavkin SA: The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc Natl Acad Sci USA 2002, 99: 4644–4649. doi:10.1073/pnas.032671499
- Lobry JR, Chessel D: Internal correspondence analysis of codon and amino-acid usage in thermophilic bacteria. J Appl Genet 2003, 44: 235–261.
- Peer I, Felder CE, Man O, Silman I, Sussman JL, Beckmann JS: Proteomic signatures: Amino acid and oligopeptide compositions differentiate among phyla. Proteins Struct, Funct Bioinform 2004, 54: 20–40. doi:10.1002/prot.10559
- Schwartz R, Ting CS, King J: Whole proteome pI values correlate with subcellular localizations of proteins for organisms within the three domains of life. Genome Res 2001, 11: 703–709. doi:10.1101/gr.GR-1587R
- Knight CG, Kassen R, Hebestreit H, Rainey PB: Global analysis of predicted proteomes: Functional adaptation of physical properties. Proc Natl Acad Sci USA 2004, 101: 8390–8395. doi:10.1073/pnas.0307270101
- Akashi H: Gene expression and molecular evolution. Curr Opin Genet Dev 2001, 11: 660–666. doi:10.1016/S0959-437X(00)00250-1
- Akashi H, Gojobori T: Metabolic efficiency and amino acid composition in the proteomes of Escherichia coli and Bacillus subtilis. Proc Natl Acad Sci USA 2002, 99: 3695–3700. doi:10.1073/pnas.062526999
- Kusalik A, Trost B, Bickis M, Fasano C, Capone G, Kanduc D: Codon number shapes peptide redundancy in the universal proteome composition. Peptides 2009, 10: 1940–1944. doi:10.1016/j.peptides.2009.06.035
- Brooks DJ, Fresco JR, Lesk AM, Singh M: Evolution of amino acid frequencies in proteins over deep time: inferred order of introduction of amino acids into the genetic code. Mol Biol Evol 2002, 19: 1645–1655.
- Trost B, Kanduc D, Kusalik A: Rare peptide segments are found significantly more often in proto-oncoproteins than control proteins: implications for immunology and oncology. J R Soc Interface 2009, 6: 123–127. doi:10.1098/rsif.2008.0320
- Kanduc D: Protein information content resides in rare peptide segments. Peptides 2010, 31: 983–988. doi:10.1016/j.peptides.2010.02.003
- Wootton JC: Sequences with 'unusual' amino acid composition. Curr Opin Struct Biol 1994, 4: 413–421. doi:10.1016/S0959-440X(94)90111-2
- Willers J, Lucchese A, Kanduc D, Ferrone S: Molecular mimicry of phage displayed peptides mimicking GD3 ganglioside. Peptides 1999, 20: 1021–1026. doi:10.1016/S0196-9781(99)00095-9
- Natale C, Giannini T, Lucchese A, Kanduc D: Computer-assisted analysis of molecular mimicry between HPV16 E7 oncoprotein and human protein sequences. Immunol Cell Biol 2000, 78: 580–585. doi:10.1046/j.1440-1711.2000.00949.x
- Kanduc D: Peptimmunology: immunogenic peptides and sequence redundancy. Curr Drug Discov Technol 2005, 2: 239–244. doi:10.2174/157016305775202946
- Kanduc D: Defining peptide sequences: from antigenicity to immunogenicity through redundancy. Curr Pharmacogenomics 2006, 4: 33–37. doi:10.2174/157016006776055374
- Kanduc D: Correlating low-similarity peptide sequences and allergenic epitopes. Curr Pharm Des 2008, 14: 289–295. doi:10.2174/138161208783413257
- Kanduc D: Immunogenicity in peptide-immunotherapy: from self/nonself to similar/dissimilar sequences. In Multichain Immune Recognition Receptor Signaling: From Spatiotemporal Organization to Human Disease. Landes Biosci. Edited by: Sigalov A. Austin, TX, USA; 2008:198–207.
- Kanduc D: Self-nonself peptides in the design of vaccines. Curr Pharm Des 2009, 15: 3283–3289. doi:10.2174/138161209789105135
- Lucchese G, Stufano A, Kanduc D: Proteome-guided search for influenza A B-cell epitopes. FEMS Immunol Med Microbiol 2009, 57: 88–92. doi:10.1111/j.1574-695X.2009.00582.x
- Lucchese A, Serpico R, Crincoli V, Shoenfeld Y, Kanduc D: Sequence uniqueness as a molecular signature of HIV-1-derived B-cell epitopes. Int J Immunopathol Pharmacol 2009, 22: 639–646.
- Kanduc D: Epitopic peptides with low similarity to the host proteome: towards biological therapies without side effects. Expert Opin Biol Ther 2009, 9: 45–53. doi:10.1517/14712590802614041
- Mandavilli A: When the vaccine causes disease. Nat Med 2007, 13: 274. doi:10.1038/nm0307-274b
- Kanduc D: Penta- and hexapeptide sharing between HPV16 and Homo sapiens proteomes. Int J Med Sci 2009, 1: 387.
- Kanduc D: Quantifying the possible cross-reactivity risk of an HPV16 vaccine. J Exp Ther Oncol 2009, 8: 65–76.
- Ricco R, Kanduc D: Hepatitis B virus and Homo sapiens proteome-wide analysis: A profusion of viral peptide overlaps in neuron-specific human proteins. Biologics 2010, 4: 75–81.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.