Widespread evidence of viral miRNAs targeting host pathways
© Carl et al.; licensee BioMed Central Ltd. 2013
Published: 21 January 2013
Skip to main content
© Carl et al.; licensee BioMed Central Ltd. 2013
Published: 21 January 2013
MicroRNAs (miRNA) are regulatory genes that target and repress other RNA molecules via sequence-specific binding. Several biological processes are regulated across many organisms by evolutionarily conserved miRNAs. Plants and invertebrates employ their miRNA in defense against viruses by targeting and degrading viral products. Viruses also encode miRNAs and there is evidence to suggest that virus-encoded miRNAs target specific host genes and pathways that may be beneficial for their infectivity and/or proliferation. However, it is not clear whether there are general patterns underlying cellular targets of viral miRNAs.
Here we show that for several of the 135 known viral miRNAs in human viruses, the human genes targeted by the viral miRNA are enriched for specific host pathways whose targeting is likely beneficial to the virus. Given that viral miRNAs continue to be discovered as technologies evolve, we extended the investigation to 6809 putative miRNAs encoded by 23 human viruses. Our analysis further suggests that human viruses have evolved their miRNA repertoire to target specific human pathways, such as cell growth, axon guidance, and cell differentiation. Interestingly, many of the same pathways are also targeted in mice by miRNAs encoded by murine viruses. Furthermore, Human Cytomegalovirus (CMV) miRNAs that target specific human pathways exhibit increased conservation across CMV strains.
Overall, our results suggest that viruses may have evolved their miRNA repertoire to target specific host pathways as a means for their survival.
MicroRNAs (miRNA) are ~22nt non-coding regulatory genes that target other RNA molecules via sequence-specific hybridization, which results either in translation inhibition (an imperfect target miRNA sequence match) or in cleavage and degradation of the targeted RNA (a perfect target miRNA sequence match) . Initially discovered in worms, miRNAs are now known to serve numerous critical regulatory functions in a wide spectrum of species including viruses , plants, and mammals . In mammals, miRNAs play a regulatory role in processes as diverse as metabolism, immune response, cell death, proliferation, circadian rhythm, and hematopoiesis 
Viruses depend upon the molecular machinery of the cells they infect. They have co-evolved with their host over millions of years and have had to adapt to the cellular environment, which in turn is evolving to evade viral infection. RNA viruses are limited in genomic size due in part to error prone RNA polymerases. DNA viruses allow a larger genomic size and a richer gene repertoire to interact with the host and presumably evolve complex mechanisms for infectivity and survival; we have therefore restricted our study to DNA viruses. DNA viruses have utilized several mechanisms to evade host defenses . Poxviruses and herpes viruses have evolved multiple strategies to disrupt presentation of viral antigens by MHC molecules in order to evade control by T lymphocytes . The adenovirus inhibits MHC-I molecules to evade immune detection . Viruses have also developed mechanisms to bolster the host cell to withstand the viral replication process such as an inhibition of apoptosis , an induction of cellular growth or differentiation , a recruitment of tissue repair processes such as fibrosis , tissue growth , and angiogenesis .
Evidence presented thus far suggests that viral miRNAs target host pathways and biological processes that may potentially benefit viral replication or persistence. If the suppression of a host gene disables host defense or enables viral survival or proliferation (e.g., by promoting cell growth, or inducing repair mechanisms to aid the survival of the infected cell), then encoding a miRNA that targets the specific host gene would potentially benefit the virus. While a few previous studies support this suggestion with specific examples in a limited context [10, 19, 20], it is not clear whether there are shared patterns among different viral miRNAs targeting different hosts. Here we investigate the general patterns of cellular targeting by viral miRNAs for a comprehensive set of human and murine viruses.
We found that for several of the 135 known miRNAs encoded by human viruses, the putative human gene targets of these viral miRNAs are enriched for specific host pathways whose targeting is arguably beneficial to the virus. Many of these pathways, including cancer-related pathways, melanogenesis, axon guidance, ErbB, GnRH, Wnt and MAPK signaling pathways are independently enriched among targets of multiple miRNAs encoded by multiple viruses. Next, to assess the generality of our findings for known miRNAs, based on previously identified 6809 putative miRNAs encoded by 23 human viruses, we predicted high-confidence targets of the miRNAs. We then assessed the enrichment of pathways and Gene Ontology (GO) functional classes in the target sets. At a false discovery rate (FDR) of 10%, we found that 75 miRNAs, corresponding to 15 viruses, target 43 unique pathways. Of these pathways, 20 (6 of which are cancer-related) were repeatedly targeted by independent miRNAs encoded by 14 different viruses. A similar analysis of 4 murine viruses encoding 21 miRNAs also yielded several enriched pathways and functions which significantly overlapped with those identified in human. Based on our biological understanding, the most significant pathways and processes likely targeted by viral miRNAs present a reasonable choice from a virus' vantage point, which include cell growth, axon guidance, and antigen processing and presentation. We discuss this in light of current understanding of viral infection pathways. Moreover, based on genome sequences of 14 strains of human CMV, we found that the miRNAs targeting these specific pathways are evolutionarily more conserved than the other viral miRNAs, suggesting a purifying selection of targeting specific host pathways. Overall, our results suggest that viruses may have evolved their miRNA repertoire to target specific host pathways as a means for their survival and proliferation.
Targets of known human viral miRNA are enriched for KEGG Pathways.
KEGG Pathway Name
ErbB signaling pathway
MAPK signaling pathway
GnRH signaling pathway
Wnt signaling pathway
Viral miRNA selection.
Human adenovirus A
Human adenovirus Type 35
Human adenovirus C
Human adenovirus Type 5
Human adenovirus D
Human adenovirus Type 7
Human adenovirus E
Human adenovirus F
Human adenovirus Type 11
Human adenovirus Type 1
Human adenovirus Type 17
Human adenovirus Type 2
Murine adenovirus A
Human miRNA targets are enriched for KEGG Pathways and GO terms, and multiple viruses target the same pathways even across virus families.
Biosynthesis of steroids
ErbB signaling pathway
Regulation of autophagy
Cytoplasmic vesicle membrane
Hedgehog signaling pathway
Inner ear morphogenesis
Multicellular organismal development
Antigen processing and presentation
GnRH signaling pathway
Protein serine/threonine kinase activity
Extracell-glutamate-gated ion channel activity
Voltage-gated ion channel activity
Canonical Wnt Receptor signaling pathway
We assessed whether the distribution of the number of miRNAs targeting a pathway is skewed relative to expectation. Given the bipartite graph composed of miRNAs (whose targets are enriched for specific pathways) and the corresponding enriched pathways, we randomized the bipartite graph while preserving the degree of each miRNA, in order to estimate the expected degree distribution of pathways. The degree distribution for the real data is skewed towards higher values (Kolmogorov Smirnov (KS) test p-value = 0.029). The antigen processing and presentation pathway was specifically enriched among the targets of twelve miRNA. Based on 1000 randomizations of the bipartite graph; the probability of this occurrence is less than 0.0001. Three pathways were enriched in targets of 7 or more miRNAs (p-value = 0.016 based on 1000 randomizations).
While the pathway-enrichment analysis can reveal very specific biology, it is also limited by our knowledge of pathways. We therefore extended our pathway-enrichment procedure above to search for enriched GO biological processes. As before, we applied a term-specific adjustment of p-value, applied multiple testing corrections, and used a FDR threshold of 10% to select for enriched GO terms. Table 3 shows GO terms that were enriched among the targets of miRNAs in more than one virus. An interesting set of terms not revealed by KEGG pathway analysis became apparent (discussed later). As above, we analyzed the degree distribution associated with GO terms in the bipartite graph composed of enriched GO terms and miRNAs. As before, the GO terms were multiply targeted more frequently than expected (KS test p-value = 0.007). Cytoplasmic microtubule and microtubule bindings were enriched for 9 miRNA.
It is possible that a pathway/GO term is enriched among targets of multiple miRNAs simply due to a high similarity of the miRNA sequences, and could not be interpreted as independent evidence. To rule out this possibility we analyzed the pair-wise overlap in target sites for miRNAa enriched the same functional term (see Methods). For each miRNA pair, we calculated the overlap as the fraction of all target sites (for the miRNA with fewer hits) that overlapped with the target sites of the other miRNA. We found that in almost all cases (99%), the overlap was less than 0.05. Thus, the miRNA sequence similarity does not explain the common pathway enrichment. Taken together, our results suggest that several pathways and GO terms are targeted independently by multiple miRNAs.
Mouse and human virus target similar KEGG pathways and GO terms.
Synthesis and degradation of ketone bodies
C21-steroid hormone metabolism
ErbB signaling pathway
Multicellular organismal development
Regulation of immune response
Cellular response to retinoic acid
Positive regulation of muscle cell differentiation
Negative regulation of insulin receptor signaling pathway
Positive regulation of phosphatidylinositol 3-kinsase activity
Insulin receptor signaling pathway
Many of the miRNAs encoded by EBV have been suggested to target host genes and were shown to be evolutionarily conserved . Next, we investigated if the viral miRNAs that target specific host processes are evolving under differential pressure relative to other miRNAs. We obtained and aligned 14 CMV strain sequences  to reference the VirMir strain using ClustalW . Given the multiple alignment we calculated the average conservation score for each miRNA, defined as a fraction of the consensus base at each position, averaged across the length of the miRNA. Those that putatively target specific host processes are significantly more conserved than the miRNAs that do not target a specific host pathway (Wilcoxon p-val 0.047), suggesting that they are evolving under purifying selection. This analysis could not be extended to other strains due to paucity of strains and inter-strain variability.
Given that DNA viruses encode for miRNAs that can be effectively processed by the host machinery , it stands to reason that targeting of specific host genes by viral RNA may be effective ammunition against host defense. A single miRNA may target hundreds of genes for regulation, and identifying the true targets of miRNAs remains a challenge. This study was designed to investigate the possibility that virally-encoded miRNAs target host genes in pathways central to the persistence and/or replication strategies of many viruses. We first tested for enriched pathways among putative targets of known human viral miRNAs from miRbase. Recent next-generation sequencing efforts have revealed new viral miRNAs, and this trend is likely to continue as more cell types and tissues are analyzed  (and unpublished data by Trgovcich). Therefore, given that a vast majority of viral miRNAs remain to be identified, we used a dataset of predicted viral miRNA to test the generality of our findings for known miRNAs. Based on a stringent analysis, our results support the general hypothesis.
A few aspects of our finding are worth highlighting. First, published experimental data suggests that many of the pathways and processes putatively targeted by viral miRNAs make excellent targets to aid viral infectivity and survival. Second, we found that many KEGG pathways and GO biological processes are targeted by multiple miRNAs from multiple viruses, even across viral families. For instance, miRNAs targeting antigen processing and presentation and gonadotropin receptor hormone signaling are enriched in both adeno and herpes virus families. While it is known that herpes reside in neuronal cells during latency , axon guidance pathways are also enriched by both viral families, which is surprising because adeno viruses are normally associated with lung or gastrointestinal infections, and not neuronal cells . The other commonality between both virus families was the cancer-related pathways, which we categorized as one group. Third, we also found many of these enriched targets in human to be the same when we repeated our analysis independently in mouse viruses. For example, cancer and the ErbB signaling pathway, insulin receptor regulation, and axon guidance pathways are preserved between mouse and human viral targets. We discuss a selection of enriched pathway in more detail below. Finally, we found evidence to suggest that the miRNAs targeting specific host processes are evolving at a relatively slower rate than other miRNAs. This is consistent with previous findings in EBV that showed viral miRNAs are likely to target host genes that are evolutionarily conserved . We note that the evolutionarily conserved miRNAs in our analysis are not biased towards known miRNAs.
A few previous studies report similar findings, although in a limited context. A previous analysis showed enrichment for KEGG pathways associated with cell signaling and adhesion/junction pathways using the putative targets of 85 known miRNAs in 5 human herpes virus as an aggregate (not virus-specific or miRNA-specific) . Another study of known KSHV miRNAs and targets predicted by differential expression analysis found the targets to be involved in proliferation, immune modulation, angiogenesis, and apoptosis . Finally experimentally identified targets for EBV were found to be enriched for targets involved in innate immunity, cell survival, and cell proliferation . Here we verified these previous finding and extended the analysis to individual viruses and miRNAs as well as broadened the virus families to include the adenoviruses and poxviruses. Furthermore, by including human and mouse viruses, our analyses support evolutionary prevalence of viral miRNAs targeting host pathways and processes for their survival and proliferation. Finally, we present evidence for possible purifying selection, acting on certain miRNAs targeting specific host processes.
While miRNA prediction is challenging in general, it is especially difficult to predict viral miRNAs, because most miRNA prediction approaches have been developed for eukaryotes and exploit evolutionary conservation , and are thus not applicable to viruses. Current miRNA prediction in viruses essentially relies on identifying hairpin like structures with certain additional properties . We started with the putative miRNAs for dsDNA viruses compiled in the VirMir . Undoubtedly, many of these predicted miRNAs will be false positives. However, the complete set of viral miRNAs is far from known. For instance, since 2007 in VirMir database, the number of known miRNAs for viruses studied here has grown greater than three-fold, and will grow as technologies improve and as additional cell types and tissues infected by viruses are analyzed. Using the group of all potential viral miRNAs based on predicted hairpin precursor structures therefore allowed us to begin with most comprehensive set of putative target genes for statistical analysis. Predicting miRNA targets presents yet another complication, especially for viruses, for similar reasons, and is fraught with false positives. A 24% false discovery rate has been calculated for MiRanda similar to other target prediction algorithms. To minimize false positives from the putative miRNAs collected from VirMir, we restricted the target scores to a value of 140 or higher which correspond to exact seed matching criteria as well as a further threshold based on target score distribution for scrambled 3'UTR sequences. We adopted several additional checks to minimize false positives at the stage of functional enrichment assessment. First, we performed a strict pathway-specific adjustment of p-values accounting for 3' UTR lengths of genes in the pathway (a novel aspect of this work) and applied multiple testing corrections. Second, we used multiple virus families. Third, we relied on both KEGG and GO to better interpret our findings, and we used two distant species to corroborate our findings. Lastly, we compared the evolutionary conservation of miRNAs targeting specific host pathways to the other miRNAs based on 14 distinct strains of CMV and determined that they are conserved much more than expected, suggesting a selective pressure to maintain the targeting specificity. Although there are likely to be many false positives at the level of individual target genes, due to the reasons listed above, it is encouraging that they still reveal biologically meaningful patterns of functional enrichments.
We found that miRNAs in many viruses targeted similar pathways even across distant host species. We compared the pathway enrichment for known viral miRNA in miRBase with that for predicted miRNAs in VirMir. For brevity, we discuss in detail the biological significance of three most significant pathways.
It is encouraging that cancer, as a theme, is repeatedly revealed by our analysis. It is becoming increasingly clear that miRNAs may have a causal role in initiation and progression of cancer . A few viruses included in this study are known to be associated with human cancer such as EBV in Burkett's lymphoma and nasopharyngeal carcinoma, and KSHV in Kaposi's sarcoma and some B cell lymphomas, but there is no direct evidence yet that viral miRNAs play a causal role in the development of cancer. Oncogenic pathways that viruses are known to interfere with include (1) cell growth, (2) proliferation, (3) evasion of apoptosis, (4) genetic instability, and (5) angiogenesis. The enriched KEGG pathways by both mice and humans include one pathway and three GO terms associated with features deregulated in cancer, in addition to the cancer pathway. Regulation of the MAPK signaling pathway is an essential co-factor for re-activation of KSHV . A closely related PI3K pathway, targeted by herpes viruses, is critical to cell survival and growth , and is frequently activated in cancer pathways . Both pathways have the potential to inhibit KSHV infections . The PI3K pathway is a particular favorite for DNA viruses enabling infected cells to withstand viral induced stress . DNA viruses can affect ErbB signaling pathway, which can initiate the MAPK pathway resulting in growth or differentiation . Enrichment for these pathways suggests that viral miRNAs may play a role in the oncogenesis.
The Hh genes regulate tissue patterning and are expressed in the beta cells of the pancreas and was implicated in insulin production . Hepatitis infection induces Hh pathways and is associated with fibrogenic repair, and vascular remodeling . The viral need for tissue repair could be categorized as stress management so that the infected cell can withstand the stresses of viral replication programs; however, it is not clear why a virus would need to regulate insulin. Insulin is a metabolic molecule that causes tissue to take up glucose from the blood and to stop using fat as an energy source; it is possibly used as a metabolic regulator. One can speculate that viral hijacking of the key energy regulator may induce the infected cell to devote their energy production to the virus' needs. Enrichment of the GO term for negative regulation of insulin receptor signaling and insulin receptor signaling pathway would indicate that insulin regulation is indeed the targeted pathway.
HSV is commonly latent in nerves, and during an infection the virus travels along the nerves to the connected tissue and forms lytic lesions. Nerve growth factors are necessary survival factors for HSV in latency . Additionally, it has been shown that HSV utilizes nerve growth cones and varicosities of axons for transport  indicating a role in nerve growth pathways. What is surprising is that three miRNA from two different adeno viruses were enriched for axon guidance pathway in humans. Because adeno viruses do not usually target neurons, it is likely that this is a false positive. Nevertheless, viral miRNA targets are enriched for axon guidance and GO term neuron development and support the targeting of this important latency pathway for the herpes virus.
Given a miRNA sequence from VirMir , and the 3' UTR of human (or mouse) genes, we used Miranda3.3 to predict the miRNA targets . As a baseline threshold, we only considered Miranda hits scoring 140 or higher, which corresponds to a perfect heptamer match in positions 2-8 and should be considered stringent . As a further control, we randomly scrambled each 3' UTR sequence and repeated the target prediction on the randomized sequences. Using the distribution of scores on the randomized sequences, we determined a score cutoff corresponding to top 1% (for analysis of known miRNA) or 5% (for analysis of predicted miRNAs) scores. This threshold was determined separately for each miRNA and was always much more stringent that the baseline score threshold of 140.
We compared each gene target set with an annotated functional gene sets corresponding to KEGG pathways and GO biological processes; gene sets with fewer than 10 members were excluded. For each miRNA we assessed overlap between the miRNA target set and each functional gene set using a hypergeometric test, which estimates the probability (p-value) of observing an overlap of certain size of greater between two gene sets by random chance. We noted that if the genes in functional gene set have longer 3' UTRs, then those pathways are more likely to be deemed as enriched simply by chance. To control for 3' UTR length bias, we divided the range of 3' UTR lengths into 20 equal-sized bins. For a miRNA, for each target of that miRNA, we randomly selected a gene in the same 20 percentile bin, thus generating a length matched random set of target genes for each miRNA; we generated 10 such random sets corresponding to each actual target set. For each randomize target set, we calculated the hypergeometric p-value for the overlap with each functional gene set. Thus, for a functional gene set (KEGG or GO), we generated a NULL distribution of hypergeometric p-values containing 68,090 p-values corresponding to 10 randomized sets corresponding to 6809 miRNAs. Given an actual miRNA target set S and functional gene set F, and given the hypergeometric p-value P(S, F), we estimated an adjusted p-value P Adj (S, F) as the rank of P in the 68,090 p-values corresponding to F. Finally, given adjusted p-values for all miRNA target sets for all viruses against all functional gene sets, we corrected the adjusted p-values for multiple testing using the Benjamini-Hochberg procedure for FDR estimation .
Predicted miRNA targets (310 unique Human miRNA, 817 unique Entrez gene ID's) are available at ftp://ftp.cbcb.umd.edu/pub/data/sridhar/Human.miRNA.supplemental.
The publication costs for this article were funded by NIH grant R01GM085226 to SH.
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 2, 2013: Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S2.
This work is supported by NIH grant R01GM085226 to SH.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.