dbSMR: a novel resource of genome-wide SNPs affecting microRNA mediated regulation
© Hariharan et al; licensee BioMed Central Ltd. 2009
Received: 06 November 2008
Accepted: 16 April 2009
Published: 16 April 2009
MicroRNAs (miRNAs) regulate several biological processes through post-transcriptional gene silencing. The efficiency of binding of miRNAs to target transcripts depends on the sequence as well as intramolecular structure of the transcript. Single Nucleotide Polymorphisms (SNPs) can contribute to alterations in the structure of regions flanking them, thereby influencing the accessibility for miRNA binding.
The entire human genome was analyzed for SNPs in and around predicted miRNA target sites. Polymorphisms within 200 nucleotides of a target site that could alter the intramolecular structure at the site, and thereby alter regulation, were annotated. The collated information was ported into a MySQL database with a user-friendly interface accessible through the URL: http://miracle.igib.res.in/dbSMR.
The database can be queried using the gene name, microRNA name, polymorphism ID or transcript ID. Combination queries using 'AND' or 'OR' are also possible, along with a filter on the degree of change in intramolecular bonding between the sequences with and without the polymorphism. Such a resource would enable researchers to address questions such as the role of regulatory SNPs in the 3' UTRs and population-specific regulatory modulations in the context of microRNA targets.
The interaction of microRNAs (miRNAs) with specific sites in the transcripts of several human genes has profound effects on biological processes such as development, differentiation, proliferation, apoptosis, metabolism, host-pathogen interactions and cancer [1, 2]. These ~17–25 nucleotide long molecules generally bind to the 3' untranslated regions (UTRs) of transcripts harboring complementary sites, thereby reducing their translation. Over 500 miRNAs have been identified in the human genome, each with the potential to bind to hundreds of transcripts. The miRNAs form a complex with proteins, called the miRNA-protein complex (miRNP) or the miRNA-induced silencing complex (miRISC). This complex is known to interact with target sites with incomplete complementarity. Several experiments have demonstrated that bases 2–7 from the 5' end of the miRNA must be exactly complementary to the sequence at the target site to form a 'seed' pairing, while a few mismatches or bulged loops of 3–5 nucleotides can be tolerated [4, 5]. Another set of experiments demonstrated that a seed match is not mandatory if compensatory pairing towards the 3' end of the miRNA is sufficient to give the bound complex an optimum free energy.
Variation of structure at the target site has been identified as another key factor that determines the interaction of a miRNA with its target site. Long-range interactions between bases in the RNA result in complex structures like pseudo-knots, while short-range interactions mostly lead to stem-loop structures. While the base composition and the length of the stem region make some of these structures particularly stable, other features such as internal loops, multi-branch loops or bulges could destabilize them. Conceivably, the target site of a particular miRNA might not always be open and accessible for the miRISC to interact with. It has been established that miRNPs bind more effectively to target sites that lack a highly structured conformation than to structurally stable target sites. The presence of stable structures in the 70 nucleotides (nt) flanking the respective target sites hindered hsa-miR-1 from downregulating thymosin β4 and Igf1, while the same miRNA could regulate the levels of Hand2. In another study, the sequence composition was altered to force a structural variation in order to confirm the accessibility preference of the miRNA. Based on these principles, we (unpublished web server) and others have implemented a second generation of target prediction servers, which incorporate the accessibility of the target site to the miRNA as an additional factor [10–13].
Several cases of dysregulation due to polymorphisms at the miRNA binding site have been reported. The 3' UTR of the SLITRK1 gene, a candidate gene for Tourette's syndrome, was found to harbour a G-to-A polymorphism that stabilizes the interaction with hsa-miR-189, since an A:U pairing is stronger than the G:U wobble. To facilitate the collation of such SNPs that occur at miRNA target sites, a database called Patrocles has been developed. Quantitative Trait Loci (QTL) mapping in sheep identified the gene GDF8 as accounting for muscular hypertrophy. This gene contained a G-to-A substitution in the 3' UTR that created a more stable site for two miRNAs, miR-1 and miR-206. A three-fold reduction in GDF8 was observed. A genome-wide study has established that, although SNPs at miRNA binding sites are rare, a few of them are positively selected in certain populations. Another such study of SNPs in the miRNA binding sites of all human transcripts established that very few SNPs occur in the miRNA binding motifs and that aberrant allele frequencies were found in cancer ESTs. Another example is the A-to-C polymorphism (rs5186), which disrupts the A:U pairing and consequently the binding of hsa-miR-155 to the AGTR1 transcript, possibly leading to hypertension. A C-to-T polymorphism 14 nt downstream of the miR-24 target site in the DHFR gene resulted in degradation of the target transcript.
Construction and Content
Targets of all human miRNAs, obtained from the miRBase database v9, were predicted in the 3' UTR sequences downloaded from the Ensembl database using the BioMart feature. Currently available miRNA target prediction tools produce a large number of false positives; as an alternative, results on which two or three algorithms agree are better suited to identifying the most probable miRNA-target pairs. We used three programs, miRanda, RNAhybrid and TargetScan, to detect the miRNA-target pairs [23–25]. Only those miRNA-target pairs that all three programs predicted to bind at the same target site were selected.
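The three-way agreement filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Site` record, the function names, and the rule that two predictions refer to "the same target site" when their coordinates overlap on the same transcript for the same miRNA are all assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple


class Site(NamedTuple):
    """Hypothetical record for one predicted miRNA target site."""
    mirna: str
    transcript: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlaps(a: Site, b: Site) -> bool:
    """Assumed notion of 'same site': same miRNA and transcript, overlapping span."""
    return (a.mirna == b.mirna and a.transcript == b.transcript
            and a.start <= b.end and b.start <= a.end)


def consensus_sites(reference: Iterable[Site],
                    *others: Iterable[Site]) -> List[Site]:
    """Keep the reference predictions that every other tool also supports."""
    other_lists = [list(tool) for tool in others]
    return [site for site in reference
            if all(any(overlaps(site, hit) for hit in tool)
                   for tool in other_lists)]


# Toy predictions (invented identifiers and coordinates)
miranda = [Site("hsa-miR-16", "T1", 100, 121), Site("hsa-miR-1", "T2", 50, 71)]
rnahybrid = [Site("hsa-miR-16", "T1", 98, 120)]
targetscan = [Site("hsa-miR-16", "T1", 110, 117), Site("hsa-miR-1", "T2", 300, 307)]

agreed = consensus_sites(miranda, rnahybrid, targetscan)
```

Here only the hsa-miR-16 site on the toy transcript survives, because the hsa-miR-1 site is not supported by all three lists. A stricter or looser matching rule (exact coordinates, minimum overlap) would change which pairs are retained.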
We further analyzed the subset of SNPs located within 200 nt of the predicted miRNA target sites by extracting two sets of sequences, one with the wild-type allele and the other with the polymorphic allele at the 201st position of the stretch. We then computationally determined the secondary structures of both sequences using the RNAfold program. Computational prediction of RNA secondary structure has limited accuracy for long-range interactions, complex structures like pseudo-knots, and long sequences (>1 kb). We focused on sequence stretches of 400 nt for two reasons: (a) long-range interactions might be overcome by the steric hindrance caused by miRNPs; and (b) presently available secondary structure prediction tools have an optimum efficiency for sequences of 400–700 bases.
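The construction of the two sequences can be illustrated with a short sketch. It assumes a 0-based SNP index into the UTR string, a flank of 200 nt on each side, and truncation of the window where the SNP lies close to either end of the UTR; the function name and these conventions are hypothetical and not taken from the original pipeline.

```python
from typing import Tuple


def allele_windows(utr: str, snp_index: int, ref: str, alt: str,
                   flank: int = 200) -> Tuple[str, str, int]:
    """Return (wild_type, polymorphic, snp_offset) for a window around a SNP.

    snp_index is the 0-based position of the SNP in `utr` (assumed convention).
    The window is truncated if the SNP is closer than `flank` to either end.
    """
    if utr[snp_index].upper() != ref.upper():
        raise ValueError("reference allele does not match the UTR sequence")
    start = max(0, snp_index - flank)
    end = min(len(utr), snp_index + flank + 1)
    wild_type = utr[start:end]
    offset = snp_index - start  # position of the SNP inside the window
    polymorphic = wild_type[:offset] + alt + wild_type[offset + 1:]
    return wild_type, polymorphic, offset


# Toy UTR: 200 upstream bases, the SNP (G>A), then 200 downstream bases
toy_utr = "A" * 200 + "G" + "C" * 200
wild, variant, snp_offset = allele_windows(toy_utr, 200, "G", "A")
```

With full flanks the SNP sits at offset 200, that is, the 201st base of the stretch. Each of the two strings can then be passed to a folding program such as RNAfold.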
We then extracted the structural information of the 3' UTR at the site where the miRNA is known to bind, for both the wild-type and the polymorphic sequence. Bases involved in intramolecular base-pairing are denoted by an 'X', while a '-' denotes an unbound base. We count the bases at the target site whose structural state differs between the two sequences; the ratio of this count to the total number of bases binding to the miRNA gives the degree of change in structure. This degree of change in the intramolecular bonds formed indicates the effect of the SNP on the intramolecular structure at the particular target site.
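A minimal sketch of this calculation is given below. It assumes that the folded structures are available in dot-bracket notation (as produced by RNAfold), that the target site is given as a 0-based half-open slice of the folded window, and that the denominator is the full length of that slice; the function names are hypothetical.

```python
def pairing_mask(dot_bracket: str) -> str:
    """Convert dot-bracket notation to the 'X' (paired) / '-' (unpaired) notation."""
    return "".join("-" if symbol == "." else "X" for symbol in dot_bracket)


def degree_of_change(wild_mask: str, variant_mask: str,
                     site_start: int, site_end: int) -> float:
    """Fraction of target-site bases whose paired/unpaired state differs.

    site_start and site_end delimit the target site as a 0-based, half-open
    slice of the folded window (an assumed convention).
    """
    wild_site = wild_mask[site_start:site_end]
    variant_site = variant_mask[site_start:site_end]
    if not wild_site or len(wild_site) != len(variant_site):
        raise ValueError("target site must be non-empty and inside both masks")
    changed = sum(a != b for a, b in zip(wild_site, variant_site))
    return changed / len(wild_site)


# Toy example: a hairpin in the wild type that is fully open in the variant
wild_mask = pairing_mask("((((....))))")
variant_mask = pairing_mask("............")
change = degree_of_change(wild_mask, variant_mask, 0, 8)
```

In this toy case four of the eight target-site bases switch from paired to unpaired, giving a degree of change of 0.5.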
Overview of predicted effect of SNPs on miRNA target binding
Intramolecular Structure at miRNA binding region
Utility and Discussion
Data pertaining to validated miRNA-target pairs allow further studies on the effect of polymorphisms, not just at the miRNA target site, but also in the region around it. Two miRNAs (hsa-miR-15a and hsa-miR-16) have been experimentally demonstrated to target the BCL2 transcript. Deletion of this miRNA cluster has been implicated in B-cell lymphoma. We notice that a polymorphism (rs4987856) 172 bases upstream of the target site can convert the highly accessible structure into an inaccessible one (Figure 1b) for hsa-miR-15a and hsa-miR-16. This structural alteration might prevent miRNA interaction with transcripts harboring the polymorphic allele, mimicking the effect of the miRNA deletion seen in B-cell lymphoma patients.
We further analyzed the selection pressure on those SNPs which alter miRNA binding through structural effects. The integrated haplotype score (iHS) is a standardized measure of long-range haplotype for a particular SNP in a given population. The same approach was used in a recent paper which performed a genome-wide scan of SNPs at miRNA binding sites. The iHS values for all SNPs available from HapMap phase 2 data in three populations, ASI (Chinese and Japanese), YRI and CEU, were obtained from the Haplotter website (http://hg-wen.uchicago.edu/selection/haplotter.htm). Data were available only for SNPs with a minor allele frequency (MAF) > 5%. We found that very few (only 1–2%) of the SNPs that change miRNA accessibility showed signs of either positive or negative selection (iHS < -2 or iHS > 2, respectively). The SNPs rs140074 (in the PATZ1 3'UTR) and rs11848279 (in the NFATC4 3'UTR) indicate negative selection (in the Yoruban and Caucasian populations) and positive selection (in the Yoruban and Caucasian populations), respectively.
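The thresholds stated above can be written as a small classification sketch. The function name and the returned labels are hypothetical; the cut-offs (MAF > 5%, iHS < -2 for positive and iHS > 2 for negative selection) follow the text as written.

```python
def classify_ihs(ihs: float, maf: float,
                 threshold: float = 2.0, min_maf: float = 0.05) -> str:
    """Label a SNP using the iHS cut-offs stated in the text."""
    if maf <= min_maf:
        return "not scored"          # iHS data exist only for MAF > 5%
    if ihs < -threshold:
        return "positive selection"  # iHS < -2, as stated in the text
    if ihs > threshold:
        return "negative selection"  # iHS > 2, as stated in the text
    return "no strong signal"


# Invented values for illustration only
labels = [classify_ihs(-2.4, 0.20), classify_ihs(2.7, 0.31),
          classify_ihs(0.3, 0.12), classify_ihs(-3.0, 0.02)]
```

Applying such a filter per population would reproduce the kind of tally reported above, in which only a small fraction of the accessibility-altering SNPs pass either threshold.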
It is appreciated that secondary structures are common in the UTRs of the transcripts. It is also clear from several studies that interaction of miRNAs to the target site is governed to a large extent by the structural accessibility to these sites. Since polymorphisms can alter the structure of these regions, we propose that variations in the 3' UTRs, even if farther away from the target site can alter the miRNA binding and hence would contribute to this additional layer of regulation. Stable structural motifs in the target sites would be inaccessible for miRNAs thereby constraining miRNA mediated regulation. The large activation energy involved in destabilizing the mRNA secondary structure would render interactions within a secondary structure forming region kinetically non-feasible even when thermodynamically viable. Others and we have previously devised approaches to incorporate the structural architecture of target regions into miRNA target prediction. Comparing the free energy difference of the intramolecular interaction with that of the interaction with the miRNAs, it is possible to identify thermodynamically feasible interactions of miRNA with the target site. Although currently available reports suggest direct involvement of SNPs in the miRNA target site whereby a nucleotide that interacts with the miRNA itself changes altering the intermolecular energy (Minimal Free Energy of the complex), we notice that variations away from the target site (the target region) can also affect miRNA accessibility. The loss of miR-24 targeting DHFR transcript due to a T-allele 14 nt downstream of the predicted target site was demonstrated to reduce the half life of the transcript . The authors propose that the region 14 nt downstream of the target site is important in the binding of the Ago proteins. However, we find that there is a significant change in the structural conformation of the UTR of DHFR. 
While the UTR exists in a highly structured form with the 'T' allele, the UTR which harbors a 'C' is highly unstructured. This could explain the increased miRNA binding affinity to the target region of the UTR with the 'C' allele (Figure 1c).
It is difficult for individual investigators to look at the overall complexity in the context of genetic variation. Hence the dataset presented here should be of considerable value to researchers. In this paper, we have analyzed and catalogued polymorphisms that could make specific genes in some individuals more (or less) susceptible to miRNA mediated regulation due to such structural changes. As demonstrated for the validated miR-15a/miR-16 target site in the BCL2 gene, a stretch of intramolecular bond formation at the interacting site of the miRNA in the UTR might lead to loss of miRNA binding. It remains open for experimentalists to validate such possibilities and to study the various complexities involved in miRNA-target interactions. It would be worthwhile to identify polymorphisms with high polymorphic allele frequencies that affect miRNA accessibility. By linking the functional role of the target gene with the known effects of miRNA binding, investigators can detect novel regulatory components that are prevalent in certain populations and make them more or less susceptible to miRNA mediated PTGS. Such a resource would enable researchers to address questions such as the role of regulatory SNPs in the 3' UTRs and population-specific regulatory modulations.
As validation and experimental confirmation of miRNA-target interactions increase, we aim to keep the database regularly updated. In the next version, we also plan to include a graphical representation of the intramolecular structural changes. Although most users would require the data pertaining to a specific gene or a miRNA, we plan to incorporate a representation of the polymorphism and target region as an interactive map in the forthcoming improvement.
There have been several studies which have proven the detrimental effects of polymorphisms at the miRNA target site. Various structural analyses have also shown that accessibility of the miRNAs at the target site is an important factor that governs the miRNA mediated regulation. Polymorphisms that can alter the secondary structure at the miRNA binding region can thus have a significant role in controlling the accessibility of the miRNAs.
Through the genome-wide miRNA target prediction performed here, we have collated information on all validated SNPs that can affect, to varying degrees, the secondary structure of miRNA binding regions. Such a resource would enable researchers to address questions such as the role of regulatory SNPs in the 3' UTRs and population-specific regulatory modulations. The true significance of the principle can be realized when the effect of these polymorphisms is studied at the population level or in case-control disease samples. Such studies would allow conclusive classification of SNPs as detrimental to miRNA binding or not, based on the information provided. We hope the database provides the necessary support for such high-throughput and thorough analysis.
Availability and Requirements
The dbSMR database is freely available to all academic users and is accessible through the URL: http://miracle.igib.res.in/dbSMR
The authors thank all colleagues who tested the database and gave several suggestions for improvement. We especially thank Drs. Beena Pillai, Anurag Aggarwal, Souvik Maiti and Sridhar Sivasubbu for suggestions on the database and manuscript. MH acknowledges Prof. Vani Brahmachari, Jasmine Ahluwalia, Rhishikesh Bargaje and Deeksha Bhartiya for evaluating the database. This work was supported by funding from the Council of Scientific and Industrial Research (CSIR), India, through project NWP0036 and a Senior Research Fellowship from CSIR to MH. Comments from anonymous reviewers, which have improved the manuscript, are also acknowledged.
- He L, Hannon GJ: MicroRNAs: small RNAs with a big role in gene regulation. Nat Rev Genet 2004, 5: 522–531. 10.1038/nrg1379
- Scaria V, Hariharan M, Pillai B, Maiti S, Brahmachari SK: Host-virus genome interactions: macro roles for microRNAs. Cell Microbiol 2007, 9: 2784–2794. 10.1111/j.1462-5822.2007.01050.x
- Pillai RS, Bhattacharyya SN, Filipowicz W: Repression of protein synthesis by miRNAs: how many mechanisms? Trends Cell Biol 2007, 17: 118–126. 10.1016/j.tcb.2006.12.007
- Lai EC: Predicting and validating microRNA targets. Genome Biol 2004, 5: 115. 10.1186/gb-2004-5-9-115
- Brennecke J, Stark A, Russell RB, Cohen SM: Principles of microRNA-target recognition. PLoS Biol 2005, 3: e85. 10.1371/journal.pbio.0030085
- Doench JG, Sharp PA: Specificity of microRNA target selection in translational repression. Genes Dev 2004, 18: 504–511. 10.1101/gad.1184404
- Robins H, Li Y, Padgett RW: Incorporating structure to predict microRNA targets. Proc Natl Acad Sci USA 2005, 102: 4006–4009. 10.1073/pnas.0500775102
- Zhao Y, Samal E, Srivastava D: Serum response factor regulates a muscle-specific microRNA that targets Hand2 during cardiogenesis. Nature 2005, 436: 214–220. 10.1038/nature03817
- Kertesz M, Iovino N, Unnerstall U, Gaul U, Segal E: The role of site accessibility in microRNA target recognition. Nat Genet 2007, 39: 1278–1284. 10.1038/ng2135
- Long D, Lee R, Williams P, Chan CY, Ambros V, Ding Y: Potent effect of target structure on microRNA function. Nat Struct Mol Biol 2007, 14: 287–294. 10.1038/nsmb1226
- Thadani R, Tammi MT: MicroTar: predicting microRNA targets from RNA duplexes. BMC Bioinformatics 2006, 7(Suppl 5):S20. 10.1186/1471-2105-7-S5-S20
- Muckstein U, Tafer H, Hackermuller J, Bernhart SH, Stadler PF, Hofacker IL: Thermodynamics of RNA-RNA binding. Bioinformatics 2006, 22: 1177–1182. 10.1093/bioinformatics/btl024
- Abelson JF, Kwan KY, O'Roak BJ, Baek DY, Stillman AA, Morgan TM, Mathews CA, Pauls DL, Rasin MR, Gunel M, et al.: Sequence variants in SLITRK1 are associated with Tourette's syndrome. Science 2005, 310: 317–320. 10.1126/science.1116502
- Clop A, Marcq F, Takeda H, Pirottin D, Tordoir X, Bibe B, Bouix J, Caiment F, Elsen JM, Eychenne F, et al.: A mutation creating a potential illegitimate microRNA target site in the myostatin gene affects muscularity in sheep. Nat Genet 2006, 38: 813–818. 10.1038/ng1810
- Saunders MA, Liang H, Li WH: Human polymorphism at microRNAs and microRNA target sites. Proc Natl Acad Sci USA 2007, 104(9):3300–5. 10.1073/pnas.0611347104
- Yu Z, Li Z, Jolicoeur N, Zhang L, Fortin Y, Wang E, Wu M, Shen SH: Aberrant allele frequencies of the SNPs located in microRNA target sites are potentially associated with human cancers. Nucleic Acids Res 2007, 35: 4535–4541. 10.1093/nar/gkm480
- Martin MM, Buckenberger JA, Jiang J, Malana GE, Nuovo GJ, Chotani M, Feldman DS, Schmittgen TD, Elton TS: The human angiotensin II type 1 receptor +1166 A/C polymorphism attenuates microrna-155 binding. J Biol Chem 2007, 282: 24262–24269. 10.1074/jbc.M701050200
- Mishra PJ, Humeniuk R, Mishra PJ, Longo-Sorbello GS, Banerjee D, Bertino JR: A miR-24 microRNA binding-site polymorphism in dihydrofolate reductase gene leads to methotrexate resistance. Proc Natl Acad Sci USA 2007, 104(33):13513–8. 10.1073/pnas.0706217104
- Griffiths-Jones S, Saini HK, van Dongen S, Enright AJ: miRBase: tools for microRNA genomics. Nucleic Acids Res 2008, 36: D154-D158. 10.1093/nar/gkm952
- Flicek P, et al.: Ensembl 2008. Nucleic Acids Res 2008, 36: D707–14. 10.1093/nar/gkm988
- Rajewsky N: microRNA target predictions in animals. Nat Genet 2006, 38(Suppl):S8–13. 10.1038/ng1798
- Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS: MicroRNA targets in Drosophila. Genome Biol 2003, 5: R1. 10.1186/gb-2003-5-1-r1
- Rehmsmeier M, Steffen P, Hochsmann M, Giegerich R: Fast and effective prediction of microRNA/target duplexes. RNA 2004, 10: 1507–1517. 10.1261/rna.5248604
- Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB: Prediction of mammalian microRNA targets. Cell 2003, 115: 787–798. 10.1016/S0092-8674(03)01018-3
- Hofacker IL, Fontana W, Stadler PF, Bonhoeffer S, Tacker M, Schuster P: Fast Folding and Comparison of RNA Secondary Structures. Monatshefte f Chemie 1994, 125: 167–188. 10.1007/BF00818163
- Gardner PP, Giegerich R: A comprehensive comparison of comparative RNA structure prediction approaches. BMC Bioinformatics 2004, 5: 140. 10.1186/1471-2105-5-140
- Sethupathy P, Corda B, Hatzigeorgiou AG: TarBase: A comprehensive database of experimentally supported animal microRNA targets. RNA 2006, 12: 192–197. 10.1261/rna.2239606
- Cimmino A, et al.: miR-15 and miR-16 induce apoptosis by targeting BCL2. Proc Natl Acad Sci USA 2005, 102: 3944–9. 10.1073/pnas.0506654102
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.