AUG_hairpin: prediction of a downstream secondary structure influencing the recognition of a translation start site
© Kochetov et al; licensee BioMed Central Ltd. 2007
Received: 25 January 2007
Accepted: 30 August 2007
Published: 30 August 2007
The translation start site plays an important role in the control of translation efficiency of eukaryotic mRNAs. The recognition of the start AUG codon by eukaryotic ribosomes is considered to depend on its nucleotide context. However, the fraction of eukaryotic mRNAs with the start codon in a suboptimal context is relatively large. It may be expected that mRNA should possess some features providing efficient translation, including the proper recognition of a translation start site. It has been experimentally shown that a downstream hairpin located in certain positions with respect to the start codon can compensate in part for a suboptimal AUG context and can also increase translation from non-AUG initiation codons. Prediction of such a compensatory hairpin may be useful in the evaluation of eukaryotic mRNA translation properties.
We evaluated interdependency between the start codon context and mRNA secondary structure at the CDS beginning: it was found that a suboptimal start codon context significantly correlated with higher base pairing probabilities at positions 13 – 17 of CDS of human and mouse mRNAs. It is likely that the downstream hairpins are used to enhance translation of some mammalian mRNAs in vivo. Thus, we have developed a tool, AUG_hairpin, to predict local stem-loop structures located within the defined region at the beginning of mRNA coding part. The implemented algorithm is based on the available published experimental data on the CDS-located stem-loop structures influencing the recognition of upstream start codons.
An occurrence of a potential secondary structure downstream of start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) may provide researchers with a testable assumption on the presence of additional regulatory signal influencing mRNA translation initiation rate and the start codon choice. AUG_hairpin, which has a convenient Web-interface with adjustable parameters, will make such an evaluation easy and efficient.
Translation of most eukaryotic mRNAs is likely to be initiated by a linear scanning mechanism, although some alternative mechanisms are also possible [2, 3]. According to the scanning model, 40S ribosomal subunits can either initiate translation at the 5'-proximal AUG codon in a suboptimal context or miss it and initiate translation at downstream AUG(s). For mammalian mRNAs, the most important elements of AUG context are the adenine at position -3 and guanine at position +4 [1, 4]. One might expect that mRNA should possess some features providing efficient translation, including the recognition of a genuine translation start site (TSS). However, the fraction of eukaryotic mRNAs with the start AUG codon in a suboptimal context is relatively large [5, 6]. It is likely that at least some mRNAs with a suboptimal context of annotated start codon contain other signals providing additional information for efficient TSS recognition (e.g., ).
This hypothetical mechanism has not yet been fully verified by in vivo experiments. However, it has often been suggested that such secondary structures can be used to modulate the translation efficiency of certain viral and cellular mRNAs with a suboptimal context of start AUG codons [10–14] or with non-AUG start codons [15–17]. For example, the efficiency of a TSS in a suboptimal context in Dengue virus mRNA is increased by a hairpin of moderate stability located 17 nucleotides downstream. The hairpin effect on TSS recognition depended on its stability and position with respect to AUG. It was also recently found that the stable hairpin located downstream of the start AUG codon in Sindbis virus subgenomic 26S RNA provides efficient translation even though eIF2alpha is phosphorylated and translation of most cellular mRNAs is blocked. The authors hypothesized that the hairpin can stall the ribosomes on the correct site to initiate translation, thus bypassing the requirement for a functional eIF2 and, thereby, specifically supporting translation of certain viral and (probably) cellular mRNAs. Notably, this function of a downstream hairpin is related to general cellular translation control rather than to compensation for the "weakness" of the upstream start codon context. Thus, information on the presence of potential compensatory hairpins may be useful for further experimental investigation of both general and specific mRNA translational properties.
Here, we describe a computational tool, AUG_hairpin, for predicting secondary structure elements that may compensate for a suboptimal context of the translational start codon. We also analyzed the structural features of human and mouse mRNAs and found a significant correlation between the base pairing probabilities at positions 13–17 of the CDS and the TSS context. This relationship supports the hypothesis that precisely located downstream hairpins are functionally significant for TSS recognition in some cellular mRNAs.
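As a rough illustration of what "start codon context" means in practice, the sketch below reads the nucleotides at positions -3 and +4 from separately supplied 5'-UTR and CDS sequences. It assumes the usual numbering in which the A of AUG is +1 and there is no position 0. The function name and the returned fields are hypothetical and are not part of AUG_hairpin.

```python
def start_codon_context(utr5: str, cds: str) -> dict:
    """Report the nucleotides at positions -3 and +4 around the start codon.

    Assumption: the A of the start codon is +1, so -3 is the third
    nucleotide upstream of the codon and +4 is the first one after it.
    """
    utr5 = utr5.upper().replace("T", "U")
    cds = cds.upper().replace("T", "U")
    if len(utr5) < 3 or len(cds) < 4:
        raise ValueError("need at least 3 nt of 5'-UTR and 4 nt of CDS")
    minus3 = utr5[-3]
    plus4 = cds[3]
    return {
        "start_codon": cds[:3],
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in "AG",  # Pu vs. Py at -3
        "g_at_plus4": plus4 == "G",
    }


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    print(start_codon_context("GCCACC", "ATGGCG"))
    print(start_codon_context("GCCUCC", "AUGACG"))
```

A sequence with a pyrimidine at -3 would be the kind of case in which a downstream hairpin is worth looking for.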
According to the experimental data, hairpins starting either upstream or downstream of a certain "critical" region of the CDS did not compensate for a "weak" AUG context. In particular, a continuous secondary structure starting at the 5th nucleotide of the coding sequence did not increase translation initiation efficiency even though it included the critical 12th and 18th positions (Fig. 1; ). Based on this observation, AUG_hairpin predicts stem-loop structures whose 5'-borders are located within the critical region (from the 12th to the 18th nucleotide by default). An appropriate stem-loop structure can also be part of a more complex secondary structure that starts upstream of the critical region. We hypothesized that the 40S ribosomal subunit moving from the 5'-end of the mRNA can pause consecutively at each stable stem of a complex stem-loop structure, waiting for it to melt. In this case we assumed that an eligible hairpin has to be separated from upstream secondary structure elements by some unpaired segment (e.g., a loop) (for a detailed description, see the tutorial at the program web sites).
The 5'-UTR and CDS nucleotide sequences have to be entered separately through the web page (this provides the program with information on the start codon position). AUG_hairpin analyzes the mRNA segment compiled from the 10-nucleotide 5'-UTR portion located immediately upstream of the start AUG codon and the 100 5'-proximal nucleotides of the CDS. The algorithm consists of the following main steps: (1) Prediction of the optimal RNA secondary structure for the 5'-UTR-CDS fragment. For this purpose the program foldRNA from the Vienna RNA package v.1.4 was used as a subroutine. (2) Checking for the occurrence of a perfect (or imperfect) stem located a certain distance downstream of the start AUG codon (from the 12th to the 18th nucleotide by default; the user can change this range from the beginning of the CDS up to the 30th nucleotide). Conventionally, a stem is perfect when it does not contain any interrupting loops; an imperfect stem includes short mismatches (one-nucleotide bulges or 1+1 inner loops) which presumably do not interrupt stacking interactions. The program's output visualizes the optimal secondary structure and reports the predicted thermodynamic stability of the whole structure and of the helices (if any) that start within the defined critical region. The program was written in C++ and runs in a Unix environment.
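The segment assembly and the stem check described above can be sketched as follows. This is only an illustration under stated assumptions, not the AUG_hairpin source: the structure is taken as a dot-bracket string already produced by a folding program, no energies are computed, and the rule about separation from upstream structure elements is not reproduced. All function names are hypothetical.

```python
def build_segment(utr5, cds, upstream=10, downstream=100):
    """Join the last `upstream` nt of the 5'-UTR with the first `downstream`
    nt of the CDS. Returns (segment, index of the first CDS nucleotide)."""
    utr_part = utr5[-upstream:] if upstream > 0 else ""
    return utr_part + cds[:downstream], len(utr_part)


def pair_table(structure):
    """Map each position of a dot-bracket string to its partner (or -1)."""
    table = [-1] * len(structure)
    stack = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            j = stack.pop()
            table[i], table[j] = j, i
        elif ch != ".":
            raise ValueError("unexpected character: " + ch)
    if stack:
        raise ValueError("unbalanced structure")
    return table


def find_stems(structure, imperfect=False):
    """Return stems as lists of 0-based (i, j) pairs, outermost pair first.

    Perfect stems consist of directly stacked pairs only. With
    imperfect=True, one-nucleotide bulges and 1+1 inner loops are tolerated.
    """
    table = pair_table(structure)
    steps = [(0, 0)]
    if imperfect:
        steps += [(1, 0), (0, 1), (1, 1)]
    used = set()
    stems = []
    for i, j in enumerate(table):
        if j <= i or i in used:  # unpaired, a 3' partner, or already in a stem
            continue
        stem = [(i, j)]
        while True:
            a, b = stem[-1]
            for da, db in steps:
                na, nb = a + 1 + da, b - 1 - db
                if na < nb and table[na] == nb:
                    stem.append((na, nb))
                    break
            else:
                break  # no continuation found, the stem ends here
        used.update(p[0] for p in stem)
        stems.append(stem)
    return stems


def eligible_stems(structure, cds_offset, first=12, last=18, imperfect=False):
    """List (CDS position of the 5' border, number of pairs) for stems whose
    5' border lies in CDS positions first..last (1-based, inclusive)."""
    hits = []
    for stem in find_stems(structure, imperfect):
        start_in_cds = stem[0][0] - cds_offset + 1
        if first <= start_in_cds <= last:
            hits.append((start_in_cds, len(stem)))
    return hits
```

For example, with a 10-nucleotide 5'-UTR portion (`cds_offset = 10`), a five-pair stem whose first paired nucleotide sits at segment index 21 is reported as starting at CDS position 12. Widening `first` and `last` mimics the user-adjustable range mentioned in the text.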
Results and discussion
Table 1. The difference in average BPP values and G+C content at CDS positions 6–30 of human and mouse cDNAs characterized by optimal and suboptimal start codon contexts (purine (Pu) and pyrimidine (Py), respectively, at position -3 upstream of AUG).
It may be speculated that this difference results from a local deviation in G+C content (GC pairs make the major impact on the secondary structure stability) or from some codon-dependent periodic pattern [23, 24]. However, average positional differences in G+C frequencies do not correlate with the difference in BPP values: it is unlikely that the observed dependency between the TSS context and base pairing probabilities reflects an unusual G+C distribution at the CDS beginning rather than the more frequent representation of the hairpin-containing mRNAs in a sample with a suboptimal TSS context (Table 1). These data demonstrate that precisely positioned hairpins may increase translation efficiency of some mammalian mRNAs in vivo and that the positions defined by Kozak (; from 13th to 17th nucleotide of CDS) are most frequently (or efficiently) used for this purpose.
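For readers who wish to repeat the G+C part of this comparison on their own sequence sets, a minimal sketch is given below. It computes the G+C frequency at each CDS position for two groups of sequences and the per-position difference between them. The function names, the default position range and the toy sequences are illustrative assumptions. The base pairing probabilities themselves would have to come from a partition-function folding program and are not computed here.

```python
def gc_frequency(cds_list, first=6, last=30):
    """Fraction of G or C at each CDS position first..last (1-based, inclusive).

    Sequences shorter than a given position are skipped for that position.
    """
    freqs = {}
    for pos in range(first, last + 1):
        bases = [s[pos - 1].upper() for s in cds_list if len(s) >= pos]
        if bases:
            freqs[pos] = sum(b in "GC" for b in bases) / len(bases)
        else:
            freqs[pos] = float("nan")
    return freqs


def positional_difference(group_a, group_b, first=6, last=30):
    """Per-position G+C frequency of group_a minus that of group_b."""
    fa = gc_frequency(group_a, first, last)
    fb = gc_frequency(group_b, first, last)
    return {pos: fa[pos] - fb[pos] for pos in fa}


if __name__ == "__main__":
    # Toy groups standing in for CDSs with Py and Pu at position -3.
    py_group = ["AUGGCC", "AUGGCA"]
    pu_group = ["AUGAAA"]
    print(positional_difference(py_group, pu_group, first=4, last=6))
```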
Table 2. The difference in characteristics of 5'-end proximal hairpins starting at CDS positions 6–20 of human and mouse cDNAs characterized by optimal and suboptimal start codon contexts (purine (Pu) and pyrimidine (Py), respectively, at position -3 upstream of AUG; Etot, energy of an eligible hairpin, kcal/mol; Est, energy of the hairpin stem region, kcal/mol; Lst, size of the hairpin stem region, base pairs).
It was reported that a correctly positioned hairpin even of moderate stability (-8.2 kcal/mol) enhanced the recognition of an upstream AUG codon in a suboptimal context. Thus, it may be assumed that at least some of these mammalian mRNAs possess a higher translation initiation efficiency, due to the presence of "compensatory" hairpins, than may be expected from the context of the annotated start codon (lists of human and mouse mRNAs with some additional information on secondary structure characteristics are available as Additional file 1 and Additional file 2, respectively). It should be noted, however, that a suboptimal start codon context is not necessarily offset by a "compensatory" hairpin: in many cases, contexts of translational start sites are likely to be evolutionarily attenuated to decrease the translation level of mRNAs encoding regulatory proteins (e.g., ).
The presence of a potential secondary structure downstream of start AUG codon in a suboptimal context (or downstream of a potential non-AUG start codon) can provide researchers with a testable assumption on the additional regulatory element influencing translation initiation level. AUG_hairpin is based on an elegant hypothesis supported by the in vitro  and in vivo experimental data [10, 12–14] as well as the results of computational analysis (; Tables 1 and 2 in this manuscript).
It should be noted that the applied algorithm depends on the interpretation of the available (rather limited) experimental data, so the prediction accuracy may also be limited. Only a few hairpin positions have been tested in experiments. Secondary structure elements influencing translation start site recognition in vivo may have distinct characteristics (e.g., species-specific or tissue-specific). Currently it is also not possible to predict the interdependence of hairpin stability and its influence on start codon recognition, or the influence of mRNA-protein and mRNA-ribosome interactions during translation initiation on the mRNA secondary structure. Finally, the recognition of start codons in a suboptimal context can be modulated through other (currently poorly known) signals [7, 25–28], and the absence of a "compensatory" hairpin does not necessarily mean that TSS recognition is inefficient. However, despite these limitations, AUG_hairpin may be used to reveal potential "compensatory" hairpins in cases of discrepancy between the gene expression pattern and mRNA features (e.g., a highly expressed gene is characterized by a suboptimal context of the annotated translation start site, or proteomic or phylogenetic data suggest the usage of potential non-AUG start codons, etc.).
Availability and requirements
Project name: AUG_hairpin
Project home pages:
Any restrictions to use by non-academics: licence needed
UTR: untranslated region of mRNA
CDS: protein coding sequence
BPP: base pairing probability.
This work was supported by the Russian Foundation for Basic Research (grant No. 05-04-48207) and RAS programs (Dynamics of Gene Pools). A.V.K., N.A.K. and D.G. thank Siberian Division of Russian Academy of Sciences (Complex Integration Program 5.3) and RAS Program "Molecular and Cellular Biology" for partial support. A.V.K. is also grateful to JSPS short-term fellowship program.
- Kozak M: Regulation of translation via mRNA structure in prokaryotes and eukaryotes. Gene 2005, 361: 13–37. doi:10.1016/j.gene.2005.06.037
- Jackson RJ: Alternative mechanisms of initiating translation of mammalian mRNAs. Biochem Soc Trans 2005, 33: 1231–1241. doi:10.1042/BST20051282
- Baird SD, Turcotte M, Korneluk RG, Holcik M: Searching for IRES. RNA 2006, 12: 1755–1785. doi:10.1261/rna.157806
- Pisarev AV, Kolupaeva VG, Pisareva VP, Merrick WC, Hellen CU, Pestova TV: Specific functional interactions of nucleotides at key -3 and +4 positions flanking the initiation codon with components of the mammalian 48S translation initiation complex. Genes Dev 2006, 20: 624–636. doi:10.1101/gad.1397906
- Rogozin IB, Kochetov AV, Kondrashov FA, Koonin EV, Milanezi L: Presence of ATG triplets in 5' untranslated regions of eukaryotic cDNAs correlates with a "weak" context of the start codon. Bioinformatics 2001, 17: 890–900. doi:10.1093/bioinformatics/17.10.890
- Kochetov AV, Sarai A, Rogozin IB, Shumny VK, Kolchanov NA: The role of alternative translation start sites in generation of human protein diversity. Mol Genet Genomics 2005, 273: 491–496. doi:10.1007/s00438-005-1152-7
- Kochetov AV: AUG codons at the beginning of protein coding sequences are frequent in eukaryotic mRNAs with a suboptimal start codon context. Bioinformatics 2005, 21: 837–840. doi:10.1093/bioinformatics/bti136
- Kozak M: Downstream secondary structure facilitates recognition of initiator codons by eukaryotic ribosomes. Proc Natl Acad Sci USA 1990, 87: 8301–8305. doi:10.1073/pnas.87.21.8301
- Kozak M: Context effects and inefficient initiation at non-AUG codons in eucaryotic cell-free translation systems. Mol Cell Biol 1989, 9: 5073–5080.
- Hwang W-L, Su T-S: The encapsidation signal of hepatitis B virus facilitates preC AUG recognition resulting in inefficient translation of the downstream genes. J Gen Virol 1999, 80: 1769–1776.
- Ciullo M, Del Pozzo G, Autiero M, Guardiola J: Downstream sequence adjacent to AUG affects translation of chloramphenicol acetyl transferase in eukaryotic cells. DNA Cell Biol 2000, 19: 39–46. doi:10.1089/104454900314690
- Kwon HS, Lee DK, Lee JJ, Edenberg HJ, Ahn YH, Hur MW: Posttranscriptional regulation of human ADH5/FDH and Myf6 gene expression by upstream AUG codons. Arch Biochem Biophys 2001, 386: 163–171. doi:10.1006/abbi.2000.2205
- Yang L, Chen J, Chang CC, Yang XY, Wang ZZ, Chang TY, Li BL: A stable upstream stem-loop structure enhances selection of the first 5'-ORF-AUG as a main start codon for translation initiation of human ACAT1 mRNA. Acta Biochim Biophys Sin 2004, 36: 259–268.
- Clyde K, Harris E: RNA secondary structure in the coding region of dengue virus type 2 directs translation start codon selection and is required for viral replication. J Virol 2006, 80: 2170–2182. doi:10.1128/JVI.80.5.2170-2182.2006
- Riechmann JL, Ito T, Meyerowitz EM: Non-AUG initiation of AGAMOUS mRNA translation in Arabidopsis thaliana. Mol Cell Biol 1999, 19: 8505–8512.
- Nguyen M, He B, Karaplis A: Nuclear forms of parathyroid hormone-related peptide are translated from non-AUG start sites downstream from the initiator methionine. Endocrinology 2001, 142: 694–703. doi:10.1210/en.142.2.694
- Takahashi K, Maruyama M, Tokuzawa Y, Murakami M, Oda Y, Yoshikane N, Makabe KW, Ichisaka T, Yamanaka S: Evolutionarily conserved non-AUG translation initiation in NAT1/p97/DAP5 (EIF4G2). Genomics 2005, 85: 360–371. doi:10.1016/j.ygeno.2004.11.012
- Ventoso I, Sanz MA, Molina S, Berlanga JJ, Carrasco L, Esteban M: Translational resistance of late alphavirus mRNA to eIF2alpha phosphorylation: a strategy to overcome the antiviral effect of protein kinase PKR. Genes Dev 2006, 20: 87–100. doi:10.1101/gad.357006
- Hofacker IL: Vienna RNA secondary structure server. Nucleic Acids Res 2003, 31: 3429–3431. doi:10.1093/nar/gkg599
- Kobayashi Y, Dokiya Y, Kumazawa Y, Sugita M: Non-AUG translation initiation of mRNA encoding plastid-targeted phage-type RNA polymerase in Nicotiana sylvestris. Biochem Biophys Res Comm 2002, 299: 57–61. doi:10.1016/S0006-291X(02)02579-2
- Kochetov AV, Kolchanov NA, Sarai A: Interrelations between the efficiency of translation start sites and other sequence features of yeast mRNAs. Mol Genet Genomics 2003, 270: 442–447. doi:10.1007/s00438-003-0941-0
- McCaskill JS: The equilibrium partition function and base pair binding probabilities for RNA secondary structure. Biopolymers 1990, 29: 1105–1119. doi:10.1002/bip.360290621
- Meyer IM, Miklos I: Statistical evidence for conserved, local secondary structure in the coding regions of eukaryotic mRNAs and pre-mRNAs. Nucleic Acids Res 2005, 33: 6338–6348. doi:10.1093/nar/gki923
- Shabalina SA, Ogurtsov AY, Spiridonov NA: A periodic pattern of mRNA secondary structure created by the genetic code. Nucleic Acids Res 2006, 34: 2428–2437. doi:10.1093/nar/gkl287
- Lukaszewicz M, Feuermann M, Jerouville B, Stas A, Boutry M: In vivo evaluation of the context sequence of the translation initiation codon in plants. Plant Sci 2000, 154: 89–98. doi:10.1016/S0168-9452(00)00195-3
- Sawant SV, Kiran K, Singh PK, Tuli R: Sequence architecture downstream of the initiator codon enhances gene expression and protein stability in plants. Plant Physiol 2001, 126: 1630–1636. doi:10.1104/pp.126.4.1630
- Zhao K-N, Tomlison L, Liu WJ, Gu W, Frazer IH: Effects of additional sequences directly downstream from the AUG on the expression of GFP gene. Biochim Biophys Acta 2003, 1630: 84–95.
- Shabalina SA, Ogurtsov AY, Rogozin IB, Koonin EV, Lipman DJ: Comparative analysis of orthologous eukaryotic mRNAs: potential hidden functional signals. Nucleic Acids Res 2004, 32: 1774–1782. doi:10.1093/nar/gkh313
- Touriol C, Bornes S, Bonnal S, Audigier S, Prats H, Prats AC, Vagner S: Generation of protein isoform diversity by alternative initiation of translation at non-AUG codons. Biol Cell 2003, 95: 169–178. doi:10.1016/S0248-4900(03)00033-9
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.