Many classes of 18-30 nt small non-coding RNAs (sRNAs) can be characterized based on their functions in gene regulation and epigenetic control in plants, animals and fungi [1, 2].
Identifying the complete set of miRNAs and other small regulatory RNAs in an organism is essential to our understanding of genome organization, genome biology, and evolution. There are three important classes of endogenous small RNAs in plants, animals, and fungi: microRNAs (miRNAs), short interfering RNAs (siRNAs), and piwi-interacting RNAs (piRNAs). No piRNAs are known in plants.
MicroRNAs (miRNAs) are small regulatory RNAs of 18-24 nucleotides that play important roles in post-transcriptional gene regulation by directing the degradation of mRNAs or repressing the translation of targeted genes [4, 5]. Whereas siRNAs are processed from longer double-stranded RNA molecules and represent both strands of the RNA, miRNAs originate from hairpin precursors formed from a single RNA strand [6, 7]. The hairpin precursors (pre-miRNAs) are typically ~60-70 nt long in animals but somewhat longer, ~90-140 nt, in plants. In plants, the miRNA gene is first transcribed by RNA polymerase II into a primary transcript (pri-miRNA). The Dicer-like enzyme DCL1 cleaves the pri-miRNA into a miRNA precursor (pre-miRNA), which forms a characteristic hairpin structure [7, 8]. The pre-miRNA is further cleaved into a miRNA duplex (miRNA:miRNA*), a short double-stranded RNA (dsRNA). The duplex is then exported to the cytoplasm by HASTY, the plant ortholog of exportin-5. With the help of AGO1, the single-stranded mature miRNA is incorporated into an RNA-protein complex, the RNA-induced silencing complex (RISC), which negatively regulates gene expression by inhibiting translation or degrading mRNAs through perfect or near-perfect complementarity to the target mRNAs [10, 11].
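To make the notion of perfect or near-perfect complementarity concrete, the following minimal Python sketch counts mismatches between a miRNA and a candidate target site of equal length. It is an illustration only, not the scoring scheme of any particular target prediction tool: the function names, the example sequence, and the `max_mismatches=3` threshold are assumptions, and G:U wobble pairs, bulges, and position-specific penalties are ignored.

```python
# Watson-Crick partners for RNA bases (G:U wobble pairs are ignored here).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def count_mismatches(mirna, target_site):
    """Count non-pairing positions between a miRNA and a target site.

    Both sequences are written 5'->3' and must have the same length.
    A perfectly complementary site equals the reverse complement of the miRNA.
    """
    if len(mirna) != len(target_site):
        raise ValueError("miRNA and target site must have the same length")
    perfect_site = reverse_complement(mirna)
    return sum(1 for a, b in zip(perfect_site, target_site.upper()) if a != b)


def is_near_perfect(mirna, target_site, max_mismatches=3):
    """Hypothetical cutoff: accept sites with at most `max_mismatches` mismatches."""
    return count_mismatches(mirna, target_site) <= max_mismatches


if __name__ == "__main__":
    mirna = "AUGCUAGCUAGGCUAACGUAC"      # arbitrary 21-nt example sequence
    site = reverse_complement(mirna)     # perfectly complementary site
    print(count_mismatches(mirna, site))
    print(is_near_perfect(mirna, site))
```

A real target search would slide such a comparison along each transcript and would usually weight mismatches by position; the fixed-length comparison here only shows the underlying base-pairing logic.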
Although some soybean miRNAs have been identified previously, their number is small, and the catalog of soybean miRNAs is therefore far from complete. The aim of this study is to expand the collection of miRNAs expressed in soybean using deep sequencing on the Illumina Solexa platform. To this end, we generated Solexa cDNA sequencing data for root, nodule and flower tissues, organs that are relevant to many studies in legume biology and that affect soybean yield. A legume-specific trait is the symbiosis between the legume root and soil bacteria, which leads to nodule formation. We reasoned that the small RNA content of soybean nodules needs to be established because research in other legume species has shown a role for small RNAs in nodule development [13, 14]. Root tissue is another important organ to analyze because of its role in nutrient and water absorption, which is clearly important to soybean yield. Finally, we selected flower for its direct impact on soybean seed yield. We constructed small RNA libraries from four different soybean tissues and sequenced each library individually, generating over one million sequences per library. We developed a bioinformatics pipeline, combining in-house scripts with publicly available RNA structure prediction tools, to distinguish authentic mature miRNA sequences from the other small RNAs and short RNA fragments present in the sequencing data. We also analyzed the predicted miRNA target genes in detail and correlated the miRNA expression data with the expression of the corresponding target genes, again using Solexa cDNA sequencing data.
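The pipeline itself is not detailed in this section. As a hedged illustration of a typical early step in small RNA sequencing analysis, the sketch below collapses raw reads into unique sequences with read counts and keeps only those in the 18-24 nt size range given above for miRNAs. The function name `collapse_reads`, its parameters, and the decision to discard reads containing ambiguous bases are assumptions made for this example, not a description of the scripts used in this study.

```python
from collections import Counter


def collapse_reads(reads, min_len=18, max_len=24):
    """Collapse reads into unique sequences with counts.

    Reads are expected in the DNA alphabet, as produced by cDNA sequencing.
    Reads outside [min_len, max_len] or containing characters other than
    A, C, G, T (for example N) are discarded. This filter is an assumption.
    """
    counts = Counter()
    for read in reads:
        seq = read.strip().upper()
        if min_len <= len(seq) <= max_len and set(seq) <= set("ACGT"):
            counts[seq] += 1
    return counts


if __name__ == "__main__":
    # Toy reads, already adapter-trimmed (hypothetical data).
    reads = [
        "ACGTACGTACGTACGTACGT",   # 20 nt, kept
        "acgtacgtacgtacgtacgt",   # same sequence in lower case, kept
        "ACGT",                   # too short, dropped
        "ACGTACGTACGTACGTACGN",   # ambiguous base, dropped
        "A" * 30,                 # too long, dropped
    ]
    for seq, n in collapse_reads(reads).most_common():
        print(seq, n)
```

Collapsing reads first keeps later steps, such as genome mapping and hairpin structure prediction, from repeating work on identical sequences, and the per-sequence counts can serve as a raw expression measure for each library.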