Fig. 4 | BMC Bioinformatics

Fig. 4

From: TranscriptomeReconstructoR: data-driven annotation of complex transcriptomes

Novel genes and transient RNAs in A. thaliana. A Stacked barplot showing the counts of novel HC, MC and LC genes and transient RNAs that have no overlap with any known gene or lncRNA on the same strand. Novel transcription units that overlap any known gene on the opposite strand are considered antisense (red); all others are considered intergenic (blue). B Metagene plot of pNET-seq signal over the whole bodies of novel HC, MC and LC genes and novel transient RNAs. Because the called genes vary in length, each gene body was scaled to 100 bins. The 100 bp upstream and downstream flanking regions were scaled to 20 bins each. The vertical lines in the plotting area denote the starts and ends of novel genes. Red and blue lines show the average RNAPII elongation activity in novel genes and transient RNAs, respectively. Red and blue shaded areas show the normal-based 95% confidence intervals for the respective means. C Example of a novel gene encoding a stable transcript. Features on the forward and reverse strands are shown in red and blue, respectively. HC_gene_10019 is a High Confidence gene that was called on the forward strand, i.e. in antisense orientation to a lncRNA-encoding locus (denoted as AT3G05945 in Araport11, and as LincRNA_1146 in the annotation of Zhao et al. [32]). This novel gene is supported by ONT Direct RNA-seq (not shown), PAT-seq, CAGE-seq and plaNET-seq. Because the first two methods depend on the presence of a poly(A) tail, the transcript is most probably polyadenylated. Moreover, the gene was validated by three independent datasets (pNET-seq, chrRNA-seq and RNA-seq). Given that the gene is clearly visible even in RNA-seq data, it remains unclear why it is absent from both the TAIR10 and Araport11 annotations.
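
The scaling described for panel B can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`scale_to_bins`, `metagene_row`, `mean_with_ci`) are hypothetical, the input is assumed to be a per-base signal vector that already includes the 100 bp flanks and is oriented 5' to 3', and the "normal-based" interval is assumed here to mean the mean plus or minus 1.96 standard errors.

```python
import numpy as np


def scale_to_bins(signal, n_bins):
    """Average a per-base signal into n_bins nearly equal-width bins."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_bins:
        # Assumption: regions shorter than the bin count are rejected
        # rather than interpolated.
        raise ValueError("region is shorter than the number of bins")
    # array_split tolerates lengths that are not a multiple of n_bins.
    return np.array([chunk.mean() for chunk in np.array_split(signal, n_bins)])


def metagene_row(signal, flank=100, body_bins=100, flank_bins=20):
    """Scale one gene (upstream flank + body + downstream flank) to fixed width."""
    signal = np.asarray(signal, dtype=float)
    upstream = signal[:flank]
    body = signal[flank:signal.size - flank]
    downstream = signal[signal.size - flank:]
    return np.concatenate([
        scale_to_bins(upstream, flank_bins),
        scale_to_bins(body, body_bins),
        scale_to_bins(downstream, flank_bins),
    ])


def mean_with_ci(matrix, z=1.96):
    """Per-bin mean and normal-approximation 95% confidence interval."""
    matrix = np.asarray(matrix, dtype=float)
    mean = matrix.mean(axis=0)
    # Standard error of the mean, using the sample standard deviation.
    sem = matrix.std(axis=0, ddof=1) / np.sqrt(matrix.shape[0])
    return mean, mean - z * sem, mean + z * sem


# Toy input: two genes of different length with constant signal.
gene_a = np.full(100 + 200 + 100, 1.0)  # 200 bp body
gene_b = np.full(100 + 500 + 100, 3.0)  # 500 bp body
rows = np.vstack([metagene_row(gene_a), metagene_row(gene_b)])
mean, lower, upper = mean_with_ci(rows)
```

Each row has 140 columns (20 + 100 + 20), so genes of different lengths can be averaged bin by bin. For genes on the reverse strand, the signal vector would need to be reversed before binning so that all rows share the same orientation.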
