Fig. 1 | BMC Bioinformatics

Fig. 1

From: TranscriptomeReconstructoR: data-driven annotation of complex transcriptomes

Outline of the TranscriptomeReconstructoR concept.

A Genomic coordinates of TSS and PAS are called from 5' and 3' tag sequencing data, respectively. Terminal subalignments of long reads are extended towards the summits of nearby TSS and PAS (within 100 bp on the same strand). An extended read is considered complete if its 5' and 3' ends overlap with a TSS and a PAS, respectively.

B Overlapping subalignments of complete reads are clustered together if the pairwise distances between their borders do not exceed 10 bp. Within each cluster, the coordinates of subalignments are unified to the most frequently observed values.

C Long reads sharing the same TSS and PAS are grouped together. Subalignments present in more than 50% of reads within a group are considered constitutive exons. Alignment errors are detected by comparing the subalignments of each long read to the set of constitutive exons. A subalignment with an alternative 3' or 5' border is considered a novel alternative exon only if the next subalignment in the same read precisely matches the next constitutive exon. Otherwise, if the next constitutive exon is absent from the read, the tested subalignment is marked as a potential alignment error.

D Continuous intervals of nascent transcription (green) cover a larger fraction of the genome than the transcripts and genes called from steady-state long read RNA-seq data (grey). Nascent transcription intervals that do not overlap with regions of mature RNA production on the same strand are classified as either read-through (RT) tails or transient RNAs.
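Two of the rules in the caption can be illustrated with a minimal Python sketch: the 100 bp extension of read ends towards TSS/PAS summits (panel A) and the "more than 50% of reads" criterion for constitutive exons (panel C). This is not the package's code. The function names, the tuple-based data layout, and the choice of the nearest summit when several lie within range are assumptions made for illustration. Completeness is also simplified here to "both ends could be extended to a summit", whereas the caption defines it as overlap with the called TSS and PAS.

```python
from collections import Counter

MAX_EXTENSION_BP = 100  # from the caption: summits within 100 bp on the same strand


def extend_to_summit(end_pos, strand, summits, max_dist=MAX_EXTENSION_BP):
    """Return the nearest same-strand summit within max_dist of end_pos, or None.

    summits is a list of (position, strand) tuples (hypothetical layout).
    Picking the nearest summit is an assumption; the caption does not say
    how ties or multiple candidates are resolved.
    """
    candidates = [
        pos for pos, s in summits
        if s == strand and abs(pos - end_pos) <= max_dist
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda pos: abs(pos - end_pos))


def is_complete(five_prime_end, three_prime_end, strand, tss, pas):
    """Simplified completeness test: both read ends reach a summit."""
    return (
        extend_to_summit(five_prime_end, strand, tss) is not None
        and extend_to_summit(three_prime_end, strand, pas) is not None
    )


def constitutive_exons(reads, min_fraction=0.5):
    """Return exons found in strictly more than min_fraction of the reads.

    reads is a list of reads sharing one TSS and PAS; each read is a list of
    (start, end) subalignments whose borders are already unified (panel B).
    """
    counts = Counter(exon for read in reads for exon in set(read))
    return {exon for exon, n in counts.items() if n > min_fraction * len(reads)}


# Toy data, invented for illustration only.
tss = [(1000, "+"), (5000, "-")]
pas = [(3000, "+")]
group = [
    [(100, 200), (300, 400)],
    [(100, 200), (300, 400)],
    [(100, 200), (320, 400)],
    [(100, 200)],
]
```

With this toy data, `extend_to_summit(1040, "+", tss)` returns `1000`, and `constitutive_exons(group)` returns only `(100, 200)`: the exon `(300, 400)` occurs in exactly half of the reads, which does not satisfy "more than 50%".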