Fig. 2 | BMC Bioinformatics

Fig. 2

From: SLIDR and SLOPPR: flexible identification of spliced leader trans-splicing and prediction of eukaryotic operons from RNA-Seq data

Schematic representation of the SLOPPR pipeline (Spliced leader-informed operon prediction from RNA-Seq).

A) Spliced leader tails (example: SL1a, SL1b, SL2a and SL2b) are identified and trimmed from the 5′ end of reads that correspond to the 5′ end of transcripts.

B) Trimmed reads are aligned to the genome, quantified against exons (squares; grey: covered; white: not covered) and counts are summarised by gene (example: two genes A and B). Incorrect gene annotations (fused operonic genes) can optionally be identified and corrected via SL reads at internal exons (example: Gene B is split into B1 and B2).

C) SL read sets from multiple libraries (example: X, Y and Z) are ordinated via PCA on genome-wide read counts and grouped into two clusters (K-means clustering) expected to correspond to SL1 (circles) and SL2-type (squares) subfunctionalisation.

D) SL2:SL1 read ratios are computed between pre-defined SL groups (SL1, SL2) or inferred clusters (Cl1, Cl2). Operons are predicted via tracts of genes receiving SL2 bias (downstream operonic genes) plus an optional upstream gene receiving either an SL1 bias or no SLs at all.

E) Intercistronic distances among predicted operons are expected to be reduced compared to intergenic distances among non-operonic genes (others). Operon predictions can optionally be filtered by intercistronic distance using a user-supplied or inferred optimal cut-off.
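
The operon-prediction logic of panel D can be illustrated with a minimal sketch. This is not SLOPPR's actual implementation: the names `Gene`, `sl_bias` and `predict_operons`, the `min_ratio` threshold, and the requirement of at least two genes per operon are all assumptions made for illustration. Genes are assumed to be listed in order along one strand of one contig, with per-gene SL1-type and SL2-type read counts already summarised.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    name: str
    sl1: int  # reads carrying an SL1-type leader (hypothetical input)
    sl2: int  # reads carrying an SL2-type leader (hypothetical input)


def sl_bias(gene: Gene, min_ratio: float = 1.0) -> str:
    """Label a gene as 'SL2', 'SL1' or 'none' from its SL read counts.

    min_ratio is an illustrative SL2:SL1 threshold, not a SLOPPR default.
    """
    if gene.sl1 + gene.sl2 == 0:
        return "none"  # gene receives no SL reads at all
    # Multiplying instead of dividing avoids a zero division when sl1 == 0.
    return "SL2" if gene.sl2 > min_ratio * gene.sl1 else "SL1"


def predict_operons(genes: List[Gene], min_ratio: float = 1.0,
                    add_upstream: bool = True) -> List[List[str]]:
    """Group consecutive SL2-biased genes into candidate operons.

    Each tract of SL2-biased genes is optionally extended by the gene
    immediately upstream, which by construction is SL1-biased or SL-free.
    Candidates with fewer than two genes are discarded (an assumption).
    """
    labels = [sl_bias(g, min_ratio) for g in genes]
    operons = []
    i = 0
    while i < len(genes):
        if labels[i] != "SL2":
            i += 1
            continue
        start = i
        # Extend the tract while genes remain SL2-biased.
        while i < len(genes) and labels[i] == "SL2":
            i += 1
        tract = [g.name for g in genes[start:i]]
        if add_upstream and start > 0:
            tract.insert(0, genes[start - 1].name)
        if len(tract) >= 2:
            operons.append(tract)
    return operons


# Toy example with made-up counts.
example = [Gene("A", 10, 0), Gene("B", 1, 9), Gene("C", 0, 5),
           Gene("D", 0, 0), Gene("E", 2, 8), Gene("F", 7, 1)]
print(predict_operons(example))
```

In this toy input, genes B, C and E are SL2-biased, so the sketch reports one candidate operon headed by the SL1-biased gene A and another headed by the SL-free gene D. The optional intercistronic-distance filter of panel E is not modelled here.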