Fig. 2 | BMC Bioinformatics

Fig. 2

From: Roast: a tool for reference-free optimization of supertranscriptome assemblies

ROAST workflow to identify and fix de novo supertranscriptome assembly errors using RNA-seq data. ROAST takes paired-end RNA-seq reads and, optionally, a de novo supertranscriptome assembly as input. It starts by removing redundant supertranscripts and then improves the assembly in two nested iterations. At the start of each outer iteration, an inner iteration is run that extends incomplete supertranscripts using soft-clipped bases. The inner iteration starts by mapping the reads onto the assembly; partially mapped reads (reads containing soft-clipped bases) are then extracted and used to extend incomplete contigs. This is repeated until the number of iterations or the number of contigs containing partially mapped reads reaches the user-defined threshold. Once the inner iteration ends, ROAST merges fragmented supertranscripts using partially mapped reads. The reads are then realigned to the improved assembly, and this alignment is used to extend partial supertranscripts and merge fragmented contigs using reads with unmapped mates (orphan reads) and discordantly mapped read pairs, respectively. The reads are re-mapped to the resulting assembly, and false chimeras and local mis-assemblies are identified and fixed. This whole process is repeated until the number of iterations or the number of contigs containing errors reaches the user-defined threshold. At the end of the iterative improvement, ROAST outputs the final improved assembly.
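
The caption defines partially mapped reads as reads containing soft-clipped bases. As a minimal sketch of how such reads could be recognized, the snippet below parses a SAM CIGAR string and reports the number of soft-clipped bases (operation `S`) at each end of the alignment. This is not ROAST's code: the function names `soft_clip_lengths` and `is_partially_mapped` and the `min_clip` parameter are hypothetical, introduced only for illustration.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM format).
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar):
    """Return (left, right) soft-clipped base counts for a SAM CIGAR string.

    Hypothetical helper for illustration; not part of ROAST.
    """
    if cigar == "*":  # CIGAR unavailable, e.g. for an unmapped read
        return (0, 0)
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Hard clips (H) can flank soft clips at either end, so drop them first.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return (0, 0)
    left = core[0][0] if core[0][1] == "S" else 0
    # Only count a right clip when it is a different operation from the left one.
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return (left, right)


def is_partially_mapped(cigar, min_clip=1):
    """True if either end has at least min_clip soft-clipped bases.

    min_clip is an assumed, illustrative threshold.
    """
    left, right = soft_clip_lengths(cigar)
    return max(left, right) >= min_clip
```

In this sketch, a read with CIGAR `5S95M` would count as partially mapped at its left end, while `100M` would not. Which end is clipped matters for extension, since a clip at the contig start suggests missing upstream sequence and a clip at the contig end suggests missing downstream sequence.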
