Fig. 3 | BMC Bioinformatics

Fig. 3

From: groHMM: a computational tool for identifying unannotated and cell type-specific transcription units from global run-on sequencing data

Performance of transcript unit callers.

a Schematic representation of TUA metrics. We divided non-overlapping gene annotations into three distinct regions: (1) upstream of the transcription start site (TSS), (2) within the gene body, and (3) downstream of the transcription termination site (TTS). We scaled all genes to a uniform length and plotted the frequency with which the called transcription units overlap each position near the gene annotations (see Methods for details). Asterisks mark two possible transcript-calling errors: * 'dissociated annotation error' and ** 'merged annotation error'. The terms are as follows: \( \widehat{5'\mathrm{FP}} \) (5′ false positive; upstream region), \( \widehat{\mathrm{TP}} \) (true positive; gene body), and \( \widehat{\mathrm{PostTTS}} \) (downstream of the TTS).

b Transcript density plot of called transcripts for well-expressed genes (n = 2,060), where expression (i.e., GRO-seq reads) was observed in 100 evenly divided regions (EDR). Transcript density is defined as the number of called transcripts divided by the number of annotations at each genomic location. Ten percent of the transcripts were bootstrapped with replacement (n = 100).

c TUA metrics for the data in b, comparing three transcript callers: groHMM, SICER, and HOMER.

d Coverage of called transcripts compared with actual expression in genic and intergenic regions, using a window size of 100 bp, for groHMM, SICER, and HOMER. Ten percent of the annotations were bootstrapped with replacement (n = 100).

e TUAs of called transcripts grouped by annotation width: short, <20 kb (n = 3,919); medium, 20–50 kb (n = 3,339); and long, >50 kb (n = 4,740), for groHMM, SICER, and HOMER. Annotations with EDR = 10 were used. Ten percent of the transcripts were bootstrapped with replacement (n = 100).

f TUAs at various sequencing depths for groHMM, SICER, and HOMER. The same optimal values for each method were used. Ten percent of the transcripts were bootstrapped (n = 100) with annotations of EDR = 1.
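The transcript density described for panel b can be illustrated with a minimal Python sketch. This is not the groHMM implementation. It assumes a single chromosome and strand, represents annotations and called transcripts as hypothetical `(start, end)` tuples, covers only the gene body (not the upstream or downstream regions), and counts an annotation once per bin if any called transcript overlaps that bin. The bootstrap resamples annotations, which is one possible reading of the caption; the function names `scaled_density` and `bootstrap_density` are illustrative only.

```python
import random


def scaled_density(annotations, transcripts, n_bins=10):
    """Fraction of annotations whose i-th scaled bin overlaps a called transcript.

    Each annotation is split into n_bins equal-width bins, so genes of
    different lengths are compared on a uniform scale (assumed simplification).
    """
    counts = [0] * n_bins
    for a_start, a_end in annotations:
        width = (a_end - a_start) / n_bins
        for i in range(n_bins):
            b_start = a_start + i * width
            b_end = b_start + width
            # Half-open interval overlap test between the bin and each transcript.
            if any(t_start < b_end and t_end > b_start
                   for t_start, t_end in transcripts):
                counts[i] += 1
    return [c / len(annotations) for c in counts]


def bootstrap_density(annotations, transcripts, n_bins=10,
                      fraction=0.1, n_rounds=100, seed=0):
    """Mean density over n_rounds resamples of a fraction of the annotations.

    Sampling is with replacement; resampling annotations rather than
    transcripts is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    k = max(1, round(fraction * len(annotations)))
    rounds = []
    for _ in range(n_rounds):
        sample = rng.choices(annotations, k=k)
        rounds.append(scaled_density(sample, transcripts, n_bins))
    return [sum(r[i] for r in rounds) / n_rounds for i in range(n_bins)]


if __name__ == "__main__":
    genes = [(0, 100), (200, 300)]      # toy annotations
    calls = [(0, 50), (250, 300)]       # toy called transcripts
    print(scaled_density(genes, calls, n_bins=2))
    print(bootstrap_density(genes, calls, n_bins=2))
```

In this toy input, the first gene is covered only in its first half and the second gene only in its second half, so each scaled bin is overlapped in half of the annotations.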
