Fig. 1 | BMC Bioinformatics

Fig. 1

From: Unsupervised detection of regulatory gene expression information in different genomic regions enables gene expression ranking

Illustration of the various genomic regions containing interleaved regulatory motif sequences, and of the Average Repetitive Substring Index (ARSI) measure. a The pre-mRNA transcript contains different sections comprising interleaved regulatory sequence motifs, which affect gene expression; these regions include exons, introns, and untranslated regions (UTRs). Transcripts of highly expressed genes tend to contain motifs with their precise sequences. In transcripts originating from lowly expressed genes, however, these motifs are more likely to acquire mutations, which affect gene expression; this leads to a lower ARSI score for these genes. b To compute the ARSI measure for a given sequence, we find, for each nucleotide position in the sequence, the longest substring that starts at this position and appears in one of the sequences in the reference set of genetic sequence elements. The score is based on the average of the lengths of all these substrings; see more details in the Methods section and in [25].
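The computation described in panel b can be sketched in a few lines of Python. This is only a minimal illustration of the idea stated in the caption, not the authors' implementation: the function names `longest_match_length` and `arsi_sketch` are hypothetical, the plain average is an assumption (the caption says only that the score is "based on" the average, and any normalization described in the Methods section or in [25] is not reproduced), and the brute-force search is chosen for readability rather than speed.

```python
def longest_match_length(seq, start, references):
    """Length of the longest substring of `seq` that starts at `start`
    and occurs in at least one reference sequence (hypothetical helper)."""
    best = 0
    length = 1
    # If a substring of length k occurs in a reference, so does every shorter
    # prefix of it, so we can grow the candidate and stop at the first miss.
    while start + length <= len(seq):
        candidate = seq[start:start + length]
        if any(candidate in ref for ref in references):
            best = length
            length += 1
        else:
            break
    return best


def arsi_sketch(seq, references):
    """Average, over all positions of `seq`, of the longest-match lengths.
    Assumption: an unnormalized mean; the published measure may differ."""
    if not seq:
        return 0.0
    lengths = [longest_match_length(seq, i, references) for i in range(len(seq))]
    return sum(lengths) / len(lengths)


# Toy example: a single mutation (G -> T) shortens the matches and lowers the score.
reference_set = ["ACGT"]
exact_score = arsi_sketch("ACGT", reference_set)    # match lengths 4, 3, 2, 1
mutated_score = arsi_sketch("ACTT", reference_set)  # match lengths 2, 1, 1, 1
```

In this toy example the exact copy of the reference scores higher than the mutated copy, mirroring the caption's point that mutated motifs lead to a lower ARSI score. For realistic reference sets, an index such as a suffix array would be a more practical choice than repeated substring scans.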
