Fig. 2 | BMC Bioinformatics

Fig. 2

From: Recognizing RNA structural motifs in HT-SELEX data for ribosomal protein S15

a Histogram of the normalized Levenshtein distance from the top four high-frequency sequences (Seq. ID: 98, 101, 290, 669), showing a clear cluster cutoff at a distance of 10%. Within the cluster, the frequency of sequences decreases with distance from the center, indicating that sequence clusters containing high-frequency sequences are valid. b Plot of the CD-HIT clustering data, represented as cluster size vs. mean percent identity to the cluster seed (diffuseness). In red are the clusters containing high-frequency sequences with more than 100 read counts. In blue are clusters containing high-frequency sequences with more than 100 read counts that have been experimentally examined for binding to S15 (Table 6). In green are experimentally tested sequences from clusters that do not contain high-frequency sequences.
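
The distance measure in panel a can be illustrated with a minimal sketch. The caption does not state how the Levenshtein distance was normalized, so the code below assumes division by the length of the longer sequence; the function names and the example sequences are hypothetical and are not taken from the article or its data.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def within_cutoff(center: str, candidates: list[str], cutoff: float = 0.10) -> list[str]:
    """Return candidates whose normalized distance to center is <= cutoff."""
    return [s for s in candidates if normalized_levenshtein(center, s) <= cutoff]


if __name__ == "__main__":
    center = "ACGUACGUAC"  # hypothetical high-frequency sequence
    reads = ["ACGUACGUAG", "ACGUACGGGG"]  # hypothetical reads
    print(within_cutoff(center, reads))
```

With a 10% cutoff, a read of length 10 stays in the cluster only if it differs from the center by at most one edit; a different normalization (for example, by the center length) would shift that boundary for reads of unequal length.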
