Fig. 5 | BMC Bioinformatics

Fig. 5

From: Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing

Heatmap of the pairwise local similarity percentages of the sequences in the PF02801.19 domain family of Pfam. The darker rectangles represent sub-clusters whose members are more similar to each other than to the rest of the cluster. The overlaid percentages show the F1 score of each match between a cluster from the output of our algorithm and a sub-cluster obtained by cutting the hierarchical clustering tree to generate four sub-clusters based on pairwise similarity scores. The F1 scores of the matches, ordered from the largest to the smallest sub-cluster, are 93%, 90%, 13%, and 66%.
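
The F1 comparison described in the caption can be illustrated with a minimal sketch. It assumes that each cluster is treated as a set of sequence identifiers and that each reference sub-cluster is matched to the predicted cluster giving the highest F1; the caption does not spell out the exact matching procedure, so this is one plausible reading and not necessarily the authors' implementation. The function names `f1_score` and `best_matches` and the toy identifiers are hypothetical.

```python
def f1_score(predicted, reference):
    """F1 between two clusters, each treated as a set of sequence IDs."""
    predicted, reference = set(predicted), set(reference)
    overlap = len(predicted & reference)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)  # fraction of predicted members that are correct
    recall = overlap / len(reference)     # fraction of reference members recovered
    return 2 * precision * recall / (precision + recall)


def best_matches(predicted_clusters, reference_clusters):
    """For each reference sub-cluster, return the highest F1 over all predicted clusters."""
    return [
        max((f1_score(p, r) for p in predicted_clusters), default=0.0)
        for r in reference_clusters
    ]


# Toy data: hypothetical sequence IDs, not taken from the paper.
reference = [{"s1", "s2", "s3", "s4"}, {"s5", "s6"}]
predicted = [{"s1", "s2", "s3"}, {"s4", "s5", "s6"}]
scores = best_matches(predicted, reference)
print([round(s, 3) for s in scores])
```

In this toy case one sequence is assigned to the wrong cluster, which lowers recall for the first sub-cluster and precision for the second, so both matches fall below 100%.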
