Fig. 2
From: Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing

F1-value comparison for Pfam-annotated domains of data set #9 using different numbers of hash functions. For each iteration of the algorithm, a comparison is made between Pfam and pClust (blue and green lines). The red line represents the F1-value computed at the end of each iteration using d = 40. Comparisons are based on non-overlapping clusters of domain regions. The dashed line represents the number of hash functions where the termination condition is met for τ = 0.9 and d = 40.