Fig. 4
From: Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing

F1 score comparison for data set #3 using various values of d used in the termination condition. Red, olive, and green lines represent comparison at the end of each iteration with an earlier output of the algorithm using d=40, 50, and 60, respectively. Blue and purple lines demonstrate the comparison between pClust and Pfam clusters. Dashed lines represent the number of hash functions where the termination condition is satisfied, where from left to right the termination condition is (τ=0.9, d=40), (τ=0.9, d=50), (τ=0.9, d=60), and (τ=0.95, d=50).