Fig. 7 | BMC Bioinformatics

Fig. 7

From: Alignment-free clustering of large data sets of unannotated protein conserved regions using minhashing

F1-value comparison for NADDA-annotated conserved regions of data set #9 using different numbers of hash functions. The red line represents the F1 score computed at the end of each iteration to check against the termination condition (d=40). A comparison between the Pfam and pClust protein clusters (overlapping) and the clustering of proteins generated at each iteration is shown with blue and red lines. The dashed line marks the number of hash functions at which the termination condition is met (for τ=0.9 and d=40).
