Figure 4 | BMC Bioinformatics

Figure 4

From: Evaluation and improvements of clustering algorithms for detecting remote homologous protein families

Distribution of minimum e-values within (intra) and between (inter) super-families across all datasets. Curves for Astral subsets A-10, A-20, A-30, A-50, A-70, A-90 and A-95 are shown in panels a, b, c, d, e, f and g, respectively, and curves for the Gold database are shown in panel h. E-values for sequence-sequence comparisons (SSCs) were computed with BLAST, while e-values for profile-profile comparisons (PPCs) were obtained by combining HHblits and HHsearch. For each protein in a dataset, we considered the e-value to its nearest neighbor from its own super-family (intra curves) and the e-value to its nearest neighbor from any other super-family (inter curves). Solid lines indicate BLAST e-values and dashed lines indicate HHsearch e-values.
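The per-protein quantities plotted here can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `nearest_neighbor_evalues`, the `(query, target, evalue)` hit tuples and the `superfamily` lookup are all hypothetical, and parsing of BLAST or HHsearch output is assumed to have happened already.

```python
from collections import defaultdict
import math


def nearest_neighbor_evalues(hits, superfamily):
    """Return (intra, inter) dicts of minimum e-values per query protein.

    hits        -- iterable of (query, target, evalue) tuples (assumed input format)
    superfamily -- dict mapping each protein id to its super-family label
    """
    intra = defaultdict(lambda: math.inf)
    inter = defaultdict(lambda: math.inf)
    for query, target, evalue in hits:
        if query == target:
            continue  # a protein is not its own neighbor
        if superfamily[query] == superfamily[target]:
            # Nearest neighbor within the same super-family (intra curve).
            intra[query] = min(intra[query], evalue)
        else:
            # Nearest neighbor from any other super-family (inter curve).
            inter[query] = min(inter[query], evalue)
    # Proteins without any hit of a given kind are simply absent from that dict.
    return dict(intra), dict(inter)
```

Running the function once on sequence-sequence hits and once on profile-profile hits would give the values behind the solid and dashed curves, respectively, under the assumptions above.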
