Skip to main content
Figure 1 | BMC Bioinformatics

Figure 1

From: Evaluation and improvements of clustering algorithms for detecting remote homologous protein families

Figure 1

Distribution of minimum BLAST e-values for GOLD and ASTRAL A-10 datasets. The GOLD dataset (a) is a collection of enzymes that were manually assigned to protein families/super-families. A-10 (b) is an ASTRAL subset of the SCOP database that contains only sequences with identities less than 10%. For each protein in both datasets, we considered the e-value to the nearest neighbor from its own family/superfamily (intra curves) and the e-value to the nearest neighbor from any other family/superfamily (inter curves).

Back to article page