Figure 1From: Ultra-fast sequence clustering from similarity networks with SiLiXSingle linkage clustering with alignment coverage constraints. The four proteins (A, B, C, D) contain some homologous domains (represented by colored boxes). To avoid the clustering in the same family of proteins that do not share any homology (e.g. A and D), pairwise sequence alignments are considered for the clustering only if they cover a minimum threshold of the length of each of the two proteins. This threshold has to be high enough to exclude cases like the alignment (B, C), which would lead to the clustering of A and D.Back to article page