Fig. 1 | BMC Bioinformatics

Fig. 1

From: Clustering biological sequences with dynamic sequence similarity threshold


Illustration of the binning process and graph contraction in ALFATClust.

A: Suppose the black dots are the primitive graph vertices. The binning process determines whether a vertex v_t (red circle), representing a graph cluster of two primitive vertices, can be assigned to the bin B (purple dashed ellipse), which consists of two vertices (blue circles) containing clusters of sizes 2 and 3. I_bin(B, W) is the average edge weight over all edges interconnecting primitive vertices inside B (blue and purple lines), and I(B, W, v_t) is the average edge weight over all edges (orange lines) connecting v_t and B. The cluster separation is g = I_bin(B, W) − I(B, W, v_t) and the cluster cut-off threshold is r = I(B, W, v_t) − γ_low. According to Eq. (2), v_t can be assigned to B when g is less than r.

B: The graph contraction shrinks two clusters (primitive vertices connected by either blue or red edges) into two vertices (blue and red dots, respectively). The intra-cluster (blue and red) edge weights are averaged to become the vertex (blue and red) weights, and the inter-cluster (black) edge weights are averaged to become the collapsed edge (black) weight. The value in parentheses denotes the number of underlying actual edges. These two vertices can be further collapsed into a single vertex (black).
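The two panels can be illustrated with a minimal Python sketch. It is not the ALFATClust implementation: the function names (`edge_stats`, `can_join_bin`, `contract`), the representation of W as a dictionary keyed by `frozenset` vertex pairs, and the choice to average only over edges that actually exist are all assumptions made for illustration. The bin and the candidate vertex are assumed to be disjoint sets of primitive vertices.

```python
from itertools import combinations, product


def edge_stats(pairs, weights):
    """Return (average weight, edge count) over the pairs that are actual edges.

    `weights` is a hypothetical dict mapping frozenset({u, v}) to an edge weight.
    Pairs without an entry are treated as missing edges and skipped.
    """
    found = [weights[frozenset(p)] for p in pairs if frozenset(p) in weights]
    if not found:
        return None, 0
    return sum(found) / len(found), len(found)


def can_join_bin(bin_vertices, candidate, weights, gamma_low):
    """Panel A: test whether the candidate cluster v_t may be assigned to bin B."""
    # Average weight of edges among primitive vertices inside the bin.
    i_bin, _ = edge_stats(combinations(bin_vertices, 2), weights)
    # Average weight of edges connecting the candidate to the bin.
    i_cross, _ = edge_stats(product(candidate, bin_vertices), weights)
    if i_bin is None or i_cross is None:
        return False  # assumption: no connecting edges means no assignment
    g = i_bin - i_cross      # cluster separation
    r = i_cross - gamma_low  # cluster cut-off threshold
    return g < r


def contract(cluster_a, cluster_b, weights):
    """Panel B: collapse two clusters into two weighted vertices and one edge.

    Each value is an (average weight, number of underlying edges) pair.
    """
    return {
        "vertex_a": edge_stats(combinations(cluster_a, 2), weights),
        "vertex_b": edge_stats(combinations(cluster_b, 2), weights),
        "edge": edge_stats(product(cluster_a, cluster_b), weights),
    }
```

For example, with a bin whose internal edges all weigh 0.9 and two connecting edges of weight 0.8, g is about 0.1. With γ_low = 0.5 the threshold r is about 0.3, so the candidate joins the bin; with γ_low = 0.75 the threshold drops to about 0.05 and the candidate is rejected.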
