Figure 3 | BMC Bioinformatics

Figure 3

From: Unsupervised statistical clustering of environmental shotgun sequences

Pairwise genome divergence distributions. Cumulative distributions of pairwise divergences (D_n) between all completed bacterial genomes retrieved from GenBank. Fragment lengths of 400 to 1000 were used to compute D_n. Divergences based on k-mer orders 2, 3, and 4 are shown in panels A, B, and C, respectively. The vertical cut-off line at D = 1 marks an empirical boundary above which the binning algorithm works with high accuracy. For fragment length 400, over 80% of all randomly selected pairs have divergences above this line.
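
As a rough illustration of how a statement such as "over 80% of pairs lie above the cut-off" can be read off a cumulative distribution, the following minimal sketch computes an empirical cumulative distribution and the fraction of values above a threshold. The divergence values are hypothetical placeholders, not data from the article, and the function names `empirical_cdf` and `fraction_above` are introduced here for illustration only. The computation of D_n itself is not shown, since the caption does not define it.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return the sorted values and the cumulative fraction at each sorted position."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def fraction_above(values, cutoff=1.0):
    """Return the fraction of values strictly greater than the cutoff."""
    xs = sorted(values)
    n = len(xs)
    # bisect_right counts how many values are <= cutoff
    return (n - bisect_right(xs, cutoff)) / n


# Hypothetical pairwise divergences, for illustration only (not from the article)
divergences = [0.4, 0.9, 1.2, 1.5, 2.1, 2.8, 3.0, 0.7, 1.1, 1.9]

xs, cdf = empirical_cdf(divergences)
above = fraction_above(divergences, cutoff=1.0)
print(f"Fraction of pairs with divergence above 1: {above:.2f}")
```

On a cumulative plot, this fraction corresponds to one minus the curve's height where it crosses the vertical cut-off line.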
