Figure 8 | BMC Bioinformatics

Figure 8

From: CLAG: an unsupervised non hierarchical clustering algorithm handling biological data

CLAG on the 128-dimensional synthetic dataset. A: the 128-dimensional dataset contains 1024 points and 16 clusters generated with a Gaussian distribution (http://cs.joensuu.fi/sipu/datasets/DIM128.txt). CLAG perfectly distinguished the 16 clusters when run with Δ = 0.05 and scores ≥ 0.5. B: curves associated with different score thresholds, describing the number of elements clustered by CLAG as Δ varies. Note that the number of elements is 1011 for Δ = 0.05 and maximal scores. Different clustering algorithms were run on this dataset: k-means (C), c-means (D), MCLUST (E). k-means and c-means were run with 16 clusters, and MCLUST with “ellipsoidal, equal variance with 9 components” as best model (note the 8 grey clusters). For k-means, clusters 1 and 13 are split into several k-means clusters, while clusters 3 and 8 (violet) and clusters 4 and 16 (light blue) are fused together. c-means partitions the original set into only 11 clusters: clusters 10, 4, 16 (brown) and clusters 5, 6, 8, 9 (orange) are grouped together. In A, C, D and E, elements are represented by circles and different clusters are distinguished by different colors. Panels A, C, D and E were produced by plotting the first two columns of the matrix describing the dataset.
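
The caption describes reference clusters being split across several output clusters, or fused into one, by k-means and c-means. The following is a minimal sketch of how such splits and fusions could be detected from two labelings of the same elements. The function name `split_and_fused` and the toy labels are hypothetical illustrations; they are not part of CLAG and are not taken from the DIM128 results.

```python
from collections import defaultdict


def split_and_fused(true_labels, predicted_labels):
    """Compare a reference labeling with a clustering output.

    Returns two dicts:
      split: reference cluster -> sorted output clusters it is spread over
             (only reference clusters spread over more than one)
      fused: output cluster -> sorted reference clusters it contains
             (only output clusters containing more than one)
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("labelings must have the same length")

    by_true = defaultdict(set)  # reference cluster -> output clusters seen
    by_pred = defaultdict(set)  # output cluster -> reference clusters seen
    for t, p in zip(true_labels, predicted_labels):
        by_true[t].add(p)
        by_pred[p].add(t)

    split = {t: sorted(ps) for t, ps in by_true.items() if len(ps) > 1}
    fused = {p: sorted(ts) for p, ts in by_pred.items() if len(ts) > 1}
    return split, fused


# Toy labels (hypothetical): reference cluster 1 is split over "a" and "b",
# and reference clusters 3 and 8 are fused into output cluster "c".
true_labels = [1, 1, 1, 1, 3, 3, 8, 8, 5, 5]
predicted_labels = ["a", "a", "b", "b", "c", "c", "c", "c", "d", "d"]

split, fused = split_and_fused(true_labels, predicted_labels)
print(split)  # {1: ['a', 'b']}
print(fused)  # {'c': [3, 8]}
```

A perfect clustering, such as the one reported for CLAG in panel A, would yield two empty dictionaries under this check.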
