Skip to main content
Fig. 12 | BMC Bioinformatics

Fig. 12

From: Statistical power for cluster analysis

Fig. 12

Traditional (top row), fuzzy (middle row and bottom row) silhouette scores, respectively computed after clustering through k-means, c-means, and Gaussian finite mixture modelling. Each line presents the mean and 95% confidence interval obtained over 100 iterations of sampling (N = 120, two uncorrelated features) from populations with no subgroups (left column), two subgroups (second column), three subgroups (third column), or four subgroups (right column) that were equidistant and of equal size. The same dataset in each iteration was subjected to k-means, c-means, and Gaussian mixture modelling. Estimates were obtained for different centroid separation values (Δ = 1 to 10, in steps of 0.5; brighter colours indicate stronger separation). All simulation results are plotted, and power (proportion of iterations that found k > 1) and accuracy (proportion of iterations that found the true number of clusters) are annotated for centroid separations Δ = 2 to 4 (see Fig. 13 for power and accuracy as a function of centroid separation)

Back to article page