Fig. 2 | BMC Bioinformatics

Fig. 2

From: Statistical power for cluster analysis

This figure shows three datasets in the top row, each consisting of 150 observations that fall into three equally sized clusters. Although each dataset has 4 features, the plotted data are a two-dimensional projection obtained through multi-dimensional scaling (MDS). The left column presents simulated “blobs” as they are commonly used in clustering tutorials, the middle column presents the popular Iris dataset, and the right column presents more realistic multivariate normal distributions. The bottom row presents the outcome of k-means clustering, showing good classification accuracy for all datasets, but reliable cluster detection (silhouette coefficient of 0.5 or over) only for the blobs and the Iris dataset, not for the more realistic scenario.
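The contrast described in the caption can be illustrated with a minimal sketch, which is not the authors' code. It simulates two 4-feature datasets of 150 observations in three equal clusters, runs a plain k-means, and compares mean silhouette coefficients. The cluster centres, the standard deviation, the seed, and all function names are assumptions chosen for illustration. The noise is independent per feature, which is simpler than a general multivariate normal, and the MDS projection is omitted because it only affects plotting. In practice a library such as scikit-learn (`KMeans`, `silhouette_score`) would be the usual choice.

```python
import math
import random


def make_clusters(centres, sd, n_per_cluster, rng):
    """Draw n_per_cluster points around each centre with independent Gaussian noise."""
    points, truth = [], []
    for label, centre in enumerate(centres):
        for _ in range(n_per_cluster):
            points.append([rng.gauss(c, sd) for c in centre])
            truth.append(label)
    return points, truth


def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster empties
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def silhouette(points, labels):
    """Mean of (b - a) / max(a, b) over all points, where a is the mean distance
    to the point's own cluster and b the mean distance to the nearest other cluster."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        mean_dist = {}
        for c in clusters:
            d = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == c and j != i]
            mean_dist[c] = sum(d) / len(d) if d else None
        a = mean_dist[labels[i]]
        if a is None:  # singleton cluster: score is defined as 0
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != labels[i] and v is not None)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical centres: far apart ("blobs") versus close together (overlapping).
blob_centres = [[0, 0, 0, 0], [10, 10, 10, 10], [-10, 10, -10, 10]]
close_centres = [[0, 0, 0, 0], [1.5, 1.5, 1.5, 1.5], [-1.5, 1.5, -1.5, 1.5]]

blobs, _ = make_clusters(blob_centres, sd=1.0, n_per_cluster=50, rng=rng)
close, _ = make_clusters(close_centres, sd=1.0, n_per_cluster=50, rng=rng)

sil_blobs = silhouette(blobs, kmeans(blobs, k=3))
sil_close = silhouette(close, kmeans(close, k=3))

print(f"silhouette, well-separated blobs: {sil_blobs:.2f}")
print(f"silhouette, overlapping clusters: {sil_close:.2f}")
```

Under these assumed settings, the well-separated blobs should score clearly above the 0.5 threshold mentioned in the caption, while the overlapping clusters score lower even though k-means still assigns a label to every point.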
