Figure 3 | BMC Bioinformatics

Figure 3

From: Computational cluster validation for microarray data analysis: experimental assessment of Clest, Consensus Clustering, Figure of Merit, Gap Statistics and Model Explorer


A geometric interpretation of Gap. The green curve is the WCSS computed with K-means-R on the CNS Rat dataset. The red curve is the average WCSS, computed on ten datasets generated from the original data via the Ps null model. The vertical lines mark the gap between the null model curve and the real curve at each value of k. Because WCSS is expected to decrease sharply up to k* on the real dataset, while it has a nearly constant slope on the null model datasets, the length of the vertical segments is expected to increase up to k* and then decrease. Here we get k* = 7, a value very close to the number of classes (six) in the dataset.
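
The geometric reading in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the article: the function names `gap_curve` and `widest_gap_k` are hypothetical, the WCSS values are made up, and the gap is taken on a log scale as an assumption (the caption does not state the scale). Picking the k with the longest vertical segment is only the simplified geometric view; the Gap Statistics method itself may apply a further selection rule.

```python
import math


def gap_curve(real_wcss, null_wcss_runs):
    """Gap at each k: mean log WCSS over the null datasets minus log WCSS on the real data.

    real_wcss[i] and run[i] must refer to the same number of clusters k.
    The log scale is an assumption of this sketch.
    """
    gaps = []
    for i, w_real in enumerate(real_wcss):
        mean_log_null = sum(math.log(run[i]) for run in null_wcss_runs) / len(null_wcss_runs)
        gaps.append(mean_log_null - math.log(w_real))
    return gaps


def widest_gap_k(gaps, k_min=1):
    """Return the k whose vertical segment (gap) is longest."""
    best_index = max(range(len(gaps)), key=lambda i: gaps[i])
    return best_index + k_min


# Made-up WCSS values for k = 1..5 (illustration only, not data from the article).
real = [100.0, 60.0, 30.0, 28.0, 27.0]          # sharp drop, then a plateau
null_runs = [
    [100.0, 90.0, 80.0, 70.0, 60.0],            # nearly constant slope
    [100.0, 88.0, 82.0, 72.0, 62.0],
]

gaps = gap_curve(real, null_runs)
k_star = widest_gap_k(gaps)
print(gaps, k_star)
```

With these toy curves the gap grows while the real WCSS is still dropping sharply and shrinks once it flattens, which is the rise-then-fall pattern of the vertical segments described above.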
