Statistical power for cluster analysis

BMC Bioinformatics

Table 4 Statistical power for the binary decision of data being “clustered” using k-means clustering

	Δ = 1	Δ = 2	Δ = 3	Δ = 4	Δ = 5	Δ = 6	Δ = 7	Δ = 8	Δ = 9	Δ = 10
2 Clusters (10/90%)
N = 10	44	28	31	25	14	8	3	0	0	0
N = 20	10	11	22	53	81	95	100	100	100	100
N = 40	1	0	15	44	96	99	100	100	100	100
N = 80	0	0	5	54	98	100	100	100	100	100
N = 160	0	0	1	39	99	100	100	100	100	100
2 Clusters (50/50%)
N = 10	25	56	77	95	100	99	100	100	100	100
N = 20	12	32	82	98	100	100	100	100	100	100
N = 40	3	29	93	99	100	100	100	100	100	100
N = 80	0	12	91	100	100	100	100	100	100	100
N = 160	0	1	98	100	100	100	100	100	100	100
3 Clusters (33/34/33%)
N = 10	18	30	57	76	97	99	100	100	100	100
N = 20	21	36	70	97	100	100	100	100	100	100
N = 40	8	13	77	100	100	100	100	100	100	100
N = 80	2	4	89	100	100	100	100	100	100	100
N = 160	0	0	91	100	100	100	100	100	100	100
4 Clusters (25/25/25/25%)
N = 10	4	6	22	34	72	90	100	99	100	100
N = 20	19	36	70	94	100	100	100	100	100	100
N = 40	7	22	75	99	100	100	100	100	100	100
N = 80	0	4	83	100	100	100	100	100	100	100
N = 160	0	0	88	100	100	100	100	100	100	100

Values are the percentage of simulated datasets classified as "clustered". Estimates are based on 100 iterations per cell, using a decision threshold of 0.5 for silhouette scores
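
The procedure summarised in the table note (simulate data, run k-means, compare the silhouette score with 0.5, repeat 100 times) can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes two equally sized, unit-variance Gaussian clusters in two dimensions, with Δ taken as the distance between centroids in standard-deviation units, plain Lloyd's k-means with random initialisation, and an inclusive threshold (score ≥ 0.5). The function names `simulate`, `kmeans`, `mean_silhouette` and `estimate_power` are hypothetical, and only the 50/50% two-cluster case is covered.

```python
import math
import random


def simulate(n, delta, rng, dims=2):
    """Draw n points from two unit-variance Gaussian clusters (assumed setup).

    Points alternate between the clusters, so sizes are equal for even n.
    The second cluster's centroid is shifted by delta along the first axis.
    """
    points = []
    for i in range(n):
        p = [rng.gauss(0.0, 1.0) for _ in range(dims)]
        if i % 2:
            p[0] += delta
        points.append(p)
    return points


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def mean_silhouette(points, labels):
    """Average silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return 0.0  # silhouette is undefined for a single cluster
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def estimate_power(n, delta, k=2, iterations=100, threshold=0.5, seed=0):
    """Percentage of simulated datasets whose silhouette reaches the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(iterations):
        pts = simulate(n, delta, rng)
        labels = kmeans(pts, k, rng)
        if mean_silhouette(pts, labels) >= threshold:
            hits += 1
    return 100.0 * hits / iterations
```

Calling `estimate_power(40, 5.0)` yields a percentage of the same kind as the N = 40, Δ = 5 cell for two 50/50% clusters. The exact value depends on the assumptions listed above, so it should not be expected to reproduce the tabulated figure.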

ISSN: 1471-2105