Nearest Neighbor Networks: clustering expression data based on gene neighborhoods

Huttenhower, Curtis; Flamholz, Avi I; Landis, Jessica N; Sahi, Sauhard; Myers, Chad L; Olszewski, Kellen L; Hibbs, Matthew A; Siemers, Nathan O; Troyanskaya, Olga G; Coller, Hilary A

doi:10.1186/1471-2105-8-250

BMC Bioinformatics

Table 1 Clustering algorithm summary statistics.

	NNN g = 5, n = 25	CAST t = 0.8	CLICK h = μ_T	QTC d = 0.5, n = 5	SAMBA
Brem 2005, 6162 genes, 131 conditions
Genes	1527	3410	6162	6137	2284
Clusters	54	800	82	127	113
Mean Size	28.4	4.26	75.1	48.3	102
Size Dev.	49.2	16.91	161	93.3	70.3
Gasch 2000, 6115 genes, 173 conditions
Genes	1142	4079	6115	6092	3120
Clusters	38	666	9	69	128
Mean Size	30.1	6.12	679	88.3	130
Size Dev.	62.5	35.58	787	220	101
Haugen 2004, 6256 genes, 7 conditions
Genes	64	6251	6256	6236	280
Clusters	11	45	16	56	5
Mean Size	5.82	138.9	391	111	88.4
Size Dev.	1.19	347.3	474	258	36.5
Hughes 2000, 6153 genes, 300 conditions
Genes	1996	2579	6153	6121	3375
Clusters	29	519	75	177	325
Mean Size	68.9	4.97	82.0	34.6	45.9
Size Dev.	245.4	11.95	107	57.8	44.1
Primig 2000, 6005 genes, 24 conditions
Genes	2247	5820	6005	5970	778
Clusters	27	687	46	110	25
Mean Size	83.2	8.47	131	54.3	139
Size Dev.	390	19.26	187	80.4	96.3
Spellman 1998, 5701 genes, 25 conditions
Genes	2050	5535	5701	5669	777
Clusters	28	616	47	100	32
Mean Size	73.3	8.99	121	56.7	69.0
Size Dev.	324	30.14	206	114	37.3
Concatenated Data, 6160 genes, 660 conditions
Genes	694	6155	6160	-	4892
Clusters	29	7	5	-	609
Mean Size	23.9	879.3	1232	-	63.7
Size Dev.	34.7	2140	1768	-	82.0
Uniformly Distributed Random Data, 6000 genes, 10 conditions
Genes	0 (± 0)	5988 (± 0.89)	3600 (± 3286)	5964 (± 28.8)	0 (± 0)
Clusters	0 (± 0)	216.2 (± 2.95)	9.8 (± 9.81)	109 (± 4.72)	0 (± 0)
Mean Size	0 (± 0)	27.7 (± 0.38)	190 (± 175)	53.0 (± 1.39)	0 (± 0)
Size Dev.	0 (± 0)	21.86 (± 0.25)	48.8 (± 45.7)	35.2 (± 0.791)	0 (± 0)
Normally Distributed Random Data, 6000 genes, 10 conditions
Genes	0 (± 0)	5986 (± 3.58)	6000 (± 0)	5975 (± 4.77)	0 (± 0)
Clusters	0 (± 0)	231.6 (± 3.29)	28.8 (± 11.9)	124 (± 1.30)	0 (± 0)
Mean Size	0 (± 0)	25.85 (± 0.36)	235 (± 82.6)	48.3 (± 0.482)	0 (± 0)
Size Dev.	0 (± 0)	18.14 (± 0.15)	64.8 (± 46.3)	30.9 (± 0.374)	0 (± 0)
Brem 2005, 6162 genes, 131 conditions, randomly permuted
Genes	101.4 (± 28.85)	0 (± 0)	6162 (± 0)	5837 (± 260.6)	1061 (± 35.87)
Clusters	16.2 (± 3.96)	0 (± 0)	36.2 (± 28.99)	428 (± 33.88)	156 (± 4.85)
Mean Size	6.23 (± 0.78)	0 (± 0)	680.7 (± 864.7)	13.67 (± 0.46)	32.46 (± 1.35)
Size Dev.	1.64 (± 0.79)	0 (± 0)	884.5 (± 1179)	2.36 (± 0.52)	18.03 (± 1.13)
Gasch 2000, 6115 genes, 173 conditions, randomly permuted
Genes	19.4 (± 6.66)	0 (± 0)	4586 (± 3058)	5507 (± 47.19)	1382 (± 15.27)
Clusters	3.6 (± 1.34)	0 (± 0)	20.75 (± 33.71)	411.2 (± 5.12)	219.8 (± 15.27)
Mean Size	5.47 (± 1.04)	0 (± 0)	701 (± 941.3)	13.39 (± 0.058)	18.38 (± 0.35)
Size Dev.	0.66 (± 1.48)	0 (± 0)	950.8 (± 1197)	1.7 (± 0.03)	9.25 (± 0.38)
Hughes 2000, 6153 genes, 300 conditions, randomly permuted
Genes	20.2 (± 8.61)	572.8 (± 12.74)	4922 (± 2752)	4815 (± 76.96)	1808 (± 56.32)
Clusters	3.6 (± 1.82)	224 (± 8.22)	13 (± 10.84)	407.2 (± 7.56)	390.8 (± 5.67)
Mean Size	6.13 (± 1.64)	2.56 (± 0.044)	592.5 (± 826.8)	11.83 (± 0.038)	11.09 (± 0.39)
Size Dev.	0.53 (± 0.71)	0.82 (± 0.046)	101.7 (± 200.2)	1.15 (± 0.024)	5.59 (± 0.5)

Summary statistics for the Nearest Neighbor Networks (NNN) clusters formed from each data set employed in this study, from their concatenation, from two synthetic random data sets, and from randomly permuted versions of three of the data sets, using default parameters (g = 5, n = 25). Results from other clustering algorithms with appropriate output formats (CAST, CLICK, QTC, and SAMBA) are included, also using the default parameter settings provided by the algorithms' implementations. Values for the random and permuted data are shown with standard deviations over five different seeds.
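
As an illustration of how the four row statistics in the table could be derived from a clustering result, here is a minimal Python sketch. It is not the authors' code: the function names `summarize_clusters` and `aggregate_over_seeds` are hypothetical, and it assumes that "Genes" counts distinct clustered genes, that "Size Dev." is the sample standard deviation of cluster sizes, and that the ± values are sample standard deviations across seeds. The source does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_clusters(clusters):
    """Summarize one clustering as (genes, n_clusters, mean_size, size_dev).

    `clusters` is a list of collections of gene identifiers. Genes may
    appear in more than one cluster (as with biclustering output), so
    the gene count is taken over distinct identifiers.
    """
    sizes = [len(set(cluster)) for cluster in clusters]
    genes = len(set().union(*clusters))  # distinct genes in any cluster
    mean_size = mean(sizes) if sizes else 0.0
    # Assumption: sample standard deviation; undefined for < 2 clusters,
    # so report 0.0 in that case.
    size_dev = stdev(sizes) if len(sizes) > 1 else 0.0
    return genes, len(sizes), mean_size, size_dev


def aggregate_over_seeds(runs):
    """Combine per-seed summaries into (mean, std) pairs per statistic.

    `runs` is a list of tuples returned by summarize_clusters, one per
    random seed.
    """
    columns = list(zip(*runs))  # one tuple of values per statistic
    return [
        (mean(col), stdev(col) if len(col) > 1 else 0.0)
        for col in columns
    ]


if __name__ == "__main__":
    example = [["g1", "g2", "g3"], ["g4", "g5"], ["g6", "g7", "g8", "g9", "g10"]]
    print(summarize_clusters(example))
    # An algorithm that returns no clusters yields an all-zero row.
    print(summarize_clusters([]))
```

Under these assumptions, an algorithm that forms no clusters on a data set produces an all-zero row, and for algorithms that assign each gene to at most one cluster the mean size equals the gene count divided by the cluster count.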

ISSN: 1471-2105