Microarray data mining: A novel optimization-based approach to uncover biologically coherent structures

BMC Bioinformatics

Table 7 Summary of clustering results with dataset III

Cluster Group       Cluster Size    -log₁₀(P) Values        Correction  Correlation  Precision
                    Max.    Min.    Ave.    Max.    Min.    Ave.        Ave.         Ave.
B                   61      4       2.5     4.2     1.2     1.7         0.609        0.753
D                   175     8       5.5     13.7    1.5     3.9         0.362        0.641
F                   102     102     2.9     2.9     2.9     0.9         0.172        0.494
A       Initial     271     2       20.0    140.1   0^a     19.7        0.655        0.461
        Final       271     2       21.7    140.1   1.8^b   20.6        0.707        0.522
C                   116     2       10.4    33.3    3.2     9.5         0.672        0.735
E       Initial^c   -       -       -       -       -       -           -            -
        Final       88      2       4.4     11.4    1.1     3.4         0.635        0.440

Genes in dataset III are clustered by EP_GOS_Clust through a sequential process outlined in Figure 3. Genes in cluster groups A and E are further clustered by the iterative algorithm, yielding an initial and final set of clusters. Precision is defined as the fraction of genes within a cluster assigned to the predominant functional group within that cluster.
^aThe -log₁₀(P) value of a cluster is reported as zero if a GO search did not uncover any significant annotation.
^bAfter iteratively clustering 184 genes into 15 initial clusters ('A' in Figure 3), just one poor cluster remains. The next-worst cluster has a -log₁₀(P) value of 4.1.
^cNo initial values apply here, since the remaining genes to be clustered are subjected to the second filter before being re-clustered into the initial 6 clusters (see Figure 3).
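
The precision defined in the table legend can be illustrated with a minimal Python sketch. It assumes each gene in a cluster carries exactly one functional-group label; how unannotated or multiply annotated genes are handled is not stated here, so that simplification is an assumption. The function name `cluster_precision` and the example labels are hypothetical and are not taken from EP_GOS_Clust.

```python
from collections import Counter


def cluster_precision(annotations):
    """Return the fraction of genes assigned to the predominant functional group.

    `annotations` holds one functional-group label per gene in the cluster
    (an assumed simplification: exactly one label per gene).
    """
    if not annotations:
        raise ValueError("cluster must contain at least one gene")
    counts = Counter(annotations)
    # most_common(1) gives the (label, count) pair of the predominant group.
    _, top_count = counts.most_common(1)[0]
    return top_count / len(annotations)


# Hypothetical cluster of five genes; three share the predominant label.
example_cluster = ["ribosome", "ribosome", "ribosome", "glycolysis", "transport"]
example_precision = cluster_precision(example_cluster)  # 3 of 5 genes
```

Averaging this quantity over all clusters in a group would give a figure comparable to the "Precision Ave." column, provided the same labelling assumption holds.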

ISSN: 1471-2105