Partition decoupling for multi-gene analysis of gene expression profiling data

BMC Bioinformatics

Table 1 Procedure for Spectral Clustering.

	Spectral Clustering Algorithm
1.	Compute the correlation ρ_ij between all pairs of n data points i and j.
2.	Form the similarity matrix S ∈ ℝ^{n×n} defined by s_ij = exp[-sin²(arccos(ρ_ij)/2)/σ²], where σ is a scaling parameter (σ = 1 in the reported results).
3.	Define D to be the diagonal matrix whose (i,i) element is the sum of the ith column of S.
4.	Define the Laplacian L = I - D^{-1/2} S D^{-1/2}.
5.	Find the eigenvectors {v₀, v₁, v₂, …, v_{n-1}} of L with corresponding eigenvalues 0 = λ₀ ≤ λ₁ ≤ λ₂ ≤ ⋯ ≤ λ_{n-1}.
6.	Determine from the eigendecomposition the optimal dimensionality l and natural number of clusters k (see text).
7.	Construct the embedded data by using the first l eigenvectors to provide coordinates for the data (i.e., sample i is assigned to the point in the Laplacian eigenspace with coordinates given by the ith entries of each of the first l eigenvectors, similar to PCA).
8.	Using k-means, cluster the l-dimensional embedded data into k clusters.
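
The steps in Table 1 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the rows of X are taken to be the n data points, l and k are supplied by the caller (step 6 is described in the text and is not reproduced here), "the first l eigenvectors" is read literally as v₀, …, v_{l-1}, and the k-means routine is a plain Lloyd iteration with a farthest-point initialization chosen only to keep the example deterministic. All function names are hypothetical.

```python
import numpy as np


def similarity_matrix(X, sigma=1.0):
    """Steps 1-2: correlations between rows of X, mapped to similarities."""
    # Clip so that round-off cannot push a correlation outside [-1, 1].
    rho = np.clip(np.corrcoef(X), -1.0, 1.0)
    return np.exp(-np.sin(np.arccos(rho) / 2.0) ** 2 / sigma**2)


def normalized_laplacian(S):
    """Steps 3-4: L = I - D^{-1/2} S D^{-1/2}, with D = diag(column sums)."""
    d_inv_sqrt = 1.0 / np.sqrt(S.sum(axis=0))
    return np.eye(S.shape[0]) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]


def embed(L, l):
    """Steps 5 and 7: coordinates from the first l eigenvectors of L."""
    # eigh assumes a symmetric matrix and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvals, eigvecs[:, :l]


def kmeans(Y, k, n_iter=100):
    """Step 8: Lloyd's k-means (illustrative; farthest-point initialization)."""
    centers = [Y[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(Y - c, axis=1) for c in centers], axis=0)
        centers.append(Y[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Keep the old center if a cluster happens to become empty.
        updated = np.array([
            Y[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(updated, centers):
            break
        centers = updated
    return labels


def spectral_clustering(X, l, k, sigma=1.0):
    """Run steps 1-5, 7 and 8 for caller-supplied l and k."""
    S = similarity_matrix(X, sigma)
    L = normalized_laplacian(S)
    eigvals, Y = embed(L, l)
    return kmeans(Y, k), eigvals
```

A call such as `labels, eigvals = spectral_clustering(X, l=2, k=2)` returns one cluster label per row of X together with the sorted eigenvalues of L, which could then be inspected when choosing l and k. Since sin²(θ/2) = (1 - cos θ)/2, the similarity in step 2 is equivalent to exp[-(1 - ρ_ij)/(2σ²)]; the code keeps the form given in the table.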

ISSN: 1471-2105