From: Partition decoupling for multi-gene analysis of gene expression profiling data
Spectral Clustering Algorithm | |
---|---|
1. | Compute the correlation $\rho_{ij}$ between all pairs of $n$ data points $i$ and $j$. |
2. | Form the similarity matrix $S \in \mathbb{R}^{n \times n}$ defined by $s_{ij} = \exp\left[-\sin^2\left(\arccos(\rho_{ij})/2\right)/\sigma^2\right]$, where $\sigma$ is a scaling parameter ($\sigma = 1$ in the reported results). |
3. | Define $D$ to be the diagonal matrix whose $(i,i)$ elements are the column sums of $S$. |
4. | Define the Laplacian $L = I - D^{-1/2} S D^{-1/2}$. |
5. | Find the eigenvectors $\{v_0, v_1, v_2, \ldots, v_{n-1}\}$ of $L$ with corresponding eigenvalues $0 = \lambda_0 \le \lambda_1 \le \lambda_2 \le \cdots \le \lambda_{n-1}$. |
6. | Determine from the eigendecomposition the optimal dimensionality $l$ and natural number of clusters $k$ (see text). |
7. | Construct the embedded data by using the first $l$ eigenvectors to provide coordinates for the data (i.e., sample $i$ is assigned to the point in the Laplacian eigenspace with coordinates given by the $i$-th entries of each of the first $l$ eigenvectors, similar to PCA). |
8. | Using k-means, cluster the $l$-dimensional embedded data into $k$ clusters. |
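
The steps in the table can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`spectral_embed`, `kmeans`, `spectral_cluster`) are hypothetical, rows of `X` are assumed to be the data points, the correlation is assumed to be Pearson correlation, and `l` and `k` are supplied by the caller because step 6 is described in the text rather than in the table. The k-means routine is a small Lloyd's loop with a deterministic farthest-point initialisation, swapped in only to keep the sketch self-contained.

```python
import numpy as np

def spectral_embed(X, l, sigma=1.0):
    """Steps 1-5 and 7. Rows of X are data points (an assumption)."""
    rho = np.corrcoef(X)  # step 1: pairwise correlations between rows
    # Step 2: by the half-angle identity, sin^2(arccos(rho)/2) = (1 - rho)/2,
    # which also avoids arccos domain errors when rounding pushes |rho| past 1.
    S = np.exp(-(1.0 - rho) / (2.0 * sigma**2))
    d = S.sum(axis=0)  # step 3: column sums, i.e. the diagonal of D
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Step 4: L = I - D^{-1/2} S D^{-1/2}
    L = np.eye(len(S)) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    # Step 5: eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    # Step 7: the first l eigenvectors (taken literally, so v_0 is included)
    # give the coordinates of each sample.
    return eigvals, eigvecs[:, :l]

def kmeans(Y, k, n_iter=100):
    """Minimal Lloyd's k-means with farthest-point initialisation (a stand-in)."""
    centers = [Y[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        sq = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = np.argmin(sq, axis=1)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def spectral_cluster(X, l, k, sigma=1.0):
    """Steps 1-8, with l and k chosen by the caller (step 6 is not automated)."""
    eigvals, Y = spectral_embed(X, l, sigma)
    return kmeans(Y, k), eigvals  # step 8: cluster the embedded data
```

A call such as `labels, eigvals = spectral_cluster(X, l=2, k=2)` returns one cluster label per row of `X` together with the sorted eigenvalues of $L$, which can then be inspected when choosing $l$ and $k$.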