From: 2D–EM clustering approach for high-dimensional data through folding feature vectors
Feature Selection |
1. Given a set χ of n samples x ∈ χ, each in a d-dimensional space. |
2. Perform hierarchical clustering on all samples x to find temporary class labels. |
3. Using these class labels, find p-values for all d features. |
4. Find m by placing a cut-off on the p-values (e.g. a cut-off of 0.01). |
5. Retaining the top \( m^2 \) features will give us a sample \( y \in \mathbb{R}^{m^2} \), where all y samples form a sample set \( Y \in \mathbb{R}^{m^2 \times n} \). |
Matrix arrangement |
6. Compute mean \( {\mu}_y=\frac{1}{n}\sum \limits_{y\in Y}y \). |
7. Arrange features of \( \mu_y \) in ascending order and note the indices. |
8. Arrange features of y by following the indices from step 7. |
9. Reshape a sample y to a matrix \( X \in \mathbb{R}^{m \times m} \). |
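
The nine steps above can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fold_features`, Ward linkage, the number of temporary clusters, the one-way ANOVA used for the p-values, the rule that m is the largest integer with m² no greater than the number of features passing the cut-off, and the row-major reshape are all choices made here for illustration. Samples are stored as rows, so Y is held as an n × m² array, the transpose of the layout used in step 5.

```python
import math

import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.stats import f_oneway


def fold_features(data, n_clusters=2, cutoff=0.01):
    """Fold each d-dimensional sample (a row of `data`) into an m x m matrix."""
    data = np.asarray(data, dtype=float)

    # Step 2: temporary class labels from hierarchical clustering.
    # Ward linkage and n_clusters are assumptions of this sketch.
    tree = linkage(data, method="ward")
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")

    # Step 3: one p-value per feature (one-way ANOVA is an assumption).
    groups = [data[labels == k] for k in np.unique(labels)]
    p_values = np.nan_to_num(f_oneway(*groups).pvalue, nan=1.0)

    # Step 4: derive m from the cut-off (assumed rule: the largest m
    # such that m^2 features have a p-value below the cut-off).
    m = math.isqrt(int(np.sum(p_values < cutoff)))
    if m == 0:
        raise ValueError("no feature passes the p-value cut-off")

    # Step 5: keep the m^2 features with the smallest p-values.
    top = np.argsort(p_values)[: m * m]
    Y = data[:, top]  # shape (n, m^2)

    # Steps 6-8: sort the features by their mean over all samples.
    order = np.argsort(Y.mean(axis=0))
    Y_sorted = Y[:, order]

    # Step 9: reshape every sample into an m x m matrix (row-major).
    return Y_sorted.reshape(-1, m, m), top[order]


# Synthetic data: 40 samples, 30 features, two groups that differ
# only in the first 10 features.
rng = np.random.default_rng(0)
data = rng.normal(size=(40, 30))
data[20:, :10] += np.linspace(5.0, 15.0, 10)

folded, kept = fold_features(data)
print(folded.shape)  # (n, m, m)
print(kept)          # original indices of the retained features, in folded order
```

The second return value records which original features ended up at which position of the folded matrix, so the same selection and ordering can be reapplied to new samples.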