Step | Description |
---|---|
1. | Calculate feature importance on the training data: a. Gini importance; b. absolute value of regression coefficients (PLS/PCR); c. p-values from Wilcoxon test/t-test |
2. | Rank the features according to the importance measure and remove the p% least important |
3. | Train the classifier (A. random forest; B. D-PLS; C. D-PCR) on the training data and apply it to the test data |
4. | Repeat steps 1–3 until no features remain |
5. | Identify the best feature subset according to the test error |
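
The elimination loop in the table above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The importance measure and the classifier are passed in as callables. The two used here are stand-ins chosen only to keep the example free of dependencies: `abs_t_statistic` ranks by the absolute pooled t-statistic (which orders features the same way as t-test p-values when every feature has the same sample sizes), and `nearest_centroid_error` replaces random forest, D-PLS and D-PCR. The synthetic data, the value `p=20`, and the rule of removing at least one feature per round are likewise assumptions.

```python
import math
import random
import statistics


def abs_t_statistic(X, y, feature):
    """Stand-in importance: absolute pooled two-sample t-statistic (labels 0/1)."""
    a = [row[feature] for row, label in zip(X, y) if label == 0]
    b = [row[feature] for row, label in zip(X, y) if label == 1]
    pooled_var = ((len(a) - 1) * statistics.variance(a)
                  + (len(b) - 1) * statistics.variance(b)) / (len(a) + len(b) - 2)
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def nearest_centroid_error(X_train, y_train, X_test, y_test, features):
    """Stand-in classifier: nearest class centroid; returns the test error rate."""
    centroids = {}
    for label in set(y_train):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [statistics.mean(r[f] for r in rows) for f in features]
    errors = 0
    for row, label in zip(X_test, y_test):
        point = [row[f] for f in features]
        predicted = min(centroids, key=lambda c: math.dist(point, centroids[c]))
        errors += predicted != label
    return errors / len(y_test)


def recursive_feature_elimination(X_train, y_train, X_test, y_test,
                                  importance, test_error, p=20.0):
    """Run steps 1-5; returns (best (subset, error) pair or None, full history)."""
    features = list(range(len(X_train[0])))
    history = []
    while features:  # step 4: repeat until no features remain
        # Step 1: importance of each remaining feature, on training data only.
        scores = {f: importance(X_train, y_train, f) for f in features}
        # Step 2: rank and drop the p% least important (assumed: at least one).
        ranked = sorted(features, key=scores.get, reverse=True)
        n_remove = max(1, math.ceil(len(ranked) * p / 100))
        features = sorted(ranked[:len(ranked) - n_remove])
        if not features:
            break
        # Step 3: train on the training data, evaluate on the test data.
        error = test_error(X_train, y_train, X_test, y_test, features)
        history.append((features, error))
    # Step 5: pick the subset with the lowest test error.
    best = min(history, key=lambda item: item[1]) if history else None
    return best, history


def make_data(n, rng, n_features=10):
    """Synthetic two-class data: only features 0 and 1 carry class signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(n_features)]
        row[0] += 4 * label
        row[1] -= 4 * label
        X.append(row)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(60, rng)
X_test, y_test = make_data(60, rng)
best, history = recursive_feature_elimination(
    X_train, y_train, X_test, y_test,
    importance=abs_t_statistic, test_error=nearest_centroid_error, p=20.0)
for subset, error in history:
    print(len(subset), "features:", subset, "test error:", round(error, 3))
print("best subset:", best[0])
```

Passing the importance measure and the classifier as arguments mirrors the table's choice among options a-c and A-C: swapping in Gini importance with a random forest would only require two different callables with the same signatures.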