Win percentage: a novel measure for assessing the suitability of machine classifiers for biological problems

BMC Bioinformatics

Table 3 Pseudocode for a Monte Carlo wrapper-based feature selection algorithm

MCW(S, N)
1. x_out = -∞
2. For i = 1 to N
     S_i = randomSubset(S)
     (x_i, C_i) = performance(S_i)
     If x_i > x_out,
       S_out = S_i, x_out = x_i, c_out = randomElement(C_i)
3. output S_out, x_out, c_out

The input S is the set of all features, and N is the total number of feature subsets to draw at random. The variable x_i is the performance of the top classifier for subset S_i, and C_i is the label of that top classifier (the call randomElement(C_i) suggests that C_i may hold several labels, presumably when classifiers tie for top performance). The outputs S_out, x_out, and c_out are the top-performing feature set, the top estimated performance, and the top classifier, respectively.
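
The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. The table does not specify how randomSubset samples, so the sketch assumes each feature is kept independently with probability 0.5 and that an empty draw is replaced by a single random feature. The names `random_subset`, `mcw`, and `toy_performance` are hypothetical, and the toy scorer merely stands in for the real performance routine, which would train and evaluate classifiers on the chosen features.

```python
import random


def random_subset(features, rng):
    """Draw a non-empty random subset of the features.

    Assumption: each feature is kept with probability 0.5; the source
    does not state the sampling scheme.
    """
    subset = [f for f in features if rng.random() < 0.5]
    return subset or [rng.choice(features)]


def mcw(features, n_draws, performance, rng=None):
    """Monte Carlo wrapper: keep the best of n_draws random feature subsets.

    `performance(subset)` must return (score, labels), where `labels` is a
    non-empty collection of the classifier labels that achieved `score`.
    """
    rng = rng or random.Random()
    features = list(features)
    s_out, x_out, c_out = None, float("-inf"), None
    for _ in range(n_draws):
        s_i = random_subset(features, rng)
        x_i, c_i = performance(s_i)
        if x_i > x_out:  # strict improvement only, as in the pseudocode
            s_out, x_out, c_out = s_i, x_i, rng.choice(list(c_i))
    return s_out, x_out, c_out


def toy_performance(subset):
    """Hypothetical scorer: rewards features 'a' and 'b', penalizes size."""
    score = len({"a", "b"} & set(subset)) - 0.1 * len(subset)
    return score, ["clf_A", "clf_B"]  # pretend two classifiers tie


if __name__ == "__main__":
    best = mcw(["a", "b", "c", "d"], 200, toy_performance, random.Random(0))
    print(best)
```

Because the comparison is strict, a later subset that only ties the current best does not replace it, which matches the `x_i > x_out` test in the table. Passing a seeded `random.Random` makes a run reproducible.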

ISSN: 1471-2105