Input: Ranked independent gene list G, number of candidate genes n, gene expression microarray D∈R^{n×d}, threshold ε.

Output: Predictor set P, prediction accuracy ACC, P_ACC.

Step 1: Initialization: P=∅, candidate gene set C=G.

Step 1.1: For each c_{i}∈C, calculate ACC_{i}, SN_{i}, SP_{i}, and MCC_{i} by Eqs. 1, 2, 3, and 4, respectively, using c_{i} as a single predictor in RF with 5-fold cross validation.

Step 1.2: P←{p}, where \(p \leftarrow \mathop {argmax}\limits _{c_{i} \in C}ACC_{i}\), \(P\_ACC \leftarrow \mathop {max}\limits _{i\in \{1, \cdots, n\}}ACC_{i}\), \(C \leftarrow C \setminus \{p\}\).

Step 2: Feature selection.

While P_ACC−ACC_max>−ε and C≠∅, do

Step 2.1: ACC_max←P_ACC.

Step 2.2: Add a member to P.

1. \(P_{add\_i} \leftarrow P \cup \{ c_{i} \}\), \(c_{i} \in C\); calculate \({ACC}_{add\_i}\) using \(P_{add\_i}\) as predictors in RF with 5-fold cross validation, i=1,⋯,n.

2. \(P \leftarrow P_{add\_I_{add}}\), where \(I_{add} \leftarrow \mathop {argmax}\limits _{i=1,\cdots,n}({ACC}_{add\_i})\), \(P\_ACC \leftarrow \mathop {max}\limits _{i=1,\cdots,n}({ACC}_{add\_i})\), \(C \leftarrow C \setminus P\).

Step 2.3: Try to remove members from P.

If P_ACC−ACC_max>−ε,do

1. Define n_{remove} as the length of P.

2. \(P_{remove\_i} \leftarrow P \setminus \{p_{i}\}\), \(p_{i} \in P\); calculate \({ACC}_{remove\_i}\) using \(P_{remove\_i}\) as predictors in RF with 5-fold cross validation, i=1,⋯,n_{remove}.

3. \(P\leftarrow \{p_{i} \mid {ACC}_{remove\_i}>P\_ACC,\ i=1,\cdots,n_{remove}\}\),

\(C \leftarrow C \cup \{p_{i} \mid {ACC}_{remove\_i} \leq P\_ACC,\ i=1,\cdots,n_{remove}\}\).

End if.

Step 2.4: Calculate accuracy.

Calculate P_ACC using P as predictors in RF with 5-fold cross validation.

End while.
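
The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `select_predictors`, `score`, `max_iter`, and `toy_score` are hypothetical and introduced only for this example. The callback `score` stands in for the 5-fold cross-validated RF accuracy of a gene subset. Three further assumptions are made where the listing is silent or ambiguous: ACC_max starts at 0, the removal step is read as dropping a member when the predictor set without it scores higher than P_ACC, and the removal is skipped if it would empty P.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Hypothetical scorer type: maps a gene subset to an accuracy in [0, 1].
Scorer = Callable[[FrozenSet[str]], float]


def select_predictors(genes: Sequence[str], score: Scorer, eps: float,
                      max_iter: int = 1000) -> Tuple[List[str], float]:
    """Forward selection with a removal check, following Steps 1 and 2."""
    candidates = list(genes)  # C = G, kept in ranked order

    # Step 1: pick the best single-gene predictor.
    single = {g: score(frozenset([g])) for g in candidates}
    best = max(candidates, key=lambda g: single[g])  # ties go to the higher rank
    predictors = [best]
    p_acc = single[best]
    candidates.remove(best)

    acc_max = 0.0  # ASSUMPTION: the listing does not give an initial ACC_max

    # Step 2: max_iter is a hypothetical safeguard against cycling.
    for _ in range(max_iter):
        if not (p_acc - acc_max > -eps and candidates):
            break
        acc_max = p_acc  # Step 2.1

        # Step 2.2: add the candidate that gives the highest accuracy.
        add_scores = {g: score(frozenset(predictors + [g])) for g in candidates}
        best = max(candidates, key=lambda g: add_scores[g])
        predictors.append(best)
        p_acc = add_scores[best]
        candidates.remove(best)

        # Step 2.3: try to remove members (ASSUMED reading, see lead-in).
        if p_acc - acc_max > -eps:
            current = frozenset(predictors)
            removable = [p for p in predictors if score(current - {p}) > p_acc]
            if len(removable) < len(predictors):  # never empty P
                predictors = [p for p in predictors if p not in removable]
                candidates.extend(removable)

        # Step 2.4: re-evaluate the current predictor set.
        p_acc = score(frozenset(predictors))

    return predictors, p_acc


def toy_score(subset: FrozenSet[str]) -> float:
    """Toy stand-in for RF accuracy: g1 and g3 help, other genes hurt."""
    gains = {"g1": 0.25, "g3": 0.125}
    return 0.5 + sum(gains.get(g, -0.0625) for g in subset)


if __name__ == "__main__":
    print(select_predictors(["g1", "g2", "g3", "g4"], toy_score, eps=0.01))
```

Passing the scorer as a callback keeps the selection logic independent of the classifier, so a cross-validated RF accuracy function can be supplied in place of `toy_score` without changing the loop.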

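The four measures named in Step 1.1 can be computed from confusion-matrix counts, as in the sketch below. It assumes that Eqs. 1 to 4 are the standard definitions of accuracy, sensitivity, specificity, and the Matthews correlation coefficient; the equations themselves are not reproduced in this listing. The function name `classification_metrics` and the convention of returning 0 when a denominator is zero are choices made for this example only.

```python
import math
from typing import Dict


def _ratio(num: float, den: float) -> float:
    """Return num / den, or 0.0 when the denominator is zero (example convention)."""
    return num / den if den else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute ACC, SN, SP, and MCC from confusion-matrix counts."""
    acc = _ratio(tp + tn, tp + tn + fp + fn)   # accuracy
    sn = _ratio(tp, tp + fn)                   # sensitivity (true positive rate)
    sp = _ratio(tn, tn + fp)                   # specificity (true negative rate)
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)   # Matthews correlation coefficient
    return {"ACC": acc, "SN": sn, "SP": sp, "MCC": mcc}


if __name__ == "__main__":
    print(classification_metrics(tp=40, tn=45, fp=5, fn=10))
```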