Algorithm: Stepwise Character Selection |
---|
Input: Ranked independent gene list G, number of candidate genes n, gene expression microarray |
\(D \in \mathbb{R}^{n \times d}\), threshold ε. |
Output: Predictor set P, prediction accuracy ACC, P_ACC. |
Step 1: Initialization: P=∅, candidate gene set C=G. |
Step 1.1: Calculate \(ACC_{i}\), \(SN_{i}\), \(SP_{i}\), and \(MCC_{i}\) by Eqs. 1, 2, 3, and 4, respectively, for each \(c_{i} \in C\) |
acting as a single predictor in RF with 5-fold cross validation. |
Step 1.2: \(P \leftarrow \{p\}\), where \(p \leftarrow \mathop {argmax}\limits _{c_{i} \in C} ACC_{i}\), \(P\_ACC \leftarrow \mathop {max}\limits _{i \in \{1,\cdots,n\}} ACC_{i}\), \(C \leftarrow C \setminus \{p\}\). |
Step 2: Character selection. |
While P_ACC−ACC_max>−ε and C≠∅, do |
Step 2.1: ACC_max←P_ACC. |
Step 2.2: Add a member to P. |
1. \(P_{add\_i} \leftarrow P \cup \{c_{i}\}\), \(c_{i} \in C\); calculate \({ACC}_{add\_i}\) using \(P_{add\_i}\) as predictors in RF with 5-fold cross validation, i=1,⋯,n. |
2. \(P \leftarrow P_{add\_I_{add}}\), where \(I_{add} \leftarrow \mathop {argmax}\limits _{i=1,\cdots,n} {ACC}_{add\_i}\), \(P\_ACC \leftarrow \mathop {max}\limits _{i=1,\cdots,n} {ACC}_{add\_i}\), \(C \leftarrow C \setminus P\). |
Step 2.3: Try removing members from P. |
If P_ACC−ACC_max>−ε, do |
1. Define \(n_{remove}\) as the number of members in P. |
2. \(P_{remove\_i} \leftarrow P \setminus \{p_{i}\}\), \(p_{i} \in P\); calculate \({ACC}_{remove\_i}\) using \(P_{remove\_i}\) as predictors in RF with 5-fold cross validation, \(i=1,\cdots,n_{remove}\). |
3. \(P \leftarrow \{p_{i} \mid {ACC}_{remove\_i}>P\_ACC, i=1,\cdots,n_{remove}\}\), |
\(C \leftarrow C \cup \{p_{i} \mid {ACC}_{remove\_i} \leq P\_ACC, i=1,\cdots,n_{remove}\}\). |
End if. |
Step 2.4: Calculate accuracy. |
Calculate P_ACC using P as predictors in RF with 5-fold cross validation. |
End while. |
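
The control flow of the listing can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The `score` callable is a hypothetical stand-in for "accuracy of RF with 5-fold cross validation on these predictors"; here it is a made-up lookup table so the example runs without any data. The removal rule is read as dropping a member, and returning it to C, when accuracy without it exceeds P_ACC; that reading, the `max_rounds` safety cap, and the guard against emptying P are assumptions of the sketch. The while-condition is checked at the end of each round, since ACC_max is first set in Step 2.1.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Hypothetical scorer type: maps a predictor set to an accuracy value.
Score = Callable[[FrozenSet[str]], float]


def stepwise_select(genes: Sequence[str], score: Score, eps: float = 0.0,
                    max_rounds: int = 100) -> Tuple[List[str], float]:
    """Stepwise add/remove selection; `score` stands in for RF 5-fold CV accuracy."""
    candidates = list(genes)                      # C = G, ranked order kept
    # Step 1: the best single predictor seeds P.
    first = max(candidates, key=lambda g: score(frozenset([g])))
    selected = [first]
    candidates.remove(first)
    p_acc = score(frozenset(selected))

    for _ in range(max_rounds):                   # safety cap (assumption)
        if not candidates:                        # C is empty: stop
            break
        acc_max = p_acc                           # Step 2.1
        # Step 2.2: add the candidate that gives the highest accuracy.
        add = max(candidates, key=lambda g: score(frozenset(selected + [g])))
        selected.append(add)
        candidates.remove(add)
        p_acc = score(frozenset(selected))
        # Step 2.3: drop members whose removal raises accuracy (assumed reading).
        if p_acc - acc_max > -eps:
            drop = [g for g in selected
                    if score(frozenset(selected) - {g}) > p_acc]
            if len(drop) < len(selected):         # never empty P (assumption)
                selected = [g for g in selected if g not in drop]
                candidates.extend(drop)           # dropped genes return to C
        p_acc = score(frozenset(selected))        # Step 2.4
        if p_acc - acc_max <= -eps:               # while-condition fails: stop
            break
    return selected, p_acc


# Toy accuracy table with made-up values, keyed by predictor set.
TOY = {
    frozenset("a"): 0.5, frozenset("b"): 0.375, frozenset("c"): 0.25,
    frozenset("ab"): 0.625, frozenset("ac"): 0.5, frozenset("bc"): 0.75,
    frozenset("abc"): 0.6875,
}
subset, acc = stepwise_select(["a", "b", "c"], TOY.__getitem__)
print(subset, acc)
```

In this toy run, gene `a` is selected first, dropped in the second round once `b` and `c` together score higher without it, and then added again in the round that ends the loop. Because the removal step is skipped when the threshold test fails, the returned set still contains the gene added in that last round. In practice `score` would wrap a cross-validated random forest fit on the columns of D for the given genes.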