Table 2 Process of stepwise character selection based on RF

From: Extracting predictors for lung adenocarcinoma based on Granger causality test and stepwise character selection

Algorithm: Stepwise Character Selection
Input: Ranked independent gene list G, number of candidate genes n, gene expression microarray DR(nd), threshold ε.
Output: Predictor set P, prediction accuracy ACC, P_ACC.
Step 1: Initialization: P = ∅, candidate gene set C = G.
  Step 1.1: For each \(c_{i} \in C\), calculate \({ACC}_{i}\), \({SN}_{i}\), \({SP}_{i}\) and \({MCC}_{i}\) by Eqs. 1, 2, 3, 4, using \(c_{i}\) as a single predictor in RF with 5-fold cross validation.
  Step 1.2: \(P \leftarrow \{p\}\), where \(p \leftarrow \mathop {argmax}\limits _{c_{i} \in C}{ACC}_{i}\), \(P\_ACC \leftarrow \mathop {max}\limits _{i\in \{1, \cdots, n\}}{ACC}_{i}\), \(C \leftarrow C/p\).
Step 2: Character selection.
  While P_ACC − ACC_max > −ε and C ≠ ∅, do
   Step 2.1: ACC_max ← P_ACC.
   Step 2.2: Add members to P.
    1. \(P_{add\_i} \leftarrow P \cup \{c_{i}\}\), \(c_{i} \in C\); calculate \({ACC}_{add\_i}\) using \(P_{add\_i}\) as predictors in RF with 5-fold cross validation, i = 1, …, n.
    2. \(P \leftarrow P_{add\_I_{add}}\), where \(I_{add} \leftarrow \mathop {argmax}\limits _{i=1,\cdots,n}{ACC}_{add\_i}\), \(P\_ACC \leftarrow \mathop {max}\limits _{i=1,\cdots,n}{ACC}_{add\_i}\), \(C \leftarrow C/P\).
   Step 2.3: Try to remove members from P.
    If P_ACC − ACC_max > −ε, do
    1. Define \(n_{remove}\) as the number of members of P.
    2. \(P_{remove\_i} \leftarrow P/p_{i}\), \(p_{i} \in P\); calculate \({ACC}_{remove\_i}\) using \(P_{remove\_i}\) as predictors in RF with 5-fold cross validation, i = 1, …, \(n_{remove}\).
    3. \(P\leftarrow \{p_{i}|{ACC}_{remove\_i}>P\_ACC,i=1,\cdots,n_{remove}\}\),
    \(C \leftarrow C \cup \{p_{i}|{ACC}_{remove\_i} \leq P\_ACC,i=1,\cdots,n_{remove}\}\).
    End if.
   Step 2.4: Calculate accuracy.
    Calculate P_ACC using P as predictors in RF with 5-fold cross validation.
   End while.
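
The add-then-remove loop in the table can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `stepwise_select`, `score`, `max_iter` and `toy_score` are hypothetical. The `score` callable stands in for "accuracy of RF with 5-fold cross validation using the given genes as predictors". The sketch makes three choices the table does not spell out: ACC_max starts at minus infinity so that the first loop test passes; in Step 2.3 a member is returned to C when removing it raises the accuracy above P_ACC, which is one reading of the intent and is the opposite partition from the inequality as printed; and an iteration cap guards against a gene being added and removed indefinitely.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Hypothetical stand-in for "RF accuracy with 5-fold cross validation".
Score = Callable[[FrozenSet[str]], float]


def stepwise_select(genes: Sequence[str], score: Score,
                    eps: float = 0.01, max_iter: int = 100
                    ) -> Tuple[List[str], float]:
    """Greedy add/remove selection loosely following Steps 1 and 2."""
    candidates = list(genes)                       # C = G
    # Step 1: the best single predictor seeds P.
    single = {g: score(frozenset([g])) for g in candidates}
    best = max(candidates, key=single.get)
    selected = [best]                              # P
    candidates.remove(best)
    p_acc = single[best]
    acc_max = float("-inf")                        # assumption: first test passes

    for _ in range(max_iter):                      # assumption: iteration cap
        if not (p_acc - acc_max > -eps and candidates):
            break
        acc_max = p_acc                            # Step 2.1
        # Step 2.2: add the candidate giving the highest accuracy.
        add_acc = {c: score(frozenset(selected + [c])) for c in candidates}
        best = max(candidates, key=add_acc.get)
        selected.append(best)
        candidates.remove(best)
        p_acc = add_acc[best]
        # Step 2.3: return members to C when dropping them raises accuracy
        # (assumed reading; never empties P).
        if p_acc - acc_max > -eps and len(selected) > 1:
            harmful = [p for p in selected
                       if score(frozenset(selected) - {p}) > p_acc]
            if len(harmful) < len(selected):
                selected = [p for p in selected if p not in harmful]
                candidates.extend(harmful)
        # Step 2.4: re-evaluate the current predictor set.
        p_acc = score(frozenset(selected))
    return selected, p_acc


# Toy, deterministic score for illustration only: two informative genes
# raise the accuracy, every other gene lowers it slightly.
INFORMATIVE = frozenset({"g1", "g2"})


def toy_score(subset: FrozenSet[str]) -> float:
    good = len(subset & INFORMATIVE)
    return 0.5 + 0.125 * good - 0.0625 * (len(subset) - good)


predictors, accuracy = stepwise_select(["g1", "g2", "n1", "n2"],
                                       toy_score, eps=0.1, max_iter=20)
print(predictors, accuracy)
```

In a real setting, `score` could wrap scikit-learn, for example the mean of `cross_val_score(RandomForestClassifier(), X[:, cols], y, cv=5)` for the columns corresponding to the chosen genes. Because a random forest is stochastic, repeated calls on the same subset can return slightly different accuracies, so fixing `random_state` is worth considering.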