Table 1 The 20 input features used for Sigma-RF, listed with their importance estimates

From: Sigma-RF: prediction of the variability of spatial restraints in template-based modeling by random forest

Index Feature Importance
F1 \(|I-J|\) 7.51
F2 \(|I-J|/N_{\textit{res,target}}\) 2.91
F3 \(d_{\textit{template}}\) 9.43
F4 \(m_{I,K}\, m_{J,L}\) 2.55
F5 \(\sum _{\substack {I \leq i \leq J\\ K \leq j \leq L}} m_{i,j}\delta (i,j)/\sum _{\substack {I \leq i \leq J\\ K \leq j \leq L}}\delta (i,j)\) 16.81
F6 \(N_{\textit {gap}}^{IJ}\) 1.91
F7 \(N_{\textit {gap}}^{IJ}/|I-J|\) 1.36
F8 \(1/|I-I'|\) 0.12
F9 \(1/|J-J'|\) 0.20
F10 \(N_{\textit {gap}}^{KL}\) 0.37
F11 \(N_{\textit {gap}}^{KL}/|K-L|\) 0.32
F12 \(1/|K-K'|\) 0.23
F13 \(1/|L-L'|\) 0.49
F14 \(\sum _{s=H,E,C} p(s)\delta (s_{I},s_{K})\) 0.16
F15 \(\sum _{s=H,E,C} p(s)\delta (s_{J},s_{L})\) 0.88
F16 \(\sum _{acc=B,E} p(acc)\delta (acc_{I},acc_{K})\) 0.53
F17 \(\sum _{acc=B,E} p(acc)\delta (acc_{J},acc_{L})\) 0.58
F18 \(F_{4}\, F_{14}\, F_{15}\, F_{16}\, F_{17}\) 3.62
F19 \(\frac {F_{18}}{1+F_{6}+F_{10}}\) 3.02
F20 \(\frac {F_{19}}{1+F_{8}+F_{9}+F_{12}+F_{13}}\) 4.22
  1. I and J (>I) are residue indices in the target sequence, and K and L (>K) are residue indices in the template sequence. When the two residue pairs (I, K) and (J, L) are both aligned, we extract the distance \(d_{\textit{template}}\) between two atoms in the template. \(N_{\textit{res,target}}\) is the chain length of the target sequence. \(m_{I,K}\) is the match score of the aligned pair (I, K). In F5, δ(i,j)=1 if residues i and j are aligned, and δ(i,j)=0 otherwise. \(N_{\textit{gap}}^{IJ}\) is the number of gaps between I and J in the target sequence; \(N_{\textit{gap}}^{KL}\) is the corresponding count between K and L in the template sequence. \(I'\), \(J'\), \(K'\) and \(L'\) are the residue indices of the gaps closest to I, J, K and L, respectively. p(s) is the PSI-PRED score of each secondary structure element: helix (H), strand (E) and coil (C). p(acc) is the SANN score of each solvent accessibility state: buried (B) and exposed (E).
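As an illustration of how the window-averaged feature F5 and the composite features F18 to F20 follow from the definitions in the table, here is a minimal Python sketch. It is not the authors' implementation. The function names, the dictionary-based inputs, and the 0.0 fallback for a window with no aligned pairs are all assumptions made for this example.

```python
from typing import Dict, Tuple


def mean_match_score(
    match: Dict[Tuple[int, int], float], I: int, J: int, K: int, L: int
) -> float:
    """F5: mean match score over aligned pairs (i, j) with I <= i <= J, K <= j <= L.

    `match` (hypothetical input) maps each aligned pair (i, j) to m_{i,j}.
    Only aligned pairs are stored, so membership stands in for delta(i, j) = 1.
    """
    scores = [m for (i, j), m in match.items() if I <= i <= J and K <= j <= L]
    # Returning 0.0 when no pair is aligned in the window is an assumption.
    return sum(scores) / len(scores) if scores else 0.0


def composite_features(f: Dict[int, float]) -> Dict[int, float]:
    """F18 to F20 from the table, given the base features keyed by index."""
    # F18: product of the match-score term and the four agreement terms.
    f18 = f[4] * f[14] * f[15] * f[16] * f[17]
    # F19: F18 damped by the gap counts in target (F6) and template (F10).
    f19 = f18 / (1 + f[6] + f[10])
    # F20: F19 damped further by the inverse distances to the nearest gaps.
    f20 = f19 / (1 + f[8] + f[9] + f[12] + f[13])
    return {18: f18, 19: f19, 20: f20}
```

The values passed to these functions would come from the alignment and from the PSI-PRED and SANN predictions; any numbers used to exercise the sketch are made up and do not come from the paper.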