
Table 7 Summary of the feature selection results.

From: SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

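| Feature set | # features (all) | # features (after step 1) | # features (after step 2) |
|---|---|---|---|
| Length | 1 | 0 | 0 |
| Index-based | 50 | 5 | 0 |
| CV and CMV | 60 | 2 | 0 |
| CV for collocated AAs | 2000 | 4 | 1 |
| Property group-based | 35 | 1 | 0 |
| Predicted secondary structure content | 4 | 2 | 0 |
| Predicted secondary structure-based (with PSI-PRED) | 86 | 27 | 8 |
| Predicted secondary structure-based (with YASPIN) | 86 | 12 | 0 |
| Total # of features | 2322 | 53 | 9 |
| 10-fold cross-validation accuracy for prediction on the 25PDB dataset | 73.2% | 80.2% | 80.1% |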
  1. The "feature set" column defines the categories of the considered features. The "all" column shows the total number of features in a given category, while the "after step 1" and "after step 2" columns show the number of features from that category that were selected in step 1 and step 2 of the feature selection procedure, respectively.
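
The per-category counts can be checked against the "Total # of features" row with a short script. This is only a sketch for verifying the arithmetic in the table: the dictionary layout and the helper name `column_totals` are assumptions made for illustration and are not part of SCPRED or its published code.

```python
# Feature counts per category, copied from Table 7.
# Each tuple is (all, after step 1, after step 2).
counts = {
    "Length": (1, 0, 0),
    "Index-based": (50, 5, 0),
    "CV and CMV": (60, 2, 0),
    "CV for collocated AAs": (2000, 4, 1),
    "Property group-based": (35, 1, 0),
    "Predicted secondary structure content": (4, 2, 0),
    "Predicted secondary structure-based (PSI-PRED)": (86, 27, 8),
    "Predicted secondary structure-based (YASPIN)": (86, 12, 0),
}


def column_totals(table):
    """Sum each of the three columns across all feature categories (hypothetical helper)."""
    return tuple(sum(column) for column in zip(*table.values()))


totals = column_totals(counts)

# Categories that still contribute at least one feature after step 2.
surviving = [name for name, (_, _, step2) in counts.items() if step2 > 0]

print(totals)
print(surviving)
```

The column sums agree with the totals reported in the table (2322, 53, and 9), and only two categories keep any features after step 2: "CV for collocated AAs" and the PSI-PRED-based predicted secondary structure features.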