Table 7 Summary of the feature selection results.

From: SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences

| Feature set | # features: all | # features: after step 1 | # features: after step 2 |
|---|---|---|---|
| Length | 1 | 0 | 0 |
| Index-based | 50 | 5 | 0 |
| CV and CMV | 60 | 2 | 0 |
| CV for collocated AAs | 2000 | 4 | 1 |
| Property group-based | 35 | 1 | 0 |
| Predicted secondary structure content | 4 | 2 | 0 |
| Predicted secondary structure-based, with PSI-PRED | 86 | 27 | 8 |
| Predicted secondary structure-based, with YASPIN | 86 | 12 | 0 |
| Total # of features | 2322 | 53 | 9 |
| 10 fold cross validation accuracy for prediction on 25PDB dataset | 73.2% | 80.2% | 80.1% |

  1. The "feature set" column defines the categories of the considered features and the "all" column shows the total number of features in a given category. The "after step 1" and "after step 2" columns show how many features from that category were selected in step 1 and step 2 of the feature selection procedure, respectively.
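
The "Total # of features" row can be checked against the per-category counts with a short script. The sketch below is only an illustration and is not part of the original study: the dictionary `counts`, the tuple `reported_totals`, and the helper `column_totals` are hypothetical names introduced here, and the numbers are copied directly from Table 7.

```python
# Per-category feature counts copied from Table 7, in the order
# (all, after step 1, after step 2). All names here are illustrative.
counts = {
    "Length": (1, 0, 0),
    "Index-based": (50, 5, 0),
    "CV and CMV": (60, 2, 0),
    "CV for collocated AAs": (2000, 4, 1),
    "Property group-based": (35, 1, 0),
    "Predicted secondary structure content": (4, 2, 0),
    "Predicted secondary structure-based, with PSI-PRED": (86, 27, 8),
    "Predicted secondary structure-based, with YASPIN": (86, 12, 0),
}

# Totals as reported in the "Total # of features" row.
reported_totals = (2322, 53, 9)


def column_totals(rows):
    """Sum each of the three count columns across all feature sets."""
    return tuple(sum(column) for column in zip(*rows.values()))


totals = column_totals(counts)
assert totals == reported_totals, f"mismatch: {totals} vs {reported_totals}"

# Share of the original feature pool that survives each selection step.
for label, value in zip(("all", "after step 1", "after step 2"), totals):
    print(f"{label}: {value} features ({100 * value / totals[0]:.2f}% of all)")
```

The assertion passes only if the eight category rows sum to the reported totals in all three columns, which is a quick way to confirm that no cell was misplaced when the table was transcribed.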