Table 1 Prediction quality of different prediction approaches

From: Integrative approaches to the prediction of protein functions based on the feature selection

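| Specificity | # of GO terms | KLR, Pfam | KLR, Interpro | KLR, All (NOS) | KLR, All (S) | KLR, ES | KL1LR, λ/λmax = 0.01 (NOS) | KL1LR, λ/λmax = 0.01 (S) | KL1LR, λ/λmax = 0.1 (NOS) | KL1LR, λ/λmax = 0.1 (S) | KLR with Relief |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3-10 | 952 | 0.60 | 0.58 | 0.55 | 0.54 | 0.55 | 0.73 | 0.75 | 0.72 | 0.74 | 0.58 |
| 11-30 | 435 | 0.74 | 0.76 | 0.70 | 0.70 | 0.82 | 0.85 | 0.88 | 0.79 | 0.85 | 0.69 |
| 31-100 | 239 | 0.79 | 0.79 | 0.82 | 0.82 | 0.84 | 0.84 | 0.88 | 0.78 | 0.86 | 0.64 |
| 101-300 | 100 | 0.80 | 0.80 | 0.84 | 0.84 | 0.83 | 0.82 | 0.86 | 0.73 | 0.84 | 0.61 |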

  1. GO terms are categorized into four groups based on the number of genes covering the GO term (specificity in the first column). Prediction quality is estimated using AUC values for KLR using Pfam or Interpro data only, KLR using all data sources, KLR using a data source selected by exhaustive search (ES), KL1LR, and KLR using data sources selected by the Relief method. In the case of KL1LR, two different values of the regularization parameter λ are used. NOS stands for non-standardization of features, and S for standardization.
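
As an illustration of the two quantities named in the note, AUC and feature standardization, the following is a minimal sketch in Python. It is not the authors' implementation: the function names `auc` and `standardize` and the toy labels and scores are hypothetical, and it assumes that "standardization" means rescaling each feature to zero mean and unit variance, which the note does not state explicitly.

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tie counts 0.5, otherwise 0.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def standardize(column):
    """Rescale one feature column to zero mean and unit variance.
    Assumption: this is what the 'S' setting refers to."""
    mean = sum(column) / len(column)
    var = sum((x - mean) ** 2 for x in column) / len(column)
    std = var ** 0.5
    if std == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - mean) / std for x in column]


# Hypothetical toy data: 1 = gene annotated with the GO term, 0 = not.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.4, 0.35, 0.2, 0.1]
print(round(auc(labels, scores), 2))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is the scale on which the table entries can be read.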