Table 1 Prediction quality of different prediction approaches

From: Integrative approaches to the prediction of protein functions based on the feature selection

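| Specificity (genes per GO term) | # of GO terms | KLR: Pfam | KLR: Interpro | KLR: All (NOS) | KLR: All (S) | KLR: ES | KL1LR, λ/λmax = 0.01 (NOS) | KL1LR, λ/λmax = 0.01 (S) | KL1LR, λ/λmax = 0.1 (NOS) | KL1LR, λ/λmax = 0.1 (S) | KLR with Relief |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 3-10 | 952 | 0.60 | 0.58 | 0.55 | 0.54 | 0.55 | 0.73 | 0.75 | 0.72 | 0.74 | 0.58 |
| 11-30 | 435 | 0.74 | 0.76 | 0.70 | 0.70 | 0.82 | 0.85 | 0.88 | 0.79 | 0.85 | 0.69 |
| 31-100 | 239 | 0.79 | 0.79 | 0.82 | 0.82 | 0.84 | 0.84 | 0.88 | 0.78 | 0.86 | 0.64 |
| 101-300 | 100 | 0.80 | 0.80 | 0.84 | 0.84 | 0.83 | 0.82 | 0.86 | 0.73 | 0.84 | 0.61 |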
  1. GO terms are divided into four groups by the number of genes covering each GO term (specificity, first column). Prediction quality is measured by AUC for KLR using Pfam or Interpro data only, KLR using all data sources, KLR using a data source selected by exhaustive search (ES), KL1LR, and KLR using data sources selected by the Relief method. For KL1LR, two values of the regularization parameter λ are used, expressed as the ratio λ/λmax. NOS denotes non-standardized features and S denotes standardized features.
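
The grouping and scoring described in the note can be sketched in a few lines of Python. This is only an illustration, not the authors' pipeline: the function names, the `(n_genes, labels, scores)` input layout, and the choice to average per-term AUC within each group are all assumptions made here, since the table does not state how per-term values were aggregated.

```python
from statistics import mean

# Specificity bins from Table 1: (low, high) gene counts, inclusive.
BINS = [(3, 10), (11, 30), (31, 100), (101, 300)]


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def specificity_bin(n_genes):
    """Return the 'low-high' label of the bin containing n_genes, or None if outside all bins."""
    for low, high in BINS:
        if low <= n_genes <= high:
            return f"{low}-{high}"
    return None


def mean_auc_by_bin(terms):
    """Average per-term AUC within each specificity bin.

    terms: iterable of (n_genes, labels, scores), one entry per GO term.
    This input layout and the averaging step are assumptions of this sketch.
    """
    grouped = {}
    for n_genes, labels, scores in terms:
        key = specificity_bin(n_genes)
        if key is not None:
            grouped.setdefault(key, []).append(auc(labels, scores))
    return {key: mean(values) for key, values in grouped.items()}
```

The pairwise AUC is quadratic in the number of genes, which is fine for a sketch; a rank-based formulation would be the usual choice for larger data.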