
Table 5 Subset evaluation. Accuracies obtained by the learning algorithms, run with WEKA's default parameters, on the best data subset found by combination (column 3) and on the subset chosen by the feature selection method (column 5)

From: A model to predict the function of hypothetical proteins through a nine-point classification scoring schema

| Algorithms | Best combination subsets (from complete dataset) | Accuracy (%) | Feature selection subsets | Accuracy (%) |
|---|---|---|---|---|
| bayes_NaiveBayesUpdateable | 1,6,7,9 | 96.67 | Cfs 1,2,3,6,7,9 | 96.67 |
| functions_smo_npolyk | 1,2,4,6,7,9 | 98.00 | PCA 1,2,3,4,5,6,7,8 | 97.00 |
| rules_DecisionTable | 6,7,9 | 96.00 | Cfs 1,2,3,6,7,9 | 96.00 |
| functions_mlp | 1,2,4,6,7,9 | 98.33 | Cfs 1,2,3,6,7,9 | 96.67 |
| bayes_nbay | 1,6,7,9 | 96.67 | Cfs 1,2,3,6,7,9 | 96.67 |
| trees_j48 | 1,2,4,6,9 | 97.67 | PCA 1,2,3,4,5,6,7,8 | 97.00 |
  1. Column 1 lists the algorithms. Columns 2 and 4 list the best data subsets, and columns 3 and 5 list the corresponding accuracies. (1: Pfam; 2: Orthology; 3: Prot_interactions; 4: Best Blast hits; 5: Subcellular localization; 6: Functional linkages; 7: HPs linked to Pseudogenes; 8: Homology modelling; 9: HPs linked to ncRNAs). The two kinds of subsets give almost the same accuracies, with the subset combinations from the complete dataset matching or slightly exceeding the feature selection subsets
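
The numeric subset codes can be hard to read at a glance. The following is a minimal Python sketch, not part of the original article, that expands the codes into the names given in the footnote and computes the accuracy difference between the two subset types for each algorithm. The names `FEATURES`, `ROWS`, `decode` and `accuracy_gaps` are hypothetical helpers introduced only for illustration; the values are copied from Table 5.

```python
# Feature-group codes as listed in the table footnote.
FEATURES = {
    1: "Pfam",
    2: "Orthology",
    3: "Prot_interactions",
    4: "Best Blast hits",
    5: "Subcellular localization",
    6: "Functional linkages",
    7: "HPs linked to Pseudogenes",
    8: "Homology modelling",
    9: "HPs linked to ncRNAs",
}

# Rows of Table 5: (algorithm, best-combination subset, accuracy,
#                   feature selection method, selected subset, accuracy)
ROWS = [
    ("bayes_NaiveBayesUpdateable", "1,6,7,9", 96.67, "Cfs", "1,2,3,6,7,9", 96.67),
    ("functions_smo_npolyk", "1,2,4,6,7,9", 98.00, "PCA", "1,2,3,4,5,6,7,8", 97.00),
    ("rules_DecisionTable", "6,7,9", 96.00, "Cfs", "1,2,3,6,7,9", 96.00),
    ("functions_mlp", "1,2,4,6,7,9", 98.33, "Cfs", "1,2,3,6,7,9", 96.67),
    ("bayes_nbay", "1,6,7,9", 96.67, "Cfs", "1,2,3,6,7,9", 96.67),
    ("trees_j48", "1,2,4,6,9", 97.67, "PCA", "1,2,3,4,5,6,7,8", 97.00),
]


def decode(subset):
    """Turn a comma-separated code string such as '6,7,9' into feature names."""
    return [FEATURES[int(code)] for code in subset.split(",")]


def accuracy_gaps(rows):
    """Best-combination accuracy minus feature-selection accuracy, per algorithm."""
    return {row[0]: round(row[2] - row[5], 2) for row in rows}


if __name__ == "__main__":
    gaps = accuracy_gaps(ROWS)
    for algorithm, best_subset, _, method, selected_subset, _ in ROWS:
        print(algorithm)
        print("  best combination:", ", ".join(decode(best_subset)))
        print(f"  {method} subset:", ", ".join(decode(selected_subset)))
        print("  accuracy gap:", gaps[algorithm])
```

Under this reading of the table, no algorithm does worse on its best-combination subset than on the feature selection subset, which is consistent with the footnote's remark.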