Table 5 Accuracy of machine learning predictions.a

From: Use of machine learning algorithms to classify binary protein sequences as highly-designable or poorly-designable

a) Sequences folding to the top 10% of designable structures vs. sequences folding to the bottom 10% of designable structures for both shapes

                 J48      Naïve Bayes   SMO
  % correct      69.5     65.0          65.6
  AUC            0.73     0.69          0.67
  Sens           0.67     0.66          0.71
  Spec           0.71     0.65          0.64

b) Sequences folding to the top 10% of designable structures of hexagonal shape vs. sequences folding to the bottom 10% of designable structures in the triangular shape

                 J48      Naïve Bayes   SMO
  % correct      98.1     84.9          87.0
  AUC            0.99     0.92          0.87
  Sens           0.98     0.82          0.84
  Spec           0.98     0.90          0.92

c) Sequences folding to the top 10% of designable structures of triangular shape vs. sequences folding to the bottom 10% of designable structures in the hexagonal shape

                 J48      Naïve Bayes   SMO
  % correct      98.0     65.8          64.3
  AUC            0.99     0.70          0.63
  Sens           0.98     0.64          0.75
  Spec           0.98     0.72          0.66
a For classifying: a) sequences folding to highly-designable conformations for the hexagonal and triangular shapes against sequences folding to the least designable conformations for these two shapes; b) sequences folding to the most designable conformations of the hexagonal shape against sequences folding to the least designable conformations of the triangular shape; and c) sequences folding to the most designable conformations of the triangular shape against sequences folding to the least designable conformations of the hexagonal shape. Prediction accuracy, area under the curve (AUC), sensitivity (Sens), and specificity (Spec) are given for each method.
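
For readers less familiar with the reported metrics, the following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from a binary confusion matrix. It assumes that "highly-designable" is treated as the positive class, which the table does not state explicitly. The function name `binary_metrics` and the counts are hypothetical illustrations, not values from the paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp: positives correctly predicted (assumed here: highly-designable)
    fn: positives predicted as negative
    tn: negatives correctly predicted (assumed here: poorly-designable)
    fp: negatives predicted as positive
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,    # fraction of all sequences classified correctly
        "sensitivity": tp / (tp + fn),    # true-positive rate
        "specificity": tn / (tn + fp),    # true-negative rate
    }


# Hypothetical counts for illustration only (not taken from the paper).
metrics = binary_metrics(tp=80, fn=20, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

AUC is not computed here because it requires the classifier's ranked scores for each sequence rather than a single confusion matrix.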