Table 1 Classification and regression performance of Prethermut on the M-dataset

From: Predicting changes in protein thermostability brought about by single- or multi-site mutations

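| Method^a | Mutation numbers | n^b | MCC | Q2 (%) | Sensitivity (%) | Specificity (%) | r |
|---|---|---|---|---|---|---|---|
| RF | 1 | 2765 | 0.46 | 77.3 | 71.3 | 79.7 | 0.70 |
| RF | 2 | 441 | 0.66 | 84.8 | 81.0 | 86.5 | 0.79 |
| RF | 3 | 93 | 0.86 | 96.8 | 84.6 | 98.8 | 0.87 |
| RF | ≥4 | 67 | 0.92 | 97.0 | 93.8 | 98.0 | 0.86 |
| RF | ≥1 | 3366 | 0.50 | 79.7 | 73.6 | 81.1 | 0.72 |
| SVM | 1 | 2765 | 0.39 | 79.8 | 41.2 | 92.1 | 0.64 |
| SVM | 2 | 441 | 0.59 | 83.0 | 51.1 | 97.4 | 0.74 |
| SVM | 3 | 93 | 0.45 | 89.7 | 23.1 | 100.0 | 0.79 |
| SVM | ≥4 | 67 | 0.66 | 88.1 | 50.0 | 100.0 | 0.78 |
| SVM | ≥1 | 3366 | 0.43 | 79.7 | 42.7 | 93.2 | 0.67 |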

All of the results were obtained by 10-fold cross-validation on the M-dataset. See Methods for definitions of overall accuracy (Q2), Matthews correlation coefficient (MCC), sensitivity, specificity, and Pearson correlation coefficient (r).

^a The number of trees in the random forests (RF) method is 10000; the parameters for the support vector machine (SVM) method are gamma (g) = 2, cost (c) = 8, and the weight for the positive samples (w) = 3.

^b n is the number of mutant proteins in the sample; the total number of proteins in the M-dataset was 3366.
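
For readers who want to reproduce the kinds of metrics reported in this table, the following is a minimal Python sketch using the standard textbook definitions of Q2, MCC, sensitivity, specificity, and Pearson's r. It assumes these match the definitions in the paper's Methods section, which remains the authoritative source. The function names `binary_metrics` and `pearson_r` and the confusion-matrix counts in the usage lines are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Classification metrics from confusion-matrix counts.

    tp/tn/fp/fn are true/false positives/negatives. Percent-valued
    metrics are returned on a 0-100 scale, as in the table.
    """
    total = tp + tn + fp + fn
    q2 = 100.0 * (tp + tn) / total            # overall accuracy
    sensitivity = 100.0 * tp / (tp + fn)      # true-positive rate
    specificity = 100.0 * tn / (tn + fp)      # true-negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal total is zero; report 0.0 then.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Q2": q2, "sensitivity": sensitivity,
            "specificity": specificity, "MCC": mcc}


def pearson_r(xs, ys):
    """Pearson correlation between predicted and observed values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical counts, for illustration only (not data from the paper).
example = binary_metrics(tp=40, tn=45, fp=5, fn=10)
print(example)
```

Because MCC accounts for all four confusion-matrix cells, it is less affected by class imbalance than Q2. This is one way to read the SVM rows, where specificity is high and sensitivity is low, yet Q2 stays close to the RF values while MCC is lower.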