
Table 5 Leave-cluster-out cross-validation results of the four models on the 23 protein-family clusters (A to W) and the 3 multi-family clusters (X to Z) of the PDBbind v2009 refined set (N = 1740), in terms of root mean square error (RMSE), standard deviation in linear correlation (SD), Pearson correlation coefficient (Rp) and Spearman correlation coefficient (Rs)

From: Substituting random forest for multiple linear regression improves binding affinity prediction of scoring functions: Cyscore as a case study

| Cluster name | Cluster | N | MLR::Cyscore RMSE | MLR::Cyscore SD | MLR::Cyscore Rp | MLR::Cyscore Rs | RF::Cyscore RMSE | RF::Cyscore SD | RF::Cyscore Rp | RF::Cyscore Rs | RF::CyscoreVina RMSE | RF::CyscoreVina SD | RF::CyscoreVina Rp | RF::CyscoreVina Rs | RF::CyscoreVinaElem RMSE | RF::CyscoreVinaElem SD | RF::CyscoreVinaElem Rp | RF::CyscoreVinaElem Rs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HIV protease | A | 188 | 1.65 | 1.53 | 0.259 | 0.216 | 1.70 | 1.51 | 0.310 | 0.201 | 1.76 | 1.56 | 0.182 | 0.105 | 1.77 | 1.56 | 0.166 | 0.129 |
| trypsin | B | 74 | 1.24 | 1.11 | 0.612 | 0.695 | 1.10 | 1.11 | 0.610 | 0.636 | 0.96 | 0.97 | 0.723 | 0.700 | 0.93 | 0.93 | 0.751 | 0.715 |
| carbonic anhydrase | C | 57 | 2.47 | 1.35 | 0.473 | 0.343 | 2.44 | 1.43 | 0.368 | 0.264 | 2.60 | 1.37 | 0.448 | 0.372 | 2.33 | 1.35 | 0.481 | 0.234 |
| thrombin | D | 53 | 1.52 | 1.40 | 0.702 | 0.676 | 1.50 | 1.44 | 0.680 | 0.611 | 1.47 | 1.45 | 0.675 | 0.675 | 1.46 | 1.40 | 0.699 | 0.680 |
| protein tyrosine phosphatase | E | 32 | 1.23 | 1.06 | 0.411 | 0.313 | 1.30 | 1.10 | 0.338 | 0.268 | 1.36 | 0.98 | 0.538 | 0.542 | 1.23 | 0.89 | 0.643 | 0.615 |
| factor Xa | F | 32 | 1.18 | 0.96 | 0.604 | 0.634 | 1.54 | 1.13 | 0.367 | 0.356 | 1.53 | 1.02 | 0.533 | 0.498 | 1.61 | 1.07 | 0.470 | 0.470 |
| urokinase | G | 29 | 1.15 | 1.14 | 0.643 | 0.602 | 1.10 | 1.14 | 0.642 | 0.645 | 1.25 | 1.27 | 0.516 | 0.436 | 1.05 | 1.06 | 0.699 | 0.624 |
| different similar transporters | H | 29 | 0.96 | 0.96 | 0.285 | 0.122 | 1.27 | 0.99 | 0.056 | -0.040 | 1.10 | 0.98 | 0.188 | 0.077 | 1.01 | 0.93 | 0.354 | 0.123 |
| c-AMP dependent kinase | I | 17 | 1.32 | 1.15 | 0.537 | 0.537 | 1.16 | 1.11 | 0.582 | 0.602 | 0.94 | 0.91 | 0.748 | 0.664 | 1.06 | 0.91 | 0.747 | 0.644 |
| β-glucosidase | J | 17 | 1.03 | 0.78 | 0.383 | 0.316 | 1.04 | 0.76 | 0.444 | 0.365 | 0.92 | 0.72 | 0.518 | 0.443 | 1.05 | 0.68 | 0.597 | 0.649 |
| antibodies | K | 16 | 1.41 | 1.43 | 0.693 | 0.706 | 1.67 | 1.76 | 0.455 | 0.466 | 1.47 | 1.51 | 0.645 | 0.643 | 1.36 | 1.33 | 0.739 | 0.777 |
| casein kinase II | L | 16 | 0.75 | 0.58 | 0.538 | 0.358 | 0.76 | 0.58 | 0.535 | 0.330 | 0.90 | 0.60 | 0.493 | 0.322 | 0.97 | 0.61 | 0.454 | 0.309 |
| ribonuclease | M | 15 | 1.12 | 1.20 | 0.230 | 0.340 | 1.07 | 1.06 | 0.505 | 0.281 | 1.11 | 0.99 | 0.595 | 0.481 | 1.23 | 1.03 | 0.551 | 0.493 |
| thermolysin | N | 14 | 1.15 | 1.14 | 0.680 | 0.635 | 0.98 | 1.03 | 0.748 | 0.648 | 1.04 | 1.12 | 0.696 | 0.565 | 0.97 | 1.05 | 0.738 | 0.636 |
| CDK2 kinase | O | 13 | 1.06 | 0.80 | 0.841 | 0.812 | 1.14 | 1.01 | 0.733 | 0.817 | 1.14 | 1.02 | 0.729 | 0.661 | 1.12 | 1.14 | 0.640 | 0.525 |
| glutamate receptor 2 | P | 13 | 1.08 | 0.85 | 0.070 | 0.096 | 1.09 | 0.85 | 0.120 | 0.097 | 1.08 | 0.85 | 0.116 | 0.121 | 1.00 | 0.84 | 0.123 | 0.016 |
| P38 kinase | Q | 13 | 0.55 | 0.57 | 0.834 | 0.896 | 0.76 | 0.66 | 0.762 | 0.757 | 0.95 | 0.62 | 0.799 | 0.764 | 0.59 | 0.51 | 0.870 | 0.896 |
| β-secretase I | R | 12 | 1.44 | 1.33 | 0.892 | 0.725 | 1.57 | 1.51 | 0.858 | 0.620 | 1.54 | 1.51 | 0.860 | 0.687 | 1.43 | 1.31 | 0.895 | 0.687 |
| tRNA-guanine transglycosylase | S | 12 | 0.90 | 0.95 | 0.463 | 0.544 | 1.06 | 1.04 | 0.212 | 0.375 | 0.87 | 0.95 | 0.457 | 0.403 | 0.87 | 0.95 | 0.457 | 0.522 |
| endothiapepsin | T | 11 | 1.18 | 1.30 | 0.435 | 0.215 | 1.28 | 1.35 | 0.358 | 0.210 | 1.35 | 1.36 | 0.345 | 0.215 | 1.36 | 1.27 | 0.480 | 0.210 |
| α-mannosidase 2 | U | 10 | 1.67 | 1.63 | -0.004 | 0.248 | 1.65 | 1.62 | 0.116 | 0.188 | 1.73 | 1.62 | 0.089 | 0.176 | 1.83 | 1.63 | 0.053 | 0.103 |
| carboxypeptidase A | V | 10 | 2.13 | 1.99 | 0.479 | 0.523 | 1.90 | 1.89 | 0.556 | 0.370 | 1.82 | 1.76 | 0.632 | 0.467 | 1.77 | 1.54 | 0.734 | 0.685 |
| penicillopepsin | W | 10 | 1.71 | 1.87 | 0.339 | 0.188 | 1.78 | 1.94 | 0.236 | 0.188 | 1.81 | 1.96 | 0.183 | 0.030 | 1.91 | 1.99 | 0.078 | -0.030 |
| families with 4-9 complexes | X | 386 | 1.73 | 1.71 | 0.500 | 0.577 | 1.61 | 1.60 | 0.587 | 0.598 | 1.58 | 1.56 | 0.610 | 0.612 | 1.54 | 1.53 | 0.630 | 0.632 |
| families with 2-3 complexes | Y | 340 | 1.64 | 1.64 | 0.510 | 0.495 | 1.64 | 1.63 | 0.522 | 0.505 | 1.55 | 1.55 | 0.583 | 0.580 | 1.51 | 1.52 | 0.608 | 0.595 |
| singletons | Z | 321 | 1.76 | 1.74 | 0.407 | 0.417 | 1.81 | 1.75 | 0.397 | 0.395 | 1.70 | 1.68 | 0.476 | 0.467 | 1.67 | 1.65 | 0.503 | 0.507 |
| average | | | 1.35 | 1.24 | 0.493 | 0.470 | 1.38 | 1.27 | 0.465 | 0.414 | 1.37 | 1.23 | 0.515 | 0.450 | 1.33 | 1.18 | 0.545 | 0.479 |
| standard deviation | | | 0.41 | 0.38 | 0.216 | 0.217 | 0.38 | 0.37 | 0.209 | 0.212 | 0.39 | 0.36 | 0.211 | 0.211 | 0.39 | 0.35 | 0.228 | 0.251 |