
Table 2 Results of the automated optimisation of the Bound and Spec parameters using k-fold cross-validation (n = 10), and of the final scoring scheme evaluated with the same validation approach

From: EnCOUNTer: a parsing tool to uncover the mature N-terminus of organelle-targeted proteins in complex samples

| Investigated parameters | Dataset or Scoring Scheme | ExpMin Position | ExpMax Position | EnCOUNTer or Spec threshold | True Positive | True Negative | False Positive | False Negative | Accuracy | Sensitivity | Specificity | False Discovery Rate | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Spec (True dataset) | Training | - | - | > 62.9 ± 2.0 | 162 ± 3 | 288 ± 3 | 7 ± 2 | 21 ± 2 | 94.0 ± 0.3% | 88.5 ± 1.0% | 97.4 ± 0.5% | 4.6 ± 0.9% | 0.87 ± 0.01 |
| | Validation | - | - | N.R. | 16 ± 2 | 31 ± 3 | 2 ± 2 | 4 ± 2 | 88.5 ± 4.1% | 78.7 ± 7.3% | 94.9 ± 4.2% | 9.3 ± 7.7% | 0.67 ± 0.11 |
| Bound (True dataset) | Training | 17 ± 4 | 80 ± 6 | - | 164 ± 4 | 241 ± 6 | 56 ± 6 | 19 ± 3 | 84.3 ± 1.5% | 89.8 ± 0.7% | 81.2 ± 1.8% | 25.3 ± 1.6% | 0.71 ± 0.01 |
| | Validation | N.R. | N.R. | - | 18 ± 3 | 26 ± 4 | 7 ± 3 | 3 ± 1 | 82.9 ± 5.5% | 87.7 ± 4.3% | 80.0 ± 7.8% | 26.6 ± 8.7% | 0.67 ± 0.09 |
| | All data together | 14 | 78 | 63.2 | 183 | 266 | 63 | 20 | 84.4% | 90.1% | 80.9% | 25.6% | 0.69 |
| Spec / Bound / Prox (True dataset) | Training | 17 ± 4 | 80 ± 6 | > 129.9 ± 0.6 | 167 ± 4 | 293 ± 5 | 3 ± 1 | 16 ± 2 | 96.1 ± 0.6% | 91.2 ± 0.6% | 99.1 ± 0.2% | 1.6 ± 0.3% | 0.92 ± 0.01 |
| | Validation | N.R. | N.R. | N.R. | 19 ± 4 | 33 ± 5 | 0 ± 1 | 2 ± 2 | 95.9 ± 2.9% | 91.1 ± 5.3% | 98.7 ± 2.3% | 1.9 ± 3.2% | 0.91 ± 0.06 |
| Spec / Bound / Prox (False dataset) | Training | 86 ± 9 | 300 ± 1 | < 69.3 ± 6.1 (*) | 272 ± 5 | 108 ± 4 | 74 ± 3 | 24 ± 4 | 79.5 ± 0.5% | 92.0 ± 1.2% | 59.1 ± 1.7% | 21.4 ± 0.6% | 0.59 ± 0.01 |
| | Validation | N.R. | N.R. | N.R. | 30 ± 3 | 12 ± 3 | 9 ± 3 | 3 ± 3 | 78.0 ± 4.1% | 90.5 ± 6.7% | 58.1 ± 10.1% | 22.0 ± 5.4% | 0.55 ± 0.08 |
| Fraction 5 dataset | True dataset (Spec only) | 14 | 78 | 65.1 | 179 | 321 | 8 | 24 | 94.0% | 88.2% | 97.6% | 4.3% | 0.872 |
| | True dataset (Spec, Bound) | 14 | 78 | 112.8 | 180 | 326 | 3 | 23 | 95.1% | 88.7% | 99.1% | 1.6% | 0.897 |
| | True dataset (Spec, Bound, Prox) | 14 | 78 | 130.1 | 185 | 326 | 3 | 18 | 96.1% | 91.1% | 99.1% | 1.6% | 0.917 |
| | False dataset (Spec only) | 79 | 300 | 72.2 | 285 | 185 | 17 | 44 | 88.5% | 86.6% | 91.6% | 5.6% | 0.767 |
| | False dataset (Spec, Bound, Prox) | 79 | 300 | 66.7 | 304 | 118 | 84 | 25 | 79.5% | 92.4% | 58.4% | 21.6% | 0.556 |
| | False dataset (Stringent params) | 14 | 78 | 133.8 | 186 | 326 | 3 | 17 | 96.2% | 91.6% | 99.1% | 1.6% | 0.921 |
| Fraction 6 dataset | True dataset (Spec, Bound, Prox) | 14 | 78 | 130.1 | 179 | 442 | 8 | 51 | 91.3% | 77.8% | 98.2% | 4.3% | 0.806 |
(*) For the prediction based on the False dataset, the EnCOUNTer score of a True hit must be below the determined threshold.
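
The performance columns in the table (Accuracy, Sensitivity, Specificity, False Discovery Rate and MCC) follow from the four confusion-matrix counts by the standard definitions. The Python sketch below recomputes them for a single row. It is an illustration only, not code from EnCOUNTer: the names `Confusion` and `metrics` are hypothetical, and the use of the standard textbook formulas is an assumption about how the tabulated values were derived.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts for one row of the table (hypothetical helper)."""
    tp: int  # True Positive
    tn: int  # True Negative
    fp: int  # False Positive
    fn: int  # False Negative


def _ratio(num: float, den: float) -> float:
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def metrics(c: Confusion) -> dict[str, float]:
    """Compute the standard classification metrics as fractions (not percent)."""
    total = c.tp + c.tn + c.fp + c.fn
    # Denominator of the Matthews correlation coefficient (MCC).
    mcc_den = math.sqrt(
        (c.tp + c.fp) * (c.tp + c.fn) * (c.tn + c.fp) * (c.tn + c.fn)
    )
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "fdr": _ratio(c.fp, c.fp + c.tp),
        "mcc": _ratio(c.tp * c.tn - c.fp * c.fn, mcc_den),
    }


# Counts taken from the "Fraction 5 dataset / True dataset (Spec, Bound, Prox)" row.
row = Confusion(tp=185, tn=326, fp=3, fn=18)
for name, value in metrics(row).items():
    print(f"{name}: {value:.3f}")
```

For this row the computed values round to the tabulated 96.1%, 91.1%, 99.1%, 1.6% and 0.917. The cross-validation rows are reported as mean ± spread over the folds, so recomputing a metric from their rounded mean counts need not reproduce the tabulated figure exactly.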