
Table 4 Evaluation of results from top-ranking TESTLoc prediction schemes¹

From: TESTLoc: protein subcellular localization prediction from EST data

| Dataset | Prediction scheme | Metric | chl | cyt | end | ext | mit | nuc | per | pla | vac |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Expanded plant dataset | 1. Top-performing individual feature (4th order amino acid composition) | SN | 99.9 (0.44) | 53 (12.2) | 20 (42.2) | 83 (17.7) | 88.4 (7) | 82.7 (4.9) | 20 (42.2) | 20 (42.2) | 80 (23.3) |
| | | PPV | 99.9 (0.44) | 76.1 (14.4) | 20 (42.2) | 96.3 (7.8) | 67.9 (4.7) | 88 (4.8) | 20 (42.2) | 20 (42.2) | 100 (0) |
| | | MCC | 0.99 (0.01) | 0.61 (0.14) | 0.2 (0.42) | 0.88 (0.1) | 0.7 (0.07) | 0.82 (0.04) | 0.2 (0.42) | 0.2 (0.42) | 0.88 (0.14) |
| | 2. Integration of predictions from all sequence features | SN | 100 (0) | 45.5 (8.55) | 20 (42.2) | 69 (12) | 86.1 (10.8) | 78.1 (8.3) | 40 (51.6) | 30 (48.3) | 63.3 (24.6) |
| | | PPV | 99.3 (1.4) | 93.5 (8.6) | 20 (42.2) | 98 (6.3) | 70.3 (11.6) | 97.4 (4.2) | 11.7 (31.2) | 8.8 (16.7) | 100 (0) |
| | | MCC | 0.99 (0.01) | 0.63 (0.08) | 0.2 (0.42) | 0.81 (0.07) | 0.7 (0.05) | 0.85 (0.05) | 0.16 (0.31) | 0.15 (0.26) | 0.78 (0.16) |
| | 3. Integration of attributes from all sequence features | SN | 100 (0) | 9.3 (9.2) | 10 (31.6) | 48.5 (22.9) | 82.2 (8.2) | 80 (6.2) | 0 (0) | 0 (0) | 0 (0) |
| | | PPV | 100 (0) | 28.8 (31.8) | 10 (31.6) | 77.8 (22) | 53.1 (3.4) | 76.2 (8.5) | 0 (0) | 0 (0) | 0 (0) |
| | | MCC | 1 (0) | 0.1 (0.16) | 0.1 (0.32) | 0.6 (0.13) | 0.5 (0.07) | 0.7 (0.07) | 0 (0) | 0 (0) | 0 (0) |
| | 4. Integration of predictions from three top-performing features² | SN | 99.9 (0.44) | 50.5 (9.5) | 20 (42.2) | 71 (12) | 86.7 (17.7) | 75.8 (7.9) | 50 (52.7) | 30 (48.3) | 63.3 (24.6) |
| | | PPV | 99.7 (0.6) | 88.4 (9.2) | 20 (42.2) | 98 (6.3) | 71.2 (12.4) | 96.1 (4.6) | 21.6 (41.4) | 5 (8.1) | 100 (0) |
| | | MCC | 0.99 (0.01) | 0.65 (0.09) | 0.2 (0.42) | 0.83 (0.07) | 0.71 (0.06) | 0.82 (0.04) | 0.26 (0.4) | 0.12 (0.19) | 0.78 (0.16) |
| | 5. Integration of attributes from three top-performing features | SN | 94.4 (2.2) | 52.2 (12.7) | 20 (42.2) | 75 (17.2) | 84.2 (5.8) | 77.7 (4.7) | 20 (42.2) | 20 (42.2) | 76.7 (22.5) |
| | | PPV | 86.6 (3.7) | 90.5 (11) | 20 (42.2) | 96 (8.4) | 67.8 (3.7) | 92 (4.3) | 20 (42.2) | 20 (42.2) | 100 (0) |
| | | MCC | 0.8 (0.05) | 0.66 (0.1) | 0.2 (0.42) | 0.84 (0.1) | 0.68 (0.05) | 0.81 (0.04) | 0.2 (0.42) | 0.2 (0.42) | 0.86 (0.14) |
| Arabidopsis validation dataset | Integration of predictions from three top-performing features | SN | 47.4 | 58.5 | 80 | 100 | 89.8 | 90.2 | 0 | 100 | 100 |
| | | PPV | 90.6 | 86.1 | 100 | 100 | 68.2 | 100 | 100 | 100 | 100 |
| | | MCC | 0.42 | 0.67 | 0.89 | 1 | 0.57 | 0.94 | 0 | 1 | 1 |
  1. Numbers are averages over the 10-fold test. Numbers in parentheses are standard deviations. Bold numbers indicate the best values for each metric (SN, PPV, MCC) in each class of the expanded plant dataset. The values for SN and PPV are given in %. MCC, Matthews Correlation Coefficient. For other abbreviations, see footnote to Table 1.
  2. The three features are 4th order amino acid composition, 6th order group-C amino acid composition and 7th order group-D amino acid composition.
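
The metrics in the table can be illustrated with a minimal Python sketch. It assumes the standard per-class definitions of SN, PPV and MCC computed from true/false positive and negative counts; the function names, the convention of returning 0 when a denominator is zero, and the example fold values are illustrative assumptions, not taken from the paper. Entries such as 20 (42.2) are consistent with a sample standard deviation (n - 1 denominator) over ten per-fold values, two at 100% and eight at 0%; that reading is an inference from the numbers, not something the footnote states.

```python
import math
import statistics


def sensitivity(tp, fn):
    """SN = TP / (TP + FN), in percent. Returns 0 if the class has no positives (assumed convention)."""
    return 100.0 * tp / (tp + fn) if (tp + fn) else 0.0


def ppv(tp, fp):
    """PPV = TP / (TP + FP), in percent. Returns 0 if nothing was predicted as this class (assumed convention)."""
    return 100.0 * tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp, fp, tn, fn):
    """Matthews Correlation Coefficient; returns 0 when the denominator is zero (assumed convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def mean_and_sd(per_fold_values):
    """Mean and sample standard deviation (n - 1 denominator) across folds."""
    return statistics.mean(per_fold_values), statistics.stdev(per_fold_values)


# Hypothetical per-fold SN values for a rare class: found in 2 of 10 folds.
folds = [100.0, 100.0] + [0.0] * 8
mean, sd = mean_and_sd(folds)
print(round(mean, 1), round(sd, 1))  # prints 20.0 42.2
```

A large standard deviation next to a small mean therefore suggests a class with very few test instances per fold, where each fold scores either 0% or 100%.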