
Table 4. Evaluation of results from top-ranking TESTLoc prediction schemes¹

From: TESTLoc: protein subcellular localization prediction from EST data

 

| Prediction scheme | Metric | chl | cyt | end | ext | mit | nuc | per | pla | vac |
|---|---|---|---|---|---|---|---|---|---|---|
| Expanded plant dataset | | | | | | | | | | |
| 1. Top-performing individual feature (4th order amino acid composition) | SN (%) | 99.9 (0.44) | 53 (12.2) | 20 (42.2) | 83 (17.7) | 88.4 (7) | 82.7 (4.9) | 20 (42.2) | 20 (42.2) | 80 (23.3) |
| | PPV (%) | 99.9 (0.44) | 76.1 (14.4) | 20 (42.2) | 96.3 (7.8) | 67.9 (4.7) | 88 (4.8) | 20 (42.2) | 20 (42.2) | 100 (0) |
| | MCC | 0.99 (0.01) | 0.61 (0.14) | 0.2 (0.42) | 0.88 (0.1) | 0.7 (0.07) | 0.82 (0.04) | 0.2 (0.42) | 0.2 (0.42) | 0.88 (0.14) |
| 2. Integration of predictions from all sequence features | SN (%) | 100 (0) | 45.5 (8.55) | 20 (42.2) | 69 (12) | 86.1 (10.8) | 78.1 (8.3) | 40 (51.6) | 30 (48.3) | 63.3 (24.6) |
| | PPV (%) | 99.3 (1.4) | 93.5 (8.6) | 20 (42.2) | 98 (6.3) | 70.3 (11.6) | 97.4 (4.2) | 11.7 (31.2) | 8.8 (16.7) | 100 (0) |
| | MCC | 0.99 (0.01) | 0.63 (0.08) | 0.2 (0.42) | 0.81 (0.07) | 0.7 (0.05) | 0.85 (0.05) | 0.16 (0.31) | 0.15 (0.26) | 0.78 (0.16) |
| 3. Integration of attributes from all sequence features | SN (%) | 100 (0) | 9.3 (9.2) | 10 (31.6) | 48.5 (22.9) | 82.2 (8.2) | 80 (6.2) | 0 (0) | 0 (0) | 0 (0) |
| | PPV (%) | 100 (0) | 28.8 (31.8) | 10 (31.6) | 77.8 (22) | 53.1 (3.4) | 76.2 (8.5) | 0 (0) | 0 (0) | 0 (0) |
| | MCC | 1 (0) | 0.1 (0.16) | 0.1 (0.32) | 0.6 (0.13) | 0.5 (0.07) | 0.7 (0.07) | 0 (0) | 0 (0) | 0 (0) |
| 4. Integration of predictions from three top-performing features² | SN (%) | 99.9 (0.44) | 50.5 (9.5) | 20 (42.2) | 71 (12) | 86.7 (17.7) | 75.8 (7.9) | 50 (52.7) | 30 (48.3) | 63.3 (24.6) |
| | PPV (%) | 99.7 (0.6) | 88.4 (9.2) | 20 (42.2) | 98 (6.3) | 71.2 (12.4) | 96.1 (4.6) | 21.6 (41.4) | 5 (8.1) | 100 (0) |
| | MCC | 0.99 (0.01) | 0.65 (0.09) | 0.2 (0.42) | 0.83 (0.07) | 0.71 (0.06) | 0.82 (0.04) | 0.26 (0.4) | 0.12 (0.19) | 0.78 (0.16) |
| 5. Integration of attributes from three top-performing features | SN (%) | 94.4 (2.2) | 52.2 (12.7) | 20 (42.2) | 75 (17.2) | 84.2 (5.8) | 77.7 (4.7) | 20 (42.2) | 20 (42.2) | 76.7 (22.5) |
| | PPV (%) | 86.6 (3.7) | 90.5 (11) | 20 (42.2) | 96 (8.4) | 67.8 (3.7) | 92 (4.3) | 20 (42.2) | 20 (42.2) | 100 (0) |
| | MCC | 0.8 (0.05) | 0.66 (0.1) | 0.2 (0.42) | 0.84 (0.1) | 0.68 (0.05) | 0.81 (0.04) | 0.2 (0.42) | 0.2 (0.42) | 0.86 (0.14) |
| Arabidopsis validation dataset | | | | | | | | | | |
| Integration of predictions from three top-performing features | SN (%) | 47.4 | 58.5 | 80 | 100 | 89.8 | 90.2 | 0 | 100 | 100 |
| | PPV (%) | 90.6 | 86.1 | 100 | 100 | 68.2 | 100 | 100 | 100 | 100 |
| | MCC | 0.42 | 0.67 | 0.89 | 1 | 0.57 | 0.94 | 0 | 1 | 1 |

  1. Numbers are the averages over the 10-fold test. Numbers in parentheses are the standard deviations. Bold numbers indicate the best values for each metric (SN, PPV, MCC) in each class of the expanded plant dataset. The values for SN and PPV are given in %. MCC, Matthews Correlation Coefficient. For other abbreviations, see footnote to Table 1.
  2. The three features are 4th order amino acid composition, 6th order group-C amino acid composition and 7th order group-D amino acid composition.
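
The per-class metrics and the "mean (standard deviation)" summaries reported in Table 4 can be illustrated with a minimal Python sketch. It assumes one-vs-rest confusion counts per class, the standard textbook definitions of sensitivity (SN), positive predictive value (PPV) and MCC, and the sample standard deviation (n - 1 in the denominator). The function names, the convention of returning 0 when a denominator is zero, and the fold values are hypothetical and are not taken from TESTLoc.

```python
import math
import statistics


def class_metrics(tp, fp, fn, tn):
    """Return (SN %, PPV %, MCC) for one class from one-vs-rest counts.

    Assumption: a metric whose denominator is zero is reported as 0.
    The source does not state which convention TESTLoc uses.
    """
    sn = 100.0 * tp / (tp + fn) if (tp + fn) else 0.0
    ppv = 100.0 * tp / (tp + fp) if (tp + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sn, ppv, mcc


def summarize(fold_values):
    """Mean and sample standard deviation across cross-validation folds."""
    return statistics.mean(fold_values), statistics.stdev(fold_values)


# Hypothetical per-fold SN values for a very small class:
# found perfectly in 2 of 10 folds, missed entirely in the other 8.
fold_sn = [100.0, 100.0] + [0.0] * 8
mean_sn, sd_sn = summarize(fold_sn)
print(f"{mean_sn:.0f} ({sd_sn:.1f})")  # prints "20 (42.2)"
```

Under these assumptions, a class that is predicted perfectly in 2 of 10 folds and missed in the other 8 yields 20 (42.2). This is consistent with several entries for the small classes (end, per, pla) and may explain why the standard deviation exceeds the mean there, although the source does not state which estimator was used.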