
Table 4 Dataset statistics and prediction accuracies after homologous sequence removal (HSR) at 90% and 70% identity. DS refers to the descriptor set, where D1 = amino acid composition; D2 = dipeptide composition; D3 = Moreau-Broto autocorrelation; D4 = Moran autocorrelation; D5 = Geary autocorrelation; D6 = composition, transition and distribution descriptors; D7 = quasi sequence order; D8 = pseudo amino acid composition; D9 = combination of D1+D2; and D10 = combination of D1-D8. Prediction results are given as TP (true positives), FN (false negatives), TN (true negatives), FP (false positives), Sen (sensitivity), Spec (specificity), Q (overall accuracy) and MCC (Matthews correlation coefficient).

From: Efficacy of different protein descriptors in predicting protein functional families

   

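Independent evaluation set. TP, FN and Sen (%) are reported under the column group P; TN, FP and Spec (%) are reported under the column group N.

| Protein family | % HSR | DS | TP | FN | Sen (%) | TN | FP | Spec (%) | Q (%) | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| EC2.4 | 90 | D1 | 552 | 250 | 68.8 | 3235 | 201 | 94.2 | 89.4 | 0.65 |
| | | D2 | 626 | 176 | 78.1 | 3339 | 97 | 97.2 | 93.6 | 0.78 |
| | | D3 | 609 | 193 | 75.9 | 3384 | 52 | 98.5 | 94.2 | 0.80 |
| | | D4 | 603 | 199 | 75.2 | 3355 | 81 | 97.6 | 93.4 | 0.78 |
| | | D5 | 591 | 211 | 73.7 | 3381 | 55 | 98.4 | 93.7 | 0.79 |
| | | D6 | 501 | 301 | 62.5 | 3374 | 62 | 98.2 | 91.4 | 0.70 |
| | | D7 | 545 | 257 | 68.0 | 3261 | 175 | 94.9 | 89.8 | 0.66 |
| | | D8 | 666 | 136 | 83.0 | 3375 | 61 | 98.2 | 95.4 | 0.84 |
| | | D9 | 630 | 172 | 78.6 | 3357 | 79 | 97.7 | 94.1 | 0.80 |
| | | D10 | 670 | 132 | 83.5 | 3388 | 48 | 98.6 | 95.8 | 0.86 |
| | 70 | D1 | 459 | 223 | 67.3 | 3193 | 199 | 94.1 | 89.6 | 0.62 |
| | | D2 | 516 | 166 | 75.7 | 3296 | 96 | 97.2 | 93.6 | 0.76 |
| | | D3 | 503 | 179 | 73.8 | 3341 | 51 | 98.5 | 94.4 | 0.78 |
| | | D4 | 495 | 187 | 72.6 | 3311 | 81 | 97.6 | 93.4 | 0.75 |
| | | D5 | 484 | 198 | 71.0 | 3339 | 53 | 98.4 | 93.8 | 0.77 |
| | | D6 | 399 | 283 | 58.5 | 3330 | 62 | 98.2 | 91.5 | 0.67 |
| | | D7 | 452 | 230 | 66.3 | 3218 | 174 | 94.9 | 90.1 | 0.63 |
| | | D8 | 551 | 131 | 80.8 | 3331 | 61 | 98.2 | 95.3 | 0.83 |
| | | D9 | 520 | 162 | 76.3 | 3314 | 78 | 97.7 | 94.1 | 0.78 |
| | | D10 | 554 | 128 | 81.2 | 3344 | 48 | 98.6 | 95.7 | 0.84 |
| GPCR | 90 | D1 | 391 | 13 | 96.8 | 6724 | 58 | 99.1 | 99.0 | 0.91 |
| | | D2 | 395 | 9 | 97.8 | 6744 | 38 | 99.4 | 99.4 | 0.94 |
| | | D3 | 393 | 11 | 97.3 | 6726 | 56 | 99.2 | 99.1 | 0.92 |
| | | D4 | 386 | 18 | 95.5 | 6734 | 48 | 99.3 | 99.1 | 0.92 |
| | | D5 | 381 | 23 | 94.3 | 6723 | 59 | 99.1 | 98.9 | 0.90 |
| | | D6 | 391 | 13 | 96.8 | 6731 | 51 | 99.3 | 99.1 | 0.92 |
| | | D7 | 382 | 22 | 94.6 | 6685 | 97 | 98.6 | 98.3 | 0.86 |
| | | D8 | 387 | 17 | 95.8 | 6758 | 24 | 99.7 | 99.4 | 0.95 |
| | | D9 | 391 | 13 | 96.8 | 6752 | 30 | 99.6 | 99.4 | 0.94 |
| | | D10 | 388 | 16 | 96.0 | 6762 | 20 | 99.7 | 99.5 | 0.95 |
| | 70 | D1 | 307 | 8 | 97.5 | 6695 | 58 | 99.1 | 99.1 | 0.90 |
| | | D2 | 309 | 6 | 98.1 | 6715 | 38 | 99.4 | 99.4 | 0.93 |
| | | D3 | 306 | 9 | 97.1 | 6697 | 56 | 99.2 | 99.1 | 0.90 |
| | | D4 | 301 | 14 | 95.6 | 6705 | 48 | 99.3 | 99.1 | 0.90 |
| | | D5 | 298 | 17 | 94.6 | 6694 | 59 | 99.1 | 98.9 | 0.88 |
| | | D6 | 307 | 8 | 97.5 | 6702 | 51 | 99.2 | 99.2 | 0.91 |
| | | D7 | 296 | 19 | 94.0 | 6656 | 97 | 98.6 | 98.4 | 0.83 |
| | | D8 | 301 | 14 | 95.6 | 6729 | 24 | 99.6 | 99.5 | 0.94 |
| | | D9 | 307 | 8 | 97.5 | 6723 | 30 | 99.6 | 99.5 | 0.94 |
| | | D10 | 302 | 13 | 95.9 | 6733 | 20 | 99.7 | 99.5 | 0.95 |
| TC8.A | 90 | D1 | 28 | 27 | 50.9 | 1846 | 2 | 99.9 | 98.5 | 0.68 |
| | | D2 | 33 | 22 | 60.0 | 1846 | 2 | 99.9 | 98.7 | 0.75 |
| | | D3 | 34 | 21 | 61.8 | 1845 | 3 | 99.8 | 98.7 | 0.75 |
| | | D4 | 29 | 26 | 52.7 | 1845 | 3 | 99.8 | 98.8 | 0.75 |
| | | D5 | 29 | 26 | 52.7 | 1845 | 3 | 99.8 | 98.8 | 0.75 |
| | | D6 | 36 | 19 | 65.5 | 1846 | 2 | 99.9 | 98.9 | 0.78 |
| | | D7 | 35 | 20 | 63.6 | 1845 | 3 | 99.8 | 98.8 | 0.76 |
| | | D8 | 40 | 15 | 72.7 | 1845 | 3 | 99.8 | 99.2 | 0.82 |
| | | D9 | 33 | 22 | 60.0 | 1846 | 2 | 99.9 | 98.7 | 0.75 |
| | | D10 | 40 | 15 | 72.7 | 1845 | 3 | 99.8 | 99.2 | 0.82 |
| | 70 | D1 | 25 | 24 | 51.0 | 1828 | 2 | 99.9 | 98.6 | 0.68 |
| | | D2 | 29 | 20 | 59.2 | 1828 | 2 | 99.9 | 98.8 | 0.74 |
| | | D3 | 29 | 20 | 59.2 | 1827 | 3 | 99.8 | 98.8 | 0.73 |
| | | D4 | 26 | 23 | 53.1 | 1828 | 2 | 99.9 | 98.7 | 0.70 |
| | | D5 | 26 | 23 | 53.1 | 1828 | 2 | 99.9 | 98.7 | 0.70 |
| | | D6 | 33 | 16 | 67.3 | 1828 | 2 | 99.9 | 99.0 | 0.79 |
| | | D7 | 30 | 19 | 61.2 | 1827 | 3 | 99.8 | 98.8 | 0.74 |
| | | D8 | 36 | 13 | 73.5 | 1827 | 3 | 99.8 | 99.2 | 0.82 |
| | | D9 | 29 | 20 | 59.2 | 1828 | 2 | 99.9 | 98.8 | 0.74 |
| | | D10 | 36 | 13 | 73.5 | 1827 | 3 | 99.8 | 99.2 | 0.82 |
| Chlorophyll | 90 | D1 | 159 | 127 | 55.6 | 1594 | 8 | 99.5 | 92.9 | 0.70 |
| | | D2 | 205 | 81 | 71.7 | 1598 | 4 | 99.8 | 95.5 | 0.82 |
| | | D3 | 224 | 62 | 78.3 | 1599 | 3 | 99.8 | 96.6 | 0.86 |
| | | D4 | 222 | 64 | 77.6 | 1599 | 3 | 99.8 | 96.5 | 0.86 |
| | | D5 | 211 | 75 | 73.8 | 1598 | 4 | 99.8 | 95.8 | 0.83 |
| | | D6 | 182 | 104 | 63.6 | 1594 | 8 | 99.5 | 94.1 | 0.75 |
| | | D7 | 159 | 127 | 55.6 | 1595 | 9 | 99.4 | 92.8 | 0.69 |
| | | D8 | 233 | 53 | 81.5 | 1595 | 7 | 99.6 | 96.8 | 0.87 |
| | | D9 | 224 | 62 | 78.3 | 1594 | 8 | 99.5 | 96.3 | 0.85 |
| | | D10 | 229 | 57 | 80.1 | 1597 | 5 | 99.7 | 96.7 | 0.87 |
| | 70 | D1 | 113 | 118 | 48.9 | 1578 | 8 | 99.5 | 93.1 | 0.65 |
| | | D2 | 155 | 76 | 67.1 | 1582 | 4 | 99.8 | 95.6 | 0.79 |
| | | D3 | 171 | 60 | 74.0 | 1583 | 3 | 99.8 | 96.5 | 0.84 |
| | | D4 | 171 | 60 | 74.0 | 1583 | 3 | 99.8 | 96.5 | 0.84 |
| | | D5 | 161 | 70 | 69.7 | 1582 | 4 | 99.8 | 95.9 | 0.81 |
| | | D6 | 137 | 94 | 59.3 | 1578 | 8 | 99.5 | 94.4 | 0.72 |
| | | D7 | 114 | 117 | 49.4 | 1575 | 11 | 99.3 | 93.0 | 0.64 |
| | | D8 | 182 | 49 | 78.8 | 1579 | 7 | 99.6 | 96.9 | 0.85 |
| | | D9 | 172 | 59 | 74.5 | 1578 | 8 | 99.5 | 96.3 | 0.82 |
| | | D10 | 178 | 53 | 77.1 | 1581 | 5 | 99.7 | 96.8 | 0.85 |
| Lipid synthesis | 90 | D1 | 403 | 149 | 73.0 | 1213 | 59 | 95.4 | 88.6 | 0.72 |
| | | D2 | 431 | 121 | 78.1 | 1256 | 16 | 98.7 | 92.5 | 0.81 |
| | | D3 | 436 | 116 | 79.0 | 1268 | 4 | 99.7 | 93.4 | 0.84 |
| | | D4 | 421 | 131 | 76.3 | 1270 | 2 | 99.8 | 92.7 | 0.83 |
| | | D5 | 416 | 136 | 75.4 | 1270 | 2 | 99.8 | 92.4 | 0.82 |
| | | D6 | 449 | 103 | 81.3 | 1270 | 2 | 99.8 | 94.2 | 0.86 |
| | | D7 | 435 | 117 | 78.8 | 1269 | 3 | 99.8 | 93.4 | 0.84 |
| | | D8 | 423 | 129 | 76.6 | 1265 | 7 | 99.5 | 92.5 | 0.82 |
| | | D9 | 449 | 103 | 81.3 | 1245 | 27 | 97.9 | 92.9 | 0.83 |
| | | D10 | 454 | 98 | 82.3 | 1265 | 7 | 99.5 | 94.2 | 0.86 |
| | 70 | D1 | 316 | 138 | 69.6 | 1205 | 59 | 95.3 | 88.5 | 0.69 |
| | | D2 | 343 | 111 | 75.6 | 1248 | 16 | 98.7 | 92.6 | 0.81 |
| | | D3 | 340 | 114 | 74.9 | 1260 | 4 | 99.7 | 93.1 | 0.82 |
| | | D4 | 330 | 124 | 72.7 | 1262 | 2 | 99.8 | 92.7 | 0.81 |
| | | D5 | 328 | 126 | 72.3 | 1260 | 4 | 99.7 | 92.4 | 0.80 |
| | | D6 | 358 | 96 | 78.9 | 1244 | 20 | 98.4 | 93.3 | 0.82 |
| | | D7 | 342 | 112 | 75.3 | 1257 | 7 | 99.5 | 93.1 | 0.82 |
| | | D8 | 331 | 123 | 72.9 | 1257 | 7 | 99.4 | 92.4 | 0.80 |
| | | D9 | 360 | 94 | 79.3 | 1237 | 27 | 97.9 | 93.0 | 0.81 |
| | | D10 | 360 | 94 | 79.3 | 1257 | 7 | 99.5 | 94.1 | 0.85 |
| rRNA binding | 90 | D1 | 1407 | 91 | 93.9 | 3502 | 59 | 98.3 | 97.0 | 0.93 |
| | | D2 | 1437 | 61 | 95.9 | 3510 | 51 | 98.6 | 97.8 | 0.95 |
| | | D3 | 1403 | 95 | 93.7 | 3529 | 32 | 99.1 | 97.5 | 0.93 |
| | | D4 | 1347 | 151 | 89.9 | 3491 | 70 | 98.0 | 95.6 | 0.89 |
| | | D5 | 1347 | 151 | 89.9 | 3533 | 28 | 99.2 | 96.5 | 0.91 |
| | | D6 | 1451 | 47 | 96.9 | 3537 | 24 | 99.3 | 98.6 | 0.97 |
| | | D7 | 1358 | 140 | 90.7 | 3429 | 132 | 96.3 | 94.6 | 0.87 |
| | | D8 | 1442 | 56 | 96.3 | 3531 | 30 | 99.2 | 98.3 | 0.96 |
| | | D9 | 1436 | 62 | 95.9 | 3518 | 43 | 98.8 | 97.9 | 0.95 |
| | | D10 | 1449 | 49 | 96.7 | 3537 | 24 | 99.3 | 98.6 | 0.97 |
| | 70 | D1 | 924 | 83 | 91.8 | 3454 | 59 | 98.3 | 96.9 | 0.91 |
| | | D2 | 952 | 55 | 94.5 | 3463 | 50 | 98.6 | 97.7 | 0.93 |
| | | D3 | 920 | 87 | 91.4 | 3483 | 30 | 99.2 | 97.4 | 0.92 |
| | | D4 | 907 | 100 | 90.1 | 3444 | 69 | 98.0 | 96.3 | 0.89 |
| | | D5 | 908 | 99 | 90.2 | 3485 | 28 | 99.2 | 97.2 | 0.92 |
| | | D6 | 963 | 44 | 95.6 | 3493 | 20 | 99.4 | 98.6 | 0.96 |
| | | D7 | 917 | 90 | 91.1 | 3382 | 131 | 96.3 | 95.1 | 0.86 |
| | | D8 | 954 | 53 | 94.7 | 3484 | 29 | 99.2 | 98.2 | 0.95 |
| | | D9 | 950 | 57 | 94.3 | 3471 | 42 | 98.8 | 97.8 | 0.94 |
| | | D10 | 960 | 47 | 95.3 | 3490 | 23 | 99.4 | 98.5 | 0.96 |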
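
The caption names Sen, Spec, Q and MCC but does not print their formulas. As a minimal sketch, assuming the authors used the standard confusion-matrix definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), overall accuracy = (TP+TN)/total, and the usual Matthews correlation coefficient), the derived columns can be recomputed from the four counts in any row. The function name `confusion_metrics` is hypothetical and does not come from the paper.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Return Sen, Spec and Q as percentages and MCC as a coefficient.

    Assumes the standard definitions; the table itself does not state them.
    """
    sen = 100.0 * tp / (tp + fn)        # sensitivity over positive samples
    spec = 100.0 * tn / (tn + fp)       # specificity over negative samples
    q = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # overall accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Guard against an empty row or column of the confusion matrix.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sen": sen, "Spec": spec, "Q": q, "MCC": mcc}


# Counts taken from the EC2.4 row at 90% HSR with descriptor set D1.
row = confusion_metrics(tp=552, fn=250, tn=3235, fp=201)
print(f"Sen={row['Sen']:.1f} Spec={row['Spec']:.1f} "
      f"Q={row['Q']:.1f} MCC={row['MCC']:.2f}")
```

For that row the sketch reproduces the tabulated 68.8, 94.2, 89.4 and 0.65 after rounding. Because TP + FN and TN + FP give the positive and negative class sizes, the same counts can also serve as a quick check that every descriptor set within one family and HSR level was evaluated on the same number of samples.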