From: Efficacy of different protein descriptors in predicting protein functional families
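Protein family | Descriptor set | Training set P | Training set N | Testing set TP | Testing set FN | Testing set TN | Testing set FP | Independent evaluation set TP | Independent evaluation set FN | Independent evaluation set Sen(%) | Independent evaluation set TN | Independent evaluation set FP | Independent evaluation set Spec(%) | Independent evaluation set Q(%) | Independent evaluation set MCC |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|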
EC2.4 | D1 | 1249 | 2120 | 1154 | 1 | 9065 | 12 | 724 | 176 | 80.4 | 3244 | 202 | 94.1 | 91.3 | 0.74 |
 | D2 | 1319 | 2120 | 1080 | 5 | 8806 | 1 | 746 | 154 | 82.9 | 3349 | 97 | 97.2 | 94.1 | 0.80 |
 | D3 | 1105 | 1756 | 1295 | 4 | 9166 | 5 | 768 | 132 | 85.3 | 3394 | 52 | 98.5 | 95.8 | 0.87 |
 | D4 | 1239 | 2221 | 1161 | 4 | 8701 | 5 | 756 | 144 | 84.0 | 3365 | 81 | 97.7 | 94.8 | 0.84 |
 | D5 | 1242 | 2223 | 1160 | 2 | 8690 | 14 | 753 | 147 | 83.6 | 3391 | 55 | 98.4 | 95.4 | 0.85 |
 | D6 | 1214 | 2077 | 1145 | 45 | 8846 | 4 | 741 | 159 | 82.3 | 3383 | 63 | 98.2 | 94.9 | 0.84 |
 | D7 | 1293 | 2624 | 1072 | 39 | 8295 | 8 | 696 | 204 | 77.3 | 3270 | 176 | 94.9 | 91.3 | 0.73 |
 | D8 | 1226 | 3008 | 1177 | 1 | 7918 | 1 | 794 | 106 | 88.2 | 3387 | 59 | 98.3 | 96.2 | 0.88 |
 | D9 | 1275 | 2747 | 1129 | 0 | 8177 | 3 | 782 | 118 | 86.9 | 3367 | 79 | 97.7 | 95.5 | 0.86 |
 | D10 | 1228 | 3254 | 1176 | 0 | 7672 | 1 | 798 | 102 | 88.7 | 3397 | 49 | 98.6 | 96.5 | 0.89 |
GPCR | D1 | 1590 | 7458 | 1847 | 1 | 14166 | 3 | 505 | 17 | 96.7 | 6735 | 58 | 99.1 | 99.0 | 0.93 |
 | D2 | 564 | 711 | 1728 | 3 | 14121 | 5 | 510 | 12 | 97.7 | 6737 | 56 | 99.2 | 99.1 | 0.93 |
 | D3 | 1169 | 4628 | 1122 | 4 | 10208 | 1 | 507 | 15 | 97.1 | 6737 | 56 | 99.2 | 99.0 | 0.93 |
 | D4 | 1257 | 4474 | 1037 | 1 | 10363 | 0 | 499 | 23 | 95.6 | 6745 | 48 | 99.3 | 99.0 | 0.93 |
 | D5 | 1290 | 4724 | 997 | 8 | 10113 | 0 | 494 | 28 | 94.6 | 6734 | 59 | 99.1 | 98.8 | 0.91 |
 | D6 | 757 | 2060 | 1536 | 2 | 12777 | 0 | 503 | 19 | 96.3 | 6742 | 51 | 99.2 | 99.0 | 0.93 |
 | D7 | 812 | 2950 | 1482 | 1 | 11887 | 0 | 495 | 27 | 94.8 | 6696 | 97 | 98.6 | 98.3 | 0.88 |
 | D8 | 653 | 2171 | 1644 | 0 | 12550 | 1 | 501 | 21 | 96.0 | 6769 | 24 | 99.7 | 99.4 | 0.95 |
 | D9 | 1590 | 7458 | 693 | 12 | 7322 | 57 | 512 | 10 | 98.1 | 6735 | 58 | 99.1 | 99.1 | 0.93 |
 | D10 | 672 | 2454 | 1625 | 0 | 12268 | 0 | 502 | 20 | 96.2 | 6757 | 36 | 99.5 | 99.2 | 0.94 |
TC8.A | D1 | 118 | 2858 | 49 | 0 | 13121 | 0 | 36 | 27 | 57.1 | 1843 | 2 | 99.9 | 98.5 | 0.73 |
 | D2 | 116 | 1100 | 50 | 0 | 14824 | 0 | 41 | 22 | 65.1 | 1843 | 2 | 99.9 | 98.7 | 0.78 |
 | D3 | 94 | 7962 | 53 | 0 | 14501 | 0 | 42 | 21 | 66.7 | 1842 | 3 | 99.8 | 98.7 | 0.78 |
 | D4 | 94 | 7962 | 47 | 0 | 11250 | 0 | 37 | 26 | 58.7 | 1843 | 2 | 99.9 | 98.5 | 0.74 |
 | D5 | 94 | 7962 | 47 | 0 | 11137 | 0 | 37 | 26 | 58.7 | 1843 | 2 | 99.9 | 98.5 | 0.74 |
 | D6 | 94 | 7962 | 64 | 0 | 15283 | 0 | 44 | 19 | 69.8 | 1843 | 2 | 99.9 | 98.9 | 0.81 |
 | D7 | 94 | 7962 | 59 | 0 | 15045 | 0 | 43 | 20 | 68.3 | 1843 | 2 | 99.9 | 98.9 | 0.80 |
 | D8 | 103 | 943 | 63 | 0 | 14981 | 0 | 48 | 15 | 76.2 | 1843 | 2 | 99.9 | 99.1 | 0.85 |
 | D9 | 114 | 810 | 52 | 0 | 15114 | 0 | 41 | 22 | 65.1 | 1843 | 2 | 99.9 | 98.7 | 0.78 |
 | D10 | 102 | 1068 | 64 | 0 | 14856 | 0 | 48 | 15 | 76.2 | 1843 | 2 | 99.9 | 99.1 | 0.85 |
Chlorophyll | D1 | 356 | 7928 | 166 | 0 | 14297 | 0 | 182 | 128 | 58.7 | 1587 | 11 | 99.3 | 92.7 | 0.71 |
 | D2 | 440 | 934 | 248 | 1 | 7927 | 1 | 228 | 82 | 73.6 | 1595 | 3 | 99.8 | 95.6 | 0.83 |
 | D3 | 425 | 603 | 264 | 0 | 15253 | 0 | 246 | 64 | 79.4 | 1594 | 4 | 99.8 | 96.4 | 0.86 |
 | D4 | 415 | 574 | 273 | 1 | 15282 | 0 | 247 | 63 | 79.7 | 1597 | 1 | 99.9 | 96.6 | 0.87 |
 | D5 | 429 | 615 | 259 | 1 | 15240 | 1 | 233 | 77 | 75.2 | 1597 | 1 | 99.9 | 95.9 | 0.84 |
 | D6 | 482 | 946 | 202 | 5 | 14910 | 0 | 205 | 105 | 66.1 | 1597 | 1 | 99.9 | 94.4 | 0.79 |
 | D7 | 394 | 3337 | 210 | 85 | 12517 | 2 | 178 | 132 | 57.4 | 1597 | 1 | 99.9 | 93.0 | 0.73 |
 | D8 | 371 | 1421 | 317 | 1 | 14435 | 0 | 255 | 55 | 82.3 | 1593 | 5 | 99.7 | 96.9 | 0.88 |
 | D9 | 399 | 1273 | 289 | 1 | 14582 | 1 | 249 | 61 | 80.3 | 1591 | 7 | 99.6 | 96.4 | 0.86 |
 | D10 | 381 | 1753 | 307 | 1 | 14102 | 1 | 251 | 59 | 81.0 | 1594 | 4 | 99.8 | 96.7 | 0.88 |
Lipid synthesis | D1 | 849 | 2026 | 705 | 3 | 8229 | 7 | 470 | 165 | 74.0 | 1218 | 57 | 95.5 | 88.4 | 0.73 |
 | D2 | 927 | 2037 | 629 | 1 | 8225 | 0 | 512 | 123 | 80.6 | 1259 | 16 | 98.6 | 92.7 | 0.84 |
 | D3 | 898 | 2968 | 659 | 0 | 7294 | 0 | 509 | 126 | 80.2 | 1271 | 4 | 99.7 | 93.2 | 0.84 |
 | D4 | 968 | 3227 | 588 | 1 | 7035 | 0 | 493 | 142 | 77.6 | 1273 | 2 | 99.8 | 92.5 | 0.83 |
 | D5 | 970 | 3280 | 586 | 1 | 6982 | 0 | 491 | 144 | 77.3 | 1260 | 15 | 98.8 | 91.7 | 0.81 |
 | D6 | 874 | 2112 | 681 | 2 | 8149 | 1 | 525 | 110 | 82.7 | 1268 | 7 | 99.5 | 93.9 | 0.86 |
 | D7 | 863 | 2415 | 692 | 2 | 7845 | 2 | 512 | 123 | 80.6 | 1271 | 4 | 99.7 | 93.4 | 0.85 |
 | D8 | 907 | 1608 | 615 | 0 | 4488 | 0 | 498 | 137 | 78.4 | 1268 | 7 | 99.5 | 92.5 | 0.83 |
 | D9 | 815 | 1613 | 740 | 2 | 8638 | 11 | 525 | 110 | 82.7 | 1248 | 27 | 97.9 | 92.8 | 0.84 |
 | D10 | 865 | 1640 | 657 | 0 | 4456 | 0 | 531 | 104 | 83.6 | 1268 | 7 | 99.5 | 94.2 | 0.87 |
rRNA binding | D1 | 548 | 579 | 3390 | 6 | 9598 | 22 | 1824 | 87 | 95.5 | 3511 | 60 | 98.3 | 97.3 | 0.94 |
 | D2 | 1133 | 1225 | 2811 | 0 | 8974 | 0 | 1844 | 67 | 96.5 | 3519 | 52 | 98.5 | 97.8 | 0.95 |
 | D3 | 1126 | 1638 | 2816 | 2 | 8560 | 1 | 1812 | 99 | 94.8 | 3535 | 36 | 99.0 | 97.5 | 0.95 |
 | D4 | 1337 | 1958 | 2697 | 0 | 8241 | 0 | 1783 | 128 | 93.3 | 3484 | 87 | 97.6 | 96.1 | 0.91 |
 | D5 | 1372 | 1976 | 2572 | 0 | 8223 | 0 | 1784 | 127 | 93.4 | 3479 | 92 | 97.4 | 96.0 | 0.91 |
 | D6 | 921 | 1208 | 2971 | 52 | 8991 | 0 | 1824 | 87 | 95.5 | 3541 | 30 | 99.2 | 97.9 | 0.95 |
 | D7 | 878 | 2743 | 3040 | 26 | 7442 | 14 | 1808 | 103 | 94.6 | 3481 | 90 | 97.5 | 96.5 | 0.92 |
 | D8 | 810 | 2245 | 3143 | 0 | 7954 | 0 | 1849 | 62 | 96.8 | 3541 | 30 | 99.2 | 98.3 | 0.96 |
 | D9 | 810 | 972 | 3075 | 3 | 9182 | 2 | 1848 | 63 | 96.7 | 3526 | 45 | 98.7 | 98.0 | 0.96 |
 | D10 | 900 | 2600 | 3044 | 0 | 7599 | 0 | 1858 | 53 | 97.2 | 3547 | 24 | 99.3 | 98.6 | 0.97 |