
Table 3 Data set SCL16. The data set SCL16 consists of the learning data set SCL16L and the testing data set SCL16T. There are 15 essential GO terms corresponding to eukaryotic subcellular compartments. Note that GO:0005814 does not appear in the set of n = 2870 GO terms. In the SCL16L column, the number t in parentheses, (t), is the number of sequences that are correctly annotated by only one essential GO term.

From: ProLoc-GO: Utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization

| Label | Compartment | Essential GO term | Number of sequences, SCL16L (t) | Number of sequences, SCL16T |
|---|---|---|---|---|
| 1 | Centriole | GO:0005814 | 17 (0) | 4 |
| 2 | Cytoplasm | GO:0005737 | 384 (92) | 334 |
| 3 | Cytoskeleton | GO:0005856 | 20 (7) | 5 |
| 4 | Endoplasmic reticulum | GO:0005783 | 91 (83) | 22 |
| 5 | Extracellular | GO:0030198 | 402 (1) | 404 |
| 6 | Golgi apparatus | GO:0005794 | 68 (8) | 17 |
| 7 | Lysosome | GO:0005764 | 37 (32) | 9 |
| 8 | Chloroplast | GO:0009507 | 207 (192) | 51 |
| 9 | Mitochondrion | GO:0005739 | 183 (173) | 45 |
| 10 | Nucleus | GO:0005634 | 474 (395) | 695 |
| 11 | Peroxisome | GO:0005777 | 52 (38) | 12 |
| 12 | Plasma membrane | GO:0005886 | 323 (29) | 90 |
| 13 | Cell wall | GO:0005618 | 20 (16) | 5 |
| 14 | Cyanelle | GO:0009842 | 78 (65) | 19 |
| 15 | Vacuole | GO:0005773 | 36 (30) | 8 |
| 16 | Plastid | GO:0009536 | 31 (1) | 7 |
| Total | | | 2423 (1162) | 1727 |
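
The column totals can be checked directly against the per-compartment counts. The following is a minimal sketch in Python, not part of the original paper: the names `SCL16`, `totals`, and `single_term_fraction` are hypothetical, and the numbers are copied from the table rows above. It sums each column and computes, per compartment, the share of SCL16L sequences annotated by only one essential GO term.

```python
# Rows copied from Table 3:
# (compartment, SCL16L count, SCL16L count annotated by one essential GO term, SCL16T count)
SCL16 = [
    ("Centriole", 17, 0, 4),
    ("Cytoplasm", 384, 92, 334),
    ("Cytoskeleton", 20, 7, 5),
    ("Endoplasmic reticulum", 91, 83, 22),
    ("Extracellular", 402, 1, 404),
    ("Golgi apparatus", 68, 8, 17),
    ("Lysosome", 37, 32, 9),
    ("Chloroplast", 207, 192, 51),
    ("Mitochondrion", 183, 173, 45),
    ("Nucleus", 474, 395, 695),
    ("Peroxisome", 52, 38, 12),
    ("Plasma membrane", 323, 29, 90),
    ("Cell wall", 20, 16, 5),
    ("Cyanelle", 78, 65, 19),
    ("Vacuole", 36, 30, 8),
    ("Plastid", 31, 1, 7),
]


def totals(rows):
    """Return (SCL16L total, single-term total, SCL16T total)."""
    learn = sum(r[1] for r in rows)
    single = sum(r[2] for r in rows)
    test = sum(r[3] for r in rows)
    return learn, single, test


def single_term_fraction(rows):
    """Map each compartment to the fraction t / SCL16L count."""
    return {name: single / learn for name, learn, single, _ in rows}


if __name__ == "__main__":
    print(totals(SCL16))  # (2423, 1162, 1727), matching the Total row
    for name, frac in single_term_fraction(SCL16).items():
        print(f"{name}: {frac:.2f}")
```

The per-compartment fractions make the spread visible: for example, Chloroplast has 192 of 207 learning sequences annotated by a single essential GO term, while Extracellular has 1 of 402.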