Table 3 Data set SCL16. The data set SCL16 consists of a learning data set, SCL16L, and a testing data set, SCL16T. There are 15 essential GO terms corresponding to eukaryotic subcellular compartments. Note that GO:0005814 does not appear in the set of n = 2870 GO terms. In the SCL16L column, the number t in parentheses (t) is the number of sequences that are correctly annotated by only one essential GO term.

From: ProLoc-GO: Utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization

| Label | Compartment | Essential GO term | Number of sequences, SCL16L (t) | Number of sequences, SCL16T |
|---|---|---|---|---|
| 1 | Centriole | GO:0005814 | 17 (0) | 4 |
| 2 | Cytoplasm | GO:0005737 | 384 (92) | 334 |
| 3 | Cytoskeleton | GO:0005856 | 20 (7) | 5 |
| 4 | Endoplasmic reticulum | GO:0005783 | 91 (83) | 22 |
| 5 | Extracellular | GO:0030198 | 402 (1) | 404 |
| 6 | Golgi apparatus | GO:0005794 | 68 (8) | 17 |
| 7 | Lysosome | GO:0005764 | 37 (32) | 9 |
| 8 | Chloroplast | GO:0009507 | 207 (192) | 51 |
| 9 | Mitochondrion | GO:0005739 | 183 (173) | 45 |
| 10 | Nucleus | GO:0005634 | 474 (395) | 695 |
| 11 | Peroxisome | GO:0005777 | 52 (38) | 12 |
| 12 | Plasma membrane | GO:0005886 | 323 (29) | 90 |
| 13 | Cell wall | GO:0005618 | 20 (16) | 5 |
| 14 | Cyanelle | GO:0009842 | 78 (65) | 19 |
| 15 | Vacuole | GO:0005773 | 36 (30) | 8 |
| 16 | Plastid | GO:0009536 | 31 (1) | 7 |
| Total | | | 2423 (1162) | 1727 |
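
The "n (t)" notation in the SCL16L column can be checked against the Total row with a few lines of Python. This is only a minimal sketch for readers who want to reuse the counts: the names `ROWS`, `parse_entry`, and `column_totals` are hypothetical and are not part of ProLoc-GO; the values are copied from the table as printed.

```python
import re

# (compartment, SCL16L entry as printed "n (t)", SCL16T count), copied from Table 3.
ROWS = [
    ("Centriole", "17 (0)", 4),
    ("Cytoplasm", "384 (92)", 334),
    ("Cytoskeleton", "20 (7)", 5),
    ("Endoplasmic reticulum", "91 (83)", 22),
    ("Extracellular", "402 (1)", 404),
    ("Golgi apparatus", "68 (8)", 17),
    ("Lysosome", "37 (32)", 9),
    ("Chloroplast", "207 (192)", 51),
    ("Mitochondrion", "183 (173)", 45),
    ("Nucleus", "474 (395)", 695),
    ("Peroxisome", "52 (38)", 12),
    ("Plasma membrane", "323 (29)", 90),
    ("Cell wall", "20 (16)", 5),
    ("Cyanelle", "78 (65)", 19),
    ("Vacuole", "36 (30)", 8),
    ("Plastid", "31 (1)", 7),
]


def parse_entry(entry):
    """Split an 'n (t)' cell into (n, t) as integers."""
    match = re.fullmatch(r"\s*(\d+)\s*\((\d+)\)\s*", entry)
    if match is None:
        raise ValueError(f"not in 'n (t)' form: {entry!r}")
    return int(match.group(1)), int(match.group(2))


def column_totals(rows):
    """Return (SCL16L total, total of t, SCL16T total) over all rows."""
    learn_total = single_term_total = test_total = 0
    for _compartment, learn_entry, test_count in rows:
        n, t = parse_entry(learn_entry)
        learn_total += n
        single_term_total += t
        test_total += test_count
    return learn_total, single_term_total, test_total


print(column_totals(ROWS))  # (2423, 1162, 1727), matching the Total row
```

The per-compartment sums reproduce the Total row, which suggests the counts above survived extraction intact.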