Table 2. Data set SCL12. SCL12 consists of SCL12L (the learning data set) and SCL12T (the testing data set). There are 12 essential GO terms, one for each subcellular compartment. In the SCL12L column, the number t in parentheses is the number of sequences that are correctly annotated by only one essential GO term.

From: ProLoc-GO: Utilizing informative Gene Ontology terms for sequence-based prediction of protein subcellular localization

| Label | Compartment | Essential GO term | Number of sequences, SCL12L | Number of sequences, SCL12T |
|---|---|---|---|---|
| 1 | Centriole | GO:0005814 | 20 (1) | 25 |
| 2 | Cytoplasm | GO:0005737 | 155 (38) | 377 |
| 3 | Cytoskeleton | GO:0005856 | 12 (6) | 14 |
| 4 | Endoplasmic reticulum | GO:0005783 | 28 (19) | 35 |
| 5 | Extracellular | GO:0030198 | 140 (0) | 301 |
| 6 | Golgi apparatus | GO:0005794 | 33 (5) | 42 |
| 7 | Lysosome | GO:0005764 | 32 (27) | 40 |
| 8 | Microsome | GO:0005792 | 7 (0) | 8 |
| 9 | Mitochondrion | GO:0005739 | 125 (111) | 228 |
| 10 | Nucleus | GO:0005634 | 196 (179) | 580 |
| 11 | Peroxisome | GO:0005777 | 18 (16) | 23 |
| 12 | Plasma membrane | GO:0005886 | 153 (23) | 368 |
| Total | | | 919 (425) | 1122 |
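
The "n (t)" notation in the SCL12L column can be unpacked with a short script. The following is a minimal sketch, not part of the original paper: the dictionary `SCL12L` and the helper `parse_cell` are hypothetical names introduced here, and the cell values are copied from the table. It splits each cell into the sequence count n and the single-term count t, then prints the share t / n per compartment.

```python
import re

# SCL12L cells copied from the table: compartment -> "n (t)".
# The dictionary name and layout are illustrative assumptions.
SCL12L = {
    "Centriole": "20 (1)",
    "Cytoplasm": "155 (38)",
    "Cytoskeleton": "12 (6)",
    "Endoplasmic reticulum": "28 (19)",
    "Extracellular": "140 (0)",
    "Golgi apparatus": "33 (5)",
    "Lysosome": "32 (27)",
    "Microsome": "7 (0)",
    "Mitochondrion": "125 (111)",
    "Nucleus": "196 (179)",
    "Peroxisome": "18 (16)",
    "Plasma membrane": "153 (23)",
}


def parse_cell(cell):
    """Split an 'n (t)' cell into the integers (n, t)."""
    match = re.fullmatch(r"\s*(\d+)\s*\((\d+)\)\s*", cell)
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1)), int(match.group(2))


# n = sequences in the compartment,
# t = sequences annotated by only one essential GO term.
counts = {name: parse_cell(cell) for name, cell in SCL12L.items()}
total_n = sum(n for n, _ in counts.values())
total_t = sum(t for _, t in counts.values())

for name, (n, t) in counts.items():
    print(f"{name:22s} {n:4d} {t:4d} {t / n:6.1%}")
print(f"{'Total':22s} {total_n:4d} {total_t:4d} {total_t / total_n:6.1%}")
```

The column sums computed this way can be compared against the Total row for SCL12L, 919 (425).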