
Table 1 Sequences and annotations for each dataset.

From: GOtcha: a new method for prediction of protein function assessed by the annotation of seven genomes

   

| Dataset | Total associations | Total sequences | Sequences annotated (Cellular Component) | Sequences annotated (Molecular Function) | Sequences annotated (Biological Process) |
|---|---|---|---|---|---|
| Arabidopsis thaliana | 290952(94824) | 20108(7969) [451] | 14851(2115) | 14467(7555) | 10454(3481) |
| Drosophila melanogaster | 129694(29311) | 7536(7536) [0] | 3613(3589) | 6528(6520) | 3730(3723) |
| Homo sapiens | 409153(67357) | 21251(9074) [659] | 13723(6516) | 19362(7328) | 17080(7707) |
| Plasmodium falciparum | 36952(32536) | 2406(2209) [41] | 2061(1227) | 2094(2094) | 2044(2044) |
| Saccharomyces cerevisiae | 136938(36267) | 6910(6849) [0] | 6751(6751) | 6831(6831) | 6899(6838) |
| Vibrio cholerae | 42616(42616) | 2924(2924) [27] | 189(189) | 2721(2721) | 2923(2923) |
| Caenorhabditis elegans | 109360(18626) | 6916(1870) [199] | 3054(650) | 5746(282) | 5102(1557) |

  1. Values in parentheses exclude IEA (Inferred from Electronic Annotation) associations. Values in square brackets are the numbers of sequences with annotations that are children of obsolete (GO:0008369).
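
The cell notation described in the footnote, for example `20108(7969) [451]`, can be split into its parts programmatically. The following is a minimal sketch, not part of the original article: the names `Cell`, `parse_cell` and `iea_dependent` are hypothetical, and reading "total minus parenthesised value" as the count that depends on IEA evidence is an assumption drawn from the footnote.

```python
import re
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    """One Table 1 cell (hypothetical helper type, not from the article)."""
    total: int               # value outside the parentheses
    non_iea: int             # value in parentheses (IEA associations excluded)
    obsolete: Optional[int]  # value in square brackets, if the cell has one


# Matches "N(M)" optionally followed by "[K]", tolerating stray spaces.
_CELL_RE = re.compile(r"^\s*(\d+)\s*\((\d+)\)\s*(?:\[(\d+)\])?\s*$")


def parse_cell(text: str) -> Cell:
    """Parse a cell such as '20108(7969) [451]' or '14851(2115)'."""
    match = _CELL_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognised cell format: {text!r}")
    total, non_iea, obsolete = match.groups()
    return Cell(
        int(total),
        int(non_iea),
        int(obsolete) if obsolete is not None else None,
    )


def iea_dependent(cell: Cell) -> int:
    """Total minus the non-IEA value.

    Assumption: this difference is the number of entries that are present
    only because of IEA evidence.
    """
    return cell.total - cell.non_iea


if __name__ == "__main__":
    # Arabidopsis thaliana, "Total sequences" column
    cell = parse_cell("20108(7969) [451]")
    print(cell, iea_dependent(cell))
```

Keeping the bracketed value optional lets the same function handle both the "Total sequences" column and the per-ontology columns, which carry no bracketed count.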