Table 1 Statistics of gene annotation for ENSEMBL and NCBI.

From: Non-coding sequence retrieval system for comparative genomic analysis of gene regulatory elements

ENSEMBL – as of 12/06/06

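| Organism | Assembly | Genebuild Date | Version | Known | Novel | Total Predictions |
| --- | --- | --- | --- | --- | --- | --- |
| Human | NCBI 36 | Aug 2006 | 41.36c | 22205 | 1019 | 69185 |
| Mouse | NCBI m36 | Apr 2006 | 41.36b | 21839 | 2599 | 71259 |
| Chicken | WASHUC 1 | Dec 2005 | 41.1p | 5123 | 5417 | 76146 |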

NCBI – taxonomy browser and Unigene as of 12/06/06

| Organism | Assembly | GenBank Date | UniGene Build | Entrez Genes | Total Unigene Clusters |
| --- | --- | --- | --- | --- | --- |
| human | NCBI 36 | Oct 2006 | 197 | 38597 | 85590 |
| mouse | NCBI m36 | Oct 2006 | 159 | 60745 | 64618 |
| chicken | WASHUC 1 | Aug 2006 | 31 | 24313 | 30837 |

  1. Known – genes that have species-specific protein sequences already available in the public sequence databases. Novel – genes that could not be mapped with confidence to existing entries. Total Predictions – the number of 'known', 'novel' and 'pseudogenes' predicted by the Ensembl analysis and annotation pipeline.
  2. Entrez Genes – number of genes defined by sequence and/or located in the NCBI Map Viewer. Total Unigene Clusters – the number of non-redundant sets of gene-oriented clusters automatically partitioned by UniGene.