Table 1 Characteristics of ontologies evaluated

From: Large-scale biomedical concept recognition: an evaluation of current automatic annotators and their parameters

Ontology   Version     # Concepts  Avg. term length  Avg. words in term  Avg. # synonyms  % Have punctuation  % Have numerals  % Have stop words
Cell type  25:05:2007  838         20.0 ± 9.5        3.0 ± 1.4           0.5 ± 1.1        11.6                4.8              3.3
Sequence   30:03:2009  1,610       21.6 ± 13.3       3.1 ± 1.0           1.4 ± 1          91.9                6.6              9.3
ChEBI      28:05:2008  19,633      25.5 ± 24.2       4.3 ± 4.8           2.0 ± 2.5        54.8                41.3             0
NCBITaxon  12:07:2011  789,538     24.6 ± 10.2       3.6 ± 2.0           N/A              53.7                56.0             0.3
GO-MF      28:11:2007  7,984       39.1 ± 15.4       4.6 ± 2.2           2.8 ± 4.6        52.8                26.6             2.7
GO-BP      28:11:2007  14,306      40.1 ± 19.0       5.0 ± 2.7           2.1 ± 2.5        23.5                7.0              45.7
GO-CC      28:11:2007  2,047       26.6 ± 14.2       3.6 ± 1.7           0.1 ± 0.9        29.5                14.4             6.8
Protein    22:04:2011  26,807      38.4 ± 18.5       5.5 ± 2.5           3.1 ± 3.2        68.4                74.8             4.3