BMC Bioinformatics

Figure 1

From: Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

Textpresso category development for Cellular Component curation. Curators identified true positive sentences from a training set and used word frequency analysis and manual inspection to identify words and phrases that were most indicative of experimentally determined subcellular localization. Three new categories, Cellular Components, Assay Terms, and Verbs, were created.
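The word frequency analysis mentioned in the caption can be illustrated with a minimal sketch. This is not the authors' actual procedure or any part of Textpresso. It assumes one simple approach: score each word by how much more frequent it is in curator-confirmed true positive sentences than in a background set. The function names, the smoothing scheme, the `min_count` threshold, and the toy sentences are all hypothetical and invented for illustration.

```python
from collections import Counter
import re


def tokenize(sentence):
    """Lowercase a sentence and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def indicative_words(positive, background, min_count=2):
    """Rank words by enrichment in positive sentences over background.

    Score = relative frequency in positives divided by the add-one
    smoothed relative frequency in the background (an assumed scheme).
    """
    pos_counts = Counter(w for s in positive for w in tokenize(s))
    bg_counts = Counter(w for s in background for w in tokenize(s))
    pos_total = sum(pos_counts.values())
    bg_total = sum(bg_counts.values())
    vocab_size = len(set(pos_counts) | set(bg_counts))

    scores = {}
    for word, count in pos_counts.items():
        if count < min_count:
            continue  # skip words too rare to be informative
        pos_freq = count / pos_total
        bg_freq = (bg_counts[word] + 1) / (bg_total + vocab_size)
        scores[word] = pos_freq / bg_freq
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy sentences, invented for illustration only (not from the training set).
positive = [
    "GFP-tagged protein localizes to the nucleus",
    "The fusion protein localizes to the plasma membrane",
    "Antibody staining shows the protein is expressed in the nucleus",
]
background = [
    "The gene encodes a protein of unknown function",
    "Mutants show a defect in the protein",
]

ranked = indicative_words(positive, background)
for word, score in ranked:
    print(f"{word}\t{score:.2f}")
```

In this sketch, words that appear often in the positive sentences but rarely in the background rise to the top of the ranking, while words common to both sets sink. A ranked list of this kind would still need the manual inspection the caption describes before any word is assigned to a category.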

