Table 2 Analysis of false negative sentences

From: Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

| Reason(s) for search failure | Percentage of total sentences (n = 78) |
| --- | --- |
| Non-standard protein nomenclature (NSN) | 39.7% (n = 31) |
| Missing category term(s) (MT) | 3.8% (n = 3) |
| Information spread over multiple sentences (IMS) | 11.5% (n = 9) |
| Information expressed with <3 categories (IFC) | 6.4% (n = 5) |
| NSN + MT | 7.7% (n = 6) |
| NSN + IMS | 3.8% (n = 3) |
| NSN + IFC | 3.8% (n = 3) |
| MT + IMS | 10.3% (n = 8) |
| MT + IFC | 1.3% (n = 1) |
| NSN + MT + IMS | 6.4% (n = 5) |
| Technical issues | 5.1% (n = 4) |
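
The percentages in the table can be checked against the reported counts. The following is a minimal sketch, not part of the original study: the dictionary keys reuse the table's abbreviations, and the helper name `involving` is a hypothetical introduced here for illustration. It recomputes each percentage from its count and tallies how many sentences involve a given reason, whether alone or combined with others.

```python
# Counts transcribed from Table 2 (n = 78 false negative sentences).
counts = {
    "NSN": 31,
    "MT": 3,
    "IMS": 9,
    "IFC": 5,
    "NSN + MT": 6,
    "NSN + IMS": 3,
    "NSN + IFC": 3,
    "MT + IMS": 8,
    "MT + IFC": 1,
    "NSN + MT + IMS": 5,
    "Technical issues": 4,
}

# The categories are mutually exclusive, so the counts should sum to n.
total = sum(counts.values())

# Percentage of total sentences, rounded to one decimal place as in the table.
percentages = {reason: round(100 * n / total, 1) for reason, n in counts.items()}


def involving(reason: str) -> int:
    """Hypothetical helper: sentences where `reason` appears alone or in combination."""
    return sum(n for key, n in counts.items() if reason in key.split(" + "))


if __name__ == "__main__":
    print(f"total sentences: {total}")
    for reason, pct in percentages.items():
        print(f"{reason}: {pct}% (n = {counts[reason]})")
    for reason in ("NSN", "MT", "IMS", "IFC"):
        print(f"any {reason}: n = {involving(reason)}")
```

Under these assumptions the counts sum to 78 and the rounded percentages match the table. Summing the rows that include NSN gives 48 sentences, so non-standard nomenclature contributes to well over half of the false negatives once combined causes are counted.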