Table 8: Analysis of sampled NOBLE Coder errors

From: NOBLE – Flexible concept recognition for large-scale biomedical natural language processing

| Error type | Definition | Type of error | CRAFT | ShARe |
| --- | --- | --- | --- | --- |
| Boundary detection | Incorrectly incorporates words from earlier or later in the sentence, treating them as part of the annotated concept | FP | 0 (0 %) | 3 (1.5 %) |
| Concept hierarchy | Assigns a more general or more specific concept than the gold standard | FP and FN | 18 (9 %) | 13 (6.5 %) |
| Context/background knowledge | Concept annotated incorrectly because context or background knowledge was needed | FP and FN, usually FN | 72 (36 %) | 81 (40.5 %) |
| Exact match missed | Concept not annotated despite exactly matching the preferred name or a synonym | FN | 2 (1 %) | 3 (1.5 %) |
| Importance | Annotated concept was not deemed relevant by the gold-standard annotators | FP | 33 (16.5 %) | 51 (25.5 %) |
| Abbreviation detection | Abbreviation defined in the dictionary matched case-insensitively because it did not fit a defined abbreviation pattern | FP | 18 (9 %) | 0 (0 %) |
| Alternative application of terminology | Gold standard used an obsolete term, the term is not in SNOMED, or the same term existed in multiple ontologies, resulting in different annotations for the same mention | FN | 31 (15.5 %) | 10 (5 %) |
| Text span | Annotated concept was identical to the gold standard, but the text span differed | FP and FN | 10 (5 %) | 20 (10 %) |
| Word sense ambiguity | Annotated concept was used in a different word sense | FP | 4 (2 %) | 0 (0 %) |
| Wording mismatch | Missing or incorrect annotation due to a word-inflection mismatch between the dictionary term and the input text | FP and FN, usually FN | 12 (6 %) | 19 (9.5 %) |
| Total errors | | | 200 (100 %) | 200 (100 %) |
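The percentages in each column are each category's share of the 200 sampled errors per corpus (e.g., 72/200 = 36 % for context/background knowledge in CRAFT). A minimal sketch of that arithmetic, using a hypothetical count dictionary transcribed from the table (the variable and function names are illustrative, not from the NOBLE code base):

```python
# Sampled error counts per category for CRAFT, transcribed from Table 8
# (illustrative data structure; not part of NOBLE itself).
craft_errors = {
    "Boundary detection": 0,
    "Concept hierarchy": 18,
    "Context/background knowledge": 72,
    "Exact match missed": 2,
    "Importance": 33,
    "Abbreviation detection": 18,
    "Alternative application of terminology": 31,
    "Text span": 10,
    "Word sense ambiguity": 4,
    "Wording mismatch": 12,
}

def percentage_breakdown(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw error counts to percentages of the total sample."""
    total = sum(counts.values())  # 200 sampled errors per corpus
    return {category: 100 * n / total for category, n in counts.items()}

for category, pct in percentage_breakdown(craft_errors).items():
    print(f"{category}: {pct:g} %")
# e.g. "Context/background knowledge: 36 %", matching the table.
```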