From: NOBLE – Flexible concept recognition for large-scale biomedical natural language processing
Error type | Definition | FP/FN | CRAFT, n (%) | ShARe, n (%) |
---|---|---|---|---|
Boundary detection | Incorrectly incorporates words from earlier or later in the sentence, considering them to be part of the concept annotated | FP | 0 (0 %) | 3 (1.5 %) |
Concept hierarchy | Incorrectly assigns a more general or more specific concept than the gold standard | FP and FN | 18 (9 %) | 13 (6.5 %) |
Context/background knowledge | Concept annotated incorrectly because context or background knowledge was needed | FP and FN, usually FN | 72 (36 %) | 81 (40.5 %) |
Exact match missed | Concept not annotated despite exactly matching the preferred name or a synonym | FN | 2 (1 %) | 3 (1.5 %) |
Importance | Annotated concept was not deemed relevant by gold annotators | FP | 33 (16.5 %) | 51 (25.5 %) |
Abbreviation detection | Abbreviation defined in the dictionary was matched case-insensitively because it did not fit a defined abbreviation pattern | FP | 18 (9 %) | 0 (0 %) |
Alternative application of terminology | Gold used obsolete term, term is not in SNOMED, or same term existed in multiple ontologies, resulting in different annotations for same mention | FN | 31 (15.5 %) | 10 (5 %) |
Text span | Concept annotated was identical to gold but the text span differed from gold | FP and FN | 10 (5 %) | 20 (10 %) |
Word sense ambiguity | Concept annotated was used in a different word sense | FP | 4 (2 %) | 0 (0 %) |
Wording mismatch | Missing or incorrect annotation due to word inflection mismatch between dictionary term and input text | FP and FN, usually FN | 12 (6 %) | 19 (9.5 %) |
Total errors | | | 200 (100 %) | 200 (100 %) |