Table 8 Analysis of sampled NOBLE Coder errors

From: NOBLE – Flexible concept recognition for large-scale biomedical natural language processing

| Error type | Definition | Type of error | CRAFT | ShARe |
|---|---|---|---|---|
| Boundary detection | Incorrectly incorporates words from earlier or later in the sentence, treating them as part of the annotated concept | FP | 0 (0 %) | 3 (1.5 %) |
| Concept hierarchy | Incorrectly assigns a more general or more specific concept than the gold standard | FP and FN | 18 (9 %) | 13 (6.5 %) |
| Context/background knowledge | Concept annotated incorrectly because context or background knowledge was needed | FP and FN, usually FN | 72 (36 %) | 81 (40.5 %) |
| Exact match missed | Concept not annotated despite exactly matching the preferred name or a synonym | FN | 2 (1 %) | 3 (1.5 %) |
| Importance | Annotated concept was not deemed relevant by the gold-standard annotators | FP | 33 (16.5 %) | 51 (25.5 %) |
| Abbreviation detection | Abbreviation defined in the dictionary was matched case-insensitively because it did not fit a defined abbreviation pattern | FP | 18 (9 %) | 0 (0 %) |
| Alternative application of terminology | Gold standard used an obsolete term, the term is not in SNOMED, or the same term existed in multiple ontologies, resulting in different annotations for the same mention | FN | 31 (15.5 %) | 10 (5 %) |
| Text span | Annotated concept was identical to the gold standard, but the text span differed | FP and FN | 10 (5 %) | 20 (10 %) |
| Word sense ambiguity | Annotated concept was used in a different word sense | FP | 4 (2 %) | 0 (0 %) |
| Wording mismatch | Missing or incorrect annotation due to a word-inflection mismatch between the dictionary term and the input text | FP and FN, usually FN | 12 (6 %) | 19 (9.5 %) |
| Total errors | | | 200 (100 %) | 200 (100 %) |
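Because each percentage in Table 8 is simply a count over the 200 errors sampled per corpus, the table's internal consistency is easy to verify mechanically. The short Python sketch below does exactly that; the variable names are ours, not from the paper, and the data are copied directly from the table above.

```python
# Consistency check for Table 8: per-corpus error counts should sum to the
# 200 sampled errors, and each percentage should equal count / 200 * 100.
# Rows: (error type, CRAFT count, CRAFT %, ShARe count, ShARe %).
rows = [
    ("Boundary detection",                       0,  0.0,  3,  1.5),
    ("Concept hierarchy",                       18,  9.0, 13,  6.5),
    ("Context/background knowledge",            72, 36.0, 81, 40.5),
    ("Exact match missed",                       2,  1.0,  3,  1.5),
    ("Importance",                              33, 16.5, 51, 25.5),
    ("Abbreviation detection",                  18,  9.0,  0,  0.0),
    ("Alternative application of terminology",  31, 15.5, 10,  5.0),
    ("Text span",                               10,  5.0, 20, 10.0),
    ("Word sense ambiguity",                     4,  2.0,  0,  0.0),
    ("Wording mismatch",                        12,  6.0, 19,  9.5),
]

SAMPLE_SIZE = 200  # errors sampled per corpus

for corpus, count_idx, pct_idx in [("CRAFT", 1, 2), ("ShARe", 3, 4)]:
    # Counts across all error types must account for every sampled error.
    total = sum(row[count_idx] for row in rows)
    assert total == SAMPLE_SIZE, f"{corpus}: counts sum to {total}, not {SAMPLE_SIZE}"
    # Every reported percentage must match its count exactly.
    for row in rows:
        expected = 100 * row[count_idx] / SAMPLE_SIZE
        assert row[pct_idx] == expected, f"{row[0]} ({corpus}): {row[pct_idx]} != {expected}"

print("Table 8 counts and percentages are internally consistent.")
```

Running the script confirms that both the CRAFT and ShARe columns sum to 200 and that every percentage matches its count, so the table contains no transcription errors.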