NOBLE – Flexible concept recognition for large-scale biomedical natural language processing

BMC Bioinformatics

Table 8 Analysis of sampled NOBLE Coder errors

Error type	Definition	False positive (FP) or false negative (FN)	CRAFT, n (%)	ShARe, n (%)
Boundary detection	Incorrectly incorporates words from earlier or later in the sentence, considering them to be part of the concept annotated	FP	0 (0 %)	3 (1.5 %)
Concept hierarchy	Incorrectly assigns more general or more specific concept than gold standard	FP and FN	18 (9 %)	13 (6.5 %)
Context/background knowledge	Concept annotated incorrectly because context or background knowledge was needed	FP and FN, usually FN	72 (36 %)	81 (40.5 %)
Exact match missed	Concept not annotated despite exactly matching the preferred name or a synonym	FN	2 (1 %)	3 (1.5 %)
Importance	Annotated concept was not deemed relevant by gold annotators	FP	33 (16.5 %)	51 (25.5 %)
Abbreviation detection	Abbreviation defined in the dictionary was matched case-insensitively because it did not fit a defined abbreviation pattern	FP	18 (9 %)	0 (0 %)
Alternative application of terminology	Gold used obsolete term, term is not in SNOMED, or same term existed in multiple ontologies, resulting in different annotations for same mention	FN	31 (15.5 %)	10 (5 %)
Text span	Concept annotated was identical to gold but the text span differed from gold	FP and FN	10 (5 %)	20 (10 %)
Word sense ambiguity	Concept annotated was used in a different word sense	FP	4 (2 %)	0 (0 %)
Wording mismatch	Missing or incorrect annotation due to word inflection mismatch between dictionary term and input text	FP and FN, usually FN	12 (6 %)	19 (9.5 %)
Total Errors			200 (100 %)	200 (100 %)
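
The percentages in Table 8 follow directly from the sampled counts (200 errors per corpus). As a minimal sketch for readers who want to recheck the arithmetic, the Python below copies the counts from the table and recomputes the column totals and per-category percentages. The names `ERROR_COUNTS`, `column_totals`, and `percentages` are illustrative only and are not part of NOBLE Coder.

```python
# Counts copied from Table 8 as (CRAFT, ShARe) pairs per error type.
# All names in this block are illustrative, not part of NOBLE Coder.
ERROR_COUNTS = {
    "Boundary detection": (0, 3),
    "Concept hierarchy": (18, 13),
    "Context/background knowledge": (72, 81),
    "Exact match missed": (2, 3),
    "Importance": (33, 51),
    "Abbreviation detection": (18, 0),
    "Alternative application of terminology": (31, 10),
    "Text span": (10, 20),
    "Word sense ambiguity": (4, 0),
    "Wording mismatch": (12, 19),
}


def column_totals(counts):
    """Return the total number of sampled errors for (CRAFT, ShARe)."""
    craft_total = sum(craft for craft, _ in counts.values())
    share_total = sum(share for _, share in counts.values())
    return craft_total, share_total


def percentages(counts):
    """Return each error type's share of its corpus total, in percent."""
    craft_total, share_total = column_totals(counts)
    return {
        name: (100 * craft / craft_total, 100 * share / share_total)
        for name, (craft, share) in counts.items()
    }


if __name__ == "__main__":
    print("Totals (CRAFT, ShARe):", column_totals(ERROR_COUNTS))
    for name, (craft_pct, share_pct) in percentages(ERROR_COUNTS).items():
        print(f"{name}: CRAFT {craft_pct:.1f} %, ShARe {share_pct:.1f} %")
```

Both columns sum to the 200 sampled errors reported in the Total Errors row, so each percentage is simply the count divided by two.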

ISSN: 1471-2105