Table 6 PER annotation error analysis

From: SIFR annotator: ontology-based semantic annotation of French biomedical text and clinical notes

| Error type | Description | Example | % in EMEA (14 FP & 36 FN) | % in MEDLINE (15 FP & 35 FN) |
|---|---|---|---|---|
| FP | Annotation with a concept that was not covered in the gold standard | “évaluant la douleur”/Proc. (i.e., “pain evaluation”) matched but not in gold standard. | 10 | 10 |
| FP | Partial annotation on some but not all of the expected tokens | “sytème nerveux central” recognized instead of “signes du système nerveux central” (spelling) | 10 | 12 |
| FP | Incorrect Semantic Group annotation | “rein” (kidney) annotated with DISO instead of ANAT. Generates both an FP and an FN. | 8 | 8 |
| FN | Concept missing from the French ontologies in the portal | Expected annotation: “canaux” (canals), but the SIFR Annotator dictionary only contains “canal, sai” (canal unspecified), which cannot match | 34 | 12 |
| FN | Morphosyntactic variation | Expected annotation “sériques” (an adjectivation of sérum) as ANAT, whereas the ontology label is “sérum” (the noun). | 18 | 26 |
| FN | Formulation different from concept labels (synonym, paraphrase) | Expected annotation “flacon” (vial), while the ontology concept label read “bouteille” (bottle). | 14 | 22 |
| FN | Incorrect Semantic Group | “rein” (kidney) annotated with DISO instead of ANAT. Generates both an FP and an FN. | 6 | 10 |
| FN | Unrecognized acronym or medical abbreviation | The gold standard expects “SNM” to be annotated with DISO, while the ontologies only contain “syndrome malin des neuroleptiques”. | 2 | 0 |
  1. Performed on 50 uniformly sampled errors from each of EMEA and MEDLINE, obtained with the baseline method. The two most common causes are highlighted in bold.
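
The percentages in the table can be converted back into error counts, since each corpus contributes 50 sampled errors. The following is a minimal sketch of that arithmetic, not part of the original study. The names `SAMPLE_SIZE`, `ROWS` and `counts_by_type` are illustrative, and the sketch assumes that the "Concept missing from the French ontologies" row belongs to the FN group.

```python
SAMPLE_SIZE = 50  # errors sampled per corpus, per the table footnote

# (error type, cause, % in EMEA, % in MEDLINE), copied from the table.
# Assumption: the "Concept missing" row is counted as FN.
ROWS = [
    ("FP", "Concept not covered in the gold standard", 10, 10),
    ("FP", "Partial annotation", 10, 12),
    ("FP", "Incorrect Semantic Group", 8, 8),
    ("FN", "Concept missing from the French ontologies", 34, 12),
    ("FN", "Morphosyntactic variation", 18, 26),
    ("FN", "Formulation different from concept labels", 14, 22),
    ("FN", "Incorrect Semantic Group", 6, 10),
    ("FN", "Unrecognized acronym or abbreviation", 2, 0),
]


def counts_by_type(rows, sample_size=SAMPLE_SIZE):
    """Convert percentages to error counts and total them per error type."""
    totals = {}
    for kind, _cause, emea_pct, medline_pct in rows:
        emea, medline = totals.get(kind, (0, 0))
        totals[kind] = (
            emea + emea_pct * sample_size // 100,
            medline + medline_pct * sample_size // 100,
        )
    return totals


if __name__ == "__main__":
    for kind, (emea, medline) in counts_by_type(ROWS).items():
        print(f"{kind}: EMEA={emea}, MEDLINE={medline}")
```

Under that assumption, the FP rows total 14 (EMEA) and 15 (MEDLINE) and the MEDLINE FN rows total 35, all matching the column headers. The EMEA FN percentages sum to 74%, which corresponds to 37 errors rather than the 36 stated in the header. The table does not indicate which figure is off.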