Table 7 Typo correction performance in the NCBI-disease and SPR datasets

From: MLM-based typographical error correction of unstructured medical texts for named entity recognition

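| Dataset | Algorithm | Precision | Recall | F1-score | Support |
|---|---|---|---|---|---|
| NCBI-disease | SymSpell | 0.62 | 0.70 | 0.67 | 3944 |
| NCBI-disease | Proposed model | 0.65 | 0.81 | 0.72 | 3944 |
| SPR | SymSpell | 0.59 | 0.70 | 0.64 | 3671 |
| SPR | Proposed model | 0.70 | 0.76 | 0.73 | 3671 |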

  1. 'Support' is the number of words (occurrences) in each dataset. Compared with SymSpell, the proposed model improved the F1-score by 0.05 on NCBI-disease and 0.09 on SPR, that is, by 5 and 9 percentage points, respectively.
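
The 5% and 9% figures in the note correspond to the absolute differences between the F1-scores of the two algorithms. The short sketch below reproduces that arithmetic from the values in Table 7; the dictionary layout and the function name `f1_gain_points` are illustrative choices, not part of the original work.

```python
# F1-scores copied from Table 7 (SymSpell vs. the proposed model).
f1_scores = {
    "NCBI-disease": {"SymSpell": 0.67, "Proposed model": 0.72},
    "SPR": {"SymSpell": 0.64, "Proposed model": 0.73},
}


def f1_gain_points(scores):
    """Return the proposed model's F1 gain over SymSpell, in percentage points."""
    return {
        dataset: round((s["Proposed model"] - s["SymSpell"]) * 100)
        for dataset, s in scores.items()
    }


gains = f1_gain_points(f1_scores)
print(gains)  # {'NCBI-disease': 5, 'SPR': 9}
```

These are gains in percentage points (absolute differences in F1), not relative improvements over the SymSpell baseline.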