Table 7 Overall accuracy on the data set

From: Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

| Data set                 | NB     | AEC    | JDI    | MRD    | 2-MRD  |
|--------------------------|--------|--------|--------|--------|--------|
| Abbreviation Set         | 0.9716 | 0.9090 |        | 0.8759 | 0.8501 |
| Abbreviation Subset      | 0.9760 | 0.9218 | 0.6725 | 0.8838 | 0.8725 |
| Term Set                 | 0.8980 | 0.7462 |        | 0.7148 | 0.6773 |
| Term Subset              | 0.8991 | 0.7448 | 0.6209 | 0.7132 | 0.6609 |
| Term/Abbreviation Set    | 0.9384 | 0.8879 |        | 0.8801 | 0.9356 |
| Term/Abbreviation Subset | 0.9360 | 0.9026 | 0.6899 | 0.8715 | 0.9350 |
| Overall MSH WSD Set      | 0.9386 | 0.8383 |        | 0.8070 | 0.7799 |
| Overall MSH WSD Subset   | 0.9413 | 0.8448 | 0.6551 | 0.8118 | 0.7837 |
| NLM WSD                  | 0.8830 | 0.6836 |        | 0.6389 | 0.5500 |
| NLM WSD Subset           | 0.9063 | 0.6932 | 0.7475 | 0.6526 | 0.5800 |

  1. NB stands for Naïve Bayes, AEC for Automatic Extracted Corpus, JDI for Journal Descriptor Indexing, MRD for Machine Readable Dictionary, and 2-MRD for 2nd Order Co-occurrence MRD. "Set" denotes all the ambiguous words in the category, while "Subset" indicates that only the words that the JDI method can use are considered; JDI is therefore reported only for the Subset rows. Results on the NLM WSD set have been included.