Table 7 Overall accuracy on the data set

From: Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

Data set                  NB      AEC     JDI     MRD     2-MRD
Abbreviation Set          0.9716  0.9090  -       0.8759  0.8501
Abbreviation Subset       0.9760  0.9218  0.6725  0.8838  0.8725
Term Set                  0.8980  0.7462  -       0.7148  0.6773
Term Subset               0.8991  0.7448  0.6209  0.7132  0.6609
Term/Abbreviation Set     0.9384  0.8879  -       0.8801  0.9356
Term/Abbreviation Subset  0.9360  0.9026  0.6899  0.8715  0.9350
Overall MSH WSD Set       0.9386  0.8383  -       0.8070  0.7799
Overall MSH WSD Subset    0.9413  0.8448  0.6551  0.8118  0.7837
NLM WSD                   0.8830  0.6836  -       0.6389  0.5500
NLM WSD Subset            0.9063  0.6932  0.7475  0.6526  0.5800
  1. NB stands for Naïve Bayes, AEC for Automatic Extracted Corpus, MRD for Machine Readable Dictionary, 2-MRD for 2nd Order Co-occurrence MRD, and JDI for Journal Descriptor Indexing. "Set" denotes all the ambiguous words in the category, while "Subset" indicates that only the words that the JDI method can use are considered. Results on the NLM WSD set are also included.