Table 10. Pearson (r) and Spearman (\(\rho\)) correlation values between the similarity values returned by a set of path-based similarity measures and those returned by their reformulations based on the new AncSPL algorithm, for a sequence of 1000 random CUI pairs in SNOMED-CT 2019AB, GO (2020-05-02), and WordNet 3.0

From: HESML: a real-time semantic measures library for the biomedical domain with a reproducible survey

| Base measure | AncSPL reformulation | 50 samples, r | 50 samples, \(\rho\) | 100 samples, r | 100 samples, \(\rho\) | 200 samples, r | 200 samples, \(\rho\) | 1000 samples, r | 1000 samples, \(\rho\) |
|---|---|---|---|---|---|---|---|---|---|
| Correlation values in SNOMED-CT (\(\text {tree-like}_{\sigma }\) = 0.425) | | | | | | | | | |
| Rada [71] | AncSPL-Rada | 0.9214 | 0.9412 | 0.9413 | 0.9444 | 0.9357 | 0.9352 | 0.9231 | 0.9217 |
| Leacock and Chodorow [73] | AncSPL-Leacock | 0.9409 | 0.9412 | 0.9479 | 0.9444 | 0.9422 | 0.9352 | 0.9217 | 0.9217 |
| coswJ&C [35] | AncSPL-coswJ&C | 0.9136 | 0.9506 | 0.9583 | 0.9747 | 0.9761 | 0.9775 | 0.941 | 0.9714 |
| Correlation values in GO (\(\text {tree-like}_{\sigma }\) = 0.446) | | | | | | | | | |
| Rada [71] | AncSPL-Rada | 0.8571 | 0.8277 | 0.9133 | 0.9085 | 0.8883 | 0.8868 | 0.9074 | 0.8947 |
| Leacock and Chodorow [73] | AncSPL-Leacock | 0.8542 | 0.8277 | 0.9109 | 0.9085 | 0.9007 | 0.8868 | 0.9191 | 0.8947 |
| coswJ&C [35] | AncSPL-coswJ&C | 0.9679 | 0.9848 | 0.9372 | 0.9894 | 0.9654 | 0.9888 | 0.9533 | 0.977 |
| Correlation values in WordNet (\(\text {tree-like}_{\sigma }\) = 0.0269) | | | | | | | | | |
| Rada [71] | AncSPL-Rada | 0.9072 | 0.8882 | 0.9151 | 0.8855 | 0.9225 | 0.8994 | 0.9168 | 0.9038 |
| Leacock and Chodorow [73] | AncSPL-Leacock | 0.9354 | 0.8882 | 0.9375 | 0.8855 | 0.937 | 0.8994 | 0.9345 | 0.9038 |
| coswJ&C [35] | AncSPL-coswJ&C | 0.9993 | 0.9906 | 0.998 | 0.9916 | 0.9644 | 0.9859 | 0.9815 | 0.9807 |
We show the results obtained in the evaluation of the first 50, 100, 200, and 1000 random CUI pairs. All similarity measures are implemented in HESML V1R5 [63]. CoswJ&C [35] sets the current state of the art in the family of ontology-based semantic similarity measures based on WordNet [58]. We define the tree-like deviation (\(\text {tree-like}_{\sigma }\)) as the ratio of the number of nodes with multiple parents to the overall number of ontology nodes. The tree-like deviation is 0 for MeSH, whilst it is (2213/82115) for WordNet 3.0, (151916/357406) for SNOMED-CT, and (19680/44509) for GO.
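The tree-like deviation defined in the note above can be illustrated with a minimal Python sketch. This is not HESML code: the function name `tree_like_deviation`, the parent-map representation, and the toy taxonomy are assumptions made only for illustration. The node counts in `footnote_counts` are the ones quoted in the note.

```python
def tree_like_deviation(parents):
    """Ratio of nodes with more than one parent to all ontology nodes.

    `parents` maps each node to the set of its direct parents
    (a hypothetical representation chosen for this sketch).
    """
    if not parents:
        raise ValueError("ontology has no nodes")
    multi_parent = sum(1 for p in parents.values() if len(p) > 1)
    return multi_parent / len(parents)


# Hypothetical toy taxonomy: only "c" has multiple parents.
toy = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a", "b"},
}
print(tree_like_deviation(toy))

# (multi-parent nodes, total nodes) as quoted in the table note.
footnote_counts = {
    "WordNet 3.0": (2213, 82115),
    "SNOMED-CT": (151916, 357406),
    "GO": (19680, 44509),
}
for name, (multi, total) in footnote_counts.items():
    print(name, multi / total)
```

A strict tree, where every node has at most one parent, gives a value of 0, which is the case reported for MeSH. The ratios printed for the three ontologies can be compared with the \(\text {tree-like}_{\sigma }\) values shown in the table's section headers.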