Table 4 Similarity measures sorted by area under the ROC curve (AUC)

From: How can functional annotations be derived from profiles of phenotypic annotations?

| Measure | AUC | Protein interactions | p-value |
| --- | --- | --- | --- |
| Resnik in CMPO | 0.56 | 24 | 0.0102 |
| Schlicker in CMPO | 0.56 | 12 | 0.7512 |
| Lin in CMPO | 0.55 | 11 | 0.8332 |
| Cohen’s kappa | 0.54 | 27 | 0.0015 |
| Pesquita in CMPO | 0.54 | 14 | 0.5494 |
| Jiang in CMPO | 0.54 | 11 | 0.8332 |
| TF-IDF | 0.53 | 25 | 0.0055 |
| Euclidean | 0.53 | 16 | 0.3433 |
| correlation | 0.52 | 22 | 0.0311 |
| Hamming | 0.52 | 21 | 0.0513 |
| cosine | 0.49 | 13 | 0.6545 |
| Jaccard | 0.49 | 13 | 0.6545 |
| Euclidean (logistic PCA) | 0.46 | 25 | 0.0055 |
| correlation (logistic PCA) | 0.45 | 19 | 0.1242 |
| Cosine (logistic PCA) | 0.45 | 14 | 0.5494 |
  1. The "Protein interactions" column gives the number of nearest-neighbour gene pairs that are also protein interaction partners. The "p-value" column gives the probability, computed from the hypergeometric distribution, of obtaining the observed number of interacting pairs by chance.
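
The p-values in the footnote come from the hypergeometric distribution. As a rough illustration of that kind of test, the sketch below computes an upper-tail probability P(X >= k) using only the Python standard library. The choice of the upper tail, the function name `hypergeom_upper_tail`, and all numbers in the usage example (population size, number of interacting pairs in the population, number of nearest-neighbour pairs drawn) are assumptions made for illustration; the table does not report them, so the output is not expected to reproduce any p-value above.

```python
from math import comb


def hypergeom_upper_tail(k, population, successes, draws):
    """Return P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: total number of candidate gene pairs (hypothetical)
    successes:  pairs in the population that are known interaction partners
    draws:      nearest-neighbour pairs selected by a similarity measure
    k:          observed number of selected pairs that interact
    """
    upper = min(successes, draws)       # X cannot exceed either bound
    total = comb(population, draws)     # number of ways to choose the draws
    # Count the selections that contain at least k interacting pairs.
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(k, 0), upper + 1)
    )
    return favourable / total


# Hypothetical inputs, NOT taken from the paper: 10,000 candidate pairs,
# 150 of which interact, 1,000 nearest-neighbour pairs, 24 observed hits.
p = hypergeom_upper_tail(k=24, population=10_000, successes=150, draws=1_000)
print(f"p-value (upper tail): {p:.4f}")
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow in the individual terms; for large populations a library routine such as a survival function would be faster, but the definition is the same.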