From: How can functional annotations be derived from profiles of phenotypic annotations?
Name | Formula |
---|---|
Euclidean similarity | \(s^{2}\left (g_{1}, g_{2}\right)=\frac {1}{1+\left (x_{g1}-x_{g2}\right)\left (x_{g1}-x_{g2}\right)^{'}}\) |
Correlation similarity | \(s\left (g_{1},g_{2}\right) = \frac {\left (x_{g1}-\overline {x}_{g1}\right)\left (x_{g2}-\overline {x}_{g2}\right)^{'}} {\sqrt {\left (x_{g1}-\overline {x}_{g1}\right)\left (x_{g1}-\overline {x}_{g1}\right)^{'}} \sqrt {\left (x_{g2}-\overline {x}_{g2}\right)\left (x_{g2}-\overline {x}_{g2}\right)^{'}}}\) |
where \(\overline {x}_{g1}=\frac {1}{n}\sum _{p \in P}x^{p}_{g1}\) and \(\overline {x}_{g2}=\frac {1}{n}\sum _{p \in P}x^{p}_{g2}\) | |
Cosine similarity | \(s\left (g_{1},g_{2}\right) = \frac {x_{g1}x_{g2}^{'}}{\sqrt {x_{g1}x_{g1}^{'}} \sqrt {x_{g2}x_{g2}^{'}}}\) |
Hamming similarity | \(s\left (g_{1},g_{2}\right) = \frac {\left \lvert \left \{p \in P : x^{p}_{g1}=x^{p}_{g2}\right \}\right \rvert }{n}\) |
Jaccard similarity | \(s\left (g_{1},g_{2}\right) = 1 - \frac {\left \lvert \left \{p \in P : \left (x^{p}_{g1} \neq x^{p}_{g2}\right)\wedge \left (\left (x^{p}_{g1} \neq 0\right) \vee \left (x^{p}_{g2} \neq 0\right)\right)\right \}\right \rvert } {\left \lvert \left \{p \in P : \left (x^{p}_{g1} \neq 0\right) \vee \left (x^{p}_{g2} \neq 0\right)\right \}\right \rvert }\) |
Cohen’s kappa | \(s\left (g_{1},g_{2}\right)=\frac {p_{0}-p_{c}}{1-p_{c}}\) where: |
- \(p_{0}\) is the proportion of terms common to profiles \(g_{1}\) and \(g_{2}\), and | |
- \(p_{c}\) is the proportion of terms common to profiles \(g_{1}\) and \(g_{2}\) expected by chance. | |
TF-IDF similarity | \(s\left (g_{1},g_{2}\right) = \max _{p \in P}\left \{x^{p}_{g1}x^{p}_{g2}IDF(p)\right \}\) where \(IDF(p)=\log \frac {n_{G}}{1+\sum _{g \in G}{x^{p}_{g}}}\) |
Resnik’s semantic similarity | \(s\left (t_{1},t_{2}\right) = IC\left (t_{MICA}\right)\) where: |
- the Most Informative Common Ancestor is \(t_{MICA}={argmax}_{t \in S\left (t_{1},t_{2}\right)}{IC(t)}\), | |
- the information content (IC) of a term \(t\) is \(IC(t)=-\log \left (p(t)\right)\), | |
- the probability of a term t is \(p(t)=\frac {annotations(t)}{totalAnnotations}\), and | |
- \(S\left (t_{1},t_{2}\right)\) is the set of common ancestors of \(t_{1}\) and \(t_{2}\). | |
Lin’s semantic similarity | \(s\left (t_{1},t_{2}\right) = {\frac {{2\cdot IC\left (t_{MICA}\right)}}{IC\left (t_{1}\right)+IC\left (t_{2}\right)}}\) |
Schlicker’s semantic similarity | \(s\left (t_{1},t_{2}\right) = \frac {2\cdot IC\left (t_{MICA}\right)}{IC\left (t_{1}\right)+IC\left (t_{2}\right)}\cdot \left (1-p\left (t_{MICA}\right)\right)\) |
Jiang’s semantic similarity | \(s\left (t_{1},t_{2}\right) = 1+2\cdot IC\left (t_{MICA}\right)-\left (IC\left (t_{1}\right)+IC\left (t_{2}\right)\right)\) |
Pesquita’s semantic similarity | \(s\left (t_{1},t_{2}\right) = \frac {\sum \limits _{t \in S(t_{1},t_{2})}{IC(t)}}{\sum \limits _{t \in P(t_{1},t_{2})}{IC(t)}}\) where: |
- \(P\left (t_{1},t_{2}\right)\) is the set of ancestors of either \(t_{1}\) or \(t_{2}\). |
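
Several of the measures in the table can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy binary profiles `g1` and `g2`, and the toy ontology (terms `root`, `A`, `B`, `C` with made-up annotation counts) are all hypothetical. It also assumes that a term counts as its own ancestor and that profiles are non-empty and not all zero, so no division-by-zero handling is included.

```python
import math

# --- Vector-based measures on phenotype profiles (equal-length lists) ---

def euclidean_similarity(x1, x2):
    # 1 / (1 + squared Euclidean distance), following the table's formula
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return 1.0 / (1.0 + sq_dist)

def cosine_similarity(x1, x2):
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    return dot / (norm1 * norm2)

def correlation_similarity(x1, x2):
    # Pearson correlation: cosine similarity of the mean-centred profiles
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    return cosine_similarity([a - m1 for a in x1], [b - m2 for b in x2])

def hamming_similarity(x1, x2):
    # Fraction of positions at which the two profiles agree
    return sum(1 for a, b in zip(x1, x2) if a == b) / len(x1)

def jaccard_similarity(x1, x2):
    # Only positions where at least one profile is non-zero are considered
    union = [(a, b) for a, b in zip(x1, x2) if a != 0 or b != 0]
    mismatches = sum(1 for a, b in union if a != b)
    return 1.0 - mismatches / len(union)

# --- Term-based measures on a toy ontology (hypothetical data) ---

def information_content(count, total):
    # IC(t) = -log(p(t)), with p(t) = annotations(t) / totalAnnotations
    return -math.log(count / total)

def resnik_similarity(t1, t2, ancestors, ic):
    # IC of the most informative common ancestor (MICA)
    common = ancestors[t1] & ancestors[t2]
    return max(ic[t] for t in common)

def lin_similarity(t1, t2, ancestors, ic):
    return 2 * resnik_similarity(t1, t2, ancestors, ic) / (ic[t1] + ic[t2])

# Toy profiles: 1 means the phenotype is annotated to the gene
g1 = [1, 0, 1, 1, 0]
g2 = [1, 1, 0, 1, 0]

# Toy ontology: root -> A -> {B, C}; each term is listed among its own ancestors
ancestors = {
    "root": {"root"},
    "A": {"root", "A"},
    "B": {"root", "A", "B"},
    "C": {"root", "A", "C"},
}
annotation_counts = {"root": 8, "A": 4, "B": 2, "C": 1}
ic = {t: information_content(c, 8) for t, c in annotation_counts.items()}

if __name__ == "__main__":
    print("Hamming:", hamming_similarity(g1, g2))
    print("Jaccard:", jaccard_similarity(g1, g2))
    print("Resnik(B, C):", resnik_similarity("B", "C", ancestors, ic))
    print("Lin(B, C):", lin_similarity("B", "C", ancestors, ic))
```

For the toy profiles, the two genes agree at three of five positions (Hamming similarity 0.6) and share two of the four phenotypes annotated to either gene (Jaccard similarity 0.5). In the toy ontology, the most informative common ancestor of `B` and `C` is `A`, so the Resnik score equals IC(`A`).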