Metric | Definition | Description |
---|---|---|
Smith–Waterman similarity | \(SW(X,Y) = \max \limits _{\substack {\scriptscriptstyle x \in seq(X)\\ \scriptscriptstyle y \in seq(Y)}} \left (\frac {sw(x,y)}{len(x)+len(y)}\right)\) | The Smith–Waterman score sw(x,y) is the maximum score of a local alignment between the two strings, where the alignment is built from matches, substitutions, and insertions or deletions (gaps) of single characters [46]. Matches are rewarded with +5 and substitutions are penalized with -4 (NUC 4.4 substitution matrix); gaps score zero, i.e., they carry no penalty. The time complexity is O(len(x)·len(y)). |
Damerau–Levenshtein distance | \(DLevDist(X,Y) = \min \limits _{\substack {\scriptscriptstyle x \in seq(X)\\ \scriptscriptstyle y \in seq(Y)}} \left (\frac {dl(x,y)}{len(x)+len(y)}\right)\) | The Damerau–Levenshtein distance dl(x,y) is the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character, or a transposition of two adjacent characters [47]. The time complexity is O(len(x)·len(y)). |
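
The two definitions in the table can be illustrated with a minimal Python sketch. It rests on several assumptions that the table does not confirm: seq(X) is treated as a plain collection of non-empty strings, the Damerau–Levenshtein variant is taken to be the restricted one (optimal string alignment, in which no substring is edited more than once), and the function names `sw_score`, `dl_distance`, `sw_similarity`, and `dlev_dist` are hypothetical. Both dynamic programs run in O(len(x)·len(y)) time, matching the complexity stated above.

```python
def sw_score(x, y, match=5, mismatch=-4, gap=0):
    """Best local alignment score of x and y (Smith-Waterman recurrence).

    Defaults follow the table: +5 match, -4 substitution, zero-cost gaps.
    """
    prev = [0] * (len(y) + 1)
    best = 0
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # match or substitution
                prev[j] + gap,      # gap in y
                curr[j - 1] + gap,  # gap in x
            )
            best = max(best, curr[j])
        prev = curr
    return best


def dl_distance(x, y):
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Assumption: the table does not say which variant is meant.
    """
    n, m = len(x), len(y)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i  # delete all of x[:i]
    for j in range(m + 1):
        d[0][j] = j  # insert all of y[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # match or substitution
            )
            # transposition of two adjacent characters
            if i > 1 and j > 1 and x[i - 1] == y[j - 2] and x[i - 2] == y[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[n][m]


def sw_similarity(seqs_x, seqs_y):
    """SW(X,Y): maximum length-normalized score over all pairs."""
    return max(sw_score(x, y) / (len(x) + len(y)) for x in seqs_x for y in seqs_y)


def dlev_dist(seqs_x, seqs_y):
    """DLevDist(X,Y): minimum length-normalized distance over all pairs."""
    return min(dl_distance(x, y) / (len(x) + len(y)) for x in seqs_x for y in seqs_y)
```

For example, `sw_similarity(["ACGT"], ["ACGT", "AGT"])` compares 20/8 for the identical pair with 15/7 for the pair that needs one gap and returns the larger value. With zero-cost gaps a substitution never improves the score, so in this sketch `sw_score` reduces to five times the length of the longest common subsequence.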