Table 1 Notations used in this paper

From: RefSelect: a reference sequence selection algorithm for planted (l, d) motif search

Notation Explanation
|x| The length of a string, the size of a set, or the number of elements in a matrix.
D, D' D is the set of input sequences. D' is the set of reference sequences. D = {s_1, s_2, …, s_t} and D' = {s_r1, s_r2, …, s_rk}, satisfying D' ⊆ D.
t The number of sequences in the input sequence set D, namely |D| = t.
k The number of required reference sequences, namely |D'| = k.
n The length of each input sequence.
x ∈_l s The string x is an l-length substring of the sequence s. In other words, x is an l-mer in the sequence s.
s[i] The ith character in the string s.
s[i…j] A substring of the string s starting from the ith position to the jth position.
d_H(x, x') The Hamming distance between two strings x and x' of the same length.
M_d(x, x') The common candidate motifs of two l-mers x and x'. M_d(x, x') = {y: |y| = |x| = |x'|, d_H(y, x) ≤ d, d_H(y, x') ≤ d}.
N_r(D') The number of candidate motifs generated from the reference sequence set D', calculated by (1).
N_r(s_i, s_j) The number of candidate motifs generated from two sequences s_i and s_j, calculated by (2).
min(i, j) The minimum value between two integers i and j. min(i, j) = i if i ≤ j, j otherwise.
sim(s_i, s_j) The similarity of two sequences s_i and s_j.
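
As an illustration of three of the notations above (l-mers, the Hamming distance, and the common candidate motifs of two l-mers), the following is a minimal Python sketch. It is not the paper's algorithm: the function names `lmers`, `hamming`, and `common_candidate_motifs` are hypothetical, the DNA alphabet "ACGT" is an assumption not stated in the table, and the set is built by brute-force enumeration of all strings of the given length, which is only practical for small l. Note also that Python indexes strings from 0, whereas the table's s[i] may be 1-based.

```python
from itertools import product


def lmers(s, l):
    """Return every l-length substring (l-mer) of s, in order of position."""
    return [s[i:i + l] for i in range(len(s) - l + 1)]


def hamming(x, y):
    """Hamming distance between two strings of the same length."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, y))


def common_candidate_motifs(x, y, d, alphabet="ACGT"):
    """Brute-force version of the set of strings within distance d of both x and y.

    The alphabet is an assumption (DNA); enumeration costs |alphabet|**len(x).
    """
    if len(x) != len(y):
        raise ValueError("l-mers must have the same length")
    return {
        "".join(chars)
        for chars in product(alphabet, repeat=len(x))
        if hamming("".join(chars), x) <= d and hamming("".join(chars), y) <= d
    }
```

For example, under these assumptions `common_candidate_motifs("AA", "CC", 1)` contains exactly the two strings "AC" and "CA", since each differs from both inputs in one position.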