A comparative study of conservation and variation scores

BMC Bioinformatics

Table 4 Notations used in Table 3

N	No. of sequences in alignment.
A_ik	The amino acid in sequence i at alignment site k.
d(A_i, A_j)	Sequence distance in percent.
p_k	Probability estimated from site k.
p	Probability estimated from alignment.
q	Probability estimated from database.
S_b(k)
R(p_k, p)
V(p_k)	-Tr(ω log₂₀ ω), where Tr(ω) = 1 and ω = diag(p_k(α₁), ⋯, p_k(α₂₀)) × M_f
n_k	No. of occurrences in site k.
n	Average no. of occurrences in a site.
α₀(k)	Most common amino acid at k.
d_k	No. of different amino acids at k.
M	The BLOSUM62 matrix, containing log-odds ratios (blosum62.bla).
M_f	The BLOSUM62 matrix of frequencies (blosum62.qij).
M_V
M_K
M_M	M_f normalized such that each row and column approx. sums to 1.
M_L	M normalized such that M_L(α, α) = 10; 2 ≤ M_L(α, β) ≤ 10
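
As an illustration of the per-site notation, the following minimal Python sketch computes N, d_k, α₀(k), and p_k for one column of a toy alignment. It is not the authors' implementation. The function name `site_notation`, the toy sequences, the 0-based site index, the exclusion of gap characters, and the estimation of p_k as raw relative frequencies are all assumptions made for this example.

```python
from collections import Counter

GAP = "-"  # assumption: gaps are written as '-' and are not counted as amino acids


def site_notation(alignment, k):
    """Return N, d_k, alpha_0(k) and p_k for alignment site k (0-based, hypothetical helper)."""
    column = [seq[k] for seq in alignment]            # A_ik for every sequence i
    counts = Counter(a for a in column if a != GAP)   # occurrences of each amino acid at site k
    total = sum(counts.values())
    return {
        "N": len(alignment),                          # no. of sequences in the alignment
        "d_k": len(counts),                           # no. of different amino acids at k
        "alpha_0": counts.most_common(1)[0][0] if counts else None,  # most common amino acid at k
        # assumption: p_k is estimated as the raw relative frequency at site k
        "p_k": {a: c / total for a, c in counts.items()} if total else {},
    }


# Toy alignment of four sequences (illustrative data only).
alignment = ["ACDE", "ACDF", "AGD-", "ACEF"]
site = site_notation(alignment, 1)
```

For the second column of the toy alignment (C, C, G, C), the sketch gives d_k = 2 and α₀(k) = C. Ties for the most common amino acid are resolved by first occurrence here, which is a simplification.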

ISSN: 1471-2105