Table 6 Tested substitution matrices

Matrix Reference Description
Evolutionary matrices
 PAM 30, 60, 120, 250 Dayhoff et al. [1] Evolutionary model of point substitutions
 Blosum45, 50, 62 Henikoff et al. [2] Series based on the alignment of segments of related sequences from protein families grouped into blocks
 Gonnet250 Gonnet et al. [5] Matrix derived from substitutions observed in protein families in an extended database, for long evolutionary distances
 Gonnet_p Vogt et al. [6] Later modification of Gonnet250
 Optima Kann et al. [7] Discrimination between homologues and non-homologues at large evolutionary distances
 VTML250 Muller et al. [3] Improved evolutionary model based on the maximum likelihood method (for distant homologue detection)
 MIQS Yamada et al. [8] Derived by principal component analysis of previously known matrices (Blosum, VTML, Benner)
 Pfasum 50, 100 Keul et al. [4] Model based on modern data covering a large and diverse sequence space.
Matrix based on a Dirichlet mixture model
 Crooks Crooks et al. [9] The model accounts for how substitution dynamics vary with evolutionary time.
Evolutionary matrices for special protein families
 CCF53 Brick et al. [10] Search for homologues in families of related proteins, taking into account the amino acid composition bias characteristic of proteins from two species of the genus Plasmodium.
 MOLLI60 Lemaitre et al. [11] General method for constructing matrices tailored to a particular amino acid composition bias, illustrated with bacterial proteins of the class Mollicutes.
Matrices based on structural alignment
 Johnson Johnson et al. [12] Obtained by counting amino acid substitutions in structural alignments of proteins from homologous families with low sequence identity.
 Prlic Prlic et al. [13] Obtained by superposing pairs of proteins with similar structure but low sequence identity.
 Blake Blake et al. [14] Based on structural superposition data, taking into account differences in amino acid substitution patterns between distant and closely related homologues.
Genetic code matrix
 Benner Benner et al. [16] Based on the number of nucleotide substitutions required for a given amino acid substitution.
Contact energy matrix
 Miyazawa Miyazawa et al. [15] Based on the distribution of residue contacts in three-dimensional protein structures.
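
The genetic code entry in the table rests on a simple count: the number of nucleotide substitutions needed to turn a codon of one amino acid into a codon of another. The sketch below illustrates only that counting principle. It assumes the standard genetic code and takes the minimum over all codon pairs. It is not the procedure used to build the matrix cited in the table, and the names `CODONS_BY_RESIDUE` and `min_nucleotide_changes` are hypothetical, introduced here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (third position varies fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each residue (one-letter code) to the list of codons that encode it.
CODONS_BY_RESIDUE = {}
for bases, residue in zip(product(BASES, repeat=3), AMINO_ACIDS):
    CODONS_BY_RESIDUE.setdefault(residue, []).append("".join(bases))


def min_nucleotide_changes(aa1, aa2):
    """Return the minimum number of nucleotide substitutions (0 to 3)
    needed to convert some codon of aa1 into some codon of aa2."""
    return min(
        sum(x != y for x, y in zip(codon1, codon2))
        for codon1 in CODONS_BY_RESIDUE[aa1]
        for codon2 in CODONS_BY_RESIDUE[aa2]
    )


if __name__ == "__main__":
    # Phe (TTT) and Leu (TTA) differ at one position;
    # Trp (TGG) and Met (ATG) differ at two.
    print(min_nucleotide_changes("F", "L"))
    print(min_nucleotide_changes("W", "M"))
```

Such raw counts are distances, so a scoring matrix built on them would still need to convert small counts into high scores; that conversion is left out here because the table does not specify it.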