Table 3 Alignment accuracy for global and local homologies of different evolutionary models implemented under the e2msa local alignment algorithm

From: Parameterizing sequence alignment with an explicit evolutionary model

 

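Alignment accuracy [AUC for F measure (%)]

| Method | Global homology set, short parameterization | Global homology set, long parameterization | Global homology set, optimal parameterization | Local homology set, short parameterization | Local homology set, long parameterization | Local homology set, optimal parameterization |
|---|---|---|---|---|---|---|
| e2msa.afg | 71.4 | 80.4 | 80.3 | 68.2 | 68.2 | 73.6 |
| e2msa.aga | 71.4 | 80.4 | 80.1 | 68.2 | 67.3 | 73.6 |
| e2msa.aif | 71.3 | 80.4 | 80.2 | 68.1 | 68.3 | 73.3 |
| e2msa.tkf92 | 71.2 | 80.0 | 79.9 | 68.1 | 68.2 | 73.4 |
| e2msa.afr | 71.7 | 80.0 | 79.8 | 68.1 | 68.2 | 73.3 |
| e2msa.aali | 71.0 | 78.7 | 78.6 | 67.9 | 66.4 | 72.7 |
| e2msa.tkf91 | 69.5 | 75.4 | 74.5 | 66.2 | 69.1 | 70.7 |
| PHMMER (no filters) | | 78.7 | | | 72.9 | |
| SSEARCH (BLOSUM62, -11/-1) | | 80.0 | | | 71.7 | |
| NCBIBLAST | | 78.9 | | | 68.4 | |
| MSAProbs | | 81.7 | | | NA | |
| MUSCLE | | 80.8 | | | NA | |
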
  1. The “Global Homology set” is the one used in Fig. 7. The “Local Homology set” is the one used in Fig. 8. The e2msa algorithm was run in local mode with three different parameterizations: two at a fixed branch length (a short-branch and a long-branch parameterization, introduced in Fig. 7), and a variable optimal-time parameterization that uses, for each homology, the branch length that optimizes the probability of the sequences given the model. The rate parameters for all evolutionary models were obtained using the same training set, “Pfam.seed.S1000.sto”. For all experiments, alignments are binned into 5% identity groups, and the total F measure for a bin is calculated by adding all alignments in that bin. To provide a single number, we report the area under the curve (AUC) for the F measure of alignments covering all identity ranges. For comparison, we provide results for other standard methods. Methods have been ranked by their combined performance in both sets. Methods such as MSAProbs and MUSCLE work only in “global” alignment mode, and they are not appropriate for detecting local homologies.
  2. In bold, we indicate the best performing of the three alternative parameterizations
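
The summary statistic described in footnote 1 (bin alignments by percent identity, compute one F measure per bin, then reduce the curve to a single AUC value) can be sketched as follows. This is a minimal illustration and not the authors' code. It rests on several assumptions that the footnote does not spell out: that "adding all alignments in a bin" means pooling the residue-pair counts before computing F, that F is the usual harmonic mean of sensitivity and positive predictive value, and that the AUC is a trapezoidal area over bin centers normalized by the identity range covered, so that the result stays on a 0-100 scale. The names `AlignmentResult`, `binned_f`, and `auc_percent` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AlignmentResult:
    """Per-alignment counts (hypothetical record layout)."""
    identity: float    # percent identity of the reference alignment, 0-100
    true_pos: int      # predicted residue pairs that are also in the reference
    n_predicted: int   # residue pairs in the predicted alignment
    n_reference: int   # residue pairs in the reference alignment


def f_measure(true_pos, n_predicted, n_reference):
    """Harmonic mean of sensitivity and positive predictive value."""
    if true_pos == 0 or n_predicted == 0 or n_reference == 0:
        return 0.0
    sen = true_pos / n_reference
    ppv = true_pos / n_predicted
    return 2 * sen * ppv / (sen + ppv)


def binned_f(results, bin_width=5.0):
    """Pool counts per identity bin; return (bin center, F in percent) pairs.

    Assumption: counts are summed over all alignments in a bin before
    F is computed, rather than averaging per-alignment F values.
    """
    last_bin = int(100 // bin_width) - 1
    pooled = {}
    for r in results:
        idx = min(int(r.identity // bin_width), last_bin)  # 100% joins the top bin
        tp, n_pred, n_ref = pooled.get(idx, (0, 0, 0))
        pooled[idx] = (tp + r.true_pos,
                       n_pred + r.n_predicted,
                       n_ref + r.n_reference)
    return [((idx + 0.5) * bin_width, 100 * f_measure(*counts))
            for idx, counts in sorted(pooled.items())]


def auc_percent(points):
    """Trapezoidal area under the F curve, divided by the identity range.

    Assumption: normalizing by the covered range keeps the value in percent.
    """
    if not points:
        return 0.0
    if len(points) == 1:
        return points[0][1]
    area = sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return area / (points[-1][0] - points[0][0])


if __name__ == "__main__":
    toy = [
        AlignmentResult(identity=22.0, true_pos=8, n_predicted=10, n_reference=10),
        AlignmentResult(identity=23.0, true_pos=2, n_predicted=10, n_reference=10),
        AlignmentResult(identity=42.0, true_pos=9, n_predicted=10, n_reference=10),
    ]
    curve = binned_f(toy)
    print(curve, auc_percent(curve))
```

Pooling counts within a bin weights long alignments more heavily than short ones; averaging per-alignment F values would be a different, equally plausible reading of the footnote and would generally give different numbers.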