Table 1 Performance in aligning yeast sequence

From: Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model

| Program | Dataset | Matches per base (a) | Dataset | Matches per base (a) | Difference |
|---|---|---|---|---|---|
| Sigma-2 | orthologous | 1.9893 | shuffled | 0.0031 | 1.9862 |
| Sigma-1.1.3 | orthologous | 1.8688 | shuffled | 0.0050 | 1.8638 |
| FSA | orthologous | 2.4695 | shuffled | 0.1465 | 2.3230 |
| Dialign-TX | orthologous | 2.7498 | shuffled | 0.4539 | 2.2959 |
| Pecan | orthologous | 3.0234 | shuffled | 0.4430 | 2.5804 |
| Mavid | orthologous | 3.3181 | shuffled | 2.8248 | 0.4933 |
| T-Coffee | orthologous | 3.5582 | shuffled | 3.3495 | 0.2487 |
| Clustal-W | orthologous | 3.6202 | shuffled | 3.7517 | -0.1315 |
| KAlign | orthologous | 3.7480 | shuffled | 3.8434 | -0.0954 |
| MLagan | orthologous | 3.2956 | shuffled | 2.7082 | 0.5874 |
| Muscle | orthologous | 3.4541 | shuffled | 3.1901 | 0.2670 |
| PCMA | orthologous | 3.4822 | shuffled | 2.8941 | 0.5881 |
a. Performance on the yeast benchmark, described in the text, of 12 programs (including two versions of Sigma). 947 genes were selected, each of which had 1000 bp of non-coding upstream sequence in S. cerevisiae and four other species. Each upstream sequence and its four orthologues were aligned (dataset "orthologous"). In addition, 947 "scrambled" files were prepared, each containing sequence from each of the five species but no orthologous sequences, and these were aligned (dataset "shuffled" in the table). "Matches per base" is the average number of nucleotides in other species with which each nucleotide in the input data was aligned, so its theoretical maximum is four. The difference between the "orthologous" and "shuffled" values measures how well the program discriminates genuine orthology from unrelated sequence.
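
The "matches per base" statistic defined in the footnote can be illustrated with a minimal Python sketch. It is not the authors' scoring script. It assumes the alignment is given as equal-length gapped strings, one per species, with `-` as the gap character, and that two nucleotides count as "aligned with" each other whenever they share a column, whether or not they are identical. The function name `matches_per_base` is hypothetical.

```python
def matches_per_base(alignment, gap="-"):
    """Average number of nucleotides in other sequences that share a
    column with each nucleotide (assumed reading of "matches per base")."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all rows must have the same length")

    total_bases = 0      # nucleotides in the input data
    total_partners = 0   # summed count of aligned partners over all nucleotides
    for column in zip(*alignment):
        k = sum(1 for ch in column if ch != gap)  # nucleotides in this column
        total_bases += k
        total_partners += k * (k - 1)  # each of the k has k - 1 partners
    if total_bases == 0:
        raise ValueError("alignment contains no nucleotides")
    return total_partners / total_bases


# Five fully aligned sequences reach the theoretical maximum of four.
full = ["ACGT"] * 5
print(matches_per_base(full))

# Two sequences sharing only one column: 2 partners over 4 nucleotides.
partial = ["AC-", "A-G"]
print(matches_per_base(partial))
```

Under these assumptions, the "Difference" column would be the mean of this statistic over the orthologous files minus its mean over the shuffled files, so a program that aligns unrelated sequence as readily as orthologous sequence scores near zero.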