
Table 1 Performance in aligning yeast sequence

From: Sigma-2: Multiple sequence alignment of non-coding DNA via an evolutionary model

| Program | Dataset | Matches per base (a) | Dataset | Matches per base (a) | Difference |
|---|---|---|---|---|---|
| Sigma-2 | orthologous | 1.9893 | shuffled | 0.0031 | 1.9862 |
| Sigma-1.1.3 | orthologous | 1.8688 | shuffled | 0.0050 | 1.8638 |
| FSA | orthologous | 2.4695 | shuffled | 0.1465 | 2.3230 |
| Dialign-TX | orthologous | 2.7498 | shuffled | 0.4539 | 2.2959 |
| Pecan | orthologous | 3.0234 | shuffled | 0.4430 | 2.5804 |
| Mavid | orthologous | 3.3181 | shuffled | 2.8248 | 0.4933 |
| T-Coffee | orthologous | 3.5582 | shuffled | 3.3495 | 0.2487 |
| Clustal-W | orthologous | 3.6202 | shuffled | 3.7517 | -0.1315 |
| KAlign | orthologous | 3.7480 | shuffled | 3.8434 | -0.0954 |
| MLagan | orthologous | 3.2956 | shuffled | 2.7082 | 0.5874 |
| Muscle | orthologous | 3.4541 | shuffled | 3.1901 | 0.2670 |
| PCMA | orthologous | 3.4822 | shuffled | 2.8941 | 0.5881 |

  a. Performance of 12 programs (including two versions of Sigma) in the yeast benchmark described in the text. 947 genes were selected, each of which had 1000 bp of non-coding upstream sequence in S. cerevisiae and four other species. Each upstream sequence and its four orthologues were aligned (dataset "orthologous"). In addition, 947 shuffled files were prepared, each containing one sequence from each of the five species but no orthologous sequences, and these were aligned (dataset "shuffled"). "Matches per base" is the average number of nucleotides in other species that each nucleotide in the input data was aligned with, so its theoretical maximum is four. The difference between the "orthologous" and "shuffled" values measures how well the program discriminates genuine orthology from unrelated sequence.
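
The "matches per base" measure and the difference column can be illustrated with a minimal Python sketch. It is not the authors' benchmark code. It assumes that an alignment is given as equal-length gapped strings (one per species), that two nucleotides count as aligned whenever they share a column regardless of identity, and that gaps are written as "-" or "."; the function names matches_per_base and discrimination are hypothetical.

```python
def matches_per_base(alignment, gap_chars="-."):
    """Average number of other-species nucleotides each nucleotide is aligned with.

    alignment: list of equal-length gapped strings, one row per species.
    Assumption: sharing a column counts as "aligned", whether or not the
    two nucleotides are identical.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one row")
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows must have the same length")

    total_bases = 0      # nucleotides in the input data
    total_partners = 0   # summed count of aligned partners over all nucleotides
    for column in zip(*alignment):
        k = sum(1 for ch in column if ch not in gap_chars)
        total_bases += k
        # each of the k nucleotides in this column is aligned with k - 1 others
        total_partners += k * (k - 1)
    return total_partners / total_bases if total_bases else 0.0


def discrimination(orthologous_score, shuffled_score):
    """Difference column: orthologous score minus shuffled score."""
    return orthologous_score - shuffled_score


# Five species aligned with no gaps reach the theoretical maximum of four.
full = ["ACGT", "ACGT", "ACGA", "TCGT", "ACGT"]
print(matches_per_base(full))

# A sparse three-species alignment scores much lower.
sparse = ["AC-", "A-G", "--T"]
print(matches_per_base(sparse))

# Sigma-2 row of the table.
print(round(discrimination(1.9893, 0.0031), 4))
```

Under these assumptions, a program that aligns almost nothing in shuffled input scores near zero on that dataset, so its difference stays close to its orthologous score; a program that aligns unrelated sequence as readily as orthologous sequence has a difference near zero or below.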