
Table 1 Segregation success and run-times for different clustering strategies: a) a single peptide pre-sort with increasing numbers of alignment cycles, b) staged alignment cycles from 90 to 50 % identity, each with a peptide pre-sort (using 1 pass), c) as for (b) but with multiple pre-sort cycles (indicated in parentheses)

From: Reduction, alignment and visualisation of large diverse sequence families

a Single pre-sort
Alignment stages (to 50 %)   Time (sec.)   Sequences selected   Remaining subfamilies
1                            58.8          1658                 503
2                            91.4          355                  302
3                            226.1         196                  171
4                            462.6         175                  154
5                            727.1         172                  151
b Staged pre-sort
Alignment stages (3 to X%)   Time (sec.)   Sequences selected   Remaining subfamilies
90 (1)                       36.30         1597                 947
80 (1)                       1.33          563                  314
70 (1)                       1.00          165                  31
60 (1)                       0.53          104                  24
50 (1)                       0.47          71                   21
c Staged (multi-pass) pre-sort
Alignment stages (3 to X%)   Time (sec.)   Sequences selected   Remaining subfamilies
90 (8)                       10.62         3641                 598
80 (4)                       4.26          1034                 93
70 (2)                       2.31          285                  40
60 (1)                       1.10          98                   20
50 (1)                       0.42          62                   21
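
To compare the strategies on total cost, the staged timings can be summed and set against the five-stage single pre-sort run. The short Python sketch below does this arithmetic on the values transcribed from the table. It assumes that the times in panels b and c are per-stage values that add up to the cost of a full 90 to 50 % run, and that the 727.1 sec. entry in panel a is the cost of one complete five-stage run; neither reading is stated in the table itself, and the variable and function names are illustrative only.

```python
# Timings in seconds, transcribed from Table 1.
# ASSUMPTION: panel b and c times are per-stage and additive;
# the panel a value is the cost of a single complete five-stage run.
single_presort_five_stages = 727.1
staged_single_pass = [36.30, 1.33, 1.00, 0.53, 0.47]   # panel b, 90 -> 50 %
staged_multi_pass = [10.62, 4.26, 2.31, 1.10, 0.42]    # panel c, 90 -> 50 %


def total_time(stage_times):
    """Sum per-stage times, rounded to the table's two decimal places."""
    return round(sum(stage_times), 2)


total_b = total_time(staged_single_pass)
total_c = total_time(staged_multi_pass)

for label, total in (("b (single pass)", total_b), ("c (multi-pass)", total_c)):
    ratio = single_presort_five_stages / total
    print(f"Panel {label}: {total:.2f} sec. in total, "
          f"about {ratio:.0f}x less than the single pre-sort run")
```

Under these assumptions the staged runs total 39.63 sec. (panel b) and 18.71 sec. (panel c). The comparison covers run-time only; the number of sequences selected and subfamilies remaining differ between strategies and should be read from the table.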
  1. The data columns indicate the elapsed time in seconds (real time reported by the Linux time utility), the number of the 10,000 starting sequences remaining after each stage and the number of families or subfamilies (defined by sequence adjacency