
Table 1 A summary of experimental results for 23 distinct sets of sequences

From: M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species

| # | Sequence set | Size (MB) | 𝒜 size | 𝒜 | ℳ | 𝒞 | t anchor | t mum | t total | Mem (MB) | Cov. (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Mycoplasma 2 | 1.5 | 22 | 264 | 6325 | 649 | 2s | 5s | 8s | 52 | 72.8 |
| 2 | Pyrococcus 2 | 3.5 | 23 | 1159 | 3229 | 484 | 6s | 3s | 9s | 153 | 62.5 |
| 3 | Salmonella 2 | 9.6 | 27 | 470 | 516 | 39 | 16s | 1s | 17s | 419 | 98.9 |
| 4 | Listeria 3 | 8.7 | 24 | 13101 | 45940 | 722 | 16s | 114s | 143s | 283 | 94.3 |
| 5 | X. Campestris 3 | 15.3 | 27 | 15843 | 37702 | 2441 | 25s | 151s | 181s | 487 | 74.8 |
| 6 | P. Syringae 3 | 18.4 | 27 | 11232 | 39753 | 1527 | 35s | 252s | 294s | 573 | 72.8 |
| 7 | C. Pneumoniae 4 | 4.9 | 21 | 770 | 0 | 7 | 6s | 0s | 6s | 156 | 98.5 |
| 8 | Yersinia 4 | 21.4 | 25 | 14049 | 6 | 400 | 24s | 1s | 25s | 488 | 94.0 |
| 9 | Shigella 5 | 23.1 | 23 | 37596 | 1285 | 564 | 32s | 2s | 38s | 548 | 76.7 |
| 10 | Salmonella 5 | 23.8 | 23 | 46336 | 983 | 328 | 35s | 1s | 39s | 567 | 93.1 |
| 11 | E. Coli 5 | 25.3 | 23 | 47221 | 5543 | 704 | 38s | 9s | 57s | 553 | 84.2 |
| 12 | Streptococcus 7 | 13.1 | 21 | 15446 | 84 | 121 | 17s | 1s | 18s | 258 | 88.3 |
| 13 | Staphylococcus 7 | 19.6 | 24 | 23216 | 132 | 260 | 24s | 2s | 26s | 390 | 92.6 |
| 14 | Bacillus 7 | 36.8 | 25 | 27731 | 4149 | 468 | 54s | 7s | 62s | 713 | 93.2 |
| 15 | Entero 10 (9&11) | 48.4 | 23 | 39979 | 3753 | 418 | 63s | 12s | 78s | 740 | 73.3 |
| 16 | Entero 15 (9&10&11) | 72.2 | 23 | 5802 | 8136 | 1218 | 95s | 84s | 181s | 991 | 54.9 |
| 17 | Entero 19 (8&9&10&11) | 93.6 | 18 | 1132 | 637 | 907 | 161s | 99s | 261s | 1174 | 15.6 |
| 18 | Bacilli 14 (12&13) | 22.7 | 19 | 251 | 3801 | 2721 | 43s | 41s | 93s | 414 | 14.3 |
| 19 | Bacilli 14 (13&14) | 56.4 | 19 | 431 | 5250 | 2718 | 79s | 100s | 185s | 654 | 26.3 |
| 20 | Bacilli 21 (12&13&14) | 62.3 | 15 | 597 | 1691 | 2045 | 100s | 54s | 155s | 638 | 4.1 |

A selection of results for 20 independent sets of closely related sequence comparisons conducted with M-GCAT. Size and memory usage are listed in megabytes (MB). All experiments were performed, and running times (CPU time) measured, on a 2 GHz Pentium processor with 2 GB of main memory, running Windows XP Professional. Size is the total size (MB) of the set of sequences. 𝒜 is the number of multi-MUM anchors found, 𝒜 size is the configured minimum size of multi-MUM anchors, ℳ is the number of multi-MUMs found, and 𝒞 is the number of multi-MUM clusters. t anchor is the time needed to find the set of multi-MUM anchors, t mum is the time needed to find the initial set of multi-MUMs, and t total is the time required to perform the entire comparison.
Mem is the peak usage of system memory (MB), and Cov. is the percentage of each sequence that was aligned. The percentage that was not aligned corresponds to regions where no multi-MUMs were found. A p value of 10,000,000 and a q value of 100 were used for all experiments. The d value was set to the length of the longest sequence in each example to emphasize the global alignment framework. For a complete listing of the sequences used in these comparisons, refer to Additional file 2.
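
As a reading aid for the timing columns, the following minimal Python sketch takes three rows from the table and computes how much of t total is not covered by t anchor and t mum, along with the share of t total spent in the multi-MUM search. The function names `remaining_seconds` and `mum_share` are hypothetical helpers introduced here for illustration; they are not part of M-GCAT. Attributing the remainder to later steps of the comparison is an assumption, since the table does not break it down further.

```python
# Timings in seconds, copied from three rows of Table 1.
ROWS = [
    # (sequence set, t_anchor, t_mum, t_total)
    ("Mycoplasma 2", 2, 5, 8),
    ("Listeria 3", 16, 114, 143),
    ("E. Coli 5", 38, 9, 57),
]


def remaining_seconds(t_anchor, t_mum, t_total):
    """Seconds of t_total not accounted for by t_anchor and t_mum.

    Assumption: the remainder reflects other steps of the comparison;
    the table itself does not say which ones.
    """
    return t_total - (t_anchor + t_mum)


def mum_share(t_mum, t_total):
    """Fraction of the total running time spent in the multi-MUM search."""
    return t_mum / t_total


for name, t_anchor, t_mum, t_total in ROWS:
    rest = remaining_seconds(t_anchor, t_mum, t_total)
    share = mum_share(t_mum, t_total)
    print(f"{name}: remaining {rest}s, multi-MUM share {share:.0%}")
```

The same two helpers can be applied to any other row by appending its timing values to `ROWS`.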