
Table 1 Results of assemblies of actual Illumina sequencing data on a 3.0 GHz Xeon processor with 32 GB memory.

From: QSRA – a quality-value guided de novo short read assembler

| Organism | Error rate (%) | Depth | Program | Program version | Program release date | Maximum RAM used (GB) | Run time (s) | Genome covered (%) | Largest contig (bp) | N50 (bp) | N80 (bp) | Number of contigs | Runtime options |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pina | 5.33 | 479 | SSAKE | 3.2 | 2008 | 0.91 | 3463 | 79.8 | 3051 | 241 | N/A | 24686 | -m 16 |
| pina | 5.33 | 479 | VCAKE | 1.0 | 05/2007 | 0.74 | 8400 | 68.1 | 1721 | 101 | N/A | 188778 | -k 33 -o 34 |
| pina | 5.33 | 479 | VELVET | 0.6.04 | 03/2008 | 0.36 | 74 | 58.5 | 3076 | 285 | N/A | 464 | -min_contig_lgth 34 |
| pina | 5.33 | 479 | EDENA | 2.1.1 | 2008 | 0.24 | 210 | 77.8 | 3329 | 400 | N/A | 3377 | -c 34 |
| pina | 5.33 | 479 | QSRA | 06032008 | 06/2008 | 0.84 | 1553 | 93.1 | 1046 | 94 | 86 | 32473 | -k 33 -o 34 |
| pina | 5.33 | 479 | QSRA* | 06032008 | 06/2008 | 0.91 | 1301 | 99.3 | 1771 | 85 | 85 | 83004 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | SSAKE | 3.2 | 2008 | 0.78 | 1936 | 85.1 | 3613 | 347 | 42 | 18093 | -m 16 |
| gera03 | 5.18 | 376 | VCAKE | 1.0 | 05/2007 | 0.64 | 3114 | 82.6 | 1964 | 157 | 96 | 175451 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | VELVET | 0.6.04 | 03/2008 | 0.32 | 55 | 60.2 | 4296 | 386 | N/A | 311 | -min_contig_lgth 34 |
| gera03 | 5.18 | 376 | EDENA | 2.1.1 | 2008 | 0.16 | 98 | 88.9 | 3285 | 535 | 41 | 1977 | -c 34 |
| gera03 | 5.18 | 376 | QSRA | 06032008 | 06/2008 | 0.69 | 733 | 99.1 | 3012 | 71 | 71 | 21132 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | QSRA* | 06032008 | 06/2008 | 0.75 | 641 | 99.0 | 3012 | 569 | 167 | 60584 | -k 33 -o 34 |
| suisp | 2.26 | 49 | SSAKE | 3.2 | 2008 | 2.21 | 5941 | 95.8 | 6475 | 1036 | 355 | 15632 | -m 16 |
| suisp | 2.26 | 49 | VCAKE | 1.0 | 05/2007 | 1.66 | 7202 | 99.0 | 11894 | 1577 | 718 | 487006 | -k 36, -o 37 |
| suisp | 2.26 | 49 | VELVET | 0.6.04 | 03/2008 | 0.74 | 144 | 96.4 | 18690 | 4401 | 1992 | 1185 | -min_contig_lgth 37 |
| suisp | 2.26 | 49 | EDENA | 2.1.1 | 2008 | 0.48 | 357 | 97.3 | 8829 | 1836 | 759 | 3254 | -c 37 |
| suisp | 2.26 | 49 | QSRA | 06032008 | 06/2008 | 1.89 | 3329 | 96.9 | 11934 | 2432 | 259 | 18834 | -k 36 -o 37 |
| suisp | 2.26 | 49 | QSRA* | 06032008 | 06/2008 | 2.18 | 3628 | 98.5 | 11934 | 2370 | 259 | 168464 | -k 36 -o 37 |

  1. Six tests were run for each of three data sets: SSAKE, VCAKE, VELVET, EDENA, QSRA without q-values (listed simply as QSRA in Table 1), and QSRA with q-values (listed as QSRA* in Table 1). In addition to the runtime options listed for VELVET, each VELVET run used a tile size of 19 and a coverage cutoff of 5. Only contigs were used in the calculation of coverage, N50, and N80 values; unextended singletons were discarded. Coverage values were determined through analysis of BLAT output by comparing the total number of bases in the reference genome with the number of bases uniquely "hit" by the BLAT alignments with assembled contigs. Thus, any contig that BLAT, using the default value of 90% identity, could not match to its reference genome did not contribute to coverage calculations. The N50 and N80 values equal the length of the largest contig such that it and all longer contigs account for 50% and 80%, respectively, of total genome coverage. For S. suis, 43.8% of the 36-mer Illumina reads in the data set matched perfectly to the reference genome, which corresponds to an estimated average error rate per sequence base of 2.26%.
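
The N50/N80 definition and the error-rate estimate in the footnote can be illustrated with a minimal Python sketch. It is not the authors' code, and both function names (`nx_length`, `per_base_error_rate`) are hypothetical. Two assumptions are made. First, the 50%/80% threshold is taken relative to the reference genome length and accumulated over raw contig lengths; the N/A entries in the N80 column all fall on runs with less than 80% genome coverage, which is consistent with this reading but does not confirm it. Second, the error-rate estimate assumes independent, uniformly distributed base errors, so that a read of length n is error-free with probability (1 - e)^n; the footnote does not state which formula was actually used.

```python
from typing import Optional, Sequence


def nx_length(contig_lengths: Sequence[int], genome_length: int,
              fraction: float) -> Optional[int]:
    """Length of the largest contig such that it and all longer contigs
    together reach `fraction` of `genome_length`.

    Returns None when the contigs never reach the threshold
    (assumed here to correspond to an N/A table entry).
    """
    target = fraction * genome_length
    running_total = 0
    # Walk from the longest contig down, accumulating length.
    for length in sorted(contig_lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return None


def per_base_error_rate(perfect_fraction: float, read_length: int) -> float:
    """Solve (1 - e) ** read_length == perfect_fraction for e.

    Assumes errors are independent and equally likely at every base.
    """
    return 1.0 - perfect_fraction ** (1.0 / read_length)


if __name__ == "__main__":
    # Hypothetical contig lengths, not taken from Table 1.
    toy_contigs = [400, 300, 200, 100]
    print("N50:", nx_length(toy_contigs, genome_length=1000, fraction=0.5))
    print("N80:", nx_length(toy_contigs, genome_length=1000, fraction=0.8))
    # Values from the footnote: 43.8% perfect matches among 36-mer reads.
    print("error rate:", per_base_error_rate(0.438, 36))
```

With the footnote's inputs (43.8% perfect matches, 36-mer reads), this formula gives about 2.27% per base, close to the reported 2.26%; the small gap is within what rounding of the 43.8% figure would explain.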