Table 1 Results of assemblies of actual Illumina sequencing data on a 3.0 GHz Xeon processor with 32 GB memory.

From: QSRA – a quality-value guided de novo short read assembler

| Organism | Error rate (%) | Depth | Program | Program version | Program release date | Maximum RAM used (GB) | Run time (s) | Genome covered (%) | Largest contig (bp) | N50 (bp) | N80 (bp) | Number of contigs | Runtime options |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| pina | 5.33 | 479 | SSAKE | 3.2 | 2008 | 0.91 | 3463 | 79.8 | 3051 | 241 | N/A | 24686 | -m 16 |
| pina | 5.33 | 479 | VCAKE | 1.0 | 05/2007 | 0.74 | 8400 | 68.1 | 1721 | 101 | N/A | 188778 | -k 33 -o 34 |
| pina | 5.33 | 479 | VELVET | 0.6.04 | 03/2008 | 0.36 | 74 | 58.5 | 3076 | 285 | N/A | 464 | -min_contig_lgth 34 |
| pina | 5.33 | 479 | EDENA | 2.1.1 | 2008 | 0.24 | 210 | 77.8 | 3329 | 400 | N/A | 3377 | -c 34 |
| pina | 5.33 | 479 | QSRA | 06032008 | 06/2008 | 0.84 | 1553 | 93.1 | 1046 | 94 | 86 | 32473 | -k 33 -o 34 |
| pina | 5.33 | 479 | QSRA* | 06032008 | 06/2008 | 0.91 | 1301 | 99.3 | 1771 | 85 | 85 | 83004 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | SSAKE | 3.2 | 2008 | 0.78 | 1936 | 85.1 | 3613 | 347 | 42 | 18093 | -m 16 |
| gera03 | 5.18 | 376 | VCAKE | 1.0 | 05/2007 | 0.64 | 3114 | 82.6 | 1964 | 157 | 96 | 175451 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | VELVET | 0.6.04 | 03/2008 | 0.32 | 55 | 60.2 | 4296 | 386 | N/A | 311 | -min_contig_lgth 34 |
| gera03 | 5.18 | 376 | EDENA | 2.1.1 | 2008 | 0.16 | 98 | 88.9 | 3285 | 535 | 41 | 1977 | -c 34 |
| gera03 | 5.18 | 376 | QSRA | 06032008 | 06/2008 | 0.69 | 733 | 99.1 | 3012 | 71 | 71 | 21132 | -k 33 -o 34 |
| gera03 | 5.18 | 376 | QSRA* | 06032008 | 06/2008 | 0.75 | 641 | 99.0 | 3012 | 569 | 167 | 60584 | -k 33 -o 34 |
| suisp | 2.26 | 49 | SSAKE | 3.2 | 2008 | 2.21 | 5941 | 95.8 | 6475 | 1036 | 355 | 15632 | -m 16 |
| suisp | 2.26 | 49 | VCAKE | 1.0 | 05/2007 | 1.66 | 7202 | 99.0 | 11894 | 1577 | 718 | 487006 | -k 36, -o 37 |
| suisp | 2.26 | 49 | VELVET | 0.6.04 | 03/2008 | 0.74 | 144 | 96.4 | 18690 | 4401 | 1992 | 1185 | -min_contig_lgth 37 |
| suisp | 2.26 | 49 | EDENA | 2.1.1 | 2008 | 0.48 | 357 | 97.3 | 8829 | 1836 | 759 | 3254 | -c 37 |
| suisp | 2.26 | 49 | QSRA | 06032008 | 06/2008 | 1.89 | 3329 | 96.9 | 11934 | 2432 | 259 | 18834 | -k 36 -o 37 |
| suisp | 2.26 | 49 | QSRA* | 06032008 | 06/2008 | 2.18 | 3628 | 98.5 | 11934 | 2370 | 259 | 168464 | -k 36 -o 37 |
  1. Six tests were run for each of three data sets: SSAKE, VCAKE, VELVET, EDENA, QSRA without q-values (listed simply as QSRA in Table 1), and QSRA with q-values (listed as QSRA* in Table 1). In addition to the runtime options listed for VELVET, each VELVET run used a tile size of 19 and a coverage cutoff of 5. Only contigs were used in the calculation of coverage, N50, and N80 values; unextended singletons were discarded. Coverage values were determined through analysis of BLAT output by comparing the total number of bases in the reference genome with the number of bases uniquely "hit" by the BLAT alignments of assembled contigs. Thus, any contig that BLAT, using its default value of 90% identity, could not match to its reference genome did not contribute to the coverage calculations. The N50 and N80 values are the length of the largest contig such that it and all longer contigs together account for 50% and 80%, respectively, of total genome coverage. For S. suis, 43.8% of the 36-mer Illumina reads in the data set matched the reference genome perfectly, which corresponds to an estimated average error rate per sequence base of 2.26%.
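
The calculations described in this note can be sketched in Python as follows. This is an illustrative sketch, not the authors' scripts. The function names (covered_bases, n_stat, per_base_error_rate) are hypothetical, and several points are assumptions: that "uniquely hit" means each reference base is counted once however many alignments overlap it, that the N50/N80 denominator is supplied by the caller (the N/A entries in Table 1 all fall in rows with less than 80% genome coverage, which is consistent with a reference-length denominator, but the note does not say so explicitly), and that sequencing errors are independent and uniform along a read.

```python
from typing import Iterable, Optional, Tuple


def covered_bases(hits: Iterable[Tuple[int, int]]) -> int:
    """Count reference bases covered by at least one alignment.

    `hits` are half-open (start, end) intervals on the reference with
    non-negative coordinates, e.g. taken from the target start/end columns
    of BLAT output. Overlapping hits are counted only once (assumption).
    """
    total = 0
    counted_end = 0  # everything left of this position is already counted
    for start, end in sorted(hits):
        start = max(start, counted_end)  # skip the part already counted
        if end > start:
            total += end - start
            counted_end = end
    return total


def n_stat(contig_lengths: Iterable[int], fraction: float,
           denominator: int) -> Optional[int]:
    """Length of the largest contig such that it and all longer contigs
    together reach `fraction` of `denominator` bases.

    Returns None (shown as N/A in a table) if the contigs never reach
    the target.
    """
    target = fraction * denominator
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return None


def per_base_error_rate(perfect_fraction: float, read_length: int) -> float:
    """Per-base error rate e solving (1 - e) ** read_length == perfect_fraction.

    Assumes errors are independent and equally likely at every position.
    """
    return 1.0 - perfect_fraction ** (1.0 / read_length)


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not data from Table 1.
    hits = [(0, 100), (50, 150), (200, 250)]
    print("covered bases:", covered_bases(hits))
    lengths = [400, 300, 200, 100]
    print("N50:", n_stat(lengths, 0.5, 1000))
    print("N80:", n_stat(lengths, 0.8, 1000))
    # Figures quoted in the note for S. suis: 43.8% perfect 36-mers.
    print("error rate (%):", 100 * per_base_error_rate(0.438, 36))
```

Under the independence assumption, 43.8% perfect 36-mer reads gives a per-base error rate of about 2.27%, close to the 2.26% quoted above. The small difference could come from rounding of the 43.8% figure or from a different estimation method, which is not described here. Percent genome covered would then be 100 * covered_bases(hits) / reference_length.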