
Table 2 Comparison of assemblers on Staphylococcus aureus (SA), Rhodobacter sphaeroides (RS) and human chromosome 14 (HG)

From: Clover: a clustering-oriented de novo assembler for Illumina sequences

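| Data (Mb) | Assembler | Contigs: Num | Contigs: N50 (kb) | Contigs: E-size (kb) | Contigs: Errs | Contigs: N50C (kb) | Contigs: E-sizeC (kb) | Scaffolds: Num | Scaffolds: N50 (kb) | Scaffolds: E-size (kb) | Scaffolds: Errs | Scaffolds: N50C (kb) | Scaffolds: E-sizeC (kb) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| SA (2.9) | Clover | 128 | 43.9 | 53.1 | 13 | 41.3 | 50.5 | 12 | 1490 | 947 | 2 | 1490 | 890 |
| | ABySS | 90 | 129.1 | 181.1 | 16 | 69.8 | 102.5 | 61 | 170 | 199 | 0 | 107 | 127 |
| | Bambus2 | 109 | 50.2 | 69.1 | 178 | 16.7 | 19.5 | 17 | 1084 | 1120 | 0 | 1084 | 1120 |
| | CABOG | Could not run because of incompatible read lengths in one library | | | | | | | | | | | |
| | MSR-CA | 94 | 59.2 | 60.4 | 22 | 49.2 | 51.4 | 17 | 2412 | 2026 | 1 | 1022 | 1039 |
| | SGA | 1252 | 4.0 | 4.7 | 3 | 4.0 | 4.7 | 546 | 208 | 166 | 2 | 208 | 164 |
| | SOAPdenovo | 107 | 288.2 | 252.3 | 58 | 62.7 | 67.5 | 99 | 332 | 302 | 0 | 288 | 227 |
| | SPAdes | 98 | 62.6 | 87.9 | 9 | 57.0 | 75.1 | 41 | 1703 | 1144 | 2 | 684 | 570 |
| | Velvet | 162 | 48.4 | 60.3 | 19 | 41.5 | 49.8 | 45 | 762 | 664 | 18 | 284 | 282 |
| RS (4.6) | Clover | 453 | 20.1 | 23.8 | 19 | 19.5 | 21.9 | 59 | 2483 | 1795 | 1 | 2483 | 1795 |
| | ABySS | 644 | 19.7 | 25.1 | 57 | 13.3 | 18.5 | 414 | 51 | 56 | 0 | 46 | 47 |
| | Bambus2 | 177 | 93.2 | 94.5 | 360 | 12.8 | 16.3 | 92 | 2439 | 1375 | 1 | 390 | 1106 |
| | CABOG | 322 | 20.2 | 24.1 | 31 | 17.9 | 21.5 | 130 | 66 | 520 | 3 | 65 | 381 |
| | MSR-CA | 395 | 22.1 | 24.2 | 32 | 19.1 | 21.5 | 43 | 2976 | 2039 | 3 | 2976 | 2017 |
| | SGA | 3067 | 2.3 | 3.3 | 4 | 2.3 | 3.3 | 2096 | 51 | 53 | 0 | 51 | 53 |
| | SOAPdenovo | 204 | 131.7 | 157.2 | 401 | 14.6 | 18.7 | 166 | 660 | 688 | 0 | 660 | 559 |
| | SPAdes | 768 | 11.8 | 13.7 | 7 | 11.7 | 13.5 | 352 | 718 | 840 | 0 | 718 | 840 |
| | Velvet | 583 | 15.7 | 18.6 | 24 | 14.5 | 16.9 | 178 | 353 | 380 | 16 | 301 | 352 |
| HG (88.3) | Clover | 24,527 | 3.4 | 5.3 | 718 | 3.2 | 5.0 | 2089 | 839 | 943 | 385 | 409 | 502 |
| | ABySS | 21,222 | 14.7 | 19.0 | 1876 | 10.4 | 13.4 | 19,249 | 18 | 24 | 13 | 13 | 19 |
| | Bambus2 | 13,592 | 5.9 | 23.3 | 8175 | 4.3 | 6.3 | 1792 | 324 | 528 | 240 | 200 | 274 |
| | CABOG | 3361 | 45.3 | 58.8 | 2346 | 23.7 | 30.6 | 479 | 393 | 549 | 39 | 309 | 457 |
| | MSR-CA | 30,103 | 4.9 | 6.8 | 1656 | 4.3 | 5.9 | 1425 | 893 | 1420 | 1430 | 282 | 407 |
| | SGA | 56,939 | 2.7 | 3.8 | 375 | 2.7 | 3.7 | 30,975 | 83 | 113 | 24 | 81 | 111 |
| | SOAPdenovo | 21,818 | 16.7 | 21.9 | 6587 | 7.8 | 10.4 | 13,502 | 454 | 533 | 384 | 227 | 276 |
| | SPAdes | 16,854 | 12.7 | 16.7 | 1519 | 10.4 | 13.6 | 9245 | 173 | 223 | 199 | 129 | 162 |
| | Velvet | 45,564 | 2.3 | 3.3 | 3665 | 2.1 | 3.0 | 3565 | 1190 | 1825 | 8659 | 86 | 124 |
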
Num: the number of sequences produced; N50: the N50 statistic calculated with respect to the genome size; E-size: the most likely size of the sequence containing a random base in the genome; Errs: the number of misjoins and, for contigs, also the number of indels > 5 bases; N50C: the N50 calculated after splitting all sequences at error locations; E-sizeC: the E-size calculated after splitting all sequences at error locations. The best result in each column, for each dataset, is indicated in bold.
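
The statistics in the legend can be illustrated with a minimal Python sketch. It is not the evaluation code used for this table. It assumes the common definitions: N50 relative to the genome size is the length of the sequence at which the cumulative length, taken from longest to shortest, first reaches half the genome size, and E-size is the sum of squared sequence lengths divided by the genome size. The function names `n50`, `e_size` and `split_at_errors`, and the toy lengths, are hypothetical and chosen only for illustration.

```python
def n50(lengths, genome_size):
    """N50 with respect to genome_size (assumed definition, see lead-in)."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if 2 * covered >= genome_size:
            return length
    return 0  # the sequences cover less than half of the genome


def e_size(lengths, genome_size):
    """E-size as sum of squared lengths over genome size (assumed definition)."""
    return sum(length * length for length in lengths) / genome_size


def split_at_errors(length, error_positions):
    """Split one sequence of the given length at each error position.

    Returns the lengths of the resulting pieces; this is the step behind
    the corrected statistics N50C and E-sizeC.
    """
    cuts = [0] + sorted(error_positions) + [length]
    return [end - start for start, end in zip(cuts, cuts[1:]) if end > start]


# Toy example (made-up numbers, not taken from the table).
genome_size = 100
lengths = [50, 30, 20]
raw_n50 = n50(lengths, genome_size)
raw_e_size = e_size(lengths, genome_size)

# Suppose the 50-base sequence has one misjoin at position 20.
corrected = split_at_errors(50, [20]) + [30, 20]
corrected_n50 = n50(corrected, genome_size)
corrected_e_size = e_size(corrected, genome_size)
```

In this toy case a single error reduces both statistics, which mirrors how rows with many errors in the table show N50C and E-sizeC values far below their uncorrected N50 and E-size.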