Table 1 Summary of assemblies on the simulated metagenomic sim-113sp dataset

From: InteMAP: Integrated metagenomic assembly pipeline for NGS short reads

 

| Assembler | Total cover length (Mbp) | Corr. N-len at 10 Mbp (bp) | E-size (bp) | Num. of covered genes | Total errors | Kbp/errors | Identity (%) |
|---|---|---|---|---|---|---|---|
| ABySS, k = 31 | 163.8 | 185,122 | 11,466 | 42,376 | 11,654 | 14.1 | 99.8 |
| ABySS, k = 61 | 85.5 | 222,581 | 15,395 | 33,997 | 6,719 | 12.7 | 99.9 |
| Bambus2 | 232.5 | 90,788 | 6,531 | 40,139 | 259,320 | 0.9 | 99.5 |
| CABOG | 244.8 | 139,195 | 10,142 | 47,968 | 2,482 | 98.6 | 99.8 |
| IDBA-UD | 227.9 | 222,631 | 14,651 | 67,713 | 5,416 | 42.1 | 99.7 |
| MetaVelvet, k = 23 | 182.8 | 5,437 | 689 | 23,971 | 3,271 | 55.9 | 99.8 |
| MetaVelvet, k = 61 | 76.3 | 121,245 | 8,628 | 26,747 | 251 | 304.1 | 99.9 |
| Omega | 75.8 | 90,383 | 7,751 | 25,837 | 78,078 | 0.9 | 99.8 |
| Ray | 90.2 | 35,059 | 2,365 | 22,958 | 45 | 2005.5 | 99.8 |
| SOAPdenovo, k = 23 | 203.0 | 2,116 | 345 | 14,253 | 1,717 | 118.3 | 99.8 |
| SOAPdenovo, k = 61 | 75.2 | 89,811 | 6,078 | 24,081 | 1,921 | 39.1 | 99.9 |
| SPAdes | 175.3 | 46,658 | 4,470 | 34,954 | 30,942 | 5.66 | 99.8 |
| InteMAP | 266.8 | 244,190 | 17,652 | 70,859 | 5,072 | 52.6 | 99.8 |

  1. Only contigs with length ≥200 bp are considered. “k = 23”, “k = 31” and “k = 61” in the first column indicate that the assembler was run with a k-mer size of 23 bp, 31 bp and 61 bp, respectively. Bambus2 uses unitigs from CABOG. Total cover length denotes the total length of reference sequence covered by contigs. Corr. N-len denotes the N-len size computed on correct contigs. E-size is also computed using correct contigs. Only completely covered genes are counted. Errors denote structural errors in contigs. The error rate is measured as the average distance between errors (Kbp per error). Identity denotes the average identity of the alignments between contigs and references, where unmapped segments of contigs are not considered. Values in bold indicate the best in the column.
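
The contiguity and error-rate columns can be illustrated with a minimal Python sketch. It is not the evaluation code behind the table, and the function names are hypothetical. It assumes the commonly used definitions: N-len at a threshold is the length of the contig at which the running total of contig lengths, longest first, reaches that threshold; E-size is the sum of squared contig lengths divided by a normalising length (here the total contig length unless another value is passed); and Kbp/errors is the covered length divided by the number of structural errors. The footnote does not state which length is used as the numerator of the error rate, so the last assumption in particular should be checked against the paper.

```python
from typing import Iterable, Optional


def n_len(contig_lengths: Iterable[int], threshold_bp: int) -> int:
    """Length of the contig at which the running total (longest first)
    reaches threshold_bp. Assumed definition, see the text above."""
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= threshold_bp:
            return length
    return 0  # the contigs never add up to threshold_bp


def e_size(contig_lengths: Iterable[int], total_bp: Optional[int] = None) -> float:
    """Sum of squared contig lengths over a normalising length.
    The normaliser defaults to the total contig length (an assumption)."""
    lengths = list(contig_lengths)
    denom = total_bp if total_bp is not None else sum(lengths)
    if denom == 0:
        return 0.0
    return sum(length * length for length in lengths) / denom


def kbp_per_error(covered_mbp: float, errors: int) -> float:
    """Average distance between errors in Kbp, assuming the covered
    length (in Mbp) is the numerator."""
    return covered_mbp * 1000.0 / errors
```

Under these assumptions, `kbp_per_error(266.8, 5072)` gives about 52.6, which matches the InteMAP row. To mirror the "Corr." columns, the input lengths would be those of correct contigs, that is, contigs split at structural errors.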