Table 2 Assembly statistics comparing dataset assemblies for each method

From: Spherical: an iterative workflow for assembling metagenomic datasets

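| Dataset | Method | RAM usage (Gb) | Alignment (%) | False bases (%) | Longest contig | Number of contigs |
|---|---|---|---|---|---|---|
| Cecum | Normalised | 119 | 29.5 | 0.01 | 831 | 103,618 |
| Cecum | Base assembly | 1 | 29.5 | 0.01 | 831 | 103,618 |
| Cecum | Metavelvet | 2 | 29.1 | 0.07 | 831 | 103,618 |
| Cecum | Spherical (1) | 2 | 30.9 | 0.04 | 831 | 138,995 |
| Oral | Normalised | 14 | 8.1 | 0.01 | 3,337 | 1,825,177 |
| Oral | Base assembly | 25 | 13.0 | 0.02 | 4,548 | 1,178,611 |
| Oral | Metavelvet | 15 | 13.0 | 0.07 | 4,548 | 1,178,611 |
| Oral | Spherical (1) | 5 | 24.6 | 0.19 | 2,380 | 1,053,802 |
| Ground water | Normalised | 361 | 52.8 | 3.86 | 117,274 | 5,721,819 |
| Ground water | Base assembly | 376 | 52.0 | 3.84 | 117,274 | 5,772,465 |
| Ground water | Metavelvet | 376 | 52.0 | 4.04 | 117,274 | 5,772,461 |
| Ground water | Spherical (1) | 377 | 59.7 | 2.89 | 117,274 | 13,312,643 |
| Ground water | Spherical (0.25) | 129 | 51.5 | 3.50 | 104,353 | 7,851,021 |
| Ground water | Spherical (0.033) | 107 | 49.8 | 3.78 | 53,836 | 7,145,998 |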

  1. The first column gives the dataset and the second column gives the assembly method. For Spherical assemblies, the subsample size used is stated in brackets in the Method column. The final five columns report the computational requirements of each assembly (RAM usage) and statistics on the resulting assemblies, e.g. number of contigs and alignment (%).