Table 2 Assembly statistics comparing dataset assemblies for each method

From: Spherical: an iterative workflow for assembling metagenomic datasets

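| Dataset | Method | RAM usage (GB) | Alignment (%) | False bases (%) | Longest contig | Number of contigs |
|---|---|---|---|---|---|---|
| Cecum | Normalised | 119 | 29.5 | 0.01 | 831 | 103,618 |
| Cecum | Base assembly | 1 | 29.5 | 0.01 | 831 | 103,618 |
| Cecum | Metavelvet | 2 | 29.1 | 0.07 | 831 | 103,618 |
| Cecum | Spherical (1) | 2 | 30.9 | 0.04 | 831 | 138,995 |
| Oral | Normalised | 14 | 8.1 | 0.01 | 3337 | 1,825,177 |
| Oral | Base assembly | 25 | 13.0 | 0.02 | 4548 | 1,178,611 |
| Oral | Metavelvet | 15 | 13.0 | 0.07 | 4548 | 1,178,611 |
| Oral | Spherical (1) | 5 | 24.6 | 0.19 | 2380 | 1,053,802 |
| Ground water | Normalised | 361 | 52.8 | 3.86 | 117,274 | 5,721,819 |
| Ground water | Base assembly | 376 | 52.0 | 3.84 | 117,274 | 5,772,465 |
| Ground water | Metavelvet | 376 | 52.0 | 4.04 | 117,274 | 5,772,461 |
| Ground water | Spherical (1) | 377 | 59.7 | 2.89 | 117,274 | 13,312,643 |
| Ground water | Spherical (0.25) | 129 | 51.5 | 3.50 | 104,353 | 7,851,021 |
| Ground water | Spherical (0.033) | 107 | 49.8 | 3.78 | 53,836 | 7,145,998 |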
  1. The first column gives the dataset and the second column identifies the assembly method. For Spherical assemblies, the subsample size used is stated in brackets in the Method column. The final five columns report the computational requirements of each assembly (RAM usage) and statistics on the resulting assembly, e.g. number of contigs and alignment (%).
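As a reading aid, the comparison the table invites can be sketched in a few lines of Python. The snippet below is only an illustration and not part of the published workflow: the `ROWS` list and the `alignment_gain` helper are hypothetical names, and the values are transcribed by hand from the Base assembly and Spherical (1) rows of the table. It computes, per dataset, the difference in Alignment (%) between the two methods.

```python
# Illustrative only: values transcribed from Table 2.
# Each row is (dataset, method, RAM usage, alignment %).
ROWS = [
    ("Cecum", "Base assembly", 1, 29.5),
    ("Cecum", "Spherical (1)", 2, 30.9),
    ("Oral", "Base assembly", 25, 13.0),
    ("Oral", "Spherical (1)", 5, 24.6),
    ("Ground water", "Base assembly", 376, 52.0),
    ("Ground water", "Spherical (1)", 377, 59.7),
]


def alignment_gain(rows, baseline="Base assembly", method="Spherical (1)"):
    """Return {dataset: alignment(method) - alignment(baseline)} in percentage points."""
    # Index alignment values by (dataset, method) for direct lookup.
    by_key = {(dataset, m): align for dataset, m, _ram, align in rows}
    # dict.fromkeys keeps datasets in first-seen order without duplicates.
    datasets = dict.fromkeys(dataset for dataset, *_ in rows)
    return {
        d: round(by_key[(d, method)] - by_key[(d, baseline)], 1)
        for d in datasets
    }


gains = alignment_gain(ROWS)
for dataset, gain in gains.items():
    print(f"{dataset}: {gain:+.1f} percentage points")
```

The differences are in percentage points of aligned reads, not relative improvements; the other columns (false bases, longest contig, number of contigs) would need to be added to `ROWS` for a fuller comparison.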