Table 1 Description of the publicly available mock community datasets used for this experiment

From: Evaluation of taxonomic classification and profiling methods for long-read shotgun metagenomic sequencing datasets

| Label | Technology | Mock community | Species | Abundances | Reads used | Median length | Mean length | Total bases (Gb) | Median QV | Release date | Source |
|---|---|---|---|---|---|---|---|---|---|---|---|
| HiFi ATCC MSA-1003 | PacBio HiFi | ATCC MSA-1003 | 20ᵃ | Staggered (14–0.02%) | 2,419,037 | 8,310 | 8,492 | 20.54 | 36 | 6/4/19 | NCBI: SRX6095783 |
| HiFi Zymo D6331 | PacBio HiFi | ZymoBIOMICS D6331 | 17ᵇ | Staggered (18–0.0001%) | 1,978,852 | 8,077 | 9,092 | 17.99 | 40 | 11/25/20 | NCBI: SRX9569057 |
| ONT R10 Zymo D6300 | Oxford Nanopore Technologies | ZymoBIOMICS D6300 | 10ᶜ | Even (12%, 2%) | 275,318ᵈ | 6,664 | 12,022 | 3.31 | 10 | 2/7/20 | https://lomanlab.github.io/mockcommunity/r10.html |
| ONT Q20 Zymo D6300 | Oxford Nanopore Technologies | ZymoBIOMICS D6300 | 10ᶜ | Even (12%, 2%) | 2,000,000ᵈ | 4,160 | 4,805 | 9.61 | N/A | 3/23/21 | ENA: ERR5396170 |
| Illumina ATCC MSA-1003 | Illumina | ATCC MSA-1003 | 20ᵃ | Staggered (14–0.02%) | 10,038,314 | 125 | 125 | 1.25 | 37 | 12/2018 | NCBI: SRX5169925 |
| Illumina Zymo D6300 | Illumina | ZymoBIOMICS D6300 | 10ᶜ | Even (12%, 2%) | 20,000,000ᵉ | 150 | 150 | 2.99 | 37 | 7/2020 | NCBI: SRX8824472 |

ᵃ 20 bacteria
ᵇ 14 bacteria, 1 archaeon, 2 yeasts
ᶜ 8 bacteria (at 12% abundance), 2 yeasts (at 2% abundance)
ᵈ Length-filtered to eliminate reads < 2 kb and > 50 kb from a starting set of 1.16 million reads (ONT R10) and 5.4 million reads (ONT Q20)
ᵉ Subsampled from ~103 million available reads
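
The length columns in Table 1 are related by simple arithmetic: total bases is the sum of the retained read lengths, which equals reads used multiplied by mean length. The sketch below illustrates this, together with the kind of length filter described in footnote d. It is a minimal illustration, not the authors' pipeline: the function names `summarize_read_lengths` and `implied_total_gb` are hypothetical, the toy read lengths are made up, and treating 2 kb and 50 kb as 2,000 and 50,000 bases with inclusive bounds is an assumption.

```python
from statistics import mean, median


def summarize_read_lengths(lengths, min_len=None, max_len=None):
    """Compute the table's length columns from a list of read lengths (bases).

    min_len and max_len are optional inclusive bounds (an assumption here);
    reads shorter than min_len or longer than max_len are dropped.
    """
    kept = [
        n for n in lengths
        if (min_len is None or n >= min_len)
        and (max_len is None or n <= max_len)
    ]
    if not kept:
        raise ValueError("no reads left after length filtering")
    return {
        "reads_used": len(kept),
        "median_length": median(kept),
        "mean_length": mean(kept),
        "total_gb": sum(kept) / 1e9,
    }


def implied_total_gb(reads_used, mean_length):
    """Total bases (Gb) implied by a read count and a mean read length."""
    return reads_used * mean_length / 1e9


# Toy read lengths, invented for illustration (not from any dataset).
toy_lengths = [500, 2_000, 6_000, 10_000, 50_000, 75_000]
stats = summarize_read_lengths(toy_lengths, min_len=2_000, max_len=50_000)

# Values copied from Table 1: (reads used, mean length, total bases in Gb).
TABLE_ROWS = {
    "HiFi ATCC MSA-1003": (2_419_037, 8_492, 20.54),
    "HiFi Zymo D6331": (1_978_852, 9_092, 17.99),
    "ONT R10 Zymo D6300": (275_318, 12_022, 3.31),
    "ONT Q20 Zymo D6300": (2_000_000, 4_805, 9.61),
    "Illumina ATCC MSA-1003": (10_038_314, 125, 1.25),
    "Illumina Zymo D6300": (20_000_000, 150, 2.99),
}

for label, (reads, mean_len, reported_gb) in TABLE_ROWS.items():
    implied = implied_total_gb(reads, mean_len)
    print(f"{label}: implied {implied:.2f} Gb, reported {reported_gb:.2f} Gb")
```

Because the published mean lengths are rounded to whole bases, the implied totals only approximate the reported ones. For the Illumina Zymo D6300 row, for example, 20,000,000 reads at a mean of 150 bases implies 3.00 Gb against the reported 2.99 Gb, which would be consistent with a true mean slightly below 150.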