
Table 1 Main characteristics of the original and the generated datasets

From: To denoise or to cluster, that is not the question: optimizing pipelines for COI metabarcoding and metaphylogeography

| Dataset | n. ESVs (*) | n. MOTUs | Single-ESV MOTUs | ESVs/MOTU (*) | Reads/MOTU |
|---|---|---|---|---|---|
| Original | 330,382 | – | – | – | – |
| Du (**) | 60,198 | – | – | – | – |
| Da | 32,798 | – | – | – | – |
| Du_e (***) | 113,133 | – | – | – | – |
| S | 330,382 | 19,012 | 12,257 | 17.378 | 511.194 |
| Du_S | 60,198 | 19,058 | 12,471 | 3.159 | 509.961 |
| S_Du | 75,069 | 19,012 | 12,433 | 3.949 | 511.194 |
| Da_S | 32,798 | 19,167 | 15,565 | 1.711 | 507.060 |
| S_Da | 35,376 | 19,012 | 15,198 | 1.861 | 511.194 |
| Du_d_S | 60,198 | 19,058 | 12,471 | 3.159 | 509.960 |
| Du_c_S | 60,198 | 19,058 | 12,471 | 3.159 | 509.960 |
| Du_e_S | 113,133 | 19,016 | 12,365 | 5.949 | 511.087 |
| Du_e_d_S | 113,133 | 19,016 | 12,365 | 5.949 | 511.087 |
| Du_e_c_S | 113,133 | 19,016 | 12,365 | 5.949 | 511.087 |

  1. All datasets had 9,718,827 reads. Single-ESV MOTUs refers to the number of MOTUs with just one ESV. Codes of the datasets: Du, denoised with the UNOISE3 algorithm (unless otherwise stated, this refers to the original formulation, which gives precedence to abundance ratio); Da, denoised with the DADA2 algorithm; S, clustered with the SWARM algorithm; Du_S, denoised (UNOISE3) and clustered; S_Du, clustered and denoised (UNOISE3); Da_S, denoised (DADA2) and clustered; S_Da, clustered and denoised (DADA2); Du_d_S, denoised (UNOISE3) with precedence to distance and clustered; Du_c_S, denoised (UNOISE3) with combined precedence and clustered; Du_e_S, denoised (UNOISE3) with a correction that takes into account the entropy of the codon positions and clustered; Du_e_d_S, denoised (UNOISE3) with correction plus precedence to distance and clustered; Du_e_c_S, denoised (UNOISE3) with correction plus combined precedence and clustered
  2. *For the Original and S datasets, the number of sequences is used instead of the number of ESVs
  3. **The same values apply to Du_d (distance precedence) and Du_c (combined precedence)
  4. ***The same values apply to Du_e_d (distance precedence) and Du_e_c (combined precedence)
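
The two ratio columns appear to follow directly from the counts: ESVs/MOTU as n. ESVs divided by n. MOTUs, and Reads/MOTU as the shared read total (9,718,827) divided by n. MOTUs. The source does not state these definitions explicitly, so the following is only a minimal Python sketch under that assumption; the function name `ratios` and the selection of rows are illustrative and not part of the original.

```python
# Total reads shared by all datasets (stated in the table footnote).
TOTAL_READS = 9_718_827

# (n. ESVs, n. MOTUs) copied from the table for a few clustered datasets.
counts = {
    "S": (330_382, 19_012),
    "S_Du": (75_069, 19_012),
    "Da_S": (32_798, 19_167),
    "Du_e_S": (113_133, 19_016),
}


def ratios(n_esvs, n_motus, total_reads=TOTAL_READS):
    """Return (ESVs/MOTU, Reads/MOTU) rounded to three decimals.

    Assumed definitions (not spelled out in the source):
    ESVs/MOTU = n_esvs / n_motus and Reads/MOTU = total_reads / n_motus.
    """
    return round(n_esvs / n_motus, 3), round(total_reads / n_motus, 3)


for name, (n_esvs, n_motus) in counts.items():
    esvs_per_motu, reads_per_motu = ratios(n_esvs, n_motus)
    print(f"{name}: ESVs/MOTU = {esvs_per_motu:.3f}, Reads/MOTU = {reads_per_motu:.3f}")
```

Under these assumed definitions, the rows included in the sketch reproduce the tabulated ratios to three decimals. For the S dataset, the numerator is the number of sequences rather than ESVs, as noted in the footnotes.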