Table 1 Number of target and non-target sequences in simulated SAG data

From: SAG-QC: quality control of single amplified genome information by subtracting non-target sequences based on sequence compositions

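| Target: E. coli/M. magneticum | Contamination: Pseudomonas | Contamination: Delftia | Proportion of contaminant sequence [%] |
| --- | --- | --- | --- |
| 1000 | 0 | 0 | 0 |
| 900 | 75 | 25 | 10 |
| 800 | 150 | 50 | 20 |
| 700 | 225 | 75 | 30 |
| 600 | 300 | 100 | 40 |
| 500 | 375 | 125 | 50 |
| 400 | 450 | 150 | 60 |
| 300 | 525 | 175 | 70 |
| 200 | 600 | 200 | 80 |
| 100 | 675 | 225 | 90 |
| 0 | 750 | 250 | 100 |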

  1. We utilized public bacterial sequences to simulate SAG datasets. We defined Escherichia coli and Magnetospirillum magneticum as target species in this simulation. We mixed their sequences with sequences of Pseudomonas and Delftia to simulate sequences of contaminated samples. The sequences were mixed in several proportions to simulate datasets with different contamination levels.
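
The mixing scheme described in this note can be illustrated with a minimal sketch that reproduces the sequence counts listed in Table 1. It assumes a fixed total of 1000 sequences per dataset and a 3:1 split of contaminant sequences between Pseudomonas and Delftia. Both values are read off the table rather than stated in the text, and the function name `mixing_table` and its parameters are hypothetical, chosen only for illustration.

```python
def mixing_table(total=1000, step=10, pseudomonas_parts=3, delftia_parts=1):
    """Return (target, pseudomonas, delftia, contaminant_pct) rows.

    Assumptions (inferred from Table 1, not stated in the source text):
    each simulated dataset holds `total` sequences, and contaminants are
    split Pseudomonas:Delftia = 3:1.
    """
    rows = []
    parts = pseudomonas_parts + delftia_parts
    for pct in range(0, 101, step):
        # Number of contaminant sequences at this contamination level.
        contaminant = total * pct // 100
        # Integer split of the contaminants between the two genera.
        pseudomonas = contaminant * pseudomonas_parts // parts
        delftia = contaminant - pseudomonas
        rows.append((total - contaminant, pseudomonas, delftia, pct))
    return rows


if __name__ == "__main__":
    for target, pseudomonas, delftia, pct in mixing_table():
        print(f"{target:>6} {pseudomonas:>6} {delftia:>6} {pct:>4}")
```

Each row could then drive a random draw of that many sequences from the corresponding genome set; the sampling itself is not specified here, so it is left out of the sketch.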