Table 4 Filter and assembly statistics for Bignorm with Q0 = 20, Diginorm, and the raw datasets (Part I)

From: An improved filtering algorithm for big read datasets and its application to single-cell assembly

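| Dataset | Algorithm | Reads kept (in %) | Mean phred score | Contigs ≥10 000 | Filter time (in sec) | SPAdes time (in sec) |
|---|---|---|---|---|---|---|
| Aceto | Bignorm | 3.16 | 37.33 | 1 | 906 | 1708 |
| | Diginorm | 3.95 | 27.28 | 1 | 3290 | 4363 |
| | Raw | | 36.52 | 3 | | 47,813 |
| Alphaproteo | Bignorm | 3.13 | 34.65 | 18 | 623 | 420 |
| | Diginorm | 7.81 | 28.73 | 17 | 1629 | 11,844 |
| | Raw | | 33.64 | 17 | | 29,057 |
| Arco | Bignorm | 2.20 | 33.77 | 4 | 429 | 207 |
| | Diginorm | 8.76 | 21.39 | 6 | 1410 | 1385 |
| | Raw | | 32.27 | 6 | | 15,776 |
| Arma | Bignorm | 7.90 | 28.21 | 44 | 240 | 135 |
| | Diginorm | 29.30 | 21.19 | 50 | 588 | 1743 |
| | Raw | | 26.96 | 44 | | 5371 |
| ASZN2 | Bignorm | 5.66 | 37.66 | 118 | 1224 | 1537 |
| | Diginorm | 12.62 | 32.73 | 130 | 5125 | 21,626 |
| | Raw | | 36.85 | 112 | | 47,859 |
| Bacteroides | Bignorm | 2.85 | 37.47 | 6 | 653 | 3217 |
| | Diginorm | 4.94 | 27.64 | 5 | 2124 | 3668 |
| | Raw | | 37.25 | 9 | | 32,409 |
| Caldi | Bignorm | 3.97 | 37.82 | 41 | 842 | 455 |
| | Diginorm | 5.61 | 30.67 | 36 | 1838 | 793 |
| | Raw | | 37.37 | 38 | | 7563 |
| Caulo | Bignorm | 2.40 | 36.95 | 10 | 679 | 712 |
| | Diginorm | 4.70 | 25.16 | 9 | 2584 | 765 |
| | Raw | | 36.01 | 13 | | 18,497 |
| Chloroflexi | Bignorm | 1.40 | 31.91 | 32 | 694 | 134 |
| | Diginorm | 9.70 | 18.91 | 33 | 2304 | 1852 |
| | Raw | | 30.50 | 34 | | 15,108 |
| Crenarch | Bignorm | 1.46 | 33.18 | 19 | 1107 | 790 |
| | Diginorm | 9.72 | 19.80 | 18 | 2931 | 3754 |
| | Raw | | 31.49 | 26 | | 20,590 |
| Cyanobact | Bignorm | 1.65 | 30.45 | 12 | 679 | 450 |
| | Diginorm | 11.30 | 17.58 | 13 | 1487 | 1343 |
| | Raw | | 28.49 | 13 | | 9417 |
| E. coli | Bignorm | 1.91 | 26.14 | 67 | 2279 | 598 |
| | Diginorm | 17.03 | 19.34 | 63 | 9105 | 3995 |
| | Raw | | 24.34 | 64 | | 16,706 |
| SAR324 | Bignorm | 4.34 | 33.05 | 55 | 1222 | 708 |
| | Diginorm | 4.69 | 23.58 | 52 | 3706 | 3085 |
| | Raw | | 32.52 | 51 | | 26,237 |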