From: An improved filtering algorithm for big read datasets and its application to single-cell assembly
Dataset | Reference length (bp) | Raw total length (bp) | Raw % of ref | Diginorm total length (bp) | Diginorm % of ref | Bignorm total length (bp) | Bignorm % of ref |
---|---|---|---|---|---|---|---|
Aceto | 426,710 | 750,316 | 175.80 | 769,090 | 180.20 | 731,850 | 171.50 |
Alphaproteo | 463,456 | 405,020 | 87.40 | 377,293 | 81.40 | 394,979 | 85.20 |
Arco | 231,937 | 408,571 | 176.20 | 419,403 | 180.80 | 380,191 | 163.90 |
Arma | 1,364,272 | 2,123,588 | 155.70 | 2,131,958 | 156.30 | 2,077,037 | 152.20 |
ASZN2 | 3,669,182 | 4,938,079 | 134.60 | 4,930,677 | 134.40 | 4,836,216 | 131.80 |
Bacteroides | 560,676 | 826,566 | 147.40 | 818,799 | 146.00 | 792,384 | 141.30 |
Caldi | 1,961,164 | 2,044,270 | 104.20 | 2,041,841 | 104.10 | 2,037,901 | 103.90 |
Caulo | 423,390 | 601,709 | 142.10 | 616,942 | 145.70 | 590,319 | 139.40 |
Chloroflexi | 863,677 | 1,317,768 | 152.60 | 1,326,848 | 153.60 | 1,186,531 | 137.40 |
Crenarch | 716,004 | 1,009,122 | 140.90 | 1,016,485 | 142.00 | 946,606 | 132.20 |
Cyanobact | 343,353 | 635,368 | 185.00 | 636,876 | 185.50 | 591,367 | 172.20 |
E. coli | 4,639,675 | 4,896,992 | 105.50 | 4,898,422 | 105.60 | 4,948,739 | 106.70 |
SAR324 | 4,255,983 | 4,676,938 | 109.90 | 4,674,540 | 109.80 | 4,669,774 | 109.70 |
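
The "% of ref" columns appear to be the assembly's total length divided by the reference length, expressed as a percentage and rounded to one decimal place. The minimal sketch below checks that reading against the E. coli row; the function name `percent_of_reference` and the variable names are hypothetical, chosen only for illustration, and the rounding rule is an assumption inferred from the table values.

```python
def percent_of_reference(total_length: int, ref_length: int) -> float:
    """Return total assembly length as a percentage of the reference length.

    Rounded to one decimal place (assumed rounding rule, inferred from the table).
    """
    if ref_length <= 0:
        raise ValueError("ref_length must be positive")
    return round(100.0 * total_length / ref_length, 1)


# Values copied from the E. coli row of the table.
ecoli_ref = 4_639_675
ecoli_totals = {
    "Raw": 4_896_992,
    "Diginorm": 4_898_422,
    "Bignorm": 4_948_739,
}

# Recompute the "% of ref" column for each filtering method.
ecoli_pct = {
    method: percent_of_reference(total, ecoli_ref)
    for method, total in ecoli_totals.items()
}

for method, pct in ecoli_pct.items():
    print(f"{method}: {pct}")
```

A value above 100 means the assembly is longer than the reference, and a value below 100 (as in the Alphaproteo row) means it is shorter.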