Table 2 Comparison of compression ratios of six software suites

From: MetaCRAM: an integrated pipeline for metagenomic taxonomy identification and compression

| Data | Original (MB) | MCH1 (MB) | MCH2 (MB) | MCG (MB) | MCEG (MB) | Align. % | Qual value (MB) | bzip2 (MB) | gzip (MB) | MFComp (MB) |
|---|---|---|---|---|---|---|---|---|---|---|
| ERR321482 | 1429 | 191 | 186 | 312 | 213 | 29.6 | 411 | 362 | 408 | 229 |
| SRR359032 | 3981 | 319 | 282 | 657 | 458 | 61.8 | 2183 | 998 | 1133 | 263 |
| ERR532393 | 8230 | 948 | 898 | 1503 | 1145 | 46.8 | 3410 | 2083 | 2366 | 1126 |
| SRR1450398 | 5399 | 703 | 697 | 854 | 729 | 7.7 | 365 | 1345 | 1532 | 726 |
| SRR062462 | 6478 | 137 | 135 | 188 | 144 | 2.7 | 153 | 222 | 356 | 161 |
  1. As shorthand, we use “MCH” = MetaCRAM-Huffman, “MCG” = MetaCRAM-Golomb, “MCEG” = MetaCRAM-extended Golomb, and “MFComp” = MFCompress. MCH1 is the default option of MetaCRAM with Huffman encoding, and MCH2 is a version of MetaCRAM in which we removed the redundancy in both the quality scores and the read IDs. “Align. %” refers to the total alignment rate from the first and second iterations. The minimum compressed file size achieved by the methods is written in bold.
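
The table reports file sizes in MB, not ratios. The following minimal sketch shows one way to derive compression ratios from those sizes. It assumes that the ratio is defined as original size divided by compressed size, which the table itself does not state. The names `SIZES_MB`, `compression_ratios`, and `smallest_method` are hypothetical and used only for illustration. The "Align. %" and "Qual value (MB)" columns are left out on the assumption that they are not competing total file sizes.

```python
# Sizes in MB, copied from Table 2. "Align. %" and "Qual value (MB)" are
# omitted (assumption: they are not competing total compressed sizes).
SIZES_MB = {
    "ERR321482":  {"Original": 1429, "MCH1": 191, "MCH2": 186, "MCG": 312,
                   "MCEG": 213, "bzip2": 362, "gzip": 408, "MFComp": 229},
    "SRR359032":  {"Original": 3981, "MCH1": 319, "MCH2": 282, "MCG": 657,
                   "MCEG": 458, "bzip2": 998, "gzip": 1133, "MFComp": 263},
    "ERR532393":  {"Original": 8230, "MCH1": 948, "MCH2": 898, "MCG": 1503,
                   "MCEG": 1145, "bzip2": 2083, "gzip": 2366, "MFComp": 1126},
    "SRR1450398": {"Original": 5399, "MCH1": 703, "MCH2": 697, "MCG": 854,
                   "MCEG": 729, "bzip2": 1345, "gzip": 1532, "MFComp": 726},
    "SRR062462":  {"Original": 6478, "MCH1": 137, "MCH2": 135, "MCG": 188,
                   "MCEG": 144, "bzip2": 222, "gzip": 356, "MFComp": 161},
}


def compression_ratios(row):
    """Return original/compressed for every method in one table row.

    Assumption: ratio = original size / compressed size.
    """
    original = row["Original"]
    return {method: original / size
            for method, size in row.items() if method != "Original"}


def smallest_method(row):
    """Return the method with the smallest compressed size in one row."""
    methods = {m: s for m, s in row.items() if m != "Original"}
    return min(methods, key=methods.get)


if __name__ == "__main__":
    for dataset, row in SIZES_MB.items():
        ratios = compression_ratios(row)
        best = smallest_method(row)
        print(f"{dataset}: smallest = {best} "
              f"({row[best]} MB, ratio {ratios[best]:.2f})")
```

Under this definition a larger ratio means stronger compression, so the method with the smallest file size in a row also has the largest ratio for that dataset.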