Table 5 Experiment 4: File size overhead

From: FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

| Dataset | BZIP2 (SA vs HS) | LZ4 (SA vs HS) | ZSTD (SA vs HS) | MFCompress (SA vs HU) | SPRING (SA vs HU) |
| --- | --- | --- | --- | --- | --- |
| 16 GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 4.35\%\) | \(\sim 46.67\%\) |
| 32 GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 9.09\%\) | \(\sim 59.26\%\) |
| 64 GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 7.95\%\) | \(\sim 95.45\%\) |
| 96 GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 7.63\%\) | \(\sim 134.55\%\) |

  1. Type 1 datasets (FASTA files). Space overhead (in percent), computed as specified in the main text, introduced by the compression codecs when encapsulated in the HU and HS Codecs, relative to their stand-alone (SA) versions (columns), when compressing input files of increasing size (rows).
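
The exact overhead formula is given in the main text and is not reproduced on this page. As a minimal sketch, assuming the overhead is the relative size increase of the encapsulated (HS or HU) output over the stand-alone (SA) output, the percentage could be computed as follows. The function name `overhead_percent` and the sizes in the usage example are hypothetical and are not taken from the paper.

```python
def overhead_percent(sa_size: float, encapsulated_size: float) -> float:
    """Relative space overhead, in percent, of an encapsulated codec's output
    over the stand-alone (SA) codec's output.

    Assumption: overhead = 100 * (encapsulated - SA) / SA. The paper's own
    definition is in its main text and may differ.
    """
    if sa_size <= 0:
        raise ValueError("sa_size must be positive")
    return 100.0 * (encapsulated_size - sa_size) / sa_size


if __name__ == "__main__":
    # Hypothetical sizes (any consistent unit), for illustration only.
    print(f"{overhead_percent(200.0, 210.0):.2f}%")
```

Under this assumed definition, a value of about 0% means the encapsulated codec produces output essentially the same size as its stand-alone version, while a value above 100% means the encapsulated output is more than twice as large.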