Table 5 Experiment 4: file size overhead

From: FASTA/Q data compressors for MapReduce-Hadoop genomics: space and time savings made easy

| Dataset | BZIP2 (SA vs HS) | LZ4 (SA vs HS) | ZSTD (SA vs HS) | MFCompress (SA vs HU) | SPRING (SA vs HU) |
|---|---|---|---|---|---|
| 16GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 4.35\%\) | \(\sim 46.67\%\) |
| 32GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 9.09\%\) | \(\sim 59.26\%\) |
| 64GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 7.95\%\) | \(\sim 95.45\%\) |
| 96GB | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 0\%\) | \(\sim 7.63\%\) | \(\sim 134.55\%\) |

  1. Type 1 datasets (FASTA files): space overhead (as a percentage), computed as specified in the main text, introduced by the compression codecs encapsulated in the HU and HS Codecs relative to their stand-alone (SA) versions (columns), when compressing input files of increasing size (rows)
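
The exact overhead formula is given in the main text and is not reproduced in this table. As a rough illustration only, the sketch below assumes the common relative-difference definition: the extra space taken by the encapsulated (HS or HU) output, expressed as a percentage of the stand-alone (SA) output. The function name `space_overhead_percent` and the sizes used are hypothetical and are not taken from the paper.

```python
def space_overhead_percent(sa_size: float, wrapped_size: float) -> float:
    """Relative space overhead of an encapsulated codec versus its stand-alone version.

    ASSUMPTION: overhead = 100 * (wrapped - SA) / SA. The paper defines its own
    formula in the main text; this is only a plausible reading of the caption.
    Both sizes must be expressed in the same unit (e.g. bytes).
    """
    if sa_size <= 0:
        raise ValueError("stand-alone size must be positive")
    return 100.0 * (wrapped_size - sa_size) / sa_size


# Hypothetical sizes, chosen only to illustrate the arithmetic.
sa_bytes = 200
wrapped_bytes = 230
print(f"{space_overhead_percent(sa_bytes, wrapped_bytes):.2f}%")
```

Under this assumed definition, a value of \(\sim 0\%\) means the encapsulated codec produces output of essentially the same size as the stand-alone one, while a value above 100% means the encapsulated output is more than twice as large.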