Table 2 Descriptions of 8 FASTQ datasets used for performance evaluation

From: GTZ: a fast compression and cloud transmission tool optimized for FASTQ files

Dataset           Species               Reference genome size  Encoding       No. of quality scores in data file
ERR233152         P. aeruginosa         556                    Sanger         32
SRR935126         A. thaliana           9755                   Sanger         39
SRR489793         C. elegans            12,807                 Illumina 1.8+  38
SRR801793         L. pneumophila        2756                   Sanger         38
SRR125858         H. sapiens            50,744                 Sanger         39
SRR5419422        RNA seq (H. sapiens)  15,095                 Illumina 1.8+  6
ERR1137269        metagenomes           56,543                 Illumina 1.8+  7
NA12878 (read 2)  H. sapiens            202,631                Sanger         38
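
The last column can be read as the number of distinct quality values that occur in each file. The following is a minimal sketch of how such a count might be obtained, not the procedure used for the table. It assumes four-line FASTQ records (no wrapped sequence lines) and the Phred+33 offset shared by the Sanger and Illumina 1.8+ encodings. The function name and the two-read example are hypothetical.

```python
import io


def distinct_quality_scores(handle):
    """Collect the distinct quality characters in a FASTQ stream.

    Assumes every record spans exactly four lines (hypothetical helper).
    """
    seen = set()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # the fourth line of each record is the quality string
            seen.update(line.rstrip("\r\n"))
    return seen


# Made-up two-read example, not taken from any dataset in the table.
example = io.StringIO("@read1\nACGT\n+\nII#5\n@read2\nGGCA\n+\n5I5I\n")
scores = distinct_quality_scores(example)

# Sanger and Illumina 1.8+ both store a quality value as chr(value + 33).
phred_values = sorted(ord(c) - 33 for c in scores)
print(len(scores), phred_values)
```

For a real file, the in-memory example would be replaced by an open text handle, for instance one returned by `gzip.open(path, "rt")` for a compressed FASTQ file.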