
Table 2 Descriptions of 8 FASTQ datasets used for performance evaluation

From: GTZ: a fast compression and cloud transmission tool optimized for FASTQ files

| Dataset | Species | Reference genome size | Encoding | No. of quality scores in data file |
| --- | --- | --- | --- | --- |
| ERR233152 | P. aeruginosa | 556 | Sanger | 32 |
| SRR935126 | A. thaliana | 9755 | Sanger | 39 |
| SRR489793 | C. elegans | 12,807 | Illumina 1.8+ | 38 |
| SRR801793 | L. pneumophila | 2756 | Sanger | 38 |
| SRR125858 | H. sapiens | 50,744 | Sanger | 39 |
| SRR5419422 | RNA seq (H. sapiens) | 15,095 | Illumina 1.8+ | 6 |
| ERR1137269 | metagenomes | 56,543 | Illumina 1.8+ | 7 |
| NA12878 (read 2) | H. sapiens | 202,631 | Sanger | 38 |
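
The last column can be read as the number of distinct quality values that occur in each file; that reading is an assumption here, since the table itself does not define the column. Under that assumption, a minimal sketch of how such a count could be obtained is shown below. The function name `distinct_quality_scores` is hypothetical, the input is assumed to be an unwrapped FASTQ stream with exactly four lines per record, and the default offset of 33 reflects the fact that both Sanger and Illumina 1.8+ encodings store Phred scores as ASCII value minus 33.

```python
import io


def distinct_quality_scores(handle, offset=33):
    """Return the sorted distinct Phred scores in a FASTQ stream.

    Assumes four lines per record (no wrapped sequences) and a
    Phred+33 encoding unless another offset is given.
    """
    seen = set()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # fourth line of each record is the quality string
            seen.update(line.rstrip("\r\n"))
    return sorted(ord(ch) - offset for ch in seen)


# Tiny made-up input, not taken from any dataset in the table.
sample = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACGT\n+\n#I5I\n"
)
scores = distinct_quality_scores(sample)
print(len(scores), scores)
```

For a real file, the same function could be called on an open text handle, for example one returned by `gzip.open(path, "rt")` for a compressed FASTQ file. A small count, such as the 6 or 7 reported for the Illumina 1.8+ datasets, would be consistent with binned quality values, though the table does not state this.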