Table 1 Statistics of the datasets used in the experiments for k=25

From: Disk-based k-mer counting on a PC

 

| | D. ananassae | C. elegans | Z. mays | H. sapiens (NA19238) | H. sapiens (HG02057) |
|---|---|---|---|---|---|
| FASTQ file size [GB] | 8.7 | 16.4 | 45.9 | 353 | 208 |
| Total gzipped size [GB] | 1.8 | 4.6 | 16.3 | 116.6 | 65.9 |
| No. of gzipped files | 6 | 2 | 108 | 463 | 6 |
| No. of reads [×10^6] | 35 | 68 | 62 | 2,662 | 860 |
| Read lengths | 75 | 100 | 25-2043 | 36 (most)-75 | 100 |
| Statistics of k-mers | | | | | |
| No. of singletons | 43 | 347 | 1,010 | 11,823 | 3,367 |
| No. of distinct | 63 | 459 | 1,916 | 14,599 | 6,023 |
| No. of distinct non-singletons | 20 | 112 | 906 | 2,776 | 2,657 |
| Total no. | 1,803 | 5,127 | 20,214 | 44,687 | 65,325 |

  1. Totals in millions.
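
To make the row labels in the k-mer statistics concrete, the following is a minimal in-memory sketch of how such figures could be computed for a toy input. It is not the disk-based method of the paper: the function name `kmer_statistics` is hypothetical, and the sketch assumes plain forward-strand k-mers with no canonical (reverse-complement) merging and no handling of non-ACGT symbols.

```python
from collections import Counter


def kmer_statistics(reads, k):
    """Count k-mers in a list of reads and summarize them.

    Hypothetical helper for illustration only: everything is kept in
    memory, and k-mers are taken as-is from the forward strand.
    """
    counts = Counter()
    for read in reads:
        # A read of length n contains n - k + 1 k-mers (none if n < k).
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1

    distinct = len(counts)
    singletons = sum(1 for c in counts.values() if c == 1)
    return {
        "singletons": singletons,                         # k-mers seen exactly once
        "distinct": distinct,                             # unique k-mers
        "distinct_non_singletons": distinct - singletons, # unique k-mers seen 2+ times
        "total": sum(counts.values()),                    # all k-mer occurrences
    }


stats = kmer_statistics(["ACGTACGT", "CGTAA"], k=3)
print(stats)
```

By construction, the number of distinct k-mers equals singletons plus distinct non-singletons, which is the relation the corresponding table rows satisfy up to rounding.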