
Table 1 Statistics of the datasets used in the experiments for k=25

From: Disk-based k-mer counting on a PC

| | D. ananassae | C. elegans | Z. mays | H. sapiens NA19238 | H. sapiens HG02057 |
|---|---|---|---|---|---|
| FASTQ file size [GB] | 8.7 | 16.4 | 45.9 | 353 | 208 |
| Total gzipped size [GB] | 1.8 | 4.6 | 16.3 | 116.6 | 65.9 |
| No. of gzipped files | 6 | 2 | 108 | 463 | 6 |
| No. of reads [×10^6] | 35 | 68 | 62 | 2,662 | 860 |
| Read lengths | 75 | 100 | 25-2043 | 36 (most)-75 | 100 |
| Statistics of k-mers | | | | | |
| No. of singletons | 43 | 347 | 1,010 | 11,823 | 3,367 |
| No. of distinct | 63 | 459 | 1,916 | 14,599 | 6,023 |
| No. of distinct non-singletons | 20 | 112 | 906 | 2,776 | 2,657 |
| Total no. | 1,803 | 5,127 | 20,214 | 44,687 | 65,325 |
  1. Totals in millions.
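
The k-mer rows of Table 1 are related by simple arithmetic: each distinct k-mer is either a singleton or a non-singleton, so the two counts should sum to the number of distinct k-mers, up to rounding to whole millions. The sketch below is illustrative and not part of the original paper. It copies the table values, checks that relation, and derives the share of distinct k-mers that are singletons. The names `KMER_STATS`, `singleton_share`, and `rows_consistent` are hypothetical helpers introduced here.

```python
# k-mer statistics (in millions) copied from Table 1, k = 25.
# Tuple order: (singletons, distinct, distinct non-singletons, total).
KMER_STATS = {
    "D. ananassae": (43, 63, 20, 1_803),
    "C. elegans": (347, 459, 112, 5_127),
    "Z. mays": (1_010, 1_916, 906, 20_214),
    "H. sapiens NA19238": (11_823, 14_599, 2_776, 44_687),
    "H. sapiens HG02057": (3_367, 6_023, 2_657, 65_325),
}


def singleton_share(singletons, distinct):
    """Fraction of distinct k-mers that occur exactly once."""
    return singletons / distinct


def rows_consistent(stats, tolerance=1):
    """Check singletons + non-singletons == distinct for every dataset.

    The tolerance (in millions) is an assumption that allows for the
    rounding of the published values.
    """
    return all(
        abs(singletons + non_singletons - distinct) <= tolerance
        for singletons, distinct, non_singletons, _total in stats.values()
    )


if __name__ == "__main__":
    print("rows consistent within rounding:", rows_consistent(KMER_STATS))
    for name, (singletons, distinct, _non_singletons, _total) in KMER_STATS.items():
        share = 100 * singleton_share(singletons, distinct)
        print(f"{name}: {share:.1f}% of distinct k-mers are singletons")
```

With these values, singletons make up the majority of distinct k-mers in every dataset.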