From: Data structures and compression algorithms for high-throughput sequencing technologies
 | Dataset 1 | Dataset 2 | Dataset 3 |
---|---|---|---|
Reads (× 10⁶) | 6.4 | 1.7 | 31 |
Read length | 19 | 25 | 23-44 |
Coverage | Very sparse | Sparse | Full |
File sizes | | | |
   Raw Sequence | 1,030,333,440 | 353,181,920 | 8,869,613,392 |
   Uniform | 912,352,288 | 252,540,968 | 4,946,059,912 |
      Location | 743,517,128 | 226,557,032 | 4,232,120,216 |
      Mismatches | 168,835,160 | 25,983,936 | 713,939,696 |
   Bowtie | 3,145,664,248 | 902,954,872 | 19,475,952,512 |
Bowtie Extra Fields | | | |
   gzip | 50,382,904 | 106,576,328 | 839,247,848 |
   7zip | 36,306,064 | 93,238,688 | 778,347,264 |