From: BiSpark: a Spark-based highly scalable aligner for bisulfite sequencing data
Data set | Tailored data size | # of reads | Description |
---|---|---|---|
Simulation data | 122MB | 1,000,000 | Simulation set with 0% error |
Simulation data | 122MB | 1,000,000 | Simulation set with 1% error |
Simulation data | 122MB | 1,000,000 | Simulation set with 2% error |
GEO WGBS data (GSE80911) | 1.6GB | 10,000,000 | 10 million reads real data set |
GEO WGBS data (GSE80911) | 7.9GB | 50,000,000 | 50 million reads real data set |
GEO WGBS data (GSE80911) | 16GB | 100,000,000 | 100 million reads real data set |
GEO WGBS data (GSE80911) | 32GB | 200,000,000 | 200 million reads real data set |
Reference genome | | | Build 37, hg19 |