From: ADS-HCSpark: A scalable HaplotypeCaller leveraging adaptive data segmentation to accelerate variant calling on Spark
Dataset  Genome   File format  Coverage depth  File size  Default number of data blocks
D1       NA12878  BAM          14x             67.7 GB    543
D2       NA12878  BAM          28x             128.5 GB   1028
D3       NA18507  BAM          11x             59.3 GB    475
D4       NA18507  BAM          60x             250.15 GB  2002