Fig. 2
From: ADS-HCSpark: A scalable HaplotypeCaller leveraging adaptive data segmentation to accelerate variant calling on Spark
Algorithm framework diagram of ADS-HCSpark. The figure shows the entire algorithm framework of ADS-HCSpark. In the preprocessing step, the program scans the input BAM file to obtain the sequence features of each original block. According to the preprocessing result and the rules mentioned above, the data blocks to be split are predicted and segmented. The overlapped blocks are then read in parallel by a customized Hadoop-BAM library, and finally variant calling is executed on them.
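
The segmentation stage described in the caption can be illustrated with a minimal sketch in plain Python. It is not the ADS-HCSpark implementation and uses neither Spark nor Hadoop-BAM. The caption does not state the actual sequence features or splitting rules, so everything specific here is an assumption made for illustration: the `Block` type, the use of a read count as the only feature, the `threshold` rule, splitting into equal parts, and the fixed `overlap` width.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    start: int       # inclusive start offset of the block
    end: int         # exclusive end offset of the block
    read_count: int  # hypothetical feature collected during preprocessing


def should_split(block: Block, threshold: int) -> bool:
    # Hypothetical rule: a block is predicted to be expensive when it
    # holds more reads than the threshold.
    return block.read_count > threshold


def segment(blocks: List[Block], threshold: int, parts: int = 2) -> List[Block]:
    # Split every block predicted to be expensive into `parts` pieces of
    # (nearly) equal size; all other blocks pass through unchanged.
    result = []
    for block in blocks:
        if not should_split(block, threshold):
            result.append(block)
            continue
        size = block.end - block.start
        step = -(-size // parts)  # ceiling division
        for start in range(block.start, block.end, step):
            end = min(start + step, block.end)
            # Assumption: reads are spread evenly across the block.
            share = block.read_count * (end - start) // size
            result.append(Block(start, end, share))
    return result


def add_overlap(blocks: List[Block], overlap: int, total_length: int) -> List[Block]:
    # Extend each block on both sides so that data near a boundary is
    # visible to both neighbouring blocks, clamped to the input range.
    return [
        Block(max(0, b.start - overlap), min(total_length, b.end + overlap), b.read_count)
        for b in blocks
    ]


if __name__ == "__main__":
    original = [Block(0, 100, 10), Block(100, 200, 50)]
    for block in add_overlap(segment(original, threshold=20), overlap=10, total_length=200):
        print(block)
```

In this sketch only the second block exceeds the threshold, so it alone is split. Each resulting block could then be processed independently, which is the property the overlapped parallel read in the figure relies on.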