Fig. 1 | BMC Bioinformatics

Fig. 1

From: ADS-HCSpark: A scalable HaplotypeCaller leveraging adaptive data segmentation to accelerate variant calling on Spark

Flow diagram of ADS-HCSpark, showing its execution flow. The BAM file is first uploaded to HDFS. ADS-HCSpark then runs in two parts: data preprocessing, and variant calling based on adaptive data segmentation (ADS-HC). ADS-HC consists of targeted data partitioning, overlapped processing, variant calling, and output merging. The variant calling stage follows the four main steps of GATK HaplotypeCaller: identifying active regions, local reassembly, likelihood calculation, and genotype assignment.
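The partition, overlap, call, and merge stages named in the caption can be illustrated with a toy sketch. This is not the ADS-HCSpark implementation: it uses fixed-size partitions rather than adaptive segmentation, runs without Spark or GATK, and every name in it (`overlapped_partitions`, `call_and_merge`, `toy_caller`, the `size` and `overlap` parameters) is hypothetical. It only shows the general idea that each task reads a padded region but reports results for its own core region, so merged output contains no duplicates.

```python
from typing import Callable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end)
Variant = Tuple[int, str]   # (position, label); toy stand-in for a VCF record


def overlapped_partitions(
    start: int, end: int, size: int, overlap: int
) -> List[Tuple[Interval, Interval]]:
    """Split [start, end) into core intervals of `size`, each padded by `overlap`.

    Returns (core, padded) pairs. A task reads the padded interval but is
    only responsible for reporting results inside its core interval.
    """
    parts = []
    for core_start in range(start, end, size):
        core_end = min(core_start + size, end)
        padded = (max(start, core_start - overlap), min(end, core_end + overlap))
        parts.append(((core_start, core_end), padded))
    return parts


def call_and_merge(
    parts: List[Tuple[Interval, Interval]],
    caller: Callable[[Interval], List[Variant]],
) -> List[Variant]:
    """Run `caller` on each padded interval and keep only calls in the core."""
    merged: List[Variant] = []
    for core, padded in parts:
        for pos, label in caller(padded):
            if core[0] <= pos < core[1]:  # drop calls owned by a neighbour
                merged.append((pos, label))
    return sorted(merged)


def toy_caller(region: Interval) -> List[Variant]:
    """Hypothetical caller: reports fixed sites that fall inside the region."""
    known_sites = [95, 100, 205]  # made-up positions for illustration only
    return [(p, "SNP") for p in known_sites if region[0] <= p < region[1]]


parts = overlapped_partitions(0, 300, size=100, overlap=10)
variants = call_and_merge(parts, toy_caller)
print(variants)
```

In this sketch the sites at positions 95 and 100 are both seen by two neighbouring tasks because of the padding, yet each appears once in the merged list, since only the task whose core interval contains the position reports it.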
