Fig. 2 | BMC Bioinformatics

Fig. 2

From: OVarFlow: a resource optimized GATK 4 based Open source Variant calling workFlow


Resource usage benchmarking of GATK applications at different Java garbage collection thread counts. The performance of some GATK applications is strongly influenced by the number of Java garbage collection (GC) threads employed. Each application was executed several times with different Java GC thread counts to identify the counts that result in minimal resource utilization. Here, the Java 8 default parallel garbage collector was used. Resource usage in terms of wall time, system time and resident set size (memory usage) was analyzed (see rows) for the four tools SortSam, MarkDuplicates, HaplotypeCaller and GatherVcfs (see columns) (GATK version 4.1.9). Triplicate measurements were recorded for each of eight GC thread counts (1, 2, 4, 6, 8, 12, 16 and 20), and the resulting mean values are plotted as lines. Lower values are preferable, as they reflect lower resource usage by the respective application. Runtimes should not be compared between applications here: the ordinate scales of the individual plots differ greatly, so that variation within each application is shown as clearly as possible. Furthermore, SortSam, MarkDuplicates and GatherVcfs analyzed an entire dataset, while HaplotypeCaller was limited to the analysis of chromosome 6 (NC_006093.5), thereby reducing the runtime from days to some hours.
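
The measurement scheme described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' benchmarking code. It assumes a Unix-like system with the `gatk` wrapper script on the `PATH`, and it passes the standard JVM flag `-XX:ParallelGCThreads` through the wrapper's `--java-options` argument. The helper names (`build_command`, `measure`, `mean_per_thread_count`, `benchmark`) are hypothetical, and selecting the collector explicitly with `-XX:+UseParallelGC` is an assumption made here to mirror the Java 8 default.

```python
import resource
import subprocess
import time
from collections import defaultdict
from statistics import mean

GC_THREAD_COUNTS = [1, 2, 4, 6, 8, 12, 16, 20]  # thread counts named in the caption
REPLICATES = 3                                   # triplicate measurements


def build_command(tool_args, gc_threads):
    """Return a gatk command line that pins the number of parallel GC threads."""
    # Assumption: the parallel collector is requested explicitly.
    java_opts = f"-XX:+UseParallelGC -XX:ParallelGCThreads={gc_threads}"
    return ["gatk", "--java-options", java_opts, *tool_args]


def measure(command):
    """Run one command and return wall time, system time and peak RSS."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(command, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall_s": wall,
        "system_s": after.ru_stime - before.ru_stime,
        # Caveat: ru_maxrss is a high-water mark over all children waited for
        # so far (KiB on Linux), so run each measurement in a fresh process
        # if exact per-run values are needed.
        "max_rss": after.ru_maxrss,
    }


def mean_per_thread_count(records):
    """Average each metric over the replicates of every GC thread count."""
    grouped = defaultdict(list)
    for gc_threads, metrics in records:
        grouped[gc_threads].append(metrics)
    return {
        gc_threads: {key: mean(m[key] for m in runs) for key in runs[0]}
        for gc_threads, runs in grouped.items()
    }


def benchmark(tool_args):
    """Measure one tool at every GC thread count, REPLICATES times each."""
    records = []
    for gc_threads in GC_THREAD_COUNTS:
        for _ in range(REPLICATES):
            records.append((gc_threads, measure(build_command(tool_args, gc_threads))))
    return mean_per_thread_count(records)
```

A call such as `benchmark(["SortSam", "-I", "input.bam", "-O", "sorted.bam", "-SO", "coordinate"])` (file names are placeholders) would return one dictionary of mean values per GC thread count, corresponding to one column of plots in the figure.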
