Skip to main content

Table 1 Recommended job packing approach for best practices pipeline. Assumes node with 16 processors and 64 GB of memory

From: Group-based variant calling leveraging next-generation supercomputing for large-scale whole-genome sequencing studies

Step

Tool

Memory per

Cores per

Commands

  

command (GB)

command

per node

Map

BWA

32

8

2

Bam

Samtools

4

1

16

Merge

Samtools

4

1

16

Sort

Samtools

4

1

16

MarkDuplicates

PicardTools

7

2

8

TargetCreator

GATK

7

2

8

IndelRealigner*

GATK

12

3

5

BaseRecalibrator

GATK

30

8

2

PrintReads*

GATK

30

8

2

HaplotypeCaller

GATK

60

16

1

  1. *Smaller memory allocation and more samples per node may prove more computationally efficient