Table 2 The PGen workflow consists of several individual tasks with diverse core and memory requirements. These requirements were assigned after testing, based on each tool's support for multiple threads and its memory cost

From: PGen: large-scale genomic variations analysis workflow and browser in SoyKB

| Tasks | Base code | Cores (Threads) | Memory (GB) |
|---|---|---|---|
| Indexing of reference genome | BWA/samtools/picard tools | 1 | 4 |
| Alignment to reference genome | BWA | 1 | 21 |
| Sorting sam files | Picard tools | 1 | 21 |
| Removal of PCR duplicates | Picard tools | 1 | 21 |
| Add or replace read groups | Picard tools | 1 | 21 |
| Create realign target | GATK_RealignerTargetCreator | 15 | 20 |
| Realign indels | GATK_IndelRealigner | 1 | 10 |
| Calling variants | GATK_HaplotypeCaller | 1 | 3 |
| Select SNPs and indels | GATK_SelectVariants | 14 | 10 |
| Filtering variants | GATK_VariantFiltration | 14 | 10 |
| Create genotype GVCF | GATK_GenotypeGVCFs | 1 | 10 |
| Merge GVCFs | GATK_CombineGVCFs | 1 | 20 |
| Combine variants | GATK_CombineVariants | 1 | 10 |
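
The per-task requirements in Table 2 can be captured as a small lookup structure, which may be convenient when generating resource requests for a scheduler. The following is a minimal sketch, not the PGen implementation: the dictionary keys, the `Resources` type, and the two helper functions are hypothetical names introduced for illustration, while the core and memory values are copied from the table.

```python
from typing import Dict, List, NamedTuple, Tuple


class Resources(NamedTuple):
    """Resource request for one workflow task (values taken from Table 2)."""
    tool: str
    cores: int
    memory_gb: int


# Keys are hypothetical identifiers; tools, cores and memory follow Table 2.
PGEN_TASKS: Dict[str, Resources] = {
    "index_reference": Resources("BWA/samtools/picard tools", 1, 4),
    "align": Resources("BWA", 1, 21),
    "sort_sam": Resources("Picard tools", 1, 21),
    "remove_duplicates": Resources("Picard tools", 1, 21),
    "add_read_groups": Resources("Picard tools", 1, 21),
    "realign_target": Resources("GATK_RealignerTargetCreator", 15, 20),
    "realign_indels": Resources("GATK_IndelRealigner", 1, 10),
    "call_variants": Resources("GATK_HaplotypeCaller", 1, 3),
    "select_variants": Resources("GATK_SelectVariants", 14, 10),
    "filter_variants": Resources("GATK_VariantFiltration", 14, 10),
    "genotype_gvcfs": Resources("GATK_GenotypeGVCFs", 1, 10),
    "merge_gvcfs": Resources("GATK_CombineGVCFs", 1, 20),
    "combine_variants": Resources("GATK_CombineVariants", 1, 10),
}


def multithreaded_tasks(tasks: Dict[str, Resources]) -> List[str]:
    """Return the names of tasks that request more than one core, sorted."""
    return sorted(name for name, res in tasks.items() if res.cores > 1)


def peak_requirements(tasks: Dict[str, Resources]) -> Tuple[int, int]:
    """Return the largest core count and largest memory (GB) of any single task."""
    max_cores = max(res.cores for res in tasks.values())
    max_memory = max(res.memory_gb for res in tasks.values())
    return max_cores, max_memory


if __name__ == "__main__":
    print("Multithreaded tasks:", multithreaded_tasks(PGEN_TASKS))
    print("Peak (cores, memory GB):", peak_requirements(PGEN_TASKS))
```

Under these assumptions, a compute node would need to offer at least the peak values returned by `peak_requirements` to run every task in the table, and only the three tasks returned by `multithreaded_tasks` request more than one core.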