Fig. 3 | BMC Bioinformatics

Fig. 3

From: Investigating reproducibility and tracking provenance – A genomic workflow case study

Graphical representation of the GATK workflow, showing the artefacts and information that need to be captured as part of workflow execution. The main steps are described in black rectangles, and the tools that carry out those steps are shown in grey ellipses. Input and reference files (brown rounded rectangles) are shown separately and labelled by dataset name. Primary and secondary output files (if any) are shown in dark green and light green snip-diagonal-corner rectangles, respectively. The input and output data flows for each workflow step are shown as red and green dotted arrows, respectively. Connections between processes in the workflow are represented by solid blue arrows. The parts of the workflow highlighted in yellow are pivotal processes that are not explicitly declared in Galaxy and Cpipe. The red flag marks the main input and the final output of the workflow.
