Fig. 1
From: Identifying and mitigating batch effects in whole genome sequencing data

A batch effect was apparent in a PCA of relevant quality metrics calculated from the gVCF (a). The standard GWAS PCA, performed using 250,000 common SNPs, did not reveal this batch effect (b). Quality metrics in the PCA in (a) include the percent of variants confirmed in 1000 Genomes (phase 1, high-confidence SNPs) [26], mean genotype quality, median read depth, transition/transversion ratio in non-coding regions, transition/transversion ratio in coding regions, and percent heterozygotes. Group 1 refers to samples sequenced in 2010–2012 and Group 2 to samples sequenced in 2013 and 2014.
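
As a rough illustration of the kind of analysis summarized in panel (a), the sketch below runs a PCA on a samples-by-metrics matrix of per-sample quality metrics. It is not the authors' pipeline. The function name `pca_scores`, the decision to standardize each metric, and the data are all assumptions: the values are synthetic and generated only so the example runs, with a hypothetical shift added to the second group to mimic a batch effect.

```python
import numpy as np

def pca_scores(metrics, n_components=2):
    """Project samples onto the top principal components of standardized metrics.

    `metrics` is a (samples x metrics) array. Hypothetical helper, not from the article.
    """
    X = np.asarray(metrics, dtype=float)
    # Standardize each metric: read depth, Ti/Tv ratios, and percentages
    # sit on very different scales (assumed preprocessing step).
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized matrix; rows of Vt are the principal axes.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]

# Synthetic stand-in for six per-sample quality metrics, in the caption's order:
# pct confirmed in 1000 Genomes, mean genotype quality, median read depth,
# Ti/Tv (non-coding), Ti/Tv (coding), pct heterozygotes.
rng = np.random.default_rng(0)
n_per_group = 50
group1 = rng.normal(loc=0.0, scale=1.0, size=(n_per_group, 6))
group2 = rng.normal(loc=0.0, scale=1.0, size=(n_per_group, 6))
group2[:, :3] += 3.0  # hypothetical batch shift in three of the metrics

metrics = np.vstack([group1, group2])
labels = np.array([1] * n_per_group + [2] * n_per_group)

scores, explained = pca_scores(metrics)

# Compare the two sequencing groups along the first principal component.
pc1_gap = abs(scores[labels == 1, 0].mean() - scores[labels == 2, 0].mean())
print("Variance explained by PC1, PC2:", explained)
print("Gap between group means on PC1:", pc1_gap)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by `labels`, would give a scatter comparable in form to panel (a). With real data, the metrics would be computed per sample from the gVCF rather than simulated.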