
Table 4 Splitting samples by batch (“batched”) retains more high-quality variants

From: Effective filtering strategies to improve data quality from population-based whole exome sequencing studies

| | HWE* | Call rate | Ave. GQ | VQSR¥ | Total |
|---|---|---|---|---|---|
| Number of variants filtered from “unbatched” dataset | 14,209 | 197,540 | 0 | 26,967 | 238,716 |
| Number of filtered variants found in “batched” | 1,983 | 135,031 | N/A | 2,050 | 139,064 |
| Ti/Tv of filtered variants found in “batched” | 2.64 | 2.20 | N/A | 2.16 | 2.20 |

  1. *Used --hwe in vcftools to remove variants with a Bonferroni-corrected p-value < 0.05.
  2. Used --geno in vcftools to remove variants with call rates < 88%.
  3. Used an awk command to remove variants with an average GQ < 35.
  4. ¥Filtered VQSR-processed variants at the 99% sensitivity tranche.
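
The average-GQ filter in footnote 3 can be illustrated with a short sketch. The authors used an awk command that is not reproduced here, so the Python below is only one possible reading of that step, not their implementation. It assumes a standard tab-delimited VCF with a per-sample GQ entry in the FORMAT column, assumes that samples with a missing GQ (".") are left out of the average, and assumes that variants with no GQ values at all are dropped. The function names `mean_gq` and `filter_by_mean_gq` are hypothetical.

```python
from typing import Iterable, Iterator, Optional

GQ_THRESHOLD = 35.0  # footnote 3: variants with average GQ < 35 are removed


def mean_gq(vcf_line: str) -> Optional[float]:
    """Return the mean GQ across samples for one VCF data line.

    Returns None when the line has no sample columns or no usable GQ values.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:  # 8 fixed columns + FORMAT + at least one sample
        return None
    format_keys = fields[8].split(":")
    if "GQ" not in format_keys:
        return None
    gq_index = format_keys.index("GQ")
    values = []
    for sample in fields[9:]:
        parts = sample.split(":")
        # Trailing FORMAT entries may be omitted; "." marks a missing value.
        # Assumption: missing GQ values are skipped rather than counted as 0.
        if gq_index < len(parts) and parts[gq_index] not in (".", ""):
            values.append(float(parts[gq_index]))
    return sum(values) / len(values) if values else None


def filter_by_mean_gq(
    lines: Iterable[str], threshold: float = GQ_THRESHOLD
) -> Iterator[str]:
    """Yield header lines unchanged and data lines with mean GQ >= threshold."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        gq = mean_gq(line)
        # Assumption: variants without any GQ value are dropped.
        if gq is not None and gq >= threshold:
            yield line
```

Applied to an open VCF file handle, `filter_by_mean_gq(handle)` would stream the retained lines, which can then be written to a new file. Whether missing genotypes should count toward the average is a design choice the footnote does not specify.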