Table 4 Splitting samples by batch (“batched”) retains more high-quality variants

From: Effective filtering strategies to improve data quality from population-based whole exome sequencing studies

| | HWE* | Call rate | Ave.GQ | VQSR¥ | Total |
|---|---|---|---|---|---|
| Number of variants filtered from “unbatched” dataset | 14,209 | 197,540 | 0 | 26,967 | 238,716 |
| Number of filtered variants found in “batched” | 1,983 | 135,031 | N/A | 2,050 | 139,064 |
| Ti/Tv of filtered variants found in “batched” | 2.64 | 2.20 | N/A | 2.16 | 2.20 |

  1. *Used --hwe in vcftools to remove variants with Bonferroni-corrected p-value < 0.05.
  2. Used --geno in vcftools to remove variants with call rates < 88%.
  3. Used an awk command to remove variants with average GQ < 35.
  4. ¥Filtered VQSR processed variants at the 99% sensitivity tranche.
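
The Ti/Tv row in the table is a ratio of transition to transversion counts among single-nucleotide variants. The source does not say which tool produced these values, so the following is only a minimal Python sketch of how such a ratio can be computed from (REF, ALT) pairs. The function name `ti_tv_ratio` and the choice to skip indels and non-ACGT alleles are assumptions made for illustration.

```python
# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}
BASES = set("ACGT")


def ti_tv_ratio(snvs):
    """Return transitions / transversions for an iterable of (ref, alt) pairs.

    Hypothetical helper: records that are not single-base substitutions
    between A, C, G, T are skipped (an assumption, not the paper's rule).
    """
    ti = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # skip indels, symbolic alleles, and non-variant records
        if (ref, alt) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions; Ti/Tv is undefined")
    return ti / tv
```

For example, `ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion.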
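
Footnote 3 mentions an awk command for the average-GQ filter but does not reproduce it. As a rough illustration only, the same idea can be sketched in Python: read the per-sample GQ values from one VCF data line, average them, and keep the variant if the mean is at least 35. The function names and the handling of missing GQ values (ignored when averaging) are assumptions, not the authors' actual command.

```python
def average_gq(vcf_line):
    """Return the mean GQ across samples for one tab-separated VCF data line.

    Returns None if the FORMAT column has no GQ key or no sample has a GQ value.
    Missing values ('.') are ignored; this is an assumption for illustration.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    fmt_keys = fields[8].split(":")  # column 9 of a VCF line is FORMAT
    if "GQ" not in fmt_keys:
        return None
    gq_idx = fmt_keys.index("GQ")
    values = []
    for sample in fields[9:]:  # sample columns start at column 10
        parts = sample.split(":")
        if gq_idx < len(parts) and parts[gq_idx] not in (".", ""):
            values.append(float(parts[gq_idx]))
    return sum(values) / len(values) if values else None


def passes_gq_filter(vcf_line, min_avg_gq=35.0):
    """Keep a variant only if its average GQ is at least min_avg_gq."""
    avg = average_gq(vcf_line)
    return avg is not None and avg >= min_avg_gq
```

Applied line by line to the non-header rows of a VCF file, `passes_gq_filter` would drop variants with average GQ < 35, matching the threshold stated in the footnote.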