Fig. 1 | BMC Bioinformatics

Fig. 1

From: SNVstory: inferring genetic ancestry from genome sequencing data

Schematic of the ancestry inference model strategy. The workflow shows each dataset separately with colored boxes and arrows: gnomAD (blue), 1kGP (yellow), and SGDP (red). For the gnomAD synthetic-based matrix, the allele frequencies that gnomAD reports for each variant in each population are used to create a distribution of reference, heterozygous, and homozygous alleles for each population. The distributions are then converted into a matrix of 0s, 1s, and 2s, one value per locus for each sample in each population. For 1kGP and SGDP, the matrix is built directly from the variants in the VCF. For the model architecture, continental model labels are shown in white boxes, and the number of labels in the corresponding subcontinental model is given below in brackets.
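
The gnomAD step described in the caption, turning per-population allele frequencies into a 0/1/2 matrix of synthetic samples, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes Hardy-Weinberg proportions, so that each genotype is the number of alternate alleles drawn from Binomial(2, p); the caption does not state how the genotype distribution is derived. The function name `synthetic_genotype_matrix` and its parameters are hypothetical.

```python
import numpy as np


def synthetic_genotype_matrix(allele_freqs, n_samples, seed=0):
    """Draw synthetic genotypes (0, 1, or 2 alternate alleles) for one population.

    allele_freqs: alternate-allele frequency per locus (values in [0, 1]).
    n_samples:    number of synthetic samples (rows) to generate.
    ASSUMPTION:   Hardy-Weinberg equilibrium, i.e. genotype ~ Binomial(2, p).
    """
    p = np.asarray(allele_freqs, dtype=float)
    if p.ndim != 1:
        raise ValueError("allele_freqs must be one-dimensional")
    if np.any((p < 0.0) | (p > 1.0)):
        raise ValueError("allele frequencies must lie in [0, 1]")

    rng = np.random.default_rng(seed)
    # Rows are samples, columns are loci; p is broadcast across rows.
    return rng.binomial(2, p, size=(n_samples, p.size)).astype(np.int8)


if __name__ == "__main__":
    # Hypothetical frequencies for three loci in a single population.
    matrix = synthetic_genotype_matrix([0.0, 0.5, 1.0], n_samples=5, seed=1)
    print(matrix.shape)  # (5, 3)
```

Repeating the call with each population's frequencies and stacking the results, with one label per population, would give a training matrix in the same samples-by-loci layout that the caption describes for the matrices built directly from the 1kGP and SGDP VCFs.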
