Table 1 Hapmap trio datasets description

From: Shape-IT: new rapid and accurate algorithm for haplotype inference

| Datasets | Chromosome | #datasets | #SNP | #indiv | Details |
|---|---|---|---|---|---|
| CEU Size | 1 to 5 | 250 | 10 to 160 | 60 | 50 datasets of 10, 20, 40, 80 and 160 adjacent SNPs with MAF above 5% |
| CEU Density | 1 to 5 | 300 | 40 | 60 | 50 datasets with spanned distance between SNP above 0, 0.5, 1, 2, 4 and 8 kb (MAF 5%) |
| CEU MAF | 1 to 5 | 150 | 40 | 60 | 50 datasets with MAF above 1%, 5% and 10% |
| YRI Size | 1 to 5 | 250 | 10 to 160 | 60 | 50 datasets of 10, 20, 40, 80 and 160 adjacent SNPs with MAF above 5% |
| YRI Density | 1 to 5 | 300 | 40 | 60 | 50 datasets with spanned distance between SNP above 0, 0.5, 1, 2, 4 and 8 kb (MAF 5%) |
| YRI MAF | 1 to 5 | 150 | 40 | 60 | 50 datasets with MAF above 1%, 5% and 10% |
| CEU illumina 50 | 12 | 300 | 50 | 60 | 15,000 illumina SNPs grouped by dataset of 50 SNPs |
| CEU illumina 100 | 12 | 150 | 100 | 60 | 15,000 illumina SNPs grouped by dataset of 100 SNPs |
| CEU illumina 200 | 12 | 75 | 200 | 60 | 15,000 illumina SNPs grouped by dataset of 200 SNPs |
| GRIV | 1 | 90 | 50 to 200 | 100 to 300 | 3,500 illumina SNPs grouped by dataset of 50, 100 and 200 SNPs |
Description of the benchmarks derived from the HapMap trio datasets that we used to compare the accuracy and runtimes of the various algorithms in Table 4. For each value of each parameter (size, density, and MAF), 10 samples were chosen on each of chromosomes 1 to 5, i.e. a total of 50 datasets per parameter value.
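
The dataset counts in Table 1 can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the original study: the variable and function names are hypothetical, the parameter values are read off the Details column, and it assumes that the 15,000 Illumina SNPs figure is exact and that SNPs are split into non-overlapping groups.

```python
# Parameter values as listed in the Details column of Table 1.
PARAMETER_VALUES = {
    "Size": [10, 20, 40, 80, 160],    # adjacent SNPs per dataset
    "Density": [0, 0.5, 1, 2, 4, 8],  # lower bound on spanned distance, in kb
    "MAF": [1, 5, 10],                # lower bound on MAF, in percent
}

SAMPLES_PER_CHROMOSOME = 10   # from the table caption
CHROMOSOMES = range(1, 6)     # chromosomes 1 to 5


def datasets_per_benchmark(n_values):
    """Datasets in one benchmark: values x samples x chromosomes."""
    return n_values * SAMPLES_PER_CHROMOSOME * len(CHROMOSOMES)


def illumina_datasets(total_snps, snps_per_dataset):
    """Non-overlapping groups of SNPs (assumed grouping scheme)."""
    return total_snps // snps_per_dataset


counts = {
    name: datasets_per_benchmark(len(values))
    for name, values in PARAMETER_VALUES.items()
}
illumina = {size: illumina_datasets(15_000, size) for size in (50, 100, 200)}

print(counts)
print(illumina)
```

Under these assumptions the computed counts agree with the #datasets column for the Size, Density, MAF and CEU illumina rows. The GRIV row is left out because the table does not give enough detail to reproduce its count of 90.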