Table 4 Algorithm for haplotype variation factor determination

From: Application of machine learning in SNP discovery

N is the total number of polymorphic positions

For each polymorphic position i = 1 to N

    Let b1(i) and b2(i) be the lists of chromatograms having the major allele b1 and the minor allele b2 at position i, respectively.

    Set Sum(HapVariationFactor) to zero.

    For each polymorphic position j = 1 to N with j ≠ i

        Let b1(j) and b2(j) be the lists of chromatograms having the major allele b1 and the minor allele b2 at position j, respectively.

        Let c(i,j) be the number of elements (chromatograms) common to b2(i) and b2(j), and let t be the number of elements in b2(j).

        Sum(HapVariationFactor) += c(i,j)/t

    End of For loop

    HaplotypeFactor = Sum(HapVariationFactor)/N

End of For loop

  1. The haplotype variation factor is defined as a measure of the co-variance observed in the same chromatogram across different SNP loci. For each SNP locus, the fraction of co-variances (minor alleles observed at different SNP loci on the same chromatogram) relative to the total number of minor alleles observed is calculated first. These fractions are then summed over all positions, and the mean value (the haplotype variation factor) is obtained by dividing by the total number of polymorphisms.
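
The loop in Table 4 can be sketched in Python as follows. This is only an illustrative reading of the pseudocode, not the authors' implementation. The function name `haplotype_variation_factors`, the representation of each b2(i) as a set of chromatogram IDs, and the example IDs are all hypothetical. Skipping positions where b2(j) is empty is an added assumption to avoid division by zero; the table does not address that case.

```python
from typing import List, Set


def haplotype_variation_factors(minor_sets: List[Set[str]]) -> List[float]:
    """Return one HaplotypeFactor per polymorphic position.

    minor_sets[i] plays the role of b2(i): the chromatograms (here,
    hypothetical string IDs) that carry the minor allele at position i.
    """
    n = len(minor_sets)  # N, the total number of polymorphic positions
    factors = []
    for i in range(n):
        total = 0.0  # Sum(HapVariationFactor), reset for each position i
        for j in range(n):
            if j == i:
                continue
            t = len(minor_sets[j])  # number of chromatograms in b2(j)
            if t == 0:
                # Assumption (not in the table): ignore positions that
                # have no minor-allele chromatograms.
                continue
            c = len(minor_sets[i] & minor_sets[j])  # c(i,j)
            total += c / t
        factors.append(total / n)  # divide by N, as in the table
    return factors


# Positions 0 and 1 share their minor allele on the same two chromatograms;
# position 2 has its minor allele on an unrelated chromatogram.
example = [{"c1", "c2"}, {"c1", "c2"}, {"c3"}]
print(haplotype_variation_factors(example))
```

In this toy input, the two positions whose minor alleles co-occur on the same chromatograms receive a factor of 1/3 each, while the isolated position receives 0. Note that the sum runs over N - 1 other positions but is divided by N, following the table as written.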