TY  - JOUR
AU  - Nicolas, Pierre
AU  - Sun, Fengzhu
AU  - Li, Lei M.
PY  - 2006
DA  - 2006/06/15
TI  - A model-based approach to selection of tag SNPs
JO  - BMC Bioinformatics
SP  - 303
VL  - 7
IS  - 1
AB  - Single Nucleotide Polymorphisms (SNPs) are the most common type of polymorphisms found in the human genome. Effective genetic association studies require the identification of sets of tag SNPs that capture as much haplotype information as possible. Tag SNP selection is analogous to the problem of data compression in information theory. According to Shannon's framework, the optimal tag set maximizes the entropy of the tag SNPs subject to constraints on the number of SNPs. This approach requires an appropriate probabilistic model. Compared to simple measures of Linkage Disequilibrium (LD), a good model of haplotype sequences can more accurately account for LD structure. It also provides a machinery for the prediction of tagged SNPs and thereby to assess the performances of tag sets through their ability to predict larger SNP sets.
SN  - 1471-2105
UR  - https://doi.org/10.1186/1471-2105-7-303
DO  - 10.1186/1471-2105-7-303
ID  - Nicolas2006
ER  - 