Ranking single nucleotide polymorphisms by potential deleterious effects

Lee, Phil Hyoun; Shatkay, Hagit

doi:10.1186/1471-2105-9-S10-O6

Volume 9 Supplement 10

Highlights from the Fourth International Society for Computational Biology (ISCB) Student Council Symposium

Oral presentation
Open access
Published: 30 October 2008

BMC Bioinformatics volume 9, Article number: O6 (2008)

Introduction

Identifying single nucleotide polymorphisms (SNPs) that are responsible for common and complex diseases such as cancer is of major interest in current molecular epidemiology. However, given the tremendous number of SNPs in the human genome, there is a clear need to prioritize SNPs according to their potentially deleterious effects on human health in order to expedite genotyping and analysis. To date, there have been few efforts to quantitatively assess the possible deleterious effects of SNPs for effective genetic variation studies. Here we propose a new integrative scoring system that prioritizes SNPs by their possible deleterious effects within a probabilistic framework.

Methods

We aim to quantitatively measure the potential deleterious effects of SNPs on four bio-molecular functions of their genomic region, namely splicing, transcription, translation, and post-translational modification. Figure 1 outlines the three main steps of our assessment process.

For simplicity, we refer to the assessed score as the functional significance (FS) score of a SNP.

STEP 1. Retrieving Predicted Functional Information

Given a set of SNPs, we first retrieve their predicted functional categories (i.e., 'deleterious' or 'neutral') and the corresponding confidence scores for those decisions (i.e., S ∈ R) using 16 publicly available web services and databases. The confidence scores are then normalized onto a common scale.
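The abstract does not state how the confidence scores are normalized. As a minimal sketch, assuming a simple min-max rescaling onto [0, 1], the step might look like the following. The function name, the per-tool bounds `lo` and `hi`, and the `higher_is_deleterious` flag are hypothetical and are not taken from the original work.

```python
def normalize_scores(raw_scores, lo, hi, higher_is_deleterious=True):
    """Rescale one tool's raw confidence scores onto [0, 1].

    Assumption: min-max scaling with known per-tool bounds (lo, hi).
    The original method may use a different normalization.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Clamp each score to [lo, hi] before rescaling.
    scaled = [(min(max(s, lo), hi) - lo) / (hi - lo) for s in raw_scores]
    if not higher_is_deleterious:
        # Flip tools whose low raw scores indicate a deleterious call.
        scaled = [1.0 - s for s in scaled]
    return scaled
```

The orientation flag is included because different prediction tools may report scores in opposite directions; whether and how the authors handle this is not described in the text.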

STEP 2. Computing Tool Reliability

We define the Tool Reliability (TR) score as the likelihood that a tool correctly predicts deleterious SNPs. We estimate it from the tendency of the tool to make predictions consistent with those of the other tools, following the approach proposed by Long et al. [1].
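To illustrate the idea of estimating reliability from agreement alone, the sketch below scores each tool by its mean pairwise agreement rate with the other tools. This is a deliberately simplified stand-in, not the estimator of Long et al. [1]; the function name and the input layout (one list of calls per tool, aligned by SNP, with `None` for a missing prediction) are assumptions made for the example.

```python
def agreement_reliability(predictions):
    """Return a crude reliability score per tool.

    predictions: dict mapping tool name -> list of calls aligned by SNP,
    where True means 'deleterious', False means 'neutral', and None means
    the tool gave no prediction for that SNP.
    Simplified proxy: mean pairwise agreement rate with every other tool.
    """
    tools = list(predictions)
    if len(tools) < 2:
        raise ValueError("at least two tools are needed to measure agreement")
    reliability = {}
    for tool in tools:
        rates = []
        for other in tools:
            if other == tool:
                continue
            # Compare only SNPs for which both tools made a call.
            pairs = [(a, b)
                     for a, b in zip(predictions[tool], predictions[other])
                     if a is not None and b is not None]
            if pairs:
                rates.append(sum(a == b for a, b in pairs) / len(pairs))
        reliability[tool] = sum(rates) / len(rates) if rates else 0.0
    return reliability
```

A tool that frequently disagrees with the consensus receives a lower score under this proxy, which captures the intuition in the text without requiring labeled training data.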

STEP 3. Computing Functional Significance

Given the prediction results and normalized confidence scores obtained in Step 1 and the TR scores computed in Step 2, the FS score of a SNP is computed as the average of the normalized confidence scores, weighted by the reliability of each tool, as summarized in Figure 1.
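The weighted average described here can be sketched as follows. The exact formula is given in Figure 1, which is not reproduced in this text, so the code assumes the plain form FS = Σ(TR_i · s_i) / Σ(TR_i), taken over the tools that returned a prediction for the SNP. The function and argument names are hypothetical.

```python
def functional_significance(scores, reliability):
    """Reliability-weighted average of normalized confidence scores.

    scores: dict tool -> normalized score in [0, 1] for a single SNP,
            containing only the tools that returned a prediction.
    reliability: dict tool -> TR weight (may contain extra tools).
    Assumption: higher normalized scores mean more likely deleterious.
    """
    total_weight = sum(reliability[tool] for tool in scores)
    if total_weight == 0:
        raise ValueError("no tool with non-zero reliability scored this SNP")
    weighted_sum = sum(reliability[tool] * s for tool, s in scores.items())
    return weighted_sum / total_weight
```

Dividing by the sum of the weights of the contributing tools keeps the result on the same [0, 1] scale even when some tools have no prediction for a given SNP.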

Conclusion

We applied our method to 126,496 SNPs located in 607 disease-susceptible genes obtained from the OMIM database [2] (downloaded January 2008). The assessment results show that splice sites and coding regions are the most enriched with SNPs that have high putative deleterious effects, which is consistent with previous findings about functional SNPs [3]. We further validated our scoring system by verifying that the distribution of FS scores for SNPs known to be disease-causing differs significantly from that of SNPs selected uniformly at random within the same gene (p = 1.0303e-55, paired t-test, α = 0.05).
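For readers unfamiliar with the validation test, the sketch below computes the paired t statistic from two aligned lists of FS scores. It assumes that each disease-causing SNP is paired with one randomly selected SNP from the same gene; the abstract does not spell out the pairing, and the data layout and function name are hypothetical. Only the statistic and degrees of freedom are computed; a p-value would be read from a t distribution, for example with `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(disease_scores, control_scores):
    """Return (t, degrees_of_freedom) for a paired t-test.

    disease_scores[i] and control_scores[i] are assumed to be the FS scores
    of a disease-causing SNP and of a random SNP from the same gene.
    """
    if len(disease_scores) != len(control_scores):
        raise ValueError("paired samples must have equal length")
    diffs = [d - c for d, c in zip(disease_scores, control_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two pairs are required")
    sd = statistics.stdev(diffs)  # sample standard deviation of differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

A large positive t indicates that disease-causing SNPs tend to receive higher FS scores than their within-gene controls.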

References

1. Long PM, Varadan V, Gilman S, Treshock M, Servedio RA: Unsupervised evidence integration. Proceedings of the 22nd International Conference on Machine Learning: 7–11 August 2005; Bonn, Germany 2005, 521–528.
2. Online Mendelian Inheritance in Man, OMIM™ [http://www.ncbi.nlm.nih.gov/omim/]
3. Xu H, Gregory SG, Hauser ER, Stenger JE, Pericak-Vance MA, Vance JM, Zuchner S, Hauser MA: SNPselector: a web tool for selecting SNPs for genetic association studies. Bioinformatics 2005, 21(22):4181–4186. 10.1093/bioinformatics/bti682

Author information

Authors and Affiliations

School of Computing, Queen's University, Kingston, Ontario, Canada, K7L 3N6
Phil Hyoun Lee & Hagit Shatkay

Corresponding author

Correspondence to Phil Hyoun Lee.

Rights and permissions

Open Access. This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Lee, P.H., Shatkay, H. Ranking single nucleotide polymorphisms by potential deleterious effects. BMC Bioinformatics 9 (Suppl 10), O6 (2008). https://doi.org/10.1186/1471-2105-9-S10-O6
