Comparative studies of high-throughput biological graphs
© Reyles and Phillips; licensee BioMed Central Ltd. 2011
Published: 5 August 2011
The exponential growth of biological data has given rise to new and difficult challenges. Because the datasets involved are often large, it is inefficient to draw inferences from each individual characteristic of a given dataset. Bioinformaticists are therefore developing quantitative techniques to analyze and interpret key data properties. Graph algorithms can provide powerful and intuitive insight into such properties. Using this approach, we collect biological data from transcriptomic and protein-protein interaction (PPI) sources. These data can be represented as a correlation matrix in which the rows and columns both index the vertices (genes or proteins) and each entry gives the weight of the edge between the corresponding pair of vertices. We will analyze these graphs and describe their differing structural characteristics.
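To make the matrix-to-graph step concrete, the following is a minimal Python sketch, not the authors' R or MATLAB code. It assumes Pearson correlation as the similarity measure and an arbitrary absolute cutoff of 0.9 for keeping an edge; neither choice is stated in the abstract. The function names (`pearson`, `correlation_matrix`, `threshold_edges`) and the toy expression profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0 here.
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


def correlation_matrix(profiles):
    """Return (names, matrix); rows and columns both index the vertices."""
    names = sorted(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in names] for a in names]
    return names, matrix


def threshold_edges(names, matrix, cutoff):
    """Keep one weighted edge per vertex pair whose |correlation| >= cutoff."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only, no self-loops
            weight = matrix[i][j]
            if abs(weight) >= cutoff:
                edges.append((names[i], names[j], weight))
    return edges


# Hypothetical toy expression profiles (four samples per gene).
toy_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

names, corr = correlation_matrix(toy_profiles)
edges = threshold_edges(names, corr, cutoff=0.9)  # cutoff is an assumed value
for u, v, w in edges:
    print(f"{u} -- {v}  weight={w:+.2f}")
```

Only the upper triangle of the matrix is scanned because a correlation matrix is symmetric, so each undirected edge is reported once.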
Materials and methods
We use a high-throughput method for graph-based exploration of genomic and proteomic data. Experimental datasets are extracted from the public databases BioMart and Gene Expression Omnibus (GEO) [2, 3]. R and MATLAB are used to develop algorithms that compute and compare various structural characteristics of the resulting graphs. In particular, we developed an in-house script that outputs the essential histograms together with unweighted and weighted edges. We are currently developing protocols for comparing graphs derived from transcriptomic and PPI sources.
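The in-house script itself is not described in detail, so the following Python sketch only illustrates the kind of output mentioned above: a degree histogram, unweighted and weighted edges, and a simple side-by-side comparison of two graphs. The choice of degree distribution and edge density as the structural characteristics is an assumption, and the function names and both toy edge lists are hypothetical.

```python
from collections import Counter


def degree_histogram(edges):
    """Map each degree to the number of vertices having that degree.

    Vertices that appear in no edge (degree 0) are not counted.
    """
    degree = Counter()
    for u, v, _weight in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(sorted(Counter(degree.values()).items()))


def density(edges, n_vertices):
    """Fraction of all possible undirected edges that are present."""
    possible = n_vertices * (n_vertices - 1) / 2
    return len(edges) / possible if possible else 0.0


def unweighted(edges):
    """Drop the weights, keeping only the vertex pairs."""
    return [(u, v) for u, v, _weight in edges]


# Hypothetical weighted edges from a thresholded correlation matrix.
toy_expr_edges = [
    ("g1", "g2", 0.95),
    ("g1", "g3", -0.91),
    ("g2", "g3", -0.93),
    ("g3", "g4", 0.97),
]
# Hypothetical PPI edges; interactions carry a uniform weight of 1.0.
toy_ppi_edges = [
    ("p1", "p2", 1.0),
    ("p1", "p3", 1.0),
    ("p1", "p4", 1.0),
]

expr_hist = degree_histogram(toy_expr_edges)
ppi_hist = degree_histogram(toy_ppi_edges)
expr_density = density(toy_expr_edges, n_vertices=4)
ppi_density = density(toy_ppi_edges, n_vertices=4)

print("expression graph:", expr_hist, f"density={expr_density:.2f}")
print("PPI graph:       ", ppi_hist, f"density={ppi_density:.2f}")
print("unweighted expression edges:", unweighted(toy_expr_edges))
```

Keeping the weight as the third element of every edge lets the same functions handle both graph types, with the unweighted view derived on demand.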
We thank Jay Snoddy and Michael Langston for the ideas that led us to pursue this bioinformatics investigation.
- Jenssen TK, Lægreid A, Komorowski J, Hovig E: A literature network of human genes for high-throughput analysis of gene expression. Nature Genetics 2001, 28: 21–28.
- Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A: BioMart – biological queries made easy. BMC Genomics 2009, 10: 22. 10.1186/1471-2164-10-22
- Barrett TD, Wilhite SE, Ledoux P, Rudnev D, Evangelista C, Kim IF, Soboleva A, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Muertter RN, Edgar R: NCBI GEO: archive for high-throughput functional genomic data. Nucleic Acids Res 2009, 37: D5–15. 10.1093/nar/gkn764
- R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.