Proceedings of the Seventh Annual UT-ORNL-KBRIN Bioinformatics Summit 2008

Address: 1 Department of Computer Engineering and Computer Science, University of Louisville, 123 JB Speed Building, Louisville, KY 40292, USA; 2 Department of Preventive Medicine and Center of Genomics and Bioinformatics, University of Tennessee Health Science Center, 66 N. Pauline Street, Suite 633, Memphis, TN 38163, USA; and 3 Child and Family Research Institute, Center for Molecular Medicine and Therapeutics, Department of Medical Genetics, University of British Columbia, Room 2026, 850 West 28th Avenue, Vancouver, BC V5Z 4H4, Canada

The University of Tennessee (UT), Oak Ridge National Laboratory (ORNL), and Kentucky Bioinformatics Network (KBRIN) have extensive research and educational ties in bioinformatics and a long-standing tradition of holding annual joint regional summits that bring together researchers, educators, and students interested in bioinformatics. These summits provide unique opportunities for strengthening collaborative links and integrating multidisciplinary research efforts, and they have resulted in numerous new collaborative projects in bioinformatics research and education. The Seventh Annual UT-ORNL-KBRIN Bioinformatics Summit was held at Lake Barkley State Park in Cadiz, Kentucky on March 28-30, 2008. A total of 174 participants registered for the summit, with 81 from Tennessee institutions and 78 from Kentucky institutions. Eighty-two registrants were faculty, 45 were students, 33 were staff, and 7 were post-docs, with the remainder being invited speakers.
The conference program included three days of presentations. The first day was devoted to the National Center for Biotechnology Information (NCBI) short courses, a discussion of the revolutionary changes in the concept of a gene in light of recent data generated by the ENCODE project and other advances in molecular biology and genome analysis [1], and a presentation on educational opportunities at UT-Knoxville and ORNL. The last day and a half was dedicated to scientific presentations divided into three plenary sessions (Pathways to Prediction, Biomedical Informatics, and Regulatory Analysis), as well as a keynote speech given by Jeremy Smith of ORNL and a bioinformatics education workshop led by A. Malcolm Campbell of Davidson College.

NCBI short courses
Wayne Matten (NCBI) presented two of the NCBI minicourse modules, "Entrez Gene Quick Start" and "Correlating Disease Genes and Phenotypes." Each minicourse was 2.5 hours long, with a 1.5-hour lecture overview followed by a one-hour interactive problem-based session. The first module concentrated on using NCBI's resources to find gene-based information starting from a RefSeq record, including sequence information such as the reference gene, isoforms, single nucleotide polymorphisms and associated phenotypes, homologs, and protein structure. The second minicourse focused on associating a disease gene with a corresponding change in phenotype using a number of NCBI resources [2].

Pathways to prediction
Nitin Baliga (Institute for Systems Biology) started the first plenary session with an exciting presentation titled "A Predictive Model for Transcriptional Control of Physiology in a Free Living Cell." What followed was a discussion of systems biology approaches incorporating transcription, translation, and interactions to derive dynamic temporal relationships that allow organisms to thrive in extreme conditions. The main focus of this session was the molecular mechanisms by which Archaea respond to extreme temperature, radiation, pH levels, and high salinity, and the wealth of systems biology methods and approaches available to investigate this response [3][4][5][6].

Biomedical informatics
Dan Masys (Vanderbilt University) led the second plenary session on Biomedical Informatics. Dr. Masys, along with Paul Harris of Vanderbilt, presented an overview of the Clinical and Translational Science Awards (CTSA) program at NIH and discussed Vanderbilt's CTSA program, the Vanderbilt Institute for Clinical and Translational Research (VICTR). To efficiently process the large influx of patient information and tissue samples into the Vanderbilt DNA Databank, VICTR has implemented a research portal, StarBRITE. StarBRITE offers clinical researchers a number of tools for conducting patient-based research, including regulatory support, patient recruitment support, data management, and clinical systems integration for electronic medical record (EMR) support. The VICTR Informatics Core is involved in the consortium developing the REDCap (Research Electronic Data Capture) data collection project (http://www.iwg-online.org/projects/redcap/index.php) as a portion of the EMR functionality of StarBRITE.
Mikael Benson (Göteborg University, Sweden) was an additional plenary speaker for this session. He discussed detection of markers for personalized medicine specifically targeted for allergies, including locating epigenetic markers using hay fever as a disease model. Potential markers could emerge from systems biology analysis of decomposed transcriptional networks using DNA microarrays [7][8][9]. Genes identified in these networks are further analyzed for polymorphisms, which may eventually be used as diagnostic markers.

Regulatory analysis
The final plenary session, Regulatory Analysis, was led by Ziv Bar-Joseph (Carnegie Mellon University) with his platform presentation "Reconstructing Dynamic Regulatory Networks in Multiple Species." Dr. Bar-Joseph discussed current limitations of common approaches to the analysis of time series gene expression data, including measurements taken on populations of cells instead of single cells and the use of static data to study dynamic time course responses. To better address these limitations and to incorporate the temporal features of such data, he presented an approach known as STEM [10] that aligns time course expression data to pre-determined sets of expression profiles, allowing significant profiles to be compared across time course experiments. To better group together the expression patterns of genes, he also presented a hidden Markov model approach for bifurcating profiles of dynamic regulatory events (DREM) [11], supported by experimental studies in yeast [12] and E. coli [13].
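The idea of aligning time course expression data to pre-determined profiles can be illustrated with a minimal sketch. This is not the STEM implementation and omits its significance testing entirely; it only assigns each gene to the model profile with which it is most strongly correlated. The function names, model profiles, and gene values below are all hypothetical and chosen purely for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def assign_to_profiles(genes, profiles):
    """Map each gene name to the index of its best-correlated model profile."""
    assignment = {}
    for name, series in genes.items():
        scores = [pearson(series, profile) for profile in profiles]
        assignment[name] = max(range(len(profiles)), key=scores.__getitem__)
    return assignment


# Hypothetical pre-determined profiles over four time points.
MODEL_PROFILES = [
    [0, 1, 2, 3],     # steady increase
    [0, -1, -2, -3],  # steady decrease
    [0, 1, 1, 0],     # transient induction
]

# Hypothetical expression changes relative to the first time point.
GENES = {
    "geneA": [0.0, 0.8, 2.1, 2.9],
    "geneB": [0.0, -1.2, -1.9, -3.3],
    "geneC": [0.0, 1.1, 0.9, 0.1],
}

if __name__ == "__main__":
    for gene, profile_index in assign_to_profiles(GENES, MODEL_PROFILES).items():
        print(gene, "-> profile", profile_index)
```

Because every experiment is mapped onto the same fixed set of profiles, the genes assigned to a given profile in one time course can be compared directly with those assigned to it in another.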
David Galas (Institute for Systems Biology and Battelle Memorial) was the second speaker for the Regulatory Analysis session, providing an intriguing closing talk titled "Grappling with Complexity: Key Problems in Computational Biology." Dr. Galas discussed current and future issues in computational biology, particularly in relation to the massive explosion of available genome information and technical breakthroughs such as NextGen sequencing [14]. These issues matter because available biological data are growing exponentially rather than linearly, outpacing the growth in processor speed described by Moore's Law [15]. He discussed how the Institute for Systems Biology is approaching this issue by seeking a thorough understanding of subsets of data in order to aid P4 (Predictive, Personalized, Preventative, and Participatory) Medicine, particularly through its innovative Personalized Genome and Blood Organ Fingerprint projects.
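The gap between data growth and computing growth can be made concrete with a small calculation. The sketch below assumes that both quantities double at fixed intervals; the specific doubling times are illustrative assumptions, not figures from the talk, and the function name is hypothetical.

```python
def growth_gap(years, data_doubling_years, compute_doubling_years):
    """Factor by which data volume outgrows compute capacity after `years`,
    assuming both grow exponentially with the given doubling times."""
    data_growth = 2 ** (years / data_doubling_years)
    compute_growth = 2 ** (years / compute_doubling_years)
    return data_growth / compute_growth


if __name__ == "__main__":
    # Assumed values: data doubling every year, compute doubling every two years.
    for horizon in (2, 5, 10):
        gap = growth_gap(horizon, data_doubling_years=1.0, compute_doubling_years=2.0)
        print(f"after {horizon} years, data per unit of compute grows {gap:.1f}x")
```

Under these assumed rates the amount of data per unit of computing capacity itself doubles every two years, so analysis strategies that rely only on faster hardware fall steadily further behind.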

Keynote speaker
Jeremy Smith, director of the ORNL Center for Molecular Physics, was the keynote speaker for the summit, giving a talk titled "Computer Simulation of Molecular Machines." Dr. Smith discussed numerous examples of how physical principles drive various processes at a molecular level, including protein folding [16] and muscular contraction. He also discussed the computational resources available at ORNL for bio-computing, including the 119-teraflop Cray XT3/XT4 Jaguar supercomputer, which currently ranks as the second most powerful supercomputer in the world and will provide a significant boost to the computing power available to Tennessee bioinformaticians and biology researchers.

Bioinformatics education presentation and workshop
Dr. Cynthia Petersen, the director of the UT/ORNL Graduate School of Genome Science and Technology, provided a brief overview of new educational initiatives in bioinformatics aimed at training computationally enabled bioscientists. She discussed the new SCALE-IT (Scalable computing and leading-edge innovative technologies) program, the availability of TeraGrid portals for biology, and educational opportunities for undergraduate and graduate students.
A. Malcolm Campbell (Davidson College) presented a hands-on guide to teaching bioinformatics at the undergraduate level through his workshop "Functional Genomics Cafeteria Style." His workshop was divided into Part I: Introducing Genomics and Bioinformatics Early; Part II: Meshing Mathematics with Biology; and Part III: A Fun Way to Integrate Biology, Math, CS, and Engineering. In Part I, he introduced the notion of teaching microarrays at the undergraduate level through wet lab simulations [17] and the creation of synthetic microarrays with known expression levels [18]. Part II focused on the use of MAGIC Tool for statistical analysis of microarrays [19,20]. Part III focused on facilitating interdisciplinary research at the undergraduate level by breaking down departmental boundaries and encouraging creativity through the International Genetically Engineered Machine Competition (iGEM).
Additional discussions of the role of bioinformaticians in student education and K-12 educational outreach were held throughout the summit.

Poster session
A total of 44 posters were presented during a 3-hour poster session on Saturday afternoon. These poster abstracts, a number of which are published in this supplement, were grouped in the topics of Bioimaging, Bioinformatics Infrastructure, Bioinformatics of Health and Disease, Comparative Genomics, Databases, Functional Genomics, Gene Regulation, Genomics, Machine Learning and Algorithms, Microarrays, Ontologies and Text Mining, Proteomics, Structure and Function Prediction, and Systems Biology.
Seven of the poster abstracts were selected for inclusion in the Summit program as short 15-minute platform presentations: "Impact of sequence variants on the genetic analysis of expression" (Daniel Ciobanu); "Evaluation of pooled allelotyping versus individual genotyping for genomewide association analysis of complex disease" (Siddharth Pratap); "Using spacings to infer regions of loss-of-heterozygosity from paired genotype array data" (Stanley Pounds); "Integrated bioinformatics platform for differential proteomics" (Xiang Zhang); "Towards ultimate quantification - recent advances in statistical analysis of real-time PCR data with linear models" (Joshua Yuan); "Geometric databases of protein structures" (Di Wu); and "Extracting putative gene networks from microarray data using graph theoretical algorithms" (Sudhir Naswa).

Future plans
The 2009 Bioinformatics Summit will return to Fall Creek Falls State Park, Tennessee in the spring. Potential focus areas include translational informatics, epigenetics, and current technological advances for generating biological data.