Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context
© Faith et al; licensee BioMed Central Ltd. 2007
Received: 30 April 2007
Accepted: 18 September 2007
Published: 18 September 2007
Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal.
lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales.
lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Genome browsers are the primary tools for the visualization of raw genomic sequence data and annotations. Typically, these software systems are web-based and present an image with "tracks" of information that describe the underlying genome sequence. The tracks include features such as SNPs, ESTs, linkage-disequilibrium, and splice variants. Navigation through these annotations is done by zooming and scrolling along the track and the underlying sequence information.
Initially, most organisms with complete genomes had their own custom-built genome browser software [1–3]. More recently, there has been a push towards feature-rich, species-generic genome browsers that can be reused for new genomes. The result is a small number of high-quality genome browsers that are used across many species [3–7]. All of these browsers use a large set of annotations, which are loaded into a relational database. A collection of scripts then reads the information for the genome region a user wants to view and presents the annotations corresponding to that region.
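The region lookup described above amounts to an interval-overlap query. The following is a minimal sketch of that idea over an in-memory list, not the implementation used by any of the cited browsers; the `Feature` type and `features_in_region` function are hypothetical names introduced for illustration, and 1-based inclusive coordinates are an assumption.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    """One annotation on a chromosome (assumed 1-based, inclusive coordinates)."""
    track: str
    name: str
    start: int
    end: int


def features_in_region(features, start, end):
    """Return the features that overlap the inclusive window [start, end]."""
    return [f for f in features if f.start <= end and f.end >= start]


if __name__ == "__main__":
    # Made-up annotations, for illustration only.
    annotations = [
        Feature("genes", "geneA", 100, 900),
        Feature("genes", "geneB", 1500, 2400),
        Feature("SNPs", "snp1", 850, 850),
        Feature("SNPs", "snp2", 3000, 3000),
    ]
    for feature in features_in_region(annotations, 800, 1600):
        print(feature.track, feature.name, feature.start, feature.end)
```

A database-backed browser would typically push this filter into an indexed query; a linear scan like this one is only reasonable for small annotation sets.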
The large software systems used by genome browsers often require specialized knowledge for installation and maintenance. The requirement of a relational database complicates the genome browsers' applicability in dynamic contexts that change frequently. In addition, running a full-fledged genome browser on a personal computer is not trivial.
Here we present lightweight genome viewer (lwgv), a genomic sequence annotation visualizer that requires only a single text file and an executable to run. This simplicity and independence from a database backend facilitate the dynamic creation of genome views based on user-chosen analyses. lwgv allows "include" files, which provide an object-oriented, plug-and-play architecture for managing tracks and building text files for more complex viewer applications. We have successfully used lwgv to visualize RNAi oligos on their corresponding genes, to present a linkage disequilibrium map of chromosome 19, and to display feature annotations for GeneSeer. We also present a new application of lwgv to dynamically visualize changes in gene expression along a genome using any combination of the over 500 prokaryotic microarrays available in the Many Microbe Microarrays Database (M3D). lwgv is an ideal tool for the presentation of dynamic analyses and sequence annotations without resorting to the creation and maintenance of a large database and software infrastructure.
Results and discussion
lwgv as a traditional genome browser
Although lwgv is not as feature-rich as database-driven genome browsers, it is sufficiently fast to be used in place of a genome browser in many contexts where the number of tracks and features is not too large (e.g. fewer than 10 tracks with fewer than 10,000 features each). For example, lwgv was used to display a linkage disequilibrium map of human chromosome 19 [9, 11]. lwgv is a good replacement for traditional genome browsers when quick setup is needed and the visualization demands require only simple tracks and graphs. Larger software packages, such as the UCSC Genome Browser, should be used when more advanced browsing features are needed (e.g. expandable tracks, the ability to dynamically add/remove tracks, and the ability to navigate large genomes by clicking on regions of a chromosome image).
lwgv as a short sequence viewer
Most software applications for visualizing short DNA sequences are standalone applications that are only available commercially or are devoted to a specific task such as restriction enzyme digestion. lwgv is well suited for visualizing annotations of short stretches of DNA. For example, we use lwgv to show the location of the RNAi knockdown clones from the Hannon-Elledge shRNA libraries [8, 14]. For this task, it is only necessary to show individual genes and the location of each shRNA designed for those genes. With lwgv, this can be done by reading the available shRNAs for a particular gene from the database of all shRNAs and writing the corresponding annotations, such as exon boundaries and shRNA binding sites, to a temporary file to be read by lwgv. This dynamic approach allows one to update the shRNA database without having to sync a second database for a genome viewer.
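The workflow above (look up the shRNAs for one gene, then write the annotations to a temporary file for the viewer) can be sketched as follows. This is an illustration under stated assumptions: the function name `write_gene_view` is hypothetical, and the tab-delimited layout is a placeholder rather than lwgv's actual track syntax, which is documented in the lwgv manual.

```python
import tempfile


def write_gene_view(gene, exons, shrnas, directory=None):
    """Write exon and shRNA positions for one gene to a temporary file.

    The tab-delimited layout (track, name, start, end) is a placeholder,
    NOT lwgv's real track syntax; consult the lwgv manual for that.
    Returns the path of the file that was written.
    """
    with tempfile.NamedTemporaryFile(
        mode="w", prefix=f"{gene}_", suffix=".txt", dir=directory, delete=False
    ) as handle:
        for name, start, end in exons:
            handle.write(f"exons\t{name}\t{start}\t{end}\n")
        for name, start, end in shrnas:
            handle.write(f"shRNAs\t{name}\t{start}\t{end}\n")
        return handle.name
```

In a web application, the `exons` and `shrnas` lists would come from a query against the shRNA database, and the returned path would be handed to the viewer for that request.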
lwgv is particularly well suited to dynamically displaying a user's analysis of a particular region of DNA. We previously developed a web application where biologists can design their own RNAi oligos. lwgv provides a simple way to show the locations of the RNAi oligo designs on the user's sequence (Figure 1). For traditional genome browsers, this would require either generating (and subsequently deleting) a new database or table for each user, or developing substantial workaround code to allow the genome browser to operate from a database that holds many discontinuous sequences from different species. With lwgv, the user's sequence can be visualized by generating a temporary file containing the sequence and the locations of the siRNA oligos on it. These temporary files can be deleted once they are past a pre-determined expiration date.
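Expiring the temporary files needs no database bookkeeping; a periodic job can compare each file's modification time against a cutoff. The sketch below is one possible approach, not the authors' cleanup script; `remove_expired` and its parameters are hypothetical names.

```python
import os
import time


def remove_expired(directory, max_age_seconds, now=None):
    """Delete regular files in `directory` older than `max_age_seconds`.

    Age is measured from each file's last-modification time.
    Returns the sorted list of removed file names.
    """
    now = time.time() if now is None else now
    # Snapshot the directory first so deletion does not disturb iteration.
    with os.scandir(directory) as entries:
        candidates = [e for e in entries if e.is_file()]
    removed = []
    for entry in candidates:
        if now - entry.stat().st_mtime > max_age_seconds:
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

For example, `remove_expired("/path/to/views", 24 * 3600)` would drop view files untouched for more than a day; the directory path here is illustrative only.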
lwgv as a dynamic microarray analysis tool
lwgv is a lightweight genome browser that can be used in small-scale projects and individual labs. Scientists and laboratories with little computing infrastructure can use lwgv since it does not require databases or other software.
Availability and requirements
lwgv is distributed under the GPL license.
Project name: lwgv
Project homepage: http://lwgv.sourceforge.net
Operating systems: Linux and Mac OS X
Programming languages: C
Other requirements: Apache, cgic, GD graphics library, lex (flex), yacc (bison)
Any restrictions to use by non-academics: none
lwgv is distributed as a source code tarball and installs with the standard unix "./configure" and "make" commands. Details about installing lwgv, writing tracks, and customizing the output can be found in the manual and README files distributed with the software.
Manufacturers of America Foundation, the United States Department of Energy Office of Science (BER) grant number DE-FG02-04ER63803, the Whitaker Foundation, and the DART Neurogenomic Alliance at CSHL.
- Consortium TFB: FlyBase--the Drosophila database. The FlyBase Consortium. Nucleic Acids Res 1994, 22(17):3456–3458. 10.1093/nar/22.17.3456
- Stein L, Sternberg P, Durbin R, Thierry-Mieg J, Spieth J: WormBase: network access to the genome and biology of Caenorhabditis elegans. Nucleic Acids Res 2001, 29(1):82–86. 10.1093/nar/29.1.82
- Stein LD, Mungall C, Shu S, Caudy M, Mangone M, Day A, Nickerson E, Stajich JE, Harris TW, Arva A, Lewis S: The generic genome browser: a building block for a model organism system database. Genome Res 2002, 12(10):1599–1610. 10.1101/gr.403602
- Hinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, Diekhans M, Furey TS, Harte RA, Hsu F, Hillman-Jackson J, Kuhn RM, Pedersen JS, Pohl A, Raney BJ, Rosenbloom KR, Siepel A, Smith KE, Sugnet CW, Sultan-Qurraie A, Thomas DJ, Trumbower H, Weber RJ, Weirauch M, Zweig AS, Haussler D, Kent WJ: The UCSC Genome Browser Database: update 2006. Nucleic Acids Res 2006, 34(Database issue):D590–8. 10.1093/nar/gkj144
- Hubbard TJ, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E: Ensembl 2007. Nucleic Acids Res 2007, 35(Database issue):D610–7. 10.1093/nar/gkl996
- Karp PD, Ouzounis CA, Moore-Kochlacs C, Goldovsky L, Kaipa P, Ahren D, Tsoka S, Darzentas N, Kunin V, Lopez-Bigas N: Expansion of the BioCyc collection of pathway/genome databases to 160 genomes. Nucleic Acids Res 2005, 33(19):6083–6089. 10.1093/nar/gki892
- Peterson JD, Umayam LA, Dickinson T, Hickey EK, White O: The Comprehensive Microbial Resource. Nucleic Acids Res 2001, 29(1):123–125. 10.1093/nar/29.1.123
- Paddison PJ, Silva JM, Conklin DS, Schlabach M, Li M, Aruleba S, Balija V, O'Shaughnessy A, Gnoj L, Scobie K, Chang K, Westbrook T, Cleary M, Sachidanandam R, McCombie WR, Elledge SJ, Hannon GJ: A resource for large-scale RNA-interference-based screens in mammals. Nature 2004, 428(6981):427–431. 10.1038/nature02370
- Phillips MS, Lawrence R, Sachidanandam R, Morris AP, Balding DJ, Donaldson MA, Studebaker JF, Ankener WM, Alfisi SV, Kuo FS, Camisa AL, Pazorov V, Scott KE, Carey BJ, Faith J, Katari G, Bhatti HA, Cyr JM, Derohannessian V, Elosua C, Forman AM, Grecco NM, Hock CR, Kuebler JM, Lathrop JA, Mockler MA, Nachtman EP, Restine SL, Varde SA, Hozza MJ, Gelfand CA, Broxholme J, Abecasis GR, Boyce-Jacino MT, Cardon LR: Chromosome-wide distribution of haplotype blocks and the role of recombination hot spots. Nat Genet 2003, 33(3):382–387. 10.1038/ng1100
- Olson AJ, Tully T, Sachidanandam R: GeneSeer: a sage for gene names and genomic resources. BMC Genomics 2005, 6: 134. 10.1186/1471-2164-6-134
- Chromosome 19 Linkage Map [http://katahdin.cshl.org:9331/chr19/]
- UCSC Genome Browser [http://hgdownload.cse.ucsc.edu/downloads.html]
- Vincze T, Posfai J, Roberts RJ: NEBcutter: A program to cleave DNA with restriction enzymes. Nucleic Acids Res 2003, 31(13):3688–3691. 10.1093/nar/gkg526
- Olson A, Sheth N, Lee JS, Hannon G, Sachidanandam R: RNAi Codex: a portal/database for short-hairpin RNA (shRNA) gene-silencing constructs. Nucleic Acids Res 2006, 34(Database issue):D153–7. 10.1093/nar/gkj051
- RNAi Central [http://katahdin.cshl.org:9331/homepage/siRNA/RNAi.cgi?type=shRNA]
- Cohen BA, Mitra RD, Hughes JD, Church GM: A computational analysis of whole-genome expression data reveals chromosomal domains of gene expression. Nat Genet 2000, 26(2):183–186. 10.1038/79896
- Spellman PT, Rubin GM: Evidence for large domains of similarly expressed genes in the Drosophila genome. J Biol 2002, 1(1):5. 10.1186/1475-4924-1-5
- Many Microbe Microarrays Database [http://m3d.bu.edu/]
- Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS: Large-Scale Mapping and Validation of Escherichia coli Transcriptional Regulation from a Compendium of Expression Profiles. PLoS Biol 2007, 5(1):e8. 10.1371/journal.pbio.0050008
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.