- Open Access
svist4get: a simple visualization tool for genomic tracks from sequencing experiments
© The Author(s). 2019
- Received: 24 October 2018
- Accepted: 27 February 2019
- Published: 6 March 2019
High-throughput sequencing often provides a foundation for experimental analyses in the life sciences. For many such methods, an intermediate layer of bioinformatics data analysis is the genomic signal track constructed by short read mapping to a particular genome assembly. There are many software tools to visualize genomic tracks in a web browser or with a stand-alone graphical user interface. However, there are only a few command-line applications suitable for automated usage or production of publication-ready visualizations.
Here we present svist4get, a command-line tool for customizable generation of publication-quality figures based on data from genomic signal tracks. Similarly to generic genome browser software, svist4get visualizes signal tracks at a given genomic location and is able to aggregate data from several tracks on a single plot along with the transcriptome annotation. The resulting plots can be saved as vector or high-resolution bitmap images. We demonstrate practical use cases of svist4get for Ribo-Seq and RNA-Seq data.
svist4get is implemented in Python 3 and runs on Linux. The command-line interface of svist4get allows for easy integration into bioinformatics pipelines in a console environment. Extra customization is possible through configuration files and a Python API. For convenience, svist4get is provided as a PyPI package. The source code is available at https://bitbucket.org/artegorov/svist4get/
- Genomic tracks
- Next-generation sequencing
- High-throughput sequencing
- Genome browser
Next-generation sequencing gave birth to multiple high-throughput methods in the life sciences, many of which are based on mapping short sequence reads to an existing genome assembly. Visualization of mapped read densities and computationally derived genome signal tracks is one of the most routine tasks in bioinformatics sequencing data analysis. One approach is to use dedicated genome browsers. The most popular universal tools, such as UCSC Genome Browser or Zenbu, are web-based and allow interactive exploration of existing genome annotation along with uploaded user data. However, in some cases it is not convenient to upload the user data to a remote server. The data can then be visualized and explored with stand-alone applications that have a graphical user interface, such as Integrative Genomics Viewer and Integrated Genome Browser, or even directly in the console environment. Finally, there are hybrid approaches: for example, the BioUML bioinformatics platform provides genome browsing functionality in both web-based and stand-alone versions.
An overview of existing programmatic visualization tools for genomic signal tracks
All the listed tools can function in a Linux environment and support the bed or bedGraph format for genomic signal tracks and gtf or gff for genomic annotation. Most of the tools are not focused on visualization of genomic windows and include advanced functions for data analysis or exploration.
Svist4get is implemented in Python 3 and uses multiple PyPI packages (argparse, biopython, configs, reportlab, pybedtools, wand, wheel). The PyPI ‘wand’ package and ImageMagick are utilized for pdf-to-png conversion. Svist4get was developed and tested in a Linux environment. The Python svist4get package is available from PyPI (python3 -m pip install svist4get); the source code and example data are provided in Additional file 1. Details of svist4get installation are given in Additional file 2.
As input data, svist4get supports the bedGraph format for genomic signals and the gtf format for the genome annotation. As output data, svist4get can generate vector graphics in pdf and export raster graphics in png. ImageMagick is used to provide the raster (png) output.
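To make the signal input concrete, the following minimal sketch reads bedGraph text into intervals. It relies only on the general bedGraph layout (four whitespace-separated columns: chromosome, 0-based start, exclusive end, value, optionally preceded by `track` or `browser` lines). The names `BedGraphInterval` and `read_bedgraph` and the sample values are illustrative assumptions and are not part of svist4get.

```python
from typing import Iterable, Iterator, NamedTuple


class BedGraphInterval(NamedTuple):
    chrom: str
    start: int    # 0-based, inclusive
    end: int      # 0-based, exclusive
    value: float


def read_bedgraph(lines: Iterable[str]) -> Iterator[BedGraphInterval]:
    """Yield intervals from bedGraph text, skipping header and comment lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom, start, end, value = line.split()[:4]
        yield BedGraphInterval(chrom, int(start), int(end), float(value))


# Hypothetical example data, not taken from the paper.
example = """track type=bedGraph name="demo"
chrVI\t100\t105\t2.0
chrVI\t105\t110\t7.5
"""
intervals = list(read_bedgraph(example.splitlines()))
```

Each interval assigns one value to a run of positions, so a per-nucleotide signal for a window can be obtained by expanding the intervals that overlap it.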
Given a particular genomic window and a set of genomic signal tracks, svist4get automatically performs moving-average smoothing of the signal tracks, if necessary, taking into account the image width and the visible length of the genomic window. However, svist4get is a pure visualization tool, thus the technical data conversion and pre-processing, such as read depth normalization, should be performed with external tools, such as deeptools, bedtools, or UCSC utilities.
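The exact smoothing rule used by svist4get is not specified here. As a rough illustration of the idea, the sketch below applies a centered moving average whose width grows when the window holds more positions than can be drawn as separate bars. The heuristic in `choose_window` and the parameter `max_bars` are assumptions made for this example only.

```python
import math
from typing import List, Sequence


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Centered moving average using window // 2 neighbours on each side.

    The averaging range is truncated at the edges of the signal.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def choose_window(region_length: int, max_bars: int) -> int:
    """Assumed heuristic: smooth only when positions outnumber drawable bars."""
    return max(1, math.ceil(region_length / max_bars))


# Hypothetical per-nucleotide signal for a six-position window.
signal = [0.0, 0.0, 9.0, 0.0, 0.0, 3.0]
window = choose_window(region_length=len(signal), max_bars=2)
smoothed = moving_average(signal, window)
```

With a window of 1 the signal is returned unchanged, which corresponds to the single-nucleotide case where no smoothing is needed.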
To facilitate application of svist4get in standard scenarios and data exploration, the command line interface covers several practical use cases that arise in transcriptomic studies, without additional effort for user-side scripting. Furthermore, svist4get provides a Python API allowing additional customization and programmatic usage from within a Python program. The use cases and examples of svist4get results are described in the next section.
Svist4get capabilities have been demonstrated in a previous publication, where figures were produced with the svist4get Python API. Here we show several practical use cases of the command-line interface by visualizing particular genomic windows related to genes and transcripts using existing genome annotation. The command-line parameters to reproduce the presented images are provided in Additional file 2.
Basic visualization of genomic windows
We employed svist4get to generate a visualization of the genomic window containing the YFL031W transcript of the HAC1 gene (Fig. 1). Based on genome annotation and a transcript identifier, svist4get selects a genomic window that includes a particular transcript. Alternative scenarios include the selection of a genomic window based on a gene identifier and visualization of all transcripts in a given window (Additional file 2). Svist4get renders the transcript structure (based on genome annotation) as the top track, places the signal tracks (based on data in bedGraph format) below it, and shows the structure of open reading frames (0, +1, +2, based on the nucleotide sequence of the displayed window) at the bottom.
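Selecting a window from a transcript identifier can be sketched as a scan over the GTF annotation that collects the span of all records carrying that `transcript_id`. This is a simplified illustration based on the standard nine-column GTF layout, not the svist4get implementation; the function name `transcript_window`, the `padding` parameter, and the example records are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def transcript_window(gtf_lines: Iterable[str], transcript_id: str,
                      padding: int = 0) -> Optional[Tuple[str, int, int]]:
    """Return (chrom, start, end), 1-based inclusive, spanning a transcript."""
    chrom, start, end = None, None, None
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        match = TRANSCRIPT_ID.search(fields[8])
        if match is None or match.group(1) != transcript_id:
            continue
        rec_start, rec_end = int(fields[3]), int(fields[4])
        chrom = fields[0]
        start = rec_start if start is None else min(start, rec_start)
        end = rec_end if end is None else max(end, rec_end)
    if chrom is None:
        return None  # transcript not found in the annotation
    return chrom, max(1, start - padding), end + padding


# Made-up annotation records for illustration only.
gtf = [
    'chrI\tdemo\texon\t1000\t1200\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
    'chrI\tdemo\texon\t1500\t1800\t.\t+\t.\tgene_id "g1"; transcript_id "tx1";',
    'chrI\tdemo\texon\t5000\t5200\t.\t-\t.\tgene_id "g2"; transcript_id "tx2";',
]
window = transcript_window(gtf, "tx1", padding=100)
```

The same scan keyed on `gene_id` instead of `transcript_id` would give the gene-based selection mentioned above.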
Visualizing a genomic window at the single-nucleotide resolution
We also used svist4get to show the region surrounding the translation initiation site of the yeast DFG16 gene (Fig. 2), including an upstream open reading frame (ORF). The general layout of tracks in Fig. 2 is similar to that of Fig. 1. An additional track is used to show arbitrary genomic segments with user-defined labels (upstream ORF and CDS). A smaller genomic region surrounding the DFG16 translation initiation site was selected based on the transcript ID. A wider template (the predefined configuration file) allowed single-nucleotide resolution.
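At single-nucleotide resolution, the reading-frame track amounts to marking start and stop codons in each of the three frames of the displayed sequence. The sketch below shows one way to compute such marks. It assumes the forward strand and the standard ATG start and TAA/TAG/TGA stop codons, and the function name `codon_marks` is illustrative rather than taken from svist4get.

```python
from typing import Dict, List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_marks(sequence: str) -> Dict[int, List[Tuple[int, str]]]:
    """Map each frame (0, 1, 2) to (position, kind) marks.

    Positions are 0-based offsets of the codon's first nucleotide in the
    window; kind is 'start' or 'stop'.
    """
    sequence = sequence.upper()
    marks: Dict[int, List[Tuple[int, str]]] = {0: [], 1: [], 2: []}
    for frame in range(3):
        # Step through complete codons only.
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if codon == "ATG":
                marks[frame].append((pos, "start"))
            elif codon in STOP_CODONS:
                marks[frame].append((pos, "stop"))
    return marks


# Short made-up sequence for illustration.
marks = codon_marks("CATGAAATAGGATGTGA")
```

In this toy sequence, frame +1 contains a start codon followed by an in-frame stop, which is the pattern a short upstream ORF would produce on the track.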
Visualizing ribosome occupancy in overlapping transcripts
We also show a multi-track visualization illustrating differential ribosome occupancy in mouse kidney and liver Ribo-Seq data (Fig. 3). Reconcilable parts of introns of two annotated transcripts are collapsed (red vertical marks on the transcript structure tracks) to facilitate a non-interrupted view of the translated shortened open reading frame that is specific to the liver.
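One plausible reading of intron collapsing is that stretches lying outside the exons of every displayed transcript are cut out, leaving a small flank next to each exon, and the remaining coordinates are shifted accordingly. The sketch below follows that reading. It is an assumption-based illustration rather than the svist4get algorithm, and the names `collapsible_gaps`, `to_collapsed` and `flank` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based start, exclusive end


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def collapsible_gaps(exons: List[Interval], flank: int) -> List[Interval]:
    """Regions outside all exons, minus `flank` nt kept next to each exon."""
    covered = merge(exons)
    gaps = []
    for (_, left_end), (right_start, _) in zip(covered, covered[1:]):
        start, end = left_end + flank, right_start - flank
        if start < end:
            gaps.append((start, end))
    return gaps


def to_collapsed(position: int, gaps: List[Interval]) -> int:
    """Map a genomic position to its coordinate after the gaps are removed."""
    shift = 0
    for start, end in gaps:
        if position >= end:
            shift += end - start
        elif position > start:
            shift += position - start  # inside a gap: lands on the cut mark
    return position - shift


# Made-up exons of two overlapping transcripts, pooled together.
exons = [(100, 200), (1000, 1100),                  # transcript A
         (150, 250), (1000, 1100), (3000, 3100)]    # transcript B
gaps = collapsible_gaps(exons, flank=50)
```

The start of each gap in collapsed coordinates is where a vertical mark would be drawn on the transcript structure track.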
Advanced features and customization
The visualization produced by svist4get is highly customizable. Some essential options, such as custom track coloring, are available directly through the command-line interface. Other parameters, such as the color palette, bitmap DPI setting, font typeface, and page size, are defined in configuration files (see Additional file 2 for details). The package includes a default color palette and editable configuration files for generating figures to fit the one- and two-column layouts of an A4 page.
Data from high-throughput sequencing requires specialized visualization tools. Here, we present svist4get, which produces publication-quality images of signal tracks along transcript structure in arbitrary genomic windows. We believe svist4get provides a reasonable compromise between tools with advanced R APIs and user-friendly graphical interfaces and can be useful as a component of bioinformatics pipelines as well as a stand-alone tool for data exploration.
Project name: svist4get.
Project home page: https://bitbucket.org/artegorov/svist4get
Operating system(s): Linux.
Programming language: Python 3.
Other requirements: pypi packages (argparse, biopython, configs, reportlab, pybedtools, wand, wheel), ImageMagick (OS-level requirement for wand).
License: WTFPL http://www.wtfpl.net
We thank Ilya Vorontsov and Andrey Buyan for valuable feedback and testing the software.
The work was primarily supported by a Russian Federation grant (14.W03.31.0012). The NGS data analysis pipeline was supported by the Russian Foundation for Basic Research (grant 18-34-20024 to IVK).
AAE implemented the program and prepared the figures; EAS performed code review, refactoring and testing; ASA, SED, and VNG designed the use cases and suggested key features; IVK supervised the project and drafted the manuscript together with AAE. All the authors have participated in the manuscript preparation.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Fujita PA, Rhead B, Zweig AS, Hinrichs AS, Karolchik D, Cline MS, et al. The UCSC genome browser database: update 2011. Nucleic Acids Res. 2011;39 Database issue:D876–82. https://doi.org/10.1093/nar/gkq963.
- Severin J, Lizio M, Harshbarger J, Kawaji H, Daub CO, Hayashizaki Y, et al. Interactive visualization and analysis of large-scale sequencing datasets using ZENBU. Nat Biotechnol. 2014;32:217–9.
- Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative genomics viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14:178–92.
- Freese NH, Norris DC, Loraine AE. Integrated genome browser: visual analytics platform for genomics. Bioinformatics. 2016;32:2089–95. https://doi.org/10.1093/bioinformatics/btw069.
- Beraldi D. ASCIIGenome: a command line genome browser for console terminals. Bioinformatics. 2017;33:1568–9.
- Valeev T, Yevshin I, Kolpakov F. BioUML Genome Browser. Virtual Biol. 2013;1:e8.
- Hahne F, Ivanek R. Visualizing genomic data using Gviz and bioconductor. In: Methods in molecular biology; 2016. p. 335–51.
- Yin T, Cook D, Lawrence M. Ggbio: an R package for extending the grammar of graphics for genomic data. Genome Biol. 2012;13:R77.
- Georgiou G, van Heeringen SJ. Fluff: exploratory analysis and visualization of high-throughput sequencing data. PeerJ. 2016;4:e2209.
- Shen L, Shao N, Liu X, Nestler E. Ngs.Plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics. 2014;15:284.
- Cock PJA, Antao T, Chang JT, Chapman BA, Cox CJ, Dalke A, et al. Biopython: freely available Python tools for computational molecular biology and bioinformatics. Bioinformatics. 2009;25:1422–3.
- Dale RK, Pedersen BS, Quinlan AR. Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics. 2011;27:3423–4.
- Ramírez F, Dündar F, Diehl S, Grüning BA, Manke T. DeepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 2014:W187–91.
- Quinlan AR. BEDTools: the Swiss-Army tool for genome feature analysis. Curr Protoc Bioinforma. 2014;8:11.12.1–34.
- Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, et al. The UCSC genome browser database: extensions and updates 2013. Nucleic Acids Res. 2013;41(Database issue):D64–9.
- Makeeva DS, Lando AS, Anisimova AS, Egorov AA, Logacheva MD, Penin AA, et al. Translatome and transcriptome analysis of TMA20 (MCT-1) and TMA64 (eIF2D) knockout yeast strains. Data Br. 2019;23:103701. https://doi.org/10.1016/j.dib.2019.103701.
- Albert FW, Muzzey D, Weissman JS, Kruglyak L. Genetic influences on translation in yeast. PLoS Genet. 2014;10:e1004692.
- Michel AM, Kiniry SJ, O’Connor PBF, Mullan JP, Baranov PV. GWIPS-viz: 2018 update. Nucleic Acids Res. 2018;46:D823–30.
- Zerbino DR, Achuthan P, Akanni W, Amode MR, Barrell D, Bhai J, et al. Ensembl 2018. Nucleic Acids Res. 2018;46:D754–61.
- Castelo-Szekely V, Arpat AB, Janich P, Gatfield D. Translational contributions to tissue specificity in rhythmic and constitutive gene expression. Genome Biol. 2017;18:116.
- Janich P, Arpat AB, Castelo-Szekely V, Lopes M, Gatfield D. Ribosome profiling reveals the rhythmic liver translatome and circadian clock regulation by upstream open reading frames. Genome Res. 2015;25:1848–59.
- Michel AM, Fox G, M. Kiran A, De Bo C, O’Connor PBF, Heaphy SM, et al. GWIPS-viz: development of a ribo-seq genome browser. Nucleic Acids Res. 2014;42:D859–64.
- Harrow J, Frankish A, Gonzalez JM, Tapanari E, Diekhans M, Kokocinski F, et al. GENCODE: the reference human genome annotation for the ENCODE project. Genome Res. 2012;22:1760–74.