Dendroscope: An interactive viewer for large phylogenetic trees
DOI: 10.1186/1471-2105-8-460
© Huson et al; licensee BioMed Central Ltd. 2007
Received: 26 March 2007
Accepted: 22 November 2007
Published: 22 November 2007
Abstract
Background
Research in evolution requires software for visualizing and editing phylogenetic trees for increasingly large datasets, such as those that arise in expression analysis or metagenomics. It would be desirable to have a program that provides these services in an efficient and user-friendly way, and that can be easily installed and run on all major operating systems. Although a large number of tree visualization tools are freely available, some as part of more comprehensive analysis packages, all have drawbacks in one or more domains. They either lack some of the standard tree visualization techniques or basic graphics and editing features, or they are restricted to small trees containing only tens of thousands of taxa. Moreover, many programs are difficult to install or are not available for all common operating systems.
Results
We have developed a new program, Dendroscope, for the interactive visualization and navigation of phylogenetic trees. The program provides all standard tree visualizations and is optimized to run interactively on trees containing hundreds of thousands of taxa. The program provides tree editing and graphics export capabilities. To support the inspection of large trees, Dendroscope offers a magnification tool. The software is written in Java 1.4 and installers are provided for Linux/Unix, MacOS X and Windows XP.
Conclusion
Dendroscope is a user-friendly program for visualizing and navigating phylogenetic trees, for both small and large datasets.
Background
Phylogenetic trees are used to represent evolutionary relationships between biological taxa, while taxonomical hierarchies such as the NCBI taxonomy are used to structure the wealth of molecular sequence data. The trees under consideration are steadily growing in size.
The Tree of Life project [1], which aims at reconstructing the evolutionary relationship of all living species on earth, now considers more than 11,000 species. The Ribosomal Database Project II provides a hierarchical browser for a collection of approximately 340,000 ribosomal RNA sequences. Recent metagenomic analysis software [2] makes use of the full NCBI taxonomy, which now contains more than 390,000 taxa, to estimate the taxonomical content of a dataset.
Most currently available tree viewers are designed to handle trees containing up to a few thousand nodes. A notable exception is TreeJuxtaposer [3], which was explicitly designed to visualize large trees. While TreeJuxtaposer is the tool of choice for very large datasets (containing hundreds of thousands of taxa), it has limited value as an all-round tree visualization tool. It implements only one tree view, the rectangular phylogram (perhaps because this is the only view that is useful for large trees), it lacks basic graphics export capabilities, and it does not allow one to save and reopen a modified tree.
Results and Discussion
Figure 1. Homo sapiens in the NCBI taxonomy. The placement of Homo sapiens and the Hominidae in the NCBI taxonomy, as displayed in Dendroscope using the program's magnifier feature.
Figure 2. Four different views of the same dataset of 28 sequences of genera of the daisy family: (a) circular cladogram, (b) radial phylogram, (c) rectangular phylogram, and (d) slanted cladogram.
Figure 3. Formatting nodes and edges of a tree. Dendroscope provides a dialog box for formatting the nodes and edges of a tree; the example shows a tree drawn as an internal circular cladogram.
Comparison with other tree viewers
Table 1. Comparison of popular tree viewers. Column headers: A: displayable taxa (see Methods section for details), B: search function, C: tree comparison, D: coloring of subtrees, E: editing of labels, F: collapsing of subtrees, G: rerooting, H: rectangular view, I: slanted view, J: radial view, K: circular view, L: graphic export formats.
A | B | C | D | E | F | G | H | I | J | K | L | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
ATV | 2 k | ✓ | ✓ | ✓ | ✓ | |||||||
Dendroscope | 350 k | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | eps, svg, png, jpg, gif, bmp | |
HyperTree | 20 k | ✓ | ✓1 | ✓ | ✓ | ✓ | - | |||||
MEGA | 20 k | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | emf | ||
PHYLIP | 1336 k2 | ✓ | ✓ | ✓ | ✓ | ps, bmp, pict, pov, fig | ||||||
SplitsTree4 | 1 k | ✓ | ✓3 | ✓ | ✓ | ✓ | ✓ | ✓ | eps, svg, png, jpg, gif, bmp | |||
TreeDyn | 5 k | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ps, svg, png, jpg, gif, etc. | |
TreeJuxtaposer | 1002 k | ✓ | ✓ | ✓ | ✓ | ✓ | - | |||||
TreeView | 2 k4 | ✓5 | ✓6 | ✓ | ✓ | ✓ | ✓ | ✓ | wmf, emf |
The system requirements of existing viewers vary: some work only with particular versions of Unix/Linux or MacOS, or they need additional software to be installed. However, all viewers listed in Table 1 run on Linux/Unix, MacOS and Windows, except MEGA, which runs only on Windows.
Dendroscope at work
Our objective was to build a tree viewer that is able to handle a tree as large as the current version of the NCBI taxonomy. On a standard laptop, Dendroscope performs well on this tree in all rectangular and slanted views. Circular and radial views are less suitable for very large datasets. Figure 1 shows a screenshot of the NCBI taxonomic tree loaded in Dendroscope, showing Homo sapiens and the Hominidae. Figure 2 demonstrates some of the views provided by the program.
Conclusion
With Dendroscope, we have developed a new all-round tree viewer that combines all major features found in popular viewers into a single program that can handle large datasets.
Availability and Requirements
Dendroscope is freely available and can be downloaded from http://www-ab.informatik.uni-tuebingen.de/software/dendroscope. The software is written in Java 1.4 and installers are provided for Linux/Unix, MacOS X and Windows.
Methods
Processing of trees
Since we want to represent very large trees, we need to be able to focus on the crucial parts of the representation to speed up calculations. To this end, we use bounding boxes: to each subtree, we assign a box containing the subtree. The tree is drawn from the root down, and each subtree is drawn only if its bounding box lies in the visible region or at least intersects it. In addition, we compare the height of the bounding box to the number of edges it contains; if we find too many edges in too small a box, we draw the box as a single opaque element instead of drawing each edge separately. When we want to identify the element (edge or node) at a selected position, we also make use of the bounding boxes: the tree is searched from the root down, leaving out all subtrees whose bounding boxes do not contain the selected position. This reduces the search time from O(n) to O(log(n)).
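The bounding-box scheme described above can be illustrated with a minimal Python sketch. It is not Dendroscope's implementation (which is written in Java); the names `Box`, `Node`, `compute_boxes`, `draw`, `pick` and the threshold parameter `max_edges_per_unit` are hypothetical, and drawing is reduced to collecting a list of commands, with the edge leading into a node treated as part of drawing that node.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Axis-aligned bounding box in screen coordinates."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0

    def intersects(self, other: "Box") -> bool:
        return (self.x0 <= other.x1 and other.x0 <= self.x1
                and self.y0 <= other.y1 and other.y0 <= self.y1)

    def contains(self, x: float, y: float, margin: float = 0.0) -> bool:
        return (self.x0 - margin <= x <= self.x1 + margin
                and self.y0 - margin <= y <= self.y1 + margin)


@dataclass
class Node:
    label: str
    x: float
    y: float
    children: List["Node"] = field(default_factory=list)
    box: Optional[Box] = None  # bounding box of the subtree rooted here
    n_edges: int = 0           # number of edges in that subtree


def compute_boxes(node: Node) -> None:
    """Post-order pass: give every subtree its bounding box and edge count."""
    box = Box(node.x, node.y, node.x, node.y)
    node.n_edges = 0
    for child in node.children:
        compute_boxes(child)
        c = child.box
        box = Box(min(box.x0, c.x0), min(box.y0, c.y0),
                  max(box.x1, c.x1), max(box.y1, c.y1))
        node.n_edges += child.n_edges + 1
    node.box = box


def draw(node: Node, viewport: Box, max_edges_per_unit: float,
         out: Optional[List[Tuple[str, str]]] = None) -> List[Tuple[str, str]]:
    """Collect draw commands from the root down, skipping invisible subtrees."""
    if out is None:
        out = []
    if not node.box.intersects(viewport):
        return out  # subtree lies entirely outside the visible region
    if node.n_edges > max_edges_per_unit * node.box.height:
        out.append(("box", node.label))  # too dense: one opaque element
        return out
    out.append(("node", node.label))
    for child in node.children:
        draw(child, viewport, max_edges_per_unit, out)
    return out


def pick(node: Node, x: float, y: float, tol: float = 2.0) -> Optional[Node]:
    """Find the node at (x, y), descending only into boxes that contain it."""
    if not node.box.contains(x, y, margin=tol):
        return None
    if abs(node.x - x) <= tol and abs(node.y - y) <= tol:
        return node
    for child in node.children:
        hit = pick(child, x, y, tol)
        if hit is not None:
            return hit
    return None
```

After laying out a tree and calling `compute_boxes(root)` once, `draw(root, viewport, max_edges_per_unit)` returns only the commands for visible or collapsed subtrees, and `pick(root, x, y)` returns the node under the cursor or `None`. The logarithmic search time applies only when the tree is reasonably balanced and sibling boxes do not overlap; the recursion shown here would also need to be made iterative for very deep trees.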
We supply two different magnifiers to let the user easily access inner nodes and taxa: a horizontal magnifier band for rectangular and slanted views, and a circular one for radial tree views. In both cases, a point with distance d to the center of the magnifier is mapped to a point with distance from the center, where D denotes the diameter or height of the magnifier, as appropriate.
Test data and system
To estimate the number of displayable taxa for each viewer (see Table 1), we applied the viewer to a list of trees containing increasingly large numbers of taxa: 1 k, 2 k, 5 k, 10 k, 20 k, 50 k, 100 k, 200 k, 334 k, 668 k, 1002 k, 1336 k and 2004 k. In Table 1, we report the largest dataset that the viewer could open and then browse in a reasonable amount of time (less than 90 seconds to open and an interaction response time of less than 15 seconds) on a standard workstation.
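The test protocol above can be summarized in a short Python sketch. The function `max_displayable` and the `measure` callback are hypothetical names introduced for illustration; the sketch also assumes that testing stops at the first tree size that fails a threshold, which the text does not state explicitly.

```python
from typing import Callable, Optional, Sequence, Tuple

# Tree sizes used in the test, in thousands of taxa (from the text above).
SIZES_K = [1, 2, 5, 10, 20, 50, 100, 200, 334, 668, 1002, 1336, 2004]
MAX_OPEN_S = 90.0      # a tree must open in less than 90 seconds
MAX_RESPONSE_S = 15.0  # interaction must respond in less than 15 seconds


def max_displayable(
    sizes_k: Sequence[int],
    measure: Callable[[int], Optional[Tuple[float, float]]],
) -> Optional[int]:
    """Return the largest size (in k taxa) that meets both time limits.

    `measure(size_k)` is a hypothetical callback that returns
    (open_seconds, response_seconds), or None if the viewer could not
    open the tree at all.
    """
    best = None
    for size in sizes_k:
        result = measure(size)
        if result is None:
            break  # viewer failed to open the tree
        open_s, response_s = result
        if open_s >= MAX_OPEN_S or response_s >= MAX_RESPONSE_S:
            break  # too slow to count as displayable
        best = size
    return best
```

In practice `measure` would wrap manual or scripted timing of a particular viewer; with a synthetic callback it can be used to check the bookkeeping of the protocol itself.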
Declarations
Acknowledgements
Funding for TD, DHH, CR, DCR and (partially) MF, and also the publication costs for this article, was provided by the Deutsche Forschungsgemeinschaft (funding for the ZBIT, BIZ 1/1-2 and BIZ 1/1-3). RR was funded by the Deutsche Forschungsgemeinschaft grant HU566/5-1.
References
1. Maddison DR, Schulz KS: The Tree of Life Web Project. 2006. [http://tolweb.org]
2. Huson D, Auch A, Qi J, Schuster S: MEGAN Analysis of Metagenomic Data. Genome Research 2007. [Published online before print, DOI: 10.1101/gr.5969107]
3. Munzner T, Guimbretière F, Tasiran S, Zhang L, Zhou Y: TreeJuxtaposer: scalable tree comparison using Focus+Context with guaranteed visibility. ACM Trans Graph 2003, 22(3):453–462. 10.1145/882262.882291
4. Maddison DR, Swofford DL, Maddison WP: NEXUS: an extensible file format for systematic information. Syst Biol 1997, 46(4):590–621. 10.2307/2413497
5. Felsenstein J: Tree plotting/drawing software. 2007. [http://evolution.genetics.washington.edu/phylip/software.html#Plotting]
6. Christen R: Trees – software for visualisation and manipulations. 2007. [http://bioinfo.unice.fr/biodiv/Tree_editors.html]
7. Zmasek CM, Eddy SR: ATV: display and manipulation of annotated phylogenetic trees. Bioinformatics 2001, 17(4):383–384. 10.1093/bioinformatics/17.4.383
8. Bingham J, Sudarsanam S: Visualizing large hierarchical clusters in hyperbolic space. Bioinformatics 2000, 16(7):660–661. 10.1093/bioinformatics/16.7.660
9. Kumar S, Tamura K, Nei M: MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform 2004, 5(2):150–163. 10.1093/bib/5.2.150
10. Felsenstein J: PHYLIP (PHYLogeny Inference Package) version 3.66. 2006. [http://evolution.genetics.washington.edu/phylip.html] [Distributed by the author. Department of Genome Sciences, University of Washington, Seattle]
11. Huson DH, Bryant D: Application of phylogenetic networks in evolutionary studies. Mol Biol Evol 2006, 23(2):254–267. 10.1093/molbev/msj030
12. Page RDM: TREEVIEW: An application to display phylogenetic trees on personal computers. Computer Applications in the Biosciences 1996, 12: 357–358.
13. Chevenet F, Brun C, Bañuls AL, Jacq B, Christen R: TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 2006, 7: 439. 10.1186/1471-2105-7-439
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


