primetv: a viewer for reconciled trees
© Sennblad et al; licensee BioMed Central Ltd. 2007
Received: 11 October 2006
Accepted: 07 May 2007
Published: 07 May 2007
Evolutionary processes, such as gene family evolution or parasite-host co-speciation, can often be viewed as a tree evolving inside another tree. Relating two given trees under such a constraint is known as reconciling them. Adequate software tools for generating illustrations of tree reconciliations are instrumental for presenting and communicating results and ideas regarding these phenomena. Available visualization tools have been limited to illustrations of the most parsimonious reconciliation. However, there exists a plethora of biologically relevant non-parsimonious reconciliations. Illustrations of these general reconciliations cannot currently be produced without manual editing.
We have developed a new reconciliation viewer, primetv. It is a simple and compact visualization program that is the first automatic tool for illustrating general tree reconciliations. It reads reconciled trees in an extended Newick format and outputs them as tree-within-tree illustrations in a range of graphic formats. Output attributes, such as colors and layout, can easily be adjusted by the user. To simplify the construction of input to primetv, two helper programs, readReconciliation and reconcile, accompany primetv. Detailed examples of the usage of all programs are provided in the text. For the casual user, a web service provides a simple user interface to all programs.
With primetv, the first visualization tool for general reconciliations, illustrations of trees-within-trees are easy to produce. Because it clarifies and accentuates an underlying structure in a reconciled tree, e.g., the impact of a species tree on a gene-family phylogeny, it will enhance scientific presentations as well as pedagogic illustrations in an educational setting. primetv is available at http://prime.sbc.su.se/primetv, both as a standalone command-line tool and as a web service. The software is distributed under the GNU General Public License.
Illustrating evolution using phylogenetic trees is a natural way of depicting evolutionary processes. Indeed, Darwin already drew trees, and there exist many tools for automatically producing graphical representations of phylogenies. Some phenomena are, however, better viewed as trees-within-trees. For example, gene family evolution should be put in the context of species evolution, parasite evolution is best understood in relation to a host tree, and a speciation process that is affected by geological processes, such as continental breakup and movement, is often viewed as a species tree that evolves within an area tree. The relation between the two trees is called a reconciliation. More precisely, a reconciliation describes how a guest tree, G (e.g., a gene tree in the case of gene evolution), evolves inside a host tree, S (e.g., a species tree), and allows the prediction and dating of evolutionary events (e.g., gene duplications or speciations) corresponding to vertices of G with respect to S. A special case is the most parsimonious reconciliation, MPR, i.e., the reconciliation that minimizes the number of evolutionary events. In order to avoid confusion, we will here use the term general reconciliation to indicate the set of all possible reconciliations (i.e., not restricted to MPRs).
Illustrations of these phenomena are common in the literature today, but, apart from MPRs, graphic representations of general reconciliations must be created by hand. Early support for viewing MPRs was available in COMPONENT, GeneTree and TreeMap, featuring a nested view of co-evolving trees as well as an opportunity to look at them side-by-side. The Mesquite coalescence package also provides automatic calculation and viewing of MPRs. The tree viewer ATV has support for displaying vertices corresponding to gene duplications in a different color, thus implicitly defining the reconciliation, but the relation to the species tree is not directly visualized. The ATV interface has been inherited and improved in Notung, which is a program for, e.g., gene family analysis, including methods for tree reconstruction, MPRs and gene-tree rooting. The RAP system visualizes MPRs similarly to ATV, but also supports interactions with databases through a graphical query language for gene tree features, thereby helping the user find gene families with a given evolutionary pattern. DupLoss displays MPRs but is an exploratory tool for algorithmic research rather than a general reconciliation viewer and cannot, for example, read input with arbitrary leaf names. Another approach to identifying duplications, which implicitly relies on MPRs, is taken by TreeSimplifier, which shows guest trees on a hyperbolic surface and supports collapsing subtrees that are isomorphic to the host tree. Finally, we mention TreeDyn, a powerful tree editor that, while not directly handling reconciliations, allows visualization of relations between leaves in host trees and guest trees. We present primetv (PrIME Tree Viewer; PrIME stands for Probabilistic Inference of Models of Evolution), to our knowledge the first computer program that produces tree-within-tree illustrations of general reconciliations, i.e., not restricted to MPRs.
For convenience, a primetv web service is also available where a user can create illustrations without installing any software.
Implementation
Input to primetv is a host tree, S (typically a species tree), and a reconciled tree, (G, γ), where the guest tree, G, typically is a gene tree and γ is a reconciliation between G and S. Formally, the reconciliation is a mapping between vertices in G and the edges in S on which they evolve. A reconciled tree is a guest tree given together with a reconciliation.
Two helper programs accompany primetv to simplify the construction of its input. The first, readReconciliation, is a translator that takes a guest tree, a host tree, and a separate reconciliation description and returns a reconciled tree in PrIME format. The input guest and host trees are in Newick or PrIME format, and the input reconciliation is in tabular form; see Figure 2(d) and 2(e) and the Results and Discussion section. The output is in PrIME format and can be fed directly into primetv. The second helper program, reconcile, is an implementation of Zhang's algorithm for most parsimonious tree reconciliation. For simplicity of implementation, a quadratic-time version of the least-common-ancestor subroutine was used. Its input is in Newick or PrIME format, and the output is either a reconciled tree in PrIME format that can be fed directly to primetv, or a reconciliation in tabular format that can provide a starting point for a user's adaptation to non-MPR reconciliations.
The programs are all implemented in C++. A simple graphical user interface is provided for the web service. The GNU plotutils package was used to generate the graphics. It has the advantage that we can produce output in a number of practical formats. Besides the popular bitmap-based image formats such as GIF and PNG, there are EPS, SVG, and FIG for vector-based illustrations that can also be manipulated and improved manually with programs such as Adobe Illustrator (Adobe Systems Inc.), Inkscape and XFig.
Results and Discussion
To illustrate the capabilities of primetv, we give two examples of its application that use different kinds of input. The two examples also illustrate the formatting options of primetv.
The first example is a test case from the major histocompatibility complex class I, MHCI, in primates, analyzed using PrIME in [9, 10]. The species tree consists of three primates, its branch times are taken from , and the gene tree is from . The reconciliation is estimated with PrIME and the reconciled tree from PrIME is used as input to primetv. The output from primetv is presented in Figure 1.
The relation between the gene tree and the species tree is clear when using primetv: identifying the gene tree's speciation vertices and their corresponding vertices in the species tree is straightforward. Gene losses are also apparent from missing gene lineages at speciations in the species tree. Another advantage of this type of illustration is that deciding which species a gene belongs to is immediate; you do not need to rely on a prefix or suffix notation in the gene names. Notice that the reconciliation in Figure 1 is not the MPR: the second gene tree vertex from the left in Figure 1 would in the MPR be denoted as a speciation. However, as described in [12, 13], if auxiliary evidence, i.e., MHCI representatives from the human genome, is included, it becomes evident, also with the most parsimonious reconciliation, that this vertex indeed represents a duplication. The ability to output non-MPR illustrations is currently unique to primetv.
The tabular format for the reconciliation is given as a two-column table where, for each row, the left column holds the identity number of a host tree vertex and the right column holds the identity numbers of the guest tree vertices reconciled to this host tree vertex. Depending on how the guest tree vertices are given, there are two 'flavors' of the tabular format. In the full reconciliation, denoted γ, the right column comprises the guest tree vertices whose incoming edge appears on the incoming edge of the host tree vertex. An example is given in Figure 2(d), and row 4 in this table, i.e., γ(4), is graphically indicated in Figure 3(a). Notice that the guest vertices mapped to a host tree vertex in γ form subtrees of G. The reduced reconciliation comprises, for each host vertex, the leaves of these subtrees, cf. Figure 2(e). Reduced reconciliations are generic in PrIME, but full reconciliations may be more intuitive for users. The result of readReconciliation run with S, G and either the full or the reduced reconciliation as arguments is the reconciled tree (G, γ) shown in Figure 2(f). Here the tag AC for a guest tree vertex v comprises the host vertices to which v is mapped in the reduced reconciliation (i.e., the inverse of the reduced reconciliation). The primetv output for this reconciled tree is given in Figure 3(a).
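The relation between the two flavors of the tabular format can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from readReconciliation: the tables are held as in-memory dictionaries rather than parsed from files, the function names reduce_reconciliation and invert_reconciliation are hypothetical, and the small example trees and identity numbers are made up rather than taken from Figure 2.

```python
def reduce_reconciliation(full, guest_children):
    """Keep, for each host vertex, only the guest vertices that are leaves of
    the subtrees formed by the guest vertices mapped to that host vertex."""
    reduced = {}
    for host_vertex, guest_vertices in full.items():
        members = set(guest_vertices)
        reduced[host_vertex] = sorted(
            v for v in members
            # v is a leaf of its subtree if none of its children is in the set
            if not any(c in members for c in guest_children.get(v, ()))
        )
    return reduced


def invert_reconciliation(reduced):
    """Map each guest vertex to the host vertices that list it."""
    inverse = {}
    for host_vertex, guest_vertices in reduced.items():
        for v in guest_vertices:
            inverse.setdefault(v, []).append(host_vertex)
    return {v: sorted(hosts) for v, hosts in inverse.items()}


# Hypothetical guest tree: vertex 4 is the root with children 3 and 2,
# and vertex 3 has the leaves 0 and 1 as children.
guest_children = {4: [3, 2], 3: [0, 1]}

# Hypothetical full reconciliation (host vertex -> guest vertices):
# guest vertices 0, 1 and 3 all lie on the edge leading to host vertex 0.
full = {0: [0, 1, 3], 1: [2], 2: [4]}

reduced = reduce_reconciliation(full, guest_children)
inverse = invert_reconciliation(reduced)
print(reduced)
print(inverse)
```

In this toy case guest vertex 3 is dropped from the row of host vertex 0 because its children 0 and 1 appear in the same row, which is the sense in which the reduced table keeps only the leaves of each subtree.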
The program reconcile produces most parsimonious reconciliations, MPRs, of trees. It takes as input a host tree and a guest tree in Newick format and a leaf map in tabular format that maps the name of a guest tree leaf to the name of the corresponding host tree leaf; see the example in Figure 2(c). The output is either the most parsimonious reconciliation, γ*, in tabular format, or the reconciled tree (G, γ*) in PrIME format. The output reconciliation in tabular form may be used as a starting point for the construction of user-defined reconciliations. The reconciled tree output can be fed directly to primetv; the most parsimonious reconciled tree for the host tree and guest tree in Figure 2 is shown in Figure 3(d).
As a further convenience, the programs can be accessed using the web interface, http://prime.sbc.su.se/primetv. There, guest and host trees in PrIME format are put into text boxes. The user can then choose between creating an MPR and giving a reconciliation in tabular or PrIME format; if the MPR alternative is chosen, an additional leaf map (which for easy cases is deduced automatically) is needed as input to reconcile. In either case a primetv-generated illustration is presented along with options for layout and color changes. The computed reconciliation is also given in a text box to allow for modifications and immediate updates of the illustration.
Figures 1 and 3 also illustrate the customization options available in primetv. The layout can be altered in various ways: the tree in Figure 1 is drawn horizontally, ladderized right and with the subtrees positioned asymmetrically, while the tree in Figure 3(a) is drawn vertically, ladderized right and with subtrees distributed symmetrically. In these figures, the host tree edges are scaled in relation to their edge times; this scaling, however, can be removed as in Figure 3(d). It is also possible to remove either the guest tree, as in Figure 3(c), or the host tree, as in Figure 3(b). For this last option no host tree needs to be given; thus primetv can be used to illustrate any tree, i.e., not only reconciled trees. Colors of guest and host tree vertices and edges are highly customizable, and examples are shown in Figures 1 and 3. Some annotation can also be customized. The time annotation of the host tree can be shown as divergence times on a time axis, as in Figure 3(a), or as the time intervals associated with host tree edges, as in Figure 1. As shown in Figure 3(b) and 3(c), vertex annotation with identity numbers assigned by PrIME can be shown. Further options not shown include scaling of fonts and change of page size or bitmap size. The result can be shown in an X11 window or saved to file in a number of formats; see section Implementation.
Our tool primetv is a simple and compact computer program dedicated to visualizing reconciled trees. It is the first tool to visualize general reconciliations and not only MPRs. primetv visualizes reconciled trees as trees within trees. This type of illustration clarifies and accentuates the underlying structure in a reconciled tree, for example the impact of a species tree on a gene family phylogeny or how parasite evolution has followed its host species. Many visualization attributes, such as colors and layout parameters, can be adjusted according to the user's needs, thus making it possible to adapt the style of the illustration to the style of a presentation. The range of output formats makes adaptation easier as well.
These attributes make primetv a powerful tool for illustrations of gene family histories, orthology analysis, biogeography and co-speciation studies, especially since primetv allows for alternative reconciliations. primetv offers output formats suitable for manuscript preparation, immediate inspection, web applications, and presentations. The clarity and intuitiveness of trees-within-trees illustrations also make them ideal for pedagogic illustrations of the above-mentioned processes in an educational setting.
Two programs distributed jointly with primetv simplify input preparation by removing the need to create input by hand. The primary user interface of primetv is the command line; hence it is easy to include it in scripts, allowing automatic visualizations for large data sets. For the casual user, the web interface provides a simple GUI and also avoids the need for installing software.
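The least-common-ancestor mapping that underlies most parsimonious reconciliation can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the C++ code of reconcile: trees are given as plain dictionaries rather than Newick strings, the names naive_lca and lca_reconcile and the example data are hypothetical, and the ancestor-walking least-common-ancestor routine is just one straightforward way to obtain a simple, non-optimized variant. Each guest leaf is placed on its host leaf via the leaf map, each internal guest vertex is placed on the least common ancestor of its children's host vertices, and a guest vertex that lands on the same host vertex as one of its children is marked as a duplication.

```python
def ancestors(parent, v):
    """Return v followed by all its ancestors, up to the root."""
    path = [v]
    while parent.get(v) is not None:
        v = parent[v]
        path.append(v)
    return path


def naive_lca(parent, a, b):
    """Least common ancestor found by walking both ancestor paths."""
    seen = set(ancestors(parent, a))
    for v in ancestors(parent, b):
        if v in seen:
            return v
    raise ValueError("vertices are not in the same tree")


def lca_reconcile(guest_children, guest_root, host_parent, leaf_map):
    """Map every guest vertex to a host vertex and collect duplications."""
    sigma = {}
    duplications = set()

    def visit(u):
        kids = guest_children.get(u, [])
        if not kids:
            sigma[u] = leaf_map[u]  # leaves follow the leaf map
            return
        for c in kids:
            visit(c)
        m = sigma[kids[0]]
        for c in kids[1:]:
            m = naive_lca(host_parent, m, sigma[c])
        sigma[u] = m
        # same host vertex as a child: the event must be a duplication
        if any(sigma[c] == m for c in kids):
            duplications.add(u)

    visit(guest_root)
    return sigma, duplications


# Hypothetical host tree ((A,B)X,C)R given as child -> parent.
host_parent = {"A": "X", "B": "X", "X": "R", "C": "R", "R": None}
# Hypothetical guest tree ((a1,b1)g1,(a2,c1)g2)g0 given as parent -> children.
guest_children = {"g0": ["g1", "g2"], "g1": ["a1", "b1"], "g2": ["a2", "c1"]}
# Leaf map: guest leaf name -> host leaf name.
leaf_map = {"a1": "A", "b1": "B", "a2": "A", "c1": "C"}

sigma, duplications = lca_reconcile(guest_children, "g0", host_parent, leaf_map)
print(sigma)
print(duplications)
```

Here g1 maps to X and g2 to R as speciations, while the guest root g0 maps to R together with its child g2 and is therefore the only duplication.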
Availability and requirements
Project name: primetv
Project home page: http://prime.sbc.su.se/primetv
Release 1.5.3 of primetv is also distributed with this manuscript [see Additional file 1].
Operating systems: primetv has been tested on Linux systems (Red Hat 9 and Fedora Core 3), Mac OS X (10.3.9 and 10.4.5) and Microsoft Windows (cygwin 1.5.18-1).
Programming languages: The software is a command-line tool written in C++. We have compiled primetv using the GNU C and C++ compilers, both versions 3.4, 4.0.01 (Mac OS X), and 4.1 (Linux).
The casual user is recommended to use our web server which, besides being platform independent, provides an easy-to-use user interface.
Other requirements: The graphics interface of primetv makes use of the GNU plotutils package. A couple of adjustments to plotutils are needed to compile on Linux using a modern C++ compiler. The patched sources are available on our website, where we have also put executable binaries. For Mac OS X and Microsoft Windows, patched versions of plotutils are available, e.g., through MacPorts and Cygwin, respectively.
License: The software is distributed under the GNU General Public License.
The authors wish to express their thanks to Rod Page and two anonymous reviewers for valuable comments on the manuscript. This work was funded by the Foundation for Strategic Research (SSF), the Swedish Research Council and Carl Tryggers Stiftelse.
- Goodman M, Czelusniak J, Moore G, Romero-Herrera A, Matsuda G: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms. Syst Zool 1979, 28: 132–163. 10.2307/2412519
- Page RD: COMPONENT: Tree comparison software for Microsoft Windows, Version 2.0. [http://taxonomy.zoology.gla.ac.uk/rod/cpw.html]
- Page RD: GeneTree: comparing gene and species phylogenies using reconciled trees. Bioinformatics 1998, 14(9):819–820. 10.1093/bioinformatics/14.9.819
- Charleston MA, Page RD: TreeMap. 2000. [http://taxonomy.zoology.gla.ac.uk/rod/treemap.html]
- Maddison WP, Maddison DR: Mesquite: a modular system for evolutionary analysis, version 1.06. 2005. [http://mesquiteproject.org]
- Zmasek CM, Eddy SR: ATV: Display and manipulation of annotated phylogenetic trees. Bioinformatics 2001, 17(4):383–384. 10.1093/bioinformatics/17.4.383
- Durand D, Halldorsson B, Vernot B: A hybrid micro-macroevolutionary approach to gene tree reconstruction. J Comput Biol 2006, 13(2):320–335. 10.1089/cmb.2006.13.320
- Dufayard J, Duret L, Penel S, Gouy M, Rechenmann F, Perriere G: Tree pattern matching in phylogenetic trees: automatic search for orthologs or paralogs in homologous gene sequence databases. Bioinformatics 2005, 21(11):2596–2603. 10.1093/bioinformatics/bti325
- Bélanger D, Hallet M: DupLoss: A Java Module for Computing and Drawing Gene Duplication-Loss Events. [http://www.sable.mcgill.ca/~dbelan2/duploss/applet/duploss.html]
- Lott PL, Mundry M, Sassenberg C, Lorkowski S, Fuellen G: Simplifying gene trees for easier comprehension. BMC Bioinformatics 2006, 7: 231. 10.1186/1471-2105-7-231
- Chevenet F, Brun C, Banuls AL, Jacq B, Christen R: TreeDyn: towards dynamic graphics and annotations for analyses of trees. BMC Bioinformatics 2006, 7: 439. 10.1186/1471-2105-7-439
- Arvestad L, Berglund AC, Lagergren J, Sennblad B: Bayesian gene/species tree reconciliation and orthology analysis using MCMC. Bioinformatics/ISMB'03 2003, 19: i7-i15.
- Arvestad L, Berglund AC, Lagergren J, Sennblad B: Gene tree reconstruction and orthology analysis based on an integrated model for duplications and sequence evolution. Proc of RECOMB 2004, 326–335.
- Felsenstein J, James Archie J, Day WHE, Maddison W, Meacham C, Rohlf FJ, Swofford D: The Newick tree format. 1986. [http://evolution.genetics.washington.edu/phylip/newicktree.html]
- Maddison DR, Swofford DL, Maddison WP: NEXUS: An extensible file format for systematic information. Syst Biol 1997, 46(4):590–621. 10.2307/2413497
- Zmasek CM: NHX – New Hampshire eXtended. 2001. [http://www.phylogenomics.us/forester/NHX.html]
- Zhang L: On a Mirkin-Muchnik-Smith conjecture for comparing molecular phylogenies. J Comput Biol 1997, 4(2):177–187.
- GNU Plotutils [http://www.gnu.org/software/plotutils]
- Inkscape: Open Source Scalable Vector Graphics Editor [http://www.inkscape.org]
- Sutanthavibul S, Smith BV, King P: XFIG Drawing Program for the X Window System. [http://www.xfig.org]
- Arnason U, Janke A: Mitogenomic analyses of eutherian relationships. Cytogenet Genome Res 2002, 96(1–4):20–32. 10.1159/000063023
- Nei M, Gu X, Sitnikova T: Evolution by the birth-and-death process in multigene families of vertebrate immune system. Proc Natl Acad Sci USA 1997, 94(15):7799–7806. 10.1073/pnas.94.15.7799
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.