Importing data
TreeGraph 2 can read trees in Newick or Nexus format (including additional annotations in comments specified by BEAST [13]) as well as phyloXML tree descriptions [14] and can furthermore import annotations from text files generated e.g. with a spreadsheet application. Besides that, TreeGraph 2 facilitates combining information from different phylogenetic analyses of a given dataset. This is particularly useful e.g. in the study of extensive gene family datasets with large sets of terminals. The following sections describe this feature in greater detail.
Mapping statistical support onto congruent nodes
For each branch of a tree opened in TreeGraph 2, the corresponding support from other trees can be mapped whenever the topology defined by the current branch is present in them. Each of these other trees may represent the result from a different analytical approach or different data partition, and support values from these trees are assigned their own label ID by which they are grouped and amenable to future formatting or editing operations. Thus, all support values that stem from a particular analysis can be individually formatted e.g. by their relative position on the branch and/or their font and style.
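The mapping of congruent support can be illustrated with a minimal sketch. It is not TreeGraph 2's implementation: it assumes rooted trees (so that each branch is identified by the set of terminals below it, a simplification of the two-subtree view used in this paper), and the tree encoding, the function names and the label ID "posterior" are all hypothetical.

```python
def collect_clades(node, table):
    """Return the terminal set below `node` and record each internal branch in `table`.

    Assumed toy encoding: a terminal is a string, an internal node is a tuple
    (support, [children]); `support` may be None.
    """
    if isinstance(node, str):
        return frozenset([node])
    support, children = node
    terminals = frozenset().union(*(collect_clades(child, table) for child in children))
    table[terminals] = support
    return terminals


def map_congruent_support(target, others):
    """For every branch of `target`, collect support from identical clades in `others`.

    `others` maps a label ID (one per analysis) to a tree on the same terminals.
    """
    target_clades = {}
    collect_clades(target, target_clades)
    mapped = {clade: {} for clade in target_clades}
    for label_id, tree in others.items():
        table = {}
        collect_clades(tree, table)
        for clade in target_clades:
            # Only branches whose topology is present in the other tree receive a value.
            if table.get(clade) is not None:
                mapped[clade][label_id] = table[clade]
    return mapped


# Hypothetical example: bootstrap values in tree1, posterior probabilities in tree2.
tree1 = (None, [(95, ["A", "B"]), (80, ["C", (70, ["D", "E"])])])
tree2 = (None, [(1.0, ["A", "B"]), (0.6, ["E", (0.9, ["C", "D"])])])
mapped = map_congruent_support(tree1, {"posterior": tree2})
```

In this example the branches leading to (A, B) and (C, D, E) receive a value under the label ID "posterior", whereas the branch leading to (D, E) receives none because that clade is absent from tree2.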
Finding conflicting nodes and mapping contradictory support
In some studies, not only was the support from different analyses mapped onto the branches, but the strongest support for a contradictory topology was also determined by eye [15, 16].
TreeGraph 2 uses the following algorithm to automate this (for a better understanding it should be kept in mind that each branch splits a tree into exactly two subtrees).
Let tree1 specify the topology onto which contradictory support from other trees should be mapped (example in Figure 1a). For a given branch branch1 in tree1, the maximum support for a conflicting branch branch2 from another tree tree2 (example in Figure 1b) can be found as follows.
1. Find the branch2 which defines a subtree subtree2 with the smallest number of terminals that contains all leaves of a subtree subtree1 defined by branch1.
2. Inside subtree2, find all branches that define a subtree which is, on the one hand, fully enclosed by subtree2 and, on the other hand, contains at least one terminal which is also part of subtree1 as well as at least one terminal which is not.
3. The highest support value in the set of these branches is added as a conflicting value onto branch1.
This highest conflicting support value can be distinguished from congruent values by user-specified formats, e.g. brackets, asterisks or different colors (see example in Figure 1).
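The three steps can be sketched as follows. This is an illustration under stated assumptions rather than the program's actual code: trees are treated as rooted, both trees contain the same terminals, "fully enclosed" is read as strictly smaller than subtree2, and the tree encoding and function names are hypothetical.

```python
def collect_clades(node, table):
    """Record {terminal set: support} for every internal branch below `node`.

    Assumed toy encoding: a terminal is a string, an internal node is a tuple
    (support, [children]); `support` may be None.
    """
    if isinstance(node, str):
        return frozenset([node])
    support, children = node
    terminals = frozenset().union(*(collect_clades(child, table) for child in children))
    table[terminals] = support
    return terminals


def strongest_conflict(subtree1, tree2):
    """Return the highest support in tree2 that contradicts the clade `subtree1`."""
    table = {}
    collect_clades(tree2, table)
    # Step 1: smallest subtree of tree2 that contains all terminals of subtree1.
    subtree2 = min((clade for clade in table if subtree1 <= clade), key=len)
    # Step 2: branches strictly inside subtree2 that contain at least one
    # terminal of subtree1 and at least one terminal outside of it.
    conflicting = [
        support
        for clade, support in table.items()
        if clade < subtree2 and clade & subtree1 and clade - subtree1
        and support is not None
    ]
    # Step 3: the highest of these values is the conflicting support.
    return max(conflicting, default=None)


# Hypothetical example: (D, E) is a clade in tree1 but contradicted by (C, D) in tree2.
tree2 = (None, [(1.0, ["A", "B"]), (0.6, ["E", (0.9, ["C", "D"])])])
conflict_de = strongest_conflict(frozenset("DE"), tree2)
conflict_ab = strongest_conflict(frozenset("AB"), tree2)
```

Here the clade (D, E) is contradicted by the branch leading to (C, D), so its conflicting value is 0.9, while (A, B) is present in tree2 and therefore has no conflicting value.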
Editing and formatting capabilities
The program features versatile editing and formatting options, such as automatically setting branch widths or colors according to the value of any of the unlimited number of variables that can be assigned to each node or branch.
Editing of node/branch data
Node/branch data imported from spreadsheets or other trees (as described above) can be copied from and to other internal variables, be kept invisible or set visible and then be freely formatted (individually or across the whole tree), filtered according to their values or calculated from each other using an integrated mathematical expression parser which can access all node/branch data columns. Figure 2 shows a screenshot displaying a tree and its corresponding data table.
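The idea of calculating one data column from others can be illustrated with a small evaluator. The expression syntax of TreeGraph 2 is not described here, so the sketch below simply assumes plain arithmetic in which bare names refer to data columns; the function name and the column names "bootstrap" and "posterior" are hypothetical.

```python
import ast
import operator

# Arithmetic operators supported by this sketch.
_OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(expression, row):
    """Evaluate an arithmetic expression; bare names are looked up in `row`."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return row[node.id]  # value of the referenced data column
        if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
            return _OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported element in expression")

    return walk(ast.parse(expression, mode="eval"))


# One row of a hypothetical node/branch data table.
row = {"bootstrap": 95, "posterior": 1.0}
row["mean_support"] = evaluate("(bootstrap / 100 + posterior) / 2", row)
```

Applying the same expression to every row of the table fills a new column, which can then be formatted or filtered like any imported one.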
Editing operations
Beyond typical editing operations such as tree rerooting and ladderizing or moving and collapsing of nodes, whole clades can be copied or cut out and placed into new empty files or inserted (along with all node/branch data) into other trees. Since nodes can also be manually added, whole trees can quickly be manually constructed starting from an empty file.
The editing operations are facilitated by versatile additive selection options that allow selecting many elements in a tree for subsequent formatting with just a few clicks. Additionally, every operation applied to an opened tree can easily be undone or redone using the undo function.
Searching, replacing and translating tree leaf names
Searching and replacing is possible across all node/branch data columns (including taxon names and node labels).
More restrictive alignment file formats do not allow lengthy taxon names, so names get truncated. In other cases, the often clumsy taxon or lab IDs used during a study survive through the final alignment, the phylogenetic dataset and the trees constructed from it, until they need to be adjusted for the final tree to be presented in a paper. TreeGraph 2 can be requested to apply a translation table to use "cleaned" taxon names for the final output. This translation table can be constructed easily with the help of the data export feature and any text editor or spreadsheet program. Furthermore, the lab IDs (old terminal names) can be saved in a hidden data field so that the terminals can still be identified by these lab IDs and additional support values can be added later on.
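A translation table of this kind can be pictured as a two-column text file. The following sketch assumes a tab-separated layout (old name, new name) and a hypothetical hidden column called "lab_id"; neither the file layout nor the names reflect TreeGraph 2's actual format.

```python
import csv
import io

# Hypothetical translation table: lab ID <TAB> cleaned taxon name.
table_text = "Ab_017\tArabidopsis thaliana\nOs_203\tOryza sativa\n"
translation = dict(csv.reader(io.StringIO(table_text), delimiter="\t"))


def translate_names(terminals, translation, backup_column="lab_id"):
    """Rename terminals and keep the old name in a separate (hidden) data field."""
    for terminal in terminals:
        old_name = terminal["name"]
        if old_name in translation:
            terminal[backup_column] = old_name  # still usable to match other trees
            terminal["name"] = translation[old_name]
    return terminals


terminals = [{"name": "Ab_017"}, {"name": "Os_203"}, {"name": "Zm_001"}]
translate_names(terminals, translation)
```

Terminals without an entry in the table (here "Zm_001") are left unchanged, and the preserved lab IDs still allow matching against trees that use the original names.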
Formatting document elements
The application allows free formatting of the line and text formats of all document elements such as nodes, branches or legends (which mark a group of terminals). Additionally, branches can carry an unlimited number of textual annotations (text labels) or icons (icon labels), the color, text style or size of which can also be freely formatted (see Figure 3). All distance values in TreeGraph 2 (e.g. line width or text height) are specified in millimeters or DTP points (1/72 inch). This feature, along with the image export function (see below), allows the user to design trees in exactly the size they should appear in print or in the exported graphic file. In addition, TreeGraph 2 offers a feature to proportionally rescale all elements of a subtree or the whole document.
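The relation between the two units follows from 1 inch = 25.4 mm and 1 DTP point = 1/72 inch. The short sketch below shows the conversion and a proportional rescaling; the function names and the format keys are hypothetical and only serve as an illustration.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72


def mm_to_points(mm):
    """Convert millimeters to DTP points (1 pt = 1/72 inch)."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


def points_to_mm(points):
    """Convert DTP points to millimeters."""
    return points * MM_PER_INCH / POINTS_PER_INCH


def rescale(formats, factor):
    """Proportionally rescale all distance values of an element."""
    return {name: value * factor for name, value in formats.items()}


# Hypothetical formats of one branch, given in millimeters.
branch_formats = {"line_width": 0.3, "text_height": 4.0}
halved = rescale(branch_formats, 0.5)
text_height_pt = mm_to_points(branch_formats["text_height"])
```

A text height of 4 mm thus corresponds to roughly 11.3 pt, and rescaling by a constant factor keeps the proportions between line widths and text heights intact.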
Automatically setting line width, text height, and color
TreeGraph 2 allows automatically setting all formats (e.g. branch widths, branch colors, text colors, text heights, icon sizes) according to the value of a chosen node/branch data column. This provides a very intuitive way to graphically present the relative magnitude of, e.g., certain types of support or rates assigned to branches (see Figures 2 and 3 for examples).
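One simple way to derive a format from a data value is linear interpolation between a minimum and a maximum. The sketch below assumes such a linear mapping with clamping; whether TreeGraph 2 uses exactly this scheme is not stated here, and all names and default colors are hypothetical.

```python
def linear_map(value, value_min, value_max, out_min, out_max):
    """Map `value` linearly from [value_min, value_max] to [out_min, out_max]."""
    if value_max == value_min:
        return out_min
    fraction = (value - value_min) / (value_max - value_min)
    fraction = min(1.0, max(0.0, fraction))  # clamp values outside the range
    return out_min + fraction * (out_max - out_min)


def color_for(value, value_min, value_max, low=(0, 0, 255), high=(255, 0, 0)):
    """Interpolate an RGB color between `low` and `high` channel by channel."""
    return tuple(
        round(linear_map(value, value_min, value_max, lo, hi))
        for lo, hi in zip(low, high)
    )


# Support values between 50 and 100 mapped to branch widths of 0.2 to 1.0 mm.
width = linear_map(75, 50, 100, 0.2, 1.0)
weakest = color_for(50, 50, 100)
strongest = color_for(100, 50, 100)
```

With these assumed settings, a support value of 75 yields a branch width of about 0.6 mm, and the colors run from blue for the weakest to red for the strongest support.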
Different view modes
All editing operations are facilitated by a very convenient way to zoom in and out, fitting the zoom to the window size, and a miniature overview (Figure 2) for navigating large trees.
When applicable (i.e., given that branch length information is provided), trees can be displayed as phylogram or chronogram (Figure 3), with multiple options for adjusting a scale bar (to indicate e.g. time spans in chronograms, rates in ratograms, or branch lengths in phylograms).
Exporting to graphic formats and printing
TreeGraph 2 outputs various vector and (anti-aliased) pixel graphic formats. Among these are SVG, PDF, or PNG, supporting transparent background where this applies. Using the graphic export function of TreeGraph 2, the most adequate graphic formats, resolutions, and image sizes for manuscripts, presentation slides, or web pages, respectively, can be specified.
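For pixel formats, the image size in pixels follows from the document size in millimeters and the chosen resolution. The arithmetic is shown below as a small sketch; the function name and the example dimensions are hypothetical.

```python
MM_PER_INCH = 25.4


def pixel_size(width_mm, height_mm, dpi):
    """Return the (width, height) in pixels of a document exported at `dpi`."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


# A hypothetical single-column figure of 84 mm x 120 mm at 300 dpi.
size = pixel_size(84, 120, 300)
```

A figure of 84 mm by 120 mm exported at 300 dpi therefore measures 992 by 1417 pixels.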
Help
An extensive, continuously updated online help system is available under http://treegraph.bioinfweb.info/Help and can also be accessed (in a context-dependent manner) from within the program. Additionally, several video tutorials are offered there to get started with TreeGraph 2 (see http://treegraph.bioinfweb.info/Help/wiki/Tutorial:Main_page).
Comparison to previous software
To date, a variety of tree visualization tools have been released, among which ATV [17], Dendroscope [18], FigTree (the tree editor accompanying BEAST), the MEGA tree explorer [19], Mesquite [20], PhyloWidget [21], TreeDyn [22] and TreeView [23] may be the most widely distributed. In spite of their great usefulness for the purposes they have been developed for, none of these software packages allows simultaneously visualizing, freely editing, properly formatting and exporting or printing trees with heavily annotated nodes (see Figure 4). Although TreeDyn is able to display multiple annotations on one node it is not able to automatically position them in a ready-to-publish way or to combine them from different analyses. FigTree is able to read the special Newick annotations generated by BEAST and therefore can also store several sets of annotations but only offers a limited number of ways to display them (like branch lengths or one textual annotation per branch). In contrast TreeGraph 2 (which is also able to read BEAST annotations) can show a nearly unlimited number of textual annotations at a time as well as display data in form of branch widths, line colors or many other formats.
Besides importing additional annotations from tables (which TreeDyn also offers), TreeGraph 2 is the only editor which can combine annotations (e.g. statistical support from different analysis methods) from different trees (with the same set of terminals). The information gained this way has a topological component and can therefore not simply be obtained from data in a table.
A feature closely related to the ones mentioned above is the ability to calculate numeric or textual annotations by mathematical expressions which can reference other annotations (see above). To date, a similar functionality is not offered by any other tree editor.
TreeGraph 2 features a multitude of format options which can be applied to every tree element (e.g. branches, nodes or labels) independently. As Figure 4 shows, no other tree editor currently provides functionalities like element-specific formats for all types of tree elements in combination with advanced selection options or collision-free positioning of the whole tree. Moreover, none of the editors that offer at least some of TreeGraph 2's formatting options allow the user to precisely determine the print layout. In contrast to most other editors, our program offers context help buttons (which link to the online help system) everywhere in the program, making it very easy for new users to get started.
It should be noted, however, that TreeGraph 2 has been optimized as a tree editor for producing high quality tree figures and not as a viewer for trees with many thousands of taxa which could never be depicted completely in a publication or presentation. The latter is a specialty of software specifically designed for this purpose, such as Dendroscope [18] (Figure 4).
Since TreeGraph 2 is written in Java and is able to read and write all its supported formats directly from and to streams, it would be possible to use it in a web application either on the server (e.g. with Apache Tomcat) or on the client side (e.g. as a Java applet or a Java Web Start application) to display and manipulate trees. As yet, our application would have to be integrated manually into such a web application by its programmer, and we do not yet offer a ready-to-use plug-in solution for this. We do, however, offer full documentation of our source code (including its interfaces) to facilitate such a web integration.