3D Network exploration and visualisation for lifespan data
BMC Bioinformatics volume 19, Article number: 390 (2018)
The Ageing Factor Database AgeFactDB contains a large number of lifespan observations for ageing-related factors like genes, chemical compounds, and other factors such as dietary restriction in different organisms. These data provide quantitative information on the effect of ageing factors from genetic interventions or manipulations of lifespan. Analysis strategies beyond common static database queries are highly desirable for the inspection of complex relationships between AgeFactDB data sets. 3D visualisation can be extremely valuable for advanced data exploration.
Interactive 3D network visualisation allows complex database queries to be supplemented by a visually guided exploration process. The JANet interface provides deeper insights into lifespan data patterns that are not accessible by common database queries alone. These concepts can be utilised in many other research fields.
In ageing research, the lifespan of an organism is an indicator for determining factors that play a role in this process. These ageing factors (AFs) can be genes, chemical compounds or other factors like dietary restriction. Usually, they are examined under different experimental conditions in model organisms like the worm (Caenorhabditis elegans), yeast (Saccharomyces cerevisiae), fruit fly (Drosophila melanogaster), mouse (Mus musculus), and many others. The results of these experiments may be extracted from the scientific literature in the form of lifespan observations (LOs). They describe the effect of interventions at AFs on the lifespan of the model organism.
In a lifespan experiment, a single AF or a combination of two or more AFs can be involved. The intervention can be different for each AF. For example, a knock-out of gene A could be coupled to the overexpression of gene B. Also, AFs can be involved in different experiments together with different other AFs. For example, some genes like daf-2 and daf-16 from C. elegans were tested in several hundred AF combinations and various interventions, e.g. [1, 2]. The effects on the lifespan of the organism may also differ drastically.
3D Network visualisation
Dealing with this heterogeneity and the complexity of the relationships between the LOs is a major challenge in gaining a comprehensive overview or in generating an integrative model. Visualisation techniques can aid in analysing this complex data [3–5] and also help to generate new hypotheses not only on a quantitative level.
In line with these supportive approaches on sets, network visualisations can assist in organising vast amounts of data according to known relationships or properties. 3D visualisation can help researchers on special occasions in this task, although 2D representations should generally be preferred. In the lifespan network visualisation context we present here, 3D networks outperform their 2D counterparts regarding compactness and layout. While 3D embeddings allow a compact representation of hundreds or thousands of nodes, 2D embeddings result in a significant expansion, increasing navigation costs (see Additional file 1: Figure S1). Furthermore, it is known that any finite graph can be embedded into a three-dimensional space such that no pair of edges crosses. As LO networks cannot be guaranteed to be planar graphs, 2D embeddings might also result in intersecting edges, while in 3D these non-intersecting representations exist.
Additionally, psychophysical experiments provide evidence that the human primary visual processing system is specifically designed to process 3D information. Nakayama et al. indicate the parallel processing of attribute information like colour from different depth planes. Enns et al. give evidence that 3D objects with lighting-related depth cues accelerate visual search in comparison to 2D objects. Xu et al. report an increased capacity of the visual short-term memory (VSTM) when objects are distributed between different layers.
The benefits of 3D network representations come at the risk of visual occlusion and possible perspective distortion. Due to the layered structure of 3D representations, elements in the foreground can mask elements in the background. This can be overcome by interactive graph manipulation such as zooming, rotating, panning and filtering.
Perspective distortions might occur when object sizes and distances are modified for both data visualisation and perspective depth effects. They can be avoided by ignoring depth calculations. Similar risks, such as diminished legibility of text, can be avoided by excluding these objects from other perspective transformations (i.e. rotations).
Overall, 3D visualisation has many advantages. Its disadvantages can be mitigated via interactive graph exploration and manipulation.
The public JenAge Ageing Factor Database AgeFactDB contains LOs for AFs. Table 1 provides an overview of the number of AFs and observations by type (lifespan, other ageing phenotypes, homology analysis) and by their ageing relevance evidence type (experimental, computational).
Currently, the core of AgeFactDB is a collection of about 2600 genes for which LOs and other experimental evidence were gathered from experiments with different model organisms (experimental AFs). This set was extended by about 14,000 genes gained in a homology analysis using data from the homology database HomoloGene (putative AFs).
Overall, AgeFactDB contains about 9500 observations. About 1000 are free-text descriptions of ageing phenotypes, and about 7000 are structured LOs. In addition, there are about 1500 homology analysis observations, each representing a homology group.
As an example for structured LOs, Table 2 shows observation OB_000094 involving the genes FOB1, SIR2 and TOR1 from S. cerevisiae. The deletion of these three genes resulted in a 33.5% increase in lifespan. Note that there are two different types of lifespan defined for yeast: chronological lifespan and replicative lifespan. The chronological lifespan is the number of days for which a specific yeast cell lives. The replicative lifespan is a measure of the total number of daughter cells generated by a mother cell.
3D Network viewer
In the following, we present different types of networks and visualisation strategies for LO data. We show the benefit of augmentation of AF/LO networks by annotation nodes compared to AF annotation. Annotation nodes can be, for example, gene ontology (GO) term nodes and KEGG pathway nodes.
The primary interface of JANet is structured into eight tabs which provide access to the main viewer for graph visualisation and the different options for network generation (Fig. 2a). Networks are generated according to three principles focusing on overviews of AFs, the inspection of individual AFs and the interaction of AFs with user-specified genes of interest. The interface additionally provides a statistical summary of the current database of AgeFactDB, help pages and the imprint. Generally, actions like node colouring and rendering work in the viewer on the currently selected nodes, enabling graph manipulations of node properties individually to create custom views.
The Viewer tab provides the 3D graph representation of a chosen network and basic operations for graph editing (Fig. 2b). JANet comprises various options for customising the graph design and manipulating the graph layout, for example changing the node colour and size. Network nodes can be marked with key information like the name or the lifespan change value. By hovering above a node with the mouse pointer, more detailed information about the node is given in the upper left corner of the viewer. The node itself is highlighted by a light blue halo. A white halo is shown when a node is selected. By selecting a region of interest (bounding box), sets of nodes can be selected or highlighted. The network can be restricted to the selected node and its direct neighbours. The network viewer provides alternative view options. It can be expanded to a fullscreen mode or sent to an independent browser tab/window. In this way, several networks can be analysed simultaneously. For a better 3D impression the viewer also offers a stereo view option.
Rather holistic networks on all AFs of the AgeFactDB are generated in the tab Overview networks. The selection of AFs may be stratified according to the type of AF and the corresponding species. The tab also provides options for augmenting the graph with different kinds of annotation nodes. For example, they can provide additional information about the allele types or species. The constructed graph can afterwards be manipulated within the network viewer. Figure 2a gives an example of an overview network. It contains all AFs having LOs, all LOs, and the corresponding species.
Ageing factor networks
The networks generated by the tab Ageing Factor Networks are focused on one particular AF which is embedded either in its direct or complete neighbourhood. The functionality of this tab is comparable to the functionality of the tab for overview networks. Additionally, single AFs can be selected from a dropdown list.
Gene list selection / Network
JANet can be used as an interface for an integrative analysis combining the existing database of AgeFactDB and a user-specified list of genes of interest. Users can provide their list in the tab Import Gene List (Fig. 2c). A query to AgeFactDB returns a list of exact matches to ageing factors with experimental evidence and putative ageing factors. In the tab Gene List Network these results can be screened and edited (Fig. 2d). This tab also starts the network generation.
Network layout algorithms
Force layout algorithms position the nodes of a graph in two-dimensional or three-dimensional space so that all edges are of more or less equal length and as few crossing edges as possible are produced. Repulsive or attractive forces are assigned to edges and nodes based on their relative position. The layout is generated stepwise by minimising the energy of this system. Within JANet we utilise three different algorithms.
The 3d-forced-layout (3dFL) algorithm module implements a velocity Verlet numerical integrator for simulating physical forces on nodes. Velocity Verlet is a numerical method used to integrate Newton’s equations of motion. In contrast, the Fruchterman-Reingold (FR) algorithm focuses on an even distribution of vertices, a minimal number of edge crossings and uniform edge length. The fast multipole multilevel method (FMMM) is especially designed for separating substructures in large graphs.
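To illustrate the general idea of a force layout integrated with velocity Verlet, the following is a minimal sketch and not the 3dFL implementation used in JANet. The force model (inverse-square repulsion between all node pairs, linear springs along edges), the unit node mass, the velocity damping and all parameter values are assumptions chosen only for illustration; the function name verlet_layout_3d is hypothetical.

```python
import math
import random


def verlet_layout_3d(nodes, edges, steps=300, dt=0.02, spring=1.0,
                     rest_length=1.0, repulsion=0.5, damping=0.9, seed=0):
    """Toy 3D force layout using velocity Verlet integration (unit mass)."""
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1.0, 1.0) for _ in range(3)] for n in nodes}
    vel = {n: [0.0, 0.0, 0.0] for n in nodes}

    def forces():
        f = {n: [0.0, 0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes (assumed inverse-square law).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                d = [pos[a][k] - pos[b][k] for k in range(3)]
                dist = max(math.sqrt(sum(x * x for x in d)), 0.05)
                mag = repulsion / (dist * dist)
                for k in range(3):
                    f[a][k] += mag * d[k] / dist
                    f[b][k] -= mag * d[k] / dist
        # Attraction along edges (assumed linear spring with a rest length).
        for a, b in edges:
            d = [pos[b][k] - pos[a][k] for k in range(3)]
            dist = max(math.sqrt(sum(x * x for x in d)), 0.05)
            mag = spring * (dist - rest_length)
            for k in range(3):
                f[a][k] += mag * d[k] / dist
                f[b][k] -= mag * d[k] / dist
        return f

    acc = forces()
    for _ in range(steps):
        # Position update: x += v*dt + 0.5*a*dt^2
        for n in nodes:
            for k in range(3):
                pos[n][k] += vel[n][k] * dt + 0.5 * acc[n][k] * dt * dt
        new_acc = forces()
        # Velocity update with the mean of old and new acceleration.
        # The damping factor is an addition so that the layout settles.
        for n in nodes:
            for k in range(3):
                vel[n][k] = damping * (
                    vel[n][k] + 0.5 * (acc[n][k] + new_acc[n][k]) * dt)
        acc = new_acc
    return pos


# Small AF/LO example: one AF linked to three LOs, two further AFs.
nodes = ["AF1", "AF2", "AF3", "LO1", "LO2", "LO3"]
edges = [("AF1", "LO1"), ("AF1", "LO2"), ("AF1", "LO3"),
         ("AF2", "LO2"), ("AF3", "LO3")]
positions = verlet_layout_3d(nodes, edges)
```

The pairwise repulsion loop scales quadratically with the number of nodes, which is acceptable for a sketch but is one reason why multilevel methods such as FMMM are preferred for large graphs.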
LO network visualisation techniques
We use LO Networks as basic visualisation technique for the integration of different lifespan experiments. In these networks LOs and AFs are represented by nodes. Edges between both types of nodes link the AFs with the LOs in which they were involved. Multiple AFs can be connected to one LO and vice versa.
In order to facilitate the visual navigation, we utilise a colour code for the different node types. In the special case of LO nodes the colour indicates the direction of the lifespan change. An increased, decreased or unchanged lifespan is indicated by green, red or grey colour. The node size can be proportional to the quantitative lifespan change or other quantitative measures like the number of connections (degree). The edge colour is usually inherited from the node of a pair whose colour carries specific information for the other node. For AF/LO edges this is the LO node, whose colour usually indicates the direction of the lifespan change.
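The node and edge conventions described above can be sketched as a small data structure. This is an illustrative sketch only: the function name build_lo_network, the dictionary layout and the use of a single colour for all AF nodes are assumptions, not the JANet data model.

```python
# Colour code for LO nodes as described in the text.
LO_COLOURS = {"increased": "green", "decreased": "red", "unchanged": "grey"}


def build_lo_network(observations):
    """Build AF/LO nodes and edges from {lo_id: (af_list, direction)}."""
    nodes = {}
    edges = []
    for lo_id, (factors, direction) in observations.items():
        colour = LO_COLOURS[direction]
        nodes[lo_id] = {"type": "LO", "colour": colour, "degree": 0}
        for af in factors:
            # Assumption: all AFs get one colour here (magenta, as for genes).
            nodes.setdefault(af, {"type": "AF", "colour": "magenta",
                                  "degree": 0})
            # An AF/LO edge inherits the colour of its LO node.
            edges.append((af, lo_id, colour))
            nodes[af]["degree"] += 1
            nodes[lo_id]["degree"] += 1
    return nodes, edges


# Example data following the schematic network of Fig. 3a.
observations = {
    "LO1": (["AF1"], "decreased"),
    "LO2": (["AF1", "AF2"], "increased"),
    "LO3": (["AF1", "AF3"], "increased"),
}
net_nodes, net_edges = build_lo_network(observations)
```

The degree stored per node is one of the quantitative measures that could drive the node size.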
We present several visualisation techniques for LO networks (Fig. 3). In the “Results and discussion” section we present example networks, based on AgeFactDB data, to demonstrate the usage of these techniques.
For focusing on a specific AF only the direct neighbourhood of the AF is visualized (Fig. 3a). This includes the respective AF, all directly linked LOs, and all other AFs linked directly to these LOs.
This network type provides a compact view of the effects of all LOs in which an AF is involved directly.
In the network in Fig. 3a the AF in the focus is AF1. AF1 is linked to 3 LOs (LO1−LO3). AF2 is linked to LO2 because it was tested together with AF1 in the corresponding experiment. The same applies to AF3 and LO3. No further AFs were tested in any of the 3 LOs.
In LO1 the lifespan was decreased, indicated by the red node colour. In LO2 and LO3 the lifespan was increased, indicated by the green node colour.
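As a minimal sketch, the direct neighbourhood can be computed from a mapping of LOs to the AFs involved in them. The function name direct_neighbourhood and the dictionary representation are assumptions made for illustration; the example data follow the description of Fig. 3a.

```python
def direct_neighbourhood(lo_to_afs, focus):
    """Return (AFs, LOs) of the direct neighbourhood of the AF in focus."""
    # All LOs in which the AF in focus is involved.
    los = {lo for lo, afs in lo_to_afs.items() if focus in afs}
    # The AF in focus plus every AF tested together with it in these LOs.
    afs = {focus}
    for lo in los:
        afs.update(lo_to_afs[lo])
    return afs, los


fig3a = {
    "LO1": {"AF1"},
    "LO2": {"AF1", "AF2"},
    "LO3": {"AF1", "AF3"},
}
afs, los = direct_neighbourhood(fig3a, "AF1")
```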
The direct neighbourhood network is extended iteratively by including further AFs that were observed together with the neighbours of the AF in the focus (Fig. 3b). In this way the complete neighbourhood is included in the visualisation. The resulting network can be seen as the largest connected subgraph including the AF in the focus and all directly or indirectly connected LOs and AFs.
The complete neighbourhood network provides an overview on all experimentally analysed AF combinations and their effects on lifespan.
The direct neighbourhood network visualisation from Fig. 3a was expanded to the complete neighbourhood network visualisation in Fig. 3b. In the first expansion step LO4 and LO5 were added, linked to AF2 but not to AF1. Also AF4 was added, linked to LO5 together with AF2. In the second expansion step LO6 and AF5, linked also to LO6, were added. In the final expansion step LO7 and LO8 were added, both linked to AF5.
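The iterative expansion corresponds to collecting the connected component that contains the AF in focus. The following breadth-first search is a sketch under assumptions: the function name complete_neighbourhood is hypothetical, the link between LO6 and AF4 is inferred from the description of the second expansion step, and LO9/AF6 are invented here solely to show that unconnected observations are left out.

```python
from collections import deque


def complete_neighbourhood(lo_to_afs, focus):
    """Return (AFs, LOs) of the connected subgraph containing focus."""
    # Invert the mapping so that LOs can be looked up per AF.
    af_to_los = {}
    for lo, afs in lo_to_afs.items():
        for af in afs:
            af_to_los.setdefault(af, set()).add(lo)

    seen_afs, seen_los = {focus}, set()
    queue = deque([focus])
    while queue:
        af = queue.popleft()
        for lo in af_to_los.get(af, ()):
            if lo in seen_los:
                continue
            seen_los.add(lo)
            # Every AF of a newly reached LO is expanded in a later step.
            for other in lo_to_afs[lo]:
                if other not in seen_afs:
                    seen_afs.add(other)
                    queue.append(other)
    return seen_afs, seen_los


fig3b = {
    "LO1": {"AF1"}, "LO2": {"AF1", "AF2"}, "LO3": {"AF1", "AF3"},
    "LO4": {"AF2"}, "LO5": {"AF2", "AF4"},
    "LO6": {"AF4", "AF5"},          # AF4 link inferred from the text
    "LO7": {"AF5"}, "LO8": {"AF5"},
    "LO9": {"AF6"},                 # hypothetical, not connected to AF1
}
all_afs, all_los = complete_neighbourhood(fig3b, "AF1")
```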
Augmentation via annotation nodes
LO networks are augmented by additional information on either the AFs or the LOs or both. This is achieved by additional annotation nodes (ANNs). They provide the additional information directly in a visual form for all nodes at once. The augmented network visualisation in Fig. 3c shows an annotated version of the complete neighbourhood visualisation from Fig. 3b. The annotation node ANN1 was linked to the 3 AFs (AF2, AF4, AF5) and to the 4 LOs (LO4, LO5, LO7, LO8). The annotation node ANN2 was linked to the 3 AFs (AF1, AF3, AF4) and to the 4 LOs (LO1−LO3, LO6). This resulted in two groups around the annotation nodes ANN1 and ANN2, connected by the nodes AF4, LO2, and LO6.
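In terms of the graph, augmentation only adds nodes and edges. A minimal sketch, assuming an edge-list representation and a hypothetical helper named augment, using the ANN1/ANN2 links listed above (the base AF/LO edges follow the description of Fig. 3b, with the LO6 to AF4 link inferred):

```python
def augment(edges, annotations):
    """Return a new edge list with annotation-node edges appended."""
    augmented = list(edges)
    for ann, targets in annotations.items():
        for target in targets:
            augmented.append((ann, target))
    return augmented


base_edges = [
    ("AF1", "LO1"), ("AF1", "LO2"), ("AF2", "LO2"), ("AF1", "LO3"),
    ("AF3", "LO3"), ("AF2", "LO4"), ("AF2", "LO5"), ("AF4", "LO5"),
    ("AF4", "LO6"), ("AF5", "LO6"), ("AF5", "LO7"), ("AF5", "LO8"),
]
annotations = {
    "ANN1": ["AF2", "AF4", "AF5", "LO4", "LO5", "LO7", "LO8"],
    "ANN2": ["AF1", "AF3", "AF4", "LO1", "LO2", "LO3", "LO6"],
}
augmented_edges = augment(base_edges, annotations)

# Nodes annotated by both annotation nodes bridge the two groups directly.
shared = set(annotations["ANN1"]) & set(annotations["ANN2"])
```

LO2 and LO6 bridge the two groups indirectly, because each of them is linked to AFs from both groups.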
Data transfer between nodes
The data transfer between nodes is especially useful for reducing the complexity of a network by removing the nodes from which data was transferred. The transfer makes it possible to retain some information from the removed nodes.
Figure 3d shows a reduced version of the network from Fig. 3c. The collected qualitative lifespan change information from all LOs connected to an AF was transformed into a new colour for the AF. AF2 and AF3 are linked only to green LO nodes (indicating an increased lifespan) and got green as new colour. AF1, AF4, and AF5 are linked to 2 or 3 green LO nodes and 1 red LO node (in the latter case indicating a decreased lifespan). These mixed effects were transformed into orange as new node colour.
The transferred data can also be used to define a new node size, which may be for example proportional to the maximal observed lifespan change. This information can already be helpful even if the LOs are not removed.
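A sketch of such a transfer for the node size is given below. The function name transfer_max_change is hypothetical, the lifespan change values are invented for illustration, and taking the change with the largest absolute value as the "maximal" change is an assumption.

```python
def transfer_max_change(lo_to_afs, lo_change):
    """Per AF, keep the lifespan change with the largest magnitude."""
    af_max = {}
    for lo, afs in lo_to_afs.items():
        change = lo_change.get(lo)
        if change is None:          # LO without a recorded value (n/a)
            continue
        for af in afs:
            if af not in af_max or abs(change) > abs(af_max[af]):
                af_max[af] = change
    return af_max


lo_to_afs = {"LO1": ["AF1"], "LO2": ["AF1", "AF2"], "LO3": ["AF1", "AF3"]}
# Invented example values in percent, not taken from AgeFactDB.
lo_change = {"LO1": -10.0, "LO2": 25.0, "LO3": 40.0}
af_size = transfer_max_change(lo_to_afs, lo_change)
```

After the transfer, the LO nodes can be dropped from the displayed network while the AF nodes keep a summary of their effects.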
In contrast to genes, chemical compounds and other AFs can be linked to LOs of multiple species. This requires augmenting such networks with species nodes (SP) that are linked to the corresponding LOs. We propose to leave out the links between the AFs and the species nodes. This will result in a much clearer and less complex network view, grouping the network clearly according to the involved organisms.
Because of the importance of the species information in these networks and the additional restriction to one type of species links we defined Multi-species as a separate technique.
The multi-species network visualisation in Fig. 3e contains AF1 as multi-species AF. It is linked indirectly to all 3 species nodes (SP1−SP3) by 4 LOs (LO1−LO4). In LO1 also AF2 is involved, and in LO2 also AF3 is involved.
The layout shows 3 small groups centred around the multi-species node AF1.
Results and discussion
Following the basic description of the visualisation techniques and network types in the “Methods” section, we first present concrete examples for some of these techniques. We then present use cases showing how the techniques were applied with JANet to solve specific tasks.
LO network visualisation examples
We show examples for the application to lifespan data for S. cerevisiae and C. elegans. All LOs were taken from AgeFactDB.
Example 1: direct neighbourhood
The direct neighbourhood example described here is focused on the AF TOR1 from S. cerevisiae (Fig. 4a). The gene belongs to the TOR signalling pathway, which has been shown to regulate lifespan across multiple species (S. cerevisiae, C. elegans, D. melanogaster, and M. musculus), as part of the TORC1 complex. The direct neighbourhood consists of twenty LOs for TOR1 and five additional AFs (Dietary restriction, FOB1, GCN4, RPN4, SIR2) that are involved in these LOs.
The layout was calculated with the 3dFL algorithm. Different AF types are colour-coded: magenta indicates genes, dark purple indicates other factors, and light purple indicates chemical compounds (not present here). Each LO node is labelled with the lifespan change value, if available (n/a indicates a missing value).
Hovering over an LO node in the viewer provides additional information on the lifespan experiment. For this particular network all experiments were designed for the inactivation of TOR1. Those 15 LOs that are not connected to any other AF show a mean lifespan increase of 19% up to 56% for the inactivation of TOR1. In combination with the inactivation of the genes FOB1, GCN4 or SIR2, or with dietary restriction, the lifespan was increased in the range of 19% up to 67%. For an inactivation of both RPN4 and TOR1, a lifespan decline of 42% was observed.
By focusing on those LOs that involve TOR1 directly, the direct neighbourhood enables a quick overview of all 20 lifespan experiments involving it. The compact 3D view as a network can reveal the lifespan changes in combination with the other ageing factors more quickly than the large LO table for TOR1 available in AgeFactDB. Due to the rather small size with 26 nodes and 26 edges a 2D view is already helpful too.
Example 2: complete neighbourhood
The direct neighbourhood network of the AF TOR1 from the previous example was iteratively extended to the complete neighbourhood case. Figure 4b shows the complete neighbourhood graph of TOR1. It can be seen as the largest connected subgraph including TOR1 and all directly or indirectly connected LOs and AFs for S. cerevisiae. The complete network consists of 718 nodes (78 AFs, 640 LOs) and 933 edges. The layout was built using the FMMM algorithm.
The extended graph reveals new information on RPN4. All experiments that included this gene led to a decreased lifespan. For GCN4, FOB1, and SIR2, the other direct neighbours of TOR1, both decreased and increased lifespans were observed.
The inclusion of all directly or indirectly connected LOs and AFs into the network makes it possible to get an overview of all ageing factors examined directly or indirectly with TOR1. It would be much more laborious to compile the same dataset from the tabular AgeFactDB data and it would result in a very large table. In contrast to example 1, the much larger network profits much more from the 3D view compared to a 2D view. To illustrate this, a 2D view of this network, generated with the popular 2D network viewer Cytoscape, is shown in Additional file 1: Figure S2 and a JANet stereo representation in Additional file 1: Figure S3. The advantage of the 3D view becomes even more obvious when comparing interactive views in JANet and Cytoscape.
Example 3: augmentation via annotation nodes
In the augmentation example shown here the direct neighbourhood network is augmented by allele type (AT) and citation (CI) nodes. The AT provides information about the experimental manipulation of a gene, for example deletion or overexpression. To increase the benefit of adding AT nodes, we unified the ATs according to Table 3. There are for example 12 different ATs which are unified to loss of function. The original AT information was kept as annotation of the LO nodes.
CI nodes provide information about the publication from which an LO was extracted, represented by the corresponding PubMed ID.
Figure 5a displays the basic network of the gene RAS2 from S. cerevisiae. It is homologous to members of the mammalian RAS oncogene family, involved in the development of cancers. There are 11 LOs where only the RAS2 gene is involved. The corresponding lifespan changes seem to be contradictory: 6 times an increased lifespan versus 5 times a decreased lifespan. In Fig. 5b the basic network is augmented by AT and CI nodes. Here, most of the supposed contradictions are resolved immediately. In all cases with reduced lifespan the RAS2 gene was deactivated (AT: loss of function). In 5 of 6 cases with increased lifespan it was overexpressed instead (AT: overexpression). So the differences here fit the expectation that overexpression of a gene has the opposite effect compared to loss of function. It can be seen that the remaining contradictory LO was extracted from a different publication than the others. In general, this could be a hint that the experiments might have been performed under different conditions, which were not recorded during the extraction of the LO. But in this case, we could not resolve the contradiction by studying the two publications.
This example demonstrates how helpful it can be to include annotation information like ATs and CIs as additional nodes. Potential inconsistencies can be resolved quickly, without having to look up and remember the annotations individually for each node. Because the annotation nodes influence the network layout, they can also help to quickly identify characteristics of the examined ageing factors. For RAS2 such a characteristic is that it seems to be a longevity-promoting gene, meaning that its overexpression prolongs life while its inactivation shortens life.
Example 4: augmentation combined with data transfer
In this example the techniques Augmentation via Annotation Nodes and Data Transfer are combined within the following 2 steps.
The augmentation increased the network size by about thirty percent, from 718 to 947 nodes (78 AFs, 640 LOs, 229 GO terms) and from 933 to 1353 edges. A stereo representation of the intermediate visualisation can be found in Additional file 1: Figure S5.
Step 2: The Data Transfer from the LOs to the AFs in the second step compensates for the increased network complexity by allowing the removal of all 640 LO nodes (Fig. 4c). Here, the average observed lifespan changes are indicated by the colours of AF nodes. Genes for which all connected LOs increased or decreased the lifespan are coloured in green or red. Genes for which the effects of at least eighty percent of the LOs go in the same direction are coloured translucent green or red. Genes with even more heterogeneous LOs are coloured in orange.
The blue annotation nodes symbolize GO terms of molecular processes. The size of the nodes increases proportionally to the number of linked genes (number of incoming edges).
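The colouring rule of this data transfer can be sketched as a small function. This is a sketch under assumptions: the function name af_colour and the returned labels are hypothetical, and LOs without a direction (grey, unchanged lifespan) are ignored when computing the shares, which the text does not specify.

```python
def af_colour(lo_colours):
    """Derive the AF node colour from the colours of its connected LOs."""
    red = lo_colours.count("red")
    green = lo_colours.count("green")
    total = red + green             # assumption: grey LOs are not counted
    if total == 0:
        return None                 # no directional LO available
    if red == total:
        return "opaque red"
    if green == total:
        return "opaque green"
    # Integer comparison for "at least 80%" avoids rounding issues.
    if 5 * red >= 4 * total:
        return "translucent red"
    if 5 * green >= 4 * total:
        return "translucent green"
    return "orange"                 # more heterogeneous effects


example_colours = {
    "all decreased": af_colour(["red", "red"]),
    "mostly decreased": af_colour(["red"] * 4 + ["green"]),
    "mixed": af_colour(["green"] * 3 + ["red"]),
}
```

With the shares computed over red and green LOs only, the cases are exhaustive: any AF that is neither uniform nor at least 80% in one direction has more than 20% of its LOs in each direction and is coloured orange.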
The annotation nodes group connected components close to each other. While GO terms connected with several genes tend to build clusters within the network, GO terms related to a single gene build satellites at the outside of the network.
Some of the GO terms are assigned to a larger number of AFs, indicated by the node size. The labelled GO terms replicative cell aging (connected to 14 AFs), cellular response to DNA damage stimulus (connected to 6 AFs), and DNA repair (connected to AFs) reveal therefore a connection of many AFs in the network to cell ageing and DNA repair processes playing an important and widely accepted role in ageing.
This example demonstrates on one hand the combination of two visualisation techniques. On the other hand it shows how the increase in network size and complexity by the inclusion of GO annotation nodes can be compensated. And the stereo representation in Additional file 1: Figure S3 provides a good impression of the benefit of a 3D network layout.
JANet use cases
We provide use cases for the application of JANet. In the first use case, JANet is utilised for analysing a set of differentially expressed genes. In the second use case we demonstrate how JANet can be used for the identification of novel candidate genes related to ageing.
Use case 1: analysis of differentially expressed genes
JANet can be used to inspect user-specified gene lists within the LO networks extracted from AgeFactDB.
As an example we analyse differentially expressed genes taken from a study on the effect of D-Glucosamine (GlcN) on the lifespan of nematodes and ageing mice by Weimer et al. The study comprises RNA-seq data of 12 C. elegans samples and 12 M. musculus samples. For each species, 6 samples were treated with GlcN supplementation; the other samples remained untreated. Data are available in the NCBI Gene Expression Omnibus (GEO) database under accession GSE54853.
In their experiments, Weimer et al. identified 293 differentially expressed genes in mouse liver and 1272 genes in C. elegans. We analyse the combined list with 1565 genes.
Figure 2c illustrates how the analysis is started within JANet:
Copy / paste the list of differentially expressed genes into the tab “Import Gene List”. (The viewer needs either the ID from the NCBI Gene database or the gene symbol plus species name to be able to match a gene to an AF or putative AF.)
Start the analysis by clicking the “Start analysis” button.
The results are presented in a table containing the genes matching an AgeFactDB gene. Five entries of 157 matching gene entries are shown in Fig. 2d. For each gene the information provided by the user (NCBI Gene ID or gene symbol plus species name) and the corresponding information within AgeFactDB is shown. The table also provides the ageing relevance evidence type, characterising a matching gene as an ageing factor (“experimental evidence”) or a putative ageing factor (“homology analysis”). For putative ageing factors the homologous non-putative ageing factors are specified.
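The matching step can be pictured as a lookup by gene ID or by symbol plus species. The sketch below is not the AgeFactDB query; the record layout, the function name match_genes and the numeric gene IDs are placeholders invented for illustration. Only the gene names gst-10 and Gstp2 and their evidence types are taken from this article.

```python
def match_genes(user_genes, agefact_records):
    """Match user genes to records by gene ID or by (symbol, species)."""
    by_id = {r["gene_id"]: r for r in agefact_records}
    by_symbol = {(r["symbol"], r["species"]): r for r in agefact_records}
    matches = []
    for gene in user_genes:
        record = None
        if gene.get("gene_id") is not None:
            record = by_id.get(gene["gene_id"])
        if record is None and gene.get("symbol") and gene.get("species"):
            # Exact match on symbol plus species name.
            record = by_symbol.get((gene["symbol"], gene["species"]))
        if record is not None:
            matches.append((gene, record))
    return matches


# Placeholder records; the gene_id values are NOT real NCBI Gene IDs.
records = [
    {"gene_id": 1, "symbol": "gst-10", "species": "Caenorhabditis elegans",
     "evidence": "experimental evidence", "homologues": []},
    {"gene_id": 2, "symbol": "Gstp2", "species": "Mus musculus",
     "evidence": "homology analysis", "homologues": ["gst-10"]},
]
user_list = [
    {"gene_id": 2},
    {"symbol": "gst-10", "species": "Caenorhabditis elegans"},
    {"symbol": "not-a-gene", "species": "Caenorhabditis elegans"},
]
matched = match_genes(user_list, records)
```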
A visualisation of the results is presented in Fig. 6. It provides an overall impression on the differentially expressed genes and their fit into the LOs in AgeFactDB. For each matched differentially expressed gene the LO network is shown (FMMM). Only matching genes or their homologues with experimental evidence are included. Other genes involved in the LOs were excluded to provide a clearer overview.
The network also includes the observations from the homology analysis. Again, non-matching genes belonging to the homology groups were excluded. The genes from the user specified list are marked by a halo. 151 genes with LOs are given in the graph.
The first subnetwork at the top left is very large and looks rather different. It is centred around the ageing factor daf-16 from C. elegans. For this gene many LOs are available (384). Most of them are concentrated in the sphere-like structure at the centre of the subnetwork. It was tested in combination with many other genes. A large number of AFs (17) and putative AFs (11) match to differentially expressed genes.
The second subnetwork is centred on the AF col-93 from C. elegans. It only contains a single LO and a homology observation with a large homology group of 42 putative AFs, arranged in a sphere-like structure. The AF col-93 is not in the user specified list. The third subnetwork centred on daf-12 contains no other genes but only 60 LOs arranged in a sphere-like structure.
The other subnetworks have a rather similar structure: they consist of 1 to 5 AFs with at most 13 LOs, 7 putative AFs, and 1 homology analysis observation. Several of the homology analysis observations are not connected to any putative AFs. This means that none of the other members of the homology group are on the user-specified list.
Individual networks can be generated for each of the matching genes in the user specified list. An interactive table (Fig. 2d) can be used to narrow the number of AFs. Putative AFs are marked by a background colour in the interactive table.
The gene Gstp2 (Mus musculus) from the user specified list is an example for a putative AF in AgeFactDB. For this type of AF, no LOs are available in the database. The list from Fig. 2d provides the homologous gene gst-10 from C. elegans for which LOs exist. Networks for gst-10 can be constructed via a dropdown list (Fig. 2a). The result is shown in Additional file 1: Figure S6.
In contrast to the overview network (Fig. 6), all AFs involved in the LOs are included. The network is augmented with AT and CI nodes. The numbers at the LO nodes indicate the lifespan change. It can be seen that overexpression of gst-10 resulted in an increased lifespan by about 20 percent, while reducing the expression by RNA interference resulted in a decreased lifespan by about 12 percent. It can also be seen that this data was gathered from the experiments reported in 3 publications.
This use case demonstrates how easily AFs can be identified within a large list of genes with JANet. The overview network provides a compact view of all LOs available for these ageing factors. The tiling of disconnected subnetworks separates AFs studied more extensively in combination with other AFs from those studied separately. Individual genes can also easily be looked at in more detail by building individual subsets.
Use case 2: candidate gene identification
JANet can be used for a de novo candidate gene identification on the basis of AgeFactDB. The following section provides a showcase for a possible selection process for C. elegans. The task is the identification of new promising candidates for ageing-related genes that are not yet included in AgeFactDB.
An overview of the candidate gene selection process is given in Fig. 7. It consists of nine major steps. For each step there is also an enlarged image available as additional file (Additional file 2: Figures S7–S15).
Step 1: We start our screening with the inspection of the complete LO network of C. elegans (Fig. 7a and Additional file 2: Figure S7). It consists of 965 ageing-related genes and 4265 LOs and is divided into 676 disjoint subnetworks.
Step 2: The LO network is augmented by the introduction of pathway nodes that represent KEGG pathways specific for C. elegans. These pathways were extracted from the BioSystems database, based on cross-linking information to AFs. The other gene components of a KEGG pathway are introduced as additional gene nodes, linked also to the pathway node. AFs unconnected to a pathway node are removed. The resulting network is shown in Fig. 7b and Additional file 2: Figure S8. This step reduced the numbers of ageing-related genes (to 324), LOs (to 2159) and disconnected subnetworks (to 1). In total, 2619 genes and 161 KEGG pathways were added. Linked to known AFs via common KEGG pathways, the additional genes can be seen as an initial set of candidates for ageing-related genes.
Step 3: A summary of the LO information was transferred to the AF nodes in preparation for reducing the complexity of the network (Fig. 7c and Additional file 2: Figure S9). This means that the LO node colours, indicating lifespan increase or decrease, from all LOs connected to an AF were transformed into a new colour for the AF node. The transformation was done by the following scheme:
only red LOs → opaque red AF
≥80% red LOs → translucent red AF
only green LOs → opaque green AF
≥80% green LOs → translucent green AF
>20% red and >20% green LOs → orange AF
Step 5: In order to focus on the most relevant AFs, the visible network was restricted to those AFs with a maximal lifespan change of at least 100%.
Step 6: Based on the assumption that genes connected to AFs via multiple pathways are more likely to be ageing-related, promising candidates were highlighted. Only genes connected to at least 6 pathway nodes were selected and marked by a halo (Fig. 7f and Additional file 2: Figure S12). This led to a final list of 48 candidate genes connected to 20 visible pathway nodes.
Step 8: It is also possible to reinspect the network while focusing on a single candidate gene. The gene enol-1 was used as an example (Fig. 7h and Additional file 2: Figure S14). We revisited the network as it was before the reduction of the pathway and gene nodes (step 4), while focusing on enol-1. All nodes not connected to the 7 pathway nodes linked to enol-1 were hidden. The candidate gene node, the AF nodes, and the pathway nodes are labelled with their names.
Step 9: An optimised view was created for the candidate gene enol-1 by building a new subset and calculating a new layout (Fig. 7i and Additional file 2: Figure S15). The new subset contains enol-1, all KEGG pathway nodes connected to it, and all AFs connected to these pathway nodes. As before, information from the LOs was transferred to the AFs and visualised by node colour and size. In this view, enol-1 is linked to 3 different types of pathways: biosynthesis, degradation and energy metabolism. The LOs were obtained in the context of energy metabolism.
For all 48 candidate genes selected in step 6 we performed a literature search. The results are summarised in Table 4. LOs with a significant lifespan change that are not yet contained in AgeFactDB are available for the 5 candidate genes aco-1, enol-1, pkc-2, pyk-1, and tpi-1 [41–44]. Other evidence of ageing relevance was found for 8 further candidate genes: acdh-7, acdh-9, alh-4, ech-6, ech-9, F59F4.1, hacd-1, and T02G5.7 [45–51]. No specific evidence of ageing relevance was found for the remaining 35 candidate genes.
In this use case JANet helped to find new candidate genes as AFs for AgeFactDB. By applying several of the proposed visualisation techniques we could select 48 promising candidate genes from the initial 2619 genes. The literature search revealed evidence of ageing relevance for 27% of the candidate genes (13 of 48). This true positive rate suggests that it would be justified to include some of the remaining 35 candidate genes in experimental analyses. Moreover, such an in silico approach may also improve the curation of AF/LO databases.
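The selection criterion of step 6 and the reported rate of literature evidence can be retraced with a few lines of Python. The sketch below is illustrative: the function name, the input format and the toy gene and pathway names are hypothetical, while the group sizes 5, 8 and 35 are taken from the literature search described above.

```python
def select_candidates(gene_pathways, min_pathways=6):
    """Return the genes connected to at least `min_pathways` pathway nodes."""
    return sorted(
        gene for gene, pathways in gene_pathways.items()
        if len(set(pathways)) >= min_pathways
    )


# Hypothetical toy data: only gene_x reaches six distinct pathways.
toy_gene_pathways = {
    "gene_x": ["p1", "p2", "p3", "p4", "p5", "p6", "p7"],
    "gene_y": ["p1", "p2", "p3"],
    "gene_z": ["p1", "p1", "p2", "p3", "p4", "p5"],  # duplicate link counted once
}
toy_candidates = select_candidates(toy_gene_pathways)

# Group sizes from the literature search.
with_new_los = 5         # candidates with LOs not yet in AgeFactDB
with_other_evidence = 8  # candidates with other evidence of ageing relevance
without_evidence = 35    # candidates without specific evidence

n_candidates = with_new_los + with_other_evidence + without_evidence
hit_rate_percent = round(100 * (with_new_los + with_other_evidence) / n_candidates)
```

The three groups add up to the full candidate list, and the share of candidates with any evidence of ageing relevance rounds to the 27% stated in the text.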
JANet provides a wide range of network visualisations for the analysis of lifespan observations. Integrating heterogeneous data from various sources, these networks allow a comprehensive overview of data from lifespan experiments and their dependencies. The investigations are linked via common components or external domain knowledge into network representations. This can generate interpretable patterns that are recognisable by a human expert. These network representations can therefore be seen as a valuable addition to classic tabular representations.
Interactive network manipulation allows replacing complex static queries with a visually guided search process. A life scientist can easily explore a network by merely changing the graph’s perspective. For example, zooming into a region of interest can reveal detailed information, which might be hidden at a broader scale. In this context, interactive 3D layouts allow a more extensive range of manipulations than classical 2D graphs. Providing more compact graph representations and 3D rotation, 3D arrangements allow reaching distant points much faster. The network itself can be reconfigured during the exploration. For example, LOs can be hidden or highlighted.
In our first use case, we show that researchers can utilise the network visualisations of AgeFactDB to explore their own experimental data. This type of analysis brings a set of candidate genes into the context of thousands of LOs. In this way, a single experiment becomes highly connected to the ageing research field, which would be more laborious to achieve with a traditional literature or database search. This general ability of network representations to reflect existing knowledge and to facilitate the analysis of experimental data can be useful in many other research areas.
Abbreviations
3D forced layout
AgeFactDB: JenAge Ageing Factor Database
C. elegans: Caenorhabditis elegans
FM3: Fast multipole multilevel method
GEO: Gene Expression Omnibus
JANet: Jmol AgeFactDB network-viewer
M. musculus: Mus musculus
S. cerevisiae: Saccharomyces cerevisiae
Hansen M, Hsu A-L, Dillin A, Kenyon C. New genes tied to endocrine, metabolic, and dietary regulation of lifespan from a Caenorhabditis elegans genomic RNAi screen. PLoS Genet. 2005; 1(1):e17.
Berman RJ, Kenyon C. Germ-cell loss extends C. elegans life span through regulation of DAF-16 by kri-1 and lipophilic-hormone signaling. Cell. 2006; 124(5):1055–68.
Kestler HA, Müller A, Gress TM, Buchholz M. Generalized Venn diagrams: a new method of visualizing complex genetic set relations. Bioinformatics. 2005; 21(8):1592–5.
Kestler HA, Müller A, Buchholz M, Gress TM, Palm G. A perceptually optimized scheme for visualizing gene expression ratios with confidence values In: André E, Dybkjær L, Minker W, Neumann H, Weber M, editors. Perception and Interactive Technologies. Berlin: Springer: 2006. p. 73–84.
Müller A, Holzmann K, Kestler HA. Visualization of genomic aberrations using Affymetrix SNP arrays. Bioinformatics. 2007; 23(4):496–7.
Kestler HA, Müller A, Kraus JM, Buchholz M, Gress TM, Liu H, Kane DW, Zeeberg BR, Weinstein JN. VennMaster: area-proportional Euler diagrams for functional GO analysis of microarrays. BMC Bioinforma. 2008; 9:67.
Gehlenborg N, Wong B. Into the third dimension. Nat Methods. 2012; 9(9):851.
Cohen RF, Eades P, Lin T, Ruskey F. Three-dimensional graph drawing. Algorithmica. 1997; 17(2):199–208.
Nakayama K, Silverman GH. Serial and parallel processing of visual feature conjunctions. Nature. 1986; 320(6059):264–5.
Enns JT, Rensink RA. Influence of scene-based properties on visual search. Science. 1990; 247(4943):721–3.
Phillips WA, Christie DFM. Components of visual memory. Q J Exp Psychol. 1977; 29(1):117–33.
Xu Y, Nakayama K. Visual Short-Term Memory Benefit for Objects on Different 3-D Surfaces. J Exp Psychol Gen. 2007; 136(4):653–62.
Munzner T. Rules of thumb In: Munzner T, editor. Visualization Analysis & Design. Boca Raton: A K Peters/CRC press: 2015. p. 116–144.
Hühne R, Thalheim T, Sühnel J. AgeFactDB – The JenAge Ageing Factor Database – Towards data integration in ageing research. Nucleic Acids Res. 2014; 42(Database issue):892–6.
NCBI Resource Coordinators. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2015; 43(Database issue):6–17. http://www.ncbi.nlm.nih.gov/homologene. Accessed 03 Nov 2017.
AgeFactDB Lifespan Observation OB_000094. http://agefactdb.jenage.de/cgi-bin/jaDB.cgi?RKEY=r001&SEARCH=OB_000094&TYPE=d_ob&VIEW=detail. Accessed 03 Nov 2017.
Fabrizio P, Longo VD. The chronological life span of Saccharomyces cerevisiae. Aging Cell. 2003; 2:73–81.
Mortimer RK, Johnston JR. Life Span of Individual Yeast Cells. Nature. 1959; 183:1751–2.
Hagberg AA, Schult DA, Swart PJ. Exploring network structure, dynamics, and function using NetworkX. In: Varoquaux G, Vaught T, Millman J, editors. Proceedings of the 7th Python in Science Conference (SciPy2008). 2008. p. 11–5.
Csardi G, Nepusz T. The igraph software package for complex network research. Inter Journal. 2006; Complex Systems(1695):1–9.
vis.js - A dynamic, browser based visualization library. http://visjs.org Accessed 03 Nov 2017.
Rohn H, Junker A, Hartmann A, Grafahrend-Belau E, Treutler H, Klapperstück M, Czauderna T, Klukas C, Schreiber F. VANTED v2: a framework for systems biology applications. BMC Syst Biol. 2012; 6:139.
Cox CK, Eick GS, He T. 3D Geographic Network Displays. ACM SIGMOD Record. 1996; 25(4):50–4.
Cy3D. Simple 3D Network Renderer App. http://apps.cytoscape.org/apps/cy3d. Accessed 03 Nov 2017.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Research. 2003; 13(11):2498–504.
Theocharidis A, van Dongen S, Enright AJ, Freeman TC. Network visualization and analysis of gene expression data using BioLayout Express3D. Nature Protocols. 2009; 4(10):1535–50.
Hachul S, Jünger M. Large-graph layout with the fast multipole multilevel method. Technical report. Cologne: University of Cologne, Computer Science Department; 2005. http://e-archive.informatik.uni-koeln.de/509/.
Gene Ontology Consortium. Creating the gene ontology resource: design and implementation. Genome Res. 2001; 11(8):1425–33.
Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014; 42(Database issue):199–205.
Wiki WP. Main Page — WebGL Public Wiki.
Danchilla B. Three.js Framework. Berkeley: Apress; 2012, pp. 173–203.
Van Bruggen R. Learning Neo4j. Birmingham: Packt Publishing; 2014.
Verlet L. Computer “experiments” on classical fluids. I. Thermodynamical properties of Lennard-Jones molecules. Phys Rev. 1967; 159:98–103.
Fruchterman TMJ, Reingold EM. Graph drawing by force-directed placement. Softw: Pract Experience. 1991; 21(11):1129–64.
Evans DS, Kapahi P, Hsueh W-C. TOR signaling never gets old: Aging, longevity and TORC1 activity. Ageing Res Rev. 2011; 10(2):225–37.
Powers S, Kataoka T, Fasano O, Goldfarb M, Strathern J, Wigler M. Genes in S. cerevisiae Encoding Proteins with Domains Homologous to the Mammalian ras Proteins. Cell. 1984; 36(3):607–12.
Weimer S, Priebs J, Kuhlow D, Groth M, Priebe S, Mansfeld J, Merry TL, Dubuis S, Laube B, Pfeiffer AF, Schulz TJ, Guthke R, Platzer M, Zamboni N, Zarse K, Ristow M. D-glucosamine supplementation extends life span of nematodes and of ageing mice. Nat Commun. 2014; 5:3563.
Barrett T, Wilhite SE, Ledoux P, Evangelista C, Kim IF, Tomashevsky M, Marshall KA, Phillippy KH, Sherman PM, Holko M, Yefanov A, Lee H, Zhang N, Robertson CL, Serova N, Davis S, Soboleva A. NCBI GEO: archive for functional genomics data sets–update. Nucleic Acids Res. 2013; 41(Database issue):991–5.
Brown GR, Hem V, Katz KS, Ovetsky M, Wallin C, Ermolaeva O, Tolstoy I, Tatusova T, Pruitt KD, Maglott DR, Murphy TD. Gene: a gene-centered information resource at NCBI. Nucleic Acids Res. 2015; 43(Database issue):36–42.
Geer LY, Marchler-Bauer A, Geer RC, Han L, He J, He S, Liu C, Shi W, Bryant SH. The NCBI BioSystems database. Nucleic Acids Res. 2010; 38(Database issue):492–6.
Yuan Y, Kadiyala CS, Ching TT, Hakimi P, Saha S, Xu H, Yuan C, Mullangi V, Wang L, Fivenson E, Hanson R, Ewing R, Hsu A, Miyagi M, Feng Z. Enhanced energy metabolism contributes to the extended life span of calorie-restricted Caenorhabditis elegans. J Biol Chem. 2012; 287(37):31414–26.
Kim YI, Cho JH, Yoo OJ, Ahnn J. Transcriptional regulation and life-span modulation of cytosolic aconitase and ferritin genes in C. elegans. J Mol Biol. 2004; 342(2):421–33.
Xiao R, Zhang B, Dong Y, Gong J, Xu T, Liu J, Xu XZ. A genetic program promotes C. elegans longevity at cold temperatures via a thermosensitive TRP channel. Cell. 2013; 152(4):806–17.
Ralser M, Wamelink MM, Kowald A, Gerisch B, Heeren G, Struys EA, Klipp E, Jakobs C, Breitenbach M, Lehrach H, Krobitsch S. Dynamic rerouting of the carbohydrate flux is key to counteracting oxidative stress. J Biol. 2007; 6(4):10.
Digital Ageing Atlas. Human Gene ACADM. http://ageing-map.org/atlas/gene/34/ Accessed 03 Nov 2017.
WormBase. Caenorhabditis elegans Gene alh-4. http://www.wormbase.org/species/c_elegans/gene/WBGene00000110 Accessed 03 Nov 2017.
Zarse K, Schmeisser S, Groth M, Priebe S, Beuster G, Kuhlow D, Guthke R, Platzer M, Kahn CR, Ristow M. Impaired insulin/IGF1 signaling extends life span by promoting mitochondrial L-proline catabolism to induce a transient ROS signal. Cell Metab. 2012; 15(4):451–65.
Thyagarajan B, Blaszczak AG, Chandler KJ, Watts JL, Johnson WE, Graves BJ. ETS-4 is a transcriptional regulator of life span in Caenorhabditis elegans. PLoS Genet. 2010; 6(9):1001125.
Wolozin B, Gabel C, Ferree A, Guillily M, Ebata A. Watching worms whither: modeling neurodegeneration in C. elegans. Prog Mol Biol Transl Sci. 2011; 100:499–514.
Ackerman D, Gems D. The mystery of C. elegans aging: An emerging role for fat. Distant parallels between C. elegans aging and metabolic syndrome?. Bioessays. 2012; 34(6):466–71.
Soukas AA, Carr CE, Ruvkun G. Genetic regulation of Caenorhabditis elegans lysosome related organelle function. PLoS Genet. 2013; 9(10):1003908.
Ward JD, Mullaney B, Schiller BJ, He LD, Petnic SE, Couillault C, Pujol N, Bernal TU, Van Gilst MR, Ashrafi K, et al. Defects in the C. elegans acyl-CoA synthase, acs-3, and nuclear hormone receptor, nhr-25, cause sensitivity to distinct, but overlapping stresses. PLoS One. 2014; 9(3):92552.
Reis-Rodrigues P, Czerwieniec G, Peters TW, Evani US, Alavez S, Gaman EA, Vantipalli M, Mooney SD, Gibson BW, Lithgow GJ, et al. Proteomic analysis of age-dependent changes in protein solubility identifies genes that modulate lifespan. Aging Cell. 2012; 11(1):120–7.
Shaw WM, Luo S, Landis J, Ashraf J, Murphy CT. The C. elegans TGF-β dauer pathway regulates longevity via insulin signaling. Curr Biol. 2007; 17(19):1635–45.
Murphy CT, McCarroll SA, Bargmann CI, Fraser A, Kamath RS, Ahringer J, Li H, Kenyon C. Genes that act downstream of DAF-16 to influence the lifespan of Caenorhabditis elegans. Nature. 2003; 424(6946):277–83.
Gao A, Smith R, Van Weeghel M, Kamble R, Houtkooper R. Identification of key pathways and metabolic fingerprints of longevity in C. elegans. bioRxiv; 2017.
Kwon G, Lee J, Koh J-H, Lim Y-H. Lifespan extension of Caenorhabditis elegans by Butyricicoccus pullicaecorum and Megasphaera elsdenii with probiotic potential. Curr Microbiol. 2018; 75(5):557–64.
Uno M, Honjoh S, Matsuda M, Hoshikawa H, Kishimoto S, Yamamoto T, Ebisuya M, Yamamoto T, Matsumoto K, Nishida E. A fasting-responsive signaling pathway that extends life span in C. elegans. Cell Rep. 2013; 3(1):79–91.
Iwasa H, Yu S, Xue J, Driscoll M. Novel EGF pathway regulators modulate C. elegans healthspan and lifespan via EGF receptor, PLC-γ, and IP3R activation. Aging Cell. 2010; 9(4):490–505.
Meng F, Li J, Rao Y, Wang W, Fu Y. Gengnianchun extends the lifespan of Caenorhabditis elegans via the insulin/IGF-1 signalling pathway. Oxidative Med Cell Longev. 2018; 2018:10.
Sasagawa Y, Urano T, Kohara Y, Takahashi H, Higashitani A. Caenorhabditis elegans RBX1 is essential for meiosis, mitotic chromosomal condensation and segregation, and cytokinesis. Genes to Cells. 2003; 8(11):857–72.
Pujol C, Bratic-Hench I, Sumakovic M, Hench J, Mourier A, Baumann L, Pavlenko V, Trifunovic A. Succinate dehydrogenase upregulation destabilize complex I and limits the lifespan of gas-1 mutant. PLoS One. 2013; 8(3):59493.
He K, Zhou T, Shao J, Ren X, Zhao Z, Liu D. Dynamic regulation of genetic pathways and targets during aging in Caenorhabditis elegans. Aging (Albany NY). 2014; 6(3):215–30.
Wang MC, Oakley HD, Carr CE, Sowa JN, Ruvkun G. Gene pathways that delay Caenorhabditis elegans reproductive senescence. PLoS Genet. 2014; 10(12):1004752.
The work described here is part of a joint effort of the Research Cores (Forschungskern) JenAge and SyStaR within the Gerontosys II funding line. We acknowledge JenAge and SyStaR funding by the German Ministry for Education and Research (Bundesministerium für Bildung und Forschung, BMBF; JenAge support code: 0315581; SyStaR, support code: 0315894A). The research leading to these results has also received funding from the European Community’s Seventh Framework Programme (FP7/2007-2013) under grant agreement n°602783, the German Research Foundation (DFG, SFB 1074 project Z1), and the Federal Ministry of Education and Research (BMBF, e:Med, SYMBOL-HF, ID 01ZX1407A) to HAK.
The authors declare that they have no competing interests.
Ludwig Lausser and Hans A. Kestler are joint senior authors.
Figure S1: Large AF/LO Network Navigation in a 3D and 2D Network Viewer. Figure S2: The Complete TOR1 AF/LO Network as 2D View. Figure S3: The Complete TOR1 AF/LO Network as JANet Stereogram. Figure S4: The Complete TOR1 AF/LO Network Augmented with GO Process Term Nodes. Figure S5: A Stereogram of the Complete TOR1 AF/LO Network Augmented with GO Process Term Nodes. Figure S6: The Differentially Expressed Gene Gstp2 Matched to the AF/LO Subnetwork of Gene gst-10. (PDF 15,960 kb)
Figure S7: Candidate Gene Selection - Step 1. Figure S8: Candidate Gene Selection - Step 2. Figure S9: Candidate Gene Selection - Step 3. Figure S10: Candidate Gene Selection - Step 4. Figure S11: Candidate Gene Selection - Step 5. Figure S12: Candidate Gene Selection - Step 6. Figure S13: Candidate Gene Selection - Step 7. Figure S14: Candidate Gene Selection - Step 8. Figure S15: Candidate Gene Selection - Step 9. (PDF 8189 kb)
Cite this article
Hühne, R., Kessler, V., Fürstberger, A. et al. 3D Network exploration and visualisation for lifespan data. BMC Bioinformatics 19, 390 (2018). https://doi.org/10.1186/s12859-018-2393-x
- Gene network
- 3D visualization
- Ageing factor database
- Differentially expressed genes