Visualizing metabolic network dynamics through time-series metabolomic data

Background
New technologies have given rise to an abundance of -omics data, particularly metabolomic data. The scale of these data introduces new challenges for the interpretation and extraction of knowledge, requiring the development of innovative computational visualization methodologies. Here, we present GEM-Vis, an original method for the visualization of time-course metabolomic data within the context of metabolic network maps. We demonstrate the utility of the GEM-Vis method by examining previously published data for two cellular systems: the human platelet and the human erythrocyte under cold storage for use in transfusion medicine.

Results
The results comprise two animated videos that allow for new insights into the metabolic state of both cell types. In the case study of the platelet metabolome during storage, the new visualization technique elucidates a nicotinamide accumulation that mirrors that of hypoxanthine and might, therefore, reflect similar pathway usage. This visual analysis provides a possible explanation for why the salvage reactions in purine metabolism exhibit lower activity during the first few days of the storage period. The second case study displays drastic changes in specific erythrocyte metabolite pools at different times during storage at different temperatures.

Conclusions
The new visualization technique GEM-Vis introduced in this article constitutes a well-suited approach for large-scale network exploration and advances hypothesis generation. This method can be applied to any system with data and a metabolic map to promote visualization and the understanding of physiology at the network level. More broadly, we hope that our approach will provide the blueprints for new visualizations of other longitudinal -omics data types.
The supplement includes a comprehensive user's guide and links to a series of tutorial videos that explain how to prepare model and data files and how to use the software SBMLsimulator in combination with further tools to create animations similar to those highlighted in the case studies.


Changing node size
Metabolites are graphically represented as circles, a standard practice in the Systems Biology Graphical Notation (SBGN) (Le Novère et al., 2009), but with a variable size according to the quantitative value of a data point in the experimental data. The maximum and minimum size of a metabolite are determined automatically by the maximum and minimum measured values in the given experimental data. Based on these numbers, the quantity of the metabolite at each specific time point is used to calculate the size of the circle. This approach has been found to be intuitive because a larger area can be readily associated with a more abundant quantity (Cleveland and McGill, 1984).
The size of a metabolite can represent either the absolute quantity of the metabolite itself or its relative quantity compared to all others. For more in-depth analysis, however, the absolute quantity seems to be more helpful. For instance, if the data points represent metabolite concentrations, small changes in concentration would not impart massive changes in node size over time, thereby aiding visualization by allowing significant changes in a metabolite's concentration to be more easily identified within the network.
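The size mapping described above can be sketched as follows. This is a minimal illustration, not the code of SBMLsimulator: the function name `node_radius`, the pixel bounds `r_min` and `r_max`, and the choice to scale the circle area (rather than the radius) linearly with the value are assumptions made for this example.

```python
import math


def node_radius(value, v_min, v_max, r_min=5.0, r_max=30.0):
    """Map a measured value to a circle radius.

    Assumption: the circle *area* grows linearly between the areas that
    belong to r_min and r_max, so that area reflects quantity.
    """
    if v_max == v_min:
        # Constant data: no range to scale over, fall back to the largest size.
        return r_max
    fraction = (value - v_min) / (v_max - v_min)
    fraction = min(1.0, max(0.0, fraction))  # clamp to [0, 1]
    # Area is proportional to r^2, so interpolate r^2 and take the root.
    squared = r_min ** 2 + fraction * (r_max ** 2 - r_min ** 2)
    return math.sqrt(squared)
```

Under this sketch, passing the minimum and maximum of a single metabolite's series as `v_min` and `v_max` would correspond to one reading of the absolute mode, while passing the minimum and maximum over all metabolites would correspond to the relative mode.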

Changing node color
The visualization mode of changing node colors encodes the quantitative values of the experimental data in a color gradient. Either three different colors or three different shadings of one or two colors can be pre-defined to represent the maximum, middle, and minimum measured value of a metabolite. The algorithm calculates the colors of all values in between from the color gradients. Compared to the visualization mode of changing node sizes, this avoids the problem of overlapping metabolites if the network is not dynamically and simultaneously rearranged.
As already mentioned for the visualization mode of changing node sizes, either the absolute or the relative value of a metabolite can be represented in this mode. Depending on the analysis task, both versions can be useful, but the most commonly used version is likely to be the representation of the absolute value of a metabolite as the node color.
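A three-point color gradient of the kind described above could be computed as in the following sketch. The function names, the default RGB triples, and the piecewise-linear interpolation in RGB space are illustrative assumptions; the text does not state which color space or interpolation the software uses.

```python
def lerp_color(c0, c1, t):
    """Linearly blend two RGB triples; t = 0 gives c0 and t = 1 gives c1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def gradient_color(value, v_min, v_max,
                   low=(255, 255, 204),   # hypothetical color for the minimum
                   mid=(253, 141, 60),    # hypothetical color for the middle
                   high=(128, 0, 38)):    # hypothetical color for the maximum
    """Map a value to a color on a low -> mid -> high gradient."""
    if v_max == v_min:
        return mid
    t = (value - v_min) / (v_max - v_min)
    t = min(1.0, max(0.0, t))  # clamp to [0, 1]
    if t < 0.5:
        # Lower half of the range: blend from low to mid.
        return lerp_color(low, mid, t / 0.5)
    # Upper half of the range: blend from mid to high.
    return lerp_color(mid, high, (t - 0.5) / 0.5)
```

Because only the color changes, the node footprint stays constant, which is why this mode does not cause overlapping nodes.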

Combination of changing node size with changing node color
Both visualization modes can also be combined, i.e., one mode for displaying global differences and one for individual changes. Any combination of relative or absolute visualization of color or size can be used at the same time. This enables the investigation of relative and absolute metabolic changes simultaneously, which offers a more holistic insight into the data set and can highlight dynamics that might otherwise be overlooked.

Changing fill levels
In the third visualization type, metabolites are visualized as round "aquariums" showing the corresponding data points as a "fill level of water." Compared to the other two visualization modes, this mode combines the representation of absolute and relative values in one graphical presentation. While the fill levels show the absolute measured data points of one metabolite, the color of the fill level refers to the relative value of this metabolite's data points compared to all others.
As depicted in Supplementary Figure S1e, a completely filled metabolite represents the maximum measured value (full fill level), whereas an empty metabolite shows the minimum measured value (fill level of zero). Supplementary Figures S1e to S1f also show the different colors of the involved metabolites. The darker the color, the higher the measured value of this metabolite in comparison to all others; a lighter shading indicates a lower average value (Supplementary Figure S1f). In contrast to the other types, this visualization mode relates the currently depicted data point to the maximum and the minimum measured value of this metabolite. For instance, when the absolute concentration of the metabolite is shown as a fill level, it is easier to estimate how close the depicted concentration is to its metabolite-specific maximum or minimum than with the other two visualization modes. According to Cleveland and McGill (1984), extracting this information is also more natural than estimating area sizes or colors.

Figure S1 | Combinations of color and size for representing relative and absolute concentration changes. This figure contrasts three combinations of color and size changes to represent relative concentration values and absolute quantity at two different time points of the experiment. Panel (f) shows absolute fill levels combined with relative color. Metabolites for which no data are available appear in the configurable default color, here light blue. The color bar at the bottom shows the mapping between high, medium, and low concentration values of the relative metabolite concentrations.
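The two quantities encoded by the fill-level mode can be sketched as follows. This is one possible reading of the description, not the software's implementation: the fill fraction is normalized against the metabolite's own minimum and maximum, and the color shade against the minimum and maximum over all metabolites. The names `normalize` and `fill_level_state` and the dictionary layout of `data` are hypothetical.

```python
def normalize(value, lo, hi):
    """Scale value into [0, 1] relative to the interval [lo, hi]."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def fill_level_state(data, metabolite, index):
    """Return (fill, shade) for one metabolite at one time index.

    data maps a metabolite identifier to its list of measured values.
    fill  : position between the metabolite's own minimum and maximum.
    shade : position between the minimum and maximum over all metabolites
            (assumption: a higher shade is drawn as a darker color).
    """
    series = data[metabolite]
    value = series[index]
    all_values = [v for s in data.values() for v in s]
    fill = normalize(value, min(series), max(series))
    shade = normalize(value, min(all_values), max(all_values))
    return fill, shade
```

A metabolite at its own maximum is therefore drawn completely filled even if its concentration is low compared to the rest of the network, in which case the shade stays light.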

Video-like animation
Although the experimental data only offer discrete time points with specific quantitative values, time is a continuous, dynamic entity and should be represented accordingly. Hence, the discrete data points are displayed continuously in a smooth, video-like animation. This is achieved by a linear interpolation, which regularly calculates the currently depicted value from the given interval between the possible maximum and minimum values. Each time the interpolation calculates a new value, the metabolites are redrawn accordingly. This continuous redrawing of the metabolites with a different size, color, or fill level results in a smooth, video-like movement. Additionally, the interval that determines how closely the time points of the experimental data follow each other can be adjusted, which shortens or extends the duration of the entire animation.
It also has to be noted that one standard defined by SBGN (Le Novère et al., 2009), the clone marker, was changed. Species occurring more than once in a network are normally marked by a dark band along the bottom of the species. However, because these bands can easily be confused with a fill level, the dark bands were rotated by 90° counterclockwise (see Supplementary Figures S1e to S1f). This change was only made to the metabolites displaying a fill level, i.e., metabolites for which experimental data are available (see Supplementary Figures S2a to S2b).
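The interpolation step behind the animation can be illustrated with the following sketch, which assumes that values are interpolated linearly between consecutive measured time points. The function names and the parameter `frames_per_interval` are hypothetical and only mimic the adjustable animation duration mentioned above.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate a value at time t from sorted sample times."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def animation_frames(times, values, frames_per_interval=10):
    """Return (time, value) pairs, one per frame to be drawn.

    More frames per interval give a longer, smoother animation.
    """
    frames = []
    for i in range(len(times) - 1):
        span = times[i + 1] - times[i]
        for k in range(frames_per_interval):
            t = times[i] + span * k / frames_per_interval
            frames.append((t, interpolate(times, values, t)))
    frames.append((times[-1], values[-1]))  # final measured time point
    return frames
```

Each frame value would then be passed to the size, color, or fill-level mapping before the node is redrawn.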

Escher
Escher is a web-based software for drawing, manipulating, and viewing metabolic networks, implemented in JavaScript (King et al., 2015). In this work, Escher's feature for overlaying these networks with data has been extended to cover time-series data (Figure S3). The source code of this modified version of Escher is available at /christophblessing/escher/. Supported features include the mapping of time-series data to a pair of absolute node sizes and absolute colors as well as the display of animations. The current implementation maps high values to large dark nodes and low values to small bright nodes. Escher natively encodes network maps in a tool-specific JSON format.

SBMLsimulator
The new visualization concept was implemented in the software tool SBMLsimulator (Dörr et al., 2014), an open-source project available at /draeger-lab/SBMLsimulator/. SBMLsimulator is distributed as a Java Archive (JAR) file, based on Java™ 1.8, which can be run on any Java™ Virtual Machine (JVM). It is based on the open-source library JSBML (Dräger et al., 2011; Rodriguez et al., 2015) and was created for the dynamic simulation and heuristic parameter optimization of models written in SBML format. SBMLsimulator combines EvA2 (Streichert and Ulmer, 2005) with the Systems Biology Simulation Core Library (SBSCL) (Keller et al., 2013) for simulation and parameter estimation in one graphical user interface. See the screenshot in Supplementary Figure S4.

yFiles
To visualize systems biology networks given in SBML format (Hucka et al., 2003), the commercial Java™ library yFiles version 3.0 was used. yFiles is made by yWorks and is provided for several platforms. It offers a wide range of algorithms to analyze and visualize graphs or networks and can also arrange the layout of a network diagram automatically. In SBMLsimulator (Dörr et al., 2014), yFiles is used to display a given SBML model as a graph with metabolites as nodes, reactions as edges, and compartments, like mitochondria, as yellow lines. The graph visualization in SBMLsimulator uses the Swing implementation of yFiles.

Xuggle
Xuggle is an open-source library that allows video and audio stream editing. This software tool is used to export the visualized animation of a network and its experimental data. SBMLsimulator (Dörr et al., 2014) uses version 5.4 of Xuggle, which encodes a video from a series of screenshots of the animation in SBMLsimulator.

Figure S3 | Screenshot of dynamic data visualization using the web-based Escher application (King et al., 2015).

Figure S4 | Screenshot of SBMLsimulator in "Graph" tab. SBMLsimulator (Dörr et al., 2014) was the primary tool used in this work. Similarly, dynamic visualizations can now also be produced with Escher (see Figure S3).

Sony VEGAS Pro 12
Sony VEGAS Pro 12.0 is a professional video and audio editing program provided by Sony. It is a stand-alone commercial application for Windows, here used under Windows 10. Its design is easy for novices to grasp, yet it offers a wide range of powerful features that are also used by experienced users. Sony VEGAS Pro supports the entire video creation process. At the time of writing, the latest version was Sony VEGAS Pro 14.0. Sony VEGAS Pro 12.0 was used exclusively to create the videos of this study (see Supplementary Figure S5).

ZOOM Handy Recorder H4
The ZOOM Handy Recorder H4 is an all-in-one recorder that combines a handy size with a high-performance stereo condenser microphone, a Secure Digital (SD) card recorder, a mixer, an effect section, and some further features. It is possible to switch between the stereo mode and a 4-track mode, which provides the playback of four simultaneous tracks and the recording of two tracks. There are several sampling rates to choose from, the tracks can be recorded in WAV or MP3 format, and the device is easily connected to the computer via SD. All audio tracks used in the videos were recorded with the ZOOM Handy Recorder H4.

Network models
iAT-PLT-636
The first model used as an application example is a cell-scale, charge- and mass-balanced biochemical network reconstruction of the human platelet with 1008 reactions (including biochemical transformations as well as intracellular and extracellular transporters), 225 proteins, 636 genes, and 738 compartment-specific metabolites, depicted in this network. The model was constructed with the primary aim of offering a scaffold with which further data-driven examination could bring deeper insight into the influence of platelets on human diseases. The exact layout used for this work was created specifically for this study and is available as supplementary material of this article. The model is freely available on the BiGG Models database (King et al., 2016).

Figure S5 | Screenshot of Sony VEGAS Pro 12.0. This program was used exclusively for video and audio editing.

iAB-RBC-283
For the red blood cell visualization, we used a modified version of the erythrocyte metabolic reconstruction iAB-RBC-283 (Bordbar et al., 2011), which had previously been used for building personalized kinetic models (Bordbar et al., 2015). The model iAB-RBC-283 is a full bottom-up reconstruction of the human red blood cell with 292 intracellular reactions, 77 transporters, and 267 unique metabolites. It accounts for 283 metabolic genes, which suggests that the metabolic role of erythrocytes is much more diverse and expansive than previously presumed. An existing layout of the human red blood cell (RBC) metabolic network was used (Yurkovich et al., 2017a). The model is freely available on the BiGG Models database (King et al., 2016).

Preparation of network layouts
All systems biology network maps used as input for this study were manually drawn using Escher (King et al., 2015) and stored in the native JSON-based format of this software. The stand-alone program Escher-Converter converted these maps to SBML format (Hucka et al., 2003) Level 3 Version 1 Release 2 with layout information (Gauges et al., 2006, 2015) and flux balance constraints (Ollivier and Bergmann, 2018). The annotation application Model-Polisher merged the resulting maps with knowledge from the BiGG Models database (King et al., 2016; Römer et al., 2016).

Experimental data
The experimental data used to create the videos of the new time-dependent visualization modes were provided in CSV format, structured as a table. The first column defines the time, and all other columns are labeled with the identifier of the corresponding metabolite. Each row corresponds to one measured time point.
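Reading a table with this layout could look like the following sketch, which uses only the Python standard library. The function name `read_time_series`, the treatment of empty cells as missing values, and the sample identifiers and numbers are illustrative assumptions and do not stem from the data sets used in this study.

```python
import csv
import io


def read_time_series(handle):
    """Parse a CSV table: first column is time, the others are metabolites.

    Returns (times, series), where series maps each metabolite identifier
    from the header row to its list of values, one per time point.
    """
    reader = csv.reader(handle)
    header = next(reader)
    metabolite_ids = header[1:]
    times = []
    series = {m: [] for m in metabolite_ids}
    for row in reader:
        if not row:
            continue  # skip blank lines
        times.append(float(row[0]))
        for m, cell in zip(metabolite_ids, row[1:]):
            # Assumption: an empty cell means "not measured" at this time.
            series[m].append(float(cell) if cell.strip() else None)
    return times, series


# Made-up example content with hypothetical metabolite identifiers.
example = io.StringIO("time,glc__D_c,lac__L_c\n0,5.0,1.2\n3,4.1,2.8\n")
times, series = read_time_series(example)
```

For a file on disk, the same function can be called with a handle obtained from `open(path, newline="")`.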

Experimental data used for iAT-PLT-636
As experimental data for human platelets, we chose a metabolomics study that had been examined with various quantitative methods but without graphical visualization. Paglia et al. (2014) investigated the change in metabolism of stored platelet concentrates over time. Because of several beneficial clinical applications, storing platelet concentrates (PCs) at the highest possible quality is worthwhile. However, during storage, human platelets (PLTs) quickly suffer a substantial loss of quality. This process of metabolic decay is called the Platelet Storage Lesion (PSL). As their primary result, Paglia et al. could distinguish three different metabolic states during PSL. The state of short-term stored PCs (day 0-3) is mainly characterized by a microenvironment that differs from the initial one. This is caused by the accumulation of different acidic metabolites secreted into the storage solution, i.e., succinate and malate. The accumulation of acidic substances leads to a decrease in pH, which marks a significant transition point in platelet metabolism. The second state (day 4-6) is apparent in medium-term stored PCs, which show moderate mitochondrial metabolic activity and increased ATP production to sustain PLT metabolism as long as possible.
In the extreme state of long-term stored platelets (day 7-10), probably irreversible changes occur, which then lead to a faster decay and total cell lysis. Considering all phenotypes, the decline of platelet metabolism is a nonlinear process consisting of successive metabolic shifts.

Experimental data used for iAB-RBC-283
We used the quantitative metabolomic data from Yurkovich et al. (2017b), who examined the dynamics of RBCs under storage conditions at different temperatures. In that study, systems biology is used to fill the gap between temperature dependency studies on a biochemical and on a physiological level. The goal of Yurkovich et al. was to examine temperature dependencies of metabolism in red blood cells. They used deep-coverage, quantitative, time-course metabolomics of human red blood cells in storage and investigated metabolic changes under four different temperature conditions: 4°C (storage temperature), 13°C, 22°C, and 37°C (body temperature). Studying temperatures in this range provides baseline data without transcriptional or translational regulation (red blood cells do not have genetic material), which offers insights into temperature dependence in a broader context.