MOSBIE: a tool for comparison and analysis of rule-based biochemical models
BMC Bioinformatics, volume 15, Article number: 316 (2014)
Mechanistic models that describe the dynamical behaviors of biochemical systems are common in computational systems biology, especially in the realm of cellular signaling. The development of families of such models, either by a single research group or by different groups working within the same area, presents significant challenges that range from identifying structural similarities and differences between models to understanding how these differences affect system dynamics.
We present the development and features of an interactive model exploration system, MOSBIE, which provides utilities for identifying similarities and differences between models within a family. Models are clustered using a custom similarity metric, and a visual interface is provided that allows a researcher to interactively compare the structures of pairs of models as well as view simulation results.
We illustrate the usefulness of MOSBIE via two case studies in the cell signaling domain. We also present feedback provided by domain experts and discuss the benefits, as well as the limitations, of the approach.
Modeling approaches used in computational systems biology range from phenomenological to detailed-mechanistic. A popular type of mechanistic modeling uses chemical kinetics, where models are defined in terms of collections of species that interact via reactions. A shortcoming of the traditional chemical kinetics approach is that the number of distinct species and reactions in a biochemical system can be combinatorially large [3, 4]. A modeling approach that aims to overcome this “combinatorial explosion” is rule-based modeling (RBM). Rule-based models differ from traditional chemical kinetics models in that they explicitly specify the parts of biological molecules that directly participate in and are modified by biochemical interactions. A detailed tutorial on RBM can be found in Ref.
When constructing a model of a biological process, a researcher may begin with a commonly accepted model of the process and build on it over time, modifying and expanding its structure to test different hypotheses. These models can be represented as graphs, with enzymes and other reactants inside cells shown as nodes, and the reaction rules that govern their interactions depicted as edges. As the researcher develops the model, they may make several unrelated alterations, e.g., adding or deleting an interaction in one case while changing the initial concentration of a chemical species in another. These branches in the development of a model can then lead to even more branches. At some point, keeping track of the numerous paths taken in the process of building the model can become unmanageable, with little or no documentation as to how the development of one path was affected by another. Additionally, two models may involve the same molecules with different component structure and interactions, leading to different outcomes. It would be useful for a researcher to be able to directly compare these models, looking for both similarities and differences in their structure.
Although several software tools have been developed for interactive visualization of rule-based models, including RuleBender [8–10], rxncon , and Simmune NetworkViewer , these tools aim to assist in viewing and understanding one model at a time and do not directly support model comparison, the focus of the current work. As we discuss in more detail below, the problem of model comparison is closely related to that of graph comparison, from which several useful techniques can be adapted.
In this paper, we present MOSBIE (MOdel Simulation Browser and Interactive Explorer), an interactive exploration system that supports pairwise comparison of rule-based models both in terms of model structure and dynamical behavior.
Structural comparisons are performed on the basis of a compact, scalable, visual abstraction called an interactive contact map [10, 13]. We define a similarity metric over this contact map abstraction that enables clustering of similar models. Using the map representations and the similarity metric, we then design a visual interface for structurally exploring pairwise differences and family relationships. The utility of the tool is illustrated through two case studies and feedback from domain experts.
There are two broad motivations for comparing the similarities and differences within a family of models. In the first case, a research team is building a family of models up from a base model over time. As members leave the project, new members join to replace them. The continuity of the project is thus greatly facilitated by the ability of the new members to browse the history of the model and identify when and where modifications were made. Identifying the common core among the family of models is essential, since the elements that are not present in the core represent modifications to the model.
In the second case, a researcher intends to model a particular signaling pathway or set of pathways. As part of this process, they would want to see what elements of that pathway have been previously modeled, and explore the relationships among existing models in the literature. The researcher downloads several models from one of the several existing online databases [14–17] in a commonly-used model exchange format such as the Systems Biology Markup Language (SBML) . The researcher would like to see at a glance which model components are shared and which are unique.
Starting from these two motivating cases, and through close interaction with domain experts, we identified the following major tasks where visualizations can benefit model comparison in the area of cell signaling. Because of the similarities between model usage in this domain and in other domains, we assert that many of these tasks have global applications to model comparison beyond the cell signaling domain.
Identify similar structures within models. Identifying similar structures is beneficial because if two different models share a common core, it is likely that those models can be combined to form a single, more-complete model. Additionally, searching for a single structure common to a significant subset of a family of models can help to identify models missing this structure. This can help researchers make observations about the functionality of that subset of models.
Identify structures that differ between pairs of models. Performing a pairwise comparison similar to task 1 with the goal of identifying structures that differ between the models helps researchers identify model components present in one model that do not appear in the other. Researchers can use this information to explore the functional effects of the structural differences between models. When identifying both the similarities and differences between graphs, minimizing layout differences is essential to enable the user to see changes [19, 20].
Sort/cluster models by similarity. Sorting models by degree of similarity helps to minimize visual differences between graphs in proximity to each other, facilitating comparison. As such, a method for computing the similarity of a pair of models should be developed or adopted from the literature. The models should then be laid out based on these scores in a clear and visually pleasing way.
Support pairwise detailed comparison. Building upon the similarity and difference comparison of a pair of models, a researcher should also be able to examine the similar or differing structures of the models in more detail. In particular, the researcher may wish to examine the individual rules within the model to determine the level of similarity.
Explore the functional effects of differences between model structures. The researcher may also wish to explore the functional effects of model changes. In particular, the researcher should be able to perform a pairwise comparison of the simulation results or other species and reactions in the generated network of a model, in order to identify how the changes within a model affect the generated outputs.
Organize and browse model repositories. A researcher should be able to use this system to organize and browse a set of possibly unrelated models from a database or online repository. The researcher should still be able to look at the similar and different structures across the collection of models under examination.
Enable sharing of model layouts with other researchers. Finally, if a researcher wishes to highlight important structural features that were custom-encoded into a model, that researcher must also be able to convey the layout of the model along with the model itself. Simply sharing a screenshot of a model is not sufficient, because the shared model must remain interactive and retain all of its properties. Therefore, although the model language may not specify any structural layout information, that information needs to be maintained.
This task analysis breakdown shows that a number of problems related to the comparison of models can be solved or aided with visualization. Specifically, Tasks 1–6 can be performed with a clear visual representation of the model(s), and are specifically addressed in this work. Task 7, on the other hand, is not specifically a visualization challenge, but can be facilitated by specific aspects of our visualization system.
Computing graph similarity
A number of methods have been proposed for computing the similarity of two graphs. Zeng et al. compute a similarity score for a pair of graphs from their edit distance, counting the number of node and edge edit operations required to transform graph G into graph H. Bunke and Shearer compute a similarity score from the maximal common subgraph, i.e., the largest isomorphic subgraph present in both G and H. Ullman presents an algorithm that finds subgraph isomorphisms using a brute-force tree search, pruning the tree to reduce the number of successor nodes that must be examined. Our approach to computing a graph similarity score builds on these ideas, considering maximal common subgraphs while also accounting for differences between graphs computed through edit counts.
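For uniquely labeled graphs such as contact maps, the edit-count idea reduces to simple set operations. The following sketch is illustrative only (the `(node_set, edge_set)` representation and the function name are our own assumptions, not code from any of the cited systems); general graph edit distance on unlabeled graphs is far harder.

```python
def labeled_edit_distance(g, h):
    """Edit distance for graphs whose nodes carry unique labels.

    g, h: (node_set, edge_set) pairs, edges as frozensets of labels.
    """
    g_nodes, g_edges = g
    h_nodes, h_edges = h
    # Symmetric differences count the node and edge insertions and
    # deletions needed to turn one graph into the other.
    return len(g_nodes ^ h_nodes) + len(g_edges ^ h_edges)

g = ({"A", "B", "C"}, {frozenset("AB"), frozenset("BC")})
h = ({"A", "B", "D"}, {frozenset("AB")})
dist = labeled_edit_distance(g, h)   # C and D differ, edge BC differs
```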
A number of recent projects have included simulation journaling components to track simulations, steer computations, and refine models. The World Lines system [24–26] simulates flood response and control, storing options for different simulations as a timeline tree. Similar to World Lines are tracking graphs, as seen in Widanagamaachchi et al., which show the evolution of features over time as a collection of feature tracks that may merge or split. Likewise, the PORGY system enables simulation steering through direct manipulation of graph components, using a node-link representation to show transitions between graph states. Our system indicates the links between models and simulations through their proximity to each other, as computed by the custom similarity score noted above.
Visualizing temporal network changes
Misue et al. note that the primary factor to consider when visualizing network changes over time is preserving the user’s mental map: minimizing unnecessary changes to the structure of the graph while emphasizing patterns within it. Four primary mechanisms have been used for visualizing changes in networks while preserving the mental map. Using an extra dimension to show network changes over time can add clutter to the visualization, but can be effective when used appropriately, with either a full extra dimension or simply a “half-dimension”. Small multiples are useful for side-by-side comparisons of two or more networks, at the cost of losing some network detail due to the reduced size; they are featured in ego networks and the Semantic Graph Visualizer project. Animations are useful for directly showing how a graph transforms over time, but are occasionally too complex or too fast to be accurately perceived. Such animations have been studied in projects such as DynaVis. Finally, interactions for comparing graph states over time come in various forms, including interactive tree layouts, configurable layout algorithms in 3D graphs, and time sliders.
The top-level design of our tool is informed by our formal task analysis (see “Background”). Since many of these tasks feature comparisons, we selected a small multiples design, which allows the comparative exploration of models. We mitigated the issue of detail loss by providing a zoom function for the individual multiples, using animated transitions in the zoom action as suggested by Shanmugasundaram and Irani and slow-in/slow-out pacing as recommended by Dragicevic et al. The front end also allows the exploration of previous simulations and versions of a specific model.
The small multiples view provides a compact, scalable, visual encoding of models through an abstraction called an interactive contact map [10, 13]. The view further allows the comparison of similarities and differences between pairs of models represented as contact maps. An interactive contact map is a compact, interactive graph representation of a complete model; this representation lies at the core of our scalable approach. The molecules and binding sites in the biological model become nodes in an undirected graph, while the reaction rules are mapped to edges and component states. The contact map provides a global, compact view of the model. As discussed below, the interactive contact map can visually map models featuring hundreds of species and thousands of reactions into compact graphs featuring dozens of edges and nodes. The small multiples view is enabled by three modules: a Contact Map Manager, a Comparison Engine, and a Layout Stabilization and Overlay Module.
Contact map manager
The Contact Map Manager handles the parallel loading of a family of models from disk, generating contact map representations for these models and laying the contact maps out on screen. It further supports interactions such as panning and zooming, highlighting similarities and differences between pairs of models, identifying common edges and nodes across an entire family of models, showing the states of a model, and opening the model in the default editor interface for closer inspection of parameters and simulation outputs.
Each contact map is kept concise and scalable by limiting the number of nodes in the map: molecules and binding sites are uniquely identified by single nodes, regardless of how many times they appear in the model rules. For example, the epidermal growth factor receptor (EGFR) model in Figure 1 contains 24 different reaction rules, which are compressed into a contact map with six edges and three modifiable components. The rules of the model generate a system of 356 species and 3,749 unidirectional reactions involving those species. It is worth noting that the contact map is an abstraction of the generative-model for the system, not of the implied reaction network. Molecules are represented as large nodes that contain smaller, internal nodes that represent binding sites (yellow nodes) and components containing states that are modified by the rules (purple nodes). Note that a modifiable component may also participate in bonds (e.g., the Y317 component of Shc). The reaction rules in the model are represented as the edges connecting the binding sites, and several rules may map to a single edge.
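As a rough illustration of this compression, the sketch below reduces rules to the binding-site pairs they mention and keys each node by its (molecule, component) name, so a site yields one node no matter how many rules reuse it. The data structures and names here are our own simplifications, not MOSBIE's internal representation.

```python
def build_contact_map(rules):
    """rules: list of ((molecule, component), (molecule, component)) bonds."""
    nodes = set()  # one node per distinct (molecule, component)
    edges = set()  # one undirected edge per distinct site pair
    for a, b in rules:
        nodes.add(a)
        nodes.add(b)
        edges.add(frozenset((a, b)))
    return nodes, edges

# Many rules reusing the same few sites collapse to a small graph,
# echoing the EGFR example above (24 rules map to just six edges).
rules = ([(("egfr", "l"), ("egf", "r"))] * 10
         + [(("egfr", "r"), ("egfr", "r"))] * 14)
nodes, edges = build_contact_map(rules)
```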
By default, the contact map is drawn using a force-directed layout algorithm which is intended to minimize edge crossings to preserve clarity. A user can manipulate the location of the nodes to convey structural information about molecules and components with the layout of the graph  (discussed further in “Layout Stabilization”). The contact maps corresponding to the members of a model family are laid out in a small multiple display and rendered in grayscale in order to focus attention on the similarity and difference highlights generated by the Comparison Engine.
The Comparison Engine serves two major purposes. First, it sorts the models by complexity in order to minimize differences between neighboring panels and thereby preserve the viewer’s mental map of the core model. Second, it calculates similarities and differences between the models, both pairwise and across the full family, which are then passed to the Contact Map Manager for display.
To compute the visual similarity of a graph (i.e., contact map), we first create an adjacency matrix representation. In an adjacency matrix, each row and column is labeled with a node from the graph, and each entry contains a 0 or 1 depending on whether an edge exists between the two corresponding nodes. In our contact map implementation, a node can be a molecule, a component, or a state. We follow a bottom-up approach in constructing the adjacency matrix. Starting from the finest granularity, a state is guaranteed to be a row/column in the adjacency matrix. A component is included as a row/column if it has no states already included in the matrix. A molecule is guaranteed to have at least one component, although that component could represent the entire molecule. This numerical representation of the graph enables us to construct a visual similarity metric as described below.

The first challenge in computing a similarity score for two models is defining what makes two graphs similar. Figure 2 shows two descriptive examples of similar model pairs. In the first, two graphs share a large number of nodes and edges, representing a majority of each graph. These two graphs are certainly similar, as their differences represent only a small percentage of the overall structure. In the second example, the two graphs share only a small number of nodes and edges, but one graph is a subgraph of the other. Since the structure of the smaller graph is mostly (or completely) contained within the larger graph, we can argue that these graphs are also similar.
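The bottom-up row/column selection can be sketched as follows; the dictionary-based model representation and the function names are illustrative assumptions, not MOSBIE's internal API.

```python
def matrix_labels(molecules):
    """molecules: {mol_name: {comp_name: [state names]}} (illustrative)."""
    labels = []
    for mol, comps in molecules.items():
        for comp, states in comps.items():
            if states:  # finest granularity: every state gets a row
                labels += [(mol, comp, s) for s in states]
            else:       # a stateless component gets its own row
                labels.append((mol, comp, None))
    return labels

def adjacency_matrix(labels, edges):
    """edges: set of frozensets of label tuples; returns a 0/1 matrix."""
    idx = {lab: i for i, lab in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for e in edges:
        pair = tuple(e)
        a, b = pair if len(pair) == 2 else (pair[0], pair[0])  # self-bond
        m[idx[a]][idx[b]] = m[idx[b]][idx[a]] = 1
    return m

# Shc with a two-state Y317 component and a stateless PTB component.
labels = matrix_labels({"Shc": {"Y317": ["u", "p"], "PTB": []}})
m = adjacency_matrix(
    labels, {frozenset({("Shc", "PTB", None), ("Shc", "Y317", "p")})})
```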
To account for both of these similarity examples, we propose and construct four similarity matrices which reflect graph similarity. A similarity matrix has the same basic structure as an adjacency matrix. However, instead of nodes in the rows/columns, a similarity matrix contains an entire model. Instead of Boolean edge existence values inside the matrix, a similarity matrix contains a real number representing how similar two graphs are by some measure.
The first two of our similarity matrices handle the first similarity example case. One similarity matrix counts the number of nodes that the two graphs share, while another counts the number of edges that the two graphs share. The other two similarity matrices handle the second similarity example case. One similarity matrix counts the percentage of nodes that the two graphs share, while another counts the percentage of edges that the two graphs share. In each of these cases, we calculate the percentage of nodes/edges in the smaller graph that are present in the larger graph. Hence, if graph G is a subgraph of graph H, then the similarity score by this measure is 100% for both nodes and edges, regardless of the number of nodes in graphs G and H.
We calculate an absolute similarity score by multiplying the number of nodes that the graphs share by the percentage of nodes that they share, multiplying the number of edges that the graphs share by the percentage of edges that they share, and then adding these two products. Thus, to compute the absolute similarity between graphs G and H, we use the following similarity formula:

sim(G, H) = n_s(G, H) · p_n(G, H) + e_s(G, H) · p_e(G, H),

where n_s and e_s denote the numbers of shared nodes and edges, and p_n and p_e denote the corresponding shared percentages defined above.
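Assuming a contact map is reduced to a (node_set, edge_set) pair, the four measures and the combined score might look like the sketch below. The names are ours, not MOSBIE's; the percentages are taken relative to the smaller graph, as described above, and the example assumes both graphs have at least one node and one edge.

```python
def similarity(g, h):
    """Combined similarity score for two (node_set, edge_set) graphs."""
    g_nodes, g_edges = g
    h_nodes, h_edges = h
    shared_nodes = len(g_nodes & h_nodes)
    shared_edges = len(g_edges & h_edges)
    # Percentages are relative to the smaller graph, so a pure
    # subgraph scores 100% regardless of the larger graph's size.
    pct_nodes = shared_nodes / min(len(g_nodes), len(h_nodes))
    pct_edges = shared_edges / min(len(g_edges), len(h_edges))
    return shared_nodes * pct_nodes + shared_edges * pct_edges

g = ({"A", "B", "C"}, {frozenset("AB"), frozenset("BC")})
h = ({"A", "B"}, {frozenset("AB")})   # h is a subgraph of g
score = similarity(g, h)              # 2 * 1.0 + 1 * 1.0
```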
To sort the models in our small multiples view, we precompute similarity scores for each pair of graphs. We also compare each graph to the most complete graph, and sort the models row-wise into the small multiples view based on their similarity score in comparison with the most complete model. In our implementation, we assume that the most complete graph is the graph with the greatest number of nodes and edges. The assumption is based on our understanding of the iterative development of biological model families: researchers continue to add molecular structures and interaction rules to models to obtain an increasingly complete representation of the physical process. As such, the number of nodes and edges will generally increase as the model is developed. An example of this layout is shown in Figure 3, and is described more fully in “Layout Stabilization and Overlay Module”.
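A minimal sketch of this sorting step, under the same illustrative (node_set, edge_set) representation: the reference model is the one with the most nodes plus edges, and a simple shared-element count stands in for the full similarity score.

```python
def sort_by_reference(models):
    """models: {name: (node_set, edge_set)}; returns names sorted by
    similarity to the most complete (largest) model."""
    def size(graph):
        nodes, edges = graph
        return len(nodes) + len(edges)

    # Assume the most complete model is the one with the most
    # nodes and edges combined.
    ref_nodes, ref_edges = max(models.values(), key=size)

    def score(name):
        nodes, edges = models[name]
        return len(nodes & ref_nodes) + len(edges & ref_edges)

    return sorted(models, key=score, reverse=True)

models = {
    "full":  ({"A", "B", "C"}, {frozenset("AB"), frozenset("BC")}),
    "mid":   ({"A", "B"}, {frozenset("AB")}),
    "small": ({"A"}, set()),
}
order = sort_by_reference(models)
```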
The second feature of the Comparison Engine is its ability to locate similarities and differences between models, which are then displayed using a bubbleset overlay. For computing similarities, we iterate over all nodes and edges in one model, creating a unique identifier for each structure (by molecule name, component name, state name, and the number of times seen). We then iterate across all models, searching for that structure (synonyms are currently not allowed). If the structure exists in the other model, we add it to an internal list. Once the iteration is complete, the internal list of structures is passed to the Contact Map Manager for display. Similarly, for computing differences, we iterate over the nodes and edges of both models, looking for structures that exist in one model but not the other. When such a structure is found, it is likewise added to an internal list. In addition to pairwise comparison, our system also supports identifying a single node or edge across the entire model family. This allows a researcher to identify which members of a family of models contain a certain problematic rule, or a binding site that is no longer of functional relevance to the behavior of the model. Because each of these comparison processes requires nested iteration over the node and edge sets of both models, the computational complexity of our comparison algorithm is O(n²).
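The identifier-based matching might be sketched as follows. The tuple layout (molecule, component, state, occurrence index) mirrors the description above, while the function names and data structures are illustrative assumptions; set operations replace the explicit nested loops here, but the matching semantics are the same, and synonyms are deliberately not resolved.

```python
from collections import Counter

def identifiers(nodes):
    """nodes: list of (molecule, component, state) tuples."""
    seen = Counter()
    ids = set()
    for node in nodes:
        seen[node] += 1
        # The occurrence index keeps repeated names distinguishable.
        ids.add(node + (seen[node],))
    return ids

def shared(a_nodes, b_nodes):
    """Structures present in both models."""
    return identifiers(a_nodes) & identifiers(b_nodes)

def different(a_nodes, b_nodes):
    """Structures present in exactly one of the two models."""
    return identifiers(a_nodes) ^ identifiers(b_nodes)

a = [("Lig", "l", None), ("Lig", "l", None), ("Rec", "a", None)]
b = [("Lig", "l", None), ("Rec", "a", None), ("Syk", "tSH2", None)]
common = shared(a, b)   # one Lig site and the Rec site
diff = different(a, b)  # the second Lig site and the Syk site
```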
Layout stabilization and overlay module
Laying out the contact maps in a consistent manner across all of the small multiples facilitates the visual comparison of similar models, and in fact may be required for visual comparison in more complex models. We implement layout stabilization by storing the nodes and their user-assigned positions for a particular graph, and then applying the stored layout across a family of models. Nodes not present in the stored layout are assigned positions using a force-directed algorithm. By default, layout positions are taken from each individual model. However, the user may override this choice by selecting a stored layout from a drop-down list.
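Layout stabilization can be sketched as a lookup with a fallback; here a seeded random placement stands in for the force-directed pass over new nodes, and all names and structures are illustrative rather than MOSBIE's own.

```python
import random

def stabilize(stored_layout, model_nodes, seed=0):
    """stored_layout: {node_id: (x, y)}; model_nodes: iterable of ids."""
    rng = random.Random(seed)
    layout = {}
    for node in model_nodes:
        if node in stored_layout:
            # Reuse the stored position so the shared frame of
            # reference is preserved across the whole family.
            layout[node] = stored_layout[node]
        else:
            # Stand-in for a force-directed pass over the new nodes.
            layout[node] = (rng.random(), rng.random())
    return layout

stored = {"Rec": (0.5, 0.5), "Lyn": (0.2, 0.8)}
layout = stabilize(stored, ["Rec", "Lyn", "Fyn"])  # Fyn is new
```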
When further analyzing a subset of models in a family, it is helpful to easily identify similarities and differences in the structures of each model. The similarities and differences that we compute via the Comparison Engine are highlighted with a bubbleset overlay on the relevant small multiples. When running a differences comparison, if one graph is a complete subgraph of the other, the result in the smaller graph will be an empty overlay bubbleset (an example of this empty bubbleset can be seen in the section “Case study 1: comparison of a model family” below). To show that this subgraph model is a member of the comparison, the backgrounds of models under pairwise comparison are highlighted. An example of a similarity comparison bubbleset highlight with layout stabilization is shown in Figure 4.
Figure 4 emphasizes the scalability advantages of the interactive contact map representation. The system represented compactly here, with at most six edges, generates over 300 molecular species and more than 3,000 unidirectional reactions among those species. The contact map representation combined with layout stabilization makes similarities and differences between the models easy to spot. For example, in the highlighted panes of Figure 4, it can be seen that the two models share the molecules Grb2, Shc, and egfr, but differ in that egfr in the top model has an additional component. Also, the top model contains the molecule Sos, which can bind Grb2, whereas the bottom model does not. However, the bottom model contains egf, which can bind egfr. Given the size of the rule sets of the underlying models corresponding to each of these multiples, these differences would be arduous to determine from a text-based comparison of the rules, or from unprocessed network diagrams of the two systems.
Availability and requirements
The MOSBIE system is open source and cross-platform, with 32- and 64-bit releases available for Windows, Linux, and Mac OS X. The system uses Java, the Rich Client Platform (RCP), Perl, and the Prefuse libraries. MOSBIE is implemented as a perspective in the RuleBender interface for rule-based modeling [8, 9]; the RuleBender release includes the BioNetGen software as well as NFsim, an additional simulator that allows for efficient simulation of large models. No installation is required: unzip the downloaded archive to a directory and the application will run directly. The system can be downloaded at http://visualizlab.org/mosbie. Sample models are located in the SampleModels/BNG directory of the decompressed archive. All the example models and auxiliary layout files used in this paper can also be found in Additional file 1 provided with the manuscript.
In this section we report on the performance of MOSBIE. We follow with two case studies from the application domain, and finally report feedback from domain experts.
We report the time required to calculate the similarity matrices, as well as the time to sort a collection of models, using an HP Pavilion g7 machine with 6 GB RAM and an i3 2.3 GHz dual-core processor. Our test set of 20 models from the fceri family represents biological systems with thousands of species and tens to hundreds of thousands of reactions. For example, the fceri_fyn model generates a reaction network of 1,281 species and 15,256 reactions, and the fceri_fyn_trimer model generates 20,881 species and 407,308 reactions. In their interactive contact map representations, the models in this family are captured as graphs with between 16 and 21 nodes and 4 to 5 edges.
The model set includes the nine models reported below in the first case study, plus eleven duplicates drawn from these nine models to reach a total of twenty. This duplicate-set construction enables the performance evaluation of our approach on a larger set of models of the same significant size as the original fceri family. The duplicate approach is reasonable in this case: because the comparison algorithm iterates through models in the same fashion regardless of their structure, its computing time is not reduced when duplicate models are compared. We found that computing the four similarity matrices on this set (see “Sorting models”) required 0.25 seconds and that sorting the models based on their similarity to the most complete model required 0.0052 seconds. This computation time stands in contrast to the 14.56 seconds required to load the collection of models from disk and build the contact maps.

To evaluate the performance of our browsing system, we calculated the average computing time for model comparison. We evaluated computing both similarities and differences across five model families, including the three families reported in the case studies and feedback section (Figure 5). Smaller model families, ranging from 21 to 121 combined nodes and edges, took less than 100 milliseconds per comparison run. The largest model family we attempted had a combined 295 nodes and edges; the mean comparison time was slightly over one second.
Using the same test set of models and identical machine configuration as in the previous experiment, we computed the amount of time required to (i) locate a node in the family of models, (ii) locate an edge in the family of models, (iii) compare the similarities between a pair of models, and (iv) compare the differences between a pair of models. In all cases, the comparison took less than a quarter of a second to complete, including the call to the Comparison Engine and the display refresh.
Case study 1: comparison of a model family
In this case study, a computational biologist explores a family of rule-based models that describe signaling through the FcεRI membrane receptor, looking at various properties of the set of models. The biologist begins by loading the family of models; a directory is selected through a standard dialog box and all models in that directory are loaded into the system. Reading the models from disk and generating the contact maps from the rules requires roughly one second per model. As the models are loaded, the Comparison Engine computes the similarity matrices for the family, sorts the models, and lays them out appropriately in the small multiples panel. This reflects Task 3 from our task analysis.
Next, the biologist enables layout stabilization across the model family (Task 7). With this new layout, the biologist notices that all of the models seem to have a common core structure, with a large Rec molecule centrally located, and surrounded by Syk, Lyn, Lig, and occasionally Fyn molecules. To confirm that this common structure does indeed exist, the biologist selects the “Compare similarities” radio button, then begins to select pairs of models to compare. Through this selection process (which maps to Task 1), the biologist confirms via a bubbleset overlay that this core structure does exist throughout the model family, with a few small differences. One such comparison is shown in Figure 6.
To investigate some of these differences more closely, the biologist switches the radio button selection to “Compare differences,” which generates a different bubbleset overlay. In one case, comparing the fceri_fyn and fceri_fyn_trimer models, the biologist notices that a single binding site in the Lig molecule differs between the two models (Figure 7). This “compare differences” action maps to Task 2 from our task analysis. Noting this difference of a single binding site, the biologist now wishes to learn how this change in the model affects the concentrations of certain species in the model simulations. Even subtle changes to the model can result in significant changes in the network output. By selecting the “Open Simulations” option from a context menu on either highlighted model, the most recent simulations for each model are identified and opened for the researcher to compare. These simulations are shown in Figure 8, which addresses Tasks 4 and 5. It should be noted that to validate the significance of this comparison the user would need to check that the parameter values governing reaction rates and initial species concentrations were the same between the two models. This can be done in several steps in MOSBIE by opening the corresponding model input files and comparing the parameter blocks.
With the simulation outputs displayed, the researcher can note that, while the concentrations of the observables follow similar curves, the fceri_fyn_trimer outputs grow at a rate roughly 50% faster than those in the fceri_fyn model (Figure 8). Additionally, the concentration of RecSykPS is higher than the concentration of RecPbeta throughout the full simulation of fceri_fyn_trimer, whereas the opposite occurs in fceri_fyn. From this observation, the researcher notes that it is clear that the addition of a third ligand site significantly increases the rate of phosphorylation of the receptor (RecPbeta and RecPgamma curves in Figure 8) and of Syk (RecSykPS curve). The effect on Syk phosphorylation is amplified in comparison to the effect on receptor phosphorylation, which is seen by a change in the ordering of the curves in the top and bottom panels.
It is worth emphasizing that the comparisons shown in Figures 6 and 7 involve large models. The fceri_fyn model generates 1,281 species and 15,256 reactions and the fceri_fyn_trimer model generates 20,881 species and 407,308 reactions. These models may take several hours to generate and simulate. As noted by the domain experts, MOSBIE reveals structural differences between models based on existing simulation data, without the user having to regenerate the results. Thus, MOSBIE potentially saves hours of simulation time.
Case study 2: comparison of models from a database
In this second case study a researcher is developing a model of the EGFR signaling network. The researcher wants to see which molecules and interactions have been included in previous models, with an eye toward integrating these into the new model. The researcher finds two models of EGFR signaling in the BioModels database , with model IDs BIOMD0000000019 (Model 19) and BIOMD0000000048 (Model 48), and downloads them as reaction networks in SBML format. Both models are fairly large (Model 19 has 87 species and 236 reactions; Model 48 has 23 species and 47 reactions), and the only visual representations of the models provided in the respective papers [42, 43] use different nomenclature and layouts, making the models difficult to compare visually.
These models are not rule-based, and the molecular compositions of the species in each model are not explicitly provided. However, the models can be converted into a rule-based format and the species' molecular compositions recovered using a recent web-based tool called the Atomizer . Following successful translation to BioNetGen language (BNGL) format, both models are loaded into MOSBIE and their contact maps displayed. Because the two models use slightly different names to refer to some of the molecules they share in common, these names must be reconciled manually in the model editor.
Manual layout of the contact maps reveals the implicit molecular components and interactions of the original models (top row of Figure 9). The initial layouts of the contact maps for the two models are somewhat different. To facilitate comparison, layout stabilization is applied using the layout of the larger model, followed by manual correction of the position of the PLCg molecule in Model 48, and of its corresponding binding site in EGFR, so that they line up with the other molecules and components in the contact map. Selecting the "compare similarities" radio button, followed by zooming and recentering, results in the view shown in the bottom row of Figure 9.
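The core idea behind layout stabilization can be sketched simply: every node that also appears in a reference layout is pinned to its stored coordinates, and only unmatched nodes are left for the layout algorithm to place. The following minimal sketch (MOSBIE's actual implementation is more elaborate; node names and coordinates are illustrative) captures this rule:

```python
def stabilize_layout(reference_layout, nodes, default_pos=(0.0, 0.0)):
    """Pin every node found in the reference layout to its stored
    coordinates; unmatched nodes get a placeholder position that a
    subsequent force-directed pass would refine."""
    return {n: reference_layout.get(n, default_pos) for n in nodes}

# Coordinates saved from the larger model's layout (values illustrative):
reference = {"EGF": (0, 0), "EGFR": (1, 0), "Shc": (2, 1)}

# The smaller model shares EGF and EGFR but adds PLCg:
layout = stabilize_layout(reference, ["EGF", "EGFR", "PLCg"])
print(layout)  # PLCg falls back to the placeholder position
```

Because shared nodes keep identical positions across models, corresponding structures appear in the same place in both contact maps, which is what makes side-by-side visual comparison tractable.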
The similarity comparison immediately highlights a core set of elements common to both models. In fact, Model 19 contains all molecules and interactions present in Model 48, except for the PLCg molecule. The comparison in Figure 9 also shows that, in addition to including a number of extra molecules and interactions, Model 19 considers synthesis and degradation of EGF and EGFR, represented by the unstructured nodes connected to those molecules.
The similarity in the core structures of the models was not noted in the paper describing Model 19 , even though this model was published after the paper presenting Model 48 . Without MOSBIE, it is difficult to identify similarities and differences between models published in the literature because they are usually presented in the form of long lists of equations that use different nomenclature. Although the nomenclature problem must still be addressed manually, in our opinion this case study demonstrates the power of MOSBIE to enable model comparisons that would otherwise require monumental effort.
Domain expert feedback
In addition to the two case studies reported above, three computational biologist domain experts (co-authors on this work) used the MOSBIE system to explore model sets. The experts were most interested in using the system to locate core structures that are common across model families, including the TLR4 family shown in Figure 10. These core structures can be ideal sites for merging similar models into a larger structure.
As noted in “Performance Analysis”, comparison times for a pair of models the size of those in the TLR4 family (contact map representations with a maximum combined 295 nodes and edges) approach one second, which is roughly equivalent to the time required to load each model from disk and generate its contact map. Figure 10 shows a similarity comparison between two models in this family. The researchers were nonetheless pleased with this comparison time, as it is significantly faster than a manual comparison.
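MOSBIE's similarity metric is more involved than a plain set comparison, but a simple Jaccard index over contact-map nodes and edges conveys why such comparisons scale well: the cost is linear in the combined number of elements. A minimal sketch, with toy element sets (the molecule names are illustrative, not taken from the TLR4 models):

```python
def jaccard_similarity(model_a, model_b):
    """Jaccard index over the union of contact-map nodes and edges.
    Runtime is linear in the combined number of elements."""
    a = model_a["nodes"] | model_a["edges"]
    b = model_b["nodes"] | model_b["edges"]
    return len(a & b) / len(a | b) if (a | b) else 1.0

m1 = {"nodes": {"TLR4", "MyD88"},
      "edges": {("TLR4", "MyD88")}}
m2 = {"nodes": {"TLR4", "MyD88", "TRIF"},
      "edges": {("TLR4", "MyD88"), ("TLR4", "TRIF")}}
print(jaccard_similarity(m1, m2))  # 3 shared elements of 5 total -> 0.6
```

A score of 1.0 indicates structurally identical contact maps, and scores near 0 indicate little overlap, which is the kind of signal used to cluster models in the browser view.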
The domain experts expressed satisfaction with the Layout Stabilization module. They noted that, in addition to making it easier to visually compare models in the explorer view, they could also package the layout information with the model files when sharing models with other researchers. This allows the experts to highlight certain structures in discussions without either providing a screenshot or worrying about differences in the layout computed on each machine. It also enabled the experts to store their own custom layouts for models across multiple sessions.
Discussions with our domain experts led us to develop additional features that were not explicitly presented in the case studies. For example, the experts felt that the ability to highlight the location of a single node or edge across the entire model family (as opposed to the previously mentioned pairwise comparisons) would be useful for identifying which models in a family are missing a key structure. These discussions were also useful for refining some features of the system overall, such as using a grayscale color scheme for the small multiples so that the similarity and difference bubblesets stand out even more.
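The family-wide highlighting feature described above amounts to a membership query over the contact-map elements of every model in the family. A minimal sketch of the underlying check (model names and element sets are hypothetical):

```python
def models_missing(element, family):
    """Given {model_name: set of contact-map nodes/edges}, list the
    models in the family that lack the queried element."""
    return sorted(name for name, elements in family.items()
                  if element not in elements)

# Hypothetical family of contact-map element sets:
family = {
    "tlr4_base": {"TLR4", "MyD88"},
    "tlr4_trif": {"TLR4", "MyD88", "TRIF"},
    "tlr4_min":  {"TLR4"},
}
print(models_missing("MyD88", family))  # only tlr4_min lacks MyD88
```

In the interface, the models returned by such a query are the ones left unhighlighted, immediately showing which members of a family are missing a key structure.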
Finally, our domain experts also praised the ability to browse the results of past simulations, as some model structures result in very large networks that require significant resources to simulate, and researchers are understandably reluctant to rerun such simulations. In our first case study, for example, the fceri_fyn_trimer model requires close to an hour for the network generation stage of the simulation alone.
Discussion and conclusion
In the absence of a contact map, obtaining a global understanding of the contents of even a single rule-based model from a set of rules in text form is difficult, and the difficulty is compounded when comparing models. While a binary comparison of two models based on ≲ 30 rules could be done by hand by someone well versed in reading rules, MOSBIE offers the power to compare many models at once. The first case study, in which nine models with relatively subtle structural differences are compared, illustrates this scaling issue. As shown in both case studies, MOSBIE allows detection of patterns that might otherwise be difficult to see.
The results of our case studies indicate that MOSBIE effectively supports the tasks we identified for browsing sets of models (Tasks 1–5 in “Task Analysis”) without requiring specialized training on the system. Our domain experts were able to begin exploring the families of models immediately, noticing similarities and differences in model structures and identifying relationships between the model results being compared.
Task 6, organizing and browsing online repositories, is not discussed in the case studies because there is currently no such online repository for rule-based models. However, introducing an online database of models is an interesting research direction that we are currently pursuing . When such a repository is developed, our system will be useful for comparing models that overlap in their composition, provided that a consistent annotation scheme is used to allow for accurate determination of common model components.
The visual comparison features that we have implemented could facilitate model merging in situations where models are developed by multiple research groups (Task 7). A prerequisite for comparing models developed by different groups is the reconciliation of identifiers (molecule names, component names, and component states) so that shared elements have the same identifiers in all of the models being compared or merged. Differences in protein nomenclature are, however, common in the literature . Annotations such as UniProt accession numbers (http://www.uniprot.org/) could be employed to facilitate identification of common elements. A tool that accommodates synonyms in protein nomenclature is an interesting direction for future work. In addition, to fully support cross-group model merging, additional interface features would be required, such as visual molecule and rule merging.
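The identifier-reconciliation step can be pictured as mapping each molecule name to a canonical identifier through a synonym table. The sketch below is illustrative only (the synonym table is hypothetical, although P00533 is the UniProt accession for human EGFR); a real tool would draw such tables from curated annotation resources:

```python
# Hypothetical synonym table mapping literature names to canonical IDs.
SYNONYMS = {
    "EGFR": "P00533",
    "ErbB1": "P00533",
    "HER1": "P00533",
}

def reconcile(names):
    """Map each molecule name to a canonical identifier, flagging
    names with no known mapping for manual review."""
    return {n: SYNONYMS.get(n, "UNMAPPED:" + n) for n in names}

print(reconcile(["EGFR", "ErbB1", "PLCg"]))  # PLCg is flagged as unmapped
```

Names that map to the same canonical identifier can then be treated as the same molecule during comparison or merging, while flagged names are surfaced to the modeler for manual resolution.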
A limitation of MOSBIE is that the browser-view model comparison is currently based only on contact maps, so models with similar but distinct rule sets may yield identical contact map representations in this view. Detecting such differences would require comparisons over more fine-grained representations of model structure, using for example the interactive approach described in . Such fine-grained comparison is, however, beyond the scope of the current work.
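This limitation can be made concrete: a contact map records only which components can bind, not the context under which a rule permits the binding. The simplified extractor below (which handles only single-bond binding rules written as `A(x) + B(y) -> A(x!1).B(y!1)`; it is a toy, not MOSBIE's parser) shows two different rule sets collapsing onto the same contact-map edge:

```python
import re

def contact_map_edges(rules):
    """Collect (molecule.component, molecule.component) pairs from
    simplified BNGL binding rules. Contextual conditions (e.g. required
    component states) are discarded, which is exactly why distinct
    rule sets can collide on the same contact map."""
    edges = set()
    for rule in rules:
        product = rule.split("->")[1]
        # Find the two molecules whose sites share bond label !1.
        sites = re.findall(r"(\w+)\(([^)]*?(\w+)!1[^)]*)\)", product)
        pair = tuple(sorted(f"{mol}.{site}" for mol, _, site in sites))
        edges.add(pair)
    return edges

# Model A: ligand binds receptor unconditionally.
rules_a = ["L(r) + R(l) -> L(r!1).R(l!1)"]
# Model B: binding additionally requires R's Y site in state ~P --
# a different rule set, but the same bound component pair.
rules_b = ["L(r) + R(l,Y~P) -> L(r!1).R(l!1,Y~P)"]

assert contact_map_edges(rules_a) == contact_map_edges(rules_b)
```

Both rule sets produce the single edge between `L.r` and `R.l`, so a contact-map-only comparison reports the models as structurally identical even though their dynamics differ.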
In conclusion, we have introduced a novel, powerful tool for analyzing structures and dynamics within biochemical model families. Our open-source system uses a compact, scalable visual abstraction called an interactive contact map and a similarity metric over this abstraction to enable the clustering of similar models. An intuitive interface further allows researchers to seamlessly compare pairs of models directly, to identify similarities and differences in the structure of models, and to directly compare model simulation outputs. This approach effectively streamlines the analysis of models, both existing and newly created. Domain expert feedback and two case studies highlight the benefits of using this exploratory system in the context of systems biology.
Abbreviations
MOSBIE: Model simulation browser and interactive explorer
EGFR: Epidermal growth factor receptor
SBML: Systems biology markup language
RCP: Rich client platform
Shc: Src homology 2 domain-containing-transforming protein C1
FcεRI: High affinity immunoglobulin epsilon receptor
Syk: Spleen tyrosine kinase
Lyn: Tyrosine-protein kinase Lyn
Fyn: Tyrosine-protein kinase Fyn
RecSykPS: Syk-phosphorylated Syk-receptor complex
RecPbeta: Beta-subunit phosphorylated receptor
RecPgamma: Gamma-subunit phosphorylated receptor
PLCg: Phospholipase C gamma
TLR4: Toll-like receptor 4
de Jong H: Modeling and simulation of genetic regulatory systems: a literature review. J Comput Biol. 2002, 9: 67-103. 10.1089/10665270252833208.
Aldridge BB, Burke JM, Lauffenburger DA, Sorger PK: Physicochemical modelling of cell signalling pathways. Nat Cell Biol. 2006, 8: 1195-1203. 10.1038/ncb1497.
Hlavacek WS, Faeder JR, Blinov ML, Perelson AS, Goldstein B: The complexity of complexes in signal transduction. Biotechnol Bioeng. 2003, 84: 783-794. 10.1002/bit.10842.
Hlavacek WS, Faeder JR, Blinov ML, Posner RG, Hucka M, Fontana W: Rules for modeling signal-transduction systems. Sci STKE. 2006, 2006: re6.
Chylek LA, Harris LA, Tung C-S, Faeder JR, Lopez CF, Hlavacek WS: Rule-based modeling: a computational approach for studying biomolecular site dynamics in cell signaling systems. WIREs Syst Biol Med. 2014, 6: 13-36. 10.1002/wsbm.1245.
Faeder JR, Blinov ML, Hlavacek WS: Rule-based modeling of biochemical systems with BioNetGen. Methods Mol Biol. 2009, 500: 113-167. 10.1007/978-1-59745-525-1_5.
Sekar JAP, Faeder JR: Rule-based modeling of signal transduction: a primer. Methods Mol Biol. 2012, 880: 139-218. 10.1007/978-1-61779-833-7_9.
Xu W, Smith A, Faeder JR, Marai GE: Rulebender: a visual interface for rule-based modeling. Bioinformatics. 2011, 27: 1721-1722. 10.1093/bioinformatics/btr197.
Smith AM, Xu W, Sun Y, Faeder JR, Marai GE: Rulebender: integrated visualization for biochemical rule-based modeling. IEEE Visualization 2011, IEEE BioVis Symposium on Biological Data Visualization. 2011, IEEE, 1-8. doi:10.1109/BioVis.2011.6094054.
Smith AM, Xu W, Sun Y, Faeder JR, Marai GE: Rulebender: integrated modeling, simulation and visualization for rule-based intracellular biochemistry. BMC Bioinformatics. 2012, 13: 3. 10.1186/1471-2105-13-3.
Tiger C-F, Krause F, Cedersund G, Palmér R, Klipp E, Hohmann S, Kitano H, Krantz M: A framework for mapping, visualisation and automatic model creation of signal-transduction networks. Mol Syst Biol. 2012, 8: 1-20.
Cheng H-C, Angermann BR, Zhang F, Meier-Schellersheim M: NetworkViewer: visualizing biochemical reaction networks with embedded rendering of molecular interaction rules. BMC Syst Biol. 2014, 8: 70. 10.1186/1752-0509-8-70.
Danos V, Feret J, Fontana W, Harmer R, Krivine J: Rule-based modelling of cellular signalling. Lect Notes Comput Sci. 2007, 4703: 17-41. 10.1007/978-3-540-74407-8_3.
Li C, Donizelli M, Rodriguez N, Dharuri H, Endler L, Chelliah V, Li L, He E, Henry A, Stefan M, Snoep J, Hucka M, Le Novere N, Laibe C: BioModels database: an enhanced, curated and annotated resource for published quantitative kinetic models. BMC Syst Biol. 2010, 4: 92. 10.1186/1752-0509-4-92.
Yu T, Lloyd CM, Nickerson DP, Cooling MT, Miller AK, Garny A, Terkildsen JR, Lawson J, Britten RD, Hunter PJ, Nielsen PMF: The physiome model repository 2. Bioinformatics. 2011, 27: 743-744. 10.1093/bioinformatics/btq723.
Lloyd CM, Lawson JR, Hunter PJ, Nielsen PF: The CellML model repository. Bioinformatics. 2008, 24: 2122-2123. 10.1093/bioinformatics/btn390.
Olivier BG, Snoep JL: Web-based kinetic modelling using JWS Online. Bioinformatics. 2004, 20: 2143-2144. 10.1093/bioinformatics/bth200.
Hucka M, Finney A, Sauro HM, Bolouri H, Doyle JC, Kitano H, Arkin AP, Bornstein BJ, Bray D, Cornish-Bowden A, Cuellar AA, Dronov S, Gilles ED, Ginkel M, Gor V, Goryanin II, Hedley WJ, Hodgman TC, Hofmeyr J-H, Hunter PJ, Juty NS, Kasberger JL, Kremling A, Kummer U, Le Novere N, Loew LM, Lucio D, Mendes P, Minch E, Mjolsness ED, et al: The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models. Bioinformatics. 2003, 19: 524-531. 10.1093/bioinformatics/btg015.
Misue K, Eades P, Lai W, Sugiyama K: Layout adjustment and the mental map. J Vis Lang Comput. 1995, 6: 183-210. 10.1006/jvlc.1995.1010.
Eades P, Lai W, Misue K, Sugiyama K: Preserving the Mental Map of a Diagram. International Institute for Advanced Study of Social Information Science. Fujitsu Limited. 1991, 24-33.
Zeng Z, Tung AKH, Wang J, Feng J, Zhou L: Comparing stars: on approximating graph edit distance. Proc VLDB Endow. 2009, 2: 25-36. 10.14778/1687627.1687631.
Bunke H, Shearer K: A graph distance metric based on the maximal common subgraph. Pattern Recogn Lett. 1998, 19: 255-259. 10.1016/S0167-8655(97)00179-7.
Ullmann JR: An algorithm for subgraph detection. J ACM. 1976, 23: 31-42. 10.1145/321921.321925.
Waser J, Fuchs R, Ribicic H, Schindler B, Bloschl G, Groller E: World lines. IEEE Trans Vis Comput Graph. 2010, 16: 1458-1467.
Schindler B, Waser J, Ribicic H, Fuchs R, Peikert R: Multiverse data-flow control. IEEE Trans Vis Comput Graph. 2013, 19: 1005-1019.
Ribicic H, Waser J, Gurbat R, Sadransky B, Groller ME: Sketching uncertainty into simulations. IEEE Trans Vis Comput Graph. 2012, 18: 2255-2264.
Widanagamaachchi W, Christensen C, Bremer P-T, Pascucci V: Interactive exploration of large-scale time-varying data using dynamic tracking graphs. 2012 IEEE Symposium on Large Data Analysis and Visualization (LDAV). 2012, IEEE, 9-17. doi:10.1109/LDAV.2012.6378962.
Pinaud B, Melancon G, Dubois J: PORGY: a visual graph rewriting environment for complex systems. Comput Graph Forum. 2012, 31: 1265-1274. 10.1111/j.1467-8659.2012.03119.x.
Bezerianos A, Chevalier F, Dragicevic P, Elmqvist N, Fekete J-D: GraphDice: a system for exploring multivariate social networks. Comput Graph Forum. 2010, 29: 863-872. 10.1111/j.1467-8659.2009.01687.x.
Federico P, Aigner W, Miksch S, Windhager F, Zenk L: A visual analytics approach to dynamic social networks. Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies. 2011, New York: ACM, 47. doi:10.1145/2024288.2024344.
Farrugia M, Hurley N, Quigley A: Exploring temporal ego networks using small multiples and tree-ring layouts. 4th International Conference on Advances in Human Computer Interfaces (ACHI). 2011, Gosier: IARIA.
Andrews K, Wohlfahrt M, Wurzinger G: Visual graph comparison. Information Visualisation 2009, 13th International Conference. 2009, Los Alamitos: IEEE Computer Society, 62-67. doi:10.1109/IV.2009.108.
Tversky B, Morrison JB, Betrancourt M: Animation: can it facilitate?. Int J Hum-Comput Stud. 2002, 57: 247-262. 10.1006/ijhc.2002.1017.
Heer J, Robertson G: Animated transitions in statistical data graphics. IEEE Trans Vis Comput Graph. 2007, 13: 1240-1247.
Card SK, Suh B, Pendleton BA, Heer J, Bodnar JW: Time tree: exploring time changing hierarchies. 2006 IEEE Symposium on Visual Analytics Science and Technology. 2006, IEEE, 3-10. doi:10.1109/VAST.2006.261450.
Bastian M, Heymann S, Jacomy M: Gephi: an open source software for exploring and manipulating networks. International AAAI Conference on Weblogs and Social Media, vol. 8. 2009, Menlo Park: AAAI Press, 361-362. [http://www.aaai.org/ocs/index.php/ICWSM/09/paper/viewFile/154Forum/1009].
Shanmugasundaram M, Irani P: The effect of animated transitions in zooming interfaces. Proceedings of the Working Conference on Advanced Visual Interfaces. 2008, New York: ACM, 396-399. doi:10.1145/1385569.1385642.
Dragicevic P, Bezerianos A, Javed W, Elmqvist N, Fekete J-D: Temporal distortion for animated transitions. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. 2011, New York: ACM, 2009-2018. doi:10.1145/1978942.1979233.
Collins C, Penn G, Carpendale S: Bubble sets: revealing set relations with isocontours over existing visualizations. IEEE T Vis Comput Gr. 2009, 15: 1009-1016.
Sneddon MW, Faeder JR, Emonet T: Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nat Methods. 2011, 8: 177-183. 10.1038/nmeth.1546.
Stone KD, Prussin C, Metcalfe DD: IgE, mast cells, basophils, and eosinophils. J Allergy Clin Immun. 2010, 125: 73-80. 10.1016/j.jaci.2009.11.017.
Schoeberl B, Eichler-Jonsson C, Gilles ED, Muller G: Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat Biotechnol. 2002, 20: 370-375. 10.1038/nbt0402-370.
Kholodenko BN, Demin OV, Moehren G, Hoek JB: Quantification of short term signaling by the epidermal growth factor receptor. J Biol Chem. 1999, 274: 30169-30181. 10.1074/jbc.274.42.30169.
Tapia J, Faeder J: The Atomizer: extracting implicit molecular structure from reaction network models. Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics (BCB'13). 2013, New York: ACM, 726-727. doi:10.1145/2506583.2512389.
Tapia JJ, Faeder JR: RuleHub: an environment for developing and sharing rule-based models. Proceedings of 8th Annual q-bio Conference on Cellular Information Processing. 2014, [http://q-bio.org/w/images/8/84/135.pdf].
Le Novere N, Finney A, Hucka M, Bhalla US, Campagne F, Collado-Vides J, Crampin EJ, Halstead M, Klipp E, Mendes P, Nielsen P, Sauro H, Shapiro B, Snoep JL, Spence HD, Wanner BL: Minimum information requested in the annotation of biochemical models (MIRIAM). Nat Biotechnol. 2005, 23: 1509-1515. 10.1038/nbt1156.
This work has been supported by grant NSF-IIS-0952720, the Pitt Clinical Translational Science Institute (Fellows Program) 5UL1RR024153-05, and NIH/NIGMS grant P41GM103712. Many thanks to Tim Luciani and Adam Smith for help in testing and debugging, and to the other members of the Marai VisLab and Faeder Lab for their feedback and useful discussions.
The authors declare that they have no competing interests.
JW wrote the implementation of the explorer interface, with links back to the original RuleBender software. JRF and LAH provided expert systems biology feedback and helped to direct the design of the tool and case studies. JJT contributed to the design and implementation of the second case study. GEM conceived and directed the design, implementation, and testing of the tool. All authors listed contributed to and approved the final manuscript.