Definition of the network topology
The first step towards modeling signaling networks is to define the components of the network and their connectivity. We symbolize the components of a network through nodes, represented as variables whose values reflect a state of activity. Nodes do not necessarily represent single molecules, but rather functional entities such as molecular complexes. In the T-helper network for example (Figure 1, upper left) the node describing the interferon-γ receptor (IFNg-R) represents a complex of multiple subunits that when active elicits the activation of the nodes downstream. The connectivity among nodes is expressed in terms of "activations" or "inhibitions". Once the topology of the network is established, it can be loaded into SQUAD to perform analyses and simulations.
SQUAD accepts three types of input formats: net, mml and sbml files. The net format is a simple text file, and the mml format is a XML file; both were developed specifically for SQUAD (see Additional file 1 for a description of the formats). Since defining a network topology for large networks in text format can be difficult and error-prone, we have included the possibility of using CellDesigner [14, 15] generated files as input. CellDesigner is a free, widely used graphical tool that allows the easy construction and edition of diagrams of metabolic and regulatory networks. CellDesigner has an implementation of the SBML (Systems Biology Markup Language) format [16], used by a large number of modeling tools. Whenever CellDesigner files are used as input, SQUAD retains the spatial layout of the nodes providing a more intuitive interpretation of the simulation results (use Additional file 2 as an example). The distribution of SQUAD includes a folder containing the T-helper network sample files in the three aforementioned formats.
Identification of network steady states using SQUAD
Biological systems are governed and regulated by intricate networks, most of the time containing feedback loops, whose activity is influenced by environmental stimuli. An illustrative example is the process of differentiation of precursor T-helper cells into effector Th1 or Th2 cells, in which a complex network controls the transition from a cell type to another according to the cytokines to which the precursor cells are exposed [17]. Dynamical systems may contain stable steady states, defined as specific activation states of the network, which do not change over time and are resistant to small perturbations. To find these stable steady states, it is necessary to translate the network topology into a dynamical system. We have previously published a methodology to automate this process [1]; hence, users of SQUAD only have to provide a network topology. The static representation of a network can be converted into a discrete dynamical system using the Equation 1 (described in [1]).
(1)
Once the network has been translated into a discrete dynamical system, it is possible to locate all its stable steady states using the Generalized Logical Analysis (GLA) [18], which is based on the analysis of the functionality of all the feedback loops that constitute the network. GLA is a well-established methodology to analyze the behavior of regulatory networks, even when they contain nodes with more than two levels of activation. However, the methodology has proved to scale badly for large networks. To address this issue, we implemented a Reduced Order Binary Decision Diagram (ROBDD) algorithm [8] in SQUAD. In contrast to GLA, the ROBDD algorithm works for networks containing only binary nodes.
ROBDD is a memory efficient data structure for representing the exponential state-space of logic functions, widely used in the field of electronic design automation and model checking. Using ROBDDs we can compute a set of subsequent network states in such a way that the state-space traversal can be performed very efficiently. Our implementation optimizes the use of ROBDDs by finding steady states without testing all the possible states of the network, allowing analysis of large regulatory networks. Using this algorithm we are able to identify the steady states of complex networks (>50 nodes) on a desktop computer in a matter of seconds (benchmarked on a Pentium4 2.1 GHz CPU). To support our claim, we provide a network of 111 nodes (Additional file 3) that can be tested for speed by the users. Another advantage of using the ROBDD algorithm is the identification of cyclic attractors, i.e. oscillating states. These states occur when the system does not reach a steady state, but rather a cycling pattern. We used SQUAD to identify the steady states of the T-helper cell network (Figure 1) either through GLA [1] or ROBDD. In both cases SQUAD identifies 3 stable steady states visible through the steady state selector panel (Figure 1, upper rightmost image), or on the graphical network layout (Figure 1, lower images). As previously reported [1], the steady states identified in the T-helper network can be mapped to a specific biological states of T-helper cells, namely the Th0, Th1 and Th2 cell types.
Using SQUAD for studying the dynamic behavior of a network
SQUAD allows the identification of the stable steady states present in the network, but it does not provide information on the events that lead to these states. In other words, it finds the attractors but it does not give any information on the basins of attraction.
SQUAD automatically converts the static network into a continuous dynamical system (as explained in [1]) using Equation 2.
(2)
Within this methodology, variables representing the activity of nodes are normalized, thus providing continuous levels of activation between 0 and 1 where, as in the discrete model, 1 represents the full-activation of a node.
In order to perform simulations, SQUAD solves numerically the continuous dynamical system starting from a given initial state, and a set of values for all parameters. By default, SQUAD sets all values of α's (weight of activations), β's (weight of inhibitions) to 1, and a value of 10 to h (the gain of the sigmoid). Users may change these values in the tables presenting the node and edge configurations. As for the initial states, the software uses the set of stable steady states found in the discrete implementation of the network. Here again, the user is able to specify any initial state that best matches the biological question addressed. In the case of the T-helper network for example, we can use the Th0 steady state, where all the nodes are set to 0, as a starting point for the dynamic simulations.
SQUAD provides a number of graphical utilities to perform simulations. In the steady state selector panel it is possible to choose the starting state from the list of stable steady states automatically found by the system. The selected state can then be further modified to simulate alternative initial states, allowing for the inclusion of perturbations. The results of the simulations are shown by a plot of the activity of each node against time (Figure 2). Importantly, since the equations used for the dynamic simulation are not fitted with experimentally determined kinetic values, the time is expressed in arbitrary units.
Simulations in SQUAD can be performed in two modes. The complete run option sets the dynamic simulation to stop either at a pre-defined time point, or when a steady state is reached. By contrast, in the progressive simulation mode the user is able to control the speed of the simulation and stop it at will. In addition to the plot of time-series, SQUAD also displays the activation status of all nodes. This is particularly useful to make a more intuitive display of how some signals propagate through the network. To exemplify the use of these graphical tools, Figure 2 presents a simulation of the behavior of the Th0 steady state in which the IL-4 node is activated, thus mimicking the addition of IL-4 ligand to the network. We observe that the network moves to the Th2 steady state in response to the IL-4 ligand.
The algorithm behind SQUAD has already been shown to correctly describe the qualitative behavior of a large regulatory network [1]. Different networks, however, might require special manipulations before being analyzed by SQUAD. This is the case whenever there are nodes implementing the AND logical function. Equation 1 indicates that a node integrates the total input by means of OR functions, apparently hindering the use of AND relationships between input nodes. This seeming problem can be solved by the introduction of an intermediary node. Suppose the user needs to include a node X that becomes active only when nodes A and B are both active. In this case, it is necessary to decompose one of the direct activations, say from A to X, into a pair of inhibitions via an extra node, C, created ad hoc. The final topology would then become A¬C¬X ← B. By applying Equation 1 we can see that X = B ∧ ¬ C, but since C = ¬ A then X = A ∧ B, which is the desired AND relationship between the A and B nodes over X. This solution for including AND gates increases the number of nodes in the network, and thus it might introduce some extra states in the attractors. Hence, the user has to be careful to eliminate from the final list of attractor the states of these intermediate nodes. Nevertheless, we are currently working on the explicit incorporation of AND gates into our methodology.
Using laboratory protocols for modeling
As discussed in the previous section, dynamic simulations are extremely useful for assessing how a network behaves in response to different stimuli. In the work described so far we have addressed the influence of stimuli such as IL-4 on the initial steady state. Biological experimental protocols however often rely on perturbations using multiple stimuli, at different times and for varying durations. To map the computational simulations to such biological experiments we have included in SQUAD a framework for performing dynamic perturbations.
The perturbations to be performed are listed within a protocol file written in a dedicated XML format (see Additional file 4, with extension prt). The file describes a set of initial network states and a set of perturbations. Each perturbation corresponds to a separate experiment containing an initial state and a set of actions specifying the node(s) to perturb, the perturbation value(s) as well as the duration and timing of the perturbations. Different types of actions can be specified. For example the singlepulse action modifies the node at a single time point, while rangepulse maintains the perturbation for a determined time period. The protocol file can be created within SQUAD using a wizard tool, which ensures that the protocol file has a valid format (Figure 3). Using these protocols it is possible to reproduce existing biological experiments computationally, or to test new experimental designs. Furthermore, having a file format to specify dynamic simulations allows for the storing, exchanging and comparisons of protocols.
We have used the perturbation framework on the T-helper cell network to assess the effects of IL-4 and IFN-γ (Figure 4 and Additional file 4). As shown in the previous section, the addition of IL-4 to the Th0 state moves the network towards the Th2 steady state (Figure 4A). Similarly, using the perturbation protocols we tested the effect of adding an IFN-γ pulse on the Th2 steady state (Figure 4B). Under these circumstances the Th2 state is temporarily perturbed, but returns to the Th2 state, consistent with experimental data showing the stability of Th2 cells [19].