The Process-Interaction-Model: a common representation of rule-based and logical models allows studying signal transduction on different levels of detail
© Kolczyk et al; licensee BioMed Central Ltd. 2012
Received: 13 March 2012
Accepted: 21 September 2012
Published: 28 September 2012
Signaling systems typically involve large, structured molecules each consisting of a large number of subunits called molecule domains. In modeling such systems these domains can be considered as the main players. In order to handle the resulting combinatorial complexity, rule-based modeling has been established as the tool of choice. In contrast to the detailed quantitative rule-based modeling, qualitative modeling approaches like logical modeling rely solely on the network structure and are particularly useful for analyzing structural and functional properties of signaling systems.
We introduce the Process-Interaction-Model (PIM) concept. It defines a common representation (or basis) of rule-based models and site-specific logical models, and, furthermore, includes methods to derive models of both types from a given PIM. A PIM is based on directed graphs with nodes representing processes like post-translational modifications or binding processes and edges representing the interactions among processes. The applicability of the concept has been demonstrated by applying it to a model describing EGF insulin crosstalk. A prototypic implementation of the PIM concept has been integrated in the modeling software ProMoT.
The PIM concept provides a common basis for two modeling formalisms tailored to the study of signaling systems: a quantitative (rule-based) and a qualitative (logical) modeling formalism. Every PIM is a compact specification of a rule-based model and facilitates the systematic set-up of a rule-based model, while at the same time facilitating the automatic generation of a site-specific logical model. Consequently, modifications can be made on the underlying basis and then be propagated into the different model specifications – ensuring consistency of all models, regardless of the modeling formalism. This facilitates the analysis of a system on different levels of detail as it guarantees the application of established simulation and analysis methods to consistent descriptions (rule-based and logical) of a particular signaling system.
Understanding intracellular signaling is one of the major challenges in Systems Biology, and it is complicated by the nature of the signaling molecules themselves: many signaling molecules, in particular receptor molecules, are large structured proteins consisting of several interacting subunits. These subunits, also called domains, usually contain one site which can form a bond with other proteins and/or be subject to post-translational modifications. Hence, each site can take different states. The state of a molecule is defined by the states of its sites (e.g. a receptor is phosphorylated at a particular site and unphosphorylated at another site). If one is interested in the early events of signaling, then realistic descriptions of signaling systems have to reflect this protein structure, at least in part. For this reason, Pawson and Nash proposed to consider the domains of molecules, rather than complete molecules, as the main players in signaling networks.
In modeling approaches that adopt this point of view, every possible state of a protein is described by a variable of its own. As signaling systems contain many such molecules, each with a large number of domains, one often faces a combinatorial explosion of the number of states. For example, in a complete description (i.e. a description incorporating all possible states of all molecule domains), a model of a protein with $n$ phosphorylation sites contains $2^n$ variables. If each site can also be bound by other molecules, the number of required variables increases to $3^n$.
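The growth in the number of variables can be made concrete with a short enumeration. The following is a minimal sketch, not part of the original work: the function name and the one-letter state labels are illustrative, and it assumes that a site which can also be bound has exactly three states.

```python
from itertools import product


def enumerate_states(n_sites, site_values):
    """Return every combination of site states for a molecule with n_sites sites."""
    return list(product(site_values, repeat=n_sites))


# Phosphorylation only: each site is unphosphorylated ("u") or phosphorylated ("p").
phospho_only = enumerate_states(3, ("u", "p"))

# Assumed third state per site: occupied by a binding partner ("b").
with_binding = enumerate_states(3, ("u", "p", "b"))

print(len(phospho_only))  # 8  == 2**3
print(len(with_binding))  # 27 == 3**3
```

Even for three sites the complete description already needs 27 variables in the second case, which is the redundancy that rule-based descriptions are meant to avoid.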
In signaling systems composed (mainly) of such structured proteins, subsets of protein states often share common characteristics: for example, if the binding of receptor and ligand occurs with the same kinetic constants, regardless of the phosphorylation state of a different site. In a complete description this binding reaction has to be specified at least twice, with identical rate constants (once for the phosphorylated and once for the unphosphorylated receptor state). This redundant specification makes model set-up complicated and model analysis difficult and thus increases the probability of a model failing to be internally consistent.
Recently, rule-based modeling has been established as the tool of choice to handle this combinatorial complexity. Given a model in a rule-based formalism, quantitative predictions are in general easy to obtain, either via generation of a quantitative model in the form of ordinary differential equations (which is straightforward) or by direct (stochastic) simulation (see, for example, [4, 5]). By using the methods described in [6–8] it is possible to reduce the number of equations in an ODE model derived from a rule-based description without losing any information.
Many biologically relevant questions, however, are not necessarily quantitative but rather qualitative in nature. One might, for example, only be interested in whether or not a ligand can activate a transcription factor at all, or how the activation of a certain species is prevented by a small number of knockouts. More details and further examples can be found in [9–11]. Even though these questions can in principle be answered with the help of quantitative models, qualitative models such as logical models have become the tool of choice for studying these questions as they often require less detailed knowledge. For the set-up and analysis of such models a variety of methods exists that are especially suited for studying causal relationships among species in signaling networks. This kind of analysis is often called ‘Structural and Functional Analysis’ [12–15].
Building a logical model describing all possible states of the structured molecules central to signaling systems faces the same challenges as building an ODE model considering such states. Even though it is in principle possible to build such a model (in a way similar to quantitative models), this is a challenging and error-prone task that is not immune to the combinatorial explosion of the number of states. Hence we propose what we call a site-specific logical model that enables a systematic description of processes on sites of molecules similar to the rule-based modeling formalism. Site-specific logical models enable – to the best of our knowledge – for the first time the above-mentioned structural and functional analysis of complete descriptions of signaling systems.
In this contribution we will exemplify that the Process-Interaction-Model (PIM) concept combines the advantages of rule-based modeling and site-specific logical modeling in a common representation. Every PIM incorporates all information that is necessary to build consistent models in the different formalisms. Furthermore, this article will describe a concept that comprises algorithms to generate rule-based and logical models from a PIM. Every PIM can be seen as a compact specification of a rule-based model and facilitates the systematic set-up of a rule-based model, while at the same time facilitating the automatic generation of a site-specific logical model.
In the following two subsections we briefly introduce the main concepts of rule-based and logical modeling required for the PIM concept. The remainder of this article consists of the sections “Results” and “Methods”. In the section “Results”, the basic ideas of the PIM concept are introduced, followed by a brief description of its realization within the ProMoT framework and an application to the early events of EGF and insulin signaling. Details of the underlying algorithms and the potential extension of the PIM concept are discussed in the section “Methods”.
Rule-based modeling facilitates handling of combinatorial complexity
Rule-based modeling has been established as an efficient way to handle the combinatorial complexity that is characteristic of realistic networks in signal transduction. It is an approach tailored to the set-up of such networks and can be seen as a compact model specification. In rule-based modeling, classes of biochemical reactions having the same kinetic parameters are described by reaction rules that can be expanded to ordinary differential equations (ODEs) in a straightforward way [4, 17, 18].
By omitting unnecessary information about molecule domains that are not involved (the “don’t care, don’t write” principle) and by using patterns, combinatorial complexity can be handled in a systematic manner. Patterns comprise sets of molecules or molecule complexes sharing common characteristics and describe their states. Such a pattern can, for example, comprise all receptor molecules which have a ligand bound, regardless of the states of other phosphorylation and binding sites (i.e. this pattern describes all receptor-ligand complexes with different phosphorylation and binding states).
Patterns are connected by reaction rules describing the evolution of a system. Each rule contains patterns on the right and left side of a reaction arrow followed by kinetic parameters. Every reaction rule is either reversible or irreversible and describes the change of the state of one or two sites (e.g. in modification processes one site changes from unmodified to modified or in binding processes two sites change from unbound to bound). The affected sites in a rule, that is, the sites which change their state, are called the reaction center, while sites that remain unchanged are called the reaction context. Rules describe biological facts like “the phosphorylation of the insulin receptor at a particular tyrosine residue occurs at a higher rate if insulin is bound to the receptor.”
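To illustrate patterns, reaction center and reaction context, the following minimal sketch represents states as dictionaries and a pattern as a partial assignment of sites. It is an assumption-laden toy and not the syntax of BioNetGen or any other tool: the receptor with sites `l`, `p1` and `p2`, and the helper names `matches` and `apply_rule`, are hypothetical.

```python
from itertools import product

SITES = ("l", "p1", "p2")  # hypothetical receptor: ligand site plus two phosphosites


def matches(pattern, state):
    """A pattern lists only the sites it constrains; unlisted sites are free."""
    return all(state[site] == value for site, value in pattern.items())


def apply_rule(center, context, state):
    """Switch the reaction-center site from 0 to 1 if the context matches."""
    if state[center] == 0 and matches(context, state):
        return {**state, center: 1}
    return None  # rule not applicable to this state


all_states = [dict(zip(SITES, values)) for values in product((0, 1), repeat=len(SITES))]

# Pattern "ligand bound", regardless of the states of p1 and p2.
ligand_bound = [s for s in all_states if matches({"l": 1}, s)]

# Rule "phosphorylate p1 when ligand is bound": center p1, context l = 1.
products = [apply_rule("p1", {"l": 1}, s) for s in ligand_bound]
applicable = [p for p in products if p is not None]

print(len(all_states), len(ligand_bound), len(applicable))  # 8 4 2
```

A single rule with context `l = 1` thus stands for two concrete reactions, one for each state of the unmentioned site `p2`.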
Many tools facilitate rule-based modeling, for example, BioNetGen, ALC and Kappa. These tools require a text-based specification of rule-based models. BioNetGen additionally uses a graph structure to represent these models, where the molecules are represented as building blocks composed of reactive sites and the reaction rules are denoted as graph-rewriting rules. The BioNetGen language (BNGL) is emerging as a quasi-standard for rule-based modeling and several rule-based models have already been published in BNGL [22–25]. Furthermore, BioNetGen offers different options for simulating rule-based models and various interfaces to simulation tools [26–28]. Recently, visualization and annotation guidelines for rule-based models have been proposed.
Logical modeling facilitates understanding of causal relationships
Qualitative modeling approaches have been emerging as relevant complements to dynamic modeling as they require less detailed knowledge about kinetic laws and parameters while at the same time allowing the study of important structural and functional properties of the system. Logical models are one example. Originally used to describe random networks or gene regulatory networks of moderate size [30–33], logical modeling has been established as a valuable tool for the analysis of signaling pathways [9–11, 34, 35].
For the set-up and analysis of the site-specific logical models presented herein the logical modeling framework introduced in is used. This formalism is tailored to the study of qualitative input-output responses of signaling networks. Biological species such as ligands, receptors, adaptor proteins, or kinases are represented as nodes of the logical network. Each of these nodes has an associated logical state indicating whether the species is active/present (1) or not (0). As the state of a node can also be undefined/unknown (*) a three-valued logic is used. Logical operations on the network nodes represent the signaling events and are given in disjunctive normal form. Besides the logical operators AND, OR and NOT, operators with incomplete truth table (ITT gates) can be utilized in those cases where no decision whether an AND or OR gate should be used can be made. The logical model is represented as a logical interaction hypergraph and methods for the analysis of these networks are implemented in the software CellNetAnalyzer. The main difference to the site-specific logical model proposed herein is that states in the latter represent the states of molecule domains instead of molecules themselves.
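How a third, undefined value propagates through AND, OR and NOT can be sketched as follows. The text does not spell out the exact semantics of the three-valued logic, so this sketch assumes Kleene's strong three-valued logic; the names `U`, `and3`, `or3` and `not3` are illustrative and are not taken from CellNetAnalyzer.

```python
U = None  # stands for the undefined/unknown state (*)


def and3(*values):
    """AND: a single 0 decides the result; otherwise any unknown keeps it unknown."""
    if any(v == 0 for v in values):
        return 0
    if any(v is U for v in values):
        return U
    return 1


def or3(*values):
    """OR: a single 1 decides the result; otherwise any unknown keeps it unknown."""
    if any(v == 1 for v in values):
        return 1
    if any(v is U for v in values):
        return U
    return 0


def not3(value):
    """NOT: unknown stays unknown."""
    return U if value is U else 1 - value


# Disjunctive normal form example: (ligand AND receptor) OR (NOT inhibitor)
print(or3(and3(1, U), not3(1)))  # None: the outcome cannot be decided
print(or3(and3(1, U), not3(0)))  # 1: the second term decides the outcome
```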
In this section we demonstrate that PIM construction is straightforward given graphical representations commonly used in Systems Biology. It is organized as follows: in section “PIM definition and construction” the formal definitions are given. In the sections “A PIM facilitates rule-based model building” and “A PIM uncovers the logic of rule-based models” it is explained how both model types (rule-based and logical) can be derived from a PIM. In the section “Implementation of the PIM concept” we briefly discuss how the concept is realized in the software ProMoT and the section “Application to insulin and EGF signaling” finally demonstrates the applicability and the benefits of a PIM by applying it to the model presented in.
PIM definition and construction
A PIM can be defined for every signaling system consisting of reactions described by mass action kinetics. Obviously, many existing models contain non-mass action kinetics (e.g. convenience kinetics characterizing regulatory feedbacks). These are not directly amenable to the PIM concept. However, we are convinced that this is not a severe limitation, as by modeling such reactions in greater detail it is often possible to replace a reaction with non-mass action kinetics by a network of reactions on the mass action level (many examples can be found in). Moreover, PIMs are expected to be used in modeling early events in signaling pathways; such systems are often modeled in great enough detail to justify mass action kinetics.
A PIM is represented by a directed graph with nodes representing processes like post-translational modification, binding and so on. Edges represent interactions among processes. An edge is added between two processes if a process occurs with different kinetic parameters depending on the occurrence of the other process (e.g. a modification process on a particular site of a receptor is described by different reaction rates, depending on whether or not a ligand is bound). An interaction is either unidirectional, bidirectional or all-or-none. The latter type of interaction can be used to describe a situation where a process can occur only after another process has occurred (e.g. the binding at a phosphorylation site can only occur after the site has been phosphorylated). This type of interaction has been introduced in [6, 19] and is employed for model reduction purposes.
In the context of combinatorial reaction networks, processes and interactions were already introduced in [6, 7, 19] and a graph with nodes representing processes and edges representing interactions is used in. While in [6, 7, 19] the focus is on reduction of models of combinatorial reaction networks, the PIM concept focuses on the set-up of two consistent models in different formalisms.
The PIM concept is closely related to rule-based modeling
In rule-based modeling one often faces the situation that several rules with the same reaction center but different reaction context are necessary to describe a process. In a PIM every node represents a reaction center and the incoming edges represent the contextual information. Hence, the PIM concept is closely related to rule-based modeling: every process node can be interpreted as an aggregation of reaction rules with the same reaction center, and the contextual information of a process node comprises the reaction context of every rule involving that reaction center. This supports our claim that every PIM is a compact representation of a rule-based model. For example, in Figure 2 process node 1 corresponds to the first reaction rule in Figure 1 (i.e. the binding of molecule A and R). Process node 2 corresponds to reaction rules 2 to 5, which describe the modification of molecule R at site p1 under different conditions (i.e. depending on whether or not the binding of A and R has previously taken place and whether or not molecule R has been modified at p2). Process node 3 corresponds to reaction rules 6 and 7 (i.e. the modification of molecule R at site p2), and process node 4 corresponds to reaction rule 8 (i.e. the binding of B at the modified site p1 of molecule R).
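The small example can be written down as a directed graph of process nodes. The sketch below is a minimal illustration and not the ProMoT data model: the class names, the binding-site names and the choice of unidirectional edges into node 2 are assumptions, and only edges that the text states explicitly are included.

```python
from dataclasses import dataclass, field

INTERACTION_TYPES = ("unidirectional", "bidirectional", "all-or-none")


@dataclass(frozen=True)
class ProcessNode:
    ident: int
    kind: str     # "binding" or "modification"
    sites: tuple  # (molecule, site) pairs: two for binding, one for modification


@dataclass
class PIM:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source, target, interaction type)

    def add_node(self, node):
        self.nodes[node.ident] = node

    def add_edge(self, source, target, kind="unidirectional"):
        if kind not in INTERACTION_TYPES:
            raise ValueError(f"unknown interaction type: {kind}")
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both processes must be added before connecting them")
        self.edges.append((source, target, kind))
        if kind == "bidirectional":
            self.edges.append((target, source, kind))

    def context_of(self, target):
        """Processes whose occurrence influences the target process."""
        return sorted(src for src, tgt, _ in self.edges if tgt == target)


pim = PIM()
pim.add_node(ProcessNode(1, "binding", (("A", "b1"), ("R", "b1"))))
pim.add_node(ProcessNode(2, "modification", (("R", "p1"),)))
pim.add_node(ProcessNode(3, "modification", (("R", "p2"),)))
pim.add_node(ProcessNode(4, "binding", (("B", "b1"), ("R", "p1"))))

pim.add_edge(1, 2)                 # binding of A influences modification of p1
pim.add_edge(3, 2)                 # modification of p2 influences modification of p1
pim.add_edge(2, 4, "all-or-none")  # B binds only after p1 has been modified

print(pim.context_of(2))  # [1, 3]
```

The two incoming edges of node 2 are what make four reaction rules (2 to 5) necessary for this single reaction center.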
A process node represents a process in different reaction context
The main processes in signal transduction are binding processes and modification processes. Every process node carries information about the molecules and sites involved. Consequently, a binding process is assigned two molecules and two sites, whereas a modification process is assigned a single molecule and a single site. Additional process types are defined and will be described in detail in the section “Methods”.
Column $y$ stores the information whether the process represented by the node itself has occurred ($y = 1$) or not ($y = 0$). The value of $y$ is determined either by the equilibrium constant $k_{eq}$ (for reversible processes) or by the forward rate constant (for irreversible processes), as described below. The columns representing incoming edges contain logical values denoting whether the preceding process has or has not occurred (1 denotes ‘process has occurred’, 0 denotes ‘process has not occurred’). Figure 3 depicts the PIM for the small example in Figure 2 and the parameter table of process node 2. The column labeled 1 indicates the occurrence of process 1 and the column labeled 3 indicates the occurrence of process 3.
Parameter tables are related to truth tables
Note that column $y$ can be interpreted as an output column associated with a process node. It indicates whether the process is considered as ‘has occurred’ ($y = 1$) or ‘has not occurred’ ($y = 0$) for a given combination of input values representing a certain reaction context. Hence, the parameter tables are similar to truth tables, where the inputs for the table are “a previous process has occurred or not”. As in general all combinations have to be accounted for, the table has $2^{\#in}$ rows, where $\#in$ denotes the number of incoming edges.
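The rows of such a parameter table can be enumerated mechanically. The following minimal sketch lists the reaction contexts for process node 2 of the small example, which has incoming edges from processes 1 and 3; the function name `context_rows` is illustrative.

```python
from itertools import product


def context_rows(incoming):
    """One row per reaction context: every combination of 'has occurred' (1)
    and 'has not occurred' (0) for the incoming processes."""
    return [dict(zip(incoming, values)) for values in product((0, 1), repeat=len(incoming))]


rows = context_rows([1, 3])
for row in rows:
    print(row)
# {1: 0, 3: 0}
# {1: 0, 3: 1}
# {1: 1, 3: 0}
# {1: 1, 3: 1}
```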
To decide about the occurrence of a reaction and thereby the values of the output column, two threshold values $t_1 < t_2$ are introduced. If the process is reversible, the equilibrium constant $k_{eq}$, defined as the quotient of the forward and the backward rate constant of each reaction, is used (following from the law of mass action). If the equilibrium constant is greater than or equal to the upper threshold ($k_{eq} \geq t_2$), we regard the reaction as ‘has occurred’ and set $y = 1$. If it is less than or equal to the lower threshold ($k_{eq} \leq t_1$), we regard the reaction as ‘has not occurred’ and set $y = 0$. If $t_1 < k_{eq} < t_2$, neither is the case and the output is unknown ($y = \ast$). If the process is defined as irreversible, the forward rate constant $k_{fw}$ is used in the same way: if $k_{fw} \leq t_1$, we regard the reaction as ‘has not occurred’ and set $y = 0$; if $k_{fw} \geq t_2$, we regard the reaction as ‘has occurred’ and set $y = 1$; and if $k_{fw}$ lies between the two thresholds, the output is unknown ($y = \ast$).
This assignment of output values is based on the following idea: we compare the equilibrium constants of the same reaction under different conditions (i.e. in different context) and interpret the relative size of the equilibrium constant as a measure of the influence the reaction context exerts on the outcome of the process. The thresholds are thus a means to reflect this influence of the reaction context and can be chosen for each process individually. Moreover, threshold values will determine topology and logical function(s) of the site-specific logical model.
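The threshold rule can be summarized in a few lines. This is a minimal sketch under stated assumptions: the function name `discretize` and the string `"*"` for the unknown value are illustrative, and the default thresholds are the values $t_1 = 0.01$ and $t_2 = 0.1$ used later in the application.

```python
UNKNOWN = "*"


def discretize(k_fw, k_bw=None, t1=0.01, t2=0.1):
    """Map kinetic parameters to the output column y.

    Reversible process (k_bw given): compare k_eq = k_fw / k_bw to the thresholds.
    Irreversible process (k_bw is None): compare k_fw itself.
    """
    if not t1 < t2:
        raise ValueError("thresholds must satisfy t1 < t2")
    value = k_fw if k_bw is None else k_fw / k_bw
    if value >= t2:
        return 1        # 'has occurred'
    if value <= t1:
        return 0        # 'has not occurred'
    return UNKNOWN      # strictly between the thresholds


print(discretize(1.0, 2.0))    # 1  (k_eq = 0.5)
print(discretize(0.001, 1.0))  # 0  (k_eq = 0.001)
print(discretize(0.05, 1.0))   # *  (k_eq = 0.05)
print(discretize(0.5))         # 1  (irreversible, k_fw = 0.5)
```

Because the thresholds are arguments, they can be chosen for each process individually.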
The choice of thresholds will in general be based on the biological intuition of each modeler as it reflects a judgment about the influence of the reaction context on the process outcome. Hence, threshold choice is one of the most delicate steps in setting up a PIM and one that can, by its nature, not be cast in rigorous rules. In general, it is advisable to study the effect of different threshold choices on the results of a subsequent structural analysis of the site-specific logical model (as we have done in section “Application to insulin and EGF signaling”). It can sometimes be advisable to start with identical thresholds for all or certain subgroups of the processes and refine those later on, based on the structural analysis.
How to set up a PIM
A PIM facilitates rule-based model building
As described in section “The PIM concept is closely related to rule-based modeling”, the PIM concept is strongly related to rule-based modeling. In the generation of a rule-based model, the information about the reaction center can be extracted from a process node; the reaction context in a particular rule is determined by a combination of the occurrence of preceding processes. Kinetic parameters for the combination are taken from the parameter table of the process node. In a PIM, forward and backward rate constants are defined to characterize mass action kinetics of the process. These parameters can be transferred to corresponding reversible reaction rules. If the process is considered to be irreversible, the forward rate constant is added behind the corresponding irreversible reaction rule obtained from the PIM. Furthermore, no units can be defined explicitly in a PIM but parameters are assumed to be specified in consistent units and should be expressed on a per molecule per cell basis.
For a complete rule-based model, the specification of initially existing species and their concentrations is required. For the rule-based models obtained from a PIM, basic (i.e. not complexed) molecules with all sites in unmodified state are assumed. The concentrations are initially set to the value ‘1’ but should be altered afterwards.
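The derivation of rules from a process node can be sketched as one rule per row of its parameter table. The following is an illustration only and does not reproduce the ProMoT export or BNGL syntax: the `Rule` class, the context labels and all rate constants are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    center: str       # reaction center taken from the process node
    context: tuple    # ((preceding process, occurred?), ...) from one table row
    reversible: bool
    rates: tuple      # (k_fw, k_bw) if reversible, else (k_fw,)


def rules_from_table(center, table, reversible=True):
    """Emit one rule per row of a process node's parameter table."""
    rules = []
    for row in table:
        rates = (row["k_fw"], row["k_bw"]) if reversible else (row["k_fw"],)
        context = tuple(sorted(row["context"].items()))
        rules.append(Rule(center, context, reversible, rates))
    return rules


def initial_species(molecules):
    """Free molecules with all sites unmodified and placeholder concentration 1."""
    return {molecule: 1 for molecule in molecules}


# Hypothetical parameter table for the modification of R at p1 (values invented).
table = [
    {"context": {"A_bound": 0, "p2_modified": 0}, "k_fw": 0.01, "k_bw": 1.0},
    {"context": {"A_bound": 0, "p2_modified": 1}, "k_fw": 0.05, "k_bw": 1.0},
    {"context": {"A_bound": 1, "p2_modified": 0}, "k_fw": 1.0, "k_bw": 1.0},
    {"context": {"A_bound": 1, "p2_modified": 1}, "k_fw": 5.0, "k_bw": 1.0},
]

rules = rules_from_table("R.p1: unmodified -> modified", table)
seeds = initial_species(["A", "B", "R"])

print(len(rules))  # 4
print(seeds)       # {'A': 1, 'B': 1, 'R': 1}
```

The placeholder concentrations of 1 mirror the default described above and would be replaced by measured or estimated values before simulation.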
The systematic specification of information about involved molecules and their affected sites in process nodes opens up new possibilities in investigating quantitative models. Processes involving particular proteins can easily be omitted in the generation of rule-based models. This greatly simplifies the study of scenarios involving only subsets of proteins (e.g. if a molecule is missing in a model, rules involving this molecule don’t have to be generated). Of course, the study of such scenarios is also possible on the level of reaction rules. But this requires testing every rule, whether or not it involves certain proteins. In a PIM one only has to test each node.
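Dropping a molecule then amounts to a single filter over the process nodes. A minimal sketch, in which the dictionary layout and the function name `without_molecule` are illustrative:

```python
# Process nodes of the small example, reduced to the molecules they involve.
processes = {
    1: {"kind": "binding", "molecules": ("A", "R")},
    2: {"kind": "modification", "molecules": ("R",)},
    3: {"kind": "modification", "molecules": ("R",)},
    4: {"kind": "binding", "molecules": ("B", "R")},
}


def without_molecule(processes, missing):
    """Keep only the processes that do not involve the missing molecule."""
    return {ident: p for ident, p in processes.items() if missing not in p["molecules"]}


remaining = without_molecule(processes, "B")
print(sorted(remaining))  # [1, 2, 3]
```

Rules would afterwards be generated only for the remaining nodes, so no individual rule has to be inspected.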
A PIM uncovers the logic of rule-based models
As mentioned in the section “Background”, logical models consist of nodes, each equipped with a logical function. A convenient way to derive a logical model is therefore to begin with an interaction graph, followed by the assignment of a logical function to each of its nodes (called L-nodes to avoid confusion with the P-nodes of the PIM).
There are many ways to derive a logical model from a PIM. Arguably the easiest way is to create an L-node for every process and use the parameter table belonging to each P-node as truth table defining the logical function of that L-node. This, however, is not what is proposed here because such an interaction graph would not contain information about the connection of molecule domains (i.e. the information that two or more processes occur at the same molecule). Instead, we propose the site-specific logical model mentioned above. This model incorporates information about molecule structure and hence allows capturing more of the biological intuition usually present in a cartoon than is possible with a PIM alone. In particular, a site-specific logical model allows uncovering and visualizing the structure of molecules and their interactions. The construction of this site-specific logical model is described as a two-step process below: first an interaction graph is derived and in a subsequent step each node of the interaction graph is equipped with a logical function.
The interaction graph of a site-specific logical model
In order to build the interaction graph, an L-node is created for every site of every molecule and an additional L-node is created for each molecule representing its basal activity. This basal activity connects all L-nodes representing sites of the same molecule and is used to encode the presence/absence of a molecule in different (simulation) scenarios. Later on, in performing logical analysis, L-nodes representing basal activity will serve as inputs.
In general, the interaction graph will contain more L-nodes than the PIM from which it is derived contains P-nodes. However, there is an intimate connection between L-nodes and P-nodes (see Figure 6):
Every P-node representing a modification process gives rise to a unique L-node representing a modification site (as modification processes involve only a unique site).
Every P-node representing a binding process gives rise to a pair of L-nodes representing the binding sites of the involved molecules (as binding processes always involve two binding sites, one from each molecule).
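The creation of L-nodes from P-nodes can be sketched for the small example. This is a minimal illustration and not the ProMoT algorithm: the function name, the naming scheme `molecule_site` and the site name `b1` on molecule B are assumptions, and only the structural edges from basal activity to sites are built here.

```python
def l_nodes_from_processes(processes):
    """Create one L-node per molecule (basal activity) and one per site.

    A site used by several processes (e.g. modified first, then bound)
    still yields a single L-node.
    """
    sites = set()
    for process in processes:
        sites.update(process["sites"])
    basal = sorted({molecule for molecule, _ in sites})
    site_nodes = sorted(f"{molecule}_{site}" for molecule, site in sites)
    structure_edges = [(molecule, f"{molecule}_{site}") for molecule, site in sorted(sites)]
    return basal, site_nodes, structure_edges


processes = [
    {"id": 1, "kind": "binding", "sites": [("A", "b1"), ("R", "b1")]},
    {"id": 2, "kind": "modification", "sites": [("R", "p1")]},
    {"id": 3, "kind": "modification", "sites": [("R", "p2")]},
    {"id": 4, "kind": "binding", "sites": [("B", "b1"), ("R", "p1")]},
]

basal, site_nodes, structure_edges = l_nodes_from_processes(processes)
print(basal)       # ['A', 'B', 'R']
print(site_nodes)  # ['A_b1', 'B_b1', 'R_b1', 'R_p1', 'R_p2']
```

Four P-nodes give eight L-nodes in this sketch, in line with the observation that the interaction graph is usually larger than the PIM.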
Table 1. Definition of the corresponds-to relation
$p_i$ … modification process; $L_i$ corresponds to $p_i$.
Binding with prior modification: $p_i$ … binding process; $p_j$ … modification process; … modification and binding site of molecule one, corresponds to $p_j$; … binding site of molecule two, corresponds to $p_i$.
Binding without prior modification: $p_i$ … binding process; … binding site of molecule one, corresponds to $p_i$; … binding site of molecule two.
$L_i$ and $L_j$ correspond to two P-nodes $p_i$ and $p_j$ (see Table 1).
There exists an edge between $p_i$ and $p_j$.
One of the two nodes, say $L_i$, represents the basal activity of a molecule and the other node $L_j$ represents a site of this molecule. In this case an edge is created from $L_i$ to $L_j$. These activating edges represent the molecule structure. In a subsequent logical analysis this allows, for example, removal of a molecule (and all of its sites) by assigning a value ‘0’ to the L-node representing the basal activity.
Both L-nodes are connected by the corresponds-to relation. Assume $L_i$ corresponds to $L_j$ (see Table 1, row 3). Then an edge is created from $L_i$ to $L_j$. This situation can only occur if both L-nodes arise from a binding process without prior modification (e.g. the activating edge from A_b1 to R_b1 in Figure 7).
In the latter case the orientation of the activating edge depends on the decision which of the two L-nodes corresponds to the P-node representing the binding process (see the last row in Table 1). This choice is arbitrary and does not affect the results of the subsequent logical analysis because, by definition, $L_i$ has exactly one incoming activating edge from the L-node representing the basal activity. Hence, in analysis $L_i$ passes the value of the L-node representing the basal activity to $L_j$.
From interaction graphs to site-specific logical models: equipping L-nodes with logical functions
Logical function construction for L-nodes without unsigned incoming edges: For these L-nodes the logical function is a logical AND connecting all inputs. Logical function construction for L-nodes with unsigned incoming edges:
From truth tables to logical functions
The aim is to enable analysis of the site-specific logical model with methods available in CellNetAnalyzer. Therefore, the logical functions connecting incoming edges to nodes have to take the form of a sum of products. If 0 and 1 are the only values in the output columns of the respective truth tables, this is equivalent to the disjunctive normal form. Moreover, determination of a logical function from a truth table is straightforward in this case as one may use established algorithms like k-maps (Karnaugh-Veitch [37, 38]) or the Quine-McCluskey algorithm to obtain a logical function in disjunctive normal form.
From the previous discussion, however, it is obvious that truth tables containing ‘unknown’-symbols can be associated with an L-node. In this case, the aforementioned algorithms are not applicable (note that the ‘don’t care’-symbols allowed in k-maps and the Quine-McCluskey algorithm are different from the ‘unknown’-symbols considered here, which precludes the use of these algorithms). Logical functions hence have to be inferred from truth tables involving ‘unknown’-symbols on a case-by-case basis. To guarantee applicability of the methods proposed in, it is recommended to use ITT gates to accommodate the ‘unknown’-symbols: for example, if the first row (all inputs equal 0) has the output 0 and the last row (all inputs equal 1) has the output 1.
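For truth tables whose outputs are only 0 and 1, a sum of products can be read off directly by collecting one product term per row with output 1. The sketch below builds this canonical, unminimized form; it is not an implementation of k-maps or of the Quine-McCluskey algorithm, which would additionally simplify the result. The function name and the input labels are illustrative.

```python
def dnf_from_truth_table(inputs, rows):
    """Build the canonical sum of products from a truth table.

    inputs: names of the input columns
    rows:   mapping from input tuples (0/1 values) to the output 0 or 1
    """
    terms = []
    for values, output in rows.items():
        if output not in (0, 1):
            raise ValueError("only outputs 0 and 1 are supported")
        if output == 1:
            literals = [name if value else f"NOT {name}" for name, value in zip(inputs, values)]
            terms.append("(" + " AND ".join(literals) + ")")
    return " OR ".join(terms) if terms else "0"


# Hypothetical table: the process occurs whenever process 1 has occurred.
table = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}
print(dnf_from_truth_table(("p1", "p3"), table))
# (p1 AND NOT p3) OR (p1 AND p3)
```

A minimization step would reduce this example to the single literal `p1`; a table containing an unknown output is rejected here rather than guessed at.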
Implementation of the PIM concept
Furthermore, export functionality has been added to obtain rule-based models in BNGL from PIMs set up in ProMoT. The conversion into logical models is directly done in ProMoT and will be described in more detail in the next section. The software extension supporting PIMs is available upon request and will be contained in a future release.
A modular logical model obtained from a PIM enables an intuitive analysis and visualization
One of ProMoT’s key features is the ability to set up modular models. Modules are used to structure a model and to easily exchange and reuse model parts in the modeling workflow. This feature is exploited in the generation of logical models from PIMs. One module encapsulates all nodes representing the parts of the same molecule. Hence, every molecule is represented by a module and the interactions with other molecules are depicted by arrows across module borders. Figure 10(b) shows the modular logical model for the small example depicted in Figure 8 in the ProMoT Visual Explorer. ProMoT provides functionality to obtain logical models intended for analysis in CellNetAnalyzer, combined with suitable graphical representations.
Application to insulin and EGF signaling
Threshold choice and its effect on the logical model
A site-specific logical model can be derived from the PIM as described above. In doing so, a crucial point is the choice of the thresholds t 1, t 2 to discretize the equilibrium parameter. In our particular example, we decided to use the same thresholds for all reactions to limit the number of degrees of freedom. We have chosen t 1 = 0.01 and t 2 = 0.1 (Additional file3 contains the model, readily prepared for the analysis in CellNetAnalyzer). To examine the effect of the threshold values on the logical model, we also considered two other model variants where we moved both thresholds to the next larger/smaller value appearing as equilibrium constant (model M down : t 1 = 0.001, t 2 = 0.01; model M up : t 1 = 0.1, t 2 = 0.25).
The effects are illustrated in Figure13. One arrives at the following conclusions:
If threshold values are increased, the logical model becomes more restrictive, that is, compared to model M, model M_up contains additional influences: (1) EGF dimerization becomes necessary for EGF binding in model M_up. As EGF binding is in turn necessary for dimerization, neither of the two states can ever be activated in model M_up, which supports model M. (2) The two insulin binding sites on the insulin receptor mutually inhibit each other, that is, insulin can bind to only one of the two sites in M_up. This indeed reflects the biological situation. Hence, even though (1) clearly argues against increasing the thresholds, (2) indicates that it might be necessary to vary the thresholds of individual processes (e.g. those of processes 1 and 2), also accounting for possible parameter uncertainties. In this example, we nevertheless decided to use one pair of threshold values for all processes, not least because in this particular case the outcome of the logical analysis is to a certain degree independent of whether or not we assume that the two binding sites influence each other.
If threshold values are decreased, biochemically important interactions are missing: in M_down, IRS and Shc phosphorylation depend only on the basal activity of the respective molecules. Thus model M_down does not account for the fact that both phosphorylation events are induced by preceding binding events, which again supports model M.
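The sensitivity to the threshold choice can be illustrated with a minimal Python sketch. It assumes a three-level discretization in which an equilibrium parameter below t1 maps to level 0, one between t1 and t2 to level 1, and one at or above t2 to level 2. The function names, the treatment of values lying exactly on a threshold, and the sample value 0.05 are illustrative assumptions and are not taken from the ProMoT implementation.

```python
# Threshold pairs (t1, t2) of the three model variants discussed above.
THRESHOLDS = {
    "M_down": (0.001, 0.01),
    "M": (0.01, 0.1),
    "M_up": (0.1, 0.25),
}


def discretize(k_eq, t1, t2):
    """Map an equilibrium parameter to a discrete level 0, 1 or 2.

    Assumption: a value exactly on a threshold falls into the higher level.
    """
    if not t1 < t2:
        raise ValueError("t1 must be smaller than t2")
    if k_eq < t1:
        return 0
    if k_eq < t2:
        return 1
    return 2


def levels_per_variant(k_eq):
    """Discretize one parameter value under every threshold variant."""
    return {name: discretize(k_eq, t1, t2) for name, (t1, t2) in THRESHOLDS.items()}


# Hypothetical equilibrium parameter that changes level in every variant.
print(levels_per_variant(0.05))
```

A parameter lying between the thresholds of model M is thus assigned a different level in each variant, which is why interactions appear or disappear when both thresholds are shifted.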
Minimal intervention sets to prevent binding of Grb2 to Shc in response to insulin stimulation
Minimal intervention set
- set basal activity of insulin receptor, Grb2 or Shc to 0, i.e. remove respective species from the system
- prevent insulin binding to its receptor by blocking the binding site on insulin or by blocking both binding sites on the receptor
- prevent Shc binding to insulin receptor by blocking the binding site on Shc or by preventing the necessary phosphorylation of the receptor
- prevent Grb2 binding to Shc by blocking the binding site on Grb2 or by preventing the necessary phosphorylation of Shc
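How such intervention sets arise can be sketched with a brute-force search on a toy Boolean chain. The network below is a strongly simplified, hypothetical stand-in for the insulin branch (node names and rules are assumptions made for illustration), and the exhaustive enumeration is not the algorithm used by CellNetAnalyzer.

```python
from itertools import combinations

# Toy Boolean model (hypothetical simplification); rules are evaluated in order.
INPUTS = {"insulin": True, "IR": True, "Shc": True, "Grb2": True}
RULES = [
    ("ins_site1", lambda s: s["insulin"] and s["IR"]),
    ("ins_site2", lambda s: s["insulin"] and s["IR"]),
    ("IR_P", lambda s: s["ins_site1"] or s["ins_site2"]),
    ("Shc_IR", lambda s: s["Shc"] and s["IR_P"]),
    ("Shc_P", lambda s: s["Shc_IR"]),
    ("Grb2_Shc", lambda s: s["Grb2"] and s["Shc_P"]),
]


def evaluate(knockouts):
    """Compute the logical steady state with the given nodes fixed to 0."""
    state = {name: value and name not in knockouts for name, value in INPUTS.items()}
    for name, rule in RULES:
        state[name] = name not in knockouts and bool(rule(state))
    return state


def minimal_intervention_sets(target, candidates, max_size=2):
    """Enumerate smallest knockout sets that switch the target off."""
    found = []
    for size in range(1, max_size + 1):
        for combo in combinations(candidates, size):
            knockouts = frozenset(combo)
            # Skip supersets of already found sets: they are not minimal.
            if any(smaller <= knockouts for smaller in found):
                continue
            if not evaluate(knockouts)[target]:
                found.append(knockouts)
    return found


CANDIDATES = ["IR", "Shc", "Grb2", "ins_site1", "ins_site2", "IR_P", "Shc_IR", "Shc_P"]
for mis in minimal_intervention_sets("Grb2_Shc", CANDIDATES):
    print(sorted(mis))
```

In this toy network every node on the linear part of the chain forms an intervention set of size one, whereas the two alternative insulin binding sites have to be blocked together, mirroring the "blocking both binding sites on the receptor" entry above.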
Conclusions and discussion
We introduced the Process-Interaction-Model (PIM) concept as a means of combining the advantages of rule-based and logical modeling approaches. A PIM is based on a directed graph and incorporates the definition of molecules, domains, processes, interactions, kinetic parameters and logical values. A prototype implementation of the PIM concept has been integrated in the modeling software ProMoT. At the moment this software extension is available on request; it will be included in a future release.
A PIM can be seen as a compact description of a rule-based model, and the concept thereby facilitates the systematic set-up of such models. Besides rule-based models, logical models can be derived from the same basis. Thus the PIM concept enables a systematic and consistent set-up of models in two different specifications. Consequently, the signaling system can be studied on two different levels of detail by applying established simulation and analysis methods to both models. The common basis for the two models has the additional advantage that modifications (e.g. new insights into the structure of signaling systems or changes of parameters) can be made on the basis model and propagated into the model specifications.
When defining a PIM, one faces the same problems as when setting up a rule-based model: one needs to specify rate constants for every reaction and every reaction context. These are often hard to come by. The generation of the site-specific logical model additionally needs values for the thresholds. These can be equally hard to determine. For conventional logical models this information is not necessary, hence, the construction of a site-specific logical model using a PIM can be more involved. A PIM, however, is an efficient means to generate models of two different formalisms in a consistent way. This can more than offset the effort of specifying all parameters.
In the following paragraphs we briefly discuss the potential of the PIM concept.
A modeling workflow incorporating PIMs
A possible modeling workflow employing the PIM concept starts with the set-up of the PIM. As a second step, a site-specific logical model is generated and the qualitative behavior and structural properties of the signaling system are determined by structural and functional analysis of this model. In step three, a PIM refinement may be required based on the results of step two (i.e. processes have to be changed, interactions have to be added or replaced and/or parameters have to be changed accordingly). Steps two and three have to be repeated until further refinement is unnecessary and the logical model can reproduce experimental data. In step four, a rule-based model is generated and the quantitative behavior of the signaling system is determined by simulation and analysis of the rule-based model or the corresponding ODE model. Further cycles of PIM refinement, generation of the rule-based model and its simulation may be necessary to explain experimental data.
Site-specific logical models obtained with the PIM concept will usually describe signaling events in a very detailed manner. This is justified for early events in signaling systems. A natural extension of the aforementioned steps is therefore an integration of the site-specific logical model into existing logical models describing signaling events further downstream of the receptor.
This modeling workflow is only one possibility to employ the PIM concept for the investigation of signaling systems and will most likely have to be adapted to the problem to be solved.
PIMs facilitate scenarios for rule-based models
The systematic specification of the involved molecules and their affected sites in each process node greatly simplifies the study of scenarios that describe the removal of proteins from the system. This is especially useful for rule-based models, where the systematic removal of proteins can be challenging. With a PIM it is straightforward to generate not only one rule-based model but a family of models. This enables an analysis that has hitherto been restricted to logical models: studying the influence that the presence or absence of a molecule has on the system.
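The generation of such a model family can be sketched as follows. The process names, the representation of a process node as a set of involved molecules, and the rule that a process is dropped as soon as one of its molecules is removed are simplifying assumptions made for illustration; they do not reproduce the ProMoT data structures.

```python
from itertools import combinations

# Hypothetical process nodes: process name -> molecules involved in the process.
PROCESSES = {
    "insulin_binds_IR": {"insulin", "IR"},
    "IR_phosphorylation": {"IR"},
    "Shc_binds_IR": {"Shc", "IR"},
    "Grb2_binds_Shc": {"Grb2", "Shc"},
}


def remaining_processes(processes, removed):
    """Keep only processes whose involved molecules are all still present."""
    return {name: mols for name, mols in processes.items() if not (mols & removed)}


def model_family(processes, removable, max_removed=1):
    """Map each removal scenario to the sorted names of surviving processes."""
    family = {}
    for size in range(max_removed + 1):
        for combo in combinations(sorted(removable), size):
            removed = frozenset(combo)
            family[removed] = sorted(remaining_processes(processes, removed))
    return family


for removed, names in model_family(PROCESSES, {"Shc", "Grb2"}).items():
    print(sorted(removed), "->", names)
```

Each entry of the resulting dictionary corresponds to one member of the family, from which a separate rule-based model could then be generated.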
PIMs may support model reduction and checking of thermodynamic constraints
In general, quantitative models derived from PIMs result in very large ODE systems, so model reduction is advisable. The directed graphs used in the PIM concept are similar to the ones used in the reduction techniques for rule-based models described in [6, 7]. It is in principle possible to adapt these methods to the PIM concept such that both the rule-based specification and the site-specific logical model can be reduced in one step by applying the methods to the PIM. Furthermore, the systematic assembly of kinetic parameters makes it convenient to check thermodynamic constraints.
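One such thermodynamic constraint can be sketched for the simplest case of two binding processes that influence each other. Along a closed cycle of reversible reactions the product of the equilibrium constants must equal one, so the two routes to the doubly bound state must give the same overall constant. The function and argument names below are hypothetical, and the equilibrium constants are assumed to be association constants (k_on / k_off).

```python
import math


def equilibrium_constant(k_on, k_off):
    """Association equilibrium constant of a reversible binding process."""
    return k_on / k_off


def cycle_is_consistent(K_a_free, K_a_bound, K_b_free, K_b_bound, rel_tol=1e-9):
    """Check detailed balance for two mutually dependent binding processes.

    K_a_free  : constant of process a while the site of process b is unbound
    K_a_bound : constant of process a while the site of process b is bound
    K_b_free, K_b_bound : analogous constants for process b

    Route 1 (a first, then b) yields K_a_free * K_b_bound,
    route 2 (b first, then a) yields K_b_free * K_a_bound.
    """
    return math.isclose(K_a_free * K_b_bound, K_b_free * K_a_bound, rel_tol=rel_tol)


# Both processes are weakened tenfold by the other one: consistent.
print(cycle_is_consistent(2.0, 0.2, 5.0, 0.5))
# Only process a is affected by process b: violates the constraint.
print(cycle_is_consistent(2.0, 0.2, 5.0, 5.0))
```

Because a parameter table lists the constants of one process for every context, the pairs needed for such a check are available in one place.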
To conclude, the PIM concept offers connections to a variety of established methods. It has the potential to become a valuable tool.
In the previous sections we presented the concept of a PIM with the two process types occurring most frequently in signaling systems. We pointed out which information has to be assigned to process nodes representing processes of the type binding and modification: the involved molecules, their affected sites and parameter tables. Furthermore, algorithms for the generation of rule-based and site-specific logical models from a PIM containing these process types have been worked out. In the following, algorithmic details for rule-based and logical model generation will be illustrated for special cases of combinations of these two process types, followed by a discussion on how further process types can be integrated in the PIM concept. Thereby, simple examples will be used to demonstrate how other process types can be represented in the PIM concept and how the algorithms for the generation of rule-based and site-specific logical models can be extended. The complete description of these process types is beyond the scope of the paper.
In the following, special cases of constellations of binding and modification processes will be presented. Peculiarities may arise in the generation of rule-based and site-specific logical model specifications from a PIM.
One molecule can bind on different sites which are subject to prior modification
Mutually exclusive preceding processes
Discussion of additional process types
In intracellular signaling, processes other than binding and modification can occur. Arguably the most important ones are polymerization, synthesis, degradation and change of compartment. Although these process types are currently not implemented, it is in principle possible to incorporate them. Below we briefly discuss how this can be achieved.
In the parameter table in Figure 17(a) the cases ‘01’ and ‘10’ are identical for homodimerization. Parameters are needed for only one of these cases, thus the gray row means that the fields should not be filled with parameters. Analogous to BioNetGen, the association of monomers in the same state (i.e. the first and the last row in the parameter table in Figure 17(a)) is parametrized with 0.5 times the nominal rate constant. For the logical approach, we have to assume that all representations of the same species are in the same state, either on (e.g. phosphorylated in Figure 17) or off (e.g. unphosphorylated in Figure 17). Therefore, for the logical model, only the rows for equal monomer sites are considered. The remaining fields in the parameter table in Figure 17(a) are colored in gray.
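The handling of the symmetric rows can be sketched in a few lines. The representation of the parameter table as a dictionary keyed by the two monomer states, and the function names, are assumptions made for illustration.

```python
def homodimer_table(nominal_rates):
    """Build the association parameters of a homodimerization process.

    nominal_rates maps a state pair such as "00", "01" or "11" to the
    nominal association rate constant. The pairs "01" and "10" describe
    the same case, so only one of them may be given.
    """
    table = {}
    for pair, rate in nominal_rates.items():
        canonical = "".join(sorted(pair))  # "10" is stored as "01"
        if canonical in table:
            raise ValueError(f"case {pair!r} was specified twice")
        # Monomers in the same state: 0.5 times the nominal rate constant.
        factor = 0.5 if canonical[0] == canonical[1] else 1.0
        table[canonical] = factor * rate
    return table


def logical_rows(table):
    """Rows used for the logical model: both monomers in the same state."""
    return {pair: rate for pair, rate in table.items() if pair[0] == pair[1]}


table = homodimer_table({"00": 2.0, "10": 3.0, "11": 4.0})
print(table)
print(logical_rows(table))
```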
Degradation and synthesis
Degradation is a further process occurring in signaling systems. Here we argue that this process type could be integrated in the PIM concept. Information about the name of the molecule which will be degraded and a parameter table have to be assigned to the process node. As degradation is an irreversible process and BioNetGen has a special concept to describe these reactions, kinetic parameters are only needed for the forward direction.
BioNetGen additionally allows specifying the degradation of complexes by adding keywords to the rules. Here we restrict our discussion to the simplest case, the degradation of a single molecule.
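A process node of this type could be sketched as below. The class and field names are hypothetical, and the emitted string assumes the BNGL convention of writing 0 for the empty product side of a degradation rule; it should be checked against the BioNetGen documentation before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DegradationProcess:
    """Hypothetical process node for the degradation of a single molecule."""

    molecule: str       # name of the molecule that is degraded
    rate_constant: str  # name of the forward rate constant (no reverse rate)

    def to_rule(self):
        """Return a unidirectional degradation rule as a string.

        Assumption: BNGL writes the empty product side as 0.
        """
        return f"{self.molecule}() -> 0 {self.rate_constant}"


print(DegradationProcess("Shc", "k_deg").to_rule())
```

Only a single rate constant is stored because the process is irreversible.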
To describe the synthesis of a molecule, another specialized process type has to be included in the PIM concept. A process node of this type has to store information about the newly synthesized molecule, the state of its sites, the additional molecule, and whether it is synthesized bound or unbound.
Logical models obtained from PIMs which contain synthesis processes do not gain additional information because in the site-specific logical models the basal activity species act as inputs and are set prior to analysis.
Change of compartment can be modeled like a modification process
The change of a species' localization is a process occurring frequently during signal transfer in cells; for example, a receptor receiving a signal from the extracellular space is internalized (i.e. moves into an endosome). Rule-based models seldom incorporate information about species localization because it increases the complexity of the models. In BioNetGen, molecule localization can be treated like a modification: a further site is added, and the state of this site represents the localization of the molecule. In complex cases, however, this approach may be error-prone. The modeler has to ensure that molecules in a complex change their location state together and that molecules can only move into adjacent compartments. Furthermore, for processes taking place in different compartments, identical reaction rules differing solely in their localization state have to be written down. To overcome these difficulties, BioNetGen has recently incorporated a concept called cBNGL. This approach adds an additional attribute to species and molecules and thereby enables modeling of the compartmental organization of cells by storing a directed graph representing the compartment topology. Incorporating this approach into PIMs would require storing additional information. Hence, it is currently not possible to generate rule-based models in cBNGL from a PIM. Instead, in a PIM, two different compartment localizations of a molecule can be represented by treating compartment changes like modification processes. For that, no special process type is introduced. Some of the error-prone aspects of modeling compartment changes as modifications are mitigated by the systematic specification approach followed by the PIM concept. Nevertheless, the modeler has to take care of the adjacency of compartments.
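The two checks left to the modeler can be sketched as follows. The compartment names, the adjacency table and the representation of a complex as a mapping from molecule to location state are hypothetical; the sketch neither uses cBNGL nor reflects the ProMoT implementation.

```python
# Hypothetical compartment topology: compartment -> directly adjacent compartments.
ADJACENT = {
    "extracellular": {"plasma_membrane"},
    "plasma_membrane": {"extracellular", "cytosol", "endosome"},
    "endosome": {"plasma_membrane", "cytosol"},
    "cytosol": {"plasma_membrane", "endosome"},
}


def is_adjacent(source, destination):
    """True if a molecule may move directly from source to destination."""
    return destination in ADJACENT.get(source, set())


def move_complex(locations, destination):
    """Change the location state of all molecules of one complex together.

    locations maps each molecule of the complex to the state of its
    location site. Raises ValueError if the molecules do not share one
    location or if the destination is not adjacent to it.
    """
    current = set(locations.values())
    if len(current) != 1:
        raise ValueError("molecules of one complex must share a location")
    (source,) = current
    if not is_adjacent(source, destination):
        raise ValueError(f"{destination!r} is not adjacent to {source!r}")
    return {molecule: destination for molecule in locations}


# Internalization of a ligand-bound receptor into an endosome.
print(move_complex({"EGF": "plasma_membrane", "EGFR": "plasma_membrane"}, "endosome"))
```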
KK, RS and HC designed the PIM concept. KK implemented the concept in the modeling software ProMoT. KK, RS, SM and CC prepared the manuscript jointly. All authors have read and accepted the manuscript.
We thank Julio Saez-Rodriguez for interesting discussions on site-specific logical modeling and Steffen Klamt for inspiring discussions about the manuscript.
- Gilbert D, Fuss H, Gu X, Orton R, Robinson S, Vyshemirsky V, Kurth MJ, Downes CS, Dubitzky W: Computational methodologies for modelling, analysis and simulation of signalling networks. Brief Bioinform 2006, 7(4):339–353. http://dx.doi.org/10.1093/bib/bbl043
- Pawson T, Nash P: Assembly of cell regulatory systems through protein interaction domains. Science 2003, 300(5618):445–452. http://dx.doi.org/10.1126/science.1083653
- Hlavacek WS, Faeder JR, Blinov ML, Perelson AS, Goldstein B: The complexity of complexes in signal transduction. Biotechnology and Bioengineering 2003, 84(7):783–794. http://dx.doi.org/10.1002/bit.10842
- Faeder JR, Blinov ML, Goldstein B, Hlavacek WS: Rule-based modeling of biochemical networks. Complexity 2005, 10(4):22–41. http://dx.doi.org/10.1002/cplx.20074
- Sneddon MW, Faeder JR, Emonet T: Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nat Methods 2011, 8(2):177–183. http://dx.doi.org/10.1038/nmeth.1546
- Koschorreck M, Conzelmann H, Ebert S, Ederer M, Gilles ED: Reduced modeling of signal transduction - a modular approach. BMC Bioinformatics 2007, 8:336. http://dx.doi.org/10.1186/1471-2105-8-336
- Conzelmann H, Fey D, Gilles ED: Exact model reduction of combinatorial reaction networks. BMC Syst Biol 2008, 2:78. http://dx.doi.org/10.1186/1752-0509-2-78
- Borisov NM, Chistopolsky AS, Faeder JR, Kholodenko BN: Domain-oriented reduction of rule-based network models. IET Syst Biol 2008, 2(5):342–351. http://dx.doi.org/10.1049/iet-syb:20070081
- Saez-Rodriguez J, Simeoni L, Lindquist JA, Hemenway R, Bommhardt U, Arndt B, Haus UU, Weismantel R, Gilles ED, Klamt S, Schraven B: A logical model provides insights into T cell receptor signaling. PLoS Comput Biol 2007, 3(8):e163. http://dx.doi.org/10.1371/journal.pcbi.0030163
- Samaga R, Saez-Rodriguez J, Alexopoulos LG, Sorger PK, Klamt S: The logic of EGFR/ErbB signaling: theoretical properties and analysis of high-throughput data. PLoS Comput Biol 2009, 5(8):e1000438. http://dx.doi.org/10.1371/journal.pcbi.1000438
- Schlatter R, Schmich K, Vizcarra IA, Scheurich P, Sauter T, Borner C, Ederer M, Merfort I, Sawodny O: ON/OFF and beyond–a boolean model of apoptosis. PLoS Comput Biol 2009, 5(12):e1000595. http://dx.doi.org/10.1371/journal.pcbi.1000595
- Klamt S, Saez-Rodriguez J, Lindquist JA, Simeoni L, Gilles ED: A methodology for the structural and functional analysis of signaling and regulatory networks. BMC Bioinformatics 2006, 7:56. http://dx.doi.org/10.1186/1471-2105-7-56
- Klamt S, Saez-Rodriguez J, Gilles ED: Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst Biol 2007, 1:2. http://dx.doi.org/10.1186/1752-0509-1-2
- Samaga R, von Kamp A, Klamt S: Computing combinatorial intervention strategies and failure modes in signaling networks. J Comput Biol 2010, 17:39–53. http://dx.doi.org/10.1089/cmb.2009.0121
- Saez-Rodriguez J, Mirschel S, Hemenway R, Klamt S, Gilles ED, Ginkel M: Visual setup of logical models of signaling and regulatory networks with ProMoT. BMC Bioinformatics 2006, 7:506. http://dx.doi.org/10.1186/1471-2105-7-506
- Mirschel S, Steinmetz K, Rempel M, Ginkel M, Gilles ED: PROMOT: modular modeling for systems biology. Bioinformatics 2009, 25(5):687–689. http://dx.doi.org/10.1093/bioinformatics/btp029
- Faeder JR, Blinov ML, Hlavacek WS: Rule-based modeling of biochemical systems with BioNetGen. Methods Mol Biol 2009, 500:113–167. http://dx.doi.org/10.1007/978-1-59745-525-1_5
- Blinov ML, Faeder JR, Goldstein B, Hlavacek WS: BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 2004, 20(17):3289–3291. http://dx.doi.org/10.1093/bioinformatics/bth378
- Koschorreck M, Gilles ED: ALC: automated reduction of rule-based models. BMC Syst Biol 2008, 2:91. http://dx.doi.org/10.1186/1752-0509-2-91
- Krivine J, Danos V, Benecke A: Modelling Epigenetic Information Maintenance: a Kappa Tutorial. In Computer Aided Verification, Proceedings, Volume 5643 of Lecture Notes in Computer Science. Edited by: Bouajjani A, Maler O. Springer-Verlag Berlin; 2009:17–32.
- Blinov M, Yang J, Faeder J, Hlavacek W: Graph Theory for Rule-Based Modeling of Biochemical Networks. In Transactions on Computational Systems Biology VII, Volume 4230 of Lecture Notes in Computer Science. Edited by: Priami C, Ingólfsdóttir A, Mishra B, Riis Nielson H. Springer Berlin/Heidelberg; 2006:89–106. http://dx.doi.org/10.1007/11905455_5
- Nag A, Monine MI, Blinov ML, Goldstein B: A detailed mathematical model predicts that serial engagement of IgE-Fc epsilon RI complexes can enhance Syk activation in mast cells. J Immunol 2010, 185(6):3268–3276. http://dx.doi.org/10.4049/jimmunol.1000326
- Geier F, Fengos G, Iber D: A computational analysis of the dynamic roles of talin, Dok1, and PIPKI for integrin activation. PLoS One 2011, 6(11):e24808. http://dx.doi.org/10.1371/journal.pone.0024808
- Chylek LA, Hu B, Blinov ML, Emonet T, Faeder JR, Goldstein B, Gutenkunst RN, Haugh JM, Lipniacki T, Posner RG, Yang J, Hlavacek WS: Guidelines for visualizing and annotating rule-based models. Mol Biosyst 2011, 7(10):2779–2795. http://dx.doi.org/10.1039/c1mb05077j
- Kocieniewski P, Faeder JR, Lipniacki T: The interplay of double phosphorylation and scaffolding in MAPK pathways. J Theor Biol 2012, 295:116–124. http://dx.doi.org/10.1016/j.jtbi.2011.11.014
- Yang J, Monine MI, Faeder JR, Hlavacek WS: Kinetic Monte Carlo method for rule-based modeling of biochemical networks. Phys Rev E Stat Nonlin Soft Matter Phys 2008, 78(3 Pt 1):031910.
- Colvin J, Monine M, Gutenkunst R, Hlavacek W, Von Hoff D, Posner R: RuleMonkey: software for stochastic simulation of rule-based models. BMC Bioinformatics 2010, 11:404. http://www.biomedcentral.com/1471-2105/11/404 doi:10.1186/1471-2105-11-404
- Yang J, Meng X, Hlavacek WS: Rule-based modeling and simulation of biochemical systems with molecular finite automata. IET Syst Biol 2010, 4:453–466. doi:10.1049/iet-syb.2010.0015
- Kauffman SA: Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol 1969, 22(3):437–467. doi:10.1016/0022-5193(69)90015-0
- Thomas R: Boolean formalization of genetic control circuits. J Theor Biol 1973, 42(3):563–585. doi:10.1016/0022-5193(73)90247-6
- Thomas R, D’Ari R: Biological Feedback. Boca Raton, Florida: CRC Press; 1990.
- Mendoza L, Thieffry D, Alvarez-Buylla ER: Genetic control of flower morphogenesis in Arabidopsis thaliana: a logical analysis. Bioinformatics 1999, 15(7–8):593–606.
- Albert R, Othmer HG: The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol 2003, 223:1–18. doi:10.1016/S0022-5193(03)00035-3
- Christensen TS, Oliveira AP, Nielsen J: Reconstruction and logical modeling of glucose repression signaling pathways in Saccharomyces cerevisiae. BMC Syst Biol 2009, 3:7. http://dx.doi.org/10.1186/1752-0509-3-7
- Ryll A, Samaga R, Schaper F, Alexopoulos LG, Klamt S: Large-scale network models of IL-1 and IL-6 signalling and their hepatocellular specification. Mol Biosyst 2011, 7:3253–3270. http://dx.doi.org/10.1039/c1mb05261f
- Cornish-Bowden A: Fundamentals of Enzyme Kinetics. London: Portland Press Ltd.; 2004.
- Karnaugh M: The map method for synthesis of combinational logic circuits. Transactions of the American Institute of Electrical Engineers 1953, 72(9):593–599.
- Veitch EW: A chart method for simplifying truth functions. In Proceedings of the 1952 ACM national meeting (Pittsburgh). New York, NY USA: ACM, ACM ’52; 1952:127–133. http://doi.acm.org/10.1145/609784.609801
- McCluskey EJ: Minimization of Boolean functions. Bell Syst Tech J 1956, 35(5):1417–1444.
- Ginkel M, Kremling A, Nutsch T, Rehner R, Gilles ED: Modular modeling of cellular systems with ProMoT/Diva. Bioinformatics 2003, 19(9):1169–1176. http://bioinformatics.oxfordjournals.org/content/19/9/1169.abstract doi:10.1093/bioinformatics/btg128
- Ward CW, Lawrence MC, Streltsov VA, Adams TE, McKern NM: The insulin and EGF receptor structures: new insights into ligand-induced receptor activation. Trends Biochem Sci 2007, 32(3):129–137. http://dx.doi.org/10.1016/j.tibs.2007.01.001
- Harris LA, Hogg JS, Faeder JR: Compartmental rule-based modeling of biochemical systems. In Winter Simulation Conference, WSC ’09. Winter Simulation Conference 2009, 908–919. http://dl.acm.org/citation.cfm?id=1995456.1995588
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.