The Process-Interaction-Model: a common representation of rule-based and logical models allows studying signal transduction on different levels of detail
 Katrin Kolczyk^{1},
 Regina Samaga^{1},
 Holger Conzelmann^{1},
 Sebastian Mirschel^{1} and
 Carsten Conradi^{1}
https://doi.org/10.1186/1471-2105-13-251
© Kolczyk et al; licensee BioMed Central Ltd. 2012
Received: 13 March 2012
Accepted: 21 September 2012
Published: 28 September 2012
Abstract
Background
Signaling systems typically involve large, structured molecules each consisting of a large number of subunits called molecule domains. In modeling such systems these domains can be considered as the main players. In order to handle the resulting combinatorial complexity, rulebased modeling has been established as the tool of choice. In contrast to the detailed quantitative rulebased modeling, qualitative modeling approaches like logical modeling rely solely on the network structure and are particularly useful for analyzing structural and functional properties of signaling systems.
Results
We introduce the ProcessInteractionModel (PIM) concept. It defines a common representation (or basis) of rulebased models and sitespecific logical models, and, furthermore, includes methods to derive models of both types from a given PIM. A PIM is based on directed graphs with nodes representing processes like posttranslational modifications or binding processes and edges representing the interactions among processes. The applicability of the concept has been demonstrated by applying it to a model describing EGF insulin crosstalk. A prototypic implementation of the PIM concept has been integrated in the modeling software ProMoT.
Conclusions
The PIM concept provides a common basis for two modeling formalisms tailored to the study of signaling systems: a quantitative (rulebased) and a qualitative (logical) modeling formalism. Every PIM is a compact specification of a rulebased model and facilitates the systematic setup of a rulebased model, while at the same time facilitating the automatic generation of a sitespecific logical model. Consequently, modifications can be made on the underlying basis and then be propagated into the different model specifications – ensuring consistency of all models, regardless of the modeling formalism. This facilitates the analysis of a system on different levels of detail as it guarantees the application of established simulation and analysis methods to consistent descriptions (rulebased and logical) of a particular signaling system.
Background
Understanding intracellular signaling is one of the major challenges in Systems Biology[1]. It is complicated by the nature of signaling molecules themselves: many signaling molecules, in particular receptor molecules, are large structured proteins consisting of several interacting subunits. These subunits, also called domains, usually contain one site which can form a bond with other proteins and/or be subject to post-translational modifications. Hence, each site can take different states. The state of a molecule is defined by the states of its sites (e.g. a receptor is phosphorylated at a particular site and unphosphorylated at another site). If one is interested in the early events of signaling, then realistic descriptions of signaling systems have to reflect this protein structure, at least in part. Pawson and Nash therefore proposed to consider the domains of molecules, instead of complete molecules, as the main players in signaling networks[2].
In modeling approaches that adopt this point of view, every possible state of a protein is described by a variable of its own. As signaling systems contain many such molecules, each with a large number of domains, one often faces a combinatorial explosion of the number of states[3]. For example, in a complete description (i.e. a description incorporating all possible states of all molecule domains), a model of a protein with n phosphorylation sites contains 2^{ n } variables. If each site can also be bound by other molecules, the number of required variables increases to 3^{ n }.
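The growth in the number of variables can be checked by direct enumeration. The following Python sketch assumes that each site is either unphosphorylated or phosphorylated and, in the second case, that only a phosphorylated site can additionally carry a binding partner (giving three states per site). The state labels and the function name are illustrative assumptions and not part of any published tool.

```python
from itertools import product


def enumerate_states(n_sites, site_states):
    """Return every combination of per-site states for a protein with n_sites sites."""
    return list(product(site_states, repeat=n_sites))


# Assumed state labels (hypothetical):
# "u" = unphosphorylated, "p" = phosphorylated,
# "pb" = phosphorylated with a binding partner attached.
PHOSPHO_ONLY = ("u", "p")
PHOSPHO_AND_BINDING = ("u", "p", "pb")

for n in (1, 2, 3, 4):
    n_phospho = len(enumerate_states(n, PHOSPHO_ONLY))         # equals 2**n
    n_binding = len(enumerate_states(n, PHOSPHO_AND_BINDING))  # equals 3**n
    print(n, n_phospho, n_binding)
```

Already for a handful of sites the full enumeration becomes unwieldy, which is the motivation for pattern-based descriptions.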
In signaling systems composed (mainly) of such structured proteins, subsets of protein states often share common characteristics: for example, if the binding of receptor and ligand occurs with the same kinetic constants, regardless of the phosphorylation state of a different site. In a complete description this binding reaction has to be specified at least twice, with identical rate constants (once for the phosphorylated and once for the unphosphorylated receptor state). This redundant specification makes model setup complicated and model analysis difficult and thus increases the probability of a model failing to be internally consistent.
Recently, rulebased modeling has been established as the tool of choice to handle this combinatorial complexity. Given a model in a rulebased formalism, quantitative predictions are in general easy to obtain – either via generation of a quantitative model in the form of ordinary differential equations (which is straightforward) or by direct (stochastic) simulation (see, for example,[4, 5]). By using the methods described in[6–8] it is possible to reduce the number of equations in an ODE model derived from a rulebased description without losing any information.
Many biologically relevant questions, however, are not necessarily quantitative but rather qualitative in nature. One might, for example, only be interested in whether or not a ligand can activate a transcription factor at all, or how the activation of a certain species is prevented by a small number of knockouts. More details and further examples can be found in[9–11]. Even though these questions can in principle be answered with the help of quantitative models, qualitative models such as logical models have become the tool of choice for studying these questions as they often require less detailed knowledge. For the setup and analysis of such models a variety of methods exists that are especially suited for studying causal relationships among species in signaling networks. This kind of analysis is often called ‘Structural and Functional Analysis’[12–15].
Building a logical model describing all possible states of the structured molecules central to signaling systems faces the same challenges as building an ODE model considering such states. Even though it is in principle possible to build such a model (in a way similar to quantitative models), this is a challenging and errorprone task that is not immune to the combinatorial explosion of the number of states. Hence we propose what we call a sitespecific logical model that enables a systematic description of processes on sites of molecules similar to the rulebased modeling formalism. Sitespecific logical models enable – to the best of our knowledge – for the first time the abovementioned structural and functional analysis of complete descriptions of signaling systems.
In this contribution we will exemplify that the ProcessInteractionModel (PIM) concept combines the advantages of rulebased modeling and sitespecific logical modeling in a common representation. Every PIM incorporates all information that is necessary to build consistent models in the different formalisms. Furthermore, this article will describe a concept that comprises algorithms to generate rulebased and logical models from a PIM. Every PIM can be seen as a compact specification of a rulebased model and facilitates the systematic setup of a rulebased model, while at the same time facilitating the automatic generation of a sitespecific logical model.
In the following two subsections we briefly introduce the main concepts of rulebased and logical modeling required for the PIM concept. The remainder of this article consists of the sections “Results” and “Methods”. In the section “Results”, the basic ideas of the PIM concept are introduced, followed by a brief description of its realization within the ProMoT framework[16] and an application to the early events of EGF and insulin signaling. Details of the underlying algorithms and the potential extension of the PIM concept are discussed in the section “Methods”.
Rulebased modeling facilitates handling of combinatorial complexity
Rulebased modeling has been established as an efficient way to handle the combinatorial complexity that is characteristic for realistic networks in signal transduction[3]. It is an approach tailored to the setup of such networks and can be seen as a compact model specification[4]. In rulebased modeling classes of biochemical reactions having the same kinetic parameters are described by reaction rules that can be expanded to ordinary differential equations (ODEs) in a straightforward way[4, 17, 18].
By omitting unnecessary information about molecule domains that are not involved (the “don’t care, don’t write” principle) and by using patterns, combinatorial complexity can be handled in a systematic manner. Patterns comprise sets of molecules or molecule complexes sharing common characteristics and describe their states. Such a pattern, for example, can comprise all receptor molecules which have a ligand bound, regardless of the states of other phosphorylation and binding sites (i.e. this pattern describes all receptor-ligand complexes with different phosphorylation and binding states).
Patterns are connected by reaction rules describing the evolution of a system. Each rule contains patterns on the right and left side of a reaction arrow followed by kinetic parameters. Every reaction rule is either reversible or irreversible and describes the change of the state of one or two sites (e.g. in modification processes one site changes from unmodified to modified or in binding processes two sites change from unbound to bound). The affected sites in a rule, that is, the sites which change their state, are called the reaction center, while sites that remain unchanged are called the reaction context[17]. Rules describe biological facts like “the phosphorylation of the insulin receptor at a particular tyrosine residue occurs at a higher rate if insulin is bound to the receptor.”
Many tools facilitate rulebased modeling, for example, BioNetGen[17], ALC[19] and Kappa[20]. These tools require a textbased specification of rulebased models. BioNetGen additionally uses a graph structure to represent these models[21], where the molecules are represented as building blocks composed of reactive sites and the reaction rules are denoted as graphrewriting rules. The BioNetGen language (BNGL) is emerging as a quasistandard for rulebased modeling and several rulebased models have already been published in BNGL[22–25]. Furthermore, BioNetGen offers different simulation opportunities of rulebased models and various interfaces to simulation tools[26–28]. Recently, visualization and annotation guidelines for rulebased models have been proposed[24].
Logical modeling facilitates understanding of causal relationships
Qualitative modeling approaches have been emerging as relevant complements to dynamic modeling as they require less detailed knowledge about kinetic laws and parameters while at the same time allowing the study of important structural and functional properties of the system. Logical models are one example. Originally used to describe random networks[29] or gene regulatory networks of moderate size[30–33], logical modeling has been established as a valuable tool for the analysis of signaling pathways[9–11, 34, 35].
For the setup and analysis of the sitespecific logical models presented herein the logical modeling framework introduced in[12] is used. This formalism is tailored to the study of qualitative inputoutput responses of signaling networks. Biological species such as ligands, receptors, adaptor proteins, or kinases are represented as nodes of the logical network. Each of these nodes has an associated logical state indicating whether the species is active/present (1) or not (0). As the state of a node can also be undefined/unknown (*) a threevalued logic is used. Logical operations on the network nodes represent the signaling events and are given in disjunctive normal form. Besides the logical operators AND, OR and NOT, operators with incomplete truth table (ITT gates) can be utilized in those cases where no decision whether an AND or OR gate should be used can be made[10]. The logical model is represented as a logical interaction hypergraph[12] and methods for the analysis of these networks are implemented in the software CellNetAnalyzer[13]. The main difference to the sitespecific logical model proposed herein is that states in the latter represent the states of molecule domains instead of molecules themselves.
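The three node states (1, 0 and undefined) can be illustrated with a small Python sketch. It assumes Kleene's strong three-valued logic for propagating the undefined value, which is a common choice but not necessarily the exact convention of the framework in[12]; the function names are hypothetical.

```python
UNKNOWN = None  # stands for the undefined/unknown state written as '*' in the text


def tv_not(a):
    """NOT: an unknown input stays unknown."""
    return UNKNOWN if a is UNKNOWN else 1 - a


def tv_and(*args):
    """AND: a single 0 decides the result, otherwise unknown inputs dominate."""
    if any(a == 0 for a in args):
        return 0
    if any(a is UNKNOWN for a in args):
        return UNKNOWN
    return 1


def tv_or(*args):
    """OR: a single 1 decides the result, otherwise unknown inputs dominate."""
    if any(a == 1 for a in args):
        return 1
    if any(a is UNKNOWN for a in args):
        return UNKNOWN
    return 0


# A sum of products: (ligand AND receptor) OR alternative_route
ligand, receptor, alternative_route = 1, UNKNOWN, 0
print(tv_or(tv_and(ligand, receptor), alternative_route))
```

Under this assumption an undefined input only propagates when the remaining inputs do not already determine the output.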
Results
In this section we demonstrate that PIM construction is straightforward given graphical representations commonly used in Systems Biology. It is organized as follows: in section “PIM definition and construction” the formal definitions are given. In the sections “A PIM facilitates rulebased model building” and “A PIM uncovers the logic of rulebased models” it is explained how both model types (rulebased and logical) can be derived from a PIM. In the section “Implementation of the PIM concept” we briefly discuss how the concept is realized in the software ProMoT[16] and the section “Application to insulin and EGF signaling” finally demonstrates the applicability and the benefits of a PIM by applying it to the model presented in[7].
PIM definition and construction
A PIM can be defined for every signaling system consisting of reactions described by mass action kinetics. Obviously, many existing models contain non-mass action kinetics (e.g. convenience kinetics characterizing regulatory feedbacks). These are not directly amenable to the PIM concept. However, we are convinced that this is not a severe limitation, as by modeling such reactions in greater detail it is often possible to replace a reaction with non-mass action kinetics by a network of reactions on the mass action level (many examples can be found in[36]). Moreover, PIMs are expected to be used in modeling early events in signaling pathways; such systems are often modeled in great enough detail to justify mass action kinetics.
A PIM is represented by a directed graph with nodes representing processes like posttranslational modification, binding and so on. Edges represent interactions among processes. An edge is added between two processes if a process occurs with different kinetic parameters depending on the occurrence of the other process (e.g. a modification process on a particular site of a receptor is described by different reaction rates, depending on whether or not a ligand is bound). An interaction is either unidirectional, bidirectional or allornone. The latter type of interaction can be used to describe a situation where a process can occur only after another process has occurred (e.g. the binding at a phosphorylation site can only occur after the site has been phosphorylated). This type of interaction has been introduced in[6, 19] and is employed for model reduction purposes.
In the context of combinatorial reaction networks, processes and interactions are already introduced in[6, 7, 19] and a graph with nodes representing processes and edges representing interactions is used in[7]. While in[6, 7, 19] the focus is on reduction of models of combinatorial reaction networks, the PIM concept focuses on the setup of two consistent models in different formalisms.
The PIM concept is closely related to rulebased modeling
In rule-based modeling one often faces the situation that several rules with the same reaction center but different reaction context are necessary to describe a process. In a PIM every node represents a reaction center and the incoming edges represent the contextual information. Hence, the PIM concept is closely related to rule-based modeling, as every process node can be interpreted as an aggregation of reaction rules with the same reaction center, and the contextual information of a process node comprises the reaction context of every rule involving that reaction center. This supports our claim that every PIM is a compact representation of a rule-based model. For example, in Figure 2 process node 1 corresponds to the first reaction rule in Figure 1 (i.e. the binding of molecule A and R), and process node 2 corresponds to reaction rules 2 to 5 describing the modification of molecule R at site p1 under different conditions (i.e. depending on whether or not the binding of A and R has previously taken place and whether or not molecule R has been modified at p2). Process node 3 corresponds to reaction rules 6 and 7 (i.e. the modification of molecule R at site p2), and process node 4 corresponds to reaction rule 8 (i.e. the binding of B at the modified site p1 of molecule R).
A process node represents a process in different reaction context
The main processes in signal transduction are binding processes and modification processes. Every process node carries information about the involved molecules and sites: binding processes are assigned two molecules and sites, and modification processes are assigned a single molecule and site. Additional process types are defined and will be described in detail in the section “Methods”.
Column y stores the information whether the process represented by the node itself has occurred (y = 1) or not (y = 0). The value of y is determined either by the equilibrium constant k_{ eq } (for reversible processes) or by the forward rate constant (for irreversible processes), as is described below. The columns representing incoming edges contain logical values denoting the fact that the preceding process has or has not occurred (1 denotes ‘process has occurred’, 0 denotes ‘process has not occurred’). Figure 3 depicts the PIM for the small example in Figure 2 and the parameter table of process node 2. The column labeled 1 indicates the occurrence of process 1 and the column labeled 3 indicates the occurrence of process 3.
Parameter tables are related to truth tables
Note that column y can be interpreted as an output column associated to a process node and indicates if the process is considered as ‘has occurred’ (y = 1) or ‘has not occurred’(y = 0) for a given combination of input values representing a certain reaction context. Hence, the parameter tables are similar to truth tables, where the inputs for the table are “a previous process has occurred or not”. As in general all combinations have to be accounted for, the table has 2^{ #in } rows.
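As a minimal illustration of the row count, the input part of such a table can be enumerated as follows. The function name is hypothetical; the example uses a node with incoming edges from processes 1 and 3, as described for process node 2.

```python
from itertools import product


def parameter_table_rows(incoming):
    """List all 0/1 input combinations for the given incoming process labels."""
    return [
        dict(zip(incoming, values))
        for values in product((0, 1), repeat=len(incoming))
    ]


# Process node 2 has incoming edges from processes 1 and 3.
rows = parameter_table_rows(["1", "3"])
for row in rows:
    print(row)
```

A node without incoming edges yields a single row, in line with 2^{ 0 } = 1.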
To decide about the occurrence of a reaction and thereby the values of the output column, two threshold values t_{ 1 } < t_{ 2 } are introduced. If the process is reversible, the equilibrium constant k_{ eq }, defined as the quotient of the forward and the backward rate constant of each reaction, is used (following from the law of mass action). If the equilibrium constant is greater than or equal to the upper threshold (k_{ eq } ≥ t_{ 2 }), we regard the reaction as ‘has occurred’ and set y = 1. If the equilibrium constant is equal to or less than the lower threshold (k_{ eq } ≤ t_{ 1 }), we regard the reaction as ‘has not occurred’ and set y = 0. If t_{ 1 } < k_{ eq } < t_{ 2 }, neither is the case and the output is unknown (y = ∗). If the process is defined as irreversible, we use the forward rate constant k_{ fw } to decide about the output of the reaction. If k_{ fw } ≤ t_{ 1 }, we regard the reaction as ‘has not occurred’ and set y = 0; if k_{ fw } ≥ t_{ 2 }, we regard the reaction as ‘has occurred’ and set y = 1; and if k_{ fw } lies between the two defined thresholds, the output is unknown (y = ∗).
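The threshold rule can be written down compactly. The following Python sketch is an illustration only, with hypothetical function names; it represents the unknown output by `None`.

```python
def discretize(k, t1, t2):
    """Map a rate or equilibrium constant to 0, 1 or None (unknown, '*')."""
    if not t1 < t2:
        raise ValueError("thresholds must satisfy t1 < t2")
    if k >= t2:
        return 1   # 'has occurred'
    if k <= t1:
        return 0   # 'has not occurred'
    return None    # strictly between the thresholds: unknown


def output_value(k_fw, k_bw, t1, t2, reversible=True):
    """Use k_eq = k_fw / k_bw for reversible processes and k_fw otherwise."""
    k = k_fw / k_bw if reversible else k_fw
    return discretize(k, t1, t2)


print(output_value(5.0, 1.0, t1=0.01, t2=0.1))      # k_eq = 5 lies above t2
print(output_value(1.0, 1000.0, t1=0.01, t2=0.1))   # k_eq = 0.001 lies below t1
print(output_value(0.05, None, t1=0.01, t2=0.1, reversible=False))
```

For an irreversible process the backward rate constant is ignored, so any placeholder value can be passed for it.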
This assignment of output values is based on the following idea: we compare the equilibrium constants of the same reaction under different conditions (i.e. in different context) and interpret the relative size of the equilibrium constant as a measure of the influence the reaction context exerts on the outcome of the process. The thresholds are thus a means to reflect this influence of the reaction context and can be chosen for each process individually. Moreover, threshold values will determine topology and logical function(s) of the sitespecific logical model.
The choice of thresholds will in general be based on the biological intuition of each modeler as it reflects a judgment about the influence of the reaction context on the process outcome. Hence, threshold choice is one of the most delicate steps in setting up a PIM and one that can, by its nature, not be cast in rigorous rules. In general, it is advisable to study the effect of different threshold choices on the results of a subsequent structural analysis of the sitespecific logical model (as we have done in section “Application to insulin and EGF signaling”). It can sometimes be advisable to start with identical thresholds for all or certain subgroups of the processes and refine those later on, based on the structural analysis.
How to set up a PIM
A PIM facilitates rulebased model building
As described in section “The PIM concept is closely related to rulebased modeling”, the PIM concept is strongly related to rulebased modeling. In the generation of a rulebased model, the information about the reaction center can be extracted from a process node; the reaction context in a particular rule is determined by a combination of the occurrence of preceding processes. Kinetic parameters for the combination are taken from the parameter table of the process node. In a PIM, forward and backward rate constants are defined to characterize mass action kinetics of the process. These parameters can be transferred to corresponding reversible reaction rules. If the process is considered to be irreversible, the forward rate constant is added behind the corresponding irreversible reaction rule obtained from the PIM. Furthermore, no units can be defined explicitly in a PIM but parameters are assumed to be specified in consistent units and should be expressed on a per molecule per cell basis.
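The expansion of a process node into reaction rules can be sketched as follows. This is a simplified illustration, not the export implemented in ProMoT: the `ProcessNode` class, the plain-text rule format (which is not BNGL) and all rate constants are hypothetical.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class ProcessNode:
    label: str              # e.g. "2"
    center: str             # human-readable description of the reaction center
    incoming: tuple         # labels of processes with an edge into this node
    table: dict             # input combination -> (k_fw, k_bw)
    reversible: bool = True


def expand_rules(node):
    """Return one plain-text rule description per row of the parameter table."""
    rules = []
    for combo in product((0, 1), repeat=len(node.incoming)):
        k_fw, k_bw = node.table[combo]
        context = ", ".join(
            f"process {p} {'occurred' if v else 'not occurred'}"
            for p, v in zip(node.incoming, combo)
        ) or "none"
        kind = "reversible" if node.reversible else "irreversible"
        rates = f"k_fw={k_fw}, k_bw={k_bw}" if node.reversible else f"k_fw={k_fw}"
        rules.append(f"[{kind}] {node.center} | context: {context} | {rates}")
    return rules


# Hypothetical parameters: modification of R at p1 is faster once process 1 occurred.
node2 = ProcessNode(
    label="2",
    center="modification of R at site p1",
    incoming=("1", "3"),
    table={
        (0, 0): (0.1, 1.0),
        (0, 1): (0.1, 1.0),
        (1, 0): (10.0, 1.0),
        (1, 1): (10.0, 1.0),
    },
)

for rule in expand_rules(node2):
    print(rule)
```

A node with two incoming edges yields four rules, one per reaction context, all sharing the same reaction center.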
For a complete rulebased model, the specification of initially existing species and their concentrations is required. For the rulebased models obtained from a PIM, basic (i.e. not complexed) molecules with all sites in unmodified state are assumed. The concentrations are initially set to the value ‘1’ but should be altered afterwards.
The systematic specification of information about involved molecules and their affected sites in process nodes opens up new possibilities in investigating quantitative models. Processes involving particular proteins can easily be omitted in the generation of rulebased models. This greatly simplifies the study of scenarios involving only subsets of proteins (e.g. if a molecule is missing in a model, rules involving this molecule don’t have to be generated). Of course, the study of such scenarios is also possible on the level of reaction rules. But this requires testing every rule, whether or not it involves certain proteins. In a PIM one only has to test each node.
A PIM uncovers the logic of rulebased models
As mentioned in the section “Background”, logical models consist of nodes, each equipped with a logical function. A convenient way to derive a logical model is therefore to begin with an interaction graph, followed by the assignment of a logical function to each of its nodes (called Lnodes to avoid confusion with the Pnodes of the PIM).
There are many ways to derive a logical model from a PIM. Arguably the easiest way is to create an Lnode for every process and use the parameter table belonging to each Pnode as truth table defining the logical function of that Lnode. This, however, is not what is proposed here because such an interaction graph would not contain information about the connection of molecule domains (i.e. the information that two or more processes occur at the same molecule). Instead, we propose the sitespecific logical model mentioned above. This model incorporates information about molecule structure and hence allows capturing more of the biological intuition usually present in a cartoon than is possible with a PIM alone. In particular, a sitespecific logical model allows uncovering and visualizing the structure of molecules and their interactions. The construction of this sitespecific logical model is described as a twostep process below: first an interaction graph is derived and in a subsequent step each node of the interaction graph is equipped with a logical function.
The interaction graph of a sitespecific logical model
In order to build the interaction graph, an Lnode is created for every site of every molecule and an additional Lnode is created for each molecule representing its basal activity. This basal activity connects all Lnodes representing sites of the same molecule and is used to encode the presence/absence of a molecule in different (simulation) scenarios. Later on, in performing logical analysis, Lnodes representing basal activity will serve as inputs.
In general, the interaction graph will contain more Lnodes than the PIM (it is derived from) contains Pnodes. However, there is an intimate connection between Lnodes and Pnodes (see Figure6):

Every Pnode representing a modification process gives rise to a unique Lnode representing a modification site (as modification processes involve only a unique site).

Every Pnode representing a binding process gives rise to a pair of Lnodes representing the binding sites of the involved molecules (as binding processes always involve two binding sites, one from each molecule).
Table 1. Definition of the corresponds-to relation
| Process(es) | P-node(s) | L-node(s) | Corresponds-to relation |
|---|---|---|---|
| Modification | p_{ i }: modification process | L_{ i } | L_{ i } corresponds to p_{ i } |
| Binding with prior modification | p_{ i }: binding process; p_{ j }: modification process | $L_{i}^{(1)}$: modification and binding site of molecule one; $L_{i}^{(2)}$: binding site of molecule two | $L_{i}^{(1)}$ corresponds to p_{ j }; $L_{i}^{(2)}$ corresponds to p_{ i } |
| Binding without prior modification | p_{ i }: binding process | $L_{i}^{(1)}$: binding site of molecule one; $L_{i}^{(2)}$: binding site of molecule two | $L_{i}^{(1)}$ corresponds to p_{ i }; $L_{i}^{(2)}$ corresponds to $L_{i}^{(1)}$ |
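An unsigned edge is created between two L-nodes L_{ i } and L_{ j } if both of the following conditions hold:
 1. L_{ i } and L_{ j } correspond to two P-nodes p_{ i } and p_{ j } (see Table 1).
 2. There exists an edge between p_{ i } and p_{ j }.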
An activating edge is created between two L-nodes L_{ i } and L_{ j } in either of the following two cases:
 1. One of the two nodes, say L_{ i }, represents the basal activity of a molecule and the other node L_{ j } represents a site of this molecule. In this case an edge is created from L_{ i } to L_{ j }. These activating edges represent the molecule structure. In a subsequent logical analysis this allows, for example, removal of a molecule (and all of its sites) by assigning a value ‘0’ to the L-node representing the basal activity.
 2. Both L-nodes are connected by the corresponds-to relation. Assume L_{ i } corresponds to L_{ j } (see Table 1, row 3). Then an edge is created from L_{ i } to L_{ j }. This situation can only occur if both L-nodes arise from a binding process without prior modification (e.g. the activating edge from A_b1 to R_b1 in Figure 7).
In the latter case the orientation of the activating edge depends on the decision which of the two Lnodes corresponds to the Pnode representing the binding process (see the last row in Table1). This choice is arbitrary and does not affect the results of the subsequent logical analysis, because, by definition, L_{ i } has exactly one incoming activating edge from the Lnode representing the basal activity. Hence, in analysis L_{ i } passes the value of the Lnode representing the basal activity to L_{ j }.
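The node and activating-edge construction described above can be sketched in a few lines of Python. The sketch covers only the basal-activity nodes, the site nodes and the activating edges; unsigned edges derived from PIM interactions are left out. The node naming scheme (`R_basal`, `R_p1`) and the site names other than `A_b1` and `R_b1` are assumptions made for illustration.

```python
def build_activating_edges(molecules, plain_bindings):
    """Create L-nodes and activating edges.

    molecules:      {molecule name: [site name, ...]}
    plain_bindings: [(source site node, target site node), ...] for binding
                    processes without prior modification
    """
    nodes, edges = [], []
    for mol, sites in molecules.items():
        basal = f"{mol}_basal"          # one L-node for the basal activity
        nodes.append(basal)
        for site in sites:
            site_node = f"{mol}_{site}"  # one L-node per site
            nodes.append(site_node)
            edges.append((basal, site_node))
    for src, dst in plain_bindings:
        if src not in nodes or dst not in nodes:
            raise KeyError(f"unknown site node in binding {src} -> {dst}")
        edges.append((src, dst))
    return nodes, edges


# Small example: A binds R at b1; R is modified at p1 and p2; B binds at p1 of R.
molecules = {"A": ["b1"], "R": ["b1", "p1", "p2"], "B": ["b1"]}
nodes, edges = build_activating_edges(molecules, [("A_b1", "R_b1")])
print(nodes)
print(edges)
```

Setting a basal node to 0 in a later analysis then switches off every site node it activates, which corresponds to removing the molecule.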
From interaction graphs to sitespecific logical models: equipping Lnodes with logical functions
Logical function construction for Lnodes without unsigned incoming edges: For these Lnodes the logical function is a logical AND connecting all inputs. Logical function construction for Lnodes with unsigned incoming edges:
From truth tables to logical functions
The aim is to enable analysis of the sitespecific logical model with methods available in CellNetAnalyzer[13]. Therefore, the logical functions connecting incoming edges to nodes have to take the form of a sum of products. If 0 and 1 are the only values in the output columns of the respective truth tables, this is equivalent to the disjunctive normal form. Moreover, determination of a logical function from a truth table is straightforward in this case as one may use established algorithms like kmaps (KarnaughVeitch[37, 38]) or the QuineMcCluskey algorithm[39] to obtain a logical function in disjunctive normal form.
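For truth tables with outputs 0 and 1 only, a sum of products can be read off directly by collecting one product term per row with output 1. The Python sketch below does exactly this and performs no minimization, so it stands in for, and is not equivalent to, k-maps or the Quine-McCluskey algorithm; the function name and the input names are hypothetical.

```python
def dnf_from_truth_table(inputs, table):
    """Build an unminimized disjunctive normal form from a 0/1 truth table.

    inputs: names of the input columns, in order
    table:  {tuple of 0/1 input values: output (0 or 1)}
    """
    terms = []
    for combo, out in sorted(table.items()):
        if out not in (0, 1):
            raise ValueError("only outputs 0 and 1 are supported")
        if out == 1:
            literals = [
                name if value else f"NOT {name}"
                for name, value in zip(inputs, combo)
            ]
            terms.append("(" + " AND ".join(literals) + ")")
    return " OR ".join(terms) if terms else "0"


# Hypothetical table: the output follows the first input only.
table = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}
print(dnf_from_truth_table(("x1", "x3"), table))
```

A minimization step would reduce the resulting expression to the single literal `x1`; tables containing unknown outputs are rejected here.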
From the previous discussion, however, it is obvious that truth tables containing ‘unknown’ symbols can be associated to an L-node. In this case, the aforementioned algorithms are not applicable (note that the ‘don’t care’ symbols allowed in k-maps and the Quine-McCluskey algorithm are different from the ‘unknown’ symbols considered here, which precludes the use of these algorithms). Logical functions hence have to be inferred from truth tables involving ‘unknown’ symbols on a case-by-case basis. To guarantee applicability of the methods proposed in[13], it is recommended to use ITT gates[10] to accommodate the ‘unknown’ symbols; this is possible, for example, if the first row (all inputs equal 0) has the output 0 and the last row (all inputs equal 1) has the output 1.
Implementation of the PIM concept
Furthermore, export functionality has been added to obtain rulebased models in BNGL from PIMs set up in ProMoT. The conversion into logical models is directly done in ProMoT and will be described in more detail in the next section. The software extension supporting PIMs is available upon request and will be contained in a future release.
A modular logical model obtained from a PIM enables an intuitive analysis and visualization
One of ProMoT’s key features is the opportunity to set up modular models. Modules are used to structure a model and to easily exchange and reuse model parts in the modeling workflow. This feature is used in the generation of logical models from PIMs. One module encapsulates all nodes representing the parts of the same molecule. Hence, every molecule is represented by a module and the interactions with other molecules are depicted by arrows across module borders. Figure 10(b) shows the modular logical model for the small example depicted in Figure 8 in the ProMoT Visual Explorer. ProMoT provides functionality that makes it possible to obtain logical models intended for analysis in CellNetAnalyzer, combined with suitable graphical representations.
Application to insulin and EGF signaling
Threshold choice and its effect on the logical model
A site-specific logical model can be derived from the PIM as described above. In doing so, a crucial point is the choice of the thresholds t_{ 1 }, t_{ 2 } to discretize the equilibrium parameter. In our particular example, we decided to use the same thresholds for all reactions to limit the number of degrees of freedom. We have chosen t_{ 1 } = 0.01 and t_{ 2 } = 0.1 (Additional file 3 contains the model, readily prepared for the analysis in CellNetAnalyzer). To examine the effect of the threshold values on the logical model, we also considered two other model variants where we moved both thresholds to the next larger/smaller value appearing as equilibrium constant (model M_{ down }: t_{ 1 } = 0.001, t_{ 2 } = 0.01; model M_{ up }: t_{ 1 } = 0.1, t_{ 2 } = 0.25).
The effects are illustrated in Figure13. One arrives at the following conclusions:

If threshold values are increased, the logical model becomes more restrictive, that is, compared to model M, model M_{ up } contains additional influences: (1) EGF dimerization becomes necessary for EGF binding in model M_{ up }. As EGF binding is in turn necessary for dimerization, neither of the two states can ever be activated in model M_{ up }, thus supporting model M. (2) The two insulin binding sites on the insulin receptor mutually inhibit each other, that is, insulin can only bind to either site in M_{ up }. This indeed reflects the biological situation[41]. Hence, even though (1) clearly argues against increasing the thresholds, (2) seems to indicate that it might be necessary to vary the thresholds of individual processes (e.g. those of process 1 and 2), also accounting for possible parameter uncertainties. In this example, we nevertheless decided to use one threshold value for all processes, not least because in this particular case, the outcome of the logical analysis is to a certain degree independent of whether or not we assume that the two binding sites influence each other.

If the threshold values are decreased, biochemically important interactions are missing: in M_{down}, IRS and Shc phosphorylation depend only on the basal activity of the respective molecules. Thus, model M_{down} does not account for the fact that both phosphorylation events are induced by preceding binding events, which again supports model M.
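The effect of moving the thresholds can be illustrated with a small sketch. It assumes a three-level discretization in which an equilibrium parameter below t_{1} maps to level 0, a value between the thresholds to level 1, and a value at or above t_{2} to level 2. This mapping, the boundary handling and the two equilibrium values are illustrative assumptions; they are not taken from the model in Additional file 3.

```python
from bisect import bisect_right

# Threshold pairs (t1, t2) of the three model variants discussed in the text.
THRESHOLDS = {"M_down": (0.001, 0.01), "M": (0.01, 0.1), "M_up": (0.1, 0.25)}


def discretize(k_eq, t1, t2):
    """Map an equilibrium parameter to a logical level 0, 1 or 2 (assumed scheme)."""
    return bisect_right([t1, t2], k_eq)


# Hypothetical parameter table of one process: equilibrium parameter without
# and with an influencing (preceding) process being active.
k_without, k_with = 0.05, 0.5

levels = {
    name: (discretize(k_without, *t), discretize(k_with, *t))
    for name, t in THRESHOLDS.items()
}

for name, (low, high) in levels.items():
    if low == high:
        verdict = "influence is lost"
    elif low == 0:
        verdict = "influencing process becomes necessary"
    else:
        verdict = "influence is graded"
    print(f"{name}: levels {low} -> {high}: {verdict}")
```

In this toy table, raising the thresholds turns the preceding process into a strict requirement, whereas lowering them makes the process appear independent of it, which mirrors the two observations above.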
Minimal intervention sets to prevent binding of Grb2 to Shc in response to insulin stimulation
No.  Minimal intervention set  Interpretation
1  ir.res_ir=0  set basal activity of the insulin receptor to 0, i.e. remove the species from the system
2  grb2.res_grb2=0  set basal activity of Grb2 to 0, i.e. remove the species from the system
3  shc.res_shc=0  set basal activity of Shc to 0, i.e. remove the species from the system
4  ins.b1=0  prevent insulin binding to its receptor by blocking the binding site on insulin
5  ir.b_ins=0, ir.b_ins2=0  prevent insulin binding to its receptor by blocking both binding sites on the receptor
6  shc.b_ir=0  prevent Shc binding to the insulin receptor by blocking the binding site on Shc
7  ir.p1=0  prevent Shc binding to the insulin receptor by preventing the necessary phosphorylation of the receptor
8  grb2.b_shc=0  prevent Grb2 binding to Shc by blocking the binding site on Grb2
9  shc.p1=0  prevent Grb2 binding to Shc by preventing the necessary phosphorylation of Shc
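The idea behind such a table can be sketched with a brute-force search on a toy network. The Boolean chain below is a strongly simplified, hypothetical stand-in for the insulin branch (it reuses the site names of the table but is not the authors' logical model), and exhaustive enumeration is not the algorithm used by CellNetAnalyzer. An intervention clamps a node to 0; a set is minimal if no proper subset already prevents Grb2 from binding to Shc.

```python
from itertools import combinations

NODES = ["ir.res_ir", "grb2.res_grb2", "shc.res_shc", "ins.b1", "ir.b_ins",
         "ir.b_ins2", "shc.b_ir", "ir.p1", "grb2.b_shc", "shc.p1"]


def grb2_binds_shc(knocked_out):
    """Evaluate the assumed toy chain with the given nodes clamped to 0."""
    def free(node):
        return node not in knocked_out

    receptor = free("ir.res_ir")
    insulin = free("ins.b1")
    bound1 = free("ir.b_ins") and receptor and insulin
    bound2 = free("ir.b_ins2") and receptor and insulin
    ir_p1 = free("ir.p1") and (bound1 or bound2)      # either site suffices
    shc_b_ir = free("shc.b_ir") and free("shc.res_shc") and ir_p1
    shc_p1 = free("shc.p1") and shc_b_ir
    return free("grb2.b_shc") and free("grb2.res_grb2") and shc_p1


def minimal_intervention_sets(max_size=2):
    """Enumerate knock-out sets by increasing size, skipping supersets."""
    found = []
    for size in range(1, max_size + 1):
        for candidate in map(frozenset, combinations(NODES, size)):
            if any(smaller <= candidate for smaller in found):
                continue  # a subset already works, so this one is not minimal
            if not grb2_binds_shc(candidate):
                found.append(candidate)
    return found


for number, mis in enumerate(minimal_intervention_sets(), start=1):
    print(number, ", ".join(f"{node}=0" for node in sorted(mis)))
```

For this toy chain the search yields eight single knock-outs and the pair ir.b_ins=0, ir.b_ins2=0, because blocking only one of the two receptor sites leaves the other available.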
Conclusions and discussion
We introduced the Process-Interaction-Model (PIM) concept as a means of combining the advantages of rule-based and logical modeling approaches. A PIM is based on a directed graph and incorporates the definition of molecules, domains, processes, interactions, kinetic parameters and logical values. A prototypic implementation of the PIM concept has been integrated in the modeling software ProMoT. At present, this software extension is available on request and will be included in a future release.
A PIM can be seen as a compact description of a rule-based model, and the concept thereby facilitates the systematic setup of such models. Besides rule-based models, logical models can be derived from the same basis. Thus, the PIM concept enables a systematic and consistent setup of models in two different specifications. Consequently, the signaling system can be studied on two different levels of detail by applying established simulation and analysis methods to both models. The common basis for the two models has the additional advantage that modifications (e.g. new insights into the structure of a signaling system or changes of parameters) can be made in the basis model and propagated into both model specifications.
When defining a PIM, one faces the same problems as when setting up a rule-based model: one needs to specify rate constants for every reaction and every reaction context, and these are often hard to come by. The generation of the site-specific logical model additionally requires values for the thresholds, which can be equally hard to determine. Conventional logical models do not need this information; hence, constructing a site-specific logical model from a PIM can be more involved. A PIM, however, is an efficient means of generating models in two different formalisms in a consistent way, which can more than offset the effort of specifying all parameters.
In the following paragraphs we briefly discuss the potential of the PIM concept.
A modeling workflow incorporating PIMs
A possible modeling workflow employing the PIM concept starts with the setup of the PIM. In the second step, a site-specific logical model is generated, and the qualitative behavior and structural properties of the signaling system are determined by structural and functional analysis of this model. In step three, the PIM may have to be refined based on the results of step two (i.e. processes have to be changed, interactions have to be added or replaced and/or parameters have to be changed accordingly). Steps two and three are repeated until further refinement is unnecessary and the logical model can reproduce experimental data. In step four, a rule-based model is generated, and the quantitative behavior of the signaling system is determined by simulation and analysis of the rule-based model or the corresponding ODE model. Further cycles of PIM refinement, generation of the rule-based model and its simulation may be necessary to explain experimental data.
Site-specific logical models obtained with the PIM concept will usually describe signaling events in a very detailed manner. This is justified for early events in signaling systems. A natural extension of the aforementioned steps is therefore to integrate the site-specific logical model into existing logical models describing signaling events further downstream of the receptor.
This modeling workflow is only one possibility to employ the PIM concept for the investigation of signaling systems and will most likely have to be adapted to the problem to be solved.
PIMs facilitate scenarios for rule-based models
The systematic specification of the involved molecules and their affected sites in each process node greatly simplifies the study of scenarios in which proteins are removed from the system. This is especially useful for rule-based models, where the systematic removal of proteins can be challenging. It is straightforward to generate not just one rule-based model but a family of models. This enables an analysis that has hitherto been restricted to logical models: studying how the presence or absence of a molecule influences the system.
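A minimal sketch of how such a family could be generated is given below. The dictionary representation of a PIM (process name mapped to the set of involved molecules) and the process names are hypothetical simplifications; a real PIM also stores sites, parameter tables and the interactions between processes, all of which would have to be pruned consistently.

```python
from itertools import combinations

# Hypothetical, strongly reduced PIM: process node -> involved molecules.
PIM = {
    "ins_binds_ir": {"ins", "ir"},
    "ir_phosphorylation": {"ir"},
    "shc_binds_ir": {"shc", "ir"},
    "shc_phosphorylation": {"shc"},
    "grb2_binds_shc": {"grb2", "shc"},
}


def remove_molecules(pim, removed):
    """Keep only the process nodes that involve none of the removed molecules."""
    return {name: mols for name, mols in pim.items() if not (mols & removed)}


def model_family(pim, candidates):
    """Yield (removed molecules, reduced PIM) for every subset of candidates."""
    ordered = sorted(candidates)
    for size in range(len(ordered) + 1):
        for removed in combinations(ordered, size):
            yield frozenset(removed), remove_molecules(pim, set(removed))


for removed, reduced in model_family(PIM, {"shc", "grb2"}):
    label = ", ".join(sorted(removed)) or "nothing"
    print(f"removed {label}: {sorted(reduced)}")
```

Each reduced PIM in the family could then be passed to the same generators that produce the rule-based or logical specification of the full model.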
PIMs may support model reduction and checking of thermodynamic constraints
In general, quantitative models derived from PIMs result in very large ODE systems, so model reduction is advisable. The directed graphs used in the PIM concept are similar to the ones used in the reduction techniques for rule-based models described in [6, 7]. It is in principle possible to adapt these methods to the PIM concept such that both the rule-based specification and the site-specific logical model can be reduced in one step by applying the methods to the PIM. Furthermore, the systematic assembly of kinetic parameters makes it convenient to check thermodynamic constraints.
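One such constraint can be sketched as follows. For reversible binding processes that form a closed cycle without energy input, detailed balance requires the products of the equilibrium constants along both paths around the cycle to agree. The sketch checks this for two binding processes a and b on the same molecule whose parameter tables depend on each other's state; the table layout and the numbers are hypothetical, and the check does not apply to cycles that consume energy (e.g. phosphorylation driven by ATP).

```python
import math


def cycle_is_consistent(k_a, k_b, rel_tol=1e-9):
    """Check detailed balance for the cycle 00 -> 10 -> 11 versus 00 -> 01 -> 11.

    k_a[s] is the equilibrium constant of process a while site b is in state s
    (0 = unbound, 1 = bound); k_b[s] is defined analogously for process b.
    """
    a_then_b = k_a[0] * k_b[1]
    b_then_a = k_b[0] * k_a[1]
    return math.isclose(a_then_b, b_then_a, rel_tol=rel_tol)


# Hypothetical parameter tables: binding at b strengthens binding at a
# fivefold, so binding at a must strengthen binding at b fivefold as well.
consistent = cycle_is_consistent({0: 0.1, 1: 0.5}, {0: 0.02, 1: 0.1})
violated = cycle_is_consistent({0: 0.1, 1: 0.5}, {0: 0.02, 1: 0.2})
print(consistent, violated)
```

Because a PIM stores all context-dependent parameters of a process in one table, such cycle conditions could be evaluated table by table rather than by searching a generated reaction network.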
To conclude, the PIM concept offers connections to a variety of established methods. It has the potential to become a valuable tool.
Methods
In the previous sections we presented the concept of a PIM with the two process types occurring most frequently in signaling systems. We pointed out which information has to be assigned to process nodes of the types binding and modification: the involved molecules, their affected sites and parameter tables. Furthermore, algorithms for the generation of rule-based and site-specific logical models from a PIM containing these process types have been worked out. In the following, algorithmic details of rule-based and logical model generation are illustrated for special combinations of these two process types, followed by a discussion of how further process types can be integrated in the PIM concept. Simple examples are used to demonstrate how other process types can be represented in the PIM concept and how the algorithms for the generation of rule-based and site-specific logical models can be extended. A complete description of these process types is beyond the scope of the paper.
Algorithmic details
In the following, special constellations of binding and modification processes are presented for which peculiarities may arise in the generation of rule-based and site-specific logical model specifications from a PIM.
One molecule can bind on different sites which are subject to prior modification
Mutually exclusive preceding processes
Discussion of additional process types
In intracellular signaling, processes other than binding and modification can occur. Arguably the most important ones are polymerization, synthesis, degradation and change of compartment. Although these process types are currently not implemented, it is in principle possible to incorporate them. Below we briefly discuss how this can be achieved.
Polymerization
In the parameter table in Figure 17(a), the cases ‘01’ and ‘10’ are identical for homodimerization. Parameters are needed for only one of these cases; the gray row therefore indicates fields that should not be filled with parameters. Analogously to BioNetGen, the association of monomers in the same state (i.e. the first and the last row of the parameter table in Figure 17(a)) is parametrized with 0.5 times the nominal rate constant [21]. For the logical approach, we have to assume that all representations of the same species are in the same state, either on (e.g. phosphorylated in Figure 17) or off (e.g. unphosphorylated in Figure 17). Therefore, only the rows for equal monomer sites are considered for the logical model. The remaining fields of the parameter table in Figure 17(a) are colored in gray.
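The bookkeeping described above can be sketched as follows. The table is keyed by the unordered pair of monomer states, so that ‘01’ and ‘10’ share one entry, and the factor 0.5 is applied when both monomers are in the same state. The function names and the nominal rate constants are hypothetical and serve only as an illustration.

```python
def canonical_row(state_a, state_b):
    """Treat the rows '01' and '10' as the same case for homodimerization."""
    return tuple(sorted((state_a, state_b)))


def effective_rate_constant(state_a, state_b, nominal):
    """Look up the nominal constant and halve it for monomers in equal states."""
    k = nominal[canonical_row(state_a, state_b)]
    return 0.5 * k if state_a == state_b else k


# Hypothetical nominal association constants; only three rows are filled in.
nominal = {(0, 0): 2.0, (0, 1): 4.0, (1, 1): 6.0}

for states in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(states, effective_rate_constant(*states, nominal))

# Rows with equal monomer states, i.e. those considered for the logical model.
logical_rows = [row for row in nominal if row[0] == row[1]]
```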
Degradation and synthesis
Degradation is a further process occurring in signaling systems, and we argue that this process type could be integrated in the PIM concept. The name of the molecule to be degraded and a parameter table have to be assigned to the process node. As degradation is an irreversible process and BioNetGen has a special concept to describe these reactions [17], kinetic parameters are needed only for the forward direction.
BioNetGen additionally allows specifying the degradation of complexes by adding keywords to the rules [17]. Here we restrict our discussion to the simplest case, the degradation of a single molecule.
To describe the synthesis of a molecule, another specialized process type has to be included in the PIM concept. A process node of this type has to store information about the newly synthesized molecule, the state of its sites, the additional molecule and whether it is synthesized bound or unbound.
Logical models obtained from PIMs that contain synthesis processes do not gain additional information, because in site-specific logical models the basal activity species act as inputs and are set prior to analysis.
Change of compartment can be modeled like a modification process
The change of a species' localization is a process that occurs frequently during signal transfer in cells; for example, a receptor receiving a signal from the extracellular space is internalized (i.e. moves into an endosome). Rule-based models seldom incorporate information about species localization because it increases the complexity of the models. In BioNetGen, molecule localization can be treated like a modification: a further site is added, and the state of this site represents the localization of the molecule. In complex cases, however, this approach may be error-prone. The modeler has to take care that molecules in a complex change their localization state together and that molecules can only switch into adjacent compartments. Furthermore, for processes taking place in different compartments, identical reaction rules differing solely in their localization state have to be written down. To overcome these difficulties, BioNetGen has recently incorporated a concept called cBNGL [42]. This approach adds an additional attribute to species and molecules and thereby enables modeling of the compartmental organization of cells by storing a directed graph that represents the compartment topology. Incorporating this approach into PIMs would require storing additional information. Hence, it is currently not possible to generate rule-based models in cBNGL. Instead, two different compartment localizations of a molecule can be represented in a PIM by treating compartment changes like modification processes. No special process type is introduced for this purpose. Some of the error-prone aspects of modeling compartment changes like modifications are mitigated by the systematic specification approach followed by the PIM concept. Nevertheless, the modeler has to take care of the adjacency of compartments.
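The two consistency conditions mentioned above (molecules of a complex move together, and only into an adjacent compartment) can be sketched as a simple check. The compartment names, the adjacency map and the function are hypothetical; they are neither part of the PIM implementation in ProMoT nor of cBNGL.

```python
# Hypothetical compartment topology: compartment -> adjacent compartments.
ADJACENT = {
    "extracellular": {"plasma_membrane"},
    "plasma_membrane": {"extracellular", "endosome"},
    "endosome": {"plasma_membrane"},
}


def move_complex(locations, target):
    """Return the new localization states of all molecules in a complex.

    locations maps each molecule of the complex to the state of its
    localization site; the move is rejected if the molecules do not share
    one compartment or if the target is not adjacent to it.
    """
    current = set(locations.values())
    if len(current) != 1:
        raise ValueError("all molecules of a complex must share one compartment")
    (source,) = current
    if target not in ADJACENT[source]:
        raise ValueError(f"{source} and {target} are not adjacent")
    return {molecule: target for molecule in locations}


# Internalization of a ligand-bound receptor into an endosome.
internalized = move_complex(
    {"egfr": "plasma_membrane", "egf": "plasma_membrane"}, "endosome"
)
print(internalized)
```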
Authors’ contributions
KK, RS and HC designed the PIM concept. KK implemented the concept in the modeling software ProMoT. KK, RS, SM and CC prepared the manuscript jointly. All authors have read and accepted the manuscript.
Declarations
Acknowledgements
We thank Julio Saez-Rodriguez for interesting discussions on site-specific logical modeling and Steffen Klamt for inspiring discussions about the manuscript.
Authors’ Affiliations
References
1. Gilbert D, Fuss H, Gu X, Orton R, Robinson S, Vyshemirsky V, Kurth MJ, Downes CS, Dubitzky W: Computational methodologies for modelling, analysis and simulation of signalling networks. Brief Bioinform 2006, 7(4):339–353. http://dx.doi.org/10.1093/bib/bbl043
2. Pawson T, Nash P: Assembly of cell regulatory systems through protein interaction domains. Science 2003, 300(5618):445–452. http://dx.doi.org/10.1126/science.1083653
3. Hlavacek WS, Faeder JR, Blinov ML, Perelson AS, Goldstein B: The complexity of complexes in signal transduction. Biotechnology and Bioengineering 2003, 84(7):783–794. http://dx.doi.org/10.1002/bit.10842
4. Faeder JR, Blinov ML, Goldstein B, Hlavacek WS: Rule-based modeling of biochemical networks. Complexity 2005, 10(4):22–41. http://dx.doi.org/10.1002/cplx.20074
5. Sneddon MW, Faeder JR, Emonet T: Efficient modeling, simulation and coarse-graining of biological complexity with NFsim. Nat Methods 2011, 8(2):177–183. http://dx.doi.org/10.1038/nmeth.1546
6. Koschorreck M, Conzelmann H, Ebert S, Ederer M, Gilles ED: Reduced modeling of signal transduction - a modular approach. BMC Bioinformatics 2007, 8:336. http://dx.doi.org/10.1186/1471–2105–8336
7. Conzelmann H, Fey D, Gilles ED: Exact model reduction of combinatorial reaction networks. BMC Syst Biol 2008, 2:78. http://dx.doi.org/10.1186/1752–0509–278
8. Borisov NM, Chistopolsky AS, Faeder JR, Kholodenko BN: Domain-oriented reduction of rule-based network models. IET Syst Biol 2008, 2(5):342–351. http://dx.doi.org/10.1049/ietsyb:20070081
9. Saez-Rodriguez J, Simeoni L, Lindquist JA, Hemenway R, Bommhardt U, Arndt B, Haus UU, Weismantel R, Gilles ED, Klamt S, Schraven B: A logical model provides insights into T cell receptor signaling. PLoS Comput Biol 2007, 3(8):e163. http://dx.doi.org/10.1371/journal.pcbi.0030163
10. Samaga R, Saez-Rodriguez J, Alexopoulos LG, Sorger PK, Klamt S: The logic of EGFR/ErbB signaling: theoretical properties and analysis of high-throughput data. PLoS Comput Biol 2009, 5(8):e1000438. http://dx.doi.org/10.1371/journal.pcbi.1000438
11. Schlatter R, Schmich K, Vizcarra IA, Scheurich P, Sauter T, Borner C, Ederer M, Merfort I, Sawodny O: ON/OFF and beyond–a boolean model of apoptosis. PLoS Comput Biol 2009, 5(12):e1000595. http://dx.doi.org/10.1371/journal.pcbi.1000595
12. Klamt S, Saez-Rodriguez J, Lindquist JA, Simeoni L, Gilles ED: A methodology for the structural and functional analysis of signaling and regulatory networks. BMC Bioinformatics 2006, 7:56. http://dx.doi.org/10.1186/1471–2105–756
13. Klamt S, Saez-Rodriguez J, Gilles ED: Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst Biol 2007, 1:2. http://dx.doi.org/10.1186/1752–0509–12
14. Samaga R, von Kamp A, Klamt S: Computing combinatorial intervention strategies and failure modes in signaling networks. J Comput Biol 2010, 17:39–53. http://dx.doi.org/10.1089/cmb.2009.0121
15. Saez-Rodriguez J, Mirschel S, Hemenway R, Klamt S, Gilles ED, Ginkel M: Visual setup of logical models of signaling and regulatory networks with ProMoT. BMC Bioinformatics 2006, 7:506. http://dx.doi.org/10.1186/1471–2105–7506
16. Mirschel S, Steinmetz K, Rempel M, Ginkel M, Gilles ED: PROMOT: modular modeling for systems biology. Bioinformatics 2009, 25(5):687–689. http://dx.doi.org/10.1093/bioinformatics/btp029
17. Faeder JR, Blinov ML, Hlavacek WS: Rule-based modeling of biochemical systems with BioNetGen. Methods Mol Biol 2009, 500:113–167. http://dx.doi.org/10.1007/978–159745–525–1_5
18. Blinov ML, Faeder JR, Goldstein B, Hlavacek WS: BioNetGen: software for rule-based modeling of signal transduction based on the interactions of molecular domains. Bioinformatics 2004, 20(17):3289–3291. http://dx.doi.org/10.1093/bioinformatics/bth378
19. Koschorreck M, Gilles ED: ALC: automated reduction of rule-based models. BMC Syst Biol 2008, 2:91. http://dx.doi.org/10.1186/1752–0509–291
20. Krivine J, Danos V, Benecke A: Modelling Epigenetic Information Maintenance: a Kappa Tutorial. In Computer Aided Verification, Proceedings, Volume 5643 of Lecture Notes in Computer Science. Edited by: Bouajjani A, Maler O. Springer-Verlag Berlin; 2009:17–32.
21. Blinov M, Yang J, Faeder J, Hlavacek W: Graph Theory for Rule-Based Modeling of Biochemical Networks. In Transactions on Computational Systems Biology VII, Volume 4230 of Lecture Notes in Computer Science. Edited by: Priami C, Ingólfsdóttir A, Mishra B, Riis Nielson H. Springer Berlin/Heidelberg; 2006:89–106. http://dx.doi.org/10.1007/11905455_5
22. Nag A, Monine MI, Blinov ML, Goldstein B: A detailed mathematical model predicts that serial engagement of IgE-Fc epsilon RI complexes can enhance Syk activation in mast cells. J Immunol 2010, 185(6):3268–3276. http://dx.doi.org/10.4049/jimmunol.1000326
23. Geier F, Fengos G, Iber D: A computational analysis of the dynamic roles of talin, Dok1, and PIPKI for integrin activation. PLoS One 2011, 6(11):e24808. http://dx.doi.org/10.1371/journal.pone.0024808
24. Chylek LA, Hu B, Blinov ML, Emonet T, Faeder JR, Goldstein B, Gutenkunst RN, Haugh JM, Lipniacki T, Posner RG, Yang J, Hlavacek WS: Guidelines for visualizing and annotating rule-based models. Mol Biosyst 2011, 7(10):2779–2795. http://dx.doi.org/10.1039/c1mb05077j
25. Kocieniewski P, Faeder JR, Lipniacki T: The interplay of double phosphorylation and scaffolding in MAPK pathways. J Theor Biol 2012, 295:116–124. http://dx.doi.org/10.1016/j.jtbi.2011.11.014
26. Yang J, Monine MI, Faeder JR, Hlavacek WS: Kinetic Monte Carlo method for rule-based modeling of biochemical networks. Phys Rev E Stat Nonlin Soft Matter Phys 2008, 78(3 Pt 1):031910.
27. Colvin J, Monine M, Gutenkunst R, Hlavacek W, Von Hoff D, Posner R: RuleMonkey: software for stochastic simulation of rule-based models. BMC Bioinformatics 2010, 11:404. http://www.biomedcentral.com/1471–2105/11/404
28. Yang J, Meng X, Hlavacek WS: Rule-based modeling and simulation of biochemical systems with molecular finite automata. IET Syst Biol 2010, 4:453–466. 10.1049/ietsyb.2010.0015
29. Kauffman SA: Metabolic stability and epigenesis in randomly constructed genetic nets. J Theor Biol 1969, 22(3):437–467. 10.1016/00225193(69)900150
30. Thomas R: Boolean formalization of genetic control circuits. J Theor Biol 1973, 42(3):563–585. 10.1016/00225193(73)902476
31. Thomas R, D’Ari R: Biological Feedback. Boca Raton, Florida: CRC Press; 1990.
32. Mendoza L, Thieffry D, Alvarez-Buylla ER: Genetic control of flower morphogenesis in Arabidopsis thaliana: a logical analysis. Bioinformatics 1999, 15(7–8):593–606.
33. Albert R, Othmer HG: The topology of the regulatory interactions predicts the expression pattern of the segment polarity genes in Drosophila melanogaster. J Theor Biol 2003, 223:1–18. 10.1016/S00225193(03)000353
34. Christensen TS, Oliveira AP, Nielsen J: Reconstruction and logical modeling of glucose repression signaling pathways in Saccharomyces cerevisiae. BMC Syst Biol 2009, 3:7. http://dx.doi.org/10.1186/1752–0509–37
35. Ryll A, Samaga R, Schaper F, Alexopoulos LG, Klamt S: Large-scale network models of IL-1 and IL-6 signalling and their hepatocellular specification. Mol Biosyst 2011, 7:3253–3270. http://dx.doi.org/10.1039/c1mb05261f
36. Cornish-Bowden A: Fundamentals of Enzyme Kinetics. London: Portland Press Ltd.; 2004.
37. Karnaugh M: The map method for synthesis of combinational logic circuits. Transactions of the American Institute of Electrical Engineers 1953, 72(9):593–599.
38. Veitch EW: A chart method for simplifying truth functions. In Proceedings of the 1952 ACM national meeting (Pittsburgh). New York, NY, USA: ACM, ACM ’52; 1952:127–133. http://doi.acm.org/10.1145/609784.609801
39. McCluskey EJ: Minimization of Boolean functions. Bell Syst Tech J 1956, 35(5):1417–1444.
40. Ginkel M, Kremling A, Nutsch T, Rehner R, Gilles ED: Modular modeling of cellular systems with ProMoT/Diva. Bioinformatics 2003, 19(9):1169–1176. http://bioinformatics.oxfordjournals.org/content/19/9/1169.abstract
41. Ward CW, Lawrence MC, Streltsov VA, Adams TE, McKern NM: The insulin and EGF receptor structures: new insights into ligand-induced receptor activation. Trends Biochem Sci 2007, 32(3):129–137. http://dx.doi.org/10.1016/j.tibs.2007.01.001
42. Harris LA, Hogg JS, Faeder JR: Compartmental rule-based modeling of biochemical systems. In Winter Simulation Conference, WSC ’09. Winter Simulation Conference 2009, 908–919. http://dl.acm.org/citation.cfm?id=1995456.1995588
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.