Skip to main content

Periodic synchronization of isolated network elements facilitates simulating and inferring gene regulatory networks including stochastic molecular kinetics

Abstract

Background

The temporal progression of many fundamental processes in cells and organisms, including homeostasis, differentiation and development, are governed by gene regulatory networks (GRNs). GRNs balance fluctuations in the output of their genes, which trace back to the stochasticity of molecular interactions. Although highly desirable to understand life processes, predicting the temporal progression of gene products within a GRN is challenging when considering stochastic events such as transcription factor–DNA interactions or protein production and degradation.

Results

We report a method to simulate and infer GRNs including genes and biochemical reactions at molecular detail. In our approach, we consider each network element to be isolated from other elements during small time intervals, after which we synchronize molecule numbers across all network elements. Thereby, the temporal behaviour of network elements is decoupled and can be treated by local stochastic or deterministic solutions. We demonstrate the working principle of this modular approach with a repressive gene cascade comprising four genes. By considering a deterministic time evolution within each time interval for all elements, our method approaches the solution of the system of deterministic differential equations associated with the GRN. By allowing genes to stochastically switch between on and off states or by considering stochastic production of gene outputs, we are able to include increasing levels of stochastic detail and approximate the solution of a Gillespie simulation. Thereby, CaiNet is able to reproduce noise-induced bi-stability and oscillations in dynamically complex GRNs. Notably, our modular approach further allows for a simple consideration of deterministic delays. We further infer relevant regulatory connections and steady-state parameters of a GRN of up to ten genes from steady-state measurements by identifying each gene of the network with a single perceptron in an artificial neuronal network and using a gradient decent method originally designed to train recurrent neural networks. To facilitate setting up GRNs and using our simulation and inference method, we provide a fast computer-aided interactive network simulation environment, CaiNet.

Conclusion

We developed a method to simulate GRNs at molecular detail and to infer the topology and steady-state parameters of GRNs. Our method and associated user-friendly framework CaiNet should prove helpful to analyze or predict the temporal progression of reaction networks or GRNs in cellular and organismic biology. CaiNet is freely available at https://gitlab.com/GebhardtLab/CaiNet.

Peer Review reports

Background

Dynamics and progression of many fundamental processes in cells and organisms, including metabolism [1, 2], the cell cycle [3], the circadian clock [4], differentiation [5, 6] and development [7] are governed by GRNs. These networks generally control the activity of genes by regulatory motifs such as feedback or feed forward loops [8, 9], which ensure spatially and temporally controlled gene expression. A striking property of gene transcription is its distinct stochastic behavior. Rather than producing their product continuously, most genes stochastically transit from an inactive state into an active state where they produce the gene product [10, 11]. During the on-time, which often is short compared to the off-time, a burst of expression occurs that might significantly increases the expression level. In combination with stochastic production (birth) and degradation (death) processes, this leads to complex stochastic trajectories of gene products. This noise of gene expression was shown to play an important role in decision making within GRNs [12,13,14,15]. Furthermore, theoretical work suggests that this stochastic behavior can influence the functionality of networks such as the circadian clock [16, 17] and the pluripotency network [18, 19].

To model and simulate the average temporal behavior of GRNs, ordinary differential equations (ODEs) and corresponding solvers are well established [20, 21]. To add stochasticity, the Langevin method adds normal distributed gene product noise to the system, assuming that all reactions in the system went through multiple discrete reaction events during a single simulation time step [22]. While this method achieves a reasonable description of systems with large molecule numbers, it does not explicitly account for gene on/off switching. An exact approach to simulate stochastic processes is the Gillespie direct method [23]. In this method, two random numbers drawn from a probability distribution comprising all possible reactions in the network determine the time point and the type of the next reaction event. In this approach, the duration of a time-step is limited by the fastest occurring reaction in the network. Thus, for networks including large molecule numbers or reactions with fast rates that occur on a timescale much shorter than the total simulation time, the Gillespie direct method is computationally expensive. To mitigate this problem, the tau-leaping method has been developed [24]. There, a simulation time step may span over several reaction events. The number of reaction events that occurred during a time step is approximated by an average number. This method requires additional modifications to prevent negative particle numbers [25] and to calculate step sizes [26]. To further speed up simulations, hybrid approaches that combine all of the aforementioned approaches have been developed [27,28,29,30,31]. These approaches treat specific reactions in the network with different algorithms. To do so, complex considerations to decouple a generic network have to be performed [27]. Considering special cases of networks, i.e. networks with genes stochastically switching between on and off states, led to more effective dedicated hybrid stochastic deterministic simulation approaches [32]. However, a dedicated hybrid approach capable of modeling gene product synthesis in detail and providing biochemical reactions to simulate signaling pathways is still missing.

The inverse problem to simulating expression levels of GRNs with given parameters is to infer parameters and connections between genes from a set of given gene product levels, knockout data or time trajectories. A prominent approach to infer the topology of networks is based on Boolean interactions [33, 34]. Another approach looks for correlation coefficients [35] between steady state networks. Furthermore, linearized differential equations have been used to fit experimental data [36]. Using knockout data, a confidence matrix for network interactions was obtained [37]. Recent progress to the inference problem has been made using time trajectories of gene product levels [38,39,40,41]. Approaches such as WASABI [42] infer how perturbations of gene product levels propagate through a network. However, many inference approaches identify the relative importance of network parameters or connections, instead of yielding parameters that can directly be entered into physiological simulations.

Since genes may combine the actions of several transcription factors for their specific output behavior, they have been identified with perceptrons or neurons [43,44,45,46]. This enabled optimizing sophisticated algorithms originally designed for artificial neural networks, such as gradient decent methods, to the inference problem [47,48,49]. Using such an approach, a neural network was trained to reproduce a time-series of gene product levels [49]. During training, the neural network had to learn both how to simulate the ODE system corresponding to the gene network as well as the network topology. Thus, it remained a challenge to disentangle network parameters and topology from the coefficients of the neural network.

Here, we report a computer aided interactive gene network simulation tool (CaiNet), dedicated for simulation of GRNs including molecular kinetics and for inference of steady-state network parameters. We simplified the simulations by determining the temporal evolution of each network element isolated from all other network elements between fixed time steps, after which the gene product levels are synchronized within the GRN. This modular approach enabled us to include local analytical solutions of the temporal evolution of a network element such as a gene, thereby speeding up simulation time. For genes, we included local analytical solutions to the time behavior for various different regulatory promoter motifs and variable numbers of rate-limiting steps in gene product synthesis. In addition, we provided local analytical solutions for elementary biochemical reactions such as dimerization, that can be combined to model complex biochemical reaction networks. We tested CaiNet by comparing its simulation of a repressive gene cascade of four genes to the solution of the global, network-wide ODEs by an ODE solver and to a full stochastic Gillespie simulation. Further, we verified that CaiNet is able to reproduce noise-induced bi-stability and oscillations occurring in a positive and a negative autoregulatory feedback loop including enzyme-mediated degradation. For parameter inference, CaiNet uses a recurrent network approach to directly infer steady-state parameters. We tested CaiNet for up to 10 densely connected genes and find that CaiNet is able to recover the network topology and the network parameters well. The combination of a simulation algorithm and an inference algorithm directly yielding physiological simulation parameters will remove hurdles in understanding experimental results and investigating associated GRNs.

Results

Setup of GRNs with the graphical user interface of CaiNet

GRNs commonly comprise several elements: external signaling inputs, biochemical reactions and regulatory motifs including several genes (Fig. 1a, upper panel). In CaiNet, we characterized each element by certain structural and kinetic parameters (Fig. 1b). Inputs may be any kind of molecule. Each molecule input can be assigned a time course of molecule abundance, which might be for example sinusoidal or rectangular (Fig. 1c). Complex biochemical reactions can be set up by combining a certain set of elementary reactions [23]. In CaiNet, this set comprises homodimerisation, heterodimerisation and a transformation of a species by an enzyme. For example, formation of a homotetramer can be implemented by combining the homodimerization of a monomer and a subsequent homodimerization of the homodimer. We modeled genes by two states, an on-state and an off-state [50], and a promoter with a certain number of binding sites for transcription activators or repressors (Fig. 1b, d). The gene switches to the on-state with the effective rate λeff upon association of activating transcription factors according to the regulatory logic of the promoter, and switches to the off-state with the effective rate µeff upon their release (Fig. 1d and “Methods” section). Transcription repressors keep the gene in the off-state (“Methods” section). In the on-state, the gene product is produced with the production rate ν. The gene product can either be understood as mRNA or as protein. In the latter case, we initially simplified the process of protein production by combining transcription and translation of mRNA into a common rate-limiting step with one production rate. However, we introduced the possibility for an additional delay to account for production processes such as splicing and translation (see below). Moreover, gene products are associated with a degradation rate.

Fig. 1
figure 1

Implementation of GRNs in CaiNet. a Sketch of an exemplary GRN. GRNs in CaiNet may comprise elements of (extra)cellular input signals, elements of basic bi-molecular reactions combined to more complex biochemical reactions and gene elements combined to regulatory motives. b Sketch of the reaction rates entering the exemplary GRN. CaiNet provides analytical solutions to the evolution of molecule numbers in basic biochemical reactions and analytical effective two-state solutions of molecule numbers of different promoter structures of genes, which all are included into the unique modular simulation approach of CaiNet. The user can assign time profiles of input elements and reaction rates of bi-molecular reactions and genes. c Exemplary time trajectories for input elements. d Sketch of promoter structures implemented in CaiNet. e Sketch of the layout of elements including their kinetic parameters of the exemplary GRN in b set up in CaiNet via a GUI. f Sketch of the knockout tool applied to the exemplary GRN. The parameters of elements can be altered to simulate experimental conditions such as knockdown or knockout experiments

We designed a graphical user interface (GUI) for CaiNet to facilitate setting up complex GRNs (Fig. 1e), inspired by an effort to find general and understandable representations of biological networks [51]. With this GUI, icons representing the network elements ‘input’, ‘bi-molecular reaction’ and ‘gene’ can be connected intuitively using activating or inhibiting links represented by wires. For each network element, relevant structural and kinetic parameters can be defined (Fig. 1e and Table 1). In addition, we implemented means to manipulate a genetic network by knocking down one or more genes (Fig. 1f). For a knocked down gene, the transcription rate is adjusted to a low value or zero.

Table 1 Kinetic parameters of network elements

Computer-aided implementation of network simulations

To simulate GRNs with CaiNet, we developed a modular approach of solving the system of differential equations associated with the GRN or of simulating the stochastic behaviour of genes and of the numbers of gene products. At the heart of this modularisation lies the assumption that changes in molecule numbers within the GRN are small during a time step Δt and that it is sufficient to synchronize the numbers of molecules between all elements of the network only after each time step Δt. According to this assumption, each network element can be treated as if it was isolated from other network elements during Δt. Thus, for each Δt, CaiNet iterates through all network elements and calculates the local changes in molecule abundance over time for each element using element-specific functions (“Methods” section, pseudo code) and the molecule numbers present in the GRN at the beginning of this time step Δt. We derived various element-specific functions for bi-molecular reactions and different promoter structures of genes (“Methods” section). To synchronize molecule numbers after each Δt, CaiNet calculates the total change in molecule numbers for all molecular species by summing over the local changes in molecule numbers of each network element. These new network-wide numbers are then used as starting values for element-specific calculations during the next time step. Importantly, our modular approach enabled simulating different scenarios with increasing level of stochastic detail, ranging from an approximate solution of the system of deterministic differential equations associated with the GRN to an approximate solution of the corresponding stochastic chemical master equations.

To demonstrate the working principle of our modular approach, we chose a cascade of negative regulation comprising four genes (Fig. 2a). In the cascade, each gene encodes a repressive transcription factor, which represses the subsequent gene. Such a cascade has been shown to exhibit pathologic behaviour, i.e. large fluctuations in gene product levels, in presence of stochasticity [52, 53]. We set the number of the first repressive transcription factor, which acts as initial input, to a constant value of 100 molecules (high). We then determined the numbers of subsequent transcription factors during the simulation. In the following, we discuss the different scenarios considering increasing levels of stochastic detail.

Fig. 2
figure 2

Demonstration and test of CaiNet using a repressive gene cascade. a Regulatory logic of a repressive gene cascade of four genes (left panel) and sketch of the corresponding GRN including transcription repressors and gene elements (right panel). The input species (constant level of 100 molecules) and each gene product repress the subsequent gene element in the GRN. b, c Simulation scenario 1 of CaiNet with deterministic treatment of gene on/off switching and birth/death events of gene products. b Sketches of the probability of activated expression of a gene (upper panel) and the corresponding gene product level (lower panel) of scenario 1. The activation of a gene is constant within one synchronization time step. After each time-step, all gene product levels are synchronized and the activation probabilities of all gene elements are updated. c Comparison of CaiNet simulations of the repressive gene cascade according to scenario 1 performed with different synchronization time steps (red, yellow and purple lines) with the numerical solution of an ODE solver (dashed blue line). d, e Simulation scenario 2 of CaiNet with stochastic treatment of gene on/off switching and deterministic treatment of birth/death events of gene products. d Sketches of the production state of a gene (upper panel, either on or off)) and the corresponding gene product level (lower panel) of scenario 2. e Left panel: comparison of a CaiNet simulation of the repressive gene cascade according to scenario 2 (red line) with the numerical solution of an ODE solver (blue line). fj Simulation scenario 3 of CaiNet with stochastic treatment of gene on/off switching and birth/death events of gene products. f Sketches of the production state of a gene (upper panel, either on or off)) and the corresponding gene product level (lower panel) of scenario 3. g Left panel: comparison of a CaiNet simulation of the repressive gene cascade according to scenario 2 (blue line) and scenario 3 (red line) with a Gillespie simulation (yellow line). h, i As in f, g, but with faster gene on/off switching rates. j As in i, but with a constant input level of 1 molecule. Middle and right panels of c, e, g, i, j: histograms and autocorrelation curves of respective gene product levels. In panels (c, e, g, i, j) the expression level of the last transcription factor in the cascade is shown

Scenario 1 (Fig. 2b, c): In this scenario, we implemented the approximate solution of CaiNet to the system of deterministic differential equations associated with the GRN of the repressive gene cascade (“Methods” section, Eq. (59)). We modelled the protein output of each gene by

$$\begin{aligned} \dot{n} & = a(q \cdot \Delta t)\nu - \delta n \Rightarrow n(q \cdot \Delta t) = n_{(q - 1)\Delta t} \exp ( - \delta \Delta t) \\ & \quad + \frac{a(q \cdot \Delta t)\nu }{\delta }\left( {1 - \exp ( - \delta \Delta t)} \right) \\ \end{aligned}$$
(1)

where n is the number of proteins, ν is the production rate and δ is the degradation rate. Gene elements directly use an analytical solution to calculate the expression level. We calculated the probability function of activated expression, a(q Δt), using the expression levels from the last synchronization time point, q Δt, and the equilibrium constants of transcription factors of the appropriate promoter structure (“Methods” section, Eqs. (13) or (16)). According to our central assumption, the probability function a is constant during the time interval Δt (Fig. 2b). Thus, we could find an individual local analytical solution for each gene element in the network. Using CaiNet, we simulated the gene cascade with a fixed set of kinetic rates for the transcription factors (Additional file 1: Table S1) and for three different time steps Δt (Δt = 10 s, 100 s and 1000 s) (Fig. 2c). If Δt was too large, the synchronization of transcription factor abundance and thus the change in gene activation was delayed. Hence, CaiNet’s solution for the transient behavior did not match the global, network-wide solution of the ODE solver (“Methods” section, Eq. (59)) anymore. However, the steady state level was still reproduced correctly. If Δt was sufficiently small, the approximation of CaiNet approached the global, network-wide solution of the system of ODEs associated with the GRN.

Scenario 2 (Fig. 2d, e): We included that genes can stochastically switch between on and off states (Fig. 2d). For simplicity, we assumed that genes were switched on with rate λ(nTF) upon binding of a transcription factor, and switched off with rate µ upon unbinding of the transcription factor. Thus, the switching events are Poisson processes. For every time step Δt, we calculated the effective on/off-rates, λ, µ, of the gene using the appropriate promoter structures (“Methods” section Eqs. (14) or (17)). According to our central assumption, λ, µ, were constant within a time step Δt. In contrast, the gene was able to switch states within a time step. To obtain the time points of switching, ti, we drew the uniformly distributed random number X and used the current switching rate constant

$$t_{i} = \frac{ - \log (X)}{\alpha }\quad {\text{where}}\;\;\;\alpha = \left\{ {\begin{array}{*{20}c} {\lambda {,}} & {{\text{Gene}}\;{\text{off}}} \\ {\mu {,}} & {{\text{Gene}}\;{\text{on}}} \\ \end{array} } \right.$$
(2)

Dependent on the state of the gene, we distinguished between the two equations

$$\begin{aligned} & {\text{on:}}\;\;\dot{n} = \nu - \delta n \Rightarrow n(t) = n_{0} \exp ( - \delta t) + \frac{\nu }{\delta }\left( {1 - \exp ( - \delta t)} \right) \\ & {\text{off:}}\;\;\dot{n} = - \delta n \Rightarrow n(t) = n_{0} \exp ( - \delta t) \\ \end{aligned}$$
(3)

Gene elements directly used the corresponding analytical solutions to calculate the expression level. Using CaiNet, we simulated the gene cascade with a fixed set of kinetic rates for the transcription factors (Additional file 1: Table S1) and allowed for gene switching (Fig. 2e). We observed that protein abundance increased during the gene on-state and decreased while the gene was off. In combination with stochastic switching between these two states, the variance of the expression-level was significantly increased compared to the global, network-wide solution of the ODE-solver (Fig. 2e). Correspondingly, while the autocorrelation of the protein output of the gene cascade was constant for the solution of the ODE-solver, the autocorrelation for the solution including switching of genes exhibited a decay since protein levels varied over time (Fig. 2e).

Scenario 3 (Fig. 2f, g): In addition to the switching of genes, we included discretized and stochastic production and degradation events (birth and death events) of proteins (Fig. 2f). We initially determined the time points of gene switching according to Eq. (2) using constant protein numbers for a given Δt. For each on-period of the gene with duration τon, we determined the number of birth events nB of gene products by drawing a random number from the probability distribution corresponding to protein production:

$$p(n_{B} ) = \frac{{(\nu \tau_{on} )^{{n_{B} }} }}{{n_{B} !}}\exp (\nu \tau_{on} )$$
(4)

while the number of degraded gene product is determined deterministically according to Eq. (3). We chose this simplification to circumvent a complex probability distribution for gene product numbers. For small changes in gene product levels, it should not deviate much from a full stochastic treatment (see below). In a period without production, we determined the number of gene products k by drawing a random number from the probability distribution to find k gene products after the time interval τoff

$$p(k) = \left( {\begin{array}{*{20}c} n \\ k \\ \end{array} } \right)\exp ( - \delta \cdot \tau_{off} )^{k} (1 - \exp ( - \delta \cdot \tau_{off} ))^{n - k}$$
(5)

assuming there were n proteins at the beginning of the time interval. Using CaiNet, we simulated the gene cascade with a fixed set of kinetic rates for the transcription factors (Additional file 1: Table S1), allowed for gene switching, and accounted for birth/death events (Fig. 2g). Compared to the case including gene switching alone (scenario 2), the variance further increased and approached the variance of a Gillespie simulation of the gene cascade considering the promoter structure explicitly and including stochastic birth and death events of gene products (“Methods” section).

We further tested the behavior of the gene cascade in a situation where the rates of gene switching were much faster than those for production (birth) and degradation (death) events (Fig. 2h, i and Additional file 1: Table S1). For scenario 2 with deterministic birth/death events of proteins, fast rates of gene switching underestimated the variance in protein levels compared to a global, network-wide Gillespie simulation including both stochastic gene switching and stochastic birth/death events (Fig. 2i). This indicates that the main contribution to variance in protein levels in this situation is due to stochasticity in birth/death events. Accordingly, when we included both stochastic gene switching and stochastic birth/death events in CaiNet (scenario 3), the variance in protein levels well approached the variance of the global, network-wide Gillespie simulation, despite the simplified treatment of birth and death events in CaiNet. Similarly, the autocorrelation of protein output, which characterizes the temporal variation of protein levels, was similar for the protein levels obtained by CaiNet with scenario 3 and Gillespie.

At small molecule numbers, the simplification for the on-period of a gene of treating only birth events stochastically but degradation of gene products deterministically might produce artifacts. We therefore tested the behavior of the gene cascade for small numbers of transcription factors by setting the number of the first repressive transcription factor to the constant value of 1 (Fig. 2j). Again, scenario 2 without birth/death events of proteins deviated from a global, network-wide Gillepsie simulation. In contrast, the CaiNet simulation of the low protein situation with scenario 3 including both gene switching and birth/death events yielded very good agreement of the gene product levels and the temporal variation of these levels with the global Gillespie simulation. Thus, our semi-stochastic treatment of birth/death events constitutes a very good approximation.

To ensure sound consideration of gene product numbers in all three scenarios, we implemented that CaiNet monitors the synchronization time step and returns a warning if the change of a gene product, Δn, during one interval violates the criterion

$$\Delta n < 1$$
(6)

For the transcription rate ν, for example, this leads to the condition

$$p_{on} \nu \Delta t < 1$$
(7)

Thus, if ν was the fastest rate constant in the system, choosing the synchronization time step Δt < ν−1 ensures small changes in gene elements.

Consideration of delays

The processes in a cell may give rise to temporal delays. For example, the process of elongation may be considered as a constant delay in gene transcription. We implemented the possibility to account for deterministic, constant delays in the response of a network element in CaiNet as a shift of the element’s output by a fixed number of synchronization steps Δt (Fig. 3a). For example, if the output of a network element is delayed by 10 Δt, the element internally stores the value from ten synchronization steps and reports the delayed value during synchronization. A limitation of this approach is that delays can only be given as multiples of the synchronization time step Δt.

Fig. 3
figure 3

Implementation of delays in gene product synthesis in CaiNet. a A deterministic delay, i.e. a shift by a constant time-period, is realized by a queue. At time \(t\), network elements report the expression levels from time point \(t - m\Delta t\), where \(m\Delta t\) is the time by which the gene product synthesis is delayed. b Rate-limiting steps in gene product synthesis are modelled by analytical solutions of the corresponding system of ODEs. Such delays modify the shape of the time trajectory of the expression level and slightly delay the maxima of expression

Moreover, we integrated the possibility to account for several rate-limiting steps in gene product synthesis. Such delays may occur due to transcription termination, mRNA splicing or translation. We modeled each rate-limiting step as a Poisson process with rate βi, i.e. with exponentially distributed waiting times. Thus, changes in gene activity are blurred if average waiting times are on the order of or slower than gene on/off switching processes (Fig. 3b). We only simulated rate-limiting steps deterministically. Therefore, Eq. (1) was replaced by an analogous equation including the appropriate number of rate-limiting steps (“Methods” section, Eqs. (24) to (29)). At the synchronization time step Δt, the gene element returns the number of gene products of the corresponding solution.

Consideration of biochemical reactions

We further implemented the possibility to include and combine network elements of bi-molecular reactions in CaiNet (Fig. 4). To demonstrate the working principle of simulating enzymatic reactions, we chose an example of two enzymes that mutually take each others product and catalyze it to each others substrate (Fig. 4a). The differential equations for the substrate N and the enzyme F that transforms the substrate N into the product M are

$$\begin{aligned} \dot{N} & = - \delta_{1} N - \nu_{1} \hat{f} + \nu_{2} \hat{g} \\ \dot{f} & = \lambda_{1} (\hat{N} - f)(F - f) - (\delta_{1} + \mu_{1} + \nu_{1} )f \\ \end{aligned}$$
(8)

where δ is the degradation rate and ν1 and ν2 are the catalytic rates of the enzymes F and G. The association and dissociation rates of the substrate to the enzymes are denoted by λ1 and µ1 for N and F and by λ2 and µ2 for M and G. The differential equations for the enzyme G that transforms M back into N are

$$\begin{gathered} \dot{M} = - \delta_{2} M + \nu_{1} \hat{f} - \nu_{2} \hat{g} \hfill \\ \dot{g} = \lambda_{2} (\hat{M} - g)(G - g) - (\delta_{2} + \mu_{2} + \nu_{2} )g \hfill \\ \end{gathered}$$
(9)
Fig. 4
figure 4

Implementation of biochemical reactions in CaiNet. a Sketch of an exemplary biochemical reaction comprised of two basic enzyme turnover reactions (upper panel). The two enzymes mutually take each others product and catalyze it to each others substrate. During each time step, molecule numbers and molecule fluxes need to be synchronized to ensure mass conservation (lower panel). b Comparison of time trajectories of product levels of the biochemical reaction sketched in a simulated with CaiNet using two different synchronization time steps \(\Delta t\) (yellow/purple and green/blue lines) with the numerical solution of an ODE solver (dashed blue/dashed red lines). If the time-step \(\Delta t\) is too large, the transient behavior of CaiNet deviates from an ODE-solver. If the time-step \(\Delta t\) is sufficiently small, the transient behavior of CaiNet approximates the ODE-solver well

In Eqs. (8) and (9), the molecule numbers and fluxes that are synchronized at the end of each synchronization time step Δt are indicated with a circumflex accent. These parameters stay constant during each synchronization time step. With this assumption, we were able to decouple the differential equations of both bi-molecular elements and solve them separately (“Methods” section). The resulting product numbers are reported as output of the bi-molecular elements to all other network elements after each synchronization step Δt. We tested how the duration of the synchronization time step Δt influenced the accuracy of the results using a fixed set of reaction rates (Additional file 2: Table S2) and Δt = 1 s or 10 s (Fig. 4b). Similar to the case of gene elements, if Δt was too large, synchronization of reaction products was delayed compared to a global ODE solution and approximation of the transient behavior of the global ODE solution was poor. However, the steady state level was still reproduced correctly independent of the size of the time step, since the correct system of differential equations was used to calculate the outputs of the bi-molecular reaction elements. If Δt was sufficiently small, the approximation of CaiNet approached the global solution of the associated system of ODEs using a numerical solver.

To ensure a sound consideration of molecule abundance using the flux among elements, we implemented that CaiNet returns a warning once the difference in flux Δj between start and end time of the synchronization interval violates

$$\Delta j\Delta t < 1$$
(10)

CaiNet reproduces noise-induced bi-stability and noise-induced oscillations in GRNs

GRNs often combine both gene regulatory motives and biochemical reactions. To further test the performance of CaiNet, we thus applied it to two heterogeneous GRNs. Both exhibited complex dynamic behavior in presence of stochasticity, as before with the repressive gene cascade.

As first example of a heterogeneous GRN, we chose a network including a positive autoregulatory feedback loop combined with enzyme-coupled degradation (Fig. 5a, Eq. (60) in “Methods” section and Additional file 3: Table S3). Recently, noise-induced bi-stability of such a network has been studied using a new mathematical method, linear mapping approximation, which is able to reproduce even critical dynamic characteristics of GRNs [54]. We verified presence of noise-induced bi-stability in our network with a global Gillespie simulation (Fig. 5b). A CaiNet simulation of scenario 2, where only gene switching is treated stochastically but transcription of RNA and protein birth/death events are treated deterministically, did not reproduce this bi-stability. In contrast, when we included stochastic gene switching and stochastic transcription and birth/death events in the CaiNet simulation (scenario 3), the fluctuations in protein levels agreed to the fluctuations of the global, network-wide Gillespie simulation (Fig. 5b). Small deviations in the distribution of protein numbers remained for this heterogeneous GRN, since enzymatic reactions are only considered deterministically in CaiNet (Fig. 5b). Overall, CaiNet well recovered noise-induced bi-stability of the GRN.

Fig. 5
figure 5

CaiNet recovers noise-induced bi-stability and oscillations. a Sketch of a positive autoregulatory feedback motive combined with enzyme-mediated degradation. b Left panel: comparison of a CaiNet simulation of the positive autoregulatory feedback motive according to scenario 2 (blue line) and scenario 3 (red line) with a Gillespie simulation (yellow line). Right panel: histogram of respective gene product levels. c Sketch of a negative autoregulatory feedback motive combined with enzyme-mediated degradation. d Left panel: comparison of a CaiNet simulation of the negative autoregulatory feedback motive according to scenario 2 (blue line) and scenario 3 (red line) with a Gillespie simulation (yellow line). Right panel: Fourier transformation of the time traces in the left panel

We further compared the time necessary to simulate this positive autofeedback GRN. One day of simulation time took 0.07 s for deterministic CaiNet simulations (scenario 1), 0.08 s for CaiNet simulations including stochastic gene switching (scenario 2) and 0.18 s for CaiNet simulations including both stochastic gene switching and stochastic transcription and birth/death events of the gene product (scenario 3). A network-wide Gillespie simulation implemented in Matlab took 2 s. We note however, that the Gillespie-Simulation was specifically written and optimized for the positive autofeedback GRN while CaiNet is a framework for general GRNs.

Second, we applied CaiNet to a heterogeneous GRN including a negative autoregulatory feedback loop combined with enzyme-coupled degradation (Fig. 5c, Eq. (61) in “Methods” section and Additional file 3: Table S3). A similar network has been shown to exhibit noise-induced oscillations by using slow-scale linear noise approximation [55]. The CaiNet simulation of our network using only stochastic treatment of gene switching (scenario 2) settled at a constant steady state, after quickly decaying oscillatory transient behavior (Fig. 5d). In contrast, a global Gillespie simulation considering full molecular stochasticity revealed continuous, noise-induced oscillations, as expected. Similarly, when we considered both stochastic gene switching and stochastic transcription and birth/death events (scenario 3), the CaiNet simulation exhibited continuous noise-induced oscillations (Fig. 5d). As before, the CaiNet simulation did not fully reproduce the protein levels of the Gillepsie simulation due to deterministic treatment of the enzyme reaction. However, CaiNet well recovered the periodicity of the oscillations as revealed by a Fourier transformation of the simulated time trace (Fig. 5d). Overall, CaiNet simulations considering both stochastic gene switching as well as stochastic birth/death events of the gene product well recovered complex dynamic behavior of heterogeneous GRNs including both gene regulatory motives and biochemical reactions.

Inference of GRNs with a neural network approach

We took advantage of the modular approach of CaiNet to facilitate inference of kinetic rates of network elements and the topology of a GRN by using a gradient method originally designed to train recurrent neural networks. The analogy between GRNs and neural networks (NNs) has been pointed out before, by discussing genes as information processing units [43,44,45,46]. Accordingly, we directly identified each network element and its divers physiological steady-state input and output parameters with a unique perceptron. Thus, we could directly interpret the GRN laid out in CaiNet as a NN. This identification enabled us to apply a training method designed for recurrent NNs to the GRN, and the process of training corresponded to the process of inference of network parameters. As experimental ground truth, several known steady state measurements of gene product levels including different input conditions of the unknown network or knockout experiments of one or several genes may be taken. During the inference process, we minimized the difference between the current network behavior and the experimental ground truth by optimizing steady-state parameters of the GRN. In particular, we used a back-propagation algorithm introduced for recurrent NNs, which does not perform a global gradient descent step for all nodes in the network, but rather performs a local gradient decent and propagates remaining errors to other nodes [56] (“Methods” section). The modular layout of the GRN in CaiNet resulting from our central assumption facilitated calculating this gradient.

For an initial proof-of-concept, we applied the process of inference to a network of three genes, for which we assigned a set of reaction rates (Additional file 4: Table S4) and activating or repressing connections between the genes (Fig. 6a). We then used CaiNet to simulate the ground truth behavior of the GRN for two different inputs. The resulting expression levels served as measurements for the proof-of-concept inference. For simplicity, we only inferred the equilibrium constants of transcription factor binding, i.e. of on- and off-switching of the genes, while leaving production and degradation rates fixed. We assigned false starting values for the equilibrium constant of each gene (Fig. 6a), simulated the resulting network behavior and calculated the difference to the two measurements of the ground truth network. Next, we iteratively applied the adapted gradient method until the difference between simulated and ground-truth values dropped below a certain threshold (“Methods” section). During each iteration, the parameters in the guessed network were changed such that the difference between simulated and ground-truth values became smaller. These changes of parameters propagate through the network. Due to feedback loops in the network structure, such a change may cause worse performance of the guessed network. Therefore, the distance between ground truth and guessed network may oscillate before reaching a stable value (Fig. 6b–d). The resulting values for the equilibrium constants and thus the performance of the network were well regained during the inference process (Fig. 6a).

Fig. 6
figure 6

Inference of GRNs with CaiNet using a recurrent network training approach. a Demonstration of the inference procedure using a GRN comprising an input element and 2 genes. Left panel: sketch of the ground truth network. The equilibrium constants of transcription factor binding and gene product levels resulting from two different input levels (I and II) are depicted. Middle panel: sketch of the first training cycle. Based on the difference between a guessed network and the ground truth network and the gradient of the gene response functions, CaiNet iteratively adapts the parameters of the network until the expression levels of the guessed network match the ground truth values. Right panel: sketch of the trained network. Trajectories of the gene product levels of gene 1 (blue) and gene 2 (red) for input level I (b) and input level II (c). d Trajectories of equilibrium constants of transcription factor binding. e Sketch of the procedure to evaluate the inference approach. A ground truth network is generated by randomly omitting gene connections of a fully connected network and assigning equilibrium constants of transcription factors. Gene product levels simulated for the ground truth network are used to train a fully connected network. f Percentage of true positive gene connections (agreement between trained and ground truth network) versus percentage of false positive gene connections (exist in trained but not in ground truth network) for GRNs of 5 to 10 genes. Error bars denote s.d. of the training results of 12 different randomly generated networks. Inset: Histogram of relative errors (normalized difference in equilibrium constants between trained and ground truth network)

Next, we applied our inference approach to multiple different networks with N = 5 to 10 genes. To generate these networks, we set up fully connected networks with N genes, i.e. that have N·(N − 1) connections (Fig. 6e). To obtain a specific ground truth network, we randomly modified the fully connected network. For each connection, we determined whether it was deleted by comparing a random number with a deletion probability pdelete, such that on average N·(N − 1)(1 − pdelete) connections remained. For each equilibrium constant of transcription factor binding, i.e. of on- and off-switching of a gene, we drew a random value out of a probability distribution spanning over two orders of magnitude between 1e−3 and 1e−2. As a result of these two steps, we obtained a series of networks with N genes and randomly chosen topology and gene switching kinetics. Each series consisted of 12 networks, which served as ground truth networks.

To generate measurements for the inference process, we used CaiNet to simulate the expression levels of each gene of a network under different conditions. Here, we knocked out each gene in a given ground truth network one at a time and simulated the resulting expression levels of the other genes. This resulted in a set of N2 measurements, which we used as training data set in the inference process. Such large training data sets enabled inferring the actually implemented connections of the ground truth network when starting the inference process from the fully connected network. We repeated this process for each ground truth network in the series of networks with N genes.

We then trained the fully connected, non-randomized, starting network with the gradient descent method on a measurement dataset of a specific network (“Methods” section). During the inference process, CaiNet did not explicitly delete connections. Rather, the equilibrium constants of unneeded or contradictory connections approached zero. We interpreted a connection as deleted, if its equilibrium constant was smaller than 1e−4. With equilibrium constants smaller or equal to this value, the affinity of the corresponding transcription factor is too weak to regulate the expression level of the respective gene. We repeated this process of inference for each ground truth network in the series of networks with N genes.

To quantify the performance of the inference approach, we counted the number of true positive and false positive connections of a trained network, as compared to the ground truth network topology (Fig. 6c and Additional file 5: Table S5). For networks of N = 5 to 10 genes, the false positive fraction was on average ≈ 20%, and the true positive fraction was ≈ 90%. We further quantified the inference of the equilibrium constants by calculating the difference between ground truth and inferred value and normalized this with the ground truth value (Fig. 5c, Inset). On average, the relative error of equilibrium constants was ≈ 1%.

Discussion

CaiNet is designed to assist users in setting up and simulating complex GRNs at molecular detail. Simplifying the process of setting up networks, we designed gene elements to assume a two-state behaviour with off and on state of the gene and with a variable number of successive production and processing rates and a degradation rate of the gene product. After assigning the promoter structure and number of successive gene synthesis rates, CaiNet assigns the corresponding local analytical solutions or stochastic models to the gene elements without further input by the user. This is possible since we treat each gene element isolated from other network elements between two synchronization time steps.

Our modular algorithm enables adding more and more realistic kinetic behaviour and noise in molecule abundance to initial deterministic solutions, since the local analytical treatment of the two-state model and gene product synthesis of a gene element can be readily replaced by a stochastic treatment of gene on/off switching and stochastic product synthesis for each gene. Thus, it is possible to quickly assess the effect of stochasticity on the behaviour of a network. Another great advantage of the modular approach of CaiNet is that new GRNs that follow the implemented regulatory structure can be set up and simulated very quickly.

Simulations of the modular stochastic-deterministic elements in CaiNet may be fast compared to the Gillespie method. The time increment of a Gillespie simulation depends on the fastest rate in the system, which often scales with the abundance of a reactive species. In contrast, the synchronization time step of CaiNet has to be adjusted such that changes in the abundance of the species are small within this time step, which allows for longer time steps. In addition, since gene elements are decoupled in CaiNet, it is possible to distribute the calculations to multiple processor cores. Moreover, the CaiNet algorithm enables fast transitions between deterministic and stochastic simulations. This in principle allows approximating parts of a large GRN deterministically, while treating others stochastically.

The precalculated analytical solutions applied in CaiNet set a limit for the generality of modelling biological networks. For now, we implemented modelling of genes only with two states and a few promoter structures. We further assumed a direct relationship between transcription factor binding and activation of a gene. Epigenetic alterations to the gene locus or further steps such as recruitment of the transcription machinery are not modelled explicitly. In principle, new network elements with further promoter structures or more complex multistate models can be implemented in CaiNet as specialized effective two-state models. It has been found that such effective models are suitable to well describe the histogram of mRNA levels of a cell population [57]. Nevertheless, single-cell experiments revealed an effect of cell size, cell division and DNA content on the kinetics of gene transcription and mRNA content, for example in dosage compensation after DNA replication [58, 59]. Accordingly, the two-state model has been extended to include such effects [58]. Recently, a more comprehensive and analytically tractable stochastic model of mRNA dynamics has been devised [60]. In the future, incorporating such complex models will enable enhancing CaiNet to arrive at more and more realistic simulations of GRNs.

While CaiNet is optimized for setting up GRNs and simulating stochastic processes in gene product synthesis, biochemical reactions can also be implemented to account for molecular signalling pathways and reaction cascades that are oftentimes connected to GRNs. For these pathways, we did not implement stochastic fluctuations in the expression levels, since biochemical reactions typically are fast compared to the kinetics of gene on/off switching. Due to this simplification, some deviations in gene product levels compared to a Gillespie simulation may occur. However, even complex dynamic behaviour such as noise-induced bi-stability or oscillations were well recovered by a CaiNet simulation that considered both stochastic gene switching and stochastic birth/death events of the gene product. For biochemical reactions, CaiNet’s elements are limited to dimerization reactions and enzyme-mediated transformations of biomolecules. These basic bi-molecular reactions can be combined to more complex biochemical reactions. However, biochemical reactions with more complex response functions or hill coefficients larger than two are not implemented and require calculating new CaiNet elements.

CaiNet is able to infer GRNs from steady state expression levels. The inferred parameters are therefore limited to equilibrium constants. Since it is challenging to define a gold-standard for inference methods [42, 61], we refrained from comparing CaiNet with other inference methods, such as previously attempted in the DREAM challenge. Instead, we used a comprehensive randomized approach to evaluate the performance of CaiNet. Our simulations indicate that if each gene in a network is knocked out individually, the resulting gene product levels provide sufficient data to infer the topology and parameters of the network. A limit for the size of the inferred network is given by the computation time of the recurrent network algorithm and by the amount of known features of the GRN. Importantly, the inferred equilibrium constants directly correspond to physiological parameters of the GRN and therefore directly allow for subsequent forward simulation of the inferred GRN.

Conclusion

We provide a user-friendly framework, CaiNet, to simulate GRNs at molecular detail and to infer the topology and steady-state parameters of GRNs. In CaiNet, biochemical reactions and genes with various regulatory logics can be combined to complex GRNs, for which the temporal progression of gene product levels is simulated. Inversely, experimental steady-state measurements of expression levels can be used to infer the underlying GRN. The combination of a forward simulation tool and inverse inference tool in the single package CaiNet may facilitate the process of evaluating biologically relevant GRNs and interpreting associated measurement results.

Methods

Pseudo code of network simulations by CaiNet

figurea

Effective rates given by the promoter structure

Activation by a single transcription factor

For the most simplified case of activation of a gene we assume that a promoter is activated once a transcription factor (TF) binds. This means that the on-rate is given by the arrival rate λeff of the TF at the promoter. Once a TF has arrived at the Promoter the gene is ‘on’ for the time 1/µeff. This time may vary depending on the TF and the promoter of the gene. If not otherwise stated we assume that the binding time of the TF, 1/µ, corresponds to this on-time.

$$\begin{aligned} \lambda_{eff} & = n_{TF} \lambda_{0} \\ \mu_{eff} & = \mu \\ \end{aligned}$$
(11)

where nTF is the number of transcription factors and λ0 is the arrival rate of a single transcription factor.

Promoter with AND-logic

The AND-logic refers to a promoter that is only activated if TF1 and TF2 up to TFN are bound (Fig. 2d). Once a single TF leaves, the activation criterion is immediately violated and the promoter is off. Therefore, the off-rate µeff of the promoter is the sum over all off-rates µTFi of the TFs.

$$\mu_{eff} = \sum\limits_{i = 1}^{N} {\mu_{TFi} }$$
(12)

Combinatorically, we can also write down the probability of the promoter to be on as the product of the probability of all TFs to be bound

$$p_{on} = \prod\limits_{i} {p_{on,TFi} } = \prod\limits_{i} {\lambda_{TFi} /(\lambda_{TFi} + \mu_{TFi} )}$$
(13)

where λTFi = λTFi,0 nTFi is the arrival rate of a TF at the promoter. From pon,eff and µeff, we can calculate the on-rate of the promoter

$$\lambda_{eff} = p_{on} /(1 - p_{on} )\mu_{eff} = K\mu_{eff}$$
(14)

Promoter with OR logic

When a promoter is active if TF1 or TF2 up to TFN or a combination of all is bound, we refer to this promoter as OR-logic (Fig. 2d). We start the calculation of effective rates with the on-rates of individual TFs, λTFi. Since the arrival of any TF is enough to activate the promoter, the on-rate is

$$\lambda_{eff} = \sum\limits_{i} {\lambda_{TFi} }$$
(15)

The probability to be on can be calculated combinatorically with pon,TFi implicitly defined in (13)

$$p_{on} = 1 - (1 - p_{on,TF1} )(1 - p_{on,TF2} )...(1 - p_{on,TFN} )$$
(16)

With (14) we find the effective off rate of the promoter

$$\mu_{eff} = \lambda_{eff} (1 - p_{on} )/p_{on}$$
(17)

Competitive repression

We now enhance all promoters developed above by an additional feature, which is the blocking of the promoter by a repressor. We assume that the promoter cannot be activated once a repressor is bound. Once a gene is activated however, the repressor does not abort expression. Therefore, the repressor directly affects the effective on-rate of an arbitrary promoter while the off-rate remains unchanged

$$\begin{aligned} \lambda_{eff,repressor} & = \lambda_{eff} \cdot p_{{off,{\text{Re}} pressor}} \\ \mu_{eff,repressor} & = \mu_{eff} \\ \end{aligned}$$
(18)

To proof this, we write down the differential equations describing the change in the blocked and free promoter populations. We denote blocked promoters with b, free but inactive promoters with f and active promoters with a

$$\begin{aligned} \dot{b} & = - \mu_{R} b + \lambda_{R} f \\ \dot{f} & = \mu_{R} b + \mu_{eff} a - \lambda_{eff} f - \lambda_{R} f \\ \end{aligned}$$
(19)

Assuming that the changes in the repressor-population are small, we find

$$b = \frac{{\lambda_{R} }}{{\mu_{R} }}f = K_{R} f$$
(20)

If we now calculate the sum of blocked and unblocked promoters, we obtain

$$p = b + f = (1 + K_{R} )f$$
(21)

We add up Eq. (19), plug the result in (21) and obtain

$$\dot{p} = \mu a - \tfrac{{\lambda_{eff} }}{{1 + K_{R} }}p$$
(22)

From this result we conclude for the new effective rate modified by the repressor

$$\lambda_{eff,repressor} = \lambda_{eff} \left( {1 + K_{R} } \right)^{ - 1}$$
(23)

This is equivalent to Eq. (18) and the proof is complete.

Gene expression including delays

In a simplified approach, we considered one rate-limiting step with rate constant ν for synthesis of the gene product. This assumption neglects the fact that certain processes in gene product synthesis may introduce a delay between initiation of product synthesis and availability of the synthesized product, e.g. elongation of RNA, termination of transcription, initiation of translation, elongation of the protein and termination of translation. In addition, the mRNA needs time to be transported to ribosomes and might be subject to posttranslational modifications. To account for such processes in gene product synthesis, we included additional rate-limiting steps corresponding to Poisson processes with rates β1…βN.

To determine the resulting function n(t), we solve the general system of ODEs for an arbitrary number of delay processes. Using the switching function a(t) of a single gene, we arrive at the system of differential equations

$$\begin{aligned} & \dot{c}_{1} = a(t)\nu - \beta_{1} c_{1} \\ & \dot{c}_{2} = \beta_{1} c_{1} - \beta_{2} c_{2} \\ & ... \\ & \dot{c}_{N} = \beta_{N - 1} c_{N - 1} - \beta_{N} c_{N} \\ \end{aligned}$$
(24)

where cN represents the output expression level of the gene. Our first step to solve the system for cN(t) is to calculate the Laplace transform of all equations in (24). The result for the first and the i-th equation is

$$\begin{aligned} \tilde{c}_{1} & = \frac{1}{{p + \beta_{1} }}\nu \tilde{a} \\ \tilde{c}_{i} & = \frac{{\beta_{i - 1} }}{{p + \beta_{i} }}\tilde{c}_{i - 1} \\ \end{aligned}$$
(25)

where p the transform variable and ã is the Laplace transform of a. With these results we can easily find an equation for the Laplace-transform of the final product cN

$$\tilde{c}_{N} = \frac{{\prod\nolimits_{i = 1..N - 1} {\beta_{i} } }}{{\prod\nolimits_{i = 1..N} {(p + \beta_{i} )} }}\nu \tilde{a}$$
(26)

Before we calculate the inverse Laplace transform to obtain cN(t) we simplify the denominators in (26) by partial fraction composition with the coefficients αi

$$\begin{aligned} & \frac{{\prod\nolimits_{i = 1..N - 1} {\beta_{i} } }}{{\prod\nolimits_{i = 1..N} {(p + \beta_{i} )} }} = \sum\limits_{i} {\frac{{\alpha_{i} }}{{p + \beta_{i} }}} \\ & {\text{where}}\quad \alpha_{i} = \frac{{\prod\nolimits_{n = 1..N - 1} {\beta_{n} } }}{{\prod\nolimits_{n \ne i} {( - \beta_{i} + \beta_{n} )} }} \\ \end{aligned}$$
(27)

For few delays this simplification can most easily be verified by plugging in αi and rewriting the right hand side of (27). For a high number of delays the result can be derived by the well-known technique of partial fraction decomposition. We plug this result in (26) and obtain

$$\tilde{c}_{N} = \nu \tilde{a}\sum\limits_{i} {\frac{{\alpha_{i} }}{{p + \beta_{i} }}}$$
(28)

We next use the multiplication theorem for the Laplace transform and find

$$c_{N} = \sum\limits_{i} {\alpha_{i} \nu \int {\exp ( - \beta_{i} (t - t^{\prime}))} \cdot a(t^{\prime}){\text{d}} t^{\prime}}$$
(29)

Training of recurrent regulatory networks

We assume that steady state expression levels for a subset of species of a GRN are given. Our goal is to infer parameters characterizing the steady state behaviour of the network from this information. Importantly, since we work with steady state information, our approach is limited to the inference of equilibrium constants rather than kinetic rate constants. To infer these constants, we apply a gradient method particularly design for recurrent network topologies [56]. Once these equilibrium parameters are inferred, the temporal evolution of the network can still be simulated.

We start by writing down the minimization problem for the objective function W(ϕ) of the difference between the current values X and the target expression levels T of the network. The objective function shall be minimized by varying the parameters of the GRN, ϕ = {λ, µ, ν, δ}.

$$\mathop {\min }\limits_{{{\varvec{\upvarphi }} > 0}} {\mkern 1mu} W({\varvec{\upvarphi }})\quad {\text{where}}\;\;\;W = \frac{1}{2}\sum\limits_{i} {\left( {x_{i} - T_{i} } \right)^{2} }$$
(30)

where we used the l2-norm of the distance between target expression level and corresponding simulated values as a loss-function for the i-th element of the training data set.

To obtain the gradient for a parameter ϕjϕ we calculate the derivative of the objective function with respect to ϕj and obtain

$$\frac{\partial }{{\partial \varphi_{j} }}W = \sum\limits_{i} {\left( {x_{i} - T_{i} } \right)} \frac{{\partial x_{i} }}{{\partial \varphi_{j} }}$$
(31)

Using the delta rule introduced by Pineda et al. [56], we find a step size for the parameter ϕj

$$\Delta \varphi_{j} = - \eta \cdot \sum\limits_{i} {\left( {x_{i} - T_{i} } \right)} \frac{{\partial x_{i} }}{{\partial \varphi_{j} }}$$
(32)

where we use the empirical parameter η to scale the step size. In the above equation, the derivatives of xi with respect to ϕj are unknown. To derive a formula for Δϕj, we in the following formally work on a generic system of ODEs corresponding to a GRN.

The time evolution of all expression levels X = {x1, , xi}, is given by the set of differential equations \({\dot{\mathbf{x}}} = {\mathbf{F}}({\mathbf{x}})\). To obtain the missing derivatives we take the total derivative with respect to ϕj of the steady state case 0 = F(X). We obtain

$$\frac{d}{{d\varphi_{j} }}{\mathbf{F}}(\varphi_{j} ,x_{i} ) = \frac{{\partial {\mathbf{F}}}}{{\partial \varphi_{j} }} + \sum\limits_{k} {\frac{{\partial {\mathbf{F}}}}{{\partial x_{k} }}\frac{{\partial x_{k} }}{{\partial \varphi_{j} }}} = 0$$
(33)

In this equation we recognize the partial derivative of xk with respect to ϕj. Importantly, the system of equations is linear with respect to this derivative. We introduce the abbreviation

$$L_{ik} = \frac{{\partial F_{i} }}{{\partial x_{k} }}$$
(34)

With this abbreviation, we can rewrite (33) as a linear system of equations for the derivatives of xk with respect to ϕj.

$$- \frac{{\partial {\mathbf{F}}}}{{\partial \varphi_{j} }} = {\mathbf{L}}\frac{{\partial x_{k} }}{{\partial \varphi_{j} }}$$
(35)

By introducing a new variable z, we can make the step of solving this system of ODEs independent from the parameter ϕj, such that we only need to solve the system once in order to obtain the gradients for all parameters in ϕ. This variable z is characterized by the equation

$${\mathbf{z}}^{{\text{T}}} = \left( {{\mathbf{x}} - {\mathbf{T}}} \right)^{{\text{T}}} {\mathbf{L}}^{{{\mathbf{ - 1}}}} \Leftrightarrow {\mathbf{L}}^{{\text{T}}} {\mathbf{z}} = \left( {{\mathbf{x}} - {\mathbf{T}}} \right)$$
(36)

We plug this equation in (32) and identify z by introducing the identity matrix. We obtain

$$\Delta \varphi_{j} = - \eta \cdot \left( {{\mathbf{x}} - {\mathbf{T}}} \right) \cdot {\mathbf{L}}^{ - 1} {\mathbf{L}}\frac{{\partial x_{i} }}{{\partial \varphi_{j} }} = \eta \cdot {\mathbf{z}} \cdot \frac{{\partial {\mathbf{F}}}}{{\partial \varphi_{j} }}$$
(37)

To calculate this gradient using a network set up in CaiNet, we need to calculate the derivatives of the differential equations \(\partial {\mathbf{F}}/\partial \varphi_{j}\) and the derivatives \(\partial x_{k} /\partial \varphi_{j}\). Since the parameter \(\varphi_{j}\) and the species \(x_{i}\) only occur in a single network element, each network element can return these derivatives independently of other elements in the network.

For a promoter with or-logic, the differential equation is

$$\frac{{dn_{j} }}{dt} = \left[ {1 - \prod\limits_{{i \in A_{j} }} {p_{off,i} } } \right]\nu_{j} - \delta_{j} n_{j}$$
(38)

where \(A_{j}\) is the set of all transcription factors that activate the element \(j\). We now take the derivative of the steady state equation with respect to \(\lambda_{k}\) and obtain

$$0 = \left( {\prod\limits_{i} {p_{off,i} } } \right) \cdot \nu_{j} \cdot \sum\limits_{i} {\frac{{\lambda_{i} }}{{\lambda_{i} n_{i} + \mu_{i} }}\frac{{\partial n_{i} }}{{\partial \lambda_{k} }}} - \delta_{j} \frac{{\partial n_{j} }}{{\partial \lambda_{k} }} + \nu_{j} \frac{{\mu_{i} }}{{\left( {\lambda_{i} n_{i} + \mu_{i} } \right)^{2} }}n_{i} \Delta_{ik}$$
(39)

where \(\Delta_{ik}\) is the Kronecker delta. From this equation we can identify the variables that the network element returns to calculate a gradient decent step:

$$\begin{aligned} \frac{{\partial F_{j} }}{{\partial \lambda_{k} }} & = \nu_{j} \frac{{\mu_{i} }}{{\left( {\lambda_{i} n_{i} + \mu_{i} } \right)^{2} }}n_{i} \Delta_{ik} \\ L_{ji} & = \left\{ {\begin{array}{*{20}c} {\left( {\prod\limits_{{i \in A_{j} }} {p_{off,i} } } \right) \cdot \nu_{j} \cdot \frac{{\lambda_{i} }}{{\lambda_{i} n_{i} + \mu_{i} }} - \delta_{j} \Delta_{ij} } & {i \in A_{j} } \\ { - \delta_{j} \Delta_{ij} } & {i \notin A_{j} } \\ \end{array} } \right. \\ \end{aligned}$$
(40)

For a complete gradient descent scheme, we first obtain the steady state expression levels by simulating the network with CaiNet until all elements have reached a steady state. Next we use Eq. (37) to calculate the change for all parameters in ϕ based on the derivatives above and the difference between ground truth expression levels and simulated expression levels. Using the new parameters, we again simulate the steady state expression levels. We proceed in this manner until an abortion criterium is met. This abortion is \(W < b\), i.e. that the sum of all differences between ground truth and expression levels of the trained network are smaller than an upper bound \(b\).

Chemical reactions

In the following we derive analytical solutions for the elementary reactions ‘homo-dimerization’, ‘hetero-dimerization’ and ‘transformation of a species by an enzyme’. We then use these analytical solutions to couple multiple elementary reactions to obtain more complex biochemical reaction networks.

Hetero-dimerization

We start with the hetero-dimerization of the two species \(n_{1}\) and \(n_{2}\) that forms the species \(f\). For a closed system of these three species the change in \(f\) over time is

$$\dot{f} = \lambda n_{1} n_{2} - \left( {\delta_{1} + \delta_{2} + \mu } \right)f$$
(41)

where \(\lambda\) is the association rate of the two dimerizing species and \(\mu\) is the dissociation rate of the dimer. The rates \(\delta_{1}\) and \(\delta_{2}\) correspond to the degradation of the monomers \(n_{1}\) and \(n_{2}\) respectively. The law of mass conservation yields equations for the species \(n_{1} ,n_{2}\)

$$\begin{aligned} N_{1} & = f + n_{1} \\ N_{2} & = f + n_{2} \\ \end{aligned}$$
(42)

Plugging in the law of conservation we obtain

$$\dot{f} = \lambda (N_{1} - f)(N_{2} - f) - \left( {\delta_{1} + \delta_{2} + \mu } \right)f$$
(43)

To account for coupling to other elementary systems, we introduce the flow \(j_{in}\) that represents flux of the species \(f\) from other elements into the element at hand. The new differential equation for \(f\) is

$$\dot{f} = \lambda (N_{1} - f)(N_{2} - f) - \left( {\delta_{1} + \delta_{2} + \mu } \right)f + j_{in}$$
(44)

Next, we solve this equation for \(f(t)\) for an initial value of \(f_{0}\). We start by calculating the fixed points (defined by \(\dot{f} = 0\)) of \(f\). The corresponding quadratic equation yields the fixed points \(f_{1} ,f_{2}\)

$$f_{1,2} = \frac{{N_{1} + N_{2} + K^{ - 1} \pm \sqrt {\left( {N_{1} + N_{2} + K^{ - 1} } \right)^{2} - 4N_{1} N_{2} + j_{in} \lambda^{ - 1} } }}{2}$$
(45)

where \(K = \lambda \cdot \left( {\mu + \delta_{1} + \delta_{2} } \right)^{ - 1}\). Using these fixed points, we can rewrite Eq. (44) as

$$\dot{f} = - \lambda \left( {f - f_{2} } \right)\left( {f - f_{1} } \right)$$
(46)

Integration of this ODE yields

$$\begin{aligned} f\left( t \right) & = f_{1} + \left( {f_{2} - f_{1} } \right)\frac{u}{1 + u} \\ u & = \frac{{f_{0} - f_{1} }}{{f_{2} - f_{0} }}\exp ( - kt) \\ k & = \lambda \left( {f_{2} - f_{1} } \right) \\ \end{aligned}$$
(47)

Since the reaction consumes the species \(n_{1} ,n_{2}\), we need to calculate the flux of the respective elements into the species \(f\). This flux is calculated by

$$\begin{aligned} j_{in,n1} & = \int\limits_{0}^{\Delta t} {\left[ {\lambda (N_{1} - f)(N_{2} - f) - \left( {\delta_{2} + \mu } \right)f + j_{in} } \right]} \, dt \\ j_{in,n2} & = \int\limits_{0}^{\Delta t} {\left[ {\lambda (N_{1} - f)(N_{2} - f) - \left( {\delta_{1} + \mu } \right)f + j_{in} } \right]} \, dt \\ \end{aligned}$$
(48)

During each synchronization time-step, these fluxes are reported to the network element representing the corresponding species.

Homo-dimerization

The differential equations for a homo-dimerization are

$$\dot{f} = \lambda (N - 2f)^{2} - \left( {2\delta_{1} + \mu } \right)f + j_{in}$$
(49)

where \(\lambda\) is the association rate of the two dimerizing monomers and \(\mu\) is the dissociation rate of the dimer. The rate \(\delta_{1}\) corresponds to the degradation of the monomers \(N\). The shape of this equation is similar to the heterodimer case. Thus, by setting

$$\begin{aligned} \lambda \, & { = 4}\lambda \\ N_{1} & = N_{2} = N/2 \\ j_{in} & = 2j_{in} \\ \end{aligned}$$
(50)

we can reuse our solution (47) to calculate the number of dimers and (48) to calculate the flux out of the monomer into the homodimer. During each synchronization time-step, this flux is reported to the monomer element.

Enzyme kinetics

In case of a dimerization with an enzyme, the substrate can be subject to a reaction catalysed by the enzyme. Here, the species \(m\) is produced.

$$\dot{f} = \lambda (N_{1} - f)(N_{2} - f) - \left( {\delta_{1} + \delta_{2} + \nu + \mu } \right)f$$
(51)
$$\dot{m} = \nu f - \delta_{3} m + j_{in}$$
(52)

where \(\lambda\) is the association rate of substrate to enzyme and \(\mu\) is the dissociation rate of the substrate-enzyme complex. The rates \(\delta_{1}\), \(\delta_{2}\) and \(\delta_{3}\) correspond to the degradation of the substrate \(n_{1}\), degradation of the enzyme \(n_{2}\) and degradation of the product respectively. The rate \(\nu\) is the transformation rate of substrate to product. The shape of this equation is similar to the heterodimer case. Thus, by setting

$$\delta_{1} = \delta_{1} + v$$
(53)

we can reuse our solution (47) to calculate the number of enzyme–substrate–complexes and (48) to calculate the flux out of substrate and enzyme into the enzyme–substrate–complex. To obtain the amount of product, we integrate (52) and obtain

$$m(\Delta t) = m_{0} \exp \left( { - \delta_{3} \Delta t} \right) + \frac{{\nu \overline{f} + j_{in} }}{{\delta_{3} }}\left[ {1 - \exp \left( { - \delta_{3} \Delta t} \right)} \right]$$
(54)

where

$$\overline{f} = \frac{1}{\Delta t}\int\limits_{0}^{\Delta t} {f(t)dt = f_{1} + \frac{1}{\lambda \Delta t}\log \left( {\frac{1 + u(t = 0)}{{1 + u(t = 0)\exp ( - k\Delta t)}}} \right)}$$
(55)

We arrived at this result by reusing our solution (47). During each synchronization time-step, m(Δt) is reported to all other elements in the network.

Simulation of the negative feedback cascade

Gillespie simulation

All genes in the cascade have three states, since the promoter structure is not modelled with an effective two-state model as in the case of the CaiNet simulation. The promoter is empty in the second state. By binding of a transcription repressor the promoter enters the first state. This state can only be exited upon unbinding of the transcription repressor. The third state is entered upon binding of a transcription activator

$${\text{Repressed}} \rightleftharpoons {\text{Empty}} \rightleftharpoons {\text{Active}} \to {\text{TF}} \to \emptyset$$
(56)

This state model leads to the following system of stochastic reactions. We consider the n-th link of the chain. \(P_{n}\) then is transcriptionally repressed by the gene product of the (n − 1)-th link. The n-th link of the chain produces the transcription repressor \(R_{n}\)

$$\begin{aligned} & {\text{P}}_{n} + {\text{A}}\mathop{\longrightarrow}\limits^{{\lambda_{A} }}{\text{P}}_{n} - {\text{A}} \\ & {\text{P}}_{n} - {\text{A}}\mathop{\longrightarrow}\limits^{{\mu_{A} }}{\text{P}}_{n} + {\text{A}} \\ & {\text{P}}_{n} + R_{n - 1} \mathop{\longrightarrow}\limits^{{\lambda_{R} }}{\text{P}}_{n} - R_{n - 1} \\ & {\text{P}}_{n} - R_{n - 1} \mathop{\longrightarrow}\limits^{{\mu_{R} }}{\text{P}}_{n} + R_{n - 1} \\ & {\text{P}}_{n} - {\text{A}}\mathop{\longrightarrow}\limits^{\nu }{\text{P}}_{n} - {\text{A}} + {\text{R}}_{n} \\ & {\text{R}}_{n} \mathop{\longrightarrow}\limits^{\delta }\emptyset \\ \end{aligned}$$
(57)

We implemented this set of reactions for \(n = 1,2,3,4\).

System of ODEs

For the system of ODEs we do not explicitly simulate the association and dissociation of transcription factors to the promoters. Rather, we give the on-probability of the gene. The state model (56) corresponds to a competitive repression model of a promoter and we can give the on-probability of the n-th link of the chain using (18)

$$p_{on,n} = \frac{{\lambda_{a} p_{off,n} }}{{\lambda_{a} p_{off,n} + \mu_{a} }}\quad {\text{where}}\;\;\;{\text{p}}_{off,n - 1} = 1 - \frac{{\lambda_{n} }}{{\lambda_{n} R_{n - 1} + \mu_{n} }} \,$$
(58)

With this on-probability we can give the ODE for the gene product

$$\frac{{dR_{n} }}{dt} = p_{on,n} \nu_{n} - \delta_{n} R_{n}$$
(59)

We implemented this set of reactions for \(n = 1,2,3,4\). For solving the system of differential equations we used the ode45 solver of Matlab 2019a.

Noise-induced bi-stability and oscillations

The positive feedback loop

We simulated a gene that is activated by its own gene product A to transcribe RNA. The RNA is translated to yield the transcription factor, which can be degraded spontaneously or by an enzyme E.

$$\begin{aligned} & {\text{P}} + {\text{A}}\mathop{\longrightarrow}\limits^{{\lambda_{A} }}{\text{P}} - {\text{A}} \\ & {\text{P}} - {\text{A}}\mathop{\longrightarrow}\limits^{{\mu_{A} }}{\text{P}} + {\text{A}} \\ & {\text{P}} - {\text{A}}\mathop{\longrightarrow}\limits^{\nu }{\text{P}} - {\text{A}} + {\text{RNA}}_{A} \\ & {\text{RNA}}_{A} \mathop{\longrightarrow}\limits^{\beta }A \\ & E + A\mathop{\longrightarrow}\limits^{{\lambda_{E} }}EA \\ & EA\mathop{\longrightarrow}\limits^{{\mu_{E} }}E + A \\ & EA\mathop{\longrightarrow}\limits^{{\nu_{E} }}E \\ & A\mathop{\longrightarrow}\limits^{\delta }\emptyset \\ \end{aligned}$$
(60)

The simulation used the rate constants given in Additional file 3: Table S3.

The negative feedback loop

We simulated a gene that is repressed by its own gene product R. If R is absent, RNA is transcribed and further translated to yield the transcription repressor. The repressor can be degraded spontaneously or by an enzyme E.

$$\begin{aligned} & {\text{P}} + {\text{R}}\mathop{\longrightarrow}\limits^{{\lambda_{R} }}{\text{P}} - {\text{R}} \\ & {\text{P}} - {\text{R}}\mathop{\longrightarrow}\limits^{{\mu_{R} }}{\text{P}} + {\text{R}} \\ & {\text{P}}\mathop{\longrightarrow}\limits^{\nu }{\text{P}} + {\text{RNA}}_{R} \\ & {\text{RNA}}_{R} \mathop{\longrightarrow}\limits^{\beta }R \\ & E + R\mathop{\longrightarrow}\limits^{{\lambda_{E} }}ER \\ & ER\mathop{\longrightarrow}\limits^{{\mu_{E} }}E + R \\ & ER\mathop{\longrightarrow}\limits^{{\nu_{E} }}E \\ & R\mathop{\longrightarrow}\limits^{\delta }\emptyset \\ \end{aligned}$$
(61)

The simulation used the rate constants given in Additional file 3: Table S3.

Availability of data and materials

The CaiNet software is freely available at https://gitlab.com/GebhardtLab/CaiNet.

Abbreviations

GRN:

Gene regulatory network

NN:

Neural networks

CaiNet:

Computer Aided Interactive Gene Regulatory Network Simulation Tool

References

  1. 1.

    Schmidt-Heck W, et al. Fuzzy modeling reveals a dynamic self-sustaining network of the GLI transcription factors controlling important metabolic regulators in adult mouse hepatocytes. Mol Biosyst. 2015;11:2190–7.

    CAS  PubMed  Google Scholar 

  2. 2.

    Edwards JS, Ibarra RU, Palsson BO. In silico predictions of Escherichia coli metabolic capabilities are consistent with experimental data. Nat Biotechnol. 2001;19:125–30.

    CAS  PubMed  Google Scholar 

  3. 3.

    Orlando DA, et al. Global control of cell-cycle transcription by coupled CDK and network oscillators. Nature. 2008;453:944-U78.

    CAS  PubMed  PubMed Central  Google Scholar 

  4. 4.

    Dibner C, Schibler U, Albrecht U. The mammalian circadian timing system: organization and coordination of central and peripheral clocks. Annu Rev Physiol. 2010;72:517–49.

    CAS  PubMed  Google Scholar 

  5. 5.

    Li M, Belmonte JCI. Ground rules of the pluripotency gene regulatory network. Nat Rev Genet. 2017;18:180–91.

    CAS  PubMed  Google Scholar 

  6. 6.

    Chickarmane V, Troein C, Nuber UA, Sauro HM, Peterson C. Transcriptional dynamics of the embryonic stem cell switch. PLoS Comput Biol. 2006;2:1080–92.

    CAS  Google Scholar 

  7. 7.

    Davidson EH, et al. A genomic regulatory network for development. Science (80-). 2002;295:1669–78.

  8. 8.

    Jacob F, Monod J. Genetic regulatory mechanisms in synthesis of proteins. J Mol Biol. 1961;3:318.

    CAS  PubMed  Google Scholar 

  9. 9.

    Alon U. Network motifs: theory and experimental approaches. Nat Rev Genet. 2007;8:450–61.

    CAS  PubMed  Google Scholar 

  10. 10

    Larsson AJM, et al. Genomic encoding of transcriptional burst kinetics. Nature. 2019;565:251.

    CAS  PubMed  PubMed Central  Google Scholar 

  11. 11.

    Suter DM, et al. Mammalian genes are transcribed with widely different bursting kinetics. Science (80-). 2011;332:472–4.

  12. 12.

    Acar M, Mettetal JT, Van Oudenaarden A. Stochastic switching as a survival strategy in fluctuating environments. Nat Genet. 2008;40:471–5.

    CAS  PubMed  Google Scholar 

  13. 13.

    Acar M, Becskei A, Van Oudenaarden A. Enhancement of cellular memory by reducing stochastic transitions. Nature. 2005;435:228–32.

    CAS  PubMed  Google Scholar 

  14. 14.

    Arkin A, Ross J, McAdams HH. Stochastic kinetic analysis of developmental pathway bifurcation in phage λ-infected Escherichia coli cells. Genetics. 1998;149:1633–48.

    CAS  PubMed  PubMed Central  Google Scholar 

  15. 15.

    Chang HH, Hemberg M, Barahona M, Ingber DE, Huang S. Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature. 2008;453:544–7.

    CAS  PubMed  PubMed Central  Google Scholar 

  16. 16.

    Leloup J-C, Goldbeter A. Toward a detailed computational model for the mammalian circadian clock. Proc Natl Acad Sci. 2003;100:7051–6.

    CAS  PubMed  PubMed Central  Google Scholar 

  17. 17.

    Forger DB, Peskin CS. Stochastic simulation of the mammalian circadian clock. Proc Natl Acad Sci. 2005;102:321–4.

    CAS  PubMed  Google Scholar 

  18. 18.

    Chickarmane V, Olariu V, Peterson C. Probing the role of stochasticity in a model of the embryonic stem cell—heterogeneous gene expression and reprogramming efficiency. Bmc Syst Biol. 2012;6:12.

    Google Scholar 

  19. 19.

    Lin YT, Hufton PG, Lee EJ, Potoyan DA. A stochastic and dynamical view of pluripotency in mouse embryonic stem cells. PLoS Comput Biol. 2018;14:24.

    Google Scholar 

  20. 20.

    Karlebach G, Shamir R. Modelling and analysis of gene regulatory networks. Nat Rev Mol Cell Biol. 2008;9:770–80.

    CAS  PubMed  Google Scholar 

  21. 21.

    Di Ventura B, Lemerle C, Michalodimitrakis K, Serrano L. From in vivo to in silico biology and back. Nature. 2006;443:527–33.

    PubMed  Google Scholar 

  22. 22.

    Gillespie DT, Physics, D. G.-T. J. of C. & 2000, undefined. The chemical Langevin equation. aip.scitation.org. J Chem Phys. 2000;113:64114.

  23. 23.

    Gillespie DT. Exact stochastic simulation of coupled chemical reactions. J Phys Chem. 1977;81:2340–61.

    CAS  Google Scholar 

  24. 24.

    Gillespie DT. Approximate accelerated stochastic simulation of chemically reacting systems. J Chem Phys. 2001;115:1716–33.

    CAS  Google Scholar 

  25. 25.

    Cao Y, Gillespie DT, Petzold LR. Avoiding negative populations in explicit Poisson tau-leaping. J Chem Phys. 2005;123:54104.

    Google Scholar 

  26. 26.

    Cao Y, Gillespie DT, Petzold LR. Efficient step size selection for the tau-leaping simulation method. J Chem Phys. 2006;124:44109.

    Google Scholar 

  27. 27.

    Haseltine EL, Rawlings JB. Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys. 2002;117:6959–69.

    CAS  Google Scholar 

  28. 28.

    Pahle J. Biochemical simulations: stochastic, approximate stochastic and hybrid approaches. Brief Bioinform. 2009;10:53–64.

    CAS  PubMed  PubMed Central  Google Scholar 

  29. 29.

    Hoops S, et al. COPASI—a COmplex PAthway SImulator. Bioinformatics. 2006;22:3067–74.

    CAS  PubMed  Google Scholar 

  30. 30

    Adalsteinsson D, McMillen D, Elston TC. Biochemical Network Stochastic Simulator (BioNetS): Software for stochastic modeling of biochemical networks. BMC Bioinform. 2004;5:1–21.

    Google Scholar 

  31. 31.

    Harris LA, Clancy P. A ‘partitioned leaping’ approach for multiscale modeling of chemical reaction dynamics. J Chem Phys. 2006;125:144107.

    PubMed  Google Scholar 

  32. 32.

    Lin YT, Buchler NE. Efficient analysis of stochastic gene dynamics in the non-adiabatic regime using piecewise deterministic Markov processes. J R Soc Interface. 2018;15:20170804.

    PubMed  PubMed Central  Google Scholar 

  33. 33.

    Akutsu T, Miyano S, Kuhara S. Identification of genetic networks from a small number of gene expression patterns under the Boolean network model. In: Pacific symposium on Biocomputing; 1999. p. 17–28. https://doi.org/10.1142/9789814447300_0003.

  34. 34.

    Liang S, Fuhrman S, Biocomputing, R. S.-P. Symposium on & 1998, undefined. Reveal, a general reverse engineering algorithm for inference of genetic network architectures. Citeseer. 1998;3:18–29.

    Google Scholar 

  35. 35.

    Yip KY, Alexander RP, Yan K-K, Gerstein M. Improved reconstruction of in silico gene regulatory networks by integrating knockout and perturbation data. PLoS ONE. 2010;5:e8121.

    PubMed  PubMed Central  Google Scholar 

  36. 36

    Scheinine A, et al. The challenges of systems biology inferring gene networks: dream or nightmare? Part 2: challenges 4 and 5. Ann N Y Acad Sci. 2009;1158:287–301. https://doi.org/10.1111/j.1749-6632.2008.04100.x.

    Article  PubMed  Google Scholar 

  37. 37.

    Pinna A, Soranzo N, de la Fuente A. From knockouts to networks: establishing direct cause-effect relationships through graph analysis. PLoS ONE. 2010;5:e12912.

    PubMed  PubMed Central  Google Scholar 

  38. 38.

    Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods. 2020;17:147–54.

    CAS  PubMed  PubMed Central  Google Scholar 

  39. 39

    Casadiego J, Nitzan M, Hallerberg S, Timme M. Model-free inference of direct network interactions from nonlinear collective dynamics. Nat Commun. 2017;8:1–10.

    CAS  Google Scholar 

  40. 40

    Papili Gao N, Ud-Dean SMM, Gandrillon O, Gunawan R. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics. 2018;34:258–66.

    PubMed  Google Scholar 

  41. 41.

    Che D, Guo S, Jiang Q, Chen L. PFBNet: a priori-fused boosting method for gene regulatory network inference. BMC Bioinform. 2020;21:1–13.

    Google Scholar 

  42. 42.

    Bonnaffoux A, et al. WASABI: a dynamic iterative framework for gene regulatory network inference. BMC Bioinform. 2019;20:1–19.

    Google Scholar 

  43. 43.

    Chu DF, Zabet NR, Hone ANW. Optimal parameter settings for information processing in gene regulatory networks. BioSystems. 2011;104:99–108.

    CAS  PubMed  Google Scholar 

  44. 44.

    Zabet NR, Chu DF. Computational limits to binary genes. J R Soc Interface. 2010;7:945–54.

    PubMed  Google Scholar 

  45. 45.

    Ben-Hur A, Siegelmann HT. Computation in gene networks. Chaos. 2004;14:145–51.

    CAS  PubMed  Google Scholar 

  46. 46

    Ziv E, Nemenman I, Wiggins CH. Optimal signal processing in small stochastic biochemical networks. PLoS ONE. 2007;2:e1077.

    PubMed  PubMed Central  Google Scholar 

  47. 47.

    Hu X, Maglia A, Wunsch DC. A general recurrent neural network approach to model genetic regulatory networks. In: Annual international conference on IEEE engineering and medical biologyproceedings, vol 7; 2005. p. 4735–4738.

  48. 48.

    Raza K, Alam M. Recurrent neural network based hybrid model for reconstructing gene regulatory network. Comput Biol Chem. 2016;64:322–34.

    CAS  PubMed  Google Scholar 

  49. 49.

    Ling H, Samarasinghe S, Kulasiri D. Novel recurrent neural network for modelling biological networks: oscillatory p53 interaction dynamics. Biosystems. 2013;114:191–205.

    CAS  PubMed  Google Scholar 

  50. 50.

    Peccoud J, Ycart B. Markovian modeling of gene-product synthesis. Theor Popul Biol. 1995;48:222–34.

    Google Scholar 

  51. 51.

    Novère NL, et al. The systems biology graphical notation. Nat Biotechnol. 2009;27:735–41.

    Google Scholar 

  52. 52.

    Blake WJ, Kaern M, Cantor CR, Collins JJ. Noise in eukaryotic gene expression. Nature. 2003;422:633–7.

    CAS  PubMed  Google Scholar 

  53. 53.

    Hooshangi S, Thiberge S, Weiss R. Ultrasensitivity and noise propagation in a synthetic transcriptional cascade. Proc Natl Acad Sci. 2005;102:3581–6.

    CAS  PubMed  PubMed Central  Google Scholar 

  54. 54.

    Cao Z, Grima R. Linear mapping approximation of gene regulatory networks with stochastic dynamics. Nat Commun. 2018;9:1–15.

    Google Scholar 

  55. 55.

    Thomas P, Straube AV, Grima R. The slow-scale linear noise approximation: an accurate, reduced stochastic description of biochemical networks under timescale separation conditions. BMC Syst Biol. 2012;6:1–23.

    Google Scholar 

  56. 56.

    Pineda FJ. Generalization of back-propagation to recurrent neural networks. Phys Rev Lett. 1987;59:2229.

    CAS  PubMed  Google Scholar 

  57. 57.

    Popp AP, Hettich J, Gebhardt JCM. Altering transcription factor binding reveals comprehensive transcriptional kinetics of a basic gene. Nucleic Acids Res. 2021;49:6249–66.

    CAS  PubMed  PubMed Central  Google Scholar 

  58. 58

    Skinner SO, et al. Single-cell analysis of transcription kinetics across the cell cycle. Elife. 2016;5:e12175.

    PubMed  PubMed Central  Google Scholar 

  59. 59.

    Padovan-Merhar O, et al. Single mammalian cells compensate for differences in cellular volume and DNA copy number through independent global transcriptional mechanisms. Mol Cell. 2015;58:339–52.

    CAS  PubMed  PubMed Central  Google Scholar 

  60. 60.

    Cao Z, Grima R. Analytical distributions for detailed models of stochastic gene expression in eukaryotic cells. Proc Natl Acad Sci U S A. 2020;117:4682–92.

    CAS  PubMed  PubMed Central  Google Scholar 

  61. 61.

    Stolovitzky G, Monroe D, Califano A. Dialogue on reverse-engineering assessment and methods: the DREAM of high-throughput pathway inference. Ann N Y Acad Sci. 2007;1115:1–22.

    PubMed  Google Scholar 

Download references

Acknowledgements

We thank Jana Meles for writing a user-friendly manual explaining the application and functionalities of CaiNet.

Funding

Open Access funding enabled and organized by Projekt DEAL. This work was supported by the European Research Council (ERC) under the European Union’s Horizon 2020 Research and Innovation Programme [637987 ChromArch] and the German Research Foundation (DFG) [GE 2631/2-1 and GE 2631/3-1].

Author information

Affiliations

Authors

Contributions

JH and JCMG designed the study; JH performed calculations and programmed CaiNet; JH and JCMG wrote the manuscript. Both authors read and approved the final manuscript.

Corresponding author

Correspondence to J. Christof M. Gebhardt.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Table S1.

Kinetic parameters of the repressive gene cascade (Figure 2 of the main text).

Additional file 2: Table S2.

Kinetic parameters of the exemplary biochemical reaction (Figure 4 of the main text).

Additional file 3: Table S3.

Kinetic parameters of the heterogeneous GRNs (Figure 5 of the main text).

Additional file 4: Table S4.

Kinetic parameters of the exemplary GRN used to demonstrate the network inference (Figure 6 of the main text).

Additional file 5: Table S5.

Number of inferred true and false positive and true and false negative connections for networks of N = 5 to 10 genes as result of the network inference process (Figure 6 of the main text).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Hettich, J., Gebhardt, J.C.M. Periodic synchronization of isolated network elements facilitates simulating and inferring gene regulatory networks including stochastic molecular kinetics. BMC Bioinformatics 23, 13 (2022). https://doi.org/10.1186/s12859-021-04541-6

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s12859-021-04541-6

Keywords

  • Gene regulatory network
  • Transcriptional bursting
  • Gene expression noise
  • Hybrid deterministic–stochastic simulation
  • Network inference