 Research article
 Open Access
Reconstructing nonlinear dynamic models of gene regulation using stochastic sampling
 Johanna Mazur^{1},
 Daniel Ritter^{1},
 Gerhard Reinelt^{2} and
 Lars Kaderali^{1}Email author
https://doi.org/10.1186/1471210510448
© Mazur et al; licensee BioMed Central Ltd. 2009
 Received: 24 March 2009
 Accepted: 28 December 2009
 Published: 28 December 2009
Abstract
Background
The reconstruction of gene regulatory networks from time series gene expression data is one of the most difficult problems in systems biology. This is due to several reasons, among them the combinatorial explosion of possible network topologies, limited information content of the experimental data with high levels of noise, and the complexity of gene regulation at the transcriptional, translational and posttranslational levels. At the same time, quantitative, dynamic models, ideally with probability distributions over model topologies and parameters, are highly desirable.
Results
We present a novel approach to infer such models from data, based on nonlinear differential equations, which we embed into a stochastic Bayesian framework. We thus address both the stochasticity of experimental data and the need for quantitative dynamic models. Furthermore, the Bayesian framework allows it to easily integrate prior knowledge into the inference process. Using stochastic sampling from the Bayes' posterior distribution, our approach can infer different likely network topologies and model parameters along with their respective probabilities from given data. We evaluate our approach on simulated data and the challenge #3 data from the DREAM 2 initiative. On the simulated data, we study effects of different levels of noise and dataset sizes. Results on real data show that the dynamics and main regulatory interactions are correctly reconstructed.
Conclusions
Our approach combines dynamic modeling using differential equations with a stochastic learning framework, thus bridging the gap between biophysical modeling and stochastic inference approaches. Results show that the method can reap the advantages of both worlds, and allows the reconstruction of biophysically accurate dynamic models from noisy data. In addition, the stochastic learning framework used permits the computation of probability distributions over models and model parameters, which holds interesting prospects for experimental design purposes.
Keywords
 Posterior Distribution
 Gene Regulatory Network
 Bayesian Framework
 Hill Coefficient
 Network Inference
Background
Since in 2003 the Human Genome Project released the complete human genome sequence, there is great interest in the complex interplay between different genes and proteins. Instead of focusing on individual cellular components, interest has shifted to the interplay between these components, introducing the view that a biological system is more than the sum of its parts.
One of the most difficult problems in systems biology is the reconstruction of gene regulatory networks from experimental data. This difficulty arises from numerous sources, among them the combinatorial explosion of possible network topologies for a given number of genes, limited information content and high levels of noise in experimental data, limited amounts of data, and the complexity of regulatory processes in cells during transcription, translation and posttranslation.
Many approaches have been proposed to infer networks from data, good reviews are, for example, [1–8]. A common method to represent dynamics in biochemical systems are differential equations [9]. Rich mathematical theory has been established for their solution and analysis, and can be exploited [10, 11]. Linear differential equation models have been proposed to infer gene regulatory networks [12, 13]. These are attractive models due to the low number of parameters and their analytical tractability. However, since biological networks are typically highly nonlinear, linear differential equations are usually not adequate to accurately capture a regulatory network's dynamic behavior [14, 15]. Some authors argue that if a system is linearized around a specific point of interest, e.g., a steady state, one may describe the local behavior using linear models [16–20].
To describe more complex dynamic behavior, nonlinear models are needed. Such models can describe nonlinear behavior such as oscillations, multistationarity and biochemical switches. Furthermore, by using differential equations which are based on chemical reaction kinetics, model parameters directly correspond to reaction rates, thus models and model parameters can be immediately interpreted biochemically [21].
On the other hand, due to the highdimensional search space, inference of nonlinear models from data is much more complex than linear system identification, and serious problems with overfitting and nonidentifiability arise.
Nevertheless, nonlinear models are increasingly being used, and are very likely to play an important role in our ability to understand progressively more complex systems in the future. Bongard and Lipson recently published a method that can be used to symbolically infer nonlinear systems without prior specification of a model class, which they applied to simulated data of a threecomponent model of the lac operon [22]. While such a modelfree approach is very interesting, it remains to be seen whether the methodology can be extended to larger networks.
Making assumptions about the underlying model class, Spieth et al. used Ssystems [23, 24], generalized linear models [25] and socalled HSystems and inferred models with up to 10 genes from data, using different search strategies, including evolutionary algorithms [26–28]. A cooperative, coevolutionary algorithm was used by Kimura et al. for the inference of Ssystem models of genetic networks [29]. Perkins et al. used partial differential equations to reverse engineer the Gap gene network in Drosophila melanogaster [30]. Busch et al. recently used an approach related to delay differential equations to infer the regulatory network underlying keratinocyte migration [31].
While models based on differential equations provide a quantitative dynamic description of a system under consideration, they completely disregard the stochastic nature of biological data. Linear stochastic differential equations have been proposed for this reason [32], but they still require strong assumptions, and it is unclear if larger, nonlinear stochastic differential equation models of genetic regulatory networks can successfully be inferred from experimental data.
A further difficulty with differential equation models is, that it is not straightforward to compute probability distributions over alternative models or model parameters. This would be most useful in particular if several alternative models fit the data well, and could be used to design additional experiments. Furthermore, such information would make it possible to consider alternative scenarios also in simulationbased perturbation studies, e.g., when interest is on the effect of potential drug candidates. The problems of overfitting and nonidentifiability of models typically encountered with nonlinear differential equation models can be addressed by regularization [20, 32, 33], or by including additional biological knowledge in the inference process [34, 35]. However, the former requires setting a regularization parameter, which is often a nontrivial problem, whereas the latter approach requires a systematic way to include such information in parameter estimation. Both issues can nicely be addressed in a Bayesian framework.
We therefore embed a nonlinear ordinary differential equation (ODE) model, based on chemical reaction kinetics, into a Bayesian framework. Network inference then amounts to evaluating the posterior distribution over models and model parameters, given the experimental data. A related idea has recently been pursued by Steinke et al. for linear models [20]. In their paper, the authors use expectation propagation to evaluate the posterior distribution. We combine a nonlinear differential equation model based on chemical reaction kinetics with a Bayesian framework, and use a Markov chain Monte Carlo (MCMC) approach to sample from the posterior distribution.
A difficulty with this approach comes from the fact, that it requires solving the differential equations at every step in the Markov chain. To avoid this problem, we adapt the parameter estimation method proposed by Ramsay et al. [36] for Markov chains. The authors iteratively fit smoothing splines to the data, and then learn the parameters of the differential equations using a least squares procedure on the slope estimates of the splines and the differential equations. The idea to carry out the optimization on slopes instead of concentrations was first suggested by Varah in 1982 [37]. It was then improved by Poyton et al., who proposed to iterate between spline interpolation and parameter estimation [38]. Ramsay et al. further improved this approach by proposing profiled estimation [36]. We adapt their objective function, and sample both model parameters and smoothing factor using a Markov chain. The combination of differential equations with a Bayesian framework proposed in this work allows it to adequately describe the nonlinear dynamic behavior of gene regulatory networks, and to incorporate prior knowledge into the network inference process at the same time. In contrast to simple optimization of the posterior distribution as we pursued in previous work [39], the MCMC approach used here provides confidences on learned parameters and computes probability distributions over alternative network topologies. This can be used to consider alternative future scenarios in simulation, and permits the design of additional, most informative experiments to improve the inference procedure.
Methods
Differential Equations Model
where m denotes the Hill coefficient and θ_{ j }is related to the reaction rate by describing the concentration of x_{ j }needed to significantly activate or inhibit expression of x_{ i }. To keep the number of parameters small, we use one joint Hill coefficient m for all regulations, and use the same threshold parameter θ_{ j }for all regulations out of the same gene x_{ j }. The regulation functions (2) are obtained from chemical reaction kinetics by considering the binding process of a transcription factor to a promoter a reversible chemical reaction. For a more detailed derivation, see, for example, [40–42].
Parameter Estimation of ODE systems
The estimation of model parameters from experimental time series data for differential equation models is typically carried out iteratively in two steps: (1) numerically solve the differential equations for the time interval of interest, and (2) compute an error between experimental data points and model prediction. Initial values and model parameters are then modified to minimize this error. The disadvantage of this procedure is, that the differential equations have to be solved numerically in every iteration of the optimization, which is very time consuming.
Here, (t_{ τ }, D) is the slope estimate from the cubic splines z(t, D), ω are the differential equation model parameters, T is the number of different time points t_{ τ }and n is the number of time series to be fitted, for example, the number of different genes in the network.
An obvious drawback of Varah's approach is, that the quality of the parameter estimates can only be as good as the spline fit, which is particularly difficult in case of noisy data. To address this problem, Poyton et al. developed a recursive method, called iterative principal differential analysis, where the two steps of Varah's method are iterated, and the model predictions are fed back into the spline estimation [38]. Ramsay et al. improved this method further using a generalization of profiled estimation to learn the parameters of interest [36].
where d_{ iτ }denotes the measured data, T_{1} is the number of time points in the experimental data and T_{2} denotes the number of points to be used in the squared error parameter fitting on the slopes. To adequately describe the dynamics of a system using derivatives, a large number of slope estimates (over time) is required, we will therefore usually have T_{2} >> T_{1}. We note that equation (1) requires concentrations to compute the derivatives, these are taken from the spline interpolation.
Bayesian Learning Framework
We now address two further problems in parameter estimation, regarding the entire topology of the regulatory network, and variability in experimental measurements. The network topology (which genes have regulatory interactions between them) can either be determined in a separate step prior to parameter estimation, or can be solved implicitly by assuming a fully connected network, and pruning edges with very small regulation strengths afterwards. Determination of the network topology in a separate step has the disadvantage, that edges not included in this prior step cannot be reintroduced in parameter estimation. We therefore use the latter approach, with appropriate regularization to prune many edges during the inference process.
To account for noise in the experimental data, we embed the differential equations into a Bayesian framework. For this purpose, we assume that the measured data d_{ iτ }is corrupted by independent mean zero Gaussian noise with variance . The assumption of normally distributed noise is clearly a simplification, which is made here to keep the model simple. Other noise models could be used. We furthermore assume the differences between slope estimates and differential equation predictions to follow a normal distribution with mean zero and variance . We note that the ratio between and corresponds to a parameter that weighs the two terms in (3) relative to one another.
which is equivalent to equation (3) up to logtransformation and scaling.
where is given by the likelihood (4), is a prior distribution on the model parameters, and p(D) is a normalizing factor which is independent of ω, λ and . For simplicity, we treat the variance parameters and as user parameters, which are set in advance and not sampled.
Inclusion of Prior Knowledge
The prior distribution p(ω, λ) on the model parameters can be used to include available biological knowledge on the system under consideration into the network inference process, as demonstrated, for example, by Wehrli and Husmeier [35]. It is a huge advantage of the Bayesian framework that it allows the easy and systematic integration of such expert knowledge. If no such detailed knowledge is available, one can resort to very general assumptions, such as sparsity of the interaction network [43] or rough estimates of reasonable ranges for parameters.
MCMC Sampling from the Posterior
The posterior distribution (5) can now be maximized using, for example, gradient based methods, simulated annealing or genetic algorithms. However, since multiple parameter combinations, corresponding to alternative network topologies, may explain the data equally well, we sample from p(ω, λD) using Markov chain Monte Carlo. This way, full distributions over each parameter are available, and can be used, for example, to consider different likely topologies, and to design experiments that will resolve ambiguities. This would not be possible with simple maximization approaches.
To sample from p(ω, λD) we use an iterative approach. First, we sample the model parameters ω using the Hybrid Monte Carlo algorithm (HMC), with fixed smoothing factor λ. HMC has originally been proposed by Duane et al. for problems arising in quantum chromodynamics [46], and has been introduced to the general Bayesian statistical community by Neal [47]. The method samples points from a given ndimensional distribution p(η) by introducing momentum variables ρ = (ρ_{1}, ρ _{2},..., ρ _{ n }) with associated energy K(ρ), and iterative sampling for the momentum variables from K(ρ) and following the Hamiltonian dynamics of the system H(η, ρ):= ln p(η) + K(ρ) in time. Doing so, HMC generates a sequence of points distributed according to p(η), and can avoid the random walk behavior typically observed with the Metropolis Hastings algorithm [47].
Iterative Markov chain Monte Carlo Algorithm
Algorithm 1 Iterative Hybrid Monte Carlo and Metropolis Hastings algorithm 

Require: desired distribution p(·), starting value (ω^{0}, λ ^{0}), proposal distribution q_{ λ }(·λ ^{(t)}), number of leapfrog steps for HMC L, proposal distribution for stepsize ϵ of leapfrog steps q_{ϵ}(·), standard deviation σ_{ ρ }for the sampling of the momentum variables ρ, number of Markov chain samples T 
1: t ← 0 
2: while t <T do 
3: Sample from q_{ϵ}(·) 
4: Sample from for all i ∈ {1,..., n} 
5: Perform L leapfrog steps with stepsize starting at state (ω^{(t)}, ρ ^{(t)}) 
6: Store resulting candidate state in 
7: Sample u_{1} from (0, 1) 
8: α_{1} ← min {1, exp H(ω^{(t)}, ρ^{(t)})  H )} 
9: if u_{1} <α_{1} then 
10: ω^{(t+1)}← 
11: else 
12: ω^{(t+1)}← ω^{(t)} 
13: end if 
14: Sample from q_{ λ }(·λ^{(t)}) 
15: Sample u_{2} from (0, 1) 
16: 
17: if u_{2} <α_{2} then 
18: λ^{(t+1)}← 
19: else 
20: λ^{(t+1)}← λ^{(t)} 
21: end if 
22: Append (ω^{(t+1)}, λ^{(t+1)}) to Markov chain 
23: t ← t + 1 
24: end while 
25: return Markov chain 
Evaluation of Reconstructed Networks
To evaluate reconstructed networks, we summarize the Markov chains by using the mean value of the chain for each model parameter, after excluding points from the burnin phase of the chain. This is of course a very crude simplification, which we take to allow for an automated, quantitative evaluation of reconstructed networks. Obviously, in case of, for example, bimodal distributions, the mean will be located somewhere between the two modes, possibly in a region with very low probability mass. We therefore emphasize here that the full set of points sampled can and should be analyzed in more detail.
To quantitatively evaluate inferred networks, we use receiver operator characteristic (ROC) and precision to recall (PR) analysis, and summarize these using the area under the curve (AUC). For twoclass classification problems (e.g. edge present/not present), ROC curves plot sensitivity against 1specificity for varying thresholds on the predictor (for example, absolute value of average edge weight β_{ ij }), whereas PR curves similarly plot precision against recall (= sensitivity). The AUC is then simply the area under the ROC or PR curve, and on a scale from 0 to 1 provides a single number to measure the quality of a predictor. We note that changing the threshold in ROC and PR curves corresponds to different thresholds for edge pruning in reconstructed networks.
Classification Matrix for ROC/PR Evaluation
predicted  

positive link  negative link  nonexistent link  
positive link  TP  FP  FN  
actual  negative link  FP  TP  FN 
nonexistent link  FP  FP  TN 
Implementation
We implemented our algorithm in Matlab, Release 2008b (The Mathworks), using the spline and statistics toolboxes. Computations were carried out on a Linux cluster with dualprocessor 3.1 GHz XEON quadcore machines with 32 GB RAM, running each Markov chain in a single thread (no additional parallelization). The Matlab code is available on request from the authors.
Results
Simulated Five Gene Network
Data was simulated using the differential equation model (1), with synthesis and degradation rates s = (0.2, 0.2, 0.2, 0.2, 0.2) and γ = (0.9, 0.9, 0.9, 1.5, 1.5) for the five genes. The threshold parameter of the Hill functions was set to θ = (1.5, 1.5, 1.5, 1.5, 1.5), with Hill coefficient m = 5. The parameters for the regulation strength were set to β_{ ij }= 2 for activations, β_{ ij }= 2 for inhibitions, and zero if there was no interaction between to genes, compare Figure 3a.
Data was simulated by numerical integration of the differential equations in Matlab using the function ode45. Simulated data shows oscillations for all genes, see Figure 3b. To simulate the typical setting in network inference, where only a limited number of noisy measurements are available, we evaluated our network reconstruction approach using different numbers of time points subsampled equidistantly from the simulated data, and added meanzero normally distributed noise with different standard deviations to the concentration values. We then used our method to reconstruct the original network from this data. For this purpose, we sampled 110, 000 points for (ω, λ) using the algorithm described in Table 1, with a burnin of 10, 000 points. Parameters of the prior distributions were set to a = 1, r = 2 for the gamma prior on synthesis and degradation rates, a = 1.5, r = 5.0 for the gamma prior on the Hill coefficient m, a' = 100, b' = 10 for the beta prior on λ, and q = 0.5, s = 1 for the L_{q} prior on the regulation strengths β_{ ij }. Shape and scale parameters for the gamma priors on the θ_{ j }for each gene j where chosen such that mean and variance of the priors correspond to mean and variance of the training data. The number of slope estimates T_{2} is set to 1000 and the corresponding variance is set to 1. Furthermore, the variance is set to T_{1}/T_{2}, where T_{1} denotes the number of time points.
Results on 40 time points
Inferred Interaction Strength Parameters for Simulated Data for Dataset with 40 Time Points
To ↓/From →  Gene 1  Gene 2  Gene 3  Gene 4  Gene 5 

Gene 1  0.53 ± 0.72  0.22 ± 0.41  0.10 ± 0.43  1.37 ± 1.16  0.88 ± 0.74 
Gene 2  1.54 ± 0.78  0.28 ± 0.42  0.49 ± 0.62  0.33 ± 0.55  0.28 ± 0.53 
Gene 3  1.35 ± 0.80  0.34 ± 0.57  1.07 ± 1.59  0.55 ± 0.63  0.26 ± 0.46 
Gene 4  0.23 ± 0.57  0.35 ± 0.61  0.51 ± 0.64  0.40 ± 0.69  0.84 ± 0.74 
Gene 5  0.01 ± 0.31  0.62 ± 0.67  0.91 ± 0.69  0.55 ± 0.81  0.65 ± 0.63 
Precision to recall and receiver operator characteristic analysis of results yield AUC values of AUC_{PR} = 0.516 (guessing: 0.14) for precision to recall curves and AUC_{ROC} = 0.706 (guessing: 0.358) for sensitivity vs. 1specificity curves.
To close the circle from the original concentration data over the reconstructed model back to dynamic simulation, we used the mean inferred model parameters to simulate gene concentrations. This simulation shows an accurate match of simulated and experimental data (not shown). It is well known that fitting models to oscillating data, and even more so, reconstructing networks from such data, are extremely hard problems, since models tend to learn a steady state [50]. In spite of this, oscillations were captured with high precision by our approach (see Figure 3b).
Effect of Noise and Dataset Size
We next studied the effect of different dataset sizes (number of time points) and different levels of noise in the data on the quality of network reconstruction. For this purpose, we added mean zero Gaussian noise with standard deviations 0.05, 0.1, 0.15, 0.2 and 0.3 to the simulated concentration data, and furthermore subsampled equidistantly from the time series to generate data sets with T_{1} = 10, 20, 30, 40, 50, 70, 90, 110, 140, 170 and 200 different time points for each of the five noise levels. Then the network reconstruction was performed as described in the methods section.
Evaluation on Experimental Data: The DREAM 2, Challenge 3 Dataset
To assess the performance of different reverse engineering approaches, Stolovitzky et al. fathered the DREAM (Dialogue on Reverse Engineering Assessment and Methods) initiative [51]. For this purpose, Cantone et al. provided invivo data on a small, bioengineered fivegene network, which was posted as challenge #3 within DREAM 2 [52]. This data was generated by inserting new promoter/gene combinations directly into the chromosomal DNA of budding yeast. Two time series of gene expression of the five inserted genes after stimulation were measured using quantitative PCR, measuring 15 time points in 3 minute intervals in time series 1, and 11 time points in 5 minute intervals in time series 2.
Measurements in both time series consist of negative (base 2) logratios of the genes of interest to housekeeping genes, we therefore transformed the measured data to recover the original ratios. We used the first time series (3 minute interval data) for network inference, i.e., T_{1} = 15.
Figure 3a shows the original engineered network. The topology is the same that we used in the simulation study. We attempted to reconstruct this network from the experimental data alone. For this purpose we ran a Markov chain with 60, 000 steps and a burnin of 10, 000. The parameters used for the gamma prior for the synthesis rate, degradation rate and Hill coefficient, T_{2}, and were set as described for the simulated data. We set the parameters for the beta prior on the smoothing factor to a' = 5 and b' = 100.
The hyperparameters for the L_{q} prior on the regulation strengths were set to q = 1 and s = 2. Parameters for the gamma priors on the threshold parameters were manually set to concentrate probability mass near the mean concentration value for each gene individually.
An evaluation of results using the mean of the Markov chain for each parameter, as done for the simulated data, results in AUC values that are equivalent to guessing (data not shown). This might indicate that either the level of noise present in the experimental data is too high, or that the posterior distribution has multiple modes, with the mean being an inappropriate summary statistic. We therefore searched the points sampled for the maximum aposteriori mode, and evaluate this mode further in the following. Clearly, data from the additional modes are available and can be studied similarly.
Inferred Interaction Strength Parameters for DREAM 2 Challenge #3 Data
To ↓/From →  Gene 1  Gene 2  Gene 3  Gene 4  Gene 5 

Gene 1  1.03  0.03  0.05  0.62  0.12 
Gene 2  1.09  1.55  0.09  0.00  0.10 
Gene 3  0.10  0.04  0.56  0.16  0.11 
Gene 4  0.15  0.03  0.43  0.26  0.09 
Gene 5  0.77  0.16  0.01  0.01  0.44 
To assess the quality of this result, we next compared performance of our approach to the performance of other approaches submitted to the DREAM 2 challenge. For this purpose, we computed performance measures as were used in the DREAM 2 challenge for our inferred network, and show results in Table 5. AUC values for this comparison were calculated as described in [51]. Since our method gives us a topology with positive and negative regulation strengths, we have transformed our results to be suitable for the evaluation method used in DREAM 2:

By skipping the sign and dividing by the largest learned regulation strength for the DIRECTEDUNSIGNED challenge.

For the two DIRECTEDSIGNED challenges we only took the regulation strengths with the appropriate sign and divide them by the highest absolute regulation strength.
Results of DREAM 2 Challenge #3 Data Compared to Other Approaches
Challenge  Best submitted  Our method  No selfregulations 

DIRECTEDSIGNEDEXCITATORY  AUC = 0.79  AUC = 0.61  AUC = 0.79 
AUC_{PR} = 0.72  AUC_{PR} = 0.25  AUC_{PR} = 0.54  
DIRECTEDSIGNEDINHIBITORY  AUC = 0.63  AUC = 0.96  AUC = 0.96 
AUC_{PR} = 0.14  AUC_{PR} = 0.45  AUC_{PR} = 0.45  
DIRECTEDUNSIGNED  AUC = 0.73  AUC = 0.56  AUC = 0.79 
AUC_{PR} = 0.55  AUC_{PR} = 0.30  AUC_{PR} = 0.57 
Our method outperforms all submitted results in the challenge DIRECTEDSIGNEDINHIBITORY. One difficulty we observed was, that our approach learned many strong selfregulations of genes, possibly because of an improper balancing of the priors on synthesis/degradation rates and regulation strengths. Since there are no selfregulations in the DREAM 2 challenge #3 data, we provide an additional evaluation when disregarding selfregulations, results are shown in the third column of Table 5. In this case, we not only outperform all submitted approaches in the DIRECTEDSIGNEDINHIBITORY challenge, but also beat the best models in the DIRECTEDUNSIGNED challenge.
Discussion and Conclusions
We have developed a novel methodological approach to reverse engineer gene regulatory networks from gene expression time series data, and evaluated this approach on both simulated and real gene expression data. The combination of ordinary differential equations and the Bayes' regularized inference technique can be used for the quantitative analysis of complex cellular processes. Nonlinear differential equations are able to describe complex dynamic behavior, and a rich mathematical theory for analyzing them is well established. Our method combines these advantages of differential equations with the advantages of a Bayesian framework, which is able to capture noise in data, makes it possible to include biological knowledge into the learning process, and allows the computation of probability distributions over model topologies and model parameters.
The latter is one of the main advantages of the MCMC approach. The information about distributions can be used to make predictions of future states of the network together with confidence intervals on the predictions made. This may allow it to take alternative future scenarios into account, and could be used to design additional most informative experiments that will help to distinguish between corresponding topologies or parameter sets. We therefore think that our approach will be highly useful to elucidate regulatory networks in an iterative procedure with several rounds of experiment, network inference, and experiment design.
In contrast to the usual approach of minimizing an error between experimentally measured concentrations and model predictions, our likelihood function uses the difference between model slopes and experimental slopes. We furthermore integrate a smoothing spline approximation into the likelihood, automatically performing an optimized tradeoff between an accurate representation of the experimental data, and smoothing out noise. Fitting of model parameters on slopes has the advantage that no numerical integration of the model is required in each step of the optimization or sampling. Instead, we must estimate smoothing splines and slopes. This can be carried out much faster than numerical integration, enabling us to use a Markov chain sampler on the posterior distribution instead of plain maximization. We have evaluated our approach on simulated and on real experimental data from a synthetic gene regulatory network. On the simulated data, we have shown that our approach can reconstruct the underlying topology with high accuracy. As expected, performance deteriorates with increasing levels of noise and with decreasing number of different time points available. We emphasize that the simulated example chosen is a difficult learning task due to the oscillations in the data. It is obvious that sufficient data points are required to sample the full dynamics of the oscillating network, and that oscillations quickly break down in the presence of noise.
On the DREAM 2 data, our method yields superior results when compared to other approaches that were submitted to DREAM 2 in the DIRECTEDSIGNEDINHIBITORY and DIRECTEDUNSIGNED categories. Importantly, our analysis shows that there are multiple posterior modes that describe the data well, which may explain a surprising result of the original DREAM 2 challenge: As reported by Stolovitzky et al., none of the submitted models were able to accurately reconstruct the original network topology from the synthetic data [51]. Our results indicate that this might be due to a dense population of local optima, in which network reconstruction approaches looking for a single optimal topology might get trapped and return suboptimal solutions. An obvious conclusion is that further experiments are required to resolve ambiguities in network reconstruction. This emphasizes the need for robust and efficient methods for optimum experimental design. Our sampling approach may be a good starting point for such experiment design, since it analyzes the full distribution over model parameters, and therefore yields information on alternative network topologies and confidence intervals on parameters, which are instrumental to design experiments that elucidate the network topology further.
Declarations
Acknowledgements
We would like to thank A. Szabowski, H. Busch, N. Radde and R. Eils for helpful comments and discussions. Furthermore, we thank M. P. Cosma and D. di Bernardo for the provision of the challenge #3 data from DREAM 2. We are grateful for funding to the German Ministry of Education and Research (BMBF), grant number 0313923 (FORSYS/Viroquant).
Authors’ Affiliations
References
 de Jong H: Modeling and Simulation of Genetic Regulatory Systems: A Literature Review. Journal of Computational Biology 2002, 9: 67–103. 10.1089/10665270252833208View ArticlePubMedGoogle Scholar
 Garner TS, Faith JJ: Reverseengineering transcription control networks. Physics of Life Reviews 2005, 2: 65–88. 10.1016/j.plrev.2005.01.001View ArticleGoogle Scholar
 Filkov V: Identifying Gene Regulatory Networks from Gene Expression Data. In Handbook of Computational Molecular Biology. Edited by: Aluru S. Boca Raton, FL, USA: Chapman & Hall/CRC Press; 2005:27.1–27.29.Google Scholar
 Goutsias J, Lee NH: Computational and Experimental Approaches for Modeling Gene Regulatory Networks. Current Pharmaceutical Design 2007, 13: 1415–1436. 10.2174/138161207780765945View ArticlePubMedGoogle Scholar
 Bansal M, Belcastro V, AmbesiImpiombato A, di Bernardo D: How to infer gene networks from expression profiles. Molecular Systems Biology 2007., 3(78):Google Scholar
 Cho KH, Choo SM, Jung SH, Kim JR, Choi HS, Kim J: Reverse engineering of gene regulatory networks. IET Systems Biology 2007, 1(3):149–163. 10.1049/ietsyb:20060075View ArticlePubMedGoogle Scholar
 Markowetz F, Spang R: Inferring cellular networks  a review. BMC Bioinformatics 2007, 8(Suppl 6):S5. 10.1186/147121058S6S5PubMed CentralView ArticlePubMedGoogle Scholar
 Kaderali L, Radde N: Inferring Gene Regulatory Networks from Expression Data.In Computational Intelligence in Bioinformatics, Volume 94 of Studies in Computational Intelligence Edited by: Kelemen A, Abraham A, Chen Y. Berlin: Springer; 2008, 33–74. [http://www.springerlink.com/content/t100323m8141840k/?p=9f516592e7b4486991d1928407f55d48&pi=4]View ArticleGoogle Scholar
 Voit EO: Computational Analysis of Biochemical Systems: A Practical Guide for Biochemists and Molecular Biologists. Cambridge: Cambridge University Press; 2000.Google Scholar
 Wiggins S: Introduction to applied nonlinear dynamical systems and chaos. New York: Springer; 2003.Google Scholar
 Heuser H: Gewöhnliche Differentialgleichungen. Wiesbaden: Teubner; 2006.Google Scholar
 Chen T, He H, Church G: Modeling gene expression with differential equations. Pacific Symposium on Biocomputing 1999, 4: 29–40.Google Scholar
 Gardner TS, di Bernardo D, Lorenzo D, Collins JJ: Inferring Genetic Networks and Identifying Compound Mode of Action via Expression Profiling. Science 2003, 301(5629):102–105. 10.1126/science.1081900View ArticlePubMedGoogle Scholar
 Savageau MA: Biochemical systems analysis. London [u.a.]: AddisonWesley; 1976.Google Scholar
 Heinrich R, Schuster S: The regulation of cellular systems. NewYork [u.a.]: Chapman & Hall; 1996.View ArticleGoogle Scholar
 Yeung MKS, Tegnér J, Collins JJ: Reverse engineering gene networks using singular value decomposition and robust regression. PNAS 2002, 99(9):6163–6168. 10.1073/pnas.092576199PubMed CentralView ArticlePubMedGoogle Scholar
 Kholodenko BN, Kiyatkin A, Bruggeman FJ, Sontag E, Westerhoff HV: Untangling the wires: A strategy to trace functional interactions in signaling and gene networks. PNAS 2002, 99(20):12841–12846. 10.1073/pnas.192442699PubMed CentralView ArticlePubMedGoogle Scholar
 Sontag E, Kiyatkin A, Kholodenko BN: Inferring dynamic architecture of cellular networks using time series of gene expression, protein and metabolite data. Bioinformatics 2004, 20(12):1877–1886. 10.1093/bioinformatics/bth173View ArticlePubMedGoogle Scholar
 Schmidt H, Cho KH, Jacobson EW: Identification of small scale biochemical networks based on general type systems perturbations. FEBS Journal 2005, 272: 2141–2151. 10.1111/j.17424658.2005.04605.xView ArticlePubMedGoogle Scholar
 Steinke F, Seeger M, Tsuda K: Experimental design for efficient identification of gene regulatory networks using sparse Bayesian models. BMC Systems Biology 2007., 1(51):Google Scholar
 de Jong H, Page M: Qualitative simulation of large and complex genetic regulatory systems. In Proceedings of the 14th European Conference on Artificial Intelligence. Edited by: Horn W. Amsterdam: IOS Press; 2000:141–145.Google Scholar
 Bongard J, Lipson H: Automated reverse engineering of nonlinear dynamical systems. Proceedings of the National Academy of Sciences USA 2007, 104: 9943–9948. 10.1073/pnas.0609476104View ArticleGoogle Scholar
 Savageau MA: Biochemical Systems Analysis: III. Dynamic Solutions using a Powerlaw Approximation. Journal of theoretical Biology 1970, 26(2):215–226. 10.1016/S00225193(70)800133View ArticlePubMedGoogle Scholar
 Savegeau MA: Biochemical Systems Theory: Operational Differences Among Variant Representations and their Significance. Journal of theoretical Biology 1991, 151(4):509–530. 10.1016/S00225193(05)803674View ArticleGoogle Scholar
 Weaver D, Workman C, Stormo G: Modeling regulatory networks with weight matrices. In Pacific Symposium on Biocomputing. Volume 4. World Scientific; 1999:112–123.Google Scholar
 Spieth C, Streichert F, Speer N, Zell A: A memetic inference method for gene regulatory networks based on SSystems. Evolutionary Computation 2004, 1: 152–157.Google Scholar
 Spieth C, Streichert F, Speer N, Zell A: MultiObjective Model Optimization for Inferring Gene Regulatory Networks. In Lecture Notes in Computer Science, Volume 3410/2005. Springer; 2005:607–620.Google Scholar
 Spieth C, Hassis N, Streichert F: Comparing Mathematical Models on the Problem of Network Inference. Proceedings 8th Annual Conference Genetic and evolutionary computation (GECCO 2006) 2006, 279–285. [http://portal.acm.org/citation.cfm?id=1143997.1144045]View ArticleGoogle Scholar
 Kimura S, Ide K, Kashihara A, Kano M, Hatakeyama M, Masui R, Nakagawa N, Yokoyama S, Kuramitsu S, Konagaya A: Inference of Ssystem models of genetic networks using a cooperative coevolutionary algorithm. Bioinformatics 2005, 21(7):1154–1163. 10.1093/bioinformatics/bti071View ArticlePubMedGoogle Scholar
 Perkins TJ, Jaeger J, Reinitz J, Glass L: Reverse engineering the Gap gene network of drosophila melanogaster. PLoS Computational Biology 2006, 2: e51. 10.1371/journal.pcbi.0020051PubMed CentralView ArticlePubMedGoogle Scholar
 Busch H, CamachoTrullio D, Rogon Z, Breuhahn K, Angel P, Eils R, Szabowski A: Gene Network Dynamics controlling Keratinocyte Migration. Molecular Systems Biology 2008., 4(199):Google Scholar
 Chen KC, Wang TY, Tseng HH, Huang CYF, Kao CY: A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae. Bioinformatics 2005, 21(12):2883–2890. 10.1093/bioinformatics/bti415View ArticlePubMedGoogle Scholar
 Nachman I, Regev A, Friedman N: Inferring quantitative models of regulatory networks from expression data. Bioinformatics 2004, 201: i248i256. 10.1093/bioinformatics/bth941View ArticleGoogle Scholar
 van Someren EP, Vaes BLT, Steegenga WT, Sijbers AM, Dechering KJ, Reinders MJT: Least absolute regression network analysis of the murine osteoblast differentiation network. Bioinformatics 2006, 22: 477–484. 10.1093/bioinformatics/bti816View ArticlePubMedGoogle Scholar
 Wehrli AV, Husmeier D: Reconstructing gene regulatory networks with Bayesian networks by combining expression data with multiple sources of prior knowledge. Statistical Applications in Genetics and Molecular Biology 2007, 6: 15.Google Scholar
 Ramsay JO, Hooker G, Campbell D, Cao J: Parameter estimation for differential equations: a generalized smoothing approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 2007, 69(5):741–796. 10.1111/j.14679868.2007.00610.xView ArticleGoogle Scholar
 Varah JM: A spline least squares method for numerical parameter estimation in differential equations. SIAM Journal on Scientific & Statistical Computing 1982, 3: 28–46.View ArticleGoogle Scholar
 Poyton AA, Varziri MS, McAuley KB, McLellan PJ, Ramsay JO: Parameter estimation in continuoustime dynamic models using principal differential analysis. Computers and Chemical Engineering 2006, 30: 698–708. 10.1016/j.compchemeng.2005.11.008View ArticleGoogle Scholar
 Radde N, Kaderali L: Bayesian Inference of Gene Regulatory Networks Using Gene Expression Time Series Data.In BIRD, Volume 4414 of Lecture Notes in Computer Science Edited by: Hochreiter S, Wagner R. Berlin: Springer; 2007, 1–15. [http://www.springerlink.com/content/k69807u72u21458u/?p=9f516592e7b4486991d1928407f55d48&pi=1]Google Scholar
 Alon U: An Introduction to Systems Biology: Design Principles of Biological Circuits. Boca Raton, FL, USA: Chapman & Hall/CRC; 2006.Google Scholar
 Jacob F, Monod J: Genetic regulatory mechanisms in the synthesis of proteins. Journal of Molecular Biology 1961, 3: 318–356.View ArticlePubMedGoogle Scholar
 Yagil G, Yagil E: On the Relation between Effector Concentration and the Rate of Induced Enzyme Synthesis. Biophysical Journal 1971, 11: 11–27. 10.1016/S00063495(71)861921PubMed CentralView ArticlePubMedGoogle Scholar
 Arnone MI, Davidson EH: The hardwiring of development: organization and function of genomic regulatory systems. Development 1997, 124: 1851–1864.PubMedGoogle Scholar
 Liu Y, Zhang HH, Park C, Ahn J: Support vector machines with adaptive Lq penalty. Comput Stat Data Anal 2007, 51(12):6380–6394. 10.1016/j.csda.2007.02.006View ArticleGoogle Scholar
 Kaderali L, Zander T, Faigle U, Wolf J, Schultze JL, Schrader R: CASPAR: A Hierarchical Bayesian Approach to predict Survival Times in Cancer from Gene Expression Data. Bioinformatics 2006, 22: 1495–1502. 10.1093/bioinformatics/btl103View ArticlePubMedGoogle Scholar
 Duane D, Kennedy AD, Pendleton BJ, Roweth D: Hybrid Monte Carlo. Physics Letters B 1987, 195: 216–222. 10.1016/03702693(87)91197XView ArticleGoogle Scholar
 Neal RM: Bayesian Learning for Neural Networks. Secaucus, NJ, USA: SpringerVerlag New York, Inc; 1996.View ArticleGoogle Scholar
 Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH: Equations of State Calculations by Fast Computing Machines. The Journal of Chemical Physics 1953, 21(6):1087–1092. 10.1063/1.1699114View ArticleGoogle Scholar
 Hastings WK: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 1970, 57: 97–109. 10.1093/biomet/57.1.97View ArticleGoogle Scholar
 Radde N, Kaderali L: Inference of an oscillating model for the yeast cell cycle. Discrete Applied Mathematics 2009, 157(10):2285–2295. 10.1016/j.dam.2008.06.036View ArticleGoogle Scholar
 Stolovitzky G, Monroe D, Califano A: Dialogue on Reverse Engineering Assessment and Methods: The DREAM of high throughput pathway inference. Volume 1115. Annals of the New York Academy of Sciences; 2007:1–22.Google Scholar
 Cantone I, Marucci L, Iorio F, Ricci MA, Belcastro V, Bansal M, Santini S, di Bernardo M, di Bernardo D, Cosma MP: A Yeast Synthetic Network for In Vivo Assessment of ReverseEngineering and Modeling Approaches. Cell 2009, 137: 172–181. 10.1016/j.cell.2009.01.055View ArticlePubMedGoogle Scholar
 DREAM2 Challenge Scoring Methodology[http://wiki.c2b2.columbia.edu/dream/results/]
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.