- Methodology article
- Open Access
Global parameter estimation methods for stochastic biochemical systems
- Suresh Kumar Poovathingal^{1} and
- Rudiyanto Gunawan^{1}Email author
https://doi.org/10.1186/1471-2105-11-414
© Poovathingal and Gunawan; licensee BioMed Central Ltd. 2010
Received: 21 June 2010
Accepted: 6 August 2010
Published: 6 August 2010
Abstract
Background
The importance of stochasticity in cellular processes having low number of molecules has resulted in the development of stochastic models such as chemical master equation. As in other modelling frameworks, the accompanying rate constants are important for the end-applications like analyzing system properties (e.g. robustness) or predicting the effects of genetic perturbations. Prior knowledge of kinetic constants is usually limited and the model identification routine typically includes parameter estimation from experimental data. Although the subject of parameter estimation is well-established for deterministic models, it is not yet routine for the chemical master equation. In addition, recent advances in measurement technology have made the quantification of genetic substrates possible to single molecular levels. Thus, the purpose of this work is to develop practical and effective methods for estimating kinetic model parameters in the chemical master equation and other stochastic models from single cell and cell population experimental data.
Results
Three parameter estimation methods are proposed based on the maximum likelihood and density function distance, including probability and cumulative density functions. Since stochastic models such as chemical master equations are typically solved using a Monte Carlo approach in which only a finite number of Monte Carlo realizations are computationally practical, specific considerations are given to account for the effect of finite sampling in the histogram binning of the state density functions. Applications to three practical case studies showed that while maximum likelihood method can effectively handle low replicate measurements, the density function distance methods, particularly the cumulative density function distance estimation, are more robust in estimating the parameters with consistently higher accuracy, even for systems showing multimodality.
Conclusions
The parameter estimation methodologies described in this work have provided an effective and practical approach in the estimation of kinetic parameters of stochastic systems from either sparse or dense cell population data. Nevertheless, similar to kinetic parameter estimation in other modelling frameworks, not all parameters can be estimated accurately, which is a common problem arising from the lack of complete parameter identifiability from the available data.
Keywords
Background
Mathematical models form a cornerstone of systems biology and these models are usually constructed from available biological knowledge and data, which once validated, are subsequently analyzed to address specific biological questions. Many canonical modelling frameworks, from statistical Bayesian networks to differential equations, have been applied to capture a wide-variety of biological behaviours. Specifically, the dynamics related to cellular processes that involve low copy number of molecules, such as mRNA transcription, are best described as random and noisy events [1]. For example, cells in an iso-genetic population do not necessarily assume the same biological state, but rather exhibit variegated genetic expressions [2, 3]. In these examples, the distribution of cells is simulated by stochastic models that describe the probability density function (PDF) of cellular states. However, unlike differential equation models, the identification of stochastic models from experimental data of single cell or cell population data are not yet routine.
Despite the availability of high-throughput cell biology, the estimation of unknown (kinetic) model parameters from experimental data is still considered as the bottleneck in biological model identification, especially for dynamical models [4, 5]. The difficulty is generally attributed to the informativeness of the data, or the lack thereof, a property that is proportional to not only the quantity, but also the quality of data. Furthermore, in dynamical models, the time resolution of data is naturally of great importance. In recent years, advances in bio-imaging allow for real time measurements of cellular components such as mRNAs and proteins in individual cells through the use of fluorescent proteins [2, 3, 6–8]. Such measurements provide more in-depth and informative data about the states of a cell and variability in a cell population, than the traditional lumped measurements from cell culture lysate or tissue homogenate. The purpose of this work is to develop practical methods that can efficiently use these data in the parameter estimation framework for stochastic biochemical systems.
Chemical master equation (CME) is the most commonly adopted modelling framework to describe stochastic cellular dynamics [1–3] and thus is used as a benchmark application in this work. The estimation of unknown kinetic parameters from data in CME and other stochastic models has not been adequately addressed in the literature. Many of the published CME models use rate constants that are scaled from deterministic parameter values or selected ad-hoc to replicate desired behaviour. Since the low-copy-number random events can generate dynamics that are characteristically different from those in thermodynamic or deterministic limit [9, 10], deterministic model parameters identified from data collected under this limit or averaged over cell populations can be misleading. Furthermore, fitting deterministic models (e.g. ordinary differential equation) to stochastic data has been shown to give poor parameter estimates and model prediction [11]. Among the existing parameter estimation methods for stochastic biological models, some rely on Bayesian inference based on the stochastic differential equation [12, 13], while others are based on maximum likelihood (ML) methods. One ML method obtains parameter estimates by fitting transition density functions of stochastic differential equations in biochemical pathways [11]. A similar approach based on the ML of transitional probabilities requires measurements of the state trajectories at very fast sampling rate, whereby reactions are assumed to occur at most twice in a sampling time interval [14]. The fast sampling requirement makes this approach impractical, since biological data are typically sparse.
In this work, three kinetic parameter estimation methods for stochastic models were developed based on two criteria: maximum likelihood (ML) and density function distance (DFD). Two scenarios of practical application were considered involving both sparsely and densely populated datasets (i.e. low and high replicates). Since the distribution density functions are commonly constructed using histograms, an important aspect related to the binning strategy and the noise associated with finite sampling, has been incorporated in the parameter estimation framework. The efficacy of each method was evaluated and compared based on applications to three CME case studies: RNA dynamics in Escherichia coli, gene expression network of galactose uptake model in Saccharomyces cerevisiae, and a bimodal system comprising of a genetic toggle switch in E. coli. Despite the use of CME models here, the methods are generally applicable to other stochastic models in which the system behaviour or output can be characterized by a PDF of the states.
Methods
Chemical Master Equation
where the state x is an N-dimensional vector indicating the number of molecules of each species in the volume Ω, the density function P(x, t|x_{0}, t_{0}) denotes the probability that the system assumes the state configuration x_{ j }at time t, given the initial condition x_{0} at time t_{0}, the vector ν_{ j }gives the stoichiometric change in the molecular count of each species due to a single j-th reaction event, and k is the kinetic parameter vector. The function a_{ j }(x, k) is known as the propensity function, where a_{ j }(x, k)dt gives the probability of the j-th reaction to occur in the time interval t and t+ dt given the state x and parameters k. Due to the curse of dimensionality with increasing number of reacting species, the analytical solution of a CME is usually difficult, if not practically impossible, to obtain even for moderately sized systems [16].
In this work, Stochastic Simulation Algorithm (SSA) [16] was used to generate in silico experimental data for the purpose of parameter estimation and to solve for the PDF of the CME model. Briefly, at any given time and state configuration, the algorithm takes two uniform random numbers, from which the time to the next reaction and the reaction index are determined as a function of the propensities [16, 17]. The histogram should reflect the true state PDF in the limit of the number of realizations tending to infinity. Since only a finite number of data samples are computationally feasible and experimentally practical, the error associated with histogram binning strategy is important, but this is not often discussed in existing literature of the CME. The shape of the resulting density function is known to be sensitive to the number and size of the bins, and the optimal binning distribution need not be of uniform sizes [18]. Characteristic features of a distribution such as bimodality may not be apparent when using bins that are too wide, while histograms can be significantly affected by random fluctuations associated with a small number of data points in bins that are too narrow. Although there is no hard and fast rule on the selection of bin sizes, the minimum number of realizations in each bin should typically range between 5 and 20 [19]. Unless stated otherwise, the histograms here are constructed such that each bin contains no fewer than ten occurrences. The noise due to the histogram construction using finite size random sample will be taken into account in the parameter estimation below.
In practice, the choice of numerical solvers for model equations determines the performance of any parameter estimation methods. For CME, there has been a tremendous development of numerical algorithms for computing the PDF solution, directly [20–22] or indirectly [15, 16, 23]. The SSA was selected in this work because this algorithm is equivalent to the CME [16, 17], motivating its use to generate in silico data. Consequently, the CME model was also solved using SSA, such that the efficacy of the proposed methods can be evaluated independently from the solvers. In this case, deficiencies of SSA will appear equally in both in silico data and the model solution.
Parameter Estimation Methods
The methods developed here are formulated as a minimization of distance measures between model predictions and experimental data. The first method makes use of the common likelihood function and the second involves a distance metric between density functions as predicted by the CME and the data. When experimental error is known or can be determined from data, this noise should be accounted for in the PDF solution. In this work, the error is assumed to be independent and identically distributed (i.i.d.) random samples from a normal distribution with zero mean and variance σ^{2} (N(0,σ^{2})), which are then added to the SSA realizations.
Maximum Likelihood (ML) Method
Density Function Distance (DFD) Method
As a reliable construction of a PDF typically requires a large number of replicates, the PDF distance may not be appropriate when only few replicates of data are available. On the other hand, the ML method above can be applied to datasets with low replicates, as it does not require the construction of a density function from the experimental data.
The binning distribution can be kept the same as the PDF, but this need not be necessarily so. Unlike PDF, the shape of CDF is less sensitive to noise from finite sampling, with the exception of the tail ends of the CDF near the minimum and maximum values of the states. An alternate formulation with a finer binning strategy gives a similar performance to the objective function above (data not shown). The lesser sensitivity to noise also makes the CDF distance method applicable to sparse datasets (low replicates), in which case the binning strategy is done based on the SSA realizations.
Global Optimization Algorithm
Aside from model solvers, the effectiveness of any parameter estimation methods also depends on the ability to find the global optima to the minimization problems. In the case of stochastic models, the error landscape is anticipated to be highly stochastic due to noise from finite experimental data points, which prevents the use of any optimization algorithms involving gradient search. Here, a variant of evolutionary algorithms, called Differential Evolution (DE), is used as a general purpose global optimization algorithm. This method can effectively handle diversified objective function planes [25], and like other evolutionary algorithms such as genetic algorithm (GA), DE starts with a random population member and looks for the global optima by generating new population members using successive recombination and mutations based on the original parent population. However, unlike GA, DE uses floating point instead of bit string encoding, and arithmetic operations instead of logical rules, thereby providing a greater flexibility in the parameter search. Among the settings of DE, the population size and total number of generations are tuned in the case studies below based on the dimensionality of the problem (i.e. number of parameters) and the choice of parameter estimation method, respectively. The remaining parameters are maintained at previously suggested values [25]. The convergence and termination of the optimization can be based on the improvement of the best objective function in the population, standard deviation of the population vector, or maximum difference between the best and worst population member. A combination of several of these criteria can provide an efficient and robust termination criterion [26]. Since the case studies considered in this work involve in silico data with known true parameters, a maximum iteration number is used as a termination criteria and the efficacy of each method is judged based on the accuracy of the respective estimates.
The SSA and DE algorithms were implemented using Message Passing Interface (MPI) in C++ and run on a Linux IBM computing cluster (CentOS; GNU C++ compiler (v4.1.1)). A combination of a long period random number generator [27] and multiple independent streams generator [28] were used to guarantee statistically independent streams of random numbers required for both the SSA and DE.
Results
Case Study 1: RNA dynamics in E. coli
The first case study uses the CME model corresponding to the reactions and kinetic parameters proposed in the original work, as shown in Figure 1B and detailed in supplementary data [Additional File 1: Supplementary Table S1] [6]. Considering this model to be the true system, four experimental datasets of mRNA copy numbers with different replicates (m = 10, 20, 100, and 10,000) were simulated using the SSA. The simulated data were contaminated with measurement errors arising due to the normalization of the fluorescence flux, were taken to be discrete rounded values of normal random samples N(0,0.25), consistent with the actual wet-lab experiments [6]. The mRNA transcripts per cell generation were recorded every 0.5 minutes until 75 minutes, mimicking the original experimental protocol.
where l now denotes the index of the state in replicate vector after arranging the data in ascending order (i.e., o_{1} ≤ o_{2} ≤ ...≤ o_{ m }). This construction implicitly uses the differences between sorted data values as the bin sizes. As stated earlier, since the DFD-PDF method requires the histograms of experimental data, which in the case of low replicate datasets, are highly inaccurate, this method was only performed for cell population data (m = 10,000). The DE optimization was implemented with a population size of 30 (10 × the number of parameters) for 4,000 generations and the optimization routine took about 1.5 hours for completion.
Parameter estimation of RNA dynamics model in E. coli.
Replicates | ML | DFD-CDF | DFD-PDF | ||||||
---|---|---|---|---|---|---|---|---|---|
k _{ 1 } | k _{ 2 } | k _{ 3 } | k _{ 1 } | k _{ 2 } | k _{ 3 } | k _{ 1 } | k _{ 2 } | k _{ 3 } | |
10 | 0.0235 (0.0233)^{a} | 1.304 (0.3231)^{a} | 3.2201 (0.7232)^{a} | 0.02 | 0.1029 | 0.3643 | - | - | - |
20 | 0.0227 | 0.1095 | 0.2858 | 0.0371 | 0.2124 | 0.5263 | - | - | - |
100 | 0.0362 | 0.2930 | 0.5533 | 0.0273 | 0.1702 | 0.4121 | - | - | - |
10000 | 0.0279 | 0.2354 | 0.4872 | 0.0276 | 0.1659 | 0.4102 | 0.0273 | 0.1532 | 0.3837 |
Case Study 2: Galactose uptake model in S. cerevisiae
The CME model adapted from this work captures the random transitions among all possible promoter states as shown in Figure 3B. The states PC_{1}, PC_{2} and PC_{3} represent free/silent, intermediate complex, and pre-initiation complex promoter configurations, respectively, while the states RC_{1} and RC_{2} describe different forms of repressed promoter configurations. The transcriptional (RNA synthesis) and translational (protein synthesis) processes are modelled as single-step irreversible reactions (Figure 3B).
In the simplified model, the different promoter configurations are assumed to be in equilibrium, which reduces the model to a set of 8 irreversible reactions, 4 states, and 8 kinetic parameters, as shown in Figure 3B (dashed boxes) [29]. As in the first case study, this model was considered to be the true system and the molecular data of yEGFP and TetR were generated using SSA, giving 10^{4} realizations at every 5 dimensionless time units up to 50 (or about 18 times the half life of yEGFP [30]). This condition corresponds to 440 minutes of post induction by 2% galactose and 40 ng ml^{-1} ATc. To study the scalability of the proposed methods, the parameter estimation of the full network with 18 reactions, 9 states, and 15 kinetic parameters was also done using a second in silico dataset with 10^{4} SSA realizations from the complete model. The details on the CME formulation for both the reduced and the complete model of the yEGFP gene expression pathway have been included in the supplementary data [Additional File 1: Supplementary Table S2 and S3].
Parameter estimation of reduced yEGFP model in S. cerevisiae
Parameters | ML | DFD-CDF | DFD-PDF | Bounds | True values |
---|---|---|---|---|---|
κ _{ R } | 1.1443 | 1 | 1.0478 | [0,5] | 1 |
κ _{ P } | 1.0382 | 1.005 | 1.2174 | [0,5] | 1 |
γ _{ R } | 4.5036 | 5.0306 | 5.7355 | [0,10] | 5 |
γ _{ P } | 0.0128 | 0.0126 | 0.012 | [0,5] | 0.0125 |
${\kappa}_{R}^{t}$ | 0.428 | 0.432 | 0.431 | [0,5] | 0.417 |
${\kappa}_{P}^{t}$ | 2.1254 | 1.0542 | 1.24 | [0,5] | 1 |
${\gamma}_{R}^{t}$ | 6.2433 | 2.9966 | 3.4982 | [0,10] | 3 |
${\gamma}_{P}^{t}$ | 0.0102 | 0.0114 | 0.0115 | [0,5] | 0.0125 |
Parameter estimation of full yEGFP model in S. cerevisiae.
Parameters | Transcription processes | ||||
---|---|---|---|---|---|
ML | DFD-CDF | DFD-PDF | Bounds | True value | |
k _{1f} | 0.4061 | 0.4082 | 0.4292 | [0,5] | 0.42 |
k _{1b} | 0.211 | 0.1171 | 0.8296 | [0,5] | 0.2485 |
k _{2f} | 74.1848 | 25.9882 | 99.7701 | [0,100] | 50 |
k _{2b} | 4.1423 | 18.8779 | 2.0815 | [0,20] | 10 |
k _{3f} | 3.2 × 10^{-3} | 3.87 × 10^{-3} | 0.0166 | [0,5] | 3.032 × 10^{-3} |
k _{3b} | 17.2405 | 19.9408 | 19.7665 | [0,20] | 10 |
α | 0.1 | 0.0183 | 0.0211 | [0,5] | 0.025 |
Irreversible processes | |||||
κ _{ R } | 0.8939 | 0.9296 | 0.8078 | [0,5] | 1 |
κ _{ P } | 2.0345 | 1.1103 | 1.0995 | [0,5] | 1 |
γ _{ R } | 7.3543 | 5.2431 | 5.4116 | [0,10] | 5 |
γ _{ P } | 0.0116 | 0.0124 | 0.012 | [0,5] | 0.0125 |
${\kappa}_{R}^{t}$ | 0.4376 | 0.4157 | 0.4152 | [0,5] | 0.417 |
${\kappa}_{P}^{t}$ | 1.7641 | 0.9755 | 1.3732 | [0,5] | 1 |
${\gamma}_{R}^{t}$ | 4.3235 | 2.9034 | 3.9315 | [0,10] | 3 |
${\gamma}_{P}^{t}$ | 0.0107 | 0.0116 | 0.0103 | [0,5] | 0.0125 |
for PDF and CDF, respectively, have also been evaluated, showing similar performances and observations. The outcome of the application of these criteria to the estimation of parameters in the reduced and complete yEGFP gene expression pathway is described in supplementary data [Additional File 1: Supplementary Table S4 and S5].
Case Study 3: Stochastic model of a synthetic toggle switch
A simple deterministic model was proposed to examine the behaviour of the toggle switch and to analyze different conditions of bistability [8]. The corresponding CME formulation is described in the Figure 4B and 4C[35]. Here, the propensity functions are taken directly from the deterministic model and they give effective rates of reaction following a canonical Hill equation. Taking this model to be the true system, in silico data of GFP fluorescence at IPTG concentration of 6 × 10^{-5} M were simulated using 10^{4} independent SSA realizations, emulating flow cytometry data.
Parameter estimation of synthetic toggle switch in E. coli.
DFD-CDF | DFD-PDF | Bounds | True value | |
---|---|---|---|---|
α _{1} | 137.716 | 99.456 | [0,200] | 156.25 |
α _{2} | 15.644 | 15.391 | [0,20] | 15.6 |
β | 2.309 | 2.543 | [0,10] | 2.5 |
γ | 1.071 | 1.015 | [0,10] | 1 |
η | 2.065 | 8.434 | [0,10] | 2.0015 |
K | 7.331 × 10^{-5} | 5.831 × 10^{-4} | [0,1] | 6.0 × 10^{-5} |
Discussion
In this work, three practical methods are proposed for the estimation of the parameters from (noisy) single cell datasets with low and high replicates. As the methods rely on a histogram construction of density functions from a finite sample of experimental data and Monte Carlo simulations, the objective function evaluation has a trade-off between low accuracy when using bins that are too wide, and high sensitivity to noise when bins are too small. In order to balance this trade-off, the binning was done such that the narrowest bin has at least ten occurrences. The noise associated with this binning strategy is also taken into account in the objective function in the DFD methods, which is modelled according to a binomial distribution.
The proposed methods are developed while considering a few practical issues when dealing with real biological datasets, such as data sparsity (low replicates), data noise and relatively coarse sampling intervals. The methods developed here do not require fast time-sampling like in [14], which might pose a restrictive constraint in practice. When population data are available, the DFD methods can fully exploit the additional information and rigorously handle the noise associated with the finite sample construction of a density function through the weighting factors. Although the examples considered in this work are represented by the CME, the methodologies developed in this work are generally applicable to parameter estimation of other stochastic models (e.g. Langevin), as long as the distribution density function can be constructed. Furthermore, the different methods developed in this work can be used to robustly estimate the rate constants of large scale gene expression networks as well as systems with multistability and general nonlinear propensity equations.
The case studies above showed that methods based on matching density function shapes between model and data generally performed better than maximizing likelihood function. Furthermore, the DFD-CDF distance is more sensitive to parameters than both the DFD-PDF and ML, and thus is the most effective method developed in this work. The higher sensitivity of the CDF with respect to parameter variations is expected as a result of the cumulative sum of the PDF sensitivity. This is evident from comparing the normalized objective function surfaces as shown in Figure. 2, in which the CDF objective functions have the steepest curvature. The increased curvature leads to a faster convergence to the minima in the DE optimization of the CDF than the PDF, though both methods eventually converge to optimal parameter estimates with similar accuracy. In addition, the CDF is generally less sensitive to noise from finite sampling as can be seen from the noise weighting factor S_{ l,i }when normalized with the respective probability, i.e. the coefficient of variation (CoV) ${S}_{l,i}/{F}_{e}\left({o}_{l},{t}_{i}\right)=\sqrt{1-{F}_{e}\left({o}_{l},{t}_{i}\right)}/n\sqrt{{F}_{e}\left({o}_{l},{t}_{i}\right)}$. The monotonically decreasing CoV as a function F_{ e }(o_{ l }, t_{ i }) of indicates that the CDF construction becomes less affected by finite sampling noise with increasing F_{ e }(o_{ l }, t_{ i }).
Similar to the parameter estimation in deterministic models, parameter identifiability is a key issue in the estimation of the CME parameters. Such problem is commonly encountered in the parameter estimation of deterministic ODE models [36]. Following the same arguments from the deterministic estimation, the identifiability problem is caused by the limited information contained in the data about the parameters governing the fast transformations among the different promoter configurations. Such problem can be alleviated by getting additional measurements with a faster sampling rate and if possible, measuring the variables that are directly affected by the parameters, e.g. the fractions of promoters in each configuration of the second case study. An analogue of deterministic parameter identifiability analysis can be performed using the parametric sensitivity of the density function and experiments can be designed to maximize the degree of information in the data [35, 37, 38].
Conclusions
The inherent stochasticity associated with low copy number processes in the cellular genetic milieu can introduce significant noise in gene expression profiles. The modelling of such noisy system requires a careful consideration of random processes and the parameters governing the probability of random events [1]. Three parameter estimation methods for stochastic models have been proposed based on the maximum likelihood criterion and density function distances of PDF and CDF. Since state density functions of stochastic systems are often constructed from a finite number of experimental data points or Monte Carlo realizations, a careful consideration has been taken to characterize the influence of noise arising from the histogram binning. Specifically, the effects of histogram noise are directly incorporated into the parameter estimation objective function as weighting functions. Applications to two case studies have shown that the proposed methods are both effective and practical. Amongst the proposed methods, the CDF-DFD method has been found to be the most efficient in estimating the kinetic rate constant than the others (i.e., the ML and DFD-PDF methods) due to the higher sensitivity of CDF to the parameters.
Declarations
Acknowledgements
This work was supported by National University of Singapore Faculty Research Council grant [R-279-000-219-112/133].
Authors’ Affiliations
References
- McAdams HH, Arkin A: It's a noisy business! Genetic regulation at the nanomolar scale. Trends Genet 1999, 15: 65–69. 10.1016/S0168-9525(98)01659-XView ArticlePubMedGoogle Scholar
- Elowitz MB, Leibler S: A synthetic oscillatory network of transcriptional regulators. Nature 2000, 403: 335–338. 10.1038/35002125View ArticlePubMedGoogle Scholar
- Colman-Lerner A, Gordon A, Serra E, Chin T, Resnekov O, Endy D, Pesce CG, Brent R: Regulated cell-to-cell variation in a cell-fate decision system. Nature 2005, 437: 699–706. 10.1038/nature03998View ArticlePubMedGoogle Scholar
- Yang E, van Nimwegen E, Zavolan M, Rajewsky N, Schroeder M, Magnasco M, Darnell JE Jr: Decay rates of human mRNAs: correlation with functional characteristics and sequence attributes. Genome Res 2003, 13: 1863–1872. 10.1101/gr.997703View ArticlePubMedPubMed CentralGoogle Scholar
- Chou IC, Voit EO: Recent developments in parameter estimation and structure identification of biochemical and genomic systems. Math Biosci 2009, 219: 57–83. 10.1016/j.mbs.2009.03.002View ArticlePubMedPubMed CentralGoogle Scholar
- Golding I, Paulsson J, Zawilski SM, Cox EC: Real-time kinetics of gene activity in individual bacteria. Cell 2005, 123: 1025–1036. 10.1016/j.cell.2005.09.031View ArticlePubMedGoogle Scholar
- Yu J, Xiao J, Ren X, Lao K, Xie XS: Probing gene expression in live cells, one protein molecule at a time. Science 2006, 311: 1600–1603. 10.1126/science.1119623View ArticlePubMedGoogle Scholar
- Gardner TS, Cantor CR, Collins JJ: Construction of a genetic toggle switch in Escherichia coli. Nature 2000, 403: 339–342. 10.1038/35002131View ArticlePubMedGoogle Scholar
- Fange D, Elf J: Noise-induced Min phenotypes in E. coli. PLoS Comput Biol 2006, 2: e80. 10.1371/journal.pcbi.0020080View ArticlePubMedPubMed CentralGoogle Scholar
- Samoilov M, Plyasunov S, Arkin AP: Stochastic amplification and signaling in enzymatic futile cycles through noise-induced bistability with oscillations. Proc Natl Acad Sci USA 2005, 102: 2310–2315. 10.1073/pnas.0406841102View ArticlePubMedPubMed CentralGoogle Scholar
- Tian T, Xu S, Gao J, Burrage K: Simulated maximum likelihood method for estimating kinetic rates in gene expression. Bioinformatics 2007, 23: 84–91. 10.1093/bioinformatics/btl552View ArticlePubMedGoogle Scholar
- Golightly A, Wilkinson DJ: Bayesian sequential inference for stochastic kinetic biochemical network models. J Comput Biol 2006, 13: 838–851. 10.1089/cmb.2006.13.838View ArticlePubMedGoogle Scholar
- Golightly A, Wilkinson DJ: Bayesian inference for a discretely observed stochastic kinetic model. Stat Comput 2008, 125–135.Google Scholar
- Reinker S, Altman RM, Timmer J: Parameter estimation in stochastic biochemical reactions. Syst Biol (Stevenage) 2006, 153: 168–178.View ArticleGoogle Scholar
- Gillespie DT: Markov Processes: An Introduction for Physical Scientists. San Diego: Academic Press; 1991.Google Scholar
- Gillespie DT: Exact Stochastic Simulation of Coupled Chemical Reactions. J Phys Chem 1977, 81: 2340–2361. 10.1021/j100540a008View ArticleGoogle Scholar
- Gillespie DT: A rigorous derivation of the chemical master equation. Physica A 1992, 188: 404–425. 10.1016/0378-4371(92)90283-VView ArticleGoogle Scholar
- Scott DW: Multivariate Density Estimation: Theory, Practice, and Visualization (Wiley Series in Probability and Statistics). Wiley; 1992.View ArticleGoogle Scholar
- Montgomery DC, Runger GC: Applied Statistics and Probability for Engineers. New York: Wiley; 2006.Google Scholar
- Macnamara S, Bersani AM, Burrage K, Sidje RB: Stochastic chemical kinetics and the total quasi-steady-state assumption: application to the stochastic simulation algorithm and chemical master equation. J Chem Phys 2008, 129: 095105. 10.1063/1.2971036View ArticlePubMedGoogle Scholar
- Macnamara S, Burrage K, Sidje RB: Multiscale modeling of chemical kinetics via the master equation. SIAM J; Multiscale Modeling & Simulation 2008, 6: 1146–1168.View ArticleGoogle Scholar
- Munsky B, Khammash M: The finite state projection algorithm for the solution of the chemical master equation. J Chem Phys 2006, 124: 044104. 10.1063/1.2145882View ArticlePubMedGoogle Scholar
- Gibson MA, Bruck J: Efficient Exact Stochastic Simulation of Chemical Systems with Many Species and Many Channels. J Phys Chem A 2000, 104: 1876–1889. 10.1021/jp993732qView ArticleGoogle Scholar
- Kullback S, Leibler S: On Information and Sufficiency. Ann Math Stat 1951, 22: 79–86. 10.1214/aoms/1177729694View ArticleGoogle Scholar
- Storn R, Price K: Differential Evolution - A Simple and Efficient Heuristic for Global Optimization over Continuous Spaces. J Global Optim 1997, 4: 341–359. 10.1023/A:1008202821328View ArticleGoogle Scholar
- Zielinski K, Peters D, Laur R: Stopping Criteria for Single-Objective Optimization. Proceedings of the Third International Conference on Computational Intelligence, Robotics and Autonomous Systems; Singapore 2005.Google Scholar
- Matsumoto M, Nishimura T: Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator. ACM Trans Model Comput Simul 1998, 8: 3–30. 10.1145/272991.272995View ArticleGoogle Scholar
- LeCuyer P, Simard R, Chen EJ, Kelton WD: An Object-Oriented Random-Number Package with many long Streams and Substreams. Oper Res 2002, 50: 1073. 10.1287/opre.50.6.1073.358View ArticleGoogle Scholar
- Blake WJ, M KA, Cantor CR, Collins JJ: Noise in eukaryotic gene expression. Nature 2003, 422: 633–637. 10.1038/nature01546View ArticlePubMedGoogle Scholar
- Chen MT, Weiss R: Artificial cell-cell communication in yeast Saccharomyces cerevisiae using signaling elements from Arabidopsis thaliana. Nat Biotechnol 2005, 23: 1551–1555. 10.1038/nbt1162View ArticlePubMedGoogle Scholar
- Arkin A, Ross J, McAdams HH: Stochastic kinetic analysis of developmental pathway bifurcation in phage lambda-infected Escherichia coli cells. Genetics 1998, 149: 1633–1648.PubMedPubMed CentralGoogle Scholar
- Ozbudak EM, Thattai M, Lim HN, Shraiman BI, Van Oudenaarden A: Multistability in the lactose utilization network of Escherichia coli. Nature 2004, 427: 737–740. 10.1038/nature02298View ArticlePubMedGoogle Scholar
- Pomerening JR, Sontag ED, Ferrell JE Jr: Building a cell cycle oscillator: hysteresis and bistability in the activation of Cdc2. Nat Cell Biol 2003, 5: 346–351. 10.1038/ncb954View ArticlePubMedGoogle Scholar
- Bhalla US, Iyengar R: Emergent properties of networks of biological signaling pathways. Science 1999, 283: 381–387. 10.1126/science.283.5400.381View ArticlePubMedGoogle Scholar
- Gunawan R, Cao Y, Petzold L, Doyle FJ: Sensitivity analysis of discrete stochastic systems. Biophys J 2005, 88: 2530–2540. 10.1529/biophysj.104.053405View ArticlePubMedPubMed CentralGoogle Scholar
- Nikerel IE, van Winden WA, Verheijen PJ, Heijnen JJ: Model reduction and a priori kinetic parameter identifiability analysis using metabolome time series for metabolic reaction networks with linlog kinetics. Metab Eng 2009, 11: 20–30. 10.1016/j.ymben.2008.07.004View ArticlePubMedGoogle Scholar
- Gadkar KG, Gunawan R, Doyle FJ: Iterative approach to model identification of biological networks. BMC Bioinformatics 2005, 6: 155. 10.1186/1471-2105-6-155View ArticlePubMedPubMed CentralGoogle Scholar
- Plyasunov S, Arkin AP: Efficient stochastic sensitivity analysis of discrete event systems. J Comp Phys 2006, 221: 724–738. 10.1016/j.jcp.2006.06.047View ArticleGoogle Scholar
- Cao Y, Gillespie DT, Petzold LR: Efficient step size selection for the tau-leaping simulation method. J Chem Phys 2006, 124: 044109. 10.1063/1.2159468View ArticlePubMedGoogle Scholar
- Chatterjee A, Vlachos DG, Katsoulakis MA: Binomial distribution based tau-leap accelerated stochastic simulation. J Chem Phys 2005, 122: 024112. 10.1063/1.1833357View ArticlePubMedGoogle Scholar
- Haseltine EL, Rawlings JB: Approximate simulation of coupled fast and slow reactions for stochastic chemical kinetics. J Chem Phys 2002, 117: 6959–6969. 10.1063/1.1505860View ArticleGoogle Scholar
- Tian T, Burrage K: Binomial leap methods for simulating stochastic chemical kinetics. J Chem Phys 2004, 121: 10356–10364. 10.1063/1.1810475View ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.