Parametric sensitivity analysis for biochemical reaction networks based on pathwise information theory
- Yannis Pantazis^{1},
- Markos A Katsoulakis^{1}Email author and
- Dionisios G Vlachos^{2}
DOI: 10.1186/1471-2105-14-311
© Pantazis et al.; licensee BioMed Central Ltd. 2013
Received: 9 April 2013
Accepted: 5 October 2013
Published: 22 October 2013
Abstract
Background
Stochastic modeling and simulation provide powerful predictive methods for the intrinsic understanding of fundamental mechanisms in complex biochemical networks. Typically, such mathematical models involve networks of coupled jump stochastic processes with a large number of parameters that need to be suitably calibrated against experimental data. In this direction, the parameter sensitivity analysis of reaction networks is an essential mathematical and computational tool, yielding information regarding the robustness and the identifiability of model parameters. However, existing sensitivity analysis approaches such as variants of the finite difference method can have an overwhelming computational cost in models with a high-dimensional parameter space.
Results
We develop a sensitivity analysis methodology suitable for complex stochastic reaction networks with a large number of parameters. The proposed approach is based on Information Theory methods and relies on the quantification of information loss due to parameter perturbations between time-series distributions. For this reason, we need to work on path-space, i.e., the set consisting of all stochastic trajectories, hence the proposed approach is referred to as “pathwise”. The pathwise sensitivity analysis method is realized by employing the rigorously-derived Relative Entropy Rate, which is directly computable from the propensity functions. A key aspect of the method is that an associated pathwise Fisher Information Matrix (FIM) is defined, which in turn constitutes a gradient-free approach to quantifying parameter sensitivities. The structure of the FIM turns out to be block-diagonal, revealing hidden parameter dependencies and sensitivities in reaction networks.
Conclusions
As a gradient-free method, the proposed sensitivity analysis provides a significant advantage when dealing with complex stochastic systems with a large number of parameters. In addition, the knowledge of the structure of the FIM can allow to efficiently address questions on parameter identifiability, estimation and robustness. The proposed method is tested and validated on three biochemical systems, namely: (a) a protein production/degradation model where explicit solutions are available, permitting a careful assessment of the method, (b) the p53 reaction network where quasi-steady stochastic oscillations of the concentrations are observed, and for which continuum approximations (e.g. mean field, stochastic Langevin, etc.) break down due to persistent oscillations between high and low populations, and (c) an Epidermal Growth Factor Receptor model which is an example of a high-dimensional stochastic reaction network with more than 200 reactions and a corresponding number of parameters.
Keywords
Biochemical reaction networks Sensitivity analysis Relative entropy rate Pathwise Fisher information matrix p53 model EGFR modelBackground
The need of an intrinsic understanding of the interplay between complexity and robustness of biological processes and their corresponding design principles is well-documented, see for instance [1-5]. The concept of robustness can be described as “a property that allows a system to maintain its functions against internal and external perturbations” [3]. When referring to mathematical models of complex biological processes, one of the mathematical tools to describe the robustness of a system to perturbations is sensitivity analysis which attempts to determine which parameter directions (or their combinations) are the most/least sensitive to perturbations and uncertainty, or to errors resulting from experimental parameter estimation. Recently there has been significant progress in developing sensitivity analysis tools for low-dimensional stochastic processes, modeling well-mixed chemical reactions and biological networks. Some of the mathematical tools included log-likelihood methods and Girsanov transformations [6-8], polynomial chaos [9], finite difference methods and their variants [10, 11] and pathwise sensitivity methods [12]. However, existing sensitivity analysis approaches can have an overwhelming computational cost, either due to high variance in the gradient estimators, or in models with a high-dimensional parameter space, [13].
The aforementioned methods focus on the sensitivity of stochastic trajectories and corresponding averages. However, it is often the case that we are interested in the sensitivity of probability density functions (PDF), which are typically non-Gaussian in nonlinear and/or discrete systems. In that latter direction, there is a broad recent literature relying on information theory tools, and where sensitivity is estimated by using the Relative Entropy and the Fisher Information Matrix between PDFs, providing a quantification of information loss along different parameter perturbations. We refer to [14-18] for the case when the parametric PDF is explicitly known. For instance, in [16] the parametric PDF’s structure is known as it is obtained through an entropy maximization subject to constraints. Knowing the form of the PDF allows to carry out calculations such as estimating the relative entropy and identifying the most sensitive parameter combinations. Furthermore, the pathwise PDFs are also known in reaction networks when a Linear Noise Approximation (LNA) is employed and for this case the relative entropy can be explicitly computed allowing thus to carry out parametric sensitivity analysis, [18]. However, for complex stochastic dynamics of large reaction networks, spatial Kinetic Monte Carlo algorithms and molecular dynamics, such explicit formulas for the PDFs are not available in general.
In [19], we address such challenges by introducing a new methodology for complex stochastic dynamics based on the Relative Entropy Rate (RER) which provides a measure of the sensitivity of the entire time-series distribution. Typically, the space of all such time-series is referred in probability theory as the “path space”. RER measures the loss of information per unit time in path space after an arbitrary perturbation of parameter combinations. RER and the corresponding Fisher Information Matrix (FIM) become computationally feasible as they admit explicit formulas which depend only on the propensity functions (see (4) and (6), respectively). In fact, we showed in [19] that the proposed pathwise approach to sensitivity analysis has the following features: First, it is rigorously valid for the sensitivity of long-time, stationary dynamics in path space, including for example bistable, periodic and pulse-like dynamics. Second, it is a gradient-free sensitivity analysis method suitable for high-dimensional parameter spaces as the ones typically arising in complex biochemical networks. Third, the RER method does not require the explicit knowledge of the equilibrium PDFs, relying only on information for local dynamics and thus making it suitable for non-equilibrium steady state systems. In [19], we demonstrated these features by focusing on two classes of problems: Langevin particle systems with either reversible (gradient) or non-reversible (non-gradient) forcing, highlighting the ability of the method to carry out sensitivity analysis in non-equilibrium systems; and spatially extended Kinetic Monte Carlo models, showing that the method can handle high-dimensional problems.
In this paper, we extend and apply the pathwise sensitivity analysis method in [19] to biochemical reaction networks, and demonstrate the intrinsic sensitivity structure of the network. Such systems are typically modeled as jump Markov processes and they are simulated using either exact algorithms such as the Stochastic Simulation Algorithm (SSA), [20-22] and the next-reaction method [23], or by employing approximations such as mean field ODEs, tau-leap [24] and stochastic Langevin methods [25].
We show that the proposed pathwise method allows us to discover the intrinsic sensitivities of the reaction network by decomposing the FIM into diagonal blocks. The block-diagonal structure of the proposed FIM reveals, in a straightforward way, the sensitivity interdependencies between the system parameters. For instance, if each propensity function depends only on one parameter -usually the reaction constant- then the FIM is a diagonal matrix (see (14)). The sparse representation of the FIM can be essential in optimal experimental design as well as in parameter identifiability and robustness where each subset of the parameters defined by a block of the FIM can be treated separately. Moreover, our earlier rigorous analysis [19] for the stationary regime suggests suitable extensions in the transient case which are here tested and validated. Finally, we present strategies for efficiently and reliably implementing the proposed method for high-dimensional, complex stochastic systems using an array of existing accelerated versions of the SSA algorithm such as mean field, stochastic Langevin, τ-leap approximations and their variants, [21, 24-27].
We test the proposed set of methods and computational strategies in three examples of biochemical networks. First, we consider a prototypical protein production/degradation model, i.e, a single-species birth/death model, with explicitly known formulas for the stationary and the time-dependent distribution. This model serves as a benchmark where the differences between the proposed pathwise FIM and the stationary FIM are highlighted. Second, we study the parameter sensitivities of a p53 gene model for cell cycle regulation and response to DNA damage, that incorporates the feedback between the tumor suppressor p53 gene and the oncogene Mdm2 [28]. This is a reaction network that exhibits random oscillations in its steady state, and for which continuum approximations of the SSA such as LNA break down due to persistent oscillations between high and low populations. Using the proposed method, we also study a far more complex network, the epidermal growth factor receptor (EGFR) model, describing signaling processes between mammalian cells [29-31]. This is a high-dimensional system both in the number of variables and parameters, including 94 species and 207 reactions. Having a gradient-free method for this example with parameter space of dimension 207 provides a significant advantage over gradient methods such as finite differencing, where the computation of a very high number of partial derivatives and/or directional derivatives is needed and with possibly significant variance that scales with the dimension, [11]. By contrast, the eigenvalue/eigenvector analysis of the proposed FIM identifies the order from least to most sensitive directions (determined by the eigenvectors of the FIM) by the corresponding eigenvalues.
In Methods, we present the derivation of the Relative Entropy Rate and its corresponding Fisher Information Matrix for continuous-time jump Markov processes as well as we reveal the block-diagonal structure of the FIM for commonly encountered reaction networks, continued by the presentation of both unbiased and biased -but accelerated- statistical estimators for RER and FIM. Then, in the Results, we apply and validate the proposed pathwise sensitivity analysis methodology in three complex biological reaction networks.
Methods
We consider a well-mixed reaction network with N species, S = {S_{1}, …, S_{ N }}, and M reactions, R = {R_{1}, …, R_{ M }}. The state of the system at any time t ≥ 0 is denoted by an N-dimensional vector X(t) = [X_{1}(t), …, X_{ N }(t)]^{ T } where X_{ i }(t) is the number of molecules of species S_{ i } at time t. Let the N-dimensional vector ν_{ j } correspond to the stoichiometry vector of j-th reaction such that ν_{i,j} is the stoichiometric coefficient of species S_{ i } in reaction R_{ j }. Given that the reaction network at time t is in state X(t) = x, the propensity function, a_{ j }(x), is defined so that the infinitesimal quantity a_{ j }(x)d t gives the transition probability of the j-th reaction to occur in the time interval [t, t + d t]. Propensities are typically dependent on the state of the system and the reaction conditions (i.e., external parameters) of the network such as temperature, pressure, etc. Mathematically, ${\{\mathbf{\text{X}}(t)\}}_{t\in {\mathbb{R}}_{+}}$ is a continuous-time, time-homogeneous, jump Markov process with countable state space $E\subset {\mathbb{N}}^{N}$. The transition rates of the Markov process are the propensity functions a_{ j }(·), j = 1, …, M. The transition rates determine the clock of the updates (or jumps) from a current state x to a new (random) state x^{′} through the total rate ${a}_{0}(\mathbf{\text{x}}):={\sum}_{j=1}^{M}{a}_{j}(\mathbf{\text{x}})$ while the transition probabilities of the process are determined by the ratio $\frac{{a}_{j}(\mathbf{\text{x}})}{{a}_{0}(\mathbf{\text{x}})}$. We refer to Algorithm 1 for the details of the stochastic simulation.
Relative entropy
- (i)
it is always non-negative,
- (ii)
it equals to zero if and only if $\mathcal{P}=\stackrel{~}{\mathcal{P}}\mathcal{P}$-almost everywhere, and,
- (iii)$\mathcal{R}\left(\mathcal{P}|\stackrel{~}{\mathcal{P}}\right)<\infty $
if and only if and $\stackrel{~}{\mathcal{P}}$ are absolutely continuous with respect to each other.
From an information theory perspective, relative entropy quantifies the loss of information when $\stackrel{~}{\mathcal{P}}$ is utilized instead of , [33]. In other words, relative entropy quantifies the inefficiency of assuming an incorrect or perturbed distribution $\stackrel{~}{\mathcal{P}}$ instead of employing the true distribution . Therefore, even though not a metric, relative entropy has been used as a suitable quantity for the assessment of parametric sensitivities since the higher the relative entropy (i.e., the information loss) in some perturbed direction, the larger the sensitivity should be in this direction.
Pathwise relative entropy and relative entropy rate
where $\frac{d{Q}_{[0,T]}^{\theta}}{d{Q}_{[0,T]}^{\theta +\epsilon}}$ is the Radon-Nikodym derivative of ${Q}_{[0,T]}^{\theta}$ with respect to ${Q}_{[0,T]}^{\theta +\epsilon}$. In fact, using the Girsanov’s formula, we can obtain an explicit expression for the Radon-Nikodym derivative in terms of the propensities, [34]. In the context of sensitivity analysis, the pathwise relative entropy $\mathcal{R}\left({Q}_{[0,T]}^{\theta}\phantom{\rule{0.3em}{0ex}}|\phantom{\rule{0.3em}{0ex}}{Q}_{[0,T]}^{\theta +\epsilon}\right)$ is a measure of information loss due to an ε-perturbation of the model parameters, and consequently it is a natural measure of parametric sensitivity.
Thus, from a practical point of view, RER is an observable of the stochastic process which can be computed numerically as an ergodic average, requiring only the knowledge of the propensity functions and the stoichiometric matrix (ν)_{i,j}. Nevertheless, in order to carry out the sensitivity analysis in the parameter vector θ, the computation of RER for different ε’s is necessary which can be computationally challenging for high-dimensional parameter spaces. Thus, a sensitivity analysis methodology which does not depend on ε’s -such methods are called “gradient-free”- is desirable and is developed next.
Pathwise Fisher information matrix
The implications of this explicit formula are twofold. First, it reveals that for many typical reaction networks the FIM has a special block-diagonal structure which reflects the parameter interdependencies and it is discussed in detail below. Second, the FIM is based on the propensity functions as well as on their derivatives which are known - actually, they define the process- thus the FIM, similarly to RER, is numerically computable as an observable of the process. Subsequent sections present various strategies to numerically estimate both the RER and the FIM in an efficient fashion.
it can be inferred that the most sensitive direction corresponds to the eigenvector with eigenvalue ${\lambda}_{1}^{\theta}$ while the least sensitive direction corresponds to the eigenvector with eigenvalue ${\lambda}_{K}^{\theta}$. Additionally, the FIM is one of the most useful tools for optimal experimental design. Many of the optimality criteria such as D-optimality where the determinant of the FIM is maximized or A-optimality where the trace of the inverse of the FIM is minimized are based on FIM, [37].
In the same direction, robustness of the system to parameter perturbations or errors as well as parameter identifiability can be studied utilizing spectral analysis of the FIM. For instance, parameter identifiability is satisfied when all the eigenvalues of the FIM are above a given threshold, [18].
Sensitivity analysis at the logarithmic scale
Linking relative entropy and observables
As we discussed in the previous sections, relative entropy provides a mathematically elegant and computationally tractable methodology for the parameter sensitivity analysis of complex, stochastic dynamical systems. Such results focus on the sensitivity of the entire probability distribution, either at equilibrium or at the path-space level, i.e., for the entire stationary time-series. However, in many situations of chemical and biological networks, the interest is focused on observables such as mean populations, population correlations, population variance as well as path-space observables such as time autocorrelations and extinction times. Therefore, it is plausible to attempt to connect the parameter sensitivities of observables to the relative entropy methods proposed here. Indeed, relative entropy can provide an upper bound for a large family of observable functions through the Pinsker (or Csiszar-Kullback-Pinsker) inequality, [33].
where ||·||_{ ∞ } denotes the supremum (here, maximum) of f. An obvious outcome of this inequality is that if the (pseudo-)distance between two distributions defined by $\mathcal{R}\left({Q}^{\theta}|{Q}^{\theta +\epsilon}\right)$ is controlled, then the error between the two distributions is also controlled for any bounded observable. In the context of sensitivity analysis, inequality (9) states that if the relative entropy is small, i.e., insensitive in a particular parameter direction, then, any bounded observable f is also expected to be insensitive towards the same direction. In this sense, (9) can be viewed as a “conservative” -but not necessarily sharp- bound for the parametric sensitivity analysis of observables, including path-dependent observables such as long-time averages and autocorrelations.
From a practical perspective, (9) can be used as an indicator that suggests -even in the presence of a very high-dimensional parameter space- which are the insensitive parameter directions for observables of stochastic dynamical systems. The least-sensitive directions can be verified computationally and we present in the sequel two examples of this practical strategy in the p53 and the EGFR models. More generally, determining insensitive directions in parameter space can by particularly useful in multi-parameter models in systems biology characterized by “sloppiness”, i.e., when most model parameters allow for a vast range of perturbations without affecting the dynamics, [38]. As we concretely show in the EGFR example, our methodology can easily demonstrate and quantify such properties in stochastic dynamics through the use of the spectral analysis of the pathwise FIM, even if the models include a very large number of parameters.
Remark 1
We note that in order to carry out such an analysis in a mathematically rigorous manner which includes both sensitive and insensitive parameters, we need to require that the norm ||·||_{ ∞ } in (9) is controlled. For instance, typical observables in biochemical reaction networks are the number of molecules for each species, hence f(x) = x. Thus, for reaction networks where the population size is large, the Pinsker inequality (9) will provide a bound that may not be sharp. In fact, it was recently shown that there are alternative bounds which are expected to be practically more useful than (9) in the sense that they provide sharp upper bounds for observables in terms of the relative entropy, [39]. These sharp bounds rely on the variational representation of the relative entropy and the existence of an explicit minimizer for the upper bound. Furthermore, [39] combined with our work on RER and pathwise FIM suggests new mathematical questions towards pursuing practical sharp upper bounds involving RER and pathwise FIM. In view of these comments, we conclude that (9), (a) constitutes a theoretical indicator that relative entropy is a reliable tool for sensitivity analysis and more generally for uncertainty quantification; (b) from a practical perspective, it is capable to rule out insensitive directions in parameter space, which in turn provides a significant advantage in the study of “sloppy” multi-parameter models.
Block-diagonal structure of the pathwise FIM
Before proceeding with the theoretical computation of the FIM for various well-known classes of biochemical reaction networks, we list some of the implications of this simplified structure of the FIM in sensitivity analysis and elsewhere.
- (i)
The sparsity of the FIM is proportional to the parametric decoupling between the reactions. Knowing a priori the zero elements of the FIM, there is no need to numerically compute them. It is clear that the computation cost for each sample drops from O(K^{2}) to O(K L) where L is the largest dimension of the block matrices.
- (ii)
 The inverse of the FIM is also block-diagonal and each block of the inverse FIM is the inverse of the respective block. This fact allows us in the parameter estimation problem to easily evaluate the lower bound of the variance, at least for the complete-data case [40], i.e., obtain Cramer-Rao bounds, [41, 42] which are given by the diagonal elements of the inverse of the FIM.
- (iii)
Relation (11) implies that that optimality criteria in optimal experiment design, [37], are significantly simplified. For example, the determinant employed in the D-optimality test is given by the relation $det({\mathbf{F}}_{\mathcal{H}})={\prod}_{i=1}^{I}det({A}_{i})$, while the trace of the inverse of the FIM utilized in the A-optimality reduces to $\text{tr}({{F}_{\mathcal{H}}}^{-1})={\sum}_{i=1}^{I}\text{tr}({A}_{i}^{-1})$.
- (iv)
 Given that parametric identifiability is characterized by the magnitude of the eigenvalues of the FIM, e.g. a zero eigenvalue corresponds to a non-identifiable direction in parameter space, [18, 43], then the block-diagonal structure (11) can provide additional insights in parameter identification. For instance, the identifiability of the parameters of the group that corresponds to the i-th block, A _{ i }, will be increased if the smaller eigenvalues of the i-th block can be increased. On the other hand, if the determinant of the i-th block equals to zero then at least one of its eigenvalues is zero and thus the corresponding linear combinations of parameters are non-identifiable. Similarly, the robustness of the system to perturbations of the parameters of the i-th group will be increased if there is a way to decrease the larger eigenvalues of the i-th block.
Overall, we note that extracting useful information regarding model parameters can be performed for each block of the pathwise FIM independently. Next, we discuss two specific examples of biochemical reaction networks where the explicit calculation of the block-diagonal FIM is demonstrated.
Reactions with independent reaction constants (mass action kinetics)
This result demonstrates that the sensitivity of a reaction constant is proportional to the equilibrium average of the respective propensity function. Moreover, due to the diagonal form of the FIM, it is straightforward to carry out the eigenvalue analysis and infer the most/least sensitive directions of the reaction network: the eigenvalues of the FIM are its diagonal elements while the eigenvectors are the standard basis vectors of ${\mathbb{R}}^{K}$. Hence, the most (respectively least) sensitive parameter is obtained from the largest (respectively smallest) diagonal element of the FIM. Furthermore, (14) demonstrates that the (local) robustness of the reaction network to a specific parameter is inversely proportional to the mean propensity of the corresponding reaction. Another observation stemming from the diagonal structure of the pathwise FIM is that each rate constant can be estimated from time-series data independently from the other rate constants. This observation has been already pointed out and discussed in the context of maximum likelihood estimation for the complete-data case ([40], Sec. 10.2).
Additionally, the mean firing rate of a reaction is equal to the mean propensity. Hence, it can be stated that the parameters that correspond to the faster reactions, i.e., to reactions with larger mean firing rate, are more sensitive in a pathwise entropy sense. It should be noted, however, that not all observables are sensitive to the parameters that correspond to the faster reactions and there are examples (see the protein production-degradation model in the Results section) where steady state observables such as the equilibrium distribution remain insensitive to specific perturbation directions even though their mean propensity may be increased. Finally, we would like to remark that even though ${\mathbb{E}}_{{\mu}^{\theta}}\left[{a}_{k}^{\theta}(\mathbf{\text{x}})\right]={\theta}_{k}{\mathbb{E}}_{{\mu}^{\theta}}\left[{g}_{k}^{\theta}(\mathbf{\text{x}})\right]$ trivially holds true, the diagonal elements of the FIM are not linear functions of the corresponding reaction constants since the steady state distribution μ^{ θ }, depends also on the parameter vector θ. In fact, high reaction constants do not necessarily imply large mean propensities and hence a more sensitive parametric dependence. This is specifically due to the mean value in (14) and as an illustrative example we refer to the simple protein production-degradation model (e.g., compare (24) and (27)).
Michaelis-Menten kinetics
In general, biochemical reaction networks may have significantly more complex propensities, nevertheless, the computation of the FIM follows exactly the same calculation lines for any propensity function.
Strategies for the statistical estimation of RER and FIM
Previous sections introduced and justified RER and FIM as appropriate observables for measuring the sensitivity analysis of the reaction network’s parameters in long-time dynamics. This section presents strategies on how to efficiently estimate these quantities as ergodic averages of the underlying stochastic process.
Unbiased statistical estimators
Noticing that the computation of the local propensity functions ${a}_{j}^{\theta}({\mathbf{\text{x}}}_{i})$ for all j = 1, …, M is needed for the simulation of the jump Markov process when Monte Carlo methods such as SSA [45] is utilized, the computation of the perturbed transition rates is the only additional computational cost for the numerical RER while the additional cost for the estimation of the FIM is the computation of the derivatives of the propensities. Algorithm 1 summarizes the numerical computation of RER and FIM, employing the SSA for the simulation of the jump Markov process.
Accelerated statistical estimators
A typical feature of biochemical systems is that the modeled reaction network is large with hundreds or thousands of reactions and different time scales stemming from the orders of magnitude difference between the reaction rates and/or between the species concentrations, making the SSA extremely slow. A large number of multi-scale approximations of the original SSA have been developed in order to handle such issues resulting to accelerated simulation algorithms. For example, mean-field approximation ignores the fluctuations of the stochastic process and yields a deterministic system of ordinary differential equations (ODE) for the mean population of the species [46, 47]. Stochastic corrections to the mean-field model such as stochastic Langevin [25] and linear noise approximation [48] can be applied in order to improve the accuracy of the simulation. An alternative approximation of the jump Markov process is the tau-leap method proposed by Gillespie [24] where a batch of events occurs at each time-increment, τ. Several improvements of the basic tau-leap algorithm on how to select adaptively the τ[49] or avoiding negative populations [27, 50] have been proposed, however, their performance is heavily model-dependent.
Algorithm 1SSA-based numerical computation of RER and FIM
Relation (22) suggests an algorithm similar to Algorithm 1 for the numerical computation of the FIM where instead of SSA, an ODE solver is employed.
Remark 2
Multi-scale approximations are usually valid for large populations and relatively simple systems which do not exhibit complex dynamics such as bistability or intermittency. Indeed, large deviation arguments [52] or even explicitly available formulas for escape times [53] demonstrate that stochastic approximations cannot always capture correctly exit times, rare events, strong intermittency, etc. even in relatively simple systems. However, in order to simulate large biochemical systems there is often no other alternative than such approximate models, which nevertheless need to be employed with the necessary caution.
Remark 3
In biochemical systems, we are interested not only in the steady state, i.e., the stationary distribution or time-series, but also in the transient regime, e.g. signaling phenomena. The extension of the proposed sensitivity analysis method to the transient regime is justified by the fact that the time-normalized relative entropy can be also decomposed as a sum of simple integrals [33] which results to the fact that the statistical estimators (17) and (18) remain valid. In a subsequent section we present an example of a biochemical system (EGFR) which exhibits transient behavior, and where the proposed sensitivity analysis tools are tested and validated. The rigorous mathematical derivation of the relative entropy rate for the transient regime is out of the scope of this publication and a dedicated mathematical article on the time-dependent relative entropy rate will follow.
Results and discussion
A simple protein production/degradation model
We consider this simple stochastic model due to the available analytic representations of the steady state (equilibrium) distribution, time-dependent moments and autocorrelations, ([46], Sec. 7.1). Consequently, we can both illustrate the proposed pathwise sensitivity analysis, as well as compare it to the standard equilibrium FIM, revealing concretely differences between the two approaches.
The complete calculations can be found in the Additional file 2. Some of the implications of the differences between these two FIMs are discussed next.
where < ·, · >_{ s } denotes stationary averaging. Based on this formula, it is obvious that the autocorrelation function is also sensitive to k_{2}, in addiction to the ratio $\frac{{k}_{1}}{{k}_{2}}$. This example demonstrates that in contrast to the pathwise FIM, the equilibrium FIM is inadequate to fully capture the dynamic properties of the process. Moreover, the pathwise FIM depends linearly only on k_{1}, which shows that the reaction rate constants and propensity functions in (24) alone, can be misleading in the assessment of parametric sensitivity. Contrary, the mathematically correct equilibrium averaging of the propensities, i.e., (14) can lead to a completely different outcome, as can be readily seen when we compare (24) and (27).
In terms of parameter identifiability, the fact that one of the eigenvalues of (25) is zero implies that that the two-dimensional parameter vector of the system is non-identifiable. Indeed, the asymptotic normality of the maximum likelihood estimators, [41, 42], states that their variance (also a lower bound according to the Cramer Rao theorem), which determines parameter identifiability of k_{1} and k_{2}, is the reciprocal of the eigenvalues of (25). A straightforward calculation involving the eigenvectors of (25) shows that the only identifiable parameter is the ratio of the reaction constants appearing in (25). Therefore parametric inference for both parameters from equilibrium data is not possible. On the other hand, the pathwise FIM (26) is not singular, which readily implies that both parameters can be identified through (complete) time-series data, provided that k_{1} ≠ 0. Summarizing, this birth/death model is an example where equilibrium sampling is not enough for the identifiability of all the parameters, however, if dynamics data are available and are taken into account then all the parameters become identifiable as pathwise FIM asserts.
The p53 gene model
The reaction table with x corresponding to p53, y _{ 0 } to Mdm2-precursor while y corresponds to Mdm2
Event | Reaction | Rate | Rate’s derivative |
---|---|---|---|
R _{1} | ∅ → x | a_{1}(x) = b_{ x } | ∇_{ θ }a_{1}(x) = [1, 0, 0, 0, 0, 0, 0]^{ T } |
R _{2} | x → ∅ | ${a}_{2}(\mathbf{\text{x}})={a}_{x}x+\frac{{a}_{k}y}{x+k}x$ | ∇_{ θ }a_{2}(x) = [0, x, x y / (x + k), |
−a_{ k }x y / (x + k)^{2}, 0, 0, 0]^{ T } | |||
R _{3} | x → x + y_{0} | a_{3}(x) = b_{ y }x | ∇_{ θ }a_{3}(x) = [0, 0, 0, 0, x, 0, 0]^{ T } |
R _{4} | y_{0} → y | a_{4}(x) = a_{0}y_{0} | ∇_{ θ }a_{4}(x) = [0, 0, 0, 0, 0, y_{0}, 0]^{ T } |
R _{5} | y → ∅ | a_{5}(x) = a_{ y }y | ∇_{ θ }a_{5}(x) = [0, 0, 0, 0, 0, 0, y]^{ T } |
Parameter values for the p53 model
Parameter | b _{ x } | a _{ x } | a _{ k } | k | b _{ y } | a _{0} | a _{ y } |
---|---|---|---|---|---|---|---|
Value | 90 | 0.002 | 1.7 | 0.01 | 1.1 | 0.8 | 0.8 |
Comparison to the LNA-based sensitivity approach
In [18], the authors suggested a linear noise approximation (LNA) for the stochastic evolution around the nonlinear mean-field equation, and based on this approximation a system of ODEs is derived for the mean and the covariance matrix of the approximation process. Since the noise of LNA is Gaussian, the mean and the covariance matrix contains all necessary information regarding the approximate stochastic model. Then, the associated FIM is derived and based on it, the sensitivities for each parameter are computed. Although there are regimes where this approximation is applicable (short times, high populations, systems with a single steady state, etc.), for systems with nontrivial long-time dynamics, e.g. metastable, it is not correct as large deviation arguments [52] and explicit formulas for escape times [53] show. Similar issues with non-gaussianity in the long-time dynamics arise in stochastic systems with strongly intermittent (pulse-like) or random oscillatory behavior [58]. In the p53 model considered in [18] which had the same parameter values as here, Figure 2 reveals that the time-series of the p53 populations persistently fluctuate between high and low values, thus the LNA approximation may not be accurate at least when the concentration of the species is very low.
Epidermal growth factor receptor model
Initial population of the species for the EGFR network
EGF | EGFR | GAP | Grb2 | Sos | Ras-GDP | Shc |
---|---|---|---|---|---|---|
4.98e10 | 5e4 | 1.2e4 | 5.1e4 | 6.63e4 | 1.14e7 | 1.01e6 |
Raf | Phosphatase 1 | Phosphatase 2 | Phosphatase 3 | MEK | ERK | Pxrot |
4e4 | 4e4 | 4e4 | 1e6 | 2.2e7 | 2.1e7 | 8.1e4 |
The previous discussion refers to the analysis of the EGFR model to the steady state regime. On the other hand, EGFR is a signaling model whose transient regime, in addition to the steady state, is of great interest. As discussed in Remark 3, we can justify the application of the RER and FIM sensitivity analysis in the transient regime. Therefore, we compute the proposed FIM at the time interval [0,10], using (22). The lower plot of Figure 7 shows the diagonal elements of the pathwise FIM in the transient regime while keeping the ordering of the parameters unchanged from the upper, steady state plot. The parameter sensitivity ordering is completely different meaning that the sensitivities are time-dependent in the transient regime. For instance, the most sensitive parameters in the stationary regime correspond to the final products of the reaction network, however, in the time interval [0,10] these species have not been produced yet resulting to insensitive reaction constants. In terms of parameter identification and estimation, the time-dependent sensitivities imply that in order to extract the maximum information content from the experimental data, we have to estimate the parameters drawing samples from different time intervals. These time intervals should be defined based on the respective sensitivity indices and selected in order to maximize the identifiability for each set of parameters.
Conclusions
In this paper, we applied and extended a recently proposed parametric sensitivity analysis methodology to complex stochastic reaction networks. This sensitivity analysis approach is based on the quantification of information loss along different parameter perturbations between time-series distributions. This is achieved by employing the Relative Entropy Rate, which is directly computable from the propensity functions. A key aspect of the method is that we can derive rigorously an associated Fisher Information Matrix on path-space, which in turn constitutes a gradient-free approach for parametric sensitivity analysis; as such it provides a significant advantage in stochastic systems with a large number of parameters. We demonstrated that the structure of the pathwise FIM revealed hidden parameter interdependencies between the reactions. The block-diagonal structure of the FIM highlighted the sparsity of the matrix which resulted in further improvements in the computational efficiency of the proposed method. Therefore, parametric sensitivity analysis for high-dimensional stochastic reaction systems becomes tractable since it is well-known that in high dimensional stochastic systems sensitivity analysis techniques can involve estimators of very high variance, e.g. in finite difference methods and their recently proposed variants, which can present an overwhelming computational cost. Additionally, we proposed the use of multiscale numerical approximations of stochastic reaction networks in order to derive efficient statistical estimators for the FIM and implemented one such approximation (mean-field) in a high-dimensional system.
The proposed pathwise sensitivity analysis method is tested and validated on three biological systems: (a) a simple protein production/degradation model where explicit solutions are available, (b) the p53 reaction network where quasi-steady stochastic oscillations of the concentrations are observed and where multiscale stochastic approximations break down due to the persistent oscillations between low and high populations, and (c) a stochastic EGFR model which is an example of a high-dimensional reaction network with more than 200 reactions and a corresponding number of parameters. In the EGFR reaction network, we combined the proposed pathwise FIM which has been derived from the stochastic network and the mean-field approximation which is used for the efficient estimation of the pathwise FIM. Moreover, our earlier rigorous analysis for the steady state regime [19] suggests suitable extensions in the transient regime which were tested and validated for the EGFR model. We will present the full rigorous theory in an upcoming publication.
Finally, we note that the relation between RER and various observables is not straightforward. However, we note that the path distribution contains all information regarding the process including the steady state and all time-dependent observables: practically, our proposed sensitivity analysis represents a “conservative” sensitivity estimate in the sense that insensitive directions for the relative entropy on path-space will yield insensitive directions for every observable. This latter statement can be justified mathematically through the Pinsker inequality (9) which was tested in the examples considered here. Based on these observations, the proposed sensitivity analysis methods can be deployed in complementary fashion with existing sensitivity analysis tools, as it can be used to narrow down the most sensitive directions in a system.
Declarations
Acknowledgements
The work of MAK and YP was supported in part by the Office of Advanced Scientific Computing Research, U.S. Department of Energy under Contract No. DE-SC0002339 and DE-SC0010723. The work of DGV was supported in part by the Office of Advanced Scientific Computing Research, U.S. Department of Energy under Contract No. DE-FG02-05ER25702 and DE-SC0010549. The work of MAK was also supported in part by the European Union (European Social Fund) and Greece (National Strategic Reference Framework), under the THALES Program, grant AMOSICSS.
Authors’ Affiliations
References
- Barkai N, Leibler S: Robustness in simple biochemical networks. Nature. 1997, 387 (6636): 913-917. 10.1038/43199.View ArticlePubMedGoogle Scholar
- Csete M, Doyle J: Reverse engineering of biological complexity. Science. 2002, 295 (5560): 1664-1669. 10.1126/science.1069981.View ArticlePubMedGoogle Scholar
- Kitano H: Opinion - Cancer as a robust system: implications for anticancer therapy. Nat Rev Cancer. 2004, 4 (3): 227-235. 10.1038/nrc1300.View ArticlePubMedGoogle Scholar
- Donz A, Fanchon E, Gattepaille L, Maler O, Tracqui P: Robustness analysis and behavior discrimination in enzymatic reaction networks. PLoS ONE. 2011, 6: 1-16.Google Scholar
- Hart Y, Antebi Y, Mayo A, Friedman N, Alon U: Design principles of cell circuits with paradoxical components. Proc Nat Acad Sci USA (PNAS). 2012, 109 (21): 8346-8351. 10.1073/pnas.1117475109.View ArticleGoogle Scholar
- Glynn P: Likelihood ratio gradient estimation for stochastic systems. Commun ACM. 1990, 33 (10): 75-84. 10.1145/84537.84552.View ArticleGoogle Scholar
- Nakayama M, Goyal A, Glynn PW: Likelihood ratio sensitivity analysis for Markovian models of highly dependable systems. Stochastic Models. 1994, 10: 701-717. 10.1080/15326349408807318.View ArticleGoogle Scholar
- Plyasunov S, Arkin AP: Efficient stochastic sensitivity analysis of discrete event systems. J Comp Phys. 2007, 221: 724-738. 10.1016/j.jcp.2006.06.047.View ArticleGoogle Scholar
- Kim D, Debusschere B, Najm H: Spectral methods for parametric sensitivity in stochastic dynamical systems. Biophys J. 2007, 92: 379-393. 10.1529/biophysj.106.085084.PubMed CentralView ArticlePubMedGoogle Scholar
- Rathinam M, Sheppard PW, Khammash M: Efficient computation of parameter sensitivities of discrete stochastic chemical reaction networks. J Chem Phys. 2010, 132 (034103): 1-13.Google Scholar
- Anderson DF: An efficient finite difference method for parameter sensitivities of continuous-time Markov chains. SIAM J Numerical Anal. 2012, 50 (5): 2237-2258. 10.1137/110849079.View ArticleGoogle Scholar
- Sheppard P, Rathinam M, Khammash M: A pathwise derivative approach to the computation of parameter sensitivities in discrete stochastic chemical systems. J Chem Phys. 2012, 136 (034115):Google Scholar
- McGill JA, Ogunnaike BA, Vlachos DG: Efficient gradient estimation using finite differencing and likelihood ratios for kinetic Monte Carlo simulations. J Comp Phys. 2012, 231 (21): 7170-7186. 10.1016/j.jcp.2012.06.037.View ArticleGoogle Scholar
- Liu H, Chen W, Sudjianto A: Relative entropy based method for probabilistic sensitivity analysis in engineering design. J Mech Design. 2006, 128: 326-336. 10.1115/1.2159025.View ArticleGoogle Scholar
- Lüdtke N, Panzeri S, Brown M, Broomhead DS, Knowles J, Montemurro MA, Kell DB: Information-theoretic sensitivity analysis: a general method for credit assignment in complex networks. J R Soc Interface. 2008, 5: 223-235. 10.1098/rsif.2007.1079.PubMed CentralView ArticlePubMedGoogle Scholar
- Majda AJ, Gershgorin B: Quantifying uncertainty in climate change science through empirical information theory. Proc Natl Acad Sci. 2010, 107 (34): 14958-14963. 10.1073/pnas.1007009107.PubMed CentralView ArticlePubMedGoogle Scholar
- Majda AJ, Gershgorin B: Improving model fidelity and sensitivity for complex systems through empirical information theory. Proc Natl Acad Sci. 2011, 108 (25): 10044-10049. 10.1073/pnas.1105174108.PubMed CentralView ArticlePubMedGoogle Scholar
- Komorowski M, Costa MJ, Rand DA, Stumpf MPH: Sensitivity, robustness, and identifiability in stochastic chemical kinetics models. Proc Natl Acad Sci USA. 2011, 108: 8645-8650. 10.1073/pnas.1015814108.PubMed CentralView ArticlePubMedGoogle Scholar
- Pantazis Y, Katsoulakis M: A relative entropy rate method for path space sensitivity analysis of stationary complex stochastic dynamics. J Chem Phys. 2013, 138 (5): 054115-10.1063/1.4789612.View ArticlePubMedGoogle Scholar
- Gillespie DT: Exact stochastic simulation of coupled chemical reactions. J Chem Phys. 1977, 81: 2340-2361. 10.1021/j100540a008.View ArticleGoogle Scholar
- Chatterjee A, Vlachos DG: An overview of spatial microscopic and accelerated kinetic Monte Carlo methods for materials’ simulation. J Comput Aided Mater Design. 2007, 14 (2): 253-308. 10.1007/s10820-006-9042-9.View ArticleGoogle Scholar
- Slepoy A, Thompson A, Plimpton S: A constant-time kinetic Monte Carlo algorithm for simulation of large biochemical reaction networks. J Chem Phys. 2008, 128 (20): 205101-10.1063/1.2919546.View ArticlePubMedGoogle Scholar
- Gibson MA, Bruck J: Efficient exact stochastic simulation of chemical systems with many species and many channels. J Chem Phys. 2000, 104: 1876-1889. 10.1021/jp993732q.View ArticleGoogle Scholar
- Gillespie DT: Approximated accelerated stochastic simulation of chemically reacting systems. J Chem Phys. 2001, 115 (4): 1716-1733. 10.1063/1.1378322.View ArticleGoogle Scholar
- Gillespie DT: The chemical Langevin equation. J Chem Phys. 2000, 113: 297-306. 10.1063/1.481811.View ArticleGoogle Scholar
- Rathinam M, Petzold LR, Cao Y, Gillespie DT: Stiffness in stochastic chemically reacting systems: The implicit tau-leaping method. J Chem Phys. 2003, 119: 12784-12794. 10.1063/1.1627296.View ArticleGoogle Scholar
- Chatterjee A, Vlachos DG, Katsoulakis MA: Binomial distribution based tau-leap accelerated stochastic simulation. J Chem Phys. 2005, 122: 024112-10.1063/1.1833357.View ArticlePubMedGoogle Scholar
- Geva-Zatorsky N, Rosenfeld N, Itzkovitz S, Milo R, Sigal A, Dekel E, Yarnitzky T, Liron Y, Polak P, Lahav G, Alon U: Oscillations and variability in the p53 system. Mol Syst Biol. 2006, 2: 0033-View ArticlePubMedGoogle Scholar
- Moghal N, Sternberg P: Multiple positive and negative regulators of signaling by the EGF receptor. Curr Opin Cell Biol. 1999, 11: 190-196. 10.1016/S0955-0674(99)80025-8.View ArticlePubMedGoogle Scholar
- Hackel P, Zwick E, Prenzel N, Ullrich A: Epidermal growth factor receptors: critical mediators of multiple receptor pathways. Curr Opin Cell Biol. 1999, 11: 184-189. 10.1016/S0955-0674(99)80024-6.View ArticlePubMedGoogle Scholar
- Schoeberl B, Eichler-Jonsson C, Gilles E, Muller G: Computational modeling of the dynamics of the MAP kinase cascade activated by surface and internalized EGF receptors. Nat Biotechnol. 2002, 20: 370-375. 10.1038/nbt0402-370.View ArticlePubMedGoogle Scholar
- Kullback S: Information Theory and Statistics. 1959, New York: John Wiley and SonsGoogle Scholar
- Cover TM, Thomas JA: Elements of Information Theory. 1991, Wiley Series in TelecommunicationsView ArticleGoogle Scholar
- Kipnis C, Landim C: Scaling Limits of Interacting Particle Systems. 1999, Berlin, Heidelberg and New York: Springer-VerlagView ArticleGoogle Scholar
- Avellaneda M, Friedman C, Holmes R, Samperi D: Calibrating volatility surfaces via relative entropy minimization. Soc Sci Res Netw. 1997, Available at SSRN: [http://ssrn.com/abstract=648]Google Scholar
- Dumitrescu ME: Some informational properties of Markov pure-jump processes. C P Matematiky. 1988, 113: 429-434.Google Scholar
- Emery AF, Nenarokomov AV: Optimal experiment design. Meas Sci Technol. 1998, 9: 864-876. 10.1088/0957-0233/9/6/003.View ArticleGoogle Scholar
- Gutenkunst RN, Waterfall JJ, Casey FP, Brown KS, Myers CR, Sethna JP: Universally sloppy parameter sensitivities in systems biology models. PLOS Comput Biol. 2007, 3: 1871-1878.View ArticlePubMedGoogle Scholar
- Chowdhary K, Dupuis P: Distinguishing and integrating aleatoric and epistemic variation in uncertainty quantification. ESAIM Math Model Numerical Anal. 2013, 47: 635-662. 10.1051/m2an/2012038.View ArticleGoogle Scholar
- Wilkinson DJ: Stochastic Modelling for Systems Biology. 2012, Chapman & HallGoogle Scholar
- Kay SM: Funtamentals of Statistical Signal Processing: Estimation Theory. 1993, Englewood Cliffs: Prentice-HallGoogle Scholar
- Wasserman L: All of Statistics: A Concise Course in Statistical Inference. 2004, SpringerView ArticleGoogle Scholar
- Rothenberg T: Identification in parametric models. CONOMETRICA. 1971, 39: 577-0591. 10.2307/1913267.View ArticleGoogle Scholar
- Marangoni AG: Enzyme Kinetics: A Modern Approach. 2003, Wiley InterscienceGoogle Scholar
- Gillespie DT: A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J Comp Phys. 1976, 22: 403-434. 10.1016/0021-9991(76)90041-3.View ArticleGoogle Scholar
- Gardiner C: Handbook of Stochastic Methods: for Physics, Chemistry and the Natural Sciences. 1985, SpringerView ArticleGoogle Scholar
- van Kampen NG: Stochastic Processes in Physics and Chemistry. 2006, North HollandGoogle Scholar
- Kurtz TG: The relationship between stochastic and deterministic models for chemical reactions. J Chem Phys. 1972, 57: 2976-10.1063/1.1678692.View ArticleGoogle Scholar
- Cao Y, Gillespie DT, Petzold LR: Efficient step size selection for the tau-leaping simulation method. J Chem Phys. 2006, 124: 044109-10.1063/1.2159468.View ArticlePubMedGoogle Scholar
- Tian T, Burrage K: Binomial leap methods for simulating stochastic chemical kinetics. J Chem Phys. 2004, 121: 10356-10.1063/1.1810475.View ArticlePubMedGoogle Scholar
- Kurtz TG: Approximation of Population Processes, Society for Industrial and Applied Mathematics (SIAM). 1981View ArticleGoogle Scholar
- Doering CR, Sargsyan KV, Sander LM, Vanden-Eijnden E: Asymptotics of rare events in birth-death processes bypassing the exact solutions. J Phys Condens Matter. 2007, 19 (065145): 1-12.Google Scholar
- Hanggi P, Grabert H, Talkner P, Thomas H: Bistable systems: Master equation versus Fokker-Planck modeling. Phys Rev A. 1984, 29: 371-378. 10.1103/PhysRevA.29.371.View ArticleGoogle Scholar
- Wilkinson DJ: Stochastic modelling for quantitative description of heterogeneous biological systems. Nat Rev Genet. 2009, 10: 122-133. 10.1038/nrg2509.View ArticlePubMedGoogle Scholar
- Thattai M, van Oudenaarden A: Intrinsic noise in gene regulatory networks. Proc Natl Acad Sci. 2001, 98 (15): 8614-8619. 10.1073/pnas.151588598. [http://www.pnas.org/content/98/15/8614.abstract]PubMed CentralView ArticlePubMedGoogle Scholar
- Prives C: Signaling to p53: breaking the MDM2-p53 circuit. Cell. 1998, 95: 5-8. 10.1016/S0092-8674(00)81774-2.View ArticlePubMedGoogle Scholar
- Harris S, Levine A: The p53 pathway: positive and negative feedback loops. Oncogene. 2005, 24: 899-908.View ArticleGoogle Scholar
- Katsoulakis MA, Majda AJ, Sopasakis A: Intermittency, metastability and coarse graining for coupled deterministic-stochastic lattice systems. Nonlinearity. 2006, 19 (5): 1021-1047. 10.1088/0951-7715/19/5/002.View ArticleGoogle Scholar
- Komorowski M, Zurauskiene J, Stumpf M: StochSens-Matlab package for sensitivity analysis of stochastic chemical systems. Bioinformatics. 2012, 28: 731-733. 10.1093/bioinformatics/btr714.View ArticlePubMedGoogle Scholar
- Sibilia M, Steinbach J, Stingl L, Aguzzi A, Wagner E: A strain-independent postnatal neurodegeneration in mice lacking the EGF receptor. EMBO J. 1998, 17: 719-731. 10.1093/emboj/17.3.719.PubMed CentralView ArticlePubMedGoogle Scholar
- Kim H, Muller W: The role of the EGF receptor family in tumorigenesis and metastasis. Exp Cell Res. 1999, 253: 78-87. 10.1006/excr.1999.4706.View ArticlePubMedGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.