Open Access

A comparison of approximation techniques for variance-based sensitivity analysis of biochemical reaction systems

BMC Bioinformatics201011:246

DOI: 10.1186/1471-2105-11-246

Received: 20 October 2009

Accepted: 12 May 2010

Published: 12 May 2010

Abstract

Background

Sensitivity analysis is an indispensable tool for the analysis of complex systems. In a recent paper, we have introduced a thermodynamically consistent variance-based sensitivity analysis approach for studying the robustness and fragility properties of biochemical reaction systems under uncertainty in the standard chemical potentials of the activated complexes of the reactions and the standard chemical potentials of the molecular species. In that approach, key sensitivity indices were estimated by Monte Carlo sampling, which is computationally very demanding and impractical for large biochemical reaction systems. Computationally efficient algorithms are needed to make variance-based sensitivity analysis applicable to realistic cellular networks, modeled by biochemical reaction systems that consist of a large number of reactions and molecular species.

Results

We present four techniques, derivative approximation (DA), polynomial approximation (PA), Gauss-Hermite integration (GHI), and orthonormal Hermite approximation (OHA), for analytically approximating the variance-based sensitivity indices associated with a biochemical reaction system. By using a well-known model of the mitogen-activated protein kinase signaling cascade as a case study, we numerically compare the approximation quality of these techniques against traditional Monte Carlo sampling. Our results indicate that, although DA is computationally the most attractive technique, special care should be exercised when using it for sensitivity analysis, since it may only be accurate at low levels of uncertainty. On the other hand, PA, GHI, and OHA are computationally more demanding than DA but can work well at high levels of uncertainty. GHI results in a slightly better accuracy than PA, but it is more difficult to implement. OHA produces the most accurate approximation results and can be implemented in a straightforward manner. It turns out that the computational cost of the four approximation techniques considered in this paper is orders of magnitude smaller than traditional Monte Carlo estimation. Software, coded in MATLAB®, which implements all sensitivity analysis techniques discussed in this paper, is available free of charge.

Conclusions

Estimating variance-based sensitivity indices of a large biochemical reaction system is a computationally challenging task that can only be addressed via approximations. Among the methods presented in this paper, a technique based on orthonormal Hermite polynomials seems to be an acceptable candidate for the job, producing very good approximation results for a wide range of uncertainty levels in a fraction of the time required by traditional Monte Carlo sampling.

Background

Sensitivity analysis is an indispensable tool for the analysis of complex systems [1, 2]. It is routinely used to investigate how uncertainty in input variables affects uncertainty in system response and to quantify the relative importance of the input variables in influencing the response. In addition to many other areas of science and engineering, sensitivity analysis is used in systems biology to investigate the robustness and fragility properties of cellular systems, such as signaling, gene regulation, and metabolic networks [311], as well as in systems pharmacology [12], for designing novel pharmacological intervention strategies and for understanding drug action [13, 14].

To study the sensitivity properties of a biochemical reaction system, such as a signaling network, we must construct a mathematical model that relates uncertainty in key biochemical factors of interest to a biologically relevant system response, and develop techniques for determining how factor uncertainty affects the system response. Since biochemical reaction systems are subject to physical laws, an important requirement is that sensitivity analysis must satisfy important thermodynamic constraints, such as the principle of detailed balance [15]. Bearing these in mind, we have proposed in [16] a probabilistic sensitivity analysis approach for biochemical reaction systems that uses the standard chemical potentials of the activated complexes of the underlying reactions and molecular species as the biochemical factors of interest and propagates factor uncertainty to a given system response in a thermodynamically consistent manner. Moreover, we have adopted a formal statistical approach to sensitivity analysis, known as variance-based sensitivity analysis [2, 1719], which uses a set of indices to quantify the contribution of individual biochemical factors to the variance of the system response.

Unfortunately, it is not in general possible to analytically evaluate variance-based sensitivity indices. As a consequence, these indices are estimated by Monte Carlo sampling [2, 16, 18, 20], which requires evaluation of the system response at each sample. A major drawback of this approach is its slow rate of convergence. As a matter of fact, the error produced by a naive Monte Carlo estimation approach decreases with an error rate of O(1/ https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq1_HTML.gif ), where L is the number of Monte Carlo samples used [21]. Hence, accurate estimation of the sensitivity indices requires a large number of Monte Carlo samples and, therefore, a large number of system response evaluations. This makes Monte Carlo estimation of variance-based sensitivity indices computationally very expensive, especially in the case of biochemical reaction systems comprised of a large number of reactions and molecular species.

To reduce the computational burden of Monte Carlo estimation, it is imperative that we develop techniques which can produce sufficiently accurate estimates of the sensitivity indices in a fraction of the time required by Monte Carlo sampling. In this paper, we present four such techniques and apply them to a well-known biochemical reaction model of the mitogen-activated protein kinase (MAPK) signalling cascade. The first technique is based on a second-order Taylor series expansion of the response function and is an extension of the first-order derivative-based approach for variance-based sensitivity analysis discussed in [2, 18, 19, 22] by including second-order derivative terms. The other approximation techniques are based on the high-dimensional model representation (HDMR) schemes developed by H. Rabitz and his coworkers [2325]. We use analytical derivations, provided in the Additional file 1 accompanying this paper, and sensitivity analysis results generated by the four methods, to clarify the relative merits of each approximation technique and produce useful insights on when these techniques can be used for sensitivity analysis of biochemical reaction systems. We have coded the sensitivity analysis techniques discussed in this paper using MATLAB®. Interested readers can request a copy of the software, and the entire set of data obtained with this software, by contacting the corresponding author.

We should mention here that, in systems biology, the most commonly used sensitivity analysis techniques are based on derivatives of molecular concentrations or other system responses, known as control coefficients [3]. These differential methods are based on a Taylor series approximation of the response function and, as such, are subject to several drawbacks that must be carefully considered before applying them to problems of systems biology. For example, derivative-based sensitivity indices assess the sensitivity properties of a biochemical reaction system around a set of reference input values. Their performance usually depends on the particular choice of these values, due to the nonlinear nature of the response function. For the results to be relevant, the reference values must be the true values, which are usually not known in practice. As a consequence, derivative-based sensitivity analysis techniques are limited by the quality of the underlying Taylor series approximation. Moreover, and due to our difficulty in accurately evaluating high-order derivatives, differential sensitivity analysis techniques are usually limited in practice to assessing the effect of one input factor on the system response, by keeping all other factors fixed to their reference values. This is usually not adequate, since we are most often interested in the effects of multiple biochemical factors on the system response. Finally, traditional differential analysis cannot cope with probabilistic uncertainty in biochemical factor values, unless it is combined with variance-based sensitivity analysis (as it is done by the first approximation technique considered in this paper). It turns out that variance-based sensitivity analysis does not depend on the additivity or linearity of the system model and can be naturally used to quantify the simultaneous effect of probabilistic biochemical factor uncertainty on the system response [2, 18]. For this reason, it provides a very attractive and powerful approach for sensitivity analysis of biochemical reaction systems.

We should finally mention that a number of alternative approximation techniques for variance-based sensitivity analysis have been proposed in the literature [2629]. In these techniques, the original response function is approximated by a surrogate function and the sensitivity indices are then estimated by Monte Carlo sampling based on that function. Reduction in computations is achieved by the fact that the time required for computing the system response at each Monte Carlo iteration using the surrogate function is much smaller than computing the response using the original function (whose evaluation requires solving a system of ordinary differential equations). However, the computations associated with these techniques are still substantial, since they must employ a large number of samples to sufficiently reduce the Monte Carlo estimation error. By contrast, the techniques discussed in this paper are based on surrogate functions that lead to analytical formulas for the sensitivity indices, thus avoiding Monte Carlo estimation. As a matter of fact, the computational cost for calculating the variance-based sensitivity indices using the techniques discussed in this paper is mainly associated with the problem of estimating the underlying parameters of the surrogate function used, which leads to appreciable computational savings over the techniques proposed in [2629].

Methods

Biochemical reaction systems

In this paper, we consider a well-stirred (homogeneous) biochemical reaction system at constant temperature and volume that consists of M coupled reactions of the form:
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equa_HTML.gif
where κ2m-1, κ2m≥ 0 are the normalized rate constants of the forward and reverse reactions (measured in s-1) and ν nm , https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq2_HTML.gif ≥ 0 are the stoichiometry coefficients of the reactants and products. We assume that the system consists of N molecular species X1, X2, ..., X N , with concentrations (measured in molecules/cell) at time t ≥ 0 given by q1(t), q2(t), ..., q N (t), respectively. We characterize the dynamic evolution of molecular concentrations by the following chemical kinetic equations:
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ1_HTML.gif
(1)
where https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq3_HTML.gif is the stoichiometry coefficient of the nth molecular species associated with the mth reaction and
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ2_HTML.gif
(2)

is the flux of the mth reaction at time t.

The sensitivity analysis approach we consider here is based on quantifying the influence of a reaction or molecular species on an appropriately chosen response characteristic R of a biochemical reaction system. We employ a well-known model of the MAPK signaling cascade (see Figure 1 and Additional file 2 for details on this model) and consider three response characteristics with established biological significance, namely the duration D, integrated response I, and strength S of the doubly phosphorylated extracellular signal-regulated kinase (ERK-PP), defined by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Fig1_HTML.jpg
Figure 1

A biochemical reaction model of the MAPK signaling cascade, adopted from Zhang et al.[16].

https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equb_HTML.gif

where q(t) is the concentration profile of ERK-PP and t0 is the time at which q(t) converges to zero. If convergence to zero does not occur within the observation time interval [0, tmax], then we set t0 = tmax. We choose to work with the duration, integrated response, and strength of ERK-PP activity, since it has been experimentally observed that differences in duration and strength may cause distinct biological outcomes, such as cell differentiation, proliferation, and apoptosis [3034], whereas, the integrated response directly correlates with DNA synthesis [35, 36]. We take the system response R to be the logarithm of the duration, integrated response, or strength; that is, we take R to be lnD, lnI, or lnS. This reduces the effect of outliers and increases the efficiency of numerically evaluating the indices associated with variance-based sensitivity analysis [16].

Variance-based sensitivity analysis

We employ the variance-based sensitivity analysis approach for biochemical reaction systems we recently introduced in [16]. This method is based on a biophysically-derived probabilistic model for the rate constants of a biochemical reaction system. According to this model, we treat the rate constants κ2m-1and κ2mas random variables K2m-1and K2m, given by the Eyring-Polanyi equations [37]
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ3_HTML.gif
(3)
where k B is the Boltzmann constant (k B = 1.3806504 × 10-23JK-1), T is the system temperature, h is the Planck constant (h = 6.62606885 × 10-34Js), https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq4_HTML.gif is the (random) capacity of the activated complex associated with the mth reaction, and C n is the (random) capacity of the nth molecular species. The capacities are defined by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ4_HTML.gif
(4)
where https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq5_HTML.gif , https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq6_HTML.gif are the (random) standard chemical potentials of the mth activated complex and the nth molecular species, respectively, given by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ5_HTML.gif
(5)

In (5), https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq7_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq8_HTML.gif are the nominal standard chemical potential values associated with the mth reaction and the nth molecular species, whereas, https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq9_HTML.gif and Y n are zero-mean Gaussian random variables with standard deviations https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq10_HTML.gif and λ n , respectively. These random variables account for variations in the standard chemical potentials about their nominal values caused by unpredictable biological variability and uncertainty regarding their exact values. Similarly to [16], we assume that the random variables https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq9_HTML.gif , m = 1, 2, ..., M, and Y n , n = 1, 2, ..., N, are statistically independent.

Our variance-based sensitivity analysis technique assesses how uncertainty in the rate constants of a biochemical reaction system affect the system response. As a consequence of (3), (4), and (5), we have that
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ6_HTML.gif
(6)
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equc_HTML.gif
are the nominal values of the rate constants, with
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equd_HTML.gif

Equation (6) suggests that uncertainty in the forward and reverse reaction rates occurs due to uncertainty in the standard chemical potentials of the activated complexes associated with the reactions and the standard chemical potentials of the reactants.

As a consequence of the previous model, we investigate the sensitivity properties of a biochemical reaction system due to the uncertainty in the standard chemical potentials. To simplify notation, we use W= {W1, W2, ..., W J } to denote the random variables Y and Y. We consider two cases, namely J = M and W j = https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq11_HTML.gif , for j = 1, 2, ..., M, as well as J = N and W j = Y j , j = 1, 2, ..., N. In the first case, the standard chemical potentials of the molecular species are assumed to be fixed, whereas, the standard chemical potentials of the activated complexes are perturbed randomly. Our objective is to investigate the importance of reactions in influencing the system response and, for this reason, we refer to this case as reaction-oriented sensitivity analysis (ROSA) [16]. In the second case, the standard chemical potentials of the activated complexes are assumed to be fixed, whereas, the standard for chemical potentials of the molecular species are perturbed randomly. In this case, our objective is to investigate the importance of molecular species in influencing the system response. For this reason, we refer to this case as species-oriented sensitivity analysis (SOSA) [16].

Given the response R of a biochemical reaction system with random factors W, its total variance Vtot := Var[R(W)] satisfies [17, 38, 39]:
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ7_HTML.gif
(7)
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ8_HTML.gif
(8)

with similar expressions for the remaining terms. If the biochemical factors W are statistically independent (which we assume here to be true), then each term on the right-hand-side of (7) is nonnegative. This equation provides a decomposition of the total system response variance Vtot into individual terms V1,V2, ..., V12, .... It turns out that V j quantifies the average reduction in total response variance, obtained by keeping the jth biochemical factor fixed. As a consequence, we use V j to measure the singular influence of the jth biochemical factor W j on the system response. Moreover, the term V jj' quantifies the average reduction in the total response variance due to jointly fixing the two biochemical factors W j and W j' , not accounted for by summing the average reductions obtained by separately fixing these factors. Therefore, we use V jj' to measure the joint influence of the biochemical factors W j and W j' on the system response. Finally, higher-order terms in (7) quantify the joint influence of three or more biochemical factors on the system response.

In most practical situations, it is difficult to evaluate the high-order terms (≥ 3) in the response variance decomposition scheme given by (7). Although these terms are usually negligible at low to moderate levels of biochemical factor uncertainty, they may take substantial values at high levels [16]. Unfortunately, it is difficult to deal in practice with high-order variance terms. For this reason, it is quite convenient to base our sensitivity analysis effort only on the first- and second-order terms V j and V jj' . Then, instead of using the total system response variance Vtot, we base our sensitivity analysis on its second-order portion V, given by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ9_HTML.gif
(9)
By using the probabilistic model given by (6) and the variance decomposition scheme in (9), we can develop a powerful (second-order) methodology for sensitivity analysis of biochemical reaction systems, similar to the one discussed in [16] that was based on the total response variance Vtot. The method requires evaluation of two indices, namely the (second-order) single-effect sensitivity index (SESI) σ j , defined by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ10_HTML.gif
(10)
and the (second-order) joint-effect sensitivity index (JESI) η j , defined by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ11_HTML.gif
(11)
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ12_HTML.gif
(12)

Clearly, σ j quantifies the fractional singular contribution of the jth biochemical factor to the second-order portion V of the total response variance, whereas, η j quantifies the fractional contribution of the jth biochemical factor to V jointly with another factor. It turns out that, if σ j = η j = 0, then we can conclude that factor j does not influence the system response singularly or jointly with another factor (although, it may influence the system response jointly with two or more factors). On the other hand, if σ j > 0 and η j = 0, then we can conclude that factor j influences the system response singularly but not jointly with another factor. Moreover, if σ j = 0 and η j > 0, we can conclude that factor j does not influence the system response singularly but it does so jointly with some other factor, whereas, if σ j > 0 and η j > 0, we can conclude that factor j influences the system response both singularly and jointly with some other factor. In practice, we can set a small threshold θ to determine whether σ j and η j are sufficiently larger than zero.

Unfortunately, we cannot evaluate the exact values of the sensitivity indices σ j and η j . For this reason, we must resort to approximations. In this paper, we consider the possibility of employing one of five methods to accomplish this goal. We discuss these methods next and refer the reader to [16] and the accompanying Additional file 1 for details pertaining to their development and numerical implementation.

Monte Carlo estimation

A straightforward technique for approximating the SESI and JESI values is based on a Monte Carlo Latin hypercube sampling approach, whose details can be found in [16] (see also [2, 20]). This approach can be used to provide estimates https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif of the second-order SESI's and JESI's by using 2L(J + 1) system evaluations [by integrating the system of N ordinary differential equations given by (1) and (2)], where L is the number of Latin hypercube samples used and J is the number of biochemical factors considered in the analysis. We refer to https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif as the (second-order) SESI's and JESI's obtained by M onte C arlo (MC) estimation. This method is computationally expensive, since a large number L of Latin hypercube samples is required to obtain sufficiently accurate estimates of the sensitivity indices.

Derivative approximation

A method for deriving approximations https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif of the sensitivity indices σ j and η j is to replace the response function R(w) by its second-order Taylor series approximation https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif about w= 0, given by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ13_HTML.gif
(13)
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Eque_HTML.gif
are the first- and second-order partial derivatives of R at w= 0, and set
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ14_HTML.gif
(14)
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ15_HTML.gif
(15)
Equation (13) and the statistical independence and zero-mean Gaussianity of the biochemical factors W j imply that
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ16_HTML.gif
(16)

where λ j is the standard deviation of W j , for j = 1, 2, ..., J. As a consequence, we obtain an analytical expression for the sensitivity indices https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , which requires evaluation of the first- and second-order partial derivatives of the response function R(w), with respect to the biochemical factors, at w= 0.

Although many techniques have been developed to compute response derivatives [40], for reasons we explain in Additional file 1, we choose to approximate the partial derivatives by symmetric finite differences. We refer to https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif given by (14), (15), and (16), as the (second-order) SESI's and JESI's obtained by D erivative A pproximation (DA). The resulting method requires 2J(J + 1) + 1 system integrations, which is quadratic in terms of the number J of the biochemical factors and is much smaller than the number 2L(J + 1) of system integrations required by MC, since J « L.

Polynomial approximation

Another way to approximate the sensitivity indices σ j and η j is to replace the response function R(w) by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ17_HTML.gif
(17)

where the α's are parameters whose values must be appropriately determined so that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif (w) sufficiently approximates the response function R(w) in an appropriately chosen neighborhood around 0. Note that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif provides a polynomial approximation of the response function R(w). If https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif is sufficiently close to R(w) in a neighborhood around 0, then the parameters α jj' coincide with the partial derivatives https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq15_HTML.gif , 1 ≤ κ1, κ2 ≤ 2, of R at w = 0.

By using (17) and the statistical independence and zero-mean Gaussianity of the biochemical factors W j , we can show that, in this case, https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif are given by (14) and (15), with
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ18_HTML.gif
(18)

As a consequence, we obtain again an analytical expression for the sensitivity indices https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , which requires evaluation of the α parameters. This can be done by the polynomial regression approach we discuss in Additional file 1. We refer to https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , given by (14), (15), and (18), as the (second-order) SESI's and JESI's obtained by P olynomial A pproximation (PA). The resulting method is based on the approach proposed in [41] and requires J(J - 1)S2/2 + JS + 1 system integrations, which is quadratic both in terms of the number J of biochemical factors and the number S of the samples per factor used in the regression. Note that J(J - 1)S2/2 + JS + 1 2J2(S/2)2, for sufficiently large J. This number is much smaller than the number 2L(J + 1) 2LJ of system integrations required by MC, since L J(S/2)2, but larger than the number 2J(J + 1) + 1 2J2 of system integrations required by DA, since S > 2.

Gauss-Hermite integration

We can obtain a more accurate approximation https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif of the response function R(w) than the one given by (13) if we truncate the Taylor series expansion of R(w) about w= 0 by removing all terms that involve partial derivatives with respect to more than two factors [note that the approximation given by (13) is obtained from the Taylor series expansion by truncating all terms that involve partial derivatives of order greater than two]. In this case, we can show that
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equf_HTML.gif
where
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ19_HTML.gif
(19)
as we explain in Additional file 1. The approximations https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif are now given by (14) and (15), with
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ20_HTML.gif
(20)
where e0, e j , and e jj' are given by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ21_HTML.gif
(21)

Note that evaluation of https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq16_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq17_HTML.gif requires only one- and two-dimensional integrations, which can be numerically done by a standard Gauss-Hermite integration approach. For this reason, we refer to https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , given by (14), (15), (19), (20), and (21), as the (second-order) SESI's and JESI's obtained by G auss H ermite I ntegration (GHI). The resulting method is based on the approach proposed in [42, 43] and requires 2J(J - 1)Q/22 + 2JQ/2 + 1 system integrations, which is quadratic both in terms of the number J of biochemical factors and the order Q of Gauss-Hermite integration used. Note that, if the number S of the samples per factor used in the regression associated with the PA is even, and Q = S or Q = S + 1, then GHI requires the same number of system integrations as PA.

Orthonormal Hermite approximation

The last method we consider for approximating the sensitivity indices σ j and η j is based on replacing the response function R(w) by
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ22_HTML.gif
(22)

where the α's and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq18_HTML.gif are parameters whose values must be appropriately determined so that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif sufficiently approximates the response function R(w) over the entire space of biochemical factor values. Note that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq14_HTML.gif provides a polynomial approximation of the response function R(w), similar to the one given by (17). However, the polynomials used in the approximation given by (22) are orthonormal Hermite polynomials, as opposed to the polynomials used in the approximation given by (17), which are standard second- and fourth-order polynomials. Note also that the approximation given by (22) is "global," in the sense that it is based on approximating the system response function R(w) over the entire factor space, whereas, the approximation given by (17) is "local," in the sense that it approximates the system response function R(w) in a neighborhood around w= 0.

By using (22), the orthonormality of the Hermite polynomials, and the statistical independence and zero-mean Gaussianity of the biochemical factors W j , we can show that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif are given by (14) and (15), with
https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_Equ23_HTML.gif
(23)

As a consequence, we obtain again an analytical expression for the sensitivity indices https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , which requires evaluation of the α parameters. This can be done by polynomial regression based on the Monte Carlo Latin hypercube sampling approach we discuss in Additional file 1. We refer to https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq12_HTML.gif and https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq13_HTML.gif , given by (14), (15), and (23), as the (second-order) SESI's and JESI's obtained by O rthonormal H ermite A pproximation (OHA). The resulting method is based on the approach suggested in [4447] and requires L system integrations, where L is the number of regression points obtained by Latin hypercube sampling. We here take the number of regression points used to be the same as the number of Latin hypercube samples employed by MC, although these two numbers can be different in general. As a consequence, the number of system integrations performed by OHA is smaller than the number 2L(J + 1) of system integrations used in MC by a factor of 2(J + 1), but it could be larger than the number of system integrations required by DA, PA, or GHI.

Results

We now employ the previously discussed techniques to estimate the variance-based sensitivity indices σ j and η j associated with the duration, integrated response, and strength of ERK-PP activity. We do this by considering the dynamic behavior, within a time frame of 6 hours, of the MAPK signaling cascade model depicted in Figure 1 (see Additional file 2 for more details on this model). As we have explained in the previous section, we consider two cases: ROSA and SOSA. In each case, we need to set values for the standard deviations { https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq10_HTML.gif , m = 1,2, ..., M} of the standard chemical potentials of the activated complexes of the reactions and the standard deviations {λ n , n = 1,2, ..., N} of the standard chemical potentials of the molecular species. Due to difficulties in obtaining these values in practice, we assume here that https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq10_HTML.gif = λ, for m = 1,2, ..., M, and λ n = λ, for n = 1,2, ..., N, and consider λ, λ as two "user-defined" parameters that quantify the perturbation levels in biochemical factor values. By following our previous work in [16], we perform sensitivity analysis with λ, λ = 0.1, 0.2, 0.3, 0.4. Finally, we employ L = 6,000 Latin hypercube samples in MC and OHA, S = 4 regression samples per factor in PA, and a Gauss-Hermite integration of order Q = 5 in GHI.

In our simulations, we use S = 4 regression points per biochemical factor, located at -2w, -w, w, and 2w, where w = λ for ROSA and w = λ for SOSA (i.e., we use regression points located at ± one and two standard deviations from 0). Note also that, as a consequence of equation (6), if Y n = 0, for n = 1, 2, ..., N, then a ± λ variation in the values of https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq9_HTML.gif about zero will produce a variation in the nominal values of the rate constants of the mth reaction within the percentage interval 100[exp{- https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq10_HTML.gif } - 1, exp{ https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq10_HTML.gif } - 1]%. This corresponds to variations in the nominal values of the reaction rate constants within the interval [-9.52%, 10.52%], for λ = 0.1, [-18.13%, 22.14%], for λ = 0.2, [-25.92%, 34.99%], for λ = 0.3, and [-32.97%, 49.18%], for λ = 0.4.

In Table 1, we summarize the number of system integrations and the equations used by each method. For ROSA-based sensitivity analysis (J = 21), the number of system integrations required by DA, PA, GHI, and OHA, are respectively only 0.35%, 1.30%, 1.30%, and 2.27% of that required by MC. For SOSA-based sensitivity analysis (J = 23), the number of system integrations required by DA, PA, GHI, and OHA, are respectively only 0.38%, 1.44%, 1.44%, and 2.08% of that required by MC.
Table 1

Required system integrations, equations used, and sources of error.

Method

System Integrations

ROSA

SOSA

Equations Used

Error Sources

MC

2L(J + 1)

264000

288000

(10)-(12)

• number of MC samples used

DA

2J(J + 1) + 1

925

1105

(14)-(16)

• local approximation

     

• truncation of Taylor series

     

• derivative approximation

PA

J(J - 1)S2/2 + JS + 1

3445

4141

(14), (15), (18)

• local approximation

     

• truncation of FD-HDMR

     

• polynomial approximation

     

• polynomial regression

GHI

2J(J - 1)Q/22 + 2JQ/2 + 1

3445

4141

(14), (15), (19)-(21)

• local approximation

     

• truncation of FD-HDMR

     

• Gauss-Hermite integration

OHA

L

6000

6000

(14), (15), (23)

• truncation of ANOVA-HDMR

     

• Hermite approximation

     

• polynomial regression

L: number of Monte Carlo (Latin hypercube) samples.

J: number of biochemical factors.

S: number of regression samples per factor.

Q: order of Gauss-Hermite integration.

We list the ROSA results in Tables 2, 3, and 4, whereas, we list the SOSA results in Tables S-3.1, S-3.2, and S-3.3 of Additional file 3. The results are given in percentages and have been truncated to the nearest integers. To reduce the size of the tables, we depict only the results associated with reactions whose truncated SESI or JESI values, estimated by MC, are at least 5%. We consider the SESI and JESI values obtained by MC as the "true" values. By following our previous work in [16], we classify reactions and molecular species into one of four categories of interest: singularly influential, jointly influential, singularly/jointly influential, and noninfluential. We do this by comparing their SESI and JESI values to a 10% threshold. Bold reaction numbers indicate SESI or JESI values, obtained by MC, that are above the 10% threshold. Note that a reaction is singularly influential if the corresponding SESI value is at least 10% and the JESI value is smaller than 10%, jointly influential if the JESI value is at least 10% and the SESI value is smaller than 10%, singularly/jointly influential if both the SESI and JESI values are at least 10%, and noninfluential if both the SESI and JESI values are smaller than 10%.
Table 2

ROSA-based sensitivity analysis results for the duration of ERK-PP activity.

SESI - DURATION (λ = 0.1)

JESI - DURATION (λ = 0.1)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

28

28

28

27

28

4

1

0

0

0

0

6

24

26

25

22

25

6

1

0

0

0

0

11

7

7

7

9

8

11

0

0

0

0

0

13

18

18

20

18

19

13

1

0

0

0

0

SESI - DURATION ( λ = 0.2)

JESI - DURATION ( λ = 0.2)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

26

27

27

29

27

4

2

1

1

1

1

6

22

25

25

25

23

6

2

1

1

1

1

11

7

7

7

8

8

11

1

0

0

0

0

13

16

17

18

16

17

13

1

1

0

0

0

17

5

5

6

4

5

17

1

1

1

1

1

21

5

5

5

6

5

21

1

1

0

1

1

SESI - DURATION ( λ = 0.3)

JESI - DURATION ( λ = 0.3)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

26

26

26

24

26

4

1

2

2

2

2

6

21

24

20

21

21

6

1

2

1

1

1

11

7

6

7

7

8

11

0

1

0

0

0

13

15

16

13

15

15

13

1

1

1

1

1

17

5

4

6

5

5

17

1

2

2

2

1

21

6

5

8

8

6

21

2

2

3

2

1

SESI - DURATION ( λ = 0.4)

JESI - DURATION ( λ = 0.4)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

23

24

23

21

25

4

4

3

2

3

3

6

19

22

20

19

21

6

4

3

2

2

2

11

8

6

6

7

9

11

1

1

0

0

0

13

14

15

12

11

15

13

1

2

1

1

1

17

5

4

6

8

5

17

2

3

2

3

1

Table 3

ROSA-based sensitivity analysis results for the integrated response of ERK-PP activity.

SESI - I-RESPONSE (λ = 0.1)

JESI - I-RESPONSE (λ = 0.1)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

39

39

39

39

39

4

1

0

0

0

0

6

26

27

27

27

27

6

1

0

0

0

0

11

9

10

9

9

9

11

0

0

0

0

0

13

8

8

8

8

8

13

0

0

0

0

0

SESI - I-RESPONSE ( λ = 0.2)

JESI - I-RESPONSE ( λ = 0.2)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

37

38

40

40

39

4

5

1

1

2

2

6

25

27

26

26

25

6

4

0

0

1

1

8

5

5

5

5

6

8

2

0

0

1

1

11

7

9

8

8

8

11

1

0

0

0

0

13

6

8

7

7

7

13

1

1

0

0

0

SESI - I-RESPONSE ( λ = 0.3)

JESI - I-RESPONSE ( λ = 0.3)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

38

37

43

41

36

4

10

2

9

10

11

6

21

26

22

21

21

6

7

1

4

4

6

8

8

4

7

7

7

8

4

0

3

4

5

SESI - I-RESPONSE ( λ = 0.4)

JESI - I-RESPONSE ( λ = 0.4)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

36

36

43

40

34

4

15

3

18

15

16

6

18

25

16

19

18

6

8

2

7

7

8

8

8

4

8

9

8

8

7

1

6

6

7

Table 4

ROSA-based sensitivity analysis results for the strength of ERK-PP activity.

SESI - STRENGTH (λ = 0.1)

JESI - STRENGTH (λ = 0.1)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

38

38

36

30

38

4

1

0

0

0

0

6

17

15

15

14

17

6

1

1

0

0

0

8

10

10

9

6

10

8

1

0

0

0

0

11

8

9

9

4

8

11

0

0

0

0

0

19

12

10

12

15

13

19

1

1

0

0

0

SESI - STRENGTH ( λ = 0.2)

JESI - STRENGTH ( λ = 0.2)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

32

34

40

39

33

4

13

2

3

8

11

6

14

14

14

12

13

6

8

3

1

3

6

8

8

9

11

12

9

8

7

1

1

2

5

17

6

4

6

3

6

17

6

1

1

2

4

19

10

9

11

12

12

19

5

2

1

1

4

SESI - STRENGTH ( λ = 0.3)

JESI - STRENGTH ( λ = 0.3)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

31

30

37

37

27

4

23

3

22

25

26

6

10

12

12

11

10

6

17

5

9

10

15

8

9

8

10

9

8

8

11

2

8

9

11

19

6

8

7

6

5

19

5

4

3

3

4

SESI - STRENGTH ( λ = 0.4)

JESI - STRENGTH ( λ = 0.4)

Reaction

MC

DA

PA

GHI

OHA

Reaction

MC

DA

PA

GHI

OHA

4

28

25

40

36

26

4

28

5

29

27

29

5

2

1

1

0

2

5

6

5

2

2

5

6

10

10

9

11

10

6

16

7

11

11

15

8

8

7

8

10

8

8

15

3

11

11

14

15

1

0

0

0

2

15

7

5

4

4

7

21

1

0

0

0

1

21

7

4

4

4

8

In the remaining of this section, we discuss the ROSA results separately for each response characteristic. A similar discussion applies for the SOSA results presented in Additional file 3.

Duration

Estimation, by MC, of the ROSA-based sensitivity indices associated with the duration of ERK-PP activity produces values that change little with the size λ of the underlying perturbations; see Table 2. Moreover, the estimated SESI and JESI values indicate that the duration is primarily affected by reactions 4, 6, and 13 (refer to Figure 1 and Additional file 2 for identifying these reactions), which exert their influence only singularly (since the SESI values are larger than 10%, whereas the corresponding JESI values are less than 10%). As a matter of fact, all JESI values are negligible, which indicates that the log-duration may be approximately additive, at least within the range of the applied perturbations. Note that a multivariate response function is called additive if it can be decomposed into a sum of one-dimensional functions of one variable. Additive response functions do not produce high-order (≥ 2) joint effects and result in zero JESI values [2]. Although a linear response function is additive, the inverse is not necessarily true. It turns out that the SESI's associated with an additive response function can be well estimated by all previous approximation techniques.

From the results depicted in Table 2 (and Table S-3.1 in Additional file 3), it is clear that, as compared to MC, the DA, PA, GHI, and OHA consistently provide good approximations to the SESI and JESI values at all perturbation levels. Moreover, all methods can be used to correctly classify reactions 4, 6, and 13 as being singularly influential.

Integrated response

Estimation, by MC, of the ROSA-based sensitivity indices associated with the integrated response of ERK-PP activity produces the SESI and JESI values depicted in Table 3. These values indicate that the integrated response is primary influenced by reactions 4 and 6 (refer to Figure 1 and Additional file 2 for identifying these reactions). For small to moderate perturbations (i.e., for λ = 0.1, 0.2), reactions 4 and 6 influence the integrated response only singularly. However, for large perturbations (i.e., for λ = 0.3, 0.4), reaction 4 influences the integrated response both singularly and jointly (since both SESI and JESI values are at least 10%), whereas, reaction 6 still influences the integrated response only singularly.

It is clear from the results depicted in Table 3 (and Table S-3.2 in Additional file 3) that all approximation techniques work relatively well for small to moderate perturbation levels (i.e., for λ = 0.1, 0.2), providing accurate SESI and JESI values, as compared to the values obtained by MC, and produce correct classification of the reactions. This is true, since the log integrated response may be approximately additive in this case, as indicated by the negligible JESI values. However, for large perturbations (i.e., for λ = 0.3, 0.4), the log integrated response is not additive anymore and the results obtained by DA deteriorate noticeably, deeming the use of DA inappropriate. For example, using the JESI results produced by ROSA, the largest differences between the values obtained by DA and MC are 8% and 12% for λ = 0.3, 0.4, respectively. As a matter of fact, the DA is not capable of capturing second-order joint effects and the resulting JESI values are very small. If we use the DA results to classify the reactions, then we will erroneously conclude that reaction 4 influences the integrated response only singularly, when λ = 0.3, 0.4.

From the results depicted in Table 3 (and Table S-3.2 in Additional file 3), it is clear that, for large perturbations, GHI and OHA provide good approximations to the sensitivity indices. Moreover, the results indicate that OHA may be a better approximation technique than GHI (e.g., compare the SESI results obtained by GHI and OHA for reaction 4). On the other hand, the results obtained by PA are much better than the results obtained by DA. However, the performance of PA may deteriorate at high perturbation levels and may become inferior to GHI and OHA (e.g., compare the results obtained by PA, GHI, and OHA for reaction 4). Finally, it is clear that the sensitivity results obtained by GHI and OHA can be used to correctly classify all reactions.

Strength

Estimation, by MC, of the ROSA-based sensitivity indices associated with the strength of ERK-PP activity produces the SESI and JESI values depicted in Table 4. These values indicate that the log strength may be approximately additive when λ = 0.1. However, the log strength becomes nonadditive when λ = 0.2, 0.3, 0.4, since the estimated JESI values are not negligible at these perturbation levels. Note that, when λ = 0.1, the strength is primarily affected by reactions 4, 6, 8, and 19, which exert their influence only singularly. However, when λ = 0.2, reaction 8 becomes noninfluential, reaction 4 influences the strength both singularly and jointly, whereas, reactions 6 and 19 still influence the strength singularly. On the other hand, when λ = 0.3, 0.4, reactions 4 and 6 influence the strength both singularly and jointly, whereas, reaction 8 influences the strength only jointly (since the JESI values are larger than 10%, whereas, the corresponding SESI values are less than 10%).

It is clear from the results depicted in Table 4 (and Table S-3.3 in Additional file 3) that all approximation techniques work relatively well when λ = 0.1, producing accurate SESI and JESI values, as compared to the values obtained by MC, and resulting in correct classification of the reactions. However, when λ = 0.2, 0.3, 0.4, DA produces inaccurate results, while the performance of PA and GHI deteriorates noticeably. For example, using the JESI results produced by ROSA, the largest differences between the values obtained by DA and MC are 11%, 20% and 23% for λ = 0.2, 0.3, 0.4, respectively. Moreover, the largest differences between the values obtained by PA and MC are 10%, 8% and 5% for λ = 0.2, 0.3, 0.4, respectively. Finally, the largest differences between the values obtained by GHI and MC are 5%, 7% and 5% for λ = 0.2, 0.3, 0.4, respectively. Once more, OHA consistently provides good results, which can be used to correctly classify the reactions at all perturbation levels.

Discussion

The previous numerical results demonstrate that, in terms of estimation accuracy, OHA is the best method and DA is the worst, whereas, PA and GHI are in between, with GHI slightly better than PA. To explain why this is so, we must investigate the sources of error introduced by each technique, which we summarize in Table 1.

The estimation error produced by the MC approach is mainly due to the finite number L of samples used and decreases slowly as L increases, regardless of the number J of biochemical factors used, at least theoretically. Note, however, that to achieve a certain level of accuracy in practice, we may also need to increase L as the number J of biochemical factors increases, due to the exponential growth in the volume of the biochemical factor space when adding extra dimensions ("curse of dimensionality").

There are two sources of error associated with DA. First, substantial errors may be introduced due to the fact that DA locally approximates the response function by a Taylor series expansion that includes only first- and second-order partial derivatives. Consequently, DA may not produce good estimates of the sensitivity indices under large perturbations, since a second-order Taylor series approximation of the response function may not be sufficiently accurate over the range of factor values generated by such perturbations. This is especially true when the response function is nonadditive (as it is the case with the log integrated response and the log strength of ERK-PP in the MAPK example). In such cases, large factor variations may produce substantial joint effects, which cannot be captured by a local second-order Taylor series approximation. This is evident by the fact that, under large perturbations, the JESI values obtained by DA, associated with the integrated response and strength, are significantly different than the ones produced by MC.

A second source of error associated with DA is the approximation of the first- and second-order derivatives of the response function by finite-differences. In our simulations, we approximate the first- and second-order partial derivatives of the response function by using equations (S-1.35) and (S-1.36) in Additional file 1, with Δ = 0.1. It has been pointed out in [1] that the resulting approximations must be carefully used, since it is difficult to theoretically predict, control, and numerically evaluate their accuracy. Although a number of techniques have been developed to deal with this problem [40], exact evaluation of the response derivatives usually requires simultaneous integration of a set of "sensitivity equations," together with the differential equations governing the underlying molecular concentration dynamics, which turns out to be a very difficult task due to stiffness of the resulting system of differential equations [1].

PA attempts to improve the accuracy of DA by adding high-order derivative terms in the Taylor series expansion of the response function. In addition to the first- and second-order partial derivatives used by the DA, the Taylor series expansion now includes third- and fourth-order partial derivatives that involve only two biochemical factors. Moreover, instead of approximating the derivatives by finite differences, the method avoids such computations by expanding the response function using FD-HDMR, by truncating all components of order ≥ 3, by respectively approximating the first- and second-order FD-HDMR components with second- and fourth-order polynomials, and by estimating the coefficients of these polynomials using regression (see Additional file 1 for details). Errors are introduced by truncating the FD-HDMR and locally approximating the resulting response function by a fourth-order polynomial including only single biochemical factors and pairs of factors. As a consequence, PA may not be able to accurately estimate some SESI and JESI values under large perturbations, since the underlying truncation and polynomial approximation of the response function may not be sufficiently accurate over the range of factor values generated by such perturbations. Note also that errors can be introduced due to estimating the polynomial coefficients by regression, a situation that cannot be evaluated and controlled easily. As a matter of fact, and counter to intuition, we cannot necessarily increase accuracy of estimation by using more samples per biochemical factor, especially when dealing with polynomial regression [48, 49].

GHI attempts to improve the accuracy of estimating the sensitivity indices by employing the exact first- and second-order FD-HDMR components, and numerically calculating the required expectations and variances using Gauss-Hermite integrations (see Additional file 1 for details). Errors are introduced when truncating the FD-HDMR and evaluating the expectations and variances by one- and two-dimensional Gauss-Hermite integrations. Evaluating and controlling these errors is practically impossible. Note that higher-order Gauss-Hermite integrations do not necessarily produce higher accuracy. This is true only when the integrands are sufficiently smooth, in the sense that can be well-approximated by polynomials [49]. Truncation of the FD-HDMR essentially corresponds to a local approximation of the response function, although this approximation is expected to be more accurate than the Taylor series and polynomial approximations used by DA and PA, respectively. As a consequence, GHI may not be able to accurately estimate some SESI and JESI values under large perturbations, since the underlying FD-HDMR truncation may not be sufficiently accurate over the range of factor values generated by such perturbations.

Finally, the errors introduced by OHA are due to approximating the ANOVA-HDMR expansion of the response function by first- and second-order ANOVA-HDMR components, approximating these components with first- and second-order orthonormal Hermite polynomials, and estimating the coefficients of these polynomials using regression (see Additional file 1 for details). Here, the truncation of high-order ANOVA-HDMR components does not correspond to a local approximation of the response function, which is why this approximation is more accurate than truncating the FD-HDMR components, as in GHI. In fact, if we consider perturbation levels at which the higher-order (≥ 3) terms in the variance decomposition scheme given by (7) are negligible, then the higher-order (≥ 3) terms in the ANOVA-HDMR decomposition of the response function will be negligible as well [see equation (S-1.30) in Additional file 1]. This is not necessarily true for the higher-order terms in the FD-HDMR decomposition. Therefore, truncating the ANOVA-HDMR decomposition of the response function, as opposed to the FD-HDMR decomposition, is well justified for perturbation levels at which the response variance is not appreciably influenced by high-order joint effects. Under very large perturbations, OHA may not accurately estimate the sensitivity indices, since the underlying truncation of ANOVA-HDMR may not be accurate enough due to appreciable high-order (≥ 3) joint effects in the response variance. However, the global nature of the approximation methodology employed by OHA, the direct relationship between ANOVA-HDMR and the response variance decomposition scheme given by (7), and the orthonormality properties of the Hermite polynomials, make OHA the most desirable technique for approximating the sensitivity indices, among the techniques considered in this paper.

Although we have also obtained simulation results for other biochemical reaction systems, due to lack of space, we have limited our presentation in this paper to the results obtained for the MAPK model depicted in Figure 1. To illustrate various aspects of the approximation techniques and their relative merits, we have chosen the response functions to represent three types of high-dimensional system responses: the log duration, lnD, is approximately additive for the levels of biochemical factor uncertainty considered in this paper, the log integrated response, lnI, is moderately nonadditive, whereas, the log strength, lnS, is highly nonadditive. Based on our experience so far, all our simulation results are consistent with each other and perfectly agree with the theoretical analysis presented in this paper. We therefore believe that the conclusions based on the MAPK model are general and can be applied to other biochemical reaction systems as well.

It is very important to keep in mind that the four approximation techniques considered in this paper are based on the assumption that, for most biochemical reaction systems of interest, perturbations of input biochemical factors will produce only single and second-order joint effects at the output. As a consequence, truncating the HDMR of the response function to a second-order is a natural thing to do. Note that this assumption depends on the particular choice of the biochemical factors used, on how the system response relates to these factors, and on the perturbation levels used for sensitivity analysis. In general, the approximation methods discussed in this paper are expected to fail in the presence of high-order ≥ 3 joint effects among biochemical factors. Therefore, it may be necessary in these cases to consider truncated HDMR's that include higher-order basis functions. Extension to this case is straightforward but computationally demanding, since higher-order cases require evaluation of a large number of variance terms in the decomposition scheme given by (7), which can be a tedious thing to do for large biochemical reaction systems.

We should point out here that GHI is based on the methodology proposed in [42, 43], which has been effectively used to calculate statistical moments of the responses of high-dimensional mechanical systems subject to randomly fluctuating loads. In this paper, we have reformulated this method to fit the framework of variance-based sensitivity analysis and have applied it to biochemical reaction systems. On the other hand, OHA is based on the methodology proposed in [25, 44, 45, 50] for approximating ANOVA-HDMR's using orthonormal basis functions. OHA can also be viewed as a special case of the polynomial chaos expansion (PCE) approach to sensitivity analysis discussed in [46, 47, 51], and has been recently employed in [52] for estimating variance-based sensitivity indices in order to learn the topology of a functional network of interactions from given data. To our knowledge, this is the first time that the four approximation techniques presented in this paper are systematically compared to each other and used to study the sensitivity properties of biochemical reaction systems.

To conclude, we would like to stress the fact that the approximation techniques presented in this paper have been derived by assuming that the biochemical factors used for sensitivity analysis are statistically independent and that each factor follows a Gaussian distribution. The assumption of statistical independence between the random variables { https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq9_HTML.gif , m = 1,2, ..., M} and {Y n , n = 1, 2, ..., N} has been justified in [16]. However, justifying mutual independence within the sets { https://static-content.springer.com/image/art%3A10.1186%2F1471-2105-11-246/MediaObjects/12859_2009_Article_3703_IEq9_HTML.gif , m = 1, 2, ..., M} and {Y n , n = 1,2, ..., N} is a very difficult thing to do. We simply view this assumption as a convenient approximation that allows us to proceed with the sensitivity analysis approaches discussed in this paper. Developing variance-based sensitivity analysis for correlated biochemical factors is a challenging problem that needs careful investigation [2, 53]. On the other hand, if the biochemical factors follow non-Gaussian distributions, such as uniform, gamma, binomial, etc., the approximation techniques must be appropriately modified to accommodate these distributions. For example, if each biochemical factor follows a uniform distribution, then we must replace the Gauss-Hermite integration step in GHI by Gauss-Legendre integration [49]. Moreover, if the biochemical factors follow gamma distributions, then we must replace the orthonormal Hermite polynomials in OHA by orthonormal Laguerre polynomials [47, 51].

Conclusions

In this paper, we discussed four methods that one can use to analytically approximate the second-order sensitivity indices associated with a previously introduced variance-based sensitivity analysis methodology for biochemical reaction systems. The need for developing such methods stems from an effort to remedy the large computational burden associated with Monte Carlo estimation. We highlighted important theoretical, numerical, and computational aspects of each method, in an attempt to provide a comprehensive understanding of the advantages and disadvantages of each technique. Our simulation results, based on a mathematical model for the MAPK signalling cascade, clearly demonstrate the inferiority of second-order derivative-based sensitivity analysis at moderate to high levels of uncertainty. It also shows the superiority of OHA, which is constructed by truncating the ANOVA-HDMR of the response function of a biochemical reaction system and approximating the first- and second-order ANOVA-HDMR component functions with orthonormal Hermite polynomials.

Declarations

Acknowledgements

The authors acknowledge the National Science Foundation (NSF) for support of this research. Special thanks to Garrett Jenkinson for providing the C++ code used to develop a fast Matlab implementation of the MAPK signaling cascade model.

Authors’ Affiliations

(1)
Whitaker Biomedical Engineering Institute, The Johns Hopkins University

References

  1. Varma A, Morbidelli M, Wu H: Parametric Sensitivity in Chemical Systems. Cambridge, UK: Cambridge University Press; 1999.View ArticleGoogle Scholar
  2. Saltelli A, Ratto M, Andres T, Campolongo F, Cariboni J, Gatelli D, Saisana M, Tarantola S: Global Sensitivity Analysis: The Primer. Chichester, UK: John Wiley; 2008.Google Scholar
  3. Heinrich R, Schuster S: The Regulation of Cellular Systems. New York: Chapman & Hall; 1996.View ArticleGoogle Scholar
  4. Cho KH, Shin SY, Kolch W, Wolkenhauer O: Experimental design in systems biology, based on parameter sensitivity analysis using a Monte Carlo method: A case study for the TNF α -mediated NF- κ B signal transduction pathway. Simulation 2003, 79: 726–739. 10.1177/0037549703040943View ArticleGoogle Scholar
  5. Degenring D, Froemel C, Dikta G, Takors R: Sensitivity analysis for the reduction of complex metabolism models. J Process Contr 2004, 14: 729–745. 10.1016/j.jprocont.2003.12.008View ArticleGoogle Scholar
  6. Feng XJ, Hooshangi S, Chen D, Li G, Weiss R, Rabitz H: Optimizing genetic circuits by global sensitivity analysis. Biophys J 2004, 87: 2195–2202. 10.1529/biophysj.104.044131View ArticlePubMedPubMed CentralGoogle Scholar
  7. Leloup JC, Goldbeter A: Modeling the mammalian circadian clock: Sensitivity analysis and multiplicity of oscillatory mechanisms. J Theor Biol 2004, 230: 541–562. 10.1016/j.jtbi.2004.04.040View ArticlePubMedGoogle Scholar
  8. Stelling J, Gilles ED, Doyle FJ: Robustness properties of circadian clock architectures. P Natl Acad Sci USA 2004, 101: 13210–13215. 10.1073/pnas.0401463101View ArticleGoogle Scholar
  9. Hornberg JJ, Binder B, Bruggeman FJ, Schoeberl B, Heinrich R, Westerhoff HV: Control of MAPK signalling: From complexity to what really matters. Oncogene 2005, 24: 5533–5542. 10.1038/sj.onc.1208817View ArticlePubMedGoogle Scholar
  10. Hu D, Yuan JM: Time-dependent sensitivity analysis of biological networks: Coupled MAPK and PI3K signal transduction pathways. J Chem Phys 2006, 110(16):5361–5370.View ArticleGoogle Scholar
  11. Mahdavi A, Davey RE, Bhola P, Yin T, Zandstra PW: Sensitivity analysis of intracellular signaling pathway kinetics predicts targets for stem cell fate control. PLoS Comput Biol 2007, 3: 1257–1267. 10.1371/journal.pcbi.0030130View ArticleGoogle Scholar
  12. Berger SI, Iyengar R: Network analyses in systems pharmacology. Bioinformatics 2009, 25: 2466–2472. 10.1093/bioinformatics/btp465View ArticlePubMedPubMed CentralGoogle Scholar
  13. Krewski D, Wang Y, Bartlett S, Krishnan K: Uncertainty, variability, and sensitivity analysis in physiological pharmacokinetic models. J Biopharm Stat 1995, 5: 245–271. 10.1080/10543409508835112View ArticlePubMedGoogle Scholar
  14. Nestorov I: Whole body pharmacokinetic models. Clin Pharmacokinet 2003, 42: 883–908. 10.2165/00003088-200342100-00002View ArticlePubMedGoogle Scholar
  15. Ederer M, Gilles ED: Thermodynamically feasible kinetic models of reaction networks. Biophys J 2007, 92: 1846–1857. 10.1529/biophysj.106.094094View ArticlePubMedPubMed CentralGoogle Scholar
  16. Zhang HX, Dempsey WP, Goutsias J: Probabilistic sensitivity analysis of biochemical reaction systems. J Chem Phys 2009, 131: 094101. 10.1063/1.3205092View ArticlePubMedGoogle Scholar
  17. Sobol' IM: Sensitivity analysis for nonlinear mathematical models. Math Mod Comput Exp 1993, 1: 407–414.Google Scholar
  18. Saltelli A, Tarantola S, Campolongo F, Ratto M: Sensitivity Analysis in Practice: A Guide to Assessing Scientific Models. Chichester, UK: John Wiley; 2004.Google Scholar
  19. Saltelli A, Ratto M, Tarantola S, Campolongo F: Sensitivity analysis for chemical models. Chem Rev 2005, 105: 2811–2827. 10.1021/cr040659dView ArticlePubMedGoogle Scholar
  20. Saltelli A: Making best use of model evaluations to compute sensitivity indices. Comput Phys Commun 2002, 145: 280–297. 10.1016/S0010-4655(02)00280-1View ArticleGoogle Scholar
  21. Liu JS: Monte Carlo Strategies in Scientific Computing. New York: Springer; 2001.Google Scholar
  22. Helton JC, Davis FJ: Latin hypercube sampling and the propagation of uncertainty in analyses of complex systems. Reliab Eng Syst Safe 2003, 81: 23–69. 10.1016/S0951-8320(03)00058-9View ArticleGoogle Scholar
  23. Rabitz H, Alis ÖF, Shorter J, Shim K: Efficient input-output model representations. Comput Phys Commun 1999, 117: 11–20. 10.1016/S0010-4655(98)00152-0View ArticleGoogle Scholar
  24. Rabitz H, Alis ÖF: General foundations of high-dimensional model representations. J Math Chem 1999, 25: 197–233. 10.1023/A:1019188517934View ArticleGoogle Scholar
  25. Li G, Rosenthal C, Rabitz H: High dimensional model representations. J Phys Chem A 2001, 105: 7765–7777. 10.1021/jp010450tView ArticleGoogle Scholar
  26. Oakley JE, O'Hagan A: Probabilistic sensitivity analysis of complex models: a Bayesian approach. J Roy Stat Soc B 2004, 66(3):751–769. 10.1111/j.1467-9868.2004.05304.xView ArticleGoogle Scholar
  27. Ratto M, Pagano A, Young P: State dependent parameter metamodelling and sensitivity analysis. Comput Phys Commun 2007, 177: 863–876. 10.1016/j.cpc.2007.07.011View ArticleGoogle Scholar
  28. Storlie CB, Helton JC: Multiple predictor smoothing methods for sensitivity analysis: Description of techniques. Reliab Eng Syst Safe 2008, 93: 28–54. 10.1016/j.ress.2006.10.012View ArticleGoogle Scholar
  29. Storlie CB, Swiler LP, Helton JC, Sallaberry CJ: Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally demanding models. Reliab Eng Syst Safe 2009, 94: 1735–1763. 10.1016/j.ress.2009.05.007View ArticleGoogle Scholar
  30. Marshall CJ: Specificity of receptor tyrosine kinase signaling: Transient versus sustained extracellular signal-regulated kinase activation. Cell 1995, 80: 179–185. 10.1016/0092-8674(95)90401-8View ArticlePubMedGoogle Scholar
  31. Murphy LO, Smith S, Chen RH, Fingar DC, Blenis J: Molecular interpretation of ERK signal duration by immediate early gene products. Nat Cell Biol 2002, 4: 556–564.PubMedGoogle Scholar
  32. Murphy LO, MacKeigan JP, Blenis J: A network of immediate early gene products propagates subtle differences in mitogen-activated protein kinase signal amplitude and duration. Mol Cell Biol 2004, 24: 144–153. 10.1128/MCB.24.1.144-153.2004View ArticlePubMedPubMed CentralGoogle Scholar
  33. Mayawala K, Gelmi CA, Edwards JS: MAPK cascade possesses decoupled controllability of signal amplification and duration. Biophys J 2004, 87: L01-L02. 10.1529/biophysj.104.051888View ArticlePubMedPubMed CentralGoogle Scholar
  34. Ebisuya M, Kondoh K, Nishida E: The duration, magnitude and compartmentalization of ERK MAP kinase activity: Mechanisms for providing signaling specificity. J Cell Sci 2005, 118: 2997–3002. 10.1242/jcs.02505View ArticlePubMedGoogle Scholar
  35. Tombes RM, Auer KL, Mikkelsen R, Valerie K, Wymann MP, Marshall CJ, McMahon M, Dent P: The mitogen-activated protein (MAP) kinase cascade can either stimulate or inhibit DNA synthesis in primary cultures of rat hepatocytes depending upon whether its activation is acute/phasic or chronic. Biophys J 1998, 330(Pt 3):1451–1460.Google Scholar
  36. Asthagiri AR, Reinhart CA, Horwitz AF, Lauffenburger DA: The role of transient ERK2 signals in fibronectin- and insulin-mediated DNA synthesis. J Cell Sci 2000, 113: 4499–4510.PubMedGoogle Scholar
  37. Berry RS, Rice SA, Ross J: Physical Chemistry. 2nd edition. New York: Oxford University Press; 2000.Google Scholar
  38. Sobol' IM: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Math Comput Simulat 2001, 55: 271–280. 10.1016/S0378-4754(00)00270-6View ArticleGoogle Scholar
  39. Sobol' IM: Theorems and examples on high-dimensional model representation. Reliab Eng Syst Safe 2003, 79: 187–193. 10.1016/S0951-8320(02)00229-6View ArticleGoogle Scholar
  40. Cacuci DG: Sensitivity and Uncertainty Analysis. Theory. Volume I. Boca Raton: Chapman & Hall/CRC; 2003.View ArticleGoogle Scholar
  41. Chen W, Jin R, Sudjianto A: Analytical variance-based global sensitivity analysis in simulation-based design under uncertainty. J Mech Design 2005, 127: 875–886. 10.1115/1.1904642View ArticleGoogle Scholar
  42. Xu H, Rahman S: A generalized dimension-reduction method for multidimensional integration in stochastic mechanics. Int J Numer Meth Eng 2004, 61: 1992–2019. 10.1002/nme.1135View ArticleGoogle Scholar
  43. Xu H, Rahman S: Decomposition methods for structural reliability analysis. Probabilist Eng Mech 2005, 20: 239–250. 10.1016/j.probengmech.2005.05.005View ArticleGoogle Scholar
  44. Li G, Wang SW, Rabitz H: Practical approaches to construct RS-HDMR component functions. J Phys Chem A 2002, 106: 8721–8733. 10.1021/jp014567tView ArticleGoogle Scholar
  45. Wang SW, Georgopoulos PG, Li G, Rabitz H: Random sampling-high dimensional model representation (RS-HDMR) with nonuniform distributed variables: Application to an integrated multimedia/multipathway exposure and dose model for trichloroethylene. J Phys Chem A 2003, 107: 4707–4716. 10.1021/jp022500fView ArticleGoogle Scholar
  46. Sudret B: Global sensitivity analysis using polynomial chaos expansions. Reliab Eng Syst Safe 2008, 93: 964–979. 10.1016/j.ress.2007.04.002View ArticleGoogle Scholar
  47. Crestaux T, Maître OL, Martinez JM: Polynomial chaos expansion for sensitivity analysis. Reliab Eng Syst Safe 2009, 94: 1161–1172. 10.1016/j.ress.2008.10.008View ArticleGoogle Scholar
  48. Montgomery DC, Peck EA, Vining GG: Introduction to Linear Regression Analysis. 3rd edition. New York: John Wiley; 2001.Google Scholar
  49. Press WH, Teukolsky SA, Vetterling WT, Flannery BP: Numerical Recipes: The Art of Scientific Computing. 3rd edition. New York: Cambridge University Press; 2007.Google Scholar
  50. Aliş ÖF, Rabitz H: Efficient implementation of high dimensional model representations. J Math Chem 2001, 29: 127–142. 10.1023/A:1010979129659View ArticleGoogle Scholar
  51. Choi SK, Grandhi RV, Canfield RA, Pettit CL: Polynomial chaos expansion with Latin hypercube sampling for estimating response variability. AIAA J 2004, 42: 1191–1198. 10.2514/1.2220View ArticleGoogle Scholar
  52. Castillo E, Sánchez-Maroño N, Alonso-Betanzos A, Castillo C: Functional network topology learning and sensitivity analysis based on ANOVA decomposition. Neural Comput 2007, 19: 231–257. 10.1162/neco.2007.19.1.231View ArticlePubMedGoogle Scholar
  53. Li G, Hu J, Wang SW, Georgopoulos PG, Schoendorf J, Rabitz H: Random sampling-high dimensional model represenattion (RS-HDMR) and orthogonality of its different order component functions. J Phys Chem A 2006, 110: 2474–2485. 10.1021/jp054148mView ArticlePubMedGoogle Scholar

Copyright

© Zhang and Goutsias; licensee BioMed Central Ltd. 2010

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Advertisement