Deterministic global optimization algorithm based on outer approximation for the parameter estimation of nonlinear dynamic biological systems

Miró, Anton; Pozo, Carlos; Guillén-Gosálbez, Gonzalo; Egea, Jose A; Jiménez, Laureano

doi:10.1186/1471-2105-13-90

Research article
Open access
Published: 10 May 2012

Deterministic global optimization algorithm based on outer approximation for the parameter estimation of nonlinear dynamic biological systems

Anton Miró¹,
Carlos Pozo¹,
Gonzalo Guillén-Gosálbez¹,
Jose A Egea² &
…
Laureano Jiménez¹

BMC Bioinformatics volume 13, Article number: 90 (2012) Cite this article

6060 Accesses
29 Citations
Metrics details

Abstract

Background

The estimation of parameter values for mathematical models of biological systems is an optimization problem that is particularly challenging due to the nonlinearities involved. One major difficulty is the existence of multiple minima in which standard optimization methods may fall during the search. Deterministic global optimization methods overcome this limitation, ensuring convergence to the global optimum within a desired tolerance. Global optimization techniques are usually classified into stochastic and deterministic. The former typically lead to lower CPU times but offer no guarantee of convergence to the global minimum in a finite number of iterations. In contrast, deterministic methods provide solutions of a given quality (i.e., optimality gap), but tend to lead to large computational burdens.

Results

This work presents a deterministic outer approximation-based algorithm for the global optimization of dynamic problems arising in the parameter estimation of models of biological systems. Our approach, which offers a theoretical guarantee of convergence to global minimum, is based on reformulating the set of ordinary differential equations into an equivalent set of algebraic equations through the use of orthogonal collocation methods, giving rise to a nonconvex nonlinear programming (NLP) problem. This nonconvex NLP is decomposed into two hierarchical levels: a master mixed-integer linear programming problem (MILP) that provides a rigorous lower bound on the optimal solution, and a reduced-space slave NLP that yields an upper bound. The algorithm iterates between these two levels until a termination criterion is satisfied.

Conclusion

The capabilities of our approach were tested in two benchmark problems, in which the performance of our algorithm was compared with that of the commercial global optimization package BARON. The proposed strategy produced near optimal solutions (i.e., within a desired tolerance) in a fraction of the CPU time required by BARON.

Background

Elucidation of biological systems has gained wider interest in the last decade. Despite recent advances, fundamental understanding of life processes still requires powerful theoretical tools from mathematics and physical sciences. Particularly, mathematical modelling of biological systems is nowadays becoming an essential partner of experimental work. One of the most challenging tasks in computational modelling of biological systems is the estimation of the model parameters. The aim here is to obtain the set of parameter values that make the model response consistent with the data observed. Parameter estimation can be formulated as an optimization problem in which the sum of squared residuals between the measured and simulated data is minimized. The biological model dictates the type of optimization problem being faced. Many biological systems are described through nonlinear ordinary differential equations (ODEs) that provide the concentration profiles of certain metabolites over time. Recent methodological developments have enabled the generation of some dynamic profiles of gene networks and protein expression data, although the latter are still very rare. In this context, there is a strong motivation for developing systematic techniques for building dynamic biological models from experimental data. The parameter estimation of these models gives rise to dynamic optimization problems which are hard to solve.

Existing approaches to optimize dynamic models can be roughly classified as direct or indirect (also known as variational) [1]. Direct methods make use of gradient-based nonlinear programming (NLP) solvers and can in turn be divided into sequential and simultaneous. In sequential approaches, the optimization of the control variables, which are discretized, is performed by a NLP solver, whereas the ODE is calculated externally, that is, both steps are executed in a sequential manner. In contrast, in simultaneous strategies, both the control and state profiles are approximated using polynomials (e.g., Lagrange polynomials) and discretized in time by means of finite elements [2, 3]. In the latter strategy, the ODE system is replaced by a system of algebraic equations that is optimized with a standard gradient-based NLP solver. Simultaneous approaches can handle dynamic systems with unstable modes and with path constraints [1]. Furthermore, they allow performing automatic differentiation with respect to the control and state variables, avoiding the need to calculate the derivatives numerically as is the case in the sequential approach. Unfortunately, the discretization step can lead to large scale NLPs that are difficult to solve.

Models of biological systems are typically highly nonlinear, which gives rise to nonconvex optimization problems with multiple local solutions (i.e., multimodality). Because of this, traditional gradient-based methods used in the sequential and simultaneous approaches may fall in local optima. In the context of parameter estimation, these local solutions should be avoided, since they may lead to inaccurate models that are unable to predict the system’s performance precisely.

Global optimization (GO) algorithms are a special class of techniques that attempt to identify the global optimum in nonconvex problems. These methods can be classified as stochastic and deterministic. Stochastic GO methods are based on probabilistic algorithms that provide near optimal solutions in short CPU times. Despite having shown great potential with large-scale problems like parameter estimation [4], these methods have as major limitation that are unable to guarantee convergence to the global optimum in a finite number of iterations. In other words, they provide solutions whose optimality (i.e., quality) is unknown, and may or may not be globally optimal. In contrast, deterministic global optimization methods ensure global optimality within a desired tolerance, but lead to larger computational burdens. Hence, in addition to the solution itself, these methods provide as output a rigorous interval within which the best possible solution (i.e., global optimum) must fall. Despite recent advances in deterministic global optimization methods [5, 6], their application to parameter estimation has been quite scarce. Two main deterministic GO methods exist: spatial branch-and-bound (sBB) [2, 5–7], and outer approximation [8]. Both algorithms rely on computing valid lower and upper bounds on the global optimum. These bounds tend to approach as iterations proceed, thus offering a theoretical guarantee of convergence to the global optimum.

A rigorous lower bound on the global optimum of the original nonconvex problem is obtained by solving a valid relaxation that contains its feasible space. To construct this relaxed problem, the nonconvex terms in the original formulation are replaced by convex envelopes that overestimate its feasible region. There are different types of convex envelopes that provide relaxations for a wide variety of nonconvexities. These relaxations are the main ingredient of deterministic GO methods and play a key role in their performance. In general, tighter relaxations provide better bounds (i.e., closer to the global optimum), thereby expediting the overall solution procedure.

To the best of our knowledge, Esposito and Floudas were the first to propose a deterministic method for the global solution of dynamic optimization problems with embedded ODEs [2]. Their approach relies on reformulating the problem as a nonconvex NLP using orthogonal collocation on finite elements. This reformulated NLP was then solved by means of a sBB method. To this end, they constructed a convex relaxation of the reformulated problem following the α BB approach previously proposed by the authors [5–7]. Despite being valid for twice continuous differentiable functions, these relaxations may provide weak bounds in some particular cases and therefore lead to large CPU times when used in the context of a spatial branch and bound framework [9].

This work proposes a computational framework for the deterministic global optimization of parameter estimation problems of nonlinear dynamic biological systems. The main contributions of our work are: (1) the application of deterministic global optimization methods to dynamic models of biological systems, and (2) the use of several known techniques employed in dynamic (i.e., orthogonal collocation on finite elements) and global optimization (i.e., symbolic reformulation of NLPs and piecewise McCormick envelopes) in the context of an outer approximation algorithm. The approach presented relies on discretizing the set of nonlinear ODEs using orthogonal collocation on finite elements, thereby transforming the dynamic system into an equivalent nonconvex NLP problem. A customized outer approximation algorithm that relies on a mixed-integer linear programming (MILP) relaxation is used in an iterative scheme along with the aforementioned NLP to solve the nonconvex model to global optimality. The MILP relaxation is tightened using a special type of cutting plane that exploits the problem structure, thereby expediting the overall solution procedure.

The capabilities of our algorithm are tested through its application to two case studies: the isomerisation of α-Pinene (case study 1) and the inhibition of HIV proteinase (case study 2). The results obtained are compared with those produced by the state-of-art commercial global optimization package BARON (Branch And Reduce Optimization Navigator). Our algorithm is proved from these numerical examples to produce near optimal solutions in a fraction of the CPU time required by BARON.

Methods

Problem statement

The problem addressed in this work can be stated as follows: given is a dynamic kinetic model describing the mechanism of a set of biochemical reactions. The goal is to determine the appropriate values of the model coefficients (e.g., rate constants, initial conditions, etc.), so as to minimize the sum-of-squares of the residuals between the simulated data provided by the model and the experimental observations.

Mathematical formulation

We consider dynamic parameter estimation optimization problems of the following form:

min_{θ, {\hat{z}}_{u}} \sum_{j \in J M} \sum_{u \in U} {({\hat{z}}_{u, j} - {\bar{z}}_{u, j})}^{2}

(1)

s.t. {\dot{z}}_{j} = g (z, θ, t) \forall j \in J

(2)

z_{j} (t_{0}) = z_{0} \forall j \in J

(3)

t \in [t_{0}, t_{f}]

(4)

{\hat{z}}_{u, j} = z_{j} (t_{u}) \forall u \in U; \forall j \in J M

(5)

Where $\dot{z}$ represents the state variables (i.e., metabolite concentrations), z₀their initial conditions, ${\hat{z}}_{u, j}$ represents the experimental data variables, ${\bar{z}}_{u, j}$ are the experimental observations, J is the set of state variables whose derivatives explicitly appear in the model, θ are the parameters to be estimated and t_u, is the time associated with the u th experimental data point in the set U.

Our solution strategy relies on reformulating the nonlinear dynamic optimization problem as a finite-dimensional NLP by applying a complete discretization using orthogonal collocation on finite elements. This NLP is next solved using an outer approximation algorithm (see Figure 1). In the sections that follow, we explain in detail the main steps of our algorithm.

Orthogonal collocation approach

There is a considerable number of collocation-based discretizations for the solution of differential-algebraic systems [10]. Without loss of generality, we employ herein the so-called orthogonal collocation on finite elements method [11, 12]. Consider the following set of ODE’s defined as

{\dot{z}}_{j} = g (z, θ, t) \forall j \in J

(6)

The state variables are first approximated using Lagrange polynomials as follows:

z_{N K + 1} (t) = \sum_{k = 0}^{N K} ξ_{k} ϕ_{k} (t) ϕ_{k} (t) = \prod_{q = 0, q \neq k}^{N K} \frac{t - t_{q}}{t_{k} - t_{q}}

(7)

These polynomials have the property that at the orthogonal collocation points their coefficients, ξ_k, take the value of the state profile at that point. Therefore, the collocation coefficients ξ_k acquire physical meaning which allows to generate bounds for these variables.

Because state variables may present steep variations, the whole solution space is commonly divided into time intervals called finite elements. Hence, the time variable t is divided into N E elements of length Δη_e and rescaled as τ∈[0,1]. Within each finite element, N K + 1 orthogonal collocation points τ(0), τ(1), τ(2), ⋯ ,τ(N K)are distributed at the shifted (between 0 and 1) roots of the orthogonal Legendre polynomial of N K degree. Recall that the 0th orthogonal collocation point is located at the beginning of each element (Figure 2).

Following the collocation method [10], the residual equations arising from the combination of Eqs. 6 and 7, are defined for each element e in the set E and state variable in the set J, giving rise to the following constraints:

\begin{align} \sum_{k = 0}^{N K} ξ_{e, k, j} {\dot{ϕ}}_{e, k, j} (τ_{k^{'}}) - Δ η_{e} g_{j} (ξ_{e, k^{'}, j}, θ, t_{e, k^{'}}) = 0 \\ \forall e \in E k^{'} = 1, \dots, N K; \forall j \in J \end{align}

(8)

The state variables have to be continuous between elements, so we enforce the following continuity constrains:

ξ_{e, 0, j} - \sum_{k = 0}^{N K} ξ_{e - 1, k, j} ϕ_{k} (τ = 1) = 0 e = 2, \dots, N E \forall j \in J

(9)

These equations extrapolate the polynomial at element e-1, providing an accurate initial condition for the next element e.

Moreover, initial conditions are enforced for the beginning of the first element using the following equation:

ξ_{1, 0, j} - z_{0, j} = 0 \forall j \in J

(10)

Recall that collocation points in which time has been discretized will not necessarily match the times at which experimental profiles were registered. Hence, variable ${\hat{z}}_{u, j}$ is added to determine the value of the model states profiles at times t_umaking it possible to fit the model to the experimental points. This is accomplished by adding the following equation:

- {\hat{z}}_{u, j} + \sum_{k = 0}^{N K} ξ_{e_{u}, k, j} ϕ_{k} (τ_{u}) = 0 \forall u \in U; \forall j \in J M

(11)

Where τ_uis calculated as follows:

τ_{u} = \frac{t_{u} - η_{e_{u}}}{Δ η_{e_{u}}}

(12)

Here, the subscript e_u refers to the element where t_ufalls, that is, e_u≡{e:η_e≤t_u<η_{e + 1}}.

NPL formulation

The dynamic optimization problem is finally reformulated into the following NLP:

min_{θ, ξ, {\hat{z}}_{u}} \sum_{j \in J M} \sum_{u \in U} {({\hat{z}}_{u, j} - {\bar{z}}_{u, j})}^{2}

(13)

\begin{align} s.t. \sum_{k = 0}^{N K} ξ_{e, k, j} {\dot{ϕ}}_{e, k, j} (τ_{k^{'}}) - Δ η_{e} g_{j} (ξ_{e, k^{'}, j}, θ, t_{e, k^{'}}) = 0 \\ \forall e \in E k^{'} = 1, \dots, N K; \forall j \in J \end{align}

(14)

\begin{align} ξ_{e, 0, j} - \sum_{k = 0}^{N K} ξ_{e - 1, k, j} ϕ_{k} (τ = 1) = 0 \\ e = 2, \dots, N E \forall j \in J \end{align}

(15)

ξ_{1, 0, j} - z_{0, j} = 0 \forall j \in J

(16)

\begin{align} - {\hat{z}}_{u, j} + \sum_{k = 0}^{N K} ξ_{e_{u}, k, j} ϕ_{k} (τ_{u}) = 0 \\ \forall u \in U; \forall j \in J M \end{align}

(17)

Results and discussion

Optimization approach

The method devised for globally optimizing the NLP that arises from the reformulation of the parameter estimation problem (Eqs. 1317) is based on an outer approximation algorithm [8] used by the authors in previous works [13–17]. This approach relies on decomposing the original NLP into two subproblems at different hierarchical levels: a lower level MILP problem and an upper level slave NLP problem. The master problem is a relaxation of the original NLP (i.e., it overestimates its feasible region) and hence provides a rigorous lower bound on its global optimum. The slave NLP yields a valid upper bound when it is solved locally. The algorithm iterates between these two levels until the optimality gap (i.e., the relative difference between the upper and lower bounds) is reduced below a given tolerance (Figure 3). In the following subsections, we provide a detailed description of the algorithm.

Lower level master problem

Designing efficient and smart strategies for attaining tight bounds is a mayor challenge in deterministic global optimization. Both the quality of the bounds and the time required to generate them drastically influence the overall performance of a deterministic global optimization algorithm.

Any feasible solution of the original NLP is a valid upper bound and can be obtained by means of a local NLP solver. To obtain lower bounds, we require a rigorous convex (linear or nonlinear) relaxation. This relaxation is obtained by replacing the nonconvex terms by convex overestimators. Since the relaxed problem is convex, it is possible to solve it to global optimality using standard local optimizers. Furthermore, since its feasible region contains that of the original problem and its objective function rigorously underestimates the original one, it is guaranteed to provide a lower bound on the global optimum of the original nonconvex model [18].

Androulakis et al. [19] proposed a convex quadratic relaxation for nonconvex functions named α BB underestimator which can be applied to general twice continuously differentiable functions. This technique, which was used in parameter estimation by Esposito and Floudas [2], might lead in some cases to weak relaxations and therefore poor numerical performance [9].

To construct a valid MILP relaxation, we apply the following approach. We first reformulate the NLP using the symbolic reformulation method proposed by Smith and Pantelides [20]. This technique reformulates any system of nonlinear equations into an equivalent canonical form with the only nonlinearities being bilinear products, linear fractional, simple exponentiation and univariate function terms with the following standard form:

min_{w} w_{obj}

(18)

s.t. Aw = b

(19)

w^{l} \leq w \leq w^{u}

(20)

y \in [y^{l}, \dots, y^{u}]

(21)

w_{k} \equiv w_{i} w_{j} \forall (i, j, k) \in T_{bt}

(22)

w_{k} \equiv \frac{w_{i}}{w_{j}} \forall (i, j, k) \in T_{lft}

(23)

w_{k} \equiv w_{i}^{n} \forall (i, k, n) \in T_{et}

(24)

w_{k} \equiv fn (w_{i}) \forall (i, k) \in T_{uft}

(25)

where vector w comprises continuous variables x as well as integers y, while the sets $T_{bt}$ , $T_{lft}$ , $T_{et}$ and $T_{uft}$ are the bilinear product, linear fractional, simple exponentiation and univariate function terms, respectively.

A rigorous relaxation of the original model is constructed by replacing the nonconvex terms in the reformulated model by convex estimators. The solution of the convex relaxation provides a valid lower bound on the global optimum. More precisely, the bilinear terms are replaced by piecewise McCormick relaxations. The fractional terms can be convexified in two different manners. The first is to replace them by tailored convex envelopes that exploit their structure [21]. The second is to express them as bilinear terms by performing a simple algebraic transformation, and then use the McCormick envelopes to relax the associated bilinear term. Univariate functions commonly used in process engineering models (e.g., logarithms, exponentials, and square roots) are purely convex or purely concave, and can be replaced by the exact function-secant pair estimators [22].

The reader is referred to the work by Smith and Pantelides [20] for further details on the symbolic reformulation. We focus next on explaining how the bilinear terms are approximated in the reformulated NLP.

Piecewise McCormick-based relaxation

The bilinear terms appearing in the reformulated model are approximated using McCormick’s envelopes [23–26]. For bilinear terms, this relaxation is tighter than the α BB-based relaxations [18, 27].

Each bilinear term xy can be replaced by an auxiliary variable z as follows:

z = xy x^{L} \leq x \leq x^{U} y^{L} \leq y \leq y^{U}

(26)

The best known relaxation for approximating a bilinear term is given by the McCormick envelopes, obtained by replacing Eq. 26 by the following linear under (Eqs. 27 and 28), and overestimators (Eqs. 29 and 30):

z \geq x y^{L} + x^{L} y - x^{L} y^{L}

(27)

z \geq x y^{U} + x^{U} y - x^{U} y^{U}

(28)

z \leq x y^{L} + x^{U} y - x^{U} y^{L}

(29)

z \leq x y^{U} + x^{L} y - x^{L} y^{U}

(30)

In this work we further tighten the McCormick envelopes by adding binary variables [25, 28]. Particularly, two additional sets of variables are defined in the piecewise formulation:

Binary switch: $λ \in {0, 1}^{N_{P}}$

Continuous switch: $Δ y \in {[0, y^{U} - y^{L}]}^{N_{P}}$

The binary switch λis active (i.e., λ(n_P)=1) for the segment where x is located $(x^{L} + a (n_{P} - 1) \leq x \leq x^{L} + a n_{P})$ and is otherwise inactive. Therefore, the partitioning scheme activates exactly only one n_P∈{1,…,N_P} so that the feasible region corresponding to the relaxation of xy is reduced from the parallelogram in Figure 4(a) to a significantly smaller one depicted in Figure 4(b).

Eq. 31 enforces that only one binary variable is active:

\sum_{n_{P} = 1}^{N_{P}} λ (n_{P}) = 1

(31)

The continuous switch Δy takes on any positive value between 0 and y^U−y^L when the binary switch corresponding to the n_Pth piecewise λ(n_P) is active (i.e., λ(n_P)=1) and 0 otherwise. Therefore:

y = y^{L} + \sum_{n_{P} = 1}^{N_{P}} Δ y (n_{P})

(32)

0 \leq Δ y (n_{P}) \leq (y^{U} - y^{L}) λ (n_{P}) n_{P} = 1, \dots, N_{P}

(33)

Finally, the under and overestimators for the active segment are defined in algebraic terms as follows:

z \geq x y^{L} + \sum_{n_{P} = 1}^{N_{P}} [x^{L} + a (n_{P} - 1)] Δ y (n_{P})

(34)

z \geq x y^{U} + \sum_{n_{P} = 1}^{N_{P}} [x^{L} + a n_{P}] [Δ y (n_{P}) - (y^{U} - y^{L}) λ (n_{P})]

(35)

z \leq x y^{L} + \sum_{n_{P} = 1}^{N_{P}} [x^{L} + a n_{P}] Δ y (n_{P})

(36)

\begin{matrix} z & \leq x y^{U} + \sum_{n_{P} = 1}^{N_{P}} [x^{L} + a (n_{P} - 1)] [Δ y (n_{P}) - (y^{U} - y^{L}) \\ \times λ (n_{P})] \end{matrix}

(37)

x^{L} \leq x \leq x^{U}; y^{L} \leq y \leq y^{U}

(38)

Note that the discrete relaxation is tighter than the continuous one over the entire feasible region. The introduction of the binary variables required in the piecewise McCormick reformulation gives rise to a mixed-integer nonlinear programming (MINLP) problem, with the only nonlinearities appearing in the objective function. While this MINLP is convex and can be easily solved to global optimality with standard MINLP solvers, it is more convenient to linearize it in order to obtain an MILP formulation, for which more efficient software packages exist. The section that follows explains how this is accomplished.

Hyper-planes underestimation

The convex MINLP can be further reformulated into an MILP by replacing the objective function by a set of hyper-planes. For this, we define two new variables as $z_{u, j}^{'} = {\hat{z}}_{u, j} - {\bar{z}}_{u, j}$ and $α \geq {z^{'}}_{u, j}^{2}$ . The quadratic terms are then approximated by 1st degree Taylor series. That is, the square terms are replaced by l hyper-planes uniformly distributed between the maximum and minimum desired values of $z_{u, j}^{'}$ (Figure 5) so that the objective function is reduced to a summation of quadratic terms as follows:

min_{θ, ξ, {\hat{z}}_{u}} \sum_{j \in J M} \sum_{u \in U} α_{u, j}

(39)

\begin{matrix} α_{u, j} \geq {z^{'}}_{0}^{2}_{u, j, l} + 2 {z_{0}^{'}}_{u, j, l} (z_{u, j}^{'} - {z_{0}^{'}}_{u, j, l}) \\ \forall u \in U \forall j \in J M \forall l \in L \end{matrix}

(40)

Upper level slave problem

A valid upper bound on the global optimum is obtained by optimizing the original NLP locally. This NLP is initialized using the solution provided by the MILP as starting point. The solution of this NLP is used to tighten the MILP, so the lower and upper bounds tend to converge as iterations proceed.

Algorithm steps

The proposed algorithm comprises the following steps:

1.
Set iteration count it = 0, UB = ∞, LB = −∞ and tolerance error = tol.
2.
Set it=it + 1. Solve the master problem MILP.
(a)
If the MILP is infeasible, stop (since the NLP is also infeasible).
(b)
Otherwise, update the current LB making $LB = max_{it} ({LB}_{it})$ , where LB_it is the value of the objective function of the MILP in the it^th iteration.
3.
Solve the slave problem NLP.
(a)
If the NLP is infeasible add one more piecewise term and hyper-plane to the master MILP and go to step 2 of the algorithm.
(b)
Otherwise, update the current UB making UB = $min_{it}$ (UB_it), where UB_it is the value of the objective function of the NLP in the it^th iteration.
4.
Calculate the optimality gap OG as $OG = \frac{| UB - LB |}{UB}$ .
(a)
If OG≤tol, then stop. The current UB is regarded as the global optimum within the desired tolerance.
(b)
Otherwise, add one more piecewise section and hyper-plane to the master MILP and go to step 2 of the algorithm.

Remarks:

There are different methods to update the piecewise bilinear approximation. One possible strategy is to update it by dividing the active piecewise (i.e., the piecewise term in which the solution is located) into two equal-length segments.

The new hyper-plane term ${z_{0}^{'}}_{u, j, l}$ is added at the optimal solution of the MILP (solution point $z_{u, j}^{'}$ ) in the previous iteration.

The univariate convex and concave terms in the reformulated problem can be either approximated by the secant or by a piecewise univariate function similarly as done with the McCormick envelopes.

Our algorithm needs to be tuned prior to its application. This is a common practice in any optimization algorithm. In a previous publication [13], we studied the issue of defining the number of piecewise intervals and supporting hyper-planes in an optimal manner. In practice, however, the optimal number of piecewise terms and hyper-planes is highly dependent on the specific instance being solved, so it is difficult to provide general guidelines on this.

The approach presented might lead to large computational burdens in large-scale models of complex biological systems. Future work will focus on expediting our algorithm through the addition of cutting planes and the use of customized decomposition strategies.

Case studies

We illustrate the performance of the proposed algorithm through its application to two challenging benchmark parameter estimation problems: the isomerisation of α-Pinene (case study 1) and the inhibition of HIV proteinase (case study 2). The objective in these problems is to obtain the set of values of the model parameters such that the model response is as close as possible to the experimental data. For comparison purposes we used the global optimization package BARON (version 8.1.5). BARON is a commercial software for solving nonconvex optimization problems to global optimality. BARON combines constraint propagation, interval analysis, duality, and enhanced "branch and bound" concepts for efficient range reduction with rigorous relaxations constructed by enlarging the feasible region and/or underestimating the objective function. The interested readers have the possibility to evaluate this software on their own for free in this link: http://www.neos-server.org/neos/solvers/go:BARON/GAMS.html. Our algorithm was implemented in GAMS 23.5.2 using CPLEX 12.2.0.0 for the MILPs and SNOPT 4 for the NLPs subproblems. All the calculations were performed in a PC/AMD Athlon II at 2.99 Ghz using a single core. Data about the size of the models can be found in Table 1.

Table 1 Model size in the last iteration

Full size table

Case study 1: Isomerisation of α-Pinene

In this first case study, five kinetic parameters describing the thermal isomerisation of α-Pinene are estimated. The proposed reaction scheme for this process is depicted in Figure 6. In this homogeneous chemical reaction, α-Pinene (γ₁) is thermally isomerised to dipentene (γ₂) and allo-ocimene (γ₃), which in turn yields α- and β-Pyronene (γ₄) and a dimer (γ₅). This process was originally studied by Fuguitt and Hawkins [29], which carried out a single experiment reporting the experimental concentrations (mass fraction) of the reactant and the four products measured at eight time intervals.

Hunter and McGregor [30] postulated first-order kinetics and proposed the following set of ODE’s describing the dynamic process:

\frac{d γ_{1}}{dt} = - (p_{1} + p_{2}) γ_{1}

(41)

\frac{d γ_{2}}{dt} = p_{1} γ_{1}

(42)

\frac{d γ_{3}}{dt} = p_{2} γ_{1} - (p_{3} + p_{4}) γ_{3} + p_{5} γ_{5}

(43)

\frac{d γ_{4}}{dt} = p_{3} γ_{3}

(44)

\frac{d γ_{5}}{dt} = p_{4} γ_{3} - p_{5} γ_{5}

(45)

γ_{0} = [100, 0, 0, 0, 0] t \in [0, 36420]

(46)

Rodriguez-Fernandez et al. [4] addressed this problem by applying a metaheuristic based on the scatter search method. This strategy does not offer any theoretical guarantee of convergence to the global optimum in a finite number of iterations.

Following our approach, the state variables were approximated by Lagrange polynomials using three collocation points evaluated at the shifted roots of orthogonal Legendre polynomials and defining five finite elements of equal length. The nonconvexities in the resulting residual equations are given by the bilinear terms θ_iξ_{e,k, j} which were relaxed using piecewise McCormick approximations as described previously. The objective function was underestimated using supporting hyper-planes.

It is well known that the quality of the lower bound predicted by a relaxation strongly depends on the bounds imposed on its variables [31]. Hence bounds on collocation coefficients ( $ξ_{e, k, j}^{L}$ and $ξ_{e, k, j}^{U}$ , originally set to 0 and 100, respectively) were tightened by performing a bound contraction procedure [21, 32]. Particularly, tight lower and upper bounds were estimated for each collocation coefficient by maximizing and minimizing its value while satisfying the constraints contained in the master problem. This is a costly process (i.e., if bounds for n variables are to be estimated, 2n optimization problems should be solved). For this reason, it was only performed recursively 3 times before the initialization of the algorithm. The MILP was further tightened by adding the following constraint:

\sum_{j \in J M} \sum_{u \in U} {({\hat{z}}_{u, j} - {\bar{z}}_{u, j})}^{2} \leq 20

(47)

which forces the model to find a solution better than the one obtained at the beginning of the search by locally minimizing the original NLP (i.e., 20 is a rigorous upper bound for the objective function). Furthermore, the parameter θ_i was allowed to take any value within the [0, 1]interval.

The problem was solved with 6 initial hyper-planes. An extra hyper-plane was added in each iteration, but the total number of piecewise terms was kept constant (4 piecewise intervals were considered) in order to keep the MILP in a manageable size. A tolerance of 5% was set as termination criterion.

For comparison purposes, we solved the same problem with the standard global optimization package BARON using its default settings. BARON was able to find the global optimum but failed at reducing the optimality gap below the specified tolerance after 12h of CPU time. In contrast, our algorithm closed the gap in less than 3h (see Table 2). As shown in Table 2, the results obtained agree with those reported in the literature.

Table 2 Global optimization results for the α -Pinene isomerisation problem

Full size table

Case study 2: Inhibition of HIV proteinase

In this second case study, we considered a much more complex biological dynamic system. Particularly, we studied the reaction mechanism of the irreversible inhibition of HIV proteinase, as originally examined by Kuzmic [33] (Figure 7). Note that this dynamic model has lack of practical identifiability, as reported in Rodriguez-Fernandez et al[4]. Nevertheless, we think that this example is still useful for the purpose of our analysis, since the emphasis here is placed on globally optimizing dynamic models of biological systems rather than analyzing identifiability issues.

The model can be described mathematically through a set of 9 nonlinear ODE’s with ten parameters:

\frac{d [M]}{dt} = - 2 k_{11} [M] [M] + 2 k_{12} [E]

(48)

\frac{d [P]}{dt} = k_{3} [ES] - 2 k_{41} [P] [E] + 2 k_{42} [EP]

(49)

\frac{d [S]}{dt} = - k_{21} [S] [E] + k_{22} [ES]

(50)

\frac{d [I]}{dt} = - k_{51} [I] [E] + k_{52} [EI]

(51)

\frac{d [ES]}{dt} = k_{21} [S] [E] - k_{22} [ES] - k_{3} [ES]

(52)

\frac{d [EP]}{dt} = k_{41} [P] [E] - k_{42} [EP]

(53)

\begin{array}{l} \frac{d [E]}{dt} = k_{11} [M] [M] - k_{12} [E] - k_{21} [S] [E] + k_{22} [ES] \\ + k_{3} [ES] - k_{41} [P] [E] + k_{42} [EP] - k_{51} [I] [E] \\ + k_{52} [EI] \end{array}

(54)

\frac{d [EI]}{dt} = k_{51} [I] [E] - k_{52} [EI] - k_{6} [EI]

(55)

\frac{d [EJ]}{dt} = k_{6} [EI]

(56)

where the following initial conditions and parameters are known:

\begin{align} {[M]}_{0} = 0 {[P]}_{0} = 0 {[ES]}_{0} = 0 \\ {[EP]}_{0} = 0 {[EI]}_{0} = 0 {[EJ]}_{0} = 0 \\ {[I]}_{0} (\exp 1) = 0 {[I]}_{0} (\exp 2) = 0.0015 \\ {[I]}_{0} (\exp 3) = 0.003 {[I]}_{0} (\exp 4) = 0.004 \\ {[I]}_{0} (\exp 5) = 0.004 \end{align}

(57)

\begin{align} k_{11} = 0.1 k_{12} = 0.001 k_{41} = 100 \\ k_{21} = 100 k_{51} = 100 \end{align}

(58)

t \in [0, 3600]

(59)

A series of five experiments where the enzyme HIV proteinase (E) (assay concentration 0.004 μ M) was added to a solution of an irreversible inhibitor (I) and a fluorogenic substrate (S) (25 μ M) were considered. The five experiments were carried out at four different concentrations of the inhibitor (0, 0.0015, 0.003, and 0.004 μ M in replicate).

The fluorescence changes were monitored during one hour. The measured signal is a linear function of the product (P) concentration, as expressed in the following equation:

signal = ϵ [P] + offset

(60)

In this fit, the offset (baseline) of the fluorimeter was considered as a degree of freedom. A certain degree of uncertainty (±50%) was assumed for the value of the initial concentrations of substrate and enzyme (titration errors).

The calibration of a total of 20 adjustable parameters was addressed: five rate constants, five initial concentrations of enzyme and substrate and five offset values. Mendes and Kell [34] solved this problem using simulated annealing and reported its first known solution. Later, Rodriguez-Fernandez et al. [4] improved that solution by means of a scatter search metaheuristic, which required a fraction of the time employed by Mendes’ simulated annealing. Recall that, despite producing near optimal solutions in short CPU times, stochastic algorithms provide no information on the quality of the solutions found and are unable to guarantee convergence to the global optimum in a finite number of iterations. On the contrary, the proposed methodology ensures the global optimality of the solution computed within a desired tolerance.

In our study, the state variables were approximated using five orthogonal collocation points and five equal-length finite elements. In this case, the nonconvexities arise from the bilinear terms ξ_{e, k, j}ξ_{e, k, j} and θ_iξ_{e, k, j}.

The parameter bounds θ_i were set to θ_i∈[0, 10⁶]. The lower and upper limits for the collocation coefficients ξ_{e, k, j, n}were fixed to ξ_{e, k, j, n}∈[0, 37.5]except for ξ_{e, k, E, n}∈[0.002, 0.006] and ξ_{e, k, S, n}∈[12.5,37.5]. The bounds for all the offsets were set to offset_n∈[−0.5, 0.5].

The master problem was further tightened by adding a special type of strengthening cuts. These cuts are generated by temporally decomposing the original full space MILP into a series of MILPs in each of which we fit only a subset of the original dataset, and remove the continuity equations corresponding to the extreme elements included in the sub-problem. The cuts are expressed as inequalities added to the master problem that impose lower bounds on the error of a subset of elements for which the sub-MILPs are solved. These bounds are hence obtained from the solution of a set of MILP sub-problems that optimize the error of only a subset of elements.

This case study was solved with 3 initial piecewise intervals and 6 initial hyper-planes. Two strengthening cuts involving elements 1, 2, 3 and 4, and 2, 3, 4 and 5, respectively, were added as constrains. A tolerance of 20% was used in the calculations. Hyper-planes and piecewise terms were updated at each iteration of the algorithm. In this case, BARON failed to identify any feasible solution after 12h of CPU time.

In contrast, our algorithm was able to obtain the global optimum (Table 3) with a gap of 18.64% in approximately 4,000 CPU s (Table 4). Remarkably, the solution found by our algorithm improves the best known solution reported by Rodriguez-Fernandez et al. [4]. Hence, our algorithm clearly outperformed other parameter estimation methods, improving the best known solution [4, 34], and providing a rigorous lower bound on the minimum error that can be attained.

Table 3 Optimal parameters for the HIV proteinase inhibition problem

Full size table

Table 4 Global optimization results for the HIV proteinase inhibition problem

Full size table

Conclusions

In this work, we have proposed a novel strategy for globally optimizing parameter estimation problems with embedded nonlinear dynamic systems. The method presented was tested through two challenging benchmark problems: the isomerisation of α-Pinene (case study 1) and the inhibition of HIV proteinase (case study 2).

The proposed algorithm identified the best known solution, which was originally reported by Rodriguez-Fernandez et al. [4], in the case of the α-Pinene, and improved the best known one in the HIV proteinase case study. In both cases, rigorous lower bounds were provided on the global optimum, making it possible to determine the optimality gap of the solutions found.

The method proposed produced promising results, surpassing the capabilities of BARON. Our method requires some knowledge on optimization theory as well as skills using modelling systems. Our final goal is to develop a software to automate the calculations, so our approach can be easily used by a wider community. This is a challenging task, since nonlinear models are hard to handle and typically require customized solution procedures. Particularly, nonlinear models must be initialized carefully to ensure convergence even to a local solution. In this regard, the use of an outer approximation scheme that relies on a master MILP formulation is quite appealing, since the outcome of this MILP can be used to initialize the NLP in a robust manner.

Another key point here is how to construct tight relaxations of the nonconvex terms. An efficient algorithm must exploit the problem structure to obtain high quality relaxations and therefore good bounds close to the global optimum. These relaxations can be further tightened through the addition of cutting planes or the use of customized decomposition methods. As observed, there is still much work to be done in this area, but we strongly believe that such an effort is worthy. Furthermore, recent advances in global optimization theory and software applications are paving the way to develop systematic deterministic tools for the global optimization of parameter estimation problems of increasing size. Our future work will focus on making the approach more efficient through the use of tailored cutting planes and decomposition strategies and also through the hybridization of deterministic methods with stochastic approaches.

References

Kameswaran S, Biegler L: Simultaneous dynamic optimization strategies: recent advances and challenges. Comput & Chem Eng 2006, 30(10–12):1560–1575. 10.1016/j.compchemeng.2006.05.034
Article CAS Google Scholar
Esposito W, Floudas C: Global optimization for the parameter estimation of differential-algebraic systems. Ind & Eng Chem Res 2000, 39(5):1291–1310. 10.1021/ie990486w
Article CAS Google Scholar
Cizniar M, Salhi D, Fikar M, Latifi M: A MATLAB package for orthogonal collocations on finite elements in dynamic optimisation. Proc 15 Int Conference Process Control, Volume 5 058f-058f.
Rodriguez-Fernandez M, Egea J, Banga J: Novel metaheuristic for parameter estimation in nonlinear dynamic biological systems. BMC Bioinf 2006, 7: 483. 10.1186/1471-2105-7-483
Article Google Scholar
Esposito W, Floudas C: Deterministic global optimization in nonlinear optimal control problems. J Global Optimization 2000, 17: 97–126. 10.1023/A:1026578104213
Article Google Scholar
Papamichail I, Adjiman C: A rigorous global optimization algorithm for problems with ordinary differential equations. J Global Optimization 2002, 24: 1–33. 10.1023/A:1016259507911
Article Google Scholar
Singer A, Barton P: Global solution of optimization problems with parameter-embedded linear dynamic systems. J Optimization Theory and Appl 2004, 121(3):613–646.
Article Google Scholar
Kesavan P, Allgor R, Gatzke E, Barton P: Outer approximation algorithms for separable nonconvex mixed-integer nonlinear programs. Math Programming 2004, 100(3):517–535.
Article Google Scholar
Biegler L, Grossmann I: Retrospective on optimization. Comput & Chem Eng 2004, 28(8):1169–1192. 10.1016/j.compchemeng.2003.11.003
Article CAS Google Scholar
Finlayson B: The method of weighted residuals and variational principles: with application in fluid mechanics, heat and mass transfer, Volume 87. Academic Pr; 1972.
Google Scholar
Cuthrell J, Biegler L: On the optimization of differential-algebraic process systems. AIChE J 1987, 33(8):1257–1270. 10.1002/aic.690330804
Article CAS Google Scholar
Tieu D, Cluett W, Penlidis A: A comparison of collocation methods for solving dynamic optimization problems. Comput & Chem Eng 1995, 19(4):375–381. 10.1016/0098-1354(94)00064-U
Article CAS Google Scholar
Pozo C, Guillén-Gosálbez G, Sorribas A, Jiménez L: Outer approximation-based algorithm for biotechnology studies in systems biology. Comp & Chem Eng 2010, 34(10):1719–1730. 10.1016/j.compchemeng.2010.03.001
Article CAS Google Scholar
Carlos P, Alberto M, Rui A, Gonzalo G, Laureano J, Albert S: Steady-state global optimization of metabolic non-linear dynamic models through recasting into power-law canonical models. BMC Syst Biol 5: 137.
Pozo C, Guillén-Gosálbez G, Sorribas A, Jiménez L: A spatial branch-and-bound framework for the global optimization of kinetic models of metabolic networks. Ind & Eng Chem Res 2010.
Google Scholar
Sorribas A, Pozo C, Vilaprinyo E, Guillén-Gosálbez G, Jiménez L, Alves R: Optimization and evolution in metabolic pathways: Global optimization techniques in Generalized Mass Action models. J Biotechnol 2010, 149(3):141–153. 10.1016/j.jbiotec.2010.01.026
Article CAS PubMed Google Scholar
Guillén-Gosálbez G, Sorribas A: Identifying quantitative operation principles in metabolic pathways: a systematic method for searching feasible enzyme activity patterns leading to cellular adaptive responses. BMC Bioinf 2009, 10: 386. 10.1186/1471-2105-10-386
Article Google Scholar
Wicaksono D, Karimi I: Piecewise MILP under-and overestimators for global optimization of bilinear programs. AIChE J 2008, 54(4):991–1008. 10.1002/aic.11425
Article CAS Google Scholar
Androulakis I, Maranas C, Floudas C: αBB: A global optimization method for general constrained nonconvex problems. J Global Optimization 1995, 7(4):337–363. 10.1007/BF01099647
Article Google Scholar
Smith E, Pantelides C: A symbolic reformulation/spatial branch-and-bound algorithm for the global optimisation of nonconvex MINLPs. Comput & Chem Eng 1999, 23(4–5):457–478. 10.1016/S0098-1354(98)00286-5
Article CAS Google Scholar
Quesada I, Grossmann I: Global optimization algorithm for heat exchanger networks. Ind & Eng Chem Res 1993, 32(3):487–499. 10.1021/ie00015a012
Article CAS Google Scholar
Smith E, Pantelides C: Global optimisation of general process models. NONCONVEX OPTIMIZATION APPL 1996, 9: 355–384.
Article Google Scholar
McCormick G: Computability of global solutions to factorable nonconvex programs: Part I Convex underestimating problems. Math Programming 1976, 10: 147–175. 10.1007/BF01580665
Article Google Scholar
McCormick G: Nonlinear programming: Theory, algorithms, and applications. 1983.
Google Scholar
Misener R, Thompson J, Floudas C: APOGEE: Global optimization of standard, generalized, and extended pooling problems via linear and logarithmic partitioning schemes. Comput & Chem Eng 2011, 35: 876–892. 10.1016/j.compchemeng.2011.01.026
Article CAS Google Scholar
Singer A, Barton P: Bounding the solutions of parameter dependent nonlinear ordinary differential equations. SIAM J Sci Comput 2006, 27(6):2167–2184. 10.1137/040604388
Article Google Scholar
Singer A, Barton P: Global optimization with nonlinear ordinary differential equations. J Global Optimization 2006, 34(2):159–190. 10.1007/s10898-005-7074-4
Article Google Scholar
Karuppiah R, Grossmann I: Global optimization for the synthesis of integrated water systems in chemical processes. Comput & Chem Eng 2006, 30(4):650–673. 10.1016/j.compchemeng.2005.11.005
Article CAS Google Scholar
Fuguitt R, Hawkins J: Rate of the thermal isomerization of α-Pinene in the liquid phase1. J Am Chem Soc 1947, 69(2):319–322. 10.1021/ja01194a047
Article CAS Google Scholar
Hunter W, McGregor J: The estimation of common parameters from several responses: Some actual examples. Unpublished Report. 1967.
Google Scholar
Grossmann I, Biegler L: Part II. Future perspective on optimization. Comput & Chem Eng 2004, 28(8):1193–1218. 10.1016/j.compchemeng.2003.11.006
Article CAS Google Scholar
Hansen P, Jaumard B, Lu S: An analytical approach to global optimization. Math Programming 1991, 52: 227–254. 10.1007/BF01582889
Article Google Scholar
Kuzmic P: Program DYNAFIT for the analysis of enzyme kinetic data: application to HIV proteinase. Anal Biochem 1996, 237(2):260–273. 10.1006/abio.1996.0238
Article CAS PubMed Google Scholar
Mendes P, Kell D: Non-linear optimization of biochemical pathways: applications to metabolic engineering and parameter estimation. Bioinformatics 1998, 14(10):869. 10.1093/bioinformatics/14.10.869
Article CAS PubMed Google Scholar

Download references

Acknowledgements

This work was supported by the Spanish Ministry of Science and Innovation through the doctoral research grant FPI-MICINN reference grant BES-2010-037166 and through the project CTQ2009-14420-C02-01.

Author information

Authors and Affiliations

Departament d’Enginyeria Química, Universitat Rovira i Virgili, Tarragona, Spain
Anton Miró, Carlos Pozo, Gonzalo Guillén-Gosálbez & Laureano Jiménez
Departamento de Matemática Aplicada y Estadística, Universidad Politécnica de Cartagena, Cartagena, Spain
Jose A Egea

Authors

Anton Miró
View author publications
You can also search for this author in PubMed Google Scholar
Carlos Pozo
View author publications
You can also search for this author in PubMed Google Scholar
Gonzalo Guillén-Gosálbez
View author publications
You can also search for this author in PubMed Google Scholar
Jose A Egea
View author publications
You can also search for this author in PubMed Google Scholar
Laureano Jiménez
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Gonzalo Guillén-Gosálbez.

Additional information

Competing interest

The authors declare that they have no competing interests.

Authors’ contributions

G. G-G suggested the need for the approach and J.A. E provided the biological problems. A. M, C. P, G. G-G and L. J developed the optimization algorithms and performed the numerical analysis. All authors evaluated the results, wrote the paper and contributed to its final form.

Authors’ original submitted files for images

Below are the links to the authors’ original submitted files for images.

Authors’ original file for figure 1

Authors’ original file for figure 2

Authors’ original file for figure 3

Authors’ original file for figure 4

Authors’ original file for figure 5

Authors’ original file for figure 6

Authors’ original file for figure 7

Authors’ original file for figure 8

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Reprints and permissions

About this article

Cite this article

Miró, A., Pozo, C., Guillén-Gosálbez, G. et al. Deterministic global optimization algorithm based on outer approximation for the parameter estimation of nonlinear dynamic biological systems. BMC Bioinformatics 13, 90 (2012). https://doi.org/10.1186/1471-2105-13-90

Download citation

Received: 07 November 2011
Accepted: 10 May 2012
Published: 10 May 2012
DOI: https://doi.org/10.1186/1471-2105-13-90

Deterministic global optimization algorithm based on outer approximation for the parameter estimation of nonlinear dynamic biological systems

Abstract

Background

Results

Conclusion

Background

Methods

Problem statement

Mathematical formulation

Orthogonal collocation approach

NPL formulation

Results and discussion

Optimization approach

Lower level master problem

Piecewise McCormick-based relaxation

Hyper-planes underestimation

Upper level slave problem

Algorithm steps

Case studies

Case study 1: Isomerisation of α-Pinene

Case study 2: Inhibition of HIV proteinase

Conclusions

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Competing interest

Authors’ contributions

Authors’ original submitted files for images

Rights and permissions

About this article

Cite this article

Share this article

Keywords

BMC Bioinformatics

Contact us