Identifying quantitative operation principles in metabolic pathways: a systematic method for searching feasible enzyme activity patterns leading to cellular adaptive responses
© Guillén-Gosálbez and Sorribas; licensee BioMed Central Ltd. 2009
Received: 25 February 2009
Accepted: 24 November 2009
Published: 24 November 2009
Optimization methods allow designing changes in a system so that specific goals are attained. These techniques are fundamental for metabolic engineering. However, they are not directly applicable for investigating the evolution of metabolic adaptation to environmental changes. Although biological systems have evolved by natural selection and result in well-adapted systems, we can hardly expect that actual metabolic processes are at the theoretical optimum that could result from an optimization analysis. More likely, natural systems are to be found in a feasible region compatible with global physiological requirements.
We first present a new method for globally optimizing nonlinear models of metabolic pathways that are based on the Generalized Mass Action (GMA) representation. The optimization task is posed as a nonconvex nonlinear programming (NLP) problem that is solved by an outer-approximation algorithm. This method relies on solving iteratively reduced NLP slave subproblems and mixed-integer linear programming (MILP) master problems that provide valid upper and lower bounds, respectively, on the global solution to the original NLP. The capabilities of this method are illustrated through its application to the anaerobic fermentation pathway in Saccharomyces cerevisiae. We next introduce a method to identify the feasibility parametric regions that allow a system to meet a set of physiological constraints that can be represented in mathematical terms through algebraic equations. This technique is based on applying the outer-approximation based algorithm iteratively over a reduced search space in order to identify regions that contain feasible solutions to the problem and discard others in which no feasible solution exists. As an example, we characterize the feasible enzyme activity changes that are compatible with an appropriate adaptive response of yeast Saccharomyces cerevisiae to heat shock
Our results show the utility of the suggested approach for investigating the evolution of adaptive responses to environmental changes. The proposed method can be used in other important applications such as the evaluation of parameter changes that are compatible with health and disease states.
The emergence of design in biological systems was an unsolved problem until natural selection was established as the driving force for the evolution of such systems [1–4]. At the molecular level, the identification of design principles through controlled mathematical comparisons that evaluate different functional criteria in metabolic networks has led to a better understanding of adaptation and design emergence [5–12]. Such principles enable building new gene and metabolic networks that accomplish specific requirements which is the main goal of Synthetic Biology [13–15]. The identification of quantitative evolutionary constraints plays an important role in understanding the actual characteristics of biological systems [16, 17]. In that sense, one may argue that the adaptive response of the cellular metabolism to different situations is ultimately shaped by physiological requirements that must be met by tuning gene expression and enzyme activity [18–20]. Understanding the evolution of the adaptive strategies that assure cell survival in different conditions is, thus, an important goal in Systems Biology [18–22].
In unicellular organisms, adaptive capability is specially important as they lack an internal milieu than could buffer the environmental changes. In this context, the adaptive responses to different stress conditions, (heat shock, oxidative stress, osmotic stress, and other environmental changes), have been extensively investigated using yeast as a biological model [23–26]. In general, such adaptive responses require synthesis of protective proteins (chaperons, trehalose, sphingolipids, etc.), and a fine tuning of the metabolic status to assure an appropriate supply of energy and building blocks for new proteins. The operating principles of the adaptive response to heat shock were investigated under this perspective [18, 20]. The increase in trehalose, ATP, and NADPH synthesis are primary requirements for an appropriate response in this conditions. However, these flux constraints cannot fully explain the observed changes. Complementing flux constraints, economy savings in the gene expression changes and constraints for preventing an unnecessary increase of metabolites were shown to be necessary to understand the adaptive response to heat shock. Thus, a combination of constraints on particular metabolic processes, economy issues, fluxes, etc. are required to define the scenario in which natural selection operates .
In this paper we develop a new approach that focuses on identifying feasible parametric regions that contain solutions for system parameters so that a set of physiological constraints are satisfied. With that method we will be able of identifying the possible evolutionary solutions that are expected to contain the actual adaptive response. First, we present a global optimization procedure that capitalizes on the properties of a particular class of nonlinear models, the GMA (Generalized Mass Action) models based on the power-law formalism. This method improves the method recently proposed by Polisetty et al.  and takes advantage of the properties of the power-law formalism as a canonical nonlinear modeling technique [28–30]. Second, we introduce a search strategy that allows identifying the parameter regions containing admissible solutions for the problem. The proposed algorithm is very efficient for realistic problems, although different technical improvements are possible to optimally scale it to large problems. The feasible regions found would represent the landscape in which evolutive solutions are expected. Comparison of our result and actual data allows to discuss the practical usefulness of the proposed approach.
Here, μ ir is a stoichiometric factor that indicates how many molecules of X i are produced or used by the process v r ; it is a positive integer if the flux r produces X i and it is a negative integer, if the flux r depletes the pool of X i .
If X j has no direct influence on the rate of reaction r, the kinetic order is zero. If it directly activates the flux of reaction r, the kinetic order is positive. If it inhibits the flux of reaction r, then the kinetic order is negative.
Without loss of generality, enzyme levels can be considered embedded in the γ r parameters. This is so because, in most cases, the kinetic-order for the enzyme is 1, as velocities are linearly dependent on the enzyme. Of course, if necessary, enzymes can also be included in the model as control variables. On the other hand, if the model includes dynamic changes in enzymes, for instance through gene regulation, they can also be considered as dependent variables.
Optimization in biotechnological applications: Given a model that represents the reference state of the system, we are interested in finding the parameter changes (engineering design) that optimize a given objective function (usually a flux).
Find feasibility regions in evolution studies: Given a model that represents the normal metabolic state of a cell, find the admissible changes at the level of enzyme activities that are compatible with a set of physiological and functional effectiveness criteria (evolutive emergence of design).
Results and discussion
Motivation of the optimization approach
Optimization of biological processes is a very important goal in biotechnological applications [40–44]. In general, the main purpose is finding the appropriate changes in enzyme levels, mainly through changes in gene expression, so that a given objective function is optimized. This can be a flux, for example in a production process, a metabolite concentration, or any other objective function. Developing appropriate optimization techniques is fundamental for defining a practical metabolic engineering approach . The intrinsic complexity of the target systems and the non-linearities involved in the underlying processes makes optimization a difficult task in this field. The use of models based on approximated representations such as the power-law models, either in their S-system or in GMA form, is a promising alternative [19, 45].
Optimization based on nonlinear models defined with the power-law formalism was first proposed by Voit . By using the S-system model strategy, a transformation to logarithmic coordinates allows using linear optimization for solving a number of biotechnological problems [45–47]. However, when the problem is represented by a GMA model as in (4), this technique cannot be directly applied. To overcome this problem, methods for optimizing models based on this particular form have been developed [27, 39, 47]. Here, we shall present a new method that is closely related with the method suggested by Polisetty et al. . Our first goal is to develop an efficient global optimization method for GMA models. Besides its own interest for metabolic engineering tasks, this is a requirement for developing the feasibility approach that is the ultimate goal of this work.
In practical biotechnological applications, we are interested in obtaining the best set of changes in enzyme levels (that is γ r in GMA models), so that a given goal is attained (for instance maximize a flux) and a set of constraints are satisfied (for instance: metabolite levels do not increase over a given threshold, some reactions do not reduce the fluxes under given values, changes in enzymes do not go beyond a realistic maximum, etc.)
The optimization task can be posed as an standard optimization problem that aims to identify the specific values of v r , γ r and X j that minimize a given predefined criterion while satisfying at the same time a set of biological constraints. In what follows, we consider an standard optimization problem in which the objective function is minimized. Maximization problems can be easily reformulated into minimization problems by changing the sign of the objective function. At this point, kinetic-orders are regarded as fixed parameters. These parameters represent intrinsic kinetic properties of the involved processes. Optimization of their values is possible, although the method should be further adapted to deal with that case and assure a global optimum. We shall consider this possibility in the future.
Here the feasible set S includes the steady-state equations as a basic constraint. Thus, optimization is run to find a steady-state solution that optimizes the objective function and that fulfills the considered constraints. As commented before, the optimization model may also incorporate other equations that impose certain conditions on the values of the variables v r , γ r , X j . For instance, we can consider a constraint X1 + X2 + X6 ≤ k that forces the summation of certain concentrations of metabolites to be lower than a desired upper limit k. This second type of constraints may represent specific physiological properties that the solution sought should satisfy.
Hence, the objective of the NLP model defined above is to find the solution that simultaneously minimizes the function U(·) and satisfies the equations (physiological constraints) defined upon the biological system. There are currently many strategies available to solve NLP problems like ONLP. These methods are typically implemented in software packages that allow the solution of models with thousands of variables and constraints (see ). Unfortunately, there is a particular difficulty in solving the NLP defined by ONLP, which stems from the fact that its feasible space is nonconvex. This nonconvex structure is given by the nonlinear equality constraints that define the velocity terms. In nonconvex models, standard NLP techniques are not guaranteed to converge to the global optimum and yield solutions that must be regarded as locally optimal. In the context of performing a biological analysis, this limitation constitutes a major shortcoming, since it can lead to wrong conclusions.
To overcome this drawback, it is necessary to resort to a specific type of mathematical methods known as global optimization techniques (for a detailed review see ), which are able to assure the optimality of the solutions found within a desired tolerance. Although such strategies can deal with any type of non-convexities, they tend to be computationally expensive, which hampers their practical implementation. A possible way to reduce their computational burden consists of developing tailored methods that exploit the mathematical structure of the non-convexities of the model.
In this paper, following this general idea, a novel deterministic global optimization method inspired on the works of Bergamini and co-workers [50–52] is presented to solve ONLP to global optimality. The method introduced relies on hierarchically decomposing the original problem into two levels, an upper level master problem CMILP an a lower level slave problem RNLP, between which the algorithm iterates until a termination criterion is satisfied.
The master level of our algorithm entails the solution of a mixed-integer linear programming (MILP) problem, which is a relaxation of ONLP (i.e., it rigorously overestimates the feasible region of ONLP), and therefore predicts a valid lower bound on its global optimum. Such a model is constructed by replacing the non-linearities in ONLP by valid over and under estimators. Particularly, in our method, supporting hyper-planes and piecewise linear functions are used to approximate the original search space of ONLP. In the lower level, the original problem is locally optimized in a reduced search space, thus yielding an upper bound on its global solution. The upper and lower level problems are solved iteratively until the bounds converge. For clarity, technical details of the main features of the proposed algorithm are commented in the Appendix section.
Hence, the prediction made by the upper level must be checked in the lower level, so the solution is guaranteed to be feasible in the original search space. Specifically, in our algorithm, the solution of the master model is used as a starting point in the lower level problem, which is solved in a reduced search space that is defined according to the output of the master level. The lower level solution is then employed to tighten the search space of the master problem. As a result, the new modified master problem predicts better lower bounds that are closer to the global solution. Furthermore, tightening the search space of the master problem also improves the quality of both, the starting points and reduced search spaces passed to the lower level. This can be observed in Figure 1, which shows how the envelopes employed in the master problem become tighter as iterations proceed. As a result, the upper and lower bounds tend to approximate and finally converge within the desired optimality gap.
Set iteration count c = 0, upper bound UB = ∞, lower bound LB = -∞, and tolerance error = tol.
Set c = c + 1.
Solve the MILP master problem CMILP:
If problem CMILP is infeasible, then stop. There is no feasible solution to the problem.
Otherwise, update the current lower bound as follows: , where LB c represents the objective function value associated with the optimal solution of CMILP in iteration c.
For fixed lower and upper bounds on the velocity terms (i.e., and , respectively), solve the lower level problem RNLP to obtain an upper bound on the solution of ONLP.
If problem RNLP is infeasible, then update the grid in CMILP and go to step 2.
Otherwise, update the current upper bound as follows: , where UB c represents the objective function value associated with the optimal solution of RNLP in iteration c.
Check the convergence criteria:
If , then stop. The solution corresponding to UB (i.e., the solution of model RNLP in the iteration with minimum objective function value) satisfies the finalization criterion (i.e., it can be regarded as optimal within the predefined optimality gap).
Otherwise, update the grid in CMILP and go to step 2.
Optimization of the anaerobic fermentation pathway in Saccharomyces cerevisiae
As an illustrative example of the proposed technique, we use the anaerobic fermentation pathway in Saccharomyces cerevisiae presented in Polisetty et al. (Case study 1 in ) as a benchmark problem for optimization. The reader is referred to the original publication for details (Figures 1, 2 in ). The model can be found in the Additional File 1.
The algorithm was implemented in GAMS interfacing with CPLEX 9.0 and SNOPT 6.2 as main MILP/NLP optimization packages, respectively, on an Intel 1.2 GHz machine. The upper level of the algorithm was constructed using 50 supporting hyper-planes and 9 piece-wise terms. These upper and lower estimators were updated during the execution of the algorithm by defining new linearizations and terms of the piece-wise approximation in the middle points of the active subintervals in the solution of the master problems. The tolerance error (i.e., optimality gap) was set to 0.2%.
Optimization of ethanol production.
Modified reaction numbers
(X i ) opt /(X i ) nom
Total time (s)
In Table 1, the main computational details of the algorithm, which include the values of the NLP and MILP solved in the last iteration, the total number of iterations and the total solution time are also provided. From a technical point of view, it is worth to indicate that our method produces much tighter upper bounds than those reported in  (compare Table 1 with Table III in ). For instance, for the case when two enzymes are accessible to manipulation, our method yields an upper bound of 72.68 mM min-1, whereas in  it is 126.11 mM min-1. This means that our method assures convergence to the global minimum within an optimality gap of 0.19%, whereas in the solution reported in  this gap is 22%. Similar results can be observed in the remaining optimizations. This advantage can be important for an appropriate scaling of the method for more complex problems. Note that the total time indicated in Table 1 refers to the exploration of all the combinations within each strategy. Thus, the one-enzyme optimization (the case when only one enzymes can be changed) takes 2.04 CPU seconds. The case of finding the maximum when two enzymes can be changed takes 2.64 seconds, whereas the examples in which all enzymes can be modified is solved in 0.89 CPU seconds. Note that at a first glance, one would expect that the problem in which all the enzyme changes are allowed would take more CPU time than those in which only a subset of changes are permitted. These results, which are due to some implementation details of the algorithm, can be found in small problems (i.e., around 1 second of CPU time) but are not likely to appear in bigger problems in which the CPU time is indeed dominated by the complexity of the model rather than by the implementation details.
Optimal adaptive response of yeast to heat shock
As a motivation for the feasibility approach that we shall develop in the next section, we shall obtain the optimal enzyme activity changes that would correspond to different goals in the adaptive response of yeast to heat shock. This problem was first analyzed by intensive computation by Vilaprinyo et al. . By using a GMA model of the major metabolic pathways in yeast central metabolism, the goal is to identify which changes at the level of enzyme activities are more likely to produce a desired adaptive response. This response is defined by a set of physiological changes that can be considered as necessary for adaptation. In the model we include glycolysis, synthesis of trehalose and glycerol, and the branching from glycolysis to the pentose phosphate pathway. There are nine enzymes that can be changed, and the target fluxes are those of trehalose, ATP, and NADPH synthesis. Model details can be obtained from [18, 20] (see also Table 1 for nomenclature, and the Additional File 1 for the model equations and physiological constraints).
Physiological constraints that shape the admissible adaptive response to heat shock in yeast.
Fluxes (mM min-1)
Glucose < 0.04
V TRE > 0.03
Cost < 12.06
Glucose-6-P < 20.22
V NADPH > 3.53
8.64 < Fructose-1,6-P < 22.86
V ATP > 180.6
Phosphoenol piruvate < 0.01
V GLY > 0.39
ATP < 6.77
ψ < 28.1
Optimization results for the model of heat shock response in yeast
4.9 × 10-4
1.7 × 10-3
3.5 × 10-3
8.3 × 10-3
6.6 × 10-5
Fluxes (mM min -1 )
1.3 × 10-5
3.6 × 10-4
Total time (s)
Trehalose synthesis can be increased up to 2.08 mM min-1 by increasing the activity of HTX, PYK, and TPS+GOL, while decreasing most of the other activities. In doing that, two of the involved metabolites (G6P and F16P) accumulates. Furthermore, synthesis of glycerol is reduced to zero and synthesis of NADPH is decreased. This optimal solution cannot be considered a good adaptive solution as it compromises the cellular inner milieu by accumulating metabolites and challenges other processes by decreasing some critical fluxes. In that sense, this solution optimizes a metabolic goal (v TRE ) but does not match other important physiological constraints. A similar result is obtained when the goal is to maximize the synthesis of NADPH or ATP. Particularly striking is the unrealistic concentration of F16P that is obtained in the optimal strategy for maximizing ATP synthesis. In the case of minimization of Cost, as no further constraints are imposed, we find the trivial solution that corresponds to maintain the basal conditions.
Considering the set of constraints identified by Vilaprinyo et al. (Table 2), we run again the optimization procedure for each of the four objective functions. In all the cases, an optimal solution compatible with the imposed constraints is obtained. For instance, the maximum rate of trehalose synthesis that is compatible with the considered constraints is 0.26 mM min-1. This is well below the maximum obtained without constraints, but now the solution is reasonable in the terms imposed by the physiological constraints. Similar results are obtained for the other objective functions. We also obtain the solution that minimizes the overall cost. In that case, cost can be lowered about a 50% with respect the other cases. Minimization of cost undertakes values of the fluxes that are lower than in the optimal solutions for the other objective functions. From the implementation point of view, it is interesting to notice that in all the cases the algorithm was able to provide near optimal solutions (with an optimality gap of 0.2%) in few CPU seconds. Similarly, as in the previous example, the master problem provided very tight relaxations, which led to few iterations.
Optimization techniques seek finding the best strategy for changing control variables so that the system can reach a given goal. Thus, such methods are important in biotechnology and metabolic engineering where the scientist fixes the goal and searches for the best strategy in changing the underlying processes. The situation is different if we analyze the evolution of natural systems. As discussed above, natural systems are (in some sense) optimized by the evolutive process by natural selection that acts as a purifying process. Although the exact contribution of this mechanism is still discussed , it is generally accepted as one of the basic driving forces in evolution. From this perspective, different criteria for functional effectiveness have been used for investigating the emergence of design principles in cellular systems [5, 16, 17].
This part of the paper introduces a method that aims at providing an approximate characterization of the feasible space of a biological problem rather than identifying a single optimal solution. The tool presented is intended to provide valuable insights on the set of changes in enzyme activities that would be required for adaptation to an environmental challenge.
The approach introduced is based on the NLP formulation defined in ONLP and exploits the specific structure of the GMA representation. The use of this particular representation allows to perform the feasibility study by slightly modifying the algorithm described in the previous section. Specifically, our strategy relies on solving the original problem ONLP iteratively over a reduced search space. At each iteration the method identifies a region that contains a feasible solution to the problem. This region is then removed from the search space, and the optimization problem is resolved in the reduced domain. The algorithm proceeds in this manner until there is no feasible solution in the remaining regions of the search space.
The overall method, which relies on the global optimization approach introduced before, comprises two different levels. At the upper level, a master problem is solved to identify a region that may contain a feasible solution to ONLP. At the lower level, the prediction made by the master problem is checked by solving the original problem in a reduced search space. If a feasible solution is found, then integer cuts are added to the master problem in order to exclude the region containing such a feasible point. Otherwise, the master model is updated by refining its grid, until either a feasible solution is obtained in the lower level or an infeasibility is detected in the higher level problem. The main features of the algorithm are outlined next, whereas more technical details can be found in the Appendix.
This assumption implies that the feasibility analysis will be performed on the values of the apparent rate constants γ, although in general it could also account for other variables. In this notation, the values of the r components of γ q and , which are denoted by and , respectively, correspond to the limits of the subintervals obtained by partitioning the original domain of each single variable γ r [0, ∞] into T subintervals. It follows that T r = Q. In each hyper-rectangle , the component r of the vector γ must fall into a specific subinterval .
The algorithm relies on solving two different problems, a modified master problem CFMILP and a modified slave problem RFNLP, between which the algorithm iterates. Model CFMILP is obtained from CMILP by adding a set of auxiliary equations that define the set of hyper-rectangles on which the search is conducted. Furthermore, it also incorporates a set of integer cuts that exclude from the search space those hyper-rectangles that have been explored so far. Similarly, model RFNLP is derived from RNLP by imposing certain lower and upper bounds on the values of the apparent rate constants γ. These bounds correspond to the limits of the hyper-rectangle that contains the solution predicted by the master problem, which may or may not be feasible in the search space of the original problem ONLP.
Model CFMILP is a relaxation of ONLP and therefore predicts a valid lower bound on its solution. Furthermore, if CFMILP is unfeasible, it follows that ONLP (and also RFNLP) are unfeasible in any hyper-rectangle contained in the search space of the master problem (i.e., any hyper-rectangle that has not yet been removed from the search space). In our approach, this property is indeed exploited to terminate the algorithm.
Set outer iteration count b = 0, inner iteration count c = 0, upper bound UB = ∞, lower bound LB = -∞.
Set b = b + 1.
Set c = c + 1.
Solve the MILP master problem CFMILP:
If problem CFMILP is infeasible, then stop. There is no feasible solution to the problem in the current search space.
Otherwise, for fixed lower and upper bounds on the apparent rate constants ( , ) and on the velocity terms ( , ) solve the lower level problem RFNLP.
If problem RFNLP is infeasible, then update the grid in CFMILP and go to step 3.
Otherwise, derive a new integer cut, set c = 0 and go to step 2.
Feasible enzyme activity patterns in the adaptive response of yeast to heat shock stress
Using the same model as in the second optimization example, we have investigated the feasibility regions for changing enzyme activities in yeast metabolism so that specific physiological constraints are met. For simplicity in representing the results of the analysis, we shall present 2-D plots in which the feasible regions for the changes in two enzymes are shown. It is important to indicate that in each case all the enzymes are considered in the feasibility analysis, although only the corresponding region for the two selected enzymes is shown.
According to that, the hyper-rectangles for the feasibility analysis are defined on the domain of the selected enzymes, say for instance PFK and TDH. For practical reasons, we consider γ r = k r γr 0. Here, γr 0represents the basal value and k r the change-fold over the basal value. We shall use the values of k r that correspond to the relative change of that enzyme with respect the basal activity. In this paper, we consider that enzyme activities are changed only by changing the amount of enzyme. Of course, activity changes due to other reasons, such as temperature dependency, could also be considered. In such cases, the cost component would be much lower. In either case, the resulting changes in activity would affect metabolite concentrations and fluxes. Those changes are considered in other constraints used in this analysis.
As an example, in the case of PFK and TDH 10 equally spaced segments from 0.25 to 20 fold-change are considered in the study. This leads to 100 hyper-rectangles, each of which corresponds to a specific combination of one sub-interval of PFK and another of TDH. On the other hand, no sub-intervals are defined for the remaining enzymes. This implies that in each hyper-rectangle the model is free to choose any values for them. Thus, the method is free for finding the best combination of enzymes that, within the considered hyper-rectangle, optimizes the objective function and meets the constraints. In this example, the synthesis of ATP was regarded as the criterion to be optimized in the master and slave problems of our algorithm. This objective is only used to guide the MILP and NLP subproblems. However, as mentioned before, any other criterion could have been employed, with identical results, since the main goal of our algorithm is to identify feasible regions and not to find optimal solutions. The implementation details of the algorithm are the same as in the previous examples. It is interesting to notice that the total CPU time was 31.81 CPU seconds, which shows the efficiency of the proposed method.
Effect of the constraint values
The results obtained in the feasibility analysis shown in the previous section are dependent on the values of the constraints. The values used in that case were obtained in a previous analysis of the model  and could be modified in different ways. The interesting advantage of our method is that it allows for exploring the resulting feasible regions in practical terms for large models. Thus, one could consider different values that can be suggested by experts or obtained from other biological considerations. Of course, a sound knowledge of the biological problem is a fundamental basis for this. In any case, comparison of the resulting feasible regions can help in evaluating the considered constraints and may help in identifying those constraints that are more likely responsible for the actual characteristics of the adaptive response. This would require appropriate experimental data for comparing the model predictions with the actual system behavior.
While the solutions obtained via optimization methods can be realizable in a biotechnological application, provided the required changes can be practically implemented, the optimal changes would seldom be attained in natural systems that have evolved through natural selection. From a general point of view, those systems have evolved so that the metabolic status meets a set of constraints without compromising survival and viability. For instance, adaptive response to a given condition may require to increase a flux over a given limit. However, instead to evolve towards a solution that reaches the maximum possible flux, evolution finds a solution that provides the required increase without compromising other objectives, for instance maintain a low concentration of internal metabolites.
Our feasibility approach has been developed to address this problem. By considering a set of physiological constraints, we search for the feasible parameter regions that allow the system to meet those constraints. We have investigated the utility of this approach in exploring the adaptive response of yeast to heat shock. Our results identify the admissible changes in enzyme activities that can meet the physiological constraints suggested in  (see Table 2). Interestingly, the experimental observations are located within the predicted feasibility region and close to the changes that would be required to minimize the cost of overexpressing the different enzymes (Figure 6). Furthermore, the proximity to the optimum for trehalose synthesis also suggest that this is an important requirement for the actual adaptive response.
Although this is not the case in the considered examples, it may happen that two or more unconnected feasible regions may exist for a given problem. That situation would be very interesting from the point of view of discussing the evolution of that adaptive response. In theory, it would mean that solutions in either of the regions could emerge from natural selection. If actual data situates the evolutionary solution in a particular region, then one may ask which were the disadvantages of the other possibilities. Besides a random choice, selection of a given solution among equally admissible alternatives would be a clue of the existence of complementary constraints. For instance, it may be that the evolved solution is better for assuring an appropriate adaptive response to other stress conditions. Also, one should check for differences in dynamic responses as a complementary argument for evaluating the performance of each of the potential solutions.
The practical use of the methodology developed in this work requires a mathematical model that accurately reflects the properties of the system. Furthermore, although detailed models would be desirable, the mathematical complexity makes optimization tasks very difficult on those models. GMA models provide a practical solution and have several advantages that allow an efficient implementation of the optimization procedures. First, automatic generation of models from conceptual schemes is straightforward, which facilitates obtaining a useful mathematical model. Second, a number of techniques exist for fitting those models to dynamic data (see  for a review), which is basic for obtaining a numerical model that can be used in optimization procedures. Third, GMA models can incorporate qualitative data, which can help when limited information is available for parameter identification. Finally, the specific structure of the GMA model can be exploited to devise an efficient tailor-made global optimization algorithm. Particularly, in the context of the proposed method, we take advantage of the GMA representation in order to construct master MILPs that provide tight lower bounds on the optimal solution to the original problem. Thus, our method exploits the advantages offered by this kind of models for obtaining a useful implementation of the optimization and feasibility approaches.
In practical problems, once a model has been appropriately identified, parameter uncertainties can be a difficulty for obtaining a reasonable result [54, 55]. As the procedure developed here is highly efficient, it should be practically possible to run a sensitivity analysis by screening different parameter sets. This would help in discussing the validity of the obtaining results when parameter variability is an issue.
The proposed method requires a sound knowledge of the biological problem. This is especially important for identifying physiological constraints that can be relevant in limiting the feasibility region. If those constraints are not clearly identified, our method can be used as an exploratory tool for identifying different feasibility regions that would correspond to different sets of physiological constraints. An analysis of the different results could help in identifying those constraints that may be more important in a given scenario. In any case, validation of the feasibility regions obtained would require experimental data. Discrepancies between theoretical results and actual data may help in discarding unreasonable physiological constraints. In the example discussed here, experimental observations agree with the predicted feasibility region. This should be interpreted as an indirect prove that the physiological constraints used are meaningful in that case. However, it is not a prove that these are the only constraints that explain the observed results. An iterative analysis through alternative sets of constraints would be required for completely identifying the most significant ones.
Besides its application to understand the evolution of adaptive responses, our methodology can be useful in exploring health and disease states  so that optimal targets for specific treatments via regulation of enzyme activity can be suggested . All these problems must take into account the evaluation of several objective functions . Our approach could simplify this task by finding the optimal solution for each of the various objective functions and by comparing these results within the feasibly region (see for instance Figure 6).
Our method focuses on exploring adaptive responses through changing enzyme activities. It would be interesting to extend the methodology to include kinetic-orders in the optimization. This would allow to explore design principles required for a given network to perform according to specific performance criteria. Also, it would be necessary to adapt the optimization procedures so that dynamic properties, such as stability and response time, could be included as physiological constraints.
Upper level master problem for the optimization procedure
Model CMILP takes the form of a mixed-integer linear programming (MILP) problem. This type of model can be efficiently solved via standard branch and bound techniques .
where is the reformulated objective function; and represents the set of reformulated equations that define the feasible set, which includes the auxiliary constraints that define the lower and upper estimators of the logarithmic terms.
Lower level slave problem for the optimization procedure
The solution of such a model provides an upper bound on the objective function of ONLP. The master and slave problems are solved iteratively until the upper and lower bounds converge. Note that in RNLP, the lower and upper limits of the intervals within which the values of v r must fall (i.e., and ) are given by the master problem CMILP. Specifically, these bounds correspond to the limits of the intervals that are active in the master problem. Then, if denotes the solution of the master problem, in which the h term of the disjunction that approximates v r is active, then we have that and, hence, .
The upper and lower estimators are only required to replace those velocity terms that appear in equations with more than two terms. On the other hand, equations containing only two of the remaining velocity terms can be written as follows:(A.16)
which is a linear constraint.
The grid in problem CMILP can be updated in different ways. A possible strategy to perform the updating consists of including in it the middle points of the active subintervals in the solution of the master problem CMILP. Therefore, if the solution of CMILP is such that (i.e., interval h is active), then the grid corresponding to v r is modified by adding the new point . Alternatively, the grid can be updated by just adding the optimal solution obtained in the lower level problem RNLP.
In each iteration, additional hyper-planes can be added in the master problem in a similar way as it is done with the grid updating. Thus, the logarithmic terms can be linearized either at the middle points of the active subintervals or at the optimal values obtained in the lower level problem RNLP.
The approach presented can easily handle the case in which lower and upper bounds are imposed on the apparent rate constants γ r and/or the concentrations of metabolites X j . These conditions can be expressed via the following constraints:(A.19)(A.20)
The approach presented also allows to fix upper bounds on the summation of γ r and X j . This can be accomplished by adding the following inequalities:(A.23)(A.24)
Constraints A.25 and A.26 are convex, and hence can be linearized in a similar way as was done with equation A.6. Note, however, that the definition of lower bounds on the summation of γ r and X j leads to nonconvex terms. In this latter case, additional piecewise estimators are required to preserve the convexity of the model.
Let us finally note that different types of piecewise functions could be applied in the master problem .
Modified master problem for the feasibility method
In the context of our algorithm, this master problem is employed to predict whether a feasible solution exists in the search space SP or not.
Modified slave problem for the feasibility method
Note that in model RFNLP, the lower and upper bounds imposed to γ r are given by the values of the binary variables z rt in the master problem, whereas the values of and correspond to the limits of the active intervals of the linear piecewise approximations in CFMILP.
Integer cuts for the feasibility method
where and , with being the value of the s component of the vector of binary variables in the feasible solution identified in the outer iteration b. The sets and are therefore obtained in each outer iteration b, and are employed to derive integer cuts that are added cumulatively to the master problem CFMILP.
Note that the algorithm only requires solving the problems to feasibility, i.e., a feasible solution of the problems is sufficient for the algorithm goal. However, by defining a small tolerance error tol, the algorithm can also determine the optimal solution in the region SP explored at each iteration within the predefined optimality gap.
The integer cuts in equation A.31 are added cumulatively at each iteration to the upper-level model CFMILP, which leads to an increase in its size.
The authors wish to acknowledge support of this research work from the Spanish Ministry of Education and Science (MEC) (projects BFU2005-00234/BMC, BFU2008-00196/BMC, DPI2008-04099, CTQ2009-14420-C02-01 and PHB2008-0090-PC) and the Spanish Ministry of External Affairs (projects A/8502/07, HS2007-0006 and A/020104/08).
- Darwin C, Wallace AR: On the Tendency of Species to form Varieties; and on the Perpetuation of Varieties and Species by Natural Means of Selection. Journal of the Proceedings of the Linnean Society of London 1958, Zoology 3: 46–50.Google Scholar
- Darwin C: On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. 1st edition. London: John Murray; 1859.Google Scholar
- Savageau MA: Biochemical Systems Analysis: A Study of Function and Design in Molecular Biology. Reading, Mass.: Addison-Wesley; 1976.Google Scholar
- Turner JS: The Tinkerer's Accomplice: How Design Emerges from Life Itself. Harvard University Press; 2007.View ArticleGoogle Scholar
- Savageau MA: Optimal design of feedback control by inhibition. Journal of Molecular Evolution 1974, 4(2):139–156. 10.1007/BF01732019View ArticlePubMedGoogle Scholar
- Savageau MA: Comparison of classical and autogenous systems of regulation in inducible operons. Nature 1974, 252(5484):546–549. 10.1038/252546a0View ArticlePubMedGoogle Scholar
- Savageau MA: Significance of autogenously regulated and constitutive synthesis of regulatory proteins in repressible biosynthetic systems. Nature 1975, 258(5532):208–214. 10.1038/258208a0View ArticlePubMedGoogle Scholar
- Hlavacek WS, Savageau MA: Rules for coupled expression of regulator and effector genes in inducible circuits. Journal of Molecular Biology 1996, 255: 121–139. 10.1006/jmbi.1996.0011View ArticlePubMedGoogle Scholar
- Hlavacek WS, Savageau MA: Completely uncoupled and perfectly coupled gene expression in repressible systems. Journal of Molecular Biology 1997, 266(3):538–558. 10.1006/jmbi.1996.0811View ArticlePubMedGoogle Scholar
- Alves R, Savageau MA: Irreversibility in unbranched pathways: preferred positions based on regulatory considerations. Biophysical journal 2001, 80(3):1174–1185. 10.1016/S0006-3495(01)76094-8PubMed CentralView ArticlePubMedGoogle Scholar
- Alves R, Savageau MA: Comparative analysis of prototype two-component systems with either bi-functional or monofunctional sensors: differences in molecular structure and physiological function. Molecular microbiology 2003, 48: 25–51. 10.1046/j.1365-2958.2003.03344.xView ArticlePubMedGoogle Scholar
- Alves R, Savageau MA: Evidence of selection for low cognate amino acid bias in amino acid biosynthetic enzymes. Molecular microbiology 2005, 56(4):1017–1034. 10.1111/j.1365-2958.2005.04566.xPubMed CentralView ArticlePubMedGoogle Scholar
- Igoshin OA, Price CW, Savageau MA: Signalling network with a bistable hysteretic switch controls developmental activation of the sigma transcription factor in Bacillus subtilis. Molecular microbiology 2006, 61: 165–184. 10.1111/j.1365-2958.2006.05212.xView ArticlePubMedGoogle Scholar
- Igoshin OA, Alves R, Savageau MA: Hysteretic and graded responses in bacterial two-component signal transduction. Molecular microbiology 2008, 68(5):1196–1215. 10.1111/j.1365-2958.2008.06221.xPubMed CentralView ArticlePubMedGoogle Scholar
- Dasika MS, Maranas CD: OptCircuit: an optimization based method for computational design of genetic circuits. BMC Systems Biology 2008, 2: 24. 10.1186/1752-0509-2-24PubMed CentralView ArticlePubMedGoogle Scholar
- Salvador A, Savageau MA: Quantitative evolutionary design of glucose 6-phosphate dehydrogenase expression in human erythrocytes. Proceedings of the National Academy of Sciences of the United States of America 2003, 100(24):14463–14468. 10.1073/pnas.2335687100PubMed CentralView ArticlePubMedGoogle Scholar
- Salvador A, Savageau MA: Evolution of enzymes in a series is driven by dissimilar functional demands. Proceedings of the National Academy of Sciences of the United States of America 2006, 103(7):2226–2231. 10.1073/pnas.0510776103PubMed CentralView ArticlePubMedGoogle Scholar
- Voit EO, Radivoyevitch T: Biochemical systems analysis of genome-wide expression data. Bioinformatics 2000, 16(11):1023–37. 10.1093/bioinformatics/16.11.1023View ArticlePubMedGoogle Scholar
- Voit EO: Design principles and operating principles: the yin and yang of optimal functioning. Math Biosci 2003, 182: 81–92. 10.1016/S0025-5564(02)00162-1View ArticlePubMedGoogle Scholar
- Vilaprinyo E, Alves R, Sorribas A: Use of physiological constraints to identify quantitative design principles for gene expression in yeast adaptation to heat shock. BMC Bioinformatics 2006, 7: 184. 10.1186/1471-2105-7-184PubMed CentralView ArticlePubMedGoogle Scholar
- Klipp E, Nordlander B, Kruger R, Gennemark P, Hohmann S: Integrative model of the response of yeast to osmotic shock. Nature biotechnology 2005, 23(8):975–982. 10.1038/nbt1114View ArticlePubMedGoogle Scholar
- Alvarez-Vasquez F, Sims KJ, Cowart LA, Okamoto Y, Voit EO, Hannun YA: Simulation and validation of modelled sphingolipid metabolism in Saccharomyces cerevisiae. Nature 2005, 433(7024):425–430. 10.1038/nature03232View ArticlePubMedGoogle Scholar
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proceedings of the National Academy of Sciences of the United States of America 1998, 95(25):14863–14868. 10.1073/pnas.95.25.14863PubMed CentralView ArticlePubMedGoogle Scholar
- Gasch AP, Spellman PT, Kao CM, Carmel-Harel O, Eisen MB, Storz G, Botstein D, Brown PO: Genomic expression programs in the response of yeast cells to environmental changes. Molecular biology of the cell 2000, 11(12):4241–4257.PubMed CentralView ArticlePubMedGoogle Scholar
- Causton HC, Ren B, Koh SS, Harbison CT, Kanin E, Jennings EG, Lee TI, True HL, Lander ES, Young RA: Remodeling of yeast genome expression in response to environmental changes. Molecular biology of the cell 2001, 12(2):323–337.PubMed CentralView ArticlePubMedGoogle Scholar
- Molina-Navarro MM, Castells-Roca L, Belli G, Garcia-Martinez J, Marin-Navarro J, Moreno J, Perez-Ortin JE, Herrero E: Comprehensive transcriptional analysis of the oxidative response in yeast. The Journal of biological chemistry 2008, 283(26):17908–17918. 10.1074/jbc.M800295200View ArticlePubMedGoogle Scholar
- Polisetty PK, Gatzke EP, Voit EO: Yield optimization of regulated metabolic systems using deterministic branch-and-reduce methods. Biotechnol Bioeng 2008, 99(5):1154–69. 10.1002/bit.21679View ArticlePubMedGoogle Scholar
- Savageau MA: Biochemical systems analysis. I. Some mathematical properties of the rate law for the component enzymatic reactions. Journal of theoretical biology 1969, 25(3):365–369. 10.1016/S0022-5193(69)80026-3View ArticlePubMedGoogle Scholar
- Savageau MA: Biochemical systems analysis. II. The steady-state solutions for an n-pool system using a power-law approximation. Journal of theoretical biology 1969, 25(3):370–379. 10.1016/S0022-5193(69)80027-5View ArticlePubMedGoogle Scholar
- Savageau MA: Biochemical systems analysis. 3. Dynamic solutions using a power-law approximation. Journal of theoretical biology 1970, 26(2):215–226. 10.1016/S0022-5193(70)80013-3View ArticlePubMedGoogle Scholar
- Alves R, Vilaprinyo E, Hernandez-Bermejo B, Sorribas A: Mathematical formalisms based on approximated kinetic representations for modeling genetic and metabolic pathways. Biotechnology and Genetic Engineering Reviews 2008, 25: 1–40.View ArticlePubMedGoogle Scholar
- Alves R, Vilaprinyo E, Sorribas A: Integrating Bioinformatics and Computational Biology: Perspectives and Possibilities for In Silico Network Reconstruction in Molecular Systems Biology. Current Bioinformatics 2008, 3(2):98–129. 10.2174/157489308784340694View ArticleGoogle Scholar
- Voit EO: Computational Analysis of Biochemical Systems. In A Practical Guide for Biochemists and Molecular Biologists. Cambridge, U.K.: Cambridge University Press; 2000.Google Scholar
- Chou IC, Voit EO: Recent Developments in Parameter Estimation and Structure Identification of Biochemical and Genomic systems. Math Bisoc 2009, (219):57–83. 10.1016/j.mbs.2009.03.002
- Goel G, Chou IC, Voit EO: System Estimation from Metabolic Time Series Data. Bioinformatics (Oxford, England) 2008, 24(21):2505–11. 10.1093/bioinformatics/btn470View ArticleGoogle Scholar
- Curto R, Voit EO, Sorribas A, Cascante M: Mathematical models of purine metabolism in man. Mathematical biosciences 1998, 151: 1–49. 10.1016/S0025-5564(98)10001-9View ArticlePubMedGoogle Scholar
- Alves R, Herrero E, Sorribas A: Predictive reconstruction of the mitochondrial iron-sulfur cluster assembly metabolism: I. The role of the protein pair ferredoxin-ferredoxin reductase (Yah1-Arh1). Proteins 2004, 56(2):354–66. 10.1002/prot.20110View ArticlePubMedGoogle Scholar
- Alves R, Herrero E, Sorribas A: Predictive reconstruction of the mitochondrial iron-sulfur cluster assembly metabolism. II. Role of glutaredoxin Grx5. Proteins 2004, 57(3):481–92. 10.1002/prot.20228View ArticlePubMedGoogle Scholar
- Marin-Sanguino A, Voit EO, Gonzalez-Alcon C, Torres NV: Optimization of biotechnological systems through geometric programming. Theor Biol Med Model 2007, 4: 38. 10.1186/1742-4682-4-38PubMed CentralView ArticlePubMedGoogle Scholar
- Bailey J, Birnbaum S, Galazzo J, Khosla C, Shanks J: Strategies and challenges in metabolic engineering. Ann NY Acad Sci 1990, 589: 1–15. 10.1111/j.1749-6632.1990.tb24230.xView ArticlePubMedGoogle Scholar
- Cameron D, Tong J: Cellular and metabolic engineering: an overview. Appl Biochem Biotechnol 1993, 38: 105–140. 10.1007/BF02916416View ArticlePubMedGoogle Scholar
- Cameron D, Chaplen F: Developments in metabolic engineering. Curr Opin Biotechnol 1997, 8: 175–180. 10.1016/S0958-1669(97)80098-5View ArticlePubMedGoogle Scholar
- Mendes P, Kell D: Making cells work - metabolic engineering for everyone. Trends Biotechnol 1996, 15: 6–7. 10.1016/S0167-7799(96)30030-9View ArticleGoogle Scholar
- Banga JR: Optimization in computational systems biology. BMC Syst Biol 2008, 2: 47. 10.1186/1752-0509-2-47PubMed CentralView ArticlePubMedGoogle Scholar
- Voit EO: Optimization in integrated biochemical systems. Biotechnol Bioeng 1992, 40(5):572–82. 10.1002/bit.260400504View ArticlePubMedGoogle Scholar
- Alvarez-Vasquez F, Canovas M, Iborra JL, Torres NV: Modeling, optimization and experimental assessment of continuous L-(-)-carnitine production by Escherichia coli cultures. Biotechnol Bioeng 2002, 80(7):794–805. 10.1002/bit.10436View ArticlePubMedGoogle Scholar
- Marin-Sanguino A, Torres NV: Optimization of biochemical systems by linear programming and general mass action model representations. Math Biosci 2003, 184(2):187–200. 10.1016/S0025-5564(03)00046-4View ArticlePubMedGoogle Scholar
- Biegler JT, Grossmann IE: Retrospective on optimization. Computers and Chemical Engineering 2004, 28: 1169–1192.View ArticleGoogle Scholar
- Floudas CA: Deterministic global optimization: Theory. In Methods and Applications. Dordrecht, The Nether-lands: Kluwer, Academic Publishers; 2000.Google Scholar
- Bergamini ML, Aguirre P, Grossmann IE: Logic-based outer approximation for globally optimal synthesis of process networks. Computers and Chemical Engineering 2005, 29: 1914–1933. 10.1016/j.compchemeng.2005.04.003View ArticleGoogle Scholar
- Bergamini ML, Scenna NJ, Aguirre P: Global Optimal Structures of Heat Exchanger Networks by Piecewise Relaxation. Industrial and Engineering Chemistry Research 2007, 46: 1752–1763. 10.1021/ie061288pView ArticleGoogle Scholar
- Bergamini ML, Grossmann IE, Scenna N, Aguirre P: An improved piecewise outer-approximation algorithm for the global optimization of MINLP models involving concave and bilinear terms. Computers and Chemical Engineering 2008, 32: 477–493. 10.1016/j.compchemeng.2007.03.011View ArticleGoogle Scholar
- Koonin EV: Darwinian evolution in the light of genomics. Nucleic acids research 2009, 37(4):1011–1034. 10.1093/nar/gkp089PubMed CentralView ArticlePubMedGoogle Scholar
- de Atauri P, Sorribas A, Cascante M: Analysis and prediction of the effect of uncertain boundary values in modeling a metabolic pathway. Biotechnology and bioengineering 2000, 68: 18–30. 10.1002/(SICI)1097-0290(20000405)68:1<18::AID-BIT3>3.0.CO;2-5View ArticlePubMedGoogle Scholar
- Voit EO, del Signore M: Assessment of effects of experimental imprecision on optimized biochemical systems. Biotechnol Bioeng 2001, 74: 443–448. 10.1002/bit.1135View ArticlePubMedGoogle Scholar
- Voit EO: A systems-theoretical framework for health and disease: Inflammation and preconditioning from an abstract modeling point of view. Mathematical biosciences 2008, 217(1):11–8. 10.1016/j.mbs.2008.09.005PubMed CentralView ArticlePubMedGoogle Scholar
- Vera J, Curto R, Cascante M, Torres NV: Detection of potential enzyme targets by metabolic modelling and optimization: application to a simple enzymopathy. Bioinformatics 2007, 23(17):2281–9. 10.1093/bioinformatics/btm326View ArticlePubMedGoogle Scholar
- Vera J, de Atauri P, Cascante M, Torres NV: Multi-criteria optimization of biochemical systems by linear programming: application to production of ethanol by Saccharomyces cerevisiae. Biotechnol Bioeng 2003, 83(3):335–43. 10.1002/bit.10676View ArticlePubMedGoogle Scholar
- Raman R, Grossmann IE: Modeling and computational techniques for logic based integer programming. Comput Chem Eng 1994, 18: 563. 10.1016/0098-1354(93)E0010-7View ArticleGoogle Scholar
- Nemhauser GL, Wolsey LA: Integer and Combinational Optimization. New York: John Wiley; 1998.Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.