New time-scale criteria for model simplification of bio-reaction systems

Background Quasi-steady state approximation (QSSA) based on time-scale analysis is known to be an effective method for simplifying metabolic reaction system, but the conventional analysis becomes time-consuming and tedious when the system is large. Although there are automatic methods, they are based on eigenvalue calculations of the Jacobian matrix and on linear transformations, which have a high computation cost. A more efficient estimation approach is necessary for complex systems. Results This work derived new time-scale factor by focusing on the problem structure. By mathematically reasoning the balancing behavior of fast species, new time-scale criteria were derived with a simple expression that uses the Jacobian matrix directly. The algorithm requires no linear transformation or decomposition of the Jacobian matrix, which has been an essential part for previous automatic time-scaling methods. Furthermore, the proposed scale factor is estimated locally. Therefore, an iterative procedure was also developed to find the possible multiple boundary layers and to derive an appropriate reduced model. Conclusion By successive calculation of the newly derived time-scale criteria, it was possible to detect multiple boundary layers of full ordinary differential equation (ODE) models. Besides, the iterative procedure could derive the appropriate reduced differential algebraic equation (DAE) model with consistent initial values, which was tested with simple examples and a practical example.


Background
The dynamic simulation of bio-reaction pathways is becoming more important as the kinetic information of various pathways is revealed. Moreover, the necessary data for the specific pathways are easily obtained through various channels, including the internet. However, there are fundamental difficulties in the numerical solution of the differential equation system: model stiffness is one. a bioreaction system consists of multiple reactions with various enzymes that have different turn-over numbers, meaning various magnitudes of reaction rates. Furthermore, the metabolites involved in one reaction can also participate in other reactions in the same system. These characteristics make typical bio-systems strongly coupled and have mul-tiple time-scales. Therefore, ordinary differential equations (ODEs) based on the dynamic modeling of a metabolic system are usually stiff.
The stiffness problem requires unnecessary effort to track the boundary layer solutions, hence, the computational efficiency decreases. Furthermore, when the simulation is concurrent with the experiment, the calculation accuracy is closely related to the measurement interval. If the measurement interval is modest so that it is impossible to find some specific parameters, then the numerical result does not need to be precise. In these situations, by sacrificing the accuracy modestly, simplifying the model structure is necessary, which is often the case for parameter estimations and sensitivity analyses [1].
For this purpose, apart from a numerical analysis approach, the kinetic field's specific solution methods have been required [2]. Traditionally, simplification of an original complex model, such as a quasi-steady state approximation (QSSA) and a partial equilibrium approximation, have been applied to relieve the stiffness characteristics [3][4][5][6][7]. However, since these approaches require the practitioner's intuition and experience, computational methods have been developed, especially vividly in combustion engineering fields. There are two important procedures in computational model simplification: the determination of the simplification criteria and the determination of the slow invariant manifolds.
Some computational methods concentrated on deriving the correct slow manifolds. They are iterative trajectory based methods [8,9], the method of invariant manifold (MIM) [10], the minimal entropy production trajectory (MEPT) based methods [11,12], and the nonlinear model reduction method [13]. Gorban et al. collected and reviewed such kinds of methods [14].
The others suggested the appropriate simplification criteria: generalized sensitivity analyses [15] used singular values of the sensitivity matrix as the scale factor, while computational singular perturbation (CSP) [16][17][18], intrinsic low dimensional manifold (ILDM) [19], and dynamic dimension reduction, which is a modified version of ILDM [20,21], used the eigenvalue analysis of the system's Jacobian matrix with focusing on the time-scales of the system. Currently, ILDM based methods have been applied to reduce complexity of biochemical systems [21][22][23]. Since CSP or ILDM based methods use the dynamic properties of a system, they can give the dynamically useful information of the system.
Even if there are several differences between the CSP and ILDM, the most important ideas are similar: the determination of the speed ranking is based on the eigenvalue cal-culation of the Jacobian matrix, which require at least O(n 3 ) flops of computation, and the derivation of solution is based on a linear transformation of the original system. Furthermore, they share common barriers in producing an explicitly reduced model, generally as a result of the system transformation. This feature is important when model simplification approaches are related to not only computational efficiency but also parameter estimation.
This work suggests an automatic method for speed ranking that directly uses the Jacobian matrix without a system transformation. Due to its simplicity, this approach requires O(n) flops of computation, being more physically intuitive. In addition to the scaling procedure, the decision of the differential and algebraic variables in a slow dynamic regime after relaxation of the fast dynamics is also introduced. The result of the proposed process is an explicitly phrased model, which can be a route to distinguish the meaningful parameters from the unobservable ones.

Criteria for balancing
In homogeneous chemical kinetics, the dynamic model can be written in the following form of the ODE.
Since a chemical reaction system generally consists of production and loss terms, the ODE can be rewritten as: or with a matrix-vector notation, where y ∈ ‫ޒ‬ +n is a concentration vector, P i : ‫ޒ‬ +n → ‫ޒ‬ + is a production term, and L i : ‫ޒ‬ +n → ‫ޒ‬ + is a loss terms. S is a stoichiometric matrix and v is a reaction rate vector. The subscripts P and L denote the production and loss, respectively. Generally, it can be said that if y i exhibits a quasi-steady state behavior, such behavior is observed after a short period of time for the corresponding P i and L i to balance each other. In a normal computational environment, the period can be readily estimated from equation (1), but the applicability of QSSA cannot be directly determined based on the estimation. The goal of this work is to determine the proper criteria to determine the applicability of the QSSA based on the estimation of balancing period. At the moment either P i or L i enlarge, the period during which P i balances with L i can be evaluated in a simple manner.
By chain rule, Let δt i be a short time period after which P i and L i balance each other. Then, we have following relationship: where superscript 0 indicates the reference value. Rearranging the equation gives: If the denominator on the righthand side of equation (3) is not zero, we can compute the time scale δt i from this equation. If the magnitude of the time scale is large, namely, |δt i | > t for some t > 0, y i is considered to exhibit slow dynamics and QSSA is not applied.
If |δt i | < t , y i exhibits fast dynamics, QSSA can be applied depending on the sign of δt i .
• If δt i is positive, it will reach a balancing state quickly and QSSA can be applied to y i .
• If δt i is negative, the applicability of QSSA cannot be determined by δt i alone.
If the denominator is zero, the numerator L i -P i should be considered for following three cases: • P i ≠ L i indicates a non-reducing state between production and loss. QSSA should not be applied.
• P i = L i = 0 indicates that no dynamics has occurred yet for y i . QSSA should not be applied.
• P i = L i ≠ 0 indicates that δt i should be set to zero since complete balance has been obtained.
The time scale of the element reaction can be also estimated using the same method, which will be used to determine a closed subsystem later.
Another scale factor should be used to determine whether the i'th variable is balanced or not; the ratio of |f i | to the larger one of P i and |L i |, to determine the dynamics of y i . If r i is large, namely r i > r for some r > 0, the production and loss are neither balanced nor can be balanced soon, hence QSSA is not applicable to such y i .
Based on the values calculated from (3) to (5), the criteria to separate QSSA variables are as follows.
1. Both production and loss terms must exist.
2. For certain r << 1 and t << t f , the applicability of QSSA to the ith variable can be summarized as in Table 1.

Iterative balancing
The above-mentioned scale factors δt i and r i are locally determined at a certain moment, initially in the innermost boundary layer. Therefore, if there are multiple boundary layers, the separation process should be applied iteratively to exit the fastest dynamic regime and move to the next fastest one, jumping to the next boundary layers, and finally to the slow dynamic regime, outer region, Applicability of QSSA to y i where fast dynamics of the inner regions are fully relaxed. The iterative procedure locates the variable values of the fast dynamics, y(0), to the outer region, y(0 + ), which become the consistent initial values of the reduced system.
Once δt i and r i are computed, identifying the fast variables, one can decompose the original stoichiometric matrix into two sub-matrices, S f and S s . Equation (2) is rewritten as: where subscript s indicates slow variables and f indicates fast ones. To derive consistent initial values for the outer region, the following algebraic equation should be solved: The solution of equation (6) can give other scale factors that also satisfy the approximation criteria, which shows the possibility of multiple boundary layers with different time scales. The existence of multiple boundary layers with different time scales corresponds to the cascaded nested hierarchy concept of inertial manifolds [8]. For these events, equations for the iterative approach can be written as: where the superscript k is the number of iterations. From , and are iteratively calculated and the updated equation (7) is solved until |y k+1 -y k | satisfies the convergence criteria. The convergence criterion of this iterative procedure can be considered as the partial equilibrium among the fast variables. This is conceptually similar to the equilibrium value convergence of Lebiedz's work [11]. After convergence is achieved, the following reduced models with the modified initial values are derived: The solution of the algebraic equation (7) can be computed using the appropriate numerical method such as Newton's method or other similar methods. However, those methods sometimes give physically meaningless solutions. To overcome this, Mass and Pope applied an iterative technique that successively constructs sub-ODE systems and successively solves them [19]. This work applied a similar approach to that of Mass and Pope to find the plausible initial guesses for the algebraic equation (7).
From equation (7), the columns of S f , say s j , give information about the closing and opening of the subsystem by the j 'th reaction. If s j is composed of only non-negative signs or only non-positive signs, this system is opened by the j'th reaction. If the time scale of the opening j'th reaction, δτ j , is large, say δτ j > t , the reaction is excluded from the subsystem, resulting in a closed subsystem, Since the closed system always has steady state, the k'th solution of the transient subsystem reaching a steady state is used as the initial guess for equation (7).

Overall process
The criteria are used to detect the existence of a boundary layer and the iterative balancing computes the initial values of the reduced model at the border of the outer region. There are two categories of QSSA possibilities. The first category is that the large deviation between P i and L i decreases very fast, which gives a relatively larger r i but a small δt i . The second is that P i and L i are almost balanced by a coincidental initial value. In this case, the approximation is also dependent on how small δt i is. Then the itera- This process can also identify the boundary layers that occur internally, not only at the initial point, because the detection checks the possibility of QSSA at every step of the calculation. These features are illustrated in the next section. In summary, the overall detecting and balancing process is as follows: Iterative balancing process 1. Calculate δt i , δτ j , and r i using equations (3) through (5).
2. If no species can be approximated, then go to step 4, or else go to step 3.

Results and Discussion
The Michaelis-Menten kinetics without inhibition, and with inhibition, are considered in this study. The parameters are (k 1 , k 2 , k 3 , k 4 , k 5 ) = (500000, 5, 1000, 100, 0.16) and the initial values are (e 0 , s 0 , c 10 , c 20 , p 0 ) = (1, 100, 0, 0, 0) and (e 0 , s 0 , c 10 , c 20 , p 0 , i 0 , ei 0 ) = (1, 100, 0, 0, 0, 1000, 0) [24]. As can be seen, there are two boundary layers at the initial region and near t = 900 (see Figure 1). Since p is only produced, its dynamics are not considered when searching for the fast balancing species. In the initial region, the estimated value of δt ≈ 3.96 × 10 -8 and species e, s, and c 1 were selected as fast variables, as expected. The subsystem composed of the chosen species was opened by the second reaction, hence the second reaction was removed from the subsystem. Based on the solution from the first iteration, species c 1 and c 2 were selected. Since the δt of e and s remains small, there indices were maintained as fast variables. The updated subsystem was opened by the third reaction, and consequently, the reaction was excluded. After the second iteration, the solution converged and was stored, then the open system was solved in a small interval period. Since a very large δt of s was identified in this refining step, s was excluded from the fast variable set. Finally, e, c 1 , and c 2 were selected as QSSA variables before the second boundary layer. The values of δt i at each iteration after three iterations of the iterative process are listed in Table  2. At the second boundary layer, the predefined criteria gave another iterative process and relocated the solution toward the outer area in the same manner as described above. For comparison with the conventional manual QSSA approach, the time scales of each species were also derived by mathematical balancing [25]. The meaning of the time-scales of the fast variables from the conventional derivation is the time to exit the boundary layer. Therefore, the summation of δt i for every iteration until the species enters the outer region is the direct comparative value of the conventional scales (see Table 3). A similar tendency is observed between the sums of δt i and the mathematical scales. As in Table 3, the sums of δt i and the mathematically scaled values indicate that there are two boundary layers near t ≈ 10 -8 and t ≈ 10 -3 before exiting the initial regions. A semi-log plot of the full model simulation ( Figure 2) supports this expectation.
The second example, the Michaelis-Menten kinetics with inhibition, shows a boundary layer at the initial area only (see equation (11) and the inhibition. These complex dynamics of the second example require a few more iterations than that of the first example to exit the initial boundary layer.
The third example is the caspase activation model given by equation (12) in [26].
The reaction rate equations for equation (12) [26]. There are also two boundary layers at the initial and internal regions, but with a much larger δt relative to the former cases; δt ≈ O(10 -1 ) at the initial area and δt ≈ O(1) at the internal boundary layer (Figure 4).
Although it is difficult to recognize the initial boundary layer in the figure because of the small change in the concentrations relative to that of the internal boundary layer, the initial boundary layer exists in the third model, which can be overlooked by the simple observation of the full   δt values during the initial iterative process of two complex Michaelis- Mathematically derived time scales and sums of δt i of two complex

Menten kinetics
Michaelis-Menten kinetics model simulation. The existence of the initial boundary layer of the third model could be detected by the proposed algorithm. Figure 1 and Figure 3 show that the exact value of the parameters k 1 , k 2 , k 3 , and k 4 cannot be properly identified using measurement intervals larger than O (10 -8 ). Figure 4 also illustrates that it is impossible to obtained the specific parameters of the original model with a measurement interval larger than O (1). Only the ratios of some species's concentrations can be observed. Therefore, the ratios are meaningful for these situations, not the detailed dynamics occurred within the period smaller than the measurement interval. The suggested scheme improves computational efficiency in the stiff inner region and extracts the information of experimentally meaningful QSSA concentrations for the corresponding species. Besides, by Simplification results (II) Simplification results (III) Figure 4 Simplification results (III). Concentrations of c8 and c8* for the caspase system; (a) the solution profile of c8, from the full ODE model (c8) and from the reduced model (c8_red) and (b) the solution profile of c8*, from the full ODE model (c8*) and from the reduced model (c8*_red).
determining a measurement scale, the level of simplification can be controlled easily.

Conclusion
This work proposed new criteria for the time-scale analysis and iterative balancing approach to develop an automatic simplification. This approach has different consistent initial values after model reduction and successfully found consistent initial values of the simplified DAE model using iterative balancing. With some examples, from small systems to practical systems, this scheme gave a successful reduction and found consistent initial values. If a whole cell is the system of the dynamic simulation, the network of reaction pathways will be more complex, the number of variables will be increased, and the simulation will be more difficult. Henceforth, it may be also important to relate the derived model to the experimental view, and this approach can give the criteria to classify the meaningful values from the original model.