FFCA: a feasibility-based method for flux coupling analysis of metabolic networks

Background Flux coupling analysis (FCA) is a useful method for finding dependencies between fluxes of a metabolic network at steady-state. FCA classifies reactions into subsets (called coupled reaction sets) in which activity of one reaction implies activity of another reaction. Several approaches for FCA have been proposed in the literature. Results We introduce a new FCA algorithm, FFCA (Feasibility-based Flux Coupling Analysis), which is based on checking the feasibility of a system of linear inequalities. We show on a set of benchmarks that for genome-scale networks FFCA is faster than other existing FCA methods. Conclusions We present FFCA as a new method for flux coupling analysis and prove it to be faster than existing approaches. A corresponding software tool is freely available for non-commercial use at http://www.bioinformatics.org/ffca/.


Background
Constraint-based analysis of metabolic networks has become increasingly important for describing and predicting the behavior of living organisms [1,2]. While a growing number of metabolic network reconstructions have become available during the last years, the computational analysis of genome-scale networks with hundreds or thousands of reactions may still be very time-consuming. Therefore, there is a need for more efficient algorithms and tools (see e.g. [3][4][5]).
Flux coupling analysis (FCA) [6] is a useful method for finding dependencies between fluxes of a metabolic network at steady-state. If a non-zero flux through a reaction in steady-state implies a non-zero flux through another reaction, then the first reaction is said to be coupled to the second reaction. Several studies have used FCA for exploring various biological questions such as network evolution [7][8][9], gene essentiality [7], analysis of experimentally measured fluxes [10,11] or gene regulation [12,13]. Having a time efficient implementation of FCA is important in such studies.
In the rest of this section, we first recall some basic definitions. Then we briefly review the previously proposed FCA methods.

Mathematical definitions Basic preliminaries
For a metabolic network with m (internal) metabolites and n reactions, the stoichiometric matrix S is an m × n matrix, where element S ij is the stoichiometric coefficient of metabolite i in reaction j. We denote by R = {1, ..., n} the set of reactions and by M = {1, ..., m} the set of metabolites.
There are two types of reactions in a metabolic network. Irr is the set of irreversible reactions which are assumed to always have non-negative flux values. Rev is the set of reversible reactions which are allowed to have positive, negative or zero flux values. If a reversible reaction has a positive (resp. negative) flux, we say that it is working in forward (resp. backward) direction. Splitting a reversible reaction i means making reaction i irreversible and adding one more irreversible reaction i + n to the network, which works in the backward direction. Without loss of generality, we assume that in the numbering of reactions, the first |Rev| reactions are the reversible ones.
If v is the vector of metabolic fluxes through all the reactions, the steady-state condition holds for v if S · v = 0. The steady-state flux cone is defined as If for some reaction r R, v r = 0 for all v C, then r is called blocked, otherwise r is unblocked. The set of blocked reactions will be denoted by Blk. The set of unblocked reactions can be further partitioned based on the reversibility type of reactions [14]. We define Irev as the set of all reactions that can work only in one direction, i.e., those reactions that take only non-negative or only non-positive flux values at steady-state. If i Irev and v i ≤ 0 for all v C, one can multiply by -1 the i-th column of the stoichiometric matrix, such that in the modified network, we have v i ≥ 0 for all v C. Without loss of generality, we assume that all reactions in Irev are operating in the forward direction.
The set of reactions that can work in both directions at steady-state is further partitioned into two subsets: (a) Prev, the set of pseudo-irreversible reactions, and (b) Frev, the set of fully reversible reactions. A reaction i is in Prev if it can work in both directions, and additionally, for all flux vectors v in the lineality space of the flux cone, we have v i = 0. Accordingly, we define Frev as the set of those reactions, which can work in both directions, and additionally, can have a non-zero flux value if fluxes through all irreversible reactions are set to zero.
Flux coupling relations between a pair of reactions [6] Consider two unblocked reactions i, j R. If for all v C, v i ≠ 0 implies v j ≠ 0, then i is directionally coupled to j. This will be denoted by i j, or equivalently, j i. If for all v C, v i ≠ 0 implies v j ≠ 0 and vice versa, then we say that i and j are partially coupled, or i ↔ j. If i and j are partially coupled, and additionally there exists a constant k such that for all v C, v i ≠ 0 implies v j /v i = k, then i and j are fully coupled, or i ⇔ j. If neither i j nor j i holds, then i and j are uncoupled, or i Un ←→ j .

Elementary flux patterns
The set supp(v) = {i R | v i ≠ 0} of reactions active in v is called the support of v. Suppose v C is a flux vector and A ⊆ R is a subnetwork. The flux pattern of v for A is defined as A ∩ supp(v), which is the set of those reactions in A which have non-zero values in v [15]. A flux pattern is called an elementary flux pattern (EFP) if it cannot be written as the union of other flux patterns. For studying EFPs, Kaleta et al. [15] assume that the network contains only irreversible reactions. To achieve this, every reversible reaction should be split into two irreversible reactions (forward and backward).

Approaches to Flux Coupling Analysis
In this subsection, we briefly recall existing approaches to flux coupling analysis. For additional information and the technical details on the implementation of these algorithms, we refer to Additional file 1.

Flux Coupling Finder Algorithm (FCF)
The most widely used method for FCA is the Flux Coupling Finder (FCF) algorithm [6]. This approach is based on solving linear programs (LPs), and therefore is an optimality-based method. After finding blocked reactions and splitting reversible reactions, for every pair of unblocked reactions i and j, two LPs are solved. Depending on the optimal values obtained, the coupling relation between i and j is determined. There is a postprocessing step in FCF. Since the reversible reactions have been split, flux coupling relations for these reactions have to be obtained from the coupling relations of the corresponding irreversible forward and backward reactions.
The FCF algorithm has been successfully used for finding coupling relations in a number of metabolic networks [6][7][8][9][10][11][12][13]16]. However, this approach is rather timeconsuming for genome-scale metabolic networks with thousands of reactions, although it is still one of the fastest FCA methods.

FCA based on Minimal Metabolic Behaviors (MMB-FCA)
Larhlimi and Bockmayr [14,17] have proposed a different strategy for flux coupling analysis. In this approach, a minimal set of generating vectors of the flux cone is computed. Then, the coupling relation for any pair of reactions is inferred based on the co-appearance of nonzero fluxes in the generating vectors.

FCA based on elementary flux patterns (EFP-FCA)
Recently, Kaleta et al. [15] introduced the concept of elementary flux patterns (EFPs) for the analysis of minimal active reactions in a "subnetwork", which account for possible steady-state flux distributions in a (much) larger metabolic network. They also presented a method based on mixed-integer linear programming (MILP) to compute EFPs. Kaleta  is not possible to distinguish between partial and full coupling, since flux patterns only contain the information about the activity or inactivity of the fluxes, but not the flux values.

Further strategies for improving flux coupling analysis
In FCF, and some other FCA approaches, every reversible reaction is split into a forward and a backward reaction. This splitting procedure results in an increase in the number of LPs and also in the size of each LP to be solved. Moreover, a complicated postprocessing step is required to infer the flux coupling relations of the original reversible reactions. For all these reasons, it might be better to perform flux coupling analysis without splitting the reversible reactions. An implementation of FCF without splitting is presented in Additional file 1.
Additionally, Larhlimi and Bockmayr [14] show that depending on the reversibility type of the reactions, only certain flux coupling relations can occur. Two unblocked reactions i and j can be coupled only if one of the following 4 cases holds (note that initially 3 × 3 = 9 reversibility types for the pair (i, j) would be possible): Therefore, it is enough to determine the reversibility types of i and j, and then check if the corresponding coupling relation holds. This will be referred to as "Reversibility-Type prunings" (or simply, RT-prunings). RT-prunings were originally used in MMB-FCA as presented in [14].
After RT-prunings, a further improvement can be applied to LP-based FCA methods. Assume that in the stoichiomtric matrix S, the columns corresponding to the blocked reactions have been removed. According to Proposition 6.13 in [18], for two reactions i and j such that i, j Prev or i, j Frev, the following two statements are equivalent: This means that it is sufficient to solve only one LP: The two reactions i and j will be fully coupled if and only if in the above LP v max j = 0 . Note that the irreversibility constraints are not included in the above LP. Therefore, not only solving one LP is sufficient, but also the LP becomes smaller. This improvement will be called Prev/Frev-based improvement (or simply, PFimprovement). An improved version of FCF, which takes into account all of the above-mentioned improvements, has been suggested in [18]. Here we implemented this improved version of FCF and compared it with the other FCA approaches (see the Results and Discussion section).

Goals of the present work
FCA is a promising tool for metabolic network analysis. However, to perform FCA most authors seem to use their personal implementation of the FCF algorithm [9,13]. Only very recently, an implementation of FCF has been included in the latest versions of the FASIMU software [19,20]. On the other hand, several approaches for FCA have been proposed in the literature. It is not known which of these methods is the fastest in practice. In this paper, we present a novel "feasibility-based" flux coupling analysis method (FFCA) and compare it to previously existing approaches. A corresponding software tool is freely available for non-commercial use.

FFCA: Feasibility-based Flux Coupling Analysis
A linear programming problem maximize(c T x | Ax ≤ b) can be solved by enumerating a sequence of feasible solutions, i.e., solutions of the system of linear inequalities Ax ≤ b such that at each solution x the objective function value c T x improves. Accordingly, finding one feasible solution could be faster than computing an optimal solution (see [21] for the theoretical background).
In this section, we introduce a new approach for flux coupling analysis, FFCA, which is based on feasibility testing. Taking into account the previously mentioned strategies for improving flux coupling analysis, we propose the following procedure for FFCA: 1. i, j Irev: In this case, we check the feasibility of two systems of linear inequalities: and v i = 0, v j = 1, Sv = 0, v r ≥ 0, for all r ∈ Irr.  (P2)) is infeasible, then i j (resp. j i). If (P1) and (P2) are both infeasible, then i and j are partially (and maybe fully) coupled. To check whether they are fully coupled, one has to use other methods, e.g. computing enzyme subsets [22] or solving two LPs as in the FCF algorithm [6]. 2. i Irev and j Prev: The only possible coupling relation is j i (but not i j). Hence, (P1) will always be feasible, because feasibility of (P1) means that i j does not hold. However, we need to check the feasibility of (P2). Additionally, since j can take negative values, one more system should be checked for feasibility: It can be easily shown that if (P2) and (P3) are both infeasible, then j i. Otherwise, i and j are uncoupled. 3. i, j Prev or i, j Frev: In this case, the following system of linear inequalities should be checked for feasibility: If (P4) is infeasible, then i ⇔ j (because feasibility of (P4) implies j i, which in turn implies i ⇔ j based on Proposition 2 in [14]). If (P4) is feasible, then i and j are uncoupled.
To perform FFCA, a method is needed to check the feasibility of a system of linear inequalities. In practice this can be done by solving an LP constructed by the system of inequalities together with a constant objective function. Any feasible solution will be an optimal solution of the LP, and therefore, the LP solver will finish after finding the first feasible solution. For example, for checking the feasibility of (P1), one can solve the following LP: Since v i is constant, the optimal value exists if and only if this problem is feasible. Similar LPs can be solved to check the feasibility of (P2) and (P3).
In Table 1, the characteristics of the FFCA approach are compared to the other FCA methods studied in this article. Note that in all methods, blocked reactions are found and the possible irreversibility of initially reversible reactions is detected within the preprocessing step.

Comparison of the four FCA approaches
To compare the different approaches, namely FCF, MMB-FCA, EFP-FCA and FFCA, we implemented all of them in Matlab (see Additional file 1). A benchmark set of six metabolic network models was used for the evaluation. The number of unblocked reactions in these models ranges from 18 to 765. Table 2 summarizes the running  times, while Table 3 reports on the resulting coupling relations. One can see that in all cases FFCA is 2 to 3 times faster than FCF and orders of magnitude faster than EFP-FCA. Table 2 also shows that FFCA is more appropriate for FCA in genome-scale networks. MMB-FCA is the fastest method for the three smallest networks. However, for the middle-sized H. pylori network and especially for the large networks of S. cerevisiae and E. coli, FFCA proves to be faster than MMB-FCA. The computation time required for MMB-FCA rapidly grows when the number of reactions increases. This is due to the possibly exponential size of the set of generating vectors which has to be computed before finding the coupled reactions (see Section 4.4 in [18]). EFP-FCA, which is based on solving mixed-integer linear programs, turns out to be much slower than other methods. Although the concept of elementary flux patterns is very useful in the analysis of subnetworks, its applicability in full FCA seems to be limited.
Both the FCF algorithm and the current implementation of FFCA solve LPs for flux coupling analysis. One might ask why FFCA is faster than the classical FCF method. There are at least five major differences between FCF and FFCA: • When an LP is solved in FFCA, finding the first feasible solution is sufficient, while the LPs should be solved to optimality in case of the FCF algorithm.
• In the FCF method, in contrast to FFCA, every reversible reaction is split into two (forward and backward) irreversible reactions. This step slows down the procedure and increases the size of the LPs to be solved.
• For computing the flux coupling relation between any pair of reactions, we always need two LPs in FFCA, while in FCF sometimes more LPs have to be solved. For example, for computing the coupling relation between an irreversible and a reversible reaction (after splitting), four LPs are solved (see Additional file 1).
• The Reversibility-Type prunings [14] to reduce the number of possible coupled reaction pairs are only considered in FFCA.
• The PF-improvements are only considered in FFCA.
One can think of implementing FCF without splitting reversible reactions and/or with the RT-prunings and/or PF-improvement. In order to assess the importance of these issues, three improved versions of FCF were implemented (see the Additional file 1 and also Table 1): (i) FCF was re-implemented without splitting reactions (W-FCF); (ii) FCF was re-implemented without splitting reactions and with the RT-prunings (WR-FCF); and (iii) FCF was re-implemented without splitting reactions and with the RT-prunings and the PF-improvement (WRP-FCF), as suggested in [18].
In Table 2 the computational running times of these methods are also shown. As expected, the three versions of the improved FCF algorithm are better than the classical FCF algorithm, while they are still slower that FFCA.

Computational Properties of FFCA
Flux coupling analysis of genome-scale networks can be very time consuming. Only recently (see e.g. [9]) FCA has been used for some of the new genome-scale metabolic networks that contain more than 2000 reactions. To further illustrate our approach, we have applied FFCA to three of these very large networks. The results are reported in Table 4. We can see that FFCA needs 37-46 hours for each of the networks to perform a complete FCA. The numbers of the resulting flux coupling relations are also given in the table.
Since FFCA includes the RT-prunings and PF-improvement, it might be interesting to see how the number of coupling relations (and also the running time of FFCA) depends on the number of reversible reactions in a network. This problem is studied in Additional file 2. Briefly, the flux coupling relations have been computed for the original metabolic network of E. coli [23], for the modified metabolic network of E. coli [24] (which includes a smaller number of irreversible reactions), and for some random networks in between. As expected, the numbers of uncoupled pairs increase (and the running times generally decrease) with the increase in the number of reversible reactions.

Conclusions
We introduced FFCA as a new method for flux coupling analysis, and proved it to be faster than any other available approach. Our implementation of FFCA is fast enough to perform FCA for every pair of reactions in S. cerevisiae and E. coli genome-scale networks in a few hours. A corresponding software tool is available for non-commercial use at http://www.bioinformatics.org/ffca/ and we recommend it for FCA of genome-scale networks.
For FCA, all uptake reactions were assumed to be able to carry non-zero fluxes.

Comparison of different FCA methods
All computations were performed on a 64-bit Debian Linux system with Intel Xeon 3.0 GHz processor. The running times include the CPU time for preprocessing, computation of flux coupling relations, and post-processing (where necessary).

Additional material
Additional file 1: Different approaches to flux coupling analysis and implementation details. In this file, pseudocodes and implementation details of different FCA methods are presented.