An analytic and systematic framework for estimating metabolic flux ratios from 13C tracer experiments

Background Metabolic fluxes provide invaluable insight on the integrated response of a cell to environmental stimuli or genetic modifications. Current computational methods for estimating the metabolic fluxes from 13C isotopomer measurement data rely either on manual derivation of analytic equations constraining the fluxes or on the numerical solution of a highly nonlinear system of isotopomer balance equations. In the first approach, analytic equations have to be tediously derived for each organism, substrate or labelling pattern, while in the second approach, the global nature of an optimum solution is difficult to prove and comprehensive measurements of external fluxes to augment the 13C isotopomer data are typically needed. Results We present a novel analytic framework for estimating metabolic flux ratios in the cell from 13C isotopomer measurement data. In the presented framework, equation systems constraining the fluxes are derived automatically from the model of the metabolism of an organism. The framework is designed to be applicable with all metabolic network topologies, 13C isotopomer measurement techniques, substrates and substrate labelling patterns. By analyzing nuclear magnetic resonance (NMR) and mass spectrometry (MS) measurement data obtained from the experiments on glucose with the model micro-organisms Bacillus subtilis and Saccharomyces cerevisiae we show that our framework is able to automatically produce the flux ratios discovered so far by the domain experts with tedious manual analysis. Furthermore, we show by in silico calculability analysis that our framework can rapidly produce flux ratio equations – as well as predict when the flux ratios are unobtainable by linear means – also for substrates not related to glucose. Conclusion The core of 13C metabolic flux analysis framework introduced in this article constitutes of flow and independence analysis of metabolic fragments and techniques for manipulating isotopomer measurements with vector space techniques. These methods facilitate efficient, analytic computation of the ratios between the fluxes of pathways that converge to a common junction metabolite. The framework can been seen as a generalization and formalization of existing tradition for computing metabolic flux ratios where equations constraining flux ratios are manually derived, usually without explicitly showing the formal proofs of the validity of the equations.


Background
From microorganisms to animals and plants, cells adjust their metabolic operations to fulfill the demand of energy and biosynthetic precursors. In nature this is a challenging task because substrate availability is typically limited and often changing in its composition. To ensure viability on a broad palette of chemically heterogeneous substrates, cells have developed intertwined enzymatic networks that also confer robustness against genetic mutations. Optimum redistribution of molecular fluxes in metabolism is achieved by multilevel regulation circuits. In recent years, experimental measurement of in vivo metabolic fluxes has attracted much attention. For example, in biotechnology metabolic fluxes are utilized to lead rational strain engineering, whereas systems biologists assess fluxes to unravel targets and mechanisms of metabolic regulation.
Metabolic fluxes are often estimated using flux balance analysis (FBA) [1,2]. In FBA, fluxes are solved by fixing some objective for the metabolism of an organism, such as maximal growth. Then, a corresponding optimization problem is solved by using the stoichiometry of the metabolic network as a constraint to the optimization. FBA is a viable method for studying the metabolic capabilities of an organism, but as a method for estimating metabolic fluxes it has some weaknesses. First, selecting the correct objective for the metabolism is far from trivial [3], especially when mutant strains or behaviour in exceptional environmental conditions is analyzed. Second, there can be many biologically interesting flux distributions that give an optimal solution to the optimization problem of FBA.
A more direct method for experimental determination of the metabolic fluxes is to feed an organism with 13 C labelled substrate, observe the fate of 13 C atoms in the cell at isotopomeric steady state with mass spectrometry (MS) or nuclear magnetic resonance (NMR) measurements, and then infer the metabolic fluxes from the measurements. The rationale behind these 13 C tracer experiment is that, often alternative pathways between metabolites in the network manipulate the carbon backbones of the metabolites differently, thus inducing different 13 C labelling patterns to metabolites. Then, constraints to fluxes complementary to the basic stoichiometric constraints can be derived by measuring the relative abundances of different labelling patterns in the metabolites.
The main difficulty in applying the procedure in practice is that current measurement techniques only can produce incomplete information about relative abundances of different 13 C labelling patterns, the isotopomer distributions, of some metabolites, usually protein bound amino acids in the network, and no isotopomer information at all for many intermediate metabolites of interest [4][5][6]. This imposes a highly non-linear dependency between the measured isotopomer distributions of the metabolites and the metabolic fluxes, which is very challenging to solve both computationally and statistically.
Currently, two main approaches for 13 C metabolic flux analysis exist. In the global isotopomer balancing approach, the problem of estimating metabolic fluxes from the isotopomer measurements is formulated as a nonlinear optimization problem, where candidate flux distributions are iteratively generated until they fit well enough to the experimental 13 C labelling patterns [7][8][9][10][11]. Global isotopomer balancing is a versatile approach that can be applied with all network topologies, substrate labelling distributions and with all measurement techniques -also in short time scales where isotopomeric steady state is not reached [12][13][14]. However, due to the nonlinearity of the problem, it is hard to make sure that the optimization has converged to a global optimum and that this optimum is unique [15]. Also, to apply the global isotopomer balancing approach successfully, one usually needs comprehensive information on the uptake and production rates of external metabolites, as well as about biomass composition of the cell. This information can be hard to obtain, especially in large-scale experiments where dozens to hundreds of mutants or different organisms are comparatively analyzed [16,17].
A metabolic flux ratio approach (METAFoR) [4,18,19] for 13 C metabolic flux analysis has traditionally relied more on the expertise of a domain specialist than advanced computational techniques. In metabolic flux ratio analysis, the aim is to write linear equations constraining the ratios of fluxes producing the same metabolite. The equations are manually derived by domain experts, by careful (and tedious) inspection of metabolic networks. The motivation for the approach is that, in many cases, the knowledge about the flux ratios already offers enough information about the response of an organism to its environment.
The ratio of competing fluxes or pathways producing the same metabolite is easy to understand, and in many cases easier to estimate reliably than all the fluxes in the network -some interesting flux ratios might be obtainable from scarce measurement data or from the incomplete model of metabolic network that would not allow a reliable estimation of a complete flux distribution using global isotopomer balancing. Flux ratios can also be obtained without knowing the uptake and production rates of external metabolites of the biomass composition. And, if enough non-redundant flux ratios are identified, it is possible to use this information to construct and solve an equation system for the full flux distribution of the metabolic network [20][21][22].
As a downside, manually derived flux ratio equations depend heavily on the topology of a metabolic network, measurement capabilities and substrate labelling distributions. Thus, each time a new organism or new mixture of substrates are introduced, flux ratio equations have to be verified and new ones possibly derived. To date, flux ratio equations are manually derived for central carbon metabolisms of three model organisms on glucose, S. cerevisiae [17,23], B. subtilis [24] and for Escherichia coli [18,19]. Recently, flux ratio equations of S. cerevisiae were modified for Pichia pastoris grown on glycerol and on glycerol/ methanol mixtures [25,26]. Facilitating the process of deriving flux ratio equations for other organisms and other substrates clearly calls for automatic tools. Also, many times the (simplifying) assumptions made by the expert in the derivation and solution of flux ratio equations, are not reported in detail. Thus, it is often nontrivial to verify the correctness of given flux ratio equations.
In this article we present a novel automatic framework for deriving flux ratios when the measurement data and the model of metabolic network are given as input. The framework is based on the graph algorithmic flow analysis of metabolite fragments in the metabolic network [27] and on the interpretation and manipulation of both NMR and MS data with vector space techniques [21]. The goal of our work is to combine the good aspects of global isotopomer balancing and manual flux ratio analysis: like global isotopomer balancing, our framework is systematic and can be applied with all network topologies, substrates and substrate labelling distributions and with all current isotopomer measurement techniques. Thus, laborious and error-prone manual inspection of metabolic network models and the tailoring of the equation systems constraining the fluxes separately for each experimental setting required in manual flux ratio techniques can be avoided. On the other hand, during the automated construction of flux ratios we resort to linear optimization techniques only, combined with graph algorithms of polynomial worst case time complexity. Thus, our framework is computationally efficient and avoids problems with local and multiple optima frequently met in global isotopomer balancing. The trade-off of this philosophy is, however, the requirement of measuring isotopomer distributions of metabolites more rigorously to obtain full flux distribution. Given insufficient measurements, our framework can solve the flux ratios only for some, but not necessary for all metabolites in the network. We expect that, especially as measuring isotopomers of intermediate metabolites becomes more routine, our framework will be an attractive method for 13 C flux analysis.

Results and Discussion
In this section we demonstrate the applicability of the presented framework by empirical results. We show that our automatic and systematic framework is able to reproduce flux ratios previously determined by a manual analysis from NMR and GC-MS isotopomer measurements of protein bound amino acids of S. cerevisiae and B. subtilis on glucose. Thus, we can conclude that the presented framework is powerful enough to provide interesting flux ratio information in the well studied experimental settings. Furthermore, we show that the framework can be applied to study less known experimental conditions without any further effort by discovering the flux ratios that can be estimated when B. subtilis is grown on malate instead of glucose. The results of this analysis show that our framework can detect profound effects the change of external substrate can have to the flux ratio computations. The results indicate that our framework is a good tool to study flux ratios of microbes in different experimental conditionsa claim that will try to validate with more experiments in our further work.
We obtained NMR and GC-MS labelling data, where isotopomer distributions of protein bound amino acids of S. cerevisiae and B. subtilis grown on different conditions were measured. Then, available flux ratios were computed with the presented framework. Models of metabolic networks applied in the analysis can be found from the supplementary material of this article: additional files 1 and 2 contain the SBML model file and a visualization of the model of S. cerevisiae, while additional files 3 and 4 contain the same information for B. subtilis. In the models, some simplifications common to 13 C metabolic flux analysis were applied by pooling metabolites whose isotopomer pools can be assumed to be fully mixed (cf. [28]). Pooling of metabolites was carried for (i) the three pentose-phosphates in PPP, (ii) phopshotrioses between GA3P and PEP in glycolysis, and (iii) oxaloacetate and malate in the TCA cycle. In these cases, pooling is justified by the existence of fast equilibrating, bidirectional reactions between the listed intermediates and the empirical evidence that their isotopic labeling is not distinguishable with the current analytical tools. Cofactor metabolites were excluded from the model as cofactor specificities and activities are not accurately known for many reactions.
The bulk of the carbon mappings of reactions in the metabolic network were provided by ARM project [29]. Carbon mappings from amino acids to their precursors were conform to [4] and [23]. Before the analysis of the real measurement data, the correctness of the implementation of the framework was empirically verified by estimating flux ratios for junction metabolites in the metabolic network of S. cerevisiae from the artificial data generated by the 13C-FLUX software [8].

NMR measurements from S. cerevisae on glucose
In the first experiment we analyzed NMR isotopomer measurement data from protein bound amino acids of S. cerevisiae that was grown on uniformly labelled glucose (see Section Experimental NMR and GC-MS methods for more details on experimental settings).
From the 15 measured amino acids we were able to estimate flux ratios for seven junction metabolites: oxaloacetate, PEP, glycine and serine on cytosol and for oxaloacetate, acetyl-CoA and pyruvate in mitochondria. Furthermore, an upper bound for a ratio of GA3P molecules that have visited the transketolase reaction was obtained by manually simplifying the model to imitate the previously reported ways to manually compute the corresponding upper bound (cf. [4]). (The structural analysis of the metabolic network model described in Section Structural analysis of isotopomer systems can help in discovering such simplifications, but they also need some expert insight. As the simplifications are currently not done automatically, the systematical framework is unable to validate them.) The computed flux ratios were compared with the manually derived ratios [23], when the assumptions made in the manual derivation of flux ratios were consistent with the model used. In all cases, automatically computed flux ratios agreed well with the manually derived ratios (Table  1). Differences between the estimations can be explained by numerical instabilities and by differences in computational procedures: in manually derived ratios the estimations are based on the breakage of a single bond in different routes leading to a metabolite while in our framework more isotopomer information is possible utilized in the estimation.

GC-MS measurements from B. subtilis on glucose
In the second experiment we analyzed GC-MS isotopomer measurement data from protein bound amino acids of Bacillus subtilis that was grown on uniformly labelled glucose (see Section Experimental NMR and GC-MS methods for more details on experimental settings).
In comparison to eukaryotic S. cerevisiae, the metabolic network of prokaryotic B. subtilis lacks cellular compartments. Thus, from the point of view of 13 C metabolic flux analysis, there are fewer interesting junction metabolites in the central carbon metabolism of B. subtilis where the flux ratios can be estimated. From the GC-MS measurements of 14 amino acids we were able to compute flux ratios for four junction metabolites -oxaloacetate, pyruvate, PEP and glycine -when [U-13C]-glucose was used as a carbon source. Furthermore, an upper bound for a ratio of GA3P molecules that have visited transketolase reaction was obtained by manually simplifying the model of the metabolic network. Excluding pyruvate, we were able to compute the same ratios with [1-13C]-glucose as a carbon source.
We compared the computed flux ratios to ones obtained with the software FiatFlux [30] that is based on the manually derived analytic equations for computing flux ratios. Currently, manually derived flux ratio equations for [1-13C]-glucose as a carbon source exist only for PEP and for the upper bound to the flux through oxidative pentose phosphate pathway. In general, the flux ratios computed with different methods from the same data and with the same assumptions about the topology of metabolic network were in good agreement ( Table 2). (As a data cleaning procedure, we removed from [1-13C]-glucose data the mass distributions of fragments whose fractional enrichment deviated more than 5% from the assumed fractional enrichment of 20% in [U-13C]-data. This was done because differences in fractional enrichments can be tracked in uniformly labelled data where the fractional enrichment of each fragment is know a priori, but not in positionally labelled data, where the fractional enrichment of a fragment depends on the network topology and the fluxes. This kind of irregularities are in general caused by noise in fragments with low intensity or by coeluting analytes with overlapping fragment masses.) Again, differences between the flux ratios estimated by different methods can be explained by numerical instabilities and by differences in isotopomer information applied during the estimation. Variation in the estimated flux ratios between repeated experiments (six repetitions for [1-13C]-glucose Directly comparable flux ratios computed from the NMR data by the framework presented in this paper and by manual flux ratio analysis (METAFoR). experiment, four repetitions for uniformly labelled glucose experiment) was negligible.

In silico calculability analysis of B. subtilis on malate
One of the strengths of the presented framework is that it is able to automatically produce metabolic flux ratios also when other external labelled substrates than commonly used glucose are fed to organisms. To demonstrate this ability, we applied our framework to predict what flux information would be available, if we feed B. subtilis with malate.
We applied the in silico calculability analysis (see Section Calculability analysis) to examine which flux ratios are calculable in the best case from GC-MS measurements of amino acids, when B. subtilis is grown on [U-13C]labelled malate. Interestingly, our fragment flow analysis revealed that -with the reaction reversibilites in the applied model -the isotopomer distributions of GA3P, PEP and pyruvate depend only on the isotopomer distribution of the fragment containing the first three carbons of oxaloacetate, but not on the relative fluxes producing these metabolites. Thus, isotopomer balances for GA3P, PEP and pyruvate reduce to mass balances and the ratios of fluxes producing these metabolites cannot be estimated. This somewhat surprising phenomenon is due the fact that the rearrangements of carbon chains occurring in PPP pathway will affect only to the carbon fragments that will be recycled in the upper metabolism but not the carbon fragments that can flow back GA3P, PEP and pyruvate from PPP (we modelled a reaction from GA3P to F6P as unidirectional one).
Preliminary experiments with GC-MS data from B. subtilis grown on [U-13C]-labelled malate agreed with the results of fragment flow analysis: constraints to the isotopomer distributions of fluxes entering to these metabolites were identical within the limits of measurement accuracy. On the other hand, our framework was able to estimate for example the TCA-cycle activity also when B. subtilis is grown on malate, just as predicted by the calculability analysis.

Conclusion
In this article we introduce a systematic and analytic framework for 13 C metabolic flux ratio analysis. At the heart of the framework lie the techniques for flow analysis of a metabolic network and for manipulating isotopomer measurements as linear subspaces. These techniques facilitate the efficient and analytic computation of the ratios between the fluxes producing the same junction metabolite. The framework can be seen as a generalization and formalization of existing analytic methods for computing metabolic flux ratios [23,30,31] where equations constraining flux ratios are manually derived. Like the recent methods to improve the speed of the simulation of isotopomer distributions in the global isotopomer balancing framework [10,32], our framework relies on graph algorithms. However, both our goals and applied techniques are quite different from these approaches. In [10] and [32] connected components of isotopomer graphs are discovered to divide the simulation of isotopomer distributions to smaller subtasks. In our framework, flow analysis techniques are applied to discovered metabolite fragments with equivalent isotopomer distributions in every isotopomeric steady state.
Our experiments with NMR and MS data show that the framework is able to provide relevant information about metabolic fluxes, even when only constraints to the isotopomer distributions of protein-bound amino acids are measured.
Thanks to recent advancements in measurement technology improving the feasibility of mass isotopomer measurements of intermediate metabolites [13,33], we expect that the full power of the framework will be harnessed in near future. Measurements from intermediates will make it possible to use larger models of metabolic networks and estimate flux ratios more accurately, without simplifying assumptions about the topology of the metabolic network or directionality of the fluxes. However, these measurements will not be easy to conduct, because of the low abundances of intermediates in the cell. Thus, systematic methods for experimental planning and data quality control are required. The presented framework provides pow- Directly comparable flux ratios computed from the GC-MS data described by the framework presented in this paper (our fw) and by FiatFlux software [30]. UL denotes uniform labelling of external glucose, 1CL external glucose labelled to the first carbon position.
erful tools for these tasks. First, the framework facilitates time saving in silico calculability analysis.
Second, the manipulation of isotopomer measurements as linear subspaces offers a natural way for comparing measurements from different metabolites that contain overlapping information to detect inconsistencies in the measurements: it is enough to compare propagated isotopomer information in the fragments that belong to the same equivalence class. Third, as MS isotopomer measurement techniques have to be developed separately for different intermediate metabolites or metabolite classes, it will be very useful to select a small subset of intermediates that gives enough information about the interesting metabolic fluxes with least experimental effort. In future research, we want to tackle this problem by generalizing our earlier experimental planning method [34], to all measurement data and to realistic measurement error models.
As the presented framework for 13 C metabolic flux analysis only resorts to linear optimization techniques, it is not always able to provide as much information about the metabolic fluxes as the global isotopomer balancing frameworks utilizing more powerful, nonlinear optimization techniques [8,35], that do not necessarily converge to the global optimum. On the other hand, some flux ratios might be computable from the scarce data or incomplete model of metabolic network that does not allow global isotopomer balancing. The differences in the practical performance of different approaches require further research. We see these alternative approaches as complementary ones. A very nice goal would be an integration of our work with global isotopomer balancing: our analytic flux ratios could speed up and direct the optimization process of global isotopomer balancing, that would then fill in the flux ratios possibly missed by our framework.

Methods
In this section we describe a systematic and analytic computational framework for 13 C metabolic flux analysis. At the end of the section we also shortly describe the experimental method that were applied to produce the isotopomer measurement data that was analyzed in Section Results and Discussion.
The overall goal of our computational framework is to automatically infer from the available isotopomer measurement data produced by MS, MS-MS or NMR techniques, an equation system constraining the fluxes. The crucial question is to derive as many non-redundant equations as possible, ideally constraining the flux distribution to a point solution, or in general, as low-dimensional convex set as possible.
In short, the framework consists of the following steps: 1. The model of the metabolic network of an organism is constructed by selecting a set of biochemical reactions operating in the organism and by designating them to correct cellular compartments; 2. Structural analysis of the isotopomer system is conducted, consisting of the following steps: (a) Flow analysis of the metabolic network is conducted in order to discover equivalent fragments, fragments of carbon backbones of metabolites that will have the same theoretical isotopomer distribution, regardless of the fluxes.
(b) Independence analysis of fragments is conducted to find statistically independent carbon subsets from metabolites, that is, subsets that have been at some point separated along every pathway able to producing them and that have flux invariant isotopomer distributions. This guarantees that the isotopomer distribution of their union assumes the form of a product distribution.
(c) In silico calculability analysis is performed to test if the available measurement techniques and substrate labellings make it in principle possible to obtain the required flux information.
3. Wet-lab isotopomer tracer experiments are conducted and constraints to isotopomer distributions are measured; 4. The fluxes of the network are estimated. The process consists of the following steps: (a) Isotopomer measurement data is propagated in the metabolic network model from the measured metabolites to unmeasured ones according to the equivalences discovered in step 2.
(b) An equation system tying the isotopomer data and the fluxes together is constructed and solved, either to obtain a flux distribution for the metabolic network as a whole, or for a single junction metabolite to obtain the ratios of fluxes producing it.
(c) The statistical analysis of obtained fluxes or flux ratios is carried out.
In the following we first formalize the 13 C flux analysis problem and then detail the computational steps above.

Preliminaries
In 13

Reactions are pairs
is a vector of stoichiometric coefficients-denoting how many molecules of each kind are consumed and produced in a single reaction event-and λ j is a carbon mapping describing the transition of carbon atoms in ρ j (see Figure 1). Metabolites M i with α ij < 0 are called substrates and with α ij > 0 are called products of ρ j . If a metabolite is a product of at least two reactions, it is called a junction. If α ij < 0, a reaction event of ρ j consumes |α ij | molecules of M i , and if α ij > For the rest of the article, it is important to distinguish between the subpools of a metabolite pool produced by different reactions. Therefore, we denote by M ij , the subpool of the pool of M i produced (α ij > 0) or consumed (α ij < 0) by reaction ρ j . The concept of the subpools of a metabolite pool is illustrated in Figure 2.

Generalized isotopomer balances
The framework for 13 C metabolic flux analysis presented in this article rests on the assumption that the metabolic network is in metabolic and isotopomeric steady state. In where s bh is the weight of isotopomer b in the h'th constraint, d h is a value derived from isotopomer measurements, and r is the total number of constraints. We call (3) isotopomer constraints. For a k carbon metabolite, 2 k linearly independent isotopomer constraints -one for each isotopomer -are necessary and sufficient to constrain the isotopomer distribution D M to a point solution. A set of isotopomer constraints has a natural matrix representa- i.e. they are the same for each subpool M ij . We call (4) a generalized isotopomer balance.

Representing MS and NMR isotopomer measurements as linear constraints
In the following, we will show by examples how MS and NMR data can be represented as isotopomer constraints. Let us first consider mass isotopomer distributions obtained from MS. Here we omit discussion on practicalities such as corrections for natural abundances of 13 C isotopes (c.f., [6,36]) and concentrate on the conceptual level. For example, the +2 mass isotopomer of a three carbon metabolite M satisfies  as the precursor must always have two 13 C atoms in total. We refer the reader to [6,36] for further details. Also NMR 13 C isotopomer measurements, where relative intensities of different combinations of 13 C and 12 C atoms that are coupled to an observed 13 C atom are measured, can be expressed as linear combinations of isotopomer abundances. For example, for a three-carbon metabolite M, the following constraints to can be inferred for the labeling pattern 010: where d(010) is the measured intensity. Rewriting the above as and denoting s b = d(010), for b ∈ {110, 011, 111}, s 010 = d(010) -1 and s b = 0 for b ∈ {000, 100, 001, 101}, the above can be seen to conform to (3). Similar derivation can be used for other isotopomer signals present in the NMR spectrum to obtain the corresponding isotopomer constraints.

Projection of isotopomer measurements to fragments
In our computational framework, it will be necessary to project the measurement data obtained for a metabolite M to its fragments M|F and vice versa. In this projection, we want to avoid any unnecessary loss of measurement information, that is, we want to obtain as many linearly independent constraints to the isotopomer distribution of F as possible. For example, if we have measured that a two-carbon metabolite M has the isotopomer distribution In conclusion, the available information about D M is given as its linear projection onto , and anything we can express about D M|F in terms of isotopomer constraints is contained within . Thus, any isotopomer constraint for D M|F that we can derive from the measurements can be expressed in terms of the vector space intersection .
Thus, to obtain isotopomer constraints for fragment M|F from a measurement SD M = d, we need to compute the for all carbon positions otherwise. .
vector space intersection and project the measurement to . This can be done by standard linear algebra (c.f. [21]). This process gives us as output isotopomer constraints of the required form Finally, transforming a fragment constraint Y F D M|F = d F into an isotopomer constraint SD M = d is easy: we postmultiply the fragment constraint with the matrix U M|F : S = Y F U M|F and d = d F U M|F .

Structural analysis of isotopomer systems
The incomplete nature of 13 C measurement data requires us to find ways to use the available data the best way possible. The central concept is to find invariants of isotopomer distributions that remain through the pathways, and allow us to trade or propagate measurement information from one metabolite to another. This allows us to write or augment generalized isotopomer balances for metabolites for which the isotopomer distributions are not completely determined by measurements. Thus, the fluxes are potentially more tightly pinpointed as well.
In particular, we use two techniques: First, flow analysis is used to uncover sets of metabolite fragments that have the same isotopomer distribution regardless of the fluxes. Second, independence analysis of fragments is used to uncover situations where two fragments of the same metabolite induce the product distribution for the isotopomer distribution of their union.

Flow analysis of metabolic networks
The goal of the flow analysis [27] is to partition the fragments of the metabolites in the network to equivalence classes such that fragments in the same equivalence class have identical isotopomer distributions in every steady state. This can be guaranteed if a fragment is produced from another a such a manner that the carbons within the fragment never depart from each other regardless of the pathway that is being used.  More complicated case of fragment equivalence is found when a fragment of a junction metabolite is dominated by an upstream fragment (Figure 3). In [27] we show that the the classes of equivalent fragments corresponding the conditions (1-4) can be efficiently computed. Very brie fly, first the metabolic network is transformed to a fragment flow graph that connects substrate metabolite fragments to their product fragments for each reaction in the network. Then, the dominator tree [37,38] of the fragments in the fragment flow graph is constructed. It turns out that the subtrees of this dominator tree correspond to the required fragment equivalence classes (see Figure 4).
Fragment equivalence classes have many uses [27]. Most importantly, measured isotopomer constraints to fragment F can be directly propagated to another fragment F' in the same equivalence class, by applying the joint carbon mappings between F and F'. This helps in the construction of generalized balance equations (4) where the same isotopomer information is required for each subpool of junction metabolites.

Independence analysis of fragments
A complementary property to fragment equivalence 13 C is the statistical independence of fragment isotopomer distributions. Intuitively, if two fragments of the same metabolite are statistically independent, new isotopomer constraints to the union of them can be obtained by taking a product of the isotopomer distributions of the independent fragments.
More formally, the basic question is on what conditions the distribution D M|E∪F of union of two fragments will necessarily have the form of a product distribution (6) where denotes the tensor product consisting all terms of the form  The second necessary condition is that the two fragments need to be dominated by some other metabolite fragments in the network. This will make the fragment distributions flux invariant. Together, the two criteria guarantee (6)  and are dominated (by reactant fragments of ρ j ). Hence (6) holds. The underlying assumption here is that enzymes pick their reactants independently and randomly from the available pools. This case of statistical independence of fragments is depicted Figure 1, where white and grey fragments of D-fructose 1,6-biphosphate are statistically independent (in the subpool of reaction ρ j ).
Another simple example is a junction metabolite M i that has two or more producers with associated subpools M ij .
If M ij |E and M ij |F are structurally independent in all subpools, M i |E and M i |F are structurally independent as well.
If M i |E and M i |F are dominated by some fragments in the network, all subpools have the same distribution which takes the form of (6). Without dominance the distribution will in general be a flux-dependent mixture of product distributions .
This case of statistical independence is depicted in Figure  5.
A generalized form of (6), useful for propagation of isotopomer constraints, is derived as follows.
An example of fragment equivalence classes in a branched pathway Figure 3 An example of fragment equivalence classes in a branched pathway. An example of equivalence classes of fragments in the metabolic network that contains dominated junction fragments M|E and M|F. Grey and white fragments constitute two equivalence classes. Dashed lines depict carbon mappings.
Multiplying the constraints, and denoting s b = s ah,E ·s cg,F where b is the isotopomer consistent with fragment isotopomers a and c, we get the following equation for the l'th constraint for E ∪ F. The equations of the above kind can be concisely written in terms of tensors: From above, if two of the three vectors d, d E , d F are known, the remaining unknown one can be solved.
We note in passing that computing constraints to the metabolite given constraints to the fragments is straightforward application of (7).

Applying fragment independence analysis to flux ratio computation
In our framework, statistical independence of fragments has two uses. We apply it 1. to compute isotopomer constraints for the union of independent fragments, given isotopomer constraints to its independent fragments, and 2. to compute isotopomer constraints for an independent fragment given isotopomer constraints to the other fragments and the metabolite as a whole.
In both cases making use of (6) gives us a larger set of constraints than the vector space and flow analysis approach alone.
Next we describe how (7) generalizes the basic measurement propagation step of the traditional metabolic flux ratio analysis [31]. In the basic case, the flux ratios are solved for two competing pathways p and q, which p cleaves a certain carbon-carbon bond b of junction M while the q preserves b intact from the external substrate. (See Figure 6 for an example). This serves also as an example of applying (7) to compute isotopomer constraints for the union of independent fragments.
When a uniformly labelled substrate is used, the labelling degree of every carbon in the network is the same (and known a priori) when the system reaches isotopomeric steady state. Thus, the isotopomer distribution of a two-carbon fragment F (metabolite M F in Figure 6) containing bond b can be computed by (7) for pathway p cleaving b, while for pathway q, can be propagated An example of a fragment flow graph and a dominator tree  from the external substrate (metabolite M E in Figure 6) using the fragment equivalence classes of the previous section. If we are able to measure (constraints to) the isotopomer distribution of the mixed pool F, we can then automatically derive a generalized isotopomer balance corresponding the manually derived ratio. To use (7) to compute isotopomer constraints for an independent fragment from the known isotopomer constraints to the other independent fragment and to the whole metabolite is complicated by the incompleteness of the measurement data: an arbitrary measurement SD M|E∪F = d might not be directly representable via a tensor product S = S E S F .
Instead, we need to first compute isotopomer subspaces An example of statistical independence of fragments An example of using fragment independence to obtain new isotopomer constraints under uniform substrate labelling Figure 6 An example of using fragment independence to obtain new isotopomer constraints under uniform substrate labelling. Constraints to the isotopomer distributions of striped metabolites are assumed to be known, either by direct measurement of measurement propagation.
In pathway q = (ρ 2 , ρ 4 ), the isotopomer distribution of M F molecules will be the same as in M E . In pathway p = (ρ 1 , ρ 3 ), the isotopomer distribution of M F can be derived by applying fragment independence: the isotopomer distributions of single carbon metabolites produced by ρ 1 are known a priori to be equal to the labelling degree of uniformly labelled substrate. As the two carbons of M F3 are produced from two different metabolites, these carbons are statistically independent to each other in the subpool and the isotopomer distribution D( ) of M F molecules produced by   (7) can be applied.
The detailed description of this technique is rather technical and omitted from this article. Here we give an example of the technique (See Figure 7). We assume that we know the mass isotopomer distributions of metabolite M i (metabolite M 1 in Figure 7) and of fragment M i |E.
We furthermore assume that As the two matrices are not the same (7) is not directly applicable. However, by summing up the second and the third rows and the fourth and the fifth rows of S E S F we obtain S. Intuitively, this means that Equation (7) can be applied to compute , when we take into account that the isotopomer constraints corresponding both second and third rows of S E S F contribute to mass isotopomer which is equal to (8).

Calculability analysis
Isotopomer tracer experiments using less common carbon sources can be very costly because of the prices of purposefully labelled substrates. Thus, it is very useful to be able to first conduct in silico calculability analysis to find out, whether it is even in principle possible to obtain the required flux information from the tracer experiment. By analyzing the fragment equivalence classes, it is relatively easy to perform this kind of "structural identifiability analysis" (cf. [39,40] for global isotopomer balancing), that is, to discover the set of junction metabolites for which the flux ratios can be calculated (in the best case) from the given measurement data: it is enough to check what type of isotopomer constraints can be propagated to each subpool M ij of junction metabolites M i from the measured metabolites (we need to know only coefficients s bij , not the isotopomer abundances d ij ). Then, by applying the techniques of computing vector subspace intersection described above, we can compute the maximal number of linearly independent constraints obtainable for the flux ratios of each junction. Thus, it is possible to check before costly and time-consuming wet lab experiments, whether the experiments even have potential to answer the biological questions at hand. The results of the calculability analysis tell which flux ratios are in principle determinable, given the labelling of external substrates, topology of the metabolic network and the available measurement data. It then depends on the actual flux distribution and the accuracy of the measurements, whether these ratios can be reliably determined from the experimental data.

Estimating the flux distribution of the metabolic network
In the main step of our framework for 13 C metabolic flux analysis, the fluxes of the metabolic network are estimated by forming and solving generalized isotopomer balance equations (4). The generalized isotopomer balance equations are based on the isotopomer measurement data that is first propagated in the network to unmeasured metabolites by utilizing the results of the structural analysis presented above.

Measurement propagation
The aim of the propagation of measurement data is to infer from the isotopomer constraints of measured metabolites as many isotopomer constraints as possible to unmeasured metabolites. As a rule of thumb, more constraints the unmeasured metabolites will get more gener-alized isotopomer balance equations (4) bounding the fluxes can be written.
Fragment equivalence classes can be utilized in the measurement propagation: from isotopomer constraints known for fragment M i |F isotopomer constraints for other fragments M l |F k in the equivalence class of F can be easily computed. The process is the following: 1. Before measurements are propagated from fragment M|F of measured metabolite M to other fragments in the equivalence class of F, isotopomer constraints to F are computed from the constraints measured to the whole metabolite M by using the vector space projection techniques (see Section Projection of isotopomer measurements to fragments).
2. The fragment constraints are propagated to all fragments F' that have been found equivalent to F via the flow analysis technique.
This requires mapping of isotopomers of F to isotopomers of F' by applying the carbon mappings of the reactions along any pathway between F and F'.
3. After the propagation of measurement data inside the fragment equivalence classes, new isotopomer constraints for independent fragments of the same metabolite can be derived, as described in Section Independence analysis of fragments.
Steps 2 and 3 can be iterated until no new isotopomer constraints to the fragments are discovered.

Construction of generalized isotopomer balances
After the propagation step, we have some isotopomer constraints for each subpool j of every junction metabolite M i . (For non-junction metabolites, isotopomer balance equations do not contain any additional flux information compared to the mass balances.) In the best case we know complete isotopomer distribution , in the worst case we have only trivial constraints stating that the sum of relative abundances of all isotopomers equals one.
Next, a linear equation system containing flux constraints obtained from mass balances (1) and generalized isotopomer balances (4) is constructed.
However, the isotopomer constraints of different subpools do not yet conform to (4) as the matrices S ij are not necessarily the same.
Thus we still need to compute a common subspace ( is spanned by the rows of S ij ) of the isotopomer constraints known for each subpool M ij and project subpool constraints to .
This can be done with the same techniques that were previously applied to project measured isotopomer information of a metabolite to its fragments. Let Y i be a matrix with row space . After the projection we obtain isotopomer constraints for each subpool M ij (See Figure 8 for an example).
Now the isotopomer constraints of all the subpools lie in the same subspace of and we are ready to write the system of generalized isotopomer balance equations (4) for every junction M i : that is, Estimating the fluxes The ratios of (forward) fluxes producing M i can be computed by solving the corresponding Equation (11) augmented with a constraint that fixes the out flow from M i to equal 1. Thus, we obtain flux ratios of junction metabolites without manual derivation of ratio equations, without nonlinear optimization and without knowing intake and outtake rates of external metabolites or biomass composition.
In addition, when the equations (11) of all junction metabolites are combined with the mass balances (1) of non-junctions, we obtain a linear equation system constraining the fluxes v of the network that contains a block (junctions) or a row (non-junctions) A k for each metabolite M k . Measured external fluxes and other known constraints, such as the composition of biomass can also An example of the computation of the common subspace of isotopomer constraints in different subpools Figure 8 An example of the computation of the common subspace of isotopomer constraints in different subpools. The mass isotopomer distribution of junction metabolite M 1 is assumed to be measured. For the in flow subpools M 11 and M 12 we obtain isotopomer constraints from the above reactant metabolites by measurement propagation. These propagated constraints must be projected to mass isotopomer to the subspace defined by the mass isotopomer distribution of M 1 before generalized isotopomer balances are constructed. be added to (12) as additional constraints. Additional constraints, like ones derived from gene regulatory information [41] or from thermodynamic analysis of metabolism [42][43][44] can also easily be included to (12).
If (12) is of full rank, the whole flux distribution can be solved with standard linear algebra [45]. Also, more complex, nonlinear methods can be applied to model the effect of experimental errors to the estimated flux distribution [20]. In a common case where the system is of less than full rank, a single flux distribution can not be pinpointed without additional constraints. Instead, (12) defines the space of feasible flux distributions, that are all equally good solutions. In that case we can apply techniques developed for the analysis of stoichiometric matrices to determine as many fluxes as possible [46] from (12). More generally, by linear programming we can obtain maximum (resp. minimum) values for each flux v i : where and are predetermined minimum and maximum allowable values for v i Furthermore, it is possible to search for in some sense optimal flux distribution -for example a flux distribution maximizing the production of biomass -from the feasible space defined by (12) by linear programming techniques of flux balance analysis [1,3,47,48]. In that case, isotopomer data constrain the feasible space more than the stoichiometric information would alone do, thus possibly allowing more accurate estimations of the real flux distribution.

Statistical analysis
For an experimentalist, it is important to know how sensitive the obtained estimation of fluxes is to measurement errors. If enough repeated measurements are not available to assess this sensitivity, it has to be estimated by computational techniques. In the global isotopomer balancing framework for 13 C metabolic flux analysis, many mathematically or computationally involved methods have been developed to analyze the sensitivity of estimated flux distributions to errors in isotopomer measurements and the sensitivity of the objective function to the changes in the generated candidate flux distributions [49][50][51][52][53].
As our direct method for 13 C metabolic flux analysis is computationally efficient, we can afford to a simple, yet powerful Monte Carlo procedure to obtain estimates on the variability of individual fluxes due to measurement errors: 1. For each measured metabolite M i : By studying the variability in the repeated measurements, fix the distribution Ω i from which the measurements of M i are sampled.
2. Repeat k times: (a) For each measured metabolite M i : sample a measurement from Ω i .
(b) Estimate fluxes v l from the sampled measurements.
3. Compute appropriate statistics from the set V = {v 1 , ..., v k } to describe the sensitivity of fluxes to measurement errors.
Possible statistics that can be applied in the last step of the above algorithm include standard deviation, empirical confidence intervals [53], kurtosis, standard error etc. of each individual flux v j and measures of "compactness" of V, such as (normalized) average distance of items of V from the sample average.

Experimental NMR and GC-MS methods
In this section we shortly describe the experimental procedures applied in NMR and GC-MS isotopomer measurements that produced the data for Section.
In the first experiment S. cerevisiae was grown in an aerobic glucose-limited chemostat culture at dilution rate 0.1 h -1 . After reaching a metabolic steady state, as determined by constant physiological parameters 10% of the carbon source in the medium was replaced with fully carbon labelled glucose ([U-13C]) for approximately 1.5 residence times, so that about 78% of the biomass was uniformly labelled. 2D [ 13 C, 1 H] COSY spectra of harvested and hydrolysed biomass were acquired for both aliphatic and aromatic resonances at 40°C on a Varian Inova 600 MHz NMR spectrometer. The software FCAL v.2.3.0 [19] was used to compute isotopomer constraints for 15 amino acids from the spectra. Detailed description of the cultivation set up can be found in [54] whereas similar 13 C labeling set up, NMR experiments and spectral data analysis as were applied here have been described in [55].
In the second experiment B. subtilis was grown on shake flasks containing 50 ml M9 minimal medium. In the experiment, the medium was supplemented with 50 mg/ L tryptophan and 3 g/L glucose labelled to the first carbon position ( teen derivatized amino acids were analyzed for 13 C labeling patterns with a series 8000GC combined with an MD800 mass spectrometer (Fisons instruments). More information about the details of the measurement procedure can be found from [20] where identical measurement techniques were applied.