Skip to main content
  • Research article
  • Open access
  • Published:

Quantitative inference of dynamic regulatory pathways via microarray data

Abstract

Background

The cellular signaling pathway (network) is one of the main topics of organismic investigations. The intracellular interactions between genes in a signaling pathway are considered as the foundation of functional genomics. Thus, what genes and how much they influence each other through transcriptional binding or physical interactions are essential problems. Under the synchronous measures of gene expression via a microarray chip, an amount of dynamic information is embedded and remains to be discovered. Using a systematically dynamic modeling approach, we explore the causal relationship among genes in cellular signaling pathways from the system biology approach.

Results

In this study, a second-order dynamic model is developed to describe the regulatory mechanism of a target gene from the upstream causality point of view. From the expression profile and dynamic model of a target gene, we can estimate its upstream regulatory function. According to this upstream regulatory function, we would deduce the upstream regulatory genes with their regulatory abilities and activation delays, and then link up a regulatory pathway. Iteratively, these regulatory genes are considered as target genes to trace back their upstream regulatory genes. Then we could construct the regulatory pathway (or network) to the genome wide. In short, we can infer the genetic regulatory pathways from gene-expression profiles quantitatively, which can confirm some doubted paths or seek some unknown paths in a regulatory pathway (network). Finally, the proposed approach is validated by randomly reshuffling the time order of microarray data.

Conclusion

We focus our algorithm on the inference of regulatory abilities of the identified causal genes, and how much delay before they regulate the downstream genes. With this information, a regulatory pathway would be built up using microarray data. In the present study, two signaling pathways, i.e. circadian regulatory pathway in Arabidopsis thaliana and metabolic shift pathway from fermentation to respiration in yeast Saccharomyces cerevisiae, are reconstructed using microarray data to evaluate the performance of our proposed method. In the circadian regulatory pathway, we identified mainly the interactions between the biological clock and the photoperiodic genes consistent with the known regulatory mechanisms. We also discovered the now less-known regulations between crytochrome and phytochrome. In the metabolic shift pathway, the casual relationship of enzymatic genes could be detected properly.

Background

Biological phenomena at different organismic levels have revealed some sophisticated systematic architectures of cellular and physiological activities implicitly. These architectures were built upon the biochemical processes before the emergence of proteome and transcriptome [1–3]. Under the molecular machinery, the biochemical processes are mostly interpreted as frameworks of connectivity between biochemical compounds and proteins, which are synthesized from genes to function as transcription factors binding to regulatory sites of other genes, as enzymes catalyzing metabolic reactions, or as components of signal transduction pathways [4–6]. This implies that, in order to understand the molecular mechanism of genes in the control of intracellular or intercellular processes, the scope should be broadened from DNA sequences coding for proteins to the systems of genetic regulatory pathways determining which genes are expressed, when and where in the organism and to which extent [7]. In the experience of engineering field, the systematic architecture and dynamic model could investigate the characteristics of signaling regulatory pathways [8]. Therefore, how to construct the dynamic model of a signaling pathway from the system structure point of view might be the first key to the door of system biology. Most biological phenomena directly or indirectly influenced by genes such as metabolism, stress response, and cell cycle are well studied on the molecular basis. Thus, identification of a signal transduction pathway could be traced back to the genetic regulatory level. The rapid advances of genome sequencing and DNA microarray technology make possible the quantitative analysis of signaling pathway besides the qualitative analysis. More particularly, the embedded time-course feature of microarray data would promote the system analysis of signal regulatory pathways as well, which is very mature in the field of engineering.

In addition to northern blots and reverse transcription-polymerase chain reaction (RT-PCR), which study a small number of genes in a single assay, the transcriptome analysis has, via DNA microarray technology [9], managed to achieve high-throughput monitoring of the almost genome-wide mRNA expression levels in living cells or tissues. Two types of available microarrays, the spotted cDNA and in situ synthesized oligonucletide [10] chips, which permit the spatiotemporal expression levels of genes to be rapidly measured in a massively parallel way, are used in different experimental requirements and stocked in the databases on net, such as Stanford Microarray Database (SMD) [11], Gene Expression Omnibus(GEO) [12] in NCBI, and ArrayExpress [13] in EBI. Microarray experiments are now routinely used to collect large-scale time series data that facilitate quantitative genetic regulatory analysis while qualitative discussion is the traditional thinking [14–17].

Several analytic methods have been proposed to infer genetic interrelations from gene expression data. In the coarse-scale approach of clustering, the underlying conjecture is that co-expression is indicative of the co-regulation, thus clustering may identify genes that have similar functions or are involved in the related biological processes. The most widely used method is the unsupervised hierarchical clustering [18]. This approach has an increasing number of nested classes by similarity measurement and resembles a phylogenetic classification. If we know the number of clusters in advance, the k-means clustering [19] could assign gene elements into a fixed number k of clusters in a way to minimize the overall Pearson or Euclidean distances of each member internally in the same cluster. Other algorithms such as the neural-network-based self-organizing maps (SOM) [20], singular value decomposition (SVD) or principal component analysis (PCA) [21], and fuzzy clustering methods [19] also have their own advantages and limitations. Alternative supervised clustering algorithm of support vector machine [22], which uses prior biological information of cluster for training, would enhance the accuracy of clustering. However, the nature of clustering algorithms apparently cannot uncover the causal interactions between genes just by grouping. Regarding the causality of pathways, the clustering analysis needs to cooperate with sequence motif detection [23]. It is also important to note that models using clustering analysis are static and thus can not describe the dynamic evolution of gene expression, even in the type of time-course microarray data.

A statistical model of Bayesian network [24] was proposed to model genetic regulatory networks. Basically, the technique uses a probabilistic score to evaluate the networks with respect to the expression data and searches for the network with the optimal score. The dynamic Bayesian network [25] was proposed to learn the network structure and parameters by maximizing the posterior probability via Bayes rule of prior probability and marginal likelihood. Another algorithm of Boolean networks [26] can also be employed to model the dynamic evolution of gene expression, where the state of a gene can be simplified to being either active (on, 1) or inactive (off, 0). The probabilistic nature of Bayesian networks is capable of handling noise inherent in both the biological processes and the microarray experiments. This makes Bayesian networks superior to Boolean networks, which are deterministic in nature. The validity of dynamic Bayesian networks is evaluated by the sensitivity-specificity score ratios [25], which depend on the training size, the degree of accuracy of prior assumption. A genetic regulatory network based on the first order differential equation with given decay rates was discussed in [27].

In this study, the dynamic system approach could be employed to model how a target gene's expression profile is regulated by its upstream regulatory genes from the system causality point of view. Then, with the causal dynamic model, the upstream regulatory function can be extracted from the expression profile of the target gene by the optimal estimation method, i.e. maximum likelihood estimation. Since merely the second-order differential equation is employed to model the dynamic evolution of the target gene, only a few parameters need to be estimated. Furthermore, the derived regulatory function is closely related to the causal upstream information of the pathway and will create a basis for inferring the regulatory pathway from the system biology point of view.

In either eukaryote or prokaryote, signaling regulatory pathways are considered as responses to the physiological activities or the deviation from homeostasis, which would affect the normal states of an organism. Among these signaling regulatory pathways, cell cycle [17] is one of the most conspicuous features of life which plays an important role in growth and cellular differentiation in all organisms. In plants, the stress-induced pathways [28] are very important to survivability under the abiotic environmental treatment such as drought, salinity and cold[29]. If these critical pathways can be identified from quantitative analysis in silico, the defect of biological processes would be predicted and corrected before hand. Our aim is to construct signaling regulatory pathways quantitatively by the system inference approach with a dynamic model and microarray data.

In this study, a second-order differential equation, which has been widely used to model many physical dynamic systems with good characteristics, is proposed to model the time-profile evolutional behavior of a target gene. The regulatory function is taken as the driving input of the dynamic equation of the target gene. Using the dynamic equation and microarray data, we first extract the regulatory function for each target gene. According to the extracted regulatory function, we deduce their upstream regulators to trace back upstream signaling pathways. Then, upstream regulatory genes are taken as target genes to trace back their upstream regulatory genes. Iteratively, we can construct the whole regulatory pathway to the genome wide using the dynamic regulatory model and microarray data from the system biology point of view. Finally, we give some independent validation of our approach by repeating the analysis with randomly reshuffling the time order of microarray data and see if the proposed pathways are destroyed.

We have applied our dynamic system approach to two genetic regulatory pathways with microarray data sets publicly available on net [15, 30]. One is the circadian regulatory pathway in Arabidopsis thaliana [31, 32], and the other is the metabolic shift pathway from fermentation to respiration in yeast Saccharomyces cerevisiae [33]. The circadian system is an essential signaling pathway that allows organisms to adjust cellular and physiological processes in anticipation of periodic changes of light in the environment [34–38]. According to the synchronously dynamic evolution of microarray data, we have successively identified the core signaling transduction from light receptors to the endogenous biological clock [39, 40], which is coupled to control the correlatively physiological activity with paces on a daily basis. On the other hand, the diauxic shift [41] from the exhausted fermentable sugar of anaerobic metabolism to aerobic growth is correlated with widespread changes in the expression of genes involved in fundamental cellular processes such as carbon metabolism, protein synthesis, and carbohydrate storage. [28, 31, 32, 42–47] The architecture of the signaling pathway correlative to glycolysis or gluconeogenesis during the diauxic shift is properly built up. With the dynamic system approach, not only the regulatory abilities between causal genes could be derived, but also the delays of regulatory activity are specified. These quantitative characteristics will help determine the intrinsic frameworks of connectivity in the above interesting pathways from the system biology point of view.

Results

The proposed methods in this study would be divided into four steps. In the first step, a dynamic model using the second-order differential equation is developed to describe the expression profile data as output and the regulatory function as input to denote the implicit characteristics of each gene with some parameters. With the help of the second-order dynamic model, we would then extract the upstream regulatory function from the expression profile of the target gene using the optimal estimation method. In the third step, the regulatory function estimated will help seek the correlative regulatory signals from the upstream paths. Iteratively, we can reconstruct the whole signaling regulatory pathway by linking up the upstream regulatory paths. Finally, some biological filters using available biological knowledge are employed to prune the constructed signaling regulatory pathway to improve the accuracy of the proposed method.

I. Dynamic system description of signaling regulatory model

The second-order differential equation is well used in the description of dynamic system evolved from the causality of gene regulatory function. Let X i (t) denote the expression profile of the i-th gene at time point t. The following second-order differential equation is proposed to model the expression level of the i-th gene,

where G i (t) is the upstream regulatory function to influence the expression profile X i (t) of the i-th gene while a i , and b i are the parameters that characterize the dynamic inherent property of the gene like degradation and oscillation, and ε i (t) is the noise of current microarray data or the residue of the model. In general, the second-order differential equation has been widely used to model dynamic systems to characterize efficiently the dynamic properties of damping and resonance of systems in physics and engineering.

Obviously, the clue of upstream regulatory pathways is in G i (t). Thus, the first step is to detect the upstream regulatory function G i (t) from both dynamic equation in (1.1) and microarray data. However, to detect the input regulatory function G i (t) from both equation (1.1) and microarray data directly is not easy. In this situation, a Fourier decomposition technique is employed to decompose G i (t) as a synthesis of some harmonic sinusoid functions so that the signal detection problem of G i (t) is reduced to a simple parameter estimation problem.

Accordingly, we can decompose G i (t) by the following Fourier series,

Then the detection of G i (t) becomes how to estimate the Fourier coefficients of α n and β n , which are the magnitudes of different harmonics of cos(nωt) and sin(nωt), for n = 0,..., N in equation (1.2), respectively. In science and engineering, the Fourier series has been widely employed to synthesize any continuous functions with finite energy. The estimation of α n , β n and the detection of G i (t) in equation (1.2) are given in Methods in the sequel.

As a result of parameter estimation in Methods, the detection

of regulatory function G i (t) could be derived as follows,

Since the input regulatory function G i (t) of a target gene is usually due to the transcriptional binding or some physical interactions from the upstream regulatory genes, in the following, we would trace back to the corresponding regulatory genes from input regulatory function of the target gene.

II. Inference of the regulatory pathway via

Apparently the input regulatory function G i (t) in equation (1.1) contains the driving information for the target gene's expression from the upstream regulatory genes. The identified regulatory function from equation (1.3) could be interpreted as the regulatory connectivity through transcriptional binding or protein-protein interaction imposed on the i-th target gene. Nevertheless, the expression data of protein type which should be considered directly in practice are by now unavailable and unreliable to trace back upstream regulatory genes. Instead, the expression data on mRNA level which is now widely available from microarray assays would make tracing back the upstream regulatory pathway possible under proper assumptions. All along the paper we assume that the expression levels of mRNA transcripts are proportional to the actual number of corresponding proteins in the cell. This assumption is indeed a strong approximation since post-transcription is known to play a very important role in down regulating the number of the transcription factor in the cell.

Before the inference of upstream regulatory genes, it is reasonable to confine the effect of the regulatory genes on the regulated target gene. The saturated activity of expression level reveals that the regulatory ability cannot extend unlimitedly. The sigmoid function is often chosen to express the nonlinear saturation with proper parameters. Here, we apply the sigmoid transformation to represent the 'on' and 'off activities of the regulatory genes on binding or not to motifs of the target gene. So the regulatory signal

shown below with the parameter set of θ j = {γ, M j , τ j } is the sigmoid transformation of X j (t), the expression profile of the j-th regulatory gene.

where γ is the transition rate, M j is the mean expression of the j-th regulatory gene's profile, and τ j is the corresponding signal transduction delay.

The delay activity should be considered in order to describe the signal transduction delay Ï„ j from the j-th regulatory gene to the target gene. The delay Ï„ j would be computed by statistical correlation between the regulatory signal transformed from the j-th regulatory gene and the identified regulatory function of the target gene. The delay Ï„ j is determined by the following maximum correlation criterion,

where r Ï„ is the correlation between and under variable delay Ï„. If there are many Ï„ to achieve the maximum correlation in (2.2), then only the smallest one is chosen.

Using the correlation method, we trace back R i regulatory genes whose regulatory signals are most correlated with the regulatory function of the i th target gene, i.e. choose R i genes with maximum correlation but with smaller Ï„ j in (2.2). The determination of number R i will be discussed later. Then, we construct the regulatory pathway by tracing back R i regulatory genes from the identified regulatory function of the target gene as the following kinetic relationship,

where c ij is the pathway kinetic parameters from the regulatory gene j to the target gene i, R i are the searched upstream regulatory genes, the constant ci 0represents the basal level to denote the regulatory function other than upstream regulatory genes, and e i (t) is the residue of the model.

Furthermore, to estimate the pathway kinetic parameters c ij , equation (2.3) for m time points should be written in the following regression form,

where

, .

We assume that each element in the error vector, e i (t k ), k = {1,..., m}, is an independent random variable with a normal distribution with zero mean and variance σ2. By maximum likelihood parameters estimation method (see Methods), the estimates of σ2 and Ω i are given as follows, which is solved as

and

It should be noted that with the combination of biological knowledge about the transcriptional factors, protein phosphorylation, post-transcriptional and specific enzyme regulation of target genes, lots of putative and verified genes correlated with the target genes are pruned by this biological filter for the more efficient and accurate searching of R i upstream regulatory genes in equation (2.3). For example, suppose the expression profile of gene j has a high correlation with regulation function G i (t) of target gene i. However, if gene j is not a transcription factor, protein phosphorylation, post-transcription or specific enzyme of target gene i, it will be deleted from the candidates of R i upstream regulatory genes because it may be only a co-expressed gene with the target gene corregulated by the other gene. On the contrary, a verified regulatory gene should be recruited into the candidates even with small correlation with G i (t).

Finally, we take the well-known Akaike Information Criterion (AIC) into account for determining the number R i of regulatory signal [42],

The first term in AIC is the residual variance and the second term R i is the number of regulatory genes. AIC includes both the estimated residual variance and model complexity in one statistic, which decreases as σ2 decreases and increases as R i increases. AIC has been widely employed to determine the complexity of system modeling science and engineering [42]. The optimal number R i of the upstream regulatory genes will be determined by the minimization of the AIC value in equation(2.7).

Now, for the selected target genes in the interesting pathway, we could search for the optimal R i upstream regulatory genes by AIC in equation (2.7) after the biological filtering and determine their pathway kinetic parameters c ij of regulatory signal by equation (2.6). After biological filter pruning, if the number of candidates of regulatory genes is still less than R i determined by AIC, then some genes, which are highly correlative to G i (t) but not of transcription factors or signaling proteins of target gene i, should be recruit into candidates to uncover regulatory relationships that were not suspected to be connected. After the combination of equations (2.3) and (1.1), the whole regulatory pathway is obtained as

for i = {l, 2,..., L}, and L is the number of target genes in the pathway. The sub-paths related to the i-th target gene in the interesting pathway could be detected by the inference algorithm. Then, it is natural that the whole regulatory pathway would be constructed by the links of all the sub-paths. We also outline the whole flowchart of our dynamic inferring algorithm as shown in Figure 1 for an overview.

Figure 1
figure 1

Illustration of the overall flowchart of the pathway inference algorithm.

Discussion

Data set of analysis

The two famous modeling organisms, Arabidopsis thaliana and yeast Saccharomyces cerevisiae, have been well studied biologically and their microarray assays are abundant. Thus, we chose different types of pathways, one is the plant behavior under environmental variation and the other is the cellular metabolism in response to exhaustion of external source, as examples in this study. In other words, two signaling pathways, i.e. circadian regulatory pathway in Arabidopsis thaliana and metabolic shift pathway from fermentation to respiration in yeast Saccharomyces cerevisiae, are constructed from microarray data to confirm the accuracy of our proposed method.

For cells grown in the light/dark cycle according to circadian rhythm, Harmer and colleagues [15] used highly reproducible oligonucleotide-based arrays representing about 8200 different genes to determine steady-state mRNA levels in Arabidopsis thaliana that are measured in replicate hybridization of 12 samples harvested every 4 hours over 2 days. With their investigation on the circadian regulatory system, Harmer et al. have provided an abundance of correlated genes for the regulatory pathway inference.

As for the metabolic pathway, an cDNA microarray assay from DeRisi et al. [30], containing approximately 6400 distinct expression sequence tags (ESTs) in yeast Saccharomyces cerevisiae, is harvested at seven successive 2-hour intervals after an initial nine hours of growth under the diauxic shift. Adoption of the diauxic shift data set would make possible the inference of metabolic shift pathways.

Process of raw microarray data

With the second-order equation and the optimal estimation method, the dynamic model should be developed first for the regulatory scheme of target genes in the signaling regulatory pathway. Because the raw microarray data sample of the biological assays that will be analyzed is small with less than 15 data points for an individual gene, the cubic spline method is used to interpolate the observed data to increase the data points of each gene's time-course microarray data. As shown in Figure 2, the expression profiles of Cry1 (CRYTOCHROME 1) and PhyA (PHYTOCHROME A) genes in the circadian regulatory pathway of Arabidopsis thaliana are interpolated by the cubic spline method among raw data points on the left-hand side. Similarly, Pgi1 (PHOSPHOGLUCOSE ISOMERASE 1) and Pgm2 (PHOSPHOGLUCOMUTASE 2) genes in the metabolic shift pathway of yeast Saccharomyces cerevisiae are on the right-hand side. After the expression profiles are smoothed by the cubic spline technique, we can obtain the data of the first derivative and the second derivative more accurately and abundantly.

Figure 2
figure 2

The expression profiles derived from the cubic spline interpolations. The expression profiles after performing the cubic spline interpolations for Cry1, PhyA (pathway of circadian regulatory system of Arabidopsis thaliana) on the left-hand side, and PGI1, PGM2 (metabolic shift pathway of yeast Saccharomyces cerevisiae) on the right-hand side. The red open triangles are the raw microarray data, and the blue dotted points are the interpolation data.

Extraction of regulatory information

After data expansion by the cubic spline method, we would have enough data to estimate the parameters of the regulatory dynamic model of the target gene from equation (2.4). Following the dynamic model in equation (3.1), the parameters which characterize the dynamic regulatory mechanism are estimated successfully for each target gene in the pathway. By dynamic model fitting, the expression profiles of the mentioned genes in Figure 2 can be reconstructed in Figure 3 with time progression again. Hence, we not only could predict the dynamic evolution of the target gene's expression profile accurately, but also deduce the regulatory function simultaneously as the scheme of Figure 4. The regulatory information between target genes and their upstream genes can be extracted properly with this method.

Figure 3
figure 3

The second-order dynamic model fitting of pathway genes. The expression profile with corresponding second-order dynamic model fitting for Cry1, PhyA (pathway of circadian regulatory system of Arabidopsis thaliana) on the left-hand side, and PGI1, PGM2 (metabolic shift pathway of yeast Saccharomyces cerevisiae) on the right-hand side. The blue dotted points are the cubic spline interpolations of microarray data, and the red dashed lines are the estimated dynamic evolution of expression data.

Figure 4
figure 4

The extracted upstream regulatory functions of pathway genes. The upstream regulatory function extracted from expression profiles of corresponding target genes Cry1, PhyA (pathway of circadian regulatory system of Arabidopsis thaliana) on the left-hand side, and PGI1, PGM2 (metabolic shift pathway of yeast Saccharomyces cerevisiae) on the right-hand side.

Inference of the regulatory pathway

For illustrations, the inferring strategy is applied to the selected core genes (X1~X13 and Y1~Y11) in two pathways of the circadian regulatory system in Arabidopsis thaliana and the metabolic shift pathway in yeast Saccharomyces cerevisiae to recognize their upstream regulatory genes, respectively. Their regulatory abilities with signal transduction delays are shown in the form of dynamic equation in Table 1 and Table 2, respectively. These regulatory abilities implying different degrees of influence are converted into a red-colored line as positive regulation (activation) and a blue-colored line as negative regulation (inhibition) for each target gene. Then, according to the dynamic regulatory equations in Table 1 and Table 2, the pathways of the circadian regulatory system and the metabolic shift pathway are described in Figure 5 and Figure 6, respectively.

Table 1 The dynamic equation set of the identified upstream regulators and their regulatory relationships to the specific target genes in the pathway of circadian regulatory system of Arabidopsis thaliana.
Table 2 The dynamic equation set of the identified upstream regulators and their regulatory relationships to the specific target genes in the metabolic shift pathway of yeast Saccharomyces cerevisiae.
Figure 5
figure 5

The pathway of circadian regulatory system of Arabidopsis thaliana according to the dynamic regulatory modeling in Table 1. The related genes are represented as ovals with different colors of light yellow (crytochrome), light blue (phytochrome), orange (biological clock genes), light green (some physiologically light-dependent downstream genes), and gray (other relevant genes not as target genes). There are three types of lines with colors of red (activation), blue (repression), and black. In addition to the black line representing the signal pipes from genes, the red lines are shown depending on the degree of activation whilst the blue lines are shown depending on the degree of repression. The black square symbol attached on the lines is the bifurcate node from the same pipe of signal. The circle symbol attached on the lines is the collecting nodes from different signal sources. The colored degree bar between activation and repression is also shown in the bottom of the figure.

Figure 6
figure 6

The metabolic shift pathway of yeast Saccharomyces cerevisiae according to the dynamic regulatory modeling in Table 2. The related genes are represented as ovals with different colors of light blue (fermentation), light yellow (gluconeogenesis), and gray (other relevant genes not as target genes). There are three types of lines with colors of red (activation), blue (repression), and black. In addition to the black line representing the signal pipes from genes, the red lines are shown depending on the degree of activation whilst the blue lines are shown depending on the degree of repression. The black square symbol attached on the lines is the bifurcate node from the same signal source. And the circle symbol attached on the lines is the collecting nodes from different signal sources. The colored degree bar between activation and repression is also shown in the bottom of the figure.

a. Pathway of circadian regulatory system

The circadian rhythm controls processes ranging from cyanobacteria cell division to human wake-sleep cycles. In plant, especially for Arabidopsis thaliana, the growth and development have adapted to the diurnal cycling of light and dark [28, 31, 32, 42, 44, 46–49]. The ability of plants to respond to light is achieved through some photoreceptors. Two classes of photoreceptors are well known to form the photo-transduction pathway under the circadian regulatory system in Arabidopsis thaliana [50]. One is the crytochrome of blue-light photoreceptors, containing Cry1 and Cry2. The other is the phytochrome of mainly red-light photoreceptors, including PhyA, PhyB, PhyD and PhyE.

In the photo-transduction related genes (Table 1 and Figure 5), containing both crytochrome (Cry1 and Cry2) and phytochrome (PhyA, PhyB, PhyD and PhyE), Cry1 [X6] and Cry2 [X10] are commonly regulated by Lhy [X3] (LATE ELONGATED HYPOCOTYL) in reciprocal ways with significant values (0.7569 in Eq.(6) and -1.8773, Eq.(10) of Table 1, respectively), implying the essentially regulatory role of Lhy on crytochrome genes. In addition, from Eq.(10) in Table 1, we further observe that Ccal [X4] (CIRCADIAN CLOCK ASSOCIATED 1) has the greatest positive regulation (2.3465) on Cry2, meaning that Cry2 is jointly regulated by Lhy and Ccal. Because the binding sites of Lhy and Ccal found in the promoter regions of Cry2 [51] are consistent with our inference, the transcriptional binding might be the mechanism of Cry2 affected by both Lhy and Ccal. In addition, the mutual activations of phosphorylation between Cry1 [X6] and PhyA [X7] in Eq.(6) and Eq.(7) of Table 1 are specifically identified consistent with the previous work [52]. At present, little is known about the nature of interactions between these two classes of photoreceptors. From Eq.(10) in Table 1, Cry2 [X10] is also positively regulated by PhyA [X7] with 0.5-hr activation delay similar to that in Cry1 (Eq.(6) in Table 1). Therefore, PhyA is considered as a post-transcriptional regulator of phosphorylation to crytochrome within 1.0-hr after transcription. On the other hand, PhyB [X11] down-regulates Cry2 with a significant effect (-0.7141) while Cry2 [X10] up-regulates PhyB (0.0511) weakly by feedback (see Eqs.(10), (11) in Table 1.). The mutual interactions between Cry2 and PhyB in nuclear speckles that are formed in a light-dependent fashion are also confirmed by Mas et al. [48]. Because Cry1 and Cry2 are both negatively co-regulated by PhyD [X8] and PhyE [X12] significantly (see Eqs.(6), (10) in Table 1), PhyA has apparently different behavior from PhyB, PhyD, and PhyE in activating crytochrome. This might suggest the mechanism that PhyA mediates the blue light by up-regulating Cry1 and Cry2, whilst PhyB, PhyD, and PhyE would mediate the red light by inhibiting blue photoreceptors [53, 54].

In the mainly red-light photoreceptors of phytochrome (PhyA, PhyB, PhyD and PhyE) in Figure 5, undoubtedly Lhy [X3] and Ccal [X4], well-known biological clock genes in the circadian system [40, 46], are core regulators involved in the transcriptions of both phytochrome (see Eqs.(7), (8), (11), and (12) in Table 1) and crytochrome (see Eqs.(6), (10) in Table 1) via feedback transcriptional binding. Similarly, Gi [X15] (GIGANTEA) in Figure 5 has been identified as a manifested regulator to all the phytochromes (also see Eqs.(7), (8), (11), and (12) in Table 1), although Gi sequence lacks any motifs suggesting that it is a transcription factor of phytochromes [55]. Hence, Gi might be a post-transcriptional regulatory factor. However, there is another gene Elf3 [X16] (EARLY FLOWERING 3) opposite to Gi on phytochrome, especially for PhyA, PhyB and PhyE (Eqs.(7), (11) and (12) in Table 1). Because of lower regulatory ability than transcription factor Lhy or Ccal, Elf3 might play the same role as Lhy and Ccal. Just as expected, Elf3 contains glutamine-rich motif suggesting that it is a transcription factor [56].

Before entrance of the biological oscillator of the circadian system formed by Toc1, Lhy, and Ccal, a crucial gene of Pif3 [X9] (Figure 5) is mediated significantly by PhyA [X7] (-0.7631) and PhyB [X11] (0.1223) (see Eq.(9) in Table 1). This is consistent with the post-transcriptional interactions of Pif3-PhyA and Pif3-PhyB complexes. As a core gene in the biological oscillator, Toc1 [X13] is transcriptionally regulated by Lhy [X3] (0.7009) and Ccal [X4] (-1.4704) whilst Pif3 [X16] (-0.1698) is presumably considered as the bridge between Toc1 and phytochrome (Eq.(13), Table 1). From Lhy and Ccal point of view, they are both positively affected simultaneously by Pif3 implying the regulation on the transcriptional level [57]. In addition, Toc1 inhibits both Lhy and Ccal to form the structure of mutual transcriptional regulation (please compare Eqs.(3), (4) with Eq.(13) in Table 1). So we conclude that Lhy and Ccal function as principal transcription factors.

We also infer some downstream pathways of Chs [X5] (CHALCONE SYNTHASE), Pap1 [X1], and Co [X2] (CONSTANS) in Figure 5. Chs is known as correlated with UV-B protection. It seems that Ccal and Lhy have greater effect (2.7078, -0.7631, respectively) on Chs than Pap1 (-0.0455) as a transcription factor (Eq.(5) in Table 1). This might mean that Chs is regulated by Pap1 in a small scale with amplifying effect on the cis-regulatory level. Co is recognized as a pivotal gene of photoperiodic regulation of flowering. Indeed, strong regulations from Ccal and Lhy are identified to show that Co is regulated with a large-scale attenuation effect on the cis-regulatory level (Eq.(2) in Table 1).

In the overview of the circadian system in Figure 5, most red lines of activating regulation are found in the photo-transduction pathway between phytochrome (light blue ovals) and crytochrome (light yellow ovals) implying the chain interactions after the external light input. By the feedback regulations of Lhy and Ccal (orange ovals), represented by black lines with more linking to upstream genes, the photo-transduction pathways are stabilized to provide oscillation. On the other hand, more blue lines of inhibitive interactions are revealed in the biological-clock regulatory pathways relevant to Co, Pap1, and Chs (light green ovals) underlying the anti-phase functional regulation between these output pathways and the oscillator. In addition, the essential signal transduction factors of Fkf1, Gi, Elf3, and Pif3 (gray ovals) make some critical links between the functional blocks mentioned above in the circadian system[58]. Finally, in order to validate the proposed approach, an independent validation is also given by randomly reshuffling the time order of microarray experiment [see Additional file 2] but with the same choices of target gene and regulatory genes, as shown in Figure 7. It is seen that the proposed circadian regulatory pathway in Figure 5 is destroyed by reshuffling the experimental data.

Figure 7
figure 7

The pathway of circadian regulatory system of Arabidopsis thaliana in Fig. 5 is repeated as independent validation by randomly reshuffling the time order of microarray experiment but with the same choices of target and regulatory genes. Obviously, the proposed pathway in Fig. 5 is destroyed by the reshuffling of experimental data. The related genes are represented as ovals with different colors of light yellow (crytochrome), light blue (phytochrome), orange (biological clock genes), light green (some physiologically light-dependent downstream genes), and gray (other relevant genes not as target genes). There are three types of lines with colors of red (activation), blue (repression), and black. In addition to the black line representing the signal pipes from genes, the red lines are shown depending on the degree of activation whilst the blue lines are shown depending on the degree of repression. The black square symbol attached on the lines is the bifurcate node from the same pipe of signal. The circle symbol attached on the lines is the collecting nodes from different signal sources. The colored degree bar between activation and repression is also shown in the bottom of the figure.

b. Metabolic shift pathway

Sugars, such as glucose and sucrose, are excellent carbon sources for yeasts and almost all of the energy requirements of the cell can be satisfied by glycolysis [6, 45, 59–61, 63–66]. Saccharomyces cerevisiae can switch from fermentatioon at high levels of glucose to respiration at low levels of glucose with major changes in metabolic activity (diauxic shift). In their experiment on the diauxic shift [30], DeRisi et al. inoculated cells from an exponentially growing culture into fresh medium and grew them at 30 for 21 hrs. This offers a resource to infer the possible allosteric regulation of enzymatic activities, protein modification and transcriptional regulation as shown in Table 2. In addition, the scheme of the corresponding inferred pathway is shown in Figure 6. In the overview of the inferring relationships in Table 2, the gluconeogenesis from Pyk1 to pgm2 and the partial fermentation from Pyk1 (PYRUVATE KINASE 1) to Adh1 and Adh2 (ETHANOL DEHYDROGENASE ISOZYME 1, 2) are unraveled as a result of the diauxic shift, so two sub-pathways in opposite directions are concluded.

In the fermentation direction, Pykl [Y4] encoding an enzyme, which catalyzes PEP (Phosphoenolpyruvate) to pyruvate, is negatively regulated (-5.8763) by Pck1 [Y26] (Eq.(4) in Table 2). Pck1 could be intepreted here as an indirect upstream transcription factor or regulatory gene for Pyk1 due to its function of decarboxylation and phosphorylation of oxalacetat in the presence of a nucleoside triphosphate and a divalent metal ion to yield PEP. Another Gcrl [Y15] gene is also identified as the strongest positive regulation (5.9829) to Pyk1 (also see Eq.(4) in Table 2), which is putatively considered as a transcription factor. This candidate transcription factor Gcrl of Pdc1 [Y6] (PYRUVATE DECARBOXYLASE ISOZYME 1) plays a more essential role (-2.5615, Eq.(6) in Table 2) than Rap1 [Y17] (0.1164), and Pyk1 [Y4] is an upstream regulatory factor coding an enzyme with the most positive effect (3.1295) on Pdc1 according to the production of acetaldehyde from pyruvate. In the last kernel of the fermentation, Adh1 [Y7] and Adh2 [Y3] are involved in the ethanol metabolism of carbohydrate storage. Adh2 is implicated to up-regulate Adh1 (0.5145, Eq.(7) in Table 2) under the catabolism from ethanol to acetaldehyde and is significantly up-regulated by Adh1 (1.0746, Eq.(3) in Table 2) to produce ethanol reversely. The mutual regulations of these two isozymes are within a tiny activation delay of 0.5-hr implying their close relationship. In addition, Gcr2 [Y16] and Sfp1 [Y30] with consistently dominant negative influences on Adh2 and Adh1 respectively would be at the transcriptional level presumably (see Eqs.(3), (7) in Table 2).

In the sub-pathway of glyconeogenesis, Eno2 [Y2] (ENOLASE ISOZYME 2) is regulated by Pck1 [Y26] (-0.7195) in the same way as Pyk1 while the main transcription factor is Stp2 [Y31] with significantly positive regulation (0.2147, see Eq.(2) in Table 2). As seen in Table 2, a causal cascade of Eno2, Gpm1 [Y8] (PHOSPHOGLYCERATE MUTASE l), Tpi1 [Y10] (TRIOSE-P ISOMERASE 1), Fba1 [Y11] (ALDOLASE 1), and Pgi1 [Y9] (PHOSPHOGLUCOSE ISOMERASE 1) indicates the construction of a trunk of the glyconeogenesis (see Eqs.(8), (9), (10), and (11) in Table 2). Among them, Rap1 [Y17] and Gcrl [Y15] are the common regulators of Gpm1, Pgi1, and Tpi1. This means that Rap1 and Gcrl might be the most important regulators in the glyconeogenesis pathway by the transcriptional binding. Finally, Pgm2 [Y5] (PHOSPHOGLUCOMUTASE 2) co-regulated by Glk1 [Y27] (GLUCOKINASE 1), Hxk1 [Y28], and Hxk2 [Y29] (HEXOKINASE 1, 2) significantly confers another pathway leading to the synthesis of UDP-GLU from Glucose-6-P (see Eq.(5) in Table 2).

In the overview of the metabolic shift pathway in Figure 6, extremely significant regulations (vivid red lines or blue lines) from most transcription factors (gray ovals) means that transcriptional regulations are feasibly identified. However, in the fermentation sub-pathway (light blue ovals), the mutual regulations between Adh1 and Adh2 are apparent when compared with the obscure relationships in glyconeogenesis (light yellow ovals). Interestingly, three transcription factors Gcr1, Gcr2 and Rap1 (black-line signals) appear to have very significant effects on the metabolic shift pathway. Finally, in order to validate the proposed method, an independent validation is also given by randomly reshuffling the time order of microarray experiment but with same choices of target gene and regulatory gene, as shown in Figure 8. Obviously, the proposed metabolic shift in Figure 6 is destroyed by reshuffling the experimental data.

Figure 8
figure 8

The pathway of metabolic shift regulatory system of yeast Saccharomyces cerevisiae in Fig. 6 is repeated as independent validation by randomly reshuffling the time order of microarray experiment but with the same choices of target and regulatory genes. Obviously, the proposed pathway in Fig. 6 is destroyed by the reshuffling of experimental data. The related genes are represented as ovals with different colors of light blue (fermentation), light yellow (gluconeogenesis), and gray (other relevant genes not as target genes). There are three types of lines with colors of red (activation), blue (repression), and black. In addition to the black line representing the signal pipes from genes, the red lines are shown depending on the degree of activation whilst the blue lines are shown depending on the degree of repression. The black square symbol attached on the lines is the bifurcate node from the same signal source. And the circle symbol attached on the lines is the collecting nodes from different signal sources. The colored degree bar between activation and repression is also shown in the bottom of the figure.

Conclusion

Microarray expression analysis by the dynamic system approach offers an opportunity to generate functional regulation interpretation on the genome-wide scale. The crucial ontology behind using dynamic system techniques is that the causality between gene expression profiles could be identified according to the differential equation underlying a dynamic system. Therefore, because the microarray data were harvested with time progression, the simultaneously varied gene expressions implicated in a genetic regulatory system would be detected to infer the regulatory pathways in spite of the versatile interactions such as transcriptional control, protein phosphorylation, or specific enzyme regulation.

The clustering method answers the problem of what is the functional catalogue of a specific gene by the identification of resembling patterns of gene expressions. Similarly, the co-regulations of upstream genes in our method also imply their concurrent functions. In contrast to the clustering algorithm, the causality of time-course data has been smoothly drawn by our dynamic method. The Bayesian networks were used merely for forward probabilistic estimation with time transition lacking in the feedback linkages. This unidirectional problem would not happen in our algorithm. Owing to the quantitative regulatory abilities of our model, we have a greater diversity of regulatory influence than the Boolean networks, which are deterministic with merely two states.

In our dynamic system approach, we not only can link qualitatively the upstream genes to the downstream ones iteratively, but also indicate quantitatively their regulatory relationships, including the regulatory abilities and the activation delays. In terms of the regulatory abilities, the comparison between the upstream regulatory genes of a target gene can inspire us to ask which one is significant biologically and whether it is a positive or negative influence on the investigated gene. Moreover, the speculation of activation delays benefits the empirical reference by providing us when the upstream regulatory genes might interact with their target genes. Since any gene can be considered as a target gene to trace back its upstream regulators, these regulators are then considered as target genes to trace back their upstream regulators. Iteratively, the genetic regulatory pathway (or network) can be constructed to the genome-wide. According to the qualitative and quantitative features imbedded, two regulatory pathway examples are characterized as in Figure 5 and Figure 6 for the identification of the proposed method. In addition, using the Akaike Information Criterion (AIC), a proper number of regulatory genes would be affirmed. As a result, many links overlap with well-known regulatory and signaling pathways in the previous literature and several putative ones are also found. Furthermore, the activation or repression relationships inferred via the microarray data would distinctly uncover the overall effect of regulatory interactions among casual genes in pathways on the transcriptional level.

In the two pathways under investigation, we have a more detailed understanding about the regulatory interactions among phytochrome, crytochrome and biological clock in the circadian regulatory system. On the other hand, the sophisticated knowledge of the metabolic pathway after the diauxic shift can be unfolded properly in our analysis. Furthermore, the independent validation of our approach is also given by randomly reshuffling the time order of microarray experiment. We found that the proposed pathways in Figures 5 and 6 are all destroyed as shown in Figures 7 and 8, respectively. The successful analysis of these two pathways implies the development of a valid and high-throughput method. All of the programs have been released [see Additional file 1]

There are some shortcomings in our study. First, although the time-course microarray data are available, its lower samplings will distort the real changes of gene expressions, especially for quick dynamic evolution. A more sampling experiment with respect to the intrinsic turnover rate is expected to have more precise analysis. Secondly, a regulatory gene with larger activation delay would not be recognized because the less activation delay criterion is used, but this might be overcome by properly relaxing the criterion. Thirdly, activation profiles under the proteome should be highly correlated with the transcriptional profiles to elevate the interpretation of our system model. In general, the synchronous time-course microarray assay is more suitable to underlie the transcriptional binding among causal genes, but an inference of physical interactions in the post-transcriptional level also has sufficient feasibility in our study.

In the near future, the most pressing task is to investigate our presumed paths in the laboratory. As the pathway construction algorithms are further developed, we expect this system approach to have immense impact in elucidating the underlying molecular mechanisms of pathways in a variety of organisms, especially after the maturation of the protein chips. Ultimately, we envision that biologists will perform routine pathway inference to seek some novel regulations and to identify the evolutionarily conserved links.

Methods

I. Detection of regulatory function Gi(t) in equation (1.2)

After the decomposition of G i (t) in equation (1.2), we substitute equation (1.2) into equation (1.1) to obtain the following dynamic equation for the expression profile of the i-th gene,

In the above dynamic equation, parameters a i , b i , α n , and β n should be estimated by the time profile of microarray data of the i-th gene, i.e. these parameters should be specified so that the simulating output X i (t) of the dynamic model in equation (3.1) should meet the empirical expression profile of the i-th gene. The least-squares estimation method is employed to solve this parameter estimation problem.

To make the dynamic model effective, the dynamic equation in (3.1) should meet the expression profile at all time points t = t1,…, t m and is then arranged in a vector differential form. Consequently, the vector differential form underlined in this equation is applied to m time points in order.

where

, , and m denotes the number of time points.

In the next step, formula (3.2) can be translated into a differential matrix equation as follows,

Y i = A i Φ i + E i     (3.3)

where

, Φ i = [a i b i α 0 β 0 … α N β N ]T, and are in vector forms, while is a matrix.

To estimate the relevant unknown parameters in Φ i , the least-squares method below is used to derive the optimal parameters estimation of ,

Actually, the modeling error could be concluded into E i as the noise of the gene-expression profile or of the microarray chips. So the consideration of modeling error makes equation (3.3) approach more the reality. By the way, in order to get accurate data of and from the expression profile of the target gene, the cubic spline should be employed to interpolate the time profile of the target gene. Furthermore, the choice of N is based on the tradeoff between the accuracy of approximation in (1.2) and the complexity of parameter estimation in (3.4). In this study N = 6 is chosen because these harmonics are enough to approximate regulation functions.

II. Maximum likelihood Estimate of kinetic parameters Ω i in equation (2.4)

Maximum likelihood method for Ω i in equation (2.4) is given as follows:

The log-likelihood function for given m data points is then described by

The necessary condition for the maximum likelihood estimation of variance σ2 is , by which equation (2.5) is obtained.

Substituting equation (2.5) into equation (4.2) yields,

meaning that we can find the maximum likelihood estimation of Ω i by minimizing the value of σ2 in equation(2.5). Then, the maximum likelihood estimate in equation (2.6) is obtained by in equation(2.5).

References

  1. Kettman JR, Frey JR, Lefkovits I: Proteome, transcriptome and genome: top down or bottom up analysis. Biomol Eng. 2001, 18: 207-212. 10.1016/S1389-0344(01)00096-X.

    Article  CAS  PubMed  Google Scholar 

  2. Scheel J, Von Brevern MC, Horlein A, Fischer A, Schneider A, Bach A: Yellow pages to the transcriptome. Pharmacogenomics. 2002, 3: 791-807. 10.1517/14622416.3.6.791.

    Article  CAS  PubMed  Google Scholar 

  3. Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, Bassett DE, Hieter P, Vogelstein B, Kinzler KW: Characterization of the yeast transcriptome. Cell. 1997, 88: 243-251. 10.1016/S0092-8674(00)81845-0.

    Article  CAS  PubMed  Google Scholar 

  4. Eastmond PJ, Graham IA: Trehalose metabolism: a regulatory role for trehalose-6-phosphate. Curr Opin Plant Biol. 2003, 6: 231-235. 10.1016/S1369-5266(03)00037-2.

    Article  CAS  PubMed  Google Scholar 

  5. Harkin DP: Uncovering Functionally Relevant Signaling Pathways Using Microarray-Based Expression Profiles. The Oncologist. 2000, 5: 511-517. 10.1634/theoncologist.5-6-501.

    Article  Google Scholar 

  6. Verdone L, Cesari F, Denis CL, Di Mauro E, Caserta M: Factors affecting Saccharomyces cerevisiae ADH2 chromatin remodeling and transcription. J Biol Chem. 1997, 272: 30828-30834. 10.1074/jbc.272.49.30828.

    Article  CAS  PubMed  Google Scholar 

  7. Yanovsky MJ, Kay SA: Signaling networks in the plant circadian system. Curr Opin Plant Biol. 2001, 4: 429-435. 10.1016/S1369-5266(00)00196-5.

    Article  CAS  PubMed  Google Scholar 

  8. Sontag E, Kiyatkin A, Kholodenko BN: Inferring dynamic architecture of cellular networks using time series of gene expression, protein and metabolite data. Bioinformatics. 2004, 20: 1877-86. 10.1093/bioinformatics/bth173.

    Article  CAS  PubMed  Google Scholar 

  9. Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995, 270: 467-470.

    Article  CAS  PubMed  Google Scholar 

  10. Lipshutz RJ, Fodor SP, Gingeras TR, Lockhart DJ: High density synthetic olignucleotide arrays. Nat Gentet. 1999, 21: 20-24. 10.1038/4447.

    Article  CAS  Google Scholar 

  11. Sherlock G, Hernandez-Boussard T, Kasarskis A, Binkley G, Matese JC, Dwight SS, Kaloper M, Weng S, Jin H, Ball CA, Eisen MB, Spellman PT, Brown PO, Botstein D, Cherry JM: The Stanford Microarray Database. Nucleic Acids Res. 2001, 29: 152-155. 10.1093/nar/29.1.152.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  12. Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30: 207-210. 10.1093/nar/30.1.207.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  13. Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, Holloway E, Kapushesky M, Kemmeren P, Lara GG, Oezcimen A, Rocca-Serra P, Sansone SA: ArrayExpress – a public repository for microarray gene expression data at the EBI. Nucleic Acids Res. 2003, 31: 68-71. 10.1093/nar/gkg091.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  14. Causton HC, Ren B, Koh SS, Harbison CT, Kanin E, Jennings EG, Lee TI, True HL, Lander ES, Young RA: Remodeling of yeast genome expression in response to environmental changes. Mol Biol Cell. 2001, 12: 323-337.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  15. Harmer SL, Hogenesch JB, Straume M, Chang HS, Han B, Zhu T, Wang X, Kreps JA, Kay SA: Orchestrated transcription of key pathways in Arabidopsis by the circadian clock. Science. 2000, 290: 2110-2113. 10.1126/science.290.5499.2110.

    Article  CAS  PubMed  Google Scholar 

  16. Schaffer R, Landgraf J, Accerbi M, Simon V, Larson M, Wisman E: Microarray analysis of diurnal and circadian-regulated genes in Arabidopsis. Plant Cell. 2001, 13: 113-123. 10.1105/tpc.13.1.113.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  17. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PQ, Bostein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cererisiae by microarray hybridization. Mol Biol Cell. 1998, 9: 3273-3297.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  18. Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA. 1998, 95: 14863-14868. 10.1073/pnas.95.25.14863.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  19. Gasch AP, Eisen MB: Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. Genome Biol. 2002, 3: RESEARCH0059-10.1186/gb-2002-3-11-research0059.

    Article  PubMed Central  PubMed  Google Scholar 

  20. Tamayo P, Slonim D, Mesirov J, Zhu Q, Kitareewan S, Dmitrovsky E, Lander ES, Golub TR: Interpreting patterns of gene expression with self- organizing maps: methods and application to hematopoietic differentiation. Proc Natl Acad Sci USA. 1999, 96: 2907-2912. 10.1073/pnas.96.6.2907.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  21. Alter O, Brown PO, Botstein D: Singular value decomposition for genome- wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000, 97: 10101-10106. 10.1073/pnas.97.18.10101.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  22. Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, Ares M, Haussler D: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci U S A. 2000, 97: 262-267. 10.1073/pnas.97.1.262.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  23. Tavazoie S, Hughes JD, Campbell MJ, Cho RJ, Church GM: Systematic determination of genetic network architecture. Nat Genet. 1999, 22: 281-285. 10.1038/10343.

    Article  CAS  PubMed  Google Scholar 

  24. Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data. J Comput Biol. 2000, 7: 601-620. 10.1089/106652700750050961.

    Article  CAS  PubMed  Google Scholar 

  25. Husmeier D: Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003, 19: 2271-82. 10.1093/bioinformatics/btg313.

    Article  CAS  PubMed  Google Scholar 

  26. Huang S: Gene expression profiling, genetic networks, and cellular states: an integrating concept for tumorigenesis and drug discovery. J Mol Med. 1999, 7: 469-480. 10.1007/s001099900023.

    Article  Google Scholar 

  27. Chen HC, Lee HC, Lin TY, Li WH, Chen BS: Quantitative characterization of the transcriptional regulatory network in the yeast cell cycle. Bioinformatics. 2004, 20: 1914-1927. 10.1093/bioinformatics/bth178.

    Article  CAS  PubMed  Google Scholar 

  28. Michael TP, McClung CR: Phase-specific circadian clock regulatory elements in Arabidopsis. Plant Physiol. 2002, 130: 627-638. 10.1104/pp.004929.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  29. Motaki S, Ayako K, Kazuko YS, Kazuo S: Molecular response to drought, salinity and frost: common and different paths for plant protection. Curr Opin Biotechnol. 2003, 14: 194-199. 10.1016/S0958-1669(03)00030-2.

    Article  Google Scholar 

  30. DeRisi JL, Iyer VR, Brown PO: Exploring the metabolic and genetic control of gene expression on a genomic scale. Science. 1997, 278: 680-686. 10.1126/science.278.5338.680.

    Article  CAS  PubMed  Google Scholar 

  31. Staiger D: Circadian rhythms in Arabidopsis: time for nuclear proteins. Planta. 2002, 214: 334-344. 10.1007/s004250100662.

    Article  CAS  PubMed  Google Scholar 

  32. Somers DE, Devlin PF, Kay SA: Phytochromes and cryptochromes in the entrainment of the Arabidopsis circadian clock. Science. 1998, 282: 1488-1490. 10.1126/science.282.5393.1488.

    Article  CAS  PubMed  Google Scholar 

  33. Flores CL, Rodriguez C, Petit T, Gancedo C: Carbohydrate and energy- yielding metabolism in non-conventional yeasts. FEMS Microbiol Rev. 2000, 24: 517-529. 10.1016/S0168-6445(00)00037-1.

    Google Scholar 

  34. Deng XW, Quail PH: Signalling in light-controlled development. Semin Cell Dev Biol. 1999, 10: 121-129. 10.1006/scdb.1999.0287.

    Article  CAS  PubMed  Google Scholar 

  35. Eriksson ME, Hanano S, Southern MM, Hall A, Millar AJ: Response regulator homologues have complementary, light-dependent functions in the Arabidopsis circadian clock. Planta. 2003, 218: 159-162. 10.1007/s00425-003-1106-4.

    Article  CAS  PubMed  Google Scholar 

  36. Green RM, Tingay S, Wang ZY, Tobin EM: Circadian rhythms confer a higher level of fitness to Arabidopsis plants. Plant Physiol. 2002, 129: 576-584. 10.1104/pp.004374.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  37. Hall A, Kozma-Bognar L, Bastow RM, Nagy F, Millar AJ: Distinct regulation of CAB and PHYB gene expression by similar circadian clocks. Plant J. 2002, 32: 529-537. 10.1046/j.1365-313X.2002.01441.x.

    Article  CAS  PubMed  Google Scholar 

  38. Hayama R, Coupland G: Shedding light on the circadian clock and the photoperiodic control of flowering. Curr Opin Plant Biol. 2003, 6: 13-19. 10.1016/S1369-5266(02)00011-0.

    Article  CAS  PubMed  Google Scholar 

  39. Carre IA: Day-length perception and the photoperiodic regulation of flowering in Arabidopsis. J Biol Rhythms. 2001, 16: 415-423. 10.1177/074873001129002006.

    Article  CAS  PubMed  Google Scholar 

  40. Alabadi D, Oyama T, Yanovsky MJ, Harmon FG, Mas P, Kay SA: Reciprocal regulation between TOC1 and LHY/CCA1 within the Arabidopsis circadian clock. Science. 2001, 293: 880-883. 10.1126/science.1061320.

    Article  CAS  PubMed  Google Scholar 

  41. Yin Z, Hatton L, Brown AJ: Differential post-transcriptional regulation of yeast mRNAs in response to high and low glucose concentrations. Mol Microbiol. 2000, 35: 553-565. 10.1046/j.1365-2958.2000.01723.x.

    Article  CAS  PubMed  Google Scholar 

  42. Johansson R: System Modeling and Identification. 1993, Prentice-Hall, Englewood Cliffs

    Google Scholar 

  43. Mas P, Alabadi D, Yanovsky MJ, Oyama T, Kay SA: Dual role of TOC1 in the control of circadian and photomorphogenic responses in Arabidopsis. Plant Cell. 2003, 15: 223-236. 10.1105/tpc.006734.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  44. Mazzella MA, Cerdan PD, Staneloni RJ, Casal JJ: Hierarchical coupling of phytochromes and cryptochromes reconciles stability and light modulation of Arabidopsis development. Development. 2001, 128: 2291-2299.

    CAS  PubMed  Google Scholar 

  45. Ohlmeier S, Kastaniotis AJ, Hiltunen JK, Bergmann U: The yeast mitochondrial proteome, a study of fermentative and respiratory growth. J Biol Chem. 2004, 279: 3956-3979. 10.1074/jbc.M310160200.

    Article  CAS  PubMed  Google Scholar 

  46. Somers DE: Clock-associated genes in Arabidopsis: a family affair. Philos Trans R Soc Lond B Biol Sci. 2001, 356: 1745-1753. 10.1098/rstb.2001.0965.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  47. Valverde F, Mouradov A, Soppe W, Ravenscroft D, Samach A, Coupland G: Photoreceptor regulation of CONSTANS protein in photoperiodic flowering. Science. 2004, 303: 1003-1006. 10.1126/science.1091761.

    Article  CAS  PubMed  Google Scholar 

  48. Mas P, Devlin PF, Panda S, Kay SA: Functional interaction of phytochrome B and cryptochrome 2. Nature. 2000, 408: 207-211. 10.1038/35041583.

    Article  CAS  PubMed  Google Scholar 

  49. Sharrock RA, Clack T: Patterns of expression and normalized levels of the five Arabidopsis phytochromes. Plant Physiol. 2002, 130: 442-456. 10.1104/pp.005389.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  50. Fankhauser C, Staiger D: Photoreceptors in Arabidopsis thaliana: light perception, signal transduction and entrainment of the endogenous clock. Planta. 2002, 216: 1-16. 10.1007/s00425-002-0831-4.

    Article  CAS  PubMed  Google Scholar 

  51. Toth R, Kevei E, Hall A, Millar AJ, Nagy F, Kozma-Bognar L: Orcadian clock-regulated expression of phytochrome and cryptochrome genes in Arabidopsis. Plant Physiol. 2001, 127: 1607-1616. 10.1104/pp.127.4.1607.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  52. Ahmad M, Jarillo JA, Smirnova O, Cashmore AR: The CRY1 blue light photoreceptor of Arabidopsis interacts with phytochrome A in vitro. Mol Cell. 1998, 1: 939-948. 10.1016/S1097-2765(00)80094-5.

    Article  CAS  PubMed  Google Scholar 

  53. Casal JJ: Phytochromes, cryptochromes, phototropin: photoreceptor interactions in plants. Photochem Photobiol. 2000, 71: 1-11. 10.1562/0031-8655(2000)071<0001:PCPPII>2.0.CO;2.

    Article  CAS  PubMed  Google Scholar 

  54. Hall A, Kozma-Bognar L, Toth R, Nagy F, Millar AJ: Conditional circadian regulation of PHYTOCHROME A gene expression. Plant Physiol. 2001, 127: 1808-1818. 10.1104/pp.127.4.1808.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  55. Martinez-Garcia JF, Huq E, Quail PH: Direct targeting of light signals to a promoter element-bound transcription factor. Science. 2000, 288: 859-863. 10.1126/science.288.5467.859.

    Article  CAS  PubMed  Google Scholar 

  56. Carre IA, Kim JY: MYB transcription factors in the Arabidopsis circadian clock. J Exp Bot. 2002, 53: 1551-1557. 10.1093/jxb/erf027.

    Article  CAS  PubMed  Google Scholar 

  57. Kim JY, Song HR, Taylor BL, Carre IA: Light-regulated translation mediates gated induction of the Arabidopsis clock protein LHY. EMBO J. 2003, 22: 935-944. 10.1093/emboj/cdg075.

    Article  PubMed Central  CAS  PubMed  Google Scholar 

  58. Imaizumi T, Tran HG, Swartz TE, Briggs WR, Kay SA: FKF1 is essential for photoperiodic-specific light signalling in Arabidopsis. Nature. 2003, 426: 302-306. 10.1038/nature02090.

    Article  CAS  PubMed  Google Scholar 

  59. Barnett JA: A history of research on yeasts 6: the main respiratory pathway. Yeast. 2003, 20: 1015-1044. 10.1002/yea.1021.

    Article  CAS  PubMed  Google Scholar 

  60. de Mesquita JF, Zaragoza O, Gancedo JM: Functional analysis of upstream activating elements in the promoter of the FBP1 gene from Saccharomyces cerevisiae. Curr Genet. 1998, 33: 406-411. 10.1007/s002940050353.

    Article  CAS  PubMed  Google Scholar 

  61. Dickinson FM, Back S: The activity of yeast ADH I and ADH II with long- chain alcohols and diols. Chem Biol Interact. 2001, 130–132 ((1–3)): 417-423. 10.1016/S0009-2797(00)00266-0.

    Article  PubMed  Google Scholar 

  62. Dickinson JR, Salgado LE, Hewlins MJ: The catabolism of amino acids to long chain and complex alcohols in Saccharomyces cerevisiae. Biol Chem. 2003, 278: 8028-8034. 10.1074/jbc.M211914200.

    Article  CAS  Google Scholar 

  63. Fraenkel DG: The top genes: on the distance from transcript to function in yeast glycolysis. Curr Opin Microbiol. 2003, 6: 198-201. 10.1016/S1369-5274(03)00023-7.

    Article  CAS  PubMed  Google Scholar 

  64. Goncalves PM, Griffioen G, Bebelman JP, Planta RJ: Signalling pathways leading to transcriptional regulation of genes involved in the activation of glycolysis in yeast. Mol Microbiol. 1997, 25: 483-493. 10.1046/j.1365-2958.1997.4811847.x.

    Article  CAS  PubMed  Google Scholar 

  65. Goncalves P, Planta RJ: Starting up yeast glycolysis. Trends Microbiol. 1998, 6: 314-319. 10.1016/S0966-842X(98)01305-5.

    Article  CAS  PubMed  Google Scholar 

  66. Scheffler IE, de la Cruz BJ, Prieto S: Control of mRNA turnover as a mechanism of glucose repression in Saccharomyces cerevisiae. Int J Biochem Cell Biol. 1998, 30: 1175-1193. 10.1016/S1357-2725(98)00086-7.

    Article  CAS  PubMed  Google Scholar 

Download references

Acknowledgements

We thank the National Science Council, Taiwan for grants NSC 93-3112-B-007-003.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Bor-Sen Chen.

Additional information

Authors' contributions

W.C. Chang and Chang-Wei Li carried out the computational studies and analysis. B.S. Chen gave the topic and suggestions. All authors read and approved the final manuscript.

Electronic supplementary material

12859_2004_369_MOESM1_ESM.rar

Additional File 1: The raw data has been transform to .mat file, which is one of the Matlab files. This file has been divided into three parts: Name, Time, and Profile. [see Additional file 1] Name denotes the names of microarray data. Time denotes the time points of microarray data. Profile denotes the gene profiles of microarray data. (RAR 9 MB)

12859_2004_369_MOESM2_ESM.xls

Additional File 2: The 'Shuffled_data' is the Shuffled raw data. [see Additional file 2] (XLS 2 MB)

Authors’ original submitted files for images

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chang, WC., Li, CW. & Chen, BS. Quantitative inference of dynamic regulatory pathways via microarray data. BMC Bioinformatics 6, 44 (2005). https://doi.org/10.1186/1471-2105-6-44

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/1471-2105-6-44

Keywords