
Bayesian Orthogonal Least Squares (BOLS) algorithm for reverse engineering of gene regulatory networks



Reverse engineering of a gene regulatory network with a large number of genes and a limited number of experimental data points is a computationally challenging task. In particular, reverse engineering using linear systems is an underdetermined and ill-conditioned problem: the amount of microarray data is limited, and the solution is very sensitive to noise in the data. Therefore, reverse engineering of gene regulatory networks under these conditions requires a rigorous optimization algorithm.


This study presents a novel algorithm for reverse engineering with linear systems. The proposed algorithm combines orthogonal least squares, a second-order derivative for network pruning, and Bayesian model comparison. The entire network is decomposed into a set of small networks, defined as unit networks. The algorithm assigns each unit network a value P(D|Hi), which is used as a confidence level: a unit network with higher P(D|Hi) is more likely to have been elucidated correctly. Thus, the proposed algorithm is able to locate true positive interactions using P(D|Hi), a property unique to the proposed algorithm.

The algorithm is evaluated with synthetic and Saccharomyces cerevisiae expression data using the dynamic Bayesian network. With synthetic data, we show that the performance of the algorithm depends on the number of genes, the noise level, and the number of data points. With Yeast expression data, we show that a remarkable number of the interactions elucidated by the proposed algorithm correspond to known physical or genetic events.

The performance of the algorithm is compared with the Sparse Bayesian Learning algorithm using both synthetic and Saccharomyces cerevisiae expression data sets. The comparison experiments show that the algorithm produces sparser solutions with fewer false positives than the Sparse Bayesian Learning algorithm.


From our evaluation experiments, we draw the following conclusions: 1) simulation results show that the algorithm can elucidate gene regulatory networks using a limited number of experimental data points; 2) simulation results also show that the algorithm can handle noisy data; 3) the experiment with Yeast expression data shows that the proposed algorithm reliably elucidates known physical or genetic events; 4) the comparison experiments show that the algorithm performs more efficiently than the Sparse Bayesian Learning algorithm with noisy and limited data.


High-throughput technologies such as DNA microarrays provide the opportunity to elucidate the underlying complex cellular networks. Many genome-wide expression data sets are now available. As an initial step, several computational clustering analyses have been applied to expression data sets to find sets of co-expressed and potentially co-regulated genes [1-5]. As a next step, there have been efforts to elucidate the gene regulatory networks (GRN) embedded in complex biological systems. A growing number of methods for reverse engineering of GRN have been reported, including: Boolean networks [6, 7], Bayesian networks [8-10], the Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) [11], linear models [12], neural networks [13], methods using ordinary differential equations [14, 15], a sparse graphical Gaussian model [16], a method using a genetic algorithm [17], and a method using the Sparse Bayesian Learning (SBL) algorithm [18, 19].

In reverse engineering of GRN, the essential tasks are developing and comparing alternative GRN models to account for the collected data (Figure 1). Two levels of inference are involved in the data modeling process [20]. The first level of inference is fitting one of the models to the data under the assumption that the chosen model is true. The second level of inference is model comparison: we wish to compare the alternative models in light of the data and assign some level of preference to each. Thus, the reverse engineering method should serve as a framework for fitting several different GRN models to the data and comparing them. For instance, there are several GRN modeling studies to which the reverse engineering algorithm could be applied: 1) systems of ordinary differential equations (ODE) [15], 2) dynamic Bayesian network (DBN) based methods [10, 18], and 3) a linear stochastic differential equation for a transcriptional regulatory network [14]. It is noted that these three models can be represented by linear systems. By a "linear system", we mean a system represented as a linear equation in matrix form, such as Eq. 2 in the Methods section.

Figure 1

The Bayesian Orthogonal Least Squares algorithm could be used as a framework for gene regulatory studies, including the collection and modeling of data.

Microarrays have been used to measure genome-wide expression patterns during the cell cycle of different eukaryotic and prokaryotic cells. The review paper of Cooper and Shedden [21] presents various published microarray data sets, which have been interpreted as showing that a large number of genes are expressed in a cell-cycle-dependent manner. From this point of view, it is assumed that the underlying GRN depends on cell-cycle time. Therefore, this study uses the DBN because it can represent how expression levels evolve over time.

In this paper, we address two main challenges in reverse engineering with linear systems and present a novel algorithm to overcome these difficulties. Firstly, reverse engineering of GRN would be computationally less challenging if a significantly large amount of experimental data were available; however, data are limited because microarray experiments are expensive. This makes the reverse engineering of GRN underdetermined, meaning that there are substantially more genes than measurements. Secondly, reverse engineering of GRN with linear systems is ill-conditioned, because small relative changes in the design matrix E in Eq. 2 due to noise cause substantially large changes in the solution. Therefore, a reverse engineering algorithm named Bayesian orthogonal least squares (BOLS) is developed to overcome these difficulties. The BOLS method is created by combining three techniques: 1) the Orthogonal Least Squares method (OLS) [22], 2) a second-order derivative for network pruning [23], and 3) Bayesian model comparison [20].

We evaluate the BOLS method by inferring GRN from both synthetic and Yeast expression data. We provide a performance comparison between BOLS and one of the state-of-the-art reverse engineering methods, the SBL algorithm [24]. The SBL algorithm has recently been used in GRN studies with linear systems [18, 19]. For the evaluation with Yeast expression data, we validate the inferred GRN using information from a database that contains large data sets of known biological interactions.

Results and discussion

Case study 1: In silico experiment

Our in silico experiment follows the methodology for generating synthetic expression data sets for systems of DBN used in Rogers and Girolami [19]. We generate synthetic networks using a power-law distribution. To create a network structure, we decompose the entire network into a set of small networks, defined as unit networks, and proceed unit network by unit network. Figure 2 presents a unit network consisting of a target gene and a list of genes as regulators. It should be noted that there is no requirement for the network to be acyclic. All created unit networks are combined to create a whole GRN; the combination of all (or selected) unit networks is a straightforward process based on the definition of a graph (see Methods section). This unit network approach is similar to the approach adopted in Bayesian network based methods [9, 25] and the SBL based method [19]. For each target gene, we sample the number of genes (mi) regulating this target gene from the approximate power distribution

Figure 2

The schematic of a unit network. (a) The input unit network consists of target gene Yi and all other genes as regulator candidates. (b) The output unit network consists of target gene Yi and its most probable regulators.

$$
P(m_i) = \begin{cases} M^{-1} m_i^{-\eta}, & m_{\min} \le m_i \le m_{\max} \\ 0, & \text{otherwise} \end{cases}
$$

where the normalization constant is given by

$$
M = \sum_{m_i = m_{\min}}^{m_{\max}} m_i^{-\eta}.
$$

Following Rogers and Girolami [19], Wagner [26], and Rice et al. [27], the constant η is set to 2.5. mmax and mmin are the maximum and minimum numbers of allowed regulator genes in a unit network, respectively. The condition mmax << K (K being the number of genes) ensures the sparseness of the synthetic network. Note that mmax is set to 4 for all experiments. More details on creating the synthetic network can be found in the supplementary information of Rogers and Girolami's study [19]. In this study, the Matlab code from Rogers and Girolami's study for creating the synthetic network is used, which is available from [28]. We first randomly generate the synthetic networks unit network by unit network and combine them into a whole GRN; we then generate the synthetic expression data from the randomly generated synthetic GRN. Using these synthetic expression data, we infer the network with BOLS. It should be noted that BOLS and the synthetic network generation method do not interact, because the generation and inference of networks are completely separate processes.
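The truncated power-law sampling of mi above can be sketched as follows (a minimal illustration, not the authors' Matlab code; m_min = 1 is an assumption, while η = 2.5 and m_max = 4 follow the text):

```python
import random

def sample_num_regulators(eta=2.5, m_min=1, m_max=4):
    """Draw m_i from P(m_i) = M^-1 * m_i^-eta on {m_min, ..., m_max}."""
    support = list(range(m_min, m_max + 1))
    weights = [m ** (-eta) for m in support]   # unnormalized m_i^(-eta)
    M = sum(weights)                           # normalization constant M
    return random.choices(support, weights=[w / M for w in weights], k=1)[0]

# Small regulator counts dominate, which keeps the synthetic network sparse.
draws = [sample_num_regulators() for _ in range(5000)]
```

With η = 2.5 roughly four out of five target genes receive a single regulator, which mirrors the sparseness condition mmax << K stated in the text.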

In this experiment, the synthetic expression data are generated based on the DBN using Eq. 1, in which the expression data evolve over time. However, as the simulation proceeds, the expression data diverge by constantly increasing or decreasing. Thus, we collect a single time point only after the expression data have been simulated for a certain period of time (from t = 0 to t = T), to avoid excessively high expression levels. We proceed one time point at a time. For each time point, the generation of expression data starts from initial synthetic data: each gene's initial condition is assigned a random number between 0 and 1. With the given initial condition, we simulate the expression data for each gene i from t = 0 to t = T using Eq. 1. We take the measurement at t = T-1 for the design matrix E and at t = T for Yi in Eq. 2. We repeat this process N times to collect N data points and apply the reverse engineering algorithms to reconstruct the synthetic network.
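The data-collection loop described above can be sketched as follows (a simplified illustration under the stated assumptions; the Gaussian noise model for ξ and the horizon T = 10 are placeholder choices, not necessarily the authors' exact settings):

```python
import numpy as np

def simulate_one_point(W, T=10, eps=0.1, rng=None):
    """Run Eq. 1 from a random initial state; return (e(T-1), e(T))."""
    rng = rng or np.random.default_rng()
    K = W.shape[0]
    e = rng.uniform(0.0, 1.0, size=K)          # initial condition in [0, 1]
    for _ in range(T - 1):                     # evolve up to t = T-1
        e = W @ e + eps * rng.standard_normal(K)
    e_prev = e.copy()                          # measurement at t = T-1 (row of E)
    e = W @ e + eps * rng.standard_normal(K)   # measurement at t = T (row of Y)
    return e_prev, e

def collect_data(W, N=20, T=10, eps=0.1, seed=0):
    """Repeat the single-point simulation N times to build E and Y."""
    rng = np.random.default_rng(seed)
    pairs = [simulate_one_point(W, T, eps, rng) for _ in range(N)]
    E = np.array([p[0] for p in pairs])        # N x K design matrix
    Y = np.array([p[1] for p in pairs])        # N x K targets (one column per gene)
    return E, Y
```

Restarting from a fresh random initial condition for every collected point is what keeps the expression levels from diverging over a long simulation.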

We compare the performance of the BOLS and SBL methods to show the efficiency of the BOLS method. The SBL method is one of the state-of-the-art algorithms and has recently been applied to GRN studies with linear systems [18, 19]. We use the Matlab code of the SBL algorithm available from [29]. First, we investigate the effect of the number of data points (N = 20, 40, 60, 80, 100) and noise (ε = 0.01, 0.05, 0.1) on performance, using synthetic GRN and expression data with a fixed number of genes (K = 100). Sensitivity and complementary specificity are used as performance measures [9, 19, 27]. We compute sensitivity = TP/(TP + FN) and complementary specificity = FP/(TN + FP), where TP is the number of true positives, FN the number of false negatives, FP the number of false positives, and TN the number of true negatives. In other words, the sensitivity is the proportion of recovered true positive interactions and the complementary specificity is the proportion of false positive interactions. To investigate the variability of the results, we ran both algorithms 20 times with the same control parameters (number of data points, noise, etc.), creating a new random synthetic network for each run. We find that the sensitivity and complementary specificity are consistent over the 20 experiments, with small variability (see Table 1). The systematic effect of noise and the number of data points on performance can be seen in Table 1. As the number of data points increases with the number of genes and the noise level ε fixed, the performance of both algorithms increases; as ε increases, the performance of both algorithms decreases. For the number of data points ≥ 80, the sensitivity of SBL is slightly greater than that of BOLS, and the complementary specificity of SBL is significantly greater than that of BOLS. This means that the BOLS algorithm produces a significantly smaller proportion of false positive interactions than the SBL algorithm. It should be noted that results from the SBL algorithm with N = 20, 40, and 60 are not available, because the Matlab code of the SBL algorithm does not run when the number of data points N is relatively low. The SBL algorithm includes a Cholesky factorization of its Hessian matrix, which is required to be positive definite; this Hessian consists of the design matrix and hyperparameters [24]. When the number of data points is much smaller than the number of genes in the data set, the Hessian becomes non-positive definite. Our experiments show that the SBL algorithm is not suitable for reverse engineering with a limited number of data points and does not run at all with a severely limited number of data points. The BOLS algorithm, on the other hand, produces a relatively small proportion of FP interactions with a limited number of data points. It should be noted that Rogers and Girolami [19] generate 2R expression levels for each gene in each knock-out experiment: R in the normal (wild-type) system and R in the knock-out (mutant) system. In a network of K genes, in which each gene is knocked out individually, they have 2RK data points. Since the evaluation of BOLS with Rogers and Girolami's knockout approach [19] is beyond the scope of our study, i.e. the underdetermined problem using the DBN model, we provide the comparisons between BOLS and SBL based on their approach as Additional file 1.
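The two performance measures can be computed directly from edge sets (a straightforward sketch; edges are represented as (regulator, target) pairs for illustration):

```python
def performance(true_edges, inferred_edges, all_possible_edges):
    """Return (sensitivity, complementary specificity).
    sensitivity = TP/(TP+FN); complementary specificity = FP/(TN+FP)."""
    true_set, inferred_set = set(true_edges), set(inferred_edges)
    tp = len(true_set & inferred_set)          # correctly recovered interactions
    fn = len(true_set - inferred_set)          # missed true interactions
    fp = len(inferred_set - true_set)          # spurious inferred interactions
    tn = len(set(all_possible_edges) - true_set - inferred_set)
    return tp / (tp + fn), fp / (tn + fp)

# Toy example: 3 genes, 9 possible directed interactions.
all_edges = [(i, j) for i in range(3) for j in range(3)]
sens, comp_spec = performance({(0, 1), (1, 2)}, {(0, 1), (2, 0)}, all_edges)
```

In the toy example one of the two true edges is recovered (sensitivity 0.5) and one of the seven absent edges is falsely inferred (complementary specificity 1/7).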

Table 1 Sensitivity and complementary specificity for BOLS and SBL algorithms

We use receiver operating characteristic (ROC) analysis [19] to characterize the trade-off between the proportions of true and false positive interactions with a limited number of data points (N < 20 and K = 100) in Figure 3. The ROC analysis shows that the BOLS algorithm produces solutions with an extremely small proportion of false positive interactions when the number of data points is extremely small. We also analyze the effect of the number of genes K with a limited number of data points, N = 10. Figure 4 shows that the performance of the BOLS algorithm decreases as K increases from 100 to 300. Therefore, it can be concluded that the performance of the proposed algorithm depends on K, N, and ε.

Figure 3

ROC analysis of BOLS output with K = 100. (a) N = 5 and ε = 0.01, (b) N = 5 and ε = 0.05, (c) N = 5 and ε = 0.1, (d) N = 10 and ε = 0.01, (e) N = 10 and ε = 0.05, (f) N = 10 and ε = 0.1, (g) N = 15 and ε = 0.01, (h) N = 15 and ε = 0.05, (i) N = 15 and ε = 0.1. For all Figures, the x-axis corresponds to the complementary specificity, the y-axis sensitivity.

Figure 4

ROC analysis of BOLS output with N = 10. (a) K = 100 and ε = 0.01, (b) K = 100 and ε = 0.05, (c) K = 100 and ε = 0.1, (d) K = 200 and ε = 0.01, (e) K = 200 and ε = 0.05, (f) K = 200 and ε = 0.1, (g) K = 300 and ε = 0.01, (h) K = 300 and ε = 0.05, (i) K = 300 and ε = 0.1. For all Figures, the x-axis corresponds to the complementary specificity, the y-axis sensitivity.

The experiments show that BOLS produces solutions with significantly low complementary specificity regardless of K, N, and ε, because the Bayesian model selection scheme is efficient enough to discover the optimal solution. Since we do not have any information on the noise in the data, we completely over-fit the data to the DBN model using OLS as a first step. We then remove the unnecessary inferred parameters (those related to noise) to obtain the optimal solution by trading off the complexity of the inferred GRN against the data misfit. As the complexity of the inferred GRN decreases during the network pruning process, we use Bayesian model selection to select the optimal solution. It should be noted that Bayesian model selection includes the Occam factor, which automatically suppresses the tendency to discover spurious structure in data. Thus, BOLS efficiently infers GRN with a significantly small proportion of FP interactions from noisy and limited data under the DBN model. In Figure 5, we present an example showing that the performance of BOLS increases as the network pruning step proceeds. We first generate a synthetic network of 50 genes (K = 50) and then simulate 20 data points (N = 20) with this network and noise level ε = 0.1. We concentrate on the output unit network with the highest evidence value logP(D|Hi) for evaluation. Recall that as the network pruning continues, the number of inferred interactions in the unit network decreases. Figure 5b shows that the number of errors (FP+FN) decreases as the network pruning proceeds. The complementary specificity also decreases along with the network pruning (Figure 5c). The sensitivity, on the other hand, remains constant and equal to 1 (Figure 5d). This means that the over-fitted solutions after the OLS step contain only TP and FP interactions (no FN interactions). The number of errors and the complementary specificity converge to 0 as the network pruning process proceeds, at which point the unit network has the highest evidence logP(D|Hi) (Figure 5a). Therefore, it can be concluded that the OLS method can cope with underdetermined problems using noisy data, provided that it is combined with the network pruning process and Bayesian model selection.

Figure 5

The changes in the performance of BOLS as the network pruning step proceeds. The simulation experiment is done with K = 50 genes, N = 20 data points, and ε = 0.1. In these figures, we concentrate on the output unit network that has the highest logP(D|Hi) among all output unit networks. For all figures, the x-axis corresponds to the number of inferred interactions: as the network pruning proceeds, the number of inferred interactions in the unit network decreases. Each y-axis corresponds to (a) logP(D|Hi), (b) the number of errors (FP+FN), (c) the complementary specificity, (d) the sensitivity.

The BOLS algorithm is run K times, producing K unit networks, which are combined to build the whole network. Each unit network is assigned a P(D|Hi). For each unit network, we compute the number of errors = FN + FP. The relationship between P(D|Hi) and the number of errors for each unit network is shown in Figure 6. The number of errors decreases as the number of data points increases on the synthetic data. Unit networks with higher P(D|Hi) are more accurate than those with lower P(D|Hi); that is, a higher P(D|Hi) gives higher confidence that a unit network is correctly reconstructed. Thus, when a low number of data points and an extremely high number of genes are given, the algorithm should be able to recover a partially correct network from the unit networks with relatively high P(D|Hi). It should be noted that the BOLS algorithm does not provide confidence levels for individual interactions inside a unit network. However, Figure 6 shows that many unit networks with relatively high evidence values have zero incorrectly inferred interactions. Therefore, the evidence values for unit networks are sufficient for locating unit networks without FP or FN interactions.
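Combining unit networks and ranking them by evidence can be sketched as follows (illustrative only; a unit network is represented here as a target gene, its inferred regulators, and a hypothetical log-evidence value):

```python
def combine_unit_networks(unit_networks, top_k=None):
    """Each unit network is (target, regulators, log_evidence).
    Rank by evidence, optionally keep only the top_k most confident,
    and merge them into one set of (regulator, target) edges."""
    ranked = sorted(unit_networks, key=lambda u: u[2], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]
    return {(reg, target) for target, regs, _ in ranked for reg in regs}

# Hypothetical unit networks: (target gene, inferred regulators, logP(D|Hi)).
units = [("g1", ["g2", "g3"], -10.2),
         ("g2", ["g4"], -55.7),
         ("g3", ["g1"], -23.9)]
high_confidence = combine_unit_networks(units, top_k=1)
```

Merging is a plain graph union because each unit network contributes only edges into its own target gene; keeping only the highest-evidence unit networks yields the "partially correct network" described in the text.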

Figure 6

The relationship between the evidence value P(D|Hi) and the number of errors for unit networks, where K = 100, ε = 1.0e-2, and mmax = 4. (a) N = 10, (b) N = 15, (c) N = 20. For all Figures, the x-axis corresponds to log(P(D|Hi)), the y-axis the number of errors

Case study 2: Application of BOLS to Saccharomyces cerevisiae data

To evaluate our algorithm for reverse engineering of GRN, we use the microarray data from Spellman et al. [30], who created three different data sets using three different synchronization techniques. We focus our analysis on the data set produced by the α factor arrest method, and concentrate on the expression data set of 20 genes known to be involved in cell cycle regulation [31]. Thus, we have a data set with 20 genes and 17 data points in this experiment. Based on the previous simulation experiment, the expected sensitivity is 0.967 (ε = 0.01), 0.937 (ε = 0.05), and 0.919 (ε = 0.1), and the expected complementary specificity is 0.022 (ε = 0.01), 0.019 (ε = 0.05), and 0.029 (ε = 0.1), respectively. This means that the output is expected to have less than approximately 3% complementary specificity, assuming that ε ≤ 0.1 and that the sparseness of the GRN is mmax = 4.

The output unit networks for this data set are presented in Table 2, including the P(D|Hi) of each unit network; there are 107 interactions inferred by BOLS. Among those interactions, 45 are identified as physical or genetic interactions in the BioGRID database [32]. Later in this section, we provide a logical basis for why some of the unidentified interactions might be actual physical or genetic events. BioGRID is a freely accessible database of physical and genetic interactions in Saccharomyces cerevisiae, available at [33]. The output in Table 2 shows that interactions with higher P(D|Hi) are more likely to correspond to known physical or genetic interactions in BioGRID than interactions with lower P(D|Hi): 1) among the elucidated interactions with P(D|Hi) above the 66th percentile of all P(D|Hi) values, 23 are identified as physical or genetic interactions in BioGRID; 2) among those between the 33rd and 66th percentiles, 14 are identified as physical or genetic interactions; 3) among those below the 33rd percentile, 8 are identified as physical or genetic interactions. This is consistent with the previous simulation experiment showing that BOLS produces fewer false positive interactions at relatively high P(D|Hi) than at low P(D|Hi). It is noted that the overall values of P(D|Hi) in Table 2 are relatively small compared to those in Figure 6; the number of genes is 20 in Table 2 and 100 in Figure 6. From Figure 6, it is also noted that the overall values of P(D|Hi) decrease as the number of data points decreases, and we find that they decrease as the noise level increases. Therefore, the overall values of P(D|Hi) can differ depending on the number of genes, the noise level, and the number of data points.

Table 2 The output unit networks of BOLS with log(P(D|Hi))

We pool both physical and genetic interactions from BioGRID to validate the output interactions in this experiment. The rationale for this pooling is as follows. Several proteins join together to form a multi-protein complex that has certain functions or regulates other proteins. For example, the SCF complex consists of Skp, cullin, and F-box proteins, and promotes the G1-S transition by targeting G1 cyclins and the Cln-Cdk inhibitor Sic1 for degradation. From Breeden's review [34], it is known that cell cycle regulated complexes with at least one sub-unit are regulated at the transcript level. This means that certain protein complexes might regulate other complexes with a time delay. From expression data sets alone, we only know which proteins are present at certain times. In terms of efficiency and logical order, it is assumed that the cell makes proteins only when they are needed; if proteins were made all the time, the cell would be inefficient in an environment lacking their substrates [21]. From this point of view, there are two possible cases to consider when two particular proteins are present: 1) the two proteins interact with each other to form a multi-protein complex; 2) one protein forms a complex with some other proteins, and this complex regulates the other protein, which may itself form a protein complex. Therefore, these two types of interactions can be pooled together to validate the output interactions.

Unit networks having relatively many identified physical or genetic interactions are the ones with Cln3, Cln2, Clb5, Cln1, Cdc28, Swi6, Cdc53 and Cdc34 as target genes. For example, we present the unit network with Cdc28 as target gene. Cdc28 is identified as having physical or genetic interactions with Cln1 [35-40], Cln2 [36-49], Cln3 [47, 50-55], Clb1 [35, 56, 57], Clb2 [35, 38, 41, 58-64], Clb4 [56, 57, 61], Hct1 [38, 53, 63, 65-67], Sic1 [37, 53, 61, 68-71], Cdc20 [53, 67], Swi5 [53], and Swi6 [53, 72]. Cdc28 is the catalytic subunit of the main cell cycle cyclin-dependent kinase (CDK), which alternatively associates with G1 cyclins (Cln1, Cln2, and Cln3) and G2/M cyclins (Clb1, Clb2, and Clb4) that direct the CDK to specific substrates. Hct1 (Cdh1) and Cdc20 are cell cycle regulated activators of the anaphase-promoting complex/cyclosome (APC/C), which directs ubiquitination of mitotic cyclins. One of Cdc28's Gene Ontology definitions gives further evidence that Cdc28 might associate with Hct1 and Cdc20, because Cdc28 is involved in the progression from G1 phase to S phase of the mitotic cell cycle. Sic1 is an inhibitor of Cdc28-Clb kinase complexes that controls the G1/S phase transition, preventing premature S phase and ensuring genomic integrity.

Among the genes inside the Cdc28 unit network, several are not identified as having physical or genetic interactions with Cdc28 in BioGRID: Clb6, Mbp1, Mcm1, Cdc34, Cdc53 and Skp1. However, the definitions of these genes in BioGRID provide enough evidence that some of them interact indirectly with Cdc28. For example, Clb6 is a B-type cyclin involved in DNA replication during S phase, which activates Cdc28 to promote initiation of DNA synthesis. Clb6 also has a role in the formation of mitotic spindles along with Clb3 and Clb4. Thus, Clb6 indirectly regulates Cdc28 along with Clb4. Cdc53, Cdc34 and Skp1 also indirectly regulate Cdc28 through Sic1: together with an F-box protein they form the structural backbone of SCF complexes, built around the cullin, and the SCF promotes the G1-S transition by targeting G1 cyclins and the Cln-Cdk inhibitor Sic1 for degradation. Mbp1 is a transcription factor involved in regulation of cell cycle progression from G1 to S phase, which forms a complex with Swi6 that binds to the MluI cell cycle box regulatory element in promoters of DNA synthesis genes. Thus, Mbp1 is associated with Cdc28 by forming a complex with Swi6.

For another example, we present the unit network having Clb5 as target gene. Clb5 is identified as having physical or genetic interactions with Clb6 [73-79], Sic1 [41, 61, 71, 80-82], Clb2 [83], Cdc20 [84], Cln3 [75], Hct1 [63, 85], Swi4 [75, 86], and Swi6 [86].

Among the genes inside the Clb5 unit network, several are not identified as having physical or genetic interactions with Clb5: Cln2, Clb1, Mcm1, Mbp1, Swi5 and Clb4. However, there is evidence that some of these genes might have indirect interactions with Clb5. For example, Clb4 is associated with Clb5, because Clb5 is a B-type cyclin involved in DNA replication during S phase and has a role in the formation of mitotic spindles along with Clb3 and Clb4. As another example, Mbp1 and Clb5 have an indirect interaction: Mbp1 is a transcription factor involved in regulation of cell cycle progression from G1 to S phase, which forms a complex with Swi6 that binds to the MluI cell cycle box regulatory element in promoters of DNA synthesis genes. It is already identified that Swi6 regulates Clb5; thus, Mbp1 indirectly regulates Clb5 through Swi6.

We also evaluate both BOLS and SBL using the same data set as in Table 2 to compare their efficiency. For direct comparison, we draw the graphs of the GRN inferred by BOLS and SBL (Figure 7). There are 107 inferred interactions (45 of them identified as known interactions in the database) from BOLS (Figure 7a) and 116 interactions (38 of them known) from SBL (Figure 7b). Figure 7c shows the 62 interactions inferred by both BOLS and SBL, of which 32 are known interactions. Figure 7d shows the 45 interactions inferred only by BOLS, of which 13 are identified as known interactions in the database. On the other hand, Figure 7e shows the 54 interactions inferred only by SBL, of which only 6 are known interactions. It is reasonable to assume that complete information on the biological interactions of Yeast is not yet available in the database; more information concerning Yeast interactions is still being deposited on the way to a more complete understanding of the underlying complex cellular networks of Yeast. Based on the currently available information in the database, we can say that the SBL algorithm infers GRN with relatively more complexity and fewer identified known interactions than BOLS. Therefore, based on our evaluation experiments with synthetic and Yeast expression data, we conclude that SBL produces more over-fitted solutions (i.e. more FP) than BOLS.

Figure 7

The GRN inferred by BOLS and SBL are compared using the same expression data as in Table 2. (a) The GRN from BOLS, (b) the GRN from SBL, (c) the interactions inferred by both BOLS and SBL, (d) the interactions inferred only by BOLS, (e) the interactions inferred only by SBL. For all figures, solid lines correspond to inferred interactions identified as known physical or genetic interactions in the BioGRID database, and dashed lines to unknown interactions.


In the evaluation of BOLS using synthetic data, it is shown that the proposed BOLS algorithm is able to reconstruct networks using a very limited number of experimental samples. In this study, we assume that there is a significantly limited number of data points and that the noise level in the data is not known, a common situation in expression data analysis. To handle these difficulties, we adopt a decomposition strategy that breaks the entire network down into a set of unit networks, turning the inference of a whole GRN into several separate regressions. Thus, with an extremely small number of data points, our method cannot provide 100% correct solutions, but it provides each unit network with P(D|Hi), which can be used as a confidence level. A unit network with a higher P(D|Hi) has a higher confidence of being correctly inferred (Figure 5). Previously, Basso et al. [11] validated their ARACNe algorithm using a 19-node synthetic network; with a sample size of 350, the sensitivity and complementary specificity are approximately 80% and 20%, respectively, so the inferred interactions from their method contain approximately 20% false positive and false negative interactions, respectively. With our method, it is possible to locate false positive or false negative interactions using P(D|Hi), which is the unique property of the BOLS algorithm. Our in silico experiment shows that the performance depends on the number of data points, the number of genes, and the noise level. Further study will be required to investigate the relationship between these parameters and the performance, so that the BOLS algorithm can be applied generally to microarray data with any number of genes, data points, and noise level.

Another evaluation is conducted with Yeast Saccharomyces cerevisiae data for 20 genes involved in cell-cycle regulation. In the output network, a noticeable number of interactions are identified as physical or genetic interactions in the literature. Several other interactions are not found in the literature; however, the descriptions of these genes in the BioGRID database give enough evidence that some of them interact indirectly. Thus, this experiment shows that the BOLS algorithm is able to elucidate a remarkable number of known physical or genetic events.

For both evaluation experiments, with synthetic and Yeast expression data, we compare the performance of the BOLS and SBL algorithms. The SBL algorithm [18, 19, 24] is a general Bayesian framework for obtaining sparse solutions with linear models. This method is known as a type-II maximum likelihood method [24], in which the solutions are obtained by maximizing the marginal likelihood. In contrast, BOLS uses Bayesian model selection, an extension of maximum likelihood model selection in which the posterior is obtained by multiplying the best-fit likelihood by the Occam's factor. From both evaluation experiments, it is concluded that BOLS produces sparser solutions with fewer false positives than SBL does.


1. Gene regulatory network model and Unit networks

To study GRNs, we choose a DBN as our GRN model. This model is described by

$$e_i(t+1) = \sum_{j=1}^{K} w_{ij}\, e_j(t) + \xi_i(t), \quad t = 1, 2, \ldots, N-1.$$    (1)

Here N is the number of data points, K is the number of genes in the data, and ξi(t) is a noise term. The ei(t) are the mRNA levels at time t, which influence the expression levels of the genes at the next time point. The value wij describes the interaction strength between the jth gene and the ith gene. This model has the first-order Markov property: the future expression level ei(t+1) is independent of the past expression level ei(t-1) given the present expression level ei(t). As briefly mentioned in the previous section, the data set must be reorganized into a linear system to reverse engineer a GRN with the BOLS algorithm. This GRN model can easily be rewritten as the linear system

Y i = Ew i + n i (2)

where i = 1, 2,..., K. wi is a column vector of regulation strength values, defined as wi = [wi1, wi2,..., wiK]T. Yi is a column vector of expression levels for the target gene, defined as Yi = [ei(2), ei(3),..., ei(N)]T. E is an (N-1) × K design matrix, defined as E = [e1, e2,..., eK] with ei = [ei(1), ei(2),..., ei(N-1)]T. ni is a Gaussian noise vector, defined as ni = [ni(1), ni(2),..., ni(N-1)]T.
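The reorganization of Eq. 1 into the linear system of Eq. 2 can be sketched in a few lines of code. This is a minimal illustration only; the function name and the layout of the expression matrix are our own choices, not part of the original method.

```python
import numpy as np

def build_linear_system(expr, i):
    """Assemble Y_i and E of Eq. 2 from a K x N expression matrix.

    expr[j, t] holds the level of gene j+1 at time point t+1.
    Returns the target vector Y_i (length N-1) and the (N-1) x K
    design matrix E whose row t is [e_1(t), ..., e_K(t)].
    """
    K, N = expr.shape
    Y_i = expr[i, 1:]           # [e_i(2), ..., e_i(N)]^T
    E = expr[:, :N - 1].T       # time points 1 .. N-1 as rows
    return Y_i, E

# Toy example: 3 genes observed over 5 time points
rng = np.random.default_rng(0)
expr = rng.normal(size=(3, 5))
Y, E = build_linear_system(expr, i=0)
```

Each of the K regressions of Eq. 2 then reuses the same design matrix E with a different target vector Yi.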

It should be noted that the expression levels ei(t) in Eqs. 1 and 2 are the same noisy values. Eq. 1 describes how the current expression levels ei(t) are determined by the previous levels ei(t-1) and the noises ξi(t-1), so that the expression levels evolve over time according to the GRN. Hence, ξi(t) is the noise added to the ei(t) during the generation of synthetic or real expression levels under the DBN model. Once the "noisy" expression levels are available, we consider Eq. 2 for reverse engineering of the GRN. Because the given expression levels ei(t) in Eq. 2 are noisy, we should have |Yi - Ewi| = |ni| > 0. If ni = 0, we obtain over-fitting solutions, in which the model fit oscillates widely so as to fit the noise. Thus, ni is a noise related to the "data misfit", or the "confidence interval" on the best-fit parameters, whereas ξi(t) is a noise related to the "generation" of the expression levels. If ni is modeled as zero-mean Gaussian noise with standard deviation σn, the probability of the data given the parameters wi is

$$P(D|w_i, \beta) = \frac{\exp\left(-\beta E_D(D|w_i)\right)}{Z_D(\beta)}$$

where β = 1/σn2, ED = (Yi - Ewi)T(Yi - Ewi), and ZD = (2π/β)N/2. P(D|wi, β) is called the likelihood. It is well known that maximum likelihood estimation in this setting is an underdetermined and ill-conditioned problem, which motivates us to develop a novel strategy to overcome these difficulties.

We decompose the entire network into a set of small networks, defined as unit networks. Each unit network consists of one particular target gene and its regulator genes, and is used as the input and output of the BOLS algorithm. Figure 2a presents an input unit network that includes all genes in the data set as regulator candidates. Figure 2b presents an output unit network that contains the most probable regulator genes for the target gene. We can thus decompose the GRN inference into several separate regressions and apply the BOLS algorithm to each unit network. Therefore, for a GRN with K genes, we run the algorithm K times, producing one unit network each time. All unit networks can then be used to create the whole GRN, which consists of two finite sets, a set of nodes (genes) and a set of edges (interactions), such that each edge connects two nodes. With all (or selected) unit networks, the generation of the GRN is easily generalized by constructing a K × K graph matrix G = [gi,j] with binary elements

$$g_{i,j} = \begin{cases} 1 & \text{if gene } i \text{ and gene } j \text{ have a regulatory relationship} \\ 0 & \text{otherwise.} \end{cases}$$

Furthermore, this matrix induces a GRN in which nodes correspond to genes and an edge joins node i and node j if and only if gi,j = 1. For each edge, we store the information of the unit network it belongs to; this information is easily obtained from the unit networks. Thus, combining all unit networks into the whole GRN is an easy and straightforward procedure.
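Assembling the graph matrix G from the per-target unit networks can be sketched as follows. The dictionary representation of a unit network here is a hypothetical encoding of BOLS output, chosen only for illustration.

```python
import numpy as np

def combine_unit_networks(unit_networks, K):
    """Build the K x K binary graph matrix G from unit networks.

    unit_networks: dict mapping each target gene i to an iterable of
    its inferred regulator genes j (a hypothetical representation of
    the BOLS output, one entry per unit network).
    """
    G = np.zeros((K, K), dtype=int)
    for target, regulators in unit_networks.items():
        for reg in regulators:
            G[target, reg] = 1   # g_{i,j} = 1: regulatory relationship
    return G

# Example: 4 genes; gene 0 is regulated by genes 1 and 2, gene 3 by gene 0
G = combine_unit_networks({0: [1, 2], 3: [0]}, K=4)
```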

2. Bayesian orthogonal least squares algorithm

As briefly described in the background section, reverse engineering with a linear system and limited data has to overcome two difficulties. In this section, we describe our efforts to overcome these challenges by developing the BOLS algorithm.

The system is referred to as underdetermined when the number of parameters is larger than the number of available data points, so that standard least squares techniques break down. This issue can be solved with the OLS method [22], which decomposes the design matrix into two matrices using Gram-Schmidt orthogonalization:

E = XU

where E = [e1, e2,..., eK] and X = [x1, x2,..., xK] are (N-1) × K matrices and U is a K × K upper triangular matrix with 1's on the diagonal and 0's below the diagonal, that is,

$$U = \begin{bmatrix} 1 & u_{12} & u_{13} & \cdots & u_{1K} \\ 0 & 1 & u_{23} & \cdots & u_{2K} \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & 1 & u_{K-1,K} \\ 0 & 0 & \cdots & 0 & 1 \end{bmatrix}.$$

Let w be the regression parameter vector inferred with E and g the regression parameter vector inferred with X. Then g and w satisfy the triangular system

g = Uw.

The computational procedure of the Gram-Schmidt method is

$$\begin{aligned} x_1 &= e_1, \\ u_{ij} &= x_i^T e_j / (x_i^T x_i), & 1 \le i < j, \\ x_j &= e_j - \sum_{i=1}^{j-1} u_{ij} x_i, & 2 \le j \le K. \end{aligned}$$
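The classical Gram-Schmidt update rules above translate directly into code. The sketch below is our own illustration, assuming a well-conditioned design matrix; the function name is hypothetical.

```python
import numpy as np

def gram_schmidt(E):
    """Classical Gram-Schmidt decomposition E = X U of the design matrix.

    The columns of X are mutually orthogonal and U is unit upper
    triangular, following the update rules of the OLS method.
    """
    n, K = E.shape
    X = np.zeros((n, K))
    U = np.eye(K)
    X[:, 0] = E[:, 0]
    for j in range(1, K):
        x = E[:, j].astype(float)
        for i in range(j):
            # u_ij = x_i^T e_j / (x_i^T x_i)
            U[i, j] = X[:, i] @ E[:, j] / (X[:, i] @ X[:, i])
            x = x - U[i, j] * X[:, i]
        X[:, j] = x
    return X, U

rng = np.random.default_rng(1)
E = rng.normal(size=(6, 4))
X, U = gram_schmidt(E)
assert np.allclose(E, X @ U)   # the decomposition reproduces E
```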

Because xi and xj (i ≠ j) are orthogonal to each other, the sum of squares of Y can be written as

$$Y^T Y = \sum_{i=1}^{K} (x_i^T x_i)\, g_i^2 + n^T n.$$

The variance of Y is then

$$\frac{Y^T Y}{N-1} = \sum_{i=1}^{K} \frac{(x_i^T x_i)\, g_i^2}{N-1} + \frac{n^T n}{N-1}.$$

It is noticed that Σi(xiTxi)gi2/(N-1) is the part of the variance of Y contributed by the regressors and nTn/(N-1) is the noise (or unexplained) variance of Y. Hence, (xiTxi)gi2/(N-1) is the increment to the variance of Y contributed by wi, and the error reduction ratio due to xi alone can be defined as

[NError]i = (xiTxi)gi2/(YTY), 1 ≤ i ≤ K.

This error term provides a simple and efficient measure for seeking a subset of significant regression parameters in a forward-regression manner. The regressor selection procedure from Chen et al. [22] is summarized as follows:

  1. At the first step, for 1 ≤ i ≤ K, compute

$$x_1^{(i)} = e_i, \qquad g_1^{(i)} = \frac{(x_1^{(i)})^T Y}{(x_1^{(i)})^T x_1^{(i)}}, \qquad [NError]_1^{(i)} = \frac{(g_1^{(i)})^2\, (x_1^{(i)})^T x_1^{(i)}}{Y^T Y}.$$

Find $[NError]_1^{(i_1)} = \max\{[NError]_1^{(i)},\ 1 \le i \le K\}$ and select $x_1 = x_1^{(i_1)} = e_{i_1}$.

  2. At the jth step (j ≥ 2), for 1 ≤ i ≤ K, i ≠ i1,..., i ≠ ij-1, compute

$$u_{mj}^{(i)} = \frac{x_m^T e_i}{x_m^T x_m}, \quad 1 \le m < j,$$

$$x_j^{(i)} = e_i - \sum_{m=1}^{j-1} u_{mj}^{(i)} x_m,$$

$$g_j^{(i)} = \frac{(x_j^{(i)})^T Y}{(x_j^{(i)})^T x_j^{(i)}}, \qquad [NError]_j^{(i)} = \frac{(g_j^{(i)})^2\, (x_j^{(i)})^T x_j^{(i)}}{Y^T Y}.$$

Find $[NError]_j^{(i_j)} = \max\{[NError]_j^{(i)},\ 1 \le i \le K,\ i \ne i_1, \ldots, i \ne i_{j-1}\}$ and select

$$x_j = x_j^{(i_j)} = e_{i_j} - \sum_{m=1}^{j-1} u_{mj} x_m,$$

where $u_{mj} = u_{mj}^{(i_j)}$, 1 ≤ m < j.

  3. The OLS procedure is terminated at the Ks-th step when

$$1 - \sum_{m=1}^{K_s} [NError]_m < \rho,$$

where 0 < ρ < 1 is a chosen tolerance.
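The three selection steps above can be sketched in a few lines. This is a minimal illustration of the forward-selection loop under the notation of this section, not the authors' implementation; the function name and default tolerance are our own choices.

```python
import numpy as np

def ols_forward_select(E, Y, rho=1e-3):
    """Forward regressor selection by error-reduction ratio [NError].

    At each step the remaining candidate column of E that maximizes
    [NError] is orthogonalized against the already selected regressors
    and added, until 1 - sum_m [NError]_m < rho.
    """
    n, K = E.shape
    selected, X_sel = [], []
    yty = Y @ Y
    total = 0.0
    for _ in range(K):
        best_i, best_err, best_x = None, -1.0, None
        for i in range(K):
            if i in selected:
                continue
            x = E[:, i].astype(float)
            for xm in X_sel:                       # orthogonalize against chosen x_m
                x = x - (xm @ E[:, i]) / (xm @ xm) * xm
            xtx = x @ x
            if xtx < 1e-12:                        # numerically dependent column
                continue
            g = (x @ Y) / xtx
            err = g * g * xtx / yty                # [NError]_j^(i)
            if err > best_err:
                best_i, best_err, best_x = i, err, x
        if best_i is None:
            break
        selected.append(best_i)
        X_sel.append(best_x)
        total += best_err
        if 1.0 - total < rho:                      # stopping criterion
            break
    return selected

# Example: Y depends on column 2 only, so that column is selected first
rng = np.random.default_rng(2)
E = rng.normal(size=(20, 5))
Y = 3.0 * E[:, 2]
picked = ols_forward_select(E, Y)
```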

We assume that we do not have any information about the noise level in the data, so we completely over-fit the data to the model using the OLS method with ρ << 1 (ρ = 1.0e-3 in this study). We then remove unnecessary parameters to deal with the ill-conditioned problem, using a second-order-derivative network pruning technique and a Bayesian model comparison framework. We can obtain the optimal solution by trading off the complexity of the model against the data misfit [20]. We start this procedure with an extremely small value of the data misfit by completely over-fitting the data to the model. As the complexity of the model is reduced, i.e. as the number of effective parameters is reduced, the data misfit increases. The optimal model complexity for the "true solution" is decided using a Bayesian model comparison framework that assigns a preference to a model Hi of a certain complexity: the so-called evidence P(D|Hi). The evidence is obtained by multiplying the best-fit likelihood by the Occam's factor,

$$P(D|H_i) = \int P(D|g_i, H_i)\, P(g_i|H_i)\, dg \cong P(D|g_{MP}, H_i)\, P(g_{MP}|H_i)\, (2\pi)^{k/2} \det{}^{-1/2} A$$

where P(D|gMP, Hi) corresponds to the best-fit likelihood, P(gMP|Hi)(2π)k/2det-1/2A corresponds to the Occam's factor, A = -∂2logP(g|D, Hi)/∂g2, and gMP represents the most probable parameters g. The Occam's factor is equal to the ratio of the posterior accessible volume of Hi's parameter space to the prior accessible volume, or the factor by which Hi's hypothesis space collapses when the data are collected [20]. The model Hi can be viewed as consisting of a certain number of exclusive sub-models, of which only one is chosen when the data are collected. The Occam's factor is a measure of the complexity of the model that depends not only on the number of parameters in the model but also on the prior probability of the model. Therefore, the over-fitting solution can be avoided by using the Bayesian model comparison framework, because the Bayesian Occam's factor ensures the optimal complexity of the model. See MacKay [20] for more details on the Occam's factor.

With second-order-derivative network pruning [23], we can select the parameters to be eliminated first. Our goal here is to find the set of parameters whose deletion causes the least increase of the cost function C,

$$C = \frac{\beta}{2} (Y - Xg)^T (Y - Xg) + \frac{\alpha}{2} g^T g,$$    (3)

where α and β are hyper-parameters that measure the complexity of the model and the data misfit, respectively. Iterative formulae to estimate α and β for a given data set and model structure are shown later in this section (Eqs. 6 and 7). Using the second-order-derivative network pruning method, we can derive the saliency equation

$$L_j = \frac{1}{2} \left( \frac{g_j^2}{A_{jj}^{-1}} \right) - \alpha \left( \frac{g_j}{A_{jj}^{-1}} \right) \left( g^T A^{-1} v_j \right) - \frac{\alpha^2}{2} \left[ g^T A^{-1} g - \frac{\left( g^T A^{-1} v_j \right)^2}{A_{jj}^{-1}} \right]$$    (4)

where Ajj-1 is the jth diagonal component of the inverse of A, A = ∂2C/∂g2, and vj is the unit vector in parameter space whose jth component is one and whose other components are zero. It should be noted that A and A-1 are diagonal matrices because of the decomposition of the design matrix E by Gram-Schmidt orthogonalization. With Eq. 4 we can select the parameters whose elimination produces the least increase of the cost function C.
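Because A is diagonal here, the term gTA-1vj in Eq. 4 reduces to gj/Ajj, and all saliencies can be computed at once. The vectorized sketch below is our own illustration of Eq. 4 under that diagonal assumption; the function name is hypothetical.

```python
import numpy as np

def saliencies(g, alpha, a_diag):
    """Saliency L_j of Eq. 4 for a diagonal A.

    a_diag holds the diagonal of A, so A^{-1} = diag(1/a_diag) and
    g^T A^{-1} v_j = g_j / a_diag[j].
    """
    a_inv = 1.0 / a_diag                     # diagonal of A^{-1}
    quad = g @ (a_inv * g)                   # g^T A^{-1} g
    L = (0.5 * g**2 / a_inv                  # first term of Eq. 4
         - alpha * (g / a_inv) * (g * a_inv)
         - 0.5 * alpha**2 * (quad - (g * a_inv) ** 2 / a_inv))
    return L

# The parameter with the smallest L_j is deleted first
g = np.array([3.0, 0.01, 2.0])
L = saliencies(g, alpha=0.5, a_diag=np.array([10.0, 10.0, 10.0]))
```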

With a Bayesian framework [20] we can compare alternative models as the model structure keeps changing during network pruning. Suppose we have a data set D = [Y, X], where Y = [y(1), y(2),..., y(N)]T is the target data set, X = [x1, x2,..., xK] is the N × K design matrix, and xi = [xi(1), xi(2),..., xi(N)]T is a column of X. The regression parameters we want to infer are g = [g1, g2,..., gK]T. The log posterior probability of the data D, given α and β, can be derived [20] as

$$\log P(D|\alpha, \beta, H_i) = -\frac{\beta}{2} (Y - Xg_{MP})^T (Y - Xg_{MP}) - \frac{\alpha}{2} g_{MP}^T g_{MP} - \frac{1}{2} \log |A| + \frac{k}{2} \log \alpha + \frac{N}{2} \log \beta - \frac{N}{2} \log (2\pi)$$    (5)

where the subscript MP denotes Most Probable. The evidence P(D|Hi) is obtained by marginalizing the probability defined in Eq. 5 over the hyper-parameters α and β. Before estimating the evidence P(D|Hi), we have to find the most probable values of the hyper-parameters. Differentiating Eq. 5 with respect to α and β and rearranging gives formulae for the iterative re-estimation of α and β [20],

$$\hat{\alpha} := \frac{\gamma}{g^T g}$$    (6)

$$\hat{\beta} := \frac{N - \gamma}{(Y - Xg)^T (Y - Xg)}$$    (7)

where γ = K - α Trace(A-1) is the number of well-determined parameters, g = βA-1XTY, N is the number of data points, and K is the number of variables (genes). To rank alternative structures (or complexities) of the model in the light of the data set D, we evaluate the evidence by marginalizing the posterior probability P(D|α, β, Hi) over α and β,

P(D|Hi) = ∫∫P(D|α, β, Hi)P(α, β)dαdβ.
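The re-estimation formulae of Eqs. 6 and 7 can be iterated to a fixed point. The sketch below is our own minimal illustration, assuming the columns of X have already been orthogonalized (so A = βXTX + αI is diagonal); the function name and the arbitrary starting values are our own choices.

```python
import numpy as np

def reestimate_hyperparams(X, Y, n_iter=100):
    """Iterative re-estimation of alpha (Eq. 6) and beta (Eq. 7).

    Assumes mutually orthogonal columns of X, as produced by
    Gram-Schmidt, so that A = beta * X^T X + alpha * I is diagonal.
    """
    N, K = X.shape
    xtx = np.sum(X * X, axis=0)                   # diag(X^T X)
    xty = X.T @ Y
    alpha, beta = 1.0, 1.0                        # arbitrary starting values
    for _ in range(n_iter):
        a_diag = beta * xtx + alpha               # diag(A)
        g = beta * xty / a_diag                   # g = beta * A^{-1} X^T Y
        gamma = K - alpha * np.sum(1.0 / a_diag)  # well-determined parameters
        resid = Y - X @ g
        alpha = gamma / (g @ g)                   # Eq. 6
        beta = (N - gamma) / (resid @ resid)      # Eq. 7
    return alpha, beta, g

# Synthetic check: orthogonal design, known noise level sigma_n = 0.1
rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(200, 3)))
X = 5.0 * Q                                       # orthogonal, non-unit columns
Y = X @ np.array([1.0, -2.0, 3.0]) + 0.1 * rng.normal(size=200)
alpha, beta, g = reestimate_hyperparams(X, Y)
```

On this synthetic example the re-estimated β should recover roughly 1/σn2 and g the generating coefficients.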

We have very little prior information about α and β. When the available prior information is minimal, the learning process is often started with an objective prior probability. This uninformative prior is referred to as a "vague prior" for a parameter ranging from 0 to ∞, i.e. a flat prior [87]. This prior probability can be left out when we compare alternative models. With this prior, the marginalization of P(D|α, β, Hi) over α and β can be estimated using Gaussian integration [20],

$$P(D|H_i) \approx P(D|\hat{\alpha}, \hat{\beta}, H_i) \cdot P(\log \hat{\alpha}, \log \hat{\beta}) \cdot \sqrt{2\pi}\, \sigma_{\log \alpha | D} \cdot \sqrt{2\pi}\, \sigma_{\log \beta | D} \propto P(D|\hat{\alpha}, \hat{\beta}, H_i) \cdot \sqrt{2\pi}\, \sigma_{\log \alpha | D} \cdot \sqrt{2\pi}\, \sigma_{\log \beta | D},$$    (8)

where σlogα|D and σlogβ|D are the error bars on logα and logβ, found by differentiating Eq. 5 twice:

$$\sigma_{\log \alpha | D}^2 \cong \frac{2}{\gamma}, \qquad \sigma_{\log \beta | D}^2 \cong \frac{2}{N - \gamma}.$$
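Putting Eqs. 5 and 8 together, the log evidence of a candidate model can be sketched as below. This is our own illustration: the prior P(logα̂, logβ̂) is dropped since it is common to all models, and the diagonal form of A after Gram-Schmidt is assumed.

```python
import numpy as np

def log_evidence(X, Y, alpha, beta, g):
    """log P(D|H_i): Eq. 5 at the re-estimated hyper-parameters plus
    the sqrt(2*pi)*sigma terms of Eq. 8 (flat prior over log alpha and
    log beta dropped, being the same for all models)."""
    N, k = X.shape
    a_diag = beta * np.sum(X * X, axis=0) + alpha      # diag(A)
    gamma = k - alpha * np.sum(1.0 / a_diag)           # well-determined parameters
    resid = Y - X @ g
    logp = (-0.5 * beta * (resid @ resid)              # data misfit
            - 0.5 * alpha * (g @ g)                    # weight penalty
            - 0.5 * np.sum(np.log(a_diag))             # -1/2 log|A|
            + 0.5 * k * np.log(alpha)
            + 0.5 * N * np.log(beta)
            - 0.5 * N * np.log(2.0 * np.pi))
    logp += 0.5 * np.log(2.0 * np.pi * 2.0 / gamma)          # sigma_log-alpha term
    logp += 0.5 * np.log(2.0 * np.pi * 2.0 / (N - gamma))    # sigma_log-beta term
    return logp

# Example: evidence of a 2-parameter model on a small orthogonal design
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Y = np.array([1.1, 2.0, 0.9, 2.1])
ev = log_evidence(X, Y, alpha=0.1, beta=10.0, g=np.array([1.0, 2.05]))
```

Comparing this quantity across the pruned candidate networks ranks them by P(D|Hi).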

In this study, we create a novel reverse engineering algorithm for linear systems with K genes using the three techniques described above. The algorithm is run K times, so that every gene in the data set is considered as a target gene once. The BOLS algorithm for the construction of one unit network is summarized as follows:

  1. Set a certain gene as the target gene and the remaining genes as regulator candidates (input).

  2. Over-fit the data to Eq. 2 using OLS.

  3. While the number of parameters is greater than 1:

    3.1 Estimate α and β with the iterative re-estimation formulae, Eqs. 6 and 7.

    3.2 Compute P(D|Hi) for the current network Hi with Eq. 8.

    3.3 Find the parameter gj that gives the smallest Lj by Eq. 4 and delete gj.

  4. Select the network with the maximum P(D|Hi) as the output unit network.


  1. Altman RB, Raychaudhuri S: Whole genome expression analysis: challenges beyond clustering. Curr Opin Struct Biol. 2001, 11: 340-347. 10.1016/S0959-440X(00)00212-8.


  2. Brown MP, Grundy WN, Lin D, Cristianini N, Sugnet CW, Furey TS, Ares M, Haussler D: Knowledge-based analysis of microarray gene expression data by using support vector machines. Proc Natl Acad Sci USA. 2000, 97: 262-267. 10.1073/pnas.97.1.262.


  3. Hughes JD, Estep PW, Tavazoie S, Church GM: Computational identification of cis-regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae. J Mol Biol. 2000, 296: 1205-1214. 10.1006/jmbi.2000.3519.


  4. Jansen R, Greenbaum D, Gerstein M: Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002, 12: 37-46. 10.1101/gr.205602.


  5. Niehrs C, Pollet N: Synexpression groups in eukaryotes. Nature. 1999, 402: 483-487. 10.1038/990025.


  6. Kauffman SA: Metabolic stability and epigenesist in randomly constructed genetic nets. J Theor Biol. 1969, 22: 437-467. 10.1016/0022-5193(69)90015-0.


  7. Shmulevich I, Dougherty E, Kim S, Zhang W: Probabilistic Boolean networks: a rule-based uncertainty model for gene regulatory networks. Bioinformatics. 2002, 18: 261-274. 10.1093/bioinformatics/18.2.261.


  8. Hartemink AJ, Gifford DK, Jaakkola TS, Young RA: Using graphical models and genomic expression data to statistically validate models of genetic regulatory networks. Pac Symp Biocomputing. 2001, 6: 422-433.


  9. Husmeier D: Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks. Bioinformatics. 2003, 19: 2271-2282. 10.1093/bioinformatics/btg313.


  10. Murphy K, Mian S: Modeling gene expression data using dynamic Bayesian networks. Technical report. 1999, Computer Science Division, University of California, Berkeley


  11. Basso K, Margolin A, Stolovitzky G, Klein U, Dalla-Favera R, Califano A: Reverse engineering of regulatory networks in human B cells. Nat Genet. 2005, 37: 382-390. 10.1038/ng1532.


  12. D'Haeseleer P, Wen X, Fuhrman S, Somogyi R: Linear modeling of mRNA expression levels during CNS development and injury. Pac Symp Biocomputing. 1999, 4: 41-52.


  13. Weaver D, Workman C, Stormo G: Modeling regulatory networks with weight matrices. Pacific Symposium on Biocomputing. 1999, 4: 112-123.


  14. Chen K, Wang T, Tseng H, Huang C, Kao C: A stochastic differential equation model for quantifying transcriptional regulatory network in Saccharomyces cerevisiae. Bioinformatics. 2005, 21: 2883-2890. 10.1093/bioinformatics/bti415.


  15. Yeung SM, Tegner J, Collins J: Reverse engineering gene networks using singular value decomposition and robust regression. Proc Natl Acad Sci USA. 2002, 99: 6163-6168. 10.1073/pnas.092576199.


  16. Wille AZP, Vranova E, Furholz A, Laule O, Bleuler S, Hennig L, Prelic A, von Rohr P, Thiele L, Zitzler E, Gruissem W, Buhlmann P: Sparse graphical Gaussian modeling of the isoprenoid gene network in Arabidopsis thaliana. Genome Biol. 2004, 5: R92-10.1186/gb-2004-5-11-r92.

  17. Deng X, Geng H, Ali H: EXAMINE: A computational approach to reconstructing gene regulatory networks. BioSystems. 2005, 81: 125-136. 10.1016/j.biosystems.2005.02.007.

  18. Chan ZS, Collins L, Kasabov N: Bayesian learning of sparse gene regulatory networks. Biosystems. 2007, 87 (2–3): 299-306. 10.1016/j.biosystems.2006.09.026.

  19. Rogers S, Girolami M: A Bayesian regression approach to the inference of regulatory networks from gene expression data. Bioinformatics. 2005, 21: 3131-3137. 10.1093/bioinformatics/bti487.

  20. MacKay DJC: Bayesian interpolation. Neural Computation. 1992, 4: 415-447.

  21. Cooper S, Shedden K: Microarray analysis of gene expression during the cell cycle. Cell & Chromosome. 2003, 2: 1-10.1186/1475-9268-2-1.

  22. Chen S, Cowan C, Grant P: Orthogonal least squares learning algorithm for radial basis function networks. IEEE Trans Neural Net. 1991, 2: 302-309. 10.1109/72.80341.

  23. Hassibi B, Stork DG: Second order derivatives for network pruning: optimal brain surgeon. Adv Neural Inform Proc Syst. 1993, 5: 164-171.

  24. Tipping ME: Sparse Bayesian learning and the relevance vector machine. Journal of Machine Learning Research. 2001, 1: 211-244. 10.1162/15324430152748236.

  25. Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data. J Comput Biol. 2000, 7: 601-620. 10.1089/106652700750050961.

  26. Wagner A: Estimating coarse gene network structure from large-scale gene perturbation data. Genome Research. 2002, 12: 309-315. 10.1101/gr.193902.

  27. Rice J, Tu Y, Stolovitzky G: Reconstructing biological networks using conditional correlation analysis. Bioinformatics. 2005, 21 (6): 765-773. 10.1093/bioinformatics/bti064.

  28. Matlab code for creating synthetic network. []

  29. A Matlab implementation of "SparseBayes". []

  30. Spellman PT, Sherlock G, Zhang MQ, Iyer V, Anders K, Eisen M, Brown P, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9: 3273-3297.

  31. Chen K, Csikasz-Nagy A, Gyorffy B, Val J, Novak B, Tyson J: Kinetic analysis of a molecular model of the budding yeast cell cycle. Mol Biol Cell. 2000, 11: 369-391.

  32. Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res. 2006, 34 (Database issue): D535-539. 10.1093/nar/gkj109.

  33. BioGRID database: general depository for interaction datasets. []

  34. Breeden LL: Periodic transcription: a cycle within a cycle. Current Biology. 2003, 13: R31-R38. 10.1016/S0960-9822(02)01386-6.

  35. Ahn SH, Tobe BT, Fitz-Gerald JN, Anderson SL, Acurio A, Kron SJ: Enhanced cell polarity in mutants of the budding yeast cyclin-dependent kinase Cdc28p. Mol Biol Cell. 2001, 12 (11): 3689-3560.

  36. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415 (6868): 141-147. 10.1038/415141a.

  37. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532.

  38. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, Yang L, Wolting C, Donaldson I, Schandorff S, Shewnarane J, Vo M, Taggart J, Goudreault M, Muskat B, Alfarano C, Dewar D, Lin Z, Michalickova K, Willems AR, Sassi H, Nielsen PA, Rasmussen KJ, Andersen JR, Johansen LE, Hansen LH, Jespersen H, Podtelejnikov A, Nielsen E, Crawford J, Poulsen V, Sorensen BD, Matthiesen J, Hendrickson RC, Gleeson F, Pawson T, Moran MF, Durocher D, Mann M, Hogue CW, Figeys D, Tyers M: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002, 415 (6868): 180-183. 10.1038/415180a.

  39. Reed SI, Hadwiger JA, Richardson HE, Wittenberg C: Analysis of the Cdc28 protein kinase complex by dosage suppression. J Cell Sci Suppl. 1989, 12: 29-37.

  40. Tyers M, Futcher B: Far1 and Fus3 link the mating pheromone signal transduction pathway to three G1-phase Cdc28 kinase complexes. Mol Cell Biol. 1993, 13 (9): 5659-5669.

  41. Archambault V, Chang EJ, Drapkin BJ, Cross FR, Chait BT, Rout MP: Targeted proteomic study of the cyclin-Cdk module. Mol Cell. 2004, 14 (6): 699-711. 10.1016/j.molcel.2004.05.025.

  42. Ceccarelli E, Mann C: A Cdc28 mutant uncouples G1 cyclin phosphorylation and ubiquitination from G1 cyclin proteolysis. J Biol Chem. 2001, 276 (45): 41725-41732. 10.1074/jbc.M107087200.

  43. Deshaies RJ, Kirschner M: G1 cyclin-dependent activation of p34CDC28 (Cdc28p) in vitro. Proc Natl Acad Sci USA. 1995, 92 (4): 1182-1186. 10.1073/pnas.92.4.1182.

  44. Lanker S, Valdivieso MH, Wittenberg C: Rapid degradation of the G1 cyclin Cln2 induced by CDK-dependent phosphorylation. Science. 1996, 271 (5255): 1597-1601. 10.1126/science.271.5255.1597.

  45. Levine K, Oehlen LJ, Cross FR: Isolation and characterization of new alleles of the cyclin-dependent kinase gene CDC28 with cyclin-specific functional and biochemical defects. Mol Cell Biol. 1998, 18 (1): 290-302.

  46. Lim HH, Loy CJ, Zaman S, Surana U: Dephosphorylation of threonine 169 of Cdc28 is not required for exit from mitosis but may be necessary for start in Saccharomyces cerevisiae. Mol Cell Biol. 1996, 16 (8): 4573-4583.

  47. Miller ME, Cross FR, Groeger AL, Jameson KL: Identification of novel and conserved functional and structural elements of the G1 cyclin Cln3 important for interactions with the CDK Cdc28 in Saccharomyces cerevisiae. Yeast. 2005, 22 (13): 1021-1036. 10.1002/yea.1292.

  48. Peter M, Herskowitz I: Direct inhibition of the yeast cyclin-dependent kinase Cdc28-Cln by Far1. Science. 1994, 265 (5176): 1228-1231. 10.1126/science.8066461.

  49. Wang H, Gari E, Verges E, Gallego C, Aldea M: Recruitment of Cdc28 by Whi3 restricts nuclear accumulation of the G1 cyclin-Cdk complex to late G1. EMBO J. 2004, 23 (1): 180-190. 10.1038/sj.emboj.7600022.

  50. Cross FR: Further characterization of a size control gene in Saccharomyces cerevisiae. J Cell Sci Suppl. 1989, 12: 117-127.

  51. Cross FR, Blake CM: The yeast Cln3 protein is an unstable activator of Cdc28. Mol Cell Biol. 1993, 13 (6): 3266-3271.

  52. Tyers M, Tokiwa G, Nash R, Futcher B: The Cln3-Cdc28 kinase complex of S. cerevisiae is regulated by proteolysis and phosphorylation. EMBO J. 1992, 11 (5): 1773-1784.

  53. Ubersax JA, Woodbury EL, Quang PN, Paraz M, Blethrow JD, Shah K, Shokat KM, Morgan DO: Targets of the cyclin-dependent kinase Cdk1. Nature. 2003, 425 (6960): 859-864. 10.1038/nature02062.

  54. Wijnen H, Landman A, Futcher B: The G(1) cyclin Cln3 promotes cell cycle entry via the transcription factor Swi6. Mol Cell Biol. 2002, 22 (12): 4402-4418. 10.1128/MCB.22.12.4402-4418.2002.

  55. Yaglom J, Linskens MH, Sadis S, Rubin DM, Futcher B, Finley D: p34Cdc28-mediated control of Cln3 cyclin degradation. Mol Cell Biol. 1995, 15 (2): 731-741.

  56. Grandin N, Reed SI: Differential function and expression of Saccharomyces cerevisiae B-type cyclins in mitosis and meiosis. Mol Cell Biol. 1993, 13 (4): 2113-2125.

  57. Surana U, Robitsch H, Price C, Schuster T, Fitch I, Futcher AB, Nasmyth K: The role of CDC28 and cyclins during mitosis in the budding yeast S. cerevisiae. Cell. 1991, 65 (1): 145-161. 10.1016/0092-8674(91)90416-V.

  58. Bailly E, Cabantous S, Sondaz D, Bernadac A, Simon MN: Differential cellular localization among mitotic cyclins from Saccharomyces cerevisiae – a new role for the axial budding protein Bud3 in targeting Clb2 to the mother-bud neck. J Cell Sci. 2003, 116: 4119-4130. 10.1242/jcs.00706.

  59. Honey S, Schneider BL, Schieltz DM, Yates JR, Futcher B: A novel multiple affinity purification tag and its use in identification of proteins associated with a cyclin-CDK complex. Nucleic Acids Res. 2001, 29 (4): E24-10.1093/nar/29.4.e24.

  60. Kaiser P, Moncollin V, Clarke DJ, Watson MH, Bertolaet BL, Reed SI, Bailly E: Cyclin-dependent kinase and Cks-Suc1 interact with the proteasome in yeast to control proteolysis of M-phase targets. Genes Dev. 1999, 13 (9): 1190-1202.

  61. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrin-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MH, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440: 637-643. 10.1038/nature04670.

  62. Ross KE, Kaldis P, Solomon MJ: Activating phosphorylation of the Saccharomyces cerevisiae cyclin-dependent kinase, cdc28p, precedes cyclin binding. Mol Biol Cell. 2000, 11 (5): 1597-1609.

  63. Schwab M, Neutzner M, Mocker D, Seufert W: Yeast Hct1 recognizes the mitotic cyclin Clb2 and other substrates of the ubiquitin ligase APC. EMBO J. 2001, 20 (18): 5165-5175. 10.1093/emboj/20.18.5165.

  64. Shellman YG, Svee E, Sclafani RA, Langan TA: Identification and characterization of individual cyclin-dependent kinase complexes from Saccharomyces cerevisiae. Yeast. 1999, 15 (4): 295-309. 10.1002/(SICI)1097-0061(19990315)15:4<295::AID-YEA377>3.0.CO;2-Z.

  65. Jaspersen SL, Charles JF, Morgan DO: Inhibitory phosphorylation of the APC regulator Hct1 is controlled by the kinase Cdc28 and the phosphatase Cdc14. Curr Biol. 1999, 9 (5): 227-236. 10.1016/S0960-9822(99)80111-0.

  66. Loog M, Morgan DO: Cyclin specificity in the phosphorylation of cyclin-dependent kinase substrates. Nature. 2005, 434 (7029): 104-108. 10.1038/nature03329.

  67. Rudner AD, Hardwick KG, Murray AW: Cdc28 activates exit from mitosis in budding yeast. J Cell Biol. 2000, 149 (7): 1361-1376. 10.1083/jcb.149.7.1361.

  68. Petroski MD, Deshaies RJ: Context of multiubiquitin chain attachment influences the rate of Sic1 degradation. Mol Cell. 2003, 11 (6): 1435-1444. 10.1016/S1097-2765(03)00221-1.

  69. Rice LM, Plakas C, Nickels JT: Loss of meiotic replication block in Saccharomyces cerevisiae cells defective in Cdc28p regulation. Eukaryotic Cell. 2005, 4 (1): 55-62. 10.1128/EC.4.1.55-62.2005.

  70. Skowyra D, Craig KL, Tyers M, Elledge SJ, Harper JW: F-box proteins are receptors that recruit phosphorylated substrates to the SCF ubiquitin-ligase complex. Cell. 1997, 91 (2): 209-219. 10.1016/S0092-8674(00)80403-1.

  71. Verma R, McDonald H, Yates JR, Deshaies RJ: Selective degradation of ubiquitinated Sic1 by purified 26S proteasome yields active S phase cyclin-Cdk. Mol Cell. 2001, 8 (2): 439-448. 10.1016/S1097-2765(01)00308-2.

  72. Geymonat M, Spanos A, Wells GP, Smerdon SJ, Sedgwick SG: Clb6-Cdc28 and Cdc14 regulate phosphorylation status and cellular localization of Swi6. Mol Cell Biol. 2004, 24 (6): 2277-2285. 10.1128/MCB.24.6.2277-2285.2004.

  73. Donaldson AD, Raghuraman MK, Friedman KL, Cross FR, Brewer BJ, Fangman WL: CLB5-dependent activation of late replication origins in S. cerevisiae. Mol Cell. 1998, 2 (2): 173-182. 10.1016/S1097-2765(00)80127-6.

  74. Gibson DG, Aparicio JG, Hu F, Aparicio OM: Diminished S-phase cyclin-dependent kinase function elicits vital Rad53-dependent checkpoint responses in Saccharomyces cerevisiae. Mol Cell Biol. 2004, 24 (23): 10208-10222. 10.1128/MCB.24.23.10208-10222.2004.

  75. Li X, Cai M: Recovery of the yeast cell cycle from heat shock-induced G(1) arrest involves a positive regulation of G(1) cyclin expression by the S phase cyclin Clb5. J Biol Chem. 1999, 274 (34): 24220-24231. 10.1074/jbc.274.34.24220.

  76. Meyn MA, Holloway SL: S-phase cyclins are required for a stable arrest at metaphase. Curr Biol. 2000, 10 (24): 1599-1602. 10.1016/S0960-9822(00)00863-0.

  77. Schwob E, Nasmyth K: CLB5 and CLB6, a new pair of B cyclins involved in DNA replication in Saccharomyces cerevisiae. Genes Dev. 1993, 7 (7): 1160-1175. 10.1101/gad.7.7a.1160.

  78. Smith KN, Penkner A, Ohta K, Klein F, Nicolas A: B-type cyclins CLB5 and CLB6 control the initiation of recombination and synaptonemal complex formation in yeast meiosis. Curr Biol. 2001, 11 (2): 88-97. 10.1016/S0960-9822(01)00026-4.

  79. Stuart D, Wittenberg C: CLB5 and CLB6 are required for premeiotic DNA replication and activation of the meiotic S-M checkpoint. Genes Dev. 1998, 12 (17): 2698-2710.

  80. Cross FR, Jacobson MD: Conservation and function of a potential substrate-binding domain in the yeast Clb5 B-type cyclin. Mol Cell Biol. 2000, 20 (13): 4782-4790. 10.1128/MCB.20.13.4782-4790.2000.

  81. Verma R, Feldman RM, Deshaies RJ: SIC1 is ubiquitinated in vitro by a pathway that requires CDC4, CDC34, and cyclin-CDK activities. Mol Biol Cell. 1997, 8 (8): 1427-1437.

  82. Wasch R, Cross FR: APC-dependent proteolysis of the mitotic cyclin Clb2 is essential for mitotic exit. Nature. 2002, 418 (6897): 556-562. 10.1038/nature00856.

  83. Jacobson MD, Gray S, Yuste-Rojas M, Cross FR: Testing cyclin specificity in the exit from mitosis. Mol Cell Biol. 2000, 20 (13): 4483-4493. 10.1128/MCB.20.13.4483-4493.2000.

  84. Shirayama M, Toth A, Galova M, Nasmyth K: APC(Cdc20) promotes exit from mitosis by destroying the anaphase inhibitor Pds1 and cyclin Clb5. Nature. 1999, 402 (6758): 203-207. 10.1038/46080.

  85. Zachariae W, Schwab M, Nasmyth K, Seufert W: Control of cyclin ubiquitination by CDK-regulated binding of Hct1 to the anaphase promoting complex. Science. 1998, 282 (5394): 1721-1724. 10.1126/science.282.5394.1721.

  86. Queralt E, Igual JC: Functional connection between the Clb5 cyclin, the protein kinase C pathway and the Swi4 transcription factor in Saccharomyces cerevisiae. Genetics. 2005, 171 (4): 1485-1498. 10.1534/genetics.105.045005.

  87. Press SJ: Subjective and Objective Bayesian Statistics: Principles, Models, and Applications. 2002, John Wiley & Sons

Acknowledgements


The author was supported by a postdoctoral fellowship from the Turku Centre for Computer Science during the development of the BOLS algorithm. The author thanks Tapio Salakoski and Mauno Vihinen for helpful discussions.

Author information


Corresponding author

Correspondence to Chang Sik Kim.

Additional information

Authors' contributions

CSK developed the BOLS algorithm, performed the experiments to evaluate it, and drafted and finalized the manuscript.

Electronic supplementary material


Additional file 1: Comparing the performance between BOLS and SBL using the data set generated based on Rogers and Girolami's study – Supplementary Information. This file compares the performance of BOLS and SBL on synthetic data generated following Rogers and Girolami [19]. (DOC 43 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Kim, C.S. Bayesian Orthogonal Least Squares (BOLS) algorithm for reverse engineering of gene regulatory networks. BMC Bioinformatics 8, 251 (2007).
