A Simple Approach to Ranking Differentially Expressed Gene Expression Time Courses through Gaussian Process Regression
 Alfredo A Kalaitzis^{1} and
 Neil D Lawrence^{1}
DOI: 10.1186/1471-2105-12-180
© Kalaitzis and Lawrence; licensee BioMed Central Ltd. 2011
Received: 18 January 2011
Accepted: 20 May 2011
Published: 20 May 2011
Abstract
Background
The analysis of gene expression from time series underpins many biological studies. Two basic forms of analysis recur for data of this type: removing inactive (quiet) genes from the study and determining which genes are differentially expressed. Often these analysis stages are applied disregarding the fact that the data is drawn from a time series. In this paper we propose a simple model for accounting for the underlying temporal nature of the data based on a Gaussian process.
Results
We review Gaussian process (GP) regression for estimating the continuous trajectories underlying gene expression time series. We present a simple approach which can be used to filter quiet genes or, for time series in the form of expression ratios, to quantify differential expression. We assess via ROC curves the rankings produced by our regression framework and compare them to a recently proposed hierarchical Bayesian model for the analysis of gene expression time series (BATS). We compare on both simulated and experimental data, showing that the proposed approach considerably outperforms the current state of the art.
Conclusions
Gaussian processes offer an attractive tradeoff between efficiency and usability for the analysis of microarray time series. The Gaussian process framework offers a natural way of handling biological replicates and missing values and provides confidence intervals along the estimated curves of gene expression. Therefore, we believe Gaussian processes should be a standard tool in the analysis of gene expression time series.
Background
Gene expression profiles give a snapshot of mRNA concentration levels as encoded by the genes of an organism under given experimental conditions. Early studies of this data often focused on a single point in time which biologists assumed to be critical along the gene regulation process after the perturbation. However, the static nature of such experiments severely restricts the inferences that can be made about the underlying dynamical system.
With the decreasing cost of gene expression microarrays, time series experiments have become commonplace, giving a far broader picture of the gene regulation process. Such time series are often irregularly sampled and may involve differing numbers of replicates at each time point [1]. The experimental conditions under which gene expression measurements are taken cannot be perfectly controlled, so the signals of interest are corrupted by noise, either of biological origin or arising through the measurement process.
Primary analysis of gene expression profiles is often dominated by methods targeted at static experiments, i.e. gene expression measured at a single time point, that treat time as an additional experimental factor [1–6]. However, where possible, it would seem sensible to consider methods that can account for the special nature of time-course data. Such methods can take advantage of the particular statistical constraints that are imposed on data that are naturally ordered [7–12].
The analysis of gene expression microarray time series has been a stepping stone to important problems in systems biology such as the genome-wide identification of direct targets of transcription factors [13, 14] and the full reconstruction of gene regulatory networks [15, 16]. A more comprehensive review of the motivations and methods of analysis of time-course gene expression data can be found in [17].
Testing for Expression
Failure to capture the signal in a profile, irrespective of the amount of embedded noise, may be partially due to temporal aggregation effects, meaning that the coarse sampling of gene expression or the sampling rates do not match the natural rates of change in mRNA concentrations [18]. For these reasons, the classification scheme of differential expression in this paper is focused on reaching a high true positive rate (TPR, also called sensitivity or recall) and is to serve as a pre-processing tool prior to more involved analysis of time-course microarray data. In this work we distinguish between two-sample testing and experiments where control and treated cases are directly hybridized on the microarray (for brevity, we refer to experiments with such setups as one-sample testing). The two-sample setup is a common experimental setup in which two groups of sample replicates are used [13, 19], one under the treatment effect of interest and the other being the control group. To recover the most active genes under a treatment, one may then test for the statistical significance of a treated profile being differentially expressed with respect to its control counterpart. Other studies use data from a one-sample setup [11, 12], in which the control and treated cases are directly hybridized on a microarray and the measurements are normalized log fold-changes between the two output channels of the microarray [20]. There, the analogous goal is to test for the statistical significance of having a non-zero signal.
A recent significant contribution to the estimation and ranking of differential expression of time series in a one-sample setup is a hierarchical Bayesian model for the analysis of gene expression time series (BATS) [11, 12], which offers fast computations through exact equations of Bayesian inference, but makes a considerable number of prior biological assumptions to accomplish this (cf. Simulated data).
Gene Expression Analysis with Gaussian Processes
Gaussian processes (GPs) [21, 22] offer an easy-to-implement approach to quantifying the true signal and noise embedded in a gene expression time series, and thus allow us to rank the differential expression of the gene profile. A Gaussian process is the natural generalisation of a multivariate Gaussian distribution to a Gaussian distribution over a specific family of functions, a family defined by a covariance function or kernel, i.e. a measure of similarity between data points. (Roughly speaking, if we view a function as a vector with an infinite number of components, then that function can be represented as a point in an infinite-dimensional space of a specific family of functions, and a Gaussian process as an infinite-dimensional Gaussian distribution over that space.)
In the context of expression trajectory estimation, a Gaussian process coupled with the squared-exponential covariance function (or radial basis function, RBF), a standard covariance function used in regression tasks, makes the reasonable assumption that the underlying true signal in a profile is a smooth function [23], i.e. a function that is infinitely differentiable. This property endows the GP with a large degree of flexibility in capturing the underlying signals without imposing strong modeling assumptions (e.g. number of basis functions), but it may also pick up spurious patterns (false positives) should the time-course profiles suffer from temporal aggregation. From a generative viewpoint, the profiles are assumed to have been corrupted by additive white Gaussian noise. This property makes the GP an attractive tool for bootstrapping simulated biological replicates [24].
In a different context, Gaussian process priors have been used for modeling transcriptional regulation. For example, in [25], using the time-course expression of a priori known direct targets (genes) of a transcription factor, the authors went one step further and inferred the concentration rates of the transcription-factor protein itself, and [26] extended the same model to the case of regulatory repression. The ever-lingering issue of outliers in time series is still critical, but is not addressed here as there is significant literature on this issue in the context of GP regression, which is complementary to this work.
For example, [19, 27] developed a probabilistic model using Gaussian processes with a robust noise model specialised for two-sample testing to detect intervals of differential expression, whereas the present work optionally focuses on one-sample testing, to rank the differential expression and ultimately detect quiet/active genes. Other examples can also be easily applied: [28] use a Student-t distribution as the robust noise model in the regression framework along with variational approximations to make inference tractable, and [29] employ a Student-t observation model with Laplace approximations for inference. The standard GP regression framework is straightforward to use here, with a limited need for manual tweaking of a few hyperparameters. We describe the GP framework, as used here for regression, in more detail in the Methods section.
Results and Discussion
We apply standard Gaussian process (GP) regression and the Bayesian hierarchical model for the analysis of time series (BATS) to two in silico datasets simulated by BATS and GPs, and to one experimental dataset from a study on primary mouse keratinocytes with an induced activation of the TRP63 transcription factor, for which a reverse-engineering algorithm (TSNI: time-series network identification) was developed to infer the direct targets of TRP63 [13].
We assume that each gene expression profile can be categorized as either quiet or differentially expressed. We consider algorithms that provide a rank ordering of the profiles according to which is most likely to be non-quiet (or differentially expressed). Given a ground truth, we can then evaluate the quality of such a ranking and compare different algorithms. We make use of receiver operating characteristic (ROC) curves to evaluate the algorithms. These curves plot the false positive rate on the horizontal axis against the true positive rate on the vertical axis; i.e. the percentage of the total negatives (non-differentially expressed profiles) erroneously classified as positives (differentially expressed) against the percentage of the total positives correctly classified as positives.
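As a minimal illustration of how such a curve can be obtained from a ranking, the Python sketch below sweeps a threshold down a list of scores and accumulates the true and false positive rates. The function names and the toy scores are hypothetical and are not taken from the software used in this paper; tied scores are not given any special treatment.

```python
import numpy as np

def roc_curve_from_scores(scores, labels):
    """Return (FPR, TPR) arrays obtained by thresholding a ranking.

    scores: higher means "more likely differentially expressed".
    labels: 1 for differentially expressed (positive), 0 for quiet (negative).
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    order = np.argsort(-scores, kind="stable")  # best-ranked profile first
    ranked = labels[order]
    tp = np.cumsum(ranked)        # true positives among the top-k profiles
    fp = np.cumsum(1 - ranked)    # false positives among the top-k profiles
    tpr = np.concatenate(([0.0], tp / max(ranked.sum(), 1)))
    fpr = np.concatenate(([0.0], fp / max((1 - ranked).sum(), 1)))
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy ranking of four profiles (hypothetical scores and ground truth).
fpr, tpr = roc_curve_from_scores([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0])
auc = area_under_curve(fpr, tpr)  # 0.75
```

A ranking that places every differentially expressed profile above every quiet one gives an area of 1, whereas a random ranking gives an area close to 0.5.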
From the output of each model a ranking of differential expression is produced and assessed with ROC curves to quantify how well the method performs with respect to each of the three ground truths (BATS-sampled, GP-sampled, TSNI-experimental). The BATS model can employ three different noise models, where the marginal distribution of the error is assumed to be either Gaussian, Student-t or double exponential. For the following comparisons we plot four ROC curves, one for each noise model of BATS and one for the GP. We demonstrate that the ranking of the GP framework outperforms that of BATS with respect to the TSNI ranking on the experimental data and on GP-sampled profiles.
Simulated data
The first set of in silico profiles is simulated by the BATS software (http://www.na.iac.cnr.it/bats/) in accordance with the guidelines given in [12]. In BATS [11] each time-course profile is assumed to be generated by a function expanded in an orthonormal basis (Legendre or Fourier) plus noise. The number of basis functions and their coefficients are estimated with analytic computations in a fully Bayesian manner. Thus the global estimand for every gene expression trajectory is a linear combination of some number of basis functions whose coefficients are estimated by a posterior distribution. In addition, the BATS framework allows various types of non-Gaussian noise models.
BATS simulation
We reproduce one instantiation of the simulations performed in [11]; specifically, three sets of N = 8000 profiles, each with n = 11 time points and replicated measurements (indexed by gene i = 1, ..., N and time point j = 1, ..., n), following the model defined in [11, sec. 2.2]. In each of the three sets of profiles, 600 out of 8000 are randomly chosen to be differentially expressed (labeled as "1" in the ground truth) and simulated as a sum of orthonormal Legendre basis polynomials with additive i.i.d. (independently and identically distributed) noise.
GP simulation
Parameters of the Gamma distributions Γ(a, b) used for sampling the RBF hyperparameters.

Sampled RBF hyperparameter | a (scale) | b (shape)
ℓ^{2} (characteristic lengthscale) | 1.4 | 5.7
Signal variance | 2.76 | 0.2
Noise variance | 23 | 0.008
The other 7400 non-differentially expressed profiles are simply zero functions with additive white Gaussian noise whose variance equals the sum of two samples from the Gamma distributions for the signal variance and the noise variance. This addition serves to create a non-differentiated profile of comparable scale to the differentiated ones, but nonetheless of completely random nature. Figure 2(d) illustrates the comparison on the GP-sampled data.
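The simulation described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code used for the paper: it assumes that the differentially expressed profiles are draws from a zero-mean GP with a squared-exponential kernel plus white noise, that the 11 time points are equally spaced at 1, ..., 11, and that the table is read literally (a as the Gamma scale, b as the Gamma shape). The names `se_kernel` and `simulate_profiles` are hypothetical.

```python
import numpy as np

def se_kernel(t, lengthscale_sq, signal_var):
    """Squared-exponential covariance between all pairs of time points."""
    sq_dist = (t[:, None] - t[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale_sq)

def simulate_profiles(n_profiles, n_active, t, gamma_params, rng):
    """Simulate GP-sampled active profiles and pure-noise quiet profiles.

    gamma_params maps each hyperparameter name to (shape, scale) of its
    Gamma distribution; which table column is shape or scale is an assumption.
    """
    n = len(t)
    profiles = np.empty((n_profiles, n))
    labels = np.zeros(n_profiles, dtype=int)
    labels[rng.choice(n_profiles, size=n_active, replace=False)] = 1
    for i in range(n_profiles):
        ell2 = rng.gamma(*gamma_params["lengthscale_sq"])
        sf2 = rng.gamma(*gamma_params["signal_var"])
        sn2 = rng.gamma(*gamma_params["noise_var"])
        if labels[i] == 1:
            # Active profile: smooth GP draw plus white Gaussian noise.
            jitter = 1e-8 * (1.0 + sf2)  # small diagonal term for stability
            K = se_kernel(t, ell2, sf2) + jitter * np.eye(n)
            f = np.linalg.cholesky(K) @ rng.standard_normal(n)
            profiles[i] = f + np.sqrt(sn2) * rng.standard_normal(n)
        else:
            # Quiet profile: zero function plus noise of variance sf2 + sn2.
            profiles[i] = np.sqrt(sf2 + sn2) * rng.standard_normal(n)
    return profiles, labels

# (shape, scale) pairs, reading the table as a = scale and b = shape.
gamma_params = {
    "lengthscale_sq": (5.7, 1.4),
    "signal_var": (0.2, 2.76),
    "noise_var": (0.008, 23.0),
}
rng = np.random.default_rng(0)
t = np.arange(1.0, 12.0)  # assumed time grid of 11 equally spaced points
profiles, labels = simulate_profiles(8000, 600, t, gamma_params, rng)
```

If the opposite convention was intended (a as shape, b as scale), the two entries of each tuple in `gamma_params` should be exchanged; the mean of each Gamma distribution is the same under both readings.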
Experimental data
We apply the standard GP regression framework and BATS to an experimental dataset from a study on primary mouse keratinocytes with an induced activation of the TRP63 transcription factor (GEO accession number GSE10562), where a reverse-engineering algorithm (TSNI: time-series network identification) was developed to infer the direct targets of TRP63 [13]. In that study, 786 out of 22690 gene reporters were chosen based on the area under their curves, and ranked by TSNI according to the probability of being direct targets of TRP63. The ranking list was published in a supplementary file available for download (genome.cshlp.org/content/suppl/2008/05/05/gr.073601.107.DC1/DellaGatta_SupTable1.xls) and is used here as a noisy ground truth. We pre-process the data with the robust multi-array average (RMA) expression measure [30], implemented in the "affy" R package.
Discussion
On BATS-sampled data, Figure 2(a, b, c), we observe that the change in the induced noise has a barely noticeable effect on the performance of both methods and that BATS consistently outperforms the GP framework. This performance gap is partially due to the lack of a robust noise model for the GP (cf. Conclusions). Furthermore, there is a modeling bias in the underlying functions of the simulated profiles, which have a small, finite polynomial degree (the maximum degree of the Legendre polynomials is 6). This puts the GP at a disadvantage, as with a squared-exponential covariance function it models general smooth (infinitely differentiable) functions. Consequently, for this simulated dataset the GP is more susceptible to capturing spurious patterns, as they are more likely to lie within its modeling range, whereas for BATS modeling the polynomials with a limited degree acts as a safeguard against spurious patterns, most of which vary rapidly in time.
On GP-sampled data, Figure 2(d), we observe the reversal of the performance gap in favor of the GP framework, while the performance of the GP is almost unaffected. The GP is still prone to non-differentially expressed profiles with spurious patterns and differentially expressed profiles with excessive noise. However, the limited polynomial degree of BATS proves to be inadequate for many of the GP-sampled functions, and the two BATS variants with robust noise models (BATS_{T}, BATS_{DE}) only alleviate the problem slightly. In Figure 4 we observe the GP outperforming the Gaussian noise variant of BATS (BATS_{G}) by a similar margin as in Figure 2(d). The experimental data are much more complex, and the robust BATS variants now offer no increase in performance. Since the ground truth focuses on the 100 most differentially expressed genes with respect to the induction of the TRP63 transcription factor, these results indicate that the GP method of ranking presented here indeed highlights differentially expressed genes and that it naturally features an attractive degree of robustness against different kinds of noise.
Conclusions
We presented an approach to estimating the continuous trajectory of gene expression time series from microarray data through Gaussian process (GP) regression, and to ranking the differential expression of each profile via a log-ratio of marginal likelihoods of two GPs, representing the hypotheses of differential and non-differential expression respectively. We compared our method to a recent Bayesian hierarchical model (BATS) via ROC curves on data simulated by BATS and GPs and on experimental data. Each evaluation was made on the basis of matched percentages to a ground truth, a binary vector which labeled the profiles in a dataset as differentially expressed or not. The experimental data were taken from a previous study on primary mouse keratinocytes, and the top 100 genes of its ranking were used here as the noisy ground truth for the purposes of assessment. The GP framework significantly outperformed BATS on experimental and GP-sampled data, and the results showed that standard GP regression can be regarded as a serious competitor in evaluating the continuous trajectories of gene expression and ranking its differential expression.
The ranking scheme presented here is reminiscent of the work in [19] on two-sample data (separate time-course profiles for each treatment), where the two competing hypotheses are represented in a graphical model of two different generative models connected with a gating scheme: one where the two profiles of the gene reporter are assumed to be generated by two different GPs, and thus the gene is differentially expressed across the two treatments, and one where the two profiles are assumed to be generated by the same GP, and thus the gene is non-differentially expressed. The gating network serves to switch between the two generative models, in time, to detect intervals of differential expression and thus allow biologists to draw conclusions about the propagation of a perturbation in a gene regulatory network. In contrast, the issue addressed in this paper is more basic, and so is the methodology to deal with it. However, we note that the robust mechanisms against outliers used in [19, 28, 29] are complementary to this work and one should not hesitate to incorporate one into a framework similar to ours. Practicalities aside, this paper also provides further evidence that Gaussian processes are, naturally and without much engineering, suited to the analysis of gene expression time series, and that simplicity can still be preferred over the ever-increasing (but sometimes necessary) complexity of hierarchical Bayesian frameworks.
Future work
A natural next step would be to add a robust noise mechanism to our framework; fine examples can be found in [19, 28, 29]. Finally, an interesting biological question concerns the potential periodicity of the underlying signal in a gene expression profile. In this regard a different kind of stationary covariance function, the periodic covariance function [22], can fit a time series generated by a periodic process, and its lengthscale hyperparameter can then be interpreted as the length of the cycle.
Methods
As mentioned earlier, analysing time-course microarray data by means of Gaussian process (GP) regression is not a new idea (cf. Background). In this section we review the methodology for estimating the continuous trajectory of gene expression by GP regression and subsequently describe a likelihood-ratio approach to ranking the differential expression of a profile. The following content is based on the key components of GP theory as described in [21, 22].
The Gaussian process model
The idea is to treat trajectory estimation given the observations (a gene expression time series) as an interpolation problem on functions of one dimension. By assuming the observations have Gaussian-distributed noise, the computations for prediction become tractable and involve only linear algebra.
A finite parametric model
Introducing Bayesian methodology
By computing the marginal likelihood in eq. (8), we can compare or rank different models without fear of overfitting the data or having to explicitly apply a regulariser to the likelihood; the marginal likelihood implicitly penalises overly complex models [21, sec. 5.4].
Notice in eq. (7) how the structure of the covariance implies that choosing a different feature space Φ results in a different K_{y}. Whatever K_{y} is, it must satisfy the following requirements to be a valid covariance matrix of the GP:

Kolmogorov consistency, which is satisfied when K_{ ij }= K(x_{ i }, x_{ j }) for some covariance function K, such that all possible K are positive semidefinite (y^{ ⊤ } Ky ≥ 0).

Exchangeability, which is satisfied when the data are i.i.d. It means that the order in which the data become available has no impact on the marginal distribution, hence there is no need to hold out data from the training set for validation purposes (for measuring generalisation errors, etc.).
Definition of a Gaussian process
More formally, a Gaussian process is a stochastic process (or collection of random variables) over a feature space, such that the distribution p (y(x_{1}), y(x_{2}),..., y(x_{ n } )) of a function y(x), for any finite set of points {x_{1}, x_{2}, ..., x_{ n } } mapped to that space, is Gaussian, and such that any of these Gaussian distributions is Kolmogorov consistent.
The squared-exponential kernel
In this paper we only use the univariate version of the squared-exponential (SE) kernel. Before embarking on its analysis, however, the reader should be aware of the wide variety of existing kernel families, and of potential combinations of them. A comprehensive review of the literature on covariance functions can be found in [21, chap. 4].
Derivation and interpretation of the SE kernel
One can also combine covariance functions as long as they are positive-definite. Examples of valid combined covariance functions include the sum and convolution of two covariance functions. In fact, eq. (14) is the sum of the SE kernel and the covariance function of isotropic Gaussian noise.
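A minimal Python sketch of such a combined covariance is given below. It assumes the standard parameterisation of the SE kernel found in [21], namely signal variance times exp(-(x - x')^2 / (2ℓ^{2})), with the noise variance added on the diagonal; the function names and the numerical values are hypothetical.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale_sq, signal_var):
    """SE covariance between every point in x1 and every point in x2."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale_sq)

def noisy_covariance(x, lengthscale_sq, signal_var, noise_var):
    """Sum of the SE kernel and isotropic (white) Gaussian noise covariance."""
    x = np.asarray(x, dtype=float)
    return se_kernel(x, x, lengthscale_sq, signal_var) + noise_var * np.eye(len(x))

# Irregularly sampled time points, as is common in time-course experiments.
x = np.array([0.0, 1.0, 2.0, 4.0])
K_y = noisy_covariance(x, lengthscale_sq=2.0, signal_var=1.0, noise_var=0.1)

# A valid covariance matrix is symmetric with non-negative eigenvalues.
eigenvalues = np.linalg.eigvalsh(K_y)
```

Because the noise term adds a positive constant to the diagonal, every eigenvalue of the combined matrix is at least as large as the noise variance, which also keeps the matrix well conditioned for the inversions needed later.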
Gaussian process prediction
Predictive equations for GP regression
and K_{ f } = K_{ f } (x, x). These equations can be generalised easily for the prediction of function values at multiple new timepoints by augmenting k_{*} with more columns and k(x*, x*) with more components, one for each new timepoint x*.
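For illustration, the standard predictive equations of GP regression from [21] can be sketched in Python as follows: the predictive mean is k_{*}^{⊤} K_{y}^{-1} y and the predictive covariance is k(x*, x*) - k_{*}^{⊤} K_{y}^{-1} k_{*}. The sketch assumes a zero-mean prior and the SE kernel with additive white noise; the function names are hypothetical, and the linear systems are solved through a Cholesky factor rather than an explicit inverse.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale_sq, signal_var):
    """SE covariance between every point in x1 and every point in x2."""
    sq_dist = (x1[:, None] - x2[None, :]) ** 2
    return signal_var * np.exp(-0.5 * sq_dist / lengthscale_sq)

def gp_predict(x, y, x_new, lengthscale_sq, signal_var, noise_var):
    """Predictive mean and covariance of the noise-free function at x_new."""
    x, y, x_new = (np.asarray(a, dtype=float) for a in (x, y, x_new))
    K_y = se_kernel(x, x, lengthscale_sq, signal_var) + noise_var * np.eye(len(x))
    k_star = se_kernel(x, x_new, lengthscale_sq, signal_var)  # one column per new point
    k_ss = se_kernel(x_new, x_new, lengthscale_sq, signal_var)
    L = np.linalg.cholesky(K_y)                      # K_y = L L^T
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K_y^{-1} y
    v = np.linalg.solve(L, k_star)
    mean = k_star.T @ alpha
    cov = k_ss - v.T @ v
    return mean, cov

# Toy profile: predict on a finer grid and form approximate 95% intervals.
x = np.array([0.0, 1.0, 2.0, 3.0, 5.0])
y = np.sin(x)
x_new = np.linspace(0.0, 5.0, 21)
mean, cov = gp_predict(x, y, x_new, lengthscale_sq=1.0, signal_var=1.0, noise_var=0.01)
std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
lower, upper = mean - 1.96 * std, mean + 1.96 * std
```

The diagonal of the predictive covariance gives the confidence intervals along the estimated curve; to obtain intervals for noisy observations rather than the underlying function, the noise variance would be added to that diagonal.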
Hyperparameter learning
Given the SE covariance function, one can learn the hyperparameters from the data by optimising the log-marginal likelihood function of the GP. In general, a non-parametric model such as the GP can employ a variety of kernel families whose hyperparameters can be adapted with respect to the underlying intensity and frequency of the local signal structure, and interpolate it in a probabilistic fashion (i.e. while quantifying the uncertainty of prediction). The SE kernel allows one to give intuitive interpretations of the adapted hyperparameters, especially for one-dimensional data such as a gene expression time series; see Figure 5 for interpretations of various local optima.
Optimising the marginal likelihood
We use scaled conjugate gradients [32], a standard optimisation scheme, to maximise the log-marginal likelihood (LML).
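The optimisation step can be sketched in Python as below. Two assumptions should be noted: the LML is written in its standard form from [21] for a zero-mean GP (a data-fit term, a log-determinant term and a constant), and SciPy's L-BFGS-B optimiser with numerically approximated gradients is used in place of scaled conjugate gradients, purely for convenience. The function names, bounds and toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def log_marginal_likelihood(log_theta, x, y):
    """LML of a zero-mean GP with SE kernel plus white noise.

    log_theta holds the logs of (lengthscale_sq, signal_var, noise_var),
    which keeps all three hyperparameters positive during optimisation.
    """
    ell2, sf2, sn2 = np.exp(log_theta)
    n = len(x)
    sq_dist = (x[:, None] - x[None, :]) ** 2
    K_y = sf2 * np.exp(-0.5 * sq_dist / ell2) + sn2 * np.eye(n)
    L = np.linalg.cholesky(K_y)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha                  # data-fit term
            - np.sum(np.log(np.diag(L)))      # equals -0.5 * log|K_y|
            - 0.5 * n * np.log(2.0 * np.pi))  # normalisation constant

def fit_hyperparameters(x, y, theta_init):
    """Maximise the LML from one initialisation; returns (theta, LML)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    result = minimize(lambda lt: -log_marginal_likelihood(lt, x, y),
                      x0=np.log(theta_init), method="L-BFGS-B",
                      bounds=[(-5.0, 8.0)] * 3)  # bounds on the log scale
    return np.exp(result.x), -result.fun

# Toy profile: a smooth signal with a little noise (hypothetical data).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 11)
y = np.sin(x / 2.0) + 0.1 * rng.standard_normal(11)
theta_hat, lml_hat = fit_hyperparameters(x, y, theta_init=[1.0, 1.0, 0.1])
```

Since the LML generally has several local optima, the solution returned depends on `theta_init`; running the optimiser from a few different initialisations and keeping the highest LML is a simple safeguard.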
Ranking with likelihood-ratios
based on some initial beliefs, such as the functions having large lengthscales, and optimise the marginal likelihood so that the optimum lengthscale tends to a large value, unless there is evidence to the contrary. Depending on the model, the integral in eq. (27) may be analytically intractable and thus one has to resort to approximating this quantity [33] (e.g. by Laplace approximation) or to using Markov chain Monte Carlo (MCMC) methods to sample from the posterior distribution [34].
where the models usually represent two different hypotheses, namely H_{2}: the profile has a significant underlying signal and thus it is truly differentially expressed, and H_{1}: there is no underlying signal in the profile and the observed gene expression is just the effect of random noise. The ranking is based on how likely H_{2} is in comparison to H_{1}, given a profile.
with each LML being a function of different instantiations of θ. We still maintain hypotheses H_{1} and H_{2} that represent the same notions explained above, but in our case they differ simply by configurations of θ. Specifically, on H_{1} the hyperparameters are fixed to θ_{1} = (∞, 0, var(y))^{⊤} to encode a function constant in time (ℓ^{2} → ∞), with no underlying signal (signal variance equal to 0), which generates a time series with a variance that can be solely explained by noise (noise variance equal to var(y)). Analogously, on H_{2} the hyperparameters θ_{2} are initialised to encode a function that fluctuates in accordance with a typical significant profile (e.g. ℓ^{2} = 20), with a distinct signal variance that solely explains the observed time-series variance (signal variance equal to var(y)) and with no noise (noise variance equal to 0).
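A minimal Python sketch of this ranking score follows, under several assumptions that are not taken from the released code: the score is the difference between the LML at the optimised signal configuration and the LML at the fixed quiet configuration (∞, 0, var(y)); the "no noise" initialisation is approximated by a small positive noise variance so that its logarithm exists; L-BFGS-B stands in for scaled conjugate gradients; and the bounds are arbitrary choices made for numerical stability. The names `lml` and `rank_score` are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def lml(theta, x, y):
    """LML of a zero-mean GP; theta = (lengthscale_sq, signal_var, noise_var)."""
    ell2, sf2, sn2 = theta
    n = len(x)
    sq_dist = (x[:, None] - x[None, :]) ** 2
    K_y = sf2 * np.exp(-0.5 * sq_dist / ell2) + sn2 * np.eye(n)
    L = np.linalg.cholesky(K_y)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.sum(np.log(np.diag(L)))
            - 0.5 * n * np.log(2.0 * np.pi))

def rank_score(x, y, init_lengthscales_sq=(20.0,)):
    """Log-ratio of marginal likelihoods: signal hypothesis vs. quiet hypothesis."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    v = np.var(y)
    # Quiet hypothesis: constant function, no signal, variance is all noise (fixed).
    lml_quiet = lml((np.inf, 0.0, v), x, y)
    # Signal hypothesis: optimised on the log scale within assumed bounds.
    bounds = [(np.log(1e-2), np.log(1e4)),
              (np.log(1e-6 * v), np.log(1e3 * v)),
              (np.log(1e-6 * v), np.log(1e3 * v))]
    best = -np.inf
    for ell2 in init_lengthscales_sq:
        start = np.log([ell2, v, 1e-3 * v])  # signal explains var(y), almost no noise
        res = minimize(lambda lt: -lml(np.exp(lt), x, y), start,
                       method="L-BFGS-B", bounds=bounds)
        best = max(best, -res.fun)           # keep the best local optimum
    return best - lml_quiet

# Toy comparison: a smooth profile versus a pure-noise profile.
rng = np.random.default_rng(1)
x = np.arange(11.0)
score_signal = rank_score(x, 2.0 * np.sin(x / 2.0) + 0.1 * rng.standard_normal(11))
score_noise = rank_score(x, rng.standard_normal(11))
```

Profiles are then sorted by decreasing score, so that those most likely to carry an underlying signal appear at the top of the ranking. Passing several values in `init_lengthscales_sq` gives a simple form of multiple initialisation.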
Local optima of the log-marginal likelihood (LML) function
These two configurations correspond to two points on the LML surface, a function of three hyperparameters, both of which usually lie close to local-optimum solutions. This assumption can be verified empirically by exhaustively plotting the LML function for a number of profiles, see Figure 5. In case the LML contour differs for some profiles, more initialisation points should be used to ensure convergence to the maximum-likelihood solution. Because the configuration of the second hypothesis (no noise, i.e. zero noise variance) is an extremely unlikely scenario, we let θ_{2} adapt to a given profile by optimising the LML function, as opposed to keeping it fixed like θ_{1}.
In most cases the LML (eq. (25)) is not convex. Multiple optima do not necessarily pose a threat here; depending on the data, and as long as they have similar function values, multiple optima present alternative interpretations of the observations. To alleviate the problem of spurious local optimum solutions, however, we make the following observation: when we explicitly restrict the signal variance hyperparameter to low values during optimisation, we also implicitly restrict the noise variance hyperparameter to large values. This occurs because the explanation of the observed data variance (var(y)) is shared between the signal and noise variance hyperparameters, i.e. their sum is approximately var(y). This dependency allows us to treat the three-dimensional optimisation problem as a two-dimensional problem, over the lengthscale ℓ^{2} and the signal-to-noise ratio (SNR, the ratio of signal variance to noise variance), without fear of missing an optimum.
Figure 5 illustrates the marginal likelihood as a function of the characteristic lengthscale ℓ^{2} and the SNR. It features two local optima: one for a small lengthscale and a high SNR, where the observed data are explained with a relatively complex function and a small noise variance, and one for a large lengthscale and a low SNR, where the data are explained by a simpler function with high noise variance. We also notice that the first optimum has a lower LML. This relates to the algebraic structure of the LML (eq. (25)); the first term (dot product) promotes data fit and the second term (determinant) penalizes the complexity of the model [21, sec. 5.4]. Overall, the LML function of the Gaussian process offers a good fitness-complexity trade-off without the need for additional regularisation. Optionally, one can use multiple initialisation points focusing on different non-infinite lengthscales to deal with the multiple local optima along the lengthscale axis, and pick the best solution (max LML) to represent the differential-expression hypothesis in the likelihood-ratio during the ranking stage.
Source code
The source code for the GP regression framework is available as MATLAB code (http://staffwww.dcs.shef.ac.uk/people/N.Lawrence/gp/) and as a package for the R statistical computing language (http://cran.r-project.org/web/packages/gptk/). The routines for the estimation and ranking of the gene expression time series are available upon request for both languages. The time needed to analyse the 22690 profiles in the experimental data, with only the basic two initialisation points of hyperparameters, is about 30 minutes on a desktop running Ubuntu 10.04 with a dual-core CPU at 2.8 GHz and 3.2 GiB of memory.
Declarations
Acknowledgements
The authors would like to thank Diego di Bernardo for his useful feedback on the experimental data. Research was partially supported by an EPSRC Doctoral Training Award, the Department of Neuroscience, University of Sheffield and BBSRC (grant BB/H018123/2).
Authors’ Affiliations
References
 Lönnstedt I, Speed TP: Replicated microarray data. Statistica Sinica 2002, 12: 31–46.Google Scholar
 Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B: Comprehensive identification of cell cycleregulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Molecular biology of the cell 1998, 9(12):3273.PubMed CentralView ArticlePubMedGoogle Scholar
 Friedman N, Linial M, Nachman I, Pe'er D: Using Bayesian networks to analyze expression data. Journal of computational biology 2000, 7(3–4):601–620. 10.1089/106652700750050961View ArticlePubMedGoogle Scholar
 Dudoit S, Yang YH, Callow MJ, Speed TP: Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments. Statistica sinica 2002, 12: 111–140.Google Scholar
 Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. Journal of Computational Biology 2000, 7(6):819–837. 10.1089/10665270050514954View ArticlePubMedGoogle Scholar
 Efron B, Tibshirani R, Storey JD, Tusher V: Empirical Bayes analysis of a microarray experiment. Journal of the American Statistical Association 2001, 96(456):1151–1160. 10.1198/016214501753382129View ArticleGoogle Scholar
 BarJoseph Z, Gerber G, Simon I, Gifford DK, Jaakkola TS: Comparing the continuous representation of timeseries expression profiles to identify differentially expressed genes. Proceedings of the National Academy of Sciences of the United States of America 2003, 100(18):10146. 10.1073/pnas.1732547100PubMed CentralView ArticlePubMedGoogle Scholar
 Ernst J, Nau G, BarJoseph Z: Clustering short time series gene expression data. Bioinformatics 2005, 21(Suppl 1):i159. 10.1093/bioinformatics/bti1022View ArticlePubMedGoogle Scholar
 Storey JD, Xiao W, Leek JT, Tompkins RG, Davis RW: Significance analysis of time course microarray experiments. Proceedings of the National Academy of Sciences of the United States of America 2005, 102(36):12837. 10.1073/pnas.0504609102PubMed CentralView ArticlePubMedGoogle Scholar
 Tai YC, Speed TP: A multivariate empirical Bayes statistic for replicated microarray time course data. The Annals of Statistics 2006, 34(5):2387–2412. 10.1214/009053606000000759View ArticleGoogle Scholar
 Angelini C, De Canditiis D, Mutarelli M, Pensky M: A Bayesian approach to estimation and testing in timecourse microarray experiments. Stat Appl Genet Mol Biol 2007, 6: 24.Google Scholar
 Angelini C, Cutillo L, De Canditiis D, Mutarelli M, Pensky M: BATS: a Bayesian userfriendly software for Analyzing Time Series microarray experiments. BMC bioinformatics 2008, 9: 415. 10.1186/147121059415PubMed CentralView ArticlePubMedGoogle Scholar
 Della Gatta G, Bansal M, AmbesiImpiombato A, Antonini D, Missero C, di Bernardo D: Direct targets of the TRP63 transcription factor revealed by a combination of gene expression profiling and reverse engineering. Genome research 2008, 18(6):939. 10.1101/gr.073601.107PubMed CentralView ArticlePubMedGoogle Scholar
 Honkela A, Girardot C, Gustafson EH, Liu YH, Furlong EEM, Lawrence ND, Rattray M: Modelbased method for transcription factor target identification with limited data. Proceedings of the National Academy of Sciences 2010, 107(17):7793. 10.1073/pnas.0914285107View ArticleGoogle Scholar
 Bansal M, Gatta GD, Di Bernardo D: Inference of gene regulatory networks and compound mode of action from time course gene expression profiles. Bioinformatics 2006, 22(7):815. 10.1093/bioinformatics/btl003View ArticlePubMedGoogle Scholar
 Finkenstadt B, Heron EA, Komorowski M, Edwards K, Tang S, Harper CV, Davis JRE, White MRH, Millar AJ, Rand DA: Reconstruction of transcriptional dynamics from gene reporter data using differential equations. Bioinformatics 2008, 24(24):2901. 10.1093/bioinformatics/btn562PubMed CentralView ArticlePubMedGoogle Scholar
 Bar-Joseph Z: Analyzing time series gene expression data. Bioinformatics 2004, 20(16):2493. 10.1093/bioinformatics/bth283
 Bay SD, Chrisman L, Pohorille A, Shrager J: Temporal aggregation bias and inference of causal regulatory networks. Journal of Computational Biology 2004, 11(5):971–985. 10.1089/cmb.2004.11.971
 Stegle O, Denby KJ, Cooke EJ, Wild DL, Ghahramani Z, Borgwardt KM: A robust Bayesian two-sample test for detecting intervals of differential gene expression in microarray time series. Journal of Computational Biology 2010, 17(3):355–367. 10.1089/cmb.2009.0175
 Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995, 270(5235):467. 10.1126/science.270.5235.467
 Rasmussen CE, Williams CKI: Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). The MIT Press; 2005.
 MacKay DJC: Gaussian Processes. In Information theory, inference, and learning algorithms. Cambridge University Press; 2003:535–548.
 Yuan M: Flexible temporal expression profile modelling using the Gaussian process. Computational Statistics & Data Analysis 2006, 51(3):1754–1764. 10.1016/j.csda.2005.11.017
 Kirk PDW, Stumpf MPH: Gaussian process regression bootstrapping: exploring the effects of uncertainty in time course data. Bioinformatics 2009, 25(10):1300. 10.1093/bioinformatics/btp139
 Lawrence ND, Sanguinetti G, Rattray M: Modelling transcriptional regulation using Gaussian processes. Advances in Neural Information Processing Systems 2007, 19: 785.
 Gao P, Honkela A, Rattray M, Lawrence ND: Gaussian process modelling of latent chemical species: applications to inferring transcription factor activities. Bioinformatics 2008, 24(16):i70. 10.1093/bioinformatics/btn278
 Stegle O, Denby KJ, Wild L, McHattie S, Meade A, Ghahramani Z, Borgwardt KM: Discovering temporal patterns of differential gene expression in microarray time series. In GCB 2009, 133–142.
 Tipping ME, Lawrence ND: Variational inference for Student-t models: Robust Bayesian interpolation and generalised component analysis. Neurocomputing 2005, 69(1–3):123–141. 10.1016/j.neucom.2005.02.016
 Vanhatalo J, Jylänki P, Vehtari A: Gaussian process regression with Student-t likelihood. Neural Information Processing System, Citeseer 2009.
 Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003, 4(2):249. 10.1093/biostatistics/4.2.249
 Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES, Mesirov JP: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences 2005, 102(43):15545. 10.1073/pnas.0506580102
 Möller MF: A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks 1993, 6(4):525–533. 10.1016/S0893-6080(05)80056-5
 MacKay DJC: Comparison of approximate methods for handling hyperparameters. Neural Computation 1999, 11(5):1035–1068. 10.1162/089976699300016331
 Neal RM: Monte Carlo implementation of Gaussian process models for Bayesian regression and classification. Arxiv preprint physics/9701026 1997.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.