Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters
- James Hensman^{1, 2}Email author,
- Neil D Lawrence^{1, 2} and
- Magnus Rattray^{3}Email author
https://doi.org/10.1186/1471-2105-14-252
© Hensman et al.; licensee BioMed Central Ltd. 2013
Received: 13 August 2012
Accepted: 13 August 2013
Published: 20 August 2013
Abstract
Background
Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications.
Results
We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method’s capacity for missing data imputation, data fusion and clustering.The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method’s ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications.
Conclusion
The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in python, and are available from the authors’ website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
Background
Gene expression time course experiments have been used to investigate fundamental biological processes which are often dynamic in nature. For example, the cell cycle [1], cell signalling [2], circadian rhythms [3] and developmental processes [4] have been studied extensively using gene expression time-series data.
Many computational approaches to time-series analysis are not always well suited to gene expression data, where missing measurements are common and time points may not be spaced regularly. In many conventional time-series models such as state-space models [5, 6] there is no straightforward manner to deal with missing data, and time points must occur at regular intervals. Whilst gene expression experiments can be sampled regularly, such designs may not be optimal from a statistical or cost perspective. A method for modelling arbitrarily sampled time points may elicit more information from fewer samples, where time points are selected to capture pertinent temporal features.
Furthermore, existing time-series models do not necessarily capture the structure of gene expression data. Many gene expression time-series are performed with multiple biological replicates: the crude method of simply averaging the replicates may be discarding interesting information. It is also unclear what to do when the replicates are not sampled at the same times. There is a need for a temporal model which deals with the replicate structure.
Our proposed model is based upon two important ideas: Gaussian process (GP) regression allows for parsimonious temporal inference, whilst a hierarchical structure accounts for (temporally structured) covariance between biological replicates. Additional layers can be added to the hierarchy to model more structure in the data. For example, in a data fusion application, a layer of hierarchy can be used to account for differences between gene expression measurement platforms; or in a clustering application, a hierarchical layer can be added to account for temporal covariance of genes within a cluster.
GPs have been successfully applied to the analysis of gene expression time series by several authors [7-9]. There is little doubt that they provide a coherent and principled framework for regression: for an introduction see [10]. Our contribution is to propose hierarchical Gaussian processes to deal with structure in the data. We provide an introduction to the idea, deriving a novel covariance function which accounts for structure. The idea is simple to implement yet highly effective as we demonstrate on several problems. A hierarchical GP model could easily be integrated with existing GP-based applications, allowing them to properly account for replicate structure.
In a further contribution, we manipulate the marginal likelihood expression for the hierarchical GP model for the case where each part of the structure is sampled at the same time, leading to an expression with reduced computational complexity. This situation is most likely to occur during clustering of genes, which must all be measured simultaneously using high throughput methods.
Short time series are prevalent in gene-expression data sets [11]. Our GP-based model is well suited to short time-series, and the behaviour of GPs can be set to mimic that of other temporal models (such as autoregressors) through the covariance function [10], though in this work we use a simple form for the covariance which assures smoothness of the underlying dynamics.
Unlike other time-series based approaches, GPs are not restricted to data which has been sampled at evenly spaced time points. The model therefore removes any restriction on temporal sampling — it can be totally irregular and differ between replicates. This also allows our method to deal with both randomly and systematically missing data. We show how the model can be used for data fusion where the temporal sampling differs between experiments.
Related work
Hierarchical models are an important idea in Bayesian statistics [12], allowing information to be exchanged between related groups of data. The idea is that by performing inference on the structure as a whole, rather than on each part of the structure independently, inference is improved. GPs have been successfully used in models of gene expression time-series before; for example for inferring transcriptional regulation [8], and to identify differential expression in time-series [7, 13]. A key contribution of this work is to combine hierarchical structures with GPs to provide a parsimonious and elegant method for dealing with replicated gene expression time-series.
An alternative to our method was proposed by [14]. In this model, uncertainty is assumed in the time of data collection, and the time-shift in each replicate is estimated. In our model, the times are assumed correct whilst the shift is assumed to occur in the expression. Our model has significant computational advantages, since we can marginalise the shifts in expression analytically under the GP framework, whilst the method proposed in [14] required optimisation of a large number of variables (one for each observation). Further, our model is easily included in more complex GP-based models, such as the clustering application which we shall demonstrate. The estimation of time-shifts would be difficult to incorporate into a clustering method, especially considering the very large numbers of parameters which require optimising.
Clustering expression data while modelling within-cluster variance is one of the primary applications of our model. Previously, [15] proposed a random effects model to account for variance between observations of genes and also within clusters of genes. Further, [16] and [17] explored clustering methods with hierarchical structures to model replicate variance. In these models, replicate variance was modelled as multivariate Gaussian around some gene-specific mean, and the gene’s expression was considered multivariate Gaussian around a cluster-specific mean. This paper presents a similar but more powerful idea: we use a hierarchy of GPs to model gene-specific and replicate-specific temporal covariance. We demonstrate that the introduction of a GP prior makes inference of clusters more viable by reducing the number of parameters required to model the data within a cluster, and we also provide a method for dramatically reducing the computational cost of evaluating clusters under our model. Previous methods for clustering temporal data (e.g. [18]) have not used the replicate structure in the model.
Methods
Background: Gaussian processes
Gaussian processes (GPs) have been used extensively in a variety of regression problems, and have been applied to gene expression time-series by several authors [7-9]. We briefly introduce GP regression and introduce some notation: for an in-depth introduction one may consult [10].
For practicality, a zero-mean function is often assumed; throughout this work, the squared-exponential covariance function will be used: k(t,t^{′})=α exp{−γ(t−t^{′})^{2}}.
and regression follows from the conditional property similarly to (3).
which depends on θ through the covariance matrix K_{t,t}.
A hierarchy across replicates
Gene expression time-series may be collected in multiple replicates, to account for biological variation. The idea is that there exists some common trend, present in all replicates, which we wish to identify, and the measurements made of each replicate vary due to biological differences as well as technical noise.
We shall use the notation y_{ nr } to denote the vector of measurements of gene expression of the n^{th} gene, in the r^{th} biological replicate; these measurements were made at times which we collect into a vector t_{ nr }. The data for the n^{th} gene is denoted ${\mathbf{Y}}_{n}={\left\{{\mathbf{y}}_{\mathit{\text{nr}}}\right\}}_{r=1}^{{N}_{n}}$, ${\mathbf{T}}_{n}={\left\{{\mathbf{t}}_{\mathit{\text{nr}}}\right\}}_{r=1}^{{N}_{n}}$.
Note that the two covariance functions k_{ g } and k_{ f } may in general be different: we have used the squared exponential function for both, with independent parameters. This simple model is illustrated in Figure 1, where the dependent nature of the functions is illustrated, as well as the effects of the hyper-parameters.
Inferences about functions can then be made using the standard methods described above, and hyper-parameters of the covariance functions can be optimised.
These examples demonstrate different behaviours of the time-series which are captured by the model. For the first gene, CG18135, the replicates are quite similar, and most deviation of the data from g_{ n }(t) is attributed to noise. The model attributes 87% of the data’s variation to the underlying function g_{ n }(t), and only 6% each to replicate variance and noise, i.e. the model ‘recognised’ the similarity of the replicates.
The gene represented by the middle row is AP-47, and it can be seen that there is considerable replicate variance: although each replicate follows a similar pattern, the pattern is ‘amplified’ differently in the replicates. Here, the model attributes 60% of the data’s variance to the function g_{ n }(t) and 34% to f_{ nr }(t), with 6% to noise.
The gene represented by the bottom row of Figure 2 is OstStt3. Here, the variances of g_{ n }(t), f_{ nr }(t) and noise are 55%, 36% and 8% respectively. The model recognises the differences in the replicates, but uses a long length scale for f_{ nr }(t). In this gene, the detailed pattern of the time-series is captured entirely by g_{ n }(t), and f_{ nr }(t) is used to account for amplitude shifts between replicates. Note that these cannot be simply ‘normalised out’ because not all replicates cover the same temporal region. These genes were selected using a simple filtering procedure. The model was fitted independently to each gene on a microarray, and the genes were ranked according to the ratio of signal variance (a hyperparameter of k_{ g }) and replicate-plus-noise variance (hyperparameters from k_{ f }).
Deeper hierarchies
With every layer of the hierarchy, we have introduced new parameters corresponding to the covariance function for that layer. Note that the hierarchy can be extended arbitrarily to represent the structure of the data. For example, we might want to model biological variation where the lineage is known, or to model inter-species variation, or to build a hierarchy which reflects the phylogenetic relationship between species.
An efficient model of clusters
Clustering of gene expression time-series is often performed with a view to finding groups of co-regulated or associated genes. The central assumption is that genes which are involved in the same biological processes will be expressed together: they share an underlying time-series.
In order to model a group of genes as defined by a cluster, the hierarchical model is extended to a three-layer hierarchy across the cluster, individual genes and replicates.
Note that the diagonal blocks of Σ_{ i } are themselves block-structured, reflecting the double hierarchy in the model.
where ${\stackrel{\u0304}{\mathbf{y}}}_{i}$ is the mean of the ${\widehat{\mathbf{y}}}_{n}$ in the cluster, D is the length of $\widehat{\mathbf{t}}$, and N_{ i } is the number of genes in the cluster (see appendix for a derivation). This expression has reduced the computational complexity of the model from $\mathcal{O}\left({N}_{i}^{3}{D}^{3}\right)$ to $\mathcal{O}\left({D}^{3}\right)$.
Clustering
To use our model for clustering, the partitioning of genes into clusters needs to be inferred. Dunson [18] proposed a clustering scheme where a GP is used to model the function within a cluster, and a Dirichlet process prior is placed on the partitioning. This leads to a Gibbs sampling scheme where each Gibbs step involves removing a gene from the clustering and then stochastically re-allocating it. Our model potentially improves on Dunson’s formulation since we consider a structured covariance across the genes and replicates (which was treated as iid noise by Dunson), though it would be possible to use the same Gibbs sampling scheme to infer the cluster partitions.
Heller and Ghahramani [19] showed that inference in a DP can be effectively approximated using an agglomerative clustering scheme dubbed Bayesian Hierarchical Clustering (BHC)^{b}. Cooke et al. [20] applied this hierarchical clustering scheme to gene expression time-series with a GP prior, and we extend their work using a hierarchically structured GP to model the clusters, as well as the efficient computation of the marginal likelihood as per equation (15).
The algorithm is depicted in Algorithm 1, and works in a ‘bottom-up’ fashion. Starting with each gene in an individual cluster, the clusters are merged by greedily selecting the merge which maximises the log marginal likelihood of the data (by summing the log marginal likelihood over all clusters). Once no more merges are available to improve the overall marginal likelihood, the hyper-parameters are optimised, and the procedure is repeated with the new covariance function in an EM fashion.
Results and discussion
In a recent study of Drosophila development [21], gene expression was measured in eight replicates measured across six species at differing time-points. 3695 genes (with orthologs across the species) were investigated using Agilent microarrays. Here we focus on Melanogaster development: time courses for typical genes are shown in Figure 2 and 3, with eight replicates at up to thirteen time-points. Note in particular that no replicate contains all the time-points: some replicates cover only the last few points, whilst some have broader coverage.
For all the missing data experiments, the covariance function hyper-parameters were set to the maximum likelihood value using gradient based-numerical optimisation. Whilst we show that the hierarchical GP has better performance than the GP in all cases, it does not require any extra computation. All experiments took only a few minutes on a desktop PC.
Missing data imputation
The imputation of missing data is a straightforward method for validation of our model. In this Section, we remove data systematically, effectively removing entire microarrays from the experiment and predicting what was on them. Most missing data imputation methods cannot handle this type of missing data, highlighting an advantage of our method. This experiment also validates our assertion that it is important to include the replicate structure in modelling microarray time-series, and that simply averaging the data on a time-point basis is not satisfactory.
Whilst systematically missing data are not common in the laboratory, this test does examine the performance of our model in some potentially interesting applications. For example, we may wish to predict the future gene expression levels of a patient given the time series observed in other patients.
RMSE for missing data imputation for differing numbers of randomly removed arrays
1 of 56 | 5 of 56 | 10 of 56 | 20 of 56 | |
---|---|---|---|---|
HGP | 0 . 3 0 ± 0 . 2 7 | 0 . 3 2 ± 0 . 0 9 | 0 . 3 4 ± 0 . 0 5 | 0 . 3 8 ± 0 . 0 6 |
GP | 0.46±0.23 | 0.44±0.09 | 0.43±0.08 | 0.46±0.07 |
Mean | 0.52±0.24 | 0.48±0.12 | 0.48±0.11 | 0.48±0.08 |
Median | 0.50±0.25 | 0.46±0.11 | 0.47±0.11 | 0.48±0.08 |
Table 1 shows that the hierarchical GP performs better at imputing the missing data in all examined cases. Although the Table shows only the average over randomisations, the HGP algorithm gave the lowest RMSE for every randomisation that we tried. The standard deviations in the Table generally decrease as the number of missing points increases. This reflects the degree to which the missing data imputation depends on which time-points are missing, which may be due to the different temporal sampling schemes employed in the different replicates.
We note from Table 1 that our contribution of adding replicate structure to the GP methodology makes a significant difference to the results, since the standard GP offers only modest improvement over the simple averaging methods. We also note that the averaging methods are only possible where time-points are duplicated between replicates, a restriction which the (H)GP methodology removes.
Randomly missing data imputation
Our proposed model is novel in the sense that it can impute entire missing arrays, as above. Most missing data algorithms assume randomly missing data and use correlation between genes for imputation. To compare our algorithm with those from the literature, we randomly removed 100 values from the Melanogaster dataset, and measured the error on imputation. For comparison, we also used two popular methods, K-nearest-neighbour (KNN) [22] and Bayesian principal component analysis (PCA) [23].
RMSE of missing data imputation for hierarchical Gaussian processes (HGP) principal component analysis (PCA) and K-nearest neighbour (KNN)
Replicate median | Full model | |
---|---|---|
HGP | 0 . 1 3 ± 0 . 0 3 | 0 . 0 5 ± 0 . 0 1 |
PCA | 0.15±0.03 | 0.10±0.02 |
KNN | 0.17±0.04 | 0.13±0.02 |
Another way to investigate the ability of the model to deal with missing data is to examine the difference between the model as inferred with some missing points to that inferred with all the data: the results of doing so are shown in the second column of Table 2. Whilst this method may not give a completely fair reflection of the performance of the PCA and KNN methods, the small size of the errors on imputation imply that our model is relatively insensitive to missing data: because the model can borrow statistical strength from other replicates, small amounts of data missing at random make little difference to the model.
Data fusion
The data under investigation are sampled at two-hour intervals. To improve our knowledge of the system, it is possible to perform data fusion with existing data sets. We demonstrate this with two previous studies: [4, 24], which offer more tightly temporally-spaced data, but with fewer replications (three and one respectively).
Clustering
In order to investigate the usefulness of our model in a clustering task, we first selected 300 dynamically differentially expressed genes using a method similar to [7].
We computed the marginal likelihood using our hierarchical model and a simpler GP model without replicate-specific or gene-specific variance. This model, simply fitting a GP to the lumped data is similar to the method proposed in [20], which represents the current state-of-the-art. Cooke et al[20] compared their method to several other algorithms and concluded that the GP method allowed for the discovery of clusters in a more effective manner than non-temporal models. Here we show that this method can be improved further by considering the gene-wise and replicate-wise intra-cluster structure of the data’s variance. For both models, we applied the EM algorithm with several restarts (varying the covariance function hyper-parameters on each restart), selecting the solution with the highest log-likelihood. We used our own implementation for Cooke et al.’s method to ensure that results were comparable: i.e. that improvements in the method were due to the HGP structure, rather than specifics of the implementation. We are primarily concerned with the improvements that the HGP model can offer in explaining cluster variance, and this allows for a direct comparison.
For further comparison, we used the R package Mclust [25]. Mclust fits a range of Gaussian models of increasing complexity; in the first instance, we simply concatenated the replicates and Mclust struggled to fit some models in the 56 dimensional space. Subsequently, we provided Mclust with shorter vectors formed by averaging the replicates at each time point, which gave similar results. We include the validation for both methods. In both cases, we used all available covariance structures for Mclust, and let the package pick the best using its BIC (Bayesian Information Criterion) approach.
Results of clustering 300 genes using the four proposed algorithms
MF | BP | CC | $\mathcal{\mathcal{L}}$ | N. clust. | |
---|---|---|---|---|---|
Agglomerative HGP | 0 . 9 2 | 0 . 3 2 | 1 . 0 | 7 3 6 0 . 8 | 50 |
agg. GP (as Cooke et al.) | 0.78 | 0.26 | 0.72 | 6203.7 | 128 |
Mclust (concat.) | 0.78 | 0.14 | 0.50 | 1324.0 | 26 |
Mclust (averaged) | 0.80 | 0.16 | 0.42 | -736.2 | 20 |
From the Table, it can be seen that HGP method improves the biological homogeneity for all three GO categories. By directly comparing with the standard GP method, we have demonstrated that the improvement in clustering performance is not due simply to the clustering methodology or the GP correlations which give the GP method a small improvement over Mclust, but the hierarchical structure of intra-cluster variance which allows genes and replicates to differ in a temporally-correlated fashion.
Conclusions
We have presented a method based on hierarchically structured GPs, which are a practical and flexible framework for modelling replicated time-series. The framework has a wide range of applications, and can be extended for various data structures besides biological replications.
The method performed well in several tasks, including missing data imputation and clustering. We have shown that the method performs particularly well in missing data imputation, and that small amounts of missing randomly data have only a minor effect of the model. Biological validation through the BHI confirms the importance of modelling intra-cluster variance in a hierarchical fashion.
Above we showed how fitting the simplest of our proposed models can lead to a quantitative assessment of how biological replications are behaving, as well as illustrating how our method deals with different types of replicate variation. Of course, if the replicate variance is truly independent - e.g. if only technical variation is present - then we recover standard GP regression. In this case the hierarchical approach requires the inclusion of an extra parameter, but we find that the additional computational complexity is negligible.
A problem with standard GP regression is that the computational complexity grows cubically with the number of data. We have presented a method which exploits the necessary condition that all genes in a cluster are measured on the same time-points in order to significantly reduce the computational complexity and make our clustering algorithm applicable to large data sets. We note that the complexity of the clustering algorithm scales poorly with the number of genes: the initial step of the algorithm must compare the likelihood of merging of each pair. To address this, randomised versions of the same algorithm can be adopted [30, 31], and our hierarchical model and its efficient computation as (15) could be used with no modification.
Whilst we have demonstrated that our model is useful in several applications, we envisage a number of extensions. For example, our model could be used for data fusion of microarray data with high-throughput sequencing data. Or, the hierarchical structure could be used in models of pathway activity [32], which may include prior information about gene groupings from Gene Ontology.
Although we have only used simple GP models within our hierarchical structure, the idea can be applied to more complex GP models, such as those proposed to model gene interactions [9, 33].
Endnotes
^{a} Other noise distributions are possible, but break the conjugacy of the model and thus complicate inference, see [10].
^{b} Note that hierarchical in this sense means a hierarchical partitioning of the genes, distinct from our Bayesian hierarchical model applied within the cluster.
Appendix
Efficient computation of a cluster likelihood
where we have defined for brevity $\mathit{\Lambda}\phantom{\rule{0.3em}{0ex}}=\phantom{\rule{0.3em}{0ex}}{\mathbf{K}}_{h}^{-1}\phantom{\rule{0.3em}{0ex}}+\phantom{\rule{0.3em}{0ex}}{N}_{g}\phantom{\rule{0.3em}{0ex}}{\mathbf{\Sigma}}_{n}^{-1}$ and $\widehat{\mathbf{h}}={\mathit{\Lambda}}^{-1}{\mathbf{\Sigma}}^{-1}{N}_{g}{\stackrel{\u0304}{\mathbf{y}}}_{i}$. The first and third lines of this expression can be moved outside the integral, and we recognise the Gaussian nature of $\int \phantom{\rule{0.3em}{0ex}}exp\left\{-\frac{1}{2}{(\mathbf{\text{h}}-\widehat{\mathbf{h}})}^{\top}\mathit{\Lambda}(\mathbf{h}-\widehat{\mathbf{h}})\right\}\phantom{\rule{0.3em}{0ex}}\text{d}\mathbf{h}={\left(2\pi \right)}^{D/2}|\mathit{\Lambda}{|}^{1/2}$. Substituting this and the expressions for $\widehat{\mathbf{h}}$ and Λ back into (18) leads to the expression given in (15).
Declarations
Acknowledgements
This work was funded by EU FP7-KBBE Project Ref 289434, EraSysBio+ project SYNERGY (BBSRC award BB/I004769/2) and EU FP7 RADIANT (award no. 305626).
Authors’ Affiliations
References
- Spellman P, Sherlock G, Zhang M, Iyer V, Anders K, Eisen M, Brown P, Botstein D, Futcher B: Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell. 1998, 9 (12): 3273-10.1091/mbc.9.12.3273.PubMed CentralView ArticlePubMedGoogle Scholar
- Barenco M, Tomescu D, Brewer D, Callard R, Stark J, Hubank M: Ranked prediction of p53 targets using hidden variable dynamic modeling. Genome Biol. 2006, 7 (3): R25-10.1186/gb-2006-7-3-r25.PubMed CentralView ArticlePubMedGoogle Scholar
- Straume M: DNA microarray time series analysis: automated statistical assessment of circadian rhythms in gene expression patterning. Methods Enzymol. 2004, 383: 149-166.View ArticlePubMedGoogle Scholar
- Tomancak P, Beaton A, Weiszmann R, Kwan E, Shu S, Lewis S, Richards S, Ashburner M, Hartenstein V, Celniker S, et al: Systematic determination of patterns of gene expression during Drosophila embryogenesis. Genome Biol. 2002, 3 (12): 0081-0088.View ArticleGoogle Scholar
- Sanguinetti G, Lawrence N, Rattray M: Probabilistic inference of transcription factor concentrations and gene-specific regulatory activities. Bioinformatics. 2006, 22 (22): 2775-10.1093/bioinformatics/btl473.View ArticlePubMedGoogle Scholar
- Beal M, Falciani F, Ghahramani Z, Rangel C, Wild D: A Bayesian approach to reconstructing genetic regulatory networks with hidden factors. Bioinformatics. 2005, 21 (3): 349-10.1093/bioinformatics/bti014.View ArticlePubMedGoogle Scholar
- Kalaitzis A, Lawrence N: A simple approach to ranking differentially expressed gene expression time courses through Gaussian process regression. BMC Bioinformatics. 2011, 12: 180-10.1186/1471-2105-12-180.PubMed CentralView ArticlePubMedGoogle Scholar
- Gao P, Honkela A, Rattray M, Lawrence N: Gaussian process modelling of latent chemical species: applications to inferring transcription factor activities. Bioinformatics. 2008, 24 (16): i70-i75. 10.1093/bioinformatics/btn278.View ArticlePubMedGoogle Scholar
- Honkela A, Girardot C, Gustafson E, Liu Y, Furlong E, Lawrence N, Rattray M: Model-based method for transcription factor target identification with limited data. Proc Natl Acad Sci. 2010, 107 (17): 7793-10.1073/pnas.0914285107.PubMed CentralView ArticlePubMedGoogle Scholar
- Rasmussen C, Williams C: Gaussian Processes for Machine Learning. 2006, Cambridge, Massachusetts and London, England: MIT pressGoogle Scholar
- Ernst J, Nau G, Bar-Joseph Z: Clustering short time series gene expression data. Bioinformatics. 2005, 21 (suppl 1): i159-10.1093/bioinformatics/bti1022.View ArticlePubMedGoogle Scholar
- Gelman A, Carlin JB, Stern HS, Rubin DB: Bayesian Data Analysis. 2004, Boca Raton: CRC pressGoogle Scholar
- Stegle O, Denby K, Cooke E, Wild D, Ghahramani Z, Borgwardt K: A robust Bayesian two-sample test for detecting intervals of differential gene expression in microarray time series. J Comput Biol. 2010, 17 (3): 355-367. 10.1089/cmb.2009.0175.PubMed CentralView ArticlePubMedGoogle Scholar
- Liu Q, Lin K, Andersen B, Smyth P, Ihler A: Estimating replicate time shifts using Gaussian process regression. Bioinformatics. 2010, 26 (6): 770-776. 10.1093/bioinformatics/btq022.PubMed CentralView ArticlePubMedGoogle Scholar
- Ng S, McLachlan G, Wang K, Jones LBT, Ng SW: A mixture model with random-effects components for clustering correlated gene-expression profiles. Bioinformatics. 2006, 22 (14): 1745-1752. 10.1093/bioinformatics/btl165.View ArticlePubMedGoogle Scholar
- Medvedovic M, Yeung K, Bumgarner R: Bayesian mixture model based clustering of replicated microarray data. Bioinformatics. 2004, 20 (8): 1222-10.1093/bioinformatics/bth068.View ArticlePubMedGoogle Scholar
- Lin K, Chudova D, Hatfield G, Smyth P, Andersen B: Identification of hair cycle-associated genes from time-course gene expression profile data by using replicate variance. Proc Natl Acad Sci USA. 2004, 101 (45): 15955-10.1073/pnas.0407114101.PubMed CentralView ArticlePubMedGoogle Scholar
- Dunson D: Nonparametric Bayes applications to biostatistics. Bayesian Nonparametrics. Edited by: Hjort L, Holmes C, Muller P, Walker S. 2010, Cambridge: Cambridge University PressGoogle Scholar
- Heller K, Ghahramani Z: Bayesian hierarchical clustering. Proceedings of the 22nd International Conference on Machine Learning. 2005, ACM press, 297-304.Google Scholar
- Cooke E, Savage R, Kirk P, Darkins R, Wild D: Bayesian hierarchical clustering for microarray time series data with replicates and outlier measurements. BMC Bioinformatics. 2011, 12: 399-10.1186/1471-2105-12-399.PubMed CentralView ArticlePubMedGoogle Scholar
- Kalinka A, Varga K, Gerrard D, Preibisch S, Corcoran D, Jarrells J, Ohler U, Bergman C, Tomancak P: Gene expression divergence recapitulates the developmental hourglass model. Nature. 2010, 468 (7325): 811-814. 10.1038/nature09634.View ArticlePubMedGoogle Scholar
- Troyanskaya O, Cantor M, Sherlock G, Brown P, Hastie T, Tibshirani R, Botstein D, Altman R: Missing value estimation methods for DNA microarrays. Bioinformatics. 2001, 17 (6): 520-10.1093/bioinformatics/17.6.520.View ArticlePubMedGoogle Scholar
- Oba S, Sato M, Takemasa I, Monden M, Matsubara K, Ishii S: A Bayesian missing value estimation method for gene expression profile data. Bioinformatics. 2003, 19 (16): 2088-10.1093/bioinformatics/btg287.View ArticlePubMedGoogle Scholar
- Hooper S, Boué S, Krause R, Jensen L, Mason C, Ghanim M, White K, Furlong E, Bork P: Identification of tightly regulated groups of genes during Drosophila melanogaster embryogenesis. Mol Syst Biol. 2007, 3: 72-PubMed CentralView ArticlePubMedGoogle Scholar
- Fraley C, Raftery AE: MCLUST: Software for model-based cluster analysis. J Classif. 1999, 16 (2): 297-306. 10.1007/s003579900058.View ArticleGoogle Scholar
- Brock G, Pihur V, Datta S, Datta S: clValid: An R package for cluster validation. J Stat Softw. 2008, 25 (4): 1-22.View ArticleGoogle Scholar
- Durinck S, Moreau Y, Kasprzyk A, Davis S, De Moor B, Brazma A, Huber W: BioMart and Bioconductor: a powerful link between biological databases and microarray data analysis. Bioinformatics. 2005, 21 (16): 3439-3440. 10.1093/bioinformatics/bti525.View ArticlePubMedGoogle Scholar
- Mistry M, Pavlidis P: Gene Ontology term overlap as a measure of gene functional similarity. BMC bioinformatics. 2008, 9: 327-10.1186/1471-2105-9-327.PubMed CentralView ArticlePubMedGoogle Scholar
- Kirk P, Griffin JE, Savage RS, Ghahramani Z, Wild DL: Bayesian correlated clustering to integrate multiple datasets. Bioinformatics. 2012, 28 (24): 3290-3297. 10.1093/bioinformatics/bts595.PubMed CentralView ArticlePubMedGoogle Scholar
- Heller K, Ghahramani Z: Randomized algorithms for fast Bayesian hierarchical clustering. PASCAL Statistics and Optimization of Clustering Workshop. 2005Google Scholar
- Darkins R, Cooke EJ, Ghahramani Z, Kirk PD, Wild DL, Savage RS: Accelerating Bayesian hierarchical clustering of time series data with a randomised algorithm. PloS one. 2013, 8 (4): e59795-10.1371/journal.pone.0059795.PubMed CentralView ArticlePubMedGoogle Scholar
- Shi Y, Klustein M, Simon I, Mitchell T, Bar-Joseph Z: Continuous hidden process model for time series expression experiments. Bioinformatics. 2007, 23 (13): i459-i467. 10.1093/bioinformatics/btm218.View ArticlePubMedGoogle Scholar
- Lawrence N, Girolami M, Sanguinetti G, Rattray M: Learning and Inference in Computational Systems Biology. 2010, Cambridge: MIT pressGoogle Scholar
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.