 Methodology article
 Open Access
 Published:
Phase analysis of circadianrelated genes in two tissues
BMC Bioinformaticsvolume 7, Article number: 87 (2006)
Abstract
Background
Recent circadian clock studies using gene expression microarray in two different tissues of mouse have revealed not all circadianrelated genes are synchronized in phase or peak expression times across tissues in vivo. Instead, some circadianrelated genes may be delayed by 4–8 hrs in peak expression in one tissue relative to the other. These interesting biological observations prompt a statistical question regarding how to distinguish the synchronized genes from genes that are systematically lagged in phase/peak expression time across two tissues.
Results
We propose a set of techniques from circular statistics to analyze phase angles of circadianrelated genes in two tissues. We first estimate the phases of a cycling gene separately in each tissue, which are then used to estimate the paired angular difference of the phase angles of the gene in the two tissues. These differences are modeled as a mixture of two von Mises distributions which enables us to cluster genes into two groups; one group having synchronized transcripts with the same phase in the two tissues, the other containing transcripts with a discrepancy in phase between the two tissues. For each cluster of genes we assess the association of phases across the tissue types using circularcircular regression. We also develop a bootstrap methodology based on a circularcircular regression model to evaluate the improvement in fit provided by allowing two components versus a onecomponent vonMises model.
Conclusion
We applied our proposed methodologies to the circadianrelated genes common to heart and liver tissues in Storch et al. [2], and found that an estimated 80% of circadianrelated transcripts common to heart and liver tissues were synchronized in phase, and the other 20% of transcripts were lagged about 8 hours in liver relative to heart. The bootstrap pvalue for being one cluster is 0.063, which suggests the possibility of two clusters. Our methodologies can be extended to analyze peak expression times of circadianrelated genes across more than two tissues, for example, kidney, heart, liver, and the suprachiasmatic nuclei (SCN) of the hypothalamus.
Background
Circadian rhythms (or the biologic clocks that control them) have stimulated interest in recent years due to their importance in orchestrating physiological behavior, biological processes, and adaptability of biological systems to changes in environment [1–3]. Many circadianrelated genes have been explored using highthroughput DNA microarray technology [1–3]. These studies also have stimulated efforts to apply and develop methodologies in circular/directional statistics to elucidate important characteristics of circadian gene expression and also compare their patterns of peak expression times (phase angles) across different tissue types, to help elucidate their diverse tissuespecific functions [1, 2, 4, 5].
As periodic oscillation characterizes the expression pattern of both circadian genes and cell cycle genes, many correlationbased and Fourierbased methodologies [1–3] proposed for analyzing cell cycle gene expression can be directly applied to circadian gene expression analysis. However, there are some distinct differences between studies in cell cycle gene expression and circadian gene expression. First, most cell cycle gene expression patterns are based on cell cultures studied in vitro, while most circadian gene expressions are based on various tissues or organs in vivo. Consequently, circadian gene expression may be more complex or tissue/cellspecific. Second, the four phases of a cell cycle, namely, G_{1}, S, G_{2}, and M phases, have been well characterized through intensive research over the last thirty years, and more than 54 mammalian [6] and 104 yeast cell cycle genes have been identified [7]. In contrast, to date, less is known about circadian genes: only eight core mammalian circadian genes have been identified: Csnk1e, Cry1, Cry2, Per1, Per2, Per3, Clock, and Bmall [8]. In addition, it is not clear whether these known circadian genes and any other circadianrelated genes identified from highthroughput microarrays can be assigned to a few functional phases, in analogy to the phases (G_{1}, S, G_{2}, M) in cell cycle. Note that many studies on cell cycle gene expression based on microarray were on same organism [6] and celltype [7] under different experimental conditions. Therefore, we expect that a set of cell cycle genes commonly expressed in various conditions are consistent in their peak expression/activation time [9]. However, it is an opening question whether phases or peak expression times for a set of circadianrelated genes commonly expressed in multiple tissues, such as heart, liver, kidney, and SCN of the hypothalamus are in synchrony because expression of some circadianrelated genes may be tissuespecific. Statistical tools for analyzing such a type of circular data cross multiple tissues need to be developed.
The phase angles estimated from cycling transcripts in Panda et al. [1] and Storch et al. [2] can be regarded as points on a circle of unit radius, which are treated as circular data in circular/directional statistics [10, 11]. Circular data are commonly modeled with a von Mises distribution function on the unit circle, an analog to the normal distribution for linear data. The main feature of circular data is that it is directional and classical methods based on linear data can produce meaningless results. As an example, suppose a bird takes off in the northeast direction at an angle of 2°, while another takes off in the southeast direction at an angle 358° then their mean direction (by usual linear methods) is 180°, or due west! Means and variances and other statistical analyses must respect the directional nature of the data to avoid such nonsensical results. For example, a sum of two points on a unit circle is calculated as the sum of the two vectors, yielding a vector with certain direction and length. With this vector averaging, the two birds now have a mean direction of 0°, corresponding properly to due east. Special methods are available in the literature for describing correlation and regression between circular variables [10, 11]. One needs to be cautious when analyzing circular and linear variables simultaneously. In some cases circadian phase angles have been mistreated as linear variables in linear regression of the phase angle on the period and amplitude [12]. Results so obtained may not be useful for interpretation.
The motivation for our work is based on the observation in [1–3] that some circadianrelated genes that are expressed in two tissues are systematically lagged in peak expression time in the two tissues. Panda et al. [1] reported that many of the 28 circadianrelated transcripts common to the suprachiasmatic nuclei (SCN) of the hypothalamus and liver, including Per2 and RevErbβ, are delayed by 4–8 hrs in peak expression in liver relative to the SCN. Ueda et al. [3] validated in vitro that the RevErbA/ROR response element in both the SCN and liver tissues is expressed in phase with Bmal1 and in antiphase with Per2 oscillation. These studies suggest that the coordinated temporal expression of circadian genes inphase and antiphase in different tissues is an interesting but a complex biological phenomenon. Statistical analysis tools for studying this type of interesting biological questions arising in recent genomics studies are needed.
To address the above questions, we propose a few steps in the following sections. Given a set of circadianrelated genes common to two tissues, we first fit a randomperiods model [13] to the timecourse expression for each gene individually in each of two tissues, to estimate its phase angles along with periods and amplitudes. The angular difference between the two phases for each gene can be represented as an angle, i.e. as a point on a circle. Using a mixture of two von Mises distributions, we cluster angular differences of the genes into two groups; genes whose expressions are synchronized (mean difference is close 0) in the two tissues and those whose expressions are different in the two tissues. The identified clusters may provide a hint on association of circadian genes specific to these tissues. We then assess the association of each set of genes common to two tissues using Down and Mardia's circularcircular regression model [14]. In addition, we propose a new circularcircular regressionbased bootstrap method to assess the mixture of two homogeneous phase distributions for the two tissues. We illustrate the proposed methodologies using the heart and liver circadianrelated gene expression data sets from Storch et al. [2].
Methods
Phase estimation
One characterizing feature of circadianrelated genes expression is the periodically oscillating pattern. Sinusoidal functions have been used to model the circadian gene expression level [1–3]. We apply the "randomperiods model" [13] to estimate the phase angles, period, and amplitude together for a given circadian gene expression using nonlinear leastsquares regression. While there is no attenuation in circadian gene expression, the sinusoidal component of the "randomperiods model" is reduced to a simple sinusoidal function K_{ g }cos(2π t/T_{ g }+ φ_{ g }), where K_{ g }, T_{g,}, and φ_{ g }are the amplitude, period, and phase (angle) of gene g. The phase parameter φ_{ g }indicates when the expression of the g gene reaches its maximum.
A mixture of two von Mises distributions for circular paireddifference data
After estimation of activation times or phase angles for a set of circadianrelated genes that expressed in two tissues or organs, we are interested in examining whether the phase angles are synchronous or not. Let ${\widehat{\phi}}_{g}^{x}$ and ${\widehat{\phi}}_{g}^{y}$ denote the estimated phase angles of a circadianrelated gene g, g = 1, 2, ..., n, in the two tissues x and y, where π≤ ${\widehat{\phi}}_{g}^{x}$ ≤π, π ≤${\widehat{\phi}}_{g}^{y}$ ≤ π. Further, we model the distribution of the angular difference
as a mixture of two von Mises distributions. One component will correspond to a subset of the n genes have the same phase angle in the two tissues and the other will correspond to genes having unequal phase angles in the two tissues. Thus the probability density function is given by:
where ${f}_{i}({\Delta}_{g})=\frac{1}{2\pi {I}_{0}({\kappa}_{i})}\mathrm{exp}({\kappa}_{i}\mathrm{cos}({\Delta}_{g}{\mu}_{i}))$, i = 1, 2; 0 ≤ κ_{ i }; π <μ_{ i }≤ π. Here, p_{ i }is the mixing parameter, μ_{ i }is the mean direction for distribution i, κ_{ i }is the concentration parameter characterizing the variability of the estimated differences Δ_{ g }about μ_{ i }, and I_{0}(κ_{ i })is the modified Bessel function of the first kind and order zero. We expect that one von Mises distribution has mean close to 0 radians, because it consists of a concordant subset of genes having the same phase in the two tissues, whereas the other distribution contains a set of "discordant" genes. The variation in shift characterizing genes of the second set can be measured by summing1  cos(Δ_{ g }) [11].
The log likelihood for the mixture of two von Mises distributions in (2) is
The parameters in the vector (p_{1}, κ_{1}, μ_{1}, κ_{2}, μ_{2}) in the mixture model (3) can be estimated using the Newtontype optimization method in the Matlab optimzation toolbox. To ensure convergence to the global solution, we use fifty random starting points. A comparison of the performance of various estimators can be found in [15]. We chose the Newtontype optimization method in the estimation due to its simplicity and flexibility of converting unconstrained searching to constrained optimization by adding constraints on the mixing parameter p_{ 1 }, i.e., 0.15 <p_{ 1 }< 0.85 or the concentration parameter κ_{1}, or κ_{2}, i.e., κ_{1} <10 and κ_{2} < 10. Upon obtaining the five estimated parameters (${\widehat{p}}_{1},{\widehat{\kappa}}_{1},{\widehat{\mu}}_{1},{\widehat{\kappa}}_{2},{\widehat{\mu}}_{2}$) in the mixture model (2), we statistically assign each of the Δ_{ g }to one of the two components based on its relative likelihood. That is, gene g is assigned to cluster 1 if ${\widehat{p}}_{1}{\widehat{f}}_{1}({\Delta}_{g})>{\widehat{p}}_{2}{\widehat{f}}_{2}({\Delta}_{g})$, otherwise to cluster 2.
Circularcircular regression
In a recent article [9] we described the notion of association between the phase angles of a set of cellcycle genes from a pair of experiments using the circularcircular regression model of Downs and Mardia [14]. Within each cluster obtained above, we shall apply the methodology described in [9] to examine the association between the estimated phase angles of the genes in the two tissues.
Consider a pair of angular random variables for cluster i as ( ${\widehat{\phi}}_{ig}^{y}$, ${\widehat{\phi}}_{ig}^{x}$), i = 1, 2, g = 1, ..., n, with mean directions α_{ i }and β_{ i }, respectively. Further, suppose η_{ ig }denotes the mean direction of ${\widehat{\phi}}_{ig}^{y}$ given ${\widehat{\phi}}_{ig}^{x}$. In the present context, this would be the mean estimated phase angle of a gene in one tissue, conditional on its estimated phase angle in the other tissue. Downs and Mardia [16] introduced the following flexible circular regression model to regress ${\widehat{\phi}}_{ig}^{y}$ on ${\widehat{\phi}}_{ig}^{x}$
where ω_{ i }denotes the "slope" parameter of the regression and η_{ ig }is the mean direction of ${\widehat{\phi}}_{ig}^{y}$ conditional on ${\widehat{\phi}}_{ig}^{x}$. The above model allows for estimating not only the rotational angle θ_{ i }= β_{ i }α_{ i }, but also the slope parameter ω_{ i }. As in Downs and Mardia [16], to avoid multiple solutions, we restrict  1 ≤ ω_{ i }≤ 1 and π ≤ α_{ i }≤ π and π ≤ β_{ i }≤ π. We model the conditional distribution of ${\widehat{\phi}}_{ig}^{y}$ given ${\widehat{\phi}}_{ig}^{x}$ as a von Mises with concentration parameter ${\kappa}_{i}^{c}$, i.e.,
As shown in Downs and Mardia [16], the angular error ${\widehat{\phi}}_{ig}^{y}$ η_{ ig }(${\widehat{\phi}}_{ig}^{x}$;α_{ i }, β_{ i }, ω_{ i }) is von Mises with mean 0 and concentration parameter κ_{ i }, where
The association from one tissue to the other of genes in each cluster (${\widehat{\phi}}_{i}^{y}$, ${\widehat{\phi}}_{i}^{x}$), i = 1, 2, can then be assessed using the Ftest derived by Downs and Mardia [16].
A bootstrap test for number of clusters
To assess whether there are two clusters in the mixture of Δ_{ g }, g = 1,..., n, in (1), in the following we propose a bootstrap methodology to test the null hypothesis that Δ_{ g }'s are a random sample from a single vonMises distribution against the alternative hypothesis that they are from a mixture of two independent vonMises distributions.
Let $cv={\displaystyle \sum _{g=1}^{n}(1\mathrm{cos}({r}_{g}))}$ denote an estimate of the circular variance for the combined sample of n = n_{1} + n_{2} observations, based on residuals from a single circular regression, while $c{v}_{1}={\displaystyle \sum _{g=1}^{{n}_{1}}(1\mathrm{cos}({r}_{g}))}$ and $c{v}_{2}={\displaystyle \sum _{g=1}^{{n}_{2}}(1\mathrm{cos}({r}_{g}))}$ denote the estimates of the circular variances for the two individual clusters separately based on residuals from two circular regressions.
The proposed bootstrap procedure is described in the following steps:

1)
Regress phase angles in y tissue on phase angles in x tissue using the circularcircular regression model (4) and compute the circular variance cv based on the residuals r_{ g }, g = 1, ..., n, from this single circular regression;

2)
Compute for each gene g the difference Δ_{ g }= ${\widehat{\phi}}_{g}^{y}$  ${\widehat{\phi}}_{g}^{x}$, where ${\widehat{\phi}}_{g}^{y}$ and ${\widehat{\phi}}_{g}^{x}$ are the estimated phase angles for gene g in tissues y and x;

3)
Fit the phase differences Δ_{ g }, g = 1, ..., n, to a twocomponent mixture of vonMises model, obtaining two separate clusters with n_{1} and n_{2} genes in provisional clusters 1 and 2, respectively;

4)
Regress n_{1} and n_{2} phase angles of y_{1} on x_{1} and y_{2} on x_{2} separately for the two clusters, and obtain the residuals r^{cluster 1}and r^{cluster 2}from each of the two regressions;

5)
Compute the circular variances cv_{1} and cv_{2} for each of the twocluster sets of residuals r^{cluster 1}and r^{cluster 2}from the regressions carried out in step 4);

6)
Calculate the test statistic: T = cv  cv_{1}  cv_{2}

7)
Compute the absolute values of r_{ g }obtained from Step 1, and randomly assign a +/ sign to each r_{ g }, call it ${r}_{g}^{*}$, then obtain bootstrapped data ${r}_{g}^{bootstrap}$ from ${r}_{g}^{*}$;

8)
Obtain each pseudo phase angle for the heart data by η_{ g }+ ${r}_{g}^{bootstrap}$ => ${\phi}_{g}^{y,bootstrap}$, where η_{ g }is the predicted angle in tissue y based on regression performed in Step 1;

9)
Repeat the loop from step 1) to 8) 3000 times using ${\widehat{\phi}}_{g}^{x}$, the bootstrap data ${\theta}_{g}^{y,bootstrap}$ to replace ${\widehat{\phi}}_{g}^{y}$ and replacing r_{ g }by ${r}_{g}^{bootstrap}$. For the bootstrap data denote the T in step 6 by T^{bootstrap}.
Then the bootstrap pvalue is the proportion of T^{bootstrap} that are greater than the calculated value of test statistic T.
Results
Datasets
We apply our methodologies to 52 circadianrelated cycling transcripts that are expressed in mouse heart and liver tissues, identified by Storch et al. [2]. In Storch's studies, mice were entrained to a 12 hrs light/dark cycle for more than two weeks, then placed in a constant dim light for more than 42 hrs. The tissue samples were taken from sacrificed mice at 4hour intervals for 48 hrs, or about two circadian cycles, as in the circadian studies of Panda et al. [1].
Due to the poor fit of our randomperiods model [13] to the expression of four transcripts (accession numbers: AI834950, AF003348, AF043288, and AB014494), we excluded them from the list of 52 circadianrelated transcripts [2] in our analysis. The estimated phase angles for the remaining 48 transcripts in both heart and liver for clusters 1 and 2 are listed in Tables 1 and 2, respectively. The angular differences (heart, denoted as y, minus liver, denoted as x) Δ_{ g }are plotted in Figure 1.
The 48 circadianrelated cycling transcripts were assigned into two clusters based on a mixture of two von Mises distributions, as described above. The first cluster contains 38 genes (Table 1) and the second cluster contains 10 genes (Table 2). The five estimated parameters (${\widehat{p}}_{1},{\widehat{\kappa}}_{1},{\widehat{\mu}}_{1},{\widehat{\kappa}}_{2},{\widehat{\mu}}_{2}$) = (0.79, 1.50, 0.07, 4.56, 2.16). The distance between the two clusters in mean direction (${\widehat{\mu}}_{1}{\widehat{\mu}}_{2}$) is 2.23 rads, suggesting that the ten transcripts in cluster 2 have different points of peak expression in heart and liver. This can also be seen in Figures 2 and 3. The mean direction of 38 phase angles in the cluster 1 is 0.07 rads, suggesting that the peak expression times for the 38 cycling genes in heart and liver are close to synchronized. In contrast, the peak expression times for the 10 genes in heart and liver (in cluster 2) are away from the 0 direction by 2.16 rads, with heart ahead of liver by about 8 hrs. This result suggests that these 10 discordant circadianrelated genes may play different roles in the heart and liver.
We estimated the acrosstissue association of the genes in each of the two clusters by regressing the phase angles in heart, denoted as y, onto those in liver, denoted as x, using the circularcircular regression model described above (5). The estimated rotational parameters, the slope, and the concentration parameter for the von Mises distribution are (${\widehat{\alpha}}_{1},{\widehat{\beta}}_{1},{\widehat{\omega}}_{1},{\widehat{\kappa}}_{1}^{c}$) = (0.83, 0.91, 0.58, 1.85) for cluster 1 and (${\widehat{\alpha}}_{2},{\widehat{\beta}}_{2},{\widehat{\omega}}_{2},{\widehat{\kappa}}_{2}^{c}$) = (2.85, 0.76, 0.86, 9.41) for cluster 2. Figure 2 shows the residuals for all genes, i.e. the angular difference between the phase angle in heart and the prediction based on the same gene's pattern of expression in liver. The residuals for the 10 genes in the second cluster are shown in Figure 3. We can see that the activation times of these transcripts in heart and liver tissues are matched well after a 2.16 radian clockwise rotation of the phase angles in heart relative to in liver. Upon obtaining the 'slope parameters' ${\widehat{\omega}}_{1}$ and ${\widehat{\omega}}_{2}$, we tested the null hypothesis H_{0}: ω_{ i }= 0 vs. the alternative hypothesis, H_{a}: ω_{ i }≠ 0, i = 1, 2, using the test statistic derived by Downs and Mardia [14]. The corresponding pvalues for clusters 1 and 2 were less than 5 × 10^{7} and 1.4 × 10^{4}, respectively, suggesting that the associations of the circadian activation times in heart and liver are strong in both clusters. Here we included the expression plots of the transcripts AI850638 and AI846522 in heart and liver tissues [2] and the fits of our model to the two transcripts in Figures 4 and 5, accordingly. The plots reveal that the fits of our model to the data are reasonably good, and peak times of the two transcripts in liver tissue are markedly lagged relative to heart tissue.
Based on 3000 bootstraps using the procedure outlined in section of bootstrap test, we found that the estimated pvalue for sufficiency of a onecomponent distribution in the phase difference of heart and liver tissues for the 48 genes was 0.063. Although this is not significant at 5% level of significance, it suggests that a twocomponent vonMises distribution for the phase in heart and liver tissues better describes the relationships among the peak expression times for the studied genes.
Discussion and conclusion
Our analysis of the peak expression times (phases) for a set of 48circadianrelated genes expressed in both heart and liver tissues [2] suggests that not all of the genes are maximally expressed at the same time in the two tissues. Instead, among the 48 genes, 38 are synchronized in phase or peak expression times in heart and liver tissues, and the other 10 genes express earlier by about 2.23 rads or 8 hours in heart than in liver. Our bootstrapping test result supports, albeit weakly, the existence of two distinct subsets among the 48 genes. Although our findings are based on the single experimental dataset of Storch et al. [2], our results are similar to an earlier observation made by Panda et al. [1] that the peak expression times for some genes are not synchronized in suprachiasmatic nuclei (SCN) compared to liver. One implication of our results, giving quantitative support for the conclusion of Storch et al. [2], may be that some commonly expressed circadianrelated genes may perform different functions across different organs [1, 2].
We have developed a new bootstrap method for assessing the adequacy of one versus the need for two clusters of genes in the two sets of phase angles in heart and liver tissues for the 48 genes. In particular, we evaluated the significance of the circular variances of the residuals from circularcircular regression of the phase angles in heart on in liver in one cluster vs. two clusters. In contrast, most studies on mixtures of circular variables have focused on dividing a set of data into two subsets. To the best of our knowledge, no studies have used a mixture of two components for circular datasets for testing heterogeneity of a cyclic pattern.
This work would not have been undertaken without the interesting observations by Panda et al. [1], Storch et al. [2] and Ueda et al. [3] that a few circadianrelated genes expressed out of phase across two tissues of mouse. In addition, quantitatively the results of our statistical analysis depend on the approximation of sinusoidal waveform for circadian gene expression, on reasonably accurate estimation of the phase angles, on a relative large sample size of genes common to two tissues, on the approximate validity of the von Mises distribution for each cluster of differences. In previous circadian gene expression studies [1–3], a tissue sample was commonly taken at each 4hr interval for two circadian cycles, i.e., 12 time points per gene expression. Although our fits to the 48 circadian gene expression in both tissues are reasonably good, as shown in Figures 4, 5, further experimental and simulation studies may be needed to understand the role of sample size and sampling frequency on phase estimation when a sinusoidal waveform is presumed for circadian rhythm. In this work, we considered the difference of two phases for a gene in two tissues as a random variable modeled by a von Mises distribution on a circle. The corresponding uncertainties are captured by the degree to which the von Mises distribution is spread out on the circle.
The 48 circadian related genes expressed in heart and liver of mouse each provide a pair of peak expression times. We have assumed that there are at most two clusters. Further experimental studies are needed for testing whether there might be more than two clusters with for 48 genes. While our method can hypothetically be extended to allow one to test the need for 3 clusters rather than for 2 clusters, the sample size of 48 genes, may not be sufficient to carry out such a test with much power. The results of the twoclusters analysis must be regarded as descriptive. Our analysis of circadian gene expression may serve to stimulate further methodological development in circular/directional statistical analysis of genes that may be expressed differently in phase in two or more tissues. The genes in each cluster would need to be scrutinized separately for further elucidation of their tissuespecific biological and physiological functions [1, 2, 5].
Our methodologies can be, in principle, extended to analyzing multiple circadian gene expression data sets across multiple tissues (organs), e.g., kidney, heart, liver, and SCN where investigators are interested in understanding whether there is one set of core circadianrelated genes that are similar in their patterns of activation across different organs (or tissues), and others that are differently expressed in different tissues, as suggested by Reppert and Weaver [5]. Because the maximum likelihood approach considered in this paper may become challenging due to computational complexity, further methodology development is needed in this area. For such multiple mixture problems, one may want to consider a Bayes or empirical Bayes approach. However, to the best of our knowledge, Bayes and empirical Bayes methods for mixture problems associated with circular data are not well developed. The present application provides an excellent opportunity for developing such methodology for mixture problems associated with circular data. Secondly, using the standard likelihood approach, we clustered the 48 genes into two clusters on the basis of the phase difference between the two tissues. It would be interesting and useful to derive an estimate of "reliability" of clustering for each gene. One possible approach is to perform a bootstrap by selecting a simple random sample of 48 genes from the list of 48 genes and classify them into two clusters using procedure described in this paper. This procedure could be repeated for a large number of times, say 1000. One then could estimate the proportion of times a gene was classified into one of two clusters. Unfortunately, such a procedure does not work well when the number of genes is small and their "true" cluster memberships are unknown and the numbers of genes in each cluster are highly unbalanced. Nonetheless, these important questions need to be addressed because the number of potential applications for such a procedure is ever growing.
References
 1.
Panda S, Antoch MP, Miller BH, Su AI, Schook AB, Straume M, Schultz PG, Kay SA, Takahashi JS, Hogenesch JB: Coordinated transcription of key pathways in the mouse by the circadian clock. Cell 2002, 109: 307–320. 10.1016/S00928674(02)007225
 2.
Storch KF, Lapan O, Leykin I, Viswannthan N, David FC, Wong WH, Weitz CJ: Extensive and divergent circadian gene expression in liver and heart. Nature 2002, 417: 78–83. 10.1038/nature744
 3.
Ueda H, Chen W, Adachi A, Wakamatsu H, Hayashi S, Takasugi T, Nagano M, Nakahama KI, Suzuki Y, Sugano S, Lino M, Shigeyoshi Y, Hashimoto S: A transcription factor response element for gene expression during circadian night. Nature 2002, 418: 534–539. 10.1038/nature00906
 4.
Zylka MJ, Sheaman LP, Weaver DR, Reppert SM: Three period homologs in mammals differential light responses in the suprachiasmatic circadian clock and oscillating transcripts outside of brain. Neuron 1998, 20: 1103–1110. 10.1016/S08966273(00)804924
 5.
Reppert SM, Weaver DR: Coordination of circadian timing in mammals. Nature 2002, 418: 935–941. 10.1038/nature00965
 6.
Whitfield ML, Sherlook G, Saldanha AJ, Murray JI, Ball CA, Alexander KE, Matese JC, Perou CM, Hurt MM, Brown PO, Bostein D: Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol Boil Cell 2002, 13: 1977–2003. 10.1091/mbc.02020030.
 7.
Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Bostein D, Futcher B: Comparative identification of cell cycleregulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. Mol Biol Cell 1998, 9: 3273–3297.
 8.
Fu L, Pelicano H, Liu J, Huang P, Lee CC: The circadian gene Period2 plays an important role in tumor suppression and DNA damage response in vivo. Cell 2002, 111: 41–50. 10.1016/S00928674(02)009613
 9.
Liu D, Weinberg RC, Peddada S: A geometric approach to determine association and coherence of the activation times of cellcycling genes under different experimental conditions. Bioinformatics 2004, 20: 2521–2528. 10.1093/bioinformatics/bth274
 10.
Fisher NI: Statistical Analysis of Circular Data. New York: Cambridge University Press; 1993.
 11.
Mardia KV, Jupp PE: Directional Statistics. Chichester: John Wiley & Son; 2000.
 12.
Micheal TD, Salome PA, Yu HJ, Spencer TR, Sharp EL, McPeek MA, Alonso JM, Ecker JR, McClung CR: Enhanced fitness conferred by naturally occurring variation in the circadian clock. Science 2003, 302: 1049–1053. 10.1126/science.1082971
 13.
Liu D, Umbach DM, Peddada SD, Li L, Crockett PW, Weinberg CR: A RandomPeriods Model for Expression of CellCycle Genes. Proc Natl Acad Sci USA 2004, 101: 7240–7245. 10.1073/pnas.0402285101
 14.
Downs TD, Mardia KV: Circular regression. Biometrika 2002, 89: 683–697. 10.1093/biomet/89.3.683
 15.
Spurr BD, Koutbeiy MA: A comparison of various methods for estimating the parameters in mixtures of von Mises distribution. Comm Stat: Simul Comp 1991, 20: 725–741.
Acknowledgements
We thank David Umbach for reading our manuscript and offering suggestions, and Mei Liu, and Bhaskar Mandavalli for their helpful comments on an earlier version of the manuscript.
Author information
Additional information
Authors' contributions
DL conceived of the study, performed the calculations, and drafted the manuscript. SDP and CRW suggested the bootstrap method. SDP, CRW, and LL drafted the manuscript. All authors read and approved the final manuscript.
Authors’ original submitted files for images
Below are the links to the authors’ original submitted files for images.
Rights and permissions
About this article
Received
Accepted
Published
DOI
Keywords
 Phase Angle
 Angular Difference
 Cell Cycle Gene
 Circadian Gene
 Circular Data