Suppose that a time series is observed and that we are concerned with its possible periodicity. Specifically, for gene expression observed over a time course, we denote the expression of a fixed gene *g* at time *t* by *Y*_{g}(*t*), for *t* = 1, ..., *N* and for each gene *g*. To model periodicity in *Y*_{g}(*t*), we can assume:

*Y*_{g}(*t*) = *f*_{g}(*t*) + *ε*_{gt},

where *f*_{g}(*t*) is a periodic function with smallest positive period *T*_{g} for gene *g*, that is, *f*_{g}(*t* + *T*_{g}) = *f*_{g}(*t*) for all *t*; and *ε*_{gt} is a sequence of unobservable random errors with mean 0 and homogeneous variance *σ*^{2} for all *g* and *t*. For a fixed gene *g*, we can specifically assume that the time series of gene expression is well represented by

*Y*_{g}(*t*) = *μ* + *A* cos(*ωt*) + *B* sin(*ωt*) + *ε*_{gt},

where *A*, *B*, and *μ* (known) are constants, and *ω* is of the form 2*πk*/*N*, for *k* = 0, 1, ..., *m*, with *m* = (*N* - 1)/2 for *N* odd and *m* = *N*/2 for *N* even. Given a finite realization of the time series gene expressions *y*_{g}(*t*) (sample values or microarray expressions obtained from the experiment), we can then view *y*_{g}(*t*) as represented by

where *ω*_{k} = 2*πk*/*N*, for *k* = 0, 1, ..., *m*, and

for *k* = 1, ..., *m* and for each gene *g*. For testing periodicity-related hypotheses about a time series, the periodogram of gene *g* is defined as

for *k* = 1, ..., *m* and for each gene *g*. Under the assumption that the *ε*_{gt} are independent and identically distributed normal random errors with mean 0 and homogeneous variance *σ*^{2} (that is, *Y*_{g}(*t*) is a normal white noise), Fisher [7] proposed a *G*-statistic and derived the exact null distribution of *G*. Suppose we wish to test

*H*_{0}: *Y*_{g}(*t*) = *μ* + *ε*_{gt}, (5)

versus

*H*_{1}: *Y*_{g}(*t*) = *μ* + *A* cos(*ωt*) + *B* sin(*ωt*) + *ε*_{gt}, (6)

then for a fixed gene *g*, Fisher's *G*-statistic is given by

For details on the *G*-statistic, its null distribution, and its percentage points, see Fisher [7], Davis [18], Wilks [19], and Priestley [20].
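As a minimal sketch of the computation described above, the following Python code evaluates Fourier coefficients at the frequencies *ω*_{k} = 2*πk*/*N*, forms the periodogram ordinates, and takes the ratio of the largest ordinate to their sum. The normalizations used here, 2/*N* for the coefficients and *N*/2 for the periodogram, follow a common textbook convention and are an assumption rather than a transcription of the displayed equations; the function names `periodogram` and `fisher_g` are likewise hypothetical. If (7) is normalized by the mean ordinate instead of the sum, the value returned here would correspond to that statistic divided by *m*.

```python
import numpy as np


def periodogram(y):
    """Return (omega, I): frequencies w_k = 2*pi*k/N and ordinates I(w_k), k = 1..m."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # m = (N - 1)/2 for N odd, N/2 for N even, as in the text.
    m = (n - 1) // 2 if n % 2 == 1 else n // 2
    t = np.arange(1, n + 1)                        # time points t = 1, ..., N
    omega = 2.0 * np.pi * np.arange(1, m + 1) / n  # Fourier frequencies
    phase = np.outer(omega, t)                     # (m, N) matrix of w_k * t
    a = (2.0 / n) * (np.cos(phase) @ y)            # assumed cosine coefficients
    b = (2.0 / n) * (np.sin(phase) @ y)            # assumed sine coefficients
    # Assumed scaling (N/2)(a_k^2 + b_k^2). For even N the ordinate at k = m
    # behaves differently under the null; this sketch does not treat it specially.
    return omega, (n / 2.0) * (a ** 2 + b ** 2)


def fisher_g(y):
    """Largest periodogram ordinate divided by the sum of all m ordinates."""
    _, ordinates = periodogram(y)
    return ordinates.max() / ordinates.sum()


# Illustrative data only: white noise versus a sinusoid at k = 3 plus noise.
rng = np.random.default_rng(0)
t = np.arange(1, 22)
noise = rng.normal(size=t.size)
signal = 2.0 * np.cos(2.0 * np.pi * 3 * t / t.size) + noise
print("G (noise only):", fisher_g(noise))
print("G (sinusoid + noise):", fisher_g(signal))
```

Because the statistic is a ratio of ordinates, it does not depend on the overall scaling chosen for the periodogram; only the choice between the sum and the mean in the denominator changes its value.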

Other test statistics for detecting "hidden periodicity" in a time series have been proposed in the spectral analysis literature (Fuller [21]). For the more general problem of testing

*H*_{0}: *Y*_{g}(*t*) is a normal white noise, (8)

versus

*H*_{1}: *Y*_{g}(*t*) is not a normal white noise, (9)

for a fixed gene *g*, Bartlett [8] proposed the *C*-statistic as the test statistic. For a fixed gene *g*, we obtain the *C*-statistic as

with

for each gene *g*. Durbin [9, 22] provided the details of the null distribution of the test statistic *C* under the normality assumption.
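To illustrate the idea behind a cumulative periodogram test, the sketch below computes one common form of the statistic: the largest absolute deviation between the normalized cumulative periodogram and the straight line *j*/*m* expected under white noise. This form is an assumption, not a reproduction of (10), which may differ in its indexing or normalization; the function name `cumulative_periodogram_stat` is hypothetical, and the input is any array of *m* periodogram ordinates.

```python
import numpy as np


def cumulative_periodogram_stat(ordinates):
    """Max absolute gap between the normalized cumulative periodogram and j/m.

    Assumed form: s_j = sum_{k<=j} I_k / sum_{k<=m} I_k for j = 1..m-1,
    compared with the white-noise expectation j/m.
    """
    ordinates = np.asarray(ordinates, dtype=float)
    m = ordinates.size
    s = np.cumsum(ordinates) / ordinates.sum()  # s_1, ..., s_m with s_m = 1
    j = np.arange(1, m)                         # j = 1, ..., m - 1
    return float(np.max(np.abs(s[:-1] - j / m)))


# Illustrative ordinates only: a flat spectrum versus one dominant frequency.
print(cumulative_periodogram_stat([1.0, 1.0, 1.0, 1.0, 1.0]))
print(cumulative_periodogram_stat([0.2, 0.1, 9.0, 0.3, 0.4]))
```

A flat set of ordinates tracks the line *j*/*m* closely, whereas power concentrated at a few frequencies pulls the cumulative curve away from it, which is what the statistic measures.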

According to Fisher [7], the observed significance level, or p-value, for testing the periodicity of a fixed gene *g* with the *G*-statistic is expressed as in (1), or again

where *ξ*_{g} is the sample realization of the *G*-statistic calculated from (7) divided by *m*, and *L*(*ξ*_{g}) is the largest integer less than 1/*ξ*_{g}. Meanwhile, according to Durbin [9], the p-value for testing the periodicity of a fixed gene *g* with the *C*-statistic is given in (2), or specifically,

where *a*_{g} = *mC*_{g}, *C*_{g} is given in (10), [*a*_{g}] = *INT*{*a*_{g}}, and *n* = *m* - 1.
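For the *G*-statistic, the p-value can be sketched in a few lines of Python, assuming that (1) is the standard form of Fisher's exact tail probability, P(*G* > *ξ*) = Σ_{j=1}^{L} (-1)^{j-1} C(*m*, *j*) (1 - *jξ*)^{m-1}, where *ξ* is the ratio of the largest periodogram ordinate to the sum of all *m* ordinates and *L* is the largest integer less than 1/*ξ*. The function name `fisher_g_pvalue` is hypothetical, and the corresponding formula for the *C*-statistic is not attempted here.

```python
import math


def fisher_g_pvalue(xi, m):
    """Exact null tail probability P(G > xi) for Fisher's statistic (assumed form).

    xi : observed ratio max(I) / sum(I), with 0 < xi <= 1
    m  : number of periodogram ordinates entering the ratio
    """
    if not 0.0 < xi <= 1.0:
        raise ValueError("xi must lie in (0, 1]")
    # Largest integer strictly less than 1/xi, capped at m.
    upper = min(m, math.ceil(1.0 / xi) - 1)
    p = 0.0
    for j in range(1, upper + 1):
        p += (-1) ** (j - 1) * math.comb(m, j) * (1.0 - j * xi) ** (m - 1)
    # Guard against small floating-point excursions outside [0, 1].
    return min(max(p, 0.0), 1.0)


# Illustrative call only: xi = 0.4 observed with m = 10 ordinates.
print(fisher_g_pvalue(0.4, 10))
```

The alternating sum is adequate for the short time courses typical of microarray experiments; for very large *m* the binomial coefficients grow quickly and a more careful numerical treatment would be needed.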

The C&G Procedure uses both test statistics and gives a practical way of identifying significant periodic genes in massive microarray data.