Statistical significance of quantitative PCR

Background: PCR has the potential to detect and precisely quantify specific DNA sequences, but it is not yet often used as a fully quantitative method. A number of data collection and processing strategies have been described for the implementation of quantitative PCR. However, they can be experimentally cumbersome, their relative performances have not been evaluated systematically, and they often remain poorly validated statistically and/or experimentally. In this study, we evaluated the performance of known methods, and compared them with newly developed data processing strategies in terms of resolution, precision and robustness.

Results: Our results indicate that simple methods that do not rely on the estimation of the efficiency of the PCR amplification may provide reproducible and sensitive data, but that they do not quantify DNA with precision. Other evaluated methods based on sigmoidal or exponential curve fitting were generally of both poor resolution and precision. A statistical analysis of the parameters that influence efficiency indicated that it depends mostly on the selected amplicon and to a lesser extent on the particular biological sample analyzed. Thus, we devised various strategies based on individual or averaged efficiency values, which were used to assess the regulated expression of several genes in response to a growth factor.

Conclusion: Overall, qPCR data analysis methods differ significantly in their performance, and this analysis identifies methods that provide DNA quantification estimates of high precision, robustness and reliability. These methods allow reliable estimations of relative expression ratio of two-fold or higher, and our analysis provides an estimation of the number of biological samples that have to be analyzed to achieve a given precision.

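In the ideal case, PCR amplification can be described by:

$$N_c = N_0 \cdot 2^c$$

where $N_c$ is the amount of PCR DNA product at cycle $c$ and $N_0$ is the initial amount of template DNA. In a perfectly efficient PCR reaction, the amount, or copy number, of PCR DNA molecules would double at each cycle but, due to a number of factors, this is rarely the case under experimental conditions. The equation above is therefore generalized as:

$$N_c = N_0 \cdot E^c \qquad \text{(Eq. 1)}$$

where $E$ is the PCR efficiency, with $1 < E < 2$.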
Because quantitative PCR measures fluorescence as an estimator of the amount of DNA present in the reaction at a given time, the absolute amount of initial DNA is generally unknown.
Therefore the initial amount of cDNA corresponding to the gene of interest (DNA A) must be normalized to the level of an endogenous control gene or of a reference DNA standard (DNA B).
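Eq. 1 can be formulated for the two DNAs at the point where the amplification curve of each DNA crosses the threshold fluorescence value:

$$N_{Ct,A} = N_{0,A} \cdot E_A^{Ct_A}, \qquad N_{Ct,B} = N_{0,B} \cdot E_B^{Ct_B}$$

where $N_{0,A}$ and $N_{0,B}$ are the initial amounts of the two DNAs, $E_A$ and $E_B$ their PCR efficiencies, and $Ct_A$ and $Ct_B$ the threshold cycles of the two genes. Taking the amount of product at the threshold to be the same for both DNAs, the DNA concentration ratio becomes:

$$R_{AB} = \frac{N_{0,A}}{N_{0,B}} = \frac{E_B^{Ct_B}}{E_A^{Ct_A}} \qquad \text{(Eq. 2)}$$

where $R_{AB}$ is the normalized ratio of the molar concentrations, or of the copy numbers, of DNA A relative to DNA B in a given sample. Note that this equation is a simplified form of the one developed by Meijerink and colleagues [1].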
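As a minimal sketch of how the normalized ratio could be computed from efficiencies and threshold cycles, the following Python function evaluates the second equality of Eq. 2. The function name and the numerical values are hypothetical and chosen only for illustration.

```python
def normalized_ratio(e_a, ct_a, e_b, ct_b):
    """Normalized ratio R_AB = E_B**Ct_B / E_A**Ct_A.

    e_a, e_b   : PCR efficiencies of DNA A and DNA B (between 1 and 2)
    ct_a, ct_b : threshold cycles of DNA A and DNA B
    """
    return (e_b ** ct_b) / (e_a ** ct_a)


# Hypothetical example: both amplicons amplify with perfect doubling,
# and DNA A crosses the threshold 5 cycles later than DNA B.
r_ab = normalized_ratio(e_a=2.0, ct_a=25.0, e_b=2.0, ct_b=20.0)
print(r_ab)
```

With equal efficiencies of 2, a delay of 5 cycles corresponds to a 32-fold lower initial amount of DNA A relative to DNA B.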

Propagation of error on normalized expression
Assuming that, in the second equality of Eq. 2, $E_A$, $Ct_A$, $E_B$ and $Ct_B$ are obtained from imprecise measurements, the error on the normalized expression ratio may be determined using a first-order Taylor expansion for the calculation of error propagation (the delta method), which consists in expressing the differential of $R_{AB}$ as the sum of the partial derivatives relative to each variable:

$$dR_{AB} = \frac{\partial R_{AB}}{\partial E_A}\,dE_A + \frac{\partial R_{AB}}{\partial Ct_A}\,dCt_A + \frac{\partial R_{AB}}{\partial E_B}\,dE_B + \frac{\partial R_{AB}}{\partial Ct_B}\,dCt_B$$

Note that this is an approximation of the behaviour of the errors; the approximation error tends to zero for large sample sizes, in keeping with the central limit theorem.
When considering the first equality of Eq. 2 (ratio of the original amounts of template DNA), $\Delta R_{AB}$ is given by:
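The first-order propagation described above can be sketched in Python as follows. The partial derivatives are those of $R_{AB} = E_B^{Ct_B} / E_A^{Ct_A}$; combining the four contributions in quadrature assumes that the measurement errors are independent, which is an assumption of this sketch and not necessarily the exact form used in the study. All names and numerical values are hypothetical.

```python
import math


def ratio_with_error(e_a, d_e_a, ct_a, d_ct_a, e_b, d_e_b, ct_b, d_ct_b):
    """Return (R_AB, delta_R_AB) using first-order error propagation.

    d_* arguments are the errors (e.g. standard deviations) on each input.
    Assumption: the four errors are independent, so their contributions
    are summed in quadrature.
    """
    r = (e_b ** ct_b) / (e_a ** ct_a)
    # Analytical partial derivatives of R_AB with respect to each variable.
    partials = {
        "E_A": -ct_a * r / e_a,
        "Ct_A": -math.log(e_a) * r,
        "E_B": ct_b * r / e_b,
        "Ct_B": math.log(e_b) * r,
    }
    errors = {"E_A": d_e_a, "Ct_A": d_ct_a, "E_B": d_e_b, "Ct_B": d_ct_b}
    variance = sum((partials[k] * errors[k]) ** 2 for k in partials)
    return r, math.sqrt(variance)


# Hypothetical example: only Ct_A carries an error (0.1 cycle).
r_ab, d_r_ab = ratio_with_error(2.0, 0.0, 25.0, 0.1, 2.0, 0.0, 20.0, 0.0)
print(r_ab, d_r_ab)
```

Because the efficiencies enter Eq. 2 raised to the power of the threshold cycle, a small relative error on $E_A$ or $E_B$ is multiplied by roughly $Ct$ in the relative error on the ratio, which is why the precision of the efficiency estimate matters so much.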

Relative induction of gene expression
Usually, experimenters are interested in the difference in gene expression between two conditions (with versus without a drug, healthy versus metastatic tissue, etc.) [2][3][4]. The question is whether the expression of the gene of interest is induced or repressed upon treatment. The useful figure is therefore the ratio of the normalized ratios (Eq. 2), referred to hereafter as the normalized induction ratio:

Note that Eq. 13 is valid only if the expression of the gene used for the normalization (internal standard) is invariant between conditions 1 and 2 [5]. The error on induction values is given by Eq. 14.
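A minimal sketch of the induction calculation is given below. It assumes that the induction ratio is taken as the normalized ratio in condition 2 divided by that in condition 1, and that the errors on the two normalized ratios are independent, so that their relative errors add in quadrature to first order. Both choices, as well as the function name and the numbers, are assumptions made for illustration.

```python
import math


def induction_with_error(r_cond1, d_r_cond1, r_cond2, d_r_cond2):
    """Return (induction, delta_induction) for induction = R_cond2 / R_cond1.

    Assumption: errors on the two normalized ratios are independent,
    so relative errors are combined in quadrature (first-order propagation).
    """
    induction = r_cond2 / r_cond1
    relative_error = math.sqrt(
        (d_r_cond1 / r_cond1) ** 2 + (d_r_cond2 / r_cond2) ** 2
    )
    return induction, induction * relative_error


# Hypothetical normalized ratios, each known to within 10 %.
induction, d_induction = induction_with_error(0.03125, 0.003125, 0.125, 0.0125)
print(induction, d_induction)
```

In this hypothetical case the gene of interest appears four-fold induced, with an error of about 14 % on the induction value.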

Estimation of the PCR efficiency
Determination of PCR efficiency can be readily performed by log-linearizing Eq. 1:

$$\log N_c = \log N_0 + c \cdot \log E$$

so that $\log E$ is the slope of the regression of $\log N_c$ on the cycle number $c$. The regression function of Microsoft Excel was used to evaluate the error on the efficiency: it provides the standard deviation of the slope, from which the associated error on the efficiency was calculated using the previous formulas for the propagation of errors:
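The regression step can be illustrated with a short Python sketch. It is a stand-in for the spreadsheet regression mentioned above, not a reproduction of it: it uses ordinary least squares on the natural logarithm of the fluorescence, so that $E = e^{\text{slope}}$ and, to first order, $\Delta E = E \cdot \Delta\text{slope}$. The choice of natural logarithm, the function name and the synthetic data are assumptions of this example.

```python
import math


def fit_efficiency(cycles, fluorescence):
    """Estimate PCR efficiency and its error from the exponential phase.

    Fits ln(fluorescence) = ln(N0) + c * ln(E) by ordinary least squares.
    Returns (efficiency, efficiency_sd), with efficiency_sd obtained by
    first-order propagation of the standard deviation of the slope.
    """
    n = len(cycles)
    y = [math.log(f) for f in fluorescence]
    x_mean = sum(cycles) / n
    y_mean = sum(y) / n
    sxx = sum((x - x_mean) ** 2 for x in cycles)
    sxy = sum((x - x_mean) * (yi - y_mean) for x, yi in zip(cycles, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residual_ss = sum(
        (yi - (intercept + slope * x)) ** 2 for x, yi in zip(cycles, y)
    )
    slope_sd = math.sqrt(residual_ss / (n - 2) / sxx)
    efficiency = math.exp(slope)
    return efficiency, efficiency * slope_sd


# Synthetic data: true efficiency 1.9 with a few percent of multiplicative noise.
cycles = [15, 16, 17, 18, 19, 20]
noise = [1.02, 0.98, 1.01, 0.99, 1.02, 0.98]
fluorescence = [1e-3 * 1.9 ** c * k for c, k in zip(cycles, noise)]
efficiency, efficiency_sd = fit_efficiency(cycles, fluorescence)
print(efficiency, efficiency_sd)
```

Only cycles within the exponential phase should be passed to such a fit, since Eq. 1 does not hold once the reaction approaches its plateau.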

Propagation of error on averaged values
The different data processing models were first compared using experimental data averaged over all the conditions tested (primer, sample, dilution, etc.), so as to avoid the potential bias that may result from the use of a particular primer or sample. Since each value used in the average has an associated error, we used a truncated (first-order) Taylor expansion to determine how these errors propagate to the average value:
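As a hedged illustration of this step, the sketch below propagates individual errors to an arithmetic mean. Since the partial derivative of the mean with respect to each value is $1/n$, first-order propagation gives an error of $\frac{1}{n}\sqrt{\sum_i \Delta x_i^2}$ when the individual errors are assumed independent. That independence assumption, the function name and the example values are not taken from the study.

```python
import math


def mean_with_error(values, errors):
    """Return (mean, mean_error) for values with individual errors.

    Each value contributes with a partial derivative of 1/n.
    Assumption: individual errors are independent (summed in quadrature).
    """
    n = len(values)
    mean = sum(values) / n
    mean_error = math.sqrt(sum(e ** 2 for e in errors)) / n
    return mean, mean_error


# Hypothetical efficiency values and their individual errors.
mean_e, mean_e_error = mean_with_error([1.8, 1.9, 2.0], [0.03, 0.04, 0.0])
print(mean_e, mean_e_error)
```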