Stepwise kinetic equilibrium models of quantitative polymerase chain reaction
- Gary Cobbs^{1}Email author
https://doi.org/10.1186/1471-2105-13-203
© Cobbs; licensee BioMed Central Ltd. 2012
Received: 8 March 2012
Accepted: 2 July 2012
Published: 16 August 2012
Abstract
Background
Numerous models for use in interpreting quantitative PCR (qPCR) data are present in recent literature. The most commonly used models assume the amplification in qPCR is exponential and fit an exponential model with a constant rate of increase to a select part of the curve. Kinetic theory may be used to model the annealing phase and does not assume constant efficiency of amplification. Mechanistic models describing the annealing phase with kinetic theory offer the most potential for accurate interpretation of qPCR data. Even so, they have not been thoroughly investigated and are rarely used for interpretation of qPCR data. New results for kinetic modeling of qPCR are presented.
Results
Two models are presented in which the efficiency of amplification is based on equilibrium solutions for the annealing phase of the qPCR process. Model 1 assumes annealing of complementary targets strands and annealing of target and primers are both reversible reactions and reach a dynamic equilibrium. Model 2 assumes all annealing reactions are nonreversible and equilibrium is static. Both models include the effect of primer concentration during the annealing phase. Analytic formulae are given for the equilibrium values of all single and double stranded molecules at the end of the annealing step. The equilibrium values are then used in a stepwise method to describe the whole qPCR process. Rate constants of kinetic models are the same for solutions that are identical except for possibly having different initial target concentrations. Analysis of qPCR curves from such solutions are thus analyzed by simultaneous non-linear curve fitting with the same rate constant values applying to all curves and each curve having a unique value for initial target concentration. The models were fit to two data sets for which the true initial target concentrations are known. Both models give better fit to observed qPCR data than other kinetic models present in the literature. They also give better estimates of initial target concentration. Model 1 was found to be slightly more robust than model 2 giving better estimates of initial target concentration when estimation of parameters was done for qPCR curves with very different initial target concentration. Both models may be used to estimate the initial absolute concentration of target sequence when a standard curve is not available.
Conclusions
It is argued that the kinetic approach to modeling and interpreting quantitative PCR data has the potential to give more precise estimates of the true initial target concentrations than other methods currently used for analysis of qPCR data. The two models presented here give a unified model of the qPCR process in that they explain the shape of the qPCR curve for a wide variety of initial target concentrations.
Keywords
Quantitative polymerase chain reaction qPCR Kinetic modelBackground
where T_{ n } is target concentration at the end of the reassociation/extension step of cycle n. At the beginning of the n^{th} cycle, target DNA is present at concentration T_{ n-1 } and is replicated with amplification efficiency E_{ n }. Target concentration at the end of cycle n is then: ${T}_{n}={T}_{n-1}\left(1+{E}_{n}\right)$
Equation 4 is useful when F_{ n } can be reliably distinguished from _{ b }F, which is possible only during exponential phase and later. Models of qPCR describe how E_{ n } changes with n and use equation 2 to determine the target concentration or equation 3 to determine target fluorescence at each step of the process. Some models of qPCR model _{ T }F_{ n } instead of T_{ n } and thus use the estimate of _{ T }F_{ 0 } instead of T_{ 0 }[1]. The estimated _{ T }F_{ 0 } values can be converted to T_{ 0 } by dividing by K_{ f } when a value for K_{ f } is available. The primary goal in analysis of qPCR data is to estimate T_{ 0 } or _{ T }F_{ 0 } from the qPCR curve as accurately as possible. Frequently, two or more qPCR curves are analyzed with a goal of determining the value of the ratio of two T_{ 0 } values when K_{ f } is not known. The ratio of two T_{ 0 } values is estimated by the ratio of two _{ T }F_{ 0 } values provided K_{ f } values are the same in the two qPCR experiments. This principle is often used in qPCR analyses to compare two or more samples by estimating the ratio of two T_{ 0 } values by the ratio of two estimated _{ T }F_{ 0 } values [2–4].
Numerous approaches to modeling the qPCR process are present in recent literature and involve three different general approaches. First the C_{t} and Linear Regression approaches assume constant efficiency and estimate it by linear regression of ln(F_{ n }_{ b }F) on cycle number using only values from the exponential phase of the curve (sometimes called the ‘Window-of-Linearity’) [3, 5–9] . Second, the sigmoidal function approach has used to fit a variety of sigmoidal functions to approximate the qPCR curve (Figure 1) and estimate _{ T }F_{ 0 } as the intercept of a sigmoidal function [1, 10, 11]. The sigmoidal functions used are described in the literature and none are based on a mechanistic model of the reactions in qPCR. Thirdly, kinetic models are a mechanistic approach to modeling the process and several such models have been proposed [8, 12–14]. The work I present here extends the kinetic approach to modeling qPCR by including primer concentration in the model and providing analytical equilibrium solutions for the re-association phase. I also introduce the use of simultaneous analysis of qPCR curves which have common values for rate constants in the kinetic models.
Current models
The models presented here may be regarded as an extension of the models presented by Smith et. al. (2007) [13] and Boggy and Woolf (2010) [14]. Though the models presented here are similar to those of Smith et. al. (2007) [13] for hydrolysis probe detection systems, they differ in that they are designed for intercalating dye detection systems such as SYBR green. They also use analytical solutions for equilibrium concentrations in the annealing step. One of the models is the same as that of Boggy and Woolf (2010) [14], however, a more general solution for equilibrium values for the annealing step is presented here that includes primer concentration and gives improved fit to data. Two stepwise kinetic equilibrium models are presented in which efficiency of amplification at each cycle depends on the concentrations of target and primers at that cycle. Events in a single cycle are modeled and then used in iterations to model the entire process. In each cycle, the re-association and the primer extension steps are modeled separately and consecutively. Double-stranded target DNA is denoted A_{ 1 }A_{ 2 } where A_{ 1 } and A_{ 2 } are complementary single strands. Primer that hybridizes with A_{ 1 } is denoted a_{ 1 } and primer for A_{ 2 } is denoted a_{ 2 }. Primer-template hybrid molecules are denoted A_{ 1 }-a_{ 1 } and A_{ 2 } -a_{ 2 }, respectively.
Re-association step
Summary of initial and equilibrium values for state variables of Model 1 and Model 2
At end of: | A _{ 1 } | A _{ 2 } | a _{ 1 } | a _{ 2 } | A_{ 1 } -a_{ 1 } | A_{ 2 } -a_{ 2 } | A_{ 1 } -A_{ 2 } |
---|---|---|---|---|---|---|---|
dissociation step | S _{ n,0 } | S _{ n,0 } | P _{ n,0 } | P _{ n,0 } | 0 | 0 | 0 |
re-association step | S _{ n,e } | S _{ n,e } | P _{ n,e } | P _{ n,e } | Q _{ n,e } | Q _{ n,e } | D _{ n,e } |
Model 1 reversible re-association
Model 2 non-reversible re-association
Extension step
The whole process
Since equation 7 is defined only for n = 1,2,3… it cannot be used directly to obtain a value of _{ T }F_{ 0 }, however, fitting equation 7 to an actual qPCR curve provides an estimate of S_{ 0,0 } which is the initial concentration of each of A_{ 1 } and A_{ 2 }. If the initial target sample is all double-stranded then S_{ 0,0 } is also the concentration of the double-stranded molecules, so T_{0} = S_{0,0}. Defining _{ T }F_{ 0 } as K_{f}S_{ 0,0 } provides a measure of total initial target concentration in fluorescence units. This is the amount of fluorescence that would be due to the target if it is all double-stranded. If _{ T }F_{ 0 } could actually be measured it would be K_{ f }D_{ o,e } where D_{ o,e } is the concentration of double-stranded target DNA before cycle 1. Most likely nearly all of the target is double-stranded before cycle 1 but, since the total amount of target is what is desired using K_{f}S_{ 0,0 } as the estimate of _{ T }F_{ 0 } is preferred.
Relating Model 2 to Boggy & Woolf (2010)
Model 2 annealing kinetics have been analyzed by Boggy & Woolf [14] who give an approximate equilibrium solution for the re-association step and use it to obtain a recurrence formula for double stranded target concentration over successive cycles. Their approximation is obtained by removing primer concentration from the model to obtain a simpler set of differential equations which they solve to get an approximate solution. I show in Additional file 3 that my solution for model 2 becomes equivalent to theirs if $\frac{{k}_{a}}{{k}_{a12}-{k}_{a}}$ is small and their parameter k is set to $\left(\frac{{k}_{a}}{{k}_{a12}-{k}_{a}}\right){P}_{n,e}$. This result indicates the rate constant, k, for the approximate solution found by Boggy & Woolf [14] depends on primer concentration and ignoring primer concentration in the analysis will cause at least some error in parameter estimates. Boggy and Woolf (2010) [14] note that their parameter k varied from one analysis to another, a result they did not expect, and this is likely due to variations in the amount of primer available. This finding suggest the analysis for model 2 presented here is a more accurate solution for model 2 kinetics than that of Boggy and Woolf (2010) [14] process since it includes variation in primer concentration.
Restricting model fit to only part of the qPCR curve
Since models 1 & 2 both include the effect of declining primer concentration during PCR they predict a different shape of the qPCR curve between lag and stationary phase for solutions that differ in primer concentration. Consider two solutions, one with low S_{ 0,0 } and one with high S_{ 0,0 } and with identical primer concentrations. Allow the one with low S_{ 0,0 } to undergo enough cycles so its current target concentration is equal to the initial target concentration for the solution with higher S_{ 0,0 } . If primer concentration is not limiting in any way then the two solutions would have identical qPCR curves from that point on. If primer concentration is limiting in some way then the qPCR curve for the solution with lower S_{ 0,0 } will at that point rise more slowly than that for the solution with higher S_{0,0} due to lowered primer concentration. When fitting models 1 and 2 to data it is thus desirable to include as much of the exponential phase as possible to allow changing primer concentrations to affect the analysis. Including most of the exponential phase is desired, however, in late exponential phase effects not included in the model also begin to occur and may cause the model to have poor fit. Effects in late exponential phase not modeled include reduced amount of dNTPs, reduced amount of polymerase due to decay and possibly partially replicated templates. In order to restrict analyses to only part of the exponential phase, models were fit to qPCR curves for cycles in the lag and exponential phase for which (F_{n} - _{b}F)/(F_{max} - _{b}F) ≤ L, where _{b}F, and F_{max} are baseline fluorescence and maximum fluorescence achieved in stationary phase. The value, L, is the proportion of increase of F_{n} above _{b}F and is a cut off value used to restrict the analysis to the first parts of the qPCR curve up to a point in the exponential phase where the model is thought to be valid. When choosing L, a compromise between these opposing effects is necessary and is done here by choosing the highest value for L that allows good fit to qPCR curves. To accomplish this, the models were fit to qPCR curves using a range of L values from 0.1 to 0.98 and the goodness of fit and estimated initial concentration determined for each L value. Both model 1 and 2 generally gave poor fits for L ≥ 0.95. The value of 0.8 was chosen for presentation here because the MSresidual values for most qPCR curves increased for values of L higher than 0.8 and also because the accuracy of estimates of initial target concentration was better for L values near 0.8. The question of how much of the qPCR curve to use in analysis is a problem inherent to most methods used for interpretation of qPCR curves. Boggy and Woolf (2010) [14] choose the point of maximal increase in fluorescence as their cut-off point for all curves which corresponds roughly to a value of 0.5 for L. Their rational for using only about half the exponential phase is that the effects of declining primer concentration are not likely to affect the shape of this part of the curve. The C_{t} method uses a different cut-off for every curve which is invoked when the ‘window-of-linearity’ is chosen.
Methods for incorporating experimental design into the data analysis
Kinetic models of qPCR allow more sophisticated and powerful analyses than are possible with other models. If a group of solutions have exactly the same chemical components except for possibly differing in initial target concentration, then all the kinetic parameters should be the same for all the solutions. The rate constants may then be estimated by fitting several or many qPCR curves simultaneously to get better estimates of rate constants and initial target concentrations. In such fits the kinetic parameters are common to all the different qPCR curves in the analysis but each curve may have a unique value for initial target concentration. Also kinetic models such those presented here allow an analyst to adjust the model to accommodate different experimental designs. I present four different estimation methods (A, B, C, and D) which are used here to fit models to qPCR data which have different experimental designs.
Standard curve method (Method A)
This method may be used only when the true initial target concentrations are known for each curve in the analysis (a standard curve). All initial conditions of the qPCR, other than initial target concentration are assumed to be the same for all curves. All curves are analyzed simultaneously in a non-linear fit in which the estimated rate constants apply to all curves and the initial target concentration is held constant at its known value for each curve. The fitting process minimizes the cost function over all cycles in all curves. The advantage of this method is that it pools the information of all the curves to give a single best estimate for each model parameter. The predicted fluorescence values provided by the analysis may be compared to the observed fluorescence values to assess how well each qPCR curve is fit by the model. The value of this method is that it may be used for estimation of initial target concentration for samples with unknown initial target concentration provided all other conditions of the PCR are the same as those used in the standard curve. This is done by doing a non-linear fit to the qPCR curve of the unknown sample in which only the initial target concentration, S_{ 0,0 } , is estimated while using the estimated parameters from the standard curve analysis as constants. To assess how well the models presented here estimate S_{ 0,0 } , each qPCR curve in the known samples was treated as an unknown sample and the estimated S_{ 0,0 } value compared to the known value for S_{ 0,0 }.
Dilution curve method (Method B)
This method may be used when a standard curve is not available but a dilution series of the unknown sample is available. Here the absolute target concentration in the undiluted sample is unknown and denoted S_{ 0,0,max }. The absolute target concentration of the i^{th} diluted sample is d_{i}S_{ 0,0,max } where d_{i} is the dilution ratio for the i^{th} sample. The qPCR curves from the diluted samples are analyzed simultaneously using non-linear curve fitting. Rate constants apply to all curves in the analysis and a single concentration parameter S_{ 0,0,max } is estimated with the initial concentration for each sample set at d_{i}S_{ 0,0,max }. In general usage of this method the estimated S_{ 0,0,max } would be the end result. To assess the accuracy of this method in this study the known absolute target concentrations for the data sets used here are ignored but their ratios are used to obtain values for d_{i}. Simultaneous fitting of all the curves was done as described above to estimate S_{ 0,0,max } and all the rate constants. Next the estimated rate constants are treated as constants in a second non-linear fit of each qPCR curve separately in which an S_{ 0,0 } value is estimated for qPCR curve. These estimated S_{ 0,0 } values are compared to the known values to assess the accuracy of method B.
Simultaneous curves method (Method C)
This method may be used when two or more samples are to be analyzed that may have different initial concentrations and a standard curve is not available nor are any dilutions of either sample available. The samples are assumed to be subject to qPCR with identical conditions except for the fact they may have different initial target concentrations. In this method all samples are analyzed simultaneously with estimated rate constants applying to all samples, but each sample has a unique S_{ 0,0 } value to be estimated. To assess the accuracy of this method with the data sets analyzed here all information on initial target concentration is ignored and the estimated S_{0,0} values for each sample is then compared to the known values.
Separate curves method (Method D)
This method does not assume any similarity of rate constants among any of the samples and does not analyze samples simultaneously as methods A, B, and C do. Each sample is analyzed separately with all parameters estimated independently for each qPCR curve. Here the accuracy of method C is assessed by comparing the actual S_{ 0,0 } values to the estimated ones. This is the type of analysis that is conventionally done when analyzing qPCR curves.
Methods
Data sets
Both models 1 and 2 were fit to two different data sets for which the actual initial concentrations of target are known. The model of Boggy and Woolf (2010) [14] was also fit to the same data sets and is called Model 0. The data sets used are: 1) Data of Boggy and Woolf (2010) [14], 2) Data of Rutledge (2004) [15] and are referred to as data set 1 and 2, respectively. The data of Boggy and Woolf (2010) [14] are for a 129 base pair synthetic target sequence with the following conditions for PCR: reaction vol = 25 μl, primer concentration = 400 nM each, detection system is SYTO-13 dye with concentration = 2 μM, dNTP concentration = 0.2 mM each. Six different absolute target concentrations range from 5 × 10^{3} to 5 × 10^{8} molecules per reaction volume at 10 fold increments. The data of Rutledge (2004) [15] is for the 102 base pair K1/K2 target sequence with the following conditions for PCR: reaction vol = 35 μl, primer concentration = 0.25 μM each, detection system is Syber Green dye. Six different absolute target concentrations ranged from 4.17 × 10^{2} to 4.17 × 10^{7} molecules per reaction volume at 10 fold increments. Data set 1 has 2 replicate qPCR curves for each target concentration and data set 2 has 5 replicates for each target concentration.
Determining baseline fluorescence
The first term on the right side of equation 8 is the baseline fluorescence, _{ b }F . The second term is a modified sigmoidal function describing the increase in fluorescence due to polymerase chain reaction where F_{ max } is the maximum fluorescence, c_{ 1/2 } is the cycle number at which F_{ n } is midway between _{ b }F and F_{ max }, and k_{ s } is a scale parameter. Parameters in equation 8 and bounds used in the non-linear estimation program are _{ b }F ≥ 0, F_{ max } > _{ b }F , c_{1/2} > 1, and k_{ s } > 0. Equation 8 is fit to each entire qPCR curve by nonlinear estimation of the parameters using least squares as the criterion for fit with the PROC NLIN procedure of the SAS program version 9.3 [16]. Each observation was weighted by the reciprocal of the observed fluorescence. The value of _{ b }F is estimated separately for each qPCR curve and then treated as a constant for that curve when estimating parameters of models 0, 1, and 2. In preliminary analyses for this study the baseline fluorescence was estimated as a parameter in the model with each qPCR curve having its own unique value. In the resulting fits some qPCR curves were found to have baseline values that were not consistent with the observed values in the lag phase. This is due to the estimation procedure selecting a baseline value that gives the best overall fit to the curve rather than just in the lag phase. Ruijter et. al. (2009) [9] have shown that baseline estimates can affect the quality of fit in the exponential phase for the C_{t} method and this same effect is likely occurring in fits of the models here. They also point out inherent problems with estimating baseline fluorescence from a fixed number of points in early cycles. Rutledge and Stewart (2008) [1] showed sigmoidal functions similar to equation 8 give very good fit to the observed values of qPCR curves. The use of the sigmoidal function given in equation 8 is found to always give values that fit well to data in the lag phase and is not affected much by peculiar changes in fluorescence often seen during the first few cycles.
Estimation of model parameters
Methods described here are used to determine how well each model explains the shape of each qPCR curve and also to determine how accurately each model estimates initial target concentration for each of the two data sets. Each model, 0, 1 or 2, was fit to qPCR data by nonlinear curve fitting with PROC NLIN of SAS version 9.3 [16] using the Marquardt optimization method and the mean sum of squared differences between the observed and predicted fluorescence values (MSresidual) as the measure of goodness-of-fit. Mean Square Residual (MSresidual) is a measure of how well a model explains the qPCR curve. A MSresidual value near zero indicates a very good fit in which the model predicted values are all very close to the observed values and higher values indicate a poorer fit. Up to 10,000 iterations were used for each fit. The parameters estimated and bounds imposed during estimation are: S_{ 0,0 } > 0, k > 0 for model 0, S_{ 0,0 } > 0 , K_{ f } >0, K_{ s } > K_{ D12 } >0 for model 1, and S_{ 0,0 } >0 , K_{ f } >0 , and K >0 for model 2. The initial concentrations of primers were set to the values reported by the authors providing the two data sets and expressed as nmoles/L. Additional file 4 contains SAS code that fits model 1 and 2 to a sample data set.
Effect of varying L
To determine the effect of the choice of the value of L on the results of the analysis, the entire analysis was done separately for each of a range of values for L. Specifically, each of the three models (0, 1, 2) was fit using each of the estimation methods (A, B, C, D) to each of the two data sets (1, 2) for each of the L values of 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 0.98. The MSresidual and Fold Error for each fit were averaged over replicates 1 and 2 for each data set and the resulting means plotted versus L for each dilution separately.
Effect of errors in estimation of baseline fluorescence
Analyses were done to assess the effect of errors in the estimation of baseline fluorescence, _{b}F, on the goodness of fit of the model and the estimated initial target concentration. The estimation procedure used to estimate _{b}F provides the standard error of the estimated value. Fits of the models and estimation of initial target concentration was done using _{b}F values 1 and 2 standard deviation units above and below the estimated value. Plots of the mean MSresidual and Fold Error in estimation of initial target concentration versus the _{b}F value used were used to assess the effect of errors in estimation of _{b}F.
Results
Estimation of absolute target concentration
Effect of varying L
Effect of errors in estimation of baseline fluorescence
Plots of mean MSresidual and fold error in estimated initial target concentration versus baseline fluorescence for estimation methods A, B, and D are affected very little by variation in the value of _{b}F used (plots not reported here). When using estimation method C, the MSresidual values were not affected much by variations in the _{b}F but the initial target concentrations were over-estimated when using low values of _{b}F. Increases in Fold Error by a factor of up to 10 was found in some plots when a _{b}F value two standard error units below the estimated value was used.
Discussion
Fit to qPCR curves
The two models presented here have both been shown to give good fit to the lag and exponential phase of qPCR curves over a wide range of initial target concentrations while using a single set of rate constants. The lack of fit of model 0 to qPCR curves seen in Figures 4, 5 and 6 is likely to be due to the same cause as the variation in k values pointed out by Boggy and Woolf (2010) [14] . The derivation of the Boggy and Woolf [14] formula assumes their parameter k = k_{a}/k_{a12} where k_{a} and k_{a12} are the rate constants of model 2 (Figure 3). Thus the same value of k should apply to all qPCR curves that are identical except for initial target concentration. I have shown that the parameter k of the Boggy and Woolf (2010) [14] model is not constant but depends on the primer concentration during the PCR process. Estimation methods A, B, and C force model 0 to use the same k value for all curves and thus each curve gives a poorer fit than it would if allowed to have a unique k value for each curve. These results suggest the Boggy and Woolf (2010) [14] assumption that primer concentration is sufficiently large that it can be ignored is not met in these data sets.
Estimation of T_{0}
The results here indicate the best methods for estimation of target concentration in an unknown sample is to use a standard curve that has a range of values for initial target concentration or to use a dilution curve composed of a series of dilutions of the original sample and analyze the qPCR curve with estimation methods A and B, respectively. Figures 4, 5, 7, 8, 9, and 10 indicate these two methods (A and B) are essentially equivalent in their accuracy of estimation. Note that with the dilution curve method, B, the absolute initial concentrations of target are estimated independently of the known values. Only the dilution ratios are used in the simultaneous fit to the data. The best way to estimate target concentration of an unknown sample when a standard curve is not available is to make a series of dilutions of the sample and then apply qPCR to each dilution and use the dilution curve method described here to estimate the target concentration of the undiluted sample. The separate curves method, D, worked well for data set 1 for both models 1 and 2, and worked moderately well only for model 1 for data set 2. Lastly, estimation method C gave good estimates for data set 2 but poorer estimates for data set 1. More sources of data need to be studied to determine the suitability of estimation methods C and D with these models.
The poor fit of the Boggy and Woolf (2010) [14] model to qPCR curves with estimation methods A, B, and C suggest it will give poorer estimates of initial target concentration than models giving good fit. Estimation method D allows each curve to have unique parameter values and gives much better fit to qPCR curves when using model 0, however, the accuracy of estimates is not as good as those for models 1 and 2. The non-linearity of initial target concentration estimates for Model 0 with method D using data set 2 as seen in Figure 8 panels C and D data is different than the result reported by Boggy and Woolf (2010) [14] who obtained a very linear relationship using method D. The reason for this difference is not clear but may be due to the fact that model fitting is done differently. Boggy and Woolf (2010) [14] estimated baseline fluorescence as a parameter in the model. I estimate baseline fluorescence before fitting the model and treat it as a constant during model fitting. Other minor differences in the programs used for fitting and the searching routine may have also caused some difference in the result.
Effect of varying L
Figures 11 and 12 as well as those given in Additional file 5 indicate the effects of variation of L depend on the estimation method used and to a lesser extent on the concentration of initial target sequence. Examination of the plots in Additional file 5 indicate the effects of L can be variable but some general patterns are present which are shown in Figures 11 and 12. First, for all models, all methods and all dilutions, MSresidual increases with increasing L. All the plots involve a log scale and show an approximate linear increase of MSresidual with L indicating an underlying exponential increase of MSresidual with L. Thus when L has values of 0.9 and higher the fits to qPCR curves are much worse than with lower values of L. Secondly, the effect of L on the accuracy of the estimated initial target concentration depends strongly on the estimation method used. Estimated target concentrations obtained with estimation methods A and B show remarkably little dependence on the value of L used for either of the data sets analyzed. There is a consistent trend for a slight over-estimation of target concentration for L values above 0.8 with the amount of over-estimation increasing with increasing L. The dependency of the accuracy of estimation by methods C and D on L was more variable, however, both methods showed a patterns similar to that present in methods A and B, but more extreme over-estimation of initial target concentration when L values above 0.8 are used. For these reasons the value of 0.8 was used for L in the analyses presented here. Though the use of L = 0.8 gave good fits to qPCR curves and accurate estimates of initial target concentration, investigation of other methods for restricting analysis to only parts of the qPCR curve may be worthwhile.
Effect of errors in estimation of baseline fluorescence
Estimation methods A, B and D are robust to errors in baseline fluorescence. Estimation method C is less robust to such errors.
General comments
The models presented here assume the annealing of single-stranded DNA to double-stranded molecules reaches equilibrium at each cycle. This assumption may not be true, however, the ability of the present models to fit qPCR curves as well as they do suggests it may be approximately true. Model 1 gave slightly better fits to qPCR curves and slightly better estimates of initial concentration than Model 2 when estimation methods A and B were used. This result suggests the reversible reactions assumed in model 1 (Figure 2) may be a better model than the non-reversible reactions assumed in model 2 (Figure 3). It is thought the dissociation constant for double-stranded templates is so small that the annealing of templates is effectively non-reversible [8]. However, the annealing of primer and template may produce a much less stable double-stranded structure such that the reversible reaction model is more reasonable. A model in which template-template molecules are completely stable and primer-template molecules are not is not useful when considering equilibrium solutions, as equilibrium then occurs only when the concentration of primer-template molecule is zero. To use such a model, a non-equilibrium solution would need to be found. It is also possible that there is not enough time allowed during the experiments for the re-association reactions to reach equilibrium. Whether or not DNA re-association achieves equilibrium or not during PCR is not clear. Smith et. al. (2007) [13] offer some analysis suggesting that equilibrium does occur.
An advantage of the mechanistic approach to modeling qPCR curves is that models describe actual events of the process and thus may be expanded to include other effects that may improve accuracy. Kinetic models naturally allow simultaneous estimates of common parameters which will increase accuracy. This is extremely difficult or impossible with other methods used for modeling qPCR. Because the kinetic models account for variable efficiency by kinetic theory they may give better estimates of initial target concentration than other approaches. Another advantage of kinetic models is that parameters of the model may be estimated by dedicated experiments distinct from the qPCR curve experiment. For example, the parameter K_{ f } present in both models 1 and 2 converts double-stranded DNA concentration into amount of fluorescence for a particular experimental system. The value of K_{ f } could be determined in experiments separate from qPCR and then used as a constant in the analysis of a qPCR curve, thus increasing the accuracy of estimation of the kinetic parameters and target concentrations. Lastly, kinetic models uniquely allow estimation of absolute initial concentrations of target sequence without use of a standard curve of any type. The accuracy of estimation with kinetic models is enhanced greatly by the use of a standard curve, but it is not required. In fact the dilution curve method gave fits essentially as good as the standard curve method. The mechanistic models developed here offer a more complete description of the amplification occurring in qPCR, fit observed data very well, and allow more accurate estimation of initial target concentration than other methods.
Conclusions
Two stepwise kinetic equilibrium models of qPCR are presented and analytical solutions are given for equilibrium values during annealing of single stranded to double stranded molecules. The models are amenable to different types of non-linear fitting which include fitting several curves simultaneously when they have common parameter values. Both models are shown to give very good fit to qPCR data with a wide range of initial target concentrations with a single set of values for rate constants. The two models also give accurate estimates of initial absolute target concentration using several different methods for estimation. Using the models with data from a standard curve gives accurate estimates of initial absolute target concentration. In the absence of a standard curve, a dilution curve method also provided accurate estimates of the initial absolute target concentration. In the absence of either a standard or dilution curve the models provide estimates, though less accurate, of absolute initial target concentration. These models presently give the best unified theory for the interpretation of qPCR data in that they explain well the shape of the qPCR curve and how it is affected by variation in the initial target concentration.
Declarations
Acknowledgments
I thank Dr. James Alexander, Dr. Robert Page, Dr. David Schultz and 2 reviewers for critically reviewing earlier versions of this manuscript.
Authors’ Affiliations
References
- Rutledge Robert G, Don S: A kinetic-based sigmoidal model for the polymerase chain reaction and its application to high-capacity absolute quantitative real-time PCR. BMC Biotechnol 2008, 8: 47. 10.1186/1472-6750-8-47PubMed CentralView ArticlePubMedGoogle Scholar
- Iker S-N, Angelo G-P, Manuel G-M, Alvaro P-M, David H, Rocio L, Rosario M, Paloma C, Marta M, Enrique E, Juan Angel Fresno V: Comparison of gene expression profiling by reverse transcription quantitative PCR between fresh frozen and formalin-fixed, paraffin-embedded breast cancer. Biotechniques 2010, 48(5):389–397. 10.2144/000113388View ArticleGoogle Scholar
- Pfaffl MW: A new mathematical model for relative quantification in real-time RT_PCR. Nucleic Acids Res 2001, 29(9):00.View ArticleGoogle Scholar
- Pfaffl Michael W, Horgan GW, Leo D: Relative expression software tool (RESTÂ©) for group-wise comparison and stistical analysis of relative expression results in real-time PCR. Nucleic Acids Res 2002, 30:(9):e36.PubMed CentralPubMedGoogle Scholar
- Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2-DDCT method. Methods 2001, 25: 402–408. 10.1006/meth.2001.1262View ArticlePubMedGoogle Scholar
- Yuan Joshua S, Neal Stewart C Jr: Real-time PCR Statistics. PCR Encyclopedia 2005, 1: 101149–101127.Google Scholar
- Ramakers C, Ruijter JM, Lekanne Deprez RH, Antoon FM M: Assumption-free analysis of quantitative real-time polymerase chain reaction (PCR) data. Neurosci Lett 2003, 339: 62–66. 10.1016/S0304-3940(02)01423-4View ArticlePubMedGoogle Scholar
- Gevertz Jana L, Dunn SM, Roth CM: Mathematical Model of Real-Time PCR Kinetics. Biotechnol Bioeng 2005, 92: 346–355. 10.1002/bit.20617View ArticlePubMedGoogle Scholar
- Ruijter JM, Ramakers C, Hoogaars WMH, Karlen Y, Bakker O, Van den Hoff MJB, Moorman AFM: Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res 2009, 37:(6):e45. 10.1093/nar/gkp045PubMed CentralView ArticlePubMedGoogle Scholar
- Alvares Mariano J, Vila-Ortiz GJ, Salibe MC, Podhajcer OL, Pitossi FJ: Model based analysis of real-time PCR data from DNA binding dye protocols. BMC Bioinforma 2007, 8: 85. 10.1186/1471-2105-8-85View ArticleGoogle Scholar
- Batsch A, Andrea N, Christian F, Anita U, Daliborka L, Tina L, Julia P, Andreas L, Edgar S, Dirk G: Simultaneous fitting of real-time PCR data with efficiency of amplification modeled as Gaussian function of target fluorescence. BMC Bioinforma 2008, 9: 95. 10.1186/1471-2105-9-95View ArticleGoogle Scholar
- Mehra S, Wei-Shou H: A Kinetic Model of Quantitative Real-Time Polymerase Chain Reaction. Biotechnol Bioeng 2005, 91: 848–860. 7 7 10.1002/bit.20555View ArticlePubMedGoogle Scholar
- Smith Marjo V, Miller CR, Michael K, Walker NJ, Portier CJ: Absolute estimation of initial concentrations of amplicon in a real-time RT-PCR process. BMC Bioinforma 2007, 8: 409. 10.1186/1471-2105-8-409View ArticleGoogle Scholar
- Boggy Gregory J, Woolf PJ: A Mechanistic Model of PCR for Accurate Quantification of Quantitative PCR Data. PLoS One 2010, 5(8):e12355. 10.1371/journal.pone.0012355PubMed CentralView ArticlePubMedGoogle Scholar
- Rutledge Robert G: Sigmoidal curve-fitting redefines quantitative real-time PCR with the prospective of developing automated high-throughput applications. Nucleic Acids 2004, 32: e178. No. 22 No. 22 10.1093/nar/gnh177View ArticleGoogle Scholar
- SAS version 9.3 TS Level 1M0 SAS Institute Inc: 100 SAS Campus Drive Cary. NC 27513–2414, USA http://www.sas.com
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.