A fixed-point algorithm for estimating amplification efficiency from a polymerase chain reaction dilution series
© Jones et al.; licensee BioMed Central Ltd. 2014
Received: 28 May 2013
Accepted: 31 October 2014
Published: 10 December 2014
The polymerase chain reaction amplifies and quantifies small amounts of DNA. It is a cyclic process, during each cycle of which each strand of template DNA is copied with probability approaching one: the amount of DNA approximately doubles and this amount can be estimated fluorimetrically each cycle, producing a set of fluorescence values hereafter referred to as the amplification curve. Commonly the biological question of relevance is one of the ratio of DNA concentrations in two samples: a ratio that is deduced by comparing the two amplification curves, usually by way of a plot of fluorescence against cycle number. Central to this analysis is measuring the extent to which one amplification curve is shifted relative to the other, a measurement often accomplished by defining a threshold or quantification cycle, C q , for each curve: the fractional cycle number at which fluorescence reaches some threshold or at which some other criterion (maximum slope, maximum rate of change of slope) is satisfied.
We propose an alternative where position is measured relative to a reference curve; position equates to the cycle shift which maximizes the correlation between the reference and the observed fluorescence sequence. A key parameter of the reference curve is obtained by fixed-point convergence.
We consider the analysis of dilution series constructed for the estimation of qPCR amplification efficiency. The estimate of amplification efficiency is based on the slope of the regression line when the C q is plotted against the logarithm of dilution. We compare the approach to three commonly used methods for determining C q ; each is applied to publicly accessible calibration data sets, and to ten from our own laboratory. As in the established literature we judge their relative merits both from the standard deviation of the slope of the calibration curve, and from the variance in C q for replicate fluorescence curves.
The approach does not require modification of experimental protocols, and can be applied retrospectively to existing data. We recommend that it be added to the methodological toolkit with which laboratories interpret their real-time PCR data.
Since its introduction by Mullis et al., the polymerase chain reaction (PCR) has been widely used to amplify and quantify small amounts of DNA. Briefly, it comprises a few dozen cycles, each with three stages: denaturation, annealing and extension.
where $\Delta C = C_b - C_a$. Accordingly, the cycle difference, $\Delta C$, and $E$ (or equivalently, $p$) are the key to estimating the ratio $N_A/N_B$.
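The dependence of the concentration ratio on both $E$ and $\Delta C$ can be illustrated numerically. The sketch below assumes that Eq. 1 takes the usual form $N_A/N_B = E^{\Delta C}$; the function name `concentration_ratio` is hypothetical and introduced only for illustration.

```python
def concentration_ratio(efficiency: float, delta_c: float) -> float:
    """Initial concentration ratio N_A/N_B implied by a cycle shift delta_c = C_b - C_a.

    Assumption: Eq. 1 has the form N_A/N_B = E**delta_c.
    """
    return efficiency ** delta_c


# Perfect doubling (E = 2): a shift of three cycles implies an eight-fold ratio.
print(concentration_ratio(2.0, 3.0))
# With a lower efficiency the same shift implies a smaller ratio.
print(concentration_ratio(1.85, 3.0))
```

The comparison shows why an error in $E$ matters: it is raised to the power $\Delta C$, so the error in the ratio grows with the cycle shift.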
where $F_b$ is background, and $F_{max}$ is the maximum contribution of the reaction to fluorescence (the asymptote, rather than the maximum observed experimentally), to describe the data. Rutledge and Stewart introduced an analysis which takes into account the linear decrease in amplification under this model, simplifying the estimation of the initial amplification efficiency from the curve itself. MIQE guidelines recommend the former approach: ‘PCR amplification efficiency must be established by means of calibration curves...’ but we acknowledge ongoing debate on this issue.
Strictly speaking the data from a tube are discontinuous; fluorescence is measured at the end of each cycle, and there is no such thing as a fluorescence after a fractional number of cycles as implied by the continuous functions above. We use the term reference curve to imply an abstraction; a smooth continuous curve of fluorescence as a function of x, which we observe at cycles C which are integer values of x. The observed fluorescence is the fluorescence at these integer values, but with the addition of error or noise.
A key to analyzing PCR, therefore, is, given two fluorescence curves, to measure ∆ C, the extent to which one curve is shifted laterally relative to the other. There are two very different circumstances under which one may need to do this. If E is to be estimated from the cycle-to-cycle increase in fluorescence of a single assay tube, then quantifying some aspect of the fluorescence is important. Conversely, if one is using dilution to estimate amplification then the shape of the curve is of less import so long as the data and the reference curve have in common that they are S-shaped (sigmoid): interest lies only in the extent to which dilution has shifted the curve, of whatever shape, to the right. Whatever the method of estimating E, that estimate is commonly used subsequently to derive, in concert with a measured cycle difference between two tubes, ∆ C, the initial concentration ratio implied by Eq. 1.
It is the estimation of cycle shift in these scenarios which we address: to what extent is one fluorescence curve shifted relative to another? There is, of course, a significant literature detailing several algorithms to do just that, and we should justify any attempt to add another. Ruijter et al. have examined the performance of nine estimators of $E$ and have proposed several measures of their relative merits. In using the publicly available data sets comprising dilution series for establishing amplification efficiency, two measures are of central importance. One is the within-replicate variance; most data sets have three or more replicates at each dilution, and for a good estimator we expect values of $C_q$ from these replicates to be close. The second measure is the standard deviation of the estimate of the slope when $C_q$ is regressed against the logarithm of dilution; the smaller the standard deviation of the slope, the smaller will be that of the estimated efficiency. Following Ruijter et al., we use both of these, and compare approaches using Friedman's non-parametric rank sum.
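The two measures of merit can be computed as in the following minimal sketch. The $C_q$ values are hypothetical, and the function names `slope_with_se` and `replicate_sd` are ours, introduced only for illustration; the slope and its standard error are those of an ordinary least-squares regression.

```python
import math
from statistics import mean, stdev


def slope_with_se(x, y):
    """Ordinary least-squares slope of y on x, and the standard error of that slope."""
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, math.sqrt(rss / (n - 2) / sxx)


def replicate_sd(cq_by_dilution):
    """Standard deviation of C_q within each set of replicates (one list per dilution)."""
    return [stdev(reps) for reps in cq_by_dilution]


# Hypothetical C_q values: triplicates at four successive two-fold dilutions.
cq = [[20.1, 20.0, 20.2], [21.2, 21.3, 21.1], [22.4, 22.3, 22.5], [23.5, 23.6, 23.4]]
log2_dilution = [k for k, reps in enumerate(cq) for _ in reps]
flat_cq = [c for reps in cq for c in reps]
slope, se = slope_with_se(log2_dilution, flat_cq)
print(f"slope = {slope:.3f}, standard error = {se:.3f}")
print("replicate standard deviations:", [round(s, 3) for s in replicate_sd(cq)])
```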
The three algorithms which we examine in detail, and which performed very well in the review by Ruijter et al., are $C_{y0}$, Standard-$C_q$, and PCR-Miner. The latter algorithm includes both an estimate of $C_q$ and an estimate of efficiency derived from each curve, and we should emphasize that we are implementing only the $C_q$-estimating component of PCR-Miner. To avoid confusion with the full PCR-Miner algorithm we will refer to it as the SDM-l4 method (second derivative maximum of the model designated l4 in the qpcR package associated with the R statistical software).
Notwithstanding their established utility we have concerns about each of these approaches. For $C_{y0}$, the derived $C_q$ depends on the baseline. We regard baseline fluorescence as a ‘nuisance’ parameter, as do several algorithms that attempt to eliminate it. Our bias (and we accept that it is personal bias) is to use an estimator independent of baseline. Standard-$C_q$ finds the fluorescence ($F_q$) at the second derivative maximum (SDM) for the (mean) undiluted sample, and for subsequent samples $C_q$ is defined as the (interpolated) cycle at which fluorescence achieves $F_q$. Again this is influenced by sample-to-sample variation in baseline, and for the subsequent diluted samples it takes information from only two readings out of the entire curve. The SDM-l4 approach overcomes the above reservations by fitting a four-parameter sigmoid curve and calculating the cycle of SDM, as implemented, for instance, in some commercially available software. This approach is independent both of baseline and of scale, but it raises a more subtle problem. Each reference curve is of a different shape. The distance between the curves in a dilution series is not well defined if each curve is of a different shape; they are not the same curve translated laterally along the cycle axis. The distance between the second derivative maxima, for instance, is different from the distance between the first derivative maxima.
Ethics approval for the use of peripheral blood leucocytes was obtained from the Human Research Ethics Committee of The Queen Elizabeth Hospital (South Australia), and the use of samples followed the protocol approved by that committee, as documented in Bianco-Miotto et al.
RNA extraction and reverse transcription
RNA was extracted from cells grown in tissue culture using Trizol (Invitrogen, USA) according to the manufacturer's protocol. The concentration of RNA was determined using a Biophotometer (Eppendorf, North America Inc, Westbury, USA). DNAse treatment of total RNA was performed prior to reverse transcription in order to minimize PCR signal arising from carry-over genomic DNA (Ambion DNAfree kit). RNA was reverse transcribed using Superscript III RT (Invitrogen, USA). cDNA was diluted 20 fold in ultra pure water (Fischer Biotech) prior to real time PCR.
Preparation of genomic DNA
Mononuclear cells were isolated from the peripheral blood of healthy donors using Lymphoprep (Axis-Shield, Oslo, Norway) according to the manufacturer's instructions. Genomic DNA (gDNA) was purified from the mononuclear cells using Trizol (Invitrogen Life Technologies, NY, USA) according to the manufacturer's instructions.
Preparation of dilution series
50 μl of ultra pure water was aliquoted into a series of 0.5 ml PCR tubes, and either 50 μl of gDNA or 50 μl of cDNA was added to the first tube and mixed by pipetting up and down 10 times. 50 μl of this mixture was then pipetted to the next tube and mixed, and the process repeated across the tubes, to produce a two-fold serial dilution.
Real-time polymerase chain reaction
PCR amplification was performed in 20 μL final volumes containing 6 μL of cDNA or gDNA template, 2 μL each of forward and reverse primer (5 μM), and 10 μL of 2 × Quantitect Sybr Green Master Mix (Qiagen, Germany). Thermocycling was performed in a Rotorgene 6000 thermocycler (Corbett, Australia) with an initial activation/denaturation (hot start) at 95°C for 15 min, followed by 45 cycles of 20 sec at 95°C, 30 sec at the annealing temperature, and 30 sec extension at 72°C. After the cycling there was a final extension at 72°C for 4 min. Triplicate reactions of twelve (sometimes eleven) 2-fold dilutions were performed on all samples. Products were then melted in the Rotorgene 6000 thermocycler from 60°C to 99°C at 0.5°C for 5 sec per step to determine whether the PCR products melted at the same temperature as PCR products that had been fractionated through 1% agarose gel to confirm that the product was of the predicted size.
Details of amplicons and primers appear under Additional file 1: Table S1.
Data analysis was carried out under GNU/Linux Ubuntu 14.04 LTS using the R programming language and the associated packages qpcR and ggplot2. The fixed-point estimator is as documented below. The methods Standard-$C_q$, SDM-l4 and $C_{y0}$ were implemented as follows.
Standard- C q
The essence of Standard-$C_q$ is to locate the fractional cycle corresponding to the SDM of the (averaged) undiluted sample, and to define $F_q$ as the (interpolated) fluorescence at that fractional cycle. The $C_q$ of each sample, diluted or undiluted, is the fractional cycle at which $F_q$ is achieved.
If $F_i$ denotes the fluorescence at the $i$th cycle of the averaged undiluted samples, we find the $i$ for which the second derivative, $(F_{i-1} - 2F_i + F_{i+1})$, is maximal. Assuming that the second derivative of fluorescence, if continuous, would be adequately approximated by a quadratic around the $i$th cycle, we take the fractional cycle maximizing that quadratic as the location of the SDM. The (mean) fluorescence of the undiluted sample at that fractional cycle is then found by interpolating the cubic through the adjacent four fluorescence values. This defines $F_q$. For each sample we then find $k$ such that $F_{k-1} < F_q < F_k$, implying that $C_q$ for that sample lies between $(k-1)$ and $k$. Again the fractional cycle at which $F_q$ occurs is found by cubic interpolation of the observed fluorescence at $F_{k-2} \ldots F_{k+1}$.
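The first stage of this procedure, locating the SDM from second differences and a fitted quadratic, might be sketched as follows. This is an illustration rather than the implementation used for the results: the function names and the synthetic logistic curves are assumptions, and the subsequent cubic interpolation for $F_q$ is omitted.

```python
import math


def sdm_cycle(fluor):
    """Fractional cycle of the second derivative maximum (cycles numbered from 1).

    Second differences F[i-1] - 2*F[i] + F[i+1] approximate the second derivative;
    the vertex of the parabola through the largest second difference and its two
    neighbours locates the maximum.
    """
    d2 = [fluor[i - 1] - 2 * fluor[i] + fluor[i + 1] for i in range(1, len(fluor) - 1)]
    # Search interior second differences only, so that both neighbours exist.
    j = max(range(1, len(d2) - 1), key=lambda k: d2[k])
    a, b, c = d2[j - 1], d2[j], d2[j + 1]
    offset = (a - c) / (2 * (a - 2 * b + c))
    # d2[j] is centred on fluor[j + 1], which is cycle j + 2.
    return (j + 2) + offset


def logistic_curve(midpoint, baseline=0.0, scale=1.0, rate=0.6, cycles=45):
    """Synthetic, noise-free fluorescence for cycles 1..cycles (illustrative only)."""
    return [baseline + scale / (1 + math.exp(-rate * (x - midpoint)))
            for x in range(1, cycles + 1)]


base = sdm_cycle(logistic_curve(25))
print("SDM of reference curve:", round(base, 3))
# Moving the curve three cycles to the right moves the SDM by three cycles.
print("shift:", round(sdm_cycle(logistic_curve(28)) - base, 6))
# Changing baseline and scale leaves the SDM where it was.
print("baseline/scale effect:", round(sdm_cycle(logistic_curve(25, 5.0, 2.0)) - base, 6))
```

The last two lines illustrate the property that makes the SDM attractive as a landmark: it follows a lateral shift of the curve but is indifferent to baseline and scale.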
and we have implemented both for comparison. We denote the former Cy0-b5, the latter Cy0-l5, referring to the five-parameter functions b5 and l5 of the qpcR package.
The function Cy0 from the qpcR package takes the five parameter function and returns C y0 as the point of intersection with the abscissa of the tangent through the maximum first derivative.
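The geometry of $C_{y0}$, and its dependence on baseline, can be seen in a simplified setting. The sketch below uses a four-parameter logistic rather than the five-parameter functions b5 and l5, and does not call the qpcR package; the function name `cy0_logistic4` and the parameter values are hypothetical.

```python
def cy0_logistic4(fb, fmax, b, c):
    """C_y0 for the curve F(x) = fb + fmax / (1 + exp(-(x - c) / b)).

    The maximum first derivative is at the inflection x = c, where
    F = fb + fmax / 2 and the slope is fmax / (4 * b). The tangent at that
    point crosses F = 0 at the value returned.
    """
    slope = fmax / (4 * b)
    return c - (fb + fmax / 2) / slope


# Same reaction kinetics, two different baselines.
print(cy0_logistic4(fb=0.0, fmax=10.0, b=1.6, c=25.0))
print(cy0_logistic4(fb=0.5, fmax=10.0, b=1.6, c=25.0))
```

Under these assumptions a change in baseline alone moves the intersection with the abscissa, which is the dependence on baseline noted earlier for $C_{y0}$.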
Theoretical development of fixed-point approach
In estimating ∆ C we are quantifying the extent to which one curve needs to be shifted horizontally (on the cycle axis) in order that it might overlie the other. That aim requires three qualifications: first, that there may need to be some vertical shift to accommodate different baselines; second, that the same applies to scale; third that we have equally-spaced points rather than a continuous curve.
If, as in some standard analyses, the ‘position’ of a fluorescence curve is taken to be the fractional cycle at which fluorescence attains some arbitrary threshold, then the tube-to-tube variation in the baseline and scale of fluorescence becomes a problem; scale is particularly so where fluorescence has not reached a terminal plateau. The appeal of using position of maximum first or second derivative (as in PCR-Miner software) is that these are not influenced by changes in baseline or scale.
We can ask how much one fluorescence curve needs to be shifted such that it overlies another, but because we have points, rather than continuous curves we will usually find that, at best, one set of points lies close to, but between, the points of the other.
A reference curve
where $A = e^{\beta}$. We still have four parameters, but instead of varying $\beta$ to obtain best fit we are varying $A = e^{\beta}$. This makes no difference to the fit; it just makes the physical significance of the parameters more obvious to the reader.
Looking at the four parameters in turn we have
$F_b$: This is the background fluorescence of an assay, which we are assuming to be a nuisance variable. We want our estimate, $C_q$, to be independent of $F_b$.
$F_{max}$: This is the difference in fluorescence between $F_b$ and the asymptote which fluorescence is approaching. We take tube-to-tube variation in $F_{max}$ to result from differences in such factors as the opacity of the assay tubes; it is a nuisance variable, and $C_q$ should be invariant with $F_{max}$.
$x_0$: This determines shift along the abscissa (cycle axis). In the four-parameter models mentioned, it is the fractional cycle at which the fluorescence representing reaction product is 50% of $F_{max}$.
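$A$: This governs the cycle-to-cycle ratio of the fluorescence contributed by the reaction which, for large $x_0 - x$, tends to $A$. That is, the parameter $A$ is the amplification efficiency during the early, exponential, part of the chain reaction.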
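The limiting behaviour of the fourth parameter can be checked numerically. The sketch below assumes that, with baseline removed and scale set to one, the logistic reference curve takes the form $1/(1 + A^{x_0 - x})$; this form is our reading of the parameter descriptions above, and the function name `reference` is hypothetical.

```python
def reference(a, x0, x):
    """Assumed reference curve: fraction of maximal fluorescence at cycle x."""
    return 1.0 / (1.0 + a ** (x0 - x))


a, x0 = 1.8, 30.0
for x in (5, 15, 25, 29):
    # Ratio of fluorescence in successive cycles; early cycles give values near a.
    ratio = reference(a, x0, x + 1) / reference(a, x0, x)
    print(f"cycle {x}: ratio {ratio:.4f}")
print("value at x0:", reference(a, x0, x0))
```

Early in the reaction the ratio is close to $A$; it falls towards one as the curve approaches its plateau, and the curve passes through one half at $x = x_0$.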
If we now want to fit the fluorescence in each tube of a dilution series to the ‘same’ continuous function, it seems preferable to use the function which models the appropriate amplification efficiency. By the ‘same’ continuous function we mean functions for which $A$ is the same. We accept that $F_b$, $F_{max}$ and $x_0$ will vary from tube to tube, because baseline, scale and $C_q$ will vary from tube to tube. But the idea of running a dilution series to determine amplification efficiency is predicated on the assumption that amplification efficiency, and hence $A$, is the same for every tube. We might seek, therefore, to find for each tube the values for $F_b$, $F_{max}$ and $x_0$ that best fit the observed fluorescence in that tube, keeping $A$ fixed at the amplification efficiency as derived in the usual way from a regression of $C_q$ against logarithm of dilution.
The impasse is now obvious: the point of a dilution assay is to determine the amplification efficiency. Until we know the amplification efficiency we do not know the appropriate value of $A$ to use in Eq. 5 to determine the $C_q$ for each tube. The Banach fixed-point theorem resolves that impasse. Using fixed-point convergence (see, for instance, Chapra and Canale), we begin with an initial guess, $A_0$. This leads to estimates of $C_q$ for each tube, on the basis of which the slope of the regression against logarithm of dilution gives a first estimate $E_1$ of amplification efficiency. We now replace our initial guesstimate $A_0$ with $A_1 = E_1$ and repeat the process, giving a second estimate $E_2$ and so on, until subsequent estimates are unchanged and convergence has been achieved.
For the process to converge, the requirement of the fixed-point theorem is that a plot of $E_1$ against $A_0$ (which is, of course, also that of $E_2$ against $A_1$ and so on) should have an absolute slope less than one. Provided this condition is satisfied (and for the data sets considered here it is), the theorem asserts that
the process will converge
the smaller the slope, the faster it will converge
the value to which it converges is independent of the starting estimate A 0
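These three assertions can be illustrated with a toy iteration. The two linear maps below are hypothetical stand-ins for the relationship between input $A$ and output $E$ (they are not derived from any data); both have the fixed point 1.85 but differ in slope.

```python
def fixed_point(g, start, tol=1e-10, max_iter=200):
    """Iterate a -> g(a) until successive values differ by less than tol.

    Returns the converged value and the number of iterations used.
    """
    a = start
    for n in range(1, max_iter + 1):
        nxt = g(a)
        if abs(nxt - a) < tol:
            return nxt, n
        a = nxt
    raise RuntimeError("no convergence; is the absolute slope less than one?")


def shallow(a):
    """Hypothetical map with slope 0.08 and fixed point 1.85."""
    return 1.702 + 0.08 * a


def steep(a):
    """Hypothetical map with slope 0.80 and the same fixed point 1.85."""
    return 0.37 + 0.80 * a


for start in (1.2, 1.5, 2.5):
    value, n = fixed_point(shallow, start)
    print(f"shallow map, start {start}: {value:.6f} after {n} iterations")
value, n = fixed_point(steep, 1.5)
print(f"steep map, start 1.5: {value:.6f} after {n} iterations")
```

Every starting value leads to the same limit, and the map with the smaller slope reaches it in far fewer iterations.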
The reference curves for all tubes are now the same shape, apart from their shift along the abscissa determined by $x_0$. Because shift along the abscissa is the only difference between any two curves, the concept of $\Delta C_q$ as that difference is now unambiguous; we could define it as the difference between first derivative maxima, between second derivative maxima, or between the cycles at which some fraction of the increase has been achieved, and all these definitions would result in the same $\Delta C_q$. The simplest definition is to use $x_0$ as the $C_q$ for the corresponding tube (the cycle at which the reaction is 50% of its maximum fluorescence).
- 1. Using an initial guesstimate of amplification efficiency (we have used $A = 1.5$), define the reference function $f(A, x_0, x)$.
- 2. Use the algorithm of choice to find, for each tube, the value of $x_0$ which maximizes the correlation between $f(A, x_0, x)$ and the fluorescence data from that tube. For that tube let $C_q = x_0$. The first fluorescence data from batsch1 are shown in Figure 1. By inspection we can see that half the generated fluorescence occurs by about cycle 29, so the reference curve showing greatest correlation will have a value of $x_0$ of about 29. Figure 2 shows a plot of correlation against $x_0$ for values of $x_0$ from 1 to 45, and we can see the correlation maximizing at cycle 29. A good first estimate of the fractional cycle at which correlation maximizes is obtained by quadratic interpolation using the three correlations at this and the adjacent cycles. Given that the correlations are 0.99497, 0.99766, and 0.99408 at cycles 28, 29 and 30 respectively, the vertex of the interpolating quadratic implies a maximum at cycle $29 + \frac{0.99497 - 0.99408}{2\,(0.99497 - 2 \times 0.99766 + 0.99408)} \approx 28.93$.
- 3. Regressing $C_q$ against logarithm of dilution, determine the estimated amplification efficiency, and return to step 1, replacing $A$ with this estimate. We prefer to use logarithms to base 2, as in Figure 3, because the implication of the regression slope is clear from inspection; a doubling at each cycle ($E = 2$) would imply that a two-fold dilution shifts the fluorescence curve by exactly one cycle. The regression slope in Figure 3 is 1.185, implying that it takes 1.185 cycles to compensate for a two-fold dilution. If there is an $E$-fold increase each cycle then $E^{1.185} = 2$, so that $E = 2^{1/1.185} \approx 1.795$.
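The arithmetic in steps 2 and 3 can be reproduced from the figures quoted in the text. The helper names below are ours; the three correlations and the regression slope of 1.185 are those given above.

```python
def parabola_peak(x_mid, y_left, y_mid, y_right):
    """Abscissa of the vertex of the parabola through three points one cycle apart."""
    return x_mid + (y_left - y_right) / (2 * (y_left - 2 * y_mid + y_right))


def efficiency_from_slope(slope, dilution_factor=2.0):
    """Efficiency E satisfying E**slope == dilution_factor."""
    return dilution_factor ** (1.0 / slope)


# Step 2: correlations at cycles 28, 29 and 30.
cq = parabola_peak(29, 0.99497, 0.99766, 0.99408)
# Step 3: slope of C_q against log2 dilution.
e1 = efficiency_from_slope(1.185)
print(f"C_q = {cq:.3f}, E_1 = {e1:.3f}")  # C_q = 28.929, E_1 = 1.795
```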
Return to step 1, replacing the initial $A = 1.5$ with $A = 1.795$. Iterate until the estimated efficiency is unchanged. This is the fixed-point iteration, and is illustrated in Figure 4. In the above we started with guesstimate $A_0 = 1.5$ (deliberately far from what we expect, so as to illustrate the method) and the first iteration returns an estimated efficiency of 1.795. This corresponds to the vertical blue line on Figure 4, in which the red curve shows, for this process, the output $E$ for a range of input $A$ from 1.4 to 2.2. Banach's fixed-point theorem guarantees convergence if the slope of this curve is less than one in absolute value in the region of interest (as shown here for these data). We then replace our initial $A_0 = 1.5$ with the revised $A_1 = 1.795$ (horizontal blue line), completing the first iteration of fixed-point convergence. The second iteration is shown in green, giving the second estimate $E_2 = 1.848$ and so on, converging rapidly to an amplification efficiency of 1.8517.
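The whole procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the code used for the results reported here: the reference curve is assumed to be $1/(1 + A^{x_0 - x})$, the maximization in step 2 is done by a grid search followed by golden-section search (any one-dimensional optimizer would serve), and the data are a noise-free synthetic dilution series with a true efficiency of 1.85. All function names are hypothetical.

```python
import math


def reference(a, x0, x):
    """Assumed reference curve with baseline 0 and scale 1."""
    return 1.0 / (1.0 + a ** (x0 - x))


def correlation(u, v):
    """Pearson correlation, which ignores the baseline and scale of either series."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    suu = sum((p - mu) ** 2 for p in u)
    svv = sum((q - mv) ** 2 for q in v)
    return suv / math.sqrt(suu * svv)


def best_x0(a, fluor):
    """Step 2: x0 maximizing the correlation (integer grid, then golden-section search)."""
    cycles = range(1, len(fluor) + 1)
    score = lambda x0: correlation([reference(a, x0, x) for x in cycles], fluor)
    k = max(cycles, key=score)
    lo, hi, g = k - 1.0, k + 1.0, (math.sqrt(5) - 1) / 2
    while hi - lo > 1e-8:
        m1, m2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if score(m1) < score(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((p - mx) * (q - my) for p, q in zip(x, y)) / sum((p - mx) ** 2 for p in x)


def fixed_point_efficiency(curves, log2_dilution, a=1.5, tol=1e-5, max_iter=50):
    """Steps 1 to 3, repeated until the estimated efficiency stops changing."""
    for _ in range(max_iter):
        cq = [best_x0(a, f) for f in curves]
        e = 2.0 ** (1.0 / ols_slope(log2_dilution, cq))
        if abs(e - a) < tol:
            return e
        a = e
    return a


# Synthetic series: eight two-fold dilutions, arbitrary baseline 3 and scale 40.
true_e = 1.85
shift = 1.0 / math.log2(true_e)  # cycles needed to compensate one two-fold dilution
dilutions = list(range(8))
curves = [[3.0 + 40.0 * reference(true_e, 20.0 + shift * d, x) for x in range(1, 46)]
          for d in dilutions]
print(round(fixed_point_efficiency(curves, dilutions), 4))
```

Because correlation is unaffected by adding a constant to, or rescaling, either series, $F_b$ and $F_{max}$ never need to be estimated; only $x_0$ is searched for in each tube.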
Results and discussion
Fixed point convergence
Table 1 Sum of ranks (smaller is better) for each method (column headings include Std C q)
Within the 23 data sets are 181 sets of replicate $C_q$. The rank sums of replicate standard deviations appear as the second line of Table 1. Again using Friedman's rank sum test on the null hypothesis of equal distributions of replicate standard deviations, the null hypothesis is rejected ($\chi^2_4 = 149$, $p < 0.0001$). On direct comparison of the better two estimators, SDM-l4 and Fixed-point, the Fixed-point is the better estimator in 96 of 181 comparisons, but again this is not statistically significant.
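For readers unfamiliar with the rank-sum approach, the following sketch shows how rank sums and the Friedman statistic are obtained, using a small hypothetical table of scores (rows are data sets, columns are methods). The pairwise comparison is illustrated with an exact two-sided sign test; that particular test is an assumption on our part, chosen as the simplest way to examine a count such as 96 of 181.

```python
from math import comb


def friedman_rank_sums(scores):
    """Rank the methods within each row (1 = smallest score) and sum ranks per method.

    Ties are not handled in this sketch.
    """
    k = len(scores[0])
    sums = [0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            sums[j] += rank
    return sums


def friedman_chi2(scores):
    """Friedman statistic, referred to chi-squared on k - 1 degrees of freedom."""
    n, k = len(scores), len(scores[0])
    sums = friedman_rank_sums(scores)
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in sums) - 3.0 * n * (k + 1)


def sign_test_p(wins, n):
    """Two-sided exact binomial p-value for `wins` successes in n trials, p = 0.5."""
    tail = sum(comb(n, i) for i in range(max(wins, n - wins), n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical standard deviations for three methods on three data sets.
scores = [[0.10, 0.20, 0.30], [0.11, 0.25, 0.31], [0.09, 0.22, 0.40]]
print("rank sums:", friedman_rank_sums(scores))
print("Friedman chi-squared:", round(friedman_chi2(scores), 3))
print("sign test, 96 of 181: p =", round(sign_test_p(96, 181), 3))
```

With five methods the statistic is referred to chi-squared on four degrees of freedom, as in the result quoted above.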
When C q is regressed against log dilution the standard deviation of the slope depends in part on whether C q increases linearly with log dilution, and this in turn depends on amplification efficiency being constant at all dilutions. In using the variance of slope as a measure of the merit of a method, we assume that amplification efficiency is invariant with dilution. In practice the data sets we have analyzed show a remarkable linearity; the reason for assaying at intermediate dilutions is to confirm that linearity, without which a dilution series would be difficult to interpret.
The fixed-point method assumes that the fluorescence data approach the plateau. If, in a dilution series, the higher dilutions result in only the very early part of the fluorescence curve emerging, then the estimated C q at these dilutions will be unreliable.
Finally, we have presented a comparison of the five methods discussed, and used Friedman's non-parametric rank sum as a test of the null hypothesis that the methods are equivalent. Our data, however, are not randomly selected from the population of dilution series in general, and Friedman's test should be interpreted with caution in this context. We have examined two ‘merits’ of the methods: replicate standard deviation and slope standard deviation. These are not independent: the standard deviation of the slope estimate takes into account that of the replicates.
The use of a reference curve (in this case logistic), relative to which the position of fluorescence data can be measured, avoids subjective decisions as to baseline, scale and threshold. Using data from the whole curve, rather than just a few points, it offers an approach to the estimation of amplification efficiency from a dilution series. The logistic function represents a family of curves, however, and the specific curve appropriate to a given dilution series can be defined by fixed-point iteration. Convergence is rapid and for the illustrative data used here the method is often, but not always, an improvement on existing estimators.
DW and DH conceived and oversaw the research. DH, TW and GM constructed the primers and carried out the experimental work. MJ and GM devised, carried out, and documented the data analysis. All authors read and approved the final manuscript.
This work was funded by grant APP1008337 from the National Health and Medical Research Council of Australia. We gratefully acknowledge the recommendations of two anonymous reviewers, which contributed substantially to the final form of this manuscript.
- Mullis K, Faloona F, Scharf S, Saiki R, Horn G, Erlich H: Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Cold Spring Harbor Symp Quant Biol. 1986, Cold Spring Harbor Laboratory Press, New York, 263-273.
- Ruijter J, Pfaffl M, Zhao S, Spiess A, Boggy G, Blom J, Rutledge R, Sisti D, Lievens A, De Preter K, Derveaux S, Hellemans J, Vandesompele J: Evaluation of qPCR curve analysis methods for reliable biomarker discovery: bias, resolution, precision and implications. Methods. 2013, 59: 32-46. 10.1016/j.ymeth.2012.08.011.
- Bustin S, Benes V, Garson J, Hellemans J, Huggett J, Kubista M, Mueller R, Nolan T, Pfaffl M, Shipley G, Vandesompele J, Wittwer C: The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem. 2009, 55: 611-622. 10.1373/clinchem.2008.112797.
- Morrison T, Weis J, Wittwer C: Quantification of low-copy transcripts by continuous SYBR Green I monitoring during amplification. Biotechniques. 1998, 24: 954-962.
- Gentle A, Anastasopoulos F, McBrien N: High-resolution semi-quantitative real-time PCR without the use of a standard curve. Biotechniques. 2001, 31: 502-508.
- Zhao S, Fernald R: Comprehensive algorithm for quantitative real-time polymerase chain reaction. J Comput Biol. 2005, 12 (8): 1047-1064. 10.1089/cmb.2005.12.1047.
- Liu W, Saint D: Validation of a quantitative method for real time PCR kinetics. Biochem Biophys Res Comm. 2002, 294: 347-353. 10.1016/S0006-291X(02)00478-3.
- Rutledge R, Stewart D: A kinetic-based sigmoidal model for the polymerase chain reaction and its application to high-capacity absolute quantitative real-time PCR. BMC Biotechnol. 2008, 8: 47. 10.1186/1472-6750-8-47.
- Spiess A-N: qpcR: Modelling and Analysis of Real-time PCR Data. 2014. http://cran.r-project.org/web/packages/qpcR/index.html.
- R Foundation for Statistical Computing: R: A Language and Environment for Statistical Computing. 2014, R Foundation for Statistical Computing, Vienna, Austria.
- Roche Molecular Biochemical: LightCycler Software Version 3.5. 2001, Indiana, Roche Molecular Biochemical.
- Bianco-Miotto T, Hussey D, Kay T, O’Keefe D, Dobrovic A: DNA methylation of the ABO promoter underlies loss of ABO allelic expression in a significant proportion of leukaemic patients. PLoS ONE. 2009, 4 (3): e4788. 10.1371/journal.pone.0004788.
- Wickham H: ggplot2: Elegant Graphics for Data Analysis. 2009, Springer, New York.
- Guescini M, Sisti D, Rocchi M, Stocchi L, Stocchi V: A new real-time PCR method to overcome significant quantitative inaccuracy due to slight amplification inhibition. BMC Bioinformatics. 2008, 9: 326. 10.1186/1471-2105-9-326.
- Chapra S, Canale R: Numerical Methods for Engineers. 2006, McGraw-Hill, Boston.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.