Introducing the f0% method: a reliable and accurate approach for qPCR analysis
BMC Bioinformatics volume 25, Article number: 17 (2024)
Abstract
Background
qPCR is a widely used technique in scientific research and a basic tool in gene expression analysis. Classically, the quantitative endpoint of qPCR is the threshold cycle (CT), which ignores differences in amplification efficiency, among other drawbacks. While other methods have been developed to analyze qPCR results, none has been statistically proven to perform better than the CT method. Therefore, we aimed to develop a new qPCR analysis method that overcomes the limitations of the CT method. Our f0% [eff naught percent] method uses a modified flexible sigmoid function to fit the amplification curve, together with a linear part to subtract the background noise. Then, the initial fluorescence is estimated and reported as a percentage of the predicted maximum fluorescence (f0%).
Results
The performance of the new f0% method was compared against the CT method and two other leading methods, LinRegPCR and Cy0. The comparison covered absolute and relative quantification and used 20 dilution curves obtained from 7 different datasets that utilize different DNA-binding dyes. In absolute quantification, f0% reduced CV%, variance, and absolute relative error by 1.66-, 2.78-, and 1.8-fold relative to CT, and by 1.65-, 2.61-, and 1.71-fold relative to LinRegPCR, respectively. In relative quantification, f0% reduced CV% by 1.76-, 1.55-, and 1.25-fold and variance by 3.13-, 2.31-, and 1.57-fold relative to CT, LinRegPCR, and Cy0, respectively. Finally, f0% reduced the absolute relative error relative to LinRegPCR by 1.83-fold.
Conclusions
We recommend using the f0% method to analyze and report qPCR results based on its reported advantages. Finally, to simplify the usage of the f0% method, it was implemented in a macro-enabled Excel file with a user manual located on https://github.com/Mahmoud0Gamal/F0-perc/releases.
Background
Quantitative polymerase chain reaction (qPCR) allows the quantification of minute amounts of a specific target DNA by monitoring the increase in fluorescence associated with its amplification. Fluorescence signals collected in qPCR are commonly produced by DNA-binding dyes, such as SYBR Green I, or by fluorophore-labeled oligonucleotides [1]. Combining qPCR with reverse transcription (RT-qPCR) extended its ability to quantify RNA, especially microRNA and mRNA. This development unleashed the power of gene expression analysis, which became one of the most widely used methods in scientific research [2].
qPCR experiments are used to quantify the target nucleic acid either absolutely or relatively [3]. Absolute quantification requires building a standard curve to give a copy number for each reaction. A standard curve is a linear relationship between the threshold cycles (CT) and the log10-transformed concentrations of the standards [4]. Relative quantification, on the other hand, requires one or more reference genes against which the target gene is quantified. While absolute quantification is widely used in quantifying microbial nucleic acid, relative quantification is a basic tool in gene expression analysis [3].
The quantitative endpoint of both types of quantification is a threshold cycle (CT), defined as the PCR cycle at which the fluorescence signal crosses an arbitrary threshold [5]. To date, CT is the most used method to analyze and report qPCR results [6]. However, this method has several limitations, including:
1. The efficiency of the PCR reaction (E): The PCR reaction starts, in the best case, with (E = 2, complete doubling). Then, efficiency gradually declines due to the reduced availability of the reaction substrates till (E = 1, no amplification) at the plateau phase [7]. The starting efficiency of the PCR reaction can vary depending on the template, primers, and reaction conditions. However, the CT method assumes that the PCR efficiency is the same for both the target and the reference genes. If there is a significant difference in efficiency between them, then the normalization of the target gene expression using the CT method is invalid [8, 9].
2. Inhibition of the PCR reaction: The presence of inhibitors in the sample can affect the efficiency of the PCR reaction and lead to inaccurate CT values. Inhibitors can be present in the sample due to various reasons such as impurities in the RNA preparation or the presence of PCR inhibitors in the sample matrix [10].
3. Accuracy of the instrument: The accuracy of the qPCR instrument can affect the precision and accuracy of the CT values. Differences in the sensitivity and specificity of the instrument can lead to variations in the CT values obtained [10].
4. Data analysis: The CT method assumes that the PCR amplification is in the exponential phase, and the CT value is determined at a fixed threshold level. However, the actual amplification kinetics can vary between samples and genes, and the choice of threshold level can affect the accuracy of the CT values obtained [11].
Many methods have been developed to overcome the limitations of the CT method, such as sigmoidal models, Cy0, LinRegPCR, CyC*, and CqMAN. The sigmoidal curve methods fit the raw data to a four- or five-parameter sigmoid equation, from which the initial fluorescence is calculated [12, 13]. In the Cy0 method, the raw data are fitted to the Richards equation and a tangent is drawn at the inflection point; its intersection with the abscissa is the Cy0 value, which is used in place of CT [14]. LinRegPCR calculates the efficiency of each reaction from a straight line fitted through a predetermined window of linearity. Then, the average of these efficiencies is calculated and used for each amplicon [15]. CyC* determines the earliest amplification cycle (C*) as an outlier over the background fluorescence and calculates efficiency through three amplification cycles starting with C*, followed by calculating the initial template amount [16]. In CqMAN, the Cq is the cycle corresponding to the midpoint between the baseline and the second derivative maximum fluorescence based on a modified Gompertz model, while efficiency (averaged per amplicon) is calculated from a three-parameter exponential model fitted to the cycles from the Cq to the second derivative maximum [17].
Despite the aforementioned limitations of the CT method and the development of many methods to overcome them, the CT method is still the most adopted in analyzing qPCR results [6]. One of the reasons is the simplicity of the CT method. Another important reason is the lack of statistical evidence of the advantage of using the other methods. The CT method was compared once with 10 methods and another time with 13 methods and the Friedman test included the CT method along with LinRegPCR and Cy0 in the subset with the highest rank in both studies [7, 16].
The current study aims to develop a qPCR analysis method (f0%) that addresses the drawbacks of the CT method by minimizing the quantification errors and the variation between replicates, thereby enhancing the validity and robustness of gene expression analysis. The performance of the f0% method was compared with the best methods for analyzing qPCR data (the CT, LinRegPCR, and Cy0 methods, as reported earlier [7, 16]) using datasets that depend on DNA-binding dyes. Moreover, the analysis process considered the presence or absence of a dilution curve. Finally, to facilitate the use of the f0% method, a model was developed and implemented in a user-friendly program.
Methods
Datasets
Twenty dilution curves obtained from 7 different datasets were used to evaluate the f0% method against the CT, LinRegPCR, and Cy0 methods. The datasets represent various PCR instruments, DNA-binding dyes, and reaction mixtures. All datasets were imported directly from the qpcR R package [18]. In each reaction, the baseline cycles (C3:8) were averaged and their slope was calculated, after which the background fluorescence was subtracted, except for the LinRegPCR method. Furthermore, normalization was performed by dividing the fluorescence of each reaction by the maximum fluorescence of the corresponding dilution curve.
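One possible reading of this preprocessing step can be sketched in Python. The sketch assumes that the background is a straight line fitted to cycles 3 to 8 and subtracted from every cycle, and that normalization divides by the largest reading found in the dilution curve. Both function names are hypothetical, and the study's exact procedure may differ.

```python
def subtract_linear_baseline(readings, first=3, last=8):
    """Fit a straight line to cycles first..last and subtract it from every cycle.

    readings[i] is the raw fluorescence at cycle i + 1 (an assumed layout).
    """
    cycles = list(range(first, last + 1))
    values = [readings[c - 1] for c in cycles]  # cycle c is stored at index c - 1
    mean_c = sum(cycles) / len(cycles)
    mean_v = sum(values) / len(values)
    slope = (sum((c - mean_c) * (v - mean_v) for c, v in zip(cycles, values))
             / sum((c - mean_c) ** 2 for c in cycles))
    # Evaluate the baseline line at every cycle and subtract it.
    return [r - (mean_v + slope * (c - mean_c))
            for c, r in enumerate(readings, start=1)]


def normalize_dilution_curve(reactions):
    """Divide every reading by the maximum fluorescence of the dilution curve."""
    peak = max(max(r) for r in reactions)
    return [[v / peak for v in r] for r in reactions]


# A purely linear drift with no amplification is reduced to (almost) zero.
drift_only = [5.0 + 0.1 * c for c in range(1, 11)]
corrected = subtract_linear_baseline(drift_only)
```

Fitting the line only to early cycles keeps the amplification signal out of the baseline estimate, which is why a fixed window such as cycles 3 to 8 is used here.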
The used datasets are named after the name of their first author as follows:
1. Boggy contains a dilution curve with six tenfold dilutions with 2 replicates. The qPCR was performed on a Chromo4 thermal cycler (Bio-Rad) using the SYTO-13 fluorescent dye. The target was a randomly generated synthetic DNA sequence that was optimized to reduce the secondary structures [19].
2. Ruijter contains a dilution curve with four tenfold dilutions with 94 replicates. The qPCR was performed on a CFX 384 instrument (Bio-Rad) using SYBR Green I dye. The DNA target was a synthetic oligonucleotide for the human MYCN gene [7].
3. Guescini contains a dilution curve with seven tenfold dilutions with 12 replicates. The qPCR amplification was conducted using a LightCycler® 480 (Roche) with SYBR Green I dye. A plasmid containing a 104 bp fragment of the mitochondrial gene NADH dehydrogenase 1 served as the target DNA [14].
4. Lievens contains a dilution curve with five fivefold dilutions with 18 replicates. Soybean genomic DNA was used with primers targeting the lectin endogene Le1. Quantification was based on SYBR Green I [20].
5. Spiess contains four dilution curves denoted as Spiess_1, Spiess_2a, Spiess_2b, and Spiess_3. Spiess_1 is a dilution curve of seven tenfold dilutions with 4 replicates. Spiess_2a and Spiess_2b are two dilution curves of five fourfold dilutions with 3 replicates for two different cDNA samples. Spiess_3 is a dilution curve of seven fourfold dilutions with 3 replicates. The S27a housekeeping gene served as the target. The qPCR instruments used were LightCycler 1.0 (Roche) for Spiess_1 and MXPro3000P (Stratagene) for Spiess_2a, Spiess_2b, and Spiess_3. While SYBR Green I dye was used for the quantification of all dilution curves, only Spiess_3 was ROX-normalized [18].
6. Rutledge contains six tenfold dilutions with 4 replicates in 5 individual batches. Each batch is considered a dilution curve and denoted as Rutledge_1:5. The primers were designed to amplify a 102 bp amplicon with the help of SYBR Green I dye using an Opticon2 (MJ Research Inc) [12].
7. Vermeulen is a large dataset containing the expression data of 59 genes in addition to 5 housekeeping genes. It was generated to build a multigene-expression signature to help in the prognosis of patients with neuroblastoma. Each of the 64 genes had a dilution curve of five tenfold dilutions with 3 replicates. We included only the first 7 genes (alphabetical order) to avoid bias to a single qPCR instrument. The genes used were AHCY, AKR1C1, ARHGEF7, BIRC5, CAMTA1, CAMTA2, and CD44, while ALUsq was excluded as it shows early amplification (CT = 21:26) in the no template control. Quantification was conducted on a LightCycler® 480 (Roche) using SYBR Green I dye [21].
The CT method
The threshold cycle (CT) is a method to report qPCR results quantitatively as defined earlier. This method is based on placing an arbitrary threshold at the exponential phase of the amplification curve [5]. Since choosing the threshold value has a great impact on the CT analysis results [11], we employed a strategy that links the threshold to the baseline noise of the corresponding dilution curve. We found that the maximum standard deviation of the baseline fluorescence of each dilution curve serves as a reliable indicator of its noise. Then, the threshold value for each dilution curve was set to be 100-fold the baseline noise. This threshold yielded the best performance in all datasets except for Boggy, Ruijter, and Rutledge datasets where a threshold value of 10, 50, and 50 folds of their baseline noise yielded better results, respectively. An amplification plot for each dilution curve with its threshold is provided in Additional file 1: Fig. S1 and Additional file 2: Fig. S2. Finally, the CT was calculated according to Eq. (1, 2) [22].
where x is the cycle immediately before the threshold, fx is the fluorescence at cycle x, and E is calculated as follows:
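As an illustration of this kind of calculation, the following Python sketch interpolates CT between the last cycle below the threshold and the next cycle. It assumes log-linear interpolation with a local efficiency E taken as the ratio of the two readings that bracket the threshold; the function name and this form of E are assumptions, not necessarily the exact expressions of Eqs. (1) and (2).

```python
import math


def threshold_cycle(fluorescence, threshold):
    """Return an interpolated CT, or None if the threshold is never crossed.

    fluorescence[i] is the baseline-corrected reading at cycle i + 1.
    """
    for i in range(len(fluorescence) - 1):
        f_x, f_next = fluorescence[i], fluorescence[i + 1]
        if 0 < f_x < threshold <= f_next:
            efficiency = f_next / f_x  # assumed local efficiency E
            # Cycle x = i + 1 plus the fractional cycle needed to reach the threshold.
            return (i + 1) + math.log(threshold / f_x) / math.log(efficiency)
    return None


# Ideal doubling curve: readings 1, 2, 4, ... at cycles 1, 2, 3, ...
curve = [2.0 ** c for c in range(10)]
ct = threshold_cycle(curve, threshold=48.0)
```

Interpolating on the log scale matches the exponential shape of the curve near the threshold, so a threshold placed between two readings yields a fractional cycle rather than an integer.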
The LinRegPCR method
The LinRegPCR method analyzes qPCR data using efficiencies calculated from the slope of a regression line through the data points in the exponential phase of baseline-corrected fluorescence data. The software takes non-baseline-corrected raw data and corrects the baseline. Then, an iterative algorithm is used to locate the exponential phase, known as the window of linearity, from which the efficiency of each reaction is calculated. The average of these efficiencies is calculated and used for each amplicon. Finally, the software produces an N0 (the initial nucleic acid amount) calculated using the mean efficiency [15].
The Cy0 method
In the Cy0 method, the 5-parameter Richards equation was used to fit a non-linear curve to the raw data. At the inflection point of this curve, a tangent is drawn, and its intersection with the abscissa is considered the Cy0 value [14]. The Cy0 was calculated by the cy0() function of the qpcR R package [18] using RStudio-2023.06.1-524 [23] and R programming language v4.3.1 [24].
The f0% method
The f0% method is based on a six-parameter model, Eq. (3), composed of two parts: a four-parameter sigmoid part and a two-parameter linear part. The four-parameter sigmoid part is used to fit the amplification curve, with parameters that predict the maximum fluorescence (fm), the rate of efficiency decay (D), the starting efficiency (E), and the inflection cycle (Ci). The two-parameter linear part subtracts the background noise, with parameters that predict the baseline slope (a) and the baseline intercept (b). The mathematical role of each of the equation parameters is graphically illustrated in Fig. 1.
$$f_{x} = f_{m} - \frac{{f_{m} }}{{\left( {1 + DE^{{x - C_{i} }} } \right)^{1/D} }} + ax + b$$(3)
where fx represents the fluorescence at cycle x.
The previous equation (Eq. 3) operates in two modes:
1. Free E mode: where E is a variable that is left to be predicted. Then, Es from different reactions are averaged per amplicon to give a prediction of the starting efficiency of that amplicon. This mode of Eq. 3 is fitted to cycles ranging from the first cycle to the cycle just after the inflection cycle (Ci).
2. Fixed E mode: where E is a constant that may be calculated in the free E mode or predetermined using the slope of a standard curve as described later. In this mode, all cycles are fitted by solving for fm, D, Ci, a, and b using the constant E. Then, the initial fluorescence (f0) is calculated using Eq. (4).
$$f_{0} = f_{m} - \frac{{f_{m} }}{{\left( {1 + DE^{{ - C_{i} }} } \right)^{1/D} }}$$(4)
In all modes, the initial cycle or cycles should be discarded if they deviate obviously from the baseline. Finally, f0% is calculated as a percentage of the predicted maximum fluorescence as shown in Eq. (5).
$$f_{0} \% = \frac{{f_{0} }}{{f_{m} }} \times 100$$(5)
A flowchart of the analysis process is shown in Fig. 2.
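The following Python sketch evaluates the model and the resulting f0% for a given set of fitted parameters. It assumes that the sigmoid part at cycle x has the form of Eq. (4) with the exponent −Ci generalized to x − Ci, that the linear part is ax + b, and that f0% is 100 × f0/fm as described for Eq. (5). The function names are hypothetical, and no curve fitting is performed here.

```python
def f0_model(x, fm, D, E, Ci, a=0.0, b=0.0):
    """Predicted fluorescence at cycle x: asymmetric sigmoid plus linear background."""
    sigmoid = fm - fm / (1.0 + D * E ** (x - Ci)) ** (1.0 / D)
    return sigmoid + a * x + b


def f0_percent(fm, D, E, Ci):
    """Initial fluorescence (Eq. 4) expressed as a percentage of fm."""
    f0 = fm - fm / (1.0 + D * E ** (-Ci)) ** (1.0 / D)
    return 100.0 * f0 / fm


# With E = 2, delaying the inflection cycle by one cycle roughly halves f0%.
early = f0_percent(fm=1.0, D=1.0, E=2.0, Ci=20.0)
late = f0_percent(fm=1.0, D=1.0, E=2.0, Ci=21.0)
```

Because f0% is a ratio to fm, multiplying fm by any constant leaves the result unchanged, which is the property the method relies on to tolerate well-to-well differences in plateau height.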
It is important to note that there are two sources of E that should be considered when using the fixed E mode:
1. In experiments lacking the data for a standard curve, E is calculated in the free E mode.
2. In experiments with the data for a standard curve, this data is converted to an approximate f0% using the fixed E mode (assuming E = 2). Then, the standard curve is built by regressing log10(f0%) on log10(conc.). Finally, E is calculated from the slope of the regression line using Eq. (6).
$$E = 2^{1/Slope}$$(6)
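Eq. (6) can be illustrated with a short Python sketch that regresses log10(f0%) on log10(conc.) by ordinary least squares and converts the slope to E. The function names and the synthetic standards are assumptions made for the example; the synthetic f0% values mimic what a reaction with a true efficiency of 1.9 would give when analyzed with E fixed at 2.

```python
import math


def regression_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
            / sum((x - mean_x) ** 2 for x in xs))


def efficiency_from_standard_curve(concentrations, approx_f0_percent):
    """Apply Eq. (6): E = 2 ** (1 / slope) of log10(f0%) versus log10(conc.)."""
    log_conc = [math.log10(c) for c in concentrations]
    log_f0 = [math.log10(f) for f in approx_f0_percent]
    return 2.0 ** (1.0 / regression_slope(log_conc, log_f0))


# Synthetic standards: approximate f0% scales as conc ** (log 2 / log 1.9).
true_efficiency = 1.9
exponent = math.log(2.0) / math.log(true_efficiency)
concentrations = [1e1, 1e2, 1e3, 1e4, 1e5]
approx_f0 = [1e-6 * c ** exponent for c in concentrations]
estimated = efficiency_from_standard_curve(concentrations, approx_f0)
```

A slope of exactly 1 returns E = 2, meaning the assumed efficiency was already correct; slopes above 1 indicate a true efficiency below 2.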
Quantification
After calculating the CT, N0, Cy0, and f0% for all reactions in each of the 20 dilution curves, the CT, N0, Cy0, and f0% values were converted to predicted concentrations. For all methods, the predicted concentrations were calculated twice. Once considering the presence of a standard curve and once assuming the absence of a standard curve. This strategy was adopted to assess the performance of the used methods for experiments that either include or lack a standard curve, respectively.
Condition 1: considering the presence of a standard curve
The standard curve is built by regressing CT, log10(N0), Cy0, or log10(f0%) on log10(conc.). When CT or Cy0 is used, the slope of the regression line is negative because both CT and Cy0 are inversely related to the concentration. Using log10(N0) or log10(f0%) produces a positive slope because N0 and f0% are directly proportional to the concentration. Finally, to retrieve concentrations, the regression equation is solved for log10(conc.) using the CT, log10(N0), Cy0, or log10(f0%) values, and the concentration is obtained by raising 10 to the power of the resulting log10(conc.).
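The procedure can be sketched in Python as follows. The function names and the synthetic CT values (perfect doubling, tenfold dilutions) are assumptions made for the example; for N0 or f0%, the value passed in would be log10(N0) or log10(f0%) instead of CT.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def predict_concentration(value, slope, intercept):
    """Solve value = slope * log10(conc.) + intercept for conc."""
    return 10.0 ** ((value - intercept) / slope)


# Synthetic standards: each tenfold dilution delays CT by log2(10) cycles.
standards = [1e5, 1e4, 1e3, 1e2, 1e1]
cts = [15.0 + math.log2(10.0) * level for level in range(5)]
slope, intercept = fit_line([math.log10(c) for c in standards], cts)
predicted = predict_concentration(cts[1], slope, intercept)
```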
Condition 2: assuming the absence of a standard curve
Normally in this condition, a relative quantification is performed rather than an absolute one. However, for the aim of our study, a predicted concentration should be calculated for each reaction to be used in the performance indicators (described later). To solve this problem, the concentrations were predicted relative to the highest concentration (1st level) in each dilution curve. So, the predicted concentrations have the same scale as the true concentrations.
For the CT and Cy0 methods:
For the f0% and LinRegPCR methods:
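A minimal Python sketch of quantification relative to the first level is given here under explicit assumptions: cycle-based values (CT, Cy0) are converted with the usual exponential relation, and fluorescence-based values (f0%, N0) are converted by direct proportionality. The function names and the default efficiency of 2 are illustrative assumptions and may not match the study's exact equations.

```python
def conc_from_cycle(cycle_value, reference_cycle_mean, reference_conc, efficiency=2.0):
    """Cycle-based value (CT or Cy0): each cycle of delay divides the amount by E."""
    return reference_conc * efficiency ** (reference_cycle_mean - cycle_value)


def conc_from_initial_signal(value, reference_value_mean, reference_conc):
    """Fluorescence-based value (f0% or N0): directly proportional to the amount."""
    return reference_conc * value / reference_value_mean


# The first (highest) level has a true concentration of 1000 in this example.
late_reaction = conc_from_cycle(18.0, reference_cycle_mean=15.0, reference_conc=1000.0)
weak_reaction = conc_from_initial_signal(0.002, reference_value_mean=0.02, reference_conc=1000.0)
```

Scaling by the true concentration of the first level puts the predictions on the same scale as the true concentrations, so the performance indicators can be computed in the same way as with a standard curve.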
Performance indicators
The predicted concentrations calculated by the CT, LinRegPCR, Cy0, and f0% methods were compared with the true concentrations—concentrations obtained from the datasets—to evaluate the performance of these methods. Different performance indicators were needed to measure the performance of the tested methods from different aspects. Precision, which refers to the variation between replicates, was evaluated using the coefficient of variation and variance. Accuracy, which refers to the deviation of the predicted concentrations from the true concentrations, was assessed using the relative error and bias. The performance indicators were calculated as follows:
1. Coefficient of variation (CV%). CV% was calculated for each level of the dilution curves as follows [17]:
$$CV\% = \frac{{SD\left( {predicted\;conc.} \right)}}{{Mean\left( {predicted\;conc.} \right)}} \times 100$$(10)
2. Variance. Variance represents the within-level variance of the log10(predicted concentration) [7].
3. Relative error (RE). RE is the deviation of the predicted concentrations from the true concentrations [17].
$$RE = \frac{predicted\;conc. -\; true\;conc.}{{true\;conc.}}$$(11)
The perfect value of RE is zero, which indicates no error, while values greater or lesser than zero indicate an error proportional to the absolute value. If the RE is averaged over different reactions, negative values cancel positive values, leading to a misleading average. Therefore, we calculated an absolute relative error that could be averaged.
$$Absolute\;RE = \left| {Relative\; error} \right|$$(12)
4. Bias. Bias is the ratio of the averaged predicted concentrations of the highest to the lowest levels [7]. For example, if the true concentration of the highest level in a given dilution curve is 10,000 and the true concentration of the lowest level is 10, then the perfect ratio for bias is 1000. Obviously, the perfect ratio varies between dilution curves according to the dilution rate and the number of levels. Therefore, a normalized bias was calculated by dividing the bias ratio by the perfect expected ratio of the respective dilution curve.
$$Normalized\;Bias = \frac{{ Mean\left( {the \;highest\;predicted\;conc.} \right) }}{{Mean\left( {the \;lowest\;predicted\;conc.} \right)}} \div \frac{ the\;highest\;true\;conc.}{{the\;lowest\;true\;conc.}}$$(13)
Because the normalized bias may be greater or lesser than one (the optimum value), an averaged normalized bias would be misleading, as described earlier for relative error. So, we calculated an absolute bias that can be averaged, with one indicating no bias and values greater than one indicating bias.
$$Absolute\;Bias = e^{{\left| {\ln \left( {Normalized\;Bias} \right)} \right|}}$$(14)
For each of the previous performance indicators, a fold reduction was calculated to quantify the effect of using f0% instead of CT, LinRegPCR, or Cy0. Then, the geometric mean of the fold reduction was reported.
$$Fold\;Reduction_{{\left( {PIx} \right)}} = \frac{{C_{T} ,\;LinRegPCR, \;or\; Cy_{{0 \left( {PIx} \right)}} }}{{f_{0} \%_{{ \left( {PIx} \right)}} }}$$(15)
where PIx is one of the performance indicators: CV%, variance, absolute RE, and absolute bias.
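The indicators in Eqs. (10) to (15) can be sketched in Python as follows. The function names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) for CV% is an assumption, since the text does not state which form was used.

```python
import math
from statistics import mean, stdev


def cv_percent(predicted):
    """Eq. (10): coefficient of variation of one dilution level, in percent."""
    return stdev(predicted) / mean(predicted) * 100.0  # sample SD is an assumption


def absolute_relative_error(predicted, true):
    """Eqs. (11) and (12): magnitude of the relative deviation from the true value."""
    return abs((predicted - true) / true)


def absolute_bias(highest_predicted, lowest_predicted, highest_true, lowest_true):
    """Eqs. (13) and (14): 1 means no bias; over- and underestimates fold symmetrically."""
    normalized = (mean(highest_predicted) / mean(lowest_predicted)) / (highest_true / lowest_true)
    return math.exp(abs(math.log(normalized)))


def geometric_mean_fold_reduction(other_method, f0_method):
    """Eq. (15) per dilution curve, summarized by the geometric mean."""
    ratios = [other / f0 for other, f0 in zip(other_method, f0_method)]
    return math.exp(mean([math.log(r) for r in ratios]))


cv = cv_percent([9.0, 10.0, 11.0])
bias_low = absolute_bias([2000.0], [10.0], 10000.0, 10.0)    # normalized bias 0.2
bias_high = absolute_bias([50000.0], [10.0], 10000.0, 10.0)  # normalized bias 5
fold = geometric_mean_fold_reduction([2.0, 8.0], [1.0, 1.0])
```

Taking the exponential of the absolute log makes a fivefold underestimate and a fivefold overestimate score the same, which is what allows the absolute bias to be averaged across dilution curves.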
Statistical analysis
A Friedman test was performed to check whether the distribution of each performance indicator (grouped by method, paired by dilution curve) shows a statistically significant difference. A significant Friedman test was followed by pairwise Wilcoxon signed-rank tests to identify the differences between methods. Then, p values were adjusted for alpha inflation using the Bonferroni correction. All statistical analysis was conducted using RStudio-2023.06.1-524 [23] and R programming language v4.3.1 [24]. A difference was considered statistically significant when the p value was < 0.05. All R scripts containing the statistical analysis are provided in Additional file 4.
Results
The variation in the used qPCR datasets was intended to represent different templates, primers, master mixes, DNA-binding dyes, and qPCR instruments. Furthermore, the analysis process considered the varying objectives of the qPCR experiments from absolute to relative quantification, which influences the need for a standard curve. Finally, different performance indicators were used to measure the accuracy of the compared methods from different perspectives.
Performance evaluation considering the presence of a standard curve
Upon comparing the methods for experiments that use a standard curve, it was clear that the f0% method offers advantages over the CT and LinRegPCR methods. Calculating the f0% reduced the variation between technical replicates, indicating increased precision. This was evident in the reduction of CV% relative to the CT and LinRegPCR methods by 1.66-fold (p value < 0.001) and 1.65-fold (p value < 0.01), respectively. Moreover, the f0% reduced the variance relative to the CT and LinRegPCR methods by 2.78-fold (p value < 0.001) and 2.61-fold (p value < 0.01), respectively. On the other hand, there was no statistically significant difference between the f0% and the Cy0 methods in either parameter. Furthermore, no pair among the CT, LinRegPCR, and Cy0 methods showed a statistically significant difference in CV% or variance.
Regarding accuracy, both the f0% and Cy0 methods reduced the absolute relative error in comparison to the CT method, by 1.8-fold (p value < 0.0001) and 1.19-fold (p value = 0.022), respectively. Furthermore, only the f0% method decreased the absolute relative error relative to the LinRegPCR method, by 1.71-fold (p value < 0.01). For the other accuracy parameter, absolute bias, the Friedman test was not significant. Figure 3 summarizes these results, while Additional file 3: Table S1 shows the detailed performance of all methods on each dilution curve.
Performance evaluation assuming the absence of a standard curve
In experiments lacking a standard curve, the f0% method showed a potential advantage over the other methods, specifically in the precision parameters. The f0% method increased the uniformity of replicates by lowering the CV% by 1.25-fold (p value < 0.01) against the Cy0 method, by 1.55-fold (p value < 0.01) against the LinRegPCR method, and by 1.76-fold (p value < 0.0001) against the CT method. Moreover, the f0% method diminished the variance by 1.57-fold (p value < 0.001) relative to the Cy0 method, by 2.31-fold (p value = 0.033) relative to the LinRegPCR method, and by 3.13-fold (p value < 0.0001) relative to the CT method.
Regarding the accuracy-related parameters, statistically significant differences were scarce. However, the f0% method reduced the absolute relative error in comparison to the LinRegPCR method by 1.83-fold (p value < 0.01). These results are summarized in Fig. 4, while the performance on each dilution curve is provided in Additional file 3: Table S2.
Effect of the inflection cycle (Ci) position on the performance of the f0% method
As described earlier, all available cycle readings are used to obtain the f0% in the fixed E mode. However, the available cycles are sometimes not enough to calculate a precise f0%. The performance of the f0% depends on the inflection cycle (Ci) position. It was assumed that the earlier the Ci, the more precise the f0%. To check this assumption, the first levels of the 20 dilution curves were used to evaluate the performance of the f0% (represented as CV%) relative to the position of the Ci. For each reaction, 11 predictions were performed that differed in the final cycle (Cfinal) used in the model, ranging from (Cfinal = Ci − 5) to (Cfinal = Ci + 5). We found that our assumption holds: an earlier Ci is associated with a more precise f0%. Generally, f0% is considered precise only when the Cfinal is two cycles or more after the Ci (Cfinal ≥ Ci + 2). Figure 5 shows the relationship between the performance of the f0% and the position of the Ci relative to the Cfinal.
Discussion
qPCR is a widely used technique to quantify minute amounts of specific nucleic acid. It allowed unprecedented advances in gene expression analysis and in quantifying infectious pathogens [2]. The traditional method used to report the quantitative endpoint of this technique is the CT method [5], which ignores most of the reaction data. To overcome the drawbacks of the CT method, we introduce a new qPCR analysis method (f0%). The new method proved to be statistically more robust in most situations than the other compared methods. The f0% method outperformed the widely used CT method in experiments that either contain or lack a standard curve. In experiments with a standard curve, the f0% reduced CV%, variance, and absolute relative error by 1.66-, 2.78-, and 1.8-fold, respectively. Moreover, when analyzing the experiments without using the standard curve data, the f0% reduced CV% and variance by 1.76- and 3.13-fold, respectively. Finally, the f0% method was implemented in a macro-enabled Excel file available at https://github.com/Mahmoud0Gamal/F0-perc/releases with a user manual that describes how to use the f0% method.
Method development
The amplification curve of the qPCR has a characteristic sigmoid nature. Accordingly, many qPCR analysis methods adopted sigmoid functions [12,13,14]. In 2004, Rutledge introduced the sigmoid function as a tool to analyze the qPCR curve. He used a four-parameter sigmoid equation with a symmetric nature that struggled to fit the asymmetric sigmoid curve of the PCR. To overcome this limitation, he excluded the plateau phase from the fit by choosing a cut-off cycle [12]. Later on, Spiess et al. introduced a flexible five-parameter sigmoid equation containing an asymmetry parameter. Spiess' equation increased the accuracy of the fit, but it retained a parameter responsible for the slope [13]. This slope parameter enables the calculation of efficiency from single reaction data and hence quantification using this reaction-specific efficiency [25]. Although it corrects for efficiency variation and intuitively seems to be advantageous, quantification based on reaction-specific efficiencies tends to dramatically increase the variation between replicates and diminishes the reliability of the quantification. On the other hand, averaging the efficiency per target achieves greater robustness and reliability [8].
The f0% method also utilizes a sigmoid equation (Eq. 3) with minor but critical modifications. In our equation, we substituted the constant in the denominator of the previous sigmoid equations [12, 13] (the natural base, e) with a variable that directly represents the starting efficiency (E). Moreover, redundancy was reduced by removing the parameter responsible for the slope, since its effect was transferred to the E parameter. Also, we retained the flexibility of the model by including a parameter for asymmetry (D). Here, D represents the rate of efficiency decay; a high value of D indicates a rapid reduction in cycle-to-cycle efficiency, associated with increased deceleration of the amplification rate. In contrast to D, E is the starting efficiency and is not related to changes in efficiency from cycle to cycle.
Using reaction-specific efficiency in quantification was reported to increase the variation between replicates [8]. After validating this report, we avoided direct quantification using reaction-specific efficiencies. Instead, efficiencies from all reactions of the same amplicon are calculated and averaged using the free E mode, and this averaged efficiency is then passed to the fixed E mode to calculate the f0%. This scenario provides an estimate for efficiency when no standard curve is present while preventing the drastic errors caused by using reaction-specific efficiencies.
For accurate analysis of the qPCR curve, the background noise should be properly subtracted by a process called baselining. Therefore, a linear part (ax + b) was added to (Eq. 3) to allow the f0% method to subtract the background noise during the analysis. This approach was adopted instead of baselining separately before analyzing the qPCR curve to avoid reduced precision as noted earlier [25]. Indeed, our method of baselining is fairly accurate even if the background noise is shifting up or down assuming a linear shift. However, some qPCR reactions may show a complex non-linear baseline. In these reactions removing the first few cycles leaves a nearly linear baseline which is suitable for the f0% analysis.
The use of the f0% method was extended to correct for variations in the maximum fluorescence or the reaction volume. It is assumed that the maximum fluorescence is similar in all reactions containing similar amounts of the same PCR mixture. However, due to volume variations and fluctuations in the signal output from the PCR instrument, maximum fluorescence is not identical. This artifact occurs between different wells in the PCR instrument, and it is exaggerated when ROX normalization is ignored [26]. To remove the effect of this artifact on quantification, the f0% method reports the predicted initial fluorescence (f0) as a percentage of the predicted maximum fluorescence (fm) using (Eq. 5). Therefore, apparent differences in the maximum fluorescence would not impair the accuracy of the quantification.
Performance
The performance of the f0% method was compared with the CT, LinRegPCR, and Cy0 methods, which constitute the best subset of the Friedman test in analyzing qPCR data as reported earlier by two independent benchmarking studies [7, 16]. The methods were tested for precision and accuracy using different indicators across various qPCR platforms and experimental designs. The overall precision of the f0% method was statistically superior to the CT and LinRegPCR methods, as evidenced by reduced CV% and variance in both types of quantification (absolute and relative). In addition, in relative quantification, the f0% method was more robust than the Cy0 method. The superiority of the f0% method in precision is attributed to several factors. First, the f0% method, unlike the CT and LinRegPCR methods, makes use of all cycle readings recorded by the qPCR instrument, like other curve-fitting methods, e.g., Cy0. Consequently, the f0% method does not depend on an arbitrary threshold whose position may change the results [13,14,15, 17]. Moreover, unlike other sigmoidal models [12, 13], the f0% and LinRegPCR methods depend on an averaged efficiency per amplicon rather than quantifying directly with reaction-specific efficiencies [15]. Finally, the f0% method is the first, to our knowledge, to normalize the result to the predicted maximum fluorescence.
In terms of accuracy, absolute relative error (RE) and absolute bias were calculated. Regarding the absolute RE, the f0% method offered substantial improvement over the CT and LinRegPCR methods in experiments containing a standard curve. Moreover, when considering experiments lacking a standard curve, the f0% method continued to outperform the LinRegPCR method. However, in both types of experiments, there was no statistical difference between the f0% and Cy0 methods. This might be attributed to using the whole data points in the analysis of the f0% and Cy0 [14] but not for the CT and LinRegPCR methods [5, 15].
Regarding the other accuracy parameter—absolute bias, depending on a standard curve made all methods produce relatively unbiased results that were very close to one. This was predicted as quantification using a standard curve nearly eliminates bias [7]. However, in experiments lacking a standard curve, the values of absolute bias were greatly deviated. Although bias is caused—in theory—by misestimating efficiency [7], methods that correct efficiency didn't show statistically significant enhancements in absolute bias. To interpret these unexpected findings, we shall consider the effect of dilution errors on bias. As these errors make the actual bias ratio vary greatly from the perfect expected ratio [8]. Therefore, it will be obvious that the most accurate methods will produce a deviated bias in case of dilution errors. In our case, we used publicly available datasets, so we had no control over their quality, and we relied on the quality control parameters stated by their authors.
Limitations
The f0% method predicts the shape of the amplification curve, which is sigmoid in nature. Sigmoid curves are characterized by one inflection point that separates the upper and lower parts of the curve. To predict the shape of this curve precisely, enough data must be provided around the inflection point, or inflection cycle (Ci) [27]. However, on very rare occasions, amplification curves (especially those with reduced efficiency) may show late amplification and fail to reach the Ci before the final cycle (Cfinal), leading to insufficient data around this critical point. As a result, the precision of the calculated f0% will be reduced.
In practice, most amplification curves pass the Ci. However, on very rare occasions (genes with very low expression combined with reduced amplification efficiency), the amplification curve may fail to reach the Ci. If a researcher encounters this situation, the reaction efficiency should be improved first. If that is not possible, the number of cycles could be increased. However, increasing the number of cycles also increases the chance of amplifying non-specific products. Therefore, it should be done with caution, and only on these rare occasions when efficiency improvements fail. Moreover, a high-resolution melt curve analysis must be examined carefully to identify any non-specific products.
Another limitation, related to data availability, is that the performance of the f0% method was assessed only on datasets that use DNA-intercalating dyes. Despite the widespread use of TaqMan probes for signal detection in qPCR experiments, we did not find enough publicly available datasets to validate the performance of the f0% method on them.
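A simple way to screen for such curves before analysis is sketched below. This is a heuristic written for illustration, not the procedure used by the f0% method: it assumes that the inflection cycle can be approximated by the reading that ends the largest single-cycle rise in fluorescence, and it uses the requirement of at least two post-inflection cycles stated in the Conclusions. The function names and the synthetic curve are hypothetical, and real data would normally need baseline correction and some smoothing first.

```python
import math


def cycles_past_inflection(fluorescence):
    """Count the readings recorded after the steepest cycle-to-cycle rise.

    Assumption: the inflection cycle is approximated by the reading that
    ends the largest single-cycle increase in fluorescence.
    """
    if len(fluorescence) < 3:
        raise ValueError("need at least three cycle readings")
    diffs = [b - a for a, b in zip(fluorescence, fluorescence[1:])]
    steepest = max(range(len(diffs)), key=lambda i: diffs[i])
    inflection_index = steepest + 1  # reading at the end of the steepest step
    return len(fluorescence) - 1 - inflection_index


def is_well_developed(fluorescence, min_cycles_after=2):
    """True if enough readings follow the approximate inflection cycle."""
    return cycles_past_inflection(fluorescence) >= min_cycles_after


# Synthetic 40-cycle logistic curve (illustrative values only).
full_run = [10.0 / (1.0 + math.exp(-(c - 25.5) / 1.5)) for c in range(1, 41)]
late_run = full_run[:24]  # the same curve cut off before its inflection

print(is_well_developed(full_run))  # True
print(is_well_developed(late_run))  # False
```

A curve flagged as not well developed is a candidate for the remedies described above: improving the reaction efficiency or, with caution, adding cycles.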
Applications
The goal of qPCR experiments is either absolute or relative quantification [3]. Absolute quantification relies on a standard curve with known concentrations, as described earlier. Relative quantification, on the other hand, can be performed without a standard curve, but it requires one or more reference genes. Relative quantification is well described for the CT method [5]; here we describe how to perform two modes of relative quantification using the f0% method.
Fold change
Fold change is the ratio between the concentration of the target gene and the geometric mean of the concentrations of the reference genes of the same sample. Fold change is useful in experiments lacking a control group and could be calculated using Eq. (16).
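The verbal definition above can be sketched as follows. The code follows the sentence as written (target quantity divided by the geometric mean of the reference-gene quantities of the same sample) and is not a transcription of Eq. (16); the function name, argument names, and numbers are hypothetical.

```python
import statistics


def fold_change(target_conc, reference_concs):
    """Target quantity relative to the geometric mean of the reference genes."""
    if target_conc <= 0 or any(r <= 0 for r in reference_concs):
        raise ValueError("concentrations must be positive")
    return target_conc / statistics.geometric_mean(reference_concs)


# One hypothetical sample: a target gene and two reference genes.
# The geometric mean of 0.02 and 0.08 is 0.04, so the ratio is about 0.2.
sample_fc = fold_change(target_conc=0.008, reference_concs=[0.02, 0.08])

print(round(sample_fc, 3))
```

The geometric mean is used rather than the arithmetic mean so that a reference gene with a much larger quantity does not dominate the denominator.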
Normalized fold change
Normalized fold change is the ratio between the fold change of a given sample and the mean of the fold changes of the control group samples. In this case, the mean of the normalized fold changes of the control group samples should equal one. Normalized fold change is calculated using Eq. (17).
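A minimal sketch of this normalization is given below. It follows the verbal definition (each fold change divided by the mean fold change of the control group, taken here as the arithmetic mean) rather than transcribing Eq. (17); the function name and the fold-change values are hypothetical.

```python
import statistics


def normalized_fold_changes(sample_fcs, control_fcs):
    """Divide each fold change by the mean fold change of the control group."""
    control_mean = statistics.mean(control_fcs)
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return [fc / control_mean for fc in sample_fcs]


# Hypothetical fold changes for three control samples and two treated samples.
control = [0.1, 0.2, 0.3]
treated = [0.5, 0.7]

control_norm = normalized_fold_changes(control, control)
treated_norm = normalized_fold_changes(treated, control)

# The mean of the normalized control values is about 1.
print(statistics.mean(control_norm))
print(treated_norm)  # about 2.5 and 3.5
```

Normalizing the control samples against their own mean is a convenient check: the result should average to one, as stated above.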
Conclusions
Despite the widespread use of the CT method in analyzing qPCR data, it suffers from many drawbacks. To address these limitations, we introduced the f0% method, which utilizes all the available cycle readings to give more reliable results. The new method is based on a flexible sigmoid model that was modified to avoid the instability of previous sigmoidal models. Indeed, our method demonstrated more robust and accurate results when compared with the CT, LinRegPCR, and Cy0 methods using a compiled multi-platform dataset. However, the enhanced performance of the f0% method comes at the cost of requiring a well-developed amplification curve that has at least two cycles post-inflection. Moreover, we facilitated the use of the f0% method by implementing it in a user-friendly macro-enabled Excel file that can be easily downloaded and used by researchers. Overall, the f0% method offers a more reliable and accurate approach to qPCR analysis, with the potential to improve the accuracy of quantitative measurements in a variety of applications using qRT-PCR.
Availability of data and materials
All data analyzed during this study were imported from the qpcR R package, as shown in the R scripts in Additional file 4.
References
1. Navarro E, Serrano-Heras G, Castaño MJ, Solera J. Real-time PCR detection chemistry. Clin Chim Acta. 2015;439:231–50.
2. VanGuilder HD, Vrana KE, Freeman WM. Twenty-five years of quantitative PCR for gene expression analysis. Biotechniques. 2008;44:619–26.
3. Taylor SC, Nadeau K, Abbasi M, Lachance C, Nguyen M, Fenrich J. The ultimate qPCR experiment: producing publication quality, reproducible data the first time. Trends Biotechnol. 2019;37:761–74.
4. Svec D, Tichopad A, Novosadova V, Pfaffl MW, Kubista M. How good is a PCR efficiency estimate: recommendations for precise and robust qPCR efficiency assessments. Biomol Detect Quantif. 2015;3:9–16.
5. Schmittgen TD, Livak KJ. Analyzing real-time PCR data by the comparative CT method. Nat Protoc. 2008;3:1101–8.
6. Hawke DC, Watson AJ, Betts DH. Selecting normalizers for microRNA RT-qPCR expression analysis in murine preimplantation embryos and the associated conditioned culture media. J Dev Biol. 2023;11:17.
7. Ruijter JM, Pfaffl MW, Zhao S, Spiess AN, Boggy G, Blom J, et al. Evaluation of qPCR curve analysis methods for reliable biomarker discovery: bias, resolution, precision, and implications. Methods. 2013;59:32–46.
8. Ruijter JM, Barnewall RJ, Marsh IB, Szentirmay AN, Quinn JC, van Houdt R, et al. Efficiency correction is required for accurate quantitative PCR analysis and reporting. Clin Chem. 2021;67:829–42.
9. Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, et al. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem. 2009;55:611–22.
10. Nolan T, Hands RE, Bustin SA. Quantification of mRNA using real-time RT-PCR. Nat Protoc. 2006;1:1559–82.
11. Ruiz-Villalba A, Ruijter JM, van den Hoff MJB. Use and misuse of Cq in qPCR data analysis and reporting. Life. 2021;11:496.
12. Rutledge RG. Sigmoidal curve-fitting redefines quantitative real-time PCR with the prospective of developing automated high-throughput applications. Nucleic Acids Res. 2004;32:e178–e178.
13. Spiess A-N, Feig C, Ritz C. Highly accurate sigmoidal fitting of real-time PCR data by introducing a parameter for asymmetry. BMC Bioinform. 2008;9:221.
14. Guescini M, Sisti D, Rocchi MBL, Stocchi L, Stocchi V. A new real-time PCR method to overcome significant quantitative inaccuracy due to slight amplification inhibition. BMC Bioinform. 2008;9:326.
15. Ruijter JM, Ramakers C, Hoogaars WMH, Karlen Y, Bakker O, van den Hoff MJB, et al. Amplification efficiency: linking baseline and bias in the analysis of quantitative PCR data. Nucleic Acids Res. 2009;37:e45–e45.
16. Zhang L, Dong R, Wei S, Zhou H-C, Zhang M-X, Alagarsamy K. A novel data processing method CyC* for quantitative real time polymerase chain reaction minimizes cumulative error. PLoS ONE. 2019;14:e0218159.
17. Zhang Y, Li H, Shang S, Meng S, Lin T, Zhang Y, et al. Evaluation validation of a qPCR curve analysis method and conventional approaches. BMC Genom. 2021;22:680.
18. Spiess A-N. Modelling and analysis of real-time PCR data. CRAN; 2018. p. 109.
19. Boggy GJ, Woolf PJ. A mechanistic model of PCR for accurate quantification of quantitative PCR data. PLoS ONE. 2010;5:e12355.
20. Lievens A, Van Aelst S, Van den Bulcke M, Goetghebeur E. Enhanced analysis of real-time PCR data by using a variable efficiency model: FPK-PCR. Nucleic Acids Res. 2012;40:e10–e10.
21. Vermeulen J, De Preter K, Naranjo A, Vercruysse L, Van Roy N, Hellemans J, et al. Predicting outcomes for children with neuroblastoma using a multigene-expression signature: a retrospective SIOPEN/COG/GPOH study. Lancet Oncol. 2009;10:663–71.
22. Rao X, Huang X, Zhou Z, Lin X. An improvement of the 2ˆ(-delta delta CT) method for quantitative real-time polymerase chain reaction data analysis. Biostat Bioinform Biomath. 2013;3:71–85.
23. Posit team. RStudio: integrated development environment for R. 2023.
24. R Core Team. R: a language and environment for statistical computing. 2023.
25. Tellinghuisen J. Estimating real-time qPCR amplification efficiency from single-reaction data. Life. 2021;11:693.
26. Wang G, Becker E, Mesa C. Optimization of 6-carboxy-X-rhodamine concentration for real-time polymerase chain reaction using molecular beacon chemistry. Can J Microbiol. 2007;53:391–7.
27. Pödör Z, Manninger M, Jereb L. Advanced computational methods for knowledge engineering. Cham: Springer; 2014.
Acknowledgements
Not applicable.
Funding
Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB). This research received no external funding.
Author information
Contributions
MG and MI conceptualized the research idea, interpreted the results, and wrote the article. MG wrote the R code and programmed the f0% analyzer Excel sheet. All authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Additional file 1: Fig. S1.
Graphical representation of Dilution curves (1–10).
Additional file 2: Fig. S2.
Graphical representation of Dilution curves (11–20).
Additional file 3: Table S1.
Results of the performance indicators of the Ct, LinRegPCR, Cy0, and f0% methods considering the presence of the standard curve. Table S2. Results of the performance indicators of the Ct, LinRegPCR, Cy0, and f0% methods assuming the absence of the standard curve.
Additional file 4:
R scripts used in the analysis process.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Gamal, M., Ibrahim, M.A. Introducing the f0% method: a reliable and accurate approach for qPCR analysis. BMC Bioinformatics 25, 17 (2024). https://doi.org/10.1186/s12859-024-05630-y
DOI: https://doi.org/10.1186/s12859-024-05630-y