Open Access

vFitness: a web-based computing tool for improving estimation of in vitro HIV-1 fitness experiments

  • Jingming Ma1,
  • Carrie Dykes2,
  • Tao Wu1,
  • Yangxin Huang3,
  • Lisa Demeter2 and
  • Hulin Wu1
BMC Bioinformatics 2010, 11:261

DOI: 10.1186/1471-2105-11-261

Received: 7 January 2010

Accepted: 18 May 2010

Published: 18 May 2010



The replication rate (or fitness) between viral variants has been investigated in vivo and in vitro for human immunodeficiency virus (HIV). HIV fitness plays an important role in the development and persistence of drug resistance. The accurate estimation of viral fitness relies on complicated computations based on statistical methods. This calls for tools that are easy to access and intuitive to use for various experiments of viral fitness.


Based on a mathematical model and several statistical methods (least-squares approach and measurement error models), a Web-based computing tool has been developed for improving estimation of virus fitness in growth competition assays of human immunodeficiency virus type 1 (HIV-1).


Unlike the two-point calculation used in previous studies, the estimation here uses linear regression methods with all observed data in the competition experiment to more accurately estimate relative viral fitness parameters. The dilution factor is introduced for making the computational tool more flexible to accommodate various experimental conditions. This Web-based tool is implemented in C# language with Microsoft ASP.NET, and is publicly available on the Web at


The replication rate (or fitness) between viral variants has been investigated in vivo [1, 2] and in vitro [3-7] for human immunodeficiency virus (HIV). The lack of a consensus on how to measure fitness makes it difficult to determine whether replication capacity is important in disease progression. An accurate method for calculating fitness, along with an easy-to-use tool, will be valuable to virologists who study virus fitness.

Although the importance of HIV fitness in disease progression is unknown, fitness itself plays an important role in drug resistance [8]. To develop a better understanding of viral fitness, Marée et al. proposed a mathematical model to describe the dynamics of viral competition between a wild-type virus and a mutant virus, and presented a formula to calculate the relative fitness 1+s based on data collected at two time points during the course of the experiment [6]. Here, s is the selection coefficient [9]. If there are more than two time points, investigators must choose a pair of time points for the calculation of relative fitness, and the formula does not provide a way to obtain a more accurate estimate from all the observed data. Bonhoeffer et al. proposed a more complicated approach for estimating viral fitness from time-series data [3], based on the work of Marée et al. [6]. Most recently, Wu et al. combined a mathematical model and statistical methods for estimating virus fitness in growth competition assays [7], an approach that is more in line with population biologists' definition of fitness [9] than the work of Marée et al. [6].

In this paper, we present a Web-based computing tool based on linear regression methods for improving the estimation of in vitro HIV-1 virus fitness measured by the growth competition experiment [7]. We will briefly describe the methods and models used in this computing tool, including the growth competition experimental design, a differential equation model, the least-squares regression, and the linear regression with measurement error. Then we will describe software specifications, like the graphic user interface for the estimation, and dilution factors for various experiments. With the data from two experiments of in vitro HIV-1 growth competition assay, we use this Web-based tool to estimate the fitness parameters and compare the estimation results with two-point calculations used in previous studies. The Web-based tool is implemented in C# with Microsoft ASP.NET. We also implemented validation controls into the web interface to help users input the correct data. The two-point calculation of virus fitness is also provided in this tool for the purpose of comparison.


Growth Competition Assay of HIV-1

A growth competition assay developed by Dykes et al. is used here to measure HIV-1 replication fitness by using flow cytometry to determine the relative proportion of test (mutant) and reference (wild-type) viruses [4]. PM1 cells were infected with two virus stocks, each expressing a unique marker that is detected on the surface of the infected cell. After a 1-hour incubation at 37°C, unbound viruses were washed out with phosphate-buffered saline (PBS). Cells were then seeded in medium and cultured at 37°C. On days 3, 4, 5, and 6, half of the culture was removed and fresh medium was added. Cells removed from culture were stained with antibodies specific to the infection markers and fixed before analysis by flow cytometry. The numbers of wild-type and mutant infected cells were calculated by multiplying the percentage of cells determined by flow cytometry by the absolute number of viable cells in the culture, measured by automated cell counting.


Nowak and May have discussed the general forms of virus dynamics in their book [10], and some simple mathematical models have been used for the estimation of relative fitness in HIV-1 fitness experiments [1, 3, 6]. Wu et al. used a mathematical model of five ordinary differential equations with five compartments: uninfected target cells (T), cells infected by mutant virus (Tm), cells infected by wild-type virus (Tw), mutant viruses (M), and wild-type viruses (W) [7]. The model can be simplified to three equations involving T, Tm, and Tw under the quasi-steady-state (QSS) assumption, which takes the amount of free virus to be proportional to the number of infected cells. Under QSS, the two equations for the rate of change of infected cells can be written in the following form [7],

$$\frac{dT_m}{dt} = (k_m T - \delta_m)\,T_m \tag{1a}$$

$$\frac{dT_w}{dt} = (k_w T - \delta_w)\,T_w \tag{1b}$$

where δm represents the death rate of Tm, and δw the death rate of Tw. If we assume that the number of target cells is constant, integrating Equations (1a) and (1b) over the time period from t1 to t2 will yield

$$\ln\frac{T_m(t_2)}{T_m(t_1)} = (k_m T - \delta_m)\,\Delta t, \qquad \ln\frac{T_w(t_2)}{T_w(t_1)} = (k_w T - \delta_w)\,\Delta t \tag{2}$$

where Δt = t2 - t1. By introducing g_m = k_m T - δ_m and g_w = k_w T - δ_w for the net growth rates of mutant and wild-type infected cells, we have the following three formulas based on two data points to measure fitness parameters,

$$p = \frac{k_m}{k_w} = \frac{\frac{1}{\Delta t}\ln\frac{T_m(t_2)}{T_m(t_1)} + \delta_m}{\frac{1}{\Delta t}\ln\frac{T_w(t_2)}{T_w(t_1)} + \delta_w} \tag{3}$$

$$r = \frac{g_m}{g_w} = \frac{\ln\left[T_m(t_2)/T_m(t_1)\right]}{\ln\left[T_w(t_2)/T_w(t_1)\right]} \tag{4}$$

$$d = g_m - g_w = \frac{1}{\Delta t}\ln\frac{T_m(t_2)/T_w(t_2)}{T_m(t_1)/T_w(t_1)} \tag{5}$$

where p is the production rate ratio, r the log fitness ratio, and d the log relative fitness. The relative fitness 1+s is then calculated as

$$1 + s = \exp(d) \tag{6}$$

where s is the selection coefficient [9].
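The two-point calculation can be sketched in a few lines of Python. This is only an illustration, not the vFitness code: it assumes p = k_m/k_w, r = g_m/g_w, d = g_m - g_w and 1+s = exp(d), with a constant number of target cells. The function name and the sample counts are hypothetical.

```python
import math

def two_point_fitness(tm1, tm2, tw1, tw2, dt, delta_m=0.5, delta_w=0.5):
    """Two-point estimates of p, r, d and 1+s from infected-cell counts.

    tm1, tm2: mutant-infected cells at t1 and t2 (dilution-corrected)
    tw1, tw2: wild-type-infected cells at t1 and t2 (dilution-corrected)
    dt:       t2 - t1, in days
    """
    log_m = math.log(tm2 / tm1)  # equals g_m * dt
    log_w = math.log(tw2 / tw1)  # equals g_w * dt
    p = (log_m / dt + delta_m) / (log_w / dt + delta_w)
    r = log_m / log_w
    d = (log_m - log_w) / dt
    return {"p": p, "r": r, "d": d, "one_plus_s": math.exp(d)}

# Hypothetical counts generated with g_m = 0.9 and g_w = 1.0 per day.
est = two_point_fitness(
    tm1=1000.0, tm2=1000.0 * math.exp(0.9 * 2.0),
    tw1=500.0, tw2=500.0 * math.exp(1.0 * 2.0),
    dt=2.0,
)
print(est)
```

With these synthetic inputs the function recovers the growth rates used to generate them, which is a quick sanity check on the algebra.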

Linear Regression

Multiple data points

For growth competition experiments with more than two observations, we use statistical methods to obtain more accurate estimates of virus fitness. Let t_i be the time point of the i-th observation of Tm and Tw (i = 0, 1, ..., N-1), and Δt_j be the time interval t_j - t_0 (j = 1, ..., N-1). We also introduce two variables as follows,

$$m_j = \ln\frac{T_m(t_j)}{T_m(t_0)} + \delta_m\,\Delta t_j, \qquad w_j = \ln\frac{T_w(t_j)}{T_w(t_0)} + \delta_w\,\Delta t_j \tag{7}$$

Then, the general form of Equation (3) can be written as

$$m_j = p\,w_j, \qquad j = 1, \ldots, N-1 \tag{8}$$

where the two variables m_j and w_j form a linear relationship. Therefore, the parameter p can be estimated by linear regression from the observed values of wild-type infected cells and mutant infected cells. Similarly, we can use linear regression to estimate the parameters r and d. Finally, the relative fitness 1+s can be estimated by exp(d) as indicated in Eq.(6). The following sections briefly describe the two linear regression methods used in our computing tool: the least-squares approach and the measurement error models.
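As a hedged sketch of how the regression variables might be built from a time series, the following Python function computes m_j and w_j relative to the first observation. It assumes m_j = ln(Tm(t_j)/Tm(t_0)) + δm Δt_j and the analogous form for w_j, and that the counts have already been multiplied by their dilution factors. The function name and the synthetic data are hypothetical.

```python
import math

def transformed_variables(times, tm, tw, delta_m=0.5, delta_w=0.5):
    """Return the lists (m, w) for j = 1..N-1, relative to the first time point."""
    t0 = times[0]
    m, w = [], []
    for t, tm_j, tw_j in zip(times[1:], tm[1:], tw[1:]):
        dt = t - t0
        m.append(math.log(tm_j / tm[0]) + delta_m * dt)
        w.append(math.log(tw_j / tw[0]) + delta_w * dt)
    return m, w

# Synthetic, noise-free series with g_m = 0.9 and g_w = 1.0 per day,
# so that m_j / w_j should equal (0.9 + 0.5) / (1.0 + 0.5) at every j.
times = [3, 4, 5, 6]
tm = [1000.0 * math.exp(0.9 * (t - 3)) for t in times]
tw = [500.0 * math.exp(1.0 * (t - 3)) for t in times]
m, w = transformed_variables(times, tm, tw)
print([round(mj / wj, 4) for mj, wj in zip(m, w)])
```

With real, noisy data the ratios differ from point to point, which is why a regression over all (w_j, m_j) pairs is used instead of a single ratio.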

Least-squares approach

The term linear regression refers to the fact that correlation and regression measure only a linear relationship between two variables. The typical linear regression model without intercept is described as

$$Y_i = \beta x_i + \varepsilon_i, \qquad i = 1, \ldots, n \tag{9}$$

where x_i is the predictor variable, Y_i the observed response, and ε_i the random error with a normal distribution N(0, σ_ε^2). According to the least-squares approach, the estimate of the parameter β can be expressed as

$$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2} \tag{10}$$
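A minimal Python sketch of least squares through the origin follows. The slope is the usual ratio of Σ x_i Y_i to Σ x_i². The standard error shown here uses the textbook formula with n - 1 residual degrees of freedom, which is an assumption and not necessarily how vFitness computes its reported SD. The function name and sample values are hypothetical.

```python
import math

def ls_through_origin(x, y):
    """Least-squares slope for y = beta * x (no intercept) and its standard error."""
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    beta = sxy / sxx
    n = len(x)
    rss = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    # One parameter is estimated, so n - 1 degrees of freedom remain.
    se = math.sqrt(rss / (n - 1) / sxx) if n > 1 else float("nan")
    return beta, se

# Hypothetical (w_j, m_j) pairs.
x = [1.5, 3.0, 4.5]
y = [1.4, 2.9, 4.1]
beta, se = ls_through_origin(x, y)
print(round(beta, 4), round(se, 4))
```

In the fitness setting, x would hold the w_j values and y the m_j values, so that beta estimates the production rate ratio p.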

Linear regression with measurement errors

Measurement error models are described in the statistical literature [11, 12]. If the measurement errors follow normal distributions and are independent of each other, linear regression with measurement errors can be written as follows [12],

$$Y_i = \beta x_i + \varepsilon_i \tag{11-1}$$

$$X_i = x_i + u_i \tag{11-2}$$

Equation (11-1) is a specification of classical regression, but the true explanatory variable x_i is not observed directly. X_i in Eq.(11-2) denotes the observed measure of x_i, and u_i its measurement error. With the following notations of sample variance and covariance,
the regression coefficient β in Eq.(11) can be estimated in two cases: when the ratio of the measurement variances is known, or when the measurement variance is known. If the ratio is known, the estimation of β is
If the variance of the measurement error in the covariate is known, the estimation of β is
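For orientation, the two cases can be sketched in Python using the textbook estimators from Fuller's measurement error models [12]. Several choices below are assumptions rather than the article's exact formulas: the ratio rho is taken to be var(ε)/var(u), the moments are uncentered (matching a model without intercept) and divided by n, and all function names are hypothetical.

```python
import math

def uncentered_moments(x, y):
    """Second moments about zero: (m_xx, m_yy, m_xy). Assumed convention."""
    n = len(x)
    m_xx = sum(v * v for v in x) / n
    m_yy = sum(v * v for v in y) / n
    m_xy = sum(a * b for a, b in zip(x, y)) / n
    return m_xx, m_yy, m_xy

def me_ratio_known(x, y, rho=1.0):
    """Slope when rho = var(eps) / var(u) is known (assumed definition of rho)."""
    m_xx, m_yy, m_xy = uncentered_moments(x, y)
    diff = m_yy - rho * m_xx
    return (diff + math.sqrt(diff * diff + 4.0 * rho * m_xy * m_xy)) / (2.0 * m_xy)

def me_variance_known(x, y, var_u):
    """Slope when the covariate's measurement error variance var_u is known."""
    m_xx, _, m_xy = uncentered_moments(x, y)
    return m_xy / (m_xx - var_u)

# Noise-free hypothetical data on the line y = 0.9 * x.
x = [1.5, 3.0, 4.5]
y = [0.9 * v for v in x]
print(me_ratio_known(x, y, rho=1.0))
print(me_variance_known(x, y, var_u=0.0))
```

When var_u is zero the second estimator reduces to ordinary least squares through the origin. A positive var_u shrinks the denominator and increases the slope in magnitude, which corrects for the attenuation caused by error in the covariate.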

For most biologists who are interested in virus fitness, using those formulas to calculate the regression coefficient would be cumbersome, time-consuming, and impractical. Therefore, we developed a Web-based computing tool, vFitness. Investigators can use different statistical methods to improve the estimation of viral fitness.

Software Development

Web application

We have implemented a Web-based computing tool in C# language with ASP.NET under Microsoft .NET Framework, which provides a means to program Web pages on the Web server facilities of Internet Information Services (IIS). The code of this computing tool runs on the server machine, and investigators can use their web browser to estimate fitness.

Graphic user interface

This computing tool provides a graphic user interface for investigators to estimate the relative fitness in competition experiments. Investigators just need to type in the observed values for wild-type infected cells and mutant infected cells in the required format (values delimited by commas), along with the parameters (δm, δw). The estimate of virus fitness is then obtained by submitting the calculation request. The tool also provides validation controls to help users enter correct values for the calculation. Four types of validation controls (Range, Compare, RequiredField, RegularExpression) are used to verify the input values. For example, an error message is shown if the observations of Tm are not delimited by commas. The server code also checks the input values for errors. One such check ensures that the number of time points equals the number of observations.
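The kind of input checking described above can be illustrated with a short Python sketch. This is not the ASP.NET validation code used by vFitness. The regular expression, function names, and error messages are hypothetical, chosen only to show a comma-delimited format check and a count check.

```python
import re

# One or more non-negative decimal numbers separated by commas.
NUMBER_LIST = re.compile(r"^\s*\d+(\.\d+)?(\s*,\s*\d+(\.\d+)?)*\s*$")

def parse_values(text, label):
    """Parse a comma-delimited list of numbers or raise ValueError."""
    if not NUMBER_LIST.match(text):
        raise ValueError(f"{label}: expected numbers delimited by commas")
    return [float(v) for v in text.split(",")]

def validate_inputs(times_text, tm_text, tw_text):
    """Check format, matching lengths, and positivity of the observations."""
    times = parse_values(times_text, "time points")
    tm = parse_values(tm_text, "Tm")
    tw = parse_values(tw_text, "Tw")
    if not (len(times) == len(tm) == len(tw)):
        raise ValueError("number of time points must equal number of observations")
    if any(v <= 0 for v in tm + tw):
        raise ValueError("cell counts must be positive to take logarithms")
    return times, tm, tw

times, tm, tw = validate_inputs("3,4,5,6", "100, 200, 400, 800", "50,120,260,600")
print(times, tm, tw)
```

Checking on the server as well as in the browser, as the tool does, guards against requests that bypass the client-side controls.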

Dilution factor

Since the experimental design involves replacing half the culture with fresh media at each time point, we developed the graphic interface to accommodate the half dilution in growth competition assays and the other dilutions as well.

For an in vitro growth competition assay with a half dilution [4, 6], half the medium is taken out from the culture for counting and then thrown away at each time point. The observed data are the data from the half volume. So, the total infected cells in the initial culture would be two times the observed data, which results in a dilution factor of 2. The calculation model here is based on the total number of infected cells relative to the initial culture. The only exception is the estimation of parameter d, which depends on the ratio of two observations Tm and Tw at the same time-point in Eq.(5). Two examples of the dilution factor are given as follows,

  • If the half dilution is taken at every time point of Day 3, 4, and 5, the corresponding dilution factors would be 2, 4, and 8;

  • If one third of testing medium is taken away for counting at each time point of Day 3, 4, 5, and 6, the dilution factors would be 3, 4.5 (or 9/2), 6.75 (or 27/4), and 10.125 (or 81/8).
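The pattern behind both examples can be sketched in Python. If a fraction f of the culture is removed at every time point, the first factor is 1/f and each later factor is the previous one divided by (1 - f). The function name is hypothetical, and exact fractions are used so that values such as 81/8 are not rounded.

```python
from fractions import Fraction

def dilution_factors(removed_fraction, n_time_points):
    """Dilution factors when the same fraction is removed at every time point."""
    f = Fraction(removed_fraction)
    factors = []
    factor = 1 / f  # the first sample is a fraction f of the whole culture
    for _ in range(n_time_points):
        factors.append(factor)
        factor = factor / (1 - f)  # only (1 - f) of the lineage remains afterwards
    return factors

half = dilution_factors(Fraction(1, 2), 3)
third = dilution_factors(Fraction(1, 3), 4)
print([str(v) for v in half])
print([str(v) for v in third])
```

Passing a Fraction rather than a float keeps one third exact. A float such as 0.5 also works because it is exactly representable.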

Missing data

If the data are missing at one time point, we can ignore that time point and continue to estimate the fitness parameters with the rest of the data. For example, if the data from Day 4 of a 5-day experiment on Days 3, 4, 5, 6, and 7 (half dilution at each time point) were missing, the dilution factor would go from 2 on Day 3 to 8 on Day 5, since a dilution was still made on Day 4.

Note that the above case is different from the case of four observations at Day 3, 5, 6, and 7, in which no dilution takes place on Day 4 and the dilution factors are still 2, 4, 8, and 16.
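The distinction between a missing observation and a skipped dilution can be made concrete with a small Python sketch. It assumes the same fraction is removed on every dilution day and that every observed day is also a dilution day, because the sample comes from the removed portion. The function name is hypothetical.

```python
def factors_for_observed_days(dilution_days, observed_days, removed_fraction=0.5):
    """Dilution factors for the days that were actually counted.

    dilution_days: every day on which part of the culture was removed
    observed_days: the subset of those days with usable counts
    """
    factors = {}
    kept = 1.0  # fraction of the original lineage still in the culture
    for day in sorted(dilution_days):
        factors[day] = 1.0 / (removed_fraction * kept)
        kept *= 1.0 - removed_fraction
    return [factors[day] for day in observed_days]

# Day 4 was diluted but its count is missing.
missing_day4 = factors_for_observed_days([3, 4, 5, 6, 7], [3, 5, 6, 7])
# No dilution took place on Day 4 at all.
no_day4 = factors_for_observed_days([3, 5, 6, 7], [3, 5, 6, 7])
print(missing_day4, no_day4)
```

The factor depends on how many dilutions have happened, not on how many observations were recorded.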

Software deployment

This Web-based computing tool has been deployed on a server running the Windows 2003 operating system. The web server must run IIS (Internet Information Services) and FrontPage Server Extensions, and must have the .NET Framework installed. This computing tool can be freely used on the Web at


HIV-1 replication fitness experiments

The growth competition assay mentioned above has been used for experiments on HIV replication fitness in cell culture [4]. Seven million PM1 cells were infected with a total of 300 ng of virus at a ratio of 75% mutant to 25% wild-type. The AT2V106I mutant virus was used in one experiment, and the AT2Y188C mutant virus in the other. The same wild-type virus, AT1WT, was used in both experiments. On days 3, 4, 5, and 6, half of the culture was removed and replaced with fresh medium. Cells removed from the culture were measured by flow cytometry. Table 1 and Table 2 show the measurements for the mutant infected cells Tm and the wild-type infected cells Tw in the two experiments, respectively. The dilution factors (2, 4, 8, 16) have been applied at the four time points to express the counts relative to the initial culture.
Table 1

Observation of infected cells in AT1WT/AT2V106I fitness test

Number of infected cells | Day 3 | Day 4 | Day 5 | Day 6
T_w | | | |
T_m | | | |

Table 2

Observation of infected cells in AT1WT/AT2Y188C fitness test

Number of infected cells | Day 3 | Day 4 | Day 5 | Day 6
T_w | | | |
T_m | | | |

Fitness estimation by statistical methods

Both experiments here have four time points. This computing tool can easily be used to obtain fitness estimates over all observations based on three linear regression approaches: the least-squares approach (LS), the measurement error model with the variance ratio known (MEr), and the measurement error model with the variance known (MEv). We set δm = 0.5 and δw = 0.5 for all estimations (the same death rate chosen in [6]; see [13] for further discussion), a variance ratio of ρ = 1 for MEr, and a measurement error variance of 0.2 for MEv. Table 3 and Table 4 show the parameter estimates from the two experiments, respectively, with the standard deviation (SD) listed in parentheses. The tool also calculated the fitness parameter by the average method (AM) [3], in which the production rate ratio p is calculated for each consecutive pair of time points according to Equation 2.4 in the work of Marée et al. [6] and the values are then averaged. All three statistical approaches gave very close estimates of the fitness parameters. The simulation analysis in the work of Wu et al. has already shown that the LS, MEr, and MEv approaches yield better estimates than the AM method in terms of mean squared error [7].
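The average method can be sketched as follows. This Python illustration assumes the two-point form of p, namely (ln(Tm2/Tm1)/Δt + δm) / (ln(Tw2/Tw1)/Δt + δw), applied to consecutive time points and then averaged. The function names and the synthetic series are hypothetical, and the inputs are assumed to be dilution-corrected counts.

```python
import math

def two_point_p(tm1, tm2, tw1, tw2, dt, delta_m=0.5, delta_w=0.5):
    """Production rate ratio p from two observations."""
    return (math.log(tm2 / tm1) / dt + delta_m) / (math.log(tw2 / tw1) / dt + delta_w)

def average_method_p(times, tm, tw, delta_m=0.5, delta_w=0.5):
    """Mean of the two-point p over consecutive pairs of time points."""
    values = [
        two_point_p(tm[i], tm[i + 1], tw[i], tw[i + 1],
                    times[i + 1] - times[i], delta_m, delta_w)
        for i in range(len(times) - 1)
    ]
    return sum(values) / len(values)

# Synthetic, noise-free series with g_m = 0.9 and g_w = 1.0 per day.
times = [3, 4, 5, 6]
tm = [1000.0 * math.exp(0.9 * (t - 3)) for t in times]
tw = [500.0 * math.exp(1.0 * (t - 3)) for t in times]
p_am = average_method_p(times, tm, tw)
print(round(p_am, 4))
```

On noise-free data every consecutive pair gives the same p, so AM and the regression approaches agree. They differ only once measurement noise enters.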
Table 3

Fitness estimation from AT1WT/AT2V106I experiment

Method | p (SD) | r (SD) | d (SD) | 1 + s
LS | 0.9988 (0.0831) | 0.9970 (0.106) | 0.1157 (0.353) |
MEr | 1.0023 (0.0835) | 1.0026 (0.107) | |
MEv | 0.9974 (0.0836) | 0.9945 (0.107) | |
AM | | | |

Table 4

Fitness estimation from AT1WT/AT2Y188C

Method | p (SD) | r (SD) | d (SD) | 1 + s
LS | 0.8532 (0.0471) | 0.8104 (0.0557) | -0.2181 (0.271) |
MEr | 0.8543 (0.0471) | 0.8119 (0.0558) | |
MEv | 0.8486 (0.0476) | 0.8029 (0.0567) | |
AM | | | |

Estimation with missing data

As mentioned earlier, the Web-based tool can handle virus fitness experiments with missing data by setting the dilution factors accordingly. As examples, we analyzed data from the AT1WT/AT2Y188C experiment in two cases: one with data missing on Day 4 and the other with data missing on Day 5, where half of the culture was removed but could not be counted correctly. The dilution factors were 2, 8, and 16 for the first case, and 2, 4, and 16 for the second. Table 5 shows the estimates of parameter p for both cases. The estimates from the two cases are very close to each other and approximately equal to the values shown in Table 4, except for the average method (AM).
Table 5

Parameter p estimation with missing data in AT1WT/AT2Y188C

Missing data
on Day 4
on Day 5

Comparison with two-point calculation

With data from the two experiments, we used this computing tool to calculate the fitness parameters for all pairs of time points. Table 6 shows the production rate ratio p calculated for each pair of time points. The results vary depending on the pair chosen. We believe this is due to differences in culture conditions from day to day. Therefore, estimating fitness by linear regression, which uses all the observations from the assay, should be more accurate.
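To show how a two-point estimate can depend on the chosen pair, the following Python sketch evaluates p for every pair of time points. It assumes the two-point form of p used throughout this article. The counts are hypothetical, dilution-corrected values made up for illustration, not the measurements from Table 1 or Table 2.

```python
import math
from itertools import combinations

def two_point_p(tm1, tm2, tw1, tw2, dt, delta_m=0.5, delta_w=0.5):
    """Production rate ratio p from two observations."""
    return (math.log(tm2 / tm1) / dt + delta_m) / (math.log(tw2 / tw1) / dt + delta_w)

def pairwise_p(days, tm, tw, delta_m=0.5, delta_w=0.5):
    """Map each (day_i, day_j) pair with i < j to its two-point estimate of p."""
    results = {}
    for i, j in combinations(range(len(days)), 2):
        results[(days[i], days[j])] = two_point_p(
            tm[i], tm[j], tw[i], tw[j], days[j] - days[i], delta_m, delta_w
        )
    return results

# Hypothetical noisy counts for a four-day experiment.
days = [3, 4, 5, 6]
tm = [1.0e4, 2.6e4, 5.9e4, 1.5e5]
tw = [5.0e3, 1.3e4, 3.8e4, 9.8e4]
table = pairwise_p(days, tm, tw)
for pair, p in table.items():
    print(pair, round(p, 4))
```

Four time points give six pairs, matching the six columns of Table 6, and with noisy counts the six values generally differ from one another.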
Table 6

Parameter p based on two-point calculations

Pair of time-points | Day 3 & 4 | Day 3 & 5 | Day 3 & 6 | Day 4 & 5 | Day 4 & 6 | Day 5 & 6

We have developed a Web-based computing tool for improving the estimation of HIV-1 fitness. The tool is based on a mathematical model and linear regression methods that use multiple measurements over time. Two HIV-1 fitness experiments were completed in this study using growth competition (one with the AT2V106I mutant virus, and the other with the AT2Y188C mutant virus), and the experimental data were used to evaluate the fitness estimates produced by the tool. The least-squares approach and the measurement error models are well suited to fitness estimation in HIV-1 growth competition assays, even when data points are missing. The tool provides an easy way to obtain a more accurate estimate by using all observations in a fitness experiment.

For comparison, this computing tool also provides the two-point calculation used in the previous studies. Our data has shown that the calculation of the fitness parameter can be very different depending on the pair of time points chosen. Therefore, using all time points to calculate fitness will incorporate the variability from day to day. This computing tool is implemented in C# with Microsoft ASP.NET. The tool provides a graphic user interface and validation controls. Introducing the dilution factor makes it more adaptable to different experimental designs. In this study we competed mutant and wild-type viruses. However, it can be used with any two competing strains of virus by letting W represent one of the strains. This computing tool can be freely used on the Web at

Availability and requirements

Project name: vFitness

Project home page:

Operating system: Platform independent, Web application

Program language: C# with ASP.NET

Any restrictions to use by non-academics: license needed



The authors are grateful for financial support from NIH/NIAID R01 AI041387, R01 AI065217, R01 AI087135, R21 AI078842, P30 AI078498, N01 AI50020, N01 AI50029, N01 AI70008, HHSN272200900041C, and University of Rochester Center for AIDS Research.

Authors’ Affiliations

Department of Biostatistics and Computational Biology, School of Medicine and Dentistry, University of Rochester
Department of Medicine, School of Medicine and Dentistry, University of Rochester
Department of Epidemiology and Biostatistics, College of Public Health, University of South Florida


  1. Goudsmit J, De Ronde A, De Rooij E, De Boer R: Broad spectrum of in vivo fitness of human immunodeficiency virus type 1 subpopulations differing at reverse transcriptase codons 41 and 215. J Virol 1997, 71: 4479–4484.
  2. Perelson AS, Neumann AU, Markowitz M, Leonard JM, Ho DD: HIV-1 dynamics in vivo: virion clearance rate, infected cell life-span, and viral generation time. Science 1996, 271: 1582–1586. 10.1126/science.271.5255.1582
  3. Bonhoeffer S, Barbour AD, De Boer RJ: Procedures for reliable estimation of viral fitness from time-series data. Proc R Soc Lond B 2002, 269: 1887–1893. 10.1098/rspb.2002.2097
  4. Dykes CJ, Wang J, Jin X, Planelles V, An DS, Tallo A, Huang Y, Wu H, Demeter LM: Evaluation of a multiple-cycle, recombinant virus, growth competition assay that uses flow cytometry to measure replication efficiency of human immunodeficiency virus type 1 in cell culture. J Clinical Microbiology 2006, 44: 1930–1943. 10.1128/JCM.02415-05
  5. Holland JJ, Dela Torre C, Clarke DK, Duarte E: Quantitation of relative fitness and great adaptability of clonal populations of RNA viruses. J Virol 1991, 65: 2960–2967.
  6. Marée AFM, Keulen W, Boucher CAB, De Boer RJ: Estimating relative fitness in viral competitive experiments. J Virol 2000, 74: 11067–11072. 10.1128/JVI.74.23.11067-11072.2000
  7. Wu H, Huang Y, Dykes C, Liu D, Ma J, Perelson AS, Demeter L: Modeling and estimation of replication fitness of human immunodeficiency virus type 1 in vitro experiments by using a growth competition assay. J Virol 2006, 80: 2380–2389. 10.1128/JVI.80.5.2380-2389.2006
  8. Dykes C, Demeter LM: Clinical significance of human immunodeficiency virus type 1 replication fitness. Clinical Microbiology Rev 2007, 20: 550–578. 10.1128/CMR.00017-07
  9. Domingo EL, Menendez-Arias L, Holland JJ: RNA virus fitness. Rev Med Virol 1997, 7: 87–96. 10.1002/(SICI)1099-1654(199707)7:2<87::AID-RMV188>3.0.CO;2-0
  10. Nowak MA, May RM: Virus Dynamics: Mathematical principles of immunology and virology. New York, Oxford Univ. Press; 2000.
  11. Carroll RJ, Ruppert D, Stefanski LA: Measurement error in nonlinear models. Chapman & Hall/CRC, New York; 1995.
  12. Fuller WA: Measurement error models. Wiley, New York; 1987.
  13. Samali A, Cotter TG: Measurement of cell death in culture. In Animal Cell Biotechnology: Methods and Protocols. Edited by: Jenkins N. Humana Press; 1999:155–164.


© Ma et al; licensee BioMed Central Ltd. 2010

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.