Standardized high-throughput evaluation of cell-based compound screens

Background High-throughput screening of pharmaceutical compound activity in tissue culture experiments requires time-consuming repeated analysis of the large amounts of data generated. Automating the evaluation procedure and assessing measurement accuracy can save time and improve the comparability of results. Results We present a tool for simultaneous evaluation of an arbitrary number of compound screens, including a standardized statistical validation. It is provided as a novel R package with a Tcl/Tk-based GUI for convenient use in the lab and runs on common platforms such as Linux, Windows and Mac OS. In a compound screen of lung cancer cells, the tool was applied successfully and efficiently for data analysis. Conclusion The package provides an efficient and intuitive platform for automatic evaluation of compound screens, improving the performance and standardization of data analysis.


Background
Cell-based screening of the cytotoxic activity of chemical compounds in cancer cells has emerged as a widely used method in the drug discovery process. Typically, cells are treated with several concentrations of a compound in 96- or 384-well microtiter plates for a predefined time period. A common method to evaluate these experiments quantitatively is to determine the half-maximal inhibitory concentration (IC50), at which cell growth is inhibited by 50%. Comprehensive efforts have been focused on screening experiments with thousands of compounds in industrial laboratories as well as institutions of public health. A screen of 60 cancer cell lines with a large library of agents was supervised by the National Cancer Institute [1]. Yet, these compound screens lack a standardized tool and implementation for automatic high-throughput evaluation. We propose the methods and software applied for evaluation in a screen of non-small cell lung cancer (NSCLC) in vitro cell cultures as a standard for future cell line screens. The implementation is available for download under the General Public License (GPL).

Evaluation and validation of compound screens
For $l = 1, \ldots, k$, consider the screen of the $l$th compound at log-transformed concentrations $X_{lj}$ with $j = 1, \ldots, m_l$. Denote by $Y_{lij}$ the observed proportion of cells still alive at concentration $j$ in the $i$th replicate, where $i = 1, \ldots, n_l$. This determines $n_l$ dose-response curves, the $i$th of which is formed by the points $(X_{l1}, Y_{li1}), \ldots, (X_{l m_l}, Y_{l i m_l})$. One IC50 value can be determined from each of these curves as the preimage $c_{li}$ of the 50% point under a linear spline. In real experiments, this value may not be uniquely determined, as the curve may cross the 50% point several times. In these cases, it is most appropriate to define the IC50 value as the smallest concentration at which this occurs. The resulting IC50 from the repeated screen is determined as the mean of these $n_l$ concentrations, with a 95% confidence interval computed from this mean and the standard deviation of the $n_l$ values. This makes use of the fact that the IC50 concentrations are normally distributed after the above logarithmic transformation, which is inverted after the analysis. If most samples are resistant to a particular compound in the overall screen, we propose to determine the 25% inhibitory concentration (IC25 value) instead to get a more widespread profile for that sample. To assess the accuracy of an experiment, one point of interest is the variability of the resulting IC50 values, which can be quantified by their coefficient of variation $v$. In addition, the standard deviations of the raw data can be determined for each concentration to verify the initial validity of the measurements. As this results in a total of $m_l$ values, it is reasonable to regard the maximum of these values as a measure of the overall accuracy of the data points.
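The evaluation steps described above can be illustrated with a minimal Python sketch. It is not the code of the 'ic50' package; the function names, the base-10 logarithm, and the normal quantile 1.96 for the 95% interval are assumptions made for illustration only (a t quantile would be an equally plausible choice). Curves that already start below the chosen level are treated as invalid here, which is also an assumption.

```python
import math
from statistics import mean, stdev


def inhibitory_concentration(log_conc, viability, level=0.5):
    """Return the smallest log-concentration at which the piecewise-linear
    dose-response curve reaches `level`, or None if there is none.

    log_conc must be sorted in increasing order; viability holds the
    normalized proportions of living cells at those concentrations.
    """
    # Assumption: a curve that already starts below the level is invalid.
    if viability[0] < level:
        return None
    points = list(zip(log_conc, viability))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 == level:
            return x0
        if y0 > level > y1:
            # Linear interpolation between the two neighbouring points.
            return x0 + (level - y0) * (x1 - x0) / (y1 - y0)
    return log_conc[-1] if viability[-1] == level else None


def summarize(log_ic_values, quantile=1.96):
    """Mean and 95% confidence interval of at least two replicate values,
    back-transformed from the (assumed) log10 scale."""
    n = len(log_ic_values)
    centre = mean(log_ic_values)
    half_width = quantile * stdev(log_ic_values) / math.sqrt(n)
    return {
        "ic": 10 ** centre,
        "ci": (10 ** (centre - half_width), 10 ** (centre + half_width)),
    }


log_conc = [-2.0, -1.0, 0.0, 1.0]
replicates = [
    [1.00, 0.80, 0.20, 0.10],
    [1.00, 0.40, 0.60, 0.10],  # crosses the 50% point several times
    [0.95, 0.70, 0.30, 0.05],
]
values = [inhibitory_concentration(log_conc, y) for y in replicates]
result = summarize([v for v in values if v is not None])
```

Passing `level=0.75` to `inhibitory_concentration` gives the corresponding IC25 value, since 25% inhibition leaves 75% of the cells alive.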

Features of the R package
The novel add-on package 'ic50' is available for download from the Comprehensive R Archive Network (CRAN) and automates the evaluation methods described above. The functions of the package can be used directly on the R console but can also be accessed through an intuitive GUI (Figure 1). The main feature that makes the tool useful in practice is that all data in an arbitrary directory on the local hard disk can be evaluated simultaneously with a single mouse click.
In particular, the amount of data to be evaluated is not limited and may comprise screens of hundreds or thousands of compounds or samples, respectively, as long as the same design is shared by all experiments.
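The batch evaluation described above can be pictured as a loop over all raw data files in one directory. The following Python sketch only illustrates this idea and does not reproduce the package's R functions; `read_plate`, `evaluate_directory`, the `*.txt` pattern and the assumption of purely numeric files without header lines are all hypothetical.

```python
from pathlib import Path


def read_plate(path):
    """Read one tab-delimited raw data file into a list of rows of floats."""
    rows = []
    for line in Path(path).read_text().splitlines():
        if line.strip():  # skip blank lines
            rows.append([float(cell) for cell in line.split("\t")])
    return rows


def evaluate_directory(directory, evaluate, pattern="*.txt"):
    """Apply `evaluate` to every matching file; results are keyed by file name."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        results[path.stem] = evaluate(read_plate(path))
    return results
```

Because the same `evaluate` callable is applied to every file, this only works if all experiments share one plate design, which matches the restriction stated above.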
Microtiter plates with 96 or 384 wells are currently supported. Raw data are expected as tab-delimited text files, which are the typical output of microplate readers. The arrangement of the measurements on the well matrix can differ between experimental setups. To address this, the design is configured by three separate files: one specifying the coordinates of the wells for the actual compound measurements, one specifying the locations of the control measurements used for normalization, and one specifying the concentration used for each measurement. Several samples of such files are distributed together with the package. Normalization with control wells can be performed either by taking the mean of a specified control row or by using one single control well per concentration; in both cases, wells can be used multiple times. The inhibitory percentage can be set to 50% for all compounds, which is the default, or to any other individual value, e.g. to calculate IC25 values. Graphical output can be modified by additional options.
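The two normalization modes can be sketched as follows. This is a hypothetical Python illustration, not the package's implementation or file format: the well coordinates are given directly as zero-based (row, column) pairs rather than read from design files, and the pairing with concentrations is omitted.

```python
from statistics import mean


def normalize(plate, measurement_wells, control_wells):
    """Convert raw signals into proportions of the control signal.

    plate             -- list of rows of raw readings
    measurement_wells -- one (row, column) pair per concentration
    control_wells     -- for each measurement well, the list of control
                         wells whose mean serves as the 100% reference
    """
    proportions = []
    for (row, col), controls in zip(measurement_wells, control_wells):
        reference = mean([plate[r][c] for r, c in controls])
        proportions.append(plate[row][col] / reference)
    return proportions


plate = [
    [200.0, 160.0, 40.0],   # compound row
    [190.0, 200.0, 210.0],  # control row
]
measurements = [(0, 0), (0, 1), (0, 2)]
control_row = [(1, 0), (1, 1), (1, 2)]

# Mode 1: the mean of a whole control row is reused for every concentration.
by_row = normalize(plate, measurements, [control_row] * 3)
# Mode 2: one dedicated control well per concentration.
by_well = normalize(plate, measurements, [[well] for well in control_row])
```

Representing the controls as one list per measurement well covers both modes with the same function and allows a control well to be used several times.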
As for any R package, detailed documentation of all features is available, with additional examples for illustration and a step-by-step tutorial document guiding users through the preparation of their data and configuration files for analysis with the tool (Additional file 1).

Results and Discussion
Results from an evaluation of the lung cancer cell line H3255 under treatment with seven different compounds are given in Table 1, with the corresponding dose-response curves for gefitinib and SU11274 in Figure 2. The measurements were carried out using a Mithras LB 940 multimode reader (Berthold Technologies, Bad Wildbad, Germany), and the output files were converted to tab-delimited text files before the evaluation. In general, the numeric results for all compounds in the screen are written to one single text file with the structure of Table 1, and the graphics output, as exemplified in Figure 2, is written to one single PDF file in the specified output directory. The cell line H3255 carries an activating mutation of the EGFR gene, making it sensitive to the EGFR inhibitors gefitinib and erlotinib [2]. The full data collection of this compound screen will be published elsewhere [3].
Several kinds of unexpected behaviour may occur in real experiments. The curve can be non-monotonic and cross the 50% point several times (Additional file 2, figure (c)). As mentioned above, the smallest of these concentrations is returned in this case. On the other hand, erroneous measurements may yield a monotonically increasing curve with viability below 50% even for small concentrations (Additional file 2, figure (d)). In this case, the tool returns an NA value for the IC50 concentration.
The lowest IC50 value in the H3255 cells was observed under gefitinib treatment, thus confirming the appropriateness of our screening and analytical approaches [2]. For the coefficient of variation $v$ of the IC50 values, a common standard is to require $v < 0.05$ for reasonable accuracy. In Table 1, the maximum standard deviation of the raw data ranges between 0.0815 and 0.2413, suggesting an upper threshold of $\tau = 0.2$ for validation. The measurements for rapamycin show very strong variability with an artificially wide confidence interval. For the cell line screen, this result was therefore discarded and replaced by a repeated experiment.
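The two validation criteria can be combined into a simple check, sketched here in Python under stated assumptions: the function name and return structure are hypothetical, sample standard deviations are used, the coefficient of variation is computed on the IC50 values exactly as passed in, and the limits 0.05 and 0.2 are the values discussed above.

```python
from statistics import mean, stdev


def validate(ic_values, replicate_curves, max_cv=0.05, tau=0.2):
    """Check replicate IC50 values and raw viabilities against both limits.

    ic_values        -- one IC50 value per replicate
    replicate_curves -- one list of viability proportions per replicate,
                        all covering the same concentrations
    """
    cv = stdev(ic_values) / mean(ic_values)
    # Standard deviation across replicates, separately for each concentration.
    max_sd = max(stdev(column) for column in zip(*replicate_curves))
    return {"cv": cv, "max_sd": max_sd, "valid": cv < max_cv and max_sd <= tau}


curves = [[1.0, 0.8, 0.2], [1.0, 0.6, 0.2]]
accepted = validate([1.00, 1.02, 0.98], curves)
rejected = validate([1.0, 2.0, 3.0], curves)
```

A result flagged as invalid would be a candidate for a repeated experiment, as was done for rapamycin.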

Conclusion
In summary, the 'ic50' package provides a platform for time-efficient evaluation of cell-based compound screens. The experimental setup can be configured in any order and re-used for multiple subsequent analyses. A standardized validation is included in the tool and can be used to assess the accuracy of the experiments. The approach is suitable to confirm biological activity of targeted drugs in cancer cells with specific genetic lesions.