Reproducibility of gene expression across generations of Affymetrix microarrays
- Ashish Nimgaonkar†1, 2,
- Despina Sanoudou†3,
- Atul J Butte1, 2,
- Judith N Haslett3,
- Louis M Kunkel3,
- Alan H Beggs3 and
- Isaac S Kohane1, 2
© Nimgaonkar et al; licensee BioMed Central Ltd. 2003
Received: 11 March 2003
Accepted: 25 June 2003
Published: 25 June 2003
The development of large-scale gene expression profiling technologies is rapidly changing the norms of biological investigation. But the rapid pace of change itself presents challenges. Commercial microarrays are regularly modified to incorporate new genes and improved target sequences. Although the ability to compare datasets across generations is crucial for any long-term research project, to date no means to allow such comparisons have been developed. In this study the reproducibility of gene expression levels across two generations of Affymetrix GeneChips® (HuGeneFL and HG-U95A) was measured.
Correlation coefficients were computed for gene expression values across chip generations based on different measures of similarity. The absolute calls assigned to the individual probe sets were found to be largely unchanged across the generations.
We show that experimental replicates are highly reproducible, but that reproducibility across generations depends on the degree of similarity of the probe sets and the expression level of the corresponding transcript.
Expression microarrays provide a vehicle for exploring gene expression in a manner that is rapid, sensitive, systematic and comprehensive [1–6]. Thousands of genes can now be studied simultaneously without the need for an a priori candidate gene list. In order to keep up with advances in genome sequencing, the number and composition of representative gene sequences are frequently updated, and probe sets representing newly discovered expressed sequences are added to commercial microarrays. Furthermore, existing probe sets are revised because probe sequences once thought to be unique for a single gene are occasionally found to be less specific. This leads to the question of whether results from newer microarray generations are comparable to those of previous generations. The cost, time and irreplaceable nature of some of the samples used for microarray analysis require that a method to compare data from different generations be developed.
Although Affymetrix Chips can each measure the expression of over 12,000 genes and ESTs, the true transcript level is confounded by a substantial amount of noise and variability induced by both the large number of observations and the wide range of gene expression values. Microarrays are sensitive to noise from many sources, including the manufacturing process and the experimental processes (RNA isolation, labeling, hybridization, staining, washing and scanning). Even within the same generation of chips and for replicates of single tissue samples, there may be substantial variability in the measurement levels for the same gene. It is critical to distinguish this noise from the changes that are real. Many empirical approaches have been adopted to decrease noise in microarray-based experiments. Strategies for reducing noise include establishing an arbitrary global threshold for fold-changes, noise-filtering lookup tables, normalization techniques to make microarrays comparable, such as using ANOVA methods to provide estimates of changes in gene expression that are corrected for potential confounding effects [10, 11], and using replicate experiments to estimate the variability in reported gene expression. Applying fold-change thresholds has been the most common method of reducing noise by filtering out false positives [8, 9, 13]. However, much of the work done to date has focused on decreasing noise within the same generation and has not addressed the issue of comparability across generations.
In this analysis, we have estimated the level of congruency between corresponding probe sets on two generations of Affymetrix Chips, HuGeneFL (old) and HG-U95A (new). We aim to understand the characteristics that contribute to the systematic variability of the expression values for experiments performed on different generations of microarrays and extract features that would make them more comparable. Furthermore, we address the issue of variable scanner settings, since a ten-fold decrease in the photo-multiplier tube (PMT) settings of the scanner was another parameter introduced by Affymetrix in parallel to the new chip generation, and interfered with data comparability. More specifically, to expand the dynamic range of the expression assay, a reduction of the system amplification was recommended when using HG-U95A chips.
In order to assess the accuracy and reproducibility of the experiments, as well as the effect of different scanner settings and different chip generations, we performed the following types of comparisons. The labeled cRNA from a single sample was split in two and hybridized to two HG-U95A chips, both of which were scanned at "low gain" photo-multiplier tube (PMT) settings (Exp 1). The labeled cRNA from a sample was split and hybridized to two HG-U95A chips; one was scanned at "low gain" PMT settings, the other at "high gain" settings (Exp 2). The labeled cRNA from a sample was split and hybridized to two HuGeneFL chips; one was scanned at "low gain", the other at "high gain" (Exp 3). The labeled cRNA from one sample was split and hybridized to a HuGeneFL chip and an HG-U95A chip; the HuGeneFL chip was scanned at "high gain" and the HG-U95A chip at "low gain", according to the manufacturer's recommendations (Exp 4).
Table 1. Correlation coefficients for pairwise comparisons of samples within and across chip generations and scanner settings, with the number of probe sets in common for each comparison.
The rest of the analyses focused solely on the measurements made on the seven samples split across the HuGeneFL at "high gain" and the HG-U95A chips at "low gain", as this is the most common comparison that will need to be made (Exp 4). The correlation between the gene expression values was computed for different subsets of probe sets based on i) the number of common probe pairs; ii) the number of 'P' calls assigned to every probe set; iii) the expression level of the genes on HuGeneFL chips.
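The subset analysis described above can be sketched as follows. This is a minimal illustration, assuming that the Pearson correlation coefficient is the measure used and that expression values are pooled within each subset; the record layout, the function names and the toy values are hypothetical and are not taken from the study data.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlation_by_group(records):
    """Group (key, old_value, new_value) records by key and correlate each group.

    The key could be the number of common probe pairs, the number of 'P'
    calls, or an expression-level bin (all hypothetical encodings).
    """
    groups = defaultdict(lambda: ([], []))
    for key, old_value, new_value in records:
        groups[key][0].append(old_value)
        groups[key][1].append(new_value)
    return {
        key: pearson(old, new)
        for key, (old, new) in sorted(groups.items())
        if len(old) >= 2
    }


# Toy records: (common probe pairs, HuGeneFL value, HG-U95A value). Invented numbers.
toy_records = [
    (0, 100, 300), (0, 200, 150), (0, 300, 250),
    (16, 100, 110), (16, 200, 190), (16, 300, 310),
]
by_common_pairs = correlation_by_group(toy_records)
```

The same grouping function can be reused for each of the three criteria by changing only the key stored in each record.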
A chi-square analysis was done for all the probe sets on both generations to determine if the absolute calls made for the HuGeneFL chips were statistically independent from the absolute calls made for the HG-U95A chips. A three-by-three contingency table was constructed based on the absolute calls. The 113,050 pairs of calls (7 samples × 2 chip generations × 8,044 common probe sets) were placed into this contingency table and the chi-square value computed. The computed chi-square value was greater than the chi-square value at 0.01 significance level, giving sufficient confidence to reject the null hypothesis that the calls made for the HuGeneFL chips were independent from the calls made for the HG-U95A generation of Chips.
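As an illustration of the test described above, the following sketch computes the Pearson chi-square statistic for a three-by-three table of absolute calls and compares it with the critical value at the 0.01 level for four degrees of freedom (about 13.277). The counts are invented for demonstration and are not the counts observed in this study; the function name is hypothetical.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a contingency table.

    table[i][j] is the number of call pairs with call i on one chip
    generation and call j on the other.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected == 0:
                raise ValueError("empty row or column in contingency table")
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Rows: HuGeneFL call (P, M, A); columns: HG-U95A call (P, M, A).
# These counts are hypothetical.
toy_calls = [
    [400, 20, 80],
    [15, 10, 25],
    [85, 20, 345],
]
CRITICAL_VALUE_DF4_ALPHA_0_01 = 13.277  # chi-square critical value, 4 d.f.

statistic, dof = chi_square_independence(toy_calls)
reject_independence = statistic > CRITICAL_VALUE_DF4_ALPHA_0_01
```

With counts concentrated on the diagonal, as in this toy table, the statistic far exceeds the critical value and the null hypothesis of independence is rejected.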
All of the analyses described above were repeated using the Affymetrix MAS 5.0 algorithm. The results obtained were highly similar (see Additional Figures 7–10 at http://www.chip.org/~ashish/Reproducibility/). However, when using the MAS 5.0 algorithm, 2,637 of the 8,044 genes (32%) were negatively correlated.
This work is focused on the comparison of HuGeneFL at "high gain" settings and HG-U95A at "low gain" settings. Although this comparison represents probe sets with the worst correlation coefficients, it was specifically chosen because most research labs tend to use HuGeneFL chips with the old scanner ("high gain") settings and HG-U95A chips with the new scanner ("low gain") settings, due to a change in Affymetrix recommendations. This represents the most common problem of comparability across the two generations.
Many of the probe sets in the new generation of Affymetrix chips (HG-U95A) have been significantly modified from the corresponding probe sets in the older generation. These differences in the design of the probe sets are due to several factors, including corrections and additions made to the public databases and new techniques used in probe selection. Our aim was to determine the characteristics of the two chip generations that would account for the systematic variability in the gene expression values across them.
The gene expression values for replicates of a particular tissue sample measured at the same scanner setting and on the same chip generation (HG-U95A) gave a very high correlation of 0.99 (Exp 1, Table 1). This indicates that expression measurements within one generation are highly reproducible. Therefore, any variation in gene expression levels across the two generations should be due to the chip technology itself and the specificity of the probe set sequences.
The reproducibility of HG-U95A chips scanned at "high gain" and "low gain" scanner settings is poorer than the reproducibility of HuGeneFL chips at the two scanner settings. This lower correlation of HG-U95A chips at the two scanner settings could be attributed to the fact that HG-U95A chips have a higher density of probe pairs than HuGeneFL chips, making them more sensitive to background noise. Furthermore, since the HG-U95A chips are more specific with respect to their sequence selection criteria, they would hybridize more efficiently than HuGeneFL chips and so would be more saturated at high scanner settings, giving a lower correlation between "high gain" and "low gain" scanner settings. The experiment involving HG-U95A chips at "high gain" versus HG-U95A chips at "low gain" had a higher correlation than the experiments involving HuGeneFL at "high gain" versus HG-U95A at "low gain" (Table 1). This could be attributed to several factors. The different composition of probe pairs used for some probe sets across generations could result in altered hybridization efficiency, and consequently different expression values for the corresponding genes. The different number of probe pairs per probe set in each generation could also introduce some variance, since it alters the "sample size" on which all calculations are based. The higher density of probe cells in the HG-U95A chips means that probe pairs are closely packed, and perhaps affected by noise and background levels in a different way than on standard-density chips. Moreover, the probe pairs for each probe set are scattered over HG-U95A chips, as opposed to being physically grouped together as on the HuGeneFL chips. This could result in a variable impact of background and noise on the expression value obtained for each probe set.
Every probe set on HuGeneFL has a corresponding probe set on HG-U95A. However, not all the probe pairs within a probe set are common to the corresponding probe sets on both chip types. In this analysis, the correlation between probe sets increases as the number of common probe pairs increases (from zero to 16 probe pairs), with a correlation coefficient of 0.4 if there are no probe pairs in common and over 0.8 if even one probe pair is in common. The sharp increase in the correlation coefficient between probe sets with no common probe pairs and those with one could be explained by the use of poor sequence selection criteria for the specific HuGeneFL probe sets, which later required the complete replacement of the probe set. The chi-square value computed using the absolute calls given to each of the probe sets demonstrates that most probe sets (77%) were assigned the same absolute calls on both generations. Using the reproducibility of absolute calls as a measure of consistency across the two generations indicates that the two generations are consistent overall.
The reproducibility of gene expression measurements across generations was higher for probe sets with higher gene expression measurements. To some extent, high expression levels appear to compensate for low numbers of common probe pairs between chip generations, with the highest correlations reached when increased gene expression was combined with a large number of common probe pairs (Figures 5 and 6). This pattern was also evident when analyzing the number of 'P' calls for every probe set. More specifically, the correlation of absolute calls for every probe set increased with increasing gene expression levels. Although the absolute calls are qualitative indicators of the presence of a transcript in a sample, they are derived from the intensities of individual probe pairs within the probe set. We propose that the increased reproducibility at higher expression levels is due to the decreased significance of the fixed measurement noise effect.
This paper provides basic summary statistics for the comparison between different chip generations, as well as information on the extent to which such comparisons are possible. Being able to perform such comparisons is critical, especially when tissue availability and financial limitations are an issue. Skeletal muscle was used for the purposes of this study, but any tissue can be used for the establishment of benchmarks, depending on the specific interests of individual labs. Further study of more samples and tissue types could establish a widely applicable analytical model to make the most of current datasets and accelerate work with future microarray generations and platforms.
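The proposed explanation can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not part of the study: each measurement equals a true expression value plus independent additive Gaussian noise of fixed standard deviation, and all expression ranges and the noise level are invented for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulated_correlation(low, high, noise_sd, n=500, seed=0):
    """Correlate two noisy measurements of the same simulated transcripts.

    True expression values are drawn uniformly from [low, high]; each of the
    two measurements adds independent noise with the same fixed spread.
    """
    rng = random.Random(seed)
    true_values = [rng.uniform(low, high) for _ in range(n)]
    first = [t + rng.gauss(0, noise_sd) for t in true_values]
    second = [t + rng.gauss(0, noise_sd) for t in true_values]
    return pearson(first, second)


# Same fixed noise level, two hypothetical expression ranges.
r_low = simulated_correlation(20, 200, noise_sd=100)
r_high = simulated_correlation(2000, 20000, noise_sd=100)
```

Under these assumptions the same absolute noise is a large fraction of a weak signal and a small fraction of a strong one, so the simulated correlation is much higher in the high-expression range.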
RNA extraction and hybridization
Total RNA was extracted from normal human skeletal muscle tissue samples and used for cDNA and labeled cRNA synthesis as previously described [14, 15]. The fragmented cRNA, together with control targets recommended by Affymetrix, was hybridized to the GeneChip of choice (HuGeneFL or HG-U95A). HuGeneFL chips contain oligonucleotide sequences representative of 5,600 genes. Each gene is represented by at least one probe set, which in turn consists of approximately 20 probe pairs. Each probe pair consists of two probe cells, the perfect match (PM) and the mismatch (MM). The former is complementary to, and interrogates the expression of, a 25-base-pair region of the gene sequence, while the latter contains a one-base change and is used to control for non-specific hybridization. HG-U95A chips contain probe sets, each consisting of approximately 16 probe pairs, representative of ~12,600 genes. All 5,600 genes measured by the HuGeneFL chips are also measured by the HG-U95A chips; however, in order to improve their sensitivity and specificity, the composition of some of the probe pairs has been changed.
Signal detection and analysis
List of chip generation and scanner settings used in each experiment.
Affymetrix software also assigns every probe set an "absolute call" (Present [P], Absent [A], Marginal [M]), which represents a qualitative indication of whether or not a transcript is detected within a sample. In the MAS 4.0 algorithm these calls are determined using the following metrics: 1) the ratio of the number of positive probe pairs to the number of negative probe pairs (known as the Positive/Negative Ratio), 2) the fraction of positive probe pairs (Positive Fraction), and 3) the average across the probe set of each probe pair's log ratio of positive intensity over negative intensity (Log Average Ratio) (1).
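The three metrics can be sketched as follows. This is an illustration only and not the Affymetrix implementation: the rule used here to label a probe pair as positive or negative (a difference threshold combined with a ratio threshold), the threshold values, the use of PM over MM for the log ratio, and the function name are all assumptions made for the example. The decision rule that turns the metrics into a P, M or A call is not reproduced.

```python
from math import log10


def call_metrics(pairs, diff_threshold=50.0, ratio_threshold=1.5):
    """Compute the three summary metrics for one probe set.

    pairs: list of (pm, mm) intensities, all strictly positive.
    diff_threshold and ratio_threshold are hypothetical values.
    """
    if not pairs:
        raise ValueError("probe set has no probe pairs")
    positive = 0
    negative = 0
    log_ratios = []
    for pm, mm in pairs:
        if pm <= 0 or mm <= 0:
            raise ValueError("intensities must be positive")
        # Assumed rule: PM clearly above MM counts as a positive pair,
        # MM clearly above PM counts as a negative pair.
        if pm - mm >= diff_threshold and pm / mm >= ratio_threshold:
            positive += 1
        elif mm - pm >= diff_threshold and mm / pm >= ratio_threshold:
            negative += 1
        log_ratios.append(log10(pm / mm))
    return {
        # Ratio is reported as infinity when there are no negative pairs.
        "pos_neg_ratio": positive / negative if negative else float("inf"),
        "positive_fraction": positive / len(pairs),
        "log_average_ratio": sum(log_ratios) / len(log_ratios),
    }


# Toy probe set with four probe pairs (invented intensities).
toy_metrics = call_metrics([(300, 100), (250, 100), (100, 300), (120, 110)])
```

In this toy probe set, two pairs count as positive, one as negative, and one as neither, which is why the positive fraction is computed over all probe pairs rather than over the classified ones.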
Affymetrix tables (http://www.affymetrix.com/Auth/support/downloads/comparisons/PN600444HumanFLComp.zip) indicate that 6,623 probe sets from the HuGeneFL chip have been mapped to 7,094 probe sets from the HG-U95A chip, giving a total of 8,044 comparisons between the two generations. Affymetrix also provides a list of the number of probe pairs common to the two generations.
The correlation coefficient was used as a measure of congruency between the probe sets across the two generations of Affymetrix Chips (see Table 3 [Additional File 1]). The correlation for different subsets of probe sets was computed based on certain probe set characteristics, as discussed above. Finally, a chi-square analysis was done to determine whether the absolute calls made for the HuGeneFL chip were different from the absolute calls made for the HG-U95A chip.
The authors wish to acknowledge Peter Park for reviewing the manuscript and for his suggestions. The authors would also like to thank Marco Ramoni for his suggestions, and Mei Han and Travis Burleson for their outstanding technical assistance. This work was made possible in part by funding from NIH grants P01828-01A1, U01HL066582, R01AR44345, the Lawson Wilkins Pediatric Endocrinology Society, the Endocrine Fellows Foundation, and generous support from the Joshua Frase Foundation and the Muscular Dystrophy Association. Microarray experiments were conducted in the Children's Hospital Gene Expression Core Laboratory supported by NIH grant NS40828.
- Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, et al.: Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 1996, 14: 1675–80.
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998, 95: 14863–8. 10.1073/pnas.95.25.14863
- Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, et al.: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999, 286: 531–7. 10.1126/science.286.5439.531
- Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 1995, 270: 467–70.
- Schena M, Shalon D, Heller R, Chai A, Brown PO, Davis RW: Parallel human genome analysis: microarray-based expression monitoring of 1000 genes. Proc Natl Acad Sci U S A 1996, 93: 10614–9. 10.1073/pnas.93.20.10614
- Scherf U, Ross DT, Waltham M, Smith LH, Lee JK, Tanabe L, Kohn KW, Reinhold WC, Myers TG, Andrews DT, et al.: A gene expression database for the molecular pharmacology of cancer. Nat Genet 2000, 24: 236–44. 10.1038/73439
- Bassett DE Jr, Eisen MB, Boguski MS: Gene expression informatics – it's all in your mine. Nat Genet 1999, 21: 51–5. 10.1038/4478
- Mills JC, Gordon JI: A new approach for filtering noise from high-density oligonucleotide microarray datasets. Nucleic Acids Res 2001, 29: E72–2. 10.1093/nar/29.15.e72
- Wang Y, Rea T, Bian J, Gray S, Sun Y: Identification of the genes responsive to etoposide-induced apoptosis: application of DNA chip technology. FEBS Lett 1999, 445: 269–73. 10.1016/S0014-5793(99)00136-2
- Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. J Comput Biol 2000, 7: 819–37. 10.1089/10665270050514954
- Schuchhardt J, Beule D, Malik A, Wolski E, Eickhoff H, Lehrach H, Herzel H: Normalization strategies for cDNA microarrays. Nucleic Acids Res 2000, 28: E47. 10.1093/nar/28.10.e47
- Lee ML, Kuo FC, Whitmore GA, Sklar J: Importance of replication in microarray gene expression studies: statistical methods and evidence from repetitive cDNA hybridizations. Proc Natl Acad Sci U S A 2000, 97: 9834–9. 10.1073/pnas.97.18.9834
- Fambrough D, McClure K, Kazlauskas A, Lander ES: Diverse signaling pathways activated by growth factor receptors induce broadly overlapping, rather than independent, sets of genes. Cell 1999, 97: 727–41.
- Haslett JN, Sanoudou D, Kho AT, Bennett RR, Greenberg SA, Kohane IS, Beggs AH, Kunkel LM: Gene expression comparison of biopsies from Duchenne muscular dystrophy (DMD) and normal skeletal muscle. Proc Natl Acad Sci U S A 2002, 99: 15000–15005. 10.1073/pnas.192571199
- Sanoudou D, Haslett JN, Kho AT, Guo G, Gazda HT, Greenberg SA, Lidov HGW, Kohane IS, Kunkel LM, Beggs AH: Expression profiling reveals altered satellite cell numbers and glycolytic enzyme transcription in nemaline myopathy muscle. Proc Natl Acad Sci U S A 2003, 100: 4666–71. 10.1073/pnas.0330960100
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.