- Research article
- Open Access
Ensemble analyses improve signatures of tumour hypoxia and reveal inter-platform differences
© Fox et al.; licensee BioMed Central Ltd. 2014
- Received: 2 November 2013
- Accepted: 27 May 2014
- Published: 6 June 2014
The reproducibility of transcriptomic biomarkers across datasets remains poor, limiting clinical application. We and others have suggested that this is in-part caused by differential error-structure between datasets, and their incomplete removal by pre-processing algorithms.
To test this hypothesis, we systematically assessed the effects of pre-processing on biomarker classification using 24 different pre-processing methods and 15 distinct signatures of tumour hypoxia in 10 datasets (2,143 patients).
We confirm strong pre-processing effects for all datasets and signatures, and find that these differ between microarray versions. Importantly, exploiting different pre-processing techniques in an ensemble technique improved classification for a majority of signatures.
Assessing biomarkers using an ensemble of pre-processing techniques shows clear value across multiple diseases, datasets and biomarkers. Importantly, ensemble classification improves biomarkers with initially good results but does not result in spuriously improved performance for poor biomarkers. While further research is required, this approach has the potential to become a standard for transcriptomic biomarkers.
- Ensemble Classifier
- Improve Risk Stratification
- High Hazard Ratio
- Ensemble Classification
- Buffa Metagene
Optimizing cancer treatment aims for a cure which kills all cancerous cells in the body with as little detriment to the patient as possible. Cancer is a highly heterogeneous disease with extreme genomic, intra- and inter-tumour heterogeneity; unsurprisingly, patients show a large variety in response to treatment [1–3]. Personalizing treatment is therefore expected to improve treatment response, and thus patient outcome. For example, in some cases surgical resection of the tumour is curative; additional treatment, which has serious side-effects, is unnecessary. In contrast, other patients presenting with similar clinical characteristics (e.g. age, tumour site, stage and histology) could have more aggressive disease, for which adjuvant treatment is required to cure or control disease . Without markers to distinguish these patients, all are given the same treatment, resulting in over-treatment in some patients and under-treatment in others.
To address this urgent clinical need, many groups have sought to create transcriptomic biomarkers using microarray-, PCR- or RNA-Seq-based assessments of mRNA abundances. The resulting multi-gene prognostic biomarkers (sometimes called signatures) can identify patient subgroups that would be particularly likely to derive benefit from more intense therapy [5, 6]. However, there have been numerous challenges in the development of clinically-useful biomarkers; most published biomarkers fail to enter routine clinical practice .
In cancer, where heterogeneity plays such an important role, these challenges are magnified; important tumour biomarkers may be missed when using the common practice of a single tumour biopsy to direct treatment. If faced with uncertainty in biomarkers, these are deemed unsuitable for clinical applications and clinicians prefer to treat without the information and save costs . In order to advance personalized medicine, robust, reproducible biomarkers are required.
We have shown that, at least in lung cancer one of the major sources of biomarker irreproducibility is their sensitivity to relatively subtle changes in pre-processing . We found that analyzing a single biomarker with different pre-processing techniques yielded highly-divergent results, and these could indeed change clinical management for individual treatments . However, we also found tantalizing hints that different ways of analyzing a single biomarker could be integrated: an “ensemble” of pre-processing methodologies out-performed any individual one in a 442-patient cohort of non-small cell lung cancer patients. It appears that each pre-processing technique removes a different aspect of the underlying noise in a dataset, and thus a large enough collection of them provides a more accurate estimate of the underlying biological signal.
To generalize and extend this finding, we explored the impact of data pre-processing on a micro-environmental biomarker problem: the prediction of tumour hypoxia. Tumor hypoxia (poor oxygenation) contributes to both inter- and intra-tumour heterogeneity, and can compromise cancer treatment. It is a result of the uncontrolled growth of tumour cells and the formation of an abnormal tumour vascular network , and is related to chemotherapy and radiotherapy resistance, tumour aggressiveness and metastasis . Hypoxia is associated with poor prognosis , and a marker for hypoxia both identify patients with more aggressive disease and those who might benefit from specific therapeutic options . Many different predictors of hypoxia have been generated [13–20]. To understand pre-processing sensitivity and how ensemble-classification can be best exploited, we evaluate this approach for 15 separate biomarkers in 10 datasets comprising transcriptomic profiles of 2,143 primary, treatment-naïve breast cancers.
The ensemble approach  was applied to two separate groups of primary breast cancer datasets. The first group comprises 8 datasets profiled on the Affymetrix Human Genome U133A microarrays (HG-U133A), with 1,564 total patients [21–28]. The second group is made up of 2 datasets profiled on Affymetrix Human Genome U133 Plus 2.0 GeneChip Array (HG-U133 Plus 2.0), comprising a combined 579 patients [29, 30]. Only datasets reflected similar disease states and profiles were included, for example datasets of metastatic tumours were excluded . All samples included were treatment-naïve.
A series of 15 published hypoxia gene biomarkers were evaluated. The following signatures were included: Buffa metagene , Chi signature , Elvidge up gene set , Hu signature , the 0% and 2% early Seigneuric signatures , Sorensen gene set , Winter metagene  and Starmans clusters 1 to 7 . Descriptions of each biomarker are given in Additional file 1: Table S1 and Additional file 2: Table S2. The signatures evaluated here only contain up-regulated genes for which high gene expression is associated with poor survival.
The first factor creating pipeline variation for the ensemble classifier was the pre-processing algorithm. We used Robust Multi-array Average (RMA) , MicroArray Suite 5.0 (MAS5) , Model-base Expression Index (MBEI) , GeneChip Robust Multi-array Average (GCRMA) . All of which are available in the R statistical environment (R packages: affy v1.36.0, gcrma v2.30.0). RMA and GCRMA return data in log2-transformed space whereas MAS5 and MBEI return data in normal space. It is common practice to log2-transform MAS5 and MBEI pre-processed data, therefore both normal-space and log2-transformed versions of these two methods were included, giving us six pre-processing algorithms.
Number of probe sets after pre-processing
Number of probe sets
HG-U133 Plus 2.0
HG-U133 Plus 2.0
The last aspect of pipeline variation considered was dataset handling. Pre-processing was either done on each dataset individually or on all datasets merged into one. Separate dataset handling involves pre-processing of a single dataset as a unit, independent of others. Each separate dataset went through the pipeline and was classified independent of the other datasets. From all separate datasets, patients classified as having good prognosis were pooled and patients predicted to have poor prognosis were pooled. Alternatively, for merged data handling, the CEL files from all datasets were combined during pre-processing and went through the entire pipeline as one dataset.
Univariate gene analysis
For each gene represented on both array platforms, patients were median dichotomized into low and high risk groups based on the signal-intensity of that gene across all patients for a single pipeline variant. Cox proportional hazards modeling was used to assess whether survival properties were significantly different between the low risk and high risk patients. Statistical significance was assessed using the Wald test (R package: survival v2.36-14) and p-values were false-discovery rate (FDR) adjusted to correct for multiple-testing.
where Y is the number of genes, V is the annotation method, W is the platform, X is the data handling and Z is the pre-processing algorithm, was evaluated to determine if the model was a good fit for the data.
where Y is the number of genes, V is the annotation method, W is the platform, X is the data handling and Z1..Z5 specify the 6 options for the pre-processing algorithm, backwards stepwise refinement was performed using the Akaike information criterion (AIC).
The linear modelling was constructed with alternative annotation as the baseline for V, HG-U133A as the baseline for W, merged data handling as the baseline for X, and GCRMA as the baseline for Z.
Patient risk group classification
where N is the number of genes in a signature and geneexpr,n is the median dichotomized value for the gene expression of the nth gene in the signature compared to the expression levels of that gene from all samples. If the level of the nth gene is above the median for all samples then geneexpr,n is 1, otherwise −1.
After calculating a score for each patient, these scores were used to median dichotomize patients into high and low risk groups for each signature.
The patient risk group classifications across all pre-processing methods were combined to create an ensemble classification by looking for unanimous agreement between all pipeline variants. The high risk classification for the ensemble classification is given to the patients who have been classified as high risk in all 24 pre-processing pipeline variants; similarly for the low risk grouping. Patients with conflicting classifications between pipeline variants were deemed to have unreliable molecular classifications and were thus excluded from ensemble classification as before  as a conservative approach that might be used in the clinic.
Individual classification for subset of patients
For better comparison between the ensemble classification and individual classifications, the number of patients classified based on one pre-processing approach was reduced to match the number of patients classified in the ensemble classifier. Instead of median dichotomization, the patients were ordered by their multi-gene signature score. Then the number of patients that the ensemble had classified as high risk was selected from the top of the order as high risk patients and this was equivalently done for the low risk classifications.
Kaplan-Meier survival curves and unadjusted Cox proportional hazard ratio modeling (R survival package, v2.36-14) were used to assess survival differences between the low risk and high risk groups. The Wald test was used to determine whether the hazard ratio was statistically different from unity. In all analyses, the superior classification was defined as the classification with the higher Cox proportional hazard ratio.
Permutation sampling for variable number of pipelines in the ensemble
In these analyses, the ensemble classification is generally a combination of all 24 pipeline variants. However, we also varied the number of pipeline variants being combined. To represent a combination of n pipeline variants, we randomly sampled n pipelines (without replacement) and created an ensemble classifier as outlined above. This process was repeated with replacement 2000 times for each value of n ranging from 1 to 24.
Student's t-test methods comparison
The pool of all 24 individual methods across the 15 signatures was split based on a single aspect of the pipeline (dataset handling, gene annotations or pre-processing algorithms). We compared pipelines only differing on a single aspect using the paired t-test to assess statistical differences between pipelines.
Permutation sampling for variable number of pipelines in the ensemble when subgrouping for methods comparison
As part of the method comparison, the pipelines where subgrouped based on a single aspect of the pipeline and then within the subgroups ensembles of a varying number of the pipelines were constructed. To represent a combination of n pipeline variants, we sampled n pipelines (without replacement) and created an ensemble classifier. For each value of n (from 1 to 4 for the pre-processing algorithm or 1 to 12 if subgrouping based on gene annotation or data handling), all possible combinations containing n unique pipeline variants were created.
All plotting was performed in the R statistical environment (v2.15.2) using the lattice (v0.20-10), latticeExtra (v0.6-24), RColorBrewer (v1.0-5) and cluster (v1.14.3) packages.
Ensemble classification approach
Each dataset was pre-processed using 24 different pipeline variants. Each biomarker was then applied separately for each pipeline variant, producing an ensemble of 24 predictions for each patient and biomarker. These were analyzed for consistency and combined to form a single ensemble classification. Figure 1 outlines the approach used. We separated our datasets according to the microarray platform used, and tested the two most widely-used platforms at the time of writing according to depositions in the Gene Expression Omnibus: HG-U133A and HG-U133 Plus 2.0. Since both platforms are Affymetrix arrays and therefore have the same set of potential normalization methods, we can perform inter-platform analysis independent of pre-processing.
Univariate gene analysis
Significant coefficients of linear model for prognostics based on individual gene
Pr (> | t | )
Platform, HGU133 Plus 2.0: Handling, separate
Platform, HGU133 Plus 2.0: Algorithm, log2 MAS5
Platform, HGU133 Plus 2.0: Algorithm, MAS5
Handling, separate: Algorithm, log2 MAS5
Handling, separate: Algorithm, MAS5
Handling, separate: Algorithm, RMA
We next focused on multi-gene classifiers, seeking to determine if our single-gene results could be generalized. We compared the hazard ratios from Cox modeling of the ensemble and the 24 individual classifications for 15 published hypoxia signatures. For all multi-gene signatures, superior classification was defined as the classification with a higher hazard ratio. As seen with the single gene classifiers, variation was observed between classifications from the different pipelines and there was not one single variant which consistently resulted in larger risk stratification than the others. Further this analysis identified microarray platform as another possible source for variation. One pipeline variant (separate data handling, MAS5 algorithm and default annotation) showed the lowest risk stratification of the 24 pipelines on one platform (HG-U133A) and the largest of the 24 pipelines on the other platform (HG-U133 Plus 2.0) (Figure 3). As shown in Figure 3, ensemble classification performed better than individual pipelines and improved signature performance for both microarray platforms.
The Buffa metagene and the Winter metagene showed similar results across pipeline variants, but many of the signatures performed very differently depending on the dataset platform (Figure 4C, Additional file 4: Figure S2, Additional file 5: Table S3). Overall signatures showed better risk-stratification on HG-U133 Plus 2.0 arrays (p = 2.75 × 10−48, paired t-test), although this was signature-specific. Some signatures (Hu signature, Elvidge signature and Starmans cluster 3) showed consistently better results on the HG-U133 Plus 2.0 dataset compared to the HG-U133A dataset. Conversely, Starmans cluster 4 and cluster 5 performed better in the HG-U133A datasets.
The Buffa and the Winter metagene were the only signatures which were statistically significant across all pipelines tested. Hu and Sorensen, additionally, were other signatures with statistically significant ensemble classifications for both datasets. In contrast, Starmans clusters 1, 2, 3 and Seigneuric 0% early signatures did not perform well in either dataset; none of their ensemble classifications were statistically significant. In general, if a signature performed poorly for single pipeline variants, using the ensemble classification did not improve it. This was demonstrated by the correlation between the hazard ratios for the ensemble classification and the maximum hazard ratios for classification from the individual pipeline variants (R = 0.87 for HG-U133A and R = 0.88 for HG-U133 Plus 2.0).
Since previous analyses involved comparing unequal numbers of patients classified, we also compared ensemble classification to classification for the individual pre-processing methods. In this way, we match patient numbers between the two conditions, removing this potential confounding variable. In general, this approach yielded fewer statistically significant results (Additional file 6: Figure S3), although both the range and the variance of hazard ratios increased for every signature using this classification algorithm. However the comparison between of ensemble classifications and individual classifications shows that patient-number differences are not the origin of the superior performance of ensemble classification. For 13/30 signatures, the ensemble classification was superior to all classifications from the individual pre-processing pipelines and in 26/30 signatures the ensemble exceeded the median classification.
What is the optimal ensemble size?
Having shown that the ensemble-approach improved classification for most biomarkers and datasets, we explored the limits of its performance. We wondered if 24 distinct pipelines were always necessary, and therefore evaluated the number of pipeline variants required for optimal performance (maximum risk stratification, as measured by the hazard ratio) of the ensemble classifier. If creating an ensemble of four pipeline variants is equally successful to one from eight variants, then it is not beneficial to introduce the complexity and computational-costs of pre-processing with four extra pipelines.
Focusing on signatures with a significant 24-pipeline ensemble, different combinations of pipelines, ranging from combinations of only 2 to all 24, were evaluated. These analyses indicated that in general increasing the number of pipeline variants resulted in an increase in absolute effect size which started to plateau as the number of methods in the ensemble increased (Figure 5C). In parallel, the percentage of patients classified with the ensemble method decreased and plateaued (Figure 5D). Most signatures shared the same shape but with different rates of hazard ratio increase. The Sorensen signature on the HG-U133A dataset plateaued at about four pipeline variants. Therefore, in this case, randomly choosing four pipeline variants to combine provided roughly the same risk stratification as using all 24 pipelines. Conversely, for the Winter metagene signature in either dataset, the mean hazard ratio continued to increase all the way up to 24 pipelines, though the curve was steeper at the beginning then in the end. Although the hazard ratio stopped increasing in some cases, stability continued to increase as the number of methods in the ensemble increased. This is demonstrated in Additional file 8: Figure S5 by the tightening of the hazard ratio range as the number of pipelines is increased.
Considering the Winter metagene signature in HG-U133A data, the ensembles created from nine or more of the 24 pipelines outperformed all single pipeline classifiers (Additional file 8: Figure S5 and Additional file 9: Table S4). Many ensembles did not require all 24 variants to be an improvement over all non-ensemble methods (Additional file 8: Figure S5, Additional file 9: Table S4, Additional file 10: Table S5). Even if the ensemble of 24 variants was not an improvement over non-ensemble methods, there may still have been an ensemble of a subset of the variants which was superior to the non-ensemble methods (Additional file 8: Figure S5, Additional file 9: Table S4, Additional file 10: Table S5). These data provide a compelling rationale to consider and evaluate ensemble pipelines for all microarray-based biomarkers.
After showing that ensembles are beneficial, we wanted to look at whether we can determine the combination of pipelines that lead to higher hazard ratios in order to add the most benefit for each additional pre-processing pipeline. There is a clear relationship between the number of patients classified in the ensemble and the gain in hazard ratio, meaning that the ensemble is choosing to exclude the right subset of patients (Additional file 11: Figure S6A). Methods that produce less-correlated classifications gain more from the ensemble classification. However, if we look at which methods are diverse by a different metric such as the profiles of prognostic ability of each gene as a single gene classifier, there is only a slight but not obvious increase in hazard ratio from using more diverse pipelines in the ensemble classification (Additional file 11: Figure S6B).
To help direct pipeline choices, we sought to address whether certain aspects of the pipeline resulted in better or worse performance. For each aspect of the pipeline (dataset handling, gene annotations, and pre-processing algorithms), the hazard ratios were grouped per variant of that aspect and compared. This was done for both platforms separately and combined.
Ensemble of signatures
To further filter out unreliable classifications we investigated combining the classifications from two signatures. The Buffa metagene and the Winter metagene performed best across our analyses. These two signatures share 24 genes (out of 52 for Buffa metagene, out of 102 for Winter metagene). Expansion of the ensemble classification to only classify patients that both signatures agreed on (intersect of patients classified by both signatures) improved risk stratification (the hazard ratio) compared to ensemble evaluations of both signatures (Additional file 12: Figure S7).
To complete the analysis and expand the number of patients classified, we also pooled the unanimous classifications (the union of both signatures, excluding patients that were classified in contrasting risk groups). This failed to improve risk stratification compared to ensemble evaluations of both signatures; however, prognostic performance was improved over all the signatures’ individual pre-processing methods. Further, many more patients were classified than with the basic ensemble technique (Additional file 12: Figure S7), suggesting that ensembles of signatures could be used to further remove noise or to increase the number of patients given confident molecular classifications.
The purpose of pre-processing is to remove “noise” from the data. However, since no method is perfect, each pre-processing pipeline removes a somewhat different aspect of the “noise”. Indeed, groups around the world have focused on identifying the “optimal” pre-processing technique for different types of data [37, 38]. The principle of ensemble classification is that by combining pre-processing approaches we can select the parts of the data which are reliable across the multiple approaches. The central tendency of this pool of methods is thus predicted to lie closer to the “true” value, and thereby to provide a better biomarker.
Although different pre-processing methods may cause some variation in the analysis, pre-processing is expected to have a minor effect on the core experimental results and conclusions [38, 39]. Our previous work has indicated this is not the case and pre-processing caused major outcome differences in non-small cell lung cancer . Here we systematically extend and deepen these analyses to explore the variation caused by algorithmic diversity in pre-processing.
At the single gene level substantial differences in prognostic power were seen in univariate analysis. Therefore pre-processing is part of the reason different studies identify different biomarker genes. Many authors will use public data to show that a given gene is prognostic; however, essentially all genes (~75%) can meet that criterion, depending on which platform and pre-processing technique is used. Single genes did not appear to behave the same across pipelines demonstrating variation in classification results are expected and signatures are dependent on the pre-processing platform they were discovered on. The fluctuation in prognostic ability for each gene partially explains why we get different results for multi-gene signatures and why there is such difficulty validating biomarkers between research groups .
In combination with data showing a plurality of gene sets are associated with outcome in breast cancer and other cancers [41–43], the variation suggests that mining of public data for prognostic significance is very prone to over-fitting and multiple-testing concerns. Therefore robust, permutation-based approaches need to be developed .
In non-small cell lung cancer, Starmans et al. showed one example of a permutation-based approach where an ensemble biomarker classifier improved survival separation between low risk and high risk patient groups. Here we extended this finding showing that the method replicates on two microarray platforms representing 10 separate datasets in breast cancer for a series of 15 biomarkers. Across both platforms, there was not a single pipeline that unfailingly outperformed all other pipelines; therefore, the ensemble classification provides a generalized approach to improve biomarkers, both in terms of performance and stability, without determining the actual optimal pre-processing pipeline.
Furthermore, in many of the cases, the ensemble classifications outperformed all single pre-processing methods. The ultimate value of the ensemble classifiers as a concept was demonstrated with the Buffa metagene and Winter metagene. For these two signatures, any ensemble classifier comprising at least nine pipelines on HG-U133A or 20 pipelines on HG-U133 Plus 2.0 arrays generated superior risk stratifications compared to all the classifiers that used only a single pre-processing pipeline. Consequentially, ensemble classifiers are most definitely beneficial and should be used.
The ensemble approach did not improve all biomarkers. Biomarkers with generally bad risk stratifications across pre-processing pipelines still showed poor performance when combined in an ensemble. The ensemble approach magnified the separation of risk groups rather than corrected for a poor initial biomarker. Therefore the ensemble approach can also be used as a metric to assess the quality of biomarkers, distinguishing between poor and good signatures. By the statistical significance and consistency in risk stratification improvement across the datasets, the Buffa and Winter metagenes are shown to be strong, consistent signatures. By the same metric, Seigneuric 0% early and Starmans cluster 1, 2, 3 appear to be poor signatures validating previous findings where these signatures did not show prognostic power .
A disadvantage of the ensemble classification is that a fraction of patients are not classified. Only patients with robust risk classification across pipelines in the ensemble are assigned to risk groups. Here, using only the statistically significant ensembles 16% to 80% of the patients were not classified. The signatures showing significance on both platforms tend to have a higher percentage of patients classified (36% to 80%) than the signature significant on only on platform (16% to 68%) and the signatures not significant on either platform (15% to 46%). Nevertheless a patient classified as unreliable with one signature may be robustly classified using a different signature. This was shown by intersecting the Buffa and Winter metagenes, which resulted in improved prognostic power compared to the single pre-processing pipelines and classified more patients than with the two ensembles individually. One might consider taking this into account in biomarker-developement by attempting to construct ensembles that minimize the fraction of unclassified patients in the training dataset, although unclassified patients could resort to standard clinical care.
An important note about our approach to using the ensemble classification is the diversity of the pre-processing methods. Our choice of signature evaluation meant that a log2-transform on the data does not create different classifications. For example, the multiple methods MAS5 and log2 MAS5 pre-processing are actually only one pipeline variant and aren't filtering out additional unreliable patients. Here, the array of 24 pre-processing methods in reality only gives 16 unique classifications so there is less diversity than the numbers indicate. However, for other signatures, such as the risk-score and clustering methods evaluated previously , up to 24 unique classifications are possible.
In multiple signatures, the ensemble of all 24 pre-processing pipelines is an improvement but not the optimal ensemble classification (Additional file 8: Figure S5). So future analysis to refine the process and finesse which pipelines to use and how they should be combined would be advantageous. An important future direction is exploring how ensemble methods can be improved by incorporating greater algorithm diversity. Continual addition of diverse methods may increase the optimal ensemble classification or the optimal ensemble may be a certain combination of pipelines and new additions may not lead to an increase. In some cases, such as the Hu signature on HG-U133 Plus 2.0 addition of another separate data handling pipeline is not likely to increase the risk stratification but adding a merged data handling pipeline would be advantageous (Figure 6).
We systematically show that differences in pre-processing create differences when using biomarkers. This effect of pre-processing is important for the research community to recognize and consider, as accurately accounting for it will advance biomarker discovery, validation and ultimately clinical application. We found that the Buffa metagene is the most consistent biomarker and therefore most clinical useful signature evaluated and we show that application of ensemble classification technique is beneficial for improving risk stratification both in terms of effect size and stability of biomarkers.
The authors thank Nathalie Moon for advice on statistical analysis and all members of the Boutros lab for helpful suggestions.
This study was conducted with the support of the Ontario Institute for Cancer Research to PCB through funding provided by the government of Ontario. PCB was supported by a Terry Fox Research Institute New Investigator Award. This work was supported by the Center for Translational Molecular Medicine (http://www.ctmm.nl) (AIRFORCE Project Ref. 030–103), EU IMI program (QuIC-ConCePT), EU 7th framework program (Metoxia and Artforce program) and Dutch Cancer Society (KWF UM 2011–5020, KWF UM 2009–4454) through funding to MHWS and PL.
- Polyak K: Heterogeneity in breast cancer. J Clin Invest. 2011, 121: 3786-3788. 10.1172/JCI60534.View ArticlePubMed CentralPubMedGoogle Scholar
- Curtis C, Shah SP, Chin S-F, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, Gräf S, Ha G, Haffari G, Bashashati A, Russell R, McKinney S, Langerød A, Green A, Provenzano E, Wishart G, Pinder S, Watson P, Markowetz F, Murphy L, Ellis I, Purushotham A, Børresen-Dale A-L, Brenton JD, Tavaré S, METABRIC Group, et al: The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012, 486: 346-352.PubMed CentralPubMedGoogle Scholar
- Russnes HG, Navin N, Hicks J, Borresen-Dale A-L: Insight into the heterogeneity of breast cancer through next-generation sequencing. J Clin Invest. 2011, 121: 3810-3818. 10.1172/JCI57088.View ArticlePubMed CentralPubMedGoogle Scholar
- Van ’t Veer LJ, Bernards R: Enabling personalized cancer medicine through analysis of gene-expression patterns. Nature. 2008, 452: 564-570. 10.1038/nature06915.View ArticleGoogle Scholar
- Lambin P, van Stiphout RGPM, Starmans MHW, Rios-Velazquez E, Nalbantov G, Aerts HJWL, Roelofs E, van Elmpt W, Boutros PC, Granone P, Valentini V, Begg AC, De Ruysscher D, Dekker A: Predicting outcomes in radiation oncology–multifactorial decision support systems. Nat Rev Clin Oncol. 2013, 10: 27-40.View ArticlePubMed CentralPubMedGoogle Scholar
- Abba M, Lacunza E, Butti M, Aldaz C: Breast cancer biomarker discovery in the functional Genomic Age: a systematic review of 42 gene expression signatures. Biomark Insights. 2010, 5: 103-118.View ArticlePubMed CentralPubMedGoogle Scholar
- Kern SE: Why your New cancer biomarker May never work: recurrent patterns and remarkable diversity in biomarker failures. Cancer Res. 2012, 72: 6097-6101. 10.1158/0008-5472.CAN-12-3232.View ArticlePubMed CentralPubMedGoogle Scholar
- Diamandis EP: The failure of protein cancer biomarkers to reach the clinic: why, and what can be done to address the problem?. BMC Med. 2012, 10: 87-10.1186/1741-7015-10-87.View ArticlePubMed CentralPubMedGoogle Scholar
- Starmans MH, Pintilie M, John T, Der SD, Shepherd FA, Jurisica I, Lambin P, Tsao M-S, Boutros PC: Exploiting the noise: improving biomarkers with ensembles of data analysis methodologies. Genome Med. 2012, 4: 84-10.1186/gm385.View ArticlePubMed CentralPubMedGoogle Scholar
- Goel S, Duda DG, Xu L, Munn LL, Boucher Y, Fukumura D, Jain RK: Normalization of the vasculature for treatment of cancer and other diseases. Physiol Rev. 2011, 91: 1071-1121. 10.1152/physrev.00038.2010.View ArticlePubMed CentralPubMedGoogle Scholar
- Brown JM, Wilson WR: Exploiting tumour hypoxia in cancer treatment. Nat Rev Cancer. 2004, 4: 437-447. 10.1038/nrc1367.View ArticlePubMedGoogle Scholar
- Wouters BG, van den Beucken T, Magagnin MG, Lambin P, Koumenis C: Targeting hypoxia tolerance in cancer. Drug Resist Updat. 2004, 7: 25-40. 10.1016/j.drup.2003.12.004.View ArticlePubMedGoogle Scholar
- Buffa FM, Harris AL, West CM, Miller CJ: Large meta-analysis of multiple cancers reveals a common, compact and highly prognostic hypoxia metagene. Br J Cancer. 2010, 102: 428-435. 10.1038/sj.bjc.6605450.View ArticlePubMed CentralPubMedGoogle Scholar
- Chi J-T, Wang Z, Nuyten DSA, Rodriguez EH, Schaner ME, Salim A, Wang Y, Kristensen GB, Helland Å, Børresen-Dale A-L, Giaccia A, Longaker MT, Hastie T, Yang GP, van de Vijver MJ, Brown PO: Gene expression programs in response to hypoxia: cell type specificity and prognostic significance in human cancers. PLoS Med. 2006, 3: e47-10.1371/journal.pmed.0030047.View ArticlePubMed CentralPubMedGoogle Scholar
- Elvidge GP, Glenny L, Appelhoff RJ, Ratcliffe PJ, Ragoussis J, Gleadle JM: Concordant regulation of gene expression by hypoxia and 2-oxoglutarate-dependent dioxygenase inhibition: the role of HIF-1alpha, HIF-2alpha, and other pathways. J Biol Chem. 2006, 281: 15215-15226. 10.1074/jbc.M511408200.View ArticlePubMedGoogle Scholar
- Hu Z, Fan C, Livasy C, He X, Oh DS, Ewend MG, Carey LA, Subramanian S, West R, Ikpatt F, Olopade OI, van de Rijn M, Perou CM: A compact VEGF signature associated with distant metastases and poor outcomes. BMC Med. 2009, 7: 9-10.1186/1741-7015-7-9.View ArticlePubMed CentralPubMedGoogle Scholar
- Seigneuric R, Starmans MHW, Fung G, Krishnapuram B, Nuyten DSA, van Erk A, Magagnin MG, Rouschop KM, Krishnan S, Rao RB, Evelo CTA, Begg AC, Wouters BG, Lambin P: Impact of supervised gene signatures of early hypoxia on patient survival. Radiother Oncol. 2007, 83: 374-382. 10.1016/j.radonc.2007.05.002.View ArticlePubMedGoogle Scholar
- Sørensen BS, Toustrup K, Horsman MR, Overgaard J, Alsner J: Identifying pH independent hypoxia induced genes in human squamous cell carcinomas in vitro. Acta Oncol. 2010, 49: 895-905. 10.3109/02841861003614343.View ArticlePubMedGoogle Scholar
- Winter SC, Buffa FM, Silva P, Miller C, Valentine HR, Turley H, Shah KA, Cox GJ, Corbridge RJ, Homer JJ, Musgrove B, Slevin N, Sloan P, Price P, West CML, Harris AL: Relation of a hypoxia metagene derived from head and neck cancer to prognosis of multiple cancers. Cancer Res. 2007, 67: 3441-3449. 10.1158/0008-5472.CAN-06-3322.View ArticlePubMedGoogle Scholar
- Starmans MHW, Chu KC, Haider S, Nguyen F, Seigneuric R, Magagnin MG, Koritzinsky M, Kasprzyk A, Boutros PC, Wouters BG, Lambin P: The prognostic value of temporal in vitro and in vivo derived hypoxia gene-expression signatures in breast cancer. Radiother Oncol. 2012, 102: 436-443. 10.1016/j.radonc.2012.02.002.View ArticlePubMedGoogle Scholar
- Miller LD, Smeds J, George J, Vega VB, Vergara L, Ploner A, Pawitan Y, Hall P, Klaar S, Liu ET, Bergh J: An expression signature for p53 status in human breast cancer predicts mutation status, transcriptional effects, and patient survival. Proc Natl Acad Sci U S A. 2005, 102: 13550-13555. 10.1073/pnas.0506230102.View ArticlePubMed CentralPubMedGoogle Scholar
- Schmidt M, Petry IB, Böhm D, Lebrecht A, von Törne C, Gebhard S, Gerhold-Ay A, Cotarelo C, Battista M, Schormann W, Freis E, Selinski S, Ickstadt K, Rahnenführer J, Sebastian M, Schuler M, Koelbl H, Gehrmann M, Hengstler JG: Ep-CAM RNA expression predicts metastasis-free survival in three cohorts of untreated node-negative breast cancer. Breast Cancer Res Treat. 2011, 125: 637-646. 10.1007/s10549-010-0856-5.View ArticlePubMedGoogle Scholar
- Wang Y, Klijn JGM, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EMJJ, Atkins D, Foekens JA: Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005, 365: 671-679. 10.1016/S0140-6736(05)17947-1.View ArticlePubMedGoogle Scholar
- Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, Desmedt C, Larsimont D, Cardoso F, Peterse H, Nuyten D, Buyse M, de Vijver MJV, Bergh J, Piccart M, Delorenzi M: Gene expression profiling in breast cancer: understanding the molecular basis of Histologic grade to improve prognosis. JNCI J Natl Cancer Inst. 2006, 98: 262-272. 10.1093/jnci/djj052.View ArticleGoogle Scholar
- Pawitan Y, Bjöhle J, Amler L, Borg A-L, Egyhazi S, Hall P, Han X, Holmberg L, Huang F, Klaar S, Liu ET, Miller L, Nordgren H, Ploner A, Sandelin K, Shaw PM, Smeds J, Skoog L, Wedrén S, Bergh J: Gene expression profiling spares early breast cancer patients from adjuvant therapy: derived and validated in two population-based cohorts. Breast Cancer Res. 2005, 7: R953-10.1186/bcr1325.View ArticlePubMed CentralPubMedGoogle Scholar
- Symmans WF, Hatzis C, Sotiriou C, Andre F, Peintinger F, Regitnig P, Daxenbichler G, Desmedt C, Domont J, Marth C, Delaloge S, Bauernhofer T, Valero V, Booser DJ, Hortobagyi GN, Pusztai L: Genomic index of sensitivity to endocrine therapy for breast cancer. J Clin Oncol. 2010, 28: 4111-4119. 10.1200/JCO.2010.28.4273.View ArticlePubMed CentralPubMedGoogle Scholar
- Desmedt C, Piette F, Loi S, Wang Y, Lallemand F, Haibe-Kains B, Viale G, Delorenzi M, Zhang Y, d’ Assignies MS, Bergh J, Lidereau R, Ellis P, Harris AL, Klijn JGM, Foekens JA, Cardoso F, Piccart MJ, Buyse M, Sotiriou C: Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res. 2007, 13: 3207-3214. 10.1158/1078-0432.CCR-06-2765.View ArticlePubMedGoogle Scholar
- Zhang Y, Sieuwerts AM, McGreevy M, Casey G, Cufer T, Paradiso A, Harbeck N, Span PN, Hicks DG, Crowe J, Tubbs RR, Budd GT, Lyons J, Sweep FCGJ, Schmitt M, Schittulli F, Golouh R, Talantov D, Wang Y, Foekens JA: The 76-gene signature defines high-risk patients that benefit from adjuvant tamoxifen therapy. Breast Cancer Res Treat. 2009, 116: 303-309. 10.1007/s10549-008-0183-2.View ArticlePubMedGoogle Scholar
- Sabatier R, Finetti P, Cervera N, Lambaudie E, Esterni B, Mamessier E, Tallet A, Chabannon C, Extra J-M, Jacquemier J, Viens P, Birnbaum D, Bertucci F: A gene expression signature identifies two prognostic subgroups of basal breast cancer. Breast Cancer Res Treat. 2011, 126: 407-420. 10.1007/s10549-010-0897-9.View ArticlePubMedGoogle Scholar
- Kao K-J, Chang K-M, Hsu H-C, Huang AT: Correlation of microarray-based breast cancer molecular subtypes and clinical outcomes: implications for treatment optimization. BMC Cancer. 2011, 11: 143-10.1186/1471-2407-11-143.View ArticlePubMed CentralPubMedGoogle Scholar
- Bos PD, Zhang XH-F, Nadal C, Shu W, Gomis RR, Nguyen DX, Minn AJ, Van de Vijver M, Gerald W, Foekens JA, Massague J: Genes that mediate breast cancer metastasis to the brain. Nature. 2009, 459: 1005-1009. 10.1038/nature08021.View ArticlePubMed CentralPubMedGoogle Scholar
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003, 4: 249-264. 10.1093/biostatistics/4.2.249.View ArticlePubMedGoogle Scholar
- Hubbell E, Liu W-M, Mei R: Robust estimators for expression analysis. Bioinformatics. 2002, 18: 1585-1592. 10.1093/bioinformatics/18.12.1585.View ArticlePubMedGoogle Scholar
- Li C, Hung Wong W: Model-based analysis of oligonucleotide arrays: model validation, design issues and standard error application. Genome Biol. 2001, 2: research0032.1-research0032.11.Google Scholar
- Wu Z, Irizarry RA: Stochastic models inspired by hybridization theory for short oligonucleotide arrays. J Comput Biol. 2005, 12: 882-893. 10.1089/cmb.2005.12.882.View ArticlePubMedGoogle Scholar
- Dai M, Wang P, Boyd AD, Kostov G, Athey B, Jones EG, Bunney WE, Myers RM, Speed TP, Akil H, Watson SJ, Meng F: Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res. 2005, 33: e175-10.1093/nar/gni179.View ArticlePubMed CentralPubMedGoogle Scholar
- Boutros PC, Ewing AD, Ellrott K, Norman TC, Dang KK, Hu Y, Kellen MR, Suver C, Bare JC, Stein LD, Spellman PT, Stolovitzky G, Friend SH, Margolin AA, Stuart JM: Global optimization of somatic variant identification in cancer genomes with a global community challenge. Nat Genet. 2014, 46: 318-319. 10.1038/ng.2932.View ArticlePubMed CentralPubMedGoogle Scholar
- Shi L, Campbell G, Jones WD, Campagne F, Wen Z, Walker SJ, Su Z, Chu T-M, Goodsaid FM, Pusztai L, Shaughnessy JD, Oberthuer A, Thomas RS, Paules RS, Fielden M, Barlogie B, Chen W, Du P, Fischer M, Furlanello C, Gallas BD, Ge X, Megherbi DB, Symmans WF, Wang MD, Zhang J, Bitter H, Brors B, Bushel PR, Bylesjo M, et al: The MicroArray quality control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol. 2010, 28: 827-838. 10.1038/nbt.1665.View ArticlePubMedGoogle Scholar
- Verhaak RGW, Staal FJT, Valk PJM, Lowenberg B, Reinders MJT, de Ridder D: The effect of oligonucleotide microarray data pre-processing on the analysis of patient-cohort studies. BMC Bioinforma. 2006, 7: 105-10.1186/1471-2105-7-105.View ArticleGoogle Scholar
- Bolstad BM, Irizarry RA, Åstrand M, Speed TP: A comparison of normalization methods for high density oligonucleotide array data based on variance and bias. Bioinformatics. 2003, 19: 185-193. 10.1093/bioinformatics/19.2.185.View ArticlePubMedGoogle Scholar
- Subramanian J, Simon R: Gene expression-based prognostic signatures in lung cancer: ready for clinical use?. J Natl Cancer Inst. 2010, 102: 464-474. 10.1093/jnci/djq025.View ArticlePubMed CentralPubMedGoogle Scholar
- Boutros PC, Lau SK, Pintilie M, Liu N, Shepherd FA, Der SD, Tsao M-S, Penn LZ, Jurisica I: Prognostic gene signatures for non-small-cell lung cancer. PNAS. 2009, 106: 2824-2828. 10.1073/pnas.0809444106.View ArticlePubMed CentralPubMedGoogle Scholar
- Starmans MHW, Fung G, Steck H, Wouters BG, Lambin P: A simple but highly effective approach to evaluate the prognostic performance of gene expression signatures. PLoS ONE. 2011, 6: e28320-10.1371/journal.pone.0028320.View ArticlePubMed CentralPubMedGoogle Scholar
- Venet D, Dumont JE, Detours V: Most random gene expression signatures are significantly associated with breast cancer outcome. PLoS Comput Biol. 2011, 7: e1002240-10.1371/journal.pcbi.1002240.View ArticlePubMed CentralPubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.