Stromal microenvironment processes unveiled by biological component analysis of gene expression in xenograft tumor models
- Xinan Yang†1,
- Younghee Lee†1,
- Yong Huang1,
- James L Chen2,
- Rosie H Xing3, 4 and
- Yves A Lussier1, 4, 5
© Yang et al; licensee BioMed Central Ltd. 2010
- Published: 28 October 2010
Mouse xenograft models, in which human cancer cells are implanted in immune-suppressed mice, have been popular for studying the mechanisms of novel therapeutic targets, tumor progression and metastasis. We hypothesized that we could exploit the interspecies genetic differences in these experiments. Our purpose is to elucidate stromal microenvironment signals from probes on human arrays unintentionally cross-hybridizing with mouse homologous genes in xenograft tumor models.
By identifying cross-species hybridizing probes from sequence alignment and a cross-species hybridization experiment for the human whole-genome arrays, deregulated stromal genes can be identified and their biological significance predicted from enrichment studies. Comparing these results with those found by laser capture microdissection of stromal cells from tumor specimens resulted in the discovery of significantly enriched stromal biological processes.
Using this method, in addition to their primary endpoints, researchers can leverage xenograft experiments to better characterize the tumor microenvironment without additional costs. The Xhyb probes and R script are available at http://www.lussierlab.org/publications/Stroma
- Gene Ontology
- Xenograft Model
- Laser Capture Microdissection
- Mouse Homolog
- Stromal Microenvironment
Characterizing the tumor microenvironment is essential because it relates to clinical prognosis, metastatic potential, and treatment-related outcomes. Innovations in cell labeling techniques and small-animal in vivo imaging have enabled investigators to phenotype the microenvironment and cancer cell interactions. However, genome-wide expression analyses of the tumor and its microenvironment have not kept pace. Separating cancer cells from stromal tissues in whole-tissue expression studies requires time-consuming and costly laser microdissection of the tumor before RNA extraction [2, 3]. Computational methods have previously been proposed as a means of subtracting out the stromal signal. However, this approach simply eliminates genes with conflated expression levels and does little to elucidate the tumor microenvironment.
In this paper, we describe a computational method of exploiting interspecies differences in the mouse tumor xenograft model, in which human cancer cells are grown in immune-suppressed mice. This model is popular for preclinical drug trials and for studying the mechanisms of tumor progression and the development of metastasis. Mouse xenografts consist of both human tumor cells and mouse stromal tissues, yet differential gene expression derived from human expression arrays is generally implicitly attributed to the cancer cells [6, 7]. However, cross-hybridization of human chip probes with homologous mouse genes may result in a mixed gene expression signal in which deregulated mouse stromal genes in xenograft tumor models are measured jointly with the human cancer genes. To date, analyses of functional categories (Gene Ontology terms) enriched in the gene expression of whole tumor xenografts have neglected the impact of cross-species hybridization.
Tumor/stromal interactions have been examined by modeling the diffusion of nutrients in the stroma, modeling the invasion of single cells into the stroma, and characterizing the function of the stromal elements [8–10]. To our knowledge, no studies have specifically focused on identifying mouse stromal signals in human gene arrays. A few groups have, however, identified cross-species signals in multiple-species chips [11, 12]. We and others have designed pan-viral arrays comprising species probes for over 1000 species on one chip for diagnostic purposes [12, 13]. In collaboration with others, we have demonstrated that statistically based gene enrichment approaches applied to these pan-viral arrays were effective in increasing the species-specific signal and consequently the diagnostic accuracy of these pan-microbial arrays. A xenograft tumor profiled on human arrays presents a multiple-species problem. Human probes were not designed to be species specific; thus, cross-hybridization is at least as prevalent in human expression arrays as in pan-microbial arrays.
We hypothesized that we could identify stromal microenvironment signals (Gene Ontology (GO) biological processes) from deregulated genes using the human array probes that unintentionally cross-hybridize with the mouse homolog. This assumption is based on the observations that 1) the majority of the cross-hybridizing probes designed for human genes also target their mouse homolog, and 2) the majority of Gene Ontology annotations for human and mouse homologs are identical (Additional file 1: Suppl. Methods). Thus, in this paper, we design an unbiased method to identify and optimize the choice of cross-species hybridizing probes. To test this method, we compared the Gene Ontology terms enriched in cross-hybridizing probes with those derived from microdissected stromal components.
The GO annotations and gene homologs for the human genome were downloaded from the NCBI. Microarray platform information for the Agilent 44k whole human genome oligo microarray was obtained from the Gene Expression Omnibus (GEO:GPL6480; Additional file 1: Suppl. Methods, Table A).
Simulating a xenograft model with human and mouse RNA
In order to biologically determine cross-hybridizing probes, the human and mouse universal reference RNAs were obtained from ArrayIt (Sunnyvale, CA).
For this study, we selected the Agilent 44k whole human genome oligo microarray which allows for custom design. The Agilent microarray is designed with one or more 60-mer probes per gene rather than the more complex Affymetrix probe-sets containing multiple shorter (25-mer) probes voting for each gene (Additional file 1: Suppl. Methods, array preprocessing).
Determining probes of human arrays cross-hybridizing with mouse RNA and the proportion of homologous genes between species
Deregulated human genes and their contributed Gene Ontology in a xenograft model.
| Gene Ontology Enrichment from the Differentially Expressed Genes | Expression of Significantly Differentially Expressed Probes in Human Array |
| --- | --- |
| Biological process chiefly attributable to mouse stromal cells | $c_{i,j}^{Mm} \gg c_{i,j}^{Hs}$; $c_{i,j}^{Hs}$ is stochastically distributed |
| Biological process chiefly attributable to human cancer cells | $c_{i,j}^{Mm}$ is stochastically distributed; $c_{i,j}^{Mm} \ll c_{i,j}^{Hs}$ |
| Biological processes enriched in the experiment | Both $c_{i,j}^{Mm}$ and $c_{i,j}^{Hs}$ are stochastically distributed |
Stromal effect among literature-reported stroma gene lists (Figure 1A, Additional file 2: Suppl. Table 1)
Previous xenograft cancer studies using the Agilent GPL6480 whole human microarray.
Tests to determine the statistical significance of enriched cross-species GO terms
To identify the overrepresented biological processes, the Entrez Gene IDs of annotated probes were input into the Bioconductor package GOstats, and the enrichment of GO biological process (BP) terms was conducted in four ways. In Enrichment a (conventional enrichment of genes derived from xenograft models), the list of literature-reported genes (defined in the previous paragraph, Additional file 2: Suppl. Table 1) was compared to the entire gene set of the 44k Agilent microarray. In Enrichment b (enrichment of putative stromal signal), the Xhyb subset of the literature-reported genes was compared to all Xhyb genes in the microarray that we had identified. In Enrichment c (enrichment of cancer cell signal), the remaining (non-Xhyb) subset of reported genes was compared to the remaining (non-Xhyb) genes in the microarray. In Enrichment d (enrichment of more specific stromal signal), the subset of reported genes whose probes cross-hybridized with their mouse homolog was compared to all the genes with cross-hybridized probes on the array. Note that since we could not know which cross-hybridizing probes targeted which homolog on the entire array without experimental evidence, Enrichment d used all genes with cross-hybridizing probes as the background, resulting in an estimated statistic. Alternatively, Enrichment b uses an evaluation by proxy based on the observation that the majority of mouse GO annotations were identical to human GO annotations, and the majority of the cross-hybridizing genes target their corresponding mouse homolog (Additional file 1: Suppl. Methods). For each of the tests described above (“a”, “b”, “c”, “d”), a “conditional” enrichment analysis was conducted. This “conditional” hypergeometric test allows for (i) controlling false positive results arising from genes inherited through the hierarchical structure of GO and (ii) increasing the robustness of results from small gene lists, because it follows a bottom-up testing method. This method first tests the leaves of the GO graph, and then removes all genes annotated at significant child terms from the parent term’s gene list before testing the terms whose descendants have already been tested.
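The bottom-up procedure described above can be illustrated with a minimal Python sketch. It is not the GOstats implementation (GOstats is an R/Bioconductor package). The toy term names, the gene identifiers, and the helper functions `hypergeom_upper_tail` and `conditional_enrichment` are hypothetical. The sketch also assumes that the gene universe stays unchanged when genes are stripped from a parent term, which is a simplification.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n selected genes are drawn from N, of which K carry the term."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def conditional_enrichment(children, annotations, selected, universe, alpha=0.01):
    """Test leaves first; strip genes of significant descendants from each ancestor.

    children:    term -> list of direct child terms (toy DAG, assumed acyclic)
    annotations: term -> set of genes annotated to the term or its descendants
    """
    universe = set(universe)
    selected = set(selected) & universe
    pvalues, stripped = {}, {}

    def visit(term):
        if term in pvalues:
            return stripped[term]
        drop = set()
        for child in children.get(term, []):
            drop |= visit(child)  # children are always tested before the parent
        genes = (annotations[term] & universe) - drop
        p = hypergeom_upper_tail(len(genes & selected), len(selected),
                                 len(genes), len(universe))
        pvalues[term] = p
        # Only a significant term passes its own genes up for removal.
        stripped[term] = (drop | genes) if p < alpha else drop
        return stripped[term]

    for term in annotations:
        visit(term)
    return pvalues


# Toy data (hypothetical): 20 genes, a three-level chain of terms.
genes = [f"g{i}" for i in range(1, 21)]
children = {"root": ["parent_term"], "parent_term": ["leaf_term"]}
annotations = {
    "root": set(genes),
    "parent_term": set(genes[:6]),
    "leaf_term": set(genes[:4]),
}
selected = genes[:4]

pvals = conditional_enrichment(children, annotations, selected, genes)
unconditional_parent = hypergeom_upper_tail(4, 4, 6, 20)
print(pvals)
print(unconditional_parent)
```

In this toy case the parent term would pass an unconditional test at the 1% level only because it inherits the four genes of its significant leaf. Once those genes are removed, the parent is no longer significant, which is the kind of inherited false positive the conditional test is meant to control.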
Evaluation of the methodology (Figure 1B)
We validated our methodology using two gold standards; each was based on an independently published list of biological processes (GO terms) enriched in differentially expressed genes derived from microarray experiments of cancer stroma microdissected out of whole tumors [2, 3]. Fisher’s exact test was performed between (i) the GO terms of each “gold standard” and (ii) those overrepresented among the “Xhyb genes”. This was based on the assumption that, between the xenograft model and the human environment, the stromal response to human cancer has a shared phenotype leading to a shared Gene Ontology annotation. Thus, the stromal response in xenograft models could be applied to the study of cancer. In particular, the first gold standard (GS1) contained 61 biological processes in the differentially expressed breast stromal genes that were identified by comparing laser microdissected stromal tissues (adjusted to cancer) with paired normal epithelium. The second gold standard (GS2) compared gene expression profiles of breast tumor stroma captured by laser capture microdissection, and reported the overrepresented GO terms associated with poor-prognosis, good-prognosis and mixture-outcome groups, respectively. The 107 biological processes derived from the whole tumor with poor prognosis were used as the second gold standard because samples with poor prognosis are more representative for the characterization of stromal signatures.
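The comparison against a gold standard can be sketched as a 2×2 overlap test. This is a minimal Python illustration. The text does not state the background set of GO terms, so the sketch assumes it is the set of all terms that were tested. The term labels and the functions `fisher_exact_greater` and `overlap_test` are hypothetical, and the one-sided alternative is an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on [[a, b], [c, d]]: P(overlap >= a)."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, i) * comb(total - row1, col1 - i)
               for i in range(a, upper + 1))
    return tail / comb(total, col1)


def overlap_test(gold_standard, enriched, tested_terms):
    """Build the contingency table of GO terms and test for excess overlap."""
    tested = set(tested_terms)          # assumed background: all tested terms
    gs = set(gold_standard) & tested
    en = set(enriched) & tested
    a = len(gs & en)                    # in gold standard and enriched
    b = len(gs - en)                    # in gold standard only
    c = len(en - gs)                    # enriched only
    d = len(tested - gs - en)           # in neither list
    return (a, b, c, d), fisher_exact_greater(a, b, c, d)


# Toy example with ten hypothetical term labels.
tested_terms = [f"T{i}" for i in range(10)]
gold_standard = ["T0", "T1", "T2", "T3"]
enriched_xhyb = ["T0", "T1", "T2", "T9"]

table, p_value = overlap_test(gold_standard, enriched_xhyb, tested_terms)
print(table, p_value)
```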
The network of the enriched GO biological process terms and those in the two gold standards was visualized using the Bioconductor packages Rgraphviz and GOstats. An additional evaluation for the union of statistically significant GO terms (p-value<1%, unadjusted cumulative conditional hypergeometric test) was performed as described in Additional file 1: Suppl. Method and Additional file 2: Suppl. Table 2. The Xhyb probes and R script can be downloaded from http://www.lussierlab.org/publications/Stroma.
Identification of Xhyb probes
After a sequence similarity search of a human probe with mouse RNA sequence databases and the subsequent biological cross-species expression experiment, cross-species hybridizing probes were identified and additional species-specific probes were designed by our research group for the same gene. As shown in the left plot of Figure D in Additional file 1: Suppl. Methods, we examined the top 18% (notated as BGS2.18) of the absolute expression of mouse probes when exposed to human RNA because it had the highest F-score (precision=40%, recall=25%) together with the 3rd theoretical prediction model (noted as M-Xhyb3). We examined multiple conditions in the M-Xhyb3 and developed an optimized model (right plot of Figure D, Additional file 1: Suppl. Methods). Using this optimized model, we adjusted the parameter, x, of BGS2. This process resulted in 6,200 Xhyb probes involving 5,300 mouse genes from GPL7202 and 5,900 Xhyb probes involving 4,900 human genes from GPL6480, respectively.
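As a small worked check of the reported precision and recall, the following Python sketch computes the F-score. The text does not say which F-score variant was used, so the balanced F1 (beta = 1) is an assumption here, and `f_score` is a hypothetical helper.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values reported in the text for the BGS2.18 threshold with M-Xhyb3.
f1 = f_score(0.40, 0.25)
print(round(f1, 3))  # 0.308
```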
Xhyb genes found in published deregulated genes from xenograft models
The identified Xhyb subset of the literature-reported tumor deregulated genes is listed in Additional file 2: S. Table 1. Note that many genes were targeted by multiple Agilent probes. In such cases, reported genes with only one cross-mouse hybridizing probe were annotated as “Xhyb deregulated genes”, while reported genes with both Xhyb and non-Xhyb probes were considered “possible Xhyb deregulated genes”. Details are given in Additional file 2: S. Table 3-4. Furthermore, among the 17 identified cross-mouse hybridizing human probes that were reported in the literature (Additional file 2: S. Table 3), 13 probes targeted their mouse homologs (Additional file 2: S. Table 5) as predicted by the BLASTN algorithm with default parameters. Several biological processes associated with stromal cells were significantly (p<1%) enriched only among the Xhyb deregulated genes. For example, it has been previously shown that implanted tumors developed intensive angiogenesis with vascular endothelial growth factor (VEGF) induction in the stroma. Accordingly, we found that positive regulation of vascular endothelial growth factor receptor signalling pathway was significantly enriched among the Xhyb deregulated genes (p=0.004), suggesting a different stromal/tumor cell cross-talk, which was further demonstrated by differences in angiogenesis and in necrosis.
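The labeling rule for genes targeted by multiple probes can be sketched as follows. This is one possible reading of the rule, not the authors' script: a gene whose reported probes are all Xhyb is labeled an “Xhyb deregulated gene”, and a gene with a mix of Xhyb and non-Xhyb probes is labeled a “possible Xhyb deregulated gene”. The gene and probe identifiers and the function `classify_reported_genes` are hypothetical.

```python
def classify_reported_genes(gene_to_probes, xhyb_probes):
    """Label each reported gene by the cross-hybridization status of its probes."""
    xhyb_probes = set(xhyb_probes)
    labels = {}
    for gene, probes in gene_to_probes.items():
        probes = set(probes)
        hits = probes & xhyb_probes
        if not hits:
            labels[gene] = "non-Xhyb gene"
        elif hits == probes:
            # Assumed reading: every probe of the gene cross-hybridizes.
            labels[gene] = "Xhyb deregulated gene"
        else:
            # Mixed evidence: both Xhyb and non-Xhyb probes target the gene.
            labels[gene] = "possible Xhyb deregulated gene"
    return labels


# Toy input with hypothetical identifiers.
gene_to_probes = {
    "GENE_A": ["probe_1"],
    "GENE_B": ["probe_2", "probe_3"],
    "GENE_C": ["probe_4"],
}
xhyb_probes = {"probe_1", "probe_2"}

labels = classify_reported_genes(gene_to_probes, xhyb_probes)
print(labels)
```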
In addition, as presented in Additional file 4: S. Figure 2 panel a, deregulated Xhyb genes (Enrichment d, cyan circles) are significantly better at predicting biological processes of stromal cells than the remaining genes (Enrichment c) and the full reported gene list (Enrichment a) in both studies. Conversely, Additional file 4: S. Figure 2 panel b shows that the remaining deregulated genes (Enrichment c, orange triangles) or the full gene list (Enrichment a, blue squares) are better at predicting biological processes associated with cancer cells.
Limitations and future studies. This study focused on human probes that inadvertently cross-hybridized with mouse RNAs. The confounded expression of 5,000 human and mouse genes can thus be assessed; however, these measures are limited in that they are not genome-wide. The reliance on platform-specific cross-species hybridizing probes thus limits our study, as we measure the stromal microenvironment effects indirectly, in contrast to the direct measurements that could arguably be obtained with species-specific mouse probes.
For future studies, we plan to use knowledge extracted from the literature to further refine the interpretation using BioMedLEE in a high-throughput manner. Our current experiment identified Xhyb probes in Agilent human microarrays and therefore can only reveal part of the stromal signatures. A more complete view of stromal signatures in xenograft models should be investigated using mouse-specific probes. To further identify microenvironment factors associated with tumor progression and clinical prognosis, we are currently developing a novel methodology to dissect cancer and stromal signatures using a xenograft model of head and neck cancer.
In this paper, we introduced a novel design to detect biological processes of the stroma using human arrays in which approximately 25% of the probes cross-hybridized with mouse genes. We also conclude that the majority of our identified cross-hybridized genes target their corresponding homolog. Although human and mouse RNA gene expression are confounded in this subset of probes, we have demonstrated that an appropriately performed Gene Ontology enrichment analysis can identify significant stroma-associated GO terms, by evaluating our findings against GO annotations from previous laser capture microdissections of the stroma. These results suggest that our method may be a reasonable alternative to laser microdissection in a specially designed chip. Our results also suggest that human chip probes cross-hybridizing with mouse genes are, in fact, over-annotated with stroma-associated biological processes. This leads us to believe that xenograft gene expression contains confounded cancer cell and stromal cell signals, which must be accounted for. An additional benefit of this methodology is that, from a cost perspective, this in silico approach allows us to extract additional stromal knowledge from current xenograft tumor arrays without needing to perform further in vivo/in vitro experiments. In future studies, we intend to examine these cross-species enrichments for biological interactions between cancer cells and stromal cells, an important initiative of the National Cancer Institute Tumor Microenvironment Think Tank.
This work was supported by the National Institutes of Health (National Cancer Institute) 1U54CA121852 (National Center for Multiscale Analyses of Genomic and Cellular Networks - MAGNET), K22 LM008308-04, CTSA UL1 RR024999-03, and Cancer Research Foundation (T-AML).
This article has been published as part of BMC Bioinformatics Volume 11 Supplement 9, 2010: Selected Proceedings of the 2010 AMIA Summit on Translational Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/11?issue=S9.
- Joyce JA, Pollard JW: Microenvironmental regulation of metastasis. Nat Rev Cancer 2009, 9(4):239–252. 10.1038/nrc2618
- Finak G, Bertos N, Pepin F, Sadekova S, Souleimanova M, Zhao H, Chen H, Omeroglu G, Meterissian S, Omeroglu A, et al.: Stromal gene expression predicts clinical outcome in breast cancer. Nat Med 2008, 14(5):518–527. 10.1038/nm1764
- Finak G, Sadekova S, Pepin F, Hallett M, Meterissian S, Halwani F, Khetani K, Souleimanova M, Zabolotny B, Omeroglu A, et al.: Gene expression signatures of morphologically normal breast tissue identify basal-like tumors. Breast Cancer Res 2006, 8(5):R58. 10.1186/bcr1608
- Stuart RO, Wachsman W, Berry CC, Wang-Rodriguez J, Wasserman L, Klacansky I, Masys D, Arden K, Goodison S, McClelland M, et al.: In silico dissection of cell-type-associated patterns of gene expression in prostate cancer. Proc Natl Acad Sci U S A 2004, 101(2):615–620. 10.1073/pnas.2536479100
- Tlsty TD, Coussens LM: Tumor stroma and regulation of cancer development. Annu Rev Pathol 2006, 1: 119–150. 10.1146/annurev.pathol.1.110304.100224
- Gouyer V, Fontaine D, Dumont P, de Wever O, Fontayne-Devaud H, Leteurtre E, Truant S, Delacour D, Drobecq H, Kerckaert JP, et al.: Autocrine induction of invasion and metastasis by tumor-associated trypsin inhibitor in human colon cancer cells. Oncogene 2008, 27(29):4024–4033. 10.1038/onc.2008.42
- Klymkowsky MW, Savagner P: Epithelial-mesenchymal transition: a cancer researcher's conceptual friend and foe. Am J Pathol 2009, 174(5):1588–1593. 10.2353/ajpath.2009.080545
- Anderson AR, Rejniak KA, Gerlee P, Quaranta V: Microenvironment driven invasion: a multiscale multimodel investigation. J Math Biol 2009, 58(4–5):579–624. 10.1007/s00285-008-0210-2
- Sangar V, Blankenberg DJ, Altman N, Lesk AM: Quantitative sequence-function relationships in proteins based on gene ontology. BMC Bioinformatics 2007, 8: 294. 10.1186/1471-2105-8-294
- Wang C, Maass T, Krupp M, Thieringer F, Strand S, Worns MA, Barreiros AP, Galle PR, Teufel A: A systems biology perspective on cholangiocellular carcinoma development: focus on MAPK-signaling and the extracellular environment. J Hepatol 2009, 50(6):1122–1131. 10.1016/j.jhep.2009.01.024
- Chiu CY, Alizadeh AA, Rouskin S, Merker JD, Yeh E, Yagi S, Schnurr D, Patterson BK, Ganem D, DeRisi JL: Diagnosis of a critical respiratory illness caused by human metapneumovirus by use of a pan-virus microarray. J Clin Microbiol 2007, 45(7):2340–2343. 10.1128/JCM.00364-07
- Palacios G, Quan PL, Jabado OJ, Conlan S, Hirschberg DL, Liu Y, Zhai J, Renwick N, Hui J, Hegyi H, et al.: Panmicrobial oligonucleotide array for diagnosis of infectious diseases. Emerg Infect Dis 2007, 13(1):73–81. 10.3201/eid1301.060837
- Jabado OJ, Liu Y, Conlan S, Quan PL, Hegyi H, Lussier Y, Briese T, Palacios G, Lipkin WI: Comprehensive viral oligonucleotide probe design using conserved protein regions. Nucleic Acids Res 2008, 36(1):e3. 10.1093/nar/gkm1106
- Liu Y, Sam L, Li J, Lussier YA: Robust methods for accurate diagnosis using pan-microbiological oligonucleotide microarrays. BMC Bioinformatics 2009, 10(Suppl 2):S11. 10.1186/1471-2105-10-S2-S11
- Camon E, Magrane M, Barrell D, Binns D, Fleischmann W, Kersey P, Mulder N, Oinn T, Maslen J, Cox A, et al.: The Gene Ontology Annotation (GOA) project: implementation of GO in SWISS-PROT, TrEMBL, and InterPro. Genome Res 2003, 13(4):662–672. 10.1101/gr.461403
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 2002, 30(1):207–210. 10.1093/nar/30.1.207
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215(3):403–410.
- Wolfinger RD, Gibson G, Wolfinger ED, Bennett L, Hamadeh H, Bushel P, Afshari C, Paules RS: Assessing gene significance from cDNA microarray expression data via mixed models. J Comput Biol 2001, 8(6):625–637. 10.1089/106652701753307520
- Sakariassen PO, Prestegarden L, Wang J, Skaftnesmo KO, Mahesparan R, Molthoff C, Sminia P, Sundlisaeter E, Misra A, Tysnes BB, et al.: Angiogenesis-independent tumor growth mediated by stem-like cancer cells. Proc Natl Acad Sci U S A 2006, 103(44):16466–16471. 10.1073/pnas.0607668103
- Falcon S, Gentleman R: Using GOstats to test gene lists for GO term association. Bioinformatics 2007, 23(2):257–258. 10.1093/bioinformatics/btl567
- Fujita M, Hayashi I, Yamashina S, Fukamizu A, Itoman M, Majima M: Angiotensin type 1a receptor signaling-dependent induction of vascular endothelial growth factor in stroma is relevant to tumor-associated angiogenesis and tumor growth. Carcinogenesis 2005, 26(2):271–279. 10.1093/carcin/bgh324
- Lussier Y, Friedman C: BiomedLEE: a natural-language processor for extracting and representing phenotypes, underlying molecular mechanisms and their relationships. In: ISMB: 2007 2007.
- Tumor Microenvironment Think Tank [http://www.cancer.gov/think-tanks-cancer-biology/page3]
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.