- Methodology article
- Open Access
Conserved co-expression for candidate disease gene prioritization
© Oti et al; licensee BioMed Central Ltd. 2008
- Received: 12 February 2008
- Accepted: 23 April 2008
- Published: 23 April 2008
Genes that are co-expressed tend to be involved in the same biological process. However, co-expression is not a very reliable predictor of functional links between genes. The evolutionary conservation of co-expression between species can be used to predict protein function more reliably than co-expression in a single species. Here we examine whether co-expression across multiple species is also a better prioritizer of disease genes than is co-expression between human genes alone.
We use co-expression data from yeast (S. cerevisiae), nematode worm (C. elegans), fruit fly (D. melanogaster), mouse and human and find that the use of evolutionary conservation can indeed improve the predictive value of co-expression. Genes causing the same disease tend to be more strongly co-expressed with each other than with other genes from their associated disease loci, and this effect is significantly enhanced when co-expression data are combined across evolutionarily distant species. We also find that performance can vary significantly depending on the co-expression datasets used, and that simply using more data does not necessarily lead to better prioritization. Instead, dataset quality is more important than quantity: using a consistent microarray platform per species leads to better performance than using more inclusive datasets pooled from various platforms.
We find that evolutionarily conserved gene co-expression prioritizes disease candidate genes better than human gene co-expression alone, and provide the integrated data as a new resource for disease gene prioritization tools.
- Disease Gene
- Evolutionary Conservation
- Candidate Disease Gene
- Total Expression Level
- Gene Atlas Expression
In the past few years several bioinformatic tools and approaches have been developed to assist medical genetic researchers in positional candidate disease gene identification (reviewed in ; see also [2–5]). Several tools use functional genomics to prioritize candidate genes located within disease-associated genomic loci by evaluating functional relationships between known disease genes and positional candidate genes [6–8]. These tools are based on the premise that genes which are involved in the same disease phenotype are likely to be functionally related [1, 9, 10]. This has indeed been shown to be the case as evidenced by the fact that these tools all perform better than random expectation in the prediction or prioritization of candidate disease genes. Nevertheless, not all types of functional genomic data perform equally well in terms of sensitivity and specificity [2, 7, 8]. Microarray expression data have wider coverage than other high-throughput genomic data such as protein-protein interactions, as genome-scale expression analyses are readily and routinely performed with them. Additionally, they are less biased toward better studied genes than gene function annotation or literature mining, although the latter approaches fare better at prioritizing disease candidate genes [2, 7, 8]. Therefore, given the large coverage of co-expression data and their complementarity to functional annotation and literature mining, it is of importance to maximize the disease gene predictive value of this type of data.
Several bioinformatic candidate disease gene prioritization tools already incorporate microarray-based co-expression data [2, 6–8, 11, 12]. This approach is based on the assumption that if two genes are functionally related then their expression should vary concordantly across tissues and under different circumstances, and proposes that their expression profiles should therefore be correlated. For candidate disease gene prioritization, the use of co-expression analysis is preferable to the use of tissue-specific gene expression patterns, as it is a better predictor of functional relatedness between genes .
However, co-expression data can be applied more comprehensively than is currently implemented by these tools. One important and currently underexploited approach is to incorporate co-expression data from other species. One might expect that while human co-expression data are the most relevant for disease gene prioritization, evolutionary conservation of co-expression can be used to enhance the reliability of identified co-expression relationships. The premise is that co-expression relationships that are maintained across phylogenetically distant organisms must be under selective pressure, and should therefore be functional – a premise that has indeed been confirmed in several previous studies [14–17]. Though one tool already includes multi-species co-expression data , the improvement in disease gene ranking performance due to the exploitation of evolutionary conservation has not yet been investigated.
We therefore investigated the predictive value of conserved co-expression for candidate disease gene prioritization. To this end we analyzed how well co-expression between known and candidate disease genes could prioritize positional candidate disease genes. We restricted our analysis to known disease genes from genetic diseases containing at least two known causative genes. We constructed artificial loci of 100 candidate genes around the known disease-causing genes, and investigated the tendency of these causative genes to have higher co-expression with other known causative genes compared to the non-causative candidate genes from the same disease loci. Using co-expression data from five eukaryotic species – baker's yeast (Saccharomyces cerevisiae), nematode worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), mouse (Mus musculus) and human – we investigated the effect of evolutionary conservation on the ranking of the disease gene pairs, finding that evolutionary conservation of co-expression does indeed improve disease gene ranking. Therefore, exploiting evolutionary conservation could potentially improve the performance of co-expression data in existing disease candidate gene prioritization tools [2, 6–8], which might in turn improve the prioritization of less well-studied genes.
Evolutionary conservation of co-expression improves disease gene ranking performance
We investigated how well disease genes tend to rank relative to non-causative candidate disease genes when ranked according to co-expression with other genes known to cause the same disease. We combined co-expression scores across species using orthology relationships from the euKaryotic clusters of Orthologous Groups (KOG) database . The co-expression scores are thus based on these KOGs rather than on individual genes (see methods section for further details). We used expression data from human, mouse, fruit fly (D. melanogaster), worm (C. elegans) and baker's yeast (S. cerevisiae) assembled from the Gene Expression Omnibus database  and the Genomics Institute of the Novartis Foundation  Gene Atlas expression data. For this study we used artificial disease loci containing 100 genes per locus. Testing with 50, 100 and 200 genes per locus does not make much difference though smaller loci tend to perform slightly better than larger loci (data not shown). Only disease gene pairs with co-expression scores unlikely to occur randomly in the corresponding dataset (i.e. more than 2 standard deviations from the dataset randomization mean) were included in the final rankings. This process implicitly weighs the scores according to the number of species involved, as the random score distributions are narrower for datasets combining more species. The standard deviations of these randomized distributions range from 0.051 for the five-species combined dataset to between 0.057 (yeast) and 0.094 (mouse) for the individual single-species datasets. Therefore, a given correlation score is more likely to be considered significant in a multi-species dataset than in a single-species dataset.
Disease gene ranking improved by co-expression conservation at different evolutionary distances
Table 1: Pairwise species comparisons for co-expression-based disease gene ranking.
Columns: # Disease gene pairs; Pairwise combined sets; # Disease gene pairs; P-value (combined better than single species).
P-values (row labels and gene-pair counts not preserved): 6.7 × 10^-4 *; 1.3 × 10^-5 *; 1.8 × 10^-5 *; 2.1 × 10^-4 *; 9.1 × 10^-6 *
In contrast to the other species, co-expression conservation with yeast does not significantly improve disease gene ranking (Table 1). For this species pair yeast-only co-expression performs best, outperforming even the combined human-yeast set at ranking human disease genes (albeit not significantly; p = 0.26). This is primarily due to specific disease types involving housekeeping processes such as metabolism (congenital disorder of glycosylation, glycogen storage disease) and DNA repair (xeroderma pigmentosum) which consistently score well particularly in the yeast set. As yeast co-expression already performs very well, the combination with human co-expression may not yield much extra information. However, this performance comes at the expense of much reduced coverage of disease genes relative to the other sets, which all have a similar coverage (550 disease gene rankings for the human-yeast set, versus ~3000 for human-mouse, human-fly and human-worm sets). It is thus evident that despite the large evolutionary distance between these two species, yeast co-expression is still effective at ranking human disease genes for those genes that have orthologs in both species.
Disease gene ranking performance is dependent on co-expression data used
In addition to restricting expression data to a single platform per species, normalizing the microarray expression data according to total expression level also improves the ranking of disease genes relative to non-disease genes from the candidate loci. As we were mainly interested in relative expression levels of genes across conditions and not in total gene expression levels, we normalized all expression values according to the total expression level of the microarray sample (see methods section for further details). This reduces systematic biases between samples due to differences in total expression levels and highlights the expression relationships between genes per sample, resulting in up to 5% improvement in candidate disease gene ranking (data not shown).
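The normalization described above can be sketched roughly as follows. This is a minimal illustration under one plausible reading (each sample is rescaled so that its values sum to a common total); the function name, the `target_total` parameter and the toy values are assumptions for illustration, not the procedure or data used in the study.

```python
def normalize_by_sample_total(samples, target_total=1.0):
    """Scale every sample so that its expression values sum to target_total.

    samples: dict mapping sample name -> dict mapping gene -> expression value.
    (Hypothetical data layout chosen for this sketch.)
    """
    normalized = {}
    for sample, values in samples.items():
        total = sum(values.values())
        if total == 0:
            raise ValueError(f"sample {sample!r} has zero total expression")
        scale = target_total / total
        normalized[sample] = {gene: v * scale for gene, v in values.items()}
    return normalized


# Two toy samples with identical gene proportions but different overall brightness.
samples = {
    "liver": {"geneA": 10.0, "geneB": 30.0},
    "brain": {"geneA": 40.0, "geneB": 120.0},
}
norm = normalize_by_sample_total(samples)
```

After rescaling, the two toy samples carry the same relative values, so a difference in overall array intensity no longer contributes to the expression profiles.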
All disease gene ranking and conserved co-expression correlation data presented here are freely available online .
In this study, we show that we can increase the predictive value of co-expression for disease gene prioritization by exploiting evolutionary conservation, despite the variations in the biology of the species compared. Given a genuine co-expression relationship between the disease genes, using conserved co-expression to prioritize candidate disease genes can reduce the number of genes to be tested by more than sevenfold compared with a random ranking of the candidates: the correct gene is typically found after testing 7% of the candidates (the median disease gene rank is 0.93) instead of 50% (Figure 2). Encouragingly, even human-mouse conservation can lead to a substantial improvement in disease gene ranking performance, despite the relatively short evolutionary distance between these two species (Figure 3). This means that improvements in specificity can be gained without large losses in sensitivity, as most human genes have mouse orthologs.
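The sevenfold figure follows directly from the reported median rank. As a short worked calculation (assuming the rank is expressed as the fraction of candidates that the disease gene outranks, and that a random ranking places it halfway down the list on expectation):

```latex
\text{fraction of candidates tested} = 1 - 0.93 = 0.07,
\qquad
\frac{0.50}{0.07} \approx 7.1
```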
An interesting finding is that large pooled datasets which combine as many expression data as possible from various experiments and platforms can actually result in reduced co-expression performance relative to smaller but more coherent expression datasets (Figure 4). Microarray data are notoriously variable between independently generated datasets, while they are somewhat more consistent between experiments using the same platform [22, 23]. In order to minimize dilution of the co-expression signal when combining data from many different sources, a weighting scheme is required, such as using the co-expression overlap between different sets to weight the relevance of the co-expression value . Our results are consistent with these previously reported findings, as our single-platform-per-species dataset ranks disease genes significantly better than the more inclusive pooling approach adopted earlier by Stuart and colleagues . An alternative explanation would be that their expression sets are of lower quality or are less representative of the relationships between disease genes, but there is no reason to assume that either of these is the case. This underscores that combining as many data as possible does not necessarily improve the performance of co-expression data for disease gene prioritization; it is therefore not a trivial finding that combining data from different species does.
Another reason why the larger sets do not perform as well as the smaller sets could lie in the use of correlation coefficients to determine genetic relatedness. Correlation coefficients estimate expression coherence across all conditions surveyed, but even functionally related genes may not have coherent expression patterns across all tissues and conditions. The larger the datasets, the greater the potential for irrelevant conditions to mask the co-expression relationship that a group of genes has under a limited set of conditions. Therefore, a biclustering-based approach [26, 27] may yield more refined co-expression relationships between genes, and is a potential avenue for future improvement of co-expression-based disease candidate gene prioritization.
We analyze here the predictive power of gene co-expression for disease gene prioritization and identify factors that affect it, such as evolutionary conservation. We show that co-expression data from other species have predictive power for human disease gene prioritization, and that evolutionarily conserved gene co-expression can improve disease gene prioritization over human-only gene co-expression. In addition, we show that platform consistency is important and that smaller but more cohesive datasets can outperform larger pooled datasets. Though we only examined disease gene ranking, these findings have broader relevance for the use of microarray co-expression data in functional genomics. We provide these conserved co-expression data as a new resource that can be used in disease gene prioritization programs, particularly those that integrate several different data types.
We used the Online Mendelian Inheritance in Man (OMIM)  Morbid Map as a source of genetic diseases and known disease genes. We restricted our analysis to those diseases with two or more known disease genes. There were 890 known disease genes (727 distinct genes) for 177 diseases in our dataset. Artificial disease loci were constructed around these known disease genes based on localization information from the Ensembl database , by taking the required number of neighboring genes centered on the disease gene. These genes were then translated to HGNC gene IDs  or KOG IDs  depending on the analysis. This means that the locus genes used in the analyses are a subset of the Ensembl genes in the locus, depending on how many could be mapped to their relevant IDs. We used artificial loci of 100 genes, which is representative of the average candidate disease locus, as the OMIM heterogeneous disease loci have a median of 88 genes per locus. In addition, we investigated the use of 50- and 200-gene artificial loci, as well as actual associated loci from OMIM Morbid Map, but do not consider them further as their results did not differ substantially from those of the 100-gene artificial loci.
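The locus construction can be illustrated with a small sketch. It assumes the genes of one chromosome are already available as a position-sorted list; the function name, the gene IDs and the handling of chromosome ends (the window is shifted inward) are assumptions made for illustration, since the text does not specify them.

```python
def artificial_locus(ordered_genes, disease_gene, size=100):
    """Return up to `size` consecutive genes centred on `disease_gene`.

    ordered_genes: gene IDs on one chromosome, sorted by position.
    Near a chromosome end the window is shifted inward (an assumption).
    """
    idx = ordered_genes.index(disease_gene)
    size = min(size, len(ordered_genes))
    start = idx - size // 2
    # Clamp the window so that it stays within the chromosome.
    start = max(0, min(start, len(ordered_genes) - size))
    return ordered_genes[start:start + size]


# Hypothetical chromosome with 1000 genes named g0 .. g999.
chromosome = [f"g{i}" for i in range(1000)]
locus = artificial_locus(chromosome, "g500")
edge_locus = artificial_locus(chromosome, "g3")
```

In practice the resulting list would then be reduced to the genes that can be mapped to HGNC or KOG IDs, as described above.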
Expression data processing
Initially we used the multi-species expression data used by Stuart and colleagues in their functional analysis of conserved co-expression . This dataset contains expression data for human, fruit fly (Drosophila melanogaster), nematode worm (Caenorhabditis elegans) and baker's yeast (Saccharomyces cerevisiae) genes. These expression data had already been normalized and were therefore not further processed prior to their use in the co-expression calculations.
Table: GEO expression datasets used (columns: GEO series ID; number of samples)
- GSE1311, GSE1312, GSE1313
Table: Genes included in the co-expression calculations (columns: genes included in co-expression data; number of genes)
- Those with gene names also present in Affymetrix HG-U133A microarray platform: 13955 (11410 genes in all disease loci combined)
- Those with mouse gene names (unknown transcripts, RIKEN transcripts and predicted genes excluded)
- Those with FlyBase IDs
- Those with systematic names (Y...IDs)
No artificial cut-off was used to filter out noisy low expression values, or to define presence or absence of gene expression in a sample. This is not necessary, as we are using correlation between expression profiles rather than absolute expression levels. The inclusion of non-biologically significant noise should not result in spurious correlations between genes, and if there is a correlation between low expression values then they are probably not merely noise and should not be filtered out.
It should be noted that while we used log2-transformed signal intensity values in a multi-species study involving different microarray platforms, and not relative abundances as Liao & Zhang did , our analysis does not suffer from the same problems that led them to use relative abundances. We do not directly compare expression values between different microarray platforms. Instead, these expression values are converted to co-expression values for each platform separately. This process involves only within-platform signal intensity comparisons. The between-species – and therefore between-platform – comparisons are done at the co-expression level and involve comparisons of Spearman rank correlation coefficients.
Co-expression score calculations
We used Spearman rank correlation coefficients as the microarray signal intensity values were not normally distributed. For the GEO datasets comprising several experiments (the fly and yeast sets), these data were pooled before gene pair co-expression correlation coefficients were calculated.
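For readers unfamiliar with the measure, the following is a minimal pure-Python sketch of a Spearman rank correlation between two expression profiles (ranks with ties averaged, followed by a Pearson correlation of the ranks). The study itself used R for this step; the function names and toy profiles here are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy profiles across five samples (hypothetical values).
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.1, 8.3, 16.0, 33.0]   # rises with gene_a, but not linearly
gene_c = [9.0, 7.5, 6.0, 2.0, 1.0]     # falls as gene_a rises
rho_ab = spearman(gene_a, gene_b)
rho_ac = spearman(gene_a, gene_c)
```

Because only the ranks matter, the non-linear but monotonic relationship between `gene_a` and `gene_b` still yields a coefficient of essentially 1, which is why the measure suits signal intensities that are not normally distributed.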
In order to be able to compare co-expression relationships between species, we used the gene orthology relationships as defined by the euKaryotic clusters of Orthologous Groups (KOG) database . We chose to use KOGs instead of a metagenes-based approach such as was used by Stuart and colleagues  in order to maximize coverage, as KOGs not only contain bidirectional best hits but also closely related paralogs. The gene to KOG mapping was done using the STRING database version 6.1 . Mapping of the protein IDs used in STRING to the gene IDs used on the microarrays was done using Ensembl BioMart . Of the 13955 human genes with expression data used in this study 8186 could be mapped to KOGs.
In order to incorporate evolutionary conservation into the final co-expression scores, we took the mean of the species-specific KOG-based co-expression scores over all species considered (between-species averaging). For the comparison between human and multi-species conserved co-expression the union of all the sets was taken for maximal coverage – i.e. all KOG-based co-expression scores were used regardless of which species were represented in the KOGs.
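The two averaging steps can be sketched as follows. This is a rough illustration rather than the authors' C++ implementation: the data layout, the KOG identifiers, the skipping of gene pairs that fall in the same KOG, and the choice to average only over species that actually have a score for a KOG pair are all assumptions.

```python
from collections import defaultdict


def kog_scores(gene_corr, gene_to_kog):
    """Average gene-gene correlations into KOG-KOG scores for one species.

    gene_corr: dict mapping (gene1, gene2) -> correlation coefficient.
    gene_to_kog: dict mapping gene -> KOG ID; unmapped genes are skipped.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (g1, g2), rho in gene_corr.items():
        k1, k2 = gene_to_kog.get(g1), gene_to_kog.get(g2)
        if k1 is None or k2 is None or k1 == k2:
            continue  # assumption: within-KOG pairs are not scored
        key = tuple(sorted((k1, k2)))
        sums[key] += rho
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def conserved_scores(per_species):
    """Mean of the per-species KOG-KOG scores over the species that have the pair."""
    collected = defaultdict(list)
    for scores in per_species.values():
        for key, value in scores.items():
            collected[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in collected.items()}


# Hypothetical example: two human paralogs (A1, A2) share one KOG.
human_corr = {("A1", "B1"): 0.8, ("A2", "B1"): 0.4}
human_map = {"A1": "KOG0001", "A2": "KOG0001", "B1": "KOG0002"}
human_kog = kog_scores(human_corr, human_map)
mouse_kog = {("KOG0001", "KOG0002"): 0.2}
combined = conserved_scores({"human": human_kog, "mouse": mouse_kog})
```

Taking the union in `conserved_scores` means a KOG pair seen in only one species keeps that species' score unchanged, which corresponds to the maximal-coverage setting described above.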
Disease gene ranking analyses
To avoid ranking candidate disease genes that have no co-expression relationship with each other at all, we randomly permuted each co-expression dataset to determine its random distribution of co-expression scores. We then excluded all disease gene pairs for which the co-expression score fell within 2 standard deviations of the randomization mean of the corresponding dataset. These distributions all had co-expression scores with a mean of approximately zero and standard deviations ranging between 0.05 and 0.09 depending on the dataset. Multiple randomizations always resulted in almost identical score distributions per dataset due to the large numbers involved.
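The filtering step can be sketched as below. The sketch takes the scores from a permuted dataset as given (how the permutation is carried out is outside its scope) and applies the two-sided reading of the threshold; the function names and toy scores are assumptions, while the standard deviations 0.051 and 0.094 are the values reported in the text for the five-species and mouse datasets.

```python
from statistics import mean, pstdev


def null_parameters(null_scores):
    """Mean and standard deviation of co-expression scores from a permuted dataset."""
    return mean(null_scores), pstdev(null_scores)


def is_significant(score, null_mean, null_sd, n_sd=2.0):
    """True if the score lies more than n_sd standard deviations from the null mean."""
    return abs(score - null_mean) > n_sd * null_sd


def filter_pairs(pair_scores, null_scores, n_sd=2.0):
    """Keep only gene pairs whose score falls outside the null band."""
    null_mean, null_sd = null_parameters(null_scores)
    return {pair: s for pair, s in pair_scores.items()
            if is_significant(s, null_mean, null_sd, n_sd)}


# Toy null distribution (hypothetical values standing in for a permuted dataset).
null_scores = [-0.10, -0.05, 0.0, 0.05, 0.10]
pair_scores = {("G1", "G2"): 0.30, ("G3", "G4"): 0.05, ("G5", "G6"): -0.20}
kept = filter_pairs(pair_scores, null_scores)

# The same score of 0.15 is judged against a narrower null band in the
# five-species set (SD 0.051) than in the mouse-only set (SD 0.094).
five_species = is_significant(0.15, 0.0, 0.051)
mouse_only = is_significant(0.15, 0.0, 0.094)
```

The last two lines illustrate why the threshold implicitly favours multi-species scores: a score of 0.15 clears the cut-off of about 0.10 in the combined set but not the cut-off of about 0.19 in the mouse set.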
To investigate the influence of evolutionary conservation on disease gene pair ranking performance, human-derived co-expression data were compared with co-expression data averaged across all five species included in the study. Additionally, pairwise species comparisons were performed comparing human-only co-expression data with pairwise conserved co-expression between human and mouse, fly, worm or yeast.
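As an illustration of what ranking a disease gene within its locus might look like, the sketch below scores every candidate in a locus by its co-expression with another gene known to cause the same disease and reports the fraction of candidates that the true disease gene outranks. The rank definition (1.0 is best, ties count half), the exclusion of candidates without a score, and all names and values are assumptions for this sketch, not the authors' Python scripts.

```python
def relative_rank(locus_genes, disease_gene, reference_gene, coexpr):
    """Fraction of the other scored candidates that the disease gene outranks.

    coexpr: dict mapping frozenset({gene1, gene2}) -> co-expression score.
    Returns None if the disease gene or all other candidates lack a score.
    """
    def score(gene):
        return coexpr.get(frozenset((gene, reference_gene)))

    target = score(disease_gene)
    if target is None:
        return None
    others = [score(g) for g in locus_genes if g != disease_gene]
    others = [s for s in others if s is not None]
    if not others:
        return None
    below = sum(1 for s in others if s < target)
    ties = sum(1 for s in others if s == target)
    return (below + 0.5 * ties) / len(others)


# Hypothetical locus: four non-causative candidates plus the disease gene "DG",
# scored against "REF", another gene known to cause the same disease.
locus = ["C1", "C2", "C3", "C4", "DG"]
coexpr = {
    frozenset(("DG", "REF")): 0.6,
    frozenset(("C1", "REF")): 0.1,
    frozenset(("C2", "REF")): 0.7,
    frozenset(("C3", "REF")): -0.2,
    frozenset(("C4", "REF")): 0.3,
}
rank = relative_rank(locus, "DG", "REF", coexpr)
```

Here the disease gene outscores three of the four other candidates, giving a relative rank of 0.75; a median of such ranks over all disease gene pairs is the kind of summary statistic compared between the human-only and conserved datasets.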
In order to test for the effect of averaging gene-gene co-expression within KOGs, the performance of the GNF human expression set when using KOG-based co-expression was compared to its performance when using gene-based co-expression.
The R statistical software package  was used for the microarray data processing and the Spearman rank correlation calculations, as well as for statistical tests and data plotting. For performance reasons, small custom-written C++ programs were used to average the gene-gene correlation coefficients into KOG-KOG correlation coefficients, and the per-species KOG-KOG correlations into cross-species KOG-KOG correlation values. Python scripts were written for the disease gene correlation coefficient ranking analyses. All scripts and source code are available on request.
We would like to thank Walter Hoolwerf and Mark Jans for implementing the initial version of the conserved co-expression database.
Funding: This work was supported in part by the BioRange program of the Netherlands Bioinformatics Centre, and by the Horizon program, both of which are supported by a BSIK grant through the Netherlands Genomics Initiative.
- Oti M, Brunner HG: The modular nature of genetic diseases. Clin Genet 2007, 71(1):1–11. doi:10.1111/j.1399-0004.2006.00708.x
- Chen J, Xu H, Aronow BJ, Jegga AG: Improved human disease candidate gene prioritization using mouse phenotype. BMC Bioinformatics 2007, 8: 392. doi:10.1186/1471-2105-8-392
- George RA, Liu JY, Feng LL, Bryson-Richardson RJ, Fatkin D, Wouters MA: Analysis of protein sequence and interaction data for candidate disease gene prediction. Nucleic Acids Res 2006, 34(19):e130. doi:10.1093/nar/gkl707
- Lage K, Karlberg EO, Storling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tumer Z, Pociot F, Tommerup N, Moreau Y, Brunak S: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol 2007, 25(3):309–316. doi:10.1038/nbt1295
- Xu J, Li Y: Discovering disease-genes by topological features in human protein-protein interaction network. Bioinformatics 2006, 22(22):2800–2805. doi:10.1093/bioinformatics/btl467
- Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: SUSPECTS: enabling fast and effective prioritization of positional candidates. Bioinformatics 2006, 22(6):773–774. doi:10.1093/bioinformatics/btk031
- Aerts S, Lambrechts D, Maity S, Van Loo P, Coessens B, De Smet F, Tranchevent LC, De Moor B, Marynen P, Hassan B, Carmeliet P, Moreau Y: Gene prioritization through genomic data fusion. Nat Biotechnol 2006, 24(5):537–544. doi:10.1038/nbt1203
- Franke L, Bakel H, Fokkens L, de Jong ED, Egmont-Petersen M, Wijmenga C: Reconstruction of a functional human gene network, with an application for prioritizing positional candidate genes. Am J Hum Genet 2006, 78(6):1011–1025. doi:10.1086/504300
- Brunner HG, van Driel MA: From syndrome families to functional genomics. Nat Rev Genet 2004, 5(7):545–551. doi:10.1038/nrg1383
- van Driel MA, Bruggeman J, Vriend G, Brunner HG, Leunissen JA: A text-mining analysis of the human phenome. Eur J Hum Genet 2006, 14(5):535–542. doi:10.1038/sj.ejhg.5201585
- Perez-Iratxeta C, Wjst M, Bork P, Andrade MA: G2D: a tool for mining genes associated with disease. BMC Genet 2005, 6: 45. doi:10.1186/1471-2156-6-45
- Rossi S, Masotti D, Nardini C, Bonora E, Romeo G, Macii E, Benini L, Volinia S: TOM: a web-based integrated approach for identification of candidate disease genes. Nucleic Acids Res 2006, 34(Web Server issue):W285–92. doi:10.1093/nar/gkl340
- Zhang W, Morris QD, Chang R, Shai O, Bakowski MA, Mitsakakis N, Mohammad N, Robinson MD, Zirngibl R, Somogyi E, Laurin N, Eftekharpour E, Sat E, Grigull J, Pan Q, Peng WT, Krogan N, Greenblatt J, Fehlings M, van der Kooy D, Aubin J, Bruneau BG, Rossant J, Blencowe BJ, Frey BJ, Hughes TR: The functional landscape of mouse gene expression. J Biol 2004, 3(5):21. doi:10.1186/jbiol16
- Bergmann S, Ihmels J, Barkai N: Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2004, 2(1):E9. doi:10.1371/journal.pbio.0020009
- Liao BY, Zhang J: Low rates of expression profile divergence in highly expressed genes and tissue-specific genes during mammalian evolution. Mol Biol Evol 2006, 23(6):1119–1128. doi:10.1093/molbev/msj119
- Liao BY, Zhang J: Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol Biol Evol 2006, 23(3):530–540. doi:10.1093/molbev/msj054
- van Noort V, Snel B, Huynen MA: Predicting gene function by conserved co-expression. Trends Genet 2003, 19(5):238–242. doi:10.1016/S0168-9525(03)00056-8
- Koonin EV, Fedorova ND, Jackson JD, Jacobs AR, Krylov DM, Makarova KS, Mazumder R, Mekhedov SL, Nikolskaya AN, Rao BS, Rogozin IB, Smirnov S, Sorokin AV, Sverdlov AV, Vasudevan S, Wolf YI, Yin JJ, Natale DA: A comprehensive evolutionary classification of proteins encoded in complete eukaryotic genomes. Genome Biol 2004, 5(2):R7. doi:10.1186/gb-2004-5-2-r7
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 2002, 30(1):207–210. doi:10.1093/nar/30.1.207
- Su AI, Wiltshire T, Batalov S, Lapp H, Ching KA, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, Cooke MP, Walker JR, Hogenesch JB: A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A 2004, 101(16):6062–6067. doi:10.1073/pnas.0400782101
- Stuart JM, Segal E, Koller D, Kim SK: A gene-coexpression network for global discovery of conserved genetic modules. Science 2003, 302(5643):249–255. doi:10.1126/science.1087447
- Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martinez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, Yu W: Multiple-laboratory comparison of microarray platforms. Nat Methods 2005, 2(5):345–350. doi:10.1038/nmeth756
- Kuo WP, Liu F, Trimarchi J, Punzo C, Lombardi M, Sarang J, Whipple ME, Maysuria M, Serikawa K, Lee SY, McCrann D, Kang J, Shearstone JR, Burke J, Park DJ, Wang X, Rector TL, Ricciardi-Castagnoli P, Perrin S, Choi S, Bumgarner R, Kim JH, Short GF 3rd, Freeman MW, Seed B, Jensen R, Church GM, Hovig E, Cepko CL, Park P, Ohno-Machado L, Jenssen TK: A sequence-oriented comparison of gene expression measurements across different hybridization-based technologies. Nat Biotechnol 2006, 24(7):832–840. doi:10.1038/nbt1217
- Conserved Coexpression for Candidate Disease Gene Prioritization [http://www.cmbi.ru.nl/~moti/coexpression/]
- Lee HK, Hsu AK, Sajdak J, Qin J, Pavlidis P: Coexpression analysis of human genes across many microarray data sets. Genome Res 2004, 14(6):1085–1094. doi:10.1101/gr.1910904
- Cheng Y, Church GM: Biclustering of expression data. Proc Int Conf Intell Syst Mol Biol 2000, 8: 93–103.
- Madeira SC, Oliveira AL: Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans Comput Biol Bioinform 2004, 1(1):24–45. doi:10.1109/TCBB.2004.2
- Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005, 33(Database issue):D514–7. doi:10.1093/nar/gki033
- Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, Cox T, Cuff J, Curwen V, Down T, Durbin R, Eyras E, Gilbert J, Hammond M, Huminiecki L, Kasprzyk A, Lehvaslaiho H, Lijnzaad P, Melsopp C, Mongin E, Pettett R, Pocock M, Potter S, Rust A, Schmidt E, Searle S, Slater G, Smith J, Spooner W, Stabenau A, Stalker J, Stupka E, Ureta-Vidal A, Vastrik I, Clamp M: The Ensembl genome database project. Nucleic Acids Res 2002, 30(1):38–41. doi:10.1093/nar/30.1.38
- Wain HM, Lush MJ, Ducluzeau F, Khodiyar VK, Povey S: Genew: the Human Gene Nomenclature Database, 2004 updates. Nucleic Acids Res 2004, 32(Database issue):D255–7. doi:10.1093/nar/gkh072
- Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP: Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 2003, 4(2):249–264. doi:10.1093/biostatistics/4.2.249
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5(10):R80. doi:10.1186/gb-2004-5-10-r80
- von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 2005, 33(Database issue):D433–7. doi:10.1093/nar/gki005
- Kasprzyk A, Keefe D, Smedley D, London D, Spooner W, Melsopp C, Hammond M, Rocca-Serra P, Cox T, Birney E: EnsMart: a generic system for fast and flexible access to biological data. Genome Res 2004, 14(1):160–169. doi:10.1101/gr.1645104
- R Development Core Team: R: A Language and Environment for Statistical Computing. [http://www.R-project.org/]
- Magalhaes TR, Palmer J, Goodman CS: Axon guidance study in Drosophila embryos.Google Scholar
- Wang J, Kean L, Yang J, Allan AK, Davies SA, Herzyk P, Dow JA: Function-informed transcriptome analysis of Drosophila renal tubule. Genome Biol 2004, 5(9):R69. 10.1186/gb-2004-5-9-r69PubMed CentralView ArticlePubMedGoogle Scholar
- Akdemir F, Christich A, Sogame N, Chapo J, Abrams JM: p53 directs focused genomic responses in Drosophila. Oncogene 2007.Google Scholar
- Dostert C, Jouanguy E, Irving P, Troxler L, Galiana-Arnoux D, Hetru C, Hoffmann JA, Imler JL: The Jak-STAT signaling pathway is required but not sufficient for the antiviral response of drosophila. Nat Immunol 2005, 6(9):946–953. 10.1038/ni1237
- Beckstead RB, Lam G, Thummel CS: The genomic response to 20-hydroxyecdysone at the onset of Drosophila metamorphosis. Genome Biol 2005, 6(12):R99. 10.1186/gb-2005-6-12-r99
- Wijnen H, Naef F, Boothroyd C, Claridge-Chang A, Young MW: Control of daily transcript oscillations in Drosophila by light and the circadian clock. PLoS Genet 2006, 2(3):e39. 10.1371/journal.pgen.0020039
- Zimmerman JE, Rizzo W, Shockley KR, Raizen DM, Naidoo N, Mackiewicz M, Churchill GA, Pack AI: Multiple mechanisms limit the duration of wakefulness in Drosophila brain. Physiol Genomics 2006, 27(3):337–350. 10.1152/physiolgenomics.00030.2006
- Wang X, Bo J, Bridges T, Dugan KD, Pan TC, Chodosh LA, Montell DJ: Analysis of cell migration using whole-genome expression profiling of migratory cells in the Drosophila ovary. Dev Cell 2006, 10(4):483–495.
- Baugh LR, Hill AA, Claggett JM, Hill-Harfe K, Wen JC, Slonim DK, Brown EL, Hunter CP: The homeodomain protein PAL-1 specifies a lineage-specific regulatory network in the C. elegans embryo. Development 2005, 132(8):1843–1854. 10.1242/dev.01782
- Tai SL, Boer VM, Daran-Lapujade P, Walsh MC, de Winde JH, Daran JM, Pronk JT: Two-dimensional transcriptome analysis in chemostat cultures. Combinatorial effects of oxygen availability and macronutrient limitation in Saccharomyces cerevisiae. J Biol Chem 2005, 280(1):437–447.
- Yarragudi A, Parfrey LW, Morse RH: Genome-wide analysis of transcriptional dependence and probable target sites for Abf1 and Rap1 in Saccharomyces cerevisiae. Nucleic Acids Res 2007, 35(1):193–202. 10.1093/nar/gkl1059
- Singh J, Kumar D, Ramakrishnan N, Singhal V, Jervis J, Garst JF, Slaughter SM, DeSantis AM, Potts M, Helm RF: Transcriptional response of Saccharomyces cerevisiae to desiccation and rehydration. Appl Environ Microbiol 2005, 71(12):8752–8763. 10.1128/AEM.71.12.8752-8763.2005
- Sabet N, Volo S, Yu C, Madigan JP, Morse RH: Genome-wide analysis of the relationship between transcriptional regulation by Rpd3p and the histone H3 and H4 amino termini in budding yeast. Mol Cell Biol 2004, 24(20):8823–8833. 10.1128/MCB.24.20.8823-8833.2004
- Hochwagen A, Wrobel G, Cartron M, Demougin P, Niederhauser-Wiederkehr C, Boselli MG, Primig M, Amon A: Novel response to microtubule perturbation in meiosis. Mol Cell Biol 2005, 25(11):4767–4781. 10.1128/MCB.25.11.4767-4781.2005
- Schawalder SB, Kabani M, Howald I, Choudhury U, Werner M, Shore D: Growth-regulated recruitment of the essential yeast ribosomal protein gene activator Ifh1. Nature 2004, 432(7020):1058–1061. 10.1038/nature03200
- Pitkanen JP, Torma A, Alff S, Huopaniemi L, Mattila P, Renkonen R: Excess mannose limits the growth of phosphomannose isomerase PMI40 deletion strain of Saccharomyces cerevisiae. J Biol Chem 2004, 279(53):55737–55743. 10.1074/jbc.M410619200
- Ronald J, Akey JM, Whittle J, Smith EN, Yvert G, Kruglyak L: Simultaneous genotyping, gene-expression measurement, and detection of allele-specific expression with oligonucleotide arrays. Genome Res 2005, 15(2):284–291. 10.1101/gr.2850605
- Takagi Y, Masuda CA, Chang WH, Komori H, Wang D, Hunter T, Joazeiro CA, Kornberg RD: Ubiquitin ligase activity of TFIIH and the transcriptional response to DNA damage. Mol Cell 2005, 18(2):237–243. 10.1016/j.molcel.2005.03.007
- Guan Q, Zheng W, Tang S, Liu X, Zinkel RA, Tsui KW, Yandell BS, Culbertson MR: Impact of nonsense-mediated mRNA decay on the global expression profile of budding yeast. PLoS Genet 2006, 2(11):e203. 10.1371/journal.pgen.0020203
- Kresnowati MT, van Winden WA, Almering MJ, ten Pierick A, Ras C, Knijnenburg TA, Daran-Lapujade P, Pronk JT, Heijnen JJ, Daran JM: When transcriptome meets metabolome: fast cellular responses of yeast to sudden relief of glucose limitation. Mol Syst Biol 2006, 2: 49. 10.1038/msb4100083
- Yu C, Palumbo MJ, Lawrence CE, Morse RH: Contribution of the histone H3 and H4 amino termini to Gcn4p- and Gcn5p-mediated transcription in yeast. J Biol Chem 2006, 281(14):9755–9764. 10.1074/jbc.M513178200
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.