- Research article
- Open Access
Integrative investigation of metabolic and transcriptomic data
© Pir et al; licensee BioMed Central Ltd. 2006
- Received: 29 November 2005
- Accepted: 12 April 2006
- Published: 12 April 2006
New analysis methods are being developed to integrate data from transcriptome, proteome, interactome, metabolome, and other investigative approaches. At the same time, existing methods are being modified to serve the objectives of systems biology and permit the interpretation of the huge datasets currently being generated by high-throughput methods.
Transcriptomic and metabolic data from chemostat fermentors were collected with the aim of investigating the relationship between these two data sets. The variation in transcriptome data in response to three physiological or genetic perturbations (medium composition, growth rate, and specific gene deletions) was investigated using linear modelling, and open reading-frames (ORFs) whose expression changed significantly in response to these perturbations were identified. Assuming that the metabolic profile is a function of the transcriptome profile, expression levels of the different ORFs were used to model the metabolic variables via Partial Least Squares (Projection to Latent Structures – PLS) using PLS toolbox in Matlab.
The experimental design allowed the analyses to discriminate between the effects which the growth medium, dilution rate, and the deletion of specific genes had on the transcriptome and metabolite profiles. Metabolite data were modelled as a function of the transcriptome to determine their congruence. The genes that are involved in central carbon metabolism of yeast cells were found to be the ORFs with the most significant contribution to the model.
- Partial Least Squares
- Deletion Mutant
- Dilution Rate
- Gene Deletion
- Glucose Consumption
After the completion of the genomic sequencing of organisms, integrative post-genomic studies and the systems biology approach have emerged with the aim of developing a more complete understanding of cell physiology. Attempts at data integration for the model organism, Saccharomyces cerevisiae were reviewed recently . Experimental designs that involve (a) perturbations to elucidate the response of the cell under various conditions, (b) collection of high-throughput data at different functional genomic levels and (c) the use of bioinformatics for integrating data from all three levels of analysis (transcriptome, proteome, and metabolome) constitute the three major steps of a procedure common to all integrative studies.
It is possible to design systems biology experiments in a hypothesis-driven manner, such that the designed perturbations provide the information of interest. Alternatively, question-driven discoveries may be made by observing the effects of an intuitively chosen modification and making use of the extracted information to generate new ideas and hypotheses .
Transcriptome data from S. cerevisiae growing in chemostats on a glucose medium under carbon, nitrogen, phosphorus or sulphur limitation allowed detection of the genes that were affected by the different nutrient limitations . The genes that were co-regulated under glucose, ethanol, ammonium or phosphate limitation were identified, and genes from the same pathway were shown to be clustered together . Responses to modifications in the growth medium and/or the dilution rate allowed the identification of genes that enable the cells to adapt to various growth conditions .
Perturbations can also be introduced by genetic, rather than physiological, means – e.g. by gene deletions. Yeast cells carrying gene deletions have been investigated for various purposes: (a) functional analysis based on discrimination of mutants via metabolic fingerprints  or footprints , (b) selection of genes encoding organelle-specific proteins , (c) building and testing of metabolic pathways  and (d) identification of uncharacterized genes and drug targets . These studies have shown that specific changes in the transcriptome or metabolome profiles may occur due to gene deletion. The changes are expected to be more significant when a gene encoding a regulator protein is deleted.
Hap4p was reported to have a function in the regulation of respiration-related genes on the basis of transcriptome data collected during batch growth of yeast cells on glucose, followed by diauxic shift from the fermentation of glucose to the respiratory metabolism of ethanol . The activation mechanism of the Hap2/3/4/5 protein complex has been reviewed by Gancedo, 1998 . The physiology of haploid cells exhibiting HAP4 over-expression  and the transcriptome profile of haploid hap4 Δ deletion mutants  have also been investigated. hap4 Δ deletion mutants were reported to be respiratory deficient  and deletion of HAP4 causes down-regulation of respiration-related genes. In contrast, such genes were expressed at higher levels in HAP4-overexpressing strains growing under aerobic conditions. Moreover, an increase in yeast's respiratory capacity was observed due to over-expression of HAP4 .
In the present study, three types of perturbation that were expected to have an impact on yeast central metabolism were investigated in chemostat cultures. The perturbations studied were changes in growth medium (C- and N-limitation), growth rate (dilution rate) and gene deletions (hap4 and ho). Transcriptome profiles, biomass, glucose and ethanol concentrations of samples from chemostats operating under steady-state conditions were analysed to show the applicability of the Partial Least Squares (Projection to Latent Structures – PLS) method in the integration of transcriptome and metabolite data.
The PLS method linearly models a set of dependent "response" data with respect to a set of independent "cause" data while regressing both of the sets simultaneously. PLS was recently used to analyse transcriptome data for classification of samples from human tumours [15] and classification of patients by their survival time [16]. In another study, genes expressed periodically within the cell cycle were determined using PLS [17]. Design of experiments and PLS were used for establishing dose- and time-dependent metabolic variations in animals treated with toxic materials [18, 19].
Modelling expression levels of ORFs
Linear modelling was used as a filtering tool to eliminate the ORFs with insignificant expression changes in response to the perturbations in growth medium, dilution rate and gene deletion. Mean-centred and scaled (unit variance) expression levels of 6361 ORFs were modelled, and p-values were calculated to decide on the significance of the effects of the factors on the expression of the ORFs. For most of the ORFs, the constructed models did not predict a variation more significant than the expected level of random error; these ORFs were therefore not included in further analyses. A p-value of 0.05 was used as the threshold, in order to include all ORFs that were affected significantly by the three factors considered in this study.
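The pre-processing step described above (mean-centring and scaling to unit variance) can be illustrated with a minimal sketch. The expression values below are hypothetical, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def autoscale(values):
    """Mean-centre a list of expression values and scale it to unit variance."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (assumed estimator)
    if s == 0:
        raise ValueError("a constant profile cannot be scaled to unit variance")
    return [(v - m) / s for v in values]

# Hypothetical log2 expression levels of one ORF across the eight chemostat runs.
orf_profile = [7.1, 7.4, 9.0, 9.3, 7.0, 7.6, 9.1, 9.5]
scaled = autoscale(orf_profile)
```

After this transformation every ORF contributes on the same scale, so a highly expressed gene does not dominate the model simply because its absolute signal is larger.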
For 324 of the 6361 models, at least one of the factors had a significant effect on the expression of the modelled ORF. Growth medium was the factor with the largest effect for most of these ORFs (62.1%), followed by dilution rate (26.2%), while gene deletion was the dominant factor for only 11.7% of the ORFs.
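The filtering and tallying step can be sketched as follows. The ORF names, p-values and effects are invented for illustration, and taking the "dominant" factor to be the one with the largest absolute effect is an assumption; the text does not define how the most effective factor was chosen.

```python
from collections import Counter

P_THRESHOLD = 0.05  # significance threshold quoted in the text

# Hypothetical per-ORF results: (p-value, estimated effect) for each factor.
models = {
    "ORF_A": {"medium": (0.001, 2.1), "dilution": (0.200, 0.3), "deletion": (0.600, -0.1)},
    "ORF_B": {"medium": (0.300, 0.2), "dilution": (0.010, -1.4), "deletion": (0.700, 0.1)},
    "ORF_C": {"medium": (0.400, 0.1), "dilution": (0.500, 0.2), "deletion": (0.020, -1.8)},
    "ORF_D": {"medium": (0.450, 0.1), "dilution": (0.800, 0.0), "deletion": (0.300, 0.2)},
}

def significant_orfs(models, threshold=P_THRESHOLD):
    """Keep ORFs for which at least one factor has a p-value below the threshold."""
    return {orf: factors for orf, factors in models.items()
            if any(p < threshold for p, _effect in factors.values())}

def dominant_factor(factors):
    """Return the factor with the largest absolute effect (assumed criterion)."""
    return max(factors, key=lambda name: abs(factors[name][1]))

kept = significant_orfs(models)
tally = Counter(dominant_factor(factors) for factors in kept.values())
```

Dividing each count in `tally` by `len(kept)` gives percentages of the kind reported above.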
Integration of metabolic and transcriptomic data
Table 1. 2^3 factorial experiment design
ho Δ G1
hap4 Δ G1
ho Δ N1
hap4 Δ N1
ho Δ G2
hap4 Δ G2
ho Δ N2
hap4 Δ N2
Table 2. Factors and experimental conditions
Homozygous diploid, ho Δ/ho Δ
Homozygous diploid, hap4 Δ/hap4 Δ
Table 3. Proportion of the variation explained by each latent variable
Columns: % variation (X); cumulative % variation (X); % variation (Y); cumulative % variation (Y)
The variation generated by the change in dilution rate was represented by LV3 in transcriptome data (25%, Fig. 4B). Variation generated by the change in dilution rate was represented weakly by LV3 and LV4 in metabolic variables (2.8 and 8.2%, Fig. 4D). Thus, the effect of dilution rate on metabolic variables is not successfully modelled by the transcriptome data.
Each of the latent variables that model the response of the metabolic variables to perturbations using the transcriptome data represents the variation in the data set in one of the perturbations applied in the present analysis (except for the variance generated by dilution rate perturbation, which is represented by two latent variables for the metabolic data). For instance, the projection of samples onto LV1 represents the change that was generated in the sample by ammonium limitation when compared to glucose limitation. The direction of each new variable (LV) in the space is a linear combination of the original variables, i.e. ORFs and metabolites. The direction of an LV is dominated by the variables that respond more than the others and the direction of their response. Thus, an LV can be interpreted as a new composite variable that is the only affected feature in the cell when a certain perturbation is applied. As an exception, for the dilution rate change, two latent variables are needed in order to discriminate the metabolic samples from two different dilution rates.
All response variables have positive loadings on LV3 and LV4 (Fig. 5B), and therefore would be expected to have higher values in samples with positive scores on LV3 and LV4. Indeed, all response variables increased at the higher dilution rate (Fig. 1B). However, this behaviour cannot be predicted by the model as some of the scores of metabolic samples from higher dilution rates are not positive on LV3 and LV4 (Fig. 4D).
Analysis of ORFs with significant contribution
Loadings of the ORFs on the latent variables were investigated to unravel the relationship between the transcriptome and response variables (Figs 5C–F). The ORFs with positive loadings on an LV are up-regulated in samples with positive scores on that LV while they are down-regulated in samples with negative scores. On the other hand, the ORFs with negative loadings on an LV are down-regulated in samples with positive scores on that LV.
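The sign rule stated above can be written as a small helper. This is only an illustrative sketch: the function name is hypothetical, and "up" or "down" is meant relative to the mean of the mean-centred data.

```python
def expected_regulation(loading, score):
    """Direction of an ORF's expression in a sample along one latent variable.

    The contribution of an LV to the (mean-centred) expression value is the
    product score * loading, so its sign gives the direction of the change.
    """
    contribution = loading * score
    if contribution > 0:
        return "up"
    if contribution < 0:
        return "down"
    return "no change"

# An ORF with a negative loading in a sample with a positive score on that LV.
direction = expected_regulation(loading=-0.4, score=1.2)
```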
The variance in LV1 represents the differences due to the medium factor. The genes with positive loadings on LV1, which are up-regulated under ammonium limitation when compared to glucose limitation, are expected to be the genes that mediate the increase in biomass production, glucose consumption, and ethanol production rates. Similarly, the genes with negative loadings on LV1 are most likely to be the genes that are up-regulated under glucose limitation causing the decrease in these response variables. The ORFs with positive loadings on LV2 are up-regulated in ho Δ/ho Δ deletion mutants, since samples from such mutants have positive scores on LV2. These ORFs are expected to mediate the changes in the rates of biomass production, glucose consumption and ethanol production in the hap4 Δ/hap4 Δ deletion mutants as compared to ho Δ/ho Δ deletion mutants.
Table 4. ORFs and GO terms with highest contributions to the LVs
Columns: LV group; ORFs with significant loadings; Biological Process GO Terms
LV1+
ORFs: HXT1, MNT4, HXT3, YER028C, YJL132W, YGL157W, ALD1, ZRT2
GO terms: establishment of localization
LV1-
ORFs: GSY1, MBR1, ISF1, GDB1, MAL33, QCR8, GLG1, PIG1, YDL157C, CBP4, GPH1, HXK1, GAC1, YPR196W, YLR327C, PRX1, QCR9, PCL7, MAL31, BAP2, INH1, MRK1, YOL053W, YKL187C, YMR103C, MTH1, MCR1, YGR243W, PRS2, ROM1, COX8, COX4, YJR008W, YNL274C, HOR2, COX7, YPL099C, ATP18, QCR10, CNM67, ATP5, ACN9, COX12, COX6
GO terms: generation of precursor metabolites and energy; energy derivation by oxidation of organic compounds; ATP synthesis coupled electron transport (sensu Eukaryota)
LV2+
ORFs: QCR8, PRX1, QCR9, INH1, MCR1, COX8, COX4, COX12, COX6, COX7, ATP18, QCR10, ATP5, AMS1, HAP4, RPM2, PHM8, FBP26, ATP15, YMR034C, YOR220W, TUF1, COR1, ATP3, YNL122C, ATP7, ATP17, ATP20, HXT1, YER028C, YJL132W
GO terms: generation of precursor metabolites and energy
LV2-
ORFs: PIG1, BAP2, MRK1, PRS2, UBP14, MKC7
GO terms: regulation of carbohydrate biosynthesis; regulation of carbohydrate metabolism; regulation of cellular biosynthesis; regulation of biosynthesis
LV3+
ORFs: PRS2, CBP4, RPL7A, YOR314W, ALD1, ATP5, COX6
GO terms: purine ribonucleotide biosynthesis; purine ribonucleotide metabolism; purine nucleotide biosynthesis
LV3-
ORFs: PIG1, ROM1, CNM67, YBL112C, MSC2, YOL153C, UBI4, RAD2, CHS1, MNT4, ZRT2, PRX1, AMS1, PHM8, FBP26, YMR034C, YOR220W, HXT1, YJL132W
LV4+
ORFs: MSC2, PRX1, AMS1, PHM8, YMR034C, HXT1, MTH1, YPL099C, TRS23, CYC7, ZRG17, YLR431C, GPG1, YFL034W, PKH1, HXT3, YER028C
GO terms: establishment of localization
LV4-
ORFs: CNM67, MRK1, MKC7, YDL157C, YDR119W, QCR8, QCR9, INH1, COX8, COX4, HAP4, ALD1
GO terms: ATP synthesis coupled electron transport (sensu Eukaryota); ATP synthesis coupled electron transport
The ORFs with significant positive loadings on the first latent variable (LV1+) are involved in hexose transport; these ORFs are up-regulated under ammonium limitation in comparison to carbon limitation. Up-regulation of the hexose transport pathway may be the first step of the mechanism to increase biomass production, ethanol production and glucose consumption rates under nitrogen limitation. The ORFs that are down-regulated under ammonium limitation as compared to glucose limitation (LV1-) are genes that are active in oxidative phosphorylation and in the generation of precursor metabolites and energy. The high glucose concentration in the ammonium-limited culture should repress the expression of genes acting on the respiratory pathways (oxidative phosphorylation). Repression of respiration, in turn, would cause the fermentation pathway to be activated and ethanol production to be enhanced.
The genes that were down-regulated due to the hap4 Δ/hap4 Δ deletion when compared to the standard strain ho Δ/ho Δ mainly have roles in respiration and phosphate metabolism (LV2+). Consequently, the hap4 Δ/hap4 Δ deletion causes respiratory deficiency under the conditions studied, and fermentation is the only route for glucose metabolism. The higher glucose consumption and ethanol production rates achieved provide further confirmation that Hap4p plays a major role in the switch mechanism from respiration to fermentation. The genes that were up-regulated in response to the hap4 Δ/hap4 Δ deletion (LV2-) were involved in regulation of carbohydrate biosynthesis, indicating the propensity of the cells to convert excess carbon into storage molecules if the carbon source cannot be respired. While this explanation accounts well for the results from the glucose-limited case, the effect of the hap4 Δ/hap4 Δ deletion is not apparent under ammonium-limited conditions, as the high glucose levels in the N-limited medium repress respiration quite independently of the respiratory deficiency caused by the hap4 Δ/hap4 Δ deletion. Thus the metabolic variables behave similarly in ho Δ/ho Δ and hap4 Δ/hap4 Δ mutants growing under ammonium limitation, and the insignificant variation in these variables cannot be estimated by the model, as discussed previously.
The ORFs up-regulated at the higher dilution rate (LV3+) are genes that act in ribonucleotide metabolism. Up-regulation of these ORFs mediates the increase in biomass production rate, and the consequent increases in ethanol production and glucose consumption rates. The GO terms common among the ORFs down-regulated at the higher dilution rate (LV3-) are related to reproduction mechanisms. The significance of these terms is quite low (p ~ 10^-2); however, up-regulation of these mechanisms at the lower growth rate is an interesting phenomenon that remains to be explained.
The over-representation of the GO terms "generation of precursor metabolites and energy" and "oxidative phosphorylation" among the genes in groups LV1- and LV2+ (Table 4) is highly significant (p < 1.0 × 10^-12). The unknown ORFs that appear in the same group as these genes may also be members of the functional categories denoted by the over-represented GO terms.
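The text does not state which test produced the GO term p-values. A common choice for over-representation analysis is the one-sided hypergeometric test, sketched below purely as an assumption; the counts in the example are hypothetical apart from the 6361 ORFs on the array.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a list of n.

    N is the number of ORFs in the background, K of which carry the GO term.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

# Hypothetical example: 10 of 44 listed genes carry a term that annotates
# 200 of the 6361 ORFs on the array.
p_value = hypergeom_pvalue(k=10, n=44, K=200, N=6361)
```

A very small p-value indicates that the overlap between the gene list and the GO term is unlikely to have arisen by chance.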
Discrimination of the effects of the above factors on transcriptome and metabolic data.
Modelling of metabolic data as a function of transcriptome data and elucidation of the extent of congruence between these two data sets.
Identification of ORFs that mediate the changes in metabolic data in response to perturbations.
In cases where the number of variables in the metabolic data is much higher, the PLS method will help in the identification of metabolites that are affected by the conditions applied and the genes that mediate the effects of the conditions. The unknown genes can be annotated using this methodology and studies towards product maximization can be conducted by identifying the genes and pathways that are responsible for the changes in formation of metabolic products.
Experimental materials and methods
Deletion strains of S. cerevisiae with genomic background BY4743 (MATa/MAT α his3 Δ1/his3 Δ1 leu2 Δ0/leu2 Δ0 lys2 Δ0/LYS2 MET15/met15 Δ0 ura3 Δ0/ura3 Δ0) from the Yeast Genome Deletion Project library  were used. The ho Δ/ho Δ deletant is commonly used as a standard strain in control experiments since the deletion has no measurable impact on either flux (growth rate, ) or the metabolome . The absence of the HO or HAP4 genes in a strain's genome was verified using PCR-based methods.
Mineral media supplemented with trace elements and vitamins were used. The composition of the media was as follows: KH2PO4 (2 g/l), MgSO4·7H2O (0.55 g/l), NaCl (0.1 g/l), CaCl2·2H2O (0.09 g/l), uracil (0.02 g/l), L-histidine (0.02 g/l), L-leucine (0.1 g/l), ZnSO4·7H2O (0.7 × 10^-4 g/l), CuSO4·5H2O (0.1 × 10^-4 g/l), H3BO3 (0.1 × 10^-4 g/l), KI (0.1 × 10^-4 g/l), FeCl3·6H2O (0.5 × 10^-4 g/l), inositol (0.12 g/l), thiamine/HCl (0.014 g/l), pyridoxine (0.004 g/l), Ca-pantothenate (0.004 g/l), biotin (0.0003 g/l).
For the glucose-limited medium, 3.13 g/l (NH4)2SO4 and 2.5 g/l glucose were added to the medium described above. For the ammonium-limited medium, 0.46 g/l (NH4)2SO4 and 20 g/l glucose were added to the medium described above.
The fermentors were autoclaved, and the media were filter-sterilized prior to inoculation. Pre-cultures (10 ml) were grown overnight in G418-containing YPD medium and used to inoculate the fermentors. The medium was fed at a constant flow rate. Temperature and pH of chemostats with 1L working volume were kept constant at 30°C and 4.5 respectively, and the oxygen content was maintained at saturation.
Homozygous ho Δ/ho Δ and hap4 Δ/hap4 Δ deletion strains were grown in both glucose-limited and ammonium-limited media in separate experiments. The experiments were started at a dilution rate of 0.1 h^-1 and, after samples had been collected, the dilution rate was shifted to 0.2 h^-1. The samples were collected at steady state after three residence times, and total RNA extraction was carried out. Yeast Genome S98 arrays were used for hybridizations as described by the manufacturer (Affymetrix, USA, 2003). Supernatants were analyzed enzymatically for glucose and ethanol content (using kits from Boehringer-Mannheim, Germany).
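As a worked example of the sampling schedule, the residence time of a chemostat is the reciprocal of the dilution rate, so three residence times correspond to about 30 h at the lower dilution rate and about 15 h at the higher one. The short sketch below shows the arithmetic; the function name is hypothetical.

```python
def hours_before_sampling(dilution_rate_per_h, residence_times=3):
    """Time needed to complete the given number of residence times (1/D each)."""
    if dilution_rate_per_h <= 0:
        raise ValueError("dilution rate must be positive")
    return residence_times / dilution_rate_per_h

wait_low = hours_before_sampling(0.1)   # lower dilution rate used in this study
wait_high = hours_before_sampling(0.2)  # higher dilution rate used in this study
```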
Samples (5 ml) were centrifuged in pre-weighed tubes, dried at 80°C overnight and re-weighed to determine the dry weight of biomass.
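The dry weight calculation amounts to dividing the mass gain of the tube by the sample volume. A minimal sketch follows, with hypothetical tube masses chosen only to illustrate the units.

```python
def dry_weight_g_per_l(tube_empty_g, tube_dried_g, sample_volume_ml=5.0):
    """Biomass dry weight in g/l from the tube mass before and after drying."""
    biomass_g = tube_dried_g - tube_empty_g
    return biomass_g / (sample_volume_ml / 1000.0)

# Hypothetical weighings: a 6.2500 g tube that weighs 6.2565 g after drying.
biomass = dry_weight_g_per_l(6.2500, 6.2565)
```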
Factorial experimental design is used to reveal the effects of various factors on the output of a system. For a set of experiments with k factors, each at a levels, a^k experiments are needed to cover all possible combinations. The 2^3 factorial design used in this work is given in Table 1. Eight experiments were conducted to investigate all combinations of the factors. Levels of the factors and the corresponding experimental conditions are given in Table 2.
The abbreviations used for the homozygous ho Δ/ho Δ and hap4 Δ/hap4 Δ deletion strains are "ho Δ" and "hap4 Δ", respectively. "G" and "N" are the abbreviations for the glucose-limited and ammonium-limited cases, while "1" and "2" denote the dilution rates 0.1 h^-1 and 0.2 h^-1, respectively. In the Figures, "Δ" is omitted and the abbreviations ho and hap4 represent the deletion mutants.
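The eight runs of the design can be enumerated mechanically. The sketch below builds every combination of the three two-level factors and the corresponding labels, using the figure-style abbreviations (without "Δ"); the dictionary keys are illustrative names, not taken from the source.

```python
from itertools import product

# Two levels for each of the three factors, abbreviated as in the text.
factors = {
    "strain": ["ho", "hap4"],   # homozygous deletion strains
    "medium": ["G", "N"],       # glucose- or ammonium-limited
    "dilution": ["1", "2"],     # dilution rates 0.1 and 0.2 per hour
}

def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

runs = full_factorial(factors)
labels = [f"{r['strain']} {r['medium']}{r['dilution']}" for r in runs]
```

With two levels and three factors the list contains 2^3 = 8 runs, one per row of Table 1.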
The linear model for a factorial experiment with three factors is as follows :
$$y_{ijk} = \mu + \tau_i + \beta_j + \gamma_k + \varepsilon_{ijk} \tag{1}$$
where i, j, k: indices of the levels of the factors; i = 1, 2, ..., a; j = 1, 2, ..., b; k = 1, 2, ..., c; μ: mean of the outputs; y: simulated value of the output variable; τ, β, γ: effects of the factors; ε: random error. In the present experimental design, each factor has two levels; thus, a = b = c = 2. The output variable y represents the expression level of an ORF on a log2 scale.
The linear model in Eq.(1) was used to estimate the coefficients to describe the expression level of each ORF. The coefficients obtained in each case are the "effects" of the factors on the expression of the ORF modelled. A positive effect made by a factor indicates that the ORF is up-regulated at Level (+) when compared to Level (-) of that factor. Similarly, a negative effect indicates down-regulation of the ORF at Level (+) when compared to Level (-).
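A minimal sketch of how the coefficients of Eq.(1) might be estimated for one ORF is given below. It assumes that the two levels of each factor are coded as -1 and +1 (which condition is Level (+) is an arbitrary choice here), in which case the least-squares coefficient of a factor is half the difference between the mean response at Level (+) and at Level (-). The response values are synthetic.

```python
from itertools import product
from statistics import mean

# Coded 2^3 design: one tuple per run, one -1/+1 entry per factor
# (assumed order: medium, dilution rate, gene deletion).
design = list(product((-1, 1), repeat=3))

def fit_main_effects(design, y):
    """Least-squares fit of y = mu + sum(coef_f * x_f) for a balanced -1/+1 design."""
    n = len(y)
    n_factors = len(design[0])
    mu = mean(y)
    # Columns of a balanced two-level design are orthogonal, so each
    # coefficient is simply sum(x * y) / n.
    coefs = [sum(run[f] * yi for run, yi in zip(design, y)) / n
             for f in range(n_factors)]
    return mu, coefs

# Synthetic log2 expression values generated from known coefficients.
true_mu, true_coefs = 8.0, (1.5, -0.5, 0.0)
y = [true_mu + sum(c * x for c, x in zip(true_coefs, run)) for run in design]
mu, coefs = fit_main_effects(design, y)
```

A positive coefficient then means that the ORF is up-regulated at Level (+) of that factor, and a negative one that it is down-regulated.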
P-values of the factors were calculated using the ratio of variation sum of squares to error sum of squares in order to indicate the significance of the correlation between the gene expression and the factor.
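The significance calculation can be sketched as an F ratio for each factor. This assumes the main-effects model of Eq.(1) fitted to eight unreplicated runs with -1/+1 coding, which leaves 8 - 4 = 4 degrees of freedom for error; the response values are hypothetical. The p-value would then be read from the F distribution with 1 and 4 degrees of freedom, for which the tabulated 5% critical value is about 7.71.

```python
from itertools import product
from statistics import mean

design = list(product((-1, 1), repeat=3))  # coded levels of the three factors

def f_ratios(design, y):
    """F ratio (factor mean square / error mean square) for each main effect."""
    n, n_factors = len(y), len(design[0])
    mu = mean(y)
    coefs = [sum(run[f] * yi for run, yi in zip(design, y)) / n
             for f in range(n_factors)]
    fitted = [mu + sum(c * x for c, x in zip(coefs, run)) for run in design]
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ms_error = ss_error / (n - (n_factors + 1))
    # A two-level factor has one degree of freedom, so its sum of squares
    # (n * coef^2 under -1/+1 coding) is also its mean square.
    return [n * c ** 2 / ms_error for c in coefs]

# Hypothetical log2 expression values dominated by the first factor.
y = [7.0, 7.2, 7.1, 6.9, 9.0, 9.1, 8.9, 9.2]
f_values = f_ratios(design, y)
```

In this example only the first factor gives an F ratio far above the critical value, so only that factor would be called significant for this ORF.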
Partial least squares
In industrial processes, large sets of process data are collected by computerized control and monitoring systems. Multivariate data analysis methods have emerged to meet the need for data reduction in understanding the nature of the process and in fault diagnosis. Partial least squares (projection to latent structures – PLS) is a statistical method that was proposed for process analysis, monitoring and diagnosis [26–28]. Later on, this method was employed as one of the standard tools of chemometrics in analytical chemistry.
In PLS methodology, the independent "cause" matrix X and the dependent "response" matrix Y are regressed and modelled simultaneously. The columns of these matrices represent the variables (genes and response variables in X and Y, respectively) and rows represent the samples. The linear model is:
$$Y = XB + E \tag{2}$$
where B is the matrix of regression coefficients and E is the residual matrix.
Projection of the original data set X to a new space with reduced dimensions is made by the loading matrix (p) and the observations are represented by the score matrix (t) in the new space. Decomposition of the data matrix X into the score matrix (t), the loading matrix (p) and the residual matrix (e) is as follows:
$$X = t\,p^{t} + e \tag{3}$$
where the superscript "t" denotes the transpose of the matrix p. Columns of p and t matrices correspond to the latent variables (LVs), which lie in the direction of the maximum variation that remains in the data after removal of the variation explained by the previous LV. The residual matrix (e) represents the variation that remains unrepresented in the t and p matrices. Similarly, the response matrix Y is decomposed as:
$$Y = u\,q^{t} + f \tag{4}$$
The score vectors (vectors of u) and the loading vectors (vectors of q) correspond to LVs. The residuals are given by the f matrix. A linear inner relation also exists between the matrices t and u, where ū denotes the matrix of estimated values of u:
$$\bar{u} = b\,t \tag{5}$$
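The decompositions in Eqs.(3)–(5) can be illustrated with a small NIPALS-style sketch in plain Python. This is a generic textbook variant written for illustration and is not the algorithm of the Matlab PLS toolbox used in this study; the function names and the toy data are assumptions. X and Y are expected to be mean-centred lists of rows.

```python
from math import sqrt

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def unit(v):
    norm = sqrt(dot(v, v))
    if norm == 0.0:
        raise ValueError("zero vector: no variation left to model")
    return [x / norm for x in v]

def mat_vec(M, v):    # M v
    return [dot(row, v) for row in M]

def mat_t_vec(M, v):  # transpose(M) v
    return [sum(row[j] * vi for row, vi in zip(M, v)) for j in range(len(M[0]))]

def nipals_pls(X, Y, n_lv, tol=1e-12, max_iter=500):
    """Extract n_lv latent variables; return them with the X and Y residuals."""
    x_res = [row[:] for row in X]  # plays the role of e in Eq.(3)
    y_res = [row[:] for row in Y]  # plays the role of f in Eq.(4)
    lvs = []
    for _lv in range(n_lv):
        u = [row[0] for row in y_res]      # start from the first response column
        t_prev = None
        for _it in range(max_iter):
            w = unit(mat_t_vec(x_res, u))  # X weights
            t = mat_vec(x_res, w)          # X scores
            q = unit(mat_t_vec(y_res, t))  # Y loadings
            u = mat_vec(y_res, q)          # Y scores
            if t_prev is not None:
                diff = [a - b for a, b in zip(t, t_prev)]
                if dot(diff, diff) <= tol * dot(t, t):
                    break
            t_prev = t
        tt = dot(t, t)
        p = [x / tt for x in mat_t_vec(x_res, t)]  # X loadings
        b = dot(u, t) / tt                         # inner relation, Eq.(5)
        # Remove the variation explained by this LV before extracting the next.
        x_res = [[x - ti * pj for x, pj in zip(row, p)] for row, ti in zip(x_res, t)]
        y_res = [[y - b * ti * qj for y, qj in zip(row, q)] for row, ti in zip(y_res, t)]
        lvs.append({"t": t, "p": p, "u": u, "q": q, "b": b})
    return lvs, x_res, y_res

# Toy mean-centred data: four samples, two X variables, one response (y = x1 + x2).
X = [[-2.0, -1.0], [-1.0, 1.0], [1.0, -1.0], [2.0, 1.0]]
Y = [[-3.0], [0.0], [0.0], [3.0]]
lvs, x_left, y_left = nipals_pls(X, Y, n_lv=2)
```

In this toy case two latent variables reproduce both matrices, so the residuals are numerically zero; with real data the residuals carry the variation not captured by the retained LVs.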
The optimal number of LVs to be included in the model depends on the amount of variation explained by the LVs which are in descending order of the variation they explain. One criterion for the selection of an optimal number of LVs is to set a threshold value for the variation. Then, a sufficient number of LVs is included in the model to represent the threshold variation, and the rest of the variation remains in the residual matrix. Cross-validation is another criterion where the analysis is performed with a subset of the data and the rest of the data set is used to determine the prediction power of the model. Then, the number of LVs that results in minimum prediction error sum of squares (PRESS) is selected.
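Both selection criteria can be expressed in a few lines. The percentages and PRESS values below are hypothetical and serve only to show the logic; the function names are illustrative.

```python
from itertools import accumulate

def lvs_for_threshold(pct_variation, threshold_pct):
    """Smallest number of LVs whose cumulative % variation reaches the threshold."""
    for n_lv, cumulative in enumerate(accumulate(pct_variation), start=1):
        if cumulative >= threshold_pct:
            return n_lv
    return None  # the threshold is not reached with the available LVs

def lvs_for_min_press(press_by_n_lv):
    """Number of LVs (counting from 1) that gives the smallest PRESS."""
    best_index = min(range(len(press_by_n_lv)), key=press_by_n_lv.__getitem__)
    return best_index + 1

# Hypothetical values, listed in order of extraction of the LVs.
n_by_variance = lvs_for_threshold([40.0, 30.0, 20.0, 5.0], threshold_pct=85.0)
n_by_press = lvs_for_min_press([5.2, 3.1, 2.4, 2.9])
```

The two criteria need not agree; cross-validation guards against retaining LVs that explain variation in the training data but do not improve prediction.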
This work was supported by Bogazici University Research Fund through projects 03S108, 03A504, 04HA503D, and by DPT-03K120250. The scholarship provided for PP by The Turkish Scientific and Technical Research Council (TUBITAK-BAYG) is gratefully acknowledged. We acknowledge grants from the Wellcome Trust (062350/2/00) and the COGEME (19F13036, 918882) grant made to SGO under the 'Investigating Gene Function' Initiative of the UK Biotechnology and Biological Sciences Research Council. We thank Leanne Wardleworth for technical assistance.
References
1. Castrillo JI, Oliver SG: Yeast as a touchstone in post-genomic research: strategies for integrative analysis in functional genomics. J Biochem Mol Biol 2004, 37: 93–106.
2. Lockhart DJ, Winzeler EA: Genomics, gene expression and DNA arrays. Nature 2000, 405: 827–836. 10.1038/35015701
3. Boer VM, de Winde JH, Pronk JT, Piper MDW: The genome-wide transcriptional responses of Saccharomyces cerevisiae grown on glucose in aerobic chemostat cultures limited for carbon, nitrogen, phosphorus, or sulfur. J Biol Chem 2003, 278: 3265–3274.
4. Wu J, Zhang N, Hayes A, Panoutsopoulou K, Oliver SG: Global analysis of nutrient control of gene expression in Saccharomyces cerevisiae during growth and starvation. Proc Natl Acad Sci USA 2004, 101: 3148–3153. 10.1073/pnas.0308321100
5. Hayes A, Zhang N, Wu J, Butler PR, Hauser NC, Hoheisel JD, Lim FL, Sharrocks AD, Oliver SG: Hybridization array technology coupled with chemostat culture: Tools to interrogate gene expression in Saccharomyces cerevisiae. Methods 2002, 26: 281–290. 10.1016/S1046-2023(02)00032-4
6. Raamsdonk LM, Teusink B, Broadhurst D, Zhang N, Hayes A, Walsh MC, Berden JA, Brindle KM, Kell DB, Rowland JJ, Westerhoff HV, van Dam K, Oliver SG: A functional genomics strategy that uses metabolome data to reveal the phenotype of silent mutations. Nat Biotechnol 2001, 19: 45–50. 10.1038/83496
7. Allen J, Davey HM, Broadhurst D, Heald JK, Rowland JJ, Oliver SG, Kell DB: High-throughput classification of yeast mutants for functional genomics using metabolic footprinting. Nat Biotechnol 2003, 21: 692–696. 10.1038/nbt823
8. Steinmetz LM, Scharfe C, Deutschbauer AM, Mokranjac D, Herman ZS, Jones T, Chu M, Giaever G, Prokisch H, Oefner PJ, Davis RW: Systematic screen for human disease genes in yeast. Nat Genet 2002, 31: 400–404.
9. Ideker T, Thorsson V, Ranish JA, Christmas R, Buhler J, Eng JK, Bumgarner R, Goodlett DR, Aebersold R, Hood L: Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 2001, 292: 929–933. 10.1126/science.292.5518.929
10. Hughes TR, Marton MJ, Jones AR, Roberts CJ, Stoughton R, Armour CD, Bennett HA, Coffey E, Dai H, He YD, Kidd MJ, King AM, Meyer MR, Slade D, Lum PY, Stepaniants SB, Shoemaker DD, Gachotte D, Chakraburtty K, Simon J, Bard M, Friend SH: Functional discovery via a compendium of expression profiles. Cell 2000, 102: 109–126. 10.1016/S0092-8674(00)00015-5
11. DeRisi JL, Iyer VR, Brown PO: Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 1997, 278: 680–686. 10.1126/science.278.5338.680
12. Gancedo JM: Yeast carbon catabolite repression. Microbiol Mol Biol Rev 1998, 62: 334–361.
13. Blom J, de Mattos JT, Grivell LA: Redirection of the respiro-fermentative flux distribution in Saccharomyces cerevisiae by overexpression of the transcription factor Hap4p. Appl Environ Microbiol 2000, 66: 1970–1973. 10.1128/AEM.66.5.1970-1973.2000
14. Buschlen S, Amillet JM, Guiard B, Fournier A, Marcireau C, Bolotin-Fukuhara M: The S. cerevisiae HAP complex, a key regulator of mitochondrial function, coordinates nuclear and mitochondrial gene expression. Comp Funct Genom 2003, 4: 37–46. 10.1002/cfg.254
15. Nguyen DV, Rocke DM: Tumor classification by partial least squares using microarray gene expression data. Bioinformatics 2002, 18: 39–50. 10.1093/bioinformatics/18.1.39
16. Nguyen DV, Rocke DM: Partial least squares proportional hazard regression for application to DNA microarray survival data. Bioinformatics 2002, 18: 1625–1632. 10.1093/bioinformatics/18.12.1625
17. Johansson D, Lindgren P, Berglund A: A multivariate approach applied to microarray data for identification of genes with cell cycle-coupled transcription. Bioinformatics 2003, 19: 467–473. 10.1093/bioinformatics/btg017
18. Azmi Y, Griffin JL, Shore RF, Johansson E, Nicholson JK, Holmes E: Metabolic trajectory characterisation of xenobiotic-induced hepatotoxic lesions using statistical batch processing of NMR data. The Analyst 2002, 127: 271–276. 10.1039/b109430k
19. Antti H, Ebbels TMD, Keun HC, Bollard ME, Beckonert O, Lindon JC, Nicholson JK, Holmes E: Statistical experimental design and partial least squares regression analysis of biofluid metabonomic NMR and clinical chemistry data for screening of adverse drug effects. Chemometrics and Intelligent Laboratory Systems 2004, 73: 139–149. 10.1016/j.chemolab.2003.11.013
20. Dolinski K, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel SR, Fisk DG, Hirschman JE, Hong EL, Nash R, Oughtred R, Theesfeld CL, Binkley G, Lane C, Schroeder M, Sethuraman A, Dong S, Weng S, Miyasato S, Andrada R, Botstein D, Cherry JM: Saccharomyces Genome Database. [http://www.yeastgenome.org/]
21. Yeast Genome Deletion Project [http://www.sequence.stanford.edu/group/yeast_deletion_project/deletions3.html]
22. Baganz F, Hayes A, Marren D, Gardner DCJ, Oliver SG: Suitability of replacement markers for functional analysis studies in Saccharomyces cerevisiae. Yeast 1997, 13: 1563–1573. 10.1002/(SICI)1097-0061(199712)13:16<1563::AID-YEA240>3.0.CO;2-6
23. Oliver SG, Winson MK, Kell DB, Baganz F: Systematic functional analysis of the yeast genome. Trends Biotechnol 1998, 16: 373–378. 10.1016/S0167-7799(98)01214-1
24. Affymetrix: Affymetrix GeneChip expression analysis technical manual. Affymetrix Inc 2000.
25. Montgomery DC: Design and Analysis of Experiments. 5th edition. New York: John Wiley and Sons; 2001.
26. Geladi P, Kowalski BR: Partial least-squares regression: A tutorial. Anal Chim Acta 1986, 185: 1–17. 10.1016/0003-2670(86)80028-9
27. Kourti T, MacGregor JF: Process analysis, monitoring and diagnosis, using multivariate projection methods. Chemometrics and Intelligent Laboratory Systems 1995, 28: 3–21. 10.1016/0169-7439(94)00079-X
28. Wold S, Sjostrom M, Eriksson L: PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems 2001, 58: 109–130. 10.1016/S0169-7439(01)00155-1
29. Hopke PK: The evolution of chemometrics. Anal Chim Acta 2003, 500: 365–377. 10.1016/S0003-2670(03)00944-9
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.