- Open Access
BAMarray™: Java software for Bayesian analysis of variance for microarray data
© Ishwaran et al; licensee BioMed Central Ltd. 2006
- Received: 26 May 2005
- Accepted: 08 February 2006
- Published: 08 February 2006
DNA microarrays open up a new horizon for studying the genetic determinants of disease. The high throughput nature of these arrays creates an enormous wealth of information, but also poses a challenge to data analysis. Inferential problems become even more pronounced as experimental designs used to collect data become more complex. An important example is multigroup data collected over different experimental groups, such as data collected from distinct stages of a disease process. We have developed a method specifically addressing these issues termed Bayesian ANOVA for microarrays (BAM). The BAM approach uses a special inferential regularization known as spike-and-slab shrinkage that provides an optimal balance between total false detections and total false non-detections. This translates into more reproducible differential calls. Spike and slab shrinkage is a form of regularization achieved by using information across all genes and groups simultaneously.
BAMarray™ is a graphically oriented Java-based software package that implements the BAM method for detecting differentially expressing genes in multigroup microarray experiments (up to 256 experimental groups can be analyzed). Drop-down menus allow the user to easily select between different models and to choose various run options. BAMarray™ can also be operated in a fully automated mode with preselected run options. Tuning parameters have been preset at theoretically optimal values freeing the user from such specifications. BAMarray™ provides estimates for gene differential effects and automatically estimates data adaptive, optimal cutoff values for classifying genes into biological patterns of differential activity across experimental groups. A graphical suite is a core feature of the product and includes diagnostic plots for assessing model assumptions and interactive plots that enable tracking of prespecified gene lists to study such things as biological pathway perturbations. The user can zoom in and lasso genes of interest that can then be saved for downstream analyses.
BAMarray™ is user-friendly, platform-independent software that effectively and efficiently implements the BAM methodology. Classifying patterns of differential activity is greatly facilitated by a data adaptive cutoff rule and a graphical suite. BAMarray™ is licensed software freely available to academic institutions. More information can be found at http://www.bamarray.com.
- Gene Label
- Graphical Suite
- File Menu
- Baseline Option
DNA microarray technology allows researchers to estimate the relative expression levels of thousands of genes simultaneously over different time points, different experimental conditions, or different tissue samples. It is the relative abundance of the mRNA genetic product that provides surrogate information about the relative abundance of the cell's proteins. The differences in protein abundance are what characterize phenotypic differences between cells. Identifying such differences (even at the mRNA level) can lead to insight about biological processes and pathways that might be involved in a disease process as well as highlight new potential targets for diagnostic and therapeutic development. See [1–4] for more background on microarrays.
Identifying signal in the presence of abundant noise
While potentially rich in information, microarray data pose a serious statistical challenge due to the sheer volume of information being processed. It is the norm to see data collected on tens of thousands of genes from only a handful of samples. Data analysis is further complicated because of heterogeneity of gene-specific variances and correlation of gene expressions due to biological effect or technological artifact. Although many inferential questions are of interest, a common concern is the detection of differentially expressing genes between experimental groups (e.g., between control samples and treatment samples, or between normal tissue samples and diseased tissue samples). Because of the large number of genes and tests involved, and because of the many inherent sources of noise in microarray data, the potential for Type-I errors or false detections is large. For two-group problems, a common strategy is to control the false discovery rate (FDR) using the method of Benjamini and Hochberg or empirical Bayes methods [7–9]. However, while these methods work well in controlling FDR, the price paid is often a conservativeness that leads to missing important genes. Indeed, in two-group problems, the total number of misclassified genes can be derived in closed form assuming normally distributed data. Such calculations suggest that when the fraction of truly differentially expressing genes is relatively low, total misclassification of differential effects will be large unless FDR is controlled at a high value, thus putting into question the value of such control.
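As a rough illustration of the FDR control step mentioned above, the following is a minimal sketch of the Benjamini and Hochberg step-up procedure applied to a list of per-gene p-values. The function name, the example p-values, and the level q = 0.05 are assumptions made for illustration; none of this is part of BAMarray™.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the null is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten genes.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
calls = benjamini_hochberg(p, q=0.05)
print(sum(calls), "genes called at FDR level 0.05")
```

In this toy input only the two smallest p-values pass, which gives a feel for the conservativeness discussed above when few genes are truly differentially expressing.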
The issues become more complex for multigroup data collected over different experimental groups, such as data collected from distinct stages of a disease process, or time course experiments in which microarrays are used to track gene expression profiles over time (the time points can be thought of as groups). The richness of such data lends itself to a myriad of potential questions and each question brings with it the thorny statistical problems associated with multiple testing. Because of this, most approaches start by simplifying multigroup hypotheses into a composite question that can be tested using a one-dimensional test statistic for each gene. While this is certainly convenient – for example, it makes it possible to apply standard error control methods such as the FDR – the strategy may not be optimal for several reasons. First, the underlying test statistic is likely to be fairly elementary, and thus highly variable because it will not be regularized. That is, the test is not likely to be constructed in a way that uses information across all genes and samples. Regularization is an important concept in microarray settings where sample sizes are small and the number of parameters is large (we will say more on this shortly). Secondly, composite statistics are seriously limited in the information they provide. Consider an F-test analysis involving contrasts for identifying specific patterns of differential expression across groups. For example, consider a gene that differentially expresses early on in a disease process, such as cancer, significantly affecting the biological milieu and making it possible for other genes to act, but then later vanishes. We call this a hit-and-run hypothesis.
A contrast, or set of contrasts, looking for hit-and-run genes would simply provide what is equivalent to a p-value for rejecting the null hypothesis of no such pattern being present, but it would tell you very little about the likelihood of classifying a gene as having a hit-and-run pattern as opposed to some other pattern type.
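To make the limitation of a one-dimensional composite statistic concrete, here is a small sketch that computes the ordinary one-way ANOVA F statistic for a single gene. It uses the omnibus F-test rather than a specific contrast, and the group names and expression values are invented for illustration. Two genes with different patterns across groups receive exactly the same statistic, so the statistic alone cannot say which pattern a gene follows.

```python
def one_way_f(groups):
    """One-way ANOVA F statistic for one gene; groups is a list of lists of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical expression values, groups ordered as (baseline, early, late).
hit_and_run = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [1.0, 2.0, 3.0]]  # up early, then gone
late_onset = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]   # up only late

f_hit = one_way_f(hit_and_run)
f_late = one_way_f(late_onset)
print(f_hit, f_late, f_hit == f_late)
```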
Rescaled spike and slab model selection and regularization
Recently Ishwaran and Rao , building upon work in , introduced a method for detecting differentially expressing genes between multiple groups termed Bayesian ANOVA for microarrays (BAM). This method recasts the statistical problem as a high dimensional model selection problem, and uses a specific Bayesian hierarchical model oriented towards adaptive shrinkage. By using model averaging, a way of accounting for model uncertainty, BAM provides gene effect estimates that are shrunken relative to standard least squares estimates, with primarily only the non-differentially expressing gene effects being shrunken. This is a general phenomenon called selective shrinkage [12, 13] that enables BAM to optimally balance total false detections (the total number of genes falsely identified as being differentially expressed) against total false non-detections (the total number of genes falsely identified as being non-differentially expressed). Selective shrinkage, theoretically, translates into more reproducible differential calls. BAM's ability to selectively shrink gene effects is an important form of regularization and is due to the use of a rescaled spike and slab model introduced by . This model, in combination with a carefully selected continuous bimodal prior (also introduced in ), enables BAM to use data across all genes and all experimental groups to accurately estimate different levels of sparsity (the percentage of genes differentially expressing over a specific experimental group) and then to selectively shrink gene effects based on the estimated complexities. Equivalently, this procedure can be viewed as a penalization method in which each gene effect has a unique penalty term that is adaptively estimated from the data.
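The idea of selective shrinkage can be illustrated with a deliberately simplified toy model. The sketch below is not the rescaled spike and slab model used by BAM: it assumes a single observed effect x with unit variance and a two-component normal mixture prior (a narrow "spike" and a wide "slab"), with the mixing weight and variances chosen arbitrarily. Under these assumptions the posterior mean has a closed form, and small effects are shrunk almost to zero while large effects are left nearly untouched.

```python
import math


def normal_pdf(x, var):
    """Density at x of a normal distribution with mean 0 and variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def spike_slab_posterior_mean(x, w=0.1, v_spike=0.01, v_slab=25.0):
    """Posterior mean of theta when x ~ N(theta, 1) and
    theta ~ (1 - w) * N(0, v_spike) + w * N(0, v_slab).
    All parameter values are illustrative assumptions."""
    # Marginal density of x under each component is N(0, 1 + v).
    m_spike = (1.0 - w) * normal_pdf(x, 1.0 + v_spike)
    m_slab = w * normal_pdf(x, 1.0 + v_slab)
    p_slab = m_slab / (m_spike + m_slab)
    # Within each component the posterior mean is x * v / (1 + v).
    shrink_spike = v_spike / (1.0 + v_spike)
    shrink_slab = v_slab / (1.0 + v_slab)
    return x * ((1.0 - p_slab) * shrink_spike + p_slab * shrink_slab)


for x in (0.5, 2.0, 6.0):
    print(x, round(spike_slab_posterior_mean(x), 3))
```

In BAM the amount of shrinkage is estimated from the data across all genes and groups rather than fixed in advance as it is here.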
The BAM estimation procedure is fully automatic and is based on a Gibbs sampling algorithm. Not only are regularized differential gene effects estimated, but so is an automatic data adaptive cutoff value for determining which genes are differentially expressing. This cutoff value, for large enough sample sizes, has the theoretical property of delineating genes with true differential expression from those genes with no differential activity . This is crucial, since determining an appropriate cutoff value is a critical aspect in searching for differential expression (whatever the method being used).
Another important feature in analyzing microarray data is the ability to systematically deal with heterogeneity of variances across genes and groups. Variance stabilization can lead to tremendous gains in power and is another important aspect of regularization. This issue was discussed in depth in [10, 12] and Ishwaran and Papana (2005). BAMarray™ incorporates a nonparametric Classification and Regression Tree (CART) clustering algorithm described in Ishwaran and Papana (2005) to effectively deal with unequal variances. Of note is that the procedure does not artificially dampen or amplify group differences across genes for the sake of attaining variance stabilization.
Illustrative example: tracking the genomic stagewise development of liver metastatic colon cancer
As a preliminary illustration and motivation for BAMarray™, we look at expression data from a large microarray repository of colon cancer tissue samples comprising various stages of tumor progression. These data were obtained from Sanford Markowitz at the Ireland Cancer Center of Case Western Reserve University. All gene expression data were collected using high density 59K-on-one gene chips developed by EOS Biotechnology. These are Affymetrix-derived chips with proprietary probe sets. The high density of probe sets reflects known genes and ESTs (expressed sequence tags) as well as predicted exons.
BAMarray™ (Release 2.0) is a stand-alone platform independent desktop Java application. Solutions currently exist for the Mac OS X, Linux, and Windows XP operating systems. A native code C library is at the core of the product. This library implements the BAM algorithm and consists of several components including data pre-processing, data variance stabilizing transformations, and the Gibbs sampler. A Java graphical user interface surrounds the native code library and allows the user to interact with the library and conduct customized data analysis.
Installing and uninstalling BAMarray™
BAMarray™ is available for download in the form of a self-extracting executable install package. Details can be found at http://www.bamarray.com. Users must register online in order to download the product. A 30-day evaluation license key will be automatically generated and emailed to the user upon registration (a full production license key will be emailed upon completing a signed license agreement). The user may then download the install package and execute the file according to the operating system specific protocol. The user completes the install by following the prompts generated by the package.
On first run, BAMarray™ will query the user for the license key. Once the key is verified, the product will present the user with the main console from which analysis can proceed.
Uninstalling BAMarray™ is as straightforward as the install process. An uninstall icon is produced during the install process in the product's home directory. Double-clicking on this icon will remove the product from the system. User modified data files will remain, but can be disposed of manually if so desired.
Some key software features
- BAMarray™ is a stand-alone platform independent desktop Java application. Solutions currently exist for the Mac OS X, Linux, and Windows XP operating systems.
- Full multigroup analysis for up to 256 groups can be handled. Overlay multigroup plots (similar to Figure 1) are available for visualizing how genes are mapped to specific pattern types of differential expression across groups.
- Graphical zoom-in and lassoing tools enable the user to interactively generate lists of differentially expressing genes.
- Gene labels can be toggled on or off allowing genes of interest to be readily identified. Genes of interest (such as those making up a biological pathway of interest) can be highlighted using a selection list.
- Gene lists of interest can be exported for further exploration.
- Unequal variances across genes and groups are systematically handled by an automated pre-processing step.
Note on normalization of microarray data
BAMarray™ assumes that the data to be analyzed have been suitably normalized (exact data formats and importing of data are discussed in subsequent sections). Normalization is simply the removal of systematic effects across samples that might bias inference. Two examples are effects due to the batch in which samples were run and the date on which samples were extracted. Normalization can significantly affect microarray inferences. The user is required to provide suitably normalized expression data to BAMarray™. Normalization procedures are currently not provided within the package, but a future release (3.0) will have this capability (see the Discussions section for details).
Normalization methods for two-color array data (such as cDNA arrays) are discussed in . For Affymetrix oligonucleotide arrays, suitable options include the Affymetrix MAS 5.0 analysis suite  or robust multi-array analysis . These, and other, procedures are available in Bioconductor .
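For readers unfamiliar with the idea, the sketch below shows about the simplest possible normalization: centering each sample's log-scale expression values at that sample's median, so that a constant per-sample shift is removed. This is only a stand-in to illustrate "removal of systematic effects across samples"; it is neither MAS 5.0 nor robust multi-array analysis, and the sample names and values are invented.

```python
from statistics import median


def median_center(samples):
    """Subtract each sample's median from its log-scale expression values.

    samples maps a sample name to a list of values, one per gene.
    """
    centered = {}
    for name, values in samples.items():
        m = median(values)
        centered[name] = [v - m for v in values]
    return centered


# Two hypothetical arrays with the same profile, the second shifted by 1.5.
arrays = {
    "sample_1": [6.0, 7.0, 8.0, 9.0, 10.0],
    "sample_2": [7.5, 8.5, 9.5, 10.5, 11.5],
}
centered = median_center(arrays)
print(centered["sample_1"] == centered["sample_2"])
```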
Data formats and importing data files
Illustrative example (bundled data)
The brain tissue dataset shown in Figures 3 and 4 (this will be used for all illustrations henceforth) is a microarray experiment studying hippocampal aging and cognitive impairment. The goal of the experiment was to look for gene expression changes that track aging-dependent cognitive decline. Hippocampal CA1 tissue was collected from 4, 14, and 24 month old male Fischer rats after 7 days training on a water maze which included an object memory task (see  for details). There were 10, 9 and 10 samples collected for the respective age groups. The age groups are labeled as Young, Middle, and Aged. The data are available at the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) data repository under series record accession number GSE854. This dataset comes pre-bundled with the default BAMarray™ installation. The default input directory (initialized when the user first starts the software) contains the brain tissue dataset.
BAMarray™ run settings
Accuracy: Low, Medium, High and Super settings correspond to the number of iterations for the Gibbs sampler. The Gibbs sampler is a Monte Carlo method for estimating parameter values of interest. The more iterations used (i.e., Super vs Low), the more accurate, but the longer the run time. For data exploration, a Medium setting will suffice. However, it is good practice to confirm results at the High or Super setting when possible.
Baseline: This allows the user to define the baseline group for comparison purposes. The rationale for the baseline group is provided in [10, 12]. It is typical to assign a control group or perhaps a normal or preliminary disease state as the baseline group. In our colon cancer example the BSurvivors represent the baseline, whereas in the brain tissue dataset the Young group serves as the baseline (see Figure 6). For time-course data the zero time point might be the most sensible baseline choice. Note that a No Baseline option is also available; details are provided later.
Variance: Equal and Unequal settings. Expression values for genes are expected to have different variances (this is addressed by (c)). This option, however, indicates whether the variability of expression values differs over experimental group as well. The default Equal option implies equal variances across groups. Graphical diagnostic plots (to be discussed shortly) are provided for assessing if this assumption is met. For many applications, an equal variance model will be reasonable. For more details please consult  and Ishwaran and Papana (2005).
Clicking Run initiates the analysis. A status and progress bar at the bottom of the BAMarray™ main console indicate how long the Gibbs sampler will take and when the analysis has successfully completed.
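The trade-off behind the Accuracy setting, more Gibbs iterations for less Monte Carlo error, can be seen in a toy example. The sketch below is not the BAM sampler: it is a textbook Gibbs sampler for a standard bivariate normal with correlation 0.8, used to estimate E[XY], whose true value is 0.8. The iteration counts attached to the Low, Medium, High and Super labels are hypothetical, since the actual counts used by BAMarray™ are not stated here, and no burn-in is used for simplicity.

```python
import random


def gibbs_estimate(n_iter, rho=0.8, seed=0):
    """Estimate E[XY] (true value rho) for a standard bivariate normal by Gibbs sampling."""
    rng = random.Random(seed)
    cond_sd = (1.0 - rho * rho) ** 0.5  # standard deviation of each full conditional
    x, y = 0.0, 0.0
    total = 0.0
    for _ in range(n_iter):
        x = rng.gauss(rho * y, cond_sd)  # draw x given y
        y = rng.gauss(rho * x, cond_sd)  # draw y given x
        total += x * y
    return total / n_iter


# Hypothetical iteration counts for each accuracy label.
for label, n_iter in [("Low", 100), ("Medium", 1000), ("High", 10000), ("Super", 100000)]:
    print(label, n_iter, round(gibbs_estimate(n_iter), 3))
```

Estimates from longer runs tend to sit closer to the true value, at the cost of run time, which is why confirming exploratory results at a higher setting is good practice.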
Data Plots are used to verify the assumption of equal variances. These include (i) cluster diagnostic plots, (ii) standard deviation plots, (iii) group mean plots, and (iv) V-plots (the last three are based on the transformed data).
Inferential Plots are based on estimated parameters from the model and are used for detecting differentially expressing genes. These include color enhanced shrinkage plots of Zcut values for identifying differentially expressing genes for a specific group. Also provided are multigroup Zcut scatter plots (similar to Figure 1 described earlier) for visualizing differentially expressing genes simultaneously over two or more groups.
Data plots for assessing model assumptions: cluster diagnostic plots
Data plots for assessing model assumptions: standard deviation, group mean difference and v-plots
Inferential plots for detecting differentially expressing genes: shrinkage plots
The horizontal axis of the shrinkage plot shows the Zcut gene differential effects, while the vertical axis shows the corresponding posterior variances. Theoretical arguments show that genes that are truly differentially expressing will have posterior variances that coalesce to 1 on the far left and right sides of the plot. As the number of samples increases, eventually all of the truly differentially expressing genes will be found and none of the non-differentially expressing genes will be falsely detected. BAMarray™ uses this principle to determine a data adaptive cutoff value.
Inferential plots for detecting differentially expressing genes: multigroup scatterplots and zooming in
Adding gene labels to plots and saving gene lists
Read in the data. Select the groups for the analysis and the baseline group. Click on Run.
When the analysis is complete, go to the main console under the File Menu and click on Save All Sig Genes.... This will save all significant genes and their classification values.
Plotting options and using the gene tracking facility
BAMarray™ plots can be customized by pulling down the Tools menu item on any graph. This will highlight an Options command, which when activated, will open up a Plot Options window that highlights Preferences. Plotting label and character sizes can be adjusted here. Clicking the Apply button activates the desired changes. The default label and character sizes are 6 pt.
More on assuming equal variances across groups
The no baseline option
There are occasions when fitting a model with a No Baseline option is of interest. This option is accessible under the Tools menu on the main console under Baseline Options. Clicking on No Baseline Selection enables this feature for the session. No baseline means that each gene effect is being tested against a null value of zero (i.e. no detectable effect at all) rather than against a defined baseline group.
Example: tracking tumor progression genes. The colon cancer example presented earlier compared the various stages of colon cancer against the early onset BSurvivors group. In an informal sense, this analysis asks the question "what makes a good tumor go bad?" Another approach would be to identify genes that track the stagewise progression of metastatic colon cancer. This can be done by creating a new response variable, which measures the difference in gene expression between the successive stages of colon cancer, and then using a four-group analysis with no baseline in BAMarray™. The new response values would include modified stage C gene expression data created by comparing C measurements for a gene to some overall summary measurement for the corresponding BSurvivors gene. Similarly there would be a modified stage D measurement designed to measure difference from the D's to the C's. Finally we would have modified METS expression values recording differences between the METS and the D's. Each gene effect could then be tested against the null value of zero and statistical inference would reveal which genes have significant changes in gene expression as a function of stagewise progression of colon cancer.
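The construction of the new response variable can be sketched for a single gene as follows. The stage labels follow the text; the expression values are invented, and using the mean of the preceding stage as the "overall summary measurement" is an assumption, since the text does not say which summary is intended.

```python
from statistics import mean


def stagewise_differences(expr_by_stage, stage_order):
    """For one gene, subtract the mean of the preceding stage from each later stage.

    expr_by_stage maps a stage label to that gene's expression values.
    Returns a dict keyed by the later stage of each successive pair.
    """
    modified = {}
    for prev, curr in zip(stage_order, stage_order[1:]):
        prev_summary = mean(expr_by_stage[prev])  # assumed summary: the mean
        modified[curr] = [v - prev_summary for v in expr_by_stage[curr]]
    return modified


# Hypothetical expression values for one gene.
gene = {
    "BSurvivors": [1.0, 2.0, 3.0],
    "C": [4.0, 5.0],
    "D": [5.0, 7.0],
    "METS": [6.0, 6.0],
}
modified = stagewise_differences(gene, ["BSurvivors", "C", "D", "METS"])
for stage, values in modified.items():
    print(stage, values)
```

The modified values for each stage would then be supplied as that stage's data in a no-baseline analysis, so that each effect is tested against zero.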
The need for high quality software is rapidly growing in the area of genomic research. More powerful and elegant ways to store and analyze data are making mining the vast quantities of data we collect much more manageable and time efficient. Our main objective in producing BAMarray™ was to provide cutting edge statistical tools embedded within a sophisticated and easy-to-use graphical interface. Our goal was to free the user from as many subjective choices as possible and facilitate interactions with their data. While some knowledge of the underlying methodology is certainly useful, our main focus was to delineate the methodological ideas via simple, yet elegant, graphics that would make the software much more approachable for non-statisticians. Yet because the output from our software is stored in text format with clean and simple summary structures, it makes it possible for a more advanced statistical user to interface BAMarray™ with their own favorite products.
BAMarray is a stand-alone desktop Java application that interfaces with a native code C library. The software is highly portable and it is possible to create builds of the software for virtually any operating system. At this time, the software has solutions for the Mac OS X (10.3+), Linux, and the Windows XP operating systems.
BAMarray™ allows for a full multigroup analysis. This facilitates the searching for complex biological patterns such as the hit-and-run patterns of differential expression described earlier in this manuscript. Of course, many other applications are possible. For example, the very large number of experimental groups that can be analyzed (up to 256) facilitates studying expression changes in data settings where group labels could be tissue types collected from multiple regions within an organism (for human data this could be used for a genomic body map analysis for example).
BAMarray™ is nearly automatic in its usage. The user is freed from having to set and (or) choose tuning parameters, many of which can often affect the resulting conclusions . Instead, we appeal to underlying theory to set tuning parameters at theoretically optimal values.
BAMarray™ importantly does not require a user-specified cutoff value for identifying significant genes. This is often the most difficult part of using a statistical software package. The choice of what is deemed significant is often arbitrary and dictated by available resources for follow-up analyses. Instead with BAMarray™, the cutoff values are set by appealing to the underlying theory via a novel shrinkage plot. From this plot, genes with posterior variances coalescing at a value of 1 are guaranteed to be truly differentially expressing with probability tending to 1.
Unequal variances across genes and (or) experimental groups are a common occurrence in multigroup studies. As described, global variance stabilizing transformations can be difficult to find and also unduly affect signal-to-noise ratios. BAMarray™ uses a sophisticated local variance stabilizing CART algorithm which does not suffer from the adverse properties of a global transformation. Importantly, this type of variance stabilization can also be used as a pre-processing step on its own. So even if a user would eventually like to analyze data in another package, variance stabilization can still be handled effectively in BAMarray™.
A no-baseline option in BAMarray™ allows for some non-standard experimental designs to be analyzed. This includes analyzing one-way ANOVA models (for example paired experimental designs), time course gene expression profiles, or perhaps tracking disease progression genes.
A suite of graphical tools is available in BAMarray™. These include diagnostic plots to check for the appropriateness of model assumptions and the adequacy of the pre-processing; zoom-in and lassoing features that allow the user to interactively generate lists of differentially expressing genes; toggling on or off of gene labels; and a gene tracker function that allows pre-specified lists of genes to be interactively tracked for differential expression across experimental groups.
Gene lists can be exported to any software package that can read simple text files. A myriad of possibilities exist for follow-up analyses, but for most users, annotating gene lists would be of first importance. This can be done easily by importing significant gene lists from BAMarray™ into packages like GeneSpring, NetAffx™ or Bioconductor.
All figures can be saved as publication quality color graphics.
Analyses can be done at various levels of accuracy. This amounts to user-control over how many Gibbs sampling iterations are allowed. For most exploratory, first-wave analyses, a lower number of iterations would be sufficient. When conducting confirmatory analyses, a much larger number of iterations can be set.
Our illustrative example involving colon cancer showed how BAMarray™ can be used to track differentially expressing genes in multigroup experiments by statistically mapping genes to unique differential expression pattern types (for example, hit-and-run patterns). An important outcome of this is the ability to group genes by pattern type in order to find more focused underlying biology. We note, however, that higher order analyses like building molecular classifiers or survival outcome predictors can also benefit from this information. In fact, the very patterns that are found can be used in a special way to help build more powerful molecular models. This is work that we will report on shortly.
In addition to this work, the team continues to upgrade the software and a new release, Release 3.0, will soon be made available at http://www.bamarray.com. This major upgrade will contain some important enhancements to the product. For example, the capability to run BAMarray™ in an unattended Batch Mode initiated from a script file will be available. Batch Mode allows users to source BAMarray™ from any application that implements the use of operating system command driven script files. Writing custom designed scripts allows users to interface with different types of software, such as Bioconductor and R, and could be used, for example, to automate the process of normalizing data. Release 3.0 will also have a Save Run feature allowing users to save the results of a run for later retrieval. A run that can take minutes to execute can be restored in only seconds using a Restore Run feature. Save Run can also be triggered in Batch Mode. This unique feature allows users to batch multiple jobs for later retrieval. Finally, Release 3.0 will allow users to populate a tracking list from gene labels found in an existing file. These and many more enhancements will be found in the next release of the product.
BAMarray™ is user-friendly Java-based software that effectively and efficiently implements the BAM methodology for analyzing expression data from multigroup experimental designs. The portability and flexibility of the product make it possible to rapidly adapt BAMarray™ to the highly dynamic field of genomic informatics and to modify the existing product to allow for seamless interface with other software and data mining tools as they become available.
Project name: BAM
Project home page: http://www.bamarray.com
Operating system(s): Windows XP, Linux, Mac OS X (10.3+).
Programming language: Java, C.
Other requirements: 512 MB RAM, 2.0 GHz Pentium 4 CPU, 200 MB free disk space on hard drive, Sun Java™ 2 Runtime Environment, Standard Edition (JRE) 1.4X. For Windows XP, installation must be done by users in the "Administrators" group or "Power Users" group only.
License: Academic and commercial license available from Technology Transfer Office at Case Western Reserve University. Details found at http://www.bamarray.com.
Any restrictions to use by non-academics: License needed.
Hemant Ishwaran was supported by National Science Foundation grant DMS-0405675. J. Sunil Rao was supported by National Institutes of Health career grant K25-CA89867 and National Science Foundation grant DMS-0405072.
- Schena M, Heller RA, Theriault TP, Konrad K, Lachenmeier E, Davis RW: Microarrays: biotechnology's discovery platform for functional genomics. Trends in Biotechnology 1998, 16: 301–306. 10.1016/S0167-7799(98)01219-0
- Schena M, Davis RW: Genes, Genomes and Chips. In DNA Microarrays: A Practical Approach. Edited by: Schena M. Oxford: Oxford University Press; 1999.
- Brown P, Botstein D: Exploring the new world of the genome with DNA microarrays. Nature Genetics 1999, 21: 33–37. 10.1038/4462
- Nguyen D, Arpat AB, Wang N, Carroll RJ: DNA microarray experiments: biological and technological aspects. Biometrics 2002, 58: 701–717. 10.1111/j.0006-341X.2002.00701.x
- Rao JS, Bond M: Microarrays: managing the data deluge. Cir Research 2001, 1226–1227.
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J Royal Stat Soc B 1995, 57: 289–300.
- Efron B, Tibshirani R, Storey JD, Tusher VG: Empirical Bayes analysis of a microarray experiment. J Amer Stat Assoc 2001, 96: 1151–1160. 10.1198/016214501753382129
- Tusher VG, Tibshirani R, Chu G: Significance analysis of microarrays applied to the ionizing radiation response. Proc Nat Acad Science 2001, 98: 5116–5121. 10.1073/pnas.091062498
- Storey JD: A direct approach to false discovery rates. J Royal Stat Soc B 2002, 64: 479–498. 10.1111/1467-9868.00346
- Ishwaran H, Rao JS: Detecting differentially expressed genes in microarrays using Bayesian model selection. J Amer Stat Assoc 2003, 98: 438–455. 10.1198/016214503000224
- Genovese C, Wasserman L: Operating characteristics and extensions of the FDR procedure. J Royal Stat Soc B 2002, 64: 499–517. 10.1111/1467-9868.00347
- Ishwaran H, Rao JS: Spike and slab gene selection for multigroup microarray data. J Amer Stat Assoc 2005, 100: 764–780. 10.1198/016214505000000051
- Ishwaran H, Rao JS: Spike and slab variable selection: frequentist and Bayesian strategies. Ann Statist 2005, 33: 730–773. 10.1214/009053604000001147
- Hoffman R, Seidi T, Dugas M: Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis. Genome Biology 2002, 3: 1–11.
- Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res 2002, 30: e15.
- Affymetrix: Affymetrix Microarray Suite User Guide, Version 5. Santa Clara 2002.
- Gautier L, Cope L, Bolstad BM, Irizarry RA: affy – analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 2004, 20: 307–315. 10.1093/bioinformatics/btg405
- Ihaka R, Gentleman R: R: A language for data analysis and graphics. J Comp and Graph Statist 1996, 5: 299–314.
- Blalock EM, Chen KG, Sharrow K, Herman JP, Porter NM, Foster TC, Landfield PWY: Gene microarrays in hippocampal aging: statistical profiling identifies novel process correlated with cognitive impairment. J of Neuroscience 2003, 23: 3807–3819.
- Larsson O, Wahlestedt C, Timmons JA: Considerations when using the significance analysis of microarrays (SAM) algorithm. BMC Bioinformatics 2005, 6(129):1–6.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.