Gene expression analysis using microarrays has opened new insights into the living cell and revolutionized biological research in many fields. The expression of a whole system can be measured at once, yielding information about the mRNA level of every gene. Microarrays have become a standard tool for gene expression measurement in biology and medicine. Their applications range from the identification of gene expression changes in different states of the cell cycle, through the classification of disease types, to drug development. Although microarrays are widely used, coping with the immense amount of data they generate remains a fundamental challenge. Special software packages have therefore been developed to handle the analysis of microarray data. Still, we think that many of the existing tools are not optimal with respect to usability and integration. To date, most freely available programs split the data analysis into two parts: in the first, statistical methods are used to identify lists of 'interesting' genes; in the second, these lists are searched for biological relevance. Although these two steps depend on each other and should be highly interconnected, most current analysis tools do not integrate them. In the following, we give an overview of selected tools.
One of the most sophisticated software packages for microarray data analysis is the Bioconductor toolkit, based on the R statistical programming language. Most algorithms developed for microarray data analysis are available within this package. Unfortunately, Bioconductor is a text-driven command-line tool and does not provide an easy-to-use graphical interface. It therefore offers advanced analysis methods and easy extensibility only to expert users, and is difficult to use for people unskilled in R. Results may be misinterpreted if users do not understand the data they are working with or how the analysis is performed. To solve this problem, several tools have been developed that wrap the Bioconductor toolkit for easier use. AMDA is an R package providing a graphical user interface and a workflow for the analysis of Affymetrix microarray data. CARMAWeb acts as a web-based user interface, making the Bioconductor modules available for data analysis over the internet.
Besides Bioconductor, other data analysis tools are available. Expression Profiler offers an integrated, web-based approach to microarray data analysis. Various methods for normalization, filtering, between-group testing, clustering, cluster comparison and GO term enrichment analysis are available. Expression Profiler integrates these methods in an application-like web interface. GEPAS is another widely used web-based approach to microarray data analysis. In addition to the functionality of Expression Profiler, it offers class prediction methods, survival analysis and multiple tree viewers. GEPAS functionality is split into a number of tools, connected by a common file format. Its user interface is more web-styled than that of Expression Profiler, which makes it more complicated for untrained users.
Other tools are not web-based but are installed on the local machine. EXPANDER includes biclustering methods and analysis methods for regulatory elements. TM4 is a collection of four programs covering all computational steps of microarray analysis. It includes spot detection/image analysis, data normalization and data analysis, linked together by a common file format. The data analysis part includes, among other methods, support vector machines, gene shaving and relevance networks.
All these programs focus on data analysis, but most of them lack tools for interpreting the results. Only GEPAS, with Babelomics, offers an approach to data interpretation. On the other hand, there are tools that focus on the interpretation of analysis results. Among many others, WebGestalt offers biological term enrichment analysis, protein domain tables, tissue expression analysis, links to chromosome location and text-mining analysis. The widely used DAVID allows enrichment analysis for GO categories, pathway enzymes, protein domains and other biological terms. Cytoscape supports the integration of network information with microarray gene expression data. Other tools for acquiring gene set information are MAPPFinder, GFINDer and Pathway Explorer. The Ensembl annotation system ENSMART allows the user to search and retrieve genome information for sets of genes, but does not help in exploring the information associated with the gene set. All these tools provide annotation capabilities; their drawback is that they do not support an integrated analysis. They require precalculated gene sets as input and therefore depend on other tools for normalization, clustering and subset determination.
To interpret microarray analysis results with the tools described above, researchers first need to obtain a list of differentially expressed genes from an analysis program and then feed this list into an interpretation program to obtain biological information about the results. This may be feasible for a small number of experiments, but becomes time-consuming and complicated for larger numbers.
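The two-step workflow can be illustrated with a minimal sketch. It is not taken from any of the tools discussed here: the gene names, expression values, threshold and the category `pathway_x` are invented for illustration, and Welch's t statistic and a one-sided hypergeometric test merely stand in for whatever statistics a real analysis or interpretation program would apply.

```python
from math import comb
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    standard_error = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / standard_error


def select_genes(expr, group_a, group_b, threshold):
    """Step 1 (analysis): keep genes whose |t| exceeds the threshold."""
    selected = set()
    for gene, values in expr.items():
        a = [values[i] for i in group_a]
        b = [values[i] for i in group_b]
        if abs(welch_t(a, b)) > threshold:
            selected.add(gene)
    return selected


def enrichment_p(selected, category, universe):
    """Step 2 (interpretation): one-sided hypergeometric p-value P(X >= k)."""
    n_universe = len(universe)
    n_category = len(category & universe)
    n_selected = len(selected)
    k = len(selected & category)
    upper = min(n_category, n_selected)
    tail = sum(
        comb(n_category, i) * comb(n_universe - n_category, n_selected - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_universe, n_selected)


# Toy expression matrix (invented): columns 0-2 are condition A, 3-5 condition B.
expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],
    "g2": [2.0, 2.1, 1.9, 4.0, 4.2, 3.9],
    "g3": [1.0, 1.1, 0.9, 1.0, 1.2, 0.8],
    "g4": [5.0, 5.2, 4.9, 5.1, 4.8, 5.0],
    "g5": [2.0, 2.2, 1.8, 2.1, 1.9, 2.0],
    "g6": [3.0, 3.1, 2.9, 3.0, 2.8, 3.2],
}
pathway_x = {"g1", "g2", "g3"}  # hypothetical annotation category

# The gene list produced by step 1 is the only input passed on to step 2.
selected = select_genes(expr, [0, 1, 2], [3, 4, 5], threshold=4.0)
p_value = enrichment_p(selected, pathway_x, set(expr))
print(sorted(selected), p_value)
```

In the sketch the gene list is the only link between the two steps. When these steps live in separate programs, that list has to be exported and re-imported by hand for every experiment.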
As we were unhappy with the separation of analysis and interpretation, we developed our own tool, GEPAT. GEPAT offers combined genome, expression and pathway analysis and interpretation methods. Our idea was to integrate gene expression data evaluation with the cellular regulation and interaction network. Therefore, we provide gene annotation for the probes on the microarray and allow analysis results to be visualized on metabolic pathways and gene interaction networks. GEPAT includes different biological databases, making them directly usable in data analysis and interpretation. Because a large number of databases requires a lot of disk space and the analysis methods demand considerable computing power, we developed GEPAT as a web-based toolkit. GEPAT offers an application-like user interface with a menu bar and dialog boxes for ease of use. Its design as a server system allows installation and use on a single computer, installation on a web server for use within a workgroup, or installation on a web server connected to a computer cluster for large user groups. GEPAT is distributed under the LGPL and can be freely downloaded; an installation on our server can be used by academic users. For an easy start, GEPAT provides a video tutorial covering the basic steps and offers online help for most functions. For a first impression of GEPAT, a guest login is available, preloaded with microarray data from cancer type classification and cancer subgroup profiling of diffuse large B-cell lymphoma, including chromosomal alteration information. All figures in this paper are based on the B-cell lymphoma dataset.