Visualization of large microarray experiments with space maps
BMC Bioinformatics volume 10, Article number: O7 (2009)
Heatmaps and profile plots are effective techniques to visualize expression profiles of several hundred genes across a few dozen samples. However, these techniques do not scale to data sets in which expression profiles have been measured across several hundred or even thousands of samples. Our motivation to solve this scaling problem is the observation that, with increasingly mature and affordable microarray platforms, the number of studies in ArrayExpress that include hundreds of samples has increased steadily over the years.
We have developed the glyph-based Space Maps visualization technique, which is conceptually similar to Value and Relation Displays. The technique comprises two steps: (1) generation of glyphs to represent gene expression profiles and (2) arrangement of the glyphs to reflect relationships between genes. Both steps support the integration of biological knowledge into the visualization, for instance in the form of ontologies that describe hierarchical relationships among the conditions in the data. We also use a hierarchical organization of samples and aggregation of expression levels to summarize expression values of groups of samples, which enables the user to reduce the amount of data shown on each glyph. Similar to treemaps, this construction makes it possible to start out with an overview of the data and then view details on demand.
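The aggregation of expression levels over a sample hierarchy can be illustrated with a minimal sketch. It is not the authors' implementation: the nested-dictionary hierarchy, the function names `leaf_samples` and `aggregate`, the use of the mean as the summary statistic, and the toy data are all assumptions made for illustration.

```python
from statistics import mean


def leaf_samples(node):
    """Collect all sample indices below a hierarchy node.

    A node is either a list of sample indices (a leaf group) or a
    dict mapping child names to child nodes (hypothetical encoding).
    """
    if isinstance(node, list):
        return list(node)
    samples = []
    for child in node.values():
        samples.extend(leaf_samples(child))
    return samples


def aggregate(profile, tree, depth):
    """Summarize one gene's profile as the mean expression per sample group.

    depth=1 summarizes the top-level groups (overview); larger depths
    descend into subgroups where they exist (details on demand).
    """
    summary = {}
    for name, child in tree.items():
        if depth <= 1 or isinstance(child, list):
            summary[name] = mean(profile[i] for i in leaf_samples(child))
        else:
            for sub_name, value in aggregate(profile, child, depth - 1).items():
                summary[f"{name}/{sub_name}"] = value
    return summary


# Toy hierarchy over six samples (assumed, not taken from the data set).
sample_tree = {
    "tissue": {"brain": [0, 1], "liver": [2, 3]},
    "cell line": [4, 5],
}
profile = [2.0, 4.0, 6.0, 8.0, 1.0, 3.0]  # one gene, six samples

overview = aggregate(profile, sample_tree, depth=1)
detail = aggregate(profile, sample_tree, depth=2)
```

In this sketch, a glyph drawn from `overview` would show two values for the gene, while one drawn from `detail` would show three; choosing the depth is how the amount of data per glyph would be reduced or expanded.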
We have applied the Space Maps visualization to a data set with 5,372 samples (Margus Lukk, personal communication). This data set has been constructed from a large collection of publicly available gene expression data sets, and a problem-specific hierarchy on the samples is available. We selected the 1,000 most variable genes from this data set and visualized this subset with our technique (Figure 1). The arrangement of the glyphs provides an overview of the global patterns in the data, such as clusters and outliers. Furthermore, the visualization provides insight into local patterns in the gene expression profiles. Since global patterns arise directly from local patterns, we were able to explain several of the clusters and outliers and assign meaningful labels to them.
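The gene selection step can be sketched as follows. The abstract does not state which measure of variability was used, so ranking by population variance is an assumption here, as are the function name `most_variable_genes` and the toy expression values.

```python
from statistics import pvariance


def most_variable_genes(expression, k):
    """Return the names of the k genes with the highest variance across samples.

    `expression` maps a gene name to its list of expression values.
    Variance is an assumed variability measure, not one named in the source.
    """
    ranked = sorted(
        expression,
        key=lambda gene: pvariance(expression[gene]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: three genes measured across four samples.
expression = {
    "geneA": [1.0, 1.0, 1.0, 1.0],  # constant profile
    "geneB": [0.0, 4.0, 0.0, 4.0],  # strongly varying profile
    "geneC": [1.0, 2.0, 1.0, 2.0],  # mildly varying profile
}

top = most_variable_genes(expression, k=2)
```

With the real data set, `k` would be 1,000 and each profile would hold 5,372 values; the ranking logic stays the same.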
The Space Maps visualization technique is a novel approach to gene expression data that displays expression profiles measured across hundreds or thousands of samples without loss of context information. A major strength of this technique is that it allows a tightly coupled exploration of local and global patterns, which makes hypothesis generation more efficient than with traditional techniques.
Parkinson H, et al.: ArrayExpress update – from an archive of functional genomics experiments to the atlas of gene expression. Nucleic Acids Res 2009, 37(Database issue):D868-D872. 10.1093/nar/gkn889
Yang J, et al.: Value and Relation Display – Interactive visual exploration of large data sets with hundreds of dimensions. IEEE Transactions on Visualization and Computer Graphics 2007, 13: 494–507. 10.1109/TVCG.2007.1010
Johnson B, Shneiderman B: Tree maps: A space-filling approach to the visualization of hierarchical information structures. Proceedings of the 2nd International IEEE Visualization Conference 1991, 284–291.
Venna J, Kaski S: Non-linear dimensionality reduction as information retrieval. Proceedings of the 11th International Conference on Artificial Intelligence and Statistics (AISTATS 2007) 2007, 568–575.