Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data
- Christoph Bartenhagen^{1},
- Hans-Ulrich Klein^{1},
- Christian Ruckert^{1},
- Xiaoyi Jiang^{2} and
- Martin Dugas^{1}
DOI: 10.1186/1471-2105-11-567
© Bartenhagen et al; licensee BioMed Central Ltd. 2010
Received: 19 January 2010
Accepted: 18 November 2010
Published: 18 November 2010
Abstract
Background
Visualization of DNA microarray data in two or three dimensional spaces is an important exploratory analysis step in order to detect quality issues or to generate new hypotheses. Principal Component Analysis (PCA) is a widely used linear method to define the mapping between the high-dimensional data and its low-dimensional representation. During the last decade, many new nonlinear methods for dimension reduction have been proposed, but it is still unclear how well these methods capture the underlying structure of microarray gene expression data. In this study, we assessed the performance of the PCA approach and of six nonlinear dimension reduction methods, namely Kernel PCA, Locally Linear Embedding, Isomap, Diffusion Maps, Laplacian Eigenmaps and Maximum Variance Unfolding, in terms of visualization of microarray data.
Results
A systematic benchmark, consisting of Support Vector Machine classification, cluster validation and noise evaluations, was applied to ten microarray and several simulated datasets. Significant differences between PCA and most of the nonlinear methods were observed in two and three dimensional target spaces. With an increasing number of dimensions and an increasing number of differentially expressed genes, all methods showed similar performance. PCA and Diffusion Maps were less sensitive to noise than the other nonlinear methods.
Conclusions
Locally Linear Embedding and Isomap showed a superior performance on all datasets. In very low-dimensional representations and with few differentially expressed genes, these two methods preserve more of the underlying structure of the data than PCA, and thus are favorable alternatives for the visualization of microarray data.
Background
DNA microarrays allow the measurement of transcript abundances for thousands of genes in parallel. Applications in quality assessment and interpretation of such high-dimensional data by clustering [1, 2] and visualization [3, 4] make use of algorithms that reduce its dimension. Two and three dimensional visualizations are often a good way to get a first impression of the properties or quality of a dataset, or of special patterns within the data: they can show clusters such as diseased and healthy patients, reveal outliers or a high level of noise, and help to generate hypotheses for further experimentation [5–8]. In general, there are two different approaches to reduce a dataset's dimension. Feature selection methods [9–11] compute a ranking on all genes by means of some given score and pick a gene subset based on this ranking. Feature extraction methods define a mapping between the high-dimensional input space and a low-dimensional target space of a given dimension. Both approaches are used in machine learning. Most classification algorithms use many or all features in a complex (nonlinear) manner, whereas approaches like [12, 13] are based on the relative expression of only two or three genes to overcome the "black box" character of the other classifiers, and thus make it easy to trace the genes leading to the classification result. On the other hand, applications like the visualization of high-dimensional data may profit from extracting information from all features. As a result, feature extraction methods are usually better suited for low-dimensional representations of the whole data. In the following, we refer to feature extraction methods when speaking of dimension reduction techniques.
Considering visualization, these kinds of mappings are often unsupervised, because they do not use further information about the data, such as class labels, and thus allow an unbiased view of the structure within the data. Supervised methods are more applicable to improving classification or regression procedures, assuming that fewer non-differential or noisy features remain after the mapping.
All features that are related to special properties of the data, or to a separation into classes or clusters, often lie in a subspace of lower (intrinsic) dimension within the original data. A 'good' dimension reduction technique should preserve most of these features and generate data with characteristics similar to those of the high-dimensional original. For example, classifications should work at least as well on the low-dimensional representation, and clusters within the reduced data should also be found, preferably more distinctly. Principal Component Analysis (PCA) is a widely used unsupervised method to define this mapping from high- to low-dimensional space. The availability of large datasets with high-dimensional data, especially in biological research (e.g. microarrays), has led to many new approaches in recent years.
Other studies that deal with the assessment of dimension reduction techniques either compare them against the background of classification [14–18], and hence mainly discuss supervised methods like Partial Least Squares [19, 20], Sliced Inverse Regression [21] or other regression models [22], or come from computer vision and deal with text, image, video or artificial data like the Swiss Roll [23–28]. This study, instead, focuses on microarray data and its two and three dimensional visualization. We compare PCA to six recent unsupervised methods to find out if and under which conditions they are able to outperform PCA. In the following sections, we describe a benchmark, consisting of classifications and cluster validations, to compare the visualization performance of seven dimension reduction techniques on ten real microarray and several simulated datasets. After some technical details in the methods section, we present and discuss all results, based on one representative dataset. Further details of the other nine datasets are available in the supplement.
Methods
Dimension Reduction
Seven unsupervised dimension reduction techniques were compared within this study: Principal Component Analysis (PCA), Kernel PCA (KPCA), Isomap (IM), Maximum Variance Unfolding (MVU), Diffusion Maps (DM), Locally Linear Embedding (LLE) and Laplacian Eigenmaps (LEM). These dimension reduction techniques can be divided into two groups: linear and nonlinear methods. While PCA belongs to the former, due to a linear combination of the input data, the other six methods were designed with respect to data lying on or near a nonlinear submanifold in the higher dimensional input space and perform a nonlinear mapping.
Given an input space ℝ^{D} and a target space ℝ^{d} (with d << D), let X ∈ ℝ^{N×D} be an input dataset of N samples and D features (gene expression values) and Y ∈ ℝ^{N×d} its low-dimensional representation. A dimension reduction technique is a mapping Φ: ℝ^{D} → ℝ^{d} that optimizes a cost function ε: ℝ^{d} → ℝ on the target space. This problem can often be reduced to an eigenvalue problem, whose eigenvectors define the embedding Y.
Principal Component Analysis
PCA seeks orthogonal directions of maximal variance in the data. The first direction is the unit vector $w_1 = \arg\max_{\Vert w \Vert = 1} w^{\prime} C w$; w_{2}, ..., w_{d} are chosen in the same way, but orthogonal to each other (here, C ∈ ℝ^{D×D} denotes the covariance matrix of the data X). So, the principal components p_{i} = Xw_{i} explain most of the variance in the data. Before mapping the data, the samples in X were centered by subtracting their mean. Since PCA only considers the variance among samples, it works best if those features that are relevant for class labeling account for a large part of the variance. Sometimes, the first two or three principal components are not sufficient for a good representation of the data [26]. This can lead to a high target dimensionality and prevent a well-suited visualization. Furthermore, the covariance matrix grows rapidly for high-dimensional input data. To overcome this issue, we substituted the covariance matrix by the matrix of squared Euclidean distances $D_E = \frac{1}{N} X X^{\prime}$ ($D_E \in \mathbb{R}^{N \times N}$) [14, 31].
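The substitution of the D × D covariance matrix by an N × N matrix can be illustrated with a short sketch. It assumes NumPy, centers each feature, and uses the hypothetical function name `pca_gram`; it is not the authors' Matlab code, only one way to obtain the same principal component scores from the smaller eigenvalue problem.

```python
import numpy as np

def pca_gram(X, d):
    """Return the first d principal component scores of X (N x D)
    using the N x N matrix X X' / N instead of the D x D covariance."""
    Xc = X - X.mean(axis=0)                  # center every feature (assumption)
    N = Xc.shape[0]
    G = Xc @ Xc.T / N                        # N x N, cheap when N << D
    eigvals, eigvecs = np.linalg.eigh(G)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]    # indices of the d largest
    lam = np.maximum(eigvals[order], 0.0)    # guard against tiny negatives
    # If Xc = U S V', then G = U S^2 U' / N and the scores Xc V equal U S.
    return eigvecs[:, order] * np.sqrt(N * lam)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 1000))          # 20 samples, 1000 "genes"
    print(pca_gram(X, 2).shape)
```

For microarray data with a few hundred samples and tens of thousands of features, the eigenvalue problem shrinks accordingly. The scores agree with covariance-based PCA up to the sign of each component.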
Kernel PCA
To make PCA more suitable for nonlinear data, Kernel PCA (KPCA) maps the data into a higher dimensional feature space before applying the same optimization as PCA [32, 33]. The mapping can be done implicitly by using a kernel function. The Gaussian kernel $K(x_i, x_j) = \exp\left(-\frac{\Vert x_i - x_j \Vert^2}{\sigma^2}\right)$ was applied in our study.
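As a rough illustration of the kernel trick described here, the following NumPy sketch builds the Gaussian kernel matrix with the σ² scaling given above, centers it in feature space and takes the leading eigenvectors. The function names are hypothetical and the sketch follows the textbook formulation of Kernel PCA rather than the toolbox implementation used in the study.

```python
import numpy as np

def gaussian_kernel(X, sigma):
    """K(x_i, x_j) = exp(-||x_i - x_j||^2 / sigma^2) for all sample pairs."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / sigma ** 2)

def kernel_pca(X, d, sigma):
    """Embed the N samples of X into d dimensions with Kernel PCA."""
    K = gaussian_kernel(X, sigma)
    N = K.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N      # centering matrix
    Kc = J @ K @ J                           # center the data in feature space
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:d]
    lam = np.maximum(eigvals[order], 0.0)
    return eigvecs[:, order] * np.sqrt(lam)  # projections of the samples

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(15, 200))
    print(kernel_pca(X, 2, sigma=20.0).shape)
```

The kernel width σ has to be chosen on the scale of the pairwise distances in the data; very small values make the kernel matrix close to the identity and the embedding uninformative.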
Isomap
In Isomap, the discrepancy between the graph-based (geodesic) distances in the input space and the distances in the target space is minimized, with $D_Y(i,j) = \Vert y_i - y_j \Vert^2$ being the pairwise distance matrix of neighbors y_{i}, y_{j} in the target space.
Previous work in [23] addressed problems in visualizing datasets consisting of several well separated clusters. Since Isomap is known to suffer from holes in the underlying manifold [14], it is suggested to modify the method by selecting the k/2 nearest and the k/2 farthest neighbors when constructing the graph, instead of the k nearest neighbors. Both IM and IM(mod) will be discussed in the results section.
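The difference between IM and IM(mod) lies only in how the neighborhood graph is built. The sketch below is a compact NumPy version of the standard Isomap pipeline (neighborhood graph, shortest paths, classical Multidimensional Scaling) with an optional `modified` flag for the k/2 nearest and k/2 farthest rule. Function names, the symmetric graph and the Floyd-Warshall shortest path step are assumptions made for illustration, not details taken from the packages used in the study.

```python
import numpy as np

def neighbor_indices(dist_row, i, k, modified=False):
    """Indices of the neighbors of sample i given its distances to all samples."""
    order = np.argsort(dist_row)
    order = order[order != i]                # never link a sample to itself
    if not modified:
        return order[:k]                     # IM: k nearest neighbors
    half = k // 2                            # IM(mod): nearest and farthest
    return np.concatenate([order[:half], order[-(k - half):]])

def isomap(X, d, k, modified=False):
    N = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    D = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0))
    G = np.full((N, N), np.inf)              # graph distances, inf = no edge
    np.fill_diagonal(G, 0.0)
    for i in range(N):
        nbrs = neighbor_indices(D[i], i, k, modified)
        G[i, nbrs] = D[i, nbrs]
        G[nbrs, i] = D[i, nbrs]              # keep the graph symmetric
    for m in range(N):                       # Floyd-Warshall shortest paths
        G = np.minimum(G, G[:, [m]] + G[[m], :])
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is not connected; increase k")
    J = np.eye(N) - np.ones((N, N)) / N
    B = -0.5 * J @ (G ** 2) @ J              # classical MDS on geodesics
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(30, 100))
    print(isomap(X, 2, k=6).shape, isomap(X, 2, k=6, modified=True).shape)
```

With well separated clusters, the k nearest neighbors of a sample usually lie in its own cluster, so the plain graph can fall apart into disconnected components. Linking each sample to some of its farthest neighbors keeps the graph connected, which is the motivation for the modified variant.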
Maximum Variance Unfolding
Based on the same concept, MVU shares some weaknesses with Isomap like suffering from erroneous connections in the graph.
Diffusion Maps
The term $\Psi(x_i) = \frac{\sum_{j} \widehat{W}(i,j)}{\sum_{j,l} \widehat{W}(j,l)}$ leads to stronger weighting of samples from dense areas in the graph. Since the diffusion distance between two points is computed over all possible paths in the graph, Diffusion Maps are more robust to noise.
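A small numerical example may help to read the density term. The sketch assumes that the weight matrix Ŵ is already available as a NumPy array; how Ŵ is constructed is not shown here, and `density_term` is a hypothetical helper name.

```python
import numpy as np

def density_term(W_hat):
    """Psi(x_i): row sum of W_hat divided by the sum of all entries."""
    W_hat = np.asarray(W_hat, dtype=float)
    return W_hat.sum(axis=1) / W_hat.sum()

if __name__ == "__main__":
    # Sample 0 is connected to both other samples, so it lies in the
    # densest part of this toy graph and receives the largest weight.
    W_hat = [[0, 1, 1],
             [1, 0, 0],
             [1, 0, 0]]
    print(density_term(W_hat))
```

The values sum to one over all samples, so Ψ can be read as the share of the total graph weight attached to each sample.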
Locally Linear Embedding
Unlike Isomap and MVU, Locally Linear Embedding (LLE) [24, 37] attempts to preserve local properties of the data. Each sample x_{i} is represented by a linear combination of its k nearest neighbors, $x_i \approx \sum_{j} w_{ij} x_j$, where the weights w_{ij} are nonzero only for the neighbors of x_{i}. Then the low-dimensional representation that best preserves the weights in the target space is chosen.
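The two LLE steps can be sketched as follows, following the formulation in Roweis and Saul [24]: reconstruction weights are fitted per sample and constrained to sum to one, then the embedding is read from the bottom eigenvectors of (I − W)'(I − W). The regularization constant `reg` and the function names are assumptions for this illustration and are not taken from the routine used in the study.

```python
import numpy as np

def lle_weights(X, k, reg=1e-3):
    """Reconstruction weights: row i holds the weights of the k nearest
    neighbors of sample i (zero elsewhere), each row summing to one."""
    N = X.shape[0]
    W = np.zeros((N, N))
    for i in range(N):
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf                     # exclude the sample itself
        nbrs = np.argsort(dist)[:k]
        Z = X[nbrs] - X[i]                   # neighbors relative to x_i
        C = Z @ Z.T                          # local Gram matrix (k x k)
        C += reg * np.trace(C) / k * np.eye(k)   # regularize for stability
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs] = w / w.sum()             # enforce sum-to-one constraint
    return W

def lle_embedding(W, d):
    """Coordinates that are best reconstructed by the same weights."""
    N = W.shape[0]
    M = (np.eye(N) - W).T @ (np.eye(N) - W)
    eigvals, eigvecs = np.linalg.eigh(M)     # ascending eigenvalues
    return eigvecs[:, 1:d + 1]               # skip the constant eigenvector

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(40, 100))
    W = lle_weights(X, k=8)
    print(lle_embedding(W, 2).shape)
```

Because each sample is described only relative to its neighbors, the weights are unchanged by rotations, translations and rescaling of the data, which is why they can be reused in the target space.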
Laplacian Eigenmaps
Laplacian Eigenmaps minimizes the weighted sum of squared distances $\sum_{i,j} \Vert y_i - y_j \Vert^2 \, W(i,j)$, where W(i, j) > 0 for neighboring samples x_{i}, x_{j} (W(i, j) = 0 otherwise). Thus the distances between the low-dimensional representations are minimized, and nearby samples x_{i}, x_{j} are highly weighted and brought closer together. This way, Laplacian Eigenmaps implicitly enforces natural clusters in the data.
Methods of Assessment
Benchmark
Datasets
Microarray datasets
Dataset | Samples | Features | Class 1 (# samples) | Class 2 (# samples) |
---|---|---|---|---|
1 Wang et al. - Breast cancer [50] | 286 | 22,283 | ER+ (209) | ER- (77) |
2 Verhaak et al. - Leukemia [51] | 461 | 54,675 | NPM1 pos. (140) | NPM1 neg. (321) |
3 Haferlach et al. - Leukemia [52] | 251 | 54,675 | NPM1 pos. (138) | NPM1 neg. (113) |
4 Haferlach et al. - Leukemia [52] | 77 | 54,675 | AML with t(8;21) (40) | AML with t(15;17) (37) |
5 Golub et al. - Leukemia [53] | 72 | 7,129 | ALL (47) | AML (25) |
6 Chiaretti et al. - Leukemia [54] | 22 | 12,625 | CLL stable (8) | CLL progressive (14) |
7 Alizadeh et al. - Lymphoma [55] | 38 | 18,432 | Activated B-like DLBCL (17) | GC B-like DLBCL (21) |
8 Nutt et al. - High-grade glioma [56] | 50 | 12,625 | Glioblastoma (28) | Anaplastic oligodendroglioma (22) |
9 Alon et al. - Colon cancer [57] | 62 | 2,000 | Tumor (42) | Normal (20) |
10 Singh et al. - Prostate cancer [58] | 102 | 12,600 | Tumor (52) | Normal (50) |
The simulated data is based on a 50 sample dataset whose 10,000 gene expression values are normally distributed with zero mean and standard deviation one. The covariances of all genes are given by a block diagonal matrix with coefficients ρ = 0.2 within and ρ = 0 outside the blocks of size 50 × 50. To separate the data into two classes, between 10 and 500 genes were randomly chosen to be differentially expressed by adding a constant of 0.6 to the expression values of the first 25 samples. We generated 100 datasets for testing.
In the same manner as for the ten microarray datasets before, normally distributed noise with zero mean and an increasing variance between 0 and 0.2 was added to the simulated data. We repeated the benchmark on 50 of these noisy artificial datasets. The number of differential features was fixed to 300.
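The simulation described above can be sketched in a few lines of NumPy. The function name `simulate_expression` and its arguments are hypothetical; the block correlation is produced with one shared factor per block, which is one of several equivalent ways to obtain unit variances and a within-block correlation of ρ. This is not the simulator shipped with the RDRToolbox package.

```python
import numpy as np

def simulate_expression(n_samples=50, n_genes=10000, block=50, rho=0.2,
                        n_diff=300, shift=0.6, noise_var=0.0, seed=None):
    """Two-class expression matrix with block-correlated genes.

    Returns (X, labels, diff_genes): X is n_samples x n_genes, the first
    half of the samples forms class 0 and carries the shifted genes."""
    if n_genes % block != 0:
        raise ValueError("n_genes must be a multiple of the block size")
    rng = np.random.default_rng(seed)
    n_blocks = n_genes // block
    shared = rng.normal(size=(n_samples, n_blocks))   # one factor per block
    own = rng.normal(size=(n_samples, n_genes))       # gene-specific part
    # Variance rho + (1 - rho) = 1, covariance rho inside a block, 0 outside.
    X = np.sqrt(rho) * np.repeat(shared, block, axis=1) + np.sqrt(1 - rho) * own
    half = n_samples // 2
    diff_genes = rng.choice(n_genes, size=n_diff, replace=False)
    X[:half, diff_genes] += shift                     # differential expression
    if noise_var > 0:                                 # optional additive noise
        X += rng.normal(scale=np.sqrt(noise_var), size=X.shape)
    labels = np.repeat([0, 1], [half, n_samples - half])
    return X, labels, diff_genes

if __name__ == "__main__":
    X, labels, diff_genes = simulate_expression(noise_var=0.1, seed=0)
    print(X.shape, labels.sum(), len(diff_genes))
```

Varying `n_diff` between 10 and 500 corresponds to the first experiment, and fixing `n_diff=300` while raising `noise_var` from 0 to 0.2 corresponds to the noise evaluation.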
Dimension reduction
All dimension reduction techniques discussed here have one or two free parameters that influence the embedding, including the target dimension. Their determination was done by minimizing the error rate of a Support Vector Machine (SVM) within a leave-one-out cross-validation (loo-cv) scheme: For N samples, the dataset was divided N times into a training and a test set. One sample was excluded for testing while the rest was taken for training. The average over all N predictions gives an estimate of the SVM's generalization accuracy (one minus its generalization error).
This procedure was repeated for every set of parameters within the following ranges:
Target dimensionality: 2 ≤ d ≤ 15
Neighbors: 4 ≤ k ≤ 16
Gaussian kernel: 1e-1 ≤ σ ≤ 5e5
If the same loo-cv accuracies were achieved by using different parameter values for the target dimension, the lowest value was taken for reasons of a most simple representation. The same applies to the neighbor/kernel parameters.
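The selection procedure can be summarized in a short sketch. A one-nearest-neighbor rule stands in for the SVM so that the example needs only NumPy; this classifier, the function names and the exact comparison of accuracies are assumptions for illustration. Ties are resolved as described above: lowest target dimension first, then lowest neighbor or kernel parameter.

```python
import numpy as np

D_RANGE = range(2, 16)    # target dimensionality 2 <= d <= 15
K_RANGE = range(4, 17)    # neighbors 4 <= k <= 16

def one_nn(Y_train, y_train, Y_test):
    """Stand-in classifier (NOT the SVM used in the study)."""
    dist = np.linalg.norm(Y_test[:, None, :] - Y_train[None, :, :], axis=2)
    return y_train[np.argmin(dist, axis=1)]

def loo_cv_accuracy(Y, labels, fit_predict=one_nn):
    """Leave one sample out N times and average the N predictions."""
    labels = np.asarray(labels)
    N = len(labels)
    hits = 0
    for i in range(N):
        train = np.delete(np.arange(N), i)
        pred = fit_predict(Y[train], labels[train], Y[[i]])
        hits += int(pred[0] == labels[i])
    return hits / N

def select_parameters(accuracies):
    """accuracies maps (d, parameter) -> loo-cv accuracy.
    Among the best settings, prefer the lowest d, then the lowest parameter."""
    best = max(accuracies.values())
    candidates = [key for key, acc in accuracies.items() if acc == best]
    return min(candidates)           # tuples compare by d first

if __name__ == "__main__":
    scores = {(2, 4): 0.80, (3, 4): 0.90, (3, 6): 0.90, (5, 4): 0.90}
    print(select_parameters(scores))
```

In a full run, `accuracies` would be filled by embedding the data for every pair in `D_RANGE` × `K_RANGE` (or the σ range for kernel methods) and calling `loo_cv_accuracy` on each embedding.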
After the loo-cv, the whole dataset was reduced in its dimension in an unsupervised manner, i.e. without consideration of class labels.
Classification
The first evidence for the quality of the different dimension reduction methods are the accuracies of a Support Vector Machine with Gaussian kernel.
The data was classified repeatedly during several randomization steps:
We randomly split the dataset a hundred times into a training set for the SVM and a test set for classification, and selected the median accuracy of all runs. Within the training set, a loo-cv was performed to determine the SVM parameters. For reasons of performance, a gradient descent procedure as proposed in [41] was used to minimize the loo-cv error. In every randomization run, the training set consisted of two thirds of the original data and the test set of the remaining samples. The only constraint was to keep the balance between the number of samples in each class. Since SVMs do not restrict the dimension of the input data, the randomization results of the low-dimensional data can be compared to those of the high-dimensional original data, to see whether significant features were lost after the embedding.
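The randomization scheme can be sketched as follows. The class-balanced two-thirds split and the median over 100 runs follow the description above; the nearest-centroid rule is only a stand-in for the SVM with Gaussian kernel, and all function names are hypothetical.

```python
import numpy as np

def stratified_split(labels, train_fraction=2 / 3, rng=None):
    """Split sample indices so that every class keeps its proportion."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    train = []
    for c in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = int(round(train_fraction * len(idx)))
        train.extend(idx[:n_train].tolist())
    train = np.array(sorted(train))
    test = np.setdiff1d(np.arange(len(labels)), train)
    return train, test

def nearest_centroid(X_train, y_train, X_test):
    """Stand-in classifier (NOT the SVM used in the study)."""
    classes = np.unique(y_train)
    centroids = np.array([X_train[y_train == c].mean(axis=0) for c in classes])
    dist = np.linalg.norm(X_test[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(dist, axis=1)]

def median_accuracy(X, labels, fit_predict=nearest_centroid, n_runs=100, seed=0):
    """Median test accuracy over repeated class-balanced random splits."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    accuracies = []
    for _ in range(n_runs):
        train, test = stratified_split(labels, rng=rng)
        pred = fit_predict(X[train], labels[train], X[test])
        accuracies.append(np.mean(pred == labels[test]))
    return float(np.median(accuracies))

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    X = np.vstack([rng.normal(0, 1, (40, 5)), rng.normal(3, 1, (20, 5))])
    y = np.array([0] * 40 + [1] * 20)
    print(median_accuracy(X, y))
```

The same function can be called on the original data and on each low-dimensional embedding, which gives the comparison between high- and low-dimensional accuracies discussed in the text.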
Cluster validation
The Davies-Bouldin index (DB-Index) averages the worst cases of the clusters' separations. One might expect well separated clusters to have smaller values close to one. In our case, the DB-Index was computed for fixed target space dimensions 2, 3, 5, and 10.
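For orientation, the commonly used form of the Davies-Bouldin index can be computed as in the sketch below: for every cluster, the largest ratio of summed within-cluster scatter to between-centroid distance is taken, and these worst cases are averaged. Using the mean distance to the centroid as scatter and the class labels as cluster assignment are assumptions of this sketch; the routine in RDRToolbox may differ in such details.

```python
import numpy as np

def davies_bouldin(Y, labels):
    """Average over clusters of the worst (S_i + S_j) / d(c_i, c_j)."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    centroids = np.array([Y[labels == c].mean(axis=0) for c in classes])
    # Scatter S_i: mean Euclidean distance of cluster members to the centroid.
    scatter = np.array([
        np.mean(np.linalg.norm(Y[labels == c] - centroids[i], axis=1))
        for i, c in enumerate(classes)
    ])
    n = len(classes)
    worst = []
    for i in range(n):
        ratios = [
            (scatter[i] + scatter[j]) / np.linalg.norm(centroids[i] - centroids[j])
            for j in range(n) if j != i
        ]
        worst.append(max(ratios))            # worst separation for cluster i
    return float(np.mean(worst))

if __name__ == "__main__":
    Y = np.array([[0.0, 0.0], [2.0, 0.0], [10.0, 0.0], [12.0, 0.0]])
    labels = np.array([0, 0, 1, 1])
    print(davies_bouldin(Y, labels))         # scatter 1 + 1, distance 10
```

Compact clusters with distant centroids give small values, so lower is better when comparing embeddings of the same dataset.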
Implementation details
The presented benchmark was implemented in Matlab 7.8.0 (R2009a). Furthermore, libsvm (version 2.89) [43] served as Support Vector Machine implementation, in conjunction with Automatic Model Selection for Kernel Methods (Apr 2005) [44]. The Dimensionality Reduction Toolbox (version 0.7 - Nov 2008) [45], Isomap package (Release 1 - Dec 2000) [46], LLE routine [45] and MVU implementation (version 1.3) [47] were used for dimension reduction. Because the Isomap and LLE routines performed best in our benchmark, we ported their Matlab implementations to the statistical programming language R [48]. The R package 'RDRToolbox', which also includes a routine to compute the Davies-Bouldin-Index and our microarray gene expression data simulator, can be downloaded from [49] (see also Additional file 1).
Results and Discussion
A linear approach like PCA is known to recover the true structure of data lying on or near a linear subspace of the high-dimensional input space. The following results show that the structure of microarray data is often too complex to be captured well in very low dimensional target spaces in a linear manner. Nonlinear methods, in particular LLE and Isomap, preserve more information in the data than the first few principal components of a PCA are able to cover.
Classification
Parameter estimation
method | dim | neighbors/σ | loo-cv accuracy |
---|---|---|---|
PCA | 14 | - | 87.4 |
KPCA | 15 | 5e5 | 87.1 |
LLE | 12 | 14 | 88.5 |
IM | 8 | 10 | 85 |
IM(mod) | 15 | 4 | 87.4 |
LEM | 5 | 4 | 85.3 |
DM | 13 | 5e5 | 84.3 |
MVU | 5 | 14 | 85 |
While all methods perform nearly equally well in higher dimensions, Isomap, LLE and Laplacian Eigenmaps performed best in two and three dimensions. Only on two of ten datasets (Alizadeh et al. and Singh et al.) did PCA perform as well as nonlinear methods like Isomap in two or three dimensional target spaces (see Supplemental Figures S18/S19, S27/S28). Across all ten datasets (see supplement), Diffusion Maps and Laplacian Eigenmaps produce more variable results, and Diffusion Maps in particular are very sensitive to the choice of the kernel parameter (see for example Figure 3, dimension two). But like Kernel PCA, they perform quite similarly to PCA in most cases. MVU, which like Isomap is based on Multidimensional Scaling, achieves accuracies comparable to Isomap's.
The initial publications on Isomap and MVU [25, 27], covering text classification and face recognition, pointed out that PCA might need higher dimensional target spaces than its nonlinear counterparts to achieve similar results. Since PCA only considers the variance in the data, it works best if those features which are relevant for the class labeling account for most of the variance. For complex microarray data, the first two or three principal components were often not enough to cover the information necessary to sufficiently distinguish different classes within the data. This might prevent a well-suited visualization that is true to the original. LLE, Isomap and MVU, which classified best on most of the datasets, take advantage of overlapping local neighborhoods to create an image of the global geometry of the data. Although this approach may suffer from "holes" within the data (manifold), it proved more useful for accurate low-dimensional representations.
Well-sampled datasets may overcome this issue of sparse data. But the Chiaretti et al. leukemia (22 samples), Alizadeh et al. lymphoma (38 samples) and Nutt et al. high-grade glioma (50 samples) datasets show that an embedding true to the original is possible even with relatively few samples. The classification accuracies of most of the dimension reduction methods on these datasets (in ≥ 2 target dimensions) are comparable to, and sometimes even better than, the accuracies on the high-dimensional data (see Supplemental Figures S15, S18, S21).
Cluster validation
Because LLE and Isomap performed best on most of the datasets during classification and cluster validation, Figure 2 compares their two dimensional embedding of the Haferlach et al. Leukemia dataset to the first two principal components of a PCA. All three visualizations clearly show two clusters of AML patients with t(15;17) and t(8;21) respectively. But LLE and Isomap distinguish both classes best, while in the PCA embedding three more t(15;17) samples lie between samples of the other class. Since LLE and Isomap both map more samples correctly, there seems to be more information within the data than the first two PCA components preserve. On closer inspection, the three t(15;17) outliers that lie between or closest to t(8;21) samples in all three visualizations are always the same samples: #44, #46 and #57. Another visualization example of the Alon et al. Colon Cancer dataset with all eight dimension reduction techniques can be seen in Supplemental Figures S1 and S2.
Noise evaluation
Simulated data
The benchmark with noisy simulated data, however, confirms the results of the noise evaluation for the ten microarray datasets. Supplemental Figures S31 and S32 show, for two and three target dimensions, that PCA is more robust than LLE and Isomap in both classification and cluster validation when the noise within the data increases. These conclusions hold true for noisy data with a larger variance, since PCA, LLE and Isomap are invariant to multiplication of the data by a scalar.
Statistical hypothesis test
Wilcoxon signed-rank test (p-values)
PCA compared to ... | Accuracies(dim 2) | DB-Index(dim 2) |
---|---|---|
KPCA | 0.1562 | 0.3223 |
LLE | 0.0195 | 0.0273 |
IM | 0.0078 | 0.1055 |
IM(mod) | 0.0547 | 0.2324 |
LEM | 0.1953 | 0.3750 |
DM | 0.5000 | 0.8457 |
MVU | 0.1953 | 0.7695 |
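To make the test behind such p-values concrete, the sketch below computes an exact Wilcoxon signed-rank p-value for paired per-dataset results by enumerating all sign patterns. It is written in plain Python; the one-sided alternative, the dropping of zero differences and the average ranks for ties are assumptions of this sketch, since the text does not state which variant was used.

```python
from itertools import product

def signed_rank_p_value(differences):
    """Exact one-sided p-value P(W+ >= observed W+) under the null
    hypothesis that positive and negative differences are equally likely."""
    d = [x for x in differences if x != 0]       # zero differences are dropped
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:                               # average ranks for ties
        end = pos
        while end + 1 < n and abs(d[order[end + 1]]) == abs(d[order[pos]]):
            end += 1
        avg = (pos + end) / 2 + 1
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    count = 0
    for signs in product((0, 1), repeat=n):      # all 2^n sign patterns
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus:
            count += 1
    return count / 2 ** n

if __name__ == "__main__":
    # Hypothetical accuracy differences (method minus PCA) on five datasets.
    print(signed_rank_p_value([0.04, 0.01, -0.02, 0.06, 0.03]))
```

With ten paired datasets there are 2^10 = 1024 sign patterns, so exact p-values of this kind are multiples of 1/1024. The enumeration is only practical for small numbers of pairs.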
Runtime
Dataset | PCA | KPCA | LLE | IM | LEM | DM | MVU |
---|---|---|---|---|---|---|---|
Chiaretti et al. | 0.09 s | 0.03 s | 0.14 s | 0.04 s | 0.04 s | 0.16 s | 0.25 s |
Verhaak et al. | 9.4 s | 12.7 s | 21.9 s | 14 s | 15.2 s | 13.2 s | 2 hrs |
The runtime of all nonlinear methods (Kernel PCA, Isomap, LLE, LEM, DM, MVU) depends on the number of samples. Even for relatively large microarray datasets (461 samples in this case), runtimes between 9.4 and 21.9 seconds are acceptable for visualization purposes. Only the solution of a semidefinite program in the MVU algorithm takes two hours. For all methods, the computing time for datasets with more common sample sizes (≤ 50) is less than a second.
Conclusions
Classifications on high- and low-dimensional data showed that the most significant information within microarray data can be captured quite well in very few dimensions compared to the thousands of features of the original data.
Our benchmark further revealed significant shortcomings of PCA in two and three dimensional target spaces and identified two nonlinear methods that differed most from PCA. In particular, the performance of Locally Linear Embedding and Isomap in classification and cluster validation makes them well-suited alternatives to the classic, linear approach of PCA.
Declarations
Acknowledgements
This study was supported by COST Action BM0801 Translating genomic and epigenetic studies of MDS and AML (EuGESMA) and by the European Leukemia Network of Excellence (ELN).
Authors’ Affiliations
References
- Hibbs MA, Dirksen NC, Li K, Troyanskaya OG: Visualization methods for statistical analysis of microarray clusters. BMC Bioinformatics 2005, 6: 115. 10.1186/1471-2105-6-115
- Yeung KY, Ruzzo WL: Principal component analysis for clustering gene expression data. Bioinformatics 2001, 17(9):763–774. 10.1093/bioinformatics/17.9.763
- Lim IS, Ciechomski PDH, Sarni S, Thalmann D: Planar arrangement of high-dimensional biomedical data sets by Isomap coordinates. In Proceedings of the 16th IEEE Symposium on Computer-Based Medical Systems 2003, 50–55.
- Baek J, McLachlan GJ, Flack LK: Mixtures of factor analyzers with common factor loadings: Applications to the clustering and visualization of high-dimensional data. IEEE Transactions on Pattern Analysis and Machine Intelligence 2010, 32: 1298–1309. 10.1109/TPAMI.2009.149
- Butte A: The use and analysis of microarray data. Nature Reviews Drug Discovery 2002, 1(12):951–960. 10.1038/nrd961
- Misra J, Schmitt W, Hwang D, Hsiao LL, Gullans S, Stephanopoulos G, Stephanopoulos G: Interactive exploration of microarray gene expression patterns in a reduced dimensional space. Genome Research 2002, 12(7):1112–1120. 10.1101/gr.225302
- Mramor M, Leban G, Demsar J, Zupan B: Visualization-based cancer microarray data classification analysis. Bioinformatics (Oxford, England) 2007, 23(16):2147–2154. 10.1093/bioinformatics/btm312
- Dawson K, Rodriguez RL, Malyj W: Sample phenotype clusters in high-density oligonucleotide microarray data sets are revealed using Isomap, a nonlinear algorithm. BMC Bioinformatics 2005, 6: 195. 10.1186/1471-2105-6-195
- Umpai TJ, Aitken S: Feature selection and classification for microarray data analysis: evolutionary methods for identifying predictive genes. BMC Bioinformatics 2005, 6: 148. 10.1186/1471-2105-6-148
- Li T, Zhang C, Ogihara M: A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression. Bioinformatics 2004, 20(15):2429–2437. 10.1093/bioinformatics/bth267
- Su Y, Murali TM, Pavlovic V, Schaffer M, Kasif S: RankGene: identification of diagnostic genes based on expression data. Bioinformatics 2003, 19(12):1578–1579. 10.1093/bioinformatics/btg179
- Geman D, d'Avignon C, Naiman DQ, Winslow RL: Classifying gene expression profiles from pairwise mRNA comparisons. Statistical Applications in Genetics and Molecular Biology 2004, 3. 10.2202/1544-6115.1071
- Lin X, Afsari B, Marchionni L, Cope L, Parmigiani G, Naiman D, Geman D: The ordering of expression among a few genes can provide simple cancer biomarkers and signal BRCA1 mutations. BMC Bioinformatics 2009, 10: 256. 10.1186/1471-2105-10-256
- Van der Maaten LJP, Postma EO, van den Herik HJ: Dimensionality reduction: a comparative review. Tech. rep., MICC, Maastricht University 2008.
- Chao S, Lihui C: Feature dimension reduction for microarray data analysis using locally linear embedding. In APBC 2004, 211–217.
- Cho SB, Won HH: Machine learning in DNA microarray analysis for cancer classification. In APBC '03: Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003. Australian Computer Society, Inc; 2003:189–198.
- Liu CCC, Hu J, Kalakrishnan M, Huang H, Zhou XJJ: Integrative disease classification based on cross-platform microarray data. BMC Bioinformatics 2009, 10(Suppl 1):25. 10.1186/1471-2105-10-S1-S25
- Pochet N, De Smet F, Suykens JA, De Moor BL: Systematic benchmarking of microarray data classification: assessing the role of nonlinearity and dimensionality reduction. Bioinformatics 2004, 3185–3195. 10.1093/bioinformatics/bth383
- Nguyen DV, Rocke DM: Tumor classification by partial least squares using microarray gene expression data. Bioinformatics 2002, 18: 39–50. 10.1093/bioinformatics/18.1.39
- Boulesteix AL: PLS dimension reduction for classification with microarray data. Statistical Applications in Genetics and Molecular Biology 2009, 3: 33.
- Dai JJ, Lieu L, Rocke D: Dimension reduction for classification with gene expression microarray data. Statistical Applications in Genetics and Molecular Biology 2006, 5.
- Antoniadis A, Lambert-Lacroix S, Leblanc F: Effective dimension reduction methods for tumor classification using gene expression data. Bioinformatics 2003, 19(5):563–570. 10.1093/bioinformatics/btg062
- Vlachos M, Domeniconi C, Gunopulos D, Kollios G, Koudas N: Non-linear dimensionality reduction techniques for classification and visualization. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2002, 645–651.
- Roweis ST, Saul LK: Nonlinear dimensionality reduction by Locally Linear Embedding. Science 2000, 290(5500):2323–2326. 10.1126/science.290.5500.2323
- Weinberger KQ, Saul LK: Unsupervised learning of image manifolds by semidefinite programming. International Journal of Computer Vision 2006, 70: 77–90. 10.1007/s11263-005-4939-z
- Weinberger KQ, Saul LK: An introduction to nonlinear dimensionality reduction by maximum variance unfolding. In AAAI'06: proceedings of the 21st national conference on Artificial intelligence. AAAI Press; 2006:1683–1686.
- Tenenbaum JB, de Silva V, Langford JC: A global geometric framework for nonlinear dimensionality reduction. Science 2000, 290(5500):2319–2323. 10.1126/science.290.5500.2319
- Silva VD, Tenenbaum JB: Global versus local methods in nonlinear dimensionality reduction. In Advances in Neural Information Processing Systems 15. MIT Press; 2003:705–712.
- Hotelling H: Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology 1933, 24: 417–441, 498–520. 10.1037/h0071325
- Jolliffe IT: Principal Component Analysis. 2nd edition. Springer; 2002.
- Chatfield C, Collins AJ: Introduction to multivariate analysis. Chapman and Hall; 1980.
- Schölkopf B, Smola A, Müller KR: Nonlinear component analysis as a kernel eigenvalue problem. Neural Computation 1998, 10(5):1299–1319. 10.1162/089976698300017467
- Schölkopf B, Smola A, Müller KR: Kernel principal component analysis. Advances in kernel methods: support vector learning 1999, 327–352.
- Cox TF, Cox MAA, Raton B: Multidimensional Scaling. Technometrics 2003, 45(2):182.
- Nadler B, Lafon S, Coifman RR, Kevrekidis IG: Diffusion maps, spectral clustering and reaction coordinates of dynamical systems. Applied and Computational Harmonic Analysis 2006, 21: 113–127. 10.1016/j.acha.2005.07.004
- Lafon S, Lee AB: Diffusion maps and coarse-graining: a unified framework for dimensionality reduction, graph partitioning, and data set parameterization. Pattern Analysis and Machine Intelligence, IEEE Transactions on 2006, 28(9):1393–1403. 10.1109/TPAMI.2006.184
- Saul LK, Roweis ST: Think globally, fit locally: unsupervised learning of low dimensional manifolds. Journal of Machine Learning Research 2003, 4: 119–155. 10.1162/153244304322972667
- Belkin M, Niyogi P: Laplacian Eigenmaps for dimensionality reduction and data representation. Neural Comp 2003, 15(6):1373–1396. 10.1162/089976603321780317
- Belkin M, Niyogi P: Laplacian Eigenmaps and spectral techniques for embedding and clustering. Advances in Neural Information Processing Systems 14 2001, 14: 585–591.
- Cristianini N, Shawe-Taylor J: An introduction to Support Vector Machines and other kernel-based learning methods. 1st edition. Cambridge University Press; 2000.
- Chapelle O, Vapnik V, Bousquet O, Mukherjee S: Choosing multiple parameters for Support Vector Machines. Machine Learning 2002, 46: 131–159. 10.1023/A:1012450327387
- Xu R, Wunsch D: Clustering. illustrated edition. Wiley-IEEE Press; 2008.
- Chang CC, Lin CJ: LIBSVM, a library for support vector machines. 2001. [http://www.csie.ntu.edu.tw/~cjlin/libsvm] [last accessed 29 Oct 2010]
- Chapelle O: Automatic model selection for kernel methods. [http://olivier.chapelle.cc/ams/] [last accessed 29 Oct 2010]
- van der Maaten LJP: Matlab toolbox for dimensionality reduction. [http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html] [last accessed 29 Oct 2010]
- Tenenbaum JB: Matlab Isomap package. [http://isomap.stanford.edu/] [last accessed 29 Oct 2010]
- Weinberger KQ: Maximum Variance Unfolding. [http://www.cse.wustl.edu/~kilian/code/code.html] [last accessed 29 Oct 2010]
- The R Project for statistical computing [http://www.r-project.org/] [last accessed 29 Oct 2010]
- RDRToolbox - A package for nonlinear dimension reduction with Isomap and LLE [http://www.bioconductor.org/help/bioc-views/release/bioc/html/RDRToolbox.html] [last accessed 29 Oct 2010]
- Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, van Gelder MEM, Yu J, Jatkoe T, Berns EM, Atkins D, Foekens JA: Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. The Lancet 2005, 365(9460):671–679.
- Verhaak R, Wouters B, Erpelinck C, Abbas S, Beverloo H, Lugthart S, Löwenberg B, Delwel R, Valk P: Prediction of molecular subtypes in acute myeloid leukemia based on gene expression profiling. Haematologica 2009, 94: 131–134. 10.3324/haematol.13299
- Klein HU, Ruckert C, Kohlmann A, Bullinger L, Thiede C, Haferlach T, Dugas M: Quantitative comparison of microarray experiments with published leukemia related gene expression signatures. BMC Bioinformatics 2009, 10: 422. 10.1186/1471-2105-10-422
- Golub TR, Slonim DK, Tamayo P, Huard C, Gaasenbeek M, Mesirov JP, Coller H, Loh ML, Downing JR, Caligiuri MA, Bloomfield CD: Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. Science 1999, 286: 531–537. 10.1126/science.286.5439.531
- Del Giudice I, Chiaretti S, Tavolaro S, De Propris MS, Maggio R, Mancini F, Peragine N, Santangelo S, Marinelli M, Mauro FR, Guarini A, Foa R: Spontaneous regression of chronic lymphocytic leukemia: clinical and biologic features of 9 cases. Blood 2009, 114(3):638–646. 10.1182/blood-2008-12-196568
- Powell JI, Yang L, Marti GE, Moore T, Hudson J, Lu L, Lewis DB, Tibshirani R, Sherlock G, Chan WC, et al.: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 2000, 403(6769):503–511. 10.1038/35000501
- Nutt CL, Mani DR, Betensky RA, Tamayo P, Cairncross JG, Ladd C, Pohl U, Hartmann C, McLaughlin ME, Batchelor TT, Black PM, von Deimling A, Pomeroy SL, Golub TR, Louis DN: Gene expression-based classification of malignant gliomas correlates better with survival than histological classification. Cancer Research 2003, 63(7):1602–1607.
- Alon U, Barkai N, Notterman DA, Gish K, Ybarra S, Mack D, Levine AJ: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proceedings of the National Academy of Sciences of the United States of America 1999, 96(12):6745–6750. 10.1073/pnas.96.12.6745
- Singh D, Febbo PG, Ross K, Jackson DG, Manola J, Ladd C, Tamayo P, Renshaw AA, D'Amico AV, Richie JP, Lander ES, Loda M, Kantoff PW, Golub TR, Sellers WR: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 2002, 1(2):203–209. 10.1016/S1535-6108(02)00030-2
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.