Figure 1 | BMC Bioinformatics

Figure 1

From: The projection score - an evaluation criterion for variable subset selection in PCA visualization

Variance filtering of the lung cancer data. (a) The sample representation obtained by applying PCA to the most informative variable subset obtained by variance filtering, containing 591 genes. The different colors indicate different cancer subtypes. (b) The sample representation obtained by applying PCA to the entire variable set (12,625 variables). (c) The projection score as a function of the variance threshold θ (fraction of maximal variance) used for inclusion. (d) Heatmap of the most informative variable subset, that is, the one used to create the sample representation in (a). In panels (a) and (b), in order to obtain more easily interpretable plots, we joined the closest neighbors among the samples with line segments. The distance between two samples is defined by the Euclidean distance in the space spanned by all the remaining variables. The hierarchical clusterings in panel (d) are created using Euclidean distances and average linkage. The figures in (a), (b) and (d) were generated using Qlucore Omics Explorer 2.2 (Qlucore AB, Lund, Sweden).
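
The variance filtering and PCA steps described in the caption can be illustrated with a minimal sketch. It assumes a samples-by-variables matrix and uses synthetic data, not the lung cancer data set. The names `variance_filter` and `pca_scores` are hypothetical helpers, and this is not the Qlucore Omics Explorer implementation. The projection score itself is not computed here, since the caption does not define it.

```python
import numpy as np

def variance_filter(X, theta):
    """Keep the columns of X whose variance is at least theta times
    the maximal column variance (theta in [0, 1])."""
    variances = X.var(axis=0, ddof=1)
    keep = variances >= theta * variances.max()
    return X[:, keep], keep

def pca_scores(X, n_components=2):
    """Project the samples onto the leading principal components,
    computed from the SVD of the column-centered matrix."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

# Synthetic stand-in: 20 samples, 100 variables with differing spread.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 100)) * rng.uniform(0.1, 3.0, size=100)

X_sub, keep = variance_filter(X, theta=0.5)  # theta is an arbitrary choice here
scores = pca_scores(X_sub)                   # 2-D sample representation
```

Raising `theta` keeps fewer variables, and `theta = 0` keeps all of them, which corresponds to the unfiltered representation in panel (b).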