Fig. 5 | BMC Bioinformatics

Fig. 5

From: Euclidean distance-optimized data transformation for cluster analysis in biomedical data (EDOtrans)

Results of hierarchical cluster analysis of a four-dimensional data set with 3,000 instances of flow cytometric (FACS) measurements (“FACSData”). A The original and transformed data (z-transformation, EDO transformation, and PVS transformation [6]) are shown as probability density functions (PDF) estimated using the Pareto density estimation (PDE [27]), which was developed as a nonparametric kernel density estimator to improve subgroup separation in mixtures. B Cluster quality and stability, assessed as cluster accuracy and adjusted Rand index [21] against the prior classification of the data, and as Dunn’s index [22]. The boxes were constructed using the minimum, quartiles, median (solid line inside the box), and maximum. The whiskers add 1.5 times the interquartile range (IQR) to the 75th percentile or subtract 1.5 times the IQR from the 25th percentile. The figure was created using the R software package (version 4.1.2 for Linux; https://CRAN.R-project.org/ [9]) and the R packages “ggplot2” (https://cran.r-project.org/package=ggplot2 [10]) and “FactoMineR” (https://cran.r-project.org/package=FactoMineR [16]). The colors were selected from the “colorblind_pal” palette provided with the R library “ggthemes” (https://cran.r-project.org/package=ggthemes [11]).
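The box-and-whisker construction described in the caption can be illustrated with a minimal sketch. This is not the authors' plotting code, which used R and “ggplot2”; the function name `box_stats` and the example values are hypothetical, and the quantile convention (linear interpolation that includes the extremes) is an assumption chosen because it matches R's default quantile type. Note that ggplot2 draws each whisker only as far as the most extreme data point inside these limits.

```python
from statistics import quantiles


def box_stats(values):
    """Return the summary values of a box plot as described in the caption.

    Hypothetical helper for illustration only. Quartiles use the
    'inclusive' method (an assumption, see the lead-in).
    """
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        # Whisker limits: 1.5 * IQR beyond the 25th and 75th percentiles.
        "lower_limit": q1 - 1.5 * iqr,
        "upper_limit": q3 + 1.5 * iqr,
    }


# Made-up example values, not data from the figure.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(stats["lower_limit"], stats["upper_limit"])  # -3.0 13.0
```

Values outside the two limits would conventionally be drawn as individual outlier points.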
