Fig. 4 | BMC Bioinformatics

Fig. 4

From: Feature-specific quantile normalization and feature-specific mean–variance normalization deliver robust bi-directional classification and feature selection performance between microarray and RNAseq data

Model performance in PAM50 and CMS classification with feature selection. a. Balanced accuracy (y-axis) on unseen out-of-fold test data for each normalization method (x-axis) for the breast PAM50 classifier using feature selection. b. Balanced accuracy (y-axis) on unseen out-of-fold test data for each normalization method (x-axis) for the colon CMS classifier using feature selection. For a and b, the gray labels above the plot denote the feature selection method and the gray labels to the right denote the training distribution. c. Balanced accuracy (y-axis) on unseen out-of-fold test data versus the number of selected features (x-axis) for PAM50 classification. d. Balanced accuracy (y-axis) on unseen out-of-fold test data versus the number of selected features (x-axis) for CMS classification. For c and d, the gray labels above the plot denote the classifier model and the gray labels to the right denote the training distribution. Scatter plot colours correspond to the normalization method (blue = reference/training distribution, orange = FSQN, green = FSMVN, red = log2). 95% confidence intervals were calculated from 1000 bootstrap resamples drawn with replacement. The significance of a Kruskal–Wallis test with Dunn’s post-hoc test is annotated in the plot (****p < 0.0001, ***p < 0.001, **p < 0.01, *p < 0.05, ns = not significant).
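
As an illustration of the two quantities named in the caption, the following minimal sketch computes balanced accuracy (the mean of per-class recall) and a bootstrap 95% confidence interval from 1000 resamples drawn with replacement. It is not the authors' code: the function names, the percentile method for the interval, the fixed seed, and the toy labels are all assumptions made for this example.

```python
import random


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for c in sorted(set(y_true)):
        # Predictions for the samples whose true label is c
        preds = [p for t, p in zip(y_true, y_pred) if t == c]
        recalls.append(sum(p == c for p in preds) / len(preds))
    return sum(recalls) / len(recalls)


def bootstrap_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for balanced accuracy.

    Assumption: the percentile method is used here for simplicity;
    the caption does not say which bootstrap interval was computed.
    """
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        # Resample sample indices with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(
            balanced_accuracy([y_true[i] for i in idx],
                              [y_pred[i] for i in idx])
        )
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical toy labels, not data from the paper
y_true = ["LumA", "LumA", "LumB", "LumB", "Basal", "Basal", "Her2", "Her2"]
y_pred = ["LumA", "LumB", "LumB", "LumB", "Basal", "Basal", "Her2", "LumA"]

point_estimate = balanced_accuracy(y_true, y_pred)
ci_lower, ci_upper = bootstrap_ci(y_true, y_pred)
```

Balanced accuracy weights every class equally, which matters for subtype classifiers such as PAM50 and CMS where class sizes can differ. A resample may miss a class entirely; this sketch then averages recall over the classes that remain in that resample.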
