Fig. 6 | BMC Bioinformatics

Fig. 6

From: The shape of gene expression distributions matter: how incorporating distribution shape improves the interpretation of cancer transcriptomic data

Comparing the prediction accuracy of classifiers that incorporate the expression distribution shape versus classifiers that assume a symmetric distribution for all genes. We used random survival forests to predict patient prognosis and tested the performance of classifiers derived in three ways: first, incorporating information from the distribution shape; second, assuming symmetry for all genes; and third, using a random set of genes. Classifiers were trained on 2/3 of the data and tested on the remaining 1/3, and this procedure was repeated 100 times. a AML Microarray, b GBM Microarray, c OV Microarray, d AML RNA-seq, e GBM RNA-seq, f OV RNA-seq. Stars indicate datasets where the shape-based approach produced significantly lower misclassification rates (Wilcoxon test, * = p value < 0.05, ** = p value < 0.01, NS = Not Significant). The notch in each boxplot displays a confidence interval computed as the median misclassification rate ± 1.58 × IQR/√n, where n = 100; notches that do not overlap reflect statistically significant comparisons.
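The notch formula in the caption can be illustrated with a minimal sketch. The function names `notch_interval` and `notches_overlap` and the example rates are hypothetical, introduced only for illustration, and the quartile convention used here (linear interpolation) is an assumption that may differ slightly from the one used by the plotting software behind the figure.

```python
import math
import statistics


def notch_interval(rates):
    """Return (lower, median, upper) of the notch: median ± 1.58 × IQR/√n."""
    n = len(rates)
    # Quartiles by linear interpolation (an assumed convention).
    q1, median, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median, median + half_width


def notches_overlap(rates_a, rates_b):
    """True if the two notch intervals share at least one point."""
    lo_a, _, hi_a = notch_interval(rates_a)
    lo_b, _, hi_b = notch_interval(rates_b)
    return lo_a <= hi_b and lo_b <= hi_a


# Hypothetical misclassification rates from 100 repeated splits (not real data).
shape_based = [0.30 + 0.001 * i for i in range(100)]
symmetric = [r + 0.05 for r in shape_based]

print(notch_interval(shape_based))
print(notches_overlap(shape_based, symmetric))
```

With n = 100, the notch half-width is 0.158 × IQR, so two classifiers whose medians differ by more than the sum of their half-widths have non-overlapping notches.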
