Fig. 3 | BMC Bioinformatics

Fig. 3

From: Guidelines for cell-type heterogeneity quantification based on a comparative analysis of reference-free DNA methylation deconvolution software

Impact of pre-processing on method performance. Heatmaps of method performance (`A MAE`: Mean Absolute Error on the estimated A, the matrix of cell proportions). RFE stands for RefFreeEWAS, MDC for MeDeCom and EDec for EDec stage 1. All algorithms were run on 10 D matrices: 10 different random noises ε were simulated on one matrix D computed from one simulated A matrix. In each heatmap, the left panel corresponds to algorithms run without accounting for confounders (no removal of confounding probes), and the right panel corresponds to algorithms run while accounting for confounders (removal of confounding probes by linear regression). In each case, different types of feature selection (FS) were tested: no FS = no feature selection; FS variance = selecting probes with high variance (var > 0.02); FS PCA = selecting probes highly correlated with the first 4 principal components (PCs) (p-value < 0.1); FS infloci = selecting probes expected to vary biologically in methylation level across the constitutive cell types. a Simulations were performed with the following parameters: K = 5, n = 20, α0 = 1, ε = 0.2 and G = 1. b Simulations were performed with the following parameters: K = 5, n = 100, α0 = 1, ε = 0.2 and G = 1. The number of conserved probes is displayed in Additional file 2: Table S1.
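
The two quantities most central to this caption, the MAE on A and the variance-based feature selection, can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, matrices are assumed to be plain lists of rows (cell types by samples for A, probes by samples for D), the variance is assumed to be the sample variance, and the estimated cell types are assumed to need matching to the true ones by trying every row ordering, which is only practical for small K such as K = 5.

```python
from itertools import permutations
from statistics import variance


def mean_absolute_error(a_true, a_est):
    """MAE between two K x n proportion matrices given as lists of rows."""
    total = 0.0
    count = 0
    for row_true, row_est in zip(a_true, a_est):
        for x, y in zip(row_true, row_est):
            total += abs(x - y)
            count += 1
    return total / count


def best_matched_mae(a_true, a_est):
    """Smallest MAE over all orderings of the estimated cell types (rows).

    Assumption: reference-free estimates come back in arbitrary row order,
    so rows are matched by brute force over all K! permutations.
    """
    k = len(a_est)
    return min(
        mean_absolute_error(a_true, [a_est[i] for i in perm])
        for perm in permutations(range(k))
    )


def select_by_variance(d, threshold=0.02):
    """Indices of probes (rows of D) whose sample variance exceeds threshold.

    The 0.02 default mirrors the 'var > 0.02' cut-off quoted in the caption.
    """
    return [i for i, row in enumerate(d) if variance(row) > threshold]


# Toy data, invented for illustration only.
a_true = [[0.5, 0.25], [0.25, 0.25], [0.25, 0.5]]
a_shuffled = [a_true[2], a_true[0], a_true[1]]  # same rows, different order
d_toy = [[0.1, 0.9, 0.5], [0.50, 0.52, 0.51]]

unmatched = mean_absolute_error(a_true, a_shuffled)
matched = best_matched_mae(a_true, a_shuffled)
kept_probes = select_by_variance(d_toy)
```

In this toy case the row-matched error drops to zero because the estimate is an exact reordering of the truth, and only the first probe passes the variance filter.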
