Fig. 4 | BMC Bioinformatics

Fig. 4

From: Metacells untangle large and complex single-cell transcriptome networks

Metacells facilitate data integration and accelerate downstream analyses. a–b UMAP visualization of the non-integrated (a) and Harmony-integrated (b) COVID-19_atlas dataset (\(N=\mathrm{1,462,702}\)) at the metacell \(\left(\gamma =10\right)\) level. Metacells are colored according to the cell type annotation, protocol or sample. c Batch effect level in terms of protocol (top) and sample (bottom) in the non-integrated and Harmony-integrated COVID-19_atlas dataset, computed as the kBET acceptance rate for the four most frequent cell types. d Computational time (top) and memory allocation (bottom) for the visualization (UMAP), clustering (Seurat), DE analysis (t-test, each cell type versus the rest), data integration (Harmony) and all steps together (‘Combined analysis’) for the metacells (dashed lines) and single cells (solid lines). Red dots show the limits reached on standard desktops (16 GB of RAM). Black dots correspond to the limits reached on a machine with 512 GB of RAM (linear extrapolations shown in gray). e UMAP visualization of the TIM_atlas dataset (\(N=\mathrm{108,566}\)) at the single-cell (left) and the metacell \(\left(\gamma =50\right)\) (right) levels, computed with the approximate coarse-graining. Cells are colored according to the cell type annotation. f Relative (z-score) expression of genes experimentally tested in Fig. 2g at the single-cell (top) and metacell (bottom) levels. The number following the ‘#’ sign indicates the ranking of each gene among the top differentially expressed ones. All comparisons pass statistical significance based on a two-tailed unpaired Student’s t-test (p values < 0.05) except for CD74 at the single-cell level (p value = 1). Ranks for genes that behave differently between mouse and human at both the single-cell and metacell levels are shown in red. g Computational time (top) and memory allocation (bottom) for the building of metacells followed by downstream analyses including dimensionality reduction, clustering and DE analysis for metacells computed with SuperCell (red dashed line) or MetaCell (green dashed line), and for the single cells (solid black line). Red dots show the limit reached on standard desktops (16 GB of RAM).
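To give a sense of the scale reduction implied by the graining levels quoted in the caption, the following minimal sketch estimates the number of metacells for each dataset. It assumes that \(\gamma\) is the ratio of the number of single cells to the number of metacells; the function name `approx_metacell_count` and the rounding rule are illustrative choices, not part of the published method.

```python
def approx_metacell_count(n_cells: int, gamma: float) -> int:
    """Estimate the number of metacells for a given graining level.

    Assumption: gamma = (number of single cells) / (number of metacells).
    Rounding to the nearest integer is an illustrative choice.
    """
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    if gamma < 1:
        raise ValueError("gamma must be at least 1")
    return max(1, round(n_cells / gamma))


# Dataset sizes and graining levels as quoted in the caption.
datasets = {
    "COVID-19_atlas": (1_462_702, 10),
    "TIM_atlas": (108_566, 50),
}

for name, (n_cells, gamma) in datasets.items():
    n_metacells = approx_metacell_count(n_cells, gamma)
    print(f"{name}: {n_cells} cells at gamma={gamma} -> ~{n_metacells} metacells")
```

Under this assumption, the downstream analyses in panels d and g operate on roughly \(N/\gamma\) objects rather than \(N\), which is consistent with the lower time and memory curves reported for metacells.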
