Table 1 Data sets considered in this work. The "Genes" column lists the number of non-zero genes detected in the data set. The peripheral blood mononuclear cell (PBMC) data set from [2] appears twice in this table: ZhengFilt contains a subset of the full data set ZhengFull. See the experimental data section for more information. ZhengSim is a collection of simulated data sets created with the Splatter R package; see the section on generating synthetic data in the Methods for more information.

From: A rank-based marker selection method for high throughput scRNA-seq data

| Data set | Cells | Genes | Description | Ground truth clusters | Ref. |
|---|---|---|---|---|---|
| Zeisel | 3005 | 4999 | mouse neurons | 9 (well-separated clusters) | [24] |
| Paul | 2730 | 3451 | mouse myeloid progenitor cells | 19 (differentiation trajectory) | [25] |
| ZhengFull | 68579 | 20387 | human PBMCs | 11 (some clusters overlap) | [2] |
| ZhengFilt | | 5000 | | | |
| 10x Mouse | 1.3 million | 24015 | mouse neurons | 39 (algorithmically generated) | [3] |
| ZhengSim | 5000 | varies | simulated from human CD19+ B cells | 2 | using [26] |