Table 1 Data sets considered in this work. The 'Genes' column lists the number of non-zero genes detected in the data set. The peripheral blood mononuclear cell (PBMC) data set from [2] appears twice in this table: ZhengFilt contains a subset of the full data set ZhengFull. See the Experimental data section for more information. ZhengSim is a collection of simulated data sets created with the Splatter R package; see the Generating synthetic data subsection of the Methods section for more information.

From: A rank-based marker selection method for high throughput scRNA-seq data

| Data set | Cells | Genes | Description | Ground truth clusters | Ref. |
|---|---|---|---|---|---|
| Zeisel | 3005 | 4999 | mouse neurons | 9 (well-separated clusters) | [24] |
| Paul | 2730 | 3451 | mouse myeloid progenitor cells | 19 (differentiation trajectory) | [25] |
| ZhengFull | 68579 | 20387 | human PBMCs | 11 (some clusters overlap) | [2] |
| ZhengFilt | | 5000 | | | |
| 10x Mouse | 1.3 million | 24015 | mouse neurons | 39 (algorithmically generated) | [3] |
| ZhengSim | 5000 | varies | simulated from human CD19+ B cells | 2 | using [26] |
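
The caption defines the Genes column as the number of non-zero genes detected in a data set. As a minimal sketch of how such a count could be obtained, assuming a cells-by-genes count matrix stored as a list of rows, the snippet below counts the genes that have at least one non-zero entry in some cell. The function name `count_nonzero_genes` and the toy matrix are hypothetical illustrations; they are not taken from the paper or its code.

```python
def count_nonzero_genes(counts):
    """Count genes with at least one non-zero entry.

    `counts` is assumed to be a list of rows (one per cell), each row
    holding the per-gene counts for that cell.
    """
    if not counts:
        return 0
    # zip(*counts) transposes the matrix so each tuple is one gene's column.
    return sum(1 for gene_column in zip(*counts) if any(gene_column))


# Toy matrix (hypothetical): 3 cells x 4 genes; the third gene is never detected.
toy_counts = [
    [0, 2, 0, 1],
    [3, 0, 0, 0],
    [0, 0, 0, 5],
]

n_cells = len(toy_counts)
n_genes_detected = count_nonzero_genes(toy_counts)
print(n_cells, n_genes_detected)
```

On real data sets of the sizes listed in the table, a sparse matrix representation would likely be preferable to nested lists; the logic of the count stays the same.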