Table 2 Batched datasets used in this work

From: Batch effect detection and correction in RNA-seq data using machine-learning-based automated assessment of quality

| Dataset | Authors | Tissue | Disease | n samples | n groups | n batches | Sample filtering |
|---|---|---|---|---|---|---|---|
| GSE120099 | Lo Sardo et al. [24] | Vascular smooth muscle cells | Cardiovascular diseases | 29 | 2 | 3 | Only WT cells |
| GSE61491 | Sugathan et al. [25] | Neural progenitor cells | Memory disorders | 54 | 2 (3) | 2 | No |
| GSE82177 | Wijetunga et al. [26] | Liver | Carcinoma, hepatocellular | 27 | 3 | 2 | No |
| GSE117970 | Cassetta et al. [27] | Breast | Breast neoplasms | 53 | 3 | 5 | Only breast-related samples |
| GSE173078 | Kim et al. [28] | Periodontium | Periodontitis, gingivitis | 36 | 3 | 2 | No |
| GSE162760 | Farias et al. [29] | Whole blood | Leishmaniasis, cutaneous | 128 | 2 | 6 | No |
| GSE171343 | Bowles et al. [30] | iPSC-derived cerebral organoids | Dementia | 36 | 3 | 3 | Only one type: GIH-6-C1-(delta)A02 |
| GSE153380 | Alvarez-Benayas et al. [31] | Primary plasma- and myeloma cells | Multiple myeloma | 33 | 2 | 3 | Only primary cells |
| GSE163214 | Procida et al. [32] | HeLa Kyoto cell line | None | 10 | 2 | 2 | No |
| GSE182440 | Lim et al. [33] | Brain | Alcoholism | 24 | 2 | 2 | No |
| GSE163857 | Moser et al. [34] | Microglia | | 24 | 3 | 2 | Only human samples |
| GSE144736 | Roth et al. [35] | iPSC-derived patient neuroepithelium | Microcephaly | 52 | 3 | 2 | No |

Note: Samples in each dataset have been filtered to remove factors that could bias batch effect evaluation.