Table 1 Main Illumina 450k DNAm datasets used. For each of the main datasets used in this study, we list the cell types or tissue profiled, whether the data were used to construct the reference DNAm database (if yes, we specify which cell types were used), whether the data were used for validation or evaluation (if yes, we specify which cell types or samples were used), and the reference/citation

From: A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies

| Dataset Name | Tissue/cell-types | Use in Reference DNAm Database | Testing/Evaluation | Reference |
| --- | --- | --- | --- | --- |
| Reinius et al. | WB, PBMC, NK, B, CD4T, CD8T, Monoc, Neutro, Eosino (n = 6 of each) | NK, B, CD4T, CD8T, Monoc, Neutro, Eosino | WB & PBMC | [24] |
| Liu et al. | WB (n = 335 controls, n = 354 rheumatoid arthritis cases) | No | Average Flow Cytometry estimates for cases and controls | [2] |
| Koestler et al. | WB (n = 18) | No | 12 Reconstructed WB mixtures + 6 WB samples with Flow Cytometry estimates | [20] |
| Zilbauer et al. | PBMC, CD4T, CD8T, NK, B, Monoc, Neutro (n = 6 of each) | No | In-silico mixtures of purified blood cell subtypes | [27] |
| ENCODE | Various | HMEC, HRCE, IMR90, Liver | No | [22] |
| Slieker et al. | Various | Pancreas | Liver | [26] |
| SCM2 | Various | No | HRCE, Pancreas, IMR90 | [29] |
| Lowe et al. | Various | No | HMEC | [28, 35] |
| Teschendorff et al. | WB (n = 152) | No | Smoking associated DMCs | [31] |

Abbreviations: DNAm = DNA methylation, WB = whole blood, PBMC = peripheral blood mononuclear cells, HMEC = human mammary epithelial cells, HRCE = human renal cortical epithelia, IMR90 = fetal lung fibroblasts, SCM2 = Stem-Cell-Matrix Compendium-2, DMCs = differentially methylated CpGs, NK = natural killer cells, B = B-cells, Monoc = Monocytes, Neutro = Neutrophils, Eosino = Eosinophils, CD4T = CD4+ T-cells, CD8T = CD8+ T-cells
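
Several evaluation sets in Table 1 are described as in-silico or reconstructed mixtures of purified cell subtypes. As a rough illustration of what such a mixture is, the sketch below assumes that each purified cell type is summarized by a vector of beta values at the same CpGs and that a mixture is their weighted average, with weights (cell-type fractions) summing to 1. The function names and the toy beta values are hypothetical and are not taken from any of the listed datasets; the exact procedure used for those datasets may differ.

```python
import random


def mix_profiles(profiles, weights):
    """Weighted average of per-cell-type beta-value profiles (one value per CpG)."""
    if set(profiles) != set(weights):
        raise ValueError("profiles and weights must name the same cell types")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n_cpgs = len(next(iter(profiles.values())))
    mixture = [0.0] * n_cpgs
    for cell_type, betas in profiles.items():
        if len(betas) != n_cpgs:
            raise ValueError("all profiles must cover the same CpGs")
        w = weights[cell_type]
        for i, beta in enumerate(betas):
            mixture[i] += w * beta
    return mixture


def random_weights(cell_types, rng):
    """Draw random cell-type fractions that sum to 1 (normalized uniform draws)."""
    raw = [rng.random() for _ in cell_types]
    total = sum(raw)
    return {ct: r / total for ct, r in zip(cell_types, raw)}


# Toy, made-up beta values at three CpGs (assumption, for illustration only).
toy_profiles = {
    "CD4T": [0.9, 0.1, 0.5],
    "B": [0.1, 0.9, 0.5],
}
toy_weights = {"CD4T": 0.75, "B": 0.25}
mixture = mix_profiles(toy_profiles, toy_weights)
```

Because the mixing weights are known by construction, a mixture built this way can serve as ground truth when checking the cell-type fractions estimated by a reference-based algorithm.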