Table 1. Main Illumina 450k DNAm datasets used. For each dataset used in this study we list the cell types/tissue profiled, whether the data were used to construct the reference database (if yes, which cell types), whether the data were used for validation/evaluation (if yes, which cell types), and the reference/citation.

From: A comparison of reference-based algorithms for correcting cell-type heterogeneity in Epigenome-Wide Association Studies

| Dataset name | Tissue/cell types | Use in reference DNAm database | Testing/evaluation | Reference |
| --- | --- | --- | --- | --- |
| Reinius et al. | WB, PBMC, NK, B, CD4T, CD8T, Monoc, Neutro, Eosino (n = 6 of each) | NK, B, CD4T, CD8T, Monoc, Neutro, Eosino | WB & PBMC | [24] |
| Liu et al. | WB (n = 335 controls, n = 354 rheumatoid arthritis cases) | No | Average flow cytometry estimates for cases and controls | [2] |
| Koestler et al. | WB (n = 18) | No | 12 reconstructed WB mixtures + 6 WB samples with flow cytometry estimates | [20] |
| Zilbauer et al. | PBMC, CD4T, CD8T, NK, B, Monoc, Neutro (n = 6 of each) | No | In-silico mixtures of purified blood cell subtypes | [27] |
| ENCODE | Various | HMEC, HRCE, IMR90, Liver | No | [22] |
| Slieker et al. | Various | Pancreas | Liver | [26] |
| SCM2 | Various | No | HRCE, Pancreas, IMR90 | [29] |
| Lowe et al. | Various | No | HMEC | [28, 35] |
| Teschendorff et al. | WB (n = 152) | No | Smoking-associated DMCs | [31] |
Abbreviations: DNAm = DNA methylation, WB = whole blood, PBMC = peripheral blood mononuclear cells, HMEC = human mammary epithelial cells, HRCE = human renal cortical epithelia, IMR90 = fetal lung fibroblasts, SCM2 = Stem-Cell-Matrix Compendium-2, DMCs = differentially methylated CpGs, NK = natural killer cells, B = B-cells, Monoc = monocytes, Neutro = neutrophils, Eosino = eosinophils, CD4T = CD4+ T-cells, CD8T = CD8+ T-cells