
Table 1 Summary of Datasets

From: Reference-free deconvolution of DNA methylation data and mediation by cell composition effects

| Code | Tissue | Source | Ref | Platform | Source description | Number | Covariate model |
|---|---|---|---|---|---|---|---|
| g[nt] | gastric tissue: tumor + normal | GEO: GSE30601 | [42] | 27K | 203 gastric tumors and 94 matched gastric non-malignant samples. | 297 | Tumor[normal\|tumor] |
| g[n] | gastric tissue: normal | | | | | 94 | – |
| g[t] | gastric tissue: tumor | | | | | 203 | – |
| br-1[t] | breast: tumor | GEO: GSE20712 | [43] | 27K | 119 breast tumor samples with histological information. Removed 29 samples with ambiguous or missing histology. | 119 | Histology[basal\|HER2\|LumA\|LumB] + Age[young\|old] + Size[small\|large] |
| br-2[t] | breast: tumor | GEO: GSE31979 | [40] | 27K | 103 primary invasive breast tumors. | 90 | Histology[basal\|ER-\|ER+\|HER2\|LumA\|LumB] + Age |
| br-3[t] | breast: tumor | GEO: GSE32393 | [41] | 27K | Breast tumor samples: 91 invasive ductal, 13 invasive lobular, 10 mucinous or medullary; 76 were ER+. | 114 | ER[ER-\|ER+] + Histology[duct\|lob\|muc or med] + Age |
| bl-ov | peripheral blood | GEO: GSE19711 | [35] | 27K | Whole blood from 131 ovarian cancer cases (drawn pre-treatment) and 274 controls. | 402 | Case[control\|ovarian cancer case] + Age |
| bl-hn | peripheral blood | GEO: GSE30229* | [34] | 27K | Peripheral blood from 92 head and neck squamous cell carcinoma (HNSCC) patients and 92 controls. Removed 2 outlier cases. | 182 | Case[control\|HNSCC case] + Age |
| BL-ra | peripheral blood | GEO: GSE42861 | [22] | 450K | Peripheral blood from 354 rheumatoid arthritis patients and 335 controls. | 689 | Case[control\|arthritis case] |
| BL-as | cord blood | (not public) | [3] | 450K* | Cord blood from 45 Bangladeshi neonates, with corresponding drinking water arsenic concentrations. | 45 | Log-arsenic + Sex[female\|male] |
| SP | sperm | GEO: GSE47627 | [36] | 450K | 26 normal sperm samples. | 26 | Fraction[swim down\|swim up\|whole 1h\|whole 2h] |
| BV+LV | endothelial tissue | | | | | 16 | Source[BV\|LV] |
| BV | endothelial tissue: blood vessel | GEO: GSE34487 | [37] | 450K | 16 vascular samples: 6 primary blood vessel endothelial cell samples and 10 primary lymphatic endothelial cell samples. | 6 | – |
| LV | endothelial tissue: lymphatic vessel | | | | | 10 | – |
| UV-as | umbilical vein endothelial tissue | (not public) | [9] | 450K* | Umbilical vein endothelial tissues from 51 Bangladeshi neonates, with corresponding drinking water arsenic concentrations. | 51 | Log-arsenic + Sex[female\|male] |
| AR-as | placental artery | (not public) | [9] | 450K* | Placental arteries from 46 Bangladeshi neonates, with corresponding drinking water arsenic concentrations. | 46 | Log-arsenic + Sex[female\|male] |
| AR[np] | arterial tissue: atherosclerotic + normal | GEO: GSE46394 | [38] | 450K | 15 normal aortic tissues, 15 atherosclerotic aortic lesions, 19 carotid atherosclerotic samples. | 49 | Source[normal\|ath\|carotid ath] + Sex[female\|male] + Age |
| AR[n] | arterial tissue: normal aorta | | | | | 15 | – |
| PL-as | placenta | (not public) | [9] | 450K* | Placentas from 45 Bangladeshi neonates, with corresponding drinking water arsenic concentrations. | 45 | Log-arsenic + Sex[female\|male] |
| L[np] | liver tissue: cirrhotic + normal | GEO: GSE60753 | [39] | 450K | 34 normal liver tissues, 21 cirrhotic tissues (due to alcoholism), 45 cirrhotic tissues [due to chronic hepatitis B (HBV) or C (HCV) viral | 100 | Source[normal\|CirrEtOH\|CirrV] |
| L[n] | liver tissue: normal | | | | | 34 | – |
| BR-tcga[n] | breast: normal | TCGA (11/2014) | [44] | 450K* | 96 normal breast tissues (matched to tumor) from The Cancer Genome Atlas, downloaded Nov. 2014 | 96 | Age + Race[white\|other] |
| BR-tcga[t] | breast: tumor | | | 450K* | 725 breast tumors from The Cancer Genome Atlas, downloaded Nov. 2014 | 725 | Age + Race[white\|other] + Staging[II+\|III+\|IV/X\|?] + ER[ER+\|ER-] + HER2[HER2+\|HER2-\|HER2?] |

*Processed from idat files using the FunNorm algorithm (Bioconductor library minfi). See Methods for details.
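
The covariate-model column uses a compact notation that reads as a sum of terms. A bracketed list such as Case[control|HNSCC case] appears to enumerate the levels of a categorical covariate, while an unbracketed term such as Age is presumably numeric. As a rough illustration of that reading, and not the authors' code, the Python sketch below splits such a string into terms and levels. The function names `split_terms` and `parse_covariate_model` are hypothetical, and treating a lone dash as "no covariates" is an assumption.

```python
def split_terms(model):
    """Split on '+' only outside brackets, because levels such as 'ER+' contain '+'."""
    parts, depth, current = [], 0, []
    for ch in model:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "+" and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


def parse_covariate_model(model):
    """Return [(name, levels)] where levels is a list for bracketed terms, else None.

    Assumption: a cell holding only a dash means no covariates were modeled.
    """
    if model.strip() in {"", "-", "–", "−"}:
        return []
    terms = []
    for part in split_terms(model):
        if "[" in part and part.endswith("]"):
            name, rest = part.split("[", 1)
            terms.append((name.strip(), rest[:-1].split("|")))
        else:
            terms.append((part, None))
    return terms


if __name__ == "__main__":
    for name, levels in parse_covariate_model("Case[control|HNSCC case] + Age"):
        print(name, levels)
```

Splitting by bracket depth rather than on every plus sign keeps level labels such as ER+ or II+ intact, which matters for the BR-tcga[t] row.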