
Table 1 Overview of published methods to infer tumor purity based on WES/SNP arrays, gene expression arrays and methylation arrays

From: RF_Purify: a novel tool for comprehensive analysis of tumor-purity in methylation array data based on random forest regression

| Publication | Method name | Statistical framework/technique | Datasets used to establish/validate the method | Data types usable as input |
| --- | --- | --- | --- | --- |
| Carter et al. [4] | ABSOLUTE | Tumor purity inference based on somatic copy number aberrations in SNP arrays | TCGA | WES data/SNP array |
| Yoshihara et al. 2013 [8] | ESTIMATE | Comparison of various published gene sets to delineate (a) an immune signature and (b) a stromal signature; a purity score is calculated from these signatures | TCGA | Affymetrix gene expression array data |
| Aran et al. 2015 [7] | LUMP (leukocytes unmethylation for purity) | Averaging of the methylation values of 44 CpG sites known to be hypomethylated in immune cells | TCGA | 450 K methylation array data |
| Zhang et al. 2017 [9], Qin et al. 2018 [5] | InfiniumPurify | Tumor purity estimation (PMID:28122605): comparison of tumor and normal samples to identify DMCs (differentially methylated CpG sites) between tumors and a universal set of normal samples in the TCGA dataset, followed by kernel density estimation to obtain tumor purity | TCGA | 450 K methylation array data |
| Benelli et al. 2018 [6] | PAMES (Purity Assessment from clonal MEthylation Sites) | (1) Calculation of average methylation values per CpG island from TCGA entities; (2) calculation of the area under the curve (AUC) of the ROC curve for each CpG island: if AUC < 0.2 or AUC > 0.8, the CpG site was considered discriminatory and included in the model; (3) tumor purity estimate based on the median of hypomethylated and hypermethylated sites | TCGA (model generation); comparison to other TCGA samples and one additional dataset (333 prostate adenocarcinomas) | 450 K methylation array data |
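The two simplest methylation-based entries in the table (LUMP and PAMES) can be illustrated with a short Python sketch. This is not the published implementation of either tool: the function names, the dictionary-based data layout and the toy CpG identifiers are hypothetical. It also assumes that the final PAMES step combines the beta values of hypermethylated sites with one minus the beta values of hypomethylated sites before taking the median, which is one reading of the table's description, and that the LUMP-style mean is used directly without further scaling.

```python
from statistics import fmean, median


def lump_style_score(sample_betas, immune_hypo_sites):
    """Average the beta values of sites assumed hypomethylated in immune cells."""
    return fmean(sample_betas[site] for site in immune_hypo_sites)


def auc(tumor_values, normal_values):
    """ROC AUC as the probability that a tumor value exceeds a normal value.

    Ties count as 0.5 (equivalent to the Mann-Whitney U statistic, normalized).
    """
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


def select_sites(tumor, normal, low=0.2, high=0.8):
    """Keep sites with AUC > high (hypermethylated) or AUC < low (hypomethylated).

    tumor and normal map a site identifier to a list of beta values.
    """
    hyper, hypo = [], []
    for site in tumor:
        a = auc(tumor[site], normal[site])
        if a > high:
            hyper.append(site)
        elif a < low:
            hypo.append(site)
    return hyper, hypo


def pames_style_purity(sample_betas, hyper, hypo):
    """Median over selected sites (assumption: hypo sites enter as 1 - beta)."""
    values = [sample_betas[s] for s in hyper]
    values += [1.0 - sample_betas[s] for s in hypo]
    return median(values)


# Toy data with hypothetical site identifiers, for illustration only.
tumor = {"cg_a": [0.9, 0.8, 0.85], "cg_b": [0.1, 0.2, 0.15], "cg_c": [0.5, 0.4, 0.6]}
normal = {"cg_a": [0.1, 0.2, 0.15], "cg_b": [0.9, 0.8, 0.95], "cg_c": [0.5, 0.45, 0.55]}
sample = {"cg_a": 0.7, "cg_b": 0.3, "cg_c": 0.5}

hyper_sites, hypo_sites = select_sites(tumor, normal)
print(hyper_sites, hypo_sites, pames_style_purity(sample, hyper_sites, hypo_sites))
```

In the toy data, `cg_c` does not separate tumor from normal samples, so the AUC filter drops it and only the two discriminatory sites contribute to the median.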