Table 1 Overview of published methods for inferring tumor purity from WES/SNP array, gene expression array, and methylation array data

From: RF_Purify: a novel tool for comprehensive analysis of tumor-purity in methylation array data based on random forest regression

| Publication | Method name | Statistical framework/technique | Datasets used for establishing/validating the method | Data types accepted as input |
| --- | --- | --- | --- | --- |
| Carter et al. [4] | ABSOLUTE | Tumor purity inference based on somatic copy number aberrations in SNP arrays | TCGA | WES data/SNP array |
| Yoshihara et al. 2013 [8] | ESTIMATE | Comparison of various published gene sets to delineate a) an immune signature and b) a stromal signature; calculation of a purity score based on these signatures | TCGA | Affymetrix gene expression array data |
| Aran et al. 2015 [7] | LUMP (leukocytes unmethylation for purity) | Averaging of the methylation values of 44 CpG sites known to be hypomethylated in immune cells | TCGA | 450K methylation array data |
| Zhang et al. 2017 [9], Qin et al. 2018 [5] | InfiniumPurify | Tumor purity estimation (PMID:28122605): identification of differentially methylated CpG sites (DMCs) between tumors and a universal set of normal samples in the TCGA dataset, followed by kernel density estimation to obtain tumor purity | TCGA | 450K methylation array data |
| Benelli et al. 2018 [6] | PAMES (Purity Assessment from clonal MEthylation Sites) | (1) Calculation of average methylation values per CpG island from TCGA entities. (2) Calculation of the area under the ROC curve (AUC) for each CpG island; a CpG site with AUC < 0.2 or AUC > 0.8 was considered discriminatory and included in the model. (3) Tumor purity estimate based on the median of hypomethylated and hypermethylated sites. | TCGA (model generation); comparison to other TCGA samples and one additional dataset (333 prostate adenocarcinomas) | 450K methylation array data |
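
The AUC-based selection and median step listed for PAMES can be illustrated with a minimal Python sketch. This is not the PAMES implementation. The function names, the toy beta values, and the use of `1 - beta` for hypomethylated sites (so that both site types point in the same direction before taking the median) are assumptions made here for illustration. Only the 0.2/0.8 cutoffs and the use of a median come from the table.

```python
from statistics import median


def auc(tumor, normal):
    """AUC as the probability that a tumor beta value exceeds a normal one.

    Ties count as half a win (Mann-Whitney formulation of the ROC AUC).
    """
    wins = sum((t > n) + 0.5 * (t == n) for t in tumor for n in normal)
    return wins / (len(tumor) * len(normal))


def select_sites(tumor_betas, normal_betas, low=0.2, high=0.8):
    """Keep sites whose AUC is below `low` or above `high`.

    Returns (hyper, hypo): sites hyper- or hypomethylated in tumors.
    """
    hyper, hypo = [], []
    for site, tumor_values in tumor_betas.items():
        a = auc(tumor_values, normal_betas[site])
        if a > high:
            hyper.append(site)
        elif a < low:
            hypo.append(site)
    return hyper, hypo


def estimate_purity(sample, hyper, hypo):
    """Median over selected sites (assumption: hypo sites enter as 1 - beta)."""
    values = [sample[s] for s in hyper] + [1.0 - sample[s] for s in hypo]
    return median(values)


# Hypothetical toy data: beta values per site for tumor and normal cohorts.
tumor_betas = {
    "site_a": [0.8, 0.9, 0.7],    # higher in tumors
    "site_b": [0.1, 0.2, 0.15],   # lower in tumors
    "site_c": [0.5, 0.4, 0.6],    # not discriminatory
}
normal_betas = {
    "site_a": [0.1, 0.2, 0.1],
    "site_b": [0.9, 0.8, 0.85],
    "site_c": [0.5, 0.45, 0.55],
}

hyper, hypo = select_sites(tumor_betas, normal_betas)
sample = {"site_a": 0.7, "site_b": 0.3, "site_c": 0.5}
purity = estimate_purity(sample, hyper, hypo)
print(hyper, hypo, round(purity, 3))
```

In this toy example only `site_a` and `site_b` pass the cutoffs, and `site_c` (AUC of 0.5) is ignored when the sample's purity is estimated.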